Stat Note 5
In the fifth of a series of articles about statistics for biologists, Anthony Hilton and Richard Armstrong ask:
is one set of data more variable than another?
www.sfam.org.uk, June 2006
There may be occasions when it is necessary to test whether the variability of two or more sets of data differs.
An investigator, for example, may wish to test whether a new treatment reduces the variability of a particular microbial response compared with an older treatment. In addition, an important assumption for the use of the ‘t’ test (Hilton & Armstrong, 2005) or analysis of variance (ANOVA) (Armstrong & Hilton, 2004) is that the variability of the different groups being compared is similar, i.e., that they exhibit homogeneity of variance. Replicate measurements within a control and a treated group, however, often exhibit different degrees of variation and the assumption of homogeneity of variance may need to be explicitly tested. This Statnote describes four such tests, viz., the variance-ratio (F) test, Bartlett’s test, Levene’s test, and Brown and Forsythe’s test.
The scenario
We return to the scenario first described in Statnote 3 (Hilton & Armstrong, 2005). A hypothetical experiment was carried out to investigate the efficacy of two novel media supplements (S1 and S2) in promoting the development of cell biomass. Three ten-litre fermentation vessels were sterilised and filled with identical growth media with the exception that the media in two of the vessels was supplemented with ten ml of
either medium supplement S1 or S2. The vessels were allowed to equilibrate and were subject to identical environmental / incubation conditions. The vessels were then inoculated with a culture of Bacterium x at an equal culture density and the fermentation allowed to proceed until all the available nutrients had been exhausted and bacterial growth had ceased. The entire volume of culture media in each fermentation vessel was then removed and filtered to recover the bacterial biomass, which was subsequently dried and the dry weight of cells measured. This experiment was repeated 25 times and the dry weight of biomass produced in each of the three groups recorded in Table 1.
The variance-ratio test
If there are only two groups involved, then their variances can be compared by a two-tail variance ratio test (F-test) (Snedecor & Cochran, 1980).
How is the test done?
The larger variance is divided by the smaller and the resulting F ratio compared with the value in a table of the variance ratio to obtain a P-value, entering the table for the number of degrees of freedom (DF) of the numerator and denominator. This test uses the two-tail probabilities of F because we are testing whether or not the two variances differ rather than whether variance A is greater than variance B. Hence, this calculation differs from that carried out during a typical ANOVA, since in the latter, it is whether the treatment variance is larger than the error variance that is being tested (Armstrong & Hilton, 2004). Published statistical tables of the F ratio (Fisher & Yates, 1963; Snedecor & Cochran, 1980) are usually in the form of one-tail tables. Hence, the 2.5% probability column has to be used to obtain the 5% probability.
Interpretation of the results
When the unsupplemented and S1 data are compared (Table 1), a value of F = 1.03 was obtained. This value is less than the F value in the 2.5% column (P > 0.05) and consequently, there is no evidence that the addition of the medium S1 increased or decreased the variance in replicate flasks.
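As a minimal sketch of the calculation described above, assuming Python and its standard statistics module (neither was used in the original analysis), the variance ratio for the unsupplemented and S1 data of Table 1 could be obtained as follows. The function name variance_ratio is our own; the P-value is still read from a one-tail table of F using the 2.5% column.

```python
from statistics import variance

# Dry weights from Table 1 (unsupplemented and S1 groups, 25 values each).
us = [461, 472, 473, 481, 482, 482, 494, 493, 495,
      506, 502, 501, 505, 508, 500, 513, 512, 511,
      518, 527, 524, 529, 537, 535, 542]
s1 = [562, 573, 574, 581, 582, 586, 591, 592, 592,
      607, 600, 603, 605, 607, 609, 611, 611, 615,
      617, 622, 626, 628, 631, 637, 645]


def variance_ratio(x, y):
    """Return (F, DF numerator, DF denominator), larger variance on top."""
    var_x, var_y = variance(x), variance(y)  # sample variances (n - 1 divisor)
    if var_x >= var_y:
        return var_x / var_y, len(x) - 1, len(y) - 1
    return var_y / var_x, len(y) - 1, len(x) - 1


f_ratio, df_num, df_den = variance_ratio(us, s1)
print(f"F = {f_ratio:.2f} (DF {df_num},{df_den})")
```

Placing the larger variance in the numerator keeps F at or above 1, which is why the two-tail test is read from the upper 2.5% point of a one-tail table.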
Bartlett’s test
If there are three or more groups, then the different groups could be tested in pairs using the F-test described above, but a better approach is to test all the variances simultaneously using Bartlett’s test (Snedecor & Cochran, 1980).
How is the test done?
If there are equal numbers of observations in each group, calculation of the test statistic is straightforward and a worked example is shown in Table 2. If the three variances do not differ from each other, then the ratio M/C approximately follows the chi-square (χ²) distribution with (a – 1) degrees of freedom (DF), where ‘a’ is the number of groups being compared. If the groups have different numbers of observations in each (unequal ‘n’), then the calculations are slightly more complex and are given in Snedecor and Cochran (1980).
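The equal-n calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' procedure: it uses the formulae given with Table 2, M = v[a ln(mean variance) – Σ ln(variance)] and C = 1 + (a + 1)/(3av), with v = n – 1, and takes the three variances from Table 2. The function name bartlett_equal_n is our own. The P-value line relies on the fact that, for exactly 2 DF, the upper-tail probability of χ² is exp(–χ²/2); for other DF a table or statistics package would be needed.

```python
import math


def bartlett_equal_n(variances, n):
    """Bartlett's test for 'a' groups with the same number of observations n.

    Returns (M, C, chi-square, DF).
    """
    a = len(variances)
    v = n - 1  # DF within each group
    mean_var = sum(variances) / a
    m = v * (a * math.log(mean_var) - sum(math.log(s2) for s2 in variances))
    c = 1 + (a + 1) / (3 * a * v)
    return m, c, m / c, a - 1


# Variances of the unsupplemented, S1 and S2 groups (Table 2), n = 25 each.
variances = [463.36, 447.88, 18695.24]
m, c, chi2, df = bartlett_equal_n(variances, n=25)

# Closed form valid only for DF = 2 (three groups).
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(f"M = {m:.2f}, C = {c:.4f}, chi-square = {chi2:.1f}, DF = {df}")
```

With the Table 2 variances this should reproduce M = 102.62 and a χ² of about 100.8 on 2 DF.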
Interpretation of the results
In the worked example in Table 2, the value of χ² was highly significant (P < 0.001) suggesting real differences between the variances of the three groups. The previous F-test suggested, however, that the variance of the unsupplemented data was similar to that of the growth medium S1. Therefore, it is the effect of the growth medium S2 that has substantially increased the variance of bacterial biomass. Hence, if these data were to be analysed by ANOVA (Armstrong & Hilton, 2004), the assumption of homogeneity of variance would not hold and it may be necessary to transform the data to logarithms before analysis to stabilize the variance. Data transformation is described in more detail in Statnote 4 (Hilton & Armstrong, 2006).
The use of the χ² distribution to test the significance of M/C is questionable if the DF within the groups are less than five and in such a case, there are special tables for calculating the significance of the statistic (Pearson & Hartley, 1954). Bartlett’s test is used less today and may not normally be available as part of a statistics software package. This is because the test is regarded as being too ‘sensitive’ resulting in too many significant results especially with data from long-tailed distributions (Snedecor & Cochran, 1980). Hence use of the test may raise unjustified concerns about whether the data conform to the assumption of homogeneity of variance. As a consequence, Levene (1960) developed a more robust test to compare three or more
US   S1   S2   US   S1   S2   US   S1   S2
461  562  354  506  607  556  518  617  714
472  573  359  502  600  578  527  622  721
473  574  369  501  603  604  524  626  722
481  581  403  505  605  623  529  628  735
482  582  425  508  607  644  537  631  754
482  586  476  500  609  668  535  637  759
494  591  511  513  611  678  542  645  765
493  592  513  512  611  698
495  592  534  511  615  703
Variances: US = 463.36, S1 = 447.88, S2 = 18695.24
Variance-ratio test comparing US and S1: F = 463.36/447.88 = 1.03 (2-tail distribution of F, P > 0.05)
Table 1. Dry weight of bacterial biomass under unsupplemented (US) and two supplemented (S) growth conditions (S1 and S2) in a sample of 25 fermentation vessels.
Group            Variance    ln (variance)
Unsupplemented   463.36      6.1385
S1               447.88      6.1045
S2               18695.24    9.8360
Total            19606.48    22.079
$M = v\left[a \ln \bar{s}^2 - \sum_i \ln s_i^2\right]$, where $\bar{s}^2$ is the mean of the variances, ‘a’ the number of groups, v = DF of each group, and ln = logarithms to base e.
Hence, M = 102.62
$C = 1 + (a + 1)/(3av) = 1.018$
$\chi^2 = M/C = 102.62/1.018 = 100.8$ (DF = a – 1, P < 0.001)
Table 2. Comparison of the variances of three groups with equal numbers of observations (n = 25, hence v = 24) in each by Bartlett’s test.
References
■ Armstrong RA & Hilton A (2004) The use of analysis of variance (ANOVA) in applied microbiology. Microbiologist, vol 5: No.4 18.
■ Brown MB & Forsythe AB (1974) Robust tests for the equality of variances. J Am Stats Assoc 69: 264-267.
■ Fisher RA & Yates F (1963) Statistical tables. Longman, London.
■ Hilton A & Armstrong RA (2005) Statnote 3: Comparing the difference between two groups. Microbiologist, vol 6: No.4 30.
■ Hilton A & Armstrong RA (2006) Statnote 4: What if the data are not normal? Microbiologist, vol 7: No.1 34.
■ Levene H (1960) In: Contributions to Probability and Statistics. Stanford University Press, Stanford, California.
■ Pearson ES & Hartley HO (1954) Biometrika Tables for Statisticians, vol 1. Cambridge University Press.
■ Snedecor GW & Cochran WG (1980) Statistical Methods, 7th Ed. Iowa State University Press, Ames, Iowa.
Dr Anthony Hilton* and Dr Richard Armstrong**
*Pharmaceutical Sciences and **Vision Sciences, Aston University, Birmingham, UK
variances (Snedecor & Cochran, 1980).
Levene’s test. How is the test done?
Levene’s test makes use of the absolute deviation of the individual measurements from their group means rather than the variance to measure the variability within a group. Avoiding the squaring of deviations as in the calculation of variance results in a measure of variability that is less sensitive to the presence of a long-tailed distribution. An ANOVA (Armstrong & Hilton, 2004) is then performed on the absolute deviations and if significant, the hypothesis of homogeneous variances is rejected.
Interpretation of the data
A Levene’s test on the data in Table 1 using STATISTICA software, for example, gave a value of F = 52.86 (DF 2,72; P < 0.001) confirming the results of Bartlett’s test.
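The procedure can also be sketched without a statistics package. The following is an illustrative Python version, assuming only the standard statistics module and the Table 1 data; it is not the STATISTICA routine used above, and the function names one_way_anova_f and levene_f are our own. It performs a one-way ANOVA on the absolute deviations from each group mean and returns F with its DF; the P-value would still be obtained from a table of F.

```python
from statistics import mean

# Dry weights from Table 1 (25 values per group).
us = [461, 472, 473, 481, 482, 482, 494, 493, 495,
      506, 502, 501, 505, 508, 500, 513, 512, 511,
      518, 527, 524, 529, 537, 535, 542]
s1 = [562, 573, 574, 581, 582, 586, 591, 592, 592,
      607, 600, 603, 605, 607, 609, 611, 611, 615,
      617, 622, 626, 628, 631, 637, 645]
s2 = [354, 359, 369, 403, 425, 476, 511, 513, 534,
      556, 578, 604, 623, 644, 668, 678, 698, 703,
      714, 721, 722, 735, 754, 759, 765]


def one_way_anova_f(groups):
    """Return (F, DF between groups, DF within groups) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        g_mean = mean(g)
        ss_between += len(g) * (g_mean - grand_mean) ** 2
        ss_within += sum((x - g_mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def levene_f(groups):
    """Levene's test: ANOVA on absolute deviations from each group mean."""
    abs_devs = []
    for g in groups:
        g_mean = mean(g)
        abs_devs.append([abs(x - g_mean) for x in g])
    return one_way_anova_f(abs_devs)


f_stat, df_between, df_within = levene_f([us, s1, s2])
print(f"Levene F = {f_stat:.2f} (DF {df_between},{df_within})")
```

Run on the Table 1 data, this sketch should give an F close to the reported value of 52.86 on 2 and 72 DF.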
More recently, Levene’s test has also been called into question since the absolute deviations from the group means are likely to be highly skewed and therefore, violate another assumption required for an ANOVA, that of normality (Armstrong & Hilton, 2004). This problem becomes particularly acute if there are unequal numbers of observations in the various groups being compared. As a consequence, a modification of the Levene test has been proposed by Brown and Forsythe (1974).
Brown-Forsythe test. How is the test done?
This differs from Levene’s test in that an ANOVA is performed not on the absolute deviations from the group means but on the absolute deviations from the group medians. This test may be more accurate than Levene’s test even when the data deviate from a normal distribution. Nevertheless, both Levene’s and the Brown-Forsythe tests suffer from the same defect in that to assess differences in variance requires an ANOVA, and an ANOVA requires the assumption of ‘homogeneity of variance,’ which some authors consider to be a ‘fatal flaw’ of these analyses.
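Because the two tests differ only in the centre from which deviations are taken, a single routine can illustrate both. The sketch below, again assuming only Python's standard statistics module and the Table 1 data, takes the centre (mean or median) as an argument; the function name deviation_anova_f is hypothetical. No Brown-Forsythe F value is reported in this Statnote, so the median-based result is shown only as an illustration.

```python
from statistics import mean, median

# Dry weights from Table 1 (25 values per group).
us = [461, 472, 473, 481, 482, 482, 494, 493, 495,
      506, 502, 501, 505, 508, 500, 513, 512, 511,
      518, 527, 524, 529, 537, 535, 542]
s1 = [562, 573, 574, 581, 582, 586, 591, 592, 592,
      607, 600, 603, 605, 607, 609, 611, 611, 615,
      617, 622, 626, 628, 631, 637, 645]
s2 = [354, 359, 369, 403, 425, 476, 511, 513, 534,
      556, 578, 604, 623, 644, 668, 678, 698, 703,
      714, 721, 722, 735, 754, 759, 765]


def deviation_anova_f(groups, centre):
    """One-way ANOVA F on absolute deviations from each group's centre.

    centre=mean gives Levene's test; centre=median gives Brown-Forsythe.
    Returns (F, DF between groups, DF within groups).
    """
    devs = []
    for g in groups:
        c = centre(g)
        devs.append([abs(x - c) for x in g])
    n_total = sum(len(d) for d in devs)
    grand_mean = sum(sum(d) for d in devs) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for d in devs:
        d_mean = mean(d)
        ss_between += len(d) * (d_mean - grand_mean) ** 2
        ss_within += sum((x - d_mean) ** 2 for x in d)
    df_between = len(devs) - 1
    df_within = n_total - len(devs)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


groups = [us, s1, s2]
f_levene, df_b, df_w = deviation_anova_f(groups, mean)
f_bf, _, _ = deviation_anova_f(groups, median)
print(f"Levene F = {f_levene:.2f}, Brown-Forsythe F = {f_bf:.2f} (DF {df_b},{df_w})")
```

Passing the centre as a function keeps the ANOVA step identical for both tests, so any difference between the two F values reflects only the choice of mean or median.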
Conclusion
There may be circumstances where it is necessary for microbiologists to compare variances rather than means, e.g., in analysing data from experiments to determine whether a particular treatment alters the degree of variability or testing the assumption of homogeneity of variance prior to other statistical tests.
All of the tests described in this Statnote have their limitations. Bartlett’s test may be too sensitive but Levene’s and the Brown-Forsythe tests also have problems. We would recommend the use of the variance-ratio test to compare two variances and the careful application of Bartlett’s test if there are more than two groups.
Considering that these tests are not particularly robust, it should be remembered that the homogeneity of variance assumption is usually the least important of those considered when carrying out an ANOVA.
If there is concern about this assumption and especially if the other assumptions of the analysis are also not likely to be met, e.g., lack of normality or non-additivity of treatment effects (Armstrong & Hilton, 2004), then it may be better either to transform the data or to carry out a non-parametric test on the data.