11.1 Introduction | 11.2 One-Way ANOVA | 11.4 The Global F Test
Sections 11.1, 11.2, and 11.4
Shiwen Shen
Department of Statistics
University of South Carolina
Elementary Statistics for the Biological and Life Sciences
(STAT 205)
Chapter 7
Chapter 11
Comparing more than two means
In Chapter 7 we had two groups and tested H0 : µ1 = µ2.
In Chapter 11 we will have I groups and test H0 : µ1 = µ2 = · · · = µI .
We are still interested in whether the population means are the same across groups; there are just more than two groups.
The alternative hypothesis is HA : one or more of µ1, µ2, . . . , µI are different.
Let’s look at an example where I = 5.
Example 11.1.1: Sweet Corn
When growing sweet corn, can organic methods be used successfully to control harmful insects and limit their effect on the corn? In a study of this question researchers compared the weights of ears of corn under five conditions in an experiment in which sweet corn was grown using organic methods. In one plot of corn a beneficial soil nematode was introduced. In a second plot a parasitic wasp was used. A third plot was treated with both the nematode and the wasp. In a fourth plot a bacterium was used. Finally, a fifth plot of corn acted as a control; no special treatment was applied here. Thus, the treatments were
Treatment 1: Nematodes
Treatment 2: Wasps
Treatment 3: Nematodes and wasps
Treatment 4: Bacteria
Treatment 5: Control
Section 11.1 Background
Weights of ears of corn receiving five treatments
Comparing four population means requires six comparisons
The PROBLEM of MULTIPLE COMPARISONS occurs when multiple hypotheses are considered simultaneously.
H0 : µ1 = µ2 H0 : µ1 = µ3 H0 : µ1 = µ4
H0 : µ2 = µ3 H0 : µ2 = µ4 H0 : µ3 = µ4
Type I error
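It's natural to ask: why not just compare all possible pairs µ1 − µ2, µ1 − µ3, µ2 − µ3, etc., each at the 5% level?
Question: With 4 samples (6 pairs), is the type I error 6 × 5% = 30%?
Answer: NO! Even if the six tests were independent, the chance of at least one type I error would be 1 − 0.95^6 ≈ 26%, not 30%. The pairwise tests also share samples, so they are not independent, and the calculation of the overall type I error is more complicated!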
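The question can be explored numerically. This is a rough sketch, not an exact calculation: the first part assumes the six tests are independent (which they are not), and the second part is a small simulation under H0 whose settings (10 observations per group, 1000 replications, Welch t tests, the seed) are illustrative assumptions.

```r
alpha    <- 0.05
n_groups <- 4
n_pairs  <- choose(n_groups, 2)          # 6 pairwise comparisons

# Chance of at least one type I error IF the six tests were independent
fwer_indep <- 1 - (1 - alpha)^n_pairs    # about 0.265, not 0.30
fwer_indep

# Simulation under H0: all four groups share the same normal distribution
set.seed(205)
g <- factor(rep(1:n_groups, each = 10))
any_reject <- replicate(1000, {
  y <- rnorm(length(g))
  p <- pairwise.t.test(y, g, p.adjust.method = "none", pool.sd = FALSE)$p.value
  any(p < alpha, na.rm = TRUE)           # did any of the 6 tests reject?
})
mean(any_reject)                         # estimated overall type I error rate
```

Either way, the chance of at least one false rejection is well above the nominal 5%, which is the motivation for a single global test.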
Signal to Noise
H0 : µ1 = µ2 = µ3 = µ4
HA : one or more of µ1, µ2, µ3, µ4 are different
Which study has more pronounced differences between groups? (a) or (b)?
Analysis of Variance (ANOVA)
We will test H0 : µ1 = µ2 = · · · = µI via analysis of variance (ANOVA).
ANOVA compares how variable the sample means ȳ1, ȳ2, . . . , ȳI are to how variable the observations are around each group mean.
Assumptions: Observations in each group are independent and normally distributed with the same variance σ². (You do not need to know the value of σ².) The data in different groups are also independent.
Building Test Statistic: Between Group Difference
$\bar{Y}_1$ is the sample mean for group 1.
$\bar{Y}$ is the sample mean of all observations, pooled across the groups.
Between group difference:
$$n_1(\bar{Y}_1 - \bar{Y}) + n_2(\bar{Y}_2 - \bar{Y}) + n_3(\bar{Y}_3 - \bar{Y})$$
Building Test Statistic: Between Group Variation
Between group variation:
$$n_1(\bar{Y}_1 - \bar{Y})^2 + n_2(\bar{Y}_2 - \bar{Y})^2 + n_3(\bar{Y}_3 - \bar{Y})^2 = \sum_{i=1}^{3} n_i(\bar{Y}_i - \bar{Y})^2$$
This is the between-group sum of squares.
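As a sketch of this calculation, the between-group sum of squares can be computed by hand for the three-diet lamb data that appears in the R code for Example 11.2.1. The variable names `n_i`, `ybar_i`, and `ybar` are illustrative choices, not part of the course code.

```r
weight <- c(8, 16, 9, 9, 16, 21, 11, 18, 15, 10, 17, 6)
diet   <- factor(c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3))

n_i    <- tapply(weight, diet, length)   # group sizes: 3, 5, 4
ybar_i <- tapply(weight, diet, mean)     # group means: 11, 15, 12
ybar   <- mean(weight)                   # mean of all 12 observations: 13

# The unsquared weighted differences cancel out
sum_dev <- sum(n_i * (ybar_i - ybar))    # 0

# Squaring keeps every group's contribution positive
ss_between <- sum(n_i * (ybar_i - ybar)^2)
ss_between                               # 36
```

The value 36 matches the `Sum Sq` entry for `diet` in the `aov()` output for these data.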
Building Test Statistic: Within Group Variation
$Y_{1j}$ are the observations from group 1, $j = 1, 2, \ldots, n_1$.
$\bar{Y}_1$ is the sample mean for group 1.
Within group variation:
$$\sum_{j}(Y_{1j} - \bar{Y}_1)^2 + \sum_{j}(Y_{2j} - \bar{Y}_2)^2 + \sum_{j}(Y_{3j} - \bar{Y}_3)^2$$
Building Test Statistic: Within Group Variation
By definition, the sample variance of group 1 is
$$s_1^2 = \frac{\sum_{j}(Y_{1j} - \bar{Y}_1)^2}{n_1 - 1}$$
Within group variation:
$$\begin{aligned}
&\sum_{j}(Y_{1j} - \bar{Y}_1)^2 + \sum_{j}(Y_{2j} - \bar{Y}_2)^2 + \sum_{j}(Y_{3j} - \bar{Y}_3)^2 \\
&= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 \\
&= \sum_{i=1}^{3}(n_i - 1)s_i^2
\end{aligned}$$
This is a weighted sum of the sample variances, called the within-group sum of squares.
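The two forms of the within-group variation can be checked against each other. The sketch below again uses the lamb data from Example 11.2.1; the names `s2_i`, `ss_within`, and `ss_within_direct` are illustrative.

```r
weight <- c(8, 16, 9, 9, 16, 21, 11, 18, 15, 10, 17, 6)
diet   <- factor(c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3))

n_i  <- tapply(weight, diet, length)     # group sizes
s2_i <- tapply(weight, diet, var)        # sample variance of each group

# Weighted sum of the group variances
ss_within <- sum((n_i - 1) * s2_i)

# Direct sum of squared deviations from each observation's own group mean
group_mean       <- ave(weight, diet)    # group mean repeated per observation
ss_within_direct <- sum((weight - group_mean)^2)

c(ss_within, ss_within_direct)           # both 210
```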
Definition of Sums of Squares
Sums of Squares (SS) are sums of squared deviations from a central value.
Between Group SS
Within Group SS
Total SS $= \sum_{i=1}^{I}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y})^2$
Relation:
TOTAL SS = Between Group SS + Within Group SS
Mean Square (MS) is the average of the squared deviations from a central value. It is a Sum of Squares (SS) divided by the number of informative values in the SS, called “degrees of freedom”, or df.
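The relation between the three sums of squares, and the step from SS to MS, can be verified on the lamb data from Example 11.2.1. This is a sketch; the degrees of freedom used here (I − 1 between groups and n. − I within groups) are the ones reported by `aov()` for these data.

```r
weight <- c(8, 16, 9, 9, 16, 21, 11, 18, 15, 10, 17, 6)
diet   <- factor(c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3))

I_groups <- nlevels(diet)                # I = 3 groups
n_total  <- length(weight)               # n. = 12 observations

group_mean <- ave(weight, diet)          # each observation's group mean
ss_between <- sum((group_mean - mean(weight))^2)   # 36
ss_within  <- sum((weight - group_mean)^2)         # 210
ss_total   <- sum((weight - mean(weight))^2)       # 246

# Total SS = Between Group SS + Within Group SS
all.equal(ss_total, ss_between + ss_within)        # TRUE

# Mean squares: SS divided by its degrees of freedom
ms_between <- ss_between / (I_groups - 1)          # 36 / 2 = 18
ms_within  <- ss_within / (n_total - I_groups)     # 210 / 9, about 23.33
```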
ANOVA formulae
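As a reference sketch, the standard one-way ANOVA quantities can be collected in one table using the notation of the preceding slides (I groups, $n_i$ observations in group i, $n_\cdot$ observations in total). The layout below is an assumption, not a reproduction of the slide's own table.

```latex
\[
\begin{array}{llll}
\text{Source} & df & SS & MS \\ \hline
\text{Between groups} & I - 1
  & \sum_{i=1}^{I} n_i(\bar{Y}_i - \bar{Y})^2
  & SS(\text{Between})/(I - 1) \\
\text{Within groups} & n_\cdot - I
  & \sum_{i=1}^{I} (n_i - 1)s_i^2
  & SS(\text{Within})/(n_\cdot - I) \\
\text{Total} & n_\cdot - 1
  & \sum_{i=1}^{I}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y})^2
  &
\end{array}
\]
```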
Example 11.2.1: Weight Gain of Lambs
The table shows the weight gains (in 2 weeks) of young lambs on three different diets.
The total number of observations is n. = 3 + 5 + 4 = 12
ANOVA table
Estimate of the unknown variance:
$$\hat{\sigma}^2 = s^2_{\text{pool}} = \text{MS(Within)},$$
which is 23.33 in the lamb data.
F test
H0 : µ1 = µ2 = · · · = µI
Test statistic:
$$F = \frac{\text{MS(Between)}}{\text{MS(Within)}},$$
which has an $F(df_{\text{between}}, df_{\text{within}})$ distribution if H0 is true.
The F distribution with ν1 and ν2 degrees of freedom is the distribution of the ratio of two independent mean squares. Notation: F ∼ F(ν1, ν2).
In R, Within Group SS is called Residual SS
Obtain P-value from R using aov() command.
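For intuition about what `aov()` reports, the F statistic and its P-value can also be computed directly. The sketch below plugs in the sums of squares and degrees of freedom from the lamb data (36 on 2 df between groups, 210 on 9 df within groups) and uses `pf()` for the upper-tail area of the F distribution.

```r
ms_between <- 36 / 2       # SS(Between) / df_between = 18
ms_within  <- 210 / 9      # SS(Within)  / df_within, about 23.33

f_stat <- ms_between / ms_within         # about 0.771

# P-value: area to the right of f_stat under the F(2, 9) density
p_value <- pf(f_stat, df1 = 2, df2 = 9, lower.tail = FALSE)
p_value                                  # about 0.491
```

The P-value is an upper-tail area because large values of F, where the sample means vary more than the within-group noise would explain, are the evidence against H0.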
R code for lamb diet data
weight <- c(8,16, 9, 9,16,21,11,18,15,10,17, 6)
diet <- c( 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3)
diet <- factor(diet)
fit <- aov(weight~diet)
summary(fit)
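            Df Sum Sq Mean Sq F value Pr(>F)
diet         2     36   18.00   0.771  0.491
Residuals    9    210   23.33
With p-value 0.491 greater than the 5% level, we fail to reject the null hypothesis: the data do not provide sufficient evidence that the mean weight gain differs among the three diets. (This is not the same as showing that the three means are equal.)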
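If the F statistic and P-value are needed as numbers rather than printed output, they can be pulled out of the summary object. This is a minimal sketch; it relies on `summary()` of an `aov` fit returning a list whose first element is the ANOVA table, and indexes the `diet` row by position.

```r
weight <- c(8, 16, 9, 9, 16, 21, 11, 18, 15, 10, 17, 6)
diet   <- factor(c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3))
fit    <- aov(weight ~ diet)

tab     <- summary(fit)[[1]]             # ANOVA table as a data frame
f_value <- tab[["F value"]][1]           # first row is the diet row
p_value <- tab[["Pr(>F)"]][1]

# Decision at the 5% level
if (p_value < 0.05) "reject H0" else "fail to reject H0"
```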
R code for corn growth data (Example 11.1.1)
weight=c(16.5,11.0, 8.5,16.0,13.0,15.0,15.0,13.0,14.5,10.5,
11.5, 9.0,12.0,15.0,11.0,12.0, 9.0,10.0, 9.0,10.0,
12.5,11.5,12.5,10.5,14.0, 9.0,11.0, 8.5,14.0,12.0,
16.0, 9.0, 9.5,12.5,11.0, 6.5,10.0, 7.0, 9.0, 9.5,
8.0, 9.0,10.5, 9.0,18.5,14.5, 8.0,10.5, 9.0,17.0,
7.0, 8.0,13.0, 6.5,10.0,10.5, 5.0, 9.0, 8.5,11.0)
treat=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,
1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,
1,2,3,4,5,1,2,3,4,5)
treat=factor(treat)
fit=aov(weight~treat)
summary(fit)
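            Df Sum Sq Mean Sq F value Pr(>F)
treat        4  52.31 13.0771  1.6461 0.1758
Residuals   55 436.94  7.9443
With p-value 0.1758 greater than the 5% level, we fail to reject the null hypothesis: the data do not provide sufficient evidence that mean ear weight differs among the five treatments.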
MAO activity in schizophrenia
MAO activity (Fig. 1.1.2)
url="http://people.stat.sc.edu/sshen/courses/17sstat205/data/mao.txt"
x = read.table(url,header=FALSE)
y = x[,1]
diagnosis = factor(x[,2])
fit = aov(y~diagnosis)
summary(fit)
            Df Sum Sq Mean Sq F value  Pr(>F)
diagnosis    2  136.1   68.06   6.346 0.00411 **
Residuals   39  418.3   10.72
With p-value 0.00411 less than the 5% level, we reject the null hypothesis and conclude that mean MAO activity differs for at least one of the diagnosis groups.