Date post: | 05-Jan-2016 |
Upload: | lewis-robertson |
Previous Lecture: Phylogenetics
This Lecture: Analysis of Variance
Judy Zhong Ph.D.
Learning Objectives
Until now, we have considered two groups of individuals and we've wanted to know if the two groups were sampled from distributions with equal population means or medians.
Suppose we would like to consider more than two groups of individuals and, in particular, test whether the groups were sampled from distributions with equal population means.
How to use one-way analysis of variance (ANOVA) to test for differences among the means of several populations (“groups”)
Hypotheses of One-Way ANOVA
H0: µ1 = µ2 = … = µK
All population means are equal
No treatment effect (no variation in means among groups)
H1: Not all of the population means are the same
At least one population mean is different
There is a treatment effect
Does not mean that all population means are different (some pairs may be the same)
One-Factor ANOVA
All means are the same: the null hypothesis is true (no treatment effect)
One-Factor ANOVA (continued)
At least one mean is different: the null hypothesis is NOT true (treatment effect is present)
One-Way ANOVA: Model Assumptions
The K random samples are drawn from K independent populations
The variances of the populations are identical
The underlying data are approximately normally distributed
Basic Idea: partitioning the variation
Suppose there are K groups with $n_1, n_2, \ldots, n_K$ observations.
$y_{ij}$: j-th observation in i-th group
$\bar{y}$: overall mean
$\bar{y}_i$: mean of group i
$y_{ij} - \bar{y} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y})$
$\bar{y}_i - \bar{y}$: deviation of group mean from grand mean
$y_{ij} - \bar{y}_i$: deviation of observations from group mean
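Partitioning the variation
$\sum (y_{ij} - \bar{y})^2 = \sum (y_{ij} - \bar{y}_i)^2 + \sum (\bar{y}_i - \bar{y})^2$
$\sum (y_{ij} - \bar{y})^2$: total variation (total SS)
$\sum (y_{ij} - \bar{y}_i)^2$: variation due to random sampling (within SS)
$\sum (\bar{y}_i - \bar{y})^2$: variation due to factor (between SS)
Total variation is the sum of within-group variability and between-group variability
$y_{ij} - \bar{y} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y})$
$y_{ij} - \bar{y}_i$: deviation of observations from group mean (within-group variability)
$\bar{y}_i - \bar{y}$: deviation of group mean from overall mean (between-group variability)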
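One step worth spelling out is why squaring and summing the decomposition leaves only two sums of squares and no cross term. The derivation below is a sketch in the slide's notation, assuming K groups with $n_i$ observations in group i:

```latex
\begin{align*}
\sum_{i=1}^{K}\sum_{j=1}^{n_i} (y_{ij}-\bar{y})^2
  &= \sum_{i=1}^{K}\sum_{j=1}^{n_i}
     \bigl[(y_{ij}-\bar{y}_i) + (\bar{y}_i-\bar{y})\bigr]^2 \\
  &= \sum_{i=1}^{K}\sum_{j=1}^{n_i} (y_{ij}-\bar{y}_i)^2
   + \sum_{i=1}^{K}\sum_{j=1}^{n_i} (\bar{y}_i-\bar{y})^2
   + 2\sum_{i=1}^{K} (\bar{y}_i-\bar{y}) \sum_{j=1}^{n_i} (y_{ij}-\bar{y}_i), \\
\sum_{j=1}^{n_i} (y_{ij}-\bar{y}_i)
  &= n_i\bar{y}_i - n_i\bar{y}_i = 0 .
\end{align*}
```

Because the inner sum of deviations from each group mean is zero, the cross term vanishes and total SS equals within SS plus between SS.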
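Partitioning the variation
$\sum_{i=1}^{3}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 = \sum_{i=1}^{3}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^{3}\sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2$
$\bar{y}$: overall mean
$\bar{y}_i$: mean of group i
$n_1 = 3$, $n_2 = 4$, $n_3 = 4$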
[Figure: three panels plotting Response, X for Group 1, Group 2, and Group 3, with the group means $\bar{y}_1$, $\bar{y}_2$, $\bar{y}_3$ and the overall mean $\bar{y}$ marked]
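The partition can be checked numerically. The sketch below uses made-up data (the values in `groups` are hypothetical, chosen only to match the slide's group sizes of 3, 4, and 4) and computes the three sums of squares directly from their definitions:

```python
# Hypothetical data: three groups with n1 = 3, n2 = 4, n3 = 4 observations.
groups = [
    [4.0, 6.0, 5.0],
    [7.0, 9.0, 8.0, 8.0],
    [2.0, 3.0, 4.0, 3.0],
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def partition(groups):
    """Return (total_ss, within_ss, between_ss) for a list of groups."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    # Total SS: squared deviations of every observation from the overall mean.
    total_ss = sum((y - grand_mean) ** 2 for y in all_values)
    # Within SS: squared deviations of observations from their own group mean.
    within_ss = sum((y - mean(g)) ** 2 for g in groups for y in g)
    # Between SS: squared deviation of each group mean from the overall mean,
    # counted once per observation in that group.
    between_ss = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return total_ss, within_ss, between_ss


total_ss, within_ss, between_ss = partition(groups)
print(f"total = {total_ss:.4f}, within = {within_ss:.4f}, between = {between_ss:.4f}")
```

Up to floating-point rounding, the printed total should equal the sum of the within and between values.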
If between-group variability is large and within-group variability is small => reject H0
If between-group variability is small and within-group variability is large => do not reject H0
Basic Idea of ANOVA
Partition of Total Variation
Total Variation (total SS) = Variation Due to Factor (between SS) + Variation Due to Random Sampling (within SS)
d.f.: n – 1 = (k – 1) + (n – k)
Between SS is commonly referred to as: Sum of Squares Between, Sum of Squares Among, Sum of Squares Explained, Among-Groups Variation
Within SS is commonly referred to as: Sum of Squares Within, Sum of Squares Error, Sum of Squares Unexplained, Within-Group Variation
Total Sum of Squares
Total SS = Between SS + Within SS
$\text{Total SS} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2$
Where:
Total SS = total sum of squares
k = number of groups (levels or treatments)
$n_i$ = number of observations in group i
$y_{ij}$ = j-th observation from group i
$\bar{y}$ = grand mean (mean of all data values)
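Total Variation
$\text{Total SS} = (y_{11} - \bar{y})^2 + (y_{12} - \bar{y})^2 + \cdots + (y_{k n_k} - \bar{y})^2$
[Figure: Response, X for Group 1, Group 2, and Group 3, with the overall mean $\bar{y}$ marked]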
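Between-Group Variation
$\text{Between SS} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2 = n_1(\bar{y}_1 - \bar{y})^2 + n_2(\bar{y}_2 - \bar{y})^2 + \cdots + n_k(\bar{y}_k - \bar{y})^2$
[Figure: Response, X for Group 1, Group 2, and Group 3, with the group means $\bar{y}_1$, $\bar{y}_2$, $\bar{y}_3$ and the overall mean $\bar{y}$ marked]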
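Within-Group Variation
$\text{Within SS} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 = \sum_{i=1}^{k} (n_i - 1) S_i^2$
where $S_i^2$ is the sample variance of group i
[Figure: Response, X for Group 1, Group 2, and Group 3, with the group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$ marked]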
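The two forms of within SS can be compared in a few lines. This is a minimal sketch on hypothetical data; it assumes $S_i^2$ is the usual sample variance with an $n_i - 1$ denominator, which is what Python's `statistics.variance` computes:

```python
from statistics import mean, variance

# Hypothetical data for three groups (not from the lecture).
groups = [
    [4.0, 6.0, 5.0],
    [7.0, 9.0, 8.0, 8.0],
    [2.0, 3.0, 4.0, 3.0],
]


def within_ss_direct(groups):
    """Sum of squared deviations of each observation from its group mean."""
    return sum((y - mean(g)) ** 2 for g in groups for y in g)


def within_ss_from_variances(groups):
    """Same quantity via (n_i - 1) * S_i^2, using the sample variance."""
    return sum((len(g) - 1) * variance(g) for g in groups)


direct = within_ss_direct(groups)
pooled = within_ss_from_variances(groups)
print(direct, pooled)
```

The second form is convenient when only the group sizes and sample variances are reported rather than the raw observations.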
Obtaining the Mean Squares
$\text{Within MS} = \dfrac{\text{Within SS}}{n - k}$
$\text{Between MS} = \dfrac{\text{Between SS}}{k - 1}$
$\text{Total MS} = \dfrac{\text{Total SS}}{n - 1}$
One-Way ANOVA Table
Source of Variation | df | SS | MS (Variance) | F ratio
Between Groups | k - 1 | BSS | BMS = BSS / (k - 1) | F = BMS / WMS
Within Groups | n - k | WSS | WMS = WSS / (n - k) |
Total | n - 1 | TSS = BSS + WSS | |
k = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom
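The table layout translates directly into a small function. The sketch below is illustrative only: `anova_table` and the dictionary keys are names invented here, and the data are hypothetical.

```python
def anova_table(groups):
    """Build the one-way ANOVA table entries for a list of groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between SS: group sizes times squared deviations of group means.
    bss = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within SS: squared deviations of observations from their group mean.
    wss = sum((y - m) ** 2
              for g, m in zip(groups, group_means) for y in g)

    bms = bss / (k - 1)
    wms = wss / (n - k)
    return {
        "between": {"df": k - 1, "ss": bss, "ms": bms, "f": bms / wms},
        "within": {"df": n - k, "ss": wss, "ms": wms},
        "total": {"df": n - 1, "ss": bss + wss},
    }


# Hypothetical data for three groups of sizes 3, 4, and 4.
table = anova_table([[4.0, 6.0, 5.0], [7.0, 9.0, 8.0, 8.0], [2.0, 3.0, 4.0, 3.0]])
for source, row in table.items():
    print(source, row)
```

The function does not guard against a within SS of zero (all observations equal to their group mean), in which case the F ratio is undefined.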
One-Way ANOVA F Test Statistic
Test statistic: $F = \dfrac{\text{Between MS}}{\text{Within MS}}$
Degrees of freedom
df1 = k – 1 (k = number of groups)
df2 = n – k (n = sum of sample sizes from all populations)
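Interpreting One-Way ANOVA F Statistic
The F statistic is the ratio of the among (between-group) estimate of variance to the within-group estimate of variance
The ratio is never negative
df1 = k – 1 will typically be small
df2 = n – k will typically be large
Decision Rule: Reject H0 if F > FU, where FU is the upper critical value of the F distribution with df1 and df2 degrees of freedom at significance level α
Otherwise do not reject H0
[Figure: F distribution starting at 0, with the rejection region of area α = .05 to the right of FU ("Reject H0") and the region to the left of FU labeled "Do not reject H0"]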
Example
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the 0.05 significance level, is there a difference in mean distance?
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
Example
[Figure: scatter plot of Distance (190 to 270) by Club (1, 2, 3), with the group means $\bar{Y}_1 = 249.2$, $\bar{Y}_2 = 226.0$, $\bar{Y}_3 = 205.8$ and the overall mean $\bar{Y} = 227.0$ marked]
Example
$\bar{Y}_1 = 249.2$, $\bar{Y}_2 = 226.0$, $\bar{Y}_3 = 205.8$, $\bar{Y} = 227.0$
$n_1 = 5$, $n_2 = 5$, $n_3 = 5$, $n = 15$, $k = 3$
$\text{BSS} = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$
$\text{WSS} = (254 - 249.2)^2 + (263 - 249.2)^2 + \cdots + (204 - 205.8)^2 = 1119.6$
$\text{BMS} = 4716.4 / (3 - 1) = 2358.2$
$\text{WMS} = 1119.6 / (15 - 3) = 93.3$
$F = \dfrac{2358.2}{93.3} = 25.275$
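The arithmetic in this example can be reproduced from the fifteen distances on the earlier slide. The sketch below is one way to do it in plain Python; the variable names are chosen here for illustration.

```python
# Golf club distances from the example.
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

k = len(clubs)                                  # number of groups
n = sum(len(v) for v in clubs.values())         # total number of observations
grand_mean = sum(sum(v) for v in clubs.values()) / n
group_means = {name: sum(v) / len(v) for name, v in clubs.items()}

# Between SS: n_i * (group mean - grand mean)^2, summed over clubs.
bss = sum(len(v) * (group_means[name] - grand_mean) ** 2
          for name, v in clubs.items())
# Within SS: squared deviation of every distance from its club mean.
wss = sum((y - group_means[name]) ** 2
          for name, v in clubs.items() for y in v)

bms = bss / (k - 1)
wms = wss / (n - k)
f_stat = bms / wms

print(f"BSS = {bss:.1f}, WSS = {wss:.1f}")
print(f"BMS = {bms:.1f}, WMS = {wms:.1f}, F = {f_stat:.3f}")
```

Computing F from the unrounded mean squares gives the same value to three decimals as dividing the rounded figures 2358.2 and 93.3.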
Example
H0: µ1 = µ2 = µ3
H1: µj not all equal
α = 0.05
df1 = 2, df2 = 12
Critical Value (Table 9): FU = 3.89
Test Statistic: F = BMS / WMS = 2358.2 / 93.3 = 25.275
Decision: Reject H0 at α = 0.05, since F = 25.275 > FU = 3.89
Conclusion: There is evidence that at least one µj differs from the rest
[Figure: F distribution starting at 0, with the rejection region of area α = .05 to the right of FU = 3.89 ("Reject H0") and the region to the left labeled "Do not reject H0"]
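The critical value and p-value for this example can also be checked without a table. The sketch below relies on a special case rather than a general method: when the numerator degrees of freedom equal 2, the upper tail of the F distribution has the closed form P(F > f) = (1 + 2f/df2)^(-df2/2). The function names are invented here; for other values of df1, a table or a statistics library would be needed.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Upper critical value FU with P(F > FU) = alpha, for df1 = 2 only."""
    # Obtained by solving (1 + 2 * FU / df2) ** (-df2 / 2) = alpha for FU.
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


alpha = 0.05
df2 = 12
f_stat = 2358.2 / 93.3          # BMS / WMS from the example

critical = f_critical_df1_2(alpha, df2)
p_value = f_upper_tail_df1_2(f_stat, df2)

print(f"FU = {critical:.2f}")
print(f"p-value = {p_value:.6f}")
print("Reject H0" if f_stat > critical else "Do not reject H0")
```

The computed critical value agrees with FU = 3.89 to two decimals, and the p-value falls well below 0.05, which matches the decision to reject H0.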
ANOVA Table
Source | SS | DF | MS | F | P-value
Between | 4716.4 | 2 | 2358.2 | 25.275 | <0.001
Within | 1119.6 | 12 | 93.3 | |
Total | 5836.0 | 14 | | |
Next Lecture: Categorical Data Methods