Date post: | 03-Jan-2016 |
Upload: | derek-levine |
Anthony J Greene 1
ANOVA: Analysis of Variance
1-way ANOVA
ANOVA
I. What is Analysis of Variance
   1. The F-ratio
   2. Used for testing hypotheses among more than two means
   3. As with the t-test, effect is measured in the numerator, error variance in the denominator
   4. Partitioning the Variance
II. Different computational concerns for ANOVA
   1. Degrees of Freedom for Numerator and Denominator
   2. No such thing as a negative value
III. Using Table B.4
IV. The Source Table
V. Hypothesis testing
ANOVA
• Analysis of Variance
• Hypothesis testing for more than 2 groups
• For only 2 groups, t²(n) = F(1, n)
$$t = \frac{M_1 - M_2}{s_M} = \frac{s_{effect}}{s_{error}} \qquad F = \frac{s^2_{effect}}{s^2_{error}}$$
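The two-group identity t² = F can be checked numerically. The following is a minimal sketch, assuming a pooled-variance independent-samples t-test; the scores and the helper names (`sum_sq`, `pooled_t`, `f_two_groups`) are made up for illustration and are not from the slides.

```python
from statistics import mean

def sum_sq(xs):
    """Sum of squared deviations from the group mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def pooled_t(a, b):
    """Independent-samples t statistic using the pooled variance."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq(a) + sum_sq(b)) / df
    std_error = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / std_error

def f_two_groups(a, b):
    """One-way ANOVA F statistic for exactly two groups (df between = 1)."""
    grand_mean = mean(a + b)
    ss_between = (len(a) * (mean(a) - grand_mean) ** 2
                  + len(b) * (mean(b) - grand_mean) ** 2)
    ms_within = (sum_sq(a) + sum_sq(b)) / (len(a) + len(b) - 2)
    return ss_between / ms_within

# Illustrative scores only (an assumption, not data from the slides).
group_a = [0, 1, 3, 0, 1]
group_b = [5, 8, 6, 4, 7]

t = pooled_t(group_a, group_b)
f = f_two_groups(group_a, group_b)
print(t ** 2, f)
```

With two groups the numerator of F has 1 degree of freedom, so the two printed values agree up to floating-point error.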
BASIC IDEA
• As with the t-test, the numerator expresses the differences in the dependent measure between experimental groups, and the denominator is the error.
• If the effect is sufficiently larger than random error, we reject the null hypothesis.
$$F = \frac{\text{Between-Treatment Variance}}{\text{Within-Treatment Variance}} = \frac{\text{Effect Variability}}{\text{Random Variability}}$$
[Figure: scores for Grp 1, Grp 2, and Grp 3, with M1 = 1, M2 = 4, M3 = 1. Is the effect variability large compared to the random variability?]
BASIC IDEA
• If the differences accounted for by the manipulation are small (or zero), then F ≈ 1.
• If the effects are twice as large as the error, then F = 3, which generally indicates an effect.
$$F = \frac{\text{Treatment Effect} + \text{Error}}{\text{Error}}$$
Sources of Variance
Why Is It Called Analysis of Variance? Aren’t We Interested In Means, Not Variance?
• Most statisticians do not know the answer to this question.
• If we’re interested in differences among means, why do an analysis of variance?
• The misconception is that it compares $\sigma_1^2$ to $\sigma_2^2$. It does not.
• The comparison is between effect variance (differences in group means) and random variance.
Learning Under Three Temperature Conditions
$T = \sum X$ and $G = \sum T$, where T is the treatment total and G is the grand total
Computing the Sums of Squares
How Variance is Partitioned
This simply disregards group membership and computes an overall SS.
Variability between and within groups is included.
Keep in mind the general formula for SS:
$$SS = \sum X^2 - \frac{(\sum X)^2}{N}$$
$$SS_{Total} = \sum X^2 - \frac{G^2}{N}$$
[Figure: scores for Grp 1, Grp 2, and Grp 3, with M1 = 1, M2 = 4, M3 = 1]
How Variance is Partitioned
Imagine there were no individual differences at all.
The SS for all scores would measure only the fact that there were group differences.
Keep in mind the general formula for SS:
$$SS = \sum X^2 - \frac{(\sum X)^2}{N}$$
$$SS_{Between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$
Grp 1 Grp 2 Grp 3
1 4 1
1 4 1
1 4 1
1 4 1
1 4 1
M1 = 1 M2 = 4 M3 = 1
T1 = 5 T2 = 20 T3 = 5
How Variance is Partitioned
SS computed within a column removes the mean.
Thus summing the SSs for each column computes the overall variability except for the mean differences between groups.
Keep in mind the general formula for SS:
$$SS = \sum (X - M)^2$$
$$SS_{Within} = \sum SS$$
Deviations (X - M) within each group:
Grp 1 Grp 2 Grp 3
0-1 4-4 1-1
1-1 3-4 2-1
3-1 6-4 2-1
1-1 3-4 0-1
0-1 4-4 0-1
M1 = 1 M2 = 4 M3 = 1
How Variance is Partitioned
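$$SS_{Total} = \sum X^2 - \frac{G^2}{N}$$
$$SS_{Between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$
$$SS_{Within} = \sum SS, \qquad SS = \sum (X - M)^2$$
Deviations (X - M) within each group:
Grp 1 Grp 2 Grp 3
0-1 4-4 1-1
1-1 3-4 2-1
3-1 6-4 2-1
1-1 3-4 0-1
0-1 4-4 0-1
M1 = 1 M2 = 4 M3 = 1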
Computing Degrees of Freedom
• df between is k - 1, where k is the number of treatment groups (for the prior example, k = 3, since there were 3 temperature conditions)
• df within is N - k, where N is the total number of scores across all groups. Recall that for a t-test with two independent groups of n subjects each, df was 2n - 2: 2n was the total number of subjects, N, and 2 was the number of groups, k.
How Degrees Freedom Are Partitioned
N-1 = (N - k) + (k - 1)
N-1 = N - k + k – 1
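The partition of degrees of freedom can be sketched in a few lines. This is only an illustration; the function name `degrees_of_freedom` and its list-of-group-sizes argument are assumptions made for the example.

```python
def degrees_of_freedom(group_sizes):
    """Return (df_between, df_within, df_total) for a one-way ANOVA."""
    k = len(group_sizes)           # number of treatment groups
    n_total = sum(group_sizes)     # N, the total number of scores
    df_between = k - 1
    df_within = n_total - k
    df_total = n_total - 1
    # The partition N - 1 = (N - k) + (k - 1) always holds.
    assert df_between + df_within == df_total
    return df_between, df_within, df_total

print(degrees_of_freedom([5, 5, 5]))  # three groups of five scores
```

For three groups of five scores this gives 2, 12, and 14 degrees of freedom.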
Partitioning The Sums of Squares
Computing An F-Ratio
$$MS_{between} = \frac{SS_{between}}{df_{between}}$$
$$MS_{within} = \frac{SS_{within}}{df_{within}}$$
$$F = \frac{MS_{between}}{MS_{within}}$$
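As a rough sketch, the F-ratio can be computed from the sums of squares and degrees of freedom as follows. The function name `f_ratio` is hypothetical, and the input values are the ones used in the St. John's Wort example later in the deck.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS_between, MS_within, F) from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

ms_b, ms_w, f_value = f_ratio(ss_between=70, df_between=2,
                              ss_within=24, df_within=12)
print(ms_b, ms_w, f_value)
```

Because both mean squares are sums of squares divided by positive df, F can never be negative.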
Consult Table B-4
Take a standard normal distribution, square each value, and it looks like this
Table B-4
Two different F-curves
ANOVA: Hypothesis Testing
Basic Properties of F-Curves
Property 1: The total area under an F-curve is equal to 1.
Property 2: An F-curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis as it does so.
Property 3: An F-curve is right skewed.
Finding the F-value having area 0.05 to its right
Assumptions for One-Way ANOVA
1. Independent samples: The samples taken from the populations under consideration are independent of one another.
2. Normal populations: For each population, the variable under consideration is normally distributed.
3. Equal standard deviations: The standard deviations of the variable under consideration are the same for all the populations.
Learning Under Three Temperature Conditions
M1 = 1 M2 = 4 M3 = 1
Learning Under Three Temperature Conditions
M1 = 1 M2 = 4 M3 = 1
Is the effect variability large compared to the random variability?
Learning Under Three Temperature Conditions
ΣX² = 106
Squared scores (X²):
Grp 1 Grp 2 Grp 3
0 16 1
1 9 4
9 36 4
1 9 0
0 16 0
Learning Under Three Temperature Conditions
Calculating the F statistic
SStotal = ΣX² - G²/N = 46
$$SS_{between} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$
SSbetween = 30
SStotal = SSbetween + SSwithin
SSwithin = 16
$$F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/df_{between}}{SS_{within}/df_{within}} = \frac{30/2}{16/12} = \frac{15}{1.33} \approx 11.28$$
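The sums of squares for the temperature example can be reproduced with a short script. This is a sketch that assumes the raw scores are the ones appearing in the deviation rows earlier in the deck (0, 1, 3, 1, 0; 4, 3, 6, 3, 4; 1, 2, 2, 0, 0); the variable names are chosen for this example.

```python
# Scores read from the deviation rows on the partitioning slides
# (an assumption: the original data table did not survive extraction).
groups = [
    [0, 1, 3, 1, 0],
    [4, 3, 6, 3, 4],
    [1, 2, 2, 0, 0],
]

scores = [x for g in groups for x in g]
n_total = len(scores)                 # N
grand_total = sum(scores)             # G
correction = grand_total ** 2 / n_total   # G^2 / N

sum_x_squared = sum(x * x for x in scores)
ss_total = sum_x_squared - correction
ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
ss_within = ss_total - ss_between

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_value = (ss_between / df_between) / (ss_within / df_within)

print(sum_x_squared, ss_total, ss_between, ss_within, f_value)
```

Without intermediate rounding the ratio comes out to 11.25; dividing 15 by MS within rounded to 1.33 gives 11.28.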
Distribution of the F-Statistic for One-Way ANOVA
Suppose the variable under consideration is normally distributed on each of k populations and that the population standard deviations are equal. Then, for independent samples from the k populations, the variable
$$F = \frac{MS_{between}}{MS_{within}} = \frac{MS_{treatment}}{MS_{error}}$$
has the F-distribution with df = (k – 1, n – k) if the null hypothesis of equal population means is true. Here n denotes the total number of observations.
ANOVA Source Table
for a one-way analysis of variance
The one-way ANOVA test for k population means (Slide 1 of 3)
Step 1 The null and alternative hypotheses are
H0: μ1 = μ2 = μ3 = … = μk
Ha: Not all the means are equal
Step 2 Decide on the significance level, α
Step 3 Find the critical value of F, with df = (k - 1, N - k), where N is the total number of observations.
The one-way ANOVA test for k population means (Slide 2 of 3)
The one-way ANOVA test for k population means (Slide 3 of 3)
Step 4 Obtain the three sums of squares, SST, SSTR, and SSE
Step 5 Construct a one-way ANOVA table:
Step 6 If the value of the F-statistic falls in the rejection region, reject H0; otherwise, do not reject H0.
Post Hocs
• H0: μ1 = μ2 = μ3 = … = μk
• Rejecting H0 means that not all means are equal.
• Pairwise tests are required to determine which of the means are different.
• One problem arises for large k. For example, with k = 7, 21 pairs of means must be compared. Post hoc tests are designed to reduce the likelihood of groupwise Type I error.
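To see why many pairwise tests are a problem, the number of comparisons and the chance of at least one Type I error can be sketched as below. The error-rate formula assumes the tests are independent, which pairwise comparisons on shared groups are not exactly, so treat it as a rough illustration; the function names and α = 0.05 are assumptions for the example.

```python
from math import comb

def pairwise_comparisons(k):
    """Number of distinct pairs among k group means: k(k - 1) / 2."""
    return comb(k, 2)

def groupwise_error(alpha, m):
    """Chance of at least one Type I error in m tests at level alpha.

    Assumes the m tests are independent (a simplifying assumption).
    """
    return 1 - (1 - alpha) ** m

m = pairwise_comparisons(7)
print(m, groupwise_error(0.05, m))
```

With k = 7 there are 21 comparisons, and under the independence assumption the groupwise error rate at α = 0.05 is well above one half.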
Criterion for deciding whether or not to reject the null hypothesis
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8
3 4 6
0 1 4
1 1 7
A researcher wants to test the effects of St. John’s Wort, an over-the-counter herbal antidepressant. The measure is a scale of self-worth. The subjects are clinically depressed patients. Use α = 0.01.
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6
0 1 4
1 1 7
T1=5 T2=10 T3=30
Compute the treatment totals, T, and the grand total, G
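The treatment totals and grand total can be checked with a few lines of code. This is a minimal sketch using the scores from the table above; the dictionary layout and variable names are choices made for the example.

```python
# Scores from the St. John's Wort table, one list per treatment.
data = {
    "control": [0, 1, 3, 0, 1],
    "low dose": [1, 3, 4, 1, 1],
    "high dose": [5, 8, 6, 4, 7],
}

# T for each treatment is the sum of its scores; G is the sum of all T.
treatment_totals = {name: sum(scores) for name, scores in data.items()}
grand_total = sum(treatment_totals.values())

print(treatment_totals, grand_total)
```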
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7
T1=5 T2=10 T3=30
n1=5 n2=5 n3=5
Count n for each treatment, the total N, and k
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7
T1=5 T2=10 T3=30
n1=5 n2=5 n3=5
M1=1 M2 =2 M3 =6
Compute the treatment means
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7
T1=5 T2=10 T3=30
n1=5 n2=5 n3=5
M1=1 M2 =2 M3 =6
SS=6 SS=8 SS=10
Compute the treatment SSs
(0-1)² = 1
(1-1)² = 0
(3-1)² = 4
(0-1)² = 1
(1-1)² = 0
sum = 6
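The same squared-deviation calculation can be applied to every treatment. The sketch below uses the definitional formula SS = Σ(X - M)²; the helper name `sum_of_squares` is an assumption for this example.

```python
def sum_of_squares(scores):
    """SS = sum of squared deviations of each score from the group mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

control = [0, 1, 3, 0, 1]
low_dose = [1, 3, 4, 1, 1]
high_dose = [5, 8, 6, 4, 7]

treatment_ss = [sum_of_squares(g) for g in (control, low_dose, high_dose)]
print(treatment_ss)
```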
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7 ΣX²=229
T1=5 T2=10 T3=30
n1=5 n2=5 n3=5
M1=1 M2 =2 M3 =6
SS=6 SS=8 SS=10
Compute all X² values and sum them
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7 ΣX²=229
T1=5 T2=10 T3=30 SSTotal=94
n1=5 n2=5 n3=5
M1=1 M2 =2 M3 =6
SS=6 SS=8 SS=10
Compute SSTotal
SSTotal = ΣX² – G²/N = 229 – 45²/15 = 94
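The computational formula for SSTotal can be sketched directly from the pooled scores. Variable names here are chosen for the example.

```python
# All 15 scores from the St. John's Wort table, ignoring group membership.
scores = [0, 1, 3, 0, 1,   # control
          1, 3, 4, 1, 1,   # low dose
          5, 8, 6, 4, 7]   # high dose

n_total = len(scores)                      # N
grand_total = sum(scores)                  # G
sum_x_squared = sum(x * x for x in scores) # sum of X^2

ss_total = sum_x_squared - grand_total ** 2 / n_total
print(sum_x_squared, ss_total)
```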
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7 ΣX²=229
T1=5 T2=10 T3=30 SSTotal=94
n1=5 n2=5 n3=5 SSWithin=24
M1=1 M2 =2 M3 =6
SS1=6 SS2=8 SS3=10
Compute SSWithin
SSWithin = ΣSSi = 6 + 8 + 10 = 24
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7 ΣX²=229
T1=5 T2=10 T3=30 SSTotal=94
n1=5 n2=5 n3=5 SSWithin=24
M1=1 M2 =2 M3 =6 d.f. Within=12
SS1=6 SS2=8 SS3=10 d.f. Between=2
d.f. Total=14
Determine d.f.s
d.f. Within=N-k
d.f. Between=k-1
d.f. Total=N-1
Note that (N-k)+(k-1)=N-1
One-Way ANOVA
control low dose high dose
0 1 5
1 3 8 G=45
3 4 6 N=15
0 1 4 k=3
1 1 7 ΣX²=229
T1=5 T2=10 T3=30 SSTotal=94
n1=5 n2=5 n3=5 SSWithin=24
M1=1 M2 =2 M3 =6 d.f. Within=12
SS1=6 SS2=8 SS3=10 d.f. Between=2
d.f. Total=14
Ready to move it to a source table
One-Way ANOVA
Compute the missing values
Source SS df MS F
Between 70 2
Within 24 12
Total 94 14
One-Way ANOVA
Compute the missing values
Source SS df MS F
Between 70 2 35
Within 24 12 2
Total 94 14
One-Way ANOVA
Compute the missing values
Source SS df MS F
Between 70 2 35 17.5
Within 24 12 2
Total 94 14
One-Way ANOVA
1. Compare your F of 17.5 with the critical value at 2,12 degrees of freedom, α = 0.01: 6.93
2. Reject H0
Source SS df MS F
Between 70 2 35 17.5
Within 24 12 2
Total 94 14
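The whole St. John's Wort analysis can be sketched end to end. The function names are hypothetical. Instead of a table lookup, the critical value uses a closed form that holds only when the numerator df is 2: for F(2, d), the area to the right of x is (1 + 2x/d)^(-d/2), so the critical value is (d/2)(α^(-2/d) - 1). For other numerator df, Table B-4 or a statistics library is needed.

```python
def one_way_anova(groups):
    """Build the one-way ANOVA source table from a list of score lists."""
    k = len(groups)
    scores = [x for g in groups for x in g]
    n_total = len(scores)
    correction = sum(scores) ** 2 / n_total            # G^2 / N
    ss_total = sum(x * x for x in scores) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "between": (ss_between, df_between, ms_between),
        "within": (ss_within, df_within, ms_within),
        "total": (ss_total, n_total - 1),
        "F": ms_between / ms_within,
    }

def f_critical_numerator_df_2(df_within, alpha):
    """Upper-tail critical value of F(2, df_within); valid only for df1 = 2."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)

table = one_way_anova([[0, 1, 3, 0, 1], [1, 3, 4, 1, 1], [5, 8, 6, 4, 7]])
critical = f_critical_numerator_df_2(table["within"][1], alpha=0.01)
decision = "reject H0" if table["F"] > critical else "do not reject H0"
print(table)
print(round(critical, 2), decision)
```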
One-Way ANOVA
Low Medium High
2 6 9
4 4 10
3 5 8
0 3 10
2 6 8
1 6 9
Students want to know if studying has an impact on a 10-point statistics quiz, so they divided into 3 groups: low studying (0-5hrs./wk), medium studying (6-15 hrs./wk) and high studying (16+ hours/week). At α=0.01, does the amount of studying impact quiz scores?
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8
0 3 10
2 6 8
1 6 9
T1=12 T2=30 T3=54
Compute the treatment totals, T, and the grand total, G
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
T1=12 T2=30 T3=54
n1=6 n2=6 n3=6
Count n for each treatment, the total N, and k
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
T1=12 T2=30 T3=54
n1=6 n2=6 n3=6
M1=2 M2 =5 M3 =9
Compute the treatment means
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
T1=12 T2=30 T3=54
n1=6 n2=6 n3=6
M1=2 M2 =5 M3 =9
SS=10 SS=8 SS=4
Compute the treatment SSs
(2-2)² = 0
(4-2)² = 4
(3-2)² = 1
(0-2)² = 4
(2-2)² = 0
(1-2)² = 1
sum = 10
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
ΣX²=682
T1=12 T2=30 T3=54
n1=6 n2=6 n3=6
M1=2 M2 =5 M3 =9
SS=10 SS=8 SS=4
Compute all X² values and sum them
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
ΣX²=682
T1=12 T2=30 T3=54 SSTotal=170
n1=6 n2=6 n3=6
M1=2 M2 =5 M3 =9
SS=10 SS=8 SS=4
Compute SSTotal
SSTotal = ΣX² – G²/N = 682 – 96²/18 = 170
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
ΣX²=682
T1=12 T2=30 T3=54 SSTotal=170
n1=6 n2=6 n3=6 SSWithin=22
M1=2 M2 =5 M3 =9
SS1=10 SS2=8 SS3=4
Compute SSWithin
SSWithin = ΣSSi = 10 + 8 + 4 = 22
One-Way ANOVA
low medium high
2 6 9
4 4 10 G=96
3 5 8 N=18
0 3 10 k=3
2 6 8
1 6 9
ΣX²=682
T1=12 T2=30 T3=54 SSTotal=170
n1=6 n2=6 n3=6 SSWithin=22
M1=2 M2 =5 M3 =9 d.f. Within=15
SS1=10 SS2=8 SS3=4 d.f. Between=2
Determine d.f.s
d.f. Within=N-k
d.f. Between=k-1
d.f. Total=N-1
Note that (N-k)+(k-1)=N-1
One-Way ANOVA
Fill in the values you have
Source SS df MS F
Between 2
Within 22 15
Total 170 17
One-Way ANOVA
Compute the missing values
Source SS df MS F
Between 148 2 74 50.45
Within 22 15 1.47
Total 170 17
One-Way ANOVA
1. Compare your F of 50.45 with the critical value at 2,15 degrees of freedom, α = 0.01: 6.36
2. Reject H0
Source SS df MS F
Between 148 2 74 50.45
Within 22 15 1.47
Total 170 17
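The studying example can be recomputed from the raw scores as a check on the source table. This sketch uses the definitional formulas (deviations from group means and from the grand mean) rather than the computational ones. It assumes the last two rows of the data table read 2, 6, 8 and 1, 6, 9, which matches the treatment totals of 12, 30, and 54. The critical-value closed form applies only because the numerator df is 2; names are chosen for the example.

```python
# Quiz scores by amount of studying (last two rows are an assumption,
# reconstructed so that the column totals are 12, 30, and 54).
low = [2, 4, 3, 0, 2, 1]
medium = [6, 4, 5, 3, 6, 6]
high = [9, 10, 8, 10, 8, 9]
groups = [low, medium, high]

def mean(xs):
    return sum(xs) / len(xs)

def sum_of_squares(xs):
    """Sum of squared deviations from the group mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n_total = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_total

ss_within = sum(sum_of_squares(g) for g in groups)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_total = ss_between + ss_within

df_between = len(groups) - 1
df_within = n_total - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

# Exact critical value for F(2, df_within) at alpha = 0.01 (df1 = 2 only).
alpha = 0.01
critical = (df_within / 2) * (alpha ** (-2 / df_within) - 1)

print("Source   SS      df  MS     F")
print(f"Between  {ss_between:6.2f}  {df_between:2d}  {ms_between:5.2f}  {f_value:.2f}")
print(f"Within   {ss_within:6.2f}  {df_within:2d}  {ms_within:5.2f}")
print(f"Total    {ss_total:6.2f}  {n_total - 1:2d}")
print("critical value:", round(critical, 2))
print("reject H0" if f_value > critical else "do not reject H0")
```

The script prints the completed source table, the critical value, and the decision, so each entry can be compared with the hand calculation.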