Factorial ANOVA
Two-factor, crossed, additive models
Applied Stats Algorithm
(Course flowchart, not reproduced. Start from the scientific question; if there is none, that is unacceptable. Classify the study by the response variable (univariate vs. multivariate, complete vs. censored) and by the predictor(s) (categorical, numerical, or both); for categorical predictors, count the factors (1 vs. 2+) and decide between fixed and random effects. Today: one numerical response with 2+ categorical factors, fixed effects.)
Experiment Description
One (numerical) response variable: Dependent, Outcome.
Two (categorical) independent variables: Treatments, Predictors, Explanatory.
If the independent variables are both factors and are crossed, this is called a Factorial Design. If there are observations at each treatment combination, it is called a complete design. There are also incomplete and fractional factorial designs.
Farming Example (Factorial setup)
Suppose we continue with the farming example: 16 observations of crop yield (Y); 4 fertilizers (Factor A) with levels {a1, a2, a3, a4}; 2 types of crop (Factor B) with levels {corn, soybean}.
There are 4 x 2 = 8 different treatments, each with 2 replications.
This is a 4 x 2 factorial experiment.
Single-Factor setup
Why can't we just label the treatments as a single factor and analyze it as before? We can: Factor AB with 8 levels {a1 corn, a1 soybean, a2 corn, ..., a4 soybean}.

y <- c(65, 64, 56, 60, 55, 58, 62, 65, 66, 69, 72, 76, 60, 64, 68, 70)
a <- factor(rep(c("a1", "a2", "a3", "a4"), each=4))
b <- factor(rep(c("corn", "soybean"), times=8))
farm <- data.frame(yield=y, fertil=a, type=b)
farm <- within(farm, x <- paste(fertil, type))
# farm$x <- paste(farm$fertil, farm$type)   # equivalent
with(farm, tapply(yield, x, mean))
boxplot(yield ~ x, data=farm)
anova(lm(yield ~ x, data=farm))
R output
(Boxplot of yield by treatment combination, not reproduced.)

R output
> with(farm, tapply(yield, x, mean))
   a1 corn a1 soybean    a2 corn a2 soybean    a3 corn a3 soybean    a4 corn a4 soybean
      60.5       62.0       58.5       61.5       69.0       72.5       64.0       67.0

> anova(lm(yield ~ x, data=farm))
Analysis of Variance Table

Response: yield
          Df Sum Sq Mean Sq F value Pr(>F)
x          7 315.75  43.857  1.8992  0.194
Residuals  8 190.00  23.750

No significant effects are detected, despite the fact that we already know fertilizer is important. The effect is hidden by the non-importance of crop type.
Two-factor ANOVA
Single-factor analysis doesn't give us the effects separated by the two factors, which we want. It is also impossible to spot interactions with only one factor.
Think of an a x b factorial design as a series of b single-factor experiments, each with a groups, or as a series of a experiments, each with b groups.
Define a simple effect as the result of one of those b (or a) experiments.
Farming example
We have 4 fertilizers and 2 types of crop. For the corn crops, we have a simple effect of fertilizer on crop yield. For the soybean crops, we have another simple effect of fertilizer on crop yield.
These two simple effects, averaged together, are called the main effect of fertilizer.
If the simple effects are the same as the main effect, then there is no interaction present. Otherwise, there is an interaction.
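As a quick illustration in R (a minimal sketch using the farm data frame defined earlier), the simple effects are the rows of the two-way table of means, and the main effect of fertilizer is their average:

# Simple effects: fertilizer means within each crop type (one row per crop)
with(farm, tapply(yield, list(type, fertil), mean))
# Main effect of fertilizer: the two simple effects averaged together
with(farm, tapply(yield, fertil, mean))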
Interactions
An interaction means "it depends": the effect of Factor A depends on the level of Factor B, and the effect of Factor B depends on the level of Factor A.
It is important to recognize this before setting up the model; if an interaction is present, the statistical model will change.
Farming Example (Interaction)
Do we believe that the effect of fertilizer depends on what crop type is planted? This is like saying "Fertilizer a1 is best for corn, but a3 is best for soybeans." It's very possible!
If so, we cannot say something like "Fertilizer a1 is better than a3" because ... it depends!
Just plot the group means to find out; such a plot is called an interaction plot.
Plot the group means
with(farm, interaction.plot(fertil, type, yield, col=c("red", "blue"),
     main="Interaction Plot", xlab="Fertilizer", ylab="Yield"))
Another way to do it
with(farm, interaction.plot(type, fertil, yield, col=c("red", "blue", "darkgreen", "purple"),
     main="Interaction Plot", xlab="Crop type", ylab="Yield"))   # one colour per fertilizer
Interactions
Other, equivalent definitions of interaction:
The values of one or more contrasts change at different levels of the other factor.
The main effect is not representative of the simple effects.
The differences among cell means representing the effect of Factor A at one level of Factor B are not the same as at another level of Factor B.
The effects of one factor are conditionally related to the levels of another factor.
Additive Model
It doesn't look like the effect of Fertilizer depends on crop type, so fit an additive model:
Y_ijk = μ + α_i + β_j + ε_ijk
Y_ijk — kth obs. from the ith level of Factor A and the jth level of Factor B
μ — overall grand mean
α_i — deviation from the mean for the ith group of Factor A
β_j — deviation from the mean for the jth group of Factor B
ε_ijk — error/noise/uncertainty
We will study models with interactions later.
Additive Models
An additive model means that the "effects" are added together.
There is an effect of being in group i of Factor A: if you're in group i, you get α_i added to your expected outcome.
There is an effect of being in group j of Factor B: if you're in group j, you get β_j added to your expected outcome.
If you're in both groups, you get α_i + β_j added to your expected outcome.
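A minimal sketch of this additivity in R, using the farm data from before (model.tables() on an aov fit reports the estimated deviations; the fitted value for any cell is the grand mean plus the two effects):

fit.add <- aov(yield ~ fertil + type, data=farm)
model.tables(fit.add)          # estimated alpha_i (fertil) and beta_j (type) deviations
eff <- model.tables(fit.add)$tables
# Fitted cell mean for fertilizer a3 with soybean = grand mean + alpha_a3 + beta_soybean
mean(farm$yield) + eff$fertil["a3"] + eff$type["soybean"]
predict(fit.add, newdata=data.frame(fertil="a3", type="soybean"))   # same value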
R output
> with(farm, tapply(yield, list(fertil, type), mean))
   corn soybean
a1 60.5    62.0
a2 58.5    61.5
a3 69.0    72.5
a4 64.0    67.0

> anova(lm(yield ~ fertil + type, data=farm))
Analysis of Variance Table

Response: yield
          Df Sum Sq Mean Sq F value  Pr(>F)
fertil     3 283.25  94.417  5.4023 0.01571 *
type       1  30.25  30.250  1.7308 0.21507
Residuals 11 192.25  17.477
Conclusions
It seems like Fertilizer has an effect on crop Yield, but crop Type does not.
We did not allow the effect of Fertilizer to depend on Type, but it doesn't look like it does anyway (according to the interaction plot, at least). We should probably do a statistical test for this; that's coming up!
Of course, we should also follow up our main-effect conclusion with some pairwise tests, contrasts, etc.
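For example, one possible follow-up (a sketch only; Tukey's HSD on the additive fit is just one of several options):

fit.add <- aov(yield ~ fertil + type, data=farm)
TukeyHSD(fit.add, which="fertil")   # all pairwise fertilizer comparisons, family-wise adjusted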
Factorial ANOVA
Models with interactions
Consider
Consider 8 hypothetical experiments, each involving 2 levels of 2 different factors (A and B). Group means:

Exp't 1    b1  b2   Ȳ_A
a1          5   5    5
a2          5   5    5
Ȳ_B         5   5

Exp't 2    b1  b2   Ȳ_A
a1          4   4    4
a2          6   6    6
Ȳ_B         5   5

Exp't 3    b1  b2   Ȳ_A
a1          7   3    5
a2          7   3    5
Ȳ_B         7   3

Exp't 4    b1  b2   Ȳ_A
a1          6   2    4
a2          8   4    6
Ȳ_B         7   3

Exp't 5    b1  b2   Ȳ_A
a1          6   4    5
a2          4   6    5
Ȳ_B         5   5

Exp't 6    b1  b2   Ȳ_A
a1          5   3    4
a2          5   7    6
Ȳ_B         5   5

Exp't 7    b1  b2   Ȳ_A
a1          8   2    5
a2          6   4    5
Ȳ_B         7   3

Exp't 8    b1  b2   Ȳ_A
a1          7   1    4
a2          7   5    6
Ȳ_B         7   3
No Interactions
(Interaction plots for Experiments 1-4, not reproduced: the lines are parallel.)

Interactions
(Interaction plots for Experiments 5-8, not reproduced: the lines are not parallel.)
Possible Outcomes
As you've just seen, it is possible to have:
No interaction, and no main effects; a main effect for one factor but not the other; or main effects for both factors.
An interaction between factors, and no main effects; a main effect for one factor but not the other; or main effects for both factors.
Yikes! The combinations get more numerous as the number of levels (or factors) increases.
Removable Interactions
Sometimes we apply transformations to our data, perhaps to meet ANOVA assumptions, or to speak to Americans.
These can dampen out or create interactions (illustrative plots not reproduced).
Removable Interactions
An interaction is removable if there exists some transformation of the response that will make the interaction disappear.
You can tell from the interaction plot, sort of: if the lines cross, it is not removable; if the lines would cross when we plotted the axes reversed, it is not removable.
Removable interactions are less convincing, since they depend on the way the response variable is collected/measured.
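A small illustration with made-up data (not from the course): when the true effects are multiplicative, the raw scale shows an interaction that a log transformation removes.

set.seed(1)
A <- factor(rep(c("a1", "a2"), each=20))
B <- factor(rep(rep(c("b1", "b2"), each=10), times=2))
mu <- 10 * ifelse(A == "a2", 3, 1) * ifelse(B == "b2", 2, 1)   # multiplicative cell means
resp <- rnorm(40, mean=mu, sd=1)
anova(lm(resp ~ A*B))         # interaction shows up on the raw scale
anova(lm(log(resp) ~ A*B))    # interaction (essentially) disappears on the log scale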
Practice!
Interactive Model (Setup)
Two-factor design: 2 levels of Factor A, 3 levels of Factor B, and 4 observations at each treatment combination.

Data Table (first subscript = replicate)
        a1b1    a1b2    a1b3    a2b1    a2b2    a2b3
        Y_111   Y_112   Y_113   Y_121   Y_122   Y_123
        Y_211   Y_212   Y_213   Y_221   Y_222   Y_223
        Y_311   Y_312   Y_313   Y_321   Y_322   Y_323
        Y_411   Y_412   Y_413   Y_421   Y_422   Y_423
Sum     AB_11   AB_12   AB_13   AB_21   AB_22   AB_23

Table of Sums
        b1      b2      b3      Sum
a1      AB_11   AB_12   AB_13   A_1
a2      AB_21   AB_22   AB_23   A_2
Sum     B_1     B_2     B_3
Interactive Model
Suppose we want to include an interaction now:
Y_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk
Y_ijk — kth obs. from the ith level of Factor A and the jth level of Factor B
μ — overall grand mean
α_i — deviation from the mean for the ith group of Factor A
β_j — deviation from the mean for the jth group of Factor B
(αβ)_ij — deviation remaining after the main effects are removed
ε_ijk — error/noise/uncertainty
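In R's formula notation (used on the following slides), the interaction model is written with *:

# fertil*type expands to fertil + type + fertil:type
fit.int <- lm(yield ~ fertil * type, data=farm)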
Deviations of means (Two-factor model)
A_i effect = Ȳ_Ai - Ȳ_G
B_j effect = Ȳ_Bj - Ȳ_G
Interaction effect = (Ȳ_ij - Ȳ_G) - (Ȳ_Ai - Ȳ_G) - (Ȳ_Bj - Ȳ_G)
                   = Ȳ_ij - Ȳ_Ai - Ȳ_Bj + Ȳ_G
(Ȳ_G = grand mean; Ȳ_Ai, Ȳ_Bj = marginal means; Ȳ_ij = cell mean.)

We can write the between-group deviation as:
Ȳ_ij - Ȳ_G = (Ȳ_Ai - Ȳ_G) + (Ȳ_Bj - Ȳ_G) + (Ȳ_ij - Ȳ_Ai - Ȳ_Bj + Ȳ_G)
Ȳ_ij - Ȳ_G = (A_i effect) + (B_j effect) + (Interaction effect)
Deviations of individuals (Two-factor model)
That was for group means, not individuals. The deviation of an individual from the grand mean is the sum of the deviation of its group mean from the grand mean and the deviation of the individual from its group mean:
Y_ijk - Ȳ_G = (Ȳ_ij - Ȳ_G) + (Y_ijk - Ȳ_ij)
Replacing the first term with the formula on the last slide gives:
Y_ijk - Ȳ_G = (Ȳ_Ai - Ȳ_G) + (Ȳ_Bj - Ȳ_G) + (Ȳ_ij - Ȳ_Ai - Ȳ_Bj + Ȳ_G) + (Y_ijk - Ȳ_ij)
Does this relationship hold for SS as well?
SS revisited
We need to partition the between-group SS into three parts instead of just one: Factor A, Factor B, and the interaction.
Y_ijk - Ȳ_G = (Ȳ_Ai - Ȳ_G) + (Ȳ_Bj - Ȳ_G) + (Ȳ_ij - Ȳ_Ai - Ȳ_Bj + Ȳ_G) + (Y_ijk - Ȳ_ij)
Squaring and summing over all observations (the cross-products cancel in a balanced design):
Σ(Y_ijk - Ȳ_G)² = Σ(Ȳ_Ai - Ȳ_G)² + Σ(Ȳ_Bj - Ȳ_G)² + Σ(Ȳ_ij - Ȳ_Ai - Ȳ_Bj + Ȳ_G)² + Σ(Y_ijk - Ȳ_ij)²
SS_T = SS_A + SS_B + SS_AxB + SS_E
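A minimal numerical check of this partition on the farm data (the clean decomposition relies on the balanced design):

grand <- mean(farm$yield)
mA  <- with(farm, ave(yield, fertil))          # Ȳ_Ai attached to each observation
mB  <- with(farm, ave(yield, type))            # Ȳ_Bj
mAB <- with(farm, ave(yield, fertil, type))    # cell means Ȳ_ij
SS_A  <- sum((mA - grand)^2)
SS_B  <- sum((mB - grand)^2)
SS_AB <- sum((mAB - mA - mB + grand)^2)
SS_E  <- sum((farm$yield - mAB)^2)
c(SS_A=SS_A, SS_B=SS_B, SS_AB=SS_AB, SS_E=SS_E,
  total=SS_A + SS_B + SS_AB + SS_E, SS_T=sum((farm$yield - grand)^2))
# Compare with anova(lm(yield ~ fertil*type, data=farm))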
Homework example
Consider an experiment where we randomly assign a group of 16 students to 4 treatments: half will read the lecture notes, half won't; half will do the practice problems, half won't.
We measure their score on the midterm, out of 100. With 4 treatment combinations, this is a 2 x 2 factorial. Check for an interaction.
Homework example (R code & output)
df <- as.data.frame(expand.grid(readLecture = c(rep("no read", 2), rep("read", 2)),
                                pracProbs = c(rep("no practice", 2), rep("practice", 2)),
                                KEEP.OUT.ATTRS = F))
require(plyr)
df <- arrange(df, pracProbs, readLecture)
df$score <- c(50,45,56,60, 65,58,63,70, 60,70,68,65, 88,89,86,94)

> with(df, tapply(score, list(readLecture, pracProbs), mean))
        no practice practice
no read       52.75    65.75
read          64.00    89.25

with(df, interaction.plot(readLecture, pracProbs, score, col=c("red", "blue"), main="", type="b"))
with(df, interaction.plot(pracProbs, readLecture, score, col=c("red", "blue"), main="", type="b"))
Homework example (R output)
Interaction plots (both ways, just because; not reproduced).
When increasing the level of Factor A also increases the effect of Factor B, this is called a reinforcement effect. The other way around is called an interference effect.
Homework example (R code & output)
> anova(lm(score ~ readLecture*pracProbs, data=df))
Analysis of Variance Table

Response: score
                      Df  Sum Sq Mean Sq F value    Pr(>F)
readLecture            1 1207.56 1207.56 48.9139 1.447e-05 ***
pracProbs              1 1463.06 1463.06 59.2633 5.561e-06 ***
readLecture:pracProbs  1  150.06  150.06  6.0785   0.02974 *
Residuals             12  296.25   24.69
Homework example (Conclusions)
When the interaction is not significant, it should be removed and the main effects tested with an additive model. Some statisticians leave it in and test the main effects anyway, but this is less powerful.
Here, it looks like the two types of studying reinforce each other.
We usually need to be careful about how we interpret main effects in the presence of an interaction. Here it is pretty clear: both study methods are clearly beneficial.
Farming example (R code & output)
> anova(lm(yield ~ fertil + type, data=farm))
Analysis of Variance Table

Response: yield
          Df Sum Sq Mean Sq F value  Pr(>F)
fertil     3 283.25  94.417  5.4023 0.01571 *
type       1  30.25  30.250  1.7308 0.21507
Residuals 11 192.25  17.477

> anova(lm(yield ~ fertil*type, data=farm))
Analysis of Variance Table

Response: yield
            Df Sum Sq Mean Sq F value  Pr(>F)
fertil       3 283.25  94.417  3.9754 0.05262 .
type         1  30.25  30.250  1.2737 0.29178
fertil:type  3   2.25   0.750  0.0316 0.99186
Residuals    8 190.00  23.750
Testing Contrasts
Differences between marginal means are definitely contrasts (main effect of fertilizer):
H0: (μ11 + μ21)/2 = (μ12 + μ22)/2 = (μ13 + μ23)/2 = (μ14 + μ24)/2
How do we form these contrasts in a factorial ANOVA?

        f1              f2              f3              f4              Mean
t1      μ11             μ12             μ13             μ14             (μ11 + ... + μ14)/4
t2      μ21             μ22             μ23             μ24             (μ21 + ... + μ24)/4
Mean    (μ11+μ21)/2     (μ12+μ22)/2     (μ13+μ23)/2     (μ14+μ24)/2     μ

(Here μ_tf is the population mean for crop type t and fertilizer f.)
Testing Contrasts (Reference Coding)
First, write out the model and the expected value for each cell (use reference coding to start). With fertilizer dummies f2, f3, f4 and the soybean dummy t2:
E(Y) = β0 + β1·f2 + β2·f3 + β3·f4 + β4·t2 + β5·f2·t2 + β6·f3·t2 + β7·f4·t2

Fert  Crop   f2  f3  f4  t2  f2t2  f3t2  f4t2   E(Y | X = x)
 1     C      0   0   0   0    0     0     0    β0
 1     S      0   0   0   1    0     0     0    β0 + β4
 2     C      1   0   0   0    0     0     0    β0 + β1
 2     S      1   0   0   1    1     0     0    β0 + β1 + β4 + β5
 3     C      0   1   0   0    0     0     0    β0 + β2
 3     S      0   1   0   1    0     1     0    β0 + β2 + β4 + β6
 4     C      0   0   1   0    0     0     0    β0 + β3
 4     S      0   0   1   1    0     0     1    β0 + β3 + β4 + β7
Testing Contrasts (Reference Coding)
To test H0: (μ11 + μ21)/2 = (μ12 + μ22)/2, this corresponds to
H0: β0 + (1/2)β4 - [β0 + β1 + (1/2)(β4 + β5)] = 0
H0: β1 = β5 = 0 (can you see why?)

Expected cell means under reference coding:

        f1               f2                        f3                        f4                        Mean
t1      β0               β0 + β1                   β0 + β2                   β0 + β3                   β0 + (1/4)(β1 + β2 + β3)
t2      β0 + β4          β0 + β1 + β4 + β5         β0 + β2 + β4 + β6         β0 + β3 + β4 + β7         β0 + β4 + (1/4)(β1 + β2 + β3) + (1/4)(β5 + β6 + β7)
Mean    β0 + (1/2)β4     β0 + β1 + (1/2)(β4 + β5)  β0 + β2 + (1/2)(β4 + β6)  β0 + β3 + (1/2)(β4 + β7)
Reference Coding (R code & output)
> library(gmodels)   # provides glh.test()
> fit <- lm(yield ~ fertil*type, data=farm)
> model.matrix(fit)
> L <- rbind(c(0, 1, 0, 0, 0, 0, 0, 0),
             c(0, 0, 0, 0, 0, 1, 0, 0))
> glh.test(fit, L)

Test of General Linear Hypothesis
Call:
glh.test(reg = fit, cm = L)
F = 0.0895, df1 = 2, df2 = 8, p-value = 0.9153
Testing Contrasts (Reference Coding)
Interactions are also sets of contrasts:
H0: μ21 - μ11 = μ22 - μ12 = μ23 - μ13 = μ24 - μ14
or, equivalently,
H0: μ12 - μ11 = μ22 - μ21  and  μ13 - μ12 = μ23 - μ22  and  μ14 - μ13 = μ24 - μ23

        f1              f2              f3              f4              Mean
t1      μ11             μ12             μ13             μ14             (μ11 + ... + μ14)/4
t2      μ21             μ22             μ23             μ24             (μ21 + ... + μ24)/4
Mean    (μ11+μ21)/2     (μ12+μ22)/2     (μ13+μ23)/2     (μ14+μ24)/2     μ
Testing Contrasts (Reference Coding)
H0: μ21 - μ11 = μ22 - μ12 = μ23 - μ13 = μ24 - μ14
In reference coding, testing the interaction corresponds to
H0: β4 = β4 + β5 = β4 + β6 = β4 + β7
H0: β5 = β6 = β7 = 0
(Refer to the table of expected cell means under reference coding shown earlier.)
Reference Coding (R code & output)
> L <- rbind(c(0, 0, 0, 0, 0, 1, 0, 0),
             c(0, 0, 0, 0, 0, 0, 1, 0),
             c(0, 0, 0, 0, 0, 0, 0, 1))
> glh.test(fit, L)

Test of General Linear Hypothesis
Call:
glh.test(reg = fit, cm = L)
F = 0.0316, df1 = 3, df2 = 8, p-value = 0.9919

> anova(fit)
Response: yield
            Df Sum Sq Mean Sq F value  Pr(>F)
fertil       3 283.25  94.417  3.9754 0.05262 .
type         1  30.25  30.250  1.2737 0.29178
fertil:type  3   2.25   0.750  0.0316 0.99186
Residuals    8 190.00  23.750
Testing Contrasts (Reference Coding)
Reference coding gets worse the higher the order of the factorial experiment, and cell-means coding is no better. Effect coding gets easier, so better learn it now!
Testing Contrasts (Effect Coding)
With effect-coded columns f1, f2, f3 for fertilizer (fertilizer 4 is coded -1 on all three) and t1 for crop type (corn = +1, soybean = -1):
E(Y) = β0 + β1·f1 + β2·f2 + β3·f3 + β4·t1 + β5·f1·t1 + β6·f2·t1 + β7·f3·t1

Fert  Crop   f1  f2  f3  t1  f1t1  f2t1  f3t1   E(Y | X = x)
 1     C      1   0   0   1    1     0     0    β0 + β1 + β4 + β5
 1     S      1   0   0  -1   -1     0     0    β0 + β1 - β4 - β5
 2     C      0   1   0   1    0     1     0    β0 + β2 + β4 + β6
 2     S      0   1   0  -1    0    -1     0    β0 + β2 - β4 - β6
 3     C      0   0   1   1    0     0     1    β0 + β3 + β4 + β7
 3     S      0   0   1  -1    0     0    -1    β0 + β3 - β4 - β7
 4     C     -1  -1  -1   1   -1    -1    -1    β0 - β1 - β2 - β3 + β4 - β5 - β6 - β7
 4     S     -1  -1  -1  -1    1     1     1    β0 - β1 - β2 - β3 - β4 + β5 + β6 + β7
Testing Contrasts (Effect Coding)
To test H0: (μ11 + μ21)/2 = (μ12 + μ22)/2, this corresponds to
H0: (β0 + β1) - (β0 + β2) = 0
H0: β1 = β2, as expected

Expected cell means under effect coding:

        f1                   f2                   f3                   f4                                       Mean
t1      β0 + β1 + β4 + β5    β0 + β2 + β4 + β6    β0 + β3 + β4 + β7    β0 - β1 - β2 - β3 + β4 - β5 - β6 - β7    β0 + β4
t2      β0 + β1 - β4 - β5    β0 + β2 - β4 - β6    β0 + β3 - β4 - β7    β0 - β1 - β2 - β3 - β4 + β5 + β6 + β7    β0 - β4
Mean    β0 + β1              β0 + β2              β0 + β3              β0 - β1 - β2 - β3                        β0
Testing Contrasts (Effect Coding)
H0: μ21 - μ11 = μ22 - μ12 = μ23 - μ13 = μ24 - μ14
In effect coding, testing the interaction corresponds to
H0: 2β4 + 2β5 = 2β4 + 2β6 = 2β4 + 2β7 = 2β4 - 2β5 - 2β6 - 2β7
H0: β5 = β6 = β7 = -β5 - β6 - β7
H0: β5 = β6 = β7 = 0
(The table of expected cell means from the previous slide applies.)
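A possible sketch of the effect-coding fit in R (assuming the gmodels package is loaded as before; R's contr.sum gives this coding, with coefficients ordered Intercept, fertil1-3, type1, then the interactions):

fit.eff <- lm(yield ~ fertil*type, data=farm,
              contrasts=list(fertil="contr.sum", type="contr.sum"))
coef(fit.eff)                          # beta0, beta1..beta3, beta4, beta5..beta7

# H0: beta1 = beta2 (fertilizers 1 and 2 have equal marginal means)
glh.test(fit.eff, rbind(c(0, 1, -1, 0, 0, 0, 0, 0)))

# H0: beta5 = beta6 = beta7 = 0 (no interaction)
glh.test(fit.eff, rbind(c(0, 0, 0, 0, 0, 1, 0, 0),
                        c(0, 0, 0, 0, 0, 0, 1, 0),
                        c(0, 0, 0, 0, 0, 0, 0, 1)))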
A Common Practice
We want to predict the Grade in a class from Sex {Male, Female} and PoST (program of study) {Stats, Other}. This is a 2x2 factorial design.
The main interest is the effect of sex, but we recognize that maybe PoST has an effect too, so:
split the data by PoST, and run two-sample t-tests comparing the sexes within each PoST.
If one is significant and the other isn't, then the effect of sex depends on PoST.
Good idea? Discuss ...
Visualize
(Plot of the data, not reproduced.)
A Common Practice (R code & output)
> with(subset(fact, post == "Stats"), t.test(score[sex == "Male"], score[sex == "Female"]))
Welch Two Sample t-test
t = -2.0979, df = 17.684, p-value = 0.05057
95 percent confidence interval:
-20.42798825 0.02798825
mean of x mean of y
68.8 79.0
> with(subset(fact, post == "Other"), t.test(score[sex == "Male"], score[sex == "Female"]))
Welch Two Sample t-test
t = -2.556, df = 17.62, p-value = 0.02007
95 percent confidence interval:
-18.961573 -1.838427
mean of x mean of y
50.4 60.8
A Common Conclusion
At the 5% significance level: males and females have similar grades in the Stats program, but males and females have different grades in the Other programs. Therefore, the effect of sex depends on the program of study.
What do you think?
The correct solution
> fit <- lm(score ~ sex*post, data=fact)
> anova(fit)
Analysis of Variance Table

Response: score
          Df Sum Sq Mean Sq F value    Pr(>F)
sex        1 1060.9  1060.9  10.557   0.00251 **
post       1 3348.9  3348.9  33.326 1.398e-06 ***
sex:post   1    0.1     0.1   0.001   0.97501
Residuals 36 3617.6   100.5

The interaction is nowhere near significant (p = 0.975), so there is no evidence that the effect of sex depends on the program of study; the two separate t-tests simply landed on opposite sides of the 5% cutoff.
Three Factors: A, B and C
Three main effects: one each for A, B, and C.
There are many subsets of simple effects: the effect of A at level b_j and c_k, etc.
Three two-factor interactions: AxB (averaging over C), AxC (averaging over B), BxC (averaging over A).
One three-factor interaction: AxBxC.
Three-factor interaction
The form of the AxB interaction depends on the value of C. The form of the AxC interaction depends on the value of B. The form of the BxC interaction depends on the value of A.
These statements are equivalent; use the one that is easiest to understand.
Graph a 3-factor interaction
Make a 2-factor interaction plot at each level of the third factor. Maybe pick the factor that has the fewest levels.
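A minimal sketch of one way to do this in base R (dat, A, B, C, and y are hypothetical names, not from the course):

op <- par(mfrow=c(1, nlevels(dat$C)))   # one panel per level of the third factor
for (lev in levels(dat$C)) {
  with(subset(dat, C == lev),
       interaction.plot(A, B, y, main=paste("C =", lev), ylab="mean of y"))
}
par(op)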
Higher order factorial designs
For F factors:
There will be one intercept.
There will be (F choose 1) = F main effects.
There will be (F choose k) k-factor interactions, for each k = 2, ..., F.
There is an F test for each one.
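For example, with F = 3 factors (a quick check in R; A, B, C are placeholder names):

choose(3, 1)   # 3 main effects
choose(3, 2)   # 3 two-factor interactions
choose(3, 3)   # 1 three-factor interaction
attr(terms(y ~ A*B*C), "term.labels")
# "A" "B" "C" "A:B" "A:C" "B:C" "A:B:C"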
Higher order factorial designs
As the number of factors increases:
The higher-way interactions get harder and harder to understand.
All the tests are still tests of sets of contrasts (differences between differences of differences ...).
It gets harder and harder to write down the contrasts, but effect coding becomes easier.