© Department of Statistics 2012STATS 330 Lecture 18 Slide 1
Stats 330: Lecture 18
Anova Models
• These are linear (regression) models where all the explanatory variables are categorical.
• If there is just one categorical explanatory variable, then we have the “one-way anova” model discussed in STATS 201/8
• If there are two categorical explanatory variables, then we have the “two-way anova” model, also discussed in STATS 201/8
• However, we shall regard these as just another type of regression model
Example: One way model
• In an experiment to study the effect of carcinogenic substances, six different substances were applied to cell cultures.
• The response variable (ratio) is the ratio of damaged to undamaged cells, and the explanatory variable (treatment) is the substance
• On website – carcinogenic substances data
Data
   ratio treatment
1   0.08 control
2   0.08 chloralhydrate
3   0.10 diazapan
4   0.10 hydroquinone
5   0.07 econidazole
6   0.17 colchicine
7   0.08 control
8   0.10 chloralhydrate
9   0.08 diazapan
10  0.10 hydroquinone
11  0.08 econidazole
12  0.19 colchicine
13  0.09 control
14  0.08 chloralhydrate
15  0.12 diazapan
. . . More data
Distributions skewed?
boxplot(ratio~treatment, data=carcin.df, ylab = "ratio", main = "Ratios for different substances")
[Figure: boxplots of ratio for each treatment (control, chloralhydrate, colchicine, diazapan, econidazole, hydroquinone), titled “Ratios for different substances”]
The model
$\text{mean ratio} = \mu$
where the mean depends on the substance. Thus,
for the control: $\text{mean ratio} = \mu_{\text{Control}}$
for colchicine: $\text{mean ratio} = \mu_{\text{Colchicine}}$
.....
We make the usual assumptions about the errors (normal, equal variance, independent etc)
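Offset form
$\mu_{\text{Control}} = \mu$
$\mu_{\text{ChloralHydrate}} = \mu + \delta_{\text{ChloralHydrate}}$
$\mu_{\text{Diazapan}} = \mu + \delta_{\text{Diazapan}}$
$\mu_{\text{Hydroquinone}} = \mu + \delta_{\text{Hydroquinone}}$
$\mu_{\text{Econidazole}} = \mu + \delta_{\text{Econidazole}}$
$\mu_{\text{Colchicine}} = \mu + \delta_{\text{Colchicine}}$
where $\mu$ is the baseline (control) mean and each $\delta$ is the offset of a treatment mean from the baseline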
Dummy variable form
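• Define:
CH = 1 if treatment = chloralhydrate, 0 else
D = 1 if treatment = diazapan, 0 else
H = 1 if treatment = hydroquinone, 0 else
E = 1 if treatment = econidazole, 0 else
C = 1 if treatment = colchicine, 0 else
• Then
$\text{ratio} = \mu + \delta_{\text{ChloralHydrate}}\,CH + \delta_{\text{Diazapan}}\,D + \delta_{\text{Hydroquinone}}\,H + \delta_{\text{Econidazole}}\,E + \delta_{\text{Colchicine}}\,C + \text{error}$
where $\mu$ is the control mean and the $\delta$'s are the offsets from it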
Estimation
• To estimate the offsets and the baseline (control) mean, we use lm as usual. We have to rearrange the levels to make the control the baseline
carcin.df = read.table(file.choose(), header=TRUE)
carcin.df$treatment = factor(carcin.df$treatment, levels = c("control", "chloralhydrate", "colchicine", "diazapan", "econidazole", "hydroquinone"))
summary(lm(ratio~treatment, data=carcin.df))
Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)              0.23660    0.02037  11.616  < 2e-16 ***
treatmentchloralhydrate  0.03240    0.02880   1.125  0.26158
treatmentcolchicine      0.21160    0.02880   7.346 2.02e-12 ***
treatmentdiazapan        0.04420    0.02880   1.534  0.12599
treatmenteconidazole     0.02820    0.02880   0.979  0.32838
treatmenthydroquinone    0.07540    0.02880   2.618  0.00931 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.144 on 294 degrees of freedom
Multiple R-squared: 0.1903, Adjusted R-squared: 0.1766
F-statistic: 13.82 on 5 and 294 DF, p-value: 3.897e-12
lm output
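To see what the coefficients in such output represent, here is a minimal sketch on simulated data. The data frame `sim.df`, the group names `drugA` and `drugB`, and the means used are all made up for illustration; this is not the carcinogen data. It checks that the intercept is the baseline group mean, that the remaining coefficients are the offsets of the other group means from the baseline, and that a regression on hand-built dummy variables gives the same fit.

```r
# Simulated one-way layout: 3 hypothetical groups, 10 observations each
set.seed(330)
sim.df <- data.frame(
  treatment = factor(rep(c("control", "drugA", "drugB"), each = 10),
                     levels = c("control", "drugA", "drugB"))
)
true.means <- c(control = 0.25, drugA = 0.30, drugB = 0.45)
sim.df$ratio <- unname(true.means[as.character(sim.df$treatment)]) +
  rnorm(30, sd = 0.05)

# Factor form: lm builds the dummy variables itself (control is the baseline)
factor.lm <- lm(ratio ~ treatment, data = sim.df)

# Dummy-variable form: one 0/1 indicator per non-baseline group
sim.df$A <- ifelse(sim.df$treatment == "drugA", 1, 0)
sim.df$B <- ifelse(sim.df$treatment == "drugB", 1, 0)
dummy.lm <- lm(ratio ~ A + B, data = sim.df)

# Sample means per group, and their offsets from the control mean
group.means <- sapply(split(sim.df$ratio, sim.df$treatment), mean)
offsets <- group.means[-1] - group.means[1]

print(coef(factor.lm))              # intercept, then the two offsets
print(c(group.means[1], offsets))   # baseline mean, then the two offsets
print(coef(dummy.lm))               # same numbers under different names
```

The three printed vectors should agree up to rounding, which is the sense in which the anova model is just another regression.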
[Figure: diagnostic plots for the fit to the untransformed ratio, in four panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Constant Leverage: Residuals vs Factor Levels (standardized residuals by treatment). Observations 100, 199 and 200 are labelled in each panel.]
Non-normal?
Variances about equal
ignore
boxcoxplot(ratio~ treatment, data=carcin.df)
Analyzing ¼ power
WB test: previous p=0.00
Current p=0.06
Normality better
[Figure: diagnostic plots for the fit to ratio^(1/4), in four panels: Residuals vs Fitted; Normal Q-Q; Scale-Location; Constant Leverage: Residuals vs Factor Levels (standardized residuals by treatment). Observations 50, 100 and 200 are labelled in each panel.]
Analyzing ¼ power: summary
Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)              0.68528    0.01244  55.105  < 2e-16 ***
treatmentchloralhydrate  0.01744    0.01759   0.992   0.3222
treatmentcolchicine      0.12120    0.01759   6.891 3.37e-11 ***
treatmentdiazapan        0.02993    0.01759   1.702   0.0898 .
treatmenteconidazole     0.01470    0.01759   0.836   0.4039
treatmenthydroquinone    0.04104    0.01759   2.333   0.0203 *
Residual standard error: 0.08793 on 294 degrees of freedom
Multiple R-squared: 0.1714, Adjusted R-squared: 0.1573
F-statistic: 12.16 on 5 and 294 DF, p-value: 1.008e-10
Testing equality of means
The standard F-test for equality of means is computed using the anova function
Here comparing equal means model (Null model) with different means model – only one term in model
> quarter.lm <- lm(ratio^(1/4)~treatment, data=carcin.df)
> anova(quarter.lm)
Analysis of Variance Table
Response: ratio^(1/4)
           Df  Sum Sq Mean Sq F value    Pr(>F)
treatment   5 0.47017 0.09403  12.161 1.008e-10 ***
Residuals 294 2.27337 0.00773
Highly significant differences
Oneway plot (s20x)
> onewayPlot(quarter.lm)
[Figure: plot of `ratio^(1/4)' by levels of `treatment' (control, chloralhydrate, colchicine, diazapan, econidazole, hydroquinone), with Tukey intervals (95%, pooled SDs)]
Tukey: all intervals cover the true values simultaneously with 95% probability
Two factors: example
Experiment to study weight gain in rats
– Response is weight gain over a fixed time period
– This is modelled as a function of diet (Beef, Cereal, Pork) and amount of feed (High, Low)
Data
> diets.df
   gain source level
1    73   Beef  High
2    98 Cereal  High
3    94   Pork  High
4    90   Beef   Low
5   107 Cereal   Low
6    49   Pork   Low
7   102   Beef  High
8    74 Cereal  High
9    79   Pork  High
10   76   Beef   Low
. . . 60 observations in all
Two factors: the model
• If the (continuous) response depends on two categorical explanatory variables, then we assume that the response is normally distributed with a mean depending on the combination of factor levels: if the factors are A and B, the mean at the i th level of A and the j th level of B is $\mu_{ij}$
• Other standard assumptions (equal variance, normality, independence) apply
Diagrammatically…
             Source = Beef   Source = Cereal   Source = Pork
Level=High   $\mu_{11}$      $\mu_{12}$        $\mu_{13}$
Level=Low    $\mu_{21}$      $\mu_{22}$        $\mu_{23}$
Decomposition of the means
• We usually want to split each “cell mean” up into 4 terms:
– A term reflecting the overall baseline level of the response
– A term reflecting the effect of factor A (row effect)
– A term reflecting the effect of factor B (column effect)
– A term reflecting how A and B interact.
Mathematically…
Overall Baseline: $\mu_{11}$ (mean when both factors are at their baseline levels)
Effect of i th level of factor A (row effect): $\mu_{i1} - \mu_{11}$ (The i th level of A, at the baseline of B, expressed as a deviation from the overall baseline)
Effect of j th level of factor B (column effect): $\mu_{1j} - \mu_{11}$ (The j th level of B, at the baseline of A, expressed as a deviation from the overall baseline)
Interaction: what’s left over (see next slide)
Interactions
• Each cell (except the first row and column) has an interaction:
Interaction = cell mean - baseline - row effect - column effect
cell mean = baseline + row effect + column effect + interaction
$$\text{interaction} = \mu_{ij} - \mu_{11} - (\mu_{i1} - \mu_{11}) - (\mu_{1j} - \mu_{11}) = \mu_{ij} - \mu_{i1} - \mu_{1j} + \mu_{11}$$
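Notation
• Overall baseline: $\mu = \mu_{11}$
• Main effects of A: $\alpha_i = \mu_{i1} - \mu_{11}$
• Main effects of B: $\beta_j = \mu_{1j} - \mu_{11}$
• AB interactions: $(\alpha\beta)_{ij} = \mu_{ij} - \mu_{i1} - \mu_{1j} + \mu_{11}$
• Thus, $\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$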
Importance of interactions
• If the interactions are all zero, then the effect of changing levels of A is the same for all levels of B
• In mathematical terms, $\mu_{ij} - \mu_{i'j}$ doesn’t depend on j
• Equivalently, effect of changing levels of B is the same for all levels of A
• If interactions are zero, relationship between factors and response is simple
Why are comparisons simple when interactions are zero?
$$\mu_{ij} - \mu_{ij'} = (\mu + \alpha_i + \beta_j) - (\mu + \alpha_i + \beta_{j'}) = \beta_j - \beta_{j'}$$
Doesn’t depend on i!
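A small numerical sketch of the same point, using made-up effects (the values of `mu`, `alpha` and `beta` are hypothetical and not estimated from any data). Cell means are built with all interactions zero, then a single interaction is added to one cell, and the effect of changing the level of B is compared across rows.

```r
# Hypothetical effects: 2 levels of A (rows), 3 levels of B (columns)
mu    <- 100
alpha <- c(0, -20)        # alpha[1] = 0: baseline level of A
beta  <- c(0, -14, -0.5)  # beta[1] = 0: baseline level of B

# Additive cell means: mu_ij = mu + alpha_i + beta_j (all interactions zero)
additive <- mu + outer(alpha, beta, "+")

# Same cell means, but with one non-zero interaction in cell (2, 2)
interacting <- additive
interacting[2, 2] <- interacting[2, 2] + 18.8

# Effect of moving from level 1 to level 2 of B, computed within each row i
diff.additive    <- additive[, 2] - additive[, 1]
diff.interacting <- interacting[, 2] - interacting[, 1]

print(diff.additive)     # the same in both rows
print(diff.interacting)  # differs between rows
```

With zero interactions the comparison between levels of B can be reported once; with an interaction it has to be reported separately for each level of A.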
Splitting up the mean: rats
Cell Means
Beef Cereal Pork Baseline col
High 100 85.9 99.5 100
Low 79.2 83.9 78.7 79.2
Baseline row 100 85.9 99.5 100
Factors are : level (amount of food) and source (diet)
Row effect for Low: 79.2 – 100 = -20.8
Col effect for Cereal: 85.9 - 100 = -14.1
Col effect for Pork: 99.5 - 100 = -0.5
Low-Cereal interaction: 83.9 - 79.2 - 85.9 + 100 = 18.8
Low-Pork interaction: 78.7 - 79.2 - 99.5 + 100 = 0
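The arithmetic on this slide can be done for all cells at once. The sketch below types in the six cell means from the table and splits them into baseline, row effects, column effects and interactions; the object names (`cell.means`, `row.effects` and so on) are chosen here for illustration.

```r
# Cell means from the table (rows = level, columns = source)
cell.means <- matrix(c(100.0, 85.9, 99.5,
                        79.2, 83.9, 78.7),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("High", "Low"),
                                     c("Beef", "Cereal", "Pork")))

baseline    <- cell.means[1, 1]             # mean in the High/Beef cell
row.effects <- cell.means[, 1] - baseline   # each level, at the baseline source
col.effects <- cell.means[1, ] - baseline   # each source, at the baseline level

# interaction = cell mean - (row 1 entry of its column) - (column 1 entry of its row) + baseline
interactions <- cell.means -
  outer(cell.means[, 1], cell.means[1, ], "+") + baseline

print(row.effects)
print(col.effects)
print(round(interactions, 1))

# Check: the four pieces add back up to the cell means
rebuilt <- baseline + outer(row.effects, col.effects, "+") + interactions
print(all.equal(rebuilt, cell.means, check.attributes = FALSE))
```

The first row and first column of `interactions` are zero by construction, so only the Low-Cereal and Low-Pork cells carry an interaction.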
Exploratory plots
> plot.design(diets.df)
[Figure: design plot of mean of gain for each level of the factors source (Beef, Cereal, Pork) and level (High, Low)]
More gain on high amount of feed and Beef diet
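library(lattice)  # dotplot() comes from the lattice package
dotplot(source~gain|level, data=diets.df)
[Figure: dot plots of gain by source (Beef, Cereal, Pork), in separate panels for level = High and level = Low]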
Fit model
> diets.lm<-lm(gain~source+level + source:level, data=diets.df)
> summary(diets.lm)
Coefficients:
                        Estimate Std. Error   t value Pr(>|t|)
(Intercept)            1.000e+02  4.632e+00    21.589  < 2e-16
sourceCereal          -1.410e+01  6.551e+00    -2.152  0.03585
sourcePork            -5.000e-01  6.551e+00    -0.076  0.93944
levelLow              -2.080e+01  6.551e+00    -3.175  0.00247
sourceCereal:levelLow  1.880e+01  9.264e+00     2.029  0.04736
sourcePork:levelLow   -3.052e-14  9.264e+00 -3.29e-15  1.00000
Residual standard error: 14.65 on 54 degrees of freedom
Multiple R-Squared: 0.2848, Adjusted R-squared: 0.2185
F-statistic: 4.3 on 5 and 54 DF, p-value: 0.002299
Fitting as a regression model
Note that this is equivalent to fitting a regression with dummy variables R2, C2, C3
R2 = 1 if obs is in row 2, zero otherwise
C2 = 1 if obs is in column 2, zero otherwise
C3 = 1 if obs is in column 3, zero otherwise
The regression is
Y ~ R2 + C2 + C3 + I(R2*C2) + I(R2*C3)
> R2 = ifelse(diets.df$level=="Low",1,0)
> C2 = ifelse(diets.df$source=="Cereal",1,0)
> C3 = ifelse(diets.df$source=="Pork",1,0)
> reg.lm = lm(gain ~ R2 + C2 + C3 + I(R2*C2) + I(R2*C3), data=diets.df)
> summary(reg.lm)
Coefficients:
              Estimate Std. Error   t value Pr(>|t|)
(Intercept)  1.000e+02  4.632e+00    21.589  < 2e-16 ***
C2          -1.410e+01  6.551e+00    -2.152  0.03585 *
C3          -5.000e-01  6.551e+00    -0.076  0.93944
R2          -2.080e+01  6.551e+00    -3.175  0.00247 **
I(R2 * C2)   1.880e+01  9.264e+00     2.029  0.04736 *
I(R2 * C3)  -2.709e-14  9.264e+00 -2.92e-15  1.00000
Regression summary
Testing for zero interactions
> anova(diets.lm)
Analysis of Variance Table
Response: gain
             Df  Sum Sq Mean Sq F value    Pr(>F)
source        2   266.5   133.3  0.6211 0.5411319
level         1  3168.3  3168.3 14.7666 0.0003224 ***
source:level  2  1178.1   589.1  2.7455 0.0731879 .
Residuals    54 11586.0   214.6
Some evidence of interaction
Interaction plot
> interaction.plot(diets.df$source, diets.df$level, diets.df$gain)
[Figure: interaction plot of mean of diets.df$gain against diets.df$source (Beef, Cereal, Pork), with one line for each diets.df$level (High, Low)]
Non-parallel lines indicate interaction
Do we need Source in the model?
> model1<-lm(gain~source*level, data=diets.df) # note shorthand
> model2<-lm(gain~level, data=diets.df)
> anova(model2,model1)
Analysis of Variance Table
Model 1: gain ~ level
Model 2: gain ~ source * level
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     58 13030.7
2     54 11586.0  4    1444.7 1.6833 0.1673
Not significant!
No significant effect of Source
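Notations: review
For two factors A and B
• Baseline: $\mu = \mu_{11}$
• A main effect: $\alpha_i = \mu_{i1} - \mu_{11}$
• B main effect: $\beta_j = \mu_{1j} - \mu_{11}$
• AB interaction: $(\alpha\beta)_{ij} = \mu_{ij} - \mu_{i1} - \mu_{1j} + \mu_{11}$
• Then $\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$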
Zero interaction model
• If we have only one observation per factor level combination, we can’t estimate both the interactions and the error variance
• We have to assume that the interactions are zero and fit an “additive model”
gain ~ level + source
• Can test for zero interactions (in a reduced form) using the “Tukey one-degree of freedom test”
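A minimal sketch of why the additive model is forced on us, using a hypothetical 2 x 3 layout with one simulated observation per cell (the data frame `one.df` and its values are invented for illustration). The full model uses up all six observations on its six coefficients, leaving no residual degrees of freedom for the error variance.

```r
# Hypothetical layout: one observation for each level/source combination
set.seed(18)
one.df <- expand.grid(level  = factor(c("High", "Low")),
                      source = factor(c("Beef", "Cereal", "Pork")))
one.df$gain <- rnorm(nrow(one.df), mean = 90, sd = 10)

# Full model: 6 coefficients fitted to 6 observations
full.lm <- lm(gain ~ level * source, data = one.df)

# Additive model: 4 coefficients fitted to 6 observations
additive.lm <- lm(gain ~ level + source, data = one.df)

print(df.residual(full.lm))      # degrees of freedom left for the error variance
print(df.residual(additive.lm))
```

Only the additive fit leaves residual degrees of freedom, so only it provides an estimate of the error variance.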
Possible models for two factors
For two factors A and B possible models are:
• Y~1 (Fit single mean only)
• Y~A (Cell means depend on A alone)
• Y~B (Cell means depend on B alone)
• Y~A+B (Cell means have no interaction)
• Y~A*B (General model, cell means have no restrictions)
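In terms of “effects”
General model is Y~A+B+A:B (equivalently Y~A*B)
Mathematical form is
$E(Y_{ij}) = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$
• Y~1 implies $\alpha_i = 0$, $\beta_j = 0$, $(\alpha\beta)_{ij} = 0$
• Y~A implies $\beta_j = 0$, $(\alpha\beta)_{ij} = 0$
• Y~B implies $\alpha_i = 0$, $(\alpha\beta)_{ij} = 0$
• Y~A+B implies $(\alpha\beta)_{ij} = 0$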
Interpreting Anova
All F-tests essentially compare a model to a sub-model, using an estimate of $\sigma^2$ in the denominator:
$$F = \frac{(RSS_{\text{Submodel}} - RSS_{\text{Model}})/d}{\hat{\sigma}^2}$$
where d is the difference in the number of parameters between the two models.
The anova function can do this explicitly, as in anova(model1, model2), with the estimate of $\sigma^2$ coming from the bigger model.
When we use just 1 argument, as in anova(model1), the models being compared are selected implicitly
Interpreting Anova (cont)
• For example, consider a model with 2 factors A and B:
> anova(lm(y~A+B+A:B))
Analysis of Variance Table
Response: y
          Df Sum Sq Mean Sq F value   Pr(>F)
A          1 12.774  12.774  9.9147 0.003978 **
B          2  4.031   2.015  1.5642 0.227629
A:B        2  6.898   3.449  2.6768 0.086985 .
Residuals 27 34.788   1.288
Full-model estimate of $\sigma^2$: the residual mean square, 1.288
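First line
The first line of the table compares the model y~A with a null model (all means the same), using an estimate of $\sigma^2$ = 1.288 from the full model y~A+B+A:B
> model1<-lm(y~A)
> model0<-lm(y~1)
> anova(model0,model1)
Analysis of Variance Table
Model 1: y ~ 1
Model 2: y ~ A
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)
1     32 58.491
2     31 45.716  1    12.774 8.6623 0.006105 **
Difference in RSS (12.774) is the numerator of the F test in line 1. The F value here (8.6623) differs from the 9.9147 in the table because this two-model comparison estimates $\sigma^2$ from y~A rather than from the full model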
Second line
The second line of the table compares the “no interaction” model y~A+B with the model y~A, using an estimate of $\sigma^2$ from the full model y~A+B+A:B
> model2<-lm(y~A+B)
> model1<-lm(y~A)
> anova(model1,model2)
Analysis of Variance Table
Model 1: y ~ A
Model 2: y ~ A + B
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     31 45.716
2     29 41.685  2     4.031 1.4021 0.2623
Difference in RSS (4.031) gives the numerator of the F test in line 2
Third line
The third line of the table compares the full model y~A+B+A:B with the “no interaction” model y~A+B, using an estimate of $\sigma^2$ from the full model
> model2<-lm(y~A+B)
> model3<-lm(y~A+B+A:B)
> anova(model2,model3)
Analysis of Variance Table
Model 1: y ~ A + B
Model 2: y ~ A + B + A:B
  Res.Df    RSS Df Sum of Sq      F  Pr(>F)
1     29 41.685
2     27 34.788  2     6.898 2.6768 0.08699 .
Difference in RSS (6.898) gives the numerator of the F test in line 3
To summarise:
• Terms are added line by line
• The F-test compares the current model with the previous model
• At each stage, the estimate of $\sigma^2$ is obtained from the full model.
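The three points above can be checked by hand. The sketch below uses simulated two-factor data (the data frame `sim.df` and its effect sizes are invented, not the data behind the tables in these slides), fits the nested sequence of models, and rebuilds each F value of the one-argument anova table from differences in residual sums of squares divided by the full-model variance estimate.

```r
# Simulated two-factor data: 2 levels of A, 3 of B, 5 replicates per cell
set.seed(1)
sim.df <- expand.grid(rep = 1:5,
                      A = factor(c("a1", "a2")),
                      B = factor(c("b1", "b2", "b3")))
sim.df$y <- 10 + 2 * (sim.df$A == "a2") + rnorm(nrow(sim.df))

full.lm <- lm(y ~ A + B + A:B, data = sim.df)
seq.tab <- anova(full.lm)   # terms added line by line

# Residual sums of squares for the nested sequence of models
rss <- c(null = deviance(lm(y ~ 1,     data = sim.df)),
         A    = deviance(lm(y ~ A,     data = sim.df)),
         AB   = deviance(lm(y ~ A + B, data = sim.df)),
         full = deviance(full.lm))

# Full-model estimate of the error variance, used in every line
sigma2.hat <- deviance(full.lm) / df.residual(full.lm)

# Extra parameters added at each step: A (1), B (2), A:B (2)
d <- c(1, 2, 2)
F.by.hand <- (-diff(rss) / d) / sigma2.hat

print(F.by.hand)
print(seq.tab[["F value"]][1:3])
```

The two printed vectors should agree, whereas calling anova on two of the smaller models would use the larger of those two models for the variance estimate and so give a different F value.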
More than two factors: example
• An experiment was conducted to compare different diets for feeding chickens. The diets depended on 3 variables:
– Source of protein (variable protein): either “groundnut” or “soybean”
– Level of protein (variable protlevel): either 0, 1 or 2
– Level of fish solubles (variable fish): either high or low
• Response variable was weight gain (variable chickweight)
data
   chickweight   protein protlevel fish
1         6559 groundnut         0  Low
2         7075 groundnut         0 High
3         6564 groundnut         1  Low
4         7528 groundnut         1 High
5         6738 groundnut         2  Low
6         7333 groundnut         2 High
7         7094   soybean         0  Low
8         8005   soybean         0 High
9         6943   soybean         1  Low
10        7359   soybean         1 High
11        6748   soybean         2  Low
12        6764   soybean         2 High
. . . 24 observations in all
Data characteristics
• There are 3 factors
– protein with 2 levels (groundnut, soybean)
– protlevel with 3 levels (0, 1, 2)
– fish with 2 levels (high, low)
• There are 2 x 3 x 2 = 12 factor level combinations, so 12 means
• Each combination is observed twice, so 24 observations in all
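The counting above can be reproduced with a layout-only sketch: the data frame `design` below holds just the factor combinations, each listed twice, with no responses. It also counts the columns of the model matrix for the full three-factor model, which has one coefficient for each of the 12 means.

```r
# All factor-level combinations, each observed twice (no response values)
design <- expand.grid(fish      = factor(c("High", "Low")),
                      protlevel = factor(c(0, 1, 2)),
                      protein   = factor(c("groundnut", "soybean")),
                      rep       = 1:2)

# Distinct combinations of the three factors
n.cells <- nrow(unique(design[, c("protein", "protlevel", "fish")]))

# Design matrix of the full model with all interactions
X <- model.matrix(~ protein * protlevel * fish, data = design)

print(n.cells)       # number of factor-level combinations
print(nrow(design))  # number of observations
print(ncol(X))       # number of coefficients in the full model
```

Since the full model spends 12 coefficients on 24 observations, 12 degrees of freedom remain for estimating the error variance.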
Interactions
Let $\mu_{ijk}$ be the population mean of all observations taken at level i of protein, level j of protlevel and level k of fish
We can split this mean up into 8 terms:
An overall baseline $\mu = \mu_{111}$
3 “main effects” e.g. $\alpha_i = \mu_{i11} - \mu_{111}$
3 “two-way interactions” e.g. $(\alpha\beta)_{ij} = \mu_{ij1} - \mu_{i11} - \mu_{1j1} + \mu_{111}$
A “3-way interaction”
$$(\alpha\beta\gamma)_{ijk} = \mu_{ijk} - \mu_{ij1} - \mu_{i1k} - \mu_{1jk} + \mu_{i11} + \mu_{1j1} + \mu_{11k} - \mu_{111}$$
Interactions (cont)
Then
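$$\mu_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\beta\gamma)_{jk} + (\alpha\gamma)_{ik} + (\alpha\beta\gamma)_{ijk}$$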
As before, if any one of the subscripts i, j, k is 1 then the corresponding interaction is zero.
Interpretation:
e.g. if the protlevel x fish and the 3-way interactions are all zero, then the effect of changing levels of fish is the same for all levels of protlevel.
Why?
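$$\mu_{ijk} - \mu_{ijk'} = (\mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik}) - (\mu + \alpha_i + \beta_j + \gamma_{k'} + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik'})$$
$$= (\gamma_k - \gamma_{k'}) + ((\alpha\gamma)_{ik} - (\alpha\gamma)_{ik'})$$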
Doesn’t depend on j!
Estimating terms
> model1<-lm(chickweight~protein*protlevel*fish, data=chickwts.df)
> summary(model1)
                                  Estimate Std. Error t value Pr(>|t|)
(Intercept)                         6927.0      223.5  30.992    8e-13
proteinsoybean                       904.0      316.1   2.860   0.0144
protlevel1                           266.5      316.1   0.843   0.4156
protlevel2                           -80.0      316.1  -0.253   0.8045
fishLow                             -501.5      316.1  -1.587   0.1386
proteinsoybean:protlevel1           -772.0      447.0  -1.727   0.1098
proteinsoybean:protlevel2          -1089.0      447.0  -2.436   0.0314
proteinsoybean:fishLow              -256.0      447.0  -0.573   0.5774
protlevel1:fishLow                   -99.0      447.0  -0.221   0.8285
protlevel2:fishLow                   245.5      447.0   0.549   0.5929
proteinsoybean:protlevel1:fishLow    127.0      632.2   0.201   0.8441
proteinsoybean:protlevel2:fishLow    435.0      632.2   0.688   0.5045
Residual standard error: 316.1 on 12 degrees of freedom
Multiple R-Squared: 0.7531, Adjusted R-squared: 0.5269
F-statistic: 3.328 on 11 and 12 DF, p-value: 0.02482
Anova for the chick weights
> anova(model1)
Analysis of Variance Table
Response: chickweight
                       Df  Sum Sq Mean Sq F value   Pr(>F)
protein                 1  373003  373003  3.7334 0.077286 .
protlevel               2  636519  318260  3.1854 0.077679 .
fish                    1 1423014 1423014 14.2429 0.002653 **
protein:protlevel       2  858702  429351  4.2974 0.039134 *
protein:fish            1    7073    7073  0.0708 0.794706
protlevel:fish          2  309421  154710  1.5485 0.252201
protein:protlevel:fish  2   50036   25018  0.2504 0.782453
Residuals              12 1198926   99911
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Suggests model protein*protlevel + fish
Check
> anova(lm(chickweight~protein*protlevel + fish, data=chickwts.df), lm(chickweight~protein*protlevel*fish, data=chickwts.df))
Analysis of Variance Table
Model 1: chickweight ~ protein * protlevel + fish
Model 2: chickweight ~ protein * protlevel * fish
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     17 1565456
2     12 1198926  5    366530 0.7337 0.6121
Not significant, but interpret with caution
Effect of fish the same for each protein/protlevel combination
Interpretation of interactions
• If a factor (say A) does not interact with the others, the effect of changing levels of A is the same for all levels of the other factors
• If the 3 way interactions are zero, then the interaction between A and B is the same for all levels of C
Summary
• Anova models are interpreted just like regressions, except
– No question of planarity (linear by definition)
– Need to interpret interactions
– Judge effect of factors by anova
– Factors in anova added one at a time
– Suitable for completely randomised experiments where it is reasonable to assume observations are independent