Lecture 33 Multiple Factor ANOVA - Purdue University

33-1

Lecture 33

Multiple Factor ANOVA

STAT 512

Spring 2011

Background Reading

KNNL: Chapter 24

33-2

Topic Overview

• ANOVA with multiple factors

33-3

3-Way ANOVA Model

• Three factors A, B, and C having a, b, and c levels, respectively

• Notation is similar to before.

33-4

Data for three-way ANOVA

− Y, the response variable

− Factor A with levels i = 1 to a

− Factor B with levels j = 1 to b

− Factor C with levels k = 1 to c

− $Y_{ijkl}$ is the $l$th observation in cell $(i,j,k)$, $l = 1$ to $n_{ijk}$

− A balanced design has $n_{ijk} = n$

33-5

Cell Means Model

$$Y_{ijkl} = \mu_{ijk} + \varepsilon_{ijkl}$$

− $\mu_{ijk}$ is the theoretical mean or expected value of all observations in cell $(i,j,k)$.

− $\varepsilon_{ijkl} \overset{iid}{\sim} N(0, \sigma^2)$

− $Y_{ijkl} \sim N(\mu_{ijk}, \sigma^2)$ are independent

33-6

Treatment Means

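$$\mu_{ij\cdot} = \frac{1}{c}\sum_{k}\mu_{ijk}, \qquad \mu_{i\cdot k} = \frac{1}{b}\sum_{j}\mu_{ijk}, \qquad \mu_{\cdot jk} = \frac{1}{a}\sum_{i}\mu_{ijk}$$

$$\mu_{i\cdot\cdot} = \frac{1}{bc}\sum_{j,k}\mu_{ijk}, \qquad \mu_{\cdot j\cdot} = \frac{1}{ac}\sum_{i,k}\mu_{ijk}, \qquad \mu_{\cdot\cdot k} = \frac{1}{ab}\sum_{i,j}\mu_{ijk}$$

$$\mu_{\cdot\cdot\cdot} = \frac{1}{abc}\sum_{i,j,k}\mu_{ijk}$$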

33-7

Estimates

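For a balanced design with $n$ observations per cell:

$$\hat{\mu}_{ijk} = \frac{1}{n}\sum_{l} Y_{ijkl}$$

$$\hat{\mu}_{ij\cdot} = \frac{1}{cn}\sum_{k,l} Y_{ijkl}, \qquad \hat{\mu}_{i\cdot k} = \frac{1}{bn}\sum_{j,l} Y_{ijkl}, \qquad \hat{\mu}_{\cdot jk} = \frac{1}{an}\sum_{i,l} Y_{ijkl}$$

$$\hat{\mu}_{i\cdot\cdot} = \frac{1}{bcn}\sum_{j,k,l} Y_{ijkl}, \qquad \hat{\mu}_{\cdot j\cdot} = \frac{1}{acn}\sum_{i,k,l} Y_{ijkl}, \qquad \hat{\mu}_{\cdot\cdot k} = \frac{1}{abn}\sum_{i,j,l} Y_{ijkl}$$

$$\hat{\mu}_{\cdot\cdot\cdot} = \frac{1}{abcn}\sum_{i,j,k,l} Y_{ijkl}$$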
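As a rough illustration of the estimates on this slide, the sketch below computes cell, two-factor, one-factor, and grand means for a small balanced data set. The data values and all variable names (`Y`, `mu_ijk`, `mu_ij`, and so on) are invented for the example; only the averaging pattern comes from the slide.

```python
from statistics import mean

# Hypothetical balanced data (values invented for illustration):
# a = b = c = 2 levels and n = 2 observations per cell.
# Y[i][j][k] holds the list of replicates Y_ijkl for cell (i, j, k).
Y = [
    [[[10.0, 12.0], [14.0, 16.0]], [[20.0, 22.0], [24.0, 26.0]]],
    [[[11.0, 13.0], [19.0, 21.0]], [[25.0, 27.0], [33.0, 35.0]]],
]
a, b, c = len(Y), len(Y[0]), len(Y[0][0])
I, J, K = range(a), range(b), range(c)

# Cell estimates: average over the replicates l.
mu_ijk = [[[mean(Y[i][j][k]) for k in K] for j in J] for i in I]

# Two-factor estimates: average the cell estimates over the third factor.
mu_ij = [[mean(mu_ijk[i][j][k] for k in K) for j in J] for i in I]
mu_ik = [[mean(mu_ijk[i][j][k] for j in J) for k in K] for i in I]
mu_jk = [[mean(mu_ijk[i][j][k] for i in I) for k in K] for j in J]

# One-factor estimates and the grand mean.
mu_i = [mean(mu_ijk[i][j][k] for j in J for k in K) for i in I]
mu_j = [mean(mu_ijk[i][j][k] for i in I for k in K) for j in J]
mu_k = [mean(mu_ijk[i][j][k] for i in I for j in J) for k in K]
mu = mean(mu_ijk[i][j][k] for i in I for j in J for k in K)

print(mu_i, mu_j, mu_k, mu)
```

Because the design is balanced, averaging the cell means gives the same result as averaging all observations directly; with unequal $n_{ijk}$ the two would differ.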

33-8

Factor effects model

$$Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$$

− $\mu$ is the overall (grand) mean

− $\alpha_i, \beta_j, \gamma_k$ are the main effects of factors A, B, and C

− $(\alpha\beta)_{ij}, (\alpha\gamma)_{ik}, (\beta\gamma)_{jk}$ are the two-way (first order) interactions

− $(\alpha\beta\gamma)_{ijk}$ is the three-way (second order) interaction

33-9

Factor Effects

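$$\alpha_i = \mu_{i\cdot\cdot} - \mu_{\cdot\cdot\cdot}, \qquad (\alpha\beta)_{ij} = \mu_{ij\cdot} - \mu_{i\cdot\cdot} - \mu_{\cdot j\cdot} + \mu_{\cdot\cdot\cdot}$$

$$\beta_j = \mu_{\cdot j\cdot} - \mu_{\cdot\cdot\cdot}, \qquad (\alpha\gamma)_{ik} = \mu_{i\cdot k} - \mu_{i\cdot\cdot} - \mu_{\cdot\cdot k} + \mu_{\cdot\cdot\cdot}$$

$$\gamma_k = \mu_{\cdot\cdot k} - \mu_{\cdot\cdot\cdot}, \qquad (\beta\gamma)_{jk} = \mu_{\cdot jk} - \mu_{\cdot j\cdot} - \mu_{\cdot\cdot k} + \mu_{\cdot\cdot\cdot}$$

$$(\alpha\beta\gamma)_{ijk} = \mu_{ijk} - \mu_{ij\cdot} - \mu_{i\cdot k} - \mu_{\cdot jk} + \mu_{i\cdot\cdot} + \mu_{\cdot j\cdot} + \mu_{\cdot\cdot k} - \mu_{\cdot\cdot\cdot}$$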

Plug in cell means to estimate.
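A minimal sketch of the plug-in step, assuming a hypothetical 2 x 2 x 2 table of cell means (the numbers and the names `mu`, `m_ij`, `ab`, `abc`, etc. are invented here). It computes each factor effect from the marginal means and then checks that the effects add back up to the cell means.

```python
from statistics import mean

# Hypothetical 2 x 2 x 2 table of cell means mu[i][j][k] (invented values).
mu = [[[11.0, 15.0], [21.0, 25.0]], [[12.0, 20.0], [26.0, 34.0]]]
I, J, K = range(2), range(2), range(2)

# Marginal means: average over every index replaced by a dot.
m = mean(mu[i][j][k] for i in I for j in J for k in K)
m_i = [mean(mu[i][j][k] for j in J for k in K) for i in I]
m_j = [mean(mu[i][j][k] for i in I for k in K) for j in J]
m_k = [mean(mu[i][j][k] for i in I for j in J) for k in K]
m_ij = [[mean(mu[i][j][k] for k in K) for j in J] for i in I]
m_ik = [[mean(mu[i][j][k] for j in J) for k in K] for i in I]
m_jk = [[mean(mu[i][j][k] for i in I) for k in K] for j in J]

# Main effects.
alpha = [m_i[i] - m for i in I]
beta = [m_j[j] - m for j in J]
gamma = [m_k[k] - m for k in K]

# Two-way interactions.
ab = [[m_ij[i][j] - m_i[i] - m_j[j] + m for j in J] for i in I]
ac = [[m_ik[i][k] - m_i[i] - m_k[k] + m for k in K] for i in I]
bc = [[m_jk[j][k] - m_j[j] - m_k[k] + m for k in K] for j in J]

# Three-way interaction.
abc = [[[mu[i][j][k] - m_ij[i][j] - m_ik[i][k] - m_jk[j][k]
         + m_i[i] + m_j[j] + m_k[k] - m
         for k in K] for j in J] for i in I]

# The effects should reproduce every cell mean exactly.
fitted = [[[m + alpha[i] + beta[j] + gamma[k]
            + ab[i][j] + ac[i][k] + bc[j][k] + abc[i][j][k]
            for k in K] for j in J] for i in I]

print(alpha, beta, gamma)
```

In practice the unknown $\mu_{ijk}$ would be replaced by the estimated cell means; the arithmetic is otherwise the same.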

33-10

Constraints

• Usual constraints listed on page 997 – sums of effects for ANY of the indices are zero. Under these, $\mu_{\cdot\cdot\cdot}$ will be the grand mean.

• In SAS, constraints are all set up to compare everything to $\mu_{abc}$. Thus a factor effect is zero if it includes any of the “last” levels of the factors.
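Written out, the two sets of constraints described above look roughly as follows. This is a sketch in the notation of the factor effects model, not a reproduction of the textbook page.

```latex
% Sum-to-zero constraints: every effect sums to zero over each of its indices.
\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = 0
\sum_i (\alpha\beta)_{ij} = 0 \ \text{for all } j, \qquad
\sum_j (\alpha\beta)_{ij} = 0 \ \text{for all } i
% (and likewise for (\alpha\gamma)_{ik} and (\beta\gamma)_{jk})
\sum_i (\alpha\beta\gamma)_{ijk} = \sum_j (\alpha\beta\gamma)_{ijk}
  = \sum_k (\alpha\beta\gamma)_{ijk} = 0

% "Last level" constraints as used by SAS: any effect involving level a of A,
% level b of B, or level c of C is set to zero.
\alpha_a = \beta_b = \gamma_c = 0, \qquad
(\alpha\beta)_{aj} = (\alpha\beta)_{ib} = 0, \qquad
(\alpha\beta\gamma)_{ajk} = (\alpha\beta\gamma)_{ibk} = (\alpha\beta\gamma)_{ijc} = 0
% (and likewise for (\alpha\gamma)_{ik} and (\beta\gamma)_{jk})
```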

33-11

Assumptions

• Constancy of variance applies across cells;

can do residual plots across treatment

combinations

• For violations, transformations can

sometimes be useful; WLS is a standard

remedial measure if the error distribution is

normal but the variances are different.

33-12

ANOVA Table

• SSTR/Model is partitioned into:

− Main Effects

− Two Way Interactions

− Three Way Interactions

− Etc.

• DF are multiplicative. For example, the three-way interaction between A, B, and C takes up $(a-1)(b-1)(c-1)$ DF.

• SS formulas given on page 1008.
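The multiplicative DF rule can be checked with a few lines of arithmetic. The helper below is a hypothetical sketch (the function name `anova_df` is not from the lecture) for a balanced three-way design with $n$ observations per cell.

```python
def anova_df(a, b, c, n):
    """Degrees of freedom for a balanced three-way ANOVA (n obs per cell)."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "A*B": (a - 1) * (b - 1),
        "A*C": (a - 1) * (c - 1),
        "B*C": (b - 1) * (c - 1),
        "A*B*C": (a - 1) * (b - 1) * (c - 1),
    }
    model = sum(df.values())          # always equals a*b*c - 1
    df["Model"] = model
    df["Error"] = a * b * c * (n - 1)
    df["Total"] = a * b * c * n - 1   # Model + Error
    return df


# A 2 x 2 x 2 design with 3 observations per cell.
print(anova_df(2, 2, 2, 3))
```

For a 2 x 2 x 2 design with three observations per cell, every model term has 1 DF, leaving 16 for error and 23 in total.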

33-13

Steps in 3-Factor Analysis

1. Fit full model and check assumptions

2. Start with the 3-way interaction and

determine if it is significant.

3. If not, may consider pooling. To avoid

likelihood of Type I errors, best to pool only

in cases where p-value is not close to

significant.

4. If the 3-way interaction (or multiple 2-way interactions) is significant, then analyze the three factors jointly in terms of $\mu_{ijk}$.

33-14

Steps in 3-Factor Analysis (2)

5. If only a single two-way interaction is

significant, may again consider pooling, and

can analyze via regular interaction plot. Do

NOT pool any term for which higher order

terms are significant.

6. Can analyze main effects if factor not

involved in important interaction. May also

be able to look at main effects if they are

large compared to the interactions.

33-15

With More than three factors...

• Hope that higher order interactions are not

significant (this is often the case). If they

are, try to analyze cell means. Assuming

they are not...

• Interactions that overlap (e.g. AB and BC)

and are significant suggest analysis of the

three-factor level means.

• Another potential strategy is to combine

factors (e.g. gender and smoking might be

considered one factor with 4 levels)
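Combining factors amounts to relabeling each observation by its pair of levels. A small sketch, with hypothetical level names and a hypothetical helper `combine`:

```python
from itertools import product

# Hypothetical level names for the two original factors.
gender_levels = ["female", "male"]
smoking_levels = ["nonsmoker", "smoker"]

# The combined factor has one level per (gender, smoking) pair.
combined_levels = [f"{g}/{s}" for g, s in product(gender_levels, smoking_levels)]


def combine(gender, smoking):
    """Label an observation with its level of the combined factor."""
    return f"{gender}/{smoking}"


print(combined_levels)
print(combine("male", "smoker"))
```

The combined factor has 4 levels and therefore 3 DF, which is the sum of the DF for the two main effects and their interaction (1 + 1 + 1).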

33-16

Multiple Comparisons

• Tukey, Bonferroni, and Scheffe adjustments

can be made as before (see page 1017 for

appropriate degrees of freedom to use;

generally model and/or error).

• Can utilize contrasts to study specific

questions (should use Scheffe if looking at

any unplanned contrasts; Bonferroni is

appropriate for contrasts that have been

planned in advance)

33-17

Unequal Cell Sizes

• Formulas change a bit as not all of the $n_{ijk}$ are the same

• Look at Type III SS as well as Type I (the

closer the sample sizes are to each other,

the less difference there will be).

• MUST use LSMeans to do comparisons
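To see why LSMeans matter with unequal cell sizes, the sketch below compares the raw mean of a factor level with the unweighted average of its cell means, which is what the least-squares mean reduces to in the full model when no cells are empty. The data and the helper names `raw_mean` and `ls_mean` are invented for illustration.

```python
from statistics import mean

# Hypothetical unbalanced data: cells[(i, j, k)] lists the observations in
# cell (i, j, k). Level 1 of factor A is unbalanced; level 2 is balanced.
cells = {
    (1, 1, 1): [10.0, 10.0, 10.0, 10.0],
    (1, 1, 2): [20.0],
    (1, 2, 1): [30.0],
    (1, 2, 2): [40.0],
    (2, 1, 1): [12.0, 14.0],
    (2, 1, 2): [22.0, 24.0],
    (2, 2, 1): [32.0, 34.0],
    (2, 2, 2): [42.0, 44.0],
}


def raw_mean(i):
    """Average of all observations with A = i (cells weighted by n_ijk)."""
    return mean(y for (ai, _, _), ys in cells.items() if ai == i for y in ys)


def ls_mean(i):
    """Unweighted average of the cell means with A = i."""
    return mean(mean(ys) for (ai, _, _), ys in cells.items() if ai == i)


for level in (1, 2):
    print(level, raw_mean(level), ls_mean(level))
```

For level 1 the raw mean is pulled toward the heavily sampled cell, while the LSMean weights all four cells equally; for the balanced level 2 the two agree.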

33-18

Empty Cells

• Can often be problematic for larger designs

• Create situations where some effects are

confounded; generally interactions can

only be partially studied.

• Usually forced to assume some interactions

are zero.

• See page 964 for more on empty cells

33-19

Example

• Problem 24.6 (alloy.sas)

• Studying the effects of three factors on the

hardness of an alloy

• Factor A: Use of a chemical additive (1 =

low amount; 2 = high amount)

• Factor B: Temperature (1 = low, 2 = high)

• Factor C: Time allowed for process (1 =

low, 2 = high)

• Three observations per cell, balanced design

33-20

33-21

33-22

33-23

33-24

Interactions

• Parallel lines suggest no interactions. If we look at the ANOVA table, this is seen there as well.

Source        DF    SS      MS      F       Pr > F
additive       1    789     789     235     <.0001
time           1    2440    2440    727     <.0001
add*time       1    0.20    0.20    0.06    0.8095
temp           1    1539    1539    458     <.0001
add*temp       1    0.24    0.24    0.07    0.7926
time*temp      1    2.94    2.94    0.88    0.3634
ad*tim*tem     1    0.60    0.60    0.18    0.6778
Error         16    53.7    3.36
Total         23    4826
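The F statistics in this table are each MS divided by MSE, and pooling simply moves interaction SS and DF into the error line. The sketch below redoes that arithmetic from the rounded table values; pooling all four interactions is an assumption made here for illustration, and small differences from the printed F values come from rounding in the table.

```python
# (DF, SS) for each model term, copied from the ANOVA table above.
table = {
    "additive": (1, 789.0),
    "time": (1, 2440.0),
    "add*time": (1, 0.20),
    "temp": (1, 1539.0),
    "add*temp": (1, 0.24),
    "time*temp": (1, 2.94),
    "ad*tim*tem": (1, 0.60),
}
error_df, error_ss = 16, 53.7

# Full model: F = MS / MSE for every term.
mse = error_ss / error_df
f_full = {term: (ss / df) / mse for term, (df, ss) in table.items()}

# Pooled model (illustrative choice): fold the four interactions into error.
pooled = ["add*time", "add*temp", "time*temp", "ad*tim*tem"]
pooled_df = error_df + sum(table[t][0] for t in pooled)
pooled_ss = error_ss + sum(table[t][1] for t in pooled)
pooled_mse = pooled_ss / pooled_df
f_pooled = {
    term: (ss / df) / pooled_mse
    for term, (df, ss) in table.items()
    if term not in pooled
}

print(round(mse, 2), {t: round(f, 2) for t, f in f_full.items()})
print(pooled_df, round(pooled_mse, 2), {t: round(f, 1) for t, f in f_pooled.items()})
```

After pooling, the main effects are tested against an error term with 20 DF instead of 16.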

33-25

Analysis

• In this (nice) case we can simply look at the

individual means and draw conclusions

additive    LSMEAN        Pr > |t|
1_low       54.2250000    <.0001
2_high      65.6916667

time        LSMEAN        Pr > |t|
1_low       49.8750000    <.0001
2_high      70.0416667

33-26

Analysis (2)

temp        LSMEAN        Pr > |t|
1_low       51.9500000    <.0001
2_high      67.9666667

• High levels for all three variables are

preferred.

• Don’t forget assumptions (in this case not

too bad; something weird in cell #1)

33-27

Example (adjusted)

• Data changed a bit (see SAS code)

• Basically, for illustration, interchanged the

cells for A = 1, B = 2 and A = 2, B = 2

• Interaction Plot now suggests interaction

33-28

33-29

Two-way Interaction Plots

• From the 3-way interaction plot we can guess that the interaction has to do with temp (but not time, since the lines for the same level of temp are individually parallel)

• This is confirmed by looking at the 2-way

interaction plots

33-30

• Interaction between additive and

temperature

33-31

• No interaction between additive/time (and

no apparent effect of additive if we ignore

temperature)

33-32

• No interaction between time/temp; there is

apparently a main effect of temperature in

addition to the interaction.

33-33

ANOVA Output

Source        DF    SS      MS      F       Pr > F
additive       1    0.24    0.24    0.07    0.7926
time           1    2440    2440    727     <.0001
add*time       1    0.60    0.60    0.18    0.6778
temp           1    1539    1539    458     <.0001
add*temp       1    789     789     235     <.0001
time*temp      1    2.94    2.94    0.88    0.3634
ad*tim*tem     1    0.20    0.20    0.06    0.8095
Error         16    53.7    3.36
Total         23    4826

33-34

Results

• Additive interacts with temperature; will want to examine that interaction

• Temperature is by itself significant; so

probably can look at main effect for that as

well.

• Would be inappropriate to look at main effect

for Additive; factor is important in how it

interacts with temp and main effect here will

be misleading

• Can look at main effect for time since there is

no interaction there.

33-35

Results (2)

time        LSMEAN        Pr > |t|
1_low       49.8750000    <.0001
2_high      70.0416667

temp        LSMEAN        Pr > |t|
1_low       51.9500000    <.0001
2_high      67.9666667

• Longer time is better

• Apparently higher temperature is better

33-36

Results (3)

additive    temp      LSMEAN    Number
1_low       1_low     46.317    1
1_low       2_high    73.800    2
2_high      1_low     57.583    3
2_high      2_high    62.133    4

Pr > |t| for pairwise comparisons of LSMEAN(i) and LSMEAN(j):

i/j     1         2         3         4
1                 <.0001    <.0001    <.0001
2       <.0001              <.0001    <.0001
3       <.0001    <.0001              0.0028
4       <.0001    <.0001    0.0028

33-37

Results (4)

• Can identify a “best” combination of

additive and temperature (low additive,

high temperature)

• As we saw in the interaction plot, the

additive counteracts the effect of raising

the temperature to some degree
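The "counteracting" effect can be read off the four additive-by-temperature LSMeans from Results (3) by computing the effect of temperature separately at each additive level. The sketch below only redoes that subtraction; the variable names are invented.

```python
# LSMeans keyed by (additive, temp), copied from the Results (3) table.
cell = {
    ("low", "low"): 46.317,
    ("low", "high"): 73.800,
    ("high", "low"): 57.583,
    ("high", "high"): 62.133,
}

# Simple effect of raising temperature at each level of additive.
temp_effect = {
    add: cell[(add, "high")] - cell[(add, "low")] for add in ("low", "high")
}

# Interaction contrast: how much the temperature effect changes with additive.
interaction_contrast = temp_effect["high"] - temp_effect["low"]

# Combination with the largest mean hardness.
best = max(cell, key=cell.get)

print(temp_effect, interaction_contrast, best)
```

Raising the temperature increases mean hardness by about 27.5 at the low additive level but only about 4.6 at the high additive level, which is the interaction seen in the plot.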

33-38

Upcoming …

• More multiple ANOVA / ANCOVA

examples

• Fixed vs. Random Effects