
PLS205 2014 6.1 Lab 6 (Topic 9)

PLS205 Lab 6 February 13, 2014

Laboratory Topic 9

∙ A word about factorials

∙ Specifying interactions among factorial effects in SAS

∙ The relationship between factors and treatment

∙ Interpreting results of an experiment with a factorial treatment structure

∙ Visualizing simple and main effects

∙ Visualizing three-way interactions

∙ APPENDIX: The Almost Practically Complete Analysis of Example 6.1

∙ APPENDIX 2: Graphing in Excel

A word about factorials

A factorial is not an experimental design. Why? Because the term "factorial" merely describes the

structure of the treatment effects (i.e. the factors), not how they are randomized. Specifically, a factorial

treatment structure is one in which all levels of every factor are present in all possible combinations with

all the levels of every other factor in the experiment (i.e. the crossing of factors is complete and

orthogonal). It is this complete, orthogonal structure that allows an experimenter to gain insight into

interactions among factors. Seen in this way, it becomes clear that any of the true experimental designs

(i.e. randomization strategies) we have discussed so far (CRD, RCBD, Latin Square) can be factorials,

provided the treatments are structured correctly.

A factorial is a complete, orthogonal structure of treatment effects

intended to provide insight into their interactions.

Specifying Interactions Among Factorial Effects in SAS

Specifications about designs with factorial treatment structures are entered through the Model statement

of Proc GLM, and this syntax can assume one of two forms: stars (*) or bars (|).

Stars are used to partition out specific interactions from the Treatment SS and are useful when certain

interactions must be used as error terms in custom F tests. Examples:

Model Resp = a b a*b specifies partitioning of SST into main effect A, main effect B, and interaction AxB

Model Resp = a a*b a*b*c specifies main effect A and interactions AxB and AxBxC

Bars are used as a nice shortcut to partition the Treatment SS into all possible combinations of the

included factors. On a standard PC keyboard, the bar symbol (|) is typed as Shift-\. Examples:

Model Resp = a|b is equivalent to Model Resp = a b a*b

Model Resp = a|b|c is equivalent to Model Resp = a b c a*b a*c b*c a*b*c

An additional nice trick to know is the use of "@" in factorial model statements. The "@" symbol in

conjunction with bars (|) allows you to specify all possible combinations of model factors up to a certain

level (e.g. two-way effects), saving you lots of typing. An example:

Model Resp = Block a|b|c@2 is equivalent to Model Resp = Block a b c a*b a*c b*c

Notice that this excludes the three-way effect a*b*c.

The Relationship Between Factors and Treatment

Until now, we have had only a single 'treatment' (the effect of which we are trying to understand) with

zero (CRD), one (RCBD), or two (LS) blocking variables (the effects of which we are trying to account

for but not really investigate). With factorials, we now have two or more 'factors' that are experimentally

equivalent to the single 'treatment' variable from the first half of the course. To illustrate this equivalence,

reconsider Example 1 from Lab 3 (Topics 4-5): An experiment with 6 treatments (L08, L12, L16, H08,

H12, H16), where L/H refers to Low/High temperatures and 8/12/16 refers to hours of light. This is

exactly equivalent to having temperature as one factor and light as another, organized as a factorial:

Model Growth = Treatment;                df Treatment  = 5

Model Growth = Temp Light Temp*Light;    df Temp       = 1
                                         df Light      = 2
                                         df Temp*Light = 2
                                         Sum           = 5

What this is meant to show is that the old classification variable ‘Treatment’ is simply a combination of

two factors (light and temperature). Rewriting the model in terms of the factors does not affect the Model

df at all; it simply expands the class variable ‘Treatment’ into ‘Temp Light Temp*Light’. Before, we

accomplished this ‘opening up’ of the treatment through orthogonal contrasts. The insights gained

through each approach are equivalent.
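
The equivalence described above can be checked directly by fitting both models to the same data. The following is only a sketch under stated assumptions: the dataset name Lab3Ex1 is hypothetical, and it is assumed to hold a character variable Treatment with the codes L08 through H16 and a numeric variable Growth. The names TempLight, Temp, and Light are likewise made up for illustration.

```sas
* Assumed input: hypothetical dataset Lab3Ex1 with Treatment and Growth;
Data TempLight;
	Set Lab3Ex1;
	Length Temp $1 Light $2;
	Temp  = Substr(Treatment, 1, 1);   * first character: L or H;
	Light = Substr(Treatment, 2, 2);   * last two characters: 08, 12, or 16;
Run;

* One classification variable with 6 levels: 5 df for Treatment;
Proc GLM Data = TempLight;
	Class Treatment;
	Model Growth = Treatment;
Run;
Quit;

* The same 5 df, now split into Temp (1), Light (2), and Temp*Light (2);
Proc GLM Data = TempLight;
	Class Temp Light;
	Model Growth = Temp|Light;
Run;
Quit;
```

If the assumptions hold, the Model SS and Model df should be identical in the two outputs; only the partitioning differs.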

Example 6.1 Two-Way ANOVA with interactions [Lab6ex1.sas]

In a study comparing the relative growth of five varieties of turfgrass (VARIETY) in three experimental

soil mixtures (SOIL), six pots were prepared with each VARIETY-SOIL combination. The 90 pots were

randomly allocated to six growth chambers (BLOCKS) and the dry matter yields were measured by

clipping the plants at the end of four weeks. In this experiment, the researchers are interested only in

these five varieties and three soil mixtures; so VARIETY and SOIL can be regarded as fixed factors.

Data RCBDFactorial;

Do Soil = 1 to 3;

Do Variety = 1 to 5;

Do Block = 1 to 6;

Input Yield @@;

Output;

End;

End;

End;

Cards;

22.1 24.1 19.1 22.1 25.1 18.1

27.1 15.1 20.6 28.6 15.1 24.6

22.3 25.8 22.8 28.3 21.3 18.3

19.8 28.3 26.8 27.3 26.8 26.8

20.0 17.0 24.0 22.5 28.0 22.5

13.5 14.5 11.5 6.0 27.0 18.0

16.9 17.4 10.4 19.4 11.9 15.4

15.7 10.2 16.7 19.7 18.2 12.2

15.1 6.5 17.1 7.6 13.6 21.1

21.8 22.8 18.8 21.3 16.3 14.3

19.0 22.0 20.0 14.5 19.0 16.0

20.0 22.0 25.5 16.5 18.0 17.5

16.4 14.4 21.4 19.9 10.4 21.4

24.5 16.0 11.0 7.5 14.5 15.5

11.8 14.3 21.3 6.3 7.8 13.8

;

Proc GLM Data = RCBDFactorial;

Class Soil Variety Block;

Model Yield = Soil|Variety Block; * This Model includes all main effects

as well as the Soil*Variety interaction;

Proc GLM Data = RCBDFactorial;

Class Soil Variety Block;

Model Yield = Soil|Variety|Block@2; * Exploratory model to examine the

two-way block interactions (see discussion below);

Run;

Quit;

NOTE: This initial analysis enables us to see if the interaction is significant and decide:

Main or simple effects?

Take a look at the resultant ANOVA table:

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 19 1361.153778 71.639673 3.45 <.0001

Error 70 1451.637778 20.737683

Corrected Total 89 2812.791556

R-Square Coeff Var Root MSE Yield Mean

0.483916 24.69855 4.553865 18.43778

Source DF Type III SS Mean Square F Value Pr > F

Soil 2 953.1562222 476.5781111 22.98 <.0001

Variety 4 11.3804444 2.8451111 0.14 0.9680

Soil*Variety 8 374.4882222 46.8110278 2.26 0.0330

Block 5 22.1288889 4.4257778 0.21 0.9557

This is an RCBD with 6 blocks. Even though there are six replications per Soil-Variety combination

(which allows us to include their interaction in the model), there is only one replication per

Soil-Variety-Block combination. The upshot of this is that the Block*Factor interactions are inside the

experimental error for this ANOVA. In other words, if the model statement had been:

Model Yield = Soil|Variety|Block;

there would have been no variation left to estimate the error (dfe = 0), because:

Block * Soil            = 10 df
Block * Variety         = 20 df
Block * Soil * Variety  = 40 df
                          -----
Total                   = 70 df = dfe
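
The arithmetic behind these degrees of freedom can be written out as a short worked equation. The symbols b = 6 blocks, s = 3 soils, and v = 5 varieties are introduced here only for illustration; they are not used elsewhere in the lab.

```latex
\begin{aligned}
df_{\mathrm{Block \times Soil}} &= (b-1)(s-1) = 5 \cdot 2 = 10 \\
df_{\mathrm{Block \times Variety}} &= (b-1)(v-1) = 5 \cdot 4 = 20 \\
df_{\mathrm{Block \times Soil \times Variety}} &= (b-1)(s-1)(v-1) = 5 \cdot 2 \cdot 4 = 40 \\
df_{e} &= 10 + 20 + 40 = 70 = (b-1)(sv-1) = 5 \cdot 14
\end{aligned}
```

The last equality shows that the error of the RCBD is exactly the pooled Block-by-treatment-combination interaction.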

We exclude the two-way Block interactions (Block*Soil and Block*Variety) from the model because, in general, we don't care about them

(remember, we block to reduce the error term, not to gain understanding of the effect of blocking). In

other words, this is a choice we make. Excluding the three-way Block*Soil*Variety interaction is not a choice, however;

it cannot be a part of the model because it is the only term we have for our error. Of course, we still want

to check the two-way Block*Treatment interactions to see if they are significant. If they are not significant, it is

justifiable to relegate these interactions to the error term. If they are significant, you can attempt a

transformation or be aware that they will contribute to a larger MSE when taken out of the model. To

test the Block*Treatment interactions they can simply be placed into an exploratory model (the second

Proc GLM above):

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 49 2040.397111 41.640757 2.16 0.0069

Error 40 772.394444 19.309861

Corrected Total 89 2812.791556

R-Square Coeff Var Root MSE Yield Mean

0.725399 23.83313 4.394299 18.43778

Source DF Type III SS Mean Square F Value Pr > F

Soil 2 953.1562222 476.5781111 24.68 <.0001

Variety 4 11.3804444 2.8451111 0.15 0.9631

Soil*Variety 8 374.4882222 46.8110278 2.42 0.0308

Block 5 22.1288889 4.4257778 0.23 0.9476

Soil*Block 10 242.2944444 24.2294444 1.25 0.2881 NS

Variety*Block 20 436.9488889 21.8474444 1.13 0.3589 NS

A little side note about what interactions to include in your model

Although one can do exploratory work with different interactions in the model and then merge

the Block*Treatment into the error, you should always keep the treatment interactions in the

model, whether significant or not. This makes it very clear to the reader the status of the

interaction and saves you from having to do a lot of explaining. Sometimes in higher order

factorials (e.g. four factors) the higher order interactions (e.g. 4-way interactions) are excluded

from the model if they are not significant to simplify the model.

Interpreting Results of an Experiment with a Factorial Treatment Structure

The above ANOVA results indicate that there are significant differences among soil mixtures but not

among varieties. More importantly, however, it shows that the interaction between these two factors is

significant (i.e. the effects of soil are different for the different varieties, and vice versa).

Because the interaction is significant, it is not appropriate to analyze the main effects.

One must compare the soil means separately for each variety (simple effects).

Example 6.1b [Lab6ex1b.sas]

Proc Sort Data = RCBDFactorial;    * To analyze simple effects, first sort by one of the factors;
	By Variety;                    * (in this case, Variety);
Proc GLM Data = RCBDFactorial;     * Then run an ANOVA for each level of that factor;
	Class Soil Block;
	Model Yield = Soil Block;
	Means Soil / Tukey;
	By Variety;
Run; Quit;

The above code tells SAS to generate five different ANOVAs, one for each variety. The results:

Variety   Treatment     Block       MSD    Tukey
1         0.0519 NS     0.1822 NS   6.45   1 = 3    3 = 2
2         0.0746 NS     0.5530 NS   7.15   1 = 3 = 2
3         0.0130 **     0.3708 NS   5.90   1 = 3    3 = 2
4         0.0041 ***    0.6843 NS   8.38   1    3 = 2
5         0.0144 **     0.8428 NS   7.50   1 = 2    2 = 3

By investigating the simple effects, we see that only some varieties are significantly affected

by the soil mixture. The MSE and means separation tests vary across varieties.
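
An alternative way to look at the same simple effects without splitting the data is the SLICE option of the LSMeans statement in Proc GLM. The sketch below is not the method used in this lab. Unlike the By-Variety ANOVAs above, it tests Soil within each Variety against the pooled error of the full model (70 df), so its F values and p-values will generally differ from those in the table.

```sas
Proc GLM Data = RCBDFactorial;
	Class Soil Variety Block;
	Model Yield = Soil|Variety Block;
	* Test the effect of Soil separately within each level of Variety;
	LSMeans Soil*Variety / Slice = Variety;
Run;
Quit;
```

Replacing Slice = Variety with Slice = Soil would give the complementary tests of Variety within each soil mixture.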

Visualizing Simple and Main Effects

Example 6.1c

proc gplot data=rcbdfactorial ;

** Main effect plots **;

axis1 offset=(5 pct,5 pct);

axis2 offset = (5 pct,5 pct);

symbol1 i=std1mtj v=none color=BLUE;

plot Yield * Soil = 1 /

description="Means plot of Yield by Soil";

run;

axis1 offset=(5 pct,5 pct);

axis2 offset = (5 pct,5 pct);

symbol1 i=std1mtj v=none color=RED;

plot Yield * Variety = 1 /

description="Means plot of Yield by Variety";

run;

** Two-way Plots **;

axis1 offset=(5 pct,5 pct);

axis2 offset = (5 pct,5 pct);

symbol1 i=std1mtj v=none color=BLUE;

symbol2 i=std1mtj v=none color=BLACK;

symbol3 i=std1mtj v=none color=GREEN;

symbol4 i=std1mtj v=none color=ORANGE;

symbol5 i=std1mtj v=none color=RED;

plot Yield * Soil = Variety /

description="Means plot of Yield by Soil and Variety";

run;

axis1 offset=(5 pct,5 pct);

axis2 offset = (5 pct,5 pct);

symbol1 i=std1mtj v=none color=BLUE;

symbol2 i=std1mtj v=none color=BLACK;

symbol3 i=std1mtj v=none color=GREEN;

plot Yield * Variety = Soil /

description="Means plot of Yield by Variety and Soil";

run;

quit;

The symbol statement option `i=std1mtj' determines various details in the plot:

For each mean, an interval of length 1 standard error (std1) to either side of the mean (m) is

shown. Each interval has a top and bottom line (t), and the means are joined (j).

`color' is for color (the options include black, red, blue, green, cyan, gold).

`v' determines the symbol used for the individual observations, in this example (v=none) the individual

observations are not shown in the figure.

Don’t get too worried about the code! It is simply connecting the data points and giving standard

errors.

Remember that standard error bars indicate how precisely each mean is estimated, not the spread of the individual observations.

Plots of the main effects

These plots show the main effects of soil and variety on yield. In the case of having a NS interaction, the

implication is that each factor affects the response variable independent of the other; so consideration of

the main effects alone would be sufficient. [NOTE: Of course, this is not the case in this example.]

The interaction plots

The non-parallel nature of the lines in this interaction plot demonstrates visually the significant interaction

we found in the ANOVA.

DON’T FORGET: As always, we need to test assumptions in these tests. In this particular

example, there are eight different ANOVAs (one for Variety within each of the three soil mixtures

and one for Soil within each of the five varieties), the assumptions of each of which

must be met. See the appendix at the end of this lab for the full, un-cut procedure.

Example 6.2 Three-Way ANOVA with one replication [Lab6ex2.sas]

The following is the code for a generic CRD with a 3x5x2 factorial treatment structure:

Data ThreeFact;

Input a b c resp @@;

Cards;

1 1 1 61 2 1 1 38 3 1 1 81 1 1 2 31 2 1 2 27 3 1 2 113

1 2 1 39 2 2 1 61 3 2 1 49 1 2 2 68 2 2 2 103 3 2 2 143

1 3 1 121 2 3 1 82 3 3 1 41 1 3 2 78 2 3 2 57 3 3 2 63

1 4 1 79 2 4 1 68 3 4 1 59 1 4 2 122 2 4 2 127 3 4 2 167

1 5 1 91 2 5 1 31 3 5 1 61 1 5 2 92 2 5 2 43 3 5 2 128

;

Proc GLM Data = ThreeFact;

Class a b c;

Model Resp = a|b|c;

Run;

Quit;

Running the program like this will make you sad because there are zero degrees of freedom for the error

term and thus no estimation of the error SS. The result? A bunch of dots. The solution to this problem is

to assume that there is no three-way interaction, allowing us to then use the three-way interaction as an

estimate of the experimental error. To do this, modify the model statement above as follows:

Model Resp = a|b|c@2;

and re-run the program. The results:

Source DF Type III SS Mean Square F Value Pr > F

a 2 3599.266667 1799.633333 620.56 <.0001 ***

b 4 6423.133333 1605.783333 553.72 <.0001 ***

c 1 5333.333333 5333.333333 1839.08 <.0001 ***

a*b 8 9675.066667 1209.383333 417.03 <.0001 ***

a*c 2 5692.466667 2846.233333 981.46 <.0001 ***

b*c 4 7987.000000 1996.750000 688.53 <.0001 ***

You should also be able to determine which assumptions to test here and how to test them.

Visualizing Three-Way Interactions

Can't we do better than just assume a three-way interaction to be NS? What is a three-way interaction

anyway? Though words may only confuse the issue here, one way to think about it might be:

A three-way interaction exists if the character of the interaction between two factors differs

among the different levels of a third factor.

Difficult to articulate but easy to visualize. Walk through the following steps to see how one can cleverly

visualize three-way interactions in a two-dimensional plot:

1. Open the Word file ThreeWayInteraction.doc. Familiarize yourself with how the new

dependent variable C1-C2 was created:

A B C C1-C2 Resp

1 1 1 30 61

1 1 2 30 31

The new variable (C1-C2) is simply the effect of C1 relative to C2 for any given combination

of levels of Factors A and B. [Side note: If C had three levels (C1, C2, C3) instead of just

two, the procedure outlined here would have to be carried out for three new variables (C1-C2,

C1-C3, and C2-C3) instead of just one.]

2. Set up your graph with C1-C2 as the DEPENDENT variable (Y-axis) and A and B as the

CLASS variables (A on the X-axis, and B as the “group” variable). See the example below.

The C1-C2 variable replaces the response variable as the dependent variable.

       a1     a2     a3
b1     30     11    -32
b2    -29    -42    -94
b3     43     25    -22
b4    -43    -59   -108
b5     -1    -12    -67
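
Steps 1 and 2 can also be carried out entirely in SAS. The following is a minimal sketch, not the procedure of the handout (which builds the variable in ThreeWayInteraction.doc). It assumes the ThreeFact dataset from Example 6.2 is available; the names Wide, CDiff, and C1minusC2 are made up for illustration.

```sas
* Put the c = 1 and c = 2 responses side by side for each a-b combination;
Proc Sort Data = ThreeFact;
	By a b;
Run;

Proc Transpose Data = ThreeFact Out = Wide Prefix = c;
	By a b;
	ID c;          * creates the columns c1 and c2;
	Var resp;
Run;

* The new dependent variable: the effect of C1 relative to C2;
Data CDiff;
	Set Wide;
	C1minusC2 = c1 - c2;
Run;

Proc Print Data = CDiff;
Run;

* One joined line per level of b, with a on the X-axis;
Symbol1 i = join v = dot color = BLUE;
Symbol2 i = join v = dot color = BLACK;
Symbol3 i = join v = dot color = GREEN;
Symbol4 i = join v = dot color = ORANGE;
Symbol5 i = join v = dot color = RED;

Proc GPlot Data = CDiff;
	Plot C1minusC2 * a = b;
Run;
Quit;
```

The printed CDiff dataset should reproduce the 15 differences in the table above, and the plot should show one line for each of the five levels of B.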

The output (it’s like seeing in four dimensions!)

One way to think about this: Each line represents one level of B, and the average of each line

represents the effect of C for each level of B. While these averages differ among lines (i.e. B*C is

significant), their differences are fairly constant across all levels of A.

In other words, the roughly parallel nature of the lines in this interaction plot shows us that the difference

in the effects of C at the different levels of B does not vary significantly across the levels of A.

[Translation: No significant three-way interaction, so we are justified in using A*B*C as our error term.]

phew!

APPENDIX: The Almost Practically Complete Analysis for Example 6.1

Step 1: Decide if you need to analyze simple effects

Data RCBDFactorial;

Do Soil = 1 to 3;

Do Variety = 1 to 5;

Do Block = 1 to 6;

Input Yield @@;

Output;

End;

End;

End;

Cards;

22.1 24.1 19.1 22.1 25.1 18.1

27.1 15.1 20.6 28.6 15.1 24.6

22.3 25.8 22.8 28.3 21.3 18.3

19.8 28.3 26.8 27.3 26.8 26.8

20.0 17.0 24.0 22.5 28.0 22.5

13.5 14.5 11.5 6.0 27.0 18.0

16.9 17.4 10.4 19.4 11.9 15.4

15.7 10.2 16.7 19.7 18.2 12.2

15.1 6.5 17.1 7.6 13.6 21.1

21.8 22.8 18.8 21.3 16.3 14.3

19.0 22.0 20.0 14.5 19.0 16.0

20.0 22.0 25.5 16.5 18.0 17.5

16.4 14.4 21.4 19.9 10.4 21.4

24.5 16.0 11.0 7.5 14.5 15.5

11.8 14.3 21.3 6.3 7.8 13.8

;

Proc GLM Data = RCBDFactorial;

Class Soil Variety Block;

Model Yield = Soil|Variety Block;

Proc GLM Data = RCBDFactorial;

Class Soil Variety Block;

Model Yield = Soil|Variety|Block@2;

Run;

Quit;

Notice there are 2 Proc GLM's in this code. The first features the model we're interested in, and we run it

to see if there is a significant Soil*Variety interaction (i.e. to see if we should analyze main or simple

effects). The second is what we call an "exploratory model" to check the significance of the two-way

block interactions. The output:

First Proc GLM

Source DF Type III SS Mean Square F Value Pr > F

Soil 2 953.1562222 476.5781111 22.98 <.0001 ***

Variety 4 11.3804444 2.8451111 0.14 0.9680

Soil*Variety 8 374.4882222 46.8110278 2.26 0.0330 *

Block 5 22.1288889 4.4257778 0.21 0.9557

There is a significant Soil*Variety interaction, so we must look at simple effects.

Second Proc GLM results

Source DF Type III SS Mean Square F Value Pr > F

Soil*Block 10 242.2944444 24.2294444 1.25 0.2881 NS

Variety*Block 20 436.9488889 21.8474444 1.13 0.3589 NS

Neither 2-way block interaction is significant, so we're justified in merging them into the error (and

gaining 30 df by doing so).

Step 2: Analyze the simple effect of Soil (i.e. for each Variety separately)

Data RCBDFactorial;

Do Soil = 1 to 3;

Do Variety = 1 to 5;

Do Block = 1 to 6;

Input Yield @@;

Output;

End;

End;

End;

Cards;

22.1 24.1 19.1 22.1 25.1 18.1

27.1 15.1 20.6 28.6 15.1 24.6

22.3 25.8 22.8 28.3 21.3 18.3

19.8 28.3 26.8 27.3 26.8 26.8

20.0 17.0 24.0 22.5 28.0 22.5

13.5 14.5 11.5 6.0 27.0 18.0

16.9 17.4 10.4 19.4 11.9 15.4

15.7 10.2 16.7 19.7 18.2 12.2

15.1 6.5 17.1 7.6 13.6 21.1

21.8 22.8 18.8 21.3 16.3 14.3

19.0 22.0 20.0 14.5 19.0 16.0

20.0 22.0 25.5 16.5 18.0 17.5

16.4 14.4 21.4 19.9 10.4 21.4

24.5 16.0 11.0 7.5 14.5 15.5

11.8 14.3 21.3 6.3 7.8 13.8

;

Proc Sort Data = RCBDFactorial;

By Variety;

Proc GLM Data = RCBDFactorial;

Class Soil Block;

Model Yield = Soil Block;

Means Soil / Tukey;

By Variety;

Output Out = PR r = res p = pred;

Proc Print Data = PR;

Proc Univariate normal data = PR;

Var res;

By Variety;

Proc GLM data = RCBDFactorial;

Class Soil;

Model Yield = Soil;

Means Soil / hovtest = Levene;

By Variety;

Proc GLM Data = PR;

Class Soil Block;

Model Yield = Soil Block pred*pred;

By Variety;

Proc Plot Data = PR;

Plot res*pred;

By Variety;

Run;

Quit;

The first Proc GLM carries out five separate ANOVA's, one for each Variety; it also generates predicted

and residual values. The Proc Univariate tests for normality of residuals within each ANOVA. The

second Proc GLM conducts Levene's Tests for Soil within each level of variety. And the last Proc GLM

tests for nonadditivity within each of the five models. The output is extensive but can be organized as

shown below:

Normality of residuals (Variety 1 – Variety 5)

Variety   Test            --Statistic---     -----p Value------
1         Shapiro-Wilk    W   0.962536       Pr < W   0.6512
2         Shapiro-Wilk    W   0.954449       Pr < W   0.4990
3         Shapiro-Wilk    W   0.954754       Pr < W   0.5043
4         Shapiro-Wilk    W   0.991391       Pr < W   0.9996
5         Shapiro-Wilk    W   0.945644       Pr < W   0.3608

Homogeneity of variances (Variety 1 – Variety 5)

Levene's Test for Homogeneity of Yield Variance

ANOVA of Squared Deviations from Group Means

Variety   Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
1         Soil     2    4967.3           2483.7        2.16      0.1502
2         Soil     2    1479.1           739.5         3.59      0.0532
3         Soil     2    128.7            64.3531       0.37      0.6985
4         Soil     2    1425.9           713.0         0.93      0.4157
5         Soil     2    734.1            367.1         0.93      0.4161

Nonadditivity (Variety 1 – Variety 5)

Variety   Source      DF   Type I SS      Mean Square    F Value   Pr > F
1         pred*pred   1    53.2354203     53.2354203     4.25      0.0693
2         pred*pred   1    8.3170588      8.3170588      0.38      0.5517
3         pred*pred   1    0.1770860      0.1770860      0.01      0.9171
4         pred*pred   1    105.6521057    105.6521057    5.44      0.0446
5         pred*pred   1    56.8633800     56.8633800     3.05      0.1148

The only assumption we violate is Nonadditivity within Variety 4 (though a few others are close). At this

point, you could try to transform the data for Variety 4 to bring that subset of your data into alignment

with the ANOVA assumptions. But since you're already able to detect differences among soils within

Variety 4 (see summary of Tukey separations below), you may decide that transforming is not worth it.

Variety   Treatment     Block       MSD    Tukey
1         0.0519 NS     0.1822 NS   6.45   1 = 3    3 = 2
2         0.0746 NS     0.5530 NS   7.15   1 = 3 = 2
3         0.0130 **     0.3708 NS   5.90   1 = 3    3 = 2
4         0.0041 ***    0.6843 NS   8.38   1    3 = 2
5         0.0144 **     0.8428 NS   7.50   1 = 2    2 = 3

To be truly comprehensive in our analysis, we should also analyze the differences among varieties for

each of the soils. To do this, simply sort by Soil instead of Variety and replace all the "by Variety"

commands with "by Soil" commands in the code; other changes are necessary in the class and model

statements, resulting in a final code like the one below:

Data RCBDFactorial;

Do Soil = 1 to 3;

Do Variety = 1 to 5;

Do Block = 1 to 6;

Input Yield @@;

Output;

End;

End;

End;

Cards;

22.1 24.1 19.1 22.1 25.1 18.1

27.1 15.1 20.6 28.6 15.1 24.6

22.3 25.8 22.8 28.3 21.3 18.3

19.8 28.3 26.8 27.3 26.8 26.8

20.0 17.0 24.0 22.5 28.0 22.5

13.5 14.5 11.5 6.0 27.0 18.0

16.9 17.4 10.4 19.4 11.9 15.4

15.7 10.2 16.7 19.7 18.2 12.2

15.1 6.5 17.1 7.6 13.6 21.1

21.8 22.8 18.8 21.3 16.3 14.3

19.0 22.0 20.0 14.5 19.0 16.0

20.0 22.0 25.5 16.5 18.0 17.5

16.4 14.4 21.4 19.9 10.4 21.4

24.5 16.0 11.0 7.5 14.5 15.5

11.8 14.3 21.3 6.3 7.8 13.8

;

Proc Sort Data = RCBDFactorial;

By Soil;

Proc GLM Data = RCBDFactorial;

Class Variety Block;

Model Yield = Variety Block;

Means Variety / Tukey;

By Soil;

Output Out = PR r = res p = pred;

Proc Univariate normal data = PR;

Var res;

By Soil;

Proc GLM data = RCBDFactorial;

Class Variety;

Model Yield = Variety;

Means Variety / hovtest = Levene;

By Soil;

Proc GLM Data = PR;

Class Variety Block;

Model Yield = Variety Block pred*pred;

By Soil;

Proc Plot data = PR;

Plot res*pred;

By Soil;

Run;

Quit;

And the results:

Normality of residuals (Soil 1 – Soil 3)

Soil   Test            --Statistic---     -----p Value------
1      Shapiro-Wilk    W   0.975394       Pr < W   0.6943
2      Shapiro-Wilk    W   0.977548       Pr < W   0.7573
3      Shapiro-Wilk    W   0.976278       Pr < W   0.7204

Homogeneity of variances (Soil 1 – Soil 3)

Levene's Test for Homogeneity of Yield Variance

ANOVA of Squared Deviations from Group Means

Soil   Source    DF   Sum of Squares   Mean Square   F Value   Pr > F
1      Variety   4    2008.6           502.2         2.47      0.0705
2      Variety   4    4763.3           1190.8        1.40      0.2620
3      Variety   4    1963.4           490.8         0.87      0.4950

Nonadditivity (Soil 1 – Soil 3)

Soil   Source      DF   Type I SS     Mean Square   F Value   Pr > F
1      pred*pred   1    1.27330875    1.27330875    0.07      0.7914
2      pred*pred   1    72.8265041    72.8265041    2.90      0.1049
3      pred*pred   1    17.2174555    17.2174555    1.07      0.3129

All assumptions are nicely met, so we can report the ANOVA results for Variety without reservations:

Soil   Treatment    Block       MSD    Tukey
1      0.3947 NS    0.7011 NS   7.10   4=3=5=2=1
2      0.4435 NS    0.9244 NS   9.06   5=3=2=1=4
3      0.0347 *     0.0950 NS   6.93   2=1=3=4    1=3=4=5

Interesting. While for the overall ANOVA no differences among Varieties were detected, here we see that

within Soil 3, differences are in fact present.

This is an "almost practically complete analysis" because a complete analysis would require commentary

(i.e. interpretation) of all the results generated above, a discussion as to which variety-soil

combinations are recommended or not recommended, etc. One should also make efforts to visualize the

data using bar charts or interaction plots. The thing to realize is that, even for a simple example like this,

the necessary analysis can be substantial.

An added thorn: This analysis of simple effects involves an enormous number of

independent questions: 8 Shapiro-Wilk tests, 8 Levene’s tests, 8 non-additivity tests, 45

Tukey pairwise comparisons! This has major implications in terms of the

experiment-wise error rate, so be aware!

APPENDIX 2: Graphing in Excel

The same results can be easily obtained in Excel by organizing the data into series (rows) as below:

Var1 Var2 Var3 Var4 Var5

Soil1 21.8 21.9 23.1 26.0 22.3

Soil2 15.1 15.2 15.5 13.5 19.2

Soil3 18.4 19.9 17.3 14.8 12.6

and then selecting insert->line->2D-line

Error bars can be added by selecting Chart tools->layout ->error bars -> custom ->specify value and selecting

rows for each series from a Table organized as above with SE.

SE Var1 Var2 Var3 Var4 Var5

Soil1 1.1 2.4 1.4 1.3 1.5

Soil2 2.9 1.4 1.5 2.3 1.4

Soil3 1.1 1.4 1.8 2.3 2.2
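
The cell means and standard errors in the two tables above do not have to be computed by hand. One possible way to obtain them, sketched here on the assumption that the RCBDFactorial dataset from Example 6.1 is available, is Proc Means with both factors as class variables:

```sas
* Mean and standard error of Yield for each Soil-Variety combination;
Proc Means Data = RCBDFactorial Mean StdErr MaxDec = 1;
	Class Soil Variety;
	Var Yield;
Run;
```

The 15 rows of output can then be copied into Excel and arranged with soils as rows and varieties as columns.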

The non-parallel nature of the lines in this interaction plot demonstrates visually the significant interaction

we found in the ANOVA.

[Figure: interaction plot of mean Yield (Y-axis, 10.0 to 30.0) by variety (Var1 to Var5 on the X-axis), with one line per soil (Soil1, Soil2, Soil3)]

