
Notes from Design of Experiments

Cambridge Part III Mathematical Tripos 2012-2013

Lecturer: Rosemary Bailey

Vivak Patel

March 21, 2013


Contents

1 Overview
  1.1 Stages of a Statistically Designed Experiment
  1.2 Practical Considerations
  1.3 An Example
  1.4 Definitions, Notation, and Conventions
  1.5 Linear Models
  1.6 Covariance Matrices

2 Unstructured Experiments
  2.1 Completely Randomised Design
  2.2 Treatment Subspace
  2.3 Linear Model for Unstructured Experiments
    2.3.1 Estimation and Variance
    2.3.2 Sums of Squares and Mean Squares
    2.3.3 Null Model
    2.3.4 Analysis of Variance
    2.3.5 Normal Assumptions and F-statistic

3 Experiments with Blocking
  3.1 General Block Design
    3.1.1 Purpose and Goals of Blocking
    3.1.2 Types of Blocking and Considerations for Each
  3.2 Orthogonal Block Design
    3.2.1 Definition and Properties
    3.2.2 Construction and Randomisation
    3.2.3 Fixed Effects Blocking Model Analysis
    3.2.4 Random Effects Blocking Model Analysis

4 Treatment Structures
  4.1 Treatment Factors and their Subspaces
  4.2 Hasse Diagrams
  4.3 Main Effect, Fitted Values, Interactions
  4.4 Data Analysis
  4.5 Factorial Experiments

5 Row-Column Designs
  5.1 Double Blocking
  5.2 Latin Squares
  5.3 General Construction and Randomisation
  5.4 Orthogonal Subspaces
  5.5 Fixed Effect: Model and Analysis
  5.6 Random Effect: Model and Analysis

6 Small Units Inside Larger Units
  6.1 Treatment on E-Units Containing O-Units
    6.1.1 Overview, Construction and Modelling
    6.1.2 Analysis
  6.2 Treatment Effects in Different Strata
    6.2.1 General Description and Construction
    6.2.2 Analysis
    6.2.3 Design Advantages
  6.3 Split Plot Designs
    6.3.1 Overview
    6.3.2 Model and Analysis

7 More on Latin Squares
  7.1 Uses of Latin Squares
  7.2 Graeco-Latin Squares
  7.3 Applications of Graeco-Latin Squares

A Example of Interactions

B Three Factor Hasse Diagram

1 Overview

1.1 Stages of a Statistically Designed Experiment

1. Consultation or collaboration with the experimenter:

(a) Determine the time frame before the start of the experiment

(b) Ask “dumb” questions to understand the experiment

(c) Be aware that the experimenter may have preconceptions about experimental design

2. Statistical Design

3. Do the experiment and collect the data

(a) Data should not be processed in any way (do not reorganise, do not allow for calculations, etc.)

(b) Data collection should not be delegated to juniors

4. Data Scrutiny: look over data for anomalies, outliers or bad practices

5. Data Analysis

(a) This should be planned and tested on dummy data during the design stage

(b) Be prepared to modify planned analysis for unexpected events

6. Interpretation: use analysis to answer the original question

1.2 Practical Considerations

1. Experiments must answer specific questions, such as:

(a) Estimate a quantity with unbiased estimators with small variances

(b) Test a hypothesis with high power for detecting practical differences

2. Increasing the number of times each treatment is tested (i.e. the replications):

(a) Ideally reduces variance and increases the power in a testing scenario

(b) In reality, it will increase the cost of an experiment, and, with too few samples, may actually increase variance

3. Increase the amount of local control (i.e. grouping units that are alike):

(a) Ideally reduces variances within each group by reducing the sources of variability, and increases power

(b) In reality, in non-orthogonal designs, local control can increase variance, reduce the degrees of freedom, and reduce the power when the number of units is too low. It also increases the complexity of the experiment, analysis and interpretation.

4. Important constraints: costs, availability of test materials or experimental units, existence of natural blocks between experimental units, and administrative barriers


1.3 An Example

Example 1.1. Suppose we have three varieties of rye-grass (Cropper, Melba & Melle) and 4 different quantities of fertiliser (0, 80, 160 & 240 kg/ha) which we want to compare. We have two fields on which we can place these treatments, but we are constrained by the equipment, which can only apply one type of rye-grass to a strip of the field. Therefore, we have the following design:

            Field 1                      Field 2
Cropper   Melba   Melle       Melba   Cropper   Melle
   0       160     240         160      80        0
  160       80      80           0     160       80
   80        0     160         240       0      240
  240      240       0          80     240      160

1. This is a combinatorial design: each strip has 1 type of grass, and within each strip, each plot has 1 of the 4 quantities of fertiliser

2. The strips were assigned randomly within each field, and the quantities on each plot within each strip were assigned randomly

1.4 Definitions, Notation, and Conventions

Definition 1.1. An experimental unit is the smallest unit to which a treatment can be applied.

Definition 1.2. A treatment is the entire description of what can be applied to an experimental unit. We let τ denote the set of all treatments and t = |τ| denote the number of treatments. Each individual treatment is denoted with a lower case Arabic letter.

Definition 1.3. An observational unit is the smallest unit on which we measure a response. We denote Ω as the set of observational units, and use lower case Greek letters to denote individual observational units. We let N = |Ω| be the number of such units.

Example 1.2. Examples of Experimental Units, Observational Units and Treatments

1. In the previous example, the O-unit and E-unit are both the plots, and the treatment is the pair (grass variety, fertiliser quantity)

2. Drugs for chronic illnesses. Suppose we have patients with a chronic illness and each patient changes the drug they use every month. Then, each patient per month is the experimental unit

3. Feeds for calves. Suppose we have 10 calves per pen and each pen is given a different feed. The E-unit is the pen, and the O-unit is each calf.

4. Hand wash. Suppose we are comparing washing hands with no soap, standard soap and a new soap. Then there are three treatments.


Definition 1.4. A design is a function T : Ω → τ (i.e. it maps an observational unit to a treatment). A plan or layout is the design translated back into the context of the actual experiment.

Definition 1.5. A treatment or observational unit structure is any meaningful way of dividing up the treatments or observational units into some sort of category. If no such structure exists then the treatments or observational units are called unstructured.

Note 1.1. Any type of treatment structure can occur with any type of observational unit structure. Experimental design deals with creating experiments with the appropriate structures and analysing the results accordingly.

1.5 Linear Models

Definition 1.6. Let Yω denote the random response variable for ω ∈ Ω and yω denote its actual measured value. A linear model for the response is Yω = Zω + τT(ω) where:

1. Zω is a random variable depending only on ω

2. τi is a constant depending on treatment i ∈ τ (here i = T(ω))

Note 1.2. Because we have random variables, we must define a probability space for such variables. We let the probability space be the set of occasions and uncontrolled conditions under which the experiment might occur.

Consider Zα and Zβ where α ≠ β. Both of these random variables are defined on the same probability space, and thus have a common distribution. To simplify the analysis of these distributions, we can make several common modelling simplifications:

1. Simple Text-book Model: Zω ∼ N(0, σ²) for all ω.

2. Fixed Effect Model: Zω ∼ N(µω, σ²) and are independent, where µω depends on how the unit fits into the plot/observational unit structure. (E.g. if certain units are in the same block, then we may assume they have the same mean.)

3. Random Effect Model: Zω are identically distributed, but there is a correlation between Zα and Zβ depending on how α and β are related in the plot structure.

4. Randomisation Model: Zω are identically distributed, but the correlation between Zα and Zβ depends on how the observational units are randomised.

1.6 Covariance Matrices

This is a general but important theorem on covariance matrices, which we will use in analysing models where random effects are assumed.

Theorem 1.1. Suppose Cov[Y] = ∑_{i=1}^l ξi Qi, where Qi is the known matrix of orthogonal projection onto the eigenspace Wi of Cov[Y] with unknown eigenvalue ξi. Then:

1. If x ∈ Wi then Cov[x · Y] = ‖x‖² ξi

2. If i ≠ j and x ∈ Wi, z ∈ Wj, then Cov[x · Y, z · Y] = 0

3. If W ≤ Wi then E[‖PW Y‖²] = ξi dim(W) + ‖PW τ‖²

Note 1.3. Cov[Y] is positive definite (i.e. ξi > 0); the Qi are symmetric, idempotent and sum to the identity matrix. Also, Qi Qj = 0 for i ≠ j.

Proof. Given the properties of the Qi's above:

1. Cov[x · Y] = xᵀ Cov[Y] x. Since x ∈ Wi, we have Qj x = 0 for j ≠ i and Qi x = x. Therefore, Cov[x · Y] = ξi xᵀ Qi x = ξi ‖x‖².

2. Cov[x · Y, z · Y] = xᵀ Cov[Y] z = ∑_k ξk xᵀ Qk z. Note that xᵀ Qk = 0 for k ≠ i and Qk z = 0 for k ≠ j. Since j ≠ i, every term vanishes, so the covariance is 0.

3. Consider E[‖PW Y‖²] = ∑_j E[(PW Y)j²] = ∑_j Cov[(PW Y)j] + (E[(PW Y)j])² = trace(Cov[PW Y]) + ‖PW τ‖² = trace(ξi PW Qi PW) + ‖PW τ‖² = ξi dim(W) + ‖PW τ‖², since W ≤ Wi gives PW Qi PW = PW and trace(PW) = dim(W).
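The theorem can be checked numerically in a tiny case. The following sketch (all values invented for illustration) takes V = ℝ² with eigenspaces W1 = span{(1,1)} and W2 = span{(1,−1)}, and builds Cov[Y] = ξ1 Q1 + ξ2 Q2 from the two projections:

```python
# Tiny numerical check of Theorem 1.1 (example values are invented).
xi1, xi2 = 3.0, 1.0
Q1 = [[0.5, 0.5], [0.5, 0.5]]      # projection onto span{(1, 1)}
Q2 = [[0.5, -0.5], [-0.5, 0.5]]    # projection onto span{(1, -1)}
cov = [[xi1 * Q1[i][j] + xi2 * Q2[i][j] for j in range(2)] for i in range(2)]

def quad(x, M, z):
    """Computes x^T M z."""
    return sum(x[i] * M[i][j] * z[j] for i in range(2) for j in range(2))

x = [1.0, 1.0]             # x in W1, with ‖x‖² = 2
z = [1.0, -1.0]            # z in W2

var_x = quad(x, cov, x)    # part 1: should equal xi1 * ‖x‖² = 6
cov_xz = quad(x, cov, z)   # part 2: should equal 0
```

Here `var_x` plays the role of Cov[x · Y] and `cov_xz` of Cov[x · Y, z · Y] for the stated covariance structure.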


2 Unstructured Experiments

2.1 Completely Randomised Design

Definition 2.1. In a completely randomised design, the observational units are the same as the experimental units, there are no blocks, there are t treatments, and the replication of treatment i is ri, so that ∑_i ri = N.

The process for constructing a completely randomised design is as follows:

1. Label each unit 1, . . . , N

2. Allocate treatment 1 to units 1, . . . , r1; allocate treatment 2 to units r1 + 1, . . . , r1 + r2; etc.

3. Choose a random permutation P of 1, . . . , N. The unit labelled P(i) then receives the treatment allocated to label i in the previous step.

Example 2.1. Suppose t = 2, N = 5, r1 = 3 and r2 = 2.

1. Our observational units are numbered: 1 2 3 4 5

2. We allocate treatment A to units 1, 2 and 3, and treatment B to units 4 and 5.

3. Suppose we are given a permutation which maps (1, 2, 3, 4, 5) → (2, 3, 5, 1, 4). Then, for example, P(3) = 5, so unit 5 will receive treatment A (the treatment allocated to label 3). Overall, T(1) = B, T(2) = A, T(3) = A, T(4) = B, and T(5) = A.
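The three construction steps can be sketched in code as follows (the function name and dictionary interface are illustrative, not from the notes):

```python
import random

def completely_randomised_design(replications, seed=None):
    """replications maps treatment -> r_i; returns a dict unit -> treatment."""
    rng = random.Random(seed)
    # Step 2: allocate treatments to the labels 1..N in order,
    # e.g. {"A": 3, "B": 2} -> [A, A, A, B, B]
    allocation = []
    for treatment, r in replications.items():
        allocation.extend([treatment] * r)
    n = len(allocation)
    # Step 3: choose a random permutation P of 1..N; the unit labelled
    # P(i) receives the treatment allocated to label i
    perm = list(range(1, n + 1))
    rng.shuffle(perm)
    return {perm[i]: allocation[i] for i in range(n)}

design = completely_randomised_design({"A": 3, "B": 2}, seed=42)
```

With r1 = 3 and r2 = 2 this produces a design of the same shape as Example 2.1: five units, three receiving A and two receiving B, with the assignment determined by the permutation.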

2.2 Treatment Subspace

Definition 2.2. Given Ω, the set of observational units, and the design function T, let:

1. V = ℝ^Ω be the set of all real column vectors whose coordinates are indexed by the elements of Ω

2. VT = {v ∈ V : T(α) = T(β) ⇒ vα = vβ} be the treatment subspace, whose members are called the treatment vectors.

3. If v ∈ VT and ∑_{ω∈Ω} vω = 0 then v is called a treatment contrast.

4. The usual scalar product for v, w ∈ V is v · w = ∑_{ω∈Ω} vω wω

5. v is orthogonal to w, denoted v ⊥ w, if v · w = 0

6. Let W ≤ V (i.e. W is a subspace of V). The orthogonal complement of W is W⊥ = {v ∈ V : v · w = 0 for all w ∈ W}

Proposition 2.1. Given the previous definition, we have some simple properties:

1. dim(V) = |Ω| = N and dim(VT) = |τ| = t

2. W⊥ is a subspace with dim(W⊥) = N − dim(W) and V = W⊥ ⊕ W


3. Let u1, . . . , ud be an orthogonal basis for W. The orthogonal projection of v onto W is w = PW(v) = ∑_{i=1}^d (v · ui/‖ui‖²) ui. Thus, PW is idempotent, symmetric and has rank dim(W).

Note 2.1. A convenient (orthogonal) basis for VT is u1, . . . , ut where ui,α = 1[T(α) = i].
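Using the basis of Note 2.1, the projection P_{VT} y replaces each coordinate by the mean of the responses receiving the same treatment, since P_{VT} y = ∑_i (y · ui/‖ui‖²) ui and ‖ui‖² = ri. A minimal sketch (the helper name and data are illustrative):

```python
def project_onto_treatment_subspace(T, y):
    """T maps unit -> treatment; y maps unit -> response.
    Returns P_{V_T} y: each coordinate becomes its treatment mean."""
    totals, counts = {}, {}
    for unit, treatment in T.items():
        totals[treatment] = totals.get(treatment, 0.0) + y[unit]
        counts[treatment] = counts.get(treatment, 0) + 1
    return {unit: totals[t] / counts[t] for unit, t in T.items()}

fit = project_onto_treatment_subspace(
    {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"},
    {1: 1.0, 2: 3.0, 3: 2.0, 4: 4.0, 5: 6.0},
)
```

Each A-unit is mapped to the A-mean 2.0 and each B-unit to the B-mean 5.0, which is exactly the vector of fitted values used later in Definition 2.4.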

2.3 Linear Model for Unstructured Experiments

Let Y be the random vector of responses over Ω, and Y = Z + τ where:

1. τ = ∑_{i=1}^t τi ui ∈ VT

2. Z is a random vector on which we assume E[Z] = 0 and Cov[Z] = σ²I.

Proposition 2.2. Using the assumptions above:

1. E[Y ] = τ

2. Cov[Y ] = σ2I

3. Let W ≤ V be a subspace, then

(a) E[PWY ] = PW τ

(b) Cov[PWY ] = σ2PW

(c) E[‖PW Y‖²] = σ² dim(W) + ‖PW τ‖²

Proof. Note that these properties will be used to analyse the remaining structures of experiments:

1. E[Y ] = E[Z + τ ] = 0 + τ = τ

2. Cov[Y ] = Cov[Z + τ ] = Cov[Z] = σ2I

3. Assuming W ≤ V :

(a) E[PWY ] = PWE[Y ] = PW τ

(b) Cov[PW Y] = PW Cov[Y] PWᵀ = σ² PW PWᵀ = σ² PW, since PW is symmetric and idempotent.

(c) Consider E[‖X‖²] = E[∑_ω Xω²] = ∑_ω E[Xω²] = ∑_ω Cov[Xω] + (E[Xω])². Replacing X with PW Y gives E[‖PW Y‖²] = ∑_ω σ²(PW)ωω + ∑_ω (PW τ)ω² = σ² trace(PW) + ‖PW τ‖² = σ² dim(W) + ‖PW τ‖².

2.3.1 Estimation and Variance

Suppose we want to estimate ∑_{i=1}^t λi τi, where the τi are unknown and usually λi ∈ {−1, 0, 1}.

Example 2.2. Suppose we want to estimate the treatment constant corresponding to treatment i, or the difference between the treatment constants of i and j. Then we could compute respectively:

1. λj = 1[j = i]. Then ∑_j λj τj = τi


2. Let λj = −1, λi = 1 and all others be 0. Then the sum is τi − τj.

Proposition 2.3. x · Y is an unbiased estimator for ∑_i λi τi where:

1. x = ∑_i (λi/ri) ui

2. Cov[x · Y] = σ² ∑_i λi²/ri

Remark 2.1. In fact, x · Y is the best linear unbiased estimator since it achieves the smallest variance out of all linear estimators of the quantity of interest.

Proof. First we compute the expected value of x · Y, which is simply E[x · Y] = x · E[Y] = x · τ = ∑_i λi τi.

Then we compute the covariance: Cov[x · Y] = xᵀ Cov[Y] x = σ²‖x‖² = σ² ∑_i (λi²/ri²)‖ui‖² = σ² ∑_i λi²/ri, since ‖ui‖² = ri.

Therefore, if we are given measurements y we can estimate ∑_i λi τi by x · y.

To simplify our notation we have the following two definitions:

Definition 2.3. Let us use the same basis for VT as above. Let Y be a vector (random or otherwise). Then the sum of all responses of units treated with treatment i is ui · Y = SUM_{T=i}. Therefore, the best estimate for τi is τ̂i = SUM_{T=i}/ri.

Definition 2.4. The vector of fitted values for τ is τ̂ = P_{VT} y = ∑_{i=1}^t τ̂i ui.
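A small sketch of Proposition 2.3 with exact arithmetic (the function and data are illustrative): for the contrast τA − τB with rA = 3 and rB = 2, the vector x = ∑_i (λi/ri) ui satisfies ‖x‖² = ∑_i λi²/ri = 1/3 + 1/2 = 5/6.

```python
from fractions import Fraction

def contrast_vector(T, lam):
    """x = sum_i (lambda_i / r_i) u_i, where T maps unit -> treatment."""
    r = {}
    for treatment in T.values():
        r[treatment] = r.get(treatment, 0) + 1
    return {unit: Fraction(lam.get(t, 0), r[t]) for unit, t in T.items()}

T = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}
x = contrast_vector(T, {"A": 1, "B": -1})      # estimates tau_A - tau_B
norm_sq = sum(v * v for v in x.values())       # ‖x‖² = 1/r_A + 1/r_B
```

By Proposition 2.3, the variance of this estimator is σ² · 5/6, and since the λi sum against the replications to zero, x is a treatment contrast.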

2.3.2 Sums of Squares and Mean Squares

Although we are able to estimate linear combinations of the treatment constants, we still need to find estimates of σ² in order to compute the estimated variance of our estimates. The final result of Proposition 2.2 suggests that we can do this by finding a subspace of V whose dimension is nonzero and that is perpendicular to τ ∈ VT. The best choice is clearly VT⊥, which we call the subspace of residuals. This motivates the following definitions.

Definition 2.5. Let W ≤ V .

1. The sum of squares of W is SS(W ) = ‖PWY ‖2

2. The degrees of freedom of W is dfW = dim(W )

3. The mean square of W is MS(W) = SS(W)/dfW = ‖PW Y‖²/dim(W)

4. The expected mean square of W is EMS(W) = E[MS(W)]

Remark 2.2. When our subspace is VT, SS(VT) is actually called the crude sum of squares, since we have not taken into account the null model (see below). When it is necessary we will denote this as CSS(VT).

Proposition 2.4. Let VT⊥ be the subspace perpendicular to VT. Then MS(VT⊥) is an unbiased estimator of σ².

Proof. From Proposition 2.2, we have that EMS(W) = σ² + ‖PW τ‖²/dim(W). Since τ ∈ VT, P_{VT⊥} τ = 0. And dim(VT⊥) = dim(V) − dim(VT) = N − t ≠ 0.

Remark 2.3. Consequently, the unbiased estimator for Cov[x · Y] is MS(VT⊥)‖x‖². We are often interested in reporting the standard error, which is √MS(VT⊥) ‖x‖.


2.3.3 Null Model

We are often interested in differences between our treatments. When we believe that our treatments are all alike, we are operating under the null model. More formally:

Definition 2.6. The null model is the situation in which all treatment constants are the same value. Particularly, τ1 = · · · = τt, so that E[Y] = κ u0 where u0 = ∑_i ui = [1 . . . 1]ᵀ. The linear subspace formed by the null model is denoted as V0 or W0.

Definition 2.7. The estimate for the treatment effect under the null model is the overall mean ȳ.

Definition 2.8. The subspace of the treatment space in which the treatments do not all have equal effect is WT = V0⊥ ∩ VT.

Proposition 2.5. Let V0, WT and VT⊥ be defined as above.

1. Sum of Squares: (Hint: Pythagorean Theorem)

(a) SS(mean) = SS(V0) = N Ȳ²

(b) SS(treatments) = SS(WT) = CSS(VT) − SS(V0)

(c) SS(residuals) = SS(VT⊥) = ‖Y‖² − CSS(VT)

2. Degrees of Freedom (and Mean Squares)

(a) dim(V0) = 1.

(b) dim(WT) = dim(VT) − dim(V0) = t − 1

(c) dim(VT⊥) = N − t

3. Expected Mean Square

(a) EMS(V0) = σ² + N τ̄², where τ̄ = (1/N) ∑_i ri τi

(b) EMS(WT) = σ² + ∑_i ri(τi − τ̄)²/(t − 1)

(c) EMS(VT⊥) = σ²

2.3.4 Analysis of Variance

Suppose we have two hypotheses:

1. H0 : τ1 = · · · = τt

2. H1 : not all τi are equal

Under H0, EMS(WT) = σ² = EMS(VT⊥). Thus, we want to compare MS(treatments) against MS(residuals) to see if the treatments do indeed have different effects. By the previous proposition, it is clear that EMS(treatments) ≥ EMS(residuals). Thus, we only have a one-sided test. So we have the following interpretations:

1. If MS(treatments) ≫ MS(residuals) then we can reject H0

2. If MS(treatments) > MS(residuals) we must assume normality and use the F-statistic (see below).

3. If MS(treatments) ≈ MS(residuals) we do not have enough evidence to reject H0

4. If MS(treatments) < MS(residuals) then we should again assume normality and check the F-statistic. If it is too small, this may be an indication that something is amiss with the data.

To summarise the analysis, we usually organise the information in an ANOVA table, which shows the subspace (Source), Sum of Squares, Degrees of Freedom, Mean Square and Variance Ratio.

Completely Randomised Design ANOVA Table

Source       SS               DF      MS         VR
Mean         N Ȳ²             1       MS(V0)     MS(V0)/MS(VT⊥)
Treatments   SS(WT)           t − 1   MS(WT)     MS(WT)/MS(VT⊥)
Residuals    (subtraction)    N − t   MS(VT⊥)
Total        ∑_ω Yω²          N
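The table's entries can be computed directly from the formulas above. A worked sketch for a small completely randomised design (the data, A: 1, 2, 3 and B: 4, 6, are invented for illustration):

```python
# One-way ANOVA decomposition for a small CRD (invented data).
y = {"A": [1.0, 2.0, 3.0], "B": [4.0, 6.0]}

N = sum(len(v) for v in y.values())
t = len(y)
grand_total = sum(sum(v) for v in y.values())

ss_total = sum(x * x for v in y.values() for x in v)      # ‖Y‖²
ss_mean = grand_total ** 2 / N                            # SS(V0) = N·Ȳ²
css_vt = sum(sum(v) ** 2 / len(v) for v in y.values())    # CSS(V_T)
ss_treat = css_vt - ss_mean                               # SS(W_T)
ss_resid = ss_total - css_vt                              # SS(V_T^⊥)

ms_treat = ss_treat / (t - 1)
ms_resid = ss_resid / (N - t)
variance_ratio = ms_treat / ms_resid
```

The Pythagorean decomposition SS(mean) + SS(treatments) + SS(residuals) = ‖Y‖² holds exactly, and the variance ratio is the statistic compared against an F distribution in Section 2.3.5.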

2.3.5 Normal Assumptions and F-statistic

In some cases, our assumptions under the linear model are not enough to conclude whether we can or cannot reject the null hypothesis. Thus, we may need to assume that our data is distributed normally. We have the following theorem.

Theorem 2.1. Suppose Y ∼ N_N(τ, σ²I). Then:

1. If x · z = 0 then x · Y and z · Y are independent random variables.

2. Let x = ∑_i (λi/ri) ui. Then (x · Y − ∑_i λi τi)/SE(x · Y) ∼ t_{N−t}.

3. If W ≤ V, PW τ = 0, and dim(W) = d, then SS(W)/σ² ∼ χ²_d.

4. If W1, W2 ≤ V are orthogonal with P_{W1}τ = P_{W2}τ = 0 and dimensions d1, d2, then MS(W1)/MS(W2) ∼ F_{d1,d2}.

Assuming H0 is true, and assuming normality, we can use an F-statistic to evaluate the probability of the null being true when we are in the case MS(treatments) > MS(residuals).


3 Experiments with Blocking

3.1 General Block Design

3.1.1 Purpose and Goals of Blocking

Purpose: the purpose of block design is to increase local control (i.e. grouping alike units together), thus reducing the variance and increasing the precision of our fitted values and effects.

Implementation: ideally, (1) each block should be approximately the same size and (2) contain each treatment at least once.

3.1.2 Types of Blocking and Considerations for Each

Types of Blocks:

1. Natural discrete divisions: these are differences between observational units which occur naturally and must be taken into consideration if we believe they will influence the treatment effect

2. Continuous gradients: these could be naturally occurring differences which we arbitrarily divide into discrete parts for analysis because we believe that the continuous variable will influence the treatment effect

3. Trial control: blocks are formed in consideration of practical aspects of the design, and may be in conflict with continuous gradient blocking

Example 3.1. Natural discrete divisions:

1. If testing animals, one may want to block by gender.

2. In testing an industrial process, one may want to block by the types of chemicals used.

3. In consumer product testing, we can block according to the tester or by the week.

Example 3.2. Continuous gradients:

1. In agricultural tests, regions of the soil may vary in a continuous fashion, but we block together regions and consider the properties of the soil identical within each block.

2. In animal testing, we can make discrete blocks out of the weights or ages of the animals.

Example 3.3. Trial Control:

1. In a field trial, if our equipment can only move in a straight line, we may need to block by strips. If regions of the soil vary along the strip, we may not be able to block by both, resulting in a conflict with continuous gradient blocking.

2. In a clinical trial, we may want to block by the treating medical staff.

3. In a lab experiment, we may block by the technician or equipment being used.

Considerations for implementing each type of block:

1. When we have natural divisions, we should always block by them. However, we may be able to achieve Implementation (1) but not necessarily (2).

2. When we have continuous gradients, we should always block them unless the number of units is too small.

3. Trial management blocking should always be done, and data collection should be done block by block. Goals (1) and (2) are both possible to achieve under this scenario.

3.2 Orthogonal Block Design

3.2.1 Definition and Properties

Let Ω consist of b blocks, each of size k, and let B(ω) indicate the block containing unit ω.

Definition 3.1. The block subspace is VB = {v ∈ V : B(α) = B(β) ⇒ vα = vβ}. And we define the block subspace excluding the null subspace as WB = VB ∩ V0⊥.

Proposition 3.1. Given the block subspace, we have some basic properties:

1. dim(VB) = b

2. A (convenient) basis for VB is v1, . . . , vb where vj,ω = 1[B(ω) = j].

Definition 3.2. A block design is orthogonal if WT ⊥ WB.

Proposition 3.2. Let sij be the number of times treatment i occurs in block j. The block design is orthogonal if and only if sij = ri/b.

Proof. Note that by the definitions of WT and WB, WT ⊥ WB ⇔ WT ⊥ VB. We continue the proof using bases. Recall that VT has basis u1, . . . , ut and VB has basis v1, . . . , vb. Therefore, VB ⊥ WT if and only if every vector in VT whose dot product with u0 is 0 is also perpendicular to every vector in the basis of VB. Thus, this is equivalent to the following: whenever the ai satisfy

1. u0ᵀ ∑_i ai ui = ∑_i ai ri = 0,

they also satisfy

2. vjᵀ ∑_i ai ui = ∑_i ai sij = 0 for j = 1, . . . , b.

These conditions hold if and only if there exists c ∈ ℝ such that sij = c ri for all i and j = 1, . . . , b. Noting that ∑_i sij = k, since each block contains k units by assumption, and ∑_i ri = bk, we have that k = cbk. So c = b⁻¹. Thus, sij = ri/b ⇔ WB ⊥ WT.

Definition 3.3. A complete block design has blocks of size t and every block contains each treatment once.

Proposition 3.3. A complete block design is orthogonal.

Proof. Let b be the number of blocks. Since each treatment occurs once in every block, each treatment has b replicates, so ri = b and sij = 1 = b/b = ri/b. By the previous proposition, the result holds.
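Proposition 3.2 gives a mechanical check for orthogonality. A sketch (the dictionary interface is illustrative): count sij and compare with ri/b using exact fractions.

```python
from fractions import Fraction

def is_orthogonal_block_design(T, B):
    """T maps unit -> treatment, B maps unit -> block.
    Checks the Proposition 3.2 criterion s_ij = r_i / b for all i, j."""
    block_set = set(B.values())
    b = len(block_set)
    r, s = {}, {}
    for unit, i in T.items():
        r[i] = r.get(i, 0) + 1
        s[(i, B[unit])] = s.get((i, B[unit]), 0) + 1
    return all(s.get((i, j), 0) == Fraction(r[i], b)
               for i in r for j in block_set)

# A complete block design (t = 2, b = 2, k = 2) is orthogonal:
complete = is_orthogonal_block_design(
    {1: "A", 2: "B", 3: "A", 4: "B"}, {1: 1, 2: 1, 3: 2, 4: 2})
# An unbalanced allocation of the same units is not:
unbalanced = is_orthogonal_block_design(
    {1: "A", 2: "A", 3: "A", 4: "B"}, {1: 1, 2: 1, 3: 2, 4: 2})
```

The exact-fraction comparison avoids declaring a design non-orthogonal due to floating-point error when ri is not divisible by b.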


3.2.2 Construction and Randomisation

Treat each block as a completely randomised design. For each block, use a different random permutation.
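In code, a minimal sketch for a complete block design (names are illustrative):

```python
import random

def randomise_complete_blocks(treatments, b, seed=None):
    """Returns b blocks, each an independently shuffled copy of the
    treatment list (a fresh random permutation per block)."""
    rng = random.Random(seed)
    layout = []
    for _ in range(b):
        block = list(treatments)
        rng.shuffle(block)
        layout.append(block)
    return layout

layout = randomise_complete_blocks(["A", "B", "C"], b=4, seed=1)
```

Every block contains each treatment exactly once, so the resulting design is orthogonal by Proposition 3.3.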

3.2.3 Fixed Effects Blocking Model Analysis

Fixed effects analysis assumes that the means for observational units are different for units in different blocks, but that they have the same variance. This type of model is best used for natural discrete divisions.

We use the following notation:

1. Let E[Zα] = ζB(α), since we are assuming the mean depends on the block. Thus, let E[Z] = ζ.

2. Equivalently, E[Yα] = τT(α) + ζB(α), and E[Y] = τ + ζ.

3. We can split τ and ζ into components that belong to W0 and to WT and WB respectively.

(a) Let τ0 = τ̄ u0 ∈ V0 and ζ0 = ζ̄ u0 ∈ V0.

(b) Let τT = τ − τ0 and ζB = ζ − ζ0 be the components in WT and WB respectively.

4. Let WE = (VT + VB)⊥. Then EMS(WE) = σ².

Note 3.1. In estimation, τ0 + ζ0 ∈ V0, so we cannot distinguish them. However, since we have that WB ⊥ WT, we can estimate τT and ζB.

We summarise some of the properties of the subspaces here:

Fixed Effects Blocking Table

Space   Dimension           Projection/Fitted Values         Expected Mean Square
V       N = bk              Y
W0      1                   P_{V0}Y = Ȳ u0                   σ² + ‖τ0 + ζ0‖²
WT      t − 1               P_{WT}Y = P_{VT}Y − P_{V0}Y      σ² + ‖τT‖²/(t − 1)
WB      b − 1               P_{WB}Y = P_{VB}Y − P_{V0}Y      σ² + ‖ζB‖²/(b − 1)
WE      b(k−1) − (t−1)      (subtraction)                    σ²

Based on the above table, we can construct an ANOVA table (check marks indicate that these values should be computed):

Fixed Effects Blocking ANOVA Table

Source             DF               SS   MS   VR
Mean (V0)          1                ✓    ✓
Blocks (WB)        b − 1            ✓    ✓    ✓
Treatments (WT)    t − 1            ✓    ✓    ✓
Residuals (WE)     (subtraction)    ✓    ✓
Total              N                ✓

We can look at the variance ratios for WB and WT to determine if blocking was necessary and if we can reject the null hypothesis. However, once we block, we should not re-analyse the data without blocking.

3.2.4 Random Effects Blocking Model Analysis

Random effects blocking models should be used for all other types of blocking besides natural discrete divisions.

Theorem 3.1. Suppose we have a random effects blocking model and Cov[Zα, Zβ] = σ²·1[α = β] + σ²ρ1·1[α ≠ β, B(α) = B(β)] + σ²ρ2·1[B(α) ≠ B(β)]. Then:

1. The eigenspaces and corresponding projections of Cov[Y] are:

(a) V0 with corresponding orthogonal projection matrix N⁻¹J, where J is the matrix of all 1's.

(b) WB with corresponding orthogonal projection matrix k⁻¹JB − N⁻¹J, where (JB)α,β = 1[B(α) = B(β)]

(c) VB⊥ with corresponding orthogonal projection matrix I − k⁻¹JB.

2. The eigenvalues for each space are:

(a) ξ0 = σ²[(1 − ρ1) + k(ρ1 − ρ2) + Nρ2]

(b) ξ1 = σ²[(1 − ρ1) + k(ρ1 − ρ2)]

(c) ξ2 = σ²(1 − ρ1)

3. Thus, we can write Cov[Y] = ξ0 N⁻¹J + ξ1(k⁻¹JB − N⁻¹J) + ξ2(I − k⁻¹JB)

Note 3.2. We expect that units within the same block have a higher correlation, so we expect ρ1 > ρ2.

Proof. Note that a lot of this is guessing. By assuming a random effects model, we have that

Cov[Yα, Yβ] = Cov[Zα, Zβ] = σ²·1[α = β] + σ²ρ1·1[α ≠ β, B(α) = B(β)] + σ²ρ2·1[B(α) ≠ B(β)]

Thus,

Cov[Y] = σ²I + σ²ρ1(JB − I) + σ²ρ2(J − JB) = σ²(1 − ρ1)I + σ²(ρ1 − ρ2)JB + σ²ρ2J

We now want to “guess” the eigenspaces and projection matrices of Cov[Y]. We know that we can decompose V = V0 ⊕ WB ⊕ VB⊥, orthogonally, so we start here.

1. We need a matrix such that for u0 ∈ V0, Au0 = λu0. Since u0 is a vector of 1's we try Ju0 = Nu0. So the projection is N⁻¹J.

2. We need to do the same for VB, which gives back a basis vector vj. We try JB vj = k vj. So to get the orthogonal projection matrix onto WB, we take k⁻¹JB − N⁻¹J.

3. Finally, for V the corresponding projection matrix is I. So for the remaining subspace, the projection matrix is I − k⁻¹JB.

To complete the proof, we simply add and subtract to get our projection matrices into the equation for Cov[Y]:

Cov[Y] = σ²(1 − ρ1)I + σ²(ρ1 − ρ2)k(k⁻¹JB) + σ²ρ2N(N⁻¹J)
       = ξ2(I − k⁻¹JB) + σ²[(1 − ρ1) + (ρ1 − ρ2)k](k⁻¹JB) + σ²ρ2N(N⁻¹J)
       = ξ2(I − k⁻¹JB) + ξ1(k⁻¹JB − N⁻¹J) + ξ0(N⁻¹J)
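The claimed eigenstructure can be checked exactly in a small case. A sketch with invented values σ² = 1, ρ1 = 1/2, ρ2 = 1/4, and b = k = 2 (so N = 4, with units 0, 1 in block one and 2, 3 in block two):

```python
from fractions import Fraction as F

b, k = 2, 2
N = b * k                          # the block of unit a is a // k
rho1, rho2 = F(1, 2), F(1, 4)      # within- and between-block correlations

def entry(a, c):
    """One entry of Cov[Y] under the random effects blocking model."""
    if a == c:
        return F(1)
    return rho1 if a // k == c // k else rho2

cov = [[entry(a, c) for c in range(N)] for a in range(N)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(N)) for i in range(N)]

xi0 = (1 - rho1) + k * (rho1 - rho2) + N * rho2
xi1 = (1 - rho1) + k * (rho1 - rho2)
xi2 = 1 - rho1

u0 = [F(1)] * N                    # spans V0
wB = [F(1), F(1), F(-1), F(-1)]    # lies in WB
wP = [F(1), F(-1), F(0), F(0)]     # lies in VB⊥
```

Applying Cov[Y] to each vector should scale it by the corresponding eigenvalue ξ0, ξ1, or ξ2, confirming the three eigenspaces of Theorem 3.1.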

To begin our analysis, we first compute the expected sums of squares for V0, WB, WT ≤ VB⊥, and WT⊥ ∩ VB⊥ ≤ VB⊥, which we do using the general theorem on covariance matrices (Theorem 1.1):

1. E[‖P_{V0}Y‖²] = ξ0 dim(V0) + ‖P_{V0}τ‖² = ξ0 + ‖τ0‖²

2. E[‖P_{WB}Y‖²] = ξ1 dim(WB) + ‖P_{WB}τ‖² = ξ1(b − 1)

3. E[‖P_{WT}Y‖²] = ξ2 dim(WT) + ‖P_{WT}τ‖² = ξ2(t − 1) + ‖τT‖²

4. E[‖P_{WT⊥ ∩ VB⊥}Y‖²] = ξ2 dim(WT⊥ ∩ VB⊥) + ‖P_{WT⊥ ∩ VB⊥}τ‖² = ξ2(b(k − 1) − (t − 1))

From this, we have the following ANOVA table:

Random Effects Blocking ANOVA Table

Stratum   Source       DF               SS   MS   VR
V0        Mean         1                ✓    ✓
WB        Blocks       b − 1            ✓    ✓
VB⊥       Treatments   t − 1            ✓    ✓    ✓
          Residuals    (subtraction)    ✓    ✓
          Total        b(k − 1)         ✓
Total                  N = bk           ✓

Remark 3.1. Just as before, we cannot individually estimate ‖τ0‖² and ξ0, since they are both in the same subspace. Additionally, since we expect ρ1 > ρ2 if blocking is appropriate, we expect ξ1 > ξ2. If we see that this is not the case, we may not consider blocking in any subsequent experiments.


4 Treatment Structures

4.1 Treatment Factors and their Subspaces

Example 4.1. Recall from the first example, there were two treatment factors: cultivars with 3 varieties and fertilisers with 4 levels. This resulted in 12 treatments in total.

Definition 4.1. We consider two factors denoted C and F:

1. We write T = C ∧ F for all treatments that are a combination of the levels of F and the levels of C.

2. Let the factor subspace for C be VC = {v ∈ V : C(α) = C(β) ⇒ vα = vβ}. Let WC = VC ∩ V0⊥. Define VF and WF similarly.

3. Let WC∧F = VT ∩ (VC + VF)⊥.

Lemma 4.1. Let WC, WF, and WC∧F be as above. Let nC be the number of levels of C, and nF be the number of levels of F. Then:

1. dim(WC) = nC − 1 and dim(WF) = nF − 1

2. If WC ⊥ WF then dim(WC∧F) = nC nF − (nC − 1) − (nF − 1) − 1 = (nC − 1)(nF − 1)

Proposition 4.1. If every level of C ∧ F occurs with the same number of observations, then WC ⊥ WF.

Proof. We proceed in a fashion similar to the previous proof. By definition of WC and WF, WC ⊥ WF ⇔ WC ⊥ VF. Let v1, . . . , v_{nC} be the usual basis for VC, and w1, . . . , w_{nF} be the usual basis for VF.

Denote by sij the number of units with level i of C and level j of F, by qi the number of units with level i of C, and by pj the number of units with level j of F. Note the following:

1. ∑_j sij = qi

2. ∑_i sij = pj

Just as before, if for any ai satisfying u0ᵀ ∑_i ai vi = ∑_i ai qi = 0 we have wjᵀ ∑_i ai vi = ∑_i ai sij = 0 for j = 1, . . . , nF, then WC ⊥ VF. If all sij = s, then qi = s nF = q for all i and pj = s nC = p for all j. Then for any ai satisfying q ∑_i ai = 0 we have ∑_i ai sij = s ∑_i ai = 0. So the subspaces are orthogonal.
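Proposition 4.1 can also be checked directly: with equal replication, every centred C-indicator is orthogonal to every centred F-indicator. A small sketch (the 2 × 3 factorial with r = 2 replicates is invented for illustration):

```python
# Check that centred factor indicators are pairwise orthogonal when
# every C∧F combination has the same replication r (invented example).
nC, nF, r = 2, 3, 2
units = [(i, j, rep) for i in range(nC) for j in range(nF) for rep in range(r)]
N = len(units)

def centred_indicator(level, axis):
    """Indicator of a factor level, projected onto V0^⊥ (mean removed)."""
    raw = [1.0 if u[axis] == level else 0.0 for u in units]
    mean = sum(raw) / N
    return [x - mean for x in raw]

dots = [
    sum(c * f for c, f in zip(centred_indicator(i, 0), centred_indicator(j, 1)))
    for i in range(nC)
    for j in range(nF)
]
```

All nC · nF dot products vanish, which is exactly the statement WC ⊥ WF; with unequal replication some of these products would be nonzero.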

4.2 Hasse Diagrams

Definition 4.2. A Hasse diagram is a graphical representation of vector spaces and their subspaces. The higher up on a Hasse diagram a vector space, represented by a •, appears, the larger it is. Subspaces are shown below a space and are connected by a line.

Example 4.2. The following is an example of a two-factor Hasse Diagram.


Two Factor Hasse Diagram

1. VT. Full treatment model. Under this model, E[Yω] = τT(ω). Requires t = nC nF parameters.

2. This line indicates the “interaction” between VF and VC.

3. VC + VF. Additive model. Under this model, E[Yω] = λC(ω) + µF(ω). Requires nC + nF − 1 parameters.

4. VC or VF. In this model, E[Yω] = λC(ω) or µF(ω). Requires nC or nF parameters.

5. V0. Null model. Under this model, E[Yω] = constant. Requires 1 parameter.

4.3 Main Effect, Fitted Values, Interactions

Definition 4.3. Suppose we are in the two-factor experiment with factors C and F:

1. The fits for C and F are P_{VC}y and P_{VF}y, which are unbiased estimates of the treatment constants of C and F respectively (?).

2. The main effects of factors C and F are P_{WC}τ and P_{WF}τ; they describe the effect of the factor beyond the null model.

3. The interaction P_{WC∧F}τ describes the degree to which the treatment constant of one factor is influenced by another. Interaction occurs when the presence of one factor alters the effect of another.

Remark 4.1. See Appendix A for an example of interaction and computed fits and effects for a two factor experiment.

Computing the overall fit, where BFj and BCi are the usual bases for VF and VC, vi,j is the basis for the treatment subspace, and SUM_{·} denotes the total of Y over the indicated units:

Sym   0 or U          C                                 F                                 CF or C∧F
Sub   V0              VC                                VF                                VT = VC∧F
Dim   1               nC                                nF                                t = nCnF
Fit   PV0 Y = Ȳ u0    PVC Y = ∑i (SUM_{C=i}/rnF) BCi    PVF Y = ∑j (SUM_{F=j}/rnC) BFj    PVT Y = ∑i,j (SUM_{C=i,F=j}/r) vi,j
CSS   SUM²/N          (rnF)−1 ∑i SUM²_{C=i}             (rnC)−1 ∑j SUM²_{F=j}             r−1 ∑i,j SUM²_{C=i,F=j}

Computing the main effects:

Sub      V0 = W0    WC                    WF                    WC∧F
Dim      1          nC − 1                nF − 1                (nC − 1)(nF − 1)
Effect   PV0 Y      PVC Y − PV0 Y         PVF Y − PV0 Y         (PVT − PWC − PWF − PW0)Y
SS       CSS(V0)    CSS(VC) − CSS(V0)     CSS(VF) − CSS(V0)     CSS(VT) − SS(WC) − SS(WF) − SS(W0)

4.4 Data Analysis

In general, we start with the whole model and test the effects of the largest (by dimension) orthogonal subspaces using the variance ratio, until we can conclude that a subspace does indeed have an effect.

Example 4.3. Suppose we have two factors F and G, and T = F ∧G.

1. First we create our ANOVA table using the subspaces WF, WG, WF∧G.

2. We use the variance ratio to determine whether the interaction F ∧ G is important.

(a) If it is, then we simply report that the interaction occurs and produce a table of all the treatment means and standard errors.

(b) If it is not important, then we can move on to test the additive model in the space VF + VG.

3. In the space VF + VG we can compare the variance ratios for WF and WG against the residual to determine whether we can simplify the model further to VF, VG, or V0.
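The sums of squares driving these variance ratios can be sanity-checked on simulated data: with equal replication, CSS(V0), the two main-effect sums of squares, the interaction sum of squares, and the residual must add back up to ∑ Yω². A minimal sketch (the factor sizes, seed, and variable names are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
nF, nG, r = 2, 3, 4                      # levels of F, levels of G, replicates
F = np.repeat(np.arange(nF), nG * r)
G = np.tile(np.repeat(np.arange(nG), r), nF)
y = rng.normal(size=nF * nG * r)
N = y.size

SS_total = (y ** 2).sum()
SS_0 = y.sum() ** 2 / N                                   # CSS(V0)
CSS_F = sum(y[F == i].sum() ** 2 for i in range(nF)) / (r * nG)
CSS_G = sum(y[G == j].sum() ** 2 for j in range(nG)) / (r * nF)
CSS_T = sum(y[(F == i) & (G == j)].sum() ** 2
            for i in range(nF) for j in range(nG)) / r    # CSS(V_T)

SS_WF = CSS_F - SS_0                      # main effect of F
SS_WG = CSS_G - SS_0                      # main effect of G
SS_WFG = CSS_T - SS_WF - SS_WG - SS_0     # interaction
SS_res = SS_total - CSS_T                 # residual

# The orthogonal decomposition recovers the total sum of squares.
parts = SS_0 + SS_WF + SS_WG + SS_WFG + SS_res
print(abs(parts - SS_total))   # ≈ 0
```

Mean squares and variance ratios then follow by dividing each SS by the dimension of its subspace.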

Example 4.4. Suppose we have three factors F, G, H that we use to create the treatment subspace. See Appendix B for the Hasse diagram for this space. Looking at the orthogonal “effect” subspaces significantly simplifies the analysis, since the diagram can be constructed using:

1. WF ,WH , & WG

2. WF∧H ,WF∧G, & WH∧G

3. WF∧H∧G

Starting from the full model, we can use the variance ratios to determine which subspaces have a significant effect and should be included in the model.

Note 4.1. Suppose that we conclude VF∧G + VF∧H + VH∧G is the appropriate subspace. This means that the effect of F depends on G and the effect of H depends on G, but the interaction of F and H does not depend on G.

4.5 Factorial Experiments

Definition 4.4. A factorial experiment occurs when the treatments are all combinations of the levels of two or more factors.

Factorial experiments, in comparison to “change-one-variable-at-a-time” experiments, have some benefits:


1. They allow us to test for interactions between factors.

2. If there is interaction, we can find the “optimal” treatment combination.

3. We improve replication by doing a factorial experiment, hence saving money.

Example 4.5. We compare a factorial experiment against two “change-one-variable-at-a-time” experiments. Suppose we have two factors: P with levels g, s and M with levels +, −.

       Factorial Design     Other Exp. 1    Other Exp. 2
P      g g g g s s s s      g g g g         g g s s
M      + + − − + + − −      + + − −         + + + +

The two one-at-a-time experiments on the right have lower replication, and do not test the possible combination of s and −.

Construction and Randomisation: In simple cases (no blocking, or an orthogonal block or row-column design), we simply ignore the factorial structure and proceed normally.

Remark 4.2. If we have a factorial experiment with a control, we treat this as if we had a “control factor” with levels control and everything else. We first analyse this factor to see whether the two levels differ, and proceed with the analysis normally if they do.


5 Row-Column Designs

5.1 Double Blocking

1. Motivation: it is sometimes the case that we need more than one system of blocking to control external variables.

Example 5.1. Consider 8 judges sampling 4 wines. Each judge forms a block. If we randomise the order of the wines within each block, it is possible for one wine to be tasted by all the judges at the end, which may skew the results if the judges are inebriated by the fourth glass. Therefore, we may treat the position in the tasting order as another system of blocks.

                     Judges
Drinking Order   A B C D A B C D
                 B C D A B C D A
                 C D A B C D A B
                 D A B C D A B C

2. Framework and Notation

(a) Assume for simplicity:

i. The intersection of two blocks from different systems contains exactly one unit (i.e. each row-column intersection contains exactly 1 unit).

ii. Each treatment occurs the same number of times in each of the blocking systems (i.e. every row and every column contains every treatment in equal replication).

(b) Notation

i. Let t be the number of treatments, m the number of rows (blocks in one blocking system), and n the number of columns (blocks in the other blocking system).

ii. Assumption (A1): N = mn = |Ω|.

iii. Assumption (A2): t divides both m and n, and the number of replicates of each treatment is r = mn/t.

5.2 Latin Squares

Definition 5.1. If t = m = n and the row-column design satisfies assumptions (A1) and (A2), it is called a Latin square.

Latin squares are useful in constructing and randomising row-column designs. There are three popular ways of constructing Latin squares.

1. Cyclic Method for Treatments: cycle the treatments by one place from each row to the next.

Example 5.2. Suppose we have treatments A, B, C and we want to block by two systems with 3 units in each block. Cycling the treatments gives:

    1 2 3
1   A B C
2   C A B
3   B C A

2. Group Method: given the elements g1, . . . , gt of a group G of order t, we simply write out its Cayley table under the group product.

Example 5.3. Suppose we again have three treatments to assign, so t = 3.

⊗    g1 g2 g3
g1   g1 g2 g3
g2   g2 g3 g1
g3   g3 g1 g2

3. Product Method: suppose t = uv with u ≠ 1, v ≠ 1. Start with a u × u Latin square with treatments A1, . . . , Au, and replace each occurrence of Ai with a v × v Latin square on the elements L1, . . . , Lv.

Example 5.4. Suppose t = 4, which we can build up from two 2 × 2 Latin squares:

A1 A2        L1 L2 L1 L2
A2 A1   ⇒    L2 L1 L2 L1
             L1 L2 L1 L2
             L2 L1 L2 L1
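The cyclic method is simple enough to automate: taking entry (i, j) to be treatment (j − i) mod t reproduces the square of Example 5.2, and the Latin property (every row and column a permutation) is a one-line check. A minimal sketch (function names are ours):

```python
def cyclic_latin_square(t):
    """Cyclic Latin square of order t: entry (i, j) is treatment (j - i) mod t,
    so each row is the previous row cycled one place to the right."""
    return [[(j - i) % t for j in range(t)] for i in range(t)]

def is_latin(square):
    """Check that every row and every column contains each symbol exactly once."""
    t = len(square)
    symbols = set(range(t))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(t)} == symbols for j in range(t))
    return rows_ok and cols_ok

L = cyclic_latin_square(3)
print(L)             # [[0, 1, 2], [2, 0, 1], [1, 2, 0]]  i.e. ABC / CAB / BCA
print(is_latin(L))   # True
```

With symbols 0, 1, 2 read as A, B, C this is exactly the table in Example 5.2; shifting by (i + j) mod t instead gives the other cyclic convention.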

5.3 General Construction and Randomisation

Divide the m × n rectangle completely into t × t squares, and make a Latin square out of each t × t square using any method. Randomise the rows, then randomise the columns, and hide the order.

Example 5.5. Suppose we have 8 judges and 4 wines. First we split the table into two 4 × 4 Latin squares. Then we pick a random permutation of the drinking order: 3 1 4 2. Then we pick a random permutation of the judges: 6 8 3 1 2 5 4 7. (The resulting tables are shown below.)

5.4 Orthogonal Subspaces

1. Consider the table of subspaces below.

2. Properties

(a) By (A1) and (A2), W0, WT, WR, WC, and WE = (VT + VR + VC)⊥ are orthogonal.

(b) dim(WE) = (m − 1)(n − 1) − (t − 1)

(c) We can compute the CSS and SS just as before.


Before randomisation:

             Judges
Order   1 2 3 4 5 6 7 8
1       A B C D A B C D
2       B C D A B C D A
3       C D A B C D A B
4       D A B C D A B C

After randomisation:

             Judges
Order   1 2 3 4 5 6 7 8
1       D B A C D C B A
2       B D C A B A D C
3       A C B D A D C B
4       C A D B C B A D

5.5 Fixed Effect: Model and Analysis

1. Let R(ω) and C(ω) be the row and column of ω.

2. Model

(a) Because we are assuming a fixed effects model: Cov[Y] = σ²I and E[Y] = τ + η + ζ, where η depends on the column and ζ depends on the row.

(b) As before, we split each of τ, η, ζ into its projection onto V0 and the respective orthogonal subspaces: E[Y] = (τ0 + η0 + ζ0) + τT + ηC + ζR

Note 5.1. Just as before, we are unable to estimate τ0, η0, ζ0 individually, since they are all projections onto the same subspace.

3. Analysis

(a) By construction, we assumed row and column effects exist, so we do not need to test them in our analysis.

(b) In the following ANOVA table, we omit the SS and MS columns, which we should nevertheless compute:

5.6 Random Effect: Model and Analysis

Under random effects, we have the following model:

1. E[Z] = 0, therefore E[Y] = τ

2. Cov[Zα, Zβ] =
   σ²     if α = β
   σ²ρ1   if R(α) = R(β), α ≠ β
   σ²ρ2   if C(α) = C(β), α ≠ β
   σ²ρ3   otherwise

Note 5.2. Because we are blocking by rows and columns, we expect correlations to exist within rows and columns. Therefore, we expect ρ1 > ρ3 and ρ2 > ρ3.

Theorem 5.1. Given the row-column random effects model, then:


Subspace   V0 = W0   VT          WT      VR    WR      VC       WC
Name       Mean      Treatment           Row           Column
Dim        1         t           t − 1   m     m − 1   n        n − 1

Row-Column Fixed Effects ANOVA Table

Source       DF                         EMS                       VR
Mean         1                          ‖τ0 + η0 + ζ0‖² + σ²
Rows         m − 1                      ‖ζR‖²/(m − 1) + σ²
Columns      n − 1                      ‖ηC‖²/(n − 1) + σ²
Treatments   t − 1                      ‖τT‖²/(t − 1) + σ²        X
Residuals    (m − 1)(n − 1) − (t − 1)   σ²
Total        mn

1. The eigenspaces and corresponding orthogonal projection matrices of Cov[Y] are:

(a) V0, with orthogonal projection matrix (mn)−1J

(b) WR, with orthogonal projection matrix n−1JR − (mn)−1J

(c) WC, with orthogonal projection matrix m−1JC − (mn)−1J

(d) (VR + VC)⊥, with orthogonal projection matrix I − m−1JC − n−1JR + (mn)−1J

2. The eigenvalues of each space are:

(a) ξ0 = σ²[1 + ρ1(n − 1) + ρ2(m − 1) + ρ3(m − 1)(n − 1)]

(b) ξR = σ²[1 + ρ1(n − 1) − ρ2 − ρ3(n − 1)]

(c) ξC = σ²[1 − ρ1 + ρ2(m − 1) − ρ3(m − 1)]

(d) ξ = σ²[1 − ρ1 − ρ2 + ρ3]

3. So we can write Cov[Y] = ξ0(mn)−1J + ξR(n−1JR − (mn)−1J) + ξC(m−1JC − (mn)−1J) + ξ(I − m−1JC − n−1JR + (mn)−1J)

Using the previous theorem, we can construct an ANOVA table:

Row-Column Random Effects ANOVA Table

Stratum       Source       DF                         EMS                 VR
V0            Mean         1                          ξ0 + ‖τ0‖²
WR            Rows         m − 1                      ξR
WC            Columns      n − 1                      ξC
(VR + VC)⊥    Treatments   t − 1                      ξ + ‖τT‖²/(t − 1)   X
              Residuals    (m − 1)(n − 1) − (t − 1)   ξ
              Total        (m − 1)(n − 1)
Total                      mn


6 Small Units Inside Larger Units

6.1 Treatment on E-Units Containing O-Units

6.1.1 Overview, Construction and Modelling

Example 6.1. Suppose we have 8 pens of 10 calves each, and 4 different feeds, with one feed given to each pen. Differences between feeds should be assessed against the pen-to-pen variability.

1. General Set-up: Suppose we have m experimental units, each containing k observational units. Suppose we have t treatments and t divides m.

2. Construction and Randomisation: Simply construct and randomise at the experimental-unit level.

3. Model

(a) Fixed v. Random Effects: If we suppose that each pen has a fixed effect, we cannot conclude anything about the treatments, since only one treatment is applied to each entire experimental unit, which has its own “fixed effect”.

(b) Random Effects Model:

i. Let P(ω) indicate the pen to which ω belongs, and define VP and WP accordingly. Note that dim(VP) = m.

ii. Under this model, E[Y] = τ

iii. Also, Cov[Yα, Yβ] =
     σ²     if α = β
     σ²ρ1   if P(α) = P(β), α ≠ β
     σ²ρ2   otherwise

6.1.2 Analysis

Theorem 6.1. Given the pen-calves design with random effects model, then:

1. The eigenspaces and corresponding orthogonal projection matrices of Cov[Y] are:

(a) W0, with orthogonal projection matrix (mk)−1J

(b) WP, with orthogonal projection matrix k−1JP − (mk)−1J

(c) VP⊥, with orthogonal projection matrix I − k−1JP

2. The eigenvalues of each space are:

(a) ξ0 = σ²[(1 − ρ1) + k(ρ1 − ρ2) + mkρ2]

(b) ξP = σ²[(1 − ρ1) + k(ρ1 − ρ2)]

(c) ξ = σ²[1 − ρ1]

3. So we can write Cov[Y] = ξ0(mk)−1J + ξP(k−1JP − (mk)−1J) + ξ(I − k−1JP)
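Theorem 6.1 can be verified numerically for small m and k: assemble Cov[Y] from the all-ones matrix J, the same-pen indicator matrix JP, and the identity, then check that each stratum projector is mapped to itself scaled by the stated eigenvalue. A sketch (the parameter values are arbitrary):

```python
import numpy as np

m, k = 4, 5                      # pens, calves per pen
sigma2, rho1, rho2 = 2.0, 0.3, 0.1
N = m * k

J = np.ones((N, N))
JP = np.kron(np.eye(m), np.ones((k, k)))   # (a, b) entry is 1 iff same pen
I = np.eye(N)

# Cov entries: sigma2 on the diagonal, sigma2*rho1 within a pen, sigma2*rho2 otherwise
Cov = sigma2 * ((1 - rho1) * I + (rho1 - rho2) * JP + rho2 * J)

P0 = J / N                       # projection onto W0
PP = JP / k - J / N              # projection onto WP
Pperp = I - JP / k               # projection onto VP-perp

xi0 = sigma2 * ((1 - rho1) + k * (rho1 - rho2) + N * rho2)
xiP = sigma2 * ((1 - rho1) + k * (rho1 - rho2))
xi = sigma2 * (1 - rho1)

# Each stratum projector is an eigenspace projector of Cov with its eigenvalue.
for P, lam in [(P0, xi0), (PP, xiP), (Pperp, xi)]:
    print(np.abs(Cov @ P - lam * P).max())   # ≈ 0 for each stratum
```

The same three lines with the split-plot ingredients JB, JP verify Theorem 6.3 as well.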

Lemma 6.1. If treatments are applied to whole pens, then WT ≤ WP.

Proof. Since P(α) = P(β) =⇒ T(α) = T(β), we have VT ≤ VP. Hence WT = VT ∩ V0⊥ ≤ VP ∩ V0⊥ = WP.

To create an ANOVA Table:

1. Create a Null ANOVA table, which contains the strata and degrees of freedom.

2. Expand the Null ANOVA table by computing the SS and EMS for each stratum (if necessary).

Treatments on E, Null ANOVA Table

Strata   DF         SS                              EMS
W0       1          SUM²/N                          ξ0 + ‖τ0‖²
WP       m − 1      ∑i SUM²_{P=i}/k − SUM²/N        ξP
VP⊥      m(k − 1)   ∑ω Y²ω − ∑i SUM²_{P=i}/k        ξ

3. Create the Skeleton ANOVA table, which contains the strata, subspaces, and degrees of freedom.

4. To get the Full ANOVA table, use the Null ANOVA table computations to complete the skeleton ANOVA.

Treatments on E, Full ANOVA Table

Strata   Source       DF         SS                EMS                  VR
W0       Mean         1          SS(W0)            ξ0 + ‖τ0‖²
WP       Treatments   t − 1      SS(WT)            ξP + ‖τT‖²/(t − 1)   X
         Residuals    m − t      SS(WP) − SS(WT)   ξP
         Total        m − 1      SS(WP)
VP⊥      O-units      m(k − 1)   SS(VP⊥)           ξ
Total                 mk

Definition 6.1. False replication occurs if we take MS(treatments)/MS(O-units), which inflates the degrees of freedom considerably. This is an inappropriate comparison in light of the EMS for treatments.

Remark 6.1. Let c be the number of pens per treatment, and let x = (1/ck)vT=i − (1/ck)vT=j ∈ WT for j ≠ i, where vT=i is the indicator vector of treatment i. Then x · Y is an estimator of τi − τj. Since x ∈ WT ≤ WP, Var[x · Y] = 2ξP/ck = (2σ²/c)((1 − ρ1)/k + (ρ1 − ρ2)). Increasing c (i.e. the number of pens) reduces the variance more effectively than increasing the number of observational units k in each pen.
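The closing claim of the remark is easy to illustrate: with ρ1 > ρ2, doubling c halves the whole variance, while doubling k only shrinks the (1 − ρ1)/k term. A sketch with arbitrary parameter values (the helper name var_diff is ours):

```python
def var_diff(c, k, sigma2=1.0, rho1=0.3, rho2=0.1):
    """Variance of the estimator of tau_i - tau_j: 2*xi_P/(c*k)."""
    return (2 * sigma2 / c) * ((1 - rho1) / k + (rho1 - rho2))

base = var_diff(c=4, k=10)
more_pens = var_diff(c=8, k=10)     # double the number of pens
more_calves = var_diff(c=4, k=20)   # double the calves per pen

print(more_pens < more_calves < base)   # True: extra pens help more
```

With these values the variance drops from 0.135 to 0.0675 by adding pens, but only to 0.1175 by adding calves.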

6.2 Treatment Effects in Different Strata

6.2.1 General Description and Construction

Example 6.2. Suppose we are given 4 feeds arising from 2 factors (hay and medicine) with two levels each. Suppose hay must be given to whole pens while medicine can be given to individual calves. It is better to randomise hay over pens and medicine over calves within each pen than to randomise both hay and medicine over pens.

1. Set-up: Suppose H is a factor with nH levels, each given to rH experimental units. Suppose M is a factor with nM levels, each given to rM observational units in each experimental unit.

Note 6.1. So there are rHnH experimental units, each containing rMnM observational units.

2. Construction and Randomisation

(a) Assign the levels of H to the experimental units as in a typical randomised design.

(b) Assign the levels of M to the observational units within each pen as in a block design.

6.2.2 Analysis

Theorem 6.2. In such a design, we have

1. WH ≤ WP

2. WM, WM∧H ≤ VP⊥

Proof. Note that VT = VH ⊕ WM ⊕ WM∧H =⇒ VT ∩ VH⊥ = WM ⊕ WM∧H. Let v ∈ VT ∩ VH⊥. Then v · v′ = 0 for each indicator vector v′ of {ω ∈ Ω | H(ω) = i}, and each such v′ lies in VH. So the entries of v sum to 0 over {ω ∈ Ω | H(ω) = i}.

All pens with level i of H contain the same treatments with the same frequencies, so the entries of v must sum to 0 on each pen. Therefore v ∈ VP⊥.

The Full ANOVA Table:

Treatments in Different Strata, Full ANOVA Table

Strata   Source   DF                        SS   EMS                              VR
V0       Mean     1                         X    ξ0 + ‖τ0‖²
WP       H        nH − 1                    X    ξP + ‖τH‖²/(nH − 1)              X
         Resid.   m − nH                    X    ξP
         Total    m − 1                     X
VP⊥      M        nM − 1                    X    ξ + ‖τM‖²/(nM − 1)               X
         H ∧ M    (nM − 1)(nH − 1)          X    ξ + ‖τM∧H‖²/((nM − 1)(nH − 1))   X
         Resid.   m(k − 1) − (nM − 1)nH     X    ξ
         Total    m(k − 1)                  X
Total             mk                        X

We can interpret this as follows:

1. To test the effects of the levels of H, we use the variance ratio in the pens stratum WP, and report the standard errors of differences using the mean square of the residuals for pens.

2. To test the effects of M and M ∧ H, we use the variance ratios in the calves stratum VP⊥, and report the standard errors of differences using the mean square of the residuals for calves.


6.2.3 Design Advantages

The advantages of splitting treatments over strata, compared to simply assigning both M and H to pens, are:

1. The variances for M and H ∧ M will be smaller, since we expect ξ < ξP.

2. The power for testing M and H ∧ M will increase because:

(a) ξ is expected to be smaller than ξP

(b) The degrees of freedom for the calves residuals are much higher than the degrees of freedom for the pens residuals (see example)

3. The power for testing H increases, since the degrees of freedom for the pens residuals increase slightly as well (see example)

Example 6.3. Suppose we have two factors: H assigned to pens and M assigned to calves, with 10 calves in each of the 8 pens.

Assigning All Treatments to Pens

Stratum   Mean   Pens                       Calves   Total
Source    Mean   H   M   H∧M   Res.  Tot.
DF        1      1   1   1     4     7      72       80

Splitting Treatments over Strata

Stratum   Mean   Pens              Calves                   Total
Source    Mean   H   Res.  Tot.    M   H∧M   Res.  Tot.
DF        1      1   6     7       1   1     70    72       80

The residual degrees of freedom for pens have increased from 4 to 6, and the 70 residual degrees of freedom for calves far exceed the 4 residual degrees of freedom for pens available in the first design.
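The degree-of-freedom bookkeeping in this example can be checked by simple sums: the sources must add to their stratum totals, and the strata to N = 80. A sketch:

```python
# Degrees of freedom: 8 pens x 10 calves, two 2-level factors H and M.
N = 8 * 10

# Design 1: all four treatment combinations assigned to whole pens.
d1_pens = {"H": 1, "M": 1, "HxM": 1, "Residual": 4}
assert sum(d1_pens.values()) == 8 - 1             # pens stratum: m - 1 = 7
d1_calves = 8 * (10 - 1)                          # calves stratum: m(k - 1) = 72
assert 1 + sum(d1_pens.values()) + d1_calves == N

# Design 2: H randomised over pens, M over calves within pens.
d2_pens = {"H": 1, "Residual": 6}
d2_calves = {"M": 1, "HxM": 1, "Residual": 70}
assert sum(d2_pens.values()) == 7
assert sum(d2_calves.values()) == 72
assert 1 + sum(d2_pens.values()) + sum(d2_calves.values()) == N

print("df bookkeeping consistent for both designs")
```

The residual entries (4 vs 6 for pens, 70 for calves) are exactly the numbers quoted above.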

6.3 Split Plot Designs

6.3.1 Overview

1. Motivation: we are sometimes interested in grouping the experimental units into blocks.

2. Set-up

(a) Plot Structure: We assume we have b blocks, each containing s plots (experimental units), each containing k subplots (observational units).

(b) Treatment Structure: We have a factor H with s levels applied to the s plots in each block, and a factor M with k levels applied to the k subplots in each plot.

3. Construction and Randomisation: Apply the levels of H to the plots as in a randomised block design. Then apply the levels of M to the subplots within each plot as in a completely randomised design.


6.3.2 Model and Analysis

Let B(ω) indicate the block to which ω belongs and let P(ω) indicate the plot to which ω belongs. Define VB and VP accordingly.

Lemma 6.2. Under the split plot design, VB ≤ VP.

Proof. P(α) = P(β) =⇒ B(α) = B(β). Therefore VB ≤ VP.

Just as before, we must use a random effects model:

1. E[Y ] = τ

2. Cov[Yα, Yβ] =
   σ²     if α = β
   σ²ρ1   if α ≠ β, P(α) = P(β)
   σ²ρ2   if P(α) ≠ P(β), B(α) = B(β)
   σ²ρ3   if B(α) ≠ B(β)

Theorem 6.3. Given the split plots design with random effects model, then:

1. The eigenspaces and corresponding orthogonal projection matrices of Cov[Y] are:

(a) W0, with orthogonal projection matrix (bsk)−1J

(b) WB, with orthogonal projection matrix (sk)−1JB − (bsk)−1J

(c) WP, with orthogonal projection matrix k−1JP − (sk)−1JB

(d) VP⊥, with orthogonal projection matrix I − k−1JP

2. The eigenvalues of each space are:

(a) ξ0 = σ²[(1 − ρ1) + k(ρ1 − ρ2) + sk(ρ2 − ρ3) + bskρ3]

(b) ξB = σ²[(1 − ρ1) + k(ρ1 − ρ2) + sk(ρ2 − ρ3)]

(c) ξP = σ²[(1 − ρ1) + k(ρ1 − ρ2)]

(d) ξ = σ²[1 − ρ1]

3. So we can write Cov[Y] = ξ0(bsk)−1J + ξB((sk)−1JB − (bsk)−1J) + ξP(k−1JP − (sk)−1JB) + ξ(I − k−1JP)

The Full ANOVA Table


Split Plot Design Full ANOVA Table

Strata   Source   DF                SS   EMS                            VR
W0       Mean     1                 X    ξ0 + ‖τ0‖²
WB       Blocks   b − 1             X    ξB
WP       WH       s − 1             X    ξP + ‖τH‖²/(s − 1)             X
         Resid.   (b − 1)(s − 1)    X    ξP
         Total    b(s − 1)          X
VP⊥      WM       k − 1             X    ξ + ‖τM‖²/(k − 1)              X
         WM∧H     (s − 1)(k − 1)    X    ξ + ‖τM∧H‖²/((s − 1)(k − 1))   X
         Resid.   (b − 1)s(k − 1)   X    ξ
         Total    bs(k − 1)         X
Total             bsk

7 More on Latin Squares

7.1 Uses of Latin Squares

1. Row-Column Designs

2. Two n-level treatment factors A,B split into n blocks of size n

(a) There cannot be any interaction between the factors

(b) Each of the n blocks should contain each level of each treatment factor

(c) Since there are n² treatments, each treatment is replicated only once

(d) Construction: let the columns be the blocks, the rows be the levels of A, and the letters be the levels of B

3. Three n-level treatment factors A, B, C over n² plots.

(a) There cannot be any interaction between the factors

(b) Let the columns be levels of A, the rows be levels of B, and the letters be levels of C

7.2 Graeco-Latin Squares

Definition 7.1. Let L and M be Latin squares of order n. L is orthogonal to M if each letter of L occurs exactly once in the same positions as each letter of M (i.e. if L has an “a” in positions (1,1) and (2,2), M cannot have an α in both (1,1) and (2,2) if it is orthogonal to L). The pair (L, M) is called a Graeco-Latin square if L and M are orthogonal.

Example 7.1. Suppose n = 3. The following Latin squares form a Graeco-Latin square:

A B C     α β γ
C A B     β γ α
B C A     γ α β


Methods of Construction:

1. If n is odd, we can use two cyclic squares:

(a) Label the rows and columns by the integers mod n.

(b) The element in row i and column j of L is i + j.

(c) The element in row i and column j of M is i − j.

2. If n ≥ 3 is a power of a prime, we use the Galois field of order n:

(a) Label the rows and columns by the elements of GF(n).

(b) The element in row i and column j of L is i + j.

(c) Pick a ∈ GF(n) \ {0, 1}. The element in row i and column j of M is i + aj.

7.3 Applications of Graeco-Latin Squares

Applications:

1. Suppose we must reuse units (e.g. trees) on which we have previously experimented using a Latin square design. If the previous treatments have a residual effect, we want to use an orthogonal Latin square to account for this in the new experiment.

2. We can design an experiment with two n-level factors (A, B) with no interaction on n² units.

3. We can design an experiment with n blocks of size n and three n-level treatment factors (A, B, C) with no interaction.

4. We can design an experiment with four n-level treatment factors (A, B, C, D) with no interaction on n² units.

Construction and Replicates

App.   Rows     Colms.   Latin      Greek     Replicates
1      Rows     Colms.   Previous   Current
2      Rows     Colms.   A          B         Single
3      Blocks   A        B          C         Fractional
4      A        B        C          D         Fractional


8 Experiments on People and Animals

1. Cross-over studies: a study in which the animal or person changes treatment every period.

(a) Row-column designs are suitable for cross-over studies.

(b) A treatment that could “cure” a subject is not suitable for a cross-over study (because then the person no longer needs treatment in a later period).

2. If people are recruited sequentially, we may not know enough about them to block them correctly, so we can use a completely randomised design instead.

3. Blinding: people react differently if they think they are being experimented on. Treatments should be blinded so that no one knows who is getting what except the statistician.

4. If there are ethical issues, consider observational studies (e.g. cohort studies).

A Example of Interactions

B Three Factor Hasse Diagram


