Post on 15-Mar-2021
Session 1 - Review of Basic Concepts Session 2 - Linear Mixed Model Session 3 - Linear Mixed Model Session 4 - Estimation Session 5 - Session 6 - Session 7 Session 8 Session 9 Session 10 Session 11
Linear Model
The classical linear model is defined by
Y = Xβ + ε
where
Y is an observable data (response variable) vector
β is a vector of unknown parameters
X is the design matrix (for factors and regressors)
ε is a vector of random errors and ε ∼ N(0, σ2I)
Then E(Y) = Xβ and Var(Y) = σ²I
The ordinary least-squares estimator of β (which coincides with the MLE) is
β̂ = (X′X)⁻¹X′Y
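The estimator above can be computed directly from the normal equations. A minimal numpy sketch (Python here, although the course's own code is SAS; the data values are made up for illustration):

```python
import numpy as np

# Hypothetical data: n = 5 observations, intercept plus one regressor.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0],
              [1.0, 5.0]])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS estimator: beta_hat = (X'X)^{-1} X'Y, solved without forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against a library least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Solving the system with `solve` rather than explicitly inverting X′X is the numerically preferred form of the same formula.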
Disadvantages
too restrictive for many typical data sets
the error structure in real-world experiments is often more complex than Σ = σ²I
Clarice G.B. Demetrio and Cristian Villegas 1 Modelos Mistos e Componentes de Variancia
Expectation and variance - properties
1. Expected value
Definition
The expected value (or mean) of a random variable Y, denoted by E(Y), is defined by
E(Y) = ∫_{−∞}^{+∞} y f_Y(y) dy.
Properties
Let X and Y be two random variables and let a, b ∈ R be constants. Then
1. E(a) = a.
2. E(aX) = a E(X).
3. E(aX ± bY) = a E(X) ± b E(Y).
4. E(aX ± b) = a E(X) ± b.
5. E[(X − a)²] = E(X²) − 2a E(X) + a².
6. E(XY) = E(X)E(Y), for X and Y independent random variables.
7. E(∑_{i=1}^n Y_i) = ∑_{i=1}^n E(Y_i).
2. Variance
Definition
Let Y be a random variable and assume that µ = E(Y) exists. The variance of Y is the number denoted by Var(Y) and defined by
Var(Y) = E[(Y − µ)²] = E[Y − E(Y)]² = E(Y²) − [E(Y)]² ≥ 0
Note
The variance of a continuous random variable Y is calculated by
Var(Y) = ∫_{−∞}^{+∞} (y − µ)² f_Y(y) dy
or
Var(Y) = ∫_{−∞}^{+∞} y² f_Y(y) dy − [∫_{−∞}^{+∞} y f_Y(y) dy]².
Properties
Let X and Y be two random variables and let a, b ∈ R be constants. Then
1. Var(aY + b) = a² Var(Y)
2. Var(a) = 0
3. Var(aY) = a² Var(Y)
4. Var(−Y) = Var(Y)
5. Var(X ± Y) = Var(X) + Var(Y), for X and Y independent random variables.
6. Var(∑_{i=1}^n a_i Y_i) = ∑_{i=1}^n a_i² Var(Y_i), for Y_1, …, Y_n independent random variables.
3. Covariance
Definition
The covariance between Y and Z is defined by
Cov(Y ,Z ) = E(YZ )− E(Y )E(Z ).
Properties
1. Cov(aY, bZ) = ab Cov(Y, Z)
2. ∑_{i=1}^n Cov(a_i Y_i, b_i Z_i) = ∑_{i=1}^n a_i b_i Cov(Y_i, Z_i)
3. Var(∑_{i=1}^n a_i Y_i) = ∑_{i=1}^n a_i² Var(Y_i) + 2 ∑_{i<i′} a_i a_{i′} Cov(Y_i, Y_{i′})
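Property 3 is just the quadratic form a′Σa written out term by term. A small numerical check (the covariance matrix and coefficients below are illustrative only):

```python
import numpy as np

# Illustrative covariance matrix for (Y1, Y2, Y3) and coefficients a.
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.8],
                  [0.5, 0.8, 2.0]])
a = np.array([1.0, -2.0, 0.5])

# Var(sum a_i Y_i) computed two ways:
# (1) as the quadratic form a' Sigma a,
var_quadform = a @ Sigma @ a
# (2) term by term, as in property 3: variances plus twice the covariances.
var_terms = sum(a[i]**2 * Sigma[i, i] for i in range(3))
var_terms += 2 * sum(a[i] * a[j] * Sigma[i, j]
                     for i in range(3) for j in range(i + 1, 3))
```

Both computations give the same number, as the identity requires.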
Explanatory variables
There are 2 types of explanatory variables:
1. Factors
↪ Interest is in attributing variability in y to the various categories of the factor.
Example: corn yields from two replicates of three varieties (A/B/C) in a completely randomized design
Y_ij = µ + τ_i + ε_ij,  i = 1, 2, 3;  j = 1, 2
↪ In matrix notation, this model can be expressed as:
[y_1; y_2; y_3] = [1_2; 1_2; 1_2] µ + [1_2 0_2 0_2; 0_2 1_2 0_2; 0_2 0_2 1_2] [τ_1; τ_2; τ_3] + [ε_1; ε_2; ε_3]
where y_i = [y_i1, y_i2]′ is the vector of observations of variety i; 1_2 and 0_2 are 2-dimensional column vectors of 1's and 0's, respectively; and ε_i = [ε_i1, ε_i2]′ is the vector of residuals associated with variety i.
↪ Parameter values give the impact of the factor's levels on the response variable.
Factors may be crossed or nested.
Factors may have main effects and interaction effects.
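The block structure of the factor design matrix above can be built mechanically from the level indices. A short numpy sketch for the corn example (3 varieties, 2 replicates each):

```python
import numpy as np

# Variety index of each of the 6 plots, ordered as in the example.
levels = np.repeat([0, 1, 2], 2)
n = len(levels)

X_mu = np.ones((n, 1))        # column for the overall mean mu
X_tau = np.zeros((n, 3))      # indicator columns for tau_1, tau_2, tau_3
X_tau[np.arange(n), levels] = 1.0

# Each row of X_tau has a single 1 marking that plot's variety,
# reproducing the block-diagonal pattern of 1_2 vectors shown above.
```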
2. Regressors
↪ Interest is in attributing variability in Y to changes in the values of a continuous covariable.
Example: changes due to weight x
Y_i = β_0 + β_1 x_i + ε_i
↪ In matrix notation, this model can be expressed as:
[y_1; y_2; …; y_n] = [1 x_1; 1 x_2; …; 1 x_n] [β_0; β_1] + [ε_1; ε_2; …; ε_n]
↪ Parameter values give the impact of an increase in x on the response variable.
Terminology: Multiple Linear Regression / ANOVA / ANCOVA
If the matrix X contains only regressors, the models are called regression models.
If the matrix X contains only factors, the models are called Analysis of Variance (ANOVA) models (X is a matrix of 1's and 0's).
If the matrix X contains both regressors and factors, the models are called Analysis of Covariance (ANCOVA) models.
Estimation
Let us assume a linear model:
Y = Xβ + ε
The parameters to be estimated are β and σ. In all the following, X is supposed to be of full rank: rank(X) = K.
Least squares approach: min ||Y − Xβ||²
β̂_ls = (X′X)⁻¹X′Y
is the best linear unbiased estimator of β, with
β̂_ls ∼ N(β, σ²(X′X)⁻¹)
The best quadratic unbiased estimator of σ² is
σ̂²_ls = (1/(n − K)) (Y − Xβ̂_ls)′(Y − Xβ̂_ls), with σ̂²_ls ∼ [σ²/(n − K)] χ²_{n−K}
Maximum likelihood approach
Likelihood
L(β, σ; y) = ∏_{i=1}^n (1/√(2πσ²)) exp[−(1/(2σ²)) (y_i − x_i′β)²]
Log-likelihood
ℓ(β, σ; y) = −(n/2) log(2πσ²) − (1/(2σ²)) (y − Xβ)′(y − Xβ)
Maximum of the log-likelihood
Setting the derivatives with respect to β and σ to zero gives
β̂_ml = (X′X)⁻¹X′Y
σ̂²_ml = (1/n) (Y − Xβ̂_ml)′(Y − Xβ̂_ml)
β̂_ls = β̂_ml, and both are unbiased:
E[β̂_ls] = E[β̂_ml] = E[(X′X)⁻¹X′Y] = β
but σ̂²_ls ≠ σ̂²_ml.
σ̂²_ls is unbiased:
– it is calculated on the orthogonal complement of the column space of X;
– it takes into account the difference between Y and its projection Xβ̂ onto X, and the loss of degrees of freedom due to the estimation of β.
σ̂²_ml is biased:
– it comes from the joint estimation of σ² and β;
– it does not take into account the loss of degrees of freedom due to the estimation of β.
Note that
E[(Y − Xβ̂)′(Y − Xβ̂)] = E{Y′[I − X(X′X)⁻¹X′]Y} = (n − K)σ²
Then E(σ̂²_ls) = σ² and E(σ̂²_ml) = [(n − K)/n] σ².
Example 1: Linear Regression
Y_i = β_0 + β_1 X_i + ε_i
With all sums running over i = 1, …, n:
X′X = [1 1 … 1; X_1 X_2 … X_n] [1 X_1; 1 X_2; …; 1 X_n] = [n  ∑X_i; ∑X_i  ∑X_i²],
(X′X)⁻¹ = (1/[n∑X_i² − (∑X_i)²]) [∑X_i²  −∑X_i; −∑X_i  n] = (1/(n∑x_i²)) [∑X_i²  −∑X_i; −∑X_i  n]
where n∑X_i² − (∑X_i)² = n∑x_i², with x_i = X_i − X̄. Also,
X′Y = [1 1 … 1; X_1 X_2 … X_n] [Y_1; Y_2; …; Y_n] = [∑Y_i; ∑X_iY_i].
Then the normal-equations system is
[n  ∑X_i; ∑X_i  ∑X_i²] [β̂_0; β̂_1] = [∑Y_i; ∑X_iY_i]
and the least squares estimator θ̂ = (X′X)⁻¹X′Y is given by
θ̂ = [β̂_0; β̂_1] = (1/(n∑x_i²)) [∑X_i²  −∑X_i; −∑X_i  n] [∑Y_i; ∑X_iY_i]
  = [Ȳ − β̂_1X̄; (n∑X_iY_i − ∑X_i ∑Y_i)/(n∑x_i²)]
  = [Ȳ − β̂_1X̄; ∑x_iY_i / ∑x_i²]
where x_i = X_i − X̄.
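The closed-form expressions β̂_1 = ∑x_iY_i/∑x_i² and β̂_0 = Ȳ − β̂_1X̄ agree with the matrix solution; a quick check on illustrative data:

```python
import numpy as np

# Illustrative data for simple linear regression.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

x = X - X.mean()                  # centred regressor x_i = X_i - Xbar
beta1 = (x @ Y) / (x @ x)         # beta1_hat = sum(x_i Y_i) / sum(x_i^2)
beta0 = Y.mean() - beta1 * X.mean()

# Compare with the matrix solution (X'X)^{-1} X'Y.
Xmat = np.column_stack([np.ones(5), X])
beta_mat = np.linalg.solve(Xmat.T @ Xmat, Xmat.T @ Y)
```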
Example 2: Completely randomized design
Y_ij = µ + τ_i + ε_ij = µ_i + ε_ij
For the general case of a set of t treatments, suppose Y is ordered so that all observations for the 1st treatment occur in the first r_1 rows, those for the 2nd treatment occur in the next r_2 rows, and so on, with the last treatment occurring in the last r_t rows:
Y = [Y_11, Y_12, …, Y_{1r_1}, …, Y_{tr_t}]′,
X_T = [1_{r_1} 0_{r_1×1} … 0_{r_1×1}; 0_{r_2×1} 1_{r_2} … 0_{r_2×1}; …; 0_{r_t×1} 0_{r_t×1} … 1_{r_t}],
θ = [µ_1, µ_2, …, µ_t]′
where 1_{r_i} is the r_i × 1 column vector of ones and 0_{r_i} is the r_i × 1 column vector of zeroes (1 only ever denotes a vector here, but 0 may also denote a matrix).
From the theory of linear models (see Chapter XI of Chris Brien's notes), the OLS estimators of θ and Ψ_T = E(Y) = X_Tθ are given by
θ̂ = (X′_TX_T)⁻¹X′_TY and Ψ̂_T = X_Tθ̂ = M_TY = T̄
where
X′_TX_T = diag(r_1, r_2, …, r_t),
M_T = X_T(X′_TX_T)⁻¹X′_T = [(1/r_1)J_{r_1}  0_{r_1×r_2} … 0_{r_1×r_t}; 0_{r_2×r_1}  (1/r_2)J_{r_2} … 0_{r_2×r_t}; …; 0_{r_t×r_1}  0_{r_t×r_2} … (1/r_t)J_{r_t}]
It can be shown, by examining the OLS equations, that the estimators of the elements of θ and Ψ are the treatment means:
θ̂ = [µ̂_1, µ̂_2, …, µ̂_t]′ = [T̄_1, T̄_2, …, T̄_t]′, with T̄_i = (1/r_i) ∑_{j=1}^{r_i} Y_ij.
Note that T̄ = M_TY is a vector whose first r_1 elements equal the mean of the Y_ij's for the first treatment, whose next r_2 elements equal the mean of those for the second treatment, and so on.
M_T is called the treatment mean operator, as it computes the treatment means from the vector to which it is applied and replaces each element of this vector with its treatment mean.
Ψ̂_T = X_Tθ̂ = X_T(X′_TX_T)⁻¹X′_TY = M_TY = T̄ = [T̄_1, …, T̄_1, …, T̄_t, …, T̄_t]′ = [T̄_1 1_{r_1}; T̄_2 1_{r_2}; …; T̄_t 1_{r_t}]
For the observed values y of Y, t̄ = M_Ty is the estimate of Ψ_T.
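The treatment mean operator M_T is easy to verify numerically. A sketch with an unbalanced illustrative design (t = 3 treatments with replicates r = (2, 3, 2)):

```python
import numpy as np

# CRD with t = 3 treatments and replicates r = (2, 3, 2); 7 observations.
r = [2, 3, 2]
XT = np.zeros((7, 3))
start = 0
for i, ri in enumerate(r):
    XT[start:start + ri, i] = 1.0   # block of ones for treatment i
    start += ri

MT = XT @ np.linalg.inv(XT.T @ XT) @ XT.T   # treatment mean operator

Y = np.array([5.0, 7.0, 2.0, 4.0, 6.0, 1.0, 3.0])
T = MT @ Y   # replaces each element by its treatment mean
```

Here the treatment means are 6, 4 and 2, so `T` is [6, 6, 4, 4, 4, 2, 2], each observation replaced by its own treatment mean.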
Other types of restriction
Under the restriction ∑_{i=1}^t τ_i = 0:
θ̂ = [µ̂, τ̂_1, τ̂_2, …, τ̂_t]′ = [Ȳ, T̄_1 − Ȳ, T̄_2 − Ȳ, …, T̄_t − Ȳ]′
Under the restriction τ_1 = 0:
θ̂ = [µ̂, τ̂_2, …, τ̂_t]′ = [T̄_1, T̄_2 − T̄_1, …, T̄_t − T̄_1]′
but Ψ̂_T will be the same in any case.
Sums of squares for the analysis of variance
LM theory: Y′Y = Y′HY + Y′(I − H)Y, where the fitted sum of squares decomposes as Y′HY = (1/n)Y′JY + α̂′X′_TY.
From Chapter XII of Chris Brien's notes, an SSq is the sum of squares of the elements of a vector and can be written as the product of the transpose of a column vector with the original column vector.
For a completely randomized design, the sums of squares in the analysis of variance for Units, Treatments and Residual are given, respectively, by the quadratic forms
Y′Q_UY,  Y′Q_TY  and  Y′Q_{U_Res}Y
where Q_U = M_U − M_G, Q_T = M_T − M_G, Q_{U_Res} = M_U − M_T, with M_U = I_n, M_G = (1/n)J_n and
M_T = X_T(X′_TX_T)⁻¹X′_T = [(1/r_1)J_{r_1}  0_{r_1×r_2} … 0_{r_1×r_t}; 0_{r_2×r_1}  (1/r_2)J_{r_2} … 0_{r_2×r_t}; …; 0_{r_t×r_1}  0_{r_t×r_2} … (1/r_t)J_{r_t}]
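The quadratic forms Y′Q_TY and Y′Q_{U_Res}Y reproduce the familiar ANOVA sums of squares. A sketch for a small balanced illustrative CRD (t = 2 treatments, 3 replicates each):

```python
import numpy as np

# Quadratic-form ANOVA for a CRD: t = 2 treatments, 3 replicates each.
n = 6
Y = np.array([4.0, 5.0, 6.0, 8.0, 9.0, 10.0])

XT = np.kron(np.eye(2), np.ones((3, 1)))      # treatment design matrix
MG = np.full((n, n), 1.0 / n)                 # grand-mean operator (1/n) J
MT = XT @ np.linalg.inv(XT.T @ XT) @ XT.T     # treatment-mean operator
MU = np.eye(n)                                # unit operator I_n

QT = MT - MG                                  # Treatments
QRes = MU - MT                                # Residual

ss_trt = Y @ QT @ Y
ss_res = Y @ QRes @ Y
df_trt = np.trace(QT)                         # = t - 1 (rank = trace, idempotent)
df_res = np.trace(QRes)                       # = n - t
```

With treatment means 5 and 9, the between-treatments SSq is 3·(5−7)² + 3·(9−7)² = 24 and the residual SSq is 4, which is what the quadratic forms return.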
Degrees of freedom of the sums of squares for anANOVA
Definition: The trace of a square matrix is the sum of its diagonal elements.
Definition: The degrees of freedom of a sum of squares is the rank of the idempotent matrix of its quadratic form. That is, the degrees of freedom of Y′AY is given by rank(A).
Lemma: For B idempotent, rank(B) = trace(B).
Lemma: Let c be a scalar and A, B and C be matrices. Then, when the appropriate operations are defined, we have
(i) trace(A) = trace(A′);
(ii) trace(cA) = c trace(A);
(iii) trace(A + B) = trace(A) + trace(B);
(iv) trace(AB) = trace(BA);
(v) trace(ABC) = trace(CAB) = trace(BCA);
(vi) trace(A ⊗ B) = trace(A) trace(B);
(vii) trace(A′A) = 0 if and only if A = 0.
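Two of these facts are worth seeing numerically: the cyclic property trace(AB) = trace(BA) holds even when AB and BA have different sizes, and for an idempotent matrix rank equals trace. A sketch with arbitrary matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 3))

# (iv) trace(AB) = trace(BA), although AB is 3x3 and BA is 4x4.
t1 = np.trace(A @ B)
t2 = np.trace(B @ A)

# Lemma: for an idempotent matrix, rank = trace.  The hat matrix of any
# full-rank design is idempotent; here an illustrative 5x2 design (K = 2).
X = np.column_stack([np.ones(5), np.arange(5.0)])
H = X @ np.linalg.inv(X.T @ X) @ X.T
rank_H = np.linalg.matrix_rank(H)
trace_H = np.trace(H)
```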
Expected mean squares
We have an ANOVA in which we use F (a ratio of MSqs) to decide between models.
But why is this ratio appropriate?
One way of answering this question is to look at what the MSqs measure.
We use the expected values of the MSqs, i.e. the E[MSq]s, to do this.
To derive the expected values, we note that the general form of a mean square is a quadratic form divided by its degrees of freedom, Y′QY/ν.
Expectation of quadratic forms
Definition: A quadratic form in a vector Y is a scalar function of Y ofthe form Y′AY where A is called the matrix of the quadratic form.
Expectation: Let Y be an n × 1 vector of random variables with
E[Y] = Ψ and Var[Y] = V
where Ψ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real values. Then
E(Y′AY) = trace(AV) + Ψ′AΨ
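This expectation formula can be checked by Monte Carlo: simulate many draws of Y, average Y′AY, and compare with trace(AV) + Ψ′AΨ. All matrices below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
Psi = np.array([1.0, 2.0, 3.0])                  # mean vector
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
V = L @ L.T                                      # covariance matrix
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])

theory = np.trace(A @ V) + Psi @ A @ Psi

# Monte Carlo estimate of E(Y'AY): Y = Psi + L z with z ~ N(0, I).
Z = rng.normal(size=(200000, n))
Ys = Psi + Z @ L.T
mc = np.mean(np.einsum('ij,jk,ik->i', Ys, A, Ys))
```

The Monte Carlo average settles close to the theoretical value as the number of draws grows.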
Distribution of a quadratic form
Theorem: Let A be an n × n symmetric matrix of rank ν and Y be ann × 1 normally distributed random vector with E[AY] = 0, Var[Y] = Vand E[Y′AY/ν] = λ. Then (1/λ)Y′AY follows a χ2-distribution withν = rank(A) degrees of freedom if and only if A is idempotent.
Note: The mean and variance of a χ²-distribution with ν degrees of freedom are equal to ν and 2ν, respectively.
Cochran’s theorem (1934)
Theorem: Let Y be an n × 1 normally distributed random vector with E[Y] = Xθ and Var[Y] = V. Let Y′A_1Y, …, Y′A_hY be a collection of h quadratic forms where, for each i = 1, 2, …, h,
A_i is symmetric of rank ν_i, E[A_iY] = 0, and E[Y′A_iY/ν_i] = λ_i.
If any two of the following three statements are true:
1. all A_i are idempotent;
2. ∑_{i=1}^h A_i is idempotent;
3. A_iA_j = 0 for i ≠ j;
then, for each i, (1/λ_i)Y′A_iY follows a χ²-distribution with ν_i degrees of freedom. Furthermore, Y′A_iY and Y′A_jY are independent for i ≠ j, and ∑_{i=1}^h ν_i = ν, where ν denotes the rank of ∑_{i=1}^h A_i.
Distribution of a ratio of independent χ2-distributions
Theorem: Let U_1 and U_2 be two random variables distributed as χ² with ν_1 and ν_2 degrees of freedom, respectively. Then, provided U_1 and U_2 are independent, the random variable
W = (U_1/ν_1) / (U_2/ν_2)
is distributed as Snedecor's F with ν_1 and ν_2 degrees of freedom.
Note: Two quadratic forms Y′A_iY and Y′A_jY are independent if A_iA_j = 0, i ≠ j.
It is possible to show (see Chapter XII of Chris Brien's notes) that, for the completely randomized design:
– Y′Q_{U_Res}Y / σ² ∼ χ²_{n−t}
– Y′Q_TY / σ² ∼ χ²_{t−1}, under H_0
– Y′Q_{U_Res}Y / σ² and Y′Q_TY / σ² are independent
– F = Treatments MSq / Residual MSq ∼ F_{t−1, n−t}
Standard errors of sample variances
Consider a random sample Y_i, i = 1, 2, …, n, from a normal distribution with mean E(Y_i) = µ and variance Var(Y_i) = σ².
The sample mean Ȳ = ∑_i Y_i/n and the sample variance S² = ∑_i (Y_i − Ȳ)²/(n − 1) are unbiased estimators of µ and σ², respectively.
(n − 1)S²/σ² follows a χ²-distribution with (n − 1) df.
E(S²) = σ² and Var(S²) = 2σ⁴/(n − 1).
Let MS denote a mean square with ν df. If νMS/E(MS) ∼ χ²_ν, the variance of MS is Var(MS) = 2E²(MS)/ν. Hence,
Var(MS) = E(MS²) − E²(MS) = E(MS²) − (ν/2)Var(MS).
Thus (ν + 2)Var(MS)/2 = E(MS²), and an unbiased estimator of Var(MS) is given by
2MS²/(ν + 2).
As an illustration, the estimator of the variance of the sample variance S² is V̂ar(S²) = 2S⁴/(n + 1).
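The key fact Var(S²) = 2σ⁴/(n − 1) is easy to confirm by simulation. A sketch with arbitrary settings (n = 15, σ² = 2):

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma2 = 15, 2.0

# Simulate many samples and compare the empirical variance of S^2
# with the theoretical value 2*sigma^4/(n-1).
s2 = np.array([rng.normal(0, np.sqrt(sigma2), n).var(ddof=1)
               for _ in range(50000)])

emp_mean = s2.mean()                  # should be close to sigma^2
emp_var = s2.var()                    # should be close to 2*sigma^4/(n-1)
theo_var = 2 * sigma2**2 / (n - 1)

# From a single sample, the unbiased estimator of Var(S^2) is 2*S^4/(n+1).
```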
Linear combinations of χ2 variables
Consider mean squares MS_i, i = 1, 2, …, k, independent, with ν_i degrees of freedom, such that, independently, ν_iMS_i/E(MS_i) ∼ χ²_{ν_i}.
Estimators of variance components usually take the form MS = ∑_i a_iMS_i, where the a_i are constants.
Following Smith (1938), Satterthwaite (1946) considers νMS/E(MS) ∼ χ²_ν.
As a consequence, Var(MS) = 2E²(MS)/ν.
However, Var(MS) = ∑_i a_i² Var(MS_i) = 2∑_i [a_i² E²(MS_i)/ν_i].
Equating the two expressions for Var(MS),
ν = E²(MS) / ∑_i [a_i² E²(MS_i)/ν_i] = [∑_i a_iE(MS_i)]² / ∑_i [a_i² E²(MS_i)/ν_i]
In practice, ν is estimated by (∑_i a_iMS_i)² / ∑_i (a_i² MS_i²/ν_i).
Goodness of fit criterion
Adjusted R-square
R²_adj = 1 − [∑_{i=1}^n (y_i − x_i′β̂)²/(n − K)] / [∑_{i=1}^n (y_i − ȳ)²/(n − 1)]
Akaike's Information Criterion
AIC = −2 log L(β̂_ml, σ̂_ml; y) + 2K
Bayesian Information Criterion
BIC = −2 log L(β̂_ml, σ̂_ml; y) + K log(n)
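These criteria are straightforward to compute for a Gaussian linear model, since the maximized log-likelihood has the closed form −(n/2)[log(2πσ̂²_ml) + 1]. A sketch on illustrative data, following the slide's convention of penalizing with K = rank(X) (conventions for counting σ as a parameter vary across software):

```python
import numpy as np

# Illustrative fit: simple regression on n = 6 points, K = 2 parameters.
X = np.column_stack([np.ones(6), np.arange(6.0)])
Y = np.array([1.0, 2.9, 5.2, 7.1, 8.8, 11.2])
n, K = X.shape

beta = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta
rss = resid @ resid

# Adjusted R-square: penalized ratio of residual to total variability.
r2_adj = 1 - (rss / (n - K)) / (((Y - Y.mean())**2).sum() / (n - 1))

sigma2_ml = rss / n                              # ML variance estimate
loglik = -0.5 * n * (np.log(2 * np.pi * sigma2_ml) + 1)
aic = -2 * loglik + 2 * K
bic = -2 * loglik + K * np.log(n)
```

Note that AIC and BIC differ only in the penalty, 2K versus K log(n); BIC penalizes extra parameters more heavily once n > e² ≈ 7.4.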
SAS procedure
proc glm data=data;
  class x;   /* if x is a factor */
  model y = x;
  output out=Regr p=Predite r=Residu;
run;
Checking
Gaussian hypothesis
Graphical
histogram, QQ-plot,
proc univariate data=Regr;
  var Residu;
  histogram Residu / normal;
  qqplot Residu / normal(mu=est sigma=est color=red L=1);
  inset mean std / cfill=blank format=5.2;
run;
Statistical test
Kolmogorov-Smirnov
proc univariate data=Regr normaltest;
  var Residu;
run;
Homoscedasticity hypothesis
Graphical
residual/predicted
proc gplot data=Regr;
  plot Residu*Predite / vref=0;
run;
Independence hypothesis
Difficult to test !!!
Variable selection in multiple regression
The main approaches
Forward selection, which involves starting with no variables in the model, trying out the variables one by one and including them if they are statistically significant.
Backward elimination, which involves starting with all candidate variables and testing them one by one for statistical significance, deleting any that are not significant.
Methods that are a combination of the above, testing at each stage for variables to be included or excluded.
SAS Reg procedure
proc reg;
  model Y = x / selection=adjrsq bic;
  model Y = x / selection=stepwise;
run;
Linear Mixed Models
Linear mixed effects models have been widely used in the analysis of data where responses are clustered around some random effects, such that there is a natural dependence between observations in the same cluster.
For example, consider repeated measurements taken on each subject in longitudinal data, or observations taken on members of the same family in a genetic study.
They can easily accommodate covariances among observations.
They handle correlated data by incorporating random effects and estimating their associated variance components to model variability over and above the residual error.
Because of the estimation procedures usually involved, mixed-model approaches can circumvent the problems associated with unbalanced and incomplete data.
Maize trial
Example
5 progenies of a population of maize progenies were investigated
the trial was conducted as a completely randomized design with 4 replicates of each progeny
the response variable was the weight of corn-cob (kg/10m²)
Progenies   Replicates
1           5.95  6.21  5.40  5.18
2           5.07  6.71  5.46  4.98
3           4.82  5.11  4.68  4.52
4           3.87  4.16  4.11  4.84
5           5.53  5.82  4.29  4.70
At crossing, genetic effects may be reasonably assumed to be normal random variables.
During early stages of a selection programme, the nature of genotypic effects may still be regarded as random.
In general, the interest is in the heritability of a trait.
Penicillin yield (Brien, 2009)
Example
The effects of four treatments on the yield of penicillin are to be investigated. It is known that corn steep liquor, an important raw material in producing penicillin, is highly variable from one blending of it to another. To ensure that the results of the experiment apply to more than one blend, five blends (blocks) are to be used in the experiment. The trial was conducted using the same blend in four flasks and randomizing the four treatments to these four flasks.
– interest, of course, in each particular treatment used
– no interest in each blend, which depends very much on the circumstances
– the blend effect can be viewed as a sample from a random blend effect (levels are chosen at random from an infinite set of blend levels)
– interest in estimating the variance of the blend effect as a source of random variation in the data
– the four flasks with the same blend share something, which presumably violates the assumption of independence
[Diagram: each of Blends 1–5 is split into Flasks 1–4, to which the four treatments are randomized.]
           Treatment
Blend    A    B    C    D
1       89   88   97   94
2       84   77   92   79
3       81   87   87   85
4       87   92   89   84
5       79   81   80   88
Calf birth weight
Example
In an animal breeding experiment 20 unrelated cows were subjected to superovulation and artificial insemination. Each group of 4 cows was inseminated with a different sire, with a total of 5 unrelated sires. Out of each mating (combination of dam and sire), three calves were generated and their yearling weights were recorded.
– no interest in each sire or dam, which depend very much on the circumstances
– the sire effect can be viewed as a sample from a random sire effect (levels are chosen at random from an infinite set of sire levels)
– the dam effect can be viewed as a sample from a random dam effect (levels are chosen at random from an infinite set of dam levels)
– interest in estimating the variance of the sire and dam effects as sources of random variation in the data
– the three calves with the same parents share something, which presumably violates the assumption of independence
[Diagram: sire S1 mated to dams D1–D4, …, sire S5 mated to dams D17–D20.]
Fixed vs Random effects
Random effect: A factor will be designated as random if it is considered appropriate to use a probability distribution function to describe the distribution of effects associated with the population set of levels.
– random effects influence only the variance of the response variable
– infinite set of levels (only a finite subset present), and interest lies more in the variance induced by these levels than in the estimation of the levels themselves
– e.g. blends in the penicillin example, progenies in the maize trial
Fixed effect: A factor will be designated as fixed if it is considered appropriate to have the effects associated with the population set of levels for the factor differ in an arbitrary manner, rather than being distributed according to a regularly-shaped p.d.f.
– fixed effects influence only the mean of the response variable
– finite set of levels, and interest lies in the estimation of each particular level effect
– e.g. treatments in the penicillin example
In practice
Random if:
i. large number of population levels, and
ii. random behaviour;
iii. such factors occur in two contrasting kinds of circumstances:
– observational studies or designed experiments with hierarchical structure (School/Class/Student; Sire/Dam/Calf)
– designed experiments with different spatial or temporal scales (longitudinal studies)
Fixed if:
i. small or large number of population levels, and
ii. systematic behaviour.
↪ Consequence: data collected within each level of a random-effect factor are linked to the same realization of a random variable. This introduces dependence among these data.
Type of Models
Fixed-effects model – involves only fixed effects
– to make inferences about those particular levels of the classification factor that were used in the experiment
Random-effects model – involves only random effects
– to make inferences about the population from which these levels were drawn
Mixed model – involves both fixed and random effects
Example
Consider a study with observations on half-sib families of I unrelated sires.
If the interest is in comparing only the I sires, the following fixed model can be used to represent the data:
E(Y_ij) = µ + s_i
where Y_ij represents the phenotypic trait observation of progeny j, j = 1, …, r, in family i, i = 1, …, I; µ is a mean; and s_i is a fixed effect common to all animals having sire i.
If the I sires are considered as a sample from a population of sires, the following random model can be used to represent the data:
E(Y_ij | s_i) = µ + s_i
where S_i is a random effect. Two usual assumptions:
1. the s_i's are independently and identically distributed;
2. the s_i's have zero mean and the same variance σ²_s;
that is, S_i ∼ i.i.d.(0, σ²_s).
In matrix notation, this model can be expressed as:
[y_1; y_2; …; y_I] = [1_r; 1_r; …; 1_r] µ + [1_r 0_r … 0_r; 0_r 1_r … 0_r; …; 0_r 0_r … 1_r] [s_1; s_2; …; s_I] + [ε_1; ε_2; …; ε_I]
where y_i = [y_i1, y_i2, …, y_ir]′ represents the vector of observations of progeny i (i.e., relative to sire i); 1_r and 0_r represent r-dimensional column vectors of 1's and 0's, respectively; and ε_i = [ε_i1, ε_i2, …, ε_ir]′ is the vector of residuals associated with sire i.
Simulation
Case 1: Consider the simple model yij = µ + si + eij, with 3 independent sires and 2 replicates:

fix µ = 50

get a sample of 3 values for si from a N(0, σ²s)

get a sample of 6 values for eij from a N(0, σ²)

Case 2: We could have a more complex covariance structure for sires (for example, Var(s) = Aσ²s, where A could be the parental relationship matrix). The simulation can be done using the Cholesky decomposition of A, i.e. A = DD′. We draw a vector z of dimension 3, with each element obtained from a N(0, 1), and then multiply z by D and by the square root of σ²s, i.e. the vector of sire effects is given by s = σs Dz, so that Var(s) = σ²s DD′ = Aσ²s.
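As a hedged sketch, the two simulation recipes above might be coded as follows; the variance values (σ²s = 4, σ² = 1) and the relationship matrix A are purely illustrative choices, not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(2021)

# Case 1: y_ij = mu + s_i + e_ij with 3 independent sires and 2 replicates
mu, sigma2_s, sigma2 = 50.0, 4.0, 1.0               # illustrative components
s = rng.normal(0.0, np.sqrt(sigma2_s), size=3)      # sire effects ~ N(0, sigma2_s)
e = rng.normal(0.0, np.sqrt(sigma2), size=(3, 2))   # residuals ~ N(0, sigma2)
y = mu + s[:, None] + e                             # 3 x 2 table of y_ij

# Case 2: correlated sires with Var(s) = A * sigma2_s, via Cholesky A = DD'
A = np.array([[1.0, 0.5, 0.25],
              [0.5, 1.0, 0.5],
              [0.25, 0.5, 1.0]])      # hypothetical relationship matrix
D = np.linalg.cholesky(A)             # A = D D'
z = rng.normal(size=3)                # z ~ N(0, I3)
s_corr = np.sqrt(sigma2_s) * D @ z    # s = sigma_s * D z, so Var(s) = A*sigma2_s
```
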
Advantages of Linear Mixed Models
flexibility of mixed models for grouped or correlated observations

the models can be used for related individuals (as in animal and plant breeding), longitudinal data, spatial statistics, etc.

extensions to generalized linear models with random effects, as implemented, for example, in GLIMMIX of SAS

extensions to non-linear mixed models (NLINMIX of SAS, for example) for growth curves
Linear Mixed Model
Y = Xβ + Zu + ε
Y is an observable data vector
β is a vector of unknown parameters
u is a vector of unobservable random variables
X and Z are design matrices for the fixed and random effects
ε is a vector of random errors
Generally, it is assumed that U and ε are independent of each other and normally distributed with zero mean vectors and variance-covariance matrices G and Σ, respectively, i.e.:

[U]       ( [0]   [G  0] )
[ε] ∼ N ( [0] , [0  Σ] )

Inference for mixed-effects models involves the estimation of fixed effects, the prediction of random effects, and the estimation of variance and covariance components, which are briefly discussed next.
Linear Mixed Models
Recall that the general linear mixed model is

Y = Xβ + Zu + ε

U ∼ N(0, G)

ε ∼ N(0, Σ)

u and ε independent

Then,

E(Y|u) = Xβ + Zu and Var(Y|u) = Σ

E(Y) = E[E(Y|u)] = E(Xβ + ZU) = Xβ

Var(Y) = Var[E(Y|u)] + E[Var(Y|u)] = Var(Xβ + ZU) + E(Σ) = ZGZ′ + Σ

The implied marginal model is Y ∼ N(Xβ, V), where V = ZGZ′ + Σ

Note that inferences based on the marginal model do not explicitly assume the presence of random effects representing the natural heterogeneity between subjects (as in longitudinal data)
Some properties of the direct product of matrices
If Ar and Bc are square matrices of order r and c, respectively,

          [a11B ... a1rB]
Ar ⊗ Bc = [ ...  ...  ...]
          [ar1B ... arrB]

where ⊗ is called the direct (Kronecker) product operator

In general, A ⊗ B ≠ B ⊗ A

If u and v are vectors, then u′ ⊗ v = v ⊗ u′ = vu′

If D is a diagonal matrix of order n and A is any matrix, then D ⊗ A = d11A ⊕ d22A ⊕ ... ⊕ dnnA

If matrix dimensions are compatible, (A ⊗ B)(C ⊗ D) = AC ⊗ BD

(αA A ⊗ αB B) = αA αB (A ⊗ B)

(A ⊗ B)′ = A′ ⊗ B′

(A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹

rank(A ⊗ B) = rank(A) rank(B)

tr(A ⊗ B) = tr(A) tr(B)

det(A ⊗ B) = det(A)^c det(B)^r
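Several of these identities can be checked numerically with `numpy.kron`; the small matrices A, B, C, D below are arbitrary illustrative choices:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # order r = 2
B = np.array([[0.0, 1.0], [1.0, 1.0]])   # order c = 2, nonsingular

K = np.kron(A, B)

# (A ⊗ B)' = A' ⊗ B'
assert np.allclose(K.T, np.kron(A.T, B.T))
# (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}
assert np.allclose(np.linalg.inv(K), np.kron(np.linalg.inv(A), np.linalg.inv(B)))
# tr(A ⊗ B) = tr(A) tr(B)
assert np.isclose(np.trace(K), np.trace(A) * np.trace(B))
# det(A ⊗ B) = det(A)^c det(B)^r with r = c = 2 here
assert np.isclose(np.linalg.det(K),
                  np.linalg.det(A) ** 2 * np.linalg.det(B) ** 2)
# mixed-product rule: (A ⊗ B)(C ⊗ D) = AC ⊗ BD
C = np.eye(2)
D = np.array([[2.0, 0.0], [1.0, 1.0]])
assert np.allclose(K @ np.kron(C, D), np.kron(A @ C, B @ D))
```
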
Completely Randomized Design (CRD)
Let’s suppose a CRD with treatment as a random effect and with thesame number of replicates (r) per treatment. The model is
Yij = µ + τi + εij,

where i = 1, 2, ..., t, j = 1, 2, ..., r, µ is a constant, and τi and εij are random, with

τi ∼ N(0, σ²T) and εij ∼ N(0, σ²)

τi and εij, τi and τi′, and εij and εi′j′ (j ≠ j′ and/or i ≠ i′) independent

Then

Var(Yij) = Var(τi + εij) = σ² + σ²T

Cov(Yij, Yij′) = Cov(τi + εij, τi + εij′) = σ²T (observations from the same treatment)

Cov(Yij, Yi′j) = Cov(τi + εij, τi′ + εi′j) = 0 (observations from different treatments)
The variance matrices of the observations for the fixed and random models when r = 2 and t = 3, for example, are

i) τi fixed

Y = [y11, y12, y21, y22, y31, y32]′,  Var(Y) = Σ = σ²I6

ii) τi random

                    [σ²+σ²T  σ²T     0       0       0       0     ]
                    [σ²T     σ²+σ²T  0       0       0       0     ]
Var(Y) = ZGZ′ + Σ = [0       0       σ²+σ²T  σ²T     0       0     ]
                    [0       0       σ²T     σ²+σ²T  0       0     ]
                    [0       0       0       0       σ²+σ²T  σ²T   ]
                    [0       0       0       0       σ²T     σ²+σ²T]

In this case:

    [12    02×1  02×1]
Z = [02×1  12    02×1],  G = σ²T I3 and Σ = σ²I6
    [02×1  02×1  12  ]
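As an illustration, the random-model variance matrix Var(Y) = ZGZ′ + Σ above can be assembled directly; the numerical values of σ²T and σ² are arbitrary:

```python
import numpy as np

t, r = 3, 2                       # treatments, replicates
sigma2_T, sigma2 = 1.5, 0.5       # illustrative variance components

Z = np.kron(np.eye(t), np.ones((r, 1)))   # 6 x 3 treatment incidence matrix
G = sigma2_T * np.eye(t)
Sigma = sigma2 * np.eye(t * r)

V = Z @ G @ Z.T + Sigma

# V is block-diagonal with 2x2 blocks [[s2+s2T, s2T], [s2T, s2+s2T]]
block = np.array([[sigma2 + sigma2_T, sigma2_T],
                  [sigma2_T, sigma2 + sigma2_T]])
assert np.allclose(V, np.kron(np.eye(t), block))
```
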
Expected mean squares for an ANOVA – CRD
Let's suppose a CRD with treatment as a random effect, but with a different number of replicates (ri) per treatment. The model is

Yij = µ + τi + εij,

where i = 1, 2, ..., t, j = 1, 2, ..., ri, µ is a constant, and τi and εij are random. The ANOVA table is

Source       df     SSq          MSq                  F
Units        n−1    Y′QUY
Treatments   t−1    Y′QTY        Y′QTY/(t−1)          MSqT/MSqRes
Residual     n−t    Y′QURes Y    Y′QURes Y/(n−t)

where
MU = In, XG = 1n, MG = XG(X′G XG)⁻¹X′G = n⁻¹Jn
QT = MT − MG, QU = MU − MG, QURes = MU − MT
     [1r1    0r1×1  ...  0r1×1]
XT = [0r2×1  1r2    ...  0r2×1],
     [...    ...    ...  ...  ]
     [0rt×1  0rt×1  ...  1rt  ]

                       [r1⁻¹Jr1  0r1×r2   ...  0r1×rt ]
MT = XT(X′T XT)⁻¹X′T = [0r2×r1   r2⁻¹Jr2  ...  0r2×rt ]
                       [...      ...      ...  ...    ]
                       [0rt×r1   0rt×r2   ...  rt⁻¹Jrt]

Then,

SSqT = Y′QTY = ∑i T²i/ri − C, where C = (∑i,j Yij)²/n

SSqUnits = ∑i,j Y²ij − C,  SSqRes = SSqUnits − SSqT

When r1 = r2 = ... = rt = r,

MG = XG(X′G XG)⁻¹X′G = n⁻¹Jt ⊗ Jr = n⁻¹Jn

XT = It ⊗ 1r,  MT = XT(X′T XT)⁻¹X′T = r⁻¹It ⊗ Jr

SSqT = Y′QTY = (1/r)∑i T²i − C
Assuming that

τi ∼ N(0, σ²T) and εij ∼ N(0, σ²), and

τi and εij, τi and τi′, and εij and εi′j′ (j ≠ j′ and/or i ≠ i′) are independent

i) E(SSqUnits)

E(SSqUnits) = ∑i,j E(Y²ij) − E(C)

E(Y²ij) = E(µ²) + E(τ²i) + E(ε²ij) + E(cross products) = µ² + σ²T + σ²

E(∑i,j Y²ij) = nµ² + nσ²T + nσ²

∑i,j Yij = nµ + ∑i riτi + ∑i,j εij

E[(∑i,j Yij)²] = n²µ² + E(∑i riτi)² + E(∑i,j εij)² + E(cross products)
             = n²µ² + ∑i r²i σ²T + nσ²

E(C) = E[(∑i,j Yij)²]/n = nµ² + (∑i r²i/n)σ²T + σ²

E(SSqUnits) = (n − ∑i r²i/n)σ²T + (n − 1)σ²
ii) E(SSqT)

E(SSqT) = ∑i E(T²i/ri) − E(C)

Ti = ∑j (µ + τi + εij) = riµ + riτi + ∑j εij

T²i = r²iµ² + r²iτ²i + (∑j εij)² + cross products

T²i/ri = riµ² + riτ²i + (∑j εij)²/ri + cross products/ri

E[∑i T²i/ri] = ∑i (riµ² + riσ²T + σ²) = nµ² + nσ²T + tσ²

E(SSqT) = nµ² + nσ²T + tσ² − nµ² − (∑i r²i/n)σ²T − σ²
        = (n − ∑i r²i/n)σ²T + (t − 1)σ²

E(MSqT) = (1/(t−1))(n − ∑i r²i/n)σ²T + σ²

When r1 = r2 = ... = rt = r,

E(MSqT) = rσ²T + σ²
iii) E(SSqRes)

SSqRes = SSqUnits − SSqT

E(SSqRes) = E(SSqUnits) − E(SSqT) = (n − t)σ²

E(MSqRes) = σ²

Exercise: Show that for a fixed CRD

E(MSqT) = qT(Ψ) + σ² and E(MSqRes) = σ²,

where qT(Ψ) = ∑i ri(τi − τ̄.)²/(t − 1)
The expected mean squares under the fixed and random models are given in the following table

Source       df     SSq          MSq (s²)             E[MSq] (fixed)   E[MSq] (random)
Units        n−1    Y′QUY
Treatments   t−1    Y′QTY        Y′QTY/(t−1)          σ² + qT(Ψ)       σ² + k1σ²T
Residual     n−t    Y′QURes Y    Y′QURes Y/(n−t)      σ²               σ²

where qT(Ψ) = Ψ′QTΨ/(t−1) = ∑i ri(τi − τ̄.)²/(t−1),  k1 = (1/(t−1))(n − ∑i r²i/n)

MU = In, MG = n⁻¹Jn

QT = MT − MG, QU = MU − MG, QURes = MU − MT

σ² and σ²T are called components of variance
Expected mean squares for an ANOVA, using matrix notation

Recall that the general linear mixed model is

Y = Xβ + Zu + ε

U ∼ N(0, G)

ε ∼ N(0, Σ)

u and ε independent. Then E(Y) = Xβ and V = ZGZ′ + Σ.

Theorem: Let Y be an n × 1 vector of random variables with E[Y] = µ and Var[Y] = V, where µ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real numbers. Then

E(Y′AY) = tr(AV) + µ′Aµ
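The theorem can be checked by Monte Carlo on a small example; the particular µ, V and A below are arbitrary illustrative choices:

```python
import numpy as np

# Check E(Y'AY) = tr(AV) + mu'A mu by simulation
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5, 3.0])
V = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, 2.0]])     # a positive-definite Var matrix
A = np.diag([1.0, 2.0, 3.0, 4.0])

theory = np.trace(A @ V) + mu @ A @ mu   # tr(AV) + mu'A mu

Y = rng.multivariate_normal(mu, V, size=200_000)
mc = np.mean(np.einsum("ij,jk,ik->i", Y, A, Y))   # average of Y'AY over draws
```

With these choices tr(AV) = 20 and µ′Aµ = 45.75, and the Monte Carlo average agrees with the theoretical value to within sampling error.
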
i) Assuming a fixed CRD model (fixed effect for treatment), we have

Y = XGµ + XTτ + ε

with τ fixed and ε ∼ N(0, Inσ²), that is, E(τ) = τ, G = Var(τ) = 0t×t, E(ε) = 0 and Σ = Var(ε) = Inσ². Then,

E(Y) = µ = Xβ = XGµ + XTτ and V = ZGZ′ + Σ = Inσ²

Remember that

SSqT = Y′QTY = Y′(MT − MG)Y and SSqRes = Y′QURes Y = Y′(MU − MT)Y

where MU = In, XG = 1n, MG = n⁻¹Jn,

XT = block diag(1r1, 1r2, ..., 1rt) and MT = XT(X′T XT)⁻¹X′T = block diag(r1⁻¹Jr1, r2⁻¹Jr2, ..., rt⁻¹Jrt)
Then,

i.1) E(SSqT) = E(Y′QTY) = E(Y′MTY) − E(Y′MGY)

but

E(Y′MTY) = tr(MT In)σ² + µ′MTµ
         = tσ² + µ²X′G MT XG + 2µX′G XTτ + τ′X′T XTτ
         = tσ² + nµ² + 2µX′G XTτ + ∑i riτ²i

E(Y′MGY) = tr(MG In)σ² + µ′MGµ
         = σ² + nµ² + 2µX′G XTτ + τ′X′T MG XTτ
         = σ² + nµ² + 2µX′G XTτ + (1/n)(∑i riτi)²

E(SSqT) = (t−1)σ² + ∑i riτ²i − (1/n)(∑i riτi)²
        = (t−1)σ² + ∑i ri(τi − τ̄)²
and

E(MSqT) = σ² + (1/(t−1)) ∑i ri(τi − τ̄)²

i.2) E(SSqRes) = E(Y′QURes Y) = E(Y′MUY) − E(Y′MTY)

but

E(Y′MUY) = tr(MU In)σ² + µ′MUµ
         = nσ² + (XGµ + XTτ)′(XGµ + XTτ)
         = nσ² + nµ² + 2µX′G XTτ + ∑i riτ²i

E(SSqRes) = (n − t)σ² and E(MSqRes) = σ²
ii) For a random treatment effect (random CRD model), we have

Y = XGµ + ZTτ + ε

with τ ∼ N(0, Itσ²T) and ε ∼ N(0, Inσ²), that is,

E(τ) = 0, G = Var(τ) = Itσ²T, E(ε) = 0 and Σ = Var(ε) = Inσ².

Then,

E(Y) = µ = Xβ = XGµ and V = ZT It Z′T σ²T + Σ = ZTZ′T σ²T + Inσ²

Remember that

SSqT = Y′QTY = Y′(MT − MG)Y and SSqRes = Y′QURes Y = Y′(MU − MT)Y

where MU = In, XG = 1n, MG = n⁻¹Jn,

ZT = block diag(1r1, 1r2, ..., 1rt) and MT = block diag(r1⁻¹Jr1, r2⁻¹Jr2, ..., rt⁻¹Jrt)
Then,

ii.1) E(SSqT) = E(Y′QTY) = E(Y′MTY) − E(Y′MGY)

but

E(Y′MTY) = tr[MT(ZTZ′T σ²T + Inσ²)] + µ′MTµ
         = tr(MTZTZ′T)σ²T + tr(MT)σ² + µ²X′G MT XG
         = tr(ZTZ′T)σ²T + tr(It)σ² + nµ²
         = nσ²T + tσ² + nµ²

E(Y′MGY) = tr[MG(ZTZ′T σ²T + Inσ²)] + µ′MGµ
         = tr(MGZTZ′T)σ²T + tr(MG)σ² + µ²X′G MG XG
         = (1/n)tr(JnZTZ′T)σ²T + σ² + nµ²
         = (1/n)∑i r²i σ²T + σ² + nµ²

E(SSqT) = (n − (1/n)∑i r²i)σ²T + (t − 1)σ²
and

E(MSqT) = σ² + (1/(t−1))(n − (1/n)∑i r²i)σ²T

ii.2) E(SSqRes) = E(Y′QURes Y) = E(Y′MUY) − E(Y′MTY)

but

E(Y′MUY) = tr[MU(ZTZ′T σ²T + Inσ²)] + µ′MUµ
         = tr(MUZTZ′T)σ²T + tr(MU)σ² + µ²X′G XG
         = nσ²T + nσ² + nµ²

E(SSqRes) = (n − t)σ² and E(MSqRes) = σ²
[Figure: Hasse diagrams for the maize progeny example: treatment structure (Mean, 1 df; Progenies, 5 levels, 4 df) and plot structure (Mean, 1 df; Plots, 20 levels, 19 df), with the corresponding averaging matrices (MG; MPr, MPr − MG; MPl, MPl − MG) and expected mean square contributions (4σ²T for Progenies, σ² for Plots).]
Estimation of Fixed Effects

Recall that the general linear mixed model is

Y = Xβ + Zu + ε

with U ∼ N(0, G) and ε ∼ N(0, Σ) independent.

Then,
E(Y|u) = Xβ + Zu and Var(Y|u) = Σ
E(Y) = E(Xβ + ZU) = Xβ
Var(Y) = Var(Xβ + ZU) + E(Σ) = ZGZ′ + Σ = V
and the marginal model is Y ∼ N(Xβ, V)

Notation:
β: vector of fixed effects (as before)
α: vector of all variance components in G and Σ
θ = (β′, α′)′: vector of all parameters in the marginal model

Marginal likelihood function:

LML(θ) = (2π)^(−n/2) |V(α)|^(−1/2) exp[−(1/2)(Y − Xβ)′V⁻¹(α)(Y − Xβ)]

If α were known, the MLE of β would be

β̂(α) = (X′V⁻¹X)⁻¹X′V⁻¹Y ∼ N(β, (X′V⁻¹X)⁻¹)
As G and Σ are generally unknown, an estimate of V is used instead, so that the estimator becomes β̂ = (X′V̂⁻¹X)⁻¹X′V̂⁻¹Y.

The variance-covariance matrix of β̂ is approximated by (X′V̂⁻¹X)⁻¹.

Note: (X′V̂⁻¹X)⁻¹ is biased downwards as a consequence of ignoring the variability introduced by working with estimates of the (co)variance components instead of their true (unknown) parameter values.

Approximate confidence regions and test statistics for estimable functions of the type K′β can be obtained by using the result:

(K′β̂)′[K′(X′V̂⁻¹X)⁻K]⁻¹(K′β̂) / rank(K) ≈ F[ϕN, ϕD]

where F[ϕN, ϕD] refers to an F-distribution with ϕN = rank(K) degrees of freedom for the numerator and ϕD degrees of freedom for the denominator, the latter generally calculated from the data using, for example, Satterthwaite's approach
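A minimal sketch of the estimator β̂(α) = (X′V⁻¹X)⁻¹X′V⁻¹Y for a known V; the helper name `gls` and the toy data are hypothetical, and the check uses the fact that with V = σ²I the GLS estimator reduces to OLS:

```python
import numpy as np

def gls(X, V, y):
    """GLS/ML estimator of beta for known V: (X'V^-1 X)^-1 X'V^-1 y."""
    Vi = np.linalg.inv(V)
    XtVi = X.T @ Vi
    return np.linalg.solve(XtVi @ X, XtVi @ y)

# toy check: with V proportional to the identity, GLS reduces to OLS
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(10), np.arange(10.0)])
y = X @ np.array([2.0, 0.5]) + rng.normal(0.0, 0.1, 10)
beta_gls = gls(X, np.eye(10), y)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```
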
Matrix review
X ∼ Nk(µ, Σ)

Consider the partitions:

X = [X1; X2],  µ = [µ1; µ2]  and  Σ = [Σ11 Σ12; Σ21 Σ22]

Then

X1 ∼ N(µ1, Σ11) and X2 ∼ N(µ2, Σ22) (marginal distributions)

and

X1|X2 ∼ N(µ1.2, Σ11.2) and X2|X1 ∼ N(µ2.1, Σ22.1) (conditional distributions),

where

µ1.2 = µ1 + Σ12Σ22⁻¹(X2 − µ2),  Σ11.2 = Σ11 − Σ12Σ22⁻¹Σ21

and

µ2.1 = µ2 + Σ21Σ11⁻¹(X1 − µ1),  Σ22.1 = Σ22 − Σ21Σ11⁻¹Σ12.
Prediction of Random Effects
In addition to the estimation of fixed effects, very often (in genetics, for example) interest also lies in the prediction of the random effects.

In linear (Gaussian) models such predictions are given by the conditional expectation of U given the data, i.e. E[U|y].

Given the model specification, the joint distribution of Y and U is:

[Y]       ( [Xβ]   [V    ZG] )
[U] ∼ N ( [0 ] , [GZ′  G ] )

From the properties of the multivariate normal distribution, we have

E[U|y] = E[U] + Cov[U, Y′]Var⁻¹[Y](y − E[Y])
       = GZ′V⁻¹(y − Xβ) = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ)

The fixed effects β are typically replaced by their estimates, so that predictions are made with the expression:

û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂)
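A direct sketch of the predictor û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂); the function name and the toy CRD-like matrices are illustrative, and the second expression used for comparison is the equivalent form (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂) stated later in the notes:

```python
import numpy as np

def blup(y, X, Z, G, Sigma, beta_hat):
    """u_hat = G Z' (Z G Z' + Sigma)^{-1} (y - X beta_hat)."""
    V = Z @ G @ Z.T + Sigma
    return G @ Z.T @ np.linalg.solve(V, y - X @ beta_hat)

# toy check on a balanced one-way layout (t = 3 groups, r = 2 replicates)
rng = np.random.default_rng(3)
t, r = 3, 2
X = np.ones((t * r, 1))
Z = np.kron(np.eye(t), np.ones((r, 1)))
G = 1.5 * np.eye(t)
Sigma = 0.5 * np.eye(t * r)
y = rng.normal(size=t * r)
beta_hat = np.array([y.mean()])

u1 = blup(y, X, Z, G, Sigma, beta_hat)
Si = np.linalg.inv(Sigma)
u2 = np.linalg.solve(Z.T @ Si @ Z + np.linalg.inv(G),
                     Z.T @ Si @ (y - X @ beta_hat))
```
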
Mixed Model Equations
The solutions β̂ and û discussed before require V⁻¹. As V can be of huge dimension, especially in plant and animal breeding applications, its inverse is generally computationally demanding, if not unfeasible.

However, Henderson (1950) presented the mixed model equations (MME) to estimate β and u simultaneously, without the need for computing V⁻¹. The MME were derived by maximizing (with respect to β and u) the joint density of Y and U, f(y, u|β, G, Σ) = f(y|u, β, Σ) f(u|G), expressed as:

f(y, u|β, G, Σ) ∝ |Σ|^(−1/2)|G|^(−1/2) exp[−(1/2)(y − Xβ − Zu)′Σ⁻¹(y − Xβ − Zu) − (1/2)u′G⁻¹u]

The logarithm of this function satisfies (apart from constants and sign):

ℓ ∝ log|Σ| + log|G| + (y − Xβ − Zu)′Σ⁻¹(y − Xβ − Zu) + u′G⁻¹u
  = log|Σ| + log|G| + y′Σ⁻¹y − 2y′Σ⁻¹Xβ − 2y′Σ⁻¹Zu + β′X′Σ⁻¹Xβ + 2β′X′Σ⁻¹Zu + u′Z′Σ⁻¹Zu + u′G⁻¹u
The derivatives of ℓ with respect to β and u are:

[∂ℓ/∂β]   [X′Σ⁻¹y − X′Σ⁻¹Xβ − X′Σ⁻¹Zu        ]
[∂ℓ/∂u] = [Z′Σ⁻¹y − Z′Σ⁻¹Xβ − Z′Σ⁻¹Zu − G⁻¹u]

Equating them to zero gives the following system:

[X′Σ⁻¹Xβ̂ + X′Σ⁻¹Zû        ]   [X′Σ⁻¹y]
[Z′Σ⁻¹Xβ̂ + Z′Σ⁻¹Zû + G⁻¹û] = [Z′Σ⁻¹y]

which can be expressed as:

[X′Σ⁻¹X   X′Σ⁻¹Z       ] [β̂]   [X′Σ⁻¹y]
[Z′Σ⁻¹X   Z′Σ⁻¹Z + G⁻¹ ] [û] = [Z′Σ⁻¹y]

known as the mixed model equations (MME).
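A sketch of solving the MME directly; the helper name `solve_mme` and the toy data are hypothetical, and the result is compared with the GLS form β̂ = (X′V⁻¹X)⁻¹X′V⁻¹Y that uses V = ZGZ′ + Σ:

```python
import numpy as np

def solve_mme(y, X, Z, G, Sigma):
    """Solve Henderson's mixed model equations for (beta_hat, u_hat)."""
    Si = np.linalg.inv(Sigma)
    Gi = np.linalg.inv(G)
    C = np.block([[X.T @ Si @ X, X.T @ Si @ Z],
                  [Z.T @ Si @ X, Z.T @ Si @ Z + Gi]])
    rhs = np.concatenate([X.T @ Si @ y, Z.T @ Si @ y])
    sol = np.linalg.solve(C, rhs)
    p = X.shape[1]
    return sol[:p], sol[p:]

# toy balanced one-way layout: t = 3 groups, r = 2 replicates
rng = np.random.default_rng(4)
t, r = 3, 2
X = np.ones((t * r, 1))
Z = np.kron(np.eye(t), np.ones((r, 1)))
G = 2.0 * np.eye(t)
Sigma = 1.0 * np.eye(t * r)
y = rng.normal(50.0, 2.0, size=t * r)

beta_hat, u_hat = solve_mme(y, X, Z, G, Sigma)

# equivalent GLS/BLUE expression based on V = ZGZ' + Sigma
V = Z @ G @ Z.T + Sigma
beta_gls = np.linalg.solve(X.T @ np.linalg.solve(V, X),
                           X.T @ np.linalg.solve(V, y))
```
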
BLUE and BLUP
Using the second set of equations of the MME, we have that:

Z′Σ⁻¹Xβ̂ + (Z′Σ⁻¹Z + G⁻¹)û = Z′Σ⁻¹y

so that

û = (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂)

It can be shown that this expression is equivalent to û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂) and, more importantly, that û is the best linear unbiased predictor (BLUP) of u.

Using this result in the first set of equations of the MME, we have that:

X′Σ⁻¹Xβ̂ + X′Σ⁻¹Zû = X′Σ⁻¹y

X′Σ⁻¹Xβ̂ + X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂) = X′Σ⁻¹y

β̂ = {X′[Σ⁻¹ − Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹]X}⁻¹ X′[Σ⁻¹ − Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹]Y

Similarly, it can be shown that this expression is equivalent to β̂ = (X′V⁻¹X)⁻¹X′V⁻¹Y, which is the best linear unbiased estimator (BLUE) of β
It is important to note that β̂ and û require knowledge of G and Σ.

These matrices, however, are rarely known.

This is a problem without an exact solution using classical methods.

The practical approach is to replace G and Σ by their estimates (Ĝ and Σ̂) in the MME.

Note that if G and Σ are known, the variance-covariance matrix of the BLUE and BLUP is:

    [β̂]   [X′Σ⁻¹X   X′Σ⁻¹Z       ]⁻¹
Var [û] = [Z′Σ⁻¹X   Z′Σ⁻¹Z + G⁻¹ ]
If G and Σ are unknown and their values in the MME are replaced by some sort of point estimates Ĝ and Σ̂, the new solutions β̃ and ũ of the system:

[X′Σ̂⁻¹X   X′Σ̂⁻¹Z       ] [β̃]   [X′Σ̂⁻¹y]
[Z′Σ̂⁻¹X   Z′Σ̂⁻¹Z + Ĝ⁻¹ ] [ũ] = [Z′Σ̂⁻¹y]

are no longer BLUE and BLUP solutions, as they are not even linear functions of the data y.

It can also be shown that generally:

    [β̃]   [X′Σ̂⁻¹X   X′Σ̂⁻¹Z       ]⁻¹
Var [ũ] > [Z′Σ̂⁻¹X   Z′Σ̂⁻¹Z + Ĝ⁻¹ ]
Inverse of a nonsingular partitioned matrix
Let A be a nonsingular partitioned matrix and A⁻¹ its inverse, as follows:

    [A11  A12]         [A¹¹  A¹²]   [Var(β̂)      Cov(β̂, û)]
A = [A21  A22],  A⁻¹ = [A²¹  A²²] = [Cov(û, β̂)  Var(û)    ]

where

A¹¹ = (A11 − A12A22⁻¹A21)⁻¹

A¹² = (A²¹)′ = −(A11 − A12A22⁻¹A21)⁻¹A12A22⁻¹

A²² = A22⁻¹ + A22⁻¹A21(A11 − A12A22⁻¹A21)⁻¹A12A22⁻¹
Example
Considering the completely randomized design with random treatment effect and r = 2, t = 3, then β = µ, X = 16, G = σ²T I3, Σ = σ²I6,

    [12    02×1  02×1]       [J2  0   0 ]
Z = [02×1  12    02×1]  and  V = [0   J2  0 ] σ²T + σ²I6
    [02×1  02×1  12  ]       [0   0   J2]

Then

µ̂ = ȳ (Exercise!)

û = (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹(y − Xβ̂)
  = (Z′Z/σ² + I3/σ²T)⁻¹ (Z′/σ²)(y − 16µ̂)
  = ((r/σ²)I3 + (1/σ²T)I3)⁻¹ (Z′/σ²)(y − 16µ̂)
  = ((rσ²T + σ²)/(σ²σ²T))⁻¹ (Z′/σ²)(y − 16µ̂)
  = [σ²T/(σ²T + σ²/r)] (Z′/r)(y − 16µ̂)

ûi = BLUPi = [σ²T/(σ²T + σ²/r)](ȳi − µ̂) = (shrinkage factor) × BLUEi
The EBLUP for ui is given by

(1 − MSqRes/MSqT)(ȳi − ȳ)

The BLUP for µi = µ + ui is given by

µ̂i = ȳ + [σ²T/(σ²T + σ²/r)](ȳi − ȳ) = ȳi − [σ²/(rσ²T + σ²)](ȳi − ȳ)

and substituting σ²T and σ² by their estimates we obtain the EBLUP for µi:

ȳi − (MSqRes/MSqT)(ȳi − ȳ)

The relationship between the shrunk or adjusted means (EBLUPs) and the unadjusted treatment means ȳi can also be illustrated by a scatter diagram. The shrinkage towards the overall mean is indicated by the fact that the points representing treatments with an estimated mean above µ̂ = ȳ lie below the line

EBLUP = unadjusted mean

whereas those representing treatments with an estimated mean below µ̂ = ȳ lie above the line.
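A small sketch of the EBLUP computation for a balanced random CRD, shrinking each treatment mean toward the overall mean by the factor MSqRes/MSqT; the function name and the data are illustrative:

```python
import numpy as np

def eblup_means(y):
    """EBLUPs of mu_i = mu + u_i for a balanced random CRD.

    y: t x r array of observations (rows = treatments)."""
    t, r = y.shape
    ybar_i = y.mean(axis=1)                                  # treatment means
    ybar = y.mean()                                          # overall mean
    msq_t = r * np.sum((ybar_i - ybar) ** 2) / (t - 1)       # treatment MSq
    msq_res = np.sum((y - ybar_i[:, None]) ** 2) / (t * (r - 1))
    # shrink the treatment means toward the overall mean
    return ybar_i - (msq_res / msq_t) * (ybar_i - ybar)

y = np.array([[49.0, 51.0],
              [54.0, 56.0],
              [44.0, 46.0]])
m = eblup_means(y)   # shrunk means: [50.0, 54.8, 45.2]
```
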
A11 = X′Σ⁻¹X, A12 = X′Σ⁻¹Z, A22 = Z′Σ⁻¹Z + G⁻¹

Var(Ȳ) = [X′Σ⁻¹X − X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹X]⁻¹
       = [X′X/σ² − (X′Z/σ²)(Z′Z/σ² + I/σ²T)⁻¹(Z′X/σ²)]⁻¹
       = [n/σ² − (X′Z/σ²)(σ²σ²T/(rσ²T + σ²))(Z′X/σ²)]⁻¹   (since (Z′Z/σ² + I/σ²T)⁻¹ = [σ²σ²T/(rσ²T + σ²)]I3)
       = [(n/σ²)(σ²/(rσ²T + σ²))]⁻¹                        (using X′ZZ′X = nr)
       = (rσ²T + σ²)/n

and an estimate of Var(Ȳ) is given by

Var̂(Ȳ) = MSqT/n
Var(û) = (Z′Σ⁻¹Z + G⁻¹)⁻¹ + (Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹X [X′Σ⁻¹X − X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹Z′Σ⁻¹X]⁻¹ X′Σ⁻¹Z(Z′Σ⁻¹Z + G⁻¹)⁻¹

       = σ²σ²T/(rσ²T + σ²)

and an estimate of Var(û) is given by

Var̂(û) = (1 − MSqRes/MSqT)(MSqRes/r)
Estimation methods for the variance components
Recall that α is the vector of all variance components in G and Σ
In most cases, α is not known and needs to be replaced by an estimate α̂

Three frequently used estimation methods for α:
Moment method or ANOVA Method (MM)
Maximum likelihood method (ML)
Restricted maximum likelihood method (REML)
Anova Estimation
Fit the model assuming that the random effects in the model are fixed effects. Obtain the corresponding ANOVA table.

Compute the expected values of the observed mean squares in the ANOVA table under the true assumptions about the u's and ε.

Equate the observed mean squares to their expected mean squares and solve the resulting system of equations for each of the variance components.
Use the resulting solutions as the estimates of the variancecomponents
Example
Consider the data set below, related to observations of half-sib families oft unrelated sires.
Sire:   1      2      ...   t
        y11    y21    ...   yt1
        y12    y22    ...   yt2
        ...    ...    ...   ...
        y1r1   y2r2   ...   ytrt

The following model can be used to represent these data:

yij = µ + si + εij

where yij represents the phenotypic trait observation of progeny j (j = 1, 2, ..., ri) in family i, µ is a mean, si is an effect common to all animals having sire i, and εij is a residual term.
The sire effect si is equivalent to the transmitting ability (which is equal to one-half of the additive genetic value) of sire i, as one-half of its genes are (randomly) transmitted to each of its ri progeny.

The residual terms εij capture additional genetic effects (such as the effect of dams) and environmental components.

It is assumed that si ∼ N(0, σ²s) and εij ∼ N(0, σ²)

The expectation and variance of Yij are

E(Yij) = µ and Var(Yij) = σ²s + σ²
The ANOVA table with expected mean squares is

Source     df     SSq     MSq     E[MSq]
Units      n−1
Sire       t−1    SSSq    SMSq    σ² + kσ²s
Residual   n−t    RSSq    RMSq    σ²

where k = (1/(t−1))(n − (1/n)∑i r²i).

The ANOVA (MM) estimators for σ² and σ²s are

σ̂² = RSSq/(n − t)  and  σ̂²s = (SMSq − RMSq)/k = (1/k)[SMSq − σ̂²]

In the specific case of balanced data, i.e. the same progeny size for all sires, ri = r = n/t, and the ANOVA estimators become:

σ̂² = RMSq = RSSq/[t(r − 1)]  and  σ̂²s = (SMSq − RMSq)/r = (1/r)[SSSq/(t − 1) − σ̂²]
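The ANOVA (moment) estimators above can be sketched for possibly unbalanced sire data; the function name and the small data set are illustrative:

```python
import numpy as np

def anova_sire_estimates(groups):
    """Moment (ANOVA) estimates of sigma2 and sigma2_s for y_ij = mu + s_i + e_ij.

    groups: list of 1-D arrays, one per sire (possibly unbalanced)."""
    n = sum(len(g) for g in groups)
    t = len(groups)
    ybar = np.concatenate(groups).mean()
    # sire and residual sums of squares
    sssq = sum(len(g) * (g.mean() - ybar) ** 2 for g in groups)
    rssq = sum(((g - g.mean()) ** 2).sum() for g in groups)
    smsq, rmsq = sssq / (t - 1), rssq / (n - t)
    # k = (1/(t-1)) * (n - sum(r_i^2)/n)
    k = (n - sum(len(g) ** 2 for g in groups) / n) / (t - 1)
    return rmsq, (smsq - rmsq) / k          # sigma2_hat, sigma2_s_hat

groups = [np.array([10.0, 12.0]),
          np.array([15.0, 17.0, 16.0]),
          np.array([11.0, 13.0])]
s2, s2s = anova_sire_estimates(groups)
```
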
Anova Estimation – Advantages
In general, the ANOVA approach works well for simple models (such as a one-way structure) or balanced data (such as data from designed experiments with no missing values).

The estimators of the variance components are unbiased.

One can often approximate the degrees of freedom corresponding to the estimated standard errors of estimators of estimable functions of the fixed effects by using Satterthwaite's method. For the sire example,

σ̂²s = (SMSq − RMSq)/k

with ns degrees of freedom given by

ns = (SMSq − RMSq)² / [(SMSq)²/(t − 1) + (RMSq)²/(n − t)]

SAS and R can produce the necessary information to perform these analyses.
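Satterthwaite's approximation for the degrees of freedom of σ̂²s can be sketched as follows; the mean squares and design sizes used are purely illustrative numbers:

```python
def satterthwaite_df(smsq, rmsq, t, n):
    """Approximate df for sigma2_s_hat = (SMSq - RMSq)/k (Satterthwaite)."""
    return (smsq - rmsq) ** 2 / (smsq ** 2 / (t - 1) + rmsq ** 2 / (n - t))

# example: SMSq = 20, RMSq = 2, t = 5 sires, n = 25 observations
df = satterthwaite_df(20.0, 2.0, 5, 25)
```

The approximate df is always smaller than the df of either mean square entering the linear combination, which reflects the extra uncertainty from combining two estimated quantities.
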
ANOVA Estimation – Disadvantages

It is not suitable for more complex models and data structures, such as those generally found in plant and animal breeding and in longitudinal studies.

There is no unique way to form an ANOVA table when the data are not balanced.

The procedure can produce negative estimates of the variance components, which do not make sense.

If some of the expected mean squares of the random effects in the ANOVA table depend on fixed effects, the method cannot be applied. This problem can be avoided by placing all the fixed effects in the model first, followed by the random effects.
A number of methods have been proposed for estimating variance components in more complex scenarios, such as the expected mean squares approach of Henderson (1953) and minimum norm quadratic unbiased estimation (Rao 1971a, 1971b). Maximum likelihood based methods are currently the most popular, especially the restricted (or residual) maximum likelihood (REML) approach, which attempts to correct for the well-known bias in classical maximum likelihood (ML) estimation of variance components.

These two methods are briefly described next.
Maize trial
Example
5 progenies of a maize population were investigated

the trial was conducted using a completely randomized design with 4 replicates of each progeny

the response variable was the weight of corn-cob (kg/10 m²)

Progeny   Replicates
1         5.95  6.21  5.40  5.18
2         5.07  6.71  5.46  4.98
3         4.82  5.11  4.68  4.52
4         3.87  4.16  4.11  4.84
5         5.53  5.82  4.29  4.70
Completely Randomized Design (CRD) with the same number of replicates – Expected mean squares for an ANOVA
Let Y be an n × 1 vector of random variables with E[Y] = µ and Var[Y] = V, where µ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real numbers. Then

E(Y′AY) = tr(AV) + µ′Aµ

For a fixed CRD model: E(Y) = X_T τ and V = I_n σ²

For a random CRD model: E(Y) = 1_n µ and V = I_n σ² + rσ²_T M_T

where X_T = I_t ⊗ 1_r and M_T = X_T (X_T′X_T)⁻¹X_T′ = r⁻¹ I_t ⊗ J_r
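The quadratic-form identity above can be checked deterministically for a small case, using E(Y_i Y_j) = V_ij + µ_i µ_j, so no simulation is needed. All numbers below are arbitrary illustrations:

```python
# Deterministic check of E(Y'AY) = tr(AV) + mu'A mu for a toy 3x1 vector Y.
mu = [1.0, 2.0, 3.0]
V = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.3],
     [0.0, 0.3, 1.5]]          # a symmetric covariance matrix
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 2.0]]

n = 3
# Left side: sum_ij A_ij E(Y_i Y_j) = sum_ij A_ij (V_ij + mu_i mu_j)
lhs = sum(A[i][j] * (V[i][j] + mu[i] * mu[j])
          for i in range(n) for j in range(n))
# Right side: tr(AV) + mu'A mu
tr_AV = sum(sum(A[i][k] * V[k][i] for k in range(n)) for i in range(n))
muAmu = sum(mu[i] * A[i][j] * mu[j] for i in range(n) for j in range(n))
rhs = tr_AV + muAmu
```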
The expected mean squares under the fixed and random models are given in the following table

Source      df      SSq           MSq (s²)               E[MSq] (fixed)   E[MSq] (random)
Units       n − 1   Y′Q_U Y
Treatments  t − 1   Y′Q_T Y       Y′Q_T Y/(t − 1)        σ² + q_T(Ψ)      σ² + rσ²_T
Residual    n − t   Y′Q_URes Y    Y′Q_URes Y/(n − t)     σ²               σ²

where q_T(Ψ) = Ψ′Q_TΨ/(t − 1) = Σ_{i=1}^t r(τ_i − τ̄.)²/(t − 1)

M_U = I_t ⊗ I_r = I_tr,   M_G = n⁻¹ J_t ⊗ J_r = n⁻¹ J_n

Q_T = M_T − M_G,   Q_U = M_U − M_G,   Q_URes = M_U − M_T

σ² and σ²_T are called components of variance

ANOVA estimators:

σ̂² = RMSq,   σ̂²_T = (TMSq − RMSq)/r
Maize trial
ANOVA table using R
Source     df   SSq      MSq      F        Prob
Plots      19
Progeny    4    5.5078   1.3770   4.2872   0.01644 *
Residual   15   4.8177   0.3212

        MM       REML     ML
σ²_P    0.2639   0.2639   0.1951
σ²      0.3212   0.3212   0.3212
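The moment (ANOVA) estimates in the table can be recomputed directly from the raw maize yields (t = 5 progenies, r = 4 replicates, CRD):

```python
# Moment (ANOVA) estimates for the maize trial from the raw yields.
yields = {
    1: [5.95, 6.21, 5.40, 5.18],
    2: [5.07, 6.71, 5.46, 4.98],
    3: [4.82, 5.11, 4.68, 4.52],
    4: [3.87, 4.16, 4.11, 4.84],
    5: [5.53, 5.82, 4.29, 4.70],
}
t, r = 5, 4
n = t * r
all_y = [y for ys in yields.values() for y in ys]
grand_mean = sum(all_y) / n
means = {i: sum(ys) / r for i, ys in yields.items()}

# Between-progeny and residual sums of squares
ssq_t = r * sum((m - grand_mean) ** 2 for m in means.values())
ssq_res = sum((y - means[i]) ** 2 for i, ys in yields.items() for y in ys)
msq_t = ssq_t / (t - 1)          # ≈ 1.3770
msq_res = ssq_res / (n - t)      # ≈ 0.3212

sigma2 = msq_res                 # sigma^2-hat  = RMSq            ≈ 0.3212
sigma2_p = (msq_t - msq_res) / r # sigma^2_P-hat = (TMSq - RMSq)/r ≈ 0.2639
```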
τ̂_i (BLUE)   τ̃_i (BLUP)   µ̂_i (BLUE)   µ̃_i (BLUP)
0.6145       0.4712       5.6850       5.5417
0.4845       0.3715       5.5550       5.4420
−0.2880      −0.2208      4.7825       4.8497
−0.8255      −0.6329      4.2450       4.4375
0.0145       0.0111       5.0850       5.0816

Var(τ̂_i) (BLUE) = Var(µ̂_i) (BLUE) = 0.0803

Var(τ̃_i) (BLUP) = Var(µ̃_i) (BLUP) = 0.0616
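The BLUP column can be reproduced from the shrinkage form: each BLUE deviation τ̂_i is pulled toward zero by the factor σ²_P/(σ²_P + σ²/r). The variance-component values used below are the moment estimates for the maize trial:

```python
# Shrinkage form of the (E)BLUPs for the maize trial.
sigma2_p, sigma2, r = 0.26394, 0.32118, 4   # moment estimates, r replicates
grand_mean = 5.0705

shrink = sigma2_p / (sigma2_p + sigma2 / r)   # shrinkage factor
tau_blue = [0.6145, 0.4845, -0.2880, -0.8255, 0.0145]
tau_blup = [shrink * tau for tau in tau_blue]
mu_blup = [grand_mean + tau for tau in tau_blup]

# Var of the BLUP of tau_i: sigma2_P * sigma2 / (r sigma2_P + sigma2)
var_tau_blup = sigma2_p * sigma2 / (r * sigma2_p + sigma2)   # ≈ 0.0616
```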
[Figure: adjusted (BLUP) means plotted against unadjusted means, both on the scale 4.0 to 6.0, with the identity line.]
SAS program
data prog;
input Progeny Yield @@;
cards;
1 5.95 3 4.68
1 6.21 3 4.52
1 5.40 4 3.87
1 5.18 4 4.16
2 5.07 4 4.11
2 6.71 4 4.84
2 5.46 5 5.53
2 4.98 5 5.82
3 4.82 5 4.29
3 5.11 5 4.70
;
* Moment Method;
proc glm data=prog;
class Progeny;
model Yield = Progeny;
run;
* Restricted Maximum Likelihood Method;
proc mixed data=prog;
class Progeny;
model Yield = / solution ddfm=sat;
random Progeny / solution ;
run;
* Maximum Likelihood Method;
proc mixed data=prog method=ML;
class Progeny;
model Yield = / solution ddfm=sat;
random Progeny / solution ;
run;
R program
CRDMaize.dat <- data.frame(Plots = factor(c(1:20)),
                           Progeny = factor(rep(c(1:5), each=4)),
                           Yield = c(5.95,6.21,5.40,5.18,5.07,
                                     6.71,5.46,4.98,4.82,5.11,
                                     4.68,4.52,3.87,4.16,4.11,
                                     4.84,5.53,5.82,4.29,4.70))
CRDMaize.dat
#attach(CRDMaize.dat)
summary(aov(Yield ~ Progeny+Error(Plots), CRDMaize.dat))
summary(aov(Yield ~ Progeny, CRDMaize.dat))
(length(levels(CRDMaize.dat$Progeny))) # number of levels of Progeny
CRDMaize.lm <- lm(Yield ~ Progeny, CRDMaize.dat)
anova(CRDMaize.lm)
(df_res <- df.residual(CRDMaize.lm))
(MSq_T <- anova(CRDMaize.lm)$"Mean Sq"[1])
(MSq_Res <- anova(CRDMaize.lm)$"Mean Sq"[2])

## estimate of sigma2 and confidence interval for sigma2
(sigma2 <- anova(CRDMaize.lm)$"Mean Sq"[2])
(summary(CRDMaize.lm)$sigma)^2
(Var_sigma2 <- 2*MSq_Res^2/17)
(lim_inf <- df_res*sigma2/qchisq(0.975,df_res))
(lim_sup <- df_res*sigma2/qchisq(0.025,df_res))
R program
## estimate of sigma2_T and confidence interval for sigma2_T
(sigma2_T <- (MSq_T-MSq_Res)/4) # Moment Method
(Var_sigma2_T <- 2/4*((MSq_T^2/(20+4)+(MSq_Res)^2/(4*(df_res+2)))))
(sd_sigma2_T <- sqrt(Var_sigma2_T))
(nu_T <- (MSq_T-MSq_Res)^2/(MSq_T^2/4+MSq_Res^2/df_res)) # by Satterthwaite
(lim_inf <- nu_T*sigma2_T/qchisq(0.975,nu_T))
(lim_sup <- nu_T*sigma2_T/qchisq(0.025,nu_T))

# estimate of mean, variance, confidence interval
(ybar <- mean(CRDMaize.dat$Yield))
(Var_ybar <- MSq_T/20)
(sd_ybar <- sqrt(Var_ybar))
(t_Var_ybar <- ybar/sd_ybar)
ybar-qt(1-0.025,4)*sd_ybar; ybar+qt(1-0.025,4)*sd_ybar

## BLUE, BLUP (step by step)
summary(lm(Yield ~ Progeny-1, CRDMaize.dat)) # shows the BLUEs for mu_i
sqrt(MSq_Res/4) # standard error of the BLUEs for mu_i
mean_T <- tapply(CRDMaize.dat$Yield, CRDMaize.dat$Progeny, mean)
tau_BLUE <- mean_T - ybar
tau_BLUP <- tau_BLUE*sigma2_T/(sigma2_T+sigma2/4)
mean_T_BLUP <- ybar+tau_BLUP
(var_tau_BLUP <- sigma2_T*sigma2/(4*sigma2_T+sigma2))
sqrt(var_tau_BLUP)
data.frame(tau_BLUE,tau_BLUP,mean_T,mean_T_BLUP)
plot(mean_T, mean_T_BLUP, pch='*', xlim=c(4,6), ylim=c(4,6),
     xlab='Unadjusted means', ylab='Adjusted means')
abline(0,1)
R program
require(nlme)
# Restricted Maximum Likelihood Method
CRDMaize.reml <- lme(Yield ~ 1, random = ~1|Progeny, CRDMaize.dat, method="REML")
summary(CRDMaize.reml)
VarCorr(CRDMaize.reml)
VarCorr(CRDMaize.reml)[1]
VarCorr(CRDMaize.reml)[2]
VarCorr(CRDMaize.reml)[3]
VarCorr(CRDMaize.reml)[4]
(summary(CRDMaize.reml)$sigma)^2
random.effects(CRDMaize.reml) ## tau EBLUP
CRDMaize.reml$coef ## mean EBLUP
coef(CRDMaize.reml)

# Maximum Likelihood Method
CRDMaize.ml <- update(CRDMaize.reml, method="ML")
#CRDMaize.ml <- lme(Yield ~ 1, random = ~1|Progeny, CRDMaize.dat, method="ML")
summary(CRDMaize.ml, corr = F)
VarCorr(CRDMaize.ml)
(summary(CRDMaize.ml)$sigma)^2
random.effects(CRDMaize.ml)
coef(CRDMaize.ml)
R program
## Restricted Maximum Likelihood Method, using library lme4
library(lme4)
CRDMaize.lmer <- lmer(Yield ~ 1 + (1|Progeny), CRDMaize.dat, REML=TRUE)
summary(CRDMaize.lmer)
summary(CRDMaize.lmer)@coefs
data.frame(summary(CRDMaize.lmer)@REmat)

# Restricted Maximum Likelihood Method using ASReml-R
require(asreml)
CRDMaize.asreml <- asreml(Yield ~ 1, random=~Progeny, data=CRDMaize.dat)
summary(CRDMaize.asreml)
summary(CRDMaize.asreml)$varcomp
Software
R functions
lm() – classical linear model
aov() – analysis of variance model
glm() – generalized linear model
gls() – generalized least squares model
gee() – generalized estimating equations (package gee)
lme() – linear mixed models (package nlme)
nlme() – non-linear mixed model (package nlme)
nls() – non-linear regression model (package nls)
lmer() – linear mixed models (package lme4)
ASReml package
support@vsni.co.uk
http://www.vsni.co.uk/products/asreml
ASReml forum: www.vsni.co.uk/forum
Cookbook: http://uncronopio.org/ASReml
Differences between lme4 and nlme
(B. Venables, 2010, personal communication)
1 With nlme the fixed and random parts of the model are specified using two formulae; in lme4 they are specified in one formula, with the random parts "added on" to the fixed parts.

2 With nlme you have no generalized linear mixed model fitter, though glmmPQL in the MASS library can be used for some GLMMs, and it uses the nlme library. lme4 has a GLMM fitter built in. It allows you to specify families in the glm sense, but not all glm families are supported yet.

3 nlme offers non-linear mixed effects models; lme4 does not and never will.

4 The nlme package allows you to specify variance heterogeneity and correlation patterns; the only way to do this within lme4 is to use a glm family, which is often not what you want to do.

5 The nlme package has a gls function for "generalized least squares". This allows you to make use of the variance heterogeneity and correlation patterns feature even if the model does not contain any random effects. This is handy.
Differences between lme4 and nlme
(B. Venables, 2010, personal communication, cont.)
6 (Probably the most important difference.) nlme is hard to use with crossed random effects, but is very well developed for nested random effects. lme4 is the opposite: it handles crossed random effects well, and using it with nested random effects is still simple enough, but a bit more work than with nlme.

7 nlme uses an older algorithm which struggles for large data sets. lme4 uses a newer algorithm and can handle quite large data sets very quickly. (I think SAS Proc mixed, though, will handle even bigger ones.)

8 lme4 is, at this stage, relatively under-developed. Some important things are missing.

9 ASREML is wonderful, but it only handles a relatively small set of models (though the most important set, of course).
(C. Brien, 2010, personal communication)
1 ASREML does a wide range of heterogeneous variances and correlations for nested and crossed random effects, although probably not the full range of heterogeneous, nested models that nlme does. ASREML also does GLMMs, similar to glmmPQL. It does not do the non-linear models.

2 ASREML is good for experiments and lme4/nlme are good for large surveys, because that is what they were developed for.
Software
SAS procedures
PROC GLM – general linear model
PROC MIXED – linear mixed model
PROC GENMOD – generalized linear model
PROC GLIMMIX – generalized linear mixed model
PROC NLMIXED – non-linear mixed model
Basic SAS code

1/ proc mixed data=variety.eval;
2/ class block type dose;
3/ model y = type|dose;
4/ random block block*dose;
5/ ods select Tests3 CovParms; run;

1/ call the procedure and declare the data set
2/ define block, type and dose as factors
3/ define the fixed effects in the model
4/ declare the random effects
5/ output the type 3 tests and the covariance parameters
1/ proc mixed statement <options>;

DATA= SAS data set. Name of the SAS data set to be used by PROC MIXED. The default is the most recently created data set.

METHOD=
REML (default method)
ML

COVTEST requests asymptotic standard errors and Wald Z-tests for the variance-covariance structure parameter estimates.
3/ MODEL statement <options>;

describes the linear relation between Y and the fixed covariates

S or SOLUTION requests output of the fixed-effects estimates

DDFM= method to compute approximate degrees of freedom:
CONTAIN (default)
RES
KR
SATTERTH

outpred=Names1: the output data set Names1 contains predicted values Xβ̂ + Zû, standard errors, ...

outpredm=Names2: the output data set Names2 contains predicted values Xβ̂, standard errors, ...

4/ RANDOM statement

random block / Solution;

↪ BLUPs and t-tests
CRD – Variance of Component of Variance Estimators

From Session 1: Let MS denote a mean square with ν df. If νMS/E(MS) ∼ χ²_ν, the variance of MS is

Var(MS) = 2E²(MS)/ν.

Hence,

Var(MS) = E(MS²) − E²(MS) = E(MS²) − (ν/2)Var(MS).

Thus (ν + 2)Var(MS)/2 = E(MS²), and an unbiased estimator of Var(MS) is given by

2MS²/(ν + 2)

Then

Var(σ̂²) = Var(MSqRes) = 2σ⁴/(n − t)

and an unbiased estimator is

V̂ar(σ̂²) = 2MSqRes²/(n − t + 2)

Maize example: σ̂² = 0.3212 and V̂ar(σ̂²) = 2 × 0.3212²/(15 + 2) = 0.0121
We saw that

σ̂²_T = (MSqT − MSqRes)/r = Σ_i (ȳ_i. − ȳ..)²/(t − 1) − s²/r

Since MSqT and MSqRes are independent, the two terms of σ̂²_T are distributed independently. Furthermore,

(t − 1)MSqT/(σ² + rσ²_T) ∼ χ²_{t−1}   and   (n − t)MSqRes/σ² ∼ χ²_{n−t}

From these results,

Var(σ̂²_T) = (1/r²)[Var(MSqT) + Var(MSqRes)] = (2/r²)[(rσ²_T + σ²)²/(t − 1) + σ⁴/(n − t)]

An unbiased estimator of this variance is given by

V̂ar(σ̂²_T) = (2/r²)[MSqT²/(t − 1 + 2) + MSqRes²/(n − t + 2)] = (2/r)[MSqT²/(n + r) + MSqRes²/(r(n − t + 2))]

Maize example: σ̂²_P = 0.2639 and

V̂ar(σ̂²_P) = (2/4²)[1.3770²/(4 + 2) + 0.3212²/(15 + 2)] = 0.0403
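The two estimated sampling variances for the maize trial follow directly from the unbiased-estimator rule V̂ar(MS) = 2MS²/(ν + 2):

```python
# Estimated sampling variances of the moment estimators for the maize trial.
msq_t, msq_res = 1.3770, 0.3212   # progeny and residual mean squares
t, r, n = 5, 4, 20

var_sigma2 = 2 * msq_res**2 / (n - t + 2)                   # ≈ 0.0121
var_sigma2_p = (2 / r**2) * (msq_t**2 / (t - 1 + 2)
                             + msq_res**2 / (n - t + 2))    # ≈ 0.0403
```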
Confidence interval for σ²

From (n − t)MSqRes/σ² ∼ χ²_{n−t} we obtain

P(χ²_{n−t;α/2} < (n − t)MSqRes/σ² < χ²_{n−t;1−α/2}) = 1 − α

or equivalently

P((n − t)MSqRes/χ²_{n−t;1−α/2} < σ² < (n − t)MSqRes/χ²_{n−t;α/2}) = 1 − α

Then a 100(1 − α)% confidence interval for σ² is

[(n − t)MSqRes/χ²_{n−t;1−α/2} ; (n − t)MSqRes/χ²_{n−t;α/2}]

Maize example: a 95% confidence interval for σ² is

[15 × 0.3212/27.4884 ; 15 × 0.3212/6.2621] = [0.1753; 0.7693]
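The maize interval can be reproduced numerically. The chi-square quantiles χ²_{15;0.975} = 27.4884 and χ²_{15;0.025} = 6.2621 are taken from the text (scipy.stats.chi2.ppf would reproduce them):

```python
# 95% confidence interval for sigma^2 in the maize trial.
df = 15
msq_res = 0.3212
q_hi, q_lo = 27.4884, 6.2621   # chi-square quantiles at 0.975 and 0.025

lower = df * msq_res / q_hi    # ≈ 0.1753
upper = df * msq_res / q_lo    # ≈ 0.7693
```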
Confidence interval for σ²_T

To get the confidence interval for σ²_T we first need to determine the number of degrees of freedom associated with σ̂²_T, by Satterthwaite's method.

As σ̂²_T = (MSqT − MSqRes)/r, from Session 1,

ν_T = (Σ_i a_i MS_i)² / Σ_i (a_i² MS_i²/ν_i) = (MSqT − MSqRes)² / [MSqT²/(t − 1) + MSqRes²/(n − t)].

Then a 100(1 − α)% confidence interval for σ²_T is

[ν_T σ̂²_T/χ²_{ν_T;1−α/2} ; ν_T σ̂²_T/χ²_{ν_T;α/2}]

Maize example:

ν_T = (1.37695 − 0.32118)²/(1.37695²/4 + 0.32118²/15) = 2.32

and

[2.32 × 0.2639/8.0308 ; 2.32 × 0.2639/0.0903] = [0.07618; 6.7714]
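The Satterthwaite degrees of freedom for the maize trial can be checked directly:

```python
# Satterthwaite approximate df for sigma^2_T-hat in the maize trial:
# nu_T = (MSqT - MSqRes)^2 / (MSqT^2/(t-1) + MSqRes^2/(n-t))
msq_t, msq_res = 1.37695, 0.32118
t, n = 5, 20

nu_t = (msq_t - msq_res)**2 / (msq_t**2 / (t - 1) + msq_res**2 / (n - t))  # ≈ 2.32
```

The very small ν_T explains the extremely wide interval [0.076, 6.77]: with only five progenies there is little information about σ²_T.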
Inference regarding the mean

It is easy to show that the sample mean Ȳ = Σ_{i,j} Y_ij/n is an unbiased estimator for µ and has variance

Var(Ȳ) = (1/t)(σ²_T + σ²/r) = (1/n)(rσ²_T + σ²)

An unbiased estimator of this variance is

V̂ar(Ȳ) = MSqT/n

The hypothesis H₀: µ = µ₀ can be tested using

t_{t−1} = (ȳ − µ₀)/√V̂ar(Ȳ)

which follows a Student's t-distribution with (t − 1) d.f. The 100(1 − α)% confidence interval for µ has limits

CI(µ): [ȳ − t_{t−1;1−α/2}√V̂ar(Ȳ) ; ȳ + t_{t−1;1−α/2}√V̂ar(Ȳ)]

Maize example: ȳ = 5.0705, V̂ar(Ȳ) = 1.3770/20 = 0.0688, t = (5.0705 − 0)/√0.0688 = 19.32 and the CI(µ): [4.34, 5.80].
Expected mean squares for an ANOVA – CRD with subsampling

The model for a CRD with subsampling (k subsamples per plot) with treatment random is

Y_ijk = µ + τ_i + ε_ij + ε_ijk,

where i = 1, . . . , t, j = 1, . . . , r, k = 1, . . . , k, µ is constant, and τ_i, ε_ij and ε_ijk are random. The ANOVA table is

Source                         df          SSq           MSq                      F
Plots                          rt − 1      Y′Q_P Y
Treatments                     t − 1       Y′Q_T Y       Y′Q_T Y/(t − 1)          MSqT/MSqRes
Residual                       t(r − 1)    Y′Q_URes Y    Y′Q_URes Y/[t(r − 1)]    MSqRes/MSqW
Between samples within plots   rt(k − 1)   Y′Q_UW Y      Y′Q_UW Y/[rt(k − 1)]

M_U = I_n,   X_G = 1_n,   M_G = X_G(X_G′X_G)⁻¹X_G′ = n⁻¹J_n

Q_T = M_T − M_G,   Q_U = M_U − M_G,   Q_URes = M_U − M_T
Then,

SSqT = Y′Q_T Y = (1/rk) Σ_{i=1}^t T_i² − C,   C = (Σ_{i,j,k} Y_ijk)²/n

SSqPlots = (1/k) Σ_{i,j} Y_ij.² − C,   SSqRes = SSqPlots − SSqT

SSqWithin = Σ_{i,j,k} Y_ijk² − C − SSqPlots

Assuming that

Y_ijk = µ + τ_i + ε_ij + ε_ijk

where τ_i ∼ N(0, σ²_T), ε_ij ∼ N(0, σ²_P) and ε_ijk ∼ N(0, σ²_PS). Then

E(τ_i) = 0,    Var(τ_i) = E(τ_i²) = σ²_T,
E(ε_ij) = 0,   Var(ε_ij) = E(ε_ij²) = σ²_P,
E(ε_ijk) = 0,  Var(ε_ijk) = E(ε_ijk²) = σ²_PS.
i) E(SSqUnits)

E(SSqUnits) = Σ_{i,j,k} E(Y_ijk²) − E(C)

E(Y_ijk²) = E(µ²) + E(τ_i²) + E(ε_ij²) + E(ε_ijk²) + E(dp) = µ² + σ²_T + σ²_P + σ²_PS

(here dp denotes the cross-product, "double product", terms, whose expectations are zero)

Σ_{i,j,k} E(Y_ijk²) = nµ² + nσ²_T + nσ²_P + nσ²_PS

C = (1/trk)(Σ_{i,j,k} Y_ijk)²
  = (1/trk)[trkµ + rk Σ_i τ_i + k Σ_{i,j} ε_ij + Σ_{i,j,k} ε_ijk]²
  = (1/trk)[(trkµ)² + (rk)²(Σ_i τ_i)² + k²(Σ_{i,j} ε_ij)² + (Σ_{i,j,k} ε_ijk)² + dp]

E(C) = trkµ² + (rk/t) E[(Σ_i τ_i)²] + (k/tr) E[(Σ_{i,j} ε_ij)²] + (1/trk) E[(Σ_{i,j,k} ε_ijk)²] + (1/trk) E(dp)
But

E[(Σ_i τ_i)²] = Σ_i E(τ_i²) + E(dp) = tσ²_T

E[(Σ_{i,j} ε_ij)²] = Σ_{i,j} E(ε_ij²) + E(dp) = trσ²_P

E[(Σ_{i,j,k} ε_ijk)²] = Σ_{i,j,k} E(ε_ijk²) + E(dp) = trkσ²_PS

E(dp) = 0

Then

E(C) = trkµ² + (rk/t)tσ²_T + (k/tr)trσ²_P + (1/trk)trkσ²_PS
     = trkµ² + rkσ²_T + kσ²_P + σ²_PS

and

E(SSqUnits) = (n − rk)σ²_T + (n − k)σ²_P + (n − 1)σ²_PS
            = rk(t − 1)σ²_T + k(tr − 1)σ²_P + (n − 1)σ²_PS
ii) E(SSqPlots)

E(SSqPlots) = (1/k) Σ_{i,j} E(P_ij²) − E(C)

P_ij = Σ_k Y_ijk = Σ_k (µ + τ_i + ε_ij + ε_ijk) = kµ + kτ_i + kε_ij + Σ_k ε_ijk

P_ij² = [Σ_k (µ + τ_i + ε_ij + ε_ijk)]² = k²µ² + k²τ_i² + k²ε_ij² + (Σ_k ε_ijk)² + dp

E(P_ij²) = k²µ² + k²E(τ_i²) + k²E(ε_ij²) + E[(Σ_k ε_ijk)²] + E(dp)
         = k²µ² + k²σ²_T + k²σ²_P + kσ²_PS

E(SSqPlots) = (1/k) Σ_{i,j} (k²µ² + k²σ²_T + k²σ²_P + kσ²_PS) − E(C)
            = (1/k)(trk²µ² + trk²σ²_T + trk²σ²_P + trkσ²_PS) − E(C)
            = trkµ² + trkσ²_T + trkσ²_P + trσ²_PS − E(C)
            = rk(t − 1)σ²_T + k(tr − 1)σ²_P + (tr − 1)σ²_PS
iii) E(SSqT) and E(MSqT)

SSqT = (1/rk) Σ_{i=1}^t T_i² − C

T_i² = (rkµ + rkτ_i + k Σ_j ε_ij + Σ_{j,k} ε_ijk)²
     = (rkµ)² + (rkτ_i)² + (k Σ_j ε_ij)² + (Σ_{j,k} ε_ijk)²
       + 2(rkµ)(rkτ_i) + 2(rkµ)(k Σ_j ε_ij) + 2(rkµ)(Σ_{j,k} ε_ijk)
       + 2(rkτ_i)(k Σ_j ε_ij) + 2(rkτ_i)(Σ_{j,k} ε_ijk) + 2(k Σ_{j=1}^r ε_ij)(Σ_{j,k} ε_ijk)

E(T_i²) = r²k²µ² + r²k²E(τ_i²) + k²E[(Σ_j ε_ij)²] + E[(Σ_{j,k} ε_ijk)²]
        = r²k²µ² + r²k²σ²_T + rk²σ²_P + rkσ²_PS
(1/rk) Σ_{i=1}^t E(T_i²) = trkµ² + trkσ²_T + tkσ²_P + tσ²_PS

E(SSqT) = (1/rk) Σ_{i=1}^t E(T_i²) − E(C) = rk(t − 1)σ²_T + k(t − 1)σ²_P + (t − 1)σ²_PS

and

E(MSqT) = E(SSqT)/(t − 1) = rkσ²_T + kσ²_P + σ²_PS

iv) E(SSqRes) and E(MSqRes)

SSqRes = SSqPlots − SSqT

E(SSqRes) = kt(r − 1)σ²_P + t(r − 1)σ²_PS   and   E(MSqRes) = E(SSqRes)/[t(r − 1)] = kσ²_P + σ²_PS

v) E(SSqWithin) and E(MSqWithin)

SSqWithin = SSqUnits − SSqPlots

E(SSqWithin) = tr(k − 1)σ²_PS   and   E(MSqWithin) = E(SSqWithin)/[tr(k − 1)] = σ²_PS
Source     df          SSq          MSq                      E[MSq] (T fixed)      E[MSq] (T random)
Plots      rt − 1      Y′Q_U Y
Treat.     t − 1       Y′Q_T Y      Y′Q_T Y/(t − 1)          σ² + kσ²_P + q_T(Ψ)   σ² + kσ²_P + rkσ²_T
Res.       t(r − 1)    Y′Q_URes Y   Y′Q_URes Y/[t(r − 1)]    σ² + kσ²_P            σ² + kσ²_P
S[Plots]   rt(k − 1)   Y′Q_UW Y     Y′Q_UW Y/[rt(k − 1)]     σ²                    σ²

where q_T(Ψ) = Ψ′Q_TΨ/(t − 1) = Σ_{i=1}^t rk(τ_i − τ̄.)²/(t − 1)
Wood shearing strength
Example
The effects of six treatments (a 2 × 3 set of factorial treatments: two types of resin and three wood blade densities) on the shearing strength are to be investigated. The two types of resin were APM (resin of high molecular weight) and BPM (resin of low molecular weight), and the three wood blade densities were VH (Very Hard), H (Hard) and S (Soft). The trial was conducted using a completely randomized design with three wood panels from each treatment, and the shearing strength (kgf/cm²) of five test bodies from each panel was measured.

interest, of course, in each particular treatment used

no interest in each panel, which depends strongly on the circumstances

no interest in each test body, which depends strongly on the circumstances

interest in estimating the variance of the panel effect as a source of random variation in the data

the five test bodies from the same panel share something, which presumably violates the assumption of independence
Wood    Test           APM                            BPM
Blade   Body   VH       H        S        VH      H        S
1       1      10.620   6.251    9.982    18.23   9.553    11.390
1       2      16.840   7.825    12.510   20.24   10.140   12.630
1       3      11.120   8.606    13.650   18.92   10.900   10.650
1       4      7.407    9.421    10.020   22.67   9.762    9.652
1       5      14.400   6.405    7.154    24.92   10.250   6.306
2       1      21.890   21.580   12.620   21.16   17.630   14.370
2       2      20.770   20.060   12.990   10.82   20.700   13.060
2       3      18.670   15.830   12.430   20.36   16.080   12.130
2       4      16.160   16.120   14.250   20.33   14.770   14.470
2       5      18.780   16.200   11.820   14.16   19.270   13.750
3       1      18.710   13.550   7.385    14.40   11.400   12.390
3       2      23.460   17.070   6.075    15.85   15.860   12.370
3       3      16.650   14.210   12.890   18.37   11.270   13.760
3       4      20.820   17.920   12.220   13.95   13.370   11.380
3       5      16.240   18.670   7.781    16.04   13.920   13.210
Source                              df   SSq     MSq      F
Panels                              17
Treatments                          5    556.5   111.30   1.77
Residual                            12   754.0   62.83
Between test bodies within panels   72   428.5   5.95

ANOVA estimators:

σ̂²_PS = MSqWithin = 5.95,   σ̂²_P = (MSqRes − MSqWithin)/k = (62.83 − 5.95)/5 = 11.38

        MM      REML    ML
σ²_P    11.38   11.38   7.19
σ²_PS   5.95    5.95    5.95
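The moment estimates for the wood example follow directly from the mean squares in the table (t = 6 treatments, r = 3 panels per treatment, k = 5 test bodies per panel):

```python
# Moment (ANOVA) estimates for the wood shearing-strength example.
msq_treat, msq_res, msq_within = 111.30, 62.83, 5.95
k = 5   # test bodies per panel

sigma2_ps = msq_within                    # within-panel variance      = 5.95
sigma2_p = (msq_res - msq_within) / k     # between-panel variance     ≈ 11.38
f_treat = msq_treat / msq_res             # treatments tested against panels ≈ 1.77
```

Note that the treatment F ratio uses the panel (residual) mean square, not the within-panel mean square, as its denominator.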
R program
CRDk_wood.dat <- read.csv2("Wood_CRD.csv", h=T)
CRDk_wood.dat$Panels <- factor(rep(1:18, each=5))
CRDk_wood.dat$Treat <- gl(6,15)
CRDk_wood.dat$Panel <- factor(CRDk_wood.dat$Panel)
CRDk_wood.dat$TB <- factor(CRDk_wood.dat$TB)

summary(aov(Strength ~ Treat+Error(Panels/TB), CRDk_wood.dat))
(summary(CRDk_wood.aov)[[1]])[[1]]$"Mean Sq"[2]
(summary(CRDk_wood.aov)[[2]])[[1]]$"Mean Sq"[1]
CRDk_wood.lm <- lm(Strength ~ Treat+Panels/TB, CRDk_wood.dat)
(MSq_Res <- anova(CRDk_wood.lm)$"Mean Sq"[2])
(MSq_Within <- anova(CRDk_wood.lm)$"Mean Sq"[3])

## components of variance - ANOVA method
(sigma2_PS <- MSq_Within)
(sigma2_Res <- (MSq_Res-MSq_Within)/5)

## REML using library nlme
library(nlme)
CRDk_wood.lme <- lme(Strength ~ Treat, random = ~ 1|Panels,
                     data=CRDk_wood.dat, method="REML")
VarCorr(CRDk_wood.lme)
summary(CRDk_wood.lme)
## ML
summary(lme(Strength ~ Treat, random = ~ 1|Panels,
            data=CRDk_wood.dat, method="ML"))
2.680866^2; 2.439458^2
VarCorr(lme(Strength ~ Treat, random = ~ 1|Panels,
            data=CRDk_wood.dat, method="ML"))

## REML using library lme4
library(lme4)
CRDk_wood.lmer <- lmer(Strength ~ Treat + (1|Panels),
                       data=CRDk_wood.dat, REML=TRUE)
summary(CRDk_wood.lmer)
summary(CRDk_wood.lmer)@coefs
data.frame(summary(CRDk_wood.lmer)@REmat)
## ML
lmer(Strength ~ Treat + (1|Panels), data=CRDk_wood.dat, REML=FALSE)
Randomized Complete Block Design (RCBD)
Consider a randomized complete block design,
Yjk = µ+ βj + τk + εjk , j = 1, . . . , r , k = 1, . . . , t
1. Fixed model with β_j and τ_k as fixed effects and ε_jk random

β_j fixed ⇒ E(β_j) = β_j, E(β_j²) = β_j² and Var(β_j) = 0

τ_k fixed ⇒ E(τ_k) = τ_k, E(τ_k²) = τ_k² and Var(τ_k) = 0

ε_jk ∼ N(0, σ²) ⇒ E(ε_jk) = 0 and Var(ε_jk) = E(ε_jk²) = σ²

ε_jk and ε_j′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Y_jk) = Var(µ + β_j + τ_k + ε_jk) = Var(ε_jk) = σ²

Cov(Y_jk, Y_jk′) = Cov(ε_jk, ε_jk′) = 0 (observations from the same block and different treatments)

Cov(Y_jk, Y_j′k) = Cov(ε_jk, ε_j′k) = 0 (observations from different blocks and the same treatment)

Cov(Y_jk, Y_j′k′) = Cov(ε_jk, ε_j′k′) = 0 (observations from different blocks and different treatments)
2. Mixed model with β_j as random effect, τ_k as fixed effect and ε_jk random

β_j ∼ N(0, σ²_B) ⇒ E(β_j) = 0 and Var(β_j) = E(β_j²) = σ²_B

τ_k fixed ⇒ E(τ_k) = τ_k, E(τ_k²) = τ_k² and Var(τ_k) = 0

ε_jk ∼ N(0, σ²) ⇒ E(ε_jk) = 0 and Var(ε_jk) = E(ε_jk²) = σ²

β_j and ε_jk; β_j and β_j′ (j ≠ j′); ε_jk and ε_j′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Y_jk) = Var(µ + β_j + τ_k + ε_jk) = Var(β_j + ε_jk) = σ² + σ²_B

Cov(Y_jk, Y_jk′) = Cov(β_j + ε_jk, β_j + ε_jk′) = σ²_B (observations from the same block and different treatments)

Cov(Y_jk, Y_j′k) = Cov(β_j + ε_jk, β_j′ + ε_j′k) = 0 (observations from different blocks and the same treatment)

Cov(Y_jk, Y_j′k′) = Cov(β_j + ε_jk, β_j′ + ε_j′k′) = 0 (observations from different blocks and different treatments)
3. Mixed model with β_j as fixed effect, τ_k as random effect and ε_jk random

β_j fixed ⇒ E(β_j) = β_j, E(β_j²) = β_j² and Var(β_j) = 0

τ_k ∼ N(0, σ²_T) ⇒ E(τ_k) = 0 and Var(τ_k) = E(τ_k²) = σ²_T

ε_jk ∼ N(0, σ²) ⇒ E(ε_jk) = 0 and Var(ε_jk) = E(ε_jk²) = σ²

τ_k and ε_jk; τ_k and τ_k′ (k ≠ k′); ε_jk and ε_j′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Y_jk) = Var(µ + β_j + τ_k + ε_jk) = Var(τ_k + ε_jk) = σ² + σ²_T

Cov(Y_jk, Y_jk′) = Cov(τ_k + ε_jk, τ_k′ + ε_jk′) = 0 (observations from the same block and different treatments)

Cov(Y_jk, Y_j′k) = Cov(τ_k + ε_jk, τ_k + ε_j′k) = σ²_T (observations from different blocks and the same treatment)

Cov(Y_jk, Y_j′k′) = Cov(τ_k + ε_jk, τ_k′ + ε_j′k′) = 0 (observations from different blocks and different treatments)
4. Random model with βj as random effect, τk as random effect and εjk as random

βj ∼ N(0, σ²_B) ⇒ E(βj) = 0 and Var(βj) = E(βj²) = σ²_B
τk ∼ N(0, σ²_T) ⇒ E(τk) = 0 and Var(τk) = E(τk²) = σ²_T
εjk ∼ N(0, σ²) ⇒ E(εjk) = 0 and Var(εjk) = E(εjk²) = σ²
τk and εjk; τk and τk′ (k ≠ k′); βj and βj′ (j ≠ j′); εjk and εj′k′ (j ≠ j′ and/or k ≠ k′) are independent

Then

Var(Yjk) = Var(µ + βj + τk + εjk) = Var(βj + τk + εjk) = σ² + σ²_B + σ²_T
Cov(Yjk, Yjk′) = Cov(βj + τk + εjk, βj + τk′ + εjk′) = σ²_B (observations from the same block and different treatments)
Cov(Yjk, Yj′k) = Cov(βj + τk + εjk, βj′ + τk + εj′k) = σ²_T (observations from different blocks and the same treatment)
Cov(Yjk, Yj′k′) = Cov(βj + τk + εjk, βj′ + τk′ + εj′k′) = 0 (observations from different blocks and different treatments)
Variance-covariance matrix - Example

Suppose a RCBD with r = 2, t = 3, that is

y = [y11 y12 y13 y21 y22 y23]′

1. Fixed model with βj and τk as fixed effects and εjk random. In this case, in matrix notation:

Y = XGµ + XBβ + XTτ + ε

E(Y) = XGµ + XBβ + XTτ and Var(Y) = Σ = σ²I_6

XG = 1_6,  XB = [1_3 0_3; 0_3 1_3],  XT = [I_3; I_3]

Var(Y) = σ²I_6, the 6×6 matrix with σ² on the diagonal and 0 elsewhere.
2. Mixed model with βj as random effect, τk as fixed effect and εjk as random. In this case, in matrix notation:

Y = XGµ + XTτ + ZBβ + ε

E(Y) = XGµ + XTτ and Var(Y) = ZGZ′ + Σ = σ²_B ZBZ′B + σ²I_6

XG = 1_6,  XT = [I_3; I_3],  ZB = [1_3 0_3; 0_3 1_3],  G = σ²_B I_2,  Σ = σ²I_6

Var(Y) is block diagonal with two 3×3 blocks σ²I_3 + σ²_B J_3:

[σ²+σ²_B   σ²_B      σ²_B      0         0         0       ]
[σ²_B      σ²+σ²_B   σ²_B      0         0         0       ]
[σ²_B      σ²_B      σ²+σ²_B   0         0         0       ]
[0         0         0         σ²+σ²_B   σ²_B      σ²_B    ]
[0         0         0         σ²_B      σ²+σ²_B   σ²_B    ]
[0         0         0         σ²_B      σ²_B      σ²+σ²_B ]
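These covariance matrices can be assembled directly from the design matrices. The sketch below does this in Python with NumPy (the document's own examples are in R); σ²_B and σ² are set to illustrative placeholder values.

```python
import numpy as np

# Build Var(Y) = s2B * ZB ZB' + s2 * I6 for the RCBD with r = 2 blocks and
# t = 3 treatments, using a Kronecker product: ZB = I2 (x) 1_3.
# s2B and s2 are illustrative values, not estimates from the slides.
s2B, s2 = 2.0, 1.0
r, t = 2, 3
ZB = np.kron(np.eye(r), np.ones((t, 1)))      # 6x2 block-indicator matrix
V = s2B * ZB @ ZB.T + s2 * np.eye(r * t)

# V is block diagonal: s2 + s2B on the diagonal, s2B within a block,
# 0 between blocks -- the structure shown on the slide.
print(V)
```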
3. Mixed model with βj as fixed effect, τk as random effect and εjk as random. In this case, in matrix notation:

Y = XGµ + XBβ + ZTτ + ε

E(Y) = XGµ + XBβ and Var(Y) = ZGZ′ + Σ = σ²_T ZTZ′T + σ²I_6

XG = 1_6,  XB = [1_3 0_3; 0_3 1_3],  ZT = [I_3; I_3],  G = σ²_T I_3,  Σ = σ²I_6

Var(Y) = σ²I_6 + σ²_T (J_2 ⊗ I_3):

[σ²+σ²_T   0         0         σ²_T      0         0       ]
[0         σ²+σ²_T   0         0         σ²_T      0       ]
[0         0         σ²+σ²_T   0         0         σ²_T    ]
[σ²_T      0         0         σ²+σ²_T   0         0       ]
[0         σ²_T      0         0         σ²+σ²_T   0       ]
[0         0         σ²_T      0         0         σ²+σ²_T ]
4. Random model with βj as random effect, τk as random effect and εjk as random. In this case, in matrix notation:

Y = XGµ + ZBβ + ZTτ + ε

E(Y) = XGµ and Var(Y) = ZGZ′ + Σ = σ²_B ZBZ′B + σ²_T ZTZ′T + σ²I_6

XG = 1_6,  ZB = [1_3 0_3; 0_3 1_3],  ZT = [I_3; I_3],  G = diag(σ²_B I_2, σ²_T I_3),  Σ = σ²I_6

Var(Y) =

[σ²+σ²_B+σ²_T   σ²_B           σ²_B           σ²_T           0              0            ]
[σ²_B           σ²+σ²_B+σ²_T   σ²_B           0              σ²_T           0            ]
[σ²_B           σ²_B           σ²+σ²_B+σ²_T   0              0              σ²_T         ]
[σ²_T           0              0              σ²+σ²_B+σ²_T   σ²_B           σ²_B         ]
[0              σ²_T           0              σ²_B           σ²+σ²_B+σ²_T   σ²_B         ]
[0              0              σ²_T           σ²_B           σ²_B           σ²+σ²_B+σ²_T ]
                                                  E(MSq)
Source           df           SSq         Fixed          B random     T random     Random
Blocks           r−1          Y′QBY       σ²+qB(Ψ)       σ²+tσ²_B     σ²+qB(Ψ)     σ²+tσ²_B
Units[Blocks]    r(t−1)       Y′QUY
 Treatments      t−1          Y′QTY       σ²+qT(Ψ)       σ²+qT(Ψ)     σ²+rσ²_T     σ²+rσ²_T
 Residual        (r−1)(t−1)   Y′QUResY    σ²             σ²           σ²           σ²

where

XG = 1_n,  XB = I_r ⊗ 1_t,  XT = 1_r ⊗ I_t
MU = I_r ⊗ I_t = I_rt,  MG = n⁻¹ J_r ⊗ J_t = n⁻¹ J_n
MB = XB(X′B XB)⁻¹X′B = t⁻¹ I_r ⊗ J_t
MT = XT(X′T XT)⁻¹X′T = r⁻¹ J_r ⊗ I_t
QB = MB − MG,  QT = MT − MG,  QU = MU − MG
QURes = MU − MB − MT + MG

qB(Ψ) = Ψ′QBΨ/(r−1) = Σⱼ t(βj − β̄.)²/(r−1)

qT(Ψ) = Ψ′QTΨ/(t−1) = Σₖ r(τk − τ̄.)²/(t−1)
Expected mean squares for an ANOVA – RCBD

Assuming that

Yjk = µ + βj + τk + εjk

where τk is a fixed effect, βj ∼ N(0, σ²_B) and εjk ∼ N(0, σ²). Then

E(τk) = τk,  E(τk²) = τk²,  Var(τk) = 0
E(βj) = 0,  Var(βj) = E(βj²) = σ²_B
E(εjk) = 0,  Var(εjk) = E(εjk²) = σ²

i) E(SSqB) and E(MSqB)

SSqB = (1/t) Σⱼ Bj² − C,  C = (1/(rt)) (Σⱼₖ Yjk)²

Bj = Σₖ Yjk = Σₖ (µ + βj + τk + εjk) = tµ + tβj + Σₖ τk + Σₖ εjk
Bj² = (tµ + tβj + Σₖ τk + Σₖ εjk)²
    = t²µ² + t²βj² + (Σₖ τk)² + (Σₖ εjk)² + 2t²µβj + 2tµ Σₖ τk + 2tµ Σₖ εjk + 2tβj Σₖ τk + 2tβj Σₖ εjk + 2 (Σₖ τk)(Σₖ εjk)

E(Bj²) = t²µ² + t²E(βj²) + (Σₖ τk)² + E(Σₖ εjk)² + 2tµ Σₖ τk
       = t²µ² + t²σ²_B + (Σₖ τk)² + tσ² + 2tµ Σₖ τk

since the remaining cross terms have expectation zero and E(Σₖ εjk)² = tσ². Summing over the r blocks,

(1/t) Σⱼ E(Bj²) = (1/t) [rt²µ² + rt²σ²_B + r(Σₖ τk)² + rtσ² + 2rtµ Σₖ τk]
                = rtµ² + rtσ²_B + (r/t)(Σₖ τk)² + rσ² + 2rµ Σₖ τk
Σⱼₖ Yjk = Σⱼₖ (µ + βj + τk + εjk) = rtµ + t Σⱼ βj + r Σₖ τk + Σⱼₖ εjk

(Σⱼₖ Yjk)² = r²t²µ² + t²(Σⱼ βj)² + r²(Σₖ τk)² + (Σⱼₖ εjk)² + 2rt²µ Σⱼ βj + 2r²tµ Σₖ τk + 2rtµ Σⱼₖ εjk + 2rt (Σⱼ βj)(Σₖ τk) + 2t (Σⱼ βj)(Σⱼₖ εjk) + 2r (Σₖ τk)(Σⱼₖ εjk)

E[(1/(rt)) (Σⱼₖ Yjk)²] = (1/(rt)) [r²t²µ² + rt²σ²_B + r²(Σₖ τk)² + rtσ² + 2r²tµ Σₖ τk]

E(C) = rtµ² + tσ²_B + (r/t)(Σₖ τk)² + σ² + 2rµ Σₖ τk

E(SSqB) = (1/t) Σⱼ E(Bj²) − E(C) = t(r−1)σ²_B + (r−1)σ²  and  E(MSqB) = E(SSqB)/(r−1) = tσ²_B + σ²
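The result E(MSqB) = tσ²_B + σ² can also be confirmed by simulation. A sketch in Python with NumPy (the document's own code is R); r, t, µ, τ and the variance components are illustrative choices.

```python
import numpy as np

# Simulation check of E(MSqB) = t*s2B + s2 for an RCBD with random blocks.
# All parameter values here are illustrative, not taken from the slides.
rng = np.random.default_rng(1)
r, t = 4, 5
mu, s2B, s2 = 50.0, 3.0, 2.0
tau = np.linspace(-2, 2, t)                   # fixed treatment effects

n = 20_000                                     # number of simulated experiments
b = rng.normal(0, np.sqrt(s2B), (n, r, 1))     # random block effects
e = rng.normal(0, np.sqrt(s2), (n, r, t))      # plot errors
Y = mu + b + tau + e                           # Y[experiment, block, treatment]

B = Y.sum(axis=2)                              # block totals B_j
C = Y.sum(axis=(1, 2)) ** 2 / (r * t)          # correction term
MSqB = ((B ** 2).sum(axis=1) / t - C) / (r - 1)
print(MSqB.mean())                             # should be close to t*s2B + s2 = 17
```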
ii) E(SSqT) and E(MSqT)

SSqT = (1/r) Σₖ Tk² − C

Tk = Σⱼ (µ + βj + τk + εjk) = rµ + Σⱼ βj + rτk + Σⱼ εjk

Tk² = r²µ² + (Σⱼ βj)² + r²τk² + (Σⱼ εjk)² + 2rµ Σⱼ βj + 2r²µτk + 2rµ Σⱼ εjk + 2rτk Σⱼ βj + 2 (Σⱼ βj)(Σⱼ εjk) + 2rτk Σⱼ εjk

E(Tk²) = r²µ² + rσ²_B + r²τk² + rσ² + 2r²µτk

since the remaining cross terms have expectation zero, and

(1/r) Σₖ E(Tk²) = trµ² + tσ²_B + r Σₖ τk² + tσ² + 2rµ Σₖ τk
E(SSqT) = (1/r) Σₖ E(Tk²) − E(C) = (t−1)σ² + r Σₖ τk² − (r/t)(Σₖ τk)²

and

E(MSqT) = E(SSqT)/(t−1) = σ² + [r/(t−1)] [Σₖ τk² − (1/t)(Σₖ τk)²] = σ² + qT(Ψ)

where qT(Ψ) = Ψ′QTΨ/(t−1) = Σₖ r(τk − τ̄.)²/(t−1)
iii) E(SSqRes) and E(MSqRes)

E(SSqRes) = E(SSqTotal) − E(SSqT) − E(SSqB) = Σⱼₖ E(Yjk²) − (1/r) Σₖ E(Tk²) − E(SSqB)

Yjk² = (µ + βj + τk + εjk)²
     = µ² + βj² + τk² + εjk² + 2µβj + 2µτk + 2µεjk + 2βjτk + 2βjεjk + 2τkεjk

E(Yjk²) = µ² + σ²_B + τk² + σ² + 2µτk

Σⱼₖ E(Yjk²) = rtµ² + rtσ²_B + r Σₖ τk² + rtσ² + 2rµ Σₖ τk

E(SSqRes) = (t−1)(r−1)σ²

and

E(MSqRes) = E(SSqRes)/[(t−1)(r−1)] = σ²
Penicillin yield (Brien, 2009)

Example

The effects of four treatments on the yield of penicillin are to be investigated. It is known that corn steep liquor, an important raw material in producing penicillin, is highly variable from one blending of it to another. To ensure that the results of the experiment apply to more than one blend, five blends (blocks) are to be used in the experiment. The trial was conducted using the same blend in four flasks and randomizing the four treatments to these four flasks.

- interest, of course, in each particular treatment used
- no interest in each blend, which depends strongly on the circumstances
- the blend effect can be viewed as a sample from a random blend effect (levels are chosen at random from an infinite set of blend levels)
- interest in estimating the variance of the blend effect as a source of random variation in the data
- the four flasks with the same blend share something, which presumably violates the assumption of independence
[Diagram: Blends 1 to 5, each comprising Flasks 1 to 4]

              Treatment
Blend     A    B    C    D
1        89   88   97   94
2        84   77   92   79
3        81   87   87   85
4        87   92   89   84
5        79   81   80   88
Penicillin yield

ANOVA table using R

Source           df   SSq     MSq    F      Prob
Blend             4   264.0   66.0   1.97   0.15
Plots[Blocks]    15
 Treat            3    70.0   23.3   1.24   0.34
 Residual        12   226.0   18.8

ANOVA estimators:

σ̂² = MSqRes = 18.8,  σ̂²_B = (MSqB − MSqRes)/t = (66.0 − 18.8)/4 = 11.8

         MM      REML    ML
σ²_B    11.8    11.8    9.4
σ²      18.8    18.8   15.1
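The moment (ANOVA) estimators can be recomputed directly from the penicillin data table. The document's own code is in R and SAS; this is a short Python/NumPy sketch of the same arithmetic, purely for illustration.

```python
import numpy as np

# Reproduce the penicillin sums of squares and the moment estimators
# from the data table (5 blends x 4 treatments).
Y = np.array([[89, 88, 97, 94],
              [84, 77, 92, 79],
              [81, 87, 87, 85],
              [87, 92, 89, 84],
              [79, 81, 80, 88]], dtype=float)
r, t = Y.shape                                 # r = 5 blends, t = 4 treatments
C = Y.sum() ** 2 / (r * t)                     # correction term
SSqB = (Y.sum(axis=1) ** 2).sum() / t - C      # blends
SSqT = (Y.sum(axis=0) ** 2).sum() / r - C      # treatments
SSqRes = (Y ** 2).sum() - C - SSqB - SSqT

MSqB = SSqB / (r - 1)
MSqRes = SSqRes / ((r - 1) * (t - 1))
sigma2 = MSqRes                                # ANOVA estimate of sigma^2
sigma2_B = (MSqB - MSqRes) / t                 # ANOVA estimate of sigma^2_B
print(SSqB, SSqT, SSqRes, sigma2, sigma2_B)
```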
SAS program

data pen;
input Blend Treat$ Yield @@;
cards;
1 A 89 3 C 87
1 B 88 3 D 85
1 C 97 4 A 87
1 D 94 4 B 92
2 A 84 4 C 89
2 B 77 4 D 84
2 C 92 5 A 79
2 D 79 5 B 81
3 A 81 5 C 80
3 B 87 5 D 88
;
* Moment Method;
proc glm data=pen;
class Blend Treat;
model Yield = Blend Treat;
run;
* Restricted Maximum Likelihood Method;
proc mixed data=pen;
class Blend Treat;
model Yield = Treat / solution ddfm=sat;
random Blend / solution ;
run;
* Maximum Likelihood Method;
proc mixed data=pen method=ML;
class Blend Treat;
model Yield = Treat / solution ddfm=sat;
random Blend / solution ;
run;
R program

#set up data.frame with factors Flask, Blend and Treat and response variable Yield
RCBDPen.dat <- data.frame(Blend = factor(rep(c(1,2,3,4,5), times=c(4,4,4,4,4))),
                          Flask = factor(rep(c(1,2,3,4), times=5)),
                          Treat = factor(rep(c("A","B","C","D"), times=5)))
RCBDPen.dat$Yield <- c(89,88,97,94,84,77,92,79,81,87,87,85,87,92,89,84,79,81,80,88)
RCBDPen.dat
# Moment Method
RCBDPen.lm <- lm(Yield ~ Blend + Treat, RCBDPen.dat)
anova(RCBDPen.lm)
(66.000 - 18.833)/4
(anova(RCBDPen.lm)$"Mean Sq"[1] - anova(RCBDPen.lm)$"Mean Sq"[3])/
  (length(levels(RCBDPen.dat$Blend)) - 1)
anova(lm(Yield ~ 1, RCBDPen.dat))  # to get the Total SS
require(nlme)
# Restricted Maximum Likelihood Method
RCBD.reml <- lme(Yield ~ Treat, random = ~1|Blend, RCBDPen.dat, method="REML")
summary(RCBD.reml, corr = FALSE)
VarCorr(RCBD.reml)
# Maximum Likelihood Method
RCBD.ml <- lme(Yield ~ Treat, random = ~1|Blend, RCBDPen.dat, method="ML")
summary(RCBD.ml, corr = FALSE)
VarCorr(RCBD.ml)
# Restricted Maximum Likelihood Method using ASReml-R
require(asreml)
RCBD.asreml <- asreml(Yield ~ Treat, random = ~Blend, data = RCBDPen.dat)
summary(RCBD.asreml)
summary(RCBD.asreml)$varcomp
[Decomposition diagram: sources, numbers of levels and degrees of freedom]

Unrandomized tier:  Mean (1 level, 1 df); Blend (5 levels, 4 df), B; Blend∧Flask (20 levels, 15 df), F[B]
Randomized tier:    Mean (1 level, 1 df); Treat (4 levels, 3 df), T
[Decomposition diagram: the corresponding projection operators]

Unrandomized tier:  Mean: MG; Blend: MB − MG (B); Blend∧Flask: MBF − MB (F[B])
Randomized tier:    Mean: MG; Treat: MT − MG (T)
[Decomposition diagram: contributions to the expected mean squares]

Unrandomized tier:  Blend: σ² + 4σ²_B (B); Blend∧Flask: σ² (F[B])
Randomized tier:    Treat: qT(Ψ) (T)
Randomized Complete Block Design (RCBD) with subsampling

- In breeding experiments, when families or clones are evaluated, the measurements are made at the individual's level within the plots in order to estimate the variability within the plot.
- For clone evaluation (same genes), as in sugar-cane, potato or manioc, the variation within the plot will be due only to the environment.
- The same is true for homozygous lines.
- For segregant families of plants, the phenotypic variation within the plots is due to two components, one genetic and another environmental; that is, the phenotypic variance within the plots (σ²_W) equals the environmental variance within the plots (σ²_E) plus the genetic variance within the families (σ²_G).
- This type of information allows the geneticist to obtain estimates of genetic parameters such as heritability and the expected gain with selection.
Eucalyptus data (Ramalho et al., 2013)

Example

For the evaluation of progenies of Eucalyptus camaldulensis a RCBD was performed using 10 progenies as treatments and three blocks. The response variable was wood volume (m³ × 10⁻⁴) of six trees per plot.

Progeny |        Block I           |        Block II          |        Block III         | Means
      1 |  55  96 212 289 140 142 | 218 162 106 124 119 155 | 105  38 124 119  59  58 | 128.94
      2 | 124 230 108 111  46 111 | 146 138 194 236 214 116 | 218 207  63 146 212 192 | 156.22
      3 |  42 134 229 246 166 175 | 181 262 150 290 112 133 | 239 195 195 146 169 117 | 176.72
      4 |  99  75 175  64 106 192 | 388 207 339 256 124 282 | 320 356  77 367 273 160 | 214.44
      5 | 201 131  33 236 195 273 | 124 225 206 147 281  90 | 155 368 285 210 142 111 | 189.61
      6 | 109 131 124 256 110  94 | 223 108  69  59 129  54 | 119  92 218 106  70 257 | 129.33
      7 | 138  62  27 132 138 100 | 214 290  60 175 106  80 | 131  34 166  51  49  24 | 109.83
      8 |  37  70  38 157  84 142 | 181  48 194 134 108  91 |  81 250  91 295 175  30 | 122.56
      9 | 126 106 210  61 190  86 | 126  98  41  29 274  54 | 168 210 256  90 106 142 | 131.83
     10 | 104 136 140 137 111 358 |  48  62  68  68  20 157 |  67 134 157 108  33 194 | 116.78

Note that the number of possible descendants of every plant is enormous and in general only a sample of them is evaluated (a random effect).

There are three possible types of analysis: at the individual level, at the mean level or at the total level.
Randomized Complete Block Design (RCBD) with subsampling

ANOVA at individual level. Consider a randomized complete block design with subsampling

Yijk = µ + βi + τj + εij + εijk,  i = 1,…,r, j = 1,…,t, k = 1,…,s

where µ is a constant, βi is the effect of the i-th block (fixed), τj is the effect of the j-th treatment (random), εij is the experimental error at plot level and εijk is the effect of individual (e.g. plant) k within plot ij. The ANOVA table is

Source                  df           SSq        MSq                     E(MSq)                   F
Blocks                  r−1          Y′QBY                              σ²_W + sσ²_e + qB(Ψ)
Plots[Blocks]           r(t−1)       Y′QPY
 Treatments             t−1          Y′QTY      Y′QTY/(t−1)             σ²_W + sσ²_e + rsσ²_T    MSqT/MSqRes
 Residual               (r−1)(t−1)   Y′QResY    Y′QResY/[(r−1)(t−1)]    σ²_W + sσ²_e             MSqRes/MSqW
Samples[Blocks∧Plots]   rt(s−1)      Y′QWY      Y′QWY/[rt(s−1)]         σ²_W
Then,

SSqB = (1/(ts)) Σᵢ Bi² − C,  C = (Σᵢⱼₖ Yijk)²/n,  n = rts

SSqT = (1/(rs)) Σⱼ Tj² − C

SSqPlots = (1/s) Σᵢⱼ Yij.² − C

SSqPlots[Blocks] = SSqPlots − SSqB = (1/s) Σᵢⱼ Yij.² − (1/(ts)) Σᵢ Bi²

SSqRes = SSqPlots[Blocks] − SSqT

SSqWithin = Σᵢⱼₖ Yijk² − C − SSqB − SSqPlots[Blocks] = Σᵢⱼₖ Yijk² − (1/s) Σᵢⱼ Yij.²
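These formulas can be checked numerically on the Eucalyptus data. The sketch below uses Python with NumPy (the document's own analysis is in R); the data array is transcribed from the Eucalyptus table, with each row holding the six trees of blocks I, II and III for one progeny.

```python
import numpy as np

# Subsampling sums of squares for the Eucalyptus data:
# t = 10 progenies (treatments), r = 3 blocks, s = 6 trees per plot.
data = np.array([
    [55, 96, 212, 289, 140, 142, 218, 162, 106, 124, 119, 155, 105, 38, 124, 119, 59, 58],
    [124, 230, 108, 111, 46, 111, 146, 138, 194, 236, 214, 116, 218, 207, 63, 146, 212, 192],
    [42, 134, 229, 246, 166, 175, 181, 262, 150, 290, 112, 133, 239, 195, 195, 146, 169, 117],
    [99, 75, 175, 64, 106, 192, 388, 207, 339, 256, 124, 282, 320, 356, 77, 367, 273, 160],
    [201, 131, 33, 236, 195, 273, 124, 225, 206, 147, 281, 90, 155, 368, 285, 210, 142, 111],
    [109, 131, 124, 256, 110, 94, 223, 108, 69, 59, 129, 54, 119, 92, 218, 106, 70, 257],
    [138, 62, 27, 132, 138, 100, 214, 290, 60, 175, 106, 80, 131, 34, 166, 51, 49, 24],
    [37, 70, 38, 157, 84, 142, 181, 48, 194, 134, 108, 91, 81, 250, 91, 295, 175, 30],
    [126, 106, 210, 61, 190, 86, 126, 98, 41, 29, 274, 54, 168, 210, 256, 90, 106, 142],
    [104, 136, 140, 137, 111, 358, 48, 62, 68, 68, 20, 157, 67, 134, 157, 108, 33, 194],
], dtype=float)
t, r, s = 10, 3, 6
Y = data.reshape(t, r, s)                      # Y[progeny, block, tree]
C = Y.sum() ** 2 / (t * r * s)                 # correction term
SSqB = (Y.sum(axis=(0, 2)) ** 2).sum() / (t * s) - C   # blocks
SSqT = (Y.sum(axis=(1, 2)) ** 2).sum() / (r * s) - C   # progenies
SSqPlots = (Y.sum(axis=2) ** 2).sum() / s - C
SSqRes = SSqPlots - SSqB - SSqT
SSqWithin = (Y ** 2).sum() - C - SSqPlots
print(SSqB, SSqT, SSqRes, SSqWithin)
```

The printed values can be compared against the ANOVA table for the Eucalyptus data given in the document.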
Y = XGµ + XBβ + ZTτ + Zeεe + ε

E(Y) = XGµ + XBβ  and  Var(Y) = ZGZ′ + Σ = σ²_T ZTZ′T + σ²_e ZeZ′e + σ²_W I_rst

y = [y111 … y11s … y1t1 … y1ts … yr11 … yr1s … yrt1 … yrts]′

XG = 1_rst = 1_r ⊗ 1_t ⊗ 1_s
XB = I_r ⊗ 1_t ⊗ 1_s (block diagonal, one column 1_ts per block)
ZT = 1_r ⊗ I_t ⊗ 1_s (each column indicates the rs observations of one treatment)
Ze = I_r ⊗ I_t ⊗ 1_s (each column indicates the s samples of one plot)

GT = σ²_T I_t,  Ge = σ²_e I_rt,  Σ = σ²_W I_rst

Var(Y) = σ²_T J_r ⊗ I_t ⊗ J_s + σ²_e I_r ⊗ I_t ⊗ J_s + σ²_W I_rst
Assuming r = 4, t = 3 and s = 2

Var(Y) = σ²_T J_4 ⊗ I_3 ⊗ J_2 + σ²_e I_4 ⊗ I_3 ⊗ J_2 + σ²_W I_24

In terms of 2×2 blocks (one block per plot),

Var(Y) =
[V1 V2 V2 V3 V2 V2 V3 V2 V2 V3 V2 V2]
[V2 V1 V2 V2 V3 V2 V2 V3 V2 V2 V3 V2]
[V2 V2 V1 V2 V2 V3 V2 V2 V3 V2 V2 V3]
[V3 V2 V2 V1 V2 V2 V3 V2 V2 V3 V2 V2]
[V2 V3 V2 V2 V1 V2 V2 V3 V2 V2 V3 V2]
[V2 V2 V3 V2 V2 V1 V2 V2 V3 V2 V2 V3]
[V3 V2 V2 V3 V2 V2 V1 V2 V2 V3 V2 V2]
[V2 V3 V2 V2 V3 V2 V2 V1 V2 V2 V3 V2]
[V2 V2 V3 V2 V2 V3 V2 V2 V1 V2 V2 V3]
[V3 V2 V2 V3 V2 V2 V3 V2 V2 V1 V2 V2]
[V2 V3 V2 V2 V3 V2 V2 V3 V2 V2 V1 V2]
[V2 V2 V3 V2 V2 V3 V2 V2 V3 V2 V2 V1]

where

V1 = [σ²_W + σ²_e + σ²_T   σ²_e + σ²_T; σ²_e + σ²_T   σ²_W + σ²_e + σ²_T]  (same plot)
V2 = [0 0; 0 0]  (same block, different treatments)
V3 = [σ²_T σ²_T; σ²_T σ²_T]  (different blocks, same treatment)
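The Kronecker expression and the V1/V2/V3 block pattern can be verified directly. A Python/NumPy sketch (the document's own code is R), with illustrative values for the three variance components:

```python
import numpy as np

# Build Var(Y) = s2T*(J4 (x) I3 (x) J2) + s2e*(I4 (x) I3 (x) J2) + s2W*I24
# for r = 4 blocks, t = 3 treatments, s = 2 samples per plot, and check the
# 2x2 block pattern. The variance values are illustrative placeholders.
s2T, s2e, s2W = 3.0, 2.0, 1.0
J = lambda m: np.ones((m, m))
I = np.eye
kron3 = lambda A, B, C: np.kron(np.kron(A, B), C)

V = s2T * kron3(J(4), I(3), J(2)) + s2e * kron3(I(4), I(3), J(2)) + s2W * I(24)

V1 = V[0:2, 0:2]    # same plot (block 1, treatment 1)
V2 = V[0:2, 2:4]    # same block, different treatment -> all zeros
V3 = V[0:2, 6:8]    # different block, same treatment -> all s2T
print(V1, V2, V3, sep="\n")
```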
Eucalyptus data

Source                 df    SSq         MSq        F
Blocks                  2    12987.88    6493.94
Plots[Blocks]          27
 Progenies              9    199609.12   22178.79   2.23
 Residual              18    178969.57   9942.75
Trees[Blocks∧Plots]   150    764157.50   5094.38

ANOVA estimators:

σ̂²_W = MSqWithin = 5094.38
σ̂²_e = (MSqRes − MSqWithin)/s = (9942.75 − 5094.38)/6 = 808.06
σ̂²_P = (MSqP − MSqRes)/(rs) = (22178.79 − 9942.75)/(3×6) = 679.78

Note that

- σ̂²_W is the estimate of the phenotypic variance within plots, that is, the variance between trees within plots from the same family. Because there is genetic variation between trees from a family of half-sibs, σ²_W = σ²_G + σ²_E.
- σ̂²_W/σ̂²_e = 5094.38/808.06 = 6.30, which shows that the phenotypic variation between plants within the plots is 6.3 times bigger than the error variance.
- σ̂²_P is the estimate of the genetic variance between families of half-sibs.
          MM        REML      ML
σ²_P     679.78    679.78    611.80
σ²_e     808.06    808.06    642.35
σ²_W    5094.38   5094.38   5094.38
The variance of σ̂²_P is given by

Var(σ̂²_P) = (2/(r²s²)) [ MSqP²/(t−1+2) + MSqRes²/((t−1)(r−1)+2) ]
          = (2/(3²×6²)) [ 22178.79²/(9+2) + 9942.75²/(18+2) ] = 306549.3

Another important estimate is the phenotypic variance between the means of the families (σ²_P̄), which can be obtained in three ways:

i) Using the variance of the means of the progenies

σ̂²_P̄ = (1/9) [128.94² + 156.22² + … + 116.78² − (128.94 + … + 116.78)²/10] = 1232.16

ii) Using MSqP, that is,

σ̂²_P̄ = MSqP/(rs) = 22178.79/(3×6) = 1232.16

iii) Using the components of variance

σ̂²_P̄ = MSqP/(rs) = (σ̂²_W + sσ̂²_e + rsσ̂²_P)/(rs) = 5094.38/(3×6) + 808.06/3 + 679.78 = 1232.16
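These quantities can be recomputed from the mean squares. A short Python sketch (the document's own code is R) using the Eucalyptus values r = 3, s = 6, t = 10:

```python
# Var(sigma2_P) and the phenotypic variance between family means,
# computed from the Eucalyptus mean squares (r = 3, s = 6, t = 10).
r, s, t = 3, 6, 10
MSqP, MSqRes, MSqWithin = 22178.79, 9942.75, 5094.38

var_sigma2_P = 2.0 / (r**2 * s**2) * (MSqP**2 / (t - 1 + 2)
                                      + MSqRes**2 / ((t - 1) * (r - 1) + 2))

sigma2_W = MSqWithin
sigma2_e = (MSqRes - MSqWithin) / s
sigma2_P = (MSqP - MSqRes) / (r * s)

# ways ii) and iii) of obtaining the variance between progeny means
pbar_ii = MSqP / (r * s)
pbar_iii = sigma2_W / (r * s) + sigma2_e / r + sigma2_P
print(var_sigma2_P, pbar_ii, pbar_iii)
```

Ways ii) and iii) agree exactly, since MSqP/(rs) = (σ²_W + sσ²_e + rsσ²_P)/(rs) is an algebraic identity for the moment estimators.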
- The last expression helps the geneticist to study ways of reducing the phenotypic variance between means of progenies.
- One way is to improve the experimental precision and, as a consequence, decrease the estimates of σ²_W and σ²_e.
- Another option is to increase the number r of replicates or the number s of plants per plot.
- It is possible to play with r and s to see what is better: to increase the number of replicates or the number of plants per plot

r = 4 and s = 6 ⇒ σ̂²_P̄ = 5094.38/(4×6) + 808.06/4 + 679.78 = 1094.06
r = 3 and s = 8 ⇒ σ̂²_P̄ = 5094.38/(3×8) + 808.06/3 + 679.78 = 1161.40

which shows that the better option is to increase the number of replicates instead of the number of plants per plot.
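The comparison generalizes to any candidate design. A small Python sketch (illustrative; the function name is ours) using the Eucalyptus component estimates:

```python
# Variance of a progeny mean for a design with r replicates and s plants
# per plot, sigma2_Pbar(r, s) = sigma2_W/(r*s) + sigma2_e/r + sigma2_P,
# using the Eucalyptus moment estimates.
sigma2_W, sigma2_e, sigma2_P = 5094.38, 808.06, 679.78

def sigma2_Pbar(r, s):
    return sigma2_W / (r * s) + sigma2_e / r + sigma2_P

v46 = sigma2_Pbar(4, 6)   # 4 replicates, 6 plants per plot
v38 = sigma2_Pbar(3, 8)   # 3 replicates, 8 plants per plot (same 24 plants)
print(round(v46, 2), round(v38, 2))
```

Both designs use 24 plants per progeny, yet the 4-replicate layout gives the smaller variance, because only the σ²_W term is divided by the full rs.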
τ̂i (BLUE)    τ̃i (EBLUP)    µ̂i (BLUE)   µ̃i (EBLUP)
-18.683333   -10.307910    128.9444    125.5
  8.594444     4.741700    156.2222    140.6
 29.094444    16.051895    176.7222    151.9
 66.816667    36.863881    214.4444    172.7
 41.983333    23.162913    189.6111    159.0
-18.294444   -10.093353    129.3333    125.7
-37.794444   -20.851832    109.8333    115.0
-25.072222   -13.832768    122.5556    122.0
-15.794444    -8.714061    131.8333    127.1
-30.850000   -17.020465    116.7778    118.8

τ̂BLUE = ȳP − ȳ

τ̃EBLUP = τ̂BLUE × σ²_P / (σ²_P + σ²_e/r + σ²_W/(rs))
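The shrinkage factor and the EBLUP column can be reproduced from the variance components. A Python sketch (illustrative; the document computes this in R), applied to the first progeny:

```python
# Shrinkage of the BLUE towards zero:
# tau_EBLUP = tau_BLUE * sigma2_P / (sigma2_P + sigma2_e/r + sigma2_W/(r*s)),
# using the Eucalyptus estimates with r = 3 and s = 6.
r, s = 3, 6
sigma2_W, sigma2_e, sigma2_P = 5094.38, 808.06, 679.78

shrink = sigma2_P / (sigma2_P + sigma2_e / r + sigma2_W / (r * s))
tau_blue_1 = -18.683333                 # progeny 1 BLUE, from the table
tau_eblup_1 = tau_blue_1 * shrink
print(round(shrink, 4), round(tau_eblup_1, 3))
```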
ANOVA for totals:

Source           df    SSq           MSq         F
Blocks            2    77927.27      38963.63
Plots[Blocks]    27
 Progenies        9    1197654.70    133072.74   2.23
 Residual        18    1073817.40    59656.57

ANOVA for means:

Source           df    SSq         MSq       F
Blocks            2    2164.57     1082.28
Plots[Blocks]    27
 Progenies        9    33267.29    3696.37   2.23
 Residual        18    29828.37    1657.13
Heritability coefficient

- The amount of genetic variation among the individuals of a species of crop or domesticated animal can be compared with the amount of variation due to non-genetic causes in a ratio called the heritability.
- The heritability of a trait is defined as

h² = σ²_G / σ²_Ph

where σ²_G is the genetic component of variance, i.e. the part of the variation in the organism's phenotype (its observable traits) that is due to genetic effects; σ²_Ph is the phenotypic variance, i.e. the variance due to the combined effects of genotype and environment.

- For the Eucalyptus example

h² = σ̂²_P / (σ̂²_W/(rs) + σ̂²_e/r + σ̂²_P) = 679.78 / (5094.38/(3×6) + 808.06/3 + 679.78) = 0.5517
R program
rm(list=ls(all=TRUE))
RCBDk_Eucaliptus.dat <- read.table("RCBDkEucaliptus.csv", sep=";", h=T)
names(RCBDk_Eucaliptus.dat)
head(RCBDk_Eucaliptus.dat)
str(RCBDk_Eucaliptus.dat)

RCBDk_Eucaliptus.dat$block <- factor(RCBDk_Eucaliptus.dat$block)
RCBDk_Eucaliptus.dat$progeny <- factor(RCBDk_Eucaliptus.dat$progeny)
RCBDk_Eucaliptus.dat$plots <- factor(rep(1:10, times=18))
str(RCBDk_Eucaliptus.dat)

## ANOVA
RCBDk_Eucaliptus.aov <- aov(volume ~ progeny + Error(block/plots), RCBDk_Eucaliptus.dat)
summary(RCBDk_Eucaliptus.aov)

(MSqB <- (summary(RCBDk_Eucaliptus.aov)[[1]])[[1]]$"Mean Sq"[1])
(MSqP <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Mean Sq"[1])
(MSqRes <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Mean Sq"[2])
(MSqWithin <- (summary(RCBDk_Eucaliptus.aov)[[3]])[[1]]$"Mean Sq"[1])
(dfB <- (summary(RCBDk_Eucaliptus.aov)[[1]])[[1]]$"Df"[1])
(dfP <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Df"[1])
(dfRes <- (summary(RCBDk_Eucaliptus.aov)[[2]])[[1]]$"Df"[2])
(dfW <- (summary(RCBDk_Eucaliptus.aov)[[3]])[[1]]$"Df"[1])

## Components of variance - Moment method
(sigma2_W <- MSqWithin)
(sigma2_Res <- (MSqRes - MSqWithin)/6)
(sigma2_P <- (MSqP - MSqRes)/(3*6))
(sigma2_W/sigma2_Res)
(h2 <- sigma2_P/(sigma2_P + sigma2_Res/3 + sigma2_W/18))
## Estimate of Var(sigma2_P)
(Var_sigma2_P <- 2/(3^2*6^2)*(MSqP^2/(dfP+2) + MSqRes^2/(dfRes+2)))
sqrt(Var_sigma2_P)

## BLUE and EBLUP for tau - calculating step by step
(ybar <- mean(RCBDk_Eucaliptus.dat$volume))
(mean_P <- tapply(RCBDk_Eucaliptus.dat$volume, RCBDk_Eucaliptus.dat$progeny, mean))
mean(mean_P)  ## mean of the Progeny means
(mean_B <- tapply(RCBDk_Eucaliptus.dat$volume, RCBDk_Eucaliptus.dat$block, mean))
mean(mean_B)
(tau_BLUE <- mean_P - ybar)
tau_EBLUPc <- tau_BLUE*sigma2_P/(sigma2_P + sigma2_Res/3 + sigma2_W/18)

## REML using library lme4
library(lme4)
RCBDk_Eucaliptus.REML <- lmer(volume ~ block + (1|progeny) + (1|block:plots),
                              data=RCBDk_Eucaliptus.dat, REML=TRUE)
summary(RCBDk_Eucaliptus.REML)
summary(RCBDk_Eucaliptus.REML)@coefs
data.frame(summary(RCBDk_Eucaliptus.REML)@REmat)

## EBLUP for tau and shrunk means
tau_EBLUP <- ranef(RCBDk_Eucaliptus.REML)[[2]]
round(sum(tau_EBLUP), 2)
mm <- model.matrix(terms(RCBDk_Eucaliptus.REML), RCBDk_Eucaliptus.dat)
RCBDk_Eucaliptus.dat$distance <- mm %*% fixef(RCBDk_Eucaliptus.REML)
mu_EBLUP <- RCBDk_Eucaliptus.dat$distance + tau_EBLUP
(Blup <- data.frame(round(tau_BLUE,1), round(tau_EBLUP,1), round(mean_P,1), round(mu_EBLUP,1)))
plot(Blup[[3]], Blup[[4]], pch='*', xlim=c(100,200), ylim=c(100,200),
     xlab='Unadjusted means', ylab='Shrunk means')
abline(0, 1)

## ML
RCBDk_Eucaliptus.ML <- lmer(volume ~ block + (1|progeny) + (1|block:plots),
                            data=RCBDk_Eucaliptus.dat, REML=FALSE)
summary(RCBDk_Eucaliptus.ML)
summary(RCBDk_Eucaliptus.ML)@coefs
data.frame(summary(RCBDk_Eucaliptus.ML)@REmat)
Randomized Incomplete Block Design

- In many situations the number of treatments is large and, given the heterogeneity of the experimental conditions, there is need to use blocks.
- However, blocks with too many plots could also become heterogeneous.
- In breeding experiments, for example, it is common to have 100 or more cultivars of corn to evaluate.
- In other situations, there is not enough material to use.
- In biological work on animals, for example, it will be desirable, if at all possible, to compare several treatments within litters, but the size of the litter will depend on the particular species and will often be such that it is impossible to include all the treatments within a litter.
The randomized incomplete block design can be of three types:

- Balanced - here are included the "balanced incomplete block designs (BIBD)" and the "balanced lattice squares"
- Partially balanced - here are included the "lattice squares" and "partially balanced incomplete block designs (PBIBD)"
- Unbalanced
Definition: A balanced incomplete block design (BIBD) is one in which each of the t treatments is replicated r times and occurs at most once in each of the b blocks that contain k plots, and the arrangement of treatments in blocks is such that each pair of treatments occurs together in a block the same number of times (λ). (Brien, 2010)

- The first condition means that the total number of units is tr = bk.
- The second condition implies that the total number of plots with other treatments in the blocks in which a treatment occurs is λ(t − 1) = r(k − 1).
- A BIBD cannot exist if these two conditions are not met.
- However, that both of these conditions are satisfied does not imply that a BIBD must exist. For example, a BIBD does not exist for t = 15, k = 5, b = 21, r = 7 and λ = 2, even though both conditions are satisfied.
- Such designs are not orthogonal; however, they are balanced.
- They are not orthogonal because treatments are confounded with both blocks and plots within blocks.
- They are balanced because all comparisons between treatments are confounded with blocks to the same extent, as they are with plots within blocks.
It can be shown that for a BIBD the proportion of the information within blocks is e2 = tλ/(kr) and between blocks is e1 = 1 − e2.
These proportions are called the canonical efficiency factors; they always lie between zero and one and sum to one for a particular randomized term, in this case Treatments.
It is desirable that e2 be as close to one as possible; this implies that as much of the information as possible is confounded with plots, which are less variable than blocks.
Designs can be obtained from Cochran and Cox (1957) and Box, Hunter and Hunter (2005), or can be generated as follows.
Suppose t = 4, k = 3, b = 4 ⇒ r = 3 and λ = 2

        Blocks
   I    II   III   IV
   A    A    A     B
   B    B    C     C
   C    D    D     D

e2 = (4 × 2)/(3 × 3) = 0.8889 and e1 = 1 − 0.8889 = 0.1111, that is, 88.89% of the information about treatments is between plots within blocks.
Randomization: the treatment combinations are randomized to the blocks and the treatments in a block are randomized to the plots (dae).
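The canonical efficiency factors for this design can be verified numerically (again a Python sketch used as a calculator):

```python
def efficiency_factors(t, k, r, lam):
    """Canonical efficiency factors for a BIBD: the proportion of
    information within blocks is e2 = t*lambda/(k*r); between blocks
    it is e1 = 1 - e2."""
    e2 = t * lam / (k * r)
    return 1 - e2, e2

e1, e2 = efficiency_factors(t=4, k=3, r=3, lam=2)
print(round(e2, 4), round(e1, 4))  # 0.8889 0.1111
```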
ANOVA table
                                     E(MSq)
Source          df              Fixed               B random                  T random         Random
Blocks          b − 1
  Treatments    t − 1           σ2 + e1 qTB(Ψ)      σ2 + kσ2B + e1 qTB(Ψ)     σ2 + e1 rσ2T     σ2 + kσ2B + e1 rσ2T
  Residual      b − t           σ2 + qB(Ψ)          σ2 + kσ2B                 σ2 + qB(Ψ)       σ2 + kσ2B
Units[Blocks]   b(k − 1)
  Treatments    t − 1           σ2 + e2 qT(Ψ)       σ2 + e2 qT(Ψ)             σ2 + e2 rσ2T     σ2 + e2 rσ2T
  Residual      bk − b − t + 1  σ2                  σ2                        σ2               σ2
Total           bk − 1
Note that there are two Treatments lines in the analysis, the first being referred to as the "interblock" Treatments line and the second as the "intrablock" Treatments line.
Generally, one tries to have e2 as close to one as possible and to base conclusions on the intrablock Treatment effects.
Because, when Blocks are fixed, qTB involves both β's and τ's, it is not possible to separately test for treatment differences between the blocks in this case – the intrablock test for treatments will be the only test for treatments that can be performed here.
Thus it is preferable to designate Blocks as random, if it is appropriate.
Barley data – unbalanced
Example
The data here are from a field trial of barley breeding lines (Galwey, page 87). The lines studied were derived from a cross between two parent varieties, "Chebec" and "Harrington". They were "double haploid" lines, which means they were obtained by a laboratory technique that ensures that all plants within the same breeding line are genetically identical, so that the line will breed true. This feature improves the precision with which genetic variation among the lines can be estimated. The trial considered here was arranged in two randomized blocks. Within each block, each line occupied a single rectangular field plot. All lines were present in Block I, but due to limited seed stocks, some were absent in Block II. The grain yield (g/m2) was measured in each field plot.
Barley data
        Blocks            Blocks            Blocks            Blocks
Lines   I    II   Lines   I    II   Lines   I    II   Lines   I    II
  1   718   591     22  341    NA     43  678   837     64  648   682
  2   483    NA     23  606   818     44  518   873     65  819   713
  3   873    NA     24  671   463     45  520   576     66  688   846
  4   719    NA     25  429    NA     46  724   627     67  407   703
  5   799    NA     26  580   639     47  192    NA     68  326   385
  6   850   755     27  732   762     48  786   645     69  467   379
  7   907   820     28  680    NA     49  831   823     70  996   905
  8   636   300     29  606   932     50  721   886     71  596   570
  9   775   587     30  353    NA     51  693   746     72  166   259
 10   765   757     31  167   226     52  603    NA     73  355    NA
 11   645   517     32  669   837     53  559    NA     74  489   551
 12   437    NA     33  770   847     54  809   859     75  617    NA
 13   541   475     34  673   639     55  555    NA     76  344    NA
 14   911   935     35  374   555     56  523   436     77  358    NA
 15   600    NA     36  800  1055     57  182    NA     78  260   260
 16   211   240     37  895   553     58  522   435     79  318   439
 17   552   959     38  641   541     59  553   635     80  488   478
 18   366   265     39  146   213     60  573   285     81  316   304
 19   623   424     40  411   568     61  612   472     82  251    NA
 20   632    NA     41  322   538     62  730    NA     83  280    NA
 21   515    NA     42  793   553     63  563   756
Barley data
The model for these data is

Yjk = µ + βj + τk + εjk ,   j = 1, . . . , rk ,   k = 1, . . . , t

where Yjk is the grain yield of the plot in the j-th block sown with the k-th breeding line; µ is the grand mean value of the grain yield; βj is the effect of the j-th block; τk is the effect of the k-th breeding line.
It is natural in this case to consider block as a random effect, that is, βj ∼ N(0, σ2B) and εjk ∼ N(0, σ2).
Note that the cross Chebec × Harrington could produce many lines besides those studied here, and the lines in this field trial may reasonably be considered as a random sample from this population of potential lines.
Thus it is reasonable to consider "line" as a random-effect term, that is, to assume that τk ∼ N(0, σ2L)
Using the R aov function with Error, the ANOVA table is
Source          df     SSq       MSq        F      p-value
Blocks           1
  Treatments     1    58079.91  58079.91
Units[Blocks]  140
  Treatments    82    5343839   65168.77   4.86    < 0.01
  Residual      58     777747   13409.42
Total          141    6179665
The value 4.86 of F provides significant evidence against the null hypothesis H0 : σ2L = 0.
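The intrablock F value can be reproduced directly from the mean squares in the table (a quick Python check of the arithmetic):

```python
# Intrablock F ratio for Treatments: MSq(Treatments) / MSq(Residual)
ms_treatments = 65168.77
ms_residual = 13409.42
F = ms_treatments / ms_residual
print(round(F, 2))  # 4.86
```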
Using the R lme4 library, the estimates of the variance components are σ2B = 8.2083 × 10−6, σ2L = 30666.89 and σ2 = 13225.75.
These estimates are similar, but not identical, to the ones obtained using GENSTAT (see Galwey, page 99).
The estimate of the variance component due to block is very small compared with the other components.
We may decide that the best estimate of this component is zero, σ2B = 0, i.e. the block effect has a degenerate distribution at zero instead of a normal distribution.
           REML            ML
σ2B        8.2083 × 10−6   1.4403 × 10−7
σ2L        30666.89        30199.55
σ2         13225.75        13224.92
µ          572.47          572.51
sd(µ)      21.67           21.53
The estimate of variance due to breeding lines is about double theresidual variance.
The ML estimates of the variance components are smaller than theREML estimates
Note that the estimates of the fixed parameter µ using ML andREML don’t differ much.
The likelihood ratio test for σ2L = 0 is obtained by fitting the full model and the reduced model from which the term "line" is omitted.
By comparing the deviances of both models, the contribution made by the term "line" to the fit of the model can be assessed, provided that the deviances were obtained from models with the same fixed-effect terms.
Using the R lme4 library, the deviances for the full and reduced models are, respectively, 1880.37 and 1919.67. The likelihood ratio statistic with 1 d.f. is

Devreduced model − Devfull model = 1919.67 − 1880.37 = 39.30

Note that the R lme4 library uses the deviances from ML estimation.
In a similar way, the likelihood ratio test for σ2B = 0 is obtained by fitting the full model and the reduced model from which the term "block" is omitted. The likelihood ratio statistic with 1 d.f. is

Devreduced model − Devfull model = 2 × 940.18508 − 2 × 940.18504 = 0.00008

These results are similar, but not identical, to those from GENSTAT (see Galwey, pages 101–104)
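The chi-square tail probability reported by anova() can be reproduced with the Python standard library; note that, as in the slides, the boundary correction usually recommended when testing a variance component at zero is ignored here:

```python
import math

# LR statistic for H0: sigma2_L = 0 (ML deviances from lme4)
lr = 1919.67 - 1880.37                 # 39.30 on 1 d.f.
# Upper tail of a chi-square(1): P(X > x) = erfc(sqrt(x/2))
p_value = math.erfc(math.sqrt(lr / 2))
print(round(lr, 2), p_value)           # 39.3 and roughly 3.6e-10
```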
R results
summary(RIBD_barley.REML)
Linear mixed model fit by REML
Formula: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
   Data: RIBD_barley
      AIC      BIC    logLik deviance REMLdev
 1880.384 1892.207 -936.1919  1880.37 1872.384
Random effects:
 Groups   Name        Variance      Std.Dev.
 fline    (Intercept) 3.0666887e+04 1.7511964e+02
 fblock   (Intercept) 8.2083130e-06 2.8650154e-03
 Residual             1.3225750e+04 1.1500326e+02
Number of obs: 142, groups: fline, 83; fblock, 2
Fixed effects:
            Estimate Std. Error t value
(Intercept) 572.47356   21.67072 26.41691
> summary(RIBD_barley.REML_L0)
Linear mixed model fit by REML
Formula: yield_g_m2 ~ 1 + (1 | fblock)
   Data: RIBD_barley
      AIC      BIC    logLik deviance REMLdev
 1918.107 1926.974 -956.0533 1919.673 1912.107
Random effects:
 Groups   Name        Variance      Std.Dev.
 fblock   (Intercept) 6.4216528e-05 8.0135216e-03
 Residual             4.3827416e+04 2.0934998e+02
Number of obs: 142, groups: fblock, 2
Fixed effects:
            Estimate Std. Error t value
(Intercept) 581.57355   17.56834 33.10351
R results
> summary(RIBD_barley.ML)
Linear mixed model fit by maximum likelihood
Formula: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
   Data: RIBD_barley
      AIC      BIC    logLik deviance REMLdev
 1888.368 1900.191 -940.1838 1880.368 1872.386
Random effects:
 Groups   Name        Variance      Std.Dev.
 fline    (Intercept) 3.0199552e+04 1.7378018e+02
 fblock   (Intercept) 1.4403497e-07 3.7951939e-04
 Residual             1.3224923e+04 1.1499967e+02
Number of obs: 142, groups: fline, 83; fblock, 2
Fixed effects:
            Estimate Std. Error t value
(Intercept) 572.51138   21.53978 26.57926
> summary(RIBD_barley.ML_L0)
Linear mixed model fit by maximum likelihood
Formula: yield_g_m2 ~ 1 + (1 | fblock)
   Data: RIBD_barley
      AIC      BIC    logLik deviance REMLdev
 1925.673 1934.541 -959.8366 1919.673 1912.107
Random effects:
 Groups   Name        Variance Std.Dev.
 fblock   (Intercept)     0.00   0.00000
 Residual             43518.77 208.61153
Number of obs: 142, groups: fblock, 2
Fixed effects:
            Estimate Std. Error t value
(Intercept) 581.57352   17.50629 33.22083
R results
> anova(RIBD_barley.REML, RIBD_barley.REML_L0)
Data: RIBD_barley
Models:
RIBD_barley.REML_L0: yield_g_m2 ~ 1 + (1 | fblock)
RIBD_barley.REML: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
                    Df       AIC       BIC     logLik    Chisq Chi Df Pr(>Chisq)
RIBD_barley.REML_L0  3 1925.6731 1934.5406 -959.83656
RIBD_barley.REML     4 1888.3702 1900.1935 -940.18508 39.30295      1 3.6289e-10 ***
> anova(RIBD_barley.ML_L0, RIBD_barley.ML)
Data: RIBD_barley
Models:
RIBD_barley.ML_L0: yield_g_m2 ~ 1 + (1 | fblock)
RIBD_barley.ML: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
                  Df       AIC       BIC     logLik    Chisq Chi Df Pr(>Chisq)
RIBD_barley.ML_L0  3 1925.6731 1934.5406 -959.83655
RIBD_barley.ML     4 1888.3676 1900.1909 -940.18381 39.30548      1 3.6242e-10 ***
Heritability. The prediction of genetic advance under selection.
The heritability for the Barley data can be calculated by

h2 = σ2L / (σ2L + σ2/r∗) = 30666.89 / (30666.89 + 13225.75/1.55) = 0.7823
where r∗ is the number of replications per line.
One way to calculate r∗ is to use

r∗ = t / ∑(1/rk), summing over the t lines, = 83 / (24 × 1/1 + 59 × 1/2) = 1.55
similar to the value 1.63 given by Galwey, on page 106.
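The arithmetic can be checked in a few lines (Python used as a calculator; variance components from the REML fit):

```python
# Heritability for the barley lines: 24 lines have 1 replicate,
# 59 have 2; r* is t divided by the sum of the 1/r_k.
t = 83
sigma2_L, sigma2 = 30666.89, 13225.75
r_star = t / (24 * (1 / 1) + 59 * (1 / 2))
h2 = sigma2_L / (sigma2_L + sigma2 / r_star)
print(round(r_star, 2), round(h2, 4))  # 1.55 and about 0.782
```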
The heritability can be used to calculate the expected geneticadvance under selection in a plant or animal breeding programme.
This is given by the formula

Gs = i σPh h2

where i is an index of selection and σPh is the phenotypic standard deviation.
The index is defined in relation to the standard normal distribution: that is, the distribution of a variable Z such that

Z ∼ N(0, 1)

It is the value of Z that corresponds to the fraction k of the population that is to be selected.
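For a chosen fraction k, that value of Z is the inverse normal CDF evaluated at 1 − k. A sketch using Python's statistics.NormalDist (the fraction 0.10 below is just an illustrative choice):

```python
from statistics import NormalDist

def selection_index(k):
    """Value z of Z ~ N(0, 1) such that a fraction k of the
    population lies above z (the top fraction to be selected)."""
    return NormalDist().inv_cdf(1 - k)

print(round(selection_index(0.10), 4))  # 1.2816
```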
The best linear unbiased predictor or ’shrunk’ estimate
The adjustment to obtain the random-effect mean is made as follows. The true mean of the k-th breeding line is represented by

µk = µ + τk

In the table of means presented (using the R lme4 library), this value is estimated by

µk = (∑ yjk) / rk, summing over j = 1, . . . , rk,

where yjk is the j-th observation of the k-th breeding line and rk is the number of observations of the k-th breeding line. The overall mean of the population of breeding lines, µ, is estimated by

µ = 572.5

Note that this is not quite the same as the mean of all observations (= 581.1) or the mean of the line means (= 569.1). Then

µk = µ + τk ⇒ τk = µk − µ
An estimate of τk is given by

BLUEk = τk = µk − µ

To allow for the expectation that high-yielding lines in the present trial will perform less well in a future trial – and that low-yielding lines will perform better – the BLUE is replaced by a "shrunk estimate" called the best linear unbiased predictor (BLUP)

BLUPk = BLUEk × shrinkage factor = (µk − µ) × σ2L / (σ2L + σ2/rk)

This relationship, combined with the constraint

∑ BLUPk = 0, summing over k = 1, . . . , t,

where t is the number of breeding lines, determines the value of µ as well as those of the BLUPs.
A new estimate of the mean for the k-th breeding line is then given by

µ′k = µ′ + BLUPk
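For breeding line 1 (yields 718 and 591, so rk = 2), the shrinkage can be reproduced from the REML estimates; a Python sketch of the arithmetic:

```python
# Shrunk (BLUP) mean for breeding line 1, using the REML estimates.
sigma2_L, sigma2, mu = 30666.89, 13225.75, 572.47
line_mean = (718 + 591) / 2                  # 654.5
blue = line_mean - mu                        # unshrunk effect estimate
shrinkage = sigma2_L / (sigma2_L + sigma2 / 2)
blup = blue * shrinkage
print(round(blup, 1), round(mu + blup, 1))   # 67.5 639.9
```

These match the first row of the table of shrunk means.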
Line    τk     µk  µk (BLUP)   Line    τk     µk  µk (BLUP)   Line    τk     µk  µk (BLUP)
  1    67.5  654.5   639.9       29   162.0  769.4   734.4      57  -272.8  182.0   299.6
  2   -62.3  483.3   510.2       30  -153.1  353.3   419.3      58   -77.3  478.5   495.2
  3   210.0  873.0   782.5       31  -309.3  196.4   263.1      59    17.7  594.0   590.2
  4   102.5  719.1   674.9       32   148.5  753.1   721.0      60  -118.2  428.8   454.3
  5   158.3  799.0   730.7       33   194.2  808.5   766.7      61   -25.1  542.0   547.4
  6   189.1  802.4   761.6       34    68.9  656.3   641.4      62   110.4  730.5   682.9
  7   239.6  863.7   812.1       35   -88.8  464.5   483.6      63    71.5  659.4   644.0
  8   -85.8  468.2   486.7       36   292.1  927.5   864.5      64    75.9  664.8   648.4
  9    89.2  681.0   661.7       37   124.6  724.0   697.1      65   159.3  766.1   731.7
 10   154.9  760.8   727.4       38    15.3  591.0   587.8      66   160.0  767.0   732.5
 11     6.9  580.8   579.4       39  -323.5  179.2   249.0      67   -14.4  554.9   558.0
 12   -94.9  436.6   477.5       40   -68.2  489.6   504.3      68  -178.3  355.7   394.2
 13   -52.9  508.1   519.5       41  -117.3  429.9   455.2      69  -122.9  423.1   449.6
 14   288.1  922.7   860.6       42    82.7  673.0   655.2      70   311.0  950.5   883.5
 15    19.3  600.1   591.7       43   152.3  757.6   724.8      71     8.4  582.7   580.9
 16  -285.5  225.4   286.9       44   101.1  695.3   673.6      72  -296.4  212.2   276.1
 17   150.6  755.5   723.1       45   -20.3  547.8   552.2      73  -151.7  355.3   420.7
 18  -211.1  315.8   361.3       46    84.9  675.6   657.3      74   -42.9  520.4   529.6
 19   -40.5  523.2   532.0       47  -265.5  192.4   306.9      75    31.3  617.3   603.8
 20    41.9  632.5   614.4       48   117.7  715.6   690.2      76  -159.3  344.5   413.2
 21   -40.4  514.6   532.1       49   209.3  826.9   781.8      77  -149.9  357.9   422.6
 22  -162.0  340.6   410.5       50   190.0  803.4   762.4      78  -256.9  260.1   315.5
 23   114.5  711.6   686.9       51   120.8  719.3   693.3      79  -159.6  378.4   412.8
 24    -4.3  567.2   568.2       52    21.4  603.0   593.8      80   -73.6  483.0   498.9
 25  -100.5  428.6   472.0       53    -9.3  559.1   563.2      81  -216.2  309.7   356.3
 26    30.7  609.8   603.2       54   215.5  834.4   787.9      82  -224.4  251.3   348.1
 27   143.7  747.1   716.2       55   -12.0  555.3   560.5      83  -204.0  280.4   368.4
 28    75.1  679.9   647.5       56   -76.8  479.1   495.7
R results
RIBD_barley <- read.table("barleyprogeny.dat", header=TRUE)
attach(RIBD_barley)
head(RIBD_barley)
str(RIBD_barley)
RIBD_barley$fline <- factor(RIBD_barley$line)
RIBD_barley$fblock <- factor(RIBD_barley$block)
RIBD_barley$plot <- factor(c(rep(1,83), rep(2,59)))
head(RIBD_barley)
str(RIBD_barley)

options(digits=10)
barley.aov <- aov(yield_g_m2 ~ fline + Error(fblock), data=RIBD_barley)
summary(barley.aov)
summary(aov(yield_g_m2 ~ 1, data=RIBD_barley))

## REML considering fblock and fline as random effects
## using library lme4
library(lme4)
RIBD_barley.REML <- lmer(yield_g_m2 ~ 1 + (1|fblock) + (1|fline),
                         data=RIBD_barley, REML=TRUE)
summary(RIBD_barley.REML)
summary(RIBD_barley.REML)@coefs
data.frame(summary(RIBD_barley.REML)@REmat)

## a likelihood ratio test for sigma2_L
RIBD_barley.REML_L0 <- lmer(yield_g_m2 ~ 1 + (1|fblock),
                            data=RIBD_barley, REML=TRUE)
summary(RIBD_barley.REML)
summary(RIBD_barley.REML_L0)
anova(RIBD_barley.REML, RIBD_barley.REML_L0)

## a likelihood ratio test for sigma2_B
(RIBD_barley.REML_B0 <- lmer(yield_g_m2 ~ 1 + (1|fline),
                             data=RIBD_barley, REML=TRUE))
anova(RIBD_barley.REML, RIBD_barley.REML_B0)

## ML considering fblock and fline as random effects
## using library lme4
(RIBD_barley.ML <- lmer(yield_g_m2 ~ 1 + (1|fblock) + (1|fline),
                        data=RIBD_barley, REML=FALSE))
(RIBD_barley.ML_L0 <- lmer(yield_g_m2 ~ 1 + (1|fblock),
                           data=RIBD_barley, REML=FALSE))
summary(RIBD_barley.ML)
summary(RIBD_barley.ML_L0)
anova(RIBD_barley.ML_L0, RIBD_barley.ML)
# means
unad_mean <- tapply(RIBD_barley$yield_g_m2, RIBD_barley$fline, mean)  ## unadjusted (line) means
mean(unad_mean)               ## mean of the line means
mean(RIBD_barley$yield_g_m2)  ## general mean

## tau_EBLUP, shrunk means
mm <- model.matrix(terms(RIBD_barley.REML), RIBD_barley)
RIBD_barley$distance <- mm %*% fixef(RIBD_barley.REML)
tau_EBLUP <- ranef(RIBD_barley.REML)[[1]]
mu_EBLUP <- fixef(RIBD_barley.REML)[1] + tau_EBLUP   ## overall mean + line BLUP
Blup <- data.frame(round(tau_EBLUP,1), round(unad_mean,1), round(mu_EBLUP,1))
plot(Blup[[2]], Blup[[3]], pch='*', xlim=c(0,1000), ylim=c(0,1000),
     xlab='Unadjusted means', ylab='Shrunk means')
abline(0,1)
Latin squares designs (LS)
Sometimes we need more than one type of block. In general, call one sort of blocks "rows" and the other sort "columns".
Definition: A Latin square design is one in which
each treatment occurs once and only once in each row and each column, so that the numbers of rows, columns and treatments are all equal.
Clearly, the total number of observations is n = t2.
Suppose that in a field trial moisture is varying across the field and the stoniness down the field.
A Latin square can eliminate both sources of variability.
Sugarcane experiment
Suppose there are five different varieties of sugarcane to be compared, and suppose that moisture is varying across the field and stoniness down the field.
A Latin square design for this would be as follows:

5 × 5 Latin Square
                Column
          1   2   3   4   5
      I   A   B   C   D   E    Less stony end of field
     II   C   D   E   A   B      ⇓
Row III   E   A   B   C   D      ⇓
     IV   B   C   D   E   A      ⇓
      V   D   E   A   B   C    Stonier end of field

      Less moisture ⇒⇒⇒ More moisture

Varieties: A, B, C, D, E
Even if one has not identified trends in two directions, a Latin square may be employed to guard against the problem of putting the blocks in the wrong direction.
Latin squares may also be used when there are two different kinds of blocking variables, for example, animals and times.
The general principle is that one is interested in maximizing row and column differences so as to minimize the amount of uncontrolled variation affecting treatment comparisons.
The major disadvantage with the Latin square is that you are restricted to having the number of replicates equal to the number of treatments.
Several fundamentally different Latin squares exist for a particular t; for t = 4 there are three different squares.
A collection of Latin squares for t = 3, 4, . . . , 9 is given in Appendix 8A of Box, Hunter and Hunter.
To randomize these designs appropriately involves the following:
1. randomly select one of the designs for a value of t;
2. randomly permute the rows and then the columns;
3. randomly assign letters to treatments.
Note: there are no nested factors, as Rows and Columns are to be randomized independently
Hence they are not nested (they are crossed)
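The three-step randomization can be sketched in code (Python here for a self-contained illustration; in practice the layouts are generated in R):

```python
import random

def randomize_latin_square(square, seed=None):
    """Randomize a Latin square layout: permute the rows, then the
    columns, then randomly relabel the treatment letters."""
    rng = random.Random(seed)
    t = len(square)
    row_order = rng.sample(range(t), t)
    col_order = rng.sample(range(t), t)
    permuted = [[square[i][j] for j in col_order] for i in row_order]
    letters = sorted({x for row in square for x in row})
    relabel = dict(zip(letters, rng.sample(letters, len(letters))))
    return [[relabel[x] for x in row] for row in permuted]

base = [["A", "B", "C"], ["C", "A", "B"], ["B", "C", "A"]]
for row in randomize_latin_square(base, seed=2010):
    print(row)
```

All three operations preserve the Latin property, so the result is again a valid square.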
Generally we will use R to obtain randomized layouts.
General instructions given in Appendix B (Chris Brien’s notes),Randomized layouts and sample size computations in R.
Latin Square Design
Consider a Latin square design,

yij = µ + βi + γj + τk(ij) + εij

where µ is a constant, βi is the effect of the i-th row, γj is the effect of the j-th column, τk(ij) is the effect of the k-th treatment within the plot ij, and εij is the experimental error associated with the (i, j)-th plot. The ANOVA table is
Source          df               SSq           MSq
Rows            t − 1            Y′QR Y        Y′QR Y / (t − 1)
Columns         t − 1            Y′QC Y        Y′QC Y / (t − 1)
Rows:Columns    (t − 1)2
  Treatments    t − 1            Y′QT Y        Y′QT Y / (t − 1)
  Residual      (t − 1)(t − 2)   Y′QRCRes Y    Y′QRCRes Y / [(t − 1)(t − 2)]
Total           t2 − 1

where SSqR = (1/t) ∑ Ri2 − C, summing over i = 1, . . . , t, with C = (∑i,j,k Yijk)2 / n,

SSqC = (1/t) ∑ Cj2 − C,   SSqT = (1/t) ∑ Tk2 − C
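The correction-term formulas can be illustrated on a small 3 × 3 square with made-up yields (Python as a calculator; the numbers are hypothetical):

```python
# Sums of squares for a 3x3 Latin square via the correction term C.
yields = [[10, 12, 14],
          [13, 11, 12],
          [12, 15, 10]]
trts = [["A", "B", "C"],
        ["C", "A", "B"],
        ["B", "C", "A"]]
t = 3
n = t * t
C = sum(map(sum, yields)) ** 2 / n                 # correction term
row_totals = [sum(row) for row in yields]
col_totals = [sum(col) for col in zip(*yields)]
trt_totals = {lab: 0 for lab in "ABC"}
for i in range(t):
    for j in range(t):
        trt_totals[trts[i][j]] += yields[i][j]
SSqR = sum(x * x for x in row_totals) / t - C
SSqC = sum(x * x for x in col_totals) / t - C
SSqT = sum(x * x for x in trt_totals.values()) / t - C
print(round(SSqR, 2), round(SSqC, 2), round(SSqT, 2))  # 0.22 1.56 20.22
```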
Considering t = 3:

        Column
Row     1   2   3
  1     A   B   C
  2     C   A   B
  3     B   C   A
In matrix notation, for a fixed model,
Y = XGµ + XRβ + XCγ + XTτ + ε
[y11]   [1]      [1 0 0]      [1 0 0]      [1 0 0]
[y12]   [1]      [1 0 0]      [0 1 0]      [0 1 0]
[y13]   [1]      [1 0 0]      [0 0 1]      [0 0 1]
[y21]   [1]      [0 1 0]      [1 0 0]      [0 0 1]
[y22] = [1]µ  +  [0 1 0]β  +  [0 1 0]γ  +  [1 0 0]τ + ε
[y23]   [1]      [0 1 0]      [0 0 1]      [0 1 0]
[y31]   [1]      [0 0 1]      [1 0 0]      [0 1 0]
[y32]   [1]      [0 0 1]      [0 1 0]      [0 0 1]
[y33]   [1]      [0 0 1]      [0 0 1]      [1 0 0]
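The same design matrices can be built compactly with Kronecker products (a NumPy sketch; the 0/1 entries match the display above):

```python
import numpy as np

t = 3
XG = np.ones((t * t, 1))                    # grand mean
XR = np.kron(np.eye(t), np.ones((t, 1)))    # rows:    I_t (x) 1_t
XC = np.kron(np.ones((t, 1)), np.eye(t))    # columns: 1_t (x) I_t
layout = [[0, 1, 2],                        # 0 = A, 1 = B, 2 = C
          [2, 0, 1],
          [1, 2, 0]]
XT = np.zeros((t * t, t))                   # treatment indicators
for i in range(t):
    for j in range(t):
        XT[t * i + j, layout[i][j]] = 1.0
```

Each treatment column of XT sums to t, since every treatment appears once in each row and once in each column.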
In general, we can have at least five types of models:
1 Fixed model, with βi, γj and τk as fixed effects and εij ∼ N(0, σ2);
2 Mixed model, with γj and τk as fixed effects, βi ∼ N(0, σ2R) and εij ∼ N(0, σ2);
3 Mixed model, with βi and γj as fixed effects, τk ∼ N(0, σ2T) and εij ∼ N(0, σ2);
4 Mixed model, with βi as a fixed effect, γj ∼ N(0, σ2C), τk ∼ N(0, σ2T) and εij ∼ N(0, σ2);
5 A random model, with βi ∼ N(0, σ2R), γj ∼ N(0, σ2C), τk ∼ N(0, σ2T) and εij ∼ N(0, σ2).
1. Fixed model
βi , γj and τk as fixed effects and εij ∼ N(0, σ2)
Y = XGµ+ XRβ + XCγ + XTτ + ε
Source         d.f.             E(MSq)
Row            t − 1            σ2 + qR(ψ)
Column         t − 1            σ2 + qC(ψ)
Row#Column     (t − 1)2
  Treatment    t − 1            σ2 + qT(ψ)
  Residual     (t − 1)(t − 2)   σ2
Then
E(Y) = XGµ+ XRβ + XCγ + XTτ
Var(Y) = Σ = σ2It2
Var(Y) = Σ = σ2 It2 =

[σ2  0   0   0   0   0   0   0   0 ]
[0   σ2  0   0   0   0   0   0   0 ]
[0   0   σ2  0   0   0   0   0   0 ]
[0   0   0   σ2  0   0   0   0   0 ]
[0   0   0   0   σ2  0   0   0   0 ]
[0   0   0   0   0   σ2  0   0   0 ]
[0   0   0   0   0   0   σ2  0   0 ]
[0   0   0   0   0   0   0   σ2  0 ]
[0   0   0   0   0   0   0   0   σ2]
2. Mixed model
with γj and τk as fixed effects, βi ∼ N(0, σ2R) and εij ∼ N(0, σ2)

Y = XGµ + XCγ + XTτ + ZRβ + ε

Source         d.f.             E(MSq)
Row            t − 1            σ2 + tσ2R
Column         t − 1            σ2 + qC(ψ)
Row#Column     (t − 1)2
  Treatment    t − 1            σ2 + qT(ψ)
  Residual     (t − 1)(t − 2)   σ2

Then
Then
E(Y) = XGµ+ XCγ + XTτ
Var(Y) = ZR σ2R It Z′R + Σ = σ2R ZR Z′R + σ2 It2
Var(Y) = σ2R ZR Z′R + σ2 It2 =

      [1 1 1 0 0 0 0 0 0]        [1 0 0 0 0 0 0 0 0]
      [1 1 1 0 0 0 0 0 0]        [0 1 0 0 0 0 0 0 0]
      [1 1 1 0 0 0 0 0 0]        [0 0 1 0 0 0 0 0 0]
      [0 0 0 1 1 1 0 0 0]        [0 0 0 1 0 0 0 0 0]
σ2R   [0 0 0 1 1 1 0 0 0]  + σ2  [0 0 0 0 1 0 0 0 0]
      [0 0 0 1 1 1 0 0 0]        [0 0 0 0 0 1 0 0 0]
      [0 0 0 0 0 0 1 1 1]        [0 0 0 0 0 0 1 0 0]
      [0 0 0 0 0 0 1 1 1]        [0 0 0 0 0 0 0 1 0]
      [0 0 0 0 0 0 1 1 1]        [0 0 0 0 0 0 0 0 1]

which is block diagonal with three identical 3 × 3 blocks

[σ2 + σ2R   σ2R        σ2R     ]
[σ2R        σ2 + σ2R   σ2R     ]
[σ2R        σ2R        σ2 + σ2R]
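The block-diagonal structure can be checked numerically (a NumPy sketch with illustrative values σ2R = 2, σ2 = 1):

```python
import numpy as np

t = 3
sigma2_R, sigma2 = 2.0, 1.0                       # illustrative values
ZR = np.kron(np.eye(t), np.ones((t, 1)))          # 9x3 row indicators
V = sigma2_R * ZR @ ZR.T + sigma2 * np.eye(t * t)
# Within a row-block: sigma2 + sigma2_R on the diagonal, sigma2_R off it;
# between different row-blocks the covariance is zero.
print(V[0, 0], V[0, 1], V[0, 3])  # 3.0 2.0 0.0
```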
3. Mixed model
with βi and γj as fixed effects, τk ∼ N(0, σ2T) and εij ∼ N(0, σ2)

Y = XGµ + XRβ + XCγ + ZTτ + ε

Source         d.f.             E(MSq)
Row            t − 1            σ2 + qR(ψ)
Column         t − 1            σ2 + qC(ψ)
Row#Column     (t − 1)2
  Treatment    t − 1            σ2 + tσ2T
  Residual     (t − 1)(t − 2)   σ2
Then
E(Y) = XGµ+ XRβ + XCγ
Var(Y) = ZT σ2T It Z′T + Σ = σ2T ZT Z′T + σ2 It2
Var(Y) = σ2T ZT Z′T + σ2 It2 =

      [1 0 0 0 1 0 0 0 1]        [1 0 0 0 0 0 0 0 0]
      [0 1 0 0 0 1 1 0 0]        [0 1 0 0 0 0 0 0 0]
      [0 0 1 1 0 0 0 1 0]        [0 0 1 0 0 0 0 0 0]
      [0 0 1 1 0 0 0 1 0]        [0 0 0 1 0 0 0 0 0]
σ2T   [1 0 0 0 1 0 0 0 1]  + σ2  [0 0 0 0 1 0 0 0 0]
      [0 1 0 0 0 1 1 0 0]        [0 0 0 0 0 1 0 0 0]
      [0 1 0 0 0 1 1 0 0]        [0 0 0 0 0 0 1 0 0]
      [0 0 1 1 0 0 0 1 0]        [0 0 0 0 0 0 0 1 0]
      [1 0 0 0 1 0 0 0 1]        [0 0 0 0 0 0 0 0 1]

  [V1   V2   V′2]
= [V′2  V1   V2 ]
  [V2   V′2  V1 ]

with

     [σ2 + σ2T   0          0        ]        [0    σ2T  0  ]
V1 = [0          σ2 + σ2T   0        ],  V2 = [0    0    σ2T]
     [0          0          σ2 + σ2T ]        [σ2T  0    0  ]
4. Mixed model
with βi as a fixed effect and γj ∼ N(0, σ2C), τk ∼ N(0, σ2T) and εij ∼ N(0, σ2)

Y = XGµ + XRβ + ZCγ + ZTτ + ε

Source         d.f.             E(MSq)
Row            t − 1            σ2 + qR(ψ)
Column         t − 1            σ2 + tσ2C
Row#Column     (t − 1)2
  Treatment    t − 1            σ2 + tσ2T
  Residual     (t − 1)(t − 2)   σ2
Then
E(Y) = XGµ+ XRβ
Var(Y) = ZC σ2C It Z′C + ZT σ2T It Z′T + Σ = σ2C ZC Z′C + σ2T ZT Z′T + σ2 It2
Var(Y) = σ2C ZC Z′C + σ2T ZT Z′T + σ2 It2 =

      [1 0 0 1 0 0 1 0 0]        [1 0 0 0 1 0 0 0 1]        [1 0 0 0 0 0 0 0 0]
      [0 1 0 0 1 0 0 1 0]        [0 1 0 0 0 1 1 0 0]        [0 1 0 0 0 0 0 0 0]
      [0 0 1 0 0 1 0 0 1]        [0 0 1 1 0 0 0 1 0]        [0 0 1 0 0 0 0 0 0]
      [1 0 0 1 0 0 1 0 0]        [0 0 1 1 0 0 0 1 0]        [0 0 0 1 0 0 0 0 0]
σ2C   [0 1 0 0 1 0 0 1 0] + σ2T  [1 0 0 0 1 0 0 0 1]  + σ2  [0 0 0 0 1 0 0 0 0]
      [0 0 1 0 0 1 0 0 1]        [0 1 0 0 0 1 1 0 0]        [0 0 0 0 0 1 0 0 0]
      [1 0 0 1 0 0 1 0 0]        [0 1 0 0 0 1 1 0 0]        [0 0 0 0 0 0 1 0 0]
      [0 1 0 0 1 0 0 1 0]        [0 0 1 1 0 0 0 1 0]        [0 0 0 0 0 0 0 1 0]
      [0 0 1 0 0 1 0 0 1]        [1 0 0 0 1 0 0 0 1]        [0 0 0 0 0 0 0 0 1]

  [V1   V2   V′2]
= [V′2  V1   V2 ]
  [V2   V′2  V1 ]

with

     [σ2 + σ2C + σ2T   0                0              ]        [σ2C  σ2T  0  ]
V1 = [0                σ2 + σ2C + σ2T   0              ],  V2 = [0    σ2C  σ2T]
     [0                0                σ2 + σ2C + σ2T ]        [σ2T  0    σ2C]
In summary
                                E(MSq) under model
Source        d.f.              1             2            3            4            5
Row           t − 1             σ2 + qR(ψ)    σ2 + tσ2R    σ2 + qR(ψ)   σ2 + qR(ψ)   σ2 + tσ2R
Column        t − 1             σ2 + qC(ψ)    σ2 + qC(ψ)   σ2 + qC(ψ)   σ2 + tσ2C    σ2 + tσ2C
Row#Column    (t − 1)2
  Treatment   t − 1             σ2 + qT(ψ)    σ2 + qT(ψ)   σ2 + tσ2T    σ2 + tσ2T    σ2 + tσ2T
  Residual    (t − 1)(t − 2)    σ2            σ2           σ2           σ2           σ2
Design of factorial experiments
There will often be more than one factor of interest to the experimenter.
Definition: Experiments that involve more than one randomized or treatment factor are called factorial experiments.
In general, the number of treatments in a factorial experiment is the product of the numbers of levels of the treatment factors.
The disadvantage of this is that the number of treatments increases very quickly.
Given the number of treatments, the experiment could be laid out as
a Completely Randomized Design,
a Randomized Complete Block Design or
a Latin Square with that number of treatments.
The incomplete block designs, such as BIBDs or Youden Squares, are not suitable for factorial experiments.
a) Obtaining a layout for a factorial experiment in R

Layouts for factorial experiments can be obtained in R using the expressions for the chosen design when only a single factor is involved.

The difference with factorial experiments is that the several treatment factors are entered.

Their values can be generated using fac.gen.

fac.gen(generate, each=1, times=1, order="standard")

It is likely to be necessary to use either the each or times arguments to generate the replicate combinations.

The syntax of fac.gen and examples are given in Appendix B, Randomized layouts and sample size computations in R.

In Yates order the first factor changes fastest and the last slowest, whereas in standard order the first factor changes slowest and the last fastest.
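The difference between the two orderings can be sketched as follows (Python, for illustration only; fac.gen itself is an R function and these helpers are hypothetical):

```python
from itertools import product

def standard_order(levels):
    """All level combinations with the first factor changing slowest
    (itertools.product varies the last element fastest)."""
    return list(product(*levels))

def yates_order(levels):
    """All level combinations with the first factor changing fastest:
    generate with the factor order reversed, then restore it."""
    return [combo[::-1] for combo in product(*reversed(levels))]

A, B = [1, 2], [1, 2, 3]
print(standard_order([A, B]))  # B changes fastest
print(yates_order([A, B]))     # A changes fastest
```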
Summary of advantages of factorial experiments
To summarize, relative to one-factor-at-a-time experiments, factorial experiments have the advantages that:

1. if the factors interact, factorial experiments allow this to be detected and estimates of the interaction effect can be obtained, and
2. if the factors are independent, factorial experiments result in the estimation of the main effects with greater precision.
CRD Factorial
yijk = µ+ αi + βj + τij + εijk
where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; αi is the effect of the i-th level of factor A; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of factor A combined with the j-th level of factor B; εijk is the experimental error associated with the (i,j,k)-th plot.

Considering a = 3, b = 4 and r = 2:
A2B1 A3B4 A1B1 A3B1
A2B4 A3B3 A1B3 A2B3
A2B2 A1B1 A3B2 A1B2
A1B2 A3B1 A2B2 A1B4
A2B1 A1B4 A3B2 A2B4
A1B3 A2B3 A3B3 A3B4
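A layout like the one above can be produced by randomizing the a·b treatment combinations, each replicated r times, over the plots; a minimal Python sketch (a hypothetical helper, not the fac.gen-based approach used in the course):

```python
import random
from itertools import product

def crd_factorial_layout(a, b, r, seed=0):
    """Randomize the a*b treatment combinations, each replicated r times,
    over the a*b*r plots of a completely randomized design."""
    combos = [f"A{i}B{j}" for i, j in product(range(1, a + 1), range(1, b + 1))]
    plots = combos * r
    random.Random(seed).shuffle(plots)
    return plots

layout = crd_factorial_layout(a=3, b=4, r=2)
print(layout)  # 24 plot labels, each combination appearing exactly twice
```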
[Diagram: unrandomized and randomized structures for the CRD factorial. Unrandomized tier: 1 (µ), Plots (abr-1). Randomized tier: A (a-1), B (b-1), A#B ((a-1)(b-1)). Contributions to the expected mean squares: Plots → σ²_p; A → q_A(ψ); B → q_B(ψ); A#B → q_AB(ψ).]
Source       d.f.         E(QM)
Plots        rab-1
  A          a-1          σ²_p + q_A(ψ)
  B          b-1          σ²_p + q_B(ψ)
  A#B        (a-1)(b-1)   σ²_p + q_AB(ψ)
  Residual   ab(r-1)      σ²_p
Total        rab-1
RCBD Factorial
yijk = µ+ γk + αi + βj + τij + εijk
where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; γk is the effect of the k-th block; αi is the effect of the i-th level of factor A; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of factor A combined with the j-th level of factor B; εijk is the experimental error associated with the (i,j,k)-th plot.
Block I:   A3B4 A1B2 A2B2 A1B3
           A2B4 A2B1 A1B1 A2B3
           A1B4 A3B1 A3B2 A3B3
Block II:  A2B3 A1B3 A3B1 A3B2
           A3B4 A2B1 A1B4 A2B2
           A1B1 A2B4 A1B2 A3B3
[Diagram: unrandomized and randomized structures for the RCBD factorial. Unrandomized tier: 1 (µ), Block (r-1), Plot∧Block (r(ab-1)). Randomized tier: A (a-1), B (b-1), A#B ((a-1)(b-1)). Contributions to the expected mean squares: Block → σ²_PB + q_Bl(ψ); Plot∧Block (P[Bl]) → σ²_PB; A → q_A(ψ); B → q_B(ψ); A#B → q_AB(ψ).]
Source           d.f.          E(QM)
Block            r-1           σ²_PB + q_Bl(ψ)
Plots[Blocks]    r(ab-1)
  A              a-1           σ²_PB + q_A(ψ)
  B              b-1           σ²_PB + q_B(ψ)
  A#B            (a-1)(b-1)    σ²_PB + q_AB(ψ)
  Residual       (ab-1)(r-1)   σ²_PB
Total            rab-1
A. Design of split-plot experiments

Designs in which main effects are confounded with more variable units, such as large plots.

Their defining attribute is that there is randomization to two different physical entities, such that some main effects are randomized to the more variable entities.

Definition: The standard split-plot design is one in which two factors, say A and B with a and b levels, respectively, are assigned as follows:

one of the factors, A say, is randomized according to a RCBD with say r blocks, and each of its ra plots, called the main plots, is split into b subplots (or split-plots), with the levels of B randomized independently to the subplots within each main plot. Altogether the experiment involves n = rab subplots.

That is, the generic factor names for this design are Blocks, MainPlots, SubPlots, A and B.
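The standard layout just described can be sketched as follows (a hypothetical Python helper, for illustration; the course generates such layouts with fac.gen in R):

```python
import random

def split_plot_layout(a, b, r, seed=0):
    """Standard split-plot: levels of A randomized to main plots within
    each of r blocks (RCBD); levels of B randomized within each main plot."""
    rng = random.Random(seed)
    layout = []
    for block in range(1, r + 1):
        a_levels = list(range(1, a + 1))
        rng.shuffle(a_levels)                  # randomize A within the block
        for main_plot, ai in enumerate(a_levels, 1):
            b_levels = list(range(1, b + 1))
            rng.shuffle(b_levels)              # randomize B within the main plot
            for bj in b_levels:
                layout.append((block, main_plot, f"A{ai}", f"B{bj}"))
    return layout

layout = split_plot_layout(a=3, b=4, r=2)
assert len(layout) == 2 * 3 * 4                # n = rab subplots
```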
Split-plot principle
A very flexible principle that can be used to generate a large number of different types of experiments.

For example, the main plots could be arranged in any of a CRD, RCBD, Latin square, BIBD or Youden Square,

and each plot of the design is subdivided into subplots.

The subplots may utilize more complicated designs as well.

For example, the main plots may be arranged in a RCBD, each of which is subdivided in such a way as to allow a Latin Square to be placed in each main plot.

Also, subplots can be split into subsubplots, and subsubplots into ...

Nor is one restricted to applying just one factor to each type of unit.

More than one factor can be randomized to main plots, more than one to subplots, and so on.

The standard split-plot design is nearly the simplest possibility; only a CRD in the main plots would be simpler.
When to use a split-plot design
1 When the physical attributes of a factor require the use of larger units of experimental material than other factors.

For example, land preparation treatments usually need to be performed on larger areas of land than does the sowing of different varieties (due to the different pieces of equipment).
Temperature control for, say, storage purposes involves the use of relatively large chambers in which several samples can usually be stored.
Different processing runs are often of a minimum size such that their produce can be readily subdivided for the application of further treatments.
Also, some factors are relatively hard to change. For example, the temperature of a production operation is often difficult to change, so it might be better to change it less often by making it a main-plot factor.
2 When it is desired to incorporate an additional factor into an experiment.

3 When it is expected that differences amongst the levels of certain factors are larger than amongst those of other factors.

The levels of the factors with larger differences are randomized to main plots.
One effect of this may be to increase the precision of comparisons between the levels of the other factors.

4 When it is desired to ensure greater precision for some factors than others.

Irrespective of the size of the differences between the main plot treatment factors, it is desired to increase the precision of some factors by assigning them to subplots.
One may be less interested in the main effects of some factors. A particular example of such factors is "noise" factors.
Notes
Note that the last two of these situations utilise the anticipated greater variability of main plots relative to subplots.

That is, we are expecting the larger units to be more variable than the smaller units. This will be expressed in the models and E[MSq]s for these experiments.

In describing the type of study, you need to identify the main plot and subplot design.
Ravioli data

Here we will illustrate the design with data from an evaluation of four commercial brands of ravioli by nine trained assessors (Guillermo Hough, DESA-ISETA, Argentina). The purpose of the study was to identify differences in taste and texture between the brands. Knowledge of such differences is of great commercial importance to food manufacturers, but difficult to obtain: these sensory characteristics must ultimately be assessed by the subjective impressions of a human observer, which vary among individuals, and over occasions in the same individual. However, if the subjective assessment of some aspect of taste or texture (such as saltiness or gumminess) is consistent, for a particular brand, among individuals and over occasions (that is, if the perceived differences between brands are statistically significant), it is safe to conclude that these differences are real. Differences among assessors are of less interest. Different individuals may simply be using different parts of the assessment scale to describe the same sensations: who can say whether food tastes saltier to you than it does to me? However, if there are significant interactions between brand and assessor (for example, if the assessor ANA consistently perceives Brand A as saltier than Brand B, whereas GUI consistently ranks these brands in the opposite order), this is of interest to the investigator.
CRD Split-plot
yijk = µ+ αi + eik + βj + τij + εijk
where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; αi is the effect of the i-th level of factor A; eik is the experimental error associated with the i-th level of factor A on the k-th plot; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of factor A combined with the j-th level of factor B; εijk is the experimental error associated with the (i,j,k)-th subplot.

Considering a = 3, b = 4 and r = 2:
A3:  B1 B2 B4 B1
     B4 B3 B3 B2
A1:  B3 B4 B1 B2
     B2 B1 B3 B4
A2:  B1 B4 B2 B4
     B2 B3 B3 B1
[Diagram: unrandomized and randomized structures for the CRD split-plot. Unrandomized tier: 1 (µ), Plot (ar-1), Subplot∧Plot (ar(b-1)). Randomized tier: A (a-1), B (b-1), A#B ((a-1)(b-1)). Contributions to the expected mean squares: Plots → σ² + bσ²_p; Subplot∧Plot (S[P]) → σ²; A → q_A(ψ); B → q_B(ψ); A#B → q_AB(ψ).]
Source            d.f.          E(QM)
Plots             ra-1
  A               a-1           σ² + bσ²_P + q_A(ψ)
  Residual        a(r-1)        σ² + bσ²_P
Subplots[Plots]   ra(b-1)
  B               b-1           σ² + q_B(ψ)
  A#B             (a-1)(b-1)    σ² + q_AB(ψ)
  Residual        a(b-1)(r-1)   σ²
Total             rab-1
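The degrees-of-freedom bookkeeping in the table can be checked mechanically; a small Python sketch (a hypothetical helper, not part of the course material):

```python
def split_plot_df(a, b, r):
    """Degrees of freedom for the standard CRD split-plot ANOVA;
    the components must sum to the total, rab - 1."""
    df = {
        "A": a - 1,
        "Residual A (main plots)": a * (r - 1),
        "B": b - 1,
        "A#B": (a - 1) * (b - 1),
        "Residual B (subplots)": a * (b - 1) * (r - 1),
    }
    assert sum(df.values()) == r * a * b - 1
    return df

print(split_plot_df(a=3, b=4, r=2))
```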
yijk = µ+ γk + αi + eik + βj + τij + εijk
where i = 1,...,a; j = 1,...,b; k = 1,...,r; µ is a constant; γk is the effect of the k-th block; αi is the effect of the i-th level of factor A; eik is the experimental error associated with the i-th level of factor A on the k-th plot; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of factor A combined with the j-th level of factor B; εijk is the experimental error associated with the (i,j,k)-th subplot.

Considering a = 3, b = 4 and r = 2:
Block I:   A3:  B1 B2 B4 B3
           A1:  B4 B3 B1 B2
           A2:  B3 B4 B1 B2
Block II:  A2:  B1 B3 B4 B2
           A1:  B2 B3 B1 B4
           A3:  B3 B4 B2 B1
[Diagram: unrandomized and randomized structures (degrees of freedom) for the RCBD split-plot. Unrandomized tier: 1 (µ), Block (r-1), Plot∧Block (r(a-1)), Subplot∧Plot∧Block (ar(b-1)). Randomized tier: A (a-1), B (b-1), A#B ((a-1)(b-1)).]
[Diagram: contributions to the expected mean squares for the RCBD split-plot. Block → σ² + bσ²_p + q_Bl(ψ); Plot∧Block (P[Bl]) → σ² + bσ²_p; Subplot∧Plot∧Block (S[P∧Bl]) → σ²; A → q_A(ψ); B → q_B(ψ); A#B → q_AB(ψ).]
Source            d.f.          E(QM)                  F
Blocks            r-1           σ² + bσ²_P + q_Bl(ψ)
Plots             r(a-1)
  A               a-1           σ² + bσ²_P + q_A(ψ)    QMA/QMResA
  Residual A      (a-1)(r-1)    σ² + bσ²_P
Subplots[Plots]   ra(b-1)
  B               b-1           σ² + q_B(ψ)            QMB/QMResB
  A#B             (a-1)(b-1)    σ² + q_AB(ψ)           QMAB/QMResB
  Residual B      a(b-1)(r-1)   σ²
Total             rab-1
If βj ∼ N(0, σ²_B):

H01: σ²_AB = 0
H02: σ²_B = 0
H03: µ_A1 = µ_A2 = ... = µ_Aa = 0

Source            d.f.          E(QM)                            F
Blocks            r-1           σ² + bσ²_P + q_Bl(ψ)
Plots             r(a-1)
  A               a-1           σ² + bσ²_P + rσ²_AB + q_A(ψ)     z (under H03)
  Residual        (a-1)(r-1)    σ² + bσ²_P
Subplots[Plots]   ra(b-1)
  B               b-1           σ² + rσ²_AB + raσ²_B             QMB/QMAB (under H02)
  A#B             (a-1)(b-1)    σ² + rσ²_AB                      QMAB/QMResB (under H01)
  Residual        a(b-1)(r-1)   σ²
Total             rab-1

z: F = (QMA + QMResB)/(QMResA + QMA#B) ∼ F_{ν1,ν2}

Satterthwaite:

ν1 = (QMA + QMResB)² / [ QMA²/(a-1) + QMResB²/(a(b-1)(r-1)) ]

and

ν2 = (QMResA + QMA#B)² / [ QMResA²/((r-1)(a-1)) + QMA#B²/((a-1)(b-1)) ]
The brands of ravioli were cooked, served into small dishes and presented hot to the assessors. Three replicate evaluations were made, each being completed on a single day; hence each day comprised a block. There may have been uncontrolled and unobserved variation from day to day in the cooking and serving conditions (for example, the temperature of the room may have changed). On each day, the order in which the four brands were presented to the assessors was randomised. However, on any given day, all the assessors received the brands in the same order: for this type of product it is complicated to randomise the order of presentation among assessors. Hence each presentation of a brand comprised a main plot: the brand varied only between presentations, but the whole set of assessors received the brand within each presentation. Each serving, in a single dish, comprised a sub-plot. During each presentation, the servings were shuffled before being taken to the assessors; thus the assessors were informally randomised over the sub-plots within each main plot. (It would have been cumbersome to follow a formal randomisation at this stage: it was more important to get the servings to the assessors while they were still hot.) Each assessor gave the serving presented to him or her a numerical score for saltiness.
Hierarchical classification or nested classification model
Experimental designs with hierarchical classification are frequently used in agricultural, genetic, industrial, medical and other types of research.

Cochran (1939) described a sampling scheme for estimating wheat production: samples of farms were selected from six districts; at the next stage, samples of fields were selected from each of the selected farms; at the final stage, measurements on the yield of wheat were obtained from sample "paths" in each of the selected fields.

For demographic, political and socioeconomic studies, samples of geographic regions, counties, districts and towns are selected in succession.

Similar procedures are employed in geographical studies on rock formation, mineral deposition and soil erosion.

These types of designs are also used in studies related to water and atmospheric pollution, and also in environmental and ecological studies.

Nested designs can be balanced or unbalanced, the classification factors can be fixed or random, and the variances can be equal or unequal.
Further illustrations (Rao, 1997)
A number of applications of nested designs have appeared in the literature, for example:

Fabric differences: Tippett (1931) presents an experiment for examining the properties of four fabrics. Three tests were performed on each of the fabrics, and each test was repeated four times.

Blood pressure measurements: Canner et al. (1991) used data from the Hypertension Prevention Trial, conducted in the U.S. during 1983-1986, and a nested model to examine errors made in measuring blood pressure of individuals. The effects of the participants in the study, their visits and duplicate measurements on each visit were included in the model, and they were all assumed to be random.

Eye examination: Rosner (1982) used a nested model to study items such as "intraocular pressures in persons". The model consists of groups, individuals in the groups, and the measurements on both eyes. The groups were considered to be fixed, and the remaining two factors random. For some items, the measurement on one of the eyes was missing. For some other items, the condition being examined by the ophthalmologist existed in only one of the eyes.
Asparagus clones: For patenting asparagus clones, a plant producer used estimates of means and variances of their important characteristics. For future experimentation, estimates of the variance components of the cladophylls, "the tiny leaves located on the asparagus branches," were also obtained (Trout, 1985). The study was conducted by selecting in stages: five stalks from a clone, two branches from each stalk, five nodes on each branch, and three cladophylls from each node. Variance components were obtained from the lengths of the 150 cladophylls at the nodes.

Experimental drugs: Patients with certain diseases are hospitalized and administered suitable medical treatments. Experimental drugs can be examined by administering them to the patients receiving each of the treatments. In a split-plot experiment, the treatments and drugs are considered to be the main and the sub-plot treatments, respectively.

Spectral density: Jackson and Lawton (1969) examined the consequences of estimating a spectral density through a nested classification.
Textile production: Bainbridge (1965) suggests a staggered design for detecting sources of variation occurring in industrial production, and illustrates it through a chemical test on a specific textile. From a large number of machines, two were selected on each of forty-two days; the sample from one machine was tested by two analysts on different shifts, one of them obtaining duplicate measurements. The sample from the second machine was analysed only once by an analyst. The data from this experiment were used for studying the variations arising from (1) changes in the raw material over the days, (2) differences in the machines, (3) long term tests at the different shifts, and (4) short term tests through the duplicate measurements. This four stage design is unbalanced.

Animal breeding: In several experiments on animal breeding, each of a sample of sires is randomly mated to a sample of dams. The observations on the offspring are analyzed through the model for a nested design.
Calf birth weight
Example
In an animal breeding experiment 20 unrelated cows were subjected to superovulation and artificial insemination. Each group of 4 cows was inseminated with a different sire, with a total of 5 unrelated sires. Out of each mating (combination of dam and sire), three calves were generated and their yearling weights were recorded.

there is no interest in each particular sire or dam, which depend very much on the circumstances

the sire effect can be viewed as a sample of a random sire effect (levels are chosen at random from an infinite set of sire levels)

the dam effect can be viewed as a sample of a random dam effect (levels are chosen at random from an infinite set of dam levels)

interest lies in estimating the variance of the sire and dam effects as sources of random variation in the data

the three calves with the same parents share something, which presumably violates the assumption of independence
[Diagram: nested structure of the experiment. Each sire is mated to four dams: S1 → D1, D2, D3, D4; ...; S5 → D17, D18, D19, D20.]
          Dam 1             Dam 2             Dam 3             Dam 4
Sire   C1   C2   C3      C1   C2   C3      C1   C2   C3      C1   C2   C3
1     30.1 31.1 34.6    29.2 30.8 31.6    32.0 32.6 32.7    33.3 40.2 36.7
2     32.3 36.7 40.1    35.6 34.3 41.1    34.1 30.8 39.3    39.9 36.7 38.7
3     39.8 36.5 38.9    37.5 38.6 36.8    39.0 39.8 38.6    36.7 37.6 38.9
4     40.8 42.0 45.0    42.7 43.9 46.7    44.5 46.0 47.0    43.9 45.0 48.0
5     41.9 43.2 45.3    45.3 44.0 47.1    45.3 44.8 45.3    46.0 47.2 48.0
Hierarchical classification model
The three stages of the illustration can be represented by the model
yijk = µ+ αi + β(i)j + ε(ij)k
where µ is the grand mean, αi is the effect of the i-th sire, β(i)j is the effect of the j-th dam inseminated by the i-th sire, and ε(ij)k is the effect of the k-th calf born from the j-th dam with the i-th sire. Assuming

αi ∼ N(0, σ²_s), β(i)j ∼ N(0, σ²_d) and ε(ij)k ∼ N(0, σ²)

and that αi, β(i)j and ε(ij)k, as well as αi and αi′ (i ≠ i′), β(i)j and β(i′)j′ (i ≠ i′ and/or j ≠ j′), and ε(ij)k and ε(i′j′)k′ (i ≠ i′, j ≠ j′ and/or k ≠ k′), are independent,

then

Var(Yijk) = Var(µ + αi + β(i)j + ε(ij)k) = σ² + σ²_s + σ²_d

Cov(Yijk, Yijk′) = Cov(µ + αi + β(i)j + ε(ij)k, µ + αi + β(i)j + ε(ij)k′) = σ²_s + σ²_d (observations from the same sire and the same dam)

Cov(Yijk, Yij′k′) = Cov(µ + αi + β(i)j + ε(ij)k, µ + αi + β(i)j′ + ε(ij′)k′) = σ²_s (observations from the same sire and different dams)

Cov(Yijk, Yi′j′k′) = Cov(µ + αi + β(i)j + ε(ij)k, µ + αi′ + β(i′)j′ + ε(i′j′)k′) = 0 (observations from different sires)
Considering 2 sires/2 dams/2 calves,
Y = X_G µ + Z1 α + Z2 β + ε

E(Y) = X_G µ

Var(Y) = ZGZ′ + Σ = Z1 G1 Z1′ + Z2 G2 Z2′ + Σ

Var(Y) = σ²_s (I2 ⊗ J4) + σ²_d (I4 ⊗ J2) + σ² I8 =
[ V  0 ]
[ 0  V ]

where each block is 4×4 and

V = [ σ²_s+σ²_d+σ²   σ²_s+σ²_d      σ²_s            σ²_s          ]
    [ σ²_s+σ²_d      σ²_s+σ²_d+σ²   σ²_s            σ²_s          ]
    [ σ²_s           σ²_s           σ²_s+σ²_d+σ²    σ²_s+σ²_d     ]
    [ σ²_s           σ²_s           σ²_s+σ²_d       σ²_s+σ²_d+σ²  ]

In this case:

Z1 = [ 1_4  0   ]
     [ 0    1_4 ]  = I2 ⊗ 1_2 ⊗ 1_2,

Z2 = [ 1_2  0    0    0   ]
     [ 0    1_2  0    0   ]
     [ 0    0    1_2  0   ]  = I2 ⊗ I2 ⊗ 1_2,
     [ 0    0    0    1_2 ]

G1 = σ²_s I2, G2 = σ²_d I4 and Σ = σ² I8
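The block structure above can be checked numerically; a minimal numpy sketch with illustrative (made-up) variance-component values:

```python
import numpy as np

# Illustrative (made-up) variance components
sigma2_s, sigma2_d, sigma2 = 2.0, 1.0, 0.5

I2, I4, I8 = np.eye(2), np.eye(4), np.eye(8)
J2, J4 = np.ones((2, 2)), np.ones((4, 4))
one2 = np.ones((2, 1))

# Design matrices for 2 sires / 2 dams per sire / 2 calves per dam
Z1 = np.kron(I2, np.kron(one2, one2))   # sires
Z2 = np.kron(I2, np.kron(I2, one2))     # dams within sires
G1, G2, Sigma = sigma2_s * I2, sigma2_d * I4, sigma2 * I8

V_from_Z = Z1 @ G1 @ Z1.T + Z2 @ G2 @ Z2.T + Sigma
V_from_kron = (sigma2_s * np.kron(I2, J4)
               + sigma2_d * np.kron(I4, J2)
               + sigma2 * I8)

assert np.allclose(V_from_Z, V_from_kron)    # the two expressions agree
assert np.allclose(V_from_Z[:4, 4:], 0.0)    # different sires: covariance 0
```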
For Sires, Dams and Calves random (I sires / J dams / K calves), the analysis of variance table is

Source      df         SSq          E(MSq)
Sire        I-1        Y′Q_S Y      σ² + Kσ²_d + JKσ²_s
Dam[Sire]   I(J-1)     Y′Q_D Y      σ² + Kσ²_d
Residual    IJ(K-1)    Y′Q_Res Y    σ²

ANOVA estimators:

σ² = RMSq,  σ²_d = (DSMSq - RMSq)/K,  σ²_s = (SMSq - DSMSq)/(JK)
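These estimators can be coded directly; a hypothetical Python helper, using for illustration the mean squares from the calf birth-weight example (Sire 339, Dam[Sire] 9, Residual 4.8, with J = 4, K = 3):

```python
def anova_estimators(ms_sire, ms_dam_in_sire, ms_res, J, K):
    """Method-of-moments (ANOVA) estimators for the balanced
    two-fold nested random model."""
    sigma2 = ms_res
    sigma2_d = (ms_dam_in_sire - ms_res) / K
    sigma2_s = (ms_sire - ms_dam_in_sire) / (J * K)
    return sigma2_s, sigma2_d, sigma2

# Mean squares from the calf birth-weight ANOVA
s2_s, s2_d, s2 = anova_estimators(339.0, 9.0, 4.8, J=4, K=3)
print(round(s2_s, 1), round(s2_d, 1), s2)  # 27.5 1.4 4.8
```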
[Diagram: Hasse diagrams for the nested classification. Degrees of freedom: 1 (µ), Sire (I-1), Dam∧Sire (I(J-1)), Calf∧Dam∧Sire (IJ(K-1)). Expected mean squares: Sire (S) → σ² + Kσ²_d + JKσ²_s; Dam∧Sire (D[S]) → σ² + Kσ²_d; Calf∧Dam∧Sire (C[D∧S]) → σ².]
Calf birth weight
ANOVA table using R or SAS

Source      df   SSq    MSq   F      Prob
Sire        4    1356   339   71.22  <0.01
Dam[Sire]   15   129    9     1.81   0.068
Residual    40   190    4.8

Variance component estimates:

        MM    REML   ML
σ²_s    28    27     22
σ²_d    1.3   1.2    1.2
σ²      4.8   4.8    4.8
SAS program

data calf_b;
input weight sire dam @@;
cards;
30.1 1 1  39.3 2 3  43.9 4 2
31.1 1 1  39.9 2 4  46.7 4 2
34.6 1 1  36.7 2 4  44.5 4 3
29.2 1 2  38.7 2 4  46.0 4 3
30.8 1 2  39.8 3 1  47.0 4 3
31.6 1 2  36.5 3 1  43.9 4 4
32.0 1 3  38.9 3 1  45.0 4 4
32.6 1 3  37.5 3 2  48.0 4 4
32.7 1 3  38.6 3 2  41.9 5 1
33.3 1 4  36.8 3 2  43.2 5 1
40.2 1 4  39.0 3 3  45.3 5 1
36.7 1 4  39.8 3 3  45.3 5 2
32.3 2 1  38.6 3 3  44.0 5 2
36.7 2 1  36.7 3 4  47.1 5 2
40.1 2 1  37.6 3 4  45.3 5 3
35.6 2 2  38.9 3 4  44.8 5 3
34.3 2 2  40.8 4 1  45.3 5 3
41.1 2 2  42.0 4 1  46.0 5 4
34.1 2 3  45.0 4 1  47.2 5 4
30.8 2 3  42.7 4 2  48.0 5 4
;
* Moment Method;
proc glm data=calf_b;
class sire dam;
model weight = sire dam(sire);
random sire dam(sire)/test;
run;
SAS program (cont.)

* Restricted Maximum Likelihood Method;
proc mixed data=calf_b;
class sire dam;
model weight = / solution;
random sire dam(sire) / solution G;
run;

* Maximum Likelihood Method;
proc mixed data=calf_b method=ML;
class sire dam;
model weight = / solution ddfm=sat;
random sire dam(sire) / solution G;
run;
R program

sire <- factor(rep(c(1,2,3,4,5), times=rep(12,5))); sire
dam <- factor(rep(rep(c(1:4), each=3), times=5))
weight <- c(30.1, 31.1, 34.6, 29.2, 30.8, 31.6, 32.0, 32.6, 32.7, 33.3, 40.2, 36.7,
            32.3, 36.7, 40.1, 35.6, 34.3, 41.1, 34.1, 30.8, 39.3, 39.9, 36.7, 38.7,
            39.8, 36.5, 38.9, 37.5, 38.6, 36.8, 39.0, 39.8, 38.6, 36.7, 37.6, 38.9,
            40.8, 42.0, 45.0, 42.7, 43.9, 46.7, 44.5, 46.0, 47.0, 43.9, 45.0, 48.0,
            41.9, 43.2, 45.3, 45.3, 44.0, 47.1, 45.3, 44.8, 45.3, 46.0, 47.2, 48.0)
calf.dat <- data.frame(weight, sire, dam)

# Moment Method
calf.lm <- lm(weight ~ sire/dam, calf.dat)
anova(calf.lm)
(MSSire = anova(calf.lm)$Mean[1])
(MSDamdSire = anova(calf.lm)$Mean[2])
(MSRes = anova(calf.lm)$Mean[3])
(MSSire - MSDamdSire)/(3*4)   # sigma_s^2 hat
(MSDamdSire - MSRes)/3        # sigma_d^2 hat

library(nlme)
# Maximum Likelihood Method
calf.ml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="ML")
summary(calf.ml, corr = F)
(summary(calf.ml)$sigma)^2    # sigma^2 hat

# Restricted Maximum Likelihood Method
calf.reml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="REML")
summary(calf.reml, corr = F)
(summary(calf.reml)$sigma)^2  # sigma^2 hat
names(summary(calf.reml))
Unbalanced data
The experiment just described is not common. In general, experiments of this type are unbalanced, as in the example in the following figure.
[Figure: unbalanced nested structure — Sire S1 with Dams D1, D2; Sire S2 with Dams D3, D4, D5; Sire S3 with Dams D6, D7]
Sire   Dam 1         Dam 2         Dam 3
1      32.0, 33.5    55.0
2      36.0          34.5, 35.0    48.0, 49.5, 50.0
3      32.5, 31.5    58.0, 57.0
For Sires, Dams and Calves random (I sires / n_i dams / m_ij calves), the analysis of variance table is

Source       df            SSq          E(MSq)
Sire         I − 1         Y′Q_S Y      σ² + K₂σ²_d + K₃σ²_s
Dam[Sire]    Σᵢ nᵢ − I     Y′Q_D Y      σ² + K₁σ²_d
Residual     N − Σᵢ nᵢ     Y′Q_Res Y    σ²

where

K₁ = [N − Σᵢ Σⱼ m²ᵢⱼ/mᵢ.] / (Σᵢ nᵢ − I),
K₂ = [Σᵢ Σⱼ m²ᵢⱼ (1/mᵢ. − 1/N)] / (I − 1),
K₃ = [N − (Σᵢ m²ᵢ.)/N] / (I − 1),

with N = Σᵢ Σⱼ mᵢⱼ and mᵢ. = Σⱼ mᵢⱼ.

ANOVA (moment) estimators:

σ̂² = RMSq,   σ̂²_d = (DSMSq − RMSq)/K₁,   σ̂²_s = (SMSq − K₂σ̂²_d − σ̂²)/K₃
Calf birth weight
ANOVA table using R or SAS
Source df SSq MSq F Prob
Sire 2 37.25 18.63 25.30 < 0.01
Dam[Sire] 4 1275.33 318.83 433.13 < 0.01
Residual 6 4.42 0.74
K1 = 1.75, K2 = 1.96, K3 = 4.15
        MM        ML            REML
σ²_s    −81.53    0.00000012    0.00000023
σ²_d    181.77    104.29        121.75
σ²      0.74      0.74          0.74

Note the negative variance-component estimate for σ²_s from the moment method; ML and REML constrain the estimate to be non-negative, giving essentially zero.
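The coefficients K1 = 1.75, K2 = 1.96, K3 = 4.15 quoted above, and the moment estimates in the table, can be reproduced directly from the K-coefficient formulas for the unbalanced nested design. A short Python sketch (the m_ij counts come from the data table: sire 1 has dams with 2 and 1 calves, sire 2 with 1, 2 and 3, sire 3 with 2 and 2):

```python
# K coefficients and moment estimates for the unbalanced sire/dam design
m = [[2, 1], [1, 2, 3], [2, 2]]          # m_ij: calves per dam, nested in sire
I = len(m)                                # number of sires
n = [len(mi) for mi in m]                 # n_i: dams per sire
mi_dot = [sum(mi) for mi in m]            # m_i.
N = sum(mi_dot)                           # total number of calves (13)

K1 = (N - sum(sum(x*x for x in mi)/md for mi, md in zip(m, mi_dot))) / (sum(n) - I)
K2 = sum(sum(x*x for x in mi) * (1/md - 1/N) for mi, md in zip(m, mi_dot)) / (I - 1)
K3 = (N - sum(md*md for md in mi_dot)/N) / (I - 1)

# Mean squares from the ANOVA table
ms_sire, ms_dam, ms_res = 18.63, 318.83, 0.74
sigma2 = ms_res
sigma2_d = (ms_dam - ms_res) / K1
sigma2_s = (ms_sire - K2*sigma2_d - sigma2) / K3

print(round(K1, 2), round(K2, 2), round(K3, 2))   # 1.75 1.96 4.15
print(round(sigma2_s, 2), round(sigma2_d, 2))     # ~ -81.5, ~181.8
```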
SAS program

data calf_unb;
input sire dam weight;
cards;
1 1 32.0
1 1 33.5
1 2 55.0
2 1 36.0
2 2 34.5
2 2 35.0
2 3 48.0
2 3 49.5
2 3 50.0
3 1 32.5
3 1 31.5
3 2 58.0
3 2 57.0
;
* Moment Method;
proc glm data=calf_unb;
class sire dam;
model weight = sire dam(sire);
random sire dam(sire)/test;
run;
* Restricted Maximum Likelihood Method;
proc mixed data=calf_unb;
class sire dam;
model weight = / solution ;
random sire dam(sire) / solution G;
run;
* Maximum Likelihood Method;
proc mixed data=calf_unb method=ML;
class sire dam;
model weight = / solution ddfm=sat;
random sire dam(sire) / solution G;
run;
R program

sire <- factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3))
dam <- factor(c(1,1,2,1,2,2,3,3,3,1,1,2,2))
weight <- c(32.0,33.5,55.0,36.0,34.5,35.0,48.0,49.5,50.0,32.5,31.5,58.0,57.0)
(calf.dat<- data.frame(sire, dam, weight))
# Moment Method
calf.lm <- lm(weight ~ sire/dam, calf.dat)
anova(calf.lm)
(SireMSq = anova(calf.lm)$Mean[1])
(DamdSireMSq = anova(calf.lm)$Mean[2])
(ResMSq = anova(calf.lm)$Mean[3])
(k1 <- (13-(2^2+1^2)/3-(1^2+2^2+3^2)/6-(2^2+2^2)/4)/((2+3+2)-3))
(k2 <- ((2^2+1^2)*(1/3-1/13)+(1^2+2^2+3^2)*(1/6-1/13)+(2^2+2^2)*(1/4-1/13))/(3-1))
(k3 <- (13-(3^2+6^2+4^2)/13)/(3-1))
(sigma2D <- (DamdSireMSq - ResMSq)/k1) # sigmaD^2_hat
(sigma2S <- (SireMSq - ResMSq - k2*sigma2D)/ k3) # sigmaS^2_hat
# Restricted Maximum Likelihood Method
require(nlme)
calf.reml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="REML")
summary(calf.reml,corr = F)
(summary(calf.reml)$sigma)^2 # sigma^2
# Maximum Likelihood Method
calf.ml <- lme(weight ~ 1, random = ~1|sire/dam, calf.dat, method="ML")
summary(calf.ml,corr = F)
(summary(calf.ml)$sigma)^2
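As a cross-check on the R/SAS output, the nested ANOVA sums of squares for the unbalanced data can be computed by hand. A sketch in plain Python, using the standard decomposition for a two-level nested classification:

```python
# Nested ANOVA by hand for the unbalanced calf data: Sire / Dam[Sire] / Residual
data = {  # (sire, dam) -> calf weights
    (1, 1): [32.0, 33.5], (1, 2): [55.0],
    (2, 1): [36.0], (2, 2): [34.5, 35.0], (2, 3): [48.0, 49.5, 50.0],
    (3, 1): [32.5, 31.5], (3, 2): [58.0, 57.0],
}
all_w = [w for ws in data.values() for w in ws]
N = len(all_w)
grand = sum(all_w) / N

sires = sorted({s for s, d in data})
sire_obs = {s: [w for (si, d), ws in data.items() if si == s for w in ws] for s in sires}
sire_mean = {s: sum(v)/len(v) for s, v in sire_obs.items()}

# Between-sire, between-dam-within-sire, and residual sums of squares
ss_sire = sum(len(v) * (sire_mean[s] - grand)**2 for s, v in sire_obs.items())
ss_dam = sum(len(ws) * (sum(ws)/len(ws) - sire_mean[s])**2 for (s, d), ws in data.items())
ss_res = sum((w - sum(ws)/len(ws))**2 for ws in data.values() for w in ws)

df_sire = len(sires) - 1
df_dam = len(data) - len(sires)
df_res = N - len(data)
print(round(ss_sire, 2), round(ss_dam, 2), round(ss_res, 2))  # 37.25 1275.33 4.42
print(df_sire, df_dam, df_res)                                # 2 4 6
```

The sums of squares and degrees of freedom match the ANOVA table shown earlier for these data.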
Enzyme lipase data

Lipase is an enzyme used for certain types of medical diagnosis, and the different stages of its preparation can affect the required specifications.
Three laboratories preparing the enzyme were randomly selected forthe experiment.
Four weeks were randomly assigned for each of the laboratories.
Measurements were obtained in the mornings and evenings of the sample days chosen from the selected weeks.

Only the averages for the days are presented in the following table; 450 is subtracted from each of the averages.
                 Laboratory
Week   1                   2                 3
1      43.4, 46.2, 46.5    7.0, 7.8, 15.7    22.4, 15.5, 29.7
2      37.0, 16.6          32.4, 16.8        25.4, 23.1
3      23.6, 33.6          13.4, 9.6         22.9, 0.6
4      51.0, 52.4          23.9, 19.3        18.4, 3.7
ANOVA table using R or SAS
Source df SSq MSq F Prob
Laboratory 2 2874.03 1437.02
Week[Laboratory] 9 1625.45 180.61
Residual 15 910.82 60.72
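The same K-coefficient formulas apply here, with weeks nested in laboratories (3 observations in week 1 and 2 in weeks 2-4 for every laboratory, from the data table). A Python sketch; the component estimates at the end are computed here from the MSq column, not quoted from the slides:

```python
# K coefficients for the lipase design: 3 labs, 4 weeks each
m = [[3, 2, 2, 2]] * 3                    # m_ij: observations per week, per lab
I = len(m)                                # number of laboratories
n = [len(mi) for mi in m]                 # n_i: weeks per lab
mi_dot = [sum(mi) for mi in m]            # m_i.
N = sum(mi_dot)                           # 27 observations in total

K1 = (N - sum(sum(x*x for x in mi)/md for mi, md in zip(m, mi_dot))) / (sum(n) - I)
K2 = sum(sum(x*x for x in mi) * (1/md - 1/N) for mi, md in zip(m, mi_dot)) / (I - 1)
K3 = (N - sum(md*md for md in mi_dot)/N) / (I - 1)

# Moment estimates from the MSq column (computed here, not shown on the slide)
ms_lab, ms_week, ms_res = 1437.02, 180.61, 60.72
sigma2 = ms_res
sigma2_w = (ms_week - ms_res) / K1        # week-within-lab component
sigma2_lab = (ms_lab - K2*sigma2_w - sigma2) / K3   # laboratory component

print(round(K1, 3), round(K2, 3), round(K3, 3))     # 2.222 2.333 9.0
```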
Unequal variances and sample sizes
For this general case, the variances of the β(i)j are unequal and can be denoted by σ²_i. The variances of the ε(ij)k, denoted by σ²_ij, can also be unequal.

Equality of the σ²_i and of the σ²_ij is usually assumed; in some practical situations, these assumptions may not be valid.
Var(Y) = [ V1       0_{4×4} ]
         [ 0_{4×4}  V2      ]

where

V1 = | σ²_s + σ²_1 + σ²_11   σ²_s + σ²_1           σ²_s                  σ²_s                |
     | σ²_s + σ²_1           σ²_s + σ²_1 + σ²_11   σ²_s                  σ²_s                |
     | σ²_s                  σ²_s                  σ²_s + σ²_1 + σ²_12   σ²_s + σ²_1         |
     | σ²_s                  σ²_s                  σ²_s + σ²_1           σ²_s + σ²_1 + σ²_12 |

V2 = | σ²_s + σ²_2 + σ²_21   σ²_s + σ²_2           σ²_s                  σ²_s                |
     | σ²_s + σ²_2           σ²_s + σ²_2 + σ²_21   σ²_s                  σ²_s                |
     | σ²_s                  σ²_s                  σ²_s + σ²_2 + σ²_22   σ²_s + σ²_2         |
     | σ²_s                  σ²_s                  σ²_s + σ²_2           σ²_s + σ²_2 + σ²_22 |
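The structure of Var(Y) (2 sires, each with 2 dams of 2 calves; sire-specific dam variances σ²_i and dam-specific residual variances σ²_ij) can be assembled programmatically. A sketch with illustrative, made-up numeric values, not taken from the slides:

```python
# Assemble Var(Y) for 2 sires, each with 2 dams of 2 calves:
#   cov(same sire, different dams) = sigma2_s
#   cov(same dam, different calves) = sigma2_s + sigma2_i
#   var(calf k of dam j, sire i)    = sigma2_s + sigma2_i + sigma2_ij
def sire_block(s2_s, s2_i, s2_i1, s2_i2):
    V = [[s2_s] * 4 for _ in range(4)]            # same-sire covariance everywhere
    for j, s2_ij in ((0, s2_i1), (1, s2_i2)):     # dams j = 0, 1
        a, b = 2*j, 2*j + 1                       # the two calves of dam j
        V[a][b] = V[b][a] = s2_s + s2_i           # same dam, different calves
        V[a][a] = V[b][b] = s2_s + s2_i + s2_ij   # diagonal entries
    return V

# Illustrative values (hypothetical)
V1 = sire_block(2.0, 1.0, 0.5, 0.8)
V2 = sire_block(2.0, 1.5, 0.6, 0.9)

# Block-diagonal Var(Y), 8 x 8
Z = [[0.0] * 4 for _ in range(4)]
VarY = [r1 + r2 for r1, r2 in zip(V1, Z)] + [r1 + r2 for r1, r2 in zip(Z, V2)]
```

Calves from different sires are uncorrelated, which is what the off-diagonal zero blocks express.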