Power in Mixed Effects
Gary W. Oehlert
School of Statistics, University of Minnesota
December 1, 2014
Power is an important aspect of designing an experiment; we now return to power in mixed effects.
We will compute power for “old school” tests. This technique works for balanced designs; it is exact in some situations with REML and approximate in others.
Modeling Assumptions
For pure random terms $\alpha\beta_{ij}$ where all contributing factors are random, we assume all $\alpha\beta_{ij}$s are independent of each other.
For mixed terms $\alpha\beta_{ij}$, say where A is fixed and B is random, we have a choice:
Unrestricted assumptions say that all elements of $\alpha\beta_{ij}$ are independent.
Restricted assumptions say that elements of $\alpha\beta_{ij}$ add to zero across any fixed subscript ($i$ in this case) but are otherwise independent.
Restricted assumptions induce negative correlation among some random effects:
Under restricted assumptions, two random effects from the same mixed term are negatively correlated if all of their subscripts corresponding to random factors are the same.
Otherwise, they are independent.
The text usually defaults to restricted assumptions, but either could be appropriate depending on the situation, or something else could be better still.
It is usually very tricky to decide which mixed modeling assumptions are appropriate.
In general, unrestricted assumptions are more conservative (it is typically more difficult to reject the null).
The lme and lmer functions in R fit using the unrestricted model assumptions.
Via heroic effort, one can make lme fit some models under the restricted assumptions.
Given R's predilections, we will concentrate on the unrestricted approach.
Anova and Expected Mean Squares
Old school testing in mixed effects proceeds as follows:
1. Compute an Anova table as if everything in the model is a fixed effect.
2. Compute the expectation of every mean square (the expected mean squares, EMS) using the complete Hasse diagram.
3. Use the Hasse diagram (or EMS) to determine the correct ratio of MS (the correct F test) for every term of interest.
4. Do the tests.
We need the EMS and DF for the F-test to do power.
The complete Hasse diagram includes super- and subscripts on every term.
For each node on the diagram, add a superscript that indicates the number of different levels of the effect in that term.
For each node on the diagram, add a subscript that indicates the degrees of freedom. Compute the df for a term U by starting with the superscript for U and subtracting the subscripts (df) for all terms above U.
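The subscript rule can be sketched in a few lines of Python. This is only an illustration for the two-factor crossed layout (A with 5 levels, B with 4 levels, 2 replications); the node names, the `above` map, and the grand-mean node `M` at the top of the diagram are assumptions made for the example, since the diagram itself appears only as a figure.

```python
# Superscripts: number of distinct levels of the effect at each node.
# "M" is the grand-mean node at the top of the diagram (assumed here).
superscript = {"M": 1, "A": 5, "B": 4, "AB": 20, "E": 40}

# All terms above each node in the diagram (hypothetical encoding).
above = {
    "M": [],
    "A": ["M"],
    "B": ["M"],
    "AB": ["M", "A", "B"],
    "E": ["M", "A", "B", "AB"],
}

def degrees_of_freedom(superscript, above):
    """df(U) = superscript(U) minus the df of every term above U."""
    df = {}
    # A term always has more ancestors than any of its ancestors,
    # so sorting by ancestor count processes the diagram top-down.
    for term in sorted(above, key=lambda t: len(above[t])):
        df[term] = superscript[term] - sum(df[u] for u in above[term])
    return df

df = degrees_of_freedom(superscript, above)
print(df)
```

The subscripts always add up to N (here 40), which is a quick check on a hand-drawn diagram.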
A has 5 levels, B has 4 levels, C has 2 levels, 2 replications.
Cheese raters
1. The representative element for a random term is its variance.
2. The representative element for a fixed term is the sum of the squared fixed effects divided by degrees of freedom.
3. Contribution from a term is N, divided by the superscript, times the representative element.
4. Using unrestricted model assumptions, the EMS for a term is the contribution from that term and all random terms below it.
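The four rules above can be sketched as a small Python routine. The example assumes the crossed design with A fixed (5 levels), B random (4 levels), and 2 replications, so N = 40; the dictionaries `superscript` and `below` and the function name are hypothetical encodings of the Hasse diagram, not part of any library.

```python
from fractions import Fraction

N = 40  # 5 levels of A x 4 levels of B x 2 replications
superscript = {"A": 5, "B": 4, "AB": 20, "E": 40}
random_terms = {"B", "AB", "E"}  # A is the only fixed term

# Terms below each node in the Hasse diagram (hypothetical encoding).
below = {"A": ["AB", "E"], "B": ["AB", "E"], "AB": ["E"], "E": []}

def ems_coefficients(term):
    """Multiplier N / superscript for the term and each random term below it.

    For a random term the multiplier applies to its variance; for a fixed
    term it applies to (sum of squared effects) / df.
    """
    contributors = [term] + [t for t in below[term] if t in random_terms]
    return {t: Fraction(N, superscript[t]) for t in contributors}

for term in ["E", "AB", "A", "B"]:
    print(term, {t: str(c) for t, c in ems_coefficients(term).items()})
```

Under restricted assumptions the set of contributing terms would differ, so this sketch applies to the unrestricted model only.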
A fixed, B random, crossed (first diagram)
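MSE: $\frac{40}{40}\sigma^2 = \sigma^2$
MSAB: $\frac{40}{40}\sigma^2 + \frac{40}{20}\sigma^2_{\alpha\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta}$
MSA: $\frac{40}{40}\sigma^2 + \frac{40}{20}\sigma^2_{\alpha\beta} + \frac{40}{5}\,\frac{\sum_{i=1}^{5}\alpha_i^2}{4} = \sigma^2 + 2\sigma^2_{\alpha\beta} + 8\,\frac{\sum_{i=1}^{5}\alpha_i^2}{4}$
MSB: $\frac{40}{40}\sigma^2 + \frac{40}{20}\sigma^2_{\alpha\beta} + \frac{40}{4}\sigma^2_{\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta} + 10\sigma^2_{\beta}$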
A random, B random, C fixed, crossed (second diagram)
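MSE: $\frac{80}{80}\sigma^2 = \sigma^2$
MSABC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma}$
MSAB: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{20}\sigma^2_{\alpha\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta}$
MSBC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{8}\sigma^2_{\beta\gamma} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 10\sigma^2_{\beta\gamma}$
MSAC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{10}\sigma^2_{\alpha\gamma} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma}$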
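A random, B random, C fixed, crossed (second diagram), continued.
MSC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{10}\sigma^2_{\alpha\gamma} + \frac{80}{8}\sigma^2_{\beta\gamma} + \frac{80}{2}\,\frac{\sum_{k=1}^{2}\gamma_k^2}{1} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma} + 10\sigma^2_{\beta\gamma} + 40\,\frac{\sum_{k=1}^{2}\gamma_k^2}{1}$
MSB: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{20}\sigma^2_{\alpha\beta} + \frac{80}{8}\sigma^2_{\beta\gamma} + \frac{80}{4}\sigma^2_{\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 10\sigma^2_{\beta\gamma} + 20\sigma^2_{\beta}$
MSA: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{20}\sigma^2_{\alpha\beta} + \frac{80}{10}\sigma^2_{\alpha\gamma} + \frac{80}{5}\sigma^2_{\alpha} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma} + 16\sigma^2_{\alpha}$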
A random, B random, C random, fully nested (third diagram).
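MSE: $\frac{80}{80}\sigma^2 = \sigma^2$
MSC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\gamma} = \sigma^2 + 2\sigma^2_{\gamma}$
MSB: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\gamma} + \frac{80}{20}\sigma^2_{\beta} = \sigma^2 + 2\sigma^2_{\gamma} + 4\sigma^2_{\beta}$
MSA: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\gamma} + \frac{80}{20}\sigma^2_{\beta} + \frac{80}{5}\sigma^2_{\alpha} = \sigma^2 + 2\sigma^2_{\gamma} + 4\sigma^2_{\beta} + 16\sigma^2_{\alpha}$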
Cheese raters.
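MSE: $\frac{160}{160}\sigma^2 = \sigma^2$
MSRC: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} = \sigma^2 + 2\sigma^2_{\rho\gamma}$
MSBC: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{8}\,\frac{\sum_{j=1}^{2}\sum_{k=1}^{4}\beta\gamma_{jk}^2}{3} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 20\,\frac{\sum_{j=1}^{2}\sum_{k=1}^{4}\beta\gamma_{jk}^2}{3}$
MSC: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{4}\,\frac{\sum_{k=1}^{4}\gamma_k^2}{3} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 40\,\frac{\sum_{k=1}^{4}\gamma_k^2}{3}$
MSR: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{20}\sigma^2_{\rho} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 8\sigma^2_{\rho}$
MSB: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{20}\sigma^2_{\rho} + \frac{160}{2}\,\frac{\sum_{j=1}^{2}\beta_j^2}{1} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 8\sigma^2_{\rho} + 80\,\frac{\sum_{j=1}^{2}\beta_j^2}{1}$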
A has a levels, B has b levels, C has c levels, n replications.
Notice the pattern [1] of the integer multiplier for regular models like these:
Term        A     B     C     AB    AC    BC    ABC
Multiplier  nbc   nac   nab   nc    nb    na    n
The multiplier is the product of the levels not in the term.
[1] Also notice the pattern that computing EMS is pretty damn tedious.
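As a quick illustration of this pattern, the sketch below computes the multipliers for the three-factor crossed layout with a = 5, b = 4, c = 2, and n = 2. The function name and the convention of writing a term as a string of single-letter factor names are assumptions made for the example.

```python
from math import prod

def ems_multiplier(term, levels, n):
    """n times the product of the levels of every factor not in the term.

    term   : string of single-letter factor names, e.g. "AB"
    levels : dict mapping factor name to its number of levels
    """
    return n * prod(count for factor, count in levels.items() if factor not in term)

levels = {"A": 5, "B": 4, "C": 2}
n = 2
multipliers = {t: ems_multiplier(t, levels, n)
               for t in ["A", "B", "C", "AB", "AC", "BC", "ABC"]}
print(multipliers)
```

The values agree with the integer coefficients in the expected mean squares of the three-way crossed model (for example 16 on $\sigma^2_\alpha$ and 4 on $\sigma^2_{\alpha\beta}$).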
F Tests
In the old school approach, we test a null hypothesis such as $\sigma^2_\alpha = 0$ or $\sum_j \beta_j^2 = 0$ by
Finding two MSs with EMSs that differ by a multiple of the item of interest.
Computing the F ratio of those two MSs and using the df for the two MSs to find a p-value.
I stated that as if it were always possible; it is not always possible.
A fixed, B random, crossed (first diagram)
Item                     Num. MS   Den. MS
$\sum_i \alpha_i^2$      MSA       MSAB
$\sigma^2_\beta$         MSB       MSAB
$\sigma^2_{\alpha\beta}$ MSAB      MSE
No trouble here.
Fully nested design (third diagram)
Item               Num. MS   Den. MS
$\sigma^2_\alpha$  MSA       MSB
$\sigma^2_\beta$   MSB       MSC
$\sigma^2_\gamma$  MSC       MSE
No trouble here.
Cheese raters
Item                            Num. MS   Den. MS
$\sum_j \beta_j^2$              MSB       MSR
$\sum_k \gamma_k^2$             MSC       MSRC
$\sum_{j,k} \beta\gamma_{jk}^2$ MSBC      MSRC
$\sigma^2_\rho$                 MSR       MSRC
$\sigma^2_{\rho\gamma}$         MSRC      MSE
No trouble here.
And that brings us to the second diagram.
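Item                            Num. MS   Den. MS
$\sum_k \gamma_k^2$             none      none
$\sigma^2_\beta$                none      none
$\sigma^2_\alpha$               none      none
$\sigma^2_{\alpha\beta}$        MSAB      MSABC
$\sigma^2_{\alpha\gamma}$       MSAC      MSABC
$\sigma^2_{\beta\gamma}$        MSBC      MSABC
$\sigma^2_{\alpha\beta\gamma}$  MSABC     MSE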
In the second diagram/model, there are no ordinary F-tests for main effects!
Looking at the Hasse diagrams, the denominator for a term (using unrestricted model assumptions) is the first random term below the term of interest.
If there is more than one random term you can get to without going through another random term, then there is no exact test.
We will eventually get to approximate tests for these cases.
Power for Fixed Effects
Recall that the noncentrality parameter controls power for fixed effects (along with degrees of freedom and the error rate).
The expected mean square for a fixed term has the form:
$\text{random stuff} + \frac{N}{\text{superscript}} \times \frac{\text{sum of squared effects}}{\text{df}}$
The noncentrality parameter is then
$\frac{\frac{N}{\text{superscript}} \times \text{sum of squared effects}}{\text{random stuff}}$
The random stuff goes to the denominator and the df disappears.
Try this approach on Chapter 7 problems . . . it works.
For testing A in the first diagram, the EMS is
$\sigma^2 + 2\sigma^2_{\alpha\beta} + 8\,\frac{\sum_{i=1}^{5}\alpha_i^2}{4}$
and the noncentrality parameter is
$\frac{8\sum_{i=1}^{5}\alpha_i^2}{\sigma^2 + 2\sigma^2_{\alpha\beta}}$
Note that you need to know, or make assumptions about, two variances to compute the NCP.
In more general form, the NCP for this test is
$\frac{nb\sum_{i=1}^{5}\alpha_i^2}{\sigma^2 + n\sigma^2_{\alpha\beta}}$
You cannot always make a noncentrality parameter in a mixed effects model arbitrarily large by increasing n.
For this problem, you need to increase b, the number of levels of the random term B, to make the power go to 1.
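A small numerical sketch makes the point about n versus b concrete. The function name and the parameter values (sum of squared effects 1, $\sigma^2 = 1$, $\sigma^2_{\alpha\beta} = 0.5$) are hypothetical, chosen only to show the behavior of the general-form NCP.

```python
def ncp_fixed_a(n, b, sum_sq_alpha, sigma2, sigma2_ab):
    """NCP for testing fixed A against MSAB: n*b*sum(alpha_i^2) / (sigma^2 + n*sigma^2_ab)."""
    return n * b * sum_sq_alpha / (sigma2 + n * sigma2_ab)

# Hypothetical values, for illustration only.
sum_sq_alpha, sigma2, sigma2_ab = 1.0, 1.0, 0.5

# Increasing n with b = 4 fixed: the NCP levels off below
# b * sum_sq_alpha / sigma2_ab = 8.
for n in [2, 20, 200, 2000]:
    print("n =", n, "NCP =", round(ncp_fixed_a(n, 4, sum_sq_alpha, sigma2, sigma2_ab), 3))

# Increasing b with n = 2 fixed: the NCP grows without bound.
for b in [4, 40, 400]:
    print("b =", b, "NCP =", round(ncp_fixed_a(2, b, sum_sq_alpha, sigma2, sigma2_ab), 3))
```

Increasing b also raises the denominator degrees of freedom for the test, which helps power further; the sketch tracks only the NCP.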
Power for Random Effects
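Power for random effects is actually easier than power for fixed effects.
Suppose we want to test $H_0: \sigma^2_\eta = 0$.
We have two MS with $EMS_1 = \tau + k\sigma^2_\eta$ and $EMS_2 = \tau$.
The F test is $MS_1/MS_2$ with $\nu_1$ and $\nu_2$ df.
Under the null,
$\frac{MS_1}{MS_2} \sim \frac{\tau + k \times 0}{\tau}\,F_{\nu_1,\nu_2} = F_{\nu_1,\nu_2}$
Under the alternative,
$\frac{MS_1}{MS_2} \sim \frac{\tau + k\sigma^2_\eta}{\tau}\,F_{\nu_1,\nu_2}$
We reject $H_0$ if $MS_1/MS_2 > F_{\mathcal{E},\nu_1,\nu_2}$.
Looking at the distribution under the alternative, we reject when
$\frac{\tau + k\sigma^2_\eta}{\tau}\,F_{\nu_1,\nu_2} > F_{\mathcal{E},\nu_1,\nu_2}$
or, put another way, when
$F_{\nu_1,\nu_2} > \frac{\tau}{\tau + k\sigma^2_\eta}\,F_{\mathcal{E},\nu_1,\nu_2}$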
Consider testing $H_0: \sigma^2_{\alpha\gamma} = 0$ in the second model (fully crossed three-way design).
The test is MSAC/MSABC. The df are 4 and 12.
Test at $\mathcal{E} = .01$ assuming $\sigma^2 = 1$, $\sigma^2_{\alpha\beta\gamma} = 2$, and $\sigma^2_{\alpha\gamma} = .5$.
The EMS are:
$EMS_{AC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma} = 9$
$EMS_{ABC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} = 5$
$F_{.01,4,12} = 5.41$ and $5/9 \times 5.41 = 3.01$
Power is the probability that F with 4 and 12 df is larger than 3.01, which is .062 (which is not much power).
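The calculation can be reproduced with a short, dependency-free Python sketch. It is not a general F-distribution routine: it relies on the fact that when both degrees of freedom are even, the F upper tail reduces to a binomial tail probability, and it finds the critical value by bisection. All function names are hypothetical.

```python
from math import comb

def f_upper_tail(f, df1, df2):
    """P(F > f) for an F distribution whose df1 and df2 are both even.

    P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f); for integer
    a and b, I_x(a, b) equals P(Binomial(a + b - 1, x) >= a).
    """
    if df1 % 2 or df2 % 2:
        raise ValueError("this sketch handles even degrees of freedom only")
    a, b = df2 // 2, df1 // 2
    x = df2 / (df2 + df1 * f)
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))

def f_critical(error_rate, df1, df2):
    """Upper critical value, found by bisection on the tail probability."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, df1, df2) > error_rate:
        hi *= 2  # expand until the bracket contains the critical value
    for _ in range(100):
        mid = (lo + hi) / 2
        if f_upper_tail(mid, df1, df2) > error_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def random_effect_power(tau, k, sigma2_eta, df1, df2, error_rate):
    """P(F > tau / (tau + k * sigma2_eta) * critical value)."""
    crit = f_critical(error_rate, df1, df2)
    return f_upper_tail(tau / (tau + k * sigma2_eta) * crit, df1, df2)

# Values from the example: tau = EMS_ABC = 5, k = 8, sigma^2_alphagamma = 0.5.
crit = f_critical(0.01, 4, 12)
power = random_effect_power(tau=5, k=8, sigma2_eta=0.5, df1=4, df2=12, error_rate=0.01)
print(round(crit, 2), round(power, 3))
```

For odd degrees of freedom a general routine is needed; SciPy's `scipy.stats.f` provides `ppf` and `sf` methods that play the roles of `f_critical` and `f_upper_tail` here.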
Approximate Tests
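When there is no F test we can usually construct an approximate test.
We want to find a sum of two MS for the numerator and a sum of two MS for the denominator such that the sum of the EMS on the top is equal to the sum of the EMS on the bottom plus the term of interest.
On the Hasse diagram, if there are two random terms immediately below the term of interest, the bottom will be the sum of those two random terms, and the top will be the sum of the term of interest plus the term where the denominator terms “intersect.”
For testing $\sigma^2_\alpha$ in our second example, the numerator MS are:
MSA, $EMS_A = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma} + 16\sigma^2_{\alpha}$
MSABC, $EMS_{ABC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma}$
The denominator MS are:
MSAB, $EMS_{AB} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta}$
MSAC, $EMS_{AC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma}$
Numerator and denominator sums are:
$EMS_A + EMS_{ABC} = 2\sigma^2 + 4\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma} + 16\sigma^2_{\alpha}$
$EMS_{AB} + EMS_{AC} = 2\sigma^2 + 4\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma}$
The two sums differ by exactly $16\sigma^2_\alpha$, a multiple of the term of interest.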
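The cancellation in these sums can be checked mechanically. The sketch below stores each EMS as a map from variance component to coefficient; the short keys (`"e"` for $\sigma^2$, `"abc"` for $\sigma^2_{\alpha\beta\gamma}$, and so on) and the helper name are hypothetical labels for the example.

```python
from collections import Counter

# EMS coefficients for the second model (A and B random, C fixed).
EMS = {
    "A":   {"e": 1, "abc": 2, "ab": 4, "ac": 8, "a": 16},
    "ABC": {"e": 1, "abc": 2},
    "AB":  {"e": 1, "abc": 2, "ab": 4},
    "AC":  {"e": 1, "abc": 2, "ac": 8},
}

def ems_sum(*terms):
    """Add the EMS coefficient maps of the listed mean squares."""
    total = Counter()
    for term in terms:
        total.update(EMS[term])  # Counter.update adds coefficients
    return total

numerator = ems_sum("A", "ABC")
denominator = ems_sum("AB", "AC")

remainder = Counter(numerator)
remainder.subtract(denominator)
difference = {comp: coef for comp, coef in remainder.items() if coef != 0}
print(difference)  # only the sigma^2_alpha component should remain
```

The same check works for the other main effects in this model once their EMS rows are added to the table.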
The approximate test is a decent statistic; the problem is that it doesn’t follow an F distribution (or any other standard distribution).
What we do is treat it as if it follows an F distribution, and compute approximate degrees of freedom separately for the numerator and denominator.
The df approximation is called the Satterthwaite approximation.
Suppose we’re trying to get an approximate df for $MS_1 + MS_2$ with $\nu_1$ and $\nu_2$ df.
Let $E_i = MS_i$ and $V_i = 2E_i^2/\nu_i$.
Then the approximate df for the sum is:
$\frac{2(E_1 + E_2)^2}{V_1 + V_2}$
For computing power, let $E_i = EMS_i$. (In fact, even for data $E_i$ is supposed to be $EMS_i$; we’re just using $MS_i$ to estimate $EMS_i$.)
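A minimal sketch of the Satterthwaite calculation follows. The function name is hypothetical, and the example plugs in EMS values for the approximate test of $\sigma^2_\alpha$ in the second model using $\sigma^2 = 1$, $\sigma^2_{\alpha\beta\gamma} = 2$, and $\sigma^2_{\alpha\gamma} = .5$ from the earlier example, plus assumed values $\sigma^2_{\alpha\beta} = 1$ and $\sigma^2_\alpha = 1$ that do not come from the slides.

```python
def satterthwaite_df(ms, df):
    """Approximate df for a sum of mean squares.

    ms : mean squares (or expected mean squares when computing power)
    df : their degrees of freedom, in the same order
    """
    variances = [2 * e**2 / nu for e, nu in zip(ms, df)]  # V_i = 2 E_i^2 / nu_i
    return 2 * sum(ms) ** 2 / sum(variances)

# EMS under the assumed variances:
#   EMS_A = 1 + 4 + 4 + 4 + 16 = 29 (4 df), EMS_ABC = 5 (12 df)
#   EMS_AB = 1 + 4 + 4 = 9 (12 df),         EMS_AC = 1 + 4 + 4 = 9 (4 df)
numerator_df = satterthwaite_df([29, 5], [4, 12])
denominator_df = satterthwaite_df([9, 9], [12, 4])
print(round(numerator_df, 2), round(denominator_df, 2))
```

The approximate df are generally not integers, so any F tail routine used with them has to accept fractional degrees of freedom.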