
Prediction of New Observations

Statistics Seminar: 6th talk, ETHZ FS2010

Martina Albers, 12 April 2010

Papers: Welham (2004), Yiang (2007)

Content

• Introduction
• Prediction of Mixed Effects
• Prediction of Future Observations
• Principles of Prediction
  – Prediction Process
  – Fixed and Random Terms
• Example: Split-Plot Design

Linear mixed model

$$y = X\beta + Zb + \varepsilon = W\gamma + \varepsilon$$

$y \cong n \times 1$ vector of observations
$X \cong n \times p$ matrix associating observations with the appropriate combination of fixed effects
$\beta \cong p \times 1$ vector of fixed effects
$Z \cong n \times q$ matrix associating observations with the appropriate combination of random effects
$b \cong q \times 1$ vector of random effects
$\varepsilon \cong n \times 1$ vector of residual errors
$W, \gamma \cong$ combined design matrix resp. vector of effects

Introduction

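Linear mixed model

$$y = X\beta + Zb + \varepsilon = W\gamma + \varepsilon$$

$$(Y \mid B = b) \sim N(X\beta + Zb,\ \sigma^2 I_n), \qquad B \sim N(0, \Sigma_\theta)$$

where $\Sigma_\theta = \operatorname{cov}(B)$ is $q \times q$ and symmetric.

Assumption: $b \perp \varepsilon$

Write $\Sigma_\theta = \sigma^2 \Lambda_\theta \Lambda_\theta'$ and use

$$B = \Lambda_\theta U, \qquad U \sim N(0, \sigma^2 I_q),$$

so that

$$(Y \mid U = u) \sim N(X\beta + Z\Lambda_\theta u,\ \sigma^2 I_n), \qquad U \sim N(0, \sigma^2 I_q).$$

Compute $\hat u$ and $\hat\beta_\theta$:

$$\begin{pmatrix} \hat u \\ \hat\beta_\theta \end{pmatrix} = \arg\min_{u,\beta} \left\| \begin{pmatrix} y \\ 0 \end{pmatrix} - \begin{pmatrix} Z\Lambda_\theta & X \\ I_q & 0 \end{pmatrix} \begin{pmatrix} u \\ \beta \end{pmatrix} \right\|^2$$

which leads to the system

$$\begin{pmatrix} \Lambda_\theta' Z'Z \Lambda_\theta + I_q & \Lambda_\theta' Z'X \\ X'Z\Lambda_\theta & X'X \end{pmatrix} \begin{pmatrix} \hat u \\ \hat\beta_\theta \end{pmatrix} = \begin{pmatrix} \Lambda_\theta' Z'y \\ X'y \end{pmatrix}$$

$$\begin{pmatrix} \Lambda_\theta' Z'Z \Lambda_\theta + I_q & \Lambda_\theta' Z'X \\ X'Z\Lambda_\theta & X'X \end{pmatrix} = \begin{pmatrix} L_\theta & 0 \\ R_{ZX}' & R_X' \end{pmatrix} \begin{pmatrix} L_\theta' & R_{ZX} \\ 0 & R_X \end{pmatrix}$$

where

$$L_\theta L_\theta' = \Lambda_\theta' Z'Z \Lambda_\theta + I_q, \qquad L_\theta R_{ZX} = \Lambda_\theta' Z'X, \qquad R_X' R_X = X'X - R_{ZX}' R_{ZX}.$$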

What do we mean by “prediction”?

• Estimation of effects in the model
  → prediction = linear combination of estimated effects
• Marginal vs. conditional predictions
• What is needed?
  – Marginal vs. conditional predictions
  – In the example: variety or nitrogen prediction?

Is there a general strategy?

Problem: Each situation needs to be analyzed “by hand”!

Questions that might arise:
• In- or exclusion of random model terms from the prediction?
• Different weighting schemes?
• etc.

Predictions

2 types of prediction:
→ Prediction of mixed effects
→ Prediction of a future observation

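Linear mixed model

$$y = X\beta + Zb + \varepsilon = W\gamma + \varepsilon$$

$y \cong n \times 1$ vector of observations
$X \cong n \times p$ matrix associating observations with the appropriate combination of fixed effects
$\beta \cong p \times 1$ vector of fixed effects
$Z \cong n \times q$ matrix associating observations with the appropriate combination of random effects
$b \cong q \times 1$ vector of random effects
$\varepsilon \cong n \times 1$ vector of residual errors
$W, \gamma \cong$ combined design matrix resp. vector of effects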

Prediction of Mixed Effects

“All parameters are known”, i.e. fixed effects + variance components are known.

Consider:

$$\gamma = x'\beta + z'b = x'\beta + \xi$$

$x, z$: known vectors
$\beta, b$: vector of fixed/random effects

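Best predictor for $\xi$ (in the MSE sense):

$$\hat\xi = E(\xi \mid y) = z' E(b \mid y)$$

Assumption:

$$\begin{pmatrix} b \\ y \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ X\beta \end{pmatrix}, \begin{pmatrix} \operatorname{cov}(b) & \operatorname{cov}(b) Z' \\ Z \operatorname{cov}(b) & V \end{pmatrix} \right), \qquad V = \operatorname{cov}(y) = Z \operatorname{cov}(b) Z' + R, \quad R = \operatorname{cov}(\varepsilon)$$

$$\Rightarrow\ E(b \mid y) = \operatorname{var}(b)\, Z' V^{-1} (y - X\beta)$$

$$\Rightarrow\ \hat\xi = z' \operatorname{var}(b)\, Z' V^{-1} (y - X\beta)$$

Best linear predictor of $\gamma$ is then

$$\hat\gamma = x'\beta + z' \operatorname{cov}(b)\, Z' V^{-1} (y - X\beta)$$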

Example: IQ-Test

− $\mathrm{IQ} \sim N(100, 15^2)$
− $\mathrm{score} \mid \mathrm{IQ} \sim N(\mathrm{IQ}, 5^2)$
− Estimate the true IQ of a student scoring 130 in a test
− Model: $y = \mu + b + \varepsilon$, with $y \cong$ student test score and $b \cong$ realization of a random effect
− Predict: $\mu + b \cong$ student's true IQ (unobservable)

Result: $\widehat{\mathrm{IQ}} = 127$
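The result can be checked with a few lines of Python. This is only a sketch of the scalar case of the best-predictor formula, where Z = 1, var(b) = 15² and V = 15² + 5²; the function name `predict_true_iq` and its argument names are made up for illustration.

```python
def predict_true_iq(score, pop_mean=100.0, sd_iq=15.0, sd_test=5.0):
    """Best predictor mu + E(b | y) for the model y = mu + b + e."""
    var_b = sd_iq ** 2             # var(b): variance of the true IQ
    var_y = var_b + sd_test ** 2   # V = var(b) + var(e)
    shrinkage = var_b / var_y      # var(b) Z' V^{-1} with Z = 1
    return pop_mean + shrinkage * (score - pop_mean)


# Student scoring 130: the prediction is pulled towards the population mean.
print(predict_true_iq(130.0))
```

The shrinkage factor is 225/250 = 0.9, so the observed deviation of 30 points is reduced to 27, which gives the value of about 127 stated above.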


• Fixed effects + variance components unknown
• 20 students
• each student: 5 tests
• Computations as described before
• See R-File: rf_prediction3.R
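The R file itself is not reproduced here. As a rough stand-in, the following Python sketch simulates 20 students with 5 tests each and predicts each student's true IQ with estimated parameters. It assumes a balanced random-intercept model and uses method-of-moments (ANOVA) variance estimates, which need not be what the R file does; all function names are hypothetical.

```python
import random


def simulate_scores(n_students=20, n_tests=5, mu=100.0,
                    sd_iq=15.0, sd_test=5.0, seed=1):
    """Simulate test scores: one row of n_tests scores per student."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_students):
        true_iq = rng.gauss(mu, sd_iq)
        scores.append([rng.gauss(true_iq, sd_test) for _ in range(n_tests)])
    return scores


def predict_iqs(scores):
    """Return (estimated mean, shrinkage factor, predicted true IQs)."""
    k, m = len(scores), len(scores[0])
    means = [sum(row) / m for row in scores]
    grand = sum(means) / k  # estimate of mu in the balanced case
    # Within- and between-student mean squares.
    msw = sum((y - mean) ** 2
              for row, mean in zip(scores, means) for y in row) / (k * (m - 1))
    msb = m * sum((mean - grand) ** 2 for mean in means) / (k - 1)
    var_e = msw
    var_b = max((msb - msw) / m, 0.0)
    # Predicted random effect: shrink each student's mean deviation.
    shrink = var_b / (var_b + var_e / m)
    preds = [grand + shrink * (mean - grand) for mean in means]
    return grand, shrink, preds


grand, shrink, preds = predict_iqs(simulate_scores())
print(round(grand, 1), round(shrink, 3), [round(p, 1) for p in preds[:3]])
```

With five tests per student the shrinkage factor is close to 1, so the predictions stay near the individual student means.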

Prediction of Future Observations

AIM: construct prediction intervals, i.e. an interval in which a future observation will fall with a certain probability given what has been observed.

Examples:
1. Longitudinal studies
   – Prediction of a future observation from an individual not previously observed
   – Less interest in predicting another observation from an observed individual, as the studies often aim at applications to a larger population
   – E.g.: drugs going to the market after clinical trials
2. Surveys
   – 2-step survey:
   – A) a number of families are randomly selected
   – B) some family members of each family are interviewed
   – Prediction for a non-selected family

Prediction Intervals

a. Assumption: future observation has a certain distribution
   – Distribution defined up to a finite number of unknown parameters
   – Estimate parameters → obtain prediction interval
   – BUT: if the distribution assumption fails, the interval might be wrong
b. Distribution-free
   – Normality is not assumed
   – Distribution-free approach
   – Assumption: future observation is independent of current ones
c. Markov chains, Monte Carlo
d. et cetera…

Confidence vs. Prediction Intervals

Confidence Interval (CI)
• Interval estimate of a population parameter
• ≅ unobservable population parameter
• Predict distribution of the estimate of an unobservable quantity of interest (e.g. true pop. mean)

Prediction Interval (PI)
• Interval estimate of a future observation
• ≅ future observation
• Predict distribution of individual future points

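Confidence vs. Prediction Intervals (math.)

Fixed effect model:

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)$$

$$\hat\beta = (X'X)^{-1} X'y$$

$$\operatorname{var}(\hat\beta) = \sigma^2 (X'X)^{-1}, \qquad \operatorname{var}(\hat\beta - \beta) = \sigma^2 (X'X)^{-1}$$

true: $\breve y_i = x_i'\beta$
observed: $y_i = x_i'\beta + \varepsilon_i$
modelled: $\hat y_i = x_i'\hat\beta$ (CI)
predicted: $\hat y_{n+1} = x_{n+1}'\hat\beta$ (PI)

Interval (normal approximation): $\hat y \pm 1.96 \cdot \sqrt{\operatorname{var}}$

Confidence vs. Prediction Intervals (math.)

$$\text{CI:}\quad \operatorname{var}(\hat y_i) = \sigma^2\, x_i' (X'X)^{-1} x_i$$

$$\text{PI:}\quad \operatorname{var}(\hat y_{n+1} - y_{n+1}) = \sigma^2 \left( x_{n+1}' (X'X)^{-1} x_{n+1} + 1 \right)$$

$$\text{Confidence Interval:}\quad \hat y_i \pm 1.96 \cdot \sqrt{\sigma^2\, x_i' (X'X)^{-1} x_i}$$

$$\text{Prediction Interval:}\quad \hat y_{n+1} \pm 1.96 \cdot \sqrt{\sigma^2 \left( x_{n+1}' (X'X)^{-1} x_{n+1} + 1 \right)}$$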
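To make the difference between the two intervals concrete, here is a small Python sketch for a straight-line regression. It assumes that σ is known and uses made-up data; for a design with rows (1, x), the quadratic form x₀'(X'X)⁻¹x₀ reduces to 1/n + (x₀ − x̄)²/Sxx, which avoids a matrix inverse. All names are illustrative.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def leverage(xs, x0):
    """x0' (X'X)^{-1} x0 for the design matrix with rows (1, x)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return 1.0 / n + (x0 - xbar) ** 2 / sxx


def intervals(xs, ys, x0, sigma, z=1.96):
    """Return (fitted value, CI half-width, PI half-width) at x0."""
    intercept, slope = fit_line(xs, ys)
    y_hat = intercept + slope * x0
    h = leverage(xs, x0)
    ci_half = z * sigma * sqrt(h)        # uncertainty of the fitted mean
    pi_half = z * sigma * sqrt(h + 1.0)  # adds the new observation's error
    return y_hat, ci_half, pi_half


# Made-up data; sigma = 0.5 is assumed known for the illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
print(intervals(xs, ys, x0=5.0, sigma=0.5))
```

The prediction interval is always wider: its squared half-width exceeds that of the confidence interval by exactly (1.96 σ)², the contribution of the new residual error.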

Confidence vs. Prediction Intervals

How do we construct Prediction Intervals for a more general model?

Mixed effects model: $y = X\beta + Zb + \varepsilon$


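Prediction Intervals

Model ($n$ observations):

$$y = X\beta + Zb + \varepsilon, \qquad b \perp \varepsilon$$

$$b \sim N(0, \Sigma_\theta), \qquad \varepsilon \sim N(0, \sigma^2 I_n)$$

$$V = \operatorname{cov}(y) = Z \Sigma_\theta Z' + \sigma^2 I_n$$

$$\operatorname{cov}(b) = \Sigma_\theta = \sigma_1^2 I_q, \qquad \operatorname{cov}(\varepsilon) = \sigma^2 I_n$$

Estimate $\beta$ and $b$:

$$\hat\beta = \left( X' \hat V^{-1} X \right)^{-1} X' \hat V^{-1} y$$

$$\tilde b = \hat\sigma_1^2\, Z' \hat V^{-1} (y - X\hat\beta)$$

$$\text{where } \hat V = \hat\sigma_1^2\, ZZ' + \hat\sigma^2 I_n$$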

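Prediction Intervals

Marginal:

$$\operatorname{cov}(\hat y_i) = \operatorname{cov}(x_i'\hat\beta) = x_i' \operatorname{cov}(\hat\beta)\, x_i$$

$$\operatorname{cov}(\hat y_{n+1} - y_{n+1}) = \operatorname{cov}\left( x_{n+1}'\hat\beta - (x_{n+1}'\beta + z_{n+1}' b + \varepsilon_{n+1}) \right) = x_{n+1}' \operatorname{cov}(\hat\beta - \beta)\, x_{n+1} + z_{n+1}' \Sigma_\theta z_{n+1} + \sigma^2$$

Conditional (on all random effects):

$$\operatorname{cov}(\hat y_{n+1} - y_{n+1} \mid b = \tilde b) = \sigma^2 \left( x_{n+1}' (X'X)^{-1} x_{n+1} + 1 \right)$$

Prediction Intervals

Marginal:

$$\text{Confidence Interval:}\quad \hat y_i \pm 1.96 \cdot \sqrt{x_i' \operatorname{cov}(\hat\beta)\, x_i}$$

$$\text{Prediction Interval:}\quad \hat y_{n+1} \pm 1.96 \cdot \sqrt{x_{n+1}' \operatorname{cov}(\hat\beta - \beta)\, x_{n+1} + z_{n+1}' \Sigma_\theta z_{n+1} + \sigma^2}$$

Conditional on all random effects:

$$\text{Prediction Interval:}\quad \hat y_{n+1} \pm 1.96 \cdot \sqrt{\sigma^2 \left( x_{n+1}' (X'X)^{-1} x_{n+1} + 1 \right)}$$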
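The two prediction intervals can be compared numerically in the simplest mixed model, a balanced random-intercept model with only an overall mean as fixed effect. The sketch below assumes known variance components, so that cov(β̂ − β) = (σ₁² + σ²/m)/k for k groups of m observations, z'Σz = σ₁², and x'(X'X)⁻¹x = 1/n. The numbers reuse the IQ setting (σ₁ = 15, σ = 5, 20 students, 5 tests) purely as an illustration; the function names are hypothetical.

```python
from math import sqrt


def marginal_pi_halfwidth(var_b, var_e, n_groups, n_per_group, z=1.96):
    """Half-width for a new observation from a new (unobserved) group."""
    var_mu_hat = (var_b + var_e / n_per_group) / n_groups  # cov(beta_hat - beta)
    return z * sqrt(var_mu_hat + var_b + var_e)


def conditional_pi_halfwidth(var_e, n_groups, n_per_group, z=1.96):
    """Half-width from the conditional formula sigma^2 (x'(X'X)^{-1}x + 1)."""
    n = n_groups * n_per_group
    return z * sqrt(var_e * (1.0 / n + 1.0))


marginal = marginal_pi_halfwidth(15.0 ** 2, 5.0 ** 2, 20, 5)
conditional = conditional_pi_halfwidth(5.0 ** 2, 20, 5)
print(round(marginal, 2), round(conditional, 2))
```

The marginal interval is much wider because it also carries the between-group variance σ₁², which the conditional interval treats as known through the random effects.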

Prediction Intervals

There is a difference between marginal and conditional predictions!

→ Which one is of interest?


Principles of Prediction

The prediction process

Prediction:
• is a linear function of the best linear (unbiased) predictor of random effects with the best linear (unbiased) estimator of fixed effects in the model
• is typically associated with a combination of explanatory variables
• either averaged over, ignoring, or at a specific value of other explanatory variables in the model

The prediction process

Partition of the explanatory variables (e.v.) into 3 sets:
1. Classifying set
   • e.v. for which predicted values are required
2. Averaging set
   • e.v. which have to be averaged over
3. Rest
   • e.v. which will be ignored


The role of fixed and random effects with respect to prediction

Fixed Terms
• have an associated set of effects (parameters) which have to be estimated

Random Terms
• associated effects are normally distributed with mean 0 and a covariance matrix
• the covariance matrix is a function of (usually) unknown parameters
• error terms due to randomization or other structure of the data


How to deal with Random Factor Terms

1. Evaluate at given value(s) specified by the user
2. Average over the set of random effects
   • Prediction specific to / conditional on the random effects observed
   • → “Conditional prediction” w.r.t. the term
3. Omit the random term from the model
   • Prediction at the population average (zero)
   • substitutes the assumed pop. mean for an unknown random effect
   • → “Marginal prediction” w.r.t. the term


How to deal with Fixed Factors

• no pre-defined population average
• no natural interpretation for a prediction derived by omitting a fixed term from the fitted values
• average over all the present levels to give a conditional average
• or: user should specify the value(s)


4 conceptual steps for the prediction process

1. Choose e.v. and their respective values for which predictive margins are required, i.e. determine the classifying set

2. Determine which variables should be averaged over, i.e. determine the averaging set

3. Determine terms that are needed to compute parameters and estimates

4. Choose the weighting for taking means over margins (for the averaging set)
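The four steps can be sketched in Python using the variety × nitrogen cell predictions from the split-plot example of this talk. Here nitrogen is the classifying set, variety is the averaging set, the cell predictions already combine the fixed terms, and equal weights are the default. The function name is hypothetical, and the Golden Rain value at 0.4 cwt/acre is taken as 114.67, the value implied by the printed margins.

```python
# Cell predictions (variety x nitrogen in cwt/acre) from the split-plot example.
# Assumption: Golden Rain at 0.4 is 114.67, as implied by the table margins.
CELLS = {
    "Golden Rain": {"0.0": 80.00, "0.2": 98.50, "0.4": 114.67, "0.6": 124.83},
    "Marvellous": {"0.0": 86.67, "0.2": 108.50, "0.4": 117.17, "0.6": 126.83},
    "Victory": {"0.0": 71.50, "0.2": 89.67, "0.4": 110.83, "0.6": 118.50},
}


def nitrogen_margin(cells, level, weights=None):
    """Predictive margin for one nitrogen level, averaged over varieties.

    Step 1: the classifying set is nitrogen (the `level` argument).
    Step 2: the averaging set is variety (the keys of `cells`).
    Step 3: the cell predictions already combine the fixed terms.
    Step 4: `weights` gives the weighting; equal weights by default.
    """
    varieties = list(cells)
    if weights is None:
        weights = {v: 1.0 for v in varieties}
    total = sum(weights[v] for v in varieties)
    return sum(weights[v] * cells[v][level] for v in varieties) / total


for level in ("0.0", "0.2", "0.4", "0.6"):
    print(level, round(nitrogen_margin(CELLS, level), 2))
```

Passing a different `weights` dictionary, for example weights proportional to how widely each variety is grown, changes step 4 only and leaves the other three steps untouched.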


Split-Plot Design

Experiment:
• 4 levels of nitrogen
• 3 oat varieties
• 6 “tries”, i.e. 6 blocks
• 3 whole-plots per block
• 4 subplots per whole-plot
• random allocation of nitrogen within a block

Fixed effects:
• treatment combination

Random effects:
• blocking factors (source of error variation)

AIM: estimate the performance of each treatment combination within the experiment

The Data-Set


The model: components

fixed    ~ constant + variety + nitrogen + variety:nitrogen
random   ~ blocks + blocks:wplots
residual ~ blocks:wplots:splots

Random terms:
• error terms used in estimation of treatment effects
• not otherwise relevant to the prediction of treatment effects

Conditional vs. Marginal prediction for the random effects

Conditional prediction:
• gives a prediction specific to the blocks and plots used in the experiment
• appropriate to inference for the specific instance that occurred in the dataset

Marginal prediction:
• the prediction corresponds to the yields expected from a similar experiment laid out using different blocks and plots
• appropriate when inference is required for members of the wider population

The model

$$y_{ijk} = \mu + b_i + v_{r(ij)} + w_{ij} + n_{s(ijk)} + (vn)_{r(ij)\,s(ijk)} + e_{ijk}$$

$y_{ijk} \cong$ yield on block $i = 1, \dots, 6$, whole-plot $j = 1, \dots, 3$, sub-plot $k = 1, \dots, 4$
$\mu \cong$ overall constant
$b_i \cong$ effect of block $i$
$v_r \cong$ effect of variety $r$
$r(ij) \cong$ randomization of varieties to whole-plots
$w_{ij} \cong$ effect of whole-plot $j$ in block $i$
$n_s \cong$ effect of nitrogen level $s$
$s(ijk) \cong$ randomization of nitrogen levels to sub-plots
$(vn)_{rs} \cong$ interaction of variety level $r$ with nitrogen level $s$
$e_{ijk} \cong$ residual error for block $i$, whole-plot $j$, sub-plot $k$

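Assumption

$$y_{ijk} = \mu + b_i + v_{r(ij)} + w_{ij} + n_{s(ijk)} + (vn)_{r(ij)\,s(ijk)} + e_{ijk}$$

With $b = (b_1, b_2, \dots, b_6)$, $w = (w_{11}, w_{12}, \dots, w_{63})$ and $e = (e_{111}, e_{112}, \dots, e_{634})$, we assume a normal distribution for the following terms:

$$\begin{pmatrix} b \\ w \\ e \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_b^2 I_6 & 0 & 0 \\ 0 & \sigma_w^2 I_{18} & 0 \\ 0 & 0 & \sigma^2 I_{72} \end{pmatrix} \right)$$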

ANOVA

• no variety × nitrogen interaction
• usually: drop non-significant terms

Prediction Process

Prediction of yield for each nitrogen level
• = general effect of different nitrogen applications
• → unweighted average across all varieties
• = prediction of yield for nitrogen level $l$ for an “average” block+whole-plot
• → marginal prediction w.r.t. block+whole-plot

Calculation: ignore random terms:

$$\hat\mu + \hat n_l + \frac{1}{3} \sum_{j=1}^{3} \left( \hat v_j + \widehat{vn}_{jl} \right)$$

Prediction Process

Prediction specific to blocks+whole-plots in the experiment
• → conditional prediction

Calculation: include random terms:

$$\hat\mu + \hat n_l + \frac{1}{3} \sum_{j=1}^{3} \left( \hat v_j + \widehat{vn}_{jl} \right) + \frac{1}{6} \sum_{i=1}^{6} \tilde b_i + \frac{1}{18} \sum_{i=1}^{6} \sum_{j=1}^{3} \tilde w_{ij}$$

Prediction Process

Explanatory variable   Set        Levels       Averaging
                       M    C     M     C      M    C
Variety                a    a     all   all    e    e
Nitrogen               c    c     all   all    n    n
Blocks                 x    a     -     all    -    e
Wplots                 x    a     -     all    -    e
splots                 x    x     -     -      -    -

M : marginal pred.   C : conditional pred.
a : averaging set    c : classifying set    x : excluded
e : equal weights    n : none

Prediction Process

Model term              In prediction?
                        M    C
Constant                +    +
Variety                 +    +
Nitrogen                +    +
Variety:nitrogen        +    +
Blocks                  x    +
Blocks:wplots           x    +
Blocks:wplots:splots    x    x

+ : used   x : ignored

The resulting predictions

Predictions for nitrogen application levels with SE and SED, using marginal (M) or conditional (C) values of blocks+whole-plots

Nitrogen application   Prediction   SE (M)   SE (C)
0.0 cwt/acre             79.4       7.18     3.14
0.2 cwt/acre             98.9       7.18     3.14
0.4 cwt/acre            114.2       7.18     3.14
0.6 cwt/acre            123.4       7.18     3.14
SED                                 4.44     4.44

The resulting predictions

Variety × nitrogen predictions

Variety        Nitrogen application (cwt/acre)        Margin
               0.0      0.2      0.4      0.6
Golden Rain    80.00    98.50   114.67   124.83      104.50
Marvellous     86.67   108.50   117.17   126.83      109.79
Victory        71.50    89.67   110.83   118.50       97.63
Margin         79.39    98.89   114.22   123.39      103.97

The resulting predictions

• SE smaller for conditional predictions
  → because predictions are calculated conditional on the blocks+whole-plots observed!
• Using marginal values = “no information on block+whole-plot effect”
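The reported standard errors can be cross-checked against each other. The sketch below assumes a balanced design in which each nitrogen mean averages 18 sub-plots (6 blocks × 3 varieties), the conditional SE involves only the residual variance (SE² = σ²/18), and block and whole-plot effects cancel in differences between nitrogen levels (SED² = 2σ²/18). These are assumptions made for illustration, not statements taken from the slides; only the three printed numbers come from the talk.

```python
from math import sqrt

# Values reported for the nitrogen predictions.
SE_MARGINAL, SE_CONDITIONAL, SED = 7.18, 3.14, 4.44
N_PLOTS_PER_LEVEL = 18  # assumption: 6 blocks x 3 varieties per nitrogen level

# Under the stated assumptions, SE_C^2 = sigma^2 / 18.
implied_resid_var = N_PLOTS_PER_LEVEL * SE_CONDITIONAL ** 2

# A difference of two nitrogen means then has variance 2 * sigma^2 / 18.
sed_implied = sqrt(2.0) * SE_CONDITIONAL

# What the marginal SE adds: the block and whole-plot contribution.
block_wplot_part = SE_MARGINAL ** 2 - SE_CONDITIONAL ** 2

print(round(implied_resid_var, 1), round(sed_implied, 2), round(block_wplot_part, 1))
```

Under these assumptions the implied SED agrees with the reported 4.44 up to rounding, which is consistent with the SED being identical for marginal and conditional predictions.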

Special case: data missing

Data for all replicates of Golden Rain with 0 cwt nitrogen are missing

Variety        Nitrogen application (cwt/acre)        Margin
               0.0      0.2      0.4      0.6
Golden Rain    ???      98.50   114.67   124.83
Marvellous     86.67   108.50   117.17   126.83      109.79
Victory        71.50    89.67   110.83   118.50       97.63
Margin                  98.89   114.22   123.39

Corresponding cell cannot be estimated without additional assumptions!


No significant variety main effect/interactions present in the model
→ Approach chosen has no great influence on nitrogen prediction
→ consider variety predictions

Possible approaches
• set inestimable parameters to 0, average over all cells
• average over cells with data present
• average over levels of nitrogen for which all varieties are present
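The second and third approaches can be sketched directly from the cell predictions. The Python code below is an illustration with hypothetical function names; the missing cell is coded as `None`, and the Golden Rain value at 0.4 cwt/acre is taken as 114.67, the value implied by the table margins. The first approach depends on the parameterization of the fitted model and is not sketched here.

```python
# Cell predictions with the Golden Rain / 0.0 cwt cell missing (None).
# Assumption: Golden Rain at 0.4 is 114.67, as implied by the table margins.
CELLS_MISSING = {
    "Golden Rain": {"0.0": None, "0.2": 98.50, "0.4": 114.67, "0.6": 124.83},
    "Marvellous": {"0.0": 86.67, "0.2": 108.50, "0.4": 117.17, "0.6": 126.83},
    "Victory": {"0.0": 71.50, "0.2": 89.67, "0.4": 110.83, "0.6": 118.50},
}


def mean_over_present(row):
    """Approach 2: average over the cells of one variety that have data."""
    values = [v for v in row.values() if v is not None]
    return sum(values) / len(values)


def complete_levels(cells):
    """Nitrogen levels for which every variety has a prediction."""
    levels = list(next(iter(cells.values())))
    return [lev for lev in levels
            if all(row[lev] is not None for row in cells.values())]


def mean_over_levels(row, levels):
    """Approach 3: average one variety over a common set of levels."""
    return sum(row[lev] for lev in levels) / len(levels)


common = complete_levels(CELLS_MISSING)
for variety, row in CELLS_MISSING.items():
    print(variety,
          round(mean_over_present(row), 2),
          round(mean_over_levels(row, common), 2))
```

Averaging over present cells gives Golden Rain a mean over three high-yield nitrogen levels while the other varieties are averaged over four, so the variety values are not directly comparable. Restricting all varieties to the common levels avoids this.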

The resulting predictions

Variety predictions with ‘Golden Rain + 0 cwt nitrogen’ plots set to missing value

Variety        Inestimable        For data    On nitrogen levels    Margin
               parameters zero    present     0.2-0.6               (complete data)
Golden Rain    105.67             112.67      112.67                104.50
Marvellous     109.79             109.79      117.50                109.79
Victory         97.63              97.63      106.33                 97.63

In the 2nd case the variety ordering has changed
→ predictions are not comparable because of the large nitrogen effect

Comparison

Data complete

Variety        Margin
Golden Rain    104.50
Marvellous     109.79
Victory         97.63
Margin         103.97

Data missing

Variety        Inest. param.    For data    On nitrogen levels
               zero             present     0.2-0.6
Golden Rain    105.67           112.67      112.67
Marvellous     109.79           109.79      117.50
Victory         97.63            97.63      106.33

Other special cases: data missing

• First entry in the data is missing (i.e. Victory, 0.0 cwt/acre, Block I)
• Data in Block I for all replicates with 0.0 cwt nitrogen are missing
• All data for replicates with 0.0 cwt nitrogen are missing

Making the computations…

R-FILE!
