Transcript

Statistical Methods for Analysis with Missing Data

Lecture 3: naïve methods: complete-case analysis and imputation

Mauricio Sadinle

Department of Biostatistics

1 / 42


Previous Lecture

Universe of missing-data mechanisms:

[Diagram: nested classes of missing-data mechanisms — MNAR, MAR, MCAR]

- MCAR: $p(R = r \mid z) = p(R = r)$
  - Unreasonable in most cases

- MAR: $p(R = r \mid z) = p(R = r \mid z_{(r)})$
  - Hard to digest, in general
  - $R \perp\!\!\!\perp Z_1 \mid Z_2$, if $Z_2$ is fully observed

- MNAR: $p(R = r \mid z) \neq p(R = r \mid z_{(r)})$
  - Most realistic, but hard to handle

2 / 42


Today’s Lecture

Naïve or ad-hoc methods

- Complete-case / available-case analyses

- Different types of (single) imputation

Reading: Ch. 2 of Davidian and Tsiatis

3 / 42


Naïve or Ad-Hoc Methods

- Motivation: we know how to run analyses with complete (rectangular) datasets

- Idea: somehow "fix" the dataset so that the analysis for complete data can be run

4 / 42


Outline

Complete-Case and Available-Case Analysis
  - Complete-Case Analysis
  - Available-Case Analysis

Imputation
  - Mean Imputation
  - Mode Imputation
  - Regression Imputation
  - Hot-Deck Imputation
  - Last Observation Carried Forward

Summary

5 / 42


Complete-Case Analysis

- Idea: ignore observations with missingness, run intended analysis with remaining data

7 / 42


Complete-Case Analysis

8 / 42


Assumption for Complete-Case Analysis

Complete-case analysis implicitly assumes

$$p(z) = p(z \mid R = 1_K) \qquad (1)$$

where $1_K$ represents a vector $(1, 1, \ldots, 1)$ of length $K$

- By Bayes' theorem
  $$p(z \mid R = 1_K) = \frac{p(R = 1_K \mid z)\, p(z)}{p(R = 1_K)}$$

- Therefore, (1) is equivalent to
  $$p(R = 1_K \mid z) = p(R = 1_K)$$

- This doesn't require any assumptions on $p(R = r \mid z)$ for $r \neq 1_K$

- MCAR ($Z \perp\!\!\!\perp R$) is a sufficient condition for (1)

9 / 42


Complete-Case Analysis is Wasteful/Inefficient

Clearly, there can be a huge waste of information

- Observed data with response patterns $r \neq 1_K$ should be informative about the distribution of $Z_{(r)}$, which is informative about the distribution of $Z$:
  $$p(z_{(r)}) = \int p(z) \, dz_{(\bar{r})}, \quad r \in \{0, 1\}^K,$$
  where $z_{(\bar{r})}$ denotes the components of $z$ that are missing under pattern $r$

- We might end up with very little data
  - Say $R_1, \ldots, R_K \overset{\text{i.i.d.}}{\sim} \text{Bernoulli}(\pi)$
  - Then $p(R = 1_K) = \pi^K \to 0$ as $K \to \infty$

10 / 42


Example: Estimating a Mean

We'll see an alternative presentation of Example 1 in Section 1.4 of Davidian and Tsiatis

- $\{(Y_i, R_i)\}_{i=1}^n \overset{\text{i.i.d.}}{\sim} F$

- $Y_i$: numeric variable for individual $i$

- $R_i$: indicator of $Y_i$ being observed

- If $Y_i$ were always observed, we could estimate the mean of $Y$, $\mu = E(Y)$, as
  $$\hat{\mu}_{\text{full}} = \frac{1}{n} \sum_{i=1}^n Y_i$$

11 / 42


Example: Estimating a Mean

With missing data, we could use the complete cases:
$$\hat{\mu}_{\text{cc}} = \frac{\sum_{i=1}^n Y_i R_i}{\sum_{i=1}^n R_i}$$

Is this any good?

HW1: show that the following holds,
$$E(\hat{\mu}_{\text{cc}}) = E(Y \mid R = 1),$$
for all sample sizes, provided that at least one $Y_i$ is observed.

Hint: write $E(\hat{\mu}_{\text{cc}}) = E\left[ E\left( \left. \frac{\sum_{i=1}^n Y_i R_i}{\sum_{i=1}^n R_i} \,\right|\, R_1, \ldots, R_n \right) \right]$

12 / 42


Example: Estimating a Mean

$$E(\hat{\mu}_{\text{cc}}) = E(Y \mid R = 1)$$

Therefore

- Complete-case estimation of the mean requires assuming
  $$E(Y) = E(Y \mid R = 1)$$

- In particular, valid under MCAR

- Otherwise, $\hat{\mu}_{\text{cc}}$ is not valid for $\mu$, as it estimates the wrong quantity

- HW1: if $p(R = 1 \mid y)$ is an increasing function of $y$, show that
  $$E(Y \mid R = 1) > E(Y)$$

13 / 42
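A quick simulation makes the bias concrete. The following is a minimal R sketch (not from the slides); the logistic missingness model and the parameter values are illustrative assumptions:

```r
set.seed(531)
n <- 1e5
y <- rnorm(n)                            # true mean E(Y) = 0
p_obs <- plogis(0.5 + 1.5 * y)           # p(R = 1 | y) increases with y, so MCAR fails
r <- rbinom(n, size = 1, prob = p_obs)   # R = 1 means Y is observed

mu_full <- mean(y)                       # infeasible full-data estimate
mu_cc   <- sum(y * r) / sum(r)           # complete-case estimate

c(full = mu_full, complete_case = mu_cc)
# The complete-case estimate targets E(Y | R = 1) > E(Y) = 0,
# illustrating the bias when p(R = 1 | y) is increasing in y.
```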



Available-Case Analysis

Sometimes what we need to estimate doesn't really require a "rectangular" dataset

- If you can, just use whatever data are available for computing what you need

- Davidian and Tsiatis talk about generalized estimating equations (GEEs) and their Example 3 in Section 1.4 (we'll cover this when we get to Chapter 5)

- $K$ normal random variables: under some missing-data assumption, it seems we could still obtain a good estimate of the distribution, as it only depends on univariate and bivariate quantities (means, variances, covariances)

15 / 42


Example of Available-Case Analysis

- Say the data are
  - $Z_i = (Y_{i1}, \ldots, Y_{iK})$
  - $R_i = (R_{i1}, \ldots, R_{iK})$

- Available-case estimators:
  $$\hat{\mu}_j^{\text{ac}} = \frac{\sum_{i=1}^n Y_{ij} R_{ij}}{\sum_{i=1}^n R_{ij}}, \quad j = 1, \ldots, K$$
  $$\hat{\sigma}_{jk}^{\text{ac}} = \frac{\sum_{i=1}^n (Y_{ij} - \hat{\mu}_j^{\text{ac}})(Y_{ik} - \hat{\mu}_k^{\text{ac}})\, R_{ij} R_{ik}}{\sum_{i=1}^n R_{ij} R_{ik} - 1}, \quad j, k = 1, \ldots, K$$

- Better than complete-case analysis

- Valid under MCAR, but what are the minimal assumptions on the missing-data mechanism for this to be valid?

16 / 42
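For concreteness, the available-case estimators above can be computed along these lines; a minimal R sketch, where the data-matrix name Y and the helper-function names are assumptions for illustration:

```r
# Y: an n x K numeric matrix with NA for missing entries
available_case_means <- function(Y) {
  colMeans(Y, na.rm = TRUE)              # each column uses whichever cases are observed
}

available_case_cov <- function(Y) {
  K  <- ncol(Y)
  mu <- available_case_means(Y)
  S  <- matrix(NA_real_, K, K)
  for (j in 1:K) {
    for (k in 1:K) {
      both <- !is.na(Y[, j]) & !is.na(Y[, k])   # cases observed on both variables
      S[j, k] <- sum((Y[both, j] - mu[j]) * (Y[both, k] - mu[k])) / (sum(both) - 1)
    }
  }
  S
}
```

Note that, unlike `cov(Y, use = "pairwise.complete.obs")`, the estimator on this slide centers each variable at its own available-case mean; either way, the resulting matrix need not be positive semi-definite.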


Complete-Case and Available-Case Analysis

The moral:

- Complete-case analysis is wasteful and, most likely, invalid

- Available-case analysis is better, but still requires MCAR or possibly a weaker assumption depending on what we need to compute

17 / 42


Imputation

- Idea: plug something "reasonable" into the holes of the dataset, then run intended analysis with the completed data

19 / 42


Imputation

20 / 42


Mean Imputation

- Numeric variables

- Impute the mean of observed values

- Corresponds to imputing an estimate of $E(Y_j \mid R_j = 1)$, $j = 1, \ldots, K$

- Leads to valid point estimates of means under MCAR

- Underestimates the true variance of estimators

22 / 42


Mean Imputation

Say the data are

- $\{(Z_i, R_i)\}_{i=1}^n \overset{\text{i.i.d.}}{\sim} F$

- $Z_i = (Y_{i1}, \ldots, Y_{iK})$

- $R_i = (R_{i1}, \ldots, R_{iK})$

Mean imputation:

- Compute
  $$\hat{\mu}_j^{1} = \frac{\sum_{i=1}^n Y_{ij} R_{ij}}{\sum_{i=1}^n R_{ij}}, \quad j = 1, \ldots, K$$

- Impute $Y_{ij}$ with $\hat{\mu}_j^{1}$ whenever $R_{ij} = 0$

- Run your analysis as if your data were fully observed

23 / 42
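Mean imputation as described here is only a few lines of R. A minimal sketch, reusing the Age/Income toy data from the next slide; the function name is an assumption for illustration:

```r
# Y: an n x K numeric matrix with NA for missing entries
mean_impute <- function(Y) {
  for (j in seq_len(ncol(Y))) {
    mu1_j <- mean(Y[, j], na.rm = TRUE)   # available-case mean of column j
    Y[is.na(Y[, j]), j] <- mu1_j          # fill the holes with that mean
  }
  Y
}

Y <- cbind(Age = c(25, NA, 51, NA), Income = c(60000, NA, NA, 150300))
mean_impute(Y)   # the analysis then proceeds as if Y were fully observed
```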


Mean Imputation

Original data:

  Age    Income
  25     60,000
  ?      ?
  51     ?
  ?      150,300
  ...    ...

After mean imputation (each ? replaced by the corresponding observed-value mean $\hat{\mu}^1_{\text{Age}}$ or $\hat{\mu}^1_{\text{Income}}$):

  Age                           Income
  25                            60,000
  $\hat{\mu}^1_{\text{Age}}$    $\hat{\mu}^1_{\text{Income}}$
  51                            $\hat{\mu}^1_{\text{Income}}$
  $\hat{\mu}^1_{\text{Age}}$    150,300
  ...                           ...

24 / 42


Example: Estimating a Mean

- Estimating a mean after mean imputation corresponds to using the estimator
  $$\hat{\mu}_j^{\text{mimp}} = \frac{1}{n} \sum_{i=1}^n \left[ Y_{ij} R_{ij} + \hat{\mu}_j^{1} (1 - R_{ij}) \right]$$

- $\hat{\mu}_j^{\text{mimp}}$ is the mean of the imputed data, so its naïvely estimated variance is
  $$\hat{V}_{\text{naive}}(\hat{\mu}_j^{\text{mimp}}) = \hat{V}_{\text{naive}}(Y_j)/n,$$
  where
  $$\hat{V}_{\text{naive}}(Y_j) = \frac{1}{n-1} \sum_{i=1}^n \left[ R_{ij} (Y_{ij} - \hat{\mu}_j^{\text{mimp}})^2 + (1 - R_{ij})(\hat{\mu}_j^{1} - \hat{\mu}_j^{\text{mimp}})^2 \right]$$

- HW1: show that $\hat{\mu}_j^{\text{mimp}} = \hat{\mu}_j^{1}$

25 / 42


Example: Estimating a Mean

As a consequence, using the mean imputation method we:

- Underestimate the variance of each variable:
  $$\hat{V}_{\text{naive}}(Y_j) = \frac{1}{n-1} \sum_{i=1}^n R_{ij} (Y_{ij} - \hat{\mu}_j^{1})^2$$

- Compare with an estimate based on the available cases:
  $$\hat{V}^{1}(Y_j) = \frac{\sum_{i=1}^n R_{ij} (Y_{ij} - \hat{\mu}_j^{1})^2}{\sum_{i=1}^n R_{ij} - 1}$$

- $\Longrightarrow \hat{V}_{\text{naive}}(Y_j) \leq \hat{V}^{1}(Y_j)$

26 / 42


Example: Estimating a Mean

As a consequence, using the mean imputation method we:

- Underestimate the variance of $\hat{\mu}_j^{\text{mimp}}$:
  $$\hat{V}_{\text{naive}}(\hat{\mu}_j^{\text{mimp}}) = \frac{1}{n(n-1)} \sum_{i=1}^n R_{ij} (Y_{ij} - \hat{\mu}_j^{1})^2$$

- Compare with an estimate based on the available cases:
  $$\hat{V}^{1}(\hat{\mu}_j^{\text{mimp}}) = \frac{\sum_{i=1}^n R_{ij} (Y_{ij} - \hat{\mu}_j^{1})^2}{\left(\sum_{i=1}^n R_{ij}\right)\left(\sum_{i=1}^n R_{ij} - 1\right)}$$

- $\Longrightarrow \hat{V}_{\text{naive}}(\hat{\mu}_j^{\text{mimp}}) \leq \hat{V}^{1}(\hat{\mu}_j^{\text{mimp}})$

- HW1: comment on the implications of mean imputation for the construction of confidence intervals

27 / 42
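The variance understatement is easy to see numerically. A minimal R sketch under MCAR; the sample size, missingness rate, and distribution are illustrative assumptions:

```r
set.seed(531)
n <- 1000
y <- rnorm(n, mean = 50, sd = 10)
r <- rbinom(n, 1, 0.6)                    # MCAR: about 40% of values go missing
mu1   <- sum(y * r) / sum(r)              # available-case mean
y_imp <- ifelse(r == 1, y, mu1)           # mean-imputed data

v_naive <- var(y_imp) / n                 # naive variance of the mean, imputed data treated as observed
v_ac    <- var(y[r == 1]) / sum(r)        # available-case variance of the same point estimate

c(naive = v_naive, available_case = v_ac)
# v_naive < v_ac: the imputed constants shrink the spread and inflate the apparent
# sample size, so confidence intervals built from v_naive are too narrow.
```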


Mode Imputation

- Categorical variables

- Impute the mode of observed values

- Artificially inflates the frequency of the mode

- Leads to valid point estimates of marginal modes under MCAR

- Underestimates the true variance of estimators

29 / 42
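A mode-imputation helper might look like the following; a minimal R sketch, where the function name and the tie-breaking rule (first most frequent level wins) are assumptions for illustration:

```r
impute_mode <- function(x) {
  # x: a factor or character vector with NAs
  tab <- table(x)                          # table() drops NAs by default
  mode_val <- names(tab)[which.max(tab)]   # most frequent observed category
  x[is.na(x)] <- mode_val
  x
}

z <- factor(c("A", "B", NA, "A", NA, "C"))
impute_mode(z)   # NAs become "A"; the frequency of "A" is artificially inflated
```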


Regression Imputation

- Regress one variable on others based on observed data, then impute predicted values from the model

- Corresponds to imputing an estimate of $E(Y_j \mid y_{-j}, R = 1_K)$, where $y_{-j} = (y_1, \ldots, y_{j-1}, y_{j+1}, \ldots, y_K)$

- Valid for means under MCAR

- Underestimates the true variance of estimators

- Validity depends on the model used for imputation

31 / 42


Example of Regression Imputation in Davidian and Tsiatis

- $Z = (Y_1, Y_2)$, baseline and follow-up, $Y_1$ always observed

- $R$: indicator of response for $Y_2$

- Goal: to estimate $\mu_2 = E(Y_2)$

- Say we posit a linear model $E(Y_2 \mid y_1) = \beta_0 + \beta_1 y_1$

- Impute $Y_{i2}$ with $\hat{Y}_{i2} = \hat{\beta}_0 + \hat{\beta}_1 Y_{i1}$ when $R_i = 0$, with $\hat{\beta}_0$ and $\hat{\beta}_1$ obtained via least squares among the complete cases

- The regression imputation estimator for $\mu_2$ is
  $$\hat{\mu}_2^{\text{rimp}} = \frac{1}{n} \sum_{i=1}^n \left[ Y_{i2} R_i + \hat{Y}_{i2} (1 - R_i) \right]$$

- When is this valid? (when does $\hat{\mu}_2^{\text{rimp}} \to \mu_2$ as $n \to \infty$?)

32 / 42
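Under the setup above, the regression imputation estimator can be sketched as follows in R; the simulated data-generating values and the MAR missingness model are illustrative assumptions:

```r
set.seed(531)
n  <- 5000
y1 <- rnorm(n)                               # baseline, always observed
y2 <- 1 + 2 * y1 + rnorm(n)                  # follow-up, true mean mu_2 = 1
r  <- rbinom(n, 1, plogis(y1))               # MAR: missingness depends on y1 only
y2_obs <- ifelse(r == 1, y2, NA)

fit    <- lm(y2_obs ~ y1, subset = r == 1)   # least squares among the complete cases
y2_hat <- predict(fit, newdata = data.frame(y1 = y1))

mu_rimp <- mean(ifelse(r == 1, y2_obs, y2_hat))
mu_cc   <- mean(y2_obs[r == 1])

c(truth = 1, regression_imputation = mu_rimp, complete_case = mu_cc)
# With MAR given y1 and a correctly specified linear model, mu_rimp is close to 1,
# while the complete-case mean is biased upward.
```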


Example of Regression Imputation in Davidian and Tsiatis

Davidian and Tsiatis show that for $\hat{\mu}_2^{\text{rimp}} \xrightarrow{p} \mu_2$ as $n \to \infty$, we need these two requirements to hold simultaneously:

- $E(Y_2 \mid y_1, R = 1) = E(Y_2 \mid y_1)$ (implied by MAR)

- $E(Y_2 \mid y_1)$ is correctly specified, i.e., there really exist $\beta_0^*$ and $\beta_1^*$ such that $E(Y_2 \mid y_1) = \beta_0^* + \beta_1^* y_1$

However, even if these two conditions hold, single imputation leads to underestimation of variances, as seen with mean imputation

33 / 42


Hot-Deck Imputation

- Replace missing values of a non-respondent (called the recipient) with observed values from a respondent (the donor)

- Recipient and donor need to be similar with respect to variables observed by both cases
  - The donor can be selected randomly from a pool of potential donors
  - A single donor can be identified, e.g. the "nearest neighbour" based on some metric

- Andridge & Little (2010, Int. Stat. Rev.) reviewed this approach and concluded that
  - General patterns of missingness are difficult to deal with ("swiss cheese" pattern)
  - There is a lack of theory to support this method
  - There is a lack of comparisons with other methods
  - Uncertainty from imputation is not taken into account (underestimation of variances)

35 / 42
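A nearest-neighbour hot-deck for a single variable can be sketched as follows; a minimal R sketch, where matching on one fully observed covariate with absolute distance and the function name are assumptions for illustration:

```r
# Impute missing y by copying the observed value of the donor whose x is closest
hot_deck_nn <- function(y, x) {
  donors     <- which(!is.na(y))
  recipients <- which(is.na(y))
  for (i in recipients) {
    d    <- abs(x[donors] - x[i])          # distance on the fully observed variable
    y[i] <- y[donors[which.min(d)]]        # copy the nearest donor's observed value
  }
  y
}

set.seed(531)
x <- runif(10)
y <- x + rnorm(10, sd = 0.1)
y[c(3, 7)] <- NA
hot_deck_nn(y, x)
```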


Last Observation Carried Forward

- Common in settings where a variable is measured repeatedly over time and there is dropout

- If there is dropout at time $j$, we don't observe $Z_j, Z_{j+1}, \ldots, Z_T$

- LOCF: replace all of $Z_j, Z_{j+1}, \ldots, Z_T$ with $Z_{j-1}$

37 / 42
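Carrying the last observation forward along each row of a wide-format dataset can be sketched as follows; a minimal R sketch, where the matrix layout and the example values are assumptions for illustration:

```r
# Y: n x T matrix of repeated measurements; entries after dropout are NA
locf <- function(Y) {
  t(apply(Y, 1, function(z) {
    for (j in 2:length(z)) {
      if (is.na(z[j])) z[j] <- z[j - 1]    # carry the previous value forward
    }
    z
  }))
}

Y <- rbind(c(5, 6, NA, NA),   # dropout after time 2
           c(3, 4,  5,  6))   # fully observed
locf(Y)
```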


Last Observation Carried Forward

Example from Davidian and Tsiatis:

Solid lines: observed data. Dashed lines: extrapolated data with LOCF.

38 / 42


Last Observation Carried Forward

Attempts to justify LOCF:

- Interest in the last observed outcome measure (reasonable in some contexts??)

- Under some assumptions, will lead to a conservative analysis
  - Say we have a clinical trial where the outcome under treatment is expected to improve over time
  - If treatment is found to be superior even with LOCF, then the true effect should be even larger
  - Relies on the assumption of monotonic improvement over time!

39 / 42


Example of LOCF in Davidian and Tsiatis

Study participants' characteristic to be measured at $T$ times

- $Y_j$: measurement taken at time $t_j$

- $D$: participant dropout time

- Interest: $\mu_T = E(Y_T)$

- The LOCF estimator of the mean is
  $$\hat{\mu}_T^{\text{LOCF}} = \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^T I(D_i = j + 1)\, Y_{ij}$$

- The expected value of the LOCF estimator of the mean is
  $$E(\hat{\mu}_T^{\text{LOCF}}) = \mu_T - \sum_{j=1}^{T-1} E[I(D = j + 1)(Y_T - Y_j)],$$
  so $\hat{\mu}_T^{\text{LOCF}}$ is biased, in general

40 / 42
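The bias formula can be seen in action by simulating dropout when the outcome keeps improving over time; a minimal R sketch, where the linear trend and the uniform dropout times are illustrative assumptions:

```r
set.seed(531)
n  <- 1e4
Tt <- 4                                    # number of measurement times (T in the slides)
# Outcome improves over time: E(Y_j) = j, so mu_T = E(Y_T) = 4
Y <- matrix(rep(1:Tt, each = n), n, Tt) + matrix(rnorm(n * Tt), n, Tt)
# Dropout time D = j + 1 means the last observed measurement is Y_j
D <- sample(2:(Tt + 1), n, replace = TRUE)

last_obs <- Y[cbind(1:n, D - 1)]           # the value LOCF carries to time T
c(truth = Tt, locf = mean(last_obs))
# The LOCF mean falls well below 4: dropouts contribute earlier, smaller values,
# matching the bias term subtracted in the expectation on this slide.
```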


Summary

Main take-aways from today’s lecture:

- Complete-case analyses are wasteful. Also, potentially invalid unless MCAR

- Available-case analyses make better use of the available data but still require MCAR (weaker assumptions may suffice, depending on the model/quantity being used/estimated)

- Imputation methods might be valid for some quantities under MCAR, but variances are underestimated ⟹ overconfidence in your results!

Next lecture:

- R session 1: imputation methods, some simulation studies

- Bring your laptops!

42 / 42
