EPSE 581C: Causal Inference for Applied Researchers

Ed Kroc

University of British Columbia

[email protected]

June 12, 2019

Ed Kroc (UBC) Causal Inference June 12, 2019 1 / 37


Last time

Propensity score matching

implementation

limitations

case studies


Today

Propensity score matching

Intro to instrumental variables


Matching to minimize confounding of causal effects

Consider the following example:

We want to assess the causal effect of a new tutoring service being offered free to all students in MATH 105, Winter Session II, on final course grade.

Graduate students in MATH are trained via a series of pedagogical workshops. Students can then book one thirty-minute appointment via an online system with one of these tutors anytime from 9 AM to 4 PM, the week before an exam.

Problem: self-enrolment in the treatment (tutor) group means the assignment-to-treatment mechanism will be confounded.


Matching to minimize confounding of causal effects

Several problems with practical implementation of raw matching:

(1) What if we can’t find a match?

(2) What if we have too many covariates to match on, thereby creating some sample units that can’t be matched?

(3) For non-categorical variables, how do we decide what a ‘match’ is? E.g. we equated ages of 19 to 21, and GPAs of 3.5 to 3.7. Is this reasonable? Optimal?

LOTS of work has been done to try to answer these questions.

To address (3), many different matching algorithms exist to find “best” matches. (No one algorithm is always best.)

To address (2), very common to use propensity scores.


Propensity scores

The propensity score is defined as the conditional probability that a sample unit with observed covariates $\mathbf{X}$ will be assigned to treatment, $T = 1$:

$$s(\mathbf{X}) := \Pr(T = 1 \mid \mathbf{X})$$

Claim: if sample units all have about the same propensity score $s(\mathbf{X})$, then the distribution of $\mathbf{X}$ is about the same for both treatment groups.

That is, sample units with similar propensity scores are (approximately) exchangeable over treatment, assuming no unobserved confounders.

Thus, propensity scores are a way of removing confounding effects of the observed covariates on the assignment to treatment mechanism.


Propensity scores

Propensity scores solve the problem of having too many covariates to match on: instead of matching on all covariates simultaneously, simply estimate the propensity score and then match sample units based on this single value.

Two immediate questions arise:

(1) How do we estimate a propensity score from sample data?

(2) What is the best way to match units over their propensity scores (a continuous quantity)?


Estimating propensity scores

(1) How do we estimate a propensity score from sample data?

ANSWER: propose a regression model relating assignment to treatment to the observed covariates.

Assignment to treatment is a binary variable, and the propensity score is the probability of assignment to treatment (real number between 0 and 1), given the observed covariates.

How do we specify a regression model for a binary response variable?

ANSWER (most commonly): logistic regression (primer last time)


Estimating propensity scores

Goal: estimate $s(\mathbf{X}) = \Pr(T = 1 \mid \mathbf{X})$.

Solution: logistic regression:

$$\log\left(\frac{\Pr(T = 1 \mid \mathbf{X})}{1 - \Pr(T = 1 \mid \mathbf{X})}\right) = \beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon$$

Note: it is entirely possible that we may want to include interactions and other marginal curvature, etc. terms in this regression model: the generic vector notation (bold-faced coefficients, covariates) should be construed as allowing for these possibilities.


Estimating propensity scores

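Using the logistic model, we can then model the propensity score:

$$\frac{\Pr(T = 1 \mid \mathbf{X})}{1 - \Pr(T = 1 \mid \mathbf{X})} = \exp(\beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon)$$

$$\Pr(T = 1 \mid \mathbf{X}) = \bigl(1 - \Pr(T = 1 \mid \mathbf{X})\bigr) \exp(\beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon)$$

$$\exp(\beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon) = \Pr(T = 1 \mid \mathbf{X}) \bigl(1 + \exp(\beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon)\bigr)$$

$$s(\mathbf{X}) = \Pr(T = 1 \mid \mathbf{X}) = \frac{\exp(\beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon)}{1 + \exp(\beta_0 + \boldsymbol{\beta}_X \mathbf{X} + \varepsilon)}$$

So once we have estimates for the parameters (regression coefficients) of the logistic model, we can then get an estimate for the propensity scores.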
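The conversion from a fitted linear predictor to a propensity score can be sketched in a few lines of plain Python. The function names `inv_logit` and `logit` and the coefficient values below are illustrative assumptions, not taken from the slides; the point is only that the last line of the derivation is a direct formula.

```python
import math

def inv_logit(eta: float) -> float:
    """Map a linear predictor (log-odds) eta to a probability in (0, 1)."""
    # Algebraically equal to exp(eta) / (1 + exp(eta)); the two branches
    # avoid overflow for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficient values, chosen only to exercise the formula.
beta0, beta_gpa = -2.0, 0.5
score = inv_logit(beta0 + beta_gpa * 3.5)  # propensity score for GPA = 3.5
```

Applying `logit` to the result recovers the linear predictor, which is a quick way to check that the two functions are inverses of each other.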


Estimating propensity scores

Return to our tutor intervention example:

Model assignment to treatment $T$ conditional on observed covariates, for example:

$$\operatorname{logit}(T) = \beta_0 + \beta_1 \text{Sex} + \beta_2 \text{Faculty} + \beta_3 \text{Age} + \beta_4 \text{GPA} + \varepsilon$$

Fit in R with ‘glm’ command:

Now convert to propensity scores using formula we previously derived and add these scores to our data frame:
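As a rough illustration of the fit-then-convert workflow, here is a minimal sketch in plain Python rather than R's `glm`. It makes two loud assumptions: the model is reduced to a single covariate (GPA), and it is fit only to the 12 students displayed in the example table, so the resulting scores will not reproduce the slide's estimates, which come from the full model on all 113 students. The helper name `fit_logistic` is hypothetical.

```python
import math

# (GPA from previous term, Tutor?) for the 12 students displayed in the table.
rows = [(3.5, 1), (3.6, 1), (3.7, 0), (3.0, 0), (3.5, 0), (3.7, 0),
        (3.5, 1), (3.8, 0), (3.5, 1), (3.6, 0), (2.8, 0), (3.1, 0)]

def fit_logistic(xs, ts, iterations=25):
    """Newton-Raphson for logit Pr(T = 1 | x) = b0 + b1 * x (one covariate)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(t - p for t, p in zip(ts, ps))
        g1 = sum((t - p) * x for t, p, x in zip(ts, ps, xs))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system by Cramer's rule.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

mean_gpa = sum(x for x, _ in rows) / len(rows)
xs = [x - mean_gpa for x, _ in rows]   # centre GPA for numerical stability
ts = [t for _, t in rows]
b0, b1 = fit_logistic(xs, ts)

# Convert the fitted log-odds to estimated propensity scores.
scores = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
```

One property worth checking after any logistic fit with an intercept: the estimated propensity scores sum to the number of treated units in the data used for the fit.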


Estimating propensity scores

Student  Sex  Faculty  Age  GPA from previous term  Tutor?  Final grade  Propensity score est.
1        F    SCIE     19   3.5                     1       90           0.247
2        F    SCIE     19   3.6                     1       88           0.338
3        F    SCIE     21   3.7                     0       85           0.225
4        M    SCIE     18   3.0                     0       82           0.498
5        F    SCIE     20   3.5                     0       82           0.316
6        M    SCIE     19   3.7                     0       83           0.428
7        M    SCIE     19   3.5                     1       90           0.446
8        F    ARTS     18   3.8                     0       95           0.118
9        M    SCIE     21   3.5                     1       94           0.434
10       M    SCIE     20   3.6                     0       94           0.431
11       F    ARTS     19   2.8                     0       75           0.159
...      ...  ...      ...  ...                     ...     ...          ...
113      M    SCIE     20   3.1                     0       85           0.476


Matching with propensity scores

So now that we have estimates for the propensity scores, how do we match sample units on them (recall: these are continuous quantities)?

Many techniques have been proposed, but one of the most common and validated techniques is to match on the quintiles or deciles of the propensity score distribution; i.e. match on the strata of the p.s. distribution.


Matching with propensity scores

Quintiles: assume sample units are exchangeable over treatment (conditional on observed covariates and assuming no omitted confounders) within the quintiles of the propensity score distribution:

[0,20th percentile), [20th,40th percentile), ..., [80th,100th percentile]

If you have enough data, can match over finer quantiles of the propensity score distribution, e.g. the deciles:

[0,10th percentile), [10th,20th percentile), ..., [90th,100th percentile]

Once again, many algorithms for choosing matches within each of these strata.

For our example, quintiles of p.s. distribution are:

0, 0.13, 0.22, 0.36, 0.44, 1
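Given these cut points, assigning each student to a stratum is a simple lookup. The sketch below uses the interior cut points reported above and the propensity scores of the 12 students displayed in the example table; the function name `stratum` and the 1-based numbering of strata are choices made here for illustration.

```python
from bisect import bisect_right

# Interior quintile cut points reported on the slide for the full sample.
cuts = [0.13, 0.22, 0.36, 0.44]

# Estimated propensity scores of the 12 displayed students, keyed by student number.
scores = {1: 0.247, 2: 0.338, 3: 0.225, 4: 0.498, 5: 0.316, 6: 0.428,
          7: 0.446, 8: 0.118, 9: 0.434, 10: 0.431, 11: 0.159, 113: 0.476}

def stratum(score: float) -> int:
    """Return the 1-based quintile stratum; intervals are closed on the left."""
    return bisect_right(cuts, score) + 1

strata = {student: stratum(s) for student, s in scores.items()}
```

Among the displayed students, strata 1 and 2 contain only untreated units (students 8 and 11), which already hints at the matching difficulties discussed later.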



Matching with propensity scores

Now match sample units with $T = 1$ to those with $T = 0$ within each stratum.

Can do this manually, or most commonly use software to create matches.

Some potential issues:

No guarantee that you will have the same number of treated units as untreated units within each stratum.

No guarantee that your strata will be equal-sized.

Can get different matches if you choose different strata; thus, will produce different estimates of ACE(T).


Matching with propensity scores

(1) No guarantee that you will have the same number of treated units as untreated units within each stratum.

SOLUTION: one-to-one matching: only match when you can, discard leftover data.

ALTERNATIVE: many-to-one matching: match multiple units in one treatment group to one unit in other treatment group (not advised).

(2) No guarantee that your strata will be equal-sized.

SOLUTION: not a problem as long as (1) is not a problem.

(3) Can get different matches if you choose different strata; thus, will produce different estimates of ACE(T).

SOLUTION: sensitivity analysis: compare estimates using different stratifications - if no major differences, then results relatively robust to choice of strata.
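One-to-one matching within strata can be sketched as follows. This is only one of many possible matching rules (greedy nearest-score matching without replacement, an assumption made here, not the slides' prescribed algorithm), applied to the displayed students whose quintile stratum contains at least one treated unit. The function name `match_one_to_one` is hypothetical.

```python
# (student, stratum, tutor, final grade, propensity score) for the displayed
# students in strata 3 to 5; strata 1 and 2 hold no treated units among them.
units = [
    (1, 3, 1, 90, 0.247), (2, 3, 1, 88, 0.338), (3, 3, 0, 85, 0.225),
    (5, 3, 0, 82, 0.316), (6, 4, 0, 83, 0.428), (9, 4, 1, 94, 0.434),
    (10, 4, 0, 94, 0.431), (4, 5, 0, 82, 0.498), (7, 5, 1, 90, 0.446),
    (113, 5, 0, 85, 0.476),
]

def match_one_to_one(units):
    """Greedy nearest-score matching within strata, without replacement."""
    pairs = []
    used = set()
    for sid, stratum, tutor, grade, score in units:
        if tutor != 1:
            continue
        candidates = [u for u in units
                      if u[1] == stratum and u[2] == 0 and u[0] not in used]
        if not candidates:
            continue  # no control left in this stratum: discard the treated unit
        best = min(candidates, key=lambda u: abs(u[4] - score))
        used.add(best[0])
        # Record (treated id, control id, difference in final grade).
        pairs.append((sid, best[0], grade - best[3]))
    return pairs

pairs = match_one_to_one(units)
ace_hat = sum(diff for _, _, diff in pairs) / len(pairs)
```

With these twelve rows the matched pairs are (1, 3), (2, 5), (9, 10) and (7, 113), giving an average matched difference of 4 grade points. Two controls (students 4 and 6) are discarded, and a different matching rule or stratification could change the pairs, which is exactly why the sensitivity analysis above matters.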


Finding a good match is often the hardest part

What if there aren’t enough (good) matches to be made in a stratum?

Equivalently, what if the distributions of the propensity scores are too unbalanced between treatment groups?

There is NO solution to this problem, and it is quite common.

Partial fixes:

Restrict your inferences to subsets where this problem does not arise.

Collect more data to produce enough matches in each stratum.

Respecify your PS-model to (hopefully) achieve better balance.


Using propensity scores without matching

Alternatively, one can try not matching at all.

That is, assume that all units within a particular stratum are exchangeable; so can simply estimate $E(Y_i(0))$ and $E(Y_i(1))$ by $E(Y \mid T = 0)$ and $E(Y \mid T = 1)$ within each stratum using all that stratum’s data.

Positives: no discarding of data, no need to find matches.

Negatives: stronger exchangeability assumption required (not testable).
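A minimal sketch of this no-matching approach: compute the difference in mean final grade between tutored and untutored students within each stratum, using every unit in the stratum. The data are the 12 displayed students, with strata taken from the quintile cut points 0.13, 0.22, 0.36, 0.44; the function name `stratum_effects` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (stratum, tutor, final grade) for the 12 displayed students, in table order.
units = [(3, 1, 90), (3, 1, 88), (3, 0, 85), (5, 0, 82), (3, 0, 82), (4, 0, 83),
         (5, 1, 90), (1, 0, 95), (4, 1, 94), (4, 0, 94), (2, 0, 75), (5, 0, 85)]

def stratum_effects(units):
    """Difference in mean outcome (treated minus untreated) per stratum."""
    grades = defaultdict(lambda: {0: [], 1: []})
    for stratum, tutor, grade in units:
        grades[stratum][tutor].append(grade)
    effects = {}
    for stratum, groups in sorted(grades.items()):
        if groups[0] and groups[1]:
            effects[stratum] = mean(groups[1]) - mean(groups[0])
        else:
            effects[stratum] = None  # one group is empty: effect not estimable
    return effects

effects = stratum_effects(units)
```

For these rows the estimate is undefined in strata 1 and 2, where no displayed student was tutored, and is 5.5, 5.5 and 6.5 grade points in strata 3, 4 and 5.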


Limitations of propensity scores

How to specify the best propensity score model?

How much balance is enough balance?

What if there aren’t enough matches to be made in a stratum?


Specifying the propensity score model

Lots of conflicting research out there on this.

Different goals of model specification than before:

Previously (RD designs), wanted to properly specify the model for the data-generating process: $Y = f(\mathbf{X}) + \varepsilon$.

But now, we are trying to specify a model for the assignment-to-treatment process: $s(\mathbf{X}) = \Pr(T = 1 \mid \mathbf{X})$.

Main goal: to achieve balance of covariates between treatment groups, thus ensuring exchangeability of treatment over measured covariates.

Best way to do this: fit a PS-model, then check the balance of covariates empirically.


Checking empirical propensity score balance

Most common ways to check covariate balance:

Do standardized mean differences of the covariate between $T = 0$ and $T = 1$ approximately equal 0 over each propensity score subclass?

Does ratio of variances of covariate between $T = 0$ and $T = 1$ approximately equal 1 over each propensity score subclass?

More generally, are the distributions of the covariate for $T = 0$ and $T = 1$ approximately equal?

Also more generally, are the propensity scores balanced between the $T = 0$ and $T = 1$ groups in each subclass?

See case studies for examples.

How much balance is enough? No good answer.
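The first two diagnostics can be sketched directly. The standardized mean difference below uses one common convention (difference in means divided by the square root of the average of the two sample variances); other conventions exist, so treat the exact denominator as an assumption. For brevity the check is run on GPA pooled over all 12 displayed students rather than within each subclass, which is what one would do with the full data.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """(mean_1 - mean_0) / sqrt((var_1 + var_0) / 2), using sample variances."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

def variance_ratio(treated, control):
    """Ratio of sample variances, treated over control."""
    return variance(treated) / variance(control)

# GPA of the 12 displayed students, split by tutoring status (all strata pooled).
gpa_treated = [3.5, 3.6, 3.5, 3.5]
gpa_control = [3.7, 3.0, 3.5, 3.7, 3.8, 3.6, 2.8, 3.1]

smd = standardized_mean_difference(gpa_treated, gpa_control)
vr = variance_ratio(gpa_treated, gpa_control)
```

Here the standardized mean difference is roughly 0.46 and the variance ratio roughly 0.02, both far from the targets of 0 and 1, so GPA is poorly balanced in this small unstratified subset.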


Outline for working with propensity scores

Fit a PS-model for the assignment-to-treatment mechanism.

Check the empirical balance of propensity scores over subclasses (e.g. quintiles, deciles).

If balance is poor, specify revised PS-model and check new balance.

Once balance is adequate, then estimate ACE within each subclass:

$$\hat{E}(Y \mid s(\mathbf{X}))$$

Finally, can combine strata estimates to get an overall estimate of the ACE, if desired: usually combine estimates via inverse-sample-size weighting.


Case study

Connors et al. (1996)


Instrumental variables (natural experiments)

The final major technique for causal inference in observational or quasi-experimental settings is instrumental variables.

RD-designs allow us to estimate average causal effects of treatment on response locally near the fixed threshold that determines assignment to treatment.

PS-analyses help us estimate average causal effects of treatment on response, assuming no omitted confounders, by making treatment and non-treatment groups (approximately) exchangeable over observed covariates.

IVs allow us to estimate average causal effects of treatment on response, theoretically, in any context. In practice, of course, this is very difficult to accomplish.


Confounding in a randomized controlled trial

Suppose we want to assess the efficacy of a new drug to treat high blood pressure. We enrol 100 patients with no known comorbidities into our study. Then we randomly assign 50 to receive the new drug, and 50 to receive the current “best” medication.

Suppose some experimental units (patients) do not always take their assigned treatment; e.g. laziness, forgetfulness.

Suppose some experimental units do not always stay on their assigned treatment; e.g. adverse side effects.

Even though the assignment-to-treatment mechanism is random, our estimates of the ACE(T) will be confounded since the actual treatment does not always match with the intended treatment.


Confounding in a randomized controlled trial

What to do when people don’t consistently stay on treatment or drop out altogether?

Exclude these people from analysis?

Restrict scope of analysis (e.g. duration of treatment) for everyone?

Neither is ideal...

We might hope that if patients don’t stay on treatment, this happens at random; i.e. there is no systematic reason for the drop-out.

Of course, likely not reasonable in practice.

Thus, if we simply exclude these people from analysis, we will be destroying our random assignment to treatment mechanism; i.e. estimated treatment effects will be confounded by unknown drop-out confounders.


Confounding in a randomized controlled trial

Formally, suppose we analyze our data, excluding patients who didn’t adhere to treatment, with a simple regression model or one-way ANOVA (assuming no relevant covariates to control for):

$$Y = \beta_0 + \beta_T T + \delta.$$

But because the treatment is no longer randomly assigned, $\beta_T$ does not estimate the ACE(T); i.e. the error $\delta$ is confounded with $T$.

For example, patients with undiagnosed depression are likely to drop out of treatment; thus:

$$\delta = \beta_{TD}\, T \cdot \text{Depression} + \delta'$$

or, more generally,

$$\delta = f(T, \text{Depression}) + \delta'$$


Confounding in a randomized controlled trial

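If we ignore this confounding, we already know that our estimates of ACE(T) will be biased:

$$Y = \beta_0 + \beta_T T + \delta$$

$$\operatorname{Cov}(Y, T) = \operatorname{Cov}(\beta_0, T) + \operatorname{Cov}(\beta_T T, T) + \operatorname{Cov}(\delta, T)$$

$$\operatorname{Cov}(Y, T) = 0 + \beta_T \operatorname{Var}(T) + \operatorname{Cov}(\delta, T),$$

So that:

$$\beta_T = \frac{\operatorname{Cov}(Y, T) - \operatorname{Cov}(\delta, T)}{\operatorname{Var}(T)}$$

If the usual regression assumptions are satisfied, then

$$\operatorname{Cov}(\delta, T) = 0,$$

and we recover the classical solution:

$$\beta_T = \frac{\operatorname{Cov}(Y, T)}{\operatorname{Var}(T)}$$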


Confounding in a randomized controlled trial

However, this is not the case we are in!

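Under our model misspecification, we know that some term(s) is/are missing from our model:

$$\delta = f(T, \text{Depression}) + \delta'$$

Thus, $\operatorname{Cov}(\delta, T) \neq 0$, and so

$$\beta_T = \frac{\operatorname{Cov}(Y, T) - \operatorname{Cov}(\delta, T)}{\operatorname{Var}(T)}$$

However, our standard regression analysis assumes that there are no violations of assumptions; thus, we only estimate the first term:

$$\hat{\beta}_T = \frac{\widehat{\operatorname{Cov}}(Y, T)}{\widehat{\operatorname{Var}}(T)} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(t_i - \bar{t})}{\sum_{i=1}^{n} (t_i - \bar{t})^2}$$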


Confounding in a randomized controlled trial

Thus, under the misspecified model, our estimate of βT is wrong.

Consequently, any derived estimate of ACE(T) is wrong.

To get an unconfounded estimate of treatment effect, we would need to model:

$$Y = \beta_0 + \beta_T T + f(T, \text{Depression}) + \delta'$$

Unfortunately, we don’t know the function $f$, we don’t have data on Depression, and we don’t even know that undiagnosed Depression is the reason for drop-out!

What to do?

ANSWER: enter the instrumental variable. . .


Intention to treat: the first instrumental variable

Notice that the initial assignment to treatment was random; we say that the intention to treat was random.

Consider modelling actual treatment, $T$, as a function of intention to treat, $Z$:

$$T = \gamma_0 + \gamma_1 Z + \varepsilon$$

If all patients stay on their assigned treatment, then $\gamma_0 = 0$, $\gamma_1 = 1$, and $\varepsilon = 0$, so $T = Z$.

If, instead, patients do not adhere to their assigned treatment, then the model departs from this ideal: $(\gamma_0, \gamma_1) \neq (0, 1)$ or the error $\varepsilon$ is not identically zero.


Intention to treat: the first instrumental variable

Notice also that the intention to treat, $Z$, is only related to the outcome of interest (decline in blood pressure) via the causal effect of treatment, $T$. That is, there is no direct effect of $Z$ on $Y$.

Consequently, $\operatorname{Cov}(Z, \delta) = 0$, where

$$Y = \beta_0 + \beta_T T + \delta$$

Since these analytical properties hold, we say that $Z$ is an instrumental variable for the causal effect of $T$ on $Y$.

And, amazingly, we claim that an unconfounded estimate of the ACE(T) is given by

$$\beta_{IV} = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, T)}$$


Intention to treat unconfounds drop-out effects

We claim that an unconfounded estimate of the ACE(T) is given by

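$$\beta_{IV} = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, T)}$$

Why? A little covariance algebra:

$$\beta_{IV} = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, T)} = \frac{\operatorname{Cov}(Z, \beta_0 + \beta_T T + \delta)}{\operatorname{Cov}(Z, T)} = \frac{\beta_T \operatorname{Cov}(Z, T) + \operatorname{Cov}(Z, \delta)}{\operatorname{Cov}(Z, T)} = \beta_T$$

The last equality uses $\operatorname{Cov}(Z, \delta) = 0$.

And this is exactly the (unconfounded) causal effect of treatment on response $Y$ that we could not estimate before due to $\operatorname{Cov}(\delta, T) \neq 0$.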
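The algebra can be checked numerically on a tiny hypothetical population. Everything in the sketch below is an assumption made for illustration: four equally likely cells defined by random assignment $Z$ and an unobserved binary trait $U$ (standing in for something like undiagnosed depression), units with $U = 1$ never take the treatment, and the outcome follows $Y = \beta_0 + \beta_T T + \beta_U U$ with made-up coefficients. Because $Z$ is independent of $U$, the instrument condition $\operatorname{Cov}(Z, \delta) = 0$ holds by construction.

```python
# Each row is one equally likely cell of a hypothetical population:
# Z = intention to treat, U = unobserved trait, T = treatment actually taken.
BETA_0, BETA_T, BETA_U = 10.0, 5.0, -4.0   # assumed values, not from the slides

population = []
for z in (0, 1):
    for u in (0, 1):
        t = z if u == 0 else 0              # units with U = 1 never take treatment
        y = BETA_0 + BETA_T * t + BETA_U * u
        population.append((z, t, y))

def cov(a, b):
    """Population covariance of two equally weighted sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (w - mean_b) for x, w in zip(a, b)) / len(a)

zs, ts, ys = (list(col) for col in zip(*population))

beta_naive = cov(ys, ts) / cov(ts, ts)      # Cov(Y, T) / Var(T)
beta_iv = cov(zs, ys) / cov(zs, ts)         # Cov(Z, Y) / Cov(Z, T)
```

In this toy population the regression of $Y$ on the treatment actually received gives a slope of about 7.67, overstating the true effect of 5 because only units with $U = 0$ ever take the treatment, while the IV ratio recovers 5 exactly.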


Instrumental variables in general

In general, an instrumental variable $Z$ for the causal effect of some $X$ on $Y$ must obey the following properties:

(1) $Z$ must be conditionally independent of the response $Y$, given $X$; i.e. $Z$ is only related to $Y$ via $X$.

(2) $Z$ must be related to the causal variable $X$, and in particular, $\operatorname{Cov}(Z, X) \neq 0$.

Assumption (1) is usually untestable; must be justified from theoretical considerations (like our intention to treat example).

Assumption (2) is directly testable; in practice, to get good IV-estimates, we should have a strong association/correlation between $Z$ and $X$.

Note: this all generalizes to situations where you may want to control for fixed covariates or have more than one instrumental variable for one or more causal variables of interest.


Difficulties of instrumental variables

It can be quite difficult, if not impossible, to find a reasonable instrumental variable for a given research problem.

In practice, IV-estimates can be quite unstable:

When $\operatorname{Cov}(Z, X)$ is close to zero, the IV is said to be weak. Sample estimates based on weak IVs are unlikely to be improvements over ordinary non-IV-estimates.

When the IV is related to the response $Y$ in more ways than just through the causal variable of interest $X$, the whole IV-framework breaks down; IV-estimates are biased and can even be inconsistent.

Even if the IV is strong and is only related to the response via the causal variable of interest, IV-estimates have relatively poor small sample properties because their sampling distributions tend to be very heavy-tailed; may not even have finite moments!
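The first two difficulties can be read off the same covariance algebra used for the intention-to-treat example. As a sketch, write the outcome model as $Y = \beta_0 + \beta_X X + \delta$ (the symbol $\beta_X$ is introduced here by analogy with $\beta_T$) and do not assume $\operatorname{Cov}(Z, \delta) = 0$:

```latex
\beta_{IV}
  = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
  = \frac{\beta_X \operatorname{Cov}(Z, X) + \operatorname{Cov}(Z, \delta)}{\operatorname{Cov}(Z, X)}
  = \beta_X + \frac{\operatorname{Cov}(Z, \delta)}{\operatorname{Cov}(Z, X)}
```

Any leftover association between $Z$ and $\delta$ is therefore divided by $\operatorname{Cov}(Z, X)$, so a weak instrument magnifies even a small violation of the first IV property. For instance, with $\operatorname{Cov}(Z, \delta) = 0.01$ the bias is $0.04$ when $\operatorname{Cov}(Z, X) = 0.25$ but $1$ when $\operatorname{Cov}(Z, X) = 0.01$.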


Case study

Angrist & Krueger (2001)


Next time

More instrumental variables
