What works in Boston may not work in Los Angeles: Understanding site differences and generalizing effects from one site to another.

Kara Rudolph, with Mark van der Laan

RWJF Health and Society Scholar, UC Berkeley / UC San Francisco

Outline

1 Motivation

Motivating example

2 Methodologic Challenges

3 Approach

4 Results

5 Future directions


Motivation

Should we expect that a policy/program/intervention implemented in one place will have the same effect when implemented in another place?

Not always.

1 Differences in site-level variables (e.g., implementation, economy) that modify intervention effectiveness, AND/OR

2 Differences in person-level variables (i.e., population composition) that modify intervention effectiveness.


Motivation

Budgets are limited.

Need to target the policy/intervention to those places that stand to benefit most.

E.g., planned expansion of an intervention. Where should it be expanded to have the largest effect? How is success defined?

Can you think of any practical examples of this?

Research question: What do we expect the effect of an intervention to be in a new place, accounting for population composition?


Motivating Example

Moving To Opportunity (MTO) [1]

Image sources: https://upload.wikimedia.org/, http://www.chicagomag.com

[1] Kling, J. R. et al. Experimental analysis of neighborhood effects. Econometrica 75, 83–119 (2007).

Motivating Example

In discussing differences in effects across sites, MTO researchers concluded:

Of course, if it had been possible to attribute differences in impacts across sites to differences in site characteristics, that would have been very valuable information. Unfortunately, that was not possible. With only five sites, which differ in innumerable potentially relevant ways, it was simply not possible to disentangle the underlying factors that cause impacts to vary across sites. (This is true for both the quantitative analysis and for any qualitative analysis of the impacts that might be undertaken.) [2]

Why are the researchers saying this? Do you agree?

[2] Orr, L. et al. Moving to opportunity: Interim impacts evaluation. (2003), p. B11.


Motivating Example

Research Question (MTO-specific): Are differences in intervention effects across cities due to differences in implementation? City-level differences (e.g., the economy)? Or differences in population composition?


Methodologic Challenges

Typically, multi-site data are analyzed using fixed effects.

This usually assumes that we answered “Yes” to whether we expect the intervention effect in one site to be the same as in another site.

Why is that the case?

Dummy variables for site change the intercept but not the treatment effect coefficient. We assume that the conditional effect (regression coefficient) of the intervention in one site is the same as in another site.

Need to apply the results from one city/site to a target city/site based on the observed population composition.

Transportability / generalizability / external validity.
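The fixed-effects point above can be written out as follows. The notation is assumed here and is not taken from the slides: Y is the outcome, A the intervention, S the site with levels s = 1, ..., K, and site 1 is the reference level.

```latex
% Site fixed effects: one intercept shift per site, one shared treatment coefficient
E[Y \mid A, S] = \beta_0 + \beta_1 A + \sum_{s=2}^{K} \gamma_s \, \mathbf{1}(S = s)

% Implied effect of the intervention in every site s:
E[Y \mid A = 1, S = s] - E[Y \mid A = 0, S = s] = \beta_1

% Letting the effect differ by site requires site-by-treatment interactions:
E[Y \mid A, S] = \beta_0 + \beta_1 A + \sum_{s=2}^{K} \gamma_s \, \mathbf{1}(S = s)
               + \sum_{s=2}^{K} \delta_s \, A \, \mathbf{1}(S = s)
```

In the first model the site dummies cancel when the two treatment arms are differenced, which is why the estimated effect is the same for every site. In the interaction model the effect in site s is \beta_1 + \delta_s, with \delta_1 = 0 for the reference site.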










































What’s been done

Most common: Use fixed effects for site.

- Conditional effect is not as policy relevant as marginal effect

- We usually don’t believe the assumption

Common-ish: Post-stratification / direct standardization. [3] E.g., age-adjusted rates of disease for comparisons between populations.

- Breaks down with continuous characteristics or multiple characteristics because of small cell sizes

- No standard errors / no inference

[3] Miettinen, O. S. Standardization of risk ratios. American Journal of Epidemiology 96, 383–388 (1972).
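As a rough illustration of direct standardization, the sketch below reweights stratum-specific effects from a source site by the stratum shares of a target site. All numbers and names (`source_effect`, `source_share`, `target_share`) are made up for illustration and do not come from MTO or the slides.

```r
# Hypothetical stratum-specific intervention effects estimated in the source site
source_effect <- c(young = -0.10, middle = -0.04, old = 0.02)

# Hypothetical population composition (stratum shares) in each site
source_share <- c(young = 0.20, middle = 0.30, old = 0.50)
target_share <- c(young = 0.50, middle = 0.30, old = 0.20)

# Marginal effect in the source site: weight strata by the source composition
effect_in_source <- sum(source_effect * source_share)

# Directly standardized effect: same stratum effects, target composition
effect_standardized <- sum(source_effect * target_share)

print(c(source = effect_in_source, standardized = effect_standardized))
```

With one three-level characteristic this is easy. With many characteristics, or a continuous one, the strata multiply and the cells become too small to estimate `source_effect` reliably, which is the first limitation listed above.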


What’s been done

Less common, rare: Model-based approaches: Horvitz-Thompson weighting (model-based standardization), [4] propensity score matching, [5] and principal stratification [6]

- Rely on correct model specification

- Inference with machine learning is unclear

- With the exception of principal stratification, have not been extended to encouragement-design interventions

Pearl and Bareinboim: formalized theory and assumptions for transportability [7]

[4] Cole, S. R. & Stuart, E. A. Generalizing Evidence From Randomized Clinical Trials to Target Populations: The ACTG 320 Trial. American Journal of Epidemiology 172, 107–115 (2010).

[5] Stuart, E. A. et al. The use of propensity scores to assess the generalizability of results from randomized trials. Journal of the Royal Statistical Society: Series A (Statistics in Society) 174, 369–386 (2011).

[6] Frangakis, C. The calibration of treatment effects from clinical trials to target populations. Clinical Trials (London, England) 6, 136 (2009).

[7] Pearl, J. & Bareinboim, E. Transportability across studies: A formal approach. Tech. rep. (DTIC Document, 2011).


Our contribution

New statistical method for “transporting” effects from one population to another [8]

Transport formula for multi-site encouragement-design interventions (extending Pearl and Bareinboim’s work).

Estimation using transport formulas addressing previous gaps:

+ Inference based on theory (even when using machine learning)

+ Double robust: can misspecify multiple models and still get unbiased estimates

[8] Rudolph, K. E. & van der Laan, M. J. Double Robust Estimation of Encouragement-design Intervention Effects Transported Across Sites. Under Review (2015).


Problem: there are a lot of relationships to specify and we don’t know the truth!

\[
\begin{cases}
S \sim W \\
Z \sim A,\ W,\ A \ast W,\ S \\
Y \sim Z,\ W,\ Z \ast W,\ S
\end{cases}
\]

Can you guess the correct models when W is high dimensional? All interactions? Correct form (e.g., linear, quadratic, spline)?

Note: A = instrument/encouragement, Z = exposure, Y = outcome, S = site, W = covariates/characteristics


Transport estimators

Targeted maximum likelihood estimators (TMLE) for the following estimands:

Effect of A on Y (intent-to-treat)

Effect of Z on Y using randomization of the instrument (complier average treatment effect)

Effect of Z on Y ignoring randomization
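To make the first estimand concrete, here is one simple way a transported intent-to-treat effect can be written. This is a sketch under stated assumptions, not necessarily the authors' transport formula: it codes S = 1 as the source site and S = 0 as the target site (an assumption), and it supposes that the outcome regression given A and W is the same in both sites. The slides also list a model for Z, so the formula in the cited paper may well involve Z explicitly.

```latex
% Transported intent-to-treat effect for the target site (S = 0):
% inner expectations are learned in the source site (S = 1),
% the outer expectation averages over the target site's covariates W
\psi_{\mathrm{ITT}}
  = E\Big[\, E(Y \mid A = 1, W, S = 1) - E(Y \mid A = 0, W, S = 1) \;\Big|\; S = 0 \,\Big]
```

The inner difference uses outcome data from the source site only. The outer average over W given S = 0 is where the target site's population composition enters.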



In everyday language, what does TMLE do?

1 Start by identifying the parameter you’re interested in estimating. E.g., the ITTATE.

2 Get an initial estimate of that parameter. E.g., fit a regression for the Y model and predict setting A = 1 and A = 0. The difference will be the initial estimate.

3 The Y model may not be perfect. (If it is, you’re done.) This initial estimate is then adjusted by something called the clever covariate, C, which is derived from the efficient influence curve. It uses information from the other models to improve upon the initial estimate.

4 This fluctuation can be iterated until convergence.
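The steps above can be sketched in base R for the simplest case: a single-site average effect of a binary A on a binary Y. This is a textbook-style illustration on simulated data, not the transported estimators from the talk. The data-generating values and all variable names are assumptions made for the example.

```r
set.seed(1)
n <- 2000
w <- rnorm(n)                                  # covariate
a <- rbinom(n, 1, 0.5)                         # randomized encouragement
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * a + 0.6 * w))
dat <- data.frame(w, a, y)

# Step 2: initial estimate from an outcome regression, predicting at A = 1 and A = 0
q_fit <- glm(y ~ a + w, family = binomial, data = dat)
q1 <- predict(q_fit, newdata = transform(dat, a = 1), type = "response")
q0 <- predict(q_fit, newdata = transform(dat, a = 0), type = "response")
psi_init <- mean(q1 - q0)

# Step 3: clever covariate built from the treatment model
g_fit <- glm(a ~ w, family = binomial, data = dat)
g1 <- predict(g_fit, type = "response")
dat$h <- dat$a / g1 - (1 - dat$a) / (1 - g1)
dat$q_obs <- ifelse(dat$a == 1, q1, q0)

# Fluctuation: logistic regression of Y on the clever covariate,
# with the initial prediction as an offset on the logit scale
eps_fit <- glm(y ~ -1 + h + offset(qlogis(q_obs)), family = binomial, data = dat)
eps <- coef(eps_fit)[["h"]]

# Updated predictions and the targeted estimate
q1_star <- plogis(qlogis(q1) + eps / g1)
q0_star <- plogis(qlogis(q0) - eps / (1 - g1))
psi_tmle <- mean(q1_star - q0_star)

# After the update, the clever-covariate score equation is solved (close to zero)
q_obs_star <- plogis(qlogis(dat$q_obs) + eps * dat$h)
score <- mean(dat$h * (dat$y - q_obs_star))

print(c(initial = psi_init, tmle = psi_tmle, score = score))
```

Because the outcome model here is correctly specified, the fluctuation barely moves the estimate. For this logistic fluctuation a single update already solves the score equation, so no iteration is needed in this example.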


In everyday language, what does TMLE do?

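[Figure: diagram with points labeled "initial estimate", "true estimate", and "TMLE estimate", with two update steps labeled with the clever covariate C.]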


Performance

Results for the intent-to-treat effect of A on Y. Results are similar for the two other estimators.

Model specification          % Bias   Variance   Coverage   MSE
All models correct            -0.67     0.0004      95.01   0.0004
S model misspecified          -0.49     0.0004      95.34   0.0004
Z model misspecified          -0.67     0.0004      95.00   0.0004
Y model misspecified          -0.71     0.0005      95.36   0.0005
S,Z models misspecified       -0.49     0.0004      95.29   0.0004
S,Z,Y models misspecified      6.05     0.0004      94.84   0.0004
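For readers unfamiliar with these columns, the sketch below shows one common way such simulation summaries are computed. The slides do not define the metrics, so the definitions in the hypothetical helper `sim_metrics` (percent bias relative to the truth, 95% Wald interval coverage) are assumptions, and the simulated estimates are made up.

```r
# Summarize simulation draws: est = point estimates, se = standard errors, truth = true value
sim_metrics <- function(est, se, truth) {
  lower <- est - qnorm(0.975) * se
  upper <- est + qnorm(0.975) * se
  c(
    pct_bias = 100 * (mean(est) - truth) / truth,          # bias as a percent of the truth
    variance = var(est),                                   # variance of the estimates
    coverage = 100 * mean(lower <= truth & truth <= upper), # percent of 95% CIs covering truth
    mse      = mean((est - truth)^2)                       # mean squared error
  )
}

# Made-up example: 1000 unbiased estimates with standard error 0.02
set.seed(7)
truth <- 0.2
est <- rnorm(1000, mean = truth, sd = 0.02)
se <- rep(0.02, 1000)
m <- sim_metrics(est, se, truth)
print(round(m, 4))
```

Since MSE is approximately variance plus squared bias, the MSE and variance columns agree whenever bias is small, as in most rows of the table.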



Sensitivity to positivity violations

Structural positivity violations: A person with some set of covariate values in one treatment/selection group has a zero probability of being in another treatment/selection group.

Practical positivity violations: This probability isn’t strictly zero, but it’s close.

Why is this a problem?
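One way to see the issue, written in generic notation rather than the exact clever covariates from the talk: estimators of this kind weight observations by inverse probabilities of treatment or site membership, and those weights are only well behaved when the probabilities stay away from zero.

```latex
% Positivity for treatment/encouragement and for site membership
0 < P(A = 1 \mid W = w) < 1, \qquad 0 < P(S = 1 \mid W = w) < 1
\quad \text{for all } w \text{ with } P(W = w) > 0

% Inverse-probability-type weights grow without bound as the probability nears zero
\frac{1}{P(A = a \mid W = w)} \to \infty
\quad \text{as} \quad P(A = a \mid W = w) \to 0
```

A structural violation makes the weight undefined. A practical violation makes it very large, so a handful of observations can dominate the estimate and inflate its variance.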













Practical positivity violations are a substantial issue in real world data.

Why might we expect it in the example below?

[Figure: density of the predicted probability of job strain (x-axis, 0.00 to 1.00) for the "Less job strain" and "More job strain" groups. Panels: Pre-Matching and Post-Matching.]



Which of the 3 estimands is most vulnerable to these violations using the MTO data?

What are some other real-world examples that might be vulnerable to positivity violations?





                              C_Y (A = 1)
                              Mean (SD)     Min            Max
EATE
Data-generating mechanism 1   0.49 (0.38)   0.05           2.46
Data-generating mechanism 2   1.07 (1.62)   0.15 × 10^-2   26.26
Application                   2.05 (2.76)   4.54 × 10^-2   13.11



Specification                 %Bias   SE × √n   Cov     MSE
                                      (1.60)
EATE: Without Positivity Violations
All models correct            -0.31    1.60     94.94   0.0005
S model misspecified          -0.38    1.46     93.68   0.0005
Z model misspecified          -0.31    1.48     93.01   0.0005
Y model misspecified          -0.29    1.62     95.09   0.0005
S,Z models misspecified       -0.43    1.36     92.95   0.0004
S,Z,Y models misspecified     14.46    1.37     76.27   0.0009

EATE: With Positivity Violations
All models correct             0.18    3.60     91.36   0.0029
S model misspecified           1.98    1.96     86.33   0.0012
Z model misspecified           0.18    2.67     82.93   0.0029
Y model misspecified           2.09    4.17     96.05   0.0027
S,Z models misspecified        2.18    1.38     79.27   0.0009
S,Z,Y models misspecified    -52.11    1.41      2.49   0.0065


Strategies for addressing positivity violations

Limit the sample to the area of support

Truncate weights

Exclude covariates that neither 1) confound the exposure-outcome relationship nor 2) affect transportability.

Move the weights from the clever covariate into the model fitting step




Truncate weights






Truncation level            %Bias   SE × √n   Cov     MSE
EATE
No modification              0.18    3.60     91.36   0.0029
Truncation at 0.01/100       2.29    3.23     92.50   0.0024
Truncation at 0.05/20        2.71    2.40     89.78   0.0016
Truncation at 0.1/10         2.60    1.90     84.96   0.0013
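A minimal sketch of weight truncation in R. It assumes that a label such as "0.01/100" means the predicted probability is bounded below at 0.01, which caps the inverse-probability weight at 100. That reading, the example probabilities, and the use of the name `lbound` here are assumptions; this is not the authors' implementation.

```r
# Hypothetical predicted probabilities; the first is a practical positivity violation
p <- c(0.001, 0.02, 0.30, 0.80)
lbound <- 0.01                       # lower bound on the probability (assumed meaning)

w_raw   <- 1 / p                     # untruncated inverse-probability weights
w_trunc <- 1 / pmax(p, lbound)       # bounded probabilities cap the weights at 1 / lbound

print(round(rbind(w_raw, w_trunc), 2))
```

Only the extreme weight changes. This matches the pattern in the table: harsher truncation shrinks the standard error but introduces bias, and coverage suffers when the bias is no longer small relative to the standard error.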


Results

Can our new statistical method shed light on the previously intractable problem of not knowing why there are differences in effects across sites?

We take two of the sites: LA and Boston.

Outcome: adolescent school dropout at follow-up.

We use full data from Boston. We ignore the outcome data from LA. Using the outcome model from Boston, we predict the intervention effect in LA, accounting for differences in population composition between the two cities.
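The prediction step described above can be sketched with a plain outcome-regression (g-computation) approach on simulated data. This is a simplified stand-in for the TMLE used in the talk, and the two simulated sites, the single binary covariate `w`, and all effect sizes are invented for illustration.

```r
set.seed(3)
n <- 8000
site <- rbinom(n, 1, 0.5)                          # 1 = source site, 0 = target site
w <- rbinom(n, 1, ifelse(site == 1, 0.3, 0.7))     # composition differs by site
a <- rbinom(n, 1, 0.5)                             # randomized intervention
# The effect of a depends on w, so composition changes the marginal effect
y <- rbinom(n, 1, plogis(-1 + a * (0.2 - 1.0 * w) + 0.5 * w))
dat <- data.frame(site, w, a, y)

src <- subset(dat, site == 1)
tgt <- subset(dat, site == 0)

# Outcome model fit in the source site only
fit <- glm(y ~ a * w, family = binomial, data = src)

# Average predicted effect over the covariates of a given population
effect_for <- function(pop) {
  mean(predict(fit, newdata = transform(pop, a = 1), type = "response") -
       predict(fit, newdata = transform(pop, a = 0), type = "response"))
}

effect_source      <- effect_for(src)   # effect in the source population
effect_transported <- effect_for(tgt)   # source model, target composition

# For comparison only: the effect using the target outcomes we pretended not to have
effect_target_observed <- mean(tgt$y[tgt$a == 1]) - mean(tgt$y[tgt$a == 0])

print(round(c(source = effect_source,
              transported = effect_transported,
              target_observed = effect_target_observed), 3))
```

In this simulation the two sites share the same outcome mechanism, so the transported estimate should track the target site's observed effect. In real data that agreement is exactly what is being checked.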


Results

[Figure, built up over three slides titled "Real results: Boston", "Predicted results: LA", and "Predicted vs. real results: LA": point estimates in two panels (ITTATE and CATE) for Boston, LA, and Transported LA (TMLE). Y-axis: Intervention Effect on Risk of School Drop Out, from −1.0 to 1.0.]

Results

The transported estimates for LA are similar to the true LA estimates.

Using population composition, we can predict the effect for LA → the intervention effect on school dropout is transportable.

This means that the difference in effects between Boston and LA can be largely explained by population composition.


Aside: the importance of incorporating machine learning

[Figure: point estimates in two panels (ITTATE and CATE) for Boston, LA, and Transported LA (TMLE). Y-axis: difference of probability of staying in school, from −3 to 1. Legend "model": none, parametric, superlearner.]


Superlearner [9]

Ensemble machine learning

Weights multiple machine learning algorithms to get the best prediction

Guaranteed (asymptotically) to perform at least as well as the best algorithm included in the weighting

[9] Van der Laan, M. J. et al. Super learner. Statistical Applications in Genetics and Molecular Biology 6 (2007).
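The cross-validation idea behind super learner can be illustrated with a simplified "discrete" version in base R, which picks the single candidate with the lowest cross-validated risk rather than estimating weights for a combination. The two candidate learners, the simulated data, and the 5-fold setup are assumptions chosen for the example. The full method would instead fit a weighted combination of the candidates' cross-validated predictions.

```r
set.seed(42)
n <- 300
x <- runif(n, -2, 2)
y <- 1 + x^2 + rnorm(n, sd = 0.5)    # the truth is quadratic in x
dat <- data.frame(x, y)

# Candidate learners: each takes training data and returns a fitted model
learners <- list(
  linear    = function(d) lm(y ~ x, data = d),
  quadratic = function(d) lm(y ~ x + I(x^2), data = d)
)

# 5-fold cross-validated mean squared error for each learner
folds <- sample(rep(1:5, length.out = n))
cv_risk <- sapply(learners, function(fit_fun) {
  mean(sapply(1:5, function(k) {
    fit  <- fit_fun(dat[folds != k, ])
    pred <- predict(fit, newdata = dat[folds == k, ])
    mean((dat$y[folds == k] - pred)^2)
  }))
})

best <- names(which.min(cv_risk))    # learner with the lowest cross-validated risk
print(round(cv_risk, 3))
print(best)
```

Judging candidates on held-out folds is what protects against picking an overfit model, and it is the same criterion the weighted version uses to choose its weights.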


Policy implications

We should not expect an intervention/program/policy to have the same effect in one city as in another city.

In an era of shrinking budgets, it is important to recognize that what works in Boston may not work in LA, so resources can be targeted optimally.

Broadly useful: multi-site epidemiologic studies, large-scale policy or program interventions, clinical trials.


Future Directions

Examine other strategies to reduce sensitivity to practical positivity violations, especially excluding covariates and moving the weights.

In-depth application of transportability to MTO to understand the relationship between neighborhood poverty and exposure to violence and violent behaviors.

Grant application to extend the transportability method to mediation mechanisms. Examine mediation of the relationship between neighborhood poverty and adolescent risk behaviors by the school environment.

Other ideas? Suggestions?


How can I do this?

Use the R functions that I wrote

Parametric or semiparametric options

ittatetmle <- function(a, z, y, site, w, truncate, lbound)

catetmle <- function(ca, cz, cy, csite, cw, ctruncate, clbound)

noinstratetmle <- function(a, z, y, site, w, truncate, lbound)


Thanks!

www.biostat.jhsph.edu/~[email protected]

Robert Wood Johnson Foundation Health & Society Scholars program, UCSF/UCB

Collaborators

Mark van der Laan, UC Berkeley

Jennifer Ahern, UC Berkeley

Maria Glymour, UCSF

Theresa Osypuk, University of Minnesota

