
Lecture 2: Causal Inference Using Observational Data

Sheetal Sekhri, University of Virginia

BREAD IGC Summer School, India 2012

July 21, 2012

Using Observational Data

• Many policies and programs are evaluated after their implementation

• Policy design can sometimes provide plausibly exogenous variation

• Observational data can be combined with institutional details for evaluation

• Applications may also lend themselves to natural experiments

• Observational data can also be used in such contexts

























Benefits of Using Observational Data

• Shorter time horizon relative to field experiments

• Not as expensive

• Externally valid inference (depending on the data and design)

• Less fraught with behavioral concerns
















Limitations of Using Observational Data

• Selection concerns

• Data quality

• Data limitations - not all desired data for an application may exist

• Data restrictions



Working with Observational Data- Methods

• Difference-in-Difference (DID)

• Regression Discontinuity Design (RDD)

Difference-in-Difference

• Among the most popular methods used in empirical analysis

• Emulates an experiment with treatment and comparison groups

• Uses panel data and is a two-way fixed effects model









DID- Basic Idea

• With panel data on the treated group, can compare outcomes before and after the intervention or policy change

• But any discerned effect can arise due to secular changes

• Panel data on a comparison group can provide the counterfactual

• What would happen to the treated group over time in the absence of treatment

DID- Implementation

• Isolate the design using tabular or graphic representation

• Formalize using regression analysis

Tabular Representation

DID

              Before    After    Difference
Treatment     YT1       YT2      ∆YT = YT2 − YT1
Control       YC1       YC2      ∆YC = YC2 − YC1
Difference                       ∆YT − ∆YC
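The tabular calculation can be sketched in a few lines of Python. The four cell means below are made-up numbers for illustration only, and the function name did_from_means is an assumption, not part of the lecture.

```python
# Hypothetical cell means for illustration only; not taken from any study.
means = {
    ("treatment", "before"): 10.0,   # YT1
    ("treatment", "after"): 16.0,    # YT2
    ("control", "before"): 8.0,      # YC1
    ("control", "after"): 11.0,      # YC2
}

def did_from_means(m):
    """Return (delta_T, delta_C, DID) from the four cell means."""
    delta_t = m[("treatment", "after")] - m[("treatment", "before")]
    delta_c = m[("control", "after")] - m[("control", "before")]
    return delta_t, delta_c, delta_t - delta_c

delta_t, delta_c, did = did_from_means(means)
print(delta_t, delta_c, did)  # 6.0 3.0 3.0
```

In this made-up example the treated group improves by 6 and the control group by 3, so the DID estimate attributes 3 to the treatment.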

DID- Identifying Assumption

• Control group shows the time path of the treatment group without the intervention

• Time trends in the absence of treatment should be the same

• Levels can be different

• If time trends differ, the effect is overstated or understated

• Identifying assumption: no differential pre-trends

























Difference in Difference

[Figure: outcome (vertical axis) against time (horizontal axis) for the control group and the treatment group, in the pre-treatment and post-treatment periods. Successive panels contrast three estimates:]

• Treatment effect comparing the treatment and control groups in the post period

• Treatment effect comparing just the treatment group in the pre and post periods

• Difference-in-Difference: treatment effect from comparing the treatment group in the pre and post periods after eliminating the pre-existing difference between the treatment and control groups

Difference in Difference: Parallel Trend Assumption

[Figure: outcome (vertical axis) against time (horizontal axis).]

Difference in Difference: Parallel Trend Assumption Violated

[Figure: outcome (vertical axis) against time (horizontal axis).]

DID- Regression Analysis

• Suppose our policy change affects villages such that a set of villages are treated

• We have data over time for all villages

• The panel has only 2 time periods

• Post is an indicator that switches to 1 after the intervention

• T is an indicator that takes value 1 for the villages to be treated


























• Outcome variable Y varies by village and time

• Y_it = α0 + α1 Post + α2 T + α3 (Post × T) + ε_it





























DID

              Before     After                Difference
Treatment     α0 + α2    α0 + α1 + α2 + α3    α1 + α3
Control       α0         α0 + α1              α1
Difference                                    α3
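As a rough check on the mapping between the regression coefficients and the table, the sketch below simulates a two-period village panel and estimates the regression by ordinary least squares with NumPy. The sample size, the coefficient values, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

# Simulated two-period village panel; all parameter values are hypothetical.
rng = np.random.default_rng(0)
n_villages = 200
treated = np.repeat(rng.integers(0, 2, n_villages), 2)  # T: 1 for villages to be treated
post = np.tile([0, 1], n_villages)                      # Post: 0 before, 1 after
a0, a1, a2, a3 = 5.0, 1.0, 2.0, 3.0
y = (a0 + a1 * post + a2 * treated + a3 * post * treated
     + rng.normal(0, 1, 2 * n_villages))

# OLS of Y on a constant, Post, T, and Post*T.
X = np.column_stack([np.ones_like(y), post, treated, post * treated])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def cell(t, p):
    """Mean outcome for group t (1 = treatment) in period p (1 = after)."""
    return y[(treated == t) & (post == p)].mean()

# The same quantity computed from the four cell means of the table.
did_means = (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))
print(coef[3], did_means)
```

Because the model is saturated in Post and T, the coefficient on the interaction term coincides with the difference of differences in cell means, up to floating-point error.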

DID- Robustness and Extension

• With many years of data before the intervention, it is possible to check for pre-trends to make the estimation more credible

• Common support required: check for significant overlap in the distributions of T and C

• Placebo test: no effect should be discerned if treatment is randomly considered to occur in any year prior to the actual date

• Balance across treatment and control: a selection model can show that the determinants of treatment are not time varying
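The placebo idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the panel years, the intervention year 2004, the fake date 2002, and the effect size are all invented, and the simulated trends are parallel by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(2000, 2008)       # hypothetical panel years
true_start = 2004                   # hypothetical year the intervention begins
n_villages = 200
is_treated = np.arange(n_villages) < n_villages // 2   # first half of villages treated

# Outcome: level gap between groups + common time trend
# + effect of 3.0 after true_start + noise.
y = (2.0 * is_treated[:, None]
     + 0.5 * (years[None, :] - 2000)
     + 3.0 * is_treated[:, None] * (years[None, :] >= true_start)
     + rng.normal(0, 0.5, (n_villages, years.size)))

def did(start, sample_years):
    """2x2 DID of cell means, treating `start` as the intervention year."""
    in_sample = np.isin(years, sample_years)
    after = in_sample & (years >= start)
    before = in_sample & (years < start)

    def cell(group, period):
        return y[group][:, period].mean()

    return ((cell(is_treated, after) - cell(is_treated, before))
            - (cell(~is_treated, after) - cell(~is_treated, before)))

actual = did(true_start, years)                    # true date, full sample
placebo = did(2002, years[years < true_start])     # fake date, pre-period data only
print(round(actual, 2), round(placebo, 2))
```

With parallel pre-trends the placebo estimate should be close to zero while the estimate at the true date should be close to the simulated effect; a placebo estimate far from zero would point to differential pre-trends.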
















DID- Extension

• Y_it = β0 + T_t + V_i + β2 (T_t × V_i) + ε_it

• T_t: full set of year fixed effects

• V_i: full set of village fixed effects

• Allows for covariance between T_t and T_t × V_i, and between V_i and T_t × V_i

• Systematic differences between villages allowed

• Allows for the intervention to occur in years with different levels of the outcome variable
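A minimal sketch of the fixed effects estimation, under one loudly stated assumption: the interaction term is read here as a treatment indicator equal to 1 for treated villages in post-intervention years. For a balanced panel, removing village means and year means (double demeaning) gives the same coefficient as including full sets of village and year dummies. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_villages, n_years = 50, 6
village_fe = rng.normal(0, 2, (n_villages, 1))    # V_i: village fixed effects
year_fe = rng.normal(0, 1, (1, n_years))          # T_t: year fixed effects
treated = (np.arange(n_villages) < 25)[:, None]   # hypothetical treated villages
post = (np.arange(n_years) >= 3)[None, :]         # hypothetical post-intervention years
d = (treated & post).astype(float)                # assumed form of the interaction term
beta2 = 1.5                                       # hypothetical true effect
y = (4.0 + village_fe + year_fe + beta2 * d
     + rng.normal(0, 0.1, (n_villages, n_years)))

def double_demean(a):
    """Remove village means and year means from a balanced panel array."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())

y_dd, d_dd = double_demean(y), double_demean(d)
beta2_hat = (d_dd * y_dd).sum() / (d_dd ** 2).sum()
print(round(beta2_hat, 3))
```

The shortcut relies on the panel being balanced; with missing village-years, an explicit dummy-variable regression would be the safer route.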

Regression Discontinuity Design- RDD

• Resource allocation based on a cutoff: scores, date of birth, rationing cutoffs

• Can use an RD design in such settings

• Powerful way of addressing selection

• Observable characteristics in T and C can be different

• Common support not needed

























RDD- Basic Idea

• The control and treated observations are very similar around the cutoff

• Scoring barely above the cutoff is a matter of chance

• Unobservable characteristics like ability are very similar, but one group gets treatment and the other does not

• Selection process is completely known and can be modeled

• Regression function between the assignment variable and the outcome variable is determined

Regression Discontinuity: No Effect

[Figure: test score (vertical axis, 20 to 80) against the assignment variable (horizontal axis, 10 to 100) for the control group and the treatment group.]

Regression Discontinuity: Significant Effect

[Figure: same axes; successive panels add the counterfactual regression line and mark the treatment effect.]

RDD- Implementation

• Probability of treatment should be discontinuous at the cutoff: the treated sample lies on one side

• Those offered treatment should take it up, and the control group should not be able to get treated

• Sharp and fuzzy designs require different approaches

• In a sharp design, the probability of treatment changes from 0 to 1 at the cutoff

• If the probability does not change very sharply or overrides are common, use assignment as an IV for treatment
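One way to picture the fuzzy case is as a ratio: the jump in the outcome at the cutoff divided by the jump in the probability of treatment, which is what using assignment as an instrument amounts to in this simple setting. The sketch below is an illustration on simulated data, not a full estimator; the cutoff normalized to 0, the treatment probabilities 0.2 and 0.8, the bandwidth, and the true effect of 2.0 are all assumptions.

```python
import numpy as np

def jump_at_cutoff(x, v, h):
    """Difference in linear-fit intercepts of v at x = 0, using |x| <= h on each side."""
    def intercept(mask):
        X = np.column_stack([np.ones(mask.sum()), x[mask]])
        return np.linalg.lstsq(X, v[mask], rcond=None)[0][0]
    return intercept((x >= 0) & (x <= h)) - intercept((x < 0) & (x >= -h))

rng = np.random.default_rng(3)
n = 20000
x = rng.uniform(-1, 1, n)                 # assignment variable, cutoff normalized to 0
above = x >= 0
# Fuzzy design: crossing the cutoff raises Pr(treatment) from 0.2 to 0.8 (hypothetical).
d = (rng.uniform(size=n) < np.where(above, 0.8, 0.2)).astype(float)
y = 1.0 + 0.5 * x + 2.0 * d + rng.normal(0, 1, n)   # hypothetical true effect of 2.0

h = 0.5                                   # hypothetical bandwidth
first_stage = jump_at_cutoff(x, d, h)     # jump in the probability of treatment
reduced_form = jump_at_cutoff(x, y, h)    # jump in the outcome
fuzzy_estimate = reduced_form / first_stage
print(round(first_stage, 2), round(fuzzy_estimate, 2))
```

In a sharp design the first-stage jump equals 1, so the ratio collapses to the jump in the outcome itself.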

Regression Discontinuity: Sharp Design

[Figure: treatment (vertical axis, 0 to 1).]

Regression Discontinuity: Fuzzy Design

[Figure: treatment (vertical axis, 0 to 1).]

RDD- Implementation

• Parametric, semi-parametric, or non-parametric methods can be used for estimation

• Mis-specified functional form can be a problem

• Discontinuity in the regression function at the cutoff is the treatment effect

• Wrong functional forms can generate spurious or biased effects

• Example: a nonlinear relationship estimated with a linear regression function
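The functional form problem can be reproduced with simulated data in which the outcome is a smooth cubic function of the assignment variable and the true jump is zero. The cubic shape, the sample size, and the noise level are hypothetical choices; rd_jump is an illustrative helper, not a library function.

```python
import numpy as np

def rd_jump(x, y, degree):
    """Coefficient on D in y ~ 1 + D + x + ... + x**degree, with D = 1{x >= 0}."""
    d = (x >= 0).astype(float)
    X = np.column_stack([np.ones_like(x), d] + [x ** k for k in range(1, degree + 1)])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 2000)                  # assignment variable, cutoff at 0
y = x ** 3 + rng.normal(0, 0.05, x.size)      # smooth and nonlinear; true jump is zero

jump_linear = rd_jump(x, y, degree=1)         # misspecified: linear in x
jump_cubic = rd_jump(x, y, degree=3)          # matches the simulated relationship
print(round(jump_linear, 3), round(jump_cubic, 3))
```

The linear specification reports a sizeable discontinuity even though none exists, while the cubic specification reports an estimate near zero.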

Threats to RD: Nonlinear Functional Form

[Figure: test score (vertical axis, 20 to 80) for the treatment group and the control group; the final panels mark the treatment effect.]

RDD- Functional Form

• Visual inspection of the data: normalize the assignment variable (AV) by subtracting the cutoff from each observation's score

• Overfit the model, allowing for interaction terms as well

• This reduces power and requires a lot of data around the cutoff

• Sensitivity analysis to different functional forms


















• Non-parametric approaches: local linear regressions

• Sensitivity to bandwidth and kernel choice

• In semi-parametric approaches, a smooth function is estimated with splines and covariates can be controlled for
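A local linear regression for a sharp design can be sketched as kernel-weighted least squares on each side of the cutoff, repeated over several bandwidths to gauge sensitivity. The triangular kernel, the bandwidth grid, and the simulated data (true jump of 1.0) are assumptions made for illustration.

```python
import numpy as np

def local_linear_rd(x, y, h):
    """Sharp RD estimate at cutoff 0: gap between kernel-weighted linear fits at x = 0."""
    def boundary_fit(side):
        w = np.clip(1 - np.abs(x) / h, 0, None) * side   # triangular kernel weights
        keep = w > 0
        sw = np.sqrt(w[keep])                            # weighted least squares via sqrt weights
        X = np.column_stack([np.ones(keep.sum()), x[keep]]) * sw[:, None]
        return np.linalg.lstsq(X, y[keep] * sw, rcond=None)[0][0]
    return boundary_fit(x >= 0) - boundary_fit(x < 0)

rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, 5000)                  # assignment variable, cutoff normalized to 0
y = np.sin(2 * x) + 1.0 * (x >= 0) + rng.normal(0, 0.3, x.size)   # hypothetical jump of 1.0

estimates = {h: local_linear_rd(x, y, h) for h in (0.1, 0.2, 0.4, 0.8)}
for h, est in estimates.items():
    print(h, round(est, 3))
```

Narrow bandwidths use few observations and give noisier estimates; wide bandwidths use more data but lean harder on the linear approximation. Estimates that move a lot across bandwidths are a warning sign.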









RDD- Threats to the Validity

• The cutoffs should be unknown to the population

• Cutoffs should not be manipulated

• To test for manipulation, can perform McCrary’s test

• Other potential outcomes should be continuous to avoid alternative confounding interpretations

• Test for continuity of several available control variables

























Threats to RD: Manipulation of the Assignment Variable

McCrary Test (2008)

• Statistical test for a discontinuity in the density of the assignment variable at the cutoff point

• Assignment variable satisfying the McCrary test

[Figure: vertical axis from 0 to 0.5, horizontal axis from 46 to 54.]

• Assignment variable violating the McCrary test

[Figure: vertical axis from 0 to 0.5, horizontal axis from 46 to 54.]
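The intuition behind a manipulation check can be sketched with a crude bin comparison: count observations in narrow windows just below and just above the cutoff and ask whether the split is far from even. This is a simplified stand-in, not McCrary's test itself, which smooths the density with local linear regressions. The cutoff of 50, the window width, and the form of manipulation are hypothetical.

```python
import numpy as np

def density_jump_z(x, cutoff, width):
    """z-statistic for equal counts in [cutoff - width, cutoff) and [cutoff, cutoff + width)."""
    below = np.sum((x >= cutoff - width) & (x < cutoff))
    above = np.sum((x >= cutoff) & (x < cutoff + width))
    return (above - below) / np.sqrt(above + below)

rng = np.random.default_rng(6)
cutoff, width = 50.0, 0.5                     # hypothetical cutoff and window
clean = rng.uniform(46, 54, 10000)            # no manipulation: smooth density

# Hypothetical manipulation: about half of those just below the cutoff are bumped above it.
manipulated = clean.copy()
bumped = ((manipulated >= cutoff - width) & (manipulated < cutoff)
          & (rng.uniform(size=manipulated.size) < 0.5))
manipulated[bumped] += width

z_clean = density_jump_z(clean, cutoff, width)
z_manipulated = density_jump_z(manipulated, cutoff, width)
print(round(z_clean, 2), round(z_manipulated, 2))
```

A z-statistic near zero is consistent with a smooth density at the cutoff; a large positive value suggests bunching just above it.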

RDD- LATE Estimator

• The limitation of RDD - effect isolated at cutoff

• Cutoff may not be policy relevant or results may not be externally valid

• RD frontier can arise if cutoff varies by years or sites

• Can pool different cutoffs to get a more general estimate for the range over which the cutoff varies

• More generalizable but masks heterogeneity
