Teaching CoP: Teaching Causality in...

Blanchenay CoP 2019

Teaching Causality in ECO372

P. Blanchenay

Teaching and Learning Community of Practice

2019/05/27


(Data) science without (data) conscience is but the ruin of the soul.

— (counterfactual) Rabelais.


Angrist & Pischke “Through our classes darkly” (JEP, 2017)

Undergrad econometrics does not address causality

Focus on Gauss-Markov assumptions & their failures

Implicit or explicit focus on estimator efficiency

Little on identification strategies (e.g. diff-in-diff, RDD)

https://doi.org/10.1257/jep.31.2.125


Outline

Today: Approach to causal inference in econometrics

• Structure of ECO372

• Two frameworks to explain causality (not regressions)

• How I test students

• Causality beyond econometrics

Not today: settling debates

• reduced form vs. structural

• potential outcomes vs. causal graphs (DAGs)


ECO372 APPLIED REGRESSION ANALYSIS


ECO372 Applied Regression Analysis and Empirical Papers

Renamed in 2019H1: “Data Analysis and Applied Econometrics in Practice”

Objective: move from quantitative methods to applied empirical research

• Causal inference and identification

• Empirical strategies

• Stata, replication


Motivation #1: Getting teaching closer to practice

Many questions in economics are causal

Identification central in applied empirical work

Variety of empirical strategies (diff-in-diff, RDD…)

Reliability of findings

Classic econometrics instruction does not focus on this


Angrist & Pischke approach

Potential Outcomes (Roy-Rubin) causal model

Start with RCTs as “perfect setting”

Regression to deal with selection on observables

Quasi-experimental approaches:

Instrumental variables

Diff-in-diff, segue into Panel Data

Regression Discontinuity


Gauss-Markov assumptions failure

| GM assumption | “Classic” metrics | Angrist & Pischke |
| Linear model with mean zero errors | Get functional form right | CEF; linear approximation |
| Errors are homoskedastic | GLS | “Just add ‘robust’” |
| Errors are serially uncorrelated | GLS, time series | “Just add ‘robust’”; clustered SE in clustered RCTs |
| (Errors are normally distributed) | Alternative estimators | Focus on large sample / asymptotics |
| Exogeneity | Extensive discussion of measurement error; technical discussion of IV | Small discussion of measurement error; extensive focus on empirical strategies that yield CIA |


Motivation #2: Economics’ comparative advantage

Think hard about data!

Many disciplines do stats; not many causal inference

Big data not the solution to all problems


Course structure

RCTs as “perfect setting”

Regression to deal with selection on observables

Instrumental Variables

RCTs with imperfect compliance

Difference-in-differences

Regression Discontinuity


Causal frameworks

Two causal frameworks as Ariadne’s threads

• Potential Outcomes

• Causal graphs (DAGs)

Emphasis on identification assumptions


A different take on regressions

$y_i = \alpha + \beta D_i + \gamma W_i + \varepsilon_i$

Start with binary treatment

Not all regressors equal


Correlation and causation

Two distinct questions:

1. “If there were a correlation between 𝐷 and 𝑌, would this represent the effect of 𝐷 on 𝑌?”

• Tools: assumptions, DAGs, sometimes regressions

2. “Is there a correlation between 𝐷 and 𝑌?”

• Tools: regressions, statistical inference


TWO CAUSAL FRAMEWORKS

Potential Outcomes, DAGs


Combining two approaches

• Potential Outcomes (PO)

• Used by the textbook

• Causal graphs (mostly Pearl, 2000)


Potential Outcomes (PO) / Roy-Rubin

• Binary treatment 𝐷

• Potential outcomes: $Y_{0i}$ if untreated; $Y_{1i}$ if treated

• Treatment effect: $Y_{1i} - Y_{0i}$

• But we only ever observe one of $Y_{0i}$ or $Y_{1i}$ for a given unit


Potential Outcomes (PO) / Roy-Rubin

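Assume constant effect $\beta$: $Y_i = \alpha + \beta D_i + \varepsilon_i$

Baseline potential outcome: $Y_{0i} = \alpha + \varepsilon_i$

Then:

$\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed}} = \beta + \underbrace{E[\varepsilon_i \mid D_i = 1] - E[\varepsilon_i \mid D_i = 0]}_{\text{selection bias}}$

Conditional Independence Assumption (CIA): $E[\varepsilon_i \mid D_i = 0] = E[\varepsilon_i \mid D_i = 1]$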
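The decomposition above can be checked on simulated data. The following is a minimal sketch, not material from the course: the parameter values, the self-selection rule and the function name `simulate` are all assumptions chosen for illustration. It computes the observed difference in means and the selection-bias term, once when units select into treatment on 𝜀 and once when treatment is randomized.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n=20000, alpha=1.0, beta=2.0, randomized=False, seed=0):
    """Return (observed difference in means, selection-bias term)."""
    rng = random.Random(seed)
    y_by_d = {0: [], 1: []}
    eps_by_d = {0: [], 1: []}
    for _ in range(n):
        eps = rng.gauss(0, 1)
        if randomized:
            d = 1 if rng.random() < 0.5 else 0          # coin flip, as in an RCT
        else:
            d = 1 if eps + rng.gauss(0, 1) > 0 else 0   # assumed rule: selection on eps
        y_by_d[d].append(alpha + beta * d + eps)        # constant-effect model
        eps_by_d[d].append(eps)
    observed = mean(y_by_d[1]) - mean(y_by_d[0])
    selection_bias = mean(eps_by_d[1]) - mean(eps_by_d[0])
    return observed, selection_bias


for label, flag in (("self-selected", False), ("randomized", True)):
    observed, bias = simulate(randomized=flag)
    print(f"{label}: observed = {observed:.3f}, beta + bias = {2.0 + bias:.3f}")
```

In both cases the observed difference equals $\beta$ plus the selection-bias term; only under randomization is that term close to zero, which is what the CIA requires.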


All design assumptions deal with CIA

Diff-in-diff: common trend assumption

Absent treatment, treated units would have evolved the same way as untreated units

$y_{it} = \alpha + \beta \cdot TREAT_i + \gamma \, POST_t + \delta \, (TREAT \times POST)_{it} + \varepsilon_{it}$

CIA: $E[\varepsilon_{it} \mid TREAT, POST] = E[\varepsilon_{it}]$
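A short worked derivation may help show why $\delta$ is the diff-in-diff estimand. It is a sketch that uses only the regression equation and the CIA stated on this slide: taking expectations in each of the four TREAT × POST cells, the common term $E[\varepsilon_{it}]$ cancels in every difference.

```latex
\begin{align*}
E[y_{it} \mid TREAT_i = 0, POST_t = 0] &= \alpha + E[\varepsilon_{it}] \\
E[y_{it} \mid TREAT_i = 1, POST_t = 0] &= \alpha + \beta + E[\varepsilon_{it}] \\
E[y_{it} \mid TREAT_i = 0, POST_t = 1] &= \alpha + \gamma + E[\varepsilon_{it}] \\
E[y_{it} \mid TREAT_i = 1, POST_t = 1] &= \alpha + \beta + \gamma + \delta + E[\varepsilon_{it}] \\[4pt]
\text{DiD} &= \big[(\alpha + \beta + \gamma + \delta) - (\alpha + \beta)\big]
            - \big[(\alpha + \gamma) - \alpha\big] = \delta
\end{align*}
```

If the CIA fails, for instance because treated units follow a different trend, the error means differ across cells and no longer cancel.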


Directed Acyclical Graphs (DAGs)

• X has a (possible) effect on D, and on Y

• D has an effect on Y

• 𝜀 is unobserved (and has an effect on Y)

• Typically omitted from graph

• Directed: causal relationships have a direction (effect of X on Y)

• Acyclical: forbids cycles

[Figures: an example of a forbidden cycle among 𝐷, 𝑋 and 𝑌; a DAG with nodes 𝑋, 𝐷, 𝑌 and 𝜀]


Terminology

𝑋 is a confounder (common cause): 𝐷 ← 𝑋 → 𝑌

𝑋 is a collider (common outcome): 𝐷 → 𝑋 ← 𝑌


Causal paths

2 causal paths from D to Y:

Direct path: 𝐷 → 𝑌

Backdoor path: 𝐷 ← 𝑋 → 𝑌

[Figure: DAG with nodes 𝐷, 𝑋, 𝑌 and 𝜀]


Open causal paths

• Open if either:

• There is no collider on the path

• There is a collider 𝑋, and we control for it / hold it constant

[Figures: example DAGs with nodes 𝐷, 𝑋, 𝑌 and with nodes 𝐷, 𝑊, 𝑌, 𝑋]


Closed causal paths

• Closed if either:

• There is a collider on the path

• We control for a non-collider on the path

[Figures: example DAGs with nodes 𝐷, 𝑊, 𝑌, 𝑋 and with nodes 𝐷, 𝑊, 𝑌]
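The open/closed rules can be written as a small function. This is a sketch under simplifying assumptions: the path encoding and the name `is_open` are hypothetical, and it ignores the finer point that controlling for a descendant of a collider can also open a path.

```python
def is_open(nodes, arrows, controlled=frozenset()):
    """Decide whether a single path is open.

    nodes:      the nodes along the path, e.g. ["D", "X", "Y"]
    arrows:     arrows[k] is the arrow between nodes[k] and nodes[k + 1],
                written "->" or "<-" as read from left to right
    controlled: the set of nodes we control for / hold constant
    """
    for k in range(1, len(nodes) - 1):
        # An interior node is a collider when both neighbouring arrows point into it.
        collider = arrows[k - 1] == "->" and arrows[k] == "<-"
        held = nodes[k] in controlled
        if collider and not held:
            return False   # an uncontrolled collider closes the path
        if not collider and held:
            return False   # controlling for a non-collider closes the path
    return True


# Backdoor path through a confounder: D <- X -> Y
print(is_open(["D", "X", "Y"], ["<-", "->"]))          # open
print(is_open(["D", "X", "Y"], ["<-", "->"], {"X"}))   # closed by controlling for X

# Path through a collider: D -> X <- Y
print(is_open(["D", "X", "Y"], ["->", "<-"]))          # closed
print(is_open(["D", "X", "Y"], ["->", "<-"], {"X"}))   # re-opened by controlling for X
```

The last call is the "bad control" case: a path that was harmless becomes a source of correlation once the collider is held constant.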


Open paths create correlations

𝐴 and 𝐵 are correlated because of open paths:

• 𝐴 → 𝐵

• 𝐴 → 𝐷 ← 𝐵

The correlation between 𝐴 and 𝐵 does not represent only the direct effect of 𝐴 on 𝐵

[Figure: DAG with nodes 𝐴, 𝐵, 𝐶, 𝐷, 𝐸, 𝐹]


DAGs and identification

Backdoor criterion (sufficient)

The covariance between 𝐷 and 𝑌 identifies the causal effect of 𝐷 on 𝑌 if all backdoor paths from 𝐷 to 𝑌 are closed.

Identification strategies try to rule out backdoor paths


Different benefits

• Potential Outcomes

• Easy to talk about counterfactuals

• Neat interpretable algebra, formula for bias

• Causal graphs

• Visual

• Connects the assumptions of each empirical strategy

• Offers immediate reasoning about control variables


Collider bias (~ “bad controls”)

• Controlling for a collider (common outcome) re-opens a causal path

Related names: collider bias, “bad controls”, endogenous selection bias, Simpson’s paradox

The (conditional) correlation between 𝐷 and 𝑌 then does not reflect the causal effect of 𝐷 on 𝑌

[Figure: DAG with nodes 𝐷, 𝑋, 𝑌, where 𝑋 is a collider]


Collider bias (~ “bad controls”)

| | (1) SAT Maths | (2) SAT Maths |
| SAT Verbal | 0.029 (0.0364) | -0.251*** (0.0350) |
| Accepted | | 0.598*** (19.26) |
| Observations | 800 | 800 |
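The pattern in this table can be reproduced qualitatively with simulated data. This is a sketch and not the data behind the slide: it assumes standardized, independent verbal and maths scores and a hypothetical admission rule (accepted when the two scores sum to more than zero). It also conditions on the collider by restricting to the accepted subsample, rather than adding Accepted as a regressor as in column (2).

```python
import random


def slope(x, y):
    """OLS slope from a bivariate regression of y on x (with intercept)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_admissions(n=5000, seed=1):
    """Return (slope in the full sample, slope among accepted applicants)."""
    rng = random.Random(seed)
    verbal = [rng.gauss(0, 1) for _ in range(n)]
    maths = [rng.gauss(0, 1) for _ in range(n)]        # independent of verbal by design
    accepted = [v + m > 0 for v, m in zip(verbal, maths)]  # collider: caused by both scores
    verbal_acc = [v for v, a in zip(verbal, accepted) if a]
    maths_acc = [m for m, a in zip(maths, accepted) if a]
    return slope(verbal, maths), slope(verbal_acc, maths_acc)


overall, among_accepted = simulate_admissions()
print(f"all applicants: {overall:.3f}, accepted only: {among_accepted:.3f}")
```

The two scores are unrelated by construction, yet a clearly negative slope appears once we condition on admission, because admission is a common outcome of both.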


2 frameworks = 2 Ariadne’s threads

Identification strategies rule out

• Violation of CIA

• Open backdoor paths

Examples:

• RCT

• Multivariate regression (control variables)

• Individual fixed effects

• Instrumental variables


TESTING STUDENTS


How I test students’ understanding

Questions on specific papers/studies

• If the researchers estimate Eq(1), would $\hat{\beta}$ estimate the causal effect of 𝑋 on 𝑌? Why or why not?

Make students create data and then run estimations

True or False questions (h/t Karen Bernhardt-Walther)


Effect of Facezon HQ on wages

Q1: Generate wages according to:

$w_{it} = 10 + 1.3 \, HQ_{it} + 0.2 \, (year_t \times CityA_i) + 0.6 \, (year_t \times CityB_i) + \varepsilon_{it}$

Q2: You receive the data on wages in each city. How would you estimate the effect of Facezon HQ on wages? Estimate diff-in-diff:

$w_{it} = \alpha + \beta \, CityB_i + \gamma \, POST2016_t + \delta \, (CityB \times POST2016)_{it} + u_{it}$

Does your estimate $\hat{\delta}$ correspond to what you expected from Q1? Why or why not?
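One way to work through this exercise is sketched below in Python (the course itself uses Stata). Several details are assumptions that the slide does not state: the sample covers 2011 to 2020, $year_t$ is an index running from 0 to 9, the HQ opens in City B in 2016, and the noise is normal with standard deviation 0.5. The diff-in-diff is computed from the four cell means, which equals the OLS estimate of $\delta$ in the saturated specification of Q2.

```python
import random


def simulate_did(workers_per_city_year=500, noise_sd=0.5, seed=2):
    """Generate wages as in Q1 and return the diff-in-diff estimate of Q2."""
    rng = random.Random(seed)
    cells = {(city, post): [] for city in "AB" for post in (0, 1)}
    for t, calendar_year in enumerate(range(2011, 2021)):   # assumed year index t = 0..9
        post = int(calendar_year >= 2016)
        for city, trend in (("A", 0.2), ("B", 0.6)):         # city-specific trends from Q1
            hq = int(city == "B" and post == 1)              # assumed: HQ in City B from 2016
            for _ in range(workers_per_city_year):
                wage = 10 + 1.3 * hq + trend * t + rng.gauss(0, noise_sd)
                cells[(city, post)].append(wage)
    m = {key: sum(v) / len(v) for key, v in cells.items()}
    return (m[("B", 1)] - m[("B", 0)]) - (m[("A", 1)] - m[("A", 0)])


print(f"diff-in-diff estimate: {simulate_did():.2f} (true HQ effect: 1.3)")
```

Under these assumptions the estimate lands near 1.3 + (0.6 − 0.2) × 5 = 3.3 rather than 1.3: the two cities have different trends, so the common trend assumption fails and the trend gap is loaded onto $\hat{\delta}$.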


True/false questions

For an RCT on the effect of receiving food stamps on the decision to work, participants were recruited at Whole Foods and No Frills supermarkets. True or false? In that RCT, one should not control for the recruitment location, as this is a ‘bad control’.

The Ontario government considers offering a subsidy for childcare to families that fall below $40,000 of yearly joint income. True or false? Families are likely to under-report their income in order to qualify, but a researcher could always use an instrumental variable approach to estimate the effect of the childcare subsidy.


Trade-offs

Few proofs & little maths (students selection)

Little discussion of heteroskedasticity

• Just add vce(robust) or vce(cluster …)

No time series

OLS only (IV as 2SLS)

Little discussion of heterogeneous effects


CAUSAL INFERENCE BEYOND ECONOMETRICS


Causal inference beyond econometrics

• An economic theory generates causal statements

• Empirics allow us to sort between theories

• “↗ Min wage ⇒ ↗ unemployment”

• How would you test that?


Example: Price elasticity

https://www.dropbox.com/s/8nujfq892ut5a37/Lecture%2016%20Estimating%20Elasticity.pptx?dl=


Example: Demand elasticity

• We only observe equilibrium values of P,Q

• How can we find demand elasticity?


[Figure: price P against quantity Q, with two observed equilibria (Q1, P1) and (Q2, P2)]


Example: Demand elasticity

Which is it?


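[Figure, two panels: price P against quantity Q, each with equilibria (Q1, P1) and (Q2, P2)]

Scenario 1: Less elastic demand, positive supply shock (supply shifts from S1 to S2 along demand D).

Scenario 2: More elastic demand, positive supply shock and negative demand shock (supply shifts from S1 to S2, demand from D1 to D2).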


Example: Identifying demand elasticity

• Suppose you have information on average price and quantity of bread sold per month in Cleveland when there are 30 bakeries. Suppose three new bakeries open on April 1st, increasing supply for April.

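• $\varepsilon_D = - \dfrac{(Q_{APR} - Q_{MAR}) / Q_{MAR}}{(P_{APR} - P_{MAR}) / P_{MAR}}$

[Figure: supply shifts from S1 to S2 along a fixed demand curve D; quantity moves from Q1 to Q2]

• If we are sure that an elasticity is estimated from an exogenous shock to supply only, we say it is identified.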
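As a quick numerical illustration of the elasticity formula, here is a minimal sketch; the March and April figures are made up for the example and the function name is hypothetical.

```python
def demand_elasticity(q_mar, q_apr, p_mar, p_apr):
    """Demand elasticity from March-to-April changes, with the slide's sign convention."""
    pct_change_q = (q_apr - q_mar) / q_mar
    pct_change_p = (p_apr - p_mar) / p_mar
    return -pct_change_q / pct_change_p


# Hypothetical numbers: quantity rises from 1000 to 1100 loaves (+10%)
# while the average price falls from 4.00 to 3.80 (-5%).
print(demand_elasticity(q_mar=1000, q_apr=1100, p_mar=4.00, p_apr=3.80))
```

With these numbers the elasticity is about 2. The arithmetic is only meaningful as a demand elasticity if the bakery openings shifted supply and nothing shifted demand between March and April.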


N. Huntington-Klein, ECO305: Economics, Causality, and Analytics

• Focus on causal inference and programming

• No regression!

• Controlling done through subsamples/matching

http://nickchk.com/econ305.html


Some resources

Angrist-Pischke / Potential Outcomes

• Textbooks: Mostly Harmless Econometrics, Mastering ‘Metrics

• Angrist & Pischke (2017), Journal of Economic Perspectives, “Through our classes darkly”.

Directed Acyclical Graphs

• Scott Cunningham (regularly updated) “Causal Inference: The Mixtape”, particularly section 3: accessible intro to DAGs

• Nick Huntington-Klein ECO305: causal inference without regressions; causal graphs examples of common empirical strategies

• Morgan & Winship (2007, 2nd ed 2015) Counterfactuals and Causal Inference: combines Potential Outcomes & DAGs, focus on economics

• Judea Pearl, Causality (2000, 2nd ed 2009), particularly chapter 3: full formalism of DAGs & causal inference

https://search.library.utoronto.ca/details?6716526&uuid=d007e6b9-df6a-49f7-9c5b-6ebd763c58d5

https://search.library.utoronto.ca/details?9956541

https://doi.org/10.1257/jep.31.2.125

https://www.scunning.com/mixtape.html

http://nickchk.com/econ305.html

http://nickchk.com/causalgraphs.html

https://search.library.utoronto.ca/details?9921161


Thank you!


EXAMPLES


RCT

• Ensures $E[\varepsilon_i \mid D_i = 1] = E[\varepsilon_i \mid D_i = 0]$

• CIA satisfied

• Ensures no causal path between 𝐷 and other covariates

• Backdoor criterion satisfied

[Figure: DAG with nodes randomized 𝐷, 𝑈, 𝑋1, 𝑋2 and 𝑌]


Control variables

• DGP: $Y_i = \alpha + \beta D_i + \gamma X_i + e_i$

• Run: $Y_i = \alpha + \beta D_i + u_i$

• Omitted Variable Bias if $E[u_i \mid D_i = 1] \neq E[u_i \mid D_i = 0]$

• Controlling for 𝑋 closes causal path 𝐷 ← 𝑋 → 𝑌

[Figure: DAG with nodes 𝐷, 𝑋 and 𝑌]
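The omitted variable bias on this slide can be illustrated with a short simulation. This is a sketch with assumed parameter values, a binary confounder 𝑋, and treatment probabilities chosen for the example. Instead of adding 𝑋 to a regression, it controls for 𝑋 by comparing treated and untreated units within each value of 𝑋, which serves the same purpose when 𝑋 is binary.

```python
import random


def mean(values):
    return sum(values) / len(values)


def difference_in_means(rows):
    """Mean outcome of treated minus mean outcome of untreated; rows are (x, d, y)."""
    treated = [y for _, d, y in rows if d == 1]
    untreated = [y for _, d, y in rows if d == 0]
    return mean(treated) - mean(untreated)


def simulate_controls(n=20000, alpha=1.0, beta=2.0, gamma=3.0, seed=3):
    """Return (estimate ignoring X, estimate holding X constant)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        d = int(rng.random() < (0.8 if x == 1 else 0.2))   # X -> D (assumed probabilities)
        y = alpha + beta * d + gamma * x + rng.gauss(0, 1)  # X -> Y and D -> Y
        rows.append((x, d, y))
    naive = difference_in_means(rows)
    strata = [[row for row in rows if row[0] == value] for value in (0, 1)]
    adjusted = sum(len(s) / n * difference_in_means(s) for s in strata)
    return naive, adjusted


naive, adjusted = simulate_controls()
print(f"ignoring X: {naive:.2f}, holding X constant: {adjusted:.2f}, true beta: 2.0")
```

With these values the unadjusted comparison is pulled towards 2 + 3 × (0.8 − 0.2) = 3.8, while the within-𝑋 comparison is close to the true effect of 2, since holding 𝑋 constant closes the path 𝐷 ← 𝑋 → 𝑌.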


Fixed effects (within) in panel data

• Controlling for individual closes backdoor path

[Figures: two DAGs, one with nodes 𝑈, 𝑋, Individual, 𝑌, Time and one with nodes 𝑋, Individual, 𝑌, Time]
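A small simulation can show what the within transformation does. It is a sketch under assumed values: each individual has a fixed unobserved trait that raises both 𝑋 and 𝑌, and the function names are hypothetical. Subtracting each individual's own means from 𝑋 and 𝑌 is numerically equivalent to including individual fixed effects.

```python
import random


def slope(x, y):
    """OLS slope from a bivariate regression of y on x (with intercept)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_panel(n_individuals=1000, n_periods=5, beta=2.0, seed=4):
    """Return (pooled OLS slope, within / fixed-effects slope)."""
    rng = random.Random(seed)
    x_pooled, y_pooled, x_within, y_within = [], [], [], []
    for _ in range(n_individuals):
        u = rng.gauss(0, 1)                          # time-invariant, unobserved
        xs = [u + rng.gauss(0, 1) for _ in range(n_periods)]
        ys = [beta * x + 3.0 * u + rng.gauss(0, 1) for x in xs]
        x_bar, y_bar = sum(xs) / n_periods, sum(ys) / n_periods
        x_pooled += xs
        y_pooled += ys
        x_within += [x - x_bar for x in xs]          # demean within individual
        y_within += [y - y_bar for y in ys]
    return slope(x_pooled, y_pooled), slope(x_within, y_within)


pooled, within = simulate_panel()
print(f"pooled OLS: {pooled:.2f}, within estimator: {within:.2f}, true beta: 2.0")
```

The pooled slope is biased upwards (towards 3.5 with these values) because the fixed trait is a common cause of 𝑋 and 𝑌; demeaning removes anything constant within an individual and the slope returns to roughly 2.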


Instrumental Variables

Req1 (first stage): instrument 𝑍 has an effect on 𝐷

Req2 (exogeneity): 𝑍 is as good as randomly assigned

• No unobserved confounder between 𝑍 and 𝑌

Req3 (exclusion restriction):

• No other causal path between 𝑍 and 𝑌

[Figure: DAG with nodes 𝑍, 𝐷, 𝑈 and 𝑌]
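The three requirements can be seen at work in a simulated example. This is a sketch with assumed values: a randomly assigned binary instrument 𝑍, an unobserved confounder 𝑈 that affects both 𝐷 and 𝑌, and a constant treatment effect. It uses the Wald estimator (reduced form divided by first stage), which coincides with 2SLS when there is one binary instrument and no covariates.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate_iv(n=40000, beta=2.0, seed=5):
    """Return (naive difference in means by D, Wald / IV estimate)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = int(rng.random() < 0.5)                   # Req2: Z is randomly assigned
        u = rng.gauss(0, 1)                           # unobserved confounder
        d = int(2 * z + u + rng.gauss(0, 1) > 1)      # Req1: Z shifts D; U also shifts D
        y = 1.0 + beta * d + u + rng.gauss(0, 1)      # Req3: Z affects Y only through D
        rows.append((z, d, y))

    def gap(outcome, group):
        """Mean of column `outcome` where column `group` is 1, minus where it is 0."""
        ones = [r[outcome] for r in rows if r[group] == 1]
        zeros = [r[outcome] for r in rows if r[group] == 0]
        return mean(ones) - mean(zeros)

    naive = gap(outcome=2, group=1)                   # E[Y|D=1] - E[Y|D=0]
    reduced_form = gap(outcome=2, group=0)            # E[Y|Z=1] - E[Y|Z=0]
    first_stage = gap(outcome=1, group=0)             # E[D|Z=1] - E[D|Z=0]
    return naive, reduced_form / first_stage


naive, wald = simulate_iv()
print(f"naive: {naive:.2f}, IV (Wald): {wald:.2f}, true beta: 2.0")
```

The naive comparison overstates the effect because 𝑈 opens the backdoor path 𝐷 ← 𝑈 → 𝑌; the Wald ratio uses only the variation in 𝐷 that comes from 𝑍 and lands close to the true effect.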
