Nonparametric testing for exogeneity with discrete regressors and instruments

Katarzyna Bech and Grant Hillier
Warsaw School of Economics and University of Southampton

July 8, 2016
Outline

1 Motivation.
2 Simplest nonparametric additive error model: setup, identification, estimation.
3 Two test statistics and critical values computation.
4 Generalization to several variables of each type: tested, exogenous, instrument.
5 Applications: Card (1995) and Angrist and Krueger (1991).
Motivation

- Endogeneity is one of the most common problems in econometric models.
- In nonparametric models with discrete regressors and instruments, the presence of endogenous regressors produces bias (in the identified case) or non-existence of any consistent estimator (in the partially identified case).
- IV for nonparametric models with discrete regressors: Das (2005) and Florens and Malavolti (2003).
- Nonparametric testing for exogeneity with continuous regressors: Blundell and Horowitz (2007), Lavergne and Patilea (2008), among others.
Simple model

Nonparametric additive error model:

Y = h(X) + ε,
E[ε | Z = z_j] = 0, ∀j,

where we have i.i.d. data (x_i, y_i, z_i) on (X, Y, Z), and
- Y is a continuous scalar dependent variable,
- X is a single discrete regressor with support {x_k, k = 1, ..., K} that may be endogenous, with associated probabilities p_k > 0,
- Z is a discrete instrumental variable with support {z_j, j = 1, ..., J}, with associated probabilities q_j > 0.
Hypothesis of interest

Null hypothesis (exogeneity):

H_0: E[ε | X = x_k] = 0, k = 1, ..., K.

- Under the null, h(·) can be consistently estimated using standard nonparametric techniques.
- Under the alternative, the IV solution to endogeneity is only possible under point identification.
Identification

Since

Y = Σ_{k=1}^K h(x_k) I(X = x_k) + ε,

the conditional expectation of Y given Z = z_j is

E[Y | Z = z_j] = Σ_{k=1}^K Pr[X = x_k | Z = z_j] h(x_k).

⇒ the instrument Z supplies the equations

π = Πβ,

where β_k = h(x_k), π_j = E[Y | Z = z_j], Π_jk = P[X = x_k | Z = z_j].
h(·) is identified at ALL support points of X iff J ≥ K.
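The system π = Πβ can be sketched numerically. A minimal illustration (all numbers below are hypothetical, not from the paper): when J ≥ K and Π has full column rank, β = (h(x_1), ..., h(x_K))' is recovered exactly from the reduced-form means.

```python
import numpy as np

# Hypothetical J = K = 3 example: rows of Pi are P(X = x_k | Z = z_j).
Pi = np.array([[0.6, 0.3, 0.1],
               [0.3, 0.4, 0.3],
               [0.1, 0.3, 0.6]])
beta_true = np.array([1.0, 2.0, 4.0])   # beta_k = h(x_k), made up
pi = Pi @ beta_true                      # pi_j = E[Y | Z = z_j]

# With J >= K and Pi of full column rank, beta is point-identified:
beta = np.linalg.lstsq(Pi, pi, rcond=None)[0]
print(np.allclose(beta, beta_true))  # True
```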
Identification when J < K

h(·) is partially identified when J < K:

Theorem 1
Let L(β) = c'β be a linear functional of the elements of β. When rank(Π) = J < K, the following are true:
(1) for any c orthogonal to the null space of Π, L(β) is point-identified; the dimension of this set is J;
(2) for c not orthogonal to the null space of Π, L(β) is completely unconstrained; the dimension of this set is K − J.

That is: when J < K, some linear functionals are point-identified, some are completely arbitrary (not even set-identified!).
Point-identifiability of L(β) can be tested (for a given choice of c):

G_n = n (c_2' − c_1' Π̂_1^{-1} Π̂_2) V̂_P^{-1} (c_2' − c_1' Π̂_1^{-1} Π̂_2)' →_d χ²_{K−J}.
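The orthogonality condition in Theorem 1 is easy to check numerically. A sketch with a hypothetical Π (J = 2 < K = 3): a basis for the null space of Π is taken from the SVD, and c is identified iff it is orthogonal to that basis.

```python
import numpy as np

# Hypothetical Pi with J = 2 rows, K = 3 columns; null space has dim K - J = 1.
Pi = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.3, 0.5]])

# Rows of Vt beyond the rank span null(Pi):
_, s, Vt = np.linalg.svd(Pi)
null_basis = Vt[2:]

c_id = np.array([7.0, 3.0, 0.0])     # orthogonal to null(Pi): identified
c_no = np.array([1.0, -1.0, 0.0])    # h(x_2) - h(x_1): NOT identified for this Pi
print(np.allclose(null_basis @ c_id, 0))   # True
print(np.allclose(null_basis @ c_no, 0))   # False
```

Here the null space is spanned by (3, −7, 3), so the difference h(x_2) − h(x_1) happens not to be point-identified under this particular (made-up) Π.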
Linear Model Setup

We define the (0, 1) matrix L_X (n × K) with elements (L_X)_{ik} = I(x_i = x_k). Likewise L_Z (n × J). Then H_0 says:

y = L_X β + ε,  E[ε | X = x_k] = 0 ∀k.

β can be consistently estimated by OLS under exogeneity:

β̂ = (L_X' L_X)^{-1} L_X' y = (Σ_{i=1}^n y_i I(x_i = x_1) / Σ_{i=1}^n I(x_i = x_1), ..., Σ_{i=1}^n y_i I(x_i = x_K) / Σ_{i=1}^n I(x_i = x_K))',

i.e. the vector of subsample means of y at the K support points of X.

Theorem 2
If X is exogenous then the nonparametric (OLS) estimator β̂ is consistent and

√n (β̂ − β) →_d N(0, σ² D_X^{-1}),

where D_X is diag(p_k).
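The equality between the OLS formula and the cell means can be verified on simulated data. A minimal sketch (simulated values, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# K = 3 support points; h values are made up for illustration.
n, K = 1000, 3
h = np.array([1.0, 2.0, 4.0])
x = rng.integers(0, K, size=n)            # exogenous discrete regressor
y = h[x] + rng.normal(0, 0.5, size=n)

L_X = (x[:, None] == np.arange(K)).astype(float)   # n x K indicator matrix
beta_ols = np.linalg.solve(L_X.T @ L_X, L_X.T @ y)

# Identical to the subsample means of y at each support point:
means = np.array([y[x == k].mean() for k in range(K)])
print(np.allclose(beta_ols, means))  # True
```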
Linear Model Setup

...or by IV (with L_Z as instruments) under endogeneity, when J ≥ K:

β̂_IV = (L_X' P_{L_Z} L_X)^{-1} L_X' P_{L_Z} y.

Theorem 3
Under the assumptions above, the IV estimator β̂_IV is consistent and

√n (β̂_IV − β) →_d N(0, σ² (P' D_Z^{-1} P)^{-1}),

where P is the J × K matrix of joint probabilities with elements

p_jk = Pr[Z = z_j, X = x_k], j = 1, ..., J; k = 1, ..., K,

and D_Z is diag(q_j).

BUT no consistent estimator exists for K − J linear functionals if X is endogenous and J < K.
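A sketch of β̂_IV on simulated endogenous data (J = K = 2; all numbers hypothetical). The n × n projection P_{L_Z} is never formed explicitly; the quadratic forms are computed in factored form.

```python
import numpy as np

rng = np.random.default_rng(1)

# z is a valid binary instrument; x is endogenous through the shared error u.
n = 20_000
z = rng.integers(0, 2, size=n)
u = rng.normal(size=n)
x = ((0.5 * z + 0.4 * u + rng.normal(size=n)) > 0.3).astype(int)
h = np.array([1.0, 3.0])                 # made-up h(x_k)
y = h[x] + u                             # E[u | z] = 0 but E[u | x] != 0

L_X = (x[:, None] == np.arange(2)).astype(float)
L_Z = (z[:, None] == np.arange(2)).astype(float)
# L_X' P_LZ L_X and L_X' P_LZ y without building the n x n projection:
A = L_X.T @ L_Z @ np.linalg.inv(L_Z.T @ L_Z)
beta_iv = np.linalg.solve(A @ L_Z.T @ L_X, A @ (L_Z.T @ y))
beta_ols = np.linalg.solve(L_X.T @ L_X, L_X.T @ y)
print(beta_iv, beta_ols)   # beta_iv near h; beta_ols is distorted by E[u | x]
```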
Test for exogeneity

Test statistics differ depending on whether J ≥ K or J < K.
For J ≥ K, (a modified version of) the Wu-Hausman test:

Theorem 4
Under H_0 and the assumptions above,

T_n →_d χ²_{K−1}.

Theorem 5
Under the sequence of local alternatives and the assumptions above, the test statistic

T_n →_d Gamma(α, λ, θ),

with shape parameter α = (K−1)/2, scale parameter θ = 2σ̄²/σ², and noncentrality parameter λ = 2δ², where

δ² = ξ' Σ_{11}^{-1} ξ / σ².
Test for exogeneity

For J < K, the test is based on two SSEs:
- Unrestricted: in the model y = L_X β + ε, i.e., y' M_{L_X} y, and
- Restricted: minimising the SSE in this model subject to π = Πβ.

Test statistic:

R_n = [y' M_{L_X} L_Z (L_Z' P_{L_X} L_Z)^{-1} L_Z' M_{L_X} y] / [n^{-1} y' M_{L_X} y].
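The statistic is straightforward to compute once the indicator matrices are built. A minimal sketch on simulated data (hypothetical values; X is generated exogenously here, so H_0 holds):

```python
import numpy as np

rng = np.random.default_rng(2)

# K = 3 > J = 2; z shifts the distribution of x so the instrument is relevant.
n, K, J = 500, 3, 2
z = rng.integers(0, J, size=n)
x = z + rng.integers(0, 2, size=n)          # support {0, 1, 2}
y = np.array([1.0, 2.0, 4.0])[x] + rng.normal(size=n)

L_X = (x[:, None] == np.arange(K)).astype(float)
L_Z = (z[:, None] == np.arange(J)).astype(float)
P_LX = L_X @ np.linalg.solve(L_X.T @ L_X, L_X.T)   # projection onto col(L_X)
M_LX = np.eye(n) - P_LX                             # residual maker

a = L_Z.T @ (M_LX @ y)                      # L_Z' M_LX y
B = L_Z.T @ P_LX @ L_Z                      # L_Z' P_LX L_Z
Rn = a @ np.linalg.solve(B, a) / (y @ M_LX @ y / n)
print(Rn)                                   # nonnegative by construction
```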
Test for exogeneity

Theorem 6
Under H_0 and the assumptions above,

R_n →_d z' Ω^{-1} z ~ Σ_{j=1}^{J−1} ω_j χ²_j(1),

where z ~ N(0, Σ), with Σ as defined above,

Ω := C_J' (P D_X^{-1} P' − p_Z p_Z') C_J,

and the ω_j are positive eigenvalues satisfying

det[Σ − ω Ω] = 0,

with the χ²_j(1) variables independent copies of a χ²_1 random variable.
Critical value computation

- Using consistent estimates ω̂_j, simulate the distribution of Σ_{j=1}^{J−1} ω̂_j χ²_j(1) to get the appropriate 1 − α quantiles,
- simulate the quadratic form z' Ω̂^{-1} z, with z ~ N(0, Σ̂), and compute the quantiles,
- approximate by the distribution of aχ²(v) + b, choosing (a, b, v) to match the first three cumulants.
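The first and third methods can be sketched as follows (the weights ω̂_j below are hypothetical). The r-th cumulant of Σ_j ω_j χ²_j(1) is 2^(r−1) (r−1)! Σ_j ω_j^r, which pins down (a, b, v) for the approximating aχ²(v) + b.

```python
import numpy as np

rng = np.random.default_rng(3)

# Method 1: simulate the weighted chi-square mixture and read off quantiles.
omega = np.array([1.8, 0.6])                   # hypothetical J - 1 = 2 weights
draws = rng.chisquare(1, size=(200_000, omega.size)) @ omega
cv = np.quantile(draws, [0.90, 0.95, 0.99])    # 10%, 5%, 1% critical values
print(cv)

# Method 3: three-cumulant match to a * chi2(v) + b, whose cumulants are
# k1 = a*v + b, k2 = 2*a^2*v, k3 = 8*a^3*v.
k1, k2, k3 = omega.sum(), 2 * (omega**2).sum(), 8 * (omega**3).sum()
a = k3 / (4 * k2)
v = 8 * k2**3 / k3**2
b = k1 - a * v
print(a, v, b)
```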
Generalizations: model with two discrete regressors

Y = h(W, X) + ε,
E[ε | Z = z_j, W = w_d] = 0, ∀j, d.

We define L_WX (n × DK) with elements (L_WX)_{i,dk} = I(W_i = w_d) I(X_i = x_k). Likewise L_WZ (n × DJ), and H_0 says:

y = L_WX β + ε,  E[ε | W = w_d, X = x_k] = 0 ∀d, k.
Structure of regression matrix

L_WX is a permutation of the rows of the block-diagonal matrix

diag(L_1X, L_2X, ..., L_DX).

- Observations corresponding to L_dX all have W = w_d, and rows identify which values of X occur where.
- Similarly for L_WZ.
- We assume that all possible combinations of the K support points of X, J support points of Z and D support points of W occur in the sample.
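The indicator matrix L_WX is built by mapping each (w, x) pair to the lexicographic column index d·K + k. A small sketch with made-up data:

```python
import numpy as np

# Hypothetical sample: D = 2 values of W, K = 3 values of X, n = 6 observations.
D, K = 2, 3
w = np.array([0, 0, 1, 1, 0, 1])
x = np.array([0, 2, 1, 0, 1, 2])

L_WX = np.zeros((w.size, D * K))
L_WX[np.arange(w.size), w * K + x] = 1.0   # column (d, k) has index d*K + k

# Each row flags exactly one (W, X) cell; sorting the rows by W would expose
# the block-diagonal structure diag(L_1X, ..., L_DX).
print(L_WX.sum(axis=1))   # all ones
```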
Identification in general model

- The vector with elements h(w_d, x_k) can be split into D K × 1 vectors, with component vectors h_d(x_k), say, one for each w_d.
- So we have D problems of the same type as the case with W absent. [Split y into D subvectors y_d.]
- The instrument is valid for each subsample.
- For h to be point-identified, each h_d must be, so the condition is again J ≥ K.
Most general model

Now assume several variables of each type: X_1, ..., X_R, W_1, ..., W_U, Z_1, ..., Z_T, with respective supports of dimensions K_r, S_u, J_t.
We want to test the joint endogeneity of (X_1, ..., X_R).
Label combinations of support points thus:

x_α = (x_{α_1}, ..., x_{α_R}), 1 ≤ α_r ≤ K_r,
w_β = (w_{β_1}, ..., w_{β_U}), 1 ≤ β_u ≤ S_u,
z_γ = (z_{γ_1}, ..., z_{γ_T}), 1 ≤ γ_t ≤ J_t.

Order the sequences lexicographically.
The setup is indistinguishable from the case of one variable of each type, except that

J = Π_{t=1}^T J_t,  K = Π_{r=1}^R K_r,  S = Π_{u=1}^U S_u.
In particular - Identification

The necessary and sufficient condition remains J ≥ K, but with J and K defined as products of the J_t and K_r.
BUT note:
- There is NO requirement that there be at least as many instruments as endogenous variables (T ≥ R);
- all that is needed is J ≥ K.
- Of course, more instruments increase J = Π_{t=1}^T J_t.
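The point can be checked with a toy example (numbers hypothetical): two binary endogenous regressors but only one instrument still satisfy the identification condition, provided the instrument has enough support points.

```python
from math import prod

# R = 2 endogenous regressors with 2 support points each, so K = 4;
# a single instrument (T = 1 < R) with 5 support points gives J = 5.
K_r = [2, 2]
J_t = [5]
K, J = prod(K_r), prod(J_t)
print(J >= K)   # True: point identification possible despite T < R
```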
Applications motivation

- There are many published applications where a discrete endogenous regressor is instrumented by a variable with insufficient support, e.g. Card (1995), Angrist and Krueger (1991), Bronars and Grogger (1994), Lochner and Moretti (2004).
- Point identification is achieved by assuming a parametric (linear) specification.
- Parametric vs. nonparametric specification testing (e.g. Horowitz (2006)) is not possible in this case.
Application- Card (1995) on returns to schooling

We are interested in the relationship between an individual's wage Y and education X (in the presence of exogenous covariates W) in

Y = h(X, W) + ε.

Card (1995) treats education as endogenous and estimates

ln(wage_i) = β_0 + β_1 X_i + Σ_{s=1}^S γ_s W_si + ε_i

by 2SLS, using a binary instrument Z which takes value 1 if there is a college in the neighbourhood, 0 otherwise. Point identification is achieved by imposing the parametric (linear) specification, which is not testable.
Data

- The dataset consists of 3010 observations from the National Longitudinal Survey of Young Men.
- The (sample) support of the education variable consists of K = 18 different values; for a binary instrument, J = 2.
- Data limitation: the more exogenous covariates, the less likely it is to get observations for all possible combinations of the support points.
- Educational levels (K = 4): less than high school, high school, some college, post-college education.
- Potential labour market experience levels: low and high.
Results

Covariates                 R_n     cv.1    cv.2    cv.3    α
Educ                       1.765   0.239   0.232   0.238   1%
                                   0.136   0.132   0.138   5%
                                   0.094   0.096   0.097   10%
Educ*, Exp*                4.147   1.221   1.259   1.217   1%
                                   0.715   0.696   0.719   5%
                                   0.511   0.500   0.515   10%
Educ*, Exp*, Race          3.572   1.771   1.692   1.688   1%
                                   1.107   1.131   1.108   5%
                                   0.849   0.871   0.860   10%
Educ*, Exp*, Race, SMSA    2.955   2.382   2.330   2.415   1%
                                   1.702   1.679   1.735   5%
                                   1.399   1.365   1.430   10%

(cv.1-cv.3 are the critical values from the three computation methods above.)
Outcome

- Education is endogenous, whatever the specification of the W's.
- So: linearity is not testable, because no consistent estimator for h(·) exists.
- Some linear functionals of interest may be estimable: use the test above to check.
- We can consistently estimate an identified linear combination.
Testing for point-identifiability of linear functionals

As J = 2, only linear functionals of 2 parameters might be point-identified, e.g. the difference in earnings across different years of education.

Linear combination   G_n       L̂(β)      Chesher's bounds
h(3) − h(2)          0.1356    0.0040     -
h(7) − h(6)          1.9332    0.1017     (0.0365, 0.2895)
h(8) − h(7)          0.1494    0.2395     (0.1732, 0.352)
h(9) − h(8)          26.5527   -          (−0.2742, 0.1334)
h(10) − h(9)         75.2217   -          -
h(11) − h(10)        4.7003    0.1317     (0.057, 0.3187)
h(14) − h(13)        61.5525   -          -
h(17) − h(16)        10.7344   −0.1900    -
h(18) − h(17)        74.1413   -          -
Application- Angrist and Krueger (1991) on returns to schooling

Angrist and Krueger (1991) estimate

ln(wage_i) = β X_i + Σ_c δ_c Y_ci + Σ_{s=1}^S γ_s W_si + ε_i

by 2SLS, using quarter of birth as an instrument for (assumed) endogenous education.

Data: 1980 Census, split into the 1930-1939 cohort (40-49 year-old men) and the 1940-1949 cohort (30-39 year-old men).

Now K = 21 and J = 4.
Results: 1930s cohort

                         critical values
Year         R_n      1%       5%       10%
1930         0.645    17.144   11.026   8.433
1931         1.000    19.806   12.614   9.582
1932         10.843   21.541   14.313   11.184
1933         2.385    18.980   12.685   9.952
1934         6.498    25.674   16.824   13.025
1935         2.728    20.451   13.374   10.340
1936         10.990   29.102   18.465   13.980
1937         1.614    13.467   9.032    7.101
1938         1.344    22.932   15.107   11.737
1939         9.649    22.130   14.837   11.664
full cohort  38.044   85.933   72.138   65.465
Results: 1940s cohort

                         critical values
Year         R_n       1%        5%        10%
1940         3.528     24.137    15.704    12.096
1941         18.143    24.733    16.005*   12.286*
1942         6.517     34.282    21.810    16.535
1943         99.840    55.818*   35.712*   27.202*
1944         22.665    39.214    24.860    18.823*
1945         31.736    26.623*   17.705*   13.847*
1946         17.181    23.478    15.183*   11.642*
1947         22.803    33.000    21.830*   17.012*
1948         34.116    46.991    29.790*   22.552*
1949         32.952    36.445*   23.627*   18.168*
full cohort  278.703   138.344*  114.551*  103.182*

(* marks critical values exceeded by R_n, i.e. rejection of exogeneity at that level.)
To conclude...

- We propose consistent nonparametric exogeneity test(s) applicable in models with discrete regressors,
- the tests confirm endogeneity of the education variable in some classic applied work, but
- suggest that linearity of these models might be a bold assumption;
- we suggest a nonparametric approach, or finding instruments with more support points!

THE END