8/7/2019 Evaluating the Performance of Tests of Over Identifying Restrictions
1. EVALUATING THE PERFORMANCE OF TESTS OF
OVERIDENTIFYING RESTRICTIONS
Jerzy Niemczyk, draft date: 31.05.2008
1.1 Introduction
In linear regression models with endogenous explanatory variables a researcher utilizes additional variables, so-called instruments (which can also include some of the regressors), that are known to be uncorrelated with the regression error term and provide the information necessary for the estimation of the unknown parameter(s). So far, the most common estimator used to tackle the linear model is the Instrumental Variables estimator, which can be viewed as a special case of the Generalized Method of Moments (GMM) estimator introduced by Hansen (1982) in his seminal paper. In that case the validity of the exploited moment conditions (restrictions), provided by the instruments, is investigated via the Hansen (1982) J statistic.
More recently, based on the concept of Empirical Likelihood introduced by Owen (1991), an alternative method was proposed by Qin and Lawless (1994) and Imbens (1997). Empirical Likelihood (EL) finds an estimator together with empirical probabilities that maximize the empirical likelihood function such that the moment conditions are exactly satisfied in the sample (which, in general, is not the case with the GMM estimator, for which the implied empirical probabilities are all equal). Here the maximized criterion function (empirical likelihood) provides the natural statistic for overidentifying restrictions (via the Empirical Likelihood Ratio test, ELR). Other estimators, such as Exponential Tilting and the Continuous Updating Estimator (CUE), were proposed later. They are all special cases of the so-called Generalized Empirical Likelihood (GEL) estimator, see Smith (1997), Newey and Smith (2004) and citations therein. The consistency of the overidentifying restriction tests based on GEL was proven by Smith (1997). Optimality of EL for testing moment conditions was shown by Kitamura (2001). The study of Newey and Smith (2004) suggests that GEL can have better finite-sample properties than GMM.
In this paper, for a simple linear model, we examine several procedures for testing overidentifying restrictions via a Monte Carlo simulation. We compare several versions of known GMM statistics (which differ with respect to the weighting matrices applied) with GEL type likelihood ratio tests. We analyze the behavior of the tests when the instruments are either weak or strong. We also examine the incremental version of those overidentifying restriction tests, that is, the difference between the test statistics for the validity of a full set of instruments and for a subset of those instruments. By exploiting the validity of a subset of the instruments this incremental version should lead to a local power improvement, see Hall (2005).

Finite sample properties of the GMM tests can be improved by applying bootstrap procedures. We examine implementations suggested in Hall and Horowitz (1996) and Brown and Newey (2002), and some modifications of those. We also analyze a simple residual type bootstrap.

We find that the Hall and Horowitz (1996) implementations, in terms of size, work well for large samples and rather strong instruments. However, the residual bootstrap performs much better here, for both small and large samples and under weak or strong instruments. Brown and Newey (2002) does not perform well in our examples, which is probably due to the numerical procedures involved.

In Section 2 we present the GMM and GEL type tests for overidentifying restrictions. In Section 3 we describe bootstrap procedures for correcting GMM type test statistics. In Section 4 we illustrate the size and power of those tests and the performance of the bootstrap procedures for a simple linear model.
1.2 Overidentifying restrictions test statistics
For some stationary data vector $X_i$, $i = 1, \dots, n$, from $X = [X_1, \dots, X_n]$, where $n$ is the sample size, we denote a particular $l \times 1$ vector function of the data by $g_i(\theta) \equiv g(X_i, \theta)$ and the corresponding sample moment function by $g_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} g_i(\theta)$, where $\theta$ is a $k \times 1$ vector of parameters. We assume that under the true but unknown data generating process $\mathrm{E}\,g_i(\theta) = g(\theta)$ and that $g_n(\theta) \stackrel{p}{\rightarrow} g(\theta)$ for every $\theta \in \mathbb{R}^k$. We aim to find a unique $\theta_0$ for which $g(\theta_0) = 0$.

A popular procedure for estimating $\theta_0$ is the Hansen (1982) GMM method, which minimizes a particular quadratic form of the sample moment function. Although it has attractive asymptotic properties, GMM
can perform poorly in finite samples, especially when the parameter is weakly identified (when the identifying conditions are close to being violated), see Andrews and Stock (2007) and references therein.

Alternatives to the Hansen (1982) GMM estimator and to the corresponding test statistics for overidentifying restrictions have been proposed. They include empirical likelihood (which finds the estimator that maximizes the likelihood function of the data subject to the moment restrictions being satisfied in the sample), exponential tilting and the continuous updating estimator, see Newey and Smith (2004). These are members of the class of so-called generalized empirical likelihood (GEL) estimators.
Below we describe those procedures in some more detail.
1.2.1 GMM statistics
The GMM estimator is

$\hat\theta \equiv \arg\min_{\theta}\ g_n(\theta)'\,W(X)\,g_n(\theta),$  (1.1)

where $W(X) = O_p(1)$ is an $l \times l$ positive semi-definite weighting matrix. The efficient GMM estimator is obtained in one, two or several steps. In the first step we obtain $\tilde\theta$ using some initial $W(X)$ (for instance the identity matrix, though in particular cases an optimal weighting matrix can be derived analytically); in the second step we re-compute (1.1) using $W(X) = \hat\Omega^{-1}(\tilde\theta)$, the inverse of a consistent estimator of the asymptotic variance of $\sqrt{n}\,g_n(\theta_0)$, yielding $\hat\theta$. The third stage estimator, $\hat\theta_3$, would be based on $\hat\Omega^{-1}(\hat\theta)$. For the theoretical derivation of the consistency of the GMM estimator and the asymptotic results see Hansen (1982).

As recommended by Andrews (1999) and further justified by Hall (2000) (this will lead to the local power improvement), we shall examine the following form of the covariance estimator,

$\hat\Omega_a(\theta) = \frac{1}{n}\sum_{i=1}^{n} g_i(\theta)g_i(\theta)' - g_n(\theta)g_n(\theta)'.$  (1.2)

We will also examine the standard estimator of the covariance matrix,

$\hat\Omega_s(\theta) = \frac{1}{n}\sum_{i=1}^{n} g_i(\theta)g_i(\theta)'.$  (1.3)

Hansen's two stage test statistic for overidentifying restrictions is

$J_n \equiv n\,g_n(\hat\theta)'\,\hat\Omega^{-1}(\tilde\theta)\,g_n(\hat\theta).$  (1.4)
This, based on (1.2) or (1.3), specializes to

$J_n^a = n\,g_n(\hat\theta_a)'\,\hat\Omega_a^{-1}(\tilde\theta_a)\,g_n(\hat\theta_a)$  (1.5)

or

$J_n^s = n\,g_n(\hat\theta_s)'\,\hat\Omega_s^{-1}(\tilde\theta_s)\,g_n(\hat\theta_s),$  (1.6)

with

$\hat\theta_a \equiv \arg\min_{\theta}\ g_n(\theta)'\,\hat\Omega_a^{-1}(\tilde\theta)\,g_n(\theta),$  (1.7)

$\hat\theta_s \equiv \arg\min_{\theta}\ g_n(\theta)'\,\hat\Omega_s^{-1}(\tilde\theta)\,g_n(\theta).$  (1.8)

Sometimes, for brevity, we will simply write $\hat\theta$ for $\hat\theta_s$ or $\hat\theta_a$, although in general those estimators are not equal.

The difference between the expressions (1.2) and (1.3), $g_n(\theta)g_n(\theta)'$, tends to zero if the population moments are satisfied; hence it does not change the limiting distribution of the test statistic. If they are not satisfied, this factor does not disappear in the limit, and thus can lead to power improvements of the test, see Hall (2000).

When all the moment conditions are valid, i.e. $g(\theta_0) = 0$ for a unique $\theta_0$, we can test whether the $(l-k)$ overidentifying moment restrictions are satisfied. Under appropriate regularity conditions, see Hansen (1982), (1.5) and (1.6) are asymptotically distributed as $\chi^2(l-k)$. The procedures for testing overidentifying restrictions based on (1.4) are consistent, see Andrews (1999).
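As a concrete illustration of the two-step procedure and the statistic (1.4), the following sketch computes the two-step GMM estimator and J statistic for a linear instrumental-variables model. This is our own NumPy code, not part of the paper; all function and variable names are ours.

```python
import numpy as np

def gmm_j_stat(y, X, Z, centered=True):
    """Two-step GMM for y = X theta + u with instruments Z.
    Returns the J statistic (1.4), the second-step estimator and the
    chi-square degrees of freedom l - k.  `centered=True` uses the
    centred covariance estimator (1.2), `False` the standard one (1.3)."""
    n, l = Z.shape
    k = X.shape[1]
    # First step: W = (Z'Z/n)^{-1}, i.e. the IV/2SLS estimator.
    W = np.linalg.inv(Z.T @ Z / n)
    A = X.T @ Z
    theta1 = np.linalg.solve(A @ W @ A.T, A @ W @ (Z.T @ y))
    # Moment contributions g_i(theta) = z_i (y_i - x_i' theta).
    g = Z * (y - X @ theta1)[:, None]
    gbar = g.mean(axis=0)
    Omega = g.T @ g / n                    # (1.3)
    if centered:
        Omega -= np.outer(gbar, gbar)      # (1.2)
    # Second step with W = Omega^{-1}(theta1).
    Wi = np.linalg.inv(Omega)
    theta2 = np.linalg.solve(A @ Wi @ A.T, A @ Wi @ (Z.T @ y))
    gbar2 = (Z * (y - X @ theta2)[:, None]).mean(axis=0)
    J = n * gbar2 @ Wi @ gbar2             # (1.4)
    return J, theta2, l - k
```

Under valid instruments, J is asymptotically $\chi^2(l-k)$ and the second-step estimator is consistent.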
Linear model
In a linear model we have $X_i = (y_i, x_i', z_i')'$, where $z_i$ is an $l \times 1$ vector of alleged instruments and $x_i$ is a $k \times 1$ vector of regressors (if some of the regressors are exogenous or predetermined then $x_i$ and $z_i$ can share the same elements), $i = 1, \dots, n$. Let $Z = [z_1, \dots, z_n]'$ be the $(n \times l)$ matrix of instruments, let $X = [x_1, \dots, x_n]'$ be the matrix of regressors and $y = [y_1, \dots, y_n]'$ the vector of dependent variables. The GMM estimator is based on the moment conditions

$g(\theta_0) = \mathrm{E}\,z_i(y_i - x_i'\theta_0) = 0.$

For an initial consistent estimator $\tilde\theta$, let $u(\tilde\theta) = y - X\tilde\theta$; we then have

$g_n(\tilde\theta) = \frac{1}{n}\sum_{i=1}^{n} z_i(y_i - x_i'\tilde\theta) = \frac{1}{n}Z'u(\tilde\theta),$  (1.9)
$\hat\Omega_a(\tilde\theta) = \frac{1}{n}\sum_{i=1}^{n} (y_i - x_i'\tilde\theta)^2 z_i z_i' - \frac{1}{n^2}Z'u(\tilde\theta)u(\tilde\theta)'Z,$  (1.10)

and

$\hat\Omega_s(\tilde\theta) = \frac{1}{n}\sum_{i=1}^{n} (y_i - x_i'\tilde\theta)^2 z_i z_i'.$  (1.11)

The second stage GMM estimator is

$\hat\theta = (X'Z\hat\Omega^{-1}Z'X)^{-1}X'Z\hat\Omega^{-1}Z'y,$  (1.12)

with $\hat\Omega$ being either (1.10) or (1.11). For testing the moment conditions we use (1.5) or (1.6).

Under conditional homoscedasticity, $\mathrm{E}\big((y_i - x_i'\theta_0)^2 \mid z_i\big) = \sigma_0^2$, the instrumental variables (IV) estimator (which results from the minimization of (1.1) using $W(X) = [\frac{1}{n}Z'Z]^{-1}$) is

$\hat\theta = (X'P_Z X)^{-1}X'P_Z y, \quad \text{where } P_Z = Z(Z'Z)^{-1}Z'.$  (1.13)
The unconditional covariance matrix of the moment conditions is $\Omega = \sigma_0^2\,\mathrm{E}\,z_i z_i'$. Using, for the weighting matrix in the second stage, the structure of this matrix instead of (1.11) or (1.10), we can apply

$\dot\Omega_s(\theta) = \frac{u(\theta)'u(\theta)}{n}\,\frac{1}{n}Z'Z$  (1.14)

or

$\dot\Omega_a(\theta) = \frac{u(\theta)'u(\theta)}{n}\,\frac{1}{n}Z'Z - \frac{1}{n^2}Z'u(\theta)u(\theta)'Z.$  (1.15)

However, from the form of (1.12) we can easily see that updating the estimator that uses $\dot\Omega_s \equiv \dot\Omega_s(\hat\theta)$ will not affect the second stage estimator. That is, $\hat\theta = \dot\theta_s$, where

$\dot\theta_s = (X'Z\dot\Omega_s^{-1}Z'X)^{-1}X'Z\dot\Omega_s^{-1}Z'y.$

We now show that also

$\hat\theta = \dot\theta_a.$  (1.16)
For simplicity, let us write

$\dot\Omega_a = \dot\Omega_s - \bar g\bar g',$  (1.17)

with $\bar g = \frac{1}{n}Z'u(\hat\theta)$. The resulting estimator is

$\dot\theta_a = (X'Z\dot\Omega_a^{-1}Z'X)^{-1}X'Z\dot\Omega_a^{-1}Z'y.$  (1.18)

Applying a known matrix result (the Sherman-Morrison formula),

$\dot\Omega_a^{-1} = (\dot\Omega_s - \bar g\bar g')^{-1} = \dot\Omega_s^{-1} + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}\dot\Omega_s^{-1}\bar g\bar g'\dot\Omega_s^{-1},$

we obtain

$\dot\theta_a = \big\{X'Z[\dot\Omega_s^{-1} + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}\dot\Omega_s^{-1}\bar g\bar g'\dot\Omega_s^{-1}]Z'X\big\}^{-1}X'Z[\dot\Omega_s^{-1} + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}\dot\Omega_s^{-1}\bar g\bar g'\dot\Omega_s^{-1}]Z'y$

$= \big\{X'Z\dot\Omega_s^{-1}Z'X + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}X'Z\dot\Omega_s^{-1}\bar g\bar g'\dot\Omega_s^{-1}Z'X\big\}^{-1}\big\{X'Z\dot\Omega_s^{-1}Z'y + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}X'Z\dot\Omega_s^{-1}\bar g\bar g'\dot\Omega_s^{-1}Z'y\big\}$

$= \big\{X'Z\dot\Omega_s^{-1}Z'X\big\}^{-1}X'Z\dot\Omega_s^{-1}Z'y = \hat\theta.$

The third equality is due to

$X'Z\dot\Omega_s^{-1}\bar g = \frac{1}{n}X'Z\dot\Omega_s^{-1}Z'(y - X\hat\theta) = \Big(\frac{u(\hat\theta)'u(\hat\theta)}{n}\Big)^{-1}X'P_Z(y - X\hat\theta) = 0.$
The standard Sargan test arises from the application of (1.14) in (1.6), giving

$S_n^s \equiv n\,\bar g'\dot\Omega_s^{-1}\bar g = n\,\frac{u(\hat\theta)'P_Z u(\hat\theta)}{u(\hat\theta)'u(\hat\theta)}.$  (1.19)

Even though the estimators $\dot\theta_a = \dot\theta_s = \hat\theta$ coincide, the test statistic

$S_n^a \equiv n\,\bar g'\dot\Omega_a^{-1}\bar g$  (1.20)

is not equivalent to $S_n^s$. In fact we have

$S_n^a = n\,\bar g'[\dot\Omega_s^{-1} + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}\dot\Omega_s^{-1}\bar g\bar g'\dot\Omega_s^{-1}]\bar g = n\,\bar g'\dot\Omega_s^{-1}\bar g + (1 - \bar g'\dot\Omega_s^{-1}\bar g)^{-1}\,n\,\bar g'\dot\Omega_s^{-1}\bar g\,\bar g'\dot\Omega_s^{-1}\bar g = S_n^s + \frac{S_n^s}{n - S_n^s}\,S_n^s = S_n^s\,\frac{n}{n - S_n^s}.$  (1.21)

Since $0 \le S_n^s < n$, we have $S_n^a \ge S_n^s$. Hence, for a given critical value, $S_n^a$ will reject more often than $S_n^s$.
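The relation (1.21) is easy to verify numerically. The sketch below is our own illustrative code (helper names are ours): it computes $S_n^s$ via (1.19) and obtains $S_n^a$ through (1.21).

```python
import numpy as np

def sargan_tests(y, X, Z):
    """Standard Sargan statistic (1.19) and its alternative version via
    the exact relation (1.21): S^a_n = S^s_n * n / (n - S^s_n)."""
    n = len(y)
    Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)               # projection on Z
    theta = np.linalg.solve(X.T @ Pz @ X, X.T @ Pz @ y)  # IV estimator (1.13)
    u = y - X @ theta
    Ss = n * (u @ Pz @ u) / (u @ u)                      # (1.19)
    Sa = Ss * n / (n - Ss)                               # (1.21)
    return Ss, Sa
```

Computing $S_n^a$ directly as $n\,\bar g'\dot\Omega_a^{-1}\bar g$ gives the same number, which is a useful check of the derivation above.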
1.2.2 GEL statistics
Here we briefly describe GEL and the resulting test statistic for the overidentifying restrictions. For the analytical development of the following results see Smith (1997). For more refined results on GEL see also Newey and Smith (2004).
For a concave scalar function $\rho(v)$ and an open interval $V$ containing zero,

$\hat\theta_{GEL} \equiv \arg\min_{\theta}\ \sup_{\lambda\in\Lambda_n(\theta)}\ \sum_{i=1}^{n} \rho(\lambda'g_i(\theta)),$  (1.22)

where $\Lambda_n(\theta) = \{\lambda : \lambda'g_i(\theta)\in V,\ i = 1,\dots,n\}$. Let $\hat g_i \equiv g_i(\hat\theta_{GEL})$. The empirical probabilities of the observations associated with the GEL estimator are

$\hat\pi_i \equiv \frac{\rho_1(\hat\lambda'\hat g_i)}{\sum_{j=1}^{n}\rho_1(\hat\lambda'\hat g_j)},$  (1.23)

where $\rho_1(v) = \partial\rho(v)/\partial v$ and $\hat\lambda = \arg\max_{\lambda\in\Lambda_n(\hat\theta_{GEL})}\sum_{i=1}^{n}\rho(\lambda'\hat g_i)$. These probabilities are important for the bootstrap procedure introduced by Brown and Newey (2002), which we describe later.

The GEL likelihood ratio test statistic is

$GELR_n \equiv 2\Big(\sum_{i=1}^{n}\rho(\hat\lambda'\hat g_i) - n\rho(0)\Big)$

and has an asymptotic $\chi^2(l-k)$ distribution when all the moment conditions are valid.
1.3 Bootstrap procedures
Below we describe the bootstrap procedures of Hall and Horowitz (1996) and Brown and Newey (2002) for improving the finite sample properties of the GMM overidentifying restrictions test statistics. The Hall and Horowitz (1996) bootstrap version of the statistic uses a re-centered moment function, such that the moment restrictions are satisfied in the bootstrap samples. Brown and Newey (2002) propose resampling the data according to the probabilities associated with the observations that arise from the computation of the empirical likelihood evaluated at the GMM estimator. That way the moments exploited by GMM are also satisfied in the bootstrap samples, and in theory this procedure should lead to improvements with respect to the Hall and Horowitz (1996) procedure.
1.3.1 HH type bootstrap
We adopt here the bootstrap procedure of Hall and Horowitz (1996), which was originally designed to handle dependent data. Assuming that the data are i.i.d., we have $\mathrm{E}\big(g(X_i,\theta_0)g(X_j,\theta_0)'\big) = O$ for $i \ne j$. The version of the GMM statistic they consider uses the inverse of (1.3) for the weighting matrix.

A bootstrap sample $X^* = (X_i^*,\ i = 1,\dots,n)$ is obtained by drawing independently with replacement from $X = (X_i,\ i = 1,\dots,n)$. Let

$g_i^h(\theta) \equiv g(X_i^*,\theta) - \mathrm{E}^* g(X_i^*,\hat\theta) = g_i^*(\theta) - g_n(\hat\theta),$  (1.24)

where $\mathrm{E}^*(\cdot) = \mathrm{E}(\cdot\mid X)$, $g(X_i^*,\theta) \equiv g_i^*(\theta)$, and $\hat\theta$ is the estimator used by the test statistic we wish to bootstrap. Clearly

$\mathrm{E}^* g_n^h(\hat\theta) = g_n(\hat\theta) - g_n(\hat\theta) = 0,$  (1.25)

which clarifies the structure of the bootstrap moment function (1.24): with that choice, the population moments exploited in the estimation, $\mathrm{E}\,g_n(\theta_0) = 0$, are satisfied exactly in the bootstrap samples at $\hat\theta$. Because of the independence of the bootstrap draws ($\mathrm{E}^* g_i^h(\hat\theta)g_j^h(\hat\theta)' = 0$ for $i \ne j$), the bootstrap variance of $\sqrt{n}\,g_n^h(\hat\theta)$ is

$n\,\mathrm{E}^* g_n^h(\hat\theta)g_n^h(\hat\theta)' = \frac{1}{n}\sum_{i,j=1}^{n}\mathrm{E}^* g_i^h(\hat\theta)g_j^h(\hat\theta)' = \mathrm{E}^* g_i^h(\hat\theta)g_i^h(\hat\theta)' = \hat\Omega_a(\hat\theta).$  (1.26)
The bootstrap version of (1.3) is

$\hat\Omega_s^h(\theta) = \frac{1}{n}\sum_{i=1}^{n} g_i^h(\theta)g_i^h(\theta)'$  (1.27)

$= \frac{1}{n}\sum_{i=1}^{n} g_i^*(\theta)g_i^*(\theta)' - g_n^*(\theta)g_n(\hat\theta)' - g_n(\hat\theta)g_n^*(\theta)' + g_n(\hat\theta)g_n(\hat\theta)',$

and the bootstrap version of (1.2) is

$\hat\Omega_a^h(\theta) = \frac{1}{n}\sum_{i=1}^{n} g_i^h(\theta)g_i^h(\theta)' - g_n^h(\theta)g_n^h(\theta)'$  (1.28)

$= \frac{1}{n}\sum_{i=1}^{n} g_i^*(\theta)g_i^*(\theta)' - g_n^*(\theta)g_n^*(\theta)',$
where $\tilde\theta^*$ is obtained from

$\tilde\theta^* \equiv \arg\min_{\theta}\ g_n^h(\theta)'\,W^*\,g_n^h(\theta),$  (1.29)

for some initial weighting matrix $W^*$. It is easy to see that

$\mathrm{E}^*\hat\Omega_s^h(\hat\theta) = \mathrm{E}^*\hat\Omega_a^h(\hat\theta) = \hat\Omega_a(\hat\theta),$  (1.30)

hence, for $W^*$, we shall use the inverse of $\hat\Omega_s^h(\hat\theta)$ or $\hat\Omega_a^h(\hat\theta)$ ((1.27) and (1.28) evaluated at the bootstrap true parameter $\hat\theta$). We will also examine how using updated versions of (1.27) and (1.28) changes the results relative to this initial choice.
The bootstrap versions of (1.5) and (1.6) are

$\tilde J_a^* = n\,g_n^h(\tilde\theta_a^*)'\,[\hat\Omega_a^h(\hat\theta_a)]^{-1}\,g_n^h(\tilde\theta_a^*)$  (1.31)

or

$\tilde J_s^* = n\,g_n^h(\tilde\theta_s^*)'\,[\hat\Omega_s^h(\hat\theta_s)]^{-1}\,g_n^h(\tilde\theta_s^*),$  (1.32)

and if we update $\theta$ in the weighting matrix, the second stage versions are

$\hat J_a^* = n\,g_n^h(\hat\theta_a^*)'\,[\hat\Omega_a^h(\tilde\theta_a^*)]^{-1}\,g_n^h(\hat\theta_a^*)$  (1.33)

or

$\hat J_s^* = n\,g_n^h(\hat\theta_s^*)'\,[\hat\Omega_s^h(\tilde\theta_s^*)]^{-1}\,g_n^h(\hat\theta_s^*),$  (1.34)

where, ignoring the subscript $a$ or $s$, $\tilde\theta^*$ is obtained from (1.29) using $[\hat\Omega^h(\hat\theta)]^{-1}$ for the weighting matrix, and $\hat\theta^*$ using $[\hat\Omega^h(\tilde\theta^*)]^{-1}$.
Having designed their procedure primarily for dependent data, Hall and Horowitz (1996) are concerned with the fact that the block bootstrap does not replicate the dependence of the true data generating process. To overcome this issue they apply a particular transformation to (1.15). Since we are not dealing with dependent data here, we do not need to apply this transformation.
Linear model
In the linear case (1.24) specializes to

$g_n^h(\theta) = \frac{1}{n}Z^{*\prime}u^*(\theta) - \frac{1}{n}Z'u(\hat\theta),$

where $u^*(\theta) \equiv y^* - X^*\theta$, and for a given weighting matrix $W^*$, say the inverse of $\hat\Omega_s^h(\hat\theta)$ or $\hat\Omega_a^h(\hat\theta)$, (1.29) becomes

$\tilde\theta^* = (X^{*\prime}Z^* W^* Z^{*\prime}X^*)^{-1}X^{*\prime}Z^* W^*\big(Z^{*\prime}y^* - Z'(y - X\hat\theta)\big).$  (1.35)
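A minimal sketch of the re-centred moment function above (our own code; the argument names are our assumptions, with the starred inputs being a bootstrap resample of the original data):

```python
import numpy as np

def hh_bootstrap_moment(theta, y_s, X_s, Z_s, centre):
    """Re-centred bootstrap moment function (1.24) for the linear model:
    (1/n) Z*'u*(theta) - (1/n) Z'u(theta_hat), where `centre` holds the
    pre-computed original-sample moment (1/n) Z'u(theta_hat)."""
    n = len(y_s)
    return Z_s.T @ (y_s - X_s @ theta) / n - centre
```

Feeding the original sample back in at $\hat\theta$ returns a zero vector, which is exactly the re-centring property (1.25) taken in bootstrap expectation.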
The bootstrap version of the weighting matrix used for the Sargan test statistic, based on (1.27) or (1.28) and the form (1.14), would use the inverse of

$\dot\Omega_s^h(\theta) = \frac{u^*(\theta)'u^*(\theta)}{n}\,\frac{1}{n}Z^{*\prime}Z^* - \frac{1}{n^2}Z^{*\prime}u^*(\theta)u(\hat\theta)'Z - \frac{1}{n^2}Z'u(\hat\theta)u^*(\theta)'Z^* + \frac{1}{n^2}Z'u(\hat\theta)u(\hat\theta)'Z$  (1.36)

or

$\dot\Omega_a^h(\theta) = \frac{u^*(\theta)'u^*(\theta)}{n}\,\frac{1}{n}Z^{*\prime}Z^* - \frac{1}{n^2}Z^{*\prime}u^*(\theta)u^*(\theta)'Z^*;$  (1.37)

hence the bootstrap versions of the standard and alternative Sargan tests ($\tilde S_a^*$, $\tilde S_s^*$, $\hat S_a^*$ and $\hat S_s^*$) have the same structure as (1.31), (1.32), (1.33) and (1.34), but with the weighting matrix replaced by its dotted version.

In the MC experiments we will examine how, for a given version of the test statistic, the bootstrap critical values obtained from the bootstrapped statistics ($\tilde J_a^*$, $\tilde J_s^*$, $\hat J_a^*$, $\hat J_s^*$, $\tilde S_a^*$, $\tilde S_s^*$, $\hat S_a^*$ or $\hat S_s^*$) perform relative to the standard asymptotic critical values.
1.3.2 EL type bootstrap
An alternative to the approach of Hall and Horowitz (1996) was proposed by Brown and Newey (2002). Instead of drawing (with replacement) bootstrap samples $X_i^* = (y_i^*, x_i^{*\prime}, z_i^{*\prime})'$ from $X$, with each $X_i$ having equal probability, they suggest using the probabilities obtained from the calculation of the empirical likelihood at the given GMM estimator, $\hat\theta$. These are

$\hat\pi_i = \frac{1}{n\big(1 - \hat\lambda'g_i(\hat\theta)\big)},$  (1.38)

with $\hat\lambda = \arg\max_{\lambda}\sum_{i=1}^{n}\log\big(1 - \lambda'g_i(\hat\theta)\big)$. The procedure is then as follows:
- Obtain bootstrap samples $X^b$, $b = 1,\dots,B$, by drawing with replacement from $X$, where $X_i$ has probability $\hat\pi_i$ of being drawn, $i = 1,\dots,n$.
- Compute the statistic $J_n^b(\hat\theta^b)$ in exactly the same way $J_n(\hat\theta)$ was obtained, but using the bootstrapped data $X^b$ instead of $X$.
- The bootstrap level-$\alpha$ critical value is the $100(1-\alpha)\%$ quantile of the bootstrap distribution: $cv^* \equiv J_n^{[(1-\alpha)B]}$.
A modification of that procedure could use probabilities derived from another GEL member, for example exponential tilting, where $\rho(v) = -\exp\{v\}$.
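The implied probabilities (1.38) can be computed by solving the concave inner GEL problem for $\lambda$ with a few Newton steps. The sketch below is our own illustrative implementation (function name, iteration counts and tolerances are our choices, not the paper's):

```python
import numpy as np

def el_implied_probs(g, iters=100):
    """EL implied probabilities pi_i = 1/(n(1 - lam'g_i)) of (1.38), with
    lam solving the first-order condition sum_i g_i/(1 - lam'g_i) = 0 of
    the concave inner problem with rho(v) = log(1 - v).  Damped Newton
    iteration; g is the n x l matrix of moment contributions g_i(theta)."""
    n, l = g.shape
    lam = np.zeros(l)
    for _ in range(iters):
        d = 1.0 - g @ lam                 # must stay positive
        r = (g / d[:, None]).sum(axis=0)  # first-order-condition residual
        if np.linalg.norm(r) < 1e-12:
            break
        w = g / d[:, None]
        H = w.T @ w                       # Jacobian of r w.r.t. lam
        step = -np.linalg.solve(H, r)
        t = 1.0                           # dampen to keep 1 - lam'g_i > 0
        while np.any(1.0 - g @ (lam + t * step) <= 1e-8):
            t *= 0.5
        lam = lam + t * step
    pi = 1.0 / (n * (1.0 - g @ lam))
    return pi / pi.sum(), lam
```

Bootstrap samples are then drawn with `np.random.default_rng().choice(n, size=n, p=pi)`, so that the exploited moments hold exactly in the bootstrap population.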
1.3.3 Residual type bootstrap
For the linear model (1.41) we are going to analyze:

- Obtain $\hat\Pi_{ols} = (Z'Z)^{-1}Z'X$ and the residuals $\hat v = M_Z X$ (with $M_Z = I - P_Z$, scaled by $\sqrt{n/(n-l)}$).
- Obtain $\hat\theta$ from (1.13) and the residuals $\hat u = y - X\hat\theta$.
- From $\hat V = (\hat u, \hat v)$, for $b = 1,\dots,B$, obtain bootstrap versions of the disturbances $\hat V^b = (\hat u^b, \hat v^b)$ by re-drawing row-wise; generate

  $x^b = Z\hat\Pi_{ols} + \hat v^b$  (1.39)

  $y^b = x^b\hat\theta + \hat u^b$  (1.40)

- Compute the bootstrap versions of the Sargan or Hansen tests (from $y^b$, $x^b$ and $Z$).
- The bootstrap critical value is obtained from the $100(1-\alpha)\%$ quantile of the bootstrap versions of the statistic.
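The steps above can be sketched as follows (our own code, not the paper's; names and the choice of the Sargan statistic are ours):

```python
import numpy as np

def residual_bootstrap_sargan(y, X, Z, B=299, seed=0):
    """Residual-type bootstrap of the Sargan statistic: resample rows of
    the estimated disturbances (u_hat, v_hat) jointly, rebuild (x*, y*)
    via (1.39)-(1.40), and recompute the statistic B times."""
    rng = np.random.default_rng(seed)
    n, l = Z.shape
    Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    Pi_ols = np.linalg.solve(Z.T @ Z, Z.T @ X)            # reduced form
    v = (X - Z @ Pi_ols) * np.sqrt(n / (n - l))           # scaled residuals
    theta = np.linalg.solve(X.T @ Pz @ X, X.T @ Pz @ y)   # IV (1.13)
    u = y - X @ theta

    def sargan(yb, Xb):
        th = np.linalg.solve(Xb.T @ Pz @ Xb, Xb.T @ Pz @ yb)
        ub = yb - Xb @ th
        return n * (ub @ Pz @ ub) / (ub @ ub)             # (1.19)

    stats = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)                       # row-wise redraw
        Xb = Z @ Pi_ols + v[idx]                          # (1.39)
        yb = Xb @ theta + u[idx]                          # (1.40)
        stats[b] = sargan(yb, Xb)
    return np.quantile(stats, 0.95), stats                # cv at alpha = 5%
```

Note that the instruments $Z$ are held fixed across bootstrap replications; only the disturbance rows are redrawn.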
1.4.1 Example 1
and combining (1.46) with (1.47) we get

$\sigma_v^2 = \frac{\sigma_x^2}{\pi_p^2/n + 1}.$  (1.48)

We define the signal-to-noise ratio of (1.41) by $\varsigma^2 = \mathrm{Var}(x_t)/\sigma_u^2$; then $\varsigma^2 = \sigma_x^2$. Taking $\pi_i = \pi_0$ in (1.46) for $i = 1,\dots,l$, we get $\pi_p^2\sigma_v^2 = n\,l\,\pi_0^2$. Hence, for a given sample size $n$, $\pi_p^2$ and $\varsigma^2$, we calculate $\sigma_v^2$ from (1.48) and $\pi_0$ from $\pi_0^2 = \pi_p^2\sigma_v^2/(nl)$.

We will present our findings with respect to $\rho_{Z_1u}$ (the degree of invalidity of the instrument) and $\rho_{xu} = 0.2$ (the degree of simultaneity of the regressor). We have $\sigma_{uv}\sigma_v = \rho_{xu}\sigma_x$. We fix $\sigma_x^2 = 9$. We take $MC = 5000$, the number of Monte Carlo replications, and $B = 299$, the number of bootstrap replications.
For $n = 50, 200$ (small and larger sample) and $\pi_p^2/3 = 20, 1$ (rather strong and weak instruments), Table 1.1 contains size distortions ($\rho_{Z_1u} = 0$) of the standard Sargan test, $S_n^s$, and the modified Sargan test, $S_n^a$. Table 1.2 presents the same results but for $J_n^s$ and $J_n^a$. The critical values are based on the asymptotic theory (cv); on Hall and Horowitz (1996) with one step (cv~) and two steps (cv^); and on the residual type bootstrap (cvR), respectively. In the column indicating the instruments used by a test (say $T_n$), [123] means that all three instruments are used, [23] only the second and the third, and [123]|[23] means that the incremental test is used ($T_n([123]) - T_n([23])$). We do not show results for [13] or [12] in Tables 1.1 and 1.2 because here all three instruments are valid and i.i.d., so they would yield (similar) results as $T_n([23])$. Likewise, as far as type I errors are concerned, $T_n([123]) - T_n([23])$ gives similar results as $T_n([123]) - T_n([12])$ or $T_n([123]) - T_n([13])$. In fact, for the size distortion results, we averaged the results for [12] and [23] (hence the result is based on $2MC$ replications). We did the same for [123]|[12] and [123]|[23].
Tables interpretation
Table 1.1 shows the fractional size distortion at the given significance level for the standard and alternative Sargan tests. It reveals that, for the simple linear model with homoscedastic errors, the HH bootstrap does not give substantial improvements over the standard asymptotic results. Only for the weak instrument case $\pi_p/3 = 1$ and $n = 200$ do we see that it can reduce the size distortion from around 30% for the asymptotic result to about 16% when we use two-step HH. The residual type bootstrap performs much better in all the cases considered. The table also reveals that the alternative Sargan test seems to be less distorted (compared to the standard one) when $n = 200$ and for the weak instrument cases.

Table 1.2 gives the results for the standard and alternative Hansen tests. The HH bootstrap seems to give improvements (two-step HH) for large $n$, for both weak and strong instruments. The residual bootstrap, however, seems superior in giving improvements.

Table 1.3 shows the results of the procedures applying EL or ET. The first columns present the outcomes of applying the BN bootstrap, for the Sargan tests or the alternative Hansen test, with either EL or ET implied probabilities. We notice no size improvements in any of the cases considered; the percentage size distortion varies from about 50% to 70%. The right-hand columns show the LR type tests. EL($\hat\theta$) and ET($\hat\theta$) are GEL type LR tests evaluated at the GMM estimator; EL and ET are GEL type LR tests evaluated at the EL or ET estimator. We see that, for the strong instrument case, EL is less distorted than EL($\hat\theta$), similarly for ET, and this size distortion decreases with the sample size. For the weak instrument case EL and ET are worse than the versions evaluated at the GMM estimator, which is probably due to the optimization procedure applied to find the EL or ET estimator.

Table 1.4 shows the size corrected power of those LR tests. They all seem to perform about equally well when the instrument is rather strong. When it is weak, EL and ET are less powerful than their GMM counterparts, for which again the optimization procedure is probably to blame.

Tables 1.5 and 1.6 show the power of the Sargan and the Hansen tests (when the instrument is strong). For each of those tests, the cv_c column presents the size corrected (ideal) power, cv~ and cv^ the power that resulted from applying the HH bootstrap (one or two steps), and cvR the power from applying the residual type bootstrap. By comparing the bootstrap powers with the size corrected power we can see which bootstrap procedure produces (under the alternative) a critical value that is close to the size corrected one. We see that the residual bootstrap power reproduces the size corrected one very well, for both small and large samples. The HH bootstrap power is size corrected for large samples only. For small samples, because its power is smaller than the corrected one, the critical values that HH produces must be too large on average. This could be expected from the size distortion Tables 1.1 and 1.2, where the HH procedures under-reject for small $n$ in most cases.
1.4.2 Example 2
In the previous example we had conditional homoscedasticity, because $\mathrm{E}(u_t^2 \mid z_t) = 1$; therefore the Sargan test seems the most appropriate one to use. Here we will re-generate the previous model with the conditional homoscedasticity property violated. Let $w_t \equiv (z_t'z_t)/l$. Because $z_t \sim N(0, I_l)$, we obtain $\mathrm{E}(w_t) = 1$ and $\mathrm{E}(w_t^2) = 15/9$ (for $l = 3$). Let

$\tilde w_t \equiv \frac{\gamma_1 + \gamma_2 w_t}{\sqrt{\gamma_1^2 + \frac{15}{9}\gamma_2^2 + 2\gamma_1\gamma_2}},$

for some scalars $\gamma_1, \gamma_2$. By construction $\mathrm{E}\,\tilde w_t^2 = 1$ and

$\tau \equiv \mathrm{E}\,\tilde w_t = \frac{\gamma_1 + \gamma_2}{\sqrt{\gamma_1^2 + \frac{15}{9}\gamma_2^2 + 2\gamma_1\gamma_2}}.$

Now we generate

$y_t = x_t'\theta + \tilde u_t$  (1.49)

$x_t = z_t'\pi + v_{2t},$  (1.50)

where $\tilde u_t = \tilde w_t u_t$ (hence yielding conditional heteroscedasticity, $\mathrm{E}(\tilde u_t^2 \mid z_t) = \tilde w_t^2$). When $\gamma_1 = 1$ and $\gamma_2 = 0$ then $\tilde u_t = u_t$ and we are back in the previous example. For arbitrary $\gamma_1$ and $\gamma_2$ we have

$\mathrm{E}\,\tilde u_t = \mathrm{E}\,\mathrm{E}(\tilde u_t \mid z_t) = \mathrm{E}\,\tilde w_t\mathrm{E}(u_t \mid z_t) = 0,$

$\mathrm{E}\,\tilde u_t^2 = \mathrm{E}\,\mathrm{E}(\tilde u_t^2 \mid z_t) = \mathrm{E}\,\tilde w_t^2\mathrm{E}(u_t^2 \mid z_t) = 1.$

Hence, as previously, the unconditional variance of the structural equation error term is 1.

From (1.50) we now have

$\mathrm{E}\,x_t\tilde u_t = \mathrm{E}\,\mathrm{E}(z_t'\pi\,\tilde w_t u_t \mid z_t) + \mathrm{E}\,\mathrm{E}(\tilde w_t u_t v_{2t} \mid z_t) = \mathrm{E}(z_t'\pi\,\tilde w_t)\mathrm{E}(u_t \mid z_t) + \mathrm{E}\,\tilde w_t\mathrm{E}(u_t v_{2t} \mid z_t) = \tau\sigma_{uv}\sigma_v.$

Because $\mathrm{Var}\,x_t = \varsigma^2$ and $\mathrm{Var}\,\tilde u_t = 1$, we have

$\rho_{x\tilde u} = \frac{\tau\sigma_{uv}\sigma_v}{\varsigma}.$

Now create the invalid instrument, $\tilde Z_1$, according to

$\tilde Z_1 = \sqrt{1 - \rho_{Z_1u}^2}\,Z_1 + \rho_{Z_1u}\tilde u.$  (1.51)

In order to make this more flexible model comparable with the previous one ($\gamma_1 = 1$ and $\gamma_2 = 0$, for which $\tau = 1$) we want to choose $\gamma_1, \gamma_2$ such that $\tau$ is close to 1 (for $\gamma_2 = 1$ we calculate $\gamma_1$ such that $\tau = 0.9$). Note that $\tau$ can equal one only if $\gamma_2 = 0$ (no heteroscedasticity).
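A quick Monte Carlo check of the moments of $w_t$ and the normalization of $\tilde w_t$ (our own sketch; the values chosen for $\gamma_1, \gamma_2$ are arbitrary illustrations, not those used in the experiments):

```python
import numpy as np

# For z_t ~ N(0, I_3), w_t = z_t'z_t/3 has E w_t = 1 and E w_t^2 = 15/9,
# so the normalized weight w~_t has E w~_t^2 = 1 by construction.
rng = np.random.default_rng(0)
z = rng.standard_normal((1_000_000, 3))
w = (z**2).sum(axis=1) / 3
g1, g2 = 0.5, 1.0   # illustrative values of gamma_1, gamma_2
wt = (g1 + g2 * w) / np.sqrt(g1**2 + (15 / 9) * g2**2 + 2 * g1 * g2)
```

The sample means of $w_t$, $w_t^2$ and $\tilde w_t^2$ then come out close to $1$, $15/9$ and $1$, respectively.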
As before, for different values of the concentration parameter $\pi_p^2$, signal-to-noise ratio $\varsigma^2 = 9$ and sample size, we will parameterize the results with respect to $\rho_{Z_1u}$ (the degree of invalidity of the instrument) and $\rho_{xu} = 0.2$ (the degree of simultaneity of the regressor). We generate the data according to (1.49), (1.50) and (1.51), with $z_t \sim N(0, I_l)$ and with disturbances (1.43), where now $\sigma_{uv}\sigma_v = \rho_{xu}\varsigma/\tau$ with $\tau = 0.9$. $MC = 10000$.
Tables interpretation
Tables for Example 2 are work in progress.
                          S^s_n                              S^a_n
instr.         cv      cv~     cv^     cvR       cv      cv~     cv^     cvR

pi_p/3 = 20, n = 50
[123]       -0.036   -0.21   -0.25    0.004     0.28   -0.21   -0.25    0.004
[23]        -0.02    -0.1    -0.098   0         0.12   -0.1    -0.086   0
[123]|[23]   0.002   -0.11   -0.14   -0.018     0.28   -0.14   -0.17   -0.044

pi_p/3 = 20, n = 200
[123]       -0.012   -0.016  -0.032   0         0.088  -0.016  -0.028   0
[23]        -0.082   -0.042  -0.02   -0.016    -0.028  -0.042  -0.016  -0.016
[123]|[23]   0.03     0.032   0.024   0.03      0.11    0.028   0.022   0.032

pi_p/3 = 1, n = 50
[123]       -0.29    -0.37   -0.34   -0.1       0.024  -0.37   -0.3    -0.1
[23]        -0.33    -0.24   -0.17   -0.12     -0.18   -0.24   -0.13   -0.12
[123]|[23]  -0.058   -0.27   -0.21   -0.11      0.19   -0.28   -0.19   -0.13

pi_p/3 = 1, n = 200
[123]       -0.34    -0.27   -0.17   -0.1      -0.26   -0.27   -0.16   -0.1
[23]        -0.4     -0.27   -0.21   -0.19     -0.36   -0.27   -0.2    -0.19
[123]|[23]  -0.07    -0.27   -0.14   -0.12     -0.022  -0.26   -0.12   -0.12

Tab. 1.1: Size distortion: for different $n = 50, 200$ and $\pi_p/3 = 20, 1$, the table presents $[P(T_n > cv) - \alpha]/\alpha$ (the relative deviation of the rejection probabilities from the desired level $\alpha = 0.05$), where $T_n$ is either $S_n^a$ or $S_n^s$ and cv stands for the different critical values (at the given significance level): cv the asymptotic critical value, cv~ based on the one-step version of Hall and Horowitz (1996), cv^ based on two steps, and cvR based on the residual type bootstrap alternative.
                          J^s_n                              J^a_n
instr.         cv      cv~     cv^     cvR       cv      cv~     cv^     cvR

pi_p/3 = 20, n = 50
[123]       -0.18    -0.42   -0.4    -0.004     0.2    -0.43   -0.4    -0.004
[23]        -0.066   -0.18   -0.17    0.04      0.12   -0.18   -0.16    0.04
[123]|[23]  -0.044   -0.34   -0.34   -0.05      0.26   -0.37   -0.36   -0.058

pi_p/3 = 20, n = 200
[123]       -0.076   -0.08   -0.048   0.02      0.012  -0.08   -0.044   0.02
[23]        -0.068   -0.034  -0.024  -0.01     -0.022  -0.034  -0.02   -0.01
[123]|[23]   0.014   -0.028  -0.024   0.042     0.096  -0.02   -0.026   0.034

pi_p/3 = 1, n = 50
[123]       -0.4     -0.52   -0.49   -0.12     -0.12   -0.53   -0.44   -0.12
[23]        -0.35    -0.29   -0.25   -0.096    -0.21   -0.29   -0.24   -0.096
[123]|[23]  -0.046   -0.45   -0.4    -0.088     0.19   -0.46   -0.37   -0.1

pi_p/3 = 1, n = 200
[123]       -0.33    -0.24   -0.2    -0.15     -0.28   -0.24   -0.2    -0.15
[23]        -0.4     -0.28   -0.21   -0.18     -0.37   -0.28   -0.21   -0.18
[123]|[23]  -0.11    -0.24   -0.18   -0.16     -0.032  -0.25   -0.18   -0.17

Tab. 1.2: Size distortion: $[P(T_n > cv) - \alpha]/\alpha$; same as Table 1.1 with $T_n$ either $J_n^a$ or $J_n^s$.
                 S^s_n           S^a_n           J^a_n
instr.        cvEL    cvET    cvEL    cvET    cvEL    cvET    EL(th)  ET(th)  EL      ET

pi_p/3 = 20, n = 50
[123]        -0.5    -0.45   -0.5    -0.45   -0.68   -0.64    0.56    0.45    0.51    0.4
[23]         -0.63   -0.61   -0.63   -0.61   -0.72   -0.7     0.25    0.21    0.21    0.17
[123]|[23]   -0.66   -0.64   -0.66   -0.63   -0.78   -0.76    0.45    0.32    0.43    0.31

pi_p/3 = 20, n = 200
[123]        -0.6    -0.59   -0.6    -0.59   -0.66   -0.66    0.15    0.14    0.1     0.092
[23]         -0.68   -0.67   -0.68   -0.67   -0.72   -0.72    0.038   0.058   0.024   0.036
[123]|[23]   -0.77   -0.73   -0.77   -0.73   -0.78   -0.78    0.17    0.16    0.16    0.16

pi_p/3 = 1, n = 50
[123]        -0.71   -0.68   -0.71   -0.68   -0.92   -0.88    0.36    0.16   -0.38   -0.5
[23]         -0.79   -0.76   -0.79   -0.76   -0.88   -0.87   -0.054  -0.12   -0.54   -0.56
[123]|[23]   -0.79   -0.78   -0.8    -0.78   -0.9    -0.88    0.56    0.36   -0.056  -0.18

pi_p/3 = 1, n = 200
[123]        -0.7    -0.72   -0.7    -0.72   -0.78   -0.78   -0.21   -0.21   -0.66   -0.64
[23]         -0.83   -0.82   -0.83   -0.82   -0.86   -0.86   -0.32   -0.32   -0.66   -0.66
[123]|[23]   -0.79   -0.77   -0.79   -0.78   -0.81   -0.8     0.018   0.012  -0.41   -0.4

Tab. 1.3: Brown and Newey (2002) bootstrap: relative size distortion $[P(T_n > cv) - \alpha]/\alpha$ for different $T_n$, with the critical values cvEL, cvET obtained applying EL or ET type implied probabilities. Columns EL(th) and ET(th) show the relative distortion of EL and ET type LR tests computed at the GMM estimator; the EL and ET columns show the full EL and ET LR tests.
                    n = 50                           n = 200
instr.       EL()    ET()    EL      ET       EL()    ET()    EL      ET
p/3 = 20, Z1u = 0.1
[123]        0.0638  0.0622  0.0640  0.0628   0.1492  0.1522  0.1512  0.1522
[12]         0.0646  0.0646  0.0636  0.0652   0.1602  0.1556  0.1612  0.1562
[123]|[12]   0.0598  0.0570  0.0596  0.0586   0.0896  0.0874  0.0890  0.0874
[123]|[23]   0.0700  0.0700  0.0714  0.0716   0.2064  0.2056  0.2068  0.2088
Z1u = 0.2
[123]        0.1330  0.1290  0.1324  0.1284   0.4922  0.4972  0.4954  0.4974
[12]         0.1324  0.1334  0.1310  0.1352   0.4618  0.4586  0.4638  0.4584
[123]|[12]   0.0996  0.0920  0.1034  0.0944   0.2272  0.2220  0.2304  0.2252
[123]|[23]   0.1696  0.1696  0.1708  0.1728   0.5964  0.5952  0.5972  0.5980
Z1u = 0.5
[123]        0.6602  0.6538  0.6542  0.6526   0.9954  0.9956  0.9950  0.9952
[12]         0.5688  0.5722  0.5558  0.5648   0.9466  0.9436  0.9406  0.9380
[123]|[12]   0.3892  0.3496  0.4096  0.3638   0.8548  0.8526  0.8582  0.8676
[123]|[23]   0.7558  0.7570  0.7502  0.7534   0.9982  0.9982  0.9980  0.9976
p/3 = 1, Z1u = 0.1
[123]        0.0598  0.0596  0.0636  0.0656   0.1212  0.1222  0.1054  0.1058
[12]         0.0596  0.0590  0.0612  0.0578   0.1190  0.1168  0.1132  0.1126
[123]|[12]   0.0524  0.0536  0.0534  0.0542   0.0810  0.0818  0.0784  0.0788
[123]|[23]   0.0668  0.0678  0.0688  0.0690   0.1430  0.1460  0.1260  0.1278
Z1u = 0.2
[123]        0.1224  0.1206  0.1122  0.1178   0.3270  0.3308  0.2432  0.2430
[12]         0.1156  0.1186  0.1072  0.1030   0.2602  0.2584  0.2136  0.2126
[123]|[12]   0.0826  0.0850  0.0836  0.0810   0.1856  0.1858  0.1552  0.1556
[123]|[23]   0.1440  0.1462  0.1398  0.1380   0.3736  0.3774  0.2892  0.2912
Z1u = 0.5
[123]        0.4118  0.4116  0.3048  0.3092   0.4784  0.4792  0.3566  0.3558
[12]         0.3170  0.3180  0.2430  0.2392   0.3194  0.3180  0.2750  0.2732
[123]|[12]   0.2464  0.2294  0.1872  0.1812   0.3436  0.3412  0.2322  0.2304
[123]|[23]   0.4600  0.4576  0.3532  0.3506   0.5300  0.5304  0.4120  0.4148
Tab. 1.4: Size-corrected power: P(Tn > c | Z1u); α = 5%.
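Size-corrected power, as reported in Table 1.4, compares tests at the same actual size: the critical value c is the empirical (1 − α) quantile of the statistic simulated under the null, and power is the rejection frequency under the alternative at that c. A minimal sketch, with hypothetical chi-square draws standing in for the simulated test statistics:

```python
import numpy as np

def size_corrected_power(null_stats, alt_stats, alpha=0.05):
    # size-corrected critical value: empirical (1 - alpha) quantile under H0
    c = np.quantile(null_stats, 1.0 - alpha)
    # power: rejection frequency under the alternative at that critical value
    return (alt_stats > c).mean()

rng = np.random.default_rng(1)
null_stats = rng.chisquare(2, size=10_000)        # statistic simulated under H0
alt_stats = rng.chisquare(2, size=10_000) + 3.0   # shifted draws mimic H1
power = size_corrected_power(null_stats, alt_stats)
size = size_corrected_power(null_stats, null_stats)  # ~ alpha by construction
```

Because c comes from the null simulation itself, the rejection rate under the null is approximately α by construction, so differences in `power` across tests are not driven by size distortion.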
                    Ssn                              San
instr.       cvc     cv1hs   cv2hs   cvR      cvc     cv1ha   cv2ha   cvR
Z1u = 0.05, n = 50
[123]        0.0618  0.0492  0.0452  0.0600   0.0618  0.0492  0.0462  0.0600
[12]         0.0546  0.0532  0.0526  0.0624   0.0546  0.0532  0.0534  0.0624
[123]|[12]   0.0648  0.0482  0.0472  0.0568   0.0654  0.0478  0.0478  0.0570
[123]|[23]   0.0572  0.0542  0.0506  0.0636   0.0570  0.0532  0.0512  0.0632
Z1u = 0.1
[123]        0.0736  0.0578  0.0538  0.0726   0.0736  0.0578  0.0544  0.0726
[12]         0.0676  0.0656  0.0666  0.0758   0.0676  0.0656  0.0674  0.0758
[123]|[12]   0.0666  0.0542  0.0510  0.0608   0.0674  0.0554  0.0520  0.0634
[123]|[23]   0.0728  0.0666  0.0634  0.0824   0.0768  0.0666  0.0642  0.0814
Z1u = 0.2
[123]        0.1632  0.1268  0.1264  0.1628   0.1632  0.1268  0.1276  0.1628
[12]         0.1564  0.1404  0.1392  0.1650   0.1564  0.1404  0.1410  0.1650
[123]|[12]   0.1046  0.0786  0.0766  0.0910   0.1108  0.0842  0.0820  0.0988
[123]|[23]   0.1888  0.1618  0.1570  0.1978   0.1910  0.1610  0.1592  0.1972
Z1u = 0.5
[123]        0.7272  0.6496  0.6334  0.7178   0.7272  0.6496  0.6348  0.7178
[12]         0.5974  0.5526  0.5308  0.6022   0.5974  0.5526  0.5308  0.6022
[123]|[12]   0.3652  0.2940  0.2858  0.3416   0.4304  0.3640  0.3582  0.4136
[123]|[23]   0.7732  0.7248  0.7092  0.7714   0.7768  0.7206  0.7068  0.7722
Z1u = 0.05, n = 200
[123]        0.0748  0.0754  0.0774  0.0786   0.0748  0.0754  0.0776  0.0786
[12]         0.0838  0.0782  0.0816  0.0840   0.0838  0.0782  0.0816  0.0840
[123]|[12]   0.0626  0.0634  0.0630  0.0628   0.0628  0.0642  0.0638  0.0638
[123]|[23]   0.0852  0.0914  0.0910  0.0944   0.0868  0.0916  0.0914  0.0940
Z1u = 0.1
[123]        0.1470  0.1450  0.1478  0.1546   0.1470  0.1450  0.1478  0.1546
[12]         0.1626  0.1536  0.1588  0.1624   0.1626  0.1536  0.1596  0.1624
[123]|[12]   0.0880  0.0894  0.0884  0.0864   0.0896  0.0902  0.0898  0.0886
[123]|[23]   0.1872  0.1886  0.1884  0.1980   0.1886  0.1884  0.1890  0.1986
Z1u = 0.2
[123]        0.5060  0.4990  0.5026  0.5080   0.5060  0.4990  0.5028  0.5080
[12]         0.4744  0.4572  0.4628  0.4668   0.4744  0.4572  0.4626  0.4668
[123]|[12]   0.2242  0.2180  0.2172  0.2248   0.2330  0.2262  0.2252  0.2336
[123]|[23]   0.5998  0.5910  0.5892  0.6006   0.6026  0.5914  0.5906  0.6018
Z1u = 0.5
[123]        0.9966  0.9942  0.9934  0.9962   0.9966  0.9942  0.9934  0.9962
[12]         0.9500  0.9394  0.9344  0.9412   0.9500  0.9394  0.9342  0.9412
[123]|[12]   0.8748  0.8664  0.8614  0.8712   0.9008  0.8944  0.8918  0.9002
[123]|[23]   0.9988  0.9984  0.9974  0.9982   0.9988  0.9986  0.9976  0.9984
Tab. 1.5: Power of Ssn and San: P(Tn > c | Z1u); the column with cvc contains size-corrected power. α = 5%.
                    Jsn                              Jan
instr.       cvc     cv1hs   cv2hs   cvR      cvc     cv1ha   cv2ha   cvR
Z1u = 0.05, n = 50
[123]        0.0626  0.0368  0.0346  0.0620   0.0626  0.0366  0.0366  0.0620
[12]         0.0572  0.0462  0.0490  0.0598   0.0572  0.0458  0.0488  0.0598
[123]|[12]   0.0586  0.0386  0.0378  0.0548   0.0614  0.0360  0.0356  0.0552
[123]|[23]   0.0628  0.0410  0.0422  0.0628   0.0634  0.0396  0.0366  0.0618
Z1u = 0.1
[123]        0.0746  0.0408  0.0416  0.0742   0.0746  0.0412  0.0460  0.0742
[12]         0.0688  0.0544  0.0556  0.0746   0.0688  0.0542  0.0604  0.0746
[123]|[12]   0.0622  0.0392  0.0394  0.0580   0.0658  0.0392  0.0414  0.0606
[123]|[23]   0.0774  0.0480  0.0490  0.0770   0.0780  0.0456  0.0532  0.0756
Z1u = 0.2
[123]        0.1510  0.0914  0.0942  0.1470   0.1510  0.0902  0.1126  0.1470
[12]         0.1416  0.1178  0.1210  0.1536   0.1416  0.1178  0.1394  0.1536
[123]|[12]   0.0858  0.0530  0.0566  0.0800   0.0922  0.0560  0.0690  0.0840
[123]|[23]   0.1800  0.1196  0.1212  0.1718   0.1816  0.1160  0.1432  0.1714
Z1u = 0.5
[123]        0.6574  0.4644  0.4618  0.6488   0.6574  0.4630  0.6238  0.6488
[12]         0.5536  0.4746  0.4712  0.5600   0.5536  0.4740  0.5458  0.5600
[123]|[12]   0.2340  0.1564  0.1596  0.2202   0.2868  0.1896  0.3316  0.2686
[123]|[23]   0.7288  0.5666  0.5626  0.7112   0.7364  0.5588  0.6940  0.7152
Z1u = 0.05, n = 200
[123]        0.0758  0.0740  0.0762  0.0758   0.0758  0.0738  0.0756  0.0758
[12]         0.0820  0.0768  0.0798  0.0866   0.0820  0.0768  0.0816  0.0866
[123]|[12]   0.0622  0.0584  0.0586  0.0626   0.0616  0.0588  0.0612  0.0632
[123]|[23]   0.0898  0.0892  0.0896  0.0946   0.0900  0.0890  0.0916  0.0932
Z1u = 0.1
[123]        0.1476  0.1424  0.1446  0.1448   0.1476  0.1424  0.1492  0.1448
[12]         0.1558  0.1534  0.1578  0.1598   0.1558  0.1534  0.1620  0.1598
[123]|[12]   0.0820  0.0808  0.0798  0.0814   0.0824  0.0808  0.0896  0.0842
[123]|[23]   0.1860  0.1828  0.1836  0.1880   0.1856  0.1828  0.1908  0.1884
Z1u = 0.2
[123]        0.4998  0.4824  0.4854  0.4960   0.4998  0.4828  0.5070  0.4960
[12]         0.4664  0.4572  0.4580  0.4650   0.4664  0.4572  0.4664  0.4650
[123]|[12]   0.2088  0.2004  0.2020  0.2084   0.2194  0.2090  0.2302  0.2176
[123]|[23]   0.5832  0.5762  0.5778  0.5882   0.5840  0.5762  0.5986  0.5890
Z1u = 0.5
[123]        0.9952  0.9928  0.9926  0.9940   0.9952  0.9928  0.9954  0.9940
[12]         0.9444  0.9382  0.9332  0.9374   0.9444  0.9380  0.9404  0.9374
[123]|[12]   0.7868  0.7692  0.7650  0.7800   0.8256  0.8120  0.8954  0.8210
[123]|[23]   0.9978  0.9972  0.9960  0.9970   0.9980  0.9972  0.9976  0.9972
Tab. 1.6: Power of Jsn and Jan.
BIBLIOGRAPHY
Andrews, D. (1999): "Consistent Moment Selection Procedures for Generalized Method of Moments Estimation," Econometrica, 67(3), 543–564.

Andrews, D., and J. Stock (2007): "Inference with Weak Instruments," in Advances in Economics and Econometrics, Theory and Applications: Ninth World Congress of the Econometric Society, Vol. III, ed. by R. Blundell, W. K. Newey, and T. Persson. Cambridge, UK: Cambridge University Press, forthcoming.

Brown, B., and W. Newey (2002): "Generalized Method of Moments, Efficient Bootstrapping, and Improved Inference," Journal of Business & Economic Statistics, 20(4), 507–517.

Hall, A. (2000): "Covariance Matrix Estimation and the Power of the Overidentifying Restrictions Test," Econometrica, 68(6), 1517–1528.

Hall, A. (2005): Generalized Method of Moments. Oxford University Press.

Hall, P., and J. Horowitz (1996): "Bootstrap Critical Values for Tests Based on Generalized-Method-of-Moments Estimators," Econometrica, 64(4), 891–916.

Hansen, L. (1982): "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50(4), 1029–1054.

Imbens, G. (1997): "One-Step Estimators for Over-Identified Generalized Method of Moments Models," The Review of Economic Studies, 64(3), 359–383.

Kitamura, Y. (2001): "Asymptotic Optimality of Empirical Likelihood for Testing Moment Restrictions," Econometrica, 69(6), 1661–1672.

Newey, W., and R. Smith (2004): "Higher Order Properties of GMM and Generalized Empirical Likelihood Estimators," Econometrica, 72(1), 219–255.
Owen, A. (1991): "Empirical Likelihood for Linear Models," The Annals of Statistics, 19(4), 1725–1747.

Qin, J., and J. Lawless (1994): "Empirical Likelihood and General Estimating Equations," The Annals of Statistics, 22(1), 300–325.

Smith, R. (1997): "Alternative Semi-Parametric Likelihood Approaches to Generalised Method of Moments Estimation," The Economic Journal, 107(441), 503–519.