
©2015 Pearson Education, Inc.

Introduction to Econometrics (3rd Updated Edition)

by

James H. Stock and Mark W. Watson

Solutions to Odd-Numbered End-of-Chapter Exercises: Chapter 17

(This version August 17, 2014)

Stock/Watson - Introduction to Econometrics - 3rd Updated Edition - Answers to Exercises: Chapter 17

17.1. (a) Suppose there are n observations. Let b1 be an arbitrary estimator of β1. Given the estimator b1, the sum of squared errors for the given regression model is

\[
\sum_{i=1}^{n} (Y_i - b_1 X_i)^2 .
\]

$\hat{\beta}_1^{RLS}$, the restricted least squares estimator of β1, minimizes the sum of squared errors. That is, $\hat{\beta}_1^{RLS}$ satisfies the first-order condition for the minimization, which requires that the derivative of the sum of squared errors with respect to b1 equal zero:

\[
\sum_{i=1}^{n} 2 (Y_i - b_1 X_i)(-X_i) = 0 .
\]

Solving for b1 from the first-order condition leads to the restricted least squares estimator

\[
\hat{\beta}_1^{RLS} = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2} .
\]
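As a quick numerical illustration of part (a), the following minimal sketch computes the restricted least squares slope and checks that moving the slope away from it does not reduce the sum of squared errors. The function names and the data are hypothetical and serve only as an example.

```python
def rls_slope(x, y):
    """Slope of the regression through the origin: sum(X*Y) / sum(X^2)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all X values are zero; the estimator is undefined")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def sse(b1, x, y):
    """Sum of squared errors for a candidate slope b1 (no intercept)."""
    return sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

b_hat = rls_slope(x, y)
# Nudging the slope in either direction should not lower the SSE.
assert sse(b_hat, x, y) <= sse(b_hat + 0.01, x, y)
assert sse(b_hat, x, y) <= sse(b_hat - 0.01, x, y)
print(f"RLS slope: {b_hat:.4f}")
```

Because the sum of squared errors is a convex quadratic in b1, the first-order condition identifies the global minimum, which is what the two assertions probe locally.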

(b) We show first that $\hat{\beta}_1^{RLS}$ is unbiased. We can represent the restricted least squares estimator $\hat{\beta}_1^{RLS}$ in terms of the regressors and errors:

\[
\hat{\beta}_1^{RLS}
= \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}
= \frac{\sum_{i=1}^{n} X_i (\beta_1 X_i + u_i)}{\sum_{i=1}^{n} X_i^2}
= \beta_1 + \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} .
\]

Thus

\[
E(\hat{\beta}_1^{RLS})
= \beta_1 + E\left( \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} \right)
= \beta_1 + E\left[ \frac{\sum_{i=1}^{n} X_i E(u_i \mid X_1, \ldots, X_n)}{\sum_{i=1}^{n} X_i^2} \right]
= \beta_1 ,
\]

where the second equality follows by using the law of iterated expectations, and the third equality follows from

\[
\frac{\sum_{i=1}^{n} X_i E(u_i \mid X_1, \ldots, X_n)}{\sum_{i=1}^{n} X_i^2} = 0
\]

because the observations are i.i.d. and E(ui |Xi) = 0. (Note that E(ui |X1,…, Xn) = E(ui |Xi) because the observations are i.i.d.)

Under assumptions 1−3 of Key Concept 17.1, $\hat{\beta}_1^{RLS}$ is asymptotically normally distributed. The large sample normal approximation to the limiting distribution of $\hat{\beta}_1^{RLS}$ follows from considering

\[
\hat{\beta}_1^{RLS} - \beta_1
= \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2}
= \frac{\frac{1}{n} \sum_{i=1}^{n} X_i u_i}{\frac{1}{n} \sum_{i=1}^{n} X_i^2} .
\]

Consider first the numerator, which is the sample average of vi = Xiui. By assumption 1 of Key Concept 17.1, vi has mean zero: $E(X_i u_i) = E[X_i E(u_i \mid X_i)] = 0$. By assumption 2, vi is i.i.d. By assumption 3, var(vi) is finite. Let $\bar{v} = \frac{1}{n} \sum_{i=1}^{n} X_i u_i$; then $\sigma_{\bar{v}}^2 = \sigma_v^2 / n$. Using the central limit theorem, the sample average satisfies

\[
\frac{\bar{v}}{\sigma_{\bar{v}}} = \frac{1}{\sigma_v \sqrt{n}} \sum_{i=1}^{n} v_i \xrightarrow{d} N(0, 1)
\]

or

\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i u_i \xrightarrow{d} N(0, \sigma_v^2) .
\]

For the denominator, $X_i^2$ is i.i.d. with finite variance (because X has a finite fourth moment), so that by the law of large numbers

\[
\frac{1}{n} \sum_{i=1}^{n} X_i^2 \xrightarrow{p} E(X^2) .
\]

Combining the results on the numerator and the denominator and applying Slutsky's theorem leads to

\[
\sqrt{n}\,(\hat{\beta}_1^{RLS} - \beta_1)
= \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i u_i}{\frac{1}{n} \sum_{i=1}^{n} X_i^2}
\xrightarrow{d} N\left( 0, \frac{\operatorname{var}(X_i u_i)}{[E(X^2)]^2} \right) .
\]
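The limiting distribution in part (b) can be checked by simulation. The sketch below is a Monte Carlo illustration under an assumed design that is not part of the exercise: X ~ N(1, 1), u ~ N(0, 1), with X and u independent, and β1 = 2. Under that design var(Xu) = 2 and E(X²) = 2, so the limiting variance is 0.5.

```python
import random
import statistics


def rls_slope(x, y):
    """Regression-through-the-origin slope: sum(X*Y) / sum(X^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def scaled_errors(n, reps, beta1=2.0, seed=0):
    """Return `reps` draws of sqrt(n) * (beta_hat_RLS - beta1)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        # Assumed design for this sketch: X ~ N(1, 1), u ~ N(0, 1), independent.
        x = [rng.gauss(1.0, 1.0) for _ in range(n)]
        y = [beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
        draws.append(n ** 0.5 * (rls_slope(x, y) - beta1))
    return draws


# Under this design var(X u) = E(X^2) E(u^2) = 2 and E(X^2) = 2,
# so the limiting variance var(X u) / [E(X^2)]^2 equals 0.5.
draws = scaled_errors(n=200, reps=1000)
print("simulated mean:    ", round(statistics.mean(draws), 3))
print("simulated variance:", round(statistics.variance(draws), 3))
```

The simulated mean should be close to zero and the simulated variance close to 0.5, up to Monte Carlo error.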

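(c) $\hat{\beta}_1^{RLS}$ is a linear estimator:

\[
\hat{\beta}_1^{RLS} = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2} = \sum_{i=1}^{n} a_i Y_i ,
\quad \text{where} \quad a_i = \frac{X_i}{\sum_{j=1}^{n} X_j^2} .
\]

The weight ai (i = 1,…, n) depends on X1,…, Xn but not on Y1,…, Yn.

As in part (b),

\[
\hat{\beta}_1^{RLS} = \beta_1 + \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} .
\]

$\hat{\beta}_1^{RLS}$ is conditionally unbiased because

\[
E(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n)
= E\left( \beta_1 + \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} \,\Big|\, X_1, \ldots, X_n \right)
= \beta_1 + E\left( \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} \,\Big|\, X_1, \ldots, X_n \right)
= \beta_1 .
\]

The final equality used the fact that

\[
E\left( \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} \,\Big|\, X_1, \ldots, X_n \right)
= \frac{\sum_{i=1}^{n} X_i E(u_i \mid X_1, \ldots, X_n)}{\sum_{i=1}^{n} X_i^2} = 0
\]

because the observations are i.i.d. and E(ui |Xi) = 0.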

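(d) The conditional variance of $\hat{\beta}_1^{RLS}$, given X1,…, Xn, is

\[
\begin{aligned}
\operatorname{var}(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n)
&= \operatorname{var}\left( \beta_1 + \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} \,\Big|\, X_1, \ldots, X_n \right) \\
&= \frac{\sum_{i=1}^{n} X_i^2 \operatorname{var}(u_i \mid X_1, \ldots, X_n)}{\left( \sum_{i=1}^{n} X_i^2 \right)^2} \\
&= \frac{\sum_{i=1}^{n} X_i^2 \sigma_u^2}{\left( \sum_{i=1}^{n} X_i^2 \right)^2} \\
&= \frac{\sigma_u^2}{\sum_{i=1}^{n} X_i^2} .
\end{aligned}
\]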

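(e) The conditional variance of the OLS estimator $\hat{\beta}_1$ is

\[
\operatorname{var}(\hat{\beta}_1 \mid X_1, \ldots, X_n) = \frac{\sigma_u^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} .
\]

Since

\[
\sum_{i=1}^{n} (X_i - \bar{X})^2
= \sum_{i=1}^{n} X_i^2 - 2 \bar{X} \sum_{i=1}^{n} X_i + n \bar{X}^2
= \sum_{i=1}^{n} X_i^2 - n \bar{X}^2
< \sum_{i=1}^{n} X_i^2
\]

(the inequality is strict whenever $\bar{X} \neq 0$), the OLS estimator has a larger conditional variance:

\[
\operatorname{var}(\hat{\beta}_1 \mid X_1, \ldots, X_n) > \operatorname{var}(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n) .
\]

The restricted least squares estimator $\hat{\beta}_1^{RLS}$ is more efficient.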
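The two conditional variance formulas from parts (d) and (e) are easy to compare numerically. The following is a minimal sketch; the function names, the regressor values, and the error variance are hypothetical choices for illustration.

```python
def conditional_var_rls(x, sigma_u2):
    """var(beta_hat_RLS | X) = sigma_u^2 / sum(X_i^2)."""
    return sigma_u2 / sum(xi ** 2 for xi in x)


def conditional_var_ols(x, sigma_u2):
    """var(beta_hat_OLS | X) = sigma_u^2 / sum((X_i - Xbar)^2)."""
    xbar = sum(x) / len(x)
    return sigma_u2 / sum((xi - xbar) ** 2 for xi in x)


x = [1.0, 2.0, 3.0, 4.0]  # hypothetical regressor values
sigma_u2 = 1.0            # hypothetical error variance

print("RLS:", conditional_var_rls(x, sigma_u2))  # equals 1/30 for these values
print("OLS:", conditional_var_ols(x, sigma_u2))  # equals 1/5 for these values
```

For these values the sample mean of X is 2.5, so subtracting it shrinks the denominator from 30 to 5 and the OLS variance is six times larger.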

(f) Under assumption 5 of Key Concept 17.1, conditional on X1,…, Xn, $\hat{\beta}_1^{RLS}$ is normally distributed since it is a weighted average of normally distributed variables ui:

\[
\hat{\beta}_1^{RLS} = \beta_1 + \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2} .
\]

Using the conditional mean and conditional variance of $\hat{\beta}_1^{RLS}$ derived in parts (c) and (d) respectively, the sampling distribution of $\hat{\beta}_1^{RLS}$, conditional on X1,…, Xn, is

\[
\hat{\beta}_1^{RLS} \sim N\left( \beta_1, \frac{\sigma_u^2}{\sum_{i=1}^{n} X_i^2} \right) .
\]

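(g) The estimator

\[
\tilde{\beta}_1 = \frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} X_i}
= \frac{\sum_{i=1}^{n} (\beta_1 X_i + u_i)}{\sum_{i=1}^{n} X_i}
= \beta_1 + \frac{\sum_{i=1}^{n} u_i}{\sum_{i=1}^{n} X_i} .
\]

The conditional variance is

\[
\operatorname{var}(\tilde{\beta}_1 \mid X_1, \ldots, X_n)
= \operatorname{var}\left( \beta_1 + \frac{\sum_{i=1}^{n} u_i}{\sum_{i=1}^{n} X_i} \,\Big|\, X_1, \ldots, X_n \right)
= \frac{\sum_{i=1}^{n} \operatorname{var}(u_i \mid X_1, \ldots, X_n)}{\left( \sum_{i=1}^{n} X_i \right)^2}
= \frac{n \sigma_u^2}{\left( \sum_{i=1}^{n} X_i \right)^2} .
\]

The difference in the conditional variance of $\tilde{\beta}_1$ and $\hat{\beta}_1^{RLS}$ is

\[
\operatorname{var}(\tilde{\beta}_1 \mid X_1, \ldots, X_n) - \operatorname{var}(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n)
= \frac{n \sigma_u^2}{\left( \sum_{i=1}^{n} X_i \right)^2} - \frac{\sigma_u^2}{\sum_{i=1}^{n} X_i^2} .
\]

In order to prove $\operatorname{var}(\tilde{\beta}_1 \mid X_1, \ldots, X_n) \geq \operatorname{var}(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n)$, we need to show

\[
\frac{n}{\left( \sum_{i=1}^{n} X_i \right)^2} \geq \frac{1}{\sum_{i=1}^{n} X_i^2}
\]

or equivalently

\[
n \sum_{i=1}^{n} X_i^2 \geq \left( \sum_{i=1}^{n} X_i \right)^2 .
\]

This inequality follows directly from the Cauchy-Schwarz inequality

\[
\left[ \sum_{i=1}^{n} (a_i \cdot b_i) \right]^2 \leq \sum_{i=1}^{n} a_i^2 \cdot \sum_{i=1}^{n} b_i^2 ,
\]

which implies

\[
\left( \sum_{i=1}^{n} X_i \right)^2 = \left( \sum_{i=1}^{n} 1 \cdot X_i \right)^2
\leq \sum_{i=1}^{n} 1^2 \cdot \sum_{i=1}^{n} X_i^2 = n \sum_{i=1}^{n} X_i^2 .
\]

That is, $n \sum_{i=1}^{n} X_i^2 \geq \left( \sum_{i=1}^{n} X_i \right)^2$, or $\operatorname{var}(\tilde{\beta}_1 \mid X_1, \ldots, X_n) \geq \operatorname{var}(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n)$.

Note: because $\tilde{\beta}_1$ is linear and conditionally unbiased, the result $\operatorname{var}(\tilde{\beta}_1 \mid X_1, \ldots, X_n) \geq \operatorname{var}(\hat{\beta}_1^{RLS} \mid X_1, \ldots, X_n)$ follows directly from the Gauss-Markov theorem.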
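The inequality in part (g) can be illustrated with a few lines of code. This is a sketch only: the function names and the example values are hypothetical. It computes the Cauchy-Schwarz gap n ΣXi² − (ΣXi)² together with the two conditional variances.

```python
def conditional_var_ratio(x, sigma_u2):
    """var(beta_tilde | X) = n * sigma_u^2 / (sum X_i)^2."""
    return len(x) * sigma_u2 / sum(x) ** 2


def conditional_var_rls(x, sigma_u2):
    """var(beta_hat_RLS | X) = sigma_u^2 / sum(X_i^2)."""
    return sigma_u2 / sum(xi ** 2 for xi in x)


def cauchy_schwarz_gap(x):
    """n * sum(X_i^2) - (sum X_i)^2, which is never negative."""
    return len(x) * sum(xi ** 2 for xi in x) - sum(x) ** 2


for x in ([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0]):  # hypothetical regressors
    print(x, cauchy_schwarz_gap(x),
          conditional_var_ratio(x, 1.0), conditional_var_rls(x, 1.0))
```

The second data set has all Xi equal, which is the case where the Cauchy-Schwarz inequality holds with equality and the two estimators have the same conditional variance.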




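17.3. (a) Using Equation (17.19), we have

\[
\begin{aligned}
\sqrt{n}\,(\hat{\beta}_1 - \beta_1)
&= \sqrt{n}\, \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2} \\
&= \sqrt{n}\, \frac{\frac{1}{n} \sum_{i=1}^{n} [(X_i - \mu_X) - (\bar{X} - \mu_X)] u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2} \\
&= \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \mu_X) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}
 - \frac{(\bar{X} - \mu_X) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2} \\
&= \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}
 - \frac{(\bar{X} - \mu_X) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{aligned}
\]

by defining vi = (Xi − µX)ui.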

(b) The random variables u1,…, un are i.i.d. with mean µu = 0 and variance $0 < \sigma_u^2 < \infty$. By the central limit theorem,

\[
\frac{\sqrt{n}\,(\bar{u} - \mu_u)}{\sigma_u} = \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} u_i}{\sigma_u} \xrightarrow{d} N(0, 1) .
\]

The law of large numbers implies $\bar{X} \xrightarrow{p} \mu_X$, or $\bar{X} - \mu_X \xrightarrow{p} 0$. By the consistency of the sample variance, $\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$ converges in probability to the population variance, var(Xi), which is finite and non-zero. The result then follows from Slutsky's theorem.

(c) The random variable vi = (Xi − µX) ui has finite variance:

\[
\begin{aligned}
\operatorname{var}(v_i) &= \operatorname{var}[(X_i - \mu_X) u_i] \\
&\leq E[(X_i - \mu_X)^2 u_i^2] \\
&\leq \sqrt{E[(X_i - \mu_X)^4]\, E[u_i^4]} < \infty .
\end{aligned}
\]

The first inequality holds because a variance is no larger than the corresponding second moment, the second inequality follows by applying the Cauchy-Schwarz inequality, and the final bound follows because of the finite fourth moments for (Xi, ui). The finite variance along with the fact that vi has mean zero (by assumption 1 of Key Concept 17.1) and vi is i.i.d. (by assumption 2) implies that the sample average $\bar{v}$ satisfies the requirements of the central limit theorem. Thus,

\[
\frac{\bar{v}}{\sigma_{\bar{v}}} = \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i}{\sigma_v}
\]

satisfies the central limit theorem.

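(d) Applying the central limit theorem, we have

\[
\frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i}{\sigma_v} \xrightarrow{d} N(0, 1) .
\]

Because the sample variance is a consistent estimator of the population variance, we have

\[
\frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}{\operatorname{var}(X_i)} \xrightarrow{p} 1 .
\]

Using Slutsky's theorem,

\[
\frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i \,/\, \sigma_v}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \,/\, \sigma_X^2} \xrightarrow{d} N(0, 1) ,
\]

or equivalently

\[
\frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}
\xrightarrow{d} N\left( 0, \frac{\operatorname{var}(v_i)}{[\operatorname{var}(X_i)]^2} \right) .
\]

Thus

\[
\begin{aligned}
\sqrt{n}\,(\hat{\beta}_1 - \beta_1)
&= \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}
 - \frac{(\bar{X} - \mu_X) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2} \\
&\xrightarrow{d} N\left( 0, \frac{\operatorname{var}(v_i)}{[\operatorname{var}(X_i)]^2} \right)
\end{aligned}
\]

since the second term for $\sqrt{n}\,(\hat{\beta}_1 - \beta_1)$ converges in probability to zero as shown in part (b).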
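The limiting variance var(vi)/[var(Xi)]² can differ noticeably from the homoskedasticity-only formula. The sketch below simulates one assumed heteroskedastic design, chosen for illustration and not taken from the exercise: X ~ N(0, 1) and u = X·e with e ~ N(0, 1) independent of X. Then E(u | X) = 0, var(v) = E(X⁴) = 3 and var(X) = 1, so the limiting variance is 3, whereas σu²/var(X) would give 1.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope with an intercept: sum((X-Xbar)(Y-Ybar)) / sum((X-Xbar)^2)."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx


def scaled_ols_errors(n, reps, beta0=1.0, beta1=2.0, seed=0):
    """Return `reps` draws of sqrt(n) * (beta_hat_1 - beta1)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Assumed heteroskedastic error: u = X * e, e ~ N(0, 1), so E(u | X) = 0.
        y = [beta0 + beta1 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]
        draws.append(n ** 0.5 * (ols_slope(x, y) - beta1))
    return draws


draws = scaled_ols_errors(n=200, reps=1000)
print("simulated variance:", round(statistics.variance(draws), 2))
```

The simulated variance should be near 3 rather than 1, up to Monte Carlo and finite-sample error.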




17.5. Because $E(W^4) = [E(W^2)]^2 + \operatorname{var}(W^2)$, we have $[E(W^2)]^2 \leq E(W^4) < \infty$. Thus $E(W^2) < \infty$.
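The decomposition used in 17.5 is the usual variance identity applied to W², and it holds exactly for any empirical distribution. A small sketch with hypothetical values, using the population-variance convention (division by n):

```python
import statistics


def fourth_moment_pieces(w):
    """Return (E[W^4], [E(W^2)]^2, var(W^2)) for the empirical distribution of w."""
    w2 = [v ** 2 for v in w]
    e_w4 = sum(v ** 4 for v in w) / len(w)
    e_w2 = sum(w2) / len(w2)
    var_w2 = statistics.pvariance(w2)  # population variance, divides by n
    return e_w4, e_w2 ** 2, var_w2


e_w4, e_w2_sq, var_w2 = fourth_moment_pieces([-2.0, -1.0, 0.0, 1.0, 3.0])
print(e_w4, e_w2_sq + var_w2)  # the two numbers agree up to rounding
```

Since var(W²) is non-negative, the squared second moment can never exceed the fourth moment, which is the bound the solution relies on.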




17.7. (a) The joint probability distribution function of ui, uj, Xi, Xj is f (ui, uj, Xi, Xj). The conditional probability distribution function of ui and Xi given uj and Xj is f (ui, Xi |uj, Xj). Since ui, Xi, i = 1,…, n are i.i.d., f (ui, Xi |uj, Xj) = f (ui, Xi). By definition of the conditional probability distribution function, we have

\[
\begin{aligned}
f(u_i, u_j, X_i, X_j) &= f(u_i, X_i \mid u_j, X_j)\, f(u_j, X_j) \\
&= f(u_i, X_i)\, f(u_j, X_j) .
\end{aligned}
\]

(b) The conditional probability distribution function of ui and uj given Xi and Xj equals

\[
f(u_i, u_j \mid X_i, X_j)
= \frac{f(u_i, u_j, X_i, X_j)}{f(X_i, X_j)}
= \frac{f(u_i, X_i)\, f(u_j, X_j)}{f(X_i)\, f(X_j)}
= f(u_i \mid X_i)\, f(u_j \mid X_j) .
\]

The first and third equalities used the definition of the conditional probability distribution function. The second equality used the conclusion from part (a) and the independence between Xi and Xj. Substituting

\[
f(u_i, u_j \mid X_i, X_j) = f(u_i \mid X_i)\, f(u_j \mid X_j)
\]

into the definition of the conditional expectation, we have

\[
\begin{aligned}
E(u_i u_j \mid X_i, X_j)
&= \int\!\!\int u_i u_j\, f(u_i, u_j \mid X_i, X_j)\, du_i\, du_j \\
&= \int\!\!\int u_i u_j\, f(u_i \mid X_i)\, f(u_j \mid X_j)\, du_i\, du_j \\
&= \int u_i\, f(u_i \mid X_i)\, du_i \int u_j\, f(u_j \mid X_j)\, du_j \\
&= E(u_i \mid X_i)\, E(u_j \mid X_j) .
\end{aligned}
\]

(c) Let Q = (X1, X2,…, Xi – 1, Xi + 1,…, Xn), so that f (ui|X1,…, Xn) = f (ui |Xi, Q). Write

\[
\begin{aligned}
f(u_i \mid X_i, Q) &= \frac{f(u_i, X_i, Q)}{f(X_i, Q)} \\
&= \frac{f(u_i, X_i)\, f(Q)}{f(X_i)\, f(Q)} \\
&= \frac{f(u_i, X_i)}{f(X_i)} \\
&= f(u_i \mid X_i)
\end{aligned}
\]

where the first equality uses the definition of the conditional density, the second uses the fact that (ui, Xi) and Q are independent, the third cancels f (Q), and the final equality uses the definition of the conditional density. The result then follows directly.

(d) An argument like that used in (c) implies

\[
f(u_i, u_j \mid X_1, \ldots, X_n) = f(u_i, u_j \mid X_i, X_j)
\]

and the result then follows from part (b).




17.9. We need to prove

\[
\frac{1}{n} \sum_{i=1}^{n} \left[ (X_i - \bar{X})^2 \hat{u}_i^2 - (X_i - \mu_X)^2 u_i^2 \right] \xrightarrow{p} 0 .
\]

Using the identity $\bar{X} = \mu_X + (\bar{X} - \mu_X)$,

\[
\begin{aligned}
\frac{1}{n} \sum_{i=1}^{n} \left[ (X_i - \bar{X})^2 \hat{u}_i^2 - (X_i - \mu_X)^2 u_i^2 \right]
= {} & (\bar{X} - \mu_X)^2 \frac{1}{n} \sum_{i=1}^{n} \hat{u}_i^2 \\
& - 2 (\bar{X} - \mu_X) \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu_X) \hat{u}_i^2 \\
& + \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu_X)^2 (\hat{u}_i^2 - u_i^2) .
\end{aligned}
\]

The definition of $\hat{u}_i$ implies

\[
\begin{aligned}
\hat{u}_i^2 = {} & u_i^2 + (\hat{\beta}_0 - \beta_0)^2 + (\hat{\beta}_1 - \beta_1)^2 X_i^2 - 2 u_i (\hat{\beta}_0 - \beta_0) \\
& - 2 u_i (\hat{\beta}_1 - \beta_1) X_i + 2 (\hat{\beta}_0 - \beta_0)(\hat{\beta}_1 - \beta_1) X_i .
\end{aligned}
\]

Substituting this into the expression for $\frac{1}{n} \sum_{i=1}^{n} [ (X_i - \bar{X})^2 \hat{u}_i^2 - (X_i - \mu_X)^2 u_i^2 ]$ yields a series of terms, each of which can be written as $a_n b_n$ where $a_n \xrightarrow{p} 0$ and $b_n = \frac{1}{n} \sum_{i=1}^{n} X_i^r u_i^s$, where r and s are integers. For example, $a_n = (\bar{X} - \mu_X)$, $a_n = (\hat{\beta}_1 - \beta_1)$, and so forth. The result then follows from Slutsky's theorem if $\frac{1}{n} \sum_{i=1}^{n} X_i^r u_i^s \xrightarrow{p} d$, where d is a finite constant. Let $w_i = X_i^r u_i^s$ and note that wi is i.i.d. The law of large numbers can then be used for the desired result if $E(w_i^2) < \infty$. There are two cases that need to be addressed. In the first, both r and s are non-zero. In this case write

\[
E(w_i^2) = E(X_i^{2r} u_i^{2s}) \leq \sqrt{[E(X_i^{4r})][E(u_i^{4s})]}
\]

and this term is finite if r and s are less than 2. Inspection of the terms shows that this is true. In the second case, either r = 0 or s = 0. In this case the result follows directly if the non-zero exponent (r or s) is less than 4. Inspection of the terms shows that this is true.




17.11. Note: in early printings of the third edition there was a typographical error in the expression for µY|X. The correct expression is $\mu_{Y|X} = \mu_Y + (\sigma_{XY} / \sigma_X^2)(x - \mu_X)$.

(a) Using the hint and equation (17.38),

\[
\begin{aligned}
f_{Y|X=x}(y) = {} & \frac{1}{\sqrt{2 \pi \sigma_Y^2 (1 - \rho_{XY}^2)}} \\
& \times \exp\left( -\frac{1}{2 (1 - \rho_{XY}^2)} \left[ \left( \frac{x - \mu_X}{\sigma_X} \right)^2 - 2 \rho_{XY} \left( \frac{x - \mu_X}{\sigma_X} \right) \left( \frac{y - \mu_Y}{\sigma_Y} \right) + \left( \frac{y - \mu_Y}{\sigma_Y} \right)^2 \right] + \frac{1}{2} \left( \frac{x - \mu_X}{\sigma_X} \right)^2 \right) .
\end{aligned}
\]

Simplifying yields the desired expression.

(b) The result follows by noting that fY|X=x(y) is a normal density (see equation (17.36)) with $\mu = \mu_{Y|X}$ and $\sigma^2 = \sigma_{Y|X}^2$.

(c) Let $b = \sigma_{XY} / \sigma_X^2$ and $a = \mu_Y - b \mu_X$.
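The algebra in 17.11 can be sanity-checked numerically: dividing the bivariate normal density by the marginal density of X should reproduce a normal density in y with mean µY|X = a + bx and variance σY²(1 − ρXY²). The sketch below does this at one point; the parameter values and function names are hypothetical.

```python
import math


def normal_pdf(z, mean, sd):
    """Univariate normal density."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def bivariate_normal_pdf(x, y, mu_x, mu_y, sd_x, sd_y, rho):
    """Bivariate normal density with correlation rho."""
    zx = (x - mu_x) / sd_x
    zy = (y - mu_y) / sd_y
    q = (zx ** 2 - 2.0 * rho * zx * zy + zy ** 2) / (1.0 - rho ** 2)
    norm = 2.0 * math.pi * sd_x * sd_y * math.sqrt(1.0 - rho ** 2)
    return math.exp(-0.5 * q) / norm


def conditional_moments(x, mu_x, mu_y, sd_x, sd_y, rho):
    """Mean and variance of Y given X = x under joint normality."""
    sigma_xy = rho * sd_x * sd_y
    b = sigma_xy / sd_x ** 2        # slope from part (c)
    a = mu_y - b * mu_x             # intercept from part (c)
    return a + b * x, sd_y ** 2 * (1.0 - rho ** 2)


# Hypothetical parameter values and evaluation point.
params = dict(mu_x=1.0, mu_y=-2.0, sd_x=2.0, sd_y=3.0, rho=0.6)
x0, y0 = 0.5, -1.0
ratio = bivariate_normal_pdf(x0, y0, **params) / normal_pdf(x0, params["mu_x"], params["sd_x"])
mean, var = conditional_moments(x0, **params)
print(ratio, normal_pdf(y0, mean, math.sqrt(var)))  # the two densities agree
```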




17.13. (a) The answer is provided by equation (13.10) and the discussion following the equation. The result was also shown in Exercise 13.10, and the approach used in the exercise is discussed in part (b).

(b) Write the regression model as Yi = β0 + β1Xi + vi, where β0 = E(β0i), β1 = E(β1i), and vi = ui + (β0i − β0) + (β1i − β1)Xi. Notice that

E(vi | Xi) = E(ui|Xi) + E(β0i − β0| Xi) + XiE(β1i − β1|Xi) = 0

because β0i and β1i are independent of Xi. Because E(vi | Xi) = 0, the OLS regression of Yi on Xi will provide consistent estimates of β0 = E(β0i) and β1 = E(β1i). Recall that the weighted least squares estimator is the OLS estimator from the regression of Yi/σi onto 1/σi and Xi/σi, where $\sigma_i = \sqrt{\theta_0 + \theta_1 X_i^2}$. Write this regression as

\[
Y_i / \sigma_i = \beta_0 (1 / \sigma_i) + \beta_1 (X_i / \sigma_i) + v_i / \sigma_i .
\]

This regression has two regressors, 1/σi and Xi/σi. Because these regressors depend only on Xi, E(vi|Xi) = 0 implies that E(vi/σi | (1/σi), Xi/σi) = 0. Thus, weighted least squares provides a consistent estimator of β0 = E(β0i) and β1 = E(β1i).
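The consistency argument in part (b) can be illustrated by simulation. The sketch below assumes one particular design that is not specified in the exercise: X ~ N(1, 1), β0i ~ N(1, 0.25), β1i ~ N(2, 0.25), and var(ui | Xi) = θ0 + θ1Xi² with θ0 = 1 and θ1 = 0.5. Regressing Yi/σi on 1/σi and Xi/σi is computed equivalently as least squares with weights 1/σi².

```python
import random


def weighted_ls(x, y, w):
    """Minimize sum w_i (Y_i - b0 - b1 X_i)^2 and return (b0, b1)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    b0 = (swy - b1 * swx) / sw
    return b0, b1


def simulate(n, theta0=1.0, theta1=0.5, seed=0):
    """Return ((ols_b0, ols_b1), (wls_b0, wls_b1)) for one simulated sample."""
    rng = random.Random(seed)
    x, y, w = [], [], []
    for _ in range(n):
        xi = rng.gauss(1.0, 1.0)
        b0i = 1.0 + rng.gauss(0.0, 0.5)      # assumed: E(beta_0i) = 1
        b1i = 2.0 + rng.gauss(0.0, 0.5)      # assumed: E(beta_1i) = 2
        var_u = theta0 + theta1 * xi ** 2    # var(u_i | X_i)
        ui = rng.gauss(0.0, var_u ** 0.5)
        x.append(xi)
        y.append(b0i + b1i * xi + ui)
        w.append(1.0 / var_u)                # WLS weight 1 / sigma_i^2
    return weighted_ls(x, y, [1.0] * n), weighted_ls(x, y, w)


ols, wls = simulate(20000)
print("OLS (b0, b1):", ols)
print("WLS (b0, b1):", wls)
```

Both sets of estimates should be close to (1, 2), the means of the random coefficients in this assumed design.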




17.15.

(a) Write $W = \sum_{i=1}^{n} Z_i^2$, where Zi ~ N(0,1). From the law of large numbers, $W/n \xrightarrow{p} E(Z_i^2) = 1$.

(b) The numerator is N(0,1) and the denominator converges in probability to 1. The result follows from Slutsky's theorem (equation (17.9)).

(c) V/m is distributed $\chi_m^2 / m$ and the denominator converges in probability to 1. The result follows from Slutsky's theorem (equation (17.9)).
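Part (a) is a direct application of the law of large numbers, which a short simulation makes concrete. This sketch draws W as a sum of n squared standard normals and prints W/n for a few hypothetical values of n.

```python
import random


def chi_squared_over_n(n, rng):
    """Draw W = sum of n squared N(0, 1) variables and return W / n."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n


rng = random.Random(0)
for n in (10, 1000, 100000):
    print(n, round(chi_squared_over_n(n, rng), 3))
```

Because var(W/n) = 2/n, the printed ratios should settle near 1 as n grows.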
