Introduction to Econometrics- Stock & Watson -Ch 4 Slides.doc

8/14/2019 Introduction to Econometrics- Stock & Watson -Ch 4 Slides.doc


Introduction to Linear Regression

(SW Chapter 4)

Empirical problem: Class size and educational output

Policy question: What is the effect of reducing class size by one student per class? By 8 students per class?

What is the right output (performance) measure?

parent satisfaction

student personal development

future adult welfare

future adult earnings

performance on standardized tests


What do data say about class sizes and test scores?

The California Test Score Data Set

All K-6 and K-8 California school districts (n = 420)

Variables:

5th grade test scores (Stanford-9 achievement test, combined math and reading), district average

Student-teacher ratio (STR) = number of students in the district divided by number of full-time equivalent teachers


An initial look at the California test score data:


Do districts with smaller classes (lower STR) have higher test scores?


The class size/test score policy question:

What is the effect on test scores of reducing STR by one student/class?

Object of policy interest: $\Delta\,\text{Test score} / \Delta\,STR$

This is the slope of the line relating test score and STR.


This suggests that we want to draw a line through the Test Score v. STR scatterplot, but how?


Some Notation and Terminology

(Sections 4.1 and 4.2)

The population regression line:

$$\text{Test Score} = \beta_0 + \beta_1 STR$$

$\beta_1$ = slope of population regression line

= $\Delta\,\text{Test score} / \Delta\,STR$

= change in test score for a unit change in STR

Why are $\beta_0$ and $\beta_1$ "population" parameters?

We would like to know the population value of $\beta_1$.

We don't know $\beta_1$, so we must estimate it using data.


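How can we estimate $\beta_0$ and $\beta_1$ from data?

Recall that $\bar Y$ was the least squares estimator of $\mu_Y$: $\bar Y$ solves

$$\min_m \sum_{i=1}^n (Y_i - m)^2$$

By analogy, the least squares estimator of $\beta_0$ and $\beta_1$ solves

$$\min_{b_0, b_1} \sum_{i=1}^n [Y_i - (b_0 + b_1 X_i)]^2$$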


The OLS estimator solves:

$$\min_{b_0, b_1} \sum_{i=1}^n [Y_i - (b_0 + b_1 X_i)]^2$$

The OLS estimator minimizes the average squared difference between the actual values of $Y_i$ and the prediction (predicted value) based on the estimated line.

This minimization problem can be solved using calculus (App. 4.2).

The result is the OLS estimators of $\beta_0$ and $\beta_1$.
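The calculus solution has a well-known closed form: $\hat\beta_1 = \sum (X_i - \bar X)(Y_i - \bar Y) / \sum (X_i - \bar X)^2$ and $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$. The sketch below is only an illustration of those two formulas; the function name `ols_fit` and the five data points are hypothetical and are not the California data.

```python
def ols_fit(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample co-movement of x and y, and sample spread of x (both unscaled)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data (student-teacher ratio, test score)
x = [15.0, 18.0, 20.0, 22.0, 25.0]
y = [680.0, 665.0, 660.0, 650.0, 640.0]
b0, b1 = ols_fit(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1)
```

At the minimum the residuals sum to zero and are uncorrelated with x, which is what the first-order conditions of the minimization require.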

Why use OLS, rather than some other estimator?

OLS is a generalization of the sample average: if the "line" is just an intercept (no X), then the OLS estimator is just the sample average of $Y_1, \ldots, Y_n$ ($\bar Y$).

Like $\bar Y$, the OLS estimator has some desirable properties: under certain assumptions, it is unbiased (that is, $E(\hat\beta_1) = \beta_1$), and it has a tighter sampling distribution than some other candidate estimators of $\beta_1$ (more on this later).

Importantly, this is what everyone uses: the common "language" of linear regression.


Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$

Interpretation of the estimated slope and intercept

$\widehat{TestScore} = 698.9 - 2.28 \times STR$

Districts with one more student per teacher on average have test scores that are 2.28 points lower.

That is, $\Delta\,\text{Test score} / \Delta\,STR = -2.28$

The intercept (taken literally) means that, according to this estimated line, districts with zero students per teacher would have a (predicted) test score of 698.9.

This interpretation of the intercept makes no sense: it extrapolates the line outside the range of the data. In this application, the intercept is not itself economically meaningful.


Predicted values & residuals:

One of the districts in the data set is Antelope, CA, for which STR = 19.33 and Test Score = 657.8

predicted value: $\hat Y_{Antelope} = 698.9 - 2.28 \times 19.33 = 654.8$

residual: $\hat u_{Antelope} = 657.8 - 654.8 = 3.0$
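The Antelope arithmetic can be checked directly. This is a minimal sketch that plugs the estimated coefficients from the slides into the fitted line; the variable names are illustrative only.

```python
# Estimated intercept and slope reported in the slides
b0_hat = 698.9
b1_hat = -2.28

# Antelope, CA
str_antelope = 19.33
score_antelope = 657.8

predicted = b0_hat + b1_hat * str_antelope   # value on the fitted line
residual = score_antelope - predicted        # actual minus predicted
print(round(predicted, 1), round(residual, 1))
```

A positive residual means the district scored above what the line predicts for its student-teacher ratio.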

OLS regression: STATA output

regress testscr str, robust

Regression with robust standard errors            Number of obs =     420
                                                  F(  1,   418) =   19.26
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.0512
                                                  Root MSE      =  18.581

-------------------------------------------------------------------------
         |               Robust
 testscr |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
---------+---------------------------------------------------------------
     str |  -2.279808   .5194892   -4.39   0.000   -3.300945   -1.258671
   _cons |    698.933   10.36436   67.44   0.000    678.5602    719.3057
-------------------------------------------------------------------------

(we'll discuss the rest of this output later)


The OLS regression line is an estimate, computed using our sample of data; a different sample would have given a different value of $\hat\beta_1$.

How can we:

quantify the sampling uncertainty associated with $\hat\beta_1$?

use $\hat\beta_1$ to test hypotheses such as $\beta_1 = 0$?

construct a confidence interval for $\beta_1$?

Like estimation of the mean, we proceed in four steps:

1. The probability framework for linear regression

2. Estimation

3. Hypothesis Testing

4. Confidence intervals


1. Probability Framework for Linear Regression

Population

population of interest (ex: all possible school districts)

Random variables: Y, X

Ex: (Test Score, STR)

Joint distribution of (Y, X)

The key feature is that we suppose there is a linear relation in the population that relates X and Y; this linear relation is the "population linear regression".


The Population Linear Regression Model (Section 4.3)

$$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \ldots, n$$

X is the independent variable or regressor

Y is the dependent variable

$\beta_0$ = intercept

$\beta_1$ = slope

$u_i$ = "error term"

The error term consists of omitted factors, or possibly measurement error in the measurement of Y. In general, these omitted factors are other factors that influence Y, other than the variable X.


Ex.: The population regression line and the error term

What are some of the omitted factors in this example?


Data and sampling

The population objects ("parameters") $\beta_0$ and $\beta_1$ are unknown; so to draw inferences about these unknown parameters we must collect relevant data.

Simple random sampling:

Choose n entities at random from the population of interest, and observe (record) X and Y for each entity.

Simple random sampling implies that $\{(X_i, Y_i)\}$, $i = 1, \ldots, n$, are independently and identically distributed (i.i.d.).

(Note: $(X_i, Y_i)$ are distributed independently of $(X_j, Y_j)$ for different observations i and j.)


Task at hand: to characterize the sampling distribution of the OLS estimator. To do so, we make three assumptions:

The Least Squares Assumptions

1. The conditional distribution of u given X has mean zero, that is, $E(u|X = x) = 0$.

2. $(X_i, Y_i)$, $i = 1, \ldots, n$, are i.i.d.

3. X and u have four finite moments, that is: $E(X^4) < \infty$ and $E(u^4) < \infty$.

We'll discuss these assumptions in order.


Least squares assumption #1: $E(u|X = x) = 0$.

For any given value of X, the mean of u is zero.


Example: Assumption #1 and the class size example

$Test\ Score_i = \beta_0 + \beta_1 STR_i + u_i$, $u_i$ = other factors

"Other factors:"

parental involvement

outside learning opportunities (extra math class, ...)

home environment conducive to reading

family income is a useful proxy for many such factors

So $E(u|X = x) = 0$ means $E(\text{Family Income}|STR)$ = constant (which implies that family income and STR are uncorrelated). This assumption is not innocuous! We will return to it often.


Least squares assumption #2:

$(X_i, Y_i)$, $i = 1, \ldots, n$ are i.i.d.

This arises automatically if the entity (individual, district) is sampled by simple random sampling: the entity is selected, then, for that entity, X and Y are observed (recorded).

The main place we will encounter non-i.i.d. sampling is when data are recorded over time ("time series data"); this will introduce some extra complications.


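Least squares assumption #3:

$E(X^4) < \infty$ and $E(u^4) < \infty$

Because $Y_i = \beta_0 + \beta_1 X_i + u_i$ and $\bar Y = \beta_0 + \beta_1 \bar X + \bar u$, we have $Y_i - \bar Y = \beta_1 (X_i - \bar X) + (u_i - \bar u)$. Thus,

$$\hat\beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^n (X_i - \bar X)^2} = \frac{\sum_{i=1}^n (X_i - \bar X)[\beta_1 (X_i - \bar X) + (u_i - \bar u)]}{\sum_{i=1}^n (X_i - \bar X)^2}$$

$$\hat\beta_1 = \beta_1 \frac{\sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)}{\sum_{i=1}^n (X_i - \bar X)^2} + \frac{\sum_{i=1}^n (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^n (X_i - \bar X)^2}$$

so

$$\hat\beta_1 - \beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^n (X_i - \bar X)^2}$$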


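We can simplify this formula by noting that:

$$\sum_{i=1}^n (X_i - \bar X)(u_i - \bar u) = \sum_{i=1}^n (X_i - \bar X) u_i - \left[\sum_{i=1}^n (X_i - \bar X)\right] \bar u = \sum_{i=1}^n (X_i - \bar X) u_i$$

Thus

$$\hat\beta_1 - \beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X) u_i}{\sum_{i=1}^n (X_i - \bar X)^2} = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$

where $v_i = (X_i - \bar X) u_i$.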


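$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}, \quad \text{where } v_i = (X_i - \bar X) u_i$$

We now can calculate the mean and variance of $\hat\beta_1$:

$$E(\hat\beta_1 - \beta_1) = E\left[\frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}\right] = \left(\frac{n}{n-1}\right) E\left[\frac{1}{n} \sum_{i=1}^n \frac{v_i}{s_X^2}\right] = \left(\frac{n}{n-1}\right) \frac{1}{n} \sum_{i=1}^n E\left(\frac{v_i}{s_X^2}\right)$$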


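Now $E(v_i / s_X^2) = E[(X_i - \bar X) u_i / s_X^2] = 0$

because $E(u_i | X_i = x) = 0$ (for details see App. 4.3)

Thus,

$$E(\hat\beta_1 - \beta_1) = \left(\frac{n}{n-1}\right) \frac{1}{n} \sum_{i=1}^n E\left(\frac{v_i}{s_X^2}\right) = 0$$

so

$$E(\hat\beta_1) = \beta_1$$

That is, $\hat\beta_1$ is an unbiased estimator of $\beta_1$.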
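Unbiasedness is a statement about the average of $\hat\beta_1$ over repeated samples, so it can be illustrated by simulation. The sketch below is a hedged Monte Carlo illustration, not part of the original slides: the data-generating process (standard normal X and u, drawn independently so that $E(u|X = x) = 0$), the parameter values, and the function names are all assumptions chosen for the example.

```python
import random


def ols_slope(x, y):
    """OLS slope: sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_mean_slope(beta0=1.0, beta1=2.0, n=50, reps=2000, seed=0):
    """Average the OLS slope over many independent samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # u is drawn independently of x, so E(u | X = x) = 0 holds by construction
        y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
        total += ols_slope(x, y)
    return total / reps


mean_slope = simulate_mean_slope()
print(mean_slope)  # should be close to the true slope, 2.0
```

Any single sample gives a slope above or below 2.0; it is the average across samples that settles near the true value.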

Calculation of the variance of $\hat\beta_1$:

$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$

This calculation is simplified by supposing that n is large (so that $s_X^2$ can be replaced by $\sigma_X^2$); the result is

$$\mathrm{var}(\hat\beta_1) = \frac{\mathrm{var}(v)}{n \sigma_X^4}$$

(For details see App. 4.3.)


The exact sampling distribution is complicated, but when the sample size is large we get some simple (and good) approximations:


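$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$

When n is large:

$v_i = (X_i - \bar X) u_i \approx (X_i - \mu_X) u_i$, which is i.i.d. (why?) and has two moments, that is, $\mathrm{var}(v_i) < \infty$ (why?). Thus $\frac{1}{n} \sum_{i=1}^n v_i$ is distributed $N(0, \mathrm{var}(v)/n)$ when n is large.

$s_X^2$ is approximately equal to $\sigma_X^2$ when n is large.

$\frac{n-1}{n} = 1 - \frac{1}{n} \approx 1$ when n is large.

Putting these together we have: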


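Large-n approximation to the distribution of $\hat\beta_1$:

$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2} \approx \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\sigma_X^2}$$

which is approximately distributed $N\left(0, \frac{\sigma_v^2}{n (\sigma_X^2)^2}\right)$.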


Recall the summary of the sampling distribution of $\bar Y$:

For $(Y_1, \ldots, Y_n)$ i.i.d. with $0 < \sigma_Y^2 < \infty$,

The exact (finite sample) sampling distribution of $\bar Y$ has mean $\mu_Y$ ("$\bar Y$ is an unbiased estimator of $\mu_Y$") and variance $\sigma_Y^2 / n$

Other than its mean and variance, the exact distribution of $\bar Y$ is complicated and depends on the distribution of Y

$\bar Y \xrightarrow{p} \mu_Y$ (law of large numbers)

$\dfrac{\bar Y - E(\bar Y)}{\sqrt{\mathrm{var}(\bar Y)}}$ is approximately distributed N(0,1) (CLT)


Parallel conclusions hold for the OLS estimator $\hat\beta_1$:

Under the three Least Squares Assumptions,

The exact (finite sample) sampling distribution of $\hat\beta_1$ has mean $\beta_1$ ("$\hat\beta_1$ is an unbiased estimator of $\beta_1$"), and $\mathrm{var}(\hat\beta_1)$ is inversely proportional to n.

Other than its mean and variance, the exact distribution of $\hat\beta_1$ is complicated and depends on the distribution of (X, u)

$\hat\beta_1 \xrightarrow{p} \beta_1$ (law of large numbers)

$\dfrac{\hat\beta_1 - E(\hat\beta_1)}{\sqrt{\mathrm{var}(\hat\beta_1)}}$ is approximately distributed N(0,1) (CLT)


1. The probability framework for linear regression

2. Estimation

3. Hypothesis Testing (Section 4.5)

4. Confidence intervals

Suppose a skeptic suggests that reducing the number of students in a class has no effect on learning or, specifically, test scores. The skeptic thus asserts the hypothesis,

$$H_0: \beta_1 = 0$$

We wish to test this hypothesis using data, that is, to reach a tentative conclusion about whether it is correct or incorrect.


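For the mean of Y, the t-statistic is

$$t = \frac{\bar Y - \mu_{Y,0}}{s_Y / \sqrt{n}}$$

then reject the null hypothesis if $|t| > 1.96$.

The denominator is the standard error (SE) of the estimator, where the SE of the estimator is the square root of an estimator of the variance of the estimator.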


Applied to a hypothesis about $\beta_1$:

$$t = \frac{\text{estimator} - \text{hypothesized value}}{\text{standard error of the estimator}}$$

so

$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)}$$

where $\beta_{1,0}$ is the value of $\beta_1$ hypothesized under the null (for example, if the null value is zero, then $\beta_{1,0} = 0$).

What is $SE(\hat\beta_1)$?

$SE(\hat\beta_1)$ = the square root of an estimator of the variance of the sampling distribution of $\hat\beta_1$


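Recall the expression for the variance of $\hat\beta_1$ (large n):

$$\mathrm{var}(\hat\beta_1) = \frac{\mathrm{var}[(X_i - \mu_X) u_i]}{n (\sigma_X^2)^2} = \frac{\sigma_v^2}{n \sigma_X^4}$$

where $v_i = (X_i - \bar X) u_i$, which is approximately $(X_i - \mu_X) u_i$ when n is large.

Estimator of the variance of $\hat\beta_1$:

$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\text{estimator of } \sigma_v^2}{(\text{estimator of } \sigma_X^2)^2} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2\right]^2}$$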


$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2\right]^2}$$

OK, this is a bit nasty, but:

There is no reason to memorize this.

It is computed automatically by regression software.

$SE(\hat\beta_1) = \sqrt{\hat\sigma^2_{\hat\beta_1}}$ is reported by regression software.

It is less complicated than it seems. The numerator estimates var(v); the denominator estimates $[\mathrm{var}(X)]^2$.


Return to calculation of the t-statistic:

$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} = \frac{\hat\beta_1 - \beta_{1,0}}{\sqrt{\hat\sigma^2_{\hat\beta_1}}}$$

Reject at 5% significance level if $|t| > 1.96$

p-value is $p = \Pr[|t| > |t^{act}|]$ = probability in tails of normal outside $|t^{act}|$


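Example: Test Scores and STR, California data

Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$

Regression software reports the standard errors:

$SE(\hat\beta_0) = 10.4$, $SE(\hat\beta_1) = 0.52$

t-statistic testing $\beta_{1,0} = 0$:

$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} = \frac{-2.28 - 0}{0.52} = -4.38$$

The two-sided 1% critical value is 2.58, so we reject the null at the 1% significance level.

Alternatively, we can compute the p-value.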


The p-value based on the large-n standard normal approximation to the t-statistic is about 0.00001 ($10^{-5}$).
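The t-statistic and its large-n p-value can be reproduced from the two reported numbers. The sketch below uses the rounded slope and standard error from the slides and the standard normal tail identity $2(1 - \Phi(|t|)) = \mathrm{erfc}(|t|/\sqrt{2})$; the variable names are illustrative.

```python
import math

beta1_hat = -2.28   # estimated slope reported in the slides
beta1_null = 0.0    # value of beta_1 under the null hypothesis
se_beta1 = 0.52     # reported (robust) standard error

t_stat = (beta1_hat - beta1_null) / se_beta1

# Two-sided p-value under the N(0,1) approximation
p_value = math.erfc(abs(t_stat) / math.sqrt(2))

reject_5pct = abs(t_stat) > 1.96
reject_1pct = abs(t_stat) > 2.58
print(round(t_stat, 2), p_value, reject_5pct, reject_1pct)
```

Because the inputs are rounded, the p-value is only an approximation to what software computes from the unrounded estimates.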

1. The probability framework for linear regression

2. Estimation

3. Hypothesis Testing

4. Confidence intervals (Section 4.6)

In general, if the sampling distribution of an estimator is normal for large n, then a 95% confidence interval can be constructed as estimator ± 1.96 × standard error.

So: a 95% confidence interval for $\beta_1$ is,

$$\{\hat\beta_1 \pm 1.96 \times SE(\hat\beta_1)\}$$


Example: Test Scores and STR, California data

Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$

$SE(\hat\beta_0) = 10.4$, $SE(\hat\beta_1) = 0.52$

95% confidence interval for $\beta_1$:

$$\{\hat\beta_1 \pm 1.96 \times SE(\hat\beta_1)\} = \{-2.28 \pm 1.96 \times 0.52\} = (-3.30, -1.26)$$

Equivalent statements:

The 95% confidence interval does not include zero;

The hypothesis $\beta_1 = 0$ is rejected at the 5% level
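The interval arithmetic is short enough to verify by hand or in code. This sketch uses the rounded slope and standard error from the slides; the names are illustrative.

```python
beta1_hat = -2.28   # estimated slope reported in the slides
se_beta1 = 0.52     # reported standard error
z = 1.96            # 97.5th percentile of N(0,1), rounded as in the slides

lower = beta1_hat - z * se_beta1
upper = beta1_hat + z * se_beta1

# The interval excludes zero exactly when |t| > 1.96
contains_zero = lower <= 0.0 <= upper
print(round(lower, 2), round(upper, 2), contains_zero)
```

The check on `contains_zero` mirrors the equivalence stated in the text: excluding zero from the 95% interval is the same event as rejecting $\beta_1 = 0$ at the 5% level.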

A convention for reporting estimated regressions:

Put standard errors in parentheses below the estimates

$\widehat{TestScore}$ = 698.9 − 2.28 × STR
                         (10.4)  (0.52)

This expression means that:

The estimated regression line is $\widehat{TestScore} = 698.9 - 2.28 \times STR$

The standard error of $\hat\beta_0$ is 10.4

The standard error of $\hat\beta_1$ is 0.52



Regression with robust standard errors            Number of obs =     420
                                                  F(  1,   418) =   19.26
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.0512
                                                  Root MSE      =  18.581

-------------------------------------------------------------------------
         |               Robust
 testscr |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
---------+---------------------------------------------------------------
     str |  -2.279808   .5194892   -4.38   0.000   -3.300945   -1.258671
   _cons |    698.933   10.36436   67.44   0.000    678.5602    719.3057
-------------------------------------------------------------------------

so:

$\widehat{TestScore}$ = 698.9 − 2.28 × STR
                         (10.4)  (0.52)

t ($\beta_1 = 0$) = −4.38, p-value = 0.000

95% conf. interval for $\beta_1$ is (−3.30, −1.26)


Regression when X is Binary (Section 4.7)

Sometimes a regressor is binary:

X = 1 if female, = 0 if male

X = 1 if treated (experimental drug), = 0 if not

X = 1 if small class size, = 0 if not

So far, $\beta_1$ has been called a "slope," but that doesn't make much sense if X is binary.

How do we interpret regression with a binary regressor?


$Y_i = \beta_0 + \beta_1 X_i + u_i$, where X is binary ($X_i$ = 0 or 1):

When $X_i = 0$: $Y_i = \beta_0 + u_i$

When $X_i = 1$: $Y_i = \beta_0 + \beta_1 + u_i$

thus:

When $X_i = 0$, the mean of $Y_i$ is $\beta_0$

When $X_i = 1$, the mean of $Y_i$ is $\beta_0 + \beta_1$

that is:

$E(Y_i | X_i = 0) = \beta_0$

$E(Y_i | X_i = 1) = \beta_0 + \beta_1$

so:

$\beta_1 = E(Y_i | X_i = 1) - E(Y_i | X_i = 0)$

= population difference in group means

Example: TestScore and STR, California data

Let $D_i = 1$ if $STR_i < 20$, and $D_i = 0$ if $STR_i \ge 20$

The OLS estimate of the regression line relating TestScore to D (with standard errors in parentheses) is:

$\widehat{TestScore}$ = 650.0 + 7.4 × D
                         (1.3)   (1.8)

Difference in means between groups = 7.4;

SE = 1.8, t = 7.4/1.8 = 4.0


Compare the regression results with the group means, computed directly:

Class Size          Average score ($\bar Y$)   Std. dev. ($s_Y$)   N
Small (STR < 20)    657.4                      19.4                238
Large (STR ≥ 20)    650.0                      17.9                182

Estimation: $\bar Y_{small} - \bar Y_{large} = 657.4 - 650.0 = 7.4$

Test $\Delta = 0$: $t = \dfrac{\bar Y_s - \bar Y_l}{SE(\bar Y_s - \bar Y_l)} = \dfrac{7.4}{1.83} = 4.05$

95% confidence interval = {7.4 ± 1.96 × 1.83} = (3.8, 11.0)

This is the same as in the regression!

$\widehat{TestScore}$ = 650.0 + 7.4 × D
                         (1.3)   (1.8)
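The group-means comparison can be reproduced from the summary statistics alone. The sketch below assumes the usual unpooled standard error for a difference in means, $\sqrt{s_s^2/n_s + s_l^2/n_l}$, which is not written out on the slide; the tuple layout and names are illustrative.

```python
import math

# Group summaries reported for the California data: (mean, std. dev., n)
small = (657.4, 19.4, 238)
large = (650.0, 17.9, 182)

diff = small[0] - large[0]

# Unpooled standard error of the difference in means (assumed formula)
se_diff = math.sqrt(small[1] ** 2 / small[2] + large[1] ** 2 / large[2])

t_stat = diff / se_diff
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)
print(round(diff, 1), round(se_diff, 2), round(t_stat, 2), ci)
```

The difference, standard error, and t-statistic agree (up to rounding) with the coefficient on D, its standard error, and its t-statistic in the binary-regressor regression.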

Summary: regression when $X_i$ is binary (0/1)

$Y_i = \beta_0 + \beta_1 X_i + u_i$

$\beta_0$ = mean of Y given that X = 0

$\beta_0 + \beta_1$ = mean of Y given that X = 1

$\beta_1$ = difference in group means, X = 1 minus X = 0

$SE(\hat\beta_1)$ has the usual interpretation

t-statistics, confidence intervals constructed as usual

This is another way to do difference-in-means analysis

The regression formulation is especially useful when we have additional regressors (coming up soon...)

Other Regression Statistics (Section 4.8)

A natural question is how well the regression line "fits" or explains the data. There are two regression statistics that provide complementary measures of the quality of fit:

The regression $R^2$ measures the fraction of the variance of Y that is explained by X; it is unitless and ranges between zero (no fit) and one (perfect fit)

The standard error of the regression measures the fit, that is, the typical size of a regression residual, in the units of Y.

The $R^2$

Write $Y_i$ as the sum of the OLS prediction + OLS residual:

$$Y_i = \hat Y_i + \hat u_i$$

The $R^2$ is the fraction of the sample variance of $Y_i$ "explained" by the regression, that is, by $\hat Y_i$:

$$R^2 = \frac{ESS}{TSS}$$

where $ESS = \sum_{i=1}^n (\hat Y_i - \bar{\hat Y})^2$ and $TSS = \sum_{i=1}^n (Y_i - \bar Y)^2$.

The $R^2$:

$R^2 = 0$ means ESS = 0, so X explains none of the variation of Y

$R^2 = 1$ means ESS = TSS, so $Y = \hat Y$, so X explains all of the variation of Y

$0 \le R^2 \le 1$

For regression with a single regressor (the case here), $R^2$ is the square of the correlation coefficient between X and Y

The Standard Error of the Regression (SER)

The standard error of the regression is (almost) the sample standard deviation of the OLS residuals:

$$SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n (\hat u_i - \bar{\hat u})^2} = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}$$

(the second equality holds because $\frac{1}{n} \sum_{i=1}^n \hat u_i = 0$).


$$SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}$$

The SER:

has the units of u, which are the units of Y

measures the spread of the distribution of u

measures the average "size" of the OLS residual (the average "mistake" made by the OLS regression line)

The root mean squared error (RMSE) is closely related to the SER:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^n \hat u_i^2}$$

This measures the same thing as the SER; the minor difference is multiplication by 1/n instead of 1/(n−2).
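The definitions of $R^2$, SER, and RMSE translate directly into a few sums. The sketch below is an illustration under stated assumptions: the function name `fit_statistics` and the five data points are hypothetical (they are not the California data), and the regression includes an intercept so that the mean of the fitted values equals $\bar Y$.

```python
import math


def fit_statistics(x, y):
    """Return (r2, ser, rmse) for the OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    resid = [yi - yh for yi, yh in zip(y, y_hat)]
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)  # mean of y_hat equals y_bar
    tss = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum(u ** 2 for u in resid)
    r2 = ess / tss
    ser = math.sqrt(ssr / (n - 2))   # degrees-of-freedom correction
    rmse = math.sqrt(ssr / n)        # no correction
    return r2, ser, rmse


# Hypothetical toy data
x = [15.0, 18.0, 20.0, 22.0, 25.0]
y = [680.0, 665.0, 660.0, 650.0, 640.0]
r2, ser, rmse = fit_statistics(x, y)
print(r2, ser, rmse)
```

With only five observations the SER is noticeably larger than the RMSE; the gap shrinks as n grows, since the two differ only by the factor $\sqrt{n/(n-2)}$.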

Technical note: why divide by n−2 instead of n−1?

$$SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}$$

Division by n−2 is a "degrees of freedom" correction, like division by n−1 in $s_Y^2$; the difference is that, in the SER, two parameters have been estimated ($\beta_0$ and $\beta_1$, by $\hat\beta_0$ and $\hat\beta_1$), whereas in $s_Y^2$ only one has been estimated ($\mu_Y$, by $\bar Y$).

When n is large, it makes negligible difference whether n, n−1, or n−2 is used, although the conventional formula uses n−2 when there is a single regressor.

For details, see Section 15.4


Example of $R^2$ and SER

$\widehat{TestScore}$ = 698.9 − 2.28 × STR, $R^2$ = .05, SER = 18.6
                         (10.4)  (0.52)

The slope coefficient is statistically significant and large in a policy sense, even though STR explains only a small fraction of the variation in test scores.

A Practical Note: Heteroskedasticity, Homoskedasticity, and the Formula for the Standard Errors of $\hat\beta_0$ and $\hat\beta_1$ (Section 4.9)

What do these two terms mean?

Consequences of homoskedasticity

Implication for computing standard errors

What do these two terms mean?

If $\mathrm{var}(u|X = x)$ is constant, that is, the variance of the conditional distribution of u given X does not depend on X, then u is said to be homoskedastic. Otherwise, u is said to be heteroskedastic.

Homoskedasticity in a picture:

$E(u|X = x) = 0$ (u satisfies Least Squares Assumption #1)

The variance of u does not change with (depend on) x

Heteroskedasticity in a picture:

$E(u|X = x) = 0$ (u satisfies Least Squares Assumption #1)

The variance of u depends on x, so u is heteroskedastic.

A real-world example of heteroskedasticity from labor economics: average hourly earnings vs. years of education (data source: 1999 Current Population Survey)

[Figure: Scatterplot and OLS Regression Line. Vertical axis: average hourly earnings (with fitted values); horizontal axis: years of education.]

Is heteroskedasticity present in the class size data?

Hard to say... it looks nearly homoskedastic, but the spread might be tighter for large values of STR.

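So far we have (without saying so) allowed u to be heteroskedastic:

Recall the three least squares assumptions:

1. The conditional distribution of u given X has mean zero, that is, $E(u|X = x) = 0$.

2. $(X_i, Y_i)$, $i = 1, \ldots, n$, are i.i.d.

3. X and u have four finite moments.

Heteroskedasticity and homoskedasticity concern $\mathrm{var}(u|X = x)$. None of the three assumptions restricts this conditional variance, so heteroskedasticity has been allowed all along.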


If the errors are in fact homoskedastic:

You can prove some theorems about OLS (in particular, the Gauss-Markov theorem, which says that OLS is the estimator with the lowest variance among all unbiased estimators that are linear functions of $(Y_1, \ldots, Y_n)$; see Section 15.5).

The formula for the variance of $\hat\beta_1$ and the OLS standard error simplifies (App. 4.4): If $\mathrm{var}(u_i | X_i = x) = \sigma_u^2$, then

$$\mathrm{var}(\hat\beta_1) = \frac{\mathrm{var}[(X_i - \mu_X) u_i]}{n (\sigma_X^2)^2} = \ldots = \frac{\sigma_u^2}{n \sigma_X^2}$$

Note: $\mathrm{var}(\hat\beta_1)$ is inversely proportional to var(X): more spread in X means more information about $\beta_1$.

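General formula for the standard error of $\hat\beta_1$ is the square root of:

$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2\right]^2}$$

Special case under homoskedasticity:

$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}{\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2}$$

Sometimes it is said that the lower formula is simpler.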
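The two variance formulas differ only in the numerator and in whether the denominator is squared, which is easy to see side by side in code. The sketch below implements both formulas as written here; the function name `slope_standard_errors` and the eight data points (constructed so that the residual spread grows with x) are hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (b1, se_robust, se_homoskedastic) for the OLS slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d ** 2 for d in dx)
    b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    var_x = sxx / n  # (1/n) * sum (X_i - X_bar)^2
    # General (heteroskedasticity-robust) numerator: weights u_hat^2 by (X_i - X_bar)^2
    num_robust = sum((d * u) ** 2 for d, u in zip(dx, resid)) / (n - 2)
    # Homoskedasticity-only numerator: a single estimate of the error variance
    num_homosk = sum(u ** 2 for u in resid) / (n - 2)

    var_robust = num_robust / var_x ** 2 / n
    var_homosk = num_homosk / var_x / n
    return b1, math.sqrt(var_robust), math.sqrt(var_homosk)


# Hypothetical data whose residual spread grows with x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 1.9, 3.2, 3.7, 5.6, 5.2, 8.1, 6.6]
b1, se_robust, se_homosk = slope_standard_errors(x, y)
print(b1, se_robust, se_homosk)
```

The slope estimate is the same under either formula; only the standard error, and therefore the t-statistic and confidence interval, changes.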

The homoskedasticity-only formula for the standard error of $\hat\beta_1$ and the "heteroskedasticity-robust" formula (the formula that is valid under heteroskedasticity) differ: in general, you get different standard errors using the different formulas.

Homoskedasticity-only standard errors are the default setting in regression software, and sometimes the only setting (e.g. Excel). To get the general "heteroskedasticity-robust" standard errors you must override the default.

If you don't override the default and there is in fact heteroskedasticity, you will get the wrong standard errors (and wrong t-statistics and confidence intervals).

The critical points:

If the errors are homoskedastic and you use the heteroskedastic formula for standard errors (the one we derived), you are OK

If the errors are heteroskedastic and you use the homoskedasticity-only formula for standard errors, the standard errors are wrong.

The two formulas coincide (when n is large) in the special case of homoskedasticity

The bottom line: you should always use the heteroskedasticity-based formulas; these are conventionally called the heteroskedasticity-robust standard errors.

Heteroskedasticity-robust standard errors in STATA

regress testscr str, robust

Regression with robust standard errors            Number of obs =     420
                                                  F(  1,   418) =   19.26
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.0512
                                                  Root MSE      =  18.581

-------------------------------------------------------------------------
         |               Robust
 testscr |      Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
---------+---------------------------------------------------------------
     str |  -2.279808   .5194892   -4.39   0.000   -3.300945   -1.258671
   _cons |    698.933   10.36436   67.44   0.000    678.5602    719.3057
-------------------------------------------------------------------------

Use the ", robust" option!

Summary and Assessment (Section 4.10)

The initial policy question:

Suppose new teachers are hired so the student-teacher ratio falls by one student per class. What is the effect of this policy intervention (this "treatment") on test scores?

Does our regression analysis give a convincing answer?

Not really: districts with low STR tend to be ones with lots of other resources and higher income families, which provide kids with more learning opportunities outside school. Because these omitted factors raise test scores where STR is low, this suggests that $\mathrm{corr}(u_i, STR_i) < 0$, so $E(u_i | X_i) \ne 0$.

Digression on Causality

The original question (what is the quantitative effect of an intervention that reduces class size?) is a question about a causal effect: the effect on Y of applying a unit of the treatment is $\beta_1$.

An ideal randomized controlled experiment:

Ideal: subjects all follow the treatment protocol (perfect compliance, no errors in reporting, etc.)

Randomized: subjects from the population of interest are randomly assigned to a treatment or control group (so there are no confounding factors)

Controlled: having a control group permits measuring the differential effect of the treatment

Experiment: the treatment is assigned as part of the experiment: the subjects have no choice, which means that there is no "reverse causality" in which subjects choose the treatment they think will work best.

Date post:	04-Jun-2018
Category:	Documents
Upload:	antonio-alvino