Date post: | 04-Jun-2018 |
Category: |
Documents |
Upload: | antonio-alvino |
8/14/2019 Introduction to Econometrics- Stock & Watson -Ch 4 Slides.doc
Introduction to Linear Regression
(SW Chapter 4)
Empirical problem: Class size and educational output
Policy question: What is the effect of reducing class size by one student per class? By 8 students per class?
What is the right output (performance) measure?
- parent satisfaction
- student personal development
- future adult welfare
- future adult earnings
- performance on standardized tests

What do data say about class sizes and test scores?
The California Test Score Data Set
All K-6 and K-8 California school districts (n = 420)
Variables:
- 5th grade test scores (Stanford-9 achievement test, combined math and reading), district average
- Student-teacher ratio (STR) = number of students in the district divided by number of full-time equivalent teachers

An initial look at the California test score data:

Do districts with smaller classes (lower STR) have higher test scores?

The class size/test score policy question:
- What is the effect on test scores of reducing STR by one student per class?
- Object of policy interest: $\Delta \text{Test score} / \Delta STR$
- This is the slope of the line relating test score and STR

This suggests that we want to draw a line through the Test Score v. STR scatterplot, but how?

Some Notation and Terminology
(Sections 4.1 and 4.2)
The population regression line:
$$\text{Test Score} = \beta_0 + \beta_1 STR$$
$\beta_1$ = slope of population regression line = $\Delta \text{Test score} / \Delta STR$ = change in test score for a unit change in STR
- Why are $\beta_0$ and $\beta_1$ "population" parameters?
- We would like to know the population value of $\beta_1$.
- We don't know $\beta_1$, so we must estimate it using data.

How can we estimate $\beta_0$ and $\beta_1$ from data?
Recall that $\bar Y$ was the least squares estimator of $\mu_Y$: $\bar Y$ solves
$$\min_m \sum_{i=1}^n (Y_i - m)^2$$
By analogy, the least squares estimator of $\beta_0$ and $\beta_1$ solves
$$\min_{b_0, b_1} \sum_{i=1}^n [Y_i - (b_0 + b_1 X_i)]^2$$

The OLS estimator solves:
$$\min_{b_0, b_1} \sum_{i=1}^n [Y_i - (b_0 + b_1 X_i)]^2$$
- The OLS estimator minimizes the average squared difference between the actual values of $Y_i$ and the prediction (predicted value) based on the estimated line.
- This minimization problem can be solved using calculus (App. 4.2).
- The result is the OLS estimators of $\beta_0$ and $\beta_1$.

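The calculus solution to the minimization problem above has a well-known closed form: the slope is the sample covariance of X and Y divided by the sample variance of X, and the intercept is $\bar Y - \hat\beta_1 \bar X$. The slides use STATA; the following is only a plain-Python sketch of that closed form. The function name `ols_fit` and the toy data are hypothetical, chosen for illustration.

```python
def ols_fit(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares around the sample means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has no variation, so the slope is not identified")
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data lying exactly on the line y = 1 + 2x
b0, b1 = ols_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(b0, b1)
```

Because the toy points lie exactly on a line, the fitted intercept and slope should recover that line; with real data the fit would leave nonzero residuals.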
Why use OLS, rather than some other estimator?
- OLS is a generalization of the sample average: if the "line" is just an intercept (no X), then the OLS estimator is just the sample average of $Y_1, \dots, Y_n$, that is, $\bar Y$.
- Like $\bar Y$, the OLS estimator has some desirable properties: under certain assumptions, it is unbiased (that is, $E(\hat\beta_1) = \beta_1$), and it has a tighter sampling distribution than some other candidate estimators of $\beta_1$.
- Importantly, this is what everyone uses: the common "language" of linear regression.

Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$

Interpretation of the estimated slope and intercept
$$\widehat{TestScore} = 698.9 - 2.28 \times STR$$
- Districts with one more student per teacher on average have test scores that are 2.28 points lower.
- That is, $\Delta \text{Test score} / \Delta STR = -2.28$.
- The intercept (taken literally) means that, according to this estimated line, districts with zero students per teacher would have a (predicted) test score of 698.9.
- This interpretation of the intercept makes no sense: it extrapolates the line outside the range of the data. In this application, the intercept is not itself economically meaningful.

Predicted values & residuals:
One of the districts in the data set is Antelope, CA, for which STR = 19.33 and Test Score = 657.8.
predicted value: $\hat Y_{Antelope} = 698.9 - 2.28 \times 19.33 = 654.8$
residual: $\hat u_{Antelope} = 657.8 - 654.8 = 3.0$

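The Antelope arithmetic can be checked in a few lines. This is a minimal sketch using the rounded coefficients quoted in the slides; the function names `predict` and `residual` are hypothetical.

```python
# Rounded OLS estimates quoted in the slides
B0_HAT = 698.9
B1_HAT = -2.28


def predict(str_ratio):
    """Predicted test score on the estimated regression line."""
    return B0_HAT + B1_HAT * str_ratio


def residual(test_score, str_ratio):
    """Actual test score minus the predicted value."""
    return test_score - predict(str_ratio)


antelope_pred = predict(19.33)
antelope_resid = residual(657.8, 19.33)
print(round(antelope_pred, 1), round(antelope_resid, 1))
```

A positive residual means the district scored above what the estimated line predicts for its student-teacher ratio.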
OLS regression: STATA output

regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581
-------------------------------------------------------------------------
        |               Robust
testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------+----------------------------------------------------------------
    str |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
  _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057
-------------------------------------------------------------------------

$\widehat{TestScore} = 698.9 - 2.28 \times STR$
(we'll discuss the rest of this output later)

The OLS regression line is an estimate, computed using our sample of data; a different sample would have given a different value of $\hat\beta_1$.
How can we:
- quantify the sampling uncertainty associated with $\hat\beta_1$?
- use $\hat\beta_1$ to test hypotheses such as $\beta_1 = 0$?
- construct a confidence interval for $\beta_1$?
Like estimation of the mean, we proceed in four steps:
1. The probability framework for linear regression
2. Estimation
3. Hypothesis Testing
4. Confidence intervals

1. Probability Framework for Linear Regression
Population
- population of interest (ex: all possible school districts)
Random variables: Y, X
- Ex: (Test Score, STR)
Joint distribution of (Y, X)
- The key feature is that we suppose there is a linear relation in the population that relates X and Y; this linear relation is the "population linear regression"

The Population Linear Regression Model (Section 4.3)
$$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n$$
- X is the independent variable or regressor
- Y is the dependent variable
- $\beta_0$ = intercept
- $\beta_1$ = slope
- $u_i$ = "error term"
- The error term consists of omitted factors, or possibly measurement error in the measurement of Y. In general, these omitted factors are other factors that influence Y, other than the variable X.

Ex.: The population regression line and the error term
What are some of the omitted factors in this example?

Data and sampling
The population objects ("parameters") $\beta_0$ and $\beta_1$ are unknown, so to draw inferences about these unknown parameters we must collect relevant data.
Simple random sampling:
- Choose n entities at random from the population of interest, and observe (record) X and Y for each entity.
- Simple random sampling implies that $\{(X_i, Y_i)\}$, $i = 1, \dots, n$, are independently and identically distributed (i.i.d.).
(Note: $(X_i, Y_i)$ are distributed independently of $(X_j, Y_j)$ for different observations i and j.)

Task at hand: to characterize the sampling distribution of the OLS estimator. To do so, we make three assumptions:
The Least Squares Assumptions
1. The conditional distribution of u given X has mean zero, that is, $E(u \mid X = x) = 0$.
2. $(X_i, Y_i)$, $i = 1, \dots, n$, are i.i.d.
3. X and u have finite fourth moments, that is, $E(X^4) < \infty$ and $E(u^4) < \infty$.
We'll discuss these assumptions in order.

Least squares assumption #1: $E(u \mid X = x) = 0$.
For any given value of X, the mean of u is zero.

Example: Assumption #1 and the class size example
$$\text{Test Score}_i = \beta_0 + \beta_1 STR_i + u_i, \quad u_i = \text{other factors}$$
"Other factors:"
- parental involvement
- outside learning opportunities (extra math class, ...)
- home environment conducive to reading
- family income is a useful proxy for many such factors
So $E(u \mid X = x) = 0$ means $E(\text{Family Income} \mid STR) = \text{constant}$ (which implies that family income and STR are uncorrelated). This assumption is not innocuous! We will return to it often.

Least squares assumption #2: $(X_i, Y_i)$, $i = 1, \dots, n$, are i.i.d.
This arises automatically if the entity (individual, district) is sampled by simple random sampling: the entity is selected, then, for that entity, X and Y are observed (recorded).
The main place we will encounter non-i.i.d. sampling is when data are recorded over time ("time series data"); this will introduce some extra complications.

Least squares assumption #3: $E(X^4) < \infty$ and $E(u^4) < \infty$

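Substituting $Y_i - \bar Y = \beta_1 (X_i - \bar X) + (u_i - \bar u)$ into the OLS formula for the slope:
$$\hat\beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)[\beta_1 (X_i - \bar X) + (u_i - \bar u)]}{\sum_{i=1}^n (X_i - \bar X)^2}$$
$$= \beta_1 \frac{\sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)}{\sum_{i=1}^n (X_i - \bar X)^2} + \frac{\sum_{i=1}^n (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^n (X_i - \bar X)^2}$$
so
$$\hat\beta_1 - \beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X)(u_i - \bar u)}{\sum_{i=1}^n (X_i - \bar X)^2}$$
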
We can simplify this formula by noting that:
$$\sum_{i=1}^n (X_i - \bar X)(u_i - \bar u) = \sum_{i=1}^n (X_i - \bar X) u_i - \left[\sum_{i=1}^n (X_i - \bar X)\right] \bar u = \sum_{i=1}^n (X_i - \bar X) u_i.$$
Thus
$$\hat\beta_1 - \beta_1 = \frac{\sum_{i=1}^n (X_i - \bar X) u_i}{\sum_{i=1}^n (X_i - \bar X)^2} = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$
where $v_i = (X_i - \bar X) u_i$.

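$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}, \quad \text{where } v_i = (X_i - \bar X) u_i$$
We now can calculate the mean and variance of $\hat\beta_1$:
$$E(\hat\beta_1 - \beta_1) = E\left[\frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}\right] = \left(\frac{n}{n-1}\right) E\left[\frac{1}{n} \sum_{i=1}^n \frac{v_i}{s_X^2}\right] = \left(\frac{n}{n-1}\right) \frac{1}{n} \sum_{i=1}^n E\left(\frac{v_i}{s_X^2}\right)$$
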
Now $E(v_i / s_X^2) = E[(X_i - \bar X) u_i / s_X^2] = 0$
because $E(u_i \mid X_i = x) = 0$ (for details see App. 4.3).
Thus
$$E(\hat\beta_1 - \beta_1) = \left(\frac{n}{n-1}\right) \frac{1}{n} \sum_{i=1}^n E\left(\frac{v_i}{s_X^2}\right) = 0$$
so
$$E(\hat\beta_1) = \beta_1$$
That is, $\hat\beta_1$ is an unbiased estimator of $\beta_1$.

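Unbiasedness is a statement about the average of $\hat\beta_1$ over repeated samples, which a small Monte Carlo experiment can illustrate. The sketch below is not from the slides: the data-generating process (normal X, normal u drawn independently of X), the parameter values, and the names `ols_slope` and `simulate_slopes` are all assumptions made for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope estimate for a regression of y on x with an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slopes(beta0, beta1, n, reps, seed=0):
    """Draw `reps` independent samples of size n and return the OLS slopes."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(20.0, 2.0) for _ in range(n)]
        # u is drawn independently of x, so E(u | X = x) = 0 holds by construction
        y = [beta0 + beta1 * xi + rng.gauss(0.0, 10.0) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


slopes = simulate_slopes(beta0=700.0, beta1=-2.0, n=50, reps=2000)
mean_slope = statistics.fmean(slopes)
print(mean_slope)
```

Individual estimates scatter around the true slope, but their average across replications should sit close to it. If u were made to depend on x, that would no longer be expected.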
Calculation of the variance of $\hat\beta_1$:
$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$
This calculation is simplified by supposing that n is large (so that $s_X^2$ can be replaced by $\sigma_X^2$); the result is
$$\mathrm{var}(\hat\beta_1) = \frac{\mathrm{var}(v)}{n \sigma_X^4}$$
(For details, see App. 4.3.)

The exact sampling distribution is complicated, but when the sample size is large we get some simple (and good) approximations:

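$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2}$$
When n is large:
- $v_i = (X_i - \bar X) u_i \approx (X_i - \mu_X) u_i$, which is i.i.d. (why?) and has two moments, that is, $\mathrm{var}(v_i) < \infty$ (why?). Thus $\frac{1}{n} \sum_{i=1}^n v_i$ is approximately distributed $N(0, \mathrm{var}(v)/n)$ when n is large
- $s_X^2$ is approximately equal to $\sigma_X^2$ when n is large
- $\frac{n-1}{n} = 1 - \frac{1}{n} \approx 1$ when n is large
Putting these together we have:
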
Large-n approximation to the distribution of $\hat\beta_1$:
$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\left(\frac{n-1}{n}\right) s_X^2} \approx \frac{\frac{1}{n} \sum_{i=1}^n v_i}{\sigma_X^2}$$
which is approximately distributed $N\left(0, \frac{\sigma_v^2}{n (\sigma_X^2)^2}\right)$.

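The large-n variance formula $\sigma_v^2 / (n \sigma_X^4)$ can be compared with the variance of simulated slope estimates. This is a rough sketch under assumed conditions that are not in the slides: X and u are independent standard normals, so $\mathrm{var}(v) = 1$ and $\sigma_X^4 = 1$, and the formula reduces to $1/n$. The function names are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope estimate for a regression of y on x with an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulated_slope_variance(n, reps, seed=1):
    """Sample variance of the OLS slope across `reps` simulated data sets."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [1.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]
        slopes.append(ols_slope(x, y))
    return statistics.variance(slopes)


n = 100
simulated = simulated_slope_variance(n=n, reps=2000)
# Assumed design: var(v) = E[X^2 u^2] = 1 and sigma_X^4 = 1, so the formula gives 1/n
approx = 1.0 / (n * 1.0)
print(simulated, approx)
```

The two numbers should be of similar size but need not match exactly, since the formula is a large-n approximation and the simulation has sampling noise of its own.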
Recall the summary of the sampling distribution of $\bar Y$:
For $(Y_1, \dots, Y_n)$ i.i.d. with $0 < \sigma_Y^2 < \infty$,
- The exact (finite sample) sampling distribution of $\bar Y$ has mean $\mu_Y$ ("$\bar Y$ is an unbiased estimator of $\mu_Y$") and variance $\sigma_Y^2 / n$
- Other than its mean and variance, the exact distribution of $\bar Y$ is complicated and depends on the distribution of Y
- $\bar Y \xrightarrow{p} \mu_Y$ (law of large numbers)
- $\frac{\bar Y - E(\bar Y)}{\sqrt{\mathrm{var}(\bar Y)}}$ is approximately distributed N(0,1) (CLT)

Parallel conclusions hold for the OLS estimator $\hat\beta_1$:
Under the three Least Squares Assumptions,
- The exact (finite sample) sampling distribution of $\hat\beta_1$ has mean $\beta_1$ ("$\hat\beta_1$ is an unbiased estimator of $\beta_1$"), and $\mathrm{var}(\hat\beta_1)$ is inversely proportional to n.
- Other than its mean and variance, the exact distribution of $\hat\beta_1$ is complicated and depends on the distribution of (X, u)
- $\hat\beta_1 \xrightarrow{p} \beta_1$ (law of large numbers)
- $\frac{\hat\beta_1 - E(\hat\beta_1)}{\sqrt{\mathrm{var}(\hat\beta_1)}}$ is approximately distributed N(0,1) (CLT)

1. The probability framework for linear regression
2. Estimation
3. Hypothesis Testing (Section 4.5)
4. Confidence intervals
Suppose a skeptic suggests that reducing the number of students in a class has no effect on learning or, specifically, test scores. The skeptic thus asserts the hypothesis,
$$H_0: \beta_1 = 0$$
We wish to test this hypothesis using data, and reach a tentative conclusion whether it is correct or incorrect.

For testing the mean of Y, the t-statistic is
$$t = \frac{\bar Y - \mu_{Y,0}}{s_Y / \sqrt{n}}$$
then reject the null hypothesis if $|t| > 1.96$,
where the SE of the estimator (here $s_Y / \sqrt{n}$) is the square root of an estimator of the variance of the estimator.

Applied to a hypothesis about $\beta_1$:
$$t = \frac{\text{estimator} - \text{hypothesized value}}{\text{standard error of the estimator}}$$
so
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)}$$
where $\beta_{1,0}$ is the value of $\beta_1$ hypothesized under the null (for example, if the null value is zero, then $\beta_{1,0} = 0$).
What is $SE(\hat\beta_1)$?
$SE(\hat\beta_1)$ = the square root of an estimator of the variance of the sampling distribution of $\hat\beta_1$

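Recall the expression for the variance of $\hat\beta_1$ (large n):
$$\mathrm{var}(\hat\beta_1) = \frac{\mathrm{var}[(X_i - \mu_X) u_i]}{n (\sigma_X^2)^2} = \frac{\sigma_v^2}{n \sigma_X^4}$$
where $v_i = (X_i - \mu_X) u_i$. Estimator of the variance of $\hat\beta_1$:
$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\text{estimator of } \sigma_v^2}{(\text{estimator of } \sigma_X^2)^2} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2\right]^2}.$$
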
$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2\right]^2}.$$
OK, this is a bit nasty, but:
- There is no reason to memorize this
- It is computed automatically by regression software
- $SE(\hat\beta_1) = \sqrt{\hat\sigma^2_{\hat\beta_1}}$ is reported by regression software
- It is less complicated than it seems. The numerator estimates $\mathrm{var}(v)$, the denominator estimates $[\mathrm{var}(X)]^2$.

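To make the variance estimator concrete, the sketch below evaluates it term by term: the numerator as the estimator of $\mathrm{var}(v)$ with the $n - 2$ correction, the denominator as the squared estimator of $\mathrm{var}(X)$. It is an illustrative plain-Python transcription of the formula above, not the code any regression package uses; `ols_robust` and the four-point data set are hypothetical.

```python
import math


def ols_robust(x, y):
    """Return (b0, b1, se_b1), with se_b1 from the variance estimator above."""
    n = len(x)
    if n < 3:
        raise ValueError("need n >= 3 for the n - 2 degrees-of-freedom correction")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has no variation, so the slope is not identified")
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Numerator: estimator of var(v), where v_i = (X_i - mu_X) * u_i
    var_v_hat = sum(((xi - x_bar) * ui) ** 2 for xi, ui in zip(x, resid)) / (n - 2)
    # Denominator: squared estimator of var(X)
    var_x_hat_sq = (s_xx / n) ** 2
    se_b1 = math.sqrt(var_v_hat / var_x_hat_sq / n)
    return b0, b1, se_b1


# Hypothetical four-point data set, small enough to verify by hand
b0, b1, se_b1 = ols_robust([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
print(b0, b1, se_b1)
```

With only four observations the large-n justification does not apply; the tiny sample is used here purely so each term can be recomputed by hand.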
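Return to calculation of the t-statistic:
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} = \frac{\hat\beta_1 - \beta_{1,0}}{\sqrt{\hat\sigma^2_{\hat\beta_1}}}$$
- Reject at 5% significance level if $|t| > 1.96$
- p-value is $p = \Pr[|t| > |t^{act}|]$ = probability in tails of normal outside $|t^{act}|$
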
Example: Test Scores and STR, California data
Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$
Regression software reports the standard errors:
$SE(\hat\beta_0) = 10.4$, $SE(\hat\beta_1) = 0.52$
t-statistic testing $\beta_{1,0} = 0$:
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} = \frac{-2.28 - 0}{0.52} = -4.38$$
- The 1% two-sided critical value is 2.58, so we reject the null at the 1% significance level.
- Alternatively, we can compute the p-value...

The p-value based on the large-n standard normal approximation to the t-statistic is 0.00001 ($10^{-5}$).

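The t-statistic and its large-n p-value can be reproduced from the two reported numbers. The sketch below uses the standard identity $\Pr(|Z| > |t|) = \mathrm{erfc}(|t|/\sqrt{2})$ for a standard normal Z; the function name `two_sided_p_value` is hypothetical.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value under the standard normal approximation."""
    # P(|Z| > |t|) = erfc(|t| / sqrt(2)) for Z ~ N(0, 1)
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


beta1_hat = -2.28  # estimated slope quoted in the slides
se_beta1 = 0.52    # its reported standard error
t_stat = (beta1_hat - 0.0) / se_beta1
p_value = two_sided_p_value(t_stat)
print(round(t_stat, 2), p_value)
```

The p-value comes out on the order of $10^{-5}$, far below the 1% level, which matches the conclusion reached by comparing $|t|$ with 2.58.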
1. The probability framework for linear regression
2. Estimation
3. Hypothesis Testing
4. Confidence intervals (Section 4.6)
In general, if the sampling distribution of an estimator is normal for large n, then a 95% confidence interval can be constructed as estimator ± 1.96 × standard error.
So: a 95% confidence interval for $\beta_1$ is
$$\{\hat\beta_1 \pm 1.96 \times SE(\hat\beta_1)\}$$

Example: Test Scores and STR, California data
Estimated regression line: $\widehat{TestScore} = 698.9 - 2.28 \times STR$
$SE(\hat\beta_0) = 10.4$, $SE(\hat\beta_1) = 0.52$
95% confidence interval for $\beta_1$:
$$\{\hat\beta_1 \pm 1.96 \times SE(\hat\beta_1)\} = \{-2.28 \pm 1.96 \times 0.52\} = (-3.30, -1.26)$$
Equivalent statements:
- The 95% confidence interval does not include zero;
- The hypothesis $\beta_1 = 0$ is rejected at the 5% level

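The interval arithmetic, and its equivalence with the 5% test, can be sketched as follows. The helper name `confidence_interval_95` is hypothetical; the inputs are the rounded estimate and standard error quoted in the slides.

```python
def confidence_interval_95(estimate, standard_error, critical_value=1.96):
    """Large-n 95% confidence interval: estimate +/- 1.96 * SE."""
    half_width = critical_value * standard_error
    return estimate - half_width, estimate + half_width


lower, upper = confidence_interval_95(-2.28, 0.52)
# The interval excludes zero exactly when |t| = |estimate / SE| exceeds 1.96
excludes_zero = not (lower <= 0.0 <= upper)
print(round(lower, 2), round(upper, 2), excludes_zero)
```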
A convention for reporting estimated regressions:
Put standard errors in parentheses below the estimates
$$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \times STR$$
This expression means that:
- The estimated regression line is $\widehat{TestScore} = 698.9 - 2.28 \times STR$
- The standard error of $\hat\beta_0$ is 10.4
- The standard error of $\hat\beta_1$ is 0.52

OLS regression: STATA output

regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581
-------------------------------------------------------------------------
        |               Robust
testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------+----------------------------------------------------------------
    str |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
  _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057
-------------------------------------------------------------------------

so:
$$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \times STR$$
- t ($\beta_1 = 0$) = -4.38, p-value = 0.000
- 95% conf. interval for $\beta_1$ is (-3.30, -1.26)

Regression when X is Binary (Section 4.7)
Sometimes a regressor is binary:
- X = 1 if female, = 0 if male
- X = 1 if treated (experimental drug), = 0 if not
- X = 1 if small class size, = 0 if not
So far, $\beta_1$ has been called a "slope," but that doesn't make much sense if X is binary.
How do we interpret regression with a binary regressor?

$Y_i = \beta_0 + \beta_1 X_i + u_i$, where X is binary ($X_i$ = 0 or 1):
- When $X_i = 0$: $Y_i = \beta_0 + u_i$
- When $X_i = 1$: $Y_i = \beta_0 + \beta_1 + u_i$
thus:
- When $X_i = 0$, the mean of $Y_i$ is $\beta_0$
- When $X_i = 1$, the mean of $Y_i$ is $\beta_0 + \beta_1$
that is:
$$E(Y_i \mid X_i = 0) = \beta_0$$
$$E(Y_i \mid X_i = 1) = \beta_0 + \beta_1$$
so:
$$\beta_1 = E(Y_i \mid X_i = 1) - E(Y_i \mid X_i = 0) = \text{population difference in group means}$$

Example: TestScore and STR, California data
Let
$$D_i = \begin{cases} 1 & \text{if } STR_i \le 20 \\ 0 & \text{if } STR_i > 20 \end{cases}$$
The OLS estimate of the regression line relating TestScore to D (with standard errors in parentheses) is:
$$\widehat{TestScore} = \underset{(1.3)}{650.0} + \underset{(1.8)}{7.4}\, D$$
Difference in means between groups = 7.4; SE = 1.8; $t = 7.4 / 1.8 \approx 4.1$

Compare the regression results with the group means, computed directly:

Class Size          Average score ($\bar Y$)   Std. dev. ($s_Y$)    N
Small (STR ≤ 20)    657.4                      19.4                 238
Large (STR > 20)    650.0                      17.9                 182

Estimation: $\bar Y_{small} - \bar Y_{large} = 657.4 - 650.0 = 7.4$
Test $\Delta = 0$:
$$t = \frac{\bar Y_s - \bar Y_l}{SE(\bar Y_s - \bar Y_l)} = \frac{7.4}{1.83} \approx 4.05$$
95% confidence interval = $\{7.4 \pm 1.96 \times 1.83\} = (3.8, 11.0)$
This is the same as in the regression!
$$\widehat{TestScore} = \underset{(1.3)}{650.0} + \underset{(1.8)}{7.4}\, D$$

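The group-means comparison can be recomputed from the summary statistics in the table. The slides do not show how the 1.83 standard error was obtained; the sketch below assumes the usual unequal-variance formula $SE = \sqrt{s_s^2/n_s + s_l^2/n_l}$, which reproduces it to two decimals. The function name `diff_in_means` is hypothetical.

```python
import math


def diff_in_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (difference, standard error, t-statistic) for two group means."""
    diff = mean_a - mean_b
    # Assumed formula: variances are not pooled across the two groups
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, se, diff / se


# Summary statistics from the table: small classes first, then large classes
diff, se, t_stat = diff_in_means(657.4, 19.4, 238, 650.0, 17.9, 182)
ci = (diff - 1.96 * se, diff + 1.96 * se)
print(round(diff, 1), round(se, 2), round(t_stat, 2), ci)
```

The difference, standard error and interval line up with the coefficient on D and its standard error in the regression, which is the point of the comparison.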
Summary: regression when $X_i$ is binary (0/1)
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
- $\beta_0$ = mean of Y given that X = 0
- $\beta_0 + \beta_1$ = mean of Y given that X = 1
- $\beta_1$ = difference in group means, X = 1 minus X = 0
- $SE(\hat\beta_1)$ has the usual interpretation
- t-statistics, confidence intervals constructed as usual
- This is another way to do difference-in-means analysis
- The regression formulation is especially useful when we have additional regressors (coming up soon...)

Other Regression Statistics (Section 4.8)
A natural question is how well the regression line "fits" or explains the data. There are two regression statistics that provide complementary measures of the quality of fit:
- The regression $R^2$ measures the fraction of the variance of Y that is explained by X; it is unitless and ranges between zero (no fit) and one (perfect fit)
- The standard error of the regression measures the fit (the typical size of a regression residual) in the units of Y.

The $R^2$
Write $Y_i$ as the sum of the OLS prediction + OLS residual:
$$Y_i = \hat Y_i + \hat u_i$$
The $R^2$ is the fraction of the sample variance of $Y_i$ "explained" by the regression, that is, by $\hat Y_i$:
$$R^2 = \frac{ESS}{TSS},$$
where $ESS = \sum_{i=1}^n (\hat Y_i - \bar{\hat Y})^2$ and $TSS = \sum_{i=1}^n (Y_i - \bar Y)^2$.
The $R^2$:
- $R^2 = 0$ means ESS = 0, so X explains none of the variation of Y
- $R^2 = 1$ means ESS = TSS, so $Y = \hat Y$, so X explains all of the variation of Y
- $0 \le R^2 \le 1$
- For regression with a single regressor (the case here), $R^2$ is the square of the correlation coefficient between X and Y

The Standard Error of the Regression (SER)
The standard error of the regression is (almost) the sample standard deviation of the OLS residuals:
$$SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n (\hat u_i - \bar{\hat u})^2} = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}$$
(the second equality holds because $\frac{1}{n} \sum_{i=1}^n \hat u_i = 0$).

$$SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}$$
The SER:
- has the units of u, which are the units of Y
- measures the spread of the distribution of u
- measures the average "size" of the OLS residual (the average "mistake" made by the OLS regression line)
- The root mean squared error (RMSE) is closely related to the SER:
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^n \hat u_i^2}$$
This measures the same thing as the SER; the minor difference is division by n instead of n - 2.

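The three fit statistics can be computed side by side from the same residuals. This is a minimal plain-Python sketch of the definitions above, not the slides' own code; `fit_statistics` and the four-point data set are hypothetical.

```python
import math


def fit_statistics(x, y):
    """Return (r_squared, ser, rmse) for the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # With an intercept, the mean of the fitted values equals y_bar
    ess = sum((fi - y_bar) ** 2 for fi in fitted)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum(ui ** 2 for ui in resid)
    ser = math.sqrt(ssr / (n - 2))   # degrees-of-freedom corrected
    rmse = math.sqrt(ssr / n)        # no correction
    return ess / tss, ser, rmse


# Hypothetical four-point data set
r2, ser, rmse = fit_statistics([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
print(r2, ser, rmse)
```

The RMSE is always slightly smaller than the SER because it divides by n rather than n - 2; the gap shrinks as n grows.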
Technical note: why divide by n - 2 instead of n - 1?
$$SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}$$
- Division by n - 2 is a "degrees of freedom" correction like division by n - 1 in $s_Y^2$; the difference is that, in the SER, two parameters have been estimated ($\beta_0$ and $\beta_1$, by $\hat\beta_0$ and $\hat\beta_1$), whereas in $s_Y^2$ only one has been estimated ($\mu_Y$, by $\bar Y$).
- When n is large, it makes negligible difference whether n, n - 1, or n - 2 is used, although the conventional formula uses n - 2 when there is a single regressor.
- For details, see Section 15.4

Example of $R^2$ and SER
$$\widehat{TestScore} = \underset{(10.4)}{698.9} - \underset{(0.52)}{2.28} \times STR, \quad R^2 = .05, \quad SER = 18.6$$
The slope coefficient is statistically significant and large in a policy sense, even though STR explains only a small fraction of the variation in test scores.

Practical Note: Heteroskedasticity, Homoskedasticity, and the Formula for the Standard Errors of $\hat\beta_0$ and $\hat\beta_1$ (Section 4.9)
- What do these two terms mean?
- Consequences of homoskedasticity
- Implication for computing standard errors
What do these two terms mean?
If $\mathrm{var}(u \mid X = x)$ is constant, that is, the variance of the conditional distribution of u given X does not depend on X, then u is said to be homoskedastic. Otherwise, u is said to be heteroskedastic.

Homoskedasticity in a picture:
- $E(u \mid X = x) = 0$ (u satisfies Least Squares Assumption #1)
- The variance of u does not change with (depend on) x

Heteroskedasticity in a picture:
- $E(u \mid X = x) = 0$ (u satisfies Least Squares Assumption #1)
- The variance of u depends on x, so u is heteroskedastic.
A real-world example of heteroskedasticity from labor economics: average hourly earnings vs. years of education (data source: 1999 Current Population Survey)

[Figure: scatterplot and OLS regression line (fitted values) of average hourly earnings against years of education]

Is heteroskedasticity present in the class size data?
Hard to say... it looks nearly homoskedastic, but the spread might be tighter for large values of STR.

So far we have (without saying so) allowed u to be heteroskedastic:
Recall the three least squares assumptions:
1. The conditional distribution of u given X has mean zero, that is, $E(u \mid X = x) = 0$.
2. $(X_i, Y_i)$, $i = 1, \dots, n$, are i.i.d.
3. X and u have four finite moments.
Heteroskedasticity and homoskedasticity concern $\mathrm{var}(u \mid X = x)$. None of the three assumptions restricts this conditional variance, so heteroskedasticity has been allowed all along.

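If the errors are in fact homoskedastic:
- You can prove some theorems about OLS (in particular, the Gauss-Markov theorem, which says that OLS is the estimator with the lowest variance among all unbiased estimators that are linear functions of $(Y_1, \dots, Y_n)$; see Section 15.5).
- The formula for the variance of $\hat\beta_1$ and the OLS standard error simplifies (App. 4.4): If $\mathrm{var}(u_i \mid X_i = x) = \sigma_u^2$, then
$$\mathrm{var}(\hat\beta_1) = \frac{\mathrm{var}[(X_i - \mu_X) u_i]}{n (\sigma_X^2)^2} = \dots = \frac{\sigma_u^2}{n \sigma_X^2}$$
- Note: $\mathrm{var}(\hat\beta_1)$ is inversely proportional to $\mathrm{var}(X)$: more spread in X means more information about $\beta_1$.
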
General formula for the standard error of $\hat\beta_1$ is the square root of:
$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n (X_i - \bar X)^2 \hat u_i^2}{\left[\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2\right]^2}.$$
Special case under homoskedasticity:
$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^n \hat u_i^2}{\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2}.$$
Sometimes it is said that the lower formula is simpler.
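Putting the two formulas next to each other in code shows that they differ only in the numerator and in the power of the variance-of-X term. This is an illustrative plain-Python sketch of the two expressions above; `slope_standard_errors` and the data are hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (robust_se, homoskedasticity_only_se) for the OLS slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    var_x_hat = s_xx / n
    # General (heteroskedasticity-robust) formula
    num_robust = sum(((xi - x_bar) * ui) ** 2 for xi, ui in zip(x, resid)) / (n - 2)
    robust_var = num_robust / var_x_hat ** 2 / n
    # Special case that is valid only under homoskedasticity
    num_homo = sum(ui ** 2 for ui in resid) / (n - 2)
    homo_var = num_homo / var_x_hat / n
    return math.sqrt(robust_var), math.sqrt(homo_var)


# Hypothetical four-point data set
robust_se, homo_se = slope_standard_errors([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
print(robust_se, homo_se)
```

On a given sample the two values generally differ; neither is guaranteed to be the larger one.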

The homoskedasticity-only formula for the standard error of $\hat\beta_1$ and the "heteroskedasticity-robust" formula (the formula that is valid under heteroskedasticity) differ. In general, you get different standard errors using the different formulas.
Homoskedasticity-only standard errors are the default setting in regression software, sometimes the only setting (e.g. Excel). To get the general "heteroskedasticity-robust" standard errors you must override the default.
If you don't override the default and there is in fact heteroskedasticity, you will get the wrong standard errors (and wrong t-statistics and confidence intervals).

The critical points:
- If the errors are homoskedastic and you use the heteroskedasticity-robust formula for standard errors (the one we derived), you are OK
- If the errors are heteroskedastic and you use the homoskedasticity-only formula for standard errors, the standard errors are wrong.
- The two formulas coincide (when n is large) in the special case of homoskedasticity
- The bottom line: you should always use the heteroskedasticity-based formulas; these are conventionally called the heteroskedasticity-robust standard errors.

Heteroskedasticity-robust standard errors in STATA

regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581
-------------------------------------------------------------------------
        |               Robust
testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------+----------------------------------------------------------------
    str |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
  _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057
-------------------------------------------------------------------------

Use the ", robust" option!!!

Summary and Assessment (Section 4.10)
The initial policy question:
Suppose new teachers are hired so the student-teacher ratio falls by one student per class. What is the effect of this policy intervention (this "treatment") on test scores?
Does our regression analysis give a convincing answer?
Not really. Districts with low STR tend to be ones with lots of other resources and higher income families, which provide kids with more learning opportunities outside school. This suggests that $\mathrm{corr}(u_i, STR_i) < 0$ (low-STR districts tend to have larger $u_i$), so $E(u_i \mid X_i) \ne 0$.

Digression on Causality
The original question (what is the quantitative effect of an intervention that reduces class size?) is a question about a causal effect: the effect on Y of applying a unit of the treatment is $\beta_1$.
An ideal randomized controlled experiment:
- Ideal: subjects all follow the treatment protocol (perfect compliance, no errors in reporting, etc.)
- Randomized: subjects from the population of interest are randomly assigned to a treatment or control group (so there are no confounding factors)
- Controlled: having a control group permits measuring the differential effect of the treatment
- Experiment: the treatment is assigned as part of the experiment: the subjects have no choice, which means that there is no "reverse causality" in which subjects choose the treatment they think will work best.