
ECON-130 Lecture 10 - … › ...Dummy Variables •Multiple Regression with Dummy Variables...

Date post: 07-Jul-2020
Economics 130 Lecture 8 Final Comments on Dummy Variables Heteroskedasticity Serial Correlation Teams and Topics Midterm Preparation

Economics 130

Lecture 8

Final Comments on Dummy Variables

Heteroskedasticity

Serial Correlation

Teams and Topics

Midterm Preparation


Multiple Regression

• We continue with addressing our second issue

+ add in how we evaluate these relationships:

– Where do we get data to do this analysis?

– How do we create the model relating the data?

– How do we relate data to one another?

– How do we evaluate these relationships?


Multiple Regression

• Tonight we will work on:

– Finish Dummy Variables (Indicator Variables)

– Heteroskedasticity

– Serial Correlation (a little)

– Talk about Teams and Topics


Regressions with Dummy Variables

• Simple Introduction:

• Dummy variable is either 0 or 1.

• Use to turn qualitative (Yes/No) data into 1/0.


Multiple Regression

• Simple Regression with a Dummy Variable

Y = b1 + b2D + e

• OLS estimation, confidence intervals, testing, etc. carried out in standard way

• Interpretation a little different.


Dummy Variables

• Simple Regression with a Dummy Variable

• Fitted value for ith observation (point on regression line): Ŷi = b1 + b2Di

• Since Di = 0 or 1, either Ŷi = b1 (when Di = 0)

• or Ŷi = b1 + b2 (when Di = 1)

Dummy Variables

• Example: Explaining house prices (continued)

• Regress Y = house price on D = dummy for air conditioning (= 1 if house has air conditioning, = 0 otherwise).

• Result: b1 = 59,885, b2 = 25,996, b1 + b2 = 85,881

• Average price of house with air conditioning is $85,881

• Average price of house without air conditioning is $59,885
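A minimal sketch of why the coefficients line up with the two group averages: when the only regressor is a dummy, least squares sets b1 to the mean of the D = 0 group and b1 + b2 to the mean of the D = 1 group. The prices below are made up for illustration and are not the lecture's data set; the helper name `ols_simple` is also hypothetical.

```python
from statistics import mean

def ols_simple(x, y):
    """Return (b1, b2) for the least squares fit y = b1 + b2*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Hypothetical data (not the lecture's): prices in dollars, D = 1 if air conditioning
prices = [52000, 61000, 66000, 80000, 84000, 91000]
aircon = [0, 0, 0, 1, 1, 1]

b1, b2 = ols_simple(aircon, prices)
mean_without = mean([p for p, d in zip(prices, aircon) if d == 0])
mean_with = mean([p for p, d in zip(prices, aircon) if d == 1])

print("b1      =", b1, " mean without A/C =", mean_without)
print("b1 + b2 =", b1 + b2, " mean with A/C    =", mean_with)
```

The two printed pairs should agree up to floating-point rounding, which is the sense in which b2 measures the difference between the group averages.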

Dummy Variables

• Multiple Regression with Dummy Variables

• Example: Explaining house prices (continued)

• Y = b1 + b2D1 + . . . + bkDk + e

• Regress Y = house price on D1 = driveway dummy and D2 = rec room dummy

• Four types of houses:

• Houses with a driveway and a rec room (D1=1, D2=1)

• Houses with a driveway but no rec room (D1=1, D2=0)

• Houses with a rec room but no driveway (D1=0, D2=1)

• Houses with no driveway and no rec room (D1=0, D2=0)

Dummy Variables

Example: Explaining house prices (continued)

• If D1=1 and D2=1, then Ŷi = b1 + b2 + b3 = 47,099 + 21,160 + 16,024 = 84,283

• “The average price of houses with a driveway and rec room is $84,283”.

          Coeff.     St. Error   t Stat   P-value   Lower 95%   Upper 95%
Inter.    47099.1    2837.6      16.60    2.E-50    41525       52673
D1        21159.9    3062.4       6.91    1.E-11    15144       27176
D2        16023.7    2788.6       5.75    1.E-08    10546       21502

Dummy Variables

• If D1=1 and D2=0, then Ŷi = b1 + b2 = 47,099 + 21,160 = 68,259

• “The average price of houses with a driveway but no rec room is $68,259”.

• If D1=0 and D2=1, then Ŷi = b1 + b3 = 47,099 + 16,024 = 63,123

• “The average price of houses with a rec room but no driveway is $63,123”.

• If D1=0 and D2=0, then Ŷi = b1 = 47,099

• “The average price of houses with no driveway and no rec room is $47,099”.
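The four cases can be checked with a few lines of arithmetic. This is only a sketch: the coefficients are the rounded values from the regression output, and the function name `fitted_price` is made up for illustration.

```python
# Rounded coefficients from the regression output (dollars).
b1, b2, b3 = 47099, 21160, 16024  # intercept, driveway (D1), rec room (D2)

def fitted_price(d1, d2):
    """Fitted price for a house with dummy values d1 (driveway) and d2 (rec room)."""
    return b1 + b2 * d1 + b3 * d2

# Enumerate the four house types.
for d1 in (1, 0):
    for d2 in (1, 0):
        print(f"D1={d1}, D2={d2}: ${fitted_price(d1, d2):,}")
```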

Dummy Variables

• Multiple Regression with Dummy and non-Dummy Explanatory Variables

• Regress Y = house price on D = air conditioning dummy and X = lot size.

• OLS estimates: b1 = 32,693, b2 = 20,175, b3 = 5.64

Dummy Variables

• For houses with an air conditioner D = 1 and Ŷi = 52,868 + 5.64X

• For houses without an air conditioner D = 0 and Ŷi = 32,693 + 5.64X

• Two different regression lines depending on whether the house has an air conditioner or not.

• Two lines have different intercepts but same slope (i.e. same marginal effect)
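As a rough illustration of the "different intercepts, same slope" point, the sketch below derives both lines from the OLS estimates b1 = 32,693, b2 = 20,175 and b3 = 5.64. The function name and the lot size of 5,000 square feet are assumptions chosen for the example.

```python
b1, b2, b3 = 32693, 20175, 5.64  # intercept, air conditioning dummy, lot size

def regression_line(d):
    """Return (intercept, slope) of the fitted line in X for dummy value d."""
    return b1 + b2 * d, b3

with_ac = regression_line(1)     # (52868, 5.64)
without_ac = regression_line(0)  # (32693, 5.64)

# At any lot size the vertical gap between the two lines equals b2.
lot_size = 5000  # hypothetical lot size in square feet
gap = (with_ac[0] + with_ac[1] * lot_size) - (without_ac[0] + without_ac[1] * lot_size)
print(with_ac, without_ac, gap)
```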


Dummy Variables

• Verbal ways of expressing OLS results:

• “An extra square foot of lot size will tend to add $5.64 onto the price of a house” (Note: no ceteris paribus qualifications to statement since marginal effect is same for houses with and without air conditioners)

• “Houses with air conditioners tend to be worth $20,175 more than houses with no air conditioners, ceteris paribus” (Note: Here we do have ceteris paribus qualification)

• “If we consider houses with similar lot sizes, those with air conditioners tend to be worth an extra $20,175”


Dummy Variables

• Another House Price Regression

• Regress Y = house price on D1 = dummy variable for driveway, D2 = dummy variable for rec room, X1 = lot size and X2 = number of bedrooms

• OLS estimates:

Y = b1 + b2D1 + b3D2 + b4X1 + b5X2 + e

b1 = -2,736

b2 = 12,598

b3 = 10,969

b4 = 5.197

b5 = 10,564

Dummy Variables

1. If D1=1 and D2=1, then Ŷi = (b1 + b2 + b3) + b4X1 + b5X2 = 20,831 + 5.197X1 + 10,564X2

This is the regression line for houses with a driveway and rec room.

2. If D1=1 and D2=0, then Ŷi = (b1 + b2) + b4X1 + b5X2 = 9,862 + 5.197X1 + 10,564X2

This is the regression line for houses with a driveway but no rec room.

3. If D1=0 and D2=1, then Ŷi = (b1 + b3) + b4X1 + b5X2 = 8,233 + 5.197X1 + 10,564X2

This is the regression line for houses with a rec room but no driveway.

4. If D1=0 and D2=0, then Ŷi = b1 + b4X1 + b5X2 = -2,736 + 5.197X1 + 10,564X2

This is the regression line for houses with no driveway and no rec room.



Dummy Variables

• “Houses with driveways tend to be worth

$12,598 more than similar houses with no

driveway.”

• “If we consider houses with the same number

of bedrooms, then adding an extra square foot

of lot size will tend to increase the price of a

house by $5.197.”

• “An extra bedroom will tend to add $10,562 to

the value of a house, ceteris paribus”

Dummy Variables

• Interacting Dummy and Non-Dummy Variables

Y = b1 + b2D + b3X + b4Z + e

• Where Z = DX.

• Z is either 0 (for observations with D=0) or X (for observations with D=1)

• If D=1, then Ŷi = (b1 + b2) + (b3 + b4)X

• If D=0, then Ŷi = b1 + b3X

• Two different regression lines corresponding to D=0 and D=1 exist and have different intercepts and slopes.

• The marginal effect of X on Y is different for D=0 and D=1

Dummy Variables

• Regress Y = house price on D = air conditioner dummy, X = lot size and Z = DX

• OLS estimates: b1 = 35,684, b2 = 7,613, b3 = 5.02, b4 = 2.25

• The marginal effect of lot size on house price is $7.27 for houses with air conditioners and only $5.02 for houses without.

• Increasing lot size will tend to add more to the value of a house if it has an air conditioner than if it does not.
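To make the interaction arithmetic concrete, here is a small sketch that turns the four estimates into the two fitted lines. The function name `regression_line` is hypothetical; the numbers are the OLS estimates for this regression.

```python
b1, b2, b3, b4 = 35684, 7613, 5.02, 2.25  # intercept, D, X (lot size), Z = D*X

def regression_line(d):
    """Return (intercept, slope) of the fitted line in X for dummy value d."""
    intercept = b1 + b2 * d  # the dummy shifts the intercept
    slope = b3 + b4 * d      # the interaction term shifts the slope
    return intercept, slope

intercept_ac, slope_ac = regression_line(1)
intercept_no_ac, slope_no_ac = regression_line(0)
print(f"With A/C:    Y-hat = {intercept_ac:,} + {slope_ac:.2f} X")
print(f"Without A/C: Y-hat = {intercept_no_ac:,} + {slope_no_ac:.2f} X")
```

With D = 1 the intercept becomes b1 + b2 and the slope becomes b3 + b4, which is where the $7.27 marginal effect comes from.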


Dummy Variables

• Issue here: Using dummy variables on the right side of regression, i.e., as INDEPENDENT variables

• Dummy variables are (0,1) variables

• The illustrative example:

• House prices depend upon house characteristics: size (square feet), location (may be dummy), # of bedrooms, bathrooms, age, whether has a pool (dummy), whether has a tile roof (dummy), etc.

• Note: These sorts of models are called “hedonic price” models.


Dummy Variables

Here again is our basic simple regression model:

PRICE = b1 + b2 SQFT + e

D = 1 if characteristic is present
D = 0 if characteristic is not present

D = 1 if property is in the desirable neighborhood
D = 0 if property is not in the desirable neighborhood

Dummy Variables

PRICE = b1 + δD + b2 SQFT + e

E(PRICE) = (b1 + δ) + b2 SQFT   when D = 1
E(PRICE) = b1 + b2 SQFT         when D = 0


Dummy Variables

PRICE = b1 + b2 SQFT + γ(SQFT × D) + e

E(PRICE) = b1 + b2 SQFT + γ(SQFT × D)
         = b1 + (b2 + γ) SQFT   when D = 1
         = b1 + b2 SQFT         when D = 0

∂E(PRICE)/∂SQFT = b2 + γ   when D = 1
∂E(PRICE)/∂SQFT = b2       when D = 0

Dummy variables can also be used to determine whether the marginal impact of one variable depends on the presence (or absence) of the dummy characteristic. Does price per square foot depend upon whether the house is in a good neighborhood?


Dummy Variables

• Here’s a model with both intercept and slope

dummy effects. We will estimate it now.


Dummy Variables

• 1000 house sales from two similar neighborhoods,

one bordering a university, the other 3 miles away.

• Variables:

• price house price, in $1000

• sqft square feet of living area, in 100's

• age house age, in years

• utown =1 if close to university

• pool =1 if house has pool

• fplace =1 if house has fireplace


Dummy Variables

• The Model:

• Sample Data:

Dummy Variables

Model 2: OLS, using observations 1-1000
Dependent variable: p

             coefficient    std. error    t-ratio    p-value
  -------------------------------------------------------------
  const      24.5000        6.19172        3.957     8.13e-05    ***
  sqft        7.61218       0.245176      31.05      1.87e-148   ***
  utown      27.4530        8.42258        3.259     0.0012      ***
  usqft       0.0129940     0.00332048     3.913     9.72e-05    ***
  pool        4.37716       1.19669        3.658     0.0003      ***
  fplace      1.64918       0.971957       1.697     0.0901      *
  age        -0.190086      0.0512046     -3.712     0.0002      ***

Mean dependent var   247.6557    S.D. dependent var   42.19273
Sum squared resid    230184.4    S.E. of regression   15.22521
R-squared            0.870570    Adjusted R-squared   0.869788
F(6, 993)            1113.183    P-value(F)           0.000000

Dummy Variables

• Premium for lots near the university is …

• Premium for having a pool is …

• Premium for having a fireplace is …

• Δ Price due to Δ in house size is …

Answer choices (the same for each question):

A. $24,500

B. $76

C. $4,377

D. $1.2994

E. $1,649

F. $27,453

Dummy Variables

Based on these regression results, we estimate:

the location premium, for lots near the university, to be $27,453

the price per square foot to be $89.12 for houses near the university, and $76.12 for houses in other areas.

that houses depreciate $190.10 per year

that a pool increases the value of a home by $4377.20

that a fireplace increases the value of a home by $1649.20

Dummy Variables

Variable | Coefficient | X Units | Y Units | Marginal Effect
UTOWN (1 = close to university) | 27.453 | 1,0 | 1000s | 1 x 27.453 x 1000 = $27,453
SQFT (size in 100s) | 7.612 | 100s of SQFT | 1000s | 7.612 ÷ 100 = .07612 x 1000 = $76.12
USQFT (UTOWN x SQFT) | .01299 | 1,0 | 1000s | .01299 x 1000 = $12.99 + $76.12 = $89.11
AGE (years) | -.190 | Years | 1000s | 1 x -.19 x 1000 = -$190
FPLACE (1 = fireplace) | 1.649 | 1,0 | 1000s | 1 x 1.649 x 1000 = $1,649
POOL (1 = pool) | 4.377 | 1,0 | 1000s | 1 x 4.377 x 1000 = $4,377
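The unit conversions in the table can be sketched in code. The coefficients are the rounded values from the table; the constant names are made up for illustration. One assumption is labelled in the comments: following the table's own arithmetic, the USQFT coefficient is treated as a per-square-foot effect.

```python
# Rounded coefficients from the table; price is measured in $1000s.
coef = {"utown": 27.453, "sqft": 7.612, "usqft": 0.01299,
        "age": -0.190, "fplace": 1.649, "pool": 4.377}

DOLLARS_PER_Y_UNIT = 1000  # one unit of price = $1,000
SQFT_PER_X_UNIT = 100      # one unit of sqft = 100 square feet

location_premium = coef["utown"] * DOLLARS_PER_Y_UNIT
price_per_sqft_other = coef["sqft"] / SQFT_PER_X_UNIT * DOLLARS_PER_Y_UNIT
# Assumption (mirrors the table's arithmetic): usqft is a per-square-foot effect.
price_per_sqft_utown = price_per_sqft_other + coef["usqft"] * DOLLARS_PER_Y_UNIT
depreciation_per_year = coef["age"] * DOLLARS_PER_Y_UNIT
pool_premium = coef["pool"] * DOLLARS_PER_Y_UNIT
fireplace_premium = coef["fplace"] * DOLLARS_PER_Y_UNIT

print(f"Location premium: ${location_premium:,.0f}")
print(f"Price per sq ft:  ${price_per_sqft_other:.2f} (other), ${price_per_sqft_utown:.2f} (near university)")
print(f"Depreciation:     ${depreciation_per_year:,.0f} per year")
```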


Dummy Variables/Gretl Practice

• We are now going to return to a model we

looked at earlier for women’s labor force

participation

• This model used 1990 data from 50 states.


Gretl Practice

• Here is a model of the determinants of women’s labor force participation for all 50 states.

• WLFP = Participation rate (%) of women > 16 in the labor force

• YF = Annual median earnings by females (000s of $)

• YM = Annual median earnings by males (000s of $)

• EDUC = Female HS grads > 24 (%)

• UE = Unemployment rate (%)

• MR = Marriage rate (%) women over 16

• DR = Divorce rate (%)

• URB = % of state’s population that is urban

• WH = % of state’s female > 16 population who are white


Gretl Practice

• This is another “kitchen sink” model:

• WLFP = b1 + b2YF + b3YM + b4EDUC + b5UE + b6MR + b7DR + b8URB + b9WH + e

Gretl Practice

Model 1: OLS, using observations 1-50
Dependent variable: wlfp

          Coefficient    Std. Error    t-ratio    p-value
const     44.5096        8.97496        4.9593    0.00001    ***
yf         0.987983      0.407583       2.4240    0.01985    **
ym        -0.174345      0.306207      -0.5694    0.57221
educ       0.285129      0.0931647      3.0605    0.00389    ***
ue        -1.61058       0.313617      -5.1355    <0.00001   ***
mr        -0.0782145     0.173139      -0.4517    0.65383
dr         0.437371      0.258336       1.6930    0.09804    *
urb       -0.0926339     0.0333355     -2.7788    0.00820    ***
wh        -0.0874916     0.0398446     -2.1958    0.03382    **

Mean dependent var   57.47400    S.D. dependent var   4.248784
Sum squared resid    193.9742    S.E. of regression   2.175104
R-squared            0.780710    Adjusted R-squared   0.737922
F(8, 41)             18.24590    P-value(F)           2.90e-11
Log-likelihood      -104.8395    Akaike criterion     227.6790
Schwarz criterion    244.8872    Hannan-Quinn         234.2319

Gretl Practice

Model 4: OLS, using observations 1-50
Dependent variable: wlfp

          Coefficient    Std. Error    t-ratio    p-value
const     41.346         5.55984        7.4365    <0.00001   ***
yf         1.06712       0.364515       2.9275    0.00550    ***
educ       0.258172      0.0708648      3.6432    0.00073    ***
ue        -1.59099       0.307647      -5.1715    <0.00001   ***
dr         0.391632      0.235404       1.6637    0.10363
urb       -0.0876356     0.0311463     -2.8137    0.00742    ***
wh        -0.0850871     0.0391115     -2.1755    0.03527    **
ym        -0.198418      0.298664      -0.6644    0.51010

Mean dependent var   57.47400    S.D. dependent var   4.248784
Sum squared resid    194.9397    S.E. of regression   2.154396
R-squared            0.779619    Adjusted R-squared   0.742888
F(7, 42)             21.22554    P-value(F)           6.62e-12
Log-likelihood      -104.9636    Akaike criterion     225.9272
Schwarz criterion    241.2234    Hannan-Quinn         231.7521

Gretl Practice

Model 5: OLS, using observations 1-50
Dependent variable: wlfp

          Coefficient    Std. Error    t-ratio    p-value
const     41.8336        5.47528        7.6405    <0.00001   ***
yf         0.849264      0.158152       5.3699    <0.00001   ***
educ       0.249152      0.0690987      3.6057    0.00080    ***
ue        -1.67758       0.276859      -6.0593    <0.00001   ***
dr         0.434104      0.22508        1.9287    0.06039    *
urb       -0.0942172     0.0293363     -3.2116    0.00250    ***
wh        -0.0960861     0.0352037     -2.7294    0.00916    ***

Mean dependent var   57.47400    S.D. dependent var   4.248784
Sum squared resid    196.9882    S.E. of regression   2.140355
R-squared            0.777303    Adjusted R-squared   0.746229
F(6, 43)             25.01455    P-value(F)           1.55e-12
Log-likelihood      -105.2249    Akaike criterion     224.4499
Schwarz criterion    237.8341    Hannan-Quinn         229.5467


Dummy Variables/Gretl Practice

• We are going to look at data from both 1980

and 1990 because it’s possible there was a

structural change in WLFP over that decade.

• We are going to create a Dummy Variable for 1990, called D90. For 1990 observations, D90 = 1; for 1980 observations, D90 = 0.

• Now we are going to incorporate interaction

terms, by multiplying D90 by the other

variables.


Dummy Variables/Gretl Practice

• Here is the model we are starting with:

• WLFP = b1 + b2YF + b3YM + b4EDUC + b5UE + b6MR + b7DR + b8URB + b9WH

+ q1(D90*YF) + q2(D90*YM) + q3(D90*EDUC) + q4(D90*UE) + q5(D90*MR) + q6(D90*DR)

+ q7(D90*URB) + q8(D90*WH) + e
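A minimal sketch of how the D90 dummy and its interaction terms could be built before running the regression. The rows and values below are invented for illustration and are not the lecture's data; the function name is hypothetical, and the column names (D90YF, D90UE) simply follow the "D90 times variable" pattern.

```python
# Hypothetical rows (illustrative values only); each row is one state-year observation.
rows = [
    {"year": 1980, "YF": 9.1, "UE": 7.2},
    {"year": 1990, "YF": 12.4, "UE": 5.5},
]

def add_d90_interactions(row, variables):
    """Return a copy of row with D90 and the D90*variable columns added."""
    out = dict(row)
    out["D90"] = 1 if row["year"] == 1990 else 0
    for name in variables:
        # The interaction is 0 for 1980 rows and equals the variable for 1990 rows.
        out["D90" + name] = out["D90"] * row[name]
    return out

pooled = [add_d90_interactions(r, ["YF", "UE"]) for r in rows]
for r in pooled:
    print(r)
```

In a package such as gretl the same columns would normally be created with the program's own variable-definition tools; the loop here only shows what those columns contain.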


Dummy Variables/Gretl Practice

• Model 2: OLS, using observations 1-100

• Dependent variable: WLFP

• Omitted due to exact collinearity: D90YM

• coefficient std. error t-ratio p-value

• ------------------------------------------------------------

• const 49.6235 10.5465 4.705 1.00e-05 ***

• YF 0.00470565 0.000948974 4.959 3.71e-06 ***

• YM -0.000133492 0.000273021 -0.4889 0.6262

• EDUC 0.286358 0.0586220 4.885 4.97e-06 ***

• UE -1.09155 0.269898 -4.044 0.0001 ***

• MR -0.210187 0.153263 -1.371 0.1739

• DR 0.208349 0.172456 1.208 0.2304

• URB -0.0665164 0.0299182 -2.223 0.0289 **

• WH -0.126810 0.0343940 -3.687 0.0004 ***

• D90 -4.85398 13.6340 -0.3560 0.7227

• D90YF -0.00376523 0.000880843 -4.275 5.08e-05 ***

• D90EDUC -0.00164444 0.111949 -0.01469 0.9883

• D90UE -0.537329 0.396399 -1.356 0.1789

• D90MR 0.127952 0.230950 0.5540 0.5811

• D90DR 0.239853 0.304639 0.7873 0.4333

• D90URB -0.0276884 0.0437808 -0.6324 0.5288

• D90WH 0.0369987 0.0517175 0.7154 0.4764

• R-squared 0.861932 Adjusted R-squared 0.835316


Dummy Variables/Gretl Practice

• Suspecting multicollinearity, we eliminate variables with insignificant coefficients one at a time.


Dummy Variables/Gretl Practice

• Model 3: OLS, using observations 1-100

• Dependent variable: WLFP

• coefficient std. error t-ratio p-value

• ----------------------------------------------------------

• const 47.6366 6.57840 7.241 1.52e-010 ***

• YF 0.00477939 0.000733949 6.512 4.28e-09 ***

• EDUC 0.275070 0.0455059 6.045 3.43e-08 ***

• UE -1.06141 0.245591 -4.322 4.02e-05 ***

• MR -0.207293 0.104894 -1.976 0.0512 *

• DR 0.281618 0.133697 2.106 0.0380 **

• URB -0.0784652 0.0206237 -3.805 0.0003 ***

• WH -0.111495 0.0242421 -4.599 1.40e-05 ***

• D90YF -0.00405375 0.000682124 -5.943 5.36e-08 ***

• D90UE -0.569355 0.327225 -1.740 0.0853 *

• D90MR 0.126361 0.0509756 2.479 0.0151 **

• R-squared 0.858214 Adjusted R-squared 0.842283

• F(10, 89) 53.87052 P-value(F) 1.98e-33


Dummy Variables/Gretl Practice

• Here is the final model:

• WLFP = 47.63 + .00478 YF - .00405 (D90 *

YF) + .275 EDUC – 1.06 UE - .569 (D90 *

UE) - .207 MR + .126 (D90 * MR) + .282 DR

- .078 URB - .111 WH

• Adjusted R2 = .842

Dummy Variables/Gretl Practice

For 1980, set D90 = 0

• WLFP = 47.63 + .00478 YF + .275 EDUC - 1.06 UE - .207 MR + .282 DR - .078 URB - .111 WH

For 1990, set D90 = 1 and combine terms

• WLFP = 47.63 + .00073 YF + .275 EDUC - 1.63 UE - .081 MR + .282 DR - .078 URB - .111 WH
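The "combine terms" step can be sketched as follows, using the rounded coefficients of the final model. The dictionary and function names are made up for illustration.

```python
# Rounded coefficients of the final model.
base = {"const": 47.63, "YF": 0.00478, "EDUC": 0.275, "UE": -1.06,
        "MR": -0.207, "DR": 0.282, "URB": -0.078, "WH": -0.111}
# Coefficients on the D90 interaction terms that survived in the final model.
d90_shift = {"YF": -0.00405, "UE": -0.569, "MR": 0.126}

def coefficients(d90):
    """Effective coefficient on each variable for D90 = 0 (1980) or D90 = 1 (1990)."""
    return {name: value + d90 * d90_shift.get(name, 0.0) for name, value in base.items()}

coef_1980 = coefficients(0)
coef_1990 = coefficients(1)
for name in base:
    print(f"{name:5s} 1980: {coef_1980[name]: .5f}   1990: {coef_1990[name]: .5f}")
```

Only YF, UE and MR differ between the two years, because those are the only variables whose D90 interactions remain in the final model.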


Multiple Regression

• We continue with addressing these issues:

– Where do we get data to do this analysis?

– How do we create the model relating the data?

– How do we relate data to one another?

– How do we evaluate these relationships?


Multiple Regression

• Sources of specification errors:

– Choice of variables

– Functional forms (non-linear relationships)

– Structure of the error terms (e’s)


Multiple Regression

• Sources of specification errors:

– Choice of variables

• Omitted Variables

• Irrelevant Variables

• Multicollinearity

– Functional forms (non-linear relationships)

• Non-linear models

– Structure of the error terms (e’s)

Structure of the Error Terms (e’s)

• Heteroskedasticity

• Serial Correlation

Structure of the Error Terms (e’s)

• What about the variance of our estimates?

– All e’s are identically distributed with the same conditional variance (σ²) [Homoskedasticity (equal scatter)]

– e’s are independently distributed; cov(ei, ej) = 0 for i ≠ j

• Then, among all linear unbiased estimators (linear combinations of the Ys), our estimates -- b’s -- have the lowest variance = most efficient.

Heteroskedasticity

• Under heteroskedasticity, the variance of the error term varies with the value of x. The variance is no longer equal to σ², a constant.

• Now Var(ei) = σi²


Heteroskedasticity

• The consequences of Heteroskedasticity

• (1) The standard errors usually computed for the least

squares estimator are incorrect. Confidence intervals

and hypothesis tests that use these standard errors ARE

WRONG!!!

• (2) The least squares estimator is still a linear and

unbiased estimator, but it is no longer best. There is

another estimator with a smaller variance.

Heteroskedasticity

• Why? Consider the following equations:

• Remember the standard error (se) = [Var(b)]^(1/2)

• And that t = bi / se

            Homoskedasticity        Heteroskedasticity
Var(e)      σ²                      σi²
Var(b)      σ² / Σ(xi − x̅)²         Σ[(xi − x̅)² σi²] / [Σ(xi − x̅)²]²


Heteroskedasticity

• Thus each observation has a different error

variance.

• Now, we will look at a quick fix for the

problems created by heteroskedasticity.

Heteroskedasticity

• By using our squared residuals (êi²) to estimate each E(ei²) (= σi²), we can estimate the HETEROSKEDASTICITY-ROBUST VARIANCE

• Var(b) = Σ[(xi − x̅)² êi²] / [Σ(xi − x̅)²]²
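As a rough sketch of the robust variance formula for the slope of a simple regression, the code below computes the usual standard error and the heteroskedasticity-robust one side by side. The data are invented for illustration, the function name is hypothetical, and no degrees-of-freedom correction is applied to the robust variance. Statistical packages often apply small-sample adjustments, so their reported robust standard errors may differ slightly from this plain version.

```python
from math import sqrt
from statistics import mean

def ols_with_se(x, y):
    """Return (slope, usual se, robust se) for the simple regression y = b1 + b2*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    # Usual variance: assumes one common error variance, estimated by SSR / (n - 2).
    sigma2_hat = sum(e ** 2 for e in resid) / (n - 2)
    var_usual = sigma2_hat / sxx
    # Robust variance: each squared residual stands in for its own error variance.
    var_robust = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    return b2, sqrt(var_usual), sqrt(var_robust)

# Hypothetical data whose scatter widens as x grows.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.5, 7.2, 11.0, 10.1, 15.8, 13.0]
slope, se_usual, se_robust = ols_with_se(x, y)
print(f"slope = {slope:.4f}, usual se = {se_usual:.4f}, robust se = {se_robust:.4f}")
```

The slope is computed once and is the same whichever standard error is reported; only the measure of its precision changes.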



Heteroskedasticity

• When you correct the standard errors in this

way (the White correction)

• Do your parameter estimates (b2, b3) change??

• NO!!!

• Only the standard errors change.


Heteroskedasticity

• Correcting Standard Errors (se) Example:

Food Expenditure Model:

y = weekly food expenditures ($)

x = weekly income ($100)

Heteroskedasticity

• Correcting Standard Errors (se) Example:

Food Expenditure Model:

ŷi = 83.42 + 10.21 xi

     (27.46)  (1.81)   (White se)

     (43.41)  (2.09)   (incorrect se)
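A short sketch of what the correction changes in practice: the coefficient estimates stay the same, while the t-ratios built from them depend on which standard errors are used. The dictionary and function names are made up for illustration; the numbers are the ones reported for the food expenditure model.

```python
estimates = {"intercept": 83.42, "income": 10.21}
white_se = {"intercept": 27.46, "income": 1.81}  # heteroskedasticity-robust
usual_se = {"intercept": 43.41, "income": 2.09}  # incorrect under heteroskedasticity

def t_ratios(se):
    """t = b / se for each coefficient, using the supplied standard errors."""
    return {name: estimates[name] / se[name] for name in estimates}

t_white = t_ratios(white_se)
t_usual = t_ratios(usual_se)
for name in estimates:
    print(f"{name:9s} b = {estimates[name]:6.2f}  t (White) = {t_white[name]:.2f}  t (usual) = {t_usual[name]:.2f}")
```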

Heteroskedasticity

Using heteroskedasticity-robust standard errors is probably the MOST IMPORTANT part of this lecture!

This is what most practitioners do to address potential heteroskedasticity.

Why? Because changing estimation (which we will talk about in a moment) requires specifying the form of heteroskedasticity.

If we are wrong about the form, we are potentially introducing error in the estimation.



Heteroskedasticity

• An observation, continued:

• HOWEVER, we need to talk about the form of heteroskedasticity in order to TEST for it, which is something practitioners want to know about.

• Why? Because if there is evidence of heteroskedasticity, they may want to see that your results are robust to correcting for it in the estimation.


Heteroskedasticity

Therefore, we turn to detecting heteroskedasticity.

• Diagnostic: residual plots

• Forms of heteroskedasticity and testing for them

• Using specific forms of heteroskedasticity to perform Generalized Least Squares (to make the least squares estimator BEST, not just unbiased)
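As a sketch of the idea behind the last bullet, assume for illustration that the error variance of each observation, $\mathrm{Var}(e_i) = \sigma_i^2$, is known (in practice it has to be estimated from an assumed form). Dividing the simple regression through by $\sigma_i$ gives a transformed model:

$$
\frac{y_i}{\sigma_i} = b_1 \frac{1}{\sigma_i} + b_2 \frac{x_i}{\sigma_i} + \frac{e_i}{\sigma_i},
\qquad
\mathrm{Var}\!\left(\frac{e_i}{\sigma_i}\right) = \frac{\sigma_i^2}{\sigma_i^2} = 1
$$

The transformed error has the same variance for every observation, so least squares applied to the transformed data is BEST again, provided the assumed form of $\sigma_i^2$ is correct.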


Heteroskedasticity

Residual Plots

Estimate the model using least squares and plot the least squares residuals.

With more than one explanatory variable, plot the least squares residuals or squared residuals against each explanatory variable, or against $\hat{y}_i$, to see if those residuals vary in a systematic way relative to the specified variable.
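The numbers behind such a plot can be produced with a short sketch in plain Python. The function name and the data below are made up for illustration. The rows it returns (x, fitted value, residual, squared residual) are what would go on the axes of a residual plot or a squared residual plot.

```python
def residual_table(x, y):
    """Return rows (x_i, fitted_i, residual_i, squared residual_i), sorted by x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    rows = []
    for xi, yi in zip(x, y):
        fitted = b1 + b2 * xi
        resid = yi - fitted
        rows.append((xi, fitted, resid, resid ** 2))
    return sorted(rows)


# Made-up illustrative data (NOT the lecture's data).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.4, 3.2, 6.5, 4.6]
for xi, fitted, resid, resid_sq in residual_table(x, y):
    print(f"x = {xi:4.1f}  fitted = {fitted:6.3f}  resid = {resid:7.3f}  resid^2 = {resid_sq:6.3f}")
```

If the residuals or squared residuals grow or shrink systematically as x (or the fitted value) increases, that is the pattern to look for in the plot.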



Heteroskedasticity

Residual plot in gretl

In OLS output, click path:

– Graphs

– Residuals plot

– Against x

Squared Residual plot in gretl

In OLS output, click path:

– Save

– Squared Residuals

– OK

Return to Data

– Choose X-Y Graph

– X = variable; Y = u2


Heteroskedasticity

[Figure: two residual plots, "No Patterns to Residuals" and "Heteroskedastic Pattern"]


Heteroskedasticity

• Summary of key points on heteroskedasticity:

• 1) Implies OLS standard errors are WRONG, so t-stats (hypothesis tests) and confidence intervals based on OLS are WRONG

– Correct with Heteroskedasticity-robust (White) standard errors

• 2) Detecting (testing for) specific forms of heteroskedasticity: Breusch-Pagan (B-P), White, Harvey

• 3) Implies OLS estimator is not minimum variance (efficient / BLUE). Can obtain BLUE estimator for specific form of heteroskedasticity by using Generalized Least Squares (OLS on “transformed data,” obtained by dividing all data by estimated standard deviation of observation i, $\hat{\sigma}_i$)
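Point 2 can be illustrated with a minimal sketch of the Breusch-Pagan idea in plain Python. It assumes the N × R² (Lagrange multiplier) version with a single explanatory variable, which may not be the exact variant used in class. The function names and the data are made up for illustration.

```python
def ols_fit(x, y):
    """Least squares fit of y = b1 + b2*x; returns the fitted values."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    return [b1 + b2 * xi for xi in x]


def r_squared(y, fitted):
    """Share of the variation in y explained by the fitted values."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1.0 - sse / sst


def breusch_pagan_lm(x, y):
    """N * R^2 from the auxiliary regression of squared OLS residuals on x."""
    fitted = ols_fit(x, y)
    resid_sq = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    aux_fitted = ols_fit(x, resid_sq)
    return len(x) * r_squared(resid_sq, aux_fitted)


CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 degree of freedom

# Made-up data whose scatter widens as x grows (NOT the lecture's data).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.1, 3.9, 5.3, 5.6, 8.2, 6.5, 10.9, 7.4]
lm = breusch_pagan_lm(x, y)
print(f"LM = {lm:.3f}; reject homoskedasticity at 5%: {lm > CHI2_1DF_5PCT}")
```

The statistic is compared with a chi-square critical value whose degrees of freedom equal the number of variables in the auxiliary regression (one here). A large value is evidence that the error variance depends on x.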


Serial Correlation

• Serial correlation (or autocorrelation) violates the 6th Assumption that underpins Gauss-Markov: $\mathrm{cov}(e_i, e_j) = 0$ for $i \neq j$

• With serial correlation, $\mathrm{cov}(e_i, e_j) \neq 0$


Structure of the Error Terms (e’s)

[Figure: two panels, "Heteroskedasticity" and "Serial Correlation"]


Serial Correlation

• We are going to concern ourselves with First

Order Serial Correlation.

• This is often found in time series data.

• A period’s error term is related to the error

term of the previous period.

• A primary cause is the existence of long-term

cycles and trends in the data.


Serial Correlation

• Here is a formal statement using a simple linear regression:

• Model: $y_t = b_1 + b_2 x_t + e_t$

• Error term: $e_t = r\,e_{t-1} + v_t$, with $-1 < r < 1$

• Because r is the coefficient of the error term lagged one period, this is called the first order autocorrelation coefficient and the process is called the first order autoregressive process or AR(1).
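To get a feel for the AR(1) error process e_t = r·e_{t-1} + v_t, the sketch below builds the errors from a list of shocks v_t. The function name and the starting value are assumptions made for illustration.

```python
def ar1_errors(r, shocks, e_start=0.0):
    """Build e_t = r * e_{t-1} + v_t from a list of shocks v_t."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    errors = []
    previous = e_start  # assumed value of the error before the first period
    for v in shocks:
        current = r * previous + v
        errors.append(current)
        previous = current
    return errors


# A single unit shock followed by zeros shows how the shock carries forward.
print(ar1_errors(0.5, [1.0, 0.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25, 0.125]
```

With r = 0.5, half of each period's error carries into the next period, so one shock keeps showing up in later errors. That persistence is what makes cov(e_t, e_{t-1}) nonzero.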


Serial Correlation

• The implication of AR(1) is:

• The least squares estimator is still a linear and

unbiased estimator, but it is no longer best. There

is another estimator with a smaller variance.

• The standard errors usually computed for the

least squares estimator are incorrect.

Confidence intervals and hypothesis tests that

use these standard errors may be misleading.


Serial Correlation

• The most common test for first order serial

correlation is the Durbin-Watson (DW) test.

• The DW statistic is calculated from the value

of the residuals.

• $d = \dfrac{\sum_{t=2}^{n} (\hat{e}_t - \hat{e}_{t-1})^2}{\sum_{t=1}^{n} \hat{e}_t^2}$
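The d statistic is simple enough to compute by hand from a list of residuals. A minimal sketch in plain Python follows. The function name is made up, and the residual lists are toy numbers chosen to make the arithmetic easy to follow.

```python
def durbin_watson(resid):
    """d = sum over t=2..n of (e_t - e_{t-1})^2, divided by sum over t=1..n of e_t^2."""
    numerator = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    denominator = sum(e ** 2 for e in resid)
    return numerator / denominator


# Residuals that flip sign every period (negative serial correlation): d above 2.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # 3.0
# Residuals that keep their sign for a while (positive serial correlation): d below 2.
print(durbin_watson([1.0, 1.0, -1.0, -1.0]))  # 1.0
```

In the first list every successive difference is large relative to the residuals themselves, which pushes d toward 4. In the second, most differences are zero, which pulls d toward 0.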


Serial Correlation

• Working out the d statistic yields, approximately:

• $d \approx 2(1 - \hat{r})$, where $\hat{r}$ is the estimate of r

• Because r can range from -1 to 1, the range for d

is 0 – 4.

• Durbin-Watson critical values are upper (du) and

lower (dl) bounds. N = sample size and k’ =

number of coefficients NOT COUNTING THE

CONSTANT!
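The approximation for d can be worked out by expanding the square in the numerator of the DW statistic. Write $\hat{r} = \sum_{t=2}^{n} \hat{e}_t \hat{e}_{t-1} \big/ \sum_{t=1}^{n} \hat{e}_t^2$ for the estimated first order autocorrelation (this definition is the usual one and is assumed here):

$$
d = \frac{\sum_{t=2}^{n} \hat{e}_t^2 + \sum_{t=2}^{n} \hat{e}_{t-1}^2 - 2\sum_{t=2}^{n} \hat{e}_t \hat{e}_{t-1}}{\sum_{t=1}^{n} \hat{e}_t^2}
\approx 2 - 2\,\frac{\sum_{t=2}^{n} \hat{e}_t \hat{e}_{t-1}}{\sum_{t=1}^{n} \hat{e}_t^2}
= 2(1 - \hat{r})
$$

The step is approximate because each of the first two sums in the numerator leaves out one squared residual compared with the denominator, which matters little in a reasonably large sample.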


Serial Correlation

• Rules of Thumb: because d = 2 when r is 0, a DW statistic around 2 (1.75 – 2.25) suggests there is no first order serial correlation.

• An r close to 1 means a d close to 0, which indicates strong positive serial correlation.

• Similarly, an r close to -1 means a d close to 4, which indicates strong negative serial correlation.



Serial Correlation

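The Durbin-Watson bounds test (lower d values only, i.e. testing for positive serial correlation):

• if d < dl, reject the null hypothesis of no first order serial correlation.

• if d > du, do not reject the null hypothesis.

• if dl ≤ d ≤ du, the test is inconclusive.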


Serial Correlation

What can you do about serial correlation?

• First differences

• Change the model specification
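As a sketch of why first differences can help, assume the simple regression $y_t = b_1 + b_2 x_t + e_t$ with AR(1) errors $e_t = r\,e_{t-1} + v_t$ from earlier in the lecture. Subtracting last period's equation from this period's gives:

$$
y_t - y_{t-1} = b_2 (x_t - x_{t-1}) + (e_t - e_{t-1}),
\qquad
e_t - e_{t-1} = (r - 1)\,e_{t-1} + v_t \approx v_t \ \text{ when } r \approx 1
$$

The intercept drops out, and when r is close to 1 the differenced error is approximately the uncorrelated shock $v_t$. The approach is therefore most useful when the positive serial correlation is strong.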


Projects

Is everyone on a team? If not, see me!

Do you have topics? Please give them to me tonight or email them during the upcoming week.

Must have topics by NEXT WEEK – which is also the midterm …

