Sociology 601 Class 25: November 24, 2009

• Homework 9

• Review

– dummy variable example from ASR (finish)

– regression results for dummy variables

• Quadratic effects

– example: earnings and age

– plotting

• F-tests comparing models

• Example from Sociology of Religion

Review: Regression with Dummy Variables

Create dummy variables for age. Why? Age is an interval variable, so what advantage is there in creating a series of dummies?

gen byte age25=0 if age<.   /* new variable, age25, will be missing if age is missing */
replace age25=1 if age>=25 & age<=29

gen byte age30=0 if age<.
replace age30=1 if age>=30 & age<=34

gen byte age35=0 if age<.
replace age35=1 if age>=35 & age<=39

gen byte age40=0 if age<.
replace age40=1 if age>=40 & age<=44

gen byte age45=0 if age<.
replace age45=1 if age>=45 & age<=49

gen byte age50=0 if age<.
replace age50=1 if age>=50 & age<=54   /* 5-year band, matching the 25-54 sample */

* check age dummies (agecheck should =1 for all cases)
egen byte agecheck=rowtotal(age25-age50)
tab agecheck, missing

Stata Shortcut for Dummy Variables

drop if age<25 | age>54   /* restrict to ages 25-54 first, so that age1-age6 are the six 5-year bands */

gen byte agecat=floor(age/5)*5
tab agecat, gen(age)
* floor() rounds down to the nearest integer:
* e.g., at age 23: floor(23/5)*5 = floor(4.6)*5 = 4*5 = 20

* check age dummies (agecheck should =1 for all cases)
egen byte agecheck=rowtotal(age1-age6)
tab agecheck, missing
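
As a cross-check on the shortcut, the sketch below (written in Python rather than Stata, purely for illustration) confirms that floor(age/5)*5 assigns every age from 25 to 54 to the same band as the hand-written ranges, and that each case gets exactly one dummy equal to 1. The function names are hypothetical and are not part of the lecture.

```python
import math

BANDS = [25, 30, 35, 40, 45, 50]  # lower bounds of the six 5-year bands


def band_by_ranges(age):
    """Hand-coded version: return the band whose range [lo, lo+4] contains age."""
    for lo in BANDS:
        if lo <= age <= lo + 4:
            return lo
    return None  # outside the 25-54 sample


def band_by_floor(age):
    """Shortcut version: floor(age/5)*5, mirroring the Stata expression."""
    return math.floor(age / 5) * 5


def dummies(age):
    """One 0/1 indicator per band (the analogue of age1-age6)."""
    return [1 if band_by_floor(age) == lo else 0 for lo in BANDS]


for age in range(25, 55):
    assert band_by_ranges(age) == band_by_floor(age)
    assert sum(dummies(age)) == 1  # the 'agecheck' condition

print(band_by_floor(23))  # prints 20, matching the worked example
```

The loop plays the role of `tab agecheck, missing`: if any age fell into zero or two bands, the assertion would fail.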

Regression with Age Dummy Variables

. regress conrinc age2-age6 if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  5,   719) =   12.79
       Model |  3.8044e+10     5  7.6089e+09           Prob > F      =  0.0000
    Residual |  4.2773e+11   719   594895739           R-squared     =  0.0817
-------------+------------------------------           Adj R-squared =  0.0753
       Total |  4.6577e+11   724   643334846           Root MSE      =   24390

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        age2 |   8220.236   3143.413     2.62   0.009     2048.872     14391.6
        age3 |    16495.6   3122.571     5.28   0.000     10365.16    22626.05
        age4 |    17274.8    3112.55     5.55   0.000     11164.03    23385.57
        age5 |   21532.53   3288.812     6.55   0.000      15075.7    27989.35
        age6 |   20013.57   3406.607     5.87   0.000     13325.48    26701.66
       _cons |    26954.2   2325.541    11.59   0.000     22388.54    31519.86
------------------------------------------------------------------------------

Same R-squared and overall F, but different b’s and t’s (although same relative order):

. regress conrinc age1-age5 if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  5,   719) =   12.79
       Model |  3.8044e+10     5  7.6089e+09           Prob > F      =  0.0000
    Residual |  4.2773e+11   719   594895739           R-squared     =  0.0817
-------------+------------------------------           Adj R-squared =  0.0753
       Total |  4.6577e+11   724   643334846           Root MSE      =   24390

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        age1 |  -20013.57   3406.607    -5.87   0.000    -26701.66   -13325.48
        age2 |  -11793.33   3266.455    -3.61   0.000    -18206.26   -5380.405
        age3 |  -3517.968   3246.403    -1.08   0.279    -9891.531    2855.595
        age4 |  -2738.771   3236.766    -0.85   0.398    -9093.413    3615.872
        age5 |   1518.956   3406.607     0.45   0.656     -5169.13    8207.043
       _cons |   46967.77   2489.343    18.87   0.000     42080.52    51855.02
------------------------------------------------------------------------------
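
The two sets of coefficients describe the same fitted group means; only the reference category differs. As a rough check (a Python sketch, not part of the original Stata session; `switch_reference` is a hypothetical helper), subtracting the age6 coefficient from each coefficient in the first regression, and adding it to the constant, reproduces the second regression up to rounding.

```python
# Coefficients from the first regression (omitted category: age1, ages 25-29).
b_ref1 = {"age2": 8220.236, "age3": 16495.6, "age4": 17274.8,
          "age5": 21532.53, "age6": 20013.57}
cons_ref1 = 26954.2


def switch_reference(coefs, cons, old_ref, new_ref):
    """Re-express dummy coefficients relative to a different omitted category."""
    shift = coefs[new_ref]          # gap between the new and old reference groups
    new_coefs = {old_ref: -shift}   # the old reference now gets its own dummy
    for name, b in coefs.items():
        if name != new_ref:
            new_coefs[name] = b - shift
    return new_coefs, cons + shift  # the constant becomes the new reference mean


b_ref6, cons_ref6 = switch_reference(b_ref1, cons_ref1, "age1", "age6")
for name in sorted(b_ref6):
    print(name, round(b_ref6[name], 2))
print("_cons", round(cons_ref6, 2))
```

The printed values agree with the age1-age5 output above to within rounding of the displayed coefficients, which is why R-squared and the overall F do not change.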

Plot Earnings by Age

. tab age, sum(conrinc)

            |  Summary of respondent income in
     age of |         constant dollars
 respondent |        Mean   Std. Dev.       Freq.
------------+------------------------------------
         25 |   16277.936   10757.323          47
         26 |     22712.5   12540.689          46
         27 |   21188.725   11802.539          40
         28 |   25593.444    18395.24          54
         29 |   27021.244   17314.169          45
         30 |   29687.902   16242.466          61
         31 |   30723.709   21631.857          55
         32 |   30218.871   19739.067          62
         33 |   26096.263   15751.154          57
         34 |    30685.51       20528          51
         35 |   37709.106   26704.259          47
         36 |   29178.255   21877.287          51
         37 |   33702.843    20378.26          70
         38 |   39046.871   30994.531          62
         39 |   40338.326   29449.024          43
         40 |   35442.909   23448.711          55
         41 |   38218.979   31804.641          48
         42 |   34377.678   26582.113          59
         43 |   37867.069   25189.647          58
         44 |   34885.268    23017.34          41
         45 |   35212.378   20559.449          45
         46 |   41641.308   28233.297          39
         47 |    39708.14   29503.584          50
         48 |   41391.807   26493.252          57
         49 |   38324.964   23601.741          55
         50 |   42443.892   29193.688          37
         51 |   37255.357   25395.935          42
         52 |   35165.655   20471.181          29
         53 |   44005.892   30812.439          37
         54 |   36918.065   26556.129          31
------------+------------------------------------
      Total |   33571.775   24047.119        1474

Regression Test for Curvilinearity

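• Test whether x has a curvilinear relationship with y.

• Testing for a quadratic relationship is the most common, but not the only, method of testing for curvilinearity.

• $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i$

• Test whether $\beta_2 \neq 0$:

o if $\beta_2 > 0$, then U-shaped curve (or part of one)

o if $\beta_2 < 0$, then inverted-U curve (or part of one)

o if $\beta_2$ is not significantly different from 0, then revert to the linear equation by dropping $x^2$

• $\beta_1$ is largely irrelevant in this test:

o if neither $\beta_2$ nor $\beta_1$ is statistically significant in the quadratic model (both p > .05), that does not mean there is no linear relationship; re-estimate the linear model without $x^2$ before drawing a conclusion.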

Curvilinear Regression Equation: β2

$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i$

$\beta_2$ (quadratic coefficient) determines how steeply the curve accelerates:

$y = 2x^2$ ; $y = x^2$ ; $y = 0.5x^2$

Curvilinear Regression Equation: β2< 0

$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i$

If $\beta_2$ (quadratic coefficient) < 0, then the curve is an inverted U:

$y = -2x^2$ ; $y = -x^2$ ; $y = -0.5x^2$

Curvilinear Regression Equation: Inflexion Point = Maximum | Minimum

$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i$

inflexion point (strictly, the turning point or vertex of the parabola) = value of x at which y is a maximum or minimum = $-\beta_1 / (2\beta_2)$

$y = -20x^2 + 800x + 62000$
inflexion = -800 / (-20 * 2) = 20 (i.e., below observed x values)

$y = -100x^2 + 8000x - 90000$
inflexion = -8000 / (-100 * 2) = 40 (i.e., within the x range)

$y = -20x^2 + 2400x + 800$
inflexion = -2400 / (-20 * 2) = 60 (i.e., above observed values)
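
The formula comes from setting the slope of the quadratic, β1 + 2β2x, equal to zero. A minimal Python sketch of that calculation follows; the helper names are hypothetical, and the observed range of 25 to 54 is an assumption taken from the age range used elsewhere in this lecture.

```python
OBSERVED_X = (25, 54)  # assumption: the age range in the lecture's data


def turning_point(b1, b2):
    """x at which y = b0 + b1*x + b2*x**2 stops rising (b2 < 0) or falling (b2 > 0)."""
    if b2 == 0:
        raise ValueError("b2 = 0: the equation is linear and has no turning point")
    return -b1 / (2 * b2)


def describe(b1, b2, x_range=OBSERVED_X):
    """Return the turning point, whether it is a max or min, and where it falls."""
    x_star = turning_point(b1, b2)
    kind = "maximum" if b2 < 0 else "minimum"
    lo, hi = x_range
    if x_star < lo:
        where = "below the observed x values"
    elif x_star > hi:
        where = "above the observed x values"
    else:
        where = "within the observed x range"
    return x_star, kind, where


# (b2, b1) pairs from the three example equations
for b2, b1 in [(-20, 800), (-100, 8000), (-20, 2400)]:
    x_star, kind, where = describe(b1, b2)
    print(f"b1={b1}, b2={b2}: {kind} at x = {x_star:g} ({where})")
```

When the turning point lies outside the observed range, the fitted curve is only rising (or only falling) across the data, even though β2 is nonzero.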

Curvilinear Regression Equation: Inflexion Point = Maximum | Minimum

$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i$

For completeness, when $\beta_2$ is positive, the inflexion point is the value of x at which y is a minimum:

inflexion point = $-\beta_1 / (2\beta_2)$

$y = 20x^2 - 800x + 50000$
inflexion = -(-800) / (20 * 2) = 20 (i.e., below observed x values)

$y = 100x^2 - 8000x + 205000$
inflexion = -(-8000) / (100 * 2) = 40 (i.e., within the x range)

$y = 20x^2 - 2400x + 114000$
inflexion = -(-2400) / (20 * 2) = 60 (i.e., above observed values)

Example: Regression with Curvilinear Age

. gen int agesq=age*age

. summarize age agesq

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |      1860    38.84355    8.309941         25         54
       agesq |      1860    1577.839     655.309        625       2916

. regress conrinc age agesq if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  2,   722) =   32.08
       Model |  3.8016e+10     2  1.9008e+10           Prob > F      =  0.0000
    Residual |  4.2776e+11   722   592463841           R-squared     =  0.0816
-------------+------------------------------           Adj R-squared =  0.0791
       Total |  4.6577e+11   724   643334846           Root MSE      =   24341

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   4764.733   1134.778     4.20   0.000     2536.875    6992.591
       agesq |  -50.27083   14.30126    -3.52   0.000    -78.34785   -22.19381
       _cons |  -65221.92   21786.08    -2.99   0.003    -107993.6   -22450.29
------------------------------------------------------------------------------

$t_{agesq}$ = -3.52; p < .001, so: curvilinear;

$b_{agesq}$ is negative, so: inverted U;

inflexion point = $-b_{age} / (2 \times b_{agesq})$ = -4764.7 / (2 * -50.27) = 47.4

so predicted earnings reach their maximum at about age 47.4.
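
To see what the fitted curve implies, the Python sketch below plugs the reported quadratic coefficients back into the equation and evaluates predicted earnings at a few ages. It is an illustration under the assumption that the displayed (rounded) coefficients are used as-is; `predicted_earnings` is a hypothetical helper, not a Stata command.

```python
# Coefficients copied from the quadratic regression output for men.
b_age = 4764.733
b_agesq = -50.27083
cons = -65221.92


def predicted_earnings(age):
    """Fitted value from conrinc = _cons + b_age*age + b_agesq*age^2."""
    return cons + b_age * age + b_agesq * age ** 2


# Age at which the fitted curve peaks: -b_age / (2 * b_agesq)
peak_age = -b_age / (2 * b_agesq)
print(f"peak age: {peak_age:.1f}")  # about 47.4

for age in (25, 35, 45, peak_age, 54):
    print(f"age {age:5.1f}: predicted earnings {predicted_earnings(age):,.0f}")
```

Predicted earnings rise steeply from age 25, flatten through the mid-40s, and decline slightly by 54, so the peak lies inside the observed 25 to 54 range.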

Cubic Polynomials

• Occasionally (actually, rarely), it is worthwhile to investigate whether a more complex polynomial would better describe the curvilinear relationship.

• Add a cubic term ($x^3$) to the previous quadratic equation:

• $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + e_i$

• Test $\beta_3 \neq 0$

o if you can’t show $\beta_3 \neq 0$, then revert to the quadratic model

o if p > .05 for $\beta_3$, then don’t interpret $\beta_2$ and $\beta_1$ from the cubic model

• if $\beta_3 \neq 0$, then the curve can have up to two bends (although not necessarily over the range of observed x’s)
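
Where a cubic bends can be found by setting its slope, β1 + 2β2x + 3β3x², to zero and solving the resulting quadratic. The Python sketch below does this; the function name and the two example equations are hypothetical illustrations, not results from the lecture.

```python
import math


def cubic_turning_points(b1, b2, b3):
    """Real x values where the slope b1 + 2*b2*x + 3*b3*x**2 changes sign."""
    if b3 == 0:
        raise ValueError("b3 = 0: this is not a cubic")
    a, b, c = 3 * b3, 2 * b2, b1   # slope written as a*x**2 + b*x + c
    disc = b * b - 4 * a * c
    if disc <= 0:
        # No sign change in the slope: the curve never reverses direction.
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


# Hypothetical example 1: y = x^3 - 3x has a local max and a local min.
print(cubic_turning_points(b1=-3, b2=0, b3=1))  # [-1.0, 1.0]

# Hypothetical example 2: y = x^3 + x rises everywhere, so no turning points.
print(cubic_turning_points(b1=1, b2=0, b3=1))   # []
```

A cubic therefore has either two turning points or none, and even when it has two, one or both may fall outside the observed range of x.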

Cubic Polynomials: Earnings and Age Example

. regress conrinc age agesq agecu if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  3,   721) =   21.36
       Model |  3.8020e+10     3  1.2673e+10           Prob > F      =  0.0000
    Residual |  4.2775e+11   721   593278929           R-squared     =  0.0816
-------------+------------------------------           Adj R-squared =  0.0778
       Total |  4.6577e+11   724   643334846           Root MSE      =   24357

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   3971.837    8901.06     0.45   0.656    -13503.26    21446.93
       agesq |  -29.64795   230.0667    -0.13   0.897    -481.3286    422.0327
       agecu |  -.1739568   1.936886    -0.09   0.928    -3.976566    3.628653
       _cons |  -55354.68     112007    -0.49   0.621    -275253.4    164544.1
------------------------------------------------------------------------------

• Note: after age cubed is entered, none of the coefficients is statistically significant (even though age and age squared were significant in the quadratic model).

• So, since $\beta_{agecubed}$ is not statistically significant, revert to the quadratic model (DON’T conclude that age has no relationship with earnings!)

Cubic Polynomials: Actual Results

Inferences: F-tests Comparing models

Comparing Regression Models, Agresti & Finlay, p 409:

$$F = \frac{(R_c^2 - R_r^2) / (k - g)}{(1 - R_c^2) / [N - (k + 1)]}$$

$df_1 = k - g$; $df_2 = N - (k + 1)$

Where:
$R_c^2$ = R-square for complete model,
$R_r^2$ = R-square for reduced model,
k = number of explanatory variables in complete model,
g = number of explanatory variables in reduced model, and
N = number of cases.
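
An equivalent form may be easier to apply to Stata output. In the sketch below, SSE (the residual sum of squares, which Stata labels "Residual SS") and TSS (the total sum of squares) are notation assumed for this note and do not appear on the slide. Because both models are fitted to the same cases, they share the same TSS, and it cancels:

```latex
R^2 = 1 - \frac{SSE}{TSS}
\quad\Rightarrow\quad
R_c^2 - R_r^2 = \frac{SSE_r - SSE_c}{TSS},
\qquad
1 - R_c^2 = \frac{SSE_c}{TSS}

F = \frac{(R_c^2 - R_r^2) / (k - g)}{(1 - R_c^2) / [N - (k + 1)]}
  = \frac{(SSE_r - SSE_c) / (k - g)}{SSE_c / [N - (k + 1)]}
```

The denominator of the second form is the residual mean square (the "Residual MS") of the complete model.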

Example: F-tests Comparing models

• Complete model: men’s earnings on

o age,

o age squared,

o age cubed,

o education, and

o currently married dummy.

• Reduced model: men’s earnings on

o education and

o currently married dummy.

• The F-test comparing the two models tests whether the age variables, as a group, have a significant relationship with earnings after controlling for education and marital status.

Example: F-tests Comparing models

• Complete model: men’s earnings

. regress conrinc age agesq agecu educ married if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  5,   719) =   45.08
       Model |  1.1116e+11     5  2.2233e+10           Prob > F      =  0.0000
    Residual |  3.5461e+11   719   493199914           R-squared     =  0.2387
-------------+------------------------------           Adj R-squared =  0.2334
       Total |  4.6577e+11   724   643334846           Root MSE      =   22208

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   5627.049   8127.377     0.69   0.489    -10329.18    21583.27
       agesq |  -75.30909   210.0421    -0.36   0.720    -487.6781    337.0599
       agecu |   .1985975   1.768176     0.11   0.911    -3.272807    3.670003
        educ |   3555.331   317.9738    11.18   0.000     2931.063    4179.599
     married |   8664.627   1690.098     5.13   0.000      5346.51    11982.74
       _cons |  -127148.4   102508.3    -1.24   0.215    -328399.8    74103.01
------------------------------------------------------------------------------

• Note: none of the three age coefficients is, by itself, statistically significant.

• $R_c^2$ = .2387; k = 5.

Example: F-tests Comparing models

• Reduced model: men’s earnings

. regress conrinc educ married if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  2,   722) =   80.20
       Model |  8.4666e+10     2  4.2333e+10           Prob > F      =  0.0000
    Residual |  3.8111e+11   722   527850916           R-squared     =  0.1818
-------------+------------------------------           Adj R-squared =  0.1795
       Total |  4.6577e+11   724   643334846           Root MSE      =   22975

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   3650.611   328.1065    11.13   0.000     3006.454    4294.767
     married |   10721.42   1716.517     6.25   0.000     7351.457    14091.38
       _cons |   -16381.3   4796.807    -3.42   0.001    -25798.65   -6963.944
------------------------------------------------------------------------------

• $R_r^2$ = .1818; g = 2.

Inferences: F-tests Comparing models

$$F = \frac{(R_c^2 - R_r^2) / (k - g)}{(1 - R_c^2) / [N - (k + 1)]} = \frac{(0.2387 - 0.1818) / (5 - 2)}{(1 - 0.2387) / (725 - 6)} = \frac{0.0569 / 3}{0.7613 / 719} = 17.91$$

$df_1 = k - g = 5 - 2 = 3$; $df_2 = N - (k + 1) = 725 - 6 = 719$

F = 17.91, df = (3, 719), p < .001 (Agresti & Finlay, table D, page 673)
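
The arithmetic can be verified with a short Python sketch (the function `nested_f` is a hypothetical helper, not a Stata command). It applies the R-square formula with the numerator divided by k - g = 3, then cross-checks the result against the residual sums of squares printed in the two regression outputs.

```python
def nested_f(r2_complete, r2_reduced, k, g, n):
    """F statistic comparing a complete model (k predictors) with a nested
    reduced model (g predictors), both fitted to the same n cases."""
    df1 = k - g
    df2 = n - (k + 1)
    f = ((r2_complete - r2_reduced) / df1) / ((1 - r2_complete) / df2)
    return f, df1, df2


# R-squared values and model sizes from the complete and reduced regressions
f, df1, df2 = nested_f(0.2387, 0.1818, k=5, g=2, n=725)
print(round(f, 2), df1, df2)  # 17.91 3 719

# Cross-check using the Residual SS reported by Stata for each model
sse_complete = 3.5461e11
sse_reduced = 3.8111e11
f_ss = ((sse_reduced - sse_complete) / df1) / (sse_complete / df2)
print(round(f_ss, 2))
```

Both routes give F of about 17.9 on (3, 719) degrees of freedom, so the three age terms are jointly significant even though none is individually significant in the complete model.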

Next: Regression with Interaction Effects

Examples with earnings:
• married x gender
• age x gender
• age x education
• marital status x gender

