Page 1:

Unit 8: Categorical predictors I: Dichotomies (Class 19)

http://xkcd.com/74/
http://xkcd.com/210/

© Andrew Ho, Harvard Graduate School of Education
Unit 8 / Page 1

Page 2:

Where is Unit 8 in our 11-Unit Sequence?

Building a solid foundation:
• Unit 1: Introduction to simple linear regression
• Unit 2: Correlation and causality
• Unit 3: Inference for the regression model

Mastering the subtleties:
• Unit 4: Regression assumptions: Evaluating their tenability
• Unit 5: Transformations to achieve linearity

Adding additional predictors:
• Unit 6: The basics of multiple regression
• Unit 7: Statistical control in depth: Correlation and collinearity

Generalizing to other types of predictors and effects:
• Unit 8: Categorical predictors I: Dichotomies
• Unit 9: Categorical predictors II: Polychotomies
• Unit 10: Interaction and quadratic effects

Pulling it all together:
• Unit 11: Regression in practice. Common extensions.

Page 3:

In this unit, we’re going to cover…

• The dichotomous (dummy) variable as a predictor like any other
• Naming conventions: the variable name vs. the reference category
• Equivalence to the two-sample t-test
• The dummy variable in a multiple regression
• Graphic displays of regression findings: How do you decide which effects to highlight?
• Adjusting means: A simple way of presenting findings for categorical question predictors
• Displaying and interpreting conditional regression lines

Page 4:

Categorical variables

Y = β0 + β1X1 + β2X2 + … + βkXk + ε

Assumptions focus on Y or, more precisely, on ε: i.i.d. normal with mean 0

No assumptions about the distributions of Xs, except that they are free of measurement error…

Nominal variables (unordered values): Sex, Religion, Political Party

Ordinal variables (ordered values): Educational level, English learner status (?), Test scores (?)

Another important distinction: Dichotomies (only 2 categories) vs. Polychotomies (>2 categories)

Dummy (or indicator) variables: 0/1 variables whose sole purpose is to identify yes/no membership in a particular category

FEMALE = 1 if female, 0 if male
TREAT = 1 if treated, 0 if control

By convention, the variable name corresponds to the category given the value 1.
By convention, the category given the value 0 is called the reference category.

Categorical variables are those whose values denote categories

In multiple regression, nominal variables can only enter as dummy variables (or as many dummies, as we’ll see next unit).
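As a quick illustration outside the Stata workflow, here is a minimal Python sketch with hypothetical data of how a nominal variable becomes a 0/1 dummy named for the category coded 1:

```python
# Minimal sketch (Python, hypothetical data): coding a dummy variable.
# By convention the variable is named for the category coded 1 (FEMALE),
# and the category coded 0 ("male") is the reference category.
sexes = ["male", "female", "female", "male", "female"]
female = [1 if s == "female" else 0 for s in sexes]
print(female)  # [0, 1, 1, 0, 1]
```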

Page 5:

Do mandatory seat belt laws reduce fatalities?


Source: Calkins, LN & Zlatoper, TJ (2001). The effects of mandatory seat belt laws on motor vehicle fatalities in the United States, Social Science Quarterly, 82(4), 716-732

State-level data from all 50 states in 1997:
occfatal – number of occupant fatalities in 1997
beltlaw – whether the state has a mandatory seatbelt law
miles – total vehicle miles driven in the state (in millions)
(Note: currently, only NH has no mandatory seatbelt law.)

. list state occfatal beltlaw miles, clean

       state   occfatal   beltlaw    miles
  1.      AL        777         0    53458
  2.      AK         47         0     4387
  3.      AZ        458         0    43491
  4.      AR        427         0    28144
  5.      CA       1817         1   285612
  6.      CO        375         0    37746
  7.      CT        199         1    28552
  8.      DE         84         0     8007
  9.      FL       1478         0   134007
 10.      GA        973         1    93317

Outcome: Number of occupant fatalities in 1997.
Question predictor: Mandatory seatbelt law (1 for states with the law, 0 otherwise).
Covariate: Total miles driven.

Hypothesis 1: Seat belt laws save lives because seat belts save lives

Hypothesis 2: The Offset Hypothesis: Seat belts encourage riskier driving behavior that may offset any benefit associated with increased seat belt use

occfatal = β0 + β1·beltlaw + β2·miles + ε   (Y = β0 + β1X1 + β2X2 + ε)

Page 6:

Standard univariate and bivariate exploratory descriptives

[Figure: scatterplot matrices of Number of occupant fatalities in 1997, Mandatory seatbelt law, and Total vehicle miles driven in the state — untransformed and with log(fatalities) and log(miles).]

This is one of those cases where both outcome and predictor suggest that log transformations will aid model fit substantially.

Coefficients must be interpreted on the log scale for Y (exponentiate a coefficient β: (e^β − 1) × 100% gives the estimated percent increase or decrease).
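For instance, a hand check in Python rather than Stata; the 0.566 value anticipates the beltlaw coefficient from the simple log-scale model later in this unit:

```python
import math

# Hand check: a log-outcome coefficient b implies an estimated
# (e^b - 1) * 100 percent difference in Y per one-unit difference in X.
b = 0.5657  # beltlaw coefficient from the logocc model later in the deck
pct = (math.exp(b) - 1) * 100
print(round(pct, 1))  # ~76.1: law states predicted ~76% higher occfatal
```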

[Figure: histograms of the distributions of occfatal and miles, before and after log transformation.]

Page 7:

Mean differences: The old-fashioned way. Stata’s by options


Why not start simple? Did states with mandatory seat belt laws have fewer fatalities?

. bysort beltlaw: summarize occfatal

-> beltlaw = 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
    occfatal |        36    416.0833    337.0085         44       1478

-> beltlaw = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
    occfatal |        14    688.8571    583.8415         83       2012

. bysort beltlaw: summarize logocc

-> beltlaw = 0
    Variable |       Obs        Mean    Std. Dev.       Min        Max
      logocc |        36    5.642575    .9755272    3.78419   7.298445

-> beltlaw = 1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
      logocc |        14    6.208296    .8718579    4.41884   7.606884
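The same by-group means can be sketched in a few lines of Python, using the ten states listed earlier as a toy subset (so these means differ from the full 50-state output above):

```python
from statistics import mean

# Toy re-creation of "bysort beltlaw: summarize occfatal" on the ten
# states listed earlier (AL through GA); full-sample means will differ.
occfatal = [777, 47, 458, 427, 1817, 375, 199, 84, 1478, 973]
beltlaw = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1]
group_means = {g: mean(y for y, b in zip(occfatal, beltlaw) if b == g)
               for g in (0, 1)}
print(group_means)
```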

[Figure: bar chart of mean fatalities, No law vs. Mandatory seat belts.]

. graph bar (mean) occfatal, over(beltlaw, relabel(1 "No law" 2 "Mandatory seat belts")) ytitle(Mean fatalities in 2007)

[Figure: bar chart of mean log(fatalities), No law vs. Mandatory seat belts.]

. graph bar (mean) logocc, over(beltlaw, relabel(1 "No law" 2 "Mandatory seat belts")) ytitle(Mean log(fatalities) in 2007)

Is this surprising?

Page 8:

t-tests for significant mean differences: The old-fashioned way


. ttest occfatal, by(beltlaw)

Two-sample t test with equal variances

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
       0 |      36    416.0833    56.16808    337.0085    302.0561    530.1106
       1 |      14    688.8571    156.0382    583.8415    351.7571    1025.957
combined |      50      492.46    61.13366    432.2802    369.6073    615.3127
    diff |           -272.7738     131.812                -537.7997   -7.74794

    diff = mean(0) - mean(1)                              t = -2.0694
    Ho: diff = 0                         degrees of freedom = 48

    Ha: diff < 0              Ha: diff != 0               Ha: diff > 0
 Pr(T < t) = 0.0220     Pr(|T| > |t|) = 0.0439       Pr(T > t) = 0.9780

. ttest logocc, by(beltlaw)

Two-sample t test with equal variances

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
       0 |      36    5.642575    .1625879    .9755272    5.312505    5.972646
       1 |      14    6.208296    .2330138    .8718579    5.704901    6.711692
combined |      50    5.800977    .1376414    .9732718    5.524377    6.077578
    diff |           -.5657208    .2987713               -1.166441    .0349992

    diff = mean(0) - mean(1)                              t = -1.8935
    Ho: diff = 0                         degrees of freedom = 48

    Ha: diff < 0              Ha: diff != 0               Ha: diff > 0
 Pr(T < t) = 0.0322     Pr(|T| > |t|) = 0.0643       Pr(T > t) = 0.9678

[Figure: bar charts of mean fatalities and mean log(fatalities), No law vs. Mandatory seat belts.]

• A review of two-sample t-tests:
  – The standard error of the difference: the estimated standard deviation of the distribution of mean differences under repeated sampling.
  – The test statistic: t = (X̄a − X̄b) / se(X̄a − X̄b)
  – The decision rule: if |t| > t_crit on df = na + nb − 2, reject H0: μa − μb = 0.
• Our conclusion: States with mandatory seat belt laws have higher average numbers of occupant fatalities than states without them (not significant on the log scale).
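These quantities can be recomputed directly from the group summaries in the Stata output above (a Python hand check, not part of the Stata workflow):

```python
import math

# Hand check of the equal-variance two-sample t-test for occfatal,
# using the group summaries from the Stata output above.
n0, m0, s0 = 36, 416.0833, 337.0085
n1, m1, s1 = 14, 688.8571, 583.8415
sp2 = ((n0 - 1) * s0**2 + (n1 - 1) * s1**2) / (n0 + n1 - 2)  # pooled variance
se = math.sqrt(sp2 * (1 / n0 + 1 / n1))                      # se of the mean difference
t = (m0 - m1) / se
print(round(t, 4))  # ≈ -2.0694, matching Stata, on df = 48
```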

Page 9:

What is a slope but a difference? t-tests in a regression framework


[Figure: bar charts of mean fatalities and mean log(fatalities), No law vs. Mandatory seat belts.]

The mean difference between states without and with mandatory seatbelt laws in a t-test and regression slope framework, respectively.

The regression line passes through the mean values when X = 0 and X = 1, as expected of an algorithm that minimizes the sum of squared residuals.

[Figure: scatterplots of Number of occupant fatalities in 1997 and log(Number of occupant fatalities in 1997) vs. Mandatory seatbelt law (0-No law, 1-Law), with fitted regression lines.]

occfatal = β0 + β1·beltlaw + ε   (Y = β0 + β1X1 + ε)

Page 10:

A significant slope *is* a significant mean difference


(Output of . ttest occfatal, by(beltlaw) and . ttest logocc, by(beltlaw) repeated from the previous page: t = -2.0694, p = 0.0439 on the raw scale; t = -1.8935, p = 0.0643 on the log scale.)

. regress occfatal beltlaw

      Source |       SS       df       MS              Number of obs =      50
       Model |  750007.956     1  750007.956           F(  1,    48) =    4.28
    Residual |  8406436.46    48  175134.093           Prob > F      =  0.0439
       Total |  9156444.42    49  186866.213           R-squared     =  0.0819
                                                       Adj R-squared =  0.0628
                                                       Root MSE      =  418.49

    occfatal |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     beltlaw |   272.7738     131.812     2.07   0.044     7.74794    537.7997
       _cons |   416.0833    69.74838     5.97   0.000    275.8448    556.3218

. regress logocc beltlaw

      Source |       SS       df       MS              Number of obs =      50
       Model |  3.22600386     1  3.22600386           F(  1,    48) =    3.59
    Residual |  43.1896388    48  .899784141           Prob > F      =  0.0643
       Total |  46.4156426    49  .947258013           R-squared     =  0.0695
                                                       Adj R-squared =  0.0501
                                                       Root MSE      =  .94857

      logocc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     beltlaw |   .5657208    .2987713     1.89   0.064   -.0349992    1.166441
       _cons |   5.642575    .1580949    35.69   0.000    5.324704    5.960447

• The p-values are identical. The degrees of freedom are the same. Recall F = t².
• A unit difference in X is associated with a difference of 272.77 fatalities… oh, right!
  – The estimated slope is the mean difference… (given all else in the model).
  – On the log scale, states with laws have 76% more occfatal.
• And the constant that we usually ignore is the predicted Y value when X = 0.
  – The constant is the mean of the reference category… It’s interpretable!
• What if we switch the 0/1 assignment from no-law/yes-law to yes-law/no-law?
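A small Python sketch makes the slope/mean-difference identity concrete (hypothetical mini-sample: the ten states listed earlier, not the full 50):

```python
from statistics import mean

# With a single 0/1 predictor, the OLS slope equals the difference in group
# means, and the intercept equals the reference-group (X=0) mean.
y = [777, 47, 458, 427, 1817, 375, 199, 84, 1478, 973]  # occfatal, 10 states
x = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1]                      # beltlaw
xbar, ybar = mean(x), mean(y)
slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
intercept = ybar - slope * xbar
m0 = mean(yi for xi, yi in zip(x, y) if xi == 0)
m1 = mean(yi for xi, yi in zip(x, y) if xi == 1)
print(abs(slope - (m1 - m0)) < 1e-6, abs(intercept - m0) < 1e-6)  # True True
```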

Page 11:

Effect of reversing the 0/1 labels

• Reversing the 0/1 labels will change the sign of the slope but not the magnitude.
• The constant will always be the mean of the reference (0) category.
• All significance tests and the full ANOVA table will be unaffected.
• In general, follow the convention of naming the variable by the 1-category, leaving 0 as a reference, to avoid confusion.

. regress logocc nolaw

      Source |       SS       df       MS              Number of obs =      50
       Model |  3.22600386     1  3.22600386           F(  1,    48) =    3.59
    Residual |  43.1896388    48  .899784141           Prob > F      =  0.0643
       Total |  46.4156426    49  .947258013           R-squared     =  0.0695
                                                       Adj R-squared =  0.0501
                                                       Root MSE      =  .94857

      logocc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
       nolaw |  -.5657208    .2987713    -1.89   0.064   -1.166441    .0349992
       _cons |   6.208296    .2535159    24.49   0.000    5.698569    6.718024

. regress logocc beltlaw

      Source |       SS       df       MS              Number of obs =      50
       Model |  3.22600386     1  3.22600386           F(  1,    48) =    3.59
    Residual |  43.1896388    48  .899784141           Prob > F      =  0.0643
       Total |  46.4156426    49  .947258013           R-squared     =  0.0695
                                                       Adj R-squared =  0.0501
                                                       Root MSE      =  .94857

      logocc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     beltlaw |   .5657208    .2987713     1.89   0.064   -.0349992    1.166441
       _cons |   5.642575    .1580949    35.69   0.000    5.324704    5.960447
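A quick arithmetic check (Python) of what the reversal does, using the two coefficient tables above:

```python
# Reversing the 0/1 coding flips the slope's sign, and the new constant is
# the old constant plus the old slope (i.e., the other category's mean).
beltlaw_slope, beltlaw_cons = 0.5657208, 5.642575  # reference: no-law states
nolaw_slope = -beltlaw_slope                       # -0.5657208, as in the nolaw model
nolaw_cons = beltlaw_cons + beltlaw_slope
print(round(nolaw_cons, 6))  # 6.208296, the _cons in the nolaw regression
```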

[Figure: scatterplot of log(Number of occupant fatalities in 1997) vs. No mandatory seatbelt law (0-Law, 1-No law), with fitted line.]

Page 12:

Predicting the log number of occupant fatalities in 1997

                Law Only      ANCOVA
BeltLaw            0.566     -0.0502
                  (1.89)     (-0.57)
LogMiles                    0.955***
                             (23.72)
_cons           5.643***   -4.123***
                 (35.69)     (-9.96)
N                     50          50
R-sq               0.070       0.928
adj. R-sq          0.050       0.925
F                  3.585       304.2
df_m                   1           2
df_r                  48          47

t statistics in parentheses; * p<0.05, ** p<0.01, *** p<0.001
BeltLaw is an indicator variable for a mandatory state seatbelt law.
LogMiles is the log of total vehicle miles driven in 2007, in millions.

ANCOVA: ANalysis of COVAriance

• ANCOVA is a general term that can encompass both Analysis Of VAriance (ANOVA) and regression; however, it tends to refer specifically to investigation of a (usually dichotomous) question predictor with one or more (usually continuous) covariates.

• ANCOVA can be readily implemented within our familiar regression framework.

log(occ) = β0 + β1·beltlaw + β2·logmiles + ε   (Y = β0 + β1X1 + β2X2 + ε)

. regress logocc beltlaw logmiles

      Source |       SS       df       MS              Number of obs =      50
       Model |  43.0871528     2  21.5435764           F(  2,    47) =  304.21
    Residual |  3.32848985    47  .070818933           Prob > F      =  0.0000
       Total |  46.4156426    49  .947258013           R-squared     =  0.9283
                                                       Adj R-squared =  0.9252
                                                       Root MSE      =  .26612

      logocc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     beltlaw |  -.0501633    .0877473    -0.57   0.570   -.2266881    .1263615
    logmiles |    .955139    .0402593    23.72   0.000    .8741477     1.03613
       _cons |  -4.122666    .4139898    -9.96   0.000   -4.955506   -3.289826

States with mandatory seat belt laws have more occupant fatalities; however, when accounting for the number of miles driven, laws are negatively associated with fatalities (about 5% fewer). The differences are not statistically significant.
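The "about 5% fewer" figure follows from the ANCOVA coefficient above (a Python hand check, using the percentage-change conversion from earlier):

```python
import math

# The law-vs-no-law gap on the log scale equals the beltlaw coefficient
# regardless of logmiles; converting gives the percent difference.
def predict_logocc(beltlaw, logmiles):
    # fitted ANCOVA equation from the Stata output above
    return -4.122666 - 0.0501633 * beltlaw + 0.955139 * logmiles

gap = predict_logocc(1, 10.4) - predict_logocc(0, 10.4)
print(round((math.exp(gap) - 1) * 100, 1))  # -4.9: about 5% fewer fatalities
```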

Page 13:

[Figure: histograms of Log(Number of occupant fatalities), by Mandatory seatbelt law (nolaw and beltlaw panels).]

Regression diagnostics and homoscedasticity

• Are residual plots still relevant?
• Absolutely.
• Homoscedasticity is an assumption both in regression and for t-tests (and for ANOVA/ANCOVA in general).

[Figure: residuals vs. fitted values.]

• Stata’s by option will continue to be useful for diagnostics, plotting, and reporting whenever there are dichotomous predictors.

. histogram logocc, by(beltlaw) freq

Page 14:

[Figure: scatterplot of Log(Number of occupant fatalities) vs. Log(Total vehicle miles driven in the state), states labeled by abbreviation, with conditional regression lines.]

Graphical display of results


beltlaw=1

beltlaw=0

Difference is not statistically significant.

logocc-hat = −4.123 − 0.05 beltlaw + 0.955 logmiles

Page 15:

[Figure: scatterplot of Log(Number of occupant fatalities) vs. Log(Total vehicle miles driven in the state), points labeled no/yes for beltlaw, with conditional regression lines.]

Visualizing the covariate’s reversal of the mean difference


Belt-law states have more occupant fatalities but also more total miles driven. When accounting for the total miles driven, states with belt laws have fewer fatalities, although the differences are not significant.

beltlaw=1
beltlaw=0

Page 16:

[Figure: scatterplots of Log(Number of occupant fatalities) against Log(Total vehicle miles driven in the state), Normal daily mean state temperature, Population density per square mile, and Percentage of urban miles to total miles.]

Incorporating other predictors


One nuisance arising from log transformations of outcome variables is that additional predictors frequently need transformation as well.

Page 17:

[Figure: scatterplots of Log(Number of occupant fatalities) against Log(Total vehicle miles driven in the state), Log(Daily mean state temperature), Log(Population density), and Log(Percent urban miles).]

Incorporating other predictors


This is rough. Ideally we would use rvfplots, not scatterplots. And ideally we would also use rvfplots from multiple regression models that we’re actually fitting, not just these. Density seems better, but temperature and urban miles are borderline. Try it both ways?

Page 18:

Predicting the log number of occupant fatalities in 1997

Model:            A         AB        ABC       ABCDE       ABCFG       ABCEF
Law (A)       0.566    -0.0502    -0.0332     -0.0968      -0.100     -0.101*
             (1.89)    (-0.57)    (-0.43)     (-1.93)     (-1.95)     (-2.05)
LMile (B)             0.955***   1.034***    1.022***    1.024***    1.017***
                       (23.72)    (25.15)     (37.80)     (36.06)     (38.14)
LDens (C)                       -0.109***   -0.0590**   -0.0746**   -0.0635**
                                  (-3.81)     (-2.71)     (-3.43)     (-2.97)
Temp (D)                                    0.0190***
                                               (6.80)
%Urb (E)                                  -0.00902***              -0.00879***
                                              (-5.33)                 (-5.31)
LTemp (F)                                                1.128***    1.102***
                                                           (6.91)      (7.07)
L%Urb (G)                                               -0.387***
                                                          (-4.67)
_cons      5.643***  -4.123***  -4.485***   -5.115***   -7.492***   -8.402***
            (35.69)    (-9.96)   (-11.90)    (-20.46)    (-11.78)    (-14.31)
N                50         50         50          50          50          50
R-sq          0.070      0.928      0.946       0.979       0.978       0.980
adj. R-sq     0.050      0.925      0.942       0.977       0.976       0.978
F             3.585      304.2      266.1       418.7       397.4       437.0
df_m              1          2          3           5           5           5
df_r             48         47         46          44          44          44

t statistics in parentheses; * p<0.05, ** p<0.01, *** p<0.001
Predictor A is an indicator variable for a mandatory state seatbelt law.
Predictor B is the log of total vehicle miles driven in 2007, in millions.
Predictor C is the log of the population density per square mile.
Predictor D is the mean state daily temperature.
Predictor E is the percentage of urban miles driven in the state.
Predictors F and G are log transformations of Predictors D and E, respectively.

Building a model for occupant fatalities


Remember: When interpreting coefficients in a model with a log outcome variable, small coefficients (magnitudes less than about 0.3) are approximately equal to a proportional increase or decrease, e.g., −0.1 predicts an approximate 10% decline in fatalities, accounting for all else in the model.
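The approximation is easy to check in Python (illustrative values only):

```python
import math

# Checking exp(b) ≈ 1 + b: good for small |b|, drifting by b ≈ 0.3.
for b in (-0.1, 0.1, 0.3):
    exact = (math.exp(b) - 1) * 100   # exact percent change
    approx = b * 100                  # the "read the coefficient" shortcut
    print(b, round(exact, 1), approx)
```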

What do we think of these findings?

Page 19:

Conditional regression lines, revisited


. regress logocc beltlaw logmiles logden temp pcturban

      Source |       SS       df       MS              Number of obs =      50
       Model |  45.4600978     5  9.09201956           F(  5,    44) =  418.66
    Residual |  .955544825    44  .021716928           Prob > F      =  0.0000
       Total |  46.4156426    49  .947258013           R-squared     =  0.9794
                                                       Adj R-squared =  0.9771
                                                       Root MSE      =  .14737

      logocc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
     beltlaw |  -.0967979    .0500726    -1.93   0.060   -.1977127    .0041168
    logmiles |   1.022367    .0270467    37.80   0.000    .9678582    1.076876
      logden |  -.0590361    .0217956    -2.71   0.010   -.1029623   -.0151099
        temp |   .0189964    .0027955     6.80   0.000    .0133624    .0246304
    pcturban |  -.0090171    .0016909    -5.33   0.000   -.0124248   -.0056093
       _cons |  -5.114937    .2499949   -20.46   0.000   -5.618768   -4.611105

logocc-hat = −5.115 − 0.0968 beltlaw + 1.022 logmiles − 0.059 logden + 0.019 temp − 0.009 pcturban

. summarize beltlaw logmiles logden temp pcturban

    Variable |       Obs        Mean    Std. Dev.        Min        Max
     beltlaw |        50         .28    .4535574          0          1
    logmiles |        50    10.40444    .9885524   8.386401   12.56239
      logden |        50    4.288666    1.385434  -.0101779   6.887838
        temp |        50      54.256    8.459535       40.6       77.2
    pcturban |        50    52.30357    17.40729   22.33816   85.53246

We have already looked at the relationship with logmiles, which is fairly obvious. Let’s look more closely at pcturban by placing it on the x-axis with beltlaw on the legend. In order to plot conditional regression lines, we fix other predictor values (logmiles, logden, temp) at their means.

logocc-hat = −5.115 − 0.0968 beltlaw + 1.022×10.404 − 0.059×4.289 + 0.019×54.26 − 0.009 pcturban
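Evaluated in Python as a hand check of the conditioned equation, with only beltlaw and pcturban left free (the fixed values are the covariate means from the summarize output above):

```python
# Conditional prediction with logmiles, logden, and temp fixed at their means.
def logocc_hat(beltlaw, pcturban):
    return (-5.115 - 0.0968 * beltlaw + 1.022 * 10.404
            - 0.059 * 4.289 + 0.019 * 54.26 - 0.009 * pcturban)

# At pcturban = 50: the no-law and law lines differ only by the intercept shift.
print(round(logocc_hat(0, 50), 3), round(logocc_hat(1, 50), 3))  # 5.846 5.749
```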

Page 20:

[Figure: conditional regression lines for Log(Number of occupant fatalities) vs. Percentage of urban miles to total miles.]

Conditional regression lines with multiple covariates


With multiple covariates, we generally do not want to include the bivariate scatterplot, as the conditional regression lines may not seem to fit and will be distracting. The fit looks poor because the pcturban–logocc relationship is positive unconditionally but negative when accounting for other variables: the scatterplot shows the former, the lines show the latter. Removing the scatterplot allows us room to display more relationships. Let’s include temperature.

No law

Law

Difference is not statistically significant.

Page 21:

[Figure: dot plot of Normal daily mean state temperature, states labeled by abbreviation.]

Select prototypical temperature values


. dotplot temp, mlabel(state)

We were already showing the relationship between logocc, pcturban, and beltlaw, while attempting to hold logmiles, logden, and temp constant at their average values. Let’s try to visualize the “effect” of temp, also, by picking two prototypical values.

A warmer, southern state (65)

A cooler, northern state (50)

Page 22:

[Figure: conditional regression lines for Log(Number of occupant fatalities) vs. Percentage of urban miles to total miles, law/no-law by warm/cool state.]

Five predictors: one on the axis, two on the legend, two fixed at averages.


logocc-hat = −5.115 − 0.0968 beltlaw + 1.022×10.404 − 0.059×4.289 + 0.019 temp − 0.009 pcturban

Warm state (65) vs. cool state (50) difference: 33%

No-law vs. law: 10% (not significant)

. di exp(.019*15)
1.329762

Conditional regression lines assuming average log(total miles driven) and average log(population density).

No law, warm state
Law, warm state
No law, cool state
Law, cool state

Note the “Main Effects” assumptions here: All effects are the same across every level and predictor.
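The two percentage gaps quoted above come straight from the coefficients (a Python check; the 1.329762 matches Stata's di exp(.019*15)):

```python
import math

# Warm (65) vs. cool (50) state: a 15-degree gap times the temp coefficient.
warm_vs_cool = (math.exp(0.019 * 15) - 1) * 100   # ≈ 33%
# No-law vs. law: the (sign-reversed) beltlaw coefficient.
nolaw_vs_law = (math.exp(0.0968) - 1) * 100       # ≈ 10%, not significant
print(round(warm_vs_cool), round(nolaw_vs_law))   # 33 10
```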

Page 23:

Adjusted mean differences

(The model-building table from Page 18 — Predicting the log number of occupant fatalities in 1997, Models A through ABCEF — is repeated here.)

The beltlaw coefficient is the mean difference, given everything else in the model. This row thus shows the adjusted mean differences. Make sure that your dichotomous variable is scaled 0/1 or 1/2, or is treated as an indicator.

Page 24:

[Figure: scatterplot of Log(Number of occupant fatalities) vs. Log(Total vehicle miles driven in the state), points labeled no/yes for beltlaw, with adjusted conditional regression lines.]

Visualizing the adjusted mean difference for Model AB


beltlaw=1
beltlaw=0

Difference is not statistically significant.

Page 25:

[Figure: conditional regression lines for Log(Number of occupant fatalities) vs. Percentage of urban miles to total miles, law/no-law by warm/cool state.]

Visualizing the adjusted mean difference for Model ABCDE


No-law vs. law: 10% (not significant)

No law, warm state
Law, warm state
No law, cool state
Law, cool state

Conditional regression lines assuming average log(total miles driven) and average log(population density).

Page 26:

Best practices for conditional regression lines

1) Estimate the regression coefficients (regress)
2) Calculate the means of the predictors (summarize)
3) Write out the (model and) prediction equation

4) Decide what to condition on, and pick prototypical values for those covariates (means are a good default choice for a single value, but, whatever you choose, you must make the prototypical value clear and relevant).

5) When plotting and adjusting, consider using _b[_cons] and _b[_coefname] to increase precision and minimize copy/paste error.

6) Be explicit about which variables you are conditioning on and adjusting for.

7) Be explicit about whether apparent differences are statistically significant.


Model: logocc = β0 + β1 beltlaw + β2 logmiles + β3 logden + β4 temp + β5 pcturban + ε
Prediction: logocc-hat = −5.115 − 0.0968 beltlaw + 1.022 logmiles − 0.059 logden + 0.019 temp − 0.009 pcturban

Page 27:

• Regression models can include dichotomous predictors like any others.
  – Variable names refer to the 1 category; 0 is the reference category.
  – Switching the category definitions changes only the sign of the slope and the value of the constant.
  – Coefficients are estimated mean differences adjusting for all other covariates in the model.
  – The simple linear regression on the dichotomous predictor is equivalent to the t-test.
• Inclusion of other covariates in the regression model (ANCOVA) can change coefficients, just as before.
  – Investigation of sensitivity under plausible model specifications is necessary, just as before.
• Results of complex analyses can be displayed more simply using tables and graphs.
  – As your models become more complex, the need for simple numerical and graphical displays remains.
  – Consider how you will communicate your results to colleagues and broader audiences.
  – Adjusted mean differences and prototypical regression lines are powerful tools.
  – But be clear about what variables and which levels of these variables (prototypical values) you are conditioning on.
  – And be clear about whether apparent differences are statistically significant.
• General tips:
  – Small coefficients (magnitudes less than 0.3 or so) can be interpreted as a predicted percent change for a log(outcome) variable. This is because exp(β) ≈ 1 + β.
  – Use the _b[coefname] stored coefficients after running a regression model in your graph twoway function code to estimate conditional regression lines.

What are the takeaways from this unit?


Page 28:

Glossary of terms


• Adjusted mean differences
• Analysis of Covariance (ANCOVA)
• Categorical variable (nominal and ordinal)
• Conditional regression line
• Dichotomous variable
• Dummy variable
• Main effects assumption
• Two-sample t-test

