Multinomial Logistic Regression
“Inanimate objects can be classified scientifically into three major categories; those that don't work, those that break down and those that get lost” (Russell Baker)
Multinomial Logistic Regression
Also known as “polytomous” or “nominal logistic” or “logit regression” or the “discrete choice model”
Generalization of binary logistic regression to a polytomous DV
When applied to a dichotomous DV, identical to binary logistic regression
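A minimal sketch of why the two models coincide for a dichotomous DV: with the reference category's logit fixed at 0, the multinomial probability formula with a single non-reference category reduces to the binary logistic formula. The logit value 0.4 and the function names are hypothetical, chosen only for illustration.

```python
import math

def multinomial_probs(logits):
    """Category probabilities when the reference category's logit is fixed at 0."""
    exps = [1.0] + [math.exp(l) for l in logits]  # reference category comes first
    total = sum(exps)
    return [e / total for e in exps]

def binary_prob(logit):
    """Usual binary logistic probability of the category coded 1."""
    return 1.0 / (1.0 + math.exp(-logit))

logit = 0.4  # hypothetical value of a + B*X
p_ref, p_one = multinomial_probs([logit])
print(p_one, binary_prob(logit))  # the two values agree
```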
Polytomous Variables
Three or more unordered categories
Categories mutually exclusive and exhaustive
Sometimes called “multicategorical” or “multinomial” variables
Polytomous DVs
Reason for leaving welfare: marriage, stable employment, move to another state, incarceration, or death
Status of foster home application: licensed to foster, discontinued application process prior to licensure, or rejected for licensure
Changes in living arrangements of the elderly: newly co-residing with their children, no longer co-residing, or residing in institutions
Single (Dichotomous) IV Example
DV = interview tracking effort
easy-to-interview and track mothers (Easy);
difficult-to-track mothers who required more telephone calls (MoreCalls);
difficult-to-track mothers who required more unscheduled home visits (MoreVisits)
IV = race, 0 = European-American, 1 = African-American
N = 246 mothers
What is the relationship between race and interview tracking effort?
Crosstabulation
Table 3.1
Relationship between race and tracking effort is statistically significant [χ²(2, N = 246) = 8.69, p = .013]
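The chi-square statistic can be checked by hand. The sketch below assumes the Table 3.1 cell counts are the ones that appear in the probability fractions elsewhere in this lecture (96, 30, 17 for European-Americans and 53, 24, 26 for African-Americans). The closed-form p-value used here holds only for 2 degrees of freedom.

```python
import math

# Assumed Table 3.1 counts: rows = race, columns = (Easy, MoreCalls, MoreVisits)
observed = [
    [96, 30, 17],  # European-American
    [53, 24, 26],  # African-American
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
# For df = 2 only, the chi-square upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi_square / 2)
print(n, df, round(chi_square, 2), round(p_value, 3))
```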
Reference Category
In binary logistic regression, the category of the DV coded 0 implicitly serves as the reference category
Known as “baseline,” “base,” or “comparison” category
In multinomial logistic regression it is necessary to explicitly select the reference category
• “Easy” selected
Probabilities
Table 3.1
More Calls (vs. Easy)
European-American: .24 = 30 / (30 + 96)
African-American: .31 = 24 / (24 + 53)
More Visits (vs. Easy)
European-American: .15 = 17 / (17 + 96)
African-American: .33 = 26 / (26 + 53)
Odds & Odds Ratio
More Calls (vs. Easy)
European-American: .3125 (.2098 / .6713)
African-American: .4528 (.2330 / .5146)
Odds Ratio = 1.45 (.4528 / .3125)
• 45% increase in the odds
More Visits (vs. Easy)
European-American: .1771 (.1189 / .6713)
African-American: .4905 (.2524 / .5146)
Odds Ratio = 2.77 (.4905 / .1771)
• 177% increase in the odds
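The odds and odds ratios above can be reproduced directly from cell counts, since the row totals cancel when one proportion is divided by another. This is a sketch that assumes the counts implied by the probability fractions for Table 3.1. The dictionary layout and function names are illustrative, not from the source.

```python
counts = {
    "European-American": {"Easy": 96, "MoreCalls": 30, "MoreVisits": 17},
    "African-American": {"Easy": 53, "MoreCalls": 24, "MoreVisits": 26},
}

def odds_vs_reference(group, category, reference="Easy"):
    """Odds of `category` relative to the reference category within one group."""
    return counts[group][category] / counts[group][reference]

def odds_ratio(category):
    """African-American odds divided by European-American odds."""
    return (odds_vs_reference("African-American", category)
            / odds_vs_reference("European-American", category))

or_calls = odds_ratio("MoreCalls")
or_visits = odds_ratio("MoreVisits")
for name, value in [("MoreCalls", or_calls), ("MoreVisits", or_visits)]:
    print(f"{name}: OR = {value:.2f}, change in odds = {100 * (value - 1):.0f}%")
```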
Question & Answer
What is the relationship between race and interview tracking effort?
The odds of requiring more calls, compared to being easy-to-track, are higher for African-Americans by a factor of 1.45 (45%). The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).
Multinomial Logistic Regression
Set of binary logistic regression models estimated simultaneously
Number of non-redundant binary logistic regression equations equals the number of categories of the DV minus one
Statistical Significance
Table 3.2
H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0
• Reject
Table 3.3
H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0
• Reject
Table 3.4
H0: β(Race, More Calls vs. Easy) = 0
• Don’t reject
H0: β(Race, More Visits vs. Easy) = 0
• Reject
Odds Ratios
OR(More Calls vs. Easy) = 1.45
The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans.
OR(More Visits vs. Easy) = 2.77
The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).
Estimated Logits (L)
Table 3.4
L(More Calls vs. Easy) = a + B_Race X_Race
L(More Calls vs. Easy) = -1.163 + (.371)(X_Race)
L(More Visits vs. Easy) = a + B_Race X_Race
L(More Visits vs. Easy) = -1.731 + (1.019)(X_Race)
Logits to Odds
African-Americans (X = 1)
L(More Calls vs. Easy) = -.792 = -1.163 + (.371)(1)
Odds = e^{-.792} = .45
L(More Visits vs. Easy) = -.712 = -1.731 + (1.019)(1)
Odds = e^{-.712} = .49
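The conversion from logits to odds, and the link between a slope and its odds ratio, can be sketched as follows. The intercepts and slopes are the rounded Table 3.4 estimates quoted above. The function names are illustrative.

```python
import math

# Rounded estimates as reported for Table 3.4: (intercept a, slope B for race)
coefficients = {
    "MoreCalls": (-1.163, 0.371),
    "MoreVisits": (-1.731, 1.019),
}

def logit(category, x_race):
    """Estimated logit of `category` vs. Easy for a given race code (0 or 1)."""
    a, b = coefficients[category]
    return a + b * x_race

def odds(category, x_race):
    """Exponentiate the logit to obtain the odds vs. Easy."""
    return math.exp(logit(category, x_race))

for category in coefficients:
    ratio = odds(category, 1) / odds(category, 0)  # equals exp(B)
    print(category, round(logit(category, 1), 3),
          round(odds(category, 1), 2), round(ratio, 2))
```

Dividing the odds at X = 1 by the odds at X = 0 cancels the intercept, which is why exp(B) is the odds ratio.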
Logits to Probabilities
African-Americans, L(More Calls vs. Easy) = -.792
p̂(More Calls vs. Easy) = e^{-.792} / (1 + e^{-.792}) = .31
African-Americans, L(More Visits vs. Easy) = -.712
p̂(More Visits vs. Easy) = e^{-.712} / (1 + e^{-.712}) = .33
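A sketch of two ways to turn these logits into probabilities. The first, e^L / (1 + e^L), gives the probability of an outcome versus Easy only, which appears to be what the "vs. Easy" values of .31 and .33 for Table 3.1 represent. The second divides by the sum over all three categories and gives each category's share of all African-American mothers. The second calculation is an addition for comparison, not something stated in the slides.

```python
import math

logits = {"MoreCalls": -0.792, "MoreVisits": -0.712}  # African-Americans, vs. Easy

# Two-category (conditional) probability: the outcome versus Easy only.
conditional = {k: math.exp(v) / (1 + math.exp(v)) for k, v in logits.items()}

# Three-category probabilities: Easy is the reference, so its logit is 0.
denominator = 1 + sum(math.exp(v) for v in logits.values())
overall = {k: math.exp(v) / denominator for k, v in logits.items()}
overall["Easy"] = 1 / denominator

print({k: round(v, 2) for k, v in conditional.items()})
print({k: round(v, 2) for k, v in overall.items()})
```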
Question & Answer
What is the relationship between race and interview tracking effort?
The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans.
The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).
Single (Quantitative) IV Example
DV = interview tracking effort
easy-to-interview and track mothers (Easy);
difficult-to-track mothers who required more telephone calls (MoreCalls);
difficult-to-track mothers who required more unscheduled home visits (MoreVisits)
IV = years of education
N = 246 mothers
What is the relationship between education and interview tracking effort?
Statistical Significance
Table 3.6
H0: β(Education, More Calls vs. Easy) = β(Education, More Visits vs. Easy) = 0
• Reject
Table 3.7
H0: β(Education, More Calls vs. Easy) = 0
• Don’t reject
H0: β(Education, More Visits vs. Easy) = 0
• Reject
Odds Ratios
OR(More Calls vs. Easy) = .88
The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education.
OR(More Visits vs. Easy) = .76
For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .76 (i.e., -24.1%).
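For a quantitative IV the odds ratio applies per one-unit increase, so the percentage change is 100 × (OR − 1), and a k-unit difference multiplies the odds by OR^k. The sketch below assumes the education slope of -.276 reported with Table 3.7 for More Visits vs. Easy. The four-year comparison is an illustrative choice.

```python
import math

b_visits = -0.276  # education slope, More Visits vs. Easy (Table 3.7, rounded)

or_one_year = math.exp(b_visits)
percent_change = 100 * (or_one_year - 1)

# A k-year difference multiplies the odds by OR**k, not by k * OR.
years = 4
or_four_years = or_one_year ** years

print(round(or_one_year, 2), round(percent_change, 1), round(or_four_years, 2))
```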
Figures
Education.xls
Estimated Logits (L)
Table 3.7
X = 12 (high school education)
L(More Calls vs. Easy) = -.977 = .583 + (-.130)(12)
L(More Visits vs. Easy) = -1.235 = 2.077 + (-.276)(12)
[Figure: Effect of Education on Tracking Effort (Logits); x-axis: Years of Education; y-axis: Logits]
Years of Education: 8 9 10 11 12 13 14 15 16 17
More Calls: -0.46 -0.58 -0.71 -0.84 -0.97 -1.10 -1.23 -1.36 -1.49 -1.62
More Visits: -0.13 -0.41 -0.69 -0.96 -1.24 -1.51 -1.79 -2.07 -2.34 -2.62
Logits to Odds
X = 12 (high school education)
Odds(More Calls vs. Easy) = e^{-.977} = .38
Odds(More Visits vs. Easy) = e^{-1.235} = .29
[Figure: Effect of Education on Tracking Effort (Odds); x-axis: Years of Education; y-axis: Odds]
Years of Education: 8 9 10 11 12 13 14 15 16 17
More Calls: 0.63 0.56 0.49 0.43 0.38 0.33 0.29 0.26 0.22 0.20
More Visits: 0.88 0.66 0.50 0.38 0.29 0.22 0.17 0.13 0.10 0.07
Logits to Probabilities
X = 12 (high school education)
p̂(More Calls vs. Easy) = e^{-.977} / (1 + e^{-.977}) = .273
p̂(More Visits vs. Easy) = e^{-1.235} / (1 + e^{-1.235}) = .225
[Figure: Effect of Education on Tracking Effort (Probabilities); x-axis: Years of Education; y-axis: Probabilities]
Years of Education: 8 9 10 11 12 13 14 15 16 17
More Calls: 0.39 0.36 0.33 0.30 0.27 0.25 0.23 0.20 0.18 0.16
More Visits: 0.47 0.40 0.34 0.28 0.22 0.18 0.14 0.11 0.09 0.07
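The plotted series can be approximated from the rounded Table 3.7 estimates. This is a sketch: the figure values were presumably computed from unrounded coefficients, so results may differ in the second decimal. The probabilities use odds / (1 + odds), the two-category "vs. Easy" form, which appears to match the plotted values (for example .38 / 1.38 ≈ .27 at 12 years).

```python
import math

# Rounded estimates reported for Table 3.7: (intercept, slope for years of education)
coefficients = {"MoreCalls": (0.583, -0.130), "MoreVisits": (2.077, -0.276)}

def predict(category, years):
    """Return (logit, odds, probability vs. Easy) at a given education level."""
    a, b = coefficients[category]
    logit = a + b * years
    odds = math.exp(logit)
    return logit, odds, odds / (1 + odds)

# One row per year: logit, odds, probability for MoreCalls, then for MoreVisits
for years in range(8, 18):
    row = [f"{value:6.2f}" for cat in coefficients for value in predict(cat, years)]
    print(years, *row)
```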
Question & Answer
What is the relationship between education and interview tracking effort?
The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .76 (i.e., -24.1%).
Multiple IV Example
DV = interview tracking effort
easy-to-interview and track mothers (Easy);
difficult-to-track mothers who required more telephone calls (MoreCalls);
difficult-to-track mothers who required more unscheduled home visits (MoreVisits)
IV = race, 0 = European-American, 1 = African-American
IV = years of education
N = 246 mothers
Multiple IV Example (cont’d)
What is the relationship between race and interview tracking effort, when controlling for education?
Statistical Significance
Table 3.8
H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0
• Reject
Table 3.9
H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0
• Reject
H0: β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0
• Reject
Statistical Significance (cont’d)
Table 3.10
H0: β(Race, More Calls vs. Easy) = 0
• Don’t reject
H0: β(Race, More Visits vs. Easy) = 0
• Reject
H0: β(Ed, More Calls vs. Easy) = 0
• Don’t reject
H0: β(Ed, More Visits vs. Easy) = 0
• Reject
Odds Ratios: Race
OR(More Calls vs. Easy) = 1.36
The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans, when controlling for education.
OR(More Visits vs. Easy) = 2.48
The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%), when controlling for education.
Odds Ratios: Education
OR(More Calls vs. Easy) = .89
The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education, when controlling for race.
OR(More Visits vs. Easy) = .77
For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .77 (i.e., -23%), when controlling for race.
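In a main-effects model, adjusted odds ratios multiply. The sketch below combines the two reported More Visits odds ratios to compare two hypothetical profiles. It assumes no race-by-education interaction, and the profiles themselves are illustrative, not from the source.

```python
or_race_visits = 2.48  # African-American vs. European-American, education held constant
or_educ_visits = 0.77  # per additional year of education, race held constant

def combined_odds_ratio(race_difference, education_difference):
    """Odds ratio for a profile that differs on both IVs (main-effects model only)."""
    return (or_race_visits ** race_difference) * (or_educ_visits ** education_difference)

# African-American mother with 16 years vs. European-American mother with 12 years
result = combined_odds_ratio(race_difference=1, education_difference=4)
print(round(result, 2))
```

Here four extra years of education more than offset the race difference in the odds of needing more visits, under the stated assumptions.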
Figures
Race & Education.xls
[Figure: Effect of Education on Tracking Effort for African-Americans (Odds); x-axis: Years of Education; y-axis: Odds]
Years of Education: 8 9 10 11 12 13 14 15 16 17
More Calls: 0.73 0.65 0.58 0.51 0.45 0.40 0.36 0.32 0.28 0.25
More Visits: 1.30 1.01 0.78 0.60 0.46 0.36 0.28 0.21 0.17 0.13
[Figure: Effect of Education on Tracking Effort for African-Americans (Probabilities); x-axis: Years of Education; y-axis: Probabilities]
Years of Education: 8 9 10 11 12 13 14 15 16 17
More Calls: 0.42 0.39 0.37 0.34 0.31 0.29 0.26 0.24 0.22 0.20
More Visits: 0.57 0.50 0.44 0.38 0.32 0.26 0.22 0.18 0.14 0.11
Question & Answer
What is the relationship between race and interview tracking effort, when controlling for education?
The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans, when controlling for education. The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%), when controlling for education.
Assumptions Necessary for Testing Hypotheses
Assumptions discussed in GZLM lecture
Independence of irrelevant alternatives (IIA)
• Odds of one outcome (e.g., More Calls) relative to another (e.g., Easy) are not influenced by other alternatives (e.g., More Visits)
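A small sketch of what IIA means inside the model: under the multinomial logit form, the odds of More Calls relative to Easy are the same whether or not More Visits is among the alternatives. The logit values are hypothetical. This illustrates the property the model imposes. It is not a test of whether the assumption holds in data.

```python
import math

# Hypothetical logits relative to Easy (Easy itself is fixed at 0).
logits = {"Easy": 0.0, "MoreCalls": -0.8, "MoreVisits": -0.7}

def probabilities(available):
    """Multinomial-logit probabilities over the categories in `available`."""
    total = sum(math.exp(logits[c]) for c in available)
    return {c: math.exp(logits[c]) / total for c in available}

all_three = probabilities(["Easy", "MoreCalls", "MoreVisits"])
without_visits = probabilities(["Easy", "MoreCalls"])

odds_with = all_three["MoreCalls"] / all_three["Easy"]
odds_without = without_visits["MoreCalls"] / without_visits["Easy"]
print(odds_with, odds_without)  # the two odds agree
```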
Model Evaluation
Create a set of binary DVs from the polytomous DV
recode TrackCat (1=0) (2=1) (3=sysmis) into MoreCalls.
recode TrackCat (1=0) (2=sysmis) (3=1) into MoreVisits.
Run separate binary logistic regressions
Use binary logistic regression methods to detect outliers and influential observations
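For readers working outside SPSS, the same recoding can be sketched in plain Python. This assumes TrackCat is coded 1 = Easy, 2 = MoreCalls, 3 = MoreVisits, as the recode statements imply. None stands in for SPSS system-missing, and the sample values are hypothetical.

```python
def recode(value, mapping):
    """Return the recoded value; None plays the role of SPSS sysmis."""
    return mapping.get(value)

MORE_CALLS = {1: 0, 2: 1}   # 3 (MoreVisits) becomes missing
MORE_VISITS = {1: 0, 3: 1}  # 2 (MoreCalls) becomes missing

track_cat = [1, 2, 3, 1, 3, 2]  # hypothetical TrackCat values
more_calls = [recode(v, MORE_CALLS) for v in track_cat]
more_visits = [recode(v, MORE_VISITS) for v in track_cat]
print(more_calls)
print(more_visits)
```

Each binary DV keeps the reference category (Easy) as 0 and drops cases from the other non-reference category, so each separate binary model compares one outcome with Easy only.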
Model Evaluation (cont’d)
Index plots
• Leverage values
• Standardized or unstandardized deviance residuals
• Cook’s D
Graph and compare observed and estimated counts
Analogs of R²
None in standard use and each may give different results
Typically much smaller than R² values in linear regression
Difficult to interpret
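As one illustration, McFadden's measure, 1 − LL(model) / LL(null), can be computed by hand for the race-only model. The slides do not say which analogs they have in mind, so the choice of McFadden's measure is an assumption. The counts are those implied by the Table 3.1 fractions. With a single dichotomous IV, the model's fitted probabilities equal the observed row proportions.

```python
import math

counts = [
    [96, 30, 17],  # European-American: Easy, MoreCalls, MoreVisits
    [53, 24, 26],  # African-American
]

def log_likelihood(rows, fitted):
    """Multinomial log-likelihood of the counts under fitted probabilities."""
    return sum(n * math.log(p)
               for row, probs in zip(rows, fitted)
               for n, p in zip(row, probs))

n_total = sum(map(sum, counts))
marginal = [sum(col) / n_total for col in zip(*counts)]
ll_null = log_likelihood(counts, [marginal, marginal])  # intercept-only model

# With a single dichotomous IV, fitted probabilities equal the row proportions.
by_race = [[n / sum(row) for n in row] for row in counts]
ll_model = log_likelihood(counts, by_race)

mcfadden_r2 = 1 - ll_model / ll_null
print(round(mcfadden_r2, 3))
```

The value is small even though the race effect is statistically significant, which is consistent with the point that these analogs tend to run much lower than R² in linear regression.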
Multicollinearity
SPSS multinomial logistic regression doesn’t compute multicollinearity statistics
Use SPSS linear regression to obtain them
Problematic levels
• Tolerance < .10 or VIF > 10
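Tolerance for an IV is 1 − R², where R² comes from regressing that IV on the other IVs, and VIF is 1 / tolerance, so the two cutoffs describe the same threshold. A minimal sketch for the two-IV case, where that R² is the squared correlation between the IVs. The data values are hypothetical.

```python
def r_squared_two_predictors(x, z):
    """R-squared from regressing x on z, i.e. the squared Pearson correlation."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz ** 2 / (sxx * szz)

# Hypothetical values for two IVs (race coded 0/1, years of education)
race = [0, 0, 0, 1, 1, 1, 0, 1]
education = [12, 14, 16, 10, 12, 11, 13, 12]

r2 = r_squared_two_predictors(race, education)
tolerance = 1 - r2
vif = 1 / tolerance
print(round(tolerance, 3), round(vif, 3), tolerance < 0.10 or vif > 10)
```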
Additional Topics
Polytomous IVs
Curvilinear relationships
Interactions
Additional Regression Models for Polytomous DVs
Multinomial probit regression
• Substantive results essentially indistinguishable from multinomial logistic regression
• Choice between this and multinomial logistic regression largely one of convenience and discipline-specific convention
• Many researchers prefer multinomial logistic regression because it provides odds ratios whereas probit regression does not, and because it comes with a wider variety of fit statistics
Additional Regression Models for Polytomous DVs (cont’d)
Discriminant analysis
• Limited to continuous IVs