logistic regression
Binary dependent variables
Logistic regression
Interpretation
Model fit
Multiple regression: Categorical dependent variables
Johan A. Elkink
School of Politics & International Relations
University College Dublin
27 November 2017
Outline
1 Binary dependent variables
2 Logistic regression
3 Interpretation
4 Model fit
Binary models
Binary models have a dependent variable consisting of two categories.
For example,
• Vote on a particular law
• Turning out in an election
• Approval in a referendum
• Bankrupt or not
Limited dependent variables
When a dependent variable is not continuous, or is truncated for some reason, a linear model would lead to implausible predictions.
For binary dependent variables we estimate the probability of observing a one:
• Predictions below 0 or above 1 would not make sense.
• For any case where the predicted probability is already high, it cannot increase much with a change in X (and vice versa for low probabilities).
• A linear model would imply high levels of heteroskedasticity.
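The first point above can be illustrated with a minimal sketch. The data are invented for illustration and the variable names are hypothetical; the code fits an ordinary least squares line to a binary outcome and evaluates the fitted values.

```python
# Toy data, invented for illustration: one predictor x and a binary outcome y.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares with a single predictor (a "linear probability model").
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x


def linear_prediction(value):
    """Fitted 'probability' from the linear model; not bounded to [0, 1]."""
    return intercept + slope * value


for value in (1, 5, 10):
    print(value, round(linear_prediction(value), 3))
```

With these toy numbers the fitted line already falls below 0 at x = 1 and above 1 at x = 10, inside the observed range of x.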
Estimators
A typical approach is to have an estimator that is “linear in the parameters” (i.e. it generates a linear prediction based on X and β) but then transforms this linear prediction into one bounded between 0 and 1.
[Figure: Logistic transformation. x-axis: linear prediction (−6 to 6); y-axis: predicted probability (0 to 1); reference mark at Pr(Y = 1) = 0.5.]
Logistic regression
The most common transformation is the logistic transformation, which relates to the log-odds:
$$\log\left(\frac{\Pr(y_i = 1)}{\Pr(y_i = 0)}\right) = \beta_1 + \beta_2 x_{i1} + \beta_3 x_{i2},$$
which can also be formulated as:
$$\Pr(y_i = 1) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 x_{i1} + \beta_3 x_{i2})}}.$$
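A minimal sketch of how the two formulations relate: the logistic function maps the linear prediction to a probability, and taking the log-odds of that probability returns the linear prediction. The coefficient and covariate values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def logistic(linear_prediction):
    """Map a linear prediction (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_prediction))


def log_odds(probability):
    """Inverse of the logistic function: log(p / (1 - p))."""
    return math.log(probability / (1.0 - probability))


# Hypothetical coefficients and covariate values (not from the slides).
beta1, beta2, beta3 = -1.0, 0.5, 0.25
x1, x2 = 2.0, 4.0

eta = beta1 + beta2 * x1 + beta3 * x2  # linear prediction
p = logistic(eta)                      # Pr(y = 1)

print(eta, p, log_odds(p))
```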
Estimating a logistic regression
Estimating a logistic regression is straightforward and output will look similar to that of linear regression.
E.g. explaining “Yes” in the Marriage Equality Referendum.
Note the use of continuous and discrete independent variables.
Variable                 Coefficient   (Std. error)
Age 25-34                −0.152        (0.410)
Age 35-44                −0.707∗       (0.386)
Age 45-54                −0.865∗∗      (0.390)
Age 55-64                −1.084∗∗∗     (0.399)
Age 65+                  −1.857∗∗∗     (0.374)
Urban                     0.305∗       (0.168)
Pro-abortion attitude     0.221∗∗∗     (0.028)
Intercept                 0.358        (0.372)
N                         851
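The slides do not say which software or algorithm produced these estimates. As an illustration of what such an estimator does, the following is a minimal sketch of maximum likelihood estimation by Newton-Raphson for a model with an intercept and one predictor. The function name `fit_logit` and the toy data are hypothetical.

```python
import math


def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a logit model with an intercept and one predictor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of the log-likelihood and entries of the information matrix.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) * gradient (2x2 solve).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard errors: square roots of the diagonal of the inverse information,
    # evaluated at the last iterate before the final update (same at convergence).
    se0 = math.sqrt(h11 / det)
    se1 = math.sqrt(h00 / det)
    return (b0, b1), (se0, se1)


# Toy data, invented for illustration (not perfectly separable).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

coefficients, standard_errors = fit_logit(x, y)
print("intercept, slope:", coefficients)
print("standard errors: ", standard_errors)
```

The sketch would fail on perfectly separated data, where the maximum likelihood estimate does not exist; statistical packages handle such cases with warnings or penalised estimators.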
Derivatives
For linear regression, we interpret using the first derivative, i.e. the effect of X on Y is:
$$\frac{\partial y}{\partial x_j} = \beta_j.$$
In a logistic regression, however, the derivative is more complicated:
$$\frac{\partial \pi}{\partial x_j} = \beta_j \pi (1 - \pi),$$
where π = Pr(Y = 1).
Because of the non-linear relationship, the effect of X on Y depends on the values of all independent variables, since they jointly determine π.
Nevertheless, a quick method to interpret logit coefficients is to divide them by 4 to get the slope at π = 0.5, where π(1 − π) reaches its maximum of 0.25.
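A short sketch of the divide-by-4 rule, using the pro-abortion attitude coefficient (0.221) from the estimation table. The function names are hypothetical; the numerical check at the end simply confirms the analytic derivative at π = 0.5.

```python
import math


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_effect(beta_j, pi):
    """Slope of the probability with respect to x_j: beta_j * pi * (1 - pi)."""
    return beta_j * pi * (1.0 - pi)


beta_attitude = 0.221  # pro-abortion attitude coefficient from the estimation table

# The slope is largest at pi = 0.5 and shrinks towards the extremes.
for pi in (0.1, 0.5, 0.9):
    print(f"pi = {pi:.1f}: slope = {marginal_effect(beta_attitude, pi):.4f}")

# Numerical check of the derivative at a linear prediction of 0 (pi = 0.5).
step = 1e-6
numeric_slope = (
    logistic(beta_attitude * step) - logistic(-beta_attitude * step)
) / (2 * step)
print("beta / 4 =", beta_attitude / 4, " numeric =", numeric_slope)
```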
Graphical interpretation
An alternative method is to plot the relationship between one x and π, holding the other values of X constant (e.g. at the mean, median, etc.).
[Figure: Predicted probability of a yes vote (for 35−44 year old). x-axis: pro-abortion attitude; y-axis: probability of yes vote; separate series for urban and rural respondents.]
Because the link function g(Xβ) is not linear (but instead g(Xβ) = 1 / (1 + e^(−Xβ))), the effect of X on y depends on all X.
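To make the dependence on the other covariates concrete, the sketch below traces the two curves for 35-44 year olds using the coefficients from the estimation table. The attitude values 2 to 8 are read off the axis ticks of the plot; the actual range of the scale is an assumption, and the function names are hypothetical.

```python
import math

# Coefficients as reported in the estimation table (log-odds scale).
INTERCEPT = 0.358
AGE_35_44 = -0.707
URBAN = 0.305
ATTITUDE = 0.221


def prob_yes(attitude, urban):
    """Predicted Pr(yes) for a 35-44 year old; urban is 1 (urban) or 0 (rural)."""
    eta = INTERCEPT + AGE_35_44 + URBAN * urban + ATTITUDE * attitude
    return 1.0 / (1.0 + math.exp(-eta))


def urban_rural_gap(attitude):
    """Difference in predicted probability between urban and rural respondents."""
    return prob_yes(attitude, 1) - prob_yes(attitude, 0)


# Assumed attitude range, taken from the axis ticks of the plot.
for attitude in range(2, 9):
    print(
        f"attitude={attitude}  urban={prob_yes(attitude, 1):.3f}  "
        f"rural={prob_yes(attitude, 0):.3f}  gap={urban_rural_gap(attitude):.3f}"
    )
```

The urban coefficient is the same throughout, yet the gap in probabilities narrows as attitude increases, because both curves move towards the flatter upper part of the logistic function.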
Fitted values
A third useful way of interpreting logit regression coefficients is by describing typical cases or interesting examples.
Age      Region   P(Yes vote)
18–24    Urban    0.89
35–44    Urban    0.79
65+      Urban    0.59
18–24    Rural    0.85
35–44    Rural    0.74
65+      Rural    0.51
(This assumes attitude towards abortion at the median value.)
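A sketch of how such a table of typical cases can be generated from the estimated coefficients. Two assumptions are made here: 18-24 is treated as the reference age category (it has no coefficient in the estimation table), and the attitude value of 6 is a hypothetical stand-in, because the slides do not report the median. The output is therefore not guaranteed to match the table above.

```python
import math
from itertools import product

# Coefficients from the estimation table (log-odds scale).
INTERCEPT = 0.358
URBAN = 0.305
ATTITUDE = 0.221
# 18-24 is assumed to be the reference category (effect 0).
AGE_EFFECT = {"18-24": 0.0, "35-44": -0.707, "65+": -1.857}

ASSUMED_ATTITUDE = 6.0  # hypothetical value; the median is not given in the slides


def prob_yes(age_group, urban, attitude=ASSUMED_ATTITUDE):
    """Predicted Pr(yes) for a typical case; urban is 1 (urban) or 0 (rural)."""
    eta = INTERCEPT + AGE_EFFECT[age_group] + URBAN * urban + ATTITUDE * attitude
    return 1.0 / (1.0 + math.exp(-eta))


rows = [
    (age, region, prob_yes(age, 1 if region == "Urban" else 0))
    for region, age in product(("Urban", "Rural"), AGE_EFFECT)
]

for age, region, p in rows:
    print(f"{age:6s} {region:6s} {p:.2f}")
```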
Presentation
Bottom line: it is much better to present interpretable and understandable inferences, with an indication of the level of uncertainty, than to present simply estimated coefficients.
E.g. “An increase in automobile support for a Republican senator from $10000 to $20000 in total increases his or her probability to vote for the Corporate Average Fuel Economy standard bill by 11%, give or take 7%, all else equal.”
R² for logistic regression
Although various authors have proposed pseudo-R² estimators that roughly do the same thing as an R² for linear regression, there is no good alternative.
They cannot be interpreted as “the proportion of variance in Y explained.”
Instead, it is typically better to look at the quality of the predictions: do I get high Pr(Y = 1) for the observed ones in the data and low Pr(Y = 1) for the observed zeros?
Confusion matrix
Evaluating the performance of the binary model can be done by using the confusion matrix:

                 True value: 1                        True value: 0
Prediction: 1    True positive (TP)                   False positive (FP, Type I error)
Prediction: 0    False negative (FN, Type II error)   True negative (TN)

Precision: TP / (TP + FP)
Sensitivity: TP / (TP + FN)
Specificity: TN / (FP + TN)
Accuracy: (TP + TN) / N
TPR: TP / (TP + FN)
FPR: FP / (FP + TN)
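The quantities above can be computed directly from observed outcomes and predicted probabilities. The following is a minimal sketch; the data are invented for illustration and the function names are hypothetical.

```python
def confusion_counts(y_true, p_hat, threshold=0.5):
    """Count TP, FP, FN, TN when predicting 1 for probabilities >= threshold."""
    tp = fp = fn = tn = 0
    for yi, pi in zip(y_true, p_hat):
        predicted = 1 if pi >= threshold else 0
        if predicted == 1 and yi == 1:
            tp += 1
        elif predicted == 1 and yi == 0:
            fp += 1
        elif predicted == 0 and yi == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summary(tp, fp, fn, tn):
    """Measures from the confusion matrix (assumes no denominator is zero)."""
    n = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also the true positive rate
        "specificity": tn / (fp + tn),
        "accuracy": (tp + tn) / n,
        "fpr": fp / (fp + tn),          # false positive rate
    }


# Toy outcomes and predicted probabilities, invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
p_hat = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.1, 0.55, 0.45]

counts = confusion_counts(y_true, p_hat)
metrics = summary(*counts)
print(counts)
print(metrics)
```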
Receiver Operating Characteristic curve
The accuracy of predictions will depend on the threshold probability: the default is π = 0.5, but other thresholds are possible.
Depending on the application, it might be better or worse to over- or underestimate ones relative to zeros.
The ROC-curve plots, for all possible thresholds, the true positive rate against the false positive rate.
An ROC-curve further from (above) the 45 degree line indicates a better predictive performance; any predictions under this line indicate worse than random prediction.
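A minimal sketch of how the points on an ROC-curve are obtained: each distinct predicted probability is used in turn as the threshold, and the false and true positive rates are recorded. The data are invented for illustration and `roc_points` is a hypothetical helper name.

```python
def roc_points(y_true, p_hat):
    """(FPR, TPR) pairs for each distinct threshold, strictest threshold first."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every prediction: nothing predicted as 1
    for threshold in sorted(set(p_hat), reverse=True):
        tp = sum(1 for yi, pi in zip(y_true, p_hat) if pi >= threshold and yi == 1)
        fp = sum(1 for yi, pi in zip(y_true, p_hat) if pi >= threshold and yi == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy outcomes and predicted probabilities, invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
p_hat = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.1, 0.55, 0.45]

points = roc_points(y_true, p_hat)
for fpr, tpr in points:
    print(f"FPR = {fpr:.1f}  TPR = {tpr:.1f}")
```

Lowering the threshold can only add predicted ones, so both rates move from (0, 0) towards (1, 1).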
[Figure: Receiver Operating Characteristic curve. x-axis: false positive rate (0 to 1); y-axis: true positive rate (0 to 1); separate curves for the model with age and without age.]
Given the above, we can also calculate the area under the ROC-curve as a measure of prediction quality, called AUC. This is somewhat related to the Gini coefficient for income distributions (G = 2AUC − 1).
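A sketch of the AUC calculation. Rather than integrating the curve, it uses the equivalent pairwise formulation: the AUC equals the share of (one, zero) pairs in which the observed one receives the higher predicted probability, with ties counted as one half. The data are invented for illustration and the function name is hypothetical.

```python
def auc_pairwise(y_true, p_hat):
    """Share of (one, zero) pairs ranked correctly; ties count as one half."""
    ones = [pi for yi, pi in zip(y_true, p_hat) if yi == 1]
    zeros = [pi for yi, pi in zip(y_true, p_hat) if yi == 0]
    score = 0.0
    for p_one in ones:
        for p_zero in zeros:
            if p_one > p_zero:
                score += 1.0
            elif p_one == p_zero:
                score += 0.5
    return score / (len(ones) * len(zeros))


# Toy outcomes and predicted probabilities, invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
p_hat = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.1, 0.55, 0.45]

auc = auc_pairwise(y_true, p_hat)
gini = 2 * auc - 1  # G = 2AUC - 1
print("AUC:", auc, " Gini:", gini)
```

An AUC of 0.5 corresponds to the 45 degree line (G = 0), and an AUC of 1 to perfect separation of ones and zeros (G = 1).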
Variations
Logistic regression is the most common and easiest to use. Other models exist for specific uses:
• Probit regression: similar to logistic regression, but with thinner tails.
• Ordered probit: similar to probit regression, but for an ordinal dependent variable with multiple categories.
• Multinomial logistic regression: similar to logistic regression, but for a nominal dependent variable with multiple categories.
• Poisson / negative binomial: regression models for discrete, non-negative dependent variables, such as counts.
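The difference in tails between probit and logit, mentioned in the first item above, can be seen by comparing the two transformations on a common scale. In this sketch the logistic curve is rescaled to unit variance (the standard logistic distribution has variance π²/3) so that it is comparable to the standard normal curve used by probit; the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, the transformation used in probit regression."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf_unit_variance(z):
    """Logistic CDF rescaled to variance 1 (standard logistic has variance pi^2 / 3)."""
    return 1.0 / (1.0 + math.exp(-z * math.pi / math.sqrt(3.0)))


# Probability mass below z in the lower tail of each distribution.
for z in (-4.0, -3.0, -2.0):
    print(f"z = {z}: probit = {normal_cdf(z):.6f}  logit = {logistic_cdf_unit_variance(z):.6f}")
```

At each of these points the logistic curve leaves more probability in the tail than the normal curve.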