Post on 10-Mar-2020
1
Shall We Mixed Logit? Estimation stability and prediction reliability of
error component mixed logit models
Shusaku NAKAI, Ryuichi KITAMURA (Kyoto University)
Toshiyuki YAMAMOTO (Nagoya University)
2
Outline
• Introduction
• Error component MXL models
  – Identification issue
  – Variability of parameter estimates
  – Estimation of choice probabilities
• Usefulness of MNL models
• Conclusions and future research
3
Introduction
MXL models
• considered the most promising discrete choice model
• widespread applications in recent years
However
• properties of parameter estimates are not well understood
Objective
• Estimation stability and prediction reliability of error component MXL models are examined with simulated data
4
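Error component MXL models
• Examined is a trinomial MXL model

\[
\begin{aligned}
u_{1n} &= \beta_1 X_{11n} + \beta_2 X_{21n} + \mu_{1n} + \varepsilon_{1n} \\
u_{2n} &= \beta_1 X_{12n} + \beta_2 X_{22n} + \mu_{2n} + \varepsilon_{2n} \\
u_{3n} &= \beta_1 X_{13n} + \beta_2 X_{23n} + \mu_{2n} + \varepsilon_{3n}
\end{aligned}
\]

• X: 2 explanatory variables
• ε: standard iid Gumbel
• μ: 2 error components

\[
\mu_{1n} \sim N(0, s_1^2), \qquad \mu_{2n} \sim N(0, s_2^2)
\]

\[
\Sigma =
\begin{pmatrix}
s_1^2 + \pi^2/6 & 0 & 0 \\
0 & s_2^2 + \pi^2/6 & s_2^2 \\
0 & s_2^2 & s_2^2 + \pi^2/6
\end{pmatrix}
\]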
5
Simulated discrete choice data
Generated by a probit model

\[
u_{in} = \beta_1 X_{1in} + \beta_2 X_{2in} + \xi_{in}, \qquad i = 1, 2, 3
\]

\[
\Sigma_\xi =
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & \rho \\
0 & \rho & 1
\end{pmatrix},
\qquad
\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}
=
\begin{pmatrix} 1.0 \\ 0.5 \end{pmatrix},
\qquad
X_{jin} \sim N(0, 1)
\]

• ρ = 0.00, 0.10, 0.30, 0.50, 0.70, 0.90, 0.95, 0.99
• Each data set contains 1,000 cases
• 25 data sets are generated for each value of ρ
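The data-generating step described on this slide could be sketched roughly as follows. This is a minimal illustration rather than the authors' code: the function name `simulate_probit_choices`, the seeding scheme, and the way the correlated errors are built (ξ3 = ρ·ξ2 + sqrt(1 − ρ²)·z) are assumptions made here; only the coefficients, the N(0, 1) regressors, the error covariance and the 1,000 cases per data set come from the slide.

```python
import math
import random


def simulate_probit_choices(rho, n_cases=1000, beta=(1.0, 0.5), seed=0):
    """Draw one simulated data set from the trinomial probit model.

    Returns a list of (x, choice) pairs, where x[k][i] is explanatory
    variable k+1 for alternative i+1 and choice is 1, 2 or 3.
    (Hypothetical helper; names and layout are illustrative.)
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_cases):
        # Two explanatory variables per alternative, each N(0, 1).
        x = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(2)]
        # Errors with unit variances and corr(xi_2, xi_3) = rho.
        z1, z2, z3 = (rng.gauss(0.0, 1.0) for _ in range(3))
        xi = [z1, z2, rho * z2 + math.sqrt(1.0 - rho ** 2) * z3]
        u = [beta[0] * x[0][i] + beta[1] * x[1][i] + xi[i] for i in range(3)]
        # The chosen alternative is the one with the highest utility.
        choice = 1 + max(range(3), key=lambda i: u[i])
        data.append((x, choice))
    return data


# One data set of 1,000 cases for rho = 0.5.
example = simulate_probit_choices(0.5, seed=1)
```

Repeating the call with 25 different seeds for each value of ρ would give a design of the same shape as the one listed above.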
6
Identification issue
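For trinomial probit models, Dansie (1985) suggests

\[
\Sigma_A =
\begin{pmatrix}
\sigma_{11} & 0 & 0 \\
0 & 1 & \sigma_{23} \\
0 & \sigma_{23} & 1
\end{pmatrix},
\quad
\Sigma_B =
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & \sigma'_{23} \\
0 & \sigma'_{23} & 1
\end{pmatrix},
\quad
\Sigma_C =
\begin{pmatrix}
\sigma''_{11} & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}
\]

• These 3 matrices are equivalent, and produce the same likelihood value
• Model estimation would not be able to indicate which is most likely, thus Σ_A is not estimable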
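One way to see how three such matrices can be observationally equivalent is that only utility differences matter and the overall error scale is absorbed by the coefficients. The sketch below is an illustration under that reasoning, not necessarily Dansie's own argument. The helper `normalized_diff_cov` and the numeric values (σ11 = 2.0, σ23 = 0.4) are hypothetical; σ'23 and σ''11 are then chosen so that all three matrices imply the same scaled covariance of (u2 − u1, u3 − u1).

```python
def normalized_diff_cov(sigma):
    """Covariance of (u2 - u1, u3 - u1), scaled so Var(u2 - u1) = 1.

    Returns (Var(u3 - u1), Cov(u2 - u1, u3 - u1)) after scaling.
    """
    v22 = sigma[1][1] + sigma[0][0] - 2.0 * sigma[0][1]
    v33 = sigma[2][2] + sigma[0][0] - 2.0 * sigma[0][2]
    v23 = sigma[1][2] - sigma[0][1] - sigma[0][2] + sigma[0][0]
    return (v33 / v22, v23 / v22)


# Illustrative values (not taken from the slides).
s11, s23 = 2.0, 0.4
r = (s23 + s11) / (1.0 + s11)   # correlation of the utility differences
s23_prime = 2.0 * r - 1.0       # matching value for the Sigma_B structure
s11_dprime = r / (1.0 - r)      # matching value for the Sigma_C structure

sigma_a = [[s11, 0.0, 0.0], [0.0, 1.0, s23], [0.0, s23, 1.0]]
sigma_b = [[1.0, 0.0, 0.0], [0.0, 1.0, s23_prime], [0.0, s23_prime, 1.0]]
sigma_c = [[s11_dprime, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

results = [normalized_diff_cov(m) for m in (sigma_a, sigma_b, sigma_c)]
```

All three entries of `results` coincide, so after rescaling the utility coefficients the three structures would give the same choice probabilities, and the data cannot favor one over the others.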
7
Identification issue (cont.)
• For GEV models, Börsch-Supan (1990) and Munizaga et al. (2000) estimated models in the case of 4 alternatives and found that nested logit models have some capacity to accommodate heteroscedasticity
• Of the three covariance matrices Σ_A, Σ_B and Σ_C, Σ_B is labeled "Nested logit model" and Σ_C is labeled "HEV model"
8
Identification issue (cont.)
• In this study, data sets are simulated by Σ_B
• Error component MXL model examined in this study is consistent with Σ_A

\[
\Sigma =
\begin{pmatrix}
s_1^2 + \pi^2/6 & 0 & 0 \\
0 & s_2^2 + \pi^2/6 & s_2^2 \\
0 & s_2^2 & s_2^2 + \pi^2/6
\end{pmatrix}
\]
9
Identification issue (cont.)
• Standard deviation becomes extremely large, implying covariance structure is unidentified
• The MXL model is subject to the same identification problem as the probit model (consistent with Walker et al. (2007))
[Figures: parameter estimates of s_1^2 and s_2^2 (log scale, 0.1 to 1000) versus ρ (0.00 to 1.00)]
10
Identification issue (cont.)
• Hereafter, we constrain s_1^2 = s_2^2 = s^2

\[
\Sigma = \left(s^2 + \frac{\pi^2}{6}\right)
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & \rho \\
0 & \rho & 1
\end{pmatrix},
\qquad
\rho = \frac{s^2}{s^2 + \pi^2/6}
\]
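Under the constraint s_1^2 = s_2^2 = s^2, the error correlation and the error component variance determine each other. A small sketch of the mapping in both directions is given below; the function names are illustrative, and the inverse formula s^2 = ρ(π²/6)/(1 − ρ) is simply the relation ρ = s²/(s² + π²/6) solved for s².

```python
import math

GUMBEL_VAR = math.pi ** 2 / 6.0  # variance of a standard Gumbel error


def rho_from_s2(s2):
    """Error correlation implied by an error component variance s^2."""
    return s2 / (s2 + GUMBEL_VAR)


def s2_from_rho(rho):
    """Error component variance implied by a correlation rho (rho < 1)."""
    return rho * GUMBEL_VAR / (1.0 - rho)


# Variances implied by some of the simulated correlation levels.
implied = {rho: s2_from_rho(rho) for rho in (0.10, 0.50, 0.90, 0.99)}
```

For ρ = 0.5 the implied s² equals π²/6, and it grows without bound as ρ approaches 1, which gives a sense of how large the true variance already is in the high-correlation designs.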
11
Variability of parameter estimates
• Parameter estimates are quite unstable, especially for cases with higher ρ
[Figure: parameter estimates of β_1 (0 to 10) versus error correlation coefficient ρ (0.00 to 1.00); true value β_1 = 1.0]
12
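Variability of parameter estimates (cont.)
• This instability is caused by the dependence of coefficient estimates on error variance
• Error variance is not standardized in the MXL model
• Parameter estimates need to be normalized

Probit model

\[
\Sigma_\xi =
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & \rho \\
0 & \rho & 1
\end{pmatrix}
\]

Error component MXL model

\[
\Sigma = \left(s^2 + \frac{\pi^2}{6}\right)
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & \rho \\
0 & \rho & 1
\end{pmatrix},
\qquad
\rho = \frac{s^2}{s^2 + \pi^2/6}
\]

Normalized coefficient

\[
\tilde\beta_j = \frac{1}{\sqrt{\hat s^2 + \pi^2/6}}\,\hat\beta_j
\]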
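The normalization can be sketched in a few lines, assuming it divides each estimated coefficient by the total error standard deviation sqrt(ŝ² + π²/6) so that the result is on the unit-variance scale of the probit data. The function name and the example numbers are hypothetical, chosen only to show the rescaling.

```python
import math

GUMBEL_VAR = math.pi ** 2 / 6.0  # variance of a standard Gumbel error


def normalize_coefficient(beta_hat, s2_hat):
    """Rescale an MXL coefficient estimate to a unit-variance error scale."""
    return beta_hat / math.sqrt(s2_hat + GUMBEL_VAR)


# Illustration: if rho = 0.5, then s^2 = pi^2/6 and the total error
# standard deviation is pi/sqrt(3), about 1.81. A raw estimate of that
# size therefore maps back to a normalized coefficient of 1.0.
raw_beta1 = math.pi / math.sqrt(3.0)
beta1_tilde = normalize_coefficient(raw_beta1, GUMBEL_VAR)
```

This also shows why the raw estimates look unstable: any upward error in ŝ² inflates the raw coefficient by the same scale factor, which the normalization removes.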
13
Variability of parameter estimates (cont.)
• After normalization, utility coefficients are unbiased and stable
[Figure: normalized parameter estimates of β_1 (0 to 1.4) versus ρ (0.00 to 1.00); true value β_1 = 1.0]
14
Variability of parameter estimates (cont.)
• Estimated variances of the error components tend to be biased upward
[Figure: parameter estimates of s^2 (log scale, 0.1 to 1000) versus ρ (0.00 to 1.00), with the true value marked]
15
Variability of parameter estimates (cont.)
• Biases in estimated variances might be related to the difference in shape between the Normal and Gumbel distributions
• Amemiya (1981) suggests that, in the binary case, N(0, 1.6²) fits L(0, π²/3) better than N(0, π²/3) does, though the latter has the same variance as L(0, π²/3) (1.6 < π/3^0.5 ≈ 1.8)
[Figure: cumulative distribution functions plotted over −3 to 3 (vertical axis 0 to 1.2), with a magnified inset]
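The comparison attributed to Amemiya (1981) can be checked numerically with a short sketch. The measure used here, the largest absolute gap between the logistic and normal CDFs on a grid over [−3, 3], is an assumption chosen for illustration and is not necessarily the criterion Amemiya used; the function names are hypothetical.

```python
import math


def logistic_cdf(x):
    """CDF of the standard logistic distribution L(0, pi^2/3)."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(x, sd):
    """CDF of N(0, sd^2)."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def max_cdf_gap(sd, lo=-3.0, hi=3.0, steps=600):
    """Largest absolute CDF difference on an evenly spaced grid."""
    grid = (lo + (hi - lo) * k / steps for k in range(steps + 1))
    return max(abs(logistic_cdf(x) - normal_cdf(x, sd)) for x in grid)


gap_sd_1_6 = max_cdf_gap(1.6)                               # N(0, 1.6^2)
gap_equal_variance = max_cdf_gap(math.pi / math.sqrt(3.0))  # N(0, pi^2/3)
```

On this grid the gap for the standard deviation of 1.6 comes out smaller than the gap for the equal-variance normal, which is in line with the point made on the slide.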
16
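Estimation of choice probabilities
• Choice probabilities are calculated by

\[
\hat P_n(i) = \iint
\frac{\exp\left(\hat\beta_1 X_{1in} + \hat\beta_2 X_{2in} + \hat s\,\eta_i\right)}
{\sum_j \exp\left(\hat\beta_1 X_{1jn} + \hat\beta_2 X_{2jn} + \hat s\,\eta_j\right)}
\, f(\eta_1)\, f(\eta_2)\, d\eta_1\, d\eta_2
\]

\[
\eta_i, \eta_j =
\begin{cases}
\eta_1 & \text{if } i \text{ or } j = 1 \\
\eta_2 & \text{if } i \text{ or } j = 2 \text{ or } 3
\end{cases}
\]

for the case X_{11} = X_{21} = X_{12} = X_{22} = X_{13} = X_{23} = 1.0

• The effect of a biased estimate of s is examined by introducing q and calculating P(i | q)
[Equation: P(i | q), of the same double-integral form as above, with q and π²/6 entering the exponents; not legible in the source]
• True probability is obtained when q ≈ 1.29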
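The double integral for the MXL choice probability can be approximated by simulation. The sketch below is one simple way to do that and is not the authors' procedure: it assumes η_1 and η_2 are independent standard normal draws, uses plain Monte Carlo averaging, and fixes all X at 1.0 as in the case described on the slide. The function name, the number of draws and the value s = 2.0 are illustrative.

```python
import math
import random


def mxl_choice_probs(s, beta=(1.0, 0.5), n_draws=20000, seed=0):
    """Simulated P(1), P(2), P(3) for the error component MXL model.

    All explanatory variables are set to 1.0; eta_1 enters alternative 1
    and a shared eta_2 enters alternatives 2 and 3.
    """
    rng = random.Random(seed)
    x = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]  # x[k][i]: variable k, alternative i
    totals = [0.0, 0.0, 0.0]
    for _ in range(n_draws):
        eta1, eta2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        eta = (eta1, eta2, eta2)
        v = [beta[0] * x[0][i] + beta[1] * x[1][i] + s * eta[i] for i in range(3)]
        top = max(v)  # subtract the maximum for numerical stability
        e = [math.exp(vi - top) for vi in v]
        denom = sum(e)
        for i in range(3):
            totals[i] += e[i] / denom  # logit probability given this draw
    return [t / n_draws for t in totals]


probs_no_component = mxl_choice_probs(0.0)
probs_with_component = mxl_choice_probs(2.0)
```

With s = 0 the three alternatives are symmetric and each probability is 1/3. With s > 0 alternatives 2 and 3 share an error component, so P(1) rises above 1/3 while P(2) and P(3) stay equal, which is why a biased estimate of s translates directly into biased choice probabilities.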
17
Estimation of choice probabilities (cont.)
• True probabilities are contained in the range of the estimated probability
[Figure, ρ = 0.1: choice probabilities P(1), P(2), P(3) (0 to 0.6) versus q (log scale, 0.1 to 10); range of estimated probability marked from q = 1.15 to q = 1.73, with the true value indicated]
18
Estimation of choice probabilities (cont.)
• True probabilities are NOT contained in the range of the estimated probability
[Figure, ρ = 0.5: choice probabilities P(1), P(2), P(3) (0 to 0.6) versus q (log scale, 0.1 to 10); range of estimated probability marked from q = 1.57 to q = 2.29, with the true value indicated]
19
Estimation of choice probabilities (cont.)
• True probabilities are NOT contained in the range of the estimated probability
[Figure, ρ = 0.9: choice probabilities P(1), P(2), P(3) (0 to 0.6) versus q (log scale, 0.1 to 10); range of estimated probability marked from q = 3.45 to q = 5.33, with the true value indicated]
20
Usefulness of MNL models
• MNL models are estimated using the same data sets
• Utility coefficient estimates are biased upward, but only by up to about 30%, which is smaller than for the MXL model
[Figure: parameter estimates of β_1 (0 to 1.6) versus ρ (0.00 to 1.00); true value β_1 = 1.0]
21
Conclusions and future research
For the error component MXL model
1. Variance structure cannot be uniquely identified through model estimation
2. Parameter estimates are quite unstable, especially for the case with a high error correlation
3. After proper normalization, utility coefficients are unbiased and stable
4. Estimated variances of the error components tend to be biased upwards
5. Choice probabilities are biased unless the error correlation is very small
MNL model can produce relatively unbiased utility coefficient estimates
22
Conclusions and future research (cont.)
• One would adopt the MXL model in search of a covariance specification
-> The model is incapable of identifying the true structure, and parameter estimates are unstable
• One may opt to develop an adequately specified MNL model through careful selection of explanatory variables, utility formulation or definition of alternatives (consistent with the suggestion by Pinjari & Bhat (2006))
• Further research is needed on the properties of parameter estimates of MXL models with taste heterogeneity as well as error components