GY460 Techniques of Spatial Analysis Steve Gibbons Lecture 6: Probabilistic choice models.


Introduction

• It is sometimes useful to model the choices of individuals, firms, or other agents over discrete alternatives

– Choice of transport mode

– Choice of firm location amongst regions

– Choice of cities or country to migrate to

• Theoretical framework

– Random utility model

• Empirical methods:

– Micro: Probit, logit, multinomial logit

– Aggregate: Poisson, OLS, gravity

The “Random Utility” choice model

Random Utility Model

• RUM underlies economic interpretation of discrete choice models. Developed by Daniel McFadden for econometric applications

– see JoEL January 2001 for Nobel lecture; also Manski (2001) Daniel McFadden and the Econometric Analysis of Discrete Choice, Scandinavian Journal of Economics, 103(2), 217-229

• Preferences are functions of biological taste templates, experiences, other personal characteristics

– Some of these are observed, others unobserved

– Allows for taste heterogeneity

• Discussion below is in terms of individual utility (e.g. migration, transport mode choice) but similar reasoning applies to firm choices


• Individual i’s utility from a choice j can be decomposed into two components:

$$U_{ij} = V_{ij} + \varepsilon_{ij}$$

• $V_{ij}$ is deterministic: common to everyone, given the same characteristics and constraints

– representative tastes of the population, e.g. effects of time and cost on travel mode choice

• $\varepsilon_{ij}$ is random

– reflects idiosyncratic tastes of i and unobserved attributes of choice j


• $V_{ij}$ is a function of attributes of alternative j (e.g. price and time) and observed consumer and choice characteristics:

$$V_{ij} = \beta t_{ij} + \gamma p_{ij} + \delta z_{ij}$$

• We are interested in finding $\beta$, $\gamma$, $\delta$

• Let’s forget about z for now, for simplicity

RUM and binary choices

• Consider two choices e.g. bus or car

• We observe whether an individual uses one or the other

• Define

$$y_i = \begin{cases} 1 & \text{if } i \text{ chooses bus} \\ 0 & \text{if } i \text{ chooses car} \end{cases}$$

• What is the probability that we observe an individual choosing to travel by bus?

• Assume utility maximisation

• Individual chooses bus (y=1) rather than car (y=0) if utility of commuting by bus exceeds utility of commuting by car

RUM and binary choices

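• So choose bus if $U_{i1} > U_{i0}$

$$V_{i1} + \varepsilon_{i1} > V_{i0} + \varepsilon_{i0}$$

$$V_{i1} - V_{i0} + \varepsilon_{i1} - \varepsilon_{i0} > 0$$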

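• So the probability that we observe an individual choosing bus travel is

$$\mathrm{Prob}\left(V_{i1} - V_{i0} + \varepsilon_{i1} - \varepsilon_{i0} > 0\right)$$

$$= \mathrm{Prob}\left(\beta (t_{i1} - t_{i0}) + \gamma (p_{i1} - p_{i0}) + \varepsilon_{i1} - \varepsilon_{i0} > 0\right)$$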
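The link between utility maximisation and a choice probability can be checked by simulation. The following is a minimal sketch, assuming the random terms are independent type I extreme value (Gumbel) draws, which is the distribution that leads to the logit formula; the deterministic utilities `v_bus` and `v_car` are hypothetical values chosen only for illustration.

```python
import math
import random

def gumbel_draw(rng):
    """Draw from a standard type I extreme value (Gumbel) distribution."""
    u = rng.random()
    while u == 0.0:  # rng.random() lies in [0, 1); avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_bus_share(v_bus, v_car, n_draws=100_000, seed=0):
    """Share of simulated individuals for whom U_bus > U_car."""
    rng = random.Random(seed)
    chose_bus = 0
    for _ in range(n_draws):
        u_bus = v_bus + gumbel_draw(rng)  # U = V + epsilon
        u_car = v_car + gumbel_draw(rng)
        if u_bus > u_car:
            chose_bus += 1
    return chose_bus / n_draws

v_bus, v_car = 0.5, 0.0  # hypothetical deterministic utilities
simulated = simulate_bus_share(v_bus, v_car)
logit = math.exp(v_bus - v_car) / (1.0 + math.exp(v_bus - v_car))
print(f"simulated share: {simulated:.3f}, logit formula: {logit:.3f}")
```

The simulated share of bus choosers should be close to the logistic function of the utility difference, since the difference of two independent Gumbel draws follows a logistic distribution.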

The linear probability model

• Assume probability depends linearly on observed characteristics (price and time)

$$\mathrm{Prob}(i \text{ chooses bus}) = \beta (t_{i1} - t_{i0}) + \gamma (p_{i1} - p_{i0})$$

• Then you can estimate by linear regression

$$y_{i1} = \beta (t_{i1} - t_{i0}) + \gamma (p_{i1} - p_{i0}) + \varepsilon_{i1}$$

• where $y_{i1}$ is the “dummy variable” for mode choice (1 if bus, 0 if car)

• Other consumer and choice characteristics can be included (the z’s in the first slide in this section)

The linear probability model

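• Unfortunately this has some undesirable properties, e.g. the fitted line can predict probabilities below 0 or above 1

[Figure: Prob(bus), from 0 to 1, plotted against $V_i$, with the linear regression line]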

Non-linear probability model

• Better for probability function to have a shape something like:

[Figure: S-shaped curve for Prob(bus), bounded between 0 and 1, plotted against $V_i$]
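The contrast between a straight line and a bounded probability function can be sketched numerically. This is only an illustration: the intercept and slope of the linear function are hypothetical, and the logistic function is used as one example of an S-shaped curve.

```python
import math

def linear_probability(v):
    """Hypothetical linear probability line: Prob = 0.5 + 0.25 * V."""
    return 0.5 + 0.25 * v

def logistic_probability(v):
    """Logistic function: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-v))

for v in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"V = {v:+.1f}  linear: {linear_probability(v):+.2f}  "
          f"logistic: {logistic_probability(v):.3f}")
```

For large positive or negative values of V the linear function leaves the unit interval, while the logistic function flattens out towards 0 and 1.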

Probits and logits

• Common assumptions:

– Cumulative normal distribution function – “Probit”

– Logistic function – “Logit”

$$\mathrm{Prob}(i \text{ chooses bus}) = \frac{\exp(V_i)}{1 + \exp(V_i)}$$

• Estimation by maximum likelihood

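$$\mathrm{Prob}(y_i = 1) = F(\mathbf{x}_i \boldsymbol{\beta})$$

$$\mathrm{Prob}(y_i = 0) = 1 - F(\mathbf{x}_i \boldsymbol{\beta})$$

$$\ln L = \sum_{i=1}^{n} \left[ y_i \ln F(\mathbf{x}_i \boldsymbol{\beta}) + (1 - y_i) \ln\left(1 - F(\mathbf{x}_i \boldsymbol{\beta})\right) \right]$$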
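Maximising this log-likelihood can be sketched in a few lines. The example below assumes a logit with a constant and a single regressor and uses Newton-Raphson steps; the data (`time_saving`, `chose_bus`) are hypothetical and serve only to make the sketch runnable. In practice a statistics package would be used.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def log_likelihood(b0, b1, x, y):
    """ln L = sum_i [y_i ln F_i + (1 - y_i) ln(1 - F_i)], with F_i = logistic(b0 + b1 * x_i)."""
    total = 0.0
    for xi, yi in zip(x, y):
        f = logistic(b0 + b1 * xi)
        total += yi * math.log(f) + (1 - yi) * math.log(1.0 - f)
    return total

def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a logit with a constant and one regressor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            f = logistic(b0 + b1 * xi)
            w = f * (1.0 - f)
            g0 += yi - f            # gradient with respect to the constant
            g1 += (yi - f) * xi     # gradient with respect to the slope
            h00 += w                # elements of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve (negative Hessian) * step = gradient for the 2x2 case.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: x is the time saved by taking the bus, y = 1 if bus is chosen.
time_saving = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
chose_bus = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0_hat, b1_hat = fit_logit(time_saving, chose_bus)
print(f"constant: {b0_hat:.3f}, slope: {b1_hat:.3f}")
print(f"ln L at estimates: {log_likelihood(b0_hat, b1_hat, time_saving, chose_bus):.3f}")
```

Because the logit log-likelihood is globally concave, Newton-Raphson from a zero starting value typically converges in a handful of iterations when the data are not perfectly separated.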

Example

• McFadden, D. (1974) The Measurement of Urban Travel Demand, Journal of Public Economics, 3

• Methods of commuting in San Francisco Bay area

Example 1

| Characteristics | Coefficient (t-statistic) |
| --- | --- |
| Family income ($) | 0.000095 (0.774) |
| Car-bus cost, cents per round trip | -0.01022* (3.726) |
| Car-bus vehicle time costs (one-way minutes x wage) | -0.01479 (2.460) |
| Bus total access time costs (one-way minutes x wage) | -0.00314 (0.818) |
| Constant | 0.3832 (0.428) |

McFadden (1974) car versus bus commute modes in SF Bay area

Multiple choices and the “multinomial logit”

Multiple choices

• We often want to think about many more than two choices

– Choice of regional location

– Choice of transport mode with many alternatives

– Choice amongst a sample of schools

• How can we extend the binary choice logit model?

• Random Utility model extends to many choices

$$U_{ij} = V_{ij} + \varepsilon_{ij}$$

• Choose choice k if utility higher than for all other choices

$$V_{ik} + \varepsilon_{ik} > V_{ij} + \varepsilon_{ij} \quad \text{for all } j \neq k$$

Multinomial logit (1)

• Again we need to assume some distribution for the unobserved factor

• One type of distribution (extreme value) gives a simple solution for the probability that choice k is made:

$$\mathrm{Prob}(i \text{ chooses } k) = \frac{\exp(V_{ik})}{\sum_j \exp(V_{ij})}$$

• This is a generalisation of the logit model with many alternatives = “multinomial logit” or “conditional logit”

$$\ln L = \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln \mathrm{Prob}(i \text{ chooses } j)$$
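The choice probabilities and the log-likelihood can be computed directly from the deterministic utilities. This is a minimal sketch; the utility values and observed choices are hypothetical, and the function names are illustrative rather than taken from any library.

```python
import math

def mnl_probabilities(v):
    """Multinomial logit probabilities exp(V_k) / sum_j exp(V_j) for one individual."""
    v_max = max(v)  # subtracting the maximum avoids overflow and leaves probabilities unchanged
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def mnl_log_likelihood(utilities, choices):
    """ln L = sum_i sum_j y_ij ln Prob(i chooses j); choices[i] is the index chosen by i."""
    return sum(math.log(mnl_probabilities(v_i)[k_i])
               for v_i, k_i in zip(utilities, choices))

# Hypothetical utilities of three alternatives for two individuals.
utilities = [[0.2, 1.0, -0.5], [0.0, 0.3, 0.9]]
choices = [1, 2]  # alternative chosen by each individual

probs = mnl_probabilities(utilities[0])
log_lik = mnl_log_likelihood(utilities, choices)
print([round(p, 3) for p in probs], round(log_lik, 3))
```

In estimation the utilities would be functions of the parameters, and the log-likelihood would be maximised over those parameters.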

Multinomial logit (2)

• Recall: $V_{ij}$ is a linear function of observed characteristics of the individuals and their choices, e.g. for travel mode choice

$$V_{ij} = \beta t_{ij} + \gamma p_{ij} + \delta_j z_{ij}$$

• Parameters estimated:

• For an individual characteristic that is common across choices (e.g. income, gender): one parameter per choice

– For at least one choice this is zero (base case).

• For a characteristic which varies only across choices, e.g. price of transport: one parameter common across choices

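Example: Value of time

• MNL models used to estimate “value of travel time” from observed commuter behaviour

• Three transport choices: bus (0), train (1), car (2)

• Choosing bus as the base case:

$$V_{i1} = \gamma (\text{price}_1 - \text{price}_0) + \beta (\text{time}_1 - \text{time}_0) + (\delta_1 - \delta_0)\,\text{sex}_i + (\theta_1 - \theta_0)\,\text{companycar}_i$$

$$V_{i2} = \gamma (\text{price}_2 - \text{price}_0) + \beta (\text{time}_2 - \text{time}_0) + (\delta_2 - \delta_0)\,\text{sex}_i + (\theta_2 - \theta_0)\,\text{companycar}_i$$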

Example 1: Value of time

• For example, from Truong and Hensher, Economic Journal, 95 (1985) p. 15 for bus/train/car choices in Sydney 1982

Example 2: immigration

• Scott, Coomes and Izyumov (2005) The Location Choice of Employment-Based Immigrants among U.S. Metro Areas. Journal of Regional Science 45(1) 113-145

• Estimate the impact of metropolitan area characteristics on destination choice for US migrants in 1995

• 298 destination MSAs

Example 2: immigration

Source: Scott, Coomes et al (note: they also report models which include individual Xs)

The independence of irrelevant alternatives problem (IIA) and the nested logit model

Multinomial logit and “IIA”

• Many applications in economic and geographical journals (and other research areas)

• The multinomial logit model is the workhorse of multiple choice modelling in all disciplines. Easy to compute

• But it has a drawback

Independence of Irrelevant Alternatives

• Consider market shares

– Red bus 20%

– Blue bus 20%

– Train 60%

• IIA assumes that if red bus company shuts down, the market shares become

– Blue bus 20% + 5% = 25%

– Train 60% + 15% = 75%

• Because the ratio of blue bus trips to train trips must stay at 1:3


• Model assumes that ‘unobserved’ attributes of all alternatives are perceived as equally similar

• But will people unable to travel by red bus really switch to travelling by train?

• Most likely outcome is (assuming supply of bus seats is elastic)

– Blue bus: 40%

– Train: 60%

• This failure of multinomial/conditional logit models is called the Independence of Irrelevant Alternatives assumption (IIA)


• It is easy to see why this is:

• Ratio of probabilities of choosing k (e.g. red bus) and another choice l (e.g. train) is just

$$\frac{\exp(V_{ik})}{\exp(V_{il})}$$

• All other choices drop out of this odds ratio
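The red bus / blue bus shares above can be reproduced with the multinomial logit formula. This is a minimal sketch, assuming utilities are set equal to the log of the initial market shares (any common constant could be added without changing the result).

```python
import math

def mnl_shares(v):
    """Multinomial logit shares for a dict mapping alternative name to utility."""
    denom = sum(math.exp(vj) for vj in v.values())
    return {name: math.exp(vj) / denom for name, vj in v.items()}

# Utilities chosen so that the model reproduces the initial market shares.
initial_shares = {"red bus": 0.20, "blue bus": 0.20, "train": 0.60}
v = {name: math.log(share) for name, share in initial_shares.items()}

before = mnl_shares(v)
del v["red bus"]  # the red bus company shuts down
after = mnl_shares(v)

print({name: round(s, 2) for name, s in after.items()})
print(before["blue bus"] / before["train"], after["blue bus"] / after["train"])
```

The blue bus to train ratio is 1:3 both before and after the red bus is removed, so the model predicts shares of 25% and 75% rather than the more plausible 40% and 60%.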

• There are models that overcome this, e.g…

Nested Logit Model

• Multinomial logit model can be generalised to relax IIA assumption

– Nested Logit (Nested Multinomial Logit)

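[Tree diagram: the first-stage choice is between Car (1) and Public transport (2); within Public transport, the second-stage choice is between Bus (3) and Train (4)]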

• Characteristics of Bus and Train affect decision of whether to use Car or Public Transport

• Estimate by sequential logits…

Nested Logit Model

• Value placed on choices available in second stage (3,4) enter into calculation of choice probabilities in first stage (2)…

• Logit for bus versus train to estimate V3 and V4

• Define the ‘Inclusive Value’ of public transport as

$$I_2 = \ln\left(\exp(V_3) + \exp(V_4)\right)$$

• Estimate logit model for Car (1) versus Public (2) using:

$$\mathrm{Prob}(\text{Public}) = \frac{\exp(V_2 + I_2)}{\exp(V_2 + I_2) + \exp(V_1)}$$
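The two-stage calculation can be sketched as follows. The utility values are hypothetical. In many treatments the inclusive value enters the first stage with its own estimated coefficient; the `lam` argument below (default 1) is an assumption of this sketch and is not something recoverable from the slide.

```python
import math

def inclusive_value(v_bus, v_train):
    """I_2 = ln(exp(V_3) + exp(V_4))."""
    return math.log(math.exp(v_bus) + math.exp(v_train))

def prob_bus_given_public(v_bus, v_train):
    """Second-stage logit: bus versus train, conditional on public transport."""
    return math.exp(v_bus) / (math.exp(v_bus) + math.exp(v_train))

def prob_public(v_car, v_public, v_bus, v_train, lam=1.0):
    """First-stage logit: Prob(Public) = exp(V_2 + lam*I_2) / (exp(V_2 + lam*I_2) + exp(V_1))."""
    upper = v_public + lam * inclusive_value(v_bus, v_train)
    return math.exp(upper) / (math.exp(upper) + math.exp(v_car))

# Hypothetical utilities.
v_car, v_public, v_bus, v_train = 0.4, 0.0, -0.2, 0.1

p_bus_nested = prob_public(v_car, v_public, v_bus, v_train) * prob_bus_given_public(v_bus, v_train)
p_bus_mnl = math.exp(v_bus) / (math.exp(v_car) + math.exp(v_bus) + math.exp(v_train))
print(round(p_bus_nested, 4), round(p_bus_mnl, 4))
```

With `lam = 1` and `v_public = 0` the two-stage probabilities coincide with the plain multinomial logit; other values of `lam` change how strongly the attractiveness of bus and train feeds into the car versus public transport decision.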

Example: Transport mode choice

• Asensio, J., Transport Mode Choice by Commuters to Barcelona’s CBD, Urban Studies, 39(10), 2002

• Travel mode for suburban commuters

• Sample of 1381 commuters from a travel survey

• Records mode of transport and other individual characteristics

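[Tree diagram: first-stage choice between Private car and Public transport; within Public transport, second-stage choice between Train and Bus]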

Example: Transport mode choice

• Asensio, J., Transport Mode Choice by Commuters to Barcelona’s CBD, Urban Studies, 39(10), 2002

– Some selected coefficients

| Variable | Parameter |
| --- | --- |
| Cost | -0.002 |
| Travel time by car | -0.054 |
| Travel time by public transport | -0.018 |
| Sex (car) | 0.889 |
| Sex (bus) | -1.001 |

• We don’t know the units of measurement, but how much more valuable is time saved by car than time saved by public transport?
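One way to approach this question is to compare ratios of coefficients, which do not depend on the scale of utility. The sketch below uses the coefficients listed above and assumes the usual interpretation that a time coefficient divided by the cost coefficient gives the implied value of a unit of time in cost units; the variable names are illustrative.

```python
# Selected coefficients as listed on the slide (Asensio, 2002).
cost = -0.002
time_car = -0.054
time_public = -0.018

# Relative disutility of a unit of travel time by car versus public transport.
relative_value = time_car / time_public

# Implied value of a unit of time, in cost units per time unit.
value_time_car = time_car / cost
value_time_public = time_public / cost

print(f"time by car is valued at {relative_value:.1f} times time by public transport")
print(f"value of time: car {value_time_car:.0f}, public transport {value_time_public:.0f}")
```

Since the units of cost and time are not known here, only the ratio between the two modes has a clear interpretation.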

Other discrete choice applications

• Firm location choices, e.g. Head, K. and T. Mayer (2004), Market Potential and the Location of Japanese Investment in the European Union, Review of Economics and Statistics, 86(4), 959-972 (seminar reading)

• School choice, e.g. Barrow, L. (2002) School choice through relocation: evidence from the Washington, D.C. area, Journal of Public Economics, 86, 155-189

• Migration destinations

• Residential choice

Aggregate choice models

Micro and aggregated choice models

• Micro level logit choice models often have aggregated equivalents

$$\mathrm{Prob}(i \text{ chooses } k) = \frac{\exp(V_k)}{\sum_j \exp(V_j)}$$

$$\ln \mathrm{Prob}(i \text{ chooses } k) = V_k - \ln \sum_j \exp(V_j)$$

• i.e. if you only have choice characteristics, you could use a choice-level regression of the proportion of individuals making each choice on the choice characteristics

$$\ln(n_k / N) = \alpha + \beta x_k + \varepsilon_k$$

• Obviously log(n_k) would work too (why?)

Micro and aggregated choice models

• In fact, a Poisson model on aggregated data gives exactly the same coefficient estimates as the conditional logit model

• Which is based on ML estimation of

$$\mathrm{Prob}(\text{number choosing } k = n_k) = \frac{\exp(-\lambda_k)\,\lambda_k^{n_k}}{n_k!}$$

$$\ln \lambda_k = \alpha + \beta x_k$$

• See Guimaraes et al., Review of Economics and Statistics (2003)

– though this equivalence was known before this ‘discovery’

• Here’s an example…

Data (295 i’s 3 j’s)

id choice d x

1 American 0 18.97627

1 Japan 0 7.542373

1 Europe 1 3.461017

2 American 1 18.97627

2 Japan 0 7.542373

2 Europe 0 3.461017

3 American 1 18.97627

3 Japan 0 7.542373

3 Europe 0 3.461017

4 American 0 18.97627

4 Japan 1 7.542373

4 Europe 0 3.461017

5 American 1 18.97627

5 Japan 0 7.542373

5 Europe 0 3.461017

Conditional logit

Conditional (fixed-effects) logistic regression Number of obs = 885

LR chi2(1) = 129.65

Prob > chi2 = 0.0000

Log likelihood = -259.26785 Pseudo R2 = 0.2000

------------------------------------------------------------------------------

choice | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

x | .0999331 .0091997 10.86 0.000 .081902 .1179642

------------------------------------------------------------------------------

Simpler data

choice n x p

American 192 18.97627 0.650847

Japan 64 7.542373 0.216949

Europe 39 3.461017 0.132203

Poisson

Poisson regression Number of obs = 3

LR chi2(1) = 129.65

Prob > chi2 = 0.0000

Log likelihood = -9.3973119 Pseudo R2 = 0.8734

------------------------------------------------------------------------------

n | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

x | .0999331 .0091997 10.86 0.000 .081902 .1179642

_cons | 3.364614 .1450806 23.19 0.000 3.080262 3.648967

------------------------------------------------------------------------------

OLS

. reg lnp x

Source | SS df MS Number of obs = 3

-------------+------------------------------ F( 1, 1) = 370.23

Model | 1.32738687 1 1.32738687 Prob > F = 0.0331

Residual | .003585331 1 .003585331 R-squared = 0.9973

-------------+------------------------------ Adj R-squared = 0.9946

Total | 1.3309722 2 .665486102 Root MSE = .05988

------------------------------------------------------------------------------

lnp | Coef. Std. Err. t P>|t| [95% Conf. Interval]

-------------+----------------------------------------------------------------

x | .101293 .0052644 19.24 0.033 .034403 .168183

_cons | -2.339238 .06295 -37.16 0.017 -3.139094 -1.539383

------------------------------------------------------------------------------
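The three sets of estimates above can be reproduced from the three aggregated rows alone. This is a minimal sketch in plain Python rather than Stata: it fits the conditional logit and the Poisson model by Newton-Raphson and the log-share regression by the OLS formulas. The function names are illustrative, and the starting values are a choice made for this sketch.

```python
import math

# Aggregated data from the "Simpler data" slide: counts n_k and characteristic x_k.
n = [192, 64, 39]
x = [18.97627, 7.542373, 3.461017]

def fit_conditional_logit(n, x, iterations=50):
    """Newton-Raphson for beta in Prob(k) = exp(beta*x_k) / sum_j exp(beta*x_j)."""
    beta, total = 0.0, sum(n)
    for _ in range(iterations):
        w = [math.exp(beta * xk) for xk in x]
        p = [wk / sum(w) for wk in w]
        mean_x = sum(pk * xk for pk, xk in zip(p, x))
        var_x = sum(pk * (xk - mean_x) ** 2 for pk, xk in zip(p, x))
        score = sum(nk * (xk - mean_x) for nk, xk in zip(n, x))
        beta += score / (total * var_x)  # gradient divided by negative second derivative
    return beta

def fit_poisson(n, x, iterations=50):
    """Newton-Raphson for ln(lambda_k) = a + b*x_k."""
    a, b = math.log(sum(n) / len(n)), 0.0  # start from the mean count
    for _ in range(iterations):
        lam = [math.exp(a + b * xk) for xk in x]
        g0 = sum(nk - lk for nk, lk in zip(n, lam))
        g1 = sum((nk - lk) * xk for nk, lk, xk in zip(n, lam, x))
        h00 = sum(lam)
        h01 = sum(lk * xk for lk, xk in zip(lam, x))
        h11 = sum(lk * xk * xk for lk, xk in zip(lam, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def fit_ols_log_share(n, x):
    """OLS of ln(n_k / N) on a constant and x_k; returns (intercept, slope)."""
    y = [math.log(nk / sum(n)) for nk in n]
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((xk - x_bar) * (yk - y_bar) for xk, yk in zip(x, y))
             / sum((xk - x_bar) ** 2 for xk in x))
    return y_bar - slope * x_bar, slope

beta_clogit = fit_conditional_logit(n, x)
a_poisson, b_poisson = fit_poisson(n, x)
a_ols, b_ols = fit_ols_log_share(n, x)
print(f"conditional logit: {beta_clogit:.7f}")
print(f"Poisson: {b_poisson:.7f} (constant {a_poisson:.6f})")
print(f"OLS on ln(n/N): {b_ols:.6f} (constant {a_ols:.6f})")
```

The conditional logit and Poisson slopes agree because the Poisson first-order conditions with a constant force the fitted counts to sum to N, which makes them identical to the logit conditions; the OLS slope is close but not identical.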

Aggregate v micro choice models

• Hence, there’s little point in using conditional logit if you only have choice-characteristics

• Conditional/multinomial logit is good if you have individual and group-level characteristics

• The aggregated OLS version gives rise to “Spatial interaction” models of flows between origins and destinations

• = Gravity models

• Widely applied (generally a-theoretically) in migration, trade and commuting applications

– e.g. See Head (2003) Gravity for beginners

Gravity/spatial interaction/migration/trade models

• Flow from place j to place k modelled as

$$\ln(n_{jk}) = \beta x_{jk} + \alpha_j + \delta_k + \varepsilon_{jk}$$

• Typically characteristics of destination and source include some measure of “attraction”, e.g. population mass (or “market potential” in trade models), wages (endogenous)

• And a measure of the cost of moving between place j and place k (e.g. log distance)

$$\ln(n_{jk}) = \theta \ln d_{jk} + \beta x_{jk} + \alpha_j + \delta_k + \varepsilon_{jk}$$

• Hence gravity – after Newton

$$\ln(\text{Force}_{jk}) = \ln \text{mass}_j + \ln \text{mass}_k - 2 \ln \text{dist}_{jk}$$

• Strong distance decay effects

– Typical elasticities -0.5 to -2.0

• Even for internet site visits!: see Blum and Goldfarb (2006) Journal of International Economics

• Trade literature has many examples

• Disdier and Head (2003) The Puzzling Persistence Of The Distance Effect On Bilateral Trade, Review of Economics and Statistics

– Finds mean distance elasticity of -0.9 from about 1500 estimates
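To get a feel for what these elasticities imply, the effect of doubling distance can be computed directly from the log-linear form. This is a minimal sketch holding all other terms in the gravity equation fixed; the function name is illustrative.

```python
def flow_ratio(distance_ratio, elasticity):
    """Factor by which flows change when distance is multiplied by distance_ratio."""
    return distance_ratio ** elasticity

for elasticity in (-0.5, -0.9, -2.0):
    ratio = flow_ratio(2.0, elasticity)
    print(f"elasticity {elasticity}: doubling distance multiplies flows by {ratio:.3f}")
```

With an elasticity of -0.9, doubling the distance roughly halves the predicted flow; with -2.0, as in Newton's law, it cuts it to a quarter.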

Gravity/spatial interaction/migration/trade models

Conclusion

• Generally possible to model ‘choices’ as discrete, or as flows

• Discrete choice models offer the advantage of

– Including micro-level (individual/firm) characteristics

– An underlying structural model (RUM)

• Aggregate flow models

– Simpler to compute

– No need for the distributional assumptions required by maximum likelihood (nonlinear) methods

– But can’t separate individual from aggregate factors
