
Urban Growth Shadows∗

David Cuberes

Clark University

Klaus Desmet

SMU, NBER and CEPR

Jordan Rappaport

Kansas City Fed

September 2019

Abstract

Does a location’s growth benefit or suffer from being geographically close to large economic

centers? Spatial proximity may lead to competition and hurt growth, but it may also generate

positive spillovers and enhance growth. Using data on U.S. counties and metro areas for the

period 1840-2017, we document this tradeoff between urban shadows and urban spillovers. Prox-

imity to large urban centers was negatively associated with growth between 1840 and 1920, and

positively associated with growth after 1920. Using a two-city spatial model that incorporates

commuting and moving costs, we account for this and other observed patterns in the data.

Keywords: urban shadows, agglomeration economies, spatial economics, urban systems, city

growth, United States, 1840-2016

JEL Codes: R12, N93

“Cities were like stars or planets, with gravitational fields that attracted people and

trade like miniature solar systems.”

— William Cronon, Nature’s Metropolis: Chicago and the Great West

1 Introduction

In his account of the U.S. westward expansion during the nineteenth century, Cronon (1991) writes

that land speculators on the frontier saw cities as having a gravitational pull akin to a law of

nature that inexorably attracted migrants from the hinterland to the new urban centers.1 This is

consistent with a view that smaller places “close” to larger cities fall under the “urban shadow” of

their neighbors, with increased competition for resources dampening their growth.2 However, there

is also an opposing view: the presence of nearby clusters of economic activity generates positive

agglomeration spillovers, benefiting the growth of neighboring smaller places.3

∗Cuberes: Department of Economics, Clark University. E-mail: dcuberes@clarku.edu; Desmet: Department

of Economics and Cox School of Business, Southern Methodist University. E-mail: kdesmet@smu.edu; Rappaport:

Federal Reserve Bank of Kansas City. E-mail: jordan.rappaport@kc.frb.org. We benefitted from presentations at

the Urban Economic Association Meetings (New York and Amsterdam), Boston Federal Reserve Bank, Philadelphia

Federal Reserve Bank, Princeton University and the University of North Dakota. The views expressed herein are

those of the authors and do not necessarily reflect the views of the Federal Reserve Bank of Kansas City or the Federal

Reserve System. We thank McKenzie Humann, Isabel Steffens and Anissa Khan for outstanding research assistance.

1This view of cities echoes that of central place theory (Christaller, 1933; Losch, 1940).

Is the proximity of a large urban center beneficial or harmful to a location’s economic

growth? This paper empirically and theoretically explores this question. Focusing on local pop-

ulation growth in the U.S. over almost two-hundred years, our empirical analysis identifies two

distinct time periods: between 1840 and 1920, urban shadows dominated, and since then, between

1920 and today, urban spillovers have taken over. One key force that is likely to have driven the

changing relative strength of urban shadows and urban spillovers is the evolution of intra- and

inter-city commuting costs. After providing an overview of changes in commuting costs over the

last two-hundred years, we develop a two-city spatial model that incorporates both commuting and

moving costs. We show that the long-run behavior of just one variable — commuting costs — can

account for many of the observed patterns in the data, including the changing relative strength of

urban shadows and urban spillovers over time and space.

Using U.S. county and metro population data from 1840 to 2017, our empirical analysis

documents the changing correlation of local population growth with the presence of nearby large

locations. In addition to establishing the important shift from urban shadows to urban spillovers

around the year 1920, we identify three additional stylized facts. First, since the turn of the

twenty-first century, there has been a decrease in the positive correlation between proximity to a

large urban center and growth, suggesting that urban spillovers have been weakening in the last

decades. Second, there is evidence of the geographic reach of large urban centers expanding. Urban

spillovers were very local between 1920 and 1940 and much more far-reaching between 2000 and

2017. Third, the greater the size of a nearby large location, the bigger the correlation with its

hinterland’s growth. The evidence therefore suggests that larger locations exerted stronger urban

shadows in the earlier time period, as well as stronger urban spillovers in the later time period.

We hypothesize that changes in commuting costs can account for these patterns. Before

showing this with the help of a simple spatial model, the paper documents the evolution of com-

muting costs in the U.S. over the same period 1840-2017. Beginning in the mid-nineteenth century,

a steady stream of transport innovations lowered the cost of commuting. The introduction of the

streetcar facilitated longer-distance commutes, giving rise to the first “streetcar suburbs”. This

decline in commuting costs accelerated dramatically during the inter-war and post-war periods.

The combination of the widespread adoption of the automobile and the building of the highway

2See, e.g., Krugman (1993), Fujita et al. (1999), Black and Henderson (2003) and Bosker and Buringh (2017).

3See, amongst others, Davis and Weinstein (2002), Rosenthal and Strange (2003), Glaeser and Kahn (2004), Hanson (2005) and Redding and Sturm (2008).

system connecting downtowns to hinterlands made it possible for people to live much further away

from work. People residing in smaller nearby locations no longer had the need to permanently move

to the larger cities to enjoy their productive benefits. Instead, they could reside in the hinterland,

and commute to the urban centers for work. There is some indication that the continued drop in

commuting costs has weakened in the last two decades. Several factors may have contributed to

this: improvements in commuting technology have petered out; road and rail infrastructure invest-

ment has stalled; traffic congestion has worsened and the commuting speed has slowed down; and

there has been a rise in the opportunity cost of time, due to longer work hours and the increase in

double-income families.

To understand the role of commuting costs in shaping the relative strength of urban shadows

and urban spillovers, we develop a simple spatial model of two cities. An individual has three

choices: she can work in the city where she initially resides, she can move to live and work in the

other city, or she can commute for work to the other city without changing her residence. We

then show how these choices change with the cost of commuting, the distance between cities and

their relative sizes. We find that as the cost of commuting gradually drops, individuals switch from

staying put in the smaller, less productive city, first to moving and later to commuting to the

larger, more productive city. Hence, a gradual drop in transport costs first hurts growth in the

smaller city, as it loses population to its larger neighbor, but a further drop eventually helps its

growth, as its population commutes to the nearby larger city. That is, the smaller city goes from

experiencing a negative urban shadow, to benefiting from positive urban spillovers.

The intuition for the non-monotonic relation between commuting costs and the growth of

the smaller city is straightforward. The initial drop in transport costs lowers the cost of living in

the larger city by more than in the smaller city, for the simple reason that intra-city commutes

are on average longer in the larger city than in the smaller city. This makes it more attractive for

residents of the smaller city to pay the one-time moving cost to relocate to the larger city. As in

Cronon (1991), the large city uses its gravitational force to pull in migrants from the hinterland: an

urban shadow. A further drop in transport costs continues to make the larger city more attractive

than the smaller city, but it also facilitates inter-city commuting, which was hitherto too costly.

This allows residents of the smaller city to work in the larger city without the need to move. The

small city benefits from the proximity of the large city: an urban spillover.

When analyzing the observed long-run evolution of commuting costs through the lens of our

model, we can account for the four main stylized facts uncovered in the data. Recall that commuting

costs experienced three distinct regimes: slow decline between 1840 and 1920, rapid decline between

1920 and 2000, and stagnation since then. When interpreted by the model, these are consistent

with urban shadows dominating in the early time period and urban spillovers dominating in the

later time period, with some weakening of spillovers in the last decades. The model also shows that

as commuting costs decrease, the geographic reach of urban areas expands. In addition, the theory

implies that an increase in the relative size of a large urban center strengthens the force it exerts

on its hinterland. These theoretical predictions have their empirical counterparts in the stylized

facts we highlight in our data analysis.

While the long-run evolution of just one variable – commuting costs – is able to capture

the rise and decline of urban shadows, undoubtedly other forces might have been at play as well.

One such force is technology spillovers: in order for smaller locations to benefit from nearby

larger locations, there may be no need to commute to that larger neighbor if technologies diffuse

through space. We show that under certain assumptions our model is observationally equivalent

to one without inter-city commuting but with technology spillovers. Another force that may drive

urban spillovers is market access: the smaller location may benefit from proximity to its larger

neighbor through trade. In the Appendix we consider such an alternative model, and show that it

can capture some of the main empirical findings.

This paper is related to the literature that explicitly considers the spatial location of one

place relative to another. Urban economics has until recently largely ignored the spatial distribution

of cities (Fujita et al., 1999). An important early exception is central place theory (Christaller, 1933;

Losch, 1940). In that theory the tradeoff between scale economies and transportation costs leads to

the emergence of a spatially organized hierarchy of locations of different sizes. A natural implication

of central place theory is that the presence of large urban centers may enhance population growth in

nearby agglomerations through positive spillover effects, but it may also limit such growth through

competition among cities (Krugman, 1993; Tabuchi and Thisse, 2011).4

Most empirical studies that explore the effect of large agglomerations on other locations

focus on the twentieth century. They tend to find positive growth effects from proximity to urban

centers. Using U.S. data, Partridge et al. (2009) uncover a positive impact of large urban clusters

on nearby smaller places.5 Looking at the post-war period, Rappaport (2005) finds evidence of the

populations of cities and suburbs moving together. Dobkins and Ioannides (2001) also conclude

that there has been a positive effect of neighboring locations on growth since the 1950s. Liu et

al. (2011) analyze the case of China, and likewise show that the impact of a high-tier city on its

surrounding areas is positive.

A few papers have looked at earlier time periods and find evidence of urban growth shadows.

In pre-industrial Europe, Bosker and Buringh (2017) show that the net effect of large neighbors was

negative. Consistent with this, Rauch (2014) documents that historically larger European cities

4More recently, there has been a growing interest in incorporating ordered space into economic geography models. This is particularly true of quantitative spatial models that aim to bring the theory to the data in meaningful ways (Desmet and Rossi-Hansberg, 2014; Allen and Arkolakis, 2014; Desmet et al., 2018).

5In an earlier study for the time period 1950-2000, the same authors find negative effects from proximity to higher-tiered places.

have been surrounded by larger hinterland areas. Most closely related to our work is Beltran et al.

(2017) who use data on Spanish municipalities for the time period 1800-2000. They find that the

influence of neighboring cities was negative between 1800 and 1950, to then become increasingly

positive from 1950 onwards. Our work focuses on the U.S., a country where the urbanization

process is likely to have differed from the Spanish experience for a variety of reasons: it was much

less settled in the nineteenth century, modern-day mobility across cities and regions is greater, and

the adoption of the automobile was swifter. In addition, our paper offers a theoretical framework

that allows us to interpret the switch from urban shadows to urban spillovers by relating it to the

secular decline in transport costs.

The rest of the paper is organized as follows. Section 2 presents the empirical findings on the

changes in urban growth shadows and spillovers over the period 1840-2017. Section 3 documents

the evolution of commuting costs over the same period 1840-2017. Section 4 proposes a conceptual

framework that relates commuting costs to urban shadows and spillovers, and it shows that the

theoretical predictions are consistent with the main patterns in the data. Section 5 concludes.

2 Urban Growth Shadows and Spillovers: 1840 to 2017

This section documents how the correlation of local population growth in the U.S. with the presence

of nearby large locations has evolved over the period 1840-2017. In doing so, it aims to explore

whether urban shadows or urban spillovers were more prominent in different time periods. It also

elicits a number of additional stylized facts.

2.1 Data

We use county population data from the Census Bureau spanning the period 1840 to 2017. With

the exception of the last period, we focus on successive twenty-year time frames: 1840-1860, 1860-

1880, ..., 1980-2000, 2000-2017. In constructing the dataset, we had to resolve two main issues:

how to deal with changing county borders and how to delineate metro areas over time. In what

follows we limit ourselves to a brief discussion, and point the interested reader to Desmet and

Rappaport (2017) for more details. To get consistent county borders, we use a “county longitudinal

template” augmented by a map guide to decennial censuses, and combine counties as necessary to

create geographically-consistent county equivalents over successive twenty-year-periods (Horan and

Hargis, 1995; Thorndale and Dollarhide, 1987). For example, if county A splits into counties A1 and

A2 in 1850, we combine counties A1 and A2 to measure population growth of county A between

1840 and 1860. More generally, for growth between 1840 and 1860, we use geographic borders

from 1840; for growth between 1860 and 1880, we use geographic borders from 1860; and so on.6

6This description applies to the most common case of counties splitting over time. If counties merge between, say, 1860 and 1880, then we would use geographic borders from 1880.

This methodology gives us a separate dataset for each twenty-year period we study, as well as for

2000-2017.

When different counties form part of the same metropolitan area, we do not want to consider

these counties as different locations. We therefore combine counties into metro areas, when and

where we can delineate them. Our analysis is thus based on a hybrid of metropolitan areas and non-

metropolitan counties. For 1940 and earlier, we merge counties to form metropolitan areas applying

criteria promulgated by the Office of Management and Budget (OMB) in 1950 to population and

economic conditions at the start of each twenty-year period (Gardner, 1999). For 1960 and later,

we use the official delineations promulgated by OMB after each decennial census. As with the

geographically-consistent counties, growth over any period is measured using the geographic borders

of the initial year. The number of locations in our datasets increases rapidly from 862 for the

period 1840-1860 to 2,370 for 1880-1900 and then more slowly to a maximum of 2,982 for 1940-

1960, reflecting both the westward movement of the U.S. frontier and the splitting of geographically

large counties as they became more densely settled into smaller ones. Thereafter, the number of

locations steadily declines as more and more counties were absorbed into metropolitan areas. Our

dataset for 2000-2017 has 2,369 locations.

The distribution of surrounding locations by size and distance systematically varies across

different parts of the country, which in many periods had different average growth rates. For

example, locations near the U.S. frontier during the nineteenth century tended to have few large

neighbors and high average growth. To avoid an omitted variable bias, we extensively control

for regional variation in order to isolate the correlation of growth with measures of surrounding

locations. A first set of 15 control variables are the terms from the third-order polynomial of latitude

and longitude, $(1 + lat + lat^2 + lat^3)(1 + long + long^2 + long^3)$. A second set of control variables are

indicators for eight of the nine U.S. census divisions. A third set of ten control variables are linear

and quadratic terms of average low temperature in January, average high temperature in July,

average daily humidity in July, average annual rainfall, and average number of days on which it

rains (Rappaport, 2007). A fourth set of ten control variables are indicators of whether a location’s

geographic centroid is within 80 kilometers of the coast and of a natural harbor along each of the

north Atlantic, south Atlantic, Gulf of Mexico, Pacific, and Great Lakes (Rappaport and Sachs,

2003). A fifth set of two control variables are indicators of whether a location’s geographic centroid

is within 40 kilometers of a river on which there was navigation in 1890 and whether it is in addition

located within 80 kilometers of an ocean coast (Rappaport and Sachs, 2003). A final set of two

variables is a quadratic specification of hilliness, measured as the standard deviation of altitude

within a location normalized by the location’s land area. These six sets total 47 variables, which

we include in all regressions beginning with the 1860 cross section. A handful of them are dropped

for the 1840 cross section due to lack of variation (e.g., there were no locations in the Mountain

and Pacific census regions).
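As an illustration of the first control set, the following minimal Python sketch expands the product of the two cubic polynomials and drops the constant, which leaves the 15 terms mentioned above. The function name `latlong_poly_terms` and the dictionary `CONTROL_SET_SIZES` are hypothetical names introduced here for illustration only; they are not taken from the authors' code. The set sizes are the ones stated in the text.

```python
from itertools import product


def latlong_poly_terms(lat: float, lon: float) -> dict:
    """Non-constant terms of (1 + lat + lat^2 + lat^3)(1 + long + long^2 + long^3)."""
    terms = {}
    for i, j in product(range(4), range(4)):
        if i == 0 and j == 0:
            continue  # the constant term is absorbed by the regression intercept
        terms[f"lat^{i}*long^{j}"] = (lat ** i) * (lon ** j)
    return terms


# Sizes of the six control sets as described in the text (labels are illustrative).
CONTROL_SET_SIZES = {
    "lat_long_polynomial": 15,
    "census_division_indicators": 8,
    "weather": 10,
    "coast_and_harbor": 10,
    "navigable_river": 2,
    "hilliness": 2,
}

print(len(latlong_poly_terms(2.0, 3.0)))   # number of polynomial terms
print(sum(CONTROL_SET_SIZES.values()))     # total number of geographic controls
```

The 4 x 4 expansion yields 16 products, and removing the constant gives 15, so the six sets sum to the 47 geographic controls used in the regressions.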

2.2 Baseline Specification

Our main specification regresses population growth over successive twenty-year intervals on the

presence of surrounding locations at specified distances with population above specified thresholds.

Let $d_{\ell k}$ denote the distance between locations $\ell$ and $k$, measured using a straight-line

approximation between their geographic centroids. Let $d \in \{d_1, d_2, ..., d_D\}$ denote strictly increasing

specified distances, e.g., {50km, 100km, ..., 300km}. Finally, let $L_k$ and $L$ respectively denote the

population of location $k$ and a specified population threshold for considering a neighboring location

to be large. For each ordered pair of locations, we construct an indicator variable, $I^{L,d}_{\ell k}$, describing

whether location $k$ has population weakly above threshold $L$ and distance from location $\ell$ weakly

less than $d$:

$$I^{L,d}_{\ell k} \equiv I(L_k, d_{\ell k}; L, d) = \begin{cases} 1 & : L_k \geq L \ \&\ d_{\ell k} \leq d \\ 0 & : \text{otherwise} \end{cases}$$

For each location $\ell$, we then construct a set of indicators, one for each specified distance, describing

if there is at least one location, $k \neq \ell$, within that distance of location $\ell$, that has population weakly

above $L$ and no such location within a smaller distance of $\ell$:

$$I^{L,d}_{\ell} = \begin{cases} 1 & : d = d_1 \ \&\ \sum_{k \neq \ell} I^{L,d}_{\ell k} \geq 1 \\ 1 & : d \in \{d_2, ..., d_D\} \ \&\ \left( \prod_{d' < d} \left(1 - I^{L,d'}_{\ell}\right) \right) \left( \sum_{k \neq \ell} I^{L,d}_{\ell k} \right) \geq 1 \\ 0 & : \text{otherwise} \end{cases}$$
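The construction of these indicators can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `nearest_large_neighbor_indicators` and its arguments are hypothetical, distances are supplied as a full matrix, and the toy locations in the usage example are invented. The sketch relies on the observation that, with strictly increasing cutoffs, "a large neighbor within this distance and none within a smaller one" is equivalent to the nearest large neighbor falling in that distance band.

```python
def nearest_large_neighbor_indicators(dist, pop, threshold, cutoffs):
    """
    dist:      n x n nested list, dist[l][k] = centroid distance between l and k
    pop:       list of n populations
    threshold: population at or above which a neighbor counts as large
    cutoffs:   strictly increasing distances, e.g. [50, 100, 150]
    Returns an n x D list of 0/1 rows; entry j is 1 when location l has a large
    neighbor within cutoffs[j] and none within any smaller cutoff.
    """
    n = len(pop)
    rows = []
    for l in range(n):
        # Distance to the closest large location other than l itself, if any.
        nearest = min(
            (dist[l][k] for k in range(n) if k != l and pop[k] >= threshold),
            default=None,
        )
        row = [0] * len(cutoffs)
        if nearest is not None:
            for j, d in enumerate(cutoffs):
                if nearest <= d:
                    row[j] = 1  # first (smallest) cutoff that contains the neighbor
                    break
        rows.append(row)
    return rows


# Toy example: four locations on a line (positions in km), one of them large.
positions = [0, 30, 120, 400]
distances = [[abs(a - b) for b in positions] for a in positions]
populations = [500, 20, 10, 10]
print(nearest_large_neighbor_indicators(distances, populations, 100, [50, 100, 150]))
```

By construction at most one indicator per location equals one, so locations with all zeros form the excluded category.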

For each 20-year period from 1840 to 2000 and for the 17-year period from 2000 to 2017, we

regress annual average population growth, $g_\ell$, on the set of indicators, $\mathbf{I}^{L}_{\ell} = [I^{L,d_1}_{\ell}, I^{L,d_2}_{\ell}, ..., I^{L,d_D}_{\ell}]$,

along with a fifth-order polynomial of a location’s initial population, $\mathbf{L}_\ell = [\log(L_\ell), (\log(L_\ell))^2, ..., (\log(L_\ell))^5]$.

The latter absorbs the non-monotonic relationship between growth and size throughout

most of U.S. history (Michaels et al., 2012; Desmet and Rappaport, 2017). It is necessary to

include these terms because the size distribution of neighbors closely depends on a location’s own

size. For example, very small locations rarely have a very large neighbor. As described in the

previous subsection, we also extensively control for geographic attributes with 47 variables, $\mathbf{x}_\ell$. We

thus specify a data generating process with reduced form

$$g_\ell = \mathbf{I}^{L}_{\ell} \beta + \mathbf{L}_\ell \gamma + \mathbf{x}_\ell \delta + \varepsilon_\ell. \qquad (1)$$
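To make the structure of this regression concrete, here is a minimal ordinary least squares sketch in plain Python. It is an illustration under several assumptions and not the authors' estimation code: the names `solve_linear_system` and `fit_growth_regression` are hypothetical, the normal equations are solved by Gaussian elimination, and the standard errors robust to spatial correlation used in the paper are not computed. Log population is centered and scaled before taking powers; together with the intercept this spans the same polynomial space, so the coefficients on the neighbor indicators are unaffected.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_growth_regression(growth, indicators, init_pop, controls, order=5):
    """
    OLS of growth on [1, neighbor indicators, polynomial in log initial
    population, geographic controls]. Returns the indicator coefficients (beta).
    """
    logs = [math.log(p) for p in init_pop]
    mean = sum(logs) / len(logs)
    sd = math.sqrt(sum((z - mean) ** 2 for z in logs) / len(logs))
    rows = []
    for ind, z, x in zip(indicators, logs, controls):
        zs = (z - mean) / sd  # standardized log population (numerical stability)
        rows.append([1.0] + list(ind) + [zs ** p for p in range(1, order + 1)] + list(x))
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * g for r, g in zip(rows, growth)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    n_ind = len(indicators[0])
    return coef[1:1 + n_ind]
```

Each returned coefficient is the difference in predicted growth between locations in the corresponding distance band and locations in the excluded category, holding initial size and the geographic controls fixed.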

The partial correlation between the growth of a location and the presence of larger neighbors

unsurprisingly depends both on the threshold population above which we consider neighbors to be

large, $L$, and the size of the location itself, $L_\ell$. Because the size distribution of U.S. locations

changed continually throughout U.S. history, we use relative measures of population both to set

year-specific thresholds for considering a location large and to focus the analysis on the growth of

locations that are not large. Specifically, we consider locations to be at least “moderately large” in

a given year if their population is at or above the 95th percentile of the distribution across locations

in that year. Analogously, we consider locations to be “very large” in a given year if

their population is at or above the 99th percentile in that year. Reciprocally, we exclude locations

from our baseline regression analysis that have population above the 80th percentile. Our baseline

regressions thus estimate the partial correlations between the growth rate of small and medium

locations–those with population in the first through fourth quintiles–with the presence of nearby

locations in the top portion of the fifth quintile. We also discuss how these correlations differ across

sub-samples of locations by size.
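The year-specific size classification can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classify_by_size` and the dictionary keys are hypothetical, and the percentile convention (linear interpolation via the standard library's `statistics.quantiles` with `method="inclusive"`) is an assumption, since the text does not say how percentiles are computed.

```python
from statistics import quantiles


def classify_by_size(pop):
    """
    Label each location in one cross section using relative size thresholds:
      in_sample         population not above the 80th percentile
      moderately_large  population at or above the 95th percentile
      very_large        population at or above the 99th percentile
    """
    cuts = quantiles(pop, n=100, method="inclusive")  # cuts[i - 1] is the i-th percentile
    p80, p95, p99 = cuts[79], cuts[94], cuts[98]
    return [
        {
            "in_sample": p <= p80,
            "moderately_large": p >= p95,
            "very_large": p >= p99,
        }
        for p in pop
    ]


# Toy cross section with populations 1, 2, ..., 100.
labels = classify_by_size(list(range(1, 101)))
print(sum(lab["in_sample"] for lab in labels))
print(sum(lab["moderately_large"] for lab in labels))
print(sum(lab["very_large"] for lab in labels))
```

Because the thresholds are recomputed for every cross section, a location that counts as moderately large in 1840 need not do so in a later year.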

Partial correlations sensitively depend on the maximum distance, $d_D$, for which an indicator

is included. To understand this, recognize that indicators of a large neighbor within distance

intervals demarcated by $\{d_1, d_2, ..., d_D\}$, together with the excluded interval, $d_{\ell k} > d_D$, make up a

disjoint set that fully partitions the observations in a regression. For a given maximum population

threshold, coefficients on each of the included indicators estimate the difference of predicted growth

for observations with a corresponding positive value and the predicted growth of observations

with a positive value of the excluded category. Estimated coefficients thus depend closely on the

composition of the excluded category.

It is important to specify a maximum distance that is not too high. Failing to do so

leaves few observations with a positive value for the excluded category. For example, in almost

all years for which we run regressions, less than 15 percent of observations have no moderately

large neighbor (one with population above the 95th percentile) within 300 kilometers. As these

relatively isolated locations tended to grow slowly, a regression of growth on indicators for each of

the distance intervals out to 300km must yield some positive coefficients. Hence it is important to

choose a maximum distance that is not too large.

Conversely, it is also important to choose a maximum distance that is not too low. Many of

the regressions estimate coefficients on the 50km-100km and 100km-150km indicators that are the

same sign and similar in magnitude to their estimates on the 0km-50km indicator. For the 95th

percentile and 99th percentile thresholds for large size, the number of observations with positive

indicators for two further-away intervals far exceeds the number with positive indicators for the

closest interval. In many cases, the majority of observations have positive values in the combined

50km-150km range. In consequence, regressions that include an indicator only for the 0km to

50km distance may not find much of a difference in predicted growth compared to locations in the

excluded category.

To balance these two considerations, we specify our regressions to include indicators for

50km intervals out to the maximum distance that leaves at least 50 percent of observations in

the excluded category. For example, 49 percent of the observations in the 1840 regression have a

neighbor that is at least moderately large within 150km and 67 percent have one within 200km

and so we use presence indicators for 0km to 50km, 50km to 100km, and 100km to 150km. Higher

thresholds for considering a location to be large require a maximum distance that is further away.

For the 1840 regression on the presence of neighbors that are very large (ones with population

above 99th percentile), our rule implies including presence indicators out to a maximum distance

of 300km.
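The rule for picking the maximum distance can be written as a short Python sketch. The function name `choose_max_distance` and its arguments are hypothetical and introduced only for illustration; the two shares in the usage example are the 1840 figures quoted above for moderately large neighbors.

```python
def choose_max_distance(share_within, min_excluded_share=0.5):
    """
    share_within: dict mapping a cutoff distance (km) to the share of
                  observations with a large neighbor within that distance.
    Returns the largest cutoff that still leaves at least `min_excluded_share`
    of observations in the excluded category, or None if no cutoff qualifies.
    """
    admissible = [
        d for d, share in share_within.items()
        if 1.0 - share >= min_excluded_share
    ]
    return max(admissible) if admissible else None


# 1840 cross section, moderately large neighbors: 49 percent of observations
# have one within 150km and 67 percent have one within 200km.
print(choose_max_distance({150: 0.49, 200: 0.67}))
```

With these shares the 150km cutoff leaves 51 percent of observations excluded while the 200km cutoff leaves only 33 percent, so the indicators stop at 150km.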

We also require that all distance intervals have at least 20 observations with indicators that

are positive. In practice, this pertains only to the 0km-50km interval, which in some years has

positive indicators for only a handful of observations. In these cases, we use a nearest interval that

ranges from 0km to 100km.

2.3 Two Distinct Subperiods

This subsection explores the existence of urban growth shadows and urban growth spillovers be-

tween 1840 and 2017. When estimating the correlation of the population growth of small and

medium locations with the presence of moderately large locations (population at or above the 95th

percentile), Table 1 shows two clearly distinct periods: a negative regime from 1840 to 1920, and

a positive regime from 1920 through 2017.

The predicted population growth of small and medium locations was slower during each

of the four 20-year periods from 1840 to 1920 if they had a moderately large neighbor. In 1840,

the initial population of the small and medium locations ranged from 133 to 24,000 and the initial

population of the 44 moderately large locations ranged from 62,000 to 435,000. Small and medium

locations that had a moderately large neighbor within 50km had predicted annual population

growth from 1840 to 1860 that was slower by 0.63 percentage points compared to the excluded

locations, which did not have a moderately large neighbor within 150 kilometers. Locations whose

nearest moderately large neighbor was between 50km to 100km away had predicted annual growth

that was slower by 0.67 percentage points compared to excluded locations; and those whose nearest

moderately large neighbor was 100km to 150km away had predicted annual growth that was

slower by 0.30 percentage points. The corresponding negative coefficients statistically differ from

zero at the 0.05 or 0.10 levels (respectively, dark and light blue typeface). Estimated negative

coefficients are similar in magnitude for the 1860-1880 regression and a bit larger in magnitude

for the 1880-1900 regression. Predicted growth from 1900 to 1920 was also slower for small and

medium locations with a moderately large neighbor, although the magnitude of the difference

compared to not having a moderately large neighbor was considerably less than during the earlier

periods. Throughout the negative regime, the marginal share of the variation in growth accounted

for by the indicators for a moderately large neighbor (the increase in $R^2$ compared to using only

the control variables) is slight, ranging from 0.2 to 0.8 percentage points.7

| neighbor w/ pop ≥ 95th percentile @ distance: | (1) 1840-1860 | (2) 1860-1880 | (3) 1880-1900 | (4) 1900-1920 | (5) 1920-1940 | (6) 1940-1960 | (7) 1960-1980 | (8) 1980-2000 | (9) 2000-2017 |
|---|---|---|---|---|---|---|---|---|---|
| 1 to 50 km | -0.63 | -0.77 | -1.10 | -0.04 | 0.28 | 1.11 | 1.19 | 0.50 | 0.41 |
| | (0.35) | (0.28) | (0.21) | (0.14) | (0.12) | (0.15) | (0.19) | (0.14) | (0.20) |
| 50 to 100 km | -0.67 | -0.41 | -0.86 | -0.28 | -0.07 | 0.16 | 0.21 | 0.26 | 0.11 |
| | (0.24) | (0.20) | (0.18) | (0.09) | (0.07) | (0.08) | (0.11) | (0.09) | (0.09) |
| 100 to 150 km | -0.30 | -0.11 | -0.63 | 0.03 | | | | | |
| | (0.12) | (0.17) | (0.15) | (0.07) | | | | | |
| N (quints 1 to 4) | 691 | 1,328 | 1,844 | 2,110 | 2,357 | 2,387 | 2,283 | 2,104 | 1,895 |
| control vars | 48 | 52 | 52 | 52 | 52 | 52 | 52 | 52 | 52 |
| R2 | 0.857 | 0.787 | 0.726 | 0.533 | 0.407 | 0.366 | 0.391 | 0.427 | 0.318 |
| Adj R2 | 0.846 | 0.778 | 0.718 | 0.521 | 0.393 | 0.352 | 0.376 | 0.412 | 0.298 |
| R2 - R2 controls | 0.002 | 0.002 | 0.008 | 0.002 | 0.003 | 0.028 | 0.033 | 0.010 | 0.004 |
| pop ≥ 95th pctile | 62ths-425ths | 49ths-1.4mn | 51ths-2.5mn | 65ths-4.9mn | 80ths-8.5mn | 100ths-11.7mn | 172ths-14.2mn | 250ths-14.5mn | 377ths-18.3mn |
| N, ≥ 95th pctile | 44 | 86 | 119 | 133 | 148 | 150 | 143 | 132 | 119 |
| pop of obs | 133-24ths | 103-23ths | 100-26ths | 104-30ths | 137-32ths | 285-36ths | 208-42ths | 408-49ths | 356-65ths |
| share of pop | 0.38 | 0.39 | 0.42 | 0.39 | 0.34 | 0.29 | 0.20 | 0.16 | 0.14 |

Table 1: Population Growth and the Presence of a Moderately Large Neighbor. All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations

have population in the first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based

on Conley (1999). Dark and light blue fonts respectively indicate a negative coefficient that statistically differs from

zero at the 0.05 and 0.10 levels. Dark and light red fonts respectively indicate a positive coefficient that statistically

differs from zero at the 0.05 and 0.10 levels.

The remaining columns of Table 1 describe the positive regime between population growth

and the presence of a moderately large neighbor. For each of the five periods from 1920 to 2017,

predicted population growth was faster for small and medium locations with a moderately large

neighbor within 50km. For each of the three periods from 1940 to 2000, predicted growth was

also slightly faster for locations whose nearest moderately large neighbor was located between

50km and 100km away. The corresponding positive coefficients statistically differ from zero at

the 0.05 and 0.10 levels (respectively, dark and light red typeface). The magnitude of the faster

7In Table 1 we refer to this as “R2 - R2 controls”, i.e., the difference between the R2 of our regression and the R2

of a specification that only includes the controls (and hence leaves out the neighbor dummies).

predicted growth is relatively modest from 1920 to 1940, when suburbanization was just getting

underway. Then, both from 1940-1960 and from 1960-1980, the presence of a moderately large

neighbor within 50km predicted population growth that was higher by more than 1 percentage

point (statistically significant at the 0.01 level). Smaller-magnitude coefficients seem to suggest

that suburbanization waned from 1980 to 2000. But as we will describe in the next subsection,

this is somewhat misleading, because it reflects many rapidly suburbanizing peripheral counties

having been reclassified as belonging to a metropolitan area following the 1970 and 1980 decennial

censuses. The marginal share of the variation accounted for by the indicators of a moderately large

neighbor is about 3 percentage points for the periods beginning in 1940 and 1960, but substantially

lower for the other periods during the positive regime.

If we interpret the slower growth of locations with large neighbors as evidence of urban

shadows, and the faster growth of those same locations as evidence of urban spillovers, then we can

summarize our findings in Table 1 as follows:

Stylized Fact 1: Urban Shadows and Spillovers. Between 1840 and 1920 urban growth shadows

dominated the U.S. economic geography, with locations in the vicinity of large places growing

relatively slower, whereas between 1920 and 2017 urban growth spillovers dominated, with locations

in the vicinity of large places growing relatively faster.

This division into a negative regime followed by a positive regime robustly holds for alternative

threshold levels of largeness and widely varying specifications.

2.4 Recent Weakening of Urban Spillovers

In this subsection we analyze whether there has effectively been a weakening in urban spillovers

since the 1980s, as suggested by some of the results reported above. To be precise, Table 1 showed

that the expected growth boost from having a top-5 percent neighbor in the 1-to-50 kilometer

range dropped by more than half, from 1.19 percentage points for the period 1960-1980 to 0.50

percentage points for the period 1980-2000. That fall may be partly explained by changing metro

delineations: if a fast-growing location in one time period is also more likely to get absorbed into

a metro area by the next time period, then this may cause a decline in growth of the locations in

the 1-to-50 kilometer range. More generally, as re-delineated metro areas include more outlying

counties, the continuing filling in of these counties is implicitly accounted for as migration within

a location rather than between locations. This makes comparisons across periods more difficult.8

8In addition, unobserved characteristics are likely to distinguish which surrounding counties at a given distance are absorbed into a metro, introducing a selection bias in making comparisons across periods. The re-delineations also leave fewer locations with nearby large neighbors, reflecting that metropolitan radiuses are becoming longer. The changing delineation of metros also affects metropolitan centroids, which are constructed as the population-weighted mean of constituent counties’ centroids. Hence it also affects distances to large neighbors, which are measured between centroids.
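The centroid mechanism described in this footnote can be sketched as follows. The county coordinates and populations are hypothetical, averaging latitude and longitude directly is a simplifying assumption, and the haversine formula is an assumed distance measure; the source does not state which distance formula is used.

```python
import math

def weighted_centroid(counties):
    """Population-weighted mean of county centroids.

    counties: list of (lat_deg, lon_deg, population) tuples (hypothetical values).
    """
    total = sum(pop for _, _, pop in counties)
    lat = sum(la * pop for la, _, pop in counties) / total
    lon = sum(lo * pop for _, lo, pop in counties) / total
    return lat, lon

def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

# A metro before and after absorbing one more outlying county to the west.
metro_before = [(40.0, -75.0, 900_000), (40.0, -76.0, 100_000)]
metro_after = metro_before + [(40.0, -77.0, 100_000)]
small_location = (40.0, -78.0)

centroid_before = weighted_centroid(metro_before)
centroid_after = weighted_centroid(metro_after)
dist_before = haversine_km(small_location, centroid_before)
dist_after = haversine_km(small_location, centroid_after)
print(centroid_before, centroid_after)
print(round(dist_before, 1), round(dist_after, 1))
```

In this hypothetical case the re-delineation pulls the metro centroid toward the small location, so the measured distance falls even though nothing has moved on the ground.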

neighbor w/pop ≥ 95th
Columns: (1) 1960-1980; (2) 1980-2000; (3) 2000-2017

1 to 50 km: 1.19 (0.19); 1.43 (0.30); 0.60 (0.12)
50 to 100 km: 0.21 (0.11); 0.59 (0.23); 0.14 (0.06)
100 to 150 km: 0.24 (0.20)
150 to 200 km: 0.13 (0.12)
200 to 250 km:
N (quints 1 to 4): 2,283; 2,282; 2,283
control vars: 52; 52; 52
R2: 0.391; 0.440; 0.326
Adj R2: 0.376; 0.426; 0.310
R2-R2 controls: 0.033; 0.047; 0.019
pop≥thresh: 172ths-14.2mn; 246ths-14.4mn; 306ths-15.7mn
N, ≥thresh: 143; 143; 143
408-54ths; 67-67ths

Table 2: Population Growth and Large Neighbors, 1960 Metropolitan Borders. Metropolitan areas are delineated using the OMB standards following the 1960 decennial census. All regressions

include a constant and control for initial population and 52 geographic covariates. Observations have population in the

first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based on Conley (1999). Dark

blue and light blue fonts respectively indicate a negative coefficient that statistically differs from zero at the 0.05 and

0.10 levels. Dark and light red fonts respectively indicate a positive coefficient that statistically differs from zero at

the 0.05 and 0.10 levels.

To assess the plausibility of this concern, Table 2 reports regressions for the three periods

from 1960 to 2017 using metropolitan area borders established following the 1960 decennial census.

Consistent with the possible bias we described, when keeping borders constant, the positive

relationship between growth and the presence of large neighbors peaked 20 years later, during the

period from 1980 to 2000, rather than during the period 1960 to 1980. In other words, we still see a

weakening relation between growth and proximity to large locations, but only after the turn of the

twenty-first century. This suggests that the transition of metropolitan areas to a larger geographic

footprint may be winding down.9 We summarize these findings as follows:

Stylized Fact 2: Recent Weakening of Urban Spillovers. Urban growth spillovers have

been weakening since the turn of the 21st century. In particular, during the period 2000-2017 urban

growth spillovers are less pronounced than during the periods 1960-1980 and 1980-2000.

2.5 Geographic Span

This subsection explores how the geographic span of urban shadows and urban spillovers has

changed over time. When focusing on moderately large neighbors (at or above the 95th percentile),

as we have done so far, there are few observations with a positive value for the excluded category at

far-away distances. This limits the maximum geographic distance we are able to consider, making it

difficult to analyze how the geographic span of urban shadows and spillovers evolves over time. To

get around this issue, Table 3 considers the presence of very large neighbors (at or above the 99th

percentile), allowing us to consider farther-away distances while maintaining enough observations

with a positive value for the excluded category. As an example, for the period 1980-2000 we are

able to include neighboring locations all the way to 250km, whereas for the same time period in

Table 1 we only considered neighbors within a range of 100km.

Before discussing the spatial reach of urban shadows and spillovers, we show that increasing

the size threshold from the 95th to the 99th percentile does not qualitatively change what we

concluded before. There continues to be a negative regime and a positive regime, with the year

1920 separating the two. The magnitudes of the coefficients of course differ, especially during the

positive regime, when having a neighbor above the 99th percentile rather than above the 95th

percentile was associated with a considerably greater boost in population growth (Table 3). For

example, predicted growth from 1960 to 1980 was 3.2 percentage points per year higher for small

and medium locations that had a very large neighbor within 50km compared to the excluded

locations, those whose nearest very large neighbor was at least 250km away.

We now analyze how the geographic reach of very large neighbors changes over time. During

the negative regime, when comparing 1900-1920 to 1880-1900, the drop in growth from having a very

large neighbor weakens at shorter distances below 50km but strengthens at farther-away distances

above 150km.10 During the positive regime, the growth boost of having a very large neighbor

starts off within a rather narrow 50km radius for the period 1920-1940, but then expands by 50km

9Regressing growth from 1960 to 1980 using metropolitan borders from 1940 modestly increases estimated coefficients on indicators of moderately large neighbors (compared to using 1960 borders) and modestly decreases estimated coefficients on indicators of very large neighbors. Regressing growth from 1940 to 1960 using metropolitan borders from 1920 modestly increases estimated coefficients on indicators of both moderately large and very large neighbors. Regardless of borders, the strength of suburbanization from 1940 to 1960 as estimated by the regressions was similar to the strength from 1960 to 1980.

10Comparing to earlier time periods is more complex, because of differences in the maximum distance.

neighbor w/pop ≥ 99th
Columns: (1) 1840-1860; (2) 1860-1880; (3) 1880-1900; (4) 1900-1920; (5) 1920-1940; (6) 1940-1960; (7) 1960-1980; (8) 1980-2000; (9) 2000-2017

1 to 50 km: -1.01 (0.37); -0.79 (0.42); -0.28 (0.28); 0.73 (0.29); 2.41 (0.26); 3.20 (0.61)
50 to 100 km†: 0.00 (0.46); -0.67 (0.30); -0.87 (0.27); -0.64 (0.19); 0.22 (0.19); 0.55 (0.15); 0.97 (0.25); 1.03 (0.19); 0.68 (0.15)
100 to 150 km: 0.27 (0.47); -0.60 (0.22); -0.45 (0.24); -0.47 (0.18); 0.08 (0.18); 0.14 (0.13); 0.30 (0.14); 0.51 (0.13); 0.39 (0.10)
150 to 200 km: 0.49 (0.38); -0.77 (0.18); -0.15 (0.23); -0.52 (0.17); 0.15 (0.16); 0.02 (0.13); 0.13 (0.12); 0.26 (0.10); 0.28 (0.08)
200 to 250 km: -0.01 (0.24); -0.02 (0.17); -0.19 (0.16); 0.18 (0.15); -0.12 (0.13); 0.03 (0.11); 0.10 (0.08); 0.22 (0.07)
250 to 300 km: 0.17 (0.18); -0.02 (0.19); -0.13 (0.19); 0.16 (0.13); 0.10 (0.07)
N (quints 1 to 4): 691; 1,328; 1,844; 2,110; 2,357; 2,387; 2,283; 2,104; 1,895
control vars: 48; 52; 52; 52; 52; 52; 52; 52; 52
R2: 0.856; 0.790; 0.721; 0.535; 0.407; 0.378; 0.408; 0.452; 0.342
Adj R2: 0.844; 0.781; 0.712; 0.521; 0.392; 0.362; 0.393; 0.437; 0.321
R2-R2 controls: 0.001; 0.004; 0.003; 0.004; 0.003; 0.039; 0.050; 0.035; 0.028
pop≥99th pctile: 171ths-425ths; 139ths-1.4mn; 139ths-2.5mn; 197ths-4.9mn; 321ths-8.5mn; 442ths-11.7mn; 810ths-14.2mn; 1.3mn-14.5mn; 2.0mn-18.3mn
N, ≥99th pctile: 9; 18; 24; 27; 30; 30; 29; 27; 24
103-23ths; 100-26ths; 104-30ths; 137-32ths; 285-36ths; 208-42ths; 408-49ths; 356-65ths

Table 3: Population Growth and the Presence of a Very Large Neighbor. †50-to-100km row reports results for 1 to 100km when no results are reported in the 1-to-50km row.

All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations

have population in the first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based

on Conley (1999). Dark and light blue fonts respectively indicate a negative coefficient that statistically differs from

zero at the 0.05 and 0.10 levels. Dark and light red fonts respectively indicate a positive coefficient that statistically

differs from zero at the 0.05 and 0.10 levels.

over each subsequent 20-year period, reaching 250km during 2000-2017. Of course, as is intuitive,

growth’s positive relationship with the presence of a very large neighbor weakens the more distant

that neighbor is located. These findings constitute our third stylized fact:

Stylized Fact 3: Geographic Span of Shadows and Spillovers. Over the period 1920-2017

there is strong evidence of the geographic span of urban growth spillovers expanding, with spillovers

being very local between 1920-1940 and much more far-reaching in 2000-2017. Over the period

1840-1920 the evidence is mixed, though there is weak evidence of the geographic span of urban

growth shadows expanding between the late 19th century and early 20th century.

2.6 Relative Size of Locations and Neighbors

This subsection explores how the strength of urban shadows and spillovers depends on the relative

size of locations and neighbors.

The Size of Locations. The qualitative relationship between a location’s growth and the presence

of a large neighbor is not too sensitive to the size of the locations, though the magnitude of

the correlation shows some tendency to decline with size. In what follows, we make this point by

focusing on two representative time periods, one for the negative regime, 1880 to 1900, and one for

the positive regime, 1960 to 1980.

For each of the first four quintiles of locations, Table 4 separately reports regressions of

growth from 1880 to 1900 on the presence of a moderately large neighbor. All coefficients are

estimated to be negative, with the magnitudes being lower for higher quintiles. When we consider

growth of locations between the 80th and the 90th percentile, the magnitude of the correlation

becomes even smaller, though it continues to be negative. Regressions for the other periods during

the negative regime show similar results, both for the presence of moderately large and very large

neighbors.11 Table 5 reports analogous regressions of growth from 1960 to 1980. All coefficients are

estimated to be positive, again with some tendency, albeit weaker, for the magnitudes to

decline with a location’s size.

The Size of Neighbors. When comparing Table 1 and Table 3, we found that growth’s correlations

with the presence of large neighbors increased with the size of neighbors during the positive

regime but not the negative regime. A more general specification, however, establishes that magnitudes

are increasing with the size of neighbors during both regimes. Table 6 shows results from

regressing population growth on the presence of neighbors above four thresholds: the 80th, 90th,

95th, and 99th percentiles. These categories are nested in the sense that a neighbor that is above

the 99th percentile is also above the 90th and 95th percentiles. Coefficients on these latter thresholds

thus estimate the marginal boost to predicted growth compared to having a neighbor with

population only above the next-lower threshold. For example, a positive coefficient on the 99th

percentile indicator estimates the additional predicted growth of locations that have a neighbor

with population above the 99th percentile compared to locations with a largest neighbor with

population between the 95th and 99th percentiles.
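The nesting of the threshold indicators can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the input format, the function name `nested_indicators`, and the rule that each threshold is assigned the 50km band of the nearest neighbor at or above it are all hypothetical readings of the description above.

```python
import math

THRESHOLDS = (80, 90, 95, 99)   # percentile cutoffs named in the text
BIN_KM = 50                     # width of each distance band

def nested_indicators(neighbors, max_km=300):
    """Map each threshold to the distance band of the nearest qualifying neighbor.

    neighbors: list of (distance_km, population_percentile) pairs (hypothetical format).
    Returns {threshold: (lower_km, upper_km)}, with None where no neighbor qualifies.
    """
    bands = {}
    for t in THRESHOLDS:
        dists = [d for d, pct in neighbors if pct >= t and 0 < d <= max_km]
        if not dists:
            bands[t] = None
            continue
        upper = math.ceil(min(dists) / BIN_KM) * BIN_KM
        bands[t] = (upper - BIN_KM, upper)
    return bands

# A neighbor above the 99th percentile also switches on every lower threshold.
example = nested_indicators([(75.0, 99.2), (120.0, 85.0)])
print(example)
```

Because a 99th-percentile neighbor activates the lower thresholds as well, the coefficient on each higher threshold is naturally read as an increment over the one below it. Note that the regressions reported in Table 6 include only a subset of these threshold-band combinations.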

During the negative regime, the increase in the magnitude of growth’s relationship with the

population of its largest nearby locations is especially strong in the 1880-1900 regression. Having at

11The one exception concerns regressions of growth from 1880 to 1900 on the presence of very large neighbors. In contrast to the combined regression, estimated coefficients using the first quintile of observations are positive.

neighbor w/population
Columns: (1) quints 1 to 4; (2) quint 1; (3) quint 2; (4) quint 3; (5) quint 4; (6) decile 9; (7) decile 10

≥ 95th pctile @ 1 to 50 km: -1.10 (0.21); -1.17 (0.26); -0.35 (0.25); -0.47 (0.11); -0.25 (0.13); 0.07 (0.16)
≥ 95th pctile @ 50 to 100 km†: -0.86 (0.18); -1.55 (1.05); -0.79 (0.31); -0.40 (0.12); -0.23 (0.10)
≥ 95th pctile @ 100 to 150 km: -0.63 (0.15); -1.92 (0.99); -0.39 (0.19)
≥ 95th pctile @ 150 to 200 km: -1.03 (0.96)
≥ 95th pctile @ 200 to 250 km: -0.65 (0.96)
≥ 95th pctile @ 250 to 300 km: -0.14 (0.96)
≥ 95th pctile @ 300 to 350 km: -0.65 (1.01)
N: 1,844; 422; 476; 472; 474; 237; 237
control vars: 52; 52; 50; 51; 52; 48; 50
R2: 0.726; 0.757; 0.503; 0.470; 0.514; 0.553; 0.583
Adj R2: 0.718; 0.718; 0.441; 0.403; 0.451; 0.436; 0.469
R2-R2 controls: 0.008; 0.005; 0.016; 0.013; 0.011; 0.004; 0.000
pop≥95th pctile: 51ths-2.5mn; 51ths-2.5mn
N, pop≥thresh: 119; 119; 119; 119; 119; 119; 119
pop of obs: 100-25.7ths; 100-5.6ths; 5.6ths-10.7ths; 10.8ths-16.1ths; 16.1ths-25.7ths; 25.7ths-36.6ths; 36.6ths-2.5mn
share of pop: 0.42; 0.02; 0.08; 0.13; 0.19; 0.14; 0.44

Table 4: Population Growth by Quintile, 1880 to 1900. †50-to-100km row reports results for 1 to 100km when no results are reported in the 1-to-50km row.

Observations have population in the enumerated percentile range. Standard errors, in parentheses, are robust to spatial correlation

based on Conley (1999). Dark blue and light blue fonts respectively indicate a negative coefficient that statistically

differs from zero at the 0.05 and 0.10 levels.

least one neighbor within 50km that had population (weakly) above the 80th percentile is associated

with slower predicted growth of 0.21 percentage point per year. If the largest such neighbor within

50km had population above the 90th percentile, predicted growth is slower by an additional 0.37

percentage point per year. If the largest such neighbor had population above the 95th percentile,

predicted growth is slower by still an additional 0.74 percentage point per year. As an example,

Columns: (1) quints 1 to 4; (2) quint 1; (3) quint 2; (4) quint 3; (5) quint 4; (6) decile 9; (7) decile 10

≥ 95th pctile @ 1 to 50 km: 1.19 (0.19); 1.69 (0.83); 1.34 (0.34); 1.03 (0.26); 1.16 (0.19); 0.27 (0.15); 0.12 (0.11)
≥ 95th pctile @ 50 to 100 km: 0.21 (0.11); 1.12 (0.47); 0.15 (0.13); 0.18 (0.12)
≥ 95th pctile @ 100 to 150 km: 0.56 (0.31)
≥ 95th pctile @ 150 to 200 km: 0.60 (0.31)
N: 2,283; 571; 570; 571; 571; 285; 285
control vars: 52; 51; 50; 51; 52; 51; 52
R2: 0.391; 0.521; 0.452; 0.459; 0.410; 0.427; 0.647
Adj R2: 0.376; 0.470; 0.397; 0.404; 0.349; 0.298; 0.566
R2-R2 controls: 0.033; 0.022; 0.032; 0.034; 0.086; 0.005; 0.001
pop≥95th pctile: 172ths-14.2mn; 172ths-14.2mn
N, pop≥thresh: 143; 143; 143; 143; 143; 143; 143
pop of obs: 208-42.3ths; 208-7.8ths; 7.8ths-13.5ths; 13.5ths-21.3ths; 21.3ths-42.3ths; 42.3ths-79.5ths; 79.5ths-14.2mn
share of pop: 0.20; 0.02; 0.03; 0.05; 0.10; 0.09; 0.71

Table 5: Population Growth by Quintile, 1960 to 1980. All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations

have population in the enumerated percentile range. Standard errors, in parentheses, are robust to spatial correlation

based on Conley (1999). Dark and light red fonts respectively indicate a positive coefficient that statistically differs

from zero at the 0.05 and 0.10 levels.

consider a location that has a neighbor with population at the 99th percentile between 50km and

100km away and no neighbor with population above the 90th percentile within 50km. During the

period 1880-1900, such a location would have slower predicted population growth of 1.36 percentage

point per year—the sum of the coefficients on the 50-to-100km indicators for the 90th, 95th, and

99th percentiles—compared to observations that do not have a neighbor in any of the categories

included in the regression.
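Since the coefficients are marginal, the total predicted difference for a location is the running sum over all thresholds its neighbor clears. The sketch below reproduces the arithmetic for 1880-1900. The within-50km values are those quoted in the text; the 50-to-100km values are read from Table 6, and the assignment of -0.29 to the 90th-percentile indicator for this period is an assumption, chosen because it reproduces the 1.36 total stated above. The function name `cumulative_effect` is hypothetical.

```python
# Marginal coefficients for 1880-1900, in percentage points per year.
WITHIN_50KM = {80: -0.21, 90: -0.37, 95: -0.74}   # quoted in the text
KM_50_TO_100 = {90: -0.29, 95: -0.54, 99: -0.53}  # Table 6; 90th value is an assumption

def cumulative_effect(marginal, neighbor_pctile):
    """Sum the marginal coefficients of every threshold the neighbor clears."""
    total = sum(c for t, c in marginal.items() if t <= neighbor_pctile)
    return round(total, 2)

print(cumulative_effect(WITHIN_50KM, 80))    # neighbor just above the 80th percentile
print(cumulative_effect(WITHIN_50KM, 95))    # neighbor above the 95th percentile
print(cumulative_effect(KM_50_TO_100, 99))   # the worked example in the text
```

The last call returns -1.36, matching the slower predicted growth of 1.36 percentage point per year in the example.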

During the positive regime, the largest marginal increases in predicted growth are associated

with having a neighbor with population at the 99th percentile rather than having one with population

between the 95th and 99th percentiles. This is especially so during the 1960-1980 period,

when the marginal increase was 2.43 percentage point per year. For neighbors located more than

50km away, only those with population at the 99th percentile are associated with a meaningful

Columns: (1) 1840-1860; (2) 1860-1880; (3) 1880-1900; (4) 1900-1920; (5) 1920-1940; (6) 1940-1960; (7) 1960-1980; (8) 1980-2000; (9) 2000-2017

≥ 80th pctile @ 1 to 50 km: -0.41 (0.21); -0.72 (0.19); -0.21 (0.12); -0.30 (0.09); 0.10 (0.07); -0.07 (0.08); 0.06 (0.07); 0.14 (0.06); 0.06 (0.05)
≥ 90th pctile @ 1 to 50 km: -0.40 (0.35); 0.25 (0.27); -0.37 (0.19); 0.13 (0.14); -0.03 (0.11); 0.30 (0.11); 0.15 (0.08); 0.21 (0.09); 0.00 (0.08)
≥ 90th pctile @ 50 to 100 km: -0.30 (0.24); -0.19 (0.22); -0.29 (0.16); 0.00 (0.06)
≥ 95th pctile @ 1 to 50 km: -0.09 (0.38); -0.44 (0.30); -0.74 (0.23); 0.01 (0.20); 0.18 (0.15); 0.54 (0.17); 0.72 (0.15); 0.17 (0.13); 0.36 (0.20)
≥ 95th pctile @ 50 to 100 km: -0.49 (0.30); -0.17 (0.22); -0.54 (0.17); -0.22 (0.10); -0.09 (0.08); 0.06 (0.09); 0.04 (0.09); 0.17 (0.08); 0.04 (0.09)
≥ 95th pctile @ 100 to 150 km: -0.37 (0.12); 0.05 (0.20); -0.60 (0.15); 0.00 (0.06)
≥ 99th pctile @ 1 to 50 km: -0.48 (0.36); -0.09 (0.46); -0.31 (0.29); 0.55 (0.26); 1.88 (0.28); 2.43 (0.59)
≥ 99th pctile @ 50 to 100 km†: 0.49 (0.45); -0.40 (0.30); -0.53 (0.26); -0.52 (0.20); 0.28 (0.18); 0.59 (0.16); 0.98 (0.24); 0.96 (0.18); 0.66 (0.15)
≥ 99th pctile @ 100 to 150 km: 0.54 (0.44); -0.59 (0.26); -0.14 (0.23); -0.42 (0.18); 0.09 (0.19); 0.16 (0.14); 0.28 (0.14); 0.50 (0.13); 0.40 (0.10)
≥ 99th pctile @ 150 to 200 km: 0.65 (0.35); -0.75 (0.19); -0.04 (0.21); -0.51 (0.17); 0.17 (0.16); 0.06 (0.13); 0.11 (0.11); 0.25 (0.09); 0.27 (0.07)
≥ 99th pctile @ 200 to 250 km: 0.14 (0.23); 0.03 (0.16); -0.21 (0.15); 0.19 (0.15); -0.09 (0.13); 0.01 (0.11); 0.11 (0.08); 0.23 (0.07)
≥ 99th pctile @ 250 to 300 km: 0.25 (0.21); 0.03 (0.16); -0.13 (0.18); 0.16 (0.13); 0.12 (0.07)
N (quints 1 to 4): 691; 1,328; 1,844; 2,110; 2,357; 2,387; 2,283; 2,104; 1,895
control vars: 48; 52; 52; 52; 52; 52; 52; 52; 52
R2: 0.860; 0.794; 0.729; 0.537; 0.409; 0.390; 0.427; 0.463; 0.345
Adj R2: 0.847; 0.784; 0.719; 0.523; 0.393; 0.374; 0.411; 0.447; 0.323
R2-R2 controls: 0.005; 0.008; 0.011; 0.006; 0.006; 0.052; 0.069; 0.046; 0.031
pop, pctile 80: 24ths; 23ths; 26ths; 30ths; 32ths; 36ths; 42ths; 49ths; 65ths
pop, pctile 99: 171ths; 139ths; 139ths; 197ths; 321ths; 442ths; 810ths; 1.3mn; 2.0mn

Table 6: Population Growth and the Size of Large Neighbors. †50-to-100km row reports results for 1 to 100km when no results are reported in the 1-to-50km row.

Observations have population in the first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based on

Conley (1999).

increase in predicted growth. For the period from 2000 to 2017, the statistically-significant boost

from having a very large neighbor extends to those as much as 300km away. In contrast to the

negative regime, the magnitudes of the estimated differences in growth are modest for neighbors

with population between the 80th and 90th percentiles.

Our findings of how the relative size of locations and neighbors affects the strength of urban

shadows and spillovers can be summarized as follows:

Stylized Fact 4: Relative Size of Locations and Neighbors. Urban shadows and urban

spillovers tend to strengthen with the size difference between a location and its large neighbor. That is,

the smaller a location and the larger its neighbor, the stronger urban shadows and urban spillovers.

2.7 Regional Variation during the U.S. Westward Expansion

In this subsection we aim to understand to what extent the existence of urban shadows might

have been related to the westward expansion of the U.S. during the 19th and early 20th centuries.

If during that time period locations in the West, many of them isolated, grew fast because they

were becoming settled, this would contribute to a negative correlation between local growth and

proximity to a large neighbor. To assess this possibility, we rerun our baseline regression for three

separate regions: the East, corresponding to the states covering the area of the original 13 colonies,

excluding Kentucky and Tennessee;12 the Middle, corresponding to all other states east of the

Mississippi River; and the West, corresponding to all states west of the Mississippi River. Table 7

reports our findings for the time period 1860-1920. Although, as expected, the magnitudes of the

correlations are stronger in the more newly settled portions of the U.S. (the West) compared to

portions that were settled earlier (the East), we continue to observe evidence of urban shadows in

all regions of the country.

3 Commuting Costs: 1840 to 2017

One important force that shapes spatial growth dynamics in the hinterland of large population

clusters relates to intra-city and inter-city commuting costs. As these will play a key role in the

conceptual framework we present to interpret our empirical findings, in this section we briefly

document how commuting costs in the U.S. have evolved since 1840. We start by focusing on

changes in transportation technology, and then consider other factors that also affect commuting

costs.

12The land areas that would eventually constitute Kentucky and Tennessee were originally part of Virginia and North Carolina, respectively.

neighbor w/pop ≥ 95th
Columns: East (1) 1860-1880, (2) 1880-1900, (3) 1900-1920; Middle (4) 1860-1880, (5) 1880-1900, (6) 1900-1920; West (7) 1860-1880, (8) 1880-1900, (9) 1900-1920

1 to 50 km: -0.28 (0.31); -0.31 (0.10); 0.50 (0.20); -0.58 (0.24); -0.62 (0.13); 0.06 (0.08); -2.11 (1.88); -1.78 (0.45); -0.13 (0.27)
50 to 100 km†: -0.21 (0.26); -0.46 (0.17); -0.52 (0.14); -2.08 (1.48); -1.74 (0.31); -0.38 (0.20)
100 to 150 km: -2.59 (1.29); -1.41 (0.28); -0.07 (0.18)
150 to 200 km: -2.30 (1.03); -1.17 (0.26)
200 to 250 km: -2.36 (0.90); -0.94 (0.26)
250 to 300 km: -1.77 (0.70); -0.53 (0.23)
300 to 350 km: -1.41 (0.40)
N (quints 1 to 4): 346; 379; 390; 479; 567; 600; 503; 898; 1,120
control vars: 41; 41; 41; 42; 42; 42; 42; 43; 43
R2: 0.527; 0.445; 0.363; 0.742; 0.690; 0.389; 0.828; 0.761; 0.589
Adj R2: 0.459; 0.375; 0.286; 0.716; 0.664; 0.342; 0.810; 0.748; 0.572
R2-R2 controls: 0.004; 0.004; 0.017; 0.004; 0.009; 0.000; 0.007; 0.008; 0.001

Table 7: Population Growth and Large Neighbors by Region, 1860 to 1920. All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations have population in the first four quintiles of the national distribution. East region includes the states occupying the land area of the original 13 U.S. colonies excluding Kentucky and Tennessee. Middle region includes all other states east of the Mississippi River. West region includes all states west of the Mississippi River. Standard errors, in parentheses, are robust to spatial correlation based on Conley (1999). Dark blue and light blue fonts respectively indicate a negative coefficient that statistically differs from zero at the 0.05 and 0.10 levels.

3.1 Transportation Technologies

Since the middle of the 19th century, there have been enormous improvements in transportation

technologies. Some of those have greatly enhanced long-distance trade and market integration.

Examples that come to mind include the railroad network (Fogel, 1964; Donaldson and Hornbeck,

2016), the building of canals (Shaw, 1990), the construction of the inter-state highway system

(Baum-Snow, 2007), and containerization (Bernhofen et al., 2016). To illustrate the magnitude of

the decline in transport costs, Glaeser and Kohlhase (2004) document that the real cost per ton-mile

of railroad transportation dropped by nearly 90% between 1890 and 2000. Other changes have been

more central to improving short-distance transportation between neighboring or relatively close-by

places. For the purpose of our paper, we are mostly interested in these latter improvements. In

what follows we give a brief overview of the main innovations that have benefited short-distance

transportation technology in the U.S. over the past two centuries.

Prior to the 1850s, many Americans worked near the central business district and walked

to work. Other forms of transportation were expensive and slow. Horse-drawn carriages were

available, but were only affordable to the very rich (LeRoy and Sonstelie, 1983).13 The omnibus, a

horse-drawn vehicle carrying twelve passengers, was first introduced in the 1820s and became more

widely used in the 1840s (Kopecky and Hon Suen, 2004). However, it was still a costly and not

very fast way to travel.14 Commuter railroads appeared in the 1830s, although they were noisy and

polluting, which led authorities to impose strict regulations, often limiting their use.15

Between 1850 and 1900 the U.S. witnessed the arrival of the streetcar or trolley, which

allowed for smoother travel and larger capacity than an omnibus. As with many other new modes

of transport, initially only high-income individuals could pay the high fare of streetcars to commute

to work on a regular basis. Nonetheless, the introduction of the streetcar allowed the larger cities

to grow. Boston saw the first “streetcar suburbs”, well-off neighborhoods on the outskirts of the

city (Warner, 1972; Mieszkowski and Mills, 1993; Kopecky and Hon Suen, 2004). The streetcar was

an important improvement over the omnibus in terms of capacity and speed: a two-horse streetcar

could carry 40 passengers, and its speed was about one-third greater. Over time, animal power was

replaced by cleaner and more efficient sources of motive power. The first electric streetcar was operated

in Montgomery, Alabama, in 1886, and by the end of 1903, 98 percent of the 30,000 miles of street

railway had been electrified.16 By 1920, the streetcar had become an affordable means of commuting

for almost every worker. However, by then the car had made its appearance, so the streetcar never

became widely used by all income groups.

Several factors contributed to the streetcar facilitating longer-distance commutes, thus allowing

large cities to grow bigger. One was an improvement in speed, another was the use of flat

rates independent of distance, and a third was the construction of longer rail lines. In his study

of Boston, Warner (1972) argues that the trolley triggered a substantial outward expansion of the

city. In particular, he estimates this expansion to have been between 0.5 and 1.5 miles per decade.

As Jackson (1985) explains, this translates into the outer limit of convenient commuting, defined as

the distance that can be traversed in one hour or less, increasing from about 2 miles from Boston’s

13Regular steam ferry service began in the early 1810s but was limited to big coastal cities like New York.

14LeRoy and Sonstelie (1983) document that an omnibus fare ranged from 12 cents to 50 cents at a time when a laborer might earn $1.00 a day. Its average speed was slow – about 6 miles per hour.

15As in the case of the omnibus, commuter railroads were also quite expensive (LeRoy and Sonstelie, 1983).

16Before the use of electricity, the use of steam engines was briefly tried, with limited success, in several U.S. cities.

City Hall in 1850 to 6 miles in 1900.

While all these innovations significantly decreased transportation and commuting costs, it

was not until the path-breaking invention of the internal combustion engine that these costs would

experience radical change. The adoption of the car did not happen overnight: the affordability of

automobiles for the middle class had to wait until the mass production of the Model-T in 1908.

Other issues had to be resolved as well before cars could become widespread. Initially, regulations

limited their use and speed to 4 miles per hour to avoid scaring horses. There was also a scarcity

of gasoline stations and service facilities. More importantly, roads were still largely unpaved.

The growth in car ownership and use was tightly linked to the investment in roads and

highways. New York opened the first part of its parkway system in 1908, which allowed drivers to

increase their speed to 25 miles per hour. The Federal Highway Act of 1921 allowed the construction

of similar highways across the country. In 1913, there was one motor vehicle for every eight people

and, by the end of the 1920s, the car was used by 23 million people. The government effort was

boosted years later with the Eisenhower Interstate Highway system, arguably the largest public

works project in history and authorized by the Federal Highway Act of 1956. During this entire

period, car ownership continued to rise until the 1970s (Kopecky and Suen, 2004).

The combination of the mass use of the car and the expansion of the highway system

translated into a huge wave of suburbanization, mostly in the post-WWII era. Many of these

highways connected the downtown areas of large urban centers to the suburbs and the farther-

off hinterland. According to Glaeser (2011), “the highway program was meant to connect the

country, but subsidizing highways ended up encouraging people to commute by car”. Baum-Snow

(2007) argues that cars and highways were a fundamental determinant of the suburbanization of

American cities. His estimates show that, between 1950 and 1990, the construction of one new

highway passing through a central city reduced its population by about 18 percent. Another major

transportation change starting around 1950 was the construction of suburban rail terminals. In

cities like San Francisco and Washington, D.C., heavy-rail systems were established, while light-rail

systems followed in cities like San Diego and Portland (Young, 2015).17

3.2 Other Commuting Costs

In addition to transport technology, other factors that determine the time cost of commuting are

the spatial concentration of people and businesses, traffic congestion, and the opportunity cost of

17Suburbanization was also facilitated by factors unrelated to transport technology: the home mortgage interest deduction, the introduction of government-guaranteed mortgages, the Federal Housing Administration loans that guaranteed up to 95 percent of mortgages for middle-income buyers, and the GI Bill that offered no down payment housing loans for veterans.

Spatial Clustering. Commuting costs fall if it becomes easier to fit more people or businesses

onto an acre of land, since this implies fewer people needing to commute long distances. One major

factor facilitating density is the possibility of building vertically. Historically, this move upward

was at first modest, as two-story buildings were gradually replaced by four- and six-story buildings

(Glaeser, 2011). Heights were restricted by the cost of construction and the limits on people’s

desire to climb stairs. As a result, the top floors of six-story buildings were typically occupied by

the lowest-income tenants (Bernard, 2014). This all changed with the invention of the elevator. A

first elevator engine was presented by Elisha Otis at New York’s Crystal Palace Exposition in 1854,

but its rudimentary technology was unsuitable to be used in tall buildings. In 1880, Werner von

Siemens’ electric elevator made it possible to transport people to tall heights in a safe manner,

hence enabling the construction of skyscrapers with functional uses.

Another challenge that had to be overcome to build skyscrapers was an architectural con-

straint: erecting tall buildings required thick walls, making skyscrapers unprofitable. The solution

to this problem was the use of load-bearing steel skeletons, where the weight of the building rests

on a skeleton frame. Building this type of structure became possible in large part thanks to the

increasing affordability of steel in the late 19th century. The first skyscraper is often attributed

to William Le Baron Jenney’s Home Insurance Building, a 138-foot structure built in Chicago in

1885.18 In the following decades, skyscrapers became a fixture in the skylines of American cities,

especially in Manhattan, which witnessed a boom in the number of skyscrapers in the 1920s.19

Congestion. The speed of commuting is of course not only a function of available technology. As

traffic congestion has become worse, the most recent decades have witnessed a slowdown or even

a reversal in the trend of ever-faster commuting. As one indicator of this growing congestion, we

use the travel time index (TTI) of the Texas A&M Transportation Institute. The TTI is defined

as the ratio of travel time in the peak period to travel time at free-flow conditions. For example,

a value of 1.10 indicates a 20-minute free-flow trip takes 22 minutes in the peak period. Between

1990 and 2010, the TTI increased from around 1.10 to 1.20. As another indicator of congestion,

we compute the average speed of trips under 50 km from the National Household Travel Survey.

Between 1983 and 2001, this speed was still increasing, from 23.3 miles per hour to 26.4 miles per

hour. Since then, this speed has declined, and by 2017 it had fallen by nearly one quarter, to 20.3

miles per hour. A similar pattern can be observed for trips between 50 and 100 km. The average

speed increased from 45.2 mph in 1983 to 49.0 mph in 1995, and has since then declined, reaching

39.7 mph in 2017.
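As a quick check on the arithmetic in this paragraph, the following Python sketch recomputes the TTI example and the percentage changes in average trip speed from the figures quoted above. The function names are ours and are introduced only for illustration.

```python
def travel_time_index(peak_minutes: float, free_flow_minutes: float) -> float:
    """TTI: peak-period travel time divided by free-flow travel time."""
    return peak_minutes / free_flow_minutes


def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# A 20-minute free-flow trip that takes 22 minutes at the peak has a TTI of 1.10.
tti_example = travel_time_index(22.0, 20.0)

# Average speed (mph) of trips under 50 km: 2001 versus 2017.
short_trip_change = percent_change(26.4, 20.3)

# Average speed (mph) of trips between 50 and 100 km: 1995 versus 2017.
long_trip_change = percent_change(49.0, 39.7)

print(f"TTI example: {tti_example:.2f}")
print(f"Trips under 50 km, 2001-2017: {short_trip_change:.1f}%")
print(f"Trips of 50-100 km, 1995-2017: {long_trip_change:.1f}%")
```

The decline for trips under 50 km is about 23 percent, in line with the "nearly one quarter" reported in the text; the decline for trips between 50 and 100 km is about 19 percent.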

18Other famous skyscrapers built around that year are the Montauk Building in Chicago, and the McCullough Shot and Lead Tower in New York.

19The growth in the number of skyscrapers diminished after 1933, as a result of stringent regulations based on the argument that these tall buildings severely reduced the amount of light available to pedestrians.

Opportunity Cost of Time. Another factor contributing to the increasing time cost of com-

muting is the rising opportunity cost of time. Edlund et al. (2016) focus on the increase in

double-income high-skilled households between 1980 and 2010. Dual-earner couples have less time,

making commuting more costly, giving them an incentive to live closer to work. Edlund and co-

authors find that the increase in the number of couples where both partners work has contributed to

gentrification and urban renewal in recent decades. Su (2018) makes a similar point, but focuses on

individuals between 1990 and 2010. The percentage of those working long hours has increased for

all skill classes, though the effect is larger for the college educated. To economize on the commuting

time, the high-skilled are disproportionately moving to the city centers.20

3.3 Summary

When focusing on 1840-2017, the above discussion suggests that we can distinguish three subperiods

in the evolution of commuting costs. Between 1840 and 1920, there was a gradual decline in

commuting costs, driven by the introduction of the omnibus and the streetcar, followed by the

incipient adoption of the car. After 1920, there was a rapid decline in commuting costs, driven by

the mass adoption of the automobile, the construction of highways connecting urban areas with

their hinterlands, and the expansion of suburban rail systems. By the turn of the 21st century, this

continuous decline in commuting costs slowed down, because of the increase in congestion and the

rising opportunity cost of time.21

4 Conceptual Framework

In this section we provide a two-city spatial model with commuting and moving costs that is able

to account for the main stylized facts identified in the data. On the one hand, the smaller city may

find it hard to survive in the shadow of the larger city, as its residents prefer to move to the more

productive neighbor. On the other hand, the smaller city may thrive as its residents can access

the neighbor’s higher productivity through commuting. As commuting costs decline, we find that

urban growth shadows dominate in the early stage, whereas urban growth spillovers dominate later.

20Of course, since this process of gentrification also displaces people, it is not clear whether this is associated with a decline or an increase in the center-city population.

21The years that separate the different subperiods do not constitute precise breakpoints. For example, we can use either 1920 or 1940 to separate the first two subperiods, as the mass adoption of cars started after 1908, whereas the building of urban highways and suburban rail networks only started in earnest in the 1950s and the 1960s.

4.1 Setup and Equilibrium

Endowments. The economy consists of a continuum of points on a line. The density of land at all points of the line is one. There are $L$ individuals, each residing on one unit of land. Each resident has one unit of time, which she divides between work and commuting. On the line there are two exogenously given production points, denoted by $\ell$ and $k$. The set of individuals living closer to production point $\ell$ than to production point $k$ comprises city $\ell$. Whereas the total population, $L$, is exogenous, the populations of the two cities are endogenous. The land rent in city $\ell$ at distance $d_\ell$ from production point $\ell$ is denoted by $r_\ell(d_\ell)$. The distance between production points $\ell$ and $k$, denoted by $d_{\ell k}$, is big enough so that there is at least some empty land between the two cities.22

Technology. The economy produces one good, and labor is the only factor of production. Technology is linear, with one unit of labor producing $A_\ell$ units of the good at production point $\ell$ and $A_k$ units of the good at production point $k$. The price of the good is normalized to one. To produce, an individual needs to commute from his residence to one of the two production points.

The time cost of commuting per unit of distance is $\gamma$. An individual who resides in city $\ell$ at a distance $d_\ell$ from production point $\ell$ can choose between working in $\ell$ or $k$. If she works in her own city $\ell$, she supplies one unit of labor net of the time lost in intra-city commuting, $1 - \gamma d_\ell$, and earns an income of $A_\ell(1 - \gamma d_\ell)$. If she commutes to the other city $k$, we ignore differences in the residence location in the own city, and assume that she incurs an inter-city commuting distance $d_{\ell k}$.23 In that case, she supplies $1 - \gamma d_{\ell k}$ units of labor, and earns an income $A_k(1 - \gamma d_{\ell k})$. Without loss of generality, we assume that no one residing in $k$ has an incentive to commute to $\ell$. As a result, an individual residing in city $k$ at a distance $d_k$ from production point $k$ supplies $1 - \gamma d_k$ units of labor, earning an income of $A_k(1 - \gamma d_k)$. Summarizing, depending on where an individual resides and works, there are three possible expressions for income $y$:

$$
\begin{aligned}
y_\ell(d_\ell) &= A_\ell(1 - \gamma d_\ell) \\
y_k(d_k) &= A_k(1 - \gamma d_k) \\
y_k(d_{\ell k}) &= A_k(1 - \gamma d_{\ell k}),
\end{aligned}
\tag{2}
$$

where the subscript on $y$ refers to the individual’s workplace and the subscript on $d$ refers to his commuting distance, and hence implicitly to his place of residence. For example, $y_k(d_{\ell k})$ refers to the income of an individual who works in $k$ and covers a distance $d_{\ell k}$ to get to work, implying that she lives in $\ell$.

22This ensures symmetry in a city’s spatial structure on both sides of its production point.

23This simplifying assumption has the advantage of maintaining symmetry between agents who reside at a distance $d_\ell$ to the right of production point $\ell$ and those who reside at that same distance $d_\ell$ to the left of production point $\ell$. It implies that cities will be symmetric in shape: the number of residents living to the right and to the left of production point $\ell$ will be the same. That is, the distance from production point $\ell$ to the edge of city $\ell$, denoted by $\bar{d}_\ell$, will be the same on both sides of $\ell$.

The possibility of commuting from city $\ell$ to production point $k$ is meant to capture the positive effect of urban spillovers from the neighboring city. These spillovers decline with distance: by commuting from city $\ell$ to production point $k$, a resident loses working time at a rate of $\gamma$ per unit of distance, giving him access to a de facto discounted version of the neighboring city’s productivity, $A_k(1 - \gamma d_{\ell k})$.24 We could alternatively model this effect through direct technology spillovers, without the need of introducing inter-city commuting, as in Ahlfeldt et al. (2015): if technological spillovers decay at a rate of $\gamma$ per unit of distance, then an agent who resides and works in city $\ell$ would have access to a discounted version of his neighbor’s productivity, $A_k(1 - \gamma d_{\ell k})$, in exactly the same way as an agent who resides in $\ell$ and pays a time cost of $\gamma$ per unit of distance to commute to $k$. In that sense, both interpretations are interchangeable in their effects on income. Hence, our results do not strictly hinge on the existence of inter-city commuting. We will return to this alternative interpretation when we discuss the model’s results.25

Preferences. Utility is equal to income, with two adjustments. First, agents who change residence from city $\ell$ to city $k$ pay a consumption-equivalent utility moving cost $\mu d_{\ell k}$ that is increasing in inter-city distance. As in the case of commuting, without loss of generality, no individual originally from $k$ has an incentive to move to $\ell$. Because of the moving cost, an agent’s utility depends not just on where she resides and where she works, but also on where she is originally from. Second, occupying land is costly. In particular, an individual residing at a distance $d_\ell$ from the production point $\ell$ pays a land rent of $r_\ell(d_\ell)$, which reduces her net income and hence her utility.

Therefore, depending on an individual’s place of origin, place of residence and place of work, there are four possible expressions for utility:

$$
\begin{aligned}
u^\ell_\ell(d_\ell) &= A_\ell(1 - \gamma d_\ell) - r_\ell(d_\ell) \\
u^\ell_k(d_k) &= A_k(1 - \gamma d_k) - r_k(d_k) - \mu d_{\ell k} \\
u^\ell_k(d_\ell) &= A_k(1 - \gamma d_{\ell k}) - r_\ell(d_\ell) \\
u^k_k(d_k) &= A_k(1 - \gamma d_k) - r_k(d_k),
\end{aligned}
\tag{3}
$$

where the superscript on $u$ refers to the individual’s place of origin, the subscript on $u$ refers to her workplace, and the subscript on $d$ refers to her commuting distance, and hence implicitly to her place of residence. For example, $u^\ell_k(d_\ell)$ refers to the utility of an individual who is originally from $\ell$, resides in $\ell$ and works in $k$, whereas $u^\ell_k(d_k)$ refers to the utility of an individual who is originally from $\ell$, resides in $k$ and works in $k$. To simplify notation, later in the paper we will sometimes refer to $u^\ell_\ell(d_\ell)$ as the staying utility $U_S$, to $u^\ell_k(d_k)$ as the moving utility $U_M$, and to $u^\ell_k(d_\ell)$ as the commuting utility $U_C$.

24If we include the inter-city commuting time as part of the necessary time dedicated to work, we can interpret $A_k(1 - \gamma d_{\ell k})$ as the productivity of a commuter from the other city.

25Yet another alternative would be to model this effect by introducing trade. We provide such a model in the Appendix.

We now provide some more details about the moving cost $\mu d_{\ell k}$, and also relate it to the

concept of urban shadows. Rather than interpreting the moving cost as a time cost, we think of it

as the utility cost of being a migrant. For example, this could include the psychological and social

costs of having to leave friends and family behind. Consistent with this interpretation, we assume

that a return migrant does not pay a moving cost. That is, if an individual who moved from city

$\ell$ to city $k$ returns to her hometown, she does not pay a moving cost. The possibility of moving

between cities is meant to capture urban shadows: an individual who resides in a low-productivity

smaller city in the proximity of a high-productivity larger city may find it beneficial to move to

its high-productivity neighbor. If so, the smaller city loses population, and the larger city casts a

growth shadow on the smaller city.

Residential Mobility within Cities. People can freely locate within cities. Where land is unoccupied, land rents are normalized to zero. Hence, at the city edge $\bar{d}_\ell$ land rents are $r_\ell(\bar{d}_\ell) = 0$, whereas at other locations closer to the production center $\ell$ land rents are determined by the within-city residential free mobility condition. The same of course applies to city $k$.

To determine equilibrium land rents at different locations, note that in city $\ell$ there are potentially two types of agents: residents who work locally in $\ell$, denoted by $L_{\ell\ell}$, and residents who commute to $k$, denoted by $L_{\ell k}$. The total cost of land rents and commuting costs incurred by a resident who lives at distance $d_\ell$ and works locally is $r_\ell(d_\ell) + A_\ell \gamma d_\ell$, whereas the analogous cost if she commutes to $k$ is $r_\ell(d_\ell) + A_k \gamma d_{\ell k}$. Since all commuters to the other city $k$ have to cover the same distance $d_{\ell k}$, independently of where they reside in city $\ell$, they all prefer to live on the city edge and pay zero rent. As a result, there will be an area $L_{\ell k}/2$ on both edges of the city where rents are zero. To be precise, for all $d_\ell \in [\bar{d}_\ell - L_{\ell k}/2, \bar{d}_\ell]$ we have $r_\ell(d_\ell) = 0$. For all other locations closer to production point $\ell$, occupied by residents who work locally, the sum of land rents plus commuting costs must equalize. Hence, for all $d_\ell \in [0, \bar{d}_\ell - L_{\ell k}/2]$, we have $r_\ell(d_\ell) + A_\ell \gamma d_\ell = A_\ell \gamma (\bar{d}_\ell - L_{\ell k}/2)$, so that $r_\ell(d_\ell) = A_\ell \gamma (\bar{d}_\ell - L_{\ell k}/2 - d_\ell)$. Summarizing, equilibrium land rents in city $\ell$ are:

$$
r_\ell(d_\ell) =
\begin{cases}
A_\ell \gamma \left(\bar{d}_\ell - \frac{L_{\ell k}}{2} - d_\ell\right) & \text{if } d_\ell \in \left[0, \bar{d}_\ell - \frac{L_{\ell k}}{2}\right] \\
0 & \text{if } d_\ell \in \left[\bar{d}_\ell - \frac{L_{\ell k}}{2}, \bar{d}_\ell\right]
\end{cases}
\tag{4}
$$

In city $k$, without loss of generality, no residents commute to $\ell$, so land rents are simply:

$$
r_k(d_k) = A_k \gamma (\bar{d}_k - d_k). \tag{5}
$$
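To make the rent schedule concrete, the following Python sketch implements the land rents of equations (4) and (5) and checks that every locally working resident of city ℓ obtains the same utility regardless of where she lives, as the within-city free mobility condition requires. The function names and the parameter values are hypothetical and chosen only for illustration; `edge` stands for the distance from the production point to the city edge and `commuters` for the number of residents of ℓ who commute to k.

```python
def rent_city_l(d, edge, commuters, A_l, gamma):
    """Equation (4): land rent in city l at distance d from its production point.

    edge      -- distance from the production point to the city edge
    commuters -- number of residents of l who commute to k (L_lk)
    """
    if not 0.0 <= d <= edge:
        raise ValueError("d must lie between 0 and the city edge")
    # Outer limit of the area occupied by residents who work locally.
    local_edge = edge - commuters / 2.0
    return A_l * gamma * (local_edge - d) if d <= local_edge else 0.0


def rent_city_k(d, edge, A_k, gamma):
    """Equation (5): land rent in city k, where nobody commutes out."""
    if not 0.0 <= d <= edge:
        raise ValueError("d must lie between 0 and the city edge")
    return A_k * gamma * (edge - d)


def staying_utility(d, edge, commuters, A_l, gamma):
    """Income net of land rent for a resident of l who works locally."""
    return A_l * (1.0 - gamma * d) - rent_city_l(d, edge, commuters, A_l, gamma)


# Hypothetical parameter values, for illustration only.
A_l, gamma, edge, commuters = 1.0, 0.25, 0.4, 0.2

# Local workers live at distances between 0 and edge - commuters / 2 = 0.3.
utilities = [staying_utility(d, edge, commuters, A_l, gamma)
             for d in (0.0, 0.1, 0.2, 0.3)]
print(utilities)
```

Rents exactly offset the difference in commuting costs, so utility is equalized across all locations occupied by local workers, and rents are zero in the outer band occupied by commuters.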

City Choice and Commuting Choice. The smaller city will lose population if its residents prefer to move to the larger neighbor, but it will gain population if its residents choose to commute to the larger city. To provide some intuition for when one situation is more likely than another, consider an individual who is originally from city $\ell$ and resides at a distance $d_\ell$ from the production point $\ell$. She has three choices: she can stay in city $\ell$ and work at production point $\ell$, earning a utility $U_S \equiv u^\ell_\ell(d_\ell)$; she can move to city $k$ and reside at a distance $d_k$ from her work at production point $k$, earning a utility $U_M \equiv u^\ell_k(d_k)$; or she can commute a distance $d_{\ell k}$ from city $\ell$ to production point $k$ to work, earning a utility $U_C \equiv u^\ell_k(d_\ell)$. The expressions in (3) suggest that staying is attractive if productivity differences are small, inter-city distances are large, commuting costs are high, and moving costs are big; moving is beneficial if commuting costs are not too high and moving costs are sufficiently low; and commuting is the preferred choice if commuting costs become sufficiently low.

Building on this intuition, we can now characterize the equilibrium of the economy in terms of where individuals choose to reside and where they choose to work. We do so for a given set of parameters $A_\ell$, $A_k$, $d_{\ell k}$, $\mu$ and $\gamma$, and for given initial values $d^0_\ell$ and $d^0_k$ which determine the size of both cities when populated by their original residents. Without loss of generality, assume that $A_k(1 - \gamma d^0_k) \ge A_\ell(1 - \gamma d^0_\ell)$, implying that if all individuals work in the city they are originally from, the utility of a resident of $\ell$ is less than or equal to that of a resident of $k$.

Depending on the parameter values and on the initial sizes of both cities, the economy will be in one of four equilibria, represented by the four quadrants of Figure 1. First, if original residents of city $\ell$ do not stand to gain from either moving to city $k$ or commuting to city $k$, we will say that we are in a staying equilibrium: every individual stays and works in the city where she resided originally. This case is illustrated in the top-left panel of Figure 1. Second, if original residents of city $\ell$ get a higher utility from moving than from either commuting or staying, some individuals from city $\ell$ move to city $k$. As this happens, city $\ell$ becomes smaller and city $k$ becomes larger, implying that the utility from moving goes down and the utility from staying goes up.26 If, as illustrated in the top-right panel of Figure 1, the two utility levels equalize at a level above the utility from commuting, then we will say that we are in an inter-city moving equilibrium: some individuals of $\ell$ move to $k$, and the remainder live and work in $\ell$.27

Third, starting in the same situation, with the utility from moving being higher than the utility from commuting or staying, it is possible that as people start moving, the utility from moving reaches the utility from commuting. At that point, some individuals in $\ell$ start commuting to $k$, until the utility from staying equals that of commuting. In this case, shown in the bottom-left panel of Figure 1, the economy is in an inter-city moving and commuting equilibrium. Lastly, if the utility from commuting is higher than the utility from moving or staying, some individuals in $\ell$ commute to $k$. As this occurs, fewer people in $\ell$ work at production point $\ell$. This weakens the competition for land in $\ell$ and lowers the land rent. As a result, the utility from staying increases, and the economy reaches an equilibrium when the utility from staying equals the utility from commuting.28 In this case, illustrated in the bottom-right panel of Figure 1, the economy is in an inter-city commuting equilibrium.

26The utility from commuting remains the same, since that utility depends on the distance between city $\ell$ and city $k$, which is unchanged.

27It is also possible that all individuals move out of city $\ell$ before the two utility levels meet. This possibility is not shown in Figure 1.

Figure 1: Equilibrium Description

[Figure: four panels. Top-left: staying equilibrium. Top-right: inter-city moving equilibrium. Bottom-left: inter-city moving and commuting equilibrium. Bottom-right: inter-city commuting equilibrium.]

Given initial conditions, this figure graphically illustrates the four possible equilibrium configurations. Horizontal lines denote the initial utility levels for the different choices: $U_S$ refers to the utility of an individual staying and working in her own city, $U_M$ refers to the utility of an individual moving to the other city and working there, and $U_C$ refers to the utility of an individual commuting to the other city. In the top-left corner individuals do not gain from either moving or commuting to the other city, so we have a staying equilibrium. In the top-right corner and bottom-left corner individuals get a higher utility from moving than from commuting or staying. If moving leads the utility to equalize to that of staying, we get a moving equilibrium, whereas if it leads the utility to equalize to that of commuting, we get a moving and commuting equilibrium. In the bottom-right corner individuals get a higher utility from commuting, so we have a commuting equilibrium.

28Of course, if everyone commutes before that equality is reached, then we would have the entire city $\ell$ commuting to $k$.

We are now ready to formally define the equilibrium of the economy for a given set of

parameter values and for a given initial size of $\ell$ and $k$.

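Equilibrium. Given $A_\ell$, $A_k$, $d_{\ell k}$, $\mu$ and $\gamma$, and given initial values $d^0_\ell$ and $d^0_k$, with $A_k(1 - \gamma d^0_k) > A_\ell(1 - \gamma d^0_\ell)$, the economy will be in one of four equilibria:

i. Staying equilibrium. If $A_\ell(1 - \gamma d^0_\ell) \ge A_k(1 - \gamma d^0_k) - \mu d_{\ell k}$ and $A_\ell(1 - \gamma d^0_\ell) \ge A_k(1 - \gamma d_{\ell k})$, then no individual has an incentive to move from $\ell$ to $k$.

ii. Inter-city moving equilibrium. If either $A_k(1 - \gamma d^0_k) - \mu d_{\ell k} > A_\ell(1 - \gamma d^0_\ell) \ge A_k(1 - \gamma d_{\ell k})$ or both $A_k(1 - \gamma d^0_k) - \mu d_{\ell k} > A_k(1 - \gamma d_{\ell k}) > A_\ell(1 - \gamma d^0_\ell)$ and $A_k(1 - \gamma(d^0_k + m)) - \mu d_{\ell k} \ge A_k(1 - \gamma d_{\ell k})$, then a share $\min(m, d^0_\ell)$ moves from city $\ell$ to $k$, where $m$ is the solution to $A_k(1 - \gamma(d^0_k + m)) - \mu d_{\ell k} = A_\ell(1 - \gamma(d^0_\ell - m))$.

iii. Inter-city moving and commuting equilibrium. If $A_k(1 - \gamma d^0_k) - \mu d_{\ell k} > A_k(1 - \gamma d_{\ell k}) > A_\ell(1 - \gamma d^0_\ell)$ and $A_k(1 - \gamma(d^0_k + m)) - \mu d_{\ell k} < A_k(1 - \gamma d_{\ell k})$, then a share $\min(m', d^0_\ell)$ of people moves from city $\ell$ to city $k$, where $m'$ is the solution to $A_k(1 - \gamma(d^0_k + m')) - \mu d_{\ell k} = A_k(1 - \gamma d_{\ell k})$ and $m$ is the solution to $A_k(1 - \gamma(d^0_k + m)) - \mu d_{\ell k} = A_\ell(1 - \gamma(d^0_\ell - m))$, and a share $\min(m'', d^0_\ell - \min(m', d^0_\ell))$ of people commutes from city $\ell$ to city $k$, where $m''$ is the solution to $A_\ell(1 - \gamma(d^0_\ell - m' - m'')) = A_k(1 - \gamma d_{\ell k})$.

iv. Inter-city commuting equilibrium. If $A_k(1 - \gamma d_{\ell k}) > A_\ell(1 - \gamma d^0_\ell)$ and $A_k(1 - \gamma d_{\ell k}) > A_k(1 - \gamma d^0_k) - \mu d_{\ell k}$, then $\min(c, d^0_\ell)$ commutes from city $\ell$ to city $k$, where $c$ is the solution to $A_k(1 - \gamma d_{\ell k}) = A_\ell(1 - \gamma(d^0_\ell - c))$.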
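The four cases in the definition above can be read as a decision rule. The following Python sketch is one possible way to encode it; it is an illustration under stated assumptions, not the authors' code. The function name `classify_equilibrium`, the initial city sizes `d0_l = 0.2` and `d0_k = 0.54`, and the treatment of exact ties between the moving and commuting utilities (which the definition does not cover) are our own assumptions. The sketch requires a strictly positive commuting cost.

```python
def classify_equilibrium(A_l, A_k, gamma, mu, d_lk, d0_l, d0_k):
    """Return which of the four equilibria applies, given initial city sizes.

    Assumes gamma > 0 and A_k * (1 - gamma * d0_k) >= A_l * (1 - gamma * d0_l).
    """
    U_S = A_l * (1.0 - gamma * d0_l)               # stay and work in l
    U_M = A_k * (1.0 - gamma * d0_k) - mu * d_lk   # move to k and work there
    U_C = A_k * (1.0 - gamma * d_lk)               # live in l, commute to k

    if U_S >= U_M and U_S >= U_C:                  # case i
        return "staying"
    if U_C > U_S and U_C > U_M:                    # case iv
        return "commuting"
    if U_S >= U_C:                                 # case ii, first condition
        return "moving"

    # Remaining case: U_M >= U_C > U_S. Find m that equalizes the moving and
    # staying utilities, then compare the moving utility at m with U_C.
    # (An exact tie U_M == U_C is not covered by the definition; this sketch
    # treats it like the strict case, which is an assumption.)
    m = (A_k - A_l - gamma * (A_k * d0_k - A_l * d0_l) - mu * d_lk) / (
        gamma * (A_l + A_k))
    U_M_after = A_k * (1.0 - gamma * (d0_k + m)) - mu * d_lk
    return "moving" if U_M_after >= U_C else "moving and commuting"


# Hypothetical parameter values, for illustration only.
params = dict(A_l=1.0, A_k=1.1, mu=0.03, d_lk=1.0, d0_l=0.2, d0_k=0.54)
results = {g: classify_equilibrium(gamma=g, **params)
           for g in (0.25, 0.15, 0.08, 0.05)}
for g, label in results.items():
    print(f"gamma = {g:.2f}: {label} equilibrium")
```

With these illustrative values, lowering the commuting cost takes the economy through all four cases in turn.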

Now that we have formally defined the equilibrium, in the following subsection we analyze

whether the smaller city gains or loses population when certain parameter values, such as com-

muting costs, change. This will provide us with valuable predictions on the incidence of urban

shadows and urban spillovers, and it will allow us to link the theory’s predictions to the stylized

facts uncovered in the empirical part of the paper.

4.2 Urban Growth Shadows and Spillovers

In this subsection we explore how a drop in commuting costs affects urban shadows and urban

spillovers. To gain some understanding of the role of commuting costs, consider an agent residing

in the smaller, less productive city. If commuting costs are large, an agent has less incentive to

move or to commute to the larger, more productive city, because in either case she would be facing

longer commuting distances. If commuting costs are at an intermediate level, moving becomes more

attractive: although the commuting distance increases, it does so by less than if the agent were

to commute from the smaller to the larger city. If commuting costs drop far enough, commuting

to the larger city becomes the better choice: she saves the fixed cost of moving, while benefiting

from low commuting costs. This intuition suggests that a drop in γ makes moving relatively more

attractive than staying, and makes commuting relatively more attractive than moving.

Starting off in a situation where all individuals have the same utility, we formalize this

intuition and show that a gradual decrease in γ first shifts the economy from a staying equilibrium

to an inter-city moving equilibrium, with some residents of the smaller low-productivity city moving

to the larger high-productivity city. Later, as γ continues to drop, the economy shifts to an inter-city

moving and commuting equilibrium, and then to an inter-city commuting equilibrium, with some

original residents of the smaller low-productivity city commuting to the larger high-productivity

city. This is stated in the following result.

Result 1. Start off in an equilibrium where $A_k > A_\ell$ and where the utility of all individuals is

identical. For a value of µ that is sufficiently small, a gradual drop in commuting costs, γ, moves

the economy sequentially from a staying equilibrium to an inter-city moving equilibrium, an inter-

city moving and commuting equilibrium, and an inter-city commuting equilibrium. In the inter-city

moving equilibrium the smaller city loses residents to the larger city, whereas in the inter-city

moving and commuting equilibrium the smaller city gains residents from the larger city.

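Proof. Initially $A_\ell(1 - \gamma d^0_\ell) = A_k(1 - \gamma d^0_k)$ and $A_k > A_\ell$, so that $d^0_k > d^0_\ell$. In this case, $A_\ell(1 - \gamma d^0_\ell) \ge A_k(1 - \gamma d^0_k) - \mu d_{\ell k}$ and $A_\ell(1 - \gamma d^0_\ell) \ge A_k(1 - \gamma d_{\ell k})$ because $d_{\ell k} \ge d^0_\ell$ by construction. Because $A_k d^0_k > A_\ell d^0_\ell$, $-\partial A_\ell(1 - \gamma d^0_\ell)/\partial\gamma < -\partial A_k(1 - \gamma d^0_k)/\partial\gamma$, so that a drop in $\gamma$ leads to $A_k(1 - \gamma d^0_k) > A_\ell(1 - \gamma d^0_\ell)$. If $\gamma$ continues to drop and $\mu < (A_k - A_\ell)/d_{\ell k}$, at some point $A_k(1 - \gamma d^0_k) - \mu d_{\ell k} = A_\ell(1 - \gamma d^0_\ell)$. This occurs when $\gamma$ reaches the threshold $\gamma_m = (A_k - A_\ell - \mu d_{\ell k})/(A_k d^0_k - A_\ell d^0_\ell)$. If $A_k(1 - \gamma_m d_{\ell k}) < A_k(1 - \gamma_m d^0_k) - \mu d_{\ell k}$, which requires $\mu < (A_k(d_{\ell k} - d^0_k)(A_k - A_\ell))/(d_{\ell k}(d_{\ell k} + A_k d^0_k - A_\ell d^0_\ell))$, then as soon as $\gamma$ falls below $\gamma_m$, some of the original residents of $\ell$ will want to move to $k$. To be precise, $\min[m, d^0_\ell]$ people who originally lived in $\ell$ will move to $k$, where $A_k(1 - \gamma(d^0_k + m)) - \mu d_{\ell k} = A_\ell(1 - \gamma(d^0_\ell - m))$. As $\gamma$ continues to drop, $m$ will increase. At some point, the drop in $\gamma$ reaches $A_k(1 - \gamma(d^0_k + \min[m, d^0_\ell])) - \mu d_{\ell k} = A_k(1 - \gamma d_{\ell k})$. We refer to this threshold as $\gamma_{mc}$, where $\gamma_{mc} = \mu d_{\ell k}/(A_k(d_{\ell k} - d^0_k - \min[m, d^0_\ell]))$. Any further drop in $\gamma$ will now imply that some of the original residents of $\ell$ will prefer to commute to $k$. If $\gamma$ continues to drop, an increasing share of the original residents of $\ell$ commute. There is a threshold $\gamma_c = \mu d_{\ell k}/(A_k(d_{\ell k} - d^0_k))$, below which all original residents of $\ell$ commute to $k$.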
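The statement in the proof that m increases as γ falls can be checked directly, since the indifference condition between moving and staying is linear in m and can be solved in closed form. The Python sketch below does so; the function names and the initial city sizes are hypothetical values chosen for illustration, and the sketch ignores the cap of m at the initial size of the small city.

```python
def mover_measure(A_l, A_k, gamma, mu, d_lk, d0_l, d0_k):
    """Solve A_k(1 - gamma(d0_k + m)) - mu*d_lk = A_l(1 - gamma(d0_l - m)) for m.

    Requires gamma > 0. The result is not capped at d0_l.
    """
    return (A_k - A_l - gamma * (A_k * d0_k - A_l * d0_l) - mu * d_lk) / (
        gamma * (A_l + A_k))


def indifference_gap(m, A_l, A_k, gamma, mu, d_lk, d0_l, d0_k):
    """Moving utility minus staying utility once m people have moved."""
    moving = A_k * (1.0 - gamma * (d0_k + m)) - mu * d_lk
    staying = A_l * (1.0 - gamma * (d0_l - m))
    return moving - staying


# Hypothetical parameter values, for illustration only.
base = dict(A_l=1.0, A_k=1.1, mu=0.03, d_lk=1.0, d0_l=0.2, d0_k=0.54)

m_high_cost = mover_measure(gamma=0.15, **base)
m_low_cost = mover_measure(gamma=0.12, **base)

gap_high = indifference_gap(m_high_cost, gamma=0.15, **base)
gap_low = indifference_gap(m_low_cost, gamma=0.12, **base)

print(f"m at gamma = 0.15: {m_high_cost:.4f} (gap {gap_high:.1e})")
print(f"m at gamma = 0.12: {m_low_cost:.4f} (gap {gap_low:.1e})")
```

In this illustration the number of movers is larger at the lower commuting cost, and the gap between the moving and staying utilities is zero at the computed m in both cases.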

The above result implies three threshold values of γ. A high threshold, γm, a middle threshold, γmc,

and a low threshold, γc, such that for γ ≥ γm, we are in a staying equilibrium, for γm > γ ≥ γmc,

we are in an inter-city moving equilibrium, for γmc > γ ≥ γc, we are in an inter-city moving and

commuting equilibrium, and for γ < γc, we are in an inter-city commuting equilibrium.

Figure 2: Inter-city Moving and Commuting vs. Commuting Cost

This figure illustrates Result 1. It shows how the share of the population of the small city, as well as the share of commuters, changes with commuting costs. A drop in commuting costs leads the small city to first lose population, and then to gain population. For high commuting costs ($\gamma \ge \gamma_m$), individuals from the small, less productive city have no incentive to move or commute to the large, more productive city. As commuting costs drop ($\gamma_m > \gamma \ge \gamma_{mc}$), individuals from the small, less productive city move to the large, more productive city. The population of the small city declines: an urban growth shadow. For low levels of commuting costs ($\gamma < \gamma_{mc}$), individuals from the small, less productive city commute to the large, more productive city, and its original residents start to return: an urban spillover. Eventually, the small city recovers its original population ($\gamma < \gamma_c$).

Figure 2 illustrates Result 1 with a simple numerical example. The productivity values are

set such that the large city has a TFP that is 10% higher than the small city: $A_\ell = 1$ and $A_k = 1.1$.

The inter-city distance is set to 1, and the overall population to 1.5: with cities being symmetric

around their production points, this implies 75% of the land between the two cities is occupied.

We set the moving cost parameter µ to 0.03. When taking the initial income in the small city as

reference, this amounts to a cost of slightly more than 3% in income-equivalence terms. We choose

the initial value of γ to be 0.25, implying that people would lose 25% of their income if they were

to commute to the other city.
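The parameter values above can be plugged into the threshold formulas from the proof of Result 1. The Python sketch below does this under one assumption of ours that the text does not state explicitly: because cities are symmetric around their production points, we take the two initial city-edge distances to sum to half of the total population (0.75). The function names are likewise hypothetical.

```python
def initial_edges(A_l, A_k, gamma0, population):
    """Initial city-edge distances that equalize utility across the two cities.

    Assumption (ours): d0_l + d0_k = population / 2, since each city extends
    symmetrically on both sides of its production point.
    Solves A_l * (1 - gamma0 * d0_l) = A_k * (1 - gamma0 * d0_k).
    """
    half = population / 2.0
    d0_k = (A_k - A_l + A_l * gamma0 * half) / (gamma0 * (A_l + A_k))
    return half - d0_k, d0_k


def gamma_m(A_l, A_k, mu, d_lk, d0_l, d0_k):
    """Threshold below which residents of the small city start to move."""
    return (A_k - A_l - mu * d_lk) / (A_k * d0_k - A_l * d0_l)


def gamma_c(A_k, mu, d_lk, d0_k):
    """Threshold below which all original residents of the small city commute."""
    return mu * d_lk / (A_k * (d_lk - d0_k))


# Parameter values of the numerical example in the text.
A_l, A_k, mu, d_lk, population, gamma0 = 1.0, 1.1, 0.03, 1.0, 1.5, 0.25

d0_l, d0_k = initial_edges(A_l, A_k, gamma0, population)
g_m = gamma_m(A_l, A_k, mu, d_lk, d0_l, d0_k)
g_c = gamma_c(A_k, mu, d_lk, d0_k)

# Moving cost relative to initial income in the small city.
moving_cost_share = mu * d_lk / (A_l * (1.0 - gamma0 * d0_l))

print(f"d0_l = {d0_l:.4f}, d0_k = {d0_k:.4f}")
print(f"gamma_m = {g_m:.4f}, gamma_c = {g_c:.4f}")
print(f"moving cost / initial income = {moving_cost_share:.4f}")
```

Under this assumption the moving cost comes to about 3.2% of initial income in the small city, consistent with the "slightly more than 3%" reported in the text, and the sketch yields a moving threshold of about 0.175 and a commuting threshold of about 0.060.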

Starting off with a geographic distribution of population such that utility levels are the

same in both cities, we analyze what happens to the population share of the small city and to the

population share commuting to the large city as we lower γ from 0.25 all the way to zero. As can

be seen in Figure 2, when γ > γm, there is no inter-city moving or commuting. All residents of

the small city stay put, so that its population remains constant. Once γ drops below γm, some

residents of the small city move to the large city, and the population of the small city gradually

declines as the commuting cost continues to decrease. When γ falls below γmc, some residents

from the small city start to commute to the large city. As the commuting cost further drops, the

population of the small city slowly recovers, as some of the movers return and prefer to commute.

Once the commuting cost falls to γc, the small city reaches its original population level, with all of

its residents commuting to the large city.

Relation to Urban Shadows and Spillovers. What does Result 1 tell us about urban growth

shadows and urban growth spillovers? As the commuting cost drops, residents of the smaller city

move to the nearby larger city, and the smaller city loses population. The larger city casts an urban

growth shadow: the nearby smaller city suffers in terms of population growth. A further drop in

the commuting cost reverses this trend, as residents of the smaller city find it more attractive to

commute to the larger city than to move. The larger city no longer displays an urban growth

shadow, but exhibits urban spillovers instead: the nearby smaller city gains in terms of population

growth.

Relation to Empirical Stylized Facts. Our description of the evolution of commuting costs in

the U.S. between 1840 and 2017 suggests a slow decline in γ between 1840 and 1920, a rapid fall in

γ between 1920 and the turn of the 21st century, and a slowdown in the decrease in γ during the last

two decades. In light of Result 1, this would be consistent with an early time period where growth

shadows dominated and a later time period where growth spillovers dominated, with a weakening

in those spillovers in more recent times. This is consistent with our empirical findings for the U.S.,

as summarized in Stylized Fact 1 and Stylized Fact 2.

4.3 Geographic Span of Urban Shadows and Spillovers

We now explore how urban shadows and urban spillovers depend on the distance to the larger city.

The following result states that if inter-city distance increases, then all three threshold values of

the commuting cost are lower.

Result 2. Thresholds $\gamma_m$, $\gamma_{mc}$ and $\gamma_c$ are declining in $d_{\ell k}$. That is, if the distance to the larger

city increases, the shift from a staying equilibrium to an inter-city moving equilibrium, from an

inter-city moving equilibrium to an inter-city moving and commuting equilibrium, and from an

inter-city moving and commuting equilibrium to an inter-city commuting equilibrium, occurs for

lower values of the commuting cost γ.

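Proof. From the proof of Result 1, we can write $\gamma_m = (A_k - A_\ell - \mu d_{\ell k})/(A_k d^0_k - A_\ell d^0_\ell)$. It is clear that $d\gamma_m/dd_{\ell k} < 0$. From the same proof of Result 1, we can write $\gamma_{mc} = \mu d_{\ell k}/(A_k(d_{\ell k} - d^0_k - \min[m, d^0_\ell]))$, where $m$ can be written as $(A_k - A_\ell - A_k \gamma_{mc} d^0_k + A_\ell \gamma_{mc} d^0_\ell - \mu d_{\ell k})/(\gamma_{mc}(A_\ell + A_k))$. Together, this implies that

$$
\gamma_{mc} = \max\left[\frac{\mu d_{\ell k}}{A_k(d_{\ell k} - d^0_k - d^0_\ell)},\ \frac{A_k(A_k - A_\ell) + A_\ell d_{\ell k}\mu}{A_k(A_k + A_\ell)d_{\ell k} - A_k A_\ell(d^0_k + d^0_\ell)}\right].
$$

Here as well, it is immediate that $d\gamma_{mc}/dd_{\ell k} < 0$. Threshold $\gamma_c$ is reached when $m$ in the above expression is equal to zero, so $\gamma_c = \mu d_{\ell k}/(A_k(d_{\ell k} - d^0_k))$. It is immediate that $d\gamma_c/dd_{\ell k} < 0$.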

The above result says that when the larger city is geographically farther away, commuting costs

need to drop more before individuals from the smaller city want to move to the bigger city, and

they also need to drop more before they find it profitable to commute to the bigger city.

Figure 3: Inter-city Moving and Commuting Thresholds vs. Inter-City Distance

This figure illustrates Result 2 by showing how the three commuting cost thresholds ($\gamma_m$, $\gamma_{mc}$ and $\gamma_c$) decrease as inter-city distance increases. This result implies that as commuting costs decline, urban shadows first expand in space as the moving equilibrium applies to increasingly farther-away locations, and then urban spillovers expand in space as the moving & commuting equilibrium applies to increasingly farther-away locations.

Figure 3 illustrates Result 2 with a simple numerical example. We use the same parameter

values as before, with the exception of inter-city distance $d_{\ell k}$, which we now vary from 1.0 to 2.5.29

For each level of inter-city distance, we plot the three threshold values. As can be seen, the larger

the inter-city distance, the more commuting costs need to drop before people start moving to the

larger city, and before they start commuting to the larger city.
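The pattern described here can be reproduced from the closed-form thresholds in the proofs of Results 1 and 2. The Python sketch below evaluates the three thresholds on a grid of inter-city distances between 1.0 and 2.5, using the parameter values of the numerical example. As an assumption of ours, the initial city-edge distances are taken to sum to half of the total population and are set so that utility equalizes at a commuting cost of 0.25; the function name `thresholds` and the grid spacing are hypothetical.

```python
def thresholds(d_lk, A_l=1.0, A_k=1.1, mu=0.03, population=1.5, gamma0=0.25):
    """Return (gamma_m, gamma_mc, gamma_c) for a given inter-city distance.

    Assumption (ours): initial edges satisfy d0_l + d0_k = population / 2 and
    equalize utility across cities at commuting cost gamma0.
    """
    half = population / 2.0
    d0_k = (A_k - A_l + A_l * gamma0 * half) / (gamma0 * (A_l + A_k))
    d0_l = half - d0_k

    g_m = (A_k - A_l - mu * d_lk) / (A_k * d0_k - A_l * d0_l)
    # gamma_mc is the larger of the two candidate expressions in the proof.
    g_mc = max(
        mu * d_lk / (A_k * (d_lk - d0_k - d0_l)),
        (A_k * (A_k - A_l) + A_l * d_lk * mu)
        / (A_k * (A_k + A_l) * d_lk - A_k * A_l * (d0_k + d0_l)),
    )
    g_c = mu * d_lk / (A_k * (d_lk - d0_k))
    return g_m, g_mc, g_c


# Inter-city distances from 1.0 to 2.5 in steps of 0.25.
distances = [1.0 + 0.25 * i for i in range(7)]
table = [(d, *thresholds(d)) for d in distances]

print("d_lk   gamma_m  gamma_mc  gamma_c")
for d, g_m, g_mc, g_c in table:
    print(f"{d:4.2f}   {g_m:7.4f}  {g_mc:8.4f}  {g_c:7.4f}")
```

Under these assumptions all three thresholds fall as the distance grows, and they keep the same ordering at every distance on the grid, with the moving threshold highest and the commuting threshold lowest.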

29To make the results comparable to the other figures, for each value of $d_{\ell k}$, population is allocated across cities such that utility equalizes for $\gamma = 0.25$.

Relation to Geographic Span of Shadows and Spillovers. What do these findings tell us

about the geographic span of urban shadows and urban spillovers? Using Figure 3, consider a

commuting cost that is relatively high, say, γ = 0.14. In that case, population growth in the

smaller city is relatively lower if the larger city is close-by than if it is farther away. We see urban

growth shadows at close distances, and no effect at further distances. As the commuting cost falls

from this relatively high level, the larger city’s urban growth shadow increases its geographic reach.

Once commuting costs become low enough, we see the emergence of urban spillovers. Consider, for

example, a commuting cost of γ = 0.07. In that case, population growth is relatively higher if the

larger city is in the vicinity, and it is relatively lower if the larger city is at a greater distance. That

is, urban spillovers dominate at short distances, and urban shadows dominate at farther distances.

As the commuting cost falls further, the spatial reach of urban spillovers increases.

Relation to Empirical Stylized Facts. Result 2 allows us to trace the changing geographic

reach of urban shadows and spillovers as commuting costs fall. Initially, it predicts urban shadows

at relatively short distances, that gradually expand as transport costs drop. Eventually, these

shadows are replaced by spillovers, again first at relatively short distances, but later at farther

away distances as the spatial reach of spillovers expands. This is consistent with our empirical

findings for the U.S., as summarized in Stylized Fact 3.

4.4 Relative Size of Large City

How do urban shadows and urban spillovers depend on the relative size of the large city? The

following result shows that the moving threshold is increasing in the relative size of the large city.

That is, commuting costs have to fall by less before a large city starts attracting the population of

its hinterland.

Result 3. Keeping population-weighted productivity unchanged, the threshold γm is increasing in

the relative size of the larger city. That is, the shift from a staying equilibrium to an inter-city

moving equilibrium occurs for a higher value of the commuting cost if the larger city has a bigger

relative size.

Proof. From the proof of Result 1, we can write

$$\gamma_m = \frac{A_k - A_\ell}{A_k d_k^0 - A_\ell d_\ell^0} - \frac{\mu d_{\ell k}}{A_k d_k^0 - A_\ell d_\ell^0}.$$

Our aim is to show that $\gamma_m$ is increasing in $d_k^0/d_\ell^0$. To do so, we consider the two terms in the $\gamma_m$ expression separately. Because we start off in an equilibrium where $A_\ell(1-\gamma_0 d_\ell^0) = A_k(1-\gamma_0 d_k^0)$, where $\gamma_0$ is the initial value of $\gamma$, it follows that $\frac{A_k - A_\ell}{A_k d_k^0 - A_\ell d_\ell^0} = \gamma_0$. Hence, the first term of the $\gamma_m$ expression above does not depend on the relative size $d_k^0/d_\ell^0$. This leaves us with the second term, $-\frac{\mu d_{\ell k}}{A_k d_k^0 - A_\ell d_\ell^0}$. Because $A_\ell(1-\gamma_0 d_\ell^0) = A_k(1-\gamma_0 d_k^0)$, it follows that $\frac{A_k}{A_\ell} = \frac{1-\gamma_0 d_\ell^0}{1-\gamma_0 d_k^0}$. If $d_k^0/d_\ell^0$ increases, we know that $d_k^0$ increases and $d_\ell^0$ decreases, since $d_\ell^0 + d_k^0$ is a constant. As a result, if $d_k^0/d_\ell^0$ increases, it follows that $A_k/A_\ell$ increases. Recall that we are keeping population-weighted productivity the same, so $A_k d_k^0 + A_\ell d_\ell^0$ is a constant we denote by $\lambda$. Hence, $A_k d_k^0 - A_\ell d_\ell^0 = \lambda - 2A_\ell d_\ell^0$. If the larger city becomes larger and its relative productivity increases and the overall productivity is unchanged, it must be that the productivity of the small city decreases. It hence follows that $A_k d_k^0 - A_\ell d_\ell^0$ increases. This implies that the second term, $-\frac{\mu d_{\ell k}}{A_k d_k^0 - A_\ell d_\ell^0}$, is increasing in $A_k/A_\ell$, so that $\gamma_m$ is increasing in $A_k/A_\ell$.

The above result shows that larger cities exert a stronger gravitational pull on their hinterland, as

they start casting their urban shadows at higher levels of commuting costs. Figure 4 illustrates this

result, and further shows that the same applies to urban spillovers. We use the same parameter

values as before, with the exception of the productivity Ak which we now vary in order for the

city size to change. As the relative size of k increases from 0.65 to 0.85, all three threshold values

increase.
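Result 3 can also be checked numerically. The following is a minimal sketch under stated assumptions: it imposes the two conditions used in the proof (utility equalization at the initial commuting cost and a constant population-weighted productivity λ), it covers only the moving threshold γm, and all numerical values and the function name are hypothetical.

```python
def gamma_m_for_share(share_k, lam, gamma0, half_total, mu, d_lk):
    """Moving threshold when city k holds a share `share_k` of total population.

    Productivities are chosen so that (i) utility is equalized at gamma0 and
    (ii) A_k*d0_k + A_l*d0_l equals the constant `lam`.
    """
    d0_k = share_k * half_total
    d0_l = half_total - d0_k
    ratio = (1 - gamma0 * d0_l) / (1 - gamma0 * d0_k)  # A_k / A_l from utility equalization
    A_l = lam / (ratio * d0_k + d0_l)                   # from constant weighted productivity
    A_k = ratio * A_l
    gap = A_k * d0_k - A_l * d0_l
    return (A_k - A_l) / gap - mu * d_lk / gap


# Hypothetical values; only the range of relative sizes follows the text.
shares = [0.65, 0.70, 0.75, 0.80, 0.85]
thresholds = [
    gamma_m_for_share(s, lam=1.0, gamma0=0.25, half_total=0.9, mu=0.01, d_lk=2.0)
    for s in shares
]

for s, g in zip(shares, thresholds):
    print(f"relative size of k = {s:.2f}   gamma_m = {g:.4f}")
```

With these assumed numbers the threshold rises with the relative size of the larger city while staying below the initial commuting cost, in line with the statement of Result 3.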

This figure illustrates Result 3 by showing how the three commuting cost thresholds (γm, γmc and γc) increase as the relative size of the large city increases. This result implies that as commuting costs decline, it is the largest cities that first cast their growth shadow on their smaller neighbors, and likewise, it is the largest cities that first spread their growth spillovers to their smaller neighbors.

Relation to Empirical Stylized Facts. Result 3 implies that as commuting costs decline, it is

the largest cities that first cast their urban growth shadow on their smaller neighbors, and likewise,

it is the largest cities that first spread their urban growth spillovers to their smaller neighbors.

The stronger urban shadows and urban spillovers of larger cities are consistent with our empirical

findings for the U.S., as summarized in Stylized Fact 4.

4.5 Alternative Interpretations

Commuting costs are central to our conceptual framework. Their evolution affects whether urban

growth shadows or urban growth spillovers dominate, and they are also key in determining the

geographic reach of these shadows and spillovers. Indeed, by focusing on the evolution of just this

variable, our conceptual framework is able to account for the main stylized facts we identified when

empirically studying growth shadows and spillovers in the U.S. over the period 1840 to 2017.

One potential issue with our interpretation is that in some of the later time periods, after

1980, the geographic span of urban spillovers reached 200km. At face value this seems well beyond

standard inter-city commuting distances, so one could doubt whether in this most recent time period

the conceptual framework captures the essence of what we observe in the data. There are at least

three reasons why our interpretation may still hold. First, although between 1980 and 2017 we find

evidence of urban spillovers having a large geographic reach, those effects dissipate with distance.

For example, correlations between 150km and 200km are one-half to one-quarter their magnitudes

between 1km and 100km. Second, the empirical correlations should always be interpreted relative

to the excluded category (e.g., locations that have no large neighbors within 300km). If in recent

time periods geographically isolated locations have been experiencing particularly low growth, this

pushes up the relative growth rate of all other locations, including those that are, say, 200km

away from large neighbors. Third, although a distance of 200km between the centroid of a rural

county and the centroid of a large metro area may be beyond standard commuting distances, the

distance between that same county and the edge of a large metro area may very well still be within

reasonable commuting time.

That being said, an alternative is that inter-city commuting costs are not the driving force

behind what we observe in the data. As mentioned before, our conceptual framework is equivalent

to one with intra-city (but no inter-city) commuting costs, and with technological spillovers that

decay with distance. If in this alternative model technological spillovers decrease by a share γ per

unit of distance in exactly the same way as the hours worked of a commuter decrease by a share γ

per unit of distance, then both models are observationally equivalent. Workers of the smaller city

have access to the same discounted productivity of their larger neighbor: either by benefitting from

technological spillovers from the larger city or by commuting to the larger city.

In addition to inter-city commuting costs and spatial technological spillovers, a third force

that may contribute to the attractiveness of the larger city is trade and market access. In the

Appendix we consider an alternative model that incorporates trade between both cities. Once

again, we switch off the possibility of inter-city commuting and only allow for intra-city commuting.

Because intra-city commuting is more costly in the large than in the small city, falling transport

costs make it relatively more attractive to move to the large city: an urban growth shadow. At

the same time, falling transport costs make it easier to trade with the large city, thus increasing

the incentive to reside and produce in the nearby small city: an urban growth spillover. In the

Appendix we use simulations to show that such an alternative model can generate similar dynamics

to the ones we observe in the U.S. data between 1840 and 2017.

5 Concluding Remarks

In this paper we have analyzed whether a location’s growth benefits or suffers from being geograph-

ically close to a large urban center. To do so, we have focused on U.S. counties and metro areas

over the time period 1840-2017. We have found evidence of urban shadows between 1840 and 1920

and of urban spillovers between 1920 and 2017. Proximity to large urban clusters was negatively

correlated with a location’s growth in the early time period, and positively correlated in the later

time period, albeit with some weakening of this positive correlation in the last decades.

The conceptual framework we have developed suggests that as the cost of commuting drops,

individuals first have an incentive to move from smaller close-by cities to larger urban centers. Later,

if commuting costs continue to fall, individuals prefer to commute, rather than to move, from the

smaller to the larger cities. This implies that falling transport costs first hurt, and then help, the

growth of smaller locations in the vicinity of large urban centers. After documenting the long-run

evolution of commuting costs, we have shown that our framework is consistent with the empirical

evidence.

As such, a single variable — commuting costs — is able to capture the growth patterns of

small cities in the hinterland of large urban clusters over the time period stretching from 1840 to

2017. Other factors are of course likely to have contributed to these spatial growth patterns. In

this context, we have discussed the role of the spatial diffusion of technology, as well as the possible

importance of market access and trade.

References

[1] Ahlfeldt, G. M., S. J. Redding, D. M. Sturm, and N. Wolf (2015). “The Economics of Density:

Evidence from the Berlin Wall,” Econometrica, 83, 2127-2189.

[2] Allen, T., and C. Arkolakis (2014). “Trade and the Topography of the Spatial Economy,”

Quarterly Journal of Economics, 129, 1085-1140.

[3] Baum-Snow, N. (2007). “Did Highways Cause Suburbanization?,” Quarterly Journal of Eco-

nomics, 122, 775-805.

[4] Beltrán, F. J., A. Díez-Minguela, and J. Martínez-Galarraga (2017). “The Shadow of Cities:

Size, Location, and the Spatial Distribution of Population in Spain,” Cambridge Working

Paper Economics 1749.

[5] Bernard, A. (2014). Lifted: A Cultural History of the Elevator, New York: NYU Press.

[6] Bernhofen, D. M., El-Sahli, Z., and R. Kneller (2016). “Estimating the Effects of the Container

Revolution on World Trade,” Journal of International Economics, 98, 36-50.

[7] Black, D., and V. Henderson (2003). “Urban Evolution in the USA,” Journal of Economic

Geography, 3, 343-372.

[8] Bosker, M., and E. Buringh (2017). “City Seeds: Geography and the Origins of European

Cities,” Journal of Urban Economics, 98, 139-157.

[9] Christaller, W. (1933). Central Places in Southern Germany. Jena, Germany: Fischer (English

translation by C. W. Baskin, London: Prentice Hall, 1966).

[10] Conley, T. (1999). “GMM Estimation with Cross Sectional Dependence,” Journal of Econo-

metrics, 92, 1-45.

[11] Cronon, W. (1991). Nature’s Metropolis: Chicago and the Great West, New York: W.W.

Norton.

[12] Davis, D. R., and D. E. Weinstein (2002). “Bones, Bombs, and Break Points: The Geography

of Economic Activity,” American Economic Review, 92, 1269-1289.

[13] Desmet, K., D. Nagy, and E. Rossi-Hansberg (2018). “The Geography of Development,” Jour-

nal of Political Economy, 126, 903-983.

[14] Desmet, K., and J. Rappaport (2017). “The Settlement of the United States, 1800-2000: The

Long Transition towards Gibrat’s Law,” Journal of Urban Economics, 98, 50-68.

[15] Desmet, K., and E. Rossi-Hansberg (2014). “Spatial Development,” American Economic Re-

view, 104, 1211-1243.

[16] Dobkins, L. H., and Y. Ioannides (2001). “Spatial Interactions among U.S. Cities,” Regional

Science and Urban Economics, 31, 701-731.

[17] Donaldson, D., and R. Hornbeck (2016). “Railroads and American Economic Growth: A

Market Access Approach,” Quarterly Journal of Economics, 131, 799-858.

[18] Edlund, L., C. Machado, and M. Sviatschi (2016). “Bright Minds, Big Rent: Gentrification

and the Rising Returns to Skill,” NBER Working Paper # 21729.

[19] Fogel, R. W. (1964). Railroads and American Economic Growth: Essays in Econometric His-

tory, Baltimore, MD: Johns Hopkins University Press.

[20] Fujita, M., P. Krugman, and A. J. Venables (1999). The Spatial Economy, Cambridge, MA:

MIT Press.

[21] Gardner, T. (1999). “Metropolitan Classification for Census Years before World War II,”

Historical Methods, 32, 139-150.

[22] Glaeser, E. L. (2011). Triumph of the City, London: MacMillan.

[23] Glaeser, E. L., and M. Kahn (2004). “Sprawl and Urban Growth,” in: J. V. Henderson and

J.-F. Thisse (eds.), Handbook of Regional and Urban Economics, Vol. 4, Elsevier.

[24] Glaeser, E. L., and M. Kohlhase (2004). “Cities, Regions and the Decline of Transport Costs,”

Papers in Regional Science, 83, 197-228.

[25] Hanson, G. (2005). “Market Potential, Increasing Returns and Geographic Concentration,”

Journal of International Economics, 67, 1-24.

[26] Horan, P. M., and P. G. Hargis (1995). “County Longitudinal Template, 1840-1990.” [com-

puter file]. ICPSR Study 6576. Inter-university Consortium for Political and Social Research

[distributor]. Corrected and amended by Patricia E. Beeson and David N. DeJong, Department

of Economics, University of Pittsburgh, 2001. Corrected and amended by Jordan Rappaport,

Federal Reserve Bank of Kansas City, 2010.

[27] Jackson, K. T. (1985). Crabgrass Frontier. The Suburbanization of the United States. Oxford

University Press.

[28] Kopecky, K., and M. H. Suen (2004). “Economie d’Avant Garde: Suburbanization and the

Automobile.” Research Report No. 6.

[29] Krugman, P. (1993). “On the Number and Location of Cities,” European Economic Review,

37, 293-298.

[30] LeRoy, S. F., and J. Sonstelie (1983), “Paradise Lost and Regained: Transportation Innovation,

Income, and Residential Location,” Journal of Urban Economics, 13, 67-89.

[31] Liu, Y., X. Wang, and J. Wu (2011). “Do Bigger Cities Contribute to Economic Growth in

Surrounding Areas? Evidence from County-Level Data in China,” unpublished manuscript.

[32] Lösch, A. (1940). The Economics of Location, Jena, Germany: Fischer (English translation,

New Haven, CT: Yale University Press, 1954).

[33] Michaels, G., F. Rauch, and S. J. Redding (2012). “Urbanization and Structural Transforma-

tion,” Quarterly Journal of Economics, 127, 535-586.

[34] Mieszkowski, P., and E. S. Mills (1993). “The Causes of Metropolitan Suburbanization,” Jour-

nal of Economic Perspectives, 7, 135-147.

[35] Partridge M. D., D. S. Rickman, K. Ali, and M. R. Olfert (2009). “Do New Economic Ge-

ography Agglomeration Shadows Underlie Current Population Dynamics across the Urban

Hierarchy?,” Papers in Regional Science, 88, 445-466.

[36] Rappaport, J. (2005). “The Shared Fortunes of Cities and Suburbs,” Federal Reserve Bank of

Kansas City Economic Review, Third Quarter, 33-59.

[37] Rappaport, J. (2007). “Moving to Nice Weather,” Regional Science and Urban Economics, 37,

375-398.

[38] Rappaport, J., and J. D. Sachs (2003). “The United States as a Coastal Nation,” Journal of

Economic Growth, 8, 5-46.

[39] Rauch, F. (2014). “Cities as Spatial Clusters,” Journal of Economic Geography, 14, 759-773.

[40] Redding, S., and D. Sturm (2008). “The Costs of Remoteness: Evidence from German Division

and Reunification,” American Economic Review, 98, 1766-1797.

[41] Rosenthal, S. S, and W. C. Strange (2003). “Geography, Industrial Organization, and Agglom-

eration,” Review of Economics and Statistics 85, 377-393.

[42] Shaw, R. E. (1990). Canals for a Nation. The Canal Era in the United States, 1790-1860, The

University Press of Kentucky.

[43] Su, Y. (2018). “The Rising Value of Time and the Origin of Urban Gentrification,” unpublished

manuscript.

[44] Tabuchi, T., and J-F. Thisse (2011). “A New Economic Geography Model of Central Places,”

Journal of Urban Economics, 69, 240-252.

[45] Thorndale, W., and W. Dollarhide (1987). Map Guide to the Federal Censuses, 1790-1920,

Genealogical Publishing Company, Baltimore.

[46] Warner, S. B. Jr. (1972). Streetcar Suburbs. The Process of Growth in Boston, 1870-1900,

Harvard University Press and the MIT Press.

[47] Young, J. (2015). “Infrastructure: Mass Transit in 19th- and 20th-Century Urban America.”

Oxford Research Encyclopedia of American History. Online Publication Date: March 2015.

A An Alternative Model with Trade and Market Access

In this Appendix we propose an alternative model without inter-city commuting but with inter-city

trade, and show the existence of the same fundamental tradeoff between urban shadows and urban

spillovers.

Endowments. The economy consists of a continuum of points on a line. The density of land at all points of the line is one. There are $L$ individuals, each residing on one unit of land. Each resident has one unit of time, which she divides between work and commuting. On the line there are two exogenously given production points, indexed by $\ell$ or $k$. The set of individuals living closer to production point $\ell$ than to the other production point comprises city $\ell$. Whereas the total population, $L$, is exogenous, the populations of the two cities, $L_\ell$ and $L_k$, are endogenous. The land rent in city $\ell$ at distance $d_\ell$ from production point $\ell$ is denoted by $r_\ell(d_\ell)$. The distance between production points $\ell$ and $k$, denoted by $d_{\ell k}$, is big enough so that there is at least some empty land between the two cities. Land is owned by absentee landlords.

Technology and preferences. Each city produces a different good, firms are competitive, and labor is the only factor of production. Technology is linear, with one unit of labor producing $A_\ell$ units of the good at production point $\ell$ and $A_k$ units of the good at production point $k$.

To produce, an individual needs to commute to the production point of the city where she resides. In contrast to the model in the main paper, there is no inter-city commuting. The time cost of intra-city commuting per unit of distance is $\gamma$. Hence, an individual who resides in city $\ell$ at a distance $d_\ell$ from production point $\ell$ supplies $1-\gamma d_\ell$ units of labor, and produces $A_\ell(1-\gamma d_\ell)$ units of her city’s good. Her wage income, $w_\ell(d_\ell)$, is therefore $p_\ell A_\ell(1-\gamma d_\ell)$, where $p_\ell$ is the free-on-board (f.o.b.) price of the good produced in city $\ell$. Her income net of land rents paid, $y_\ell(d_\ell)$, is $w_\ell(d_\ell)-r_\ell(d_\ell)$. Land rents are paid in terms of the local good to the absentee landlords, and then disappear from the economy. When a good of city $\ell$ is shipped to city $k$, a share $\gamma'$ is lost per unit of distance, so $1-\gamma' d_{\ell k}$ units arrive. Hence, the price of good $\ell$ in city $k$ is $p_\ell/(1-\gamma' d_{\ell k})$.

People can freely choose where to reside in their city. This implies that $y_\ell$ equalizes across all locations within a city. At the edge of city $\ell$, land rents are zero, so $r_\ell(\bar{d}_\ell) = 0$, where $\bar{d}_\ell$ refers to the distance between the city center and the city edge. Hence, for all residents of city $\ell$, income net of land rents is

$$y_\ell = p_\ell A_\ell(1-\gamma \bar{d}_\ell). \tag{A.1}$$

To move to another city, an individual has to pay a utility cost $\mu d_{\ell k}$.30 We assume that a return migrant does not pay a moving cost. That is, if an individual who moved from city $\ell$ to city $k$ returns to her hometown, she does not pay a moving cost.

30This introduces a utility difference between the original residents of a city and the immigrant residents from the other city. However, it does not lead to a difference in their income net of land rents, so that (A.1) applies to both the original residents and the immigrant residents.

Agents have CES preferences over the two different goods. The utility of an individual originally from city $k$ and residing in city $\ell$ can then be defined as

$$\left(c_{\ell\ell}^{\frac{\sigma-1}{\sigma}} + c_{\ell k}^{\frac{\sigma-1}{\sigma}}\right)^{\frac{\sigma}{\sigma-1}} - I_{k\ell}\,\mu d_{\ell k}, \qquad I_{k\ell} = \begin{cases} 1 & \text{if } k \neq \ell \\ 0 & \text{otherwise} \end{cases} \tag{A.2}$$

where $c_{\ell\ell}$ and $c_{\ell k}$ denote the consumption of good $\ell$ and $k$ by a resident of city $\ell$, $\sigma > 1$ is the elasticity of substitution between both goods, and $I_{k\ell}$ is an indicator value equal to zero if the individual is an original resident of $\ell$ and equal to one if she is an immigrant from the other city.

Aggregate production. In city $\ell$, total production is $\int_0^{\bar{d}_\ell} 2A_\ell(1-\gamma d_\ell)\,dd_\ell = 2A_\ell \bar{d}_\ell(1-\tfrac{1}{2}\gamma \bar{d}_\ell)$, where $\bar{d}_\ell = L_\ell/2$. A part of total production is paid out to the absentee land owners, and disappears from the economy. Net of the payments to land owners, each individual in $\ell$ produces $A_\ell(1-\gamma \bar{d}_\ell)$. As a result, total production net of the payouts to land owners is

$$Q_\ell = 2A_\ell \bar{d}_\ell(1-\gamma \bar{d}_\ell) = A_\ell L_\ell(1-\gamma \bar{d}_\ell). \tag{A.3}$$
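As a consistency check on (A.3), the gap between gross and net production should equal the land rents paid out. The derivation below is a sketch that writes $\bar d_\ell$ for the distance from the center to the edge of city $\ell$ and $x$ for distance from the center, and that assumes, as the definitions above imply, that the rent at distance $x$ equals wage income at $x$ minus the common net income, here expressed in units of the local good:

```latex
\begin{align*}
\int_0^{\bar d_\ell} 2A_\ell(1-\gamma x)\,dx
  &= 2A_\ell\Big[x - \tfrac{1}{2}\gamma x^2\Big]_0^{\bar d_\ell}
   = 2A_\ell \bar d_\ell\big(1 - \tfrac{1}{2}\gamma \bar d_\ell\big), \\
\frac{r_\ell(x)}{p_\ell}
  &= A_\ell(1-\gamma x) - A_\ell(1-\gamma \bar d_\ell)
   = A_\ell\gamma(\bar d_\ell - x), \\
\int_0^{\bar d_\ell} 2A_\ell\gamma(\bar d_\ell - x)\,dx
  &= A_\ell\gamma \bar d_\ell^{\,2}
   = 2A_\ell \bar d_\ell\big(1 - \tfrac{1}{2}\gamma \bar d_\ell\big)
     - 2A_\ell \bar d_\ell\big(1 - \gamma \bar d_\ell\big).
\end{align*}
```

Total rents therefore account exactly for the difference between gross production and the net production in (A.3).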

Aggregate consumption. When solving the utility maximization problem, we can separate the

consumption decision and the residential decision. We start by describing the consumption decision.

An agent who resides in $\ell$ maximizes (A.2) subject to

$$p_\ell A_\ell(1-\gamma \bar{d}_\ell) = p_\ell c_{\ell\ell} + \frac{p_k}{1-\gamma' d_{\ell k}}\, c_{\ell k}. \tag{A.4}$$

The first order conditions yield the following demand for each one of the two goods:

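$$c_{\ell\ell} = \frac{y_\ell\, p_\ell^{-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}}, \qquad c_{\ell k} = \frac{y_\ell \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}}. \tag{A.5}$$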
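A quick way to see that the demands in (A.5) are consistent with the budget constraint (A.4) is to evaluate them numerically. The sketch below is illustrative only: the function name is an assumption, the prices and the city edge are hypothetical, and the remaining values follow the numerical example given later in this Appendix.

```python
def ces_demands(income, p_local, p_other_fob, arrival_share, sigma):
    """CES demands as in (A.5).

    arrival_share = 1 - gamma' * d_lk is the fraction of a shipment that arrives,
    so the delivered price of the other city's good is p_other_fob / arrival_share.
    """
    p_delivered = p_other_fob / arrival_share
    denom = p_local ** (1 - sigma) + p_delivered ** (1 - sigma)
    c_local = income * p_local ** (-sigma) / denom
    c_other = income * p_delivered ** (-sigma) / denom
    return c_local, c_other, p_delivered


# Values from the Appendix's numerical example.
gamma, gamma_trade, d_lk, sigma, A_l = 0.25, 0.25, 3.0, 3.0, 1.0
# Hypothetical prices and city edge, chosen only for illustration.
p_l, p_k, d_bar_l = 1.0, 0.9, 1.0

income = p_l * A_l * (1 - gamma * d_bar_l)  # net income, as in (A.1)
c_ll, c_lk, p_k_delivered = ces_demands(income, p_l, p_k, 1 - gamma_trade * d_lk, sigma)
spending = p_l * c_ll + p_k_delivered * c_lk

print(f"income = {income:.4f}, spending = {spending:.4f}")
print(f"c_ll = {c_ll:.4f}, c_lk = {c_lk:.4f}")
```

Spending exhausts income, and the good that has to be shipped is consumed in smaller quantity because its delivered price is higher.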

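Aggregate demand for goods produced in location $\ell$ is:

$$C_\ell = \frac{y_\ell L_\ell\, p_\ell^{-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}} + \frac{y_k L_k \left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{-\sigma}}{\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}}. \tag{A.6}$$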

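Residential choice. An individual who originally resides in city $\ell$ has a choice to stay in city $\ell$ or to move to city $k$. His indirect utility if he stays in city $\ell$ is:

$$u_{\ell\ell} = \frac{y_\ell}{\left(p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} = \frac{p_\ell A_\ell(1-\gamma \bar{d}_\ell)}{\left(p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} \tag{A.7}$$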

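whereas his indirect utility if he moves to $k$ is

$$u_{\ell k} = \frac{y_k}{\left(\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k} = \frac{p_k A_k(1-\gamma \bar{d}_k)}{\left(\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}. \tag{A.8}$$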

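Note that $\bar{d}_\ell = L/2 - \bar{d}_k$. Denote by $\tilde{d}_\ell$ the value of $\bar{d}_\ell$ that equalizes $u_{\ell\ell}$ and $u_{\ell k}$. That is,

$$\frac{p_\ell A_\ell(1-\gamma \tilde{d}_\ell)}{\left(p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} = \frac{p_k A_k\left(1-\gamma\left(\frac{L}{2} - \tilde{d}_\ell\right)\right)}{\left(\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}. \tag{A.9}$$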

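By analogy, denote by $\tilde{d}_k$ the value of $\bar{d}_k$ that equalizes $u_{kk}$ and $u_{k\ell}$. That is,

$$\frac{p_k A_k(1-\gamma \tilde{d}_k)}{\left(\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} = \frac{p_\ell A_\ell\left(1-\gamma\left(\frac{L}{2} - \tilde{d}_k\right)\right)}{\left(p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}\right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}. \tag{A.10}$$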

The original distribution of population has $2d_\ell^0$ individuals living in $\ell$ and $2d_k^0$ individuals living in $k$, where $2d_\ell^0 + 2d_k^0 = L$. If $\tilde{d}_\ell < d_\ell^0$, then $2(d_\ell^0 - \tilde{d}_\ell)$ people move from $\ell$ to $k$. If $\tilde{d}_k < d_k^0$, then $2(d_k^0 - \tilde{d}_k)$ people move from $k$ to $\ell$. If neither $\tilde{d}_\ell < d_\ell^0$ nor $\tilde{d}_k < d_k^0$, then no one moves and everyone lives in their original location of residence.

Equilibrium. For given parameter values $L$, $A_\ell$, $A_k$, $d_{\ell k}$, $\mu$, $\gamma$, $\gamma'$ and $\sigma$, and for a given initial distribution of individuals across cities, $d_\ell^0$ and $d_k^0$, an equilibrium is a collection of variables $\{p_\ell, p_k, L_\ell, L_k, \bar{d}_\ell, \bar{d}_k, \tilde{d}_\ell, \tilde{d}_k\}$ that satisfy conditions (A.1), (A.9), (A.10) and:

1. Goods market clearing:

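$$L_\ell p_\ell A_\ell(1-\gamma \bar{d}_\ell) = \frac{y_\ell L_\ell\, p_\ell^{1-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}} + \frac{y_k L_k \left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}}{\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}}$$

$$L_k p_k A_k(1-\gamma \bar{d}_k) = \frac{y_\ell L_\ell \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1-\gamma' d_{\ell k}}\right)^{1-\sigma}} + \frac{y_k L_k\, p_k^{1-\sigma}}{\left(\frac{p_\ell}{1-\gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}} \tag{A.11}$$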

2. Labor market clearing:

$$L = L_\ell + L_k \tag{A.12}$$

3. Land market clearing:

$$L_\ell = 2\bar{d}_\ell, \qquad L_k = 2\bar{d}_k \tag{A.13}$$

4. Labor mobility:

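$$\bar{d}_\ell = \begin{cases} \tilde{d}_\ell & \text{if } \tilde{d}_\ell < d_\ell^0 \\ d_\ell^0 + (d_k^0 - \tilde{d}_k) & \text{if } \tilde{d}_k < d_k^0 \\ d_\ell^0 & \text{otherwise} \end{cases} \tag{A.14}$$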
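The labor mobility condition amounts to a three-way case distinction, which can be sketched as a small function. This is an illustrative reading of the migration rule described before the equilibrium definition, not the authors' code. The function name and the example numbers are hypothetical, and `thr_l` and `thr_k` stand for the indifference values of the city edges defined by (A.9) and (A.10).

```python
def edge_after_moving(d0_l, d0_k, thr_l, thr_k):
    """Edge of city l once migration has taken place.

    d0_l, d0_k : initial edges (half-widths) of the two cities
    thr_l      : edge of city l at which its original residents are indifferent, as in (A.9)
    thr_k      : edge of city k at which its original residents are indifferent, as in (A.10)
    """
    if thr_l < d0_l:
        # City l shrinks to thr_l: 2 * (d0_l - thr_l) people move from l to k.
        return thr_l
    if thr_k < d0_k:
        # City k shrinks to thr_k: 2 * (d0_k - thr_k) people move from k to l.
        return d0_l + (d0_k - thr_k)
    # Neither threshold binds: everyone stays where they started.
    return d0_l


# Hypothetical initial edges and thresholds.
print(edge_after_moving(1.0, 2.0, thr_l=0.8, thr_k=2.5))  # residents leave city l
print(edge_after_moving(1.0, 2.0, thr_l=1.2, thr_k=1.7))  # residents leave city k
print(edge_after_moving(1.0, 2.0, thr_l=1.2, thr_k=2.4))  # nobody moves
```

Because total population is fixed, the edge of city k follows as L/2 minus the returned value.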

Numerical example. We illustrate our model with a numerical example. We make the large city 50% more productive than the small city: $A_\ell = 1.0$ and $A_k = 1.5$. The elasticity of substitution between both goods is $\sigma = 3$. Total population is $L = 6$, and inter-city distance is $d_{\ell k} = 3$. The moving cost parameter is set to $\mu = 0.001$. Given the inter-city distance and the initial utility in both cities, this amounts to a little more than 0.3% in terms of utility. For the initial commuting cost and trade cost parameters, we choose $\gamma = 0.25$ and $\gamma' = 0.25$. Using these initial parameters, we distribute population between the two cities to equalize utility.
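The initial allocation described here can be computed with two nested bisections. The sketch below is one possible way to do it and is not the authors' code. It assumes that good ℓ is the numeraire (its price is set to one), that goods market clearing is imposed as balanced trade in value terms (which is what (A.11) reduces to once income is substituted in), and that no moving cost applies when setting the initial allocation. The function names and solver tolerances are assumptions.

```python
A_L, A_K = 1.0, 1.5                  # productivities from the numerical example
SIGMA, L_TOTAL, D_LK = 3.0, 6.0, 3.0
GAMMA, GAMMA_TRADE = 0.25, 0.25
TAU = 1.0 - GAMMA_TRADE * D_LK       # share of a shipment that arrives


def import_share(p_local, p_import):
    """Share of income a resident spends on the good shipped from the other city."""
    delivered = (p_import / TAU) ** (1.0 - SIGMA)
    return delivered / (p_local ** (1.0 - SIGMA) + delivered)


def price_index(p_local, p_import):
    """CES price index faced by a resident, as in the denominators of (A.7) and (A.8)."""
    total = p_local ** (1.0 - SIGMA) + (p_import / TAU) ** (1.0 - SIGMA)
    return total ** (1.0 / (1.0 - SIGMA))


def incomes(d_l, p_k):
    """Net incomes as in (A.1), with p_l = 1 and city edges d_l and L/2 - d_l."""
    d_k = L_TOTAL / 2.0 - d_l
    return A_L * (1.0 - GAMMA * d_l), p_k * A_K * (1.0 - GAMMA * d_k)


def clearing_price(d_l):
    """Bisect (in logs) for the p_k at which the value of trade balances."""
    d_k = L_TOTAL / 2.0 - d_l
    lo, hi = 1e-9, 1e9
    for _ in range(200):
        p_k = (lo * hi) ** 0.5
        y_l, y_k = incomes(d_l, p_k)
        spend_l_on_k = 2.0 * d_l * y_l * import_share(1.0, p_k)
        spend_k_on_l = 2.0 * d_k * y_k * import_share(p_k, 1.0)
        if spend_l_on_k > spend_k_on_l:
            lo = p_k                 # good k is too cheap: raise its price
        else:
            hi = p_k
    return (lo * hi) ** 0.5


def utility_gap(d_l):
    """Utility in city l minus utility in city k, without any moving cost."""
    p_k = clearing_price(d_l)
    y_l, y_k = incomes(d_l, p_k)
    return y_l / price_index(1.0, p_k) - y_k / price_index(p_k, 1.0)


def initial_allocation():
    """Bisect on the edge of city l until utilities are equal across cities."""
    lo, hi = 1e-3, L_TOTAL / 2.0 - 1e-3
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if utility_gap(mid) > 0.0:
            lo = mid                 # city l is more attractive: it grows
        else:
            hi = mid
    return 0.5 * (lo + hi)


d0_l = initial_allocation()
d0_k = L_TOTAL / 2.0 - d0_l
print(f"d0_l = {d0_l:.4f}, d0_k = {d0_k:.4f}, utility gap = {utility_gap(d0_l):.2e}")
```

The comparative statics that follow could be sketched by re-solving the same system for lower values of the two cost parameters, with the moving cost added to the utility comparison.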

We then do comparative statics by simultaneously lowering commuting costs, γ, and trade

costs, γ′. On the one hand, a drop in commuting costs makes it more attractive to live in the larger,

more productive city than in the smaller, less productive city. This occurs because it reduces the

disadvantage of a longer within-city commute in the larger city. This force makes the smaller city

lose population: an urban growth shadow. On the other hand, a drop in trade costs improves

market access for the smaller city. This force makes the smaller city gain population: an urban

growth spillover. Depending on which force dominates, the smaller city loses or gains population.

Figure 1 depicts an example where commuting costs (represented on the bottom horizontal

axis) decline at a slower pace than transport costs (represented on the top horizontal axis). In

particular, commuting costs decline from 0.25 to 0.075, whereas transport costs decline from 0.25

to 0.00. The relatively slower decline in commuting costs is consistent with the view of Glaeser

and Kohlhase (2004) that in the twenty-first century U.S. “it is essentially free to move goods,

but expensive to move people”. Moving from right to left on the horizontal axis, we see that this

decline first lowers the population of the smaller city (the urban shadow effect dominates) and then

it increases the population of the smaller city (the urban spillover dominates). This echoes Result

1 in the model of the paper.
