Urban Growth Shadows∗
David Cuberes
Clark University
Klaus Desmet
SMU, NBER and CEPR
Jordan Rappaport
Kansas City Fed
September 2019
Abstract
Does a location’s growth benefit or suffer from being geographically close to large economic
centers? Spatial proximity may lead to competition and hurt growth, but it may also generate
positive spillovers and enhance growth. Using data on U.S. counties and metro areas for the
period 1840-2017, we document this tradeoff between urban shadows and urban spillovers.
Proximity to large urban centers was negatively associated with growth between 1840 and 1920, and
positively associated with growth after 1920. Using a two-city spatial model that incorporates
commuting and moving costs, we account for this and other observed patterns in the data.
Keywords: urban shadows, agglomeration economies, spatial economics, urban systems, city
growth, United States, 1840-2017
JEL Codes: R12, N93
“Cities were like stars or planets, with gravitational fields that attracted people and
trade like miniature solar systems.”
— William Cronon, Nature’s Metropolis: Chicago and the Great West
1 Introduction
In his account of the U.S. westward expansion during the nineteenth century, Cronon (1991) writes
that land speculators on the frontier saw cities as having a gravitational pull akin to a law of
nature that inexorably attracted migrants from the hinterland to the new urban centers.1 This is
∗Cuberes: Department of Economics, Clark University. E-mail: dcuberes@clarku.edu; Desmet: Department
of Economics and Cox School of Business, Southern Methodist University. E-mail: kdesmet@smu.edu; Rappaport:
Federal Reserve Bank of Kansas City. E-mail: jordan.rappaport@kc.frb.org. We benefitted from presentations at
the Urban Economic Association Meetings (New York and Amsterdam), Boston Federal Reserve Bank, Philadelphia
Federal Reserve Bank, Princeton University and the University of North Dakota. The views expressed herein are
those of the authors and do not necessarily reflect the views of the Federal Reserve Bank of Kansas City or the Federal
Reserve System. We thank McKenzie Humann, Isabel Steffens and Anissa Khan for outstanding research assistance.
1This view of cities echoes that of central place theory (Christaller, 1933; Losch, 1940).
consistent with a view that smaller places “close” to larger cities fall under the “urban shadow” of
their neighbors, with increased competition for resources dampening their growth.2 However, there
is also an opposing view: the presence of nearby clusters of economic activity generates positive
agglomeration spillovers, benefiting the growth of neighboring smaller places.3
Is the proximity of a large urban center beneficial or harmful to a location’s economic
growth? This paper empirically and theoretically explores this question. Focusing on local
population growth in the U.S. over almost two hundred years, our empirical analysis identifies two
distinct time periods: between 1840 and 1920, urban shadows dominated, and since then, between
1920 and today, urban spillovers have taken over. One key force that is likely to have driven the
changing relative strength of urban shadows and urban spillovers is the evolution of intra- and
inter-city commuting costs. After providing an overview of changes in commuting costs over the
last two hundred years, we develop a two-city spatial model that incorporates both commuting and
moving costs. We show that the long-run behavior of just one variable — commuting costs — can
account for many of the observed patterns in the data, including the changing relative strength of
urban shadows and urban spillovers over time and space.
Using U.S. county and metro population data from 1840 to 2017, our empirical analysis
documents the changing correlation of local population growth with the presence of nearby large
locations. In addition to establishing the important shift from urban shadows to urban spillovers
around the year 1920, we identify three additional stylized facts. First, since the turn of the
twenty-first century, there has been a decrease in the positive correlation between proximity to a
large urban center and growth, suggesting that urban spillovers have been weakening in the last
decades. Second, there is evidence of the geographic reach of large urban centers expanding. Urban
spillovers were very local between 1920 and 1940 and much more far-reaching between 2000 and
2017. Third, the greater the size of a nearby large location, the bigger the correlation with its
hinterland’s growth. The evidence therefore suggests that larger locations exerted stronger urban
shadows in the earlier time period, as well as stronger urban spillovers in the later time period.
We hypothesize that changes in commuting costs can account for these patterns. Before
showing this with the help of a simple spatial model, the paper documents the evolution of
commuting costs in the U.S. over the same period 1840-2017. Beginning in the mid-nineteenth century,
a steady stream of transport innovations lowered the cost of commuting. The introduction of the
streetcar facilitated longer-distance commutes, giving rise to the first “streetcar suburbs”. This
decline in commuting costs accelerated dramatically during the inter-war and post-war periods.
The combination of the widespread adoption of the automobile and the building of the highway
2See, e.g., Krugman (1993), Fujita et al. (1999), Black and Henderson (2003) and Bosker and Buringh (2017).
3See, amongst others, Davis and Weinstein (2002), Rosenthal and Strange (2003), Glaeser and Kahn (2004), Hanson (2005) and Redding and Sturm (2008).
system connecting downtowns to hinterlands made it possible for people to live much further away
from work. People residing in smaller nearby locations no longer had the need to permanently move
to the larger cities to enjoy their productive benefits. Instead, they could reside in the hinterland,
and commute to the urban centers for work. There is some indication that the continued drop in
commuting costs has weakened in the last two decades. Several factors may have contributed to
this: improvements in commuting technology have petered out; road and rail infrastructure
investment has stalled; traffic congestion has worsened and commuting speeds have slowed; and
there has been a rise in the opportunity cost of time, due to longer work hours and the increase in
double-income families.
To understand the role of commuting costs in shaping the relative strength of urban shadows
and urban spillovers, we develop a simple spatial model of two cities. An individual has three
choices: she can work in the city where she initially resides, she can move to live and work in the
other city, or she can commute for work to the other city without changing her residence. We
then show how these choices change with the cost of commuting, the distance between cities and
their relative sizes. We find that as the cost of commuting gradually drops, individuals switch from
staying put in the smaller, less productive city, first to moving and later to commuting to the
larger, more productive city. Hence, a gradual drop in transport costs first hurts growth in the
smaller city, as it loses population to its larger neighbor, but a further drop eventually helps its
growth, as its population commutes to the nearby larger city. That is, the smaller city goes from
experiencing a negative urban shadow, to benefiting from positive urban spillovers.
The intuition for the non-monotonic relation between commuting costs and the growth of
the smaller city is straightforward. The initial drop in transport costs lowers the cost of living in
the larger city by more than in the smaller city, for the simple reason that intra-city commutes
are on average longer in the larger city than in the smaller city. This makes it more attractive for
residents of the smaller city to pay the one-time moving cost to relocate to the larger city. As in
Cronon (1991), the large city uses its gravitational force to pull in migrants from the hinterland: an
urban shadow. A further drop in transport costs continues to make the larger city more attractive
than the smaller city, but it also facilitates inter-city commuting, which was hitherto too costly.
This allows residents of the smaller city to work in the larger city without the need to move. The
small city benefits from the proximity of the large city: an urban spillover.
When analyzing the observed long-run evolution of commuting costs through the lens of our
model, we can account for the four main stylized facts uncovered in the data. Recall that commuting
costs experienced three distinct regimes: slow decline between 1840 and 1920, rapid decline between
1920 and 2000, and stagnation since then. When interpreted by the model, these are consistent
with urban shadows dominating in the early time period and urban spillovers dominating in the
later time period, with some weakening of spillovers in the last decades. The model also shows that
as commuting costs decrease, the geographic reach of urban areas expands. In addition, the theory
implies that an increase in the relative size of a large urban center strengthens the force it exerts
on its hinterland. These theoretical predictions have their empirical counterparts in the stylized
facts we highlight in our data analysis.
While the long-run evolution of just one variable – commuting costs – is able to capture
the rise and decline of urban shadows, undoubtedly other forces might have been at play as well.
One such force is technology spillovers: in order for smaller locations to benefit from nearby
larger locations, there may be no need to commute to that larger neighbor if technologies diffuse
through space. We show that under certain assumptions our model is observationally equivalent
to one without inter-city commuting but with technology spillovers. Another force that may drive
urban spillovers is market access: the smaller location may benefit from proximity to its larger
neighbor through trade. In the Appendix we consider such an alternative model, and show that it
can capture some of the main empirical findings.
This paper is related to the literature that explicitly considers the spatial location of one
place relative to another. Urban economics has until recently largely ignored the spatial distribution
of cities (Fujita et al., 1999). An important early exception is central place theory (Christaller, 1933;
Losch, 1940). In that theory the tradeoff between scale economies and transportation costs leads to
the emergence of a spatially organized hierarchy of locations of different sizes. A natural implication
of central place theory is that the presence of large urban centers may enhance population growth in
nearby agglomerations through positive spillover effects, but it may also limit such growth through
competition among cities (Krugman, 1993; Tabuchi and Thisse, 2011).4
Most empirical studies that explore the effect of large agglomerations on other locations
focus on the twentieth century. They tend to find positive growth effects from proximity to urban
centers. Using U.S. data, Partridge et al. (2009) uncover a positive impact of large urban clusters
on nearby smaller places.5 Looking at the post-war period, Rappaport (2005) finds evidence of the
populations of cities and suburbs moving together. Dobkins and Ioannides (2001) also conclude
that there has been a positive effect of neighboring locations on growth since the 1950s. Liu et
al. (2011) analyze the case of China, and likewise show that the impact of a high-tier city on its
surrounding areas is positive.
A few papers have looked at earlier time periods and find evidence of urban growth shadows.
In pre-industrial Europe, Bosker and Buringh (2017) show that the net effect of large neighbors was
negative. Consistent with this, Rauch (2014) documents that historically larger European cities
4More recently, there has been a growing interest in incorporating ordered space into economic geography models. This is particularly true of quantitative spatial models that aim to bring the theory to the data in meaningful ways (Desmet and Rossi-Hansberg, 2014; Allen and Arkolakis, 2014; Desmet et al., 2018).
5In an earlier study for the time period 1950-2000, the same authors find negative effects from proximity to higher-tiered places.
have been surrounded by larger hinterland areas. Most closely related to our work is Beltran et al.
(2017) who use data on Spanish municipalities for the time period 1800-2000. They find that the
influence of neighboring cities was negative between 1800 and 1950, to then become increasingly
positive from 1950 onwards. Our work focuses on the U.S., a country where the urbanization
process is likely to have differed from the Spanish experience for a variety of reasons: it was much
less settled in the nineteenth century, modern-day mobility across cities and regions is greater, and
the adoption of the automobile was swifter. In addition, our paper offers a theoretical framework
that allows us to interpret the switch from urban shadows to urban spillovers by relating it to the
secular decline in transport costs.
The rest of the paper is organized as follows. Section 2 presents the empirical findings on the
changes in urban growth shadows and spillovers over the period 1840-2017. Section 3 documents
the evolution of commuting costs over the same period 1840-2017. Section 4 proposes a conceptual
framework that relates commuting costs to urban shadows and spillovers, and it shows that the
theoretical predictions are consistent with the main patterns in the data. Section 5 concludes.
2 Urban Growth Shadows and Spillovers: 1840 to 2017
This section documents how the correlation of local population growth in the U.S. with the presence
of nearby large locations has evolved over the period 1840-2017. In doing so, it aims to explore
whether urban shadows or urban spillovers were more prominent in different time periods. It also
elicits a number of additional stylized facts.
2.1 Data
We use county population data from the Census Bureau spanning the period 1840 to 2017. With
the exception of the last period, we focus on successive twenty-year time frames: 1840-1860,
1860-1880, ..., 1980-2000, 2000-2017. In constructing the dataset, we had to resolve two main issues:
how to deal with changing county borders and how to delineate metro areas over time. In what
follows we limit ourselves to a brief discussion, and point the interested reader to Desmet and
Rappaport (2017) for more details. To get consistent county borders, we use a “county longitudinal
template” augmented by a map guide to decennial censuses, and combine counties as necessary to
create geographically-consistent county equivalents over successive twenty-year-periods (Horan and
Hargis, 1995; Thorndale and Dollarhide, 1987). For example, if county A splits into counties A1 and
A2 in 1850, we combine counties A1 and A2 to measure population growth of county A between
1840 and 1860. More generally, for growth between 1840 and 1860, we use geographic borders
from 1840; for growth between 1860 and 1880, we use geographic borders from 1860; and so on.6
6This description applies to the most common case of counties splitting over time. If counties merge between, say, 1860 and 1880, then we would use geographic borders from 1880.
This methodology gives us a separate dataset for each twenty-year period we study, as well as for
2000-2017.
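The county-combination step can be illustrated with a minimal sketch. It is not the authors' code: the crosswalk dictionary, the county names, the population figures, and the choice of annualized log growth (in percent) are all assumptions made for illustration, since the text does not state how annual average growth is computed.

```python
from math import log

def consistent_growth(pop_start, pop_end, crosswalk, years):
    """Annualized log growth (percent) measured on base-year borders.

    crosswalk is a hypothetical mapping from each end-year county to the
    base-year county it was carved out of; unlisted counties map to themselves.
    """
    merged_end = {}
    for county, pop in pop_end.items():
        base = crosswalk.get(county, county)
        merged_end[base] = merged_end.get(base, 0) + pop
    return {
        base: 100 * log(merged_end[base] / pop_start[base]) / years
        for base in pop_start
        if base in merged_end
    }

# Illustrative numbers only: county A splits into A1 and A2 after 1840.
pop_1840 = {"A": 10_000, "B": 20_000}
pop_1860 = {"A1": 9_000, "A2": 6_000, "B": 30_000}
growth = consistent_growth(pop_1840, pop_1860, {"A1": "A", "A2": "A"}, years=20)
```

In this example A1 and A2 are summed back into A, so both A and B grow by 50 percent over twenty years and receive the same annualized rate.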
When different counties form part of the same metropolitan area, we do not want to consider
these counties as different locations. We therefore combine counties into metro areas, when and
where we can delineate them. Our analysis is thus based on a hybrid of metropolitan areas and
non-metropolitan counties. For 1940 and earlier, we merge counties to form metropolitan areas applying
criteria promulgated by the Office of Management and Budget (OMB) in 1950 to population and
economic conditions at the start of each twenty-year period (Gardner, 1999). For 1960 and later,
we use the official delineations promulgated by OMB after each decennial census. As with the
geographically-consistent counties, growth over any period is measured using the geographic borders
of the initial year. The number of locations in our datasets increases rapidly from 862 for the
period 1840-1860 to 2,370 for 1880-1900 and then more slowly to a maximum of 2,982 for
1940-1960, reflecting both the westward movement of the U.S. frontier and the splitting of geographically
large counties as they became more densely settled into smaller ones. Thereafter, the number of
locations steadily declines as more and more counties were absorbed into metropolitan areas. Our
dataset for 2000-2017 has 2,369 locations.
The distribution of surrounding locations by size and distance systematically varies across
different parts of the country, which in many periods had different average growth rates. For
example, locations near the U.S. frontier during the nineteenth century tended to have few large
neighbors and high average growth. To avoid an omitted variable bias, we extensively control
for regional variation in order to isolate the correlation of growth with measures of surrounding
locations. A first set of 15 control variables are the non-constant terms from the third-order polynomial of latitude
and longitude, $(1 + lat + lat^2 + lat^3)(1 + long + long^2 + long^3)$. A second set of control variables are
indicators for eight of the nine U.S. census divisions. A third set of ten control variables are linear
and quadratic terms of average low temperature in January, average high temperature in July,
average daily humidity in July, average annual rainfall, and average number of days on which it
rains (Rappaport, 2007). A fourth set of ten control variables are indicators of whether a location’s
geographic centroid is within 80 kilometers of the coast and of a natural harbor along each of the
north Atlantic, south Atlantic, Gulf of Mexico, Pacific, and Great Lakes (Rappaport and Sachs,
2003). A fifth set of two control variables are indicators of whether a location’s geographic centroid
is within 40 kilometers of a river on which there was navigation in 1890 and whether it is in addition
located within 80 kilometers of an ocean coast (Rappaport and Sachs, 2003). A final set of two
variables is a quadratic specification of hilliness, measured as the standard deviation of altitude
within a location normalized by the location’s land area. These six sets total 47 variables, which
we include in all regressions beginning with the 1860 cross section. A handful of them are dropped
for the 1840 cross section due to lack of variation (e.g., there were no locations in the Mountain
and Pacific census regions).
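To see why the latitude-longitude polynomial contributes 15 controls, the following sketch expands the product and drops the constant, which the regression intercept absorbs. The function name and the term labels are hypothetical, and the coordinates are arbitrary illustrative values.

```python
def latlong_terms(lat, long, order=3):
    """Expand (1 + lat + ... + lat^order)(1 + long + ... + long^order).

    The constant term (lat^0 * long^0) is skipped because a regression
    intercept already captures it.
    """
    terms = {}
    for i in range(order + 1):
        for j in range(order + 1):
            if i == 0 and j == 0:
                continue
            terms[f"lat^{i}*long^{j}"] = (lat ** i) * (long ** j)
    return terms

example_terms = latlong_terms(2, 3)
```

The product has 4 x 4 = 16 terms, so removing the constant leaves the 15 controls mentioned in the text.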
2.2 Baseline Specification
Our main specification regresses population growth over successive twenty-year intervals on the
presence of surrounding locations at specified distances with population above specified thresholds.
Let $d_{\ell k}$ denote the distance between locations $\ell$ and $k$, measured using a straight-line
approximation between their geographic centroids. Let $d \in \{d_1, d_2, ..., d_D\}$ denote strictly increasing
specified distances, e.g., {50km, 100km, ..., 300km}. Finally, let $L_k$ and $L$ respectively denote the
population of location $k$ and a specified population threshold for considering a neighboring location
to be large. For each ordered pair of locations, we construct an indicator variable, $I^{L,d}_{\ell k}$, describing
whether location $k$ has population weakly above threshold $L$ and distance from location $\ell$ weakly
less than $d$:
\[
I^{L,d}_{\ell k} \equiv I(L_k, d_{\ell k}; L, d) =
\begin{cases}
1 & : L_k \geq L \;\&\; d_{\ell k} \leq d \\
0 & : \text{otherwise}
\end{cases}
\]
For each location $\ell$, we then construct a set of indicators, one for each specified distance, describing
if there is at least one location, $k \neq \ell$, within that distance of location $\ell$, that has population weakly
above $L$ and no such location within a smaller distance of $\ell$:
\[
I^{L,d}_{\ell} =
\begin{cases}
1 & : d = d_1 \;\&\; \sum_{k \neq \ell} I^{L,d}_{\ell k} \geq 1 \\
1 & : d \in \{d_2, .., d_D\} \;\&\; \left( \prod_{d=d_1}^{d-1} \left(1 - I^{L,d}_{\ell}\right) \right) \left( \sum_{k \neq \ell} I^{L,d}_{\ell k} \right) \geq 1 \\
0 & : \text{otherwise}
\end{cases}
\]
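The indicator construction amounts to flagging, for each location, the first distance band that contains a large neighbor. The sketch below is one possible implementation, not the authors' code. It assumes hypothetical planar centroid coordinates in kilometers, so that the Euclidean distance stands in for the straight-line approximation; the location names and populations are invented.

```python
from math import hypot

def neighbor_indicators(locations, threshold, cutoffs):
    """Return a 0/1 list per location, one entry per distance cutoff.

    locations maps a name to (population, x_km, y_km). Entry j is 1 only if
    the nearest neighbor with population >= threshold lies within cutoffs[j]
    and not within any smaller cutoff.
    """
    cutoffs = sorted(cutoffs)
    result = {}
    for name, (_, x, y) in locations.items():
        # Distances to all large neighbors, excluding the location itself.
        dists = [
            hypot(x - xk, y - yk)
            for other, (pop_k, xk, yk) in locations.items()
            if other != name and pop_k >= threshold
        ]
        nearest = min(dists, default=float("inf"))
        flags = [0] * len(cutoffs)
        for j, d in enumerate(cutoffs):
            if nearest <= d:
                flags[j] = 1
                break  # only the closest qualifying band is switched on
        result[name] = flags
    return result

# Invented example: one large location (A) and three smaller ones.
places = {
    "A": (500_000, 0, 0),
    "B": (10_000, 30, 40),   # 50 km from A
    "C": (8_000, 120, 0),    # 120 km from A
    "D": (5_000, 400, 0),    # 400 km from A
}
flags = neighbor_indicators(places, threshold=100_000, cutoffs=[50, 100, 150])
```

Because at most one indicator is switched on per location, a row of zeros identifies the excluded category: locations with no large neighbor within the maximum distance.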
For each 20-year period from 1840 to 2000 and for the 17-year period from 2000 to 2017, we
regress annual average population growth, $g_\ell$, on the set of indicators, $\mathbf{I}^L_\ell = [I^{L,d_1}_\ell, I^{L,d_2}_\ell, ..., I^{L,d_D}_\ell]$,
along with a fifth-order polynomial of a location’s initial population, $\mathbf{L}_\ell = [\log(L_\ell), (\log(L_\ell))^2,
..., (\log(L_\ell))^5]$. The latter absorbs the non-monotonic relationship between growth and size
throughout most of U.S. history (Michaels et al., 2012; Desmet and Rappaport, 2017). It is necessary to
include these terms because the size distribution of neighbors closely depends on a location’s own
size. For example, very small locations rarely have a very large neighbor. As described in the
previous subsection, we also extensively control for geographic attributes with 47 variables, $\mathbf{x}_\ell$. We
thus specify a data generating process with reduced form
\[
g_\ell = \mathbf{I}^L_\ell \beta + \mathbf{L}_\ell \gamma + \mathbf{x}_\ell \delta + \varepsilon_\ell. \tag{1}
\]
The partial correlation between the growth of a location and the presence of larger neighbors
unsurprisingly depends both on the threshold population above which we consider neighbors to be
large, $L$, and the size of the location itself, $L_\ell$. Because the size distribution of U.S. locations
changed continually throughout U.S. history, we use relative measures of population both to set
year-specific thresholds for considering a location large and to focus the analysis on the growth of
locations that are not large. Specifically, we consider locations to be at least “moderately large” in
a given year if their population is at or above the 95th percentile of the distribution across locations
in that year. Analogously, we consider locations to be “very large” in a given year if
their population is at or above the 99th percentile in that year. Reciprocally, we exclude locations
from our baseline regression analysis that have population above the 80th percentile. Our baseline
regressions thus estimate the partial correlations between the growth rate of small and medium
locations–those with population in the first through fourth quintiles–with the presence of nearby
locations in the top portion of the fifth quintile. We also discuss how these correlations differ across
sub-samples of locations by size.
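The year-specific size classes can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the nearest-rank percentile definition is our choice (the text does not say which percentile convention is used), and the populations are synthetic.

```python
def percentile_cutoff(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]

def classify(populations):
    """Split locations into the regression sample and the two 'large' groups."""
    pops = list(populations.values())
    p80 = percentile_cutoff(pops, 80)
    p95 = percentile_cutoff(pops, 95)
    p99 = percentile_cutoff(pops, 99)
    return {
        # Small and medium locations: population not above the 80th percentile.
        "sample": [n for n, p in populations.items() if p <= p80],
        "moderately_large": [n for n, p in populations.items() if p >= p95],
        "very_large": [n for n, p in populations.items() if p >= p99],
    }

# Synthetic cross section: 100 locations with populations 1,000 to 100,000.
synthetic = {f"loc{i}": i * 1000 for i in range(1, 101)}
groups = classify(synthetic)
```

Every very large location is also moderately large, which matches the "at least moderately large" wording in the text.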
Partial correlations sensitively depend on the maximum distance, dD, for which an indicator
is included. To understand this, recognize that indicators of a large neighbor within distance
intervals demarcated by {d1, d2, ..., dD}, together with the excluded interval, d`k > dD, make up a
disjoint set that fully partitions the observations in a regression. For a given maximum population
threshold, coefficients on each of the included indicators estimate the difference of predicted growth
for observations with a corresponding positive value and the predicted growth of observations
with a positive value of the excluded category. Estimated coefficients thus depend closely on the
composition of the excluded category.
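Under the linear specification in equation (1), and assuming the error term has zero conditional mean, this interpretation of the coefficient on the indicator for distance $d$ can be written as the following difference, holding initial population and the geographic controls fixed:

```latex
\[
\beta_d
= E\left[ g_\ell \mid I^{L,d}_\ell = 1, \mathbf{L}_\ell, \mathbf{x}_\ell \right]
- E\left[ g_\ell \mid I^{L,d_1}_\ell = \dots = I^{L,d_D}_\ell = 0, \mathbf{L}_\ell, \mathbf{x}_\ell \right]
\]
```

The second expectation is taken over the excluded category, so changing the maximum distance $d_D$ changes the benchmark against which every coefficient is measured.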
It is important to specify a maximum distance that is not too high. Failing to do so
leaves few observations with a positive value for the excluded category. For example, in almost
all years for which we run regressions, less than 15 percent of observations have no moderately
large neighbor (one with population above the 95th percentile) within 300 kilometers. As these
relatively isolated locations tended to grow slowly, a regression of growth on indicators for each of
the distance intervals out to 300km must yield some positive coefficients. Hence it is important to
choose a maximum distance that is not too large.
Conversely, it is also important to choose a maximum distance that is not too low. Many of
the regressions estimate coefficients on the 50km-100km and 100km-150km indicators that are the
same sign and similar in magnitude to their estimates on the 0km-50km indicator. For the 95th
percentile and 99th percentile thresholds for large size, the number of observations with positive
indicators for two further-away intervals far exceeds the number with positive indicators for the
closest interval. In many cases, the majority of observations have positive values in the combined
50km-150km range. In consequence, regressions that include an indicator only for the 0km to
50km distance may not find much of a difference in predicted growth compared to locations in the
excluded category.
To balance these two considerations, we specify our regressions to include indicators for
50km intervals out to the maximum distance that leaves at least 50 percent of observations in
the excluded category. For example, 49 percent of the observations in the 1840 regression have a
neighbor that is at least moderately large within 150km and 67 percent have one within 200km
and so we use presence indicators for 0km to 50km, 50km to 100km, and 100km to 150km. Higher
thresholds for considering a location to be large require a maximum distance that is further away.
For the 1840 regression on the presence of neighbors that are very large (ones with population
above 99th percentile), our rule implies including presence indicators out to a maximum distance
of 300km.
We also require that all distance intervals have at least 20 observations with indicators that
are positive. In practice, this pertains only to the 0km-50km interval, which in some years has
positive indicators for only a handful of observations. In these cases, we use a nearest interval that
ranges from 0km to 100km.
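The two rules for choosing distance intervals can be sketched as a small function. It is a hedged illustration: the function name, its arguments, and the counts for the 50km and 100km cutoffs are hypothetical, while the 49 and 67 percent shares at 150km and 200km follow the 1840 example in the text, scaled to an assumed 1,000 observations. The minimum-count rule is applied only to the nearest interval, which the text says is the only case that arises in practice.

```python
def choose_intervals(cum_counts, n_obs, min_excluded_share=0.5, min_bin_obs=20):
    """Pick distance intervals (in km) for the neighbor indicators.

    cum_counts maps a cutoff to the number of observations that have a large
    neighbor within that cutoff (cumulative, so counts rise with distance).
    """
    cutoffs = sorted(cum_counts)
    # Keep cutoffs that leave at least half of observations in the excluded category.
    kept = [d for d in cutoffs
            if n_obs - cum_counts[d] >= min_excluded_share * n_obs]
    if not kept:
        return []
    # If the nearest interval is too thin, widen it to the next cutoff.
    if len(kept) >= 2 and cum_counts[kept[0]] < min_bin_obs:
        kept = kept[1:]
    edges = [0] + kept
    return list(zip(edges[:-1], edges[1:]))

# Hypothetical counts out of 1,000 observations.
baseline = choose_intervals({50: 60, 100: 250, 150: 490, 200: 670}, n_obs=1000)
thin_first_bin = choose_intervals({50: 8, 100: 250, 150: 490, 200: 670}, n_obs=1000)
```

In the first call the 200km cutoff is dropped because it would leave only 33 percent of observations excluded; in the second, the 0km-50km interval holds 8 observations and is merged into a 0km-100km interval.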
2.3 Two Distinct Subperiods
This subsection explores the existence of urban growth shadows and urban growth spillovers
between 1840 and 2017. When estimating the correlation of the population growth of small and
medium locations with the presence of moderately large locations (population at or above the 95th
percentile), Table 1 shows two clearly distinct periods: a negative regime from 1840 to 1920, and
a positive regime from 1920 through 2017.
The predicted population growth of small and medium locations was slower during each
of the four 20-year periods from 1840 to 1920 if they had a moderately large neighbor. In 1840,
the initial population of the small and medium locations ranged from 133 to 24,000 and the initial
population of the 44 moderately large locations ranged from 62,000 to 435,000. Small and medium
locations that had a moderately large neighbor within 50km had predicted annual population
growth from 1840 to 1860 that was slower by 0.63 percentage points compared to the excluded
locations, which did not have a moderately large neighbor within 150 kilometers. Locations whose
nearest moderately large neighbor was between 50km and 100km away had predicted annual growth
that was slower by 0.67 percentage points compared to excluded locations; and those whose nearest
moderately large neighbor was 100km to 150km away had predicted annual growth that was
slower by 0.30 percentage points. The corresponding negative coefficients statistically differ from
zero at the 0.05 or 0.10 levels (respectively, dark and light blue typeface). Estimated negative
coefficients are similar in magnitude for the 1860-1880 regression and a bit larger in magnitude
for the 1880-1900 regression. Predicted growth from 1900 to 1920 was also slower for small and
medium locations with a moderately large neighbor, although the magnitude of the difference
compared to not having a moderately large neighbor was considerably less than during the earlier
periods. Throughout the negative regime, the marginal share of the variation in growth accounted
for by the indicators for a moderately large neighbor (the increase in $R^2$ compared to using only
the control variables) is slight, ranging from 0.2 to 0.8 percentage points.7
neighbor w/ pop ≥ 95th  (1)            (2)            (3)            (4)            (5)            (6)            (7)            (8)            (9)
percentile @ distance:  1840-1860      1860-1880      1880-1900      1900-1920      1920-1940      1940-1960      1960-1980      1980-2000      2000-2017
1 to 50 km              -0.63          -0.77          -1.10          -0.04          0.28           1.11           1.19           0.50           0.41
                        (0.35)         (0.28)         (0.21)         (0.14)         (0.12)         (0.15)         (0.19)         (0.14)         (0.20)
50 to 100 km            -0.67          -0.41          -0.86          -0.28          -0.07          0.16           0.21           0.26           0.11
                        (0.24)         (0.20)         (0.18)         (0.09)         (0.07)         (0.08)         (0.11)         (0.09)         (0.09)
100 to 150 km           -0.30          -0.11          -0.63          0.03
                        (0.12)         (0.17)         (0.15)         (0.07)
N (quints 1 to 4)       691            1,328          1,844          2,110          2,357          2,387          2,283          2,104          1,895
control vars            48             52             52             52             52             52             52             52             52
R^2                     0.857          0.787          0.726          0.533          0.407          0.366          0.391          0.427          0.318
Adj R^2                 0.846          0.778          0.718          0.521          0.393          0.352          0.376          0.412          0.298
R^2 - R^2 controls      0.002          0.002          0.008          0.002          0.003          0.028          0.033          0.010          0.004
pop ≥ 95th pctile       62ths-425ths   49ths-1.4mn    51ths-2.5mn    65ths-4.9mn    80ths-8.5mn    100ths-11.7mn  172ths-14.2mn  250ths-14.5mn  377ths-18.3mn
N, ≥ 95th pctile        44             86             119            133            148            150            143            132            119
pop of obs              133-24ths      103-23ths      100-26ths      104-30ths      137-32ths      285-36ths      208-42ths      408-49ths      356-65ths
share of pop            0.38           0.39           0.42           0.39           0.34           0.29           0.20           0.16           0.14
Table 1: Population Growth and the Presence of a Moderately Large Neighbor. All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations
have population in the first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based
on Conley (1999). Dark and light blue fonts respectively indicate a negative coefficient that statistically differs from
zero at the 0.05 and 0.10 levels. Dark and light red fonts respectively indicate a positive coefficient that statistically
differs from zero at the 0.05 and 0.10 levels.
The remaining columns of Table 1 describe the positive regime between population growth
and the presence of a moderately large neighbor. For each of the five periods from 1920 to 2017,
predicted population growth was faster for small and medium locations with a moderately large
neighbor within 50km. For each of the three periods from 1940 to 2000, predicted growth was
also slightly faster for locations whose nearest moderately large neighbor was located between
50km and 100km away. The corresponding positive coefficients statistically differ from zero at
the 0.05 and 0.10 levels (respectively, dark and light red typeface). The magnitude of the faster
7In Table 1 we refer to this as “$R^2$ - $R^2$ controls”, i.e., the difference between the $R^2$ of our regression and the $R^2$
of a specification that only includes the controls (and hence leaves out the neighbor dummies).
predicted growth is relatively modest from 1920 to 1940, when suburbanization was just getting
underway. Then, both from 1940-1960 and from 1960-1980, the presence of a moderately large
neighbor within 50km predicted population growth that was higher by more than 1 percentage
point (statistically significant at the 0.01 level). Smaller-magnitude coefficients seem to suggest
that suburbanization waned from 1980 to 2000. But as we will describe in the next subsection,
this is somewhat misleading, because it reflects many rapidly suburbanizing peripheral counties
having been reclassified as belonging to a metropolitan area following the 1970 and 1980 decennial
censuses. The marginal share of the variation accounted for by the indicators of a moderately large
neighbor is about 3 percentage points for the periods beginning in 1940 and 1960, but substantially
lower for the other periods during the positive regime.
If we interpret the slower growth of locations with large neighbors as evidence of urban
shadows, and the faster growth of those same locations as evidence of urban spillovers, then we can
summarize our findings in Table 1 as follows:
Stylized Fact 1: Urban Shadows and Spillovers. Between 1840 and 1920 urban growth
shadows dominated the U.S. economic geography, with locations in the vicinity of large places growing
relatively slower, whereas between 1920 and 2017 urban growth spillovers dominated, with locations
in the vicinity of large places growing relatively faster.
This division into a negative regime followed by a positive regime robustly holds for alternative
threshold levels of largeness and widely varying specifications.
2.4 Recent Weakening of Urban Spillovers
In this subsection we analyze whether there has in fact been a weakening in urban spillovers
since the 1980s, as suggested by some of the results reported above. To be precise, Table 1 showed
that the expected growth boost from having a top-5 percent neighbor in the 1-to-50 kilometer
range dropped by more than half, from 1.19 percentage points for the period 1960-1980 to 0.50
percentage points for the period 1980-2000. That fall may be partly explained by changing metro
delineations: if a fast-growing location in one time period is also more likely to get absorbed into
a metro area by the next time period, then this may cause a decline in growth of the locations in
the 1-to-50 kilometer range. More generally, as re-delineated metro areas include more outlying
counties, the continuing filling in of these counties is implicitly accounted for as migration within
a location rather than between locations. This makes comparisons across periods more difficult.8
8In addition, unobserved characteristics are likely to distinguish which surrounding counties at a given distance are absorbed into a metro, introducing a selection bias in making comparisons across periods. The re-delineations also leave fewer locations with nearby large neighbors, reflecting that metropolitan radiuses are becoming longer. The changing delineation of metros also affects metropolitan centroids, which are constructed as the population-weighted mean of constituent counties’ centroids. Hence it also affects distances to large neighbors, which are measured between centroids.
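To make the mechanics in footnote 8 concrete, the following is a minimal sketch of how a population-weighted metropolitan centroid and a centroid-to-centroid distance band could be computed. It is an illustration under stated assumptions rather than the procedure used for the regressions: the plain averaging of latitude and longitude, the great-circle (haversine) distance, the Earth radius, the treatment of band boundaries, and all coordinates and populations below are hypothetical.

```python
import math

def weighted_centroid(counties):
    """Population-weighted mean of county centroids.

    counties: iterable of (lat_deg, lon_deg, population) tuples.
    Averaging raw latitude/longitude is a simplifying assumption.
    """
    counties = list(counties)
    total = sum(pop for _, _, pop in counties)
    lat = sum(c_lat * pop for c_lat, _, pop in counties) / total
    lon = sum(c_lon * pop for _, c_lon, pop in counties) / total
    return lat, lon

def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat_deg, lon_deg) points."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))

def distance_band(km, width_km=50):
    """Label of the 50km band containing km, e.g. '1 to 50 km' or '50 to 100 km'."""
    lower = int(km // width_km) * width_km
    return f"{max(lower, 1)} to {lower + width_km} km"

# Hypothetical metro made of two counties, then re-delineated to absorb a third.
metro_before = [(40.0, -75.0, 300_000), (40.0, -76.0, 100_000)]
metro_after = metro_before + [(40.0, -77.0, 100_000)]

# Hypothetical small location east of the metro.
location = (40.0, -74.2)

centroid_before = weighted_centroid(metro_before)
centroid_after = weighted_centroid(metro_after)

band_before = distance_band(haversine_km(location, centroid_before))
band_after = distance_band(haversine_km(location, centroid_after))
print(band_before, "->", band_after)
```

In this hypothetical case, absorbing an outlying county pulls the metro centroid away from the small location, so the same pair of places falls into a farther distance band even though nothing about the location itself has changed.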
(1) (2) (3)
neighbor w/pop ≥ 95th percentile @ distance: 1960-1980 1980-2000 2000-2017
1 to 50 km 1.19 1.43 0.60
(0.19) (0.30) (0.12)
50 to 100 km 0.21 0.59 0.14
(0.11) (0.23) (0.06)
100 to 150 km 0.24
(0.20)
150 to 200 km 0.13
(0.12)
200 to 250 km
N (quints 1 to 4) 2,283 2,282 2,283
control vars 52 52 52
R2 0.391 0.440 0.326
Adj R2 0.376 0.426 0.310
R2-R2 controls 0.033 0.047 0.019
pop≥thresh 172ths-14.2mn 246ths-14.4mn 306ths-15.7mn
N, ≥thresh 143 143 143
pop of obs 208-42ths 408-54ths 67-67ths
Table 2: Population Growth and Large Neighbors, 1960 Metropolitan Borders.
Metropolitan areas are delineated using the OMB standards following the 1960 decennial census. All regressions
include a constant and control for initial population and 52 geographic covariates. Observations have population in the
first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based on Conley (1999). Dark
blue and light blue fonts respectively indicate a negative coefficient that statistically differs from zero at the 0.05 and
0.10 levels. Dark and light red fonts respectively indicate a positive coefficient that statistically differs from zero at
the 0.05 and 0.10 levels.
To assess the plausibility of this concern, Table 2 reports regressions for the three periods
from 1960 to 2017 using metropolitan area borders established following the 1960 decennial cen-
sus. Consistent with the possible bias we described, when keeping borders constant, the positive
relationship between growth and the presence of large neighbors peaked 20 years later, during the
period from 1980 to 2000, rather than during the period 1960 to 1980. In other words, we still see a
weakening relation between growth and proximity to large locations, but only after the turn of the
twenty-first century. This suggests that the transition of metropolitan areas to a larger geographic
footprint may be winding down.9 We summarize these findings as follows:
Stylized Fact 2: Recent Weakening of Urban Spillovers. Urban growth spillovers have
been weakening since the turn of the 21st century. In particular, during the period 2000-2017 urban
growth spillovers are less pronounced than during the periods 1960-1980 and 1980-2000.
2.5 Geographic Span
This subsection explores how the geographic span of urban shadows and urban spillovers has
changed over time. When focusing on moderately large neighbors (at or above the 95th percentile),
as we have done so far, there are few observations with a positive value for the excluded category at
far-away distances. This limits the maximum geographic distance we are able to consider, making it
difficult to analyze how the geographic span of urban shadows and spillovers evolves over time. To
get around this issue, Table 3 considers the presence of very large neighbors (at or above the 99th
percentile), allowing us to consider farther-away distances while maintaining enough observations
with a positive value for the excluded category. As an example, for the period 1980-2000 we are
able to include neighboring locations all the way to 250km, whereas for the same time period in
Table 1 we only considered neighbors within a range of 100km.
Before discussing the spatial reach of urban shadows and spillovers, we show that increasing
the size threshold from the 95th to the 99th percentile does not qualitatively change what we
concluded before. There continues to be a negative regime and a positive regime, with the year
1920 separating the two. The magnitudes of the coefficients of course differ, especially during the
positive regime, when having a neighbor above the 99th percentile rather than above the 95th
percentile was associated with a considerably greater boost in population growth (Table 3). For
example, predicted growth from 1960 to 1980 was 3.2 percentage points per year higher for small
and medium locations that had a very large neighbor within 50km compared to the excluded
locations, those whose nearest very large neighbor was at least 250km away.
We now analyze how the geographic reach of very large neighbors changes over time. During
the negative regime, when comparing 1900-1920 to 1880-1900, the drop in growth from having a very
large neighbor weakens at shorter distances below 50km but strengthens at farther-away distances
above 150km.10 During the positive regime, the growth boost of having a very large neighbor
starts off within a rather narrow 50km radius for the period 1920-1940, but then expands by 50km
9Regressing growth from 1960 to 1980 using metropolitan borders from 1940 modestly increases estimated coefficients on indicators of moderately large neighbors (compared to using 1960 borders) and modestly decreases estimated coefficients on indicators of very large neighbors. Regressing growth from 1940 to 1960 using metropolitan borders from 1920 modestly increases estimated coefficients on indicators of both moderately large and very large neighbors. Regardless of borders, the strength of suburbanization from 1940 to 1960 as estimated by the regressions was similar to the strength from 1960 to 1980.
10Comparing to earlier time periods is more complex, because of differences in the maximum distance.
(1) (2) (3) (4) (5) (6) (7) (8) (9)
neighbor w/pop ≥ 99th percentile @ distance: 1840-1860 1860-1880 1880-1900 1900-1920 1920-1940 1940-1960 1960-1980 1980-2000 2000-2017
1 to 50 km -1.01 -0.79 -0.28 0.73 2.41 3.20
(0.37) (0.42) (0.28) (0.29) (0.26) (0.61)
50 to 100 km† 0.00 -0.67 -0.87 -0.64 0.22 0.55 0.97 1.03 0.68
(0.46) (0.30) (0.27) (0.19) (0.19) (0.15) (0.25) (0.19) (0.15)
100 to 150 km 0.27 -0.60 -0.45 -0.47 0.08 0.14 0.30 0.51 0.39
(0.47) (0.22) (0.24) (0.18) (0.18) (0.13) (0.14) (0.13) (0.10)
150 to 200 km 0.49 -0.77 -0.15 -0.52 0.15 0.02 0.13 0.26 0.28
(0.38) (0.18) (0.23) (0.17) (0.16) (0.13) (0.12) (0.10) (0.08)
200 to 250 km -0.01 -0.02 -0.19 0.18 -0.12 0.03 0.10 0.22
(0.24) (0.17) (0.16) (0.15) (0.13) (0.11) (0.08) (0.07)
250 to 300 km 0.17 -0.02 -0.13 0.16 0.10
(0.18) (0.19) (0.19) (0.13) (0.07)
N (quints 1 to 4) 691 1,328 1,844 2,110 2,357 2,387 2,283 2,104 1,895
control vars 48 52 52 52 52 52 52 52 52
R2 0.856 0.790 0.721 0.535 0.407 0.378 0.408 0.452 0.342
Adj R2 0.844 0.781 0.712 0.521 0.392 0.362 0.393 0.437 0.321
R2-R2 controls 0.001 0.004 0.003 0.004 0.003 0.039 0.050 0.035 0.028
pop≥99th pctile 171ths-425ths 139ths-1.4mn 139ths-2.5mn 197ths-4.9mn 321ths-8.5mn 442ths-11.7mn 810ths-14.2mn 1.3mn-14.5mn 2.0mn-18.3mn
N, ≥99th pctile 9 18 24 27 30 30 29 27 24
pop of obs 133-24ths 103-23ths 100-26ths 104-30ths 137-32ths 285-36ths 208-42ths 408-49ths 356-65ths
Table 3: Population Growth and the Presence of a Very Large Neighbor.
†50-to-100km row reports results for 1 to 100km when no results are reported in the 1-to-50km row.
All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations
have population in the first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based
on Conley (1999). Dark and light blue fonts respectively indicate a negative coefficient that statistically differs from
zero at the 0.05 and 0.10 levels. Dark and light red fonts respectively indicate a positive coefficient that statistically
differs from zero at the 0.05 and 0.10 levels.
over each subsequent 20-year period, reaching 250km during 2000-2017. Of course, as is intuitive,
growth’s positive relationship with the presence of a very large neighbor weakens the more distant
that neighbor is located. These findings constitute our third stylized fact:
Stylized Fact 3: Geographic Span of Shadows and Spillovers. Over the period 1920-2017
there is strong evidence of the geographic span of urban growth spillovers expanding, with spillovers
being very local between 1920-1940 and much more far-reaching in 2000-2017. Over the period
1840-1920 the evidence is mixed, though there is weak evidence of the geographic span of urban
growth shadows expanding between the late 19th century and early 20th century.
2.6 Relative Size of Locations and Neighbors
This subsection explores how the strength of urban shadows and spillovers depends on the relative
size of locations and neighbors.
The Size of Locations. The qualitative relationship between a location’s growth and the presence
of a large neighbor is not very sensitive to the size of the location, though the magnitude of
the correlation shows some tendency to decline with size. In what follows, we make this point by
focusing on two representative time periods, one for the negative regime, 1880 to 1900, and one for
the positive regime, 1960 to 1980.
For each of the first four quintiles of locations, Table 4 separately reports regressions of
growth from 1880 to 1900 on the presence of a moderately large neighbor. All coefficients are
estimated to be negative, with the magnitudes being lower for higher quintiles. When we consider
growth of locations between the 80th and the 90th percentile, the magnitude of the correlation
becomes even smaller, though it continues to be negative. Regressions for the other periods during
the negative regime show similar results, both for the presence of moderately large and very large
neighbors.11 Table 5 reports analogous regressions of growth from 1960 to 1980. All coefficients are
estimated to be positive, again with some tendency, albeit a weaker one, for the magnitudes to
decline with a location’s size.
The Size of Neighbors. When comparing Table 1 and Table 3, we found that growth’s correla-
tions with the presence of large neighbors increased with the size of neighbors during the positive
regime but not the negative regime. A more general specification, however, establishes that mag-
nitudes are increasing with the size of neighbors during both regimes. Table 6 shows results from
regressing population growth on the presence of neighbors above four thresholds: the 80th, 90th,
95th, and 99th percentiles. These categories are nested in the sense that a neighbor that is above
the 99th percentile is also above the 80th, 90th, and 95th percentiles. Coefficients on the higher
thresholds thus estimate the marginal change in predicted growth compared to having a neighbor with
population only above the next lower threshold. For example, a positive coefficient on the 99th
percentile indicator estimates the additional predicted growth of locations that have a neighbor
with population above the 99th percentile compared to locations with a largest neighbor with
population between the 95th and 99th percentiles.
During the negative regime, the increase in the magnitude of growth’s relationship with the
population of its largest nearby locations is especially strong in the 1880-1900 regression. Having at
11The one exception concerns regressions of growth from 1880 to 1900 on the presence of very large neighbors. In contrast to the combined regression, estimated coefficients using the first quintile of observations are positive.
(1) (2) (3) (4) (5) (6) (7)
neighbor w/population: quints 1 to 4, quint 1, quint 2, quint 3, quint 4, decile 9, decile 10
≥ 95th pctile @ 1 to 50 km -1.10 -1.17 -0.35 -0.47 -0.25 0.07
(0.21) (0.26) (0.25) (0.11) (0.13) (0.16)
50 to 100 km† -0.86 -1.55 -0.79 -0.40 -0.23
(0.18) (1.05) (0.31) (0.12) (0.10)
100 to 150 km -0.63 -1.92 -0.39
(0.15) (0.99) (0.19)
150 to 200 km -1.03
(0.96)
200 to 250 km -0.65
(0.96)
250 to 300 km -0.14
(0.96)
300 to 350 km -0.65
(1.01)
N 1,844 422 476 472 474 237 237
control vars 52 52 50 51 52 48 50
R2 0.726 0.757 0.503 0.470 0.514 0.553 0.583
Adj R2 0.718 0.718 0.441 0.403 0.451 0.436 0.469
R2-R2 controls 0.008 0.005 0.016 0.013 0.011 0.004 0.000
pop≥95th pctile 51ths-2.5mn 51ths-2.5mn 51ths-2.5mn 51ths-2.5mn 51ths-2.5mn 51ths-2.5mn 51ths-2.5mn
N, pop≥thresh 119 119 119 119 119 119 119
pop of obs 100-25.7ths 100-5.6ths 5.6ths-10.7ths 10.8ths-16.1ths 16.1ths-25.7ths 25.7ths-36.6ths 36.6ths-2.5mn
share of pop 0.42 0.02 0.08 0.13 0.19 0.14 0.44
Table 4: Population Growth by Quintile, 1880 to 1900.
†50-to-100km row reports results for 1 to 100km when no results are reported in the 1-to-50km row.
All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations
have population in the enumerated percentile range. Standard errors, in parentheses, are robust to spatial correlation
based on Conley (1999). Dark blue and light blue fonts respectively indicate a negative coefficient that statistically
differs from zero at the 0.05 and 0.10 levels.
least one neighbor within 50km that had population (weakly) above the 80th percentile is associated
with slower predicted growth of 0.21 percentage point per year. If the largest such neighbor within
50km had population above the 90th percentile, predicted growth is slower by an additional 0.37
percentage point per year. If the largest such neighbor had population above the 95th percentile,
predicted growth is slower by still an additional 0.74 percentage point per year. As an example,
(1) (2) (3) (4) (5) (6) (7)
neighbor w/population: quints 1 to 4, quint 1, quint 2, quint 3, quint 4, decile 9, decile 10
≥ 95th pctile @ 1 to 50 km 1.19 1.69 1.34 1.03 1.16 0.27 0.12
(0.19) (0.83) (0.34) (0.26) (0.19) (0.15) (0.11)
50 to 100 km 0.21 1.12 0.15 0.18
(0.11) (0.47) (0.13) (0.12)
100 to 150 km 0.56
(0.31)
150 to 200 km 0.60
(0.31)
N 2,283 571 570 571 571 285 285
control vars 52 51 50 51 52 51 52
R2 0.391 0.521 0.452 0.459 0.410 0.427 0.647
Adj R2 0.376 0.470 0.397 0.404 0.349 0.298 0.566
R2-R2 controls 0.033 0.022 0.032 0.034 0.086 0.005 0.001
pop≥95th pctile 172ths-14.2mn 172ths-14.2mn 172ths-14.2mn 172ths-14.2mn 172ths-14.2mn 172ths-14.2mn 172ths-14.2mn
N, pop≥thresh 143 143 143 143 143 143 143
pop of obs 208-42.3ths 208-7.8ths 7.8ths-13.5ths 13.5ths-21.3ths 21.3ths-42.3ths 42.3ths-79.5ths 79.5ths-14.2mn
share of pop 0.20 0.02 0.03 0.05 0.10 0.09 0.71
Table 5: Population Growth by Quintile, 1960 to 1980.
All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations
have population in the enumerated percentile range. Standard errors, in parentheses, are robust to spatial correlation
based on Conley (1999). Dark and light red fonts respectively indicate a positive coefficient that statistically differs
from zero at the 0.05 and 0.10 levels.
consider a location that has a neighbor with population at the 99th percentile between 50km and
100km away and no neighbor with population above the 90th percentile within 50km. During the
period 1880-1900, such a location would have slower predicted population growth of 1.36 percentage
point per year—the sum of the coefficients on the 50-to-100km indicators for the 90th, 95th, and
99th percentiles—compared to observations that do not have a neighbor in any of the categories
included in the regression.
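The nested-threshold arithmetic in this paragraph can be sketched in a few lines. The function name and the example neighbor percentiles are hypothetical; the marginal coefficients are the 1880-1900 values quoted in the text and in Table 6, with the 50-to-100km coefficient for the 90th percentile read from the table on the assumption that it belongs to the 1880-1900 column, which is consistent with the reported sum of 1.36.

```python
def cumulative_gap(neighbor_pctile, marginal_coefs):
    """Sum the marginal coefficients of every nested threshold the neighbor clears.

    marginal_coefs maps a percentile threshold to the coefficient on the
    corresponding indicator (percentage points of annual growth).
    """
    return sum(coef for threshold, coef in marginal_coefs.items()
               if neighbor_pctile >= threshold)

# 1880-1900 marginal coefficients for neighbors within 50km (from the text).
within_50km = {80: -0.21, 90: -0.37, 95: -0.74}

# 1880-1900 marginal coefficients for neighbors 50 to 100km away (Table 6).
from_50_to_100km = {90: -0.29, 95: -0.54, 99: -0.53}

# Largest neighbor within 50km just above the 95th percentile (hypothetical case).
gap_near = cumulative_gap(96, within_50km)

# Neighbor at the 99th percentile between 50km and 100km away (example in the text).
gap_far = cumulative_gap(99, from_50_to_100km)

print(round(gap_near, 2), round(gap_far, 2))
```

Because the indicators are nested, a location's predicted growth gap relative to the excluded category is the running sum of all coefficients at or below its neighbor's size class, not the single coefficient for that class.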
During the positive regime, the largest marginal increases in predicted growth are associated
with having a neighbor with population at the 99th percentile rather than having one with pop-
ulation between the 95th and 99th percentiles. This is especially so during the 1960-1980 period,
when the marginal increase was 2.43 percentage point per year. For neighbors located more than
50km away, only those with population at the 99th percentile are associated with a meaningful
(1) (2) (3) (4) (5) (6) (7) (8) (9)
neighbor w/population 1840-1860 1860-1880 1880-1900 1900-1920 1920-1940 1940-1960 1960-1980 1980-2000 2000-2017
≥ 80th pctile @ 1 to 50 km -0.41 -0.72 -0.21 -0.30 0.10 -0.07 0.06 0.14 0.06
(0.21) (0.19) (0.12) (0.09) (0.07) (0.08) (0.07) (0.06) (0.05)
≥ 90th pctile @ 1 to 50 km -0.40 0.25 -0.37 0.13 -0.03 0.30 0.15 0.21 0.00
(0.35) (0.27) (0.19) (0.14) (0.11) (0.11) (0.08) (0.09) (0.08)
50 to 100 km -0.30 -0.19 -0.29 0.00
(0.24) (0.22) (0.16) (0.06)
≥ 95th pctile @ 1 to 50 km -0.09 -0.44 -0.74 0.01 0.18 0.54 0.72 0.17 0.36
(0.38) (0.30) (0.23) (0.20) (0.15) (0.17) (0.15) (0.13) (0.20)
50 to 100 km -0.49 -0.17 -0.54 -0.22 -0.09 0.06 0.04 0.17 0.04
(0.30) (0.22) (0.17) (0.10) (0.08) (0.09) (0.09) (0.08) (0.09)
100 to 150 km -0.37 0.05 -0.60 0.00
(0.12) (0.20) (0.15) (0.06)
≥ 99th pctile @ 1 to 50 km -0.48 -0.09 -0.31 0.55 1.88 2.43
(0.36) (0.46) (0.29) (0.26) (0.28) (0.59)
50 to 100 km† 0.49 -0.40 -0.53 -0.52 0.28 0.59 0.98 0.96 0.66
(0.45) (0.30) (0.26) (0.20) (0.18) (0.16) (0.24) (0.18) (0.15)
100 to 150 km 0.54 -0.59 -0.14 -0.42 0.09 0.16 0.28 0.50 0.40
(0.44) (0.26) (0.23) (0.18) (0.19) (0.14) (0.14) (0.13) (0.10)
150 to 200 km 0.65 -0.75 -0.04 -0.51 0.17 0.06 0.11 0.25 0.27
(0.35) (0.19) (0.21) (0.17) (0.16) (0.13) (0.11) (0.09) (0.07)
200 to 250 km 0.14 0.03 -0.21 0.19 -0.09 0.01 0.11 0.23
(0.23) (0.16) (0.15) (0.15) (0.13) (0.11) (0.08) (0.07)
250 to 300 km 0.25 0.03 -0.13 0.16 0.12
(0.21) (0.16) (0.18) (0.13) (0.07)
N (quints 1 to 4) 691 1,328 1,844 2,110 2,357 2,387 2,283 2,104 1,895
control vars 48 52 52 52 52 52 52 52 52
R2 0.860 0.794 0.729 0.537 0.409 0.390 0.427 0.463 0.345
Adj R2 0.847 0.784 0.719 0.523 0.393 0.374 0.411 0.447 0.323
R2-R2 controls 0.005 0.008 0.011 0.006 0.006 0.052 0.069 0.046 0.031
pop, pctile 80 24ths 23ths 26ths 30ths 32ths 36ths 42ths 49ths 65ths
pop, pctile 90 41ths 35ths 37ths 44ths 49ths 60ths 79ths 103ths 153ths
pop, pctile 95 62ths 49ths 51ths 65ths 80ths 100ths 172ths 150ths 377ths
pop, pctile 99 171ths 139ths 139ths 197ths 321ths 442ths 810ths 1.3mn 2.0mn
Table 6: Population Growth and the Size of Large Neighbors.
†50-to-100km row reports results for 1 to 100km when no results are reported in the 1-to-50km row.
All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations
have population in the first four quintiles. Standard errors, in parentheses, are robust to spatial correlation based on
Conley (1999).
increase in predicted growth. For the period from 2000 to 2017, the statistically-significant boost
from having a very large neighbor extends to those as much as 300km away. In contrast to the
negative regime, the magnitudes of the estimated differences in growth are modest for neighbors
with population between the 80th and 90th percentiles.
Our findings of how the relative size of locations and neighbors affects the strength of urban
shadows and spillovers can be summarized as follows:
Stylized Fact 4: Relative Size of Locations and Neighbors. Urban shadows and urban
spillovers tend to strengthen with the size difference between a location and its large neighbor. That is,
the smaller a location and the larger its neighbor, the stronger urban shadows and urban spillovers.
2.7 Regional Variation during the U.S. Westward Expansion
In this subsection we aim to understand to what extent the existence of urban shadows might
have been related to the westward expansion of the U.S. during the 19th and early 20th centuries.
If during that time period locations in the West, many of them isolated, grew fast because they
were becoming settled, this would contribute to a negative correlation between local growth and
proximity to a large neighbor. To assess this possibility, we rerun our baseline regression for three
separate regions: the East, corresponding to the states covering the area of the original 13 colonies,
excluding Kentucky and Tennessee;12 the Middle, corresponding to all other states east of the
Mississippi River; and the West, corresponding to all states west of the Mississippi River. Table 7
reports our findings for the time period 1860-1920. Although, as expected, the magnitudes of the
correlations are stronger in the more newly settled portions of the U.S. (the West) compared to
portions that were settled earlier (the East), we continue to observe evidence of urban shadows in
all regions of the country.
3 Commuting Costs: 1840 to 2017
One important force that shapes spatial growth dynamics in the hinterland of large population
clusters relates to intra-city and inter-city commuting costs. As these will play a key role in the
conceptual framework we present to interpret our empirical findings, in this section we briefly
document how commuting costs in the U.S. have evolved since 1840. We start by focusing on
changes in transportation technology, and then consider other factors that also affect commuting
costs.
12The land areas that would eventually constitute Kentucky and Tennessee were originally part of Virginia and North Carolina, respectively.
(1) (2) (3) (4) (5) (6) (7) (8) (9)
East: (1)-(3); Middle: (4)-(6); West: (7)-(9)
neighbor w/pop ≥ 95th percentile @ distance: 1860-1880 1880-1900 1900-1920 1860-1880 1880-1900 1900-1920 1860-1880 1880-1900 1900-1920
1 to 50 km -0.28 -0.31 0.50 -0.58 -0.62 0.06 -2.11 -1.78 -0.13
(0.31) (0.10) (0.20) (0.24) (0.13) (0.08) (1.88) (0.45) (0.27)
50 to 100 km† -0.21 -0.46 -0.52 -2.08 -1.74 -0.38
(0.26) (0.17) (0.14) (1.48) (0.31) (0.20)
100 to 150 km -2.59 -1.41 -0.07
(1.29) (0.28) (0.18)
150 to 200 km -2.30 -1.17
(1.03) (0.26)
200 to 250 km -2.36 -0.94
(0.90) (0.26)
250 to 300 km -1.77 -0.53
(0.70) (0.23)
300 to 350 km -1.41
(0.40)
N (quints 1 to 4) 346 379 390 479 567 600 503 898 1,120
control vars 41 41 41 42 42 42 42 43 43
R2 0.527 0.445 0.363 0.742 0.690 0.389 0.828 0.761 0.589
Adj R2 0.459 0.375 0.286 0.716 0.664 0.342 0.810 0.748 0.572
R2-R2 controls 0.004 0.004 0.017 0.004 0.009 0.000 0.007 0.008 0.001
Table 7: Population Growth and Large Neighbors by Region, 1860 to 1920.
All regressions include a constant and control for initial population and up to 47 geographic covariates. Observations have population in the first four quintiles of the national distribution. East region includes the states occupying the land area of the original 13 U.S. colonies excluding Kentucky and Tennessee. Middle region includes all other states east of the Mississippi River. West region includes all states west of the Mississippi River. Standard errors, in parentheses, are robust to spatial correlation based on Conley (1999). Dark blue and light blue fonts respectively indicate a negative coefficient that statistically differs from zero at the 0.05 and 0.10 levels.
3.1 Transportation Technologies
Since the middle of the 19th century, there have been enormous improvements in transportation
technologies. Some of those have greatly enhanced long-distance trade and market integration.
Examples that come to mind include the railroad network (Fogel, 1964; Donaldson and Hornbeck,
2016), the building of canals (Shaw, 1990), the construction of the inter-state highway system
(Baum-Snow, 2007), and containerization (Bernhofen et al., 2016). To illustrate the magnitude of
the decline in transport costs, Glaeser and Kohlhase (2004) document that the real cost per ton-mile
of railroad transportation dropped by nearly 90% between 1890 and 2000. Other changes have been
more central to improving short-distance transportation between neighboring or relatively close-by
places. For the purpose of our paper, we are mostly interested in these latter improvements. In
what follows we give a brief overview of the main innovations that have benefited short-distance
transportation technology in the U.S. over the past two centuries.
Prior to the 1850s, many Americans worked near the central business district and walked
to work. Other forms of transportation were expensive and slow. Horse-drawn carriages were
available, but were only affordable to the very rich (LeRoy and Sonstelie, 1983).13 The omnibus, a
horse-drawn vehicle carrying twelve passengers, was first introduced in the 1820s and became more
widely used in the 1840s (Kopecky and Suen, 2004). However, it remained a costly and fairly
slow way to travel.14 Commuter railroads appeared in the 1830s, although they were noisy and
polluting, which led authorities to impose strict regulations, often limiting their use.15
Between 1850 and 1900 the U.S. witnessed the arrival of the streetcar or trolley, which
allowed for smoother travel and larger capacity than an omnibus. As with many other new modes
of transport, initially only high-income individuals could pay the high fare of streetcars to commute
to work on a regular basis. Nonetheless, the introduction of the streetcar allowed the larger cities
to grow. Boston saw the first “streetcar suburbs”, well-off neighborhoods on the outskirts of the
city (Warner, 1972; Mieszkowski and Mills, 1993; Kopecky and Suen, 2004). The streetcar was
an important improvement over the omnibus in terms of capacity and speed: a two-horse streetcar
could carry 40 passengers, and its speed was about one-third greater. Over time, animal power was
replaced by cleaner and more efficient sources of motive power. The first electric streetcar was operated
in Montgomery, Alabama, in 1886, and by the end of 1903, 98 percent of the 30,000 miles of street
railway had been electrified.16 By 1920, the streetcar had become an affordable means of commuting
for almost every worker. However, by then the car had made its appearance, so the streetcar never
became widely used by all income groups.
Several factors contributed to the streetcar facilitating longer-distance commutes, thus al-
lowing large cities to grow bigger. One was an improvement in speed, another was the use of flat
rates independent of distance, and a third was the construction of longer rail lines. In his study
of Boston, Warner (1972) argues that the trolley triggered a substantial outward expansion of the
city. In particular, he estimates this expansion to have been between 0.5 and 1.5 miles per decade.
As Jackson (1985) explains, this translates into the outer limit of convenient commuting, defined as
the distance that can be traversed in one hour or less, increasing from about 2 miles from Boston’s
13Regular steam ferry service began in the early 1810s but was limited to big coastal cities like New York.
14LeRoy and Sonstelie (1983) document that an omnibus fare ranged from 12 cents to 50 cents at a time when a laborer might earn $1.00 a day. Its average speed was slow, about 6 miles per hour.
15As in the case of the omnibus, commuter railroads were also quite expensive (LeRoy and Sonstelie, 1983).
16Before the use of electricity, the use of steam engines was briefly tried, with limited success, in several U.S. cities.
City Hall in 1850 to 6 miles in 1900.
While all these innovations significantly decreased transportation and commuting costs, it
was not until the path-breaking invention of the internal combustion engine that these costs would
experience radical change. The adoption of the car did not happen overnight: the affordability of
automobiles for the middle class had to wait until the mass production of the Model T in 1908.
Other issues had to be resolved as well before cars could become widespread. Initially, regulations
limited their use and capped their speed at 4 miles per hour to avoid scaring horses. There was also
a scarcity of gasoline stations and service facilities. More importantly, roads were still largely unpaved.
The growth in car ownership and use was tightly linked to the investment in roads and
highways. New York opened the first part of its parkway system in 1908, which allowed drivers to
increase their speed to 25 miles per hour. The Federal Highway Act of 1921 allowed the construction
of similar highways across the country. In 1913, there was one motor vehicle for every eight people
and, by the end of the 1920s, the car was used by 23 million people. The government effort was
boosted years later with the Eisenhower Interstate Highway system, which was authorized by the
Federal-Aid Highway Act of 1956 and is arguably the largest public works project in history. Car
ownership continued to rise throughout this period, until the 1970s (Kopecky and Suen, 2004).
The combination of the mass use of the car and the expansion of the highway system
translated into a huge wave of suburbanization, mostly in the post-WWII era. Many of these
highways connected the downtown areas of large urban centers to the suburbs and the farther-
off hinterland. According to Glaeser (2011), “the highway program was meant to connect the
country, but subsidizing highways ended up encouraging people to commute by car”. Baum-Snow
(2007) argues that cars and highways were a fundamental determinant of the suburbanization of
American cities. His estimations show that, between 1950 and 1990, the construction of one new
highway passing through a central city reduced its population by about 18 percent. Another major
transportation change starting around 1950 was the construction of suburban rail terminals. In
cities like San Francisco and Washington, D.C., heavy-rail systems were established, while light-rail
systems followed in cities like San Diego and Portland (Young, 2015).17
3.2 Other Commuting Costs
In addition to transport technology, other factors that determine the time cost of commuting are
the spatial concentration of people and businesses, traffic congestion, and the opportunity cost of
time.
17Suburbanization was also facilitated by factors unrelated to transport technology: the home mortgage interest deduction, the introduction of government-guaranteed mortgages, the Federal Housing Administration loans that guaranteed up to 95 percent of mortgages for middle-income buyers, and the GI Bill that offered no down payment housing loans for veterans.
Spatial Clustering. Commuting costs fall if it becomes easier to fit more people or businesses
onto an acre of land, since this implies fewer people needing to commute long distances. One major
factor facilitating density is the possibility of building vertically. Historically, this move upward
was at first modest, as two-story buildings were gradually replaced by four- and six-story buildings
(Glaeser, 2011). Heights were restricted by the cost of construction and the limits on people’s
desire to climb stairs. As a result, the top floors of six-story buildings were typically occupied by
the lowest-income tenants (Bernard, 2014). This all changed with the invention of the elevator. A
first elevator engine was presented by Elisha Otis at New York’s Crystal Palace Exposition in 1854,
but its rudimentary technology was unsuitable for use in tall buildings. In 1880, Werner von
Siemens’ electric elevator made it possible to transport people to tall heights in a safe manner,
hence enabling the construction of skyscrapers with functional uses.
Another challenge that had to be overcome to build skyscrapers was an architectural con-
straint: erecting tall buildings required thick walls, making skyscrapers unprofitable. The solution
to this problem was the use of load-bearing steel skeletons, where the weight of the building rests
on a skeleton frame. Building this type of structure became possible in large part thanks to the
increasing affordability of steel in the late 19th century. The first skyscraper is often attributed
to William Le Baron Jenney’s Home Insurance Building, a 138-foot structure built in Chicago in
1885.18 In the following decades, skyscrapers became a fixture in the skylines of American cities,
especially in Manhattan, which witnessed a boom in the number of skyscrapers in the 1920s.19
Congestion. The speed of commuting is of course not only a function of available technology. As
traffic congestion has become worse, the most recent decades have witnessed a slowdown or even
a reversal in the trend of ever-faster commuting. As one indicator of this growing congestion, we
use the travel time index (TTI) of the Texas A&M Transportation Institute. The TTI is defined
as the ratio of travel time in the peak period to travel time at free-flow conditions. For example,
a value of 1.10 indicates a 20-minute free-flow trip takes 22 minutes in the peak period. Between
1990 and 2010, the TTI increased from around 1.10 to 1.20. As another indicator of congestion,
we compute the average speed of trips under 50 km from the National Household Travel Survey.
Between 1983 and 2001, this speed was still increasing, from 23.3 miles per hour to 26.4 miles per
hour. Since then, this speed has declined, and by 2017 it had fallen by nearly one quarter, to 20.3
miles per hour. A similar pattern can be observed for trips between 50 and 100 km. The average
speed increased from 45.2 mph in 1983 to 49.0 mph in 1995, and has since then declined, reaching
39.7 mph in 2017.
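The arithmetic behind these two congestion indicators can be checked with a minimal Python sketch. The helper names `peak_travel_time` and `percent_change` are illustrative assumptions, not part of the Texas A&M or NHTS documentation; the inputs are the figures quoted in the text.

```python
def peak_travel_time(free_flow_minutes: float, tti: float) -> float:
    """Peak-period travel time implied by a travel time index (TTI).

    The TTI is the ratio of peak travel time to free-flow travel time,
    so the peak time is the free-flow time scaled by the index.
    """
    return free_flow_minutes * tti


def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old


# A 20-minute free-flow trip at the two TTI levels quoted in the text.
print(peak_travel_time(20, 1.10))
print(peak_travel_time(20, 1.20))

# Change in average speed (mph) for short trips, 2001 to 2017,
# and for trips between 50 and 100 km, 1995 to 2017.
print(round(percent_change(26.4, 20.3), 1))
print(round(percent_change(49.0, 39.7), 1))
```

The first speed comparison reproduces the decline of nearly one quarter mentioned above; the second shows a somewhat smaller proportional decline for longer trips.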
18Other famous skyscrapers built around that year are the Montauk Building in Chicago, and the McCullough Shot and Lead Tower in New York.
19The growth in the number of skyscrapers diminished after 1933, as a result of stringent regulations based on the argument that these tall buildings severely reduced the amount of light available to pedestrians.
Opportunity Cost of Time. Another factor contributing to the increasing time cost of com-
muting is the rising opportunity cost of time. Edlund et al. (2016) focus on the increase in
double-income high-skilled households between 1980 and 2010. Dual-earner couples have less time,
making commuting more costly, giving them an incentive to live closer to work. Edlund and co-
authors find that the increase in the number of couples where both partners work has contributed to
gentrification and urban renewal in recent decades. Su (2018) makes a similar point, but focuses on
individuals between 1990 and 2010. The percentage of those working long hours has increased for
all skill classes, though the effect is larger for the college educated. To economize on commuting
time, the high-skilled are disproportionately moving to city centers.20
3.3 Summary
When focusing on 1840-2017, the above discussion suggests that we can distinguish three subperiods
in the evolution of commuting costs. Between 1840 and 1920, there was a gradual decline in
commuting costs, driven by the introduction of the omnibus and the streetcar, followed by the
incipient adoption of the car. After 1920, there was a rapid decline in commuting costs, driven by
the mass adoption of the automobile, the construction of highways connecting urban areas with
their hinterlands, and the expansion of suburban rail systems. By the turn of the 21st century, this
continuous decline in commuting costs slowed down, because of the increase in congestion and the
rising opportunity cost of time.21
4 Conceptual Framework
In this section we provide a two-city spatial model with commuting and moving costs that is able
to account for the main stylized facts identified in the data. On the one hand, the smaller city may
find it hard to survive in the shadow of the larger city, as its residents prefer to move to the more
productive neighbor. On the other hand, the smaller city may thrive as its residents can access
the neighbor’s higher productivity through commuting. As commuting costs decline, we find that
urban growth shadows dominate in the early stage, whereas urban growth spillovers dominate later
on.
20Of course, since this process of gentrification also displaces people, it is not clear whether this is associated with a decline or an increase in the center-city population.
21The years that separate the different subperiods do not constitute precise breakpoints. For example, we can use either 1920 or 1940 to separate the first two subperiods, as the mass adoption of cars started after 1908, whereas the building of urban highways and suburban rail networks only started in earnest in the 1950s and the 1960s.
4.1 Setup and Equilibrium
Endowments. The economy consists of a continuum of points on a line. The density of land at
all points of the line is one. There are L individuals, each residing on one unit of land. Each resident
has one unit of time, which she divides between work and commuting. On the line there are two
exogenously given production points, denoted by $\ell$ and $k$. The set of individuals living closer to
production point $\ell$ than to production point $k$ comprises city $\ell$. Whereas the total population, $L$,
is exogenous, the populations of the two cities are endogenous. The land rent in city $\ell$ at distance
$d_\ell$ from production point $\ell$ is denoted by $r_\ell(d_\ell)$. The distance between production points $\ell$ and $k$,
denoted by $d_{\ell k}$, is big enough so that there is at least some empty land between the two cities.22
Technology. The economy produces one good, and labor is the only factor of production. Technology
is linear, with one unit of labor producing $A_\ell$ units of the good at production point $\ell$ and $A_k$
units of the good at production point $k$. The price of the good is normalized to one. To produce,
an individual needs to commute from her residence to one of the two production points.
The time cost of commuting per unit of distance is $\gamma$. An individual who resides in city $\ell$
at a distance $d_\ell$ from production point $\ell$ can choose between working in $\ell$ or $k$. If she works in her
own city $\ell$, she supplies one unit of labor net of the time lost in intra-city commuting, $1-\gamma d_\ell$, and
earns an income of $A_\ell(1-\gamma d_\ell)$. If she commutes to the other city $k$, we ignore differences in the
residence location in the own city, and assume that she incurs an inter-city commuting distance
$d_{\ell k}$.23 In that case, she supplies $1-\gamma d_{\ell k}$ units of labor, and earns an income $A_k(1-\gamma d_{\ell k})$. Without
loss of generality, we assume that no one residing in $k$ has an incentive to commute to $\ell$. As
a result, an individual residing in city $k$ at a distance $d_k$ from production point $k$ supplies $1-\gamma d_k$
units of labor, earning an income of $A_k(1-\gamma d_k)$. Summarizing, depending on where an individual
resides and works, there are three possible expressions for income $y$:
$$
\begin{aligned}
y_\ell(d_\ell) &= A_\ell(1-\gamma d_\ell) \\
y_k(d_k) &= A_k(1-\gamma d_k) \\
y_k(d_{\ell k}) &= A_k(1-\gamma d_{\ell k}),
\end{aligned}
\tag{2}
$$
where the subscript on $y$ refers to the individual's workplace and the subscript on $d$ refers to her
commuting distance, and hence implicitly to her place of residence. For example, $y_k(d_{\ell k})$ refers to
22This ensures symmetry in a city’s spatial structure on both sides of its production point.
23This simplifying assumption has the advantage of maintaining symmetry between agents who reside at a distance $d_\ell$ to the right of production point $\ell$ and those who reside at that same distance $d_\ell$ to the left of production point $\ell$. It implies that cities will be symmetric in shape: the number of residents living to the right and to the left of production point $\ell$ will be the same. That is, the distance from production point $\ell$ to the edge of city $\ell$, denoted by $\bar d_\ell$, will be the same on both sides of $\ell$.
the income of an individual who works in $k$ and covers a distance $d_{\ell k}$ to get to work, implying that
she lives in $\ell$.
The possibility of commuting from city $\ell$ to production point $k$ is meant to capture the
positive effect of urban spillovers from the neighboring city. These spillovers decline with distance:
by commuting from city $\ell$ to production point $k$, a resident loses working time at a rate of $\gamma$
per unit of distance, giving her access to a de facto discounted version of the neighboring city's
productivity, $A_k(1-\gamma d_{\ell k})$.24 We could alternatively model this effect through direct technology
spillovers, without the need of introducing inter-city commuting, as in Ahlfeldt et al. (2015): if
technological spillovers decay at a rate of $\gamma$ per unit of distance, then an agent who resides and
works in city $\ell$ would have access to a discounted version of her neighbor's productivity, $A_k(1-\gamma d_{\ell k})$,
in exactly the same way as an agent who resides in $\ell$ and pays a time cost of $\gamma$ per unit of distance
to commute to k. In that sense, both interpretations are interchangeable in their effects on income.
Hence, our results do not strictly hinge on the existence of inter-city commuting. We will return
to this alternative interpretation when we discuss the model’s results.25
Preferences. Utility is equal to income, with two adjustments. First, agents who change residence
from city $\ell$ to city $k$ pay a consumption-equivalent utility moving cost $\mu d_{\ell k}$ that is increasing
in inter-city distance. As in the case of commuting, without loss of generality, no individual originally
from $k$ has an incentive to move to $\ell$. Because of the moving cost, an agent's utility depends
not just on where she resides and where she works, but also on where she is originally from. Second,
occupying land is costly. In particular, an individual residing at a distance $d_\ell$ from the production
point $\ell$ pays a land rent of $r_\ell(d_\ell)$, which reduces her net income and hence her utility.
Therefore, depending on an individual's place of origin, place of residence and place of work,
there are four possible expressions for utility:
$$
\begin{aligned}
u_\ell^\ell(d_\ell) &= A_\ell(1-\gamma d_\ell) - r_\ell(d_\ell) \\
u_k^\ell(d_k) &= A_k(1-\gamma d_k) - r_k(d_k) - \mu d_{\ell k} \\
u_k^\ell(d_{\ell k}) &= A_k(1-\gamma d_{\ell k}) - r_\ell(d_\ell) \\
u_k^k(d_k) &= A_k(1-\gamma d_k) - r_k(d_k),
\end{aligned}
\tag{3}
$$
where the superscript on $u$ refers to the individual's place of origin, the subscript on $u$ refers to her
workplace, and the subscript on $d$ refers to her commuting distance, and hence implicitly to her
place of residence. For example, $u_k^\ell(d_{\ell k})$ refers to the utility of an individual who is originally from
24If we include the inter-city commuting time as part of the necessary time dedicated to work, we can interpret $A_k(1-\gamma d_{\ell k})$ as the productivity of a commuter from the other city.
25Yet another alternative would be to model this effect by introducing trade. We provide such a model in the Appendix.
$\ell$, resides in $\ell$ and works in $k$, whereas $u_k^\ell(d_k)$ refers to the utility of an individual who is originally
from $\ell$, resides in $k$ and works in $k$. To simplify notation, later in the paper we will sometimes
refer to $u_\ell^\ell(d_\ell)$ as the staying utility $U_S$, to $u_k^\ell(d_k)$ as the moving utility $U_M$, and to $u_k^\ell(d_{\ell k})$ as the
commuting utility $U_C$.
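As a minimal sketch, the three utilities available to an original resident of the smaller city can be written as plain Python functions. The function names and all numerical inputs below are illustrative assumptions chosen for this example; rents are passed in as arguments and set to zero, which corresponds to a resident at a city edge.

```python
def staying_utility(A_l, gamma, d_l, rent_l):
    """U_S: origin l, residence l, work in l."""
    return A_l * (1.0 - gamma * d_l) - rent_l


def moving_utility(A_k, gamma, d_k, rent_k, mu, d_lk):
    """U_M: origin l, residence k, work in k (pays moving cost mu * d_lk)."""
    return A_k * (1.0 - gamma * d_k) - rent_k - mu * d_lk


def commuting_utility(A_k, gamma, d_lk, rent_l):
    """U_C: origin l, residence l, work in k (commutes distance d_lk)."""
    return A_k * (1.0 - gamma * d_lk) - rent_l


# Hypothetical inputs: an edge resident (zero rent) in each city.
A_l, A_k = 1.0, 1.1
gamma, mu, d_lk = 0.25, 0.03, 1.0
d_l, d_k = 0.2, 0.55

options = {
    "stay": staying_utility(A_l, gamma, d_l, rent_l=0.0),
    "move": moving_utility(A_k, gamma, d_k, rent_k=0.0, mu=mu, d_lk=d_lk),
    "commute": commuting_utility(A_k, gamma, d_lk, rent_l=0.0),
}
best = max(options, key=options.get)
print(options, best)
```

With these assumed values, commuting costs are high enough that staying yields the highest utility, even though the other city is more productive.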
We now provide some more details about the moving cost $\mu d_{\ell k}$, and also relate it to the
concept of urban shadows. Rather than interpreting the moving cost as a time cost, we think of it
as the utility cost of being a migrant. For example, this could include the psychological and social
costs of having to leave friends and family behind. Consistent with this interpretation, we assume
that a return migrant does not pay a moving cost. That is, if an individual who moved from city
$\ell$ to city $k$ returns to her hometown, she does not pay a moving cost. The possibility of moving
between cities is meant to capture urban shadows: an individual who resides in a low-productivity
smaller city in the proximity of a high-productivity larger city may find it beneficial to move to
its high-productivity neighbor. If so, the smaller city loses population, and the larger city casts a
growth shadow on the smaller city.
Residential Mobility within Cities. People can freely locate within cities. Where land is
unoccupied, land rents are normalized to zero. Hence, at the city edge $\bar d_\ell$ land rents are $r_\ell(\bar d_\ell) = 0$,
whereas at other locations closer to the production center $\ell$ land rents are determined by the
within-city residential free mobility condition. The same of course applies to city $k$.
To determine equilibrium land rents at different locations, note that in city $\ell$ there are
potentially two types of agents: residents who work locally in $\ell$, denoted by $L_{\ell\ell}$, and residents who
commute to $k$, denoted by $L_{\ell k}$. The total cost of land rents and commuting costs incurred by a
resident who lives at distance $d_\ell$ and works locally is $r_\ell(d_\ell) + A_\ell\gamma d_\ell$, whereas the analogous cost if
she commutes to $k$ is $r_\ell(d_\ell) + A_k\gamma d_{\ell k}$. Since all commuters to the other city $k$ have to cover the same
distance $d_{\ell k}$, independently of where they reside in city $\ell$, they all prefer to live on the city edge
and pay zero rent. As a result, there will be an area $L_{\ell k}/2$ on both edges of the city where rents are
zero. To be precise, for all $d_\ell \in [\bar d_\ell - L_{\ell k}/2, \bar d_\ell]$ we have $r_\ell(d_\ell) = 0$. For all other locations closer to
production point $\ell$, occupied by residents who work locally, the sum of land rents plus commuting
costs must equalize. Hence, for all $d_\ell \in [0, \bar d_\ell - L_{\ell k}/2]$, we have $r_\ell(d_\ell) + A_\ell\gamma d_\ell = A_\ell\gamma(\bar d_\ell - L_{\ell k}/2)$, so
that $r_\ell(d_\ell) = A_\ell\gamma(\bar d_\ell - L_{\ell k}/2 - d_\ell)$. Summarizing, equilibrium land rents in city $\ell$ are:
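$$
r_\ell(d_\ell) =
\begin{cases}
A_\ell\gamma\left(\bar d_\ell - \frac{L_{\ell k}}{2} - d_\ell\right) & \text{if } d_\ell \in \left[0,\ \bar d_\ell - \frac{L_{\ell k}}{2}\right] \\
0 & \text{if } d_\ell \in \left[\bar d_\ell - \frac{L_{\ell k}}{2},\ \bar d_\ell\right].
\end{cases}
\tag{4}
$$
In city $k$, without loss of generality, no residents commute to $\ell$, so land rents are simply:
$$
r_k(d_k) = A_k\gamma(\bar d_k - d_k). \tag{5}
$$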
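The rent schedules in (4) and (5) can be sketched in a few lines of Python. The function names and the numerical inputs are illustrative assumptions; the check at the end confirms the property used in the derivation, namely that rent plus commuting cost is the same for every resident of the smaller city who works locally.

```python
def rent_city_l(d, d_bar, L_lk, A_l, gamma):
    """Equation (4): land rent in city l at distance d from its production point.

    d_bar is the distance from the production point to the city edge and
    L_lk the number of inter-city commuters, who occupy a band of width
    L_lk / 2 at each edge and pay zero rent.
    """
    if not 0.0 <= d <= d_bar:
        raise ValueError("d must lie inside the city")
    inner_edge = d_bar - L_lk / 2.0
    if d <= inner_edge:
        return A_l * gamma * (inner_edge - d)
    return 0.0


def rent_city_k(d, d_bar, A_k, gamma):
    """Equation (5): land rent in city k, where nobody commutes out."""
    if not 0.0 <= d <= d_bar:
        raise ValueError("d must lie inside the city")
    return A_k * gamma * (d_bar - d)


# Hypothetical inputs for illustration only.
A_l, gamma, d_bar, L_lk = 1.0, 0.25, 0.2, 0.1

# Rent plus commuting cost for local workers at several distances.
total_costs = [
    rent_city_l(d, d_bar, L_lk, A_l, gamma) + A_l * gamma * d
    for d in (0.0, 0.05, 0.10, 0.15)
]
print(total_costs)
print(rent_city_l(0.18, d_bar, L_lk, A_l, gamma))  # inside the commuter band
```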
City Choice and Commuting Choice. The smaller city will lose population if its residents
prefer to move to the larger neighbor, but it will gain population if its residents choose to commute
to the larger city. To provide some intuition for when one situation is more likely than another,
consider an individual who is originally from city $\ell$ and resides at a distance $d_\ell$ from the production
point $\ell$. She has three choices: she can stay in city $\ell$ and work at production point $\ell$, earning a utility
$U_S \equiv u_\ell^\ell(d_\ell)$; she can move to city $k$ and reside at a distance $d_k$ from her work at production point
$k$, earning a utility $U_M \equiv u_k^\ell(d_k)$; or she can commute a distance $d_{\ell k}$ from city $\ell$ to production point
$k$ to work, earning a utility $U_C \equiv u_k^\ell(d_{\ell k})$. The expressions in (3) suggest that staying is attractive
if productivity differences are small, inter-city distances are large, commuting costs are high, and
moving costs are big; moving is beneficial if commuting costs are not too high and moving costs
are sufficiently low; and commuting is the preferred choice if commuting costs become sufficiently
low.
Building on this intuition, we can now characterize the equilibrium of the economy in terms
of where individuals choose to reside and where they choose to work. We do so for a given set of
parameters $A_\ell$, $A_k$, $d_{\ell k}$, $\mu$ and $\gamma$, and for given initial values $d_\ell^0$ and $d_k^0$ which determine the size
of both cities when populated by their original residents. Without loss of generality, assume that
$A_k(1-\gamma d_k^0) \geq A_\ell(1-\gamma d_\ell^0)$, implying that if all individuals work in the city they are originally from,
the utility of a resident of $\ell$ is less than or equal to that of a resident of $k$.
Depending on the parameter values and on the initial sizes of both cities, the economy will
be in one of four equilibria, represented by the four quadrants of Figure 1. First, if original residents
of city $\ell$ do not stand to gain from either moving to city $k$ or commuting to city $k$, we will say
that we are in a staying equilibrium: every individual stays and works in the city where she resided
originally. This case is illustrated in the top-left panel of Figure 1. Second, if original residents of
city $\ell$ get a higher utility from moving than from either commuting or staying, some individuals from
city $\ell$ move to city $k$. As this happens, city $\ell$ becomes smaller and city $k$ becomes larger, implying
that the utility from moving goes down and the utility from staying goes up.26 If, as illustrated
in the top-right panel of Figure 1, the two utility levels equalize at a level above the utility from
commuting, then we will say that we are in an inter-city moving equilibrium: some individuals of
$\ell$ move to $k$, and the remainder live and work in $\ell$.27
Third, starting in the same situation, with the utility from moving being higher than the
utility from commuting or staying, it is possible that as people start moving, the utility from moving
falls to the utility from commuting. At that point, some individuals in $\ell$ start commuting to $k$,
until the utility from staying equals that from commuting. In this case, shown in the bottom-left
26The utility from commuting remains the same, since that utility depends on the distance between city $\ell$ and city $k$, which is unchanged.
27It is also possible that all individuals move out of city $\ell$ before the two utility levels meet. This possibility is not shown in Figure 1.
panel of Figure 1, the economy is in an inter-city moving and commuting equilibrium. Lastly, if
the utility from commuting is higher than the utility from moving or staying, some individuals in
$\ell$ commute to $k$. As this occurs, fewer people in $\ell$ work at production point $\ell$. This weakens the
competition for land in $\ell$ and lowers the land rent. As a result, the utility from staying increases,
and the economy reaches an equilibrium when the utility from staying equals the utility from
commuting.28 In this case, illustrated in the bottom-right panel of Figure 1, the economy is in an
inter-city commuting equilibrium.
Figure 1: Equilibrium Description
[Figure: four panels, each showing the ordering of the initial utility levels US, UM and UC. Top left: staying equilibrium. Top right: inter-city moving equilibrium. Bottom left: inter-city moving and commuting equilibrium. Bottom right: inter-city commuting equilibrium.]
Given initial conditions, this figure graphically illustrates the four possible equilibrium configurations. Horizontal lines denote the initial utility levels for the different choices: US refers to the utility of an individual staying and working in her own city, UM refers to the utility of an individual moving to the other city and working there, and UC refers to the utility of an individual commuting to the other city. In the top-left corner individuals do not gain from either moving or commuting to the other city, so we have a staying equilibrium. In the top-right corner and bottom-left corner individuals get a higher utility from moving than from commuting or staying. If moving leads the utility to equalize to that of staying, we get a moving equilibrium, whereas if it leads the utility to equalize to that of commuting, we get a moving and commuting equilibrium. In the bottom-right corner individuals get a higher utility from commuting, so we have a commuting equilibrium.
28Of course, if everyone commutes before that equality is reached, then we would have the entire city $\ell$ commuting to $k$.
We are now ready to formally define the equilibrium of the economy for a given set of
parameter values and for a given initial size of $\ell$ and $k$.
Equilibrium. Given $A_\ell$, $A_k$, $d_{\ell k}$, $\mu$ and $\gamma$, and given initial values $d_\ell^0$ and $d_k^0$, with $A_k(1-\gamma d_k^0) > A_\ell(1-\gamma d_\ell^0)$, the economy will be in one of four equilibria:
i. Staying equilibrium. If $A_\ell(1-\gamma d_\ell^0) \geq A_k(1-\gamma d_k^0) - \mu d_{\ell k}$ and $A_\ell(1-\gamma d_\ell^0) \geq A_k(1-\gamma d_{\ell k})$,
then no individual has an incentive to move from $\ell$ to $k$.
ii. Inter-city moving equilibrium. If either $A_k(1-\gamma d_k^0) - \mu d_{\ell k} > A_\ell(1-\gamma d_\ell^0) \geq A_k(1-\gamma d_{\ell k})$
or both $A_k(1-\gamma d_k^0) - \mu d_{\ell k} > A_k(1-\gamma d_{\ell k}) > A_\ell(1-\gamma d_\ell^0)$ and $A_k(1-\gamma(d_k^0+m)) - \mu d_{\ell k} \geq A_k(1-\gamma d_{\ell k})$,
then a share $\min(m, d_\ell^0)$ moves from city $\ell$ to $k$, where $m$ is the solution to
$A_k(1-\gamma(d_k^0+m)) - \mu d_{\ell k} = A_\ell(1-\gamma(d_\ell^0-m))$.
iii. Inter-city moving and commuting equilibrium. If $A_k(1-\gamma d_k^0) - \mu d_{\ell k} > A_k(1-\gamma d_{\ell k}) > A_\ell(1-\gamma d_\ell^0)$
and $A_k(1-\gamma(d_k^0+m)) - \mu d_{\ell k} < A_k(1-\gamma d_{\ell k})$, then a share $\min(m', d_\ell^0)$ of people moves from
city $\ell$ to city $k$, where $m'$ is the solution to $A_k(1-\gamma(d_k^0+m')) - \mu d_{\ell k} = A_k(1-\gamma d_{\ell k})$ and $m$ is the
solution to $A_k(1-\gamma(d_k^0+m)) - \mu d_{\ell k} = A_\ell(1-\gamma(d_\ell^0-m))$, and a share $\min(m'', d_\ell^0 - \min(m', d_\ell^0))$
of people commutes from city $\ell$ to city $k$, where $m''$ is the solution to $A_\ell(1-\gamma(d_\ell^0-m'-m'')) = A_k(1-\gamma d_{\ell k})$.
iv. Inter-city commuting equilibrium. If $A_k(1-\gamma d_{\ell k}) > A_\ell(1-\gamma d_\ell^0)$ and $A_k(1-\gamma d_{\ell k}) > A_k(1-\gamma d_k^0) - \mu d_{\ell k}$,
then $\min(c, d_\ell^0)$ commutes from city $\ell$ to city $k$, where $c$ is the solution to
$A_k(1-\gamma d_{\ell k}) = A_\ell(1-\gamma(d_\ell^0-c))$.
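The four cases above can be read as a small decision procedure. The following Python sketch is one possible transcription, using closed-form solutions of the linear equations for $m$, $m'$, $m''$ and $c$. The class and function names are assumptions made for this illustration, as are the numerical inputs; the `max(0.0, ...)` guard on commuters is an added safeguard, not part of the definition.

```python
from dataclasses import dataclass


@dataclass
class Economy:
    A_l: float    # productivity of the smaller city
    A_k: float    # productivity of the larger city
    d_lk: float   # inter-city distance
    mu: float     # moving cost per unit of distance
    gamma: float  # commuting cost per unit of distance (must be > 0)
    d0_l: float   # initial size parameter of city l
    d0_k: float   # initial size parameter of city k


def classify(e: Economy):
    """Return (label, movers, commuters) following conditions i to iv."""
    stay = e.A_l * (1 - e.gamma * e.d0_l)
    move = e.A_k * (1 - e.gamma * e.d0_k) - e.mu * e.d_lk
    commute = e.A_k * (1 - e.gamma * e.d_lk)

    if stay >= move and stay >= commute:                      # case i
        return "staying", 0.0, 0.0

    if commute > stay and commute > move:                     # case iv
        # c solves A_k(1 - gamma d_lk) = A_l(1 - gamma (d0_l - c))
        c = e.d0_l - (1 - commute / e.A_l) / e.gamma
        return "commuting", 0.0, min(c, e.d0_l)

    # Remaining cases have move > stay. m equalizes moving and staying utility.
    m = (e.A_k - e.A_l - e.gamma * (e.A_k * e.d0_k - e.A_l * e.d0_l)
         - e.mu * e.d_lk) / (e.gamma * (e.A_k + e.A_l))
    move_at_m = e.A_k * (1 - e.gamma * (e.d0_k + m)) - e.mu * e.d_lk
    if stay >= commute or move_at_m >= commute:               # case ii
        return "moving", min(m, e.d0_l), 0.0

    # Case iii: m1 equalizes moving and commuting utility,
    # m2 then equalizes staying and commuting utility.
    m1 = e.d_lk - e.d0_k - e.mu * e.d_lk / (e.A_k * e.gamma)
    movers = min(m1, e.d0_l)
    m2 = e.d0_l - m1 - (1 - commute / e.A_l) / e.gamma
    commuters = max(0.0, min(m2, e.d0_l - movers))
    return "moving and commuting", movers, commuters


# Hypothetical parameter values, varying only the commuting cost.
for g in (0.25, 0.15, 0.08, 0.05):
    print(g, classify(Economy(1.0, 1.1, 1.0, 0.03, g, 0.2, 0.55)))
```

Under these assumed values, lowering the commuting cost walks the economy through the four cases in the order staying, moving, moving and commuting, commuting.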
Now that we have formally defined the equilibrium, in the following subsection we analyze
whether the smaller city gains or loses population when certain parameter values, such as com-
muting costs, change. This will provide us with valuable predictions on the incidence of urban
shadows and urban spillovers, and it will allow us to link the theory’s predictions to the stylized
facts uncovered in the empirical part of the paper.
4.2 Urban Growth Shadows and Spillovers
In this subsection we explore how a drop in commuting costs affects urban shadows and urban
spillovers. To gain some understanding of the role of commuting costs, consider an agent residing
in the smaller, less productive city. If commuting costs are large, an agent has less incentive to
move or to commute to the larger, more productive city, because in either case she would be facing
longer commuting distances. If commuting costs are at an intermediate level, moving becomes more
attractive: although the commuting distance increases, it does so by less than if the agent were
to commute from the smaller to the larger city. If commuting costs drop far enough, commuting
to the larger city becomes the better choice: she saves the fixed cost of moving, while benefiting
from low commuting costs. This intuition suggests that a drop in γ makes moving relatively more
attractive than staying, and makes commuting relatively more attractive than moving.
Starting off in a situation where all individuals have the same utility, we formalize this
intuition and show that a gradual decrease in γ first shifts the economy from a staying equilibrium
to an inter-city moving equilibrium, with some residents of the smaller low-productivity city moving
to the larger high-productivity city. Later, as γ continues to drop, the economy shifts to an inter-city
moving and commuting equilibrium, and then to an inter-city commuting equilibrium, with some
original residents of the smaller low-productivity city commuting to the larger high-productivity
city. This is stated in the following result.
Result 1. Start off in an equilibrium where $A_k > A_\ell$ and where the utility of all individuals is
identical. For a value of µ that is sufficiently small, a gradual drop in commuting costs, γ, moves
the economy sequentially from a staying equilibrium to an inter-city moving equilibrium, an inter-
city moving and commuting equilibrium, and an inter-city commuting equilibrium. In the inter-city
moving equilibrium the smaller city loses residents to the larger city, whereas in the inter-city
moving and commuting equilibrium the smaller city gains residents from the larger city.
Proof. Initially $A_\ell(1-\gamma d_\ell^0) = A_k(1-\gamma d_k^0)$ and $A_k > A_\ell$, so that $d_k^0 > d_\ell^0$. In this case,
$A_\ell(1-\gamma d_\ell^0) \geq A_k(1-\gamma d_k^0) - \mu d_{\ell k}$ and $A_\ell(1-\gamma d_\ell^0) \geq A_k(1-\gamma d_{\ell k})$ because $d_{\ell k} \geq d_\ell^0$ by construction.
Because $A_k d_k^0 > A_\ell d_\ell^0$, $-\partial A_\ell(1-\gamma d_\ell^0)/\partial\gamma < -\partial A_k(1-\gamma d_k^0)/\partial\gamma$, so that a drop in $\gamma$ leads to
$A_k(1-\gamma d_k^0) > A_\ell(1-\gamma d_\ell^0)$. If $\gamma$ continues to drop and $\mu < (A_k - A_\ell)/d_{\ell k}$, at some point
$A_k(1-\gamma d_k^0) - \mu d_{\ell k} = A_\ell(1-\gamma d_\ell^0)$. This occurs when $\gamma$ reaches the threshold
$\gamma_m = (A_k - A_\ell - \mu d_{\ell k})/(A_k d_k^0 - A_\ell d_\ell^0)$.
If $A_k(1-\gamma_m d_{\ell k}) < A_k(1-\gamma_m d_k^0) - \mu d_{\ell k}$, which requires
$\mu < (A_k(d_{\ell k} - d_k^0)(A_k - A_\ell))/(d_{\ell k}(d_{\ell k} + A_k d_k^0 - A_\ell d_\ell^0))$, then as soon as $\gamma$ falls below $\gamma_m$, some of the original residents of $\ell$ will want
to move to $k$. To be precise, $\min[m, d_\ell^0]$ people who originally lived in $\ell$ will move to $k$, where
$A_k(1-\gamma(d_k^0+m)) - \mu d_{\ell k} = A_\ell(1-\gamma(d_\ell^0-m))$. As $\gamma$ continues to drop, $m$ will increase. At some point,
the drop in $\gamma$ reaches $A_k(1-\gamma(d_k^0 + \min[m, d_\ell^0])) - \mu d_{\ell k} = A_k(1-\gamma d_{\ell k})$. We refer to this threshold
as $\gamma_{mc}$, where $\gamma_{mc} = \mu d_{\ell k}/(A_k(d_{\ell k} - d_k^0 - \min[m, d_\ell^0]))$. Any further drop in $\gamma$ will now imply that
some of the original residents of $\ell$ will prefer to commute to $k$. If $\gamma$ continues to drop, an increasing
share of the original residents of $\ell$ commute. There is a threshold $\gamma_c = \mu d_{\ell k}/(A_k(d_{\ell k} - d_k^0))$, below
which all original residents of $\ell$ commute to $k$.
The above result implies three threshold values of $\gamma$: a high threshold, $\gamma_m$, a middle threshold, $\gamma_{mc}$,
and a low threshold, $\gamma_c$. For $\gamma \geq \gamma_m$, we are in a staying equilibrium; for $\gamma_m > \gamma \geq \gamma_{mc}$,
we are in an inter-city moving equilibrium; for $\gamma_{mc} > \gamma \geq \gamma_c$, we are in an inter-city moving and
commuting equilibrium; and for $\gamma < \gamma_c$, we are in an inter-city commuting equilibrium.
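The closed-form thresholds $\gamma_m$ and $\gamma_c$ from the proof can be checked numerically. The sketch below, with assumed function names and hypothetical parameter values, evaluates both thresholds and confirms the defining equalities: at $\gamma_m$ the staying and moving utilities coincide, and at $\gamma_c$ the moving utility (at the initial city sizes) coincides with the commuting utility. The middle threshold $\gamma_{mc}$ is left out because it depends on the endogenous number of movers.

```python
def gamma_m(A_l, A_k, d_lk, mu, d0_l, d0_k):
    """Commuting cost at which moving starts to beat staying."""
    return (A_k - A_l - mu * d_lk) / (A_k * d0_k - A_l * d0_l)


def gamma_c(A_k, d_lk, mu, d0_k):
    """Low threshold gamma_c as given in the proof of Result 1."""
    return mu * d_lk / (A_k * (d_lk - d0_k))


def utilities(gamma, A_l, A_k, d_lk, mu, d0_l, d0_k):
    """Initial staying, moving and commuting utilities at the city edges."""
    stay = A_l * (1 - gamma * d0_l)
    move = A_k * (1 - gamma * d0_k) - mu * d_lk
    commute = A_k * (1 - gamma * d_lk)
    return stay, move, commute


# Hypothetical parameter values for illustration.
p = dict(A_l=1.0, A_k=1.1, d_lk=1.0, mu=0.03, d0_l=0.2, d0_k=0.55)

g_m = gamma_m(**p)
g_c = gamma_c(p["A_k"], p["d_lk"], p["mu"], p["d0_k"])
print(f"gamma_m = {g_m:.4f}, gamma_c = {g_c:.4f}")

stay, move, _ = utilities(g_m, **p)
print(abs(stay - move) < 1e-12)      # staying and moving utilities coincide

_, move, commute = utilities(g_c, **p)
print(abs(move - commute) < 1e-12)   # moving and commuting utilities coincide
```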
Figure 2: Inter-city Moving and Commuting vs. Commuting Cost
This figure illustrates Result 1. It shows how the share of the population of the small city, as well as the share of commuters, changes with commuting costs. A drop in commuting costs leads the small city to first lose population, and then to gain population. For high commuting costs (γ ≥ γm), individuals from the small, less productive city have no incentive to move or commute to the large, more productive city. As commuting costs drop (γm > γ ≥ γmc), individuals from the small, less productive city move to the large, more productive city. The population of the small city declines: an urban growth shadow. For low levels of commuting costs (γ < γmc), individuals from the small, less productive city commute to the large, more productive city, and its original residents start to return: an urban spillover. Eventually, the small city recovers its original population (γ < γc).
Figure 2 illustrates Result 1 with a simple numerical example. The productivity values are
set such that the large city has a TFP that is 10% higher than the small city: $A_\ell = 1$ and $A_k = 1.1$.
The inter-city distance is set to 1, and the overall population to 1.5: with cities being symmetric
around their production points, this implies 75% of the land between the two cities is occupied.
We set the moving cost parameter µ to 0.03. When taking the initial income in the small city as
reference, this amounts to a cost of slightly more than 3% in income-equivalence terms. We choose
the initial value of γ to be 0.25, implying that people would lose 25% of their income if they were
to commute to the other city.
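The parameter choices above can be cross-checked with a short Python sketch. It rests on one assumption that the text does not state explicitly: with unit land density and cities symmetric around their production points, total population equals $2(d_\ell^0 + d_k^0)$. The function name `initial_edges` is hypothetical. Under that reading, the initial city sizes follow from equal utility at $\gamma = 0.25$, and the quoted shares can be recomputed.

```python
A_L, A_K = 1.0, 1.1      # productivities
D_LK = 1.0               # inter-city distance
POPULATION = 1.5         # total population
MU = 0.03                # moving cost parameter
GAMMA0 = 0.25            # initial commuting cost


def initial_edges(A_l, A_k, gamma0, population):
    """Solve A_l(1 - g d_l) = A_k(1 - g d_k) subject to 2(d_l + d_k) = population.

    Assumes unit land density and symmetric cities (an interpretation
    of the setup, flagged as such in the surrounding text).
    """
    half = population / 2.0
    d_k = (A_k - A_l + A_l * gamma0 * half) / (gamma0 * (A_l + A_k))
    return half - d_k, d_k


d0_l, d0_k = initial_edges(A_L, A_K, GAMMA0, POPULATION)
income_small = A_L * (1 - GAMMA0 * d0_l)

print(f"d0_l = {d0_l:.4f}, d0_k = {d0_k:.4f}")
print(f"share of inter-city land occupied = {(d0_l + d0_k) / D_LK:.2f}")
print(f"moving cost / initial small-city income = {MU * D_LK / income_small:.4f}")
print(f"income share lost by inter-city commuting = {GAMMA0 * D_LK:.2f}")
```

Under this assumption the moving cost comes out slightly above 3% of initial income in the small city, and 75% of the land between the two production points is occupied, in line with the description above.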
Starting off with a geographic distribution of population such that utility levels are the
same in both cities, we analyze what happens to the population share of the small city and to the
population share commuting to the large city as we lower γ from 0.25 all the way to zero. As can
be seen in Figure 2, when γ > γm, there is no inter-city moving or commuting. All residents of
the small city stay put, so that its population remains constant. Once γ drops below γm, some
residents of the small city move to the large city, and the population of the small city gradually
declines as the commuting cost continues to decrease. When γ falls below γmc, some residents
from the small city start to commute to the large city. As the commuting cost further drops, the
population of the small city slowly recovers, as some of the movers return and prefer to commute.
Once the commuting cost falls to γc, the small city reaches its original population level, with all of
its residents commuting to the large city.
Relation to Urban Shadows and Spillovers. What does Result 1 tell us about urban growth
shadows and urban growth spillovers? As the commuting cost drops, residents of the smaller city
move to the nearby larger city, and the smaller city loses population. The larger city casts an urban
growth shadow: the nearby smaller city suffers in terms of population growth. A further drop in
the commuting cost reverses this trend, as residents of the smaller city find it more attractive to
commute to the larger city than to move. The larger city no longer displays an urban growth
shadow, but exhibits urban spillovers instead: the nearby smaller city gains in terms of population
growth.
Relation to Empirical Stylized Facts. Our description of the evolution of commuting costs in
the U.S. between 1840 and 2017 suggests a slow decline in γ between 1840 and 1920, a rapid fall in
γ between 1920 and the turn of the 21st century, and a slowdown in the decrease in γ during the last
two decades. In light of Result 1, this would be consistent with an early time period where growth
shadows dominated and a later time period where growth spillovers dominated, with a weakening
in those spillovers in more recent times. This is consistent with our empirical findings for the U.S.,
as summarized in Stylized Fact 1 and Stylized Fact 2.
4.3 Geographic Span of Urban Shadows and Spillovers
We now explore how urban shadows and urban spillovers depend on the distance to the larger city.
The following result states that if inter-city distance increases, then all three threshold values of
the commuting cost are lower.
Result 2. Thresholds $\gamma_m$, $\gamma_{mc}$ and $\gamma_c$ are declining in $d_{\ell k}$. That is, if the distance to the larger
city increases, the shift from a staying equilibrium to an inter-city moving equilibrium, from an
inter-city moving equilibrium to an inter-city moving and commuting equilibrium, and from an
inter-city moving and commuting equilibrium to an inter-city commuting equilibrium, occurs for
lower values of the commuting cost γ.
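Proof. From the proof of Result 1, we can write $\gamma_m = (A_k - A_\ell - \mu d_{\ell k})/(A_k d_k^0 - A_\ell d_\ell^0)$. It is clear
that $d\gamma_m/dd_{\ell k} < 0$. From the same proof of Result 1, we can write
$\gamma_{mc} = \mu d_{\ell k}/(A_k(d_{\ell k} - d_k^0 - \min[m, d_\ell^0]))$, where $m$ can be written as
$(A_k - A_\ell - A_k\gamma_{mc}d_k^0 + A_\ell\gamma_{mc}d_\ell^0 - \mu d_{\ell k})/(\gamma_{mc}(A_\ell + A_k))$.
Together, this implies that
$$
\gamma_{mc} = \max\left[\frac{\mu d_{\ell k}}{A_k(d_{\ell k} - d_k^0 - d_\ell^0)},\ \frac{A_k(A_k - A_\ell) + A_\ell d_{\ell k}\mu}{A_k(A_k + A_\ell)d_{\ell k} - A_k A_\ell(d_k^0 + d_\ell^0)}\right].
$$
Here as well, it is immediate that $d\gamma_{mc}/dd_{\ell k} < 0$. Threshold $\gamma_c$ is reached
when $m$ in the above expression is equal to zero, so $\gamma_c = \mu d_{\ell k}/(A_k(d_{\ell k} - d_k^0))$. It is immediate that
$d\gamma_c/dd_{\ell k} < 0$.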
The above result says that when the larger city is geographically farther away, commuting costs
need to drop more before individuals from the smaller city want to move to the bigger city, and
they also need to drop more before they find it profitable to commute to the bigger city.
Figure 3: Inter-city Moving and Commuting Thresholds vs. Inter-City Distance
This figure illustrates Result 2 by showing how the three commuting cost thresholds (γm, γmc and γc) decrease as inter-city distance increases. This result implies that as commuting costs decline, urban shadows first expand in space as the moving equilibrium applies to increasingly farther-away locations, and then urban spillovers expand in space as the moving & commuting equilibrium applies to increasingly farther-away locations.
Figure 3 illustrates Result 2 with a simple numerical example. We use the same parameter
values as before, with the exception of inter-city distance $d_{\ell k}$, which we now vary from 1.0 to 2.5.29
For each level of inter-city distance, we plot the three threshold values. As can be seen, the larger
the inter-city distance, the more commuting costs need to drop before people start moving to the
larger city, and before they start commuting to the larger city.
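A rough numerical counterpart to this exercise can be sketched in Python for the two thresholds that have simple closed forms, $\gamma_m$ and $\gamma_c$. The sketch assumes that total population stays at 1.5 for every distance and equals $2(d_\ell^0 + d_k^0)$, so that the initial city sizes implied by equal utility at $\gamma = 0.25$ do not vary with $d_{\ell k}$; both points are interpretations, and the function names are hypothetical. It does not reproduce the figure itself.

```python
A_L, A_K, MU, GAMMA0, POPULATION = 1.0, 1.1, 0.03, 0.25, 1.5


def initial_edges():
    """Initial city sizes from equal utility at GAMMA0 (assumed convention)."""
    half = POPULATION / 2.0
    d0_k = (A_K - A_L + A_L * GAMMA0 * half) / (GAMMA0 * (A_L + A_K))
    return half - d0_k, d0_k


def thresholds(d_lk):
    """Return (gamma_m, gamma_c) for a given inter-city distance."""
    d0_l, d0_k = initial_edges()
    g_m = (A_K - A_L - MU * d_lk) / (A_K * d0_k - A_L * d0_l)
    g_c = MU * d_lk / (A_K * (d_lk - d0_k))
    return g_m, g_c


distances = (1.0, 1.5, 2.0, 2.5)
table = [(d, *thresholds(d)) for d in distances]
for d, g_m, g_c in table:
    print(f"d_lk = {d:.1f}   gamma_m = {g_m:.4f}   gamma_c = {g_c:.4f}")
```

In this sketch both thresholds fall as the larger city gets farther away. For instance, a commuting cost of 0.14 lies below $\gamma_m$ at a distance of 1.0 but above it at a distance of 2.0, so only the nearby location would see residents moving out.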
29To make the results comparable to the other figures, for each value of $d_{\ell k}$, population is allocated across cities such that utility equalizes for γ = 0.25.
Relation to Geographic Span of Shadows and Spillovers. What do these findings tell us
about the geographic span of urban shadows and urban spillovers? Using Figure 3, consider a
commuting cost that is relatively high, say, γ = 0.14. In that case, population growth in the
smaller city is relatively lower if the larger city is close-by than if it is farther away. We see urban
growth shadows at close distances, and no effect at further distances. As the commuting cost falls
from this relatively high level, the larger city’s urban growth shadow increases its geographic reach.
Once commuting costs become low enough, we see the emergence of urban spillovers. Consider, for
example, a commuting cost of γ = 0.07. In that case, population growth is relatively higher if the
larger city is in the vicinity, and it is relatively lower if the larger city is at a greater distance. That
is, urban spillovers dominate at short distances, and urban shadows dominate at farther distances.
As the commuting cost falls further, the spatial reach of urban spillovers increases.
Relation to Empirical Stylized Facts. Result 2 allows us to trace the changing geographic
reach of urban shadows and spillovers as commuting costs fall. Initially, it predicts urban shadows
at relatively short distances, that gradually expand as transport costs drop. Eventually, these
shadows are replaced by spillovers, again first at relatively short distances, but later at farther
away distances as the spatial reach of spillovers expands. This is consistent with our empirical
findings for the U.S., as summarized in Stylized Fact 3.
4.4 Relative Size of Large City
How do urban shadows and urban spillovers depend on the relative size of the large city? The
following result shows that the moving threshold is increasing in the relative size of the large city.
That is, commuting costs have to fall by less before a large city starts attracting the population of
its hinterland.
Result 3. Keeping population-weighted productivity unchanged, the threshold γm is increasing in
the relative size of the larger city. That is, the shift from a staying equilibrium to an inter-city
moving equilibrium occurs for a higher value of the commuting cost if the larger city has a bigger
relative size.
Proof. From the proof of Result 1, we can write
$$\gamma_m = \frac{A_k - A_\ell}{A_k d^0_k - A_\ell d^0_\ell} - \frac{\mu d_{\ell k}}{A_k d^0_k - A_\ell d^0_\ell}.$$
Our aim is to show that $\gamma_m$ is increasing in $d^0_k / d^0_\ell$. To do so, we consider the two terms in the $\gamma_m$ expression separately.
Because we start off in an equilibrium where $A_\ell (1 - \gamma_0 d^0_\ell) = A_k (1 - \gamma_0 d^0_k)$, where $\gamma_0$ is the initial value of $\gamma$, it follows that
$$\frac{A_k - A_\ell}{A_k d^0_k - A_\ell d^0_\ell} = \gamma_0.$$
Hence, the first term of the $\gamma_m$ expression above does not depend on the relative size $d^0_k / d^0_\ell$. This leaves us with the second term, $-\mu d_{\ell k} / (A_k d^0_k - A_\ell d^0_\ell)$. Because $A_\ell (1 - \gamma_0 d^0_\ell) = A_k (1 - \gamma_0 d^0_k)$, it follows that
$$\frac{A_k}{A_\ell} = \frac{1 - \gamma_0 d^0_\ell}{1 - \gamma_0 d^0_k}.$$
If $d^0_k / d^0_\ell$ increases, we know that $d^0_k$ increases and $d^0_\ell$ decreases, since $d^0_\ell + d^0_k$ is a constant. As a result, if $d^0_k / d^0_\ell$ increases, it follows that $A_k / A_\ell$ increases. Recall that we are keeping population-weighted productivity the same, so $A_k d^0_k + A_\ell d^0_\ell$ is a constant we denote by $\lambda$. Hence, $A_k d^0_k - A_\ell d^0_\ell = \lambda - 2 A_\ell d^0_\ell$. If the larger city becomes larger and its relative productivity increases and the overall productivity is unchanged, it must be that the productivity of the small city decreases. It hence follows that $A_k d^0_k - A_\ell d^0_\ell$ increases. This implies that the second term, $-\mu d_{\ell k} / (A_k d^0_k - A_\ell d^0_\ell)$, is increasing in $A_k / A_\ell$, so that $\gamma_m$ is increasing in $A_k / A_\ell$ and therefore in the relative size $d^0_k / d^0_\ell$.
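The monotonicity claimed in Result 3 can also be checked numerically. The Python sketch below is an illustration under stated assumptions, not the code behind Figure 4: it varies the large city's share of total city radius from 0.65 to 0.85 (the range used in the text), holds $\gamma_0 = 0.25$, and picks hypothetical values for total radius, $\lambda$, $\mu$ and $d_{\ell k}$. The function name `gamma_m` and its argument names are likewise illustrative. For each share it recovers $A_\ell$ and $A_k$ from the initial equal-utility condition and the constant population-weighted productivity, then evaluates the threshold.

```python
def gamma_m(share_k, total_d=1.5, gamma0=0.25, lam=1.0, mu=0.005, d_lk=2.0):
    """Moving threshold for a given relative size of the large city.

    total_d, lam, mu and d_lk are hypothetical illustration values.
    """
    d0_k = share_k * total_d          # initial radius of the large city
    d0_l = total_d - d0_k             # initial radius of the small city
    # Initial utility equalization: A_l (1 - g0 d0_l) = A_k (1 - g0 d0_k)
    ratio = (1 - gamma0 * d0_l) / (1 - gamma0 * d0_k)   # A_k / A_l
    # Constant population-weighted productivity: A_k d0_k + A_l d0_l = lam
    A_l = lam / (ratio * d0_k + d0_l)
    A_k = ratio * A_l
    return gamma0 - mu * d_lk / (A_k * d0_k - A_l * d0_l)


shares = [0.65, 0.70, 0.75, 0.80, 0.85]
thresholds = [gamma_m(s) for s in shares]
for s, g in zip(shares, thresholds):
    print(f"relative size {s:.2f}: gamma_m = {g:.4f}")
```

Under these assumed values the threshold rises with the relative size of the large city and always stays below $\gamma_0$, because the second term of the $\gamma_m$ expression is negative.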
The above result shows that larger cities exert a stronger gravitational pull on their hinterland, as
they start casting their urban shadows at higher levels of commuting costs. Figure 4 illustrates this
result, and further shows that the same applies to urban spillovers. We use the same parameter
values as before, with the exception of the productivity Ak which we now vary in order for the
city size to change. As the relative size of k increases from 0.65 to 0.85, all three threshold values
increase.
Figure 4: Inter-city Moving and Commuting Thresholds vs. Relative Size of Large City
This figure illustrates Result 3 by showing how the three commuting cost thresholds ($\gamma_m$, $\gamma_{mc}$ and $\gamma_c$) increase as the relative size of the large city increases. This result implies that as commuting costs decline, it is the largest cities that first cast their growth shadow on their smaller neighbors, and likewise, it is the largest cities that first spread their growth spillovers to their smaller neighbors.
Relation to Empirical Stylized Facts. Result 3 implies that as commuting costs decline, it is
the largest cities that first cast their urban growth shadow on their smaller neighbors, and likewise,
it is the largest cities that first spread their urban growth spillovers to their smaller neighbors.
The stronger urban shadows and urban spillovers of larger cities are consistent with our empirical
findings for the U.S., as summarized in Stylized Fact 4.
4.5 Alternative Interpretations
Commuting costs are central to our conceptual framework. Their evolution affects whether urban
growth shadows or urban growth spillovers dominate, and they are also key in determining the
geographic reach of these shadows and spillovers. Indeed, by focusing on the evolution of just this
variable, our conceptual framework is able to account for the main stylized facts we identified when
empirically studying growth shadows and spillovers in the U.S. over the period 1840 to 2017.
One potential issue with our interpretation is that in some of the later time periods, after
1980, the geographic span of urban spillovers reached 200km. At face value this seems well beyond
standard inter-city commuting distances, so one could doubt whether in this most recent time period
the conceptual framework captures the essence of what we observe in the data. There are at least
three reasons why our interpretation may still hold. First, although between 1980 and 2017 we find
evidence of urban spillovers having a large geographic reach, those effects dissipate with distance.
For example, correlations between 150km and 200km are one-half to one-quarter their magnitudes
between 1km and 100km. Second, the empirical correlations should always be interpreted relative
to the excluded category (e.g., locations that have no large neighbors within 300km). If in recent
time periods geographically isolated locations have been experiencing particularly low growth, this
pushes up the relative growth rate of all other locations, including those that are, say, 200km
away from large neighbors. Third, although a distance of 200km between the centroid of a rural
county and the centroid of a large metro area may be beyond standard commuting distances, the
distance between that same county and the edge of a large metro area may very well still be within
reasonable commuting time.
That being said, an alternative is that inter-city commuting costs are not the driving force
behind what we observe in the data. As mentioned before, our conceptual framework is equivalent
to one with intra-city (but no inter-city) commuting costs, and with technological spillovers that
decay with distance. If in this alternative model technological spillovers decrease by a share γ per
unit of distance in exactly the same way as the hours worked of a commuter decrease by a share γ
per unit of distance, then both models are observationally equivalent. Workers of the smaller city
have access to the same discounted productivity of their larger neighbor: either by benefitting from
technological spillovers from the larger city or by commuting to the larger city.
In addition to inter-city commuting costs and spatial technological spillovers, a third force
that may contribute to the attractiveness of the larger city is trade and market access. In the
Appendix we consider an alternative model that incorporates trade between both cities. Once
again, we switch off the possibility of inter-city commuting and only allow for intra-city commuting.
Because intra-city commuting is more costly in the large than in the small city, falling transport
costs make it relatively more attractive to move to the large city: an urban growth shadow. At
the same time, falling transport costs make it easier to trade with the large city, thus increasing
the incentive to reside and produce in the nearby small city: an urban growth spillover. In the
Appendix we use simulations to show that such an alternative model can generate similar dynamics
to the ones we observe in the U.S. data between 1840 and 2017.
5 Concluding Remarks
In this paper we have analyzed whether a location’s growth benefits or suffers from being geograph-
ically close to a large urban center. To do so, we have focused on U.S. counties and metro areas
over the time period 1840-2017. We have found evidence of urban shadows between 1840 and 1920
and of urban spillovers between 1920 and 2017. Proximity to large urban clusters was negatively
correlated with a location’s growth in the early time period, and positively correlated in the later
time period, albeit with some weakening of this positive correlation in the last decades.
The conceptual framework we have developed suggests that as the cost of commuting drops,
individuals first have an incentive to move from smaller close-by cities to larger urban centers. Later,
if commuting costs continue to fall, individuals prefer to commute, rather than to move, from the
smaller to the larger cities. This implies that falling transport costs first hurt, and then help, the
growth of smaller locations in the vicinity of large urban centers. After documenting the long-run
evolution of commuting costs, we have shown that our framework is consistent with the empirical
evidence.
As such, a single variable — commuting costs — is able to capture the growth patterns of
small cities in the hinterland of large urban clusters over the time period stretching from 1840 to
2017. Other factors are of course likely to have contributed to these spatial growth patterns. In
this context, we have discussed the role of the spatial diffusion of technology, as well as the possible
importance of market access and trade.
References
[1] Ahlfeldt, G. M., S. J. Redding, D. M. Sturm, and N. Wolf (2015). “The Economics of Density:
Evidence from the Berlin Wall,” Econometrica, 83, 2127-2189.
[2] Allen, T., and C. Arkolakis (2014). “Trade and the Topography of the Spatial Economy,”
Quarterly Journal of Economics, 129, 1085-1140.
[3] Baum-Snow, N. (2007). “Did Highways Cause Suburbanization?,” Quarterly Journal of Eco-
nomics, 122, 775-805.
[4] Beltran, F. J., A. Díez-Minguela, and J. Martínez-Galarraga (2017). “The Shadow of Cities:
Size, Location, and the Spatial Distribution of Population in Spain,” Cambridge Working
Paper Economics 1749.
[5] Bernard, A. (2014). Lifted: A Cultural History of the Elevator, New York: NYU Press.
[6] Bernhofen, D. M., El-Sahli, Z., and R. Kneller (2016). “Estimating the Effects of the Container
Revolution on World Trade,” Journal of International Economics, 98, 36-50.
[7] Black, D., and V. Henderson (2003). “Urban Evolution in the USA,” Journal of Economic
Geography, 3, 343-372.
[8] Bosker, M., and E. Buringh (2017). “City Seeds: Geography and the Origins of European
Cities,” Journal of Urban Economics, 98, 139-157.
[9] Christaller, W. (1933). Central Places in Southern Germany. Jena, Germany: Fischer (English
translation by C. W. Baskin, London: Prentice Hall, 1966).
[10] Conley, T. (1999). “GMM Estimation with Cross Sectional Dependence,” Journal of Econo-
metrics, 92, 1-45.
[11] Cronon, W. (1991). Nature’s Metropolis: Chicago and the Great West, New York: W.W.
Norton.
[12] Davis, D. R., and D. E. Weinstein (2002). “Bones, Bombs, and Break Points: The Geography
of Economic Activity,” American Economic Review, 92, 1269-1289.
[13] Desmet, K., D. Nagy, and E. Rossi-Hansberg (2018). “The Geography of Development,” Jour-
nal of Political Economy, 126, 903-983.
[14] Desmet, K., and J. Rappaport (2017). “The Settlement of the United States, 1800-2000: The
Long Transition towards Gibrat’s Law,” Journal of Urban Economics, 98, 50-68.
[15] Desmet, K., and E. Rossi-Hansberg (2014). “Spatial Development,” American Economic Re-
view, 104, 1211-1243.
[16] Dobkins, L. H., and Y. Ioannides (2001). “Spatial Interactions among U.S. Cities,” Regional
Science and Urban Economics, 31, 701-731.
[17] Donaldson, D., and R. Hornbeck (2016). “Railroads and American Economic Growth: A
Market Access Approach,” Quarterly Journal of Economics, 131, 799-858.
[18] Edlund, L., C. Machado, and M. Sviatschi (2016). “Bright Minds, Big Rent: Gentrification
and the Rising Returns to Skill,” NBER Working Paper # 21729.
[19] Fogel, R. W. (1964). Railroads and American Economic Growth: Essays in Econometric His-
tory, Baltimore, MD: Johns Hopkins University Press.
[20] Fujita, M., P. Krugman, and A. J. Venables (1999). The Spatial Economy, Cambridge, MA:
MIT Press.
[21] Gardner, T. (1999). “Metropolitan Classification for Census Years before World War II,”
Historical Methods, 32, 139-150.
[22] Glaeser, E. L. (2011). Triumph of the City, London: MacMillan.
[23] Glaeser, E. L., and M. Kahn (2004). “Sprawl and Urban Growth,” In: J. V. Henderson and
J.-F. Thisse (eds.), Handbook of Regional and Urban Economics, Vol. 4, Elsevier.
[24] Glaeser, E. L., and M. Kohlhase (2004). “Cities, Regions and the Decline of Transport Costs,”
Papers in Regional Science, 83, 197-228.
[25] Hanson, G. (2005). “Market Potential, Increasing Returns and Geographic Concentration,”
Journal of International Economics, 67, 1-24.
[26] Horan, P. M., and P. G. Hargis (1995). “County Longitudinal Template, 1840-1990.” [com-
puter file]. ICPSR Study 6576. Inter-university Consortium for Political and Social Research
[distributor]. Corrected and amended by Patricia E. Beeson and David N. DeJong, Department
of Economics, University of Pittsburgh, 2001. Corrected and amended by Jordan Rappaport,
Federal Reserve Bank of Kansas City, 2010.
[27] Jackson, K. T. (1985). Crabgrass Frontier. The Suburbanization of the United States. Oxford
University Press.
[28] Kopecky, K., and M. H. Suen (2004). “Economie d’Avant Garde: Suburbanization and the
Automobile.” Research Report No. 6.
[29] Krugman, P. (1993). “On the Number and Location of Cities,” European Economic Review,
37, 293-298.
[30] LeRoy, S. F., and J. Sonstelie (1983). “Paradise Lost and Regained: Transportation Innovation,
Income, and Residential Location,” Journal of Urban Economics, 13, 67-89.
[31] Liu, Y., X. Wang, and J. Wu (2011). “Do Bigger Cities Contribute to Economic Growth in
Surrounding Areas? Evidence from County-Level Data in China,” unpublished manuscript.
[32] Lösch, A. (1940). The Economics of Location, Jena, Germany: Fischer (English translation,
New Haven, CT: Yale University Press, 1954).
[33] Michaels, G., F. Rauch, and S. J. Redding (2012). “Urbanization and Structural Transforma-
tion,” Quarterly Journal of Economics, 127, 535-586.
[34] Mieszkowski, P., and E. S. Mills (1993). “The Causes of Metropolitan Suburbanization,” Jour-
nal of Economic Perspectives, 7, 135-147.
[35] Partridge M. D., D. S. Rickman, K. Ali, and M. R. Olfert (2009). “Do New Economic Ge-
ography Agglomeration Shadows Underlie Current Population Dynamics across the Urban
Hierarchy?,” Papers in Regional Science, 88, 445-466.
[36] Rappaport, J. (2005). “The Shared Fortunes of Cities and Suburbs,” Federal Reserve Bank of
Kansas City Economic Review, Third Quarter, 33-59.
[37] Rappaport, J. (2007). “Moving to Nice Weather,” Regional Science and Urban Economics, 37,
375-398.
[38] Rappaport, J., and J. D. Sachs (2003). “The United States as a Coastal Nation,” Journal of
Economic Growth, 8, 5-46.
[39] Rauch, F. (2014). “Cities as Spatial Clusters,” Journal of Economic Geography, 14, 759-773.
[40] Redding, S., and D. Sturm (2008). “The Costs of Remoteness: Evidence from German Division
and Reunification,” American Economic Review, 98, 1766-1797.
[41] Rosenthal, S. S., and W. C. Strange (2003). “Geography, Industrial Organization, and
Agglomeration,” Review of Economics and Statistics, 85, 377-393.
[42] Shaw, R. E. (1990). Canals for a Nation. The Canal Era in the United States, 1790-1860, The
University Press of Kentucky.
[43] Su, Y. (2018). “The Rising Value of Time and the Origin of Urban Gentrification,” unpublished
manuscript.
[44] Tabuchi, T., and J-F. Thisse (2011). “A New Economic Geography Model of Central Places,”
Journal of Urban Economics, 69, 240-252.
[45] Thorndale, W., and W. Dollarhide (1987). Map Guide to the Federal Censuses, 1790-1920,
Genealogical Publishing Company, Baltimore.
[46] Warner, S. B. Jr. (1972). Streetcar Suburbs. The Process of Growth in Boston, 1870-1900,
Harvard University Press and the MIT Press.
[47] Young, J. (2015). “Infrastructure: Mass Transit in 19th- and 20th-Century Urban America.”
Oxford Research Encyclopedia of American History. Online Publication Date: March 2015.
A An Alternative Model with Trade and Market Access
In this Appendix we propose an alternative model without inter-city commuting but with inter-city
trade, and show the existence of the same fundamental tradeoff between urban shadows and urban
spillovers.
Endowments. The economy consists of a continuum of points on a line. The density of land
at all points of the line is one. There are L individuals, each residing on one unit of land. Each
resident has one unit of time, which she divides between work and commuting. On the line there
are two exogenously given production points, indexed by $\ell$ or $k$. The set of individuals living
closer to production point $\ell$ than to the other production point comprises city $\ell$. Whereas the total
population, $L$, is exogenous, the populations of the two cities, $L_\ell$ and $L_k$, are endogenous. The land
rent in city $\ell$ at distance $d_\ell$ from production point $\ell$ is denoted by $r_\ell(d_\ell)$. The distance between
production points $\ell$ and $k$, denoted by $d_{\ell k}$, is big enough so that there is at least some empty land
between the two cities. Land is owned by absentee landlords.
Technology and preferences. Each city produces a different good, firms are competitive, and
labor is the only factor of production. Technology is linear, with one unit of labor producing $A_\ell$
units of the good at production point $\ell$ and $A_k$ units of the good at production point $k$.
To produce, an individual needs to commute to the production point of the city where she
resides. In contrast to the model in the main paper, there is no inter-city commuting. The time
cost of intra-city commuting per unit of distance is $\gamma$. Hence, an individual who resides in city $\ell$
at a distance $d_\ell$ from production point $\ell$ supplies $1 - \gamma d_\ell$ units of labor, and produces
$A_\ell (1 - \gamma d_\ell)$ units of her city's good. Her wage income, $w_\ell(d_\ell)$, is therefore
$p_\ell A_\ell (1 - \gamma d_\ell)$, where $p_\ell$ is the free-on-board (f.o.b.) price of the good produced in city $\ell$.
Her income net of land rents paid, $y_\ell(d_\ell)$, is $w_\ell(d_\ell) - r_\ell(d_\ell)$. Land rents are paid in terms
of the local good to the absentee landlords, and then disappear from the economy. When a good of
city $\ell$ is shipped to city $k$, a share $\gamma'$ is lost per unit of distance, so $1 - \gamma' d_{\ell k}$ units arrive.
Hence, the price of good $\ell$ in city $k$ is $p_\ell / (1 - \gamma' d_{\ell k})$.
People can freely choose where to reside in their city. This implies that $y_\ell$ equalizes across
all locations within a city. At the edge of city $\ell$, land rents are zero, so $r_\ell(\bar d_\ell) = 0$, where $\bar d_\ell$
refers to the distance between the city center and the city edge. Hence, for all residents of city $\ell$,
income net of land rents is
$$y_\ell = p_\ell A_\ell (1 - \gamma \bar d_\ell). \tag{A.1}$$
To move to another city, an individual has to pay a utility cost $\mu d_{\ell k}$ (see footnote 30). We assume
that a return migrant does not pay a moving cost. That is, if an individual who moved from city $\ell$ to
city $k$ returns to her hometown, she does not pay a moving cost.
30 This introduces a utility difference between the original residents of a city and the immigrant residents from the other city. However, it does not lead to a difference in their income net of land rents, so that (A.1) applies to both the original residents and the immigrant residents.
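Agents have CES preferences over the two different goods. The utility of an individual
originally from city $k$ and residing in city $\ell$ can then be defined as
$$u^k_\ell = \left( c_{\ell\ell}^{\frac{\sigma-1}{\sigma}} + c_{\ell k}^{\frac{\sigma-1}{\sigma}} \right)^{\frac{\sigma}{\sigma-1}} - I^k_\ell \, \mu d_{\ell k} \tag{A.2}$$
with
$$I^k_\ell = \begin{cases} 1 & \text{if } k \neq \ell \\ 0 & \text{otherwise} \end{cases}$$
where $c_{\ell\ell}$ and $c_{\ell k}$ denote the consumption of good $\ell$ and $k$ by a resident of city $\ell$, $\sigma > 1$ is the
elasticity of substitution between both goods, and $I^k_\ell$ is an indicator value equal to zero if the
individual is an original resident of $\ell$ and equal to one if she is an immigrant from the other city.
Aggregate production. In city $\ell$, total production is
$$\int_0^{\bar d_\ell} 2 A_\ell (1 - \gamma d_\ell)\, d d_\ell = 2 A_\ell \bar d_\ell \left(1 - \tfrac{1}{2} \gamma \bar d_\ell\right),$$
where $\bar d_\ell = L_\ell / 2$ is the distance from the city center to the city edge. A part of total production is
paid out to the absentee land owners, and disappears from the economy. Net of the payments to land
owners, each individual in $\ell$ produces $A_\ell (1 - \gamma \bar d_\ell)$. As a result, total production net of the
payouts to land owners is
$$Q_\ell = 2 A_\ell \bar d_\ell (1 - \gamma \bar d_\ell) = A_\ell L_\ell (1 - \gamma \bar d_\ell). \tag{A.3}$$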
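Aggregate consumption. When solving the utility maximization problem, we can separate the
consumption decision and the residential decision. We start by describing the consumption decision.
An agent who resides in $\ell$ maximizes (A.2) subject to
$$p_\ell A_\ell (1 - \gamma \bar d_\ell) = p_\ell c_{\ell\ell} + \frac{p_k}{1 - \gamma' d_{\ell k}}\, c_{\ell k}. \tag{A.4}$$
The first order conditions yield the following demand for each one of the two goods:
$$c_{\ell\ell} = \frac{y_\ell \, p_\ell^{-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}, \qquad
c_{\ell k} = \frac{y_\ell \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}. \tag{A.5}$$
Aggregate demand for goods produced in location $\ell$ is:
$$C_\ell = \frac{y_\ell L_\ell \, p_\ell^{-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}
+ \frac{y_k L_k \left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{-\sigma}}{\left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}}. \tag{A.6}$$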
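As a quick numerical check of the demand functions in (A.5), the Python sketch below evaluates them for one resident of city $\ell$ and confirms two standard CES properties: spending exhausts income, and the utility from consumption equals income divided by the CES price index. The function name `ces_demands` and all input values (income, prices, trade cost, distance) are illustrative assumptions, not values taken from the model's calibration.

```python
def ces_demands(y, p_own, p_other_fob, trade_cost, distance, sigma):
    """Demands of (A.5) for the home good and the imported good."""
    # Delivered price of the imported good: f.o.b. price divided by the
    # share of each shipped unit that arrives.
    p_other = p_other_fob / (1.0 - trade_cost * distance)
    denom = p_own ** (1.0 - sigma) + p_other ** (1.0 - sigma)
    c_own = y * p_own ** (-sigma) / denom
    c_other = y * p_other ** (-sigma) / denom
    return c_own, c_other, p_other


# Hypothetical inputs, chosen only for illustration.
sigma = 3.0
y_l, p_l, p_k = 0.8, 1.0, 0.9
c_ll, c_lk, p_k_delivered = ces_demands(
    y_l, p_l, p_k, trade_cost=0.25, distance=3.0, sigma=sigma
)

# Budget check: expenditure at delivered prices should equal income.
spending = p_l * c_ll + p_k_delivered * c_lk

# Utility from consumption (the CES aggregator in (A.2), without moving cost).
rho = (sigma - 1.0) / sigma
utility = (c_ll ** rho + c_lk ** rho) ** (1.0 / rho)

# CES price index faced by a resident of city l.
price_index = (p_l ** (1.0 - sigma) + p_k_delivered ** (1.0 - sigma)) ** (1.0 / (1.0 - sigma))

print(f"spending = {spending:.6f} (income = {y_l})")
print(f"utility = {utility:.6f}, income / price index = {y_l / price_index:.6f}")
```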
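Residential choice. An individual who originally resides in city $\ell$ has a choice to stay in city $\ell$
or to move to city $k$. His indirect utility if he stays in city $\ell$ is:
$$u^\ell_\ell = \frac{y_\ell}{\left( p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} \right)^{\frac{1}{1-\sigma}}}
= \frac{p_\ell A_\ell (1 - \gamma \bar d_\ell)}{\left( p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} \right)^{\frac{1}{1-\sigma}}} \tag{A.7}$$
whereas his indirect utility if he moves to $k$ is
$$u^\ell_k = \frac{y_k}{\left( \left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma} \right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}
= \frac{p_k A_k (1 - \gamma \bar d_k)}{\left( \left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma} \right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}. \tag{A.8}$$
Note that $\bar d_\ell = L/2 - \bar d_k$, where $\bar d_\ell$ and $\bar d_k$ denote the distances from each city's center to its edge. Denote by $\hat d_\ell$ the value of $\bar d_\ell$ that equalizes $u^\ell_\ell$ and $u^\ell_k$. That is,
$$\frac{p_\ell A_\ell (1 - \gamma \hat d_\ell)}{\left( p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} \right)^{\frac{1}{1-\sigma}}}
= \frac{p_k A_k \left(1 - \gamma \left(\frac{L}{2} - \hat d_\ell\right)\right)}{\left( \left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma} \right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}. \tag{A.9}$$
By analogy, denote by $\hat d_k$ the value of $\bar d_k$ that equalizes $u^k_k$ and $u^k_\ell$. That is,
$$\frac{p_k A_k (1 - \gamma \hat d_k)}{\left( \left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma} \right)^{\frac{1}{1-\sigma}}}
= \frac{p_\ell A_\ell \left(1 - \gamma \left(\frac{L}{2} - \hat d_k\right)\right)}{\left( p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} \right)^{\frac{1}{1-\sigma}}} - \mu d_{\ell k}. \tag{A.10}$$
The original distribution of population has $2 d^0_\ell$ individuals living in $\ell$ and $2 d^0_k$ individuals living in
$k$, where $2 d^0_\ell + 2 d^0_k = L$. If $\hat d_\ell < d^0_\ell$, then $2(d^0_\ell - \hat d_\ell)$ people move from $\ell$ to $k$. If $\hat d_k < d^0_k$, then
$2(d^0_k - \hat d_k)$ people move from $k$ to $\ell$. If neither $\hat d_\ell < d^0_\ell$ nor $\hat d_k < d^0_k$, then no one moves
and everyone lives in their original location of residence.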
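Equilibrium. For given parameter values $L$, $A_\ell$, $A_k$, $d_{\ell k}$, $\mu$, $\gamma$, $\gamma'$ and $\sigma$, and for a given initial
distribution of individuals across cities, $d^0_\ell$ and $d^0_k$, an equilibrium is a collection of variables
$\{p_\ell, p_k, L_\ell, L_k, \bar d_\ell, \bar d_k, \hat d_\ell, \hat d_k\}$ that satisfy conditions (A.1), (A.9), (A.10) and:
1. Goods market clearing:
$$\begin{aligned}
L_\ell p_\ell A_\ell (1 - \gamma \bar d_\ell) &= \frac{y_\ell L_\ell \, p_\ell^{1-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}
+ \frac{y_k L_k \left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}{\left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}} \\
L_k p_k A_k (1 - \gamma \bar d_k) &= \frac{y_\ell L_\ell \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}{p_\ell^{1-\sigma} + \left(\frac{p_k}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma}}
+ \frac{y_k L_k \, p_k^{1-\sigma}}{\left(\frac{p_\ell}{1 - \gamma' d_{\ell k}}\right)^{1-\sigma} + p_k^{1-\sigma}}
\end{aligned} \tag{A.11}$$
2. Labor market clearing:
$$L = L_\ell + L_k \tag{A.12}$$
3. Land market clearing:
$$L_\ell = 2 \bar d_\ell, \qquad L_k = 2 \bar d_k \tag{A.13}$$
4. Labor mobility:
$$\bar d_\ell = \begin{cases}
\hat d_\ell & \text{if } \hat d_\ell < d^0_\ell \\
d^0_\ell + (d^0_k - \hat d_k) & \text{if } \hat d_k < d^0_k \\
d^0_\ell & \text{otherwise}
\end{cases} \tag{A.14}$$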
Numerical example. We illustrate our model with a numerical example. We make the large
city 50% more productive than the small city: $A_\ell = 1.0$ and $A_k = 1.5$. The elasticity of substitution
between both goods is $\sigma = 3$. The total population is $L = 6$, and the inter-city distance is $d_{\ell k} = 3$. The
moving cost parameter is set to $\mu = 0.001$. Given the inter-city distance and the initial utility in
both cities, this amounts to a little more than 0.3% in terms of utility. For the initial commuting
cost and trade cost parameters, we choose $\gamma = 0.25$ and $\gamma' = 0.25$. Using these initial parameters,
we distribute population between the two cities to equalize utility.
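The initial allocation described above can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the code used for the paper's simulations: it normalizes $p_\ell = 1$, finds the relative price $p_k$ that balances trade between the two cities (equivalent to goods market clearing by Walras' law), and then searches by bisection for the city edge of $k$ at which utilities are equal. Since this is the initial allocation, nobody has moved and the moving cost $\mu$ plays no role. Function and variable names (`bisect`, `price_of_k`, `utilities`, `dbar_k`) are hypothetical.

```python
import math

# Parameter values from the numerical example in the text.
A_l, A_k = 1.0, 1.5            # productivity of small (l) and large (k) city
sigma = 3.0                    # elasticity of substitution
L = 6.0                        # total population
d_lk = 3.0                     # inter-city distance
gamma, gamma_p = 0.25, 0.25    # commuting cost and trade cost
tau = 1.0 / (1.0 - gamma_p * d_lk)   # delivered price per unit of f.o.b. price


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def price_of_k(dbar_l, dbar_k):
    """Relative price p_k (with p_l = 1, an assumed normalization)."""
    Q_l = A_l * 2 * dbar_l * (1 - gamma * dbar_l)   # net output of l
    Q_k = A_k * 2 * dbar_k * (1 - gamma * dbar_k)   # net output of k

    def trade_gap(log_p):
        p_k = math.exp(log_p)
        # Expenditure share of l-residents on good k, and of k-residents on good l.
        share_lk = (tau * p_k) ** (1 - sigma) / (1 + (tau * p_k) ** (1 - sigma))
        share_kl = tau ** (1 - sigma) / (tau ** (1 - sigma) + p_k ** (1 - sigma))
        return p_k * Q_k * share_kl - Q_l * share_lk

    return math.exp(bisect(trade_gap, math.log(1e-6), math.log(1e6)))


def utilities(dbar_k):
    """Indirect utilities (income over CES price index) in both cities."""
    dbar_l = L / 2 - dbar_k
    p_k = price_of_k(dbar_l, dbar_k)
    P_l = (1 + (tau * p_k) ** (1 - sigma)) ** (1 / (1 - sigma))
    P_k = (tau ** (1 - sigma) + p_k ** (1 - sigma)) ** (1 / (1 - sigma))
    u_l = A_l * (1 - gamma * dbar_l) / P_l
    u_k = p_k * A_k * (1 - gamma * dbar_k) / P_k
    return u_l, u_k, p_k


# Search for the city edge of k at which residents are indifferent.
dbar_k = bisect(lambda d: utilities(d)[1] - utilities(d)[0], 0.05, L / 2 - 0.05)
dbar_l = L / 2 - dbar_k
u_l, u_k, p_k = utilities(dbar_k)
print(f"city edges: dbar_l = {dbar_l:.4f}, dbar_k = {dbar_k:.4f}, p_k/p_l = {p_k:.4f}")
print(f"utilities: u_l = {u_l:.6f}, u_k = {u_k:.6f}")
```

Solving for the price inside the population search keeps each step one-dimensional, so simple bisection suffices. Extending the sketch to the comparative statics would additionally require the mobility rule (A.14) with the moving cost $\mu$.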
Figure 5: Inter-city Moving and Commuting Thresholds vs. Inter-City Distance
We then do comparative statics by simultaneously lowering commuting costs, γ, and trade
costs, γ′. On the one hand, a drop in commuting costs makes it more attractive to live in the larger,
more productive city than in the smaller, less productive city. This occurs because it reduces the
disadvantage of a longer within-city commute in the larger city. This force makes the smaller city
lose population: an urban growth shadow. On the other hand, a drop in trade costs improves
market access for the smaller city. This force makes the smaller city gain population: an urban
growth spillover. Depending on which force dominates, the smaller city loses or gains population.
Figure 5 depicts an example where commuting costs (represented on the bottom horizontal
axis) decline at a slower pace than transport costs (represented on the top horizontal axis). In
particular, commuting costs decline from 0.25 to 0.075, whereas transport costs decline from 0.25
to 0.00. The relatively slower decline in commuting costs is consistent with the view of Glaeser
and Kohlhase (2004) that in the twenty-first century U.S. “it is essentially free to move goods,
but expensive to move people”. Moving from right to left on the horizontal axis, we see that this
decline first lowers the population of the smaller city (the urban shadow effect dominates) and then
it increases the population of the smaller city (the urban spillover dominates). This echoes Result
1 in the model of the paper.