Attenuation of Agglomeration Economies: Evidence from the
Universe of Chinese Manufacturing Firms∗
Jing Li^a†, Liyao Li^a‡, Shimeng Liu^b§
^a School of Economics, Singapore Management University, Singapore
^b Institute for Economic and Social Research, Jinan University, China
April 27, 2020
Abstract
This paper examines the industry-specific attenuation speed of agglomeration economies
and its interplay with the large presence of state-owned enterprises in China. We achieve
this focus by taking advantage of unique geocoded administrative data on the universe
of Chinese manufacturing firms. The full-spectrum analysis also allows us to assess the
goodness of fit of various spatial decay functional forms and to systematically evaluate the
micro-foundations that govern the decay patterns across industry types. We obtain three
main findings. First, agglomeration economies attenuate sharply with spatial distance in
China, with large heterogeneity in the attenuation speed across ownership and industry
types. Second, the spatial decay speed is positively linked with proxies for knowledge
spillovers and labour market pooling but negatively linked with proxies for input sharing
and the share of the state sector. Last, the inverse square distance decay function presents
the best goodness of fit among the tested functional forms.
Key Words: agglomeration economies; attenuation speed; China; state-owned enterprise (SOE)
JEL classifications: R1; R3; L3; L6
∗ We thank Costas Arkolakis, Nate Baum-Snow, Stuart Rosenthal, William Strange, Nathan Schiff, Matthew Turner, Zhi Wang, Junfu Zhang, seminar participants at Singapore Management University, and participants of the 2019 China Meeting of the Econometric Society, 2019 Yangtze River Delta International Forum, 2019 SMU Conference on Urban and Regional Economics, for their helpful comments. Jing Li gratefully acknowledges the financial support from Singapore Management University under the Lee Kong Chian Fellowship. Shimeng Liu gratefully acknowledges support from the National Natural Science Foundation of China (71703059). Remaining errors are our own.
† 90 Stamford Road, Singapore 178903. Phone: +65-6808-5454. E-mail: [email protected].
‡ 90 Stamford Road, Singapore 178903. E-mail: [email protected].
§ 601 W. Huangpu Ave., Tianhe District, Guangzhou, China 510632. Phone: +86-13403024213. E-mail:
1. Introduction
How do agglomeration economies attenuate spatially? This question is important because
the answer sheds light on the specific force of agglomeration economies that drives economic
concentrations and urban growth (Rosenthal and Strange, 2019). A growing literature ex-
amines the attenuation of agglomeration economies empirically and reveals striking decay
patterns that vary across industry types (Rosenthal and Strange, 2003, 2005, 2008; Fu, 2007;
Arzaghi and Henderson, 2008; Li, 2014). However, existing studies are insufficient to pro-
vide systematic evidence for the whole economy or to offer generic guidance on theoretical
models.1 Moreover, those studies are all implemented in developed countries (mostly in the
United States), despite a strong desire for understanding the economic fundamentals that
drive rapid urbanization in the developing world.
This paper takes an integrated approach to examine the spatial attenuation of agglomer-
ation economies and its interplay with the large presence of state-owned enterprises (SOEs)
in China. Specifically, we take advantage of unique geocoded administrative data on the
universe of Chinese manufacturing firms to embark on an extensive set of empirical exercises.
We start by providing the first comprehensive estimation of industry-specific attenuation
speed for the entire manufacturing sector. The nuanced full-spectrum analysis allows us, also
for the first time in the literature, to test for the goodness of fit of various spatial decay
functional forms.2 Next, using the most desirable functional form, we empirically verify the
theoretical insight that links the spatial decay speed with the micro-foundation of agglomer-
ation economies.3 Last, we systematically evaluate the role of SOEs in shaping the spatial
1 One major limitation is that the studies only focus on a small set of industries. Examples include Rosenthal and Strange (2003) on the software, food processing, apparel, printing and publishing, fabricated metal, and machinery industries; Arzaghi and Henderson (2008) on the advertising industry; and Li (2014) on the health care industry.
2 Previous studies that model the spatial decay of productivity spillovers assume certain functional forms, such as the inverse exponential distance decay function in Lucas and Rossi-Hansberg (2002). However, thus far, no statistical evidence exists to validate this assumption. By specifying rich distance-specific concentric ring variables as measures of agglomeration for each industry, we generate sufficient variation in the data to estimate and compare the goodness of fit of various decay functional forms.
3 Theory suggests that the spatial decay speed of agglomeration spillovers depends on which underlying agglomeration forces are in action (Rosenthal and Strange, 2004; Combes and Gobillon, 2015). For instance, industries that heavily rely on knowledge spillovers as the main agglomeration force often require close-range, face-to-face contact, which implies a rapid spatial decay of agglomeration spillovers. Industries that cluster mainly because of input-output links could have agglomeration externalities decay slowly and extend to a
manifestation of agglomeration economies.
Such a comprehensive analysis is essential to gaining a deep understanding of the nature
of agglomeration economies. The mechanism of agglomeration economies is widely used to
explain the formation of cities and the rapid economic growth taking place within cities
(Marshall, 1920; Glaeser et al., 1992; Ellison and Glaeser, 1999; Ellison et al., 2010). Less
is understood about the empirical underpinnings. The spatial attenuation of agglomeration
economies is an important empirical regularity that sheds light on the micro-foundations.
Therefore, a full-scope micro-level analysis across all industries helps reveal the specific micro-
foundations in each industry and the economic fundamentals that contribute to the spatial
structure of the macro-economy. The estimation of the attenuation speed and evaluation of
the spatial decay functional forms, empowered by the comprehensive analysis, provide useful
empirical guidance on calibrating theoretical work that explicitly models firm productivity
spillovers (Lucas and Rossi-Hansberg, 2002; Ahlfeldt et al., 2015).4
The scope and context of our paper also help to understand the specific economic funda-
mentals that drive urban growth in China and other developing countries. China has achieved
miraculous economic growth in the past few decades, exhibited mostly in its dramatic boom
in the manufacturing sector.5 While much has been proposed as the potential drivers, such
as improved labour mobility and rapid build-up of the transportation infrastructure, less is
understood about how different manufacturing industries in China benefit from various improvements
and thrive throughout the evolution (Au and Henderson, 2006; Tombe and Zhu, 2019;
Zhu, 2012). We reveal industry-specific micro-foundations through detailed spatial attenu-
ation patterns that help to shed light on the economic forces facilitating China’s dramatic
economic growth. Evidence from China, the largest developing country in the world, further
helps to verify whether the previously documented regularities of agglomeration economies
larger spatial scale.
4 For instance, Ogawa and Fujita (1980) and Fujita and Ogawa (1982) explicitly model spatial attenuation of agglomeration economies to endogenize and determine the location of employment in a structural setting. The trade-off in the location decision resides in the tension between workers’ commuting costs and spatial spillover benefits accrued to firms. As shown in Fujita and Ogawa (1982), the equilibrium outcome ranges from a purely monocentric city to a complete dispersion, depending on the importance of the spatial decay function relative to commuting costs.
5 China has become the world’s second-largest economy in terms of total gross domestic product (US$13.608 trillion in 2018, https://www.worldbank.org) within just four decades after its “reform and opening-up” policy. It is now often regarded as “the world factory.”
also hold for the developing world, adding to a growing body of literature on agglomeration
economies in developing countries.6
Moreover, a unique feature of the Chinese economy–the large presence of SOEs–allows
for investigating the role of SOEs in shaping the spatial manifestation of agglomeration
economies. The presence of SOEs remains controversial. They are usually stamped as be-
ing inefficient because of a lack of incentives and information (Megginson and Netter, 2001;
Djankov and Murrell, 2002; Huang et al., 2017; Estrin et al., 2009). However, because SOEs
are centrally controlled, they are capable of internalizing externalities to enhance efficiency.
Both aspects bear important implications on the agglomeration mechanisms and related in-
dustrial policies in countries with a large state sector. By documenting and comparing the
level and attenuation speed of agglomeration economies within and between the state and
private sectors, we reveal the nature of ownership-specific agglomeration economies that, to
our knowledge, has not been studied in the literature thus far.7
In theory, the benefits of agglomeration can be revealed by the relationship between ag-
glomeration and a variety of measures, including the location choice of new firms, total factor
productivity (TFP), total output per worker, and wages per worker. If agglomeration en-
hances productivity, it should manifest in higher TFP and total output per worker. As the
increased productivity raises firms’ willingness to pay for factor prices in a spatial general
equilibrium setting, wages per worker become higher (Rosen, 1979; Roback, 1982). More-
over, if the productivity gains outweigh the cost of agglomeration, new firms would prefer
to locate in places with higher density (Rosenthal and Strange, 2003; Arzaghi and Hender-
son, 2008). In this paper, we provide a comprehensive analysis of all the above-discussed
aspects of agglomeration benefits. We focus on the location choice of new firms in the main
empirical analysis but also consider firm TFP and other productivity correlates in a set of
complementary analyses.8
6 Recent studies include Combes et al. (2015), Duranton (2016), and Chauvin et al. (2017).
7 Ge (2009) and Lu and Tao (2009) show that the agglomeration index is lower for industries with higher shares of SOE employment, but their analysis is at the industry level and does not account for detailed geographic variations to reveal the nature of spatial interactions between and within the private and state sectors.
8 We measure firm birth as the probability of birth in each 2 km by 2 km grid in mainland China. The advantages of focusing on firm birth mainly reside in that new establishments are unconstrained by previous decisions, such as the level of capital investments and output. While decisions on wage and output can be
To motivate our empirical specification of firm location choices and to diagnose potential
identification challenges, we present a simple conceptual framework based on Rosenthal and
Strange (2003) and Arzaghi and Henderson (2008). We specify a production function on
account of productivity spillovers that attenuate with geographic distance, similar in nature
to Lucas and Rossi-Hansberg (2002). We then specify the firm profit function as revenue,
subject to unobserved location and firm heterogeneity, minus costs. Costs include those for
labour and land capital and other location-specific fixed costs. In this way, one can clearly
see that the probability of firm birth is affected by the presence of existing firms at various
distances and that the identification of the impact of the agglomeration measures could be
compromised by its possible correlation with location-specific amenities and various cost
factors.
Guided by the conceptual framework, we conduct our empirical analysis in three steps.
First, we estimate the scale of distance- and industry-specific agglomeration economies to
explain the probability of firm births. Second, we estimate the attenuation speed of agglom-
eration economies by fitting a set of parametric functions to the first-step estimates. Last,
we explore the goodness of fit of various spatial decay functional forms, evaluate the micro-
foundations that govern the decay patterns across industry types, and reveal the role of SOEs
in shaping the spatial manifestation of agglomeration economies. Our key independent vari-
ables of interest in the first-step regression are measures of employment in the same industry
at various distances, conventionally adopted as the proxies for localization economies.9 To
alleviate concerns of omitted variable bias, we control for a wide range of related factors, in-
cluding proxies for urbanization economies (total manufacturing employment excluding own
industry within the same set of concentric rings), Herfindahl indexes that represent the local
industrial organization and industrial specialization, and city fixed effects.10
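
The first-step specification described above can be sketched as follows. This is a minimal illustration on synthetic data, not the estimation code used in the paper: the ring cut-offs (0 to 1, 1 to 5, and 5 to 10 km), the helper names `ring_employment` and `fit_lpm`, and the simulated birth process are assumptions made for the example, and the controls (urbanization rings, Herfindahl indexes, city fixed effects) are left out.

```python
import numpy as np

def ring_employment(xy, emp, edges):
    """Own-industry employment in concentric rings around each grid centroid.

    xy: (n, 2) centroid coordinates in km; emp: (n,) employment per grid cell;
    edges: ring boundaries in km, e.g. [0, 1, 5, 10] (illustrative cut-offs).
    Ring k covers edges[k] <= distance < edges[k + 1]; the innermost ring
    therefore includes the cell's own employment.
    """
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    rings = np.empty((len(xy), len(edges) - 1))
    for k in range(len(edges) - 1):
        in_ring = (d >= edges[k]) & (d < edges[k + 1])
        rings[:, k] = in_ring.astype(float) @ emp
    return rings

def fit_lpm(birth, rings):
    """OLS linear probability model of a birth indicator on ring employment."""
    X = np.column_stack([np.ones(len(birth)), rings])
    coef, *_ = np.linalg.lstsq(X, birth, rcond=None)
    return coef  # intercept followed by one coefficient per ring

# Synthetic illustration only: random centroids, employment, and births.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 30, size=(400, 2))
emp = rng.poisson(20, size=400).astype(float)
rings = ring_employment(xy, emp, [0, 1, 5, 10])
birth = (rng.uniform(size=400) < 0.05 + 0.0005 * rings[:, 0]).astype(float)
coef = fit_lpm(birth, rings)
```

The ring-specific coefficients returned by `fit_lpm` play the role of the distance-specific estimates that the second step takes as its dependent variable.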
constrained by previous choices, such as previous investments in capital, firms’ location decisions usually take the existing economic environment as exogenously given (Rosenthal and Strange, 2003).
9 Specifically, we create several concentric ring variables that measure employment in the same industry at various distances from a given location (that is, within 1 kilometer, between 1 and 5 kilometers, and so on) and regress firm birth probability at the centroid location on those concentric ring measures.
10 The urbanization measures capture the positive externalities associated with the aggregate urban scale, as well as the negative impacts of congestion and pollution. The Herfindahl indexes control for local competitiveness and diversity of economic activities. Both variables play an important role in explaining the probability of new firm birth as emphasized in Glaeser et al. (1992) and Henderson et al. (1995). The city fixed effects help to absorb a wide range of city-specific amenities and disamenities that might influence the probability of
Despite various controls, the empirical identification of agglomeration economies is still
challenging. However, we argue that the identification for the attenuation speed of agglom-
eration economies is less vulnerable than the identification for the level of agglomeration
economies. The reason is that the potential bias in the level estimates is unlikely to be
systematically correlated with our distance measures in the second-step regression, which is
similar to the argument in Rosenthal and Strange (2008). For example, the conceptual model
suggests that one important factor in new firms’ location choices is the location-specific costs
of labour and land capital. To the extent that the cost factors cannot be fully controlled
for by observed location-specific attributes, the estimated localization economies in level
terms suffer from a downward bias.11 Nonetheless, since we control for various industry-
and location-specific attributes, the remaining bias caused by the cost factors is unlikely to
be systematically correlated with our distance measures. Thus, the estimates of attenuation
slope in the second-step regression are not necessarily biased.12
While we strongly believe in our baseline estimates, we adopt an instrumental variable
(IV) approach to corroborate our findings. We instrument for the contemporaneous industry-
specific employment concentration with a flexible function of historical attributes of the same
industry and exogenous grid-specific geological features, including the average terrain eleva-
tion and slope. We argue that because of accumulative effects, the historical measures remain
correlated with our key localization measures but not directly correlated with contempora-
neous unobserved cost factors (Ciccone and Hall, 1996; Combes et al., 2010). Similarly, the
underlying geological features are correlated with the likelihood of developing economic con-
centrations over time but not correlated with unobserved cost factors (Rosenthal and Strange,
2008; Glaeser and Kerr, 2009).13 We specify a nearly saturated functional form for the first-
firm birth, such as local fiscal policies and city-wide place-based policies.
11 This potential omitted cost variable bias is in line with Arzaghi and Henderson (2008). They argued for the need to control for location-specific unit rental cost in explaining the expected profit of advertising firms. Similarly, the possible presence of unobserved productivity amenities that affect both firm births and stocks causes ordinary least squares estimates to be upward biased.
12 Note that the magnitude of the bias in levels caused by the presence of unobserved cost factors, as an example, is determined by the correlation between the ring-specific cost factor and employment concentrations, as well as the ratio of standard deviations of both.
13 The identification strategy may not completely address concerns on the presence of unobserved natural advantages in estimating the level coefficient. However, it should not compromise our identification of the attenuation slope in the second step because we explicitly control for proxies of natural advantages in our second-step industry-level regression. This point will be clear later in this paper.
stage regression to model the instruments in a flexible way. As we explain later in detail,
this effort leads to a set of 725 instruments in total. We adopt
the machine learning Lasso framework to select the most relevant instruments to improve our
first-stage prediction while addressing potential issues associated with weak instruments.
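
The instrument-selection idea can be sketched on synthetic data as follows. This is a hedged illustration rather than the procedure used in the paper: it runs a plain Lasso by coordinate descent with an arbitrary penalty `lam`, the helper names `lasso_cd` and `select_instruments` are hypothetical, and neither the choice of penalty nor the subsequent two-stage estimation is shown.

```python
import numpy as np

def lasso_cd(Z, x, lam, n_iter=200):
    """Lasso by cyclic coordinate descent.

    Minimizes (1 / (2n)) * ||x - Z b||^2 + lam * ||b||_1 for a centred x and
    standardized columns of Z. Returns the coefficient vector b.
    """
    n, p = Z.shape
    b = np.zeros(p)
    resid = x.copy()
    col_ss = (Z ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = Z[:, j] @ (resid + Z[:, j] * b[j]) / n
            new_bj = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            resid += Z[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b

def select_instruments(Z, x, lam):
    """Indices of candidate instruments with non-zero Lasso coefficients."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    b = lasso_cd(Zs, x - x.mean(), lam)
    return np.flatnonzero(b != 0.0)

# Synthetic illustration: 50 candidate instruments, only columns 0 and 3
# actually drive the endogenous regressor x.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 50))
x = 2.0 * Z[:, 0] - 1.5 * Z[:, 3] + rng.normal(scale=0.5, size=500)
selected = select_instruments(Z, x, lam=0.2)
```

The L1 penalty sets the coefficients of weakly related candidates exactly to zero, which is what makes the Lasso usable as a selection device when the candidate set is large.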
Our empirical analysis draws on several unique administrative firm-level data sets. The
main data set is the China Economic Census (CEC), which provides the exact address of
all Chinese manufacturing firms and other employment information.14 One challenge with
respect to the measurement of nearby economic activities in this literature is the lack of
geographically refined data sets. For that reason, almost all studies on this topic rely on
aggregated data and implicitly presume a certain distribution of firm locations within the geographic
unit at which the data are available. This approach imposes constraints on
identifying the spatial decay.15 In contrast, our data are well-suited to address this issue
because we have access to the exact address of each firm establishment which allows for a
more accurate and flexible way to quantify distance-specific agglomeration measures.16
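
A minimal sketch of how geocoded firms might be allocated to 2 km by 2 km grid cells and how distances between cells can be measured is given below. It assumes the firm coordinates have already been projected to a planar system measured in kilometers; the geocoding and projection steps, as well as the helper names, are assumptions of this example and are not taken from the paper.

```python
import numpy as np

CELL_KM = 2.0  # grid size stated in the paper (2 km by 2 km)

def assign_to_grid(x_km, y_km, cell=CELL_KM):
    """Map projected firm coordinates (km) to integer grid indices (col, row)."""
    col = np.floor(np.asarray(x_km) / cell).astype(int)
    row = np.floor(np.asarray(y_km) / cell).astype(int)
    return col, row

def grid_employment(col, row, emp):
    """Sum employment by grid cell; returns {(col, row): total employment}."""
    totals = {}
    for c, r, e in zip(col.tolist(), row.tolist(), emp):
        totals[(c, r)] = totals.get((c, r), 0.0) + float(e)
    return totals

def centroid_distance_km(cell_a, cell_b, cell=CELL_KM):
    """Euclidean distance between the centroids of two grid cells."""
    (ca, ra), (cb, rb) = cell_a, cell_b
    return cell * float(np.hypot(ca - cb, ra - rb))

# Three hypothetical firms: two fall in the same cell, one lies 4 km east.
col, row = assign_to_grid([0.5, 1.9, 4.1], [0.5, 1.0, 0.2])
totals = grid_employment(col, row, [10, 5, 7])
```

Measuring distance between cell centroids rather than between individual firms keeps the number of pairwise distances manageable while preserving variation at a fine spatial scale.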
We obtain the following findings. First, we find that agglomeration economies attenuate
with geographic distance. The initial attenuation is rapid, with the effect of own-industry em-
ployment in the first kilometer significantly larger than the effect of employment further away.
The estimated attenuation speed varies dramatically across industry types. The patterns are
largely consistent when we use the IV approach. The variation in the estimated industry-
specific attenuation speed is positively correlated with variations in proxies for knowledge
spillovers and labour market pooling but negatively correlated with variation in a proxy for
input sharing and the share of state firms. The correlations speak directly to the under-
lying micro-foundations of agglomeration economies. Moreover, among the various spatial
14 The unit of observation in the CEC is the legal unit, which we will define formally in Section 4. Most of the legal units only consist of one establishment. In 2008, the percentage of the single-establishment legal unit was 93.88. Thus, we treat a legal unit in our data as an establishment. In addition, we show the robustness of our results using only the single-establishment legal units.
15 For example, if the size of the geographic unit is large, the prior assumption on the spatial distribution within the unit would strip away any variation within a more refined distance range and would render the spatial decay pattern unidentified at a more refined distance. This might be especially concerning considering that the spatial decay of agglomeration economies could be rather fast for certain industries that rely heavily on knowledge spillovers.
16 In practice, we measure firm distances by allocating firms to each 2 km by 2 km grid and measure the distance for each grid pair. We elaborate on our measurement procedure and the reasons that we follow this procedure in a later section.
attenuation functions that we experimented with, the inverse square distance decay function
presents the best goodness of fit. This provides empirical guidance on the functional form in
modeling agglomeration economies.
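
The comparison of decay functional forms can be illustrated with a short sketch. Everything in it is an assumption made for the example: the ring distances, the ring coefficients (generated from an inverse square form by construction), the inclusion of an inverse distance candidate, and the helper names. The ranking it produces is therefore built in and is not evidence about the result reported in the paper.

```python
import numpy as np

def r_squared(y, g):
    """R^2 from an OLS fit of y on a constant and the transformed distance g."""
    X = np.column_stack([np.ones_like(g), g])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def compare_decay_forms(dist_km, beta, rates=np.linspace(0.05, 2.0, 40)):
    """Goodness of fit of candidate decay forms for ring-level estimates."""
    return {
        "inverse_distance": r_squared(beta, 1.0 / dist_km),
        "inverse_square": r_squared(beta, 1.0 / dist_km ** 2),
        # For the exponential form, keep the best decay rate on a coarse grid.
        "inverse_exponential": max(
            r_squared(beta, np.exp(-m * dist_km)) for m in rates
        ),
    }

# Hypothetical ring distances (km) and first-step coefficients.
dist = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 12.0])
beta = 0.01 + 0.2 / dist ** 2
fits = compare_decay_forms(dist, beta)
best = max(fits, key=fits.get)
```

Each candidate enters the regression only through a transformation of distance, so the R-squared values are directly comparable across forms.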
Second, the agglomeration benefits and the associated attenuation patterns show strong
heterogeneity across ownership types. Both state and private firms benefit more from the
concentrations of own-type employment. Agglomeration spillovers across ownership types are
smaller in terms of magnitude and geographic scope, compared to within ownership types.
Relative to the average effects documented using the pooled data, existing private firms’
impact on private firms’ entry is both larger in magnitude and more significant. This pattern
suggests stronger agglomeration effects within the private sector. The estimated within-
SOE agglomeration effects are relatively weaker and are associated with a slower spatial
decay speed, compared to the effect within the private sector. Evidence suggests that the
agglomeration spillovers among SOEs could be more on input sharing and less on knowledge
spillovers. Similar patterns are also found when the agglomeration gains are measured in
TFP, output per worker, and wages per worker.
The findings in this paper contribute to the broad literature on the empirics of agglomera-
tion economies. Economists have long recognized the importance of agglomeration economies
that contribute to pronounced geographic clusters (Marshall, 1920; Krugman, 1991; Hender-
son et al., 1995; Henderson, 2003; Greenstone et al., 2010; Combes et al., 2012; Gaubert, 2018).
The exact magnitude of agglomeration economies, however, varies with the type of workers
and industries and with the period and country (Rosenthal and Strange, 2004; Combes and
Gobillon, 2015; Rosenthal and Strange, 2019). These differences arise because of variations
in the extent of the reliance on different fundamental sources of agglomeration economies.
They also arise because of the evolving nature of agglomeration forces at different stages of
economic development or unique institutional features of different countries and economies.
We extend the breadth of the literature by studying the nature of agglomeration economies
in China and by accounting for the role of its large presence of SOEs.
Our findings also add depth to the literature by systematically documenting the relation-
ship between the micro-foundations of agglomeration economies and the attenuation slope. It
is important to understand the source of agglomeration economies because different sources
have different policy implications (Holmes, 1999; Costa and Kahn, 2000; Duranton and Puga,
2004; Ellison et al., 2010; Liu, 2015; de la Roca and Puga, 2017; Diodato et al., 2018; Moretti,
2019; Davis and Dingel, 2019). Because different agglomeration forces should operate at different
geographic scales, economists have long speculated about a relationship between the nature
of agglomeration economies and the associated attenuation patterns (Rosenthal and Strange,
2003, 2008; Arzaghi and Henderson, 2008). Yet, no systematic evidence currently exists in
the literature to statistically establish such a relationship. We achieve that goal here and,
hence, support the notion that the evidence on attenuation of agglomeration economies is
relevant to determining their nature.
The rest of this paper is organized as follows. In Section 2, we discuss the conceptual
framework that explains agglomeration economies and firms’ location choices. Section 3
presents the empirical framework and addresses identification challenges. We discuss the
data and variables in Section 4. We present the baseline results in Section 5. Section 6
focuses on the probability of birth, TFP, and other correlates. We present a more focused
discussion on SOEs versus non-SOEs in Section 7. We conclude with Section 8.
2. Agglomeration Economies and Location Choices
In this section, we first present a simple model of firms’ location choices in a standard
market economy on account of agglomeration economies that attenuate with geographic dis-
tance. The model setup helps to guide our empirical specification and interpretation and to
highlight potential identification pitfalls. We then discuss additional factors to consider with
the presence of SOEs in the Chinese economy.
2.1. Standard Market Economy
What determines firms’ location choices that give rise to the observed spatial variation
in the intensity of economic activities is one of the most fundamental questions in urban
economics. Firms could seek locations with natural productivity and/or cost advantages.
Firms could also seek locations with existing concentrations of similar firms driven by the
three Marshallian forces: knowledge spillovers, labour pooling, and input sharing (Marshall,
1920). Those forces contribute to the variation in firms’ geographic distribution in addition
to randomness and other organizational factors (Ellison et al., 2010; Faggio et al., 2017).
The forces of agglomeration externalities attenuate with geographic distance. The speed of
attenuation differs as a result of industrial variation in the importance of different Marshallian
forces. Knowledge spillovers, for example, often require face-to-face contact within close
proximity. Labour pooling, which results from the ability to share a similar labour market
pool, takes place at a relatively larger distance. Input-sharing may operate at an even larger
geographic scope because the cost of transporting goods is kept low by the development
of transportation technology and infrastructure (Rosenthal and Strange, 2004; Combes and
Gobillon, 2015). Since the extent to which industries rely on each of the three Marshallian
forces is often different, the attenuation of agglomeration economies driven by these forces
would also be different across industry types.17
We specify a firm’s production technology by taking into consideration the industry-specific
geographic nature of agglomeration economies. The production technology is described
by a decreasing-returns-to-scale production function, with labour and land as inputs, and an
external effect that relates the productivity at a location to the density of economic activities
at other locations weighted by a spatial decay function. This external effect represents the ag-
glomeration forces mentioned earlier and was first introduced by Fujita and Ogawa (1982) as
the “location potential.” Lucas (2001), Berliant et al. (2002), and Lucas and Rossi-Hansberg
(2002) model production technology in similar ways to study the structure of cities.
The production function is written as follows:
$$Y_{i,s,c} = \gamma_s \left[ \int\int f[\mu_s, d(c,c')] \, L_{j,s,c'} \, dj \, dc' \right] A_{s,c} \, L_{i,s,c}^{\alpha_s} K_{i,s,c}^{\beta_s}, \tag{2.1}$$
where $Y_{i,s,c}$ is the output of firm $i$ in industry $s$ located at $c$, $L_{i,s,c}$ and $K_{i,s,c}$ are units of
labour and land used by the firm, $\alpha_s \in (0,1)$, $\beta_s \in (0,1)$, and $\alpha_s + \beta_s \in (0,1)$. $A_{s,c}$ represents
amenities that affect the productivity of firms in industry s at location c. The amenities con-
17 Note that this statement does not preclude the possibility that close proximity is also required by the networking element to search for jobs and to create close interactions between buyers and sellers in input sharing.
sidered here include observed characteristics of both own and surrounding locations, as well as
local attributes that are difficult to measure such as natural advantages, government policies,
and workforce qualities. The external Marshallian effect concerns localization externalities
arising from concentrations of own industry employment within and around a location and is
governed by two components. The first component is $\int\int f[\mu_s, d(c,c')] \, L_{j,s,c'} \, dj \, dc'$, the sum
of employment of all other firms in the same four-digit industry in all locations weighted by a
spatial decay function $f[\mu_s, d(c,c')]$. This decay function decreases with the spatial distance
between $c$ and $c'$, $d(c,c')$. $\mu_s$ is a parameter capturing the speed of the spatial decay.18 The
second component is $\gamma_s$, which is the industry-specific scale of agglomeration economies, that
is, the extent to which the weighted own industry employment across all locations impacts
the firm output.
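
A discrete analogue of the production function may help fix ideas. The sketch below replaces the double integral with a sum over grid cells and uses an exponential decay $f[\mu_s, d] = e^{-\mu_s d}$, the form the paper attributes to Lucas and Rossi-Hansberg (2002); the choice of this form here, the cell-level aggregation (which cannot exclude the firm's own employment), and all numerical values are assumptions of the example.

```python
import numpy as np

def external_effect(xy, emp, mu, gamma):
    """gamma * sum over cells c' of exp(-mu * d(c, c')) * employment at c'.

    xy: (n, 2) cell centroids in km; emp: (n,) own-industry employment.
    Returns one value per cell c, the discrete counterpart of the first two
    factors in the production function.
    """
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    return gamma * (np.exp(-mu * d) @ emp)

def output(external, A, L, K, alpha, beta):
    """Firm output: external effect times amenities times L^alpha * K^beta."""
    assert 0 < alpha < 1 and 0 < beta < 1 and alpha + beta < 1
    return external * A * L ** alpha * K ** beta

# Two hypothetical cells 1 km apart; mu is chosen so the weight halves per km.
xy = np.array([[0.0, 0.0], [1.0, 0.0]])
emp = np.array([10.0, 20.0])
ext = external_effect(xy, emp, mu=np.log(2.0), gamma=0.5)
y = output(ext[0], A=2.0, L=4.0, K=16.0, alpha=0.5, beta=0.25)
```

A larger `mu` shrinks the weight on distant employment faster, which is the sense in which the parameter captures the attenuation speed.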
Given the production technology, we formulate firms’ location choices as determined by
the potential profit to be achieved at different locations. The profit of a representative firm
i at location c is the following:
$$\pi_{i,s,c} = p_{c,s} Y_{i,s,c} - w_c L_{i,s,c} - r_c K_{i,s,c} - F_{s,c}, \tag{2.2}$$
where $p_{c,s}$ is the price of the firm output, $F_{s,c}$ is an industry- and location-specific fixed
cost, and $w_c$ and $r_c$ are the location-specific wage rate and rental cost, respectively. Firms
choose input quantities to maximize profits given $w_c$ and $r_c$. In equilibrium, benefits from
agglomeration are capitalized into wages and rents (Glaeser and Mare, 2001; Rosenthal and
Strange, 2008; Arzaghi and Henderson, 2008), either through increased technological pro-
ductivity spillovers or “pecuniary externalities” (Combes and Gobillon, 2015). As we focus
on the determinant of new firms’ potential profits, we treat the output price, wages, and
land rents as given in our model. Potential endogeneity concerns with regards to empirical
estimation will be discussed in detail in Section 3.
Following Rosenthal and Strange (2003), we assume firms are heterogeneous in their
18 The geographic scope and attenuation speed of such externalities are different across industries and controlled by the sector-specific decay function $f[\mu_s, d(c,c')]$. In the empirical estimation, we use several functional forms to capture the spatial decay of agglomeration spillovers.
potential profitability. Hence, a specific firm’s profit function is expressed as
$$\pi_{i,s,c} = p_{c,s} Y_{i,s,c} (1 + \varepsilon_{i,s,c}) - w_c L_{i,s,c} - r_c K_{i,s,c} - F_{s,c}, \tag{2.3}$$
where $\varepsilon_{i,s,c}$ is the firm’s idiosyncratic productivity shock and is independent and identically
distributed across firms according to a cumulative distribution function $\Phi_{s,c}(\varepsilon)$.
Upon learning its own productivity shock, a firm will enter the market if its maximized
potential profit is non-negative. That is,
$$\pi^*_{i,s,c} = \max_{\{L,K\}} \pi_{i,s,c} = p_{c,s} Y^*_{i,s,c} (1 + \varepsilon_{i,s,c}) - w_c L^*_{i,s,c} - r_c K^*_{i,s,c} - F_{s,c} \geq 0, \tag{2.4}$$
where $L^*_{i,s,c}$, $K^*_{i,s,c}$, and $Y^*_{i,s,c}$ are labour, land capital, and the level of output chosen at the
profit maximizing level, respectively. More specifically, we have
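$$L^*_{i,s,c} = \left( \frac{w_c^{\beta_s - 1}}{r_c^{\beta_s}} \right)^{\frac{1}{\delta_s}} \left[ (1 + \varepsilon_{i,s,c}) \, p_{c,s} \gamma_s \left[ \int\int f[\mu_s, d(c,c')] \, L_{j,s,c'} \, dj \, dc' \right] A_{s,c} \, \alpha_s^{1-\beta_s} \beta_s^{\beta_s} \right]^{\frac{1}{\delta_s}} \tag{2.5}$$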
and
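$$K^*_{i,s,c} = \left( \frac{r_c^{\alpha_s - 1}}{w_c^{\alpha_s}} \right)^{\frac{1}{\delta_s}} \left[ (1 + \varepsilon_{i,s,c}) \, p_{c,s} \gamma_s \left[ \int\int f[\mu_s, d(c,c')] \, L_{j,s,c'} \, dj \, dc' \right] A_{s,c} \, \alpha_s^{\alpha_s} \beta_s^{1-\alpha_s} \right]^{\frac{1}{\delta_s}}, \tag{2.6}$$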
where δs = 1 − αs − βs. For any industry s at location c there is a cut-off ε∗s,c such that
π∗i,s,c ≥ 0 if and only if εi,s,c ≥ ε∗s,c. Using equations (2.3), (2.5), and (2.6), we obtain the
following expression ε∗s,c = (Fs,cδs
)δs(wcαs )αs( rcβs )βs(ps,cγs
[∫ ∫f [µs, d(c, c′)]Lj,s,c′ djdc
′]As,c)−1−1. Therefore, the probability that a firm in industry s enters the market at location c is
1 − Φs,c(ε∗s,c), which is positively determined by the external effect and amenities, but is
negatively affected by wages, rental costs, and other fixed costs.
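As a numerical sanity check on the cut-off, the following sketch evaluates (2.5), (2.6), and the expression for ε∗s,c at made-up parameter values and confirms that maximized profit is zero exactly at the cut-off. It assumes a Cobb-Douglas technology Y = γs·S·As,c·L^αs·K^βs, where S stands for the distance-weighted employment integral; that functional form is an assumption chosen to be consistent with (2.5) and (2.6), and every number and function name below is illustrative rather than taken from the paper.

```python
def optimal_inputs(eps, p, gamma, S, A, w, r, alpha, beta):
    """Labour and capital demands as written in equations (2.5) and (2.6)."""
    delta = 1.0 - alpha - beta
    # Common bracketed term: (1 + eps) * p * gamma * [integral] * A
    B = (1.0 + eps) * p * gamma * S * A
    L = (w ** (beta - 1.0) / r ** beta
         * B * alpha ** (1.0 - beta) * beta ** beta) ** (1.0 / delta)
    K = (r ** (alpha - 1.0) / w ** alpha
         * B * alpha ** alpha * beta ** (1.0 - alpha)) ** (1.0 / delta)
    return L, K


def max_profit(eps, p, gamma, S, A, w, r, F, alpha, beta):
    """Profit (2.3) evaluated at the optimal inputs."""
    L, K = optimal_inputs(eps, p, gamma, S, A, w, r, alpha, beta)
    # ASSUMPTION: Cobb-Douglas output with the agglomeration term as a shifter.
    Y = gamma * S * A * L ** alpha * K ** beta
    return p * Y * (1.0 + eps) - w * L - r * K - F


def cutoff(p, gamma, S, A, w, r, F, alpha, beta):
    """Entry cut-off for the productivity shock, as in the text."""
    delta = 1.0 - alpha - beta
    cost = (F / delta) ** delta * (w / alpha) ** alpha * (r / beta) ** beta
    return cost / (p * gamma * S * A) - 1.0


# Hypothetical parameter values, for illustration only.
PARAMS = dict(p=1.0, gamma=0.5, S=100.0, A=1.2, w=2.0, r=1.5,
              F=10.0, alpha=0.3, beta=0.2)

eps_star = cutoff(**PARAMS)
profit_at_cutoff = max_profit(eps_star, **PARAMS)
```

Under these assumptions, `profit_at_cutoff` is zero up to floating-point error, and raising `S`, `A`, or `p` lowers `eps_star`, which matches the comparative statics stated in the text.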
2.2. The Role of the State Sector
A large presence of SOEs is a unique feature of the Chinese economy. SOEs are
legal entities that undertake commercial activities on behalf of the government. Recent
estimates show that SOEs contributed 39 percent of China’s gross domestic product in 2015
(Holz, 2018). Despite being government-owned, SOEs have priorities similar to those of private
firms in terms of delivering growth and tax revenue. This is especially true given that SOEs’
performance is closely tied to the promotion prospects of the supervising government officials
(Cao et al., 2019). Thus, the managerial decisions of SOEs are, at least in part, based on
profit maximizing incentives.
Nevertheless, the ownership nature and unique institutional features of SOEs entail addi-
tional factors coming into play in SOEs’ decision making and in the interactions within and
between SOEs and private firms. As a result, agglomeration channels in an economy with
a large presence of SOEs can be different from channels in a pure market economy. Entry
locations of private firms and SOEs could be affected differently by the presence of existing
SOEs and non-SOEs, and SOEs may have different incentives in response to agglomeration
externalities compared to private firms.
We first discuss how the entry decisions of private firms may be different depending on
existing concentrations of private and state firms. For private firms, the location decisions are
solely market-driven and exploiting local agglomeration economies from existing businesses is
a key factor to consider. The presence of nearby private firms attracts the entry of new private
firms by generating agglomeration externalities through the standard Marshallian forces. The
impact of nearby SOEs, however, is subject to the influence of unique institutional features
of SOEs that give rise to different interactions between the state and private sectors.
We highlight three important and related institutional features that contribute to dif-
ferent interactions between the state and private firms. The first relates to the government
favouritism toward and the ineffective management of SOEs. It is widely recognized that
SOEs are less efficient than private firms due to the lack of incentives and information (Hsieh
and Song, 2015; Huang et al., 2017). SOEs also face a different economic environment com-
pared with their private counterparts, because they have preferential access to loans and are
often protected by regulations that drive out competition from private companies. These
features imply that SOEs would not be as proactive as private firms to interact with other
firms and to contribute to local agglomeration economies. Therefore, new private firms would
be less attracted to concentrations of SOEs than to concentrations of private firms.
The second feature concerns job hopping. In China, job hopping from SOEs to non-SOEs
is very limited because SOE jobs are more secure and offer higher pay (Meng, 2012; Ge
and Yang, 2014). Infrequent job turnover between ownership sectors leads to a muted labour
pooling mechanism that would otherwise mutually enhance the productivity of both the state
and private sectors. The ineffective labour pooling mechanism between sectors decreases the
tendency of private firms to locate near SOEs relative to near private firms.
The third feature pertains to sharing inputs and outputs. Private firms may benefit more
from sharing inputs and outputs with their SOE counterparts than with other private firms
because SOEs are often sizable and stable business partners. Given their large capacity and
public ownership, SOEs may also have incentives to forfeit certain profit margins to benefit
local private firms in the partnership. This factor would increase private firms’ tendency to
locate near SOEs relative to private firms.
Thus, whether private firms are more or less likely to locate near SOEs is an empirical
question. However, since different mechanisms imply different attenuation patterns of ag-
glomeration economies, by documenting detailed attenuation patterns, we reveal the specific
mechanisms governing the nature of interactions within and between ownership sectors.
We next consider SOEs’ location decisions in response to the presence of existing state
versus private firms. Despite the lack of efficiency, in pursuit of profits, social planners still
locate SOEs based on market incentives, including opportunities to exploit agglomeration
economies. Given that private firms are, on average, more efficient and produce a higher
extent of spillover benefits, new SOEs may be more drawn to existing private firms than to
other SOEs. This is especially plausible considering that the majority of innovation activities
take place in the private sector and that the patents of private firms, as proxies for innovation,
are of higher quality, as evidenced by their more frequent citations and greater
international presence (Fang et al., 2017). Hence, to benefit from innovation spillovers of the
private sector, state firms need to be close to innovative private firms.
From another perspective, the central management of SOEs allows for the opportunity of
designing the spatial distribution of the state sector to maximize aggregate profits.19 SOEs
would then be more likely to locate closer to existing SOEs to internalize positive externalities.
This is especially true with the ownership status of SOEs widely decentralized during the SOE
19 The social planner may take externalities into consideration and intentionally design clusters of SOEs when making development plans. The industrial parks policy is one example in which the government is intentionally forming clusters of firms for possible externality internalization.
reform – local governments possess better information and are more effective in integrating
SOE management with the local economy (Huang et al., 2017).20 Thus, whether SOEs are
more attracted to the presence of state or private firms is also theoretically ambiguous.
All in all, while the agglomeration forces and associated attenuation patterns in China are
driven by similar market mechanisms as in developed countries, variations exist because of
China’s unique institutional features with the presence of a large state sector. By documenting
the different decay patterns of agglomeration economies within and between the state and non-state
sectors, we gain insight into the underlying micro-foundations that govern the spillovers
taking place within and across ownership types. These features help to shed light on the
underlying economic forces that drive rapid urban growth in China.
3. Empirical Framework
3.1. Estimation Procedure
In this section, we lay out the empirical framework to identify the spatial decay of
agglomeration economies in explaining firms’ location choices. As described in Section 2.1, the
aggregate agglomeration effect is captured by $\gamma_s\left[\iint f[\mu_s, d(c, c')]\, L_{j,s,c'}\, dj\, dc'\right]$, where the
distance-weighted nearby economic activities are scaled by a factor of $\gamma_s$ to constitute the
aggregate agglomeration effect. The related spatial attenuation of agglomeration economies
is captured by the function $f[\mu_s, d(c, c')]$, which is embedded in the aggregate agglomeration
term. It is not straightforward to disentangle the attenuation speed from the scale effect in
levels, as the speed of the spatial attenuation, µs, is inherently linked to the scale parameter,
γs. In this paper, we adopt a two-step approach that relies on the specific distance-weighting
functional form to help identify µs.
In the first step, we estimate the scale of distance- and industry-specific agglomeration
economies in explaining the probability of firm birth at a location. In particular, we focus
on localization economies while controlling for urbanization economies, where both are spec-
ified in concentric ring variables as conventionally adopted in the literature (Rosenthal and
20 The decentralization of SOEs means that the upper-level government delegates the right of control of SOEs to lower-level governments.
Strange, 2004). In addition, we control for proxies for industrial organization and industrial
diversity as emphasized in Glaeser et al. (1992) and Henderson et al. (1995). The estimation
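equation in the first step for each two-digit industry s is, hence, specified as follows (s is
suppressed for simplicity):
\[ \mathrm{BirthRate}_{c,p,i} = \sum_r \alpha_r \mathrm{LOC}_{r,i} + \sum_r \beta_r \mathrm{URB}_{r,i} + \mathrm{IO}_{c,i} + \mathrm{DIV}_c + \mu_p + \pi_i + \varepsilon_{c,p,i}. \tag{3.1} \]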
In the equation, BirthRatec,p,i is defined as the percentage of new firms at location
c (locations are defined as 2km by 2km grids) and prefecture city p out of all new firms
for each four-digit industry i. LOCr,i stands for localization economies, calculated as the
sum of within four-digit industry employment in industry i in concentric ring r. URBr,i
represents urbanization economies, calculated as the sum of all manufacturing employment
excluding industry i in concentric ring r. To capture localization and urbanization economies
at different distances, we specify four concentric ring variables for each location, in addition
to the own-location measures: own-location boundary to 5 km radius concentric ring, 5-10 km
concentric ring, 10-20 km concentric ring, and 20-30 km concentric ring. µp and πi represent
the prefecture city fixed effects and four-digit industry fixed effects.
In addition, we control for proxies for local industry organization and industry diversity.
Organizational features are captured by $\mathrm{IO}_{c,i}$, which is the Herfindahl index for each four-digit
industry within 30 km of the central grid c.21 It is defined as $\sum_j (L_{j,i,c}/L_{i,c})^2$, where
$L_{j,i,c}$ is the employment level of firm j in four-digit industry i in the region within 30 km
of c, and $L_{i,c}$ is the total employment level of industry i in that region. We incorporate the
diversity of economic activity using a Herfindahl index of specialization, $\mathrm{DIV}_c$. It is defined
as $\sum_i (L_{i,c}/L_c)^2$, where $L_{i,c}/L_c$ is industry i’s share of total employment within 30 km of the
center of location c. By estimating equation (3.1) for each two-digit industry s, we obtain
distance- and industry-specific estimates of the localization impact on the probability of firm
birth, αr,s.
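The two Herfindahl controls can be illustrated with a short sketch. The firm list, the four-digit codes, and the function names are hypothetical; the input is taken to be the set of firms already located within 30 km of a central grid c.

```python
from collections import defaultdict


def herfindahl(values):
    """Sum of squared shares; returns 0.0 when there is no employment."""
    total = sum(values)
    if total <= 0:
        return 0.0
    return sum((v / total) ** 2 for v in values)


def io_and_div(firms):
    """Compute the two controls for one central grid.

    firms: iterable of (industry_code, employment) for firms within 30 km.
    Returns (io, div): io maps each industry to its firm-level Herfindahl
    index; div is the Herfindahl index over industry employment shares.
    """
    by_industry = defaultdict(list)
    for industry, employment in firms:
        by_industry[industry].append(employment)
    io = {i: herfindahl(emps) for i, emps in by_industry.items()}
    div = herfindahl([sum(emps) for emps in by_industry.values()])
    return io, div


# Hypothetical firms near grid c: (four-digit industry code, employment).
firms_near_c = [("1310", 50), ("1310", 30), ("1310", 20), ("1720", 100)]
io, div = io_and_div(firms_near_c)
```

In this toy example, industry "1310" has three firms with employment shares 0.5, 0.3, and 0.2, so its index is 0.38, whereas the single-firm industry "1720" has an index of 1. The two industries each hold half of local employment, giving a specialization index of 0.5.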
In the second step, we fit the first-step estimates of distance- and industry-specific agglom-
eration economies into various parametric decay functional forms to disentangle the speed of
21 We also experimented with other distance radii, such as 1 km, 5 km, 10 km, and 20 km when defining this control variable and our baseline results are very robust to this variation.
attenuation from the scale effect. Specifically, based on the ring definitions, we assign dis-
tance values, d, to the corresponding rings, r, and adopt several distance-weighting functional
forms to explain the variation in the first-step estimates of ring-specific (distance-specific) lo-
calization economies for each industry, which we now label as αr(d),s or equivalently αd,s. This
second-step procedure achieves two goals. First, we obtain an estimate of the speed of attenu-
ation, µs, for each industry. Second, we experiment with different types of distance-weighting
functional forms and test which functional form provides the best goodness of fit.
The estimation equation in the second step is as follows:
\[ \alpha_{d,s} = \mu_s f(d) + \gamma_s + \varepsilon_{d,s}, \tag{3.2} \]
where f(d) is a specific distance-weighting functional form, µs is the parameter capturing the
industry-specific decay speed, γs represents industry fixed effects to soak up the industry-
specific scale effect and other unobserved characteristics, and εd,s is the error term. We
experiment with nine functional forms of f(d). These functions are (1) a negative linear
distance function $f(d) = -d$, (2) an inverse linear distance function $f(d) = 1/d$, (3) an inverse
exponential distance function $f(d) = 1/e^{d}$, (4) a negative square distance function $f(d) = -d^{2}$,
(5) an inverse square distance function $f(d) = 1/d^{2}$, (6) an inverse square exponential distance
function $f(d) = 1/e^{2d}$, (7) a negative cube distance function $f(d) = -d^{3}$, (8) an inverse cube
distance function $f(d) = 1/d^{3}$, and (9) an inverse cube exponential distance function $f(d) = 1/e^{3d}$.
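To make the second step concrete, the sketch below fits equation (3.2) for a single industry under each of the nine functional forms and compares the R-squared values. It is only an illustration: the distance values assigned to the rings and the ring-level estimates are synthetic (the estimates are generated from an inverse square decay, so that form fits exactly by construction), and the helper names are hypothetical. With one industry, the fixed effect γs reduces to an intercept.

```python
import math

# The nine distance-weighting forms listed in the text.
FORMS = {
    "negative linear": lambda d: -d,
    "inverse linear": lambda d: 1.0 / d,
    "inverse exponential": lambda d: math.exp(-d),
    "negative square": lambda d: -d ** 2,
    "inverse square": lambda d: 1.0 / d ** 2,
    "inverse square exponential": lambda d: math.exp(-2.0 * d),
    "negative cube": lambda d: -d ** 3,
    "inverse cube": lambda d: 1.0 / d ** 3,
    "inverse cube exponential": lambda d: math.exp(-3.0 * d),
}


def fit(alpha, x):
    """OLS of alpha on x with an intercept; returns (mu, gamma, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    a_bar = sum(alpha) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (ai - a_bar) for xi, ai in zip(x, alpha))
    mu = sxy / sxx
    gamma = a_bar - mu * x_bar
    ss_res = sum((ai - gamma - mu * xi) ** 2 for xi, ai in zip(x, alpha))
    ss_tot = sum((ai - a_bar) ** 2 for ai in alpha)
    return mu, gamma, 1.0 - ss_res / ss_tot


# ASSUMPTION: one distance value (km) per ring; the text does not state them.
distances = [0.5, 3.0, 7.5, 15.0, 25.0]
# Synthetic first-step estimates: scale 0.1 plus an inverse square decay.
alpha_hat = [0.1 + 2.0 / d ** 2 for d in distances]

results = {name: fit(alpha_hat, [f(d) for d in distances])
           for name, f in FORMS.items()}
best_form = max(results, key=lambda name: results[name][2])
mu_hat, gamma_hat, r2 = results[best_form]
```

On real first-step estimates, the same loop would be run industry by industry (or pooled with industry fixed effects), and the goodness-of-fit comparison would be made across the nine forms.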
To verify the relationship between the industry-specific attenuation speed and the under-
lying micro-foundations, we estimate an alternative specification in the second step. Instead
of estimating industry-specific attenuation speed, we document how the decay speed changes
with variations in proxies for the three Marshallian agglomeration forces while controlling for
an industry’s SOE share and its reliance on natural resources, in the spirit of Ellison et al.
(2010). The estimation equation is as follows:
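\[ \begin{aligned} \alpha_{d,s} = {} & \mu_1 f(d) \times \mathbf{1}[IS_s > M(IS)] + \mu_2 f(d) \times \mathbf{1}[LP_s > M(LP)] + \mu_3 f(d) \times \mathbf{1}[KS_s > M(KS)] \\ & + \mu_4 f(d) \times \mathbf{1}[NA_s = NA_H] + \mu_5 f(d) \times \mathbf{1}[SOE_s > M(SOE)] + \mu_6 f(d) + \varepsilon_{d,s}, \end{aligned} \tag{3.3} \]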
where 1[·] denotes an indicator function; M(·) denotes the median of a variable; ISs, LPs, and
KSs stand for proxies for input sharing, labour market pooling, and knowledge spillovers,
respectively; NAs is a proxy for an industry’s reliance on natural resources as a potential
alternative agglomeration force (Ellison and Glaeser, 1999); and SOEs is the share of SOEs
in industry s. Specifically, total transportation cost per dollar of shipment of final output
is used as a proxy for input sharing. Similar to Rosenthal and Strange (2001), we use the
percentage of workers with a college degree or above as a proxy for labour market pooling.
Innovative activity is related to the importance of knowledge spillovers and is measured by
the ratio of new product to total product. Three cost variables are used to construct a proxy
for the reliance on natural resources: water inputs per dollar of shipment, energy inputs per
dollar of shipment, and other natural resource inputs per dollar of shipment.22 An industry is
defined as being highly dependent on natural resources (NAH) if at least two of the cost
variables for the industry are higher than their respective medians.
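The rule for the natural-resource indicator can be sketched as follows. The industry labels and cost figures are invented for illustration, and the function name is hypothetical; medians are taken across industries for each cost variable.

```python
from statistics import median


def high_natural_resource_dependence(costs):
    """Flag industries with at least two cost variables above the median.

    costs: dict mapping industry -> (water, energy, other) inputs per
    dollar of shipment. Returns a dict mapping industry -> bool.
    """
    n_vars = 3
    medians = [median(v[k] for v in costs.values()) for k in range(n_vars)]
    return {
        industry: sum(values[k] > medians[k] for k in range(n_vars)) >= 2
        for industry, values in costs.items()
    }


# Hypothetical cost shares for four industries.
costs = {
    "A": (0.9, 0.8, 0.1),
    "B": (0.2, 0.7, 0.6),
    "C": (0.1, 0.1, 0.5),
    "D": (0.5, 0.2, 0.2),
}
nah = high_natural_resource_dependence(costs)
```

Here the medians are 0.35 (water), 0.45 (energy), and 0.35 (other), so industries A and B exceed the median on two variables and are flagged, while C and D exceed it on only one.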
3.2. Identification Strategy
Given the broad empirical setup, we now discuss and address concerns of endogeneity.
The literature on identifying the magnitude of agglomeration economies is typically con-
cerned with unobserved heterogeneity across people and locations.23 These two types of
unobserved heterogeneity give rise to multiple sources of endogeneity through unobserved
factors in As,c, wc, rc, and Fs,c in our setting and lead to an ambiguous direction of bias.
The first source of endogeneity is caused by factors in As,c that represent unobserved
location-specific productivity or unobserved variation in human capital induced by sorting.
Places endowed with higher productivity breed higher existing business concentrations and,
at the same time, attract new firms to arrive. Alternatively, people with higher ability may
sort into locations with better amenities or higher business concentrations, which also leads
to more frequent firm births (Combes et al., 2008). Thus, failing to directly control for As,c
will lead to an upward bias of the estimated agglomeration effect. The second source of
22 Energy and other natural resources variables are constructed as in Rosenthal and Strange (2001). Coal, crude petroleum and natural gas are included as part of the energy variable, whereas output from mining, agriculture, and others are considered as other natural resources.
23 For example, Glaeser et al. (2018) summarizes unobserved heterogeneity across people as reflecting “the sorting of people into places based on ability levels” and unobserved heterogeneity across places as reflecting “the tendency of people to move into areas that have endogenously higher productivity.”
endogeneity concerns the cost factors in wc and rc. Higher economic concentrations generate
larger agglomeration economies which boost productivity and induce higher wage rates and
land rents (Rosen, 1979; Roback, 1982). Wages and rents negatively affect the likelihood of
firm entry, which causes a downward bias of the estimated agglomeration effect.24 Sorting by
unobserved ability could exacerbate the downward bias by inducing wages to rise further in
highly concentrated locations.25 The third source of endogeneity is caused by the unobserved
fixed cost Fs,c, which could be driven by, for example, industry and location-specific policies.
Taken together, the three sources of endogeneity make the net direction of the bias ambiguous.
To mitigate the bias in the estimated scale of agglomeration economies in the first step,
we include a comprehensive set of controls. First, we control for prefecture city fixed effects to
address cross-city differences in amenities, wages, rents, and location-based policies. If a city
can be considered as a self-contained labour market, wage rates should vary across cities but
not within a city, holding constant workers’ own productivity. Thus, city fixed effects could
help address concerns of unobserved wage differences arising from either location heterogene-
ity in productivity or ability sorting across cities. City fixed effects also help explicitly control
for cross-city differences in amenities, rental costs, and government policies that affect firms’
location preferences. Second, we control for ring-specific urbanization measures to address
concerns on unobserved variation in rental costs and other fixed costs within cities. Although
wages may not differ significantly within cities, rental costs do. Urbanization measures serve
as a good proxy for the overall demand for land, and hence, can help mitigate the concerns
of unobserved rental cost differences within cities. Ring-specific urbanization measures also
help control for unobserved positive externalities associated with the aggregate urban scale
and for unobserved negative impacts of congestion and pollution. Finally, we control for two
Herfindahl indexes for the 30 km concentric rings representing the local industrial organiza-
tion and industrial specialization separately. The Herfindahl indexes help address concerns
on unobserved local industry-level competitiveness and diversity of economic activities.
Despite various controls, we cannot fully resolve the concerns of omitted variables in
24 This is in line with Arzaghi and Henderson (2008) on controlling for land rent.
25 If physical capital and human capital are complementary, the land rents also become higher (Acemoglu, 1996).
identifying the level of agglomeration benefits in the first-step estimation. However, we argue
that the identification for the attenuation speed of agglomeration economies in the second-step
estimation is less vulnerable than the identification for the level of agglomeration economies
because the possible remaining bias in the level estimates is unlikely to be systematically
correlated with our distance measures. In this case, potential bias in the level estimates
will be differenced out when we include industry fixed effects while fitting a spatial decay
function to estimate the attenuation slope parameter in the second-step estimation.
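The differencing argument can be written out in a short derivation. The notation $b_s$ for the remaining first-step bias is introduced here for illustration only, and the exact result relies on the simplifying assumption that the bias does not vary across rings within an industry. Suppose the first-step estimate is $\hat{\alpha}_{d,s} = \alpha_{d,s} + b_s$. Substituting equation (3.2) gives

```latex
\begin{aligned}
\hat{\alpha}_{d,s} &= \mu_s f(d) + (\gamma_s + b_s) + \varepsilon_{d,s}, \\
\hat{\mu}_s &= \frac{\sum_d \bigl(f(d) - \bar{f}\bigr)\bigl(\hat{\alpha}_{d,s} - \bar{\hat{\alpha}}_s\bigr)}
                   {\sum_d \bigl(f(d) - \bar{f}\bigr)^2}
             = \frac{\sum_d \bigl(f(d) - \bar{f}\bigr)\bigl(\alpha_{d,s} - \bar{\alpha}_s\bigr)}
                   {\sum_d \bigl(f(d) - \bar{f}\bigr)^2},
\end{aligned}
```

where bars denote within-industry means over rings. The bias $b_s$ is absorbed by the industry fixed effect and cancels in the demeaned estimates, so the slope estimate is unaffected. If the bias did vary across rings, the slope would remain consistent only to the extent that the bias is uncorrelated with $f(d)$, which is the condition argued for in the text.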
While we strongly believe in our baseline estimates, we adopt an IV approach to cor-
roborate our findings. We instrument for contemporaneous industry-specific employment
concentration with a flexible function of historical employment concentrations of the same
industry in 1995 and exogenous grid-specific geological features including the average terrain
elevation and slope. The rationale for our instruments is as follows.
First, because of the accumulative effect, the historical measures of industry concentration
remain correlated with our key localization measures. However, the historical instruments
are not directly correlated with contemporaneous unobserved cost factors, similar to what
was argued in Ciccone and Hall (1996) and Combes et al. (2010). These instruments are
especially sensible in our setting given that the SOE reform policy in China was not imple-
mented widely back in 1995. At that time, the majority of firms were still SOEs, which were
established even earlier. The location choice of SOEs was subject to political and national
security considerations and was typically unrelated to market-driven cost factors before the
SOE reform. Second, the underlying geological features are correlated with the likelihood
of developing economic concentrations mentioned earlier but are not directly correlated with
contemporaneous cost and productivity factors. Instruments of this flavor have been used in
Rosenthal and Strange (2008).
To improve the efficiency of the IV estimation, we specify a nearly saturated functional
form of the instruments in the first-stage estimation. The idea is in line with the literature
on approximating optimal instruments nonparametrically, or equivalently in principle, by
constructing high-power polynomials (Amemiya, 1974; Chamberlain, 1987; Newey, 1990). To
construct the instruments, we follow a four-step procedure. First, as the geological data
contain the mean and standard deviation of terrain elevation and slope at the grid level,
we allocate each grid into the corresponding concentric rings (i.e., 0-1 km, 1-5 km, 5-10
km, 10-20 km, and 20-30 km rings). Second, for each concentric ring, we calculate the
first and second moments of grid-specific means and standard deviations for both the terrain
elevation and slope, which creates eight feature variables for each concentric ring.26 Third, we
discretize the mean and standard deviation of grid terrain elevation and slope for the 0-1 km
ring and the eight feature variables for each of the outer rings into ten separate categorical
dummies each to capture the nonlinearity of the impacts, which forms three hundred and
sixty variables. Fourth, we include the 1995 industry- and ring-specific employment counts,
discretized location-specific geological features, and their interactions to form the final set of
seven hundred and twenty-five instruments.
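The instrument counts can be checked with simple arithmetic. The count of 360 discretized geological variables follows directly from the description above. The breakdown of the final 725 is not spelled out in the text; the split used below (five ring-specific 1995 employment counts, the 360 geological dummies, and one interaction per dummy with the employment count of its own ring) is an assumption that happens to reproduce the stated total.

```python
# Rings used for the geological instruments, as listed in the text.
RINGS = ["0-1 km", "1-5 km", "5-10 km", "10-20 km", "20-30 km"]
N_CATEGORIES = 10  # each feature variable is discretized into ten dummies


def n_features(ring):
    """Four feature variables for the own-grid ring, eight for outer rings."""
    return 4 if ring == "0-1 km" else 8


geo_dummies_by_ring = {ring: n_features(ring) * N_CATEGORIES for ring in RINGS}
n_geo_dummies = sum(geo_dummies_by_ring.values())  # 40 + 4 * 80

# ASSUMPTION: one historical employment count per ring, and each geological
# dummy interacted only with the employment count of the same ring.
n_historical = len(RINGS)
n_interactions = n_geo_dummies
n_instruments = n_historical + n_geo_dummies + n_interactions
```

Under this reading the totals are 360 geological dummies and 725 instruments, in line with the figures reported above.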
The improved efficiency from using a large set of instruments comes at a cost arising from
potential weak instruments, which we resolve using the least absolute shrinkage and selec-
tion operator (Lasso). IV estimators based on many instruments may present undesirable
properties, such as the presence of weak instruments (Andrews et al., 2019). With weak
instruments, traditional IV estimates may be badly biased, t-tests may fail to control
size, and conventional IV confidence intervals may not cover the true parameter value with
intended probability. This potential problem, however, can be resolved by appealing to the
Lasso procedure.
Lasso was introduced by Frank and Friedman (1993) and Tibshirani (1996) and is widely
used as an estimator of regression functions and as a model selection device. Using Lasso to
form first-stage predictions in IV estimation is a practical approach that obtains the efficiency
gains from using optimal instruments while dampening the problems associated with many
instruments (Belloni et al., 2012). However, the Lasso selection techniques are not perfect, and
selection mistakes could contaminate the post-model-selection estimator and inference (Leeb
and Potscher, 2008; Andrews et al., 2019). For this reason, we use the procedure proposed by
Chernozhukov et al. (2015) as opposed to the standard Lasso or post-Lasso procedure.27 In
26 Note that the 0-1 km ring only consists of one grid. Thus, the 0-1 km ring is only associated with four feature variables, the mean and the standard deviation of grid terrain elevation and slope.
27 The standard post-Lasso estimator performs the least squares estimation using Lasso-selected instruments.
Chernozhukov et al. (2015), Lasso-selected variables and post-Lasso coefficient estimates are
used to construct orthogonalized versions of the dependent variable, independent variables of
interest, and optimal instruments. These variables are then used in a standard IV regression
for the final estimation. In this way, the estimation and inference for the parameters of
interest are locally insensitive to exclusions of relevant instruments.
4. Data, Variables, and Summary Statistics
4.1. Data
Our empirical analysis relies on three administrative firm-level data sets that are obtained
from the National Bureau of Statistics (NBS) of China and a complementary data set that
helps to form our instruments. The first and primary firm-level data set is the CEC.28 The
CEC is available for 2004 and 2008. We use the 2008 economic census for our core analysis
on firm birth. The 2004 economic census is used to construct agglomeration measures for
supplemental analyses on firm productivity and other correlates, which rely on firm data prior
to 2008. Both years provide information on a set of firm attributes, including firm name,
firm address, legal unit code, legal representative name, industry classification, opening year,
ownership type, fixed capital, output value, employment size, and others. The CEC catego-
rizes firms into four-digit industries based on the four-digit Chinese Industry Classification
(CIC) system. We focus on the manufacturing firms with a two-digit industry code from 13
to 42 in this study.
The unit of observation in the CEC is a legal unit (faren danwei). A legal unit needs
to meet several requirements: “(1) They are established legally, having their own names,
organizations, location and able to take civil liability; (2) They possess and use their assets
independently, assume liabilities and are entitled to sign contracts with other units; and (3)
They are financially independent and compile their own balance sheets.”29 By definition, a
legal unit is similar to a firm. Thus, for the rest of this paper, we refer to legal units as firms.
28 The NBS conducted the first economic census in 2004, the second economic census in 2008 and the following economic censuses every five years.
29China Statistical Yearbook 2009, Chapter 13.
Conveniently for our purpose, most firms in the CEC consist of a single plant. For example,
in the 2008 economic census, the share of single-plant firms is ninety-four percent. Thus, we
treat this data as an establishment-level data set for our main analysis, but we also carry out
estimations using only single-plant firms for robustness.
There are two major advantages of the CEC that make it well-suited for our study. First,
the CEC is the most comprehensive firm-level data set for the Chinese economy. The data
cover the universe of all registered firms in China, irrespective of size.30 As an example of the
scope, there are 5,228,726 firms included in the 2008 economic census. Observing the universe
of firms allows us to characterize the spatial distribution of economic activities accurately.
Second, the CEC provides detailed information on firm addresses, which allows us to
geocode the addresses to obtain the exact longitudes and latitudes of firm locations.31 This is a
major improvement over previous studies, which often rely on geographically aggregated
data and implicitly impose distributional assumptions on firm locations within
political and administrative units that are often irregular in shape. These assumptions lead
to measurement error and estimation bias and are likely to compromise the identification
of the spatial decay of agglomeration economies. In contrast, our empirical analysis is based
on accurate firm locations that allow for a more precise and flexible way of measuring firm
distances and of quantifying distance-specific agglomeration economies.
The second firm-level data set that we use is the Annual Surveys of Industrial Firms
(ASIF) of China from 1998 to 2007. The ASIF is an annual firm panel that covers private
firms with annual sales exceeding five million yuan and all SOEs.32 Similar to the CEC, most
firms in this data set are single-plant firms.33 The ASIF also contains detailed information
on firm addresses, which we use to geocode all firms and to obtain their exact longitudes
and latitudes. It provides the firms’ basic balance sheet information from 1998 to 2007
that enables us to estimate firm-level TFP. We estimate firms’ TFP, output per worker and
30 Self-employed individuals and private firms with up to eight employees may operate under a different legal system and be excluded from the economic censuses.
31 Accurately geocoding addresses in China can be challenging because all map service providers in China are mandated by the government to mask the exact longitudes and latitudes. We provide more detailed background and discuss how we overcome this issue in Appendix A.
32 In 2011, the sampling cut-off for private firms increases to twenty million yuan.
33 For instance, in 2007, the share of single-plant firms is 96.6 percent (Brandt et al., 2012).
average wage per worker from 2004 to 2007 and study how firm productivity is affected by
agglomeration measures in 2004.34 We adjust all dollar variables using the national consumer
price index so that they are comparable across years.35
We construct our IVs with a third firm-level data set and the digital elevation model
image of China. The third firm-level data set is an industrial firm census conducted by the
NBS in 1995. This survey covers all industrial firms in mainland China at that time. In
total, 510,381 industrial firms were surveyed. Similar to the CEC, the basic firm attributes
are collected in this survey. We construct our IVs of historical industrial attributes with this
data set. In 2003, the NBS adjusted the CIC system to reflect changes in the economy. To
ensure consistency in industry classification in our analysis, we constructed a harmonized
industry classification following Brandt et al. (2017). The digital elevation model image of
China provides the mean and standard deviation of terrain elevation and slope for each 2km
by 2km grid in China and is used to construct the geological instruments. Details on the
construction of instruments are explained in Section 3.2.
4.2. Variables and Summary Statistics
In this section, we first develop a general understanding of the spatial distribution of
economic activities for each manufacturing industry in China. In Table 1a, we list the top five
cities with the highest concentration of each two-digit industry. Three patterns emerge. First,
within each industry, we see a strong agglomeration pattern. For example, Dongguan hosts
more than 15 percent of firms for the industry of stationery, educational, and sports goods.
Even for some less agglomerated industries (food production, raw chemical materials and
chemical products, medical and pharmaceutical products, and non-metal mineral products),
the total share of firms in the top five cities is more than 12 percent. Second, across industries,
there is large heterogeneity in the level of agglomeration. The most agglomerated industry
is electronic and telecommunications, with around 20 percent of firms located in Shenzhen,
whereas for beverage production, the highest firm share is only 3.46 percent in Yibin. Third,
34 The TFP estimation method is described in more detail in Appendix B. We calculate firm TFP until 2007 because the ASIF misses key variables for TFP estimation after 2007, such as the value added.
35The consumer price index is provided by the NBS at http://www.stats.gov.cn.
while a lot of industries concentrate in large cities, clusters also form in relatively smaller
cities. For example, food processing seems to concentrate in relatively smaller cities, including
Weihai, Yantai, Qingdao, Linyi, and Weifang.36
Next, we discuss how we measure economic concentrations in our analysis. We avoid
relying on agglomeration measures defined within existing administrative boundaries
for two reasons. The first reason is that we focus on capturing agglomeration economies
at more refined distances, and even the finest administrative areas are generally too broad
for our analysis. The second reason is that China’s official administrative areas are found to
be inconsistent with the power law (Dingel et al., 2019) and, thus, may be different from the
commonly defined “cities” used in research on developed countries. For these reasons, we use
concentric ring variables to capture economic concentrations at refined distances. Ideally, we would
define our concentric ring measures based on the exact pair-wise distances between firms. In
practice, we compromise by following the procedure below to reduce the computational
burden.
We measure firm distances and construct concentric ring measures using the following
procedure. We first overlay 2 km by 2 km grids on all of mainland China. We then assign
all firms to those grids based on firms’ longitude and latitude.37 The distance between a
pair of firms is then calculated as the distance between the grid centroids that the respective
firms belong to. As we work with a large sample of firms, this simple aggregation allows
us to keep the calculation intensity at a manageable level without compromising much on
precision.38 An underlying assumption is that all firms in a grid locate at the grid centroid.
Finally, we construct five concentric rings (0-1 km, 1-5 km, 5-10 km, 10-20 km, 20-30 km)
from the centroid of each grid and examine how economic activities in each of the five concentric
rings affect the economic outcomes of firms in the central grid.39
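The grid-and-ring construction described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' code: it assumes firm locations have already been projected to planar coordinates in kilometres (the paper starts from longitude and latitude), the function names `grid_index` and `ring_employment` are hypothetical, and treating each ring as closed at its inner edge and open at its outer edge is our convention.

```python
import numpy as np

GRID_KM = 2.0  # grid cell size, from the text
RING_EDGES_KM = [0.0, 1.0, 5.0, 10.0, 20.0, 30.0]  # ring boundaries, from the text


def grid_index(x_km, y_km):
    """Map planar coordinates (km, an assumption) to integer 2 km grid indices."""
    gx = np.floor(np.asarray(x_km, dtype=float) / GRID_KM).astype(int)
    gy = np.floor(np.asarray(y_km, dtype=float) / GRID_KM).astype(int)
    return gx, gy


def ring_employment(x_km, y_km, employment, center):
    """Sum employment in each concentric ring around one central grid cell.

    Every firm is treated as if it sat at its grid centroid, so distances
    are centroid-to-centroid distances.
    """
    gx, gy = grid_index(x_km, y_km)
    cx, cy = center
    dist = GRID_KM * np.hypot(gx - cx, gy - cy)
    employment = np.asarray(employment, dtype=float)
    totals = []
    for lo, hi in zip(RING_EDGES_KM[:-1], RING_EDGES_KM[1:]):
        # Assumed convention: ring is [lo, hi), so the central cell
        # (distance 0) falls in the 0-1 km ring.
        in_ring = (dist >= lo) & (dist < hi)
        totals.append(float(employment[in_ring].sum()))
    return totals
```

Because centroids are spaced 2 km apart, the only centroid distance below 1 km is zero, which is why the 0-1 km ring coincides with the central grid cell in this sketch.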
In the first-step regressions, we relate firm birth patterns to concentric ring-level mea-
36Industry-specific employment heat maps are provided in Online Appendix Figure OA1 for a better visual presentation.
37Industry-specific employment concentrations at the grid level are reported in Appendix Table A1.
38Given that our grids are sufficiently refined geographically, the assumption should not lead to severe measurement issues. Our method is still superior to the previous studies using geographically aggregated data since our aggregation is geographically refined and is of regular shape.
39Note that the 0-1 km ring is fully covered by the central grid as the size of the grid is 2 km by 2 km. This will not affect the calculation since we assume all firms in a grid locate at the grid centroid.
sures of localization and urbanization economies. The ring-level localization and urbaniza-
tion measures, our key independent variables, are calculated as the existing concentration of
employment within and outside of each four-digit industry in the five concentric rings around
each grid, respectively. To avoid a large number of zeros, we restrict our sample to only those
grids with either existing economic activities or new firm births within each industry. Table
1b reports the summary statistics on the concentric ring measures of localization and ur-
banization economies. The variation in both localization and urbanization measures is large
across different locations. By construction, the outer rings on average have more employment
because they cover larger ground areas.
The dependent variable, new birth share, is calculated as the percentage of firm births in
a grid out of all new firms for each four-digit industry. Birth share can be interpreted as the
probability of a new firm birth occurring in four-digit industry s in grid c. Mathematically, it
is defined as $s_{s,c} = n_{s,c}/n_s$, where $n_{s,c}$ is the total count of new firms in four-digit industry
s in grid c, and $n_s$ is the total count of new firms in four-digit industry s across all grids.
We define a firm as a new firm if the firm’s opening year is 2008 in the 2008 CEC. Using
birth share as the dependent variable is also consistent with our conceptual framework which
shows that the presence of nearby economic concentrations affects the probability of firm
birth at the central location. Table 1c reports summary statistics on the grid-level new firm
birth share for each industry.40
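As a small worked illustration of the birth share definition (the number of new firms in industry s and grid c divided by the number of new firms in industry s across all grids), the following Python sketch computes the shares from a list of new firms. The function name `birth_shares` and the input layout of (industry, grid) pairs are assumptions made for illustration only.

```python
from collections import Counter


def birth_shares(new_firms):
    """Return {(industry, grid): share}, where share = n_sc / n_s.

    `new_firms` is an iterable of (industry_code, grid_id) pairs,
    one pair per new firm (a hypothetical input layout).
    """
    new_firms = list(new_firms)
    n_sc = Counter(new_firms)                    # new firms per industry-grid cell
    n_s = Counter(ind for ind, _ in new_firms)   # new firms per industry
    return {(ind, grid): count / n_s[ind] for (ind, grid), count in n_sc.items()}
```

By construction, the shares within each industry sum to one, which is what allows them to be read as location probabilities.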
5. Probability of Birth
5.1. Industry-Specific Attenuation
In this section, we present the estimates of distance- and industry-specific agglomeration
economies in explaining the probability of firm birth, as represented in equation (3.1). We
report the ordinary least squares (OLS) and IV results for nine selected manufacturing indus-
tries in Tables 2 and 3. The OLS and IV results for the complete list of twenty-nine industries
are reported in Appendix Tables A2 and A3. The nine selected industries are food production
40Online Appendix Table OA1 reports summary statistics on the grid-level new employment birth share for each industry.
(CIC 14); tobacco processing (CIC 16); furniture manufacturing (CIC 21); printing and record
pressing (CIC 23); medical and pharmaceutical products (CIC 27); transportation equipment
manufacturing (CIC 37); electronic and telecommunications (CIC 40); instruments, meters,
cultural, and official machinery (CIC 41); and artwork and other manufacturing (CIC 42).
We select the industries if they rank at the top in one of the three industry characteristics
that capture the underlying Marshallian forces: the share of transportation costs, the share
of high-skilled labour, and the share of new products.41 We also select two industries that
lead in the share of SOE employment, which highlights a unique feature of the Chinese
economy and may lead to different patterns of agglomeration economies in China. The main
reason for reporting nine industries rather than all of them in the main text is to save space,
but we also think these industries are especially useful examples to help shed light on the
forces behind the industry-specific attenuation patterns of agglomeration economies. Many
of these industries have also been the focus of other studies, such as Rosenthal and Strange
(2001, 2003) and Jofre-Monseny et al. (2011).
The general patterns in Tables 2 and 3 are quite similar.42 For most industries, localization
economies in the proximity of a grid, as captured by nearby concentration of firms in the same
four-digit industry, strongly increase the probability of firm birth in the grid. More important,
the estimated localization economies attenuate rapidly with distance. In fact, the localization
effects completely decay within 5 km for most industries. Urbanization economies, as captured
by nearby concentration of firms in other industries, also have positive impacts on firm births.
However, such positive urbanization effects completely decay or even become negative after 1
km, possibly due to rising congestion costs. These general patterns emerge for most industries
and are consistent with previous findings in Rosenthal and Strange (2003) and Arzaghi and
Henderson (2008).
The results in Tables 2 and 3 also reveal important heterogeneity in the attenuation
41Transportation cost is calculated based on the input-output table of China. High-skilled labour is defined as workers with a college degree or above. The share of new products is calculated as the ratio of new product output to total output, where a new product is defined by ASIF as a product that was not among the firm’s main outputs in the previous year. The share of SOE employment is calculated as the ratio of employment in SOEs to total employment in all firms.
42An extant literature has followed similar approaches to address potential unobserved features associated with city size and also found it to be of small practical importance (Ciccone and Hall, 1996; Combes et al., 2010; de la Roca and Puga, 2017).
patterns of agglomeration economies across industries. We present the IV estimates of lo-
calization economies in Figure 1 (except for the tobacco processing industry), which visually
illustrates this heterogeneity. At first glance, we find that the industries with faster
attenuation of localization effects are usually associated with higher innovative activities or
more college-educated labor (e.g., electronic and telecommunications; instruments, meters,
cultural, and official machinery; and artwork and other manufacturing). We will further in-
vestigate this important correlation in Section 5.3. In addition, we fail to find evidence for
either localization or urbanization economies for the tobacco processing industry, which is
the industry associated with the highest share of SOE employment. The unique nature of
the state sector seems to play an important role in determining the pattern of agglomeration
economies. We will explore the role of the state sector further in Section 7.
Comparing Tables 2 and 3, some small discrepancies exist between the OLS and IV
estimates, which suggest possible presence of omitted variable bias. One discrepancy is that
the IV estimates of localization effects for the 0-1 km ring are lower than the OLS estimates
for industries of food production (CIC 14), furniture manufacturing (CIC 21), medical and
pharmaceutical products (CIC 27), and electronic and telecommunications (CIC 40). This
difference could be explained by the possible presence of unobserved and persistent amenities
affecting both firm births and stocks, which cause OLS estimates to be upward biased. This
explanation is plausible because the role of unobserved and persistent amenities is potentially
important in many industries such as food production, which depends heavily on natural
resources, and furniture manufacturing, which depends on water resources.43
In other instances, the IV estimates of localization effects for the 0-1 km ring are higher
than the OLS estimates, such as for the printing and record pressing industry (CIC 23);
transportation equipment manufacturing industry (CIC 37); instruments, meters, cultural,
and official machinery industry (CIC 41); and artwork and other manufacturing industry
(CIC 42). For those industries, the downward bias caused by insufficient controls for labour
and rental costs may outweigh the upward bias induced by unobserved local amenities in the
OLS estimates. This is plausible given that those industries are relatively capital intensive
43The food production industry and furniture manufacturing industry rank among the highest in the cost of natural resources and water-related costs, respectively.
and rely more on high-skilled labour.
We also note that for several industries (e.g., food production), the IV estimates of lo-
calization effects for the second and third rings tend to be higher than the OLS estimates.
The same pattern is found in Arzaghi and Henderson (2008) for the advertising industry
in Manhattan. Similar to their argument, this opposing effect with IV estimation could be
driven by the spatial correlation in the unobservables. As the unobserved amenities in the
own ring and neighboring rings are likely to be positively correlated, the neighboring rings
may draw firm births away from the own ring, which biases the OLS estimates of localization
effects in the second and third rings toward zero.
Thus, along with the results for the complete list of twenty-nine industries in Appendix
Tables A2 and A3, we find large heterogeneity in the attenuation patterns of agglomeration
economies across industry types.44 Despite small discrepancies, the general patterns from the
OLS and IV estimates are largely similar, which suggests that our OLS estimates are generally
robust. Previous studies that focus on only a few industries are unable to systematically
document this important heterogeneity and relate it to the underlying micro-foundations.
We explore this relationship after evaluating the goodness of fit of various spatial decay
functions in the next subsection.
5.2. Spatial Decay Function
Here, we statistically test for the goodness of fit of nine spatial decay functional forms.
These functions are (1) a negative linear distance function $f(d) = -d$, (2) an inverse linear
distance function $f(d) = 1/d$, (3) an inverse exponential distance function $f(d) = 1/e^{d}$, (4) a
negative square distance function $f(d) = -d^2$, (5) an inverse square distance function $f(d) = 1/d^2$,
(6) an inverse square exponential distance function $f(d) = 1/e^{2d}$, (7) a negative cube distance
function $f(d) = -d^3$, (8) an inverse cube distance function $f(d) = 1/d^3$, and (9) an inverse cube
exponential distance function $f(d) = 1/e^{3d}$.
We estimate equation (3.2) with each of the nine functional forms using OLS and IV
estimates of localization effects obtained from the first-step regressions, respectively. We
44In Online Appendix Tables OA2-OA3, we document distance- and industry-specific localization economies for the sample of single-establishment firms and the sample of non-SOEs. The results are quite similar.
weight the observations based on the inverse of the estimated standard error, while allowing
for the attenuation parameter to vary with each two-digit industry. In the meantime, we
control for industry fixed effects to capture the scale effect of agglomeration economies, as
well as other industry-specific features, such as the extent of its reliance on natural resources.
We then compare the estimated root mean squared errors (RMSE) and mean absolute errors
(MAE) of the estimated models. In principle, the functional form that produces the smallest
RMSE and MAE presents the best fit to the data.
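The comparison of functional forms can be illustrated with a short Python sketch. It is a simplified stand-in, not the authors' estimation: equation (3.2) is not reproduced in this section, so the sketch assumes a single industry and a regression of the first-step ring coefficients on a constant and f(d), it uses weights equal to the inverse standard error as described above, and it requires one representative distance per ring (for example a ring midpoint), which is our assumption. The names `DECAY_FORMS`, `fit_decay`, and `rank_forms` are hypothetical.

```python
import numpy as np

# The nine candidate decay forms listed in the text.
DECAY_FORMS = {
    "negative linear": lambda d: -d,
    "inverse linear": lambda d: 1.0 / d,
    "inverse exponential": lambda d: np.exp(-d),
    "negative square": lambda d: -(d**2),
    "inverse square": lambda d: 1.0 / d**2,
    "inverse square exponential": lambda d: np.exp(-2.0 * d),
    "negative cube": lambda d: -(d**3),
    "inverse cube": lambda d: 1.0 / d**3,
    "inverse cube exponential": lambda d: np.exp(-3.0 * d),
}


def fit_decay(beta, se, dist, form):
    """Weighted least squares of beta on [1, f(dist)] with weights 1/se.

    Returns (coefficients, rmse, mae); the error measures are computed
    on the unweighted residuals (an assumption of this sketch).
    """
    beta, se, dist = (np.asarray(a, dtype=float) for a in (beta, se, dist))
    X = np.column_stack([np.ones_like(dist), form(dist)])
    sw = np.sqrt(1.0 / se)  # WLS: scale each row by the square root of its weight
    coef, *_ = np.linalg.lstsq(X * sw[:, None], beta * sw, rcond=None)
    resid = beta - X @ coef
    rmse = float(np.sqrt(np.mean(resid**2)))
    mae = float(np.mean(np.abs(resid)))
    return coef, rmse, mae


def rank_forms(beta, se, dist):
    """Return the form names ordered from smallest to largest RMSE."""
    rmse = {name: fit_decay(beta, se, dist, f)[1] for name, f in DECAY_FORMS.items()}
    return sorted(rmse, key=rmse.get)
```

The paper's specification additionally lets the attenuation parameter vary by two-digit industry and includes industry fixed effects; extending the design matrix with industry dummies and interactions would be the natural next step in this sketch.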
Table 4 reports the coefficient estimates of industry-specific attenuation slopes based on
an inverse square distance decay function. We will show later that this functional form has
the best goodness of fit to the data. The estimates based on the rest of the functional
forms are reported in Appendix Tables A4.1-A4.2 and in Online Appendix Tables OA4.1-
OA4.6. These estimates are produced based on the IV coefficient estimates from the first-step
regressions. Results based on OLS estimates from the first-step regressions are very similar.
Corresponding tables are available upon request.
Table 5 reports the goodness of fit for all nine functional forms. Specifically, the RMSE
and MAE are provided for model estimations based on four sets of first-step results: OLS
estimation of the birth model, IV estimation of the birth model, OLS estimation of the new
employment model, and IV estimation of the new employment model. Based on the RMSE
and MAE statistics, the fifth functional form specification, $f(d) = 1/d^2$, dominates all others
in the model’s goodness of fit. The second-best functional form based on the same criteria
for the firm birth model is the inverse exponential distance function, $f(d) = 1/e^{d}$, which has
been adopted in various settings including Lucas (2001) and Ahlfeldt et al. (2015).
5.3. Attenuation and Micro-foundations
In this section, we investigate the cross-industry variation in the estimated attenuation
slope and how the variation relates to the underlying micro-foundations of agglomeration
economies. As the inverse square distance decay function presents the best goodness of
fit, we focus on the corresponding estimates in Table 4, although the rest of the functional
forms produce similar patterns. We first highlight that the estimated attenuation speed
parameter is very heterogeneous across industries. For example, the industries of chemical
fibers, artwork and other manufacturing, and rubber products have the fastest attenuation
speed of localization economies, while tobacco processing seems to have non-attenuating
localization economies.
Second and more important, there exists an important empirical relationship between
the attenuation speed of localization economies and industry proxies for Marshallian forces
(the share of transportation costs, the share of high-skilled labour, and the share of new
products). First, we unfold the relationship visually in Figure 2 by presenting kernel density
estimations of the industry-specific attenuation speed parameters stratified based on whether
each industry characteristic is above or below its median.45 The industry characteristics we
consider include the above-mentioned proxies for Marshallian forces, the reliance on natural
resources, and the SOE share.
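The stratified density comparison can be sketched as follows. This is a minimal illustration with assumptions: it implements the Epanechnikov kernel directly in NumPy and takes the bandwidth as a user-supplied number, whereas the paper selects the bandwidth by cross-validation. The function names `epanechnikov_kde` and `split_by_median` are hypothetical.

```python
import numpy as np


def epanechnikov_kde(samples, grid, bandwidth):
    """Kernel density estimate on `grid` with K(u) = 0.75 * (1 - u^2) for |u| <= 1."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return kernel.sum(axis=1) / (samples.size * bandwidth)


def split_by_median(speed, characteristic):
    """Split attenuation speeds by whether the industry characteristic exceeds its median."""
    speed = np.asarray(speed, dtype=float)
    characteristic = np.asarray(characteristic, dtype=float)
    above = characteristic > np.median(characteristic)
    return speed[above], speed[~above]
```

Calling `split_by_median` on the estimated attenuation parameters and one industry characteristic, and then `epanechnikov_kde` on each group over a common grid, yields the two curves compared in each panel.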
As shown in panel 1 of Figure 2, the decay speed of industries that rely more on transporta-
tion (red line) is concentrated at a lower level than industries that rely less on transportation
(blue line). The pattern is consistent with the idea that industries that rely more on trans-
portation are usually more reliant on input sharing as a source of agglomeration economies
and that input sharing is associated with a slower attenuation slope and a larger geographic
scope. As an example, the food production industry (CIC 14) has the highest reliance on
transportation among all industries. As shown in Tables 2-4, for this industry, the scope of
localization economies is among the largest and the attenuation speed is relatively small.
We use the new product ratio in an industry as a proxy for its reliance on knowledge
spillovers. In theory, as knowledge exchange and networking benefits are heavily reliant on
face-to-face contact and extremely localized, knowledge spillovers are usually confined within
a very narrow scope. Indeed, panel 2 of Figure 2 shows that for industries with larger than
median new product ratio (red line), the decay speed parameters are concentrated at a higher
level with a much larger upper bound. For example, transportation equipment manufacturing
and electronic and telecommunications have the first- and second-highest new product ratios,
respectively, and their decay speed of localization effects is relatively fast.
45We use the Epanechnikov kernel density function with the bandwidth selected using cross-validation.
An industry benefits more from labour market pooling if the industry has a high reliance
on skilled labor. Thus, the percentage of college-educated workers in an industry can be used
as a proxy for the importance of labour market pooling to an industry. Panel 3 of Figure
2 shows that the decay speed parameters for industries with a larger than median share of
college-educated workers (red line) are concentrated at a higher level, which indicates the
benefits of labour market pooling decay fast in general. In panels 4 and 5 of Figure 2, we
show that the reliance on natural resources and SOE share seem to play an important role
in an industry’s attenuation pattern. While the impact of an industry’s reliance on natural
resources on the attenuation speed is unclear, a higher than median SOE share (red line) seems to point to a
slower attenuation speed.
Next, we explore the relationship in a regression setup by showing whether an industry’s
attenuation speed changes with proxies for micro-foundations of agglomeration economies
while controlling for an industry’s reliance on natural resources and SOE share.46 We fit the
estimates of localization effects from the first-step OLS and IV regressions into a spatial decay
function and the interaction terms of the decay function with indicator variables to capture
the above-discussed key industry characteristics, as represented in equation (3.3). Those
industry characteristics are chosen in the spirit of Ellison and Glaeser (1999), Rosenthal
and Strange (2001), and Ellison et al. (2010), and the indicator variables are defined in the
empirical framework. Table 6 reports the regression results, with the upper panel using OLS
estimates of localization effects and the lower panel using IV estimates of localization effects.47
The general patterns in Table 6 are consistent with the kernel density estimation results.
In column (1), we only include the interaction terms of the decay function with the indicator
variables that represent high new product ratio, high reliance on natural resources, and high
SOE ratio. The coefficient on the interactive term of the decay function and the high new
product ratio is positive and statistically significant at the 1 percent level in both panels,
which indicates a faster decay of localization economies for industries that rely more heavily
46The logic underlying the specification is in line with the literature on industry co-agglomeration patterns, which emphasizes the role of both natural resources and agglomeration forces in determining industry agglomeration. See, for example, Ellison and Glaeser (1999) and Ellison et al. (2010).
47In Online Appendix Table OA5, we show the same relationship with the estimates of localization economies obtained with the sample of non-SOEs.
on knowledge spillovers. Similarly, in columns (2)-(3), we examine how the attenuation speed
changes with high college-educated worker ratio and high transportation cost. The results
suggest that the decay speed of localization economies is faster for industries relying more on
labor market pooling. The coefficient estimate on the interactive term of the decay function
and high transportation cost is too imprecise for us to draw a conclusion. In column (4),
we include all interaction terms to test the micro-foundations simultaneously. The estimates
become imprecise due to collinearity, but the general patterns still hold. Finally,
in all specifications, the coefficient on the interactive term of the decay function and high
SOE ratio is negative and statistically significant, which indicates a slower decay speed of
localization economies for industries with a higher SOE share.
5.4. Comparison with Rosenthal and Strange (2003)
We now contrast our estimates of industry-specific attenuation of agglomeration economies
with estimates from Rosenthal and Strange (2003) to reveal differences in the underlying
micro-foundations of agglomeration economies between China and the United States. Rosen-
thal and Strange (2003) estimate the determinants of firm entry for six industries: software
(SIC 7371, 7372, 7373, and 7375), food processing (SIC 20), apparel (SIC 23), printing and
publishing (SIC 27), fabricated metals (SIC 34), and industrial and commercial machinery
(SIC 35). Except for the software industry, we have corresponding estimates for the other five
manufacturing industries reported in their study. Our similar industries are food production
(CIC 14), garments and other fiber products (CIC 18), printing and publishing (CIC 23),
metal products (CIC 34), and machinery and equipment manufacturing (CIC 35).
In the upper panel of Figure 3, we plot the estimated attenuation patterns for the five
manufacturing industries studied by Rosenthal and Strange (2003). In the middle panel, we
plot the same estimates but exclude the apparel industry on a different scale for easier com-
parison. We present, in the lower panel, the attenuation patterns of the Chinese counterparts
based on our estimates. To allow for easy comparison and interpretation, the vertical axis in
each figure is set such that the magnitude of the spillover benefits in the machinery industry
within the first ring in the corresponding study is equal to one; all other spillover effects are
measured relative to this value.48 The horizontal axis measures the spatial distance between
firms in the same industry, but the scale and the measurement unit vary between the two
studies. In Rosenthal and Strange (2003), as shown in the upper panel, the four concentric
rings represent 0-1 miles (0-1.6 km), 1-5 miles (1.6-8 km), 5-10 miles (8-16 km), and 10-
15 miles (16-24 km). In our setting, as shown in the lower panel, the five concentric rings
represent 0-1 km, 1-5 km, 5-10 km, 10-20 km, and 20-30 km.
Two interesting patterns emerge. First, the attenuation of agglomeration economies in
the apparel industry (SIC 23) in the United States is very fast, while the corresponding in-
dustry in China (CIC 18) has a much slower spatial decay of agglomeration economies. The
contrast implies that the apparel industry in the United States is more reliant on knowledge
spillovers or labour pooling mechanism than its Chinese counterpart. This is plausible as
the apparel industry in the United States could be more directly engaged in designing and
advertising, which require more idea sharing and networking. The apparel industry in China
focuses more on processing and manufacturing, which depend less on knowledge spillovers. Second,
for food products, fabricated metal, and machinery, the spatial decay tends to be slower in
the United States than in China, after adjusting for the different scales and distance units used
in both panels. This variation could be driven by the different extent of the transportation
infrastructure build-up in the two countries at the specific time periods.49 If the local trans-
portation system is not well-developed enough to allow for easy collaboration with nearby
firms at fair distances, the agglomeration economies would be restrained within a short range.
6. TFP and Other Correlates
While relying on firm birth patterns to identify agglomeration benefits has its advantages,
studying the impact of agglomeration economies on other productivity measures, such as
TFP and wages, has the advantage of easier interpretation.50 In a competitive
48The magnitude of the chosen base coefficient is actually very comparable across the two studies, despite a slight variation in the definition of the first rings. The impact of the first-ring localization economies for machinery in the United States is 6.35e-05 and the corresponding coefficient in China is 6.03e-05.
49The localized transportation systems (for example, highways) in China in 2008 were not as extensively developed as those in the United States in 1997 (the sample year in Rosenthal and Strange (2003)).
50The advantage of examining firms’ birth patterns is that the location choice of a new firm is more sensitive to agglomeration benefits than the changes in wages and TFP of existing firms, which can be constrained by
equilibrium, an increase in firm TFP or workers’ productivity, measured by nominal wage,
directly reflects the gains of agglomeration economies (Ciccone and Hall, 1996; Henderson,
2003; Combes et al., 2010; Glaeser et al., 2010). These measures also differ in the way that
the agglomeration benefits captured by TFP do not reflect effects related to land and housing
prices, which nevertheless contribute to wage differentials (Combes and Gobillon, 2015). This
is an advantage for focusing on TFP as opposed to wages, but the downside is that TFP is
not directly observable in data sets and the estimation of TFP could suffer from omitted
variable bias.51
In this section, we explore the attenuation of agglomeration economies using these al-
ternative productivity indicators. In columns (2)-(4) of Table 7, we report the impact of
localization effects at various distances on TFP, output per worker, and wages per worker
– with all industries pooled together. For comparison purposes, in column (1) of Table 7,
we also pool all industries together and estimate distance-specific localization impact on the
probability of firm birth. The general attenuation patterns appear for all four measures, even
though the implied attenuation speed and the spatial scope tend to be different. The largest
localization effects are always found in the 0-1 km ring, followed by localization effects in the
1-5 km ring. Whether the impact extends beyond 5 km depends on the specific measure of
productivity gains. In column (1), localization in the 5-10 km ring, but not those further
away, significantly contributes to the birth rates of new firms. In columns (2)-(3), the local-
ization effects on TFP and output per worker become largely insignificant or even negative
beyond the 5 km radius. In column (4), the wage effects of localization economies extend to a
much larger geographic scope with the coefficient on the 20-30 km ring remaining statistically
significant.
The variation in the attenuation slope and the corresponding spatial scope could be linked
to the nature of agglomeration economies implied by various productivity measures. For easy
previous choices such as investments in capital. The drawback, however, is that firms’ profits depend not only on productivity but also on input and output quantity which are themselves influenced by agglomeration effects.
51The economic interpretation of the impact of agglomeration economies is not the same for TFP and wages. The elasticity obtained from the TFP regressions needs to be multiplied by one over the share of labour to be directly comparable to that from wage regressions. In this paper, we focus on comparing the coefficients across different rings of the same measure as opposed to cross-regression comparisons.
comparison and interpretation across rings, we normalize the magnitude of the first-ring
coefficient as one in Table 7 and plot the attenuation patterns for all measures in Figure
4. Three interesting patterns emerge. First, as workers are highly mobile within the same
labor market, wages could be highly spatially correlated within a city, which leads to a slower
spatial attenuation. Second, firm entry decisions are affected by considerations of various
Marshallian forces and local cost factors. Since cost factors, such as wage, are affected by the
presence of employment at outer rings, the spatial attenuation documented with firm birth
patterns is faster than that captured using other productivity correlates. Third, unlike wage
and output per worker, TFP does not reflect the effects of land rental costs; the attenuation
pattern documented using TFP is flatter than those documented using wage and output per
worker.
In Table 8, we explore the role of the “Chinitz” effect, which is an important feature
of industrial organization that has been emphasized in Chinitz (1961), Vernon (1960), and
Jacobs (1969), as well as more recently in Rosenthal and Strange (2003) and Faggio et al.
(2017). The idea is that, relative to big firms, small firms are more effective in fostering
an innovative and collaborative community and, thus, are more important in contributing
to agglomeration economies and in enhancing nearby economic productivity. To test this
idea, we separately identify the impact of concentrations of small and big firms on the four
productivity measures.52 We find that the birth of new firms is more likely to be enhanced
by nearby employment concentrations of small firms in the 0-1 km and 1-5 km rings, but the
impact of employment concentration of big firms is slightly more pronounced in the 5-10 km
ring. More important, the implied attenuation speed of localization economies from small
firms is faster, which suggests that knowledge spillovers could be the main mechanism of
agglomeration economies for small firms facilitating the “Chinitz” effect.
In contrast to the findings for firm births, we find that the remaining three productivity measures of
firms are more affected by the presence of nearby big firms. We think the main reason for this
finding is the sample selection in ASIF and the heterogeneous effect in firm size. As explained
in the data section, the remaining three productivity measures are calculated using the ASIF,
52Small firms are defined as those with employment size below the 10th percentile of all firms in the same industry.
which only covers private firms with annual sales exceeding five million yuan and all SOEs.
In other words, in columns (2)-(4) of Table 8, we find that the productivity of relatively big
firms is more affected by the presence of nearby big firms, which is plausible because labour
pooling and input sharing are generally more common among firms of comparable sizes. The
finding is consistent with evidence in Bloom et al. (2013) that suggests that larger firms have
a bigger gap between social and private returns.53
7. SOEs versus Non-SOEs
To investigate how the attenuation of agglomeration economies interplays with the large
presence of the state sector in China, we separately identify the impact of concentrations of
SOEs and non-SOEs at different distances on various outcome variables for all firms, SOEs,
and private firms, respectively. The results are presented in Table 9. Columns (1)-(3) focus
on the birth of new firms, with the first column pooling all firms together and columns (2)-
(3) looking at SOEs and non-SOEs separately. In a similar format, columns (4)-(12) report
the impact on firms’ TFP, output per worker, and wages per worker. The attenuation of
agglomeration economies within and between ownership sectors is plotted in Figure 5 for
better visual presentation and comparison.
An initial examination of columns (1)-(3) shows that the agglomeration spillovers within
and across firms of different ownership types follow attenuation patterns consistent with those we
found earlier. More important, the results on the birth of new firms reveal large heterogeneity
in how firms of different ownership types contribute to agglomeration economies and in the
cross-ownership agglomeration spillovers. First, pooling all firms together, column (1) shows
that on average the probability of firm birth is more responsive to the presence of private
firms, in both magnitude and geographic scope, than to that of SOEs. For instance,
the impact of non-SOE employment in the 0-1 km ring is about four times as large as the
impact of SOE employment in the same ring. In terms of scope, the impact of non-SOE
firms extends to 10 km, while the SOE impact is confined within 1 km. Overall, the results
53 Bloom et al. (2013) argue that larger firms generate more spillovers since they have a higher level of connectivity with other firms in the technology space.
imply that private firms contribute to local agglomeration economies more than SOEs, which
is consistent with our prior that private firms are more efficient and market-oriented.
Second, focusing on the birth of SOEs and non-SOEs separately, columns (2) and (3)
reveal that firms benefit more from the employment concentrations of their own ownership
type. In column (2), the birth rate of SOEs is more responsive to close-range SOE concentrations,
but is less affected by nearby non-SOE firms. For example, the impact of SOE employment
in the 0-1 km ring on SOE birth probability is six times as large as the impact of non-
SOE employment in the same ring. The results could be driven by the possibility that the
government tries to internalize potential spillover effects within SOEs when making location
choices for new SOEs. It could also be because SOEs are more likely to share labour, inputs,
and knowledge with each other, which produces larger externalities within the state sector.
In contrast, column (3) suggests that the birth rate of private firms is much more affected
by the presence of nearby private firms than by that of SOEs. The impact of private
firms is larger in magnitude and extends out to more than 10 km, while the impact of SOEs is
smaller and confined within 0-1 km. The evidence implies that private firms generate higher
productivity spillovers to other private firms nearby than to SOEs. Overall, the results suggest
that the pattern in column (1) is driven by the strong within-sector agglomeration spillovers
that are revealed in columns (2)-(3).
We also find that there exist stronger within-ownership-type agglomeration effects than
cross-ownership-type spillovers when we focus on the impact on existing firms’ TFP and
output per worker. Private firms have almost no localization effects on SOEs’ productivity,
while they have an especially pronounced effect on private firms’ productivity. SOEs show
significant localization effects on both SOEs and private firms, with the effect within 0-1 km
being positive and statistically significant only on SOEs. One difference from the results for
the birth rate is that the presence of SOEs also benefits private firms and this effect takes
place at a relatively large geographic scope. We do not have a strong prior to interpret this
finding, but we suspect that this could be driven by potential benefits for private firms to
work and partner with the stable and sizable state sector to share inputs.
We then focus on the wage effects on the existing firms in columns (9)-(12). Pooling firms
together, wages per worker are more affected by the presence of non-SOEs than SOEs. The
effect from private firms is larger in both magnitude and geographic scope. This could
be driven by the fact that private firms generate larger positive spillovers that
raise labour productivity on average. Compared to the presence of nearby SOEs, private
firms have a larger impact on wages in both the state sector and the private sector. Given
that the impact on SOEs’ wages is of a smaller geographic scope than the impact on private
firms’ wages, the former could mainly be driven by knowledge spillovers while the latter
could be driven by a combination of knowledge sharing, labour pooling, and input sharing.
SOEs’ impact on wages is relatively small and confined within a small geographic scope,
especially on wages in the private sector. This is consistent with the fact that the labour
flow from SOEs to non-SOEs is very limited in China. In particular, the fast attenuation of
SOEs’ spillover impact on private firms’ wages suggests that knowledge spillovers could be
the main Marshallian force at work.
8. Conclusion
Taking advantage of comprehensive geocoded administrative data sets, we examine a
complete list of twenty-nine Chinese manufacturing industries to estimate industry-specific
spatial attenuation speed of agglomeration economies and its interplay with the large pres-
ence of the state sector in China. We find that agglomeration economies attenuate sharply
with geographic distance. More interesting, there is notable heterogeneity across industry
and ownership types. Concentrations of private firms produce greater agglomeration benefits,
attracting more new arrivals and inducing larger increases in other productivity measures,
than their state counterparts. This pattern is consistent with the argument that a more
entrepreneurial and more market-driven industrial system is more conducive to economic
growth. We also find that SOEs and non-SOEs benefit more from their own-type concentrations,
which suggests that agglomeration forces are significantly inhibited across ownership
types in China. It is also consistent with the fact that SOEs are capable of internalizing
potential spillovers.
The heterogeneity across industries is further explored and linked with the micro-foundations
of agglomeration economies: our nuanced full-spectrum analysis of this heterogeneity at refined
geographical levels allows us to statistically evaluate the underlying micro-foundations
that govern the spatial patterns of agglomeration economies. We find that agglomeration
benefits dissipate faster in industries more reliant on knowledge spillovers or labour market
pooling but slower in industries more reliant on input sharing or with a higher share of SOEs.
With the detailed estimates of industry- and distance-specific agglomeration economies for
all manufacturing industries, we also test for the goodness of fit of various spatial decay
functional forms. We find that the inverse square distance decay function presents the best
goodness of fit among the tested functional forms.
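To illustrate how such a goodness-of-fit comparison can be set up, the following minimal sketch fits three candidate decay forms to the five ring-level IV Lasso localization estimates for Food Production (CIC 14) reported in Table 3. It is not the estimation of equation (3.2): it is unweighted, uses a single industry, and represents each ring by its midpoint distance, which is an assumption made here for illustration. The inverse-distance and exponential forms are illustrative alternatives and need not coincide with the set of forms tested in the paper.

```python
import numpy as np

# Assumption: each concentric ring is represented by its midpoint distance (km).
ring_midpoints_km = np.array([0.5, 3.0, 7.5, 15.0, 25.0])

# IV Lasso localization estimates for Food Production (CIC 14), Table 3, column (1).
beta = np.array([6.77e-05, 6.98e-06, 1.05e-05, -8.15e-07, 2.25e-06])

# Candidate decay functions (illustrative set; names are hypothetical).
decay_forms = {
    "inverse_square": lambda d: 1.0 / d**2,
    "inverse": lambda d: 1.0 / d,
    "exponential": lambda d: np.exp(-d),
}

def fit_decay(d, y, f):
    """OLS of y on a constant and f(d); returns (intercept, slope, R-squared)."""
    X = np.column_stack([np.ones_like(d), f(d)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return float(coef[0]), float(coef[1]), 1.0 - ss_res / ss_tot

results = {name: fit_decay(ring_midpoints_km, beta, f)
           for name, f in decay_forms.items()}

for name, (intercept, slope, r2) in results.items():
    print(f"{name:>15}: slope={slope:.3e}, R^2={r2:.3f}")
```

In this sketch the slope on the decay term plays the role of an industry-specific decay speed, and the R-squared values give a crude ranking of the functional forms for one industry only.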
The revealed systematic evidence on the spatial attenuation speed of agglomeration economies
not only offers generic guidance for theoretical models but also bears important policy implications.
Our results suggest that place-based policies aimed at boosting local agglomeration
economies (for example, the industrial parks policy in China) should take into consideration
the different attenuation speeds of agglomeration economies across industry and ownership
types. Moreover, policy makers may consider ways to improve the connections between
SOEs and non-SOEs and to facilitate stronger agglomeration economies across ownership
types. In future research, it would be interesting to investigate whether the spatial attenuation
patterns evolve over time using data sets that span a longer time horizon. It would also
be interesting to study how the agglomeration attenuation patterns in the service industries
may differ from those in the current manufacturing setting.
References
Acemoglu, D. (1996). A Microfoundation for Social Increasing Returns in Human Capital
Accumulation. The Quarterly Journal of Economics, 111(3):779–804.
Ackerberg, D. A., Caves, K., and Frazer, G. (2015). Identification Properties of Recent
Production Function Estimators. Econometrica, 83(6):2411–2451.
Ahlfeldt, G. M., Redding, S. J., Sturm, D. M., and Wolf, N. (2015). The Economics of
Density: Evidence From the Berlin Wall. Econometrica, 83(6):2127–2189.
Amemiya, T. (1974). The nonlinear two-stage least-squares estimator. Journal of Economet-
rics.
Andrews, I., Stock, J. H., and Sun, L. (2019). Weak Instruments in Instrumental Variables
Regression: Theory and Practice. Annual Review of Economics, 11(1):727–753.
Arzaghi, M. and Henderson, J. V. (2008). Networking off Madison Avenue. Review of
Economic Studies, 75:1011–1038.
Au, C.-C. and Henderson, J. V. (2006). How migration restrictions limit agglomeration and
productivity in China. Journal of Development Economics, 80:350–388.
Belloni, A., Chen, D. L., Chernozhukov, V., and Hansen, C. (2012). Sparse Models and
Methods for Optimal Instruments With an Application to Eminent Domain. Econometrica,
80(6):2369–2429.
Berliant, M., Peng, S. K., and Wang, P. (2002). Production Externalities and Urban Config-
uration. Journal of Economic Theory, 104(2):275–303.
Bloom, N., Schankerman, M., and Reenen, J. V. (2013). Identifying Technology Spillovers
and Product Market Rivalry. Econometrica, 81(4):1347–1393.
Brandt, L., Van Biesebroeck, J., Wang, L., and Zhang, Y. (2017). WTO Accession and
Performance of Chinese Manufacturing Firms. American Economic Review, 107(9):2784–
2820.
Brandt, L., Van Biesebroeck, J., and Zhang, Y. (2012). Creative Accounting or Creative
Destruction? Firm-level Productivity Growth in Chinese Manufacturing. Journal of De-
velopment Economics, 97(2):339–351.
Cao, X., Lemmon, M., Pan, X., Qian, M., and Tian, G. (2019). Political Promotion, CEO
Incentives, and the Relationship Between Pay and Performance. Management Science,
65(7):2947–2965.
Chamberlain, G. (1987). Asymptotic efficiency in estimation with conditional moment re-
strictions. Journal of Econometrics, 34(3):305–334.
Chauvin, J. P., Glaeser, E., Ma, Y., and Tobio, K. (2017). What Is Different About Urban-
ization In Rich and Poor Countries? Cities in Brazil, China, India and the United States.
Journal of Urban Economics, 98:17–49.
Chernozhukov, V., Hansen, C., and Spindler, M. (2015). Post-Selection and Post-
Regularization Inference in Linear Models with Many Controls and Instruments. American
Economic Review, 105(5):486–490.
Chinitz, B. (1961). Contrasts in Agglomeration: New York and Pittsburgh. The American
Economic Review, 51(2):279–289.
Ciccone, A. and Hall, R. E. (1996). Productivity and the Density of Economic Activity.
American Economic Review, 86(1):54–70.
Combes, P.-P., Demurger, S., and Li, S. (2015). Migration Externalities in Chinese Cities.
European Economic Review, 76:152–167.
Combes, P.-P., Duranton, G., and Gobillon, L. (2008). Spatial Wage Disparities: Sorting
Matters! Journal of Urban Economics, 63(2):723–742.
Combes, P.-P., Duranton, G., Gobillon, L., Puga, D., and Roux, S. (2012). The productivity
advantages of large cities: distinguishing agglomeration from firm selection. Econometrica,
80(6):2543–2594.
Combes, P.-P., Duranton, G., Gobillon, L., and Roux, S. (2010). Estimating Agglomeration
Economies with History, Geology and Worker Effects. In Agglomeration economics.
Combes, P.-P. and Gobillon, L. (2015). The Empirics of Agglomeration Economies. In
Handbook of Regional and Urban Economics, pages 247–348.
Costa, D. L. and Kahn, M. E. (2000). Power couples: Changes in the locational choice of the
college educated, 1940-1990. Quarterly Journal of Economics, 115(4):1287–1315.
Davis, D. R. and Dingel, J. I. (2019). A spatial knowledge economy. American Economic
Review, 109(1):153–170.
de la Roca, J. and Puga, D. (2017). Learning by working in big cities. Review of Economic
Studies, 84(1):106–142.
Dingel, J. I., Miscio, A., and Davis, D. R. (2019). Cities, Lights, and Skills in Developing
Economies. Journal of Urban Economics, page 103174.
Diodato, D., Neffke, F., and O’Clery, N. (2018). Why do industries coagglomerate? How
Marshallian externalities differ by industry and have evolved over time. Journal of Urban
Economics, 106:1–26.
Djankov, S. and Murrell, P. (2002). Enterprise Restructuring in Transition: A Quantitative
Survey. Journal of Economic Literature, 40(3):739–792.
Duranton, G. (2016). Agglomeration Effects in Colombia. Journal of Regional Science,
56(2):210–238.
Duranton, G. and Puga, D. (2004). Micro-Foundations of urban agglomeration economies.
Handbook of regional and urban economics.
Ellison, G. and Glaeser, E. L. (1999). The Geographic Concentration of Industry: Does
Natural Advantage Explain Agglomeration? American Economic Review, 89(2):311–316.
Ellison, G., Glaeser, E. L., and Kerr, W. R. (2010). What causes industry agglomeration?
Evidence from coagglomeration patterns. American Economic Review, 100(3):1195–1213.
Estrin, S., Hanousek, J., Kocenda, E., and Svejnar, J. (2009). The Effects of Privatization
and Ownership in Transition Economies. Journal of Economic Literature, 47(3):699–728.
Faggio, G., Silva, O., and Strange, W. C. (2017). Heterogeneous Agglomeration. Review of
Economics and Statistics, 99(1):80–94.
Fang, L. H., Lerner, J., and Wu, C. (2017). Intellectual Property Rights Protection, Ownership,
and Innovation: Evidence from China. Review of Financial Studies, 30(7):2446–2477.
Frank, L. E. and Friedman, J. H. (1993). A Statistical View of Some Chemometrics Regression
Tools. Technometrics, 35(2):109–135.
Fu, S. (2007). Smart Cafe Cities: Testing human capital externalities in the Boston metropoli-
tan area. Journal of Urban Economics, 61(1):86–111.
Fujita, M. and Ogawa, H. (1982). Multiple Equilibria and Structural Transition of Non-
monocentric Urban Configurations. Regional Science and Urban Economics, 12(2):161–
196.
Gaubert, C. (2018). Firm Sorting and Agglomeration. American Economic Review,
108(11):3117–3153.
Ge, S. and Yang, D. T. (2014). Changes in China’s Wage Structure. Journal of the European
Economic Association, 12(2):300–336.
Ge, Y. (2009). Globalization and Industry Agglomeration in China. World Development,
37(3):550–559.
Glaeser, E. and Kerr, W. R. (2009). Local Industrial Conditions and Entrepreneurship.
Journal of Economics & Management Strategy, 18(3):623–663.
Glaeser, E. L., Kallal, H. D., Scheinkman, J. A., and Shleifer, A. (1992). Growth in Cities.
Journal of Political Economy, 100(6):1126–1152.
Glaeser, E. L., Kerr, W. R., and Ponzetto, G. A. M. (2010). Clusters of entrepreneurship.
Journal of Urban Economics, 67(1):150–168.
Glaeser, E. L., Kominers, S. D., Luca, M., and Naik, N. (2018). Big Data and Big Cities:
The Promises and Limitations of Improved Measures of Urban Life. Economic Inquiry,
56(1):114–137.
Glaeser, E. L. and Mare, D. C. (2001). Cities and Skills. Journal of Labor Economics,
19(2):316–342.
Greenstone, M., Hornbeck, R., and Moretti, E. (2010). Identifying Agglomeration
Spillovers: Evidence from Winners and Losers of Large Plant Openings. Journal of Political
Economy, 118(3):536–598.
Henderson, J. (2003). Marshall’s scale economies. Journal of Urban Economics, 53(1):1–28.
Henderson, J. V., Kuncoro, A., and Turner, M. (1995). Industrial Development in Cities.
Journal of Political Economy, 103(5).
Holmes, T. J. (1999). Localization of industry and vertical disintegration. Review of Eco-
nomics and Statistics, 81(2):314–325.
Holz, C. A. (2018). The Unfinished Business of State-owned Enterprise Reform in the People’s
Republic of China. SSRN Electronic Journal, (December).
Hsieh, C.-T. and Song, Z. M. (2015). Grasp the Large, Let Go of the Small: The Transformation
of the State Sector in China. Brookings Papers on Economic Activity, pages
295–346.
Huang, Z., Li, L., Ma, G., and Xu, L. C. (2017). Hayek, Local Information, and Commanding
Heights: Decentralizing State-owned Enterprises in China. American Economic Review,
107(8):2455–2478.
Jacobs, J. (1969). The Economy of Cities. New York: Vintage.
Jofre-Monseny, J., Marín-López, R., and Viladecans-Marsal, E. (2011). The Mechanisms of
Agglomeration: Evidence from the Effect of Inter-industry Relations on the Location of
New Firms. Journal of Urban Economics, 70(2-3):61–74.
Krugman, P. (1991). Increasing Returns and Economic Geography. Journal of Political
Economy, 99(3):483–499.
Leeb, H. and Pötscher, B. M. (2008). Sparse Estimators and the Oracle Property, or the
Return of Hodges’ Estimator. Journal of Econometrics, 142(1):201–211.
Levinsohn, J. and Petrin, A. (2003). Estimating Production Functions Using Inputs to
Control for Unobservables. Review of Economic Studies, 70(2):317–341.
Li, J. (2014). The influence of state policy and proximity to medical services on health
outcomes. Journal of Urban Economics, 80:97–109.
Liu, S. (2015). Spillovers from universities: Evidence from the land-grant program. Journal
of Urban Economics, 87:25–41.
Loecker, J. D. and Warzynski, F. (2012). Markups and Firm-Level Export Status. American
Economic Review, 102(6):2437–2471.
Lu, J. and Tao, Z. (2009). Trends and determinants of China’s industrial agglomeration.
Journal of Urban Economics, 65(2):167–180.
Lucas, R. E. (2001). Externalities and Cities. Review of Economic Dynamics.
Lucas, R. E. and Rossi-Hansberg, E. (2002). On the Internal Structure of Cities. Economet-
rica, 70(4):1445–1476.
Marshall, A. (1920). Principles of Economics. London: Macmillan.
Megginson, W. L. and Netter, J. M. (2001). From State to Market: A Survey of Empirical
Studies on Privatization. Journal of Economic Literature, 39(2):321–389.
Meng, X. (2012). Labor Market Outcomes and Reforms in China. Journal of Economic
Perspectives, 26(4):75–102.
Moretti, E. (2019). The Effect of High-Tech Clusters on the Productivity of Top Inventors.
NBER Working Paper Series, (12610):No. 26270.
Newey, W. K. (1990). Efficient Instrumental Variables Estimation of Nonlinear Models.
Econometrica.
Ogawa, H. and Fujita, M. (1980). Equilibrium Land Use Patterns in a Nonmonocentric City.
Journal of Regional Science, 20(4):455–475.
Olley, G. S. and Pakes, A. (1996). The Dynamics of Productivity in the Telecommunications
Equipment Industry. Econometrica, 64(6):1263.
Roback, J. (1982). Wages, Rents, and the Quality of Life. Journal of Political Economy,
90(6):1257–1278.
Rosen, H. S. (1979). Housing Decisions and the US Income Tax: An Econometric Analysis.
Journal of Public Economics, 11(1):1–23.
Rosenthal, S. S. and Strange, W. C. (2001). The Determinants of Agglomeration. Journal of
Urban Economics, 50(2):191–229.
Rosenthal, S. S. and Strange, W. C. (2003). Geography, Industrial Organization, and Ag-
glomeration. Review of Economics and Statistics, 85(May):377–393.
Rosenthal, S. S. and Strange, W. C. (2004). Evidence on the Nature and Sources of Agglom-
eration Economies. Handbook of regional and urban economics, 4:2119–2171.
Rosenthal, S. S. and Strange, W. C. (2005). The Geography of Entrepreneurship in the New
York Metropolitan Area. Federal Reserve Bank of New York Economic Policy Review,
11:29–53.
Rosenthal, S. S. and Strange, W. C. (2008). The Attenuation of Human Capital Spillovers.
Journal of Urban Economics, 64(2):373–389.
Rosenthal, S. S. and Strange, W. C. (2019). How Close is Close? The Spatial Reach of
Agglomeration Economies. Working paper, 53(9):1689–1699.
Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal
Statistical Society. Series B (Methodological), 58(1):267–288.
Tombe, T. and Zhu, X. (2019). Trade, Migration, and Productivity: A Quantitative. Amer-
ican Economic Review, 109(5):1843–1872.
Vernon, R. (1960). Metropolis 1985. Harvard University Press.
Zhu, X. (2012). Understanding China’s Growth: Past, Present, and Future. Journal of
Economic Perspectives, 26(4):103–124.
Table 1a. Industry Concentrations
Industry (CIC code and name): top five cities by share of firms

CIC 13  Food Processing: Weihai (3.80%), Yantai (3.40%), Qingdao (3.19%), Linyi (3.18%), Weifang (3.04%)
CIC 14  Food Production: Shanghai (3.93%), Guangzhou (2.57%), Beijing (2.53%), Quanzhou (1.95%), Chengdu (1.94%)
CIC 15  Beverage Production: Yibin (3.46%), Beijing (2.46%), Xuzhou (2.22%), Suqian (1.81%), Luzhou (1.81%)
CIC 16  Tobacco Processing: Changsha (10.83%), Kunming (6.99%), Guiyang (6.27%), Zhengzhou (5.05%), Hefei (3.40%)
CIC 17  Textile Industry: Suzhou (6.45%), Shaoxing (5.05%), Binzhou (3.47%), Ningbo (3.26%), Hangzhou (3.25%)
CIC 18  Garments & Other Fiber Products: Quanzhou (5.32%), Suzhou (5.17%), Shanghai (5.00%), Guangzhou (4.09%), Ningbo (3.94%)
CIC 19  Leather, Furs, Down & Related Products: Quanzhou (12.81%), Dongguan (10.00%), Wenzhou (7.97%), Guangzhou (5.77%), Zhongshan (5.04%)
CIC 20  Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products: Xuzhou (5.32%), Suqian (5.12%), Linyi (5.06%), Nanping (3.55%), Heze (2.98%)
CIC 21  Furniture Manufacturing: Dongguan (9.96%), Fuoshan (5.71%), Shanghai (5.57%), Shenzhen (5.36%), Zhongshan (3.49%)
CIC 22  Papermaking & Paper Products: Dongguan (4.21%), Hangzhou (3.88%), Shenzhen (3.83%), Suzhou (3.30%), Shanghai (2.49%)
CIC 23  Printing & Record Pressing: Shanghai (6.35%), Shenzhen (6.25%), Beijing (6.00%), Dongguan (5.44%), Fuoshan (4.91%)
CIC 24  Stationery, Educational & Sports Goods: Dongguan (15.39%), Shenzhen (8.44%), Ningbo (5.30%), Guangzhou (5.17%), Zhongshan (5.02%)
CIC 25  Petroleum Processing, Coking Products, Gas Production & Supply: Luliang (4.21%), Linfen (3.90%), Daqing (3.71%), Taiyuan (3.68%), Zibo (3.54%)
CIC 26  Raw Chemical Materials & Chemical Products: Shanghai (3.21%), Suzhou (2.67%), Guangzhou (2.19%), Zibo (2.11%), Tianjin (1.85%)
CIC 27  Medical & Pharmaceutical Products: Shanghai (3.50%), Shijiazhuang (3.46%), Beijing (3.25%), Haerbin (2.54%), Tianjin (2.44%)
CIC 28  Chemical Fibers: Suzhou (12.54%), Wuxi (7.92%), Jiaxing (7.33%), Hangzhou (6.45%), Shaoxing (5.73%)
CIC 29  Rubber Products: Qingdao (5.26%), Shanghai (4.78%), Dongguan (4.63%), Suzhou (4.18%), Guangzhou (3.90%)
CIC 30  Plastic Products: Dongguan (10.37%), Shenzhen (7.99%), Shanghai (5.41%), Suzhou (4.87%), Fuoshan (3.13%)
CIC 31  Non-metal Mineral Products: Fuoshan (3.53%), Zibo (3.30%), Quanzhou (3.27%), Zhengzhou (2.61%), Chongqing (2.17%)
CIC 32  Smelting & Pressing of Ferrous Metals: Tangshan (6.41%), Anshan (4.42%), Wuhan (3.90%), Tianjin (3.55%), Handan (3.45%)
CIC 33  Smelting & Pressing of Nonferrous Metals: Fuoshan (4.05%), Honghe (3.18%), Yantai (3.08%), Yuncheng (2.73%), Yingtan (2.66%)
CIC 34  Metal Products: Shanghai (6.52%), Fuoshan (4.73%), Shenzhen (4.41%), Suzhou (3.94%), Jiangmen (3.45%)
CIC 35  Machinery & Equipment Manufacturing: Shanghai (6.36%), Ningbo (4.31%), Suzhou (3.56%), Wuxi (3.50%), Dalian (3.40%)
CIC 36  Special Equipment Manufacturing: Shanghai (5.54%), Suzhou (4.98%), Shenzhen (4.46%), Dongguan (3.15%), Wuxi (3.06%)
CIC 37  Transportation Equipment Manufacturing: Chongqing (7.86%), Shanghai (5.90%), Changchun (3.31%), Tianjin (2.99%), Guangzhou (2.98%)
CIC 39  Electric Equipment & Machinery: Shenzhen (7.84%), Dongguan (5.90%), Fuoshan (5.67%), Shanghai (5.09%), Ningbo (4.97%)
CIC 40  Electronic & Telecommunications: Shenzhen (19.86%), Suzhou (15.64%), Dongguan (9.08%), Shanghai (6.00%), Huizhou (3.51%)
CIC 41  Instruments, Meters, Cultural & Official Machinery: Shenzhen (10.07%), Dongguan (5.95%), Shanghai (5.72%), Suzhou (4.22%), Ningbo (4.14%)
CIC 42  Artwork & Other Manufacturing: Quanzhou (8.70%), Shenzhen (5.15%), Guangzhou (4.99%), Taizhou (4.76%), Jinhua (4.31%)

1 For each two-digit industry and each city, the number in parentheses is calculated as ns,c/ns, where ns,c is the total number of firms in industry s in city c, and ns is the total number of firms in industry s. The top five cities with the largest percentage of firms in the respective industry are listed.
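The share in the note above can be sketched as follows. This is a minimal illustration on hypothetical toy records, not the paper's data or code; the function name `top_city_shares` and the tie-breaking rule (alphabetical by city) are assumptions introduced here.

```python
from collections import Counter

def top_city_shares(firms, industry, k=5):
    """Return the k cities with the largest n_{s,c} / n_s for industry s.

    firms: iterable of (two_digit_industry_code, city) pairs, one per firm.
    """
    counts = Counter(city for ind, city in firms if ind == industry)
    n_s = sum(counts.values())  # total number of firms in industry s
    if n_s == 0:
        return []
    # Sort by firm count (descending); ties broken alphabetically (assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(city, n_sc / n_s) for city, n_sc in ranked[:k]]

# Hypothetical toy records, for illustration only.
firms = [(13, "Weihai"), (13, "Weihai"), (13, "Yantai"),
         (13, "Qingdao"), (14, "Shanghai")]
shares = top_city_shares(firms, industry=13)
for city, share in shares:
    print(f"{city} ({share:.2%})")
```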
Table 1b. Summary Statistics - Concentric Ring Measures of Existing Employment
Name                    N            Mean        Std Dev

Localization Measures
0 - 1 km                23,434,810   2.967718    114.6702
1 - 5 km                23,434,810   48.25242    631.9982
5 - 10 km               23,434,810   78.67449    877.9372
10 - 20 km              23,434,810   219.3559    1807.342
20 - 30 km              23,434,810   293.1631    2140.247

Urbanization Measures
0 - 1 km                23,434,810   1144.517    3891.315
1 - 5 km                23,434,810   19282.36    38718.05
5 - 10 km               23,434,810   31659.85    65016.37
10 - 20 km              23,434,810   88373.29    166638.7
20 - 30 km              23,434,810   117919.8    209310.5

1 Localization measures are calculated as the sum of within four-digit industry employment in respective concentric rings.
2 Urbanization measures are calculated as the sum of all manufacturing employment excluding own four-digit industry in respective concentric rings.
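The two ring measures defined in the notes above can be sketched as follows. This is a minimal illustration under stated assumptions: distances are great-circle (haversine) distances from a grid reference point, rings are half-open intervals [lower, upper), and the firm records are hypothetical. None of these choices is confirmed by the text.

```python
import math

# Ring bounds in km, as listed in Table 1b.
RINGS_KM = [(0, 1), (1, 5), (5, 10), (10, 20), (20, 30)]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed distance metric)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def ring_measures(grid, industry, firms):
    """Return (localization, urbanization) employment sums by ring.

    grid: (lat, lon) of the grid reference point.
    firms: iterable of (lat, lon, four_digit_industry, employment).
    Localization sums own-industry employment; urbanization sums all
    other manufacturing employment, ring by ring.
    """
    loc = [0.0] * len(RINGS_KM)
    urb = [0.0] * len(RINGS_KM)
    for lat, lon, ind, emp in firms:
        d = haversine_km(grid[0], grid[1], lat, lon)
        for i, (lower, upper) in enumerate(RINGS_KM):
            if lower <= d < upper:  # half-open ring (assumption)
                (loc if ind == industry else urb)[i] += emp
                break
    return loc, urb

# Hypothetical firms near a grid point at (31.0, 121.0).
firms = [
    (31.005, 121.0, 1310, 100),  # about 0.56 km away, own industry
    (31.030, 121.0, 1310, 50),   # about 3.3 km away, own industry
    (31.030, 121.0, 1411, 200),  # about 3.3 km away, other industry
    (31.500, 121.0, 1310, 999),  # about 56 km away, beyond the 30 km ring
]
loc, urb = ring_measures((31.0, 121.0), 1310, firms)
```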
Table 1c. Summary Statistics - Grid Level Four-Digit Industry Firm Birth Share
CIC Code           N      Mean  Std Dev  Two-digit Industry Name
13            996705  1.50E-05    0.001  Food Processing
14           1173972  1.62E-05    0.001  Food Production
15            756300  1.59E-05    0.001  Beverage Production
16             44346  6.76E-05    0.007  Tobacco Processing
17           1194920  1.67E-05    0.001  Textile Industry
18            165315  1.81E-05    0.001  Garments & Other Fibre Products
19            485520  2.06E-05    0.002  Leather, Furs, Down & Related Products
20            479000  1.67E-05    0.001  Timber Processing, Bamboo, Cane, Palm Fibre & Straw Products
21            250010  2.00E-05    0.002  Furniture Manufacturing
22            295940  1.35E-05    0.001  Papermaking & Paper Products
23            261945  1.91E-05    0.001  Printing & Record Pressing
24            578214  2.25E-05    0.003  Stationery, Educational & Sports Goods
25            183640  2.18E-05    0.001  Petroleum Processing, Coking Products, Gas Production & Supply
26           1953240  1.43E-05    0.001  Raw Chemical Materials & Chemical Products
27            361578  1.66E-05    0.001  Medical & Pharmaceutical Products
28            241101  2.49E-05    0.002  Chemical Fibres
29            453618  1.98E-05    0.001  Rubber Products
30            537660  1.67E-05    0.001  Plastic Products
31           2026950  1.43E-05    0.001  Non-metal Mineral Products
32            232588  1.72E-05    0.001  Smelting & Pressing of Ferrous Metals
33            489096  1.64E-05    0.001  Smelting & Pressing of Nonferrous Metals
34           1070748  1.68E-05    0.001  Metal Products
35           1884769  1.59E-05    0.001  Machinery & Equipment Manufacturing
36           2352880  1.66E-05    0.002  Special Equipment Manufacturing
37           1213107  1.65E-05    0.002  Transportation Equipment Manufacturing
39           1407264  1.71E-05    0.001  Electric Equipment & Machinery
40            684292  2.05E-05    0.002  Electronic & Telecommunications
41           1058784  1.98E-05    0.003  Instruments, Meters, Cultural & Official Machinery
42            601308  2.00E-05    0.001  Artwork & Other Manufacturing
NA        23,434,810  1.70E-05    0.002  ALL

1 Firm birth share is defined as the percentage of new firms in a grid out of all new firms in the four-digit industry.
Table 2. OLS Estimates - Firm Birth Share
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Food Tobacco Furniture Printing & Medical & Transportation Electronic & Instruments, Meters, Artwork &
Name Production Processing Manufacturing Record Pressing Pharmaceutical Equipment Telecom- Cultural & Other
Products Manufacturing munications Official Machinery Manufacturing
code 14 16 21 23 27 37 40 41 42
Localization Effects
0-1 km 6.87E-05 -1.78E-05 3.51E-05 2.09E-05 5.06E-05 8.71E-05 0.000107 0.000146 0.000171
(5.09) (-1.03) (3.79) (2.30) (4.76) (5.02) (4.92) (2.81) (6.01)
1-5 km 3.57E-06 2.13E-05 4.69E-06 1.71E-06 5.04E-07 7.53E-06 3.58E-06 5.51E-06 5.37E-06
(2.10) (0.67) (1.91) (0.90) (0.54) (2.69) (1.46) (1.06) (2.24)
5-10 km 9.20E-07 -4.81E-06 3.08E-06 -2.10E-06 -2.88E-08 1.47E-06 -5.85E-07 3.37E-06 7.26E-06
(0.80) (-0.25) (2.37) (-1.26) (-0.03) (0.97) (-0.30) (1.25) (3.86)
10-20 km 1.47E-06 3.91E-06 -3.01E-07 -2.10E-06 5.67E-08 -1.93E-06 -2.25E-06 6.16E-07 2.73E-06
(1.01) (0.19) (-0.09) (-1.03) (0.08) (-1.03) (-0.96) (0.25) (1.51)
20-30 km 2.08E-07 1.05E-05 -9.37E-08 -2.62E-07 -2.63E-06 -3.16E-06 -4.46E-06 -3.87E-06 -4.54E-07
(0.12) (0.63) (-0.06) (-0.17) (-2.84) (-1.89) (-2.17) (-0.92) (-0.26)
Urbanization Effects
0-1 km 1.24E-05 5.90E-05 1.62E-05 2.33E-05 1.90E-05 1.33E-05 1.65E-05 1.25E-05 1.37E-05
(6.01) (1.41) (3.34) (5.62) (6.77) (5.53) (4.76) (4.62) (5.39)
1-5 km -2.44E-06 -7.46E-05 -6.61E-06 8.48E-07 -4.64E-06 -1.60E-06 -2.92E-06 1.68E-06 -8.20E-06
(-2.26) (-1.56) (-1.55) (0.39) (-3.35) (-1.61) (-1.37) (0.64) (-4.06)
5-10 km -8.96E-08 8.99E-05 -3.80E-06 -2.72E-06 -3.28E-06 -9.99E-07 2.68E-07 -5.63E-06 -9.37E-07
(-0.07) (1.92) (-0.61) (-0.99) (-1.63) (-0.98) (0.12) (-1.44) (-0.47)
10-20 km -6.56E-06 -2.98E-06 1.52E-06 -1.73E-06 -5.49E-06 -5.70E-06 -3.35E-06 -4.82E-06 -1.12E-06
(-3.58) (-0.03) (0.22) (-0.56) (-2.49) (-2.71) (-0.87) (-1.43) (-0.37)
20-30 km 6.08E-07 -2.24E-04 -9.64E-07 -3.67E-06 2.32E-06 -7.00E-08 -4.04E-06 -5.84E-06 -5.95E-06
(0.35) (-1.34) (-0.18) (-0.69) (1.04) (-0.03) (-1.06) (-1.61) (-1.24)
Observations 1,173,972 44,346 250,010 261,945 361,578 1,213,107 684,292 1,058,784 601,308
Adj. R-squared 0.001 0.007 0.002 0.002 0.003 0.001 0.002 0.001 0.005
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table 3. IV Lasso Estimates - Firm Birth Share
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Food Tobacco Furniture Printing & Medical & Transportation Electronic & Instruments, Meters, Artwork &
Name Production Processing Manufacturing Record Pressing Pharmaceutical Equipment Telecom- Cultural & Other
Products Manufacturing munications Official Machinery Manufacturing
code 14 16 21 23 27 37 40 41 42
Localization Effects
0-1 km 6.77E-05 -6.33E-05 2.94E-05 2.42E-05 4.58E-05 0.0001072 8.64E-05 0.0002248 0.0001889
(4.44) (-1.00) (3.58) (1.88) (4.41) (4.01) (4.04) (3.05) (5.45)
1-5 km 6.98E-06 -5.40E-05 -1.77E-06 -2.85E-06 2.57E-06 -5.20E-06 5.28E-06 3.41E-06 1.78E-05
(1.79) (-1.00) (-0.50) (-0.52) (1.12) (-0.58) (0.74) (0.36) (1.27)
5-10 km 1.05E-05 -4.65E-05 1.91E-06 -1.25E-06 2.51E-06 2.82E-06 6.90E-06 -4.18E-06 5.32E-06
(2.36) (-1.00) (0.35) (-0.19) (1.10) (0.44) (1.05) (-0.64) (0.44)
10-20 km -8.15E-07 -4.58E-05 6.02E-06 -3.78E-06 6.26E-07 -3.49E-06 1.58E-06 -8.57E-06 -6.53E-06
(-0.15) (-1.00) (0.74) (-0.56) (0.24) (-0.46) (0.27) (-0.90) (-0.66)
20-30 km 2.25E-06 -5.08E-05 -7.49E-06 1.01E-05 -2.39E-06 3.24E-06 6.47E-07 -1.45E-05 -6.34E-06
(0.42) (-1.00) (-0.77) (0.68) (-0.80) (0.35) (0.10) (-1.06) (-0.50)
Urbanization Effects
0-1 km 1.40E-05 -5.34E-06 2.42E-05 2.45E-05 1.73E-05 1.90E-05 2.41E-05 1.36E-05 2.15E-05
(5.79) (-0.97) (3.16) (5.25) (5.26) (3.95) (3.76) (2.67) (4.61)
1-5 km -7.01E-06 2.89E-05 -7.57E-06 8.43E-06 -7.36E-06 1.91E-06 -1.38E-06 -1.32E-05 -2.93E-05
(-2.42) (1.00) (-2.48) (1.91) (-2.99) (0.33) (-0.20) (-1.34) (-2.64)
5-10 km -8.59E-06 7.20E-06 3.11E-06 -7.76E-06 -7.16E-06 -4.03E-07 -1.40E-05 2.11E-06 -3.83E-06
(-2.01) (0.96) (0.43) (-0.94) (-1.78) (-0.06) (-1.84) (0.14) (-0.30)
10-20 km -6.51E-06 -5.39E-05 -2.42E-05 6.91E-06 -4.58E-06 -4.55E-06 -9.56E-06 -1.84E-06 9.33E-06
(-0.95) (-1.00) (-1.51) (0.61) (-0.80) (-0.31) (-0.72) (-0.10) (0.53)
20-30 km -1.99E-06 1.23E-05 1.09E-05 -2.27E-05 4.45E-07 -1.81E-05 -1.50E-05 1.82E-06 -1.20E-05
(-0.27) (0.99) (0.88) (-1.13) (0.08) (-0.83) (-0.94) (0.08) (-0.57)
Observations 1,173,972 44,346 250,010 261,945 361,578 1,213,107 684,292 1,058,784 601,308
F stats 58.09 16.34 18.56 16.9 58.42 92.42 47.03 51.48 213.3
1 Coefficients reported are ring level localization and urbanization effects obtained by IV Lasso estimation of equation (3.1) for each two-digit industry.
2 Post-lasso-orthogonalized variables are used in IV regression.
3 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
4 Numbers in parentheses are t-statistics clustered at the grid level.
5 First stage F-statistics using Post-lasso-orthogonalized variables are reported.
Table 4. Spatial Decay Speed with Inverse Square Distance Decay Function Based on IV Lasso Estimates
Column  Code  Decay Speed  t-statistic  Name
(1)     13    4.47E-05     (11.50)      Food Processing
(2)     14    6.21E-05     (11.71)      Food Production
(3)     15    8.58E-05     (13.40)      Beverage Production
(4)     16    -2.42E-05    (-3.42)      Tobacco Processing
(5)     17    7.93E-05     (16.54)      Textile Industry
(6)     18    2.99E-05     (7.68)       Garments & Other Fiber Products
(7)     19    0.000145     (15.12)      Leather, Furs, Down & Related Products
(8)     20    6.43E-05     (14.68)      Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
(9)     21    3.39E-05     (7.29)       Furniture Manufacturing
(10)    22    2.65E-05     (7.81)       Papermaking & Paper Products
(11)    23    2.22E-05     (4.82)       Printing & Record Pressing
(12)    24    0.000147     (12.88)      Stationery, Educational & Sports Goods
(13)    25    9.41E-05     (12.25)      Petroleum Processing, Coking Products, Gas Production & Supply
(14)    26    6.30E-05     (15.53)      Raw Chemical Materials & Chemical Products
(15)    27    4.06E-05     (9.36)       Medical & Pharmaceutical Products
(16)    28    0.000317     (17.60)      Chemical Fibers
(17)    29    0.000161     (18.07)      Rubber Products
(18)    30    3.86E-05     (10.78)      Plastic Products
(19)    31    6.42E-05     (15.60)      Non-metal Mineral Products
(20)    32    4.08E-05     (9.95)       Smelting & Pressing of Ferrous Metals
(21)    33    6.92E-05     (12.44)      Smelting & Pressing of Nonferrous Metals
(22)    34    4.35E-05     (11.46)      Metal Products
(23)    35    5.97E-05     (15.09)      Machinery & Equipment Manufacturing
(24)    36    8.80E-05     (15.63)      Special Equipment Manufacturing
(25)    37    8.90E-05     (14.19)      Transportation Equipment Manufacturing
(26)    39    4.89E-05     (12.87)      Electric Equipment & Machinery
(27)    40    0.000110     (15.71)      Electronic & Telecommunications
(28)    41    0.000147     (13.61)      Instruments, Meters, Cultural & Official Machinery
(29)    42    0.000169     (21.14)      Artwork & Other Manufacturing
Observations 145
Adj. R-squared 0.974
1 Coefficients reported are two-digit-industry-specific spatial decay speed obtained by OLS estimation of equation (3.2), where localization effects in IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = 1/d^2.
3 Numbers in parentheses are t-statistics.
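The regression described in the notes to Table 4 can be sketched for a single industry as follows. This is a minimal illustration under stated assumptions, not the estimation code behind the table: the function name `decay_speed` is hypothetical, the ring distances (outer radii of 1, 5, 10, 20, and 30 km) are assumed, "sd" is read as the standard error of each ring-level estimate, and the weighting is implemented by scaling each observation by 1/sd. Because every industry has its own dummy and its own interaction term, the pooled point estimates coincide with industry-by-industry fits of this form.

```python
import numpy as np

def decay_speed(effects, std_errs, distances):
    """Weighted least squares of ring-level effects on f(d) = 1/d^2.

    Returns (intercept, slope); the slope plays the role of the decay speed.
    Assumption: each observation is scaled by 1/sd before the fit.
    """
    y = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(std_errs, dtype=float)          # weights 1/sd
    f = 1.0 / np.asarray(distances, dtype=float) ** 2    # inverse square distance
    X = np.column_stack([np.ones_like(f), f])            # industry dummy + interaction
    # Scale rows by the weights, then solve ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return float(coef[0]), float(coef[1])

# Hypothetical ring distances in km (an assumption, not taken from the paper).
RING_DISTANCES_KM = [1, 5, 10, 20, 30]
```

A call such as `decay_speed(effects, std_errs, RING_DISTANCES_KM)` takes the five ring coefficients and their standard errors for one industry and returns the fitted intercept and slope.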
Table 5. RMSE & MAE with Different Spatial Decay Functions
Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9
New Firm
OLS RMSE 1.30E-05 4.50E-06 2.20E-06 1.40E-05 1.90E-06 2.60E-06 1.40E-05 2.40E-06 2.60E-06
MAE 2.85E-05 7.87E-06 2.54E-06 2.90E-05 2.33E-06 2.86E-06 2.92E-05 2.72E-06 2.86E-06
LASSO RMSE 2.30E-05 7.70E-06 5.90E-06 2.40E-05 5.70E-06 6.10E-06 1.80E-05 1.40E-05 1.40E-05
MAE 3.39E-05 9.26E-06 5.85E-06 3.48E-05 5.74E-06 5.98E-06 5.81E-05 4.68E-05 4.68E-05
New Employment
OLS RMSE 1.40E-05 5.20E-06 2.50E-06 1.50E-05 2.30E-06 2.80E-06 1.50E-05 2.70E-06 2.80E-06
MAE 3.07E-05 8.86E-06 2.78E-06 3.14E-05 2.67E-06 3.06E-06 3.15E-05 2.93E-06 3.06E-06
LASSO RMSE 1.90E-05 1.70E-05 1.60E-05 2.60E-05 7.60E-06 8.00E-06 2.70E-05 7.90E-06 8.00E-06
MAE 6.14E-05 5.39E-05 5.36E-05 3.75E-05 7.34E-06 7.47E-06 3.74E-05 7.37E-06 7.48E-06
1 Root mean squared errors (RMSE) and mean absolute errors (MAE) are based on the residuals of OLS and IV Lasso estimation of equation (3.2), where localization effects (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the spatial decay function and two-digit industry dummies.
2 Nine spatial decay functions are tested:
Model 1: f(d) = −d
Model 2: f(d) = 1/d
Model 3: f(d) = 1/e^d
Model 4: f(d) = −d^2
Model 5: f(d) = 1/d^2
Model 6: f(d) = 1/e^(2d)
Model 7: f(d) = −d^3
Model 8: f(d) = 1/d^3
Model 9: f(d) = 1/e^(3d)
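The comparison of functional forms in Table 5 can be illustrated with a short sketch. It fits each of the nine decay functions to a set of ring-level effects and reports RMSE and MAE of the residuals. The names `DECAY_FUNCTIONS` and `fit_errors` are hypothetical, and the sketch uses an unweighted single-industry fit for simplicity, whereas the table is based on the weighted pooled regression of equation (3.2).

```python
import numpy as np

# The nine candidate decay functions, keyed by model number.
DECAY_FUNCTIONS = {
    1: lambda d: -d,
    2: lambda d: 1.0 / d,
    3: lambda d: np.exp(-d),
    4: lambda d: -(d ** 2),
    5: lambda d: 1.0 / d ** 2,
    6: lambda d: np.exp(-2.0 * d),
    7: lambda d: -(d ** 3),
    8: lambda d: 1.0 / d ** 3,
    9: lambda d: np.exp(-3.0 * d),
}

def fit_errors(effects, distances, model):
    """Fit effects = a + b * f(d) by OLS and return (RMSE, MAE) of residuals."""
    d = np.asarray(distances, dtype=float)
    y = np.asarray(effects, dtype=float)
    X = np.column_stack([np.ones_like(d), DECAY_FUNCTIONS[model](d)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    mae = float(np.mean(np.abs(resid)))
    return rmse, mae
```

Looping `fit_errors` over models 1 to 9 for the same data gives a like-for-like ranking of the functional forms on both error measures.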
Table 6. Spatial Decay Speed and Industry Characteristics
Regression Using OLS Estimates of Localization Effects
(1) (2) (3) (4)
f(d) 0.000083 0.000083 0.000097 0.000077
(9.12) (8.91) (10.21) (7.11)
f(d)× knowledge spillovers 0.000036 0.000021
(3.49) (1.00)
f(d)× labor market pooling 0.000038 0.000023
(3.35) (0.94)
f(d)× input sharing -4.11E-06 0.000010
(-0.38) (0.88)
f(d)× natural advantage 7.17E-06 0.000013 7.71E-06 7.91E-06
(0.71) (1.31) (0.70) (0.72)
f(d)× high SOE share -0.000047 -0.000054 -0.000036 -0.000053
(-4.48) (-4.73) (-3.44) (-4.39)
Constant 9.45E-08 9.45E-08 -2.39E-06 9.45E-08
(0.04) (0.04) (-1.22) (0.04)
Adj. R-squared 0.6163 0.6138 0.5833 0.6142
Regression Using LASSO Estimates of Localization Effects
(1) (2) (3) (4)
f(d) 0.000089 0.000088 0.00010 0.000086
(7.72) (7.56) (8.88) (6.26)
f(d)× knowledge spillovers 0.000038 0.000026
(2.88) (0.96)
f(d)× labor market pooling 0.000039 0.000016
(2.72) (0.53)
f(d)× input sharing -9.43E-06 3.74E-06
(-0.70) (0.25)
f(d)× natural advantage 7.50E-06 0.000014 9.68E-06 9.14E-06
(0.59) (1.08) (0.71) (0.66)
f(d)× high SOE share -0.000045 -0.000052 -0.000033 -0.000049
(-3.36) (-3.57) (-2.54) (-3.18)
Constant -1.22E-06 -1.22E-06 -1.22E-06 -1.22E-06
(-0.38) (-0.38) (-0.37) (-0.37)
Adj. R-squared 0.5470 0.5379 0.5152 0.5350
1 Results are obtained by OLS estimation of (3.3), where localization effects (from either OLS or IV Lasso estimation) are regressed on a spatial decay function, and interaction terms of the decay function and various industry characteristic indicators.
2 The spatial decay function is specified as f(d) = 1/d^2.
3 For each two-digit industry, the indicator for reliance on knowledge spillovers equals one if the ratio of new product to total product in the industry is higher than the median of all industries and zero otherwise. The indicator for reliance on labor pooling equals one if the percentage of college-educated workers in the industry is higher than the median of all industries and zero otherwise. The indicator for reliance on input sharing equals one if transportation cost per shipment in the industry is higher than the median of all industries and zero otherwise. The indicator for reliance on natural advantage equals one if at least two of the three cost variables (water, energy, and natural resources cost per shipment) in the industry are higher than the median of all industries and zero otherwise. The indicator variable high SOE share equals one if the percentage of SOE firms in the industry is higher than the median of all industries and zero otherwise.
4 Numbers in parentheses are t-statistics.
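The median-split indicators described in note 3 can be sketched as below. The function names and the dictionary layout (two-digit industry code mapped to a characteristic value) are hypothetical choices made for illustration, not the authors' code.

```python
from statistics import median

def above_median_indicator(values_by_industry):
    """Return 1 for industries strictly above the cross-industry median, else 0."""
    cutoff = median(values_by_industry.values())
    return {code: int(v > cutoff) for code, v in values_by_industry.items()}

def natural_advantage_indicator(water, energy, resources):
    """Return 1 if at least two of the three cost variables are above their medians."""
    flags = [above_median_indicator(x) for x in (water, energy, resources)]
    return {code: int(sum(f[code] for f in flags) >= 2) for code in water}
```

The same `above_median_indicator` helper would apply to the new product ratio, the share of college-educated workers, transportation cost per shipment, and the SOE share.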
Table 7. TFP and Other Correlates - Full Sample Localization Economies
Firm Birth Share TFP Output Per Worker Wages Per Worker
0-1 km 6.92E-05 0.000745 0.0154 0.0178
(23.11) (7.37) (24.70) (42.72)
1-5 km 3.94E-06 0.000533 0.00465 0.00441
(10.75) (5.06) (7.33) (11.45)
5-10 km 1.27E-06 9.84E-05 -0.00123 -0.00163
(4.57) (0.84) (-1.73) (-3.82)
10-20 km 9.80E-08 -0.000146 -0.00243 0.00228
(0.34) (-1.02) (-2.82) (4.46)
20-30 km -1.63E-06 -7.45E-06 -0.00322 0.00268
(-4.99) (-0.05) (-3.75) (5.30)
Observations 23,434,810 1,386,056 1,386,056 1,386,056
Adj. R-squared 0.0011 0.434 0.784 0.874
1 Coefficients reported are ring level localization effects obtained by OLS estimation of equation (3.1) for the full sample. Four different productivity measures, firm birth share, TFP, output per worker, and wages per worker, are used as the dependent variable.
2 Control variables include ring level urbanization measures, Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table 8. Localization Economies of Small versus Big Firms
Firm Birth Share TFP Output Per Worker Wages Per Worker
Employment of Small Firms
0-1 km 0.000134 5.86E-05 0.00115 -0.0237
(7.12) (0.23) (0.74) (-23.70)
1-5 km 4.22E-06 0.000417 0.000674 -0.013
(2.72) (1.92) (0.54) (-16.48)
5-10 km 2.34E-07 0.000203 -0.00125 -0.00932
(0.20) (0.91) (-0.97) (-11.40)
10-20 km -1.92E-06 -0.000153 -0.00633 -0.00467
(-3.61) (-0.69) (-4.86) (-5.75)
20-30 km -3.07E-06 -0.000149 -0.00287 -0.00243
(-7.16) (-0.70) (-2.28) (-3.12)
Employment of Big Firms
0-1 km 6.63E-05 0.000699 0.0106 0.0208
(22.60) (6.73) (18.08) (48.30)
1-5 km 3.62E-06 0.000468 0.00445 0.00817
(9.78) (4.30) (6.95) (20.55)
5-10 km 1.20E-06 4.74E-05 8.13E-06 0.00143
(4.17) (0.39) (0.01) (3.31)
10-20 km 2.12E-07 -0.000107 -0.00167 0.00423
(0.80) (-0.74) (-1.96) (8.18)
20-30 km -1.45E-06 1.20E-05 -0.00311 0.00435
(-4.94) (0.08) (-3.60) (8.40)
Observations 23,434,810 1,386,056 1,386,056 1,386,056
Adj. R-squared 0.0012 0.434 0.792 0.874
1 Coefficients reported are localization effects of small firms and big firms on different productivity measures.
2 Control variables include ring level urbanization measures, Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table 9. Localization Economies within and between Ownership Types
Birth TFP Output Wages
All SOE Non-SOE All SOE Non-SOE All SOE Non-SOE All SOE Non-SOE
Localization Effects of SOE
0-1 km 1.77E-05 7.98E-05 2.55E-05 -0.000919 0.00107 -0.000333 -0.00873 0.05 -0.00663 0.00548 0.0128 0.011
(2.23) (1.97) (2.26) (-3.31) (1.55) (-1.08) (-5.08) (13.36) (-3.89) (5.11) (6.46) (10.12)
1-5 km -4.35E-07 1.01E-05 -4.49E-07 0.000963 0.000443 0.000917 0.00278 0.00372 0.004 0.00139 0.00738 0.000735
(-0.44) (2.37) (-0.32) (5.55) (0.56) (5.31) (2.90) (0.91) (4.35) (2.41) (3.62) (1.27)
5-10 km -8.74E-07 1.59E-06 -1.29E-06 0.000682 0.00205 0.000463 -0.00145 -0.00876 -0.000517 -0.000396 0.000457 -0.000329
(-0.86) (0.85) (-0.98) (3.89) (2.03) (2.68) (-1.54) (-1.70) (-0.57) (-0.69) (0.18) (-0.57)
10-20 km -1.34E-06 8.29E-07 -1.85E-06 0.000335 -0.000314 0.000366 0.00205 0.00191 0.00097 -0.000288 -0.00164 -0.000238
(-2.44) (0.93) (-2.67) (2.58) (-0.34) (2.86) (2.77) (0.39) (1.35) (-0.64) (-0.69) (-0.53)
20-30 km -9.97E-07 -9.06E-07 -1.28E-06 1.43E-05 -8.65E-05 6.11E-05 0.00102 -0.00577 0.000889 -0.00278 -0.00596 -0.00251
(-2.30) (-1.79) (-2.30) (0.12) (-0.09) (0.51) (1.44) (-1.17) (1.29) (-6.41) (-2.48) (-5.83)
Localization Effects of Non-SOE
0-1 km 7.06E-05 1.22E-05 9.87E-05 0.000877 -0.000194 0.000694 0.0168 0.038 0.0113 0.0179 0.0201 0.0154
(22.99) (3.93) (25.14) (8.63) (-0.30) (7.03) (28.33) (11.01) (19.98) (46.24) (11.50) (40.48)
1-5 km 4.05E-06 4.74E-07 5.68E-06 0.000368 0.000903 0.000365 0.00412 0.0173 0.00403 0.00358 0.00878 0.0034
(10.76) (0.94) (12.13) (3.50) (1.13) (3.53) (6.51) (4.13) (6.51) (9.29) (4.11) (8.90)
5-10 km 1.37E-06 7.72E-07 1.98E-06 1.94E-05 0.0000411 2.67E-05 -0.00147 -0.000306 -0.00102 -0.00172 -0.00408 -0.00144
(4.87) (2.03) (5.61) (0.16) (0.05) (0.23) (-2.06) (-0.07) (-1.45) (-4.04) (-1.82) (-3.35)
10-20 km 1.04E-07 -4.98E-07 3.98E-07 -0.000181 0.000566 -0.000165 -0.00247 0.0107 -0.00285 0.00279 0.00319 0.00278
(0.37) (-1.60) (1.12) (-1.27) (0.66) (-1.17) (-2.88) (2.25) (-3.40) (5.51) (1.34) (5.48)
20-30 km -1.63E-06 -8.21E-07 -1.96E-06 -0.00011 -0.000325 -8.21E-05 -0.00384 0.00259 -0.0046 0.00291 -0.00245 0.00282
(-5.09) (-2.44) (-4.93) (-0.75) (-0.37) (-0.57) (-4.47) (0.57) (-5.46) (5.75) (-1.07) (5.57)
Observations 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810 23,434,810
Adj. R-squared 0.0011 0.0001 0.0014 0.435 0.295 0.451 0.784 0.685 0.792 0.874 0.837 0.876
1 Coefficients reported are localization effects of SOEs and non-SOEs on different productivity measures for all firms, SOEs, and non-SOEs.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Figure 1: Cross Industry Comparison of the Attenuation of Localization Economies (IV)
Figure 2: Kernel Density Estimation of Attenuation Parameters by Industry Characteristics
Panels: (1) Dependence on Transportation; (2) New Product Ratio; (3) College (and above) Degree Worker Ratio; (4) Dependence on Natural Resources; (5) SOE Ratio
Figure 3: Attenuation of Localization Economies for Five Selected Manufacturing Industries in Comparison with Rosenthal and Strange (2003)
Panels: Attenuation Pattern from Rosenthal and Strange (2003) - Five Industries; Attenuation Pattern from Rosenthal and Strange (2003) - Four Industries (Apparel Excluded); Attenuation Pattern based on Our Estimates
Figure 4: TFP and Other Correlates - Full Sample Localization Economies
Figure 5: Localization Economies within and between Ownership Types
Panels: (1) Birth; (2) TFP; (3) Output Per Worker; (4) Wages Per Worker
Appendix A: Geocoding
Geocoding Chinese data can be tricky. Due to national security concerns, the government
requires all map service providers in China to use a specific coordinate system called
GCJ-02. GCJ-02 (colloquially, Mars Coordinates) is formulated by the Chinese Academy of
Surveying and Mapping (CASM) and is based on the World Geodetic System 1984 (WGS-84).54
However, GCJ-02 coordinates are derived from WGS-84 coordinates through an obfuscation
algorithm that adds seemingly random offsets to the WGS-84 latitude and longitude. Thus,
GCJ-02 coordinates are displayed at the correct location on a GCJ-02 map, but not on a
WGS-84 map. Because almost all geographic information system (GIS) software is based on
WGS-84, using GCJ-02 coordinates directly in GIS software causes measurement errors in
calculated distances. Geocoding with Google’s geocoding application programming interface
cannot resolve this issue because Google Maps also uses GCJ-02 for locations in China.
Thus, geocoding Chinese data and processing it correctly in GIS software requires us to
reverse the obfuscation algorithm and recover the longitude and latitude based on WGS-84.
We are unaware of any previous relevant studies that have carefully dealt with this geocoding
issue. The issue may not be serious when the geographic distances considered in the analysis
are large, because the distance bias caused by the obfuscation algorithm is typically within
several hundred metres. In this study, however, we consider distances between firms of
only a few kilometres, so an uncorrected offset of this size could cause serious bias.
We consulted experts in geography, who provided us with the source code of the
obfuscation algorithm. We then converted GCJ-02 coordinates to WGS-84
coordinates by reversing the obfuscation algorithm after the regular geocoding process.
54 WGS-84 is the most commonly used reference system in cartography, geodesy, and satellite navigation, including GPS.
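One generic way to reverse a small, smooth coordinate offset is fixed-point iteration on the forward transform. The sketch below illustrates that idea only; it is not the procedure or source code used in this study. The function `gcj_to_wgs` is a hypothetical name, it takes the forward (WGS-84 to GCJ-02) transform as an argument, and `toy_forward` is a made-up smooth offset standing in for the real obfuscation algorithm, which is not reproduced here.

```python
import math

def gcj_to_wgs(lat_gcj, lon_gcj, forward, tol=1e-9, max_iter=50):
    """Numerically invert `forward` (WGS-84 -> GCJ-02) by fixed-point iteration.

    Assumes the offset added by `forward` is small and varies slowly,
    so that the iteration is a contraction.
    """
    lat, lon = lat_gcj, lon_gcj  # initial guess: the obfuscated point itself
    for _ in range(max_iter):
        f_lat, f_lon = forward(lat, lon)
        d_lat, d_lon = lat_gcj - f_lat, lon_gcj - f_lon
        lat, lon = lat + d_lat, lon + d_lon  # correct by the remaining mismatch
        if max(abs(d_lat), abs(d_lon)) < tol:
            break
    return lat, lon

def toy_forward(lat, lon):
    """Hypothetical smooth offset for demonstration; NOT the real GCJ-02 algorithm."""
    return (lat + 0.002 * math.sin(math.radians(lon) * 30),
            lon + 0.004 * math.cos(math.radians(lat) * 30))
```

With the toy offset, `gcj_to_wgs(*toy_forward(23.13, 113.26), toy_forward)` recovers the original point to within the stated tolerance after a handful of iterations.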
Appendix B: TFP Estimation
We estimate firm-level TFP following Brandt et al. (2012), De Loecker and Warzynski (2012),
Ackerberg et al. (2015), and Brandt et al. (2017). The classical challenge in estimating
firm TFP is that firm productivity shocks are known to profit-maximizing firms but
unobservable to econometricians. Because firms choose the current year’s inputs based on
contemporaneous productivity shocks, these inputs may be endogenous. In this case, OLS
estimation of the firm production function yields biased estimates of TFP.
There are several commonly used techniques in the literature to deal with this endogeneity
problem. Olley and Pakes (1996) (i.e. the OP method) show that firm investment can serve
as a proxy for unobserved firm productivity shocks in a control function approach.
Extending their framework, Levinsohn and Petrin (2003) suggest using firm intermediate
inputs as the proxy instead. Later advances in the literature
mostly build on these two seminal studies. The method we use in this study to estimate
firm-level TFP uses firm intermediate inputs in the control function and a GMM algorithm,
first proposed by De Loecker and Warzynski (2012), for parameter estimation in the firm
production function. The detailed TFP estimation procedure closely follows Brandt et al. (2012)
and Brandt et al. (2017).
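The logic of the control function approach can be written out in stylized form. The equations below are a textbook sketch under assumed notation, not the exact specification estimated in this study: a value-added Cobb-Douglas production function is assumed, and the symbols $y_{it}$ (log output), $l_{it}$ (log labour), $k_{it}$ (log capital), $m_{it}$ (log intermediate inputs), $\omega_{it}$ (productivity), $h_t$, $\phi_t$, and $g$ are introduced here for illustration only.

```latex
\begin{align}
y_{it} &= \beta_l l_{it} + \beta_k k_{it} + \omega_{it} + \varepsilon_{it}
  && \text{(production function)} \\
m_{it} &= h_t(k_{it}, l_{it}, \omega_{it})
  \;\Rightarrow\; \omega_{it} = h_t^{-1}(k_{it}, l_{it}, m_{it})
  && \text{(intermediate inputs as proxy)} \\
y_{it} &= \phi_t(l_{it}, k_{it}, m_{it}) + \varepsilon_{it}
  && \text{(first stage, nonparametric)} \\
\omega_{it} &= g(\omega_{i,t-1}) + \xi_{it}, \qquad
  E\!\left[\xi_{it} \begin{pmatrix} k_{it} \\ l_{i,t-1} \end{pmatrix}\right] = 0
  && \text{(GMM moment conditions)}
\end{align}
```

The first stage purges the noise term $\varepsilon_{it}$, and the moment conditions in the last line identify $\beta_l$ and $\beta_k$ from the innovation $\xi_{it}$ in productivity; estimated TFP is then the residual $\hat{\omega}_{it}$.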
Appendix C: Tables
Table A1. Summary Statistics - Grid Level Four-Digit Industry Existing Employment
Two-digit Industry Name CIC Code N Mean Std Dev
Food Processing 13 996705 2.894 68.998
Food Production 14 1173972 1.231 42.701
Beverage Production 15 756300 1.432 62.179
Tobacco Processing 16 44346 4.604 170.803
Textile Industry 17 1194920 5.099 160.653
Garments & Other Fibre Products 18 165315 25.756 310.736
Leather, Furs, Down & Related Products 19 485520 5.257 195.902
Timber Processing, Bamboo, Cane, Palm Fibre & Straw Products 20 479000 2.538 52.687
Furniture Manufacturing 21 250010 3.778 73.696
Papermaking & Paper Products 22 295940 4.783 75.624
Printing & Record Pressing 23 261945 2.055 37.056
Stationery, Educational & Sports Goods 24 578214 2.072 65.986
Petroleum Processing, Coking Products, Gas Production & Supply 25 183640 4.495 144.634
Raw Chemical Materials & Chemical Products 26 1953240 1.858 57.129
Medical & Pharmaceutical Products 27 361578 3.980 85.733
Chemical Fibres 28 241101 1.722 74.258
Rubber Products 29 453618 2.017 67.985
Plastic Products 30 537660 4.361 101.337
Non-metal Mineral Products 31 2026950 2.262 60.659
Smelting & Pressing of Ferrous Metals 32 232588 12.800 480.847
Smelting & Pressing of Nonferrous Metals 33 489096 2.820 122.861
Metal Products 34 1070748 2.767 54.681
Machinery & Equipment Manufacturing 35 1884769 2.429 57.887
Special Equipment Manufacturing 36 2352880 1.154 50.335
Transportation Equipment Manufacturing 37 1213107 3.400 106.446
Electric Equipment & Machinery 39 1407264 3.423 113.021
Electronic & Telecommunications 40 684292 8.346 346.336
Instruments, Meters, Cultural & Official Machinery 41 1058784 1.032 46.118
Artwork & Other Manufacturing 42 601308 2.003 60.761
ALL NA 23,434,810 1147.484 3899.308
Table A2. OLS Estimates - Firm Birth Share
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Localization Effects
0-1 km 4.53E-05 6.87E-05 8.58E-05 -1.78E-05 7.80E-05 2.90E-05 0.000145 6.65E-05
(6.87) (5.09) (4.71) (-1.03) (7.78) (4.44) (3.50) (7.92)
1-5 km 2.18E-06 3.57E-06 2.92E-06 2.13E-05 4.18E-06 2.80E-06 -3.13E-06 5.20E-06
(2.50) (2.10) (2.07) (0.67) (2.98) (2.11) (-0.65) (3.72)
5-10 km 1.88E-06 9.20E-07 -2.63E-07 -4.81E-06 -1.93E-07 -2.40E-07 2.41E-07 2.70E-06
(2.58) (0.80) (-0.19) (-0.25) (-0.17) (-0.27) (0.12) (2.94)
10-20 km 1.03E-06 1.47E-06 1.29E-06 3.91E-06 -3.67E-07 2.76E-07 -7.20E-07 2.71E-06
(1.10) (1.01) (0.77) (0.19) (-0.27) (0.21) (-0.39) (2.77)
20-30 km -9.09E-07 2.08E-07 2.21E-08 1.05E-05 -2.31E-06 -4.28E-06 -1.42E-06 2.03E-06
(-0.77) (0.12) (0.01) (0.63) (-1.66) (-2.44) (-0.51) (1.74)
Urbanization Effects
0-1 km 6.34E-06 1.24E-05 1.39E-05 5.90E-05 1.34E-05 1.53E-05 1.24E-05 6.18E-06
(4.50) (6.01) (4.50) (1.41) (7.12) (3.77) (3.53) (5.18)
1-5 km -4.20E-06 -2.44E-06 -4.82E-06 -7.46E-05 -3.86E-06 -5.85E-06 -3.93E-06 -5.73E-06
(-4.83) (-2.26) (-2.09) (-1.56) (-3.61) (-2.89) (-1.85) (-5.59)
5-10 km -2.26E-06 -8.96E-08 -1.26E-06 8.99E-05 -4.11E-07 -1.43E-06 -5.08E-06 5.73E-07
(-1.83) (-0.07) (-0.60) (1.92) (-0.41) (-0.62) (-1.47) (0.52)
10-20 km -3.50E-06 -6.56E-06 -3.58E-06 -2.98E-06 -3.34E-06 -6.43E-06 -5.74E-06 -3.43E-06
(-2.14) (-3.58) (-1.55) (-0.03) (-2.66) (-2.29) (-1.30) (-2.74)
20-30 km -1.15E-06 6.08E-07 -9.16E-07 -2.24E-04 -2.75E-06 1.24E-06 5.19E-08 -1.58E-06
(-0.71) (0.35) (-0.36) (-1.34) (-1.56) (0.44) (0.01) (-0.93)
Observations 996,705 1,173,972 756,300 44,346 1,194,920 165,315 485,520 479,000
Adj. R-squared 0.002 0.001 0.002 0.007 0.003 0.004 0.004 0.008
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table A2 (Continued). OLS Estimates - Firm Birth Share
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Localization Effects
0-1 km 3.51E-05 2.60E-05 2.09E-05 0.000140 9.44E-05 6.18E-05 5.06E-05 0.000319
(3.79) (5.22) (2.30) (2.44) (3.61) (8.51) (4.76) (2.16)
1-5 km 4.69E-06 2.00E-06 1.71E-06 5.12E-06 6.26E-06 2.09E-06 5.04E-07 1.09E-05
(1.91) (2.09) (0.90) (0.93) (2.40) (2.08) (0.54) (1.22)
5-10 km 3.08E-06 -6.32E-08 -2.10E-06 -5.95E-07 1.85E-06 3.11E-07 -2.88E-08 -4.61E-06
(2.37) (-0.08) (-1.26) (-0.10) (0.71) (0.44) (-0.03) (-0.97)
10-20 km -3.01E-07 2.96E-07 -2.10E-06 -6.00E-06 8.39E-07 -3.05E-07 5.67E-08 6.88E-06
(-0.09) (0.31) (-1.03) (-1.04) (0.45) (-0.47) (0.08) (1.14)
20-30 km -9.37E-08 -1.57E-06 -2.62E-07 -1.19E-05 -6.81E-09 -2.53E-06 -2.63E-06 1.77E-06
(-0.06) (-1.97) (-0.17) (-1.74) (-0.00) (-3.34) (-2.84) (0.49)
Urbanization Effects
0-1 km 1.62E-05 2.60E-05 2.33E-05 2.13E-05 1.27E-05 1.17E-05 1.90E-05 1.38E-05
(3.34) (5.22) (5.62) (4.55) (2.63) (7.67) (6.77) (2.19)
1-5 km -6.61E-06 2.00E-06 8.48E-07 -7.97E-07 -5.26E-06 -1.73E-06 -4.64E-06 -6.48E-06
(-1.55) (2.09) (0.39) (-0.24) (-2.31) (-2.79) (-3.35) (-0.91)
5-10 km -3.80E-06 -6.32E-08 -2.72E-06 -5.31E-06 -3.67E-06 -2.64E-06 -3.28E-06 -4.12E-06
(-0.61) (-0.08) (-0.99) (-1.21) (-0.94) (-3.09) (-1.63) (-0.56)
10-20 km 1.52E-06 2.96E-07 -1.73E-06 -7.98E-06 -6.46E-06 -1.46E-06 -5.49E-06 1.70E-05
(0.22) (0.31) (-0.56) (-1.80) (-1.27) (-1.16) (-2.49) (1.26)
20-30 km -9.64E-07 -1.57E-06 -3.67E-06 -5.84E-06 1.12E-06 -2.48E-06 2.32E-06 -2.28E-05
(-0.18) (-1.97) (-0.69) (-1.24) (0.20) (-1.89) (1.04) (-1.64)
Observations 250,010 295,940 261,945 578,214 183,640 1,953,240 361,578 241,101
Adj. R-squared 0.002 0.002 0.002 0.001 0.004 0.001 0.003 0.007
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table A2 (Continued). OLS Estimates - Firm Birth Share
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Localization Effects
0-1 km 0.000164 3.96E-05 6.52E-05 4.03E-05 6.80E-05 4.33E-05 6.03E-05 8.61E-05
(4.59) (7.05) (8.82) (5.48) (4.99) (6.92) (8.89) (6.14)
1-5 km 5.66E-06 2.73E-06 6.41E-06 3.64E-06 4.37E-06 4.60E-06 4.17E-06 3.29E-06
(2.49) (3.61) (5.50) (2.92) (2.49) (4.78) (5.73) (2.59)
5-10 km 3.15E-06 1.60E-06 3.27E-06 5.94E-09 -6.16E-10 1.34E-07 2.64E-06 1.37E-06
(1.50) (2.16) (4.00) (0.01) (-0.00) (0.18) (2.57) (1.00)
10-20 km 3.81E-06 1.36E-06 7.37E-07 -2.30E-07 -2.10E-06 9.75E-08 7.86E-07 -1.71E-06
(2.10) (2.21) (0.76) (-0.29) (-1.51) (0.10) (0.76) (-1.38)
20-30 km 1.70E-06 1.50E-07 -3.09E-07 -7.31E-07 -2.42E-07 -8.28E-07 -8.81E-07 -3.24E-06
(0.75) (0.16) (-0.28) (-0.70) (-0.15) (-0.77) (-0.82) (-2.58)
Urbanization Effects
0-1 km 1.30E-05 1.27E-05 9.77E-06 9.35E-06 1.24E-05 1.50E-05 1.30E-05 1.93E-05
(4.59) (9.84) (8.89) (4.40) (4.97) (8.79) (10.77) (9.14)
1-5 km -3.93E-06 -2.93E-06 -3.24E-06 -2.63E-06 -5.51E-06 -4.22E-06 -2.63E-06 -1.77E-06
(-2.32) (-4.59) (-4.40) (-1.76) (-2.49) (-3.29) (-3.96) (-2.03)
5-10 km -6.37E-07 -2.24E-06 5.46E-07 -4.15E-06 -3.91E-07 -9.81E-07 -1.12E-06 -2.72E-06
(-0.25) (-2.79) (0.59) (-2.05) (-0.08) (-1.06) (-1.54) (-2.95)
10-20 km -7.58E-06 -1.16E-06 -4.33E-06 -3.85E-06 -6.81E-06 -2.01E-06 -3.30E-06 -2.12E-06
(-2.01) (-0.99) (-3.46) (-1.47) (-1.51) (-1.47) (-3.30) (-1.67)
20-30 km -5.05E-06 -4.10E-06 -1.90E-06 -1.14E-06 -3.93E-06 8.48E-07 -5.14E-07 -6.79E-06
(-1.22) (-2.92) (-1.67) (-0.38) (-0.58) (0.56) (-0.49) (-1.91)
Observations 453,618 537,660 2,026,950 232,588 489,096 1,070,748 1,884,769 2,352,880
Adj. R-squared 0.005 0.006 0.002 0.006 0.002 0.002 0.001 0.001
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table A2 (Continued). OLS Estimates - Firm Birth Share
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Localization Effects
0-1 km 8.71E-05 4.97E-05 0.000107 0.000146 0.000171
(5.02) (7.85) (4.92) (2.81) (6.01)
1-5 km 7.53E-06 3.64E-06 3.58E-06 5.51E-06 5.37E-06
(2.69) (3.45) (1.46) (1.06) (2.24)
5-10 km 1.47E-06 1.52E-06 -5.85E-07 3.37E-06 7.26E-06
(0.97) (2.09) (-0.30) (1.25) (3.86)
10-20 km -1.93E-06 9.66E-07 -2.25E-06 6.16E-07 2.73E-06
(-1.03) (1.40) (-0.96) (0.25) (1.51)
20-30 km -3.16E-06 6.07E-07 -4.46E-06 -3.87E-06 -4.54E-07
(-1.89) (0.70) (-2.17) (-0.92) (-0.26)
Urbanization Effects
0-1 km 1.33E-05 1.72E-05 1.65E-05 1.25E-05 1.37E-05
(5.53) (7.78) (4.76) (4.62) (5.39)
1-5 km -1.60E-06 -1.71E-06 -2.92E-06 1.68E-06 -8.20E-06
(-1.61) (-1.72) (-1.37) (0.64) (-4.06)
5-10 km -9.99E-07 -1.93E-06 2.68E-07 -5.63E-06 -9.37E-07
(-0.98) (-2.23) (0.12) (-1.44) (-0.47)
10-20 km -5.70E-06 -2.66E-06 -3.35E-06 -4.82E-06 -1.12E-06
(-2.71) (-1.58) (-0.87) (-1.43) (-0.37)
20-30 km -7.00E-08 -8.14E-07 -4.04E-06 -5.84E-06 -5.95E-06
(-0.03) (-0.31) (-1.06) (-1.61) (-1.24)
Observations 1,213,107 1,407,264 684,292 1,058,784 601,308
Adj. R-squared 0.001 0.001 0.002 0.001 0.005
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
Table A3. IV LASSO Estimates - Firm Birth Share
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Localization Effects
0-1 km 4.64E-05 6.77E-05 8.95E-05 -6.33E-05 6.61E-05 2.63E-05 0.0001683 6.63E-05
(5.76) (4.44) (3.88) (-1.00) (5.15) (3.63) (3.10) (6.55)
1-5 km 9.90E-06 6.98E-06 -1.81E-06 -5.40E-05 1.14E-05 -9.02E-07 -1.02E-06 1.16E-05
(2.85) (1.79) (-0.44) (-1.00) (2.66) (-0.31) (-0.10) (4.02)
5-10 km 8.87E-07 1.05E-05 4.22E-06 -4.65E-05 -1.44E-06 -6.75E-07 1.50E-05 1.40E-06
(0.21) (2.36) (0.58) (-1.00) (-0.36) (-0.23) (0.90) (0.60)
10-20 km 4.49E-06 -8.15E-07 1.00E-05 -4.58E-05 -2.47E-06 -2.65E-06 1.79E-05 1.03E-05
(1.18) (-0.15) (2.20) (-1.00) (-0.44) (-0.59) (1.41) (2.76)
20-30 km 4.83E-06 2.25E-06 -5.17E-06 -5.08E-05 9.65E-06 1.31E-05 2.25E-06 -2.66E-06
(0.98) (0.42) (-0.94) (-1.00) (1.49) (1.79) (0.32) (-0.71)
Urbanization Effects
0-1 km 6.50E-06 1.40E-05 1.70E-05 -5.34E-06 1.94E-05 1.91E-05 2.25E-05 8.68E-06
(3.01) (5.79) (2.81) (-0.97) (5.30) (3.42) (3.06) (4.81)
1-5 km -1.25E-05 -7.01E-06 -1.33E-06 2.89E-05 -1.30E-05 -1.63E-06 -8.76E-06 -1.17E-05
(-4.08) (-2.42) (-0.38) (1.00) (-2.97) (-0.43) (-0.91) (-4.48)
5-10 km -1.06E-06 -8.59E-06 -3.27E-06 7.20E-06 1.76E-06 9.53E-07 -1.74E-05 1.93E-06
(-0.19) (-2.01) (-0.81) (0.96) (0.35) (0.19) (-0.89) (0.69)
10-20 km -7.58E-06 -6.51E-06 -3.05E-05 -5.39E-05 2.14E-06 1.21E-06 -4.01E-05 -1.40E-05
(-1.20) (-0.95) (-3.73) (-1.00) (0.24) (0.13) (-1.45) (-3.25)
20-30 km -4.34E-06 -1.99E-06 1.77E-05 1.23E-05 -1.78E-05 -2.07E-05 2.96E-05 6.74E-06
(-0.63) (-0.27) (1.76) (0.99) (-1.78) (-1.73) (1.13) (1.35)
Observations 996,705 1,173,972 756,300 44,346 1,194,920 165,315 485,520 479,000
1 Coefficients reported are ring level localization and urbanization effects obtained by IV Lasso estimation of equation (3.1) for each two-digit industry.
2 Post-lasso-orthogonalized variables are used in IV regression.
3 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
4 Numbers in parentheses are t-statistics clustered at the grid level.
Table A3 (Continued). IV LASSO Estimates - Firm Birth Share
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Localization Effects
0-1 km 2.94E-05 2.15E-05 2.42E-05 0.0001681 9.91E-05 7.46E-05 4.58E-05 0.0003302
(3.58) (3.72) (1.88) (2.42) (3.35) (7.45) (4.41) (1.97)
1-5 km -1.77E-06 1.95E-06 -2.85E-06 1.18E-05 1.03E-05 6.50E-07 2.57E-06 4.32E-06
(-0.50) (0.95) (-0.52) (0.80) (1.97) (0.19) (1.12) (0.34)
5-10 km 1.91E-06 1.35E-06 -1.25E-06 1.13E-05 4.60E-06 -1.43E-06 2.51E-06 -8.69E-06
(0.35) (0.52) (-0.19) (0.68) (0.77) (-0.42) (1.10) (-1.34)
10-20 km 6.02E-06 -3.43E-06 -3.78E-06 -2.52E-05 8.46E-06 1.18E-06 6.26E-07 -7.17E-06
(0.74) (-0.88) (-0.56) (-1.58) (2.27) (0.33) (0.24) (-0.73)
20-30 km -7.49E-06 -3.72E-06 1.01E-05 -2.76E-05 3.99E-06 -1.31E-06 -2.39E-06 -6.48E-06
(-0.77) (-0.76) (0.68) (-2.79) (0.92) (-0.35) (-0.80) (-0.61)
Urbanization Effects
0-1 km 2.42E-05 9.10E-06 2.45E-05 1.99E-05 2.43E-05 1.63E-05 1.73E-05 1.68E-05
(3.16) (4.40) (5.25) (2.73) (2.46) (7.76) (5.26) (1.02)
1-5 km -7.57E-06 -6.95E-06 8.43E-06 -1.17E-05 -9.35E-06 -3.90E-06 -7.36E-06 -1.73E-05
(-2.48) (-2.58) (1.91) (-0.92) (-1.42) (-1.58) (-2.99) (-0.68)
5-10 km 3.11E-06 -4.75E-06 -7.76E-06 -2.80E-05 -1.43E-05 -4.05E-06 -7.16E-06 -5.13E-06
(0.43) (-1.26) (-0.94) (-1.85) (-1.34) (-1.02) (-1.78) (-0.22)
10-20 km -2.42E-05 -2.35E-06 6.91E-06 1.15E-05 -1.47E-05 -3.25E-06 -4.58E-06 4.81E-05
(-1.51) (-0.32) (0.61) (0.44) (-1.10) (-0.50) (-0.80) (0.89)
20-30 km 1.09E-05 -2.96E-06 -2.27E-05 1.24E-05 1.52E-05 -7.52E-06 4.45E-07 -2.88E-05
(0.88) (-0.35) (-1.13) (0.95) (0.97) (-1.25) (0.08) (-0.65)
Observations 250,010 295,940 261,945 578,214 183,640 1,953,240 361,578 241,101
1 Coefficients reported are ring level localization and urbanization effects obtained by IV Lasso estimation of equation (3.1) for each two-digit industry.
2 Post-lasso-orthogonalized variables are used in IV regression.
3 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
4 Numbers in parentheses are t-statistics clustered at the grid level.
Table A3 (Continued). IV LASSO Estimates - Firm Birth Share
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Localization Effects
0-1 km 0.0001703 3.85E-05 7.59E-05 3.84E-05 7.56E-05 4.95E-05 7.49E-05 9.28E-05
(4.06) (5.32) (6.66) (4.63) (4.85) (5.98) (6.46) (5.61)
1-5 km 7.60E-06 7.30E-06 2.04E-05 5.84E-07 9.32E-08 6.72E-07 5.28E-06 4.97E-07
(1.37) (2.44) (3.88) (0.26) (0.02) (0.19) (1.54) (0.08)
5-10 km 7.14E-06 -4.65E-06 -3.31E-06 5.15E-06 3.70E-06 3.93E-06 -6.58E-07 3.50E-06
(1.17) (-1.31) (-0.73) (2.59) (1.19) (1.24) (-0.17) (0.67)
10-20 km 7.59E-06 6.28E-06 1.41E-06 -1.11E-06 -1.72E-06 -4.72E-06 6.03E-06 1.04E-05
(1.02) (1.17) (0.38) (-0.41) (-0.50) (-0.86) (1.22) (1.59)
20-30 km 5.71E-06 8.33E-07 3.97E-07 -6.82E-06 -1.38E-06 -2.43E-07 4.67E-06 -3.62E-06
(0.56) (0.16) (0.11) (-2.12) (-0.32) (-0.04) (1.02) (-0.59)
Urbanization Effects
0-1 km 1.30E-05 1.40E-05 1.44E-05 1.11E-05 1.24E-05 1.68E-05 1.76E-05 2.93E-05
(2.82) (7.09) (7.35) (3.68) (3.66) (7.94) (8.90) (7.38)
1-5 km -4.87E-06 -7.14E-06 -1.37E-05 -3.61E-06 -4.50E-06 -4.77E-06 -5.81E-06 -4.63E-06
(-0.99) (-2.83) (-4.65) (-1.38) (-1.17) (-1.55) (-2.21) (-1.14)
5-10 km -6.17E-06 2.36E-06 2.43E-06 -9.96E-06 -1.94E-05 -6.32E-06 6.95E-07 -4.69E-06
(-0.87) (0.74) (0.80) (-2.81) (-2.02) (-1.54) (0.19) (-0.84)
10-20 km -1.39E-05 -8.04E-06 -6.34E-06 -4.09E-06 -4.90E-06 3.13E-06 -1.17E-05 -2.66E-05
(-1.06) (-0.99) (-1.51) (-0.68) (-0.62) (0.35) (-1.45) (-2.04)
20-30 km -1.49E-05 -5.86E-06 -6.80E-06 7.56E-06 1.30E-05 -5.33E-06 -7.19E-06 7.49E-06
(-1.02) (-0.73) (-1.30) (1.24) (1.12) (-0.58) (-1.10) (0.66)
Observations 453,618 537,660 2,026,950 232,588 489,096 1,070,748 1,884,769 2,352,880
1 Coefficients reported are ring level localization and urbanization effects obtained by IV Lasso estimation of equation (3.1) for each two-digit industry.
2 Post-lasso-orthogonalized variables are used in IV regression.
3 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
4 Numbers in parentheses are t-statistics clustered at the grid level.
Table A3 (Continued). IV LASSO Estimates - Firm Birth Share
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Localization Effects
0-1 km 0.0001072 5.79E-05 8.64E-05 0.0002248 0.0001889
(4.01) (6.32) (4.04) (3.05) (5.45)
1-5 km -5.20E-06 -6.17E-07 5.28E-06 3.41E-06 1.78E-05
(-0.58) (-0.18) (0.74) (0.36) (1.27)
5-10 km 2.82E-06 8.66E-08 6.90E-06 -4.18E-06 5.32E-06
(0.44) (0.03) (1.05) (-0.64) (0.44)
10-20 km -3.49E-06 1.15E-05 1.58E-06 -8.57E-06 -6.53E-06
(-0.46) (1.85) (0.27) (-0.90) (-0.66)
20-30 km 3.24E-06 -1.62E-06 6.47E-07 -1.45E-05 -6.34E-06
(0.35) (-0.29) (0.10) (-1.06) (-0.50)
Urbanization Effects
0-1 km 1.90E-05 2.12E-05 2.41E-05 1.36E-05 2.15E-05
(3.95) (9.27) (3.76) (2.67) (4.61)
1-5 km 1.91E-06 1.61E-06 -1.38E-06 -1.32E-05 -2.93E-05
(0.33) (0.49) (-0.20) (-1.34) (-2.64)
5-10 km -4.03E-07 -4.35E-06 -1.40E-05 2.11E-06 -3.83E-06
(-0.06) (-1.16) (-1.84) (0.14) (-0.30)
10-20 km -4.55E-06 -2.10E-05 -9.56E-06 -1.84E-06 9.33E-06
(-0.31) (-1.88) (-0.72) (-0.10) (0.53)
20-30 km -1.81E-05 -1.25E-06 -1.50E-05 1.82E-06 -1.20E-05
(-0.83) (-0.11) (-0.94) (0.08) (-0.57)
Observations 1,213,107 1,407,264 684,292 1,058,784 601,308
1 Coefficients reported are ring level localization and urbanization effects obtained by IV Lasso estimation of equation (3.1) for each two-digit industry.
2 Post-lasso-orthogonalized variables are used in IV regression.
3 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
4 Numbers in parentheses are t-statistics clustered at the grid level.
Table A4.1. Spatial Decay Speed with Inverse Linear Distance Decay Function Based on IV Lasso Estimates
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Decay Speed 4.53E-05 6.74E-05 8.50E-05 -1.62E-05 6.82E-05 2.64E-05 0.00015338 6.73E-05
(4.26) (4.76) (5.02) (-0.53) (5.16) (2.57) (6.03) (5.84)
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Decay Speed 3.21E-05 2.48E-05 2.61E-05 0.000201 9.25E-05 7.82E-05 4.75E-05 0.00030005
(2.84) (2.71) (1.90) (6.95) (4.97) (6.79) (4.13) (7.31)
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Decay Speed 0.0001613 4.02E-05 8.07E-05 4.16E-05 7.74E-05 5.31E-05 7.45E-05 9.37E-05
(7.14) (3.91) (6.60) (3.98) (5.53) (4.88) (5.97) (6.32)
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Decay Speed 0.00010961 5.90E-05 8.73E-05 0.0002342 0.00020302
(5.88) (5.18) (5.25) (7.98) (9.52)
1 Coefficients reported are two-digit-industry-specific spatial decay speed obtained by OLS estimation of equation (3.2), where localization effects in IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = 1/d.
3 Numbers in parentheses are t-statistics.
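The regression described in note 1 can be illustrated for a single industry. The following is a minimal sketch, not the authors' code: it assumes each distance ring is represented by its midpoint, reads "weighted by 1/sd" as scaling each observation by the inverse of its standard error (so squared residuals carry weight 1/se^2), and backs out standard errors as |coefficient / t-statistic|. The function name `fit_decay_speed` is hypothetical. Inputs are the localization estimates for Furniture Manufacturing from column (9) of Table A3. Because the distance measure and weighting are assumptions, the output should not be expected to reproduce the tabulated decay speed.

```python
def fit_decay_speed(coefs, std_errs, distances, decay_fn):
    """Weighted least squares of ring coefficients on a constant and f(d).

    Each observation is scaled by 1/se, i.e. squared residuals get weight
    1/se^2 (an assumed reading of "weighted by 1/sd").
    Returns (intercept, decay_speed).
    """
    x = [decay_fn(d) for d in distances]
    w = [1.0 / se ** 2 for se in std_errs]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, coefs))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, coefs))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx ** 2)
    intercept = (sy - slope * sx) / sw
    return intercept, slope


# Assumption: each ring (0-1, 1-5, 5-10, 10-20, 20-30 km) is represented
# by its midpoint distance in km.
ring_midpoints_km = [0.5, 3.0, 7.5, 15.0, 25.0]

# Localization coefficients and t-statistics for Furniture Manufacturing
# (code 21), column (9) of Table A3.
coefs = [2.94e-05, -1.77e-06, 1.91e-06, 6.02e-06, -7.49e-06]
t_stats = [3.58, -0.50, 0.35, 0.74, -0.77]
std_errs = [abs(c / t) for c, t in zip(coefs, t_stats)]

# Inverse linear decay, f(d) = 1/d.
intercept, decay_speed = fit_decay_speed(
    coefs, std_errs, ring_midpoints_km, lambda d: 1.0 / d
)
print(f"intercept={intercept:.3e}, decay speed={decay_speed:.3e}")
```

A pooled regression with a full set of industry dummies and industry-specific interactions gives the same point estimates as fitting each industry separately in this way, which is why a one-industry fit is enough to show the mechanics.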
Table A4.2. Spatial Decay Speed with Inverse Exponential Distance Decay Function Based on IV Lasso Estimates
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Decay Speed 0.000113 0.000171 0.000238 -3.92E-05 0.000170 6.99E-05 0.000439 0.000168
(5.54) (6.24) (7.12) (-0.67) (6.72) (3.60) (8.60) (7.54)
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Decay Speed 8.10E-05 5.95E-05 6.89E-05 0.000491 0.000252 0.000204 0.000123 0.000919
(3.84) (3.43) (2.65) (8.50) (6.72) (9.15) (5.45) (10.42)
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Decay Speed 0.000445 9.91E-05 0.000199 0.000104 0.000205 0.000134 0.000195 0.000246
(9.94) (5.09) (8.32) (5.16) (7.46) (6.46) (8.08) (8.56)
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Decay Speed 0.000293 0.000154 0.000226 0.000630 0.000513
(8.08) (7.10) (6.97) (10.68) (12.34)
1 Coefficients reported are two-digit-industry-specific spatial decay speed obtained by OLS estimation of equation (3.2), where localization effects in IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = 1/e^d.
3 Numbers in parentheses are t-statistics.
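Since the appendix tables repeat the same regression under different decay functions, it may help to see how candidate forms could be ranked by goodness of fit. The sketch below is illustrative only and is not the authors' procedure: it uses synthetic ring coefficients generated from an inverse square rule, equal hypothetical standard errors, ring midpoints as the distance measure, and a weighted R-squared as the fit criterion. All of these are assumptions. The formulas for the inverse square and negative square forms are inferred from their names.

```python
import math


def weighted_r2(y, x, se):
    """Weighted R^2 of y = a + b*x, with weight 1/se^2 on squared residuals."""
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    # For a one-regressor model with an intercept, R^2 equals the squared
    # (weighted) correlation between x and y.
    return sxy ** 2 / (sxx * syy)


# Candidate decay functions f(d); d is distance in km.
DECAY_FORMS = {
    "1/d": lambda d: 1.0 / d,
    "1/d^2": lambda d: 1.0 / d ** 2,
    "1/e^d": lambda d: math.exp(-d),
    "-d": lambda d: -d,
    "-d^2": lambda d: -(d ** 2),
}

# Assumption: ring midpoints stand in for distance.
ring_midpoints_km = [0.5, 3.0, 7.5, 15.0, 25.0]

# Synthetic data (NOT from the paper): coefficients that follow an exact
# inverse square rule, with equal hypothetical standard errors.
synthetic_coefs = [1e-05 + 5e-05 / d ** 2 for d in ring_midpoints_km]
synthetic_ses = [1e-06] * len(ring_midpoints_km)

r2_by_form = {
    name: weighted_r2(synthetic_coefs, [f(d) for d in ring_midpoints_km], synthetic_ses)
    for name, f in DECAY_FORMS.items()
}
best_form = max(r2_by_form, key=r2_by_form.get)

for name, r2 in sorted(r2_by_form.items(), key=lambda kv: -kv[1]):
    print(f"{name:>6}: weighted R^2 = {r2:.4f}")
print("best fitting form on the synthetic data:", best_form)
```

Because the synthetic data are built from the inverse square rule, that form wins by construction. The example shows only the mechanics of the comparison and is not evidence about the actual estimates.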
Online Appendix
Table OA1. Summary Statistics - Grid Level Four-Digit Industry Employment Birth Share
Two-digit Industry Name CIC Code N Mean Std Dev
Food Processing 13 996705 1.50E-05 0.001
Food Production 14 1173972 1.62E-05 0.002
Beverage Production 15 756300 1.59E-05 0.002
Tobacco Processing 16 44346 6.76E-05 0.007
Textile Industry 17 1194920 1.67E-05 0.002
Garments & Other Fibre Products 18 165315 1.81E-05 0.002
Leather, Furs, Down & Related Products 19 485520 2.06E-05 0.002
Timber Processing, Bamboo, Cane, Palm Fibre & Straw Products 20 479000 1.67E-05 0.001
Furniture Manufacturing 21 250010 2.00E-05 0.002
Papermaking & Paper Products 22 295940 1.35E-05 0.001
Printing & Record Pressing 23 261945 1.91E-05 0.002
Stationery, Educational & Sports Goods 24 578214 2.25E-05 0.003
Petroleum Processing, Coking Products, Gas Production & Supply 25 183640 2.18E-05 0.002
Raw Chemical Materials & Chemical Products 26 1953240 1.43E-05 0.002
Medical & Pharmaceutical Products 27 361578 1.66E-05 0.001
Chemical Fibres 28 241101 2.49E-05 0.003
Rubber Products 29 453618 1.98E-05 0.002
Plastic Products 30 537660 1.67E-05 0.001
Non-metal Mineral Products 31 2026950 1.43E-05 0.001
Smelting & Pressing of Ferrous Metals 32 232588 1.72E-05 0.001
Smelting & Pressing of Nonferrous Metals 33 489096 1.64E-05 0.002
Metal Products 34 1070748 1.68E-05 0.001
Machinery & Equipment Manufacturing 35 1884769 1.59E-05 0.002
Special Equipment Manufacturing 36 2352880 1.66E-05 0.002
Transportation Equipment Manufacturing 37 1213107 1.65E-05 0.002
Electric Equipment & Machinery 39 1407264 1.71E-05 0.002
Electronic & Telecommunications 40 684292 2.05E-05 0.003
Instruments, Meters, Cultural & Official Machinery 41 1058784 1.98E-05 0.003
Artwork & Other Manufacturing 42 601308 2.00E-05 0.002
ALL NA 23,434,810 1.70E-05 0.002
Table OA2. OLS Estimates - Firm Birth Share (Single Establishment)
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Localization Effects
0-1 km 4.47E-05 6.87E-05 8.93E-05 -2.63E-05 7.87E-05 2.96E-05 0.00014873 6.91E-05
(6.86) (5.08) (4.49) (-1.29) (7.55) (4.41) (3.48) (7.84)
1-5 km 2.61E-06 4.30E-06 3.52E-06 2.55E-05 4.67E-06 2.82E-06 -3.28E-06 5.01E-06
(2.82) (2.28) (2.11) (0.63) (3.16) (2.09) (-0.67) (3.43)
5-10 km 2.07E-06 9.32E-07 -2.31E-07 -9.86E-06 -3.94E-07 -2.05E-07 4.16E-07 2.65E-06
(2.71) (0.72) (-0.14) (-0.54) (-0.33) (-0.23) (0.21) (2.76)
10-20 km 1.21E-06 1.76E-06 2.82E-07 2.05E-07 -3.22E-07 2.69E-07 -1.82E-06 2.65E-06
(1.23) (1.07) (0.16) (0.01) (-0.23) (0.20) (-0.99) (2.58)
20-30 km -5.52E-07 3.70E-07 -6.47E-07 7.95E-06 -2.30E-06 -4.29E-06 -1.50E-06 1.56E-06
(-0.44) (0.19) (-0.38) (0.54) (-1.61) (-2.43) (-0.53) (1.32)
Urbanization Effects
0-1 km 6.58E-06 1.26E-05 1.43E-05 7.08E-05 1.37E-05 1.53E-05 1.27E-05 6.00E-06
(4.51) (6.00) (4.58) (1.42) (7.10) (3.77) (3.50) (4.92)
1-5 km -4.14E-06 -2.41E-06 -4.75E-06 -7.54E-05 -3.72E-06 -5.80E-06 -4.05E-06 -5.63E-06
(-4.74) (-2.19) (-2.04) (-1.58) (-3.47) (-2.87) (-2.01) (-5.21)
5-10 km -2.23E-06 2.13E-08 -1.59E-06 8.98E-05 -4.81E-07 -1.17E-06 -4.48E-06 6.98E-07
(-1.73) (0.02) (-0.75) (1.90) (-0.44) (-0.51) (-1.40) (0.61)
10-20 km -3.60E-06 -6.55E-06 -3.47E-06 -5.73E-06 -3.22E-06 -7.12E-06 -5.33E-06 -3.73E-06
(-2.10) (-3.46) (-1.44) (-0.06) (-2.50) (-2.46) (-1.17) (-2.86)
20-30 km -1.36E-06 5.52E-07 -8.12E-07 -0.00024592 -3.05E-06 1.25E-06 9.61E-07 -6.68E-07
(-0.76) (0.31) (-0.30) (-1.36) (-1.67) (0.44) (0.23) (-0.38)
Observations 187,755 1,163,883 747,708 38,298 1,185,280 164,805 484,010 477,208
Adj. R-squared 0.0015 0.0007 0.0013 0.0033 0.0023 0.0025 0.0031 0.0068
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to firms with a single establishment.
Table OA2 (Continued). OLS Estimates - Firm Birth Share (Single Establishment)
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Localization Effects
0-1 km 3.72E-05 2.59E-05 2.28E-05 0.00014626 9.98E-05 6.32E-05 3.87E-05 0.00033713
(3.83) (5.05) (2.30) (2.46) (3.53) (8.56) (4.59) (2.17)
1-5 km 3.61E-06 2.07E-06 1.63E-07 5.44E-06 4.75E-06 2.61E-06 1.16E-06 7.10E-06
(1.51) (2.14) (0.10) (0.96) (1.74) (2.43) (1.17) (0.81)
5-10 km 3.39E-06 2.50E-08 -1.74E-06 -3.54E-07 2.51E-06 1.59E-07 -6.21E-07 -4.51E-06
(2.51) (0.03) (-1.07) (-0.06) (0.90) (0.21) (-0.74) (-0.90)
10-20 km -5.74E-07 4.72E-07 -1.73E-06 -5.99E-06 2.30E-07 -2.14E-07 5.99E-07 1.39E-06
(-0.18) (0.49) (-0.83) (-1.00) (0.11) (-0.31) (0.86) (0.30)
20-30 km -1.19E-06 -1.51E-06 1.29E-07 -1.16E-05 -2.65E-07 -2.54E-06 -2.03E-06 -2.41E-09
(-0.64) (-1.84) (0.08) (-1.67) (-0.09) (-3.12) (-2.23) (0.00)
Urbanization Effects
0-1 km 1.66E-05 7.02E-06 2.37E-05 2.14E-05 1.30E-05 1.20E-05 1.92E-05 1.49E-05
(3.35) (3.88) (5.55) (4.52) (2.64) (7.76) (6.90) (2.29)
1-5 km -6.66E-06 -2.60E-06 1.48E-06 -6.24E-07 -5.39E-06 -1.65E-06 -4.48E-06 -6.93E-06
(-1.55) (-1.30) (0.66) (-0.19) (-2.36) (-2.64) (-3.19) (-0.95)
5-10 km -3.82E-06 -3.86E-06 -3.15E-06 -5.07E-06 -2.65E-06 -2.78E-06 -3.58E-06 -6.98E-06
(-0.61) (-1.73) (-1.13) (-1.12) (-0.68) (-3.23) (-1.74) (-0.91)
10-20 km 1.37E-06 -6.53E-07 -2.05E-06 -8.30E-06 -8.44E-06 -1.62E-06 -5.94E-06 1.94E-05
(0.20) (-0.25) (-0.63) (-1.78) (-1.59) (-1.28) (-2.55) (1.35)
20-30 km -1.26E-06 -3.58E-06 -4.29E-06 -5.54E-06 1.28E-06 -2.41E-06 1.97E-06 -2.53E-05
(-0.24) (-1.62) (-0.79) (-1.11) (0.23) (-1.87) (0.86) (-1.77)
Observations 248,240 294,880 261,405 574,448 182,448 1,947,930 360,522 237,055
Adj. R-squared 0.0005 0.0007 0.0004 0.0007 0.0025 0.0011 0.0019 0.0066
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to firms with a single establishment.
Table OA2 (Continued). OLS Estimates - Firm Birth Share (Single Establishment)
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Localization Effects
0-1 km 0.00016478 4.04E-05 6.74E-05 4.28E-05 6.78E-05 4.53E-05 6.25E-05 8.86E-05
(4.51) (7.01) (8.73) (5.39) (4.85) (6.97) (8.90) (6.01)
1-5 km 6.00E-06 2.80E-06 6.73E-06 4.14E-06 4.47E-06 4.75E-06 4.10E-06 3.61E-06
(2.53) (3.63) (5.57) (3.16) (2.65) (4.84) (5.89) (2.72)
5-10 km 3.51E-06 1.61E-06 3.40E-06 -2.33E-07 -3.27E-07 2.64E-07 1.53E-06 1.60E-06
(1.59) (2.14) (4.01) (-0.28) (-0.25) (0.35) (2.93) (1.10)
10-20 km 3.75E-06 1.47E-06 9.00E-07 -1.49E-07 -2.77E-06 1.09E-07 3.23E-08 -1.69E-06
(2.02) (2.35) (0.89) (-0.18) (-1.99) (0.11) (0.02) (-1.28)
20-30 km 1.70E-06 1.66E-07 -9.04E-09 -8.51E-07 -1.34E-06 -1.17E-06 -1.84E-06 -3.15E-06
(0.73) (0.18) (-0.01) (-0.76) (-0.79) (-1.05) (-0.97) (-2.35)
Urbanization Effects
0-1 km 1.35E-05 1.28E-05 9.97E-06 9.58E-06 1.27E-05 1.51E-05 1.32E-05 1.96E-05
(4.67) (9.85) (9.00) (4.47) (4.99) (8.78) (10.82) (9.14)
1-5 km -3.96E-06 -2.92E-06 -3.15E-06 -2.60E-06 -5.62E-06 -4.19E-06 -2.58E-06 -1.72E-06
(-2.27) (-4.49) (-4.25) (-1.75) (-2.52) (-3.26) (-3.77) (-1.95)
5-10 km -9.76E-07 -2.24E-06 4.55E-07 -3.91E-06 -2.68E-07 -9.17E-07 -9.63E-07 -2.69E-06
(-0.39) (-2.76) (0.49) (-1.91) (-0.06) (-0.99) (-1.32) (-2.89)
10-20 km -7.37E-06 -1.41E-06 -4.31E-06 -4.19E-06 -7.14E-06 -1.86E-06 -3.54E-06 -2.27E-06
(-1.99) (-1.20) (-3.44) (-1.59) (-1.55) (-1.34) (-3.50) (-1.80)
20-30 km -5.16E-06 -4.17E-06 -2.20E-06 -6.52E-07 -3.62E-06 1.25E-06 -6.18E-07 -6.80E-06
(-1.22) (-2.98) (-1.88) (-0.21) (-0.51) (0.79) (-0.61) (-1.87)
Observations 450,945 536,499 2,023,470 231,468 485,208 1,066,302 1,881,762 2,338,480
Adj. R-squared 0.0043 0.0054 0.0019 0.0043 0.0018 0.0013 0.0012 0.0018
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to firms with a single establishment.
Table OA2 (Continued). OLS Estimates - Firm Birth Share (Single Establishment)
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Localization Effects
0-1 km 9.05E-05 5.18E-05 0.00011004 0.00015562 0.00017832
(4.90) (7.65) (4.85) (2.82) (5.89)
1-5 km 8.17E-06 3.78E-06 4.12E-06 6.31E-06 5.77E-06
(2.70) (3.43) (1.65) (1.15) (2.29)
5-10 km 1.89E-06 1.48E-06 -5.40E-07 4.07E-06 7.44E-06
(1.18) (2.00) (-0.27) (1.42) (3.80)
10-20 km -2.48E-06 1.11E-06 -2.00E-06 1.06E-06 3.79E-06
(-1.24) (1.54) (-0.84) (0.41) (2.05)
20-30 km -3.47E-06 6.96E-07 -4.13E-06 -3.45E-06 5.61E-07
(-1.91) (0.77) (-1.96) (-0.78) (0.31)
Urbanization Effects
0-1 km 1.34E-05 1.76E-05 1.73E-05 1.28E-05 1.37E-05
(5.47) (7.75) (4.88) (4.68) (5.31)
1-5 km -1.71E-06 -1.71E-06 -3.08E-06 1.68E-06 -7.58E-06
(-1.69) (-1.71) (-1.43) (0.64) (-3.71)
5-10 km -7.94E-07 -1.87E-06 4.82E-08 -5.29E-06 -1.98E-06
(-0.76) (-2.05) (0.02) (-1.36) (-0.88)
10-20 km -5.73E-06 -2.27E-06 -3.67E-06 -5.14E-06 -3.74E-08
(-2.70) (-1.34) (-0.93) (-1.55) (-0.01)
20-30 km -6.44E-08 -7.38E-07 -4.03E-06 -5.25E-06 -4.67E-06
(-0.03) (-0.27) (-1.03) (-1.41) (-0.97)
Observations 1,207,689 1,403,952 679,714 1,052,592 597,576
Adj. R-squared 0.0012 0.001 0.0019 0.0007 0.0048
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to firms with a single establishment.
Table OA3. OLS Estimates - Firm Birth Share (Non-SOEs)
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Localization Effects
0-1 km 4.60E-05 5.28E-05 8.78E-05 -2.89E-05 7.92E-05 2.92E-05 0.00014555 6.79E-05
(6.87) (5.47) (4.62) (-1.36) (7.78) (4.44) (3.5) (7.92)
1-5 km 2.28E-06 4.55E-06 2.86E-06 -1.96E-05 4.27E-06 2.81E-06 -3.14E-06 5.28E-06
(2.58) (2.32) (1.78) (-1.41) (3) (2.12) (-0.64) (3.74)
5-10 km 1.94E-06 3.01E-07 -9.34E-07 -1.30E-05 -1.93E-07 -2.19E-07 2.62E-07 2.68E-06
(2.62) (0.26) (-0.64) (-1.50) (-0.17) (-0.25) (0.13) (2.91)
10-20 km 1.04E-06 1.27E-06 -4.56E-07 -3.62E-06 -5.69E-07 2.52E-07 -6.66E-07 2.53E-06
(1.14) (0.99) (-0.27) (-1.05) (-0.41) (0.19) (-0.37) (2.65)
20-30 km -8.59E-07 -1.96E-07 -6.41E-07 -1.91E-06 -2.41E-06 -4.35E-06 -1.36E-06 1.81E-06
(-0.75) (-0.14) (-0.37) (-0.70) (-1.74) (-2.48) (-0.49) (1.60)
Urbanization Effects
0-1 km 6.42E-06 1.26E-05 1.37E-05 5.08E-05 1.34E-05 1.52E-05 1.24E-05 6.23E-06
(4.55) (5.80) (4.45) (1.26) (7.11) (3.76) (3.53) (5.20)
1-5 km -4.25E-06 -2.54E-06 -4.77E-06 -4.50E-05 -3.85E-06 -5.83E-06 -3.92E-06 -5.77E-06
(-4.89) (-2.33) (-2.06) (-1.15) (-3.60) (-2.88) (-1.85) (-5.62)
5-10 km -2.29E-06 -1.45E-07 -9.76E-07 7.32E-05 -4.11E-07 -1.43E-06 -5.08E-06 4.79E-07
(-1.85) (-0.11) (-0.47) (1.69) (-0.42) (-0.62) (-1.47) (0.43)
10-20 km -3.46E-06 -6.46E-06 -3.13E-06 -7.04E-05 -3.28E-06 -6.42E-06 -5.74E-06 -3.36E-06
(-2.12) (-3.49) (-1.37) (-1.25) (-2.6) (-2.28) (-1.3) (-2.69)
20-30 km -1.21E-06 8.15E-07 -1.60E-06 -9.81E-05 -2.79E-06 1.18E-06 3.07E-08 -1.43E-06
(-0.75) (0.45) (-0.64) (-0.87) (-1.59) (0.41) (0.01) (-0.84)
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to non-SOEs.
Table OA3 (Continued). OLS Estimates - Firm Birth Share (Non-SOEs)
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Localization Effects
0-1 km 3.52E-05 2.64E-05 2.22E-05 0.00014124 0.00010132 6.39E-05 3.97E-05 0.00032937
(3.79) (5.23) (2.23) (2.45) (3.53) (8.62) (4.64) (2.17)
1-5 km 4.73E-06 2.06E-06 1.95E-06 5.28E-06 6.48E-06 2.39E-06 6.44E-07 7.25E-06
(1.93) (2.13) (0.97) (0.95) (2.25) (2.27) (0.67) (0.84)
5-10 km 3.10E-06 -5.94E-08 -2.06E-06 -4.85E-07 2.43E-06 2.95E-07 -1.60E-07 -4.58E-06
(2.37) (-0.08) (-1.21) (-0.08) (0.86) (0.41) (-0.19) (-0.95)
10-20 km -2.76E-07 3.52E-07 -1.79E-06 -5.80E-06 1.22E-06 -1.62E-07 1.75E-07 7.07E-06
(-0.09) (0.37) (-0.87) (-1.01) (0.62) (-0.25) (0.26) (1.15)
20-30 km -1.25E-07 -1.60E-06 -2.82E-07 -1.16E-05 1.74E-07 -2.46E-06 -2.67E-06 1.37E-06
(-0.08) (-2.01) (-0.18) (-1.73) (0.07) (-3.26) (-2.94) (0.39)
Urbanization Effects
0-1 km 1.62E-05 6.60E-06 2.38E-05 2.13E-05 1.26E-05 1.16E-05 1.89E-05 1.38E-05
(3.34) (3.63) (5.6) (4.54) (2.59) (7.59) (6.67) (2.17)
1-5 km -6.60E-06 -2.21E-06 1.00E-06 -7.90E-07 -5.15E-06 -1.76E-06 -4.79E-06 -5.93E-06
(-1.55) (-1.07) (0.45) (-0.24) (-2.25) (-2.81) (-3.41) (-0.83)
5-10 km -3.80E-06 -3.21E-06 -3.22E-06 -5.29E-06 -3.66E-06 -2.65E-06 -3.19E-06 -4.23E-06
(-0.61) (-1.55) (-1.16) (-1.2) (-0.93) (-3.08) (-1.57) (-0.58)
10-20 km 1.50E-06 -7.09E-07 -2.57E-06 -7.99E-06 -6.22E-06 -1.35E-06 -5.90E-06 1.68E-05
(0.22) (-0.27) (-0.81) (-1.81) (-1.22) (-1.07) (-2.63) (1.25)
20-30 km -9.56E-07 -3.09E-06 -2.83E-06 -5.90E-06 7.39E-07 -2.63E-06 2.93E-06 -2.30E-05
(-0.18) (-1.4) (-0.53) (-1.25) (0.13) (-1.99) (1.32) (-1.65)
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to non-SOEs.
Table OA3 (Continued). OLS Estimates - Firm Birth Share (Non-SOEs)
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Localization Effects
0-1 km 0.00016785 3.97E-05 6.73E-05 4.16E-05 6.79E-05 4.38E-05 6.20E-05 8.85E-05
(4.58) (7) (8.73) (5.42) (4.84) (6.92) (8.86) (6.09)
1-5 km 4.97E-06 2.68E-06 6.47E-06 4.23E-06 4.32E-06 4.62E-06 4.24E-06 3.47E-06
(2.12) (3.53) (5.46) (3.21) (2.45) (4.74) (5.67) (2.66)
5-10 km 3.29E-06 1.63E-06 3.39E-06 5.09E-11 1.57E-07 1.19E-07 2.79E-06 1.39E-06
(1.54) (2.18) (4.08) (0.00) (0.12) (0.16) (2.62) (1.01)
10-20 km 3.90E-06 1.36E-06 7.17E-07 1.65E-08 -1.63E-06 9.66E-08 7.84E-07 -1.37E-06
(2.12) (2.21) (0.75) (0.02) (-1.16) (0.10) (0.76) (-1.12)
20-30 km 1.36E-06 1.56E-07 -2.83E-07 -6.03E-07 5.02E-07 -8.32E-07 -7.54E-07 -3.03E-06
(0.59) (0.17) (-0.26) (-0.58) (0.30) (-0.78) (-0.71) (-2.51)
Urbanization Effects
0-1 km 1.33E-05 1.27E-05 9.76E-06 9.37E-06 1.30E-05 1.50E-05 1.30E-05 1.93E-05
(4.67) (9.82) (8.85) (4.4) (5.15) (8.79) (10.78) (9.13)
1-5 km -3.85E-06 -2.90E-06 -3.17E-06 -2.62E-06 -5.57E-06 -4.20E-06 -2.64E-06 -1.69E-06
(-2.29) (-4.53) (-4.3) (-1.76) (-2.51) (-3.28) (-3.99) (-1.95)
5-10 km -3.43E-07 -2.24E-06 5.36E-07 -4.13E-06 -5.48E-07 -9.73E-07 -1.09E-06 -2.71E-06
(-0.14) (-2.79) (0.58) (-2.04) (-0.12) (-1.05) (-1.49) (-2.93)
10-20 km -7.71E-06 -1.18E-06 -4.29E-06 -3.96E-06 -7.21E-06 -1.99E-06 -3.26E-06 -2.19E-06
(-2.05) (-1.01) (-3.42) (-1.51) (-1.6) (-1.46) (-3.27) (-1.71)
20-30 km -4.88E-06 -4.14E-06 -1.91E-06 -1.15E-06 -3.96E-06 7.91E-07 -6.88E-07 -6.84E-06
(-1.17) (-2.95) (-1.67) (-0.39) (-0.59) (0.52) (-0.66) (-1.93)
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to non-SOEs.
Table OA3 (Continued). OLS Estimates - Firm Birth Share (Non-SOEs)
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Localization Effects
0-1 km 9.76E-05 5.01E-05 0.0001098 0.00015154 0.00017236
(4.65) (7.84) (4.93) (2.83) (6.02)
1-5 km 6.09E-06 3.64E-06 3.53E-06 5.88E-06 5.15E-06
(2.15) (3.39) (1.4) (1.11) (2.15)
5-10 km 1.71E-06 1.63E-06 -7.16E-07 3.01E-06 7.32E-06
(1.08) (2.21) (-0.36) (1.16) (3.89)
10-20 km -1.76E-06 8.00E-07 -2.54E-06 7.13E-07 2.72E-06
(-0.86) (1.17) (-1.14) (0.30) (1.51)
20-30 km -2.67E-06 3.76E-07 -4.99E-06 -3.85E-06 -5.01E-07
(-1.66) (0.43) (-2.43) (-0.94) (-0.29)
Urbanization Effects
0-1 km 1.37E-05 1.73E-05 1.65E-05 1.25E-05 1.38E-05
(5.64) (7.79) (4.76) (4.61) (5.4)
1-5 km -1.56E-06 -1.79E-06 -2.89E-06 1.65E-06 -8.18E-06
(-1.52) (-1.78) (-1.35) (0.63) (-4.06)
5-10 km -8.59E-07 -1.93E-06 2.50E-07 -5.53E-06 -8.99E-07
(-0.79) (-2.22) (0.12) (-1.41) (-0.45)
10-20 km -5.64E-06 -2.58E-06 -3.33E-06 -4.68E-06 -1.10E-06
(-2.67) (-1.54) (-0.88) (-1.39) (-0.36)
20-30 km -8.31E-07 -8.21E-07 -3.96E-06 -6.02E-06 -5.95E-06
(-0.31) (-0.31) (-1.03) (-1.66) (-1.24)
1 Coefficients reported are ring level localization and urbanization effects obtained by OLS estimation of equation (3.1) for each two-digit industry.
2 Control variables include Herfindahl index representing industry organization for each four-digit industry within 30 km of each grid, Herfindahl index representing industry diversity within 30 km of each grid, prefecture city fixed effects, and four-digit industry fixed effects.
3 Numbers in parentheses are t-statistics clustered at the grid level.
4 Sample is restricted to non-SOEs.
Table OA4.1. Spatial Decay Speed with Negative Linear Distance Decay Function Based on IV Lasso Estimates
(1) (2) (3) (4) (5) (6) (7) (8)
Name: (1) Food Processing; (2) Food Production; (3) Beverage Production; (4) Tobacco Processing; (5) Textile Industry; (6) Garments & Other Fiber Products; (7) Leather, Furs, Down & Related Products; (8) Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
Code 13 14 15 16 17 18 19 20
Decay Speed 6.89E-07 8.69E-07 6.80E-07 3.27E-07 7.71E-07 9.08E-08 1.04E-06 8.98E-07
(0.75) (0.85) (0.64) (0.11) (0.72) (0.09) (0.74) (1.06)
(9) (10) (11) (12) (13) (14) (15) (16)
Name: (9) Furniture Manufacturing; (10) Papermaking & Paper Products; (11) Printing & Record Pressing; (12) Stationery, Educational & Sports Goods; (13) Petroleum Processing, Coking Products, Gas Production & Supply; (14) Raw Chemical Materials & Chemical Products; (15) Medical & Pharmaceutical Products; (16) Chemical Fibers
Code 21 22 23 24 25 26 27 28
Decay Speed 5.88E-07 5.31E-07 1.77E-07 2.76E-06 7.18E-07 8.57E-07 5.44E-07 1.01E-06
(0.51) (0.63) (0.13) (1.67) (0.69) (0.99) (0.71) (0.63)
(17) (18) (19) (20) (21) (22) (23) (24)
Name: (17) Rubber Products; (18) Plastic Products; (19) Non-metal Mineral Products; (20) Smelting & Pressing of Ferrous Metals; (21) Smelting & Pressing of Nonferrous Metals; (22) Metal Products; (23) Machinery & Equipment Manufacturing; (24) Special Equipment Manufacturing
Code 29 30 31 32 33 34 35 36
Decay Speed 1.01E-06 6.00E-07 1.26E-06 6.91E-07 7.54E-07 8.36E-07 6.90E-07 1.11E-06
(0.75) (0.64) (1.36) (0.90) (0.81) (0.86) (0.74) (0.99)
(25) (26) (27) (28) (29)
Name: (25) Transportation Equipment Manufacturing; (26) Electric Equipment & Machinery; (27) Electronic & Telecommunications; (28) Instruments, Meters, Cultural & Official Machinery; (29) Artwork & Other Manufacturing
Code 37 39 40 41 42
Decay Speed 9.78E-07 6.56E-07 1.07E-06 1.76E-06 2.98E-06
(0.71) (0.67) (0.89) (1.09) (1.83)
1 Coefficients reported are two-digit-industry-specific spatial decay speed obtained by OLS estimation of equation (3.2), where localization effects in IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = −d.
3 Numbers in parentheses are t-statistics.
Table OA4.2. Spatial Decay Speed with Negative Square Distance Decay Function Based on IV Lasso Estimates
No.   Code  Decay Speed t-stat    Industry
(1)   13    7.55E-09    (0.47)    Food Processing
(2)   14    7.54E-09    (0.38)    Food Production
(3)   15    6.42E-09    (0.34)    Beverage Production
(4)   16    1.78E-08    (0.29)    Tobacco Processing
(5)   17    1.23E-08    (0.67)    Textile Industry
(6)   18    9.83E-09    (0.52)    Garments & Other Fiber Products
(7)   19    8.24E-09    (0.31)    Leather, Furs, Down & Related Products
(8)   20    8.32E-09    (0.49)    Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
(9)   21    8.74E-09    (0.43)    Furniture Manufacturing
(10)  22    5.91E-09    (0.42)    Papermaking & Paper Products
(11)  23    3.05E-09    (0.16)    Printing & Record Pressing
(12)  24    2.78E-08    (0.70)    Stationery, Educational & Sports Goods
(13)  25    1.18E-08    (0.47)    Petroleum Processing, Coking Products, Gas Production & Supply
(14)  26    9.36E-09    (0.67)    Raw Chemical Materials & Chemical Products
(15)  27    6.36E-09    (0.43)    Medical & Pharmaceutical Products
(16)  28    9.12E-09    (0.27)    Chemical Fibers
(17)  29    1.05E-08    (0.45)    Rubber Products
(18)  30    6.21E-09    (0.43)    Plastic Products
(19)  31    1.23E-08    (0.76)    Non-metal Mineral Products
(20)  32    6.91E-09    (0.43)    Smelting & Pressing of Ferrous Metals
(21)  33    8.60E-09    (0.44)    Smelting & Pressing of Nonferrous Metals
(22)  34    8.21E-09    (0.53)    Metal Products
(23)  35    1.06E-08    (0.69)    Machinery & Equipment Manufacturing
(24)  36    1.25E-08    (0.71)    Special Equipment Manufacturing
(25)  37    1.54E-08    (0.73)    Transportation Equipment Manufacturing
(26)  39    7.04E-09    (0.48)    Electric Equipment & Machinery
(27)  40    1.48E-08    (0.65)    Electronic & Telecommunications
(28)  41    1.76E-08    (0.56)    Instruments, Meters, Cultural & Official Machinery
(29)  42    1.67E-08    (0.77)    Artwork & Other Manufacturing
1 Coefficients reported are two-digit-industry-specific spatial decay speeds obtained by OLS estimation of equation (3.2), where localization effects from IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = −d^2.
3 Numbers in parentheses are t-statistics.
Table OA4.3. Spatial Decay Speed with Inverse Square Exponential Distance Decay Function Based on IV Lasso Estimates
No.   Code  Decay Speed t-stat    Industry
(1)   13    0.000114    (1.67)    Food Processing
(2)   14    0.000163    (2.20)    Food Production
(3)   15    0.000245    (2.32)    Beverage Production
(4)   16    -0.000030   (-0.24)   Tobacco Processing
(5)   17    0.000224    (2.63)    Textile Industry
(6)   18    0.000170    (1.93)    Garments & Other Fiber Products
(7)   19    0.000290    (2.25)    Leather, Furs, Down & Related Products
(8)   20    0.000126    (2.00)    Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
(9)   21    0.000214    (2.17)    Furniture Manufacturing
(10)  22    0.000107    (1.68)    Papermaking & Paper Products
(11)  23    0.000181    (1.91)    Printing & Record Pressing
(12)  24    0.000309    (2.35)    Stationery, Educational & Sports Goods
(13)  25    0.000300    (2.31)    Petroleum Processing, Coking Products, Gas Production & Supply
(14)  26    0.000211    (3.14)    Raw Chemical Materials & Chemical Products
(15)  27    0.000189    (2.46)    Medical & Pharmaceutical Products
(16)  28    0.000370    (1.99)    Chemical Fibers
(17)  29    0.000210    (2.05)    Rubber Products
(18)  30    0.000147    (2.26)    Plastic Products
(19)  31    0.000184    (2.76)    Non-metal Mineral Products
(20)  32    0.000145    (1.99)    Smelting & Pressing of Ferrous Metals
(21)  33    0.000185    (2.22)    Smelting & Pressing of Nonferrous Metals
(22)  34    0.000185    (2.72)    Metal Products
(23)  35    0.000195    (2.89)    Machinery & Equipment Manufacturing
(24)  36    0.000317    (3.50)    Special Equipment Manufacturing
(25)  37    0.000249    (2.42)    Transportation Equipment Manufacturing
(26)  39    0.000216    (3.07)    Electric Equipment & Machinery
(27)  40    0.000296    (2.68)    Electronic & Telecommunications
(28)  41    0.000238    (2.11)    Instruments, Meters, Cultural & Official Machinery
(29)  42    0.000335    (3.08)    Artwork & Other Manufacturing
1 Coefficients reported are two-digit-industry-specific spatial decay speeds obtained by OLS estimation of equation (3.2), where localization effects from IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = 1/e^(2d).
3 Numbers in parentheses are t-statistics.
Table OA4.4. Spatial Decay Speed with Negative Cube Distance Decay Function Based on IV Lasso Estimates
No.   Code  Decay Speed t-stat    Industry
(1)   13    2.16E-10    (0.42)    Food Processing
(2)   14    2.02E-10    (0.31)    Food Production
(3)   15    1.74E-10    (0.28)    Beverage Production
(4)   16    5.54E-10    (0.27)    Tobacco Processing
(5)   17    3.38E-10    (0.58)    Textile Industry
(6)   18    2.93E-10    (0.47)    Garments & Other Fiber Products
(7)   19    2.10E-10    (0.25)    Leather, Furs, Down & Related Products
(8)   20    2.19E-10    (0.40)    Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
(9)   21    2.45E-10    (0.38)    Furniture Manufacturing
(10)  22    1.70E-10    (0.37)    Papermaking & Paper Products
(11)  23    6.95E-11    (0.11)    Printing & Record Pressing
(12)  24    8.02E-10    (0.62)    Stationery, Educational & Sports Goods
(13)  25    3.17E-10    (0.39)    Petroleum Processing, Coking Products, Gas Production & Supply
(14)  26    2.62E-10    (0.59)    Raw Chemical Materials & Chemical Products
(15)  27    1.88E-10    (0.39)    Medical & Pharmaceutical Products
(16)  28    2.41E-10    (0.23)    Chemical Fibers
(17)  29    2.87E-10    (0.38)    Rubber Products
(18)  30    1.73E-10    (0.37)    Plastic Products
(19)  31    3.37E-10    (0.64)    Non-metal Mineral Products
(20)  32    1.83E-10    (0.36)    Smelting & Pressing of Ferrous Metals
(21)  33    2.15E-10    (0.34)    Smelting & Pressing of Nonferrous Metals
(22)  34    2.23E-10    (0.44)    Metal Products
(23)  35    3.04E-10    (0.60)    Machinery & Equipment Manufacturing
(24)  36    3.52E-10    (0.62)    Special Equipment Manufacturing
(25)  37    4.21E-10    (0.63)    Transportation Equipment Manufacturing
(26)  39    1.86E-10    (0.39)    Electric Equipment & Machinery
(27)  40    4.12E-10    (0.56)    Electronic & Telecommunications
(28)  41    4.95E-10    (0.49)    Instruments, Meters, Cultural & Official Machinery
(29)  42    4.73E-10    (0.68)    Artwork & Other Manufacturing
1 Coefficients reported are two-digit-industry-specific spatial decay speeds obtained by OLS estimation of equation (3.2), where localization effects from IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = −d^3.
3 Numbers in parentheses are t-statistics.
Table OA4.5. Spatial Decay Speed with Inverse Cube Distance Decay Function Based on IV Lasso Estimates
No.   Code  Decay Speed t-stat    Industry
(1)   13    0.000044    (9.04)    Food Processing
(2)   14    0.000061    (9.20)    Food Production
(3)   15    0.000085    (10.55)   Beverage Production
(4)   16    -0.000024   (-2.72)   Tobacco Processing
(5)   17    0.000078    (12.95)   Textile Industry
(6)   18    0.000029    (5.98)    Garments & Other Fiber Products
(7)   19    0.000146    (12.02)   Leather, Furs, Down & Related Products
(8)   20    0.000064    (11.54)   Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
(9)   21    0.000033    (5.69)    Furniture Manufacturing
(10)  22    0.000026    (6.10)    Papermaking & Paper Products
(11)  23    0.000022    (3.76)    Printing & Record Pressing
(12)  24    0.000144    (10.01)   Stationery, Educational & Sports Goods
(13)  25    0.000093    (9.58)    Petroleum Processing, Coking Products, Gas Production & Supply
(14)  26    0.000062    (12.18)   Raw Chemical Materials & Chemical Products
(15)  27    0.000040    (7.35)    Medical & Pharmaceutical Products
(16)  28    0.000318    (13.89)   Chemical Fibers
(17)  29    0.000161    (14.28)   Rubber Products
(18)  30    0.000038    (8.47)    Plastic Products
(19)  31    0.000063    (12.18)   Non-metal Mineral Products
(20)  32    0.000040    (7.76)    Smelting & Pressing of Ferrous Metals
(21)  33    0.000068    (9.71)    Smelting & Pressing of Nonferrous Metals
(22)  34    0.000043    (8.92)    Metal Products
(23)  35    0.000059    (11.81)   Machinery & Equipment Manufacturing
(24)  36    0.000087    (12.21)   Special Equipment Manufacturing
(25)  37    0.000087    (11.04)   Transportation Equipment Manufacturing
(26)  39    0.000048    (10.10)   Electric Equipment & Machinery
(27)  40    0.000109    (12.29)   Electronic & Telecommunications
(28)  41    0.000145    (10.68)   Instruments, Meters, Cultural & Official Machinery
(29)  42    0.000168    (16.66)   Artwork & Other Manufacturing
1 Coefficients reported are two-digit-industry-specific spatial decay speeds obtained by OLS estimation of equation (3.2), where localization effects from IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = 1/d^3.
3 Numbers in parentheses are t-statistics.
Table OA4.6. Spatial Decay Speed with Inverse Cube Exponential Distance Decay Function Based on IV Lasso Estimates
No.   Code  Decay Speed t-stat    Industry
(1)   13    0.000885    (8.35)    Food Processing
(2)   14    0.001229    (8.50)    Food Production
(3)   15    0.001704    (9.74)    Beverage Production
(4)   16    -0.000487   (-2.53)   Tobacco Processing
(5)   17    0.001561    (11.95)   Textile Industry
(6)   18    0.000584    (5.50)    Garments & Other Fiber Products
(7)   19    0.002930    (11.13)   Leather, Furs, Down & Related Products
(8)   20    0.001274    (10.66)   Timber Processing, Bamboo, Cane, Palm Fiber & Straw Products
(9)   21    0.000665    (5.25)    Furniture Manufacturing
(10)  22    0.000520    (5.62)    Papermaking & Paper Products
(11)  23    0.000434    (3.46)    Printing & Record Pressing
(12)  24    0.002867    (9.21)    Stationery, Educational & Sports Goods
(13)  25    0.001854    (8.83)    Petroleum Processing, Coking Products, Gas Production & Supply
(14)  26    0.001246    (11.24)   Raw Chemical Materials & Chemical Products
(15)  27    0.000806    (6.79)    Medical & Pharmaceutical Products
(16)  28    0.006355    (12.82)   Chemical Fibers
(17)  29    0.003227    (13.19)   Rubber Products
(18)  30    0.000764    (7.82)    Plastic Products
(19)  31    0.001260    (11.22)   Non-metal Mineral Products
(20)  32    0.000800    (7.15)    Smelting & Pressing of Ferrous Metals
(21)  33    0.001361    (8.95)    Smelting & Pressing of Nonferrous Metals
(22)  34    0.000849    (8.22)    Metal Products
(23)  35    0.001172    (10.89)   Machinery & Equipment Manufacturing
(24)  36    0.001731    (11.26)   Special Equipment Manufacturing
(25)  37    0.001743    (10.17)   Transportation Equipment Manufacturing
(26)  39    0.000968    (9.32)    Electric Equipment & Machinery
(27)  40    0.002175    (11.34)   Electronic & Telecommunications
(28)  41    0.002909    (9.85)    Instruments, Meters, Cultural & Official Machinery
(29)  42    0.003364    (15.39)   Artwork & Other Manufacturing
1 Coefficients reported are two-digit-industry-specific spatial decay speeds obtained by OLS estimation of equation (3.2), where localization effects from IV Lasso estimation (weighted by 1/sd) are regressed on two-digit industry dummies and interaction terms of the decay function and two-digit industry dummies.
2 The spatial decay function is specified as f(d) = 1/e^(3d).
3 Numbers in parentheses are t-statistics.
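The notes to these tables and to Table OA5 name seven decay functional forms. As a reading aid, the sketch below writes them out as plain functions and evaluates them on a small grid of distances. The dictionary keys, the helper name, and the distance grid are illustrative assumptions, and the distance unit is not stated in this part of the appendix.

```python
import math

# Functional forms as given in the table notes; d is distance (unit assumed).
DECAY_FORMS = {
    "negative_linear": lambda d: -d,                          # f(d) = -d
    "negative_square": lambda d: -(d ** 2),                   # f(d) = -d^2
    "negative_cube": lambda d: -(d ** 3),                     # f(d) = -d^3
    "inverse_square": lambda d: 1.0 / d ** 2,                 # f(d) = 1/d^2
    "inverse_cube": lambda d: 1.0 / d ** 3,                   # f(d) = 1/d^3
    "inverse_square_exponential": lambda d: math.exp(-2.0 * d),  # 1/e^(2d)
    "inverse_cube_exponential": lambda d: math.exp(-3.0 * d),    # 1/e^(3d)
}


def evaluate_forms(distances):
    """Return {form name: [f(d) for each d]} for the given positive distances."""
    return {name: [f(d) for d in distances] for name, f in DECAY_FORMS.items()}


# Hypothetical distance grid, chosen only for illustration.
grid = [1.0, 2.0, 5.0, 10.0]
values_by_form = evaluate_forms(grid)
```

Every form is decreasing in d, so a positive coefficient on f(d) indicates attenuation under each specification. The forms differ in scale and curvature, which is why coefficient magnitudes are not comparable across the tables.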
Table OA5. Spatial Decay Speed And Industry Characteristics (Non-SOEs)
Regression Using OLS Estimates of Localization Effects
                                (1)           (2)           (3)           (4)
f(d)                            0.000092      0.0000911     0.0001066     0.0000878
                                (9.33)        (9.12)        (10.43)       (7.57)
f(d) × knowledge spillovers     0.0000352                                 0.0000189
                                (3.30)                                    (0.87)
f(d) × labor market pooling                   0.0000377                   0.0000226
                                              (3.24)                      (0.91)
f(d) × input sharing                                        -7.63E-06     6.02E-06
                                                            (-0.69)       (0.50)
f(d) × natural advantage        6.07E-06      0.0000124     7.74E-06      8.08E-06
                                (0.59)        (1.17)        (0.69)        (0.71)
f(d) × high SOE share           -0.0000456    -0.0000526    -0.0000347    -0.0000512
                                (-4.20)       (-4.48)       (-3.24)       (-4.11)
Constant                        -9.32E-06     -9.32E-06     -9.32E-06     -9.32E-06
                                (-3.06)       (-3.06)       (-2.96)       (-3.05)
Adj. R-squared                  0.5899        0.5889        0.5596        0.5866
1 Results are obtained by OLS estimation of equation (3.3), where localization effects from OLS estimation for the sample of non-SOEs are regressed on a spatial decay function and interaction terms of the decay function and various industry characteristic indicators.
2 The spatial decay function is specified as f(d) = 1/d^2.
3 For each two-digit industry, the indicator for reliance on knowledge spillovers equals one if the ratio of new product to total product in the industry is higher than the median of all industries and zero otherwise. The indicator for reliance on labor pooling equals one if the percentage of college-educated workers in the industry is higher than the median of all industries and zero otherwise. The indicator for reliance on input sharing equals one if transportation cost per shipment in the industry is higher than the median of all industries and zero otherwise. The indicator for reliance on natural advantage equals one if at least two of the three cost variables (water, energy, and natural resources cost per shipment) in the industry are higher than the median of all industries and zero otherwise. The indicator variable high SOE share equals one if the percentage of SOEs in the industry is higher than the median of all industries and zero otherwise.
4 Numbers in parentheses are t-statistics.
5 Sample is restricted to non-SOE firms.
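The indicator definitions in note 3 are median splits across two-digit industries, and the interaction coefficients shift the slope on f(d) for industries whose indicator equals one. The following sketch shows one way to build such indicators and to combine coefficients into an implied slope. All function names and the toy inputs are hypothetical, and the code is not taken from the paper. The only table values used are the column (1) coefficients on f(d) and on f(d) × high SOE share.

```python
from statistics import median


def above_median_indicator(values):
    """values: {industry_code: measure}. Returns 1 where measure > median."""
    cutoff = median(values.values())
    return {code: int(v > cutoff) for code, v in values.items()}


def natural_advantage_indicator(water, energy, resources):
    """1 if at least two of the three cost measures are above their medians."""
    flags = [above_median_indicator(m) for m in (water, energy, resources)]
    return {code: int(sum(f[code] for f in flags) >= 2) for code in water}


def implied_slope(base, interactions, indicators):
    """Slope on f(d) for an industry: base plus active interaction terms."""
    return base + sum(
        coef * indicators.get(name, 0) for name, coef in interactions.items()
    )


# Hypothetical measure for three industries, for illustration only.
toy_share = {13: 0.10, 14: 0.30, 15: 0.20}
toy_indicator = above_median_indicator(toy_share)

# Column (1): slope for a high-SOE-share industry, other indicators at zero.
slope_high_soe = implied_slope(
    0.000092, {"high_soe_share": -0.0000456}, {"high_soe_share": 1}
)
```

Under column (1), a high SOE share roughly halves the coefficient on f(d) relative to the baseline, holding the other indicators at zero.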
Figure OA1: Concentration of Prefecture City Level Employment by Industry
Panels, one per two-digit industry:
13. Food Processing
14. Food Production
15. Beverage Production
16. Tobacco Processing
17. Textile Industry
18. Garments & Other Fibre Products
19. Leather, Furs, Down & Related Products
20. Timber Processing, Bamboo, Cane, Palm Fibre & Straw Products
21. Furniture Manufacturing
22. Papermaking & Paper Products
23. Printing & Record Pressing
24. Stationery, Educational & Sports Goods
25. Petroleum Processing, Coking Products, Gas Production & Supply
26. Raw Chemical Materials & Chemical Products
27. Medical & Pharmaceutical Products
28. Chemical Fibres
29. Rubber Products
30. Plastic Products
31. Non-metal Mineral Products
32. Smelting & Pressing of Ferrous Metals
33. Smelting & Pressing of Nonferrous Metals
34. Metal Products
35. Machinery & Equipment Manufacturing
36. Special Equipment Manufacturing
37. Transportation Equipment Manufacturing
39. Electric Equipment & Machinery
40. Electronic & Telecommunications
41. Instruments, Meters, Cultural & Official Machinery
42. Artwork & Other Manufacturing