Economic geography and the scaling of urban and regional income in India
Anand Sahasranaman1,2,* and Luís M. A. Bettencourt3,4,#
1Division of Mathematics and Computer Science, Krea University, Sri City, AP 517646, India.
2Centre for Complexity Science, Dept of Mathematics, Imperial College London, London
SW72AZ, UK.
3Mansueto Institute for Urban Innovation, Dept of Ecology and Evolution, Dept of
Sociology, University of Chicago, Chicago IL 60637, USA.
4Santa Fe Institute, Santa Fe NM 87501, USA
* Corresponding Author. Email: [email protected]
# Email: [email protected]
Abstract:
We undertake an exploration of the economic income (Gross Domestic Product, GDP) of
Indian districts and cities based on scaling analyses of the dependence of these quantities on
associated population size. Scaling analysis provides a straightforward method for the
identification of network effects in socioeconomic organization, which are the tell-tale of
cities and urbanization. For districts, a sub-state regional administrative division in India, we
find almost linear scaling of GDP with population, a result quite different from urban
functional units in other national contexts. Using deviations from scaling, we explore the
behavior of these regional units to find strong distinct geographic patterns of economic
behavior. We characterize these patterns in detail and connect them to the literature on
regional economic development for a diverse subcontinental nation such as India. Given the
paucity of economic data for Urban Agglomerations in India, we use a set of assumptions to
create a new dataset of GDP based on districts, for large cities. This reveals superlinear
scaling of income with city size, as expected from theory, while displaying similar underlying
patterns of economic geography observed for district economic performance. This analysis of
the economic performance of Indian cities is severely limited by the absence of higher-
fidelity, direct city level economic data. We discuss the need for standardized and consistent
estimates of the size and change in urban economies in India, and point to a number of
proxies that can be explored to develop such indicators.
Keywords: Urban Economies, Regional Development, Income, GDP, Districts, Urban
Agglomeration, Cities, India.
1. Introduction
The economic performance of India is critical to the well-being of over one-sixth of the world’s
population. As India continues to urbanize, and over half the population becomes urban over the next
few decades [1], cities will become ever more central to India’s dynamics of economic growth and
human development. Therefore, the need to develop a scientific understanding of the economic
performance of Indian regions and cities becomes critical. Our previous work explored the
quantitative characteristics of crime, innovation, spatial density and a number of services in Indian
cities using the framework of urban scaling [2]. Here, we extend this analysis to regional patterns of
economic performance using district and state Gross Domestic Product (GDP) information. Within
the constraints imposed by existing units of analysis in the data, we also attempt a systematic
exploration of urban GDP, which has been an issue of long-standing interest for Indian cities.
There is a vast literature in economics on the determinants of regional performance and mechanisms
of economic growth. Some of the significant drivers identified in this literature include transportation
and market access [3–5], agglomeration economies [6–10], and issues related to human capital and its
mobility [11–16]. New Economic Geography posits that access to markets, in the form of
transportation infrastructure networks, is critical to the trajectory of productivity and wages in sub-
national regions [3]. This finding has been found empirically sound across national contexts,
including in developing economies such as China and India [4,5]. Specifically, economic potential in
India is found to be strongly clustered by geography, with the states of Tamil Nadu, Kerala, and
Haryana having the highest concentration of districts with high economic potential, and the state of
Uttar Pradesh containing districts with significant economic underperformance [5]. Agglomeration
economies in sub-national regions are a measure of preferential attachment effects that are reflected in
increasing economic densities and urbanization [6,7]. For instance, the concentration of firms in
similar industries is both a cause and a consequence of geographically proximate investments in
businesses, creation of local talent pools with strong matching, and the realization of knowledge
spillovers – and evidence of such agglomeration effects has been empirically validated across nations
[8,9]. There is evidence for agglomeration effects in India emanating from inter-industry urbanization
economies at the regional level [10]. Human capital is also found to influence levels of productivity
through multiple channels [11,12] – with robust evidence available for transmission channels such as
the ability created by locally available trained and skilled workforces, knowledge spillovers enabling
maximal exploitation of agglomeration economies, and also the possibility of high quality human
capital being able to adjust to longer-term structural changes in the economy [13–15]. Empirical work
suggests that human capital (education) has been a significant contributor to increase in output per
worker in India [16].
There is also a significant body of literature on the linkages between urbanization and the economy.
Urban locations are found to enable concentration of economic activity through access to diverse
labor pools which enable specialization [17,18], reduced costs on account of proximity to users and
suppliers as well as cheaper transport [3], speedy and effective responses to changing market
conditions [19,20], and enhanced potential for innovation due to geographically concentrated
availability of educated and creative human capital [21]. It has also been empirically shown, based on
cross-country data, that the rate of urbanization at national level exhibits strong positive correlation
with GDP per capita [22]. This strong positive correlation between per capita income levels and
urbanization has been observed to be robust even at a more granular, sub-national level based on
state-level data in India [23]. However, it is important to recognize that while there is a clear and
robust relationship between urbanization and income levels, this does not necessarily translate into a
causal relationship. Indeed, multiple empirical studies on this question find no systematic relationship
between urbanization and economic growth [22,24,25]. Therefore, it appears that while urbanization
is part of the economic development process, there is no evidence that it, per se, independently and
causally impacts economic growth.
Given this context, our attempt in this work is to explore economic growth in Indian regions and cities
using the framework of urban scaling and urban geography [26–28]. Scaling uses population size as
the basis for isolating general agglomeration effects, specifically characterized by increasing returns
to scale in socioeconomic interactions such as innovation and GDP in cities – as evinced in empirical
studies across multiple national jurisdictions [2,27,29–31].
In scaling analysis, an indicator such as GDP, 𝑌𝑖(𝑡, 𝑁𝑖), for city i, with population size 𝑁𝑖(𝑡), at time t
is given by:
$$Y_i(t, N_i) = Y_0(t) \, N_i^{\beta} \, e^{\xi_i(t)}, \qquad (1)$$
where 𝑌0(𝑡) is a measure of systemic change in GDP across all regions in the analysis, independent of
population size with dimensions of a flow of money per year (income).
The scaling exponent 𝛽 is the elasticity of 𝑌𝑖 to population size. This parameter, when measured for
urban functional areas, is found to fall in three distinct universality classes containing attributes that
represent socioeconomic interactions (𝛽 > 1), economies of scale (𝛽 < 1), and individual human
needs (𝛽 ≃ 1) respectively [27]. Empirically, across a range of countries, it has been observed that
𝛽 ≃ 7/6 > 1 for the GDP indicator [27,29,31], which is also in line with theoretical expectation [26].
Eq. (1) also allows us to express the average dynamics of a set of units, via the temporal change in the
centres for the data in logarithmic variables (⟨ln 𝑌(𝑡)⟩, ⟨ln 𝑁(𝑡)⟩), defined by the average of a set of
units as
$$\langle \ln Y(t) \rangle = \frac{1}{N_c} \sum_{i=1}^{N_c} \ln Y_i(t), \qquad \langle \ln N(t) \rangle = \frac{1}{N_c} \sum_{i=1}^{N_c} \ln N_i(t), \qquad (2)$$
where 𝑁𝑐 is the total number of units (cities or regions) in the set.
The quantities 𝜉𝑖(𝑡) are specific to individual cities or regions (i) and represent the local, idiosyncratic
features that affect their GDP away from the scaling average. Specifically, 𝜉𝑖(𝑡) represent scale
(population-size)-independent deviations of individual regions from the scaling relation:
$$\xi_i(t) = \ln \frac{Y_i}{Y_0 N_i^{\beta}}. \qquad (3)$$
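The estimation implied by Eqs. (1) and (3) can be sketched as an ordinary least squares fit in logarithmic variables. This is a minimal illustration on synthetic numbers, not the procedure documented in Appendix A; the function name `fit_scaling` and all data values are hypothetical.

```python
import numpy as np

def fit_scaling(population, gdp):
    """Fit ln Y = ln Y0 + beta * ln N by OLS and return (ln_y0, beta, xi).

    xi holds the residuals of the fit, i.e. the deviations of Eq. (3).
    """
    ln_n = np.log(np.asarray(population, dtype=float))
    ln_y = np.log(np.asarray(gdp, dtype=float))
    beta, ln_y0 = np.polyfit(ln_n, ln_y, 1)  # returns slope, then intercept
    xi = ln_y - (ln_y0 + beta * ln_n)
    return ln_y0, beta, xi

# Synthetic units (hypothetical values, for illustration only).
population = np.array([1e5, 3e5, 1e6, 3e6, 1e7])
true_xi = np.array([0.2, -0.1, 0.0, -0.3, 0.2])
gdp = 50.0 * population ** (7 / 6) * np.exp(true_xi)

ln_y0, beta, xi = fit_scaling(population, gdp)
print(f"beta = {beta:.3f}")
print("deviations:", np.round(xi, 3))
```

Because the fit includes an intercept, the fitted deviations sum to zero across units, which is what makes them comparable across units of different population size.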
Using this simple but systematic framework of scaling analysis and its deviations, we explore regional
economic growth in India, measured as district level GDP (Gross District Domestic Product or
GDDP) and examine the resulting economic geography in light of empirical evidence from regional
economic analyses. We also attempt to use this type of (non-explicitly urban) regional data to
approximate the set of all largest Indian cities and examine the nature of resulting scaling
relationships and urban economic geography.
2. Scaling and Economic Geography of District GDP in India
We use publicly available data for GDP income for districts across 12 Indian states, accounting for
over 74% of the country’s population. For a detailed description of data sources and statistical
methods, please refer to Appendix A.
We start by exploring the simplest scaling relationship, between district GDP (GDDP) and
corresponding district population size. Districts tile the entire territory of India and thus vary
enormously in character: some may be parts of large cities, as we shall see below, while others
encompass both rural areas and towns. Nevertheless, we can analyze their scaling relation with
population, essentially asking if these units of analysis somehow manifest any increasing returns in
GDP per capita with their population as “agglomeration effects”.
We find that GDDP scaling is at the cusp of linearity and superlinearity (Figure 1A), with 𝛽 = 1.02
(95% Confidence Interval (CI): 0.92, 1.13). This relationship is very noisy and quite distinct from
expectations for functional urban areas (defined as integrated labor markets), for which 𝛽 is expected
to be in the region of 7/6 as observed in other nations [27,29,31]. Taken at face value, this result tells
us that Indian districts are not, on average over all district types and regions, generating the agglomeration
economies (as measured by GDDP) predicted by urban scaling theory [26]. We also plot the rank
order of deviations from scaling law, 𝜉𝑖(𝑡) (Figure 1B), which are dimensionless and enable direct
comparison between districts due to the exclusion of population size effects.
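Whether an exponent counts as superlinear depends on where its confidence interval sits relative to 1. The sketch below computes an approximate 95% interval from the textbook OLS standard error with a normal quantile; that choice is an assumption made here for illustration and may differ from the method in Appendix A. The names `scaling_exponent_ci` and `classify` are hypothetical.

```python
import numpy as np

def scaling_exponent_ci(population, gdp, z=1.96):
    """OLS slope of ln(GDP) on ln(population) with an approximate 95% interval.

    Assumption: textbook OLS standard error and a normal quantile (z = 1.96).
    """
    ln_n = np.log(np.asarray(population, dtype=float))
    ln_y = np.log(np.asarray(gdp, dtype=float))
    x_c = ln_n - ln_n.mean()
    y_c = ln_y - ln_y.mean()
    beta = (x_c @ y_c) / (x_c @ x_c)
    resid = y_c - beta * x_c
    se = np.sqrt((resid @ resid) / (ln_n.size - 2) / (x_c @ x_c))
    return beta, (beta - z * se, beta + z * se)

def classify(ci):
    """Label an interval by its position relative to linear scaling (beta = 1)."""
    low, high = ci
    if low > 1.0:
        return "superlinear"
    if high < 1.0:
        return "sublinear"
    return "not distinguishable from linear"

# Interval quoted in the text for all districts.
print(classify((0.92, 1.13)))
```

Applied to the interval reported for all districts, [0.92, 1.13], the label is "not distinguishable from linear", matching the description of the district fit as being at the cusp of linearity and superlinearity.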
Figure 1: Scaling and deviations of GDDP of Indian districts (2011): A: Scaling of district level Gross
Domestic Product (GDDP) with district population (2011). This shows a very slight - not statistically significant
- superlinear relationship to population with an exponent of 1.02 (95% Confidence Interval (CI): [0.92, 1.13]).
B: Rank order of districts based on deviations from scaling relation, 𝜉𝑖(𝑡), depicting the local, idiosyncratic
effects of individual districts in driving scaling behaviour. We see that districts corresponding to large Indian
cities (Mumbai and Thane districts contained within the Mumbai Urban Agglomeration, Bangalore Urban
district in the Bangalore Urban Agglomeration, and Pune district in the Pune Urban Agglomeration) are strong
positive outliers, with much larger economies per capita than the average scaling trend would predict.
[Figure 1 plot. Panel A: ln(GDDP) versus ln(Population), with fit β = 1.02, R² = 0.53; labeled districts: Mumbai, Bangalore (Urban), Thane, Pune. Panel B: deviation from GDDP scaling versus district rank; labeled districts by rank: 1. Mumbai, 4. Hyderabad, 10. Coorg, 28. Kolkata, 68. Chennai, 154. Lucknow, 187. Tonk, 218. Agra, 270. Varanasi, 286. Gorakhpur, 316. Madhubani.]
In our previous analysis of the properties of Indian urban agglomerations [2], it emerged that the
diverse geography of India was critical to the emergence of scaling behaviour, as far as crime and, to
a lesser extent, technological innovation (measured by patenting activity) are concerned. Specifically,
we found that cities in north-central and eastern India performed in qualitatively and quantitatively
distinct ways from the cities in southern and western India. In order to elicit a scientifically robust
understanding of the economic geography of India, we analyse the deviations from the scaling relation
for GDDP, 𝜉𝑖(𝑡) (from Eq. 3 and Figure 1B), which are scale-independent and provide a principled
mechanism to characterize regional effects. When we map 𝜉𝑖(𝑡) for district GDDP in the year 2011,
we obtain a spatial distribution of deviations as shown in Figure 2, which provides a visual
representation of geographic disparities in economic performance across the country. It clearly
emerges that districts in north-central and eastern India, on average, tend to underperform the average
scaling expectation (dashed line in Figure 1A), as highlighted by the clustering of red circles in these
regions, while districts in southern and western India, on average, display overperformance, as evinced
by the predominance of blue circles.
Figure 2: Economic geography according to deviations from District GDDP scaling law (𝝃𝒊(𝒕), 2011): Each
circle represents the deviation, 𝜉𝑖 , of a single district. Blue circles represent over-performance with respect to
scaling law and red circles under-performance. The size of the circle indicates the magnitude of over- or under-
performance. Blue circles are concentrated in the south, west, and high north of the country, while red circles
are predominant in the north-central and eastern parts (Indo-Gangetic plain) of the country.
While Figure 2 strongly suggests an underlying pattern of regional economic geography in India, we
seek to validate this impression more formally by clustering districts based on the distance between
the time-series of their deviations from scaling. Figure 3A shows a heatmap of the Euclidean distance
between pairs of time-series of deviations: The closer to zero this distance is, the greater the similarity
in the temporal evolution of deviations. This heatmap suggests a total of seven clusters (Figure 3A).
Clusters 1 and 2 (Figure 3A&B) are composed of severely underperforming (with respect to scaling
law) districts from the north-central states of Uttar Pradesh and Bihar, respectively. Clusters 3 and 4
are predominantly composed of overperforming districts in the southern, western, and high northern
parts of the country. Cluster 6 is composed of underperforming districts from north-central and
eastern India, while Cluster 7 consists of districts from the same region that on average perform in
accordance with scaling expectations. Finally, Cluster 5 is a remainder, a mixed cluster composed of
districts from across India, which on average slightly overperform the expectations from the scaling
law. Overall, the composition of these clusters formalizes the notion of regional economic geography
hinted at in Figure 2, with districts in north-central and eastern India (representing the states of Uttar
Pradesh, Bihar, Rajasthan, Odisha, West Bengal in our data set) lagging on economic performance,
while the districts in the south, west, and high north (comprising the states Tamil Nadu, Kerala,
Andhra Pradesh, Maharashtra, and Punjab) consistently exceed scaling expectations. This result also
finds close agreement with empirical evidence that the north-central and eastern states have
historically (since 1947) shown lower comparative socioeconomic development as represented by
evolution of GDP and other social indicators (literacy, mortality, population growth), when compared
to the significantly better performance of southern and western states [32–35]. More recently,
assessing the performance of Indian districts on the Multidimensional Poverty Index (MPI) [36],
which measures serious deficits in health, education, and living standards, it emerges that 91 out of
the 100 districts with worst MPI in the country are in the north-central and eastern states (Bihar, Uttar
Pradesh, Odisha, Rajasthan, Madhya Pradesh, Jharkhand, and Chattisgarh).
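The clustering step described above can be sketched as follows: compute pairwise Euclidean distances between the deviation time-series, then group units hierarchically. The text does not state which linkage rule was used, so average linkage is an assumption here; the data are synthetic, and `cluster_deviation_series` is a hypothetical name. The sketch relies on SciPy's `pdist`, `squareform`, `linkage`, and `fcluster`.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

def cluster_deviation_series(xi_series, n_clusters):
    """Cluster units by Euclidean distance between their deviation time-series.

    xi_series: array of shape (n_units, n_years), one row per district.
    Returns the square distance matrix (the heatmap input) and cluster labels.
    """
    xi_series = np.asarray(xi_series, dtype=float)
    condensed = pdist(xi_series, metric="euclidean")
    tree = linkage(condensed, method="average")  # linkage rule is an assumption
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return squareform(condensed), labels

# Synthetic deviations for six units over three years (hypothetical values):
# three persistent underperformers and three persistent overperformers.
xi_series = np.array([
    [-1.00, -0.90, -1.10],
    [-0.95, -1.00, -1.00],
    [-1.05, -0.95, -0.90],
    [0.50, 0.60, 0.40],
    [0.45, 0.50, 0.55],
    [0.55, 0.50, 0.50],
])
dist, labels = cluster_deviation_series(xi_series, n_clusters=2)
print(labels)
```

A distance close to zero between two rows means the two units deviated from scaling in nearly the same way in every year, which is the similarity notion the heatmap visualizes.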
Figure 3: Clustering Analysis: A: Heatmap of time-series of deviations from District GDDP scaling relation
(𝜉𝑖(𝑡), 2006 to 2011). Clusters are based on Euclidean distance between time-series (from 2006 to 2011) of
pairs of district GDDP deviations. A total of seven clusters are clearly separable, which show a clear regional
economic geography, with underperforming districts primarily belonging to the north-central and eastern parts
of India, while the overperformers are predominant in the south and west. The clusters are identified by the regions
their districts primarily belong to: N-C: North-Central (representing the states of Uttar Pradesh, Bihar,
Rajasthan), E: East (West Bengal, Odisha), S: South (Tamil Nadu, Kerala, Andhra Pradesh), W: West
(Maharashtra), HN: High North (Punjab), and Mix: Mixture of districts from across the nation. B: Map of
Clusters. The map depicts the geographical spread of the seven clusters that emerge out of the clustering
analysis. Color Code: Dark Blue: Cluster 1, Dark Green: Cluster 2, Red: Cluster 3, Purple: Cluster 4, Pink:
Cluster 5, Light Green: Cluster 6, Light Blue: Cluster 7.
Despite the confirmation of this geographical pattern to economic performance, the clustering analysis
also points to differentiated performance within geographies. For instance, districts in the states of
Bihar and Uttar Pradesh (in north-central India) ought to be of particular concern to policy makers
because almost all districts in these states (97% in Bihar or 37 out of 38 districts, and 89% in Uttar
Pradesh or 62 out of 70 districts) have significant, negative deviations from scaling (with state-level
average GDDP deviations of -0.98 and -0.43 respectively), concentrated in Clusters 1 and 2 in Figure
3. Other states in the north-central and eastern region, such as West Bengal and Rajasthan, still
have a significant proportion of districts underperforming (47% in West Bengal or 9 out of 19
districts, and 38% in Rajasthan or 12 out of 32), but a majority of their districts
overperform the scaling relation, consequently yielding state-level average GDDP deviations in the
region of ~0.10; these states are clustered in Clusters 5, 6 and 7 (Figure 3). In the better
performing southern, western, and high northern regions of the country, we find that only 6 out of the
120 districts (5 in Tamil Nadu and 1 in Maharashtra) underperform the scaling law. Some of these
intra-geographic differences become apparent when we plot the temporal evolution of the centres of
the population-GDDP distributions (log-log scale) for each state for each year data is available (Eq.
2).
As Figure 4A clearly illustrates, the GDDP centres of the Bihar and Uttar Pradesh distributions are the
lowest amongst all states (even as they show an increasing trend), with Odisha’s GDDP centre a little
higher than these two states, and Rajasthan and West Bengal showing the highest GDDP centres in
the north-central and eastern region. These intra-geographic differentiations also echo some of the
economic geography findings of Bhandari and Khare [37], whose economic model of district
performance finds that districts in Uttar Pradesh and Bihar show a significant decline in their share of
the economy over time, while districts in Rajasthan show an increase in economic share. It is,
however, also apparent from Figure 4A that the GDDP centres of all the southern, western and high
northern states are significantly higher than those of even Rajasthan and West Bengal (even
Karnataka, whose temporal evolution of GDDP centre appears very similar to Rajasthan's, does so at
a lower population centre).
Figure 4: Centering Analysis: A: Temporal evolution of Population-GDDP centres per state per year: The
centre for each state for each year is calculated as Eq. (2) for districts in each state in each given year. The
north-central states of Uttar Pradesh and Bihar have the lowest GDDP centres despite having relatively high
population centres. The southern and western states show the highest GDDP centres over time. B: Centred
Scaling of GDDP (2010): The scaling of GDDP with population when the data have been centred in each state
shows superlinear scaling of GDDP, with an exponent of 1.15 (95% Confidence Interval (CI): [1.08, 1.22]).
India’s federal structure has ensured that states have significant powers in the design and
implementation of social and economic policy [34], and the heterogenous economic paths charted by
different states post 1947 are testament to the decision making powers of Indian states. Given this
underlying reality where the baseline GDP of different states shows significant variation, the centred
scaling relationship (Eq. 2) provides us with a single-parameter model to estimate the scaling
exponent, while excluding baseline state differences. Figure 4B is a plot of the centred scaling of
GDDP with population, and this reveals an exponent of 1.15 (95% CI of [1.08,1.22]), which is in
reasonable agreement with the expected exponent of 7/6 from urban scaling theory [27]. This starts to
suggest the presence of agglomeration effects at the district level, which can be masked by regional
disparities.
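The effect of centring can be illustrated with a small synthetic example: subtract each state's centre (Eq. 2) from its districts' logarithmic population and GDP, then fit a single slope through the origin. The two states, their baselines, and the function name `centred_scaling` are hypothetical; the example only shows how differing state baselines can mask a common exponent in a pooled fit.

```python
import numpy as np

def centred_scaling(ln_n, ln_y, groups):
    """Centre ln N and ln Y within each group (state), then fit one slope.

    After centring, the model has a single parameter: the exponent beta.
    """
    ln_n = np.asarray(ln_n, dtype=float)
    ln_y = np.asarray(ln_y, dtype=float)
    groups = np.asarray(groups)
    x = np.empty_like(ln_n)
    y = np.empty_like(ln_y)
    for g in np.unique(groups):
        mask = groups == g
        x[mask] = ln_n[mask] - ln_n[mask].mean()
        y[mask] = ln_y[mask] - ln_y[mask].mean()
    beta = (x @ y) / (x @ x)  # least-squares slope through the origin
    return beta, x, y

# Two synthetic states sharing beta = 1.15 but with different baselines:
# state "A" is smaller and richer, state "B" is larger and poorer.
ln_n = np.array([12.0, 13.0, 14.0, 14.0, 15.0, 16.0])
groups = np.array(["A", "A", "A", "B", "B", "B"])
baseline = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
ln_y = baseline + 1.15 * ln_n

pooled_beta = np.polyfit(ln_n, ln_y, 1)[0]
centred_beta, _, _ = centred_scaling(ln_n, ln_y, groups)
print(f"pooled: {pooled_beta:.2f}, centred: {centred_beta:.2f}")
```

In this constructed case the pooled slope is strongly biased downward because the more populous state has the lower baseline, while the centred fit recovers the common exponent.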
We explore this phenomenon further by splitting the GDDP data set into two sets, based on the
geography suggested by this analysis of deviations. Figure 5 shows the quantitatively distinct scaling
relationships exhibited by these two sets of districts segregated by geography.
[Figure 4 plot. Panel A: state centres of the population-GDDP distributions over time; legend: Bihar (2004-10), Kerala (2004-11), Rajasthan (2008-11), Odisha (2004-10), Karnataka (2007-10), West Bengal (2004-10), Andhra Pradesh (2004-11), Assam (2009), Maharashtra (2005-11), Punjab (2007-10), Uttar Pradesh (2004-11), Tamil Nadu (2005-11). Panel B: ln(Y) - <ln(Y)> versus ln(N) - <ln(N)>, with fit β = 1.15, R² = 0.75.]
Figure 5: Scaling of Indian districts by geography (2011): Scaling of district Gross Domestic Product
(GDDP) with population for two sets of data. The pink data points (Set 1) represent the scaling of districts in the
south, west, and high north of India (comprising the states Karnataka, Tamil Nadu, Kerala, Andhra Pradesh,
Maharashtra, Punjab) and the blue data points (Set 2) are districts in north-central and eastern India (comprising
the states Uttar Pradesh, Bihar, Odisha, Rajasthan, West Bengal). There is a significant difference in the nature
of scaling relationships, with Set 1 exhibiting a superlinear relationship in line with expectation from scaling
theory (exponent of 1.14 with a 95% CI of [1.05, 1.22]), while Set 2 shows a slightly sublinear relationship
(exponent of 0.98 with a 95% CI of [0.86, 1.10]). Again, we see in both Set 1 and Set 2 that the largest districts
outperform the respective scaling lines - Mumbai, Thane, Bangalore Urban and Pune in Set 1, and 24-Parganas
(N), 24-Parganas (S) (both of which are part of the Kolkata Urban Agglomeration), Bardhaman, and
Murshidabad (all of these districts belong to the state of West Bengal) in Set 2.
We observe, again, that districts in the southern, western, and high northern states (Karnataka, Tamil
Nadu, Kerala, Andhra Pradesh, Maharashtra, Punjab) show a superlinear scaling relationship of
GDDP with 𝛽 = 1.14 (95% CI: 1.05, 1.22), which is in keeping with expectations from empirical
observations elsewhere [27,29,31] as well as theory [26]. On the other hand, economic performance in
districts of north-central and eastern states (Uttar Pradesh, Bihar, Rajasthan, Odisha, West Bengal)
shows slightly sublinear scaling, 𝛽 = 0.98 (95% CI: 0.86, 1.10), suggesting the absence of advantages
of scale deriving from socioeconomic interactions in these regions.
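As a rough worked illustration of what these two point estimates imply, consider doubling a district's population under Eq. (1), holding $Y_0$ and the deviation fixed. This ignores the confidence intervals and is only meant to convey magnitudes.

```latex
\begin{align}
\frac{Y(2N)}{Y(N)} &= 2^{\beta}, &
\frac{Y(2N)/(2N)}{Y(N)/N} &= 2^{\beta - 1}, \\
\beta = 1.14: \quad 2^{1.14} &\approx 2.20, & 2^{0.14} &\approx 1.10, \\
\beta = 0.98: \quad 2^{0.98} &\approx 1.97, & 2^{-0.02} &\approx 0.99.
\end{align}
```

On these point estimates, a district with twice the population would have roughly 10% higher GDP per capita in the southern, western, and high northern set, and roughly 1% lower GDP per capita in the north-central and eastern set. Since the interval for the latter includes 1, that small decrease is not distinguishable from no change.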
3. Scaling and Urban Geography of Income in India
We now attempt to understand scaling of GDP in the context of Indian cities.
We begin by exploring the relationship between state level GDP (Gross State Domestic Product, or
GSDP) and its urban population. We find a sublinear scaling relationship, with an exponent 𝛽 = 0.90
(Figure 6). While we might expect a superlinear relationship with increasing urban population, the
observed result is potentially explained by the effect of the highly populous, low-income states of
north and central India, which despite low levels of urbanization have high overall urban population
counts. For instance, the states of Bihar, Uttar Pradesh, and Rajasthan rank at 30, 26, and 22
respectively (out of 31 states) in terms of their urbanization levels, but in terms of urban population
counts, they rank 12, 2, and 9 respectively.
[Figure 5 plot. ln(GDDP) versus ln(Population). North-Central and East: β = 0.98, R² = 0.56. South, West, and High North: β = 1.14, R² = 0.82. Labeled districts: Mumbai, Thane, Pune, Bangalore (Urban), 24-Parganas (N), 24-Parganas (S), Murshidabad, Bardhaman.]
Figure 6: Scaling of GSDP with state urban population (2011): A: Scaling of state level Gross Domestic
Product (GSDP) with state urban population (2011). This shows a sublinear relationship with an exponent of
0.90 (95% Confidence Interval (CI): [0.82, 0.98]). The large states of Bihar, Uttar Pradesh, and Madhya Pradesh
have low levels of urbanization, but comparatively high urban populations and underperform relative to the
scaling line. Smaller states like Haryana, Himachal Pradesh, and Goa overperform relative to the scaling line.
We have however seen that there exists a robust, empirically tested relationship between urbanization
and per capita incomes, both at a cross-country level and at a cross-state level within a country
[22,23]. We now drill down from the state to the district level and find that this positive relationship
between per capita income and urbanization obtains even at this level of granularity, based on Indian
data (Figure 7A). We also seek to understand how this relationship compares with the relationship
between the deviations from GDDP scaling law, 𝜉𝑖(𝑡), and urbanization. Given that per capita income
increases with urbanization, we would expect that 𝜉𝑖(𝑡) would capture the effect of increasing income
in more urbanized districts. Indeed, as Figure 7B illustrates, we find that there is a positive relationship
between 𝜉𝑖(𝑡) and urbanization. Overall, the two curves indicate not only a close qualitative
concurrence but also a quantitative one in terms of the functional forms of the best fit curves that
describe the relationships of 𝜉𝑖(𝑡) and per capita income with urbanization.
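The close agreement between the two fitted curves follows from Eq. (1) once it is rewritten in per capita terms. The short derivation below uses only Eqs. (1) and (3) and the district exponent reported above.

```latex
\begin{align}
\ln Y_i &= \ln Y_0 + \beta \ln N_i + \xi_i, \\
\ln \frac{Y_i}{N_i} &= \xi_i + \ln Y_0 + (\beta - 1) \ln N_i .
\end{align}
```

With $\beta = 1.02$, the last term is $0.02 \ln N_i$, which changes by only about 0.1 if $\ln N_i$ spans about five units across districts. Log per capita GDDP and $\xi_i$ therefore differ by a nearly constant offset, so their dependence on urbanization should show nearly the same slope and a different intercept.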
[Figure 6 plot. ln(State GDP in INR Billion) versus ln(State Urban Population), with fit β = 0.90, R² = 0.95; labeled states: Uttar Pradesh, Bihar, Madhya Pradesh, Haryana, Himachal Pradesh, Goa, Maharashtra, Rajasthan.]
[Figure 7 plot. Panel A: ln(GDDP per capita) versus Urbanization Rate, with fit y = 0.5851 ln(x) + 11.602, R² = 0.5233. Panel B: 𝜉𝑖(𝑡) versus Urbanization Rate, with fit y = 0.5806 ln(x) + 0.9343, R² = 0.5255.]
Figure 7: Income and Urbanization: A: ln (GDDP per capita) v. District urbanization rate. There is a positive
relationship between urbanization and income per capita at the district level. B: Deviation from GDDP scaling
law (𝜉𝑖(𝑡) in Figure 1) v. District urbanization rate. 𝜉𝑖(𝑡) also displays a similarly positive relationship with
urbanization. The functional forms of the curves of best fit in both cases show very close correspondence, with
almost the same coefficient (0.5851 and 0.5806); the normalization is clearly different, however.
This naturally leads to the ultimate question of how income scales across urban agglomerations. As
we highlighted in earlier work [2], there is a lack of systematic collection and dissemination of
economic data at the level of Indian cities and therefore, in terms of official statistics, we are left with
using district level GDP as a proxy for city GDPs. In order to create a dataset of such proxied city
GDPs we start with considering districts that are predominantly urban, i.e. with urbanization rates of
at least 50%. Given that per capita urban incomes in India are, on average, 2.75 times per capita rural
incomes [38], it is a reasonable assumption that a very significant proportion of the GDP in majority
urban districts is produced by the corresponding urban components. This leaves us with a set of 38
districts, out of which we only consider those districts in which there is a single identifiable city that
contributes significantly (over 65%) to the urbanization of that district. On average, we find that the
proxied cities in the final set thus obtained contribute over 86% of the urbanization of their districts.
We also have four urban agglomerations in the data – Mumbai, Kolkata, Chennai, and Hyderabad –
that extend across multiple districts and in these cases, we aggregate the population and GDP of the
constituent districts to proxy the data for these cities, see Appendix A. We also obtained data for
Delhi state, which closely corresponds to the Delhi Urban Agglomeration, and incorporate this as an
urban unit into the analysis. Overall, the final dataset thus created has 24 urban areas, which we use
for analysis of scaling and deviations, see Figure 8. It is apparent that the creation of even this limited
dataset involves several approximations and assumptions (discussed in Appendix A), and while the
data clearly do not capture exact representations of functional Indian cities, what they offer us is a
starting point (in the absence of better data) to begin to explore urban GDP scaling.
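The selection rules described above can be sketched as a small filter-and-aggregate routine. The thresholds (50% urbanization, 65% share for a single city) come from the text; everything else is an assumption for illustration, including the `District` record, the reading of "contribution to urbanization" as the city's share of the district's urban population, and all sample values.

```python
from dataclasses import dataclass
from typing import Optional

MIN_URBANIZATION = 0.50  # district must be at least 50% urban (from the text)
MIN_CITY_SHARE = 0.65    # one city must exceed 65% of the urban population

@dataclass
class District:
    name: str
    population: float
    urban_population: float
    gdp: float
    main_city: str
    main_city_population: float
    agglomeration: Optional[str] = None  # set for multi-district agglomerations

def build_city_proxies(districts):
    """Return {city: (population, gdp)} using district totals as city proxies."""
    cities = {}
    for d in districts:
        if d.agglomeration is not None:
            # Multi-district agglomerations: sum constituent districts.
            pop, gdp = cities.get(d.agglomeration, (0.0, 0.0))
            cities[d.agglomeration] = (pop + d.population, gdp + d.gdp)
            continue
        urbanization = d.urban_population / d.population
        city_share = d.main_city_population / d.urban_population
        if urbanization >= MIN_URBANIZATION and city_share > MIN_CITY_SHARE:
            cities[d.main_city] = (d.population, d.gdp)
    return cities

# Hypothetical districts, for illustration only.
districts = [
    District("D1", 100, 80, 10, "CityA", 70),            # kept
    District("D2", 100, 30, 5, "CityB", 25),             # too rural
    District("D3", 100, 60, 8, "CityC", 30),             # no dominant city
    District("D4", 50, 50, 20, "Metro", 50, "MetroUA"),  # aggregated
    District("D5", 70, 60, 15, "Metro", 40, "MetroUA"),  # aggregated
]
cities = build_city_proxies(districts)
print(cities)
```

Note that the proxy assigns the whole district's population and GDP to the city, which is the approximation the text flags as a limitation.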
When we plot the scaling of these city GDPs with population (Figure 8A), we find superlinear scaling
with an exponent of 1.12 (95% CI: 0.94, 1.30), which is consistent with expectations from functional
cities in other nations and from theory [26,27,29,31], although with only 24 urban areas the confidence
interval is wide and does not exclude linear scaling. When we compare the rank orders of cities in the
dataset based on per capita GDP and deviations (𝜉𝑖(𝑡)) from the scaling law at work here, we find that
the largest cities – Mumbai, Delhi, Kolkata, Chennai, Hyderabad, Pune – rank slightly worse (or at
best, the same) under 𝜉𝑖(𝑡) than under per capita GDP, which is explained by the expectation of a superlinear
increase in GDP with population under scaling (Figure 8B).
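Why large cities can rank lower under 𝜉𝑖(𝑡) than under per capita GDP can be seen in a toy example. The exponent 1.12 is the value reported in the text; the three cities and their numbers are hypothetical, and `rank_descending` is an illustrative helper. The constant ln Y0 is omitted because it shifts all deviations equally and does not affect ranks.

```python
import numpy as np

BETA = 1.12  # exponent reported in the text for the proxied cities

def rank_descending(values):
    """Return ranks where 1 marks the largest value."""
    order = np.argsort(-np.asarray(values, dtype=float))
    ranks = np.empty(order.size, dtype=int)
    ranks[order] = np.arange(1, order.size + 1)
    return ranks

# Three hypothetical cities: large, medium, small.
population = np.array([1e7, 1e6, 1e5])
gdp_per_capita = np.array([1.5, 1.4, 1.0])  # arbitrary units

# Deviation up to a constant: ln Y - beta * ln N (ln Y0 omitted).
ln_y = np.log(gdp_per_capita * population)
xi = ln_y - BETA * np.log(population)

rank_pc = rank_descending(gdp_per_capita)
rank_xi = rank_descending(xi)
print("per capita ranks:", rank_pc)
print("deviation ranks: ", rank_xi)
```

In this example the largest city has the highest per capita GDP but the lowest deviation, because its per capita advantage is smaller than what superlinear scaling would predict for its size.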
Figure 8: Scaling of Indian cities and Rank order of per capita GDP/scaling deviations (2011): A: Scaling
of GDP with population reveals a superlinear relationship with exponent β = 1.12 (95% CI: [0.94, 1.30]), which
is in line with expectations. Mumbai, Delhi, and Pune outperform the scaling line, while Kolkata and Lucknow
underperform. B: Ranking large Indian cities by per capita GDP and 𝜉𝑖(𝑡), we find that rankings are
slightly poorer for the largest cities under 𝜉𝑖(𝑡) than under per capita GDP.
We now turn to exploring the economic geography of urban income using the deviations from scaling,
𝜉𝑖(𝑡), as the basis for urban areas rather than districts. The simple visualization in Figure 9A suggests
a similar geographical breakup in terms of GDP performance as we saw in the case of district GDDP.
On average, cities in the south, west, and high north appear to outperform the scaling law, while those
in north-central and east India underperform.
[Figure 8 plot. Panel A: ln(GDP) versus ln(Population), with fit β = 1.12, R² = 0.88; labeled cities: Mumbai, Delhi, Kolkata, Pune, Lucknow. Panel B: rank of each city under ξ(t) and under per capita GDP; cities listed: Mumbai, Delhi, Pune, Noida, Kochi, Hyderabad, Nagpur, Ludhiana, Jalandhar, Thiruvananthapuram, Coimbatore, Thrissur, Chennai, Kannur, Kozhikode, Jaipur, Amritsar, Madurai, Kolkata, Kota, Lucknow, Kanpur, Meerut, Ghaziabad.]
Figure 9: Economic geography and Heatmap based on 𝝃𝒊(𝒕): A: Each circle represents the deviation of a
single city from the scaling law. As before, blue circles represent over-performance and red circles under-
performance. The size of the circle indicates the magnitude of over- or under-performance. Blue circles are
concentrated in the south, west, and high north of the country, while red circles are predominant in the north-central
and eastern parts (Indo-Gangetic plain) of the country. B: Heatmap of time-series deviations from the scaling law.
Clusters are based on Euclidean distance between time-series (from 2006 to 2011) of pairs of city GDP
deviations. This analysis reveals the underlying geographical pattern of urban GDP through the emergence of
geographically distinct clusters of cities in north-central and east India (N-C, E), and of cities in the west, south
and high north (W, S, HN).
We again formalize this notion by performing a clustering analysis of the Euclidean distance between
time-series of 𝜉𝑖(𝑡) for all 24 urban areas in the dataset (Figure 9B) and find that this confirms the
urban economic geography suggested in Figure 9A. Clustering shows a clear geographical basis with
4 of the 6 resultant clusters composed of cities from the south, west, and high north, one cluster
composed of the poorest underperformers from north-central and eastern India, and finally one mixed
cluster, as usual in this type of analysis. Overall, despite the construction of the dataset being based on
urbanization levels, the economic geography revealed here appears to almost exactly mirror that of
district GDDP. This also suggests the possibility that with the availability of higher quality city data,
we might see qualitatively and quantitatively different scaling relationships (i.e. different intercepts)
between cities across these geographies, just as was manifested in the case of district GDDP.
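The clustering step can be illustrated with a short sketch. The text specifies only that clusters are based on the Euclidean distance between time-series of 𝜉𝑖(𝑡) from 2006 to 2011; the average-linkage agglomerative rule used here is an assumption, and the city labels and deviation values are synthetic placeholders, not the data behind Figure 9B.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length time-series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(series, n_clusters):
    """Average-linkage agglomerative clustering (linkage rule is an assumption).

    `series` maps a city label to its list of yearly deviations.
    """
    clusters = [[name] for name in series]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(euclidean(series[a], series[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Synthetic deviations for 2006-2011 (six values per hypothetical city).
deviation_series = {
    "city_a": [0.30, 0.32, 0.31, 0.35, 0.33, 0.36],
    "city_b": [0.25, 0.27, 0.30, 0.31, 0.30, 0.33],
    "city_c": [-0.30, -0.28, -0.31, -0.33, -0.35, -0.34],
    "city_d": [-0.20, -0.22, -0.25, -0.24, -0.27, -0.30],
    "city_e": [0.10, 0.12, 0.15, 0.14, 0.18, 0.20],
}

clusters = agglomerate(deviation_series, n_clusters=2)
print(clusters)
```

With these placeholder series the persistent over-performers and the persistent under-performers separate into two groups, which is the kind of structure the heatmap is used to reveal.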
4. Conclusion
We undertook a systematic analysis of existing official regional and urban economic data for income in
India with the objective of furthering a stronger scientific understanding of economic performance
and agglomeration effects. We used the framework of scaling theory as our point of departure, from
which we attempted to characterize the economic geography of India using existing regional GDP
data. Based on these analyses, we also proposed approximations for functional definitions of large
Indian cities based on collections of districts. We measured the associated increasing returns to
population scale, which turn out to be consistent with the behaviour of urban areas in other urban
systems and with theory.
There are clearly many limitations to the existing data sources analysed here that will be important to
address in the future, if the properties of Indian cities and their development are to be assessed more
firmly over time. The Census of India defines Urban Agglomerations as an approximation to the
urban functional areas used in most other nations. It will be important that these units are
characterized in terms of their economic make-up and performance, not only in terms of their GDP, in
ways that are consistent over time and space. The modernization of data collection across the
country, including tax, employment, and property records (beyond existing surveys), should
allow the nation to leapfrog existing practices and create a modern system based on native talent that
is well suited to measure, assess and plan future economic activity in its fast-growing cities.
One of the central difficulties of measuring economic activity in integrated urban economies is their
spatial definition. In the United States and other OECD nations, the solution to this problem relies on
the consistent assessment of daily commuting flows, which are used to integrate geographic, political, and
civic units into a single unified labour market, known as a Metropolitan area [39,40]. An important
task ahead for Indian cities then is the construction of analogous functional units, especially given the
current scenario (analogous to that of US cities) in which the main Indian cities are growing primarily
along their peripheries [41].
Measurements of urban metropolitan economies are also becoming more accessible, not only through
the modernization of official data records, but also through new proxies available online and through
new technologies, including digital mapping and remote sensing, assessments of construction,
transportation flows, real estate markets and employment listings. These emerging sources typically
display biases towards formal and high-tech sectors of economic activity, but can be complemented
by neighbourhood surveys and local data collection in more informal settings, a tradition
with great vitality in India. Creating a system that can make use of these traditional and emerging
sources of information towards a deeper understanding of human sustainable development in Indian
cities is a challenge that directly impacts over one-sixth of the world’s population. With increasing
urbanization, cities will play an ever more central role in the future of India’s economy. Developing a
better scientific understanding of their economic development will be critical to ensuring that we fully
leverage this process so that the benefits of growth are distributed more fairly and equitably and
contribute to global sustainability outcomes.
Appendix A: Data sources and methods
District GDP Data: District-level GDP data was released by the erstwhile Planning Commission of
India and is available on the Government of India’s Open Government Data (OGD) platform
(https://data.gov.in/). The 2004-05 current price GDP time series for 11 Indian states is available at
https://data.gov.in/catalog/district-wise-gdp-and-growth-rate-current-price2004-05. There is some
difference in the lengths of the time series available for districts in different states, as follows: Andhra
Pradesh, West Bengal, Bihar, Odisha, Kerala, Uttar Pradesh, and Punjab have the entire series from
2004-05 to 2010-11, Maharashtra from 2005-06 to 2010-11, Assam for 2009-10, Rajasthan from
2004-05 to 2009-10, and Karnataka from 2007-08 to 2010-11. We also obtained district GDP data
for Rajasthan for 2010-11 from the state government’s publication “Estimates of District Domestic
Product of Rajasthan 2011-12”, available at https://bit.ly/2TD2R2G. For Tamil Nadu, the district GDP
time series from 2004-05 to 2010-11 was produced from documents released by the Department of
Economics and Statistics, Tamil Nadu. To our knowledge, data for districts in other states was not
available for this period from any public source. It is likely that this data exists in documents (hard
copies) published by the Departments of Economics and Statistics of these states and needs to be
digitized. However, even in the absence of this data, the current dataset of districts covers over 74%
of the national population and offers a reasonable starting point for our analysis of GDP scaling and
economic geography.
In discussions with several experts on district and state level GDP estimates, it was emphasized to us
that different states may use different methodologies. GDP estimates for districts may result
from using a combination of components measured at the district level and some attributed from the
state level based on available district-wise indicators. Therefore, some of the regional variations in our
analysis may also reflect, in addition to variations in actual performance, states’ varying statistical
practices.
Urban Area GDP Data: For all urban areas that were contained within a single district and accounted
for over 65% of the district’s population, we use the GDP and population measures of the entire
district. There are, however, 4 urban areas that extend across multiple districts: Mumbai Urban Area
comprising the districts of Mumbai, Thane, and Raigad; Hyderabad Urban Area comprising
Hyderabad and Rangareddy; Chennai Urban Area comprising Chennai, Thiruvallur, and
Kanchipuram; and Kolkata Urban Area comprising Kolkata, Haura, 24-Parganas (North), 24-Parganas
(South), and Hugli. In each of these cases the GDP and population estimates for the urban areas were
obtained as the sum of GDPs and populations of their constituent districts. It is important to realize
that adding the GDP of these districts without consideration for intermediate inputs between them
may overestimate the total GDP of the set.
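The two construction rules above can be expressed as a short sketch. The district composition of the Mumbai Urban Area follows the text, but every GDP and population figure below is a placeholder chosen for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

URBAN_SHARE_THRESHOLD = 0.65  # the "over 65%" rule stated above

@dataclass
class District:
    name: str
    gdp: float                          # placeholder units
    population: float                   # total district population
    urban_area_population: float = 0.0  # urban area residents in the district

def single_district_estimate(district):
    """Use the whole district as the urban area if the share rule is met."""
    share = district.urban_area_population / district.population
    if share > URBAN_SHARE_THRESHOLD:
        return district.gdp, district.population
    return None  # urban area too small a share; district is not used

def multi_district_estimate(districts):
    """Sum GDP and population over constituent districts.

    As noted in the text, a plain sum ignores intermediate inputs between
    districts and may overestimate the total GDP of the set.
    """
    return (sum(d.gdp for d in districts),
            sum(d.population for d in districts))

# Composition from the text; all numbers are placeholders, not real data.
mumbai_districts = [
    District("Mumbai", gdp=300.0, population=12.0),
    District("Thane", gdp=200.0, population=11.0),
    District("Raigad", gdp=50.0, population=2.5),
]
mumbai_gdp, mumbai_population = multi_district_estimate(mumbai_districts)

included = single_district_estimate(
    District("Hypothetical district X", 40.0, 5.0, urban_area_population=4.0))
excluded = single_district_estimate(
    District("Hypothetical district Y", 30.0, 5.0, urban_area_population=2.0))

print(mumbai_gdp, mumbai_population, included, excluded)
```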
In order to include Delhi in the GDP analysis of urban areas, we obtain GDP data for Delhi state
(aggregated across all the districts that comprise Delhi), which we include as an approximation for the
Delhi Urban Agglomeration. This data is available from the state government’s publication “Socio-
economic profile of Delhi 2014-15”, accessed at https://bit.ly/2GtVqXN.
State GDP Data: State GDP data for 2010-11 was released by the Planning Commission, available at
https://bit.ly/2GeUM0X.
Population and Urbanization Data: Data on total population, rural population and urban population
for both districts and states is available from the Census of India 2011, at
http://www.censusindia.gov.in/2011census/population_enumeration.html (Primary Census Abstract
Data Tables - India & States/UTs - District Level)
References:
1. United Nations, editor. World Urbanization Prospects: The 2017 revision [Internet].
United Nations, Department of Economic and Social Affairs, Population Division; 2017.
Available: https://population.un.org/wpp/
2. Sahasranaman A, Bettencourt LMA. Urban Geography and Scaling of Contemporary
Indian Cities. Forthcoming in J R Soc Interface. 2018; Available:
https://arxiv.org/abs/1810.12004
3. Krugman P. Increasing Returns and Economic Geography. J Polit Econ. 1991;99: 483–
499. doi:10.1086/261763
4. Roberts M, Deichmann U, Fingleton B, Shi T. Evaluating China’s road to prosperity: A
new economic geography approach. Reg Sci Urban Econ. 2012;42: 580–594.
doi:10.1016/j.regsciurbeco.2012.01.003
5. Roberts M. Identifying the Economic Potential of Indian Districts [Internet]. The World
Bank; 2016. doi:10.1596/1813-9450-7623
6. Jacobs J. The Economy of Cities. Knopf Doubleday Publishing Group; 2016.
7. Duranton G, Puga D. Chapter 48 - Micro-Foundations of Urban Agglomeration
Economies. In: Henderson JV, Thisse J-F, editors. Handbook of Regional and Urban
Economics. Elsevier; 2004. pp. 2063–2117. doi:10.1016/S1574-0080(04)80005-1
8. Ciccone A. Agglomeration effects in Europe. Eur Econ Rev. 2002;46: 213–227.
doi:10.1016/S0014-2921(00)00099-4
9. Roberts M, Goh C. Density, distance and division: the case of Chongqing municipality,
China. Camb J Reg Econ Soc. 2011;4: 189–204. doi:10.1093/cjres/rsr011
10. Lall S, Deichmann U, Shalizi Z. Agglomeration Economies and Productivity in Indian
Industry [Internet]. World Bank; 1999 Nov. doi:10.1596/1813-9450-2663
11. Mankiw NG, Romer D, Weil DN. A Contribution to the Empirics of Economic Growth.
Q J Econ. 1992;107: 407–437. doi:10.2307/2118477
12. Lucas RE. On the mechanics of economic development. J Monet Econ. 1988;22: 3–42.
doi:10.1016/0304-3932(88)90168-7
13. Roberts M. The Growth Performances of the GB Counties: Some New Empirical
Evidence for 1977–1993. Reg Stud. 2004;38: 149–165.
doi:10.1080/0034340042000190136
14. Glaeser EL. Reinventing Boston: 1630–2003. J Econ Geogr. 2005;5: 119–153.
doi:10.1093/jnlecg/lbh058
15. Roberts M, Setterfield M. Endogenous Regional Growth: A Critical Survey. Handbook of
Alternative Theories of Economic Growth. Edward Elgar Publishing; 2010. Available:
https://econpapers.repec.org/bookchap/elgeechap/12814_5f21.htm
16. Bosworth B, Collins SM. Accounting for Growth: Comparing China and India. J Econ
Perspect. 2008;22: 45–66. doi:10.1257/jep.22.1.45
17. Ciccone A, Hall RE. Productivity and the Density of Economic Activity. Am Econ Rev.
1996;86: 54–70.
18. Becker R, Henderson JV. Intra-industry specialization and urban development.
Economics of Cities: Theoretical Perspectives. Cambridge University Press; 2000.
19. Dixit AK, Stiglitz JE. Monopolistic Competition and Optimum Product Diversity. Am
Econ Rev. 1977;67: 297–308.
20. Abdel‐Rahman H, Fujita M. Product Variety, Marshallian Externalities, and City Sizes. J
Reg Sci. 1990;30: 165–183. doi:10.1111/j.1467-9787.1990.tb00091.x
21. Black D, Henderson V. A Theory of Urban Growth. J Polit Econ. 1999;107: 252–284.
doi:10.1086/250060
22. Bloom DE, Canning D, Fink G. Urbanization and the Wealth of Nations. Science.
2008;319: 772–775. doi:10.1126/science.1153057
23. Tumbe C. Urbanization, Demographic Transition and the Growth of Cities in India,
1870-1920. IGC. 2016;Working Paper C-35205-INC-1.
24. Bertinelli L, Strobl E. Urbanization, Urban Concentration and Economic Growth in
Developing Countries [Internet]. Rochester, NY: Social Science Research Network; 2003
Sep. Report No.: 03/14. Available: https://papers.ssrn.com/abstract=464202
25. Henderson V. The Urbanization Process and Economic Growth: The So-What Question.
J Econ Growth. 2003;8: 47–71. doi:10.1023/A:1022860800744
26. Bettencourt LMA. The Origins of Scaling in Cities. Science. 2013;340: 1438–1441.
doi:10.1126/science.1235823
27. Bettencourt LMA, Lobo J, Helbing D, Kühnert C, West GB. Growth, innovation, scaling,
and the pace of life in cities. Proc Natl Acad Sci. 2007;104: 7301–7306.
doi:10.1073/pnas.0610172104
28. Batty M. The Size, Scale, and Shape of Cities. Science. 2008;319: 769–771.
doi:10.1126/science.1151419
29. Bettencourt LMA, Lobo J. Urban scaling in Europe. J R Soc Interface. 2016;13:
20160005. doi:10.1098/rsif.2016.0005
30. Bettencourt LMA, Lobo J, Strumsky D. Invention in the city: Increasing returns to
patenting as a scaling function of metropolitan size. Res Policy. 2007;36: 107–120.
doi:10.1016/j.respol.2006.09.026
31. Brelsford C, Lobo J, Hand J, Bettencourt LMA. Heterogeneity and scale of sustainable
development in cities. Proc Natl Acad Sci. 2017;114: 8963–8968.
doi:10.1073/pnas.1606033114
32. Bose A. From population to people. BR Publishing Corporation; 1988.
33. Sharma V. Are BIMARU States Still Bimaru? Econ Polit Wkly. 2015;L: 6.
34. Ahluwalia MS. Economic Performance of States in Post-Reforms Period. Econ Polit
Wkly. 2000;35: 1637–1648.
35. Kurian NJ. Widening Regional Disparities in India: Some Indicators. Econ Polit Wkly.
2000;35: 538–550.
36. Alkire S, Jahan S. The New Global MPI 2018: Aligning with the Sustainable
Development Goals. OPHI Work Pap. 2018;121: 21.
37. Bhandari L, Khare A. The Geography of Post-1991 Indian Economy. Glob Bus Rev.
2002;3: 321–340. doi:10.1177/097215090200300216
38. Chand R, Srivastava S, Singh J. Changing Structure of Rural Economy of India:
Implications for Employment and Growth [Internet]. Niti Aayog; 2017. Available:
http://rgdoi.net/10.13140/RG.2.2.17270.09280
39. Office of Management and Budget. 2010 Standards for Delineating Metropolitan and
Micropolitan Statistical Areas. 2010.
40. OECD. Redefining “Urban”: A New Way to Measure Metropolitan Areas [Internet].
Paris: OECD Publishing; 2012. Available: https://doi.org/10.1787/9789264174108-en
41. Sridhar KS. Density gradients and their determinants: Evidence from India. Reg Sci
Urban Econ. 2007;37: 314–344. doi:10.1016/j.regsciurbeco.2006.11.001