Ubiquitous Digital Technologies and Spatial Structure: an Update
Emmanouil Tranos (a,*) and Yannis M. Ioannides (b)
(a) Department of Geography, University of Birmingham, UK
(b) Department of Economics, Tufts University, USA
* Corresponding author
E-mail addresses: [email protected] (E. Tranos) and [email protected] (Y. M.
Ioannides)
June 22, 2017
Abstract
This paper tests whether the internet and communication technologies have offset agglomeration
benefits and led to more dispersed spatial structures or enhanced urban externalities and resulted in
more concentrated spatial systems. Both empirical and theoretical studies have researched this
question, but (i) revealed opposing findings, (ii) were based on assumptions about technological
capabilities which do not necessarily hold today, and/or (iii) used data from times before digital
technologies reached the current maturity level.
We address these issues by estimating a multi-country model which tests the effect of digital
technologies on Zipf coefficients using recent data. Then, we focus on the US and the UK, for which
we obtained novel data, to test whether such effects exist for smaller cities which were not included
in our global data. The results, which appear to be robust against endogeneity, illustrate a
complementary relationship between the internet, mobile and fixed telephony and agglomeration
externalities.
Keywords: internet, spatial structure, Zipf, cities, digital communications
JEL classification: L96, R12, C23
1. Introduction
Substantial research effort has been spent on exploring the spatial incidence of the internet, well
before its massive expansion after the turn of the new millennium. Early research emphasized the
internet’s aspatial nature (Mitchell 1995). Geographers, economists, but also engineers, theorized
about the spatial impacts that rapid internet penetration might generate on individual cities and the
national spatial structure. The outcome of various such attempts was rather deterministic, celebrating
the emergence of telecottages (Toffler 1980), the rise of a borderless world (Ohmae 1995), the death
of cities (Gilder 1995; Drucker 1998), and, in general, the end of geography (O'Brien 1992), the death
of distance (Cairncross 2001) and the emergence of a new flat world (Friedman 2005). Today, more
than 20 years after the commercialization of the internet (Kende 2003), we know that the above
narratives overstated the potential of the internet and other digital communication technologies to
supplement face-to-face interactions, diminish the cost of distance and, overall, weaken
agglomeration economies to such an extent that there would be no benefit for people and businesses
to agglomerate in cities. The current global urbanization rates (United Nations 2014) indeed cast
doubt on the above predictions.
However, the adoption rate and the pervasiveness of new internet-based communication
technologies such as online social media and mobile internet, which have increased rapidly during the
last 10-15 years, including in the developing world, raise questions about how exactly the internet and related
communication applications have affected agglomeration economies. Conflicting
examples can be given. On the one hand, despite the broader agreement that no digital
technology can reach the media richness of face-to-face communications, empirical research from the
management field suggests that current digital technologies can effectively facilitate the sharing of
knowledge with low to medium tacitness and even support knowledge sharing of a high degree of
tacitness (Panahi, Watson, and Partridge 2013). On the other hand, the very same technologies can
further enhance what Storper and Venables (2004) termed buzz, as the constant publication of our
personal and professional updates and whereabouts enabled by current digital technologies can
directly facilitate intentional and unintentional face-to-face meetings.
The above discussion is of great theoretical importance as it lies within the core of urban
agglomeration economies and, more specifically, questions the role of distance. As the urban
economics literature suggests, agglomeration externalities are triggered when agents are located in
close proximity because the potential for interaction and knowledge spillovers is higher (Rosenthal and
Strange 2004). Hence, face-to-face interactions and the implied knowledge spillovers are facilitated
within cities due to the opportunities for decreased transportation costs (Storper and Venables 2004).
However, the internet and digital communications have the capacity to directly affect this process by
decreasing transportation cost [1] (McCann 2008). In essence, the internet and digital communications
interfere with some of the micro-foundations of agglomeration economies identified by Duranton and
Puga (2004) such as learning and matching. On the one hand, web-based applications such as Massive
Open Online Courses could decrease the need for colocation of actors in order to participate in formal
learning activities. On the other hand, online social media such as LinkedIn [2] can enhance the
probability of matching and the quality of matches even within cities. The emerging question is
whether the generalized impact of the internet and other digital communications offsets the benefits
derived from agglomeration economies and results in more dispersed spatial structures, or whether it
further enhances such urban externalities and leads to more concentrated spatial structures.
[1] Or, more broadly, spatial transmission costs, to follow McCann’s (2008) terminology.
[2] LinkedIn has been identified as “the largest professional matchmaker site in the world” (Van Dijck 2013, p. 207).
This paper contributes to the above discussion by presenting empirical research on whether the
internet and digital communications have affected spatial structure and more specifically the size
distribution of cities. Contrary to most of the previous empirical studies, which are reviewed in the
next section, this paper first returns to the same empirical setting of Ioannides et al. (2008) to
re-examine their results in view of the availability of additional data on internet use which reflect
current technological maturity and developments. In doing so, the paper probes the
significance of different levels of aggregation for the relation between agglomeration externalities and
digital communications. To this end, the paper employs a multi-scalar perspective: from a global,
multi-country analysis using city and urban agglomeration data for many countries to country-specific
granular analysis for the US and the UK. For those countries we have brought novel data to bear on
the question. In addition, the paper also distinguishes the effects of different technologies on spatial
structure including internet and broadband internet as well as mobile and landline telephony adoption
rates.
Interestingly, most of the different results – from the global level analysis to the case studies – support
the complementarity argument. Specifically, the paper examines econometrically whether spatial
structures have been affected by the adoption rates of the different digital communication
technologies described above. Our results are also robust against potential endogeneity: one might
claim that the take-up of these technologies could have been affected by spatial structures, as
individuals in more dispersed spatial systems might have taken up such technologies in order to
compensate for the lower level of agglomeration externalities. In addition, we find some evidence that such
effects are stronger in smaller urban areas. Our findings can directly inform the urban policy agenda
as they support the inclusion of digital strategies in policies aiming to enhance
agglomeration externalities and improve the position of a city within its national urban hierarchy.
The structure of the paper is as follows: Section 2 provides a brief literature review of previous studies
which approach the relation between communication technologies and agglomeration externalities;
then, Section 3 describes the methods and the data we use; Section 4 presents the results of the
global, multi-country analysis; Section 5 narrows down to the two case studies, the US and UK; the
paper concludes with Section 6 which summarizes and discusses the results.
2. Literature review
This section provides a brief review of the literature which explores the relationship between
agglomeration economies and communication technologies. Gaspar and Glaeser (1998) modelled the
effect of telecommunication improvements on the intensity of face-to-face interactions and city size.
Their results indicated that technological improvements in telecommunications may lead to increased
demand for face-to-face interactions, which will then increase the importance of cities as centers of
interaction. Their theoretical model clearly supported a complementary relation between
agglomeration externalities and advances in telecommunications. However, a fundamental
assumption of their model is that face-to-face interactions are superior to any technology mediated
interactions and if this assumption does not hold, then Gaspar and Glaeser (1998) would have
predicted opposite results. As indicated above and as the management literature suggests, the
superiority of face-to-face interactions over digitally mediated ones is not always clear, as certain
elements of knowledge sharing can also be achieved via online interactions (Panahi, Watson, and
Partridge 2013; Hildrum 2009). Indeed, to a certain extent, this argument is technology dependent.
For example, current teleconferencing capabilities far exceed those available in the late 1990s.
Hence, there is a need to empirically test how the relation between agglomeration externalities and
telecommunications has evolved.
Gaspar and Glaeser (1998) paved the way for a number of empirical studies which approached the
key theoretical question of how digital communications affect agglomeration forces from various
different perspectives. Kolko (2000) used internet diffusion data and identified a clear complementary
link between internet usage and city size. Interestingly, he identified higher internet domain densities
in remote cities which indicated a substitutional effect of the internet for longer-distance non-
electronic communications. His results were consistent for different measures of internet diffusion
(internet domain density and internet take-up). Sinai and Waldfogel (2004) approached the same
question from the consumer’s point of view. Specifically, they studied the link between market size
and locally targeted online content and found that more local content is online for larger markets,
which favors a complementary link between the internet and cities. In addition, their analysis also
indicated that holding local online content constant, the market size has a negative effect on individual
connectivity which is indicative of a substitution effect. Forman et al. (2005) examined whether
commercial internet adoption is higher in cities or rural areas. While the former would indicate a
complementarity between internet adoption and cities, the latter would reflect a supplementary
relation, according to which the internet is used as a means to offset costs and lack of opportunities
related to peripherality. Their results indicated that despite internet adoption by firms with more than
100 employees being faster in smaller urban agglomerations, the adoption of more sophisticated
internet-based applications was positively related with city size in 2000. Sohn, Kim, and Hewings
(2003) compared how information technologies are related to urban spatial structure for Chicago and
Seoul. Although they found a clear complementary link for Chicago, this was not the case for Seoul,
where information technologies contributed to a more dispersed spatial pattern. Focusing on the
municipalities in the province of Barcelona, Pons-Novell and Viladecans-Marsal (2006) found a
complementary link between individual internet take-up and off-line commercial offerings.
Nevertheless, their results cannot safely reject the substitution effect. More recently, Bekkerman and
Gilpin (2013) focused on the role of locally based information resources using a dataset on US
libraries during the period 2000-2008. Their results suggested that internet access increases the
demand for and the value of locally accessible information and that such complementarities are higher in
larger metropolitan areas, which are expected to gain more benefits from internet access. Anenberg and
Kung (2015) also identified a complementary relation between the internet and consumption variety
in cities by focusing on the food truck industry in the US. Craig, Hoang, and Kohlhase (2016) focused on
internet take-up rates for the US states during the period 2000-2011. Their analysis yielded
suggestive evidence of the complementary role that internet connectivity plays in urban living.
At a more aggregated level, Ioannides et al. (2008) examined the impact of fixed line telephony on
urban structure using country-level data during the period 1980-2000. Using a panel dataset of spatial
dispersion measures, they found robust evidence that an increase in the number of telephone lines
per capita encourages the spatial dispersion of population in that it leads to a more concentrated
distribution of city sizes. Although the evidence on internet usage, which covers only the end of
their study period, is more speculative, they show that it points in the same direction. Focusing on rural
areas, Partridge et al. (2008) found no evidence that rural distance penalties in the US have
substantially changed since the 1970s, indicating that technological changes including the internet and
digital communications have not managed to alter spatial structure. Their interpretation of the
absence of an increase in relative growth rates in rural counties is that either technological
improvements increased distance-related costs, an argument which is aligned with the ideas
proposed by McCann (2008), or distance costs have not decreased enough to alter the
growth trajectories of rural areas. Also of interest are the findings of a recent study by Kim and Orazem
(2016) on the economic effects of broadband internet in rural areas. They identified a positive effect
on new firm location decisions, but this effect is higher in rural areas with larger population and in
those rural areas which are adjacent to a metropolitan area, suggesting a complementarity between
the internet and agglomeration economies.
Thus, the consensus that emerges from previous research is that the death of distance discussion has
proved to be premature (Rietveld and Vickerman 2004). However, the exact impact of digital
communications on spatial structure is still an open question. As Leamer and Storper (2001) indicated,
the internet can affect both centripetal and centrifugal forces. Although most of the above studies
supported the complementarity argument, the results are not always conclusive and quite a few of
the above studies provided evidence for a supplementary relation. Interestingly, the only global study
(Ioannides et al. 2008) supported a clear substitution effect, which might be indicative of how
heterogeneous over space the effect of telecommunications on cities is. Indeed, most of the above
studies focused either on the US or on some specific cities. Moreover, most of the above studies
examined the complementarity/substitution question for a time period when the internet and other
digital communication technologies were still emerging technologies. For instance, internet
penetration in the US in 2000, which was the focus for quite a few of the above studies including
Ioannides et al. (2008), was just above 50 percent, while in 2016 it reached almost 90 percent (Pew
Internet 2016). At a global scale, internet penetration rose from 7 to 46 percent during the same
period (Internet Live Stats 2017). In addition, although email and instant messaging technologies were
widespread in the developed world in the early 2000s, network externalities because of mobile internet
and online social media were nowhere close to what we experience today. For instance, Facebook
users increased from 1 million in 2004 to more than 1.5 billion in 2015 [3]. Hence, it might have been
premature for the spatial economic effects of the internet and other digital communications to have
materialized by the time that most of the above studies were conducted.
3. Methods and data
The main aim of this paper is to estimate the potential impact that the internet and digital
communication technologies have generated on the spatial dispersion of economic activities and
consequently population. In order to do so, we adopt a multi-scalar approach. We start with a multi-
country exercise, which includes both developed and developing countries. Section 3.1 discusses the
methods and the different data we use and Section 4 reports the results. Because of limitations related
to multi-country urban population data (see discussion below), we complement our analysis with
two case studies for the US and the UK urban systems, for which we have access to much more granular
data. Section 3.2 discusses the methods and Section 5 reports the results.
[3] http://newsroom.fb.com/company-info/
3.1 Multi-country analysis
The multi-country identification strategy is a two-step approach which uses the work of Ioannides et
al. (2008) as a starting point. The first step of our methodology is to estimate the Zipf coefficient for a
broad sample of countries over time. The Zipf coefficient is one of the most widely used measures of
spatial dispersion, with numerous applications in urban economics and economic geography (see for
example Black and Henderson 2003; Frenken and Boschma 2007; Giesen and Südekum 2011; Rauch
2013; Ioannides and Overman 2003; Ioannides and Zhang 2017; Nitsch 2005). This is an appropriate
measure of dispersion because of the extreme heterogeneity of the city size distribution and the very
good fits normally obtained with such estimations. City sizes, s, satisfy Zipf’s law if
$$P(s > S) = \frac{a}{S^{\zeta}} \tag{1}$$
where ζ is typically estimated to be very close to 1, and a is a constant that is equal to the minimum
city size raised to the power of ζ (e.g. Gabaix and Ioannides 2004). In other words, the share of
cities with population greater than S equals the constant a multiplied by the inverse population size
S, if ζ is close to 1. An approximation of Zipf’s law is the rank-size rule. According to this deterministic
rule, twice the population of the second largest city within an urban system equals the population
size of the largest city; similarly, three times the population of the third largest city equals the
population size of the largest city, etc. Therefore, eq. (1) can be approximated by the following
equation (Gabaix and Ioannides 2004):
$$S_i \approx \frac{S_0}{r_i} \tag{2}$$
where S_0 is a constant equal to the largest urban population of the urban system and r_i is the
rank of city i, whose population S_i we are trying to estimate. The estimation of the logarithmic
form of eq. (1) has been extensively used in the literature to obtain an estimate of ζ, known as the
Zipf coefficient:
$$\ln r_i = \ln S_0 + \zeta \ln S_i + e_i \tag{3}$$
Based on the above discussion, Zipf’s law holds when ζ is close to 1. More generally, estimations of city
size distributions have also considered exponents ζ that are not necessarily equal to 1, in which case
we refer to ζ as the Pareto, or power law, exponent. Given that our aim here is to estimate the Zipf
coefficient for a number of countries over time as a measure of dispersion, equation (3) is extended to describe the
rank of city i in country c in year t:
$$\ln r_{ict} = \ln S_{0ct} + \zeta_{ct} \ln S_{ict} + e_{ict} \tag{4}$$
The estimation of (4) has traditionally been performed by Ordinary Least Squares (OLS). Gabaix and
Ioannides (2004) discussed the downward bias of estimates of (4) using OLS on small samples. Gabaix
and Ibragimov (2011) propose a practical remedy to correct this bias, which we adopt in this paper:
instead of using the log of the rank of city i in country c in year t, they propose using the log of
(rank − 0.5), a correction which has indeed been widely adopted.
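The first step can be illustrated with a minimal sketch in plain Python, assuming a simple list of city populations for one country and year. The function name `zipf_coefficient` and the synthetic input are hypothetical and are not taken from the paper's data or code. The sketch regresses ln(rank − 0.5) on ln(size) by OLS, as in eq. (4) with the rank shift, and reports the asymptotic standard error sqrt(2/n)·|ζ| suggested by Gabaix and Ibragimov (2011) in place of the conventional OLS standard error.

```python
import math


def zipf_coefficient(city_sizes):
    """Return (slope, std_err) from regressing ln(rank - 0.5) on ln(size).

    The slope is the signed Zipf coefficient (negative for real city data).
    """
    sizes = sorted(city_sizes, reverse=True)  # rank 1 is the largest city
    n = len(sizes)
    x = [math.log(s) for s in sizes]
    y = [math.log(rank - 0.5) for rank in range(1, n + 1)]

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx

    # Asymptotic standard error from Gabaix and Ibragimov (2011),
    # used instead of the (understated) OLS standard error.
    std_err = abs(slope) * math.sqrt(2.0 / n)
    return slope, std_err


# Hypothetical urban system that follows the rank-size rule exactly.
example_sizes = [1_000_000 / rank for rank in range(1, 201)]
zeta_hat, zeta_se = zipf_coefficient(example_sizes)
```

For a system that obeys the rank-size rule, the estimated slope should lie close to −1; the standard error returned here is the quantity that could later serve as a weight in a second-step regression.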
Researchers working in this area must contend with definitional differences as well as differences in
availability of different kinds of data sources. Definitions of cities differ across countries, for political,
administrative and legal reasons. In order to compare our results with previous work by Ioannides
et al. (2008), our starting point was the data obtained from Thomas Brinkhoff’s City Population project
(Brinkhoff 2014) [4], which was also used by Soo (2005). However, because of limited success in
addressing potential endogeneity issues as well as inconsistencies related to city definitions across
different countries [5], we only briefly discuss the results based on these data at the beginning of Section
4 and report them in the Appendix. Our multi-country analysis is then based on the annual population
data for urban agglomerations with 300,000 inhabitants or more from the Department of Economic
and Social Affairs of the United Nations (United Nations 2014). Despite some criticism about the
consistency of the urban agglomeration definitions across different countries (Cohen 2004), this is the
only available source for yearly, multi-country population data for urban agglomerations
(Montgomery 2008; Chen and Ravallion 2007; Decker, Kerkhoff, and Moses 2007). The results based
on these data are reported and discussed in Section 4.
Table 1 presents the estimated Zipf coefficients for the panel of countries that the second stage of the
analysis focuses on, using the UN urban agglomerations data [6]. It becomes evident that there is
considerable variation in the estimated coefficients across the different countries of the world.
Because of the empirically established heavy upper tail of data for cities and urban agglomerations,
the Zipf coefficient constitutes a convenient measure of dispersion. The larger its absolute value, the
thinner the upper tail; equivalently, the larger the coefficient is algebraically, the heavier the upper
tail. This key observation is the basis for the second step of our methodology.
Insert Table 1
The second step of our methodology involves estimating the following empirical model (Ioannides et
al. 2008):
$$\zeta_{ct} = \theta_c + \delta t + X_{ct}\eta + \varepsilon_{ct} \tag{5}$$
This empirical model will enable us to estimate the effect of a number of explanatory variables,
included in the vector X, which pertain to the spatial structure of country c in year t as captured by the
Zipf coefficient ζ reported in Table 1. The main variables of interest here are internet
and digital communications variables including: internet users per 100 inhabitants, broadband users
per 100 inhabitants, mobile phone users per 100 inhabitants, and fixed phone users per 100
inhabitants. To address a potential omitted variable bias, (5) includes country fixed effects θ_c, as well
as a time trend δt; ε_ct is the error term. In addition, vector X includes a number of control variables,
the descriptive statistics of which, together with those for the other variables used to estimate (5), are
reported in Table 2. Regarding the control variables, total country population is an important
measure of size; GDP per capita and GDP growth are intimately related to urbanization, and so are
population density and non-agricultural value-added as a share of GDP. Trade, that is, exports and
imports as a share of GDP, is an important time-varying measure of openness. Government
expenditure as a share of GDP may be a proxy of public investment in some countries and government
waste in others.
[4] http://www.citypopulation.de Freedom House (2014)
[5] Thomas Brinkhoff’s City Population data usually refer to administrative units instead of functional cities.
[6] A similar table based on Thomas Brinkhoff’s City Population data can be found in the Appendix.
As Table 3 indicates, although there are some rather strong correlations among these variables, mobile
phone penetration appears to have a distinct character from its fixed phone counterpart: their
correlation coefficient is only 0.445. This probably highlights the different composition of the
population or infrastructure development patterns in the developing world, where mobile telephony
helped overcome the lack of fixed line infrastructure and mobile phone networks are also used as the
main way to use the internet (Donner 2008; Hamilton 2003).
Insert Table 2
Insert Table 3
The availability of a panel dataset for city sizes across countries enables us to use country fixed effects,
which can address potential endogeneity issues related to unobserved country specific characteristics
of city size distributions. However, such a strategy does not address potential simultaneity issues.
Simply put, internet penetration might be affected by spatial structure, as reflected in Zipf coefficients,
or both internet penetration and spatial structure might be jointly determined by a third variable. For
example, if a country already has a dispersed spatial structure, the internet is particularly suitable for facilitating
communication. Potential endogeneity in our specification would prevent us from being able to
determine the causal impact of internet and digital communication technology usage on spatial
structure, which is the main aim of this paper. In order to address this problem, we adopt an
instrumental variable strategy. Table 2 also includes the descriptive statistics for the instrumental
variables we use, which are discussed in Section 4.
3.2 Case study approach
The above global level analysis is followed by two case studies in order to assess the potential internet
effects on more ‘complete’ urban systems without the exogenously imposed threshold of the 300,000
inhabitants that the global level analysis adopts. To overcome this limitation, we focus on the US and the
UK, for which we have more granular internet-related data (see Section 5 for the data description).
Given that the analysis will take place separately for each of these two countries, the panel data
structure that shaped the identification strategy of the global level analysis cannot be applied here. In
other words, we cannot adopt a two-step approach and include the Zipf coefficient as the LHS variable
of the second step. Therefore, we propose a one-step approach and the estimation of an empirical
cross-sectional model:
$$\text{Diff in ranks}_i = a + \beta \, ICT_i + BC_i + \varepsilon_i \tag{6}$$
In order to capture the micro-dynamics of the urban systems in the two case studies, we follow Batty
(2006) and Havlin (1995) and focus on the difference in ranks for individual cities during the study
period. Contrary to their approach, we are not interested in the absolute difference in ranks, but in
the signed difference, in order to capture whether or not a city improves its position in the urban system
during the study period and then test whether our internet-related variable has an effect on this.
Hence, we define the LHS variable of (6) as follows:
$$\text{Diff in ranks}_i = r_{i(t-1)} - r_{it} \tag{7}$$
where 𝑟𝑖𝑡 is the population rank of city i in year t. A negative (positive) value for the Diff in ranks
variable indicates that a city’s position in the urban hierarchy of the country worsens (improves) in
relative terms, also due to the population changes of the other cities of the urban system. Notably,
this variable does not only consider the population change of a specific city, but it also considers the
overall urban system dynamics by focusing on the rank and not on population per se. Given that the
data we use and the definitions of cities vary between the UK and the US, we discuss these
data in the relevant sections. What we highlight here is that the estimation of (6), just like (5), might
suffer from endogeneity and therefore an instrumental variables strategy is employed in order to
address this issue.
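The construction of the LHS variable in eq. (7) can be sketched as follows. This is only an illustration under assumed inputs: the function names and the three hypothetical cities are not from the paper, and ties in population are broken arbitrarily by sort order.

```python
def population_ranks(populations):
    """Map each city to its population rank, where rank 1 is the largest."""
    ordered = sorted(populations, key=populations.get, reverse=True)
    return {city: rank for rank, city in enumerate(ordered, start=1)}


def diff_in_ranks(pop_start, pop_end):
    """Signed rank change per city: rank at the start minus rank at the end.

    Positive values mean the city moved up the urban hierarchy,
    negative values mean it moved down.
    """
    if set(pop_start) != set(pop_end):
        raise ValueError("both periods must cover the same set of cities")
    ranks_start = population_ranks(pop_start)
    ranks_end = population_ranks(pop_end)
    return {city: ranks_start[city] - ranks_end[city] for city in ranks_start}


# Hypothetical populations (in thousands) at the start and end of a period.
start = {"A": 500, "B": 300, "C": 200}
end = {"A": 520, "B": 310, "C": 400}
changes = diff_in_ranks(start, end)
```

In this made-up example city B grows in population but still loses one rank because city C overtakes it, which is the sense in which the variable reflects the dynamics of the whole urban system rather than a single city's growth.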
4. Digital technologies and spatial structure: a global view
This section presents the estimation results of eq. (5). Again, the LHS variable is the Zipf coefficient, as
estimated according to the Gabaix and Ibragimov (2011) correction. The main variables of interest can
be found at the top of Table 4, namely internet, broadband, mobile and fixed telephony per 100
inhabitants expressed in natural logarithms. These variables are introduced successively
on their own in the regressions reported in Table 4. All regressions include country fixed effects to
control for unobserved heterogeneity and a time trend. In addition, the observations are weighted
by the inverse squared standard error of the estimated Zipf coefficient to address potential noise
that is carried over from the first part of our identification strategy. With regard to the interpretation of
the estimated coefficients, given that the Zipf coefficient has entered the regression not as an absolute
value but as a real (signed) number, a positive coefficient for a RHS variable indicates an impact
towards the decrease of the spatial dispersion of population. In other words, a positive coefficient
indicates an effect towards less uniform city sizes, that is, more dispersion of city sizes. The latter is
indicative of an enhancement of agglomeration economies because of the expansion of digital
technologies.
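The weighting scheme described above can be sketched in a deliberately simplified form. The example below assumes a single explanatory variable and omits the time trend and all controls, so it is not the specification reported in Table 4; the function name and the toy rows are hypothetical. It absorbs country fixed effects by subtracting weighted country means (the within transformation) and weights each observation by the inverse squared standard error of its estimated Zipf coefficient.

```python
from collections import defaultdict


def weighted_within_slope(rows):
    """Weighted fixed-effects slope for one regressor.

    rows: iterable of (country, zeta_hat, zeta_se, regressor) tuples.
    Weights are 1 / zeta_se**2, so noisier first-step estimates count less.
    """
    rows = list(rows)
    sum_w = defaultdict(float)
    sum_wy = defaultdict(float)
    sum_wx = defaultdict(float)
    for country, zeta, se, x in rows:
        w = 1.0 / se ** 2
        sum_w[country] += w
        sum_wy[country] += w * zeta
        sum_wx[country] += w * x

    num = 0.0
    den = 0.0
    for country, zeta, se, x in rows:
        w = 1.0 / se ** 2
        # Deviations from the weighted country means remove the fixed effect.
        y_dev = zeta - sum_wy[country] / sum_w[country]
        x_dev = x - sum_wx[country] / sum_w[country]
        num += w * x_dev * y_dev
        den += w * x_dev ** 2
    return num / den


# Hypothetical panel: two countries, log internet users as the regressor.
toy_rows = [
    ("A", -0.8, 0.1, 1.0),
    ("A", -0.6, 0.2, 2.0),
    ("A", -0.4, 0.1, 3.0),
    ("B", -1.1, 0.3, 2.0),
    ("B", -0.7, 0.1, 4.0),
]
eta_hat = weighted_within_slope(toy_rows)
```

Because the Zipf coefficient enters with its sign, a positive slope in this sketch corresponds to the reading given above: higher technology adoption is associated with less uniform city sizes.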
We first estimated eq. (5) using as the LHS variable the Zipf coefficient based on the Thomas
Brinkhoff’s City Population data (Brinkhoff 2014). Because of the data consistency issue
(administrative units instead of functional cities) and also because we did not manage to address the
potential endogeneity issues, we only report and discuss these results in the Appendix. Tables 4 and 5
report the estimation results based on the UN urban agglomeration data.
Insert Table 4
We note that an agglomerative effect is only detected for internet users and marginally for
broadband users, as indicated by the significant coefficients in columns (1) and (2). For the telephony
variables, the estimation of (5) did not yield statistically significant coefficients. Before discussing
these results and the effects of the other control variables further, we need to highlight that the main
challenge of estimating equation (5) is the potentially endogenous nature of the share of internet users,
which might prevent us from being able to infer a truly causal effect. Endogeneity might be an issue
here as spatial structure, which is represented by the Zipf coefficient, might be affected by another
source, which also affects internet penetration. For instance, economic development might affect the
concentration of population in large cities and at the same time enable more people to go online. If
we do not address this issue, the coefficient for the main variable of interest will capture potential
effects that internet penetration has on spatial structure, but also potential reverse causality effects
that spatial structure might generate on internet penetration. To overcome this potential problem,
Table 5 reports estimates of eq. (5) using two-stage least squares (2SLS) with instrumental variables
(IVs). The latter are variables which are correlated with our endogenous variables, but do not influence
current spatial structure. Such an approach will enable us to estimate the causal effect – if any – of
the internet and digital communications on spatial structure. In the first stage, our endogenous variables
are regressed against the IVs. Then, the predicted values of the endogenous variables based on the
IVs and the other control variables are used instead of the endogenous variables to estimate eq. (5). A
significant effect would indicate a causal impact of the internet and digital communication usage on
spatial structure.
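The two stages described above can be sketched for the simplest possible case: one endogenous regressor, one instrument, and no controls, fixed effects or weights. This is an assumption made for brevity, not the specification behind Table 5, and the function names and numbers are hypothetical. The toy data are built so that an unobserved factor `u` drives both the regressor and the outcome, which biases plain OLS but not the instrumented estimate.

```python
def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_stage_least_squares(y, x, z):
    """2SLS point estimates with one endogenous regressor x and one instrument z."""
    # First stage: project the endogenous regressor on the instrument.
    a1, b1 = ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Second stage: regress the outcome on the fitted values.
    # Note: standard errors from this naive second stage would be wrong;
    # only the point estimates are meaningful here.
    return ols(x_hat, y)


# Hypothetical data: u is an unobserved confounder, uncorrelated with z.
z = [1.0, 2.0, 3.0, 4.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

iv_intercept, iv_slope = two_stage_least_squares(y, x, z)
_, ols_slope = ols(x, y)
```

In this construction the true effect of `x` on `y` is 2; comparing `iv_slope` with `ols_slope` shows how the confounder inflates the uninstrumented estimate.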
Insert Table 5
The main challenge for such an exercise is to find valid IVs. We propose here a set of instruments
which are either economic development or technology adoption indicators and are directly related to
digital infrastructure, but not to spatial structure. Their descriptive statistics are presented in Table 2.
Table 5 presents the results of the 2SLS estimations and column (1) tests the effect of internet users.
We instrumented this variable with the mortality rate for children below the age of 5 per 1 million
inhabitants. We expect that child mortality is correlated with education and therefore with internet
usage. However, we have no theoretical reasons to believe that spatial structure is driven by such
mortality rates. Various studies focusing mostly on developing countries have not found statistical
links between child and infant mortality and urbanization. Hobcraft, McDonald, and Rutstein (1984)
argue that although one might expect lower mortality rates in large cities because of better
socio-economic indicators such as income, this might not always be true because public health provision
might be worse in urban areas. Bicego and Boerma (1993) suggest that although accessibility to
modern health services might be higher in cities, urban lifestyles could result in weaker family support
networks, which could affect children’s health. With regard to rural areas, they point to social pressure,
which can be higher in such areas and can therefore result in the adoption of traditional practices (e.g. of
child rearing) which can be harmful to child health. Thus, the relationship between child mortality rate
and urban structure is not clear and therefore we believe it is a valid instrument for our estimations.
Moreover, according to Table 5 the regression presented in column (1) does not suffer from weak
identification as the first stage F-test is much higher than all the critical values (Stock and Yogo 2005).
For columns (2) and (3) we adopt a technology-related instrument: the number of Secure Sockets Layer
(SSL) certificates per 1 million inhabitants. This technology enables secure web traffic between a user’s
internet browser and a web server. For instance, web banking applications use such a technology to
safeguard the communication between a user and the bank’s server. Given that the internet’s
topology is not necessarily defined by national borders, we believe that this variable does not drive
spatial structure, as it is more closely related to the level of sophistication and the dynamism of the
digital economy than to the location of internet users. Specifically, if an internet user from a country A
uses a web service provided by a company located in a country B, then our instrument only counts
the company which issued the SSL certificate in country B and not the user in country A. In addition,
the correlation coefficient between this instrument and the Zipf coefficient is only 0.018. Moreover, the
first stage F-test for both columns (2) and (3) is much higher than all the critical values, which indicates
that our estimations do not suffer from weak identification.
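As a rough illustration of what the first stage F-test measures, the sketch below computes the F statistic of a first-stage regression of an endogenous variable on a single instrument. It is a simplified version under stated assumptions: no control variables, one instrument, synthetic data, and hypothetical names. Comparison with the Stock and Yogo (2005) critical values requires their published tables, which are not reproduced here.

```python
from statistics import mean

def first_stage_f(z, x):
    """F statistic for the instrument z in a regression of x on a constant and z."""
    n = len(z)
    mz, mx = mean(z), mean(x)
    szz = sum((zi - mz) ** 2 for zi in z)
    sxx = sum((xi - mx) ** 2 for xi in x)
    szx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    r2 = szx ** 2 / (szz * sxx)          # R-squared of the first stage
    return r2 / ((1 - r2) / (n - 2))     # one restriction, n - 2 residual d.o.f.

# Synthetic instrument and endogenous variable (assumption, not paper data).
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]
f_stat = first_stage_f(z, x)
```

A larger statistic indicates a stronger correlation between the instrument and the endogenous variable, which is what the weak identification tests in Table 5 assess.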
Similarly, fixed telephony penetration is instrumented by the percentage of female participation in
the labor force (column 4). Our instrumental variable is related to fixed telephony, as higher female
participation in the labor force leads to higher demand for fixed telephony, but it is not related to
spatial structure, especially given the diverse sample of countries. Firstly, the correlation coefficient
between the IV and the Zipf coefficient is low (0.18). Secondly, although one might think that
increased female labor participation may lead to relocation of households to large cities and therefore
affect the spatial structure of cities, such a process varies considerably within our dataset. Even if this
might be true for developed countries with mature urban systems, in which a location within a large urban
center might provide opportunities for both male and female workers to find jobs, our dataset also
includes countries in which 37 per cent of GDP is attributed to primary activities (see Table 2). So,
female labor participation might also be related to economic activities located outside large urban
centers. In terms of the relevant tests presented in Table 5, the weak identification first stage F-test
exceeds by far all the Stock-Yogo critical values.
To further assess our IV strategy, columns (5)-(8) include a second IV, which enables the estimation
of the Sargan test of over-identifying restrictions, the null hypothesis of which is that all
instruments are uncorrelated with the error term. Failure to reject this null supports the validity of
our instruments. For the regressions reported in columns (5), (6) and (8), on top of the IVs used for
the regressions in columns (1), (2), and (4), we also adopt as a second IV a 20-year lag of the percentage
of households with a TV. Given that TV broadcasting in many countries is based on aerial signals and
not on cables, and therefore could not have complemented the development of other digital
infrastructural systems which might be endogenous (e.g. cable internet), we believe that this is a valid
instrument. This argument is further strengthened by the fact that we use a 20-year lag of this
variable. For the regression reported in column (7) we add as a second IV a 20-year lag of total employment
in telecommunications. The underlying argument here is that a 20-year lag of the size of the
telecommunications sector (1980-1995) does not necessarily affect the current spatial structure.
The telecommunications sector during that period was highly nationalized in various countries and
therefore its size might not directly correspond to efficient telecommunications which could have
driven spatial structure. Moreover, this IV is not correlated with the Zipf coefficient (0.02).
Importantly, all the Sargan tests reported in Table 5 support our choice of IVs, as for all the regressions
in columns (5)-(8) we fail to reject the null hypothesis. In addition, a comparison between the
regressions with one and two instruments indicates only marginal quantitative differences in the
magnitude of the coefficients, which further supports our identification strategy.
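The logic of the Sargan test can be sketched as follows: the 2SLS residuals are regressed on all instruments, and the number of observations times the R-squared of that regression is compared with a chi-squared distribution whose degrees of freedom equal the number of over-identifying restrictions. The sketch assumes one endogenous regressor, two instruments and no controls, so there is one degree of freedom; all names and data are hypothetical and synthetic.

```python
import math

def solve(a, b):
    """Solve the linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_fit(columns, y):
    """OLS of y on a constant plus the given columns; returns (coef, fitted)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return coef, [sum(c * v for c, v in zip(coef, r)) for r in rows]

def sargan(z1, z2, x, y):
    """Sargan statistic and p-value for one endogenous x and instruments z1, z2."""
    n = len(y)
    _, x_hat = ols_fit([z1, z2], x)                      # first stage
    (b0, b1), _ = ols_fit([x_hat], y)                    # second stage
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # residuals use actual x
    _, fitted = ols_fit([z1, z2], resid)                 # residuals on instruments
    mean_r = sum(resid) / n
    r2 = sum((f - mean_r) ** 2 for f in fitted) / sum((e - mean_r) ** 2 for e in resid)
    stat = n * r2
    return stat, math.erfc(math.sqrt(stat / 2))          # chi-squared(1) upper tail

# Synthetic data (assumption): u is the structural error, unrelated to z1 and z2.
z1 = [1, 2, 3, 4, 5, 6, 7, 8]
z2 = [1, 1, -1, -1, 1, 1, -1, -1]
u = [1, -1, -1, 1, 1, -1, -1, 1]
x = [a + b + c for a, b, c in zip(z1, z2, u)]
y_valid = [2 * xi + 3 * ui for xi, ui in zip(x, u)]
y_invalid = [yi + 4 * b for yi, b in zip(y_valid, z2)]   # z2 enters y directly

stat_valid, p_valid = sargan(z1, z2, x, y_valid)
stat_invalid, p_invalid = sargan(z1, z2, x, y_invalid)
```

When both instruments are valid the statistic is close to zero and the null is not rejected; when the second instrument affects the outcome directly, the statistic rises and the null is rejected at the 5 per cent level in this constructed example.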
The estimations presented in Table 5 indicate that increases in internet, broadband internet, mobile
and fixed telephony usage have resulted in a decrease of the spatial dispersion of population for the
time period and for the panel of countries included in our data. In other words, increases in internet
and digital communications resulted in national urban systems which are less uniform in terms of city
sizes and are characterized by higher population concentrations in larger cities. Importantly, our
identification strategy enables us to address potential reverse causality issues and treat the results of
Table 5 as causal. Hence, the main finding of our multi-country, global analysis based on data for urban
agglomerations of at least 300,000 inhabitants is that internet, broadband, fixed and mobile phone
usage appear to act in favor of agglomeration economies and result in urban systems with more
dominant cities at the top of the urban hierarchy.
In terms of magnitude, mobile phone penetration appears to have the largest effect on Zipf
coefficients. This is indicative of the pervasiveness of this technology and its transformative effects. It
is more widespread than internet and broadband usage, especially in the developed world. In
addition, mobile telephony, contrary to landline telephony, is also used today as a platform for
internet access and internet-based applications. Its coefficient is more than double the coefficient of
internet usage and almost three times the coefficient of broadband users.
Regarding the control variables, only a few of them have significant effects on spatial structure
probably because the fixed effects (within) estimation masks the between-country variation. These
effects are in agreement with previous research (Ioannides et al. 2008). Namely, GDP per capita has a
consistent negative effect which indicates that wealthier countries tend to have more balanced urban
systems. The same applies to trade openness. The latter is in accordance with New Economic
Geography models which indicate that international trade openness might weaken agglomeration
forces (Fujita, Krugman, and Venables 1999). In addition, the significant and negative coefficient of
the time trend indicates that over time agglomeration forces weaken.
In total, the results indicate that our measures of internet and telephony penetration have further
enhanced agglomeration forces, at least for large urban agglomerations. These results are robust
to endogeneity issues, but are limited to the urban agglomerations included in our data. Indeed, for
the estimation of the Zipf coefficient we only included urban agglomerations of 300,000 inhabitants or
more due to data availability. Hence, the above estimations cannot verify whether such an effect is
also valid for smaller cities. In order to overcome this limitation, the next section presents two case
studies, for which we have obtained much more granular data and are therefore in a position to
test the effect of the internet on the tail of the urban population distribution for these countries.
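To make the role of the size threshold concrete, the following sketch estimates a Zipf coefficient from a list of city populations by regressing log rank on log size. It assumes the rank minus 1/2 correction of Gabaix and Ibragimov (2011), which appears in the reference list; whether this is the exact estimator behind the reported coefficients is an assumption here, as are the function name, the sign convention and the synthetic data.

```python
import math

def zipf_coefficient(populations, min_size=300_000):
    """Estimate the Zipf coefficient from city sizes at or above min_size."""
    sizes = sorted((p for p in populations if p >= min_size), reverse=True)
    if len(sizes) < 2:
        raise ValueError("need at least two cities above the size threshold")
    log_size = [math.log(s) for s in sizes]
    # Rank 1 is the largest city; the 1/2 shift is the small-sample correction.
    log_rank = [math.log(rank - 0.5) for rank in range(1, len(sizes) + 1)]
    mean_s = sum(log_size) / len(sizes)
    mean_r = sum(log_rank) / len(sizes)
    sxx = sum((s - mean_s) ** 2 for s in log_size)
    sxy = sum((s - mean_s) * (r - mean_r) for s, r in zip(log_size, log_rank))
    return -sxy / sxx   # the slope is negative; report the coefficient as positive

# Synthetic sizes following an exact rank-size rule, plus one small town that
# falls below the threshold and is therefore ignored (not real data).
cities = [10_000_000 / (rank - 0.5) for rank in range(1, 11)] + [50_000]
zeta = zipf_coefficient(cities)
```

Because every city below the threshold is dropped before the regression, the estimate carries no information about the lower tail of the distribution, which is the limitation noted above.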
5. The Impact of ICT on the US and the UK Spatial Structures
This section focuses on the US and the UK urban systems, estimating eq. (6) from Section 3.2
separately for the cities of each country.
5.1 Internet and the US Spatial Structure: Evidence from the US Micropolitan and Metropolitan
Statistical Areas, 2013-2015
We pursue further our investigation of the impact of internet adoption by using a data source which,
to the best of our knowledge, has not previously been used for this purpose. Specifically, data on internet
use for 2013 and 2015 were made available for the first time via the American Community Survey and are
provided at the metropolitan area level (urban areas comprised of one or more adjacent counties or county
equivalents that have at least one urban core area of at least 50,000 population, plus adjacent territory
that has a high degree of social and economic integration with the core as measured by commuting
ties) and at the micropolitan area level (defined like metropolitan areas except that they are comprised of an
urban core of at least 10,000, but less than 50,000 population) (File and Ryan
2014). These functional definitions of a city in essence represent labor markets and, according to Table
6, the minimum observed city population used for this analysis is just above 62,000 inhabitants.
Insert Table 6
As the LHS variable we use here the outcome of eq. (7) in a normalized form, so that the variable is
bounded between 0 and 1, with a mean of 0.57. This transformation enables us to estimate eq. (6) not
only with OLS, but also with a quasibinomial GLM estimator, given that the original form of our LHS
variable does not vary continuously, but is instead defined as a difference between two count
variables, which may also assume negative values. The results of the different estimations remain
qualitatively the same regardless of the form of the LHS variable we use.
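The normalization can be sketched in a few lines, following the formula given in the footnote: the difference in ranks plus the number of cities, divided by twice the number of cities. The function name and the sign convention (initial rank minus final rank, so that moving up the hierarchy gives a positive difference) are assumptions for illustration.

```python
def normalized_rank_difference(rank_start, rank_end, n_cities):
    """Map a change in rank onto [0, 1]; 0.5 corresponds to no change."""
    # Assumed sign convention: rank 1 is the largest city, so a city that
    # climbs from rank 10 to rank 5 has a difference in ranks of +5.
    difference = rank_start - rank_end
    return (difference + n_cities) / (2 * n_cities)

no_change = normalized_rank_difference(10, 10, 100)
best_case = normalized_rank_difference(100, 1, 100)    # largest possible climb
worst_case = normalized_rank_difference(1, 100, 100)   # largest possible fall
```

Since the difference in ranks cannot exceed the number of cities in absolute value, the transformed variable always lies between 0 and 1, which is what makes a quasibinomial GLM applicable.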
Table 7 reports the results of the estimation of eq. (6) using OLS. The sign and significance of the main
variable of interest, the percentage of population with a computer and a broadband connection, are
consistent with our global-level results. That is, an increase in internet usage improves the position of a
city in the US urban system. The results are consistent across the different estimators, namely OLS and GLM. In
addition, Table 7 also reports estimates of the interaction effects of the share of population
with a broadband connection with population and with population density. The latter is marginally significant
and negative, which provides some weak evidence that the effect of broadband penetration might be
larger for smaller urban areas.
Insert Table 7
Although not directly comparable, the above estimations are in accordance with our global model.
However, the potential presence of endogeneity might be a problem here as well. Therefore, Table 8
reports 2SLS estimations and follows the same IV strategy presented in section 3. That is, the first
column of Table 8 includes one instrument which we have strong reasons to believe is uncorrelated
with the error term. The second column includes a second instrument in order to estimate the Sargan
test of over-identifying restrictions. The instrument we propose here is Bachelor’s degrees per
inhabitant in 2005. Even if the quality of human capital can affect the population growth of a city 10
years later, our LHS variable adopts a systemic understanding of the US urban system, as it measures
the relative position of a city within the overall urban system instead of its population growth. In other
words, a city might experience population growth between two periods, but if other cities have also
experienced population growth, this might not affect its relative position within the urban system.
Hence, we do not expect that this IV affects the LHS variable. In addition, the correlation coefficient
between these two variables is low (r=0.24). Moreover, when we add the second IV, which
in this case is the commute time in minutes, also in 2005, we fail to reject the null hypothesis of the
Sargan test, which also adds to the validity of our strategy. Furthermore, our results do
not suffer from weak identification according to the relevant tests in Table 8.
Insert Table 8
In total, the estimates of Table 8 enable us to identify a causal effect of the share of the population
with a computer and a broadband connection on the position of a city in the urban system. Specifically,
an increase in broadband penetration led to an improvement of the position of a city in the US urban
hierarchy for the cities and the time period included in our analysis. Everything else being equal, if a
city whose position in the US urban system did not change during the period 2013-2015 had
experienced an increase of 10 percentage points in broadband penetration, this increase would have
improved its relative position by 4.6 places. As in the global model reported in Section 4, internet
penetration appears to work in favor of agglomeration externalities. With regard to the control
variables, we can identify a negative effect of income, which is consistent with the effect of GDP per
capita in the global model presented in Table 5. During the study period, unemployment rates and the
share of white population negatively affected the ranking of micropolitan and metropolitan areas in
the US.
7 $\text{Normalized difference in ranks}_i = \dfrac{\text{Difference in ranks}_i + N_{cities}}{2 \times N_{cities}}$, which thus falls in $[0, 1]$.
5.2 Internet and the UK Spatial Structure: Evidence from the Built-up Areas in England and Wales,
2011-2014
The next step in our analysis is to estimate eq. (6) for cities in England and Wales for which we were
able to access internet speed micro-data. More specifically, we obtained individual internet speed
tests from broadbandspeedchecker.co.uk. This website enables individuals to directly measure their
upload and download internet speed. The results of the tests as well as the geo-location of the users
are recorded by the website operator and were provided to us in a fully anonymized manner. More
discussion about the nature and the validity of these data can be found in the work of Riddlesden and
Singleton (2014).
The point nature of these individual-level data enables us to aggregate them up to any urban level that
we are interested in. Given that all the above analyses focused on functional definitions of cities, we
adopt here a morphological definition of a city or a town for the UK, which enables us to test the effect
of the internet on smaller areas irrespective of whether they are part of a wider urban agglomeration.
Therefore, we aggregate the internet speed data at the level of Built-up Areas (BUA) for England and Wales. This is a
‘bricks and mortar’ approach which refers to land which is “irreversibly urban in character” including
villages, towns or cities. Some key characteristics of these areas include: minimum size of 20 hectares;
areas with less than 200 meters between them are linked to a single built-up area; larger built-up
areas are divided into smaller sub-divisions of built-up areas (ONS 2013; see also Johnston, Poulsen,
and Forrest 2014; Johnston, Poulsen, and Forrest 2016 for other research applications of BUA). In
order to obtain information about the tail of the urban size distribution we include, wherever
available, the sub-divisions of BUA. As Table 9 illustrates, our approach results in 2,235 observations
which include built-up areas and sub-divisions of built-up areas for which we have internet speed data.
The lowest population of a built-up area in our data is just above 1,000 inhabitants in 2011. Figure 1
illustrates the built-up areas we use for the South-East of England and the Greater London Area.
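The aggregation step can be sketched as follows, assuming each speed test has already been assigned to a built-up area (for example through a point-in-polygon match, which is not shown). The record layout, the area identifiers and the function name are hypothetical, and the numbers are illustrative only.

```python
from collections import defaultdict

def aggregate_speed_tests(tests, population):
    """Average download speed and tests per inhabitant for each built-up area."""
    speeds = defaultdict(list)
    for area_id, download_mbps in tests:
        speeds[area_id].append(download_mbps)
    return {
        area_id: {
            "mean_download_mbps": sum(values) / len(values),
            "tests_per_inhabitant": len(values) / population[area_id],
        }
        for area_id, values in speeds.items()
    }

# Illustrative records: (built-up area identifier, download speed in Mbps).
tests = [("A", 10.0), ("A", 20.0), ("B", 5.0)]
population = {"A": 1000, "B": 500}
by_area = aggregate_speed_tests(tests, population)
```

Areas without any recorded test do not appear in the output, which mirrors the restriction of the sample to built-up areas for which speed data are available.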
Insert Table 9
Insert Figure 1
In terms of our identification strategy, we follow the same approach as we did for the US case. Table
10 presents the OLS and GLM estimation of eq. (6), while Table 11 presents the 2SLS estimation.
Starting from Table 10, we see a significant and positive effect of the average download speed on the
relative position of a BUA in the urban hierarchy in England and Wales. This is consistent with the
previous results from the US and also with the global model. Of course, the same endogeneity issues
might be present here too and we address them below. In terms of control variables, we include
a measure of broadband tests per inhabitant in order to control for potential differences in the take-up
of this service. In addition, we include a number of socio-economic variables that we believe can
affect the relative position of a city in the urban hierarchy. The unemployment rate has a negative and
significant effect, as in the US case. The same applies to the percentage of British population, which
indicates the importance of migration in urban growth. Population density and the percentage of
people working from home also negatively affected the relative position of BUA during 2011-2014.
Finally, Table 10 includes two interaction terms (columns 2 and 3) between download speed and
population and population density. The negative and significant sign of both interaction coefficients
indicates that the effect of download speed decreases as the size of the BUA or its density increases.
Again, in accordance with the previous results, this indicates that the effect of digital connectivity might
be larger for smaller BUA.
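The reading of the interaction terms can be made explicit with a stylized version of the specification. The notation below is hypothetical and is not the paper's eq. (6): $y_i$ stands for the normalized change in rank of BUA $i$, $s_i$ for its average download speed and $p_i$ for its population (or population density), with the remaining controls omitted.

```latex
% Stylized specification with one interaction term (hypothetical notation)
y_i = \beta_0 + \beta_1 s_i + \beta_2 p_i + \beta_3 (s_i \times p_i) + \varepsilon_i
\quad\Longrightarrow\quad
\frac{\partial y_i}{\partial s_i} = \beta_1 + \beta_3 p_i
% With \beta_1 > 0 and \beta_3 < 0, the marginal effect of speed declines as p_i grows.
```

Under this sketch, a positive coefficient on speed combined with a negative interaction coefficient implies a marginal effect of speed that is largest for the smallest or least dense areas.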
Insert Table 10
With regard to the IV strategy, we use here the number of universities as the main instrument. Universities
in England and Wales were established a long time ago and therefore we do not expect them to
affect the change in the relative position of built-up areas in 2014. The correlation coefficient with the
LHS variables is smaller than 0.05. The addition of the second instrument (absolute number of
broadband tests in 2011) in column 2 allows a Sargan test, the null hypothesis of which cannot be
rejected. Moreover, the estimations reported in Table 11 do not suffer from weak instruments, as the
relevant test statistic is above the rule-of-thumb value of 10. The coefficients derived from OLS (Table 10) and
2SLS (Table 11) are of the same magnitude and always positive and significant, which
supports the role of broadband speed in enhancing agglomeration forces. A back-of-the-envelope
calculation indicates that, everything else being equal, a 1 per cent increase in download
speed for a BUA whose relative position did not change during the period 2011-2014 would
have improved its position by 0.3 places. In absolute numbers, a 1 per cent increase for the average BUA
would mean a marginal change from 13.8 Mbps to 13.9 Mbps.
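The absolute change quoted above follows directly from the average download speed; the arithmetic, rounded to one decimal place, is:

```latex
13.8\ \text{Mbps} \times (1 + 0.01) = 13.938\ \text{Mbps} \approx 13.9\ \text{Mbps}
```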
Insert Table 11
In total, the results of the UK case study are in accordance with the previous results of the global,
multi-country analysis and of the US urban system. Interestingly, the positive effect of the internet is still
apparent when the analysis adopts a morphological definition of cities and when the emphasis is not
on internet penetration, but instead on the quality of internet infrastructure as reflected in broadband
download speed. What is also interesting for the UK case study is that this effect appears to be larger
for small and less dense BUA.
6. Discussion and Conclusions
This paper empirically approaches a research question which lies at the core of urban economics
and economic geography. Specifically, this paper employs secondary data from a variety of sources in
order to econometrically test whether the generalized impact of the internet and other digital
communication technologies has offset the benefits derived from agglomeration economies and
resulted in more dispersed spatial population structures, or whether it further enhanced such urban
externalities and led to more concentrated spatial structures. Previous studies have produced
opposing results in terms of the complementary or supplementary relationship between digital
communications, including the internet, and agglomeration externalities. As Leamer and Storper (2001)
indicated, the internet can affect both centripetal and centrifugal forces. Our present study revisits
Ioannides et al. (2008) with a completely open mind in view of the availability of several years of
additional data on internet penetration across the world. This leads to a nuanced update of their
results, as we document in the Appendix. It also in effect reaffirms the results of Gaspar and Glaeser
(1998). In general, quite a few of these studies were based on assumptions about technological
capabilities which do not necessarily hold today, or they used data from an era in which the
maturity of communication technologies was nowhere close to the current level.
Our analysis approaches this question from a multi-scalar perspective in order to address, to the
extent possible, the potential heterogeneity of such effects. Namely, we estimate a global,
multi-country model with two alternative data sets, one with a full range of city sizes and a second with data
on urban agglomerations with more than 300,000 inhabitants, and test the effect of internet and digital
communications adoption on Zipf coefficients as a measure of size dispersion for heavy-tailed data.
Although the first approach did not yield estimates that are robust to endogeneity, the second
approach revealed rather robust and consistent results. Then, in order to test such effects for smaller
cities which were not included in our global data, we focus on the US and the UK. Specifically, we
tested the effect of internet usage and internet speed on the position of a micropolitan/metropolitan
area in the US, or a Built-up Area in the UK. The different results of our analysis, which appear to be
robust to endogeneity issues, point towards a complementary relation between the internet,
mobile and fixed telephony and agglomeration externalities. While the global model indicated that
increased adoption rates of such technologies have resulted in less uniform urban systems, the two case
studies revealed that internet adoption and internet speed improved the relative position of a city
within its urban system. Interestingly, the latter results indicated that such effects might be even
larger for smaller and less dense urban areas.
We believe that our results, apart from their theoretical value, have the capacity to inform urban policy.
The ability of the internet and digital communications to further enhance agglomeration economies
can be used as a tool to support urban growth. In addition, the indication that such effects might be
stronger for smaller and less dense urban areas, at least in England and Wales, might be helpful to
further orient digital strategies towards such locations. Of course, improving internet speed is
not a trivial policy instrument, as it involves numerous complexities. Apart from infrastructure
installation costs and engineering challenges, governance issues with regard to the ownership of such
digital networks as well as the provision of state subsidies create obstacles for the inclusion of such
strategies in the urban growth agenda. To further inform such policies, more research at granular
scales is needed in order to shed light on the micro-mechanisms behind such urban processes.
Acknowledgements
We are grateful to many audience members at the 2016 Annual Winter Seminar of the German
speaking section of the European Regional Science Association and the North American Regional
Science Council 2016 meetings. We also thank Dr Max Nathan for very helpful comments, Camille Ryan,
US Bureau of the Census, for very useful suggestions about the American Community Survey data, and
Janusz Jezowicz from http://www.broadbandspeedchecker.co.uk for providing the internet speed
data. All errors are ours.
References
Anenberg, E., E. Kung. 2015. Information technology and product variety in the city: The case of food trucks. Journal of Urban Economics, 90, 60-78.
Batty, M., 2006. Rank clocks. Nature, 444, 592-596.
Bekkerman, A., G. Gilpin. 2013. High-speed Internet growth and the demand for locally accessible information content. Journal of Urban Economics, 77, 1-10.
Bicego, G. T., J. T. Boerma. 1993. Maternal education and child survival: a comparative study of survey data from 17 countries. Social science & medicine, 36, 1207-1227.
Black, D., V. Henderson. 2003. Urban evolution in the USA. Journal of Economic Geography, 3, 343-372.
Brinkhoff, T. 2014. City Population 2014 [cited 9 December 2014]. Available from http://www.citypopulation.de/.
Cairncross, F., 2001. The death of distance 2.0. Texere Publishing Limited. London.
Chen, S., M. Ravallion. 2007. Absolute poverty measures for the developing world, 1981–2004. Proceedings of the National Academy of Sciences, 104, 16757-16762.
Cohen, B., 2004. Urban growth in developing countries: a review of current trends and a caution regarding existing forecasts. World Development, 32, 23-51.
Craig, S. G., E. C. Hoang, J. E. Kohlhase. 2016. Does closeness in virtual space complement urban space? Socio-Economic Planning Sciences.
Decker, E. H., A. J. Kerkhoff, M. E. Moses. 2007. Global patterns of city size distributions and their fundamental drivers. PLOS ONE, 2, e934.
Donner, J., 2008. Research approaches to mobile use in the developing world: A review of the literature. The information society, 24, 140-159.
Drucker, P. F. 1998. From Capitalism to Knowledge Society. In The knowledge economy, ed. D. Neef, 15-34. Butterworth-Heinemann. Woburn, MA.
Duranton, G., D. Puga. 2004. Micro-foundations of urban agglomeration economies. Handbook of regional and urban economics, 4, 2063-2117.
File, T., C. Ryan. 2014. Computer and Internet use in the United States: 2013. American Community Survey Reports.
Forman, C., A. Goldfarb, S. Greenstein. 2005. How did location affect adoption of the commercial Internet? Global village vs. urban leadership. Journal of Urban Economics, 58, 389-420.
Freedom House. 2014. Freedom in the World 2014 [cited 9 December 2014]. Available from https://www.freedomhouse.org/report-types/freedom-world#.VIa85DGsURo.
Frenken, K., R. A. Boschma. 2007. A theoretical framework for evolutionary economic geography: industrial dynamics and urban growth as a branching process. Journal of Economic Geography, 7, 635-649.
Friedman, T. L., 2005. The world is flat. Farrar, Straus and Giroux. New York.
Fujita, M., P. Krugman, A. J. Venables. 1999. The spatial economy. MIT Press. Cambridge.
Gabaix, X., R. Ibragimov. 2011. Rank − 1/2: a simple way to improve the OLS estimation of tail exponents. Journal of Business & Economic Statistics, 29, 24-39.
Gabaix, X., Y. M. Ioannides. 2004. The evolution of city size distributions. Handbook of regional and urban economics, 4, 2341-2378.
Gaspar, J., E. L. Glaeser. 1998. Information Technology and the Future of Cities. Journal of Urban Economics, 43, 136-156.
Giesen, K., J. Südekum. 2011. Zipf's law for cities in the regions and the country. Journal of Economic Geography, 11, 667-686.
Gilder, G. 1995. Forbes ASAP, February 27:56.
Hamilton, J., 2003. Are main lines and mobile phones substitutes or complements? Evidence from Africa. Telecommunications Policy, 27, 109-133.
Havlin, S., 1995. The distance between Zipf plots. Physica A: Statistical Mechanics and its Applications, 216, 148-150.
Hildrum, J. M., 2009. Sharing Tacit Knowledge Online: A Case Study of e-Learning in Cisco's Network of System Integrator Partner Firms. Industry and Innovation, 16, 197-218.
Hobcraft, J. N., J. W. McDonald, S. O. Rutstein. 1984. Socio-economic factors in infant and child mortality: a cross-national comparison. Population studies, 38, 193-223.
Internet Live Stats. Internet Users 2017 [cited 15/2/2017]. Available from http://www.internetlivestats.com/internet-users/.
Ioannides, Y. M., H. G. Overman. 2003. Zipf’s law for cities: an empirical examination. Regional Science and Urban Economics, 33, 127-137.
Ioannides, Y. M., H. G. Overman, E. Rossi-Hansberg, K. Schmidheiny. 2008. The effect of information and communication technologies on urban structure. Economic Policy, 23, 201-242.
Ioannides, Y. M., J. Zhang. 2017. Walled cities in late imperial China. Journal of Urban Economics, 97, 71-88.
Johnston, R., M. Poulsen, J. Forrest. 2014. The changing ethnic composition of urban neighbourhoods in England and Wales, 2001-2011: creating nations of strangers? Geography, 99, 67.
Johnston, R., M. Poulsen, J. Forrest. 2016. Ethnic Residential Patterns in Urban England and Wales, 2001–2011: A System-Wide Analysis. Tijdschrift voor Economische en Sociale Geografie, 107, 1-15.
Kende, M., 2003. The digital handshake: connecting Internet backbones. CommLaw Conspectus, 11, 45.
Kim, Y., P. F. Orazem. 2016. Broadband Internet and New Firm Location Decisions in Rural Areas. American Journal of Agricultural Economics, 99, 285-302.
Kolko, J. 2000. The death of cities? The death of distance? Evidence from the geography of commercial Internet usage. In The internet upheaval: Raising questions, seeking answers in communications policy, eds. I. Vogelsang and B. M. Compaine, 73-98. The MIT Press. Cambridge, MA and London, UK.
Leamer, E. E., M. Storper. 2001. The Economic Geography of the Internet Age. Journal of International Business Studies, 32, 641-65.
McCann, P., 2008. Globalization and economic geography: the world is curved, not flat. Cambridge Journal of Regions, Economy and Society, 1, 351-370.
Mitchell, W. J., 1995. City of bits: space, place and the infobahn. MIT Press. Cambridge, MA.
Montgomery, M. R., 2008. The urban transformation of the developing world. Science, 319, 761-764.
Nitsch, V., 2005. Zipf zipped. Journal of Urban Economics, 57, 86-100.
O'Brien, R., 1992. Global Financial Integration: The End of Geography. Pinter. London.
Ohmae, K., 1995. The Borderless World: Power and Strategy in an Interdependent Economy. Harper Business. New York.
ONS. 2013. 2011 Built-up Areas - Methodology and Guidance, 1-15: Office for National Statistics.
Panahi, S., J. Watson, H. Partridge. 2013. Towards tacit knowledge sharing over social web tools. Journal of Knowledge Management, 17, 379-397.
Partridge, M. D., D. S. Rickman, K. Ali, M. R. Olfert. 2008. Employment Growth in the American Urban Hierarchy: Long Live Distance. The B.E. Journal of Macroeconomics, 8, Article 10.
Pew Internet. 2016. Pew Internet & American Life Project: Pew Internet.
Pons-Novell, J., E. Viladecans-Marsal. 2006. Cities and the Internet: The end of distance? Journal of Urban Technology, 13, 109-132.
Rauch, F., 2013. Cities as spatial clusters. Journal of Economic Geography, lbt034.
Riddlesden, D., A. D. Singleton. 2014. Broadband speed equity: A new digital divide? Applied Geography, 52,25-33.
Rietveld, P., R. W. Vickerman. 2004. Transport in regional science: The “death of distance” is premature. Papers in Regional Science, 83,229-248.
Rosenthal, S. S., W. C. Strange. 2004. Evidence on the nature and sources of agglomeration economies. Handbook of regional and urban economics, 4,2119-2171.
Sinai, T., J. Waldfogel. 2004. Geography and the Internet: is the Internet a substitute or a complement for cities? Journal of Urban Economics, 56,1-24.
Sohn, J., T. J. Kim, G. J. D. Hewings. 2003. Information technology and urban spatial structure: A comparative analysis of the Chicago and Seoul regions. Annals of Regional Science, 37,447-462.
Soo, K. T., 2005. Zipf's Law for cities: a cross-country investigation. Regional Science and Urban Economics, 35,239-263.
Stock, J. H., M. Yogo. 2005. Testing for weak instruments in linear IV regression. Identification and inference for econometric models: Essays in honor of Thomas Rothenberg.
Storper, M., A. J. Venables. 2004. Buzz: face-to-face contact and the urban economy. Journal of Economic Geography, 4,351-370.
Toffler, A., 1980. The Third Wave. William Morrow. New York.
United Nations. 2014. World Urbanization Prospects: The 2014 Revision. In CD-ROM Edition: Department of Economic and Social Affairs, Population Division, UN.
Van Dijck, J., 2013. ‘You have one identity’: performing the self on Facebook and LinkedIn. Media, Culture & Society, 35,199-215.
Appendix
This Appendix replicates the estimation of eq. (5) presented in Section 4, using as the LHS variable
the Zipf coefficient based on Thomas Brinkhoff’s City Population data (Brinkhoff 2014). This was the
starting point of our analysis, chosen in order to compare our results with those of Ioannides et al.
(2008). However, because we did not manage to address the potential endogeneity issue within the
expanded data set, and because of the potential inconsistencies associated with the city definition in
the City Population data, we decided to also use the UN urban agglomerations data for the
multi-country analysis.
Here we present the results based on the City Population data for comparison with Ioannides et al.
(2008). It should be highlighted that these results are not directly comparable with the results
presented in Section 4 because of the differences in urban definitions: while the UN data include
population for urban agglomerations, the City Population data mostly include population for urban
jurisdictions. In addition, the latter data source does not have a minimum threshold, while the UN
data only include agglomerations with a population of at least 300,000 people. In terms of the size of
the data, the City Population data include more countries, as indicated in Table A1, but there are only
a handful of observations for each country over the study period. The temporal sparseness of the data
reflects the way the City Population data were constructed, as they are mostly based on national
censuses. In total, we have estimated the Zipf coefficient for 82 countries, compared with 38 based on
the UN urban agglomerations data. However, the fixed effects (FE) estimations presented in Table A3
are based on only 187 observations, while the same estimations based on the UN urban agglomerations
data are based on 464 observations (Table 4). This difference is indicative of the sparseness of the
City Population data.
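The Zipf coefficients used as the LHS variable are described in the table notes as corrected with the Gabaix and Ibragimov (2011) transformation. A minimal sketch of such a rank-minus-one-half regression follows. It assumes, which the Appendix does not state, that log(rank - 1/2) is regressed on log(population) by OLS, so that the slope is negative as in Tables 1 and A1. The function name zipf_coefficient and the toy city sizes are hypothetical.

```python
import math


def zipf_coefficient(populations):
    """Return (slope, std_err) from an OLS fit of log(rank - 1/2) on log(size).

    Assumption: this mirrors the rank - 1/2 correction attributed to
    Gabaix and Ibragimov (2011); the asymptotic standard error used here
    is sqrt(2 / n) * |slope|.
    """
    sizes = sorted((p for p in populations if p > 0), reverse=True)
    n = len(sizes)
    if n < 3:
        raise ValueError("need at least three cities")
    x = [math.log(s) for s in sizes]
    # Rank 1 is the largest city; shift every rank by one half.
    y = [math.log(rank - 0.5) for rank in range(1, n + 1)]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    std_err = math.sqrt(2.0 / n) * abs(slope)
    return slope, std_err


# Toy sample built so that size * (rank - 1/2) is constant, which makes
# the fitted slope equal to -1 up to floating-point error.
toy_cities = [1_000_000 / (rank - 0.5) for rank in range(1, 51)]
toy_slope, toy_se = zipf_coefficient(toy_cities)
```

The standard error returned here is the kind of quantity that could feed the inverse squared standard error weights mentioned in the table notes, although the exact weighting procedure used in the paper is not reproduced in this sketch.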
Insert Table A1
Insert Table A2
Insert Table A3
Just like the estimations presented in Table 4, the FE estimation presented in Table A3 might suffer
from endogeneity bias (see discussion in Section 4). However, it is still interesting that the
coefficients for the four variables of interest are negative, contrary to the positive coefficients
reported in Table 4. The negative coefficients are in accordance with the findings of Ioannides et al.
(2008). As before, we turn our emphasis to the 2SLS estimations presented in Table A4. We use the same
identification strategy as in the previous regressions and introduce two IVs: political rights and the
under-5 mortality rate (per 1,000). The former is based on an index provided by Freedom House (2014),
which takes values from 1 to 7, with higher values representing worse political rights. We know that
the status of political rights can affect Internet adoption and usage through control of the digital
economy and censorship of online communications. However, we have no theoretical reasons to believe
that political rights (or the lack of them) can directly affect spatial structure. People can equally
move towards urban centers to enjoy anonymity or to rural areas where state control mechanisms
might be weaker. This argument is also reflected in the very low correlation coefficient between
political rights and the Zipf coefficient (0.19). With regard to the mortality rate, this instrument is
used for the regressions with the telephony-related endogenous variables and its validity was
discussed in Section 4.
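The identification strategy described above can be illustrated with a minimal two-stage least squares sketch. It assumes the simplest possible setting: one endogenous regressor, one instrument, and no controls, country fixed effects or weights, so it is not the specification estimated in Table A4. All names and the toy data are hypothetical; the data are constructed so that the instrument z is uncorrelated with the confounder u.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def two_stage_least_squares(y, x, z):
    """2SLS slope with one endogenous regressor x and one instrument z."""
    # First stage: project the endogenous regressor on the instrument.
    a, b = ols_fit(z, x)
    x_hat = [a + b * zi for zi in z]
    # Second stage: regress the outcome on the fitted values.
    _, beta = ols_fit(x_hat, y)
    return beta


# Hypothetical toy data: true effect of x on y is 2, u is an unobserved
# confounder that enters both x and y, and z is orthogonal to u.
z = [1.0, -1.0, 1.0, -1.0]
u = [1.0, 1.0, -1.0, -1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

_, beta_ols = ols_fit(x, y)                  # biased by the confounder
beta_2sls = two_stage_least_squares(y, x, z)  # recovers the true effect
```

In this constructed example the OLS slope is pulled away from the true value by the confounder, while the instrumented estimate is not. That contrast is the motivation for moving from the FE to the 2SLS estimations.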
Insert Table A4
Table A4 reports the 2SLS estimations. In addition to using the whole sample of countries (columns 1,
4, 7, 10), we have also split our sample into EU and NAFTA countries (columns 2, 5, 8, 11) and non-EU
and non-NAFTA countries (columns 3, 6, 9, 12). We did this to further investigate whether the spatial
effects of technology adoption vary between groups of countries with different levels of development.
The main issue with these results concerns the strength of our proposed instruments: they appear to be
weak for most of the regressions presented in Table A4, as indicated by the weak identification test.
A number of other instruments were also tested, including all the instruments used in Table 5 for the
2SLS regressions based on the UN urban agglomerations data as well as instruments related to
freedom of press, religion and corruption. Despite having reasons to believe that these variables can
be valid instruments, none of them appeared to be strong. Therefore, we decided to use the UN urban
agglomerations data for the Zipf estimation, which also appear to be more consistent with regard to
the city definition. Moreover, a functional understanding of cities might be more appropriate for an
international data set in order to identify spatial effects from digital technology adoption. As
indicated in the main body of the paper, we identify valid and strong instruments for the regressions
with the Zipf coefficient when we use the UN urban agglomerations data. Returning to the estimates
reported in Table A4, our IVs are strong only for the mobile phone usage endogenous variable.
Interestingly, the coefficient for the EU and NAFTA countries is negative and highly significant, in
accordance with the FE results and the results reported by Ioannides et al. (2008).
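Instrument strength of the kind discussed above is commonly judged from the first-stage regression. The Appendix does not say which weak identification statistic is reported in Table A4, so the sketch below assumes the simplest case: the first-stage F statistic for a single instrument with an intercept and no controls, compared against the conventional rule-of-thumb value of 10. The function name, the threshold constant and the toy data are hypothetical.

```python
def first_stage_f(x, z):
    """F statistic for the instrument in a regression of x on z plus intercept."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = szx / szz
    intercept = x_bar - slope * z_bar
    # Residual and total sums of squares of the first stage.
    ssr = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    tss = sum((xi - x_bar) ** 2 for xi in x)
    if ssr == 0:
        raise ValueError("perfect first-stage fit; F is unbounded")
    # One restriction (the instrument), n - 2 residual degrees of freedom.
    return (tss - ssr) / (ssr / (n - 2))


RULE_OF_THUMB = 10.0  # assumed conventional threshold, not taken from the paper

# Hypothetical toy data: the instrument explains half of the variation in x,
# but with only four observations the F statistic stays small.
z_toy = [1.0, -1.0, 1.0, -1.0]
x_toy = [2.0, 0.0, 0.0, -2.0]
f_toy = first_stage_f(x_toy, z_toy)
looks_weak = f_toy < RULE_OF_THUMB
```

With several instruments, controls or fixed effects, the relevant statistic is a joint test on the excluded instruments rather than this single-regressor form.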
Table 1: Zipf coefficients using the UN Urban Agglomerations Data (300,000 Inhabitants or more)
Countries 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014
Algeria -1.090 -1.099 -1.108 -1.115 -1.122 -1.127 -1.131 -1.137 -1.142 -1.146 -1.148 -1.152
Australia -0.777 -0.788 -0.790 -0.796 -0.798 -0.801 -0.804 -0.806 -0.809 -0.811
Bangladesh -0.716 -0.718 -0.718 -0.719 -0.719
Belgium -1.354 -1.349 -1.344 -1.339 -1.335 -1.330 -1.325 -1.320 -1.316 -1.311 -1.306 -1.302 -1.297 -1.292 -1.288
Brazil -0.933 -0.934 -0.935 -0.936 -0.937 -0.938 -0.939 -0.940 -0.940 -0.941 -0.942 -0.942 -0.943 -0.944 -0.944
Canada -1.065 -1.061 -1.056 -1.051
Chile -0.754 -0.757 -0.760 -0.767 -0.773 -0.779 -0.786 -0.792 -0.799 -0.806 -0.812 -0.819 -0.825 -0.832 -0.839
China -1.199 -1.204 -1.209 -1.214 -1.218 -1.222 -1.225 -1.227 -1.229 -1.229 -1.230 -1.231 -1.231 -1.231
Colombia -0.979 -0.980 -0.980 -0.980 -0.980 -0.980 -0.981 -0.982 -0.983 -0.984 -0.985 -0.986 -0.986 -0.987 -0.988
Egypt, Arab Rep. -0.731 -0.730 -0.729 -0.728 -0.724 -0.723 -0.722 -0.721 -0.720 -0.719 -0.718
France -1.149 -1.148 -1.147 -1.145 -1.144 -1.143 -1.142 -1.140 -1.139 -1.137 -1.136 -1.134 -1.132 -1.131 -1.129
Germany -1.527 -1.525 -1.526 -1.527 -1.527 -1.525 -1.521 -1.517 -1.515 -1.513 -1.511 -1.509 -1.506 -1.503 -1.500
India -1.072 -1.077 -1.081 -1.085 -1.089 -1.092 -1.094 -1.096 -1.098 -1.099 -1.099 -1.099 -1.099
Indonesia -1.144 -1.152 -1.161 -1.169 -1.175 -1.181 -1.186 -1.189 -1.192 -1.195 -1.197 -1.198 -1.197 -1.195
Iran, Islamic Rep. -1.115 -1.127 -1.167 -1.172 -1.177 -1.181 -1.185 -1.189 -1.196
Italy -1.396 -1.401 -1.405 -1.409 -1.413 -1.416 -1.420 -1.424 -1.428 -1.432 -1.435 -1.439 -1.442 -1.446 -1.449
Japan -0.619 -0.619 -0.617 -0.616 -0.614 -0.612 -0.611 -0.610 -0.610 -0.609 -0.608 -0.608 -0.607 -0.606
Korea, Rep. -1.022 -1.036 -1.049 -1.062 -1.073 -1.084 -1.091 -1.095 -1.100 -1.104 -1.108 -1.112 -1.116 -1.120 -1.124
Malaysia -1.019 -1.015 -1.011 -1.008 -1.004 -1.000 -0.996 -0.993 -0.989 -0.985 -0.981 -0.977 -0.974 -0.970
Mexico -1.188 -1.190 -1.191 -1.193 -1.194 -1.195 -1.196 -1.197 -1.199 -1.200 -1.201 -1.202 -1.203 -1.203 -1.204
Morocco -1.239 -1.238 -1.236 -1.241 -1.248 -1.254 -1.259 -1.262 -1.265 -1.268 -1.270 -1.271 -1.271
Netherlands -1.388 -1.407 -1.427 -1.447 -1.468 -1.483 -1.494 -1.504 -1.514 -1.524 -1.534 -1.544 -1.553 -1.562 -1.571
Nigeria -1.249 -1.252 -1.254 -1.256 -1.258 -1.258 -1.258
Pakistan -0.900 -0.900 -0.899 -0.899 -0.898 -0.897 -0.897 -0.896 -0.895 -0.893
Peru -0.827 -0.826 -0.826 -0.825 -0.825 -0.824 -0.824 -0.823 -0.823 -0.822 -0.822
Philippines -0.982 -0.985 -0.987 -0.989 -0.992 -0.994 -0.996 -0.998 -1.001 -1.003 -1.005 -1.007 -1.008 -1.010
Poland -1.833 -1.826 -1.821 -1.816 -1.810 -1.805 -1.800 -1.797 -1.793 -1.789 -1.785 -1.781 -1.776 -1.772
Saudi Arabia -0.984 -0.992 -1.001 -1.013 -1.014 -1.014 -1.014 -1.013 -1.010
South Africa -0.791 -0.796 -0.802 -0.807 -0.812 -0.816 -0.821 -0.825 -0.829 -0.833 -0.837 -0.841 -0.844
Spain -0.921 -0.918 -0.916 -0.914 -0.912 -0.910 -0.908 -0.906 -0.904 -0.902 -0.900 -0.899 -0.897 -0.895 -0.893
Sudan -0.850 -0.853 -0.854 -0.854 -0.854 -0.854
Switzerland -1.753 -1.745 -1.737 -1.730 -1.726 -1.721 -1.717 -1.712 -1.707 -1.702 -1.697 -1.691 -1.686 -1.680 -1.674
Tanzania -0.911 -0.907 -0.903 -0.898 -0.893 -0.889 -0.883 -0.878 -0.873 -0.867
Thailand -0.729 -0.736 -0.742 -0.748 -0.753 -0.757 -0.762 -0.767 -0.771 -0.775 -0.778 -0.781
Turkey -1.066 -1.065 -1.063 -1.061 -1.059 -1.056 -1.054 -1.051 -1.048 -1.044 -1.041 -1.038 -1.036 -1.034
United Kingdom -1.186 -1.184 -1.182 -1.179 -1.177 -1.175 -1.172 -1.170 -1.167 -1.165 -1.162 -1.160 -1.157 -1.155 -1.152
United States -1.003 -1.010 -1.017 -1.023 -1.029 -1.036 -1.042 -1.048 -1.053 -1.058 -1.062 -1.066 -1.070 -1.074
Source: United Nations 2014, own estimations; corrected using the Gabaix and Ibragimov (2011) transformation
Table 2: Descriptive statistics for the global, multi-country analysis
Variable Obs Mean Std. Dev. Min Max Sources
Zipf coefficient1 462 -1.099 0.275 -1.833 -0.606 UN 2014, own estimation
Internet users per 100 hab. (log) 462 3.275 1.047 0.095 4.543 ITU 2015
Broadband users per 100 hab. (log) 462 0.912 2.321 -4.605 3.751 ITU 2015
Mobile phone users per 100 hab. (log) 462 4.183 0.753 0.191 5.270 ITU 2015
Fixed phone users per 100 hab. (log) 462 2.846 1.250 -2.303 4.314 ITU 2015
Population (log) 462 17.966 1.125 15.787 21.034 ITU 2015
Population density (log) 462 4.536 1.201 0.927 7.108 World Bank 2016
GDP per capita (log) 462 9.701 0.860 7.515 10.920 World Bank 2016
GDP growth 462 3.560 3.032 -6.609 14.195 World Bank 2016
Trade (% of GDP) 462 67.470 36.595 19.119 210.374 World Bank 2016
Government expenditure (% GDP) 462 15.645 4.636 5.039 27.495 World Bank 2016
Non agriculture value added (% GDP) 462 92.445 7.510 62.950 99.383 World Bank 2016
Mortality rate, under-5 (per 1,000)2 462 23.084 25.571 2.900 146.400 World Bank 2016
Female % of labor force2 462 38.510 9.069 13.743 50.241 World Bank 2016
Secure Internet servers (per 1M. hab.)2 447 260.895 532.009 0.015 2,820.434 World Bank 2016
Households with TV (% - 20y lag)2 462 233.593 193.745 0.370 814.492 ITU 2015
Employment in telecoms. (20y lag)2 454 117,036.800 185,174.700 2,100.000 1,077,300.000 ITU 2015
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011) transformation
2 This variable will be used as an Instrumental Variable for the estimation of (5) based on 2SLS
Table 3: Correlations between digital technology variables
Internet Broadband Mobile Fixed
Internet users per 100 hab. (log) 1.000
Broadband users per 100 hab. (log) 0.872 1.000
Mobile phone users per 100 hab. (log) 0.759 0.739 1.000
Fixed phone users per 100 hab. (log) 0.725 0.714 0.445 1.000
Table 4: OLS estimation of (5)
(1) (2) (3) (4)
Variables Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1
Internet users per 100 hab. (log) 0.0096***
(0.0022)
Broadband users per 100 hab. (log) 0.0013*
(0.0007)
Mobile phone users per 100 hab. (log) -0.0023
(0.0018)
Fixed phone users per 100 hab. (log) -0.0036
(0.0028)
Population (log) 0.6192 0.6955 0.7024 0.7674
(0.6459) (0.6570) (0.6585) (0.6589)
Population density (log) -0.6556 -0.6993 -0.6636 -0.7613
(0.6441) (0.6554) (0.6580) (0.6577)
GDP per capita (log) -0.0250*** -0.0117** -0.0045 -0.0080
(0.0063) (0.0053) (0.0061) (0.0051)
GDP growth 0.0003 0.0002 0.0003 0.0003
(0.0003) (0.0003) (0.0003) (0.0003)
Trade (% of GDP) -0.0003*** -0.0004*** -0.0003*** -0.0003***
(0.0001) (0.0001) (0.0001) (0.0001)
Government expenditure (% GDP) -0.0003 -0.0005 -0.0005 -0.0006
(0.0008) (0.0008) (0.0008) (0.0008)
Non agriculture value added (% GDP) -0.0008 -0.0005 -0.0007 -0.0008
(0.0006) (0.0006) (0.0006) (0.0006)
Time trend -0.0020*** -0.0018*** -0.0017*** -0.0016***
(0.0004) (0.0004) (0.0004) (0.0004)
Country FE (within) YES YES YES YES
Constant -9.6186 -11.0161 -11.3698 -12.1351
(9.5131) (9.6750) (9.6920) (9.6991)
Observations 464 464 464 464
R-squared 0.4464 0.4266 0.4242 0.4242
Number of countries 38 38 38 38
Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1
Weighted by the inverse squared standard error of the estimated Zipf coefficient
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011)
Table 5: 2SLS estimation of (5)
(1) (2) (3) (4) (5) (6) (7) (8)
Variables Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1
Internet users per 100 hab. (log) 0.0174*** 0.0172***
(0.0061) (0.0061)
Broadband users per 100 hab. (log) 0.0133*** 0.0121***
(0.0030) (0.0028)
Mobile phone users per 100 hab. (log) 0.0388*** 0.0400***
(0.0107) (0.0087)
Fixed phone users per 100 hab. (log) 0.0139** 0.0177***
(0.0065) (0.0062)
Population (log) 0.5308 0.6976 1.3929 0.5761 0.5330 0.7354 3.1479** 0.5338
(0.6515) (0.8287) (1.0005) (0.6854) (0.6510) (0.7949) (1.2839) (0.6994)
Population density (log) -0.6063 -0.8552 -1.8316* -0.5409 -0.6076 -0.8762 -3.6027*** -0.4923
(0.6475) (0.8233) (1.0161) (0.6851) (0.6471) (0.7900) (1.2994) (0.6989)
GDP per capita (log) -0.0382*** -0.0455*** -0.0825*** -0.0119** -0.0378*** -0.0420*** -0.0867*** -0.0127**
(0.0114) (0.0105) (0.0218) (0.0055) (0.0114) (0.0098) (0.0186) (0.0056)
GDP growth 0.0002 -0.0006* 0.0000 0.0001 0.0002 -0.0006 -0.0000 0.0000
(0.0003) (0.0004) (0.0004) (0.0003) (0.0003) (0.0004) (0.0004) (0.0003)
Trade (% of GDP) -0.0003*** -0.0006*** -0.0009*** -0.0004*** -0.0003*** -0.0006*** -0.0009*** -0.0004***
(0.0001) (0.0001) (0.0002) (0.0001) (0.0001) (0.0001) (0.0002) (0.0001)
Government expenditure (% GDP) -0.0001 0.0001 -0.0011 -0.0002 -0.0001 0.0001 -0.0015 -0.0001
(0.0008) (0.0010) (0.0012) (0.0008) (0.0008) (0.0009) (0.0012) (0.0008)
Non agriculture value added (% GDP) -0.0009 0.0005 0.0002 -0.0000 -0.0009 0.0004 -0.0001 0.0002
(0.0006) (0.0008) (0.0009) (0.0007) (0.0006) (0.0007) (0.0009) (0.0007)
Time trend -0.0022*** -0.0029*** 0.0000 -0.0017*** -0.0022*** -0.0027*** 0.0003 -0.0017***
(0.0005) (0.0006) (0.0008) (0.0004) (0.0005) (0.0006) (0.0007) (0.0004)
Country FE (within) YES YES YES YES YES YES YES YES
Observations 464 449 449 464 464 449 442 464
R-squared 0.4302 0.0351 -0.4136 0.3691 0.4310 0.1114 -0.4399 0.3420
Number of countries 38 38 38 38 38 38 38 38
Sargan 1.785 2.438 0.184 2.418
Chi-sq(3) P-val 0.182 0.118 0.668 0.120
Weak identification 65.55 46.44 26.80 99.94 32.72 25.37 20.74 60.24
Standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1
Weighted by the inverse squared standard error of the estimated Zipf coefficient
IVs: (1) Mortality rate, under-5 (per 1,000); (2) & (3) Secure Internet servers (per 1M. Hab.); (4) Female % of labour force
IVs: (5), (6) and (8) as (1), (2) and (4) respectively and also Households with TVs (% - 20 y. lag); IVs: (7) as (3) and also Employment in telecoms. (20 y. lag).
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011)
Table 6: Descriptive statistics for the US case study
Variables Obs Mean Std. Dev. Min Max Sources
Normalized difference in ranks, 2013-15 428 0.5 0.004 0.48 0.522 ACS 2016
Unemployment, 2013 428 8.503 2.56 2.3 19.9 ACS 2016
Income, 2013 428 48,637 8,969 24,945 91,533 ACS 2016
Population, 2013 428 605,250 1,416,024 62,282 19,949,502 ACS 2016
Employment in service, 2013 (%) 428 0.793 0.062 0.49 0.922 ACS 2016
Commute, minutes, 2013 428 22.368 3.308 15.3 37.9 ACS 2016
Population >= 25 with Bachelor's degree, 2005 (%) 428 15.218 4.634 6.6 33.9 ACS 2016
Commute, minutes, 2005 428 21.704 3.466 14 40.7 ACS 2016
Population with computer and broadband connection (%) 428 0.792 0.068 0.434 0.92 ACS 2016
White population, 2013 (%) 428 0.803 0.129 0.173 0.963 ACS 2016
Population density, 2013 428 51.551 53.167 0.948 405.757 ACS 2016
Table 7: OLS and GLM estimation of (6) for the US
Variables Normalized difference in ranks 2013-15
OLS glm: quasibinomial link = logit
(1) (2) (3) (4)
Population with computer and broadband connection, 2015 (%) 0.011** 0.015** 0.043 0.043**
(0.005) (0.006) (0.045) (0.019)
Population, 2013 (log) -0.0002 -0.0003 0.002 -0.001
(0.0002) (0.0002) (0.003) (0.001)
Unemployment, 2013 (%) -0.0003*** -0.0003*** -0.0003*** -0.001***
(0.0001) (0.0001) (0.0001) (0.0004)
White population, 2013 (%) -0.002 -0.002 -0.002 -0.007
(0.002) (0.002) (0.002) (0.008)
Income, 2013 (log) -0.001 -0.00004 -0.0005 -0.002
(0.002) (0.002) (0.002) (0.008)
Population density, 2013 0.0000 0.0001* 0.0000 -0.00002
(0.0000) (0.0001) (0.0000) (0.00001)
Employment in service, 2013 (%) 0.007** 0.007** 0.007* 0.027**
(0.003) (0.003) (0.003) (0.014)
Commute, minutes, 2013 -0.00001 0.0000 0.0000 -0.0001
(0.0001) (0.0001) (0.0001) (0.0003)
Pop. with comp. and broadband con. (%) x pop. density, 2013 -0.0001*
(0.0001)
Pop. with comp. and broadband con. (%) x population, 2013 -0.003
(0.004)
Constant 0.499*** 0.491*** 0.472*** -0.006
(0.019) (0.020) (0.042) (0.075)
Observations 428 428 428 428
Adjusted R2 0.052 0.056 0.051
Rob. standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1
Table 8: 2SLS estimation of (6) for the US
Variables Normalized difference in ranks, 2013-15
(1) (2)
Population with computer and broadband connection, 2015 (%) 0.054** 0.053**
(0.026) (0.026)
Population, 2013 (log) -0.0003 -0.0003
(0.0002) (0.0002)
Unemployment, 2013 (%) -0.0005*** -0.0005***
(0.0002) (0.0002)
White population, 2013 (%) -0.008* -0.008*
(0.005) (0.005)
Income, 2013 (log) -0.013* -0.013*
(0.007) (0.007)
Population density, 2013 -0.00001 -0.00001
(0.0000) (0.0000)
Employment in service, 2013 (%) 0.0001 0.0001
(0.005) (0.005)
Commute, minutes, 2013 0.0001 0.0001
(0.0001) (0.0001)
Constant 0.611*** 0.610***
(0.066) (0.066)
Weak identification 38.36 19.34
Sargan 0.6
Chi-sq(3) P-val 0.44
Observations 428 428
Adjusted R2 -0.166 -0.164
Robust Std. Error in parentheses; ***p<0.01; **p<0.05; *p<0.1
IVs: (1) Bachelors degree per hab. in 2005
IVs: (2) Bachelors degree per hab. in 2005, Commute, minutes, 2005
Table 9: Descriptive statistics for the UK case study
Variables Obs Mean Std. Dev. Min Max Sources
Normalized difference in ranks, 2011-14 2,235 0.497 0.008 0.428 0.624 ONS 2016
Download speed, 2014 (kbps) 2,235 13,806 6,670 603 62,105 broadbandspeedchecker.co.uk
Broadband test, 2014 2,235 334 767 31 11,865 broadbandspeedchecker.co.uk
Population, 2011 2,235 23,484 56,083 1,029 1,106,968 ONS 2016
Population, 2014 2,235 24,031 57,760 1,027 1,136,229 ONS 2016
Broadband test per capita, 2014 2,235 0.017 0.016 0.003 0.48 ONS 2016
% of unemployment, 2011 2,235 0.057 0.025 0 0.199 ONS 2016
% of British population, 2011 2,235 0.924 0.088 0.167 0.994 ONS 2016
Urban areas (dummy) 2,235 0.478 0.5 0 1 ONS 2016
Population density, 2011 2,235 17.372 17.632 0.1 170 ONS 2016
% of people working from home, 2011 2,235 0.122 0.046 0.046 0.336 ONS 2016
Employment in service, 2011 (%) 2,235 0.786 0.051 0.562 0.966 ONS 2016
% of people with a university degree, 2001 2,235 0.193 0.081 0.034 0.606 ONS 2016
Table 10: OLS and GLM estimation of (6) for the UK
Variables Normalized difference in ranks 2011-14
OLS glm: quasibinomial link = logit
(1) (2) (3) (4)
Download speed, 2014 (log) 0.001** 0.002*** 0.007*** 0.004**
(0.0004) (0.001) (0.003) (0.002)
Population, 2011 (log) 0.001*** 0.001*** 0.006***
(0.0003) (0.0003) (0.001)
Population, 2014 (log) 0.009***
(0.003)
Broadband tests per capita, 2014 -0.028 -0.03 -0.023 -0.112
(0.024) (0.023) (0.023) (0.096)
% of unemployment, 2011 -0.042*** -0.042*** -0.044*** -0.166***
(0.01) (0.01) (0.01) (0.039)
% of British population, 2011 -0.009*** -0.009*** -0.008*** -0.036***
(0.002) (0.002) (0.002) (0.01)
Urban (dummy) 0.0004 0.0004 -0.0001 0.001
(0.0005) (0.0005) (0.0005) (0.002)
Population density, 2011 -0.00004*** 0.001*** -0.0001*** -0.0002***
(0.00001) (0.0002) (0.00001) (0.0001)
% of people working from home, 2011 -0.031*** -0.030*** -0.028*** -0.122***
(0.008) (0.008) (0.008) (0.034)
Employment in service, 2011 (%) 0.001 0.001 0.001 0.005
(0.004) (0.004) (0.004) (0.017)
Download speed, 2014 (log) x pop. density, 2011 -0.0001***
(0.00002)
Download speed, 2014 (log) x population, 2014 (log) -0.001***
(0.0003)
Constant 0.490*** 0.484*** 0.425*** -0.038
(0.006) (0.007) (0.024) (0.025)
Observations 2,235 2,235 2,235 2,235
Adjusted R2 0.09 0.092 0.101
Rob. standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1
Table 11: 2SLS estimation of (6) for the UK
Variables Normalized difference in ranks, 2011-14
(1) (2)
Download speed, 2014 (log) 0.007** 0.016***
(0.003) (0.004)
Population, 2011 (log) 0.001*** 0.0002
(0.0003) (0.0003)
Broadband tests per capita, 2014 -0.026 -0.03
(0.027) (0.025)
% of unemployment, 2011 -0.035*** -0.025*
(0.012) (0.015)
% of British population, 2011 -0.009*** -0.008***
(0.003) (0.003)
Urban (dummy) 0.001 0.001
(0.0005) (0.001)
Population density, 2011 -0.00004*** -0.00004**
(0.00001) (0.00002)
% of people working from home, 2011 -0.024** -0.015
(0.01) (0.013)
Employment in service, 2011 (%) -0.002 -0.007
(0.005) (0.006)
Constant 0.440*** 0.364***
(0.03) (0.037)
Weak identification 13.89 15.25
Sargan 1.76
Chi-sq(3) P-val 0.18
Observations 2,235 2,235
Adjusted R2 -0.054 -0.799
Robust Std. Error in parentheses; ***p<0.01; **p<0.05; *p<0.1
IVs: (1) Number of universities
IVs: (2) Number of universities, Number of broadband tests, 2011
Table A1: Zipf coefficients using the city population data (Brinkhoff 2014)
Countries 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
Albania -0.981 -0.891
Algeria -1.154 -1.261
Argentina -0.950
Australia -0.778 -0.740 -0.759 -0.773
Austria -1.348 -1.352
Azerbaijan -0.933 -1.005
Bangladesh -1.153 -1.168
Belarus -0.879 -0.863
Belgium -1.740 -1.727
Bolivia -0.722 -0.764
Botswana -0.891 -0.999
Brazil -1.204 -1.249
Bulgaria -1.142 -1.103 -1.097
Burkina Faso -1.034 -1.012
Cambodia -0.847 -0.856
Canada -0.819
Chile -0.953 -0.968
China -1.071 -1.065
Colombia -0.907 -0.909
Croatia -0.879 -0.889
Cyprus -1.154
Czech Republic -1.272 -1.245
Denmark -1.004 -1.024 -1.037
Dominican Republic -0.856 -0.910
Ecuador -0.862 -0.897
Egypt, Arab Rep. -0.762 -1.001
Estonia -0.744 -0.721
Finland -0.970 -0.963 -0.960
France -1.526 -1.529 -1.525
Georgia -0.849 -0.830 -0.824
Germany -1.269 -1.287 -1.272 -1.265
Ghana -1.093 -0.966
Greece -0.981 -1.023
Honduras -0.810 -0.967
Hungary -1.227 -1.234
Iceland -0.812 -0.806 -0.781
India -0.992 -1.007
Indonesia -1.041 -1.258
Iran, Islamic Rep. -1.001 -1.012
Ireland -0.908 -0.942 -0.929 -0.935
Italy -1.383 -1.420 -1.421
Japan -0.975 -1.180 -1.197
Kazakhstan -0.963 -0.964
Kenya -0.779 -0.901
Korea, Rep. -0.933 -1.000 -0.939 -0.930
Kyrgyz Republic -0.929 -0.912
Latvia -0.737 -0.726
Lithuania -0.880 -0.891
Malawi -0.790 -0.769
Mali -0.953 -0.987
Mauritius -1.069 -1.114
Mexico -0.886 -0.759 -0.850 -0.915
Mozambique -0.746 -0.788
Namibia -0.910 -0.759
Nepal -1.102 -1.025
Netherlands -0.270 -0.264
New Zealand -0.704 -0.730 -0.738
Nicaragua -0.964 -1.027 -1.045
Niger -0.978 -0.899
Norway -0.966 -0.975 -0.977
Panama -0.775 -0.797
Philippines -0.837 -0.807
Poland -1.175 -1.200 -1.200
Portugal -1.165 -1.186
Russian Federation -1.189 -1.203
Saudi Arabia -0.807 -0.803
Serbia -1.222
Slovak Republic -1.457 -1.464
Slovenia -1.134 -1.134
South Africa -0.904 -0.938
Spain -1.303 -1.425
Sweden -1.172 -1.162 -1.162 -1.159
Switzerland -1.501 -1.547 -1.543
Tajikistan -1.041 -1.001
Tanzania -0.995 -0.739
Turkey -1.038 -0.959 -0.956
Uganda -1.258 -1.258
United Kingdom -1.381 -1.436
United States -1.282 -1.421 -1.421
Venezuela, RB -1.076 -1.127
Vietnam -0.955 -0.954
Zambia -0.704 -0.846
Source: Brinkhoff, 2014, own estimations; corrected using the Gabaix and Ibragimov (2011) transformation
Table A2: Descriptive statistics for the global, multi-country analysis based on the City Population (Brinkhoff 2014)
Variables Obs Mean Std. Dev. Min Max Sources
Zipf coefficient1 187 -1.0233 0.231426 -1.73965 -0.26438 Brinkhoff 2014, own estimation
Internet users per 100 hab. (log) 187 2.249999 2.327144 -6.94804 4.553877 ITU 2015
Broadband users per 100 hab. (log) 140 0.614889 2.747945 -8.47845 3.686023 ITU 2015
Mobile phone users per 100 hab. (log) 164 3.788911 1.553586 -3.98164 5.24264 ITU 2015
Fixed phone users per 100 hab. (log) 187 2.648674 1.537178 -1.66073 4.291828 ITU 2015
Population (log) 187 16.63885 1.523393 12.54684 21.01422 ITU 2015
Population density (log) 187 4.033531 1.323941 0.854415 7.07327 World Bank 2016
GDP per capita (log) 187 9.132718 1.234125 6.097948 10.78581 World Bank 2016
GDP growth 187 3.904782 3.137127 -9.13 11 World Bank 2016
Trade (% of GDP) 187 80.1786 37.46099 16.7497 199.675 World Bank 2016
Government expenditure (% GDP) 187 16.36759 5.267809 4.506118 28.36146 World Bank 2016
Non agriculture value added (% GDP) 187 89.76607 10.98673 53.66336 99.32111 World Bank 2016
Political rights 187 2.513369 1.879025 1 7 Freedom House
Mortality rate, under-5 (per 1,000)2 187 32.41979 43.70039 2.4 231.7 World Bank 2016
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011) transformation
2 This variable will be used as an Instrumental Variable for the estimation of (5) based on 2SLS
Table A3: OLS estimation of (5) based on the City Population (Brinkhoff 2014)
(1) (2) (3) (4)
Variables Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1
Internet users per 100 hab. (log) -0.0025
(0.0064)
Broadband users per 100 hab. (log) -0.0152***
(0.0043)
Mobile phone users per 100 hab. (log) -0.0026
(0.0081)
Fixed phone users per 100 hab. (log) -0.0598***
(0.0162)
Population (log) 0.9516* -0.4581 0.4515 1.0077*
(0.5522) (0.4755) (0.5423) (0.5167)
Population density (log) -0.8516 0.1287 -0.3182 -0.8177*
(0.5181) (0.4175) (0.5072) (0.4849)
GDP per capita (log) 0.1158** 0.0894* 0.1425** 0.1268***
(0.0468) (0.0503) (0.0544) (0.0439)
GDP growth 0.0006 -0.0014 -0.0031 0.0010
(0.0018) (0.0016) (0.0021) (0.0017)
Trade (% of GDP) 0.0007 0.0009* 0.0009** 0.0005
(0.0005) (0.0005) (0.0005) (0.0004)
Government expenditure (% GDP) -0.0004 -0.0001 0.0040 -0.0026
(0.0032) (0.0027) (0.0032) (0.0029)
Non agriculture value added (% GDP) -0.0082* 0.0065 -0.0066 -0.0064
(0.0045) (0.0059) (0.0055) (0.0041)
Time trend -0.0049** 0.0011 -0.0070*** -0.0058***
(0.0020) (0.0025) (0.0019) (0.0017)
Country FE (within) YES YES YES YES
Constant -4.1834 2.7400 5.9573 -3.5586
(7.6674) (6.5315) (7.4348) (6.8188)
Observations 187 140 164 187
R-squared 0.1729 0.6112 0.2093 0.2748
Number of countries 82 81 82 82
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Weighted by the inverse squared standard error of the estimated Zipf coefficient
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011)
Table A4: 2SLS estimation of (5) based on the City Population (Brinkhoff 2014)
(1) (2) (3) (4) (5) (6)
Variables Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1
Internet users per 100 hab. (log) 0.0630* 0.0875*** 0.0653
(0.0335) (0.0302) (0.1838)
Broadband users per 100 hab. (log) -0.0154 -0.0188 0.0058
(0.0135) (0.0221) (0.0222)
Mobile phone users per 100 hab. (log)
Fixed phone users per 100 hab. (log)
Population (log) 0.6624 2.1864* 1.3656 -0.4618 0.4606 -1.1278**
(0.7788) (1.1240) (1.2129) (0.4867) (1.0687) (0.5534)
Population density (log) -1.0464 -3.0580*** -1.2648 0.1318 -0.7659 0.3700
(0.7247) (1.1122) (1.6595) (0.4250) (1.3388) (0.4087)
GDP per capita (log) 0.1139* -0.1624 0.2197 0.0900 0.0678 -0.1415
(0.0648) (0.0995) (0.1594) (0.0573) (0.1762) (0.0992)
GDP growth -0.0031 -0.0041 -0.0007 -0.0014 -0.0026 -0.0022
(0.0031) (0.0047) (0.0092) (0.0015) (0.0028) (0.0015)
Trade (% of GDP) 0.0002 0.0001 -0.0003 0.0009 0.0007 0.0029**
(0.0007) (0.0008) (0.0022) (0.0007) (0.0007) (0.0014)
Government expenditure (% GDP) 0.0102 0.0226*** 0.0047 -0.0002 0.0032 -0.0060
(0.0069) (0.0078) (0.0138) (0.0032) (0.0071) (0.0050)
Non agriculture value added (% GDP) -0.0229** -0.0019 -0.0237 0.0065 0.0334 0.0073
(0.0096) (0.0153) (0.0357) (0.0068) (0.0244) (0.0108)
Time trend -0.0137*** -0.0117*** -0.0255 0.0012 0.0004 0.0049
(0.0052) (0.0041) (0.0411) (0.0059) (0.0050) (0.0087)
Country FE (within) YES YES YES YES YES YES
Observations 183 80 103 103 62 41
R-squared -0.7390 0.0754 -0.8454 0.6112 0.7337 0.7338
Number of countries 78 31 47 44 25 19
Countries all EU & NAFTA non EU & NAFTA all EU & NAFTA non EU & NAFTA
Weak identification 7.156 8.883 0.265 4.821 1.564 1.068
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Weighted by the inverse squared standard error of the estimated Zipf coefficient
IVs: (1) - (6) Political rights; (7) - (12) Mortality rate, under-5 (per 1,000)
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011)
Table A4 (continued): 2SLS estimation of (5) based on the City Population (Brinkhoff 2014)
(7) (8) (9) (10) (11) (12)
Variables Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1 Zipf coef.1
Internet users per 100 hab. (log)
Broadband users per 100 hab. (log)
Mobile phone users per 100 hab. (log) 0.0137 -0.0546*** -0.0066
(0.0141) (0.0157) (0.0198)
Fixed phone users per 100 hab. (log) 0.5406 -0.0048 -0.1886
(8.3090) (0.0530) (0.3535)
Population (log) 0.3888 0.5436 0.5426 0.3336 2.0116** 1.3387
(0.5275) (0.5717) (0.6956) (9.5274) (0.8512) (0.8584)
Population density (log) -0.3912 -0.9150 -0.1203 -1.2325 -2.2402*** -0.8767
(0.4942) (0.5629) (0.6245) (6.0208) (0.8561) (0.7378)
GDP per capita (log) 0.1121** 0.0935 0.1482* 0.0155 -0.0800 0.2004**
(0.0570) (0.0670) (0.0841) (1.5483) (0.0707) (0.0865)
GDP growth -0.0032 -0.0008 -0.0032 -0.0044 0.0078*** -0.0036
(0.0020) (0.0017) (0.0026) (0.0748) (0.0019) (0.0033)
Trade (% of GDP) 0.0010** 0.0007* 0.0007 0.0018 0.0008 0.0003
(0.0004) (0.0004) (0.0006) (0.0180) (0.0006) (0.0007)
Government expenditure (% GDP) 0.0046 0.0048** 0.0051 0.0236 0.0025 -0.0073
(0.0031) (0.0022) (0.0049) (0.3628) (0.0039) (0.0153)
Non agriculture value added (% GDP) -0.0088 0.0130** -0.0104 -0.0301 0.0291*** -0.0063
(0.0055) (0.0058) (0.0072) (0.3293) (0.0092) (0.0111)
Time trend -0.0075*** -0.0012 -0.0081* -0.0001 -0.0029 -0.0079
(0.0019) (0.0018) (0.0046) (0.0782) (0.0021) (0.0072)
Country FE (within) YES YES YES YES YES YES
Observations 147 73 74 183 80 103
R-squared 0.1659 0.8269 0.2100 -10.1334 0.5153 -0.0273
Number of countries 65 30 35 78 31 47
Countries all EU & NAFTA non EU & NAFTA all EU & NAFTA non EU & NAFTA
Weak identification 33.40 99.47 9.243 0.00511 11.84 0.219
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Weighted by the inverse squared standard error of the estimated Zipf coefficient
IVs: (1) - (6) Political rights; (7) - (12) Mortality rate, under-5 (per 1,000)
1 This is the corrected Zipf coefficient following Gabaix and Ibragimov (2011)
Figure 1: Built-up areas for the South-East of England and the Greater London Area