The Effect of Urban Agglomeration on Wages:

Evidence from Samples of Siblings

Harry Krashinsky∗

University of Toronto

Abstract

The large and significant relationship between city population and wages has been well established in the agglomeration literature, yet its causal interpretation remains debated. This paper contributes new evidence to this debate by using multiple data sets of siblings in order to estimate the agglomeration premium while controlling for unobserved individual heterogeneity with a family-specific fixed effect. In the absence of this fixed effect, the agglomeration premium is large and significant. But after a familial fixed effect is included in the regression framework, the city-size wage premium becomes small in magnitude and statistically insignificant in all of the data sets used in the analysis. The results demonstrate the importance of family background for interpreting the agglomeration premium.

∗Corresponding author: Harry Krashinsky, 121 St. George Street, Centre for Industrial Relations, University of Toronto, Toronto, Ontario, Canada, M5S 2E8. Telephone: (416) 978-1744. Fax: (416) 978-5696. Email: [email protected]. I would like to thank Orley Ashenfelter and Cecilia Rouse for permitting me to access the data on identical twins. I would like to thank William Strange and Stuart Rosenthal for several helpful comments and discussions about this paper. I would also like to thank Jeremy Glazier for helpful research assistance.

1 Introduction

The effect of a city’s population on wages is highly significant and large in magnitude.

Various studies have demonstrated that doubling the population of an individual’s

city would cause wages to rise by three to seven percent, and moving from a city of fewer

than 500,000 people to one with more than 500,000 residents would increase wages by

over 20 percent. Both of these effects are at least as large as the returns to many standard

variables included in a wage regression, and perhaps because of this, researchers have questioned

the causal nature of the relationship between city size and wages. Generally, papers

which argue against the causal effects of city size on wages tend to find that non-random

selection is the reason wages are higher in cities than in non-urban areas — better workers

select into cities to obtain higher-paying jobs there. Conversely, studies which argue that

city size does have a causal effect on wages suggest that even after accounting for selection

issues, agglomeration causes better matches between workers and firms because larger cities

have “thicker” markets, or that cities contain other amenities (or disamenities) which cause

wages to be higher in urban areas.

To contribute to this debate, this paper presents new evidence from multiple data

sets of siblings in order to estimate the city-size wage premium while using an econometric

strategy that is new to the agglomeration literature: incorporating familial effects on wages.

The first data set used in the analysis is a sample of identical twins; this data is advantageous

for assessing the agglomeration premium because it is possible to contrast the wages of twins

in cities of different sizes. Such a contrast estimates the causal effect of agglomeration on

wages because it accounts for the unobserved component of ability, since the twin pairs

are genetically identical, and it also accounts for familial effects on earnings because the

twins also share the same family background. The evidence will show that in a cross-

sectional regression without controls for familial ability, there is a significant and large effect

of city size on wages, but this effect becomes insignificant in the within-twin analysis. More

importantly, controlling for familial ability causes the agglomeration premium to become

significantly reduced in many different specifications and econometric approaches. The

analysis uses two popular measures of city size to represent the effect of agglomeration:

the log of a city’s population and an indicator equal to one if the city’s population exceeds

500,000 residents. For both variables, the inclusion of a family-specific fixed effect eliminates

the city-size wage premium.

To address the robustness of these findings, two other data sets will also be incorporated

into the analysis. First, the National Longitudinal Survey of Youth (NLSY) will

be employed because it is a longitudinal data set that also contains a large number of siblings.

Since there are few twins in this sample, it is not possible to assume that the siblings

are equally able because they are genetically identical; however, the NLSY does contain a

measure of unobserved ability since all respondents were assigned a score based upon their

performance on a series of standardized tests. As such, within-family differences in the

unobserved component of ability can be captured by differences in these standardized test

scores. The results will demonstrate that, as was the case with the data set of twins, there

is a large and significant effect of city size on wages in simple cross sectional regressions, but

the inclusion of a familial fixed effect into these regressions eliminates the significance of the

agglomeration premium.

The second data set that will be used to consider the robustness of the impact of

familial effects on the agglomeration premium is the five-percent Public Use Microdata Sample

(PUMS) from the 2000 United States Census. The PUMS is a household-based sample,

and it contains information about each member of the household, as well as information regarding

the familial relationship between the head of the household and all other household

members. As such, it is possible to identify two types of sibling sets in the data: household

heads and household members who are siblings of the household head, as well as children

of a household head who are living together in a given household. Furthermore, the information

collected by the Census makes it possible to determine the effect on wages of the

population of the public-use microdata area (PUMA) where the respondent works. This

measure of agglomeration does have a significantly positive effect on wages, but including

familial-specific fixed effects eliminates the significance of the agglomeration premium for

both sets of siblings. Overall, the evidence from all three data sets is remarkably consistent

and underlines the importance of controlling for familial ability when assessing the

magnitude and significance of the agglomeration premium.

2 Literature Review

The literature on the agglomeration wage premium documents the significant effect

of city size on wages1, and discusses how this premium is affected by selection, which arises

from the fact that the distribution of individual ability may not be the same across cities

of different sizes. Wheeler (2001) uses the 1980 IPUMS sample and finds that the return

to doubling city size is equal to approximately 3 percent. He also finds that this return

varies significantly for less- or more-educated workers: whereas there is not a significant

effect of city size on wages for workers without any high school education, college graduates

exhibit a return of approximately 4 percent if city size is doubled. Glaeser and Mare

(2001) study the urban wage premium by comparing individuals who live in a city of at

least 500,000 people to those who do not, and find that the wage difference between these

two groups is approximately 25 to 30 percent. After incorporating individual fixed effects

into their framework, Glaeser and Mare show that this premium significantly decreases to

between 4 and 10 percent, and the authors suggest that this decrease could be consistent with

the agglomeration premium being caused by ability bias. However, they also argue that

the fixed-effect framework does not address the higher wage growth evident within cities.

Consistent with this notion, Glaeser and Mare find evidence of higher wage growth for

workers in cities, but it is smaller than the impact of the fixed effect on wages from living

in cities, and the authors suggest that this fixed effect is an important component of the

urban wage premium. Yankow (2006) also considers big-city wage premia by analyzing cities

with at least one million residents, and cities between one-quarter million and one million

residents. Similar to Glaeser and Mare, he finds that fixed effects can account for about

two-thirds of the urban premium in this specification, and that growth effects can account for

some of the remaining premium. Wheeler (2006) also investigates the effect of cities on wage

growth, and specifically compares wage growth which occurs within a job to wage growth

resulting from job changes. Overall, he finds mixed evidence for higher wage growth in cities,

and that wage growth within a job is not significantly higher for workers at jobs in large

cities; it is only wage growth resulting from job changes that is significantly higher in large

cities (however, this difference is not significant in the fixed effect specification). Rosenthal

and Strange (2006) consider the transmission mechanisms through which the urban wage

premium is conveyed, and find that it is primarily due to the concentration of more-educated

workers in urban areas, and that this effect attenuates with distance. After including

controls for endogeneity as well as using an instrumental variables procedure, the authors

find that an increase in population size has a significant effect on wages. Bacolod et al.

(2007) create measures of skills to explore the urban premium, and they find that attributes

such as cognitive skills are, in general, uniformly distributed across cities of different size.

However, these skills are more highly valued in larger cities than smaller ones, and this

greater valuation is robust to the inclusion of AFQT scores and individual fixed effects in

the wage equation.

1 One of the first papers to provide theoretical and empirical evidence of this idea is Roback (1982).

International evidence on the agglomeration premium is generally similar to the

evidence from U.S. data sources. Tabuchi and Yoshida (2000) use Japanese data to estimate

the urban wage premium, and find that wages should increase by 10 percent if city size

doubles. Combes, Duranton and Gobillon (2007) use a large French panel data set to

consider the impact of the area fixed effects on wages. In their analysis of French wages,

Combes et al. find that individual fixed effects are, by far, the most important determinants

of wages as well as area fixed effects, which suggests that sorting on the basis of ability is

an important component of the urban wage premium. However, it should also be noted

that area fixed effects were not entirely eliminated by the incorporation of individual fixed

effects into a wage regression, and that agglomeration still plays a significant role in wage

determination.

3 Data and Estimation Approach

To consider the effect of familial ability on the urban wage premium, a common

estimation procedure will be used for all three data sets, and the analysis will begin with the

data set of identical twins. This data was collected during the summers of 1991, 1992, and 1993

at the Twinsburg Twins Festival in Twinsburg, Ohio, and the interview questionnaires were

modeled after the Census and CPS instruments.2 The data are drawn from the sub-sample

of identical white twins,3 both of whom have worked within two years prior to the interview

and are living within the United States, and the key question for the purpose of the analysis

is the population of the city in which each sibling lives. Respondents were asked to report

the city in which they lived, and then this city’s population was separately entered into

the data based upon Census statistics.4 Table 1 displays the characteristics of the twins

sample, and compares them to white workers from reweighted5 CPS supplements. The data

set composed of identical twins is generally similar to the reweighted CPS samples, with

some small differences evident in characteristics like marital status.

2 Some of the data from the first three waves of this survey were used by Ashenfelter and Krueger (1994) and Ashenfelter and Rouse (1998), who provide a discussion of the procedures used to collect this data. Some additional questions were specifically designed for interviewing twins, such as the twin’s report of his or her sibling’s educational attainment, which was used as an instrumental variable to account for the effect of measurement error on the return to education.

3 Ashenfelter and Krueger (1994) and Ashenfelter and Rouse (1998) discuss the fact that, on average, the black twins interviewed for the sample exhibited unrepresentative characteristics. As such, they were dropped from the sample. However, this exclusion does not affect any of the main results presented in this paper.

4 The respondent’s report of their hometown was matched against the city population provided by the 1990 U.S. Census.

5 Reweighting was conducted on the basis of the twin’s state of residence. As was the case with the results in Ashenfelter and Rouse (1998), these differences have no large effect on the results in this paper. Also, wage regressions using CPS and twin data yield very similar coefficients on all the variables in my wage regressions.

The data set of identical twins provides a unique advantage in assessing the agglomeration

premium; specifically, it is possible to determine the causal effect of a city’s

population on wages by assuming that the unobserved component of ability is equal for

both twins. This implies that the difference in earnings between a twin in a large city

and his sibling in a small town will be attributed to the effect of city size on earnings, and

will not be biased by the unobserved component of ability. Figures 1 and 2 graphically

demonstrate the effect of making this assumption about the city-size premium. In Figure

1, the log of each twin’s hourly wage is plotted against the log of their city’s size, and a

positive fitted relationship is evident between these two variables.6 If this positive effect of

city size was not prone to ability bias, then comparing the within-twin difference in wages

with the within-twin difference in city size should yield a roughly similar result, given that

each pair of twins is assumed to be equally able. However, Figure 2 plots the within-twin

differences in wages and city size, and there is not a significant relationship between these

two within-twin differences — that is, a difference in city size is not observed to correlate

with a difference in wages for each twin pair.7 This suggests that the effect of ability bias

is important within the analysis of the agglomeration premium.

However, Figures 1 and 2 are only suggestive of the importance of ability bias,

and it is necessary to explore the agglomeration premium with a more formal econometric

framework. To operationalize this framework, it is assumed that ability has a linear effect

on earnings, and the earnings equations for each twin can be expressed as follows:

$$y_{1j} = \beta_{1j}' X_{1j} + \alpha' Z_j + A_j + \varepsilon_{1j}$$

$$y_{2j} = \beta_{2j}' X_{2j} + \alpha' Z_j + A_j + \varepsilon_{2j}$$

where $X_{ij}$ represents a vector of individual characteristics for twin $i$ from family $j$, $Z_j$

represents common characteristics for family $j$, $A_j$ is a family-specific ability term and $\varepsilon_{ij}$

is an individual-specific error term. The identifying assumption of the model is that the

returns to individual characteristics $X_{ij}$ are the same for both twins, and that ability is

correlated between twins. Specifically, $A_j$ is expressed as:

$$A_j = \gamma \left( \frac{X_{1j} + X_{2j}}{2} \right) + v_j$$

6 In a simple bivariate regression of log wages on the log of city size, the coefficient on city size is 0.056 with a standard error of 0.012.

7 The bivariate regression of the within-twin difference in wages on the within-twin difference in city size yields a coefficient of 0.014 with a standard error of 0.016 for the within-twin difference in city size.

These assumptions lead to the reduced-form correlated random-effects model (Chamberlain

1982):

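$$y_{1j} = \beta X_{1j} + \alpha Z_j + \gamma \left( \frac{X_{1j} + X_{2j}}{2} \right) + v_j + \varepsilon_{1j}$$

$$y_{2j} = \beta X_{2j} + \alpha Z_j + \gamma \left( \frac{X_{1j} + X_{2j}}{2} \right) + v_j + \varepsilon_{2j}$$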

where γ represents the correlation between a family’s ability level and each twin’s individual

characteristics. An attractive component of this model is that it provides estimates of both

γ, the effect of familial ability on wages, and β, the effect of individual-specific variables on

earnings.
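
As a rough illustration of the correlated random-effects specification, the following Python sketch regresses each twin's outcome on an intercept, the twin's own regressor, and the family mean of that regressor. The simulated data, the function names (`simulate_twin_pairs`, `cre_estimates`) and all parameter values are assumptions made for illustration; this is not the twins data or the estimation code used in the paper, and it uses a single scalar regressor in place of the vector of individual characteristics.

```python
import numpy as np

def simulate_twin_pairs(n_pairs=20000, beta=0.0, ability_effect=0.1, seed=0):
    """Hypothetical twin data: x is a log city size, y is a log wage."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=(n_pairs, 1))             # shared within a family
    x = 12.0 + ability + rng.normal(size=(n_pairs, 2))  # city size correlated with ability
    eps = rng.normal(scale=0.5, size=(n_pairs, 2))      # twin-specific error
    y = beta * x + ability_effect * ability + eps
    return x, y

def cre_estimates(x, y):
    """Return (beta_hat, gamma_hat) from y_ij = c + beta*x_ij + gamma*xbar_j + error."""
    xbar = x.mean(axis=1, keepdims=True)                # (x_1j + x_2j) / 2
    xbar_long = np.repeat(xbar, 2, axis=1).ravel()      # one copy per twin
    design = np.column_stack([np.ones(x.size), x.ravel(), xbar_long])
    coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
    return float(coef[1]), float(coef[2])

x, y = simulate_twin_pairs()
beta_hat, gamma_hat = cre_estimates(x, y)
print(f"beta_hat={beta_hat:.3f}, gamma_hat={gamma_hat:.3f}")
```

In this simulation the true effect of city size is set to zero, so the coefficient on the twin's own city size should be close to zero while the coefficient on the family mean absorbs the correlation between family ability and city size.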

An alternative estimation procedure that accounts for familial ability bias is the

fixed-effects model, which differences the two regressions used in the correlated random

effects model. The resulting equation is:

$$(y_{1j} - y_{2j}) = \beta (X_{1j} - X_{2j}) + (\varepsilon_{1j} - \varepsilon_{2j})$$

Although the fixed-effect model yields unbiased estimates that are not correlated with ability,

it does not provide a direct estimate of γ.
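
The contrast between a pooled cross-sectional regression and the within-twin difference can be sketched in a few lines of Python. Everything below is an illustrative assumption: the data are simulated with a true city-size effect of zero and a family ability term that is correlated with city size, and the helper names `pooled_ols_slope` and `within_twin_slope` are hypothetical rather than taken from the paper.

```python
import numpy as np

def pooled_ols_slope(x, y):
    """Slope from a bivariate OLS regression that ignores family membership."""
    xf, yf = x.ravel(), y.ravel()
    xc = xf - xf.mean()
    return float(xc @ (yf - yf.mean()) / (xc @ xc))

def within_twin_slope(x, y):
    """Slope from regressing (y_1j - y_2j) on (x_1j - x_2j) without an intercept."""
    dx = x[:, 0] - x[:, 1]
    dy = y[:, 0] - y[:, 1]
    return float(dx @ dy / (dx @ dx))

# Simulated data (assumption): family ability raises wages and is correlated
# with city size, while city size itself has no causal effect (beta = 0).
rng = np.random.default_rng(1)
n_pairs = 20000
family_ability = rng.normal(size=(n_pairs, 1))
log_city_size = 12.0 + family_ability + rng.normal(size=(n_pairs, 2))
log_wage = 0.1 * family_ability + rng.normal(scale=0.5, size=(n_pairs, 2))

print("pooled OLS slope: ", round(pooled_ols_slope(log_city_size, log_wage), 3))
print("within-twin slope:", round(within_twin_slope(log_city_size, log_wage), 3))
```

Because the family ability term is common to both twins, it drops out of the differenced equation, so the within-twin slope should sit near zero even though the pooled slope is positive.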

Estimates from the OLS, correlated random effects and fixed effects models are

provided in Table 2, which displays results for earnings equations which use two different

measures to represent the agglomeration premium: the logarithm of the respondent’s city’s

population, and an indicator variable equal to one if the respondent’s city has a population

in excess of 500,000 residents. If familial ability had no effect on earnings, then the OLS

estimates displayed in Table 2 would provide an unbiased estimate of the effect of the exogenous

regressors, including both variables used to capture the agglomeration premium.

Also, under these circumstances, the OLS and correlated random effects estimates would

differ only because of sampling error. However, this is not the case. Results in the first

three columns show that the coefficient for the log of city size differs dramatically depending

on the estimation procedure. Without controls for ability, the estimates in column one

show that the premium for doubling a city’s size is roughly four percent, which is within

the range of premia estimated by prior studies.8 However, the results in columns two and

three demonstrate that accounting for familial ability greatly reduces the significance and

the magnitude of this coefficient — the city size premium is basically reduced to zero, with a

correspondingly small t-value. In addition, the correlation between ability and city size is

large in magnitude and highly significant. These results suggest that unobserved ability is

a significant factor for determining the city size premium, which is consistent with studies

such as Combes et al. (2007), amongst others.9
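
For readers who want to translate a coefficient on log population into a "doubling" premium, one common conversion is sketched below in Python. The paper does not state which conversion it uses, so this is an assumption; the function names are hypothetical, and the input 0.056 is the bivariate coefficient reported for the Figure 1 regression rather than a Table 2 estimate.

```python
import math

def doubling_effect(coef_on_log_population):
    """Proportional wage change implied by doubling city population
    in a log-log specification: 2**b - 1."""
    return math.exp(coef_on_log_population * math.log(2.0)) - 1.0

def indicator_effect(coef_on_dummy):
    """Proportional wage change implied by a dummy-variable coefficient
    in a log-wage regression: exp(b) - 1."""
    return math.exp(coef_on_dummy) - 1.0

# Bivariate coefficient of log wages on log city size from the Figure 1 regression.
print(round(doubling_effect(0.056), 3))  # about 0.04, i.e. roughly four percent
```

For small coefficients the result is close to the coefficient multiplied by ln 2, which is why a log-population coefficient of 0.056 corresponds to a doubling premium of roughly four percent.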

The last three columns of Table 2 display results from regressions which include an

indicator equal to one if the city’s population is greater than half of a million residents. The

findings in the fourth column demonstrate that residing in a city whose population exceeds

500,000 generates a wage premium of approximately 19 percent. Glaeser and Mare argued

that this effect may be due to selection (captured by an individual fixed effect) as well as

higher wage growth that occurs in cities. The results in columns five and six of this Table

attest to the importance of fixed effects, since they demonstrate that there is a large effect of

familial ability on this wage premium. In fact, incorporating a familial fixed effect into the

econometric framework results in the premium becoming insignificant and significantly lower

than it was in the case where no such controls were included in the regression. This suggests

that, similar to the findings in the first three columns of the table, the agglomeration wage

premium for cities of at least 500,000 people is also prone to bias through unobserved ability.

8 As previously discussed, Wheeler (2001) finds a premium of 3 percent for doubling city size. Also, Bacolod et al. find a premium of approximately 6 to 7 percent.

9 It is also noted that the other estimated coefficients in the OLS model from the data set of twins (such as the returns to education, marital status, and tenure) are similar to those in commonly used data sets. Also, as demonstrated in earlier work, the return to education remains significant even after controlling for familial ability, as does tenure, but marital status does not. Ashenfelter and Krueger (1994) and Ashenfelter and Rouse (1998) showed that education remains significant even in the presence of a family fixed-effect, while Krashinsky (2004) showed that the marital premium drops to zero after familial controls are included in the regression.

Table 3 bifurcates the sample into men and women to determine whether or not

the effects of familial ability controls differ by gender, and uses both types of measures of

agglomeration employed in Table 2 — the log of the city’s population and an indicator equal to

one if the city’s population is over five-hundred thousand. The first four columns of Table 3

present the OLS and correlated-random effects estimates for men, and the last four columns

present the same estimates for women. The first and fifth columns show that the premium

associated with the log of a city’s population is highly significant for both men and women,

and roughly of the same magnitude for both groups — approximately 4 percent for women

and 5 percent for men. For both genders, however, columns two and six demonstrate that

the effect of controlling for familial ability is the same as in Table 2 — the city-size premium

is no longer significant, its magnitude is basically zero and it is significantly lower than it

was in the absence of familial controls. Columns three and seven display the return to

living in a city with a population of at least half of one million people, and for men, the

premium is large and highly significant. Columns four and eight demonstrate that with the

inclusion of familial controls, though, the coefficient on this variable is small in magnitude

and statistically insignificant.

Recent papers have also documented the value of various skills in cities. For instance,

Bacolod et al. have found that certain types of skills are rewarded more in cities than outside

of cities. To that end, the city-size wage premium is investigated within a quantile regression

context to allow the agglomeration premium to differ for workers with different levels of skill,

and to assess the impact of a familial fixed effect within this context as well. Results from

quantile regressions are displayed in Table 4 for each of the different variables that represent

the effect of city size on wages, both with and without the inclusion of family controls in

the regression specification.10 The first two rows of Table 4 show the results from quantile

regressions at the 10th, 25th, 50th, 75th and 90th percentiles for models which use the log of

the city’s population as the independent variable representing the effect of city size. For all

five percentiles, city size has a significant effect on wages, and this effect grows in magnitude

at higher percentiles (which was also evident in Bacolod et al.). However, for all five

cases, introducing controls for familial ability causes the premium to become statistically

insignificant, and significantly smaller than the return to city size in the absence of family

controls. This suggests that familial ability not only has an impact on the average return

to city size (as was demonstrated in Tables 2 and 3), but it also affects the agglomeration

premium throughout the wage distribution as well. The last two rows of Table 4 show

similar results for the quantile regressions which include a dummy variable equal to one if

the twin resides in a city whose population exceeds 500,000 people: at all five percentiles, the

effect of including familial controls reduces the magnitude of the effect of city size on wages.

Furthermore, at all percentiles except the 50th, the premium is significant in the absence of

familial controls, but insignificant with these controls.

Overall, the results in Tables 2 through 4 demonstrate that controlling for familial

ability between twins accounts for virtually all of the wage premium associated with city size,

and this is true for many different specifications. Familial controls had a significant effect

on the premium associated with the log of the city’s population and a dummy variable equal

to one if the city’s population exceeds half a million residents. Also, the family fixed effect

had a similar impact on the agglomeration premium evident in quantile regressions. This

suggests that the unobserved component of ability is a significant factor for explaining the

wage effects of agglomeration. However, given that the data set of twins is not a large data

set, it is important to demonstrate that the impact of familial effects on the agglomeration

premium is present in other data sets as well. As such, the agglomeration premium

will be analyzed in two separate data sets in the next section.

10 For brevity’s sake, Table 4 only contains estimates from the OLS and correlated random effects models. Within-twin estimates of the agglomeration premium in the quantile regression context are similar to results using the correlated random effects model.

4 Results from the NLSY and the U.S. Census

To consider the robustness of the findings from the data set of twins, the analysis

will also explore the effect of familial ability on the agglomeration premium within the

National Longitudinal Survey of Youth (NLSY) and the five-percent Public Use Microdata

Sample (PUMS) from the 2000 United States Census. The NLSY contains several sets of

siblings because of its design: the data were assembled from an individual-level survey drawn

from households with youths between the ages of 14 and 22 in 1979. A large number of

households have multiple youths surveyed for the sample, and since the data also contains

longitudinal information about the urban status of each respondent’s town (well after they

move out of their parents’ home), it is possible to use this data to compare the wages of

siblings in different areas. Similarly, the U.S. Census is a household-level survey which

contains information about the head of the household, as well as other members within the

household, so it is possible to make two types of sibling comparisons with this data. First,

since some households contain a household head and his or her sibling, and because the data

contain information on each respondent’s Public Use Microdata Area (PUMA) of work, a

wage comparison may be conducted for these siblings working in different PUMAs.

Second, similar comparisons can be made for children of the household head who are working

and still living with the household head. Overall, the evidence from the NLSY and Census

will demonstrate that familial fixed effects make the agglomeration premium statistically

insignificant and small in magnitude, which corroborates the findings from the data set of

twins.

Table 5 presents descriptive statistics from a sample drawn from the 1979 to 2004

waves of the NLSY for all respondents who work at least 15 hours per week and more than

26 weeks of the year, not including respondents from the two oversamples collected by the

NLSY.11 For comparability’s sake, the analysis is restricted to include only those siblings who

are within three years of age of each other, and also uses same-gender siblings — sisters are

compared to sisters and brothers to brothers — in order to avoid issues regarding differential

labor force participation for brother-sister pairs.12 Table 5 displays means from the entire

cross-sectional sample of the NLSY as well as means from the sample of same-gendered

siblings; generally, the samples are quite similar.

The first row displays the percentage of respondents who reside in “urban” areas,

which the NLSY classifies as a central core or city and its adjacent, closely settled territory

which have a combined total population of 25,000 or more. Although not as detailed

as the population of the respondent’s city, the “urban” indicator is the best measure of

agglomeration in the publicly-available files of the NLSY, and it is consistent with measures

used in studies that examine agglomeration by analyzing cities with populations above and

below a given threshold. The results in the first row of the Table show that approximately

seventy-seven percent of the overall sample live in urban areas, and about eighty percent

of siblings of both genders live in urban places. The second through eighth rows of the

Table display various observable characteristics of the sample, including age, marital status

and average log wage. As it was with the results from the first row, the characteristics of

the overall sample from the NLSY are quite similar to the characteristics of the sample of

siblings.

11 The NLSY is comprised of three main subsamples: the representative cross-sectional subsample, a military oversample, and an oversampling of civilian Hispanic or Latino, black, and economically disadvantaged, non-black/non-Hispanic youth. In order to use the most representative data, the two oversamples were excluded from the analysis, and siblings were drawn from the representative cross-sectional subsample.

12 Since females have a lower probability of participating in the labor market than males, the exclusion of siblings of different genders circumvents the need to model these participation decisions.

The ninth row of the Table displays the adjusted average score from a standardized

test the respondents wrote, which is commonly referred to as the Armed Forces Qualification Test

(AFQT). The score on this test is created from an amalgam of scores on a series of tests

known as the Armed Services Vocational Aptitude Battery (ASVAB); these tests were given

to virtually all respondents in the NLSY in 1980.13 However, since respondents varied in

age and education at the time of writing the test, it is necessary to adjust the scores for these

two factors when analyzing the test scores. As such, Table 5 presents an “adjusted” score

from the AFQT — it is the residual from a regression of the AFQT score on a respondent’s

age and education at the time of writing the tests. The advantage of this variable is that it

provides an approximate measure of the unobserved component of the respondent’s ability;

that is, it represents his or her aptitude above and beyond observable measures.14
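
The adjustment described above amounts to keeping the residual from a regression of the raw score on age and education. A minimal Python sketch follows, assuming a linear specification with an intercept; the function name `adjusted_score` and the simulated scores, ages and schooling levels are hypothetical and are not NLSY values.

```python
import numpy as np

def adjusted_score(raw_score, age, education):
    """Residual from an OLS regression of the raw score on an intercept,
    age and education (all measured at the time of the test)."""
    design = np.column_stack([np.ones(len(raw_score)), age, education])
    coef, *_ = np.linalg.lstsq(design, raw_score, rcond=None)
    return raw_score - design @ coef

# Simulated respondents (assumption): ages 15 to 23 at the time of testing.
rng = np.random.default_rng(2)
n = 1000
age = rng.integers(15, 24, size=n).astype(float)
education = np.clip(age - 6 + rng.integers(-2, 3, size=n), 8, 16).astype(float)
raw = 20.0 + 2.0 * age + 3.0 * education + rng.normal(scale=10.0, size=n)

adj = adjusted_score(raw, age, education)
print("mean of adjusted score:", round(float(adj.mean()), 6))
```

By construction the residual has mean zero and is uncorrelated with age and education in the sample, so it captures only the variation in test scores that these two observables do not explain.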

This variable is useful in a regression context because it can assist in accounting for

within-sibling differences in ability. This is explored in Table 6, which displays results from

wage regressions which use a simple OLS procedure as well as a fixed-effects framework to

measure the urban premium for siblings in the NLSY. The first two columns of the Table

analyze a pooled sample of both brothers and sisters, and the results in the first column

indicate that there is a highly significant 13 percent return to living in an “urban” area,

even after controlling for observable characteristics. However, including a family-specific

fixed effect into the analysis significantly alters the agglomeration premium. The findings in

column two suggest that the agglomeration premium is slightly less than two percent, after

accounting for a family-specific fixed effect. In addition, these results are strengthened by the

fact that each sibling’s adjusted AFQT score is included in the regression. Unlike the data

set of twins, the siblings in the NLSY are not genetically identical, and it is plausible that

a family fixed effect may not capture all of the within-sibling differences in the unobserved

component of ability. However, the within-sibling difference in the adjusted AFQT score

should serve as a good proxy for any remaining portion of unobserved ability that is not

captured by the familial fixed effect.
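The contrast between the OLS and family fixed-effect estimates can be illustrated with a toy example. The sketch below is hypothetical: the eight observations are invented so that wages differ mainly across families rather than within them, and the single-regressor setup omits the other controls used in Table 6.

```python
# Hypothetical sketch: pooled OLS versus a within-family (fixed-effect) estimate
# of the urban premium, using a single regressor and invented data.
from collections import defaultdict


def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_family(family_ids, values):
    """Subtract each family's mean, which removes any family-specific effect."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for f, v in zip(family_ids, values):
        totals[f] += v
        counts[f] += 1
    return [v - totals[f] / counts[f] for f, v in zip(family_ids, values)]


# Two siblings per family; families 1 and 2 are high-wage and mostly urban.
family = [1, 1, 2, 2, 3, 3, 4, 4]
urban = [1, 0, 1, 1, 0, 0, 0, 1]
log_wage = [2.60, 2.58, 2.55, 2.50, 2.10, 2.15, 2.05, 2.07]

ols_premium = slope(urban, log_wage)
fe_premium = slope(demean_by_family(family, urban),
                   demean_by_family(family, log_wage))
```

In this invented sample the pooled estimate is 0.21 while the within-family estimate is 0.02, because only the siblings who differ in urban status (families 1 and 4) identify the fixed-effect coefficient.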

13 In a few cases, the ASVAB tests were written in 1981.

14 There are many other ways of normalizing the AFQT measure, such as converting the raw test score to a percentile score within each age cohort. All of the main findings from the NLSY are robust to the use of different normalization adjustments for the AFQT score.

The remaining four columns of Table 6 present results bifurcated for the sample of

brothers and the sample of sisters, since the returns to individual regressors in the wage

equation may be different for men and women. Generally, though, the impact of a familial

fixed effect for each gender is the same as the results for the pooled sample in columns one

and two. Columns three and four demonstrate that the agglomeration premium for brothers

is statistically significant and approximately 13 percent in an OLS regression without any

family controls, but two-and-a-half percent (and only marginally significant) once familial

fixed effects are included within the regression. Similarly, columns five and six show that

the agglomeration premium for sisters is statistically significant and approximately twelve

percent in a simple OLS regression, but only one percent and statistically insignificant once

familial controls are included in the regression. Substantively, the results in columns three

through six do not alter any of the fundamental conclusions drawn in the first two columns

of the table: even separating the analysis by gender, the large and significant agglomeration

premium becomes small in magnitude and statistically insignificant at the five percent level

of significance after familial controls are included in the regression.

A further examination of the agglomeration premium in the NLSY is presented in

Table 7, which contains results from quantile regressions at the tenth, twenty-fifth, fiftieth,

seventy-fifth and ninetieth percentiles for the three subsamples considered in Table 6. The

first two columns present quantile regression results for the pooled sample of brothers and

sisters from the data, and the results are similar to those in Table 6 (and also the findings

from the quantile regression results from the data set of twins). In the first column, it

is demonstrated that there is a significant urban premium at all five percentiles, and this

premium is also increasing in magnitude at higher percentiles. However, the second column

demonstrates that including a family-specific fixed effect makes this premium small and

statistically insignificant in all but one case. When the siblings are analyzed by gender in

the remaining columns of the Table, similar findings are evident. The results in columns

three and five for brothers and sisters, respectively, demonstrate that the urban premium is

significant at all of the percentiles, and that the premium is increasing at higher percentiles.

In columns four and six, the results show that the urban premium becomes insignificant

(and small in magnitude) in all cases. Overall, as was the case in Table 4, the findings

suggest that the urban premium increases at larger percentiles, but becomes statistically

insignificant throughout the wage distribution once familial fixed effects are included in the

regression specification.
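For readers unfamiliar with quantile regression, the sketch below shows the objective it minimizes in the simplest constant-only case. It is an illustration of the standard check (pinball) loss, not the estimation routine used for Table 7; the function names and data are hypothetical.

```python
# Hypothetical sketch: the check (pinball) loss behind quantile regression,
# shown for a constant-only model where the fit is a sample quantile.

def check_loss(residual, tau):
    """Return tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def constant_quantile_fit(values, tau):
    """Return the sample value that minimizes total check loss at quantile tau."""
    return min(values, key=lambda c: sum(check_loss(y - c, tau) for y in values))


# Made-up values with one large outlier.
values = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = constant_quantile_fit(values, 0.5)
upper_fit = constant_quantile_fit(values, 0.9)
```

Changing `tau` changes which part of the distribution the fit tracks, which is how the same regressor (here, the urban indicator in Table 7) can carry a different coefficient at the tenth percentile than at the ninetieth.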

The five-percent PUMS from the U.S. Census also allows for a within-sibling analysis,

as previously discussed, because it contains information about two main groups that will be

of particular use to the analysis: household heads and their siblings, and children of household

heads who are siblings and live in the home of the household head. Ideally, it would be

possible to identify siblings in different households (as was the case with the data set of

twins and the NLSY); however, the sample design makes such a comparison impossible.

This limitation has an effect on the types of individuals selected from the Census for this

analysis, since the sample of household heads who also have a sibling living with them (or

households with two working children still living at home) may not be representative of the

overall population. Table 8 confirms this notion by comparing sample means from the overall

census to means from the subsamples that will be used for the analysis. The first column

shows the means from the overall census population of male household heads, male children of

household heads and male siblings of household heads; comparing the characteristics of this

group to the male household heads who live with their male siblings (in column three), it is

clear that the latter sample is less educated, less wealthy and much younger than the former

sample. A similar comparison can be made with the females from the census; column two

reports sample means for all female household heads, siblings and children from the census,

and a comparison to the results from column five (for female household heads who also live

with their sisters) shows that the same differences are evident, although to a lesser degree —

there are only minor differences in hourly wages, suggesting that selection effects are more

minor for the female sample. Given the results from the sample of twins and from the

NLSY, it would be expected that the agglomeration premium would be smaller for siblings

from the Census (especially male siblings), given that they are drawn from lower percentiles

of the wage distribution.

Table 9 confirms this fact. Columns one and three show that there is a relatively

small agglomeration premium for the PUMA of work for both male household heads and their

siblings (approximately half of a percent) and for male children of household heads (approx-

imately one-and-a-half percent), and both are consistent with the returns to agglomeration

in the lower wage percentiles seen in the NLSY and data set of twins. However, columns

two and four demonstrate that, as was the case in the other data sets, familial fixed effects

eliminate the significance of the agglomeration premium for males, and significantly reduce

its magnitude. An analysis of female siblings from the Census reveals similar patterns, too.

Table 8 showed that female siblings within the Census appeared to be far more similar to

the overall sample of women in the Census, especially with regard to wages, than was the

case for men. As a result, the agglomeration premium for female siblings in the Census is

much more similar to that from the overall literature; columns five and seven reveal that

the agglomeration premium is approximately three percent for both female household heads

and their siblings, as well as female children living with the household head. Again, the

inclusion of a familial fixed effect in columns six and eight makes the premium insignificant

and substantially smaller in magnitude — virtually zero for both subsamples.

To further consider the agglomeration premium within the Census, Table 10 replicates

the analysis from Table 9, but instead of using the log of the population of the respondent’s

PUMA of work, the framework uses an indicator variable equal to one if the population

of the respondent’s PUMA of work exceeds five hundred thousand, and zero otherwise. The

results in Table 10 are highly similar to the findings in Table 9: the first and third columns

show that the two types of male siblings exhibit a significant wage premium (between six

and nine percent) for working in a PUMA whose population exceeds one-half million, but

columns two and four show that this premium becomes insignificant and small in magnitude

after family fixed effects are included in the framework. As well, the two female samples of

siblings show that this type of agglomeration premium is large and significant in the absence

of family fixed effects: columns five and seven report that these women exhibit a twelve to

fourteen percent premium for working in a PUMA whose population exceeded one-half of a

million people. However, these large premia became insignificant and small in magnitude

once familial fixed effects were included in the regression specification.
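The two agglomeration measures used in Tables 9 and 10 can be written down directly. The sketch below is hypothetical: the function names are invented, the strict inequality follows the wording "exceeds" in the text, and the 0.043 coefficient is borrowed from column one of Table 2 purely to illustrate how a log-population coefficient translates into the effect of doubling population.

```python
# Hypothetical sketch: constructing the log-population and large-area measures,
# and converting a log-population coefficient into the effect of doubling.
import math


def agglomeration_measures(population, threshold=500_000):
    """Return (log population, 1 if population exceeds the threshold else 0)."""
    return math.log(population), int(population > threshold)


def wage_effect_of_doubling(log_population_coefficient):
    """Log-wage change implied by doubling population in a log specification."""
    return log_population_coefficient * math.log(2.0)


log_pop, large_area = agglomeration_measures(750_000)
doubling_effect = wage_effect_of_doubling(0.043)  # roughly 0.03 log points
```

The indicator discards variation in population on either side of the threshold, which is one reason the two specifications are reported separately.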

Overall, the NLSY and Census provide evidence that is consistent with findings from

the data set of twins, and the results reinforce the notion that the issue of selection is a highly

important factor when computing the agglomeration premium. One remaining issue for the

analysis, though, involves some econometric complications that can affect conclusions drawn

from any sibling-based framework; these issues will be discussed in the following section.

5 Within-Sibling Differences in Ability and Measurement Error

The key issue raised in this study was the manner in which the wage premium

for agglomeration involves the sorting of workers inside or outside of cities — in particular,

that there may be a non-random sorting of more able workers into cities which creates this

premium. The assumption used to identify the causal effect of agglomeration on wages

is that the unobserved component of ability is captured by a familial fixed effect, and any

further sorting into or out of cities that occurs after accounting for this fixed effect is due to

factors unrelated to productivity, such as the preference for amenities or disamenities present

within cities.15 However, as has been the case with other studies which use data on siblings, it

can be questioned whether or not a familial fixed effect actually captures all of a sibling’s

unobserved ability. In particular, it may be the case that even after including a family-based

fixed effect into the regression specification, within-sibling differences in ability still exist,

even with the inclusion of within-sibling differences in test scores, as was the case with data

from the NLSY.

15 Roback’s (1982) seminal work on this subject provides a model of individual choice to reside inside or outside of cities.

Both Neumark (1999) and Bound and Solon (1999) outlined the potential

biases that can affect within-twin estimates of the return to education, and the same biases

can affect within-sibling estimates of any other variable in the wage equation. If sibling

i’s individual-specific component of ability is denoted by the variable $\hat{A}_{ij}$, then the wage equations for each sibling can be written as:

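$$y_{1j} = \beta X_{1j} + \alpha Z_j + \theta A_j + \phi \hat{A}_{1j} + \varepsilon_{1j}$$

$$y_{2j} = \beta X_{2j} + \alpha Z_j + \theta A_j + \phi \hat{A}_{2j} + \varepsilon_{2j}$$

In this case, the within-sibling estimates of $\beta$ derived from a regression of $\Delta y_j$ on $\Delta X_j$ are

not unbiased, because a within-sibling estimator will not fully remove the effects of ability:

$$(y_{1j} - y_{2j}) = \beta (X_{1j} - X_{2j}) + \phi (\hat{A}_{1j} - \hat{A}_{2j}) + (\varepsilon_{1j} - \varepsilon_{2j})$$

$$\Delta y_j = \beta \Delta X_j + \phi \Delta \hat{A}_j + \Delta \varepsilon_j$$

and the resulting estimates of $\beta$ are biased by the correlation of $\Delta \hat{A}_j$ and $\Delta X_j$:

$$\begin{aligned}
b_{FE} &= (\Delta X_j' \Delta X_j)^{-1} \Delta X_j' \Delta y_j \\
&= \beta + \phi (\Delta X_j' \Delta X_j)^{-1} \Delta X_j' \Delta \hat{A}_j
\end{aligned}$$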

It has been suggested that there exists a positive correlation between $\hat{A}_{ij}$ and a series of regressors

in the wage equation, such as education, marital status, tenure, and city size. Thus, the

row vector, $\Delta X_j' \Delta \hat{A}_j$, would be expected to contain exclusively positive entries. The more

able sibling would also receive a higher wage than his or her counterpart, suggesting that

$\phi > 0$, causing an upward bias in the estimation results for $b_{FE}$. This led Bound and Solon

and Neumark to suggest that the within-sibling estimates are upper-bounds of the unbiased

return to education, since it could be argued that differences in educational attainment

between the siblings were due to differences in unobserved ability that was not captured by

the family-specific fixed effect. However, this criticism is equally valid for any other variable

analyzed in the within-sibling framework, including city size, since it could be argued that

the more able sibling locates to a larger city.
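The direction of this bias can be checked with a small simulation. The sketch below is an illustration under assumed parameter values, none of which come from the paper: the true coefficient on the regressor is set to zero, the within-sibling ability difference raises wages (phi = 0.1), and it is positively correlated with the within-sibling difference in the regressor (gamma = 0.5).

```python
# Hypothetical sketch: upward bias of the within-sibling estimator when
# sibling differences in ability are correlated with differences in X.
import random


def simulate_within_sibling_estimate(n_families=20000, beta=0.0, phi=0.1,
                                     gamma=0.5, seed=0):
    """Return the no-intercept slope from regressing delta-y on delta-X."""
    rng = random.Random(seed)
    sxx = 0.0
    sxy = 0.0
    for _ in range(n_families):
        d_ability = rng.gauss(0.0, 1.0)                # within-sibling ability gap
        d_x = gamma * d_ability + rng.gauss(0.0, 1.0)  # regressor gap, tied to ability
        d_eps = rng.gauss(0.0, 0.5)                    # within-sibling wage noise
        d_y = beta * d_x + phi * d_ability + d_eps
        sxx += d_x * d_x
        sxy += d_x * d_y
    return sxy / sxx


b_fe = simulate_within_sibling_estimate()
# Probability limit under these assumptions: beta + phi * gamma / (gamma**2 + 1)
expected_limit = 0.0 + 0.1 * 0.5 / (0.5 ** 2 + 1.0)
```

Under these assumptions the estimate converges to 0.04 even though the true effect is zero, so the within-sibling estimate overstates the causal return, consistent with the upper-bound interpretation.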

Although the existence of within-sibling differences in ability may weaken conclusions

drawn about estimates of the return to education from the data set of siblings (in particular,

the data set of identical twins), it has favorable implications for the evidence on the city

size wage premium presented herein. Since differences in inter-sibling ability cause an

upward bias of the within-sibling fixed-effect estimator, the fixed-effect estimate is an

upper-bound on the true value of the return to city size. Because the fixed-effect

estimate of the city size wage premium is insignificant, this suggests that the unbiased

coefficients also are insignificant (and possibly negative). Thus, the presence of any

within-sibling differences in ability would actually strengthen the conclusions drawn from the results

about the causal effects of city size on wages.

One additional consideration for the analysis is the potential effect of measurement

error. Many authors (Ashenfelter and Krueger (1994), Griliches (1979)) have demonstrated

that measurement error has an attenuating effect on coefficient estimates from a within-

sibling framework, so it could be the case that the within-sibling or family fixed effects

estimates of the urban premium are small because of these attenuating effects. However,

this is unlikely to be true, because little measurement error would be present for the variables

used to represent agglomeration in the analysis. In the NLSY and Census, the population of

a respondent’s city or PUMA of work is recorded through a relatively accurate administrative

record, not a relatively inaccurate self-report. Further, for the data set of identical twins,

each twin was asked for his or her city of residence, not the population of this place — the

population was coded into the data based upon each respondent’s report of their town. The

likelihood that a respondent misreported his or her hometown is exceptionally small, and as

such, for all three data sources, the accuracy of the variables used in the analysis is good.

Given this accuracy, the impact of attenuation bias due to measurement error should be

very small (if present at all), and could not account for the change in the estimated agglomeration

premium in the presence of familial fixed effects.
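The attenuation argument can be made concrete with a textbook derivation. The following is a sketch under classical measurement-error assumptions that are not spelled out in the paper: a single regressor, an observed value $\tilde{X}_{ij} = X_{ij} + v_{ij}$, an error $v_{ij}$ with variance $\sigma_v^2$ that is uncorrelated across siblings and with $X_{ij}$ and $\varepsilon_{ij}$, a true regressor with variance $\sigma_X^2$ and sibling correlation $\rho_X$, and no remaining within-sibling ability differences.

```latex
\begin{aligned}
\Delta \tilde{X}_j &= \Delta X_j + \Delta v_j, \\
\operatorname{Var}(\Delta X_j) &= 2\sigma_X^2 (1 - \rho_X), \qquad
\operatorname{Var}(\Delta v_j) = 2\sigma_v^2, \\
\operatorname{plim}\, b_{FE}
  &= \beta \, \frac{\operatorname{Var}(\Delta X_j)}
                   {\operatorname{Var}(\Delta X_j) + \operatorname{Var}(\Delta v_j)}
   = \beta \, \frac{\sigma_X^2 (1 - \rho_X)}{\sigma_X^2 (1 - \rho_X) + \sigma_v^2}.
\end{aligned}
```

The attenuation factor approaches one as $\sigma_v^2$ approaches zero, which is the sense in which accurately recorded population measures leave little room for attenuation; it is more severe when siblings are highly similar in the regressor (large $\rho_X$).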

6 Conclusion

The effect of agglomeration on wages is highly significant and large in magnitude. But

the question of the causal nature of this effect has been debated in the literature, which

remains divided on this subject. The evidence presented in this paper is derived from

multiple data sources and analyzed with an econometric approach that allows for the causal

return to city size to be estimated by using a family fixed-effect for samples of siblings,

including data from the U.S. Census, the NLSY and a sample of identical twins. The

results from all three data sources were remarkably consistent, and demonstrated that there

were not significant causal effects of many different variables used to represent the effect of

agglomeration on wages, such as: the log of a city’s population, an indicator variable equal

to one if the city had a population in excess of 500,000 residents, an indicator variable equal

to one if the city had a population in excess of 25,000 residents, and similar measures for

the population of the respondent’s PUMA of work. In a simple cross-sectional regression,

all of these variables exhibited significant and large effects on wages. However, these effects

became statistically insignificant and small in magnitude once controls for familial ability

were included within the regression framework. In addition, it was found that the effect of

controlling for familial ability was not only evident in regressions which estimated the average

effect of agglomeration on wages, but also in quantile regressions. These approaches

relate to the recent finding that agglomeration has greater effects for more skilled workers;

even though the agglomeration premium is higher for more able workers, controlling for

familial ability causes the city size wage premium to become insignificant for both less- and

more-skilled workers. Overall, the evidence suggests that familial ability plays a significant

role in the effect of city size on wages.

References

[1] Aaronson, Daniel. “Using Sibling Data to Estimate the Impact of Neighborhoods on Children’s Educational Outcomes.” Journal of Human Resources, Autumn 1998, 33(4), pp. 915-946.

[2] Ashenfelter, Orley and Alan Krueger. “Estimates of the Economic Return to Schooling from a New Sample of Twins.” American Economic Review, December 1994, 84(5), pp. 1157-1173.

[3] Ashenfelter, Orley and Cecilia Rouse. “Income, Schooling and Ability: Evidence from a New Sample of Twins.” Quarterly Journal of Economics, February 1998, 113(1), pp. 253-284.

[4] Bacolod, Marigee; Blum, Bernardo S. and Strange, William C. “Skills in the City.” Mimeo, University of Toronto, 2007.

[5] Bound, John and Gary Solon. “Double Trouble: On the Value of Twins-Based Estimation of the Return to Schooling.” Economics of Education Review, April 1999, 18(2), pp. 169-182.

[6] Chamberlain, Gary. “Multivariate Regression Models for Panel Data.” Journal of Econometrics, January 1982, 18(1), pp. 5-46.

[7] Combes, Pierre-Philippe; Duranton, Gilles; and Gobillon, Laurent. “Spatial Wage Disparities: Sorting Matters!” Journal of Urban Economics, forthcoming.

[8] Glaeser, Edward L. and Mare, David C. “Cities and Skills.” Journal of Labor Economics, April 2001, 19(2), pp. 316-342.

[9] Griliches, Zvi. “Sibling Models and Data in Economics: Beginnings of a Survey.” Journal of Political Economy, October 1979, 87(5), Part 2, pp. S37-S64.

[10] Krashinsky, Harry A. “Do Marital Status and Computer Use Really Change the Wage Structure?” Journal of Human Resources, Summer 2004, pp. 774-791.

[11] Neumark, David. “Biases in Twin Estimates of the Return to Schooling.” Economics of Education Review, April 1999, 18(2), pp. 143-148.

[12] Roback, Jennifer. “Wages, Rents and the Quality of Life.” Journal of Political Economy, December 1982, 90(6), pp. 1257-1278.

[13] Rosenthal, Stuart S. and Strange, William C. “The Attenuation of Human Capital Spillovers.” Working paper, University of Toronto, 2006.

[14] Tabuchi, Takatoshi and Yoshida, Atsushi. “Separating Urban Agglomeration Economies in Consumption and Production.” Journal of Urban Economics, July 2000, 48, pp. 70-84.

[15] Wheeler, Christopher H. “Search, Sorting, and Urban Agglomeration.” Journal of Labor Economics, October 2001, 19(4), pp. 879-899.

[16] Wheeler, Christopher H. “Cities and the Growth of Wages Among Young Workers: Evidence from the NLSY.” Journal of Urban Economics, September 2006, 60, pp. 162-184.

[17] Yankow, Jeffrey J. “Why Do Cities Pay More? An Empirical Examination of Some Competing Theories of the Urban Wage Premium.” Journal of Urban Economics, September 2006, 60, pp. 139-161.

Table 1: Sample Means for Twins Data and Reweighted CPS MORG data

| | Twins Data | CPS Data |
|---|---|---|
| Age | 37.71 (11.19) | 37.34 (11.44) |
| Hourly Wage | 14.26 (12.90) | 13.90 (8.80) |
| Years of Education | 13.97 (2.04) | 13.61 (3.14) |
| Married | 0.492 (0.494) | 0.596 (0.491) |
| Female | 0.585 (0.493) | 0.480 (0.500) |
| Married Female | 0.315 (0.461) | 0.270 (0.443) |
| Unionized | 0.210 (0.403) | 0.220 (0.415) |

Standard deviations in parentheses. The CPS sample is drawn from the 1993 outgoing rotation groups, and consists of respondents between the ages of 18 and 65, who work full-time and earn real hourly wages of at least $2.50 per hour and no more than $100 per hour. The data on identical twins was collected from 1991 to 1993, and has the same age and hourly earnings restrictions as the CPS data. CPS data are reweighted on the basis of the geographic location of individuals in the data set of twins.

Table 2: Pooled Sample Wage Regressions from the Twins Data

| | OLS (1) | CRE (2) | Fixed-Effects (3) | OLS (4) | CRE (5) | Fixed-Effects (6) |
|---|---|---|---|---|---|---|
| Log(City Population) | 0.043*** (0.009) | 0.001 (0.015) | 0.001 (0.015) | … | … | … |
| Family Average Log(City Population) | | 0.056*** (0.021) | | … | … | … |
| City Population ≥ 500,000 | … | … | … | 0.188*** (0.068) | -0.075 (0.105) | -0.075 (0.105) |
| Family Average City Population ≥ 500,000 | … | … | … | | 0.346** (0.151) | |
| Education | 0.102*** (0.011) | 0.074*** (0.018) | 0.074*** (0.018) | 0.108*** (0.011) | 0.075*** (0.018) | 0.075*** (0.018) |
| Family Average Education | | 0.030 (0.023) | | | 0.037 (0.023) | |
| Married | 0.260*** (0.069) | 0.085 (0.086) | 0.085 (0.086) | 0.237*** (0.070) | 0.085 (0.086) | 0.085 (0.086) |
| Family Average Marital Status | | 0.250* (0.130) | | | 0.213 (0.131) | |
| Married Female | -0.235*** (0.085) | -0.008 (0.114) | -0.049 (0.111) | -0.240*** (0.086) | -0.052 (0.111) | -0.052 (0.111) |
| Family Average Married Female | | -0.227 (0.166) | | | -0.249 (0.167) | |
| Covered by a Union | 0.077* (0.048) | 0.096* (0.058) | 0.096* (0.058) | 0.079 (0.049) | 0.094 (0.058) | 0.094 (0.058) |
| Family Average Union Coverage | | -0.044 (0.100) | | | -0.041 (0.102) | |
| Tenure | 0.025*** (0.003) | 0.024*** (0.004) | 0.024*** (0.004) | 0.024*** (0.003) | 0.023*** (0.004) | 0.023*** (0.004) |
| Family Average Tenure | | 0.003 (0.006) | | | 0.002 (0.006) | |
| Female | -0.153*** (0.060) | -0.131* (0.076) | | -0.155** (0.061) | -0.126 (0.077) | |
| N | 526 | 526 | 263 | 526 | 526 | 263 |

Standard Errors are listed in parentheses. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. OLS and CRE regressions also include age and age squared terms.

Table 3: Wage Regressions for Male and Female Subsamples of the Twins Data

Standard Errors are listed in parentheses. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. The first four columns of the table display regression results for the male sample of twins and the last four columns display regression results for the female sample of twins. The columns entitled OLS report results from ordinary least squares regressions which do not include familial fixed effects, and the columns entitled CRE report results from a correlated random effects model which does include a familial fixed effect. The first two rows of the table report results from a set of wage regressions which use as their measure of agglomeration the logarithm of the respondent’s city’s population. The last two rows of the table report results from a set of wage regressions which use as their measure of agglomeration an indicator variable equal to one if the respondent resides in a city whose population exceeds 500,000 people. In addition to the variables representing agglomeration, all regressions also include the same variables used in Table 2, including: age and its square, education, a female indicator variable, marital status and its interaction with the female indicator variable, an indicator equal to one if the individual is covered by a labor union, and tenure on the current job.

| | Men OLS (1) | Men CRE (2) | Men OLS (3) | Men CRE (4) | Women OLS (5) | Women CRE (6) | Women OLS (7) | Women CRE (8) |
|---|---|---|---|---|---|---|---|---|
| Log(City Population) | 0.049*** (0.016) | 0.007 (0.023) | … | … | 0.038*** (0.012) | -0.003 (0.020) | … | … |
| Avg. Log(City Population) | | 0.063* (0.036) | … | … | | 0.048* (0.026) | … | … |
| City Population ≥ 500,000 | … | … | 0.303*** (0.116) | -0.138 (0.149) | … | … | 0.045 (0.077) | 0.002 (0.151) |
| Avg. City Population ≥ 500,000 | … | … | | 0.623*** (0.241) | … | … | … | 0.023 (0.198) |

Table 4: Quantile Wage Regressions for the Effect of Log of City Population and for the Effect of City Population Exceeding 500,000 for the Twins Data

Standard Errors are listed in parentheses. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. The first two columns of the table (labeled “10th Percentile”) display quantile regression results for the sample of twins at the tenth percentile, and the following pairs of columns display results from the twenty-fifth, fiftieth, seventy-fifth and ninetieth percentiles. The first two rows of the table report results from a set of wage regressions which use as their measure of agglomeration the logarithm of the respondent’s city’s population. The last two rows of the table report results from a set of wage regressions which use as their measure of agglomeration an indicator variable equal to one if the respondent resides in a city whose population exceeds 500,000 people. In addition to the variables representing agglomeration, all regressions also include the same variables used in Table 2, including: age and its square, education, a female indicator variable, marital status and its interaction with the female indicator variable, an indicator equal to one if the individual is covered by a labor union, and tenure on the current job.

| | 10th Percentile OLS (1) | 10th Percentile CRE (2) | 25th Percentile OLS (3) | 25th Percentile CRE (4) | 50th Percentile OLS (5) | 50th Percentile CRE (6) | 75th Percentile OLS (7) | 75th Percentile CRE (8) | 90th Percentile OLS (9) | 90th Percentile CRE (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| Log(City Population) | 0.034** (0.015) | -0.006 (0.021) | 0.033*** (0.012) | -0.005 (0.014) | 0.031** (0.013) | -0.003 (0.014) | 0.044*** (0.017) | -0.004 (0.018) | 0.062*** (0.020) | -0.003 (0.031) |
| Family Average Log(City Population) | | 0.057** (0.028) | | 0.048** (0.023) | | 0.056** (0.026) | | 0.065** (0.029) | | 0.077** (0.035) |
| City Population ≥ 500,000 | 0.291** (0.114) | 0.055 (0.073) | 0.161** (0.075) | 0.009 (0.124) | 0.134 (0.082) | 0.019 (0.103) | 0.251** (0.122) | -0.038 (0.219) | 0.231* (0.139) | -0.576 (0.328) |
| Family Average City Population ≥ 500,000 | | 0.257** (0.129) | | 0.206 (0.143) | | 0.285 (0.204) | | 0.332 (0.258) | | 0.916** (0.349) |

Table 5: Sample Means for Siblings and Overall Sample in the NLSY from 1979 to 2004

| | Entire Sample | Pooled Brothers and Sisters | Brothers | Sisters |
|---|---|---|---|---|
| Urban | 0.768 (0.422) | 0.790 (0.407) | 0.780 (0.414) | 0.801 (0.399) |
| Age | 29.38 (6.908) | 28.70 (6.894) | 28.56 (6.780) | 28.85 (7.017) |
| Log Hourly Wage | 2.201 (0.535) | 2.203 (0.536) | 2.267 (0.551) | 2.131 (0.510) |
| Years of Education | 12.90 (2.392) | 12.98 (2.292) | 12.71 (2.402) | 13.27 (2.124) |
| Female | 0.487 (0.500) | 0.473 (0.500) | … | … |
| Married | 0.473 (0.499) | 0.444 (0.497) | 0.433 (0.496) | 0.455 (0.498) |
| Collective Bargaining | 0.154 (0.361) | 0.169 (0.375) | 0.181 (0.385) | 0.156 (0.362) |
| Tenure (in years) | 12.47 (22.48) | 12.73 (23.63) | 12.84 (24.57) | 12.62 (22.55) |
| Adjusted AFQT | 4.848 (17.57) | 3.694 (18.41) | 3.419 (19.18) | 4.000 (17.52) |
| N | 74,491 | 14,197 | 7,480 | 6,717 |

Standard deviations are listed in parentheses. The NLSY sample is limited to respondents with a sibling who work at least 15 hours per week and whose wage exceeds $2/hour (in 1992 dollars).

Table 6: Wage Regressions from the NLSY With and Without a Familial Fixed Effect

| | Pooled Sample of Siblings: OLS (1) | Pooled Sample of Siblings: Family FE (2) | Brothers: OLS (3) | Brothers: Family FE (4) | Sisters: OLS (5) | Sisters: Family FE (6) |
|---|---|---|---|---|---|---|
| Urban | 0.127*** (0.017) | 0.017* (0.010) | 0.134*** (0.025) | 0.026* (0.014) | 0.115*** (0.023) | 0.011 (0.015) |
| Education | 0.086*** (0.004) | 0.091*** (0.003) | 0.083*** (0.006) | 0.089*** (0.004) | 0.087*** (0.007) | 0.096*** (0.004) |
| Married | 0.180*** (0.021) | 0.151*** (0.010) | 0.178*** (0.022) | 0.137*** (0.011) | 0.010 (0.022) | -0.005 (0.011) |
| Married Female | -0.173*** (0.027) | -0.167*** (0.013) | … | … | … | … |
| Collective Bargaining | 0.154*** (0.017) | 0.141*** (0.009) | 0.162*** (0.022) | 0.146*** (0.012) | 0.139*** (0.027) | 0.132*** (0.014) |
| Tenure | 0.002*** (0.000) | 0.001*** (0.000) | 0.002*** (0.0005) | 0.001*** (0.000) | 0.003*** (0.001) | 0.002*** (0.000) |
| Female | -0.067*** (0.021) | 0.127*** (0.044) | … | … | … | … |
| Adjusted AFQT | 0.005*** (0.000) | 0.005*** (0.000) | 0.006*** (0.001) | 0.006*** (0.001) | 0.005*** (0.001) | 0.004*** (0.001) |
| N | 14,197 | 14,197 | 7,480 | 7,480 | 6,717 | 6,717 |

Standard Errors are listed in parentheses and are clustered at the household level. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. All regressions also include individual experience and its square, as well as eight indicator variables to capture industry effects. Columns one and two (entitled “Pooled Sample of Siblings”) report regression estimates from a sample of brothers and sisters from the NLSY; the next two columns only consider the subsample of brothers, and the last two columns only consider the subsample of sisters. Columns two, four and six, entitled “Family FE”, report results from regressions which include a family-specific fixed effect.

Table 7: Agglomeration Premium from Quantile Wage Regressions from the NLSY With and Without a Familial Fixed Effect

| | Pooled Sample of Siblings: No Family FE (1) | Pooled Sample of Siblings: Family FE (2) | Brothers: No Family FE (3) | Brothers: Family FE (4) | Sisters: No Family FE (5) | Sisters: Family FE (6) |
|---|---|---|---|---|---|---|
| 10th Percentile | 0.073*** (0.014) | -0.032 (0.034) | 0.048** (0.020) | 0.010 (0.049) | 0.080*** (0.019) | -0.037 (0.039) |
| 25th Percentile | 0.094*** (0.011) | 0.008 (0.023) | 0.110*** (0.014) | 0.040 (0.037) | 0.096*** (0.016) | 0.004 (0.034) |
| 50th Percentile | 0.123*** (0.010) | 0.037 (0.025) | 0.140*** (0.014) | 0.052 (0.036) | 0.114*** (0.015) | 0.029 (0.029) |
| 75th Percentile | 0.132*** (0.011) | 0.041 (0.027) | 0.142*** (0.014) | 0.035 (0.033) | 0.121*** (0.015) | 0.038 (0.031) |
| 90th Percentile | 0.141*** (0.015) | 0.111*** (0.035) | 0.125*** (0.020) | 0.014 (0.045) | 0.129*** (0.022) | 0.012 (0.046) |
| N | 14,197 | 14,197 | 7,480 | 7,480 | 6,717 | 6,717 |

Standard Errors are listed in parentheses and are clustered at the household level. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. In addition to the variables representing agglomeration, all regressions also include the same variables used in Table 6, including: experience and its square, education, a female indicator variable, marital status and its interaction with the female indicator variable, an indicator equal to one if the individual is covered by a collective bargaining agreement, tenure on the current job, adjusted AFQT score, as well as eight indicator variables to capture industry effects. Columns one and two (entitled “Pooled Sample of Siblings”) report regression estimates from a sample of brothers and sisters from the NLSY; the next two columns only consider the subsample of brothers, and the last two columns only consider the subsample of sisters. Columns two, four and six, entitled “Family FE”, report results from regressions which include a family-specific fixed effect within the regression.

Table 8: Sample Means for Siblings and Overall Sample from the Five-Percent PUMS of the 2000 U.S. Census

| | Entire Male Sample (1) | Entire Female Sample (2) | Male Household Heads and Male Siblings (3) | Male Children of Household Head (4) | Female Household Heads and Female Siblings (5) | Female Children of Household Head (6) |
|---|---|---|---|---|---|---|
| Log of Workplace PUMA Population | 12.80 (1.369) | 12.92 (1.475) | 13.30 (1.692) | 13.00 (1.626) | 13.27 (1.775) | 13.12 (1.708) |
| Age | 40.22 (10.45) | 38.39 (11.05) | 32.80 (9.382) | 27.44 (7.866) | 36.08 (10.76) | 27.47 (8.050) |
| Log Hourly Wage | 2.771 (0.716) | 2.494 (0.638) | 2.360 (0.625) | 2.231 (0.577) | 2.405 (0.609) | 2.194 (0.569) |
| Years of Education | 13.69 (2.800) | 13.93 (2.488) | 11.55 (3.757) | 12.59 (2.439) | 13.33 (2.862) | 13.37 (2.347) |
| Married | 0.688 (0.463) | 0.197 (0.398) | 0.314 (0.464) | 0.071 (0.256) | 0.108 (0.310) | 0.064 (0.245) |
| N | 2,261,412 | 874,757 | 42,688 | 46,095 | 21,529 | 25,944 |

Standard deviations are listed in parentheses. The sample is limited to respondents with a sibling who work at least 15 hours per week and whose wage exceeds $2/hour. The first column reports sample means from the sample of male household heads, male siblings of household heads and male children of household heads, and the second column reports sample means from the equivalent female sample. The third column reports sample means for male household heads and their male siblings (from households which have both types of people present), and the fifth column presents sample means from the equivalent female sample. The fourth column reports sample means for male children of household heads (in households with at least two male children who work), and the sixth column presents sample means from the equivalent female sample.

Table 9: Wage Regressions from the Census With and Without a Familial Fixed Effect

                                     Male Household Heads    Male Children of        Female Household Heads  Female Children of
                                     and Male Siblings       Household Head          and Female Siblings     Household Head
                                     OLS         Family FE   OLS         Family FE   OLS         Family FE   OLS         Family FE
                                     (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)

Log of Workplace PUMA Population     0.006***    -0.006      0.017***    -0.0004     0.027***    0.001       0.029***    -0.005
                                     (0.002)     (0.004)     (0.002)     (0.004)     (0.002)     (0.006)     (0.002)     (0.005)
Education                            0.059***    0.037***    0.061***    0.045***    0.087***    0.069***    0.087***    0.066***
                                     (0.001)     (0.002)     (0.001)     (0.003)     (0.002)     (0.003)     (0.002)     (0.003)
Experience                           0.027***    0.032***    0.037***    0.037***    0.028***    0.032***    0.040***    0.045***
                                     (0.001)     (0.002)     (0.001)     (0.002)     (0.001)     (0.003)     (0.001)     (0.003)
Experience²/100                      -0.039***   -0.045***   -0.071***   -0.062***   -0.040***   -0.039***   -0.073***   -0.083***
                                     (0.002)     (0.004)     (0.004)     (0.006)     (0.003)     (0.007)     (0.005)     (0.008)
Married                              0.056***    0.117***    0.045***    0.005       -0.027**    -0.001      0.010       0.022
                                     (0.006)     (0.010)     (0.011)     (0.017)     (0.013)     (0.021)     (0.013)     (0.023)
N                                    42,688      42,688      46,090      46,090      21,529      21,529      25,944      25,944

Standard errors are listed in parentheses and are clustered at the household level. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. All regressions also include eight indicator variables to capture industry effects. The first and second columns report regression results for male household heads and their male siblings (from households which have both types of people present), and the fifth and sixth columns present regression results from the equivalent female sample. The third and fourth columns report regression results for male children of household heads (in households with at least two male children who work), and the seventh and eighth columns present regression results from the equivalent female sample. Columns two, four, six and eight, entitled “Family FE”, report results from regressions which include a family-specific fixed effect.
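The contrast between the "OLS" and "Family FE" columns can be illustrated with a minimal sketch. It is not the code used to produce the tables: the data are synthetic, the function and variable names (pooled_slope, family_fe_slope, background) are hypothetical, only one regressor is used, and the household-level clustering of standard errors is omitted. The sketch assumes that an unobserved family background raises both wages and the likelihood of working in a larger city, while city size itself has no effect on wages.

```python
import numpy as np


def pooled_slope(y, x):
    """OLS slope of y on x (with an intercept), ignoring family structure."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


def family_fe_slope(y, x, family):
    """Slope of y on x after subtracting family means (within-family estimator)."""
    _, idx = np.unique(family, return_inverse=True)  # map family ids to 0..J-1
    size = np.bincount(idx)                          # number of siblings per family
    y_dm = y - (np.bincount(idx, weights=y) / size)[idx]
    x_dm = x - (np.bincount(idx, weights=x) / size)[idx]
    return float(x_dm @ y_dm / (x_dm @ x_dm))


# Synthetic data (an assumption for illustration, not the Census sample):
# two siblings per family, a shared background term, and a true city-size
# effect of zero.
rng = np.random.default_rng(0)
n_families = 2000
family = np.repeat(np.arange(n_families), 2)
background = rng.normal(size=n_families)[family]
log_city_pop = 13.0 + background + rng.normal(size=family.size)
log_wage = 2.4 + 0.5 * background + rng.normal(scale=0.5, size=family.size)

b_pooled = pooled_slope(log_wage, log_city_pop)
b_within = family_fe_slope(log_wage, log_city_pop, family)
print(f"pooled OLS slope: {b_pooled:.3f}")
print(f"family FE slope:  {b_within:.3f}")
```

In this simulated setting the pooled slope is positive only because the family background is omitted, and subtracting family means removes it. This is the same qualitative pattern as the move from the odd-numbered to the even-numbered columns, although the sketch says nothing about why the pattern arises in the actual data.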

Table 10: Wage Regressions from the Census With and Without a Familial Fixed Effect

                                     Male Household Heads    Male Children of        Female Household Heads  Female Children of
                                     and Male Siblings       Household Head          and Female Siblings     Household Head
                                     OLS         Family FE   OLS         Family FE   OLS         Family FE   OLS         Family FE
                                     (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)

Workplace PUMA Population ≥ 500,000  0.062***    -0.005      0.084***    0.025       0.140***    0.038       0.123***    0.006
                                     (0.006)     (0.016)     (0.006)     (0.017)     (0.008)     (0.023)     (0.007)     (0.022)
Education                            0.058***    0.037***    0.061***    0.045***    0.085***    0.069***    0.087***    0.066***
                                     (0.001)     (0.002)     (0.001)     (0.003)     (0.002)     (0.003)     (0.002)     (0.003)
Experience                           0.027***    0.032***    0.037***    0.037***    0.028***    0.032***    0.040***    0.045***
                                     (0.001)     (0.002)     (0.001)     (0.002)     (0.001)     (0.003)     (0.001)     (0.003)
Experience²/100                      -0.039***   -0.045***   -0.072***   -0.062***   -0.041***   -0.039***   -0.072***   -0.083***
                                     (0.003)     (0.005)     (0.004)     (0.006)     (0.003)     (0.007)     (0.005)     (0.008)
Married                              0.059***    0.117***    0.047***    0.005       -0.026**    -0.001      0.009       0.022
                                     (0.006)     (0.010)     (0.011)     (0.017)     (0.013)     (0.021)     (0.013)     (0.023)
N                                    42,688      42,688      46,090      46,090      21,529      21,529      25,944      25,944

Standard errors are listed in parentheses and are clustered at the household level. *** Significant at the 1% level, ** Significant at the 5% level, * Significant at the 10% level. All regressions also include eight indicator variables to capture industry effects. The first and second columns report regression results for male household heads and their male siblings (from households which have both types of people present), and the fifth and sixth columns present regression results from the equivalent female sample. The third and fourth columns report regression results for male children of household heads (in households with at least two male children who work), and the seventh and eighth columns present regression results from the equivalent female sample. Columns two, four, six and eight, entitled “Family FE”, report results from regressions which include a family-specific fixed effect.

Figure 1: Wages and City Size. [Scatter plot of log hourly wage (vertical axis) against log city population (horizontal axis), with fitted values.]

Figure 2: Within-Twin Differences in City Size and Wages. [Scatter plot of the within-twin difference in log hourly wage (vertical axis) against the within-twin difference in log city population (horizontal axis), with fitted values.]
