Occupational Choice and Returns to Skills: evidence from the NLSY79 and O*Net

Juan Chaparro ∗

January 19, 2016

Abstract

The goal of this paper is to measure and decompose the wage return to a set of human skills, taking into account the self-selection of workers into their occupations. The paper combines data from the National Longitudinal Survey of Youth, 1979 Cohort (NLSY79), with data from the Occupational Information Network (O*Net) and proposes an instrumental variables approach to estimate the wage return to math and language skills. To deal with the endogeneity of occupations, I instrument the importance of math for a worker’s occupation in her thirties and forties (occupational choices) with the importance of math for the worker’s preferred occupation back in her early twenties (occupational aspirations). A similar instrument is proposed for language skills. The total wage return to math and language skills is then decomposed between direct returns and occupational sorting effects. The paper finds that most of the wage return to language skills between 1992 and 2012 was due to occupational sorting. Math skills have a larger return than language skills and occupational sorting explained only 45% of the total wage return to math skills in 2012. The remaining 55% corresponds to direct returns, which are realized across all occupations.

∗ Department of Applied Economics, University of Minnesota. Email: [email protected]. I am grateful to Joseph Ritter for his continuous support and feedback as my advisor. I also thank Marc Bellemare, Elizabeth Davis, Johanna Fajardo-Gonzalez, Paul Glewwe, Jason Kerwin and Aaron Sojourner for their comments and suggestions. I received valuable feedback at the 2015 Midwest Economic Association meeting, the 20th Latin American and Caribbean Economic Association meeting and UMN’s Labor Workshop. All errors are my own.


1 Introduction

The U.S. economy had around 135 million jobs in May 2014, which the Bureau of Labor Statistics classified into hundreds of occupational categories. There were approximately 603,000 lawyers, 174,000 electrical engineers, 118,000 head chefs and 1.1 million restaurant cooks. In total, 840 detailed occupational categories were used in the most recent issue of the Occupational Employment Statistics (BLS, 2014). The United States has rich data on the past and present occupational choices of its workforce.

Various sociological and psychological theories argue that work life, and occupations in particular, can be an important part of a person’s identity (Budd, 2011, Chapter 9). In addition, occupations implicitly carry substantial information about a worker’s human capital. For example, all practicing lawyers and physicians finished professional school, electrical engineers have at least a college degree, and head chefs have more work experience than regular cooks. There is valuable information embedded in occupational choices.

The goal of this paper is to use occupational choices in the process of measuring the wage return to a set of human skills. There are two channels through which skills might affect productivity and wages: first, some skills might improve a worker’s productivity no matter which occupation she performs; second, acquiring new skills allows a worker to choose a different occupation in which these skills are more valuable and relevant. This paper attempts to measure the contribution of each channel to the total wage return. To do so, I combine data from the National Longitudinal Survey of Youth, 1979 cohort (NLSY79), with information from the Occupational Information Network (O*Net).

The main challenge that has to be addressed is the self-selection of workers into their occupation (Roy, 1951; Heckman and Honore, 1990). For this reason, occupations have to be treated as endogenous regressors in wage equations. The NLSY79 has followed a cohort of respondents since 1979, when they were on average 18 years old. Respondents were asked in 1979 and 1982 about their occupational aspirations for age 35.¹ This paper explores the validity of using characteristics of the occupation to which someone aspired in her early twenties (occupational aspiration) as instruments for the characteristics of the occupations performed by the same individual during her thirties and forties (occupational choices).

What are the main characteristics of any occupation? Can occupations be measured and compared to one another? The research by industrial and organizational psychologists who have explored these questions is the foundation of the Occupational Information Network, known as O*Net (Peterson et al., 2001). O*Net describes in detail the skills, abilities, tasks and educational requirements of all the occupations in the U.S. economy.

¹ Respondents were asked the following question: "What kind of work would you like to be doing when you are 35 years old?" (Aspirations and Expectations Questionnaire, Question 1; Section 22 in 1979 and Section 17 in 1982). The most common answers were manager, secretary, registered nurse, accountant, and computer programmer. All answers were classified into approximately 300 occupational codes.

I process the O*Net data to create a standardized measure of the importance of math and language skills for each occupation, following the methodology proposed by Acemoglu and Autor (2011). Both standardized measures are used to score workers’ occupational choices and their occupational aspirations in the NLSY79 data. The richness of the data allows me to instrument the importance of math for a worker’s occupation in 2012, for example, with the importance of math for the occupation she aspired to back in 1982. A similar instrument is proposed for language skills. The empirical strategy followed in this paper addresses the following questions: How large is the wage return to math and language skills? Has the return changed as the cohort aged? What fraction of the return is due to occupational sorting?

Since its origins, Human Capital Theory has conceptualized human capital as a unidimensional stock built through investment in time-consuming activities (Ben-Porath, 1967; Becker, 1993). Since human capital was regarded as unidimensional, it was appropriate to consider years of education as the best indicator of human capital accumulation. In consequence, applied research focused for many years on the economic returns to schooling (Mincer, 1974; Griliches, 1977; Card, 2001). More recently, human capital has been redefined as multidimensional; a collection of different human skills (Cawley et al., 2001; Bowles et al., 2001).

There is evidence that both cognitive and noncognitive skills determine multiple labor market outcomes, including occupational choice (Heckman et al., 2006; Cobb-Clark and Tan, 2011; Almlund et al., 2011), but most of the literature is based on broad occupational categories: Heckman et al. (2006) used only two groups (blue-collar and white-collar jobs), whereas Cobb-Clark and Tan (2011) created 18 groups.² This paper uses 420 occupational categories available in the combined NLSY79-O*Net data, based on three different Census classification systems (1970, 1980 and 2000). It is appropriate to use narrow occupational categories, in which lawyers, nurses and electrical engineers are distinguished from one another, in order to extract the valuable information about human skills carried by occupational choices.

The paper has the following structure: Section 2 first explains the endogeneity of occupational choice in any wage equation. Section 3.1 presents a definition of occupations in terms of the skills involved as the starting point of the theoretical framework. The theoretical framework also defines the economic problems faced by firms (Section 3.2) and workers (Section 3.3). Since workers self-select into occupations, the total return to any skill can be decomposed into a direct return (across all occupations) and an occupational sorting effect. The formal definition of the decomposition is explained in Section 3.4. The econometric framework is explained in Section 4, with emphasis on the key assumptions required for identification under instrumental variables. The O*Net data is explained in Section 5.1 and the key traits in the NLSY79 data are discussed in Section 5.2. After combining both datasets, each occupation becomes a two-dimensional vector of math and language skills. Estimation results and the decomposition of total wage returns are discussed in Section 6. Section 7 concludes.

² 2-digit sub-major occupations from the Australian Standard Classification of Occupations.

2 Research Problem

Workers self-select into their occupations. Each worker decides which occupation she would like to perform, given her skills and the options available to her. If her corresponding labor market malfunctions or if she has a limited set of skills, then her occupational alternatives are restricted. This could be the case of low-wage workers in developed countries or an average worker in a developing country. In every case, all these workers face and solve an occupational choice problem.

A relevant empirical question arises: what is the wage premium or penalty of choosing a particular occupation? To explore this question, consider the following wage equation,

$$
\ln(w_i) = \beta X_i + \sum_{k=1}^{K-1} \gamma_k \, 1[O_i^k = 1] + u_i
$$

where $w_i$ is the wage of worker $i$ and $X_i$ is a vector of individual characteristics such as age, gender or race. The total number of occupations available is $K$. If worker $i$ performs occupation $k$, then the indicator variable $O_i^k$ is equal to 1. $\gamma_k$ denotes a semi-elasticity that measures the average wage premium or penalty of entering occupation $k$, relative to occupation $K$ (the omitted category). The error term, $u_i$, summarizes all other determinants of wages that are not included in the wage equation. In particular, all of the worker’s skills are included in $u_i$.

The self-selection of workers creates a correlation between unobserved skills and chosen occupations. Ordinary least squares estimates of parameters $[\beta, \gamma_1, \dots, \gamma_{K-1}]$ would be inconsistent, because the occupational indicators violate the exogeneity assumption ($E[u_i \mid O_i^k] \neq 0$). Therefore, any attempt to use occupational choices for inference must take into account that occupations are endogenous in a wage equation. The theoretical framework explains this problem in more detail and lays the foundations of a possible empirical solution.
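The inconsistency can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: it assumes two hypothetical occupations with the same average pay, unobserved occupation-specific skills, and workers who pick whichever occupation pays them more. All distributions and parameter values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical potential log wages in two occupations (k = 0, 1).
# Both occupations pay the same on average, so the true premium is zero.
skill_0 = rng.normal(0.0, 1.0, n)   # unobserved skill valued in occupation 0
skill_1 = rng.normal(0.0, 2.0, n)   # unobserved skill valued in occupation 1
log_w0, log_w1 = skill_0, skill_1

# Roy-style selection: each worker picks the occupation that pays her more.
in_occ_1 = log_w1 > log_w0
observed = np.where(in_occ_1, log_w1, log_w0)

# OLS of the observed log wage on a constant and the occupation dummy.
X = np.column_stack([np.ones(n), in_occ_1.astype(float)])
coef, *_ = np.linalg.lstsq(X, observed, rcond=None)
gamma_ols = coef[1]

# Average effect of moving every worker from occupation 0 to occupation 1.
true_premium = np.mean(log_w1 - log_w0)
print(f"OLS premium: {gamma_ols:.2f}, true average premium: {true_premium:.2f}")
```

Under these assumptions the OLS coefficient on the occupation dummy is clearly positive even though the average premium is approximately zero, because the dummy is correlated with the skills hidden in the error term.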


3 Theoretical Framework

The main goal of the following model is to emphasize the endogeneity of occupational choice in a wage equation and to motivate a solution. The model is based on Roy (1951), Rosen (1986), Kremer (1993) and Lazear (2009).

3.1 The environment

Individuals are characterized by a set of social and demographic traits such as age, gender and race. Denote these variables for individual $i$ as vector $z_i$. Each individual also has a set of skills that she could offer in the labor market. Individual $i$ has a proficiency level of $p_i^s$ in skill $s$ and there are $S$ skills in total. Therefore, vector $(z_i, p_i^1, \dots, p_i^S)$ fully describes each individual before facing any decision-making process.

Denote by $r_k^s$ the importance of skill $s$ for performing occupation $k$. I will refer to vector $(r_k^1, \dots, r_k^S)$ as the skill profile vector. I assume there is a one-to-one correspondence between occupations and skill profiles. When a worker chooses an occupation, she is choosing a specific skill profile vector and vice versa. Let $\Lambda$ be a compact subset of $\mathbb{R}_+^S$, representing the set of available occupations. The set of available occupations depends on the level of development and economic structure of the economy. Therefore, individuals take the set of available occupations as given and their chosen occupation must be an element of $\Lambda$.

As in Rosen (1986), labor market transactions have a double purpose, because skills and occupations are traded simultaneously. There is a market for skills, where firms look for the appropriate worker for each occupation; at the same time, individuals look for their preferred occupation in a market for occupations. Workers and firms play opposite roles in each one of these markets. I model the economic behavior of firms through a worker selection problem (Section 3.2). The economic behavior of individuals is explained using an occupational choice problem (Section 3.3).

Let $w_i^k$ be equal to the wage that worker $i$ would obtain if she were employed in occupation $k$. As in Roy (1951), we cannot observe counterfactual wages, although they are well defined and play a fundamental role in labor market equilibrium. The wage should depend on the skills of the worker, as well as the characteristics of the occupation. Let $W(p_i^1, \dots, p_i^S, r_k^1, \dots, r_k^S)$ be the wage function that has this property. It is the outcome of equilibrium conditions for skills and occupations. Both equilibrium conditions are defined in Appendix 8.1 (job market equilibrium).


3.2 The Worker Selection Problem

Consider the economic problem faced by firms. A firm has a job opening in a particular occupation and it will look for the most appropriate candidate to fill the position. The output generated by the worker will depend on the interaction between her individual skills and the characteristics of the occupation. I will follow Kremer (1993) and use a modified O-Ring production function to model such interaction.

Let $q_{i,k}^s$ be the probability that worker $i$ performs correctly the tasks associated with skill $s$, if she is hired to work in occupation $k$. More able workers should be less prone to making mistakes, but a worker’s ability is relative to the occupation. A measure of this idea is the ratio between the worker’s proficiency in skill $s$ and the importance of the same skill for the occupation, $p_i^s / r_k^s$. Therefore, $q_{i,k}^s = h(p_i^s / r_k^s)$, where $h : (0, \infty) \to (0, 1)$ and $h'(\cdot) > 0$.³

The quality of the job match is defined as $\prod_{s=1}^{S} q_{i,k}^s$.⁴ The quality of a match depends not only on how proficient the worker is on different labor skills, but also on how relevant these skills are for the occupation. This is the main reason why I differentiate between the vector of individual proficiency $(p_i^1, \dots, p_i^S)$ and the skill profile vector $(r_k^1, \dots, r_k^S)$. Lazear (2009) postulated a similar idea, but in his model skill requirements are specific to firms rather than occupations.
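To make the definition concrete, the sketch below computes match quality for one worker and two occupations. The functional form h(x) = x / (1 + x) is an assumption chosen only because it maps (0, ∞) into (0, 1), is strictly increasing, and has the limiting behavior the model requires of h; the paper does not specify h. The proficiency values and skill profiles are hypothetical.

```python
import numpy as np

def h(x):
    """Hypothetical link function: maps (0, inf) into (0, 1) and is increasing."""
    return x / (1.0 + x)

def match_quality(p, r):
    """Product over skills of q^s = h(p^s / r^s)."""
    p = np.asarray(p, dtype=float)
    r = np.asarray(r, dtype=float)
    return float(np.prod(h(p / r)))

worker = [2.0, 1.0]      # proficiency in (math, language), made-up units
engineer = [3.0, 0.5]    # hypothetical skill profile, math-heavy
editor = [0.5, 3.0]      # hypothetical skill profile, language-heavy

q_engineer = match_quality(worker, engineer)
q_editor = match_quality(worker, editor)
print(f"engineer match: {q_engineer:.3f}, editor match: {q_editor:.3f}")
```

With these made-up numbers the math-strong worker is a better match for the math-heavy profile, which is the sense in which ability is relative to the occupation.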

Finally, let $B(r_k^1, \dots, r_k^S) > 0$ be the maximum value of output produced by a worker in occupation $k$ who makes no mistakes. I assume the maximum output is non-decreasing in each one of the elements of the skill profile vector ($\partial B / \partial r_k^s \geq 0, \forall s$). We can now formulate the worker selection problem:

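$$
\begin{aligned}
\max_{p_i^1, \dots, p_i^S} \quad & \left( \prod_{s=1}^{S} q_{i,k}^s \right) B(r_k^1, \dots, r_k^S) - W(p_i^1, \dots, p_i^S, r_k^1, \dots, r_k^S) \\
\text{s.t.} \quad & q_{i,k}^s = h(p_i^s / r_k^s) \quad \forall s \in \{1, \dots, S\}
\end{aligned}
$$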

³ I rule out $p_i^s = 0$ and $r_k^s = 0$ as possible cases, but it is important to define the behavior of function $h$ under both limiting cases. In the first case, if a worker has very poor skills of type $s$, then the probability of performing correctly the related tasks should tend towards 0. Therefore, $\lim_{p \to 0} h(p/r) = 0$. In the second case, if skill $s$ is not important for performing a given occupation, then the quality of the job match should not be affected by the skill proficiency of the worker. Thus, $\lim_{r \to 0} h(p/r) = 1$.

⁴ The quality of the job match ranges between 0 and 1. It is similar to the probability of successful production in Kremer (1993, p. 553).


The wage function $W$ is shared with the occupational choice problem and is the result of the job market equilibrium explained in Appendix 8.1. The solution to the worker selection problem is characterized by the following set of $S$ first order conditions:

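$$
\underbrace{h'\!\left( \frac{p_i^s}{r_k^s} \right) \frac{1}{r_k^s}}_{(a)} \; \overbrace{\left( \prod_{s' \neq s} q_{i,k}^{s'} \right) B(r_k^1, \dots, r_k^S)}^{(b)} = \frac{\partial W}{\partial p_i^s}, \quad \forall s \in \{1, \dots, S\} \tag{1}
$$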

Hiring a worker who is more able in skill $s$ has two consequences for production. On the one hand, it increases the probability that the worker does a better job on the tasks associated with the skill. The change in probability is equal to segment (a) in the first order condition for $p_i^s$. On the other hand, the expected value of the output associated with the other $S - 1$ skills also increases, due to the complementary nature of the O-Ring production function, as explained by Kremer (1993). This second change occurs in segment (b) of the remaining $S - 1$ first order conditions. The marginal cost of hiring a more able worker in skill $s$ is given by $\partial W / \partial p_i^s$.
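As a sketch of where each first order condition comes from, substitute the constraint $q_{i,k}^{s'} = h(p_i^{s'} / r_k^{s'})$ into the objective and differentiate with respect to $p_i^s$, holding the skill profile fixed. Only the $s$-th factor of the product depends on $p_i^s$, so the chain rule gives:

$$
\frac{\partial}{\partial p_i^s} \left[ \prod_{s'=1}^{S} h\!\left( \frac{p_i^{s'}}{r_k^{s'}} \right) \right] B(r_k^1, \dots, r_k^S)
= h'\!\left( \frac{p_i^s}{r_k^s} \right) \frac{1}{r_k^s} \left( \prod_{s' \neq s} q_{i,k}^{s'} \right) B(r_k^1, \dots, r_k^S)
$$

Setting this marginal benefit equal to the marginal cost $\partial W / \partial p_i^s$ yields the condition for skill $s$.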

The solution to the worker selection problem is given by $S$ functions that pin down the skill proficiency vector of the hired worker,

$$
p_i^{*s} = P^s\!\left( r_k^1, \dots, r_k^S ;\, B, h, W \right), \quad \forall s \in \{1, \dots, S\} \tag{2}
$$

Therefore, the characteristics of the hired worker $(p_i^{*1}, \dots, p_i^{*S})$ will depend on the skill profile of the occupation she is hired to perform $(r_k^1, \dots, r_k^S)$, the maximum value of the output generated by the occupation ($B$) and function $h$. The wage function $W$ and its properties also determine the characteristics of the hired worker.⁵

⁵ Appendix 8.1 defines the supply of skills available in the market. I assume the market is thick enough for firms to find a worker with the desired combination of skills, $(p_i^{*1}, \dots, p_i^{*S})$, as long as the firm is willing to pay the equilibrium wage rate given by function $W$.

3.3 The Occupational Choice Problem

Individual preferences are represented by the following utility function:

$$
U(c_i, r_k^1, \dots, r_k^S ;\, p_i^1, \dots, p_i^S, z_i) = u(c_i) - C(r_k^1, \dots, r_k^S ;\, p_i^1, \dots, p_i^S, z_i)
$$

Utility can be broken down in two parts. The first one is an increasing and concave function of consumption ($c_i$). The second part is the effort cost derived from choosing and working in a particular occupation, called function $C$. Note that the effort cost function depends on the proficiency of the worker in every skill and the skill profile vector of the desired occupation. It also depends on the set of social and demographic characteristics of the individual ($z_i$).

Function $C$ is analogous to the effort cost function used in signaling models of education (Spence, 1973). In a classical Roy model, workers choose their occupation using an income-maximizing rule. Following a suggestion by Heckman and Honore (1990), I allow workers to take into account non-wage dimensions of work through function $C$. Some further assumptions of the effort cost function are the following:

• $\partial C / \partial p_i^s < 0, \forall s$: the effort cost of performing any given occupation is decreasing in the skill level of the worker.

• $\partial C / \partial r_k^s > 0, \forall s$: performing a more demanding occupation requires higher effort.

Finally, note that the environment has no time or explicit effort dimension. These are simplifying assumptions, but some components in the model could be interpreted as time or effort choices. In particular, the time and effort required to perform each occupation could be embedded in cost function $C$. If that is the case, then occupations that require more hours of work or additional effort will generate an additional utility cost. This assumption implies that the model allows for heterogeneity in work time or exerted effort across occupations and not within occupations.

We now have all the elements to define and solve the occupational choice problem faced by workers:

$$
\begin{aligned}
\max_{c_i,\, r_k^1, \dots, r_k^S} \quad & u(c_i) - C(r_k^1, \dots, r_k^S ;\, p_i^1, \dots, p_i^S, z_i) \\
\text{s.t.} \quad & c_i \leq W(p_i^1, \dots, p_i^S, r_k^1, \dots, r_k^S) \\
& (r_k^1, \dots, r_k^S) \in \Lambda \subset \mathbb{R}_+^S \\
& c_i > 0
\end{aligned}
$$


The worker decides the best possible occupation by choosing an optimal skill profile vector $(r_k^{*1}, \dots, r_k^{*S}) \in \Lambda$. Assuming an interior solution, this vector should comply with the following $S$ first order conditions:

$$
\frac{du}{dc_i} \frac{\partial W}{\partial r_k^s} - \frac{\partial C}{\partial r_k^s} = 0, \quad \forall s \in \{1, \dots, S\} \tag{3}
$$

These equations represent a balance between the benefits and the costs of choosing occupations with different skill profiles. For example, if a worker decides to migrate into an occupation where skill $s$ is more important, that would imply an additional effort cost of $\partial C / \partial r_k^s$, but the decision would also represent additional labor income of $\partial W / \partial r_k^s$, which would be valued at the marginal utility of consumption, $du / dc_i$.

The system of first order conditions in (3) describes optimal occupational choice, by the implicit determination of the optimal skill profile vector $(r_k^{*1}, \dots, r_k^{*S})$. Therefore, there is a system of $S$ implicit functions which drive the demand in the market for occupations (see Appendix 8.1):

$$
r_k^{*s} = R^s\!\left( z_i, p_i^1, \dots, p_i^S ;\, u, W, C \right), \quad \forall s \tag{4}
$$

Occupational choice depends on individual social and demographic characteristics ($z_i$), the proficiency of the individual in all skills $(p_i^1, \dots, p_i^S)$, and the functional forms of the utility function ($u$), the wage equation ($W$) and the effort cost function ($C$).
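The mapping from proficiency to chosen occupation can be illustrated numerically. The sketch below is only an example under assumed functional forms that the paper does not specify: a single skill (S = 1), log utility, a log-linear wage function ln W = BETA·p + ALPHA·r, and an effort cost C = r² / (2p), which satisfies the sign assumptions on C. The worker consumes her wage and picks r from a grid standing in for the compact set Λ.

```python
import numpy as np

ALPHA, BETA = 0.5, 0.2                 # hypothetical wage-function parameters
GRID = np.linspace(0.01, 5.0, 5000)    # stand-in for the set of available occupations

def log_wage(p, r):
    """Assumed wage function: ln W = BETA * p + ALPHA * r."""
    return BETA * p + ALPHA * r

def effort_cost(r, p):
    """Assumed effort cost: increasing in r, decreasing in proficiency p."""
    return r ** 2 / (2.0 * p)

def chosen_profile(p):
    """Grid search for the r that maximizes ln(c) - C(r; p) with c = W(p, r)."""
    utility = log_wage(p, GRID) - effort_cost(GRID, p)
    return float(GRID[np.argmax(utility)])

for p in (1.0, 2.0, 4.0):
    print(f"proficiency {p:.1f} -> chosen skill requirement {chosen_profile(p):.2f}")
```

Under these assumptions the first order condition gives r* = ALPHA·p in closed form, so more proficient workers sort into occupations where the skill matters more, and the grid search should reproduce that up to the grid spacing.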

Let $w_i^{k^*}$ be the actual wage earned by worker $i$ and $k^*$ her chosen occupation. $w_i^{k^*}$ corresponds to wage data that could actually be collected. The observed wage will depend on optimal occupational choices:

$$
w_i^{k^*} = W(p_i^1, \dots, p_i^S, r_k^{*1}, \dots, r_k^{*S}) \tag{5}
$$

As in a classical Roy model, this theoretical framework distinguishes between observed and counterfactual wages. Observed wages ($w_i^{k^*}$) correspond to the wage function ($W$) evaluated at the optimal occupational choice. Counterfactual wages for any given worker could be calculated in theory using the same wage function.


3.4 The wage return to skills and its decomposition

Let $\partial w_i^{k^*} / \partial p_i^s$ be the wage return to skill $s$, which can be broken down mathematically using the main results from the occupational choice problem (Equations 4 and 5):

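$$
\frac{\partial w_i^{k^*}}{\partial p_i^s} = \frac{\partial W}{\partial p_i^s} + \sum_{t=1}^{S} \frac{\partial W}{\partial r_k^t} \frac{\partial R^t}{\partial p_i^s} \tag{6}
$$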

The wage return of a skill can be broken down into two pieces. The first term in Equation 6 corresponds to a direct return ($\partial W / \partial p_i^s$). The direct return measures the effect on a worker’s wage as she becomes more competent in skill $s$, but does not change her occupation or her proficiency in any other skill.

The remaining terms under the summation in Equation 6 measure the indirect return or occupational sorting effect. As a worker becomes more proficient in skill $s$, she now has an incentive to choose another occupation. The incentive to choose a different occupation is measured by $\partial R^t / \partial p_i^s$ and the effect on wages of changing occupations is captured by $\partial W / \partial r_k^t$.

One of the goals of the empirical analysis is to measure the wage return of different skills and decompose them into direct returns and occupational sorting effects, following Equation 6. These results will be discussed in Section 6.2.
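For the two skills studied in this paper, the decomposition can be written out term by term. As an illustration, label math as skill $m$ and language as skill $l$ (labels introduced here only for the example). The total return to math is then:

$$
\frac{\partial w_i^{k^*}}{\partial p_i^m}
= \underbrace{\frac{\partial W}{\partial p_i^m}}_{\text{direct return}}
+ \underbrace{\frac{\partial W}{\partial r_k^m} \frac{\partial R^m}{\partial p_i^m} + \frac{\partial W}{\partial r_k^l} \frac{\partial R^l}{\partial p_i^m}}_{\text{occupational sorting effect}}
$$

The last term is a cross effect: becoming more proficient in math may also move the worker toward occupations with a different language requirement, and that move has its own wage consequence.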


4 Econometric Framework: Instrumental Variables

The main results from the occupational choice problem (Equations (4) and (5)) can be used to write an econometric model. The econometric model incorporates a basic wage equation plus $S$ equations which represent the endogenous occupational choice. The analysis of the following econometric model is based on Wooldridge (2010, pp. 89-98).

4.1 Econometric model

Consider the following model,

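$$
w_i^k = \theta^w z_i^w + \beta_1 p_i^1 + \beta_2 p_i^2 + \dots + \beta_S p_i^S + \alpha_1 r_k^1 + \alpha_2 r_k^2 + \dots + \alpha_S r_k^S + e_i \tag{7}
$$

$$
r_k^s = \theta^s z_i^s + \gamma_1^s p_i^1 + \gamma_2^s p_i^2 + \dots + \gamma_S^s p_i^S + u_i^s, \quad \forall s \in \{1, \dots, S\} \tag{8}
$$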

where $\beta_s$ is equal to the wage gain that a worker would obtain when her proficiency in skill $s$ increases by one unit, holding her occupation constant. Conceptually, $\beta_s \equiv \partial W / \partial p_i^s$. On the other hand $\alpha_s$, which is equivalent to $\partial W / \partial r_k^s$, measures the marginal wage change that would occur when a worker migrates into an occupation where skill $s$ is more relevant, holding constant her proficiency level in every skill. Vector $z_i^w$ encompasses all other exogenous individual characteristics that affect wages. Their effect is captured by the vector of parameters $\theta^w$.

From the theoretical model, we know that occupational skill requirements ($r_k^s$) are $S$ endogenous variables in wage equation (7). The occupational choice decision is modeled through the system of $S$ equations available in (8). The theoretical framework indicates that occupational choice is a function of the proficiency profile of the agent in every skill $(p_i^1, \dots, p_i^S)$. Therefore, $\gamma_s^t \equiv \partial r_k^t / \partial p_i^s$ and it measures occupational mobility along dimension $t$ due to the agent’s accumulation of skill $s$.

Vector $z_i^s$ includes all exogenous individual characteristics that could affect optimal occupational choice along dimension $s$. Following equation (4), such exogenous characteristics could include traits from the agent’s preferences, like properties of the utility from consumption function ($u$) or the effort cost function associated with occupational choices ($C$).


We can now express the wage return to skill $s$ ($\partial w_i^{k^*} / \partial p_i^s$) in the context of the econometric framework. The econometric equivalent of Equation 6 is the following:

$$
\frac{\partial w_i^{k^*}}{\partial p_i^s} \equiv \beta_s + \sum_{t=1}^{S} \left( \alpha_t \gamma_s^t \right) \tag{9}
$$

The direct return of skill $s$ will be measured by the estimate of $\beta_s$. The indirect return or occupational sorting effect for the same skill corresponds to the summation term in Equation 9. Note that it combines all the $\alpha$ parameters from the wage equation with a corresponding $\gamma$ parameter, each one from a different occupational choice equation (Equations 7 and 8). Therefore, consistent estimation of parameters $\alpha$, $\beta$ and $\gamma$ is fundamental to understand the wage return of different skills.
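Once the coefficients are estimated, the decomposition is simple arithmetic. The sketch below applies it to two skills using made-up coefficient values; they are not estimates from the paper and serve only to show how the direct return, the sorting effect and the sorting share are assembled.

```python
import numpy as np

skills = ["math", "language"]
beta = np.array([0.10, 0.02])     # direct returns beta_s (made-up values)
alpha = np.array([0.20, 0.10])    # wage effect alpha_t of skill importance (made-up)
# gamma[t, s]: response of occupational dimension t to proficiency in skill s (made-up)
gamma = np.array([[0.30, 0.10],
                  [0.20, 0.40]])

sorting = alpha @ gamma           # sum over t of alpha_t * gamma[t, s], one entry per skill
total = beta + sorting            # total wage return to each skill
share_sorting = sorting / total   # fraction of the total return due to sorting

for name, d, srt, tot, sh in zip(skills, beta, sorting, total, share_sorting):
    print(f"{name}: direct={d:.3f} sorting={srt:.3f} total={tot:.3f} sorting share={sh:.0%}")
```

Storing the γ coefficients as a matrix indexed by (occupational dimension, skill) lets a single matrix product deliver the sorting effect for every skill, including the cross effects between math and language.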

4.2 Identification assumptions and possible instruments

The relationship between $z_i^w$ and $z_i^s$ is crucial for identification of the parameters in equation (7) based on exclusion restrictions: there must be at least one individual characteristic that determines occupational choice which does not enter the wage equation. In other terms, each $z_i^s$ vector must include at least one variable that is not included in vector $z_i^w$. By adapting the key assumptions explained by Wooldridge (2010, pp. 89-90), we can formally define the identification assumptions behind the econometric model:

1. $E(e_i) = 0$, $E(u_i^s) = 0, \forall s$

2. $Cov(z_i^w, e_i) = Cov(p_i^1, e_i) = \dots = Cov(p_i^S, e_i) = 0$

3. $Cov(z_i^s, u_i^s) = Cov(p_i^1, u_i^s) = \dots = Cov(p_i^S, u_i^s) = 0, \forall s$

4. $Cov(z_i^s, e_i) = 0, \forall s$

5. $\theta^s z_i^s = \theta_w^s z_i^w + \theta_{s'}^s z_i^{s'}$, $\theta_{s'}^s \neq 0, \forall s$.

Assumption 1 is just a normalization. Assumptions 2 and 3 state that the entire vector of individual skills $(p_i^1, \dots, p_i^S)$ must be exogenous both in the wage equation and in every occupational skill requirement equation. If we also take into consideration Assumption 4, the exclusion restriction, then every individual characteristic included in the wage equation ($z_i^w$) or in the occupational skill requirement equations ($z_i^s$) must also be exogenous to the entire system.


Assumption 5 is a critical identification assumption. The variables included in vector $z_i^s$ can be divided into those which are also included in vector $z_i^w$ and those which are not, denoted by $z_i^{s'}$. The effect of these excluded variables on $r_k^s$ is measured by the subvector of parameters $\theta_{s'}^s$. Thus, as long as $\theta_{s'}^s \neq 0$, the excluded variables are relevant instruments in the occupational choice equations.

I propose occupational aspirations, stated by the individual before she enters the labor market, as an instrument for occupational choices. Consider the following example: suppose we ask an individual in her early 20s the following question: "What occupation would you like to perform when you are 40 years old?" This is a measure of occupational preferences. Then, 20 years later, we observe the actual occupation and the wage earned by the same individual. The desired and actual occupations might not be the same, but the skill requirements for both might be correlated.

In conclusion, I will instrument the importance of a certain skill for the actual occupation ($r_k^s$) with the importance of the same skill for the desired occupation ($z_i^{s'}$). The data section explains how I implement this idea by combining data from the Occupational Information Network (O*Net) and the National Longitudinal Survey of Youth 1979 (NLSY79).
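The logic of the instrument can be illustrated with a small simulation. The sketch below is not the paper's estimation: it uses made-up data, a single skill dimension, no control variables, and the simplest just-identified IV estimator, $\mathrm{Cov}(z, y) / \mathrm{Cov}(z, r)$. All names and parameter values (`simulate`, the 0.5 loadings, the true return of 0.3) are assumptions chosen only for illustration.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, alpha=0.3, seed=0):
    """Simulate aspirations, actual skill requirements and log wages (hypothetical data)."""
    rng = random.Random(seed)
    z, r, log_wage = [], [], []
    for _ in range(n):
        aspiration = rng.gauss(0, 1)   # instrument: skill score of the desired occupation
        unobserved = rng.gauss(0, 1)   # factor known to the worker but not to the analyst
        # The actual requirement depends on the aspiration and on the unobserved factor.
        requirement = 0.5 * aspiration + unobserved + rng.gauss(0, 1)
        # The unobserved factor also enters the wage, which makes `requirement` endogenous.
        wage = alpha * requirement + 0.5 * unobserved + rng.gauss(0, 0.5)
        z.append(aspiration)
        r.append(requirement)
        log_wage.append(wage)
    return z, r, log_wage


def ols_slope(x, y):
    """Bivariate OLS slope of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Just-identified IV estimator with one instrument and no controls."""
    return covariance(z, y) / covariance(z, x)


z, r, log_wage = simulate()
print("OLS slope:", round(ols_slope(r, log_wage), 3))
print("IV slope: ", round(iv_slope(z, r, log_wage), 3))
```

In this simulated setting the OLS slope is pulled away from the true value of 0.3 by the unobserved factor, while the IV slope stays close to it because the simulated aspiration is, by construction, uncorrelated with that factor. Whether the same holds in real data is exactly what Assumptions 4 and 5 assert.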


5 Data

5.1 The Occupational Information Network (O*Net)

There is a key implicit assumption in the theoretical and econometric frameworks: that any occupation can be translated into a vector of $S$ measurable skill requirements $(r_k^1, \ldots, r_k^S)$. I plan to use the Occupational Information Network (O*Net) database to implement this idea.

The Occupational Information Network is a public information system funded by the U.S. Department of Labor. The O*Net team collects information about the main characteristics of 861 occupations for the U.S. economy. Occupations are classified using the O*NET-SOC taxonomy, which is based on the Standard Occupational Classification system (SOC).

Occupations are analyzed using the O*Net Content Model, which synthesizes decades of research in the field of industrial psychology (Peterson et al., 2001). According to this model, an occupation can be described in full detail by considering its tasks and work activities, any previous knowledge and educational requirements, all skills and abilities involved, and some other key characteristics. For a full description of the Content Model, see ESC (2010). A list of the questionnaires currently used by the O*Net program to collect data can be found in Figure 1.

Figure 1: O*NET questionnaires

Source: Table 4-3 in NRC (2010, pg. 74)


The O*Net Content Model is composed of 8 questionnaires, also known as domains: Skills, Knowledge, Work Styles, Education and Training, Work Activities, Work Context, Abilities and Tasks (Figure 1). Each domain has a conceptual definition and is broken down into a set of descriptors. For example, the Knowledge domain is composed of 33 descriptors. There are in total 239 descriptors, not including occupation-specific tasks.

The questionnaires on skills and abilities are filled out by job analysts, who are mostly industrial psychologists specialized in human resource management. All the other questionnaires in the O*Net program are answered either by job incumbents or occupational experts, with an average of 30 respondents per occupation. Job incumbents are workers who perform the occupation at the time of survey. Occupational experts are members of professional associations who know specific details for a group of related occupations.

I reviewed all 239 descriptors and selected those included in questionnaires answered only by job incumbents and which were related to math or language skills. Only two descriptors comply with these conditions: 1) Mathematics, from the Knowledge domain (Figure 2) and 2) English Language, also from the Knowledge domain (Figure 3).

O*Net data is collected using “behaviorally anchored rating scales” (Peterson et al., 2001, pg. 474). To understand this research technique, consider Figure 2. Each job incumbent first answered how important math skills are for their own occupation on a scale from 1 (not important) to 5 (extremely important). The answer is called the importance score and corresponds to Question A in Figure 2.

If the respondent considered that math skills have at least some importance (importance score of 2, 3, 4 or 5), then he had to rate the skill level that any worker needs in order to perform well in the occupation. This is done by Question B, also in Figure 2. This score is called the level rating, and it ranges from 1 (lowest) to 7 (highest). The question used by the O*Net research team to collect information about knowledge of the English language has the same structure and is available in Figure 3.

O*Net generates publicly-available databases which are updated every year (O*Net-Partnership, 2011). The database reports importance scores and level ratings for hundreds of 8-digit O*NET-SOC occupational codes. The publicly-available scores have been rescaled to range between 0 and 100 and correspond to averages across all respondents.
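A minimal sketch of the rescaling step, assuming it is the linear (min-max) transformation that maps the lowest scale point to 0 and the highest to 100. The function name `rescale_to_100` and the five ratings are hypothetical and serve only to show the arithmetic.

```python
def rescale_to_100(score, low, high):
    """Linearly map a score on the [low, high] scale onto [0, 100]."""
    return (score - low) / (high - low) * 100.0


# Hypothetical importance ratings (1 to 5) given by five incumbents of one occupation.
ratings = [2, 3, 4, 5, 4]
mean_rating = sum(ratings) / len(ratings)

# Published-style score: the average rating, rescaled to the 0-100 range.
importance_score = rescale_to_100(mean_rating, low=1, high=5)
print(importance_score)
```

Under this assumption, an average importance rating of 3 (the midpoint of the 1 to 5 scale) corresponds to a published score of 50.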

In 2010, the National Research Council (NRC) gathered a panel to analyze the research soundness of O*Net (NRC, 2010). The panel concluded that O*Net is a valuable research tool and the U.S. Department of Labor should continue to finance it. I will follow the advice of Juan Sanchez and David Autor, members of the NRC panel, and use the importance scores for empirical analyses (NRC, 2010, pp. 195-197).


Figure 2: O*Net question used to gather data about math knowledge

Source: O*NET knowledge questionnaire, available at http://www.onetcenter.org/questionnaires.html

Figure 3: O*Net question used to gather data about language knowledge

Source: O*NET knowledge questionnaire, available at http://www.onetcenter.org/questionnaires.html


The O*Net uses a modified 8-digit version of the Standard Occupational Classification system (SOC), with more than 800 occupational codes. Therefore, most O*Net occupations have to be aggregated into broader categories and transformed into the Census classification system. Appendix 8.2 explains how this process was implemented, following the methodology proposed by Acemoglu and Autor (2011).

The key insight by Acemoglu and Autor was the use of total employment within each SOC occupation as weights. Thus, the standardized math and language score for each Census occupation is equal to a weighted average of the importance score of those SOC occupations linked to the Census code through an appropriate crosswalk.

The histogram for the standardized math knowledge scores can be found in Figure 4. A similar graph for the standardized English language knowledge scores is available as Figure 5. The measurement units of both scores are standard deviations. For example, the math knowledge required by Electrical Engineers is approximately 2 s.d. above the average for the entire U.S. employed population. As an opposite case, the math knowledge requirement for Dishwashers is 1.93 s.d. below the average.
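The aggregation and standardization described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Appendix 8.2: the crosswalk, scores and employment counts are invented, and the standardization is assumed to use employment-weighted means and standard deviations so that units are standard deviations among the employed population.

```python
import math


def weighted_mean(values, weights):
    """Weighted average of `values` with non-negative `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardize(scores, employment):
    """Employment-weighted z-scores (weighted mean 0, weighted variance 1)."""
    mean = weighted_mean(scores, employment)
    variance = weighted_mean([(s - mean) ** 2 for s in scores], employment)
    sd = math.sqrt(variance)
    return [(s - mean) / sd for s in scores]


# Hypothetical crosswalk: Census code -> list of (SOC importance score, SOC employment).
crosswalk = {
    "census_A": [(40.0, 100), (60.0, 300)],
    "census_B": [(80.0, 50)],
    "census_C": [(20.0, 550)],
}

census_scores = {}
census_employment = {}
for code, members in crosswalk.items():
    soc_scores = [score for score, _ in members]
    soc_jobs = [jobs for _, jobs in members]
    # Census-level score: employment-weighted average over the linked SOC codes.
    census_scores[code] = weighted_mean(soc_scores, soc_jobs)
    census_employment[code] = sum(soc_jobs)

codes = list(census_scores)
z_scores = standardize(
    [census_scores[c] for c in codes],
    [census_employment[c] for c in codes],
)
standardized = dict(zip(codes, z_scores))
print(standardized)
```

With these made-up inputs, the large low-score occupation (`census_C`) ends up below zero and the small high-score occupation (`census_B`) ends up furthest above zero, which mirrors how Dishwashers and Electrical Engineers sit at opposite ends of Figure 4.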

Consider now Figure 5. Lawyers perform an occupation with very high language requirements, as the English language score for this occupation is 2.28 s.d. above average. Dishwashers are again on the opposite side of the distribution, with a language score equal to -2.08 s.d.

Each occupation can be plotted in a two-dimensional space using its math and language standardized scores. The result is Figure 6. The graph shows the pattern of skill requirements for occupations available in the U.S. labor market.

Figure 6 can be interpreted using the theoretical framework. Let $S = \{L, M\}$, where $L$ and $M$ stand for Language and Math knowledge, and let $K = 418$ be the number of occupational codes plotted in the figure. In this case, the set of feasible occupations is a subset of $\mathbb{R}^2_+$ ($\Lambda \subset \mathbb{R}^2_+$) and each occupation ($k$) corresponds to a two-dimensional vector of language and math requirements, which must be an element of the set of feasible occupations ($(r_k^L, r_k^M) \in \Lambda$).

When workers solve the Occupational Choice Problem (Section 3.3), they choose a vector $(r_k^L, r_k^M)$ in the space depicted in Figure 6.


Figure 4: Standardized Math Knowledge, Importance Scores, 418 4-digit Census 2000 Occupations

Figure 5: Standardized English Language Knowledge, Importance Scores, 418 4-digit Census 2000 Occupations


Figure 6: Math and Language Knowledge, Importance Scores, 418 4-digit Census 2000 Occupations

Note for Figures 4, 5 and 6: based on O*Net data (Version 16) and SOC-Census crosswalks from Acemoglu and Autor (2011). The units are standard deviations of the importance score among the U.S. employed population.


5.2 National Longitudinal Survey of Youth (NLSY79)

The NLSY79 is a representative sample of individuals born in the United States between 1957 and 1964. The original sample size was 12,686 individuals, who were interviewed for the first time in 1979. Annual interviews were conducted until 1994 and data has been collected every two years since 1996. The most recent year of publicly available data is 2012.

Back in 1980, respondents answered the Armed Services Vocational Aptitude Battery tests (ASVAB), which comprised 10 different questionnaires. The Armed Forces Qualification Test (AFQT) is based on four of them: Word Knowledge, Paragraph Comprehension, Arithmetic Reasoning and Mathematics Knowledge. The AFQT score has been used before as a measure of skills acquired before entering the labor market (Neal and Johnson, 1996). The research team behind the NLSY79 has further processed the ASVAB data to take into account the age differences of respondents. As a result, age-appropriate math and language Z-scores are available.[6]

In 1979, the NLSY79 cohort answered a short version of Rotter’s Locus of Control instrument (Rotter, 1966). The instrument measures to what extent the individual considers that most of her life events are a consequence of chance or fate (External locus of control), or a consequence of personal decisions and effort (Internal locus of control). Heckman et al. (2006) used this variable, combined with measures of self-esteem, to construct a latent measure of non-cognitive skills. I will use the standardized Rotter locus of control score directly.[7]
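The footnote attached to this paragraph states that the score is standardized and then multiplied by -1 so that higher values point in the internal direction. A minimal sketch of that recoding, with made-up raw scores and assuming the population standard deviation is used for the standardization (the source does not say which variant was applied):

```python
import statistics


def internal_locus_scores(external_raw):
    """Standardize externally coded scores, then flip the sign (internal direction)."""
    mean = statistics.mean(external_raw)
    sd = statistics.pstdev(external_raw)  # assumption: population standard deviation
    return [-(x - mean) / sd for x in external_raw]


# Hypothetical raw scores, where a higher value means a more external locus of control.
raw = [4, 8, 10, 12, 16]
internal = internal_locus_scores(raw)
print(internal)
```

After the sign flip, the respondent with the lowest external score has the highest internal score, and the recoded variable still has mean zero.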

Occupational choices have been categorized with a very high level of detail. The NLSY79 has always used a Census Classification system to classify occupations. The 1970 Census Classification System, at a 3-digit level, was used between 1979 and 1993. The 1980 System, also at a 3-digit level, was used between 1982 and 2000. Since 2002, all occupations have been classified using the 2000 Census codes. In the publicly available data for 2012, the occupations of 6,721 respondents who were active in the labor market were categorized into 424 4-digit occupational codes.

The identification strategy is based on using occupational preferences as an instrument for occupational choices. In the case of the NLSY79, occupational preferences were measured in Section 22 of the 1979 questionnaire (Aspirations and Expectations). Respondents

[6] I exclude 107 respondents who had some problem while taking the ASVAB. The math Z-score (variable R0648301) is based on Arithmetic Reasoning and Math Knowledge. The language Z-score (variable R0648305) is based on Word Knowledge and Paragraph Comprehension.

[7] The original score is coded in the external direction. I multiplied the score by -1 after standardization, to interpret the variable in the internal direction.


were asked about their future plans regarding labor participation and occupational choice. The questionnaire starts by saying: “Now I would like to talk with you about your future plans. What would you like to be doing when you are 35 years old?”. The next question asked: “What kind of work would you like to be doing when you are 35 years old?”.[8] The same set of questions was included again in 1982 as Section 17.[9] The answers to all these questions were coded using the 1970 Census Classification System. Therefore, after a proper combination of these answers with the O*Net data, I could instrument the math and language requirements of a worker’s actual occupation with the math and language requirements of the worker’s preferred occupation back in 1979 or 1982.

The key outcome of the analysis is Hourly Rate of Pay, a measure constructed by the NLSY79 research team that combines wage or salary data with reported work time. For short, I will refer to Hourly Rate of Pay as wage. Only workers whose wage is between $1 and $120 are kept in the sample. I exclude self-employed individuals and workers who are employed in a family business without pay. I further restrict the sample to individuals who had completed at least the 9th grade at the time of survey.
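The sample restrictions in this paragraph amount to a simple filter. The sketch below uses a hypothetical record layout (the `Worker` fields are not NLSY79 variable names) and assumes the $1 and $120 bounds are inclusive, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    hourly_wage: float          # Hourly Rate of Pay, in dollars
    self_employed: bool
    unpaid_family_worker: bool
    highest_grade: int          # highest grade completed at the time of survey


def in_sample(worker):
    """Apply the wage, employment-type and schooling restrictions."""
    return (
        1.0 <= worker.hourly_wage <= 120.0   # assumption: bounds are inclusive
        and not worker.self_employed
        and not worker.unpaid_family_worker
        and worker.highest_grade >= 9
    )


# Hypothetical records, one per respondent-year.
workers = [
    Worker(15.0, False, False, 12),    # kept
    Worker(0.5, False, False, 12),     # dropped: wage below $1
    Worker(200.0, False, False, 16),   # dropped: wage above $120
    Worker(20.0, True, False, 14),     # dropped: self-employed
    Worker(18.0, False, False, 8),     # dropped: less than 9th grade
]

sample = [w for w in workers if in_sample(w)]
print(len(sample))
```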

The empirical analysis focuses on labor market outcomes during prime age, between 1992 and 2012. The selected sample in 1992 comprises 4,796 respondents, whose average age was 30.7 years. The sample size in 2012 drops to 3,469 individuals, with an average age of 51.1 years. Summary statistics by year are reported in Tables 1 through 5. Around half of the sample are women, 32 percent are African American and 19 percent are Hispanic. There are three standardized and time-invariant measures of individual skills: math ($p_i^M$), language ($p_i^L$) and internal locus of control ($p_i^C$). The current occupation O*Net scores ($r_k^M$, $r_k^L$) and the occupational aspirations scores ($z_i^{M'}$, $z_i^{L'}$) were standardized following the procedure described in Appendix 8.2 (Step 3).

[8] Section 22 of the 1979 questionnaire corresponds to variables R0170000 through R0170800 in the publicly available database.

[9] Variables R0808200 through R0809000 in the publicly available database.


6 Results

6.1 OLS and IV models

Consider first Table 6. The table reports five cross-sectional OLS wage equations, between 1992 and 2012.[10] The specification follows Equation 7 from the Econometric Framework. Thus, there are three sets of regressors. The first set of regressors are individual exogenous characteristics such as age and indicators for women, African American or Hispanic respondents (vector $z_i^w$). The second set of regressors are individual skills on math, language and the internal locus of control ($p_i^M$, $p_i^L$, $p_i^C$). The third set of regressors are the math and language scores for the occupation performed by the worker in each year ($r_k^M$, $r_k^L$).

The positive or non-significant wage gaps for African Americans or Hispanics are consistent with the findings by Neal and Johnson (1996). This is a consequence of including measures of skills acquired before entering the labor market. All these skills are positively correlated with wages. There is also a positive correlation between wages and the math or language knowledge occupational scores. However, I cannot claim a causal interpretation of these positive correlations, because $r_k^M$ and $r_k^L$ are endogenous regressors.

Tables 7, 8 and 9 summarize the results from using instrumental variables to tackle the endogeneity of occupational choice. Table 7 presents the first stage regressions for the math knowledge occupational score ($r_k^M$). The specification of the first stage regressions follows Equation 8 from the Econometric Framework. $r_k^M$ is instrumented using all individual exogenous characteristics ($z_i^w$) as well as the worker’s individual proficiency scores. The additional instruments, not included in the wage equations, are math and language scores from the respondent’s occupational preferences back in 1982 ($z_i^{M'}$ and $z_i^{L'}$).

According to Table 7, the occupational sorting of workers along the math dimension is driven by their own math skills ($p_i^M$) and their taste for math-related occupations ($z_i^{M'}$). There is an interesting interpretation of the parameter associated with $z_i^{M'}$ ($\theta^M_{M'}$). Suppose workers had perfect foresight. In that case, the occupation performed by a worker at 35 years of age (around 1996) should be the same occupation stated by her back in 1982. Under this hypothetical situation, $\theta^M_{M'}$ should be equal to 1 and all the other parameters in the model should be equal to 0, including the constant term. But that is not the case: in all years analyzed, the estimate for $\theta^M_{M'}$ is less than 1.[11] Therefore, workers tend to overshoot their occupational aspirations along the math dimension.
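The comparison against the perfect-foresight benchmark can be checked with a normal-approximation confidence interval. The sketch below uses the rounded 1996 point estimate and robust standard error for the math aspiration coefficient reported in Table 7 (0.129 and 0.016); because those inputs are rounded, the bounds only approximately match the interval reported in the paper's footnote. The function name `normal_ci` is an assumption for illustration.

```python
def normal_ci(estimate, std_error, z=1.96):
    """Two-sided 95% confidence interval under a normal approximation."""
    return estimate - z * std_error, estimate + z * std_error


# Rounded 1996 values for the math aspiration coefficient in Table 7.
lower, upper = normal_ci(0.129, 0.016)
print(round(lower, 3), round(upper, 3))

# Under perfect foresight the coefficient would be 1; the interval lies far below it.
print(upper < 1.0)
```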

[10] According to the NBER’s Business Cycle Dating Committee, two recessions occurred in the US between 1992 and 2012: 1) between March and November of 2001 and 2) between December 2007 and June 2009 (The Great Recession). I excluded from the analysis those years close to either recession.

[11] For example, in 1996: point estimate = 0.129, 95% confidence interval = [0.097, 0.160].


Consider now Table 8. These are the first stage regressions that instrument $r_k^L$ using the same set of instruments as in Table 7. Language skills ($p_i^L$) and taste for language-related occupations ($z_i^{L'}$) influence the occupational sorting of workers along the language dimension. As in the case of math, workers also tend to overshoot their occupational aspirations along the language dimension ($\theta^L_{L'} < 1$). Math skills ($p_i^M$) also have an additional sorting effect: individuals with better math skills are able to join occupations in which language skills are more relevant. There was no evidence of the opposite in Table 7. This suggests a complementarity between math and language skills, but only within those occupations where language is a predominant requirement.

Table 9 reports the instrumental variables estimates of the wage equation. Math skills have a positive and statistically significant return in every year, except in 2006. The OLS models overestimate the direct return to math skills, when compared to the IV models. Furthermore, the positive return to developing an internal locus of control is stable and persistent (average of 0.026 log points across the period of analysis). There is a strong contrast between the OLS and the IV models on the direct return to language skills: according to the IV results, there is a small but positive direct return when respondents were in their 30s (1992 and 1996), but the return fades out during their 40s and early 50s (2004, 2006 and 2012).

There are profound differences in the estimated returns to occupational mobility ($\alpha_M$ and $\alpha_L$). The OLS models underestimate the return to occupational mobility on both dimensions ($r_k^M$ and $r_k^L$). Focus on 2012. According to the IV results, a worker who transitions into an occupation in which the math requirements are 1 standard deviation above the requirements for her current occupation would obtain a wage gain of 0.393 log points (≈48%). The equivalent coefficient in the OLS model is 0.061 log points (≈6.3%).
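The percentages in parentheses follow from the usual conversion of a log-point coefficient $b$ into a proportional change, $e^{b} - 1$. A short sketch of that arithmetic, using the two 2012 coefficients quoted above (the helper name is an assumption):

```python
import math


def log_points_to_percent(b):
    """Convert a log-point change into a percent change: 100 * (exp(b) - 1)."""
    return math.expm1(b) * 100.0


for label, coefficient in [("IV", 0.393), ("OLS", 0.061)]:
    print(label, round(log_points_to_percent(coefficient), 1))
```

The conversion matters more for large coefficients: for small values such as 0.061 the log-point figure and the percent change are nearly identical, whereas for 0.393 they differ by almost ten percentage points.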

6.2 Wage return to each skill

We now have all the elements to compute the wage return on each skill, following Equations 6 and 9. Figures 7, 8 and 9 present a visual summary of the results. Each bar is broken down into three parts: 1) the direct return, 2) the occupational sorting effect along the language dimension and 3) the occupational sorting effect along the math dimension (see Sections 3.4 and 4.1).
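The three-part breakdown can be sketched as a chain-rule calculation. Equations 6 and 9 are not reproduced in this part of the paper, so the sketch assumes the total return to a skill equals its direct coefficient in the wage equation plus, for each occupational dimension, the wage return to that dimension times the skill's first-stage coefficient. All numbers below are hypothetical placeholders, not estimates from the paper.

```python
def decompose_return(beta, alpha, gamma):
    """Split the total return to one skill into direct and sorting components.

    beta  -- direct return to the skill in the wage equation
    alpha -- wage return to each occupational dimension, e.g. {"M": ..., "L": ...}
    gamma -- effect of the skill on each occupational dimension (first stage)
    """
    sorting = {dim: alpha[dim] * gamma[dim] for dim in alpha}
    total = beta + sum(sorting.values())
    return {"direct": beta, "sorting": sorting, "total": total}


# Hypothetical coefficients for a single skill in a single year.
result = decompose_return(
    beta=0.10,
    alpha={"M": 0.40, "L": 0.20},
    gamma={"M": 0.15, "L": 0.20},
)
sorting_share = sum(result["sorting"].values()) / result["total"]
print(result)
print(round(sorting_share, 2))
```

With these placeholder values the direct return and the combined sorting effect each account for half of the total, which is the kind of share reported for each bar in Figures 7 through 9.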

Consider first math skills ($p_i^M$, Figure 7). Their total return increased from 0.155 log points in 1992 to 0.212 log points in 2012. They have a strong occupational sorting effect, not only along the math dimension but also along the language dimension. Taken together, the occupational sorting effect of math skills represented around 60 percent of the total return across all years.


Continue now to language skills, depicted in Figure 8. The total wage return for language skills is much smaller than the return for math skills. At its peak, in 1996, the total return reached 0.069 log points. Furthermore, its composition changes drastically across the life cycle. Language skills had a positive direct return when the cohort was in their 30s (1992 and 1996), but the direct return vanishes from the early 40s going forward (2004, 2006 and 2012). In addition, most of the occupational sorting effect of language skills occurs through the language dimension.

Finally, consider the return on developing an internal locus of control (Figure 9). The total wage return ranged between 0.03 and 0.04 log points throughout the years of analysis. There is a substantial difference in the composition of the returns, when compared to the case of math or language skills: almost 75% of the total return is derived from the direct return. Recall that the direct return to a skill can be interpreted as the effect on workers’ productivity across all occupations (Section 3.4). In conclusion, the breakdown of the returns allows us to understand the role played by different skills in the process of occupational sorting and the enhancement of workers’ productivity.

7 Concluding Remarks

This paper has shown the value of extracting information from occupational choices. The endogeneity of occupations in wage equations can be overcome if occupational aspirations are used to instrument occupational choices. In order to do so, occupations must be defined and measured in terms of the skill profile they require. The O*Net is a valuable research tool that quantifies different dimensions of occupations.

The appropriate design of job training programs depends on the composition of the wage return to different skills. For example, if most of the effect of math skills is due to direct returns, then job training programs that promote mathematical reasoning and problem solving across different occupations could provide benefits. As another example, if the returns to language skills are mostly driven by occupational sorting effects, then any job training program that focuses on reading, writing or speaking skills should expect the trainees to migrate from their previous occupation into new ones. Hence, careful estimation of the wage return to skills is relevant, not only for an accurate description of labor markets, but also for the proper design of active labor market policies (Heckman et al., 1999).

This paper has calculated the total wage return for math and language skills for the NLSY79 cohort. Math skills have the largest return: a 1 standard deviation gain in math skills during the late teen years is associated with a 16.8% (0.155 log points) increase in hourly wages by the early thirties. The positive return to math skills became larger as the cohort aged. By 2012, when the cohort’s average age was 51 years, the total wage return reached 23.7% (0.212 log points). The return to language skills is much smaller: on average 6.6% between 2002 and 2012.

This paper also decomposed the total wage return of each skill between direct effects and occupational sorting effects. It is possible to calculate such a decomposition in the context of an instrumental variables estimation. Math skills affect wages across the life cycle through direct effects and occupational sorting effects. When respondents were in their early thirties, occupational sorting contributed approximately 58% of the total wage return on math skills. The contribution of sorting effects for math skills peaked when respondents were on average 45 years old (in 2006) at 81%. Language skills followed a different pattern: direct returns were relevant during the cohort’s early thirties (1992 and 1996), but most of the wage return was due to occupational sorting effects years later. More than 90% of the total wage return of language skills was due to occupational sorting effects from the moment the cohort reached their forties in the early 2000s.


Table 1: Summary Statistics, 1992

| Variable | Mean | Std. Dev. | Min. | Max. | N |
|---|---|---|---|---|---|
| Ln of Hourly Rate of Pay ($\ln(w)$) | 2.23 | 0.51 | 0 | 4.75 | 4796 |
| Age ($z_i^w$) | 30.73 | 2.22 | 27 | 35 | 4796 |
| Female indicator ($z_i^w$) | 0.49 | 0.5 | 0 | 1 | 4796 |
| African American indicator ($z_i^w$) | 0.31 | 0.46 | 0 | 1 | 4796 |
| Hispanic indicator ($z_i^w$) | 0.18 | 0.39 | 0 | 1 | 4796 |
| Math skills, ASVAB 1980 ($p_i^M$) | -0.17 | 0.97 | -3.19 | 2.97 | 4796 |
| Language skills, ASVAB 1980 ($p_i^L$) | -0.19 | 0.98 | -3.31 | 2.96 | 4796 |
| Rotter Locus of Control, 1979 ($p_i^C$) | 0 | 0.97 | -3.03 | 1.92 | 4796 |
| Occupation O*Net score: Math ($r_k^M$) | 0.05 | 0.97 | -2.86 | 2.25 | 4796 |
| Occupation O*Net score: Language ($r_k^L$) | -0.01 | 0.95 | -2.54 | 2.48 | 4796 |
| Occ. Aspiration in 1982: Math ($z_i^{M'}$) | 0.33 | 0.9 | -2.36 | 2.99 | 4796 |
| Occ. Aspiration in 1982: Language ($z_i^{L'}$) | 0.4 | 0.79 | -2.19 | 2.28 | 4796 |

Table 2: Summary Statistics, 1996

| Variable | Mean | Std. Dev. | Min. | Max. | N |
|---|---|---|---|---|---|
| Ln of Hourly Rate of Pay ($\ln(w)$) | 2.41 | 0.57 | 0.06 | 4.79 | 4835 |
| Age ($z_i^w$) | 34.62 | 2.23 | 31 | 39 | 4835 |
| Female indicator ($z_i^w$) | 0.5 | 0.5 | 0 | 1 | 4835 |
| African American indicator ($z_i^w$) | 0.32 | 0.47 | 0 | 1 | 4835 |
| Hispanic indicator ($z_i^w$) | 0.18 | 0.39 | 0 | 1 | 4835 |
| Math skills, ASVAB 1980 ($p_i^M$) | -0.2 | 0.97 | -3.03 | 2.97 | 4835 |
| Language skills, ASVAB 1980 ($p_i^L$) | -0.21 | 0.98 | -3.31 | 3.04 | 4835 |
| Rotter Locus of Control, 1979 ($p_i^C$) | -0.01 | 0.98 | -3.03 | 1.92 | 4835 |
| Occupation O*Net score: Math ($r_k^M$) | 0.11 | 0.97 | -2.86 | 2.25 | 4835 |
| Occupation O*Net score: Language ($r_k^L$) | -0.01 | 0.94 | -2.54 | 2.48 | 4835 |
| Occ. Aspiration in 1982: Math ($z_i^{M'}$) | 0.31 | 0.9 | -2.67 | 2.99 | 4835 |
| Occ. Aspiration in 1982: Language ($z_i^{L'}$) | 0.38 | 0.8 | -2.44 | 2.28 | 4835 |


Table 3: Summary Statistics, 2004

| Variable | Mean | Std. Dev. | Min. | Max. | N |
|---|---|---|---|---|---|
| Ln of Hourly Rate of Pay ($\ln(w)$) | 2.7 | 0.6 | 0.12 | 4.75 | 3684 |
| Age ($z_i^w$) | 42.97 | 2.24 | 39 | 48 | 3684 |
| Female indicator ($z_i^w$) | 0.52 | 0.5 | 0 | 1 | 3684 |
| African American indicator ($z_i^w$) | 0.33 | 0.47 | 0 | 1 | 3684 |
| Hispanic indicator ($z_i^w$) | 0.19 | 0.39 | 0 | 1 | 3684 |
| Math skills, ASVAB 1980 ($p_i^M$) | -0.25 | 0.97 | -3.03 | 2.97 | 3684 |
| Language skills, ASVAB 1980 ($p_i^L$) | -0.26 | 0.97 | -3.2 | 3.04 | 3684 |
| Rotter Locus of Control, 1979 ($p_i^C$) | -0.06 | 0.98 | -3.03 | 1.92 | 3684 |
| Occupation O*Net score: Math ($r_k^M$) | -0.02 | 0.96 | -2.86 | 2.25 | 3684 |
| Occupation O*Net score: Language ($r_k^L$) | -0.03 | 0.96 | -2.57 | 2.56 | 3684 |
| Occ. Aspiration in 1982: Math ($z_i^{M'}$) | 0.29 | 0.9 | -2.67 | 2.99 | 3684 |
| Occ. Aspiration in 1982: Language ($z_i^{L'}$) | 0.37 | 0.79 | -2.44 | 2.28 | 3684 |

Table 4: Summary Statistics, 2006

| Variable | Mean | Std. Dev. | Min. | Max. | N |
|---|---|---|---|---|---|
| Ln of Hourly Rate of Pay ($\ln(w)$) | 2.76 | 0.61 | 0.04 | 4.79 | 3641 |
| Age ($z_i^w$) | 44.49 | 2.21 | 41 | 49 | 3641 |
| Female indicator ($z_i^w$) | 0.51 | 0.5 | 0 | 1 | 3641 |
| African American indicator ($z_i^w$) | 0.33 | 0.47 | 0 | 1 | 3641 |
| Hispanic indicator ($z_i^w$) | 0.19 | 0.39 | 0 | 1 | 3641 |
| Math skills, ASVAB 1980 ($p_i^M$) | -0.25 | 0.97 | -3.03 | 2.97 | 3641 |
| Language skills, ASVAB 1980 ($p_i^L$) | -0.26 | 0.97 | -3.2 | 2.66 | 3641 |
| Rotter Locus of Control, 1979 ($p_i^C$) | -0.07 | 0.99 | -3.03 | 1.92 | 3641 |
| Occupation O*Net score: Math ($r_k^M$) | -0.01 | 0.96 | -2.86 | 2.25 | 3641 |
| Occupation O*Net score: Language ($r_k^L$) | -0.01 | 0.95 | -2.57 | 2.56 | 3641 |
| Occ. Aspiration in 1982: Math ($z_i^{M'}$) | 0.29 | 0.9 | -2.36 | 2.99 | 3641 |
| Occ. Aspiration in 1982: Language ($z_i^{L'}$) | 0.36 | 0.79 | -2.16 | 2.28 | 3641 |


Table 5: Summary Statistics, 2012

| Variable | Mean | Std. Dev. | Min. | Max. | N |
|---|---|---|---|---|---|
| Ln of Hourly Rate of Pay ($\ln(w)$) | 2.91 | 0.6 | 0.21 | 4.75 | 3469 |
| Age ($z_i^w$) | 51.11 | 2.21 | 47 | 56 | 3469 |
| Female indicator ($z_i^w$) | 0.54 | 0.5 | 0 | 1 | 3469 |
| African American indicator ($z_i^w$) | 0.32 | 0.47 | 0 | 1 | 3469 |
| Hispanic indicator ($z_i^w$) | 0.19 | 0.39 | 0 | 1 | 3469 |
| Math skills, ASVAB 1980 ($p_i^M$) | -0.2 | 0.97 | -3.19 | 2.97 | 3469 |
| Language skills, ASVAB 1980 ($p_i^L$) | -0.21 | 0.97 | -3.2 | 2.74 | 3469 |
| Rotter Locus of Control, 1979 ($p_i^C$) | -0.05 | 0.98 | -3.03 | 1.92 | 3469 |
| Occupation O*Net score: Math ($r_k^M$) | 0 | 0.97 | -2.86 | 2.25 | 3469 |
| Occupation O*Net score: Language ($r_k^L$) | 0.08 | 0.99 | -2.57 | 2.56 | 3469 |
| Occ. Aspiration in 1982: Math ($z_i^{M'}$) | 0.3 | 0.89 | -2.36 | 2.99 | 3469 |
| Occ. Aspiration in 1982: Language ($z_i^{L'}$) | 0.4 | 0.78 | -2.16 | 2.28 | 3469 |


Table 6: OLS models - Ln of Hourly Rate of Pay - NLSY79

$\ln(w_i^k) = \theta_w z_i^w + \beta_M p_i^M + \beta_L p_i^L + \beta_C p_i^C + \alpha_M r_k^M + \alpha_L r_k^L + e_i$

| | 1992 | 1996 | 2004 | 2006 | 2012 |
|---|---|---|---|---|---|
| Age ($z_i^w$) | 0.016∗∗∗ (0.003) | 0.011∗∗∗ (0.003) | 0.008∗∗ (0.004) | 0.008∗∗ (0.004) | 0.004 (0.004) |
| Female indicator ($z_i^w$) | -0.199∗∗∗ (0.014) | -0.208∗∗∗ (0.015) | -0.274∗∗∗ (0.018) | -0.271∗∗∗ (0.019) | -0.272∗∗∗ (0.019) |
| African American indicator ($z_i^w$) | -0.026 (0.016) | 0.012 (0.018) | 0.030 (0.022) | 0.022 (0.022) | 0.029 (0.023) |
| Hispanic indicator ($z_i^w$) | 0.064∗∗∗ (0.018) | 0.069∗∗∗ (0.020) | 0.089∗∗∗ (0.023) | 0.064∗∗∗ (0.025) | 0.091∗∗∗ (0.024) |
| Math skills, ASVAB 1980 ($p_i^M$) | 0.135∗∗∗ (0.011) | 0.156∗∗∗ (0.012) | 0.188∗∗∗ (0.014) | 0.180∗∗∗ (0.015) | 0.190∗∗∗ (0.015) |
| Language skills, ASVAB 1980 ($p_i^L$) | 0.042∗∗∗ (0.011) | 0.055∗∗∗ (0.012) | 0.044∗∗∗ (0.015) | 0.046∗∗∗ (0.015) | 0.038∗∗ (0.015) |
| Rotter Locus of Control, 1979 ($p_i^C$) | 0.029∗∗∗ (0.007) | 0.029∗∗∗ (0.008) | 0.034∗∗∗ (0.009) | 0.038∗∗∗ (0.009) | 0.029∗∗∗ (0.010) |
| Occupation O*Net score: Math ($r_k^M$) | 0.057∗∗∗ (0.007) | 0.052∗∗∗ (0.008) | 0.054∗∗∗ (0.009) | 0.062∗∗∗ (0.010) | 0.061∗∗∗ (0.010) |
| Occupation O*Net score: Language ($r_k^L$) | 0.093∗∗∗ (0.008) | 0.115∗∗∗ (0.009) | 0.134∗∗∗ (0.010) | 0.132∗∗∗ (0.011) | 0.127∗∗∗ (0.010) |
| Constant | 1.866∗∗∗ (0.089) | 2.169∗∗∗ (0.110) | 2.557∗∗∗ (0.163) | 2.601∗∗∗ (0.174) | 2.850∗∗∗ (0.200) |
| $R^2$ | 0.27 | 0.28 | 0.32 | 0.30 | 0.30 |
| Average age (years) | 30.7 | 34.6 | 43 | 44.5 | 51.1 |
| Observations | 4796 | 4835 | 3684 | 3641 | 3469 |

Robust standard errors in parentheses. ∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01


Table 7: First Stage - Current Occupation O*Net Score: Math (rMk ) - NLSY79

rMk = θMw zwi + γMM p

Mi + γML p

Li + γMC p

Ci + θMM ′z

M ′i + θML′ z

L′i + uMi

1992 1996 2004 2006 2012

Age (zwi ) 0.000 -0.003 -0.008 -0.009 -0.019∗∗∗

(0.006) (0.006) (0.007) (0.007) (0.007)

Female indicator (zwi ) -0.025 -0.042 -0.095∗∗∗ -0.059∗ -0.017(0.029) (0.028) (0.032) (0.032) (0.034)

African American indicator (zwi ) -0.093∗∗ -0.155∗∗∗ -0.151∗∗∗ -0.164∗∗∗ -0.150∗∗∗

(0.036) (0.035) (0.041) (0.041) (0.042)

Hispanic indicator (zwi ) 0.015 -0.040 0.034 -0.045 -0.039(0.038) (0.038) (0.044) (0.044) (0.045)

Math skills, ASVAB 1980 (pMi ) 0.186∗∗∗ 0.212∗∗∗ 0.156∗∗∗ 0.189∗∗∗ 0.139∗∗∗

(0.024) (0.024) (0.026) (0.026) (0.029)

Language skills, ASVAB 1980 (pLi ) 0.032 -0.008 0.021 -0.003 0.057∗∗

(0.024) (0.023) (0.027) (0.027) (0.028)

Rotter Locus of Control, 1979 (pCi ) 0.013 0.003 0.017 0.004 0.011

(0.015) (0.015) (0.017) (0.017) (0.018)

Occ. Aspiration in 1982: Math (zM′i ) 0.127∗∗∗ 0.129∗∗∗ 0.121∗∗∗ 0.099∗∗∗ 0.108∗∗∗

(0.016) (0.016) (0.018) (0.018) (0.019)

Occ. Aspiration in 1982: Language (zL′i ) 0.006 0.042∗∗ 0.026 0.004 0.042∗

(0.018) (0.018) (0.021) (0.021) (0.022)

Constant 0.067 0.273 0.429 0.481 1.040∗∗∗

(0.189) (0.211) (0.297) (0.315) (0.377)

R2 0.08 0.09 0.08 0.07 0.07

F statistic 32.022 34.250 22.815 14.500 17.835

Observations 4796 4835 3684 3641 3469

Robust standard errors in parentheses.

∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01


Table 8: First Stage - Current Occupation O*Net Score: Language (rLk ) - NLSY79

$$r^L_k = \theta_{Lw} z^w_i + \gamma_{LM} p^M_i + \gamma_{LL} p^L_i + \gamma_{LC} p^C_i + \theta_{LM'} z^{M'}_i + \theta_{LL'} z^{L'}_i + u^L_i$$

1992 1996 2004 2006 2012

Age (zwi ) 0.004 -0.001 0.001 0.001 -0.005

(0.006) (0.005) (0.006) (0.006) (0.007)

Female indicator (zwi ) 0.435∗∗∗ 0.400∗∗∗ 0.446∗∗∗ 0.513∗∗∗ 0.573∗∗∗

(0.026) (0.026) (0.030) (0.030) (0.032)

African American indicator (zwi ) 0.074∗∗ 0.020 0.048 0.032 0.068∗

(0.033) (0.032) (0.037) (0.037) (0.038)

Hispanic indicator (zwi ) 0.132∗∗∗ 0.169∗∗∗ 0.194∗∗∗ 0.166∗∗∗ 0.151∗∗∗

(0.035) (0.035) (0.040) (0.039) (0.041)

Math skills, ASVAB 1980 (pMi ) 0.198∗∗∗ 0.169∗∗∗ 0.156∗∗∗ 0.185∗∗∗ 0.189∗∗∗

(0.021) (0.021) (0.025) (0.024) (0.026)

Language skills, ASVAB 1980 (pLi ) 0.131∗∗∗ 0.157∗∗∗ 0.209∗∗∗ 0.183∗∗∗ 0.191∗∗∗

(0.021) (0.021) (0.025) (0.025) (0.025)

Rotter Locus of Control, 1979 (pCi ) 0.021 0.032∗∗ 0.025 0.011 0.023

(0.013) (0.013) (0.015) (0.015) (0.016)

Occ. Aspiration in 1982: Math (zM′i ) -0.004 0.003 -0.017 -0.032∗∗ -0.011

(0.015) (0.014) (0.016) (0.016) (0.017)

Occ. Aspiration in 1982: Language (zL′i ) 0.164∗∗∗ 0.182∗∗∗ 0.183∗∗∗ 0.153∗∗∗ 0.166∗∗∗

(0.017) (0.017) (0.020) (0.019) (0.021)

Constant -0.409∗∗ -0.207 -0.305 -0.295 -0.021

(0.174) (0.189) (0.271) (0.278) (0.340)

R2 0.20 0.21 0.22 0.24 0.24

F statistic 44.538 56.679 42.793 32.190 30.665

Observations 4796 4835 3684 3641 3469

Robust standard errors in parentheses.

∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01


Table 9: IV models - Ln of Hourly Rate of Pay - NLSY79

$$\ln(w_{ki}) = \theta_w z^w_i + \beta_M p^M_i + \beta_L p^L_i + \beta_C p^C_i + \alpha_M \hat{r}^M_k + \alpha_L \hat{r}^L_k + e_i$$

1992 1996 2004 2006 2012

Age (zwi ) 0.015∗∗∗ 0.012∗∗∗ 0.011∗∗ 0.012∗∗ 0.011∗∗

(0.003) (0.004) (0.005) (0.006) (0.005)

Female indicator (zwi ) -0.211∗∗∗ -0.245∗∗∗ -0.302∗∗∗ -0.355∗∗∗ -0.318∗∗∗

(0.033) (0.036) (0.050) (0.065) (0.063)

African American indicator (zwi ) -0.013 0.038 0.068∗∗ 0.079∗∗ 0.061∗

(0.021) (0.024) (0.033) (0.038) (0.034)

Hispanic indicator (zwi ) 0.050∗∗ 0.054∗∗ 0.039 0.041 0.083∗∗

(0.023) (0.027) (0.035) (0.039) (0.034)

Math skills, ASVAB 1980 (pMi ) 0.069∗∗∗ 0.068∗∗∗ 0.090∗∗∗ 0.038 0.117∗∗∗

(0.023) (0.024) (0.028) (0.047) (0.029)

Language skills, ASVAB 1980 (pLi ) 0.027∗ 0.035∗ 0.002 0.004 -0.001

(0.016) (0.018) (0.028) (0.031) (0.026)

Rotter Locus of Control, 1979 (pCi ) 0.024∗∗∗ 0.024∗∗∗ 0.025∗∗ 0.035∗∗∗ 0.023∗

(0.008) (0.009) (0.012) (0.013) (0.012)

Occupation O*Net score: Math (r̂Mk ) 0.310∗∗∗ 0.318∗∗∗ 0.445∗∗∗ 0.540∗∗∗ 0.393∗∗∗

(0.066) (0.073) (0.100) (0.147) (0.109)

Occupation O*Net score: Language (r̂Lk ) 0.145∗∗ 0.230∗∗∗ 0.275∗∗∗ 0.344∗∗∗ 0.216∗∗

(0.060) (0.064) (0.081) (0.108) (0.091)

Constant 1.863∗∗∗ 2.096∗∗∗ 2.411∗∗∗ 2.415∗∗∗ 2.477∗∗∗

(0.105) (0.130) (0.217) (0.255) (0.271)

Average age (years) 30.7 34.6 43 44.5 51.1

Durbin-Wu-Hausman (DWH) test 10.765 15.816 18.185 13.858 9.112

DWH p-value < 0.01 < 0.01 < 0.01 < 0.01 < 0.01

Observations 4796 4835 3684 3641 3469

Robust standard errors in parentheses.

Null hypothesis of the Durbin-Wu-Hausman test: exogeneity of rMk and rLk in the wage equation.

∗ p < 0.10, ∗∗ p < 0.05, ∗∗∗ p < 0.01
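The first-stage and IV tables follow the usual two-stage least squares logic: the endogenous occupation score is projected on the instrument and the controls, and the fitted score then replaces the observed one in the wage equation. The sketch below illustrates that mechanic on synthetic data with one endogenous regressor and one instrument. All variable names, coefficients and the data-generating process are assumptions chosen for illustration; this is not the paper's data or estimation code.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on X (X already includes a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (illustrative only).
aspiration = rng.normal(size=n)        # instrument: early occupational aspiration
taste = rng.normal(size=n)             # unobserved factor driving choice and wages
occ_score = 0.5 * aspiration + taste   # endogenous occupation score
true_alpha = 0.3
log_wage = 2.0 + true_alpha * occ_score + 0.5 * taste + rng.normal(size=n)

const = np.ones(n)

# Naive OLS: biased, because occ_score is correlated with the error term.
alpha_ols = ols(np.column_stack([const, occ_score]), log_wage)[1]

# First stage: project the occupation score on the instrument.
Z = np.column_stack([const, aspiration])
occ_hat = Z @ ols(Z, occ_score)

# Second stage: use the fitted score in place of the observed one.
alpha_iv = ols(np.column_stack([const, occ_hat]), log_wage)[1]

print(f"OLS: {alpha_ols:.3f}  2SLS: {alpha_iv:.3f}  true: {true_alpha}")
```

In this synthetic setup the OLS coefficient is pushed away from the true value while the two-stage estimate stays close to it. Running the second stage by hand gives the right point estimate but not valid standard errors, so applied work would normally rely on a dedicated IV routine.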


Figure 7: Wage return to math skills, ∂ln(wki ) / ∂pMi

Notes:

- Direct return = β̂M from Table 9 (IV).

- Occupational sorting: Language = α̂Lγ̂LM .

- Occupational sorting: Math = α̂M γ̂MM .

- α̂L and α̂M come from Table 9 (IV).

- γ̂LM comes from Table 8 (First stage for rLk ).

- γ̂MM comes from Table 7 (First stage for rMk ).
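The notes above define the total return as the direct return plus the two sorting terms. As a rough check, the sketch below plugs the 2012 point estimates for math skills from Tables 7 to 9 into that formula. The function name `decompose_return` and its dictionary keys are hypothetical, and the calculation ignores sampling error.

```python
def decompose_return(beta_direct, alpha_math, gamma_math, alpha_lang, gamma_lang):
    """Split a total wage return into a direct part and two sorting parts."""
    sorting_math = alpha_math * gamma_math
    sorting_lang = alpha_lang * gamma_lang
    total = beta_direct + sorting_math + sorting_lang
    return {
        "direct": beta_direct,
        "sorting_math": sorting_math,
        "sorting_language": sorting_lang,
        "total": total,
        "sorting_share": (sorting_math + sorting_lang) / total,
    }

# 2012 point estimates for math skills, read from Tables 7 to 9.
math_2012 = decompose_return(
    beta_direct=0.117,  # beta_M, Table 9
    alpha_math=0.393,   # alpha_M, Table 9
    gamma_math=0.139,   # gamma_MM, Table 7
    alpha_lang=0.216,   # alpha_L, Table 9
    gamma_lang=0.189,   # gamma_LM, Table 8
)
print({name: round(value, 3) for name, value in math_2012.items()})
```

With these inputs the sorting share comes out at roughly 45% of a total return of about 0.21, in line with the figure quoted in the abstract for 2012.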


Figure 8: Wage return to language skills, ∂ln(wki ) / ∂pLi

Notes:

- Direct return = β̂L from Table 9 (IV).

- Occupational sorting: Language = α̂Lγ̂LL .

- Occupational sorting: Math = α̂M γ̂ML .

- α̂L and α̂M come from Table 9 (IV).

- γ̂LL comes from Table 8 (First stage for rLk ).

- γ̂ML comes from Table 7 (First stage for rMk ).
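As a worked illustration of these notes, the 2012 point estimates from Tables 7 to 9 can be combined by hand. This is a back-of-the-envelope calculation that ignores sampling error, and the direct return used here is not statistically different from zero.

$$\frac{\partial \ln(w_{ki})}{\partial p^L_i} = \hat{\beta}_L + \hat{\alpha}_L \hat{\gamma}_{LL} + \hat{\alpha}_M \hat{\gamma}_{ML} = -0.001 + 0.216 \times 0.191 + 0.393 \times 0.057 \approx -0.001 + 0.0413 + 0.0224 \approx 0.063$$

Under these inputs essentially the whole return to language skills in 2012 comes from the two sorting terms, which is consistent with the abstract's statement that occupational sorting accounts for most of the return to language skills.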


Figure 9: Wage return to developing an internal locus of control, ∂ln(wki ) / ∂pCi

Notes:

- Direct return = β̂C from Table 9 (IV).

- Occupational sorting: Language = α̂Lγ̂LC .

- Occupational sorting: Math = α̂M γ̂MC .

- α̂L and α̂M come from Table 9 (IV).

- γ̂LC comes from Table 8 (First stage for rLk ).

- γ̂MC comes from Table 7 (First stage for rMk ).


8 Appendixes

8.1 Job market equilibrium

There are two markets: a market for skills and a market for occupations. The distribution of skills among the labor force is represented by a cumulative distribution function $F(p^1, \ldots, p^S) = \Pr(p^1_i \le p^1, \ldots, p^S_i \le p^S)$. The distribution of occupations available in the economy is described by $G(r^1, \ldots, r^S) = \Pr(r^1_k \le r^1, \ldots, r^S_k \le r^S)$. The support of $G$ is the set of feasible occupations, $\Lambda$.

Skills Market

Function $F(p^1, \ldots, p^S)$ defines the supply of skills. It measures the fraction of the workforce for which vector $(p^1, \ldots, p^S)$ represents an upper bound on their skills. The demand for skills is more complex: the worker selection problem is solved for each of the occupations on the support of function $G$. The solution to the worker selection problem (function $P^s$ from Equation 2) indicates the optimal skill profile of the worker that should be hired to perform in each occupation. Therefore, there is a subset of occupations for which the optimal skill profile is equal to or below vector $(p^1, \ldots, p^S)$. The demand for skills is derived from this subset of occupations, denoted by $\Gamma$.

The equilibrium condition in the market for skills is then defined as

$$F(p^1, \ldots, p^S) \ge \int \cdots \int_{\Gamma} dG(r^1_k, \ldots, r^S_k)$$

$$\Gamma = \left\{ (r^1_k, \ldots, r^S_k) : P^s(r^1_k, \ldots, r^S_k; W) \le p^s, \ \forall s \right\}$$

Occupations Market

The equilibrium in the market for occupations is very similar. In this case, the supply of occupations available in the economy is measured by function $G(r^1, \ldots, r^S)$. The solution to the occupational choice problem drives the demand for occupations (function $R^s$ from Equation 4). There is a subset of individuals in the support of $F$ who would like to work in occupations where the skill requirements are equal to or below vector $(r^1, \ldots, r^S)$. I will denote this subset as $\Delta$. Thus, the equilibrium condition in this market is given by


$$G(r^1, \ldots, r^S) \ge \int \cdots \int_{\Delta} dF(p^1_i, \ldots, p^S_i)$$

$$\Delta = \left\{ (p^1_i, \ldots, p^S_i) : R^s(p^1_i, \ldots, p^S_i; W) \le r^s, \ \forall s \right\}$$

Wage function, $W(p^1_i, \ldots, p^S_i, r^1_k, \ldots, r^S_k)$

The price mechanism of both markets is summarized in the wage function, $W$. Supply is inelastic in both of them, but the demand for skills and the demand for occupations respond to changes in the wage function. $W$ and all its properties are determined as an equilibrium outcome. The wage function must be such that the equilibrium conditions are met both in the skills market and in the occupations market. All the first-order conditions from the worker selection problem (Equation 1) and the occupational choice problem (Equation 3) must also hold.

8.2 Crosswalks and merged O*NET / NLSY79 data

The NLSY79 has always classified occupations following a version of the Census classification system. The O*Net uses a modified version of the Standard Occupational Classification (SOC) system. Occupations are defined in more detail under the SOC system than under the Census system. As a consequence, O*Net collects data for more than 800 occupational codes, whereas the NLSY79 recognizes approximately 400 occupations.

• From 8-digit O*Net-SOC 2010 to 6-digit SOC 2010 (Step 1): The 2010 SOC system consists of 23 major groups, 97 minor groups, 461 broad occupations and 840 detailed occupations. It is a hierarchical system: each major group is divided into minor groups, minor groups are divided into broad occupations, and broad occupations are divided into detailed occupations. The hierarchical structure is summarized in a 6-digit coding system, in which the first two digits indicate the major group, the third digit represents the minor group, the fourth and fifth digits correspond to the broad occupation and the sixth digit signals the detailed occupation (SOCPC, 2010). For example, trailer truck drivers (53-3032) is a detailed occupation contained inside a broad occupation called "Driver/Sales Workers and Truck Drivers" (53-3030), which is part of a minor group called "Motor Vehicle Operators" (53-3000). This minor group belongs to a major group called "Transportation and Material Moving Occupations" (53-0000).


The classification system used by O*Net is heavily based on the SOC system. It uses an 8-digit code. The first 6 digits correspond to the equivalent 6-digit SOC 2010 detailed occupation. The seventh and eighth digits are used when a more refined definition of the occupation is needed, especially in the case of new or rapidly growing occupations. As an example, baristas have a stand-alone occupation in O*Net (35-3022.01), but they are not a detailed occupation under the SOC. The corresponding SOC detailed occupation is "Counter Attendants, Cafeteria, Food Concession, and Coffee Shop" (35-3022). As another example, the SOC detailed occupation of "Clinical, Counseling, and School Psychologists" (19-3031) is broken down in O*Net into three separate occupations: school psychologists (19-3031.01), clinical psychologists (19-3031.02) and counseling psychologists (19-3031.03). If no refinement is needed, then the seventh and eighth digits are equal to "00".

In most cases, there is a one-to-one correspondence between an 8-digit O*Net-SOC code and a 6-digit SOC 2010 detailed occupation. In the other cases, when the last two digits are different from "00", the equivalent O*Net score for the 6-digit SOC detailed occupation is equal to the average score among those 8-digit occupations that share their first six digits.

• From 6-digit SOC 2010 to 6-digit SOC 2000 (Step 2): The Bureau of Labor Statistics published a crosswalk between the 2010 SOC and the 2000 SOC in February 2010. The crosswalk is publicly available (http://www.bls.gov/soc/soccrosswalks.htm). There is a one-to-one correspondence between most detailed occupations from both systems. However, in some cases, a 2000 SOC detailed occupation was divided into two or more titles in the 2010 classification. If so, then the O*Net score for the 2000 SOC occupation corresponds to the average score of its related 2010 SOC titles. For example, Registered Nurses are coded as 29-1111 under the 2000 SOC. This title was divided into four detailed occupations in the 2010 SOC system: stand-alone Registered Nurses (29-1141), Nurse Anesthetists (29-1151), Nurse Midwives (29-1161) and Nurse Practitioners (29-1171).

• From 6-digit SOC 2000 to 4-digit Census 2000 (Step 3): Each SOC occupational code corresponds to only one Census code, but most Census codes are related to more than one SOC code. 4-digit Census codes can be interpreted as consolidations of 6-digit SOC codes. Thus, O*Net scores must be aggregated somehow. Acemoglu and Autor (2011) faced the same problem and proposed a solution based on data from the Occupational Employment Statistics (OES). I used the SOC-Census crosswalk created by Acemoglu and Autor, available at http://economics.mit.edu/faculty/dautor/data/acemoglu.

The OES reports total employment by occupation in the United States at the 6-digit SOC level and is included in the SOC-Census crosswalk by Acemoglu and Autor. Their key idea is to use total employment as weights. Therefore, the O*Net score of a 4-digit Census occupation is equal to the employment-weighted average O*Net score of the corresponding 6-digit SOC occupations. Additionally, I standardized all O*Net scores using employment-weighted averages and standard deviations. Table 10 presents as an example the case of Electrical and Electronic Engineers (4-digit 2000 Census code 1410).

Table 10: From 6-digit SOC 2000 to 4-digit Census 2000: the case of Electrical and Electronic Engineers.

O*Net importance scores (1 to 5 scale)

Occupation (6-digit SOC)   Employment   Knowledge: Math   Knowledge: Language   Work Styles: Persistence

Electrical Engineers (172071)   441,390   4.25   3.92   4.38

Electronics Engineers (172072)   395,800   4.06   3.67   3.85

Employment-weighted average, all occ.   2.91   3.39   3.71

Employment-weighted std. deviation, all occ.   0.62   0.60   0.45

O*Net standardized scores (Z scores)

Occupation (6-digit SOC)   Weights   Knowledge: Math   Knowledge: Language   Work Styles: Persistence

Electrical Engineers (172071)   0.53   2.17   0.88   1.48

Electronics Engineers (172072)   0.47   1.86   0.46   0.32

Aggregated O*Net scores (Z scores)

Occupation (4-digit Census)   Knowledge: Math   Knowledge: Language   Work Styles: Persistence

Electrical and Electronics Engineers (1410)   2.20   0.69   0.93

Note: calculations based on the SOC-Census crosswalk created by Acemoglu and Autor (2011).

• From 4-digit Census 2000 to 4-digit Census 1990 (Step 4): The Minnesota Population Center (MPC) has a rich set of crosswalks for the different Census classification systems published during the second half of the twentieth century. The crosswalks are publicly available at https://usa.ipums.org/usa/volii/ and are explained by Meyer and Osborne (2005). There is an MPC crosswalk that links 4-digit Census 2000 codes with the equivalent 4-digit Census 1990 codes. Furthermore, the crosswalk includes total employment for each occupation in the 1990 classification system. The availability of employment data allowed me to implement Acemoglu and Autor's methodology. That is, in those cases where a 4-digit 1990 code corresponds to more than one 4-digit 2000 code, I use total employment to generate a new weighted average O*Net score.


• From 4-digit Census 1990 to 3-digit Census 1970 (Step 5) and 3-digit Census 1980 (Step 6): The MPC crosswalk explained by Meyer and Osborne (2005) summarizes the links between the 1990 Census codes and the Census classifications from other decades. In particular, each 3-digit code in the 1970 Census system corresponds to one 4-digit 1990 Census code. A similar property holds for the 1980 classification system. Therefore, the last step assigns to each 1970 / 1980 Census code the O*Net score computed for the corresponding 1990 Census occupation. This last step is critical because of the Census codes historically used by the NLSY79: the 1970 system was used between 1979 and 1993, the 1980 system was used between 1982 and 2000, and the 2000 system has been used since 2002.
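The two aggregation rules used in these steps, a plain average when several refined codes share a parent code and an employment-weighted average of standardized scores when SOC codes are consolidated into a Census code, can be sketched as follows. This is a minimal illustration, not the code used for the paper: the helper `weighted_mean` and all variable names are hypothetical, the psychologist scores are made-up placeholders, and the Step 3 inputs are the rounded math figures from Table 10.

```python
def weighted_mean(values, weights):
    """Weighted average; the weights need not sum to one."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Step 1 style: plain average over 8-digit codes sharing a 6-digit prefix.
# These three scores are placeholders, not O*Net values.
onet_scores = {"19-3031.01": 3.0, "19-3031.02": 3.4, "19-3031.03": 3.2}
by_soc6 = {}
for code, score in onet_scores.items():
    by_soc6.setdefault(code[:7], []).append(score)  # "19-3031.01" -> "19-3031"
soc6_scores = {soc: sum(s) / len(s) for soc, s in by_soc6.items()}

# Step 3 style: standardize, then take the employment-weighted average.
# Inputs are the math column of Table 10 (rounded as printed).
employment = {"172071": 441_390, "172072": 395_800}
math_raw = {"172071": 4.25, "172072": 4.06}
mean_all, sd_all = 2.91, 0.62  # employment-weighted moments, all occupations

math_z = {soc: (x - mean_all) / sd_all for soc, x in math_raw.items()}
census_1410 = weighted_mean(
    [math_z[soc] for soc in employment],
    [employment[soc] for soc in employment],
)

print(soc6_scores)
print({soc: round(z, 2) for soc, z in math_z.items()}, round(census_1410, 2))
```

Recomputed from the rounded inputs, the standardized math scores are within rounding of the 2.17 and 1.86 printed in Table 10, and the aggregate is about 2.02. That differs from the 2.20 printed in the table. Since a weighted average of 2.17 and 1.86 cannot exceed 2.17, the printed value may be a transposition.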


References

Acemoglu, D. and Autor, D. (2011). Skills, tasks and technologies: Implications for employment and earnings. Handbook of Labor Economics, 4b:1043–1171.

Almlund, M., Duckworth, A. L., Heckman, J., and Kautz, T. (2011). Personality psychology and economics. Technical report.

Becker, G. S. (1993). Human Capital: A Theoretical and Empirical Analysis with Special Reference to Education. University of Chicago Press.

Ben-Porath, Y. (1967). The production of human capital and the life cycle of earnings. Journal of Political Economy, 75(4):352–365.

BLS (2014). Bureau of Labor Statistics, U.S. Department of Labor, Occupational Employment Statistics.

Bowles, S., Gintis, H., and Osborne, M. (2001). The determinants of earnings: A behavioral approach. Journal of Economic Literature, 39(4):1137–1176.

Budd, J. W. (2011). The Thought of Work. Cornell University Press.

Card, D. (2001). Estimating the return to schooling: Progress on some persistent econometric problems. Econometrica, 69(5):1127–1160.

Cawley, J., Heckman, J., and Vytlacil, E. (2001). Three observations on wages and measured cognitive ability. Labour Economics, 8(4):419–442.

Cobb-Clark, D. A. and Tan, M. (2011). Noncognitive skills, occupational attainment, and relative wages. Labour Economics, 18(1):1–13.

ESC (2010). The O*NET content model: detailed outline with descriptions. Prepared by the National Center for O*NET Development for US Department of Labor.

Griliches, Z. (1977). Estimating the returns to schooling: Some econometric problems. Econometrica, 45(1):1–22.

Heckman, J. and Honore, B. (1990). The empirical content of the Roy model. Econometrica, 58(5):1121–1149.

Heckman, J., LaLonde, R., and Smith, J. (1999). The economics and econometrics of active labor market programs. Handbook of Labor Economics, 3:1865–2097.

Heckman, J., Stixrud, J., and Urzua, S. (2006). The effects of cognitive and noncognitive abilities on labor market outcomes and social behavior. Journal of Labor Economics, 24(3):411–482.


Kremer, M. (1993). The O-ring theory of economic development. The Quarterly Journal of Economics, 108(3):551–575.

Lazear, E. P. (2009). Firm-specific human capital: A skill-weights approach. Journal of Political Economy, 117(5):914–940.

Meyer, P. B. and Osborne, A. M. (2005). Proposed category system for 1960-2000 census occupations. Bureau of Labor Statistics Working Paper, (383).

Mincer, J. (1974). Schooling, Experience and Earnings. Columbia University Press.

Neal, D. A. and Johnson, W. R. (1996). The role of premarket factors in black-white wage differences. The Journal of Political Economy, 104(5):869–895.

NRC (2010). A Database for a Changing Economy: Review of the Occupational Information Network (O*NET). The National Academies Press. Panel to Review the Occupational Information Network (O*NET).

O*Net-Partnership (2011). O*NET database releases archive. Version 16.0.

Peterson, N. G., Mumford, M. D., Borman, W. C., Jeanneret, P. R., Fleishman, E. A., Levin, K. Y., Campion, M. A., Mayfield, M. S., Morgeson, F. P., Pearlman, K., et al. (2001). Understanding work using the Occupational Information Network (O*NET): Implications for practice and research. Personnel Psychology, 54(2):451–492.

Rosen, S. (1986). The theory of equalizing differences. Handbook of Labor Economics, 1:641–692.

Rotter, J. B. (1966). Generalized expectancies for internal versus external control of reinforcement. Psychological Monographs: General and Applied, 80(1):1.

Roy, A. D. (1951). Some thoughts on the distribution of earnings. Oxford Economic Papers, 3(2):135–146.

SOCPC (2010). 2010 SOC user guide.

Spence, M. (1973). Job market signaling. The Quarterly Journal of Economics, 87(3):355–374.

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. The MIT Press, second edition.
