
Estimation of a Life-Cycle Model with Human Capital, Labor Supply and Retirement∗

Xiaodong Fan (Monash University)

Ananth Seshadri (University of Wisconsin-Madison)

Christopher Taber (University of Wisconsin-Madison)

November 30, 2019

Abstract

We develop and estimate a life-cycle model in which individuals make decisions about consumption, human capital investment, and labor supply. The most novel aspect of our paper is the focus on human capital towards the end of the life cycle. Retirement arises endogenously as part of the labor supply decision. The model allows for both an endogenous wage process through human capital investment (which is typically assumed exogenous in the retirement literature) and an endogenous retirement decision (which is typically assumed exogenous in the human capital literature), and it accounts for the Social Security system. We estimate the model using Indirect Inference to match the life-cycle profiles of employment and measured wages from the SIPP data. The model replicates the main features of the data, in particular the large increase in measured wages and small increase in labor supply at the beginning of the life cycle as well as the small decrease in measured wages but large decrease in labor supply at the end of the life cycle. We use the model to estimate the effects of various changes to tax and Social Security policies and show that allowing for human capital accumulation is critical.

KEYWORDS: human capital, Ben-Porath, labor supply, retirement

JEL Classification: J22, J24, J26

∗We would like to thank the editor (James Heckman), four anonymous referees, and seminar participants at the SED, Tokyo, Yale, SOLE, CEPAR, Reading, Royal Holloway, Catholic University of Milan, Minnesota, Toulouse, Cornell, and Johns Hopkins for helpful comments and suggestions. This research was undertaken with the assistance of resources and services from the National Computational Infrastructure (NCI), which is supported by the Australian Government. All remaining errors are our own.

1 Introduction

The Ben-Porath (1967) model of life-cycle human capital production and the life-cycle labor supply model are two of the most important models in labor economics. The former is the dominant framework used to rationalize wage growth over the life cycle; the latter has been used to study hours worked over the life cycle, including retirement. Quite surprisingly, aside from the seminal work in Heckman (1975, 1976), there has been little effort integrating these two important paradigms. This paper attempts to fill this void by estimating a life-cycle model in which workers choose human capital and labor supply jointly. An important aspect of our model is that we do not treat retirement as a separate decision in the model, nor do we treat it differently in the data. We treat it as a word that refers to low levels of labor supply late in life. In our model this declining labor supply over the life cycle occurs endogenously as part of the optimal life-cycle labor supply decision.

There has been work examining models of learning-by-doing and labor supply, most notably Imai and Keane (2004). However, these papers focus on the early part of the life cycle. The most novel aspect of our paper is examining human capital towards the end of working life. This is important as the retirement literature typically takes the wage process as given and estimates the incidence of retirement (e.g., Gustman and Steinmeier, 1986; Rust and Phelan, 1997; French, 2005; French and Jones, 2011). Cross-section raw wages for people who work fall substantially before retirement. They decline by over 25% between ages 55 and 65 (French, 2005). In much of the retirement literature, this trend is critical to understanding retirement behavior. By contrast, life-cycle human capital models take the retirement date as given, but model the formation of the wage process (e.g., Ben-Porath, 1967; Heckman, 1975, 1976; Heckman et al., 1998a; Manuelli et al., 2012). We estimate a model wherein the wage and labor supply choices are rationalized in one unified setting accounting for the Social Security system. After endogenizing both labor supply and human capital, our model is rich enough to explain the life-cycle patterns of both wages and labor supply, with a focus on wage patterns and declining labor supply (i.e. retirement) at the end of working life.

Specifically, we develop and estimate a Ben-Porath type human capital model in which workers make consumption, human capital investment, and labor supply decisions. We estimate the model using Indirect Inference, matching the measured wage and labor supply profiles of male high school graduates from the Survey of Income and Program Participation (SIPP). With a parsimonious life-cycle model in which the taste for leisure does not depend upon age or experience, we are able to replicate the main features of the data. In particular, we match the large increase in measured wages and very small increase in labor supply at the beginning of the life cycle as well as the small decrease in measured wages but very large decrease in labor supply at the end of the life cycle.

One important ingredient in our ability to fit both ends of the life cycle is human capital depreciation. We take the definition of depreciation to be broad: it could be individuals’ skills literally declining or it could be obsolescence of their skills as the nature of their work changes. The distinction is not important in our context. In a simple model without human capital depreciation, there is no a priori reason for workers to concentrate their leisure towards the end of the life cycle. However, this is no longer the case with human capital depreciation, which imposes a shadow cost on leisure. When workers take time off in the middle of their career, their human capital depreciates and they earn less when they return to the labor market. On the other hand, if this period of nonworking occurs at the end of the career, the shadow cost is much less of a concern because the horizon is shorter. Older workers may choose not to re-enter at a lower wage, so they continue to stay out of the labor market.

An interesting aspect of our model is that even though the preference for leisure does not vary systematically over the life cycle, we do find that measured “labor supply elasticities” do vary over the life cycle. In our dynamic model, the shadow cost of not working is much higher early in the life cycle (as pointed out by e.g. Imai and Keane, 2004) and it is lower for older workers as opposed to peak earners. We find that early in the life cycle the measured labor supply elasticity is low, around 0.2. However, workers around the standard retirement age are more sensitive to wage fluctuations, with elasticities between 0.6 and 1.0.

While our baseline model does not incorporate health, we estimate a specification that allows the taste for leisure to depend on health and for this effect to increase with age. We show that while becoming unhealthy has a large effect on labor supply, health shocks are relatively uncommon in the years in which labor supply declines (ages 50–65). As a result, health plays a relatively minor role in explaining the decline in labor supply late in life. We also estimate another model which allows for part-time labor supply. This does not substantially change the results, in large part because part-time work is relatively uncommon for the workers in our sample.

We use the estimated model to simulate the impacts of various Social Security policy changes. A substantial body of work quantitatively estimates the economic consequences of an aging population and evaluates remedial policies (Gustman and Steinmeier, 1986; Rust and Phelan, 1997; French, 2005; French and Jones, 2011; Haan and Prowse, 2014). These papers model retirement as a result of combinations of declining wages, increasing actuarial unfairness of the Social Security and pension system, and increasing tastes for leisure. However, there is a major difference between our model and the previous retirement literature. Prior work typically takes the wage process as given and focuses on the retirement decision itself. For example, when conducting the counterfactual experiment of reducing the Social Security benefit by 20%, the previous literature takes the same age-wage profile as in the baseline model and re-estimates the retirement behavior under the new environment. As the wage has already been declining significantly and exogenously approaching the retirement age, under the new policy working is still unlikely to be attractive for many workers. However, as we show in our model, less generous Social Security benefits result in higher labor supply later in the life cycle, so workers adjust their investment over the life cycle, which results in a higher human capital level as well as higher labor supply earlier. In an experiment in which Social Security benefits are decreased by 20%, the measured wage levels are 2%–7% higher between ages 60 and 75. Over the whole life cycle, measured average yearly wages, total pre-tax labor income, total employment rates, and human capital investment increase by 1.80%, 1.99%, 1.47%, and 1.50% respectively.

2 Relevant Literature

Human capital models have been widely accepted as a mechanism to explain life-cycle wage growth as well as labor supply and income patterns. In his seminal paper, Ben-Porath (1967) develops the human capital model with the idea that individuals invest in their human capital “up front.” In what follows we often use the term “human capital model” to mean “Ben-Porath model.” Heckman (1975, 1976) extends the model and presents more general human capital models in which each individual makes decisions on labor supply, investment and consumption. In both papers, each individual lives for a finite number of periods and the retirement age is fixed. Manuelli et al. (2012) calibrate a Ben-Porath model to include the endogenous retirement decision. All three models are deterministic.

Relative to the success in theory, there has not been much work empirically estimating the Ben-Porath model. Mincer (1974) derives an approximation of the Ben-Porath model and greatly simplifies the estimation with a quadratic in experience, which is used in numerous empirical papers estimating the wage process (Heckman et al. 2006 survey the literature). Early work on explicit estimation of the Ben-Porath model was done by Heckman (1975, 1976), Haley (1976), and Rosen (1976). Heckman et al. (1998a) is a more recent attempt to estimate the Ben-Porath model. They utilize the implication of the standard Ben-Porath model that investment is almost zero at old ages. However, this implication no longer holds when retirement is uncertain, in which case each individual always has an incentive to invest a positive amount in human capital. Browning et al. (1999) survey much of this literature.¹

Another type of human capital model, the learning-by-doing model, has drawn relatively more attention in empirical work. In the learning-by-doing model human capital accumulates exogenously, but only when an individual works. Thus workers can only affect their human capital accumulation through the work decision. In these models, the total cost of leisure is not only the direct lost earnings at the current time, but also includes the additional lost future earnings from the lower level of human capital. Shaw (1989) is among the first to empirically estimate the learning-by-doing model, using PSID data and utilizing the Euler equations on consumption and labor supply with translog utility. Keane and Wolpin (1997) and Imai and Keane (2004) are two classic examples of research that directly estimates a dynamic life-cycle model with learning-by-doing. Blundell et al. (2016) is a more recent example. These papers assume an exogenously fixed retirement age. Wallenius (2011) points out that such a learning-by-doing model does not fit the pattern of wages and hours well at old ages.² Heckman et al. (2003) study the potential effects of wage subsidies on skill formation by comparing on-the-job training models with learning-by-doing models. They simulate the effects of the 1994 EITC schedule for families with two children and find evidence that the EITC lowers the long-term wages of people with low levels of education. They contrast the Ben-Porath style model predictions of the EITC policy effects with those of the learning-by-doing model. While learning-by-doing fits better for more educated women, the Ben-Porath style model fits better for less educated women.

There is a large and growing literature on many aspects of retirement. In these models, retirement is typically induced either by increasing utility toward leisure (e.g. Gustman and Steinmeier, 1986) or increasing disutility toward labor supply (e.g. Blau, 2008). Haan and Prowse (2014) estimate the extent to which the increase in life expectancy affects retirement. Blau (2008) evaluates the role of uncertain retirement ages in the retirement-consumption puzzle.

Retirement can also be induced by declining wages at old ages and/or fixed costs of working. Rust and Phelan (1997) estimate a dynamic life-cycle labor supply model with endogenous retirement decisions to study the effect of Social Security and Medicare on retirement behavior. French (2005) estimates a more comprehensive model including savings to study the effect of Social Security and pensions as well as health on retirement decisions. French and Jones (2011) evaluate the role of health insurance in shaping retirement behavior. Casanova (2010) studies the joint retirement decision among married couples. Prescott et al. (2009) and Rogerson and Wallenius (2013) present models where retirement could be induced by a convex effective labor function or fixed costs.

¹ Other more recent work includes Taber (2002), who incorporates progressive income taxes into the estimation, and Kuruscu (2006), who estimates the model nonparametrically.

² However, if one interprets the hourly wages as labor income and hours as employment rates (since there is no participation decision in their model), the fit in Imai and Keane (2004) would be improved at older ages.

In all the retirement literature listed above, theoretical or empirical, the wage process is assumed to be exogenous. That is, even when the environment changes while conducting counterfactual experiments, for example changing the Social Security policies, the wage process is kept the same and only the response in the retirement decision is studied. Studying the 1999 pension reform in Germany, Gohl et al. (2019) find that this assumption may be wrong. In response to an exogenous increase in the early retirement age from 60 to 63, employed women aged 53–60 increased their human capital investment significantly. This will likely change the wage profile.

3 Model

3.1 Overview

The model is a finite-horizon life-cycle model. The main features that individuals choose are

• Human capital investment
• Labor supply (extensive margin)
• Consumption/savings

We add several other features to the baseline model both for fitting the data and for realism

• Social Security benefits/taxes
• Exogenous marriage and spousal labor supply
• Bequest motive
• Consumption floor

3.2 Environment and Econometric Specification

Demographics

Time is discrete and measured in years. Each individual $i$ lives from period $t = 0$ to $t = T$. We use $i$ and $t$ subscripts to make clear how parameters vary across individuals and time. At the beginning of the initial period, each individual is endowed with an initial asset $A_{i0} \in \mathbb{R}$ and an initial human capital level $H_{i0} \in \mathbb{R}_{+}$.

Family status is an exogenous discrete state variable that can take three different values. A single or divorced individual is denoted by $M_{it} = 0$, while a married individual is indicated by either $M_{it} = 1$ (spouse not working) or $M_{it} = 2$ (spouse working). Each individual is single at the beginning of the life cycle, $M_{i0} = 0$. Family status evolves following an age-dependent Markov transition matrix.

Preferences

In the baseline model we focus on the extensive margin of labor supply only, so in each period the individual decides either to work or not. The flow utility at period $t$ is

$$u_t(c_{it}, \ell_{it}, ssa_{it}; \gamma_{it}, M_{it}) = \psi_{tM_{it}} \frac{c_{it}^{1-\eta_c}}{1-\eta_c} + \gamma_{it}\,\ell_{it} + v_t\, ssa_{it} \tag{1}$$

where $c_{it}$ is family consumption, $\ell_{it} \in \{0, 1\}$ is leisure, and $ssa_{it}$ is a dummy variable indicating whether the individual starts claiming Social Security benefits. We mention again that retirement is not modeled explicitly: it is a word that one may use to describe the status $\ell_{it} = 1$ for older workers, but we do not model this any differently than any other period of non-employment.

The coefficient $\psi_{tm}$ shifts the marginal utility of consumption (e.g., Gourinchas and Parker, 2002) and is assumed to take the parametric form

$$\psi_{tm} = \exp\left(\phi_1 t + \phi_2 t^2 + \phi_3 t^3 + \phi_4 \mathbf{1}\{m \neq 0\}\right) \tag{2}$$

Note that the shifter depends upon marital status.

The coefficient $\gamma_{it}$ represents the taste for leisure and also depends on family status. We use the parametric form

$$\gamma_{it} = \exp\left(a_{i0} + \sum_{j=1}^{2} a_j \mathbf{1}\{M_{it} = j\} + \varepsilon_{it}\right) \tag{3}$$

where $\varepsilon_{it}$ is independently and identically normally distributed with mean 0 and variance $\sigma_\varepsilon^2$. A key part of our exercise is that we do not explicitly allow $\gamma_{it}$ to vary systematically across age.

The final term of flow utility, $v_t$, accounts for tastes for applying for Social Security benefits. The literature has documented two peaks of Social Security application, at ages 62 and 65. Rust and Phelan (1997) demonstrate that the resource constraint and the health insurance constraint are two major factors contributing to the peaks at ages 62 and 65, respectively. While it is beyond the scope of this paper to explain these patterns, it is important to account for them when modeling labor supply and savings decisions of people of this age. With this goal in mind we let the model fit these patterns by assuming an individual obtains additional utility from receiving the Social Security benefit, and thus this total additional flow utility at period $t$ becomes

$$v_t = b_{62}\mathbf{1}\{t = 62\} + \left[b_{65} + b_{65t}(t - 65)\right]\mathbf{1}\{65 \le t \le 70\} \tag{4}$$

The first term ($b_{62}$) captures the effect of the resource constraint as well as pension eligibility, and the second term captures the “security value” of health insurance through employment, as studied in previous literature (e.g., Rust and Phelan, 1997; French, 2005; French and Jones, 2011).
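As a minimal sketch of equations (1)–(4), the Python functions below evaluate the consumption shifter, the taste for leisure, the claiming utility, and the flow utility for given parameter values. The function and argument names are hypothetical, $t$ in equation (4) is read as age, and $\eta_c \neq 1$ is assumed; none of the numbers passed to these functions should be taken as estimates from the paper.

```python
import math

def consumption_shifter(t, married, phi):
    """psi_{tm} in equation (2); phi = (phi1, phi2, phi3, phi4)."""
    phi1, phi2, phi3, phi4 = phi
    return math.exp(phi1 * t + phi2 * t**2 + phi3 * t**3 + phi4 * float(married))

def leisure_taste(a_i0, a, m, eps):
    """gamma_{it} in equation (3); a = (a1, a2), m is family status in {0, 1, 2}."""
    shift = a[m - 1] if m in (1, 2) else 0.0
    return math.exp(a_i0 + shift + eps)

def claiming_taste(age, b62, b65, b65t):
    """v_t in equation (4), with t interpreted as age (an assumption)."""
    v = b62 if age == 62 else 0.0
    if 65 <= age <= 70:
        v += b65 + b65t * (age - 65)
    return v

def flow_utility(c, leisure, ssa, psi, gamma, v, eta_c):
    """u_t in equation (1); leisure and ssa are 0/1 indicators, eta_c != 1."""
    return psi * c ** (1.0 - eta_c) / (1.0 - eta_c) + gamma * leisure + v * ssa
```

Keeping the three shifters separate from `flow_utility` mirrors the way the text introduces them one at a time.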

Life ends at the end of period $T$ and each individual values the bequest he will leave. It takes the form

$$b(A_{iT+1}) = b_1 \frac{(b_2 + A_{iT+1})^{1-\eta_c}}{1-\eta_c} \tag{5}$$

where $b_1$ captures the relative weight of the bequest motive and $b_2$ determines its curvature as in De Nardi (2004).

Human Capital

If a man chooses to work, $\ell_{it} = 0$, he decides how much time, $I_{it} \in [0, 1]$, to invest in human capital and spends the rest, $1 - I_{it}$, in effective (or productive) work, from which wage income is earned. Human capital is produced according to the production function

$$H_{it+1} = (1-\delta) H_{it} + \xi_{it}\,\pi_i\, I_{it}^{\alpha_I} H_{it}^{\alpha_H} \tag{6}$$

where $H_{it}$ is the human capital level at period $t$ and $\xi_{it}$ is an idiosyncratic shock to the human capital innovation. If an individual chooses not to work, he does not invest in human capital (so $I_{it} = 0$) and human capital depreciates at rate $\delta$.

We assume $\xi_{it}$ is i.i.d. and follows a log-normal distribution,

$$\log(\xi_{it}) \sim N\left(-\frac{\log(\sigma_\xi^2 + 1)}{2},\ \log(\sigma_\xi^2 + 1)\right) \tag{7}$$

so that the level of $\xi_{it}$ has a mean of one and a variance of $\sigma_\xi^2$.


The labor market is perfectly competitive. We normalize the rental rate of human capital to one so that the wage for the effective labor supply equals the human capital $H_{it}$. Thus pre-tax labor income at any point in time is

$$w_{it} = H_{it}(1 - \ell_{it})(1 - I_{it}). \tag{8}$$
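To illustrate equations (6)–(8), the sketch below draws the mean-one shock, updates human capital, and computes pre-tax labor income. All names are hypothetical and the parameter values used to call these functions are placeholders rather than the paper's estimates; the sketch also assumes $\alpha_I > 0$, so that zero investment contributes nothing to production.

```python
import math
import random

def draw_hc_shock(sigma_xi, rng):
    """Draw xi with E[xi] = 1 and Var[xi] = sigma_xi**2, as in equation (7)."""
    s2 = math.log(sigma_xi**2 + 1.0)            # variance of log(xi)
    return math.exp(rng.gauss(-0.5 * s2, math.sqrt(s2)))

def next_human_capital(h, invest, works, xi, delta, pi_i, alpha_i, alpha_h):
    """Equation (6); investment is set to zero in periods of non-employment."""
    if not works:
        invest = 0.0
    return (1.0 - delta) * h + xi * pi_i * invest**alpha_i * h**alpha_h

def pretax_labor_income(h, leisure, invest):
    """Equation (8): w = H (1 - l) (1 - I), with leisure in {0, 1}."""
    return h * (1 - leisure) * (1.0 - invest)
```

A quick way to see the role of depreciation is to call `next_human_capital` with `works=False`: the stock simply shrinks by the factor $1 - \delta$.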

Social Security and Budget Constraint

While we have tried to keep the basic model as simple as possible, the Social Security system in the U.S. is such a crucial part of later-life economic decisions that we incorporate it into the model. We model the Social Security enrollment decision as a one-time decision. Once a person turns 62 they can start claiming Social Security, and once they have started claiming, they continue to collect benefits until their death. We let $ss_{it}$ be a state variable indicating whether a person began claiming prior to period $t$ and, as mentioned above, $ssa_{it}$ indicates the decision to start claiming benefits. Since claiming is irreversible, once $ss_{it} = 1$ then $ssa_{it}$ is no longer a relevant choice variable. Thus the law of motion can be written as

$$\begin{aligned} ss_{i0} &= 0 \\ ss_{it+1} &= \max\{ss_{it}, ssa_{it}\}. \end{aligned} \tag{9}$$

The claiming decision ($ssa_{it}$) is made separately from the labor supply decision ($\ell_{it}$) so that one can receive the Social Security benefit while working (subject to applicable rules such as the earnings test).
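The law of motion in equation (9) can be traced out in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the loop indexes periods by age (18 to 80) for readability.

```python
EARLY_CLAIMING_AGE = 62

def may_start_claiming(age, ss):
    """Claiming is a choice only from age 62 on and only if not yet claimed."""
    return age >= EARLY_CLAIMING_AGE and ss == 0

def next_claim_state(ss, ssa):
    """Equation (9): claiming is irreversible."""
    return max(ss, ssa)

def claim_path(claim_age, first_age=18, last_age=80):
    """State ss by age for someone who tries to start claiming at claim_age."""
    ss, path = 0, {}
    for age in range(first_age, last_age + 1):
        path[age] = ss                      # ss records claiming *prior* to this age
        ssa = 1 if (age == claim_age and may_start_claiming(age, ss)) else 0
        ss = next_claim_state(ss, ssa)
    return path
```

For example, `claim_path(63)` has `ss = 0` at age 63 (the period in which claiming starts) and `ss = 1` from age 64 onward.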

Once an individual has begun claiming, he collects benefits $ssb_{it}$, which are a function of the claiming age, the Average Indexed Monthly Earnings ($AIME_{it}$), and working behavior after claiming (through the earnings test). In practice we approximate the AIME and use the Social Security rules as of 2004. The benefit $ssb_{it}$ is updated each year in which an individual works to account for the earnings test. Details are in Appendix B. This is incorporated into the budget constraint

$$A_{it+1} = A_{it} + \Upsilon_t(rA_{it}, w_{it}, y_{it}, ssb_{it}) - c_{it} + \tau_{it}, \tag{10}$$

where $A_{it}$ stands for assets, $r$ is the risk-free interest rate, and $y_{it}$ is spousal income. $\Upsilon_t(\cdot)$ is after-tax income, which is a function of positive capital income, wage income, spousal income (if applicable), the Social Security benefit (if applicable), and the tax code. Details can be found in Appendix B.


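Spousal income takes the form

$$y_{it} = \zeta_{it}\mathbf{1}\{M_{it} = 2\}, \qquad \log(\zeta_{it}) \sim N\left(\mu_{\zeta t}, \sigma_{\zeta t}^2\right) \tag{11}$$

where $\zeta_{it}$ is an age-dependent log-normal random variable.

Government transfers, $\tau_{it}$, provide a consumption floor $\underline{c}$ as in Hubbard et al. (1995), so

$$\tau_{it} = \max\left\{0,\ \underline{c} - \left(A_{it} + \Upsilon_{it} - \underline{A}_{it+1}\right)\right\}, \tag{12}$$

where $\underline{A}_{it+1}$ is the asset lower bound at period $t + 1$.³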
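The budget constraint (10) and the transfer rule (12) can be sketched as follows. The asset lower bound uses the formula from footnote 3; after-tax income is passed in as a single number because the tax function is detailed in Appendix B and is not reproduced here. Function names are hypothetical.

```python
def asset_lower_bound(t, T, b2, r):
    """Footnote 3: the bound at period t is -b2 / (1 + r)**(T - t + 1)."""
    return -b2 / (1.0 + r) ** (T - t + 1)

def transfer(assets, after_tax_income, c_floor, next_lower_bound):
    """Equation (12): top up resources so the consumption floor is affordable."""
    return max(0.0, c_floor - (assets + after_tax_income - next_lower_bound))

def max_consumption(assets, after_tax_income, c_floor, next_lower_bound):
    """Largest consumption that leaves next-period assets at the lower bound."""
    tau = transfer(assets, after_tax_income, c_floor, next_lower_bound)
    return assets + after_tax_income + tau - next_lower_bound

def next_assets(assets, after_tax_income, consumption, tau):
    """Equation (10), with after-tax income already including capital income."""
    return assets + after_tax_income - consumption + tau
```

By construction, `max_consumption` never falls below `c_floor`, which is the sense in which the transfer guarantees the floor.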

We note that some of our model assumptions are strong and lead all human capital to be financed by the worker through foregone wages. In particular, with equation (8), we deviate from the original Ben-Porath model by ignoring monetary inputs, which would have to be subtracted from the right-hand side. Relaxing various parts of this model could lead the firm to finance some of the human capital, for example through search frictions, asymmetric information, or if some of the human capital is firm specific (see e.g. Acemoglu and Pischke, 1998, 1999; Sanders and Taber, 2012). However, separating the contributions of firms and workers to training empirically is notoriously difficult, if not impossible. All we are assuming is that workers have to sacrifice some current earnings in order to increase future earnings. In addition to workers’ contribution to training, our wage specification also captures career choices with a low starting wage but a steeper age-earnings profile. For example, a law school graduate could either start as an associate in a law firm with a relatively low starting wage but very high potential earnings in the future, or choose another career with a higher starting wage but a flatter age-earnings profile.

3.3 Solving the Model

Four random variables are realized each period: the evolving family status, $M_{it}$; spousal income, $y_{it}$; the shock to leisure taste, $\varepsilon_{it}$; and the human capital innovation shock, $\xi_{it}$. The timing of the model works as follows: between periods $t-1$ and $t$, $y_{it-1}$ and $\xi_{it-1}$ are drawn, determining $A_{it}$ and $H_{it}$; the Markov process determines $M_{it}$; and the leisure shock $\varepsilon_{it}$ is realized. The agent then simultaneously chooses consumption, labor supply, human capital investment, and, when relevant, Social Security application. All four shocks are i.i.d. conditional on $M_{it-1}$ from the perspective of the econometrician and the agent, so agents have no private information about their values prior to their realization.

³ We define the asset lower bound as the amount that each individual can pay back for sure before death, as in Aiyagari (1994). Since the probability of not working in each period is positive, the lower bound is characterized by non-negative consumption and the bequest function in equation (5), which gives $\underline{A}_{it} = -b_2/(1 + r)^{T-t+1}$.

The recursive value function for $t < T$ can be written as

$$V_t(X_{it}) = \max_{c,\ell,I,ssa}\left\{u_t(c, \ell, ssa; \gamma_{it}, M_{it}) + \beta E\left[V_{t+1}(X_{it+1}) \mid X_{it}, c, \ell, I, ssa\right]\right\} \tag{13}$$

where

$$X_{it} = \{M_{it}, A_{it}, H_{it}, ss_{it}, AIME_{it}, ssb_{it}, \varepsilon_{it}; a_{i0}, \pi_i\} \tag{14}$$

is the vector of state variables. Note that $AIME_{it}$ is only relevant prior to claiming ($ss_{it} = 0$) while $ssb_{it}$ is only relevant after claiming ($ss_{it} = 1$). That is, prior to claiming, $AIME_{it}$ increases over time, but $ssb_{it}$ has not yet been determined. At the time an individual starts to claim, the benefit ($ssb_{it}$) is determined and relevant for the rest of life, but $AIME_{it}$ is only relevant in its contribution to $ssb_{it}$, so once that has been determined $AIME_{it}$ is no longer relevant. The expectation is over the human capital innovation $\xi_{it}$, spousal income $y_{it}$, the leisure shock $\varepsilon_{it+1}$, and the Markov draw for the new family status $M_{it+1}$.

For $t = T$ we write

$$V_T(X_{iT}) = \max_{c,\ell,I,ssa}\left\{u_T(c, \ell, ssa; \gamma_{iT}, M_{iT}) + \beta E\left[b(A_{iT+1}) \mid X_{iT}, c, \ell, I, ssa\right]\right\} \tag{15}$$

The solution to the agent’s problem each period is computed in two stages. We first solve for the optimal choices conditional on the labor supply status and then we determine the labor supply decision.

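Define $\tilde{X}_{it}$ to be the set of state variables apart from $\varepsilon_{it}$. The optimal consumption $C_{it0}(\tilde{X}_{it})$, investment $I_{it0}(\tilde{X}_{it})$, and Social Security claiming $SSA_{it0}(\tilde{X}_{it})$ decisions conditional on participating in the labor market ($\ell_{it} = 0$) can be obtained from

$$\left\{C_{it0}(\tilde{X}_{it}), I_{it0}(\tilde{X}_{it}), SSA_{it0}(\tilde{X}_{it})\right\} \equiv \arg\max_{c,I,ssa}\left\{\psi_{tM_{it}}\frac{c^{1-\eta_c}}{1-\eta_c} + v_t(ssa) + \beta E\left[V_{t+1}(X_{it+1}) \mid \tilde{X}_{it}, c, \ell_{it} = 0, I, ssa\right]\right\} \tag{16}$$

where $v_t(ssa)$ denotes the term $v_t\, ssa$ from equation (1), and the conditional value function is

$$V_{t0}(\tilde{X}_{it}) \equiv \psi_{tM_{it}}\frac{\left(C_{it0}(\tilde{X}_{it})\right)^{1-\eta_c}}{1-\eta_c} + v_t\left(SSA_{it0}(\tilde{X}_{it})\right) + \beta E\left[V_{t+1}(X_{it+1}) \mid \tilde{X}_{it}, C_{it0}(\tilde{X}_{it}), \ell_{it} = 0, I_{it0}(\tilde{X}_{it}), SSA_{it0}(\tilde{X}_{it})\right]. \tag{17}$$

Notice that since there is no serial correlation in the leisure shock, $\varepsilon_{it}$, the conditional policy and value functions defined in equations (16) and (17) do not depend on it.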

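Similarly, conditional on not working ($\ell_{it} = 1$), we can calculate the optimal consumption and claiming decisions from

$$\left\{C_{it1}(\tilde{X}_{it}), SSA_{it1}(\tilde{X}_{it})\right\} \equiv \arg\max_{c,ssa}\left\{\psi_{tM_{it}}\frac{c^{1-\eta_c}}{1-\eta_c} + v_t(ssa) + \beta E\left[V_{t+1}(X_{it+1}) \mid \tilde{X}_{it}, c, \ell_{it} = 1, I_{it} = 0, ssa\right]\right\} \tag{18}$$

and define the conditional value function to be

$$V_{t1}(\tilde{X}_{it}) \equiv \psi_{tM_{it}}\frac{\left(C_{it1}(\tilde{X}_{it})\right)^{1-\eta_c}}{1-\eta_c} + v_t\left(SSA_{it1}(\tilde{X}_{it})\right) + \beta E\left[V_{t+1}(X_{it+1}) \mid \tilde{X}_{it}, C_{it1}(\tilde{X}_{it}), \ell_{it} = 1, I_{it} = 0, SSA_{it1}(\tilde{X}_{it})\right]. \tag{19}$$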

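The optimal labor supply solution is

$$\ell_{it} = \arg\max_{\ell \in \{0,1\}}\left\{V_{t\ell}(\tilde{X}_{it}) + \gamma_{it}\,\ell\right\} \tag{20}$$

This gives a convenient functional form for the expected value function. To see this, note that

$$\varepsilon_t^{*}(\tilde{X}_{it}) \equiv \log\left(V_{t0}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})\right) - a_{i0} - \sum_{j=1}^{2} a_j \mathbf{1}\{M_{it} = j\} \tag{21}$$

is the cutoff value of $\varepsilon_{it}$ that determines work (see Appendix A for the derivation). Then it is easy to see that the optimal labor supply decision is

$$\ell_{it} = \mathbf{1}\left(\varepsilon_{it} \ge \varepsilon_t^{*}(\tilde{X}_{it})\right) \tag{22}$$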

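where $\mathbf{1}(\cdot)$ is the indicator function.

Using properties of log-normal random variables, we show in Appendix A that the expected value function is

$$E\left[V_t(X_{it}) \mid \tilde{X}_{it}\right] = \Phi\left(\frac{\varepsilon_t^{*}(\tilde{X}_{it})}{\sigma_\varepsilon}\right) V_{t0}(\tilde{X}_{it}) + \left[1 - \Phi\left(\frac{\varepsilon_t^{*}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)\right] \cdot \left[V_{t1}(\tilde{X}_{it}) + \exp\left(a_{i0} + \sum_{j=1}^{2} a_j \mathbf{1}\{M_{it} = j\} + \frac{\sigma_\varepsilon^2}{2}\right) \frac{\Phi\left(\sigma_\varepsilon - \dfrac{\varepsilon_t^{*}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}{1 - \Phi\left(\dfrac{\varepsilon_t^{*}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}\right]$$

Finally, note that $X_{it+1}$ is a known function of $X_{it}$, $c_{it}$, $\ell_{it}$, $I_{it}$, $ssa_{it}$, $\xi_{it}$, $y_{it}$, and $M_{it+1}$, so to solve for

$$E\left[V_{t+1}(X_{it+1}) \mid X_{it}, c_{it}, \ell_{it}, I_{it}, ssa_{it}\right] = E\left[E\left[V_{t+1}(X_{it+1}) \mid \tilde{X}_{it+1}\right] \mid X_{it}, c_{it}, \ell_{it}, I_{it}, ssa_{it}\right]$$

we integrate over the distributions of $M_{it+1}$, $\zeta_{it}$, and $\xi_{it}$.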
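The cutoff rule and the expected value function can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it writes the leisure term with the standard truncated log-normal mean, $E[e^{\varepsilon}\mathbf{1}\{\varepsilon \ge \varepsilon^{*}\}] = e^{\sigma_\varepsilon^2/2}\,\Phi(\sigma_\varepsilon - \varepsilon^{*}/\sigma_\varepsilon)$ for $\varepsilon \sim N(0, \sigma_\varepsilon^2)$, and compares the closed form with a brute-force trapezoid integral of $\max\{V_{t0}, V_{t1} + \gamma_{it}\}$. The names `v_work`, `v_home`, and `mu_gamma` (standing for $V_{t0}$, $V_{t1}$, and $a_{i0} + \sum_j a_j \mathbf{1}\{M_{it} = j\}$) are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def leisure_cutoff(v_work, v_home, mu_gamma):
    """epsilon* in equation (21); -inf when working never pays."""
    gain = v_work - v_home
    return math.log(gain) - mu_gamma if gain > 0.0 else -math.inf

def expected_value(v_work, v_home, mu_gamma, sigma):
    """Closed-form E[max(V_t0, V_t1 + gamma)] over eps ~ N(0, sigma^2)."""
    cutoff = leisure_cutoff(v_work, v_home, mu_gamma)
    p_work = norm_cdf(cutoff / sigma)
    leisure_gain = math.exp(mu_gamma + 0.5 * sigma**2) * norm_cdf(sigma - cutoff / sigma)
    return p_work * v_work + (1.0 - p_work) * v_home + leisure_gain

def expected_value_numeric(v_work, v_home, mu_gamma, sigma, n=200_000, width=10.0):
    """Trapezoid-rule integral of max(.) against the normal density of eps."""
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        eps = lo + k * step
        density = math.exp(-0.5 * (eps / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        value = max(v_work, v_home + math.exp(mu_gamma + eps))
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * value * density * step
    return total
```

Because the closed form needs only two normal CDF evaluations per state, it avoids integrating over $\varepsilon_{it}$ numerically at every grid point; the brute-force version is included only as a check.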

3.4 Heterogeneity

We allow for heterogeneity in ability to learn ($\pi_i$), initial human capital ($H_{i0}$), and tastes for leisure ($a_{i0}$). For computational reasons we only have nine types determining the joint distribution of $(a_{i0}, \pi_i)$. Specifically, we model it as a nine-point Gauss-Hermite approximation of a joint normal distribution, which depends on five parameters: the mean and variance of $a_{i0}$, the mean and variance of $\pi_i$, and the correlation between the two. Respectively we write these as $(\mu_{a_0}, \sigma_{a_0}, \mu_\pi, \sigma_\pi, \rho)$. We emphasize that since we are only using nine points we are not assuming that the Gauss-Hermite is a good approximation of a normal, but rather view this as the parametrization itself.

Since human capital is already a continuous state variable in our model, we can be more flexible in its initial value. We allow it to be correlated with $(a_{i0}, \pi_i)$ through the functional form

$$H_{i0} = \exp\left(\gamma_0 + \gamma_{a_0} a_{i0} + \gamma_\pi \pi_i + \nu_i\right) \tag{23}$$

where $\nu_i \sim N\left(0, \sigma_{H_0}^2\right)$ is an i.i.d. normal random variable.
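One way to build nine discrete types is a 3 × 3 tensor-product Gauss-Hermite rule with a Cholesky-style rotation to induce the correlation. The text only states that a nine-point Gauss-Hermite approximation is used, so this particular construction, the function name, and the reading of $\sigma_{a_0}$ and $\sigma_\pi$ as standard deviations are all assumptions of the sketch.

```python
import numpy as np

def nine_types(mu_a, sigma_a, mu_pi, sigma_pi, rho):
    """Return a (9, 3) array with columns (a_i0, pi_i, probability weight)."""
    nodes, weights = np.polynomial.hermite.hermgauss(3)   # weight function exp(-x^2)
    z = np.sqrt(2.0) * nodes            # rescale nodes to a standard normal
    w = weights / np.sqrt(np.pi)        # rescale weights so they sum to one
    types = []
    for zi, wi in zip(z, w):
        for zj, wj in zip(z, w):
            a = mu_a + sigma_a * zi
            # mix the two independent nodes so that corr(a, pi) = rho
            pi = mu_pi + sigma_pi * (rho * zi + np.sqrt(1.0 - rho**2) * zj)
            types.append((a, pi, wi * wj))
    return np.array(types)
```

With three nodes per dimension the rule reproduces the means, variances, and correlation of the target normal exactly, even though nine points are clearly not a close approximation of the full distribution.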

4 Estimation

The estimation of the model is carried out using a three-step strategy. First, we pre-set parameters that either can be cleanly identified without explicitly using our model or are not the focus of this paper. In the second step we estimate the evolution of the state variables involving spouses. In the third step, we estimate the remaining preference and production parameters of the model using Indirect Inference. The model is described by equations (1)–(23) and we summarize the parameters here. The parameters determining unobserved heterogeneity are $\mu_{a_0}$, $\sigma_{a_0}$, $\mu_\pi$, $\sigma_\pi$, $\rho$, $\gamma_0$, $\gamma_{a_0}$, $\gamma_\pi$, and $\sigma_{H_0}$. The additional parameters related to preferences are the discount factor, $\beta$, the intertemporal elasticity of consumption, $\eta_c$, the consumption shifter parameters, $\phi_1$ to $\phi_4$, the taste for leisure parameters, $a_1$, $a_2$, and $\sigma_\varepsilon$, the bequest parameters, $b_1$ and $b_2$, and the Social Security claiming parameters $b_{62}$, $b_{65}$, and $b_{65t}$. Human capital production is determined by $\delta$, $\alpha_I$, $\alpha_H$, and $\sigma_\xi$. Parameters related to the budget constraint are the interest rate $r$ and the consumption floor $\underline{c}$. There are other parameters used to determine family status and spousal earnings. Finally there are initial values for the state variables, assets, $A_{i0}$, and Average Indexed Monthly Earnings, $AIME_{i0}$.

Table 1: Normalized or pre-set parameters

Parameter               Symbol    Normalized/pre-set value
Interest rate           r         0.03
Discount factor         β         0.97
Initial wealth (a)      A_0       0.0
Initial AIME (a)        AIME_0    0.0
Consumption floor (b)   c         2.19
Bequest shifter (c)     b_2       222.0

(a) The initial age is 18.
(b) The consumption floor is equivalent to $4,380 in 2004 dollars, since we normalize the total time endowment for labor supply in one period (2000 hours) to one.
(c) The bequest shifter is equivalent to $444,000 in 2004 dollars.

4.1 Pre-set Parameters

The set of parameters pre-set in the first stage includes the interest rate, the time discount factor, initial wealth and initial AIME, the consumption floor, and the bequest shifter.

One period is defined as one year.⁴ The model starts at age 18 and ends at age 80.⁵ The early retirement age is 62 and the normal retirement age is 65. The risk-free real interest rate is set to $r = 0.03$ and the time discount factor is set to $\beta = 0.97$. The consumption floor is set to $\underline{c} = 2.19$, as estimated in French and Jones (2011).⁶

The parameter which determines the curvature of the bequest function is set to $b_2 = 222$, as in French and Jones (2011).⁷ We assume all individuals start off their adult life with no wealth and zero AIME at age 18. These normalized or pre-set parameters are summarized in Table 1. In Section 8 of the paper we show that the results are robust to other alternatives.

⁴ Mid-year retirement might be an issue. However, more than half of workers are never observed working half-time approaching retirement, so it would not be a big issue.

⁵ The life expectancy for white males is 74.1 in 2000 and 76.5 in 2010.

⁶ c = 4380/2000 = 2.19 since we normalize the total time endowment for labor supply in one period to one.

⁷ It is equivalent to $444,000 in 2004 U.S. dollars.
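The unit conversion behind the pre-set values can be made explicit. The snippet below collects the pre-set parameters in a dictionary (the key names are hypothetical) and converts 2004 dollar amounts into model units by dividing by the 2000-hour time endowment, which reproduces the 2.19 and 222 stated in the text.

```python
HOURS_PER_PERIOD = 2000   # annual time endowment normalized to one

def dollars_to_model_units(dollars):
    """Convert 2004 dollars into model units."""
    return dollars / HOURS_PER_PERIOD

def model_units_to_dollars(units):
    """Convert model units back into 2004 dollars."""
    return units * HOURS_PER_PERIOD

PRESET = {
    "r": 0.03,                                   # risk-free real interest rate
    "beta": 0.97,                                # time discount factor
    "A0": 0.0,                                   # initial wealth at age 18
    "AIME0": 0.0,                                # initial AIME at age 18
    "c_floor": dollars_to_model_units(4380),     # consumption floor
    "b2": dollars_to_model_units(444_000),       # bequest shifter
}
```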


4.2 Demographics

We estimate the spousal demographics separately from the rest of the model. We estimate the 3 × 3 Markov transition matrix at each age from the SIPP data, smoothed by a probit regression on the age quadruple. For each age, we estimate the mean $\mu_{\zeta t}$ and standard deviation $\sigma_{\zeta t}$ of the logarithm of positive spousal income in the SIPP data, and then smooth them by an age quadruple function.
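Once the age-specific transition matrices and spousal income moments are in hand, simulating the exogenous family process is straightforward. The sketch below shows one way to do it; `illustrative_transition` contains made-up probabilities (not SIPP estimates), and all function names are hypothetical.

```python
import random

STATES = (0, 1, 2)   # 0 single/divorced, 1 married (spouse not working), 2 married (spouse working)

def simulate_family_status(transition_at, rng, first_age=18, last_age=80):
    """Simulate M_it by age; transition_at(age) returns a 3x3 row-stochastic matrix."""
    status = 0                       # everyone starts single (M_i0 = 0)
    path = [status]
    for age in range(first_age, last_age):
        row = transition_at(age)[status]
        status = rng.choices(STATES, weights=row)[0]
        path.append(status)
    return path

def draw_spousal_income(status, mu, sigma, rng):
    """Equation (11): log-normal income if the spouse works, zero otherwise."""
    return rng.lognormvariate(mu, sigma) if status == 2 else 0.0

def illustrative_transition(age):
    """Hypothetical matrix for illustration only: marriage is likelier when young."""
    p_marry = 0.10 if age < 35 else 0.03
    return [
        [1.0 - p_marry, 0.4 * p_marry, 0.6 * p_marry],
        [0.02, 0.78, 0.20],
        [0.02, 0.18, 0.80],
    ]
```

Calling `simulate_family_status(illustrative_transition, random.Random(1))` returns one status per age from 18 to 80.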

4.3 Estimation Procedure

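We apply Indirect Inference to estimate the remaining parameters of interest, Θ, with

\[
\Theta = \Big\{
\underbrace{\mu_{a_0}, \sigma_{a_0}, \mu_{\pi}, \sigma_{\pi}, \rho, \gamma_0, \gamma_{a_0}, \gamma_{\pi}, \sigma_{H_0}}_{\text{heterogeneity}},\;
\underbrace{\eta_c, \phi_1, \ldots, \phi_4}_{c},\;
\underbrace{a_1, a_2, \sigma_{\varepsilon}}_{\text{leisure}},\;
\underbrace{b_1}_{\text{bequest}},\;
\underbrace{b_{62}, b_{65}, b_{65t}}_{\text{SSA}},\;
\underbrace{\delta, \alpha_I, \alpha_H, \sigma_{\xi}}_{\text{human capital}}
\Big\}
\]

according to the following procedure.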

i) Calculate the auxiliary model from the data.
ii) Iterate on the following procedure for different values of Θ until the minimum distance has been found.
    (a) Given a set of parameters, solve value functions and policy functions for the entire state space grid.
    (b) Generate the life-cycle profile for each simulated individual.
    (c) Calculate the auxiliary model from the simulation.
    (d) Calculate the distance between the simulated auxiliary model and the data auxiliary model.
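The procedure above can be sketched on a toy problem. This is a minimal illustration, not the estimation code used in the paper: the "structural model" is a stand-in that draws from a normal distribution, the auxiliary model is just a mean and a standard deviation, the search is a coarse grid, and every function name is hypothetical. Only the loop structure and the diagonal weighting follow the text.

```python
import math
import random


def simulate(theta, n, seed):
    """Toy stand-in for steps (a)-(b): draw n outcomes from N(mu, sigma^2).

    Reusing the same seed for every theta keeps the simulation draws fixed,
    so the distance changes with theta only through the parameters.
    """
    rng = random.Random(seed)
    mu, sigma = theta
    return [rng.gauss(mu, sigma) for _ in range(n)]


def auxiliary_model(sample):
    """Auxiliary statistics: sample mean and standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n
    return [mean, math.sqrt(var)]


def distance(aux_sim, aux_data, weights):
    """Quadratic distance with a diagonal weighting matrix."""
    return sum(w * (s - d) ** 2 for w, s, d in zip(weights, aux_sim, aux_data))


def indirect_inference(data, grid, weights, n_sim=5000, seed=0):
    aux_data = auxiliary_model(data)                 # step i)
    best_theta, best_dist = None, float("inf")
    for theta in grid:                               # step ii)
        simulated = simulate(theta, n_sim, seed)     # steps (a)-(b)
        aux_sim = auxiliary_model(simulated)         # step (c)
        dist = distance(aux_sim, aux_data, weights)  # step (d)
        if dist < best_dist:
            best_theta, best_dist = theta, dist
    return best_theta, best_dist


# "Observed" data generated from known parameters, then recovered on a grid.
data = simulate((2.0, 0.5), 5000, seed=123)
grid = [(mu / 10, s / 10) for mu in range(15, 26) for s in range(3, 8)]
theta_hat, min_dist = indirect_inference(data, grid, weights=[1.0, 1.0])
```

In the actual application, step (a) is the expensive part, since the value and policy functions must be re-solved for every candidate Θ, and a numerical optimizer would replace the grid search.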

4.4 Data and the Auxiliary Parameters

Our primary data set is the Survey of Income and Program Participation (SIPP). The SIPP is comprised of a number of short panels of respondents and we use all of the panels starting with the 1984 panel and ending with the 2008 panel. We use the SIPP because it is a large representative data set with a panel data element. To focus on as homogeneous a group as possible, the sample only includes male high school graduates. Estimation results for college graduates are presented in Appendix D.

As is standard in the literature on estimation of Ben-Porath style human capital models, we assume that measured wages in the data correspond to

\[
W_t = H_t (1 - I_t) \tag{24}
\]

in the model.

The four primary things that agents in our model choose are consumption, labor supply, human capital investment, and Social Security application. We obtain life-cycle data on the three of these that can be easily observed: consumption, labor supply, and Social Security application. Human capital is not observed directly, so we choose moments on measured wages. We match the life-cycle profile of measured wages and also life-cycle measured wages conditional on fixed effects, as they look quite different and we want our model to be able to explain both. Since depreciation will play an important role in our results, we construct a measure of human capital decline following spells of nonemployment. To measure persistence in employment we also collect data on the transition rates in and out of work.

In SIPP an individual is observed at most three times each year. Due to the seam bias problem in SIPP we only use measures of working and wages during the survey month. We use only years in which we observe the worker three times, and if an individual works in two or three of the observations, he is categorized as working in the labor market, otherwise not.⁸ We construct the hourly wage as the earnings in the survey month divided by the total number of hours worked in the survey month and average across the survey months in a year in which the respondent works.
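The aggregation rule just described can be written out as a small function. This is a sketch under stated assumptions: the input layout (three tuples of worked flag, earnings, and hours) and the function name are hypothetical, and the hourly wage is computed for any person-year with at least one working survey month, which the text does not spell out.

```python
def annual_record(survey_months):
    """Aggregate three survey-month observations into one annual record.

    survey_months: list of (worked, earnings, hours) tuples, one per survey month.
    Returns None unless the person is observed exactly three times in the year.
    """
    if len(survey_months) != 3:
        return None  # only years with three observations are used

    # Hourly wage in each survey month in which the respondent works.
    hourly = [earn / hrs for worked, earn, hrs in survey_months if worked and hrs > 0]

    months_worked = sum(1 for worked, _, _ in survey_months if worked)
    employed = months_worked >= 2  # working in two or three of the observations

    # Average across the survey months in which the respondent works.
    wage = sum(hourly) / len(hourly) if hourly else None
    return {"employed": employed, "hourly_wage": wage}


record = annual_record([(True, 2000.0, 160.0), (True, 1800.0, 150.0), (False, 0.0, 0.0)])
```

Here the respondent works in two of three survey months, so the year counts as employed, and the wage is the mean of the two monthly hourly wages.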

We begin estimation of the model from age 22 rather than 18 for two reasons. First, we have a short panel, meaning that many 19-year-old high school graduates may return to college after they leave the panel. Second, our model does not include any search or matching behavior, which might be important for the labor supply patterns among very early labor market entrants as they transition from school to work, as suggested by the literature (Topel and Ward, 1992; Neal, 1999). Our model does over-predict the labor supply for those individuals.

Eight sets of moment conditions across different ages are chosen to assemble the auxiliary model. We use a total of 230,657 panel observations from 80,519 different respondents.

i) The employment rates (ER), ages 22–65.⁹
ii) The first moments of the logarithm of measured wages, ages 22–65.
iii) The first moments of the logarithm of measured wages after controlling for individual fixed effects, ages 22–65.¹⁰
iv) The second moments (standard deviation) of the logarithm of measured wages, ages 22–65.
v) The first moments of adult equivalent consumption, ages 22–65.¹¹
vi) The Social Security benefit application rates, ages 62 to 70.
vii) The overall transition probabilities averaged between age 35 and 50,¹²
    (a) from working to not working
    (b) from not working to working
viii) The average measured wage change rate after one nonemployment spell averaged between age 41 and 65.¹³

We match both age-measured wage profiles, with and without controlling for individual fixed effects, as the two have quite different patterns.

Figures 1(a)–1(e) present the six profiles. Figure 1(a) plots the employment rates between age 22 and 65. Figure 1(b) plots two log measured wage profiles. The first one is the profile from the pooled sample, while the second one is the profile after controlling for individual fixed effects. The original log measured wage profile has a hump shape, but the one filtering out individual fixed effects does not decline within the examined period, which is between age 22 and 65. Figure 1(c) shows the extent to which the variance of log measured wages increases with age. Figure 1(d) presents the adult equivalent consumption profile, while Figure 1(e) illustrates the two peaks at age 62 and 65 in the Social Security benefit application ages.

The most interesting result in Figures 1(a)–1(e) is the discrepancy between the age-measured wage profiles with and without controlling for individual fixed effects. This has been documented in various data sets, including the National Longitudinal Survey of Older Men (NLSOM) data (Johnson and Neumark, 1996), the Panel Study of Income Dynamics (PSID) data (Rupert and Zanella, 2012), and the Health and Retirement Study (HRS) data (Casanova, 2013). These papers find that after controlling for individual fixed effects the age-wage profile is flatter than the hump-shaped age-wage profile estimated using pooled observations, and it does not decline until the 60s or late 60s. All of these papers argue that this evidence is not consistent with the traditional human capital model, since the traditional human capital model would predict a hump-shaped wage profile. The intuition is that when human capital depreciation outweighs investment, wages start to decline, which generates a hump-shaped profile. Fitting the wage profile after controlling for fixed effects makes our problem more challenging because we need to explain the decrease in labor supply later in life when there is little evidence that measured wages decline.

Figure 1: Data moments and profiles
[Figure omitted. Panels, each plotted against age: (a) Employment rates (SIPP); (b) Mean log measured wages (SIPP), pooled and fixed-effects (FE) profiles; (c) S.D. log measured wages (SIPP); (d) Adult equivalent consumption (CES); (e) Social Security application (SIPP), ages 60–70; (f) Mean log measured wages (CPS), MORG and March, each with and without FE. Panels (a)–(d) and (f) cover ages 22–65.]

Table 2: Transitions moments

                            Transition Probabilitiesᵃ           Wage Change Rate
Models                      Working to      Not Working         After One
                            Not Working     to Working          Nonemployment Spellᵇ
Data                        0.034           0.200               -0.071
Baseline model              0.035           0.235               -0.085
No depreciation at work     0.039           0.255               -0.071

ᵃ The transition rate is the average transition probability between age 35 and 50.
ᵇ The average wage change rate after one nonemployment spell is the average change rate between age 41 and 65.

⁸ Clearly this aggregation is imperfect as the model is simulated at an annual basis. Ideally we would simulate the model at the monthly level, but this is not computationally feasible. Our goal is to understand labor supply at the life-cycle frequency, so abstracting from the monthly frequency does not seem first order.

⁹ We focus on employment rather than labor force participation in both the data and the model, as we do not have unemployment in the model.

¹⁰ To construct these moments we first regress log wage on the age dummies and survey year dummies and obtain the predicted log wage, denoted as z. We pick a base age (age 30) and calculate the average predicted log wage at the base age for each year, denoted as z_{a,j}, where a is the base age and j is the survey year. We then pick a base year y and calculate the difference of z_{a,j} between each year j and the base year y, denoted as ∆z_{a,j}. Finally we calculate the difference between the original log wage and ∆z_{a,j} and define the result as \widetilde{\ln W_t}, which is the log wage after filtering out the time fixed effects.

¹¹ The adult equivalent consumption profile is constructed from the Consumer Expenditure Survey as in Fernández-Villaverde and Krueger (2007).

¹² We want to focus on transition probabilities caused by heterogeneity rather than retirement, so we choose the prime working age. Choosing a different age period, such as age 41 to 65, does not change the results in any significant way.

¹³ We choose this close-to-retirement age group to emphasize the depreciation and minimize the investment channel.

To further verify this result we compare our SIPP results with the Current Population Survey (CPS) data. From the CPS Merged Outgoing Rotation Groups (MORG) data, we match the same respondent in two consecutive surveys using the method proposed in Madrian and Lefgren (2000), and we have a short panel with each individual interviewed twice, one year apart.¹⁴ We construct a similar short panel from the CPS March Annual Social and Economic Supplement files (March). The difference is that the wage information is collected from the reference week in the CPS MORG data and from the previous year in the CPS March data.

Figure 1(f) presents the age-measured wage profiles with and without controlling for individual fixed effects for male high school graduates from the 1979–2012 CPS MORG data and the 1979–2007 CPS March data. We find an even larger discrepancy in the age-measured wage profiles than in the SIPP data presented in Figure 1(b).¹⁵ In the model this profile corresponds to net earnings H_t (1 − I_t).

¹⁴ For MORG data, they are the fourth and eighth interviews.

The values for the remaining moments (vii and viii) can be seen in the first row of Table 2. One can see that there is substantial persistence in labor supply and that the measured wage change following a nonemployment spell is large.
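The time-effect filtering used for the wage profiles (described in footnote 10) can be sketched as follows. This is an illustrative reimplementation, not the authors' code. It assumes an additive specification in age dummies and year dummies, and it swaps in simple backfitting (alternating group means) for an explicit dummy-variable regression. Under the additive specification, the predicted log wage at the base age differs across years only through the year effect, so ∆z_{a,j} reduces to the year effect relative to the base year. All names are hypothetical.

```python
from collections import defaultdict


def filter_time_effects(obs, base_year, n_iter=50):
    """Remove year effects (relative to base_year) from log wages.

    obs: list of (age, year, log_wage) tuples.
    Fits log_wage = age effect + year effect by backfitting, then subtracts
    year_eff[year] - year_eff[base_year] from every observation.
    """
    age_eff = defaultdict(float)
    year_eff = defaultdict(float)
    for _ in range(n_iter):
        # Age effects: mean of log wage net of current year effects, by age.
        sums, counts = defaultdict(float), defaultdict(int)
        for age, year, y in obs:
            sums[age] += y - year_eff[year]
            counts[age] += 1
        age_eff = defaultdict(float, {a: sums[a] / counts[a] for a in sums})

        # Year effects: mean of log wage net of current age effects, by year.
        sums, counts = defaultdict(float), defaultdict(int)
        for age, year, y in obs:
            sums[year] += y - age_eff[age]
            counts[year] += 1
        year_eff = defaultdict(float, {j: sums[j] / counts[j] for j in sums})

    return [
        (age, year, y - (year_eff[year] - year_eff[base_year]))
        for age, year, y in obs
    ]


# Noise-free toy panel: log wage = 0.02 * age + 0.01 * (year - 2000).
toy = [(a, j, 0.02 * a + 0.01 * (j - 2000)) for a in range(22, 26) for j in range(2000, 2004)]
filtered = filter_time_effects(toy, base_year=2000)
```

On this balanced toy panel the filtered log wage depends on age only. With unbalanced data the backfitting loop may need more iterations, and a direct least-squares fit would be the more usual choice.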

5 Estimation Results

The estimates of the parameters are listed in Table 3. Of particular importance are the depreciation rate, δ, the curvature in the human capital production function, α_I, and σ_ε, which determines the elasticity of labor supply. Before discussing these parameter values we examine the fit of the model in Figures 2(a)–2(f). The fit of the model in the two overall transition probabilities and the average wage change rate after one nonemployment spell is presented in the first two rows of Table 2.¹⁶

The first and central point is that our parsimonious model can reconcile the main facts in the data: a small increase in labor supply and a large increase in measured wages at the beginning of the life cycle, along with a large decrease in labor supply and a small decrease in measured wages at the end of the life cycle.¹⁷

The simulated employment rate increases slightly between age 22 and 30 as shown in Figure 2(a). Our main result is that this simple model is able to generate a massive decline in labor supply between age 55 and 65, which fits the sharp decline of employment rates within that age period in the data and simultaneously the flat measured wage profile in the fixed effect model.

Our model generates a similar discrepancy between the log measured wages with and without controlling for individual fixed effects, as shown in Figures 2(b) and 2(c), and both profiles fit the data well. Log measured wages after filtering out individual fixed effects increase at a decreasing pace from age 22 to age 58 and then decrease slightly (Figure 2(c)). On the other hand, Figure 2(b) shows that the original log measured wage profile presents a hump shape which resembles the data profile. The model also replicates the log measured wage variation in the data (Figure 2(d)).

Our model tracks the hump shape and the level of the adult equivalent consumption profile reasonably well (Figure 2(e)), as well as the two peaks at 62 and 65 in the Social Security application (Figure 2(f)).¹⁸ The model also generates similar overall transition probabilities between working and not working and a similar average wage change rate after one nonemployment spell, as shown in Table 2.

Table 3: Estimates in the baseline modelᵃ

Parameters                                                    Estimates     Standard Errors
HC depreciationᵇ                                    δ         0.091         (0.004)
HC production function: I factor                    α_I       0.093         (0.021)
HC production function: H factor                    α_H       0.103         (0.011)
Standard deviation of HC innovation                 σ_ξ       0.014         (0.003)
Consumption: CRRA                                   η_c       4.013         (0.032)
Consumption shifter: coef on t (×10⁻¹)              ϕ_1       0.304         (0.052)
Consumption shifter: coef on t² (×10⁻²)             ϕ_2       0.136         (0.016)
Consumption shifter: coef on t³ (×10⁻³)             ϕ_3       -0.035        (0.002)
Consumption shifter: coef on married                ϕ_4       0.349         (0.089)
Leisure: standard deviation of shock                σ_ε       0.203         (0.010)
Leisure: spouse not working                         a_1       0.870         (0.141)
Leisure: spouse working                             a_2       -0.795        (0.109)
Bequest weight                                      b_1       24,483,682    (3,947,870)
Parameter heterogeneityᶜ
Leisure: mean of intercept                          µ_a0      -5.453        (0.083)
Leisure: standard deviation of intercept            σ_a0      0.816         (0.060)
HC productivity, mean                               µ_π       1.760         (0.095)
HC productivity, standard deviation                 σ_π       0.628         (0.069)
Correlation between a_0 & π                         ρ         -0.767        (0.047)
Initial human capital level at age 18
Intercept                                           γ_0       1.565         (0.185)
Coefficient on a_0                                  γ_a0      0.051         (0.014)
Coefficient on π                                    γ_π       0.644         (0.113)
Standard deviation of error term                    σ_H0      0.007         (0.005)
Additional Social Security Application effects
Effect of resource constraint (×10⁻³)               b_62      0.217         (0.084)
Effect of health insurance: constant (×10⁻³)        b_65      0.078         (0.042)
Effect of health insurance: coef on t (×10⁻³)       b_65t     0.166         (0.052)

χ² Statistic = 636ᵈ          Degrees of freedom = 207

ᵃ Indirect Inference estimates. Estimates use a diagonal weighting matrix. Standard errors are given in parentheses.
ᵇ HC: Human Capital.
ᶜ The joint distribution of (a_0, π) is a parametric discrete distribution with nine points determined by these five parameters, using a nine-point Gauss-Hermite approximation.
ᵈ This is the J-statistic. The critical value of the χ² distribution is χ²(207, 0.01) = 257.

Figure 2: Fit of model
[Figure omitted. Six panels, each comparing simulation and data by age: (a) Employment rates (ER); (b) Mean log measured wages; (c) Mean log measured wages (FE); (d) S.D. log measured wages; (e) Adult equivalent consumption; (f) Social Security application, ages 60–70. Panels (a)–(e) cover ages 22–65.]

¹⁵ Time fixed effects are filtered out, as described in footnote 10. We use the same starting year for the CPS MORG data and the CPS March data. Using the 1979–2007 CPS MORG data generates essentially the same profiles.

¹⁶ The overidentification test statistic is reported at the bottom of Table 3. The model is rejected at the 0.1% level. The fact that we reject is not surprising given the simplicity of our model and the size of our sample. One could easily add some extra parameters to pass the statistical criterion, but this is not our goal. Our goal is to use a simple model that does a very good job of capturing the life-cycle patterns.

¹⁷ One should keep in mind that our parsimonious specification might be a limitation on our policy counterfactuals, as other features that we have not explicitly modeled might impact those simulations.
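For readers who want to check the overidentification test reported with Table 3 (a J-statistic of 636 with 207 degrees of freedom, against a 1% critical value of 257), the sketch below approximates the critical value with the Wilson-Hilferty formula. The approximation is swapped in here so that only the Python standard library is needed; it is not necessarily how the critical value in the table was computed.

```python
from statistics import NormalDist


def chi2_critical_wh(df, alpha):
    """Approximate upper-alpha critical value of a chi-square(df) distribution.

    Uses the Wilson-Hilferty cube-root normal approximation, which is
    accurate for large degrees of freedom such as df = 207.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


j_statistic = 636.0       # from Table 3
degrees_of_freedom = 207  # from Table 3
critical_1pct = chi2_critical_wh(degrees_of_freedom, 0.01)
rejected_at_1pct = j_statistic > critical_1pct
```

The approximate critical value is close to the 257 reported in the table note, and the J-statistic lies far above it.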

We obtain our fit of the life-cycle profiles of labor supply and log measured wages despite the lack of any explicit time-dependent preference for leisure, production, or constraints in our model. A key feature of our model makes this possible: the combination of human capital depreciation and the separation between effective labor and observed labor. We discuss these issues in the following subsection. We also note that another reason labor supply falls late in life is precautionary savings. Workers build up a buffer stock of assets, which leads to lower labor supply late in life. As this is a common feature of many models, we focus on the more novel aspects of our model.

5.1 The Role of Human Capital Depreciation

An important feature for explaining the life-cycle profiles comes from a point emphasized by Heckman et al. (1998a): measured wages are different from observed human capital. We see in Figure 2(c) that in both the model and the data, once fixed effects are accounted for, measured wages are close to flat for ages 50–65 despite the fact that there is a large decrease in labor supply. This distinction between human capital and measured wages can help explain this effect. We show the key features in Figure 3(a). The solid line and long-dashed line replicate the simulated log wages without and with fixed effects from Figures 2(b) and 2(c). The long-dash-dotted line labeled “log(H): working” is analogous to the average log wage but shows the potential wage for workers. That is, both the mean log wage and mean log(H) condition on working during that period, but the first presents the mean of log(H_it (1 − I_it)) while the latter presents the mean of log(H_it). One can see that the difference between the two curves declines over time as investment decreases. The time investment profile in Figure 3(b) accounts for this implication. The solid line is the unconditional investment profile while the dashed line is the average investment profile conditional on working. These two profiles are very close to each other at prime ages, and both decrease over time. The short-dash-dotted line labeled “log(H): all” in Figure 3(a) also presents the mean of log(H) but for the full population, not just workers. From the latter curve one sees that at older ages (around 60) the actual human capital level has already depreciated to a relatively low level, even though the measured wage level is still quite high. This is due to the decline in investment that happens around that time, both from the decline in investment on the job and from not working at all.

Figure 3: Log measured wages, human capital, and investments
[Figure omitted. Two panels plotted against age, 22–65: (a) Log measured wages and human capital, with series for mean log measured wages, mean log measured wages (FE), log(H): working, and log(H): all; (b) Investment, with series for investment and investment at work.]

¹⁸ We did not force our model to fit the initial decline at young ages in the consumption profile of high school graduates for two reasons. First, the initial decline in the data needs further investigation and could be for reasons not present in our model (e.g., sponsored by or living with parents). Second, consumption and leisure are additively separable in our model, and thus the shape of initial consumption does not affect the labor supply decision in the absence of binding borrowing constraints.

The relatively high value of investment late in the working career is also related to why we find a much smaller level of the human capital curvature parameter, α_I, compared to the literature summarized in Browning et al. (1999). The larger is α_I, the steeper is the decline in human capital investment with age. At the extreme when α_I = 1 one gets a “bang-bang” solution with full investment to a point and then zero investment thereafter. Because depreciation is large, in order to fit the relatively flat wage profile that we see at older ages one needs a lot of investment at this age, which requires a small value of α_I. Heckman et al. (1998a) fit the wage data with a much larger value of α_I, but our models are quite different in a number of ways, including the fact that this model includes leisure and that in their model depreciation is set to zero.

At the early stage of the life cycle, workers invest a considerable amount of time in human capital production, which drives up both the human capital level and the wage. Once the worker reaches his mid-career, he reduces the time investment at an increasing rate and human capital starts to decrease. As the worker spends less of his working time investing, wages continue to increase. One can see in Figure 3(a) that the measured wage keeps increasing after age 45 and peaks around 55, after which the measured wage starts declining slowly. After age 62, however, since the worker has already allocated most of his time to effective work, there is little further room for such adjustment. As a result, the measured wage declines at almost the same rate at which human capital depreciates. This leads to large falls in labor supply at older ages. Figure 3(b) presents the investment profile in our model; the level and trend are very close to Figure 4 in Mulligan (1998), who calculates the time spent learning skills on the job from a 1976 study of time use by the Survey Research Center. The shape is also similar to that in Blundell et al. (2019), who find substantial training among older workers, though using data from the United Kingdom.
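The mechanism described above, in which the gap between human capital and the measured wage closes as investment falls, can be illustrated with a short deterministic simulation. This is a sketch under strong assumptions, not the estimated model: the law of motion H' = (1 − δ)H + π I^α_I H^α_H is assumed as the baseline (it is not stated in this section), shocks are ignored, the investment path is a hypothetical linear decline rather than a solved policy function, and the values of π and initial H are made up. Only δ, α_I, and α_H are taken from Table 3.

```python
import math


def simulate_profile(h0, invest_path, delta, pi, alpha_i, alpha_h):
    """Return a list of (H_t, I_t, W_t) with measured wage W_t = H_t * (1 - I_t)."""
    h, rows = h0, []
    for inv in invest_path:
        rows.append((h, inv, h * (1.0 - inv)))
        # Assumed baseline law of motion: depreciation every period, no shock.
        h = (1.0 - delta) * h + pi * inv ** alpha_i * h ** alpha_h
    return rows


ages = list(range(22, 66))
# Hypothetical investment path: 15% of time at age 22, falling linearly to 0 at 65.
invest_path = [0.15 * (65 - age) / 43 for age in ages]

rows = simulate_profile(
    h0=10.0, invest_path=invest_path,
    delta=0.091, pi=1.0, alpha_i=0.093, alpha_h=0.103,
)

# Gap between log human capital and log measured wage: equals -log(1 - I_t).
gaps = [math.log(h) - math.log(w) for h, _, w in rows]
```

Because the gap equals −log(1 − I_t), it shrinks every year along a declining investment path and vanishes once investment reaches zero, at which point the measured wage can only move with human capital itself.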

Such separation also helps generate the pattern that the working hours profile peaks earlier than the wage profile (Weiss, 1986). Working hours increase slightly with age when the worker is young, with a large portion devoted to human capital investment. The working hours profile peaks around age 40 and starts declining. However, with proportionally less time devoted to human capital investment and more time to effective labor supply (Figure 3(b)), the measured wage increases from labor market entry to about age 55.

To show directly the significance of human capital depreciation in matching the labor supply profile and the two log measured wage profiles, we re-estimate the model without depreciation while working. That is, depreciation is essential to explain the result in Table 2 that measured wages decline when one is not working, but we estimate a model in which human capital declines off the job but not on the job.¹⁹ Specifically, we assume human capital only depreciates if not working,

\[
H_{it+1} =
\begin{cases}
H_{it} + \xi_{it} \pi_i I_{it}^{\alpha_I} H_{it}^{\alpha_H} & \text{if } \ell_{it} = 0 \\
(1 - \delta) H_{it} & \text{if } \ell_{it} = 1
\end{cases}
\tag{25}
\]

and this model is labeled as the “no depreciation at work” model. The estimation results of this model are listed in Table 4.
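Equation (25) translates directly into a one-period update. The function below is a minimal sketch with hypothetical names. It reads the indicator in (25) as equal to one when the individual is not working, which is what the sentence introducing the equation implies, and it treats the shock ξ as a multiplicative input defaulting to one.

```python
def next_human_capital_no_dep_at_work(h, inv, not_working, delta, pi,
                                      alpha_i, alpha_h, xi=1.0):
    """One-period update of human capital under equation (25)."""
    if not_working:
        # Indicator equal to 1: human capital depreciates and nothing is produced.
        return (1.0 - delta) * h
    # Indicator equal to 0: no depreciation at work, investment adds to the stock.
    return h + xi * pi * inv ** alpha_i * h ** alpha_h


# Illustrative call with delta, alpha_I, alpha_H from Table 4 and a made-up pi.
h_after_spell = next_human_capital_no_dep_at_work(
    h=10.0, inv=0.1, not_working=True,
    delta=0.066, pi=1.0, alpha_i=0.419, alpha_h=0.002,
)
h_working_no_invest = next_human_capital_no_dep_at_work(
    h=10.0, inv=0.0, not_working=False,
    delta=0.066, pi=1.0, alpha_i=0.419, alpha_h=0.002,
)
```

A year out of work lowers human capital by the factor (1 − δ), while a working year with zero investment leaves it unchanged, which is the sense in which this variant shuts down depreciation on the job.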

The fit of the model is shown in the third row of Table 2 and Figure 4. One can see the model is not able to match the profiles of labor supply and log measured wages simultaneously. In particular the fit of the wage profile both with and without fixed effects is much worse than in Figures 2(b) and 2(c). We take this as evidence of the importance of depreciation.

We also tried looking at this in a different way by estimating a model with completely exogenous human capital and another with learning-by-doing. These results are contained in Appendix C. The fit of these models is also considerably worse than our baseline model, especially the exogenous model.

Given that we have shown that our estimate of a depreciation value δ = 0.091 plays a major role in explaining the pattern of wages and life-cycle labor supply, it is important to place this value into the range of estimates in the literature. This is not easily done as there is a very large range of estimates and none are directly comparable to our number. Some are larger than our 9.1% estimate and others are smaller. There are broadly three different literatures that estimate related parameters. The first of these is motivated by family leave for women and tries to estimate the effect of career interruption on wages. It finds estimates ranging from 1.5% per year to 25%.²⁰ A second literature looks at displacement from the Displaced Worker Survey and also finds a wide range of estimates, many of which are not directly comparable to ours.²¹ A third literature examines the effect of the length of an unemployment spell on the wage at rehire. Schmieder et al. (2016) is a recent and convincingly identified paper of this type. They estimate the effect using a regression discontinuity with German data. In Germany the length of eligibility for unemployment insurance depends on age with jumps at ages 42 and at 44. They see an increase in unemployment duration at these two discontinuity points, so they use the kink points as instruments in order to estimate the effect of the length of unemployment duration on re-employment wages. They find that one extra month of unemployment leads to a decrease in wages of 0.8%, which gives an annual rate remarkably close to our estimate of 9.1%. While it looks at women in England, Blundell et al. (2016) is of similar style to our paper in the sense that it is a structural life-cycle model of labor supply and human capital formation. Interestingly, their analysis reveals a substantial depreciation of human capital ranging from 6% to 11%.

Table 4: Estimates of model with no depreciation at workᵃ

Parameters                                                    Estimates      S.E.
HC depreciationᵇ                                    δ         0.066          (0.004)
HC production function: I factor                    α_I       0.419          (0.038)
HC production function: H factor                    α_H       0.002          (0.0004)
Standard deviation of HC innovation                 σ_ξ       0.022          (0.003)
Consumption: CRRA                                   η_c       4.008          (0.022)
Consumption shifter: on t (×10⁻¹)                   ϕ_1       1.331          (0.033)
Consumption shifter: on t² (×10⁻²)                  ϕ_2       -0.128         (0.013)
Consumption shifter: on t³ (×10⁻³)                  ϕ_3       -0.015         (0.002)
Consumption shifter: coef on married                ϕ_4       0.016          (0.003)
Leisure: standard deviation of shock                σ_ε       0.152          (0.011)
Leisure: spouse working                             a_2       -1.123         (0.095)
Bequest weight                                      b_1       157,510,736    (12,759,766)
Parameter heterogeneityᶜ
Leisure: mean of intercept                          µ_a0      -4.397         (0.060)
Leisure: standard deviation of intercept            σ_a0      0.004          (0.001)
HC productivity, mean                               µ_π       0.679          (0.025)
HC productivity, standard deviation                 σ_π       1.919          (0.171)
Correlation between a_0 & π                         ρ         -0.866         (0.072)
Initial human capital level at age 18
Intercept                                           γ_0       2.462          (0.021)
Coefficient on a_0                                  γ_a0      0.100          (0.004)
Coefficient on π                                    γ_π       0.669          (0.041)
Standard deviation of error term                    σ_H0      0.032          (0.030)
Additional Social Security Application effects
Effect of resource constraint (×10⁻³)               b_62      0.312          (0.047)
Effect of health insurance: constant (×10⁻³)        b_65      0.743          (0.128)
Effect of health insurance: coef on t (×10⁻³)       b_65t     0.714          (0.101)

χ² Statisticᵈ = 9976          Degrees of freedom = 207

ᵃ Indirect Inference estimates. Estimates use a diagonal weighting matrix. Standard errors are given in parentheses.
ᵇ HC: Human Capital.
ᶜ The joint distribution of (a_0, π) is a parametric discrete distribution with nine points determined by these five parameters, using a nine-point Gauss-Hermite approximation.
ᵈ This is the J-statistic. The critical value of the χ² distribution is χ²(207, 0.01) = 257.

Figure 4: Fit of the alternative model with no depreciation at work
[Figure omitted. Panels comparing simulation and data by age: (a) Employment rates; (b) Mean log measured wages (FE); (c) Mean log measured wages; and three further panels whose axes are labeled S.D. log measured wages, adult equivalent consumption, and Social Security application (ages 60–70).]

¹⁹ We did try estimating the model without any depreciation at all and not trying to fit the moment related to declining wages off the job. In results available from the authors we show that the model cannot fit the data.

²⁰ A classic early paper on this topic is Mincer and Polachek (1974), which estimates a net depreciation rate of around 1.5 percent per year. Mincer and Ofek (1982) go beyond this to discuss the difference between short term and long term losses from interruption. In the long run individuals invest in human capital to offset the initial loss, so Mincer and Ofek (1982)’s definition of short term losses is more closely related to our concept of depreciation. Using panel data methods for the National Longitudinal Survey of Mature Women they find estimates ranging from 5.6% to 8.9%. Light and Ureta (1995) use National Longitudinal Survey of Youth 1979 data and estimate that the immediate effect of a year of non-participation in the labor market leads to a decline in earnings of 25%. Kunze (2002) and Gorlich and de Grip (2009) both use German data (IAB employment sample and German Socio-economic panel respectively). Kunze (2002) finds estimates of about 2–5% wage losses for women from unemployment spells but about 13–18% from parental leave. Gorlich and de Grip (2009) find a variety of results ranging from around 1.5% to 5% depending on the type of spell.

²¹ While much of this literature is more focused on earnings than wages, some papers look at weekly earnings. Both Farber (1993) and Ruhm (1991) estimate the effect of a displacement on re-employment wages and obtain a range of estimates with most being around declines of 10% but varying from 6.5% to 16.9%. These numbers are not annualized but are just from the incidence of displacement. Li (2013) uses the same data but produces annualized versions so that the effects can be more easily compared to our estimate of δ. She estimates the effects for many different occupations with a huge range of estimates across occupations. Focusing on the three largest occupations she finds a depreciation of 9.4% for Installation and Repair workers, 7.7% for Production workers, and 17.4% for workers in Transportation.
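The comparison with the monthly estimate of Schmieder et al. (2016) rests on converting a 0.8% wage loss per month of unemployment into an annual rate. The text does not say which conversion is used, so the short calculation below shows two natural candidates, both of which are assumptions: simple scaling by twelve and monthly compounding.

```python
monthly_loss = 0.008  # 0.8% wage decrease per extra month of unemployment

# Simple scaling: twelve months at 0.8% each.
simple_annual = 12 * monthly_loss

# Compounding: wages fall by the factor (1 - 0.008) in each of twelve months.
compound_annual = 1 - (1 - monthly_loss) ** 12

baseline_delta = 0.091  # estimated depreciation rate from Table 3
```

Simple scaling gives 9.6% per year and compounding gives roughly 9.2%, so under either convention the implied annual rate sits close to the 9.1% estimate.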

5.2 Elasticity of Labor Supply

The key parameter in our model that determines the elasticity of labor supply is σ_ε, but its value is hard to interpret. In this subsection, we provide a measure to help the reader judge the magnitude. Since labor supply is discrete, we examine the elasticity along the extensive margin. At the individual level, the labor supply elasticity is zero unless the worker is exactly indifferent between working or not, in which case it is infinite. Therefore, we cannot construct the standard Marshallian and Hicksian labor supply elasticities. However, to compare our elasticity to something similar to what is estimated in the literature we construct a counterpart to these by increasing the human capital rental rate at different ages by 10% (from 1 to 1.1), and then simulating the percentage change in the employment rate using the baseline model and dividing by the difference in the measured wage.²² We should emphasize that this is not the actual elasticity in our model because it does not account for the shadow cost of time. We call it the analogue to the empirical elasticity (aee) and define it formally below.

Let h^b_t be the employment rate at age t in the baseline model and h^t_t be the employment rate at age t (denoted by the subscript) in the simulation in which we increase the rental rate at age t (denoted by the superscript) by 10%. With w^b_t and w^t_t denoting the corresponding measured wages, the elasticity is calculated as

\[
aee \equiv \frac{\log(h^t_t) - \log(h^b_t)}{\log(w^t_t) - \log(w^b_t)} \tag{26}
\]

This summary statistic is plotted in Figure 5. One can see that according to this measure, labor supply appears to be much more elastic at older ages than at younger ages. This is due in large part to the shadow cost of leisure. The shadow cost is substantially larger for young workers than for older workers since the older workers have a shorter time horizon. As a result, the labor supply of young workers is less responsive to temporary wage shocks than is the labor supply of older workers. It is also due to the density of the tastes for leisure γ_t. When the probability of working is closer to 50% the density of people close to being indifferent will be larger, which results in a larger elasticity.
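Given simulated employment rates and measured wages from the baseline and from a run with a higher rental rate at one age, the aee in equation (26) is a ratio of log differences. The sketch below uses made-up inputs purely to show the calculation; the function name and the numbers are hypothetical and do not come from the model.

```python
import math


def aee(h_base, h_shock, w_base, w_shock):
    """Analogue to the empirical elasticity, equation (26).

    h_base, h_shock: employment rates at age t in the baseline and in the
        simulation with the higher rental rate at age t.
    w_base, w_shock: the corresponding measured wages at age t.
    """
    return (math.log(h_shock) - math.log(h_base)) / (
        math.log(w_shock) - math.log(w_base)
    )


# Made-up example: employment rises from 60% to 63% when the wage rises by 10%.
example_aee = aee(h_base=0.60, h_shock=0.63, w_base=15.0, w_shock=16.5)
```

With these illustrative numbers the ratio is log(1.05)/log(1.10), or a little above 0.5. In the model, the wage response need not equal the 10% change in the rental rate, because investment can adjust as well.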

For individuals under age 60 these estimates are very close to the estimates of labor supply elasticities found in the literature, though our definition of labor supply is not identical to theirs, so the two are not precisely the same. For example, the early literature estimates the Frisch elasticity to be 0.09 (Browning et al., 1999), 0.15 (MaCurdy, 1981), and 0.31 (Altonji, 1986). Chetty (2012) reports extensive (Hicksian) labor supply elasticities around 0.25 combining estimates from many different studies and approaches. Focusing on retirement ages, Rogerson and Wallenius (2013) suggest that the IES is 0.75 or greater given empirically reasonable levels of nonconvexities or fixed costs. The average of our estimates between ages 55 and 65 is remarkably close to theirs.

22In simulations we assume that the increase in rental rates is anticipated.

Figure 5: Analogue to the empirical elasticity of labor supply
[Figure omitted: elasticity (vertical axis) plotted against age, 20 to 70 (horizontal axis).]

6 Roles of Health, Disability, or Part Time

We have intentionally kept our model simple to show that human capital can explain the dramatic fall in labor supply at the end of the life cycle. However, there are many alternative reasons why labor supply might decline. Aside from Social Security rules, which we have already incorporated, the most important is health (e.g. Currie and Madrian 1999, French and Jones 2011), where we include disability as part of health. If the primary reason for retirement is health or disability, its omission might seriously distort our results. In this section we incorporate health and disability into our model in a very flexible way. We show that while health is an important factor, it is not the primary driver of retirement.

We also investigate the case where individuals can choose to work part time and show that the option of partial labor supply is not a major contributing factor to retirement for the workers we study.

We use the estimates of these models in the next section to show that our policy counterfactuals do not change much when we include them.

6.1 The Role of Health and Disability

We allow for an additional state variable, health status, $S_{it} \in \{1, 2, 3, 4\}$, with 1 being in excellent health, 2 in good health, 3 in bad health, and 4 being disabled. We model the disability state as absorbing; it also makes one eligible for certain benefits, including Social Security Disability Insurance or Supplemental Security Income (Appendix B.4). Each individual is assumed to have good health from the beginning of the first period up to age 49, $S_{it} = 2$ for $t \le 49$. After age 49, health status evolves exogenously according to a time-dependent probability transition matrix, and is realized at the beginning of each period before any choice is made.23

We allow the taste for leisure in the utility function (1) to depend on the health status and change with age,

\[
\gamma_{it} = \exp\left( a_{i0} + \sum_{j=1}^{2} a_j 1\{M_{it} = j\} + \sum_{j=1}^{4} 1\{S_{it} = j\}\left( a^0_{sj} + a^1_{sj}\, t \right) + \varepsilon_{it} \right). \tag{27}
\]

That is, individuals with excellent, bad, or disabled health status have a different taste for leisure than those with good health and this difference changes as they age.24 We refer to this model as the baseline health model. We normalize $a^0_{s2} = 0$ and we assume that leisure taste only changes for non-healthy people; that is, we assume $a^1_{s1} = a^1_{s2} = 0$ but estimate $a^1_{s3}$ and $a^1_{s4}$. To estimate these five new parameters, $\{a^0_{s1}, a^0_{s3}, a^0_{s4}, a^1_{s3}, a^1_{s4}\}$, we include three more sets of moment conditions: the difference in employment rates between workers with excellent health and workers with good health, the employment rate difference between workers with good health and workers with bad health, as well as the difference between workers with bad health and those with disability, across ages from 50 to 65. The data moments are derived from the 1963–2007 CPS March data and the raw patterns can be seen in panel (f) of Figure 6.
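To make the health-dependent taste for leisure in equation (27) concrete, here is a minimal Python sketch that evaluates it using the point estimates reported in the "with Health" columns of Table 5. Several choices are assumptions made only for illustration: the function name `leisure_taste`, coding single individuals as `marital = 0`, treating `t` as whatever age index the estimation uses, and entering the shock `eps` additively exactly as the equation is written.

```python
import math

# Point estimates from the "with Health" columns of Table 5.
A_SPOUSE = {1: 0.233, 2: -0.485}   # a_1 (spouse not working), a_2 (spouse working)
A0_HEALTH = {1: -0.417, 2: 0.0, 3: 0.139, 4: 2.516}  # a^0_{sj}; good health normalized to 0
A1_HEALTH = {1: 0.0, 2: 0.0, 3: 0.013, 4: 0.015}     # a^1_{sj}; zero for excellent and good

def leisure_taste(a_i0, marital, health, t, eps=0.0):
    """Taste for leisure as in equation (27).

    marital: 1 or 2 as in the indicator 1{M_it = j}; any other value (assumed
             here to mean single) contributes nothing.
    health:  1 excellent, 2 good, 3 bad, 4 disabled.
    """
    spouse_term = A_SPOUSE.get(marital, 0.0)
    health_term = A0_HEALTH[health] + A1_HEALTH[health] * t
    return math.exp(a_i0 + spouse_term + health_term + eps)

# Ratio of leisure taste, disabled versus good health, at t = 50 for the same person.
a_i0 = -6.133  # mean of the leisure intercept in Table 5
ratio = leisure_taste(a_i0, 0, 4, 50) / leisure_taste(a_i0, 0, 2, 50)
print(ratio)
```

Because the health terms enter inside the exponential, the ratio between two health states does not depend on `a_i0` or on the marital term.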

We then re-estimate the model. The estimates of the parameters are listed in the first two columns of Table 5. The fit of the model is presented in Figure 6. Including health leads to a similar fit for the base profiles (a)–(e), and the additional moments shown in panel (f) fit well. The transitions are shown in Table 6.

23The health transition matrix is estimated from the CPS data. We include the health status from age 50 for two reasons. First, most individuals have excellent or good health before age 50. Second, this simplification reduces computation time.

24A key aspect of the thought experiment behind this paper is to not allow preferences to vary systematically with age in our baseline model. In practice we can only fit the interaction of health and labor supply in the data by allowing for an interaction between health and tastes for leisure in this model with health. The main point of this subsection is that health is important but not essential to explain the profiles, so even though we are favoring the model with health by allowing this extra flexibility, health has a relatively minor role.

Just because the fit does not improve much does not imply that health does not play an important role. It may just be that either health or human capital could explain retirement.25 To explore the implications of health we use the model estimated with health, but then simulate a counterfactual in which there is no health change. We eliminate the importance of health for individuals over 50 in two different ways: (1) we do not allow their health to worsen and (2) we eliminate the interaction between health and preferences for work. Specifically, we first simulate an experiment in which the health status an individual had at age 50 remains for the rest of their life. Second, in addition to fixing the health status at age 50, for individuals with excellent, bad, or disabled health status at and after age 50, we assume their taste for leisure does not change with age. That is, letting $t^*$ be the time period when the individual turns 50, we assume that the taste for leisure is now

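\[
\gamma_{it} = \exp\left( a_{i0} + \sum_{j=1}^{2} a_j 1\{M_{it} = j\} + \sum_{j=1}^{4} 1\{S_{it} = j\}\left( a^0_{sj} + a^1_{sj} \cdot \min\{t, t^*\} \right) + \varepsilon_{it} \right) \tag{28}
\]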

and $S_{it} = S_{it^*}$ for $t > t^*$. We then re-solve the modified model and simulate the life-cycle profile for each individual using the same estimates from the aforementioned baseline health model.26 The labor supply profiles of these two experiments are plotted as the long dashed lines in Figure 7. If the health condition does not change with age, workers do supply more labor in both experiments. The average difference in labor supply between the first counterfactual and the baseline health model is 15.5%. This implies that the worsening health condition is an important factor inducing retirement. When we assume the taste for leisure does not vary with age for any health status, the difference in labor supply between the second counterfactual and the baseline health model is only slightly larger, 18.1%. This implies that at least in our model the increasing taste for leisure is not an important factor for retirement. Overall, these experiments imply that in our model health is not a major factor driving retirement. This result confirms findings in the previous literature. French (2005) estimates that changes in health account for roughly 10% of the drop in labor force participation rates between ages 55 and 70, and the contribution to hours worked by workers near retirement is much smaller. Blau and Shvydko (2011) also report that health deterioration is an important but not major cause of retirement.
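The two modifications behind these counterfactuals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the health path is represented as a hypothetical dictionary from age to status, and the helper names `freeze_health_path` and `trend_age` are not from the paper. The sketch shows the bookkeeping only; the paper re-solves the full model under the modified process.

```python
def freeze_health_path(health_by_age, freeze_age=50):
    """First experiment: keep the health status held at freeze_age for all later ages."""
    frozen_status = health_by_age[freeze_age]
    return {
        age: (status if age <= freeze_age else frozen_status)
        for age, status in health_by_age.items()
    }

def trend_age(t, t_star):
    """Second experiment: the age entering the health time trend in (28) stops at t*."""
    return min(t, t_star)

# Hypothetical simulated path: good health through 49, bad at 50, disabled from 52.
simulated = {48: 2, 49: 2, 50: 3, 51: 3, 52: 4, 53: 4}
counterfactual = freeze_health_path(simulated)
print(counterfactual)
```

In the second experiment both pieces apply together: the status is frozen and the trend term uses `trend_age(t, t_star)` in place of `t`.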

The relatively small effect comes from the fact that bad health is relatively uncommon,

25Note that this is not to say they are not separately identified. The extra moments we use for the health model identify the importance of health.

26We are assuming that agents have rational expectations and are aware that their health status will not change. We have also simulated models in which they are not aware that their health status will remain fixed; it does not change the basic message.

Table 5: Estimates in the models with health and with part time option^a

                                                            with Health                 with Part Time
Parameters                                      Symbol      Estimate      S.E.          Estimate      S.E.
HC depreciation^b                               δ           0.121         (0.013)       0.092         (0.011)
HC production function: I factor                α_I         0.113         (0.024)       0.093         (0.010)
HC production function: H factor                α_H         0.197         (0.027)       0.101         (0.012)
Standard deviation of HC innovation             σ_ξ         0.014         (0.009)       0.013         (0.003)
Consumption: CRRA                               η_c         4.083         (0.048)       4.007         (0.106)
Consumption shifter: coef on t (×10)            ϕ_1         0.226         (0.080)       0.306         (0.071)
Consumption shifter: coef on t^2 (×10^2)        ϕ_2         0.216         (0.037)       0.135         (0.016)
Consumption shifter: coef on t^3 (×10^3)        ϕ_3         -0.053        (0.007)       -0.034        (0.002)
Consumption shifter: coef on married            ϕ_4         1.173         (0.441)       0.075         (0.008)
Leisure: standard deviation of shock            a_ε         0.581         (0.063)       0.208         (0.017)
Leisure: spouse not working                     a_1         0.233         (0.179)       0.875         (0.100)
Leisure: spouse working                         a_2         -0.485        (0.322)       -0.748        (0.075)
Leisure: excellent health                       a^0_{s1}    -0.417        (0.077)       -             -
Leisure: bad health                             a^0_{s3}    0.139         (0.123)       -             -
Leisure: bad health time trend                  a^1_{s3}    0.013         (0.003)       -             -
Leisure: disabled                               a^0_{s4}    2.516         (0.385)       -             -
Leisure: disabled time trend                    a^1_{s4}    0.015         (0.006)       -             -
Part time utility: constant                     a_{ϖ0}      -             -             -1.557        (0.021)
Part time utility: coef on t (×10)              a_{ϖ1}      -             -             0.174         (0.014)
Part time utility: coef on t^2 (×10^2)          a_{ϖ2}      -             -             0.008         (0.001)
Bequest weight                                  b_1         36,044,236    (7,321,871)   21,931,152    (1,017,202)

Parameter heterogeneity^c
Leisure: mean of intercept                      µ_{a0}      -6.133        (0.214)       -5.453        (0.178)
Leisure: standard deviation of intercept        σ_{a0}      1.610         (0.176)       0.829         (0.192)
HC productivity, mean                           µ_π         1.836         (0.097)       1.760         (0.050)
HC productivity, standard deviation             σ_π         0.464         (0.071)       0.597         (0.059)
Correlation between a0 & π                      ρ           -0.151        (0.004)       -0.825        (0.079)

Initial human capital level at age 18
Intercept                                       γ_0         1.465         (0.472)       1.630         (0.685)
Coefficient on a0                               γ_{a0}      0.191         (0.047)       0.107         (0.056)
Coefficient on π                                γ_π         1.047         (0.349)       0.769         (0.394)
Standard deviation of error term                σ_{H0}      0.138         (0.070)       0.073         (0.099)

Additional Social Security Application effects
Effect of resource constraint (×10^3)           b_62        0.153         (0.078)       0.223         (0.007)
Effect of health insurance: constant (×10^3)    b_65        0.077         (0.075)       0.006         (0.001)
Effect of health insurance: coef on t (×10^3)   b_65t       0.126         (0.072)       0.153         (0.005)

χ2 statistic^d                                              1489                        2905
Degrees of freedom                                          250                         248

a Indirect Inference estimates. Estimates use a diagonal weighting matrix. Standard errors are given in parentheses.
b HC: Human Capital.
c The joint distribution of (a0, π) is a parametric discrete distribution with nine points determined by these five parameters, using a nine-point Gauss-Hermite approximation.
d This is the J-statistic. The critical values of the χ2 distribution are χ2(250, 0.01) = 305 and χ2(248, 0.01) = 303.

Figure 6: Fit of model with health
[Figure omitted. Panels: (a) Employment rates; (b) Mean log measured wages (log wages and log wages with fixed effects); (c) S.D. log measured wages; (d) Adult equivalent consumption; (e) Social Security application; (f) ER difference, Excellent vs Good vs Bad vs Disabled. Each panel plots simulation against data by age: ages 22 to 65 in panels (a)–(d), ages 60 to 69 in panel (e), and ages 50 to 65 in panel (f), which shows the E−G, G−B, and B−D differences in employment rates.]

Table 6: Transitions from the model with health or part time option^a

                          Transition Probabilities          Wage Change Rate
                          Working to       Not Working      After One
                          Not Working      to Working       Nonemployment Spell^b
Data                      0.034            0.200            -0.071
With health               0.039            0.250            -0.080
With Part Time option     0.040            0.216            -0.081

a The transition rate is the average transition probability between age 35 and 50.
b The average wage change rate after one nonemployment spell is the average change rate between age 41 and 65.

Figure 7: Sensitivity to health preferences: employment rates
[Figure omitted. Panels: (a) Health status no change after age 50; (b) Health status and taste of leisure no change after age 50. Each panel plots employment rates against age, 20 to 80, for the series Baseline, Fixed at 50, All Excellent, All Good, All Bad, and All Disabled.]

not from the fact that it doesn't affect retirement. To see this, we show that at the individual (as opposed to aggregate) level, disability does induce an immediate and permanent decline in labor supply. We do this by assuming a worker's health status becomes excellent (or good/bad/disabled) permanently at age 50. The corresponding labor supply profiles are also plotted in Figure 7. These counterfactuals illustrate that upon becoming disabled, which is permanent, most workers will retire immediately and permanently.

6.2 The Role of Part Time

We now perform a similar exercise in which we include the choice of part time work in the model. In each period, an individual decides to work full time ($\ell_{it} = 0$), to work part time ($\ell_{it} = p$), or not to work ($\ell_{it} = 1$). As we did for health, we allow the utility from working part time to vary across ages, and we estimate it.27 One can see from Figure 8(b) that part time work is uncommon for the sample we study. Working part time means spending half of one's time in the labor market and the other half at leisure. We let $\varpi_t$ be the parameter that determines part time utility, which varies across ages but not across individuals, and we assume the utility of leisure associated with part time work is $\gamma_{it}\varpi_t$. We restrict this variable to be in the unit interval so the utility of leisure from part time work lies between that from full time work and that from no work.

If an individual chooses to work part time, the investment in human capital is $I_{it} \in \left[0, \tfrac{1}{2}\right]$ and the effective work time is $\tfrac{1}{2} - I_{it}$, with wage earning

\[
w_{it} = H_{it} \cdot \left( \tfrac{1}{2} - I_{it} \right).
\]

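The solution is analogous to the baseline model with binary labor supply choices. The optimal labor supply solution is

\[
\ell_{it} = \arg\max_{\ell \in \{0, p, 1\}} \left\{ V_{t\ell}(X_{it}) + \gamma_{it}\left( 1\{\ell = 1\} + \varpi_t 1\{\ell = p\} \right) \right\} \tag{29}
\]

where

\[
\left\{ C_{itp}(X_{it}), I_{itp}(X_{it}), SSA_{itp}(X_{it}) \right\} \equiv \arg\max_{c, I, ssa} \left\{ \psi_{tM_{it}} \frac{c^{1-\eta_c}}{1-\eta_c} + v_t(ssa) + \beta E\left[ V_{t+1}(X_{it+1}) \mid X_{it}, c, \ell_{it} = p, I, ssa \right] \right\} \tag{30}
\]

\[
V_{tp}(X_{it}) \equiv \psi_{tM_{it}} \frac{\left( C_{itp}(X_{it}) \right)^{1-\eta_c}}{1-\eta_c} + v_t\left( SSA_{itp}(X_{it}) \right) + \beta E\left[ V_{t+1}(X_{it+1}) \mid X_{it}, C_{itp}(X_{it}), \ell_{it} = p, I_{itp}(X_{it}), SSA_{itp}(X_{it}) \right] \tag{31}
\]

27In this sense our main goal is to test the robustness of our model to inclusion of part time work rather than explain part time work per se.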

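The details of solving this model depend on the three values of $V_{t\ell}(X_{it})$ as well as $\varpi_t$. We discuss the details in Appendix A. In the most interesting case all three outcomes of labor supply occur with positive probability. In this case the model is like an ordered probit with cutoffs

\[
\ell_{it} =
\begin{cases}
0, & \varepsilon_{it} < \varepsilon^*_{t1}(X_{it}), \\
p, & \varepsilon^*_{t1}(X_{it}) < \varepsilon_{it} < \varepsilon^*_{t2}(X_{it}), \\
1, & \varepsilon_{it} > \varepsilon^*_{t2}(X_{it}),
\end{cases}
\]

where

\[
\varepsilon^*_{t1}(X_{it}) = \log\left( \frac{V_{t0}(X_{it}) - V_{tp}(X_{it})}{\varpi_t} \right) - a_{i0} - \sum_{j=1}^{2} a_j 1\{M_{it} = j\}
\]

\[
\varepsilon^*_{t2}(X_{it}) = \log\left( \frac{V_{tp}(X_{it}) - V_{t1}(X_{it})}{1 - \varpi_t} \right) - a_{i0} - \sum_{j=1}^{2} a_j 1\{M_{it} = j\}.
\]

The expected value function still has a closed form, but it is complicated and given in Appendix A.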
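The cutoff rule can be checked against the direct maximization in equation (29) with a short Python sketch. Everything numerical here is hypothetical: the three values of $V_{t\ell}$, the part time weight, and the intercept are made up so that the interior case applies ($V_{t0} > V_{tp} > V_{t1}$ with the first cutoff below the second). The function names and the single argument `spouse_shift`, which stands in for the marital-status sum, are likewise illustrative.

```python
import math

def cutoffs(v_full, v_part, v_none, varpi, a_i0, spouse_shift):
    """Cutoffs (eps1, eps2) in the interior case V_t0 > V_tp > V_t1, 0 < varpi < 1."""
    base = a_i0 + spouse_shift
    eps1 = math.log((v_full - v_part) / varpi) - base
    eps2 = math.log((v_part - v_none) / (1.0 - varpi)) - base
    return eps1, eps2

def choice_from_cutoffs(eps, eps1, eps2):
    """Ordered rule: full time below eps1, part time in between, no work above eps2."""
    if eps < eps1:
        return "0"
    if eps < eps2:
        return "p"
    return "1"

def choice_from_argmax(eps, v_full, v_part, v_none, varpi, a_i0, spouse_shift):
    """Direct maximization of V_tl + gamma * (1{l = 1} + varpi * 1{l = p})."""
    gamma = math.exp(a_i0 + spouse_shift + eps)
    payoffs = {"0": v_full, "p": v_part + gamma * varpi, "1": v_none + gamma}
    return max(payoffs, key=payoffs.get)

# Hypothetical values for one state X_it (not from the paper).
params = dict(v_full=-1.0, v_part=-1.2, v_none=-2.0, varpi=0.5, a_i0=-1.0, spouse_shift=0.0)
eps1, eps2 = cutoffs(**params)
grid = [-3.0 + 0.25 * k for k in range(25)]
agree = all(
    choice_from_cutoffs(e, eps1, eps2) == choice_from_argmax(e, **params) for e in grid
)
print(eps1 < eps2, agree)
```

The ordered structure relies on the first cutoff lying below the second; when it does not, part time is never chosen and the problem collapses to a binary comparison, which is one of the other cases the text defers to Appendix A.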

Following the same strategy as in section 4 and assuming

\[
\varpi_t = \frac{1}{1 + \exp\left( -a_{\varpi 0} - a_{\varpi 1} \cdot t - a_{\varpi 2} \cdot t^2 \right)},
\]

we re-estimate the model with part time option.28 Table 5 presents parameter estimates. The fit of the model, as shown in Figure 8 and Table 6, is similar to that of the baseline model.
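The logistic form keeps the part time weight inside the unit interval for any parameter values. A small Python sketch illustrates this using the part time estimates from Table 5. Two things are assumptions on my part and are flagged in the code: that the "(×10)" and "(×10^2)" labels mean the reported numbers must be divided by 10 and 100 to recover the coefficients, and that `t` can be any non-negative age index. The function name `part_time_weight` is hypothetical.

```python
import math

# Part time utility estimates from Table 5.
A_VARPI_0 = -1.557
A_VARPI_1 = 0.174 / 10    # reported as "coef on t (x10)"; the rescaling is an assumption
A_VARPI_2 = 0.008 / 100   # reported as "coef on t^2 (x10^2)"; the rescaling is an assumption

def part_time_weight(t, a0=A_VARPI_0, a1=A_VARPI_1, a2=A_VARPI_2):
    """Logistic weight on leisure utility when working part time."""
    return 1.0 / (1.0 + math.exp(-a0 - a1 * t - a2 * t * t))

# Evaluate on an illustrative grid of age indices.
weights = {t: part_time_weight(t) for t in (0, 20, 40, 60)}
print(weights)
```

With both slope coefficients positive, the weight rises with `t`, so under these assumptions part time work delivers a larger share of the leisure utility at older ages.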

To investigate the effect of having a part time option, we use the estimates in Table 5 and simulate a counterfactual without such an option. Figure 9 presents the profiles of labor supply and human capital. It appears that removing the part time option does not change the retirement pattern significantly, suggesting that the more flexible labor supply arrangement is not the major source of retirement.

28We include the part time rate at each age from 22 to 65 as additional moments.


Figure 8: Fit of model with part time
[Figure omitted. Panels: (a) Employment rates; (b) Employment rates, part time; (c) Log wages and log wages with fixed effects; (d) S.D. log measured wages; (e) Adult equivalent consumption; (f) Social Security application. Each panel plots simulation against data by age: ages 22 to 65 in panels (a)–(e) and ages 60 to 70 in panel (f).]

Figure 9: Sensitivity to part time option: turn off the part time option
[Figure omitted. Panels: (a) Employment rates; (b) Human Capital levels. Each panel plots the series Baseline and Remove Part Time option against age, 22 to 65.]

7 Changes in Tax and Social Security

The preceding sections show that the model fits the life-cycle profiles of labor supply and log measured wages in the data well. In this section, we use the model to predict how changes in taxes or Social Security rules would affect labor supply, human capital investment, and the resulting log wage profile. We conduct seven counterfactual policy experiments which reflect various changes in tax and Social Security rules.

The policy experiments are
i) Increase taxes proportionally by 50%
ii) Eliminate Social Security earnings test
iii) Increase the Normal Retirement Age to 67
iv) Reduce Social Security benefits by 20%
v) Eliminate Social Security taxes
vi) Eliminate Social Security benefits
vii) Eliminate Social Security system (both taxes and benefits)

It is important to recognize that we are focusing on men with exactly 12 years of education. A full evaluation would require incorporating the other demographic groups as well.

The results of these experiments are summarized in columns 1–7 of Table 7, where the first panel is from the baseline model, the second is from the model with health, and the third is from the model with part time option. All numbers are summations or averages throughout the life cycle (from age 18 to 80).

Column 1 shows the result from the first experiment. A tax hike has both substitution and income effects. The substitution effect discourages labor supply while the income effect encourages labor supply. Our first experiment indicates that in our model the income effect dominates the substitution effect and an average individual works one additional year over the life cycle, equivalent to 2.5% of total lifetime labor supply.29 Most interesting about our model is the effect on human capital investment, which increases by 3.4%, leading to a 2.4% increase in the human capital level and a 0.9% increase in log measured wages (equivalent to a 2.4% increase in measured wage levels). The direct effect of taxes discourages human capital investment, but the increase in labor supply (and in particular delayed retirement) increases human capital investment. Most of the increase in labor supply is allocated to effective labor, which increases by almost one year or 2.4%.30 Annual consumption falls by 2.5%.
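The two wage figures quoted for the tax experiment are linked by a simple conversion, sketched here in Python. The inputs are the Panel A entries of Table 7 for the average log wage (baseline level 2.579, change 0.024); the function name `log_change_to_percent` is hypothetical.

```python
import math

def log_change_to_percent(delta_log):
    """Convert a change in log wages into a percentage change in wage levels."""
    return (math.exp(delta_log) - 1.0) * 100.0

BASELINE_LOG_WAGE = 2.579   # Table 7, Panel A, "Average ln w", Level
DELTA_LOG_WAGE = 0.024      # Table 7, Panel A, "Average ln w", tax increase experiment

# Change expressed relative to the baseline value of the log wage itself.
percent_of_log = 100.0 * DELTA_LOG_WAGE / BASELINE_LOG_WAGE
# Change expressed as a percentage of the wage level.
percent_of_level = log_change_to_percent(DELTA_LOG_WAGE)
print(round(percent_of_log, 2), round(percent_of_level, 2))
```

The first number is a percentage of the log wage, which is why it is so much smaller than the second, which is a percentage of the wage itself.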

The manner in which Social Security rules affect labor supply and wages is of central interest to policy makers. The remaining six experiments are devoted to answering these questions. In the first three we manipulate the current Social Security rules (columns 2–4) while in the last three we decompose the distortionary effects of the current Social Security system (columns 5–7).

First we remove the Social Security earnings test, which is effective between ages 62 and 70 in the baseline model. In the second, we delay the Normal Retirement Age (NRA) by two years: the new NRA is age 67 in this counterfactual experiment while it is age 65 in the baseline model. In the third, we reduce the Social Security benefit proportionally by 20%. The results are presented in columns 2–4 of Table 7. Removing the Social Security earnings test between ages 62 and 70 has a smaller effect on all variables; delaying the normal retirement age by two years has a larger impact; reducing the generosity of the Social Security benefit has the largest effect among these three. For instance, they increase labor supply by three-and-a-half, five, and seven months, respectively. One important feature is that the change in labor supply does not only happen later in the life cycle when the policy change is directly effective; it takes place over the whole life cycle, as indicated in Figure 10. For instance, delaying the Normal Retirement Age or universally reducing the Social Security benefit induces substantially higher employment rates as well as more investment, resulting in persistently higher human capital levels and therefore higher wages at old ages. The wage difference is negligible before age 60 but increases substantially after that, reaching 7% around age 67. Our results are echoed in Gohl et al. (2019), who estimate a related effect directly and find that employed women aged 53–60 increase their human capital investment substantially when the early retirement age is increased from 60 to 63 in Germany. Ignoring such a human capital or wage response in experiments involving retirement policy will most likely introduce bias. The budget calculation in Table C3 shows that these three experiments reduce the Social Security deficit by 1.2%, 23.6%, and 38.9%, respectively.

29In our model leisure and consumption are separable. In such a model, whether the income or substitution effect dominates depends on whether ηc is larger or smaller than one. We estimate ηc to be around 4, which is well within the estimates in the literature.

30Other papers have looked at the effects of taxes and human capital with this type of model. Examples are Heckman et al. (1998b), Heckman et al. (1999), and Taber (2002). These experiments are quite different as labor supply makes a large difference here, so the results are not directly comparable.
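The months quoted in the text follow directly from the labor supply row of Panel A in Table 7, as the following Python sketch shows. The dictionary keys and the function name `summarize` are illustrative; the numbers are the table entries.

```python
BASELINE_LABOR_SUPPLY_YEARS = 40.293  # Table 7, Panel A, Labor Supply, Level

# Change in total years worked, Table 7, Panel A, columns 2-4.
DELTA_YEARS = {
    "No Earnings Test": 0.285,
    "NRA = 67": 0.428,
    "Reduce SSB 20%": 0.592,
}

def summarize(delta_years, baseline_years):
    """Express a change in lifetime years worked in months and as a percent of baseline."""
    return {
        "months": 12.0 * delta_years,
        "percent": 100.0 * delta_years / baseline_years,
    }

summary = {
    name: summarize(delta, BASELINE_LABOR_SUPPLY_YEARS)
    for name, delta in DELTA_YEARS.items()
}
for name, row in summary.items():
    print(name, round(row["months"], 1), round(row["percent"], 2))
```

The percentages recomputed this way match the %∆ column of the table up to rounding of the reported ∆Level entries.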

In the last three experiments, we decompose the effect of the current U.S. Social Security system into the individual effects of the Social Security taxes and the Social Security benefit. In Column 5 we keep the Social Security benefit but eliminate the Social Security taxes (the payroll taxes);31 in Column 6 we remove the Social Security benefit completely but keep the Social Security taxes; in Column 7 we remove the entire Social Security system, that is, both the Social Security taxes and the benefit. Removing the Social Security taxes induces an average individual to supply 2.8 fewer years of labor. This is not surprising because removing the Social Security taxes is essentially a universal cut in the tax rate. In our tax hike counterfactual the income effect dominates the substitution effect, and the same is true for the cut in Social Security taxes. Analogously, removing the Social Security benefit induces more labor supply. However, the increase in labor supply is 4.0 years, which is larger than the 2.8 year reduction in labor supply in the case of removing Social Security taxes. The combination of these two effects leads to the results in the last experiment, where both the Social Security taxes and benefit are removed. Column 7 indicates that eliminating the current Social Security system increases average labor supply by 0.3 years over the life cycle. A similar observation is made qualitatively in Gustman and Steinmeier (1986) and Rust and Phelan (1997). Figure 11 shows that the changes in labor supply and log wages are most pronounced at old ages when either the taxes or the benefit is removed from the Social Security system.
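A quick calculation with the Panel A labor supply entries of Table 7 shows how the two partial experiments relate to the joint one. This is only arithmetic on the reported numbers; the variable names are illustrative, and the label "interaction" is my shorthand for the gap between the joint effect and the sum of the separate effects, not a term used in the paper.

```python
# Change in total years worked, Table 7, Panel A, Labor Supply, columns 5-7.
DELTA_LABOR_SUPPLY_YEARS = {
    "No SS Taxes": -2.757,
    "No SS Benefit": 4.002,
    "No SS System": 0.274,
}

# Sum of the two experiments that each remove one side of the system.
sum_of_separate = (
    DELTA_LABOR_SUPPLY_YEARS["No SS Taxes"] + DELTA_LABOR_SUPPLY_YEARS["No SS Benefit"]
)
# Effect when both sides are removed together.
joint = DELTA_LABOR_SUPPLY_YEARS["No SS System"]
# Gap between the joint effect and the sum of the separate effects.
interaction = joint - sum_of_separate
print(round(sum_of_separate, 3), joint, round(interaction, 3))
```

The sign of the joint effect matches the sign of the sum, but the magnitudes differ, so the separate effects are not simply additive in these simulations.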

Another point worth emphasizing is that in almost every policy counterfactual, the change in the endogenously determined wage levels is substantial. This is especially true at old ages (between age 60 and 75):32 as high as 7%–9% when removing the earnings test, delaying the NRA by two years, or reducing the Social Security benefit; almost ±30% when removing Social Security benefits or taxes; and 10% when removing the entire Social Security system. These are caused by changes in human capital levels as a result of higher or lower investment. This makes the importance of endogenizing human capital clear. Ignoring the human capital investment channel would generate substantial bias in terms of predicting labor supply at old ages in similar experiments.

31The income taxes are still effective.

32The employment rate is very low after age 75 so the wage comparison is less interesting.

Table 7 also presents the results of experiments from the alternative models, specifically, Panel B from the model with health and Panel C from the model with part time option. Across these two different models the responses to the policy changes are qualitatively similar to our baseline model in many variables across all experiments, with a few quantitatively notable differences. One exception is from the model with health in Panel B. In the last experiment in Column 7, where we remove the whole Social Security system, in the model with health an average worker works nine fewer months, while in all other models the same worker supplies more labor. This is because in the model with health workers have an additional incentive to accumulate AIME due to the SSDI and SSI benefits; removing such an incentive has a negative effect on labor supply. It is worth noting that in the model with part time option, many policy changes cause large proportional changes in workers taking the part time option, indicating that this option is policy sensitive.33

8 Robustness Check

In section 4.1 we set some of the parameters to certain values taken from the previous literature. In this section we vary those pre-set parameters to see how they affect our estimation results. In particular, we check the following variants: (1) increase the consumption floor c from 2.19 to 2.5; (2) decrease the consumption floor c from 2.19 to 1.8; (3) decrease the time discount factor β from 0.97 to 0.96 but increase the interest rate r from 0.03 to 0.04; (4) increase the initial asset A0 from 0.0 to 50,000. In each case, all other pre-set parameters are kept the same as in the baseline model, and then we re-estimate all of the parameters of the model. The estimation results are listed in Table 9; the moments are plotted in Figure 12 and listed in Table 8 as well.

In all cases the simulated moments fit the data moments quite well. Varying pre-set parameters does change the estimated values of some parameters, but in all variants our model generates simulated auxiliary models which match the data auxiliary models quite well.

33Another reason is that not many workers take the part time option in the baseline model, as in the data. Thus a small change in level may result in a large change in the percentage.

Tabl

e7:

Effe

cts

ofch

angi

ngta

xes

orSo

cial

Secu

rity

rule

s,ba

selin

em

odel

01

23

45

67

Base

line

Tax

Incr

ease

50%

No

Earn

ings

Test

NR

A=

67R

educ

eSS

B20

%N

oSS

Taxe

sN

oSS

Bene

fitN

oSS

Syst

emLe

vela

∆Le

velb

%∆

c∆

Leve

l%

∆∆

Leve

l%

∆∆

Leve

l%

∆∆

Leve

l%

∆∆

Leve

l%

∆∆

Leve

l%

∆

Pane

lA:B

asel

ine

Mod

elLa

bor

Supp

ly40

.293

0.99

42.

467

0.28

50.

708

0.42

81.

063

0.59

21.

470

-2.7

57-6

.841

4.00

29.

933

0.27

40.

681

Effe

ctiv

eLa

bor

0.59

70.

014

2.40

00.

004

0.70

50.

006

1.02

90.

009

1.46

8-0

.040

-6.7

150.

059

9.90

90.

005

0.82

5Pr

e-ta

xIn

com

e10

.063

0.40

03.

973

0.07

00.

696

0.13

81.

371

0.20

01.

986

-0.9

15-9

.088

1.25

112

.433

0.01

00.

098

Ave

rage

lnw

2.57

90.

024

0.94

40.

005

0.18

90.

012

0.44

80.

018

0.68

4-0

.081

-3.1

480.

073

2.82

30.

017

0.67

7H

uman

Cap

ital

14.0

300.

344

2.44

90.

073

0.52

30.

122

0.87

30.

163

1.15

9-0

.821

-5.8

531.

010

7.19

8-0

.038

-0.2

71In

vest

men

t0.

042

0.00

13.

396

0.00

030.

748

0.00

11.

542

0.00

11.

497

-0.0

04-8

.614

0.00

410

.270

-0.0

01-1

.346

Con

sum

ptio

n8.

975

-0.2

23-2

.484

0.00

50.

057

-0.0

32-0

.358

-0.0

52-0

.577

0.26

82.

991

-0.2

24-2

.497

0.01

20.

138

Pane

lB:M

odel

wit

hH

ealt

hLa

bor

Supp

ly42

.066

0.71

01.

687

0.07

10.

168

0.32

20.

766

0.21

80.

517

-1.5

98-3

.800

1.34

43.

194

-0.7

55-1

.795

Effe

ctiv

eLa

bor

0.61

70.

010

1.69

00.

001

0.21

00.

004

0.72

80.

003

0.56

5-0

.023

-3.6

870.

021

3.33

1-0

.009

-1.5

18Pr

e-ta

xIn

com

e10

.273

0.25

72.

501

0.01

40.

134

0.09

10.

888

0.09

40.

912

-0.5

61-5

.461

0.53

65.

213

-0.1

67-1

.621

Ave

rage

lnw

2.63

60.

009

0.32

50.

0000

50.

002

0.00

20.

065

0.00

50.

178

-0.0

19-0

.723

0.02

10.

797

0.00

20.

089

Hum

anC

apit

al13

.779

0.26

11.

893

0.01

30.

095

0.09

60.

698

0.06

30.

459

-0.5

45-3

.952

0.39

52.

870

-0.2

97-2

.157

Inve

stm

ent

0.05

10.

001

1.64

8-0

.000

2-0

.335

0.00

11.

234

-0.0

0003

-0.0

64-0

.003

-5.1

770.

001

1.52

5-0

.003

-5.1

63C

onsu

mpt

ion

8.85

1-0

.254

-2.8

660.

007

0.08

3-0

.022

-0.2

54-0

.069

-0.7

770.

377

4.25

9-0

.382

-4.3

150.

020

0.22

3Pa

nelC

:Mod

elw

ith

Part

Tim

eO

ptio

nLa

bor

Supp

ly41

.298

1.52

83.

700

-0.0

34-0

.083

-0.0

45-0

.109

0.01

00.

025

-3.2

63-7

.902

5.86

314

.198

3.57

68.

659

LS(F

Teq

v)d

40.1

401.

132

2.81

90.

341

0.85

00.

044

0.11

0-0

.279

-0.6

96-4

.822

-12.

012

1.99

34.

965

-1.6

02-3

.991

LS(P

T)e

2.31

60.

793

34.2

28-0

.751

-32.

407

-0.1

78-7

.694

0.57

924

.994

3.11

713

4.55

77.

741

334.

198

10.3

5644

7.12

7Ef

fect

ive

Labo

r0.

594

0.01

82.

960

0.00

50.

879

0.00

20.

270

-0.0

03-0

.517

-0.0

68-1

1.44

10.

028

4.74

3-0

.024

-4.0

21Pr

e-ta

xIn

com

e10

.039

0.36

13.

601

0.10

91.

083

0.00

30.

029

-0.0

67-0

.672

-1.3

82-1

3.77

10.

747

7.44

5-0

.503

-5.0

10A

vera

geln

w2.

602

0.02

71.

054

0.00

050.

019

0.00

30.

120

0.01

50.

564

-0.0

56-2

.136

0.10

54.

050

0.06

82.

613

Hum

anC

apit

al14

.163

0.30

12.

122

0.02

10.

147

-0.0

94-0

.667

-0.0

33-0

.236

-1.0

46-7

.388

1.09

37.

715

0.29

12.

053

Inve

stm

ent

0.04

30.

0004

0.85

50.

0002

0.44

6-0

.001

-2.1

19-0

.001

-3.1

81-0

.009

-19.

945

0.00

38.

053

-0.0

02-3

.579

Con

sum

ptio

n8.

950

-0.2

24-2

.502

0.01

10.

118

-0.0

63-0

.700

-0.0

52-0

.584

0.24

22.

706

-0.0

42-0

.468

0.21

52.

406

a The "Level" column refers to the annual value averaged over the whole life cycle, except the "Labor Supply", "LS (FT eqv)", and "LS (PT)", which are the total number of years worked over the whole life cycle. For example, in the baseline model, the total labor supply is 40.293 years from 18 to 80.
b The "∆Level" column refers to the difference of the total value between the current experiment and the baseline model. For example, in the "No Earnings Test" case, the labor supply is 0.285 years higher than that in the baseline model across the whole life cycle from 18 to 80.
c The "%∆" column refers to the percentage of the difference in the "∆Level" column relative to the level in the baseline model. For example, in the "No Earnings Test" case, the labor supply increases by 0.285 years, which is equivalent to 0.708% of the labor supply in the baseline model.
d The "LS (FT eqv)" is the full time equivalent labor supply where working part time is counted as 0.5.
e The "LS (PT)" refers to the labor supply of part time workers.


Figure 10: Policy experiments: change taxes or Social Security benefits

[Figure: four panels, each plotted against age (20 to 80): (a) Difference in employment rates; (b) Difference in mean log measured wages; (c) Difference in investment; (d) Difference in human capital. Series: Increase tax 50%, No ET, NRA=67, Reduce SS 20%.]


Table 8: Transitions from alternative modelsᵃ

1  Data                   0.034   0.200   -0.071
2  Larger c               0.036   0.235   -0.085
3  Smaller c              0.036   0.225   -0.086
4  Larger r smaller δ     0.042   0.254   -0.086
5  Larger A0              0.042   0.241   -0.088

a The transition rate is the average transition probability between age 35 and 50.
b The average wage change rate after one nonemployment spell is the average change rate between age 41 and 65.


Figure 11: Policy experiments: remove Social Security taxes or benefits

[Figure: four panels, each plotted against age (20 to 80): (a) Difference in employment rates; (b) Difference in mean log measured wages; (c) Difference in investment; (d) Difference in human capital. Series: No SS taxes, No SS benefit, No SS system.]



Table 9: Estimates in the model variantsᵃ

                                                                    1             2             3             4
MODEL SPECIFICATIONS                                                Larger c      Lower c       Change δ, r   Larger A0

Interest rate                                          r                                        0.04
Discount                                               β                                        0.96
Initial wealth                                         A0                                                     50,000
Consumption floor                                      c            2.5           1.8

HC depreciationᵇ                                       δ            0.092         0.092         0.091         0.099
HC production function: I factor                       α_I          0.111         0.107         0.182         0.102
HC production function: H factor                       α_H          0.123         0.116         0.170         0.123
Standard deviation of HC innovation                    σ_ξ          0.039         0.020         0.018         0.051
Consumption: CRRA                                      η_c          4.008         4.024         4.002         4.020
Consumption shifter: coef on t (×10^-1)                ϕ_1          0.271         0.285         0.221         0.260
Consumption shifter: coef on t^2 (×10^-2)              ϕ_2          0.149         0.139         0.153         0.144
Consumption shifter: coef on t^3 (×10^-3)              ϕ_3          -0.036        -0.035        -0.035        -0.034
Consumption shifter: coef on married                   ϕ_4          0.314         0.296         0.184         0.386
Leisure: standard deviation of shock                   a_ε          0.201         0.205         0.205         0.240
Leisure: spouse not working                            a_1          0.910         0.838         1.186         0.824
Leisure: spouse working                                a_2          -0.774        -0.749        -1.096        -0.475
Bequest weight                                         b_1          22,970,464    25,873,726    25,921,804    34,223,452

Parameter heterogeneityᶜ
Leisure: mean of intercept                             µ_a0         -5.480        -5.514        -5.634        -5.693
Leisure: standard deviation of intercept               σ_a0         0.863         0.885         0.789         0.772
HC productivity, mean                                  µ_π          1.752         1.760         1.760         1.824
HC productivity, standard deviation                    σ_π          0.609         0.617         0.599         0.640
Correlation between a0 & π                             ρ            -0.650        -0.765        -0.630        -0.749

Initial human capital level at age 18
Intercept                                              γ_0          1.528         1.558         1.582         1.525
Coefficient on a0                                      γ_a0         0.047         0.060         0.050         0.049
Coefficient on π                                       γ_π          0.655         0.682         0.666         0.639
Standard deviation of error term                       σ_H0         0.044         0.006         0.071         0.001

Additional Social Security Application effects
Effect of resource constraint (×10^-3)                 b_62         0.211         0.217         0.084         0.220
Effect of health insurance: constant (×10^-3)          b_65         0.089         0.093         0.161         0.056
Effect of health insurance: coef on t (×10^-3)         b_65t        0.169         0.142         0.146         0.159

a Indirect Inference estimates. Estimates use a diagonal weighting matrix.
b HC: Human Capital.
c The joint distribution of (a0, π) is a parametric discrete distribution with nine points determined by these five parameters, using a nine-point Gauss-Hermite approximation.


Figure 12: Fit of alternative models

[Figure: six panels. Employment Rates; Mean Log Measured Wages (FE); Mean Log Measured Wages; S.D. Log Measured Wages; and Consumption are plotted against t (ages 22 to 65). Social Security Application is plotted against t (ages 60 to 70). Series in every panel: Data, Larger CF, Smaller CF, Larger r, Larger A0.]


9 Conclusion

This paper develops and estimates a rich life-cycle model that merges a Ben-Porath style human capital framework with a neoclassical style model of endogenous labor supply. We use it to study what is typically referred to as retirement in the literature without treating retirement as fundamentally different from a no-work decision in either the data or the model. In the model, each individual chooses consumption, labor supply, human capital investment, and Social Security application. Investment in human capital generates wage growth over the life cycle, while depreciation of human capital is the main force generating a decline in working for older workers. We show that the parsimonious model is able to fit the main features of life-cycle labor supply, measured wages (with and without fixed effects) as well as retirement. In particular we can fit both the large increase in measured wages and small changes in labor supply at the beginning of the life cycle along with the small changes in measured wages but large changes in labor supply at the end.

Despite the fact that our framework does not rely on age or time varying preference or production function parameters, our model is consistent with a rather small and empirically plausible labor supply elasticity that rises with age. To show the importance of depreciation in explaining the result we re-estimate the model without allowing depreciation on the job and show the model cannot fit the data as well. We also estimate extensions of the model allowing for health shocks and a part time option individually. While these factors are relevant, they are not the main factors driving retirement. The results are also robust to several checks in which we vary pre-set parameters.

We use the estimated model to simulate the impacts of various policy changes. While prior work typically takes the wage process as given and focuses on the retirement decision, we are able to model the effect of the policy change on the wage process and the labor supply decisions. As we show in our model, less generous Social Security benefits result in higher labor supply later in the life cycle, so workers adjust their investment over the life cycle. This results in a higher human capital level as well as higher labor supply earlier in the life cycle. The bottom line is that modeling labor supply and human capital decisions jointly is critical in an analysis of the effects of policy changes. While presumably other factors would be important for explaining other features of labor markets, endogenous labor supply is critical for understanding life-cycle human capital investment and life-cycle human capital investment is critical for understanding life-cycle labor supply.


References

Acemoglu, Daron and Jörn-Steffen Pischke, “Why Do Firms Train? Theory and Evidence,” The Quarterly Journal of Economics, February 1998, 113 (1), 79–119.

Acemoglu, Daron and Jörn-Steffen Pischke, “The structure of wages and investment in general training,” Journal of Political Economy, 1999, 107 (3), 539–572.

Aiyagari, S. Rao, “Uninsured Idiosyncratic Risk and Aggregate Saving,” The Quarterly Journal of Economics, 1994, 109 (3), 659–684.

Altonji, Joseph G, “Intertemporal substitution in labor supply: Evidence from micro data,” The Journal of Political Economy, 1986, pp. S176–S215.

Ben-Porath, Yoram, “The Production of Human Capital and the Life Cycle of Earnings,” Journal of Political Economy, August 1967, 75 (4), 352–365.

Blau, David, “Retirement and Consumption in a Life Cycle Model,” Journal of Labor Economics, January 2008, 26 (1), 35–71.

Blau, David and Tetyana Shvydko, “Labor Market Rigidities and the Employment Behavior of Older Workers,” Industrial and Labor Relations Review, April 2011, 64 (3), 464–484.

Blundell, Richard, Monica Costa Dias, Costas Meghir, and Jonathon Shaw, “Female Labour Supply, Human Capital and Welfare Reform,” Econometrica, September 2016, 84 (5), 1705–1753.

Blundell, Richard, Monica Costa-Dias, David Goll, and Costas Meghir, “Wages, Experience and Training of Women over the Lifecycle,” April 2019. Unpublished Manuscript.

Browning, Martin, Lars Peter Hansen, and James J. Heckman, “Micro Data and General Equilibrium Models,” Handbook of Macro Economics, 1999, 1, 525–602.

Casanova, Maria, “Happy Together: A Structural Model of Couples’ Joint Retirement Choices,” November 2010. Working paper.

Casanova, Maria, “Revisiting the Hump-Shaped Wage Profile,” August 2013. Working paper.

Chetty, Raj, “Bounds on Elasticities with Optimization Frictions: A Synthesis of Micro and Macro Evidence on Labor Supply,” Econometrica, 2012, 80 (3), 969–1018.

Currie, Janet and Brigitte C Madrian, “Health, health insurance and the labor market,” Handbook of Labor Economics, 1999, 3, 3309–3416.


DeNardi, Mariacristina, “Wealth Inequality and Intergenerational Links,” The Review of Economic Studies, July 2004, 71 (3), 743–768.

Farber, Henry, “The Incidence and Costs of Job Loss: 1982-91,” Brookings Papers: Microeconomics, 1993, pp. 73–132.

Fernández-Villaverde, Jesús and Dirk Krueger, “Consumption over the life cycle: Facts from consumer expenditure survey data,” The Review of Economics and Statistics, 2007, 89 (3), 552–565.

French, Eric, “The Effects of Health, Wealth and Wages on Labor Supply and Retirement Behavior,” Review of Economic Studies, April 2005, 72 (2), 395–427.

French, Eric and John Bailey Jones, “The Effects of Health Insurance and Self-Insurance on Retirement Behavior,” Econometrica, May 2011, 79 (3), 693–732.

Gohl, Niklas, Peter Haan, Felix Weinhardt, and Elisabeth Kurz, “Human Capital Investment: Causal Evidence from Pension Reform,” March 2019. Working paper, DIW Berlin and FU Berlin.

Gorlich, Dennis and Andries de Grip, “Human Capital Depreciation During Hometime,” Oxford Economic Papers, 2009, 61, i98–i121.

Gourinchas, Pierre-Olivier and Jonathan A Parker, “Consumption over the life cycle,” Econometrica, 2002, 70 (1), 47–89.

Gustman, Alan L. and Thomas L. Steinmeier, “A Structural Retirement Model,” Econometrica, May 1986, 54 (3), 555–584.

Haan, Peter and Victoria Prowse, “Longevity, life-cycle behavior and pension reform,” Journal of Econometrics, 2014, 178, 582–601.

Haley, William, “Estimation of the Earnings Profile from Optimal Human Capital Accumulation,” Econometrica, 1976, 44 (6), 1223–1288.

Heckman, James J., “Estimates of a Human Capital Production Function Embedded in a Life-Cycle Model of Labor Supply,” in N. Terleckyj, ed., Household production and consumption, Columbia University Press, 1975, pp. 99–138.

Heckman, James J., “A Life-Cycle Model of Earnings, Learning, and Consumption,” The Journal of Political Economy, August 1976, 84 (4), S11–S44.


Heckman, James J, Lance J Lochner, and Petra E Todd, “Earnings functions, rates of return and treatment effects: The Mincer equation and beyond,” Handbook of the Economics of Education, 2006, 1, 307–458.

Heckman, James J., Lance Lochner, and Christopher Taber, “Explaining Rising Wage Inequality: Explanations With A Dynamic General Equilibrium Model of Labor Earnings With Heterogeneous Agents,” Review of Economic Dynamics, 1998, 1 (1), 1–58.

Heckman, James J., Lance Lochner, and Christopher Taber, “Tax Policy and Human-Capital Formation,” American Economic Review: Papers and Proceedings, 1998, 88 (2), 293–297.

Heckman, James J., Lance Lochner, and Christopher Taber, “Human Capital Formation and General Equilibrium Treatment Effects: A Study of Tax and Tuition Policy,” Fiscal Studies, 1999, 20 (1), 25–40.

Heckman, James J., Lance Lochner, and Ricardo Cossa, “Learning-by-doing vs. on-the-job training: Using variation induced by the EITC to distinguish between models of skill formation,” in Edmund S. Phelps, ed., Designing inclusion: Tools to raise low-end pay and employment in private enterprise, Cambridge University Press, 2003, pp. 74–130.

Hubbard, R. Glenn, Jonathan Skinner, and Stephen P. Zeldes, “Precautionary Saving and Social Insurance,” The Journal of Political Economy, April 1995, 103 (2), 360–399.

Imai, Susumu and Michael P. Keane, “Intertemporal Labor Supply and Human Capital Accumulation,” International Economic Review, May 2004, 45 (2), 601–641.

Johnson, Richard W and David Neumark, “Wage declines among older men,” The Review of Economics and Statistics, 1996, pp. 740–748.

Keane, Michael P. and Kenneth I. Wolpin, “The Career Decisions of Young Men,” Journal of Political Economy, June 1997, 105 (3), 473–522.

Kunze, Astrid, “The Timings of Careers and Human Capital Depreciation,” June 2002. IZA Working Paper No. 509.

Kuruscu, Burhanettin, “Training and Lifetime Income,” American Economic Review, June 2006, 96 (3), 832–846.

Li, Hsueh-Hsiang, “The Effects of Human Capital Depreciation on Occupational Gender Segregation,” May 2013.

Light, Audrey and Manuelita Ureta, “Early-Career Work Experience and Gender Wage Differentials,” Journal of Labor Economics, 1995, 13 (1), 121–154.


MaCurdy, Thomas E, “An Empirical Model of Labor Supply in a Life-Cycle Setting,” The Journal of Political Economy, 1981, 89 (6), 1059–1085.

Madrian, Brigitte C and Lars John Lefgren, “An approach to longitudinally matching Current Population Survey (CPS) respondents,” Journal of Economic and Social Measurement, 2000, 26 (1), 31–62.

Manuelli, Rodolfo E., Ananth Seshadri, and Yongseok Shin, “Lifetime Labor Supply and Human Capital Investment,” January 2012. Working paper.

Mincer, Jacob, Schooling, Experience and Earnings, New York: NBER Press, 1974.

Mincer, Jacob and Haim Ofek, “Interrupted Work Careers: Depreciation and Restoration of Human Capital,” The Journal of Human Resources, 1982, 17 (1), 3–24.

Mincer, Jacob and Solomon Polachek, “Family Investments in Human Capital: Earnings of Women,” Journal of Political Economy, 1974, 82 (2), s76–s108.

Mulligan, Casey B, “Substitution over time: another look at life-cycle labor supply,” NBER Macroeconomics Annual, 1998, 13, 75–134.

Neal, Derek, “The Complexity of Job Mobility among Young Men,” Journal of Labor Economics, April 1999, 17 (2), 237–261.

Prescott, Edward C., Richard Rogerson, and Johanna Wallenius, “Lifetime aggregate labor supply with endogenous workweek length,” Review of Economic Dynamics, 2009, 12, 23–36.

Rogerson, Richard and Johanna Wallenius, “Nonconvexities, Retirement and the Elasticity of Labor Supply,” American Economic Review, June 2013, 103 (4), 1445–1462.

Rosen, Sherwin, “A Theory of Life Earnings,” Journal of Political Economy, 1976, 84 (4), S45–S67.

Ruhm, Christopher, “Are Workers Permanently Scarred by Job Displacement,” American Economic Review, 1991, 81 (1), 319–324.

Rupert, Peter and Giulio Zanella, “Revisiting wage, earnings, and hours profiles,” June 2012. Working paper.

Rust, John and Christopher Phelan, “How Social Security and Medicare Affect Retirement Behavior In a World of Incomplete Markets,” Econometrica, July 1997, 65 (4), 781–831.


Sanders, Carl and Christopher Taber, “Life-Cycle Wage Growth and Heterogeneous Human Capital,” Annual Review of Economics, 2012, 4 (1), 399–425.

Schmieder, Johannes F., Till von Wachter, and Stefan Bender, “The Effect of Unemployment Benefits and Nonemployment Durations on Wages,” American Economic Review, March 2016, 106 (3), 739–77.

Shaw, Kathryn L., “Life-Cycle Labor Supply with Human Capital Accumulation,” International Economic Review, May 1989, 30 (2), 431–456.

Taber, Christopher, “Tax Reform and Human Capital Accumulation: Evidence from an Empirical General Equilibrium Model of Skill Formation,” Advances in Economic Analysis and Policy, 2002, 2 (1).

Topel, Robert H. and Michael P. Ward, “Job Mobility and the Careers of Young Men,” The Quarterly Journal of Economics, May 1992, 107 (2), 439–479.

Wallenius, Johanna, “Human capital accumulation and the intertemporal elasticity of substitution of labor: How large is the bias?,” Review of Economic Dynamics, 2011, 14 (4), 577–591.

Weiss, Yoram, “The determination of lifecycle earnings: A survey,” in Orley Ashenfelter and David Card, eds., Handbook of Labor Economics, Vol. 1, Amsterdam: North-Holland, 1986, pp. 603–640.


Appendix A Value Function Derivations

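The following fact will be useful in this section: if $\varepsilon_{it}$ is normal with expected value 0 and variance $\sigma_\varepsilon^2$, then

$$E\left(e^{\varepsilon_{it}} \mid \varepsilon_{it} \ge \varepsilon^*\right) = e^{\sigma_\varepsilon^2/2}\, \frac{\Phi\left(\sigma_\varepsilon - \frac{\varepsilon^*}{\sigma_\varepsilon}\right)}{1 - \Phi\left(\frac{\varepsilon^*}{\sigma_\varepsilon}\right)} \tag{A1}$$

We first solve the labor supply model and value function for the baseline model and then for the model that includes the part time option.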
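As a quick sanity check on (A1), the following minimal sketch compares the closed form with a brute-force numerical integral. The function names, the integration grid, and the parameter values used below are illustrative assumptions, not part of the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; its cdf plays the role of Phi


def truncated_lognormal_mean(cutoff, sigma):
    """Closed form (A1): E[exp(eps) | eps >= cutoff] for eps ~ N(0, sigma^2)."""
    z = cutoff / sigma
    return math.exp(sigma**2 / 2) * STD_NORMAL.cdf(sigma - z) / (1 - STD_NORMAL.cdf(z))


def truncated_lognormal_mean_numeric(cutoff, sigma, n=20_000):
    """Midpoint-rule approximation of the same conditional mean (check only)."""
    dist = NormalDist(0.0, sigma)
    upper = max(cutoff, 0.0) + 12 * sigma  # mass beyond this point is negligible
    step = (upper - cutoff) / n
    num = den = 0.0
    for k in range(n):
        x = cutoff + (k + 0.5) * step
        weight = dist.pdf(x) * step
        num += math.exp(x) * weight   # numerator: E[exp(eps); eps >= cutoff]
        den += weight                 # denominator: Pr(eps >= cutoff)
    return num / den
```

When the cutoff is far in the left tail the conditioning is irrelevant and the expression collapses to the usual lognormal mean $e^{\sigma_\varepsilon^2/2}$.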

A.1 Baseline Model

For simplicity assume that

$$a_{it} \equiv a_{i0} + a_1 \mathbf{1}\{M_{it} = 1\} + a_2 \mathbf{1}\{M_{it} = 2\} \tag{A2}$$

so that we can write

$$\gamma_{it} = \exp\left(a_{it} + \varepsilon_{it}\right) \tag{A3}$$

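Then, as we state in the text, as long as $V_{t0}(\tilde{X}_{it}) > V_{t1}(\tilde{X}_{it})$,

$$\ell_{it} = \mathbf{1}\left\{V_{t1}(\tilde{X}_{it}) + \gamma_{it} \ge V_{t0}(\tilde{X}_{it})\right\} = \mathbf{1}\left\{a_{it} + \varepsilon_{it} \ge \log\left(V_{t0}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})\right)\right\} = \mathbf{1}\left\{\varepsilon_{it} \ge \varepsilon^*_t(\tilde{X}_{it})\right\} \tag{A4}$$

where

$$\varepsilon^*_t(\tilde{X}_{it}) \equiv \log\left(V_{t0}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})\right) - a_{it}. \tag{A5}$$

Note that if $V_{t0}(\tilde{X}_{it}) \le V_{t1}(\tilde{X}_{it})$ then the individual would never choose to work.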

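The only difference between $X_{it}$ and $\tilde{X}_{it}$ is that $\varepsilon_{it}$ is included in $X_{it}$, so

$$\begin{aligned}
E\left[V_t(X_{it}) \mid \tilde{X}_{it}\right] ={}& \Pr\left(\varepsilon_{it} < \varepsilon^*_t(\tilde{X}_{it})\right) E\left[V_{t0}(\tilde{X}_{it}) \,\middle|\, \tilde{X}_{it}, \varepsilon_{it} < \varepsilon^*_t(\tilde{X}_{it})\right] \\
&+ \Pr\left(\varepsilon_{it} \ge \varepsilon^*_t(\tilde{X}_{it})\right) E\left[V_{t1}(\tilde{X}_{it}) + e^{a_{it}} e^{\varepsilon_{it}} \,\middle|\, \tilde{X}_{it}, \varepsilon_{it} \ge \varepsilon^*_t(\tilde{X}_{it})\right] \\
={}& \Phi\left(\frac{\varepsilon^*_t(\tilde{X}_{it})}{\sigma_\varepsilon}\right) V_{t0}(\tilde{X}_{it}) + \left[1 - \Phi\left(\frac{\varepsilon^*_t(\tilde{X}_{it})}{\sigma_\varepsilon}\right)\right] \left[V_{t1}(\tilde{X}_{it}) + e^{a_{it} + \sigma_\varepsilon^2/2}\, \frac{\Phi\left(\sigma_\varepsilon - \frac{\varepsilon^*_t(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}{1 - \Phi\left(\frac{\varepsilon^*_t(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}\right]
\end{aligned}$$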
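The closed form above is simple to evaluate numerically. The sketch below is one hedged way to code it, together with a brute-force integral used only as a check; the function names and the test values are assumptions for illustration, and the branch for $V_{t0} \le V_{t1}$ follows the remark that such an individual never works, so the value is $V_{t1}$ plus the unconditional mean of $\gamma_{it}$.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def emax_baseline(v0, v1, a, sigma):
    """Expected value over eps in the two-option model.

    v0 = V_t0 (work), v1 = V_t1 (no work), a = a_it, sigma = sigma_eps.
    """
    scale = math.exp(a + sigma**2 / 2)  # E[exp(a + eps)]
    if v0 <= v1:
        return v1 + scale  # never works
    z = (math.log(v0 - v1) - a) / sigma  # eps* / sigma_eps, from (A5)
    # [1 - Phi(z)] * ratio in the last term simplifies to Phi(sigma - z)
    return PHI(z) * v0 + (1 - PHI(z)) * v1 + scale * PHI(sigma - z)


def emax_numeric(v0, v1, a, sigma, n=20_000):
    """Midpoint-rule integral of max{v0, v1 + exp(a + eps)} (check only)."""
    dist = NormalDist(0.0, sigma)
    lo, hi = -10 * sigma, 10 * sigma
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        eps = lo + (k + 0.5) * step
        total += max(v0, v1 + math.exp(a + eps)) * dist.pdf(eps) * step
    return total
```

Multiplying out the last bracket avoids dividing by $1 - \Phi(\cdot)$, which can be numerically tiny when the cutoff is far in the right tail.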


A.2 Part Time

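We continue to use the simplified expression for $a_{it}$ defined above. First consider the labor supply decision. In this case

$$\ell_{it} = \begin{cases}
0 & \text{if } V_{t0}(\tilde{X}_{it}) > \max\left\{V_{tp}(\tilde{X}_{it}) + \gamma_{it}\varpi_t,\; V_{t1}(\tilde{X}_{it}) + \gamma_{it}\right\} \\
p & \text{if } V_{tp}(\tilde{X}_{it}) + \gamma_{it}\varpi_t > \max\left\{V_{t0}(\tilde{X}_{it}),\; V_{t1}(\tilde{X}_{it}) + \gamma_{it}\right\} \\
1 & \text{if } V_{t1}(\tilde{X}_{it}) + \gamma_{it} > \max\left\{V_{t0}(\tilde{X}_{it}),\; V_{tp}(\tilde{X}_{it}) + \gamma_{it}\varpi_t\right\}
\end{cases} \tag{A6}$$

or

$$\ell_{it} = \begin{cases}
0, & \gamma_{it} < \min\left\{\frac{V_{t0}(\tilde{X}_{it}) - V_{tp}(\tilde{X}_{it})}{\varpi_t},\; V_{t0}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})\right\} \\
p, & \frac{V_{t0}(\tilde{X}_{it}) - V_{tp}(\tilde{X}_{it})}{\varpi_t} < \gamma_{it} < \frac{V_{tp}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})}{1 - \varpi_t} \\
1, & \gamma_{it} > \max\left\{\frac{V_{tp}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})}{1 - \varpi_t},\; V_{t0}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})\right\}
\end{cases} \tag{A7}$$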

One can see that there are a number of different cases to consider. Since $\gamma_{it}$ is log normal, ties will be irrelevant so we abstract from them.

The most interesting case is that in which

$$0 < \frac{V_{t0}(\tilde{X}_{it}) - V_{tp}(\tilde{X}_{it})}{\varpi_t} < V_{t0}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it}) < \frac{V_{tp}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})}{1 - \varpi_t} \tag{A8}$$

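as this is the only case where all three possibilities happen with positive probability. In this case

$$\ell_{it} = \begin{cases}
0, & \varepsilon_{it} < \varepsilon^*_{t1}(\tilde{X}_{it}) \\
p, & \varepsilon^*_{t1}(\tilde{X}_{it}) < \varepsilon_{it} < \varepsilon^*_{t2}(\tilde{X}_{it}) \\
1, & \varepsilon_{it} > \varepsilon^*_{t2}(\tilde{X}_{it})
\end{cases} \tag{A9}$$

where

$$\varepsilon^*_{t1}(\tilde{X}_{it}) = \log\left(\frac{V_{t0}(\tilde{X}_{it}) - V_{tp}(\tilde{X}_{it})}{\varpi_t}\right) - a_{it} \tag{A10}$$

$$\varepsilon^*_{t2}(\tilde{X}_{it}) = \log\left(\frac{V_{tp}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})}{1 - \varpi_t}\right) - a_{it} \tag{A11}$$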


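To derive the value function recall that for any random variable $Y$

$$E\left(Y \mid a \le Y < b\right) = \frac{E\left(Y \mid Y \ge a\right)\Pr\left(Y \ge a\right) - E\left(Y \mid Y \ge b\right)\Pr\left(Y \ge b\right)}{\Pr\left(a \le Y < b\right)}$$

Using this and the log normal result, in this case with all three possibilities (i.e. A8)

$$\begin{aligned}
E\left[V_t(X_{it}) \mid \tilde{X}_{it}\right] ={}& \Pr\left(\varepsilon_{it} < \varepsilon^*_{t1}(\tilde{X}_{it})\right) E\left[V_{t0}(\tilde{X}_{it}) \,\middle|\, \tilde{X}_{it}, \varepsilon_{it} < \varepsilon^*_{t1}(\tilde{X}_{it})\right] \\
&+ \Pr\left(\varepsilon^*_{t1}(\tilde{X}_{it}) < \varepsilon_{it} \le \varepsilon^*_{t2}(\tilde{X}_{it})\right) E\left[V_{tp}(\tilde{X}_{it}) + \varpi_t e^{a_{it}} e^{\varepsilon_{it}} \,\middle|\, \tilde{X}_{it}, \varepsilon^*_{t1}(\tilde{X}_{it}) < \varepsilon_{it} \le \varepsilon^*_{t2}(\tilde{X}_{it})\right] \\
&+ \Pr\left(\varepsilon_{it} \ge \varepsilon^*_{t2}(\tilde{X}_{it})\right) E\left[V_{t1}(\tilde{X}_{it}) + e^{a_{it}} e^{\varepsilon_{it}} \,\middle|\, \tilde{X}_{it}, \varepsilon_{it} \ge \varepsilon^*_{t2}(\tilde{X}_{it})\right] \\
={}& \Phi\left(\frac{\varepsilon^*_{t1}(\tilde{X}_{it})}{\sigma_\varepsilon}\right) V_{t0}(\tilde{X}_{it}) \\
&+ \left[\Phi\left(\frac{\varepsilon^*_{t2}(\tilde{X}_{it})}{\sigma_\varepsilon}\right) - \Phi\left(\frac{\varepsilon^*_{t1}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)\right] \cdot \left[V_{tp}(\tilde{X}_{it}) + \varpi_t e^{a_{it} + \sigma_\varepsilon^2/2}\, \frac{\Phi\left(\sigma_\varepsilon - \frac{\varepsilon^*_{t1}(\tilde{X}_{it})}{\sigma_\varepsilon}\right) - \Phi\left(\sigma_\varepsilon - \frac{\varepsilon^*_{t2}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}{\Phi\left(\frac{\varepsilon^*_{t2}(\tilde{X}_{it})}{\sigma_\varepsilon}\right) - \Phi\left(\frac{\varepsilon^*_{t1}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}\right] \\
&+ \left[1 - \Phi\left(\frac{\varepsilon^*_{t2}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)\right] \left[V_{t1}(\tilde{X}_{it}) + e^{a_{it} + \sigma_\varepsilon^2/2}\, \frac{\Phi\left(\sigma_\varepsilon - \frac{\varepsilon^*_{t2}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}{1 - \Phi\left(\frac{\varepsilon^*_{t2}(\tilde{X}_{it})}{\sigma_\varepsilon}\right)}\right]
\end{aligned} \tag{A12}$$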

The other possibilities are special cases of this for which fewer than three possibilities are possible.

If

$$\frac{V_{tp}(\tilde{X}_{it}) - V_{t1}(\tilde{X}_{it})}{1 - \varpi_t} < \frac{V_{t0}(\tilde{X}_{it}) - V_{tp}(\tilde{X}_{it})}{\varpi_t} \tag{A13}$$

then part time is not an option and we return to the basic model. In other cases, other options will disappear, which simplifies the expression.
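The case analysis above can be folded into one routine by letting a cutoff go to $-\infty$ whenever the corresponding threshold for $\gamma_{it}$ is not positive. The sketch below is an illustrative implementation under that assumption, not the authors' code; the names `emax_part_time`, `varpi`, and the brute-force checker are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def _eps_cutoff(gamma_cutoff, a):
    """Translate a threshold on gamma = exp(a + eps) into a threshold on eps."""
    return math.log(gamma_cutoff) - a if gamma_cutoff > 0 else -math.inf


def emax_part_time(v0, vp, v1, a, sigma, varpi):
    """Expected value over eps with options work (v0), part time (vp), no work (v1)."""
    g1 = (v0 - vp) / varpi          # part time beats work when gamma > g1
    g2 = (vp - v1) / (1 - varpi)    # no work beats part time when gamma > g2
    if g1 >= g2:                    # (A13): part time is never chosen
        g1 = g2 = v0 - v1
    e1, e2 = _eps_cutoff(g1, a), _eps_cutoff(g2, a)
    scale = math.exp(a + sigma**2 / 2)
    p1, p2 = PHI(e1 / sigma), PHI(e2 / sigma)
    m1, m2 = PHI(sigma - e1 / sigma), PHI(sigma - e2 / sigma)
    # Terms of (A12) with the conditional means multiplied out
    return (p1 * v0
            + (p2 - p1) * vp + varpi * scale * (m1 - m2)
            + (1 - p2) * v1 + scale * m2)


def emax_part_time_numeric(v0, vp, v1, a, sigma, varpi, n=20_000):
    """Midpoint-rule integral of the max over the three options (check only)."""
    dist = NormalDist(0.0, sigma)
    lo, hi = -10 * sigma, 10 * sigma
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        gamma = math.exp(a + lo + (k + 0.5) * step)
        best = max(v0, vp + varpi * gamma, v1 + gamma)
        total += best * dist.pdf(lo + (k + 0.5) * step) * step
    return total
```

Because $V_{t0} - V_{t1}$ is a weighted average of the two thresholds with weights $\varpi_t$ and $1 - \varpi_t$, the ordering in (A8) holds whenever the first threshold is positive and smaller than the second, which is why a single comparison suffices.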


Appendix B Taxes and Social Security

We use the tax codes and Social Security rules of 2004, except for the earnings test, where we follow the rules of 1999.³⁴

There are two different kinds of taxes that the worker’s wage income is subject to, namely payroll taxes and federal income taxes. We ignore state income taxes. The payroll taxes include the Social Security portion, 6.2% capped at $87,900, and the Medicare tax, 1.45% uncapped. The federal income taxes are progressive and we use the tax rules for head of household. The personal exemption for each person is $3,100 and the standard deduction for head of household is $7,150. Together these generate the tax schedule used in the paper through the following formula,

$$\Upsilon_{it} = \upsilon\left(Y^o_{it} + ssb^{taxable}_{it}\right) + ssb^{pre}_{it} - ssb^{taxable}_{it} \tag{B1}$$

where $\Upsilon_{it}$ is total net income, $Y^o_{it} = \max\{rA_{it}, 0\} + w_{it} + y_{it}$ is the gross income, $ssb^{pre}_{it}$ is the pre-tax Social Security benefit, $ssb^{taxable}_{it}$ is the taxable part of the Social Security benefit, and $\upsilon$ is after-tax income as a function of pre-tax income. It is presented in Table B1.

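The pre-tax Social Security benefit amount, $ssb^{pre}_{it}$, is determined by the current age $t$, the age when the individual first starts receiving the benefit, $t_{ssa}$, and the entire earnings history up to current age $t$. It is summarized in the following formula,

$$ssb^{pre}_{it} = \begin{cases}
0 & \text{if } ss_{it} = ssa_{it} = 0 \\
ssb_{it} - ET_{it} & \text{if } ss_{it} + ssa_{it} > 0 \;\&\; t < 70 \\
ssb_{it} & \text{if } ss_{it} + ssa_{it} > 0 \;\&\; t \ge 70
\end{cases} \tag{B2}$$

where

$$ssb_{it} = \begin{cases}
\prod_{j=t_{ssa}}^{t-1}\left(1 + DRC^{ET}_{ij}\right) \cdot PIA_{it} \cdot \left(1 - \frac{65 - t_{ssa}}{15}\right) & \text{if } 62 \le t_{ssa} < 65 \\
\prod_{j=t_{ssa}}^{t-1}\left(1 + DRC^{ET}_{ij}\right) \cdot PIA_{it} \cdot \left[1 + 0.06 \cdot (t_{ssa} - 65)\right] & \text{if } 65 \le t_{ssa} < 70 \\
PIA_{it} \cdot \left[1 + 0.06 \cdot (69 - 65)\right] & \text{if } t_{ssa} \ge 70
\end{cases} \tag{B3}$$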

is the eligible Social Security benefit and $DRC^{ET}_{ij}$ is the Delayed Retirement Credit due to benefits withheld under the earnings test.

The remainder of the section describes how each component in equations (B1)–(B3) is defined.

³⁴ Before 2000, the earnings test applied to ages before 70. Since 2000, the earnings test has been eliminated after reaching the NRA. All information about Social Security benefits in this section is extracted from http://www.ssa.gov.

Table B1: Wage income tax codes (in 2004$).

Marginal Tax Rate    Pre-tax (Y)            Post-tax Income Υt = υ0 + υ1 (Y − υ2)
0.0765               ≤ 10,250               0.9235 Y
0.1765               10,251 − 20,450        9,465.88 + 0.8235 (Y − 10,250)
0.2265               20,451 − 49,150        17,865.58 + 0.7735 (Y − 20,450)
0.3265               49,151 − 87,900        40,065.03 + 0.6735 (Y − 49,150)
0.2645               87,901 − 110,750       66,163.15 + 0.7355 (Y − 87,900)
0.2945               110,751 − 172,950      82,969.33 + 0.7055 (Y − 110,750)
0.3445               172,951 − 329,350      126,851.43 + 0.6555 (Y − 172,950)
0.3645               ≥ 329,351              229,371.63 + 0.6355 (Y − 329,350)
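Table B1 and equation (B1) translate directly into a piecewise-linear function. The following is a minimal sketch that rebuilds the post-tax schedule from the bracket limits and marginal rates in the table; the names `BRACKETS`, `after_tax_income`, and `net_income` are illustrative, and income is assumed to be non-negative and in 2004 dollars.

```python
# (upper limit of bracket, marginal tax rate), copied from Table B1
BRACKETS = [
    (10_250, 0.0765),
    (20_450, 0.1765),
    (49_150, 0.2265),
    (87_900, 0.3265),
    (110_750, 0.2645),
    (172_950, 0.2945),
    (329_350, 0.3445),
    (float("inf"), 0.3645),
]


def after_tax_income(y):
    """Post-tax income upsilon(Y): keep (1 - rate) of income inside each bracket."""
    net, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if y <= lower:
            break
        net += (1 - rate) * (min(y, upper) - lower)
        lower = upper
    return net


def net_income(gross_income, ssb_pre, ssb_taxable):
    """Total net income as in (B1): tax applies to gross income plus taxable benefits."""
    return after_tax_income(gross_income + ssb_taxable) + ssb_pre - ssb_taxable
```

Accumulating bracket by bracket reproduces the intercepts in the last column of the table up to rounding, which is a convenient check that the schedule is continuous.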

B.1 Social Security Benefits

The normal retirement age (NRA) is 65. The worker receives full Social Security benefits if he applies for the benefits at the NRA. The full retirement benefits are equal to the Primary Insurance Amount (PIA), which is a function of Average Indexed Monthly Earnings (AIME),

$$PIA_{it} = 0.9 \cdot \min\{bp_1, AIME_{it}\} + 0.32 \cdot \min\{bp_2 - bp_1, \max\{0, AIME_{it} - bp_1\}\} + 0.15 \cdot \max\{0, AIME_{it} - bp_2\} \tag{B4}$$

where $(bp_1, bp_2) = (612, 3689)$.

The AIME is computed as the monthly average earnings of the 35 years with the highest inflation-adjusted earnings. Only earnings subject to the Social Security tax are used in the calculation and therefore AIME is capped. The included earnings in a specific year are adjusted for wage inflation by multiplying by the wage growth rate relative to the base year, which is at age 60. The wage growth rate is calculated by dividing the average wage in the base year by the average wage in that specific year. Earnings after the base year are not adjusted. Interestingly, the wage growth rate of the national average wage index is very similar to the growth rate of CPI-U after Year 1969, as shown in Figure B1, so we ignore the small difference between these two and use the real wages to update AIME without adjustment.
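Equation (B4) is a three-segment schedule with replacement rates of 90%, 32%, and 15%. A minimal sketch, with the bend points passed as default arguments (the function name `pia` is an illustrative choice):

```python
def pia(aime, bp1=612.0, bp2=3689.0):
    """Primary Insurance Amount from AIME, following (B4)."""
    return (0.9 * min(bp1, aime)
            + 0.32 * min(bp2 - bp1, max(0.0, aime - bp1))
            + 0.15 * max(0.0, aime - bp2))
```

For example, an AIME of 500 lies entirely in the first segment, so the PIA is 90% of it; above the second bend point each additional dollar of AIME adds only 15 cents.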

Figure B1: Wage Index, CPI, and minimum wage share in AIME

[Figure: two panels. (a) Relative (to Year 2004) indices of National Average Wage Index and CPI-U, plotted against year (1940 to 2020); series: NAW index to 2004, CPI-U to 2004. (b) Share of minimum wage in AIME, assuming starting working from age 16, plotted against age.]

Computing exact AIME requires keeping track of the worker’s entire annual earnings history, which is computationally infeasible. Instead we apply an approximating method, taking into account the wage growth pattern over the life cycle,

$$AIME_{it+1} = AIME_{it} + \max\left\{0, \frac{sse_{it}}{35 \times 12} - share_{min}(t) \cdot AIME_{it}\right\} \tag{B5}$$

where $sse_{it} = \min\{H_{it}(1 - \ell_{it})(1 - I_{it}), \overline{sse}\}$ is included earnings, capped at $\overline{sse} = 87{,}900$ dollars. The $share_{min}$ is the share of minimum wage in AIME. Figure B1 plots the estimated $share_{min}(t)$ from CPS data for ages 52 to 76, assuming a starting working age of 16, and $share_{min}(t < 52) = 0$.
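The recursion in (B5) needs only the current AIME, current covered earnings, and the age-specific share. A minimal sketch follows; the function name and argument names are illustrative, and `share_min` must be supplied by the caller since the estimated shares are only shown graphically.

```python
def update_aime(aime, covered_earnings, share_min, cap=87_900.0):
    """One-step AIME update following (B5).

    covered_earnings: annual earnings before applying the cap
    share_min: share of the minimum included wage in AIME at the current age
    """
    sse = min(covered_earnings, cap)  # included earnings, capped
    return aime + max(0.0, sse / (35 * 12) - share_min * aime)
```

The `max` operator means AIME never falls: a year of low earnings simply leaves it unchanged.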

The early retirement age (ERA) is 62. Starting from the ERA, the worker is eligible to receive Social Security benefits at a reduced level. In this case, the benefit is reduced by 5/9 of one percent for each month before the NRA, or 6.67% per year, up to three years. Beyond three years, the benefit is reduced by 5/12 of one percent per month, or 5% per year.

On the other hand, delaying receipt of Social Security benefits after the NRA increases benefits. A delayed retirement credit (DRC) of 6% is given to the applicant for each delayed year up to age 69.³⁵ No DRC is given for applicants at age 70 or older.

³⁵ The 6% DRC is for cohorts born between 1935 and 1936 (inclusive). The DRC varies from 3% for cohorts born in 1924 or earlier to 8% for cohorts born in 1943 or later. In between, it increases by 0.5% every two years.

B.2 The Social Security Earnings Test

We use the Social Security earnings test rules of 1999. Social Security benefits could be withheld partly or totally if the worker is earning income while taking Social Security benefits at ages before 70.

For beneficiaries under age 65, $1 of benefits is withheld for every $2 of earnings in excess of the exempt amount ($10,885 in 2004 dollars). The benefit withholding rate for those aged 65–69 is $1 of benefits for every $3 of earnings in excess of the exempt amount ($17,575 in 2004 dollars). The following formula summarizes the earnings test,

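$$ET_{it} = \begin{cases}
\min\left\{ssb_{it}, \max\left\{0, \frac{Y^o_{it} - 10885}{2}\right\}\right\} & \text{if } ss_{it} + ssa_{it} > 0 \;\&\; 62 \le t < 65 \\
\min\left\{ssb_{it}, \max\left\{0, \frac{Y^o_{it} - 17575}{3}\right\}\right\} & \text{if } ss_{it} + ssa_{it} > 0 \;\&\; 65 \le t < 70
\end{cases} \tag{B6}$$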

If a whole year’s worth of benefits is withheld between ages 62 and 64, future benefits will be raised by 6.7% each year. If the benefit is withheld between ages 65 and 69, future benefits will be raised by 6.0%. Given our terminal age of 80, this is favorable for individuals aged 62 to 64 but not actuarially fair for individuals aged 65 or older. This is summarized by the following formula,

$$DRC^{ET}_{it} = \frac{ET_{it}}{ssb_{it}} \times \begin{cases}
0.067 & \text{if } ss_{it} + ssa_{it} > 0 \;\&\; 62 \le t < 65 \\
0.06 & \text{if } ss_{it} + ssa_{it} > 0 \;\&\; 65 \le t < 70
\end{cases}$$
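The earnings test and the associated credit can be sketched as two small functions. This is an illustration under stated assumptions: the `claiming` flag stands in for the condition $ss_{it} + ssa_{it} > 0$, amounts are annual 2004 dollars, and the function names are hypothetical.

```python
def earnings_test(ssb, income, age, claiming=True):
    """Benefits withheld under the earnings test, following (B6)."""
    if not claiming:
        return 0.0
    if 62 <= age < 65:
        return min(ssb, max(0.0, (income - 10_885) / 2))
    if 65 <= age < 70:
        return min(ssb, max(0.0, (income - 17_575) / 3))
    return 0.0  # no earnings test outside ages 62 to 69


def earnings_test_credit(withheld, ssb, age, claiming=True):
    """Proportional increase in future benefits due to withheld benefits."""
    if not claiming or ssb <= 0:
        return 0.0
    if 62 <= age < 65:
        return withheld / ssb * 0.067
    if 65 <= age < 70:
        return withheld / ssb * 0.06
    return 0.0
```

For instance, a 63-year-old with income of 20,885 and a benefit of 12,000 has 5,000 withheld, since income exceeds the exempt amount by 10,000 and half of the excess is withheld.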

B.3 Taxable Social Security Benefits

Social Security benefits are not taxable if they are the only income. If there is other income, compute “total income” as the sum of half of the benefits and all other income. If total income is no more than the base amount ($25,000 for head of household) then no benefits are taxable. If total income is higher than $34,000 then up to 85% of the benefits could be taxable. Defining $Y_{it} = Y^o_{it} + \frac{1}{2} ssb_{it}$, the taxable part of Social Security benefits is calculated as

$$ssb^{taxable}_{it} = \begin{cases}
0, & \text{if } Y_{it} \le 25000 \\
\min\left\{0.85 \cdot ssb_{it},\; \frac{1}{2}\min\left\{ssb_{it}, Y_{it} - 25000, 9000\right\} + 0.85 \max\left\{0, Y_{it} - 34000\right\}\right\} & \text{otherwise}
\end{cases} \tag{B7}$$
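A minimal sketch of (B7), assuming annual amounts in 2004 dollars; the function and argument names are illustrative.

```python
def taxable_benefit(ssb, other_income):
    """Taxable part of Social Security benefits, following (B7)."""
    total = other_income + 0.5 * ssb  # "total income": other income plus half of benefits
    if total <= 25_000:
        return 0.0
    return min(0.85 * ssb,
               0.5 * min(ssb, total - 25_000, 9_000) + 0.85 * max(0.0, total - 34_000))
```

With benefits of 12,000 and other income of 24,000, total income is 30,000, so half of the 5,000 excess over the base amount (2,500) is taxable; with other income of 40,000 the 85% ceiling binds.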

B.4 Disability Benefits

The Social Security Disability Insurance (SSDI) and the Supplemental Security Income (SSI) programs are the two largest Federal programs that provide assistance to people with disabilities.


To qualify for SSDI, workers cannot earn employment income higher than the disability threshold, namely the Substantial Gainful Activity (SGA) level, which is $810 per month for nonblind persons and $1350 for blind persons in 2004. We use the SGA of $810. Before the NRA, the SSDI benefit is based on AIME. Upon reaching the NRA, SSDI benefits are automatically converted to normal Social Security benefits and are no longer subject to the SGA earnings restriction. Workers with disabilities may also receive Supplemental Security Income (SSI). SSDI and SSI combined guarantee a minimum monthly benefit of $564. The following formula summarizes the disability benefits,

$$SSDI_{it} + SSI_{it} = \begin{cases}
0 & \text{if } t < 65 \;\&\; Y^o_{it} \ge 9720 \\
\max\{6768, PIA_{it}\} & \text{if } t < 65 \;\&\; Y^o_{it} < 9720 \\
ssb_{it} & \text{if } t \ge 65
\end{cases} \tag{B8}$$
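Equation (B8) can be sketched as follows. The thresholds 9,720 and 6,768 are the annualized SGA limit (12 × 810) and minimum benefit (12 × 564); the sketch takes the formula as printed and assumes the caller passes `pia` and `ssb` in the same annual units, which is an assumption of this illustration rather than something the text states.

```python
def disability_benefit(age, income, pia, ssb, sga_annual=9_720, floor_annual=6_768):
    """Combined SSDI + SSI benefit, following (B8) as printed."""
    if age >= 65:
        return ssb           # converted to the normal Social Security benefit
    if income >= sga_annual:
        return 0.0           # earnings above the SGA threshold disqualify
    return max(floor_annual, pia)
```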

B.5 Practical Implementation

We adjust the state variables $AIME_{it}$ (relevant for $ss_{it} = 0$) and $ssb_{it}$ (relevant for $ss_{it} = 1$) to reflect all aforementioned factors that affect current or future Social Security benefits, including early retirement, the Delayed Retirement Credit, and the benefit increase due to the earnings test. Specifically, we use the following formulas when numerically solving the life-cycle model.

Prior to claiming, $AIME_{it}$ is updated according to equation (B5). Note that this is not exact but is an approximation.

In the first year that an individual begins to claim (when $ss_{it} = 0$, $ssa_{it} = 1$), the initial $ssb_{it}$ is calculated as

$$ssb_{it} = PIA_{it} \cdot \left[1 - \left(\frac{65 - t}{15}\right) \cdot \mathbf{1}\{t < 65\}\right] \cdot \left[1 + 0.06 \cdot \min(4, t - 65) \cdot \mathbf{1}\{t > 65\}\right] \tag{B9}$$

where $PIA_{it}$ comes from equation (B4).

When $ss_{it} + ssa_{it} > 0$, the Social Security benefit is calculated as

$$ssb^{pre}_{it} = ssb_{it} - ET_{it} \cdot \mathbf{1}\{t < 70\} \tag{B10}$$

The $ssb_{it}$ is then updated as

$$ssb_{it+1} = ssb_{it} \cdot \left(1 + \frac{ET_{it}}{ssb_{it}} \cdot \left\{0.067 \cdot \mathbf{1}\{t < 65\} + 0.06 \cdot \mathbf{1}\{65 \le t < 70\}\right\}\right) \tag{B11}$$

The earnings test is calculated from equation (B6).
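The three steps in (B9) through (B11) can be sketched as small functions that a numerical solver would call in sequence. The names are illustrative, ages are integers, and the withheld amount `et` is assumed to have been computed from (B6) beforehand.

```python
def initial_benefit(pia, age):
    """Benefit level in the first claiming year, following (B9)."""
    early = (65 - age) / 15 if age < 65 else 0.0           # early retirement reduction
    delay = 0.06 * min(4, age - 65) if age > 65 else 0.0   # delayed retirement credit
    return pia * (1 - early) * (1 + delay)


def pre_tax_benefit(ssb, et, age):
    """Benefit actually paid this year, following (B10)."""
    return ssb - (et if age < 70 else 0.0)


def next_benefit(ssb, et, age):
    """Next year's benefit state after crediting withheld benefits, following (B11)."""
    if ssb <= 0:
        return ssb
    rate = 0.067 if age < 65 else (0.06 if age < 70 else 0.0)
    return ssb * (1 + et / ssb * rate)
```

Claiming at 62 with a PIA of 1,500 gives an initial benefit of 1,200 (a 20% reduction); if a full year of benefits is then withheld at age 63, the benefit state rises by 6.7% for the following year.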


Appendix C Alternative Human Capital Models

We compare our baseline human capital accumulation model with two variants. All other aspects of the model remain the same. The first variation assumes the innovation part in the human capital production function is completely exogenous. The second variation assumes the innovation only occurs if individuals work, but is exogenous conditional on work. This is essentially a learning-by-doing model as in, for example, Imai and Keane (2004), with an age-dependent human capital production function. To keep this comparable, we alter our baseline model as little as possible. We also restrict the number of total parameters to remain the same so that we are comparing models with similar levels of flexibility.

First we consider the model with exogenous human capital. In this case human capital evolves according to the function

$$H_{it+1} = (1 - \delta) H_{it} + \xi_{it}\, \pi \left(1 + \alpha_1 t + \alpha_2 t^2\right) \tag{C1}$$

where $t$ is potential experience. Notice that this is very close to our standard model from equation (6). We have exactly the same parameter names, except that $(\alpha_I, \alpha_H)$ are replaced with $(\alpha_1, \alpha_2)$ since their roles have changed considerably. In this case human capital evolves completely exogenously in the sense that individuals can do nothing to change their human capital. For this reason, we remove the moment condition of wage change rate after one nonemployment spell when estimating this exogenous model.

The parametrization of the second model is analogous. Here we alter the exogenous model so that human capital only grows for workers:

$$H_{it+1} = (1 - \delta) H_{it} + (1 - \ell_{it})\, \xi_{it}\, \pi \left(1 + \alpha_1 t + \alpha_2 t^2\right). \tag{C2}$$

We refer to this as the “learning-by-doing” model. Even though it looks quite similar to the exogenous model, as a practical matter it is very different as workers can control their human capital through their labor supply decision. When individuals do not work, their human capital depreciates at rate δ.³⁶
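The two laws of motion in (C1) and (C2) differ only in whether the innovation term is switched off when the individual does not work. A minimal sketch, with illustrative argument names and made-up parameter values in the usage lines:

```python
def next_human_capital(h, xi, pi, t, delta, alpha1, alpha2,
                       works=True, learning_by_doing=False):
    """One-period human capital update.

    learning_by_doing=False gives the exogenous model (C1);
    learning_by_doing=True gives (C2), where growth requires working.
    """
    growth = xi * pi * (1 + alpha1 * t + alpha2 * t**2)
    if learning_by_doing and not works:
        growth = 0.0  # (1 - l_it) = 0 when the individual does not work
    return (1 - delta) * h + growth


# Illustrative values only: a non-worker keeps growing under (C1) but not under (C2).
h_exogenous = next_human_capital(10.0, 1.0, 2.0, 5, 0.1, 0.0, 0.0, works=False)
h_lbd = next_human_capital(10.0, 1.0, 2.0, 5, 0.1, 0.0, 0.0, works=False,
                           learning_by_doing=True)
```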

In section 5 above we discuss two different reasons why our model can fit the life-cycle profiles of wages and labor supply, in particular the large increase in wages but small increase in labor supply at the beginning of the life cycle and the large decrease in labor supply but small decrease in wages at the end. The first is human capital depreciation: when workers stop working their human capital falls. The second is the distinction between measured wages and human capital. These two models allow us to see the relative importance of these two explanations because the exogenous human capital model lacks both of these features while the learning-by-doing model allows for the former but not the latter.

³⁶ We estimated two other versions of the production functions in (C1) and (C2): one replacing $(1 + \alpha_1 t + \alpha_2 t^2)$ with $H_{it}^{\alpha_H}$ and the other with $(1 + \alpha_1 H_{it} + \alpha_2 H_{it}^2)$ in the second term on the right-hand side. The results are very similar.

The estimates of these models are presented in Table C1 and the fit of the two models is presented in Figure C1. We first discuss the completely exogenous model. As expected, it is difficult for this model to fit both the labor supply and the two wage profiles at the same time. Precautionary savings and Social Security lead to income effects through which labor supply can fall late in life in the exogenous model. The problem is that fitting the decrease in labor supply at the end requires a very large labor supply elasticity (as well as a lot of sample selection bias to give an estimated flat wage). However, the large elasticity needed to explain labor supply at the end leads to a huge increase in labor supply at the beginning that we do not see in the data. To see the size of the elasticity, we estimate our version of the analogue to the empirical elasticity as above and present it in Figure C2. The exogenous model requires a substantially larger elasticity at most ages.³⁷

By contrast the learning-by-doing model fits the data well, though not quite as well as our baseline model. The elasticity of labor supply is much closer to the baseline model than it is to the exogenous model, as one can see from Figure C2 or from the fact that $a_\varepsilon$ takes on a value of 0.123, as opposed to 0.006 in the exogenous model, though still lower than the value of 0.203 in the baseline model. This results in a higher elasticity of labor supply at most ages than in the baseline model. The key to understanding this difference is human capital. When the human capital rental rate increases at age $t$, in the learning-by-doing model workers are able to adjust their labor supply decision throughout the whole life cycle, which is more efficient than in the exogenous model and consequently induces a smaller elasticity of labor supply. On the other hand, the baseline model gives a worker an extra channel for adjustment: the allocation of time between investment and working. This enables workers to react to the increased return to human capital even more efficiently than in the learning-by-doing model, and therefore yields the smallest elasticity at most ages. It is important to note here that we did not try a wide range of exogenous or learning-by-doing models; we just did a comparison between our baseline model and an

³⁷ The high school graduates start with zero initial assets and borrow to finance their consumption. As the wage increases rapidly, so does their labor supply. The borrowing constraint starts to bind around age 30, when they need to work a lot. Labor supply peaks and the elasticity of labor supply reaches its minimum. As the borrowing constraint relaxes and the wage flattens out and then falls, labor supply starts to fall at an increasing pace.


Table C1: Estimates of alternative human capital models^a

                                                               Exogenous^b          Learning-by-Doing^c
Parameters                                                Estimates          S.E.   Estimates       S.E.
HC depreciation^d                              δ              0.118       (0.004)       0.074    (0.003)
HC production function: on t                   α_1            0.013       (0.002)      -0.002    (0.001)
HC production function: on t^2 (×10^4)         α_2           -2.532       (0.555)     -0.0003   (0.0002)
Standard deviation of HC innovation            σ_ξ            0.725       (0.114)       0.313    (0.090)
Consumption: CRRA                              η_c            4.147       (0.041)       3.606    (0.035)
Consumption shifter: on t (×10)                ϕ_1            0.465       (0.086)       0.415    (0.043)
Consumption shifter: on t^2 (×10^2)            ϕ_2            0.097       (0.024)       0.046    (0.010)
Consumption shifter: on t^3 (×10^3)            ϕ_3           -0.038       (0.003)      -0.024    (0.001)
Consumption shifter: coef on married           ϕ_4            1.150       (0.117)       0.009    (0.004)
Leisure: standard deviation of shock           a_ε            0.006       (0.002)       0.123    (0.013)
Leisure: spouse not working                    a_1            2.069       (0.175)       1.172    (0.186)
Leisure: spouse working                        a_2            0.288       (0.021)      -1.180    (0.160)
Bequest weight                                 b_1       57,389,488  (14,610,945)   2,738,653  (556,092)
Parameter heterogeneity
Leisure: mean of intercept                     μ_a0          -6.235       (0.065)      -5.146    (0.062)
Leisure: standard deviation of intercept       σ_a0           0.518       (0.126)       1.222    (0.071)
HC productivity, mean                          μ_π            1.840       (0.003)       1.704    (0.036)
HC productivity, standard deviation            σ_π            0.657       (0.008)       0.637    (0.026)
Correlation between a_0 & π                    ρ             -0.683       (0.148)       0.494    (0.070)
Initial human capital level at age 18
Intercept                                      γ_0            1.610       (0.137)       1.138    (0.182)
Coefficient on a_0                             γ_a0           0.029       (0.011)       0.067    (0.020)
Coefficient on π                               γ_π            0.318       (0.083)       0.805    (0.065)
Standard deviation of error term               σ_H0           0.005       (0.003)       0.044    (0.036)
Additional Social Security Application effects
Effect of resource constraint (×10^3)          b_62           0.148       (0.030)       0.260    (0.061)
Effect of health insurance: constant (×10^3)   b_65           0.043       (0.028)       0.149    (0.073)
Effect of health insurance: coef on t (×10^3)  b_65t          0.190       (0.031)       0.327    (0.070)
χ^2 statistic^e                                                1185                       839
Degrees of freedom                                              206                       207

^a Indirect Inference estimates. Estimates use a diagonal weighting matrix. Standard errors are given in parentheses.
^b In the exogenous model, the human capital production function is $H_{t+1} = (1-\delta) H_t + \xi_t \pi (1 + \alpha_1 t + \alpha_2 t^2)$.
^c In the learning-by-doing model, the human capital production function is $H_{t+1} = (1-\delta) H_t + (1-\ell_t)\, \xi_t \pi (1 + \alpha_1 t + \alpha_2 t^2)$.
^d HC: Human Capital.
^e This is the J-statistic. The critical values of the χ^2 distribution are χ^2(206, 0.01) = 256 and χ^2(207, 0.01) = 257.
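The comparison of the J-statistics with the 1% critical values reported under Table C1 can be reproduced approximately with the short sketch below. It uses the Wilson-Hilferty approximation to the chi-square quantile, which is our choice for a dependency-free illustration and not necessarily how the reported critical values were computed; the function name `chi2_critical_value` is hypothetical.

```python
from statistics import NormalDist


def chi2_critical_value(df, alpha=0.01):
    """Approximate upper-alpha critical value of a chi-square(df) variable
    using the Wilson-Hilferty cube-root normal approximation."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


# (J-statistic, degrees of freedom) as reported in Table C1.
models = {"Exogenous": (1185, 206), "Learning-by-doing": (839, 207)}

for name, (j_stat, df) in models.items():
    crit = chi2_critical_value(df)
    print(f"{name}: J = {j_stat}, approx. 1% critical value = {crit:.1f}, "
          f"rejected = {j_stat > crit}")
```

The approximate critical values round to 256 and 257, in line with the table note, and both J-statistics exceed them by a wide margin. The learning-by-doing model nonetheless has the considerably smaller statistic of the two.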


Figure C1: Exogenous and learning-by-doing models moments

[Figure: six panels plotted against age. Panels: Employment Rates; Mean Log Measured Wages (FE); Mean Log Measured Wages; S.D. Log Measured Wages; Adult Equivalent Consumption (each over ages 22 to 65); Social Security Application (ages 60 to 70). Series: Data, Baseline, Exogenous, Learning-by-doing.]


Figure C2: Analogue to the Empirical Elasticity of Labor Supply: Exogenous model and Learning-by-doing model

[Figure: elasticities plotted against age (20 to 70). Series: Baseline, Exogenous, Learning-by-doing.]

Table C2: Transitions from alternative human capital models^a

                               Transition rates      Wage change rate^b
1  Data                         0.034     0.200          -0.071
2  Exogenous model              0.036     0.212             n/a
3  Learning-by-doing model      0.028     0.215          -0.056

^a The transition rate is the average transition probability between age 35 and 50.
^b The average wage change rate after one nonemployment spell is the average change rate between age 41 and 65.

exogenous or learning-by-doing model chosen to be close to our baseline model. Presumably alternative and more flexible models could fit the data better, though this is true of our baseline model as well.

This comparison between the fit of the three models suggests that the human capital depreciation rate is relatively more important for fitting the data than the difference between human capital and measured wages.

Panels B and C of Table C3 present the budget calculations from the seven policy experiments for these two alternative human capital models.


Table C3: Budget calculation for experiments for the three human capital models

                  0              1                   2                   3                   4                   5                   6                   7
               Baseline   Tax Increase 50%    No Earnings Test         NRA=67          Reduce SSB 20%       No SS Taxes        No SS Benefit        No SS System
                 Level^a  ∆Level^b      %∆^c    ∆Level        %∆    ∆Level        %∆    ∆Level        %∆    ∆Level        %∆    ∆Level        %∆    ∆Level        %∆

Panel A: Baseline Model
Total Output    1267.884    50.374     3.973     8.821     0.696    17.389     1.371    25.176     1.986  -115.227    -9.088   157.633    12.433     1.241     0.098
SS Tax            78.609     3.123     3.973     0.547     0.696     1.078     1.371     1.561     1.986   -78.609      -100     9.773    12.433   -78.609      -100
Medicare Tax      18.384     0.730     3.973     0.128     0.696     0.252     1.371     0.365     1.986   -18.384      -100     2.286    12.433   -18.384      -100
SS Benefit       207.244     2.173     1.049    -0.667    -0.322   -24.732   -11.934   -40.937   -19.753    -6.371    -3.074  -207.244      -100  -207.244      -100
SS Deficit^d     110.251    -1.681    -1.524    -1.342    -1.217   -26.062   -23.639   -42.863   -38.878    90.622    82.196  -219.303  -198.913  -110.251      -100
Income Tax       197.750    79.683    40.295     1.834     0.927     3.357     1.697     5.829     2.948   -44.644   -22.576    34.221    17.305   -19.537    -9.880

Panel B: Exogenous Model
Total Output    1275.091    30.746     2.411     3.726     0.292    14.819     1.162    20.914     1.640   -79.389    -6.226   109.143     8.560     4.853     0.381
SS Tax            79.032     1.905     2.411     0.231     0.292     0.918     1.162     1.296     1.640   -79.032      -100     6.766     8.561   -79.032      -100
Medicare Tax      18.489     0.446     2.411     0.054     0.292     0.215     1.162     0.303     1.640   -18.489      -100     1.583     8.560   -18.489      -100
SS Benefit       208.109     1.109     0.533    -0.121    -0.058   -24.651   -11.845   -41.120   -19.759    -3.597    -1.728  -208.109      -100  -208.109      -100
SS Deficit       110.589    -1.242    -1.123    -0.406    -0.367   -25.785   -23.316   -42.720   -38.629    93.924    84.930  -216.458  -195.732  -110.589      -100
Income Tax       194.495    76.734    39.453     1.961     1.008     3.119     1.604     5.293     2.721   -39.951   -20.541    29.530    15.183   -19.031    -9.785

Panel C: Learning-by-doing Model
Total Output    1300.218    36.540     2.810     3.876     0.298    13.562     1.043    18.396     1.415   -78.562    -6.042   119.327     9.177     5.868     0.451
SS Tax            80.614     2.266     2.810     0.240     0.298     0.841     1.043     1.141     1.415   -80.614      -100     7.398     9.177   -80.614      -100
Medicare Tax      18.853     0.530     2.810     0.056     0.298     0.197     1.043     0.267     1.415   -18.853      -100     1.730     9.177   -18.853      -100
SS Benefit       211.076     2.208     1.046     0.167     0.079   -24.593   -11.651   -41.232   -19.534    -5.428    -2.572  -211.076      -100  -211.076      -100
SS Deficit       111.609    -0.587    -0.526    -0.129    -0.116   -25.630   -22.964   -42.639   -38.204    94.038    84.257  -220.205  -197.299  -111.609      -100
Income Tax       188.214    74.207    39.427     1.331     0.707     2.547     1.353     4.366     2.320   -36.187   -19.227    28.511    15.148   -15.999    -8.501

^a The "Level" column refers to the aggregate value over the whole life cycle. For example, in the baseline model, the total output that an average individual produces between 18 and 80 is $1,268 thousand. All numbers in this column are in the unit of $1000.
^b The "∆Level" column refers to the difference of the total value between the current experiment and the baseline model. For example, in the "No Earnings Test" case, the total output is $8.821 thousand higher than that in the baseline model across the whole life cycle from 18 to 80. All numbers in this column are in the unit of $1000.
^c The "%∆" column refers to the percentage of the difference in the "∆Level" column relative to the level in the baseline model. For example, in the "No Earnings Test" case, the total output increases by $8.821 thousand which is equivalent to 0.696% of the total output in the baseline model. All numbers in this column are in the unit of %.
^d "SS Deficit" = SS Tax + Medicare Tax - SS Benefit. It is the deficit for the government budget.
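As a reading aid for Table C3, the sketch below recomputes two kinds of entries from Panel A: the "%∆" column from the "∆Level" and "Level" columns, and the SS Deficit row from the tax and benefit rows. The helper names `pct_change` and `deficit` are hypothetical. The sign convention in `deficit` (benefits minus the two payroll taxes) is inferred from the printed numbers, since it is the one that reproduces the positive baseline level of 110.251.

```python
# Panel A (baseline model) entries from Table C3, in $1000.
level = {
    "Total Output": 1267.884,
    "SS Tax": 78.609,
    "Medicare Tax": 18.384,
    "SS Benefit": 207.244,
    "SS Deficit": 110.251,
}
# "∆Level" entries for experiment 2 (No Earnings Test), Panel A.
delta_no_earnings_test = {
    "Total Output": 8.821,
    "SS Tax": 0.547,
    "Medicare Tax": 0.128,
    "SS Benefit": -0.667,
    "SS Deficit": -1.342,
}


def pct_change(delta, base):
    """Percentage change relative to the baseline level (the "%∆" column)."""
    return 100.0 * delta / base


def deficit(row):
    """SS deficit implied by a row of benefits and payroll taxes."""
    return row["SS Benefit"] - row["SS Tax"] - row["Medicare Tax"]


print(round(pct_change(delta_no_earnings_test["Total Output"], level["Total Output"]), 3))
print(round(deficit(level), 3))
print(round(deficit(delta_no_earnings_test), 3))
```

The same two checks can be applied to any other column or panel by substituting the corresponding entries.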


Appendix D College Graduates

We estimate the model for college graduates separately as well. The results are shown in Tables D1 and D2 and Figure D1. One can see that the basic results and fit are quite similar to those of the baseline model.

Table D1: Transitions for college graduates^a

                Transition rates      Wage change rate^b
12  Data         0.018     0.296          -0.095
13  Model        0.022     0.332          -0.091

^a The transition rate is the average transition probability between age 35 and 50.
^b The average wage change rate after one nonemployment spell is the average change rate between age 41 and 65.


Table D2: Estimates in the baseline model for college graduates^a

Parameters                                                Estimates Standard Errors
HC depreciation^b                              δ              0.097         (0.005)
HC production function: I factor               α_I            0.084         (0.026)
HC production function: H factor               α_H            0.192         (0.022)
Standard deviation of HC innovation            σ_ξ            0.303         (0.088)
Consumption: CRRA                              η_c            3.791         (0.031)
Consumption shifter: coef on t (×10^-1)        ϕ_1            0.166         (0.048)
Consumption shifter: coef on t^2 (×10^-2)      ϕ_2            0.145         (0.026)
Consumption shifter: coef on t^3 (×10^-3)      ϕ_3           -0.042         (0.005)
Consumption shifter: coef on married           ϕ_4            1.933         (0.293)
Leisure: standard deviation of shock           a_ε            0.132         (0.013)
Leisure: spouse not working                    a_1            1.063         (0.084)
Leisure: spouse working                        a_2           -1.346         (0.254)
Bequest weight                                 b_1       27,130,480     (5,836,684)
Parameter heterogeneity^c
Leisure: mean of intercept                     μ_a0          -5.933         (0.063)
Leisure: standard deviation of intercept       σ_a0           0.367         (0.148)
HC productivity, mean                          μ_π            2.216         (0.077)
HC productivity, standard deviation            σ_π            0.899         (0.050)
Correlation between a_0 & π                    ρ             -0.023         (0.002)
Initial human capital level at age 18
Intercept                                      γ_0            1.697         (0.163)
Coefficient on a_0                             γ_a0          -0.024         (0.005)
Coefficient on π                               γ_π            0.358         (0.061)
Standard deviation of error term               σ_H0           0.471         (0.062)
Additional Social Security Application effects
Effect of resource constraint (×10^3)          b_62           0.086         (0.032)
Effect of health insurance: constant (×10^3)   b_65           0.025         (0.005)
Effect of health insurance: coef on t (×10^3)  b_65t          0.071         (0.014)

χ^2 statistic = 874^d    Degrees of freedom = 207

^a Indirect Inference estimates. Estimates use a diagonal weighting matrix. Standard errors are given in parentheses.
^b HC: Human Capital.
^c The joint distribution of (a_0, π) is a parametric discrete distribution with nine points determined by these five parameters, using a nine-point Gauss-Hermite approximation.
^d This is the J-statistic. The critical value of the χ^2 distribution is χ^2(207, 0.01) = 257.
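The table note on parameter heterogeneity does not spell out how the nine support points are constructed. One common construction, assumed here purely for illustration, is a 3×3 tensor product of the three-point Gauss-Hermite rule for a standard normal, with correlation introduced through a Cholesky-type rotation. The sketch below applies it to the five heterogeneity estimates in Table D2; treating both a_0 and π as jointly normal in levels, as well as the function names, are assumptions of this sketch and may differ from the construction used in the estimation.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal variable:
# nodes are the roots of He_3(x) = x^3 - 3x, with weights 1/6, 2/3, 1/6.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def discretize_bivariate_normal(mu_a, sd_a, mu_p, sd_p, rho):
    """Return nine (a0, pi, probability) points on a 3x3 tensor grid."""
    points = []
    for z1, w1 in zip(NODES, WEIGHTS):
        for z2, w2 in zip(NODES, WEIGHTS):
            a0 = mu_a + sd_a * z1
            # The second coordinate loads on z1 to induce correlation rho.
            p = mu_p + sd_p * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
            points.append((a0, p, w1 * w2))
    return points


def correlation(points):
    """Probability-weighted correlation between the two coordinates."""
    mean_a = sum(w * a for a, p, w in points)
    mean_p = sum(w * p for a, p, w in points)
    cov = sum(w * (a - mean_a) * (p - mean_p) for a, p, w in points)
    var_a = sum(w * (a - mean_a) ** 2 for a, p, w in points)
    var_p = sum(w * (p - mean_p) ** 2 for a, p, w in points)
    return cov / math.sqrt(var_a * var_p)


# Heterogeneity estimates for college graduates from Table D2.
grid = discretize_bivariate_normal(-5.933, 0.367, 2.216, 0.899, -0.023)
print(len(grid), round(correlation(grid), 3))
```

By construction the nine points reproduce the two means, the two standard deviations, and the correlation exactly, which is why five parameters are enough to pin down the whole discrete distribution under this assumption.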


Figure D1: Fit of model with college graduates

[Figure: six panels plotted against age. Panels: Employment Rates; Mean Log Measured Wages (FE); Mean Log Measured Wages; S.D. Log Measured Wages; Adult equivalent consumption (each over ages 26 to 65); Social Security Application (ages 60 to 70). Series: Simulation, Data.]
