
September 19, 2000
date last saved: 09/19/00 2:46 PM
date last printed: 09/19/00 2:46 PM

Assessing the Performance of the Lee-Carter Approach to Modeling and Forecasting Mortality

Ronald Lee
Demography and Economics
University of California
2232 Piedmont Ave
Berkeley, CA 94720
[email protected]

Timothy Miller
Demography
University of California
2232 Piedmont Ave
Berkeley, CA 94720
[email protected]

Research for this paper was funded by a grant from NIA, AG11761. We thank John Wilmoth for making available mortality data for the US, France, Sweden, and Japan, through the Berkeley Mortality Data Base. We thank Statistics Canada and Francois Nault for making Canadian data available. John Wilmoth and Ken Wachter provided very useful suggestions for the analysis. This paper is available in electronic form at www.demog.berkeley.edu.

Abstract

The Lee-Carter method for forecasting mortality was published eight years ago, with an application to US mortality data, 1900-1989. The method has been quite well received, but there have also been criticisms. Some have thought that the probability bands are implausibly narrow. Others have argued that many age specific rates are so low that they can’t realistically be projected to decline much further. Some argue that it must be suboptimal to ignore biomedical information that might inform the forecasts, and that forecasts based on expert opinion should be preferred. Some have called for more within sample testing of the methods, and others have questioned whether the $a_x$ and $b_x$ should be treated as invariant. Bell (1997) noted that the model did not fit the jump off data very well. In this paper we will examine many of these issues.

This paper will assess the performance of the 1992 forecast over the years since 1989. It will also conduct some more demanding tests of its performance within sample for the US as well as for some other countries. It will compare within sample performance to the performance of the projections of the Social Security Administration (SSA) and some other US forecasts. It will consider some extensions and modifications of the original procedure.

Results include:

• The original forecast started with an initial level of e0 that was .6 years higher than the actual for 1989. This error was carried over to all subsequent years of the forecast. Adjusting for this error in data for initial level, the forecast was within 0.2 years of e0 in 1998 and similarly close to the rates of decline of the individual age groups from 1989 to 1997.

• Applying the method retroactively to project to e0 in 1998, using only data available up to each historic start point, the hypothetical forecasts are quite accurate, with forecasts starting in 1946 having errors of two years or less. The 95% probability bounds contained the true value for 1998 85% of the time.

• We analyze 78 hypothetical forecasts with jump-off years from 1920 to 1997 and forecast horizons from 78 years to 1 year. The method tended to under-predict gains in life expectancy in the US, particularly when launched from earlier dates. 91% of errors at 31-40 year horizons were negative (predicted e0 less than actual) and 100% of errors beyond a 50 year horizon were negative. The true e(0) fell within the 95% probability interval for 2,984 out of 3,081 forecasted e(0) values or 97% of the time. The probability bounds appear to be too broad for horizons up to 40 years and too narrow for horizons beyond 50 years.

• The average error and mean squared error for LC forecasts since 1950 are substantially lower than those of SSA since 1950.

• If the method had been used to forecast 1995 e0 for Sweden, starting in 1950, it would have been right on target until 1980, and two years too low in 1995. Results for France and Canada are very similar. For Japan, the data only start in 1950; forecasts from 1975 to 1996 are below the actual value, and one year too low by 1996. Looking at all the forecasts combined, the 95% probability bounds contain the actual e(0) values for 152 out of 162 forecasted values or 94% of the time.

• There have been very significant changes in the relative rates of decline of mortality by age, in the US, Sweden, France, Canada, and Japan, contrary to an assumption of the original method. This requires that the $a_x$ and $b_x$ coefficients be estimated on data since 1950 or so, not over the whole century.

• Forecasts should use actual last observed death rates as the base for forecasts, as described in the paper. Second stage fitting can be done more easily using actual e0 as the fit criterion in place of matching the total number of deaths.

I. Introduction

Important policy decisions are made today based on forecasts of the elderly population 75 years in the future. Pension policies are the prime example. Fundamental changes in the US Social Security Administration are under consideration in part because of a financial crisis for the system which is based on long term population projections. Old age dependency ratios are the key variable in these forecasts, and they depend on the number of elderly in the numerator, and the number of working age people in the denominator. The denominator depends heavily on future trends in fertility and perhaps migration, and these are notoriously difficult to forecast. The elderly in the numerator have already been born, at least for forecasts over a 65 year horizon, and so they are on firmer ground. Yet the record of demographers and official agencies in forecasting their numbers is flawed. A series of studies by Keilman (1999; 1997) has found systematic underprediction of the elderly population in industrial nations, by about .5% for each year of a forecast, so that after 75 years one might expect the actual number to exceed the forecast by as much as 60% (=1/(1-75*.005)). For the “oldest old”, those over 85, the under-prediction occurs at about 1% per year of the forecast, so after 75 years the actual number could exceed the forecast by as much as 300% (=1/(1-75*.01)). While immigration must have contributed to these errors, the main culprit is the systematic under-prediction of mortality decline and life expectancy gain. We will suggest that these problems continue in the recent and current forecasts of industrial nations.

In this paper, we will present an ex post assessment of the performance of the mortality forecasts of the Social Security Administration, and find some evidence of bias towards the prediction of smaller increases in life expectancy at birth than subsequently occurred. Our main purpose, however, is to make a careful and detailed assessment of the performance of the Lee-Carter method for forecasting mortality. We evaluate the performance both in terms of projected e(0) and projected age-specific mortality rates. We first examine the performance of the forecasts published in 1992 relative to subsequent mortality trends. We next construct forecasts with jump-off years earlier in the 20th century, pretending we had only the data available up to that point, and comparing the subsequent forecasts to the actual outcomes. We also conduct some similar, but less detailed, experiments using the method to produce forecasts for Japan, Canada, France and Sweden, with jump-off year in 1950. And last, we examine age patterns of decline during the 20th century and consider the possibility that the age pattern has changed over time contrary to the assumptions of the method.

II. Original article (1992)

A. Overview of the LC approach

Lee and Carter (1992, henceforth LC) developed a new method for modeling and forecasting mortality, and used it to forecast US mortality to 2065. Since that time, the method has attracted a certain amount of attention. The most recent Census Bureau population forecasts (Hollmann et al., 2000) use the Lee-Carter forecast as a benchmark for their long-run forecast of US life expectancy. The two most recent Social Security Advisory Panels have recommended the adoption of the method, or forecasts consistent with it, by the Trustees. The method has also been applied in a number of other countries (most recently for the G7 nations, see Tuljapurkar et al., 2000). We begin this section with a brief overview of the approach followed by an assessment of the performance of the 1992 forecast over the years since 1989, the jump-off year of the forecast.

The basic LC model of age specific death rates (ASDRs, denoted $m_{x,t}$) is:

$\ln(m_{x,t}) = a_x + b_x k_t + \varepsilon_{x,t}$ (Equation 1)

Here $a_x$ describes the general age shape of the ASDRs, while $k_t$ is an index of the general level of mortality. The $b_x$ coefficients describe the tendency of mortality at age x to change when the general level of mortality ($k_t$) changes. When $b_x$ is large for some x, then the death rate at age x varies a lot when the general level of mortality changes (as with x=0 for infant mortality, for example), and when $b_x$ is small, then the death rate at that age varies little when the general level of mortality changes (as is often the case with mortality at older ages). Note that the model assumes that all the ASDRs move up or down together, although not necessarily by the same amounts, since all are driven by the same period index, $k_t$. In principle, not all the $b_x$ need have the same sign, in which case movement in opposite directions could occur, but in practice, all the $b_x$ do have the same sign, at least when the model is fit over fairly long periods. Note that the proportional rate of change of any death rate is given by $b_x (dk/dt)$. If $dk/dt$ is constant, that is if $k_t$ is declining linearly, then each ASDR will decline at its own age specific exponential rate, proportional to $b_x$, and depending on the rapidity of the decline in $k_t$. The same model was selected by Gomez de Leon (1990) using exploratory data analysis on the historical data for Norway, out of a larger set of possibilities.
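The statement about age specific exponential decline can be written out in one step. The following is a short worked derivation from Equation 1, ignoring the error term; the symbol $c$ for the constant value of $dk/dt$ and the reference time $t = 0$ are notational assumptions made here for illustration.

```latex
\frac{d}{dt}\ln m_{x,t} = b_x \frac{dk_t}{dt}
\qquad\Longrightarrow\qquad
m_{x,t} = m_{x,0}\, e^{\,b_x c\, t}
\quad\text{when } \frac{dk_t}{dt} = c \text{ is constant.}
```

With $c < 0$ and $b_x > 0$, each death rate falls exponentially at the rate $-b_x c$ per year, so ages with larger $b_x$ decline faster.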

The strategy is to estimate this model on the historical data for the population in question, obtaining values for $a_x$, $b_x$ and $k_t$. The values of $k_t$ form a time series, with one value for each year of data. Standard statistical methods can then be used to model and forecast this time series. LC selected a random walk with drift as the appropriate model, which has the form:

$k_t = k_{t-1} + c + e_t$ (Equation 2)

In this specification, c is the drift term, and k is forecast to decline linearly with increments of c, while deviations from this path, $e_t$, are permanently incorporated in the trajectory. The variance of $e_t$ is used to calculate the uncertainty in forecasting k over any given horizon. The drift term, c, is also estimated with uncertainty, and the standard error of its estimate can be used to form a more complete measure of the uncertainty in forecasting k.
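A minimal sketch of the random walk with drift forecast described above may help make the two sources of uncertainty concrete. The function names `fit_rwd` and `forecast_rwd` are hypothetical, and the variance formula (innovation variance growing with the horizon h, plus drift uncertainty growing with h squared) is the standard textbook result for this model, assumed here rather than taken from the paper's text.

```python
import numpy as np

def fit_rwd(k):
    """Estimate a random walk with drift from a series of k values."""
    d = np.diff(np.asarray(k, dtype=float))   # first differences k_t - k_{t-1}
    c = d.mean()                              # drift estimate
    sigma = d.std(ddof=1)                     # s.d. of the innovations e_t
    se_c = sigma / np.sqrt(len(d))            # standard error of the drift estimate
    return c, sigma, se_c

def forecast_rwd(k_last, c, sigma, se_c, horizon, z=1.96):
    """Mean forecast and approximate 95% interval for k over 1..horizon years."""
    h = np.arange(1, horizon + 1)
    mean = k_last + c * h
    # Innovation variance accumulates linearly in h; drift error scales with h.
    sd = np.sqrt(h * sigma**2 + (h * se_c) ** 2)
    return mean, mean - z * sd, mean + z * sd
```

Dropping the `(h * se_c) ** 2` term gives the narrower interval based on the innovation variance alone.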

The projected k can then be used in Equation 1, together with the estimated $a_x$ and $b_x$, to calculate forecasts of the ASDRs, and from these any desired life table functions can be derived. The probability intervals on the forecasts of k can then be used in the same way to calculate intervals for the forecasts of the ASDRs, and (because these are all linear functions of the same k) the forecast of e0. However, forecast errors in the ASDRs and e0 derive additionally from the $\varepsilon_{x,t}$ and from uncertainty about the true values of $a_x$ and $b_x$. LC show that these latter sources of error matter less and less as the forecast horizon lengthens, and they are dominated by uncertainty about k in the long run. For a forecast horizon of 10 years, 98% of the standard error of the forecast of e0 is accounted for by uncertainty in k; for the individual age specific rates, the other sources of uncertainty are more important initially and remain important longer, but after 25 years most account for less than 10% of the standard error of the forecasts (see LC table B2).

From inspection of Equation 1 it is apparent that there is no observed variable on the right hand side of the equation, so ordinary regression methods cannot be used to estimate the model. LC describe a simple approximate method using regression methods, but the Singular Value Decomposition (SVD) gives an exact least squares fit. Also note that if $a_x$, $b_x$ and $k_t$ is one set of coefficients for the model, then $a_x$, $b_x/A$ and $A k_t$ will be an exactly equivalent set, for any nonzero constant A. Similarly, $a_x - b_x A$, $b_x$, $k_t + A$ will also be an equivalent formulation for arbitrary constant A. LC stipulated a unique representation by setting $a_x$ equal to the average of the logarithms of $m_{x,t}$ over the data period, and setting the average value of $k_t$ equal to zero. In this case the sum of the $b_x$ values is unity.
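The SVD fit and the normalization just described can be sketched in a few lines. This is an illustrative first-stage fit only, assuming a complete matrix of log death rates with ages in rows and years in columns; the function name `fit_lee_carter` is hypothetical, and the sketch assumes the leading $b_x$ values do not sum to zero.

```python
import numpy as np

def fit_lee_carter(log_m):
    """First-stage Lee-Carter fit by SVD.

    log_m: array of shape (ages, years) holding log death rates.
    Returns (a, b, k) with sum(b) = 1 and mean(k) = 0.
    """
    log_m = np.asarray(log_m, dtype=float)
    a = log_m.mean(axis=1)                     # a_x: average log rate over the period
    centered = log_m - a[:, None]              # rows now average to zero over time
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    b = U[:, 0]                                # leading age pattern
    k = s[0] * Vt[0]                           # leading time index
    scale = b.sum()                            # rescale so that the b_x sum to one
    return a, b / scale, k * scale
```

Because each row of the centered matrix averages to zero, the fitted $k_t$ average to zero without any further adjustment; dividing $b_x$ and multiplying $k_t$ by the same constant is the first of the two equivalences noted above.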

The method has a number of appealing features. The basic model is very simple, and although its use for forecasting involves a number of steps, each is simple in itself. The method is “relational” in demographers’ terminology. That is, it involves the transformation of actual existing mortality schedules for each study population, and therefore on the one hand is largely non-parametric, and on the other hand incorporates particular features of the mortality pattern of a given population. The method is also probabilistic, in the sense that it involves statistical fitting of models, and the quality of the fit of the historical data can be used to provide probability intervals for the forecasts. As a matter of empirical fact, in the applications of the method to date, involving at least ten national data sets, the historical trend in k has always been found to be highly linear with time, and the random walk with drift has been found to give a good fit. This approximate linearity is useful for forecasting. It contrasts with the typically nonlinear trajectories of life expectancy, which rises at a decelerating rate when age specific mortality rates decline at constant exponential rates. Finally, the method can also be used as the basis of a simple model life table system, and indirect estimation methods can be developed to expand the mortality data available as the basis for forecasting.

B. Assessing the original forecast

In their original article, LC noted that the model would not fit the age specific mortality data exactly in the jump off year, which would mean that the initial conditions for the forecast would not be quite right. This would inevitably lead to error which would be particularly important in the early years of the forecast. They noted that it would be possible to set $a_x$ equal to the most recently observed log age specific rates, and thereby fit the initial conditions exactly (with $k_t = 0$). However, they argued that this practice might extrapolate idiosyncratic features of mortality in the jump off year, and it was therefore preferable to estimate $a_x$ as the average values of the log death rates (LC:665-666). In retrospect, this appears to have been a mistake, since the error in e0 of .6 years at the jump off year caused significant bias in the forecasts for the first decade, as we shall see below, and as Bell (1997) has pointed out (LC estimated e0 for 1989 at 75.66 years, whereas official data puts it at 75.08). Bell (1997) assessed the performance of four mortality forecasts: LC (as published); LC (with the jump off year corrected); McNown-Rogers; and the SSA actuaries. He concluded that the LC forecasts did better than the SSA or McNown-Rogers, but that a corrected LC forecast did better still.

Figure 1 displays the original LC mean forecast of e0, a similar forecast but with the correct jump-off level, and the SSA projections done at the same time. The bias in the original LC projections is clearly apparent, but it is also apparent that those projections correctly identified the trend in e0. SSA appears to be somewhat low, ending up about 0.8 years below the actual e0. The adjusted LC is about 0.2 years too low in 1998 (the latest data available to us). Over this period, the actual e0 always remains well within the 95% prediction interval for both the original LC and the adjusted LC.

If the forecasts of e0 performed well from 1989 to 1998, how about the forecasts of the individual age specific rates? Once again, there are certainly errors due to the errors in initial conditions. Figure 2 instead focuses on the LC projected age specific rate of decline of death rates from 1989 to 1997 for sexes combined, since this will not be affected by the errors in initial rates. It also plots the actual rates of decline, and those projected by SSA. The agreement between the LC forecast and the actual rates of decline is striking, particularly at the older ages. The SSA projections, however, incorrectly forecast slower mortality decline in the young adult years. We will return to this topic later, for a different perspective on the age pattern of decline.

C. Criticisms and advances since publication

The method has been quite well received, but there have also been criticisms. Some have thought that the probability bands are implausibly narrow (e.g. Alho, 1992:673). Others have argued that many age specific rates are so low that they can’t realistically be projected to decline much further. Some argue that biomedical information should inform the forecasts, perhaps through incorporating expert opinion as is done by the Social Security Actuaries. Some have called for more within-sample testing of the methods, and others have questioned whether the $a_x$ and $b_x$ should be treated as invariant. Bell (1997) noted that the model did not fit the jump off data very well. In this paper we will examine many of these issues.

Considerable work has been done to refine and extend the method since the original LC article. Wilmoth (1993) has developed improved fitting methods based on weighted least squares. Methods for modeling and forecasting regional systems of mortality have been developed (Lee and Nault, 1993). Better procedures for dealing with the jump-off year have been developed (Bell, 1997). Alternatives for modeling mortality for the oldest old have been explored. Consideration has been given to the special role of leader and follower countries (Wilmoth, 1998). The method has been applied to cause of death data (Wilmoth, 1998), to sexes separately, and by race (Carter and Lee, 1992; Carter, 1996). There have been many applications to countries other than the US (e.g., Lee and Rofman, 1992; Tuljapurkar et al., 2000). Lee (2000) provides a summary of the model’s development, extensions, and applications such as stochastic forecasts of social security system finances.

III. Assessing LC on US time series, within sample

A. The nature of the tests

In the original LC article, there were some tests of forecast performance within the historical data period, but none of these involved re-estimating $a_x$, $b_x$ and $k_t$. Instead, time series models were fit to different portions of the time series of estimated $k_t$. Here we will make a more rigorous test, in which we refit the model from scratch on each chosen sub-sample of data. Our earliest experimental forecast is based on data from 1900 through 1920. Our next uses data 1900 through 1921; our next through 1922; and so on until our last forecast uses data from 1900 through 1997 to make a forecast for 1998. In this way, we have 78 different forecasts for mortality one year ahead; 77 for a two year horizon; and finally one with a 78 year horizon. We re-estimated the $a_x$ and $b_x$ for each set of data, and then re-estimated $k_t$ for these years conditional on these $a_x$ and $b_x$ estimates, by choosing $k_t$ (in the second stage) so as to match exactly the given value of e0 in the data for that year.[i] This departs slightly from the procedure in the original LC, where $k_t$ was chosen to match total deaths, which requires annual age-distributed population data as well.
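The second-stage step of choosing $k_t$ to reproduce an observed e0 can be sketched as a one-dimensional root search. The life table below is a deliberately simplified assumption (constant death rate within each age group, last group open-ended), and the names `life_expectancy` and `solve_k` are hypothetical; the paper does not specify its life table construction or search method.

```python
import numpy as np

def life_expectancy(m, widths):
    """e0 from death rates m, assuming a constant rate within each age group.

    widths: length in years of every age group except the last, open-ended one.
    """
    m = np.asarray(m, dtype=float)
    widths = np.asarray(widths, dtype=float)
    survive = np.exp(-m[:-1] * widths)                    # survival through each closed group
    l = np.concatenate(([1.0], np.cumprod(survive)))      # survivors at start of each group
    lived = l[:-1] * (-np.expm1(-m[:-1] * widths)) / m[:-1]  # person-years in closed groups
    return lived.sum() + l[-1] / m[-1]                    # add the open-ended group

def solve_k(a, b, widths, target_e0, lo=-100.0, hi=100.0, tol=1e-10):
    """Bisection for the k at which exp(a + b*k) implies target_e0.

    Assumes every b_x is positive, so e0 falls steadily as k rises.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)

    def gap(k):
        return life_expectancy(np.exp(a + b * k), widths) - target_e0

    if gap(lo) < 0 or gap(hi) > 0:
        raise ValueError("target e0 is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:      # implied e0 too high, so mortality must rise: increase k
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Matching e0 in this way needs only the death rates and the observed life expectancy for each year, which is why it avoids the annual age-distributed population data required when matching total deaths.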

Once $k_t$ was estimated for each year of the sample, we did not carry out standard diagnostic methods to choose an optimal ARIMA model for each data sub-sample, but rather assumed that the random walk with drift model held. It was fitted and used to forecast $k_t$ over the desired time range.

LC introduced a dummy variable for the influenza epidemic of 1918. Our preference today is to include the dummy (permitting a one time positive change in k in 1918, followed by a one time equal negative change in k in 1919), and in the forecast to incorporate a 1/T chance of an identical positive and negative change in k occurring, where T is the length of the base period over which the model was fit. This has a small effect on both the mean and the variance of the forecast. We did not do this for the experimental forecasts described here.

B. Forecasting to 1998 (e0)

Figure 3 plots all 78 forecasts for life expectancy in the year 1998, each from a different jump-off year, and each over a different forecast horizon. Each forecast for 1998 is plotted above its jump-off date. The 95% probability intervals are also plotted. The horizontal line indicates the observed value of life expectancy for 1998, so it is the true value relative to which the forecasts can be assessed. There are several points to note. First, although the experimental forecasts tend to be too low, they are generally fairly close to the actual value for 1998. The earlier forecasts, using data up through the 1920s and 1930s, are on average five years below the true value; beginning in 1946 forecasts are within two years of the correct value. Over all, the mean forecasts look quite good. Second, the 95% probability intervals failed to contain the true value for 1998 in 12 out of the 78 forecasts, or 15% of the time, compared to the 5% which was intended. Third, the median forecast for 1998 fell below the actual value for 1998 in 74 of the 78 forecasts, or 95% of the time.

C. Errors by forecast horizon (e0)

It is also useful to assess forecast errors (forecast-actual) by horizon. We have done this for horizons of 1, 5, 10, 20, 40 and 60 years. For a 1 year horizon, we have 78 different jump-off dates, while for the 60 year horizon, we have only 19. For each forecast, we find the percentile in its probability distribution where the observed value falls. For example, if the actual corresponds to the median of the forecast distribution, we assign it 50. If it corresponds to the lower 7% of the distribution, we assign it 7; and so on. We then plot the frequency distribution of these percentile scores. If the probability distribution associated with each forecast does in fact describe the probability distribution of errors, then this frequency distribution should be uniform between 0 and 100. If the actual distribution of percentiles is more concentrated in the middle, around 50, that indicates that the distribution of the errors is more tightly clustered than our forecast leads us to expect, and if there are fewer in the middle of the distribution and more towards the 0 and 100 ends, then our forecast understates the width of the error distribution. If most of the true values fall below the 50th percentile, then most of the time we have over-estimated, while if they fall above the 50th percentile, then we tend to systematically underestimate the true value.
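The percentile scoring described above can be illustrated with a short sketch. It assumes, purely for illustration, that each forecast distribution of e0 is normal with a known mean and standard deviation; the paper's own intervals are derived from the distribution of k, so this is an approximation, and the function names are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def percentile_score(actual, mean, sd):
    """Percentile (0 to 100) of the forecast distribution at which the actual falls.

    Assumes a normal forecast distribution, which is an illustrative simplification.
    """
    z = (actual - mean) / sd
    return 100.0 * 0.5 * (1.0 + erf(z / sqrt(2.0)))

def score_histogram(scores, bins=10):
    """Share of percentile scores falling in each equal-width bin on [0, 100]."""
    counts, _ = np.histogram(scores, bins=bins, range=(0.0, 100.0))
    return counts / counts.sum()
```

A well calibrated set of forecasts would give roughly equal shares in every bin; shares piled up above 50 correspond to actual values that tend to exceed the median forecast.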

Figure 4 plots the histogram of the percentiles for each horizon.

Table 1 presents various measures of forecast performance, including the average error (bias), the mean absolute deviation (MAD), the root mean squared error (RMSE), the mean absolute percent error (MAPE), the percent of forecasts that under-projected the actual value, and the percent of actual values that fall within the 95% probability interval of the forecast. The table reports performance by forecast horizon as well as a summary over all forecast horizons.

Table 1

Forecast   Average   MAD    RMSE   MAPE   Number of   % under-    % within 95%
horizon    error                          estimates   projected   probability interval
1-5        -0.11     0.45   0.60   0.16     380        54           99
6-10       -0.32     0.82   1.03   0.47     355        56          100
11-20      -0.73     1.23   1.60   1.15     635        67           97
21-30      -1.37     1.47   1.99   2.03     535        84          100
31-40      -1.68     1.73   2.14   2.45     435        91          100
41-50      -2.23     2.25   2.75   3.41     335        96           95
51-60      -3.54     3.54   3.75   5.07     235       100           89
61-78      -4.38     4.38   4.53   5.39     171       100           80
ALL        -1.49     1.76   2.34   2.45   3,081        78%          97%

The method tended to under-predict gains in life expectancy in the US, particularly when launched from earlier dates. 91% of errors over 31-40 year horizons were negative (predicted e0 less than actual) and 100% of errors beyond a 50 year horizon were negative. The 95% confidence bounds contain the actual e(0) value 97% of the time. But they appear to be too broad for intervals up to a 40 year horizon and too narrow for those beyond a 50 year horizon.
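For readers who want to reproduce summary measures of the kind reported in Table 1, a minimal sketch follows. It assumes that errors are defined as forecast minus actual (as stated in the text), that MAD is the mean absolute error, and that MAPE is expressed in percent of the actual value; these definitions and the function name `error_summary` are assumptions, since the paper does not spell out its formulas.

```python
import numpy as np

def error_summary(forecast, actual, lower, upper):
    """Summary error measures for a set of e0 forecasts and their 95% bounds."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    err = forecast - actual                                  # negative = under-projection
    return {
        "average_error": err.mean(),
        "mad": np.abs(err).mean(),
        "rmse": np.sqrt((err ** 2).mean()),
        "mape": 100.0 * (np.abs(err) / actual).mean(),
        "pct_under": 100.0 * (err < 0).mean(),
        "pct_within": 100.0 * ((actual >= lower) & (actual <= upper)).mean(),
    }
```

Applying the function separately to the forecasts in each horizon group would give one row per group, and applying it to all forecasts together would give the summary row.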

D. Error correlations by age, horizon

As noted briefly above, Equation 1 has an error term, $\varepsilon_{x,t}$, since the expression does not provide a perfect representation of variation in age specific rates over time. In formulating the probability intervals for the life expectancy forecasts, this error term was ignored, and only errors arising from the innovation in $k_t$ and from errors in estimating the drift term were incorporated. If we were interested only in e(0) and if the $\varepsilon_{x,t}$ were uncorrelated across age, this assumption might be relatively harmless, because some twenty different values of $\varepsilon_{x,t}$ enter into the calculation of any life expectancy, and the average effect should be very small. However, if the errors are correlated, such that those for older ages tend to move together and those for younger ages tend to move together, then they might have an important influence even on life expectancy. There are also errors in the estimation of the $a_x$ and $b_x$ coefficients, which are not taken into account in our probability intervals for the e0 forecasts.

In general, we find that forecast errors tend to be strongly correlated at younger ages, less so at older ages, and young errors are only weakly correlated with errors at older ages. At longer horizons, correlations become more positive due to dominance of errors in k. Figure 5 provides some examples for select age groups and forecast horizons. Further work on the analysis of age-specific errors is underway.

IV. Assessing LC on historical time series from other countries

We also carried out within sample tests for Sweden, Japan, France and Canada. The results are shown in the panels of Figure 6. For France, where both WWI and WWII had profound effects on mortality, we have dummied the effects in a similar way, but not allowed for a possible recurrence in the future. Allowing for a recurrence would greatly increase the variance of the forecast. Such decisions reflect the judgment of the analyst.

If the method had been used to forecast 1995 e0 for Sweden, starting in 1950, it would have been right on target until 1980, and two years too low in 1995. Results for France and Canada are very similar. For Japan, the data only start in 1950; forecasts from 1975 to 1996 are below the actual value, and one year too low by 1996. Looking at all the forecasts combined, the 95% probability bounds contain the actual e(0) values for 152 out of 162 forecasted values or 94% of the time.

V. Changing age-shape of mortality

A number of people have suggested that the $b_x$ coefficients might vary over time; this possibility was not explored by LC. Kannisto et al. (1994) found that the rate of mortality decline had been accelerating over recent decades for ages 80 to 100. Horiuchi and Wilmoth (1995) show that in a number of countries, mortality declines at older ages now take place more rapidly than at lower ages, reversing the historical pattern. This research suggests that it is important to take very seriously the possibility that the age pattern of mortality decline may alter over time, and may not be well described by a fixed set of $b_x$ coefficients. Note that the $a_x$ coefficients will always be changing over different historical periods, because they are the average log death rates, and these averages will change in level as mortality falls, and change in shape because the $b_x$ coefficients tell us that at different ages, mortality declines at different rates. This poses no problem, because the changing shape and level of the $a_x$ are implicit in the $b_x$, and no additional treatment is necessary.

Recall that our earlier examination of the post-publication performance of LC showed that it correctly forecast the age pattern of mortality decline as well as the increase in e0 over the past 9 years. This suggests that the fixed $b_x$ assumption has worked well. However, a closer examination of the age pattern of decline in the US shows otherwise. Figure 7 plots the average rate of decline for sexes combined mortality by age for 1900 to 1949 and for 1950 to 1995. It is clear that there has been an important change, with mortality now declining at roughly the same rate across all ages above 15, whereas for the first half of the century it declined far more rapidly at the younger ages.
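A comparison of this kind rests on the average annual rate of decline of each age specific death rate over a sub-period. One simple way to compute it, sketched here under the assumption that the rate is measured from the log death rates at the two end years only, is shown below; the function name and the endpoint-based definition are illustrative choices, not necessarily those used for Figure 7.

```python
import numpy as np

def average_decline_rate(log_m, years, start, end):
    """Average annual rate of decline of each age-specific death rate.

    log_m: array of shape (ages, years) of log death rates.
    years: calendar years matching the columns of log_m.
    """
    log_m = np.asarray(log_m, dtype=float)
    years = list(years)
    i, j = years.index(start), years.index(end)
    # A fall in the log rate is reported as a positive rate of decline.
    return -(log_m[:, j] - log_m[:, i]) / (end - start)
```

Calling the function once for each sub-period and plotting the two results against age would show whether the age profile of decline has flattened.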

Examination of the historical pattern of decline in Japan, Sweden, Canada, and France shows similarly striking changes, with a flattening of the age profile of decline. (See Figure 8).

Is this a long term change, rooted in the changing cause structure of mortality, or in the resistance of mortality at different ages to biomedical progress? Or is it due to what we might hope will be more transitory influences on young adult mortality in industrial nations, such as AIDS and accidents? We are not sure. But the more prudent course is to assume that these changes are long term, and to incorporate them into our forecasts in one way or another. A simple and satisfactory solution, adopted by Tuljapurkar et al. (2000), is to base the forecast on data since 1950, and assume fixed $b_x$ over that range but not over the whole century.

VI. Comparison of official forecasts from SSA and others to LC forecasts

A. Forecasting to 1998We have examined the historical record of SSA projections, including two earlier onesthat were used by SSA but prepared by other agencies. Figures 9 and 10 examineforecasts of e(0) for the year 1998. Figure 9 compares the middle series forecast fromSSA with the median LC forecast. The figure shows that the official projections havebeen systematically too low – by 12 years in 1930, about 7 years in the 1940s, then by 2to 4 years until those done in 1980, which then jumped to being too high. It can be seenthat the SSA estimates reacted strongly to the slow mortality gains of the 1960s, and thento the rapid gains of the 1980s. By contrast, the LC method responds only modestly tothese fluctuations, since they only modestly affect the average trend over the century. TheLC method also tends to be somewhat low in early years, but performs substantiallybetter than SSA. It would have been closer to the true value in 1998 for most forecasts.It picks up the correct track for 1998 considerably earlier.

Figure 10 shows the high-low range of SSA projections along with the 95% probability interval of LC. The true value of e(0) for 1998 lies beyond the high bound for most of the SSA forecasts up until 1970.
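A check of this kind can be sketched as follows. The function name coverage and the interval bounds are hypothetical and chosen only for illustration; the actual value of 76.7 is the one marked in Figures 9 and 10.

```python
def coverage(actual, lower, upper):
    """Share of forecasts whose [lower, upper] range contains the actual value."""
    inside = [lo <= actual <= hi for lo, hi in zip(lower, upper)]
    return sum(inside) / len(inside)

actual_e0_1998 = 76.7

# Hypothetical low and high bounds from four forecast dates (not real data).
low_bounds = [70.0, 72.5, 74.0, 75.5]
high_bounds = [76.0, 77.5, 78.0, 77.5]

share = coverage(actual_e0_1998, low_bounds, high_bounds)
print(share)
```

The same function applies to either a high-low range or a probability interval; only the interpretation of the resulting share differs, since a 95% interval should contain the truth about 95% of the time.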

B. Errors by horizon, comparison to LC

In assessing errors by forecast horizon, we have restricted our sample to post-1950 government forecasts. We have only 3 early government forecasts (pre-1950), which provided e(0) forecasts for only a few select years in the future. This complicates the analysis of errors by length of horizon for these early forecasts. We are working on obtaining more of the data for them. For comparison to LC, we use both the full sample (1920-1997) and a restricted sample which matches the time period of the SSA forecasts (1950-1997). For LC, we have forecasts for every year. For SSA, the forecasts are issued irregularly. In our calculations we have weighted each SSA forecast by the reciprocal of the number of forecasts issued within the decade. In this way, each decade contributes equally to the error estimates.
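The weighting scheme can be sketched as below. The issue years and errors are hypothetical, and the helper names decade_weights and weighted_mean are assumptions introduced for this illustration.

```python
from collections import Counter

def decade_weights(issue_years):
    """Weight each forecast by 1 / (number of forecasts issued in its decade)."""
    decades = [year // 10 * 10 for year in issue_years]
    counts = Counter(decades)
    return [1.0 / counts[decade] for decade in decades]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical forecasts: one in the 1950s, two in the 1960s, three in the 1970s.
issue_years = [1957, 1965, 1966, 1974, 1977, 1978]
errors = [-3.0, -2.0, -1.0, -1.5, 0.0, 1.5]    # predicted minus actual, in years

weights = decade_weights(issue_years)
weighted_bias = weighted_mean(errors, weights)
unweighted_bias = sum(errors) / len(errors)

print(weights)
print(weighted_bias, unweighted_bias)
```

Because the weights within each decade sum to one, the decade with three forecasts counts no more than the decade with a single forecast. In this example the weighted bias is -1.5 years against an unweighted bias of -1.0 years.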

Figure 11 compares the average bias in the SSA and LC forecasts by length of forecast horizon. Horizons are by single year from 1 to 7 and then grouped (8-12, 13-17, 18-22, 23-27, 28-38, 39-46, and 39-60 years). SSA forecasts issued since 1950 compare favorably with LC forecasts issued since 1920. However, when we examine those LC forecasts issued during the same time period (since 1950), we find that LC performs substantially better.

Figure 12 compares the root mean square error (RMSE) for SSA and LC forecasts. SSA forecasts perform slightly better than those of LC for the first and second years. At all horizons beyond 2 years, LC performs better than SSA, and substantially better as the forecast horizon increases.
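The two error measures can be computed by horizon group as in the following sketch. Errors are taken as predicted minus actual, matching the axis label of Figure 11. The records are hypothetical, the names GROUPS, group_of, and bias_and_rmse are assumptions, and only the single-year horizons and the first four grouped horizons from the text are included; longer horizons are simply dropped here.

```python
import math
from collections import defaultdict

# Single years 1 to 7, then the first four grouped horizons listed in the text.
GROUPS = [(h, h) for h in range(1, 8)] + [(8, 12), (13, 17), (18, 22), (23, 27)]

def group_of(horizon):
    """Return the (low, high) group containing the horizon, or None if outside."""
    for low, high in GROUPS:
        if low <= horizon <= high:
            return (low, high)
    return None

def bias_and_rmse(records):
    """records: iterable of (horizon, predicted, actual). Returns {group: (bias, rmse)}."""
    errors = defaultdict(list)
    for horizon, predicted, actual in records:
        group = group_of(horizon)
        if group is not None:
            errors[group].append(predicted - actual)
    return {
        group: (sum(e) / len(e), math.sqrt(sum(x * x for x in e) / len(e)))
        for group, e in errors.items()
    }

# Hypothetical forecast records (horizon in years, predicted e(0), actual e(0)).
records = [(1, 70.1, 70.0), (1, 69.9, 70.0), (10, 71.0, 72.0), (11, 70.0, 73.0), (40, 70.0, 76.0)]
results = bias_and_rmse(records)
for group, (bias, rmse) in sorted(results.items()):
    print(group, round(bias, 3), round(rmse, 3))
```

The first group shows why both measures are reported: two errors of opposite sign give a bias near zero while the RMSE stays at 0.1 years.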

C. General problem of official forecasts

Government forecasts generally rely on expert opinion for their long-run forecast. The evidence suggests that this has resulted in forecasts which are too pessimistic. The early reports were issued during the Great Depression and the Second World War. Perhaps these events influenced expert opinion about future progress. And yet, at that time, the data were telling a different story, since mortality had been declining quite rapidly over the previous decades. A quote from the 1943 report is interesting in this regard. Thompson and Whelpton state their objection to statistical forecasting methods such as extrapolation: “More important, the extrapolation of past trends according to such formulas might show future trends which seemed incompatible with present knowledge regarding the causes of death and the means of controlling them.” (National Resources Planning Board, 1943, p. 10). This suggests an alternative explanation for the pessimism of experts: present knowledge informs us about current limits, but not the future means of overcoming them. The Lee-Carter approach bases its long-run forecasts on the century-long decline in mortality in which limits have been continuously confronted and overcome.

VII. Conclusions

1) Lee-Carter (LC) forecasts of life expectancy and the age pattern of mortality performed quite well for the period since publication, at least after adjusting for an error in jump-off level.

2) Historical LC projections from various jump-off dates in the 20th century would have performed well. For forecasts with jump-off after 1945, we are always within 2 years of the actual e(0) in 1998. The forecasts tend to under-predict future gains, especially those in the distant future. The 95% probability bounds contain the true value of e(0) 97% of the time. However, the bounds appear to be too broad for horizons up to 40 years and too narrow for those beyond 50 years.

3) Social Security projections also have systematically under-predicted gains in e(0) since 1950. The average error and mean squared error for LC forecasts since 1950 are substantially lower than those of SSA since 1950.


4) LC life expectancy forecasts for Canada, Sweden and France with jump-off year 1950 and for Japan with jump-off year 1973 would have performed very well. But, like the US forecasts, they would have systematically under-predicted actual gains.

5) Contrary to a basic assumption in the Lee-Carter model, the age pattern of mortality decline has shifted systematically in the US, Sweden, France, Canada, and Japan in the second half of the 20th century, with a flattening of the age-specific rates of decline above age 15.


VIII. References

Alho, Juha M. (1992)

Bell, William R. (1997) “Comparing and Assessing Time Series Methods for Forecasting Age Specific Demographic Rates,” Journal of Official Statistics 13:279-303.

Carter, Lawrence (1996) “Long-Run Relationships in Differential U.S. Mortality Forecasts by Race and Gender: Non-Cointegrated Time Series Comparisons,” 1996 Annual Meetings of the Population Association of America, New Orleans, May 9-11, 1996.

Carter, Lawrence and Ronald D. Lee (1992) “Modeling and Forecasting U.S. Mortality: Differentials in Life Expectancy by Sex,” in Dennis Ahlburg and Kenneth Land, eds, Population Forecasting, a Special Issue of the International Journal of Forecasting, v.8, n.3 (November) pp.393-412.

Gomez de Leon, Jose (1990) “Empirical DEA Models to Fit and Project Time Series of Age-Specific Mortality Rates,” Unpublished manuscript of the Central Bureau of Statistics, Norway (July).

Hollmann, Frederick W.; Mulder, Tammany J.; and Kallan, Jeffrey E. (2000) “Methodology and assumptions for the population projections of the United States: 1999 to 2100,” Population Division Working Paper No. 38, U.S. Bureau of the Census.

Horiuchi, Shiro and John R. Wilmoth (1995) “The aging of mortality decline.” Presented at the Annual Meeting of the Population Association of America, San Francisco, April 6-8, 1995.

Kannisto, Vaino; Lauritsen, Jens; Thatcher, A. Roger; and Vaupel, James W. (1994) “Reductions in mortality at advanced ages: Several decades of evidence from 27 countries.” Population and Development Review, Vol. 20, No. 4, pp. 793-810.

Keilman, Nico (1997)

Keilman, Nico (1999)

Lee, Ronald D. (2000) “The Lee-Carter Method for Forecasting Mortality, with Various Extensions and Applications,” North American Actuarial Journal, Vol. 4, No. 1, pp. 80-91.

Lee, Ronald D. and Lawrence Carter (1992) “Modeling and Forecasting the Time Series of U.S. Mortality,” Journal of the American Statistical Association v.87 n.419 (September) pp.659-671.

Lee, Ronald D. and Francois Nault (1993) “Modeling and Forecasting Provincial Mortality in Canada,” paper presented at the World Congress of the International Union for the Scientific Study of Population, Montreal, 1993.

Lee, Ronald D. and Rafael Rofman (1994) “Modelacion y Proyeccion de la Mortalidad en Chile,” NOTAS 22, no 59, pp. 182-213. Also available in English from the authors, titled “Modeling and Forecasting Mortality in Chile.”

Tuljapurkar, Shripad; Nan Li and Carl Boe (2000) “A universal pattern of mortality decline in the G-7 countries,” Nature, in press.

Wilmoth, John R. (1993) “Computational Methods for Fitting and Extrapolating the Lee-Carter Model of Mortality Change,” Technical Report, Department of Demography, University of California, Berkeley.

Wilmoth, John R. (1998) “Is the pace of Japanese mortality decline converging toward international trends?” Population and Development Review 24(3): 593-600.

i In all cases, the data we use are taken from the SSA data base, as maintained on the Berkeley Mortality Data Base web site, www.demog.berkeley.edu. The original LC article used NCHS data, and for the period before 1933 estimated age specific mortality and e0 indirectly using the age distribution of the total population, total deaths per year, and the ax and bx coefficients as estimated by SVD from the data 1933 to 1987, after the death registration area was complete.

Figure 1: Forecasts of life expectancy from 1989. [Plot omitted. Horizontal axis: Date, 1980 to 1998; vertical axis ticks: 74 to 77. Series: Lee-Carter forecast (1992); Lee-Carter forecast with correct jump-off; Social Security forecast (1992).]

Figure 2: Average Annual Decline in Age-Specific Mortality, 1989-1997. Actual and Forecasts of Lee-Carter (1992) and SSA (1992). [Plot omitted. Horizontal axis: Age Group, 0 to 80; vertical axis: Percentage Decline, -1 to 4. Series: Actual: NCHS Population 10mx values, 1989-1997; Lee-Carter (1992) Forecast, 1989-2000; SSA (1992) Forecast, 1990-2000.]

Figure 3: e(0) Forecasts for the Year 1998 by Forecast Date. [Plot omitted. Horizontal axis: Forecast Date, 1920 to 1997; vertical axis ticks: 50 to 80, with the actual value of 76.7 marked. Series: Median Forecasts; 95% Probability Interval; Actual.]

Figure 4: Percentile Error Distribution by Forecast Length. [Histograms omitted. Horizontal axis of each panel: Error Percentiles, 0 to 100. The counts shown in each panel, left to right across ten bins, are listed below.]

1 year forecasts (78 obs.): 2 11 6 6 10 7 13 11 5 7
5 year forecasts (74 obs.): 1 7 10 10 4 8 11 7 7 9
10 year forecasts (69 obs.): 3 4 7 8 8 7 3 7 8 14
20 year forecasts (59 obs.): 0 1 7 3 1 9 6 10 12 10
40 year forecasts (39 obs.): 0 0 0 2 2 5 6 6 8 10
60 year forecasts (19 obs.): 0 0 0 0 0 0 1 1 3 14

Figure 5: Error Correlations. [Plots omitted. Six panels: Error Correlations for Ages 0-1, 1-4, 25-29, 50-54, 65-69, and 80-84. Horizontal axis of each panel: Age Group, 0-1 to 95-99; vertical axis: Correlation, -0.2 to 1.0. Series in each panel: 1 yr; 5 yr; 10 yr.]

Figure 6: LC forecasts of life expectancy. [Plots omitted. Four panels: Canada from 1950; Sweden from 1950; France from 1950; Japan from 1973. Horizontal axis of each panel: Date; vertical axis: e(0). Series in each panel: Actual; Lee-Carter Forecast: Median; Lee-Carter Forecast: 95% Probability Bounds.]

Figure 7: Average Annual Reduction in Age-Specific Death Rates, US. [Plot omitted. Horizontal axis: Age, 0 to 80; vertical axis: Percentage Decline, 0 to 5. Series: 1901-05 to 1950-55; 1950-55 to 1990-95.]

Figure 8: Average Annual Reduction in Age-Specific Death Rates. [Plots omitted. Four panels, each with horizontal axis Age, 0 to 80, and vertical axis Percentage Decline. Sweden: 1900 to 1950 and 1950 to 1995. France: 1900 to 1950 and 1950 to 1995. Canada: 1922 to 1950 and 1950 to 1997. Japan: 1950 to 1975 and 1975 to 1996.]

Figure 9: LC and SSA e(0) Forecast for 1998, by Forecast Date. [Plot omitted. Horizontal axis: Forecast Date, 1920 to 1997; vertical axis ticks: 65.0 to 76.7. Series: Actual e(0) in 1998 (76.7); Lee-Carter Forecasts: Median; Soc. Sec. & early govt. forecasts: Middle.]

Figure 10: 95% Probability Interval and High-Low Range by Forecast Date. [Plot omitted. Horizontal axis: Forecast Date, 1920 to 1997; vertical axis ticks: 55 to 80, with the actual value of 76.7 marked. Series: Actual e(0) in 1998; Lee-Carter Forecasts: 95% Probability Bounds; Soc. Sec. & early govt. forecasts: High and Low.]

Figure 11: Mean Error in Forecasts of Life Expectancy. [Plot omitted. Horizontal axis: Length of Forecast, 0 to 50; vertical axis: Predicted-Actual, -3.00 to 0.50. Series: SSA (1957-1996, decades equally weighted); LC (1920-1996); LC (1950-1996).]

Figure 12: Root Mean Squared Error in Forecasts of Life Expectancy. [Plot omitted. Horizontal axis: Length of Forecast, 0 to 50; vertical axis: RMSE, 0.00 to 3.00. Series: SSA (1957-1996, decades equally weighted); LC (1920-1996); LC (1950-1996).]

