
What Does a Term Structure Model Imply About Very Long-Term Interest Rates?

Anne Balter, Antoon Pelsser and Peter Schotman

DP 02/2014-065


Anne Balter, Maastricht University and Netspar

[email protected]

Antoon Pelsser, Maastricht University, Kleynen Consultants and Netspar

[email protected]

Peter Schotman, Maastricht University and [email protected]

November 8, 2015

Abstract: We estimate a term structure model on interest rate data with maturities up to 20 years and then extrapolate the yield curve to maturities up to 100 years. Such model based extrapolation is motivated by limited liquidity of very long-dated fixed income instruments. The extrapolated curves are always above observed swap rates at maturities in the 20-50 years range. The extrapolation appears mainly driven by the near unit root of the level factor under the risk neutral measure. In a no-arbitrage term structure model this leads to a strong convexity effect, which implies that extrapolated yield curves are generally upward sloping for maturities longer than 20 years before eventually bending steeply downwards. Our estimates use Bayesian methods. The prior is informative on mean reversion parameters and imposes a zero lower bound on the unconditional means.

Keywords: term structure models, parameter uncertainty, extrapolation, insurance supervision

JEL codes: G23, G12, C58

1 Introduction

Long maturity discount rates are an essential input for valuing the liabilities of pension

funds and insurance companies. Life-insurance or pension fund liabilities can be as

long as 100 years, whereas the available liquid instruments in the market have much

shorter maturities. In most countries market rates can only be observed for maturities

up to 20 or 30 years for government debt. Swap rates are available for maturities up

to 50 years, but there are doubts about the liquidity of the longest maturities. Fair

value, or market consistent, valuation requires discount rates that are close to market

rates, but free of liquidity effects.1

For this purpose various methods have been proposed to extend an observed yield

curve. Assuming that market rates are liquid up to a ‘last liquid point’ with maturity

of 20 years (say), how should such a yield curve be extended to maturities up to 100

years? One option is the use of numerical extrapolation techniques. A prominent

example is the Smith-Wilson methodology adopted by the European Insurance and

Occupational Pensions Authority (EIOPA), which extrapolates the forward rate curve

using exponential functions.2 The extrapolation method provides a smooth extension

from the yield at the last liquid point to an externally specified ultimate forward

rate (UFR) and a chosen convergence speed parameter. Other methods, such as the

Nelson-Siegel methodology, would first fit level, slope and curvature factors using

data on the liquid part of the yield curve and then extend the yield curve with the

parameters of the fitted model.3 In the Smith-Wilson methodology the yield curve

always converges to the same constant, whereas in the Nelson-Siegel model long rates

converge to a time-varying level factor estimated from the current term structure.

A problem with these methods is that the extension is based on curve fitting,

1 Quoting from Moody’s Analytics (May 2013): ‘Fair value is the price that would be received to sell an asset or paid to transfer a liability in an orderly transaction (that is, not a forced liquidation or distressed sale) between market participants at the measurement date under current market conditions.’

2 The technical details of the methodology are described in CEIOPS (2010), EIOPA (2014) and EIOPA (2015).

3 See Diebold and Rudebusch (2013) for a textbook treatment of the Nelson-Siegel model for fitting term structure data.


and not a formal term structure model. The Nelson-Siegel model can be made

arbitrage free by adding a yield adjustment term, as in Christensen, Diebold, and

Rudebusch (2011), but this adjustment will push very long term yields to minus

infinity. The strong downward pressure on long-term yields is caused by the unit root of

the level factor under the risk neutral measure that drives the convexity adjustment.

The arbitrage-free Nelson-Siegel model is a member of the class of essentially affine

Gaussian term structure models (Duffee, 2002). Other models in that class do not

necessarily have a unit root in the risk neutral dynamics, and will thus have a smaller

convexity adjustment at the very long end of the yield curve.

The existence of a convexity effect is the main argument for using a formal term

structure model for the extrapolation. It implies that yields do not always converge

monotonically from the last liquid point towards the ultimate yield. Starting from

low interest rate levels, the convergence will follow a hump shape, in which yields first

overshoot the ultimate yield before finally decreasing slowly towards the limit. With

current low interest rates such an extrapolation will result in a much higher level for

long-term yields than either the Nelson-Siegel or the Smith-Wilson extrapolation.

A secondary aim of the paper is to quantify the uncertainty around extrapolated

yields given a formal term structure model. How much can we learn from time series

of observed yields with medium to long term maturities (5 to 20 years) about the

shape of the yield curve at very long maturities (between 20 and 100 years)? More

explicitly, what can we infer about the three crucial elements at the long end of the

yield curve: the ultimate forward rate, the convergence speed towards the UFR, and

the convexity? The 20-years maturity separates the liquid segment of the yield curve

from the very long maturities for which the market is deemed to be less liquid. Taking

20 years as the ‘last liquid point’ is advocated by EIOPA. We also see data evidence

for this breakpoint, since the volatility of yields increases slowly with maturities after

this point, contrary to the implications of theoretical term structure models.

The simplest model to estimate these quantities is a single factor Vasicek model.

The model can be parameterized by three key parameters: the ultimate yield, mean


reversion (or convergence speed) and volatility. It therefore directly addresses the

main challenges for extrapolating a yield curve. The single factor model cannot fit

the complex curvatures at the short end (1 month to 2 years) of the yield curve, but

performs reasonably for long maturities. We therefore estimate the parameters using

data on 5- and 20-year maturities.

In a Bayesian analysis using Euro swap rate data we find that the mean reversion

of the level factor is non-zero, but with considerable probability mass very close to

the unit root (under the risk neutral measure). That means that convexity effects

are important. Convexity drives down the limiting yield. But since convergence

towards it is very slow, the yield curve remains upward sloping for maturities up

to 100 years. As expected the Vasicek extrapolation leads to a higher level of very

long-term yields than the Nelson-Siegel or Smith-Wilson methods. The uncertainty

around the extrapolated yields increases with the maturity. For market conditions

prevailing in the fall of 2013 the Nelson-Siegel and Smith-Wilson extrapolations are

at the lower end of the 95% highest posterior density region.

2 Data

Our yield curve data consist of a monthly panel of discount rates from the website

of the Bundesbank.4 These yield curve data are constructed from Euro swap rates

with maturities ranging from 1 to 50 years. The sample period is from January 2002

to September 2013 resulting in 141 data-points per maturity.

Figure 1 provides an overview of the data. The average term structure is increasing

until the 20-year maturity, after which it becomes slightly downward sloping for longer

maturities. The yield curve has fluctuated substantially over the twelve year sample

period. The figure shows a curve from the beginning of the sample (March 2002)

when the curve has a similar shape as the average, but at a 1.5% higher level. The

lowest long term yields are from May 2012, where the 50 years maturity yield is just

4 http:/www.bundesbank.de/Navigation/EN/Statistics/Time series databases


Figure 1: Euro swap rates

[Two-panel figure. Left panel: yield curves labelled Average, Low and High; horizontal axis maturity from 0 to 50 years, vertical axis from 0 to 6 percent. Right panel: volatility curves labelled Changes, Residuals and Levels; horizontal axis maturity from 0 to 50 years, left vertical axis from 0.00 to 0.25, right vertical axis from 0.00 to 1.60.]

In the left panel the solid blue line shows the average yield curve over the period January 2002 to September 2013. The two red lines with crosses show the yield curves with the minimum and maximum rate at a 20 years maturity. The right panel shows the volatility of yield levels, yield changes and the one month prediction errors from an AR(1) model. The horizontal axis in both panels is the maturity in years. The vertical axis unit is percent per year. The right vertical axis is for levels; the left vertical axis is for changes and prediction errors.

below 2%. With such dominant parallel shifts of yield curves, it will be difficult to fit

a long-term common ultimate forward rate to these data. It is definitely not reached

before a 50 years maturity and convergence to a common ultimate yield requires a

very low level of implied mean reversion.

The right hand panel of figure 1 shows the volatility of yields. The standard

deviation of yield levels decreases quickly up to maturities of 12 years, after which the

volatility of the level stabilizes. Within an affine term structure model this points to

at least two factors. The initial downward slope would then be related to a stationary

factor with strong mean reversion. This stationary factor has a negligible influence on

yields with maturities longer than 10 years. The flat part at long maturities requires

a level factor that is close to a random walk under the risk neutral measure. Since the

shortest maturity in this data set is one year, it is difficult to identify a third factor.

The same figure also shows the standard deviation of yield changes. In a one-factor

model it should have the same shape as the level volatility, and only the scaling should

be different. In a multi-factor model the shape for short and medium-term maturities

can be very different from the shape of the level volatilities. The figure shows the


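Figure 2: Time series data

[Two-panel figure. Scatter diagram: change (Δ) in the 5 year yield on the horizontal axis against change (Δ) in the 20 year yield on the vertical axis, both axes from -0.8 to 0.8. Time series plot: yield levels labelled 5 years and 20 years, vertical axis from 0 to 6.]

The figure shows the monthly time series data for the 5- and 20-year maturity discount yields. The scatter diagram displays first differences of the yields. The time series graph shows the levels.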

familiar hump-shaped volatility, where the volatility peaks at the three year maturity,

and then starts a gradual decrease. The initial hump shape for shorter to intermediate

maturities can be explained by a two-factor model. The gradual downward sloping

volatility is consistent with a slowly mean-reverting level factor. Most puzzling is

the upward sloping pattern beyond the 20 years maturity, which cannot be explained

by standard term structure models. The affine model, for example, implies that the

volatility curve is downward sloping for longer maturities. Very long-dated swap

prices may contain more noise because the market at these long-term maturities is

less liquid. This suggests that the 20-year rate may be a good reference for the last

liquid point to start the extrapolation.
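The three volatility measures discussed above (standard deviation of levels, of first differences, and of one-month AR(1) prediction errors) can be sketched for a single maturity as follows. This is a minimal illustration on a simulated series; the helper names `ar1_residual_std` and `volatility_measures` and the simulated data are assumptions made for the example and are not taken from the paper or its data set.

```python
import random
import statistics

def ar1_residual_std(series):
    """Std. dev. of one-step prediction errors from an OLS AR(1) with intercept."""
    lagged, current = series[:-1], series[1:]
    mean_lag, mean_cur = statistics.fmean(lagged), statistics.fmean(current)
    sxx = sum((x - mean_lag) ** 2 for x in lagged)
    sxy = sum((x - mean_lag) * (y - mean_cur) for x, y in zip(lagged, current))
    slope = sxy / sxx
    intercept = mean_cur - slope * mean_lag
    residuals = [y - intercept - slope * x for x, y in zip(lagged, current)]
    return statistics.stdev(residuals)

def volatility_measures(series):
    """Return (std of levels, std of first differences, std of AR(1) errors)."""
    changes = [b - a for a, b in zip(series[:-1], series[1:])]
    return (statistics.stdev(series),
            statistics.stdev(changes),
            ar1_residual_std(series))

# Hypothetical monthly yield series in percent, simulated only for illustration.
random.seed(0)
series = [4.0]
for _ in range(140):
    shock = 0.2 * random.gauss(0.0, 1.0)
    series.append(series[-1] + 0.02 * (4.0 - series[-1]) + shock)

level_sd, change_sd, resid_sd = volatility_measures(series)
print(f"levels {level_sd:.3f}, changes {change_sd:.3f}, AR(1) errors {resid_sd:.3f}")
```

Applying `volatility_measures` to each maturity in turn would give one point per maturity on each of the three curves. Because the AR(1) regression nests a random walk with drift, the residual standard deviation cannot exceed that of the first differences.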

In our econometric model we will work with time series data on discount yields for

maturities 5 and 20 years. The time series are shown in figure 2. The scatter diagram

shows the close association between first differences in the 5-year and 20-year rates

suggesting that the single factor assumption is not too far off. From the time series

plot it appears that both rates have a slight negative trend over the last decade, so

that it may be hard to estimate an unconditional mean from these time series.


3 Term Structure Model

Our basic model is the essentially affine term structure model introduced by Duffee

(2002) as an extension of the Duffie and Kan (1996) class of affine term structure

models. Given a K-vector of factors xt that follow a vector Ornstein-Uhlenbeck

process under the risk-neutral density (Q measure), the yield curve is given by

$$y_t(\tau) = a(\tau) + b(\tau)' x_t \qquad (1)$$

where yt(τ) is the yield of a discount bond at time t with time to maturity τ ; a(τ)

and the K-vector b(τ) are functions of the time to maturity and the underlying

parameters of the model. Dai and Singleton (2000) show that identification requires

that the factor dynamics can be specified by a set of K mean reversion parameters,

one for each of the factors. The larger the mean reversion coefficient of a factor, the

lower will be the impact of a factor on long-term yields. For typical estimates of a

three factor model, more than 95% of the variation at maturities longer than 5 years

is explained by the first factor, usually referred to as the level factor.

3.1 Vasicek model

Since our aim is to extrapolate the yield curve beyond maturities of 20 years, using

data in the segment between 5 and 20 years, we specialize our model to a single factor.

For the single factor ‘Vasicek’ model both xt and b(τ) are scalars, and the yield curve

is given explicitly as

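$$y_t(\tau) = \theta + b(\tau)\,(x_t - \theta) + \tfrac{1}{2}\,\omega^2\,\tau\, b(\tau)^2 \qquad (2)$$

where

$$b(\tau) = \frac{1 - e^{-\kappa\tau}}{\kappa\tau} \qquad (3)$$

$$\omega^2 = \frac{\sigma^2}{2\kappa} \qquad (4)$$

$$\theta = \mu - \frac{\omega^2}{\kappa} \qquad (5)$$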


The function b(τ) defines the volatility of long-term yields relative to the level factor

xt; ω² is the unconditional variance of the factor; and θ is the limiting yield yt(τ)

when τ → ∞. The constant θ is both the ultimate yield as well as the ultimate

forward rate. It is equal to the unconditional mean of the risk-neutral distribution of

the factor minus the infinite horizon convexity adjustment ω²/κ. The convexity term

scales with the unconditional variance ω² of the risk-neutral factor dynamics. If the

mean-reversion κ goes to zero, meaning a true level factor, the variance will tend to

infinity. But even if κ does not move all the way to its limit, the convexity in the

ultimate yield can be substantial due to the additional κ in the denominator for ω².

For small κ it can become so big that yields will be negative for large s and converge

to a negative θ (with fixed µ). This will occur, for example, in the arbitrage-free

version of the Nelson-Siegel model of Christensen, Diebold and Rudebusch (2011).
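Equations (2) to (5) translate directly into a few lines of code. The sketch below evaluates the Vasicek yield curve and the ultimate yield θ; the function names and the parameter values are illustrative assumptions (chosen to be of a plausible order of magnitude), not estimates from the paper.

```python
import math

def b(tau, kappa):
    """Loading b(tau) = (1 - exp(-kappa*tau)) / (kappa*tau), eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def ultimate_yield(kappa, mu, sigma):
    """theta = mu - omega^2 / kappa with omega^2 = sigma^2 / (2*kappa), eqs. (4)-(5)."""
    omega2 = sigma ** 2 / (2.0 * kappa)
    return mu - omega2 / kappa

def vasicek_yield(tau, x, kappa, mu, sigma):
    """Yield of a discount bond with maturity tau, eq. (2)."""
    omega2 = sigma ** 2 / (2.0 * kappa)
    theta = mu - omega2 / kappa
    bt = b(tau, kappa)
    return theta + bt * (x - theta) + 0.5 * omega2 * tau * bt ** 2

# Illustrative parameter values (assumptions for this example only).
kappa, mu, sigma, x = 0.02, 0.13, 0.007, 0.02
theta = ultimate_yield(kappa, mu, sigma)
curve = {tau: vasicek_yield(tau, x, kappa, mu, sigma) for tau in (5, 20, 50, 100)}

print(f"theta = {theta:.5f}")
for tau, y in curve.items():
    print(f"y({tau:>3}) = {y:.5f}")

# With the same mu and sigma but a much smaller kappa, the convexity term
# omega^2 / kappa dominates and the ultimate yield turns negative.
print(f"theta at kappa = 0.005: {ultimate_yield(0.005, mu, sigma):.3f}")
```

With these inputs the curve is still rising at a 100-year maturity even though it eventually converges to θ, which illustrates how slowly the limit is approached when κ is small.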

In the Vasicek model the factor x follows an Ornstein-Uhlenbeck process under the

risk neutral measure Q. For the time series dynamics under the physical measure

P we need to assume a process for the stochastic discount factor. Following Duffee

(2002) we make the essentially affine assumption

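$$\frac{d\Lambda}{\Lambda} = -x\,dt - \lambda\,d\tilde{W}, \qquad (6)$$

where λt = Λ0 + Λ1xt, and where W̃ is standard Brownian motion. With this assumption the time series process for the factor becomes

$$dx = \tilde{\kappa}\,(\tilde{\mu} - x)\,dt + \sigma\,d\tilde{W} \qquad (7)$$

and parameters under the P and Q measures are related by

$$\kappa = \tilde{\kappa} + \sigma\Lambda_1, \qquad \mu = \tilde{\mu} - \frac{\sigma\,(\Lambda_0 + \Lambda_1\tilde{\mu})}{\kappa} \qquad (8)$$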

With the definition of the risk neutral mean µ, the ultimate yield θ in (5) can be

further decomposed as the sum of the average spot rate µ̃, a risk premium depending

on risk parameters Λ0 and Λ1, and the convexity effect ω²/κ.

Combining (2) and (7) provides an expression for the time series behavior of

different yields yt(τ), which are the starting point for the econometric analysis.
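The mapping (8) between the physical and risk-neutral parameters is simple enough to check numerically. The following sketch uses hypothetical round numbers; the function name `q_parameters` is an assumption introduced for the example.

```python
def q_parameters(kappa_p, mu_p, sigma, lam0, lam1):
    """Map P-measure parameters to Q-measure parameters using eq. (8)."""
    kappa_q = kappa_p + sigma * lam1
    mu_q = mu_p - sigma * (lam0 + lam1 * mu_p) / kappa_q
    return kappa_q, mu_q

# Hypothetical inputs, chosen only to illustrate the orders of magnitude.
kappa_p, mu_p, sigma = 0.30, 0.015, 0.007
lam0, lam1 = 0.3, -40.0

kappa_q, mu_q = q_parameters(kappa_p, mu_p, sigma, lam0, lam1)
print(f"kappa (Q) = {kappa_q:.4f}, mu (Q) = {mu_q:.4f}")

# Equivalent form of the second relation: kappa*mu = kappa_p*mu_p - sigma*lam0,
# which follows from subtracting sigma*lambda from the P-measure drift.
drift_constant = kappa_p * mu_p - sigma * lam0
print(f"kappa*mu = {kappa_q * mu_q:.6f}, kappa_p*mu_p - sigma*lam0 = {drift_constant:.6f}")
```

A strongly negative Λ1 lowers the mean reversion under Q relative to P and, through the division by a small κ, raises the risk-neutral mean far above the average spot rate.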


3.2 Extrapolation

In general extrapolation uses forward rates and the identity

$$y_t(s) = \frac{1}{s}\int_0^s f_t(u)\,du, \qquad (9)$$

with ft(τ) the instantaneous forward rate at time t for time t+ τ . If we have reliable

data for the term structure up to the reference maturity τ ∗ (‘last liquid point’) the

extension to maturities s > τ ∗ follows as

$$y_t(s) = \frac{1}{s}\left(\tau^* y_t^* + \int_{\tau^*}^{s} f_t(u)\,du\right), \qquad (10)$$

where $y_t^* = y_t(\tau^*)$ is the observed yield at maturity $\tau^*$.5 The extrapolation ensures

continuity of the extended yield curve at the last liquid point. For the one factor

Vasicek model the forward rate curve is a function of the single state variable xt,

$$f_t(\tau) = \theta + e^{-\kappa\tau}(x_t - \theta) + \omega^2 e^{-\kappa\tau}\,\frac{1 - e^{-\kappa\tau}}{\kappa}, \qquad (11)$$

which shows that θ is also the ultimate forward rate. Integrating (11) over τ provides

a closed form extrapolation for (10) with xt still as a latent variable. If the Vasicek

model would provide an exact fit for the entire term structure, xt could be calibrated

from any observed yield or forward rate, always leading to the same extrapolated

yield curve. Since we do not expect a perfect fit, at least not at short maturities, the

most natural choices for empirical work are (i) to use (2) to solve for xt as a function

of y∗t ; or (ii) to use (11) to calibrate xt to the forward rate f ∗t ≡ ft(τ∗) at the last

liquid point.
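The two calibration choices can be written out explicitly by inverting (2) and (11) for the latent factor. The sketch below is an illustration under assumed parameter values; the function names are hypothetical and the inputs `y_star` and `f_star` are made-up numbers, not market data.

```python
import math

def big_b(tau, kappa):
    """B(tau) = tau * b(tau) = (1 - exp(-kappa*tau)) / kappa."""
    return -math.expm1(-kappa * tau) / kappa

def vasicek_yield(tau, x, kappa, theta, omega2):
    """Zero-coupon yield, eq. (2), written in terms of theta and omega^2."""
    bt = big_b(tau, kappa) / tau
    return theta + bt * (x - theta) + 0.5 * omega2 * tau * bt ** 2

def vasicek_forward(tau, x, kappa, theta, omega2):
    """Instantaneous forward rate, eq. (11)."""
    decay = math.exp(-kappa * tau)
    return theta + decay * (x - theta) + omega2 * decay * big_b(tau, kappa)

def x_from_yield(y_star, tau_star, kappa, theta, omega2):
    """Option (i): invert eq. (2) at the last liquid point."""
    bt = big_b(tau_star, kappa) / tau_star
    return theta + (y_star - theta - 0.5 * omega2 * tau_star * bt ** 2) / bt

def x_from_forward(f_star, tau_star, kappa, theta, omega2):
    """Option (ii): invert eq. (11) at the last liquid point."""
    growth = math.exp(kappa * tau_star)
    return theta + growth * (f_star - theta) - omega2 * big_b(tau_star, kappa)

# Illustrative inputs (assumptions for this example, not estimates from the paper).
kappa, theta, omega2, tau_star = 0.02, 0.07, 0.0012, 20.0
y_star, f_star = 0.025, 0.030

x_i = x_from_yield(y_star, tau_star, kappa, theta, omega2)
x_ii = x_from_forward(f_star, tau_star, kappa, theta, omega2)
print(f"x calibrated to the yield:   {x_i:.5f}")
print(f"x calibrated to the forward: {x_ii:.5f}")
```

The two calibrated values of the factor generally differ when the observed yield and forward rate are not generated by the model itself, which is why the choice between (i) and (ii) matters in practice.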

Calibrating with respect to y∗t gives the extrapolation scheme

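$$y_t(s) = \frac{b(s)}{b(\tau^*)}\,y_t^* + \left(1 - \frac{b(s)}{b(\tau^*)}\right)\theta + C_y(s)\,\omega^2$$

$$C_y(s) = \tfrac{1}{2}\,b(s)\,\bigl(s\,b(s) - \tau^* b(\tau^*)\bigr) \qquad (12)$$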

with Cy(s) the convexity effect in the extrapolation. The extrapolated curve (12) is

continuous at τ ∗, but not necessarily differentiable. Yields with maturity slightly less

5 In practice most methods replace the instantaneous forward rate by one year forward rates $F_t(\tau, \tau+1)$, in which case the integral in (10) becomes the sum $\sum_{i=\tau^*}^{s-1} F_t(i, i+1)$. For expositional purposes we stick with the instantaneous representation.


than τ ∗ are observed in the data and the slope at s < τ ∗ is therefore as it is observed

in the data, which need not exactly correspond to the implied slope from the Vasicek

model. When calibrating to the forward rate the extrapolation formula becomes

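$$y_t(s) = \theta + \frac{\tau^*}{s}\,(y_t^* - \theta) + \left(1 - \frac{\tau^*}{s}\right) b(s - \tau^*)\,(f_t^* - \theta) + C_f(s)\,\omega^2$$

$$C_f(s) = \frac{1}{2s}\,\bigl(s\,b(s) - \tau^* b(\tau^*)\bigr)^2, \qquad (13)$$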

This function is differentiable at $\tau^*$ due to the data identity $f_t^* = y_t^* + \tau^*\,\partial y_t(\tau^*)/\partial\tau$ implied

by (9) which relates forward rates and yields. Therefore, calibrating the state variable

to the forward rate provides a smoother extrapolation. The disadvantage of using the

forward rate is measurement error. The instantaneous forward rate is not directly

observable, but needs to be calculated as a derivative of the observed yield curve with

respect to maturity. It is therefore more sensitive to measurement error than the

yield itself.6
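Both extrapolation schemes, (12) and (13), can be coded in a few lines. The sketch below also runs a consistency check: when the inputs at the last liquid point are generated by the Vasicek model itself, both formulas should reproduce the model curve. Parameter values and function names are assumptions made for the illustration.

```python
import math

def b(tau, kappa):
    """b(tau) of eq. (3); b(0) = 1 by continuity."""
    if tau == 0.0:
        return 1.0
    return -math.expm1(-kappa * tau) / (kappa * tau)

def extrapolate_from_yield(s, y_star, tau_star, kappa, theta, omega2):
    """Eq. (12): factor calibrated to the yield at the last liquid point."""
    bs, b_star = b(s, kappa), b(tau_star, kappa)
    weight = bs / b_star
    c_y = 0.5 * bs * (s * bs - tau_star * b_star)
    return weight * y_star + (1.0 - weight) * theta + c_y * omega2

def extrapolate_from_forward(s, y_star, f_star, tau_star, kappa, theta, omega2):
    """Eq. (13): factor calibrated to the forward rate at the last liquid point."""
    w = tau_star / s
    c_f = (s * b(s, kappa) - tau_star * b(tau_star, kappa)) ** 2 / (2.0 * s)
    return (theta + w * (y_star - theta)
            + (1.0 - w) * b(s - tau_star, kappa) * (f_star - theta)
            + c_f * omega2)

# Illustrative parameters (assumptions, not the paper's estimates).
kappa, theta, omega2, x, tau_star = 0.02, 0.07, 0.0012, 0.02, 20.0

def model_yield(tau):
    """Exact Vasicek yield, eq. (2)."""
    bt = b(tau, kappa)
    return theta + bt * (x - theta) + 0.5 * omega2 * tau * bt ** 2

def model_forward(tau):
    """Exact Vasicek forward rate, eq. (11)."""
    decay = math.exp(-kappa * tau)
    return theta + decay * (x - theta) + omega2 * decay * tau * b(tau, kappa)

y_star, f_star = model_yield(tau_star), model_forward(tau_star)
for s in (20.0, 30.0, 60.0, 100.0):
    via_yield = extrapolate_from_yield(s, y_star, tau_star, kappa, theta, omega2)
    via_forward = extrapolate_from_forward(s, y_star, f_star, tau_star, kappa, theta, omega2)
    print(f"s = {s:5.0f}: {via_yield:.6f} {via_forward:.6f} {model_yield(s):.6f}")
```

With observed rather than model-generated inputs the two columns would differ, because (12) uses only the yield while (13) also uses the slope information contained in the forward rate.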

Both extrapolation formulas express the extrapolated yield as a weighted average

of data at the last liquid point (y∗t and/or f ∗t ) and the ultimate yield θ plus a convexity

adjustment. In both cases the curve converges to the ultimate forward rate as s→∞.

The salient features of the extrapolation are the convexity adjustment terms Cy(s) and

Cf (s), both of which are positive. Hence, even if y∗t and f ∗t are both equal to θ, the

extrapolation first moves yt(s) above the ultimate yield, before slowly converging

downwards to the ultimate yield θ. The Vasicek extrapolation will thus be markedly

different from a simple weighted average of the last liquid point and an ultimate

forward rate. It will often imply a steeper upward sloping yield curve before eventually

flattening (or decreasing) towards the ultimate yield. The convexity adjustments in

(12) and (13) are positive, because the convexity is measured relative to the ultimate

yield θ, which itself has the maximum possible negative convexity effect −ω²/κ.

For comparison we also construct two alternative extrapolations. One is the

Smith-Wilson extrapolation for which we use the Excel tool available on the EIOPA

6 One could formally allow for a slight measurement error in the observed rate y∗t and extract the state variables from multiple maturities using a Kalman filter. In that case the extrapolation would use the estimated x̂t and proceed directly by integrating (11). The resulting extrapolation will then not be differentiable at τ∗.


website.7 The extrapolation uses the entire yield curve up to τ ∗ as input to produce

a smooth extrapolation towards a constant ultimate yield θ = 4.2%. The other is the

Nelson-Siegel model, for which we work with the textbook specification in Diebold

and Rudebusch (2013),

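$$y_t(\tau) = \beta_{1t} + \beta_{2t}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau}\right) + \beta_{3t}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right) \qquad (14)$$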

The factors βjt (j = 1, 2, 3) are calibrated for each date t using all maturities up to

τ ∗ with a constant shape parameter λ = 0.52 (the optimal least squares value for our

sample) and then used to extend the yield curve for maturities s > τ ∗. With the three

time-varying factors the Nelson-Siegel model should be able to fit most of the yield

curve. Since the curve converges to the level factor β1t it implies a flat yield curve at

very long maturities. Our implementation of the Nelson-Siegel model does not include

the arbitrage-free extension suggested in Christensen, Diebold and Rudebusch (2011).

The extension implies an additional maturity specific constant term β0(τ). Adding

these terms we would again obtain a large convexity effect because the level factor in

the Nelson-Siegel model implies a unit root for the Vasicek factor, i.e. κ = 0 (see footnote 8).
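For a fixed shape parameter, calibrating the Nelson-Siegel factors in (14) is an ordinary least squares problem in the three loadings. The sketch below fits the factors to a hypothetical curve observed at maturities of 1 to 20 years and then extends it; the helper names and the input curve are assumptions for the example, and only λ = 0.52 is taken from the text.

```python
import math

def ns_loadings(tau, lam):
    """Level, slope and curvature loadings of eq. (14)."""
    slope = -math.expm1(-lam * tau) / (lam * tau)
    return [1.0, slope, slope - math.exp(-lam * tau)]

def solve_linear(a, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - tail) / m[r][r]
    return sol

def fit_ns(maturities, yields, lam=0.52):
    """Least squares estimate of (beta1, beta2, beta3) via the normal equations."""
    rows = [ns_loadings(t, lam) for t in maturities]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(3)]
    return solve_linear(xtx, xty)

def ns_yield(tau, betas, lam=0.52):
    """Nelson-Siegel yield at maturity tau for given factors."""
    return sum(load * beta for load, beta in zip(ns_loadings(tau, lam), betas))

# Hypothetical curve generated from known factors, for illustration only.
true_betas = [0.04, -0.02, 0.01]
maturities = list(range(1, 21))
observed = [ns_yield(t, true_betas) for t in maturities]

betas_hat = fit_ns(maturities, observed)
print("fitted factors:", [round(v, 6) for v in betas_hat])
print("extended yield at 100 years:", round(ns_yield(100, betas_hat), 6))
```

Because the slope and curvature loadings decay towards zero, the extended curve is almost flat at the level factor for very long maturities, in line with the description above.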

4 Econometric model

4.1 Specification

To extrapolate the yield curve we need the parameters of the risk neutral density.

We estimate these parameters from data at the long end of the yield curve. In the

Vasicek model (2) the mean reversion parameter κ is identified through the relative

volatilities b(τ) of yields with different maturities, while the infinite yield θ is identified

from the differences between the unconditional means of long-term yields. Assuming

that these parameters are constant we use time series data for both the 5-year and

20-year maturity discount yields. The 5-year yield is the shortest maturity that

7 The tool ceiops-tool-extrapolator-risk-free-rates en.xls is available on the website https://eiopa.europa.eu.

8 See the leading adjustment term I1 in Appendix B in Christensen et al (2011)


seems uncontaminated by additional factors, while the 20-years rate is the longest

one that still is on the downward sloping part of the volatility curve in the empirical

data. The 20-years rate is also the ‘last liquid point’ in the extrapolation of EIOPA.

We need multiple maturities to separate the parameters under the P measure from

the risk neutral Q parameters. For this we need sufficient cross-sectional variation.

By choosing the two maturities relatively far apart we include as much of the cross

sectional information as possible.

Henceforth, we consider the two restricted AR(1) processes for maturities (τ1, τ2) =

(5, 20),

$$y_t(\tau_i) - y_{t-h}(\tau_i) = \alpha\,\bigl(m_i - y_{t-h}(\tau_i)\bigr) + e_t(\tau_i), \qquad i = 1, 2, \qquad (15)$$

where h is the length of the time interval between two observations (one month,

h = 1/12), mi is the unconditional mean of a discount rate with maturity τi, and

the shocks eit are normally distributed with mean zero and covariance matrix Σ.

The mean reversion parameter α = 1 − e−κ̃h is the discrete time equivalent of the

continuous time mean reversion parameter κ̃. The mean-reversion parameter α is the

same for the two different interest rates. According to (2) and (7) the error covariance

matrix takes the form

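$$\Sigma^* = s_h^2\,\sigma^2 \begin{pmatrix} b_1^2 & b_1 b_2 \\ b_1 b_2 & b_2^2 \end{pmatrix}, \qquad (16)$$

where $b_i = b(\tau_i)$ and where $s_h^2 = \frac{1 - e^{-2\tilde{\kappa}h}}{2\tilde{\kappa}}$ is a scaling constant that links the discrete

time model to the continuous time parameterization.9 Through the function b(τ)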

the covariance matrix is a function of κ. The mean reversion under the risk-neutral

measure is therefore identified through the covariance matrix. Since we are using a

single factor model, the matrix Σ∗ has rank one. To avoid the stochastic singularity

in the estimation it is common to assume a small measurement (or model) error. This

can be done through a formal measurement equation and a Kalman filter model as

in De Jong (2000). Since we only have two time series, we take the simpler approach

9 In an Euler discretization we would have α = κ̃h and s2h = h.


by adding a small positive variance to the diagonal elements of Σ∗,

$$\Sigma = \Sigma^* + s_h^2\,\eta^2 I \qquad (17)$$

Interpreting η² as a measurement error variance will only be credible if η² is small

relative to the overall volatility of the shocks eit. Large measurement error would

not only cast doubt on the model specification, but would also imply that the regressors

yt−h(τi) are subject to errors-in-variables. The three elements in Σ exactly

identify the three model parameters σ², η² and κ.

An estimate of Σ is admissible if the implied κ ≥ 0. From the structure of (16)

it is clear that nonnegativeness of κ requires that σ11 > σ22 and that σ21 > 0. In

the appendix we derive an additional admissibility condition which is specific to the

maturities τ1 = 5 and τ2 = 20.
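To see how κ is identified from the covariance matrix, note that under (16) and (17) the measurement error variance cancels from the difference of the diagonal elements, so that (σ11 − σ22)/σ21 = b1/b2 − b2/b1. This pins down the ratio b2/b1, which is monotonically decreasing in κ. The sketch below builds Σ from assumed parameter values and recovers κ by bisection; function names and inputs are hypothetical, and the bisection is an illustrative device rather than the estimation method of the paper.

```python
import math

def b(tau, kappa):
    """Loading b(tau) of eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def build_sigma(kappa, kappa_p, sigma2, eta2, tau1=5.0, tau2=20.0, h=1.0 / 12.0):
    """Covariance matrix of eqs. (16)-(17) for two maturities."""
    s2h = -math.expm1(-2.0 * kappa_p * h) / (2.0 * kappa_p)
    b1, b2 = b(tau1, kappa), b(tau2, kappa)
    off_diagonal = s2h * sigma2 * b1 * b2
    return [[s2h * (sigma2 * b1 * b1 + eta2), off_diagonal],
            [off_diagonal, s2h * (sigma2 * b2 * b2 + eta2)]]

def kappa_from_sigma(cov, tau1=5.0, tau2=20.0, lo=1e-8, hi=50.0):
    """Recover kappa from the elements of Sigma by bisection on b2/b1."""
    d = (cov[0][0] - cov[1][1]) / cov[0][1]        # equals b1/b2 - b2/b1
    ratio = 0.5 * (math.sqrt(d * d + 4.0) - d)     # positive root: implied b2/b1
    # b2/b1 falls from 1 (kappa -> 0) towards tau1/tau2 (kappa -> infinity).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if b(tau2, mid) / b(tau1, mid) > ratio:
            lo = mid                               # mid is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameter values for the illustration.
cov = build_sigma(kappa=0.02, kappa_p=0.30, sigma2=4.9e-5, eta2=1.1e-5)
kappa_hat = kappa_from_sigma(cov)
print(f"recovered kappa = {kappa_hat:.6f}")
```

The sign conditions stated above appear directly in this construction: the implied ratio is at most one only if σ11 ≥ σ22, and the ratio is well defined only if σ21 > 0.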

The intercepts mi in (15) are related to the parameters θ and µ̃,

$$m_i = b_i\,\tilde{\mu} + (1 - b_i)\,\theta + \frac{\tau_i b_i^2}{4\kappa}\,\sigma^2, \qquad i = 1, 2 \qquad (18)$$

Since κ and σ² are already identified from the covariance matrix Σ, these two equations

uniquely identify µ̃ and θ. Inverting (18) is problematic when κ→ 0. For small

values of κ the system becomes almost singular because b1 → b2, while at the same

time the intercepts go to infinity (unless σ² → 0). The ultimate yield may therefore

be very difficult to identify from the data if the risk-neutral mean reversion is small.
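The near-singularity can be made concrete by inverting (18) numerically. The sketch below solves the two equations for µ̃ and θ and reports how much θ moves when the 20-year intercept changes by one basis point. All parameter values and function names are assumptions chosen for the illustration.

```python
import math

def b(tau, kappa):
    """Loading b(tau) of eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def convexity_term(tau, kappa, sigma2):
    """Last term of eq. (18): tau * b(tau)^2 * sigma^2 / (4 * kappa)."""
    return tau * b(tau, kappa) ** 2 * sigma2 / (4.0 * kappa)

def intercepts(mu_p, theta, kappa, sigma2, taus=(5.0, 20.0)):
    """Unconditional means m_1, m_2 implied by eq. (18)."""
    return tuple(b(t, kappa) * mu_p + (1.0 - b(t, kappa)) * theta
                 + convexity_term(t, kappa, sigma2) for t in taus)

def means_from_intercepts(m1, m2, kappa, sigma2, taus=(5.0, 20.0)):
    """Invert eq. (18) for (mu_p, theta) by Cramer's rule."""
    b1, b2 = b(taus[0], kappa), b(taus[1], kappa)
    r1 = m1 - convexity_term(taus[0], kappa, sigma2)
    r2 = m2 - convexity_term(taus[1], kappa, sigma2)
    det = b1 - b2                      # shrinks to zero as kappa -> 0
    mu_p = (r1 * (1.0 - b2) - r2 * (1.0 - b1)) / det
    theta = (b1 * r2 - b2 * r1) / det
    return mu_p, theta

def theta_shift_per_bp(kappa, sigma2=4.9e-5):
    """Change in theta when m_2 is raised by one basis point (hypothetical inputs)."""
    m1, m2 = intercepts(0.015, 0.07, kappa, sigma2)
    _, base = means_from_intercepts(m1, m2, kappa, sigma2)
    _, bumped = means_from_intercepts(m1, m2 + 1e-4, kappa, sigma2)
    return bumped - base

for kappa in (0.02, 0.002):
    print(f"kappa = {kappa}: one basis point in m2 moves theta by {theta_shift_per_bp(kappa):.5f}")
```

In this example a tenfold reduction in κ raises the sensitivity of θ to the intercept by roughly an order of magnitude, which is the identification problem described in the text.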

Assuming normality and time series independence for the error terms, we obtain

a normal likelihood function from which we can estimate the six unknown parameters

κ, κ̃, θ, µ̃, σ² and η². The parameters are also exactly identified from the reduced

form parameters α, m1, m2 and Σ.

4.2 Bayesian analysis

Due to the near unit root behaviour of interest rates and the relatively short time

series sample, the unconditional means θ and µ̃ will be hard to estimate. In a Bayesian

analysis we can impose stationarity of the dynamics under both P and Q and add


a prior view on the unconditional mean of the interest rates that we analyze. A

Bayesian analysis also provides a way to account for the parameter uncertainty by

computing the extrapolation as a weighted average of different sets of parameters

with weights given by the posterior density of the parameters.

We will use mildly informative priors on the reduced form parameters. For the

time series mean reversion α we specify a normal prior truncated to α > 0, such

that the prior mean and standard deviation are 0.013 and 0.01, respectively. With

monthly data the prior is centered around a first order autocorrelation of 0.987. The

prior is centered close to a unit root, but the relatively tight precision also ensures

that the posterior will be away from the unit root unless the data are very informative

on the dynamics.
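As a small worked example, the prior mean for α can be translated into the other quantities used in the text through α = 1 − exp(−κ̃h). The function names below are assumptions; the half-life is an additional summary that is not reported in the paper.

```python
import math

H = 1.0 / 12.0  # monthly observation interval

def alpha_from_kappa(kappa_p, h=H):
    """Discrete-time mean reversion alpha = 1 - exp(-kappa_p * h)."""
    return -math.expm1(-kappa_p * h)

def kappa_from_alpha(alpha, h=H):
    """Inverse mapping: kappa_p = -log(1 - alpha) / h."""
    return -math.log1p(-alpha) / h

prior_mean_alpha = 0.013
autocorrelation = 1.0 - prior_mean_alpha           # first order autocorrelation in (15)
implied_kappa = kappa_from_alpha(prior_mean_alpha) # annualized mean reversion
half_life_years = math.log(2.0) / implied_kappa

print(f"autocorrelation  = {autocorrelation:.3f}")
print(f"implied kappa    = {implied_kappa:.4f} per year")
print(f"half-life        = {half_life_years:.2f} years")
```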

To impose that the long term means mi of the interest rates are positive we use

a truncated normal prior on mi > 0. We assume independent priors for m1 and

m2 with prior mean 4% and prior standard deviation 3.9%. The truncated normal

prior ensures that the unconditional means are positive at maturities τ1 and τ2, but

it does not guarantee that the unconditional mean is positive at all maturities. Most

problematic could be the ultimate yield θ, since it is extremely sensitive to a near

unit root in the risk-neutral process.

For the covariance matrix Σ we assume a truncated inverted Wishart distribution

$$p(\Sigma^{-1}) \sim TW(\Psi^{-1}, \nu) \quad \text{with} \quad \Psi = 0.01^2 \begin{pmatrix} 1 & 0.95 \\ 0.95 & 1 \end{pmatrix},$$

and the degrees of freedom parameter ν = 3 (slightly above the minimum value of

2). The prior is truncated to the region that satisfies the admissibility conditions

for κ > 0. The prior for κ is therefore implicit in the prior for Σ. Even though the

prior is almost non-informative for Σ, the prior for κ is informative. Accounting for

the truncation by the admissibility conditions, the implicit marginal prior for κ has

a mean of 0.033 and standard deviation of 0.049 (based on a numerical evaluation).

Since all priors are proper and have well-defined means and variances for the


reduced form parameters, the posterior moments for the reduced form parameters

also exist. The posterior is not available in closed form due to the truncation and the

non-linear parameterization involving the product αm. Numerically the posterior

can be easily obtained through Gibbs sampling, since all conditional posteriors are

straightforward except for the truncation. When we sample from the conditional

posteriors we reject a draw if it is outside the admissible region. In some cases the

probability of accepting a draw can be extremely low. This happens when we need to

draw the unconditional means m at a point where the mean reversion parameter α is

close to zero. In this case the data are uninformative about the unconditional mean,

meaning that we need to draw m from a distribution that is approximately equal to

the prior. Since this is a truncated normal with negative location parameters, the

probability of obtaining a positive number by drawing from a normal distribution

becomes very small. For small α we therefore use the exponential rejection sampling

algorithm suggested by Geweke (1991).
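The idea behind exponential rejection sampling can be sketched as follows. To draw from a standard normal truncated to z > a with a > 0, propose z = a + E with E exponential with rate a, and accept with probability exp(−(z − a)²/2). The code below is a minimal sketch in the spirit of that approach, not a reproduction of the algorithm in Geweke (1991): the switch point between plain and exponential rejection, the function names, and the location and scale used in the example are all assumptions.

```python
import math
import random

def sample_normal_tail(a, rng):
    """Draw z ~ N(0,1) conditional on z > a (a > 0) by exponential rejection."""
    while True:
        z = a + rng.expovariate(a)              # shifted exponential proposal, rate a
        if rng.random() <= math.exp(-0.5 * (z - a) ** 2):
            return z

def sample_positive_normal(loc, scale, rng, switch=0.5):
    """Draw m ~ N(loc, scale^2) truncated to m > 0.

    `switch` is an arbitrary threshold (assumption): for truncation points
    further in the tail the exponential proposal is used instead of plain
    rejection from the untruncated normal.
    """
    a = -loc / scale                            # standardized truncation point
    if a > switch:
        return loc + scale * sample_normal_tail(a, rng)
    while True:
        m = rng.gauss(loc, scale)
        if m > 0.0:
            return m

# Hypothetical conditional posterior with a negative location parameter.
rng = random.Random(1)
loc, scale = -0.12, 0.039
draws = [sample_positive_normal(loc, scale, rng) for _ in range(20000)]
print(f"smallest draw = {min(draws):.6f}, mean = {sum(draws) / len(draws):.6f}")
```

With the location about three standard deviations below zero, plain rejection would accept only about one draw in a thousand, whereas the exponential proposal accepts most draws.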

While drawing from the posterior density we also obtain the posterior distribution

of the term structure extrapolations (12) and (13) by computing the functions b(s),

Cy(s) and Cf (s) at each draw of the parameters.
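Computing the extrapolation at each parameter draw can be sketched as below. The posterior draws here are simulated from arbitrary distributions purely for illustration, and the band is an equal-tailed quantile interval, which is a simplification of the highest posterior density regions used in the paper. All names and numbers are assumptions.

```python
import math
import random

def b(tau, kappa):
    """Loading b(tau) of eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def extrapolate_from_yield(s, y_star, tau_star, kappa, theta, omega2):
    """Extrapolated yield from eq. (12)."""
    bs, b_star = b(s, kappa), b(tau_star, kappa)
    weight = bs / b_star
    c_y = 0.5 * bs * (s * bs - tau_star * b_star)
    return weight * y_star + (1.0 - weight) * theta + c_y * omega2

def posterior_band(s, y_star, tau_star, draws, lo=0.025, hi=0.975):
    """Posterior mean and equal-tailed band of the extrapolated yield at maturity s."""
    values = sorted(extrapolate_from_yield(s, y_star, tau_star, k, th, w2)
                    for k, th, w2 in draws)
    n = len(values)
    mean = sum(values) / n
    return mean, values[int(lo * (n - 1))], values[int(hi * (n - 1))]

# Hypothetical "posterior draws" of (kappa, theta, omega^2), for illustration only.
random.seed(42)
sigma = 0.007
draws = []
for _ in range(5000):
    kappa = random.uniform(0.01, 0.04)
    theta = random.gauss(0.06, 0.02)
    draws.append((kappa, theta, sigma ** 2 / (2.0 * kappa)))

y_star, tau_star = 0.025, 20.0
for s in (20.0, 40.0, 60.0, 100.0):
    mean, low, high = posterior_band(s, y_star, tau_star, draws)
    print(f"s = {s:5.0f}: mean {mean:.4f}, band [{low:.4f}, {high:.4f}]")
```

The band collapses to the observed yield at the last liquid point and widens with maturity, since the weight on the uncertain parameters grows as s increases.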

5 Results

5.1 Parameter estimates

Maximum likelihood estimation results are in table 1. In contrast to many other

studies the model is estimated with data on medium- to long-term maturities. Results

are similar, however, to what has been found for other sample periods and countries.

The time series mean reversion κ̃ is not significantly different from zero.10 It implies

a monthly first order autocorrelation of 0.975. We also find, like e.g. De Jong (2000),

that the risk neutral mean reversion parameter κ is much smaller than the time

10 Standard errors reported in the table are from the Hessian of the log-likelihood function. Robust standard errors allowing for heteroskedasticity and non-normality do not make a difference to the conclusions in this case.


Table 1: Parameter estimates

                     ML                          Bayesian
par            est        se       mean      stdev     95% lo     95% hi
κ           0.0202    0.0101     0.0203     0.0096     0.0016     0.0382
κ̃           0.3023    0.1685     0.1687     0.0915   1.×10^-6     0.3332
µ           0.1338    0.2677     0.2017     1.2164     0.0315     0.4101
µ̃           0.0155    0.0102     0.0139     0.0044     0.0072     0.0212
θ           0.0717    0.0324      -7.31      473.8    -0.4038     0.2106
Λ0          0.2940    0.7668    -0.0007     0.2388    -0.4957     0.4881
Λ1         -40.556    24.157    -21.602     13.584    -47.026     2.7775
(100σ)²      0.471     0.131      0.485      0.084      0.330      0.652
(100ω)²      11.90      4.93       21.0      119.1       4.90        39.
(100η)²      0.110     0.013      0.109      0.013      0.084      0.135

The table reports parameter estimates for the bivariate model using interest rates with maturities τ1 = 5 and τ2 = 20 years. The first columns contain maximum likelihood estimates (‘par est’) and asymptotic standard errors (‘se’). Standard errors are from the Hessian of the log-likelihood function. The remaining columns show Bayesian posterior means (‘mean’), standard deviations (‘stdev’) and 95% highest posterior density intervals (‘95% lo’, ‘95% hi’). Results are based on one million draws from the Gibbs sampler.

series mean reversion. It is also not significantly different from zero, even though

the asymptotic standard error is very small. The estimated cross-sectional mean

reversion implies a convergence of forward rates towards the ultimate forward rate θ

that is much slower than what is assumed in the Smith-Wilson methodology adopted

by EIOPA. Under the EIOPA specification full convergence takes place in 40 years;

the ML estimate of κ implies that after 40 years only 45% of the gap between the

observed ft(τ∗) and θ has been closed. Due to the near unit root under the risk neutral

measure Q the estimate of the unconditional mean µ is very imprecise. The long-term

yield θ is even more imprecise because of the additional uncertainty due to the strong

convexity effect at the long end of the yield curve. The implied parameters for the

price of risk are both insignificant. The imprecision in Λ0 and Λ1 can be reduced by

setting Λ1 = 0, which would also imply that κ̃ = κ and further exacerbate the unit root problem in estimating the mean parameters.

The model error variance is so small that results cannot be affected by errors-in-variables problems. Assuming the measurement error to be uncorrelated over time, the measurement error in $y_{t-h}$ is of the order $\tfrac{1}{2} s_h^2 \eta^2 \approx 5 \times 10^{-7}$, which is negligible relative to its true variance $b_i^2 \omega^2 \approx 10^{-3}$.

Figure 3: Conditional posterior draws of θ given κ

The figure shows a scatter plot of the draws θ(i) conditional on κ(i). The left-hand panel shows all draws except the 1% most negative. The right-hand panel shows the 1% smallest values of θ(i). Note the different scales for the two panels.

The results for the Bayesian analysis in table 1 are based on 1 million draws from

the MCMC sampler. The posterior moments for κ and κ̃ are close to the maximum

likelihood estimates. Due to the prior specification the posterior mean for the time

series mean reversion is a bit closer to the unit root and also a bit more precise.

Similar to the ML estimates the mean reversion under P is still substantially larger

than under Q. The risk parameter Λ1 provides a direct comparison on the equality

of the two parameters and, as for the ML estimates, the 95% credible interval for Λ1

contains zero.
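
The 95% highest posterior density intervals in table 1 are summaries of the simulated draws. As a minimal sketch of how such an interval can be obtained from a list of draws, one can take the narrowest window of sorted draws that contains 95% of the sample. The function name `hpd_interval` and the toy draws are illustrative assumptions, not the authors' code, and the approach presumes a unimodal posterior.

```python
def hpd_interval(draws, mass=0.95):
    """Narrowest interval containing a fraction `mass` of the draws (unimodal case)."""
    xs = sorted(draws)
    n = len(xs)
    k = min(n, max(1, round(mass * n)))  # number of draws inside the interval
    # Slide a window over k consecutive order statistics and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Toy, purely illustrative draws: a regular grid plus a few extreme negative outliers.
toy_draws = [i / 1000 for i in range(1000)] + [-5000.0] * 5
lo, hi = hpd_interval(toy_draws)
mean = sum(toy_draws) / len(toy_draws)
print(f"mean = {mean:.3f}, 95% HPD = [{lo:.3f}, {hi:.3f}]")
```

With a heavily left-skewed sample the sample mean can fall below the lower HPD bound, which is the pattern reported for θ in table 1.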

Most of the difference with the ML results comes from inference on θ. The ultimate yield θ is hard to identify from the data. Although the prior imposes that the

unconditional means of the 5- and 20-year yields exist and are positive, this does not

guarantee that the unconditional means at other maturities are also positive. Pos-

terior moments for θ are dominated by a few extremely negative outliers when κ is

close to zero. Figure 3 shows a scatter plot of the posterior draws for θ conditional

on κ. The right-side of the figure zooms in on the 1% smallest draws for θ. All of

these occur conditional on very small values $\kappa < 10^{-4}$. Given the large uncertainty on the ultimate yield, it is doubtful if the posterior mean of θ exists with our

prior specification. A clear sign that the posterior mean may not exist is the extreme


skewness: due to the severe outliers the average of the simulated θ(i) is far below the

5% quantile of the posterior density.11

In contrast to the ultimate yield, the posterior on the unconditional variance under

Q is well-behaved. It does have a fat right tail, but the posterior simulation does not

produce any of the severe outliers that we encountered for θ. The unconditional

variance depends on $\kappa^{-1}$, whereas the ultimate yield depends on $\kappa^{-2}$.
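
The different powers of κ can be made concrete with a small numerical sketch. It assumes the textbook one-factor Vasicek relations, in which the unconditional variance is σ²/(2κ) and the ultimate yield equals the risk neutral mean minus σ²/(2κ²). The paper's own definitions are given in an earlier section and may be parameterised differently, and the function names are illustrative. The inputs are the rounded ML values of µ and σ² from table 1.

```python
def unconditional_variance(sigma2, kappa):
    """Assumed textbook Vasicek relation: omega^2 = sigma^2 / (2 kappa)."""
    return sigma2 / (2.0 * kappa)


def ultimate_yield(mu, sigma2, kappa):
    """Assumed textbook Vasicek relation: theta = mu - sigma^2 / (2 kappa^2)."""
    return mu - sigma2 / (2.0 * kappa ** 2)


mu, sigma2 = 0.1338, 0.471e-4  # rounded ML values from table 1
for kappa in (0.0202, 0.0101, 0.001, 0.0001):
    omega2 = unconditional_variance(sigma2, kappa)
    theta = ultimate_yield(mu, sigma2, kappa)
    print(f"kappa={kappa:<7} omega^2={omega2:.6f} theta={theta:.4f}")
```

Under these assumed relations, halving κ doubles the variance but quadruples the convexity drag on θ, so draws near the unit root produce far more extreme values of θ than of the variance.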

5.2 Extrapolation

For the extrapolation we set the reference maturity as τ∗ = 20 years. This is the

longest maturity in our model and is also the choice of the ‘last liquid point’ made

by EIOPA. We first investigate the effect of the parameter uncertainty assuming that

the Vasicek model fits perfectly. In this case both extrapolation formulas (12) and

(13) are identical. Figure 4 shows the extrapolated curves for the ML estimates as

well as the posterior mean of the Bayesian analysis. Starting at a 20-year yield of 4% the implied curve is slightly upward sloping, both at the ML estimates and at the posterior mean. Despite the large differences between the ML estimate and the

posterior mean for the UFR parameter θ, the extrapolated curves look very similar

up to 100 years.

The uncertainty around the extrapolation increases with maturity. For example,

at the 60 year maturity the 95% credibility interval ranges between 3.4% and 8.0%.

Most of the probability mass is consistent with an upward sloping yield curve starting

from a 4% initial level. Still the lower end of the interval can support a flat extrap-

olation. Even though the interval widens quickly, even at maturities of 100 years it

is not anywhere near the 95% region for θ in table 1. The negative outliers for θ at

small values for κ hardly affect the extrapolation at relevant maturities.12

11 What matters for the existence of the mean of θ are the properties of the ratio $\sigma^2/\kappa^2$. Both parameters depend on the error covariance matrix Σ, which is well determined and for which all low order moments clearly exist. But since κ, and thus $\kappa^{-2}$, is an implicit non-linear function of Σ, its properties cannot be determined analytically. Moreover, $\sigma^2$ and κ are dependent functions of the same matrix Σ.

12 Adding the prior restriction θ > 0 reduces the credibility interval, but also results in a more


Figure 4: Vasicek extrapolation

[Figure: two panels, each plotting Yield (% p.a.) against maturity s from 0 to 100 years.]

The left panel shows the posterior mean of yt(s) for s > τ∗ given τ∗ = 20 years and y∗t = 4%. The dashed lines define the 95% highest posterior density region. The dotted line shows the extrapolation conditional on the ML estimates. The right panel shows the posterior means for different last liquid points y∗t, being 2% (dashed), 4% (solid) and 6% (dashed) respectively. The solid lines in left and right panels represent the same extrapolated curve.

In the extrapolation formula (12) the ultimate yield θ is multiplied by the weight (1 − b(s)/b(τ∗)). The large negative outliers for θ occur when κ is very close to zero, in which case the weight will also go to zero. Figure 5 shows the posterior mean of the product (1 − b(s)/b(τ∗)) θ. Despite the outliers in θ, this term accounts for less than 10 basis points of the yields yt(s) up to a maturity of 60 years.

The effect becomes negative for longer maturities. The weighted average of y∗t and θ

would therefore produce a downward sloping extrapolation.

By construction the convexity term $C_y(s)\omega^2$ adds positively to the extrapolated yield. The posterior mean of $C_y(s)\omega^2$ in the right panel of figure 5 increases over the

entire range to the maturity of 100 years (even though it must decrease to zero by

construction as s → ∞). At the 60 years maturity it will contribute about 2% to

the yield curve on top of the weighted average of y∗t and θ; at 100 years the effect

increases to 4%.
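
The decomposition into a weighted average of y∗t and θ plus a convexity term can be sketched numerically. This is a hedged illustration rather than the paper's implementation: it assumes b(τ) = (1 − e^(−κτ))/(κτ), which is consistent with equation (21) in the appendix, and it uses a convexity loading C_y(s) = ½ b(s)(B(s) − B(τ∗)) with B(τ) = τ b(τ), derived here from the textbook Vasicek bond price with ω² taken to be σ²/(2κ). Formula (12) itself lies outside this section, so the exact form of C_y(s) should be treated as an assumption, as should all function names. The inputs are the rounded ML values from table 1.

```python
import math


def B(kappa, tau):
    """Textbook Vasicek duration-type loading (1 - exp(-kappa*tau)) / kappa."""
    return (1.0 - math.exp(-kappa * tau)) / kappa


def b(kappa, tau):
    """Yield loading b(tau) = B(tau) / tau."""
    return B(kappa, tau) / tau


def extrapolate(s, y_star, theta, omega2, kappa, tau_star=20.0):
    """Return (yield, ultimate-yield component, convexity component) at maturity s."""
    weight = b(kappa, s) / b(kappa, tau_star)            # weight on the last liquid yield
    ultimate = (1.0 - weight) * theta                    # ultimate-yield component
    # Assumed convexity loading, derived from the textbook Vasicek yield formula.
    c_y = 0.5 * b(kappa, s) * (B(kappa, s) - B(kappa, tau_star))
    convexity = c_y * omega2
    return weight * y_star + ultimate + convexity, ultimate, convexity


# Rounded ML values from table 1; y* = 4% as in figure 4.
kappa, theta, omega2 = 0.0202, 0.0717, 11.90e-4
for s in (20, 40, 60, 80, 100):
    y, ult, conv = extrapolate(s, 0.04, theta, omega2, kappa)
    print(f"s={s:3d}  yield={y:.4f}  theta part={ult:.4f}  convexity={conv:.4f}")
```

Because the ML point estimates differ from the posterior means, the components printed here need not match the posterior means plotted in figure 5.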

The 95% HPD bounds for both the ultimate yield component and the convexity adjustment are much wider than for the overall extrapolation in figure 4.

strongly upward sloping extrapolation.


Figure 5: Extrapolation components

[Figure: two panels plotted against maturity s from 0 to 100 years; the vertical axis of the left panel runs from -10% to 10%, that of the right panel from 0% to 10%.]

The left panel shows the posterior mean and 95% HPD for the term (1 − b(s)/b(τ∗)) θ. The right panel shows the same quantities for the convexity effect $C_y(s)\omega^2$ defined in (12).

The difference is due to the correlation between the negative outliers in θ and positive

outliers for $\omega^2$, both driven by the near unit root draws for κ. For small κ we will often have a very negative θ pushing long-term yields downward, and at the same time a very large positive $\omega^2$ pulling yields upwards. For large κ the opposite happens:

negligible convexity and a positive θ with substantial weight. The downward effect

of the ultimate yield and the upward convexity effect balance each other.

The ultimate yield and convexity components are independent of the initial last

liquid yield y∗t . Different values of y∗t produce slowly converging yield curves because

the weight b(s)/b(τ∗) in (12) is monotone decreasing in s. The curves in the right panel of figure 4 show the posterior means for three different values of y∗t. The extrapolated curves are almost parallel, consistent with a very slow convergence rate. Even at the 100 year

maturity the curves are still far apart. Even with y∗t = 6% the yield curve is initially

upward sloping due to the strong convexity effect (relative to θ).

For empirical data the Vasicek curve does not perfectly fit. Extrapolation will

then depend on whether the state variable is calibrated with respect to the level or

the slope of the yield curve at the last liquid point. As an example, figure 6 presents

the extrapolation for September 2013. It shows that calibrating the state variable to

the yield y∗t produces a kink in the yield curve at τ∗. Using the forward rate f∗t the extrapolation is smooth, and also slightly below the yield based extrapolation. Apart from the initial lower slope the properties of the extrapolation are similar to the yield based extrapolation. The curve is upward sloping due to the convexity effect and convergence to the ultimate level θ is extremely slow. The same effect occurs in many more months. On average the forward extrapolation is about 0.4% below the yield extrapolation for maturities between 30 and 60 years.

Figure 6: Extrapolation based on last liquid forward rate

[Figure: two panels titled ‘September 2013’ and ‘Sample Average’, each plotting Yield (% p.a.) against maturity s from 0 to 60 years; series: yield, forward, data.]

The figure shows the posterior mean of alternative yield curve extrapolations. The solid blue line is the observed yield curve, the red diamonds line is the extrapolation calibrated on the 20-years yield, and the green dots are based on the 20-years forward rate. The right panel shows the results for the month September 2013; the left panel refers to the average over the entire sample.

For maturities in the 20-50 years range we can also observe the actual yield curve,

which is always below the extrapolated curve. The difference between the extrapo-

lated yield and the observed discount yield increases monotonically to 1.2% at the 50

years maturity. Figure 7 shows the time series of the differences between extrapolated

and observed yields at a maturity of 30 years. The residuals are small until mid

2008, then jump upwards, and slowly come down toward the end of the sample.

The convexity effect is the distinctive difference with alternative extrapolation

methods. Figure 8 adds the Smith-Wilson and Nelson-Siegel extrapolations to the

Vasicek curves for September 2013 discussed before. Both are below the Vasicek

curves, but still above the observed data. The Smith-Wilson extrapolation lets the

forward rate converge to θ = 4.2% at 60 years maturity. This convergence is much

faster than our estimated κ. The curve is therefore above the observed long end of

the yield curve, which has been under the 4% level for most of the sample since 2008. It is doubtful whether this extrapolation reflects market consistency.

Figure 7: Time series residuals

[Figure: time series from 2002 to 2014; vertical axis Yield (% p.a.) from 0.0% to 1.5%; series: yield, forward.]

The figure shows time series of the difference between extrapolated and observed yield curves at the 30 years maturity. Extrapolation is based on the Vasicek model calibrated to either the yield or the forward rate at 20-years maturity.

Figure 8: Alternative extrapolations

[Figure: Yield (% p.a.) against maturity s from 0 to 100 years; series: data, yield, forward, Smith-Wilson, Nelson-Siegel.]

The figure shows yield curves for September 2013 using different extrapolation methods. The curves ‘yield’, ‘forward’ and ‘data’ are as in figure 6. The Smith-Wilson extrapolation formula and the Nelson-Siegel model are both calibrated on the observed yield curve for maturities 1–20 years at annual intervals.

The Nelson-Siegel curve has a time-varying endpoint equal to the level factor at

each date. Even this curve is above the observed data, not only in September 2013

but also on average for the entire sample. The Nelson-Siegel model extrapolates based on the level, slope and curvature factors identified from the yield curve up to τ∗ = 20 years. Close to τ∗ the observed yield curve is still upward sloping, and as a result

the NS extrapolation converges to a level above y∗t . Since the observed curve is on

average slightly downward sloping for very long maturities the extrapolation is above

the observed yield curve.
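
The behaviour of the Nelson-Siegel extrapolation can be illustrated with the standard three-factor yield function. The sketch below uses the common level, slope and curvature parameterisation with a decay parameter `lam`; the factor values are hypothetical and chosen only to mimic a curve that is still rising at 20 years, and they are not estimates from the paper.

```python
import math


def nelson_siegel_yield(tau, level, slope, curvature, lam):
    """Standard Nelson-Siegel yield at maturity tau > 0 with decay parameter lam > 0."""
    x = lam * tau
    slope_loading = (1.0 - math.exp(-x)) / x            # tends to 0 as tau grows
    curvature_loading = slope_loading - math.exp(-x)    # also tends to 0 as tau grows
    return level + slope * slope_loading + curvature * curvature_loading


# Hypothetical factor values: a negative slope factor gives an upward sloping curve.
level, slope, curvature, lam = 0.04, -0.02, 0.01, 0.5
for tau in (5, 10, 20, 50, 100):
    y = nelson_siegel_yield(tau, level, slope, curvature, lam)
    print(f"tau={tau:3d}  yield={y:.4f}")
```

Both loadings vanish as the maturity grows, so the extrapolated yield approaches the level factor, which lies above the 20-year yield whenever the fitted curve is still rising at that point.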

5.3 Robustness

We performed several robustness checks with respect to the Vasicek extrapolations.

One set of variations considers the maturities on which the model parameters are

estimated. For the basic setting we used maturities of 5 and 20 years to estimate the

parameters. The 5-year maturity could be too low and be too much affected by other

less persistent factors. For this reason we also consider the combination (τ1, τ2) =

(10, 20) years. On the other hand, since most term structure models are estimated on maturities up to 10 years we also estimate the model on the pair (τ1, τ2) = (5, 10) years. In both cases the distance between the maturities is less than in the basic model. As a result the mean reversion parameter κ is estimated with less precision and has much more probability mass around the unit root. This further emphasises the problem of estimating the ultimate forward rate θ. Figure 9 displays the resulting extrapolations. For both alternative pairs of (τ1, τ2) the convexity effect is more important, resulting in extrapolations above that of the baseline model.

Figure 9: Extrapolation based on alternative parameter estimates

[Figure: two panels, each plotting Yield (% p.a.) against maturity s from 0 to 100 years; left panel series: (10, 20), (5, 10), (5, 20); right panel series: EU, US.]

The figure shows yield curve extrapolations from alternative parameter estimates. In the left panel parameters are estimated for alternative pairs of maturities (τ1, τ2). In the right panel the alternative parameters are derived from US swap rate data.

For a second robustness check we evaluate if the same features, strong convexity

and extrapolation above the observed yield curve, are present in US swap rate data.

From Datastream we collected swap rate data starting in May 1998. From these

we construct the discount yields at maturities (τ1, τ2) = (5, 20) years to estimate the parameters of the Vasicek model and the extrapolations. For the mean reversion parameter κ the estimate is very similar, but more precise. We therefore observe fewer negative outliers in the posterior draws for θ and fewer positive outliers for $\omega^2$. Overall figure 9 shows that the US data imply an upward sloping extrapolation. Also, similar to the Euro data, the extrapolated yield curve is above the observed yield curve.


6 Conclusion

Yield curves extrapolated using the Vasicek model are generally almost parallel and

in most cases slightly upward sloping for maturities between 20 and 100 years. The

main difference with alternative extrapolation techniques is the convexity effect in

very long term yields. The convexity effect is an important element in no-arbitrage

term structure models and can be a large component due to the slow mean reversion

of the dominant interest rate level factor under the risk neutral measure.

The extrapolation based on a no-arbitrage term structure model is always above

the observed yield curve. This underscores the importance of using a model based yield curve for valuations of very long-dated maturities, but also raises the

issue of why observations at the very long end are so low and why they do not fit a

standard term structure model.

A Admissibility conditions for Σ

The bivariate restricted VAR has error covariance matrix

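$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{21} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} = s_h^2 \begin{pmatrix} \sigma^2 b_1^2 + \eta^2 & \sigma^2 b_1 b_2 \\ \sigma^2 b_1 b_2 & \sigma^2 b_2^2 + \eta^2 \end{pmatrix}$$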

Assuming τ1 < τ2 we impose σ11 > σ22 to ensure b1 > b2. Similarly, since b1 and b2

are both positive, we also impose σ21 > 0. To solve for κ we construct

$$S \equiv \frac{\sigma_{11} - \sigma_{22}}{\sigma_{21}} = \frac{b_1^2 - b_2^2}{b_1 b_2} \tag{19}$$

where the scalar S is a function of the elements of Σ and where the last expression

only depends on κ. The conditions on Σ imply that S > 0. The condition can be

rewritten as

$$\frac{b_2}{b_1} = \frac{1}{2}\left(\sqrt{S^2 + 4} - S\right) \tag{20}$$

Since in our application τ2 = 4τ1, the left-hand side can be rewritten as

$$\frac{b_2}{b_1} = \frac{1}{4}\left(1 + e^{-\kappa\tau_1}\right)\left(1 + e^{-2\kappa\tau_1}\right) \tag{21}$$


For positive κ this is a monotone decreasing function in κ, and hence the equation

has a unique solution for κ, if it exists.

For S > 0 the right-hand side varies between 0 and 1, meaning that b2 < b1

as required by positive mean reversion κ > 0. Since the left-hand side has a lower bound of $\tfrac{1}{4}$, we must also restrict the right-hand side to be above $\tfrac{1}{4}$, which implies the restriction $S < \tfrac{15}{4}$. This upper bound is a third admissibility condition on Σ.
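
The mapping from Σ to κ implied by (19)–(21) can be sketched as a one-dimensional root search. The code below is an illustration under the appendix's assumption τ2 = 4τ1; the function names and the use of bisection are choices made for this sketch and are not taken from the paper.

```python
import math


def ratio_from_S(S):
    """Right-hand side of (20): b2/b1 as a function of S."""
    return 0.5 * (math.sqrt(S * S + 4.0) - S)


def solve_kappa(S, tau1=5.0):
    """Solve (20) = (21) for kappa, assuming tau2 = 4 * tau1 and 0 < S < 15/4."""
    if not 0.0 < S < 15.0 / 4.0:
        raise ValueError("S violates the admissibility conditions 0 < S < 15/4")
    target = ratio_from_S(S)
    # Work with x = exp(-kappa * tau1) in (0, 1); (1 + x)(1 + x^2) / 4 is increasing in x.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 0.25 * (1.0 + mid) * (1.0 + mid * mid) < target:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return -math.log(x) / tau1


# Round trip: build S from a known kappa using b(tau) = (1 - exp(-kappa*tau)) / (kappa*tau).
kappa_true, tau1, tau2 = 0.0202, 5.0, 20.0
b1 = (1.0 - math.exp(-kappa_true * tau1)) / (kappa_true * tau1)
b2 = (1.0 - math.exp(-kappa_true * tau2)) / (kappa_true * tau2)
S = (b1 ** 2 - b2 ** 2) / (b1 * b2)
print(f"S = {S:.6f}, recovered kappa = {solve_kappa(S, tau1):.6f}")
```

The round trip starts from a known κ, so it only checks that the solver inverts the map; in an application S would be computed from the elements of Σ as in (19).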

References

Christensen, J., Diebold, F., and Rudebusch, G. (2011). The Affine Class of Arbitrage-Free Nelson-Siegel Term Structure Models. Journal of Econometrics, 164:4–20.

Committee of European Insurance and Occupational Pensions Supervisors (2010). QIS5: Risk Free Interest Rates – Extrapolation Method. Technical report. eiopa.europa.eu/fileadmin/tx_dam/files/consultations/QIS/QIS5/ceiops-paper-extrapolation-risk-free-rates_en-20100802.pdf.

Dai, Q. and Singleton, K. (2000). Specification Analysis of Affine Term Structure Models. Journal of Finance, 55:1943–1978.

De Jong, F. (2000). Time-Series and Cross Section Information in Affine Term Structure Models. Journal of Business and Economic Statistics, 18:300–314.

Diebold, F. and Rudebusch, G. (2013). Yield Curve Modeling and Forecasting: the Dynamic Nelson-Siegel Approach. Princeton University Press.

Duffee, G. (2002). Term Premia and Interest Rate Forecasts in Affine Models. Journal of Finance, 57:405–443.

Duffie, D. and Kan, R. (1996). A Yield Factor Model of Interest Rates. Mathematical Finance, 6:379–406.

European Insurance and Occupational Pensions Authority (2014). Consultation Paper on a Technical Document Regarding the Risk-free Interest Rate Term Structure. Report EIOPA-CP-14/042.

European Insurance and Occupational Pensions Authority (2015). Technical Documentation of the Methodology to Derive EIOPA's Risk-free Interest Rate Term Structures. EIOPA-BoS-15/035.


Geweke, J. (1991). Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints and the evaluation of constraint probabilities. In Computing science and statistics: Proceedings of the 23rd symposium on the interface, pages 571–578.

Smith, A. and Wilson, T. (2001). Fitting Yield curves with long Term Constraints. Research Notes, Bacon and Woodrow.
