What Does a Term Structure Model Imply About Very Long-Term Interest Rates?
Anne Balter, Antoon Pelsser and Peter Schotman
DP 02/2014-065
Anne Balter, Maastricht University and Netspar
Antoon Pelsser, Maastricht University, Kleynen Consultants and Netspar
Peter Schotman, Maastricht University and [email protected]
November 8, 2015
Abstract: We estimate a term structure model on interest rate data with maturities up to 20 years and then extrapolate the yield curve to maturities up to 100 years. Such model based extrapolation is motivated by limited liquidity of very long-dated fixed income instruments. The extrapolated curves are always above observed swap rates at maturities in the 20-50 years range. The extrapolation appears mainly driven by the near unit root of the level factor under the risk neutral measure. In a no-arbitrage term structure model this leads to a strong convexity effect, which implies that extrapolated yield curves are generally upward sloping for maturities longer than 20 years before eventually bending steeply downwards. Our estimates use Bayesian methods. The prior is informative on mean reversion parameters and imposes a zero lower bound on the unconditional means.
Keywords: term structure models, parameter uncertainty, extrapolation, insurance supervision
JEL codes: G23, G12, C58
1 Introduction
Long maturity discount rates are an essential input for valuing the liabilities of pension
funds and insurance companies. Life-insurance or pension fund liabilities can be as
long as 100 years, whereas the available liquid instruments in the market have much
shorter maturities. In most countries market rates can only be observed for maturities
up to 20 or 30 years for government debt. Swap rates are available for maturities up
to 50 years, but there are doubts about the liquidity of the longest maturities. Fair
value, or market consistent, valuation requires discount rates that are close to market
rates, but free of liquidity effects.1
For this purpose various methods have been proposed to extend an observed yield
curve. Assuming that market rates are liquid up to a ‘last liquid point’ with maturity
of 20 years (say), how should such a yield curve be extended to maturities up to 100
years? One option is the use of numerical extrapolation techniques. A prominent
example is the Smith-Wilson methodology adopted by the European Insurance and
Occupational Pensions Authority (EIOPA), which extrapolates the forward rate curve
using exponential functions.2 The extrapolation method provides a smooth extension
from the yield at the last liquid point to an externally specified ultimate forward
rate (UFR) and a chosen convergence speed parameter. Other methods, such as the
Nelson-Siegel methodology, would first fit level, slope and curvature factors using
data on the liquid part of the yield curve and then extend the yield curve with the
parameters of the fitted model.3 In the Smith-Wilson methodology the yield curve
always converges to the same constant, whereas in the Nelson-Siegel model long rates
converge to a time-varying level factor estimated from the current term structure.
A problem with these methods is that the extension is based on curve fitting,
1 Quoting from Moody’s Analytics (May 2013): ‘Fair value is the price that would be received to sell an asset or paid to transfer a liability in an orderly transaction (that is, not a forced liquidation or distressed sale) between market participants at the measurement date under current market conditions.’
2 The technical details of the methodology are described in CEIOPS (2010), EIOPA (2014) and EIOPA (2015).
3 See Diebold and Rudebusch (2013) for a textbook treatment of the Nelson-Siegel model for fitting term structure data.
and not a formal term structure model. The Nelson-Siegel model can be made ar-
bitrage free by adding a yield adjustment term, like in Christensen, Diebold, and
Rudebusch (2011), but this adjustment will push very long term yields to minus in-
finity. The strong downward pressure on long-term yields is caused by the unit root of
the level factor under the risk neutral measure that drives the convexity adjustment.
The arbitrage-free Nelson-Siegel model is a member of the class of essentially affine
Gaussian term structure models (Duffee, 2002). Other models in that class do not
necessarily have a unit root in the risk neutral dynamics, and will thus have a smaller
convexity adjustment at the very long end of the yield curve.
The existence of a convexity effect is the main argument for using a formal term
structure model for the extrapolation. It implies that yields do not always converge
monotonically from the last liquid point towards the ultimate yield. Starting from
low interest rate levels, the convergence will follow a hump shape, in which yields first
overshoot the ultimate yield before finally decreasing slowly towards the limit. With
current low interest rates such an extrapolation will result in a much higher level for
long-term yields than either the Nelson-Siegel or the Smith-Wilson extrapolation.
A secondary aim of the paper is to quantify the uncertainty around extrapolated
yields given a formal term structure model. How much can we learn from time series
of observed yields with medium to long term maturities (5 to 20 years) about the
shape of the yield curve at very long maturities (between 20 and 100 years)? More
explicitly, what can we infer about the three crucial elements at the long end of the
yield curve: the ultimate forward rate, the convergence speed towards the UFR, and
the convexity? The 20-year maturity separates the liquid segment of the yield curve
from the very long maturities, for which the market is deemed to be less liquid. Taking
20 years as the ‘last liquid point’ is advocated by EIOPA. We also see data evidence
for this breakpoint, since the volatility of yields increases slowly with maturities after
this point, contrary to the implications of theoretical term structure models.
The simplest model to estimate these quantities is a single factor Vasicek model.
The model can be parameterized by three key parameters: the ultimate yield, mean
reversion (or convergence speed) and volatility. It therefore directly addresses the
main challenges for extrapolating a yield curve. The single factor model cannot fit
the complex curvatures at the short end (1 month to 2 years) of the yield curve, but
performs reasonably for long maturities. We therefore estimate the parameters using
data on 5- and 20-year maturities.
In a Bayesian analysis using Euro swap rate data we find that the mean reversion
of the level factor is non-zero, but with considerable probability mass very close to
the unit root (under the risk neutral measure). That means that convexity effects
are important. Convexity drives down the limiting yield. But since convergence
towards it is very slow, the yield curve remains upward sloping for maturities up
to 100 years. As expected the Vasicek extrapolation leads to a higher level of very
long-term yields than the Nelson-Siegel or Smith-Wilson methods. The uncertainty
around the extrapolated yields increases with the maturity. For market conditions
prevailing in the fall of 2013 the Nelson-Siegel and Smith-Wilson extrapolations are
at the lower end of the 95% highest posterior density region.
2 Data
Our yield curve data consist of a monthly panel of discount rates from the website
of the Bundesbank.4 These yield curve data are constructed from Euro swap rates
with maturities ranging from 1 to 50 years. The sample period is from January 2002
to September 2013 resulting in 141 data-points per maturity.
Figure 1 provides an overview of the data. The average term structure is increasing
until the 20-year maturity, after which it becomes slightly downward sloping for longer
maturities. The yield curve has fluctuated substantially over the twelve year sample
period. The figure shows a curve from the beginning of the sample (March 2002)
when the curve has a similar shape as the average, but at a 1.5% higher level. The
lowest long term yields are from May 2012, where the 50 years maturity yield is just
4 http:/www.bundesbank.de/Navigation/EN/Statistics/Time series databases
Figure 1: Euro swap rates
[Figure: two panels. Left panel: yield curves labelled Average, Low and High. Right panel: volatility curves labelled Levels, Changes and Residuals. Horizontal axis in both panels: maturity in years, 0 to 50.]
In the left panel the solid blue line shows the average yield curve over the period January 2002 – September 2013. The two red lines with crosses show the yield curves with the minimum and maximum rate at a 20 years maturity. The right panel shows the volatility of yield level, yield changes and the one month prediction errors from an AR(1) model. The horizontal axis in both panels is the maturity in years. The vertical axis unit is percent per year. The right vertical axis is for levels; the left vertical axis is for changes and prediction errors.
below 2%. With such dominant parallel shifts of yield curves, it will be difficult to fit
a long-term common ultimate forward rate to these data. It is definitely not reached
before a 50 years maturity and convergence to a common ultimate yield requires a
very low level of implied mean reversion.
The right hand panel of figure 1 shows the volatility of yields. The standard
deviation of yield levels decreases quickly up to maturities of 12 years, after which the
volatility of the level stabilizes. Within an affine term structure model this points to
at least two factors. The initial downward slope would then be related to a stationary
factor with strong mean reversion. This stationary factor has a negligible influence on
yields with maturities longer than 10 years. The flat part at long maturities requires
a level factor that is close to a random walk under the risk neutral measure. Since the
shortest maturity in this data set is one year, it is difficult to identify a third factor.
The same figure also shows the standard deviation of yield changes. In a one-factor
model it should have the same shape as the level volatility, and only the scaling should
be different. In a multi-factor model the shape for short and medium-term maturities
can be very different from the shape of the level volatilities. The figure shows the
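Figure 2: Time series data
[Figure: two panels. One panel is a scatter diagram with the Δ 5 year yield on the horizontal axis and the Δ 20 year yield on the vertical axis, both ranging from -0.8 to 0.8. The other panel plots the time series of the 5 years and 20 years yields on a vertical axis from 0 to 6.]
The figure shows the monthly time series data for the 5- and 20-year maturity discount yields. The scatter diagram displays first differences of the yields. The time series graph shows the levels.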
familiar hump-shaped volatility, where the volatility peaks at the three year maturity,
and then starts a gradual decrease. The initial hump shape for shorter to intermediate
maturities can be explained by a two-factor model. The gradual downward sloping
volatility is consistent with a slowly mean-reverting level factor. Most puzzling is
the upward sloping pattern beyond the 20 years maturity, which cannot be explained
by standard term structure models. The affine model, for example, implies that the
volatility curve is downward sloping for longer maturities. Very long-dated swap
prices may contain more noise because the market at these long-term maturities is
less liquid. This suggests that the 20-year rate may be a good reference for the last
liquid point to start the extrapolation.
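The three volatility measures shown in the right panel of figure 1 can be sketched as follows. This is a minimal illustration, assuming a plain list of monthly yields for a single maturity; the function names and the toy series are hypothetical and not taken from the paper or its data.

```python
import math

def stdev(values):
    """Sample standard deviation with divisor n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def ar1_residuals(y):
    """OLS residuals from regressing y[t] on a constant and y[t-1]."""
    lagged, current = y[:-1], y[1:]
    m_lag = sum(lagged) / len(lagged)
    m_cur = sum(current) / len(current)
    sxx = sum((a - m_lag) ** 2 for a in lagged)
    sxy = sum((a - m_lag) * (c - m_cur) for a, c in zip(lagged, current))
    slope = sxy / sxx
    intercept = m_cur - slope * m_lag
    return [c - intercept - slope * a for a, c in zip(lagged, current)]

def volatility_measures(y):
    """Volatility of levels, of first differences, and of AR(1) prediction errors."""
    changes = [c - a for a, c in zip(y[:-1], y[1:])]
    return {
        "levels": stdev(y),
        "changes": stdev(changes),
        "residuals": stdev(ar1_residuals(y)),
    }

# Hypothetical toy series (percent per year), for illustration only.
toy_yields = [4.0, 4.2, 4.1, 3.9, 3.7, 3.8, 3.5, 3.3, 3.4, 3.1]
print(volatility_measures(toy_yields))
```

Applying such a function maturity by maturity would give one point per maturity on each of the three curves.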
In our econometric model we will work with time series data on discount yields for
maturities 5 and 20 years. The time series are shown in figure 2. The scatter diagram
shows the close association between first differences in the 5-year and 20-year rates
suggesting that the single factor assumption is not too far off. From the time series
plot it appears that both rates have a slight negative trend over the last decade, so
that it may be hard to estimate an unconditional mean from these time series.
3 Term Structure Model
Our basic model is the essentially affine term structure model introduced by Duffee
(2002) as an extension of the Duffie and Kan (1996) class of affine term structure
models. Given a K-vector of factors xt that follow a vector Ornstein-Uhlenbeck
process under the risk-neutral density (Q measure), the yield curve is given by
$$y_t(\tau) = a(\tau) + b(\tau)' x_t \tag{1}$$
where yt(τ) is the yield of a discount bond at time t with time to maturity τ ; a(τ)
and the K-vector b(τ) are functions of the time to maturity and the underlying
parameters of the model. Dai and Singleton (2000) show that identification requires
that the factor dynamics can be specified by a set of K mean reversion parameters,
one for each of the factors. The larger the mean reversion coefficient of a factor, the
lower will be the impact of a factor on long-term yields. For typical estimates of a
three factor model, more than 95% of the variation at maturities longer than 5 years
is explained by the first factor, usually referred to as the level factor.
3.1 Vasicek model
Since our aim is to extrapolate the yield curve beyond maturities of 20 years, using
data in the segment between 5 and 20 years, we specialize our model to a single factor.
For the single factor ‘Vasicek’ model both xt and b(τ) are scalars, and the yield curve
is given explicitly as
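$$y_t(\tau) = \theta + b(\tau)\,(x_t - \theta) + \tfrac{1}{2}\,\omega^2\,\tau\,b(\tau)^2 \tag{2}$$
where
$$b(\tau) = \frac{1 - e^{-\kappa\tau}}{\kappa\tau} \tag{3}$$
$$\omega^2 = \frac{\sigma^2}{2\kappa} \tag{4}$$
$$\theta = \mu - \frac{\omega^2}{\kappa} \tag{5}$$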
The function b(τ) defines the volatility of long-term yields relative to the level factor
$x_t$; $\omega^2$ is the unconditional variance of the factor; and θ is the limiting yield $y_t(\tau)$
when τ → ∞. The constant θ is both the ultimate yield as well as the ultimate
forward rate. It is equal to the unconditional mean of the risk-neutral distribution of
the factor minus the infinite horizon convexity adjustment $\omega^2/\kappa$. The convexity term
scales with the unconditional variance $\omega^2$ of the risk-neutral factor dynamics. If the
mean-reversion κ goes to zero, meaning a true level factor, the variance will tend to
infinity. But even if κ does not move all the way to its limit, the convexity in the
ultimate yield can be substantial due to the additional κ in the denominator for $\omega^2$.
For small κ it can become so big that yields will be negative for large maturities and converge
to a negative θ (with fixed µ). This will occur, for example, in the arbitrage-free
version of the Nelson-Siegel model of Christensen, Diebold and Rudebusch (2011).
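Equations (2) to (5) translate directly into code. The sketch below is only an illustration: the function names are ours, the parameter values are the rounded ML point estimates reported later in Table 1, and the factor value x = 0.02 is a hypothetical choice. Because the tabulated estimates are rounded, the ultimate yield implied by this calculation need not reproduce the tabulated θ exactly.

```python
import math

def b(tau, kappa):
    """Loading b(tau) of eq. (3); expm1 keeps precision when kappa * tau is small."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def ultimate_yield(mu, kappa, sigma2):
    """theta of eq. (5), with omega^2 = sigma^2 / (2 kappa) from eq. (4)."""
    omega2 = sigma2 / (2.0 * kappa)
    return mu - omega2 / kappa

def vasicek_yield(tau, x, mu, kappa, sigma2):
    """Yield of eq. (2) for maturity tau > 0 and factor value x."""
    omega2 = sigma2 / (2.0 * kappa)
    theta = ultimate_yield(mu, kappa, sigma2)
    bt = b(tau, kappa)
    return theta + bt * (x - theta) + 0.5 * omega2 * tau * bt ** 2

# Rounded ML estimates from Table 1; X is a hypothetical factor value.
KAPPA, MU, SIGMA2, X = 0.0202, 0.1338, 0.471e-4, 0.02

for tau in (5, 20, 50, 100, 1000):
    print(tau, round(vasicek_yield(tau, X, MU, KAPPA, SIGMA2), 4))
print("implied theta:", round(ultimate_yield(MU, KAPPA, SIGMA2), 4))
```

The short end of the curve collapses to the factor value x and the long end to θ, with the convexity term ½ω²τb(τ)² shaping the path in between.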
In the Vasicek model the factor x follows an Ornstein-Uhlenbeck process under the
risk neutral measure Q. For the time series dynamics under the physical measure
P we need to assume a process for the stochastic discount factor. Following Duffee
(2002) we make the essentially affine assumption
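$$\frac{d\Lambda}{\Lambda} = -x\,dt - \lambda_t\,d\tilde{W}, \tag{6}$$
where $\lambda_t = \Lambda_0 + \Lambda_1 x_t$, and where $\tilde{W}$ is standard Brownian motion. With this
assumption the time series process for the factor becomes
$$dx = \tilde{\kappa}(\tilde{\mu} - x)\,dt + \sigma\,d\tilde{W} \tag{7}$$
and parameters under the P and Q measures are related by
$$\kappa = \tilde{\kappa} + \sigma\Lambda_1, \qquad \mu = \tilde{\mu} - \frac{\sigma(\Lambda_0 + \Lambda_1\tilde{\mu})}{\kappa} \tag{8}$$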
With the definition of the risk neutral mean µ, the ultimate yield θ in (5) can be
further decomposed as the sum of the average spot rate µ̃, a risk premium depending
on risk parameters Λ0 and Λ1, and the convexity effect ω2/κ.
Combining (2) and (7) provides an expression for the time series behavior of
different yields yt(τ), which are the starting point for the econometric analysis.
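The mapping (8) between the time series (P) and risk neutral (Q) parameters is a two-line calculation. The following sketch uses illustrative names of our own; the inputs are the rounded ML estimates from Table 1, so the output will only approximately match the tabulated κ and µ.

```python
import math

def risk_neutral_params(kappa_p, mu_p, sigma, lam0, lam1):
    """Map P-measure (kappa~, mu~) to Q-measure (kappa, mu) as in eq. (8)."""
    kappa_q = kappa_p + sigma * lam1
    mu_q = mu_p - sigma * (lam0 + lam1 * mu_p) / kappa_q
    return kappa_q, mu_q

# Rounded ML estimates from Table 1: kappa~, mu~, (100 sigma)^2, Lambda0, Lambda1.
KAPPA_P, MU_P = 0.3023, 0.0155
SIGMA = math.sqrt(0.471e-4)
LAM0, LAM1 = 0.2940, -40.556

kappa_q, mu_q = risk_neutral_params(KAPPA_P, MU_P, SIGMA, LAM0, LAM1)
print("kappa under Q:", kappa_q)
print("mu under Q:   ", mu_q)
```

A negative Λ1 lowers the mean reversion under Q relative to P, which is the channel through which the near unit root under the risk neutral measure arises. Setting Λ1 = 0 makes the two mean reversion parameters coincide.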
3.2 Extrapolation
In general extrapolation uses forward rates and the identity
$$y_t(s) = \frac{1}{s}\int_0^s f_t(u)\,du, \tag{9}$$
with $f_t(\tau)$ the instantaneous forward rate at time t for time t + τ. If we have reliable
data for the term structure up to the reference maturity $\tau^*$ (‘last liquid point’) the
extension to maturities $s > \tau^*$ follows as
$$y_t(s) = \frac{1}{s}\left(\tau^* y_t^* + \int_{\tau^*}^s f_t(u)\,du\right), \tag{10}$$
where $y_t^* = y_t(\tau^*)$ is the observed yield at maturity $\tau^*$.5 The extrapolation ensures
continuity of the extended yield curve at the last liquid point. For the one factor
Vasicek model the forward rate curve is a function of the single state variable xt,
$$f_t(\tau) = \theta + e^{-\kappa\tau}(x_t - \theta) + \omega^2 e^{-\kappa\tau}\,\frac{1 - e^{-\kappa\tau}}{\kappa}, \tag{11}$$
which shows that θ is also the ultimate forward rate. Integrating (11) over τ provides
a closed form extrapolation for (10) with xt still as a latent variable. If the Vasicek
model would provide an exact fit for the entire term structure, xt could be calibrated
from any observed yield or forward rate, always leading to the same extrapolated
yield curve. Since we do not expect a perfect fit, at least not at short maturities, the
most natural choices for empirical work are (i) to use (2) to solve for $x_t$ as a function
of $y_t^*$; or (ii) to use (11) to calibrate $x_t$ to the forward rate $f_t^* \equiv f_t(\tau^*)$ at the last
liquid point.
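The link between the forward curve (11) and the yield extension (10) can be checked numerically. The sketch below integrates the Vasicek forward rate from the last liquid point with a simple midpoint rule and compares the result with the closed form (2). Function names, the midpoint rule and the factor value x = 0.02 are our own illustrative choices; θ, κ and ω² are the rounded ML estimates from Table 1.

```python
import math

def big_b(tau, kappa):
    """B(tau) = tau * b(tau) = (1 - exp(-kappa * tau)) / kappa."""
    return -math.expm1(-kappa * tau) / kappa

def vasicek_yield(tau, x, theta, kappa, omega2):
    """Eq. (2), written in terms of B(tau) = tau * b(tau)."""
    bb = big_b(tau, kappa)
    return theta + bb / tau * (x - theta) + 0.5 * omega2 * bb ** 2 / tau

def vasicek_forward(tau, x, theta, kappa, omega2):
    """Instantaneous forward rate of eq. (11)."""
    decay = math.exp(-kappa * tau)
    return theta + decay * (x - theta) + omega2 * decay * big_b(tau, kappa)

def extend_yield(s, tau_star, y_star, forward, steps=4000):
    """Eq. (10): extend the curve beyond tau_star by integrating forward rates."""
    du = (s - tau_star) / steps
    integral = sum(forward(tau_star + (i + 0.5) * du) for i in range(steps)) * du
    return (tau_star * y_star + integral) / s

# Rounded ML estimates from Table 1; X is a hypothetical factor value.
THETA, KAPPA, OMEGA2, X = 0.0717, 0.0202, 11.90e-4, 0.02

y_star = vasicek_yield(20.0, X, THETA, KAPPA, OMEGA2)
forward = lambda u: vasicek_forward(u, X, THETA, KAPPA, OMEGA2)
print(extend_yield(60.0, 20.0, y_star, forward))
print(vasicek_yield(60.0, X, THETA, KAPPA, OMEGA2))
```

When the same factor value generates both the yield at the last liquid point and the forward curve, the two printed numbers agree up to the integration error.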
Calibrating with respect to y∗t gives the extrapolation scheme
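$$y_t(s) = \frac{b(s)}{b(\tau^*)}\,y_t^* + \left(1 - \frac{b(s)}{b(\tau^*)}\right)\theta + C_y(s)\,\omega^2, \qquad C_y(s) = \tfrac{1}{2}\,b(s)\,\bigl(s\,b(s) - \tau^* b(\tau^*)\bigr) \tag{12}$$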
with Cy(s) the convexity effect in the extrapolation. The extrapolated curve (12) is
continuous at τ ∗, but not necessarily differentiable. Yields with maturity slightly less
5 In practice most methods replace the instantaneous forward rate by one year forward rates $F_t(\tau, \tau+1)$, in which case the integral in (10) becomes the sum $\sum_{i=\tau^*}^{s-1} F_t(i, i+1)$. For expositional purposes we stick with the instantaneous representation.
than τ ∗ are observed in the data and the slope at s < τ ∗ is therefore as it is observed
in the data, which need not exactly correspond to the implied slope from the Vasicek
model. When calibrating to the forward rate the extrapolation formula becomes
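$$y_t(s) = \theta + \frac{\tau^*}{s}\,(y_t^* - \theta) + \left(1 - \frac{\tau^*}{s}\right) b(s - \tau^*)\,(f_t^* - \theta) + C_f(s)\,\omega^2, \qquad C_f(s) = \frac{1}{2s}\,\bigl(s\,b(s) - \tau^* b(\tau^*)\bigr)^2. \tag{13}$$
This function is differentiable at $\tau^*$ due to the data identity $f_t^* = y_t^* + \tau^*\,\partial y_t(\tau^*)/\partial\tau$ implied
by (9) which relates forward rates and yields. Therefore, calibrating the state variable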
to the forward rate provides a smoother extrapolation. The disadvantage of using the
forward rate is measurement error. The instantaneous forward rate is not directly
observable, but needs to be calculated as a derivative of the observed yield curve with
respect to maturity. It is therefore more sensitive to measurement error than the
yield itself.6
Both extrapolation formulas express the extrapolated yield as a weighted average
of data at the last liquid point ($y_t^*$ and/or $f_t^*$) and the ultimate yield θ plus a convexity
adjustment. In both cases the curve converges to the ultimate forward rate as s → ∞.
The salient features of the extrapolation are the convexity adjustment terms $C_y(s)$ and
$C_f(s)$, both of which are positive. Hence, even if $y_t^*$ and $f_t^*$ are both equal to θ, the
extrapolation first moves yt(s) above the ultimate yield, before slowly converging
downwards to the ultimate yield θ. The Vasicek extrapolation will thus be markedly
different from a simple weighted average of the last liquid point and an ultimate
forward rate. It will often imply a steeper upward sloping yield curve before eventually
flattening (or decreasing) towards the ultimate yield. The convexity adjustments in
(12) and (13) are positive, because the convexity is measured relative to the ultimate
yield θ, which itself has the maximum possible negative convexity effect $-\omega^2/\kappa$.
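Both extrapolation schemes are closed-form and can be sketched in a few lines. The function names and the example inputs (a yield of 2.5% and a forward rate of 3% at the 20-year point) are hypothetical; θ, κ and ω² are again the rounded ML estimates from Table 1.

```python
import math

def b(tau, kappa):
    """Loading b(tau) of eq. (3), with the limit b(0) = 1."""
    if tau == 0.0:
        return 1.0
    return -math.expm1(-kappa * tau) / (kappa * tau)

def extrapolate_from_yield(s, tau_star, y_star, theta, kappa, omega2):
    """Eq. (12): calibrate the factor to the yield at the last liquid point."""
    weight = b(s, kappa) / b(tau_star, kappa)
    c_y = 0.5 * b(s, kappa) * (s * b(s, kappa) - tau_star * b(tau_star, kappa))
    return weight * y_star + (1.0 - weight) * theta + c_y * omega2

def extrapolate_from_forward(s, tau_star, y_star, f_star, theta, kappa, omega2):
    """Eq. (13): calibrate the factor to the forward rate at the last liquid point."""
    weight = tau_star / s
    c_f = (s * b(s, kappa) - tau_star * b(tau_star, kappa)) ** 2 / (2.0 * s)
    return (theta + weight * (y_star - theta)
            + (1.0 - weight) * b(s - tau_star, kappa) * (f_star - theta)
            + c_f * omega2)

# Rounded ML estimates from Table 1; Y_STAR and F_STAR are hypothetical inputs.
THETA, KAPPA, OMEGA2 = 0.0717, 0.0202, 11.90e-4
TAU_STAR, Y_STAR, F_STAR = 20.0, 0.025, 0.030

for s in (20.0, 30.0, 50.0, 100.0):
    print(s,
          round(extrapolate_from_yield(s, TAU_STAR, Y_STAR, THETA, KAPPA, OMEGA2), 4),
          round(extrapolate_from_forward(s, TAU_STAR, Y_STAR, F_STAR, THETA, KAPPA, OMEGA2), 4))
```

Both functions return the observed yield at s = τ*, and with the yield and forward rate both set equal to θ they return values above θ for s > τ*, which is the overshooting caused by the positive convexity terms.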
For comparison we also construct two alternative extrapolations. One is the
Smith-Wilson extrapolation for which we use the Excel tool available on the EIOPA
6 One could formally allow for a slight measurement error in the observed rate $y_t^*$ and extract the state variables from multiple maturities using a Kalman filter. In that case the extrapolation would use the estimated $\hat{x}_t$ and proceed directly by integrating (11). The resulting extrapolation will then not be differentiable at $\tau^*$.
website.7 The extrapolation uses the entire yield curve up to τ ∗ as input to produce
a smooth extrapolation towards a constant ultimate yield θ = 4.2%. The other is the
Nelson-Siegel model, for which we work with the textbook specification in Diebold
and Rudebusch (2013),
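$$y_t(\tau) = \beta_{1t} + \beta_{2t}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau}\right) + \beta_{3t}\left(\frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau}\right) \tag{14}$$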
The factors βjt (j = 1, 2, 3) are calibrated for each date t using all maturities up to
τ ∗ with a constant shape parameter λ = 0.52 (the optimal least squares value for our
sample) and then used to extend the yield curve for maturities s > τ ∗. With the three
time-varying factors the Nelson-Siegel model should be able to fit most of the yield
curve. Since the curve converges to the level factor β1t it implies a flat yield curve at
very long maturities. Our implementation of the Nelson-Siegel model does not include
the arbitrage-free extension suggested in Christensen, Diebold and Rudebusch (2011).
The extension implies an additional maturity specific constant term β0(τ). Adding
these terms we would again obtain a large convexity effect because the level factor in
the Nelson-Siegel model implies a unit root for the Vasicek factor, i.e. κ = 0.8
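The Nelson-Siegel calibration for a single date reduces to a linear least squares problem once λ is fixed. The sketch below solves the three normal equations by Gaussian elimination to stay within the standard library; the function names and the example factor values are hypothetical, and only λ = 0.52 comes from the text.

```python
import math

LAMBDA = 0.52  # shape parameter reported in the text

def ns_loadings(tau, lam=LAMBDA):
    """Level, slope and curvature loadings of eq. (14)."""
    slope = (1.0 - math.exp(-lam * tau)) / (lam * tau)
    return [1.0, slope, slope - math.exp(-lam * tau)]

def solve3(a, c):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [ci] for row, ci in zip(a, c)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][k] * x[k] for k in range(r + 1, 3))) / m[r][r]
    return x

def fit_ns(maturities, yields, lam=LAMBDA):
    """Least squares estimates of (beta1, beta2, beta3) for one date."""
    rows = [ns_loadings(t, lam) for t in maturities]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(3)]
    return solve3(xtx, xty)

def ns_yield(tau, betas, lam=LAMBDA):
    """Nelson-Siegel yield, also used to extend the curve beyond the fitted range."""
    return sum(bj * lj for bj, lj in zip(betas, ns_loadings(tau, lam)))

# Hypothetical factor values used to generate a curve for maturities 1 to 20 years.
true_betas = [0.04, -0.02, 0.01]
maturities = list(range(1, 21))
curve = [ns_yield(t, true_betas) for t in maturities]
betas = fit_ns(maturities, curve)
print(betas, ns_yield(100.0, betas))
```

Since the slope and curvature loadings vanish for large τ, the extended curve flattens out at the fitted level factor.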
4 Econometric model
4.1 Specification
To extrapolate the yield curve we need the parameters of the risk neutral density.
We estimate these parameters from data at the long end of the yield curve. In the
Vasicek model (2) the mean reversion parameter κ is identified through the relative
volatilities b(τ) of yields with different maturities, while the infinite yield θ is identified
from the differences between the unconditional means of long-term yields. Assuming
that these parameters are constant we use time series data for both the 5-year and
20-year maturity discount yields. The 5-year yield is the shortest maturity that
7 The tool ceiops-tool-extrapolator-risk-free-rates en.xls is available on the website https://eiopa.europa.eu.
8 See the leading adjustment term I1 in Appendix B in Christensen et al. (2011).
seems uncontaminated by additional factors, while the 20-year rate is the longest
one that still is on the downward sloping part of the volatility curve in the empirical
data. The 20-year rate is also the ‘last liquid point’ in the extrapolation of EIOPA.
We need multiple maturities to separate the parameters under the P measure from
the risk neutral Q parameters. For this we need sufficient cross-sectional variation.
By choosing the two maturities relatively far apart we include as much of the cross
sectional information as possible.
Henceforth, we consider the two restricted AR(1) processes for maturities (τ1, τ2) =
(5, 20),
$$y_t(\tau_i) - y_{t-h}(\tau_i) = \alpha\,\bigl(m_i - y_{t-h}(\tau_i)\bigr) + e_t(\tau_i), \qquad i = 1, 2, \tag{15}$$
where h is the length of the time interval between two observations (one month,
h = 1/12), $m_i$ is the unconditional mean of a discount rate with maturity $\tau_i$, and
the shocks $e_t(\tau_i)$ are normally distributed with mean zero and covariance matrix Σ.
The mean reversion parameter $\alpha = 1 - e^{-\tilde{\kappa}h}$ is the discrete time equivalent of the
continuous time mean reversion parameter $\tilde{\kappa}$. The mean-reversion parameter α is the
same for the two different interest rates. According to (2) and (7) the error covariance
matrix takes the form
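$$\Sigma^* = s_h^2\,\sigma^2 \begin{pmatrix} b_1^2 & b_1 b_2 \\ b_1 b_2 & b_2^2 \end{pmatrix}, \tag{16}$$
where $b_i = b(\tau_i)$ and where $s_h^2 = \frac{1 - e^{-2\tilde{\kappa}h}}{2\tilde{\kappa}}$ is a scaling constant that links the discrete
time model to the continuous time parameterization.9 Through the function b(τ)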
the covariance matrix is a function of κ. The mean reversion under the risk-neutral
measure is therefore identified through the covariance matrix. Since we are using a
single factor model, the matrix Σ∗ has rank one. To avoid the stochastic singularity
in the estimation it is common to assume a small measurement (or model) error. This
can be done through a formal measurement equation and a Kalman filter model as
in De Jong (2000). Since we only have two time series, we take the simpler approach
9 In an Euler discretization we would have $\alpha = \tilde{\kappa}h$ and $s_h^2 = h$.
by adding a small positive variance to the diagonal elements of Σ∗,
$$\Sigma = \Sigma^* + s_h^2\,\eta^2 I \tag{17}$$
Interpreting $\eta^2$ as a measurement error variance will only be credible if $\eta^2$ is small
relative to the overall volatility of the shocks $e_t(\tau_i)$. Large measurement error would
not only cast doubt on the model specification, but would also imply that the regressors
$y_{t-h}(\tau_i)$ are subject to errors-in-variables. The three distinct elements of Σ exactly
identify the three model parameters $\sigma^2$, $\eta^2$ and κ.
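The identification of κ from the covariance matrix can be made concrete. Under (16) and (17) the measurement error cancels in σ11 − σ22, so that (σ11 − σ22)/σ12 = b1/b2 − b2/b1, and the ratio b(5)/b(20) is monotonically increasing in κ. The sketch below builds Σ from the structural parameters and recovers κ by bisection. All names are ours, and the restriction that the implied ratio lies between 1 and 4 is a property of this sketch for the maturities 5 and 20; the admissibility condition of the appendix is not reproduced here.

```python
import math

H = 1.0 / 12.0           # monthly observation interval, as in the text
TAU1, TAU2 = 5.0, 20.0   # the two maturities used in the paper

def b(tau, kappa):
    """Loading b(tau) of eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def build_sigma(sigma2, eta2, kappa_q, kappa_p, h=H):
    """Return the distinct elements (s11, s12, s22) of Sigma in eq. (17)."""
    s2h = -math.expm1(-2.0 * kappa_p * h) / (2.0 * kappa_p)
    b1, b2 = b(TAU1, kappa_q), b(TAU2, kappa_q)
    return (s2h * (sigma2 * b1 * b1 + eta2),
            s2h * sigma2 * b1 * b2,
            s2h * (sigma2 * b2 * b2 + eta2))

def kappa_from_sigma(s11, s12, s22, lo=1e-8, hi=5.0):
    """Recover the risk neutral mean reversion from the covariance elements."""
    d = (s11 - s22) / s12
    ratio = 0.5 * (d + math.sqrt(d * d + 4.0))   # positive root of r - 1/r = d
    if not 1.0 < ratio < 4.0:
        raise ValueError("no positive kappa is consistent with this covariance matrix")
    for _ in range(200):                          # bisection on b(5)/b(20) = ratio
        mid = 0.5 * (lo + hi)
        if b(TAU1, mid) / b(TAU2, mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip with the rounded ML estimates from Table 1.
s11, s12, s22 = build_sigma(0.471e-4, 0.110e-4, 0.0202, 0.3023)
print(kappa_from_sigma(s11, s12, s22))
```

Once κ is known, σ² follows from σ12 and η² from either diagonal element, which is the exact identification referred to in the text.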
An estimate of Σ is admissible if the implied κ ≥ 0. From the structure of (16)
it is clear that nonnegativity of κ requires that σ11 > σ22 and that σ21 > 0. In
the appendix we derive an additional admissibility condition which is specific to the
maturities τ1 = 5 and τ2 = 20.
The intercepts mi in (15) are related to the parameters θ and µ̃,
$$m_i = b_i\,\tilde{\mu} + (1 - b_i)\,\theta + \frac{\tau_i b_i^2}{4\kappa}\,\sigma^2, \qquad i = 1, 2 \tag{18}$$
Since κ and σ2 are already identified from the covariance matrix Σ, these two equa-
tions uniquely identify µ̃ and θ. Inverting (18) is problematic when κ→ 0. For small
values of κ the system becomes almost singular because b1 → b2, while at the same
time the intercepts go to infinity (unless σ2 → 0). The ultimate yield may therefore
be very difficult to identify from the data if the risk-neutral mean reversion is small.
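The inversion of (18) is a 2×2 linear system whose determinant equals b1 − b2. The sketch below, with function names of our own and the rounded ML estimates from Table 1 as inputs, makes the near-singularity for small κ visible.

```python
import math

TAUS = (5.0, 20.0)  # the two maturities used in the paper

def b(tau, kappa):
    """Loading b(tau) of eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def convexity(tau, kappa, sigma2):
    """Last term of eq. (18): tau * b(tau)^2 * sigma^2 / (4 kappa)."""
    return tau * b(tau, kappa) ** 2 * sigma2 / (4.0 * kappa)

def intercepts(mu_p, theta, kappa, sigma2):
    """Unconditional means m_1, m_2 of eq. (18)."""
    return [b(t, kappa) * mu_p + (1.0 - b(t, kappa)) * theta + convexity(t, kappa, sigma2)
            for t in TAUS]

def determinant(kappa):
    """Determinant b1 - b2 of the 2x2 system that is solved for (mu~, theta)."""
    return b(TAUS[0], kappa) - b(TAUS[1], kappa)

def invert_intercepts(m1, m2, kappa, sigma2):
    """Solve eq. (18) for (mu~, theta) by Cramer's rule."""
    b1, b2 = b(TAUS[0], kappa), b(TAUS[1], kappa)
    r1 = m1 - convexity(TAUS[0], kappa, sigma2)
    r2 = m2 - convexity(TAUS[1], kappa, sigma2)
    det = determinant(kappa)
    mu_p = (r1 * (1.0 - b2) - r2 * (1.0 - b1)) / det
    theta = (b1 * r2 - b2 * r1) / det
    return mu_p, theta

# Round trip with the rounded ML estimates from Table 1.
m1, m2 = intercepts(0.0155, 0.0717, 0.0202, 0.471e-4)
print(invert_intercepts(m1, m2, 0.0202, 0.471e-4))
for kappa in (0.1, 0.02, 0.001, 0.0001):
    print(kappa, determinant(kappa))
```

As κ shrinks, the determinant shrinks roughly in proportion to κ while the convexity terms grow, so small changes in the intercepts translate into large changes in the implied θ.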
Assuming normality and time series independence for the error terms, we obtain
a normal likelihood function from which we can estimate the six unknown parameters
κ, $\tilde{\kappa}$, θ, $\tilde{\mu}$, $\sigma^2$ and $\eta^2$. The parameters are also exactly identified from the reduced
form parameters α, m1, m2 and Σ.
4.2 Bayesian analysis
Due to the near unit root behaviour of interest rates and the relatively short time
series sample, the unconditional means θ and µ̃ will be hard to estimate. In a Bayesian
analysis we can impose stationarity of the dynamics under both P and Q and add
a prior view on the unconditional mean of the interest rates that we analyze. A
Bayesian analysis also provides a way to account for the parameter uncertainty by
computing the extrapolation as a weighted average of different sets of parameters
with weights given by the posterior density of the parameters.
We will use mildly informative priors on the reduced form parameters. For the
time series mean reversion α we specify a normal prior truncated to α > 0, such
that the prior mean and standard deviation are 0.013 and 0.01, respectively. With
monthly data the prior is centered around a first order autocorrelation of 0.987. The
prior is centered close to a unit root, but the relatively tight precision also ensures
that the posterior will be away from the unit root unless the data are very informative
on the dynamics.
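The correspondence between the discrete time parameter α, the first order autocorrelation and the continuous time mean reversion can be sketched as follows. The function names are illustrative; the only inputs from the text are h = 1/12 and the prior mean 0.013.

```python
import math

H = 1.0 / 12.0  # one month, as in the text

def alpha_from_kappa(kappa_p, h=H):
    """Discrete time mean reversion alpha = 1 - exp(-kappa~ * h)."""
    return -math.expm1(-kappa_p * h)

def kappa_from_alpha(alpha, h=H):
    """Inverse mapping: kappa~ = -log(1 - alpha) / h."""
    return -math.log1p(-alpha) / h

def autocorrelation(alpha):
    """First order autocorrelation implied by the AR(1) form (15)."""
    return 1.0 - alpha

prior_mean_alpha = 0.013
print(autocorrelation(prior_mean_alpha))     # monthly autocorrelation at the prior mean
print(kappa_from_alpha(prior_mean_alpha))    # implied annualized mean reversion
```

The same mapping converts any estimate of the annualized mean reversion into the monthly autocorrelation quoted in the results.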
To impose that the long term means mi of the interest rates are positive we use
a truncated normal prior on mi > 0. We assume independent priors for m1 and
m2 with prior mean 4% and prior standard deviation 3.9%. The truncated normal
prior ensures that the unconditional means are positive at maturities τ1 and τ2, but
it does not guarantee that the unconditional mean is positive at all maturities. Most
problematic could be the ultimate yield θ, since it is extremely sensitive to a near
unit root in the risk-neutral process.
For the covariance matrix Σ we assume a truncated inverted Wishart distribution
$p(\Sigma^{-1}) \sim TW(\Psi^{-1}, \nu)$ with
$$\Psi = 0.01^2 \begin{pmatrix} 1 & 0.95 \\ 0.95 & 1 \end{pmatrix},$$
and the degrees of freedom parameter ν = 3 (slightly above the minimum value of
2). The prior is truncated to the region that satisfies the admissibility conditions
for κ > 0. The prior for κ is therefore implicit in the prior for Σ. Even though the
prior is almost non-informative for Σ, the prior for κ is informative. Accounting for
the truncation by the admissibility conditions, the implicit marginal prior for κ has
a mean of 0.033 and standard deviation of 0.049 (based on a numerical evaluation).
Since all priors are proper and have well-defined means and variances for the
reduced form parameters, the posterior moments for the reduced form parameters
also exist. The posterior is not available in closed form due to the truncation and the
non-linear parameterization involving the product αm. Numerically the posterior
can be easily obtained through Gibbs sampling, since all conditional posteriors are
straightforward except for the truncation. When we sample from the conditional
posteriors we reject a draw if it is outside the admissible region. In some cases the
probability of accepting a draw can be extremely low. This happens when we need to
draw the unconditional means m at a point where the mean reversion parameter α is
close to zero. In this case the data are uninformative about the unconditional mean,
meaning that we need to draw m from a distribution that is approximately equal to
the prior. Since this is a truncated normal with negative location parameters, the
probability of obtaining a positive number by drawing from a normal distribution
becomes very small. For small α we therefore use the exponential rejection sampling
algorithm suggested by Geweke (1991).
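An exponential rejection sampler for the tail of a normal distribution can be sketched as below. This is a generic version in the spirit of Geweke (1991), not necessarily the exact variant used for the results: the proposal rate is set equal to the truncation point, and the threshold 0.5 for switching from plain rejection to the exponential proposal is a hypothetical choice.

```python
import math
import random

def std_normal_tail(a, rng):
    """Draw z ~ N(0, 1) conditional on z > a, for a > 0.

    Proposal: a + Exponential(rate a). A proposed z is accepted with
    probability exp(-(z - a)^2 / 2), which yields the truncated normal.
    """
    while True:
        z = a + rng.expovariate(a)
        if rng.random() <= math.exp(-0.5 * (z - a) ** 2):
            return z

def positive_normal(loc, scale, rng, switch=0.5):
    """Draw m ~ N(loc, scale^2) conditional on m > 0."""
    a = -loc / scale                      # standardized truncation point
    if a > switch:                        # far in the tail: exponential proposal
        return loc + scale * std_normal_tail(a, rng)
    while True:                           # otherwise plain rejection is efficient
        m = rng.gauss(loc, scale)
        if m > 0.0:
            return m

rng = random.Random(0)
# Hypothetical negative location: plain rejection would almost never succeed here.
draws = [positive_normal(-0.05, 0.01, rng) for _ in range(20000)]
print(min(draws), sum(draws) / len(draws))
```

With a location five standard deviations below zero, fewer than one in a million plain normal draws would be positive, whereas the exponential proposal accepts most draws.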
While drawing from the posterior density we also obtain the posterior distribution
of the term structure extrapolations (12) and (13) by computing the functions b(s),
Cy(s) and Cf (s) at each draw of the parameters.
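Evaluating the extrapolation at each posterior draw amounts to a loop over parameter draws. The sketch below uses formula (12) and summarizes the resulting values by their mean and by simple order-statistic bounds. The three draws are hypothetical placeholders rather than output of the sampler, and the equal-tailed bounds are a simplification of the highest posterior density regions reported in the paper.

```python
import math

def b(tau, kappa):
    """Loading b(tau) of eq. (3)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

def extrapolate_from_yield(s, tau_star, y_star, theta, kappa, omega2):
    """Extrapolated yield of eq. (12) for one set of parameters."""
    weight = b(s, kappa) / b(tau_star, kappa)
    c_y = 0.5 * b(s, kappa) * (s * b(s, kappa) - tau_star * b(tau_star, kappa))
    return weight * y_star + (1.0 - weight) * theta + c_y * omega2

def posterior_extrapolation(draws, s, tau_star, y_star):
    """Mean and (2.5%, 97.5%) order statistics of the extrapolated yield.

    `draws` is an iterable of (kappa, theta, omega2) tuples.
    """
    values = sorted(extrapolate_from_yield(s, tau_star, y_star, th, k, w2)
                    for (k, th, w2) in draws)
    n = len(values)
    mean = sum(values) / n
    lower = values[int(0.025 * (n - 1))]
    upper = values[int(math.ceil(0.975 * (n - 1)))]
    return mean, lower, upper

# Hypothetical placeholder draws of (kappa, theta, omega2), for illustration only.
example_draws = [(0.015, 0.05, 0.0015), (0.020, 0.07, 0.0012), (0.030, 0.09, 0.0008)]
print(posterior_extrapolation(example_draws, 50.0, 20.0, 0.025))
```

Averaging the extrapolated yield over draws, rather than plugging in a point estimate, is what carries the parameter uncertainty through to the extended curve.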
5 Results
5.1 Parameter estimates
Maximum likelihood estimation results are in table 1. In contrast to many other
studies the model is estimated with data on medium- to long-term maturities. Results
are similar, however, to what has been found for other sample periods and countries.
The time series mean reversion κ̃ is not significantly different from zero.10 It implies
a monthly first order autocorrelation of 0.975. We also find, like e.g. De Jong (2000),
that the risk neutral mean reversion parameter κ is much smaller than the time
10 Standard errors reported in the table are from the Hessian of the log-likelihood function. Robust standard errors allowing for heteroskedasticity and non-normality do not make a difference to the conclusions in this case.
Table 1: Parameter estimates
                     ML                        Bayesian
par            est        se       mean      stdev     95% lo     95% hi
κ           0.0202    0.0101     0.0203     0.0096     0.0016     0.0382
κ̃           0.3023    0.1685     0.1687     0.0915   1.×10^-6     0.3332
µ           0.1338    0.2677     0.2017     1.2164     0.0315     0.4101
µ̃           0.0155    0.0102     0.0139     0.0044     0.0072     0.0212
θ           0.0717    0.0324      -7.31      473.8    -0.4038     0.2106
Λ0          0.2940    0.7668    -0.0007     0.2388    -0.4957     0.4881
Λ1         -40.556    24.157    -21.602     13.584    -47.026     2.7775
(100σ)^2     0.471     0.131      0.485      0.084      0.330      0.652
(100ω)^2     11.90      4.93       21.0      119.1       4.90        39.
(100η)^2     0.110     0.013      0.109      0.013      0.084      0.135
The table reports parameter estimates for the bivariate model using interest rates with maturities τ1 = 5 and τ2 = 20 years. The first columns contain maximum likelihood estimates (‘par est’) and asymptotic standard errors (‘se’). Standard errors are from the Hessian of the log-likelihood function. The remaining columns show Bayesian posterior means (‘mean’), standard deviations (‘stdev’) and 95% highest posterior density intervals (‘95% lo’, ‘95% hi’). Results are based on one million draws from the Gibbs sampler.
series mean reversion. It is also not significantly different from zero, even though
the asymptotic standard error is very small. The estimated cross-sectional mean
reversion implies a convergence of forward rates towards the ultimate forward rate θ
that is much slower than what is assumed in the Smith-Wilson methodology adopted
by EIOPA. Under the EIOPA specification full convergence takes place in 40 years;
the ML estimate of κ implies a decay factor $e^{-40\kappa} \approx 0.45$, so that after 40 years about 45% of the gap between the
observed ft(τ∗) and θ is still open. Due to the near unit root under the risk neutral
measure Q the estimate of the unconditional mean µ is very imprecise. The long-term
yield θ is even more imprecise because of the additional uncertainty due to the strong
convexity effect at the long end of the yield curve. The implied parameters for the
price of risk are both insignificant. The imprecision in Λ0 and Λ1 can be reduced by
setting Λ1 = 0, which would also imply that κ̃ = κ and further exacerbate the unit root
problem in estimating the mean parameters.
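The convergence speed implied by a given κ can be read off from the exponential decay in (11). The sketch below ignores the convexity term of the forward rate and only tracks the factor e^{−κ·years} that multiplies the gap between the forward rate and θ; the function names are ours and κ is the ML estimate from Table 1.

```python
import math

def remaining_gap(kappa, years):
    """Fraction of the initial gap to theta that is left after `years`,
    based on the exp(-kappa * tau) decay in eq. (11), convexity term ignored."""
    return math.exp(-kappa * years)

def half_life(kappa):
    """Number of years after which half of the gap has been closed."""
    return math.log(2.0) / kappa

KAPPA_ML = 0.0202
print(remaining_gap(KAPPA_ML, 40.0))   # share of the gap still open after 40 years
print(half_life(KAPPA_ML))             # years needed to close half of the gap
```

Under this simplification roughly 45% of the gap is still open after 40 years, and it takes about 34 years to close half of it.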
The model error variance is so small that results cannot be affected by errors-in-
variables problems. Assuming the measurement error to be uncorrelated over time,
the measurement error in $y_{t-h}$ is of the order $\tfrac{1}{2} s_h^2 \eta^2 \approx 5 \times 10^{-7}$, which is negligible
15
Figure 3: Conditional posterior draws of θ given κ
The figure shows a scatter plot of the draws θ(i) conditional on κ(i). The left-hand panelshows all draws except the 1% most negative. The right-hand panel shows the 1% smallestvalues of θ(i). Note the different scales for the two panels.
relative to its true variance b2iω2 ≈ 10−3.
The results for the Bayesian analysis in table 1 are based on 1 million draws from
the MCMC sampler. The posterior moments for κ and κ̃ are close to the maximum
likelihood estimates. Due to the prior specification the posterior mean for the time
series mean reversion is a bit closer to the unit root and also a bit more precise.
Similar to the ML estimates the mean reversion under P is still substantially larger
than under Q. The risk parameter Λ1 provides a direct comparison on the equality
of the two parameters and, as for the ML estimates, the 95% credible interval for Λ1
contains zero.
Most of the difference from the ML results comes from inference on θ. The ultimate
yield θ is hard to identify from the data. Although the prior imposes that the
unconditional means of the 5- and 20-year yields exist and are positive, this does not
guarantee that the unconditional means at other maturities are also positive. Pos-
terior moments for θ are dominated by a few extremely negative outliers when κ is
close to zero. Figure 3 shows a scatter plot of the posterior draws for θ conditional
on κ. The right-hand panel of the figure zooms in on the 1% smallest draws for θ. All of
these occur at very small values of κ, below 10⁻⁴. Given the large uncertainty
about the ultimate yield, it is doubtful whether the posterior mean of θ exists with our
prior specification. A clear sign that the posterior mean may not exist is the extreme
skewness: due to the severe outliers the average of the simulated θ^(i) is far below the
5% quantile of the posterior density.11
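The way a handful of extreme draws can pull the sample mean below the lower quantiles can be illustrated with a small sketch. The draws used here are synthetic and purely hypothetical (they are not the posterior draws behind table 1), and the helper names `quantile` and `hpd_interval` are our own. The HPD interval is computed in the usual sample-based way, as the shortest interval that contains 95% of the sorted draws.

```python
import math

def quantile(sorted_draws, p):
    """Empirical (nearest-rank) quantile of an ascending list of draws."""
    n = len(sorted_draws)
    k = max(0, math.ceil(p * n) - 1)
    return sorted_draws[k]

def hpd_interval(draws, mass=0.95):
    """Shortest interval containing a fraction `mass` of the draws."""
    xs = sorted(draws)
    n = len(xs)
    m = math.ceil(mass * n)  # number of draws inside the interval
    widths = [xs[i + m - 1] - xs[i] for i in range(n - m + 1)]
    i_best = widths.index(min(widths))
    return xs[i_best], xs[i_best + m - 1]

# Synthetic, hypothetical draws: 990 well-behaved values between 0% and 20%,
# plus 10 extreme negative outliers (1% of the sample).
regular = [0.20 * i / 989 for i in range(990)]
outliers = [-500.0] * 10
draws = regular + outliers

mean = sum(draws) / len(draws)
q05 = quantile(sorted(draws), 0.05)
lo, hi = hpd_interval(draws)

print(f"mean = {mean:.3f}")
print(f"5% quantile = {q05:.4f}")
print(f"95% HPD = [{lo:.4f}, {hi:.4f}]")
```

In this artificial sample the 1% outliers leave the quantiles and the HPD interval almost untouched, while the mean ends up far below both, which is the pattern described for θ.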
In contrast to the ultimate yield, the posterior on the unconditional variance under
Q is well-behaved. It does have a fat right tail, but the posterior simulation does not
produce any of the severe outliers that we encountered for θ. The unconditional
variance depends on κ⁻¹, whereas the ultimate yield depends on κ⁻².
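The different orders of κ can be made concrete with a short sketch. It assumes the textbook Vasicek expressions, a stationary variance of σ²/(2κ) and an infinite-maturity yield of µ − σ²/(2κ²); the parametrisation used in this paper may differ in detail. The values of `mu` and `sigma2` are hypothetical round numbers, not estimates from table 1.

```python
def stationary_variance(sigma2, kappa):
    """Textbook Vasicek stationary variance: sigma^2 / (2 kappa)."""
    return sigma2 / (2.0 * kappa)

def ultimate_yield(mu, sigma2, kappa):
    """Textbook Vasicek infinite-maturity yield: mu - sigma^2 / (2 kappa^2)."""
    return mu - sigma2 / (2.0 * kappa ** 2)

# Hypothetical inputs (assumptions for illustration only).
mu = 0.10
sigma2 = 0.5e-4

for kappa in (2e-2, 2e-3, 2e-4):
    var = stationary_variance(sigma2, kappa)
    theta = ultimate_yield(mu, sigma2, kappa)
    print(f"kappa = {kappa:.4f}  variance = {var:.5f}  ultimate yield = {theta:.4f}")
```

Each tenfold reduction in κ multiplies the variance by ten but the convexity correction in the ultimate yield by one hundred, which is why draws of κ near zero produce far more extreme values for θ than for the variance.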
5.2 Extrapolation
For the extrapolation we set the reference maturity to τ* = 20 years. This is the
longest maturity in our model and is also the choice of the ‘last liquid point’ made
by EIOPA. We first investigate the effect of the parameter uncertainty assuming that
the Vasicek model fits perfectly. In this case the two extrapolation formulas (12) and
(13) are identical. Figure 4 shows the extrapolated curves for the ML estimates as
well as the posterior mean of the Bayesian analysis. Starting at a 20-year yield of 4%
the implied curve is slightly upward sloping, both at the ML estimates and at
the posterior mean. Despite the large differences between the ML estimate and the
posterior mean for the UFR parameter θ, the extrapolated curves look very similar
up to 100 years.
The uncertainty around the extrapolation increases with maturity. For example,
at the 60 year maturity the 95% credibility interval ranges between 3.4% and 8.0%.
Most of the probability mass is consistent with an upward sloping yield curve starting
from a 4% initial level. Still the lower end of the interval can support a flat extrap-
olation. Even though the interval widens quickly, even at maturities of 100 years it
is not anywhere near the 95% region for θ in table 1. The negative outliers for θ at
small values for κ hardly affect the extrapolation at relevant maturities.12
11 What matters for the existence of the mean of θ are the properties of the ratio σ²/κ². Both
parameters depend on the error covariance matrix Σ, which is well determined and for which all
low order moments clearly exist. But since κ, and thus κ⁻², is an implicit non-linear function of Σ,
its properties cannot be determined analytically. Moreover, σ² and κ are dependent functions of
the same matrix Σ.
12 Adding the prior restriction θ > 0 reduces the credibility interval, but also results in a more
strongly upward sloping extrapolation.

Figure 4: Vasicek extrapolation
[Two panels; horizontal axis: maturity s (0 to 100 years); vertical axis: yield (% pa).]
The left panel shows the posterior mean of y_t(s) for s > τ* given τ* = 20 years and
y*_t = 4%. The dashed lines define the 95% highest posterior density region. The dotted
line shows the extrapolation conditional on the ML estimates. The right panel shows
the posterior means for different last liquid points y*_t, being 2% (dashed), 4% (solid)
and 6% (dashed) respectively. The solid lines in left and right panels represent the same
extrapolated curve.

In the extrapolation formula (12) the ultimate yield θ is multiplied by the weight
(1 − b(s)/b(τ*)). The large negative outliers for θ occur when κ is very close to zero,
in which case the weight will also go to zero. Figure 5 shows the posterior mean of the
product (1 − b(s)/b(τ*)) θ. Despite the outliers in θ the overall effect of θ on yields
up to a maturity of 60 years accounts for less than 10 basis points on the yields y_t(s).
The effect becomes negative for longer maturities. The weighted average of y*_t and θ
would therefore produce a downward sloping extrapolation.
By construction the convexity term C_y(s)ω² adds positively to the extrapolated
yield. The posterior mean of C_y(s)ω² in the right panel of figure 5 increases over the
entire range to the maturity of 100 years (even though it must decrease to zero by
construction as s → ∞). At the 60-year maturity it contributes about 2% to
the yield curve on top of the weighted average of y*_t and θ; at 100 years the effect
increases to 4%.
The 95% HPD bounds for both the ultimate yield component and the
convexity adjustment are much wider than for the overall extrapolation in figure 4.
Figure 5: Extrapolation components
[Two panels; horizontal axis: maturity s (0 to 100 years).]
The left panel shows the posterior mean and 95% HPD for the term (1 − b(s)/b(τ*)) θ.
The right panel shows the same quantities for the convexity effect C_y(s)ω² defined in (12).
The difference is due to the correlation between the negative outliers in θ and positive
outliers for ω², both driven by the near unit root draws for κ. For small κ we will
often have a very negative θ pushing long-term yields downward, and at the same time
a very large positive ω² pulling yields upwards. For large κ the opposite happens:
negligible convexity and a positive θ with substantial weight. The downward effect
of the ultimate yield and the upward convexity effect balance each other.
The ultimate yield and convexity components are independent of the initial last
liquid yield y*_t. Different values of y*_t produce slowly converging yield curves because
the weight b(s)/b(τ*) in (12) is monotone decreasing in s. The curves in the right panel
of figure 4 show the posterior means for three different values of y*_t. The extrapolated
curves are almost parallel, consistent with a very slow convergence rate. Even at the
100-year maturity the curves are still far apart. Even with y*_t = 6% the yield curve is
initially upward sloping due to the strong convexity effect (relative to θ).
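The slow convergence can be checked with a rough sketch of the weights in (12). It assumes the yield loading b(τ) = (1 − e^(−κτ))/(κτ), which is the form consistent with the ratio b2/b1 in appendix A, and it evaluates the weights at the ML estimate κ = 0.0202 from table 1 rather than averaging over the posterior. The convexity term C_y(s)ω² is left out because it does not depend on y*_t. The function names are our own.

```python
import math

def b(tau, kappa):
    """Assumed yield loading (1 - exp(-kappa*tau)) / (kappa*tau)."""
    x = kappa * tau
    if abs(x) < 1e-8:
        return 1.0 - 0.5 * x  # first-order expansion near kappa*tau = 0
    return -math.expm1(-x) / x

def weight_on_llp(s, tau_star, kappa):
    """Weight b(s)/b(tau*) on the last liquid yield; 1 minus this multiplies theta."""
    return b(s, kappa) / b(tau_star, kappa)

KAPPA_ML = 0.0202   # ML estimate of kappa from table 1
TAU_STAR = 20.0     # last liquid point in years

for s in (20.0, 40.0, 60.0, 100.0):
    w = weight_on_llp(s, TAU_STAR, KAPPA_ML)
    gap = w * (0.06 - 0.02)  # distance between curves started at 6% and 2%
    print(f"s = {s:5.0f}  weight on y* = {w:.3f}  weight on theta = {1 - w:.3f}  gap = {gap:.4f}")

# As kappa approaches zero the weight on theta vanishes.
print(f"weight on theta at s = 60 for kappa = 1e-4: {1 - weight_on_llp(60.0, TAU_STAR, 1e-4):.4f}")
```

Under these assumptions the weight on y*_t is still roughly 0.7 at 60 years and about 0.5 at 100 years, so two curves that start 4 percentage points apart remain about 2 percentage points apart at the 100-year maturity.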
For empirical data the Vasicek curve does not fit perfectly. Extrapolation will
then depend on whether the state variable is calibrated with respect to the level or
the slope of the yield curve at the last liquid point. As an example, figure 6 presents
the extrapolation for September 2013. It shows that calibrating the state variable to
the yield y*_t produces a kink in the yield curve at τ*. Using the forward rate f*_t the
extrapolation is smooth, and also slightly below the yield based extrapolation. Apart
from the initial lower slope the properties of the extrapolation are similar to the yield
based extrapolation. The curve is upward sloping due to the convexity effect and
convergence to the ultimate level θ is extremely slow. The same effect occurs in many
more months. On average the forward extrapolation is about 0.4% below the yield
extrapolation for maturities between 30 and 60 years.

Figure 6: Extrapolation based on last liquid forward rate
[Two panels, titled ‘September 2013’ and ‘Sample Average’; horizontal axis: maturity s (0 to 60 years); vertical axis: yield (% pa); series: yield, forward, data.]
The figure shows the posterior mean of alternative yield curve extrapolations. The solid
blue line is the observed yield curve, the line with red diamonds is the extrapolation calibrated
on the 20-year yield, and the line with green dots is based on the 20-year forward rate. The right
panel shows the results for the month September 2013; the left panel refers to the average
over the entire sample.
For maturities in the 20-50 years range we can also observe the actual yield curve,
which is always below the extrapolated curve. The difference between the extrapolated
yield and the observed discount yield increases monotonically to 1.2% at the 50-year
maturity. Figure 7 shows the time series of the differences between extrapolated
and observed yields at a maturity of 30 years. The residuals are small until mid-2008,
then jump upwards, and slowly come down toward the end of the sample.
The convexity effect is the distinctive difference with alternative extrapolation
methods. Figure 8 adds the Smith-Wilson and Nelson-Siegel extrapolations to the
Vasicek curves for September 2013 discussed before. Both are below the Vasicek
curves, but still above the observed data. The Smith-Wilson extrapolation lets the
forward rate converge to θ = 4.2% at 60 years maturity. This convergence is much
faster than implied by our estimated κ. The curve is therefore above the observed long end of
the yield curve, which has been below the 4% level for most of the sample since 2008. It
is doubtful whether this extrapolation reflects market consistency.

Figure 7: Time series residuals
[Horizontal axis: 2002 to 2014; vertical axis: yield (% pa), 0.0% to 1.5%; series: yield, forward.]
The figure shows time series of the difference between extrapolated and observed yield
curves at the 30 years maturity. Extrapolation is based on the Vasicek model calibrated
to either the yield or the forward rate at 20-years maturity.

Figure 8: Alternative extrapolations
[Horizontal axis: maturity s (0 to 100 years); vertical axis: yield (% pa); series: data, yield, forward, Smith-Wilson, Nelson-Siegel.]
The figure shows yield curves for September 2013 using different extrapolation methods.
The curves ‘yield’, ‘forward’ and ‘data’ are as in figure 6. The Smith-Wilson extrapolation
formula and the Nelson-Siegel model are both calibrated on the observed yield curve
for maturities 1–20 years at annual intervals.
The Nelson-Siegel curve has a time-varying endpoint equal to the level factor at
each date. Even this curve is above the observed data, not only in September 2013
but also on average for the entire sample. The Nelson-Siegel model extrapolates based on
the level, slope and curvature factors identified from the yield curve up to τ* = 20
years. Close to τ* the observed yield curve is still upward sloping, and as a result
the NS extrapolation converges to a level above y*_t. Since the observed curve is on
average slightly downward sloping for very long maturities the extrapolation is above
the observed yield curve.
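The endpoint property of the Nelson-Siegel curve can be seen in a minimal sketch of its standard three-factor yield formula. The factor values and the decay parameter `LAM` are hypothetical numbers chosen so that the curve is still upward sloping at 20 years; they are not the values calibrated for figure 8.

```python
import math

def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Standard Nelson-Siegel yield with level, slope and curvature factors."""
    x = lam * tau
    slope_loading = -math.expm1(-x) / x          # (1 - exp(-x)) / x
    curvature_loading = slope_loading - math.exp(-x)
    return beta0 + beta1 * slope_loading + beta2 * curvature_loading

# Hypothetical factor values (assumptions for illustration only).
BETA0, BETA1, BETA2, LAM = 0.035, -0.02, 0.01, 0.5

for tau in (5.0, 20.0, 50.0, 100.0):
    y = nelson_siegel(tau, BETA0, BETA1, BETA2, LAM)
    print(f"tau = {tau:5.0f}  yield = {y:.4f}")
```

Both loadings vanish as the maturity grows, so the curve converges to the level factor `BETA0`. With a negative slope factor the yield at 20 years lies below that level, and the extrapolation therefore rises above the last liquid yield.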
5.3 Robustness
We performed several robustness checks with respect to the Vasicek extrapolations.
One set of variations considers the maturities on which the model parameters are
estimated. For the basic setting we used maturities of 5 and 20 years to estimate the
parameters. The 5-year maturity could be too short and too strongly affected by other
less persistent factors. For this reason we also consider the combination (τ1, τ2) =
(10, 20) years. On the other hand, since most term structure models are estimated
on maturities up to 10 years we also estimate the model on the pair (τ1, τ2) = (5, 10)
years. In both cases the distance between the maturities is smaller than in the basic
model. As a result the mean reversion parameter κ is estimated with
less precision and has much more probability mass around the unit root. This further
emphasises the problem of estimating the ultimate forward rate θ. Figure 9 displays
the resulting extrapolations. For both alternative pairs of (τ1, τ2) the convexity effect
is more important, resulting in extrapolations above that of the baseline model.

Figure 9: Extrapolation based on alternative parameter estimates
[Two panels; horizontal axis: maturity s (0 to 100 years); vertical axis: yield (% pa); left-panel series: (10, 20), (5, 10), (5, 20); right-panel series: EU, US.]
The figure shows yield curve extrapolations from alternative parameter estimates. In
the left panel parameters are estimated for alternative pairs of maturities (τ1, τ2). In the
right panel the alternative parameters are derived from US swap rate data.
For a second robustness check we evaluate whether the same features, strong convexity
and extrapolation above the observed yield curve, are present in US swap rate data.
From Datastream we collected swap rate data starting in May 1998. From these
we construct the discount yields at maturities (τ1, τ2) = (5, 20) years to estimate
the parameters of the Vasicek model and the extrapolations. For the mean reversion
parameter κ the estimate is very similar, but more precise. We therefore observe fewer
negative outliers in the posterior draws for θ and fewer positive outliers for ω². Overall
figure 9 shows that the US data imply an upward sloping extrapolation. Also, similar
to the Euro data, the extrapolated yield curve is above the observed yield curve.
6 Conclusion
Yield curves extrapolated using the Vasicek model are generally almost parallel and
in most cases slightly upward sloping for maturities between 20 and 100 years. The
main difference with alternative extrapolation techniques is the convexity effect in
very long term yields. The convexity effect is an important element in no-arbitrage
term structure models and can be a large component due to the slow mean reversion
of the dominant interest rate level factor under the risk neutral measure.
The extrapolation based on a no-arbitrage term structure model is always above
the observed yield curve. This underscores the importance of using a model
based yield curve for valuations of very long-dated maturities, but also raises the
issue of why observations at the very long end are so low and why they do not fit a
standard term structure model.
A Admissibility conditions for Σ
The bivariate restricted VAR has error covariance matrix

$$
\Sigma =
\begin{pmatrix} \sigma_{11} & \sigma_{21} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}
= s_h^2
\begin{pmatrix}
\sigma^2 b_1^2 + \eta^2 & \sigma^2 b_1 b_2 \\
\sigma^2 b_1 b_2 & \sigma^2 b_2^2 + \eta^2
\end{pmatrix}
$$

Assuming τ1 < τ2 we impose σ11 > σ22 to ensure b1 > b2. Similarly, since b1 and b2
are both positive, we also impose σ21 > 0. To solve for κ we construct

$$
S \equiv \frac{\sigma_{11} - \sigma_{22}}{\sigma_{21}} = \frac{b_1^2 - b_2^2}{b_1 b_2}
\tag{19}
$$

where the scalar S is a function of the elements of Σ and where the last expression
only depends on κ. The conditions on Σ imply that S > 0. The condition can be
rewritten as

$$
\frac{b_2}{b_1} = \tfrac{1}{2}\left(\sqrt{S^2 + 4} - S\right)
\tag{20}
$$

Since in our application τ2 = 4τ1, the left-hand side can be rewritten as

$$
\frac{b_2}{b_1} = \tfrac{1}{4}\left(1 + e^{-\kappa\tau_1}\right)\left(1 + e^{-2\kappa\tau_1}\right)
\tag{21}
$$

For positive κ this is a monotone decreasing function in κ, and hence the equation
has a unique solution for κ, if it exists.
For S > 0 the right-hand side varies between 0 and 1, meaning that b2 < b1
as required by positive mean reversion κ > 0. Since the left-hand side has a lower
bound of 1/4, we must also restrict the right-hand side to be above 1/4, which implies
the restriction S < 15/4. This upper bound is a third admissibility condition on Σ.
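The steps in (19) to (21) can be sketched numerically. The code below is one possible implementation, not necessarily the one used for the estimates: it checks the three admissibility conditions and then solves (21) for κ by bisection, exploiting the monotonicity of the left-hand side. The round-trip check builds Σ from the ML values of κ, σ² and η² in table 1, with the loading b(τ) = (1 − e^(−κτ))/(κτ) implied by (21) and a hypothetical scale s_h² = 1/12.

```python
import math

def loading_ratio(kappa, tau1):
    """b2/b1 from equation (21); valid when tau2 = 4 * tau1."""
    x = kappa * tau1
    return 0.25 * (1.0 + math.exp(-x)) * (1.0 + math.exp(-2.0 * x))

def kappa_from_sigma(s11, s21, s22, tau1, kappa_max=10.0, tol=1e-12):
    """Solve (19)-(21) for kappa; returns None if Sigma is not admissible."""
    if not (s11 > s22 and s21 > 0.0):
        return None
    S = (s11 - s22) / s21                        # equation (19)
    if not (0.0 < S < 15.0 / 4.0):
        return None
    target = 0.5 * (math.sqrt(S * S + 4.0) - S)  # equation (20)
    lo, hi = 0.0, kappa_max
    if loading_ratio(hi, tau1) > target:
        return None                              # root lies beyond the bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loading_ratio(mid, tau1) > target:    # ratio decreases in kappa
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def b(tau, kappa):
    """Assumed yield loading (1 - exp(-kappa*tau)) / (kappa*tau)."""
    return -math.expm1(-kappa * tau) / (kappa * tau)

# Round trip: build Sigma from known parameters, then recover kappa.
kappa_true, sigma2, eta2 = 0.0202, 0.471e-4, 0.110e-4
s2h = 1.0 / 12.0                                 # hypothetical scale factor
b1, b2 = b(5.0, kappa_true), b(20.0, kappa_true)
s11 = s2h * (sigma2 * b1 ** 2 + eta2)
s22 = s2h * (sigma2 * b2 ** 2 + eta2)
s21 = s2h * sigma2 * b1 * b2

print(kappa_from_sigma(s11, s21, s22, tau1=5.0))
```

Because η² and s_h² cancel in S, the recovered κ does not depend on the value chosen for the scale factor.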
References
Christensen, J., Diebold, F., and Rudebusch, G. (2011). The Affine Class of Arbitrage-
Free Nelson-Siegel Term Structure Models. Journal of Econometrics, 164:4–20.
Committee of European Insurance and Occupational Pensions Supervisors (2010).
QIS5: Risk Free Interest Rates – Extrapolation Method. Technical report.
eiopa.europa.eu/fileadmin/tx_dam/files/consultations/QIS/QIS5/ceiops-paper-extrapolation-risk-free-rates_en-20100802.pdf.
Dai, Q. and Singleton, K. (2000). Specification Analysis of Affine Term Structure
Models. Journal of Finance, 55:1943–1978.
De Jong, F. (2000). Time-Series and Cross Section Information in Affine Term Structure
Models. Journal of Business and Economic Statistics, 18:300–314.
Diebold, F. and Rudebusch, G. (2013). Yield Curve Modeling and Forecasting: the
Dynamic Nelson-Siegel Approach. Princeton University Press.
Duffee, G. (2002). Term Premia and Interest Rate Forecasts in Affine Models. Journal
of Finance, 57:405–443.
Duffie, D. and Kan, R. (1996). A Yield Factor Model of Interest Rates. Mathematical
Finance, 6:379–406.
European Insurance and Occupational Pensions Authority (2014). Consultation Paper
on a Technical Document Regarding the Risk-free Interest Rate Term Structure.
Report EIOPA-CP-14/042.
European Insurance and Occupational Pensions Authority (2015). Technical Documentation
of the Methodology to Derive EIOPA's Risk-free Interest Rate Term
Structures. EIOPA-BoS-15/035.
Geweke, J. (1991). Efficient simulation from the multivariate normal and student-t
distributions subject to linear constraints and the evaluation of constraint proba-
bilities. In Computing science and statistics: Proceedings of the 23rd symposium
on the interface, pages 571–578.
Smith, A. and Wilson, T. (2001). Fitting Yield curves with long Term Constraints.
Research Notes, Bacon and Woodrow.