Semi-analytic Lattice Integration
of a Markov Functional Term
Structure Model
�...
Christ Church College
University of Oxford
A thesis submitted for the degree of
Master of Science in Mathematical Finance
Hilary 2009
Abstract
One common use of Markov functional models is to approximate LIBOR market
models; to avoid complications, the terminal forward measure is typically used. If
this method is applied to long term structures (ten or more years), the distribution
of the early LIBORs in the term structure has a very large tail, which is normally
not completely captured by common numerical techniques (either Monte Carlo or
grid-based methods).
A numerical method that is frequently applied to Markov functional models is
known as the semi-analytic lattice integrator (Sali) tree. This thesis examines the
implications of the long tails on the Sali tree. Adequate boundary conditions and
grid sizes are derived in order to capture the effect of the long tails.
It turns out that this method either exhibits stability problems or demands a
relatively small lattice spacing. The reason for this is examined in detail and several
variations of the Sali tree to avoid this effect are suggested and analysed. Furthermore
the optimisation of the grid parameters is considered in order to reduce the necessary
computation time.
Contents
1 Introduction
2 Preliminaries
  2.1 Markov Functional Models
    2.1.1 Definition
  2.2 Sali-Trees
  2.3 Cubic Splines
    2.3.1 Stability of Cubic Splines
3 Markov Black-Derman-Toy Model
  3.1 The Model
  3.2 Analytical Calibration
  3.3 Numerical Approach
4 Approaches to Improve Convergence
  4.1 Other Spline Types
    4.1.1 Akima Interpolation
    4.1.2 Tension Splines
      4.1.2.1 Definition
      4.1.2.2 Selection of Tension Factors
      4.1.2.3 Results
  4.2 Splitting off the Asymptotic Behaviour
5 Optimising the Grid
  5.1 Grid Size
  5.2 Lattice Spacing
6 Conclusion
  6.1 Summary
  6.2 Outlook
A Analytical solution of the MBDT-Model
  A.1 Calibration
  A.2 Estimation of the Tails
    A.2.1 Non-Asymptotic Contributions
    A.2.2 Integral over the Tail
B Tension Splines
C Expectation Values of Splines
  C.1 Cubic Splines
  C.2 Tension Splines
D Combined Approaches to Improve Convergence
Bibliography
List of Figures
3.1 Analytic $L_0 - \tilde L_i$ for different tenors and levels of volatility
3.2 $I_t(x)$ analytic according to eq. (3.1)
3.3 Comparison of $a_i$ and $\hat a_i$
3.4 Effect of h
3.5 Oscillations of the cubic spline approximation
4.1 Effect of h with Akima interpolation
4.2 Effect of h with splines under tension
4.3 Tension factor σ′
4.4 Effect of h on $L_0 - \tilde L$ with split asymptotics
4.5 f1 for n = 20 and ψ = 0.2
4.6 Sali $L_0 - \tilde L_i$ for different tenors and levels of volatility
5.1 Effect of different tail extrapolations on $L_0 - \tilde L$ with split asymptotics
5.2 Grid size needed for calibration in figure 5.1
5.3 Sali $L_0 - \tilde L_i$ for different tenors and levels of volatility with optimised grid
D.1 Effect of h on $L_0 - \tilde L$ with split asymptotics and Akima splines
D.2 Effect of h on $L_0 - \tilde L$ with split asymptotics and splines under tension
Chapter 1
Introduction
Markov functional models, suggested by Hunt, Kennedy and Pelsser [11, 9] as tools
for pricing exotic derivatives, have the characteristic property that the discount bond
prices are at any time a function of some low dimensional Markov process.
With LIBOR market models they share the easy calibration to market prices,
but due to the low dimension of the random process they allow for a much more
efficient implementation. So one common use of Markov functional models is to
approximate LIBOR market models. For simplicity, Markov functional models are
typically formulated in the terminal forward measure.
If this method is applied to long term structures (ten years or longer), the dis-
tribution of the early LIBORs in the term structure has a very large tail, which is
typically not captured completely by common numerical methods. Either the grid or
tree is too small or a vast number of Monte Carlo steps would be necessary to capture
these contributions.
A method that is often used to implement Markov functional models is the semi-
analytic lattice integrator (Sali) tree [6]: The backward integration is done on a grid
using exact formulae to integrate piecewise defined functions against the propagation
kernel of the driving stochastic process, interpolating the resulting function at the
prior time.
The goal of this thesis is to analyse the effect of the large tails observed for long
term structures on the accuracy of the Sali tree. The determination of an adequate
grid-size will be examined as well as a proper treatment of the semi-infinite intervals
beyond the grid.
This thesis is structured as follows: In chapter 2 basic concepts are introduced.
Markov functional models are defined and the Sali tree will be outlined. Special
attention is paid to cubic splines, which are normally used as basis functions.
This is followed by the definition of a model that is still analytically solvable but
complex enough to show the main characteristics of a realistic Markov functional
model in chapter 3. The analytical calibration of the model is presented as well
as a Sali approach. In contrast to the standard Sali tree, the contributions of the
semi-infinite intervals beyond the grid are taken into account in order to keep the
effects of the long tails and to derive realistic boundary conditions for the cubic
spline interpolation on the finite grid.
In chapter 4 three variations of the Sali approach are introduced with the goal
of achieving better results at larger lattice spacing. Two are based on different spline
types; the third is based on splitting off the exponential behaviour that is responsible
for the long tails and applying the Sali approach to the remaining “well behaved” functions.
The purpose of chapter 5 is to optimise the grid in order to get precise results
with a minimum of computation time while chapter 6 summarises the results.
Chapter 2
Preliminaries
2.1 Markov Functional Models
In practice, exotic derivatives are priced by calibrating a model to the market prices
of liquid simple derivatives and then using this model to price the exotic derivative.
As such the role of the model could be described as an ’extrapolation tool’. As the
model should describe prices in an efficient market it must be arbitrage free. For
practical applications two other features are important. The model should
• be well-calibrated, i.e. correctly price a large class of liquid instruments without
over-fitting
• allow for an efficient implementation
Hunt, Kennedy and Pelsser [11, 9] suggested a Markov functional model for this purpose,
where the randomness comes through a low dimensional Markov process and
the interest rates are a functional of this random process.
In a single currency economy a Markov functional model can be described as
follows: Let DtT be the value at time t of a zero coupon bond maturing at time T
with DTT = 1. The underlying assets should be a finite number of these bonds with
T ∈ T = {Ti|i = 1, . . . , n}. This is enough for the present purpose, but the treatment
can be generalised to an infinite number of underlyings.
Let Ft be the filtration representing the information available at time t. Any
trading strategy in the market should be self-financing. The value of a portfolio
generated by such a trading strategy is called a price process and any price process
that is positive almost surely is called a numeraire.
Given a numeraire N we assume there is a measure N equivalent to the natural
measure P such that the process (DtT /Nt)T∈T is an {Ft} martingale.
We assume that any derivative can be replicated by a self financing portfolio.
If we further assume that there is a time limit ∂∗, at which the value of a derivative
is determined purely by the asset prices, the value of this derivative at any earlier
time is given by
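$$V_t = N_t\,\mathbb{E}^{\mathbb{N}}\!\left[\frac{V_{\partial^*}}{N_{\partial^*}}\,\middle|\,\mathcal{F}_t\right] = N_t\,\mathbb{E}^{\mathbb{N}}\!\left[\frac{V_T}{N_T}\,\middle|\,\mathcal{F}_t\right] \tag{2.1}$$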
for 0 ≤ t ≤ T ≤ ∂∗.
2.1.1 Definition
Let Xt be a time inhomogeneous Markov process under N and the boundary function
∂S : [0, ∂∗] → [0, ∂∗] with ∂S ≤ S a real function defining the boundary times up to
which the following assumptions on the discount bonds are valid. We assume that
• the prices of the pure discount bonds are a function of the random process:
DtS = DtS(xt) for 0 ≤ t ≤ ∂S (2.2)
• the same should apply to the numeraire N :
Nt = Nt(xt) for 0 ≤ t ≤ ∂∗ (2.3)
Then a Markov functional interest rate model is completely determined by
1. the law of the process X under N
2. the functional form of the numeraire Nt(Xt) for 0 ≤ t ≤ ∂∗
3. the functional form of the discount bonds DtS(xt) at the boundary t = ∂S. Because of
equation (2.1), the form of DtS(xt) is not needed at earlier times.
The calibration of the model demands the determination of these three elements.
2.2 Sali-Trees
A numerical method which is well suited for the treatment of Markov functional
models is the semi-analytic lattice integrator (Sali) tree [3, 6]. Let T = {t1, t2, . . . , tn}
be a set of discrete time steps with ti < tj for i < j. Assume that the conditional pdf
f(xt′|xt) is available for all t, t′ ∈ T , t < t′.
The calculation of the expectation value of a quantity V at a time step ti from
the distribution at ti+1,
$$V_i(x_i) = \mathbb{E}_i[V_{i+1}\mid x_i] = \int f(x_{i+1}\mid x_i)\,V_{i+1}(x_{i+1})\,dx_{i+1}, \tag{2.4}$$
where xi is a short notation for xti , is done in two steps:
1. The integral is evaluated at time ti for a finite number of grid points xi,k,
k = 1 . . . ni. The method therefore relies on the function Vi+1 being such that
the integral (2.4) can be evaluated either analytically or by efficient numerical
integration. This is discussed further below.
2. Then the function Vi(xi) is approximated by fitting a spline or another piecewise
polynomial function to the values on the grid {xi,k}k=1...ni, leading to an interpolating function $\bar V_i(x_i)$.
To do the next time-step, we have to integrate $\bar V_i(x_i)\,f(x_i\mid x_{i-1})$. As $\bar V_i$ is piecewise
a linear combination of the base functions $b_k(x)$, this can be done if the integral
$$\int_{x_{i,j}}^{x_{i,j+1}} b_k(x_i)\,f(x_i\mid x_{i-1})\,dx_i$$
can be evaluated efficiently. This is certainly the case if
Xt is a Brownian motion and cubic splines or any other set of piecewise polynomial
interpolation functions are used. The integrals relevant for this case are given in
Appendix C.1.
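As an illustration of why piecewise polynomials can be integrated exactly against a Gaussian kernel, the following minimal Python sketch computes the partial moments $\int_a^b x^k\,n(x-\mu; v)\,dx$ by a standard recurrence obtained from integration by parts. The function names and the recurrence form are assumptions made for this illustration; they are not necessarily the expressions given in Appendix C.1.

```python
import math


def gauss_pdf(x, mu, v):
    """Normal density with mean mu and variance v."""
    return math.exp(-(x - mu) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)


def gauss_cdf(x, mu, v):
    """Normal distribution function with mean mu and variance v."""
    return 0.5 * math.erfc(-(x - mu) / math.sqrt(2.0 * v))


def poly_moments(a, b, mu, v, kmax=3):
    """Return [M_0, ..., M_kmax] with M_k = int_a^b x^k n(x - mu; v) dx.

    Recurrence (integration by parts):
        M_{k+1} = mu*M_k + v*k*M_{k-1} - v*[x^k n(x - mu; v)]_a^b
    """
    pa, pb = gauss_pdf(a, mu, v), gauss_pdf(b, mu, v)
    m = [gauss_cdf(b, mu, v) - gauss_cdf(a, mu, v)]
    for k in range(kmax):
        prev = m[k - 1] if k > 0 else 0.0
        boundary = b ** k * pb - a ** k * pa
        m.append(mu * m[k] + v * k * prev - v * boundary)
    return m
```

For a cubic piece $c_0 + c_1 x + c_2 x^2 + c_3 x^3$ on one grid interval, the contribution to the expectation would then be `sum(c[k] * m[k] for k in range(4))`.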
To get the iteration started, the payoff VT is approximated by the base functions
bk(x) as well.
In general a much smaller number of grid points is needed compared with conven-
tional trees to get a similar precision. A notable strength of the Sali-tree comes with
payoff functions that are non-continuous either themselves or in their first derivative.
It is possible to define domains of integration where the payoff is well-behaved. This
major advantage of the Sali-tree is described in [6]. As it is not needed for the cali-
bration of the model described in the next chapter this point will not be elaborated
further.
2.3 Cubic Splines
The following section summarises properties of cubic splines that are relevant for
this thesis. A detailed introduction can be found in most textbooks about numerical
mathematics such as [14].
Let ∆ = {a = x0 < x1 < . . . < xn = b} be a partition of the interval (a, b) and
f : (a, b) → R a real function. The cubic spline for f on ∆ is a function S∆ : (a, b) → R
with the following properties:
• S∆(xi) = f(xi) for all i ∈ {0, . . . , n}
• S∆ ∈ C2(a, b), i.e. S∆ is twice continuously differentiable
• On each subinterval (xi, xi+1), S∆ coincides with a polynomial of degree at most three.
To make the cubic spline unique, two further equations are needed which are usu-
ally chosen to be a condition imposed on the spline’s derivatives at the boundaries.
Common boundary conditions are either vanishing second derivative or a fixed first
derivative. The first case is usually referred to as a ’natural spline’.
Determining the parameters of the polynomials boils down to solving a system of linear equations.
The implementation used in this work is taken from [15].
2.3.1 Stability of Cubic Splines
Cubic splines are guaranteed to converge towards the original function with dimin-
ishing distance between the grid points. To be more precise:
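Let $f \in C^4[a,b]$ and assume an $L \in \mathbb{R}$ exists so that $|f^{(4)}(x)| \le L$ for all $x \in [a,b]$. Given
a sequence of grids $\Delta_m = \{a = x_0^{(m)} < x_1^{(m)} < \dots < x_{n_m}^{(m)} = b\}$ with maximal lattice
spacing
$$\|\Delta_m\| = \max_i\left(x_{i+1}^{(m)} - x_i^{(m)}\right) \tag{2.5}$$
and
$$\sup_{m,i}\frac{\|\Delta_m\|}{x_{i+1}^{(m)} - x_i^{(m)}} \le K \tag{2.6}$$
for some number $K \in \mathbb{R}$, then for $i \in \{0,1,2,3\}$ constants $C_i \le 2$ independent of $\Delta_m$
exist, so that for all $x \in [a,b]$
$$\left|f^{(i)}(x) - S^{(i)}_{\Delta_m}(x)\right| \le C_i L K\,\|\Delta_m\|^{4-i}. \tag{2.7}$$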
This ensures that $S_{\Delta_m}$ will eventually converge to f. As the derivatives of f enter
the right hand side of (2.7) via L, the lattice spacing necessary to obtain a decent
precision might be prohibitively small. Note that (2.6) is always fulfilled, with K = 1, as long
as equidistant grids are used. Hall and Meyer [8] were able to prove the following
estimates for $c_i = C_i K$: $c_0 = 5/384$, $c_1 = 1/24$, $c_2 = 3/8$, $c_3 = (K + K^{-1})/2$, where $c_0$ and $c_1$
are optimal.
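The fourth-order convergence in (2.7) can be checked numerically. The sketch below is an illustration only and is not the implementation from [15]: it builds a cubic spline with fixed first derivatives at both ends (solving the tridiagonal system for the second derivatives at the nodes) and measures the maximal error for $f(x) = e^{-ax}$ on an equidistant grid. All function names and the test function are assumptions chosen for the example.

```python
import bisect
import math


def clamped_spline_second_derivs(x, y, d0, dn):
    """Second derivatives at the nodes of a cubic spline with S'(x[0])=d0, S'(x[-1])=dn."""
    n = len(x) - 1
    lower, diag, upper, rhs = ([0.0] * (n + 1) for _ in range(4))
    h0 = x[1] - x[0]
    diag[0], upper[0] = h0 / 3.0, h0 / 6.0
    rhs[0] = (y[1] - y[0]) / h0 - d0
    for i in range(1, n):
        hl, hr = x[i] - x[i - 1], x[i + 1] - x[i]
        lower[i], diag[i], upper[i] = hl / 6.0, (hl + hr) / 3.0, hr / 6.0
        rhs[i] = (y[i + 1] - y[i]) / hr - (y[i] - y[i - 1]) / hl
    hn = x[n] - x[n - 1]
    lower[n], diag[n] = hn / 6.0, hn / 3.0
    rhs[n] = dn - (y[n] - y[n - 1]) / hn
    # Thomas algorithm: forward elimination, then back substitution
    for i in range(1, n + 1):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * (n + 1)
    m[n] = rhs[n] / diag[n]
    for i in range(n - 1, -1, -1):
        m[i] = (rhs[i] - upper[i] * m[i + 1]) / diag[i]
    return m


def spline_eval(x, y, m, t):
    """Evaluate the spline with node second derivatives m at the point t."""
    i = min(max(bisect.bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    A, B = x[i + 1] - t, t - x[i]
    return ((m[i] * A ** 3 + m[i + 1] * B ** 3) / (6.0 * h)
            + (y[i] - m[i] * h * h / 6.0) * A / h
            + (y[i + 1] - m[i + 1] * h * h / 6.0) * B / h)


def max_error(n_intervals, a=1.0, lo=0.0, hi=2.0):
    """Maximal spline error for f(t) = exp(-a t) on an equidistant grid."""
    f = lambda t: math.exp(-a * t)
    x = [lo + (hi - lo) * k / n_intervals for k in range(n_intervals + 1)]
    y = [f(t) for t in x]
    m = clamped_spline_second_derivs(x, y, -a * y[0], -a * y[-1])
    pts = 50 * n_intervals
    ts = [lo + (hi - lo) * k / pts for k in range(pts + 1)]
    return max(abs(spline_eval(x, y, m, t) - f(t)) for t in ts)
```

Halving the spacing, for example comparing `max_error(8)` with `max_error(16)`, should reduce the error by a factor of roughly 16, in line with the exponent 4 in (2.7) for i = 0.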
Chapter 3
Markov Black-Derman-Toy Model
3.1 The Model
To demonstrate the effects of the long tails we choose a model that is still analyti-
cally tractable but shows the main features of a Markov functional model for a term
structure of LIBOR rates. It can be considered as the Markov functional version
of the Black-Derman-Toy model [7]. In this work it will be referred to as ‘Markov
Black-Derman-Toy model’ or MBDT model.
Let $L_i$ be the LIBOR rate from time $T_i$ to $T_{i+1}$ and $\delta_i$ the accrual factor for
that period. In the terminal measure, i.e. using the last discount bond $D_{T_i,T_n}$ as
numeraire, we assume
$$L_i = \tilde L_i\left[1 + \frac{\exp\!\left(-q_i\,\psi(T_i)\int_0^{T_i} g(t)\,dW_t - \tfrac12 q_i^2\,\psi(T_i)^2\int_0^{T_i} g(t)^2\,dt\right) - 1}{q_i}\right] \tag{3.1}$$
where ψ and g are positive, real functions of time, $0 < q_i \le 1$, the $\tilde L_i$ are positive, real
numbers and $W_t$ is a standard Brownian motion under the terminal measure.
We consider only the lognormal case, i.e. $q_i = 1$. The integral $W_{G_i} = \int_0^{T_i} g(t)\,dW_t$
is a normal variable with variance $G_i = \int_0^{T_i} g(t)^2\,dt$ and can thus be considered as
a time-changed Brownian motion. So (3.1) with $q_i = 1$ depends only on $G_i$ and
$\psi_i = \psi(T_i)$:
$$L_i = \tilde L_i\exp\!\left(-\psi_i W_{G_i} - \tfrac12\psi_i^2 G_i\right) = \tilde L_i\,\mathcal{E}_i(-\psi_i W_{G_i}) \tag{3.2}$$
where $\mathcal{E}$ is the Doléans exponential of a continuous martingale $X_t$:
$$\mathcal{E}_t(X) = \exp\!\left(X_t - \tfrac12\operatorname{var}(X_t)\right), \tag{3.3}$$
and we use the shorthand $\mathcal{E}_i(X) = \mathcal{E}_{t_i}(X)$. Note that $\mathbb{E}_s[\mathcal{E}_t(X)] = \mathcal{E}_s(X)$ for all s ≤ t.
We use the short form $D_{i,j} = D_{T_i,T_j}(W_{G_i})$ with i < j for the discount bonds and
$\hat D_{i,j} = D_{i,j}/D_{i,n}$ for the numeraire-adjusted discount bonds in the terminal measure.
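A short worked identity may help with the calculations that follow. It is derived here directly from definition (3.3) for the time-changed Brownian motion $W_{G_\cdot}$ with $\operatorname{var}(W_{G_i}) = G_i$; the constant $Y \ge 0$ is an arbitrary placeholder introduced for this illustration.

```latex
\begin{aligned}
\mathcal{E}_i(-Y W_{G_\cdot})\,\mathcal{E}_i(-\psi_i W_{G_\cdot})
  &= \exp\!\left(-(Y+\psi_i)\,W_{G_i} - \tfrac12\,(Y^2+\psi_i^2)\,G_i\right) \\
  &= e^{\psi_i Y G_i}\,
     \exp\!\left(-(Y+\psi_i)\,W_{G_i} - \tfrac12\,(Y+\psi_i)^2\,G_i\right) \\
  &= e^{\psi_i Y G_i}\;\mathcal{E}_i\!\left(-(Y+\psi_i)\,W_{G_\cdot}\right),
\qquad\text{hence}\qquad
\mathbb{E}_0\!\left[\mathcal{E}_i(-Y W_{G_\cdot})\,\mathcal{E}_i(-\psi_i W_{G_\cdot})\right]
  = e^{\psi_i Y G_i}.
\end{aligned}
```

The last equality uses that a Doléans exponential of this form has expectation one. Products of such exponentials therefore stay within the same family, up to a deterministic factor.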
3.2 Analytical Calibration
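Assume that $\psi_i$ and $G_i$ are given. For the model calibration we then only need to
determine the $\tilde L_i$ from the market values at time t = 0, i.e. $D_{0,i}$. With
$$\hat D_{i,i+1}\,D_{i,n} = D_{i,i+1} = \left(1 + \delta_i L_i(T_i)\right)^{-1} \tag{3.4}$$
we arrive at
$$\hat D_{j,i} = \frac{D_{j,i}}{D_{j,n}} = \mathbb{E}_j\!\left[\frac{1}{D_{i,n}}\right] \tag{3.5}$$
$$\hat D_{j,i} = \mathbb{E}_j\!\left[\hat D_{i,i+1}\left(1 + \delta_i\tilde L_i\,\mathcal{E}_i(-\psi_i W_{G_\cdot})\right)\right] \tag{3.6}$$
where j < i and $\mathbb{E}_i[\cdot] = \mathbb{E}[\cdot\mid\mathcal{F}_{T_i}]$ is the expectation value given the filtration at time
$T_i$. For j = 0 in particular we get
$$\hat D_{0,i} = \hat D_{0,i+1} + \delta_i\tilde L_i\,\mathbb{E}_0\!\left[\mathcal{E}_i(-\psi_i W_{G_\cdot})\,\hat D_{i,i+1}\right] \tag{3.7}$$
which allows us to calculate $\tilde L_i$ from the initial prices of the zero bonds $D_{0,i}$ and $D_{0,i+1}$
and the distribution of $\hat D_{i,i+1}(W_{G_i})$. Starting with
$$\hat D_{j,n} = 1 \quad \forall j \le n \tag{3.8}$$
we can determine the $\tilde L_i$ by backward induction. In Appendix A.1 it is shown that $\hat D_{i,i+1}$ has the
form
$$\hat D_{i,i+1} = \sum_{j=0}^{2^{n-i-1}-1} X_{i,j}\,\mathcal{E}_i(-Y_{i,j} W_{G_\cdot}) \tag{3.9}$$
with constants $X_{i,j}$ and $Y_{i,j}$ given by (A.3, A.4) that depend only on those $\tilde L_j$ with
j > i. We can thus get $\tilde L_i$ using (3.7):
$$\tilde L_i = \frac{\hat D_{0,i} - \hat D_{0,i+1}}{\delta_i\,\mathbb{E}_0\!\left[\mathcal{E}_i(-\psi_i W_{G_\cdot})\,\hat D_{i,i+1}\right]} \tag{3.10}$$
From (3.6) with i = j we get the functional form of the numeraire $D_{i,n}$. Together
with the definition of the Markov process W and the form of the discount bonds at
the boundary $D_{n,n} = 1$ the definition of the Markov functional model is complete
according to section 2.1.1.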
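The backward induction can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's implementation: the term list `(X, Y)` is built directly from (3.6) and (3.8), using the fact that a product of two Doléans exponentials with coefficients Y and ψ equals $e^{\psi Y G_i}$ times one with coefficient Y + ψ. The resulting constants need not be written in the same form as (A.3, A.4), and the function name and argument layout are hypothetical.

```python
import math


def calibrate_mbdt(discount, delta, psi, G):
    """Return the list of model parameters L~_i, i = 0..n-1.

    discount[i] = D_{0,i} for i = 0..n (market discount factors),
    delta[i], psi[i], G[i] = accrual factor, volatility and variance G_i.
    The cost grows like 2^n because the number of terms doubles per step.
    """
    n = len(discount) - 1
    d_hat0 = [d / discount[n] for d in discount]  # numeraire-adjusted bonds at t = 0
    terms = [(1.0, 0.0)]  # D^_{n-1,n} = 1: a single term with X = 1, Y = 0
    l_tilde = [0.0] * n
    for i in range(n - 1, -1, -1):
        # terms represent D^_{i,i+1} = sum_j X_j * E_i(-Y_j W)
        # E_0[E_i(-psi_i W) E_i(-Y W)] = exp(psi_i * Y * G_i)
        denom = delta[i] * sum(X * math.exp(psi[i] * Y * G[i]) for X, Y in terms)
        l_tilde[i] = (d_hat0[i] - d_hat0[i + 1]) / denom  # eq. (3.10)
        # eq. (3.6) with j = i - 1: each term splits into two
        new_terms = []
        for X, Y in terms:
            new_terms.append((X, Y))
            new_terms.append((X * delta[i] * l_tilde[i] * math.exp(psi[i] * Y * G[i]),
                              Y + psi[i]))
        terms = new_terms
    return l_tilde
```

With a flat 5% curve, `discount = [1.05 ** -i for i in range(11)]`, unit accrual factors, `psi = [0.2] * 10` and `G = [float(i) for i in range(10)]`, the last parameter comes out at the forward rate of 5% and the earlier ones lie below it.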
Below, results for $\tilde L$ are shown for the following parameters: the values of the
discount bonds at time T0 are chosen to give a flat initial LIBOR rate of L0 = 5%, and
different tenor structures with yearly resets and annual compounding are used. For
simplicity we choose Gi = Ti. Figure 3.1 shows the convexity adjustment $L_0 - \tilde L$ for
different time-independent levels of the volatility ψ.
Figure 3.1: $L_0 - \tilde L$ with $\tilde L$ calculated analytically according to (3.10), a flat initial LIBOR rate of L0 = 5%, a tenor of (from top left) 10, 20 and 30 years and different values of the volatility ψ, which is assumed to be time-independent
We now take a closer look at the functional form of $\hat D_{i,i+1}(x_t)$. As can be seen
from equation (3.9), it is a sum of terms that are each proportional to $\exp(-Y_{i,j}x)$.
The expectation values of these terms are
$$\begin{aligned}
\mathbb{E}_t\left[\exp(-Y_{i,j}x)\right] &= \frac{1}{\sqrt{2\pi(G_{i+1}-G_t)}}\int_{-\infty}^{\infty}\exp\!\left(-\frac12\left(\frac{(x-x_t)^2}{G_{i+1}-G_t} + 2Y_{i,j}x\right)\right)dx \\
&= \frac{1}{\sqrt{2\pi(G_{i+1}-G_t)}}\exp\!\left(\frac12 Y_{i,j}^2\,(G_{i+1}-G_t) - Y_{i,j}x_t\right) \\
&\quad\times\int_{-\infty}^{\infty}\exp\!\left(-\frac12\,(G_{i+1}-G_t)\left(\frac{x-x_t}{G_{i+1}-G_t} + Y_{i,j}\right)^{2}\right)dx .
\end{aligned} \tag{3.11}$$
For a given $x_t$ the main contribution to the integral is at $x = x_t - Y_{i,j}(G_{i+1}-G_t)$
where, according to (A.4), $Y_{i,j}$ could get as large as $\sum_{k=i+1}^{n}\psi_k$, leading to contributions
far from the central value x = 0. This behaviour is most striking in the denominator
of (3.10), where t = 0. Figure 3.2 shows the integrand
$$I_i(x) = \mathcal{E}_i(-\psi_i W_{G_i})\,\hat D_{i,i+1}\cdot n(x; G_i). \tag{3.12}$$
For i = 4, for example, there are significant contributions at x ≈ −12, six times the standard
deviation from the central value. This observation becomes relevant
as soon as we use numerical methods to calibrate the model. Trees, finite differences
and Monte Carlo methods all have some kind of finite cut-off for x: the first two
explicitly through the grid size, the latter implicitly, as a finite number of runs leaves
only a negligible probability of sampling several standard deviations from the central value. For a
Monte Carlo integration that uses a uniform sampling of the distribution $W_{G_i}$ over
a region including six standard deviations from the central value, about $10^{11}$ Monte
Carlo steps would be necessary. The effect for Monte Carlo simulation has already
been investigated by Merrill Lynch Quantitative Risk Management [10].
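To give a feeling for how rarely a plain sampler visits such a region, the following small Python sketch evaluates the one-sided normal tail probability beyond k standard deviations. It is only an order-of-magnitude illustration under the assumption of direct sampling from the normal distribution; it does not reproduce the step count quoted above, which refers to a different sampling scheme. The function name is hypothetical.

```python
import math


def one_sided_tail(k):
    """Probability that a standard normal variable falls below -k."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


p6 = one_sided_tail(6.0)
# On average about 1/p6 direct draws are needed for a single sample beyond
# six standard deviations on one side.
draws_per_hit = 1.0 / p6
```

The value of `p6` is of the order of $10^{-9}$, so even a billion direct draws place on average only about one sample in the region that carries the significant contribution.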
3.3 Numerical Approach
The expense of calculating the $\tilde L_i$ numerically using the above analytic solution is of
order $2^n$, which becomes prohibitively large for longer tenors. So even in
this case of an analytically solvable model a numerical approach like the Sali-tree is
necessary. As there are no discontinuities involved in D, we do not have to care about
different domains of integration.
Let $x_{i,k}$ be the grid for the underlying stochastic process $W_G$ at time-step i and
$\bar D_{i,j}$ the Sali approximation for $\hat D_{i,j}$. Then the Sali step i → (i − 1) is:
1. Starting with $\bar D_{i,i+1}$ calculate $\tilde L_i$ using (3.10).
2. For each $x_{i-1,k}$ evaluate
$$(\bar D_{i-1,i})_k = \mathbb{E}\!\left[\left(1 + \delta_i\tilde L_i\exp(-\psi_i x_i - \psi_i^2 G_i/2)\right)\bar D_{i,i+1}\,\middle|\,W_{G_{i-1}} = x_{i-1,k}\right] \tag{3.13}$$
3. From these $(\bar D_{i-1,i})_k$ determine $\bar D_{i-1,i}$ by fitting the splines.
Here we consider cubic splines that are continuous in their second derivative as
described in section 2.3.
Figure 3.2: $I_t(x)$ (eq. 3.12) for a tenor of 20 years, for several t and ψ = 0.15 (above) and ψ = 0.3 (below). The other parameters are chosen as in fig. 3.1. While we see only a slight asymmetry for ψ = 0.15, the contributions far from the central value are significant for the larger volatility.
Apart from the question which splines to use, the main decisions necessary for
applying this method are the boundary conditions and the placement of grid points.
For the time being we assume equally spaced grid points $(-\bar x, -\bar x + h, \dots, \bar x)$ with
$h = 2\bar x/N$, but we take a closer look at the boundary conditions:
As the single summands in (3.9) grow exponentially for x < 0, natural boundary
conditions, i.e. vanishing second derivative at the boundaries, are clearly inappro-
priate. Instead, we try to get a reasonable estimate for the first derivative at the
boundaries. As the method that is developed here should not be limited to the sim-
ple case of a model that is in principle solvable analytically, we will not use detailed
knowledge about the analytical form of Di,n(xi).
Instead, estimates for the asymptotic behaviour of Di,n(xi) for xi ≫ 0 and
xi ≪ 0 are needed.
For xi → ∞ from (3.2) we get Lj → 0 a.s. ∀j > i. So Di,n(xi) → 1 for xi → ∞.
As Lj is monotonic in xi, this is also true for Di,n(xi) and we may assume a zero
first derivative at the upper boundary.
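To determine the boundary conditions for $x_i \to -\infty$, define $\Delta_j = x_j - x_i$ for j > i.
From (3.4) and (3.2) we get for $x \ll 0$
$$D_{j,j+1}(x_j) = \left(1 + \delta_j L_j(x_i + \Delta_j)\right)^{-1} \sim \exp\!\left(\psi_j(x_i + \Delta_j)\right) \tag{3.14}$$
and therefore
$$\mathbb{E}\!\left[D_{j,j+1}\mid W_{G_i} = x_i\right] \sim \exp(\psi_j x_i). \tag{3.15}$$
With (3.5)
$$\hat D_{i,i+1} = \mathbb{E}\!\left[\frac{1}{D_{i+1,n}}\,\middle|\,W_{G_i} = x_i\right] \sim \exp\!\left(-x_i\sum_{j=i+1}^{n-1}\psi_j\right). \tag{3.16}$$
So we assume that $\hat D_{i,i+1}(x_i)$ has the asymptotic form
$$\hat D_{i,i+1}(x_i) \sim b_i\exp(-a_i x_i) \quad\text{for } x_i \ll 0 \tag{3.17}$$
and the boundary condition at $x_i = -\bar x$ is $\bar D'_{i,i+1} = -a_i b_i\exp(-a_i x_i)$. To determine
$a_i$ and $b_i$ we could either set $a_i$ to $\hat a_i = \sum_{j=i+1}^{n-1}\psi_j$ and determine $b_i$ from the first grid
point, $b_i = \exp(a_i x_{i,0})\,(\bar D_{i,i+1})_0$ with $x_{i,0} = -\bar x$, or we could determine both $a_i$ and $b_i$ from the first two
grid points:
$$a_i = \ln\!\left((\bar D_{i,i+1})_0/(\bar D_{i,i+1})_1\right)/h \tag{3.18}$$
$$b_i = (\bar D_{i,i+1})_0\exp(a_i x_{i,0}). \tag{3.19}$$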
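The two-point estimate can be written as a small Python helper. This is a sketch of (3.18) and (3.19) only; the function name is hypothetical, `x0` denotes the first (most negative) grid point, and `d0`, `d1` are the function values at the first two grid points.

```python
import math


def fit_lower_tail(x0, h, d0, d1):
    """Fit d(x) ~ b * exp(-a * x) through the first two grid points x0 and x0 + h."""
    a = math.log(d0 / d1) / h    # eq. (3.18)
    b = d0 * math.exp(a * x0)    # eq. (3.19), x0 is the first grid point
    return a, b


def lower_boundary_derivative(a, b, x0):
    """First derivative of the fitted exponential at the lower boundary."""
    return -a * b * math.exp(-a * x0)
```

For values sampled from an exact exponential the fit recovers a and b up to rounding, and comparing the fitted a with the theoretical sum of the remaining volatilities gives the consistency check discussed in the text.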
For a correct estimation of the asymptotic behaviour it is important to choose $\bar x$
large enough so that the asymptotic behaviour dominates all other terms for $|x| > \bar x$.
An estimation of $\hat D_{i,i+1}(x_i)/(b_i\exp(-a_i x_i))$ for $|x| > \bar x$ would be helpful but is in most
practical cases unrealistic to achieve. Instead, the difference $a_i - \hat a_i$ will be used as a
consistency check to see if $\bar x$ is chosen large enough so that the asymptotic behaviour
can be assumed for $x < -\bar x$. $\bar x$ will be chosen so that $a_i - \hat a_i$ stays within a fixed
interval $(-\delta_a, \delta_a)$ for all i.
In this special case of an analytically solvable model, the role of the asymptotic
behaviour can be seen by comparing it to the analytic solution (3.9). The exponential
with maximal coefficient is singled out as the leading term for x → −∞. It depends
on the factors Xi,j what value of x is necessary for this exponential to dominate the
others. For the derivation of an upper limit to the non-asymptotic terms see appendix
A.2. The result justifies the use of $a_i - \hat a_i$ as an indicator for an adequate grid size.
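The boundary effects on the first grid points have to be taken into account. When
taking the expectation value (3.13) it is insufficient to calculate the integral from $-\bar x$
to $\bar x$, as this will lead to wrong results for $x_k$ close to $\pm\bar x$ and to wrong boundary
conditions. At each further time-step these errors will cause deviations closer to the
centre of the grid. Instead, the integral for the semi-infinite intervals $(-\infty, -\bar x)$ and
$(\bar x, \infty)$ will be approximated by the integral over the asymptotic behaviour:
$$\begin{aligned}
(\bar D_{i-1,i})_k &= \int_{-\infty}^{\infty}\left(1 + \delta_i\tilde L_i e^{-\psi_i x - \psi_i^2 G_i/2}\right)\bar D_{i,i+1}(x)\,n(x - x_k;\,G_i - G_{i-1})\,dx \\
&\approx b_{i-1}\int_{-\infty}^{-\bar x} e^{-a_{i-1}x}\,n(x - x_k;\,G_i - G_{i-1})\,dx \\
&\quad + \sum_{j=0}^{N-1}\int_{(x_i)_j}^{(x_i)_{j+1}}\left(1 + \delta_i\tilde L_i e^{-\psi_i x - \psi_i^2 G_i/2}\right)\bar D_{i,i+1}(x)\,n(x - x_k;\,G_i - G_{i-1})\,dx \\
&\quad + \int_{\bar x}^{\infty} n(x - x_k;\,G_i - G_{i-1})\,dx
\end{aligned} \tag{3.20}$$
$$\begin{aligned}
&= \sum_{j=0}^{N-1}\int_{(x_i)_j}^{(x_i)_{j+1}}\left(1 + \delta_i\tilde L_i e^{-\psi_i x - \psi_i^2 G_i/2}\right)\bar D_{i,i+1}(x)\,n(x - x_k;\,G_i - G_{i-1})\,dx \\
&\quad + \frac{b_{i-1}}{2}\,e^{-a_{i-1}x_k + (G_i - G_{i-1})a_{i-1}^2/2}\left(1 + \operatorname{erf}\!\left(\frac{-\bar x - x_k + a_{i-1}(G_i - G_{i-1})}{\sqrt{2(G_i - G_{i-1})}}\right)\right) \\
&\quad + \frac12\left(1 - \operatorname{erf}\!\left(\frac{\bar x - x_k}{\sqrt{2(G_i - G_{i-1})}}\right)\right).
\end{aligned} \tag{3.21}$$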
As $\bar D_{i,i+1}$ is a polynomial of third order in each of the intervals $(x_{i,j}, x_{i,j+1})$, the
integral can be solved analytically.
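The two closed-form tail terms of (3.21) can be sketched in Python as follows. The function and argument names are assumptions made for this illustration: `v` stands for the variance $G_i - G_{i-1}$ of the transition kernel, `x_bar` for the grid boundary, and `a`, `b` for the fitted tail parameters. The sketch uses `erfc` through the identities $1 + \operatorname{erf}(z) = \operatorname{erfc}(-z)$ and $1 - \operatorname{erf}(z) = \operatorname{erfc}(z)$, which avoids cancellation far out in the tails.

```python
import math


def lower_tail(b, a, x_bar, x_k, v):
    """Integral of b*exp(-a*x)*n(x - x_k; v) over (-inf, -x_bar)."""
    z = (-x_bar - x_k + a * v) / math.sqrt(2.0 * v)
    return 0.5 * b * math.exp(-a * x_k + 0.5 * v * a * a) * math.erfc(-z)


def upper_tail(x_bar, x_k, v):
    """Integral of n(x - x_k; v) over (x_bar, inf), i.e. with integrand limit 1."""
    return 0.5 * math.erfc((x_bar - x_k) / math.sqrt(2.0 * v))
```

For very large `a * x_bar` the exponential prefactor can overflow in double precision, so a production implementation would probably work with logarithms; that refinement is left out of this sketch.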
To verify whether $\bar x$ is large enough for the asymptotic behaviour described above
to be a good approximation, we compare the theoretical value $\hat a_i$ to the value $a_i$ calculated
from the first two grid points using (3.18). Figure 3.3 shows the numerical value
$a_i$ compared to the theoretical value $\hat a_i$ for n = 20, h = 0.25, ψ ∈ {0.15, 0.3} and
several values of $\bar x$ as a function of the time-step i.
As we can see, the asymptotic behaviour described above is a good approximation
for these parameters if $\bar x > 50$. For ψ = 0.3 the effect of the contributions far from
the central value is clearly visible: even with $\bar x = 30$ the asymptotic behaviour can
be observed for large i, but large deviations from it appear
as soon as the additional peak shown in figure 3.2 becomes significant.
In the present and the next chapter the numerical examples use a tenor of 20 years
and ψ between 0.15 and 0.3. Figure 3.3 illustrates that a grid size given by $\bar x = 60$ is fully
appropriate. In section 5.1 the choice of an optimal grid size will be investigated
further.
Figure 3.3: $\hat a_i$ and $a_i$ according to (3.18) as a function of the time-step i for n = 20, h = 0.25, different grid sizes $\bar x$ and ψ = 0.15 (left) and ψ = 0.3 (right)
After determining the boundary conditions and the grid size, the spacing between
the grid points must be set. For the time being we stick with equally spaced points
and vary h. Figure 3.4 illustrates that h has to be chosen relatively small to get
reliable results. For any larger grid spacing the numerical solution is good only up to some
point i; for any j < i, $\tilde L_j$ is practically zero.
The reason for this lies in a well known problem with cubic splines (see e.g.
[14, 2]): Cubic splines are a global interpolation method, which means that every
grid point affects the parameters for the spline in every single interval. This can lead
to oscillations in the whole domain. For a series of lattices $\Delta_m$ on a finite interval
$[-\bar x, \bar x]$ with lattice spacing $\|\Delta_m\| \to 0$, a function $f \in C^4[-\bar x, \bar x]$ and corresponding
cubic spline $S_{\Delta_m}$, the convergence theorem for cubic splines (see section 2.3.1) states
that $S_{\Delta_m}$ converges uniformly to f. But this convergence is influenced by the
fourth derivative of f:
$$|f(x) - S_{\Delta_m}(x)| \le c_0 L\,\|\Delta_m\|^4 \tag{3.22}$$
with $|f^{(4)}(x)| \le L$ for all $x \in [-\bar x, \bar x]$ and $c_0 = 5/384$. In our case we have a series of
functions $f_i$ that grow exponentially. At the lower boundary $f_i(x) \approx b_i\exp(-a_i x)$.
So $L \ge f_i^{(4)}(-\bar x) \approx f_i(-\bar x)\,a_i^4$, and as soon as $c_0(a_i h)^4 > 1$ the error might even be
larger than the function value itself. Figure 3.5 shows $\bar D_{i,i+1}(x)$ for i = 11, 10, 9, 8.
It can clearly be seen that the oscillations first set in even with ‘well behaved’ grid points
and that at a later step these oscillations lead to implausible grid points.
In principle this problem could be solved by choosing h small enough, but this
would lead to an excessive need of computing time. The following chapter will show
alternative approaches.
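The criterion $c_0(a_i h)^4 > 1$ translates into a rough upper limit for the lattice spacing. The sketch below evaluates it for a time-independent volatility, assuming the theoretical exponent given by the sum of the volatilities $\psi_j$ for j = i+1, ..., n−1. It should be read as an indicator derived from the bound (3.22), not as a sharp stability limit, and the function names are hypothetical.

```python
C0 = 5.0 / 384.0  # optimal constant of the spline error bound


def a_hat(i, n, psi):
    """Theoretical exponent sum_{j=i+1}^{n-1} psi_j for constant psi."""
    return psi * (n - 1 - i)


def critical_spacing(a):
    """Spacing h at which c0 * (a*h)**4 reaches 1."""
    return (1.0 / C0) ** 0.25 / a


h_crit_02 = critical_spacing(a_hat(0, 20, 0.2))  # roughly 0.78
h_crit_03 = critical_spacing(a_hat(0, 20, 0.3))  # roughly 0.52
```

Under these assumptions the indicator for n = 20 lies below h = 1 for both ψ = 0.2 and ψ = 0.3 at the earliest time-steps, which is at least qualitatively in line with the need for a relatively small h described above.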
Figure 3.4: The effect of h on the quality of the Sali approach to $L_0 - \tilde L_i$ with n = 20, $\bar x = 60$, ψ = 0.3 (above) and ψ = 0.2 (below). Curves for h = 2, 1, 0.5 and 0.25 are compared with the analytic result.
Figure 3.5: $\bar D_{i,i+1}(x)$ for i = 11, 10, 9, 8 for n = 20, $\bar x = 60$, ψ = 0.2 and h = 1. The dotted line connects the values at the grid points, the solid line shows the cubic spline defined by these points. It can clearly be seen that the oscillations of the spline that set in at i = 11 lead to inconsistent values at the grid points for smaller i.
Chapter 4
Approaches to Improve
Convergence
To handle the oscillations that were observed for the spline approximation, basically
two approaches can be used. Either the method used to calculate the splines can be
varied or the function that is approximated can be changed. Both ideas are further
investigated in the following sections.
4.1 Other Spline Types
Though cubic splines are much less prone to over-oscillation than e.g. fitting of a
polynomial, the phenomenon is well known in the literature (see e.g. [14, 2]). This
is in part due to the cubic spline’s lack of locality. A local change in the input data
will modify the curve even far away from this point. Several other versions of splines
have been presented to handle this problem. We will use Akima interpolation [1] and
tension splines [13, 14, 4].
Again let ∆ = {a = x0 < x1 < . . . < xN = b} be a partition of the interval
(a, b) and y1 . . . yN ∈ R. As for cubic splines we seek a function y : (a, b) → R with
y(xi) = yi ∀1 ≤ i ≤ N , but the additional conditions that make y unique differ for
each type of spline.
4.1.1 Akima Interpolation
Like cubic splines, Akima interpolation is based on cubic polynomials. The condition
of a continuous second derivative is abandoned and instead the first derivative at a
given grid point is estimated using the neighbouring points. With
$$q_i = \frac{y_i - y_{i-1}}{x_i - x_{i-1}} \quad\text{for } i \in \{1,\dots,N\} \tag{4.1}$$
the first derivative at $x_i$ is estimated as
$$y'_i = \frac{q_i\,|q_{i+2} - q_{i+1}| + q_{i+1}\,|q_i - q_{i-1}|}{|q_{i+2} - q_{i+1}| + |q_i - q_{i-1}|} \quad\text{for } i \in \{3,\dots,N-2\} \tag{4.2}$$
If $q_{i+2} = q_{i+1}$ and $q_i = q_{i-1}$ then $y'_i$ cannot be derived from (4.2). In this case
$y'_i = (q_i + q_{i+1})/2$.
The derivative at the first two points and the last two points has to be chosen by
other means. For $\bar D_{j,j+1}(x_j)$ we use the asymptotic behaviour described in section 3.3
and assume
$$\bar D'_{j,j+1}(x_{j,i}) = \begin{cases} -a_j b_j\exp(-a_j x_{j,i}) & \text{for } i \in \{1, 2\} \\ 0 & \text{for } i \in \{N, N-1\} \end{cases} \tag{4.3}$$
with $a_j$ and $b_j$ from (3.18) and (3.19).
Considering an interval $(x_1, x_2)$ with $y_1, y_2, y'_1, y'_2$ given, the polynomial can be
expressed as
$$y(x) = p_0 + p_1(x - x_1) + p_2(x - x_1)^2 + p_3(x - x_1)^3 \tag{4.4}$$
where
$$p_0 = y_1 \tag{4.5}$$
$$p_1 = y'_1 \tag{4.6}$$
$$p_2 = \left[3(y_2 - y_1)/(x_2 - x_1) - 2y'_1 - y'_2\right]/(x_2 - x_1) \tag{4.7}$$
$$p_3 = \left[y'_1 + y'_2 - 2(y_2 - y_1)/(x_2 - x_1)\right]/(x_2 - x_1)^2 . \tag{4.8}$$
The integrations needed to determine the expectation values of the polynomials can
again be done using appendix C.1.
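A minimal Python sketch of the Akima derivative estimate and of the cubic coefficients (4.5) to (4.8) is given below. It uses 0-based indices, so `slope[k]` is the secant slope on the segment between `x[k]` and `x[k+1]`; the weights follow Akima's standard rule, in which each neighbouring slope is weighted by the change of slope on the opposite side. Function names are hypothetical, and the boundary derivatives (4.3) are not covered.

```python
def akima_derivatives(x, y):
    """Akima estimate of y' at the points with two neighbours on each side.

    Returns a dict {index: derivative} for the 0-based indices 2 .. len(x)-3.
    """
    slope = [(y[k + 1] - y[k]) / (x[k + 1] - x[k]) for k in range(len(x) - 1)]
    deriv = {}
    for i in range(2, len(x) - 2):
        w_left = abs(slope[i + 1] - slope[i])       # weight of the slope left of x[i]
        w_right = abs(slope[i - 1] - slope[i - 2])  # weight of the slope right of x[i]
        if w_left + w_right == 0.0:
            # degenerate case: plain average of the two adjacent slopes
            deriv[i] = 0.5 * (slope[i - 1] + slope[i])
        else:
            deriv[i] = (w_left * slope[i - 1] + w_right * slope[i]) / (w_left + w_right)
    return deriv


def hermite_coefficients(x1, x2, y1, y2, d1, d2):
    """Coefficients (p0, p1, p2, p3) of the cubic in powers of (x - x1)."""
    h = x2 - x1
    s = (y2 - y1) / h
    return y1, d1, (3.0 * s - 2.0 * d1 - d2) / h, (d1 + d2 - 2.0 * s) / (h * h)
```

For data with a kink, such as `y = [0, 0, 0, 0, 1, 2, 3]` on `x = [0, 1, ..., 6]`, the estimate keeps the flat and the linear part exact and only averages at the kink itself, which is the locality that makes the method less prone to oscillations.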
Figure 4.1 shows $L_0 - \tilde L_i$ calculated using a Sali tree with Akima interpolation.
Compared to figure 3.4 it shows no abrupt transition to the over-oscillating behaviour
and an overall better convergence.
4.1.2 Tension Splines
Splines under tension, first suggested by Schweikert in 1966 [13], are a common tool
for shape preserving interpolation, i.e. for interpolation that should avoid the spurious
oscillations observed for cubic splines. The name comes from the fact that the curve
can be interpreted as a very light and flexible bar, that is not only constrained to run
through certain points, but is also ’pulled’ at the ends. The strength of this tension
is described by a tension factor σ.
Figure 4.1: The effect of h on the quality of the Sali approach with Akima interpolation instead of cubic splines. The graphs show L0 − Li with n = 20, x̄ = 60, ψ = 0.3 (above) and ψ = 0.2 (below). [Plots not reproduced: horizontal axis i from 0 to 20, vertical axis from 0 to 5; curves for h = 2, 1, 0.5, 0.25 and the analytic solution.]
4.1.2.1 Definition
For σ = 0 no tension is applied, which should lead to normal cubic splines. For
σ → ∞ the tension should minimise the length of the curve, leading to straight lines
connecting the given function values, thus avoiding spurious oscillations but losing
smoothness and any non-spurious extrema.
The derivation of the formulation of splines under tension used here can be found
in [4]. It is summarised in appendix B. The basic assumptions are a continuous
second derivative of y and piecewise linearity of the term $y''(x) - \sigma^2 y(x)$ in each
interval $(x_i, x_{i+1})$. This leads to the following form of the interpolation:
\[ y(x) = \frac{y^{(2)}_{i+1} \sinh(\sigma'(x - x_i)) + y^{(2)}_i \sinh(\sigma'(x_{i+1} - x))}{\sinh(\sigma'(x_{i+1} - x_i))} + \frac{(y_{i+1} - y^{(2)}_{i+1})(x - x_i) + (y_i - y^{(2)}_i)(x_{i+1} - x)}{x_{i+1} - x_i} \tag{4.10} \]
for $x \in (x_i, x_{i+1})$, where
\[ \sigma' = \sigma \cdot \frac{x_N - x_1}{n - 1} \tag{4.11} \]
is used instead of σ and the parameters $y^{(2)}_i$ are determined by a system of linear
equations very similar to cubic splines. Boundary conditions are needed either for
$y'(x_j)$ or $y''(x_j)$ with j ∈ {1, N}. The same conditions as in section 3.3 are used. An
algorithm to determine $y^{(2)}_i$ is given in [5].
The use of σ′ instead of σ is to avoid scaling effects when the grid size is changed.
The integrations that are needed to determine the expectation values of the tension
splines can still be done analytically. The results are given in appendix C.2.
4.1.2.2 Selection of Tension Factors
Special care has to be taken when choosing the tension factor σ′. If it is too small,
the same problems as with cubic splines will be observed. If it is chosen too large,
smoothness will be lost and extrema will be underestimated. If additional conditions
are known about the function, like convexity or bounds on a derivative, it is in principle
possible to determine the set of σ′ ≥ 0 for which the constraint is satisfied [12].
In this work no such conditions are used as they would limit the applicability of
the results to more complex models for which no analytical solution is known. So
another way has to be found to rule out values of σ′ that lead to unstable results. To
avoid spurious oscillations as observed in figure 3.5, σ′ will be chosen so that within
any two adjacent intervals (xi−1, xi), (xi, xi+1) there is at most one inflexion point.
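The derivatives of y in $I_i = (x_i, x_{i+1})$ are:
\[ y'(x) = \sigma' \left[ y^{(2)}_{i+1} \cosh(\sigma'(x - x_i)) - y^{(2)}_i \cosh(\sigma'(x_{i+1} - x)) \right] / \sinh(\sigma' h_i) + \left[ y_{i+1} - y_i - y^{(2)}_{i+1} + y^{(2)}_i \right] / h_i \tag{4.12} \]
\[ y''(x) = \sigma'^2 \left[ y^{(2)}_{i+1} \sinh(\sigma'(x - x_i)) + y^{(2)}_i \sinh(\sigma'(x_{i+1} - x)) \right] / \sinh(\sigma' h_i) \tag{4.13} \]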
with $h_i = x_{i+1} - x_i$. As all sinh terms in y′′ are positive for $x_i < x < x_{i+1}$, y′′ has
no zeros in $I_i$ if $y^{(2)}_{i+1} y^{(2)}_i > 0$. Now assume $y^{(2)}_{i+1} y^{(2)}_i < 0$. As $y''(x_j) = \sigma'^2 y^{(2)}_j$, the second
derivative changes sign and therefore has at least one zero in $I_i$:
\[ y''(x) = 0 \iff |y^{(2)}_{i+1}| \sinh(\sigma'(x - x_i)) = |y^{(2)}_i| \sinh(\sigma'(x_{i+1} - x)) . \tag{4.14} \]
As the left-hand side is zero for $x = x_i$ and strictly increasing in $I_i$ and the right-hand
side is zero for $x = x_{i+1}$ and strictly decreasing in $I_i$, exactly one $x \in I_i$ exists
with y′′(x) = 0. With the same reasoning we see that if $y^{(2)}_i = 0$, there are no further
zeros of y′′ in $(x_{i-1}, x_{i+1})$ unless y′′ vanishes identically there; in either case this counts
as at most one inflexion point.
To summarise: if $y^{(2)}_{i+1} y^{(2)}_i > 0$ there is no inflexion point in $I_i$, with $y^{(2)}_{i+1} y^{(2)}_i < 0$
there is exactly one inflexion point in $I_i$, and with $y^{(2)}_i = 0$ there is at most one inflexion
point in $(x_{i-1}, x_{i+1})$.
When the calibration of the splines is started, σ′ is set to a positive, but small
value. When two inflexion points are observed within two adjacent intervals, σ′ is
increased by a fixed amount ∆σ′ until no two such intervals exist.
Though y′′(x) → 0 for all $x \in I_i$ as σ′ → ∞, this convergence is not uniform. So it cannot be
guaranteed that the condition can be fulfilled for any σ′ < ∞. For this reason σ′ will
only be increased up to a maximum value σ′max; if the condition is still violated there,
σ′max will be used as tension factor. The performance can be optimised by choosing ∆σ′
relatively large and then refining σ′ by nested intervals.
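The selection rule can be sketched in Python as follows. The sketch assumes a hypothetical callable `fit(sigma)` that calibrates the tension spline for a given σ′ and returns the node parameters $y^{(2)}_k$; the default step and start values are illustrative, and the refinement by nested intervals is omitted.

```python
def has_double_inflexion(y2):
    """True if two adjacent intervals each contain an inflexion point.

    An interval (x_k, x_{k+1}) holds exactly one inflexion point when
    y2[k] * y2[k + 1] < 0, so two neighbouring sign changes are searched for.
    """
    for k in range(1, len(y2) - 1):
        if y2[k - 1] * y2[k] < 0 and y2[k] * y2[k + 1] < 0:
            return True
    return False


def select_tension(fit, sigma_start=0.1, d_sigma=1.0, sigma_max=1000.0):
    """Increase sigma' in fixed steps until no double inflexion remains."""
    sigma = sigma_start
    while sigma < sigma_max:
        if not has_double_inflexion(fit(sigma)):
            return sigma
        sigma += d_sigma
    return sigma_max  # condition could not be met below the cap


# toy stand-in for a real calibration: oscillations vanish once sigma' >= 3
toy_fit = lambda s: [1.0, -1.0, 1.0] if s < 3.0 else [1.0, 1.0, 1.0]
chosen = select_tension(toy_fit)
```

Nodes with a vanishing parameter are not counted as a sign change here, in line with the summary above that such a node contributes at most one inflexion point to its two neighbouring intervals.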
4.1.2.3 Results
Fig. 4.2 shows the result of the model calibration for n = 20 and ψ ∈ {0.2, 0.3}. Though the results are much better than for the simple cubic splines, the splines
under tension show no improvement compared to Akima interpolation.
Figure 4.3 shows the values of σ′ determined by the algorithm described above
for the same cases. The value of σ′max = 1000 is not reached, so the condition of no
oscillations is always fulfilled. The deviation of the Sali result using splines under
tension is not due to oscillations, but due to a loss of smoothness caused by the high
tension factor.
Figure 4.2: The effect of h on the quality of the Sali approach with splines under tension instead of cubic splines. The graphs show L0 − Li with n = 20, x̄ = 60, ψ = 0.3 (above) and ψ = 0.2 (below). [Plots not reproduced: horizontal axis i from 0 to 20, vertical axis from 0 to 5; curves for h = 2, 1, 0.5, 0.25 and the analytic solution.]
Figure 4.3: Tension factor σ′ as a function of the time-step i for n = 20, x̄ = 60, ψ = 0.3 (above) and ψ = 0.2 (below). [Plots not reproduced: horizontal axis i from 0 to 20, vertical axis σ′ from 0 to 200 (above) and 0 to 300 (below); curves for δx = 2, 1, 0.5, 0.25.]
4.2 Splitting off the Asymptotic Behaviour
In the last chapter we saw that cubic splines are unsuitable to handle exponential
growth. The basic idea discussed in the present section is to keep the cubic splines
and split off the exponential asymptotic behaviour instead. Consider the two terms
that have to be integrated during a Sali step (3.10, 3.13). Using
\[ g_{s,i}(x_i) = 1 + b^{(s)}_{i-1} \exp(-a_{i-1} x_i), \quad s \in \{0, 1\} \tag{4.15} \]
we define $f_{s,i}$ for s ∈ {0, 1} by
\[ g_{0,i}(x_i) \cdot f_{0,i}(x_i) = \exp(-\psi_i x_i - \psi_i^2 G_i/2)\, D_{i,i+1}(x_i) \tag{4.16} \]
\[ g_{1,i}(x_i) \cdot f_{1,i}(x_i) = (1 + \delta_i L_i \exp(-\psi_i x_i - \psi_i^2 G_i/2))\, D_{i,i+1}(x_i) \tag{4.17} \]
where the $b^{(s)}_{i-1}$ are chosen so that $f_{s,i}(-\bar{x}) = 1$. If $b^{(s)}_{i-1} \le 0$ the asymptotic behaviour
will not be split off; a normal Sali step will be performed instead. Using the asymptotic
behaviour of D discussed in section 3.3, we get:
\[ f'_{s,i}(x_i) \approx 0 \quad \text{for } |x_i| \gg 0 . \tag{4.18} \]
Natural boundary conditions, i.e. $f''_{s,i}(x_i) = 0$ at $x_i = \pm\bar{x}$, will be used for $f_{s,i}$. The
exponential term that has been split off can be handled analytically:
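\[
\begin{aligned}
E_j[\exp(-\psi_i x_i - \psi_i^2 G_i/2)\, D_{i,i+1}(x_i)]
&= E_j[(1 + b^{(0)}_{i-1} \exp(-a_{i-1} x_i)) \cdot f_{0,i}(x_i)] \\
&= \int_{-\infty}^{\infty} (1 + b^{(0)}_{i-1} \exp(-a_{i-1} x_i)) \cdot f_{0,i}(x_i) \cdot n(x_i; G_i - G_j)\, dx_i \\
&= \int_{-\infty}^{\infty} f_{0,i}(x_i) \cdot n(x_i; G_i - G_j)\, dx_i
 + b^{(0)}_{i-1} \int_{-\infty}^{\infty} \exp(-a_{i-1} x_i) \cdot f_{0,i}(x_i) \cdot n(x_i; G_i - G_j)\, dx_i \\
&= \int_{-\infty}^{\infty} f_{0,i}(x_i) \cdot n(x_i; G_i - G_j)\, dx_i
 + b^{(0)}_{i-1}\, e^{\frac{a_{i-1}^2}{2}(G_i - G_j)} \int_{-\infty}^{\infty} f_{0,i}(x_i - a_{i-1}(G_i - G_j)) \cdot n(x_i; G_i - G_j)\, dx_i
\end{aligned}
\tag{4.19}
\]
and
\[
\begin{aligned}
E_j[(1 + \delta_i L_i \exp(-\psi_i x_i - \psi_i^2 G_i/2))\, D_{i,i+1}(x_i) \,|\, x_j]
&= \int_{-\infty}^{\infty} f_{1,i}(x_i) \cdot n(x_i - x_j; G_i - G_j)\, dx_i \\
&\quad + b^{(1)}_{i-1}\, e^{\frac{a_{i-1}^2}{2}(G_i - G_j)} \int_{-\infty}^{\infty} f_{1,i}(x_i - a_{i-1}(G_i - G_j)) \cdot n(x_i - x_j; G_i - G_j)\, dx_i
\end{aligned}
\tag{4.20}
\]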
From here on we can proceed as in section 3.3, with the sole exception that the
cubic spline interpolation is applied to $f_{s,i}$, which should lead to a much more stable
behaviour.
The tails x > x̄ and x < −x̄ are less problematic in this case, as $f_{s,i}(x) \approx 1$ for
x < −x̄ and $f_{s,i}(x) \approx s$ for x > x̄. Nevertheless they will be taken into account and
$f_{s,i}(x)$ will be assumed to be constant outside (−x̄, x̄).
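The last step of (4.19), which moves the exponential factor into a shift of the argument, can be checked numerically. The Python sketch below does this for a density centred at zero, using composite Simpson quadrature; the test function `f`, the values a = 0.3 and variance 4, and the integration range are illustrative assumptions and not taken from the calibration.

```python
import math


def normal_pdf(x, var):
    """Density of a centred normal distribution with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * step)
    return total * step / 3.0


def direct(f, a, var, width=30.0):
    """Integral of exp(-a x) f(x) n(x; var), evaluated as it stands."""
    return simpson(lambda x: math.exp(-a * x) * f(x) * normal_pdf(x, var),
                   -width, width)


def shifted(f, a, var, width=30.0):
    """Same integral after completing the square: prefactor times shifted f."""
    return math.exp(0.5 * a * a * var) * simpson(
        lambda x: f(x - a * var) * normal_pdf(x, var), -width, width)


f = lambda x: 1.0 + 0.5 * math.tanh(x)   # bounded, smooth stand-in for f_{0,i}
lhs_value = direct(f, 0.3, 4.0)
rhs_value = shifted(f, 0.3, 4.0)
```

The benefit of the shifted form is that only the bounded function needs to be interpolated by splines, while the exponential growth is carried by the analytic prefactor.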
Figure 4.4 shows Li for different values of h and the same parameters as in figure
3.4, with the only difference that the asymptotics are split off here. As can clearly
be seen, the stability is improved significantly. The deviations observed for h = 2 are
caused by oscillations like those observed in figure 3.5. This can be seen in figure 4.5.
Figure 4.6 shows the result of the Sali approach using cubic splines with split-off
exponential behaviour for the same set of tenors and volatilities ψ as in figure 3.1, which uses
the analytical solution. With h = 1 the analytical solution can be reproduced in most
cases, except for long tenor and very high volatility. To get an impression whether
x̄ = 60 is large enough so that the asymptotic behaviour is a good approximation for
all x < −x̄, the limit c0 on the non-asymptotic terms (A.14) is monitored. For all sets
of tenor and volatility considered and every time step it is smaller than 0.1.
Figure 4.4: The effect of h on the quality of the Sali approach with split-off asymptotic behaviour. The graphs show L0 − Li with n = 20, x̄ = 60, ψ = 0.3 (above) and ψ = 0.2 (below). [Plots not reproduced: horizontal axis i from 0 to 15, vertical axis from 0 to 5; curves for h = 2, 1, 0.5, 0.25 and the analytic solution.]
Figure 4.5: f1,i according to (4.17) for different time-steps i (n = 20, ψ = 0.2, h = 2). [Plot not reproduced: horizontal axis x from −20 to 0, vertical axis from 0 to 2000; curves labelled 10, 8, 6, 4, 2, 1.]
The combination of the methods described above, splitting off the exponential
behaviour and using a spline type that is less prone to over-oscillations, does show an
improvement compared to the simple use of the other spline types, but is not as good
as using the normal cubic splines after splitting off the exponential. As long as no
over-oscillations occur, simple cubic splines seem to be best suited among the
examined spline types. For completeness, figures showing these results are given in
appendix D.
When applying this method to other models it is not necessary to identify the
exact asymptotic behaviour. It suffices to identify a function gs,i(x) so that fs,i(x) is
‘well behaved’ in the sense that it shows less spurious oscillations than the original
function gs,i(x) · fs,i(x) and the integrations involved can still be performed in an
efficient manner.
Figure 4.6: L0 − L with L calculated using a Sali tree with split-off asymptotic behaviour, a tenor of (from top left) 10, 20 and 30 years and different values of ψ with x̄ = 60 and h = 1 (symbols). The lines show the analytic results from figure 3.1 for comparison. [Plots not reproduced: legend values of ψ are 0.7 to 0.1 (10 years), 0.5, 0.4, 0.3, 0.2, 0.15, 0.1 (20 years) and 0.4, 0.3, 0.2, 0.18, 0.15, 0.14, 0.13, 0.1 (30 years).]
Chapter 5
Optimising the Grid
Up to now no attention has been paid to the structure of the underlying grid. In
principle, three aspects can be considered to optimise the grid:
• The size of the grid given by the maximal value x̄. If this is chosen too small,
essential contributions far from equilibrium might get lost. If x̄ is chosen too
large, computation time will be higher than necessary. In addition this may
increase stability problems at the lower end of the grid.
• The value of the lattice spacing h. For each interpolation technique used up to
now, the effect of changing h has been demonstrated (fig. 3.4, 4.1, 4.2 and 4.4).
The conflict between performance and stability should be solved such that h is
small enough to get an accurate result but within this restriction as large as
possible.
• The concept of equidistant points can be abandoned in favour of a lattice that
is adapted to the structure of the function. The convergence theorem for cubic
splines clearly favours a homogeneous grid, as this case allows for a minimal
factor K (2.6) compared with other grids with the same size and number of
grid points. So within this work only homogeneous grids will be considered.
The optimisation should be done with the goal to minimise computation time while
keeping an acceptable level of precision. As the integration (3.21) has to be done
for each single grid point and the integration implies a sum over all grid points, the
computing time is of order O(N²) as long as the number of grid points is the same
for each time-step.
But the Sali method does not rely on a constant grid for all time steps. So all of
the above adjustments can be done during the calibration for each single time-step.
The optimisation will be done on the basis of cubic splines with split-off asymptotic
behaviour from section 4.2, as these led to the most reliable results.
5.1 Grid Size
According to equation (4.19) two integrations are done over $f_{0,i}$: the first with a
normal distribution centred at x = 0, the other centred at $x = -a_{i-1}(G_i - G_j)$, both
with variance $G_i - G_j$. So it must be guaranteed that a reasonable vicinity of both
points is covered by the grid.
In the following a range of $\pm 3\sqrt{G_i - G_j}$ is considered to be sufficient. The validity
of this assumption is confirmed a posteriori using the bound on non-asymptotic terms
derived in appendix A.2.1.
As in the first part of a Sali step the integration (4.19) is done with j = 0, the
grid should at least cover the interval
\[ \mathcal{G}_i = (x_0, x_N) = \left( -a_{i-1} G_i - 3\sqrt{G_i},\; 3\sqrt{G_i} \right) . \tag{5.1} \]
The last step i = 0 deserves special attention. All other values of x are integrated
over, averaging out small oscillations, but for i = 0 the limit of zero variance is reached
and the two integrals in equation (4.19) evaluate to $f_{0,0}(0)$. To obtain maximum
precision for this single point, the grid is chosen so that x = 0 is a grid point for all
time-steps. So $x_0$ and $x_N$ are chosen as
\[ x_0 = -m_0 \cdot h < -a_{i-1} G_i - 3\sqrt{G_i} \tag{5.2} \]
\[ x_N = m_N \cdot h > 3\sqrt{G_i} \tag{5.3} \]
with minimal $m_0, m_N \in \mathbb{N}$.
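A minimal Python sketch of this choice of grid bounds is given below. The function name and the sample values of $a_{i-1}$, $G_i$ and h are illustrative assumptions; the strict inequalities of (5.2) and (5.3) are enforced by rounding down and adding one.

```python
import math


def grid_bounds(a_prev, G_i, h):
    """Smallest integers m0, mN and the resulting grid ends x0, xN.

    x0 = -m0 * h lies strictly below -a_prev * G_i - 3 * sqrt(G_i) and
    xN = mN * h lies strictly above 3 * sqrt(G_i), so x = 0 is a grid point.
    """
    sd = math.sqrt(G_i)
    lower = -a_prev * G_i - 3.0 * sd
    upper = 3.0 * sd
    m0 = math.floor(-lower / h) + 1   # smallest integer with m0 * h > -lower
    mN = math.floor(upper / h) + 1    # smallest integer with mN * h > upper
    return m0, mN, -m0 * h, mN * h


m0, mN, x0, xN = grid_bounds(a_prev=0.5, G_i=4.0, h=1.0)
n_points = m0 + mN + 1   # grid points from x0 to xN, including x = 0
```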
It is not guaranteed that $D_{i,i+1}$ already shows the asymptotic behaviour at
$x_0 = -a_{i-1} G_i - 3\sqrt{G_i}$. So neither $a_i$ nor $b_i$ can be determined from the first two grid
points. $a_i$ from section 3.3 will be used. $b^{(s)}_i$ will be determined as follows: From
equations (4.16, 4.17) it can be seen that
\[ b^{(0)}_{i-1} = \exp\left( -\tfrac{1}{2} G_i \psi_i^2 \right) b_i \tag{5.4} \]
\[ b^{(1)}_{i-1} = \delta_i L_i\, b^{(0)}_{i-1} \tag{5.5} \]
where $b_i$ defined in equation (3.17) can be determined from $b^{(1)}_i$ by using equation
(4.20): As $f_{1,i}(x) \approx 1$ for x ≪ 0, both integrals in equation (4.20) converge to 1 for
$x_j \to -\infty$. So $b_i = b^{(1)}_i \exp(a_i^2 (G_i - G_j)/2)$. From this and $b_N = 1$ the following
expressions for $b^{(s)}_i$ are derived:
\[ b^{(0)}_i = \exp\left( -\tfrac{1}{2} G_i \psi_i^2 \right) \prod_{j=i+1}^{n} \delta_j L_j \exp\left( -\tfrac{1}{2} \psi_j^2 G_j + \tfrac{1}{2} \psi_j (G_j - G_n)^2 \right) \tag{5.6} \]
\[ b^{(1)}_i = \delta_i L_i\, b^{(0)}_i . \tag{5.7} \]
The above considerations are valid for the integration over f0,i in equation (4.19).
For equation (4.20) the situation is basically different as the Gaussians are centred
around x = xj and x = −ai−1(Gi − Gj) + xj, where xj can be any grid-point. So
vicinities of these points can not be covered by any finite grid and the tails still have
to be taken into account.
As $f_{s,i}(x_0)$ can be far from 1, the tail $x < x_0$ has to be handled in a different
manner than in section 4.2. As $f_{s,i}(x) \to 1$ for x → −∞, assuming $f_{s,i}(x) = f_{s,i}(x_0)$
for $x < x_0$ clearly overestimates the tail, whereas assuming $f_{s,i}(x) = 1$ clearly underestimates it.
As an estimation of the functional form for $x < x_0$ an extrapolation is used that
fits with the known values for $x \ge x_0$ and x → −∞: An exponential decay
\[ f_{s,i}(x) = 1 + c \cdot \exp(d \cdot x) \tag{5.8} \]
if $f_{s,i}(x_0) < f_{s,i}(x_1)$ and a Gaussian
\[ f_{s,i}(x) = 1 + c \cdot \exp(-d (x - m)^2) \tag{5.9} \]
if $f_{s,i}(x_0) > f_{s,i}(x_1)$. The parameters are chosen so that the extrapolation fits the
values $y_k = f_{s,i}(x_k)$ at the first two or three grid points respectively, leading to
\[ d = \frac{\log(y_1 - 1) - \log(y_0 - 1)}{x_1 - x_0} \tag{5.10} \]
\[ c = (y_0 - 1) \cdot \exp(-d\, x_0) \tag{5.11} \]
for the exponential decay and
\[ m = -\frac{r\,(x_0^2 - x_2^2) - x_0^2 + x_1^2}{2 \left( r\,(x_2 - x_0) + x_0 - x_1 \right)}, \qquad r = \frac{\log(y_0 - 1) - \log(y_1 - 1)}{\log(y_0 - 1) - \log(y_2 - 1)} \tag{5.12} \]
\[ d = \frac{\log(y_1 - 1) - \log(y_0 - 1)}{(x_0 - m)^2 - (x_1 - m)^2} \tag{5.13} \]
\[ c = (y_0 - 1) \exp(d (x_0 - m)^2) \tag{5.14} \]
for the Gaussian.
The above estimation of the tails can still lead to inconsistent results if d < 0
in equation (5.9). In this case either another estimate for the tail is needed or the
grid has to be chosen larger. It must be emphasised that the importance of the tails
does not come from equation (4.19), where the tails are far from the central value of
the Gaussian, but from equation (4.20), where the Gaussian is centred around any
grid value.
Choosing a more or less arbitrary functional form for x < x0 is in principle not
different from choosing a cubic polynomial for each interval inside the grid. The
different functional form comes from the fact that the known behaviour for x → −∞
has to be taken into account.
For consistency reasons the same approach should be used on the upper interval
(xN, ∞), though the numerical impact is negligible.
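The two tail fits can be sketched in Python as follows. The sketch assumes the functional forms 1 + c·exp(d·x) and 1 + c·exp(−d·(x − m)²) and solves for the parameters from the values at the first two or three grid points, all of which must exceed 1; the function names and the sample data are illustrative.

```python
import math


def fit_exponential_tail(x0, x1, y0, y1):
    """Parameters (c, d) of 1 + c * exp(d * x) through (x0, y0) and (x1, y1)."""
    d = (math.log(y1 - 1.0) - math.log(y0 - 1.0)) / (x1 - x0)
    c = (y0 - 1.0) * math.exp(-d * x0)
    return c, d


def fit_gaussian_tail(x0, x1, x2, y0, y1, y2):
    """Parameters (c, d, m) of 1 + c * exp(-d * (x - m)**2) through three points."""
    l0, l1, l2 = (math.log(y - 1.0) for y in (y0, y1, y2))
    r = (l0 - l1) / (l0 - l2)
    m = -(r * (x0 ** 2 - x2 ** 2) - x0 ** 2 + x1 ** 2) / (
        2.0 * (r * (x2 - x0) + x0 - x1))
    d = (l1 - l0) / ((x0 - m) ** 2 - (x1 - m) ** 2)
    c = (y0 - 1.0) * math.exp(d * (x0 - m) ** 2)
    return c, d, m   # a result with d < 0 signals an unusable fit


# synthetic data generated from known parameters, then recovered by the fits
exp_c, exp_d = fit_exponential_tail(
    -10.0, -9.0, 1.0 + 3.0 * math.exp(-4.0), 1.0 + 3.0 * math.exp(-3.6))
gauss = lambda x: 1.0 + 2.0 * math.exp(-0.5 * (x + 12.0) ** 2)
g_c, g_d, g_m = fit_gaussian_tail(-10.0, -9.0, -8.0,
                                  gauss(-10.0), gauss(-9.0), gauss(-8.0))
```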
Figure 5.1: Effect of different tail extrapolations on the quality of the Sali approach with split asymptotics. The graphs show L0 − Li with n = 20, ψ = 0.2 and h = 1. It is obvious that the exponential tails defined in equations (5.8–5.14) are necessary to reproduce the analytical results. [Plot not reproduced: horizontal axis i from 0 to 20, vertical axis from 0 to 1.5; curves: no tail; f(x) = f(x0) for x < x0; f(x) = 1 for x < x0; exponential; analytical solution.]
Figure 5.1 illustrates the effect of the different approaches to the interval (−∞, x0).
All estimations of a constant tail clearly over- or underestimate the integral, while
the exponential extrapolation reproduces the analytical solution with high accuracy.
The number of grid points that were used for each single step is displayed in figure 5.2.
The total number of integrations done is 1368 compared to 4678 integrations needed
in the previous chapters with a fixed −x0 = xN = 60.
Figure 5.2: The grid size needed for the calibration in figure 5.1 as a function of the time-step i. [Plot not reproduced: horizontal axis i from 0 to 20, vertical axis N from 10 to 50.]
Note that the driving factor on the grid
size is basically different from the previous
chapters. While in the previous chapters it
was important to choose x large enough so
that the asymptotic behaviour takes over,
after splitting off the exponential behaviour
the approximated function is ’well behaved’
so that the grid can be limited to a much
smaller interval.
This approach relies on the fact that the
coefficients of the asymptotic exponential
can be determined analytically (i.e. ai and
bi) as they are needed in (4.19) and (4.20).
Otherwise a larger grid would be needed to
determine them numerically.
As both ai and bi are known analytically, the estimation of the influence of non-
asymptotic terms used previously can be applied in a straightforward manner to get
a boundary on the integral over the lower tail x < x0 as shown in appendix A.2.2.
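The integration is done for $f_{0,i}$ with $\sigma = \sqrt{G_i}$, $x_0 = -a_{i-1} G_i - 3\sigma$ and both µ = 0
and $\mu = -a_{i-1} G_i$, so according to (A.17) the error from the lower tail is bounded by
\[ \Theta_{x_0} = \frac{c}{2} \left[ \operatorname{erfc}\left( \frac{3 + \psi \sqrt{G_i}}{\sqrt{2}} \right) + \operatorname{erfc}\left( \frac{3 + (a_{i-1} + \psi) \sqrt{G_i}}{\sqrt{2}} \right) \right] \cdot \exp\left( \frac{G_i \psi^2}{\sqrt{2}} - a_{i-1} G_i \right) \tag{5.15} \]
with time-independent volatility ψ. If $\psi_i$ varies with i, ψ has to be replaced by
$\min_{j \ge i} \psi_j$. The ratio between $\Theta_{x_0}$ and the integral (4.19) will be monitored to verify
the adequacy of the choice of the grid size.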
This estimate is rather rough as it assumes an approximation f0,i ≈ 1 for x < x0,
whereas the functional form introduced earlier is supposed to be much more precise.
On the other hand this approximation can be used to verify the functional form used
as an approximation for the tail.
5.2 Lattice Spacing
The basic idea for an automatic selection of the lattice spacing h is the same as for
the selection of the tension factor σ′ for splines under tension in section 4.1.2.2. If
oscillations are observed, h is reduced by a constant factor w and the time-step is
repeated.
In more detail, after performing a time-step i with lattice-spacing h, Di−1,i is split
into an asymptotic part and a non-asymptotic part as defined in equation (4.16). The
non-asymptotic part f0,i−1 is checked for oscillations as in the case of splines under
tension in section 4.1.2.2. In the case of cubic splines it is trivial that a change of
the sign of y′′(xi) indicates a point of inflexion. If within any two adjacent intervals
two points of inflexion are observed, the calculation of Di−1,i is repeated with lattice
spacing h/w.
Figure 5.3 shows the result of the Sali approach using cubic splines with split-
off exponential behaviour and optimised grid size given by equations (5.2,5.3) and
a lattice spacing determined by the algorithm described in the present section with
starting value h0 = 2 and w = 1.5. In the case n = 30 and ψ = 0.1 the estimation
of the tail (5.9) fails which is indicated by an ‘upside-down’ Gaussian with d < 0.
Here it is necessary to revert to a larger grid size. The relative effect of the lower tail,
Θx0 divided by (4.19), can get as large as 3% for the last non-trivial step i = 1, but
otherwise stays well below 0.2%.
Comparing figure 5.3 to the matching results for a fixed grid in figure 4.6 we see
that the adjustment of the lattice spacing leads to a good convergence even with very
high values of ψ.
Table 5.1 shows the number of integrations needed for the calibration with fixed
and optimised grid respectively. As each integration contains a sum over all grid
points, the total number of summands is given as well, indicating the computation
cost. For short tenor the improvement is significant. For longer tenor a smaller lattice
spacing than used previously for the fixed grid is necessary to avoid the errors for high
volatility. This leads to a higher computational cost.
The actual computation time on an average desktop system (Athlon64 X2 4200+)
is given in table 5.2. For long tenor the computation time of the analytical solution
gets out of hand, making it necessary to use the SALI approach even for this simple
model with a known analytical solution.
Figure 5.3: L0 − L with L calculated using a Sali tree with split-off asymptotic behaviour, a tenor of (from top left) 10, 20 and 30 years, different values of ψ with optimised grid size and lattice spacing (symbols). The lines show the analytic results from figure 3.1 for comparison. [Plots not reproduced: legend values of ψ are 0.7 to 0.1 (10 years), 0.5, 0.4, 0.3, 0.2, 0.15, 0.1 (20 years) and 0.4, 0.3, 0.2, 0.18, 0.15, 0.14, 0.13, 0.1 (30 years).]
       fixed grid, x̄ = 60, h = 1      optimised grid
 n     Integrations    Summands       Integrations    Summands
10         15,386     1,861,706           4,160         135,786
20         27,828     3,367,188          18,298       2,234,410
30         56,624     6,851,504          60,594      14,498,006
Table 5.1: The number of integrations and the number of summands necessary for the calibration using the Sali approach with a fixed grid (figure 4.6) and optimised grid (figure 5.3)

 n     Fixed grid, x̄ = 60, h = 1     Optimised grid     Analytical solution
10                 5                        1                   < 1
20                 9                        7                     2
30                17                       37                 1,693
Table 5.2: The computation time in seconds necessary for the calibration using the analytical solution (figure 3.1) or the SALI tree with fixed or optimised grid (figures 4.6 and 5.3 respectively) on an average desktop system.
Chapter 6
Conclusion
6.1 Summary
The large tails that are observed for Markov functional models for long term struc-
tures have been investigated using an analytically solvable model based on the Black-
Derman-Toy model. The analytical solution has been used to demonstrate notable
effects on the model calibration by the large tails.
For the application of the Sali approach to the MBDT model the asymptotic be-
haviour of the price function of the discount bonds had to be investigated in order
to get realistic boundary conditions and an appropriate approximation of the contri-
bution beyond the grid. Even in this case of an analytically solvable model, the Sali
approach proved to be necessary to get results for long tenor within an acceptable
time.
The difference between the parameters of the asymptotic behaviour determined
from the outer grid points and the parameters determined analytically can be used
as an indicator whether the grid size is adequate to assume asymptotic behaviour
beyond the grid.
The cubic splines used in this attempt turned out to be rather unstable for high
volatilities and long tenor. This is consistent with the stability theorem for cubic
splines which indicates that a very small lattice spacing may be needed to get reason-
able results when modelling exponential behaviour with cubic splines. This depends
on the factor inside the exponential which in our case is determined by volatility and
time to maturity.
Three approaches to improve the quality of the numerical results were investi-
gated. Two were based on different spline types that are supposed to be less prone
to over-oscillations. Some improvement can be seen, but this is limited by a lack of
smoothness that comes with these new spline types.
The third approach is based on splitting off the exponential behaviour and ap-
plying the semi-analytic integration to the remaining ‘well behaved’ function. This
improved the numerical results significantly.
Based on the third approach an optimisation of the underlying grid was investi-
gated:
As the asymptotic behaviour to split off can completely be determined analytically,
it is possible to restrict the lattice to a much smaller vicinity of two ’centres’ of the
probability density. So the calculation time can be reduced significantly.
A good value of the lattice spacing can be determined by dynamical adjustment
during the calculation, which makes sure that no over-oscillations occur.
6.2 Outlook
As a next step one can either apply the methods described above to more complex
Markov functional models or calibrate the MBDT model to real market data and
compare the prices to those obtained by a LIBOR market model.
The application of the methods derived within this thesis to more complex models
relies on the determination of the asymptotic behaviour. This could be done either
by giving an exact analytical formula or by fitting a parameterised function to a small
number of grid points close to the boundary.
In the latter case, the choice of the grid size will be a non-trivial issue. It has to
be chosen large enough so that the asymptotic behaviour is determined correctly and
is a good approximation outside the grid, while in the former case the grid can be
chosen much smaller.
Appendix A
Analytical solution of the
MBDT-Model
A.1 Calibration
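First we show that $D_{i,i+1}$ has the form
\[ D_{i,i+1} = E_i\left[ \sum_{j=0}^{2^{n-i-1}-1} X_{i,j}\, \mathcal{E}_i(-Y_{i,j} W_{G_\cdot}) \right] \tag{A.1} \]
with constants $X_{i,j}$ and $Y_{i,j}$. This is certainly true for i = n − 1 with $X_{n-1,0} = 1$ and
$Y_{n-1,0} = 0$. Now assume that (A.1) has been proven for some i < n. Then we get by
using (3.6) and the martingale property of $\mathcal{E}$
\[
\begin{aligned}
D_{i-1,i} &= \sum_{j=0}^{2^{n-i-1}-1} E_{i-1}\left[ X_{i,j}\, \mathcal{E}_i(-Y_{i,j} W_{G_\cdot}) \left( 1 + \delta_i L_i\, \mathcal{E}_i(-\psi_i W_{G_\cdot}) \right) \right] \\
&= \sum_{j=0}^{2^{n-i-1}-1} \left( X_{i,j}\, \mathcal{E}_{i-1}(-Y_{i,j} W_{G_\cdot}) + \delta_i L_i X_{i,j}\, E_{i-1}\left[ \mathcal{E}_i(-Y_{i,j} W_{G_\cdot})\, \mathcal{E}_i(-\psi_i W_{G_\cdot}) \right] \right) \\
&= \sum_{j=0}^{2^{n-i-1}-1} \left( X_{i,j}\, \mathcal{E}_{i-1}(-Y_{i,j} W_{G_\cdot}) + \delta_i L_i X_{i,j}\, E_{i-1}\left[ e^{((Y_{i,j}+\psi_i)^2 - \psi_i^2 - Y_{i,j}^2) G_i/2}\, \mathcal{E}_i(-(Y_{i,j} + \psi_i) W_{G_\cdot}) \right] \right) \\
&= \sum_{j=0}^{2^{n-i}-1} X_{i-1,j}\, \mathcal{E}_{i-1}(-Y_{i-1,j} W_{G_\cdot})
\end{aligned}
\tag{A.2}
\]
with
\[ X_{i-1,j} = \begin{cases} \delta_i L_i\, e^{((Y_{i,j-2^{n-1-i}} + \psi_i)^2 - \psi_i^2 - Y_{i,j-2^{n-1-i}}^2) G_i/2}\, X_{i,j-2^{n-1-i}} & \text{for } 2^{n-i} > j \ge 2^{n-1-i} \\ X_{i,j} & \text{for } j < 2^{n-1-i} \end{cases} \tag{A.3} \]
\[ Y_{i-1,j} = \begin{cases} Y_{i,j} & \text{for } j < 2^{n-1-i} \\ Y_{i,j-2^{n-1-i}} + \psi_i & \text{for } 2^{n-i} > j \ge 2^{n-1-i} \end{cases} \tag{A.4} \]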
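The recursion (A.3), (A.4) doubles the number of exponential terms at every step, which a few lines of Python make explicit. This is a sketch only: the values used for $\delta_i L_i$, $\psi_i$ and $G_i$ are illustrative placeholders (in the model the $L_i$ come out of the calibration), and the function name is hypothetical.

```python
import math


def step_back(X, Y, delta_L, psi, G):
    """Coefficients of D_{i-1,i} from those of D_{i,i+1}, following (A.3), (A.4).

    The exponent ((Y + psi)^2 - psi^2 - Y^2) * G / 2 simplifies to Y * psi * G.
    """
    X_new = list(X) + [delta_L * math.exp(y * psi * G) * x for x, y in zip(X, Y)]
    Y_new = list(Y) + [y + psi for y in Y]
    return X_new, Y_new


# start of the induction: a single term with X = 1 and Y = 0
X, Y = [1.0], [0.0]
for delta_L, psi, G in [(0.05, 0.2, 2.0), (0.05, 0.2, 1.0)]:
    X, Y = step_back(X, Y, delta_L, psi, G)
```

After k steps the lists hold 2^k entries, which is the reason the analytical solution becomes expensive for long tenors.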
A.2 Estimation of the Tails
A.2.1 Non-Asymptotic Contributions
The present section will establish an estimation of the influence of non-asymptotic
terms for the lower tail x < x0. This is to show in which cases the approximation
used in (3.21) is justified. From the analytical solution we will use the fact that $D_{i-1,i}$
is a sum of exponential terms and that the coefficient $Y_{i-1,j}$ is a sum of $\psi_i$ or, in the
case of homogeneous volatility, just a multiple of ψ. The whole derivation is done at a
fixed time-step, so the notation can be simplified.
Consider a sum of exponential terms
\[ f(x) = \sum_{i=0}^{k} \beta_i \exp(-\alpha_i x) \tag{A.5} \]
with $\beta_i > 0$ and $\alpha_i > \alpha_j > 0$ for all k ≥ j > i ≥ 0. For x → −∞ clearly the term for
i = 0 gives the asymptotic behaviour, so $a = \alpha_0$. For $x_0 < x_1 < 0$ the approximated
coefficient $\hat{a}$ is
\[ \hat{a} = \frac{1}{h} \log\left( \frac{f(x_0)}{f(x_1)} \right) . \tag{A.6} \]
Choose β > 0 so that
\[ f(x_1) = \beta_0 \exp(-\alpha_0 x_1) + \beta \exp(-\alpha_1 x_1) . \tag{A.7} \]
Then
\[ \sum_{i=1}^{k} \beta_i \exp(-\alpha_i x) < \beta \exp(-\alpha_1 x) \quad \forall x < x_1 . \tag{A.8} \]
Let
\[ g_0(x) = \beta_0 \exp(-\alpha_0 x) \quad \text{and} \tag{A.9} \]
\[ g_1(x) = \beta \exp(-\alpha_1 x) . \tag{A.10} \]
$g_0(x)$ is the asymptotic behaviour and $g_1(x)$ limits the non-asymptotic terms for
x < x0. As $g_1(x)/g_0(x) < g_1(x_0)/g_0(x_0)$ for all x < x0, it is sufficient to get an upper
limit for
\[ c = g_1(x_0)/g_0(x_0) \tag{A.11} \]
for an estimation of the non-asymptotic terms. From
\[ \exp(\hat{a} h) = f(x_0)/f(x_1) \le \frac{g_0(x_0) + g_1(x_0)}{g_0(x_1) + g_1(x_1)} \tag{A.12} \]
with $h = x_1 - x_0$, we arrive at
\[ \exp((\hat{a} - a) h) = \exp(\hat{a} h)\, \frac{g_0(x_1)}{g_0(x_0)} \le \frac{1 + g_1(x_0)/g_0(x_0)}{1 + g_1(x_1)/g_0(x_1)} = \frac{1 + c}{1 + c \exp(h(\alpha_0 - \alpha_1))} \tag{A.13} \]
and so
\[ c \le c_0 = \frac{\exp((a - \hat{a}) h) - 1}{\exp((\hat{a} - \alpha_1) h) - 1} , \tag{A.14} \]
where the difference $a - \alpha_1$ can be derived from (A.4). In the case of a time-independent
volatility $\psi_i = \psi$, we have $a - \alpha_1 = \psi$.
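As a worked illustration, the bound can be evaluated for a function with only two exponential terms, where the true ratio c is known. The Python sketch below reads (A.14) with the estimated coefficient from (A.6) in the numerator difference and in the denominator, which is an assumption about the notation; all numerical values and the function name are illustrative.

```python
import math


def non_asymptotic_bound(f_x0, f_x1, h, a, psi):
    """Upper limit c0 for the relative weight of the non-asymptotic terms.

    a is the exact asymptotic exponent, psi the time-independent volatility,
    so that alpha_1 = a - psi; f_x0 and f_x1 are the two outermost grid values.
    """
    a_hat = math.log(f_x0 / f_x1) / h          # estimated exponent, (A.6)
    alpha_1 = a - psi
    return (math.exp((a - a_hat) * h) - 1.0) / (
        math.exp((a_hat - alpha_1) * h) - 1.0)


# two-term example: f(x) = exp(-0.6 x) + exp(-0.3 x), outermost points -10 and -9
a, psi, x0, x1 = 0.6, 0.3, -10.0, -9.0
f = lambda x: math.exp(-a * x) + math.exp(-(a - psi) * x)
c_true = math.exp(-(a - psi) * x0) / math.exp(-a * x0)   # g1(x0) / g0(x0)
c0 = non_asymptotic_bound(f(x0), f(x1), x1 - x0, a, psi)
```

In this example the bound exceeds the true ratio by less than two percent.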
A.2.2 Integral over the Tail
This section establishes an estimation of the integral over the lower tail to be used in
section 5.1 where both a and b are known analytically and the integration is performed
after splitting off the asymptotic behaviour. Let
\[ f_0(x) = \frac{f(x)}{\beta_0 \exp(-\alpha_0 x)} \tag{A.15} \]
with f from equation (A.5). Obviously
\[ 1 \le f_0(x) \le 1 + c_0 \exp((\alpha_0 - \alpha_1) x) \tag{A.16} \]
for x < x0. For simplicity we assume a time-independent volatility, so $\alpha_0 - \alpha_1 = \psi$.
Integrating (A.16) leads to an estimate of the error made by replacing the tail by 1:
\[ \int_{-\infty}^{x_0} (f_0(x) - 1)\, n(x - \mu; \sigma)\, dx \le \frac{c_0}{2} \left( 1 + \operatorname{erf}\left( \frac{x_0 - \mu - \psi \sigma^2}{\sqrt{2}\, \sigma} \right) \right) \cdot \exp\left( \frac{(\sigma \psi)^2}{2} + \mu \psi \right) \tag{A.17} \]
Appendix B
Tension Splines
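A derivation of splines under tension can be found in [4]. For completeness it is
summarised here. Let again ∆ = {a = x0 < x1 < . . . < xN = b} be a partition of
the interval (a, b) and f : (a, b) → ℝ with $f(x_i) = y_i$. The tension spline for f with
tension factor σ is a function y : (a, b) → ℝ with
\[ y(x_i) = y_i \quad \forall\, 1 \le i \le N , \tag{B.1} \]
that is continuous in its second derivative and where for each interval $(x_i, x_{i+1})$ the
quantity $y''(x) - \sigma^2 y(x)$ is linear in x:
\[ y''(x) - \sigma^2 y(x) = (y''(x_i) - \sigma^2 y_i) \cdot \frac{x_{i+1} - x}{h_i} + (y''(x_{i+1}) - \sigma^2 y_{i+1}) \cdot \frac{x - x_i}{h_i} \tag{B.2} \]
for $x_i < x < x_{i+1}$ with $h_i = x_{i+1} - x_i$. Solving (B.2) and using (B.1) results in
\[ y(x) = \frac{y''_i}{\sigma^2} \cdot \frac{\sinh(\sigma(x_{i+1} - x))}{\sinh(\sigma h_i)} + \left( y_i - \frac{y''_i}{\sigma^2} \right) \frac{x_{i+1} - x}{h_i} + \frac{y''_{i+1}}{\sigma^2} \cdot \frac{\sinh(\sigma(x - x_i))}{\sinh(\sigma h_i)} + \left( y_{i+1} - \frac{y''_{i+1}}{\sigma^2} \right) \frac{x - x_i}{h_i} \tag{B.3} \]
for $x_i < x < x_{i+1}$ and $y''_i = y''(x_i)$. Taking the first derivative and equating the left-
and right-hand derivatives at $x_i$ leads to
\[
\begin{aligned}
&\left( \frac{1}{h_{i-1}} - \frac{\sigma}{\sinh(\sigma h_{i-1})} \right) \cdot \frac{y''_{i-1}}{\sigma^2}
+ \left( \sigma \coth(\sigma h_{i-1}) - \frac{1}{h_{i-1}} + \sigma \coth(\sigma h_i) - \frac{1}{h_i} \right) \cdot \frac{y''_i}{\sigma^2} \\
&\quad + \left( \frac{1}{h_i} - \frac{\sigma}{\sinh(\sigma h_i)} \right) \cdot \frac{y''_{i+1}}{\sigma^2}
= \frac{y_{i+1} - y_i}{h_i} - \frac{y_i - y_{i-1}}{h_{i-1}} ,
\end{aligned}
\tag{B.4, B.5}
\]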
for 1 < i < N. As in the case of simple cubic splines, two more equations are
needed to make the splines unique. As boundary conditions the first derivatives at the
boundary are chosen as in section 3.3. Taking the first derivative of (B.3) leads to
\[ \left( \sigma \coth(\sigma h_1) - \frac{1}{h_1} \right) \cdot \frac{y''_1}{\sigma^2} + \left( \frac{1}{h_1} - \frac{\sigma}{\sinh(\sigma h_1)} \right) \cdot \frac{y''_2}{\sigma^2} = \frac{y_2 - y_1}{h_1} - y'_1 \tag{B.6} \]
and
\[ \left( \frac{1}{h_{N-1}} - \frac{\sigma}{\sinh(\sigma h_{N-1})} \right) \cdot \frac{y''_{N-1}}{\sigma^2} + \left( \sigma \coth(\sigma h_{N-1}) - \frac{1}{h_{N-1}} \right) \cdot \frac{y''_N}{\sigma^2} = y'_N - \frac{y_N - y_{N-1}}{h_{N-1}} . \tag{B.7} \]
As coth(x) > 1/x for x > 0, the matrix representing the above system of linear
equations is strictly diagonally dominant and thus nonsingular. Due to the tridiagonal
structure it can be solved numerically by LU decomposition in O(N) operations
[5, 15].
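The linear system and the evaluation of (B.3) can be sketched in Python as follows. The sketch solves for $z_k = y''(x_k)/\sigma^2$ with the Thomas algorithm (LU decomposition specialised to tridiagonal matrices), takes the two end derivatives as inputs, uses 0-based arrays, and its function names are illustrative rather than taken from [5].

```python
import bisect
import math


def tension_spline_z(x, y, sigma, d_first, d_last):
    """Solve the tridiagonal system for z_k = y''(x_k) / sigma**2."""
    n = len(x)
    h = [x[k + 1] - x[k] for k in range(n - 1)]
    off = [1.0 / hk - sigma / math.sinh(sigma * hk) for hk in h]   # off-diagonal
    dia = [sigma / math.tanh(sigma * hk) - 1.0 / hk for hk in h]   # diagonal parts
    slope = [(y[k + 1] - y[k]) / h[k] for k in range(n - 1)]
    lower = [0.0] + off                      # lower[k] multiplies z[k - 1]
    upper = off + [0.0]                      # upper[k] multiplies z[k + 1]
    diag = [dia[0]] + [dia[k - 1] + dia[k] for k in range(1, n - 1)] + [dia[-1]]
    rhs = ([slope[0] - d_first]
           + [slope[k] - slope[k - 1] for k in range(1, n - 1)]
           + [d_last - slope[-1]])
    for k in range(1, n):                    # forward elimination
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    z = [0.0] * n                            # back substitution
    z[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):
        z[k] = (rhs[k] - upper[k] * z[k + 1]) / diag[k]
    return z


def tension_spline_eval(x, y, z, sigma, t):
    """Evaluate the tension spline at t inside the grid, following (B.3)."""
    k = min(max(bisect.bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[k + 1] - x[k]
    s = math.sinh(sigma * h)
    left, right = x[k + 1] - t, t - x[k]
    return (z[k] * math.sinh(sigma * left) / s + (y[k] - z[k]) * left / h
            + z[k + 1] * math.sinh(sigma * right) / s
            + (y[k + 1] - z[k + 1]) * right / h)


sigma = 1.5
xs = [0.0, 0.5, 1.2, 2.0]
func = lambda t: math.cosh(sigma * t) + 2.0 * t
ys = [func(t) for t in xs]
zs = tension_spline_z(xs, ys, sigma, 2.0, sigma * math.sinh(sigma * 2.0) + 2.0)
value = tension_spline_eval(xs, ys, zs, sigma, 0.8)
```

The example function satisfies the defining condition that $y'' - \sigma^2 y$ is linear, so with exact end derivatives the spline reproduces it, which gives a simple consistency check of the system.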
Appendix C
Expectation Values of Splines
C.1 Cubic Splines
The integrals of polynomials times the normal distribution are a textbook matter.
In the form they are presented here they can e.g. be found in [6] and [9]. They are
included in this work for completeness and to emphasise a detail that can for longer
tenors cause large errors in the numerical evaluation.
We define
Ip(a, b, c) =
∫ b
a
(x + c)pn(x)dx (C.1)
with the normal distribution n(x) = (2π)−1/2 exp(−x2/2). For the treatment of cubic
splines only the cases p ∈ {0, 1, 2, 3} are relevant:
I0(a, b, c) = N(b) − N(a) (C.2)
I1(a, b, c) = c(N(b) − N(a)) + n(a) − n(b) (C.3)
I2(a, b, c) = (c2 + 1)(N(b) − N(a)) + (2c + a)n(a) − (2c + b)n(b) (C.4)
I3(a, b, c) = (c3 + 3c)(N(b) − N(a)) + (2c2 + ca + (c + a)2 + 2)n(a)
−(2c2 + cb + (c + b)2 + 2)n(b) , (C.5)
where
N(x) =
∫ x
−∞
n(x′)dx′ (C.6)
=1
2(1 + erf(x/
√2)) (C.7)
=1
2(2 − erfc(x/
√2)) (C.8)
is the cumulative normal distribution. The difference N(b)−N(a) can cause numerical
problems, as for a, b ≫ 1 both terms are very close to 1. Above some x ∈ R N(a) =
44
N(b) = 1 for all a, b > x within numerical precision. So, if the terms we want to
evaluate numerically have substantial contributions far from the central value of zero,
these contributions are not correctly taken into account.
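For these cases the complementary error function provides a numerically stable
form of the cumulative normal distribution. Assuming $|a - b| \le 1$, the following expressions
are used:
\[
N(b) - N(a) =
\begin{cases}
\left(\operatorname{erfc}(-b/\sqrt{2}) - \operatorname{erfc}(-a/\sqrt{2})\right)/2 & \text{for } a/\sqrt{2} \le -2 \\
\left(\operatorname{erf}(b/\sqrt{2}) - \operatorname{erf}(a/\sqrt{2})\right)/2 & \text{for } -2 < a/\sqrt{2} < 2 \\
\left(\operatorname{erfc}(a/\sqrt{2}) - \operatorname{erfc}(b/\sqrt{2})\right)/2 & \text{for } 2 \le a/\sqrt{2}
\end{cases}
\qquad \text{(C.9)}
\]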
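A minimal Python sketch of the case split (C.9), combined with the closed forms (C.2) to (C.5), is given below. It is meant as an illustration rather than the code used for the results in this work; the names `delta_n`, `moment` and `simpson` are hypothetical, and the Simpson rule is only there to cross-check the closed forms on short intervals.

```python
import math

SQRT2 = math.sqrt(2.0)

def n_pdf(x):
    """Standard normal density n(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def delta_n(a, b):
    """N(b) - N(a) following the case split of (C.9); assumes |a - b| <= 1."""
    u, v = a / SQRT2, b / SQRT2
    if u <= -2.0:
        return 0.5 * (math.erfc(-v) - math.erfc(-u))
    if u >= 2.0:
        return 0.5 * (math.erfc(u) - math.erfc(v))
    return 0.5 * (math.erf(v) - math.erf(u))

def moment(p, a, b, c):
    """I_p(a, b, c) for p in {0, 1, 2, 3}, equations (C.2) to (C.5)."""
    dn, na, nb = delta_n(a, b), n_pdf(a), n_pdf(b)
    if p == 0:
        return dn
    if p == 1:
        return c * dn + na - nb
    if p == 2:
        return (c * c + 1.0) * dn + (2.0 * c + a) * na - (2.0 * c + b) * nb
    if p == 3:
        return ((c**3 + 3.0 * c) * dn
                + (2.0 * c * c + c * a + (c + a) ** 2 + 2.0) * na
                - (2.0 * c * c + c * b + (c + b) ** 2 + 2.0) * nb)
    raise ValueError("only p = 0, 1, 2, 3 are needed for cubic splines")

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of sub-intervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

# Cross-check in the upper tail on an illustrative interval
a, b, c = 7.5, 8.2, -1.0
closed = moment(3, a, b, c)
numeric = simpson(lambda x: (x + c) ** 3 * n_pdf(x), a, b)
```

The restriction $|a - b| \le 1$ is not enforced in the sketch; an interval that spans several of the three regions would have to be split before calling `delta_n`.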
C.2 Tension Splines
The tension splines consist of a term that is linear in $x$ and another term of the
form $\sinh(\sigma'(x - x_0))$. The expectation values of both can be determined analytically.
The integral of the linear term can be taken from (C.3); for the other term we
define
\[
B_\pm(a, b, c) = \int_a^b \exp(\pm(x + c)) \, n(x) \, dx . \qquad \text{(C.10)}
\]
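So
\[
\int_a^b \sinh(x + c) \, n(x) \, dx = \tfrac{1}{2}\left(B_+(a, b, c) - B_-(a, b, c)\right) \qquad \text{(C.11)}
\]
with
\begin{align*}
B_\pm(a, b, c) &= \frac{1}{\sqrt{2\pi}} \int_a^b \exp\left(\pm(x + c) - \tfrac{1}{2}x^2\right) dx \\
&= \frac{1}{\sqrt{2\pi}} \int_a^b \exp\left(-\tfrac{1}{2}(x \mp 1)^2 + \tfrac{1}{2} \pm c\right) dx \\
&= \exp\left(\tfrac{1}{2} \pm c\right)\left[N(b \mp 1) - N(a \mp 1)\right] . && \text{(C.12)}
\end{align*}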
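The result can be checked numerically. Completing the square in the exponent gives $-\tfrac{1}{2}(x \mp 1)^2 + \tfrac{1}{2} \pm c$, so the prefactor carries the same sign on $c$ as the exponent of (C.10); the Python sketch below uses $\exp(\tfrac{1}{2} \pm c)$ accordingly. The function names and test values are hypothetical, and the plain difference of cumulative normals is only adequate near the centre of the distribution; in the tails it would have to be replaced by the case split (C.9).

```python
import math

SQRT2 = math.sqrt(2.0)

def n_pdf(x):
    """Standard normal density n(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Cumulative normal distribution N(x), written with erfc."""
    return 0.5 * math.erfc(-x / SQRT2)

def b_pm(sign, a, b, c):
    """B_+ (sign = +1) or B_- (sign = -1) in closed form."""
    return math.exp(0.5 + sign * c) * (norm_cdf(b - sign) - norm_cdf(a - sign))

def sinh_expectation(a, b, c):
    """Integral of sinh(x + c) n(x) over [a, b], equation (C.11)."""
    return 0.5 * (b_pm(+1, a, b, c) - b_pm(-1, a, b, c))

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of sub-intervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

# Cross-check on an illustrative interval around the centre
a, b, c = -1.0, 2.0, 0.7
closed = sinh_expectation(a, b, c)
numeric = simpson(lambda x: math.sinh(x + c) * n_pdf(x), a, b)
```

Comparing `closed` with `numeric`, and `b_pm(-1, a, b, c)` with a direct quadrature of $\exp(-(x + c))\,n(x)$, confirms the sign convention for both branches.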
Appendix D
Combined Approaches to Improve Convergence
The two methods used to improve convergence in sections 4.1 and 4.2, using a spline
type that is less prone to over-oscillations and splitting off the exponential behaviour,
can be easily combined.
As Figures D.1 and D.2 indicate, this does not lead to a further increase in stability.
Instead, larger deviations from the analytical solution can be observed than in the
case of ordinary cubic splines and split off exponential behaviour.
This hints at the trade-off that comes with the other spline types, namely a lack
of smoothness. It becomes most evident when considering splines under tension with
a high tension factor: the spline under tension then converges to straight lines
connecting the grid points, cutting off any maximum of $f_{s,i}$ that does not happen to lie
on a grid point.
[Figure D.1: two panels plotted against i (horizontal axis 0 to 20, vertical axis 0 to 5), each with curves labelled 2, 1, 0.5, 0.25 and analytic.]
Figure D.1: The effect of h on the quality of the Sali approach with split off asymptotic behaviour and Akima splines. The graphs show L_0 − L_i with n = 20, x = 60, ψ = 0.3 (above) and ψ = 0.2 (below).
[Figure D.2: two panels plotted against i (horizontal axis 0 to 20, vertical axis 0 to 5), each with curves labelled 2, 1, 0.5, 0.25 and analytic.]
Figure D.2: The effect of h on the quality of the Sali approach with split off asymptotic behaviour and splines under tension. The graphs show L_0 − L_i with n = 20, x = 60, ψ = 0.3 (above) and ψ = 0.2 (below).
Bibliography
[1] H. Akima. A new method of interpolation and smooth curve fitting based on
local procedures. J. ACM, 17(4):589–602, 1970.
[2] L. B. Andersen. Yield Curve Construction with Tension Splines. SSRN eLibrary,
2005.
[3] P. Balland. Semi-analytic mesh: From s to m. Merrill Lynch Technical Report,
1999.
[4] A. K. Cline. Scalar- and planar-valued curve fitting using splines under tension.
Comm. ACM, 17(4):218–220, 1974.
[5] A. K. Cline. Six subprograms for curve fitting using splines under tension.
Comm. ACM, 17(4):220–223, 1974.
[6] Z. Hu et al. Cutting edges using domain integration. Risk Magazine,
19(11):95–99, 2006.
[7] F. Black, E. Derman and W. Toy. A one-factor model of interest rates and
its application to treasury bond options. Fin. Anal. Journ., pages 33–39, 1990.
[8] C. A. Hall and W. W. Meyer. Optimal error bounds for cubic spline
interpolation. J. Approximation Theory, 16:105–122, 1976.
[9] P. J. Hunt and J. E. Kennedy. Financial Derivatives in Theory and Practice.
John Wiley & Sons: Chichester, 2000.
[10] Merrill Lynch Quantitative Risk Management. Private communication.
[11] P. Hunt, J. Kennedy and A. Pelsser. Markov-functional interest rate models.
Finance and Stochastics, 4(4):391–408, 2000.
[12] R. J. Renka. Interpolatory tension splines with automatic selection of tension
factors. SIAM J. Sci. Comput., 8(3):393–415, 1987.
[13] D. G. Schweikert. An interpolation curve using a spline in tension. J. Math.
Physics, 45:312–317, 1966.
[14] J. Stoer. Numerische Mathematik 1. Springer-Verlag, 1993.
[15] W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery.
Numerical Recipes in C. Cambridge University Press, 1992.