We shall deal with probability distributions suitable for stochastic modelling of the frequency and severity of insurance losses and of aggregate losses.
1 Claim Severity Models
This section is devoted to determining the probability distribution of the amount of a single payment. The inference is based on data from past experience with the given type of losses.
1.1 Empirical estimation
Let $X_1, \dots, X_n$ be i.i.d. random variables (a random sample) with d.f. $F(x)$. Our goal is to learn as much as possible about $F(x)$ from the random sample.
The empirical approach estimates $F(x)$ by the empirical distribution. For observations $X_1 = x_1, \dots, X_n = x_n$ we define the empirical distribution function as
$$F_n(x) = \frac{\sum_{j=1}^n I[x_j \le x]}{n},$$
where $I[A] = 1$ if $A$ holds true, $I[A] = 0$ otherwise.
The empirical d.f. is a step function that increases by $1/n$ at each data point.
Grouped data
In this case we have boundaries $c_0 < c_1 < \dots < c_r \le +\infty$ and for each $j = 1, \dots, r$ we know $n_j$, the number of observations in the interval $(c_{j-1}, c_j]$. The empirical d.f. can be obtained at the boundaries as
$$F_n(c_j) = \frac{1}{n}\sum_{i=1}^{j} n_i.$$
The graph formed by connecting the d.f. at these points by straight lines is called the ogive and is an approximation to the empirical d.f. Formally,
$$\tilde F_n(x) = \begin{cases} 0 & \text{if } x \le c_0, \\[4pt] \dfrac{(c_j - x)\,F_n(c_{j-1}) + (x - c_{j-1})\,F_n(c_j)}{c_j - c_{j-1}} & \text{if } c_{j-1} < x \le c_j, \\[4pt] 1 & \text{if } x > c_r. \end{cases}$$
$\tilde F_n(x)$ is not defined for $x > c_{r-1}$ if $c_r = \infty$ (unless $n_r = 0$).
The derivative (where it exists) of the ogive is an empirical approximation to the probability density function and is called a histogram:
$$\tilde f_n(x) = \begin{cases} 0 & \text{if } x \le c_0, \\[4pt] \dfrac{F_n(c_j) - F_n(c_{j-1})}{c_j - c_{j-1}} = \dfrac{n_j}{n\,(c_j - c_{j-1})} & \text{if } c_{j-1} < x \le c_j, \\[4pt] 0 & \text{if } x > c_r. \end{cases}$$
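To make the grouped-data formulas concrete, the following minimal Python sketch evaluates the ogive and the histogram; the boundaries and counts are hypothetical, chosen only for illustration.

```python
import bisect

# Hypothetical grouped data: boundaries c_0 < ... < c_r and counts n_j
c = [0, 500, 1000, 2500, 5000]   # boundaries c_0, ..., c_r
counts = [24, 31, 57, 18]        # n_j = number of observations in (c_{j-1}, c_j]
n = sum(counts)

# Empirical d.f. at the boundaries: F_n(c_j) = (n_1 + ... + n_j) / n
F = [0.0]
for nj in counts:
    F.append(F[-1] + nj / n)

def ogive(x):
    """Linear interpolation of F_n between the boundaries."""
    if x <= c[0]:
        return 0.0
    if x > c[-1]:
        return 1.0
    j = bisect.bisect_left(c, x)   # index with c[j-1] < x <= c[j]
    return ((c[j] - x) * F[j - 1] + (x - c[j - 1]) * F[j]) / (c[j] - c[j - 1])

def histogram(x):
    """Piecewise constant density estimate n_j / (n (c_j - c_{j-1}))."""
    if x <= c[0] or x > c[-1]:
        return 0.0
    j = bisect.bisect_left(c, x)
    return counts[j - 1] / (n * (c[j] - c[j - 1]))

print(ogive(750.0), histogram(750.0))
```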
1.2 Moments
Let $X$ be a positive random variable with d.f. $F(x)$, $x \ge 0$. We define:

- the k-th raw moment (k-th moment about the origin)
$$\mu'_k = \int_0^\infty x^k f(x)\,dx,$$
where $f(x)$ is the p.d.f. corresponding to the d.f. $F(x)$.
The empirical estimate of $\mu'_k$ based on the observations $X_1 = x_1, \dots, X_n = x_n$ is
$$\hat\mu'_k = \int x^k\,dF_n(x) = \frac{1}{n}\sum_{j=1}^n x_j^k.$$
For grouped data we obtain, using the histogram as an estimate of the density (provided that $c_r < \infty$),
$$\hat\mu'_k = \sum_{j=1}^r \int_{c_{j-1}}^{c_j} \frac{x^k\,n_j}{n\,(c_j - c_{j-1})}\,dx = \sum_{j=1}^r \frac{n_j\,(c_j^{k+1} - c_{j-1}^{k+1})}{n\,(k+1)\,(c_j - c_{j-1})}.$$
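Both moment estimates translate directly into code; a minimal sketch, again with hypothetical data:

```python
# k-th raw moment from individual observations: (1/n) sum_j x_j^k
def raw_moment(xs, k):
    return sum(x ** k for x in xs) / len(xs)

# k-th raw moment from grouped data (c_r finite), integrating the histogram:
# sum_j n_j (c_j^{k+1} - c_{j-1}^{k+1}) / (n (k+1) (c_j - c_{j-1}))
def raw_moment_grouped(c, counts, k):
    n = sum(counts)
    return sum(
        nj * (c[j + 1] ** (k + 1) - c[j] ** (k + 1))
        / (n * (k + 1) * (c[j + 1] - c[j]))
        for j, nj in enumerate(counts)
    )

xs = [120, 350, 80, 940, 1500]     # hypothetical individual observations
c = [0, 500, 1000, 2500, 5000]     # hypothetical boundaries
counts = [24, 31, 57, 18]          # hypothetical counts

print(raw_moment(xs, 1))               # sample mean
print(raw_moment_grouped(c, counts, 1))  # estimated mean from grouped data
print(raw_moment_grouped(c, counts, 2))  # estimated second raw moment
```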
Quantities useful for insurance calculations:

- the k-th limited moment
$$E\left[(X \wedge u)^k\right] = E\left[\min(X, u)^k\right].$$
$E(X \wedge u)$ is called the limited expected value (LEV).

In the context of insurance, $u$ could be a policy limit: when the loss exceeds the limit $u$, only the amount $u$ is considered as a loss to which the insurance cover applies. (Similar: XL-reinsurance treaty.)
It holds that
$$E\left[(X \wedge u)^k\right] = \int_0^u x^k\,dF(x) + u^k\,(1 - F(u)).$$
For a continuous distribution with a density $f(x)$,
$$E\left[(X \wedge u)^k\right] = \int_0^u x^k f(x)\,dx + u^k \int_u^\infty f(x)\,dx.$$
For the LEV of a positive random variable it holds that
$$E(X \wedge u) = \int_0^u (1 - F(x))\,dx. \qquad (1)$$
Proof. Integrating by parts,
$$E(X \wedge u) = \int_0^u x\,dF(x) + u\,(1 - F(u)) = -x\,(1 - F(x))\Big|_0^u + \int_0^u (1 - F(x))\,dx + u\,(1 - F(u)),$$
and the boundary term $-u\,(1 - F(u))$ cancels against the last term. (Note that for $u \to +\infty$ we obtain the well-known formula for the expected value of a positive random variable.)
In insurance we can deal with a deductible: when the loss is less than or equal to the deductible, there is no payment, and when the loss exceeds the deductible, the amount paid is the loss less the deductible.

Let $X$ be the random variable representing the loss. With a deductible of $d$ and a limit of $u$, the amount paid (per loss) is represented by the r.v. $Y$,
$$Y = \begin{cases} 0 & \text{if } X \le d, \\ X - d & \text{if } d < X < u, \\ u - d & \text{if } X \ge u. \end{cases}$$
Then the expected amount paid per loss is
$$\begin{aligned}
EY &= \int_d^u (x - d)\,dF(x) + (u - d)\,(1 - F(u)) \\
&= \int_0^u x\,dF(x) - \int_0^d x\,dF(x) - d\,(F(u) - F(d)) + (u - d)\,(1 - F(u)) \\
&= \int_0^u x\,dF(x) + u\,(1 - F(u)) - \int_0^d x\,dF(x) - d\,(1 - F(d)) \\
&= E(X \wedge u) - E(X \wedge d).
\end{aligned}$$
According to (1) it holds that
$$EY = \int_d^u (1 - F(x))\,dx.$$
The expected payment per payment is
$$E[Y \mid X > d] = \frac{E(X \wedge u) - E(X \wedge d)}{1 - F(d)}.$$
Definition. The loss elimination ratio for a deductible of $d$ is the relative reduction in the expected payment given the imposition of the deductible,
$$\mathrm{LER}_X(d) = \frac{E[X \wedge d]}{EX},$$
provided that $EX$ and $E(X \wedge d)$ exist.
Definition. The mean excess loss for a deductible $d$ is the expected loss in excess of $d$, conditioned on the loss exceeding the deductible,
$$e_X(d) = E(X - d \mid X > d) = \frac{EX - E(X \wedge d)}{1 - F(d)}.$$
The empirical estimate of the k-th limited moment based on the observations $X_1 = x_1, \dots, X_n = x_n$ is
$$\widehat{E\left[(X \wedge u)^k\right]} = \frac{1}{n}\left(\sum_{x_j < u} x_j^k + \sum_{x_j \ge u} u^k\right).$$
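The empirical limited moment, and with it the loss elimination ratio and the mean excess loss defined above, are easy to compute directly; a minimal sketch with a hypothetical sample:

```python
def limited_moment(xs, u, k=1):
    """Empirical E[(X ^ u)^k]: cap each observation at u."""
    return sum(min(x, u) ** k for x in xs) / len(xs)

def ler(xs, d):
    """Loss elimination ratio: E[X ^ d] / E[X]."""
    return limited_moment(xs, d) / (sum(xs) / len(xs))

def mean_excess(xs, d):
    """e_X(d) = (E[X] - E[X ^ d]) / (1 - F_n(d))."""
    tail = sum(1 for x in xs if x > d) / len(xs)   # empirical 1 - F_n(d)
    mean = sum(xs) / len(xs)
    return (mean - limited_moment(xs, d)) / tail

losses = [120, 350, 80, 940, 1500, 60, 430, 2700]  # hypothetical losses
print(limited_moment(losses, 1000.0))
print(ler(losses, 250.0), mean_excess(losses, 250.0))
```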
For grouped data with boundaries $c_0 < c_1 < \dots < c_r$ we assume that $u$ is such that $c_{j-1} \le u \le c_j$. Then we can use the histogram to estimate the k-th limited moment:
$$\begin{aligned}
\widetilde{E\left[(X \wedge u)^k\right]} &= \sum_{i=1}^{j-1}\int_{c_{i-1}}^{c_i} \frac{x^k\,n_i}{n\,(c_i - c_{i-1})}\,dx + \int_{c_{j-1}}^{u} \frac{x^k\,n_j}{n\,(c_j - c_{j-1})}\,dx \\
&\quad + \int_{u}^{c_j} \frac{u^k\,n_j}{n\,(c_j - c_{j-1})}\,dx + \sum_{i=j+1}^{r}\int_{c_{i-1}}^{c_i} \frac{u^k\,n_i}{n\,(c_i - c_{i-1})}\,dx \\
&= \sum_{i=1}^{j-1}\frac{n_i\,(c_i^{k+1} - c_{i-1}^{k+1})}{n\,(k+1)\,(c_i - c_{i-1})} + \frac{n_j\,(u^{k+1} - c_{j-1}^{k+1})}{n\,(k+1)\,(c_j - c_{j-1})} \\
&\quad + \frac{n_j\,u^k\,(c_j - u)}{n\,(c_j - c_{j-1})} + \sum_{i=j+1}^{r}\frac{n_i\,u^k}{n}.
\end{aligned}$$
1.3 Parametric models
We shall consider a parametric family of distributions $\{F(x; \theta),\ \theta \in \Theta\}$, where $\theta$ is a parameter (scalar or vector) and $\Theta$ is the set of all possible parameter values.

Parametric inference proceeds in the following steps:

1) Determine which parametric family describes the population.
2) Determine the value of the parameter (vector of parameters).
3) Determine the value of the quantity of interest.
4) Assess the accuracy of the value determined in 3).

In steps 1, 2 and 4 we use methods of mathematical statistics (e.g. parameter estimation, hypothesis tests, ...).
Examples of parametric distributions and their characteristics
Exponential distribution
$$f(x) = \frac{1}{\theta}\,e^{-x/\theta}, \qquad F(x) = 1 - e^{-x/\theta}, \qquad x \ge 0,\ \theta > 0,$$
$$EX^k = \int_0^\infty x^k f(x)\,dx = \theta^k \int_0^\infty y^k e^{-y}\,dy = \theta^k\,\Gamma(k+1) = \theta^k\,k!,$$
$$E\left[(X \wedge u)^k\right] = \int_0^u x^k f(x)\,dx + u^k\int_u^\infty f(x)\,dx = \theta^k \int_0^{u/\theta} y^k e^{-y}\,dy + u^k\,e^{-u/\theta}.$$
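For $k = 1$ this (together with (1)) gives the closed form $E(X \wedge u) = \theta(1 - e^{-u/\theta})$, which can be verified numerically; a minimal sketch with arbitrary parameters:

```python
import math

theta, u = 1000.0, 1500.0   # arbitrary illustrative parameters

# Closed form from (1): E(X ^ u) = integral of e^{-x/theta} over (0, u)
lev_closed = theta * (1.0 - math.exp(-u / theta))

# Crude midpoint integration of the survival function as a cross-check
steps = 100_000
dx = u / steps
lev_numeric = sum(math.exp(-(i + 0.5) * dx / theta) for i in range(steps)) * dx

print(lev_closed, lev_numeric)   # the two values agree closely
```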
Gamma distribution
$$f(x) = \frac{\left(\frac{x}{\theta}\right)^{\alpha} e^{-x/\theta}}{x\,\Gamma(\alpha)}, \qquad x \ge 0,\ \alpha > 0,\ \theta > 0,$$
$$EX^k = \frac{\theta^{k-1}}{\Gamma(\alpha)}\int_0^\infty \left(\frac{x}{\theta}\right)^{\alpha + k - 1} e^{-x/\theta}\,dx = \frac{\theta^k\,\Gamma(\alpha + k)}{\Gamma(\alpha)} = \theta^k\,(\alpha + k - 1)\cdots\alpha.$$
1.4 Creating new families of parametric distributions: transformations

Let $X$ be a continuous r.v. with p.d.f. $f_X(x)$ and d.f. $F_X(x)$, $x \ge 0$. We shall explain the derivation of more general parametric families via various transformations of the r.v. $X$.
1) Multiplication by a constant (change of scale)

Let $Y = \theta X$, $\theta > 0$. Then
$$F_Y(y) = F_X(y/\theta) \quad\text{and}\quad f_Y(y) = \frac{1}{\theta}\,f_X(y/\theta), \qquad y > 0.$$
2) Raising to a power

Let $Y = X^{1/\tau}$. Then if $\tau > 0$ we obtain
$$F_Y(y) = F_X(y^\tau), \qquad f_Y(y) = \tau\,y^{\tau - 1}\,f_X(y^\tau), \qquad y > 0,$$
and the distribution of $Y$ is called transformed. If $\tau < 0$ it holds that
$$F_Y(y) = 1 - F_X(y^\tau), \qquad f_Y(y) = -\tau\,y^{\tau - 1}\,f_X(y^\tau), \qquad y > 0,$$
and the distribution of $Y$ is called inverse transformed. In the special case $\tau = -1$ we speak about the inverse distribution.
Important examples of parametric families are obtained by transforming a Gamma distributed random variable.

Let $X$ have a Gamma distribution with $\theta = 1$ and let $\tau > 0$. Then the r.v. $Y = X^{1/\tau}$ has the p.d.f.
$$f(y) = \frac{\tau\,y^{\tau\alpha}\,e^{-y^\tau}}{y\,\Gamma(\alpha)}, \qquad y > 0.$$
After introducing a scale parameter $\theta$ we obtain the p.d.f. of the transformed Gamma distribution:
$$f(y) = \frac{\tau\left(\frac{y}{\theta}\right)^{\tau\alpha} e^{-(y/\theta)^\tau}}{y\,\Gamma(\alpha)}, \qquad y > 0.$$
It is a 3-parameter family with some well-known distributions as special cases: Gamma ($\tau = 1$), Weibull ($\alpha = 1$), Exponential ($\alpha = \tau = 1$).

Moments of the transformed Gamma distribution can be expressed by the formula
$$EX^k = \frac{\theta^k\,\Gamma(\alpha + k/\tau)}{\Gamma(\alpha)}.$$
(It follows from
$$EX^k = \theta^k \int_0^\infty \frac{\tau\left(\frac{y}{\theta}\right)^{\tau\alpha + k} e^{-(y/\theta)^\tau}}{y\,\Gamma(\alpha)}\,dy$$
by substituting $x = (y/\theta)^\tau$.)
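The moment formula can also be checked by Monte Carlo simulation, generating the transformed Gamma variable as $Y = \theta X^{1/\tau}$ with $X$ Gamma-distributed with shape $\alpha$ and unit scale; the parameters below are arbitrary:

```python
import math
import random

alpha, tau, theta, k = 2.0, 1.5, 100.0, 1   # arbitrary illustrative parameters

random.seed(1)
samples = [theta * random.gammavariate(alpha, 1.0) ** (1.0 / tau)
           for _ in range(200_000)]
mc_moment = sum(y ** k for y in samples) / len(samples)

# Formula: E Y^k = theta^k Gamma(alpha + k/tau) / Gamma(alpha)
exact = theta ** k * math.gamma(alpha + k / tau) / math.gamma(alpha)

print(mc_moment, exact)   # close for a large sample
```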
Raising $X$ to a power $\tau < 0$ gives the p.d.f.
$$f_Y(y) = \frac{-\tau\,y^{\tau\alpha}\,e^{-y^\tau}}{y\,\Gamma(\alpha)}.$$
We substitute the negative parameter $\tau$ by its opposite value and again introduce a scale parameter $\theta$. The resulting density
$$f(y) = \frac{\tau\left(\frac{\theta}{y}\right)^{\tau\alpha} e^{-(\theta/y)^\tau}}{y\,\Gamma(\alpha)}, \qquad y > 0,\ \tau > 0,$$
is the p.d.f. of the so-called inverse transformed Gamma distribution.

Special cases are: inverse Gamma ($\tau = 1$), inverse Weibull ($\alpha = 1$), inverse exponential ($\alpha = \tau = 1$).
Moments of the inverse transformed Gamma distribution are given by
$$EX^k = \frac{\theta^k\,\Gamma(\alpha - k/\tau)}{\Gamma(\alpha)}, \qquad k < \alpha\tau.$$
The Pareto distribution is a special case of the so-called transformed Beta distribution with p.d.f.
$$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\cdot\frac{\gamma\,(x/\theta)^{\gamma\beta}}{x\,\left[1 + (x/\theta)^{\gamma}\right]^{\alpha + \beta}}.$$
Moments of this distribution are given by
$$EX^k = \frac{\theta^k\,\Gamma(\beta + k/\gamma)\,\Gamma(\alpha - k/\gamma)}{\Gamma(\alpha)\,\Gamma(\beta)}$$
and are finite only in the case $k < \alpha\gamma$.

Setting $\gamma = 1$, $\beta = 1$ we obtain the Pareto distribution with p.d.f.
$$f(x) = \alpha\,\theta^\alpha\,(\theta + x)^{-\alpha - 1}, \qquad x \ge 0.$$
3) Exponentiation

Let $Y = e^X$. Then $Y$ has d.f. $F_Y(y) = F_X(\log y)$ and p.d.f. $f_Y(y) = \frac{1}{y}\,f_X(\log y)$.
Let $X$ have the normal distribution $N(\mu, \sigma^2)$. Then $Y$ has the lognormal distribution with p.d.f.
$$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\,\exp\left\{-\frac{(\log x - \mu)^2}{2\sigma^2}\right\}.$$
Moments of the lognormal distribution can be expressed using the moment generating function of the normal distribution:
$$EX^k = e^{k\mu + k^2\sigma^2/2}.$$
1.5 Tail behavior
The tail behavior is expressed by the survival function
$$S(x) = 1 - F(x) = P(X > x)$$
considered for $x \to \infty$.

The tail behavior of two probability distributions is similar if the ratio of their survival functions tends to a constant non-zero limit as $x \to \infty$. The same holds for the ratio of their probability density functions, since by l'Hospital's rule
$$\lim_{x\to\infty}\frac{S_X(x)}{S_Y(x)} = \lim_{x\to\infty}\frac{f_X(x)}{f_Y(x)}.$$
We shall illustrate the comparison of probability distributions according to their tail behavior for the following examples:
1) Pareto
$$F(x) = 1 - \left(\frac{x}{\lambda}\right)^{-\alpha}, \quad x \ge \lambda,\ \alpha > 0, \qquad f(x) = \alpha\,\lambda^{\alpha}\,x^{-\alpha - 1};$$
2) Gamma
$$f(x) = \frac{\left(\frac{x}{\theta}\right)^{\alpha} e^{-x/\theta}}{x\,\Gamma(\alpha)}, \qquad x \ge 0,\ \alpha > 0,\ \theta > 0;$$
3) lognormal
$$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\,\exp\left\{-\frac{(\log x - \mu)^2}{2\sigma^2}\right\}.$$
We obtain the following comparisons (constant factors are omitted, as they do not affect whether the limit is zero, finite, or infinite):

1) Gamma vs. Pareto
$$\lim_{x\to\infty}\frac{x^{\alpha - 1}\,e^{-x/\theta}}{x^{-(\alpha + 1)}} = \lim_{x\to\infty}\exp\left[(\alpha - 1)\log x - \frac{x}{\theta} + (\alpha + 1)\log x\right] = 0.$$
Pareto has a heavier tail than Gamma.
2) lognormal vs. Gamma
$$\lim_{x\to\infty}\frac{x^{-1}\exp\left\{-\frac{1}{2\sigma^2}(\log x - \mu)^2\right\}}{x^{\alpha - 1}\,e^{-x/\theta}} = \lim_{x\to\infty}\exp\left\{-\frac{1}{2\sigma^2}(\log x - \mu)^2 - \alpha\log x + \frac{x}{\theta}\right\} = +\infty.$$
Lognormal has a heavier tail than Gamma.
3) Pareto vs. lognormal
$$\lim_{x\to\infty}\frac{x^{-(\alpha + 1)}}{x^{-1}\exp\left\{-\frac{1}{2\sigma^2}(\log x - \mu)^2\right\}} = \lim_{x\to\infty}\exp\left\{-\alpha\log x + \frac{1}{2\sigma^2}(\log x - \mu)^2\right\} = +\infty.$$
Pareto has a heavier tail than lognormal.
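These three limits can be observed numerically by evaluating the density ratios at increasing values of $x$; the parameter choices below are arbitrary (and $x$ is kept moderate so that $e^{-x/\theta}$ does not underflow):

```python
import math

alpha, lam = 2.0, 1.0    # Pareto parameters (arbitrary)
theta = 1.0              # Gamma scale; the Gamma shape reuses alpha
mu, sigma = 0.0, 1.0     # lognormal parameters

def pareto_pdf(x):
    return alpha * lam ** alpha * x ** (-alpha - 1)

def gamma_pdf(x):
    return (x / theta) ** alpha * math.exp(-x / theta) / (x * math.gamma(alpha))

def lognormal_pdf(x):
    return (math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2))
            / (x * sigma * math.sqrt(2 * math.pi)))

for x in [10.0, 50.0, 200.0]:
    print(x,
          gamma_pdf(x) / pareto_pdf(x),       # -> 0:   Pareto heavier than Gamma
          lognormal_pdf(x) / gamma_pdf(x),    # -> inf: lognormal heavier than Gamma
          pareto_pdf(x) / lognormal_pdf(x))   # -> inf: Pareto heavier than lognormal
```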
2 Claim Frequency Models
We shall consider discrete distributions on non-negative integer values, i.e.
$$p_k = P(N = k), \qquad k = 0, 1, \dots$$
Examples of parametric distributions used most frequently:
1) Poisson distribution
$$p_k = \frac{e^{-\lambda}\,\lambda^k}{k!}, \qquad k = 0, 1, \dots;$$
2) Negative Binomial distribution
$$p_k = \binom{k + r - 1}{k}\left(\frac{1}{1+\beta}\right)^{r}\left(\frac{\beta}{1+\beta}\right)^{k}, \qquad k = 0, 1, \dots,\ r > 0,\ \beta > 0;$$
3) Binomial distribution
$$p_k = \binom{m}{k}\,q^k\,(1-q)^{m-k}, \qquad k = 0, 1, \dots, m,\ 0 < q < 1;$$
4) Geometric distribution (Negative Binomial with $r = 1$)
$$p_k = \frac{1}{1+\beta}\left(\frac{\beta}{1+\beta}\right)^{k}, \qquad k = 0, 1, \dots,\ \beta > 0.$$
2.1 The (a, b, 0) class

All the above mentioned distributions belong to a general class of two-parameter distributions, called the (a, b, 0) class.

Definition. A discrete random variable with the probability function (p.f.) $\{p_k,\ k = 0, 1, \dots\}$ is a member of the (a, b, 0) class, provided that there exist constants $a$, $b$ such that
$$\frac{p_k}{p_{k-1}} = a + \frac{b}{k}, \qquad k = 1, 2, \dots \qquad (2)$$
Note that the probability $p_0$ is determined by (2) through the condition $\sum_{k=0}^\infty p_k = 1$.

For the above mentioned distributions we obtain the following values of the parameters $a$, $b$:
Table 1: Members of the (a, b, 0) class

Distribution         a                   b
Poisson              $0$                 $\lambda$
Negative Binomial    $\beta/(1+\beta)$   $(r-1)\,\beta/(1+\beta)$
Binomial             $-q/(1-q)$          $(m+1)\,q/(1-q)$
Geometric            $\beta/(1+\beta)$   $0$
It can be shown that the above mentioned distributions are the only discrete distributions satisfying (2).
Formula (2) can be rewritten as
$$\frac{k\,p_k}{p_{k-1}} = a\,k + b, \qquad k = 1, 2, \dots$$
Assume that we observe the number of claims during a certain period of time for $n$ policies. Let $n_k$ be the number of policies with $k$ recorded claims, $k = 0, 1, \dots$ We can estimate the ratio $p_k/p_{k-1}$ by $n_k/n_{k-1}$:
$$\frac{\hat p_k}{\hat p_{k-1}} = \frac{n_k}{n_{k-1}}.$$
This suggests a graphical way of indicating which of the distributions should be selected: we plot the points $\left[k,\ k\,\frac{n_k}{n_{k-1}}\right]$ for $k = 1, 2, \dots$ The points should form a straight line, where the slope is 0 for the Poisson distribution, negative for the binomial, and positive for the negative binomial distribution.
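A minimal sketch of this diagnostic; the claim counts $n_k$ are hypothetical:

```python
# Hypothetical claim-count data: n_k = number of policies with k claims
counts = {0: 861, 1: 121, 2: 13, 3: 3, 4: 1}

# Points [k, k * n_k / n_{k-1}]: an approximately linear pattern supports an
# (a, b, 0) member; slope ~ 0 suggests Poisson, a negative slope binomial,
# a positive slope negative binomial.
for k in range(1, max(counts) + 1):
    if counts.get(k - 1, 0) > 0 and counts.get(k, 0) > 0:
        print(k, k * counts[k] / counts[k - 1])
```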
2.2 The (a, b, 1) class

We explain a generalization of the (a, b, 0) class that enables a better fit of the probability at zero.

Definition. A discrete random variable with probability function $\{p_k,\ k = 0, 1, \dots\}$ is a member of the (a, b, 1) class, provided that there exist constants $a$, $b$ such that
$$\frac{p_k}{p_{k-1}} = a + \frac{b}{k}, \qquad k = 2, 3, \dots \qquad (3)$$
The distribution for $k = 1, 2, \dots$ has the same shape as in the (a, b, 0) class in the sense that the probabilities are the same up to a constant of proportionality. $\sum_{k=1}^\infty p_k$ can be set to any number in the interval $(0, 1]$. The remaining probability is $p_0 = 1 - \sum_{k=1}^\infty p_k$.
When we set $p_0 = 0$, the distribution is called zero-truncated (ZT). When $p_0 > 0$, the distribution is called zero-modified (ZM).

The zero-modified distribution can be viewed as a mixture of a zero-truncated distribution and a degenerate distribution with all the probability at zero.
To show this, we denote by $\{p_k,\ k = 0, 1, \dots\}$ the distribution from the (a, b, 0) class and by $\{p_k^M,\ k = 0, 1, \dots\}$ the corresponding distribution from the (a, b, 1) class. The probability generating functions of these distributions are
$$P(z) = \sum_{k=0}^\infty p_k z^k, \qquad P^M(z) = \sum_{k=0}^\infty p_k^M z^k.$$
It holds that
$$p_k^M = c\,p_k, \qquad k = 1, 2, \dots,$$
and $p_0^M$ is an arbitrary number. Then
$$P^M(z) = p_0^M + \sum_{k=1}^\infty p_k^M z^k = p_0^M + c\sum_{k=1}^\infty p_k z^k = p_0^M + c\,[P(z) - p_0].$$
Since $P^M(1) = P(1) = 1$,
$$1 = p_0^M + c\,(1 - p_0),$$
resulting in
$$c = \frac{1 - p_0^M}{1 - p_0},$$
hence
$$p_k^M = \frac{1 - p_0^M}{1 - p_0}\,p_k, \qquad k = 1, 2, \dots$$
We can write
$$P^M(z) = p_0^M + \frac{1 - p_0^M}{1 - p_0}\,[P(z) - p_0] = \left(1 - \frac{1 - p_0^M}{1 - p_0}\right) + \frac{1 - p_0^M}{1 - p_0}\,P(z);$$
this is a weighted average of the probability generating functions of the degenerate distribution and the corresponding (a, b, 0) member.
The zero-truncated distribution can be viewed as a special case of the zero-modified distribution with $p_0^M = 0$. Then we obtain
$$p_k^T = \frac{p_k}{1 - p_0}, \qquad k = 1, 2, \dots$$
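As an illustration, the following sketch zero-modifies a Poisson distribution using the proportionality relation above; $\lambda$ and the new mass at zero are arbitrary choices (setting it to 0 gives the zero-truncated case):

```python
import math

lam = 0.8          # Poisson parameter (arbitrary)
p0_modified = 0.5  # chosen p_0^M; p0_modified = 0.0 gives the ZT distribution

# (a, b, 0) Poisson probabilities p_k via the recursion p_k / p_{k-1} = lam / k
p = [math.exp(-lam)]
for k in range(1, 20):
    p.append(p[-1] * lam / k)

# Zero-modified probabilities: p_k^M = (1 - p_0^M) / (1 - p_0) * p_k for k >= 1
c = (1.0 - p0_modified) / (1.0 - p[0])
pm = [p0_modified] + [c * pk for pk in p[1:]]

print(sum(pm))   # ~ 1 (up to the truncated far tail)
```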
We give a summary of the (a, b, 1) class:

1. For $a = 0$, $b = \lambda$, $\lambda > 0$ we obtain
   Poisson ($p_0 = e^{-\lambda}$),
   ZT Poisson ($p_0 = 0$),
   ZM Poisson ($p_0$ arbitrary).
2. For $a = -\frac{q}{1-q}$, $b = (m+1)\frac{q}{1-q}$, $0 < q < 1$ we obtain
   binomial ($p_0 = (1-q)^m$),
   ZT binomial ($p_0 = 0$),
   ZM binomial ($p_0$ arbitrary).
3. For the negative binomial distribution in the (a, b, 0) class we have $a = \frac{\beta}{1+\beta}$, $b = (r-1)\frac{\beta}{1+\beta}$, $r > 0$, $\beta > 0$, with $p_0 = (1+\beta)^{-r}$.

   In the (a, b, 1) class the possible values of the parameter $r$ can be extended to $r > -1$.

   We have to show that for $r > -1$, $\beta > 0$, the recursive formula (3) with $p_0 = 0$ defines a proper distribution. It is sufficient to show that for any value of $p_1$, the values $p_k$, $k = 2, 3, \dots$, obtained from (3) satisfy $p_k > 0$ and $\sum_{k=1}^\infty p_k < \infty$.
$$p_k = p_{k-1}\left(\frac{\beta}{1+\beta} + \frac{r-1}{k}\cdot\frac{\beta}{1+\beta}\right) = p_1\left(\frac{\beta}{1+\beta}\right)^{k-1}\left(1 + \frac{r-1}{k}\right)\cdots\left(1 + \frac{r-1}{2}\right),$$
where $r - 1 > -k$ for $k = 2, 3, \dots$ (since $r > -1$), so all factors are positive. Further,
$$\sum_{k=2}^\infty p_k = p_1\sum_{k=2}^\infty\left(\frac{\beta}{1+\beta}\right)^{k-1}\frac{(r+1)\cdots(r+k-1)}{k!} \le p_1\sum_{k=1}^\infty\left(\frac{\beta}{1+\beta}\right)^{k}\binom{-(r+1)}{k}(-1)^k < \infty.$$
We call the resulting distribution the "extended" truncated negative binomial distribution.

A special case for $r = 0$ is the logarithmic distribution with the probability function
$$p_k = \frac{1}{k\,\log(1+\beta)}\left(\frac{\beta}{1+\beta}\right)^{k}, \qquad k = 1, 2, \dots
2.3 Compound frequency models
A large class of distributions can be created by the process of compounding any two discrete distributions.

Let $N$ be a r.v. with the probability function
$$p_n = P(N = n), \qquad n = 0, 1, \dots,$$
and the probability generating function
$$P_1(z) = E\,z^N = \sum_{n=0}^\infty p_n z^n.$$
Let $M_1, M_2, \dots$ be i.i.d. random variables, independent of $N$, with the probability function
$$f_n = P(M = n), \qquad n = 0, 1, \dots,$$
and the probability generating function
$$P_2(z) = E\,z^M = \sum_{n=0}^\infty f_n z^n.$$
Then
$$S = \sum_{i=1}^{N} M_i$$
has a compound distribution with probability generating function
$$P(z) = P_1(P_2(z)),$$
since
$$P(z) = E\,E\left[z^S \mid N\right] = \sum_{n=0}^\infty p_n\,E\left[z^S \mid N = n\right] = \sum_{n=0}^\infty p_n\,E\left[z^{M_1 + \dots + M_n} \mid N = n\right] = \sum_{n=0}^\infty p_n\,P_2^n(z).$$
We shall call the distribution of $N$ the primary distribution and the distribution of $M$ the secondary distribution.
Recursive formula (Panjer)

When the primary distribution is a member of the (a, b, 0) class, then the probability function $g_k = P(S = k)$ satisfies
$$g_k = \frac{1}{1 - a f_0}\sum_{j=1}^{k}\left(a + \frac{b\,j}{k}\right) f_j\,g_{k-j}, \qquad k = 1, 2, \dots \qquad (4)$$
Proof. From (2),
$$n\,p_n = a\,(n-1)\,p_{n-1} + (a + b)\,p_{n-1}, \qquad n = 1, 2, \dots \qquad (5)$$
Multiplying each side of (5) by $[P_2(z)]^{n-1}\,P_2'(z)$ and summing over $n$ yields
$$\sum_{n=1}^\infty n\,p_n\,[P_2(z)]^{n-1}\,P_2'(z) = a\sum_{n=1}^\infty (n-1)\,p_{n-1}\,[P_2(z)]^{n-1}\,P_2'(z) + (a+b)\sum_{n=1}^\infty p_{n-1}\,[P_2(z)]^{n-1}\,P_2'(z).$$
Therefore
$$P'(z) = a\,P'(z)\,P_2(z) + (a+b)\,P(z)\,P_2'(z).$$
Comparing the coefficients of $z^{k-1}$ we obtain
$$\begin{aligned}
k\,g_k &= a\sum_{j=0}^{k}(k-j)\,f_j\,g_{k-j} + (a+b)\sum_{j=0}^{k} j\,f_j\,g_{k-j} \\
&= a\,k\,f_0\,g_k + a\sum_{j=1}^{k}(k-j)\,f_j\,g_{k-j} + (a+b)\sum_{j=1}^{k} j\,f_j\,g_{k-j} \\
&= a\,k\,f_0\,g_k + a\,k\sum_{j=1}^{k} f_j\,g_{k-j} + b\sum_{j=1}^{k} j\,f_j\,g_{k-j}.
\end{aligned}$$
Therefore
$$g_k = a\,f_0\,g_k + \sum_{j=1}^{k}\left(a + \frac{b\,j}{k}\right) f_j\,g_{k-j}.$$
Formula (4) requires the starting value $g_0$. This can be computed as
$$g_0 = \sum_{n=0}^\infty P(S = 0 \mid N = n)\,P(N = n) = \sum_{n=0}^\infty P(M_1 + \dots + M_n = 0)\,P(N = n) = \sum_{n=0}^\infty (f_0)^n\,p_n = P_1(f_0).$$
Note. When $f_0 = 0$, then $g_0 = P_1(0) = p_0$.
If the primary distribution is a member of the (a, b, 1) class, (4) is modified to
$$g_k = \frac{1}{1 - a f_0}\left\{[p_1 - (a+b)\,p_0]\,f_k + \sum_{j=1}^{k}\left(a + \frac{b\,j}{k}\right) f_j\,g_{k-j}\right\}, \qquad k = 1, 2, \dots \qquad (6)$$
Proof. From (3),
$$n\,p_n = a\,(n-1)\,p_{n-1} + (a+b)\,p_{n-1}, \qquad n = 2, 3, \dots \qquad (7)$$
Multiplying each side of (7) by $[P_2(z)]^{n-1}\,P_2'(z)$ and summing over $n$ yields
$$\sum_{n=2}^\infty n\,p_n\,[P_2(z)]^{n-1}\,P_2'(z) = a\sum_{n=2}^\infty (n-1)\,p_{n-1}\,[P_2(z)]^{n-1}\,P_2'(z) + (a+b)\sum_{n=2}^\infty p_{n-1}\,[P_2(z)]^{n-1}\,P_2'(z).$$
Since
$$P'(z) = \sum_{n=1}^\infty n\,p_n\,[P_2(z)]^{n-1}\,P_2'(z),$$
we obtain
$$P'(z) - p_1\,P_2'(z) = a\,P'(z)\,P_2(z) + (a+b)\,P_2'(z)\,[P(z) - p_0].$$
After rearranging,
$$P'(z) = a\,P_2(z)\,P'(z) + (a+b)\,P(z)\,P_2'(z) + [p_1 - (a+b)\,p_0]\,P_2'(z).$$
Comparing the coefficients of $z^{k-1}$ we obtain
$$k\,g_k = a\sum_{j=0}^{k}(k-j)\,f_j\,g_{k-j} + (a+b)\sum_{j=0}^{k} j\,f_j\,g_{k-j} + [p_1 - (a+b)\,p_0]\,k\,f_k.$$
Therefore
$$g_k = a\,f_0\,g_k + \sum_{j=1}^{k}\left(a + \frac{b\,j}{k}\right) f_j\,g_{k-j} + [p_1 - (a+b)\,p_0]\,f_k.$$
3 Aggregate claims distribution
We shall deal with some methods for computing the compound distribution of the aggregate loss represented by the sum
$$S = \sum_{i=1}^{N} X_i$$
of a random number $N$ of i.i.d. individual payments $X_1, X_2, \dots$, independent of $N$.
3.1 Recursive formula
Suppose that the severity distribution $F_X(x)$ is defined on $0, 1, \dots, r$, representing multiples of some monetary unit. The number $r$ represents the largest possible payment and could be infinite. Further, suppose that the frequency distribution $p_k$ is a member of the (a, b, 1) class and therefore satisfies (6). Then the distribution of the aggregate claim $S$ can be obtained from
$$f_S(x) = \frac{[p_1 - (a+b)\,p_0]\,f_X(x) + \sum_{y=1}^{x \wedge r}\left(a + \frac{b\,y}{x}\right) f_X(y)\,f_S(x - y)}{1 - a\,f_X(0)}. \qquad (8)$$
For the (a, b, 0) class, (8) reduces to
$$f_S(x) = \frac{\sum_{y=1}^{x \wedge r}\left(a + \frac{b\,y}{x}\right) f_X(y)\,f_S(x - y)}{1 - a\,f_X(0)}. \qquad (9)$$
Note that when the severity distribution has no probability at zero, the denominator of (8) and (9) equals one.

The starting value of the recursive schemes (8) and (9) is $f_S(0) = P_N(f_X(0))$.
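As an illustration, the following sketch implements the recursion (9) for a compound Poisson distribution, where $a = 0$, $b = \lambda$ and $f_S(0) = P_N(f_X(0)) = e^{\lambda(f_X(0) - 1)}$; the discrete severity is hypothetical:

```python
import math

lam = 3.0                  # Poisson parameter of N (arbitrary)
a, b = 0.0, lam            # (a, b, 0) parameters of the Poisson distribution

fx = [0.0, 0.5, 0.3, 0.2]  # hypothetical severity on 0, 1, ..., r
r = len(fx) - 1

n_points = 40
fs = [math.exp(lam * (fx[0] - 1.0))]   # f_S(0) = P_N(f_X(0))
for x in range(1, n_points):
    s = sum((a + b * y / x) * fx[y] * fs[x - y]
            for y in range(1, min(x, r) + 1))
    fs.append(s / (1.0 - a * fx[0]))

print(sum(fs))                               # ~ 1 (mass beyond n_points is tiny)
print(sum(x * g for x, g in enumerate(fs)))  # ~ E S = lam * E X = 3 * 1.7
```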
3.2 Constructing arithmetic distributions
The recursive method has been developed for discrete severity distributions. In the case of a continuous severity distribution, we can use as an approximation a discrete severity distribution on multiples of a convenient unit $h$ (the span). Such a distribution is called arithmetic. We want to preserve the properties of the original distribution.
1. Method of rounding (mass dispersal)

Let $f_j$ denote the probability placed at $jh$, $j = 0, 1, \dots$ We set
$$f_0 = P\left(X < \frac{h}{2}\right) = F_X\left(\frac{h}{2}\right),$$
$$f_j = P\left(jh - \frac{h}{2} \le X < jh + \frac{h}{2}\right) = F_X\left(jh + \frac{h}{2}\right) - F_X\left(jh - \frac{h}{2}\right), \qquad j = 1, 2, \dots$$
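The following sketch applies the method of rounding to an exponential severity distribution; the scale $\theta$, the span $h$ and the number of points are arbitrary:

```python
import math

theta, h, n = 1000.0, 250.0, 64   # arbitrary scale, span and number of points

def F(x):
    """Exponential d.f. used as the continuous severity to be discretized."""
    return 1.0 - math.exp(-x / theta) if x > 0 else 0.0

# Mass at jh is the probability of the interval [jh - h/2, jh + h/2)
fx = [F(h / 2)]
fx += [F(j * h + h / 2) - F(j * h - h / 2) for j in range(1, n)]

print(sum(fx))                                     # ~ 1 if n*h covers the tail
print(h * sum(j * fj for j, fj in enumerate(fx)))  # ~ theta (mean roughly preserved)
```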
2. Method of local moment matching

In this method we construct an arithmetic distribution that matches $p$ moments of the arithmetic and the true continuous severity distributions. Consider an arbitrary interval of length $ph$, denoted by $[x_k, x_k + ph)$. We will locate point masses $m_0^k, m_1^k, \dots, m_p^k$ at the points $x_k, x_k + h, \dots, x_k + ph$ so that
$$\sum_{j=0}^{p}(x_k + jh)^r\,m_j^k = \int_{x_k}^{x_k + ph} x^r\,dF_X(x), \qquad r = 0, 1, \dots, p. \qquad (10)$$
Arrange the intervals so that $x_{k+1} = x_k + ph$. Then the point masses at the endpoints are added together. With $x_0 = 0$, the resulting discrete distribution has successive probabilities
$$f_0 = m_0^0, \quad f_1 = m_1^0, \quad f_2 = m_2^0, \quad \dots, \quad f_p = m_p^0 + m_0^1, \quad f_{p+1} = m_1^1, \quad f_{p+2} = m_2^1, \quad \dots$$
By summing (10) over all possible values of $k$, with $x_0 = 0$, it is clear that $p$ moments are preserved for the entire distribution and that the probabilities add to one exactly.
The solution of (10) is
$$m_j^k = \int_{x_k}^{x_k + ph}\ \prod_{i \ne j}\frac{x - x_k - ih}{(j - i)\,h}\ dF_X(x), \qquad j = 0, 1, \dots, p.$$
The proof is based on the Lagrange formula for collocation of a polynomial $f(y)$ at the points $y_0, y_1, \dots, y_n$:
$$f(y) = \sum_{j=0}^{n} f(y_j)\prod_{i \ne j}\frac{y - y_i}{y_j - y_i}.$$
Applying this formula to the polynomial $f(y) = y^r$ over the points $x_k, x_k + h, \dots, x_k + ph$ yields
$$x^r = \sum_{j=0}^{p}(x_k + jh)^r\prod_{i \ne j}\frac{x - x_k - ih}{(j - i)\,h}, \qquad r = 0, 1, \dots, p.$$
3.3 Fast Fourier transform
The fast Fourier transform (FFT) is an algorithm that can be used for inverting characteristic functions to obtain densities of discrete random variables.
Definition. For any continuous function $f(x)$, the Fourier transform is the mapping
$$\tilde f(z) = \int_{-\infty}^{\infty} f(x)\,e^{izx}\,dx. \qquad (11)$$
The original function can be expressed by means of its Fourier transform as
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\tilde f(z)\,e^{-izx}\,dz.$$
When $f(x)$ is a probability density function, then $\tilde f(z)$ is its characteristic function.
Definition. Let $f_x$ denote a function defined for all integer values of $x$ that is periodic with period length $n$ ($f_{x+n} = f_x$ for all $x$). For the vector $(f_0, f_1, \dots, f_{n-1})$ the discrete Fourier transform is the mapping $\tilde f_x$, $x = \dots, -1, 0, 1, \dots$, defined by
$$\tilde f_k = \sum_{j=0}^{n-1} f_j\,\exp\left(\frac{2\pi i}{n}\,jk\right), \qquad k = \dots, -1, 0, 1, \dots \qquad (12)$$
This mapping is bijective. In addition, $\tilde f_k$ is also periodic with period length $n$. The inverse mapping is
$$f_j = \frac{1}{n}\sum_{k=0}^{n-1}\tilde f_k\,\exp\left(-\frac{2\pi i}{n}\,kj\right), \qquad j = \dots, -1, 0, 1, \dots \qquad (13)$$
Note. In order to obtain $n$ values of $\tilde f_k$, the number of terms that need to be evaluated is of order $O(n^2)$.

The fast Fourier transform is an algorithm that reduces the number of computations required to order $O(n \log_2 n)$. It is based on the property that a discrete Fourier transform of length $n$ can be rewritten as the sum of two discrete Fourier transforms, each of length $n/2$:
$$\tilde f_k = \sum_{j=0}^{n-1} f_j\,\exp\left(\frac{2\pi i}{n}\,jk\right) = \sum_{j=0}^{m-1} f_{2j}\,\exp\left(\frac{2\pi i}{m}\,jk\right) + \exp\left(\frac{2\pi i}{n}\,k\right)\sum_{j=0}^{m-1} f_{2j+1}\,\exp\left(\frac{2\pi i}{m}\,jk\right),$$
where $m = n/2$.
The application of the FFT to computing the aggregate claims distribution can be summarized as follows:

1. Discretize the severity distribution to obtain the vector
$$f_X(0), f_X(1), \dots, f_X(n-1),$$
where $n = 2^r$ for some integer $r$ and $n$ is the number of points desired in the distribution $f_S(x)$ of aggregate claims.

2. Apply the FFT to this vector of values, obtaining $\varphi_X(z)$, the characteristic function of the discretized distribution. The result is a vector of $n$ values.

3. Calculate the characteristic function of the compound distribution of $S$ using $\varphi_S(z) = P_N(\varphi_X(z))$, where $P_N$ is the probability generating function of $N$.
4. Apply the inverse FFT to obtain the distribution of aggregate claims for the discretized severity model.
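The four steps translate directly into a few lines of numpy; the sketch below assumes a Poisson frequency, so $P_N(z) = e^{\lambda(z-1)}$, and a hypothetical discretized severity. (numpy's fft uses the opposite sign convention to (12), which is immaterial as long as fft and ifft are applied as a mutually inverse pair.)

```python
import numpy as np

lam = 3.0        # Poisson parameter of N (arbitrary)
n = 2 ** 8       # number of points, n = 2^r

# Step 1: discretized severity on 0, 1, ..., n-1 (hypothetical example)
fx = np.zeros(n)
fx[1], fx[2], fx[3] = 0.5, 0.3, 0.2

# Step 2: FFT of the severity vector (discrete characteristic function)
phi_x = np.fft.fft(fx)

# Step 3: apply the pgf of N pointwise: phi_S = P_N(phi_X)
phi_s = np.exp(lam * (phi_x - 1.0))

# Step 4: invert to recover the aggregate-claims distribution
fs = np.fft.ifft(phi_s).real

print(fs.sum())                   # ~ 1
print((np.arange(n) * fs).sum())  # ~ E S = lam * E X
```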
Literature: S. A. Klugman, H. H. Panjer, G. E. Willmot: Loss Models: From Data to Decisions. John Wiley & Sons, 1998.