Post on 04-Jun-2018
8/13/2019 CourseWork1 20122013 Solutions
http://slidepdf.com/reader/full/coursework1-20122013-solutions 1/10
Statistics and Probabilistic Modelling for Insurance
Solutions to Course Work No1 2012/2013
1. (a) Since $f(x, y)$ is a joint density, we have that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = c \iint_{x^2+y^2\le R^2} dy\,dx = 1. \tag{1}$$

Noting that the double integral is equal to the area of the circle, i.e.,

$$\iint_{x^2+y^2\le R^2} dy\,dx = \pi R^2,$$

from (1) we obtain that $c = \dfrac{1}{\pi R^2}$.
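(b) We have

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \frac{1}{\pi R^2}\int_{x^2+y^2\le R^2} dy = \frac{1}{\pi R^2}\int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} dy = \frac{2}{\pi R^2}\sqrt{R^2-x^2},$$

if $x^2 \le R^2$, and $f_X(x) = 0$ if $x^2 > R^2$. By symmetry, the marginal density of $Y$ is given by

$$f_Y(y) = \begin{cases} \dfrac{2}{\pi R^2}\sqrt{R^2-y^2}, & \text{for } y^2 \le R^2, \\ 0, & \text{for } y^2 > R^2. \end{cases}$$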
(c) The distribution function, $F_D(a)$, $0 \le a \le R$, of the distance $D = \sqrt{X^2+Y^2}$ is obtained as follows:

$$F_D(a) = P(D \le a) = P\left(\sqrt{X^2+Y^2} \le a\right) = P(X^2+Y^2 \le a^2)$$
$$= \iint_{x^2+y^2\le a^2} f(x, y)\,dy\,dx = \frac{1}{\pi R^2}\iint_{x^2+y^2\le a^2} dy\,dx = \frac{\pi a^2}{\pi R^2} = \frac{a^2}{R^2},$$

where we have used the fact that $\iint_{x^2+y^2\le a^2} dy\,dx$ is the area of a circle of radius $a$ and thus is equal to $\pi a^2$. When $R = 50$ miles and $a = 10$ miles we obtain $P(D \le 10) = \dfrac{100}{2500} = 0.04$.
(d) Using the distribution function, $F_D(a)$, from part (c) we obtain

$$f_D(a) = \frac{d}{da}F_D(a) = \frac{2a}{R^2}, \quad 0 \le a \le R.$$

Hence, we have

$$E[D] = \int_0^R a\,\frac{2a}{R^2}\,da = \frac{2}{R^2}\int_0^R a^2\,da = \frac{2R}{3}.$$
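As a quick numerical cross-check of parts (c) and (d), the following Python sketch evaluates $F_D(a) = a^2/R^2$ and approximates $E[D]$ by a midpoint sum of $a \cdot 2a/R^2$ over $[0, R]$. The function names and the grid size are illustrative choices and are not part of the original solution.

```python
def distance_cdf(a, R):
    """P(D <= a) for a point uniformly distributed on a disc of radius R (0 <= a <= R)."""
    return a**2 / R**2


def expected_distance(R, steps=100_000):
    """Midpoint-rule approximation of E[D] = integral over [0, R] of a * (2a / R^2) da."""
    h = R / steps
    total = 0.0
    for k in range(steps):
        a = (k + 0.5) * h          # midpoint of the k-th sub-interval
        total += a * (2 * a / R**2) * h
    return total


p_within_10 = distance_cdf(10, 50)   # probability of being within 10 miles when R = 50
mean_distance = expected_distance(50)  # should be close to 2R/3
```

With $R = 50$ the second value should agree with $2R/3 \approx 33.33$ miles up to the discretisation error of the midpoint rule.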
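2. (a) Equating the theoretical mean and variance to the empirical mean, $\bar{x}$, and variance, we get

$$\begin{cases} e^{\mu+\frac{1}{2}\sigma^2} = \bar{x} \\[4pt] e^{2\mu+\sigma^2}\left(e^{\sigma^2}-1\right) = \dfrac{1}{119}\sum_{i=1}^{120}(x_i-\bar{x})^2 \end{cases} \Longrightarrow \begin{cases} e^{2\mu+\sigma^2} = \left(\dfrac{1}{120}\sum_{i=1}^{120}x_i\right)^2 = \bar{x}^2 \\[4pt] e^{2\mu+\sigma^2}\left(e^{\sigma^2}-1\right) = \dfrac{1}{119}\sum_{i=1}^{120}(x_i-\bar{x})^2 \end{cases} \Longrightarrow e^{\sigma^2}-1 = \frac{\frac{1}{119}\sum_{i=1}^{120}(x_i-\bar{x})^2}{\bar{x}^2}.$$

Hence, we have

$$e^{\sigma^2}-1 = \frac{15601373}{2020.292^2} \Longrightarrow \sigma^2 = \log\left(1+\frac{15601373}{2020.292^2}\right) = 1.57327 \Longrightarrow \sigma = 1.2543$$

and

$$e^{\mu+\frac{1}{2}\sigma^2} = 2020.292 \Longrightarrow \mu = \log(2020.292) - \frac{1}{2}\sigma^2 = 6.82436.$$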
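The method-of-moments step above can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the inputs are the sample mean 2020.292 and the sample variance 15601373 quoted in the solution.

```python
import math


def lognormal_moment_fit(sample_mean, sample_var):
    """Method-of-moments estimates (mu, sigma^2) for a Lognormal distribution.

    Uses  exp(sigma^2) - 1 = variance / mean^2  and  mu = log(mean) - sigma^2 / 2.
    """
    sigma2 = math.log(1.0 + sample_var / sample_mean**2)
    mu = math.log(sample_mean) - 0.5 * sigma2
    return mu, sigma2


mu_hat, sigma2_hat = lognormal_moment_fit(2020.292, 15601373)
sigma_hat = math.sqrt(sigma2_hat)
```

The three values should match the figures 6.82436, 1.57327 and 1.2543 in the text to the quoted precision.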
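(b) The likelihood function takes the form

$$L(\alpha, \lambda) = \prod_{i=1}^{n} f_X(x_i) = \prod_{i=1}^{n} \frac{\alpha\lambda^{\alpha}}{(\lambda+x_i)^{\alpha+1}}$$

and hence

$$\log L(\alpha, \lambda) = l(\alpha, \lambda) = \log\prod_{i=1}^{n} \frac{\alpha\lambda^{\alpha}}{(\lambda+x_i)^{\alpha+1}}$$
$$= \sum_{i=1}^{n}\left[\log\left(\alpha\lambda^{\alpha}\right) - \log(\lambda+x_i)^{\alpha+1}\right] = \sum_{i=1}^{n}\left[\log\alpha + \alpha\log\lambda - (\alpha+1)\log(\lambda+x_i)\right]$$
$$= n\log\alpha + n\alpha\log\lambda - (\alpha+1)\sum_{i=1}^{n}\log(\lambda+x_i).$$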
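Differentiating the log-likelihood function with respect to $\alpha$ and $\lambda$, and then solving for $\alpha$, one finds that the maximum likelihood estimators must satisfy

$$\frac{\partial \log L(\alpha, \lambda)}{\partial \alpha} = \frac{n}{\alpha} + n\log\lambda - \sum_{i=1}^{n}\log(\lambda+x_i) = 0 \Longrightarrow \hat{\alpha} = \frac{n}{\sum_{i=1}^{n}\log(\lambda+x_i) - n\log\lambda},$$

$$\frac{\partial \log L(\alpha, \lambda)}{\partial \lambda} = \frac{n\alpha}{\lambda} - (\alpha+1)\sum_{i=1}^{n}\frac{1}{\lambda+x_i} = 0 \Longrightarrow \hat{\alpha} = \frac{\sum_{i=1}^{n}\frac{1}{\lambda+x_i}}{\frac{n}{\lambda} - \sum_{i=1}^{n}\frac{1}{\lambda+x_i}}.$$

Hence, the maximum likelihood estimator $\hat{\lambda}$ must be a solution of

$$\frac{n}{\sum_{i=1}^{n}\log(\lambda+x_i) - n\log\lambda} = \frac{\sum_{i=1}^{n}\frac{1}{\lambda+x_i}}{\frac{n}{\lambda} - \sum_{i=1}^{n}\frac{1}{\lambda+x_i}},$$

which may be solved numerically, e.g. in Excel. Thus, we obtain $\hat{\lambda} = 1872.13$ and $\hat{\alpha} = 1.880467$.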
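The solution solves the equation for $\hat{\lambda}$ in Excel; the same idea can be sketched in Python with a simple bisection on $\log\lambda$. The coursework data are not reproduced in this transcript, so the sketch below draws a synthetic Pareto sample (true values $\alpha = 2$, $\lambda = 1000$ are assumptions made purely for illustration), and all function names are hypothetical. The bracket `[1e-3, 1e9]` is likewise an assumption that happens to suit data on this scale.

```python
import math
import random


def alpha_from_alpha_score(lam, xs):
    """alpha solving d logL / d alpha = 0 for a given lambda."""
    return len(xs) / sum(math.log1p(x / lam) for x in xs)


def alpha_from_lambda_score(lam, xs):
    """alpha solving d logL / d lambda = 0 for a given lambda."""
    s = sum(1.0 / (lam + x) for x in xs)
    # n/lam - s, rewritten as sum of x / (lam * (lam + x)) to avoid cancellation
    d = sum(x / (lam * (lam + x)) for x in xs)
    return s / d


def pareto_mle(xs, lo=1e-3, hi=1e9, iterations=100):
    """Bisection on log(lambda) for the point where both expressions for alpha agree."""
    def gap(log_lam):
        lam = math.exp(log_lam)
        return alpha_from_alpha_score(lam, xs) - alpha_from_lambda_score(lam, xs)

    a, b = math.log(lo), math.log(hi)
    gap_a = gap(a)
    if gap_a * gap(b) > 0:
        raise ValueError("no sign change on the bracket; widen [lo, hi]")
    for _ in range(iterations):
        mid = 0.5 * (a + b)
        gap_mid = gap(mid)
        if gap_a * gap_mid <= 0:
            b = mid
        else:
            a, gap_a = mid, gap_mid
    lam_hat = math.exp(0.5 * (a + b))
    return alpha_from_alpha_score(lam_hat, xs), lam_hat


# Synthetic sample via the inverse cdf x = lam * ((1 - u)^(-1/alpha) - 1); not the coursework data.
rng = random.Random(0)
true_alpha, true_lam = 2.0, 1000.0
sample = [true_lam * ((1.0 - rng.random()) ** (-1.0 / true_alpha) - 1.0) for _ in range(5000)]
alpha_hat, lam_hat = pareto_mle(sample)
```

Bisection is used here only because it needs nothing beyond a sign change; any one-dimensional root finder applied to the same equation would serve.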
(c) Note that both the Lognormal and Pareto distributions are defined on $(0, \infty)$ and you need to have $\sum_i E_i = 120 = \sum_i O_i$.

(i) One appropriate choice is to use 10 equally probable bins as determined by the fitted Lognormal distribution. So, in the case of the Lognormal with $\mu = 6.82436$ and $\sigma = 1.2543$ we have

[table of bins with observed and expected counts not preserved in the transcript]

where the expected number of data in the bin $(L, U]$ is calculated as

$$120\,P(L < x \le U) = 120\,\left(F(U) - F(L)\right) = 120 \times 0.1.$$

We need to compare with $\chi^2$ with df = (10-2-1) = 7. The critical value at 5% is 14.07 and so we reject the hypothesis that the data follow a Lognormal distribution with $\mu = 6.82436$ and $\sigma = 1.2543$. The critical value at 1% is 18.48 and so we do not reject the Lognormal fit at that level. The p-value of the test statistic is 0.018503 and of course leads to the same conclusions.
(ii) One appropriate choice is to use 10 equally probable bins as determined by the fitted Pareto distribution (note that there is no built-in inverse Pareto distribution function in Excel, but the cdf can be inverted explicitly in order to find the bins). So, in the case of the Pareto with $\alpha = 1.8805$ and $\lambda = 1872.13$ we have

[table of bins with observed and expected counts not preserved in the transcript]

where the expected number of data in the bin $(L, U]$ is calculated as

$$120\,P(L < x \le U) = 120\,\left(F(U) - F(L)\right) = 120 \times 0.1.$$

We need to compare with $\chi^2$ with df = (10-2-1) = 7. The critical value at 5% is 14.07 and so we do not reject the hypothesis that the data follow a Pareto distribution with $\alpha = 1.8805$ and $\lambda = 1872.13$. The critical value at 1% is 18.48 and so we again do not reject the Pareto fit. The p-value of the test statistic is 0.277482 and of course leads to the same conclusions.
(iii) Recall that in order for the goodness-of-fit test to be valid one needs $E_i \ge 5$, so in both cases (i) and (ii) the test is reliable. The quality of the test is also reaffirmed by the high number of observations, namely 120, used to perform it.
Based on the test statistics one can say that the Pareto fit is better than the Lognormal fit: it has a much smaller test statistic and thus a higher p-value.
This particular choice of bins ensures $O_i > 5$, $i = 1, \ldots, 10$, and simplifies the computation of $E_i$, $i = 1, \ldots, 10$. Other choices are also possible and clearly the test statistic will be influenced by the bins selected. However, given the above calculations it is unlikely that the conclusions drawn would be affected.
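The binning step in (ii) relies on inverting the Pareto cdf $F(x) = 1 - \left(\lambda/(\lambda+x)\right)^{\alpha}$ explicitly. The Python sketch below computes the nine interior edges of ten equally probable bins for the fitted Pareto and evaluates the chi-square statistic. The observed counts are hypothetical placeholders (the actual counts are not available in this transcript), and the function names are illustrative.

```python
def pareto_cdf(x, alpha, lam):
    """F(x) = 1 - (lam / (lam + x))^alpha for x >= 0."""
    return 1.0 - (lam / (lam + x)) ** alpha


def pareto_quantile(p, alpha, lam):
    """Explicit inverse of the cdf above: x = lam * ((1 - p)^(-1/alpha) - 1)."""
    return lam * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


def equal_probability_edges(alpha, lam, bins=10):
    """Interior bin edges; the first bin starts at 0 and the last bin is unbounded."""
    return [pareto_quantile(k / bins, alpha, lam) for k in range(1, bins)]


def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


alpha_hat, lam_hat, n, bins = 1.8805, 1872.13, 120, 10
edges = equal_probability_edges(alpha_hat, lam_hat, bins)
expected = [n / bins] * bins          # 12 expected observations per bin

# Hypothetical observed counts, for illustration only (not the coursework data).
observed = [15, 9, 12, 12, 12, 12, 12, 12, 12, 12]
statistic = chi_square_statistic(observed, expected)
reject_at_5_percent = statistic > 14.07   # 5% critical value for 7 df quoted in the text
```

Replacing `observed` with the actual bin counts would reproduce the test statistic behind the quoted p-value.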
3. It will be instructive to use the notation $M$ for the retention level and $L$ for the limiting level ($M = 20$, $L = 60$).

(a) It is not difficult to see from the definition of $F_Z(x)$ that the partial density function of its continuous part is $f_Z(z) \equiv f_X(z+M) = c\exp\left(-c(M+z)\right)$, if $0 < z < L-M$, where $f_X(\cdot)$ is the density of the original individual claims $X$.
Hence, one can conclude that the r.v. $X$ has an exponential distribution with parameter $c$, i.e., $f_X(x) = c\,e^{-cx}$.

(b) We have
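$$Z = \min\left(\max(0, X-M),\, L-M\right) = \min\left(\max(0, X-20),\, 40\right).$$

(c) Applying similar reasoning as in the lectures,

$$E[Z] = \int_M^L \left(1-F_X(x)\right)dx = \int_M^L \left(1-\left(1-e^{-cx}\right)\right)dx = \int_M^L e^{-cx}\,dx = \frac{e^{-cM}-e^{-cL}}{c}.$$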
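Alternatively, but much longer,

$$E[Z] = \int_0^{\infty}\min\left(\max(0, x-M),\, L-M\right)f_X(x)\,dx$$
$$= \int_M^L (x-M)\,f_X(x)\,dx + (L-M)\int_L^{+\infty} f_X(x)\,dx$$
$$= \int_0^{L-M} y\,f_X(y+M)\,dy + (L-M)\left(1-F_X(L)\right)$$
$$= \int_0^{L-M} y\,c\,e^{-c(y+M)}\,dy + (L-M)\,e^{-cL}$$
$$= c\,e^{-cM}\int_0^{L-M} y\,e^{-cy}\,dy + (L-M)\,e^{-cL}$$
$$= -c\,e^{-cM}\left[\frac{L-M}{c}\,e^{-c(L-M)} + \frac{1}{c^2}\left(e^{-c(L-M)}-1\right)\right] + (L-M)\,e^{-cL}$$
$$= \frac{e^{-cM}-e^{-cL}}{c}.$$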
Hence,

$$E[Z] = \frac{e^{-cM}-e^{-cL}}{c} = \frac{e^{-0.1\times 20}-e^{-0.1\times 60}}{0.1} = 1.32857.$$
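The closed form for $E[Z]$ can be checked against the "much longer" route by integrating the payoff $\min(\max(0, x-M), L-M)$ against the exponential density numerically. The sketch below uses a plain midpoint rule; the truncation point `upper = 300` and the step count are assumptions chosen so that the neglected tail is negligible for $c = 0.1$, and the function names are illustrative.

```python
import math


def layer_payment(x, M, L):
    """Reinsurer's payment on a claim x for the layer between M and L."""
    return min(max(0.0, x - M), L - M)


def expected_layer_closed_form(c, M, L):
    """E[Z] = (exp(-c M) - exp(-c L)) / c for X ~ Exp(c)."""
    return (math.exp(-c * M) - math.exp(-c * L)) / c


def expected_layer_numeric(c, M, L, upper=300.0, steps=300_000):
    """Midpoint-rule approximation of the integral of payment(x) * c * exp(-c x) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += layer_payment(x, M, L) * c * math.exp(-c * x) * h
    return total


closed = expected_layer_closed_form(0.1, 20.0, 60.0)
numeric = expected_layer_numeric(0.1, 20.0, 60.0)
```

Both values should be close to the 1.32857 obtained above.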
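(d) Applying similar reasoning as in the lectures, we have

$$F_Y(y) = \begin{cases} F_X(y), & \text{if } y < M, \\ F_X(L+y-M), & \text{if } y \ge M, \end{cases}$$

i.e.,

$$F_Y(y) = \begin{cases} 1-\exp(-cy), & \text{if } y < M, \\ 1-\exp\left(-c(L+y-M)\right), & \text{if } y \ge M. \end{cases}$$

We have

$$E[Y] = \int_0^M \left(1-F_X(x)\right)dx + \int_L^{+\infty}\left(1-F_X(x)\right)dx = \int_0^M e^{-cx}\,dx + \int_L^{\infty} e^{-cx}\,dx = \frac{1+e^{-cL}-e^{-cM}}{c}.$$

Alternatively, and simpler,

$$E[Y] = E[X] - E[Z] = \frac{1}{c} - \frac{e^{-cM}-e^{-cL}}{c} = \frac{1+e^{-cL}-e^{-cM}}{c}.$$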
Alternatively, but much longer,

$$E[Y] = \int_0^{\infty}\min(x, M)\,f_X(x)\,dx + \int_0^{\infty}\max(0, x-L)\,f_X(x)\,dx = \frac{1+e^{-cL}-e^{-cM}}{c}.$$
Finally, we have

$$E[Y] = \frac{1+e^{-cL}-e^{-cM}}{c} = \frac{1+e^{-0.1\times 60}-e^{-0.1\times 20}}{0.1} = 8.67143.$$
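One way to sanity-check the piecewise distribution function of $Y$ in (d) is to integrate its survival function $1 - F_Y(y)$ numerically and compare with the closed form for $E[Y]$. The sketch below does this with a midpoint rule; the truncation point, step count and function names are illustrative assumptions.

```python
import math


def retained_cdf(y, c, M, L):
    """F_Y(y): F_X(y) below the retention M, F_X(L + y - M) from M onwards, X ~ Exp(c)."""
    if y < M:
        return 1.0 - math.exp(-c * y)
    return 1.0 - math.exp(-c * (L + y - M))


def expected_retained_closed_form(c, M, L):
    """E[Y] = (1 + exp(-c L) - exp(-c M)) / c."""
    return (1.0 + math.exp(-c * L) - math.exp(-c * M)) / c


def expected_retained_numeric(c, M, L, upper=300.0, steps=300_000):
    """Midpoint-rule approximation of the integral of 1 - F_Y(y) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += (1.0 - retained_cdf(y, c, M, L)) * h
    return total


c, M, L = 0.1, 20.0, 60.0
e_y_closed = expected_retained_closed_form(c, M, L)
e_y_numeric = expected_retained_numeric(c, M, L)
e_z_closed = (math.exp(-c * M) - math.exp(-c * L)) / c
```

The sum `e_y_closed + e_z_closed` should equal $E[X] = 1/c = 10$, which is the decomposition used in the "simpler" derivation.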
4.
(a) We have that $Y = I\,\xi$, where $I$ is the indicator of the loss event (accident),

$$I = \begin{cases} 1, & \text{with probability } q, \\ 0, & \text{with probability } 1-q, \end{cases}$$

and $\xi$ is the severity of the loss given that the loss event (accident) occurs. For the cdf $F_Y(y)$, for $y \ge 0$, we have

$$F_Y(y) = P(Y \le y) = P(Y \le y \mid I = 0)\,P(I = 0) + P(Y \le y \mid I = 1)\,P(I = 1)$$
$$= P(I\xi \le y \mid I = 0)\,P(I = 0) + P(I\xi \le y \mid I = 1)\,P(I = 1)$$
$$= P(0 \le y)\,(1-q) + P(\xi \le y)\,q = 1 - q + q\,F_{\xi}(y).$$

Substituting the gamma cdf (see lecture notes on Loss distributions)

$$F_{\xi}(y) = 1 - \exp(-\alpha y) - \alpha y\exp(-\alpha y) = 1 - \exp(-\alpha y)\,(1+\alpha y),$$

we obtain that

$$F_Y(y) = 1 - q + q\left(1 - \exp(-\alpha y)\,(1+\alpha y)\right) = 1 - q\,(1+\alpha y)\exp(-\alpha y).$$

(b) Applying maximum likelihood to estimate $q$, we have
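$$L(q; x) = \binom{n}{x} q^{x}(1-q)^{n-x} = \binom{2000}{400} q^{400}(1-q)^{2000-400}$$

$$\log L(q; x) = \log\binom{n}{x} + x\log q + (n-x)\log(1-q)$$

$$\frac{d\log L(q; x)}{dq} = \frac{x}{q} - \frac{n-x}{1-q} = 0 \Longrightarrow \hat{q} = \frac{x}{n} = \frac{400}{2000} = 0.2.$$

Since $E[\xi] = \mu = \dfrac{2}{\alpha}$, we have that $\hat{\alpha} = \dfrac{2}{E[\xi]} = \dfrac{2}{1000} = 0.002$.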
(c) Set $\mu = E[\xi]$ and $\sigma^2 = \text{Var}[\xi]$. Then, since $I$ and $\xi$ are independent and $I^k = I$,

$$E[Y^k] = E[(I\xi)^k] = E[I^k]\,E[\xi^k] = q\,E[\xi^k] \Longrightarrow$$

$$E[Y] = \mu_Y = q\,E[\xi] = q\mu = \frac{2q}{\alpha}$$

$$\text{Var}[Y] = \sigma_Y^2 = E[Y^2] - \left(E[Y]\right)^2 = q\,E[\xi^2] - q^2\left(E[\xi]\right)^2 = q\left(\sigma^2+\mu^2\right) - q^2\mu^2 = q\sigma^2 + q(1-q)\mu^2 = \frac{2q}{\alpha^2} + \frac{4q(1-q)}{\alpha^2} = \frac{2q(3-2q)}{\alpha^2}.$$

Numerically,

$$E[Y] = \mu_Y = q\mu = 0.2 \times 1000 = 200,$$

$$\text{Var}[Y] = \sigma_Y^2 = \frac{2q(3-2q)}{\alpha^2} = \frac{2 \times 0.2\,(3 - 2 \times 0.2)}{0.002^2} = 260000,$$

$$E[S_n] = n\,E[Y] = 2000 \times 200 = 400000,$$

$$\text{Var}[S_n] = 2000\,\frac{2q(3-2q)}{\alpha^2} = 2000 \times 260000 = 52 \times 10^7.$$
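The arithmetic in (b) and (c) can be retraced in a few lines of Python. This is a minimal sketch under the assumptions already used in the solution (severity $\xi$ gamma-distributed with shape 2 and rate $\alpha$, so that $E[\xi] = 2/\alpha$ and $\text{Var}[\xi] = 2/\alpha^2$); the function and variable names are illustrative.

```python
def individual_moments(q, alpha):
    """Mean and variance of Y = I * xi, with P(I = 1) = q and xi ~ Gamma(shape 2, rate alpha)."""
    mean_xi = 2.0 / alpha
    var_xi = 2.0 / alpha**2
    mean_y = q * mean_xi
    var_y = q * var_xi + q * (1.0 - q) * mean_xi**2   # q*sigma^2 + q*(1-q)*mu^2
    return mean_y, var_y


q_hat = 400 / 2000        # maximum likelihood estimate x / n
alpha_hat = 2 / 1000      # from E[xi] = 2 / alpha with mean severity 1000
mean_y, var_y = individual_moments(q_hat, alpha_hat)

n = 2000
mean_s, var_s = n * mean_y, n * var_y   # portfolio of n independent policies
```

The variance line implements $q\sigma^2 + q(1-q)\mu^2$, which simplifies to $2q(3-2q)/\alpha^2$ for this severity distribution.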
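(d) From the lectures we have that for a probability level $\beta$, say $\beta = 0.95$,

$$\beta = P(S_n \le P_n) = P\left(S_n \le (1+\theta)\,E[S_n]\right) = P\left(S_n - E[S_n] \le \theta\,E[S_n]\right)$$
$$= P\left(\frac{S_n - E[S_n]}{\sqrt{\text{Var}[S_n]}} \le \frac{\theta\,E[S_n]}{\sqrt{\text{Var}[S_n]}}\right) = P\left(S^* \le \frac{\theta\,E[S_n]}{\sqrt{\text{Var}[S_n]}}\right),$$

where $P(S^* \le x) \approx \Phi(x)$, the standard normal cdf. Hence $\Phi\left(\dfrac{\theta\,E[S_n]}{\sqrt{\text{Var}[S_n]}}\right) = 0.95$, which gives

$$\theta \approx \frac{q_{\beta}\sqrt{\text{Var}[S_n]}}{E[S_n]} = \frac{q_{\beta}\sqrt{n}\,\sigma_Y}{n\,\mu_Y} = \frac{q_{\beta}\,\sigma_Y}{\sqrt{n}\,\mu_Y} = \frac{1.65 \times \sqrt{260000}}{\sqrt{2000} \times 200} = 0.0940645, \tag{2}$$

where we have used that

$$E[S_n] = \sum_{j=1}^{n} q_j\mu_j = n q \mu = n\mu_Y \quad\text{and}\quad \text{Var}[S_n] = \sum_{j=1}^{n}\text{Var}[Y_j] = n\,\text{Var}[Y] = n\sigma_Y^2,$$

and that $\mu_Y = 200$, $\sigma_Y = \sqrt{260000} = 509.902$, and $q_{\beta} = 1.65$ at $\beta = 0.95$.
From (2) we see that the security loading coefficient $\theta$ decays as $1/\sqrt{n}$, so the more policies $n$, the smaller the security loading $\theta$, which is natural since the risk is shared among a larger number of policyholders.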
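The $1/\sqrt{n}$ behaviour of the security loading in (2) is easy to see numerically. The sketch below evaluates the normal-approximation formula with the moments derived in (c), $E[Y] = 200$ and $\text{Var}[Y] = 260000$; the function name and the comparison portfolio of 8000 policies are illustrative assumptions.

```python
import math


def security_loading(n, mean_y, sd_y, quantile=1.65):
    """Normal-approximation loading: quantile * sd_Y / (sqrt(n) * mean_Y)."""
    return quantile * sd_y / (math.sqrt(n) * mean_y)


sd_y = math.sqrt(260000.0)
theta_2000 = security_loading(2000, 200.0, sd_y)   # the portfolio size used in the solution
theta_8000 = security_loading(8000, 200.0, sd_y)   # four times as many policies
```

Quadrupling the number of policies halves the loading, as expected from the $1/\sqrt{n}$ dependence.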
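(e) Denote the loss of the insurance company by $W$. We have

$$W = \begin{cases} 0, & \text{if } Y \le d, \\ Y - d, & \text{if } Y > d, \end{cases}$$

and therefore

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ F_Y(w+d) = P(Y \le w+d), & \text{if } w \ge 0, \end{cases}$$

$$F_W(w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q + q\,F_{\xi}(w+d), & \text{if } w \ge 0, \end{cases}$$

$$F_W(w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q\left(1+\alpha(w+d)\right)\exp\left(-\alpha(w+d)\right), & \text{if } w \ge 0. \end{cases}$$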
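(f) (i) We have

$$W = \begin{cases} 0, & \text{if } Y \le d, \\ Y - d, & \text{if } d < Y \le m, \\ m - d, & \text{if } Y > m, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ F_Y(w+d) = P(Y \le w+d), & \text{if } 0 \le w < m-d, \\ 1, & \text{if } m-d \le w, \end{cases}$$

$$F_W(w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q\left(1+\alpha(w+d)\right)\exp\left(-\alpha(w+d)\right), & \text{if } 0 \le w < m-d, \\ 1, & \text{if } m-d \le w. \end{cases}$$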
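(ii) We have

$$E[W] = \int_0^{\infty}\left(1-F_W(w)\right)dw = \int_0^{m-d}\left(1-F_W(w)\right)dw$$
$$= \int_0^{m-d}\left[1 - \left(1 - q\left(1+\alpha(w+d)\right)\exp\left(-\alpha(w+d)\right)\right)\right]dw$$
$$= \int_0^{m-d} q\left(1+\alpha(w+d)\right)\exp\left(-\alpha(w+d)\right)dw$$
$$= \int_0^{m-d} q\exp\left(-\alpha(w+d)\right)dw + \int_0^{m-d} q\,\alpha(w+d)\exp\left(-\alpha(w+d)\right)dw$$
$$= q\,e^{-\alpha d}\int_0^{m-d} e^{-\alpha w}\,dw + q\,\alpha\int_0^{m-d}(w+d)\,e^{-\alpha(w+d)}\,dw$$
$$= \frac{q}{\alpha}\left(e^{-\alpha d} - e^{-\alpha m}\right) + \frac{q}{\alpha}\left((1+\alpha d)\,e^{-\alpha d} - (1+\alpha m)\,e^{-\alpha m}\right)$$
$$= \frac{q}{\alpha}\left((2+\alpha d)\,e^{-\alpha d} - (2+\alpha m)\,e^{-\alpha m}\right).$$

Hence

$$E[W] = \frac{q}{\alpha}\left((2+\alpha d)\,e^{-\alpha d} - (2+\alpha m)\,e^{-\alpha m}\right) = 107.438,$$

$$E[S] = 2000\,E[W] = 2000 \times 107.438 = 214876,$$

so the mean has almost halved compared to $E[S_n] = 2000 \times 200 = 400000$.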
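The closed form for $E[W]$ can be evaluated directly. The transcript does not state the deductible $d$ or the upper limit $m$; the values $d = 100$ and $m = 800$ used below are an assumption, chosen only because together with $q = 0.2$ and $\alpha = 0.002$ they reproduce the quoted 107.438. The function name is likewise illustrative.

```python
import math


def expected_layer_loss(q, alpha, d, m):
    """E[W] = (q / alpha) * ((2 + alpha d) exp(-alpha d) - (2 + alpha m) exp(-alpha m))."""
    lower = (2.0 + alpha * d) * math.exp(-alpha * d)
    upper = (2.0 + alpha * m) * math.exp(-alpha * m)
    return (q / alpha) * (lower - upper)


# d = 100 and m = 800 are assumed values (not given in the transcript).
e_w = expected_layer_loss(q=0.2, alpha=0.002, d=100.0, m=800.0)
e_s = 2000 * e_w

# Sanity check: with no deductible and an effectively unlimited cover, E[W] reduces to E[Y] = 2q/alpha.
e_w_full_cover = expected_layer_loss(q=0.2, alpha=0.002, d=0.0, m=1e6)
```

The last line recovers the unrestricted mean $2q/\alpha = 200$ from part (c), which is a useful consistency check on the formula.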
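5.
We have the general results

$$E[S] = \mu\,E[N] \quad\text{and}\quad V[S] = \sigma^2 E[N] + \mu^2\,\text{Var}[N]. \tag{3}$$

(a)(i) For the Poisson case

$$E[S] = \mu\lambda \quad\text{and}\quad V[S] = \lambda\,E[X^2].$$

So

$$E[S] = \mu\lambda = \left(2 \times \frac{1}{3} + 3 \times \frac{1}{2} + 4 \times \frac{1}{6}\right) \times 150 = 425,$$

$$V[S] = \lambda\,E[X^2] = 150 \times \left(2^2 \times \frac{1}{3} + 3^2 \times \frac{1}{2} + 4^2 \times \frac{1}{6}\right) = 1275.$$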
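(ii) From (3) we have

$$E[S] = \mu\,E[N] = \mu\,\frac{1-p}{p} = \frac{5}{3} \times \frac{0.98}{0.02} = \frac{245}{3} = 81.6667,$$

$$V[S] = \sigma^2 E[N] + \mu^2\,\text{Var}[N] = \sigma^2\,\frac{1-p}{p} + \mu^2\,\frac{1-p}{p^2} = \frac{5}{3^2} \times \frac{0.98}{0.02} + \frac{5^2}{3^2} \times \frac{0.98}{0.02^2} = \frac{61495}{9} = 6832.78.$$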
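(iii) From (3) we have

$$E[S] = \mu\,E[N] = \mu\,\frac{k(1-p)}{p} = 3 \times \frac{4 \times 0.98}{0.02} = 588,$$

$$V[S] = \sigma^2 E[N] + \mu^2\,\text{Var}[N] = \sigma^2\,\frac{k(1-p)}{p} + \mu^2\,\frac{k(1-p)}{p^2} = 3^2 \times \frac{4 \times 0.98}{0.02} + 3^2 \times \frac{4 \times 0.98}{0.02^2} = 89964.$$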
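All three cases in part (a) are instances of formula (3), so they can be checked with one small helper. In the Python sketch below the severity parameters for the geometric and negative binomial cases ($\mu = 5/3$, $\sigma^2 = 5/9$ and $\mu = 3$, $\sigma^2 = 9$) are read off the worked numbers above, since the question itself is not reproduced in this transcript; the function name is illustrative.

```python
def compound_moments(mu, sigma2, mean_n, var_n):
    """Formula (3): E[S] = mu * E[N],  V[S] = sigma^2 * E[N] + mu^2 * Var[N]."""
    return mu * mean_n, sigma2 * mean_n + mu**2 * var_n


# Poisson claim count with lambda = 150 and a discrete severity on {2, 3, 4}.
values, probs = [2, 3, 4], [1 / 3, 1 / 2, 1 / 6]
mu_x = sum(v * p for v, p in zip(values, probs))
ex2 = sum(v * v * p for v, p in zip(values, probs))
poisson_case = compound_moments(mu_x, ex2 - mu_x**2, 150, 150)

# Geometric claim count with p = 0.02: E[N] = (1-p)/p, Var[N] = (1-p)/p^2.
p = 0.02
geometric_case = compound_moments(5 / 3, 5 / 9, (1 - p) / p, (1 - p) / p**2)

# Negative binomial claim count with k = 4 and p = 0.02.
k = 4
negbin_case = compound_moments(3, 9, k * (1 - p) / p, k * (1 - p) / p**2)
```

For the Poisson case the general formula collapses to $V[S] = \lambda E[X^2]$ because $E[N] = \text{Var}[N] = \lambda$.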
(b)(i) Comparing

$$M_S(t) = \exp\left(200\left(\frac{1}{1-2t} - 1\right)\right), \quad\text{for } t < \frac{1}{2},$$

with formula (22) from the lecture notes on Risk models, which is for the Poisson number of claims,

$$M_S(t) = \exp\left(\lambda\left(M_X(t) - 1\right)\right),$$

it is evident that this corresponds to a collective risk model with a Poisson ($\lambda = 200$) number of claims, and

$$M_X(t) = \frac{1}{1-2t}, \quad\text{for } t < \frac{1}{2},$$

is the m.g.f. of exponentially distributed claim amounts, $X \sim \text{Exp}(0.5)$.

(ii) Therefore we have

$$E[S] = \mu\lambda = 2 \times 200 = 400 \quad\text{and}\quad V[S] = \lambda\,E[X^2] = 200 \times \left(2^2 + 2^2\right) = 1600.$$
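The mean and variance in (ii) can also be read off the m.g.f. itself: the first two derivatives of $\log M_S(t)$ at $t = 0$ are $E[S]$ and $V[S]$. The sketch below approximates them with central finite differences; the step size `h` and the function names are illustrative assumptions, and the results are accurate only up to the finite-difference error.

```python
def cumulant_generating_function(t, lam=200.0):
    """log M_S(t) = lam * (1 / (1 - 2t) - 1), valid for t < 1/2."""
    return lam * (1.0 / (1.0 - 2.0 * t) - 1.0)


def first_two_cumulants(k, h=1e-3):
    """Central-difference approximations of k'(0) (the mean) and k''(0) (the variance)."""
    first = (k(h) - k(-h)) / (2.0 * h)
    second = (k(h) - 2.0 * k(0.0) + k(-h)) / h**2
    return first, second


mean_s, var_s = first_two_cumulants(cumulant_generating_function)
```

The two numbers should be close to 400 and 1600 respectively.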