CourseWork1 20122013 Solutions


Statistics and Probabilistic Modelling for Insurance

Solutions to Course Work No1 2012/2013

1. (a) Since $f(x, y)$ is a joint density, we have that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = c \iint_{x^2+y^2\le R^2} dy\,dx = 1. \tag{1}$$

Noting that the double integral is equal to the area of the circle, i.e.,

$$\iint_{x^2+y^2\le R^2} dy\,dx = \pi R^2,$$

from (1) we obtain that $c = \frac{1}{\pi R^2}$.

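(b) We have

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \frac{1}{\pi R^2}\int_{x^2+y^2\le R^2} dy = \frac{1}{\pi R^2}\int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} dy = \frac{2}{\pi R^2}\sqrt{R^2-x^2},$$

if $x^2 \le R^2$, and $f_X(x) = 0$ if $x^2 > R^2$. By symmetry, the marginal density of $Y$ is given by

$$f_Y(y) = \begin{cases} \dfrac{2}{\pi R^2}\sqrt{R^2-y^2}, & y^2 \le R^2, \\ 0, & y^2 > R^2. \end{cases}$$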

(c) The distribution function, $F_D(a)$, $0 \le a \le R$, of the distance $D = \sqrt{X^2 + Y^2}$ is obtained as follows:

$$F_D(a) = P(D \le a) = P\left(\sqrt{X^2 + Y^2} \le a\right) = P(X^2 + Y^2 \le a^2)$$

$$= \iint_{x^2+y^2\le a^2} f(x, y)\,dy\,dx = \frac{1}{\pi R^2}\iint_{x^2+y^2\le a^2} dy\,dx = \frac{\pi a^2}{\pi R^2} = \frac{a^2}{R^2},$$

where we have used the fact that $\iint_{x^2+y^2\le a^2} dy\,dx$ is the area of a circle of radius $a$ and thus is equal to $\pi a^2$. When $R = 50$ miles and $a = 10$ miles we obtain $P(D \le 10) = \frac{100}{2500} = 0.04$.

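(d) Using the distribution function, $F_D(a)$, from part (c) we obtain

$$f_D(a) = \frac{d}{da}F_D(a) = \frac{2a}{R^2}, \quad 0 \le a \le R.$$

Hence, we have

$$E(D) = \int_0^R a\,\frac{2a}{R^2}\,da = \frac{2}{R^2}\int_0^R a^2\,da = \frac{2R}{3}.$$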
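As a quick numerical cross-check of parts (c) and (d), the sketch below evaluates $F_D(a) = a^2/R^2$ and approximates $E(D)$ by the midpoint rule. The function names and the step count are illustrative choices, not part of the original solution.

```python
def distance_cdf(a, R):
    """F_D(a) = a^2 / R^2 for 0 <= a <= R (uniform density on a disc of radius R)."""
    if a <= 0:
        return 0.0
    return min(a * a / (R * R), 1.0)


def expected_distance(R, steps=100_000):
    """Approximate E(D) = integral_0^R a * (2a / R^2) da with the midpoint rule."""
    h = R / steps
    total = 0.0
    for k in range(steps):
        a = (k + 0.5) * h
        total += a * (2.0 * a / (R * R)) * h
    return total


print(distance_cdf(10, 50))                 # P(D <= 10) for R = 50
print(expected_distance(50), 2 * 50 / 3)    # numeric value next to 2R/3
```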

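2. (a) Equating the theoretical mean and variance to the empirical mean, $\bar{x}$, and variance, we get

$$e^{\mu+\sigma^2/2} = \bar{x}, \qquad e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right) = \frac{\sum_{i=1}^{120}(x_i - \bar{x})^2}{119}$$

$$\Rightarrow\quad e^{2\mu+\sigma^2} = \left(\frac{\sum_{i=1}^{120} x_i}{120}\right)^2, \qquad e^{2\mu+\sigma^2}\left(e^{\sigma^2} - 1\right) = \frac{\sum_{i=1}^{120}(x_i - \bar{x})^2}{119}$$

$$\Rightarrow\quad e^{\sigma^2} - 1 = \frac{\sum_{i=1}^{120}(x_i - \bar{x})^2 / 119}{\left(\sum_{i=1}^{120} x_i / 120\right)^2}.$$

Hence, we have

$$e^{\sigma^2} - 1 = \frac{15601373}{2020.292^2} \;\Rightarrow\; \sigma^2 = \log\left(1 + \frac{15601373}{2020.292^2}\right) = 1.57327 \;\Rightarrow\; \sigma = 1.2543,$$

$$e^{\mu+\sigma^2/2} = 2020.292 \;\Rightarrow\; \mu = \log 2020.292 - \frac{1}{2}\sigma^2 = 6.82436.$$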
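The method-of-moments step can be reproduced with a few lines of Python. This is a minimal sketch that takes the empirical mean 2020.292 and variance 15601373 quoted above as inputs; the function name is an illustrative choice.

```python
import math


def lognormal_moment_fit(mean, variance):
    """Return (mu, sigma) of a Lognormal matching the given mean and variance."""
    sigma2 = math.log(1.0 + variance / mean ** 2)   # from e^{sigma^2} - 1 = var / mean^2
    mu = math.log(mean) - 0.5 * sigma2              # from e^{mu + sigma^2 / 2} = mean
    return mu, math.sqrt(sigma2)


mu, sigma = lognormal_moment_fit(2020.292, 15601373.0)
print(round(mu, 5), round(sigma, 4))
```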

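(b) The likelihood function takes the form

$$L(\alpha, \lambda) = \prod_{i=1}^{n} f_X(x_i) = \prod_{i=1}^{n} \frac{\alpha\lambda^{\alpha}}{(\lambda + x_i)^{\alpha+1}}$$

and hence

$$\log L(\alpha, \lambda) = l(\alpha, \lambda) = \log\prod_{i=1}^{n} \frac{\alpha\lambda^{\alpha}}{(\lambda + x_i)^{\alpha+1}} = \sum_{i=1}^{n}\left[\log\left(\alpha\lambda^{\alpha}\right) - \log(\lambda + x_i)^{\alpha+1}\right]$$

$$= \sum_{i=1}^{n}\left[\log\alpha + \alpha\log\lambda - (\alpha + 1)\log(\lambda + x_i)\right] = n\log\alpha + n\alpha\log\lambda - (\alpha + 1)\sum_{i=1}^{n}\log(\lambda + x_i).$$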

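Differentiating the log-likelihood function with respect to $\alpha$ and $\lambda$, and then solving for $\alpha$, one finds that the maximum likelihood estimators must satisfy

$$\frac{\partial \log L(\alpha, \lambda)}{\partial \alpha} = \frac{n}{\alpha} + n\log\lambda - \sum_{i=1}^{n}\log(\lambda + x_i) = 0 \;\Rightarrow\; \alpha = \frac{n}{\sum_{i=1}^{n}\log(\lambda + x_i) - n\log\lambda},$$

$$\frac{\partial \log L(\alpha, \lambda)}{\partial \lambda} = n\alpha\frac{1}{\lambda} - (\alpha + 1)\sum_{i=1}^{n}\frac{1}{\lambda + x_i} = 0 \;\Rightarrow\; \alpha = \frac{\sum_{i=1}^{n}\frac{1}{\lambda + x_i}}{\frac{n}{\lambda} - \sum_{i=1}^{n}\frac{1}{\lambda + x_i}}.$$

Hence, the maximum likelihood estimator $\hat{\lambda}$ must be a solution of

$$\frac{n}{\sum_{i=1}^{n}\log(\lambda + x_i) - n\log\lambda} = \frac{\sum_{i=1}^{n}\frac{1}{\lambda + x_i}}{\frac{n}{\lambda} - \sum_{i=1}^{n}\frac{1}{\lambda + x_i}},$$

which may be solved numerically, e.g. in Excel. Thus, we obtain $\hat{\lambda} = 1872.13$ and $\hat{\alpha} = 1.880467$.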
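The equation for $\hat{\lambda}$ can also be solved outside Excel. The sketch below codes the two expressions for $\alpha$ and a plain bisection routine; the 120 observations are not reproduced in this text, so `claims` and the bracketing interval in the usage note are placeholders the reader has to supply.

```python
import math


def alpha_from_alpha_score(lam, xs):
    """alpha solving d log L / d alpha = 0 for a given lambda."""
    n = len(xs)
    return n / (sum(math.log(lam + x) for x in xs) - n * math.log(lam))


def alpha_from_lambda_score(lam, xs):
    """alpha solving d log L / d lambda = 0 for a given lambda."""
    n = len(xs)
    s = sum(1.0 / (lam + x) for x in xs)
    return s / (n / lam - s)


def profile_gap(lam, xs):
    """Difference of the two expressions; it vanishes at lambda-hat."""
    return alpha_from_alpha_score(lam, xs) - alpha_from_lambda_score(lam, xs)


def bisect(f, lo, hi, tol=1e-10, max_iter=200):
    """Plain bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

With the data in a list `claims`, one would call `bisect(lambda lam: profile_gap(lam, claims), lo, hi)` for a bracket `[lo, hi]` on which the gap changes sign, and then read off $\hat{\alpha}$ from either expression. A sign change is not guaranteed for every sample, which is why the routine raises an error instead of guessing.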

(c) Note that both the Lognormal and Pareto distributions are defined on $(0, \infty)$, and you need to have $\sum_i E_i = 120 = \sum_i O_i$.

(i) One appropriate choice is to use 10 equally probable bins as determined by the fitted Lognormal distribution. So, in the case of the Lognormal with $\mu = 6.82436$ and $\sigma = 1.2543$ we have

[table of bins with observed and expected counts not recovered]

where the expected number of data in the bin $(L, U]$ is calculated as

$$120\,P(L < x \le U) = 120\,(F(U) - F(L)) = 120 \times 0.1.$$

We need to compare the test statistic with the $\chi^2$ distribution with df = (10-2-1) = 7. The critical value at 5% is 14.07 and so we reject the hypothesis that the data follow the Lognormal distribution with $\mu = 6.82436$ and $\sigma = 1.2543$. The critical value at 1% is 18.48 and so we do not reject the Lognormal fit. The p-value of the test statistic is 0.018503 and of course leads to the same conclusions.
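The equally probable bins and the quoted critical values can be checked with the standard library alone. The sketch below is an illustration, not the original spreadsheet: it builds the nine interior bin edges from the fitted Lognormal and evaluates the $\chi^2_7$ tail probability through the closed form that exists for odd degrees of freedom. Function names are illustrative.

```python
import math
from statistics import NormalDist


def lognormal_bin_edges(mu, sigma, bins=10):
    """Interior edges of `bins` equally probable bins of a Lognormal(mu, sigma)."""
    z = NormalDist()  # standard normal
    return [math.exp(mu + sigma * z.inv_cdf(k / bins)) for k in range(1, bins)]


def chi2_sf_df7(x):
    """P(chi^2_7 > x), using the finite-sum formula for odd degrees of freedom."""
    if x <= 0:
        return 1.0
    tail = math.erfc(math.sqrt(x / 2.0))
    series = 1.0 + x / 3.0 + x * x / 15.0
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * series


edges = lognormal_bin_edges(6.82436, 1.2543)
expected_per_bin = 120 * 0.1
print([round(e, 1) for e in edges])
print(chi2_sf_df7(14.07), chi2_sf_df7(18.48))   # close to 0.05 and 0.01
```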

(ii) One appropriate choice is to use 10 equally probable bins as determined by the fitted Pareto distribution (note that there is no built-in inverse Pareto distribution function in Excel, but the cdf can be inverted explicitly in order to find the bins). So, in the case of the Pareto with $\alpha = 1.8805$ and $\lambda = 1872.13$ we have

[table of bins with observed and expected counts not recovered]

where the expected number of data in the bin $(L, U]$ is calculated as

$$120\,P(L < x \le U) = 120\,(F(U) - F(L)) = 120 \times 0.1.$$

We need to compare the test statistic with the $\chi^2$ distribution with df = (10-2-1) = 7. The critical value at 5% is 14.07 and so we do not reject the hypothesis that the data follow the Pareto distribution with $\alpha = 1.8805$ and $\lambda = 1872.13$. The critical value at 1% is 18.48 and so we again do not reject the Pareto fit. The p-value of the test statistic is 0.277482 and of course leads to the same conclusions.
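The explicit inversion of the Pareto cdf mentioned above can be sketched as follows. The cdf $F(x) = 1 - (\lambda/(\lambda + x))^{\alpha}$ follows from the density used in part (b); the observed counts in the example are hypothetical and serve only to show how the test statistic is assembled.

```python
def pareto_cdf(x, alpha, lam):
    """F(x) = 1 - (lam / (lam + x))**alpha for x > 0, and 0 otherwise."""
    return 1.0 - (lam / (lam + x)) ** alpha if x > 0 else 0.0


def pareto_quantile(u, alpha, lam):
    """Explicit inverse of the cdf: lam * ((1 - u)**(-1 / alpha) - 1)."""
    return lam * ((1.0 - u) ** (-1.0 / alpha) - 1.0)


def chi_square_statistic(observed, expected):
    """Sum over bins of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


alpha, lam = 1.8805, 1872.13
edges = [pareto_quantile(k / 10, alpha, lam) for k in range(1, 10)]
expected = [120 * 0.1] * 10

# Hypothetical observed counts, for illustration only (they sum to 120).
observed = [14, 10, 12, 13, 11, 12, 9, 15, 12, 12]
print(chi_square_statistic(observed, expected))
```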

(iii) Recall that in order for the Goodness-of-Fit test to be valid one needs $E_i \ge 5$, so in both cases (i) and (ii) the test is reliable. The quality of the test is also reaffirmed by the high number of observations, namely 120, used to perform it.

Based on the test statistics one can say that the Pareto fit is better than the Lognormal fit: it has a much smaller test statistic and thus a higher p-value.

This particular choice of bins ensures $O_i > 5$, $i = 1, \ldots, 10$, and simplifies the computation of $E_i$, $i = 1, \ldots, 10$. Other choices are also possible and, clearly, the test statistic will be influenced by the bins selected. However, given the above calculations it is unlikely that the conclusions drawn would be affected.

3. It will be instructive to use the notation $M$ for the retention level and $L$ for the limiting level ($M = 20$, $L = 60$).

(a) It is not difficult to see from the definition of $F_Z(x)$ that the partial density function of its continuous part is $f_Z(z) \equiv f_X(z + M) = c\exp(-c(M + z))$, if $0 < z < L - M$, where $f_X(\cdot)$ is the density of the original individual claims $X$.

Hence, one can conclude that the r.v. $X$ has an exponential distribution with parameter $c$, i.e., $f_X(x) = c\,e^{-cx}$.

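(b) We have

$$Z = \min(\max(0, X - M),\, L - M) = \min(\max(0, X - 20),\, 40).$$

(c) Applying similar reasoning as in the lectures,

$$E(Z) = \int_M^L \left(1 - F_X(x)\right)dx = \int_M^L \left(1 - \left(1 - e^{-cx}\right)\right)dx = \int_M^L e^{-cx}\,dx = \frac{e^{-cM} - e^{-cL}}{c}.$$

Alternatively, but much longer,

$$\begin{aligned}
E(Z) &= \int_0^{\infty} \min(\max(0, x - M),\, L - M)\, f_X(x)\,dx = \int_M^L (x - M) f_X(x)\,dx + (L - M)\int_L^{+\infty} f_X(x)\,dx \\
&= \int_0^{L-M} y\, f_X(y + M)\,dy + (L - M)\left(1 - F_X(L)\right) \\
&= \int_0^{L-M} y\, c\, e^{-c(y+M)}\,dy + (L - M)e^{-cL} \\
&= c\, e^{-cM}\int_0^{L-M} y\, e^{-cy}\,dy + (L - M)e^{-cL} \\
&= -c\, e^{-cM}\left(\frac{(L - M)e^{-c(L-M)}}{c} + \frac{1}{c^2}\left(e^{-c(L-M)} - 1\right)\right) + (L - M)e^{-cL} \\
&= \frac{e^{-cM} - e^{-cL}}{c}.
\end{aligned}$$

Hence,

$$E(Z) = \frac{e^{-cM} - e^{-cL}}{c} = \frac{e^{-0.1 \times 20} - e^{-0.1 \times 60}}{0.1} = 1.32857.$$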
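The layer payment of part (b) and the closed form of part (c) are easy to evaluate directly. The sketch below uses $c = 0.1$, $M = 20$ and $L = 60$ as in the numerical step above; the function names are illustrative.

```python
import math


def layer_payment(x, M=20.0, L=60.0):
    """Z = min(max(0, x - M), L - M): the part of a claim x falling in the layer."""
    return min(max(0.0, x - M), L - M)


def expected_layer_exponential(c, M=20.0, L=60.0):
    """E(Z) = (exp(-c M) - exp(-c L)) / c for X ~ Exp(c)."""
    return (math.exp(-c * M) - math.exp(-c * L)) / c


print(layer_payment(10), layer_payment(35), layer_payment(100))
print(expected_layer_exponential(0.1))
```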

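(d) Applying similar reasoning as in the lectures, we have

$$F_Y(y) = \begin{cases} F_X(y), & \text{if } y < M, \\ F_X(L + y - M), & \text{if } y \ge M, \end{cases}
\qquad\text{i.e.}\qquad
F_Y(y) = \begin{cases} 1 - \exp(-c\,y), & \text{if } y < M, \\ 1 - \exp(-c(L + y - M)), & \text{if } y \ge M. \end{cases}$$

We have

$$E(Y) = \int_0^M \left(1 - F_X(x)\right)dx + \int_L^{+\infty}\left(1 - F_X(x)\right)dx = \int_0^M e^{-cx}\,dx + \int_L^{\infty} e^{-cx}\,dx = \frac{1 + e^{-cL} - e^{-cM}}{c}.$$

Alternatively, and simpler,

$$E(Y) = E(X) - E(Z) = \frac{1}{c} - \frac{e^{-cM} - e^{-cL}}{c} = \frac{1 + e^{-cL} - e^{-cM}}{c}.$$

Alternatively, but much longer,

$$E(Y) = \int_0^{\infty} \min(x, M)\, f_X(x)\,dx + \int_0^{\infty} \max(0, x - L)\, f_X(x)\,dx = \frac{1 + e^{-cL} - e^{-cM}}{c}.$$

Finally, we have

$$E(Y) = \frac{1 + e^{-cL} - e^{-cM}}{c} = \frac{1 + e^{-0.1 \times 60} - e^{-0.1 \times 20}}{0.1} = 8.67143.$$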
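The "much longer" route can be checked numerically. The sketch below integrates $(\min(x, M) + \max(0, x - L))\,f_X(x)$ by the midpoint rule and compares it with the closed form; the truncation point 400 and the step count are arbitrary choices made for this illustration.

```python
import math


def retained_amount(x, M=20.0, L=60.0):
    """Y = min(x, M) + max(0, x - L): what is left after ceding the layer."""
    return min(x, M) + max(0.0, x - L)


def expected_retained_closed_form(c, M=20.0, L=60.0):
    """E(Y) = (1 + exp(-c L) - exp(-c M)) / c for X ~ Exp(c)."""
    return (1.0 + math.exp(-c * L) - math.exp(-c * M)) / c


def expected_retained_numeric(c, M=20.0, L=60.0, upper=400.0, steps=200_000):
    """Midpoint-rule approximation of the integral of Y * f_X over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += retained_amount(x, M, L) * c * math.exp(-c * x) * h
    return total


print(expected_retained_closed_form(0.1), expected_retained_numeric(0.1))
```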

4. (a) We have that $Y = I\xi$, where $I$ is the indicator of the loss event (accident),

$$I = \begin{cases} 1, & \text{with probability } q, \\ 0, & \text{with probability } 1 - q, \end{cases}$$

and $\xi$ is the severity of the loss given that the loss event (accident) occurs. For the cdf $F_Y(y)$, for $y \ge 0$, we have

$$\begin{aligned}
F_Y(y) = P(Y \le y) &= P(Y \le y \mid I = 0)P(I = 0) + P(Y \le y \mid I = 1)P(I = 1) \\
&= P(I\xi \le y \mid I = 0)P(I = 0) + P(I\xi \le y \mid I = 1)P(I = 1) \\
&= P(0 \le y)(1 - q) + P(\xi \le y)\,q = 1 - q + q F_{\xi}(y).
\end{aligned}$$

Substituting the gamma cdf (see lecture notes on Loss distributions)

$$F_{\xi}(y) = 1 - \exp(-\alpha y) - \alpha y\exp(-\alpha y) = 1 - \exp(-\alpha y)(1 + \alpha y),$$

we obtain that

$$F_Y(y) = 1 - q + q\left(1 - \exp(-\alpha y)(1 + \alpha y)\right) = 1 - q(1 + \alpha y)\exp(-\alpha y).$$

(b) Applying maximum likelihood to estimate $q$, we have

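$$L(q, x) = \binom{n}{x} q^{x}(1 - q)^{n-x} = \binom{2000}{40} q^{40}(1 - q)^{2000-40}$$

$$\log L(q, x) = \log\binom{n}{x} + x\log q + (n - x)\log(1 - q)$$

$$\frac{d\log L(q, x)}{dq} = \frac{x}{q} - \frac{n - x}{1 - q} = 0 \;\Rightarrow\; \hat{q} = \frac{x}{n} = 0.2$$

Since $E(\xi) = \mu = \frac{2}{\alpha}$, we have that $\hat{\alpha} = \frac{2}{E(\xi)} = \frac{2}{1000} = 0.002$.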

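(c) Set $\mu = E(\xi)$ and $\sigma^2 = \mathrm{Var}(\xi)$. Then, in view of

$$E(Y^k) = E(I^k\xi^k) = E(I^k)E(\xi^k) = q\,E(\xi^k),$$

we have

$$E(Y) = \mu_Y = q\,E(\xi) = q\mu = \frac{2q}{\alpha},$$

$$\mathrm{Var}(Y) = s^2 = E(Y^2) - (E(Y))^2 = q\,E(\xi^2) - q^2(E(\xi))^2 = q(\sigma^2 + \mu^2) - q^2\mu^2 = q\sigma^2 + q(1 - q)\mu^2 = \frac{2q}{\alpha^2} + \frac{4q(1 - q)}{\alpha^2} = \frac{2q(3 - 2q)}{\alpha^2}.$$

Numerically,

$$E(Y) = \mu_Y = q\mu = 0.2 \times 1000 = 200,$$

$$\mathrm{Var}(Y) = s^2 = \frac{2q(3 - 2q)}{\alpha^2} = \frac{2 \times 0.2\,(3 - 2 \times 0.2)}{0.002^2} = 260000,$$

$$E(S_n) = n\,E(Y) = 2000 \times 200 = 400000,$$

$$\mathrm{Var}(S_n) = 2000\,\frac{2q(3 - 2q)}{\alpha^2} = 2000 \times 260000 = 52 \times 10^{7}.$$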
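The moment calculations of part (c) can be reproduced as follows. This is a small sketch using $q = 0.2$, $\alpha = 0.002$ and $n = 2000$ from the solution; the gamma severity with shape 2 gives $\mu = 2/\alpha$ and $\sigma^2 = 2/\alpha^2$. Variable names are illustrative.

```python
q, alpha, n = 0.2, 0.002, 2000

mu = 2.0 / alpha            # E(xi) for the gamma severity with shape 2
sigma2 = 2.0 / alpha ** 2   # Var(xi)

mean_y = q * mu
var_y = q * sigma2 + q * (1.0 - q) * mu ** 2      # q sigma^2 + q (1 - q) mu^2
var_y_closed = 2.0 * q * (3.0 - 2.0 * q) / alpha ** 2

mean_s = n * mean_y
var_s = n * var_y

print(mean_y, var_y, var_y_closed)
print(mean_s, var_s)
```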

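(d) From the lectures we have that for a probability level $\beta$, say $\beta = 0.95$,

$$\begin{aligned}
\beta = P(S_n \le P_n) &= P\left(S_n \le (1 + \theta)E(S_n)\right) = P\left(S_n - E(S_n) \le \theta E(S_n)\right) \\
&= P\left(\frac{S_n - E(S_n)}{\sqrt{\mathrm{Var}(S_n)}} \le \frac{\theta E(S_n)}{\sqrt{\mathrm{Var}(S_n)}}\right) = P\left(S^{*} \le \frac{\theta E(S_n)}{\sqrt{\mathrm{Var}(S_n)}}\right),
\end{aligned}$$

where $P(S^{*} \le x) \approx \Phi(x)$, the standard normal cdf. Hence $\Phi\left(\frac{\theta E(S_n)}{\sqrt{\mathrm{Var}(S_n)}}\right) = 0.95$, so that

$$\theta \approx \frac{q_{\beta}\sqrt{\mathrm{Var}(S_n)}}{E(S_n)} = \frac{q_{\beta}\sqrt{n}\,s}{n\,\mu_Y} = \frac{q_{\beta}\,s}{\sqrt{n}\,\mu_Y} = \frac{1.65 \times \sqrt{260000}}{\sqrt{2000} \times 200} = 0.0940645, \tag{2}$$

where we have used that

$$E(S_n) = \sum_{j=1}^{n} q_j\mu_j = n\,q\,\mu = n\,\mu_Y \quad\text{and}\quad \mathrm{Var}(S_n) = \sum_{j=1}^{n}\mathrm{Var}(Y_j) = n\,\mathrm{Var}(Y) = n\,s^2,$$

and that $\mu_Y = 200$, $s = \sqrt{260000} = 509.902$, $q_{\beta} = 1.65$ at $\beta = 0.95$.

From (2) we see that the security loading coefficient $\theta$ decays as $1/\sqrt{n}$, so the more policies $n$, the smaller the security coefficient $\theta$, which is natural since the risk is shared among a larger number of policyholders.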
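The loading formula and its $1/\sqrt{n}$ decay can be illustrated with a short sketch. The inputs are the per-policy mean 200 and variance 260000 from part (c) and the normal quantile 1.65; the function name is an illustrative choice.

```python
import math


def security_loading(n, mean_y, sd_y, quantile=1.65):
    """Normal-approximation loading: quantile * sd_y / (sqrt(n) * mean_y)."""
    return quantile * sd_y / (math.sqrt(n) * mean_y)


sd_y = math.sqrt(260000.0)
theta = security_loading(2000, 200.0, sd_y)
print(theta)

# Quadrupling the number of policies halves the loading.
print(security_loading(8000, 200.0, sd_y) / theta)
```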

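(e) Denote the loss of the insurance company by $W$. We have

$$W = \begin{cases} 0, & \text{if } Y \le d, \\ Y - d, & \text{if } Y > d, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ F_Y(w + d) = P(Y \le w + d), & \text{if } w \ge 0, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q + q F_{\xi}(w + d), & \text{if } w \ge 0, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q\left(1 + \alpha(w + d)\right)\exp(-\alpha(w + d)), & \text{if } w \ge 0. \end{cases}$$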

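(f) (i) We have

$$W = \begin{cases} 0, & \text{if } Y \le d, \\ Y - d, & \text{if } d < Y \le m, \\ m - d, & \text{if } Y > m, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ F_Y(w + d) = P(Y \le w + d), & \text{if } 0 \le w < m - d, \\ 1, & \text{if } m - d \le w, \end{cases}$$

$$F_W(w) = P(W \le w) = \begin{cases} 0, & \text{if } w < 0, \\ 1 - q\left(1 + \alpha(w + d)\right)\exp(-\alpha(w + d)), & \text{if } 0 \le w < m - d, \\ 1, & \text{if } m - d \le w. \end{cases}$$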

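(ii) We have

$$\begin{aligned}
E(W) &= \int_0^{\infty}\left(1 - F_W(w)\right)dw = \int_0^{m-d}\left(1 - F_W(w)\right)dw \\
&= \int_0^{m-d}\left[1 - \left(1 - q\left(1 + \alpha(w + d)\right)\exp(-\alpha(w + d))\right)\right]dw \\
&= \int_0^{m-d} q\left(1 + \alpha(w + d)\right)\exp(-\alpha(w + d))\,dw \\
&= \int_0^{m-d} q\exp(-\alpha(w + d))\,dw + \int_0^{m-d} q\,\alpha(w + d)\exp(-\alpha(w + d))\,dw \\
&= q\,e^{-\alpha d}\int_0^{m-d} e^{-\alpha w}\,dw + q\,\alpha\int_0^{m-d}(w + d)\,e^{-\alpha(w + d)}\,dw \\
&= \frac{q}{\alpha}\left(e^{-\alpha d} - e^{-\alpha m}\right) + \frac{q}{\alpha}\left((1 + \alpha d)e^{-\alpha d} - (1 + \alpha m)e^{-\alpha m}\right) \\
&= \frac{q}{\alpha}\left((2 + \alpha d)e^{-\alpha d} - (2 + \alpha m)e^{-\alpha m}\right).
\end{aligned}$$

Hence,

$$E(W) = \frac{q}{\alpha}\left((2 + \alpha d)e^{-\alpha d} - (2 + \alpha m)e^{-\alpha m}\right) = 107.438,$$

$$E(S) = 2000\,E(W) = 2000 \times 107.438 = 214876,$$

so the mean has almost halved compared to $E(S_n) = 2000 \times 200 = 400000$.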
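The closed form for $E(W)$ can be checked against a direct numerical integration of $1 - F_W(w)$. Note that the values of $d$ and $m$ are not stated in this text: the sketch below assumes $d = 100$ and $m = 800$, which happen to reproduce the quoted 107.438 with $q = 0.2$ and $\alpha = 0.002$, but they remain an assumption.

```python
import math


def expected_insurer_loss(q, alpha, d, m):
    """Closed form (q / alpha) * ((2 + a d) e^{-a d} - (2 + a m) e^{-a m})."""
    return (q / alpha) * ((2.0 + alpha * d) * math.exp(-alpha * d)
                          - (2.0 + alpha * m) * math.exp(-alpha * m))


def expected_insurer_loss_numeric(q, alpha, d, m, steps=100_000):
    """Midpoint-rule integral of 1 - F_W(w) over [0, m - d]."""
    h = (m - d) / steps
    total = 0.0
    for k in range(steps):
        u = d + (k + 0.5) * h          # u = w + d
        total += q * (1.0 + alpha * u) * math.exp(-alpha * u) * h
    return total


# ASSUMPTION: d = 100 and m = 800 are not given in the text above.
q, alpha, d, m = 0.2, 0.002, 100.0, 800.0
ew = expected_insurer_loss(q, alpha, d, m)
print(ew, expected_insurer_loss_numeric(q, alpha, d, m))
print(2000 * ew)
```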

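5. We have the general results

$$E(S) = \mu\,E(N) \quad\text{and}\quad V(S) = \sigma^2 E(N) + \mu^2\,\mathrm{Var}(N). \tag{3}$$

(a)(i) For the Poisson case

$$E(S) = \mu\lambda \quad\text{and}\quad V(S) = \lambda\,E(X^2).$$

So

$$E(S) = \mu\lambda = \left(2 \times \frac{1}{3} + 3 \times \frac{1}{2} + 4 \times \frac{1}{6}\right) \times 150 = 425,$$

$$V(S) = \lambda\,E(X^2) = 150 \times \left(2^2 \times \frac{1}{3} + 3^2 \times \frac{1}{2} + 4^2 \times \frac{1}{6}\right) = 1275.$$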
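The compound Poisson moments can be verified exactly with rational arithmetic. The claim size probabilities 1/3, 1/2 and 1/6 on the values 2, 3 and 4 are read off the calculation above; the variable names are illustrative.

```python
from fractions import Fraction

# Claim size distribution as used in the calculation above.
claim_pmf = {2: Fraction(1, 3), 3: Fraction(1, 2), 4: Fraction(1, 6)}
lam = 150  # Poisson parameter

first_moment = sum(x * p for x, p in claim_pmf.items())       # E(X)
second_moment = sum(x * x * p for x, p in claim_pmf.items())  # E(X^2)

mean_s = lam * first_moment
var_s = lam * second_moment
print(mean_s, var_s)
```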

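(ii) From (3) we have

$$E(S) = \mu\,E(N) = \mu\,\frac{1 - p}{p} = \frac{245}{3} = 81.6667,$$

$$V(S) = \sigma^2 E(N) + \mu^2\,\mathrm{Var}(N) = \sigma^2\,\frac{1 - p}{p} + \mu^2\,\frac{1 - p}{p^2} = \sigma^2\,\frac{0.98}{0.02} + \mu^2\,\frac{0.98}{0.02^2} = \frac{61495}{9} = 6832.78.$$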

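(iii) From (3) we have

$$E(S) = \mu\,E(N) = \mu\,\frac{k(1 - p)}{p} = 3 \times \frac{4 \times 0.98}{0.02} = 588,$$

$$V(S) = \sigma^2 E(N) + \mu^2\,\mathrm{Var}(N) = \sigma^2\,\frac{k(1 - p)}{p} + \mu^2\,\frac{k(1 - p)}{p^2} = 3^2 \times \frac{4 \times 0.98}{0.02} + 3^2 \times \frac{4 \times 0.98}{0.02^2} = 89964.$$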
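The general results (3) translate directly into code. The sketch below applies them to the negative binomial case with $k = 4$, $p = 0.02$, $\mu = 3$ and $\sigma^2 = 3^2$, as in the last calculation, using exact fractions; the helper name is an illustrative choice.

```python
from fractions import Fraction


def compound_moments(mu, sigma2, mean_n, var_n):
    """E(S) = mu E(N) and V(S) = sigma^2 E(N) + mu^2 Var(N)."""
    return mu * mean_n, sigma2 * mean_n + mu ** 2 * var_n


k, p = 4, Fraction(2, 100)
mean_n = k * (1 - p) / p         # negative binomial mean
var_n = k * (1 - p) / p ** 2     # negative binomial variance

mean_s, var_s = compound_moments(3, 9, mean_n, var_n)
print(mean_s, var_s)
```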

(b)(i) Comparing

$$M_S(t) = \exp\left(200\left(\frac{1}{1 - 2t} - 1\right)\right), \quad\text{for } t < \frac{1}{2},$$

with formula (22) from the lecture notes on Risk models, which is for the Poisson number of claims,

$$M_S(t) = \exp\left(\lambda\left(M_X(t) - 1\right)\right),$$

it is evident that this corresponds to a collective risk model with a Poisson ($\lambda = 200$) number of claims, and

$$M_X(t) = \frac{1}{1 - 2t}, \quad\text{for } t < \frac{1}{2},$$

is the m.g.f. of exponentially distributed claim amounts, $X \sim \mathrm{Exp}(0.5)$.

(ii) Therefore we have

$$E(S) = \mu\lambda = 2 \times 200 = 400 \quad\text{and}\quad V(S) = \lambda\,E(X^2) = 200 \times (2^2 + 2^2) = 1600.$$
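As an independent check, the mean and variance of $S$ are the first two derivatives of $\log M_S(t)$ at $t = 0$. The sketch below approximates them by central finite differences; the step size $h = 10^{-4}$ is an arbitrary choice for this illustration.

```python
def cumulant_generating_function(t, lam=200.0):
    """log M_S(t) = lam * (1 / (1 - 2 t) - 1), valid for t < 1/2."""
    return lam * (1.0 / (1.0 - 2.0 * t) - 1.0)


h = 1e-4
K = cumulant_generating_function

mean_s = (K(h) - K(-h)) / (2.0 * h)              # first derivative at 0
var_s = (K(h) - 2.0 * K(0.0) + K(-h)) / h ** 2   # second derivative at 0
print(mean_s, var_s)
```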
