
Risk Process based on General Compound Hawkes Process and its Implementation with Real Data

Gabriela Zeller

University of Calgary

Calgary, Alberta, Canada

'Hawks Seminar' Talk

Dept. of Math. & Stat.

Calgary, Canada

August 07, 2018

Outline of Presentation

• The Classical Risk Model

• Hawkes Process Background

• RMGCHP: Theoretical Results

• RMGCHP: Implementation with Real Data

• Evaluation and Next Steps

• References

The Classical Risk Model

This section gives some mathematical background on the classical risk

model, such as common assumptions and approaches.

The Cramer-Lundberg Model - Background

The classical risk model or compound-Poisson risk model was

introduced in 1903 by Filip Lundberg and has the following form:

$$R(t) = u + ct - \sum_{i=1}^{N_t} Y_i$$

where u denotes the initial capital, c denotes the (continuous)

premium rate and the number of claims in the interval [0, t) is a

homogeneous Poisson process Nt with rate λ.

The claims Yi are i.i.d. positive random variables with distribu-

tion function G and mean µG independent of (Nt).

The Cramer-Lundberg Model - Computations

Quantities of interest that have been extensively studied:

Ruin time: $\tau = \inf\{t > 0 : R(t) < 0\}$, where $\inf \emptyset = \infty$

Probability of ruin in $[0, t)$:

$$\Phi(u, t) = P(\tau \le t \mid R(0) = u) = P\left(\inf_{0 < s \le t} R(s) < 0\right)$$

Probability of ultimate ruin:

$$\Phi(u) = \lim_{t \to \infty} \Phi(u, t) = P\left(\inf_{t > 0} R(t) < 0\right)$$

For practical purposes, it is sometimes more convenient to use the survival probability: $\delta(u) = 1 - \Phi(u)$

Severity of ruin: $|R(\tau)|$

Distribution of the time to ruin (given ruin): $P(\tau < t \mid \tau < \infty)$

Net Profit Condition:

When is the mean income strictly larger than the mean outflow?

Premium Principle:

How should the premium be set against the insurance risk?

The Cramer-Lundberg Model - Net Profit Condition

The first interest of an insurance company is to avoid a state where ruin occurs with probability 1.

• If c < λµG, ruin is unavoidable, that is Φ(u) = 1.

• If c = λµG, ruin also occurs with probability 1. (This result requires some deep theory of random walks, see e.g. Mikosch (2000).)

• Thus, the obvious condition for solvency of the company is the net profit condition c > λµG, which is met by setting the premium rate to c = (1+θ)λµG, where θ > 0 is called the safety loading.

The Cramer-Lundberg Model - Ruin Probability

The following is known for the survival probability (see e.g. Asmussen and Albrecher (2010)):

Theorem 1 The survival probability δ(u) is continuous and differentiable everywhere (except on the countable set where G is not continuous) and satisfies the following integro-differential equation:

$$c\,\delta'(u) = \lambda\left[\delta(u) - \int_0^u \delta(u - y)\,dG(y)\right] \qquad (1)$$

For a general distribution G, equation (1) is not analytically solvable.


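However, for claims following an Exp(γ) distribution, the survival (resp. ruin) probability can be explicitly computed as:

$$\delta(u) = 1 - \frac{\lambda}{c\gamma}\,e^{(\frac{\lambda}{c} - \gamma)u} \quad \text{and} \quad \Phi(u) = \frac{\lambda}{c\gamma}\,e^{(\frac{\lambda}{c} - \gamma)u}$$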

This convenience is one of the reasons why claims are often

assumed to be i.i.d. Exp(γ) distributed for insurance applications

(for the light-tailed case).
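The closed form for Exp(γ) claims is easy to evaluate directly. The following is a minimal sketch; the function names and the parameter values are illustrative assumptions, not taken from the talk.

```python
import math


def ruin_probability_exp(u, lam, c, gamma):
    """Ultimate ruin probability for Exp(gamma) claims (mean claim 1/gamma)."""
    if c <= lam / gamma:
        # Net profit condition violated: ruin is certain.
        return 1.0
    return lam / (c * gamma) * math.exp((lam / c - gamma) * u)


def survival_probability_exp(u, lam, c, gamma):
    """Survival probability delta(u) = 1 - Phi(u)."""
    return 1.0 - ruin_probability_exp(u, lam, c, gamma)


# Hypothetical parameters: claim rate 2, mean claim size 2, safety loading 10%.
lam, gamma, theta = 2.0, 0.5, 0.1
c = (1.0 + theta) * lam / gamma
for u in (0.0, 10.0, 50.0):
    print(u, ruin_probability_exp(u, lam, c, gamma))
```

With zero initial capital the ruin probability reduces to λ/(cγ) = 1/(1+θ), which shows how directly the safety loading controls the risk.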


As equation (1) cannot be solved analytically for the general

case, this has motivated seeking bounds for the ruin probability.

To this end, Lundberg introduced the adjustment coefficient R > 0 as the unique positive root of

$$h(r) = 0, \quad \text{where} \quad h(r) = \lambda(M_Y(r) - 1) - cr$$

and $M_Y(r) = E[e^{rY_1}]$ is the m.g.f. of $(Y_i)$, which is assumed to be finite for all $0 < r < \gamma \le \infty$ with $\lim_{r \to \gamma^-} M_Y(r) = \infty$.

The intuition here is that for $r \in \mathbb{R}$ such that $M_Y(r) < \infty$ we can show that the process $\{e^{-rR_t - h(r)t}\}_{t \ge 0}$ is a martingale.


The adjustment coefficient is connected to the ruin probability through the following theorem (see e.g. Asmussen and Albrecher (2010)):

Theorem 2 (Lundberg Inequality)

$$\Phi(u) \le e^{-Ru}, \quad \forall u \ge 0$$

where R is the adjustment coefficient.

For the special case of Exp(γ)-distributed claim sizes, the adjustment coefficient is $R = \gamma - \lambda/c$, so the inequality gives

$$\Phi(u) \le e^{(\frac{\lambda}{c} - \gamma)u}$$

which is closely related to the net profit condition $c > \lambda/\gamma$ in this case: the bound decays in u exactly when that condition holds.
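For claim distributions without a closed-form root, the adjustment coefficient can be found numerically. The sketch below uses plain bisection on h(r) and assumes the m.g.f. blows up at a finite abscissa `r_max`; the function name and the bracketing constants are illustrative choices.

```python
def adjustment_coefficient(mgf, lam, c, r_max, iterations=200):
    """Positive root of h(r) = lam * (mgf(r) - 1) - c * r on (0, r_max).

    Assumes the net profit condition holds, so h is negative just right of 0,
    and that mgf(r) grows without bound as r approaches r_max from below.
    """
    def h(r):
        return lam * (mgf(r) - 1.0) - c * r

    lo, hi = 1e-9 * r_max, (1.0 - 1e-9) * r_max
    if not (h(lo) < 0.0 < h(hi)):
        raise ValueError("no sign change: check the net profit condition")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Exp(gamma) claims: mgf(r) = gamma / (gamma - r) for r < gamma.
gamma, lam, c = 0.5, 2.0, 4.4
R = adjustment_coefficient(lambda r: gamma / (gamma - r), lam, c, r_max=gamma)
print(R, gamma - lam / c)  # numerical root next to the closed form
```

Bisection is used here only because it needs no derivative and no external library; any bracketing root finder would serve equally well.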


Using the adjustment coefficient, we can furthermore describe the asymptotic behavior of the ruin probability.

Theorem 3 Assume that the adjustment coefficient R exists and that

$$\frac{\lambda}{c} \int_0^\infty x e^{Rx} (1 - G(x))\,dx < \infty$$

Then

$$\lim_{u \to \infty} \Phi(u) e^{Ru} = \frac{c - \lambda\mu_G}{\lambda M_Y'(R) - c}$$

The Cramer-Lundberg Model - Diffusion Approximation

For a general distribution function, ruin probabilities can be estimated using the following diffusion approximation.

Proposition 1 Let $(R^{(n)}_t)$ be a sequence of Cramer-Lundberg processes with initial capital u, claim arrival intensities $\lambda^{(n)} = n\lambda$, claim size distributions $G^{(n)}(x) = G(\sqrt{n}x)$ and premium rates

$$c^{(n)} = \left(1 + \frac{c - \lambda\mu_G}{\lambda\mu_G\sqrt{n}}\right)\lambda^{(n)}\mu^{(n)}_G = c + (\sqrt{n} - 1)\lambda\mu_G$$

Let $\mu_G = \int_0^\infty y\,dG(y)$ and assume that $\mu_{2,G} = \int_0^\infty y^2\,dG(y) < \infty$. Then

$$R^{(n)}_t \xrightarrow{d} (u + W_t)$$

where $(W_t)$ is a $(c - \lambda\mu_G, \lambda\mu_{2,G})$-Brownian motion and the convergence is in distribution in the topology of uniform convergence on finite intervals.


Let $\tau^{(n)}$ denote the ruin time of $(R^{(n)}_t)$ and $\tau = \inf\{t \ge 0 : u + W_t < 0\}$ the ruin time of the diffusion limit.

Proposition 2 Let $(R^{(n)}_t)$ and $(W_t)$ be as above. Then

$$\lim_{n \to \infty} P[\tau^{(n)} \le t] = P[\tau \le t]$$

and

$$\lim_{n \to \infty} P[\tau^{(n)} < \infty] = P[\tau < \infty]$$


The idea is to approximate $P[\tau^{(1)} \le t]$ by $P[\tau \le t]$ (finite-horizon ruin probability) and $P[\tau^{(1)} < \infty]$ by $P[\tau < \infty]$ (infinite-horizon ruin probability).

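Thus, we use the following result for the Brownian motion:

Proposition 3 Let $(W_t)$ be a $(m, \sigma^2)$-Brownian motion with $m > 0$ and $\tau = \inf\{t \ge 0 : u + W_t < 0\}$. Then

$$P[\tau < \infty] = e^{-2um/\sigma^2}$$

and

$$P[\tau \le t] = 1 - \Phi\left(\frac{mt + u}{\sigma\sqrt{t}}\right) + e^{-2um/\sigma^2}\,\Phi\left(\frac{mt - u}{\sigma\sqrt{t}}\right)$$

where Φ here denotes the standard normal cdf (not the ruin probability).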
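The two Brownian ruin formulas can be evaluated with the standard library alone. This is a small sketch with hypothetical function names; the normal cdf is built from `math.erf`.

```python
import math


def std_normal_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bm_ultimate_ruin(u, m, sigma):
    """P[tau < infinity] for u + W_t with drift m > 0 and variance sigma^2."""
    return math.exp(-2.0 * u * m / sigma ** 2)


def bm_finite_ruin(u, m, sigma, t):
    """P[tau <= t] for the same drifted Brownian motion, t > 0."""
    s = sigma * math.sqrt(t)
    return (1.0 - std_normal_cdf((m * t + u) / s)
            + bm_ultimate_ruin(u, m, sigma) * std_normal_cdf((m * t - u) / s))


# Illustrative values: initial capital 1, drift 0.5, unit volatility.
for horizon in (1.0, 10.0, 1e6):
    print(horizon, bm_finite_ruin(1.0, 0.5, 1.0, horizon))
print(bm_ultimate_ruin(1.0, 0.5, 1.0))
```

For the classical model one would plug in m = c − λµG and σ² = λµ2,G from the diffusion limit; as the horizon grows, the finite-horizon value approaches the ultimate ruin probability.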

The Cramer-Lundberg Model - Extensions

The Cramer-Lundberg risk model is a convenient foundation for classical risk theory, but it does not reflect the dependencies of incoming claims in reality. Thus, several extensions have been proposed:

The Sparre-Andersen model / renewal risk model introduced in 1957 replaces the Poisson process with a general renewal process Nt, thus allowing for claim inter-arrival times with arbitrary distribution functions. However, the lengths of the intervals between subsequent arrivals stay independent.

More recently, risk models using a Cox process (with Poisson shot noise) have been studied as they incorporate time-dependent intensity influenced by exogenously caused jumps ("shocks", environmental factors).

The Cramer-Lundberg Model - Extensions

However, it has been observed that effects like contagion and clustering in financial contexts are often caused endogenously.

Therefore, Hawkes processes have gained attention due to their ability to reflect endogenously caused jumps of the intensity function.

Hawkes processes have e.g. been successfully applied to construct stock price models including financial contagion, to model mid-price changes in high-frequency trading and in order book flow modelling (Bacry (2015) gives a good overview).

Hawkes Processes

This section gives an introduction to Hawkes processes.

First, we give the definition, explain the conditional intensity function and highlight the immigration-birth representation of a Hawkes process.

We then focus on the special case of a Hawkes process with exponentially decaying intensity and give some properties of the number of jumps of such a process over a fixed interval.

The last part of this section focuses on simulation, parameter estimation and goodness of fit testing of a Hawkes model.

Hawkes Process: Counting Process

The Hawkes process introduced by Hawkes (1971) is a simple

point process that can model a sequence of arrivals over time,

e.g. trade orders, bank defaults or incoming claims.

The counting process N(t) refers to the number of arrivals over

time, and the corresponding point process (t1, t2, ...) refers to the

arrival times (thus the "jumps" of N(t)).

Consider a counting process $(N(t) : t \ge 0)$ with history $(H(t) : t \ge 0)$ that satisfies:

$$P(N(t+h) - N(t) = m \mid H(t)) = \begin{cases} \lambda^*(t)h + o(h), & m = 1 \\ o(h), & m > 1 \\ 1 - \lambda^*(t)h + o(h), & m = 0 \end{cases}$$


Hawkes Process: Conditional Intensity Function

The Hawkes process is self-exciting, exhibits clustering and has long memory. This is reflected in the conditional intensity function:

$$\lambda^*(t) = \lambda + \int_0^t \mu(t - s)\,dN(s)$$

where λ > 0 is the background intensity and µ > 0 is the excitation function describing how much the intensity is affected by past jumps.

Using the observed sequence of arrival times (t1, t2, · · · , tk) up to time t, the conditional intensity can be written as:

$$\lambda^*(t) = \lambda + \sum_{t_i < t} \mu(t - t_i)$$


We can see that each new arrival causes the conditional intensity function to jump up instantaneously and then decay back towards the background intensity until it jumps again at the next arrival. The intensity depends on the whole past history of the process.
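The sum over past arrivals translates almost literally into code. A minimal sketch follows, with a hypothetical function name and an exponential kernel chosen purely for illustration.

```python
import math


def conditional_intensity(t, arrivals, lam, excitation):
    """lambda*(t) = lam + sum of excitation(t - t_i) over arrivals t_i < t."""
    return lam + sum(excitation(t - ti) for ti in arrivals if ti < t)


# Illustrative kernel and arrival times (not from the dataset).
kernel = lambda s: 0.8 * math.exp(-1.2 * s)
arrivals = [1.0, 2.0]
for t in (0.5, 1.0, 1.5, 2.5, 10.0):
    print(t, conditional_intensity(t, arrivals, 0.3, kernel))
```

The strict inequality `ti < t` means the intensity at an arrival time does not yet include that arrival's own jump, matching the left-continuous convention of the sum.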



Of course we can analogously define a multi-dimensional Hawkes process:

Definition 1 A Hawkes process is a counting process $N_t$ such that the intensity vector can be written as

$$\lambda^i_t = \mu^i + \sum_{j=1}^{D} \int \phi^{ij}(t - t')\,dN^j_{t'} \qquad (2)$$

where the quantity $\mu = \{\mu^i\}_{i=1}^{D}$ is a vector of exogenous intensities and $\Phi(t) = \{\phi^{ij}(t)\}_{i,j=1}^{D}$ is a matrix kernel which is component-wise positive ($\phi^{ij}(t) \ge 0$ for each $1 \le i, j \le D$), component-wise causal (if $t < 0$, $\phi^{ij}(t) = 0$ for each $1 \le i, j \le D$) and each component $\phi^{ij}(t)$ belongs to the space of $L^1$-integrable functions.

Hawkes Process: Immigration-Birth-Representation

A Hawkes process X can be represented as a Poisson cluster process with the following structure:

a) Immigrants (cluster centers) are distributed according to a homogeneous Poisson process I with points $X_i \in (0, \infty)$ and intensity λ > 0.

b) Each immigrant (of generation 0) $X_i$ generates a cluster $C_i = C_{X_i}$, a random set with the following branching structure: Given generations $0, 1, ..., n$ in $C_i$, each $Y \in C_i$ (of generation n) generates a Poisson process on $(Y, \infty)$ of offspring (of generation n + 1) with intensity function $\mu(\cdot - Y)$, where $\mu : (0, \infty) \to [0, \infty)$ is a non-negative Borel function called fertility rate.

c) Given the immigrants, the centered clusters $C_i - X_i = \{Y - X_i : Y \in C_i\}$ for $X_i \in I$ are i.i.d. and independent of I.

d) X is the union of all clusters $\bigcup_i C_i$.


Let $0 < n := \int_0^\infty \mu(s)\,ds < 1$.

If an immigrant enters the system at time $t_i \in \mathbb{R}$, they produce offspring at rate $\mu(t - t_i)$ at future times $t > t_i$. Their offspring (called first generation) again produce offspring (second generation) and so on. Members of all generations are called descendants of the original arrival.

Let $Z_i$ denote the random number of offspring in the i-th generation (where $Z_0 = 1$ denotes the immigrant).

Then $E[Z_i] = n^i$ and the expected number of descendants for one immigrant is

$$E\left[\sum_{i=1}^{\infty} Z_i\right] = \sum_{i=1}^{\infty} E[Z_i] = \sum_{i=1}^{\infty} n^i = \begin{cases} \frac{n}{1-n}, & n < 1 \\ \infty, & n \ge 1 \end{cases}$$


We can see that the condition n < 1 avoids explosion of the process (where one immigrant would generate infinitely many descendants).

Most properties of the Hawkes process rely on this so-called stationarity condition, which we will always assume to hold from now on.

Note that for n ∈ (0,1), the branching ratio n can be interpreted

as the probability that a random arrival was generated endoge-

nously (a child) as we can look at the ratio of descendants over

the whole "family" (descendants + original immigrant):

$$\frac{E\left[\sum_{i=1}^{\infty} Z_i\right]}{1 + E\left[\sum_{i=1}^{\infty} Z_i\right]} = \frac{\frac{n}{1-n}}{1 + \frac{n}{1-n}} = n$$


The immigration-birth representation is interesting for us due to

its interpretation in the risk model context:

We observe standard claims which arrive according to the points

of I and trigger other claims according to the branching struc-

ture described before.

The fertility rate is monotonically decreasing, so the process has a self-exciting structure.

This risk process is closely related to the shot-noise Cox model (doubly stochastic Poisson model) studied e.g. in Albrecher (2006), with the main difference that the other model incorporates time-dependent intensity influenced by exogenously caused jumps ("shocks", environmental factors) as opposed to endogenously caused clustering in the Hawkes model.

Hawkes Process: Compensator

Definition 2 (Compensator) For a counting process N(·) the non-decreasing function

$$\Lambda(t) = \int_0^t \lambda^*(s)\,ds$$

is called the compensator of the process.

Note that M(t) := N(t) − Λ(t) is a local H(t)-martingale.

Exponential Hawkes Process: Definition

We now want to make a concrete choice for the excitation function of the Hawkes process, where we choose the most commonly used exponential decay, thus $\mu(t) = \alpha e^{-\beta t}$.

This might seem restrictive (and it is), but until now (almost) all applications of Hawkes processes use this specification as it simplifies many theoretical derivations, as we will see in the following. Thus, from now on our conditional intensity function will be:

$$\lambda^*(t) = \lambda + \alpha \sum_{t_i < t} e^{-\beta(t - t_i)}$$

Thus, each arrival makes the intensity immediately jump up by α and over time the impact of the arrival decays exponentially at rate β.

Exponential Hawkes Process: Definition

For the special case of exponentially decaying intensity, given an initial condition $\lambda^*(0) = \lambda_0$, the conditional intensity process satisfies the SDE

$$d\lambda^*(t) = \beta(\lambda - \lambda^*(t))\,dt + \alpha\,dN(t) \qquad (3)$$

Also note that the joint process $X(t) = (\lambda^*(t), N(t))$ is a Markov process on the state space $D = \mathbb{R}_+ \times \mathbb{N}$, which allows Da Fonseca et al. (2014) to derive the useful results about properties of the number of jumps of a Hawkes process over a fixed interval described in the following.
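The Markov property means the intensity can be updated from its previous value alone, without revisiting the full history. The sketch below (hypothetical function name, illustrative parameters) solves the SDE between jumps and adds α at each arrival.

```python
import math


def intensity_after_arrivals(arrivals, lam, alpha, beta, lam0=None):
    """Intensity immediately after each arrival, computed recursively.

    Between arrivals the SDE gives exponential reversion towards lam;
    at an arrival the intensity jumps by alpha. lam0 defaults to lam.
    """
    current = lam if lam0 is None else lam0
    last_t = 0.0
    values = []
    for t in arrivals:
        current = lam + (current - lam) * math.exp(-beta * (t - last_t))
        current += alpha
        values.append(current)
        last_t = t
    return values


arrivals = [0.5, 1.0, 3.0]
print(intensity_after_arrivals(arrivals, lam=0.3, alpha=0.8, beta=1.2))
```

Each update costs constant time, which is the same observation that later makes the likelihood computable in a single pass.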

Exponential Hawkes Process: Properties

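Proposition 4 Given a Hawkes process $X(t) = (\lambda^*(t), N(t))$ with dynamics given by (3), the long-run expected value of the number of jumps during an interval of length τ is given by:

$$\lim_{t \to \infty} E[N(t+\tau) - N(t)] = \frac{\lambda}{1 - \alpha/\beta}\,\tau = \lim_{t \to \infty} E[\lambda^*(t)]\,\tau \qquad (4)$$

The variance is given by:

$$V(\tau) = \lim_{t \to \infty} E[(N(t+\tau) - N(t))^2] - E[N(t+\tau) - N(t)]^2 \qquad (5)$$

$$= \frac{\lambda}{1 - \alpha/\beta}\left(\tau\left(\frac{1}{1 - \alpha/\beta}\right)^2 + \left(1 - \left(\frac{1}{1 - \alpha/\beta}\right)^2\right)\frac{1 - e^{-\tau(\beta - \alpha)}}{\beta - \alpha}\right) \qquad (6)$$

The covariance for two non-overlapping intervals of length τ with lag δ > 0 is given by:

$$\mathrm{Cov}(\tau, \delta) = \lim_{t \to \infty} E[(N(t+\tau) - N(t))(N(t+2\tau+\delta) - N(t+\tau+\delta))] \qquad (7)$$

$$- E[N(t+\tau) - N(t)]\,E[N(t+2\tau+\delta) - N(t+\tau+\delta)] \qquad (8)$$

$$= \frac{\lambda\beta\alpha(2\beta - \alpha)(e^{(\alpha-\beta)\tau} - 1)^2}{2(\alpha - \beta)^4}\,e^{(\alpha-\beta)\delta} \qquad (9)$$

Note that taking the limit for t → ∞ (thus putting the process into its long-run stationary regime) to remove the dependence on the initial value λ0 again requires the stability of the process, which for exponential intensity means $\int_0^\infty \alpha e^{-\beta s}\,ds = \alpha/\beta < 1$.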
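The long-run moments are plain closed-form expressions, so they can be coded directly from (4), (6) and (9). The function names below are hypothetical; the check values use a Poisson special case (α = 0) and arbitrary illustrative parameters.

```python
import math


def hawkes_count_mean(tau, lam, alpha, beta):
    """Long-run expected number of jumps in an interval of length tau, eq. (4)."""
    return lam / (1.0 - alpha / beta) * tau


def hawkes_count_variance(tau, lam, alpha, beta):
    """Long-run variance of the number of jumps, eq. (6)."""
    kappa = 1.0 / (1.0 - alpha / beta)
    rate = beta - alpha
    return lam * kappa * (tau * kappa ** 2
                          + (1.0 - kappa ** 2) * (1.0 - math.exp(-tau * rate)) / rate)


def hawkes_count_covariance(tau, delta, lam, alpha, beta):
    """Long-run covariance of counts on two intervals separated by lag delta, eq. (9)."""
    d = alpha - beta
    return (lam * beta * alpha * (2.0 * beta - alpha) * (math.exp(d * tau) - 1.0) ** 2
            / (2.0 * d ** 4) * math.exp(d * delta))


# With alpha = 0 the process is Poisson: variance equals mean, covariance is zero.
print(hawkes_count_mean(5.0, 2.0, 0.0, 1.0), hawkes_count_variance(5.0, 2.0, 0.0, 1.0))
# With self-excitation the counts are overdispersed.
print(hawkes_count_mean(1.0, 1.0, 0.5, 1.0), hawkes_count_variance(1.0, 1.0, 0.5, 1.0))
```

All three functions assume α < β; outside that range the long-run limits do not exist.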

Exponential Hawkes Process: Properties

Proposition 5 A direct consequence of the last result is the autocorrelation function of the number of jumps over intervals of length τ separated by a time lag of δ:

$$\mathrm{Acf}(\tau, \delta) = \frac{e^{-2\beta\tau}(e^{\alpha\tau} - e^{\beta\tau})^2\,\alpha(\alpha - 2\beta)}{2\left(\alpha(\alpha - 2\beta)(e^{(\alpha-\beta)\tau} - 1) + \beta^2\tau(\alpha - \beta)\right)}\,e^{(\alpha-\beta)\delta} \qquad (10)$$

Note that this expression is always positive for α < β (stationarity condition) and it is exponentially decaying with the lag δ.
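A short sketch of the autocorrelation formula (10) makes the two remarks above checkable. The function name is hypothetical, and the numerator is written in the algebraically equal form (e^{(α−β)τ} − 1)² to avoid large intermediate exponentials; that rewriting is a choice made here.

```python
import math


def hawkes_count_acf(tau, delta, alpha, beta):
    """Autocorrelation of jump counts on intervals of length tau with lag delta.

    Uses e^{-2 beta tau} (e^{alpha tau} - e^{beta tau})^2 = (e^{(alpha-beta) tau} - 1)^2.
    Assumes 0 < alpha < beta.
    """
    d = alpha - beta
    numerator = (math.exp(d * tau) - 1.0) ** 2 * alpha * (alpha - 2.0 * beta)
    denominator = 2.0 * (alpha * (alpha - 2.0 * beta) * (math.exp(d * tau) - 1.0)
                         + beta ** 2 * tau * d)
    return numerator / denominator * math.exp(d * delta)


# Illustrative parameters with alpha < beta.
for lag in (0.0, 1.0, 5.0, 20.0):
    print(lag, hawkes_count_acf(1.0, lag, alpha=0.5, beta=1.0))
```

Increasing the lag by one unit multiplies the autocorrelation by e^{α−β}, so α close to β produces very slow decay.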

Exponential Hawkes Process: Simulation

In order to simulate a Hawkes process, there are different possible methods. We elect to use the modified thinning algorithm introduced in Ogata (1981) and described again in Laub et al. (2015).

The original thinning algorithm was used by Lewis et al. (1979)

to simulate a non-homogeneous Poisson process with time-dependent

rate λ(t) by generating a homogeneous Poisson process with rate

M > λ(·) and probabilistically removing points so that the re-

maining points satisfy the intensity λ(t).

An analogous approach that requires updating the upper bound

M during the simulation can be used to simulate a Hawkes pro-

cess.
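A thinning simulation for the exponential kernel can be sketched as follows. This is an illustrative implementation under the assumption that the process starts with no history (intensity λ at time 0); the function name and parameter values are hypothetical. Because the intensity only decays between arrivals, its current value is a valid upper bound M until the next accepted point.

```python
import math
import random


def simulate_hawkes_exp(lam, alpha, beta, horizon, seed=None):
    """Simulate arrival times on (0, horizon] by thinning."""
    rng = random.Random(seed)
    arrivals = []
    t = 0.0
    excess = 0.0  # current value of lambda*(t) - lam
    while True:
        upper = lam + excess             # bound M, valid until the next arrival
        wait = rng.expovariate(upper)    # candidate from a rate-M Poisson process
        t += wait
        if t > horizon:
            return arrivals
        excess *= math.exp(-beta * wait)  # intensity has decayed by now
        if rng.random() * upper <= lam + excess:
            arrivals.append(t)           # accept with probability lambda*(t) / M
            excess += alpha              # self-excitation jump
        # on rejection the decayed intensity simply becomes the new bound


events = simulate_hawkes_exp(lam=1.0, alpha=0.5, beta=1.0, horizon=2000.0, seed=42)
print(len(events))  # long-run mean is lam / (1 - alpha/beta) * horizon
```

Updating the bound after every candidate, accepted or not, keeps the rejection rate low compared with a single global bound.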



Exponential Hawkes Process: Parameter Estimation

Given a set of arrival times (t1, t2, · · · , tk) assumed to come from

a Hawkes process, we would like to generate parameter estimates

(λ, α, β) by the method of maximum likelihood estimation. We

start by citing the result from Daley, Vere-Jones (2003):

Theorem 4 (Hawkes Process Likelihood) Let N(·) be a regular point process on [0, T] for some finite T > 0 and let $t_1, \dots, t_k$ be a realisation of N(·) over [0, T]. Then the likelihood L of N(·) is expressible in the form

$$L = \left(\prod_{i=1}^{k} \lambda^*(t_i)\right) \exp\left(-\int_0^T \lambda^*(s)\,ds\right) \qquad (11)$$


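Given the likelihood function from (11), the log-likelihood for the interval $[0, t_k]$ is given as

$$l = \sum_{i=1}^{k} \log(\lambda^*(t_i)) - \int_0^{t_k} \lambda^*(s)\,ds = \sum_{i=1}^{k} \log(\lambda^*(t_i)) - \Lambda(t_k) \qquad (12)$$

For a Hawkes process with exponential decay, the compensator can be explicitly computed as

$$\Lambda(t_k) = \lambda t_k - \frac{\alpha}{\beta} \sum_{i=1}^{k} \left[e^{-\beta(t_k - t_i)} - 1\right] \qquad (13)$$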


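In order to make the computation feasible, the term

$$A(i) = \begin{cases} 0, & i = 1 \\ \sum_{j=1}^{i-1} e^{-\beta(t_i - t_j)} = e^{-\beta(t_i - t_{i-1})}(1 + A(i-1)), & i \in \{2, \dots, k\} \end{cases}$$

is introduced, such that the log-likelihood (12) of a Hawkes process with exponentially decaying intensity is given by

$$l = \sum_{i=1}^{k} \log(\lambda + \alpha A(i)) - \lambda t_k + \frac{\alpha}{\beta} \sum_{i=1}^{k} \left[e^{-\beta(t_k - t_i)} - 1\right] \qquad (14)$$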
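The recursion for A(i) turns the log-likelihood into a single pass over the arrival times. The sketch below implements (14) and, for comparison, a direct double-sum version; both function names and the sample times are illustrative assumptions.

```python
import math


def hawkes_exp_loglik(arrivals, lam, alpha, beta):
    """Log-likelihood (14) on [0, t_k] using the recursion for A(i)."""
    t_k = arrivals[-1]
    a = 0.0          # A(1) = 0
    prev = None
    loglik = 0.0
    for t in arrivals:
        if prev is not None:
            a = math.exp(-beta * (t - prev)) * (1.0 + a)
        loglik += math.log(lam + alpha * a)
        prev = t
    loglik -= lam * t_k
    loglik += alpha / beta * sum(math.exp(-beta * (t_k - t)) - 1.0 for t in arrivals)
    return loglik


def hawkes_exp_loglik_naive(arrivals, lam, alpha, beta):
    """Same quantity from (12) and (13) with the intensity summed directly."""
    t_k = arrivals[-1]
    loglik = 0.0
    for i, t in enumerate(arrivals):
        loglik += math.log(lam + alpha * sum(math.exp(-beta * (t - s)) for s in arrivals[:i]))
    compensator = lam * t_k - alpha / beta * sum(
        math.exp(-beta * (t_k - s)) - 1.0 for s in arrivals)
    return loglik - compensator


times = [0.3, 1.1, 1.2, 2.5, 4.0]
print(hawkes_exp_loglik(times, 0.5, 0.8, 1.2), hawkes_exp_loglik_naive(times, 0.5, 0.8, 1.2))
```

A maximum likelihood fit would pass the negative of `hawkes_exp_loglik` to a bounded numerical optimizer; that step is left out here.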


In general, the maximum likelihood estimation will be very effective and its consistency, asymptotic normality and efficiency were proved in Ogata (1978).

However, for real datasets there are several significant challenges such as bias for small sample sizes, a high number of local optima and O(k) complexity for large sample sizes, which made Filimonov et al. (2013) state that

"Our overall conclusion is that calibrating the Hawkes process is akin to an excursion within a minefield that requires expert and careful testing before any conclusive step can be taken."

Exponential Hawkes Process: Goodness of Model Fit

After estimating parameters, the next important step is assessing the goodness of fit of the Hawkes process model for real data.

An essential result for this is the following theorem from e.g. Brown et al. (2002).

Theorem 5 (Random Time Change Theorem) Let $\{t_1, t_2, \dots, t_k\}$ be a realisation over time [0, T] from a point process with conditional intensity function λ∗(·). If λ∗(·) is positive over [0, T] and Λ(T) < ∞ a.s., then the transformed points $\{t^*_1, \dots, t^*_k\} = \{\Lambda(t_1), \dots, \Lambda(t_k)\}$ form a Poisson process with unit rate.

As we know the closed form of the compensator (13), we can test the quality of the parameter estimation by transforming the original timepoints and performing standard goodness of fit tests for a unit rate Poisson process on the transformed datapoints.

Exponential Hawkes Process: Goodness of Model Fit

As suggested in Laub et al. (2015), we use the following steps:

• QQ-plot of transformed interarrival times $\{t^*_1, t^*_2 - t^*_1, \dots\}$ against Exp(1)-distribution

• Independence plot of the points $(U_{i+1}, U_i)$, where $U_i = F(t^*_i - t^*_{i-1}) = 1 - e^{-(t^*_i - t^*_{i-1})}$

• Checking for autocorrelation in the sequence of transformed interarrival times
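The inputs to all three diagnostics come from the same time change. The following sketch computes the transformed times, their gaps and the values U_i for an exponential kernel; the function names are hypothetical and the compensator is evaluated directly rather than recursively, which is fine for small samples.

```python
import math


def compensator(t, arrivals, lam, alpha, beta):
    """Closed-form compensator Lambda(t) for the exponential kernel."""
    return lam * t - alpha / beta * sum(
        math.exp(-beta * (t - s)) - 1.0 for s in arrivals if s <= t)


def time_change(arrivals, lam, alpha, beta):
    """Return transformed times, their interarrival gaps and U_i = 1 - exp(-gap)."""
    transformed = [compensator(t, arrivals, lam, alpha, beta) for t in arrivals]
    gaps = [transformed[0]] + [b - a for a, b in zip(transformed, transformed[1:])]
    u = [1.0 - math.exp(-g) for g in gaps]
    return transformed, gaps, u


# Illustrative arrival times and parameters.
transformed, gaps, u = time_change([0.5, 1.0, 2.0, 2.1, 4.0], lam=0.5, alpha=0.8, beta=1.2)
pairs = list(zip(u[1:], u[:-1]))  # points (U_{i+1}, U_i) for the independence plot
print(gaps)
print(pairs)
```

Under a well-fitting model the gaps should look like Exp(1) draws and the pairs should scatter uniformly over the unit square.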

Risk Model with Hawkes Process and RMGCHP

Now we combine the first two sections by studying a risk model where claims arrive according to a Hawkes process.

First, we mention some results obtained in this area by past work. We then introduce the Risk Model with General Compound Hawkes Processes (RMGCHP) suggested by Swishchuk (2017) and the theoretical results obtained for it (Law of Large Numbers and Functional Central Limit Theorem).

Risk Model with Hawkes Process

The first work to consider a risk model with Hawkes claims arrivals was Stabile et al. (2010), who derive the asymptotic behavior of infinite and finite horizon ruin probabilities and asymptotically efficient simulation laws, using the fact that the compound Hawkes process fulfils a large deviation principle and assuming light-tailed claims.

Their work was extended by Zhu (2013) who considered (subexponential) heavy-tailed claims.


Dassios and Zhao (2012) consider a risk process with the arrival

of claims modelled by a dynamic contagion process, generalising

the Hawkes process and the Cox process with shot noise intensity

and thus including both self-excited and externally excited jumps.

They derive generalisations of the Cramer-Lundberg inequality, Lundberg's equation, some asymptotics as well as bounds for the probability of ruin with special attention on the case of exponential jumps.

Dassios and Jang (2012) study a bivariate shot noise self-exciting process for insurance, including a constant rate of exponential decay that could be interpreted as the time value of money.

They derive theoretical distributional properties and use numer-

ical examples to show that this point process could be used for

the modelling of discounted aggregate losses from catastrophic

events.


Cheng and Seol (2018) derive diffusion approximations and thus expressions for the ruin probabilities of risk models with Hawkes claims arrivals, providing numerical examples for exponential and Gamma-distributed jumps.

They construct the diffusion approximation analogously to the classical case and find that the diffusion limit is a Gaussian process that can be decomposed into a centered Gaussian process and an independent Brownian motion.

Contrary to the classical case, the variance function of the diffusion limit is nonlinear in t in general, and can be computed explicitly for a Hawkes process with exponential decay.

Risk Model Based On General Compound Hawkes Process

As a generalisation of the classical risk model, Swishchuk (2017)

proposes a risk model with general compound Hawkes process

(RMGCHP):

$$R(t) = u + ct - \sum_{k=1}^{N_t} a(X_k) \qquad (15)$$

where u is the initial capital, c is the premium rate and claims arrive according to a Hawkes process $N_t$.

The claim sizes $X_k$ follow a continuous-time Markov chain on state space X = {1, ..., n} independent of $N_t$ and a(x) is a continuous and bounded function on X. Special cases of this RMGCHP would be $a(X_k) := X_k$ following a Markov Chain and $a(X_k) := X_k$ i.i.d.

RMGCHP: Theoretical Results

The first important result of Swishchuk (2017) is a Law of Large Numbers for the RMGCHP:

Theorem 6 (LLN) Let R(t) be the risk model defined in (15), and let $X_k$ be a Markov Chain with state space X and stationary probabilities $\pi^*_n$. We suppose that $0 < \mu = \int_0^\infty \mu(s)\,ds < 1$. Then

$$\lim_{t \to \infty} \frac{R(t)}{t} = c - a^* \frac{\lambda}{1 - \mu} \qquad (16)$$

where $a^* = \sum_{k \in X} a(k)\pi^*_k$.


From this Law of Large Numbers follow the net profit condition and premium principle:

Corollary 1 (Net Profit Condition)

$$c > a^* \frac{\lambda}{1 - \mu} \qquad (17)$$

Corollary 2 (Premium Principle)

$$c = (1 + \theta) a^* \frac{\lambda}{1 - \mu} \qquad (18)$$

where θ is the safety loading.


The second important result in Swishchuk (2017) is the following

Functional Central Limit Theorem:

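Theorem 7 (FCLT) Let R(t) be the risk model defined in (15), and $X_k$ be an ergodic Markov Chain with stationary probabilities $\pi^*_n$. We suppose that $0 < \mu = \int_0^\infty \mu(s)\,ds < 1$ and $\int_0^\infty s\mu(s)\,ds < \infty$.

Then:

$$\lim_{t \to \infty} \frac{R(t) - (ct - a^*N(t))}{\sqrt{t}} \overset{D}{=} \sigma\Phi(0, 1)$$

(or in Skorokhod topology (see Skorokhod (1965))):

$$\lim_{n \to \infty} \frac{R(nt) - (cnt - a^*N(nt))}{\sqrt{n}} \overset{D}{=} \sigma W(t) \qquad (19)$$

where Φ(·, ·) is the standard normal random variable and W(t) is a standard Wiener process. The constants are defined by

$$\sigma := \sigma^*\sqrt{\lambda/(1 - \mu)}, \qquad (\sigma^*)^2 := \sum_{i \in X} \pi^*_i \nu(i)$$

$$a^* := \sum_{i \in X} \pi^*_i a(i), \qquad b(i) := a^* - a(i)$$

$$\nu(i) := b(i)^2 + \sum_{j \in X} (g(j) - g(i))^2 P(i, j) - 2b(i) \sum_{j \in X} (g(j) - g(i)) P(i, j)$$

$$g := (P + \Pi^* - I)^{-1}(b(1), \dots, b(n))^T$$

where P is the transition matrix for $X_k$ and Π∗ is the matrix of stationary probabilities of P.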
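The constants a∗ and σ can be computed from P and a with a few lines of linear algebra. The sketch below is an illustration only: the helper names are hypothetical, the stationary distribution is obtained by power iteration (which assumes an ergodic, aperiodic chain), and the linear system is solved with a small Gauss-Jordan routine instead of an explicit inverse.

```python
import math


def stationary_distribution(P, iterations=10000):
    """Stationary row vector of P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(pi)
    return [p / total for p in pi]


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fclt_sigma(a, P, lam, mu_hat):
    """Return (a_star, sigma) from the definitions following (19)."""
    n = len(a)
    pi = stationary_distribution(P)
    a_star = sum(pi[i] * a[i] for i in range(n))
    b = [a_star - a[i] for i in range(n)]
    # Matrix P + Pi* - I, where every row of Pi* equals the stationary vector.
    A = [[P[i][j] + pi[j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    g = solve_linear(A, b)
    nu = [b[i] ** 2
          + sum((g[j] - g[i]) ** 2 * P[i][j] for j in range(n))
          - 2.0 * b[i] * sum((g[j] - g[i]) * P[i][j] for j in range(n))
          for i in range(n)]
    sigma_star = math.sqrt(sum(pi[i] * nu[i] for i in range(n)))
    return a_star, sigma_star * math.sqrt(lam / (1.0 - mu_hat))


# i.i.d. two-state example: both rows of P equal the stationary distribution.
print(fclt_sigma([1.0, 3.0], [[0.5, 0.5], [0.5, 0.5]], lam=1.0, mu_hat=0.5))
```

In the i.i.d. case the formula for (σ∗)² collapses to the variance of a(X), which gives a quick sanity check on any implementation.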


The FCLT allows us to approximate the risk process R(t) by the jump-diffusion process D(t):

$$R(t) \approx u + ct - N(t)a^* + \sigma W(t) := u + D(t)$$

where a∗ and σ are defined as above, N(t) is a Hawkes process and W(t) is a standard Wiener process.

The rate of approximation is given by

$$E|R(t) - (ct - a^*N(t)) - \sigma W(t)| \le \frac{1}{\sqrt{t}} C(c, a^*, \sigma, \lambda, \mu)$$

or

$$E|R(nt) - (cnt - a^*N(nt)) - \sigma W(t)| \le \frac{1}{\sqrt{n}} C(c, a^*, \sigma, \lambda, \mu, T)$$

from Swishchuk (2000).


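We use the diffusion approximation to calculate the ruin probability in a finite time interval and the ultimate ruin probability.

Theorem 8 (Finite horizon ruin probability)

$$\Psi(u, \tau) = \Phi\left(-\frac{u + (c - a^*\lambda/(1-\mu))\tau}{\sigma\sqrt{\tau}}\right) + e^{-\frac{2(c - a^*\lambda/(1-\mu))}{\sigma^2}u}\,\Phi\left(-\frac{u - (c - a^*\lambda/(1-\mu))\tau}{\sigma\sqrt{\tau}}\right)$$

Theorem 9 (Ultimate ruin probability)

$$\Psi(u) = e^{-\frac{2(c - a^*\lambda/(1-\mu))}{\sigma^2}u}$$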

Implementation of RMGCHP with Empirical Data

We would now like to implement the results for RMGCHP with real data provided by ERGO Group.

We first explain the dataset and data preparation steps.

We then show that a Poisson model is not suitable for this data, thus we estimate parameters for a Hawkes model and test the goodness of fit of the obtained model with respect to real claim arrival times.

Based on the obtained Hawkes model, we use the results from Swishchuk (2017) to calculate premium rates and ruin probabilities using the diffusion approximation.

In order to test the appropriateness of the approximation, we then compare the standard deviation of the empirical process (for large timescales) with the theoretical diffusion coefficients.

Empirical Data: Description

ERGO Group has kindly sent five very comprehensive data sets, one for each reporting year 2010 to 2014.

These include claims that were reported to the company in the

respective year, although the claim might have occurred in an

earlier year and might cause payments from the reporting year

on into other years in the future. The data sets are about claims

from various kinds of 'Rechtsschutzversicherung' (insurance cov-

ering claims from legal disputes, i.e. lawyer fees).


Each data set consists of the following columns:

VNR: Policy Number (e.g. one client)

SCHDLNR: Running number of claim per policy (a client can

have several claims over the years)

SCHDDAT: Date of claim occurrence (this can be earlier than

the reporting year)

ZAHLUNG: Claim payment (one claim can result in several pay-

ments over time, this is the main point of interest)

ZHLGDATUM: Date of the claim payment (temporal structure

of this is main point of interest)

LEISTUNGSART: Type of insurance (there are many classes of

legal expenses insurance, in total 44 classes in this dataset)

STATUS: Has the case been closed or is it still open?

MELDJAHR: Reporting Year


Example of the dataset from the reporting year 2010:


The first three rows belong to the same policy and claim occurrence (on 28.03.2002). The claim was reported in 2010 and has led to three payments by the insurer on future dates (06.05., 06.07. and 11.08.2010) with different claim sizes (745.78, 502.78 and 4.43). The case has been closed.

Empirical Data: Preparation

For a selected subset we extract the columns ZAHLUNG (claim

payment) and ZHLGDATUM (claim payment date) and modify

ZHLGDATUM as follows:

For the reporting year 2010, the starting point 0 is chosen as

31.12.2009 and the time scale is in days. Each date is thus as-

signed a number (days from the start), e.g. 15.01.2010 → 15 or

03.02.2010 → 34.

For fitting data from the reporting year 2011 only, the date 31.12.2010 is chosen as 0 and so on.

For fitting data from all reporting years simultaneously, the starting point is 31.12.2009 and numbers are assigned continuously, i.e. 01.01.2011 → 366.
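The date-to-day-number mapping can be sketched with the standard `datetime` module. The function name is hypothetical; the date format `DD.MM.YYYY` follows the examples above.

```python
from datetime import date, datetime


def day_index(date_string, origin=date(2009, 12, 31)):
    """Number of days between origin (time 0) and a date given as DD.MM.YYYY."""
    d = datetime.strptime(date_string, "%d.%m.%Y").date()
    return (d - origin).days


print(day_index("15.01.2010"))                             # 15
print(day_index("03.02.2010"))                             # 34
print(day_index("01.01.2011"))                             # 366
print(day_index("15.01.2011", origin=date(2010, 12, 31)))  # 15, reporting year 2011 only
```

Passing a different `origin` covers the per-year fits, while the default covers the fit over all reporting years.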

Empirical Data: Preparation

On some days, there is more than one occurrence which is im-

possible for a simple point process. In this case, occurrences are

distributed uniformly (non-random) over the day, e.g. if there

are 3 occurrences on day 15, they are assigned 15, 15.33 and

15.66.
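The deterministic spreading of same-day occurrences can be sketched as below. The function name is hypothetical, and the offsets j/k for the j-th of k events on a day are an assumption that reproduces the example (15, 15.33, 15.66 up to rounding).

```python
from collections import Counter


def spread_within_day(days):
    """Spread k events on the same day d evenly: d, d + 1/k, ..., d + (k-1)/k."""
    counts = Counter(days)
    seen = Counter()
    spread = []
    for d in sorted(days):
        spread.append(d + seen[d] / counts[d])
        seen[d] += 1
    return spread


print(spread_within_day([3, 15, 15, 15, 16]))
```

The result is strictly increasing whenever the input contains integer day numbers, so it is a valid realisation of a simple point process.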

This is questionable, but there is no precedent in the literature for insurance data, and distributing occurrences uniformly at random over the day (as has been done over milliseconds for financial data) leads to very unstable parameter estimates from run to run.

For the following demonstration, we choose the dataset of claims

of type "Firmen-Arbeitsschutz" which have been reported in the

years 2010 to 2014 and which have occurred exactly three years

before their reporting (thus between 2007 and 2011).

This dataset has 1205 events over a period of T = 2882 days.

Empirical Data: Check of Poisson Assumption

First, we plot the number of payment occurrences per week (7 days) over the whole time period and inspect if clustering can be seen.

[Figure: number of payment occurrences per week over the whole time period]

Empirical Data: Check of Poisson Assumption

To justify the use of a Hawkes process, we next plot the inter-arrival times of claim occurrences against an exponential distribution in a QQ-plot. If the arrivals followed a Poisson process, whose inter-arrival times are memoryless (exponentially distributed), we should see a good fit.

→ At this point, we can conclude that a classical risk model would not be appropriate.

Empirical Data: Parameter Estimation

We fit parameters λ, α, β using Maximum Likelihood Optimization where we constrain the parameters by the lower bounds (0, 0, 0) and upper bounds (500, 500, 500) and vary starting values for each parameter between 0 and 10.

This seems reasonable as λ for a Hawkes process should be smaller than the "Poisson" rate of 1205/2882 = 0.4181 events per time step and α and β should not be huge either due to the rather low frequency of events per time step.

Parameter estimates are stable within this range as long as the starting values of λ and β are not both much higher than the one for α, in which case the optimization returns α = 0, which would not be a Hawkes process.

The estimates for this specific dataset are λ = 0.0186, α = 0.0238, β = 0.0249.

We observe α/β < 1, indicating stability of the process, although α and β are quite close together.

Empirical Data: Goodness of Fit

As a first goodness of fit test, we transform arrival times using the closed-form compensator with the estimated parameters and plot their interarrival times against an Exp(1)-distribution.

Note: In literature considering Hawkes process fitting to empirical data, this criterion is never shown to test the goodness of fit (as it is generally difficult to get a good fit here).


To test independence of transformed interarrival times, we plot the points $(U_k, U_{k+1})$ as described above.

Ideally, we should see uniformly scattered points here.

Note: Again, this criterion is generally difficult to meet for empirical data.


As described above, it is generally difficult to meet the standard goodness of fit criteria for empirical data.

For most empirical datasets, the goodness of model fit as described above per se is mediocre. Realistically, we should be doubtful whether the data follows an exponential Hawkes process (as this is a restrictive assumption).

However, even if the data does not strictly follow a Hawkes process with exponential intensity, it might still be possible to describe the characteristics of the data well with a Hawkes process!


To this end, we compare the expected value and the autocorre-

lation of the number of jumps on an interval for the empirical

data with the theoretical values derived by Da Fonseca et al.

(2014).

We find that for all datasets the Hawkes process estimates the number of jumps over any interval very well.

For some datasets, although the fit from the QQ-plot above might not be very good, the Hawkes process almost perfectly matches the autocorrelation for intervals of 7 (14, 28) days and time lags ranging from 7 to 180 (240) days.

Note that a Poisson process would assume the autocorrelation

to be 0 for all lags which is clearly not the case for empirical

data.


For our dataset, we compare the theoretical autocorrelation of the number of jumps of a Hawkes process with λ = 0.0186, α = 0.0238, β = 0.0249 (line) with the autocorrelation of the number of jumps over the same intervals and lags for the empirical arrival times (points). Interval lengths are 28 days and lags range from 7 to 180 by steps of 7.


We observe that the empirical autocorrelation is decreasing with the time lag (as it should be for a Hawkes process) but unfortunately the decrease is not exponential. The slow decay might indicate that a Hawkes process with power law decay would be more suitable. However, in past literature, many authors state that although empirical evidence generally favours power law decay, they choose to work with exponential decay as it is analytically much more convenient. As our work is the first one with empirical insurance data, using exponential decay could thus be justified.

This slow autocorrelation decay however favours |β − α| ≈ 0, which distorts the ratio α/β away from the empirical branching ratio ('primary' claims/immigrants vs. 'secondary' claims/children).

Empirical Data: Risk Process

After fitting a Hawkes process to the empirical claim times, we would like to simulate the risk process and implement the results from Swishchuk (2017). Again, the RMGCHP is defined in (15) as

$$R(t) = u + ct - \sum_{k=1}^{N_t} a(X_k)$$

where u is the initial capital, c is the premium rate and claims arrive according to a Hawkes process $N_t$.

The claim sizes $X_k$ follow a continuous-time Markov chain on state space X = {1, ..., n} independent of $(N_t)$ and a(x) is a continuous and bounded function on X. Special cases of this RMGCHP would be $a(X_k) := X_k$ following a Markov Chain and $a(X_k) := X_k$ i.i.d.


We do all the following computations for the case of i.i.d. claim

sizes with two states as well as claims following a Markov Chain

with 2,3,4 and 10 states. In order to choose the function values

a(Xk), we follow the quantile-based approach used in Swishchuk

et al. (2017) described below.

For a Markov Chain with n states, we divide the empirical claim

sizes along n quantiles and assign to each 'section' the condi-

tional mean claim size. In order to obtain the transition matrix,

we count empirical occurrence frequencies of transitions from

one claim size to another.
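A minimal sketch of this quantile-based construction is given below. The function name `fit_claim_chain` and the synthetic lognormal claims are assumptions for illustration only; ties in the data or very small samples (which could leave a state empty) are not handled.

```python
import numpy as np

def fit_claim_chain(claims, n_states):
    """Bin claims into n_states quantile sections and estimate a Markov chain.

    Returns the conditional mean claim size per section, the empirical
    transition matrix between consecutive claims, and the state sequence.
    """
    claims = np.asarray(claims, dtype=float)
    # interior quantile boundaries, e.g. the quartiles for n_states = 4
    inner_edges = np.quantile(claims, np.linspace(0.0, 1.0, n_states + 1)[1:-1])
    states = np.searchsorted(inner_edges, claims, side="right")  # values 0..n_states-1
    cond_means = np.array([claims[states == s].mean() for s in range(n_states)])
    # count transitions from one claim to the next
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1.0)
    transition = counts / counts.sum(axis=1, keepdims=True)
    return cond_means, transition, states

# Synthetic claim sizes (hypothetical, not the talk's dataset)
rng = np.random.default_rng(1)
demo_claims = rng.lognormal(mean=6.0, sigma=1.0, size=1200)
demo_a, demo_P, demo_states = fit_claim_chain(demo_claims, n_states=4)
```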


For the example dataset, e.g. for four states, we would obtain
the conditional mean claim sizes

$$a = (a(1), a(2), a(3), a(4)) = (157.788,\ 512.2633,\ 1043.4839,\ 2277.3009)$$

and transition matrix

$$P = \begin{pmatrix}
0.2947020 & 0.2384106 & 0.2450331 & 0.2218543 \\
0.2748344 & 0.2748344 & 0.2384106 & 0.2119205 \\
0.2408027 & 0.2575251 & 0.2575251 & 0.2441472 \\
0.1926910 & 0.2325581 & 0.2524917 & 0.3222591
\end{pmatrix}$$


In the next step, we can find the stationary distribution as

$$\pi^* = (\pi^*_1, \pi^*_2, \pi^*_3, \pi^*_4) = (0.2508306,\ 0.2508306,\ 0.2483389,\ 0.25)$$

Then we can find the value of a∗ required for the LLN (16) and
FCLT (19):

$$a^* = \sum_{k \in X} a(k)\,\pi^*_k = \cdots = 996.5322$$

Note that this value is very close to the mean claim size of
996.5711, as should be expected for our choice of a(Xk) and
the stationary distribution above.
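These two numbers can be checked with a few lines of NumPy. The sketch below uses the four-state values quoted on this slide; solving the stationarity equations by least squares is one possible choice made here, not necessarily the method used for the talk.

```python
import numpy as np

# Four-state transition matrix and conditional mean claim sizes from the slide
P = np.array([
    [0.2947020, 0.2384106, 0.2450331, 0.2218543],
    [0.2748344, 0.2748344, 0.2384106, 0.2119205],
    [0.2408027, 0.2575251, 0.2575251, 0.2441472],
    [0.1926910, 0.2325581, 0.2524917, 0.3222591],
])
a = np.array([157.788, 512.2633, 1043.4839, 2277.3009])

def stationary_distribution(transition):
    """Solve pi P = pi together with sum(pi) = 1 in the least-squares sense."""
    n = transition.shape[0]
    lhs = np.vstack([transition.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0  # normalisation constraint
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

pi_star = stationary_distribution(P)
a_star = float(a @ pi_star)  # weighted mean claim size under pi*
```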


As we do not have information on the initial capital u or on
the premium rate c, we set the initial capital for our dataset as
u = 100 and calculate the premium rate using the expected value
principle in (18) with a safety loading θ = 0.1:

$$c = (1+\theta)\, a^* \frac{\lambda}{1-\mu} = 461.5339$$

with µ = α/β.
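As a quick arithmetic check, the premium rate can be recomputed from the estimated Hawkes parameters and the four-state value of a∗. The helper name `expected_value_premium` is hypothetical.

```python
def expected_value_premium(a_star, lam, alpha, beta, theta):
    """Premium rate c = (1 + theta) * a_star * lam / (1 - alpha / beta)."""
    mu = alpha / beta  # branching ratio of the exponential-kernel Hawkes process
    if mu >= 1.0:
        raise ValueError("the Hawkes process must be stationary (alpha < beta)")
    return (1.0 + theta) * a_star * lam / (1.0 - mu)

# Values from the slides: a* = 996.5322, (lambda, alpha, beta) = (0.0186, 0.0238, 0.0249)
c = expected_value_premium(996.5322, lam=0.0186, alpha=0.0238, beta=0.0249, theta=0.1)
```

With these inputs the result agrees with the value 461.5339 quoted above up to rounding.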


Given u and c, we can now plot the empirical risk process using

the actual claimsizes and claim occurrence times:

Note that the final capital at time T for the empirical process is
130272.5 (if ruin did not occur before time T here).


Analogously, we can simulate risk processes using a Hawkes process
with the estimated parameters λ = 0.0186, α = 0.0238,
β = 0.0249 for the arrival times and a simulated Markov Chain with
transition matrix P and claim sizes a(Xk) for the different states:
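One way such a simulation could be set up is sketched below. It uses Ogata-style thinning for the exponential-kernel Hawkes process; this is a standard technique chosen here for illustration, and the talk does not state which simulation method was actually used. All function names, the two-state toy chain, the uniform initial state and the demo parameters are assumptions.

```python
import numpy as np

def simulate_hawkes_exp(lam, alpha, beta, horizon, rng):
    """Thinning for the intensity lam + sum(alpha * exp(-beta * (t - t_i)))."""
    times = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time
    while True:
        bound = lam + excitation  # valid bound: intensity only decays until the next event
        wait = rng.exponential(1.0 / bound)
        t += wait
        if t >= horizon:
            break
        excitation *= np.exp(-beta * wait)
        if rng.uniform() * bound <= lam + excitation:  # accept candidate point
            times.append(t)
            excitation += alpha
    return np.array(times)

def simulate_markov_chain(transition, n_steps, rng):
    """Simulate state indices; the initial state is drawn uniformly (assumption)."""
    transition = transition / transition.sum(axis=1, keepdims=True)  # guard against rounding
    n_states = transition.shape[0]
    states = np.empty(n_steps, dtype=int)
    if n_steps > 0:
        states[0] = rng.integers(n_states)
    for k in range(1, n_steps):
        states[k] = rng.choice(n_states, p=transition[states[k - 1]])
    return states

def simulate_risk_path(u, c, lam, alpha, beta, transition, claim_values, horizon, rng):
    """Return claim times, claim sizes, capital right after each claim, final capital."""
    times = simulate_hawkes_exp(lam, alpha, beta, horizon, rng)
    states = simulate_markov_chain(transition, len(times), rng)
    claims = np.asarray(claim_values, dtype=float)[states]
    capital_after = u + c * times - np.cumsum(claims)
    final_capital = u + c * horizon - claims.sum()
    return times, claims, capital_after, final_capital

# Toy example with hypothetical values (not the estimated parameters)
rng = np.random.default_rng(42)
toy_P = np.array([[0.7, 0.3], [0.4, 0.6]])
demo_times, demo_claims, demo_capital, demo_final = simulate_risk_path(
    u=10.0, c=5.0, lam=1.0, alpha=2.0, beta=5.0,
    transition=toy_P, claim_values=[1.0, 5.0], horizon=100.0, rng=rng)
```

With the estimated parameters, α/β is close to 1, so individual simulated paths can differ strongly from each other.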


In order to evaluate how well the simulated risk process fits the
empirical one, we could use the following criteria adapted from
Zhang (2016). First, we look at how well the simulated process
\hat{R}(t) matches the empirical one in terms of fluctuations:

$$S(L) = \frac{1}{L} \sum_{i=1}^{L} \frac{\max_t \hat{R}_i(t) - \min_t \hat{R}_i(t)}{\max_t R(t) - \min_t R(t)}$$

where L is the number of simulated paths.

Furthermore, we consider how far the final capital of the simulated
paths is from the empirical one by considering:

$$F(L) = \frac{1}{L} \sum_{i=1}^{L} \frac{\hat{R}_i(T)}{R(T)}$$
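The two criteria translate directly into code. The sketch below assumes that each simulated path is available as an array of capital values and that the empirical path is given in the same way; the function names are hypothetical.

```python
import numpy as np

def fluctuation_score(simulated_paths, empirical_path):
    """S(L): mean ratio of the simulated ranges (max - min) to the empirical range."""
    empirical = np.asarray(empirical_path, dtype=float)
    empirical_range = empirical.max() - empirical.min()
    ratios = [(np.max(path) - np.min(path)) / empirical_range for path in simulated_paths]
    return float(np.mean(ratios))

def final_capital_score(simulated_final_capitals, empirical_final_capital):
    """F(L): mean ratio of the simulated final capitals to the empirical one."""
    finals = np.asarray(simulated_final_capitals, dtype=float)
    return float(np.mean(finals / empirical_final_capital))

# Tiny hand-made example: ranges 2 and 4 against an empirical range of 2
s_demo = fluctuation_score([[0.0, 2.0], [0.0, 4.0]], [0.0, 1.0, 2.0])
f_demo = final_capital_score([100.0, 300.0], 200.0)
```

Values of S(L) and F(L) close to 1 indicate that the simulated paths resemble the empirical one in range and final capital.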


We compare the results for a "classical" Poisson risk model with
i.i.d. claim sizes and three Hawkes models with Hawkes process
arrivals: i.i.d. claims (two states) and Markov Chain
claims (2, 3, 4, 10 states; as the results are very similar for all
Hawkes models, we only print the 2- and 10-state cases here). We use
L = 1000 simulations.

We can see that a Poisson model would underestimate the fluctuation,
whereas a Hawkes model (independent of the claim size
modelling) overestimates it. Furthermore, a Hawkes model
unfortunately systematically overestimates the final capital (fc).


Model                   S(L)      F(L)     Mean (fc)   Std. (fc)
Poisson                 0.40817   1.0335   134639.5    31063.27
Hawkes (i.i.d.)         1.8706    3.6356   473626.1    449849.9
Hawkes (2-state MC)     1.9065    3.6543   476054.1    470973.7
Hawkes (10-state MC)    1.8716    3.5501   462483.5    460574.8


We would now like to implement the results from the FCLT and
diffusion approximation (19) derived in Swishchuk (2017). To
this end, we first compute the values of a∗, σ∗ and the diffusion
coefficient

$$\sigma = \sigma^* \sqrt{\frac{\lambda}{1-\alpha/\beta}}$$

for MC claims with different numbers of states.

Model                    a∗         σ∗         σ
Hawkes (i.i.d.)          996.5711   708.753    459.8908
Hawkes (2 states MC)     996.0189   718.3519   466.1193
Hawkes (3 states MC)     996.7685   832.8063   540.3856
Hawkes (4 states MC)     996.5322   898.3323   582.9037
Hawkes (10 states MC)    996.6928   1002.043   650.199

Note that a∗ stays constant (which makes sense considering our

choice of a(Xk)), but σ∗ and accordingly σ increase with the

number of states.
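The last column of the table can be recomputed from σ∗ and the estimated Hawkes parameters, since σ is σ∗ scaled by the square root of λ/(1 − α/β). The short sketch below (with a hypothetical helper name) does this for two rows; how σ∗ itself is obtained for the Markov Chain models is not shown here.

```python
import math

def diffusion_coefficient(sigma_star, lam, alpha, beta):
    """sigma = sigma_star * sqrt(lam / (1 - alpha / beta))."""
    return sigma_star * math.sqrt(lam / (1.0 - alpha / beta))

# Estimated parameters (lambda, alpha, beta) = (0.0186, 0.0238, 0.0249)
sigma_iid = diffusion_coefficient(708.753, 0.0186, 0.0238, 0.0249)    # i.i.d. row
sigma_mc10 = diffusion_coefficient(1002.043, 0.0186, 0.0238, 0.0249)  # 10-state row
```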

Empirical Data: Ruin Probabilities

Given the diffusion approximation from Swishchuk (2017) with the
diffusion coefficients σ calculated before, we can use (8) to
calculate ruin probabilities over intervals of length T.

We compare the obtained numbers with ruin probabilities
obtained from L = 1000 simulations of the risk process until time T.

Model                   Theoretical RP   Simulated RP
Hawkes (i.i.d.)         0.6724           0.176
Hawkes (2-state MC)     0.6993           0.199
Hawkes (3-state MC)     0.7502           0.182
Hawkes (4-state MC)     0.781162         0.19
Hawkes (10-state MC)    0.8199346        0.189

Empirical Data: Error Estimation

In order to understand the discrepancies between theoretical and
empirical ruin probabilities, we would like to assess the
appropriateness of the diffusion approximation. We proceed as suggested
in Swishchuk et al. (2017).

Given the FCLT

$$\lim_{n\to\infty} \frac{R(nt) - (cnt - a^* N(nt))}{\sqrt{n}} \overset{D}{=} \sigma W(t)$$

we would like to compare the standard deviation of the RHS
multiplied by \sqrt{n}, that is

$$\sqrt{n}\,\sqrt{t}\,\sigma^* \sqrt{\frac{\lambda}{1-\alpha/\beta}},$$

to its counterpart on the LHS, that is the standard deviation of the process

$$R(nt) - (cnt - a^* N(nt)) = u - \sum_{k=1}^{N(nt)} (a(X_k) - a^*) \qquad (20)$$


For the empirical dataset, we choose t to be our original time
scale of one day and first let n run from 7 to 1435 in steps of 7.
For each n, we compute the increments of the process over consecutive
intervals of length nt (i = 1, 2, ...),

$$\big(R(int) - (c\,int - a^* N(int))\big) - \big(R((i-1)nt) - (c(i-1)nt - a^* N((i-1)nt))\big),$$

thus we consider intervals of e.g. one week in the first step.

We then compare the standard deviation of these values to the
standard deviation theoretically obtained on the RHS of (19)
multiplied by \sqrt{n}.

Note that this approximation should naturally only be accurate
for large n; however, due to our relatively short time frame of
2882 days, for large n the LHS of (19) is based on only a few
observations.
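The comparison described above can be sketched as follows. The increments of (20) over consecutive blocks of length nt are (up to sign) the block sums of the centred claims a(Xk) − a∗, so their standard deviation can be computed per block length and set against the theoretical value. Function names and the synthetic Poisson check are assumptions for illustration only.

```python
import numpy as np

def centred_increment_sd(event_times, claim_values, a_star, block_length, horizon):
    """Empirical sd of the block sums of (a(X_k) - a_star) over consecutive blocks."""
    times = np.asarray(event_times, dtype=float)
    centred = np.asarray(claim_values, dtype=float) - a_star
    n_blocks = int(horizon // block_length)
    block_index = (times // block_length).astype(int)
    keep = block_index < n_blocks  # drop claims in an incomplete last block
    sums = np.bincount(block_index[keep], weights=centred[keep], minlength=n_blocks)
    # the increments of (20) are minus these sums; the sign does not change the sd
    return float(np.std(sums, ddof=1)), n_blocks

def theoretical_sd(n, t, sigma_star, lam, alpha, beta):
    """sqrt(n) * sqrt(t) * sigma_star * sqrt(lam / (1 - alpha / beta))."""
    return float(np.sqrt(n * t) * sigma_star * np.sqrt(lam / (1.0 - alpha / beta)))

# Synthetic check with Poisson arrivals (alpha = 0) and Exp(1) claims,
# where the two quantities should agree closely.
rng = np.random.default_rng(2)
T = 20000.0
n_events = rng.poisson(1.0 * T)
check_times = np.sort(rng.uniform(0.0, T, size=n_events))
check_claims = rng.exponential(1.0, size=n_events)
sd_emp, n_used = centred_increment_sd(check_times, check_claims, 1.0, 10.0, T)
sd_theo = theoretical_sd(n=10, t=1.0, sigma_star=1.0, lam=1.0, alpha=0.0, beta=1.0)
```

The returned number of blocks makes explicit how few observations remain for large n on a 2882-day sample.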


The following plot shows the empirical standard deviation (points)
for different models against the theoretical standard deviation
(lines).

The empirical standard deviation is only plotted for a model with

i.i.d. claims (two sizes), as the results look very similar for all

models on this scale.


We have to keep in mind that empirical standard deviations for

large values of nt are based on very few samples.

If we look at the standard deviations for a sequence of n from 1 to
35 (five weeks) in steps of 1 day, we obtain the following picture,
where now all values are based on at least 80 observations.

The first plot shows the result for a Hawkes model with i.i.d.
claims (two sizes).

We now compare the results for Hawkes models with Markov
Chain claims with different numbers of states.

Evaluation and Next Steps

In the following, I would like to summarize some concerns about

the results obtained so far and possible next steps:

1. Parameter Estimation for Hawkes process leads to good
model fits (according to QQ-plot of transformed interarrival
times and independence plot) only for a minority of datasets.

Note: Most applications of Hawkes processes to empirical data
do not show the Hawkes process fit per se, but find other
criteria for model fit (e.g. autocorrelation fit or signature plot for
financial data).

We could thus use other criteria, e.g. good ruin probability
approximations or error estimations for the diffusion approximation.


2. Data Description and Autocorrelation plot suggest that a

Hawkes process (with exponential decay) is not perfectly suit-

able.

As we can see from the empirical datasets, the response to an

initial claim arrival is not instantaneous, but there is usually a

time period between related claim payments (weeks or months

on a total timescale of seven years, thus we cannot make it close

to instantaneous by compressing the timescale).

Considering our work is the first one with Hawkes processes for
empirical insurance data, its use could probably still be justified,
but we have to consider this when evaluating risk model fits.

Next Steps: Power Law Kernel

One possibility is fitting a Hawkes process with a power-law kernel
instead of an exponential kernel.

In this case, the conditional intensity function is

$$\lambda^*(t) = \lambda + \sum_{t_i < t} \frac{k}{(t - t_i + c)^{1+\eta}}$$

where k, c, η > 0.

In this case, interpretation of the parameters is not so
straightforward, but roughly one could say that k corresponds to α,
describing the upward jump of the intensity caused by an arrival
(the magnitude of its influence), η corresponds to β, determining
how long-lasting the impact of the arrival is, and c describes
a temporal shift to keep the intensity bounded when (t − ti) is
close to 0.


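For a power law kernel we find that the stationarity condition is

$$\frac{k\,c^{-\eta}}{\eta} < 1$$

and the compensator is given as

$$\Lambda(t) = \int_0^t \lambda^*(s)\,ds = \lambda t + \frac{k}{\eta} \sum_{t_j < t} \left(c^{-\eta} - (t - t_j + c)^{-\eta}\right)$$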


Thus, we can deduce the loglikelihood function analogously to
the exponential case in Laub et al. (2015) given in (14)
and get:

$$l = \sum_{i=1}^{n} \log\Big(\lambda + k \sum_{j=1}^{i-1} (t_i - t_j + c)^{-(1+\eta)}\Big) - \lambda T + \frac{k}{\eta} \sum_{i=1}^{n} \left((T - t_i + c)^{-\eta} - c^{-\eta}\right)$$

where (t1, ..., tn) are the observed arrivals over the interval [0, T].

Unfortunately, in this case there is no easy way to avoid the
double summation in the first part of the loglikelihood function,
which makes the optimization very slow for a large number
of arrivals. (In the exponential case, we used a recursive
computation derived by Ozaki (1979).)
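A direct, unoptimised implementation of this loglikelihood might look as follows. It keeps the double summation explicit, which makes the quadratic cost in the number of arrivals visible. The function names are hypothetical, and no optimiser or parameter bounds are included.

```python
import numpy as np

def powerlaw_loglik(params, times, horizon):
    """Loglikelihood of a Hawkes process with kernel k / (s + c)**(1 + eta).

    params = (lam, k, c, eta); times are the sorted arrivals in [0, horizon].
    """
    lam, k, c, eta = params
    times = np.asarray(times, dtype=float)
    log_intensity_sum = 0.0
    for i, t_i in enumerate(times):
        lags = t_i - times[:i]  # distances to all earlier arrivals
        intensity = lam + k * np.sum((lags + c) ** (-(1.0 + eta)))
        log_intensity_sum += np.log(intensity)
    # compensator Lambda(T) from the slide
    compensator = lam * horizon + (k / eta) * np.sum(
        c ** (-eta) - (horizon - times + c) ** (-eta))
    return float(log_intensity_sum - compensator)

def powerlaw_branching_ratio(k, c, eta):
    """Left-hand side of the stationarity condition k * c**(-eta) / eta < 1."""
    return k * c ** (-eta) / eta

# Small example with the simulation parameters (lambda, k, c, eta) = (0.5, 1, 1, 2)
ll_demo = powerlaw_loglik((0.5, 1.0, 1.0, 2.0), [1.0, 2.0], horizon=3.0)
ratio_demo = powerlaw_branching_ratio(1.0, 1.0, 2.0)
```

For maximum likelihood estimation one would pass the negative of `powerlaw_loglik` to a bounded numerical optimiser of choice.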


One startling observation so far for constrained MLE for a power

law kernel is that for simulated data, even given the correct pa-

rameters as starting values, the estimated parameters (λ, k, c, η)

are rather far away from the real ones, but the conditional in-

tensity function over time looks quite similar.

We simulate a Hawkes process with power law kernel and parame-

ters (λ, k, c, η) = (0.5,1,1,2) over an interval of [0,100]. Parame-

ters are estimated as (λ, k, c, η) = (0.3959,499.9999,2.8507,4.8385)

for an upper bound of 500 for each parameter. We plot the con-

ditional intensity function for both sets of parameters below.




For the empirical dataset from above, we calculate parameter
estimates (λ, k, c, η) using the lower bound (0, 0, 0, 0) and varying
the upper bound for each parameter up to 5000.

We find that λ, c and η are mostly independent of the starting
values, but k is always estimated as the upper bound (due to
the long computation time, a large number of runs with different
values was not feasible). We nevertheless plot the intensities
with bounds 500 and 5000 and compare them with the intensity
from the exponential kernel estimated before.

We find that on the overall time period [0, 2882], the intensities
look quite similar; however, if we zoom in on the period
[0, 200], we observe that a power law kernel would favor a faster
decay than an exponential kernel (which would contradict our
intention of using it). This might of course be caused by the
problems when estimating k and could be studied further.




3. Risk Model Computations and Ruin Probabilities are inaccu-

rate when comparing theoretical formulas and simulations.

This holds for all datasets and even for simulated data when

using the correct parameters as estimations.

A possible reason for this might be that α ≈ β, so the Hawkes

process is close to unstable and thus outcomes are very "volatile"

over simulations.


4. Claim Size Modelling with a Markov Chain is not consistent

with empirical process.

Because of the way the empirical insurance claim portfolio is constructed
(various claims with more than one payment are bundled together
on the same timescale), claim sizes that directly follow each
other in the overall portfolio are not necessarily related, as they
generally do not come from the same claim process. Thus, the
use of a Markov Chain that connects claim sizes is not really
justifiable. One possibility would be to use i.i.d. exponentially
distributed claims (as is commonly done in the insurance literature).

Next Steps: Claim Sizes as i.i.d. Exp(γ)

The risk model would have the following form:

$$R(t) = u + ct - \sum_{i=1}^{N_t} Y_i$$

where u is the initial capital, c is the premium rate and the
number of claims in the interval [0, t) is a Hawkes process Nt.

The claims Yi are i.i.d. Exp(γ)-distributed random variables with
mean µG = 1/γ, independent of (Nt).


In order to check whether this approach would be suitable for
our empirical data, we first plot the empirical c.d.f. of the claim sizes
against an Exp(γ) distribution, where we choose γ = 1/mean(claimsizes).


As two further qualitative tests we could again use a QQ-plot

against the Exponential distribution and the probability integral

transform which should ideally lead to i.i.d. Unif [0,1] variables

(independence plot).
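The quantities behind these three checks can be computed as sketched below. The Kolmogorov-Smirnov distance is added here only as a numerical summary of the c.d.f. comparison and is not part of the talk; the function name and the synthetic data are assumptions.

```python
import numpy as np

def exponential_fit_diagnostics(claims):
    """Quantities for c.d.f. comparison, QQ-plot and PIT of an Exp(gamma) fit."""
    claims = np.asarray(claims, dtype=float)
    n = claims.size
    gamma_hat = 1.0 / claims.mean()  # gamma = 1 / mean(claim sizes)
    sorted_claims = np.sort(claims)
    model_cdf = 1.0 - np.exp(-gamma_hat * sorted_claims)
    # largest gap between the empirical step function and the fitted c.d.f.
    ecdf_upper = np.arange(1, n + 1) / n
    ecdf_lower = np.arange(0, n) / n
    ks_distance = float(max(np.max(ecdf_upper - model_cdf),
                            np.max(model_cdf - ecdf_lower)))
    # theoretical quantiles to plot against sorted_claims in a QQ-plot
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical_quantiles = -np.log(1.0 - probs) / gamma_hat
    # probability integral transform in the original order; for the
    # independence plot one would scatter pit[:-1] against pit[1:]
    pit = 1.0 - np.exp(-gamma_hat * claims)
    return gamma_hat, ks_distance, sorted_claims, theoretical_quantiles, pit

# Synthetic exponential claims with mean 1000 (hypothetical, not the talk's data)
rng = np.random.default_rng(3)
synthetic = rng.exponential(1000.0, size=5000)
g_hat, ks, emp_q, theo_q, pit_values = exponential_fit_diagnostics(synthetic)
```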


Both qualitative tests lead us to believe that modelling claim sizes
as i.i.d. exponentially distributed would be a suitable approach
for this dataset.


If we compute the diffusion approximation for this model based
on the sequence of n = (1, ..., 35) days, we get the following
result. Note that for this case, the value of σ∗ on the RHS of
equation (19) would be

$$\sigma^* = \mathrm{sd}(X_k) = \sqrt{\mathrm{Var}(X_k)} = \frac{1}{\gamma},$$

which is simply the mean of the empirical claim sizes.

Next Steps: Markov-Modulated Claimsizes

For other empirical datasets, the assumption of i.i.d. Exp(γ)

claimsizes does not seem to hold.

Thus, as a generalization, one could consider studying a risk
model where claim sizes are assumed to come from two different
Exponential distributions, say Exp(γ1) and Exp(γ2), depending
on the state of an underlying (unobservable) Markov Chain (with
two states). Thus, we would have a process (Zt, Yt) where (Zt)
is a Markov Chain on the state space S = {1, 2} and (Yt) are the
claim sizes of the risk model. The distribution of an incoming
claim at time t would only depend on the state Zt and be
independent of the other claim sizes.

Next Steps: Markov-Modulated Claimsizes

A general case of such a semi-Markov risk model (for Poisson
arrivals) was studied in Reinhard (1984), and asymptotic non-ruin
probabilities were obtained explicitly for the case above (two
states and Exponential distributions).

More recently, Lu and Li (2005) gave non-ruin probabilities for
more general claim size distributions.

Asmussen (1986) studies risk theory in a Markovian environment,
considering a variety of methods for assessing the ruin probabilities,
in particular Cramér-Lundberg approximations and diffusion
approximations with correction terms.

Next Steps: Diffusion Approximation

The very recent work of Cheng and Seol (2018) also studies a
diffusion approximation for a risk model with Hawkes claims
arrivals and exponentially distributed claim sizes. They construct
the diffusion approximation analogously to the one described for the
classical risk model in Chapter 1 and find that the limit is a Gaussian
process with non-linear variance function (as opposed to the
diffusion towards a standard Brownian motion for the Poisson case).


They start with the risk model

$$R(t) = u + ct - \sum_{i=1}^{N_t} Y_i$$

where (Yi) are i.i.d. claims with E[Y1] = m1 < ∞ and E[Y1²] =
m2 < ∞, independent of the claims arrival process (Nt), which
is assumed to follow a stationary Hawkes process with intensity

$$\lambda^*(t) = \lambda + \sum_{t_i < t} \mu(t - t_i).$$

Let

$$\mu = \int_0^\infty \mu(t)\,dt < \infty.$$


Similarly to the standard diffusion approximation in the "classical"
case, they construct a sequence of processes (R^λ) (where λ
is the background rate of the Hawkes process) as

$$R^\lambda(t) = u + c_\lambda t - \sum_{i=1}^{N^\lambda_t} \frac{1}{\sqrt{\lambda}}\, Y_i$$

where

$$c_\lambda = \frac{\sqrt{\lambda}\, m_1}{1-\mu} + k$$

for a constant k > 0 and N^λ_t describes the
number of arrivals of a Hawkes process with background rate λ.
They study the limit for λ → ∞.


Intuitively, we go from a process with few, big jumps to a process
with smaller jumps at a higher rate. To illustrate this, we choose
a Hawkes process with exponential intensity, i.e.

$$\lambda^*(t) = \lambda + \sum_{t_i < t} \alpha e^{-\beta(t - t_i)} \quad\text{and}\quad \mu = \frac{\alpha}{\beta}.$$

We assume (Yi) to be i.i.d. Exp(γ)-distributed with mean m1 = 1/γ.
We define the sequence of risk processes

$$R_n(t) = u + c^{(n)} t - \sum_{i=1}^{N^{(n)}_t} Y^{(n)}_i$$

where N^{(n)}_t is the number of arrivals of a Hawkes process with
background intensity nλ and (Y^{(n)}_i) are i.i.d. Exp(\sqrt{n}γ)-distributed,
i.e.

$$E[Y^{(n)}_i] =: m^{(n)}_1 = \frac{m_1}{\sqrt{n}}.$$


The premium rate is set as

$$c^{(n)} = \frac{m_1}{1 - \alpha/\beta}\left((1+\theta)\lambda + \sqrt{n} - 1\right),$$

where θ is again the safety loading from the expected value principle
(note that for n = 1 we obtain the premium rate we had used
before for the "original" process).

We then let n → ∞.


The following plot shows realisations of the risk process for dif-

ferent choices of n for a Hawkes process with initial parameters

(λ, α, β) = (1,2,5) and initial claimsizes according to an expo-

nential distribution with γ = 1/4. The initial capital is set as

u = 10 and the safety loading is θ = 0.1.


Cheng and Seol (2018) prove the following theorem:

Theorem 10 Let

$$X^\lambda_t = \sum_{i=1}^{N^\lambda_t} Y_i$$

be the aggregate claims process and assume that µ(·) is a decreasing
function with \int_0^\infty t\,\mu(t)\,dt < ∞. Then, as λ → ∞,

$$\frac{X^\lambda_t - \frac{\lambda m_1 t}{1-\mu}}{\sqrt{\lambda}} \to G \qquad (21)$$

weakly in (D([0,∞), R), J1), where G is a mean-zero almost surely
continuous Gaussian process with covariance function (t ≥ s)

$$\mathrm{Cov}(G(t), G(s)) = m_1^2 \int_s^t \int_0^s \phi(u-v)\,dv\,du + \frac{m_2\, s}{1-\mu} + 2 m_1^2 \int_0^s \int_0^{t_2} \phi(t_2 - t_1)\,dt_1\,dt_2 \qquad (22)$$

where φ : [0,∞) → [0,∞) satisfies the integral equation

$$\phi(t) = \frac{\mu(t)}{1-\mu} + \int_0^\infty \mu(t+v)\,\phi(v)\,dv + \int_0^t \mu(t-v)\,\phi(v)\,dv \qquad (23)$$

As a result,

$$R^\lambda_t \to u + kt - G(t) \qquad (24)$$


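For the special case of a Hawkes process with exponential intensity,
i.e. µ(t) = αe^{−βt}, they compute explicitly (for t ≥ 0)

$$\phi(t) = \frac{\alpha\beta(2\beta-\alpha)}{2(\beta-\alpha)^2}\, e^{-(\beta-\alpha)t}$$

and thus

$$\mathrm{Var}(G(t)) = \left(\frac{\beta}{\beta-\alpha}\, m_2 + \frac{\alpha\beta(2\beta-\alpha)}{(\beta-\alpha)^3}\, m_1^2\right) t - m_1^2\, \frac{\alpha\beta(2\beta-\alpha)}{(\beta-\alpha)^4}\left(1 - e^{-(\beta-\alpha)t}\right) \qquad (25)$$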
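Formula (25) is easy to evaluate numerically. The sketch below uses a hypothetical function name and assumes Exp(γ) claims, so that m1 = 1/γ and m2 = 2/γ²; it evaluates the variance for (α, β) = (2, 5) and γ = 1/4. It only evaluates the closed form and does not simulate the risk process.

```python
import numpy as np

def gaussian_limit_variance(t, alpha, beta, m1, m2):
    """Var(G(t)) from (25) for the exponential kernel mu(t) = alpha * exp(-beta * t)."""
    delta = beta - alpha
    if delta <= 0:
        raise ValueError("formula (25) requires alpha < beta")
    kappa = alpha * beta * (2.0 * beta - alpha)
    linear_part = (beta / delta * m2 + kappa / delta ** 3 * m1 ** 2) * t
    transient_part = m1 ** 2 * kappa / delta ** 4 * (1.0 - np.exp(-delta * t))
    return linear_part - transient_part

# Exp(gamma) claims with gamma = 1/4: first and second moments
gamma = 0.25
m1 = 1.0 / gamma
m2 = 2.0 / gamma ** 2

grid = np.linspace(0.0, 100.0, 101)
variance_curve = gaussian_limit_variance(grid, alpha=2.0, beta=5.0, m1=m1, m2=m2)
var_at_1 = float(gaussian_limit_variance(1.0, alpha=2.0, beta=5.0, m1=m1, m2=m2))
```

The curve is non-linear near t = 0 and approaches a straight line once the exponential term has died out.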


We choose the same parameters as above, (λ, α, β) = (1, 2, 5),
(Yi) ∼ Exp(1/4), u = 10 and θ = 0.1, and simulate 500 runs of
the risk process R^{(n)}_t for n = 100 over the interval [0, 100]. At
each timestep, we compare the variance over the 500 realisations
(black line) with the variance from (25) (red line) and find that
they are very similar.


Cheng and Seol (2018) further derive approximations for the
ruin probability based on the diffusion approximation, but
unfortunately they can only be computed numerically. In their paper,
they provide numerical examples for exponentially and
gamma-distributed claim sizes (based on simulated data).

Next Steps: Optimal Investment

Recall: Theorem 7 derived by Swishchuk (2017) allows us to
approximate the risk process R(t) by the jump-diffusion process
D(t):

$$R(t) \approx u + ct - N(t)\,a^* + \sigma W(t) := u + D(t) \qquad (26)$$

where D(t) is N((c − a∗ λ/(1−µ)) t, σ² t)-distributed and σ is the
diffusion coefficient from (19).


Recall: In his presentation during the Hawks Seminar on July
4, Mohamed briefly mentioned the work of Browne (1995) on
optimal investment for an insurer who can invest in a risky asset
(St) following

$$dS_t = S_t(\mu\,dt + \sigma\,dW_t)$$

when the insurer's risk process is described by

$$R(t) = u + ct + \beta W^{(1)}_t \qquad (27)$$

where W^{(1)}_t is a Brownian motion such that E[W_t W^{(1)}_t] = ρt and ρ
is the correlation coefficient.


As the risk of bankruptcy cannot be eliminated in this model,
he solves for the strategy that minimizes the probability of ruin
for an insurer with exponential utility function. He shows that
this strategy is equivalent to maximizing utility from terminal
wealth, given there is no risk-free asset.

This work is interesting for us, as the representation of our risk
process in (26) is closely related to the risk process used in
Browne (1995) in (27). We would hope to apply the same steps
as in his work to find an optimal strategy for a risk model
based on Hawkes claims arrivals (which to the best of our
knowledge has not been addressed so far).

References

• Albrecher, H. et al. (2006). Ruin probabilities and aggregate claims distributions for shot noise Cox processes.

• Asmussen, S. (1986). Risk theory in a Markovian environment.

• Asmussen, S. and Albrecher, H. (2010). Ruin Probabilities.

• Bacry, E. et al. (2015). Hawkes processes in finance.

• Brown, E. et al. (2002). The time-rescaling theorem and its application to neural spike train data analysis.

• Browne, S. (1995). Optimal Investment Policies for a Firm with a Random Risk Process: Exponential Utility and Minimizing the Probability of Ruin.

• Cheng, Z. et al. (2018). Gaussian Approximation of a Risk Model with Stationary Hawkes Arrivals of Claims.

• Cramer, H. (1955). Collective Risk Theory: A Survey of the Theory from the Point of View of the Theory of Stochastic Processes.

• Da Fonseca, J. and Zaatour, R. (2014). Hawkes process: Fast calibration, application to trade clustering, and diffusive limit.

• Daley, D. and Vere-Jones, D. (2013). An Introduction to the Theory of Point Processes.

• Dassios, A. and Jang, J. (2012). A bivariate shot noise self-exciting process for insurance.

• Dassios, A. and Zhao, H. (2012). Ruin by Dynamic Contagion Claims.

• Filimonov, V. and Sornette, D. (2013). Apparent criticality and calibration issues in the Hawkes self-excited point process model: application to high-frequency financial data.

• Hawkes, A. (1971). Point Spectra of Some Mutually Exciting Point Processes.

• Jaisson, T. and Rosenbaum, M. (2014). Limit theorems for nearly unstable Hawkes processes.

• Laub, J. et al. (2015). Hawkes Processes.

• Lu, Y. and Li, S. (2005). On the Probability of Ruin in a Markov-modulated Risk Model.

• Lundberg, F. (1903). Approximerad Framställning av Sannolikehetsfunktionen, Återförsäkering av Kollektivrisker.

• Mikosch, T. and Samorodnitsky, G. (2000). Ruin Probability with Claims modelled by a stationary ergodic stable process.

• Ogata, Y. (1981). Information Theory.

• Reinhard, J. (1984). On a Class of Semi-Markov Risk Models Obtained as Classical Risk Models in a Markovian Environment.

• Stabile, G. et al. (2010). Risk Processes with Non-stationary Hawkes Claims Arrivals.

• Swishchuk, A. (2017). Risk Model Based on General Compound Hawkes Processes.

• Swishchuk, A. et al. (2017). General Semi-Markov Model for Limit Order Books.

• Swishchuk, A. et al. (2017, preprint). Compound Hawkes Processes in Limit Order Books.

• Swishchuk, A. (2000). Random Evolutions and Their Applications: New Trends.

• Zhang, C. (2016). Modeling High Frequency Data Using Hawkes Processes with Power-law Kernels.

• Zhu, L. (2013). Ruin probabilities for risk processes with non-stationary arrivals and subexponential claims.


The End

Thank You!

Q&A time!
