M Telek, Markov Anniversary Meeting, June 2006.

Analysis of Markov Reward Models with Partial Reward Loss Based on a Time Reverse Approach

Gábor Horváth, Miklós Telek
Technical University of Budapest, 1521 Budapest, Hungary

{hgabor,telek}@webspn.hit.bme.hu

Outline

● Outline

Introduction

Model behaviour

Model description

Analysis approach

Numerical method

Numerical example

Conclusions


Outline

■ Markov Reward models with reward loss

■ The difficulty of the time forward approach

■ The time reverse analysis approach

■ Properties of the obtained solution

■ Numerical examples

■ Conclusions

Markov Reward models without reward loss

Markov reward models (MRM)

■ a finite state CTMC,
■ non negative reward rates (ri),
■ performance measures:

◆ reward accumulated up to time t,
◆ time to accumulate reward w.

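[Figure: sample path of the state process Z(t) over states i, j, k and the corresponding accumulated reward B(t), which grows with slope ri, rj or rk according to the current state.]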
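The two ingredients above (a finite state CTMC and state dependent reward rates) can be sketched as a small simulation. This is only an illustration: the function name `simulate_reward`, the 3-state generator `Q` and the reward vector `r` are made up for the example and do not come from the talk.

```python
import random

def simulate_reward(Q, r, start, t_end, rng):
    """Simulate the CTMC Z(t) on [0, t_end]; return (B(t_end), Z(t_end))."""
    state, clock, reward = start, 0.0, 0.0
    while True:
        exit_rate = -Q[state][state]
        # Absorbing states (exit rate 0) are never left.
        sojourn = rng.expovariate(exit_rate) if exit_rate > 0 else float("inf")
        if clock + sojourn >= t_end:
            # The current sojourn is cut off at t_end.
            reward += r[state] * (t_end - clock)
            return reward, state
        reward += r[state] * sojourn
        clock += sojourn
        # Pick the next state with probability q_ij / (-q_ii).
        u, acc, nxt = rng.random() * exit_rate, 0.0, state
        for j, q in enumerate(Q[state]):
            if j == state or q <= 0.0:
                continue
            nxt = j
            acc += q
            if u < acc:
                break
        state = nxt

# Hypothetical 3-state example (rates and rewards chosen only for illustration).
Q = [[-1.0, 0.6, 0.4],
     [0.5, -0.5, 0.0],
     [0.2, 0.3, -0.5]]
r = [2.0, 1.0, 0.0]
rng = random.Random(1)
samples = [simulate_reward(Q, r, 0, 5.0, rng)[0] for _ in range(2000)]
mean_reward = sum(samples) / len(samples)  # Monte Carlo estimate of E[B(5)]
```

Averaging many sampled values of B(t) gives a rough estimate of the first performance measure (reward accumulated up to time t); the analytical methods discussed in the talk avoid this sampling noise.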

Markov Reward models with total reward loss

We consider

■ first order MRM (deterministic dependence on Z(t)),
■ without impulse reward,
■ but with potential reward loss at state transition.

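[Figure: sample path of Z(t) over states i, j, k and the accumulated reward B(t) with slopes ri, rj, rk, dropping when reward is lost at a state transition.]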

Markov Reward models with partial reward loss

In case of partial reward loss:

■ αi: remaining portion of reward when leaving state i,
■ the lost reward is proportional to:

◆ total accumulated reward ⇒ partial total loss,
◆ reward accumulated in the last state ⇒ partial incremental loss.

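[Figure: partial total loss. Sample path of Z(t) visiting states i, k, j, k with transitions at T1, T2, T3. B(t) grows with slopes ri, rk, rj, rk and at each transition drops to B(T1−)αi, B(T2−)αk and B(T3−)αj, respectively.]

[Figure: partial incremental loss. The same sample path; when a state is left, only the reward accumulated during that sojourn is reduced, e.g. at T3 the increment B(T3−) − B(T2) becomes αj[B(T3−) − B(T2)]. The completed sojourns therefore contribute with effective slopes riαi, rkαk, rjαj, and the current sojourn with slope rk.]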
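The difference between the two loss types can be made concrete on a fixed sample path. The sketch below is illustrative only: the function `accumulated_reward`, the path representation (a list of `(state, entry_time)` pairs) and all numeric values are assumptions made for the example.

```python
def accumulated_reward(path, t_end, r, alpha, mode="none"):
    """Reward B(t_end) along a fixed path under a given loss type.

    path  -- list of (state, entry_time), sorted by time, all entries < t_end
    mode  -- "none", "total" (partial total loss) or
             "incremental" (partial incremental loss)
    """
    total = 0.0
    for idx, (state, t_in) in enumerate(path):
        last = idx == len(path) - 1
        t_out = t_end if last else path[idx + 1][1]
        gained = r[state] * (t_out - t_in)
        if last or mode == "none":
            # No loss without a transition (the last state is not left before t_end).
            total += gained
        elif mode == "total":
            # Leaving the state keeps the portion alpha of everything accumulated.
            total = alpha[state] * (total + gained)
        elif mode == "incremental":
            # Leaving the state keeps the portion alpha of this sojourn's reward only.
            total += alpha[state] * gained
        else:
            raise ValueError("unknown mode: " + mode)
    return total

# Hypothetical path i -> k -> j -> k with transitions at T1 = 1, T2 = 3, T3 = 4.
path = [("i", 0.0), ("k", 1.0), ("j", 3.0), ("k", 4.0)]
r = {"i": 2.0, "k": 1.0, "j": 3.0}
alpha = {"i": 0.5, "k": 0.5, "j": 0.5}

no_loss = accumulated_reward(path, 5.0, r, alpha, "none")            # 8.0
total_loss = accumulated_reward(path, 5.0, r, alpha, "total")        # 3.25
incr_loss = accumulated_reward(path, 5.0, r, alpha, "incremental")   # 4.5
```

Setting every αi to 0 in `"total"` mode reproduces the total reward loss model, and setting every αi to 1 reproduces the model without reward loss.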

Time forward approach

Possible interpretation:
■ Reduced (riαi) reward accumulation up to the last state transition,
■ and total (ri) reward accumulation in the last state, without reward loss.

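[Figure: the partial incremental loss sample path over states i, k, j, k with transitions at T1, T2, T3: slopes riαi, rkαk, rjαj before the last transition T3 and slope rk after it.]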

Unfortunately, the last state transition before time T is not a stopping time.

Time reverse approach

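Behaviour of the time reverse process:

■ Inhomogeneous CTMC with initial probability $\overleftarrow{\gamma}(0) = \gamma(T)$ and generator $\overleftarrow{Q}(\tau) = \{\overleftarrow{q}_{ij}(\tau)\}$, where

$$
\overleftarrow{q}_{ij}(\tau) =
\begin{cases}
\dfrac{\gamma_j(T-\tau)}{\gamma_i(T-\tau)}\, q_{ji} & \text{if } i \neq j, \\[2ex]
-\displaystyle\sum_{k \in S,\, k \neq i} \dfrac{\gamma_k(T-\tau)}{\gamma_i(T-\tau)}\, q_{ki} & \text{if } i = j.
\end{cases}
$$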

■ Total (ri) reward accumulation in the first state,
■ and reduced (riαi) reward accumulation in all consecutive states,
■ without reward loss.
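The generator of the reverse process can be evaluated directly from the formula above once the forward transient probabilities γ(T − τ) are known. The following is a minimal sketch; the function name `reverse_generator` and the 2-state example are assumptions for illustration, and the probabilities are assumed to be strictly positive.

```python
def reverse_generator(Q, gamma):
    """Generator of the time reversed chain at reverse time tau.

    Q     -- generator of the forward CTMC (list of rows, each row sums to 0)
    gamma -- forward transient probabilities gamma(T - tau), all positive
    """
    n = len(Q)
    Qr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Off-diagonal entry: gamma_j / gamma_i * q_ji.
                Qr[i][j] = gamma[j] / gamma[i] * Q[j][i]
        # Diagonal entry: minus the sum of the off-diagonal entries of the row.
        Qr[i][i] = -sum(Qr[i][j] for j in range(n) if j != i)
    return Qr

# Hypothetical 2-state forward chain.
Q = [[-2.0, 2.0],
     [1.0, -1.0]]

# With the stationary distribution (1/3, 2/3) this chain is reversible,
# so the reverse generator coincides with Q.
Qr_stationary = reverse_generator(Q, [1.0 / 3.0, 2.0 / 3.0])

# With a non-stationary gamma the reverse generator differs from Q,
# which is why the reverse process is inhomogeneous in general.
Qr_transient = reverse_generator(Q, [0.5, 0.5])
```

Because γ(T − τ) changes with τ, this matrix has to be recomputed for every τ, which is the cost the later slides try to avoid.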

Potential model description:

duplicate the state space to describe
■ the total reward accumulation in the first state (ri),
■ and the reduced reward accumulation in all further states (riαi).

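$$
\pi^*(0) = [\gamma(T),\, 0], \qquad
\overleftarrow{Q}^*(\tau) =
\begin{pmatrix}
\overleftarrow{Q}_D(\tau) & \overleftarrow{Q}(\tau) - \overleftarrow{Q}_D(\tau) \\
0 & \overleftarrow{Q}(\tau)
\end{pmatrix}, \qquad
R^* =
\begin{pmatrix}
R & 0 \\
0 & R_\alpha
\end{pmatrix}
$$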
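The duplicated state space can be assembled mechanically from the reverse generator. The sketch below assumes that the D subscript denotes the diagonal part of the generator and that Rα is the diagonal matrix of the reduced rates riαi; both readings are interpretations of the slide, and the function name `expanded_model` and the numbers are made up for the example.

```python
def expanded_model(Qr, gamma_T, r, alpha):
    """Build (pi*(0), Q*, diagonal of R*) for the duplicated state space.

    Qr      -- reverse generator at a given tau (n x n list of rows)
    gamma_T -- forward transient probabilities at time T
    r       -- reward rates r_i
    alpha   -- remaining portions alpha_i
    """
    n = len(Qr)
    Q_star = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        # Upper left block: diagonal part only (stay in the first copy).
        Q_star[i][i] = Qr[i][i]
        for j in range(n):
            if i != j:
                # Upper right block: the first transition enters the second copy.
                Q_star[i][n + j] = Qr[i][j]
            # Lower right block: the second copy evolves with the full generator.
            Q_star[n + i][n + j] = Qr[i][j]
    pi0 = list(gamma_T) + [0.0] * n
    rates = list(r) + [ri * ai for ri, ai in zip(r, alpha)]
    return pi0, Q_star, rates

# Hypothetical 2-state example.
Qr = [[-1.0, 1.0],
      [2.0, -2.0]]
pi0, Q_star, rates = expanded_model(Qr, [0.4, 0.6], [2.0, 1.0], [0.5, 1.0])
```

The first copy of each state earns the full rate ri and is left at the first transition; the second copy earns riαi, so no explicit loss events are needed in the expanded model.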

Inhomogeneous differential equation

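Introducing

$$\overleftarrow{Y}_i(\tau, w) = \Pr\big(\overleftarrow{B}(\tau) \le w,\ \overleftarrow{Z}(\tau) = i\big)$$

we can apply the analysis approach available for inhomogeneous MRMs.

It is based on the solution of the inhomogeneous partial differential equation

$$\frac{\partial}{\partial \tau}\overleftarrow{Y}(\tau, w) + \frac{\partial}{\partial w}\overleftarrow{Y}(\tau, w)\, R = \overleftarrow{Y}(\tau, w)\, \overleftarrow{Q}(\tau),$$

where $\overleftarrow{Y}(\tau, w) = \{\overleftarrow{Y}_i(\tau, w)\}$.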

But a drawback of this approach is that it requires the computation of $\overleftarrow{Q}(\tau)$.

Homogeneous differential equation

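To overcome this drawback we introduce the conditional distribution of reward accumulated by the reverse process

$$\overleftarrow{V}_i(\tau, w) = \Pr\big(\overleftarrow{B}(\tau) \le w \mid \overleftarrow{Z}(\tau) = i\big)$$

and the row vector $\overleftarrow{V}(\tau, w) = \{\overleftarrow{V}_i(\tau, w)\}$.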

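Using this performance measure we have to solve

$$\frac{\partial}{\partial \tau}\overleftarrow{V}(\tau, w) + \frac{\partial}{\partial w}\overleftarrow{V}(\tau, w)\, R = \overleftarrow{V}(\tau, w)\, Q^T,$$

where $Q^T$ is the transpose of $Q$.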
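One way to see why the time dependent generator disappears is sketched below. The derivation is not on the slides; it assumes that R is the diagonal matrix of the rates ri, that γi(T − τ) > 0, and that the forward transient probabilities satisfy dγ(t)/dt = γ(t)Q.

```latex
\begin{align*}
\overleftarrow{Y}_i(\tau,w) &= \gamma_i(T-\tau)\,\overleftarrow{V}_i(\tau,w), \\
\frac{d}{d\tau}\gamma_i(T-\tau) &= -\sum_{k\in S}\gamma_k(T-\tau)\,q_{ki}, \\
\sum_{j\in S}\overleftarrow{Y}_j(\tau,w)\,\overleftarrow{q}_{ji}(\tau)
  &= \gamma_i(T-\tau)\sum_{j\neq i} q_{ij}\,\overleftarrow{V}_j(\tau,w)
     - \overleftarrow{V}_i(\tau,w)\sum_{k\neq i}\gamma_k(T-\tau)\,q_{ki}.
\end{align*}
% Substituting the first two lines into the i-th component of the
% equation for Y, the sums over k \neq i cancel on both sides.
% Dividing by \gamma_i(T-\tau) leaves
\begin{equation*}
\frac{\partial}{\partial\tau}\overleftarrow{V}_i(\tau,w)
+ r_i\,\frac{\partial}{\partial w}\overleftarrow{V}_i(\tau,w)
= \sum_{j\in S} q_{ij}\,\overleftarrow{V}_j(\tau,w)
= \big[\overleftarrow{V}(\tau,w)\,Q^T\big]_i .
\end{equation*}
```

The factors γj/γi in the reverse generator are exactly absorbed by the conditioning, so only the constant matrix Q remains.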

Block structure of the differential equation

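Utilizing the special block structure of the $\overleftarrow{Q}^*(\tau)$ and $R^*$ matrices (of size 2#S) we can obtain two homogeneous partial differential equations of size #S:

$$\frac{\partial}{\partial \tau}\overleftarrow{X_1}(\tau, w) + \frac{\partial}{\partial w}\overleftarrow{X_1}(\tau, w)\, R = \overleftarrow{X_1}(\tau, w)\, Q_D,$$

and

$$\frac{\partial}{\partial \tau}\overleftarrow{X_2}(\tau, w) + \frac{\partial}{\partial w}\overleftarrow{X_2}(\tau, w)\, R_\alpha = \overleftarrow{X_1}(\tau, w)\,(Q - Q_D)^T + \overleftarrow{X_2}(\tau, w)\, Q^T.$$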

Moments of accumulated reward

The analysis approach available for inhomogeneous MRMs (IMRMs) allows the moments of the accumulated reward to be described with an inhomogeneous ordinary differential equation.

As in the reward distribution case, this approach is also applicable to our model, but it requires the computation of $\overleftarrow{Q}(\tau)$.

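Using similar state dependent moment measures we obtain homogeneous ordinary differential equations

$$\frac{d}{d\tau}\overleftarrow{M_1}^{(n)}(\tau) = n\,\overleftarrow{M_1}^{(n-1)}(\tau)\, R + \overleftarrow{M_1}^{(n)}(\tau)\, Q_D,$$

and

$$\frac{d}{d\tau}\overleftarrow{M_2}^{(n)}(\tau) = n\,\overleftarrow{M_2}^{(n-1)}(\tau)\, R_\alpha + \overleftarrow{M_1}^{(n)}(\tau)\,(Q - Q_D)^T + \overleftarrow{M_2}^{(n)}(\tau)\, Q^T.$$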

Randomization based numerical method

The ordinary differential equation with constant coefficients allows us to compose a randomization based numerical method.

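$$\overleftarrow{M_1}^{(n)}(\tau) = \tau^n\, e\, R^n E_D(\tau),$$

and

$$\overleftarrow{M_2}^{(n)}(\tau) = n!\, d^n \sum_{k=0}^{\infty} e^{-\lambda\tau}\, \frac{(\lambda\tau)^k}{k!}\, D^{(n)}(k),$$

where

$$
D^{(n)}(k) =
\begin{cases}
e\,(I - A_D^{k}) & n = 0, \\[1ex]
0 & k \le n,\ n \ge 1, \\[1ex]
D^{(n-1)}(k-1)\, S_\alpha + D^{(n)}(k-1)\, A + \dbinom{k-1}{n}\, e\, S^n A_D^{k-1-n} (A - A_D) & k > n,\ n \ge 1.
\end{cases}
$$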
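The matrices A, AD, S, Sα and the constant d in the formula above are not defined on the slides, so the recursion itself is not reproduced here. The sketch below only illustrates the general idea of randomization (also called uniformization) on a simpler problem, the transient state probabilities of a homogeneous CTMC: a Poisson weighted sum of powers of a stochastic matrix, truncated once the neglected Poisson mass is below a tolerance. The function name and the example chain are assumptions.

```python
import math

def transient_by_randomization(pi0, Q, t, eps=1e-12, max_terms=100000):
    """Transient probabilities pi0 * exp(Q t) by randomization.

    Assumes lam * t is moderate, so that exp(-lam * t) does not underflow.
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))  # randomization rate
    if lam == 0.0 or t == 0.0:
        return list(pi0)
    # P = I + Q / lam is a stochastic matrix (non-negative, rows sum to 1).
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    weight = math.exp(-lam * t)  # Poisson probability of k = 0 events
    vec = list(pi0)              # pi0 * P^k, starting with k = 0
    result = [weight * v for v in vec]
    covered = weight             # Poisson mass summed so far
    k = 0
    while 1.0 - covered > eps and k < max_terms:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k
        covered += weight
        result = [res + weight * v for res, v in zip(result, vec)]
    return result

# Hypothetical 2-state chain, started in state 0.
Q = [[-2.0, 2.0],
     [1.0, -1.0]]
probs = transient_by_randomization([1.0, 0.0], Q, 0.5)
exact_p0 = 1.0 / 3.0 + 2.0 / 3.0 * math.exp(-1.5)  # closed form for this chain
```

Only non-negative numbers are added and the truncation error is bounded by the neglected Poisson mass, which is the source of the numerical stability and error control properties commonly attributed to randomization.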

Numerical Example

[Figure: Structure of the Markov chain. States N, N−1, N−2, ..., 0 and M, with reward rates rN = Nr, rN−1 = (N−1)r, rN−2 = (N−2)r, r0 = 0, rM = 0, remaining portions αN = αN−1 = αN−2 = 0.5, α0 = 1, αM = 1, and transition rates labelled with multiples of λ, with σ and with ρ.]

[Figure: Moments of the accumulated reward. 1st to 5th moments, plotted on a logarithmic vertical axis from 1e-05 to 10 over the horizontal range 0.5 to 5.]

With parameters N = 500000, λ = 0.000004, σ = 1.5, ρ = 0.1, r = 0.000002, α = 0.5.

Conclusions

The analysis of partial loss MRMs is usually rather complex.

We propose an analysis method with the following features:

■ non stopping time ⇒ time reverse approach,

■ inhomogeneous differential equation ⇒ proper performance measure,

■ partial differential equation ⇒ ordinary differential equations,

■ numerical stability, error control ⇒ randomization based analysis.

Thanks for your attention.
