Vanier College
Science Programme - Comprehensive Assessment
Stochastic Differential Equations
Lauren Ménard
201-HTH-05
May 20, 2016
1 Introduction
Differential equations in which one or more random processes are involved are
called stochastic differential equations (SDE), where the term stochastic is used
to describe the random behaviour of such processes. It is not surprising that
these equations are widely used in various fields. For instance, their applications
range from the modelling of the fluctuations of stock prices to the diffusion of
particles in a given physical medium. They arise wherever an underlying random process can be found.
To evaluate these phenomena, we must be able to solve the corresponding stochastic differential equations. However, to do so, we must first understand the properties of stochastic processes at a certain level. That is why, in this introduction to SDEs, we will explore the dynamics of these types of processes in the discrete-time case, e.g. Markov chains, as well as in the continuous-time case, e.g. classes of processes such as Markov processes and diffusion processes, with emphasis on the example of the Wiener process. The latter is particularly important because the main integral used to solve SDEs, the Itô integral, crucially uses properties of the Wiener process.
To provide further context, the Wiener process is a mathematical interpre-
tation, devised by mathematician Norbert Wiener, of the physical phenomenon
that describes the erratic movement of particles suspended in a fluid, whether
it be liquid or gas, resulting from their collision with atoms or molecules of the
medium. This phenomenon was discovered by botanist Robert Brown in 1827,
when observing the movement of grains of pollen suspended in water at the
microscopic level, and is appropriately called Brownian Motion.
Consequently, stochastic calculus and the related mathematical fields owe credit to the discoveries made about the relationships between different random processes and physical phenomena. With all that said, the theory as presented in this introduction will hopefully help provide further clarification and understanding of stochastic differential equations.
2 Markov Chains
2.1 Random Walks
To get a better understanding of the significance of Markov chains and stochastic
matrices, it is useful to look at the simplest cases of random walks. Consider a
particle moving along an axis with discrete single-unit steps and discrete time,
starting at position 0 at time 0, x(0) = 0.
The initial distribution is described by
$$p(i) = \begin{cases} 1, & i = 0 \\ 0, & i \neq 0, \end{cases}$$
where the probability that the particle be at position $i = 0$ is $p(i) = 1$, and the probability that the particle be anywhere else, $i \neq 0$, is $p(i) = 0$.
Next, suppose that this particle only moves one unit to the right or left in one unit of time with associated probabilities $p$ and $q = 1 - p$, respectively. Finally, let $p = q = 1/2$. This particular type of random walk is called a standard random walk.
Aside: Let $E$ be the set of all states of the particle, which in this example is the set of all integers, $E = \mathbb{Z}$. Notice that, in this particular case, if $E_n$ corresponds to the possible positions (states) of the particle after $n$ single-unit steps, then $E_n$ is contained between $-n$ and $n$: $E_n = \{-n, -n+2, \ldots, n-2, n\}$.
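The standard random walk is straightforward to simulate. The following Python sketch (the step count and seed are illustrative choices, not from the text) generates one trajectory and checks that after $n$ steps the position lies in $E_n$:

```python
import random

def standard_random_walk(n_steps, seed=0):
    """Simulate a standard random walk: from x(0) = 0, step +1 or -1
    with equal probability 1/2 at each unit of time."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += rng.choice([-1, 1])
        path.append(position)
    return path

path = standard_random_walk(1000)
# After n steps the state lies in E_n = {-n, -n+2, ..., n-2, n}:
n = len(path) - 1
assert abs(path[-1]) <= n and (path[-1] - n) % 2 == 0
```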
Figure 1: Random walks with smaller and smaller time steps.
The transition probability distribution for the standard random walk defined above can be described by the following,
$$P(i,j) = \begin{cases} p, & j = i+1 \\ q, & j = i-1 \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
where $i$ is the current state of the particle and $j$ is the next immediate state of the particle.
Modifications can be added to this standard random walk. For example, if we allow the particle to remain in the same position for any unit of time, then the probability that describes this situation is denoted $r$. Therefore, $p + q + r = 1$, as the probabilities must sum to 1, and the transition probability distribution, i.e. the transition from position $i$ to $j$, is described by the following,
$$P(i,j) = \begin{cases} p, & j = i+1 \\ q, & j = i-1 \\ r, & j = i \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
Another modification that can be applied to random walks is the case where there are two limiting barriers. If the barriers are found at points $A$ and $B$, respectively, and $0$ is contained between these points, $A < 0 < B$, then the set of all states $E$ has $B - A + 1$ states. The transition probability distribution is defined by the following,
$$P(i,j) = \begin{cases} s_A, & i = A,\ j = A \\ 1 - s_A, & i = A,\ j = A+1 \\ s_B, & i = B,\ j = B \\ 1 - s_B, & i = B,\ j = B-1 \\ p, & i \neq A, B,\ j = i+1 \\ q, & i \neq A, B,\ j = i-1 \\ r, & i \neq A, B,\ j = i \\ 0, & \text{otherwise} \end{cases} \tag{3}$$
where sA and sB are the probabilities that the particle stay at points A or B,
respectively, in the next immediate step. Note that if sA = sB = 1, then there
is full absorption by the barriers and if sA = sB = 0, then there is full repulsion
by the barriers.
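Distribution (3) is easy to realize concretely. The sketch below (the values $A = -2$, $B = 2$, $p = q = 0.4$, $r = 0.2$, $s_A = s_B = 1$, i.e. fully absorbing barriers, are illustrative choices) builds the $(B - A + 1) \times (B - A + 1)$ transition matrix and verifies that every row sums to 1:

```python
def barrier_walk_matrix(A, B, p, q, r, sA, sB):
    """Transition matrix of a random walk on {A, ..., B} with barriers,
    following distribution (3); row = current state i, column = next state j."""
    states = list(range(A, B + 1))
    n = len(states)
    P = [[0.0] * n for _ in range(n)]
    for row, i in enumerate(states):
        if i == A:
            P[row][0] = sA            # stay at barrier A
            P[row][1] = 1 - sA        # step to A + 1
        elif i == B:
            P[row][n - 1] = sB        # stay at barrier B
            P[row][n - 2] = 1 - sB    # step to B - 1
        else:
            P[row][row + 1] = p       # j = i + 1
            P[row][row - 1] = q       # j = i - 1
            P[row][row] = r           # j = i
    return P

P = barrier_walk_matrix(-2, 2, 0.4, 0.4, 0.2, 1.0, 1.0)
# Every row of a stochastic matrix sums to 1:
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)
```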
Furthermore, one can write (1), (2) and (3) in matrix form, with rows indexed by the current position $i$ and columns by the next position $j$. For the unbounded walks (1) and (2), with states ordered $\ldots, -1, 0, 1, \ldots$,
$$P = \begin{pmatrix} \ddots & \ddots & & & \\ \ddots & 0 & p & & \\ & q & 0 & p & \\ & & q & 0 & \ddots \\ & & & \ddots & \ddots \end{pmatrix}, \qquad P = \begin{pmatrix} \ddots & \ddots & & & \\ \ddots & r & p & & \\ & q & r & p & \\ & & q & r & \ddots \\ & & & \ddots & \ddots \end{pmatrix},$$
and for the bounded walk (3), with states ordered $A, A+1, \ldots, B-1, B$,
$$P = \begin{pmatrix} s_A & 1-s_A & & & \\ q & r & p & & \\ & \ddots & \ddots & \ddots & \\ & & q & r & p \\ & & & 1-s_B & s_B \end{pmatrix},$$
respectively. These transition matrices are stochastic matrices, whose rows are known as probability or random vectors: vectors with nonnegative entries that sum to 1.
For the case where the particle is not bounded between two points $A$ and $B$, such as in cases (1) and (2), the stochastic matrix is an infinite matrix $P_\infty$. It is for this reason that we typically use matrices to evaluate discrete-time processes with a finite set of states.
2.2 Markov Chains
Following the theory for stochastic matrices, a Markov chain can be defined as a sequence of random variables $X_0, X_1, X_2, \ldots$ taking values in a set of states $E$, associated to a certain stochastic matrix $P$, such that the following conditions hold:

(1) $P(X_0 = i) = p(i)$ for each state $i \in E$;

(2) $P(X_0 \in E) = \sum_{i \in E} p(i) = 1$;

(3) $P(X_{n+1} = i_{n+1} \mid X_0 = i_0, \ldots, X_n = i_n) = P(X_{n+1} = i_{n+1} \mid X_n = i_n) = P(i_n, i_{n+1})$;

(4) $P(X_{n+1} \in E \mid X_n = i) = \sum_{j \in E} P(X_{n+1} = j \mid X_n = i) = \sum_{j \in E} P(i,j) = 1$ for every $i \in E$.
The above conditions can be interpreted as follows. If $E$ is the set of all states of some system, then $X_n$ indicates the active state at time $n$. We assume that the probability that $i$ is the active state at time zero is $p(i)$ for any initial state $i \in E$, and that the system at time zero is in $E$ with probability 1. Furthermore, according to Property (3), the transition probability from some state $i$ to another state $j$ depends only on the state $i$ and not on any previously visited states. Lastly, Property (4) says that it is impossible for the system to leave the set $E$, as the transition from $i$ always lands in $E$.
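Property (3) translates directly into a simulation rule: the next state is drawn from the row of the current state only. A minimal sketch (the matrix, run length, and seed are illustrative choices):

```python
import random

def simulate_chain(P, start, n_steps, seed=42):
    """Simulate a Markov chain: at each step, sample the next state j
    from row P[i] of the current state i (the Markov property)."""
    rng = random.Random(seed)
    states = [start]
    for _ in range(n_steps):
        i = states[-1]
        states.append(rng.choices(range(len(P)), weights=P[i])[0])
    return states

P = [[0.5, 0.3, 0.2],
     [0.2, 0.8, 0.0],
     [0.3, 0.3, 0.4]]
chain = simulate_chain(P, start=0, n_steps=10000)
# The chain never leaves the state set E = {0, 1, 2} (Property (4)):
assert set(chain) <= {0, 1, 2}
```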
Now, consider a finite set $C$ of states where, for every pair of states $i$ and $j$, the probability of moving from $i$ to $j$ in some finite number of steps is positive, and likewise from $j$ to $i$. In this case, states $i$ and $j$ are said to communicate. In other words, every pair of states in $C$ communicates, and it is possible to get to any state from any state in a certain number of steps. A Markov chain is irreducible if its set of states forms such a communicating class $C$.
Another important concept is that of regular stochastic matrices. A stochastic matrix $P$ for which some power $P^n$ has strictly positive entries is called regular. Note that regular matrices are only discussed in the case of finite Markov chains. Furthermore, the Markov chain described by this type of matrix is necessarily irreducible. However, the converse is not always true: an irreducible Markov chain is not necessarily regular.
For example, consider the transition matrix
$$P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
It is irreducible, as it is possible to move to any state from any state. However, there exists no power of $P$ where all entries are strictly positive, as
$$P^{2n} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad P^{2n+1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Therefore, it is not regular.
Moreover, an irreducible Markov chain with set $C$ of states can be aperiodic
or periodic. If any return to state $i \in C$ can occur at irregular times (the greatest common divisor of the numbers of steps in which a return to $i$ is possible is 1), then the state is said to be aperiodic. For an irreducible Markov chain, a single aperiodic state implies that all states are aperiodic. However, if returns to $i$ can occur only at multiples of some $k > 1$, then the chain is said to be periodic. In the previous example, $k = 2$, so the chain is periodic. Consequently, that Markov chain is irreducible, but not aperiodic. This leads to the following proposition: finite Markov chains are regular if and only if they are irreducible and aperiodic.
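Regularity can be tested directly from its definition by computing powers of $P$. The sketch below (pure-Python matrix powers; the test matrices are illustrative) checks the periodic example above against a matrix that is already strictly positive:

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_regular(P, max_power=50):
    """Return True if some power P^n (n <= max_power) has strictly
    positive entries -- the definition of a regular stochastic matrix."""
    Q = P
    for _ in range(max_power):
        if all(entry > 0 for row in Q for entry in row):
            return True
        Q = mat_mul(Q, P)
    return False

periodic = [[0.0, 1.0], [1.0, 0.0]]   # irreducible but periodic (k = 2)
regular = [[0.5, 0.5], [0.2, 0.8]]    # strictly positive already at n = 1
assert not is_regular(periodic)
assert is_regular(regular)
```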
These definitions are useful when discussing the long-term behaviour of a system. For an aperiodic irreducible Markov chain, high powers of the associated regular stochastic matrix $P$ approach a limiting value in the sense that $\lim_{n\to\infty} P^n = \Pi$, where $\Pi$ is a matrix with all rows equal to the same probability vector $\pi$. Thus, $\pi$ is called the steady-state vector of the Markov chain. Note that a Markov chain must be irreducible and aperiodic for the stochastic matrix to be regular and for there to exist a steady state. A Markov chain that respects the conditions described above as $n \to \infty$ is called ergodic. With the addition of this new definition, the following claim arises:
A finite Markov chain is aperiodic and irreducible $\iff$ regular $\iff$ ergodic.
There are other ways, however, to find the steady-state vector $\pi$. Consider the following. If we have a matrix $A$, then a row vector $\xi$ that satisfies the equation $\xi A = \lambda \xi$ is called a (left) eigenvector with eigenvalue $\lambda$. To find such eigenvectors, one can use $A^T \xi^T = \lambda \xi^T$. In the case where $A$ is a regular stochastic matrix, now called $P$, we can apply the Perron-Frobenius theorem:

(1) there is always one eigenvalue $\lambda_1 = 1$ of $P$;

(2) all other eigenvalues satisfy $|\lambda_i| < 1$ for $i > 1$;

(3) $\lambda_1 = 1$ has algebraic and geometric multiplicity 1.
Consequently, the $\lambda = 1$ eigenvector $\xi_1$ is the steady-state vector $\pi$, which is shown by the following argument. Write $\pi_n = c_1 \lambda_1^n \xi_1 + c_2 \lambda_2^n \xi_2 + \ldots + c_k \lambda_k^n \xi_k$, where, by (1) and (2), taking $n \to \infty$ collapses all terms to 0 except the $\xi_1$ term. Therefore, we get $\pi = \lim_{n\to\infty} \pi_n = c_1 \xi_1$. This particular probability vector, $\pi$, is the unique solution of the equation $\xi_1 P = \xi_1$ normalized so that its entries sum to 1.
Example: Let
$$P = \begin{pmatrix} 0.5 & 0.3 & 0.2 \\ 0.2 & 0.8 & 0 \\ 0.3 & 0.3 & 0.4 \end{pmatrix}$$
be a regular stochastic matrix (each row sums to 1). The state of this system is described by the Markov chain $x_k = x_0 P^k$ for $k = 0, 1, 2, \ldots$
The steady-state vector can be found by taking high powers of $P$. For example, within the precision of a computer algebra system,
$$P^{100} = \begin{pmatrix} 0.3 & 0.6 & 0.1 \\ 0.3 & 0.6 & 0.1 \\ 0.3 & 0.6 & 0.1 \end{pmatrix}$$
and $\pi = (\,0.3\ \ 0.6\ \ 0.1\,)$.
However, as previously discussed, we can also find the steady-state vector by solving $(P^T - I)\xi_1^T = 0$:
$$\left(\begin{array}{ccc|c} -0.5 & 0.2 & 0.3 & 0 \\ 0.3 & -0.2 & 0.3 & 0 \\ 0.2 & 0 & -0.6 & 0 \end{array}\right) \rightarrow \left(\begin{array}{ccc|c} 1 & 0 & -3 & 0 \\ 0 & 1 & -6 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right),$$
so
$$\xi_1^T = \begin{pmatrix} 3 \\ 6 \\ 1 \end{pmatrix} s,$$
and by choosing $s = 1/10$, $\pi = (\,0.3\ \ 0.6\ \ 0.1\,)$.
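Both routes to $\pi$ can be checked in a few lines of Python (pure-Python matrix arithmetic; the tolerances are illustrative): take a high power of the example matrix, and verify that $\pi$ satisfies $\pi P = \pi$.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def vec_mat(v, P):
    """Row vector times matrix: (vP)_j = sum_i v_i P(i, j)."""
    n = len(P)
    return [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.5, 0.3, 0.2],
     [0.2, 0.8, 0.0],
     [0.3, 0.3, 0.4]]

# Method 1: high powers of P approach a matrix with identical rows pi.
Q = P
for _ in range(99):          # Q = P^100
    Q = mat_mul(Q, P)
for row in Q:
    assert all(abs(row[j] - [0.3, 0.6, 0.1][j]) < 1e-9 for j in range(3))

# Method 2: pi is the normalized left eigenvector with eigenvalue 1.
pi = [0.3, 0.6, 0.1]
assert all(abs(vec_mat(pi, P)[j] - pi[j]) < 1e-12 for j in range(3))
```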
Now, let us discuss non-regular Markov chains. If the associated matrix is non-regular, denoted $Q$, then $Q^n$ does not converge and $\lim_{n\to\infty} Q^n(i,j)$ does not exist.
Example: Let
$$Q = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
be a periodic irreducible stochastic matrix. Then
$$Q^2 = Q^4 = Q^6 = \cdots = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad Q^1 = Q^3 = Q^5 = \cdots = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
In this case, $Q$ is irreducible, but not regular because of its periodicity.
Stochastic matrices are particularly effective when analyzing the long-term behaviour, i.e. the steady-state probability vector, of a discrete-time Markov chain. However, there exist many other cases where time is continuous, called continuous-time stochastic processes. Such processes will be discussed in the next section.
3 Continuous-Time Stochastic Processes
As the section title suggests, we will consider stochastic processes in continuous
time, where specific classes, notably the Markov processes and the Diffusion
processes, will be discussed. In addition, the Wiener process, an example of said
classes, will be explored.
3.1 Markov Processes
A Markov process, as its name suggests, is an extension of the previously explored Markov chain to continuous time. It can be viewed as a stochastic process that satisfies the Markov property, which can be described as follows. Let $X = \{X(t), t \in \mathbb{R}_+\}$ be a continuous-time stochastic process, i.e. a family of random variables $X(t)$, where $t \ge 0$. Then,
$$P\big[X(t_{n+1}) \in B \mid X(t_n) = x_n, \ldots, X(t_1) = x_1\big] = P\big[X(t_{n+1}) \in B \mid X(t_n) = x_n\big]$$
for all Borel subsets $B$ of $\mathbb{R}$, all time instants $0 < t_1 < \ldots < t_n < t_{n+1}$, and all states $x_1, \ldots, x_n$ in $\mathbb{R}$. In other words, a stochastic process has the Markov property if the conditional probability distribution of future states of the process depends only upon the present state and not on the sequence of events that precede it.
Moreover, Borel sets in the real line are the class of events obtained from intervals of the real line by relative complements, countable unions, and countable intersections. In the case where the set of all possible outcomes is the real line or an interval of it, meaning that the sample space is not finite or countable, it would be unrealistic to assign probabilities to all possible subsets of that interval. Therefore, the use of Borel sets is necessary.
The transition probabilities of the Markov process $X(t)$ can be written as follows,
$$P(s, x; t, B) = P\big[X(t) \in B \mid X(s) = x\big] \quad \text{for } 0 \le s < t,$$
and,
$$P(s, x; t, B) = \int_B p(s, x; t, y)\, dy,$$
for the continuous case, where the density $p(s, x; t, \cdot)$ is called the transition density.
Furthermore, a Markov process is said to be homogeneous if all its transition probabilities depend only on the time difference $t - s$ between two instants. This means that
$$P\big[X(s+t) = j \mid X(s) = i\big]$$
is independent of $s$. When this holds, setting $s = 0$, we obtain,
$$P\big[X(s+t) = j \mid X(s) = i\big] = P\big[X(t) = j \mid X(0) = i\big], \quad \forall s, t \ge 0.$$
An important class of Markov process called the diffusion process will be
explored in the following section.
3.2 Diffusion Processes
The diffusion process, which will be discussed in the one-dimensional case, is
a special case of a Markov process with continuous sample paths. Note that the
terms sample path and trajectory can be used interchangeably.
A Markov process X(t) with transition densities p(s, x; t, y) is called a dif-
fusion process if the following conditions are satisfied.
(1) For all $x$ and all $\varepsilon > 0$,
$$\lim_{t \to s^+} \frac{1}{t-s} \int_{|y-x| > \varepsilon} p(s, x; t, y)\, dy = 0;$$

(2) There exists a function $a(s, x)$ such that for all $x$ and all $\varepsilon > 0$,
$$\lim_{t \to s^+} \frac{1}{t-s} \int_{|y-x| \le \varepsilon} (y - x)\, p(s, x; t, y)\, dy = a(s, x);$$

(3) There exists a function $b(s, x)$ such that for all $x$ and all $\varepsilon > 0$,
$$\lim_{t \to s^+} \frac{1}{t-s} \int_{|y-x| \le \varepsilon} (y - x)^2\, p(s, x; t, y)\, dy = b^2(s, x).$$
The first condition implies, as stated above, that the process has continuous sample paths. Furthermore, the second condition states that there exists a function $a(s, x)$, called the drift coefficient of the diffusion, where the drift is the instantaneous rate of change of the mean of the process given that $X(s) = x$. In addition, for a diffusion process, there exists a function $b(s, x)$, called the diffusion coefficient, for which $b^2(s, x)$ is the instantaneous rate of change of the squared fluctuations of the process given that $X(s) = x$.
3.2.1 Kolmogorov Equations
The Kolmogorov equations are two partial differential equations (PDEs), introduced by Andrei Kolmogorov in 1931, that arise in the case of continuous-time and continuous-state Markov processes. These equations, namely the backward and forward Kolmogorov equations, will be explored in the case of Markov diffusion processes. Note that the forward equation is also known as the Fokker-Planck equation.
The Forward Kolmogorov Equation At time $s$, we are given information about the state of a system. This information is described by a probability density $p(s, x)$, which imposes an initial condition on the partial differential equation. The equation is then solved forward in time, from $s$ to $t$ for any $t > s$, hence the term forward. In other words, the solution of the PDE at the later time $t$ is found by integrating forward in time, from $s$ to $t$.
Suppose $X(t)$ is a diffusion process with transition density $p(s, x; t, y)$, which is a continuous function of its arguments. Furthermore, suppose that both $a(t, y)$ and $b(t, y)$ are continuous in $t$ and $y$. Then, for fixed $(s, x)$, the density $p = p(s, x; t, y)$ is a solution of
$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial y}\big[a(t, y)\, p\big] - \frac{1}{2} \frac{\partial^2}{\partial y^2}\big[b^2(t, y)\, p\big] = 0, \tag{4}$$
with initial condition $p(s, x; s, y) = \delta(x - y)$.
The proof that diffusion processes obey the forward Kolmogorov equation is
similar to the proof for the backward Kolmogorov equation, which will be given
below.
The Backward Kolmogorov Equation Conversely, at time $s$, we are interested in whether, at a future time point $t$, the system will be in a given subset of states. This information is carried by a function $u(s, x)$ with prescribed values at the final time $t$, which imposes a terminal condition on the partial differential equation. The equation is solved backward in time, from $t$ to $s$ for any $t > s$, hence the term backward. In other words, the solution of the PDE is found by integrating backward in time, from $t$ to $s$.
Theorem: Let $f(x)$ be a continuous bounded function on $\mathbb{R}$, and let $u(s, x)$ be the conditional expectation
$$u(s, x) = E\big[f(X_t) \mid X_s = x\big] = \int f(y)\, p(s, x; t, y)\, dy,$$
with $t$ fixed. Furthermore, suppose that the drift and diffusion coefficients $a(s, x)$ and $b(s, x)$ are continuous in both $s$ and $x$. Then $u(s, x)$ is a solution of the partial differential equation
$$\frac{\partial u}{\partial s} + a(s, x) \frac{\partial u}{\partial x} + \frac{1}{2} b^2(s, x) \frac{\partial^2 u}{\partial x^2} = 0, \tag{5}$$
for $s \in [0, t]$, with the terminal condition $u(t, x) = f(x)$.
Proof. First observe that the continuity assumption on the diffusion process, together with the fact that the function $f(x)$ is bounded, implies that
$$u(s,x) = \int_{\mathbb{R}} f(y)\, p(s,x;t,y)\, dy = \int_{|y-x|\le\varepsilon} f(y)\, p(s,x;t,y)\, dy + \int_{|y-x|>\varepsilon} f(y)\, p(s,x;t,y)\, dy$$
$$\le \int_{|y-x|\le\varepsilon} f(y)\, p(s,x;t,y)\, dy + \|f\|_\infty \int_{|y-x|>\varepsilon} p(s,x;t,y)\, dy = \int_{|y-x|\le\varepsilon} f(y)\, p(s,x;t,y)\, dy + o(t-s).$$
Here the little-$o$ notation signifies a term which goes to zero faster than $t-s$ as $t \to s$, and $\|f\|_\infty$ is the maximum absolute value of the bounded function $f$. We add and subtract the terminal value $f(x)$ and repeat the previous calculation to obtain
$$u(s,x) = f(x) + \int_{|y-x|\le\varepsilon} \big(f(y)-f(x)\big)\, p(s,x;t,y)\, dy + o(t-s).$$
Using the Chapman-Kolmogorov equation we obtain, for $s < r < t$,
$$u(s,x) = \int_{\mathbb{R}} f(z)\, p(s,x;t,z)\, dz = \int_{\mathbb{R}}\int_{\mathbb{R}} p(s,x;r,y)\, p(r,y;t,z)\, dz\, dy = \int_{\mathbb{R}} u(r,y)\, p(s,x;r,y)\, dy.$$
From Taylor's theorem we have
$$u(r,z) - u(r,x) = \frac{\partial u(r,x)}{\partial x}(z-x) + \frac{1}{2}\frac{\partial^2 u(r,x)}{\partial x^2}(z-x)^2(1+\alpha_\varepsilon), \quad |z-x|\le\varepsilon,$$
where $\lim_{\varepsilon\to 0}\alpha_\varepsilon = 0$. Combining the above equations we calculate
$$\frac{u(s,x)-u(s+h,x)}{h} = \frac{1}{h}\left(\int_{\mathbb{R}} p(s,x;s+h,y)\, u(s+h,y)\, dy - u(s+h,x)\right)$$
$$= \frac{1}{h}\int_{|x-y|\le\varepsilon} \big(u(s+h,y)-u(s+h,x)\big)\, p(s,x;s+h,y)\, dy + o(1).$$
Substituting the Taylor expansion and applying the defining limits of the drift and diffusion coefficients, letting $h \to 0$ yields
$$-\frac{\partial u}{\partial s} = a(s,x)\frac{\partial u}{\partial x} + \frac{1}{2} b^2(s,x)\frac{\partial^2 u}{\partial x^2},$$
which is precisely equation (5).
Figure 2: Three dimensional Brownian motion
3.3 The Wiener Process
The standard Wiener process $W = \{W(t), t \ge 0\}$ is a family of Gaussian random variables $W(t)$ that depends continuously on $t \ge 0$ and satisfies the following:

(1) $W(0) = 0$;

(2) $E(W(t)) = 0$;

(3) $\operatorname{Var}(W(t) - W(s)) = t - s$;

for all $0 \le s \le t$. We can gather from these conditions that, as time increases, the variance also increases while the mean remains 0 if the process starts at 0. The Wiener process is sample-path continuous, meaning that it is continuous along any choice of trajectory. This is not surprising, as the same can be said more generally for diffusion processes. However, with probability 1, it is nowhere differentiable at any time $t \ge 0$. This will be proved in the mean-square sense.
Proof. By definition, $W(t)$ is Gaussian with variance $t$. Consider the difference quotient for the derivative,
$$\frac{W(t+h) - W(t)}{(t+h) - t},$$
whose mean square satisfies
$$\lim_{h \to 0} E\left[\left(\frac{W(t+h) - W(t)}{(t+h) - t}\right)^2\right] = \lim_{h \to 0} \frac{1}{h} = \infty,$$
since the ratio has mean square $1/h$, which goes to infinity as $h$ approaches 0. Therefore, as no such limit exists, the trajectories of the Wiener process are nowhere differentiable. In other words, as the curve of $W(t)$ is observed on an increasingly smaller scale, it becomes more and more erratic, resulting in a completely random quantity.
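The mean-square blow-up of the difference quotient can be observed in simulation; the sketch below (sample counts, step sizes, and seed are illustrative choices) uses the fact that the increment $W(t+h) - W(t)$ is $N(0, h)$ to estimate $E[((W(t+h) - W(t))/h)^2] = 1/h$ for shrinking $h$:

```python
import random

def mean_square_quotient(h, n_samples=100_000, seed=1):
    """Estimate E[((W(t+h) - W(t)) / h)^2]; the increment is N(0, h)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        increment = rng.gauss(0.0, h ** 0.5)
        total += (increment / h) ** 2
    return total / n_samples

# The mean square of the quotient is 1/h, so it grows without bound:
for h in (0.1, 0.01, 0.001):
    est = mean_square_quotient(h)
    assert abs(est - 1.0 / h) < 0.05 / h   # within 5% of 1/h
```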
The transition density of the Wiener process is
$$p(s, x; t, y) = \frac{1}{\sqrt{2\pi(t-s)}} \exp\left(-\frac{(y-x)^2}{2(t-s)}\right). \tag{6}$$
Note that this transition density is expressed as a Gaussian distribution.
Furthermore, by evaluating the partial derivatives of (6), we find that they satisfy the partial differential equations
$$\frac{\partial p}{\partial t} - \frac{1}{2} \frac{\partial^2 p}{\partial y^2} = 0, \quad (s, x) \text{ fixed}, \tag{7}$$
and
$$\frac{\partial p}{\partial s} + \frac{1}{2} \frac{\partial^2 p}{\partial x^2} = 0, \quad (t, y) \text{ fixed}. \tag{8}$$
Proof. To verify (7), let $(s, x)$ be fixed at $(0, 0)$, so that equation (6) becomes
$$p(0, 0; t, y) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{y^2}{2t}\right).$$
Taking the first partial derivative with respect to $t$, and the first and second partial derivatives with respect to $y$, we obtain
$$\frac{\partial}{\partial t}\left[\frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{y^2}{2t}\right)\right] = -\frac{1}{2\sqrt{2\pi}\, t^{3/2}} \exp\left(-\frac{y^2}{2t}\right) + \frac{1}{2\sqrt{2\pi t}}\, \frac{y^2}{t^2} \exp\left(-\frac{y^2}{2t}\right), \tag{9}$$
$$\frac{\partial}{\partial y}\left[\frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{y^2}{2t}\right)\right] = -\frac{y}{\sqrt{2\pi}\, t^{3/2}} \exp\left(-\frac{y^2}{2t}\right),$$
$$\frac{\partial^2}{\partial y^2}\left[\frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{y^2}{2t}\right)\right] = \frac{\partial}{\partial y}\left[-\frac{y}{\sqrt{2\pi}\, t^{3/2}} \exp\left(-\frac{y^2}{2t}\right)\right] = \frac{1}{\sqrt{2\pi t}}\, \frac{y^2}{t^2} \exp\left(-\frac{y^2}{2t}\right) - \frac{1}{\sqrt{2\pi}\, t^{3/2}} \exp\left(-\frac{y^2}{2t}\right). \tag{10}$$
Comparing (9) and (10), we observe that
$$\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial y^2},$$
and, consequently,
$$\frac{\partial p}{\partial t} - \frac{1}{2} \frac{\partial^2 p}{\partial y^2} = 0.$$
Similarly, by selecting specific conditions for $(t, y)$, it can be shown that
$$\frac{\partial p}{\partial s} + \frac{1}{2} \frac{\partial^2 p}{\partial x^2} = 0.$$
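The same identity can be checked numerically: central finite differences of the density $p(0,0;t,y)$ should satisfy $\partial p/\partial t \approx \tfrac{1}{2}\, \partial^2 p/\partial y^2$ up to discretization error. A minimal sketch (the grid point and step sizes are illustrative choices):

```python
import math

def p(t, y):
    """Transition density of the Wiener process started at (s, x) = (0, 0)."""
    return math.exp(-y * y / (2 * t)) / math.sqrt(2 * math.pi * t)

t, y = 1.0, 0.5
dt, dy = 1e-5, 1e-4

# Central finite differences for dp/dt and d^2p/dy^2:
dp_dt = (p(t + dt, y) - p(t - dt, y)) / (2 * dt)
d2p_dy2 = (p(t, y + dy) - 2 * p(t, y) + p(t, y - dy)) / dy ** 2

# The heat equation dp/dt = (1/2) d^2p/dy^2 holds at the grid point:
assert abs(dp_dt - 0.5 * d2p_dy2) < 1e-5
```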
To get a better understanding of the Wiener process, let us clearly relate it to the diffusion process. The standard Wiener process is a diffusion process with drift coefficient $a(s, x) = 0$ and diffusion coefficient $b(s, x) = 1$. Indeed,
$$a(s, x) = \lim_{t \to s} E\left(\frac{X_t - X_s}{t - s} \,\Big|\, X_s = x\right) = 0,$$
and
$$b^2(s, x) = \lim_{t \to s} E\left(\frac{(X_t - X_s)^2}{t - s} \,\Big|\, X_s = x\right) = \lim_{t \to s} \frac{t - s}{t - s} = 1.$$
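These limits suggest estimating $a$ and $b^2$ from data as the per-unit-time mean and mean square of small increments. The sketch below (step size, sample count, and seed are illustrative choices) does this for simulated Wiener increments and recovers $a \approx 0$ and $b^2 \approx 1$:

```python
import random

def estimate_coefficients(dt=0.01, n=100_000, seed=7):
    """Estimate drift a = E(dX)/dt and squared diffusion b^2 = E(dX^2)/dt
    from simulated Wiener increments dX ~ N(0, dt)."""
    rng = random.Random(seed)
    increments = [rng.gauss(0.0, dt ** 0.5) for _ in range(n)]
    a_hat = sum(increments) / (n * dt)
    b2_hat = sum(dx * dx for dx in increments) / (n * dt)
    return a_hat, b2_hat

a_hat, b2_hat = estimate_coefficients()
assert abs(a_hat) < 0.15          # drift coefficient a(s, x) = 0
assert abs(b2_hat - 1.0) < 0.05   # diffusion coefficient b^2(s, x) = 1
```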
Consequently, substituting these values into the forward equation (4) and the backward equation (5), we obtain
$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial y}\big[(0)\, p\big] - \frac{1}{2}(1) \frac{\partial^2 p}{\partial y^2} = 0, \quad \text{i.e.} \quad \frac{\partial p}{\partial t} - \frac{1}{2} \frac{\partial^2 p}{\partial y^2} = 0, \tag{11}$$
and
$$\frac{\partial u}{\partial s} + (0) \frac{\partial u}{\partial x} + \frac{1}{2}(1) \frac{\partial^2 u}{\partial x^2} = 0, \quad \text{i.e.} \quad \frac{\partial u}{\partial s} + \frac{1}{2} \frac{\partial^2 u}{\partial x^2} = 0,$$
respectively, which are precisely the previously derived equations (7) and (8). Note that equation (11) is called the heat equation and is used to model the diffusion of heat. This makes the Wiener process a very important stochastic process in many different fields. However, for it to be useful, mathematical meaning must be assigned to its infinitesimal changes. Therefore, stochastic calculus is the logical continuation of the theory, where Itô integrals will be introduced.
4 Stochastic Calculus
Stochastic calculus is necessary to effectively study stochastic processes because,
as previously mentioned, properties of said processes prevent us from using
regular calculus techniques. This section will explore the various obstacles and results that arose in solving this problem.
Recall that the Riemann integral $\int_a^b f(t)\, dt$ is defined for a continuous function $f$ on a bounded interval $[a, b]$. The interval is partitioned into $n$ subintervals with $a = t_0 < t_1 < \ldots < t_n = b$, and the Riemann integral is the limit of the corresponding sums as $n \to \infty$ and the width of each subinterval approaches 0,
$$\int_a^b f(t)\, dt = \lim_{(t_j - t_{j-1}) \to 0} \sum_{j=1}^{n} f(t_{j-1})\, (t_j - t_{j-1}).$$
There exists, however, a more general integral called the Riemann-Stieltjes integral. Suppose $f(t)$ and $g(t)$ are real-valued bounded functions defined on an interval $[a, b]$. The simple $dt$ integration can be generalized to increments $dg(t)$ by using $g(t_j) - g(t_{j-1})$ instead of $t_j - t_{j-1}$. Thus, we obtain the Riemann-Stieltjes integral,
$$\int_a^b f(t)\, dg(t) = \lim_{(t_j - t_{j-1}) \to 0} \sum_{j=1}^{n} f(t_{j-1}) \big(g(t_j) - g(t_{j-1})\big).$$
Note that for such integrals to exist, the variation of $g$ must be bounded and finite over the interval $[a, b]$.
These types of integrals arise when solving the equations of various processes. For example, consider the case where a small amount of liquid flows with macroscopic velocity $a(t, u(t))$, where $u(t)$ describes its position at time $t$. Furthermore, suppose that a microscopic particle is suspended in this liquid, displaying evidence of Brownian motion. Consequently, the change in the position of the particle satisfies the equation
$$du(t) = a\big(t, u(t)\big)\, dt + b\big(t, u(t)\big)\, dW_t. \tag{12}$$
However, the second term of equation (12) does not make sense on its own, because the trajectories of the Wiener process are nowhere differentiable, as previously discussed. If we represent this equation in integral form, we obtain the following,
$$u(t) - u(0) = \int_0^t a\big(s, u(s)\big)\, ds + \int_0^t b\big(s, u(s)\big)\, dW_s. \tag{13}$$
Observe that the second term of equation (13) has the form of a Riemann-Stieltjes integral. Therefore, for that interpretation to apply, the integrator must be of bounded variation. However, the Wiener process is nowhere differentiable and its sample paths are not of bounded variation. Consequently, $\int_a^b f(t)\, dW_t$ cannot be interpreted as a Riemann-Stieltjes integral. This is precisely why Itô integrals are necessary for stochastic calculus. These integrals will be discussed in the next section.
4.1 Itô Stochastic Integrals
We want to make sense of the expression
$$\int_{t_0}^{T} f(s, \omega)\, dW_s(\omega),$$
which we will call a stochastic integral, defined for a random function $f : [0, T] \times \Omega \to \mathbb{R}$.
For a fixed sample path $\omega$, the Riemann-Stieltjes approach would express the integral as the limit of the sums
$$\sum_{j=1}^{n} f\big(\tau_j^{(n)}, \omega\big)\Big(W_{t_{j+1}^{(n)}}(\omega) - W_{t_j^{(n)}}(\omega)\Big), \tag{14}$$
for all possible choices of evaluation points $\tau_j^{(n)} \in \big[t_j^{(n)}, t_{j+1}^{(n)}\big]$ and partitions $0 = t_1^{(n)} < t_2^{(n)} < \ldots < t_{n+1}^{(n)} = T$ of $[0, T]$, as
$$\max_{1 \le j \le n}\big\{t_{j+1}^{(n)} - t_j^{(n)}\big\} \to 0 \quad \text{as } n \to \infty.$$
However, this limit does not exist, as the sample paths of the Wiener process are not of bounded variation. Hence, instead of considering such pathwise convergence, we consider $L^2$-convergence, where the limit of the sums (14) may exist but differ depending on the choice of evaluation points $\tau_j^{(n)} \in \big[t_j^{(n)}, t_{j+1}^{(n)}\big]$.
For example, consider the case where $f(t, \omega) = W_t(\omega)$ and
$$\tau_j^{(n)} = (1 - \lambda)\, t_j^{(n)} + \lambda\, t_{j+1}^{(n)} = (j + \lambda)\delta, \quad \lambda \in [0, 1],$$
is the fixed choice of evaluation point, where the parameter $\lambda$ determines the location of the evaluation point within each subinterval. Furthermore, let $\delta = t_{j+1} - t_j = T/n$ be the constant step size. The terms of (14) can, therefore, be rearranged as follows,
$$W_{\tau_j}\big(W_{t_{j+1}} - W_{t_j}\big) = -\frac{1}{2}\big(W_{t_{j+1}} - W_{\tau_j}\big)^2 + \frac{1}{2}\big(W_{\tau_j} - W_{t_j}\big)^2 + \frac{1}{2}\big(W_{t_{j+1}}^2 - W_{t_j}^2\big).$$
By taking the sums, we obtain,
$$\sum_{j=1}^{n} W_{\tau_j}\big(W_{t_{j+1}} - W_{t_j}\big) = -\frac{1}{2}\sum_{j=1}^{n}\big(W_{t_{j+1}} - W_{\tau_j}\big)^2 + \frac{1}{2}\sum_{j=1}^{n}\big(W_{\tau_j} - W_{t_j}\big)^2 + \frac{1}{2}\sum_{j=1}^{n}\big(W_{t_{j+1}}^2 - W_{t_j}^2\big).$$
The third term of the right-hand side telescopes,
$$\frac{1}{2}\sum_{j=1}^{n}\big(W_{t_{j+1}}^2 - W_{t_j}^2\big) = \frac{1}{2}\big(W_T^2 - W_0^2\big) = \frac{1}{2} W_T^2.$$
Moreover, recall that the Wiener process has variance
$$\operatorname{Var}\big(W(t) - W(s)\big) = E\big(W(t) - W(s)\big)^2 = t - s.$$
Consequently,
$$E\left(\sum_{j=1}^{n} W_{\tau_j}\big(W_{t_{j+1}} - W_{t_j}\big)\right) = -\frac{1}{2}\sum_{j=1}^{n}\big(t_{j+1} - \tau_j\big) + \frac{1}{2}\sum_{j=1}^{n}\big(\tau_j - t_j\big) + \frac{1}{2}E\big(W_T^2\big).$$
Observe that the following equalities arise from the chosen evaluation point,
$$t_{j+1} - \tau_j = t_{j+1} - (1-\lambda)\, t_j - \lambda\, t_{j+1} = (1-\lambda)\big[t_{j+1} - t_j\big] = (1-\lambda)\,\delta,$$
and,
$$\tau_j - t_j = (1-\lambda)\, t_j + \lambda\, t_{j+1} - t_j = \lambda\big[t_{j+1} - t_j\big] = \lambda\,\delta.$$
Therefore, using $n\delta = T$ and $E(W_T^2) = T$,
$$E\left(\sum_{j=1}^{n} W_{\tau_j}\big(W_{t_{j+1}} - W_{t_j}\big)\right) = -\frac{1}{2}(1-\lambda)\,\delta n + \frac{1}{2}\lambda\,\delta n + \frac{1}{2}T = -\Big(\frac{1}{2} - \lambda\Big)T + \frac{1}{2}T = \lambda T.$$
Note that, in the expected-value sense, the integral becomes
$$E\left(\int_0^T W_t\, dW_t\right) = \lambda T. \tag{15}$$
Thus we have a convergent sum in the $L^2$-sense, but the result depends on the location of the evaluation point. Furthermore, by taking $\lambda = 0$, thus making the evaluation point the left endpoint of each subinterval, we obtain
$$E\left(\int_0^T W_t\, dW_t\right) = 0,$$
which is a useful result, as the integrand $W_{\tau_j}$ then has $E(W_t) = 0$ and is independent of the increments $W_{t_{j+1}} - W_{t_j}$. Moreover,
$$E\left(\left|\int_0^T W_t\, dW_t\right|^2\right) = \int_0^T E\big(|W_t|^2\big)\, dt = \int_0^T t\, dt = \frac{1}{2}T^2.$$
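The $\lambda$-dependence in (15) can be seen in a short Monte Carlo experiment. The sketch below (step count, path count, and seed are illustrative choices) evaluates the sums with left endpoints ($\lambda = 0$, the Itô choice) and right endpoints ($\lambda = 1$) on simulated paths, expecting averages near $0$ and $T$ respectively:

```python
import random

def riemann_sum_endpoint(rng, n_steps, T, use_right):
    """One sample of sum_j W_{tau_j} (W_{t_{j+1}} - W_{t_j}) on [0, T],
    evaluating W at the right endpoint if use_right, else the left."""
    dt = T / n_steps
    w = 0.0
    total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)
        w_next = w + dw
        total += (w_next if use_right else w) * dw
        w = w_next
    return total

rng = random.Random(3)
T, n_steps, n_paths = 1.0, 200, 2000
left = sum(riemann_sum_endpoint(rng, n_steps, T, False)
           for _ in range(n_paths)) / n_paths
right = sum(riemann_sum_endpoint(rng, n_steps, T, True)
            for _ in range(n_paths)) / n_paths

# Expected value is lambda * T: 0 for the Ito (left-endpoint) choice, T for lambda = 1.
assert abs(left - 0.0) < 0.1
assert abs(right - T) < 0.1
```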
The main point of the construction of the Itô integral is that, for the stochastic integral $\int_{t_0}^{T} f(s, \omega)\, dW_s(\omega)$, the dependence of $f(s, \omega)$ on $W(s, \omega)$ is nonanticipative, meaning that the random function $f(s, \omega)$ can depend, at most, on the present and past values of the Wiener process $W(s, \omega)$ and is independent of its future.
To clarify, the integrand $f(s, \omega)$ is nonanticipative if the random variable $f(t, \cdot)$ is $\mathcal{A}_t$-measurable for $t \in [0, T]$, where $\{\mathcal{A}_t, t \ge 0\}$ is the increasing family of $\sigma$-algebras generated by $W_t$, $t \ge 0$.
Furthermore, the relevant class $L^2_T$ consists of functions $f : [0, T] \times \Omega \to \mathbb{R}$ satisfying

(1) $f$ is jointly $\beta \times \mathcal{A}$-measurable, where $\beta$ is the Borel $\sigma$-algebra on $[0, T]$ (note that the collection of Borel sets is the smallest $\sigma$-algebra containing the open sets);

(2) $\int_0^T E\big(f(t, \cdot)^2\big)\, dt < \infty$;

(3) $f(t, \cdot)$ is $\mathcal{A}_t$-measurable for each $t \in [0, T]$.

For such functions, the Itô integral $I(f) = \int_0^T f(s, \omega)\, dW_s(\omega)$ is defined as the mean-square limit of the sums (14) with the evaluation points taken at the left endpoints, $\tau_j^{(n)} = t_j^{(n)}$. Among its basic properties is linearity:

(4) $I(\alpha f + \beta g) = \alpha I(f) + \beta I(g)$ for $f, g \in L^2_T$ and all $\alpha, \beta \in \mathbb{R}$.
5 Conclusion
Itô integrals now provide us with the necessary theory to interpret the stochastic differential equation
$$dX_t = a(t, X_t)\, dt + b(t, X_t)\, dW_t$$
as a stochastic integral equation
$$X_t = X_{t_0} + \int_{t_0}^{t} a(s, X_s)\, ds + \int_{t_0}^{t} b(s, X_s)\, dW_s,$$
whose solutions are diffusion processes obeying the Kolmogorov equations previously discussed.
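In practice, such stochastic integral equations are approximated numerically. The simplest scheme, the Euler-Maruyama method, advances $X$ by $a(t, X)\Delta t + b(t, X)\Delta W$ at each step; the sketch below is a minimal implementation (the Ornstein-Uhlenbeck-type coefficients $a = -x$, $b = 1$ and all numerical parameters are illustrative choices, not taken from the text):

```python
import random

def euler_maruyama(a, b, x0, T, n_steps, seed=11):
    """Approximate a solution path of dX_t = a(t, X_t) dt + b(t, X_t) dW_t
    via the Euler-Maruyama scheme."""
    rng = random.Random(seed)
    dt = T / n_steps
    t, x = 0.0, x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)   # Wiener increment ~ N(0, dt)
        x = x + a(t, x) * dt + b(t, x) * dw
        t += dt
        path.append(x)
    return path

# Illustrative example: mean-reverting SDE dX = -X dt + dW.
path = euler_maruyama(lambda t, x: -x, lambda t, x: 1.0,
                      x0=2.0, T=5.0, n_steps=5000)
assert len(path) == 5001 and path[0] == 2.0
```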
The theory presented in this introduction to stochastic differential equations allows us to consider the solutions of such equations. In fact, as can be seen, these solutions are stochastic processes. The representation of SDEs at the macroscopic level is a diffusion, say of some substance suspended in a physical medium, that is modelled by the Kolmogorov equations mentioned above. The century-long discussion revolving around the relationship between the microscopic random behaviour of particles and the nature of a diffusion (among many eminent mathematicians and scientists, including Einstein) has been settled by the Itô integral, devised by K. Itô, which gives meaning to stochastic differential equations. The main result of this discussion is the relationship between stochastic diffusion processes and the Kolmogorov partial differential equations.