
Introduction to Brownian motion

October 31, 2013

Lecture notes for the course given at Tsinghua University in May 2013. Please send an e-mail to [email protected] for any error/typo found.

Historical introduction

From Wikipedia: Brownian motion is the random motion of particles suspended in a fluid (a liquid or a gas) resulting from their bombardment by the fast-moving atoms or molecules in the gas or liquid. In 1827, the botanist Robert Brown, looking through a microscope at particles found in pollen grains in water, noted that the particles moved through the water but was not able to determine the mechanisms that caused this motion. Atoms and molecules had long been theorized as the constituents of matter, and many decades later, Albert Einstein published a paper in 1905 that explained in precise detail how the motion that Brown had observed was a result of the pollen being moved by individual water molecules. This explanation of Brownian motion served as definitive confirmation that atoms and molecules actually exist, and was further verified experimentally by Jean Perrin in 1908. Perrin was awarded the Nobel Prize in Physics in 1926 “for his work on the discontinuous structure of matter” (Einstein had received the award five years earlier “for his services to theoretical physics”, with specific citation of different research). The direction of the force of atomic bombardment is constantly changing, and at different times the particle is hit more on one side than another, leading to the seemingly random nature of the motion. This transport phenomenon is named after Robert Brown. The first mathematically rigorous construction of Brownian motion is due to Wiener in 1923 (that is why Brownian motion is sometimes called the Wiener process).


Figure 1: Robert Brown and Brownian motions in 1 and 2 dimensions.


A discrete model

A possible model of the above motion of a particle in d dimensions is as follows. We consider that the particle moves as a random walk $S_n = Y_1 + \ldots + Y_n$ with the $Y_i$ i.i.d. uniform over $\{-1,1\}^d$ (at each step, the d coordinates are updated by independent fair ±1's). We then look at the motion of the particle seen from far away. Since $n^{-1}S_n \to 0$, the central limit theorem tells us that the good renormalization in space is the square root of the time, and we thus consider the random function

$$S^*_n(t) = \frac{1}{\sqrt{n}}\, S_{[nt]},$$

where [x] is the largest integer less than or equal to x. For every $0 = t_0 < t_1 < t_2 < \ldots < t_p$, the vectors $(S^*_n(t_i) - S^*_n(t_{i-1}))$, $i \in \{1, \ldots, p\}$, are independent and converge in distribution, by the central limit theorem, towards centered Gaussian vectors of covariance $(t_i - t_{i-1})\,\mathrm{Id}$. Obviously this remains true (up to a multiplicative factor) if we change the distribution of the steps, as long as they remain centered and have finite variance. This leads us to set up a definition for the (candidate) continuous limiting process:

Definition 1. A d-dimensional Brownian motion (starting from 0) is a family of $\mathbb{R}^d$-valued random variables $(B_t : t \geq 0)$, living on a probability space (Ω, F, P), such that

• B0 = 0 almost surely,

• for every $0 = t_0 < t_1 < t_2 < \ldots < t_p$, the variables $B_{t_i} - B_{t_{i-1}}$ for $i \in \{1, \ldots, p\}$ are independent and
$$B_{t_i} - B_{t_{i-1}} \sim \mathcal{N}\big(0,\ (t_i - t_{i-1})\,\mathrm{Id}\big),$$

• the function t ↦ B_t is almost surely continuous.

The last point of the definition could seem trivial from a physics point of view but turns out to be mathematically essential. We will see that the existence of Brownian motion is not trivial. We will come back later to the fact that Brownian motion is the universal limit of rescaled random walks. Before constructing Brownian motion, let us quickly dive into the realm of stochastic processes.
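To make the rescaling above concrete, here is a minimal simulation sketch (Python with NumPy; the function names are ours and not part of the original notes) of the random function $S^*_n$ for the fair ±1 step distribution in dimension 1:

    import numpy as np

    def rescaled_walk(n, rng):
        # S*_n evaluated at the points k/n: S_k / sqrt(n) for k = 0, ..., n.
        steps = rng.choice([-1, 1], size=n)
        walk = np.concatenate(([0], np.cumsum(steps)))
        return walk / np.sqrt(n)

    rng = np.random.default_rng(0)
    path = rescaled_walk(100_000, rng)
    # By the central limit theorem, S*_n(1) is approximately N(0, 1):
    print(path[-1])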

Remark. Brownian motion thus has stationary and independent increments, meaning that the $B_{t_i} - B_{t_{i-1}}$ for $i \in \{1, \ldots, p\}$ are independent and that $B_{t+s} - B_t = B_{t'+s} - B_{t'} = B_s$ in distribution for every t, t′ ≥ 0. Among the class of stochastic processes satisfying these assumptions (the Lévy processes), Brownian motion is the only continuous one. Do you know another (non-continuous) process $(X_t : t \geq 0)$ which has stationary and independent increments?

1 Generalities on stochastic processes

Let us call a stochastic process a family $(X_t : t \in \mathbb{R}_+)$ of random variables with values in R endowed with its Borel σ-field B (the vector-valued case is similar), defined on a probability space (Ω, F, P). Hence, for every ω ∈ Ω we can speak about the “trajectory” t ↦ X_t(ω) and interpret it as a “random function”. Where does this object live? How can we characterize it? What are its properties?


1.1 First approach

We could consider the random process $(X_t)_{t\geq 0}$ as taking values in $\mathbb{R}^{\mathbb{R}_+}$. As a product space, $\mathbb{R}^{\mathbb{R}_+}$ is endowed with the product σ-field $\mathcal{B}^{\otimes\mathbb{R}_+}$ (where B is the Borel σ-field on R) generated by the coordinate mappings

$$(X_t)_{t\geq 0} \in \mathbb{R}^{\mathbb{R}_+} \mapsto X_s \in (\mathbb{R}, \mathcal{B}).$$

Since ω ↦ X_s(ω) ∈ (R, B) is measurable for every s ≥ 0, it follows that ω ↦ (X_t(ω), t ≥ 0) ∈ $(\mathbb{R}^{\mathbb{R}_+}, \mathcal{B}^{\otimes\mathbb{R}_+})$ is measurable as well. The law of such a random process is thus a probability measure on $(\mathbb{R}^{\mathbb{R}_+}, \mathcal{B}^{\otimes\mathbb{R}_+})$. The finite-dimensional distributions (or marginals) of $X = (X_t : t \geq 0)$ are the laws of all the finite vectors $(X_{t_1}, X_{t_2}, \ldots, X_{t_p})$ for every $t_1 \leq t_2 \leq \ldots \leq t_p \in \mathbb{R}_+$.

Proposition 1. The finite-dimensional distributions of X characterize its law over $(\mathbb{R}^{\mathbb{R}_+}, \mathcal{B}^{\otimes\mathbb{R}_+})$.

Proof. Denote by ν the law of X over $(\mathbb{R}^{\mathbb{R}_+}, \mathcal{B}^{\otimes\mathbb{R}_+})$. The finite-dimensional distributions of X fix the ν-probability of events of the form $\{f \in \mathbb{R}^{\mathbb{R}_+} : f(t_1) \in A_1, \ldots, f(t_p) \in A_p\}$ for $A_i \in \mathcal{B}$. Such events are called “cylinders”. It is easy to see that the collection of cylinder events is closed under finite intersections and generates $\mathcal{B}^{\otimes\mathbb{R}_+}$ as a σ-field. Hence the monotone class theorem entails that ν is characterized by its values on cylinder events.

Exercise 1. Show that items 1 and 2 of Definition 1 characterize the finite-dimensional distributions of Brownian motion.

Exercise 2 (One-dimensional distributions do not suffice). Find two stochastic processes $(X_t)_{t\geq 0}$ and $(Y_t)_{t\geq 0}$ which have the same one-dimensional distributions, i.e. such that $X_t = Y_t$ in distribution for every t ≥ 0, but such that X and Y have different laws. * Same question with two-dimensional distributions.

This approach enjoys an abstract criterion to construct stochastic processes:

Theorem 1 (Kolmogorov extension theorem). Suppose that we are given a family of finite-dimensional laws $\pi_{t_1,\ldots,t_n}$ for every $t_1, \ldots, t_n \in \mathbb{R}_+$, which are coherent in the sense that if $(X_{t_i})_{i=1,\ldots,n+1}$ has law $\pi_{t_1,\ldots,t_n,t_{n+1}}$ then $(X_{t_i})_{i=1,\ldots,n}$ has law $\pi_{t_1,\ldots,t_n}$. Then there exists a probability measure π on $(\mathbb{R}^{\mathbb{R}_+}, \mathcal{B}^{\otimes\mathbb{R}_+})$ such that, if $(X_t)_{t\geq 0}$ has law π, then $(X_{t_1}, \ldots, X_{t_n})$ has law $\pi_{t_1,\ldots,t_n}$ for every $t_1, \ldots, t_n$.

Proof. Omitted.

Exercise 3. Check that the finite-dimensional distributions of Brownian motion are coherent, and deduce that there exists a stochastic process verifying conditions 1 and 2 of Definition 1.

The problem with the σ-field $\mathcal{B}^{\otimes\mathbb{R}_+}$ is that it is “trajectorially” very poor.

Exercise 4. *

1. Show that the event $\{t \mapsto X_t \text{ is continuous}\}$ is not measurable for the σ-field $\mathcal{B}^{\otimes\mathbb{R}_+}$. Hint: construct two stochastic processes X, X′ which have the same finite-dimensional distributions and such that X is continuous almost surely but X′ is not.

2. Show that the σ-field $\mathcal{B}^{\otimes\mathbb{R}_+}$ consists of all sets that can be written $A \times \mathbb{R}^{\mathbb{R}_+\setminus D}$, where D is a countable set of indices of $\mathbb{R}_+$ and $A \in \mathcal{B}^{\otimes D}$.


1.2 Second approach

We now suppose that our stochastic process X is continuous and thus lives in $C(\mathbb{R}_+,\mathbb{R})$. We thus have to endow this set with a σ-field. A first try would be to use the norm ‖·‖∞, but that would yield a non-separable space, which causes many problems, see Exercise 6. We instead use the topology of uniform convergence over every compact set, which is metrizable by

$$d(f,g) = \sum_{i=1}^{+\infty} 2^{-i}\Big(1 \wedge \sup_{[0,i]} |f-g|\Big).$$

Exercise 5. Check that d is a distance on C(R+,R) which makes it separable and complete.

The Borel σ-field associated to this topology is denoted by $\mathcal{B}_c$. We also denote by C the trace of the σ-field $\mathcal{B}^{\otimes\mathbb{R}_+}$ on $C(\mathbb{R}_+,\mathbb{R})$.

Proposition 2. We have Bc = C.

Proof. The inclusion $\mathcal{C} \subset \mathcal{B}_c$ follows from the fact that the coordinate mappings f ↦ f(t) are continuous for d, hence measurable. For the other inclusion, since $(C(\mathbb{R}_+,\mathbb{R}), d)$ is separable, it suffices to show that balls for d are measurable. But by continuity of the functions considered, we can give an equivalent definition of d using rational values:
$$\sum_{i=1}^{+\infty} 2^{-i}\Big(1 \wedge \sup_{s\in[0,i]} |f(s)-g(s)|\Big) = \sum_{i=1}^{+\infty} 2^{-i}\Big(1 \wedge \sup_{s\in[0,i]\cap\mathbb{Q}} |f(s)-g(s)|\Big).$$
It easily follows that balls for d are C-measurable.

Exercise 6 (Nightmares¹ ***). Let ℓ∞ be the space of bounded real sequences. We endow this space with two σ-fields: the (trace of the) product σ-field $\mathcal{B}^{\otimes\mathbb{N}}$ and the Borel σ-field $\mathcal{B}_{\|\cdot\|_\infty}$ coming from the norm ‖·‖∞ over ℓ∞, where $\|u\|_\infty = \sup_{i\geq 0} |u_i|$.

1. Show that $\mathcal{B}^{\otimes\mathbb{N}} \subset \mathcal{B}_{\|\cdot\|_\infty}$.

Consider a sequence $A_n$ of intervals of [0,1] such that $|A_n| \to 0$ when n → ∞ and such that for every t ∈ [0,1], the set $B_t = \{n \geq 0 : t \in A_n\}$ is infinite. Finally, define the mapping
$$\xi : t \in [0,1] \mapsto (\mathbf{1}_{t\in A_1}, \mathbf{1}_{t\in A_2}, \mathbf{1}_{t\in A_3}, \ldots).$$

2. Show that ξ is measurable from $([0,1], \mathcal{B}_{[0,1]})$ into $(\ell^\infty, \mathcal{B}^{\otimes\mathbb{N}})$.

3. Let A ⊂ [0,1]. Show that ξ(A) is closed in (ℓ∞, ‖·‖∞), and that $\xi^{-1}(\xi(A)) = A$.

4. Assume the axiom of choice, and recall that there are subsets of [0,1] which are not Lebesgue measurable. Conclude that ξ is not measurable from $([0,1], \mathcal{B}_{[0,1]})$ into $(\ell^\infty, \mathcal{B}_{\|\cdot\|_\infty})$, and that $\mathcal{B}^{\otimes\mathbb{N}} \neq \mathcal{B}_{\|\cdot\|_\infty}$.

¹Taken from “Probability Distributions on Banach Spaces” by Vakhania et al., pp. 23–24.


Intermezzo: Gaussian process

A random vector $(X_1, \ldots, X_p) \in \mathbb{R}^p$ is a Gaussian vector if any linear combination of the $X_i$ is Gaussian. If $N_1, \ldots, N_k$ are independent Gaussian random variables, then any vector whose entries are (fixed) linear combinations of $N_1, \ldots, N_k$ is a Gaussian vector (actually any Gaussian vector can be represented as such). Let us give an example of a vector with Gaussian entries which is not a Gaussian vector. Consider a standard normal variable N and let ε be an independent fair ±1 Bernoulli coin flip. It is easy to see that the vector (N, εN) has both entries distributed as N(0,1), but this vector is not Gaussian because N + εN is not Gaussian (it equals 0 with probability 1/2).
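A quick numerical illustration of this counterexample (a sketch, not part of the original notes): both coordinates of (N, εN) look Gaussian marginally, yet their sum has an atom at 0 with probability 1/2 and therefore cannot be Gaussian.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10**6
    N = rng.standard_normal(n)
    eps = rng.choice([-1, 1], size=n)   # independent fair sign
    X, Y = N, eps * N                   # both marginals are N(0, 1)
    S = X + Y                           # equals 2N or 0, each with probability 1/2
    print(X.var(), Y.var())             # both close to 1
    print((S == 0.0).mean())            # close to 0.5: an atom at 0, so S is not Gaussian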

The marvelous property is that the law of a Gaussian vector is completely characterized by the means $E[X_i]$ and the covariance matrix $(E[X_i X_j])_{1\leq i,j\leq n}$. Indeed, for any $\lambda_1, \ldots, \lambda_n \in \mathbb{R}$, the characteristic function $E[\exp(i\sum \lambda_i X_i)]$ is the Fourier transform of the Gaussian variable $\sum \lambda_i X_i$ and is thus equal to $\exp(im - \sigma^2/2)$, where m is its mean and σ² its variance. Both m and σ² are computable in terms of the $\lambda_i$, of the covariance matrix and of the means of the $X_i$. In particular, if $(X_1, \ldots, X_n, Y_1, \ldots, Y_m)$ is a Gaussian vector such that $\mathrm{Cov}(X_i, Y_j) = 0$ for every $i \in \{1, \ldots, n\}$ and every $j \in \{1, \ldots, m\}$, then $(X_1, \ldots, X_n)$ is independent of $(Y_1, \ldots, Y_m)$.

A stochastic process $(X_t)_{t\geq 0}$ is a Gaussian process if every finite-dimensional vector $(X_{t_1}, \ldots, X_{t_p})$ is a Gaussian vector. By the above remark, the finite-dimensional distributions of X are completely characterized by the means of the $X_t$ and the covariance function $(s,t) \mapsto E[X_s X_t]$.

Example. Brownian motion is a Gaussian process. Indeed, for any $t_1, \ldots, t_p$ the vector $(B_{t_1}, \ldots, B_{t_p})$ is a linear combination of the independent Gaussian variables $(B_{t_1}, B_{t_2}-B_{t_1}, \ldots, B_{t_p}-B_{t_{p-1}})$ and is thus a Gaussian vector. Furthermore, the process is centered, meaning that $E[B_t] = 0$ for all t ≥ 0, and the covariance function is

$$E[B_s B_t] = \min(s,t) \quad \text{(exercise)}.$$

The last display is thus an alternative characterization (much simpler to manipulate) of points 1 and 2 of Definition 1.

Exercise 7. Let $(X_n)$ be a sequence of Gaussian vectors.

1. Suppose that $b = \lim E[X_n]$ (the mean vector) and $\Sigma = \lim \mathrm{Cov}(X_n)$ (the covariance matrix) exist. Show that $X_n \to X$ in distribution, where X is Gaussian with mean b and covariance matrix Σ.

2. * Suppose that $X_n \to X$ almost surely. Show that $X_n \to X$ in L² and deduce that $b = \lim E[X_n]$ (the mean vector) and $\Sigma = \lim \mathrm{Cov}(X_n)$ (the covariance matrix) exist.

We finish this section by stating a bound on the tail of the normal distribution which will be used at several places in this course.

Lemma 1 (Tail bounds for the standard normal). Let N be a standard normal N(0,1) random variable. Then we have
$$P(N > x) \sim \frac{1}{x}\,\frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \quad \text{as } x \to \infty.$$

Proof. On the one hand we have
$$\int_x^{\infty} \frac{ds}{\sqrt{2\pi}}\, e^{-s^2/2} \;\leq\; \int_x^{\infty} \frac{ds}{\sqrt{2\pi}}\, \frac{s}{x}\, e^{-s^2/2} \;=\; \frac{1}{x}\,\frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$
On the other hand, after performing an integration by parts,
$$\int_x^{\infty} \frac{ds}{\sqrt{2\pi}}\, e^{-s^2/2} \;=\; \Big[-\frac{1}{\sqrt{2\pi}}\,\frac{1}{s}\, e^{-s^2/2}\Big]_x^{\infty} \;-\; \int_x^{\infty} \frac{1}{s^2}\,\frac{ds}{\sqrt{2\pi}}\, e^{-s^2/2}.$$

The second term is easily seen to be negligible compared to the first one as x→∞.
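As a numerical sanity check (a sketch using only the standard library; not part of the original notes), one can compare the exact tail $P(N > x) = \mathrm{erfc}(x/\sqrt{2})/2$ with the equivalent given by Lemma 1; the ratio tends to 1 as x grows.

    import math

    def normal_tail(x):
        # Exact tail of the standard normal: P(N > x) = erfc(x / sqrt(2)) / 2.
        return 0.5 * math.erfc(x / math.sqrt(2))

    def tail_equivalent(x):
        # The equivalent from Lemma 1: (1/x) (1/sqrt(2 pi)) exp(-x^2 / 2).
        return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi))

    for x in (1.0, 2.0, 4.0, 8.0):
        print(x, normal_tail(x) / tail_equivalent(x))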

2 Construction of Brownian motion

Theorem 2. Brownian motion exists.²

Proof. The following proof is due to Paul Lévy. We first construct the process for t ∈ [0,1] and in dimension 1. Let
$$D_n = \{k 2^{-n} : 0 \leq k \leq 2^n\}$$
be the set of dyadic points of order n, and $D = \cup D_n$ the set of all dyadic points of [0,1]. Let (Ω, F, P) be a probability space carrying a collection $(Z_i : i \in D)$ of independent identically distributed standard N(0,1) variables. We put $B_0 = 0$ and $B_1 = Z_1$, and then construct B inductively on the set $D_n$ so that B has the right finite-dimensional marginals on $D_n$. This is done for $D_0$. Suppose that we succeeded in doing it at step n−1. Then for every $t \in D_n \setminus D_{n-1}$ we update the interpolated value of $B_t$ by adding $2^{-(n+1)/2} Z_t$; formally, we put
$$B_t = \frac{B_{t+2^{-n}} + B_{t-2^{-n}}}{2} + \frac{Z_t}{2^{(n+1)/2}}.$$

Note that after step n the values of $B_t$, $t \in D_n$, are fixed and are independent of $\{Z_t : t \notin D_n\}$. It is easy to see that after step n ≥ 0, the variables $B_{(k+1)2^{-n}} - B_{k2^{-n}}$ for $k \in \{0, \ldots, 2^n - 1\}$ are independent and distributed as $\mathcal{N}(0, 2^{-n})$. Indeed, it is easy to see by induction that the covariance of the Gaussian vector $(B_{i2^{-n}})_{0\leq i\leq 2^n}$ is given by
$$E[B_t B_s] = s \wedge t, \quad \text{for } s, t \in D_n.$$
This implies that B has the right finite-dimensional distributions over $D_n$. We denote by $B^{(n)}$ the function which interpolates linearly $\{B_t : t \in D_n\}$.

Lemma 2. The random functions $B^{(n)}$ almost surely converge towards a random continuous function B (with a slight abuse of notation) for the uniform norm ‖·‖ over [0,1].

Proof. Fix n ≥ 1 and let us upper bound $\|B^{(n)} - B^{(n-1)}\|$. To go from the function $B^{(n-1)}$ to $B^{(n)}$, the slope of $B^{(n-1)}$ over each dyadic interval of length $2^{-(n-1)}$ is updated in the middle by adding an independent $2^{-(n+1)/2}\,\mathcal{N}(0,1)$. Hence we have
$$\|B^{(n)} - B^{(n-1)}\| \leq 2^{-(n+1)/2} \max_{t \in D_n \setminus D_{n-1}} |Z_t|.$$
Fix $\alpha \in (1/\sqrt{2}, 1)$. We thus have
$$P\big(\|B^{(n)} - B^{(n-1)}\| \geq \alpha^n\big) = P\Big(2^{-(n+1)/2} \max_{t \in D_n \setminus D_{n-1}} |Z_t| \geq \alpha^n\Big) \leq 2^n\, P\big(|Z| \geq \alpha^n 2^{(n+1)/2}\big) \leq C\, 2^n \exp\big(-(2\alpha^2)^n\big)$$

²That is, we can construct a probability space which supports a Brownian motion.


for some C > 0, using Lemma 1. Since 2α² > 1, the last probabilities are summable for n = 1, 2, …, hence by the Borel–Cantelli lemma we have almost surely $\|B^{(n)} - B^{(n-1)}\| \leq \alpha^n$ eventually. Consequently, the functions $B^{(n)}$ almost surely converge (at exponential speed!) for the ‖·‖ norm towards a random continuous function B.

It remains to show that B has the desired finite-dimensional marginals. Fix $t_1 \leq \ldots \leq t_p \in [0,1]$ and let $t_1^{(n)} \leq \ldots \leq t_p^{(n)}$ be approximations of the latter in $D_n$. We already know that the variables $B_{t_{i+1}^{(n)}} - B_{t_i^{(n)}}$ are independent and distributed as $\mathcal{N}(0,\ t_{i+1}^{(n)} - t_i^{(n)})$. By continuity of B these variables converge towards $B_{t_{i+1}} - B_{t_i}$. By Exercise 7 the limit has the desired distribution.

To construct (one-dimensional) Brownian motion on the whole of $\mathbb{R}_+$, we just concatenate independent copies of Brownian motion over the time interval [0,1]. Brownian motion in dimension d is constructed as $(B_1(t), \ldots, B_d(t))$ where $B_1, \ldots, B_d$ are independent standard Brownian motions in one dimension. It is easy to see that these processes have the desired properties.
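The dyadic construction above translates almost line by line into code. The following is a minimal sketch (Python with NumPy; all names are ours, not part of the original notes) of Lévy's construction of B on [0,1] down to a given dyadic level:

    import numpy as np

    def levy_construction(levels, rng):
        # Midpoint-displacement construction of B on the dyadic points of [0, 1]:
        # returns B sampled at k 2^{-levels}, k = 0, ..., 2^levels.
        B = np.array([0.0, rng.standard_normal()])   # B_0 = 0, B_1 = Z_1
        for n in range(1, levels + 1):
            # interpolated values at D_n \ D_{n-1}, plus the Gaussian correction
            mid = (B[:-1] + B[1:]) / 2 + rng.standard_normal(B.size - 1) / 2 ** ((n + 1) / 2)
            new = np.empty(2 * B.size - 1)
            new[0::2] = B     # values on D_{n-1} are fixed
            new[1::2] = mid   # new values on D_n \ D_{n-1}
            B = new
        return B

    rng = np.random.default_rng(0)
    B = levy_construction(12, rng)
    print(B.size, B[len(B) // 2])   # 2^12 + 1 points; B_{1/2} is distributed as N(0, 1/2)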

Canonical representation. Let B be a Brownian motion defined over (Ω, F, P). The push-forward of P by the measurable mapping $\omega \mapsto (B_t(\omega))_{t\geq 0} \in (C(\mathbb{R}_+,\mathbb{R}^d), \mathcal{B}_c)$ is called the Wiener measure and is denoted by $\mathbf{P}_0$. For every t ≥ 0 and $x \in \mathbb{R}^d$ we put
$$p_t(x) = \frac{1}{(2\pi t)^{d/2}} \exp\Big(-\frac{\|x\|^2}{2t}\Big).$$

The Wiener measure is thus characterized on $(C(\mathbb{R}_+,\mathbb{R}^d), \mathcal{B}_c)$ by
$$\mathbf{P}_0\big(\omega \in C(\mathbb{R}_+,\mathbb{R}^d) : \omega(t_1) \in A_1, \ldots, \omega(t_p) \in A_p\big) = \int_{A_1\times A_2 \cdots \times A_p} dy_1 \cdots dy_p\; p_{t_1}(y_1)\, p_{t_2-t_1}(y_2-y_1) \cdots p_{t_p-t_{p-1}}(y_p-y_{p-1}),$$
for every Borel sets $A_i \in \mathcal{B}$. In the remainder of this course, we will often use the canonical representation of BM (Brownian motion) by setting $(\Omega, \mathcal{F}, P) = (C(\mathbb{R}_+,\mathbb{R}^d), \mathcal{B}_c, \mathbf{P}_0)$ and $B_t(\omega) = \omega(t)$. For every $x \in \mathbb{R}^d$, we denote by $\mathbf{P}_x$ the law of the shifted process x + B: under $\mathbf{P}_x$, the Brownian motion starts at x.

Exercise 8. * Let f : [0,1] → R be a continuous function such that f(0) = 0. Prove that for every ε > 0 we have
$$P\Big(\sup_{t\in[0,1]} |B_t - f(t)| < \varepsilon\Big) > 0.$$

2.1 Simple properties

Proposition 3. Let B be a standard Brownian motion in dimension d ≥ 1.

• Isometry. If φ is a linear isometry of $\mathbb{R}^d$, then $(\varphi(B_t))_{t\geq 0}$ is a Brownian motion.

• Translation. For every s ≥ 0, the process $B^{(s)}_t = B_{s+t} - B_s$ is a Brownian motion.

• Time reversal. The process $(B_1 - B_{1-t})_{t\in[0,1]}$ is distributed as $(B_t)_{t\in[0,1]}$.

• Scale invariance. For every a > 0, the process $(\frac{1}{a} B_{a^2 t} : t \geq 0)$ is a Brownian motion.

Proof. All the processes considered are continuous, Gaussian, centered (mean zero), and their covariance functions are easily seen to coincide with that of Brownian motion.

Exercise 9 (A first use of Brownian scaling). For b ≥ 0, let $\tau_{[-b,b]} = \inf\{t \geq 0 : B_t \notin [-b,b]\}$. Show that $E[\tau_{[-b,b]}] = b^2\, E[\tau_{[-1,1]}]$.

Exercise 10 (Time inversion). * Show that the process $(\mathbf{1}_{t>0}\, t B_{1/t} : t \geq 0)$ is a Brownian motion.

Exercise 11 (Ornstein–Uhlenbeck diffusion). For t ∈ R, let $X_t = e^{-t} B_{e^{2t}}$.

1. Show that for every t ∈ R, $X_t$ is distributed as N(0,1).

2. Show that $(X_t)_{t\in\mathbb{R}}$ is a centered Gaussian process and compute its covariance function $E[X_s X_t]$ for s, t ∈ R.

3. Show that for every s ∈ R, the processes $(X_{-t})_{t\in\mathbb{R}}$ and $(X_{t+s})_{t\in\mathbb{R}}$ have the same law as $(X_t)_{t\in\mathbb{R}}$ on C(R,R) endowed with the Borel σ-field of uniform convergence over every compact set.

Exercise 12 (Quadratic variation). * Let 0 ≤ a < b. For every n ≥ 0, set
$$X_n = \sum_{k=1}^{2^n} \big(B_{a+k(b-a)2^{-n}} - B_{a+(k-1)(b-a)2^{-n}}\big)^2.$$
Compute the mean and the variance of $X_n$ and find the almost sure limit of $X_n$. Deduce that a.s. BM has no finite variation on any interval. We recall that a function $f : \mathbb{R}_+ \to \mathbb{R}$ has finite variation on [a,b] if there exists M > 0 such that for every subdivision $a = t_0 < t_1 < \ldots < t_p = b$ we have
$$\sum_{i=1}^{p} |f(t_i) - f(t_{i-1})| < M.$$
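A simulation sketch of this exercise (assuming exact Gaussian increments; not part of the original notes), on [a,b] = [0,1]: the quantity $X_n$ has mean b − a = 1 and variance $2 \cdot 2^{-n}$, so it concentrates around 1 as n grows.

    import numpy as np

    def quadratic_variation(n, rng):
        # X_n: sum of squared increments of B over 2^n dyadic pieces of [0, 1].
        increments = rng.standard_normal(2**n) * np.sqrt(2.0**-n)
        return np.sum(increments**2)

    rng = np.random.default_rng(0)
    for n in (4, 8, 12, 16):
        print(n, quadratic_variation(n, rng))   # tends to 1 as n grows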

3 The Markov property

Unless explicitly mentioned, B is a one-dimensional Brownian motion starting from 0.

3.1 Blumenthal 0-1 law and consequences

Filtrations. For every s ≥ 0 we denote by $\mathcal{F}_s$ the σ-field generated by the random variables $\{B_r : 0 \leq r \leq s\}$. We suppose that this filtration is complete, in the sense that $\mathcal{F}_s$ contains all the P-negligible sets.³ In words, $\mathcal{F}_s$ contains all the information before time s. We also define
$$\mathcal{F}_{s^+} = \bigcap_{t>s} \mathcal{F}_t,$$
which contains an additional infinitesimal look into the future. Finally, $\mathcal{F}_\infty$ is the σ-field generated by all the $B_t$, $t \in \mathbb{R}_+$. Let us start by strengthening item 2 of Proposition 3. With the notation introduced there, for every $0 < r_1 < r_2 < \ldots < r_q \leq s$ and every $t_1, \ldots, t_p > 0$,
$$\text{the vector } (B^{(s)}_{t_1}, \ldots, B^{(s)}_{t_p}) \text{ is independent of } (B_{r_1}, \ldots, B_{r_q});$$

³This assumption is usually used to restrict oneself to the set of trajectories where BM is continuous.

this can be seen from the definition of Brownian motion or by using the covariance structure. By the monotone class theorem, it follows that $(B^{(s)}_{t_1}, \ldots, B^{(s)}_{t_p})$ is independent of $\sigma(B_u : u \leq s) = \mathcal{F}_s$. Applying the monotone class theorem once again, we obtain that
$$\text{the Brownian motion } (B^{(s)}_t)_{t\geq 0} \text{ is independent of } \mathcal{F}_s.$$

This reinforcement of item 2 of Proposition 3 is called the simple Markov property. Very soon, we will extend this property when the fixed time s is replaced by a random one. Before that, we prove a zero-one law for “infinitesimal” events.

Theorem 3 (Blumenthal 0–1 law). The σ-field $\mathcal{F}_{0^+}$ is trivial, in the sense that every event which is $\mathcal{F}_{0^+}$-measurable has probability 0 or 1.

Exercise 13. Do you know other 0–1 laws?

Proof. Fix $A \in \mathcal{F}_{0^+}$. Let $t_1, \ldots, t_p > 0$ and let f be a bounded continuous function on $\mathbb{R}^p$. For every ε > 0 the Brownian motion $B^{(\varepsilon)}$ is independent of $\mathcal{F}_\varepsilon$, and so a fortiori of $\mathcal{F}_{0^+}$. Thus we have
$$E\big[\mathbf{1}_A f(B^{(\varepsilon)}_{t_1}, \ldots, B^{(\varepsilon)}_{t_p})\big] = P(A)\, E\big[f(B^{(\varepsilon)}_{t_1}, \ldots, B^{(\varepsilon)}_{t_p})\big] = P(A)\, E\big[f(B_{t_1}, \ldots, B_{t_p})\big].$$
On the other hand, by (almost sure) continuity of the paths of Brownian motion we have
$$E\big[\mathbf{1}_A f(B_{t_1}, \ldots, B_{t_p})\big] = \lim_{\varepsilon\to 0} E\big[\mathbf{1}_A f(B^{(\varepsilon)}_{t_1}, \ldots, B^{(\varepsilon)}_{t_p})\big].$$
Combining the two displays we find that A is independent of $(B_{t_1}, \ldots, B_{t_p})$. By the monotone class theorem, A is independent of the σ-field generated by all the $B_s$ for s > 0, which is $\mathcal{F}_\infty$. The event A is in particular independent of itself, hence P(A) = 0 or 1.

Exercise 14. * Prove that for every s ≥ 0 we have $\mathcal{F}_s = \mathcal{F}_{s^+}$.

Applications. Consider the variables $\overline{B}_\varepsilon = \sup\{B_s : 0 \leq s \leq \varepsilon\}$ and $\underline{B}_\varepsilon = \inf\{B_s : 0 \leq s \leq \varepsilon\}$. A priori it is not clear that these are measurable with respect to $\mathcal{F}_\varepsilon$. To see this, we use the (almost sure) continuity of the trajectories and write
$$\overline{B}_\varepsilon = \sup_{t\in[0,\varepsilon]\cap\mathbb{Q}} B_t.$$
This clearly entails that $\overline{B}_\varepsilon$ is $\mathcal{F}_\varepsilon$-measurable. The same reasoning holds for $\underline{B}_\varepsilon$. Then the (measurable) events $A_+ = \{\forall n > 0 : \overline{B}_{1/n} > 0\}$ and $A_- = \{\forall n > 0 : \underline{B}_{1/n} < 0\}$ belong to $\mathcal{F}_{0^+}$. By the previous theorem they have P-probability 0 or 1. But $P(\overline{B}_\varepsilon > 0) \geq P(B_\varepsilon > 0) = 1/2$, which entails that $P(A_+) \geq 1/2$, and so $P(A_+) = 1$; similarly $P(A_-) = 1$. In words, when starting from 0, Brownian motion immediately takes positive and negative values. The picture to keep in mind is the function $t \mapsto t\sin(t^{-1})$, which exhibits the same property:

Proposition 4. For every a ∈ R, let $T_a = \inf\{t \geq 0 : B_t = a\}$, with the convention that inf ∅ = ∞. Then for every a ∈ R we have $T_a < \infty$ a.s.

Proof. By the discussion before the proposition we have
$$1 = P(\overline{B}_1 > 0) = \lim_{\delta\to 0} P(\overline{B}_1 > \delta).$$


Figure 2: The function t ↦ t sin(t⁻¹) takes negative and positive values immediately after 0.

A scaling argument (Proposition 3) shows that
$$P(\overline{B}_1 > \delta) = P\Big(\sup_{[0,1]} B > \delta\Big) = P\Big(\sup_{[0,\delta^{-2}]} B > 1\Big).$$
Letting δ → 0, we deduce that $P(\sup_{s\geq 0} B_s > 1) = 1$. Another scaling argument shows that $P(\sup_{s\geq 0} B_s > A) = 1$ for every A > 0. By symmetry the proposition follows.

Exercise 15. Show that in the last proposition we can exchange “a.s.” and “∀”, meaning that almost surely, for every a ∈ R, $T_a < \infty$. Deduce that sup B = +∞ and inf B = −∞ almost surely.

Exercise 16. 1. Show that the event $\{\limsup_{t\to 0} t^{-1/2} B_t = +\infty\}$ is $\mathcal{F}_{0^+}$-measurable.

2. Deduce that a.s. $\limsup_{t\to 0} t^{-1/2} B_t = +\infty$ and $\liminf_{t\to 0} t^{-1/2} B_t = -\infty$.

3. Using (or admitting) Exercise 10, show that the statements of 2. hold when t → ∞.

Exercise 17. Show that almost surely Brownian motion is not monotone on any interval.

Exercise 18. * Show that for every t ≥ 0, almost surely, time t is not a local maximum of B. Can we exchange “a.s.” and “∀” in the last sentence?

Exercise 19. Show the following convergence in distribution:
$$\Big(\int_0^t e^{B_s}\, ds\Big)^{1/\sqrt{t}} \xrightarrow[t\to\infty]{} e^{\overline{B}_1}.$$
Hint: use scaling.

3.2 The strong Markov property

Recall that, with the notation of Proposition 3, for every s ≥ 0 the Brownian motion $B^{(s)}$ is independent of $\mathcal{F}_s$. The goal of this section is to extend this statement when s is replaced by a random time... but not any random time!

Definition 2. A random variable $T \in \mathbb{R}_+ \cup \{\infty\}$ is a stopping time if for every t ≥ 0, the event $\{T \leq t\}$ is $\mathcal{F}_t$-measurable. In words: “if you must stop before time t, you should know it by time t”. The σ-field of the past before T is defined as
$$\mathcal{F}_T = \{A \in \mathcal{F}_\infty : \forall t \geq 0,\ A \cap \{T \leq t\} \in \mathcal{F}_t\}.$$

Example. For every a ∈ R, $T_a$ is a stopping time. Indeed,
$$\{T_a \leq t\} = \Big\{\inf_{r\in\mathbb{Q}\cap[0,t]} |B_r - a| = 0\Big\} \in \mathcal{F}_t.$$

Exercise 20. Let T be a stopping time.

• Check that $\mathcal{F}_T$ is a σ-field.

• If S, T are stopping times, then min(S,T), max(S,T) and S + T are stopping times.

• If S ≤ T are stopping times, show that $\mathcal{F}_S \subset \mathcal{F}_T$.

It is trivial that the random variable T is $\mathcal{F}_T$-measurable. Also $B_T \mathbf{1}_{T<\infty}$ is $\mathcal{F}_T$-measurable. To see this, notice that by the almost sure continuity of Brownian motion we have the almost sure limit
$$B_T = \lim_{n\to\infty} \sum_{k=0}^{\infty} \mathbf{1}_{T\in[k/n,(k+1)/n)}\, B_{k/n},$$
and remark that each member in the sum is $\mathcal{F}_T$-measurable. We can now state one of the most useful theorems about Brownian motion.

Theorem 4 (strong Markov property). Let T be a stopping time such that P(T < ∞) > 0. For every t ≥ 0 we put
$$B^{(T)}_t = \mathbf{1}_{T<\infty}\,\big(B_{t+T} - B_T\big).$$
Then under P(· | T < ∞) the process $B^{(T)}$ is a Brownian motion independent of $\mathcal{F}_T$.

Proof. We first suppose that T < ∞ a.s. Let $A \in \mathcal{F}_T$, pick a bounded continuous function $f : \mathbb{R}^p \to \mathbb{R}$ and fix $0 < t_1 < \ldots < t_p$. We thus aim at showing that
$$E\big[\mathbf{1}_A f(B^{(T)}_{t_1}, \ldots, B^{(T)}_{t_p})\big] = P(A)\, E\big[f(B_{t_1}, \ldots, B_{t_p})\big].$$

For every integer n ≥ 1, denote by $[T]_n$ the smallest number of the form $k2^{-n}$ larger than or equal to T. Since we have $f(B^{([T]_n)}_{t_1}, \ldots, B^{([T]_n)}_{t_p}) \to f(B^{(T)}_{t_1}, \ldots, B^{(T)}_{t_p})$ a.s., by the dominated convergence theorem we have
$$E\big[\mathbf{1}_A f(B^{(T)}_{t_1}, \ldots, B^{(T)}_{t_p})\big] = \lim_{n\to\infty} E\big[\mathbf{1}_A f(B^{([T]_n)}_{t_1}, \ldots, B^{([T]_n)}_{t_p})\big] = \lim_{n\to\infty} \sum_{k=0}^{\infty} E\big[\mathbf{1}_A\,\mathbf{1}_{(k-1)2^{-n}<T\leq k2^{-n}}\, f(B^{(k2^{-n})}_{t_1}, \ldots, B^{(k2^{-n})}_{t_p})\big].$$
Now, notice that $A \cap \{(k-1)2^{-n} < T \leq k2^{-n}\} \in \mathcal{F}_{k2^{-n}}$. Since by the simple Markov property $B^{(k2^{-n})}$ is independent of $\mathcal{F}_{k2^{-n}}$, we can split the expectation inside the sum and get
$$E\big[\mathbf{1}_A f(B^{(T)}_{t_1}, \ldots, B^{(T)}_{t_p})\big] = \lim_{n\to\infty} \sum_{k=0}^{\infty} E\big[\mathbf{1}_A\,\mathbf{1}_{(k-1)2^{-n}<T\leq k2^{-n}}\big]\, E\big[f(B_{t_1}, \ldots, B_{t_p})\big].$$
By re-summing we get the desired result. The case when P(T = ∞) > 0 is similar.

In the following, except when explicitly mentioned, all stopping times considered are almost surely finite.

Let us give a useful reformulation of the strong Markov property. Suppose that T < ∞ a.s. and let X be a random variable which is $\mathcal{F}_T$-measurable: it could for example be a function of T, of $B_T\mathbf{1}_{T<\infty}$, or of the path before T, and it can take values in an arbitrary measurable space (E, A). Then for every positive measurable $\varphi : E \times C(\mathbb{R}_+,\mathbb{R}) \to \mathbb{R}_+$ we have
$$E\big[\varphi\big(X, B^{(T)}\big)\big] = E\Big[\int \mathbf{P}_0(dW)\, \varphi\big(X, (W_t)_{t\geq 0}\big)\Big], \qquad (1)$$

where $\mathbf{P}_0$ is the Wiener measure. To see this, just remark that X is independent of $B^{(T)}$ by the above theorem, and thus we can write the law $P_{(X,B^{(T)})}$ as a product $P_X \otimes \mathbf{P}_0$. Let us see an application on the canonical space, to remind you of the strong Markov property in the case of Markov chains. We put $\Omega = C(\mathbb{R}_+,\mathbb{R})$ endowed with the product σ-field and $B_t(\omega) = \omega_t$. Recall also the notation $\mathbf{P}_x$. The translation operator is
$$\theta_s(\omega) = (\omega_{s+t})_{t\geq 0}, \quad \text{for } s \geq 0.$$
Then the Markov property can equivalently be described as follows. Let T be an a.s. finite stopping time. For every measurable positive functions F, G on Ω such that F is $\mathcal{F}_T$-measurable, we have
$$E_x\big[F \cdot G\circ\theta_T\big] = E_x\big[F \cdot G\big(B^{(T)}_\cdot + B_T\big)\big] = E_x\Big[F \int \mathbf{P}_0(dW)\, G(W + B_T)\Big],$$
by (1). But by definition of the measures $\mathbf{P}_x$, we have
$$E_x\big[F \cdot G\circ\theta_T\big] = E_x\Big[F \int \mathbf{P}_{B_T}(dW)\, G(W)\Big] = E_x\big[F\, E_{B_T}[G]\big].$$

Exercise 21. Show that for every a > 0, almost surely
$$\inf\{t \geq 0 : B_t = a\} = \inf\{t \geq 0 : B_t > a\}.$$

Exercise 22. Let a ≥ 0 and recall that $T_a = \inf\{s \geq 0 : B_s = a\}$.

1. Show that $T_a = a^2 T_1$ in distribution.

2. For 0 ≤ a ≤ b < ∞, prove that $T_b - T_a$ is distributed as $T_{b-a}$ and is independent of $(\mathcal{F}_{T_s})_{0\leq s\leq a}$.

Remark: the process a ↦ $T_a$ is thus another process with stationary and independent increments.

Exercise 23. Show that if S, T are stopping times then ST is not necessarily a stopping time.

4 Paths properties

4.1 Reflection principle

Theorem 5. Let B be a Brownian motion in dimension 1. For every t > 0, recall that Bt =sups≤tBs. For every a ≥ 0 and b ≤ a we have

P (Bt ≥ a, Bt ≤ b) = P (Bt ≥ 2a− b).

Proof. The idea is to apply the strong Markov property at the stopping time $T_a$ and to reflect the path after time $T_a$ around the horizontal line y = a. We have already seen that $T_a < \infty$ a.s. We can thus apply the strong Markov property (1) and get that
$$P(\overline{B}_t \geq a,\ B_t \leq b) = P\big(T_a \leq t,\ B^{(T_a)}_{t-T_a} \leq b-a\big) = E\big[\mathbf{1}_{T_a\leq t}\, F_{t-T_a}((-\infty, b-a])\big],$$
where for s ≥ 0 and A ⊂ R,
$$F_s(A) = \int \mathbf{P}_0(d\omega)\, \mathbf{1}_{\omega_s\in A}.$$
By symmetry of the Wiener measure we have $F_s(A) = F_s(-A)$, so that
$$P(\overline{B}_t \geq a,\ B_t \leq b) = E\big[\mathbf{1}_{T_a\leq t}\, F_{t-T_a}([a-b, \infty))\big] = P\big(T_a \leq t,\ B^{(T_a)}_{t-T_a} \geq a-b\big) = P(\overline{B}_t \geq a,\ B_t \geq 2a-b) = P(B_t \geq 2a-b),$$
where the last equality holds because $\{B_t \geq 2a-b\} \subset \{\overline{B}_t \geq a\}$ since 2a − b ≥ a.
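Here is a crude Monte Carlo check of the reflection identity (a sketch, not part of the original notes; the time discretization slightly underestimates the running maximum, so the agreement is only approximate):

    import numpy as np

    rng = np.random.default_rng(0)
    n_paths, n_steps, t = 20_000, 2_000, 1.0
    a, b = 1.0, 0.5   # here 0 <= b <= a
    dB = rng.standard_normal((n_paths, n_steps)) * np.sqrt(t / n_steps)
    B = np.cumsum(dB, axis=1)
    running_max = B.max(axis=1)
    lhs = np.mean((running_max >= a) & (B[:, -1] <= b))
    rhs = np.mean(B[:, -1] >= 2 * a - b)
    print(lhs, rhs)   # both close to P(N >= 1.5), approximately 0.0668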

Corollary 1. With the notation of the previous theorem we have:

1. The variable $\overline{B}_t$ has the same distribution as $|B_t|$.

2. The variable $T_a$ is distributed as $(a/B_1)^2$ for every a ∈ R.

Proof. For (i) we have
$$P(\overline{B}_t \geq a) = P(\overline{B}_t \geq a,\ B_t \leq a) + P(\overline{B}_t \geq a,\ B_t > a) = P(B_t \geq 2a-a) + P(B_t > a) = 2P(B_t \geq a).$$
For (ii), if a ≥ 0, using (i) and the scaling property of Brownian motion we have
$$P(T_a \leq t) = P(\overline{B}_t \geq a) = P(|B_t| \geq a) = P(|\sqrt{t}\,B_1| \geq a) = P\Big(\frac{a^2}{B_1^2} \leq t\Big).$$

When a ≤ 0, we use the symmetry of Brownian motion to say that Ta = T−a in law.

Exercise 24 (Densities). Deduce from the last corollary that:

1. The pair $(\overline{B}_t, B_t)$ has density
$$\frac{2(2a-b)}{\sqrt{2\pi t^3}} \exp\Big(-\frac{(2a-b)^2}{2t}\Big)\,\mathbf{1}_{a>0,\,b<a}.$$

2. For every a ∈ R, the variable $T_a$ has density
$$\frac{|a|}{\sqrt{2\pi t^3}} \exp\Big(-\frac{a^2}{2t}\Big)\,\mathbf{1}_{t>0}.$$

Exercise 25. Try Exercise 10 again.

Exercise 26 (local maxima). * Let p, q, r, s ∈ Q₊ such that p < q < r < s. Show that
$$P\Big(\sup_{p\leq t\leq q} B_t = \sup_{r\leq t\leq s} B_t\Big) = 0.$$
Deduce that the local extrema of Brownian motion are almost surely distinct.


4.2 Zeros of Brownian motion

In this section, we study the zero set of a standard one-dimensional Brownian motion, that is
$$\mathcal{Z} = \{t \geq 0 : B_t = 0\}.$$

Theorem 6. Almost surely, the set Z is a perfect set (that is, closed with no isolated points) of zero Lebesgue measure.

Proof. Since B is an a.s. continuous function, the set Z is closed a.s. It follows from the discussion after the Blumenthal law that $T_0^+ = \inf\{t > 0 : B_t = 0\}$ satisfies $T_0^+ = 0$ almost surely. Using this, we will prove that Z has no isolated point. Fix q ∈ [0,∞) a rational number and consider the first return to zero after q:
$$\tau_q = \inf\{t \geq q : B_t = 0\}.$$
It is easy to see that $\tau_q$ is an almost surely finite stopping time. Note also that $B_{\tau_q} = 0$ by continuity. We can thus apply the strong Markov property at time $\tau_q$ and get
$$P\big(\inf\{z \in \mathcal{Z} : z > q\} = q\big) = P\big(\inf\{t > 0 : B^{(\tau_q)}_t = 0\} = 0\big) = 1.$$
We deduce that for every q ∈ Q₊, almost surely, the zero $\tau_q$ is not isolated from the right. Since the rational numbers are countable, we can exchange ∀ and a.s. in the last sentence. This easily implies that no zero of Z is isolated: indeed, if we argue by contradiction and find z ∈ Z alone in a neighborhood, then z = $\tau_q$ for some q ∈ Q₊ and we reach a contradiction.

To show that the Lebesgue measure of Z is zero, we apply Fubini's theorem:
$$E[\mathrm{Leb}(\mathcal{Z})] = E\Big[\int_0^\infty dt\, \mathbf{1}_{B_t=0}\Big] = \int_0^\infty dt\, P(B_t = 0) = 0,$$
which proves the desired result. However, to justify the use of Fubini's theorem we have to show that the function $(\omega, t) \mapsto \mathbf{1}_{B_t=0}$ is measurable with respect to $\mathcal{F}_\infty \otimes \mathcal{B}$. This is left as a measurability exercise.

Exercise 27. Show that the function $(\omega, t) \mapsto \mathbf{1}_{B_t=0}$ is measurable with respect to $\mathcal{F}_\infty \otimes \mathcal{B}$.

Exercise 28. Show that for every s ≥ 0, almost surely, the set $\mathcal{L}_s = \{t \geq 0 : B_t = s\}$ is also a perfect set of zero Lebesgue measure. Can we exchange ∀ and a.s.?

Exercise 29. Show that Brownian motion has isolated zeros from the left and from the right.

Exercise 30 (Classic exercise about perfect sets). Show that a perfect subset of R has the same cardinality as R. Do you know other perfect subsets of R of Lebesgue measure 0? (If you do not, then learn the classical construction of Cantor sets.)

4.3 Arcsine laws

A random variable X ∈ [0,1] follows the arcsine law if
$$P(X \in dx) = \frac{dx}{\pi\sqrt{x(1-x)}}\,\mathbf{1}_{x\in(0,1)}.$$
The name comes from the fact that the cumulative distribution function of X is then $P(X \leq x) = \frac{2}{\pi}\arcsin(\sqrt{x})$. The arcsine law pops up naturally when studying basic properties of Brownian motion.


Figure 3: The arcsine distribution

Theorem 7 (Arcsine laws). Let $L = \sup\{t \in [0,1] : B_t = 0\}$ be the last zero of Brownian motion before time 1. Let also M be the almost surely unique random time in [0,1] such that $B_M = \overline{B}_1$. Then L and M are both arcsine distributed.

Exercise 26 shows that M is almost surely unique and thus well-defined.

We will see later that there is yet another arcsine law for Brownian motion: the time spent by B in positive (or negative) regions before time 1 is again distributed according to the arcsine distribution.

Exercise 31. Show that L and M are not stopping times.

Proof. We start with the case of L. Fix t ∈ (0,1). By applying the (simple) Markov property at time t we deduce that
$$P(L < t) = E\big[\mathbf{1}_{\{B^{(t)}_s + B_t \neq 0\ \forall s\in[0,1-t]\}}\big] = E\big[\mathbf{E}_{B_t}\big[\mathbf{1}_{\{\omega_s \neq 0\ \forall s\in[0,1-t]\}}\big]\big] = E\big[\mathbf{P}_{B_t}(T_0 > 1-t)\big] = E\big[\mathbf{P}_0(T_{B_t} > 1-t)\big].$$

We can now use Corollary 1 and get
$$P(L < t) = E\big[P\big((B_t/N')^2 > 1-t\big)\big] = P\Big((N/N')^2 > \frac{1-t}{t}\Big),$$
where N, N′ are independent standard normal N(0,1) variables independent of B. At this point, either we compute the integral directly or we use a trick: after a simple manipulation the last probability is also equal to
$$P\Big(\frac{|N'|}{\sqrt{N^2 + N'^2}} \leq \sqrt{t}\Big).$$
We write $(N, N') = (R\sin\theta, R\cos\theta)$ in polar coordinates, where R ≥ 0 and θ ∈ (0, 2π]. The density of (N, N′) is $dx\,dy\,\frac{1}{2\pi} e^{-(x^2+y^2)/2}$, which becomes $d\theta\,\frac{1}{2\pi}\, dr\, r e^{-r^2/2}$ in polar coordinates. The point is that θ is uniformly distributed over (0, 2π]. It follows that the last probability is equal to $P(|\cos\theta| \leq \sqrt{t}) = P(|\sin\theta| \leq \sqrt{t}) = \frac{2}{\pi}\arcsin(\sqrt{t})$, as desired.

The case of the variable M is similar in spirit. Fix t ∈ (0,1) and write
$$P(M \leq t) = P\Big(\sup_{[0,t]} B \geq \sup_{[t,1]} B\Big) = P\Big(\sup_{u\in[0,t]} (B_{t-u} - B_t) \geq \sup_{u\in[0,1-t]} B^{(t)}_u\Big).$$
By Proposition 3, the process $\widetilde{B}_u = B_{t-u} - B_t$ is a Brownian motion over [0,t]. Hence, applying the (simple) Markov property at time t we get that
$$P(M \leq t) = P\Big(\sup_{[0,t]} \widetilde{B} \geq \sup_{[0,1-t]} B^{(t)}\Big) = P\big(\sqrt{t}\,|N| \geq \sqrt{1-t}\,|N'|\big),$$
by Theorem 5 (via Corollary 1) and with the same notation as above. The rest of the proof is the same.

Exercise 32 (Another proof using time inversion). Let $R = \inf\{t \geq 1 : B_t = 0\}$.

1. Using the (simple) Markov property at time 1, show that $R = 1 + (N/N')^2$ in distribution, where N and N′ are two independent standard normals.

2. Use Exercise 10 to show that $R = L^{-1}$ in distribution and recover the arcsine law for L.
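The arcsine law of M is easy to test numerically (a sketch, not part of the original notes): the discretized argmax of a fine random path should have cumulative distribution close to (2/π) arcsin(√t).

    import numpy as np

    rng = np.random.default_rng(0)
    n_paths, n_steps = 10_000, 5_000
    dB = rng.standard_normal((n_paths, n_steps)) / np.sqrt(n_steps)
    B = np.cumsum(dB, axis=1)
    M = (B.argmax(axis=1) + 1) / n_steps   # discretized time of the maximum on [0, 1]
    for t in (0.1, 0.5, 0.9):
        print(t, (M <= t).mean(), 2 / np.pi * np.arcsin(np.sqrt(t)))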

4.4 Law of the iterated logarithm

Theorem 8. Let B be a standard one-dimensional Brownian motion. Then, almost surely,
$$\limsup_{t\to\infty} \frac{B_t}{\sqrt{2t\log\log(t)}} = 1.$$

Proof. The main idea of the proof is to look at the values of Brownian motion at exponential times $\alpha^n$ for some α > 1. Indeed, when α → ∞ these values become more and more independent and we can use the Borel–Cantelli lemmas. Let us even imagine in the proof that we do not know the scale function $\sqrt{2t\log\log t}$ and let us try to recover it. Notice that it should be large compared to $\sqrt{t}$, since the latter is the typical size of Brownian motion at time t.

Upper bound. Fix α > 1 and look at the events
$$A_n = \Big\{\overline{B}_{\alpha^n} \geq \sqrt{\alpha^n}\,\psi(\alpha^n)\Big\}.$$
The goal here is to apply the first Borel–Cantelli lemma: we look for a suitable function ψ such that $\sum P(A_n) < \infty$, which will imply the upper bound. By the scaling property of Brownian motion and Theorem 5 we have
$$P(A_n) = P(\overline{B}_1 \geq \psi(\alpha^n)) = P(|B_1| \geq \psi(\alpha^n)).$$
As we said, it should be clear that ψ(t) → ∞ as t → ∞. Using Lemma 1, the last probability is thus of order
$$\frac{1}{\psi(\alpha^n)} \exp\big(-\psi(\alpha^n)^2/2\big).$$
Since $\psi(\alpha^n) \to \infty$, the important factor in the last display is the exponential one. To see whether the series $\sum P(A_n)$ converges, we thus compare $\exp(-\psi(\alpha^n)^2/2)$ with 1/n. Doing so, we clearly see that the threshold function is $\psi(t) = \sqrt{2\log\log t}$. Indeed, for every ε > 0, if $\psi(t) = (1-\varepsilon)\sqrt{2\log\log t}$ the series $\sum P(A_n)$ diverges, while if $\psi(t) = (1+\varepsilon)\sqrt{2\log\log t}$ then $\sum P(A_n)$ converges. In the latter case, by the first Borel–Cantelli lemma, $A_n$ happens only finitely often. For every t > 1, we now interpolate and find n ≥ 1 such that $\alpha^{n-1} < t \leq \alpha^n$, so that
$$\frac{B_t}{\sqrt{2t\log\log t}} \;\leq\; \frac{\overline{B}_{\alpha^n}}{(1+\varepsilon)\sqrt{2\alpha^n\log\log\alpha^n}} \cdot \frac{(1+\varepsilon)\sqrt{2\alpha^n\log\log\alpha^n}}{\sqrt{2t\log\log t}}.$$
Taking lim sup we get that
$$\limsup_{t\to\infty} \frac{B_t}{\sqrt{2t\log\log t}} \leq (1+\varepsilon)\sqrt{\alpha}.$$
Now let ε → 0 and α → 1 to get the upper bound. Applying the same bound to −B, we can slightly generalize the upper bound to get (this will be used in the lower bound)
$$\limsup_{t\to\infty} \frac{|B_t|}{\sqrt{2t\log\log t}} \leq 1. \qquad (2)$$

Lower bound. For the lower bound we will apply the second Borel–Cantelli lemma and thus need independent events. Fix α > 1 and ε > 0 and consider the events
$$A'_n = \Big\{B_{\alpha^n} - B_{\alpha^{n-1}} > (1-\varepsilon)\sqrt{2\alpha^n\log\log\alpha^n}\Big\}, \quad \text{for } n \geq 1.$$
Remark that $B_{\alpha^n} - B_{\alpha^{n-1}} = B^{(\alpha^{n-1})}_{\alpha^n-\alpha^{n-1}}$, so that the Markov property applied at time $\alpha^{n-1}$ yields that $A'_n$ is independent of $A'_1, A'_2, \ldots, A'_{n-1}$. It follows that all the $A'_n$ are independent. On the other hand, by scaling we have
$$P(A'_n) = P\Big(B_{\alpha^n-\alpha^{n-1}} > (1-\varepsilon)\sqrt{2\alpha^n\log\log\alpha^n}\Big) = P\bigg(B_1 > \frac{(1-\varepsilon)\sqrt{2\alpha^n\log\log\alpha^n}}{\sqrt{\alpha^n-\alpha^{n-1}}}\bigg) \geq P\bigg(B_1 > \frac{\sqrt{2\alpha}\,(1-\varepsilon)}{\sqrt{\alpha-1}}\,\sqrt{\log\log\alpha^n}\bigg).$$
Using Lemma 1, the last quantity is of order
$$n^{-\gamma^2/2 + o(1)}, \quad \text{where } \gamma = \frac{\sqrt{2\alpha}\,(1-\varepsilon)}{\sqrt{\alpha-1}}.$$
For ε fixed we thus pick α large enough so that γ²/2 < 1. If we do so, the series $\sum P(A'_n)$ is infinite, and by the second Borel–Cantelli lemma $A'_n$ happens infinitely often. When $A'_n$ happens we have
$$\frac{B_{\alpha^n}}{\sqrt{2\alpha^n\log\log\alpha^n}} \;\geq\; (1-\varepsilon) - \frac{|B_{\alpha^{n-1}}|}{\sqrt{2\alpha^n\log\log\alpha^n}}.$$
We now use the upper bound (2) for the second term of the right-hand side: it follows that, along the infinitely many n for which $A'_n$ happens, the left-hand side is asymptotically at least $(1-\varepsilon) - \frac{1}{\sqrt{\alpha}}$. We can now let α → ∞ followed by ε → 0 and get the desired result.

Remark. By symmetry it follows that $\liminf_{t\to\infty} B_t/\sqrt{2t\log\log(t)} = -1$, and by time inversion (see Exercise 10) we also have
$$\limsup_{t\to 0} \frac{B_t}{\sqrt{2t\log|\log t|}} = 1.$$
In particular, a.s. Brownian motion is not differentiable at time t = 0. Applying the simple Markov property, it follows that, for every t ≥ 0, almost surely B is not differentiable at time t. The purpose of the following theorem is to exchange ∀ and a.s. and show that Brownian motion is a.s. nowhere differentiable.

Theorem 9 (Paley, Wiener and Zygmund 1933). Almost surely, the function t ↦ B_t is nowhere differentiable.

Proof. We can restrict ourselves to [0,1]. If there exists $t_0 \in [0,1)$ such that B is differentiable at $t_0$, then
$$\sup_{h\in(0,1]} \frac{|B_{t_0+h} - B_{t_0}|}{h} \leq M, \qquad (3)$$
for some (random) M > 0. We split the interval [0,1] into the 2ⁿ intervals $[(k-1)2^{-n}, k2^{-n}]$ for $k \in \{1, \ldots, 2^n\}$. If $t_0$ falls into some interval $[(k-1)2^{-n}, k2^{-n}]$, this means that for every $1 \leq j \leq 2^n - k$ we have, using (3),
$$|B_{(k+j)2^{-n}} - B_{(k+j-1)2^{-n}}| \leq |B_{(k+j)2^{-n}} - B_{t_0}| + |B_{(k+j-1)2^{-n}} - B_{t_0}| \leq M\,\frac{2j+1}{2^n}.$$
In fact, we will only use the first values j = 1, 2, 3 for our purposes. We are thus led to consider the events
$$\Omega^M_{n,k} := \big\{|B_{(k+j)2^{-n}} - B_{(k+j-1)2^{-n}}| \leq 7M\,2^{-n}\ \text{for } j = 1, 2, 3\big\}.$$

By the independence of the increments and the scaling property of Brownian motion we have
$$P(\Omega^M_{n,k}) = P\big(|B_{2^{-n}}| \leq 7M\,2^{-n}\big)^3 = P\big(|B_1| \leq 7M\,2^{-n/2}\big)^3 \leq (7M)^3\,2^{-3n/2},$$
where for the last inequality we used the fact that the density of the normal distribution is bounded by 1. By the union bound we have
$$P\Big(\bigcup_{k=1}^{2^n} \Omega^M_{n,k}\Big) \leq (7M)^3\,2^{-n/2}.$$
Now, if B is differentiable at some $t_0 \in [0,1)$, then by (3), for some integer M, the event $\cup_k \Omega^M_{n,k}$ holds for all n large enough, and so for infinitely many n's. But by the first Borel–Cantelli lemma

we deduce that
$$P\big(\exists\, t_0 \in [0,1] : B \text{ is differentiable at } t_0\big) \;\leq\; P\bigg(\bigcup_{M=1}^{\infty} \Big\{\text{the event } \bigcup_{k=1}^{2^n} \Omega^M_{n,k} \text{ holds i.o.}\Big\}\bigg) \;\leq\; \sum_{M=1}^{\infty} 0 = 0.$$

Exercise 33. * Let f ∈ C([0,1],R) be any fixed continuous function. Adapt the proof of Theorem 9 and show that, almost surely, the function $(B_t + f(t) : t \in [0,1])$ is nowhere differentiable.

We end this section by mentioning Lévy's modulus of continuity theorem (without giving the proof), which shares striking similarities with the law of the iterated logarithm (except the iterated logarithm!).

Theorem 10 (Lévy 1937). Almost surely,
$$\limsup_{h\to 0}\ \sup_{0\leq t\leq 1-h} \frac{|B_{t+h} - B_t|}{\sqrt{2h\log(1/h)}} = 1.$$

Exercise 34. * Prove the lower bound in Lévy's modulus of continuity theorem.

5 Brownian motion and martingales

Let (Ω, F, P) be a probability space. Suppose that we have a filtration $(\mathcal{F}_t)_{t\geq 0}$. For the purposes of this course we will suppose that $\mathcal{F}_t$ is the filtration generated by the Brownian motion B. We will need to consider martingales in continuous time, which are defined similarly as in the discrete setting.

Definition 3. A family $(M_t)_{t\geq 0}$ is an $\mathcal{F}_t$-martingale if for every t ≥ 0, $M_t$ is $\mathcal{F}_t$-measurable (the process is said to be adapted to the filtration), $E[|M_t|] < \infty$, and if for every s ≤ t we have
$$E[M_t \mid \mathcal{F}_s] = M_s.$$
The process is a submartingale (respectively a supermartingale) if the “=” of the last display is replaced by “≥” (respectively “≤”).

As in the discrete setting, martingales represent fair games, where the current value of the process gives the best prediction for the future of the game. Let us give a few examples of very important continuous martingales based on Brownian motion.

Proposition 5. The processes $(B_t : t \geq 0)$, $(B_t^2 - t : t \geq 0)$ and $(e^{\lambda B_t - t\lambda^2/2} : t \geq 0)$, for all λ ∈ R, are continuous martingales for the Brownian filtration.

Proof. All the processes considered are integrable, adapted to the filtration $(\mathcal{F}_t)_{t\geq 0}$, and continuous. Let us check the martingale property: for t ≥ s we have
$$E[B_t \mid \mathcal{F}_s] = E[B_s + (B_t - B_s) \mid \mathcal{F}_s] = B_s,$$
because $B_t - B_s$ is centered and independent of $\mathcal{F}_s$ by the (simple) Markov property. The other cases are similar (for the exponential martingale, recall that if N is a standard Gaussian r.v. then $E[\exp(zN)] = \exp(z^2/2)$ for every z ∈ C).
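A martingale has constant expectation, which gives a cheap numerical check of the exponential case (a sketch, not part of the original notes): $E[\exp(\lambda B_t - t\lambda^2/2)]$ should equal $E[M_0] = 1$ for any fixed t and λ.

    import numpy as np

    rng = np.random.default_rng(0)
    lam, t = 1.5, 2.0
    B_t = np.sqrt(t) * rng.standard_normal(10**6)    # exact samples of B_t
    M_t = np.exp(lam * B_t - t * lam**2 / 2)         # exponential martingale at time t
    print(M_t.mean())                                # close to 1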

Exercise 35. Find α ∈ R such that $B_t^3 - \alpha t B_t$ is an $(\mathcal{F}_t)$-martingale.

Exercise 36. Show that the process
$$t \mapsto tB_t - \int_0^t ds\, B_s$$
is an $(\mathcal{F}_t)$-martingale.

We will now extend some of the results of the theory of discrete-time martingales to our setup. The proofs always go through an approximation by discrete martingales, after discretizing time, and then passing to the limit.

Proposition 6. Let T be a bounded stopping time and let $(M_t)_{t\geq 0}$ be a continuous martingale adapted to $(\mathcal{F}_t)$. Then we have
$$E[M_T] = E[M_0].$$

Proof. Let K > 0 be such that T < K a.s. For n ≥ 0, we divide the time-scale [0,K] into n steps by putting $\mathcal{F}^{(n)}_i = \mathcal{F}_{iK/n}$, defining $T^{(n)}$ to be the smallest i such that iK/n is larger than T, and setting $M^{(n)}_i = M_{iK/n}$. It is then clear that $M^{(n)}$ is an $(\mathcal{F}^{(n)}_i)_{i\geq 0}$-martingale and $T^{(n)}$ is a (discrete) stopping time for $(\mathcal{F}^{(n)}_i)_{i\geq 0}$. By continuity of M we clearly have
$$M^{(n)}_{T^{(n)}} \xrightarrow[n\to\infty]{a.s.} M_T.$$
For every n ≥ 0, by the discrete theory we have
$$E\big[M^{(n)}_{T^{(n)}}\big] = E[M_0],$$
and besides
$$M^{(n)}_{T^{(n)}} = E\big[M_K \mid \mathcal{F}^{(n)}_{T^{(n)}}\big].$$
Since the family $\{E[M_K \mid \mathcal{G}] : \mathcal{G} \subset \mathcal{F} \text{ sub-}\sigma\text{-field}\}$ is uniformly integrable, it follows that $M^{(n)}_{T^{(n)}}$ is also uniformly integrable. Consequently, by the enhanced version of the dominated convergence theorem we have
$$E\big[M^{(n)}_{T^{(n)}}\big] \to E[M_T] = E[M_0] \quad \text{as } n \to \infty.$$

Theorem 11. For every a < 0 < b we have:

1. $P(T_a < T_b) = \dfrac{b}{b-a}$,

2. $E[\min(T_a, T_b)] = -ab$.

Proof. Fix n ≥ 1. The stopping time $\tau_n = \min(n, T_a, T_b)$ is bounded by n. We use the continuous martingale $B_t$ and we get by the last proposition that
$$E[B_{\tau_n}] = 0.$$
Since $T_a, T_b < \infty$ a.s., we can let n → ∞ to get $B_{\tau_n} \to B_{\min(T_a,T_b)}$ a.s. Since furthermore $B_{\tau_n}$ is bounded by max(|a|, |b|), the dominated convergence theorem shows that
$$0 = E\big[B_{\min(T_a,T_b)}\big] = a\,P(T_a < T_b) + b\,P(T_b < T_a).$$

Combining this with the obvious fact $P(T_a < T_b) + P(T_b < T_a) = 1$ yields the first point of the theorem. The second point is similar, but we now work with the continuous martingale $(B_t^2 - t)_{t\geq 0}$. Using the same stopping time $\tau_n = \min(n, T_a, T_b)$ we get by the same proposition that
$$0 = E[B_{\tau_n}^2 - \tau_n] = E[B_{\tau_n}^2] - E[\tau_n].$$
On the one hand, $E[B_{\tau_n}^2] \to E[B_{\min(T_a,T_b)}^2]$ by dominated convergence. On the other hand, $E[\tau_n] \uparrow E[\min(T_a,T_b)]$ by monotone convergence. We thus obtain, using the first point of the theorem, that
$$E[\min(T_a,T_b)] = E\big[B_{\min(T_a,T_b)}^2\big] = a^2\,\frac{b}{b-a} + b^2\,\frac{-a}{b-a} = -ab.$$

Thanks to the last theorem, the expectation of the stopping time $\min(T_a, T_b)$ for a < 0 < b is thus seen to be finite. Notice that it is very important to constrain the Brownian motion in a finite strip in order to get a finite expectation for the exit time: indeed, for every a ∈ R*, it is easy to see, using the exact density of $T_a$ (see Exercise 24), that
$$E[T_a] = \infty.$$
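Theorem 11 is also easy to test by simulation (a crude sketch, not part of the original notes; the Euler discretization slightly overshoots the boundary, so the estimate is approximate):

    import numpy as np

    rng = np.random.default_rng(0)
    a, b = -1.0, 2.0                       # strip (a, b) with a < 0 < b
    n_paths, dt, max_steps = 5_000, 1e-3, 50_000
    B = np.zeros(n_paths)
    T = np.full(n_paths, np.nan)           # exit times; nan = still inside the strip
    for k in range(1, max_steps + 1):
        alive = np.isnan(T)
        if not alive.any():
            break
        B[alive] += np.sqrt(dt) * rng.standard_normal(alive.sum())
        T[alive & ((B <= a) | (B >= b))] = k * dt
    print(np.nanmean(T), -a * b)           # E[min(T_a, T_b)] = -ab = 2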

Exercise 37. Use Theorem 11 and let b→∞ to get another proof of the identity E[Ta] =∞.

Exercise 38. Using the martingale $\exp(\lambda B_t - t\lambda^2/2)$, show that for every λ ∈ R₊ we have
$$E\big[\exp(-\lambda \min(T_a, T_{-a}))\big] = \cosh\big(a\sqrt{2\lambda}\big)^{-1}.$$

Exercise 39. Show that if $|B_{t+1} - B_t| > 2$ for some t ≥ 0, then $\min(T_1, T_{-1}) \leq t + 1$. Deduce that $P(\min(T_1, T_{-1}) \geq n) \leq \alpha^n$ for some α ∈ (0,1).

Exercise 40 (Doob's maximal inequality for submartingales). Let $(X_t)_{t\geq 0}$ be a continuous submartingale for the filtration $(\mathcal{F}_t)_{t\geq 0}$. Let p > 1 and t ≥ 0. Show, using the discrete version of Doob's maximal inequality, that
$$E\Big[\Big(\sup_{s\leq t} |X_s|\Big)^p\Big] \leq \Big(\frac{p}{p-1}\Big)^p\, E\big[|X_t|^p\big].$$

Cultural remark. A fascinating theorem, due to Dubins and Schwarz (not the same Schwarz as in the Cauchy–Schwarz inequality...), roughly says that all continuous martingales are variations of Brownian motion (more precisely, time-changed Brownian motions). Hence, in a certain sense there is only “one” continuous martingale, which is Brownian motion. Of course, if we drop the word “continuous”, that is another story...

6 Donsker’s theorem

Reminder on convergence in distribution. Let $X_n$ and X be random variables with values in a metric space (E, d) endowed with its Borel σ-field. We say that $X_n \to X$ in distribution (or in law) if for every bounded continuous function F : E → R we have
$$E[F(X_n)] \longrightarrow E[F(X)], \quad \text{as } n \to \infty.$$

The following proposition gathers most of the properties which are commonly used:

Proposition 7 (Portmanteau). Let $X_n$ and X be random variables with values in an arbitrary metric space (E, d). Then the following propositions are equivalent:

• Xn converges in law towards X

• E[F (Xn)]→ E[F (X)] for any bounded continuous function F

• E[F (Xn)]→ E[F (X)] for any bounded Lipschitz function F

• lim supP (Xn ∈ C) ≤ P (X ∈ C) for any C ⊂ E closed

• lim inf P (Xn ∈ O) ≥ P (X ∈ O) for any O ⊂ E open

• P (Xn ∈ A)→ P (X ∈ A) for any A ⊂ E measurable such that P (X ∈ ∂A) = 0

• E[F (Xn)]→ E[F (X)] for any bounded measurable function F : E → R with

P (F is discontinuous at X) = 0.

Proof. This can be found in any textbook on advanced probability, for example “Convergence of Probability Measures” by Billingsley.

Proposition 8 (Continuous mapping theorem). Suppose that $X_n$ converges in distribution towards X, and let F be a continuous function on (E, d). Then $F(X_n)$ converges in law towards F(X). This remains true if F is only measurable, provided that
$$P(F \text{ is discontinuous at } X) = 0.$$

Proof. Suppose that F is measurable and that P(F is discontinuous at X) = 0. Let G be a bounded continuous function (defined on the image space of F) with values in R. Obviously, G ∘ F is a bounded measurable function such that P(G ∘ F is discontinuous at X) = 0. We can thus apply the last item of Proposition 7 to deduce that
$$E[G(F(X_n))] \longrightarrow E[G(F(X))]$$
as n → ∞, which entails that $F(X_n) \to F(X)$ in law, as desired.

Exercise 41. * Let Xn and X be random variables with values in a metric space (E, d).

1. Suppose that Xn → X almost surely. Show that Xn → X in distribution.

2. Suppose now that Xn → X in probability, i.e. ∀ε > 0 we have P (d(Xn, X) > ε)→ 0.

(a) Show that there exists a subsequence $n_k$ such that $X_{n_k} \to X$ almost surely.

(b) Deduce that Xn → X in distribution.

We will focus in this chapter on the case
$$(E, d) = \big(C([0,K],\mathbb{R}),\ \|\cdot\|_\infty\big),$$
the space of continuous functions over [0,K] for K > 0, endowed with the topology of uniform convergence. Although not needed in the sequel, recall that this space is separable and complete.

The random variables that we will consider are rescaled random walks, and their limit is always Brownian motion.

Let $X_1, \ldots, X_n, \ldots$ be independent identically distributed real random variables with mean c ∈ R and finite variance σ². We form the cumulative sums $S_0 = 0$ and $S_n = X_1 + \ldots + X_n$ for n ≥ 1, which we interpolate linearly between integer values by setting
$$S(t) = S_{[t]} + (t - [t])\,(S_{[t]+1} - S_{[t]})$$
for every t ≥ 0. We now define a sequence $(S^*_n)$ of random continuous functions by putting
$$S^*_n(t) = \frac{S(nt) - cnt}{\sigma\sqrt{n}}.$$

We already know from the central limit theorem that $S^*_n(1) \to B_1$ in distribution, and more generally that for every $0 \leq t_1 \leq t_2 \leq \ldots \leq t_k$ we have
$$(S^*_n(t_1), \ldots, S^*_n(t_k)) \xrightarrow[n\to\infty]{(d)} (B_{t_1}, \ldots, B_{t_k})$$
in distribution as n → ∞ (if you do not see why, prove it!). In words, the finite-dimensional marginals of $S^*_n$ converge towards those of Brownian motion. The following theorem (sometimes called the functional central limit theorem) shows that the convergence actually holds in distribution for the uniform norm over [0,K] for any K > 0. This is much stronger (as a comparison, uniform convergence over every compact set is much stronger than pointwise convergence of functions) and enables one to control the function entirely over [0,K]. For example, this theorem implies the convergence in distribution of $F(S^*_n)$ towards F(B) whenever F is continuous for the uniform norm over [0,1]; see Section 6.1 for applications.

Theorem 12 (Donsker's invariance principle). For every K ≥ 0, the random continuous functions $S^*_n$ converge in distribution for the supremum norm over [0,K] towards B.

One of the advantages of Donsker's theorem is its universality, in the sense that Brownian motion appears as the limit of any rescaled random walk as soon as the step distribution has finite variance. However, for the sake of clarity (and for timing reasons) we will prove the theorem in the case where $(X_i)$ is a sequence of i.i.d. fair Bernoulli variables, that is,
$$P(X_1 = 1) = P(X_1 = -1) = \frac{1}{2}.$$

There are many proofs of Donsker's theorem. The one we undertake is based on the idea of coupling: we will construct, on the same probability space, the random walk $S_n$ as well as a Brownian motion B, so that they are very close to each other. To do so, we will embed the walk into the Brownian motion B.

Proof. Consider the sequence of random times defined recursively as follows: $\tau_0 = 0$ and for i ≥ 0,
$$\tau_{i+1} = \inf\{t \geq \tau_i : |B_t - B_{\tau_i}| = 1\}.$$

This forms an increasing sequence of stopping times (exercise) such that $B_{\tau_{i+1}} = B_{\tau_i} \pm 1$. By the strong Markov property applied at time $\tau_i$, we deduce that $(\tau_{i+1} - \tau_i, B_{\tau_{i+1}} - B_{\tau_i})$ is independent of $\mathcal{F}_{\tau_i}$ and is distributed as $(\tau_1, B_{\tau_1})$. Since the $(\tau_{j+1} - \tau_j, B_{\tau_{j+1}} - B_{\tau_j})$ for j < i are $\mathcal{F}_{\tau_i}$-measurable, it follows that the $(\tau_{i+1} - \tau_i, B_{\tau_{i+1}} - B_{\tau_i})_{i\geq 0}$ are i.i.d. and distributed as $(\tau_1, B_{\tau_1})$.

By Theorem 11, $E[\tau_1] = 1$ and $B_{\tau_1}$ is a fair Bernoulli variable. Hence, the discrete process $(S_n)_{n\geq 0} = (B_{\tau_n})_{n\geq 0}$ is a simple symmetric random walk on Z. This is the coupling we were looking for. We set $B^*_n(t) = n^{-1/2} B_{nt}$ and we will show that for every K > 0 we have
$$\sup_{t\in[0,K]} |B^*_n(t) - S^*_n(t)| \to 0, \quad \text{in probability as } n \to \infty. \qquad (4)$$

This will imply the theorem. Indeed, let F : C([0,K],R) → R be a bounded continuous function for the uniform norm over [0,K]. Since for every n ≥ 0, $B^*_n$ is distributed as a standard Brownian motion, we have
$$\big|E[F(S^*_n)] - E[F(B)]\big| = \big|E[F(S^*_n)] - E[F(B^*_n)]\big| \leq E\big[|F(S^*_n) - F(B^*_n)|\big].$$
We then proceed as in the proof of the fact that convergence in probability implies convergence in distribution (Exercise 41). Namely, we argue by contradiction and suppose that we can find infinitely many n's so that the right-hand side of the last display is larger than some ε > 0. Then by (4), we can extract from this sequence a subsequence $n_k$ such that $S^*_{n_k} - B^*_{n_k} \to 0$ almost surely for the uniform norm over [0,K]. We can then use the dominated convergence theorem to get that
$$E\big[|F(S^*_{n_k}) - F(B^*_{n_k})|\big] \to 0, \quad \text{as } k \to \infty,$$
which yields a contradiction.

which yields a contradiction.It thus suffices to prove (4) to show the theorem. We fix K = 1 for simplicity (the general

case is similar). Recall the linear interpolation of S. For every t ∈ [0, 1] we have

|B∗n(t)− S∗n(t)| = |Bnt − S(nt)|√n

.

Since S interpolates between S([nt]) and S([nt] + 1) = S([nt])± 1 we have |S([nt])−S(nt)| ≤ 1.Recall also that with our coupling we have S([nt]) = Bτ[nt]

and so

|B∗n(t)− S∗n(t)| ≤ |Bnt − S([nt])|+ |S(nt)− S([nt])|√n

≤ 1√n

+∣∣∣B∗n(t)−B∗n(τ[nt]/n)

∣∣∣.Taking the sup over [0, 1] we deduce using the notation Modf for the modulus of continuity ofthe continuous function f over the interval [0, 2] that

supt∈[0,1]

|B∗n(t)− S∗n(t)| ≤ 1√n

+ ModB∗n

(supt∈[0,1]

∣∣∣t− τ[nt]

n

∣∣∣ ).Since for every n ≥ 0, B∗n is distributed as a standard Brownian motion B we have the equalityin distribution ModB∗n(η) = ModB(η) for every n ≥ 0 and every η > 0. However, since B isalmost surely continuous we have ModB(η)→ 0 in probability as η → 0 (exercise). We are thusreduced to prove that

supt∈[0,1]

∣∣∣t− τ[nt]

n

∣∣∣→ 0, in probability. (5)

Recall that $\tau_i = (\tau_1-\tau_0) + (\tau_2-\tau_1) + \ldots + (\tau_i-\tau_{i-1})$ is a sum of i.i.d. random variables distributed as $\min(T_1, T_{-1})$, and thus of mean 1 by Theorem 11. It follows from the law of large numbers that
$$\frac{\tau_n}{n} \xrightarrow[n\to\infty]{a.s.} 1.$$
Thanks to Exercise 42 this implies (5) and completes the proof of the theorem.

Exercise 42 (functional law of large numbers). Let $(a_n)$ be a sequence of real numbers such that $n^{-1}a_n \to 1$. Prove that
$$\lim_{n\to\infty}\ \sup_{0\leq k\leq n} \Big|\frac{a_k}{n} - \frac{k}{n}\Big| = 0.$$
Deduce that if $(X_i)_{i\geq 0}$ is a sequence of i.i.d. random variables such that $E[|X_0|] < \infty$ and $E[X_0] = 1$, then the random functions
$$\Big(\frac{X_1 + \ldots + X_{[nt]}}{n}\Big)_{t\geq 0}$$
converge almost surely, uniformly on every compact of R₊, towards the deterministic map t ↦ t.

6.1 From the continuous to the discrete

In this section we use the functional central limit theorem to deduce limiting laws for random walks. In the following, $S_n$ is a random walk with independent increments having finite variance σ² ∈ (0,∞) and zero mean (centered). Also, S is interpolated linearly between integers, and $S^*_n$ denotes the same function as in the last section.

Corollary 2. We have the following convergence in distribution:
$$\frac{1}{\sqrt{n}} \sup_{0\leq k\leq n} S_k \xrightarrow[n\to\infty]{(d)} \sigma\cdot|N|,$$
where N is a standard normal variable.

Proof. By Donsker’s theorem we have S∗n → B in distribution for the topology of uniformconvergence over [0, 1]. Let F be the function F : f ∈ C([0, 1],R) 7→ sup[0,1] f . Using thisnotation we have

1√n

sup0≤k≤n

Sk = σ · F(S∗n).

Since F is continuous for the uniform topology over [0, 1], we deduce by the continuous mappingtheorem that F (S∗n) converges in distribution towards F (B). The statement of the corollarythen follows from the identity B1 = |N | in law, see Theorem 5.

Corollary 3 (Arcsine law of the maximum). We have the following convergence in distribution:
$$\frac{1}{n}\inf\Big\{0 \leq k \leq n : S_k = \sup_{0\leq i\leq n} S_i\Big\} \xrightarrow[n\to\infty]{(d)} \text{arcsine distribution}.$$

Proof. If f is a continuous function over [0,1] we set
$$G(f) = \inf\Big\{t \leq 1 : f(t) = \sup_{[0,1]} f\Big\},$$
so that the left-hand side in the corollary is nothing but $G(S^*_n)$. Since by Donsker's theorem $S^*_n \to B$ in distribution for the uniform norm over [0,1], we would like to deduce that $G(S^*_n) \to G(B)$ in law as n → ∞. The latter variable being arcsine distributed by Theorem 7, that would finish the job...

However, the function G is not continuous for the uniform norm over [0,1] (exercise: find a counterexample). But it turns out that G is continuous at every function f whose maximum over [0,1] is uniquely attained (exercise). Since the local maxima of Brownian motion are distinct (Exercise 26), we deduce that a.s. G is continuous at B (and G is measurable). We then use the enhanced version of the continuous mapping theorem to deduce that $G(S^*_n) \to G(B)$ in distribution as n → ∞.

Exercise 43 (Arcsine law for the last zero). Let $L_n = \sup\{k \in \{0, 1, 2, \ldots, n-1\} : S_k S_{k+1} \leq 0\}$. Show that $n^{-1}L_n$ converges in distribution towards the arcsine distribution.

Exercise 44. * Show that we have the following convergence in distribution:
$$\frac{1}{n^2}\inf\{k \geq 0 : S_k \geq n\} \xrightarrow[n\to\infty]{(d)} \frac{1}{\sigma^2 |N|^2}.$$

Exercise 45. * Let $S_n$ be the simple symmetric random walk on Z. Use the coupling with the Brownian motion from the proof of Theorem 12 to show that
$$P(S_i \geq 0 : i \leq n) \sim P(T_{-1} \geq n) \sim \frac{\sqrt{2}}{\sqrt{\pi n}}, \quad \text{as } n \to \infty.$$

Exercise 46. ** Think about transferring the law of iterated logarithm for the Brownian motiononto the simple symmetric random walk via the coupling of the proof of Theorem 12.

6.2 From the discrete to the continuous

We now use Donsker’s theorem in the other direction, meaning that we will prove results onBrownian motion by first establishing the corresponding result for one particular random walkand then transfer it to Brownian motion using Theorem 12. Obviously, in this section we willfocus on the most simple random walk (Sn) which is the simple symmetric random walk on Z.

6.2.1 Third arcsine law

Theorem 13. The random variable $\int_0^1 dt\, \mathbf{1}_{B_t>0}$ is arcsine distributed.

There is a proof of this theorem directly in the continuous setting, but it is not within the reach of this course. Rather, we first prove a discrete statement which directly implies the theorem.

Lemma 3 (Richard). Let $(S_n)$ be the simple symmetric random walk on Z. Then
$$\#\{k \in \{1,\ldots,n\} : S_k > 0\} \;\overset{(d)}{=}\; \min\Big\{k \in \{0,1,\ldots,n\} : S_k = \max_{0\leq j\leq n} S_j\Big\}.$$

Proof of Theorem 13 using Lemma 3. If f is a continuous function over [0,1] we let
$$H(f) = \int_0^1 dt\, \mathbf{1}_{f(t)>0}.$$
Using this notation, it is easy to see that
$$\frac{1}{n}\#\{k \in \{1,\ldots,n\} : S_k > 0\} + \frac{\#\{k \leq n : S_{k-1} = 1 \text{ and } S_k = 0\}}{n} = H(S^*_n).$$
It is easy to show that the second term of the left-hand side is small; indeed,
$$E\big[\#\{k \leq n : S_k = 0\}\big] = \sum_{k=0}^{n} P(S_k = 0) \sim C\sqrt{n},$$
so that $n^{-1}\#\{k \leq n : S_k = 0\}$ converges in probability to 0 as n → ∞. Combining this observation with Lemma 3 and Corollary 3, we deduce that $H(S^*_n)$ converges in distribution to the arcsine law.

On the other hand, by Donsker's theorem we have $S^*_n \to B$ in distribution uniformly over [0,1]. We are tempted to say that this implies $H(S^*_n) \to H(B)$ in distribution... which would imply the theorem. But the function H is not continuous. However, an exercise shows that the function H is continuous at every function f such that
$$\lim_{\varepsilon\to 0} \int_0^1 dt\, \mathbf{1}_{f(t)\in[0,\varepsilon]} = 0,$$
or equivalently at every function f such that
$$\int_0^1 dt\, \mathbf{1}_{f(t)=0} = 0,$$
which is almost surely the case for Brownian motion, as seen in Theorem 6. The statement thus follows from an application of the enhanced continuous mapping theorem.
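As with the other arcsine laws, Theorem 13 and the discrete statement behind it are easy to probe numerically (a sketch, not part of the original notes): the fraction of strictly positive times of a long simple random walk should be approximately arcsine distributed.

    import numpy as np

    rng = np.random.default_rng(0)
    n_paths, n_steps = 10_000, 2_000
    steps = rng.choice([-1, 1], size=(n_paths, n_steps))
    S = np.cumsum(steps, axis=1)
    frac_positive = (S > 0).mean(axis=1)   # H(S*_n) up to the small boundary term
    for t in (0.1, 0.5, 0.9):
        print(t, (frac_positive <= t).mean(), 2 / np.pi * np.arcsin(np.sqrt(t)))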

Proof of Richard’s lemma. We see a random walk as a list [+,+,−,+ . . .] ordered from tail tohead. The standard random walk S just grows by adding ± to the head. We define a newrandom walk S by re-arranging the steps of S. Inductively we apply the following procedure:Initially the walk is empty list [], and then for every k ≥ 1, if Sk > 0 then the step Sk − Sk−1 isinserted at the tail of the list S otherwise it is inserted at the head of S. Here is an example ofthe walk S and the associated walk S at times goes on:

[]↔ [] [−]↔ [−] [−,+]↔ [−,+] [−,+,+]↔ [+,−,+] [−,+,+,−]↔ [+,−,+,−]

It is easy to check (exercise) by induction that

#k ∈ 1, . . . , n : Sk > 0

= min

k ∈ 0, 1, . . . , n : Sk = max

0≤j≤nSj.

To prove the lemma, we will now show that for every fixed n ≥ 0, the list $\widetilde{S}_n$ has the same distribution as $S_n$. We here also use induction. The statement is clear for n = 0 and n = 1. Let $(i_1, \ldots, i_{n+1}) \in \{+,-\}^{n+1}$ and let us compute $P(\widetilde{S}_{n+1} = [i_1, \ldots, i_{n+1}])$. By construction, if $I_{n+1}$ is the (n+1)-st increment of S, we have (note that $S_{n+1} = i_1 + \cdots + i_{n+1}$, since $\widetilde{S}_{n+1}$ is a rearrangement of the steps of S)
$$P(\widetilde{S}_{n+1} = [i_1,\ldots,i_{n+1}]) = P(\widetilde{S}_{n+1} = [i_1,\ldots,i_{n+1}])\,\mathbf{1}_{i_1+\cdots+i_{n+1}>0} + P(\widetilde{S}_{n+1} = [i_1,\ldots,i_{n+1}])\,\mathbf{1}_{i_1+\cdots+i_{n+1}\leq 0}$$
$$= P\big(\widetilde{S}_n = [i_2,\ldots,i_{n+1}],\ I_{n+1} = i_1\big)\,\mathbf{1}_{i_1+\cdots+i_{n+1}>0} + P\big(\widetilde{S}_n = [i_1,\ldots,i_n],\ I_{n+1} = i_{n+1}\big)\,\mathbf{1}_{i_1+\cdots+i_{n+1}\leq 0}$$
$$= 2^{-n}\cdot\frac{1}{2}\,\big(\mathbf{1}_{i_1+\cdots+i_{n+1}>0} + \mathbf{1}_{i_1+\cdots+i_{n+1}\leq 0}\big) = 2^{-n-1},$$
where we used the independence of the steps of S (together with the induction hypothesis) to go from the second to the third line: note that $\widetilde{S}_n$ only depends on the first n steps of S and is thus independent of $I_{n+1}$.

6.2.2 Levy’s B −B

Theorem 14 (Lévy). The process $(\overline{B}_t - B_t)_{t\geq 0}$ has the same law as $(|B_t|)_{t\geq 0}$.

Sketch of the proof. Let $(S_n)$ be the simple symmetric random walk and recall the notation $S^*_n$. Fix K > 0. We introduce the supremum process $\overline{S}^*_n(t) = \sup_{s\leq t} S^*_n(s)$. By Donsker's theorem (and the continuous mapping theorem) we have
$$\overline{S}^*_n - S^*_n \longrightarrow \overline{B} - B \quad \text{and} \quad |S^*_n| \longrightarrow |B| \qquad (6)$$
in distribution, uniformly over [0,K]. On the other hand, the process $X_n = \sup_{k\leq n} S_k - S_n$ is a Markov chain whose transition probabilities are exactly the same as those of the Markov chain $|S_n|$, except at 0, where X stays at 0 with probability 1/2. In other words, $X_n$ can be seen as the Markov chain $|S_n|$ where, each time $|S_n|$ touches 0, it stays there for an independent geometric number of steps of parameter 1/2. We can thus couple the chain $(X_n)$ with the chain $(|S_n|)$ so that
$$X_{n+\xi_n} = |S_n|,$$
where $\xi_n = \sum_{i=0}^{n} \mathbf{1}_{|S_i|=0}\, G_i$ with i.i.d. $G_i$ geometric of parameter 1/2. Using estimates similar to those in the proof of Theorem 13, we have $n^{-1}\sup_{i\leq n}\xi_i = n^{-1}\xi_n \to 0$ in probability. We thus deduce, using (6), that (with an abuse of notation) we have the following convergences in distribution, uniformly over [0,K]:
$$\overline{B} - B \longleftarrow X^*_n \approx |S^*_n| \longrightarrow |B|.$$

This theorem sheds new light on Theorem 7. Indeed, it is now clear that M and L have the same distribution: the law of the last zero of |B| (equivalently, of B) before time 1 coincides with the law of the last zero before time 1 of $\overline{B} - B$, which is nothing but the last time B reaches its maximum over [0,1].
