University of Cambridge Mathematics Tripos Part II Probability and Measure Michaelmas, 2018 Lectures by E. Breuillard Notes by Qiangru Kuang


Contents

1 Lebesgue measure
    1.1 Boolean algebra
    1.2 Jordan measure
    1.3 Lebesgue measure
2 Abstract measure theory
3 Integration and measurable functions
4 Product measures
5 Foundations of probability theory
6 Independence
    6.1 Useful inequalities
7 Convergence of random variables
8 $L^p$ spaces
9 Hilbert space and $L^2$-methods
10 Fourier transform
11 Gaussians
12 Ergodic theory
    12.1 The canonical model
Index

1 Lebesgue measure

1.1 Boolean algebra

Definition (Boolean algebra). Let $X$ be a set. A Boolean algebra on $X$ is a family of subsets of $X$ which

1. contains $\emptyset$,

2. is stable under finite unions and complementation.

Example.

• The trivial Boolean algebra $\mathcal{B} = \{\emptyset, X\}$.

• The discrete Boolean algebra $\mathcal{B} = 2^X$, the family of all subsets of $X$.

• Less trivially, if $X$ is a topological space, the family of constructible sets forms a Boolean algebra, where a constructible set is a finite union of locally closed sets, i.e. sets of the form $E = U \cap F$ where $U$ is open and $F$ is closed.

Definition (finitely additive measure). Let $X$ be a set and $\mathcal{B}$ a Boolean algebra on $X$. A finitely additive measure on $(X, \mathcal{B})$ is a function $m : \mathcal{B} \to [0, +\infty]$ such that

1. $m(\emptyset) = 0$,

2. $m(E \cup F) = m(E) + m(F)$ whenever $E, F \in \mathcal{B}$ and $E \cap F = \emptyset$.

Example.

1. Counting measure: $m(E) = \#E$, the cardinality of $E$, where $\mathcal{B}$ is the discrete Boolean algebra of $X$.

2. More generally, given $f : X \to [0, +\infty]$, define for $E \subseteq X$,

$$m(E) = \sum_{e \in E} f(e).$$

3. Suppose $X = \bigsqcup_{i=1}^N X_i$, then define $\mathcal{B}(X)$ to be the family of unions of $X_i$'s. Assign a weight $a_i \ge 0$ to each $X_i$ and define $m(E) = \sum_{i : X_i \subseteq E} a_i$ for $E \in \mathcal{B}$.
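The third example can be checked by brute force on a small set. The following is a minimal sketch, not part of the lectures: the set `X`, the partition `blocks`, the `weights` and the helper names are all illustrative choices. It builds the family of unions of blocks and verifies the Boolean algebra axioms and finite additivity.

```python
from itertools import combinations

def unions_of_blocks(blocks):
    """Return every union of blocks of a finite partition, as frozensets."""
    family = set()
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            family.add(frozenset().union(*chosen))
    return family

def weighted_measure(blocks, weights):
    """m(E) = sum of the weights a_i over the blocks X_i contained in E."""
    def m(E):
        return sum(a for block, a in zip(blocks, weights) if block <= E)
    return m

# Hypothetical data: a 6-point set split into three blocks with weights 2, 0, 5.
X = frozenset(range(6))
blocks = [frozenset({0, 1}), frozenset({2}), frozenset({3, 4, 5})]
weights = [2, 0, 5]

algebra = unions_of_blocks(blocks)
m = weighted_measure(blocks, weights)

contains_empty = frozenset() in algebra
closed_under_union = all(E | F in algebra for E in algebra for F in algebra)
closed_under_complement = all(X - E in algebra for E in algebra)
finitely_additive = all(
    m(E | F) == m(E) + m(F)
    for E in algebra for F in algebra if not (E & F)
)
```

With three blocks the algebra has $2^3 = 8$ elements, one for each choice of blocks to include.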

1.2 Jordan measure

This section is a historical review and provides intuition for Lebesgue measure theory. We'll gloss over details of proofs in this section.

Definition. A subset of $\mathbb{R}^d$ is called elementary if it is a finite union of boxes, where a box is a set $B = I_1 \times \cdots \times I_d$ where each $I_i$ is a finite interval of $\mathbb{R}$.

Proposition 1.1. Let $B \subseteq \mathbb{R}^d$ be a box. Let $\mathcal{E}(B)$ be the family of elementary subsets of $B$. Then

1. $\mathcal{E}(B)$ is a Boolean algebra on $B$,

2. every $E \in \mathcal{E}(B)$ is a disjoint finite union of boxes,

3. if $E \in \mathcal{E}(B)$ can be written as a disjoint finite union in two ways, $E = \bigcup_{i=1}^n B_i = \bigcup_{j=1}^m B'_j$, then $\sum_{i=1}^n |B_i| = \sum_{j=1}^m |B'_j|$, where $|B| = \prod_{i=1}^d |b_i - a_i|$ if $B = I_1 \times \cdots \times I_d$ and $I_i$ has endpoints $a_i, b_i$.

Following this, we can define a finitely additive measure corresponding to our intuition of length, area, volume etc:

Proposition 1.2. Define $m(E) = \sum_{i=1}^n |B_i|$ if $E$ is any elementary set and is the disjoint union of boxes $B_i \subseteq \mathbb{R}^d$. Then $m$ is a finitely additive measure on $\mathcal{E}(B)$ for any box $B$.
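In dimension $d = 1$ the two propositions are easy to experiment with. The sketch below is an illustration under our own conventions (intervals are given as pairs of endpoints, and the function names are hypothetical): it rewrites a finite union of intervals as a disjoint union and sums the lengths, so that two different descriptions of the same set give the same value.

```python
from fractions import Fraction

def disjoint_pieces(intervals):
    """Rewrite a finite union of intervals (a, b) as a union of disjoint ones."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            # Overlaps (or touches) the previous piece: extend it.
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return [tuple(piece) for piece in merged]

def elementary_measure(intervals):
    """m(E) = total length of the disjoint pieces of E."""
    return sum(b - a for a, b in disjoint_pieces(intervals))

half = Fraction(1, 2)
# Two descriptions of the same elementary set [0, 2] ∪ [3, 4].
first = [(0, 1), (half, 2), (3, 4)]
second = [(0, 2), (3, 3 + half), (3 + half, 4)]
```

Endpoints do not affect the value, which is why the sketch does not track whether intervals are open or closed.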

Definition. A subset $E \subseteq \mathbb{R}^d$ is Jordan measurable if for any $\varepsilon > 0$ there are elementary sets $A, B$ with $A \subseteq E \subseteq B$ and $m(B \setminus A) < \varepsilon$.

Remark. Jordan measurable sets are bounded.

Proposition 1.3. If a set $E \subseteq \mathbb{R}^d$ is Jordan measurable, then

$$\sup_{A \subseteq E \text{ elementary}} m(A) = \inf_{B \supseteq E \text{ elementary}} m(B).$$

In which case we define the Jordan measure of $E$ as

$$m(E) = \sup_{A \subseteq E} m(A).$$

Proof. Write $\sup$ and $\inf$ for the two sides. If $A \subseteq E \subseteq B$ are elementary then $m(A) \le m(B)$, so $\sup \le \inf$. Conversely, given $\varepsilon > 0$, Jordan measurability gives elementary sets $A \subseteq E \subseteq B$ with $m(B \setminus A) < \varepsilon$, and then

$$\inf \le m(B) = m(A) + m(B \setminus A) \le \sup + m(B \setminus A) \le \sup + \varepsilon.$$

As $\varepsilon > 0$ is arbitrary, they are equal.

Exercise.

1. If $B$ is a box, the family $\mathcal{J}(B)$ of Jordan measurable subsets of $B$ is a Boolean algebra.

2. A subset $E \subseteq [0, 1]$ is Jordan measurable if and only if $1_E$, the indicator function of $E$, is Riemann integrable.

1.3 Lebesgue measure

Although Jordan measure corresponds to the intuition of length, area and volume, it suffers from a few severe problems:

1. unbounded sets in $\mathbb{R}^d$ are not Jordan measurable.

2. $1_{\mathbb{Q} \cap [0,1]}$ is not Riemann integrable, and therefore $\mathbb{Q} \cap [0, 1]$ is not Jordan measurable.

3. a pointwise limit of Riemann integrable functions need not be Riemann integrable: $f_n := 1_{\frac{1}{n!}\mathbb{Z} \cap [0,1]} \to 1_{\mathbb{Q} \cap [0,1]}$ pointwise, but the limit is not Riemann integrable.

The idea of Lebesgue is to use countable covers by boxes.

Definition. A subset $E \subseteq \mathbb{R}^d$ is Lebesgue measurable if for all $\varepsilon > 0$, there exists a countable union of boxes $C$ with $E \subseteq C$ and $m^*(C \setminus E) < \varepsilon$, where $m^*$, the Lebesgue outer measure, is defined as

$$m^*(E) = \inf\Big\{ \sum_{i \ge 1} |B_i| : E \subseteq \bigcup_{i \ge 1} B_i,\ B_i \text{ boxes} \Big\}$$

for every subset $E \subseteq \mathbb{R}^d$.

Remark. wlog in these definitions we may assume that boxes are open.

Proposition 1.4. The family $\mathcal{L}$ of Lebesgue measurable subsets of $\mathbb{R}^d$ is a Boolean algebra stable under countable unions.

Lemma 1.5.

1. $m^*$ is monotone: if $E \subseteq F$ then $m^*(E) \le m^*(F)$.

2. $m^*$ is countably subadditive: if $E = \bigcup_{n \ge 1} E_n$ where $E_n \subseteq \mathbb{R}^d$ then

$$m^*(E) \le \sum_{n \ge 1} m^*(E_n).$$

Proof. Monotonicity is obvious. For countable subadditivity, pick $\varepsilon > 0$ and let $C_n = \bigcup_{i \ge 1} C_{n,i}$ where $C_{n,i}$ are boxes such that $E_n \subseteq C_n$ and

$$\sum_{i \ge 1} |C_{n,i}| \le m^*(E_n) + \frac{\varepsilon}{2^n}.$$

Then

$$\sum_{n \ge 1} \sum_{i \ge 1} |C_{n,i}| \le \sum_{n \ge 1} \Big( m^*(E_n) + \frac{\varepsilon}{2^n} \Big) = \varepsilon + \sum_{n \ge 1} m^*(E_n)$$

and $E \subseteq \bigcup_{n \ge 1} C_n = \bigcup_{n \ge 1} \bigcup_{i \ge 1} C_{n,i}$, so

$$m^*(E) \le \varepsilon + \sum_{n \ge 1} m^*(E_n)$$

for all $\varepsilon > 0$.
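The $\varepsilon / 2^n$ trick used in this proof also shows why $\mathbb{Q} \cap [0, 1]$, which is not Jordan measurable, has outer measure $0$: cover the $n$-th rational by an interval of length $\varepsilon / 2^n$. The sketch below is only an illustration; the particular enumeration of the rationals and the function names are our own choices, and only finitely many intervals are actually constructed.

```python
from fractions import Fraction
from itertools import islice

def rationals_in_unit_interval():
    """Enumerate Q ∩ [0, 1] without repetition: 0, 1, 1/2, 1/3, 2/3, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            x = Fraction(p, q)
            if x.denominator == q:  # skip values already listed in lower terms
                yield x
        q += 1

def cover(eps, count):
    """Cover the first `count` rationals; the n-th interval has length eps / 2**n."""
    boxes = []
    points = islice(rationals_in_unit_interval(), count)
    for n, x in enumerate(points, start=1):
        half_width = eps / 2 ** (n + 1)
        boxes.append((x - half_width, x + half_width))
    return boxes

eps = Fraction(1, 100)
boxes = cover(eps, 50)
total_length = sum(b - a for a, b in boxes)
```

However many intervals are used, the total length stays below $\varepsilon$ because $\sum_{n \ge 1} \varepsilon / 2^n = \varepsilon$, so $m^*(\mathbb{Q} \cap [0, 1]) \le \varepsilon$ for every $\varepsilon > 0$.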


Remark. Note that $m^*$ is not additive on the family of all subsets of $\mathbb{R}^d$. However, it will be on $\mathcal{L}$, as we will show later.

Lemma 1.6. If $A, B$ are disjoint compact subsets of $\mathbb{R}^d$ then

$$m^*(A \cup B) = m^*(A) + m^*(B).$$

Proof. $\le$ holds by the previous lemma, so we need to show $\ge$. Pick $\varepsilon > 0$. Let $A \cup B \subseteq \bigcup_{n \ge 1} B_n$ where $B_n$ are open boxes such that

$$\sum_{n \ge 1} |B_n| \le m^*(A \cup B) + \varepsilon.$$

wlog we may assume that the side lengths of each $B_n$ are $< \alpha / 2$, where

$$\alpha = \inf\{ \|x - y\|_\infty : x \in A,\ y \in B \} > 0,$$

where the inequality comes from the fact that $A$ and $B$ are disjoint and compact. In particular no $B_n$ intersects both $A$ and $B$. wlog we may discard the $B_n$'s that do not intersect $A \cup B$. Then by construction

$$\sum_{n \ge 1} |B_n| = \sum_{n \ge 1,\ B_n \cap A \ne \emptyset} |B_n| + \sum_{n \ge 1,\ B_n \cap B \ne \emptyset} |B_n| \ge m^*(A) + m^*(B)$$

so

$$\varepsilon + m^*(A \cup B) \ge m^*(A) + m^*(B)$$

for all $\varepsilon$.

Lemma 1.7. If $E \subseteq \mathbb{R}^d$ has $m^*(E) = 0$ then $E \in \mathcal{L}$.

Definition (null set). A set $E \subseteq \mathbb{R}^d$ such that $m^*(E) = 0$ is called a null set.

Proof. For all $\varepsilon > 0$, there exists $C = \bigcup_{n \ge 1} B_n$ where $B_n$ are boxes such that $E \subseteq C$ and $\sum_{n \ge 1} |B_n| \le \varepsilon$. But

$$m^*(C \setminus E) \le m^*(C) \le \varepsilon.$$

Lemma 1.8. Every open subset of $\mathbb{R}^d$ and every closed subset of $\mathbb{R}^d$ is in $\mathcal{L}$.

We will prove the lemma using the fact that the family of Lebesgue measurable subsets is stable under countable unions, which itself does not use this lemma. This lemma, however, will be used to show the stability under complementation. Since the proof is quite technical (it has more to do with general topology than measure theory), for brevity and fluency of ideas we present the proof of the main proposition first.

Proof of Proposition 1.4. It is obvious that $\emptyset \in \mathcal{L}$. To show it is stable under countable unions, start with $E_n \in \mathcal{L}$ for $n \ge 1$. We need to show $E := \bigcup_{n \ge 1} E_n \in \mathcal{L}$.

Pick $\varepsilon > 0$. By assumption there exist $C_n = \bigcup_{i \ge 1} B_{n,i}$ where $B_{n,i}$ are boxes such that $E_n \subseteq C_n$ and

$$m^*(C_n \setminus E_n) < \frac{\varepsilon}{2^n}.$$

Now

$$E = \bigcup_{n \ge 1} E_n \subseteq \bigcup_{n \ge 1} C_n =: C$$

so $C$ is again a countable union of boxes and $C \setminus E \subseteq \bigcup_{n \ge 1} (C_n \setminus E_n)$, so

$$m^*(C \setminus E) \le \sum_{n \ge 1} m^*(C_n \setminus E_n) \le \sum_{n \ge 1} \frac{\varepsilon}{2^n} = \varepsilon$$

by countable subadditivity, so $E \in \mathcal{L}$.

To show it is stable under complementation, suppose $E \in \mathcal{L}$. By assumption, for each $n \ge 1$ there exists $C_n$, a countable union of boxes, with $E \subseteq C_n$ and $m^*(C_n \setminus E) \le \frac{1}{n}$. wlog we may assume the boxes are open, so $C_n$ is open and $C_n^c$ is closed, so $C_n^c \in \mathcal{L}$ by Lemma 1.8. Thus $\bigcup_{n \ge 1} C_n^c \in \mathcal{L}$ by the first part of the proof. But for every $n$

$$m^*\Big( E^c \setminus \bigcup_{k \ge 1} C_k^c \Big) \le m^*(E^c \setminus C_n^c) = m^*(C_n \setminus E) \le \frac{1}{n}$$

so $m^*(E^c \setminus \bigcup_{n \ge 1} C_n^c) = 0$, so $E^c \setminus \bigcup_{n \ge 1} C_n^c \in \mathcal{L}$ since it is a null set. But

$$E^c = \Big( E^c \setminus \bigcup_{n \ge 1} C_n^c \Big) \cup \bigcup_{n \ge 1} C_n^c,$$

both of which are in $\mathcal{L}$, so $E^c \in \mathcal{L}$.

Proof of Lemma 1.8. Every open set in $\mathbb{R}^d$ is a countable union of boxes so is in $\mathcal{L}$. It is more subtle for closed sets. The key observation is that every closed set is a countable union of compact subsets, so we are left to show that compact sets of $\mathbb{R}^d$ are in $\mathcal{L}$.

Let $F \subseteq \mathbb{R}^d$ be compact. For all $k \ge 1$, there exists a countable union of open boxes $O_k := \bigcup_{i \ge 1} O_{k,i}$ such that $F \subseteq O_k$ and

$$\sum_{i \ge 1} |O_{k,i}| \le m^*(F) + \frac{1}{2^k}.$$

By compactness there exists a finite subcover, so we can assume $O_k$ is a finite union of open boxes. Moreover, wlog assume that

1. the side lengths of $O_{k,i}$ are $\le \frac{1}{2^k}$,

2. for each $i$, $O_{k,i}$ intersects $F$,

3. $O_{k+1} \subseteq O_k$ (by replacing $O_{k+1}$ with $O_{k+1} \cap O_k$ iteratively).

Then $F = \bigcap_{k \ge 1} O_k$ and we are left to show $m^*(O_k \setminus F) \to 0$. By additivity on disjoint compact sets,

$$m^*(F) + m^*(O_i \setminus O_{i+1}) = m^*(F \cup (O_i \setminus O_{i+1}))$$

so

$$m^*(F) + m^*(O_i \setminus O_{i+1}) \le m^*(O_i) \le \sum_{j \ge 1} |O_{i,j}| \le m^*(F) + \frac{1}{2^i}$$

so $m^*(O_i \setminus O_{i+1}) \le \frac{1}{2^i}$. Finally,

$$m^*(O_k \setminus F) = m^*\Big( \bigcup_{i \ge k} (O_i \setminus O_{i+1}) \Big) \le \sum_{i \ge k} m^*(O_i \setminus O_{i+1}) \le \sum_{i \ge k} \frac{1}{2^i} = \frac{1}{2^{k-1}}.$$

The result we're working towards is

Proposition 1.9. $m^*$ is countably additive on $\mathcal{L}$, i.e. if $(E_n)_{n \ge 1}$ where $E_n \in \mathcal{L}$ are pairwise disjoint then

$$m^*\Big( \bigcup_{n \ge 1} E_n \Big) = \sum_{n \ge 1} m^*(E_n).$$

Lemma 1.10. If $E \in \mathcal{L}$ then for all $\varepsilon > 0$ there exist $U$ open and $F$ closed with $F \subseteq E \subseteq U$ such that $m^*(U \setminus E) < \varepsilon$ and $m^*(E \setminus F) < \varepsilon$.

Proof. By definition of $\mathcal{L}$, there exists a countable union of open boxes $E \subseteq \bigcup_{n \ge 1} B_n$ such that $m^*(\bigcup_{n \ge 1} B_n \setminus E) < \varepsilon$. Just take $U = \bigcup_{n \ge 1} B_n$, which is open.

For $F$ do the same with $E^c = \mathbb{R}^d \setminus E$ in place of $E$: this gives an open set $U' \supseteq E^c$ with $m^*(U' \setminus E^c) < \varepsilon$, and $F := (U')^c$ is closed with $F \subseteq E$ and $E \setminus F = U' \setminus E^c$.

Proof of Proposition 1.9. First we assume each $E_n$ is compact. By Lemma 1.6, $m^*$ is additive on disjoint compact sets, so for all $N \in \mathbb{N}$,

$$m^*\Big( \bigcup_{n=1}^N E_n \Big) = \sum_{n=1}^N m^*(E_n).$$

In particular

$$\sum_{n=1}^N m^*(E_n) \le m^*\Big( \bigcup_{n \ge 1} E_n \Big)$$

since $m^*$ is monotone. Take $N \to \infty$ to get one inequality. The other direction holds by countable subadditivity of $m^*$.

Now assume that each $E_n$ is a bounded subset in $\mathcal{L}$. By Lemma 1.10 there exists $K_n \subseteq E_n$ closed, so compact, such that $m^*(E_n \setminus K_n) \le \frac{\varepsilon}{2^n}$. Since the $K_n$'s are disjoint, by the previous case

$$m^*\Big( \bigcup_{n \ge 1} K_n \Big) = \sum_{n \ge 1} m^*(K_n)$$

then

$$\sum_{n \ge 1} m^*(E_n) \le \sum_{n \ge 1} \big( m^*(K_n) + m^*(E_n \setminus K_n) \big) \le m^*\Big( \bigcup_{n \ge 1} K_n \Big) + \sum_{n \ge 1} \frac{\varepsilon}{2^n} \le m^*\Big( \bigcup_{n \ge 1} E_n \Big) + \varepsilon$$

which gives one direction of the inequality. As before, the other direction holds by countable subadditivity of $m^*$.

For the general case, note that $\mathbb{R}^d = \bigcup_{n \in \mathbb{Z}^d} A_n$ where the $A_n$ are disjoint, bounded and in $\mathcal{L}$, for example by taking $A_n$ to be a product of half open intervals of unit length. Write $E_n$ as $\bigcup_{m \in \mathbb{Z}^d} (E_n \cap A_m)$ and apply the previous results to $(E_n \cap A_m)_{n \ge 1,\ m \in \mathbb{Z}^d}$.

Definition (Lebesgue measure). $m^*$ when restricted to $\mathcal{L}$ is called the Lebesgue measure and is simply denoted by $m$.

Example (Vitali counterexample). Although $\mathcal{L}$ is pretty big (it includes all open and closed sets, countable unions and intersections of them, and has cardinality at least $2^{\mathfrak{c}}$ where $\mathfrak{c}$ is the continuum, by considering a null set with cardinality $\mathfrak{c}$, and each subset thereof), it does not include every subset of $\mathbb{R}^d$.

Consider $(\mathbb{Q}, +)$, the additive subgroup of $(\mathbb{R}, +)$. Pick a set $E$ of representatives of the cosets of $(\mathbb{Q}, +)$, chosen inside $[0, 1]$ (here we require the axiom of choice). Thus for each $x \in \mathbb{R}$, there exists a unique $e \in E$ such that $x - e \in \mathbb{Q}$. We claim that $E \notin \mathcal{L}$ and $m^*$ is not additive on the family of all subsets of $\mathbb{R}^d$.

Proof. Pick distinct rationals $p_1, \dots, p_N$ in $[0, 1]$. The sets $p_i + E$ are pairwise disjoint, so if $m^*$ were additive then we would have

$$m^*\Big( \bigcup_{i=1}^N (p_i + E) \Big) = \sum_{i=1}^N m^*(p_i + E) = N m^*(E)$$

by translation invariance of $m^*$. But then

$$\bigcup_{i=1}^N (p_i + E) \subseteq [0, 2]$$

since $E \subseteq [0, 1]$, so by monotonicity of $m^*$ we have

$$m^*\Big( \bigcup_{i=1}^N (p_i + E) \Big) \le 2$$

so $N m^*(E) \le 2$ for all $N$, so $m^*(E) = 0$. But

$$[0, 1] \subseteq \bigcup_{q \in \mathbb{Q}} (E + q) = \mathbb{R},$$

so by countable subadditivity of $m^*$,

$$1 = m^*([0, 1]) \le \sum_{q \in \mathbb{Q}} m^*(E + q) = 0.$$

Absurd.

In particular $E \notin \mathcal{L}$ as $m^*$ is additive on $\mathcal{L}$.

2 Abstract measure theory

In this chapter we extend measure theory to arbitrary sets. Most of the theory was developed by Fréchet and Carathéodory.

Definition ($\sigma$-algebra). A $\sigma$-algebra on a set $X$ is a Boolean algebra stable under countable unions.

Definition (measurable space). A measurable space is a pair $(X, \mathcal{A})$ where $X$ is a set and $\mathcal{A}$ is a $\sigma$-algebra on $X$.

Definition (measure). A measure on $(X, \mathcal{A})$ is a map $\mu : \mathcal{A} \to [0, \infty]$ such that

1. $\mu(\emptyset) = 0$,

2. $\mu$ is countably additive (also known as $\sigma$-additive), i.e. for every family $(A_n)_{n \ge 1}$ of disjoint subsets in $\mathcal{A}$, we have

$$\mu\Big( \bigcup_{n \ge 1} A_n \Big) = \sum_{n \ge 1} \mu(A_n).$$

The triple $(X, \mathcal{A}, \mu)$ is called a measure space.

Example.

1. $(\mathbb{R}^d, \mathcal{L}, m)$ is a measure space.

2. $(X, 2^X, \#)$ where $\#$ is the counting measure.

Proposition 2.1. Let $(X, \mathcal{A}, \mu)$ be a measure space. Then

1. $\mu$ is monotone: $A \subseteq B$ implies $\mu(A) \le \mu(B)$,

2. $\mu$ is countably subadditive: $\mu(\bigcup_{n \ge 1} A_n) \le \sum_{n \ge 1} \mu(A_n)$,

3. upward monotone convergence: if

$$E_1 \subseteq E_2 \subseteq \cdots \subseteq E_n \subseteq \dots$$

then

$$\mu\Big( \bigcup_{n \ge 1} E_n \Big) = \lim_{n \to \infty} \mu(E_n) = \sup_{n \ge 1} \mu(E_n).$$

4. downward monotone convergence: if

$$E_1 \supseteq E_2 \supseteq \cdots \supseteq E_n \supseteq \dots$$

and $\mu(E_1) < \infty$ then

$$\mu\Big( \bigcap_{n \ge 1} E_n \Big) = \lim_{n \to \infty} \mu(E_n) = \inf_{n \ge 1} \mu(E_n).$$

Proof.

1. $\mu(B) = \mu(A) + \mu(B \setminus A) \ge \mu(A)$ by additivity of $\mu$, since $\mu(B \setminus A) \ge 0$.

2. See example sheet. The idea is that every countable union $\bigcup_{n \ge 1} A_n$ is a disjoint countable union $\bigcup_{n \ge 1} B_n$ where for each $n$, $B_n \subseteq A_n$. It then follows by $\sigma$-additivity.

3. Let $E_0 = \emptyset$ so

$$\bigcup_{n \ge 1} E_n = \bigcup_{n \ge 1} (E_n \setminus E_{n-1}),$$

a disjoint union. By $\sigma$-additivity,

$$\mu\Big( \bigcup_{n \ge 1} E_n \Big) = \sum_{n \ge 1} \mu(E_n \setminus E_{n-1})$$

but for all $N$, by additivity of $\mu$,

$$\sum_{n=1}^N \mu(E_n \setminus E_{n-1}) = \mu(E_N)$$

so take the limit. The supremum part is obvious.

4. Apply the previous result to $E_1 \setminus E_n$.

Remark. Note the $\mu(E_1) < \infty$ condition in the last part. Counterexample: $E_n = [n, \infty) \subseteq \mathbb{R}$, where every $E_n$ has infinite Lebesgue measure but the intersection is empty.

Definition ($\sigma$-algebra generated by a family). Let $X$ be a set and $\mathcal{F}$ be some family of subsets of $X$. Then the intersection of all $\sigma$-algebras on $X$ containing $\mathcal{F}$ is a $\sigma$-algebra, called the $\sigma$-algebra generated by $\mathcal{F}$ and denoted by $\sigma(\mathcal{F})$.

Proof. Easy check. See example sheet.

Example.

1. Suppose $X = \bigsqcup_{i=1}^N X_i$, i.e. $X$ admits a finite partition. Let $\mathcal{F} = \{X_1, \dots, X_N\}$, then $\sigma(\mathcal{F})$ consists of all subsets that are unions of $X_i$'s.

2. Suppose $X$ is countable and let $\mathcal{F}$ be the collection of all singletons. Then $\sigma(\mathcal{F}) = 2^X$.
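On a finite set $X$ both examples can be verified mechanically, since countable unions reduce to finite ones and $\sigma(\mathcal{F})$ can be computed by closing $\mathcal{F}$ under complements and unions until nothing new appears. The following is a minimal sketch with a hypothetical helper `generated_sigma_algebra`; it is a brute-force illustration and says nothing about the infinite case.

```python
def generated_sigma_algebra(X, family):
    """Close `family` under complement and pairwise union until stable.

    Assumes X is finite, so the fixed point is sigma(family).
    """
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(F) for F in family}
    while True:
        new = {X - A for A in sets} | {A | B for A in sets for B in sets}
        if new <= sets:
            return sets
        sets |= new

# Example 1: a partition of a 6-point set into three blocks.
from_partition = generated_sigma_algebra(range(6), [{0, 1}, {2}, {3, 4, 5}])

# Example 2: all singletons of a 4-point set.
from_singletons = generated_sigma_algebra(range(4), [{0}, {1}, {2}, {3}])
```

The first family has $2^3$ elements (one per choice of blocks) and the second is the full power set with $2^4$ elements.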

Definition (Borel $\sigma$-algebra). Let $X$ be a topological space. The $\sigma$-algebra generated by open subsets of $X$ is called the Borel $\sigma$-algebra of $X$, denoted by $\mathcal{B}(X)$.

Proposition 2.2. If $X = \mathbb{R}^d$ then $\mathcal{B}(X) \subseteq \mathcal{L}$. Moreover every $A \in \mathcal{L}$ can be written as a disjoint union $A = B \cup N$ where $B \in \mathcal{B}(X)$ and $N$ is a null set.

Proof. We've shown that $\mathcal{L}$ is a $\sigma$-algebra and contains all open sets so $\mathcal{B}(X) \subseteq \mathcal{L}$. Given $A \in \mathcal{L}$, $A^c \in \mathcal{L}$ so for all $n \ge 1$ there exists $C_n$, a countable union of (open) boxes, such that $A^c \subseteq C_n$ and $m^*(C_n \setminus A^c) \le \frac{1}{n}$. Take $C = \bigcap_{n \ge 1} C_n \in \mathcal{B}(X)$. Thus $B := C^c \in \mathcal{B}(X)$, $B \subseteq A$, and $m(A \setminus B) = 0$ because $A \setminus B = C \setminus A^c$. Take $N := A \setminus B$.

Remark.

1. It can be shown that $\mathcal{B}(\mathbb{R}^d) \subsetneq \mathcal{L}$. In fact $|\mathcal{L}| \ge 2^{\mathfrak{c}}$ and $|\mathcal{B}(\mathbb{R}^d)| = \mathfrak{c}$.

2. If $\mathcal{F}$ is a family of subsets of a set $X$, the Boolean algebra generated by $\mathcal{F}$ can be explicitly described as

$$\mathcal{B}(\mathcal{F}) = \{ \text{finite unions of } F_1 \cap \cdots \cap F_N : F_i \in \mathcal{F} \text{ or } F_i^c \in \mathcal{F} \}.$$

3. However, this is not so for $\sigma(\mathcal{F})$. There is no "simple" description of the $\sigma$-algebra generated by $\mathcal{F}$ (c.f. Borel hierarchy in descriptive set theory and transfinite induction).

Definition ($\pi$-system). A family $\mathcal{F}$ of subsets of a set $X$ is called a $\pi$-system if it contains $\emptyset$ and it is closed under finite intersection.

Proposition 2.3 (measure uniqueness). Let $(X, \mathcal{A})$ be a measurable space. Assume $\mu_1$ and $\mu_2$ are two finite measures (i.e. $\mu_i(X) < \infty$) such that $\mu_1(F) = \mu_2(F)$ for every $F \in \mathcal{F}$, where $\mathcal{F}$ is a $\pi$-system with $\sigma(\mathcal{F}) = \mathcal{A}$. Then $\mu_1 = \mu_2$.

For $\mathbb{R}^d$, we only have to check open boxes.

Proof. We state first the following lemma:

Lemma 2.4 (Dynkin lemma). Suppose $\mathcal{F}$ is a $\pi$-system on $X$ and $\mathcal{C}$ is a family of subsets of $X$ such that $\mathcal{F} \subseteq \mathcal{C}$ and $\mathcal{C}$ is stable under complementation and disjoint countable unions. Then $\sigma(\mathcal{F}) \subseteq \mathcal{C}$.

Let $\mathcal{C} = \{ A \in \mathcal{A} : \mu_1(A) = \mu_2(A) \}$. Then $\mathcal{C}$ is stable under complementation, provided $\mu_1(X) = \mu_2(X)$, as

$$\mu_i(A^c) = \mu_i(X \setminus A) = \mu_i(X) - \mu_i(A).$$

$\mathcal{C}$ is also clearly stable under countable disjoint unions by $\sigma$-additivity. Thus by the Dynkin lemma, $\mathcal{C} \supseteq \sigma(\mathcal{F}) = \mathcal{A}$.
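To see why the $\pi$-system hypothesis in the uniqueness statement matters, here is a small sketch of a standard counterexample on a four-point set (the specific weights and names are our own choice): two probability measures agree on a generating family that is not closed under intersection, yet differ on $\sigma$ of that family.

```python
from fractions import Fraction

def measure(weights):
    """Measure on a finite set given by point masses."""
    return lambda A: sum(weights[x] for x in A)

quarter, half = Fraction(1, 4), Fraction(1, 2)
mu1 = measure({1: quarter, 2: quarter, 3: quarter, 4: quarter})
mu2 = measure({1: half, 2: Fraction(0), 3: half, 4: Fraction(0)})

X = frozenset({1, 2, 3, 4})
# Generates the full power set of X, but {1, 2} ∩ {2, 3} = {2} is missing,
# so this family is not a pi-system.
family = [frozenset({1, 2}), frozenset({2, 3})]

agree_on_family = all(mu1(F) == mu2(F) for F in family)
same_total_mass = mu1(X) == mu2(X)
differ_on_intersection = mu1(frozenset({2})) != mu2(frozenset({2}))
```

Adding $\{2\}$ to the family would make it closed under intersection, and then the two measures no longer agree on it, consistent with the proposition.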

Proof of Dynkin lemma. Let $\mathcal{M}$ be the smallest family of subsets of $X$ containing $\mathcal{F}$ and stable under complementation and countable disjoint unions (such families exist, $2^X$ being one, and we take the intersection of all of them). It is sufficient to show that $\mathcal{M}$ is a $\sigma$-algebra, as then $\sigma(\mathcal{F}) \subseteq \mathcal{M} \subseteq \mathcal{C}$.

It suffices to show $\mathcal{M}$ is a Boolean algebra. Let

$$\mathcal{M}' = \{ A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{F} \}.$$

$\mathcal{M}'$ again is stable under countable disjoint unions and complementation because

$$A^c \cap B = (B^c \cup (A \cap B))^c$$

where the union is disjoint, so this is in $\mathcal{M}$.

As $\mathcal{M}' \supseteq \mathcal{F}$, by minimality of $\mathcal{M}$ we have $\mathcal{M} = \mathcal{M}'$. Now let

$$\mathcal{M}'' = \{ A \in \mathcal{M}' : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{M} \}.$$

The same argument shows that $\mathcal{M}'' = \mathcal{M}$. Thus $\mathcal{M}$ is stable under finite intersections and complementation, so it is a Boolean algebra. Any countable union of sets in a Boolean algebra can be rewritten as a countable disjoint union, so $\mathcal{M}$ is a $\sigma$-algebra.

Proposition 2.5 (uniqueness of Lebesgue measure). Lebesgue measure is the unique translation invariant measure $\mu$ on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ such that

$$\mu([0, 1]^d) = 1.$$

Proof. Exercise. Hint: use the $\pi$-system $\mathcal{F}$ made of all boxes in $\mathbb{R}^d$ and dissect a cube into dyadic pieces. Then approximate and use monotonicity.

Remark.

1. There is no countably additive translation invariant measure on $\mathbb{R}$ defined on all subsets of $\mathbb{R}$ which gives $[0, 1]$ measure $1$ (c.f. Vitali's counterexample).

2. However, the Lebesgue measure can be extended to a finitely additive measure on all subsets of $\mathbb{R}$ (the proof requires the Hahn-Banach theorem. See IID Linear Analysis).

Recall the construction of Lebesgue measure: we take boxes in $\mathbb{R}^d$ and define elementary sets, which form the Boolean algebra generated by boxes. Then we can define Jordan measure, which is finitely additive. However, the Jordan measurable sets are not stable under countable unions, and analysis craves limits, so we define Lebesgue measurable sets by introducing the outer measure $m^*$, which is built from the measure of boxes. Finally we restrict this outer measure to $\mathcal{L}$. We also define the Borel $\sigma$-algebra, which is the same as the $\sigma$-algebra generated by the boxes. We show that the Borel $\sigma$-algebra is contained in $\mathcal{L}$, and every element in $\mathcal{L}$ can be written as a disjoint union of an element in the Borel $\sigma$-algebra and a measure zero set.

Suppose $\mathcal{B}$ is a Boolean algebra on a set $X$. Let $\mu$ be a finitely additive measure on $\mathcal{B}$. We are going to construct a measure on $\sigma(\mathcal{B})$.

Theorem 2.6 (Carathéodory extension theorem). Assume that $\mu$ is countably additive on $\mathcal{B}$, i.e. if $B_n \in \mathcal{B}$ are disjoint and such that $\bigcup_{n \ge 1} B_n \in \mathcal{B}$ then $\mu(\bigcup_{n \ge 1} B_n) = \sum_{n \ge 1} \mu(B_n)$, and assume that $\mu$ is $\sigma$-finite, i.e. there exist $X_m \in \mathcal{B}$ such that $X = \bigcup_{m \ge 1} X_m$ and $\mu(X_m) < \infty$. Then $\mu$ extends uniquely to a measure on $\sigma(\mathcal{B})$.

Proof. For any $E \subseteq X$, let

$$\mu^*(E) = \inf\Big\{ \sum_{n \ge 1} \mu(B_n) : E \subseteq \bigcup_{n \ge 1} B_n,\ B_n \in \mathcal{B} \Big\}$$

and call it the outer measure associated to $\mu$. Define a subset $E \subseteq X$ to be $\mu^*$-measurable if for all $\varepsilon > 0$ there exists $C = \bigcup_{n \ge 1} B_n$ with $B_n \in \mathcal{B}$ such that $E \subseteq C$ and

$$\mu^*(C \setminus E) \le \varepsilon.$$

We denote by $\mathcal{B}^*$ the set of $\mu^*$-measurable subsets. We claim that

1. $\mu^*$ is countably subadditive and monotone.

2. $\mu^*(B) = \mu(B)$ for all $B \in \mathcal{B}$.

3. $\mathcal{B}^*$ is a $\sigma$-algebra and contains all $\mu^*$-null sets and $\mathcal{B}$.

4. $\mu^*$ is $\sigma$-additive on $\mathcal{B}^*$.

Then existence follows from the claims, as $\mathcal{B}^* \supseteq \sigma(\mathcal{B})$: $\mu^*$ will be a measure on $\mathcal{B}^*$ and thus on $\sigma(\mathcal{B})$. Uniqueness follows from a proof similar to that for Lebesgue measure, via the Dynkin lemma.

Proof of the claims. This will be very easy as we only need to adapt our previous work to the general case. Note that on a few occasions we used properties of $\mathbb{R}^d$, such as openness of some sets, so be careful.

1. Same.

2. $\mu^*(B) \le \mu(B)$ for all $B \in \mathcal{B}$ by definition of $\mu^*$. For the other direction, for all $\varepsilon > 0$, there exist $B_n \in \mathcal{B}$ such that $B \subseteq \bigcup_{n \ge 1} B_n$ and $\sum_{n \ge 1} \mu(B_n) \le \mu^*(B) + \varepsilon$. But

$$B = \bigcup_{n \ge 1} (B_n \cap B) = \bigcup_{n \ge 1} C_n$$

where $C_n := (B_n \cap B) \setminus \bigcup_{i < n} (B \cap B_i)$, so $C_n \in \mathcal{B}$ and the $C_n$ are disjoint. Thus by countable additivity

$$\mu(B) = \sum_{n \ge 1} \mu(C_n) \le \sum_{n \ge 1} \mu(B_n) \le \mu^*(B) + \varepsilon.$$

3. $\mu^*$-null sets and $\mathcal{B}$ are obviously in $\mathcal{B}^*$. Thus it is left to show that $\mathcal{B}^*$ is a $\sigma$-algebra. Stability under countable unions is exactly the same as before, and then we claim that $\mathcal{B}^*$ is stable under complementation. This is the bit where we used closed/open sets in $\mathbb{R}^d$ in the original proof. Here we use a lemma as a substitute.

Lemma 2.7. Suppose $B_n \in \mathcal{B}$. Then $\bigcap_{n \ge 1} B_n \in \mathcal{B}^*$.

Proof. First claim that if $E = \bigcap_{n \ge 1} I_n$ where $I_{n+1} \subseteq I_n$ and $I_n \in \mathcal{B}$ with $\mu(I_1) < \infty$, then $\mu^*(E) = \lim_{n \to \infty} \mu(I_n)$ and $E \in \mathcal{B}^*$: by additivity of $\mu$ on $\mathcal{B}$,

$$\sum_{n=1}^N \mu(I_n \setminus I_{n+1}) = \mu(I_1) - \mu(I_{N+1})$$

which converges as $N \to \infty$ (because $\mu(I_{n+1}) \le \mu(I_n)$), so

$$\sum_{n \ge N} \mu(I_n \setminus I_{n+1}) \to 0$$

as $N \to \infty$. But the LHS is at least $\mu^*(I_N \setminus E)$ because $I_N \setminus E = \bigcup_{n \ge N} (I_n \setminus I_{n+1})$. Therefore $E \in \mathcal{B}^*$ and

$$\mu(I_n) \le \mu^*(I_n \setminus E) + \mu^*(E), \qquad \mu^*(I_n \setminus E) \to 0, \qquad \mu^*(E) \le \mu(I_n),$$

so

$$\lim_{n \to \infty} \mu(I_n) = \mu^*(E).$$

Now for the actual lemma, let $E = \bigcap_{n \ge 1} I_n$ where $I_n \in \mathcal{B}$. wlog we may assume $I_{n+1} \subseteq I_n$. By the $\sigma$-finiteness assumption, $X = \bigcup_{m \ge 1} X_m$ where $X_m \in \mathcal{B}$ with $\mu(X_m) < \infty$, so

$$E = \bigcup_{m \ge 1} (E \cap X_m).$$

By the claim applied to the sets $I_n \cap X_m$, for all $m$ we have $E \cap X_m \in \mathcal{B}^*$, so $E \in \mathcal{B}^*$.

From the lemma we can derive that $\mathcal{B}^*$ is also stable under complementation: given $E \in \mathcal{B}^*$, for all $n$ there exists $C_n = \bigcup_{i \ge 1} B_{n,i}$ where $B_{n,i} \in \mathcal{B}$ such that $E \subseteq C_n$ and $\mu^*(C_n \setminus E) \le \frac{1}{n}$. Now

$$E^c = \Big( \bigcup_{n \ge 1} C_n^c \Big) \cup \Big( E^c \setminus \bigcup_{n \ge 1} C_n^c \Big)$$

where $C_n^c$ is a countable intersection $\bigcap_{i \ge 1} B_{n,i}^c$ and $E^c \setminus \bigcup_{n \ge 1} C_n^c$ is $\mu^*$-null. By the lemma, $C_n^c \in \mathcal{B}^*$, therefore their union is also in $\mathcal{B}^*$. Since we've shown that null sets are in $\mathcal{B}^*$, $E^c \in \mathcal{B}^*$.

4. We want to show $\mu^*$ is countably additive on $\mathcal{B}^*$. Recall that $\mu$ is $\sigma$-finite: there exist $X_m \in \mathcal{B}$ such that $X = \bigcup_{m \ge 1} X_m$, $\mu(X_m) < \infty$. We say $E \subseteq X$ is bounded if there exists $m$ such that $E \subseteq X_m$. It is then enough to show countable additivity for bounded sets by the same argument as before: write $X = \bigcup_{m \ge 1} \tilde{X}_m$ where $\tilde{X}_m = X_m \setminus \bigcup_{i < m} X_i \in \mathcal{B}$, so this is a disjoint union. Then if $E = \bigcup_{n \ge 1} E_n$ as a disjoint union then

$$E = \bigcup_{n \ge 1} \bigcup_{m \ge 1} (E_n \cap \tilde{X}_m)$$

which is also a countable disjoint union.

Given $E$, if we can show finite additivity then

$$\sum_{n=1}^N \mu^*(E_n) = \mu^*\Big( \bigcup_{n=1}^N E_n \Big) \le \mu^*(E) \le \sum_{n \ge 1} \mu^*(E_n);$$

take the limit as $N \to \infty$ to have equality throughout.

It suffices to prove finite additivity when $E$ and $F$ are countable intersections of sets from $\mathcal{B}$: if $E, F \in \mathcal{B}^*$ then for $\varepsilon > 0$ there exist $C, D$ countable intersections of sets from $\mathcal{B}$ such that $C \subseteq E$, $D \subseteq F$ and

$$\mu^*(E) \le \mu^*(C) + \varepsilon, \qquad \mu^*(F) \le \mu^*(D) + \varepsilon.$$

As $E \cap F = \emptyset$ and $C \subseteq E$, $D \subseteq F$, we have $C \cap D = \emptyset$, so by finite additivity,

$$\mu^*(E) + \mu^*(F) \le 2\varepsilon + \mu^*(C \cup D) \le 2\varepsilon + \mu^*(E \cup F).$$

As usual, the reverse holds by subadditivity.

Finally for $E = \bigcap_{n \ge 1} I_n$, $F = \bigcap_{n \ge 1} J_n$ bounded and disjoint, wlog assume $I_{n+1} \subseteq I_n$, $J_{n+1} \subseteq J_n$ and $\mu(I_n), \mu(J_n) < \infty$. Now use the claim from the proof of Lemma 2.7 (part 3):

$$\mu^*(E) = \lim_{n \to \infty} \mu(I_n), \qquad \mu^*(F) = \lim_{n \to \infty} \mu(J_n)$$

so

$$\mu^*(E) + \mu^*(F) = \lim_{n \to \infty} \big( \mu(I_n) + \mu(J_n) \big) = \lim_{n \to \infty} \big( \mu(I_n \cup J_n) + \mu(I_n \cap J_n) \big).$$

But

$$\bigcap_{n \ge 1} (I_n \cap J_n) = E \cap F = \emptyset, \qquad \bigcap_{n \ge 1} (I_n \cup J_n) = E \cup F$$

so by the same claim

$$\lim_{n \to \infty} \mu(I_n \cap J_n) = 0, \qquad \lim_{n \to \infty} \mu(I_n \cup J_n) = \mu^*(E \cup F)$$

which finishes the proof.

Remark. We have proved that every set $E \in \mathcal{B}^*$ is a disjoint union $E = F \cup N$ where $F \in \sigma(\mathcal{B})$ and $N$ is $\mu^*$-null.

Definition (completion). We say that $\mathcal{B}^*$ is the completion of $\sigma(\mathcal{B})$ with respect to $\mu$.

Example. $\mathcal{L}$ is the completion of $\mathcal{B}(\mathbb{R}^d)$ in $\mathbb{R}^d$.


3 Integration and measurable functions

Definition (measurable function). Let $(X, \mathcal{A})$ be a measurable space. A function $f \colon X \to \mathbb{R}$ is called measurable or $\mathcal{A}$-measurable if for all $t \in \mathbb{R}$,

\[ \{x \in X : f(x) < t\} \in \mathcal{A}. \]

Remark. The $\sigma$-algebra generated by the intervals $(-\infty, t)$, $t \in \mathbb{R}$, is the Borel $\sigma$-algebra of $\mathbb{R}$, denoted $\mathcal{B}(\mathbb{R})$. Thus for every measurable function $f \colon X \to \mathbb{R}$, the preimage $f^{-1}(B) \in \mathcal{A}$ for all $B \in \mathcal{B}(\mathbb{R})$. However, it is not true in general that $f^{-1}(L) \in \mathcal{A}$ for every $L \in \mathcal{L}$.

Remark. If $f$ is allowed to take the values $+\infty$ and $-\infty$ we will say that $f$ is measurable if additionally $f^{-1}(\{+\infty\}) \in \mathcal{A}$ and $f^{-1}(\{-\infty\}) \in \mathcal{A}$.

More generally,

Definition (measurable map). Suppose $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ are measurable spaces. A map $f \colon X \to Y$ is measurable if for all $B \in \mathcal{B}$, $f^{-1}(B) \in \mathcal{A}$.

Proposition 3.1.

1. The composition of measurable maps is measurable.

2. If $f, g \colon (X, \mathcal{A}) \to \mathbb{R}$ are measurable functions then $f + g$, $fg$ and $\lambda f$ for $\lambda \in \mathbb{R}$ are also measurable.

3. If $(f_n)_{n \ge 1}$ is a sequence of measurable functions on $(X, \mathcal{A})$ then so are $\sup_n f_n$, $\inf_n f_n$, $\limsup_n f_n$ and $\liminf_n f_n$.

Proof.

1. Obvious.

2. Follows from 1 once it's shown that $+ \colon \mathbb{R}^2 \to \mathbb{R}$ and $\times \colon \mathbb{R}^2 \to \mathbb{R}$ are measurable (with respect to Borel sets). The sets

\[ \{(x, y) : x + y < t\}, \qquad \{(x, y) : xy < t\} \]

are open in $\mathbb{R}^2$ and hence Borel.

3. $\inf_n f_n(x) < t$ if and only if

\[ x \in \bigcup_n \{x : f_n(x) < t\} \]

and similarly for $\sup$, using $\sup_n f_n = -\inf_n (-f_n)$. Likewise $\limsup_n f_n(x) < t$ if and only if

\[ x \in \bigcup_{m \ge 1} \bigcup_{k \ge 1} \bigcap_{n \ge k} \Big\{x : f_n(x) < t - \frac{1}{m}\Big\}, \]

and $\liminf$ is handled in the same way.


Proposition 3.2. A map $f = (f_1, \dots, f_d) \colon (X, \mathcal{A}) \to (\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$, where $d \ge 1$, is measurable if and only if each $f_i \colon X \to \mathbb{R}$ is measurable.

Proof. One direction is easy: suppose $f$ is measurable. Then

\[ \{x : f_i(x) < t\} = f^{-1}(\{y \in \mathbb{R}^d : y_i < t\}), \]

and the set $\{y \in \mathbb{R}^d : y_i < t\}$ is open, hence Borel, so $f_i$ is measurable.

Conversely, suppose each $f_i$ is measurable. Then

\[ f^{-1}\Big( \prod_{i=1}^d [a_i, b_i] \Big) = \bigcap_{i=1}^d \{x : a_i \le f_i(x) \le b_i\} \in \mathcal{A}. \]

As the boxes generate the Borel sets, we are done.

Example.

1. Let $(X, \mathcal{A})$ be a measurable space and $E \subseteq X$. Then $E \in \mathcal{A}$ if and only if $1_E$, the indicator function of $E$, is $\mathcal{A}$-measurable.

2. Let $X = \bigsqcup_{i=1}^N X_i$ be a finite disjoint union and let $\mathcal{A}$ be the Boolean algebra generated by the $X_i$'s. A function $f \colon (X, \mathcal{A}) \to \mathbb{R}$ is measurable if and only if $f$ is constant on each $X_i$. In this case the vector space of measurable functions has dimension $N$.

3. Every continuous function $f \colon \mathbb{R}^d \to \mathbb{R}$ is measurable.

Definition (Borel measurable). If $X$ is a topological space, $f \colon X \to \mathbb{R}$ is Borel or Borel measurable if it is $\mathcal{B}(X)$-measurable.

Definition (simple function). A function $f$ on $(X, \mathcal{A})$ is called simple if

\[ f = \sum_{i=1}^n a_i 1_{A_i} \]

for some $a_i \ge 0$ and $A_i \in \mathcal{A}$.

Of course simple functions are measurable.

Lemma 3.3. If a simple function can be written in two ways

\[ f = \sum_{i=1}^n a_i 1_{A_i} = \sum_{j=1}^s b_j 1_{B_j} \]

then

\[ \sum_{i=1}^n a_i \mu(A_i) = \sum_{j=1}^s b_j \mu(B_j) \]

for any measure $\mu$ on $(X, \mathcal{A})$.

Proof. Example sheet 1.


Definition (integral of a simple function with respect to a measure). The $\mu$-integral of a simple function $f = \sum_{i=1}^n a_i 1_{A_i}$ is defined by

\[ \mu(f) := \sum_{i=1}^n a_i \mu(A_i). \]

Remark.

1. The lemma says that the integral is well-defined.

2. We also use the notation $\int_X f \, d\mu$ to denote $\mu(f)$.

Proposition 3.4. The $\mu$-integral satisfies, for all simple functions $f$ and $g$,

1. linearity: for all $\alpha, \beta \ge 0$, $\mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g)$.

2. positivity: if $g \le f$ then $\mu(g) \le \mu(f)$.

3. if $\mu(f) = 0$ then $f = 0$ $\mu$-almost everywhere, i.e. $\{x \in X : f(x) \ne 0\}$ is a $\mu$-null set.

Proof. Obvious from definition and lemma.
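As an illustration of Lemma 3.3 and of the definition of the $\mu$-integral of a simple function, here is a minimal sketch on a finite measure space. The space $X = \{0, \dots, 5\}$, the point masses in `weights` and the two representations `rep1`, `rep2` are hypothetical choices made only for this example; they are not taken from the notes.

```python
from fractions import Fraction

# Hypothetical finite measure space: X = {0,...,5}, every subset measurable,
# and mu given by point masses (values chosen only for illustration).
weights = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(1, 3),
           3: Fraction(2), 4: Fraction(0), 5: Fraction(1, 4)}

def mu(A):
    """Measure of a subset A of X."""
    return sum((weights[x] for x in A), Fraction(0))

def integral(rep):
    """mu-integral of sum_i a_i 1_{A_i}, where rep = [(a_i, A_i), ...]."""
    return sum((a * mu(A) for a, A in rep), Fraction(0))

def evaluate(rep, x):
    """Value at x of the simple function described by rep."""
    return sum((a for a, A in rep if x in A), Fraction(0))

# Two different representations of the same simple function.
rep1 = [(Fraction(1), {0, 1, 2}), (Fraction(2), {2, 3})]
rep2 = [(Fraction(1), {0, 1}), (Fraction(3), {2}), (Fraction(2), {3})]

same_function = all(evaluate(rep1, x) == evaluate(rep2, x) for x in weights)
print(same_function, integral(rep1), integral(rep2))
```

Both representations give the same integral, as the lemma predicts. Concatenating two representation lists represents the sum of the two functions, which makes linearity visible as well.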

Definition. If $f \ge 0$ is measurable on $(X, \mathcal{A})$, define

\[ \mu(f) = \sup\{\mu(g) : g \text{ simple},\ g \le f\} \in [0, +\infty]. \]

Remark. This is consistent with the definition for $f$ simple, due to positivity.

Definition (integrable). If $f$ is an arbitrary measurable function on $(X, \mathcal{A})$ we say $f$ is $\mu$-integrable if

\[ \mu(|f|) < \infty. \]

Definition (integral with respect to a measure). If $f$ is $\mu$-integrable, then we define its $\mu$-integral by

\[ \mu(f) = \mu(f^+) - \mu(f^-) \]

where $f^+ = \max\{0, f\}$ and $f^- = (-f)^+$.

Note.

\[ |f| = f^+ + f^-, \qquad f = f^+ - f^-. \]

Theorem 3.5 (monotone convergence theorem). Let $(f_n)_{n \ge 1}$ be a sequence of measurable functions on a measure space $(X, \mathcal{A}, \mu)$ such that

\[ 0 \le f_1 \le f_2 \le \dots \le f_n \le \cdots \]

Let $f = \lim_{n \to \infty} f_n$. Then

\[ \mu(f) = \lim_{n \to \infty} \mu(f_n). \]

Lemma 3.6. If $g$ is a simple function on $(X, \mathcal{A}, \mu)$, the map

\[ m_g \colon \mathcal{A} \to [0, \infty], \qquad E \mapsto \mu(1_E g) \]

is a measure on $(X, \mathcal{A})$.

Proof. Write $g = \sum_{i=1}^r a_i 1_{A_i}$, so $g 1_E = \sum_{i=1}^r a_i 1_{A_i \cap E}$ and

\[ \mu(1_E g) = \sum_{i=1}^r a_i \mu(A_i \cap E). \]

By a question on the example sheet this is well-defined. Then $\sigma$-additivity follows immediately from $\sigma$-additivity of $\mu$.

Proof of monotone convergence theorem. $f_n \le f_{n+1} \le f$ by assumption, so

\[ \mu(f_n) \le \mu(f_{n+1}) \le \mu(f) \]

by definition of the integral, so

\[ \lim_{n \to \infty} \mu(f_n) \le \mu(f), \]

although the RHS may be infinite.

Let $g$ be any simple function with $g \le f$. We need to show that $\mu(g) \le \lim_{n \to \infty} \mu(f_n)$. Pick $\varepsilon \in (0, 1)$. Let

\[ E_n = \{x \in X : f_n(x) \ge (1 - \varepsilon) g(x)\}. \]

Then $X = \bigcup_{n \ge 1} E_n$ (if $g(x) > 0$ then $f_n(x) \uparrow f(x) \ge g(x) > (1 - \varepsilon) g(x)$, and if $g(x) = 0$ then $x \in E_1$) and $E_n \subseteq E_{n+1}$. So we may apply upward monotone convergence for sets to the measure $m_g$ and get

\[ \lim_{n \to \infty} m_g(E_n) = m_g(X) = \mu(g 1_X) = \mu(g). \]

But

\[ (1 - \varepsilon) m_g(E_n) = \mu((1 - \varepsilon) g 1_{E_n}) \le \mu(f_n) \]

because $(1 - \varepsilon) g 1_{E_n}$ is a simple function smaller than $f_n$. Taking the limit,

\[ (1 - \varepsilon) \mu(g) \le \lim_{n \to \infty} \mu(f_n), \]

which holds for all such $\varepsilon$. So

\[ \mu(g) \le \lim_{n \to \infty} \mu(f_n). \]


Lemma 3.7. If $f \ge 0$ is a measurable function on $(X, \mathcal{A})$ then there is a sequence of simple functions $(g_n)_{n \ge 1}$ with

\[ 0 \le g_n \le g_{n+1} \le f \]

such that for all $x \in X$, $g_n(x) \uparrow f(x)$.

Notation. $g_n \uparrow f$ means that $\lim_{n \to \infty} g_n(x) = f(x)$ and $g_{n+1} \ge g_n$.

Proof. We can take

\[ g_n = \frac{1}{2^n} \lfloor 2^n \min\{f, n\} \rfloor \]

pointwise. Check that $\lfloor 2y \rfloor \ge 2 \lfloor y \rfloor$ for all $y \ge 0$.
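The dyadic approximation in this proof is easy to experiment with. The following is a small sketch, assuming a concrete test function $f(x) = x^2$ and a handful of sample points (both are arbitrary choices for illustration); it evaluates $g_n = 2^{-n} \lfloor 2^n \min\{f, n\} \rfloor$ pointwise.

```python
import math

def dyadic_approx(f, n):
    """Return g_n = 2^{-n} * floor(2^n * min(f, n)) as a function of x."""
    def g(x):
        return math.floor(2**n * min(f(x), n)) / 2**n
    return g

def f(x):
    # Any nonnegative function would do; x^2 is just an example.
    return x * x

xs = [0.0, 0.3, 1.7, 2.5, 10.0]
# values[i][n-1] = g_n(xs[i]) for n = 1, ..., 29
values = [[dyadic_approx(f, n)(x) for n in range(1, 30)] for x in xs]

for x, row in zip(xs, values):
    print(x, f(x), row[0], row[4], row[-1])
```

For each sample point the printed values increase with $n$ and stay below $f(x)$. Once $n \ge f(x)$ the truncation $\min\{f, n\}$ is inactive and the error is below $2^{-n}$.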

Proposition 3.8. Basic properties of the integral (for positive functions): suppose $f, g \ge 0$ are measurable on $(X, \mathcal{A}, \mu)$.

1. linearity: for all $\alpha, \beta \ge 0$, $\mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g)$.

2. positivity: if $0 \le f \le g$ then $\mu(f) \le \mu(g)$.

3. if $\mu(f) = 0$ then $f = 0$ $\mu$-almost everywhere.

4. if $f = g$ $\mu$-almost everywhere then $\mu(f) = \mu(g)$.

Proof.

1. Follows from the same property for simple functions and from Lemma 3.7 combined with the monotone convergence theorem.

2. Obvious from the definition.

3. We have

\[ \{x \in X : f(x) \ne 0\} = \bigcup_{n \ge 1} \Big\{x \in X : f(x) > \frac{1}{n}\Big\}. \]

Set $g_n = \frac{1}{n} 1_{\{x \in X : f(x) > 1/n\}}$, which is simple with $g_n \le f$, so by definition of the integral $\mu(g_n) \le \mu(f)$, so $\mu(g_n) = 0$, i.e. $\mu(\{x : f(x) > \frac{1}{n}\}) = 0$. A countable union of null sets is null.

4. Note that if $E \in \mathcal{A}$ and $\mu(E^c) = 0$ then

\[ \mu(h 1_E) = \mu(h) \]

for all $h$ simple. Thus it holds for all $h \ge 0$ measurable. Now take $E = \{x : f(x) = g(x)\}$.

Proposition 3.9 (linearity of integral). Suppose $f, g$ are $\mu$-integrable functions and $\alpha, \beta \in \mathbb{R}$. Then $\alpha f + \beta g$ is $\mu$-integrable and

\[ \mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g). \]

Proof. We have shown the case when $\alpha, \beta \ge 0$ and $f, g \ge 0$. In the general case, use the positive and negative parts.


Lemma 3.10 (Fatou's lemma). Suppose $(f_n)_{n \ge 1}$ is a sequence of measurable functions on $(X, \mathcal{A}, \mu)$ such that $f_n \ge 0$ for all $n$. Then

\[ \mu(\liminf_{n \to \infty} f_n) \le \liminf_{n \to \infty} \mu(f_n). \]

Remark. We may not have equality: let $f_n = 1_{[n, n+1]}$ on $(\mathbb{R}, \mathcal{L}, m)$. Then $m(f_n) = 1$ for all $n$ but $\lim_{n \to \infty} f_n = 0$ pointwise.

Proof. Let $g_n := \inf_{k \ge n} f_k$. Then $g_{n+1} \ge g_n \ge 0$, so by the monotone convergence theorem $\mu(g_n) \uparrow \mu(g)$ as $n \to \infty$, where $g = \lim_{n \to \infty} g_n = \liminf_{n \to \infty} f_n$. Moreover $g_n \le f_n$, so $\mu(g_n) \le \mu(f_n)$ for all $n$. Taking $\liminf$ as $n \to \infty$,

\[ \mu(g) \le \liminf_{n \to \infty} \mu(f_n). \]

In both the monotone convergence theorem and Fatou's lemma we assumed that the sequence of functions is nonnegative. There is another version of the convergence theorem where we replace nonnegativity by domination:

Theorem 3.11 (Lebesgue's dominated convergence theorem). Let $(f_n)_{n \ge 1}$ be a sequence of measurable functions on $(X, \mathcal{A}, \mu)$ and $g$ a $\mu$-integrable function on $X$. Assume $|f_n| \le g$ for all $n$ (domination assumption) and assume for all $x \in X$, $\lim_{n \to \infty} f_n(x) = f(x)$. Then $f$ is $\mu$-integrable and

\[ \mu(f) = \lim_{n \to \infty} \mu(f_n). \]

This allows us to swap limit and integral.

Proof. $|f_n| \le g$ so $|f| \le g$, so $\mu(|f|) \le \mu(g) < \infty$ and $f$ is integrable. Note that $g + f_n \ge 0$ so by Fatou's lemma,

\[ \mu(\liminf_{n \to \infty} (g + f_n)) \le \liminf_{n \to \infty} \mu(g + f_n). \]

But $\liminf_{n \to \infty} (g + f_n) = g + f$ and by linearity $\mu(g + f_n) = \mu(g) + \mu(f_n)$, so

\[ \mu(g) + \mu(f) \le \mu(g) + \liminf_{n \to \infty} \mu(f_n), \]

i.e.

\[ \mu(f) \le \liminf_{n \to \infty} \mu(f_n). \]

Do the same with $g - f_n$ in place of $g + f_n$ to get

\[ \mu(-f) \le \liminf_{n \to \infty} \mu(-f_n) = -\limsup_{n \to \infty} \mu(f_n), \]

i.e. $\limsup_{n \to \infty} \mu(f_n) \le \mu(f)$. Combining the two inequalities,

\[ \mu(f) = \lim_{n \to \infty} \mu(f_n). \]


Corollary 3.12 (exchanging integral and summation). Let $(X, \mathcal{A}, \mu)$ be a measure space and let $(f_n)_{n \ge 1}$ be a sequence of measurable functions on $X$.

1. If $f_n \ge 0$ then

\[ \mu\Big( \sum_{n \ge 1} f_n \Big) = \sum_{n \ge 1} \mu(f_n). \]

2. If $\sum_{n \ge 1} |f_n|$ is $\mu$-integrable then $\sum_{n \ge 1} f_n$ is $\mu$-integrable and

\[ \mu\Big( \sum_{n \ge 1} f_n \Big) = \sum_{n \ge 1} \mu(f_n). \]

Proof.

1. Let $g_N = \sum_{n=1}^N f_n$. Then $g_N \uparrow \sum_{n \ge 1} f_n$ as $N \to \infty$, so the result follows from the monotone convergence theorem.

2. Let $g = \sum_{n \ge 1} |f_n|$ and $g_N$ as above. Then $|g_N| \le g$ for all $N$, so the domination assumption holds. The result thus follows from the dominated convergence theorem.

Corollary 3.13 (differentiation under integral sign). Let $(X, \mathcal{A}, \mu)$ be a measure space. Let $U \subseteq \mathbb{R}$ be an open set and let $f \colon U \times X \to \mathbb{R}$ be such that

1. $x \mapsto f(t, x)$ is $\mu$-integrable for all $t \in U$,

2. $t \mapsto f(t, x)$ is differentiable for all $x \in X$,

3. domination: there exists $g \colon X \to \mathbb{R}$ $\mu$-integrable such that for all $t \in U$, $x \in X$,

\[ \Big| \frac{\partial f}{\partial t}(t, x) \Big| \le g(x). \]

Then $x \mapsto \frac{\partial f}{\partial t}(t, x)$ is $\mu$-integrable for all $t \in U$, and if we set $F(t) = \int_X f(t, x) \, d\mu$ then $F$ is differentiable and

\[ F'(t) = \int_X \frac{\partial f}{\partial t}(t, x) \, d\mu. \]

Proof. Pick $h_n \ne 0$ with $h_n \to 0$, small enough that the interval between $t$ and $t + h_n$ lies in $U$, and define

\[ g_n(t, x) := \frac{1}{h_n} \big( f(t + h_n, x) - f(t, x) \big). \]

Then

\[ \lim_{n \to \infty} g_n(t, x) = \frac{\partial f}{\partial t}(t, x). \]

By the mean value theorem, there exists $\theta_{t,n,x}$ between $t$ and $t + h_n$ such that

\[ g_n(t, x) = \frac{\partial f}{\partial t}(\theta_{t,n,x}, x) \]

so

\[ |g_n(t, x)| \le g(x) \]

by the domination assumption. Now apply the dominated convergence theorem.
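A quick numerical sanity check of differentiation under the integral sign can be sketched as follows. The example $f(t, x) = \cos(tx)$ on $X = [0, 1]$ with Lebesgue measure is an assumption made for illustration; here $|\partial f / \partial t| = |x \sin(tx)| \le 1$, so the domination hypothesis holds with $g \equiv 1$. The helper `midpoint` is a plain midpoint rule, not anything from the notes.

```python
import math

def midpoint(h, a, b, n=20000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (k + 0.5) * step) for k in range(n))

def F(t):
    # F(t) = integral over [0, 1] of cos(t x) dx, which equals sin(t) / t.
    return midpoint(lambda x: math.cos(t * x), 0.0, 1.0)

def integral_of_derivative(t):
    # Integral over [0, 1] of (d/dt) cos(t x) = -x sin(t x).
    return midpoint(lambda x: -x * math.sin(t * x), 0.0, 1.0)

t = 1.3
eps = 1e-5
difference_quotient = (F(t + eps) - F(t - eps)) / (2 * eps)
exact = (t * math.cos(t) - math.sin(t)) / t**2  # derivative of sin(t) / t

print(difference_quotient, integral_of_derivative(t), exact)
```

The three printed numbers agree to several decimal places: the difference quotient of $F$, the integral of the partial derivative, and the closed-form derivative of $\sin(t)/t$.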

Remark.

1. If $f \colon [a, b] \to \mathbb{R}$ is continuous, where $a < b$ in $\mathbb{R}$, then $f$ is $m$-integrable (where $m$ is the Lebesgue measure) and $m(f) = \int_a^b f(x) \, dx$ is the Riemann integral. In general if $f$ is only assumed to be bounded, then $f$ is Riemann integrable if and only if the set of points of discontinuity of $f$ is an $m$-null set. See example sheet 2.

2. If $g \in \mathrm{GL}_d(\mathbb{R})$ and $f \ge 0$ is Borel measurable on $\mathbb{R}^d$, then

\[ m(f \circ g) = \frac{1}{|\det g|} m(f). \]

See example sheet 2. In particular $m$ is invariant under linear transformations whose determinant has absolute value 1, e.g. rotations.

Remark. In each of the monotone convergence theorem, Fatou's lemma and the dominated convergence theorem, we can replace the pointwise assumptions by the corresponding $\mu$-almost everywhere assumptions, and the same conclusions hold. Indeed, let

\[ E = \{x \in X : \text{assumptions hold at } x\} \]

so $E^c$ is a $\mu$-null set. Replace each $f_n$ (and similarly $g$ etc.) by $1_E f_n$. The assumptions then hold everywhere, and the integrals are unchanged since $\mu(f 1_E) = \mu(f)$ for all $f$ measurable.


4 Product measures

Definition (product $\sigma$-algebra). Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces. The $\sigma$-algebra of subsets of $X \times Y$ generated by the product sets $E \times F$ where $E \in \mathcal{A}$, $F \in \mathcal{B}$ is called the product $\sigma$-algebra of $\mathcal{A}$ and $\mathcal{B}$ and is denoted by $\mathcal{A} \otimes \mathcal{B}$.

Remark.

1. By analogy with the notion of product topology, $\mathcal{A} \otimes \mathcal{B}$ is the smallest $\sigma$-algebra of subsets of $X \times Y$ making the two projection maps measurable.

2. $\mathcal{B}(\mathbb{R}^{d_1}) \otimes \mathcal{B}(\mathbb{R}^{d_2}) = \mathcal{B}(\mathbb{R}^{d_1 + d_2})$. See example sheet. However this is not so for $\mathcal{L}(\mathbb{R}^d)$.

Lemma 4.1. If $E \subseteq X \times Y$ is $\mathcal{A} \otimes \mathcal{B}$-measurable then for all $x \in X$, the slice

\[ E_x = \{y \in Y : (x, y) \in E\} \]

is in $\mathcal{B}$.

Proof. Let

\[ \mathcal{E} = \{E \subseteq X \times Y : E_x \in \mathcal{B} \text{ for all } x \in X\}. \]

Note that $\mathcal{E}$ contains all product sets $A \times B$ where $A \in \mathcal{A}$, $B \in \mathcal{B}$. $\mathcal{E}$ is a $\sigma$-algebra: if $E \in \mathcal{E}$ then $E^c \in \mathcal{E}$, and if $E_n \in \mathcal{E}$ then $\bigcup E_n \in \mathcal{E}$, since $(E^c)_x = (E_x)^c$ and $(\bigcup E_n)_x = \bigcup (E_n)_x$. Hence $\mathcal{E} \supseteq \mathcal{A} \otimes \mathcal{B}$.

Lemma 4.2. Assume $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are $\sigma$-finite measure spaces. Let $f \colon X \times Y \to [0, +\infty]$ be $\mathcal{A} \otimes \mathcal{B}$-measurable. Then

1. for all $x \in X$, the function $y \mapsto f(x, y)$ is $\mathcal{B}$-measurable.

2. the map $x \mapsto \int_Y f(x, y) \, d\nu(y)$ is $\mathcal{A}$-measurable.

Proof.

1. In case $f = 1_E$ for $E \in \mathcal{A} \otimes \mathcal{B}$, the function $y \mapsto f(x, y)$ is just $y \mapsto 1_{E_x}(y)$, which is measurable by the previous lemma. More generally, the result is true for simple functions and thus for all measurable functions by taking pointwise limits.

2. By the same reduction we may assume $f = 1_E$ for some $E \in \mathcal{A} \otimes \mathcal{B}$. Now let $Y = \bigcup_{m \ge 1} Y_m$ with $\nu(Y_m) < \infty$. Let

\[ \mathcal{E} = \{E \in \mathcal{A} \otimes \mathcal{B} : x \mapsto \nu(E_x \cap Y_m) \text{ is } \mathcal{A}\text{-measurable for all } m\}. \]

$\mathcal{E}$ contains all product sets $E = A \times B$ where $A \in \mathcal{A}$, $B \in \mathcal{B}$, because $\nu(E_x \cap Y_m) = 1_{x \in A} \, \nu(B \cap Y_m)$. $\mathcal{E}$ is stable under complementation:

\[ \nu((E^c)_x \cap Y_m) = \nu(Y_m) - \nu(Y_m \cap E_x) \]

where the RHS is $\mathcal{A}$-measurable as a function of $x$. $\mathcal{E}$ is stable under disjoint countable union: let $E = \bigcup_{n \ge 1} E_n$ where the $E_n \in \mathcal{E}$ are disjoint. Then by $\sigma$-additivity

\[ \nu(E_x \cap Y_m) = \sum_{n \ge 1} \nu((E_n)_x \cap Y_m) \]

which is $\mathcal{A}$-measurable. The product sets form a $\pi$-system and generate the product $\sigma$-algebra, so by Dynkin lemma $\mathcal{E} = \mathcal{A} \otimes \mathcal{B}$.

Definition (product measure). Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be measure spaces with $\mu, \nu$ $\sigma$-finite. Then there exists a unique measure on $\mathcal{A} \otimes \mathcal{B}$, called the product measure and denoted by $\mu \otimes \nu$, such that for all $A \in \mathcal{A}$, $B \in \mathcal{B}$,

\[ (\mu \otimes \nu)(A \times B) = \mu(A) \nu(B). \]

Proof. Uniqueness follows from Dynkin lemma. For existence, set

\[ \sigma(E) = \int_X \nu(E_x) \, d\mu(x). \]

$\sigma$ is well-defined because $x \mapsto \nu(E_x)$ is $\mathcal{A}$-measurable by Lemma 4.2. $\sigma$ is countably additive: suppose $E = \bigcup_{n \ge 1} E_n$ where the $E_n \in \mathcal{A} \otimes \mathcal{B}$ are disjoint. Then

\[ \sigma(E) = \int_X \nu(E_x) \, d\mu(x) = \int_X \sum_{n \ge 1} \nu((E_n)_x) \, d\mu(x) = \sum_{n \ge 1} \int_X \nu((E_n)_x) \, d\mu(x) = \sum_{n \ge 1} \sigma(E_n) \]

by a corollary of MCT.

Theorem 4.3 (Tonelli-Fubini). Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be $\sigma$-finite measure spaces.

1. Let $f \colon X \times Y \to [0, +\infty]$ be $\mathcal{A} \otimes \mathcal{B}$-measurable. Then

\[ \int_{X \times Y} f(x, y) \, d(\mu \otimes \nu) = \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x) = \int_Y \int_X f(x, y) \, d\mu(x) \, d\nu(y). \]

2. If $f \colon X \times Y \to \mathbb{R}$ is $\mu \otimes \nu$-integrable then for $\mu$-almost every $x$, $y \mapsto f(x, y)$ is $\nu$-integrable, and for $\nu$-almost every $y$, $x \mapsto f(x, y)$ is $\mu$-integrable, and

\[ \int_{X \times Y} f(x, y) \, d(\mu \otimes \nu) = \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x) = \int_Y \int_X f(x, y) \, d\mu(x) \, d\nu(y). \]

Without the nonnegativity or integrability assumption, the result is false in general. For example for $X = Y = \mathbb{N}$, let $\mathcal{A} = \mathcal{B}$ be the discrete $\sigma$-algebra and $\mu = \nu$ the counting measure. Let $f(n, m) = 1_{n = m} - 1_{n = m + 1}$. Check that

\[ \sum_{n \ge 1} f(n, m) = 0 \text{ for every } m, \qquad \sum_{m \ge 1} f(n, m) = \begin{cases} 0 & n \ge 2 \\ 1 & n = 1 \end{cases} \]

so

\[ \sum_{n \ge 1} \sum_{m \ge 1} f(n, m) = 1 \ne 0 = \sum_{m \ge 1} \sum_{n \ge 1} f(n, m). \]
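The counterexample above can be checked mechanically. The sketch below assumes $\mathbb{N} = \{1, 2, \dots\}$ and uses the fact that for a fixed index only finitely many terms of each inner sum are nonzero, so each inner sum is computed exactly; the truncation level `M` of the outer sums is an arbitrary illustrative choice and does not affect the totals.

```python
def f(n, m):
    """f(n, m) = 1_{n = m} - 1_{n = m + 1} on N x N, with N = {1, 2, ...}."""
    return (1 if n == m else 0) - (1 if n == m + 1 else 0)

def sum_over_n(m):
    # For fixed m only n = m and n = m + 1 contribute, so this finite sum is exact.
    return sum(f(n, m) for n in range(1, m + 3))

def sum_over_m(n):
    # For fixed n only m = n and m = n - 1 (when n >= 2) contribute.
    return sum(f(n, m) for m in range(1, n + 2))

M = 50  # outer truncation level; all inner sums beyond the first few vanish
sum_n_first = sum(sum_over_n(m) for m in range(1, M + 1))  # sum over m of (sum over n)
sum_m_first = sum(sum_over_m(n) for n in range(1, M + 1))  # sum over n of (sum over m)

print(sum_n_first, sum_m_first)
```

The two iterated sums come out as 0 and 1 respectively, which is consistent with $f$ not being integrable for the counting measure: $\sum_{n, m} |f(n, m)| = \infty$.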

Proof.

1. The result holds for $f = 1_E$ where $E \in \mathcal{A} \otimes \mathcal{B}$ by the definition of the product measure and Lemma 4.2, so by linearity it holds for all simple functions. Now take limits and apply MCT.

2. Write $f = f^+ - f^-$ and apply 1.

Note.

1. The Lebesgue measure $m_d$ on $\mathbb{R}^d$ is equal to $m_1 \otimes \dots \otimes m_1$, because this is true on boxes and extends by uniqueness of measure.

2. $E \in \mathcal{A} \otimes \mathcal{B}$ is $\mu \otimes \nu$-null if and only if for $\mu$-almost every $x$, $\nu(E_x) = 0$.


5 Foundations of probability theory

Modern probability theory was founded by Kolmogorov, who formulated the axioms of probability theory in 1933 in his monograph Foundations of the Theory of Probability. He defined a probability space to be a measure space $(\Omega, \mathcal{F}, \mathbb{P})$. The interpretation is as follows: $\Omega$ is the universe of possible outcomes. However, we wouldn't be able to assign a probability to every single outcome unless the space is discrete. Instead we are interested in studying some subsets of $\Omega$, which are called events and are the elements of $\mathcal{F}$. Finally $\mathbb{P}$ is a probability measure with $\mathbb{P}(\Omega) = 1$. Thus for $A \in \mathcal{F}$, $\mathbb{P}(A \text{ occurs}) \in [0, 1]$. Finite additivity of $\mathbb{P}$ says that if $A$ and $B$ never occur simultaneously then $\mathbb{P}(A \text{ or } B) = \mathbb{P}(A) + \mathbb{P}(B)$. $\sigma$-additivity is slightly more difficult to justify, and it is perhaps easier to see through the equivalent notion of continuity: if $A_{n+1} \subseteq A_n$ and $\bigcap_{n \ge 1} A_n = \emptyset$ then $\mathbb{P}(A_n) \to 0$ as $n \to \infty$.

Definition (probability measure, probability space). Let $\Omega$ be a set and $\mathcal{F}$ a $\sigma$-algebra on $\Omega$. A measure $\mu$ on $(\Omega, \mathcal{F})$ is called a probability measure if $\mu(\Omega) = 1$, and the measure space $(\Omega, \mathcal{F}, \mu)$ is called a probability space.

Definition (random variable). A measurable function $X \colon \Omega \to \mathbb{R}$ is called a random variable.

We usually use a capital letter to denote a random variable.

Definition (expectation). If $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space then the $\mathbb{P}$-integral is called expectation, denoted $\mathbb{E}$.

Definition (distribution/law). A random variable $X \colon \Omega \to \mathbb{R}$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ determines a Borel measure $\mu_X$ on $\mathbb{R}$ defined by

\[ \mu_X((-\infty, t]) = \mathbb{P}(X \le t) = \mathbb{P}(\{\omega \in \Omega : X(\omega) \le t\}) \]

and $\mu_X$ is called the distribution of $X$, or the law of $X$.

Note. $\mu_X$ is the image of $\mathbb{P}$ under

\[ \Omega \to \mathbb{R}, \qquad \omega \mapsto X(\omega). \]

Definition (distribution function). The function

\[ F_X \colon \mathbb{R} \to [0, 1], \qquad t \mapsto \mathbb{P}(X \le t) \]

is called the distribution function of $X$.


Proposition 5.1. If $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space and $X \colon \Omega \to \mathbb{R}$ is a random variable then $F_X$ is non-decreasing, right-continuous and it determines $\mu_X$ uniquely.

Proof. Given $t_n \downarrow t$,

\[ F_X(t_n) = \mathbb{P}(X \le t_n) \to \mathbb{P}\Big( \bigcap_{n \ge 1} \{X \le t_n\} \Big) = \mathbb{P}(\{X \le t\}) = F_X(t) \]

by downward monotone convergence for sets. Uniqueness follows from Dynkin lemma applied to the $\pi$-system $\{\emptyset\} \cup \{(-\infty, t]\}_{t \in \mathbb{R}}$.

Conversely,

Proposition 5.2. If $F \colon \mathbb{R} \to [0, 1]$ is a non-decreasing right-continuous function with

\[ \lim_{t \to -\infty} F(t) = 0, \qquad \lim_{t \to +\infty} F(t) = 1 \]

then there exists a unique probability measure $\mu$ on $\mathbb{R}$ such that

\[ F(t) = \mu((-\infty, t]) \]

for all $t \in \mathbb{R}$.

Remark. The measure $\mu$ is called the Lebesgue-Stieltjes measure on $\mathbb{R}$ associated to $F$. Furthermore for all $a < b$ in $\mathbb{R}$,

\[ \mu((a, b]) = F(b) - F(a). \]

We can also construct Lebesgue measure this way.

Proof. Uniqueness is the same as above. For existence, we use the following lemma.

Lemma 5.3. Let

\[ g \colon (0, 1) \to \mathbb{R}, \qquad y \mapsto \inf\{x \in \mathbb{R} : F(x) \ge y\}. \]

Then $g$ is non-decreasing, left-continuous and for all $x \in \mathbb{R}$, $y \in (0, 1)$, $g(y) \le x$ if and only if $F(x) \ge y$.

Proof. Let

\[ I_y = \{x \in \mathbb{R} : F(x) \ge y\}. \]

Clearly if $y_1 \ge y_2$ then $I_{y_1} \subseteq I_{y_2}$, so $g(y_2) \le g(y_1)$ and $g$ is non-decreasing. $I_y$ is an interval of $\mathbb{R}$ because if $x > x_1$ and $x_1 \in I_y$ then $F(x) \ge F(x_1) \ge y$, so $x \in I_y$. So $I_y$ is an interval with endpoints $g(y)$ and $+\infty$. But $F$ is right-continuous, so $g(y) = \min I_y$ and the minimum is attained. Thus $I_y = [g(y), +\infty)$.

This means that $x \ge g(y)$ if and only if $x \in I_y$ if and only if $F(x) \ge y$. Finally for left-continuity, suppose $y_n \uparrow y$. Then $\bigcap_{n \ge 1} I_{y_n} = I_y$ by definition of $I_y$, so $g(y_n) \to g(y)$.

Remark. If $F$ is continuous and strictly increasing then $g = F^{-1}$.

Now back to the proposition. Set $\mu = g_* m$ where $m$ is the Lebesgue measure on $(0, 1)$. $\mu$ is a probability measure as $g$ is Borel-measurable. By the lemma

\[ \mu((a, b]) = m(g^{-1}((a, b])) = m((F(a), F(b)] \cap (0, 1)) = F(b) - F(a). \]

Proposition 5.4. If $\mu$ is a Borel probability measure on $\mathbb{R}$ then there exists some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a random variable $X$ on $\Omega$ such that $\mu_X = \mu$.

In fact, one can even pick $\Omega = (0, 1)$, $\mathcal{F} = \mathcal{B}(0, 1)$ and $\mathbb{P} = m$, the Lebesgue measure.

Proof. For the first claim set $\Omega = \mathbb{R}$, $\mathcal{F} = \mathcal{B}(\mathbb{R})$, $\mathbb{P} = \mu$ and $X(x) = x$.

For the second claim, set $F(t) = \mu((-\infty, t])$ and take $X = g$ where $g$ is the auxiliary function defined in the previous lemma, namely

\[ X(\omega) = \inf\{x : F(x) \ge \omega\}. \]

Check that $\mu_X = \mu$:

\[ \begin{aligned} \mu_X((a, b]) &= \mathbb{P}(X \in (a, b]) \\ &= m(\{\omega \in (0, 1) : a < X(\omega) \le b\}) \\ &= m(\{\omega \in (0, 1) : F(a) < \omega \le F(b)\}) \\ &= F(b) - F(a) = \mu((a, b]). \end{aligned} \]
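The construction $X(\omega) = \inf\{x : F(x) \ge \omega\}$ is what is usually called inverse transform sampling. As a rough sketch, assume a hypothetical discrete law with three atoms (the values in `atoms` and `probs` are invented for illustration); then $F$ is a step function and $g$ can be computed by binary search.

```python
import bisect
import random
from fractions import Fraction

# Hypothetical discrete law on R: atoms and their probabilities (illustrative).
atoms = [-1.0, 0.5, 2.0]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

# Cumulative masses: cum[i] = mu((-inf, atoms[i]]).
cum = []
total = Fraction(0)
for p in probs:
    total += p
    cum.append(total)

def F(x):
    """Distribution function F(x) = mu((-inf, x])."""
    k = bisect.bisect_right(atoms, x)  # number of atoms <= x
    return cum[k - 1] if k > 0 else Fraction(0)

def g(y):
    """Generalised inverse g(y) = inf{x : F(x) >= y} for 0 < y < 1."""
    return atoms[bisect.bisect_left(cum, y)]  # first atom with cumulative mass >= y

# Check the key equivalence of Lemma 5.3 on a grid: g(y) <= x iff F(x) >= y.
ys = [Fraction(k, 20) for k in range(1, 20)]
grid = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
lemma_holds = all((g(y) <= x) == (F(x) >= y) for y in ys for x in grid)

# Sample X = g(U) with U uniform; a fixed seed keeps the run reproducible.
rng = random.Random(0)
samples = [g(rng.random()) for _ in range(10000)]
freq = {a: samples.count(a) / len(samples) for a in atoms}
print(lemma_holds, freq)
```

The empirical frequencies should be close to the prescribed probabilities, which is the content of $\mu_X = \mu$ in this special case.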

Remark. If $\mu$ is a Borel probability measure on $\mathbb{R}$ such that $\mu = f \, dt$ for some $f \ge 0$ measurable, we say that $\mu$ has a density (with respect to Lebesgue measure) and $f$ is called the density of $\mu$. Here $\mu = f \, dt$ means that $\mu((a, b]) = \int_a^b f(t) \, dt$.

Example.

1. uniform distribution on $[0, 1]$:

\[ f(t) = 1_{[0,1]}(t), \qquad F(t) = \mu((-\infty, t]) = m((-\infty, t] \cap [0, 1]) \]

2. exponential distribution of rate $\lambda$:

\[ f_\lambda(t) = \lambda e^{-\lambda t} 1_{t \ge 0}, \qquad F_\lambda(t) = \int_{-\infty}^t f_\lambda(s) \, ds = 1_{t \ge 0} (1 - e^{-\lambda t}) \]

3. Gaussian distribution with standard deviation $\sigma$ and mean $m$:

\[ f_{\sigma,m}(t) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big( -\frac{(t - m)^2}{2\sigma^2} \Big), \qquad F_{\sigma,m}(t) = \int_{-\infty}^t \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big( -\frac{(s - m)^2}{2\sigma^2} \Big) \, ds \]


Definition (mean, moment, variance). If $X$ is a random variable then

1. $\mathbb{E}(X)$ is called the mean,

2. $\mathbb{E}(X^k)$ is called the $k$th moment of $X$,

3. $\operatorname{Var}(X) = \mathbb{E}((X - \mathbb{E}X)^2) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$ is called the variance.

Remark. Suppose $f \ge 0$ is measurable and $X$ is a random variable. Then

\[ \mathbb{E}(f(X)) = \int_{\mathbb{R}} f(x) \, d\mu_X(x) \]

where by definition $\mu_X = X_* \mathbb{P}$.
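For a discrete law the integral against $\mu_X$ reduces to a finite sum, which makes the two expressions for the variance easy to compare. The sketch below assumes the law of a fair six-sided die, a hypothetical example that does not appear in the notes.

```python
from fractions import Fraction

# Law of a fair six-sided die: mu_X({k}) = 1/6 for k = 1, ..., 6 (illustrative).
law = {k: Fraction(1, 6) for k in range(1, 7)}

def expectation(h):
    """E(h(X)) as the sum of h(x) * mu_X({x}) over the atoms of a discrete law."""
    return sum(h(x) * p for x, p in law.items())

mean = expectation(lambda x: x)
second_moment = expectation(lambda x: x**2)
variance_centered = expectation(lambda x: (x - mean) ** 2)  # E((X - EX)^2)
variance_moments = second_moment - mean**2                  # E(X^2) - E(X)^2

print(mean, second_moment, variance_centered, variance_moments)
```

Exact rational arithmetic is used so that the two variance formulas can be compared with `==` rather than up to rounding.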


6 Independence

Independence is the key notion that makes probability theory distinct from (abstract) measure theory.

Definition (independence). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A sequence of events $(A_n)_{n \ge 1}$ is called independent or mutually independent if for all $F \subseteq \mathbb{N}$ finite,

\[ \mathbb{P}\Big( \bigcap_{i \in F} A_i \Big) = \prod_{i \in F} \mathbb{P}(A_i). \]

Definition (independent $\sigma$-algebra). A sequence of $\sigma$-algebras $(\mathcal{A}_n)_{n \ge 1}$ where $\mathcal{A}_n \subseteq \mathcal{F}$ is called independent if for every choice of $A_n \in \mathcal{A}_n$, the family $(A_n)_{n \ge 1}$ is independent.

Remark.

1. To prove that $(\mathcal{A}_n)_{n \ge 1}$ is an independent family, it is enough to check the independence condition for all $A_n$'s with $A_n \in \Pi_n$, where $\Pi_n$ is a $\pi$-system generating $\mathcal{A}_n$. The proof is an application of Dynkin lemma. For example for $\sigma$-algebras $\mathcal{A}_1, \mathcal{A}_2$, it suffices to check

\[ \mathbb{P}(A_1 \cap A_2) = \mathbb{P}(A_1) \mathbb{P}(A_2) \]

for all $A_1 \in \Pi_1$, $A_2 \in \Pi_2$. Fix $A_2 \in \Pi_2$ and look at the measures

\[ A_1 \mapsto \mathbb{P}(A_1 \cap A_2), \qquad A_1 \mapsto \mathbb{P}(A_1) \mathbb{P}(A_2) \]

on $\mathcal{A}_1$. They coincide on $\Pi_1$ by assumption and hence everywhere on $\mathcal{A}_1$. Subsequently consider $\mathcal{A}_2$.

Notation. Suppose $X$ is a random variable. Denote by $\sigma(X)$ the smallest $\sigma$-subalgebra $\mathcal{A}$ of $\mathcal{F}$ such that $X$ is $\mathcal{A}$-measurable, i.e.

\[ \sigma(X) = \sigma\big(\{\omega \in \Omega : X(\omega) \leq t\}_{t \in \mathbb{R}}\big). \]

Definition (independence). A sequence of random variables $(X_i)_{i \geq 1}$ is called independent if the sequence of $\sigma$-subalgebras $(\sigma(X_i))_{i \geq 1}$ is independent.

Remark. This is equivalent to the condition that for all $(t_i)_{i \geq 1}$ and all $n$,

\[ \mathbb{P}\big((X_1 \leq t_1) \cap \dots \cap (X_n \leq t_n)\big) = \prod_{i=1}^n \mathbb{P}(X_i \leq t_i). \]

Yet another equivalent formulation is that for all $n$,

\[ \mu_{(X_1, \dots, X_n)} = \bigotimes_{i=1}^n \mu_{X_i} \]

as Borel probability measures on $\mathbb{R}^n$: "the joint law is the same as the product of the individual laws".

Note. Independence is a property of the whole family, so pairwise independence is necessary but not sufficient for independence. A famous counterexample is Bernstein's example: let $X$ and $Y$ be the $\{0, 1\}$-valued random variables for two independent fair coin flips and set $Z = |X - Y|$. Then $Z = 0$ if and only if $X = Y$. Check that

\[ \mathbb{P}(Z = 0) = \mathbb{P}(Z = 1) = \tfrac{1}{2} \]

and that each pair $(X, Y)$, $(X, Z)$ and $(Y, Z)$ is independent. But $(X, Y, Z)$ is not independent.
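Bernstein's example is small enough to verify by brute force. The following is a minimal sketch, assuming the coins take values in $\{0, 1\}$; the helper names `prob` and `pair_independent` are ours and only serve the illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: two independent fair coin flips (x, y), each outcome has mass 1/4.
outcomes = list(product([0, 1], repeat=2))
mass = Fraction(1, 4)

def prob(event):
    """Probability of the outcomes (x, y) on which event(x, y, z) holds, z = |x - y|."""
    return sum(mass for (x, y) in outcomes if event(x, y, abs(x - y)))

def pair_independent(i, j):
    """Check P(V_i = a, V_j = b) = P(V_i = a) P(V_j = b) for (V_0, V_1, V_2) = (X, Y, Z)."""
    return all(
        prob(lambda *v: v[i] == a and v[j] == b)
        == prob(lambda *v: v[i] == a) * prob(lambda *v: v[j] == b)
        for a in (0, 1) for b in (0, 1)
    )

pairwise = [pair_independent(0, 1), pair_independent(0, 2), pair_independent(1, 2)]

# Mutual independence fails: X = Y = 1 forces Z = 0, so the joint probability is 0.
joint = prob(lambda x, y, z: x == 1 and y == 1 and z == 1)
product_of_marginals = (
    prob(lambda x, y, z: x == 1)
    * prob(lambda x, y, z: y == 1)
    * prob(lambda x, y, z: z == 1)
)
```

All three pairs pass the check, while `joint` and `product_of_marginals` differ, which is exactly the failure of mutual independence.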

Proposition 6.1. If $X$ and $Y$ are independent random variables with $X \geq 0$, $Y \geq 0$, then

\[ \mathbb{E}(XY) = \mathbb{E}(X)\mathbb{E}(Y). \]

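Proof. Essentially Tonelli-Fubini: by independence $\mu_{(X,Y)} = \mu_X \otimes \mu_Y$, so

\[
\begin{aligned}
\mathbb{E}(XY) &= \int_{\mathbb{R}^2} xy \, d\mu_{(X,Y)}(x, y) = \int_{\mathbb{R}^2} xy \, d\mu_X(x) \, d\mu_Y(y) \\
&= \Big(\int_{\mathbb{R}} x \, d\mu_X(x)\Big) \Big(\int_{\mathbb{R}} y \, d\mu_Y(y)\Big) \\
&= \mathbb{E}(X)\mathbb{E}(Y).
\end{aligned}
\]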

Remark. As in Tonelli-Fubini, we may instead require $XY$ to be integrable and the same conclusion holds.

Example. Let $\Omega = (0, 1)$, $\mathcal{F} = \mathcal{B}(0, 1)$ and $\mathbb{P} = m$ the Lebesgue measure. Write the decimal expansion of $\omega \in (0, 1)$ as

\[ \omega = 0.\varepsilon_1 \varepsilon_2 \dots \]

where $\varepsilon_i(\omega) \in \{0, \dots, 9\}$. Choose a convention so that each $\omega$ has a well-defined expansion (to avoid ambiguities like $0.0999\dots = 0.1000\dots$). Now let $X_n(\omega) = \varepsilon_n(\omega)$. We claim that the $(X_n)_{n \geq 1}$ are iid. random variables uniformly distributed on $\{0, \dots, 9\}$, where "iid." stands for independent and identically distributed.

Proof. Easy check. For example $X_1(\omega) = \lfloor 10\omega \rfloor$ so

\[ \mathbb{P}(X_1 = i_1) = \frac{1}{10}. \]

Similarly for all $n$,

\[ \mathbb{P}(X_1 = i_1, \dots, X_n = i_n) = \frac{1}{10^n}, \qquad \mathbb{P}(X_n = i_n) = \frac{1}{10}, \]

so

\[ \mathbb{P}(X_1 = i_1, \dots, X_n = i_n) = \prod_{k=1}^n \mathbb{P}(X_k = i_k). \]
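The "easy check" can be carried out exactly for the first few digits. The sketch below is only an illustration under our own conventions: `digit` and `measure` are hypothetical helper names, and the Lebesgue measure of an event depending on the first $N$ digits is computed by counting midpoints of the $10^N$ intervals on which those digits are constant.

```python
from fractions import Fraction
from itertools import product
from math import floor

def digit(omega, n):
    """n-th decimal digit X_n(omega) = floor(10^n * omega) mod 10 of omega in (0, 1)."""
    return floor(10**n * omega) % 10

N = 3
# Midpoints of the intervals [j / 10^N, (j + 1) / 10^N): the first N digits are
# constant on each interval, so such events have measure (#midpoints) / 10^N.
midpoints = [Fraction(2 * j + 1, 2 * 10**N) for j in range(10**N)]

def measure(event):
    return Fraction(sum(1 for w in midpoints if event(w)), 10**N)

# Each X_n is uniform on {0, ..., 9}.
marginals_uniform = all(
    measure(lambda w: digit(w, n) == i) == Fraction(1, 10)
    for n in range(1, N + 1) for i in range(10)
)

# Every joint event {X_1 = d_1, ..., X_N = d_N} has measure 10^-N, the product
# of the marginals.
joint_is_product = all(
    measure(lambda w: all(digit(w, n + 1) == d[n] for n in range(N)))
    == Fraction(1, 10**N)
    for d in product(range(10), repeat=N)
)
```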

Remark. The random variable

\[ \omega = \sum_{n \geq 1} \frac{X_n(\omega)}{10^n} \]

is distributed according to Lebesgue measure, so if we want we can construct Lebesgue measure as the law of this random variable.

Proposition 6.2 (infinite product measure). Let $(\Omega_i, \mathcal{F}_i, \mu_i)_{i \geq 1}$ be a sequence of probability spaces, $\Omega = \prod_{i \geq 1} \Omega_i$ and $\mathcal{E}$ the Boolean algebra of cylinder sets, i.e. sets of the form

\[ A \times \prod_{i > n} \Omega_i \]

for some $n$ and some $A \in \bigotimes_{i=1}^n \mathcal{F}_i$. Set $\mathcal{F} = \sigma(\mathcal{E})$, the infinite product $\sigma$-algebra. Then there is a unique probability measure $\mu$ on $(\Omega, \mathcal{F})$ which agrees with the product measures on all cylinder sets, i.e.

\[ \mu\Big(A \times \prod_{i > n} \Omega_i\Big) = \Big(\bigotimes_{i=1}^n \mu_i\Big)(A) \]

for all $n$ and all $A \in \bigotimes_{i=1}^n \mathcal{F}_i$.

Proof. Omitted. See example sheet 3.

Lemma 6.3 (Borel-Cantelli). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $(A_n)_{n \geq 1}$ a sequence of events.

1. If $\sum_{n \geq 1} \mathbb{P}(A_n) < \infty$ then

\[ \mathbb{P}(\limsup_n A_n) = 0. \]

2. Conversely, if $(A_n)_{n \geq 1}$ are independent and $\sum_{n \geq 1} \mathbb{P}(A_n) = \infty$ then

\[ \mathbb{P}(\limsup_n A_n) = 1. \]

Note that $\limsup_n A_n = \bigcap_N \bigcup_{n \geq N} A_n$ is also written "$A_n$ io.", meaning "infinitely often".
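To see the dichotomy numerically, note that for independent events $\mathbb{P}(\text{no } A_n \text{ occurs for } N \leq n \leq M) = \prod_{n=N}^M (1 - \mathbb{P}(A_n))$. The sketch below compares $\mathbb{P}(A_n) = 1/n$ (divergent sum) with $\mathbb{P}(A_n) = 1/n^2$ (convergent sum); these two choices and the name `prob_none` are assumptions made for illustration only.

```python
from fractions import Fraction
from math import exp

def prob_none(p, N, M):
    """P(no A_n occurs for N <= n <= M) for independent events with P(A_n) = p(n)."""
    result = Fraction(1)
    for n in range(N, M + 1):
        result *= 1 - p(n)
    return result

def harmonic(n):
    return Fraction(1, n)        # sum over n diverges

def squares(n):
    return Fraction(1, n * n)    # sum over n converges

# Divergent case: the product telescopes to (N - 1) / M, which tends to 0 as M grows,
# so almost surely some A_n with n >= N occurs, for every N.
divergent = [prob_none(harmonic, 2, M) for M in (10, 100, 1000)]

# Convergent case: the product telescopes to (N - 1)(M + 1) / (N M) >= (N - 1) / N,
# so with positive probability no A_n with n >= N occurs.
convergent = [prob_none(squares, 2, M) for M in (10, 100, 1000)]

# The elementary bound prod (1 - p_n) <= exp(-sum p_n), checked in floating point.
bound_holds = float(prob_none(harmonic, 2, 1000)) <= exp(
    -sum(float(harmonic(n)) for n in range(2, 1001))
)
```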

Proof.

1. Let $Y = \sum_{n \geq 1} 1_{A_n}$, a random variable with values in $[0, \infty]$. Then by monotone convergence

\[ \mathbb{E}(Y) = \sum_{n \geq 1} \mathbb{E}(1_{A_n}) = \sum_{n \geq 1} \mathbb{P}(A_n) < \infty. \]

Since $Y \geq 0$, recall that we proved that $\mathbb{E}(Y) < \infty$ implies that $Y < \infty$ almost surely, i.e. $\mathbb{P}$-almost everywhere. But $\{Y = \infty\} = \limsup_n A_n$, so $\mathbb{P}(\limsup_n A_n) = 0$.

2. Note that

\[ (\limsup_n A_n)^c = \bigcup_N \bigcap_{n \geq N} A_n^c, \]

so it suffices to show that each $\bigcap_{n \geq N} A_n^c$ is null. For $M \geq N$, by independence,

\[
\begin{aligned}
\mathbb{P}\Big(\bigcap_{n \geq N} A_n^c\Big) &\leq \mathbb{P}\Big(\bigcap_{n=N}^M A_n^c\Big) \\
&= \prod_{n=N}^M \mathbb{P}(A_n^c) = \prod_{n=N}^M (1 - \mathbb{P}(A_n)) \\
&\leq \prod_{n=N}^M \exp(-\mathbb{P}(A_n)) \\
&= \exp\Big(-\sum_{n=N}^M \mathbb{P}(A_n)\Big) \\
&\to 0
\end{aligned}
\]

as $M \to \infty$. Thus

\[ \mathbb{P}\Big(\bigcap_{n \geq N} A_n^c\Big) = 0 \]

for all $N$, so

\[ \mathbb{P}\Big(\bigcup_N \bigcap_{n \geq N} A_n^c\Big) = 0. \]

Definition (random/stochastic process, filtration, tail $\sigma$-algebra, tail event). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $(X_n)_{n \geq 1}$ a sequence of random variables.

1. $(X_n)_{n \geq 1}$ is sometimes called a random process or stochastic process.

2. The sequence of $\sigma$-algebras

\[ \mathcal{F}_n = \sigma(X_1, \dots, X_n) \subseteq \mathcal{F} \]

is called the associated filtration. Note that $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$.

3. The $\sigma$-algebra

\[ \mathcal{C} = \bigcap_{n \geq 1} \sigma(X_n, X_{n+1}, \dots) \]

is called the tail $\sigma$-algebra of the process. Its elements are called tail events.

Example. Tail events are those not affected by the first few terms in the sequence of random variables. For example,

\[ \{\omega \in \Omega : \lim_n X_n(\omega) \text{ exists}\} \]

is a tail event, and so is

\[ \{\omega \in \Omega : \limsup_n X_n(\omega) \geq T\}. \]

Theorem 6.4 (Kolmogorov 0-1 law). If $(X_n)_{n \geq 1}$ is a sequence of mutually independent random variables then for all $A \in \mathcal{C}$,

\[ \mathbb{P}(A) \in \{0, 1\}. \]

Proof. Pick $A \in \mathcal{C}$ and fix $n$. Since $A \in \sigma(X_{n+1}, X_{n+2}, \dots)$, which is independent of $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$, we have for all $B \in \mathcal{F}_n$

\[ \mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B). \]

The measures $B \mapsto \mathbb{P}(A)\mathbb{P}(B)$ and $B \mapsto \mathbb{P}(A \cap B)$ therefore coincide on each $\mathcal{F}_n$, hence on the $\pi$-system $\bigcup_{n \geq 1} \mathcal{F}_n$. By Dynkin's lemma they coincide on $\sigma(\bigcup_{n \geq 1} \mathcal{F}_n) \supseteq \mathcal{C}$, so

\[ \mathbb{P}(A) = \mathbb{P}(A \cap A) = \mathbb{P}(A)\mathbb{P}(A) \]

and hence $\mathbb{P}(A) \in \{0, 1\}$.

6.1 Useful inequalities

Proposition 6.5 (Cauchy-Schwarz). Suppose $X, Y$ are random variables. Then

\[ \mathbb{E}(|XY|) \leq \sqrt{\mathbb{E}(X^2) \cdot \mathbb{E}(Y^2)}. \]

Proof. For all $t \in \mathbb{R}$,

\[ 0 \leq \mathbb{E}((|X| + t|Y|)^2) = \mathbb{E}(X^2) + 2t\,\mathbb{E}(|XY|) + t^2\,\mathbb{E}(Y^2) \]

so, viewed as a quadratic in $t$, the discriminant is nonpositive, i.e.

\[ \mathbb{E}(|XY|)^2 - \mathbb{E}(X^2)\,\mathbb{E}(Y^2) \leq 0. \]

Proposition 6.6 (Markov). Let $X \geq 0$ be a random variable. Then for all $t \geq 0$,

\[ t\,\mathbb{P}(X \geq t) \leq \mathbb{E}(X). \]

Proof.

\[ \mathbb{E}(X) \geq \mathbb{E}(X 1_{X \geq t}) \geq \mathbb{E}(t 1_{X \geq t}) = t\,\mathbb{P}(X \geq t). \]

Proposition 6.7 (Chebyshev). Let $Y$ be a random variable with $\mathbb{E}(Y^2) < \infty$. Then for all $t \geq 0$,

\[ t^2\,\mathbb{P}(|Y - \mathbb{E}(Y)| \geq t) \leq \operatorname{Var} Y. \]

$\mathbb{E}(Y^2) < \infty$ implies that $\mathbb{E}(|Y|) < \infty$ by Cauchy-Schwarz, so $\operatorname{Var} Y < \infty$. The converse is more subtle.

Proof. Apply Markov to $X = |Y - \mathbb{E}(Y)|^2$ with threshold $t^2$.
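The three inequalities can be sanity-checked exactly on a finite law. The sketch below is illustrative only: the law of $Y$ is an arbitrary choice of ours, and `expect` and `prob` are hypothetical helpers.

```python
from fractions import Fraction

# Hypothetical law of Y: value -> probability (chosen only for illustration).
law = {-2: Fraction(1, 8), 0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 8)}

def expect(g):
    """E(g(Y)) for the finite law above."""
    return sum(p * g(y) for y, p in law.items())

def prob(event):
    """P(event(Y)) for the finite law above."""
    return sum(p for y, p in law.items() if event(y))

mean = expect(lambda y: y)
var = expect(lambda y: (y - mean) ** 2)

thresholds = [Fraction(k, 2) for k in range(1, 12)]

# Markov applied to X = |Y|: t P(|Y| >= t) <= E|Y|.
markov_ok = all(t * prob(lambda y: abs(y) >= t) <= expect(abs) for t in thresholds)

# Chebyshev: t^2 P(|Y - E Y| >= t) <= Var Y.
chebyshev_ok = all(
    t**2 * prob(lambda y: abs(y - mean) >= t) <= var for t in thresholds
)

# Cauchy-Schwarz with X = 1: (E|Y|)^2 <= E(Y^2).
cauchy_schwarz_ok = expect(abs) ** 2 <= expect(lambda y: y * y)
```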

Theorem 6.8 (strong law of large numbers). Let $(X_n)_{n \geq 1}$ be a sequence of iid. random variables. Assume $\mathbb{E}(|X_1|) < \infty$. Let

\[ S_n = \sum_{k=1}^n X_k, \]

then $\frac{1}{n} S_n$ converges almost surely to $\mathbb{E}(X_1)$.

Proof. We prove the theorem under a stronger condition: we assume $\mathbb{E}(X_1^4) < \infty$. By Cauchy-Schwarz this implies $\mathbb{E}(X_1^2), \mathbb{E}(|X_1|) < \infty$, and subsequently $\mathbb{E}(|X_1|^3) < \infty$. The full proof is much harder but will be given later when we have developed enough machinery.

wlog we may assume $\mathbb{E}(X_1) = 0$ by replacing $X_n$ with $X_n - \mathbb{E}(X_1)$. We have

\[ \mathbb{E}(S_n^4) = \sum_{i,j,k,\ell} \mathbb{E}(X_i X_j X_k X_\ell). \]

All terms vanish because $\mathbb{E}(X_i) = 0$ and $(X_i)_{i \geq 1}$ are independent, except for $\mathbb{E}(X_i^4)$ and $\mathbb{E}(X_i^2 X_j^2)$ for $i \neq j$. For example,

\[ \mathbb{E}(X_i X_j^3) = \mathbb{E}(X_i) \cdot \mathbb{E}(X_j^3) = 0 \]

for $i \neq j$. Thus

\[ \mathbb{E}(S_n^4) = \sum_{i=1}^n \mathbb{E}(X_i^4) + 6 \sum_{i < j} \mathbb{E}(X_i^2 X_j^2). \]

By Cauchy-Schwarz,

\[ \mathbb{E}(X_i^2 X_j^2) \leq \sqrt{\mathbb{E}X_i^4 \cdot \mathbb{E}X_j^4} = \mathbb{E}X_1^4 \]

so

\[ \mathbb{E}(S_n^4) \leq \Big(n + 6 \cdot \frac{n(n-1)}{2}\Big) \mathbb{E}X_1^4 \]

and asymptotically,

\[ \mathbb{E}\Big(\Big(\frac{S_n}{n}\Big)^4\Big) = O\Big(\frac{1}{n^2}\Big) \]

so

\[ \mathbb{E}\Big(\sum_{n \geq 1} \Big(\frac{S_n}{n}\Big)^4\Big) = \sum_{n \geq 1} \mathbb{E}\Big(\Big(\frac{S_n}{n}\Big)^4\Big) < \infty. \]

Hence $\sum_n (S_n / n)^4 < \infty$ almost surely. In particular the terms of this series tend to $0$, and it follows that

\[ \lim_{n \to \infty} \frac{S_n}{n} = 0 \]

almost surely.

The strong law of large numbers has a very important statistical implication: by averaging a large number of iid. samples we can recover information about an unknown law, at least its mean.
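The fourth-moment count in the proof can be checked exactly in a simple case. The sketch below assumes $X_i$ iid. uniform on $\{-1, +1\}$ (our choice, not one made in the notes), for which $\mathbb{E}(X_i) = 0$ and $\mathbb{E}(X_i^4) = \mathbb{E}(X_i^2 X_j^2) = 1$, so the formula predicts $\mathbb{E}(S_n^4) = n + 3n(n-1)$.

```python
from fractions import Fraction
from itertools import product

def fourth_moment(n):
    """E(S_n^4) for S_n = X_1 + ... + X_n with X_i iid uniform on {-1, +1}."""
    total = sum(sum(signs) ** 4 for signs in product((-1, 1), repeat=n))
    return Fraction(total, 2**n)

# Exact enumeration over all 2^n sign patterns for small n.
moments = {n: fourth_moment(n) for n in range(1, 11)}

# Formula from the proof: n terms E(X_i^4) plus 6 * n(n - 1)/2 terms E(X_i^2 X_j^2).
formula_ok = all(m == 3 * n * n - 2 * n for n, m in moments.items())

# Hence E((S_n / n)^4) = (3n^2 - 2n) / n^4 <= 3 / n^2, which is summable in n.
normalised = {n: m / n**4 for n, m in moments.items()}
```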

7 Convergence of random variables

Definition (weak convergence). A sequence of probability measures $(\mu_n)_{n \geq 1}$ on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ is said to converge weakly to a measure $\mu$ if for all $f \in C_b(\mathbb{R}^d)$, the set of continuous bounded functions on $\mathbb{R}^d$,

\[ \lim_{n \to \infty} \mu_n(f) = \mu(f). \]

Example.

1. Let $\mu_n = \delta_{1/n}$ on $\mathbb{R}$, where for $x \in \mathbb{R}^d$, $\delta_x$ denotes the Dirac mass at $x$, i.e. the Borel probability measure on $\mathbb{R}^d$ such that

\[ \delta_x(A) = \begin{cases} 1 & x \in A \\ 0 & x \notin A. \end{cases} \]

Then $\mu_n \to \delta_0$ weakly, since $\mu_n(f) = f(1/n) \to f(0)$ by continuity of $f$.

2. Let $\mu_n = \mathcal{N}(0, \sigma_n^2)$, the Gaussian distribution with standard deviation $\sigma_n$, where $\sigma_n \to 0$. Then again $\mu_n \to \delta_0$. Indeed,

\[
\begin{aligned}
\mu_n(f) &= \int f(x) \, d\mu_n(x) \\
&= \int f(x) \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\Big(-\frac{x^2}{2\sigma_n^2}\Big) \, dx \\
&= \int f(x\sigma_n) \frac{1}{\sqrt{2\pi}} \exp\Big(-\frac{x^2}{2}\Big) \, dx.
\end{aligned}
\]

As $\sigma_n \to 0$ we have $f(x\sigma_n) \to f(0)$ for every $x$, and $|f(x\sigma_n)| \leq \|f\|_\infty$, so by the dominated convergence theorem $\mu_n(f) \to f(0) = \delta_0(f)$.

Definition (convergence of random variables). A sequence $(X_n)_{n \geq 1}$ of $\mathbb{R}^d$-valued random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ is said to converge to a random variable $X$

1. almost surely if

\[ \lim_{n \to \infty} X_n(\omega) = X(\omega) \]

for $\mathbb{P}$-almost every $\omega$.

2. in probability or in measure if for all $\varepsilon > 0$,

\[ \lim_{n \to \infty} \mathbb{P}(\|X_n - X\| > \varepsilon) = 0. \]

Note that all norms on $\mathbb{R}^d$ are equivalent so we don't have to specify one.

3. in distribution or in law if $\mu_{X_n} \to \mu_X$ weakly, where $\mu_X = X_* \mathbb{P}$ is the law of $X$, a Borel probability measure on $\mathbb{R}^d$. Equivalently, for all $f \in C_b(\mathbb{R}^d)$, $\mathbb{E}(f(X_n)) \to \mathbb{E}(f(X))$.

Proposition 7.1. 1 $\implies$ 2 $\implies$ 3.

Proof.

1. 1 $\implies$ 2:

\[ \mathbb{P}(\|X_n - X\| > \varepsilon) = \mathbb{E}(1_{\|X_n - X\| > \varepsilon}) \]

so if $X_n \to X$ almost surely then

\[ 1_{\|X_n - X\| > \varepsilon} \to 0 \]

$\mathbb{P}$-almost everywhere, so by the dominated convergence theorem $\mathbb{P}(\|X_n - X\| > \varepsilon) \to 0$.

2. 2 $\implies$ 3: given $f \in C_b(\mathbb{R}^d)$, we need to show that $\mu_{X_n}(f) \to \mu_X(f)$. But

\[ \mu_{X_n}(f) - \mu_X(f) = \mathbb{E}(f(X_n) - f(X)). \]

To bound this, note that $f$ is continuous and $\mathbb{R}^d$ is locally compact, so $f$ is locally uniformly continuous. In particular for all $\varepsilon > 0$ there exists $\delta > 0$ such that if $\|x\| < \frac{1}{\varepsilon}$ and $\|y - x\| < \delta$ then $|f(y) - f(x)| < \varepsilon$. Thus

\[
\begin{aligned}
|\mathbb{E}(f(X_n) - f(X))| \leq{} & \mathbb{E}\big(1_{\|X_n - X\| < \delta} 1_{\|X\| < 1/\varepsilon} \underbrace{|f(X_n) - f(X)|}_{< \varepsilon}\big) \\
& + 2 \underbrace{\|f\|_\infty}_{< \infty} \Big(\mathbb{P}(\|X_n - X\| \geq \delta) + \mathbb{P}\Big(\|X\| \geq \frac{1}{\varepsilon}\Big)\Big)
\end{aligned}
\]

so, since $\mathbb{P}(\|X_n - X\| \geq \delta) \to 0$ as $n \to \infty$,

\[ \limsup_{n \to \infty} |\mathbb{E}(f(X_n) - f(X))| \leq \varepsilon + 2\|f\|_\infty \underbrace{\mathbb{P}\Big(\|X\| \geq \frac{1}{\varepsilon}\Big)}_{\to 0 \text{ as } \varepsilon \to 0}. \]

As $\varepsilon$ is arbitrary, the lim sup is $0$.

Remark. When $d = 1$, 3 is equivalent to $F_{X_n}(x) \to F_X(x)$ as $n \to \infty$ for all $x$ at which $F_X$ is continuous. See example sheet 3.

The strict converses do not hold but we can say something weaker.

Proposition 7.2. If $X_n \to X$ in probability then there is a subsequence $(n_k)_k$ such that $X_{n_k} \to X$ almost surely as $k \to \infty$.

Proof. We know that for all $\varepsilon > 0$, $\mathbb{P}(\|X_n - X\| > \varepsilon) \to 0$ as $n \to \infty$. So for all $k$ there exists $n_k$ (which we may take strictly increasing in $k$) such that

\[ \mathbb{P}\Big(\|X_{n_k} - X\| > \frac{1}{k}\Big) \leq \frac{1}{2^k} \]

so

\[ \sum_{k \geq 1} \mathbb{P}\Big(\|X_{n_k} - X\| > \frac{1}{k}\Big) < \infty \]

so by the first Borel-Cantelli lemma

\[ \mathbb{P}\Big(\|X_{n_k} - X\| > \frac{1}{k} \text{ io.}\Big) = 0. \]

This means that with probability 1, $\|X_{n_k} - X\| \leq \frac{1}{k}$ for all sufficiently large $k$, and hence $\|X_{n_k} - X\| \to 0$ as $k \to \infty$.
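A standard illustration, not taken from the notes, is the "typewriter" sequence on $([0, 1), \text{Lebesgue})$: writing $n = 2^k + j$ with $0 \leq j < 2^k$, let $X_n$ be the indicator of $[j/2^k, (j+1)/2^k)$. It converges to $0$ in probability but at no point $\omega$, while the subsequence $n_k = 2^k$ converges almost surely. A minimal sketch, with hypothetical helper names:

```python
from fractions import Fraction

def block(n):
    """Write n = 2^k + j with 0 <= j < 2^k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - 2**k

def X(n, omega):
    """Indicator of [j / 2^k, (j + 1) / 2^k) evaluated at omega in [0, 1)."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= omega < Fraction(j + 1, 2**k) else 0

def prob_nonzero(n):
    """Lebesgue measure of {X_n > eps} for any 0 < eps < 1, namely 2^-k."""
    k, _ = block(n)
    return Fraction(1, 2**k)

omega = Fraction(1, 3)

# Along the full sequence X_n(omega) = 1 exactly once in every dyadic block,
# so X_n(omega) has no limit.
hits = [n for n in range(1, 2**10) if X(n, omega) == 1]

# Along n_k = 2^k the sets are [0, 2^-k), with measure 2^-k as in the proof,
# and X_{n_k}(omega) is eventually 0.
subsequence = [X(2**k, omega) for k in range(10)]
```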

Definition (convergence in mean). Let $(X_n)_{n \geq 1}$ and $X$ be $\mathbb{R}^d$-valued integrable random variables. We say that $(X_n)_n$ converges in mean or in $L^1$ to $X$ if

\[ \lim_{n \to \infty} \mathbb{E}(\|X_n - X\|) = 0. \]

Remark.

1. If $X_n \to X$ in mean then $X_n \to X$ in probability by Markov inequality:

\[ \varepsilon \cdot \mathbb{P}(\|X_n - X\| > \varepsilon) \leq \mathbb{E}(\|X_n - X\|). \]

2. The converse is false. For example take $\Omega = (0, 1)$, $\mathcal{F} = \mathcal{B}(\Omega)$ and $\mathbb{P}$ Lebesgue measure. Let $X_n = n 1_{[0, \frac{1}{n}]}$. Then $X_n \to 0$ almost surely (hence in probability) but $\mathbb{E}X_n = 1$ for all $n$.

When does convergence in probability imply convergence in mean? We need some kind of domination assumption.

Definition (uniformly integrable). A sequence of random variables $(X_n)_{n \geq 1}$ is uniformly integrable if

\[ \lim_{M \to \infty} \limsup_{n \to \infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) = 0. \]

Remark. If $(X_n)_{n \geq 1}$ are dominated, namely there exists an integrable random variable $Y \geq 0$ such that $\|X_n\| \leq Y$ for all $n$, then $(X_n)_n$ is uniformly integrable by the dominated convergence theorem:

\[ \mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) \leq \mathbb{E}(Y 1_{Y \geq M}) \to 0 \]

as $M \to \infty$.

Theorem 7.3. Let $(X_n)_{n \geq 1}$ be a sequence of $\mathbb{R}^d$-valued integrable random variables. Let $X$ be another random variable. Then TFAE:

1. $X$ is integrable and $X_n \to X$ in mean,

2. $(X_n)_{n \geq 1}$ is uniformly integrable and $X_n \to X$ in probability.

Proof.

• 1 $\implies$ 2: Convergence in probability follows from Markov inequality, so it is left to show uniform integrability:

\[
\begin{aligned}
\mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) &\leq \mathbb{E}(\|X_n - X\| 1_{\|X_n\| \geq M}) + \mathbb{E}(\|X\| 1_{\|X_n\| \geq M}) \\
&\leq \mathbb{E}(\|X_n - X\|) + \mathbb{E}\big(\|X\| 1_{\|X_n\| \geq M}(1_{\|X\| \leq \frac{M}{2}} + 1_{\|X\| > \frac{M}{2}})\big) \\
&\leq \mathbb{E}(\|X_n - X\|) + \mathbb{E}\big(\|X\| 1_{\|X_n - X\| \geq \frac{M}{2}} 1_{\|X\| \leq \frac{M}{2}}\big) + \mathbb{E}\big(\|X\| 1_{\|X\| \geq \frac{M}{2}}\big) \\
&\leq \mathbb{E}(\|X_n - X\|) + \frac{M}{2} \mathbb{P}\Big(\|X_n - X\| \geq \frac{M}{2}\Big) + \mathbb{E}\big(\|X\| 1_{\|X\| \geq \frac{M}{2}}\big).
\end{aligned}
\]

Taking lim sup,

\[ \limsup_{n \to \infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) \leq 0 + 0 + \mathbb{E}\big(\|X\| 1_{\|X\| \geq \frac{M}{2}}\big) \to 0 \]

as $M \to \infty$ by the dominated convergence theorem.

• 2 $\implies$ 1: We prove first that $X$ is integrable. By the previous proposition, we can find a subsequence $(n_k)_k$ such that $X_{n_k} \to X$ almost surely. By Fatou's lemma,

\[ \mathbb{E}(\|X\| 1_{\|X\| \geq M}) \leq \liminf_{k \to \infty} \mathbb{E}(\|X_{n_k}\| 1_{\|X_{n_k}\| \geq M}) \]

which goes to $0$ as $M \to \infty$ by the uniform integrability assumption. Thus

\[ \mathbb{E}(\|X\|) \leq M + \mathbb{E}(\|X\| 1_{\|X\| \geq M}) < \infty \]

for $M$ sufficiently large. Thus $X$ is integrable.

To show convergence in mean, we use the same trick of splitting into small and big parts:

\[
\begin{aligned}
\mathbb{E}(\|X_n - X\|) &= \mathbb{E}\big((1_{\|X_n - X\| \leq \varepsilon} + 1_{\|X_n - X\| > \varepsilon})\|X_n - X\|\big) \\
&\leq \varepsilon + \mathbb{E}\big(1_{\|X_n - X\| > \varepsilon}\|X_n - X\|(1_{\|X_n\| \leq M} + 1_{\|X_n\| > M})\big) \\
&= \varepsilon + \dagger + \ddagger
\end{aligned}
\]

where

\[
\begin{aligned}
\dagger &= \mathbb{E}(\|X_n - X\| 1_{\|X_n - X\| > \varepsilon} 1_{\|X_n\| \leq M}) \\
&\leq \mathbb{E}\big((M + \|X\|) 1_{\|X_n - X\| > \varepsilon}(1_{\|X\| \leq M} + 1_{\|X\| > M})\big) \\
&\leq 2M\,\mathbb{P}(\|X_n - X\| > \varepsilon) + 2\,\mathbb{E}(\|X\| 1_{\|X\| > M})
\end{aligned}
\]

so

\[ \limsup_{n \to \infty} \dagger \leq 2\,\mathbb{E}(\|X\| 1_{\|X\| > M}) \to 0 \]

as $M \to \infty$. On the other hand

\[
\begin{aligned}
\ddagger &= \mathbb{E}(\|X_n - X\| 1_{\|X_n - X\| > \varepsilon} 1_{\|X_n\| > M}) \\
&\leq \mathbb{E}\big((\|X_n\| + \|X\|) 1_{\|X_n\| \geq M}\big) \\
&\leq \mathbb{E}\big(\|X_n\| 1_{\|X_n\| \geq M} + \|X\| 1_{\|X\| > M} + \|X\| 1_{\|X_n\| \geq M} 1_{\|X\| \leq M}\big) \\
&\leq 2\,\mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) + \mathbb{E}(\|X\| 1_{\|X\| > M})
\end{aligned}
\]

and taking lim sup,

\[ \limsup_{n \to \infty} \ddagger \leq 2 \limsup_{n \to \infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) + \mathbb{E}(\|X\| 1_{\|X\| > M}) \to 0 + 0 \]

as $M \to \infty$. Hence $\limsup_{n \to \infty} \mathbb{E}(\|X_n - X\|) \leq \varepsilon$ for every $\varepsilon > 0$, i.e. $X_n \to X$ in mean.

Definition. We say that a sequence of random variables $(X_n)_{n \geq 1}$ is bounded in $L^p$ if there exists $C > 0$ such that $\mathbb{E}(\|X_n\|^p) \leq C$ for all $n$.

Proposition 7.4. If $p > 1$ and $(X_n)_{n \geq 1}$ is bounded in $L^p$ then $(X_n)_{n \geq 1}$ is uniformly integrable.

Proof.

\[ M^{p-1}\,\mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) \leq \mathbb{E}(\|X_n\|^p 1_{\|X_n\| \geq M}) \leq \mathbb{E}(\|X_n\|^p) \leq C \]

so

\[ \limsup_{n \to \infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \geq M}) \leq \frac{C}{M^{p-1}} \to 0 \]

as $M \to \infty$.

This provides a sufficient condition for uniform integrability and thus, together with convergence in probability, for convergence in mean.
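To make the role of $p > 1$ concrete, compare two families on $((0, 1), \text{Lebesgue})$ of the form $X_n = a_n 1_{[0, 1/n]}$. The sketch below uses a maximum over finitely many $n$ as a stand-in for the lim sup in the definition; this truncation and the helper names are our assumptions for illustration.

```python
from fractions import Fraction

# X_n = a_n * 1_{[0, 1/n]} takes the value a_n with probability 1/n and 0 otherwise.
def tail_mean(a_n, n, M):
    """E(X_n 1_{X_n >= M}) for X_n = a_n * 1_{[0, 1/n]}."""
    return Fraction(a_n) / n if a_n >= M else Fraction(0)

def sup_tail(a, M, n_max=10**4):
    """max over 1 <= n <= n_max of E(X_n 1_{X_n >= M}), with a_n = a(n)."""
    return max(tail_mean(a(n), n, M) for n in range(1, n_max + 1))

# a_n = n: bounded in L^1 (E X_n = 1) but the tail mean equals 1 for every n >= M,
# so the family is not uniformly integrable.
not_ui = [sup_tail(lambda n: n, M) for M in (10, 100, 1000)]

# a_n = sqrt(n), restricted to perfect squares n = m^2 to keep values rational:
# E(X_n^2) = 1, so the family is bounded in L^2, and the tail mean is 1/m <= 1/M.
def sup_tail_sqrt(M, m_max=10**3):
    return max(tail_mean(m, m * m, M) for m in range(1, m_max + 1))

ui = [sup_tail_sqrt(M) for M in (10, 100, 1000)]
```

The second family matches the bound $C / M^{p-1}$ from the proof with $p = 2$ and $C = 1$.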

8 $L^p$ spaces

Recall that $\varphi : I \to \mathbb{R}$ is convex means that for all $x, y \in I$ and all $t \in [0, 1]$,

\[ \varphi(tx + (1-t)y) \leq t\varphi(x) + (1-t)\varphi(y). \]

Proposition 8.1 (Jensen inequality). Let $I$ be an open interval of $\mathbb{R}$ and $\varphi : I \to \mathbb{R}$ a convex function. Let $X$ be a random variable on $(\Omega, \mathcal{F}, \mathbb{P})$. Assume $X$ is integrable and takes values in $I$. Then

\[ \mathbb{E}(\varphi(X)) \geq \varphi(\mathbb{E}(X)). \]

Remark.

1. As $X \in I$ almost surely and $I$ is an interval, we have $\mathbb{E}(X) \in I$.

2. We'll show that $\varphi(X)^-$ is integrable so

\[ \mathbb{E}(\varphi(X)) = \mathbb{E}(\varphi(X)^+) - \mathbb{E}(\varphi(X)^-) \]

is well-defined, with the possibility that both sides are $+\infty$.

Lemma 8.2. TFAE:

1. $\varphi$ is convex,

2. there exists a family $\mathcal{F}$ of affine functions ($x \mapsto ax + b$) such that $\varphi = \sup_{\ell \in \mathcal{F}} \ell$ on $I$.

Proof.

1. 2 $\implies$ 1: every $\ell$ is affine, hence convex, and a supremum of convex functions is convex: for each $\ell \in \mathcal{F}$,

\[ \ell(tx + (1-t)y) = t\ell(x) + (1-t)\ell(y) \leq t \sup_{\ell \in \mathcal{F}} \ell(x) + (1-t) \sup_{\ell \in \mathcal{F}} \ell(y) \]

so

\[
\begin{aligned}
\varphi(tx + (1-t)y) &= \sup_{\ell \in \mathcal{F}} \ell(tx + (1-t)y) \\
&\leq t \sup_{\ell \in \mathcal{F}} \ell(x) + (1-t) \sup_{\ell \in \mathcal{F}} \ell(y) \\
&= t\varphi(x) + (1-t)\varphi(y).
\end{aligned}
\]

2. 1 $\implies$ 2: We need to show that for all $x_0 \in I$ we can find an affine function

\[ \ell_{x_0}(x) = \theta_{x_0}(x - x_0) + \varphi(x_0), \]

where $\theta_{x_0}$ is morally the slope at $x_0$, such that $\varphi(x) \geq \ell_{x_0}(x)$ for all $x \in I$. Then we have $\varphi = \sup_{x_0 \in I} \ell_{x_0}$, since $\ell_{x_0}(x_0) = \varphi(x_0)$.

To find $\theta_{x_0}$ observe that for all $x < x_0 < y$ where $x, y \in I$, we have

\[ \frac{\varphi(x_0) - \varphi(x)}{x_0 - x} \leq \frac{\varphi(y) - \varphi(x_0)}{y - x_0}. \]

Indeed this is the convexity of $\varphi$ on $[x, y]$, applied to $x_0 = (1-t)x + ty$ with $t = \frac{x_0 - x}{y - x}$. This holds for all $x < x_0$, $y > x_0$, so there exists $\theta \in \mathbb{R}$ such that

\[ \frac{\varphi(x_0) - \varphi(x)}{x_0 - x} \leq \theta \leq \frac{\varphi(y) - \varphi(x_0)}{y - x_0} \]

for all such $x$ and $y$. Then just set $\ell_{x_0}(x) = \theta(x - x_0) + \varphi(x_0)$. By construction $\varphi(x) \geq \ell_{x_0}(x)$ for all $x \in I$.

Proof of Jensen inequality. Write $\varphi(x) = \sup_{\ell \in \mathcal{F}} \ell(x)$ where each $\ell$ is affine. Then

\[ \mathbb{E}(\varphi(X)) \geq \mathbb{E}(\ell(X)) = \ell(\mathbb{E}(X)) \]

for all $\ell \in \mathcal{F}$, so taking the supremum,

\[ \mathbb{E}(\varphi(X)) \geq \sup_{\ell \in \mathcal{F}} \ell(\mathbb{E}(X)) = \varphi(\mathbb{E}(X)). \]

Also for the remark,

\[ -\varphi = -\sup_{\ell \in \mathcal{F}} \ell = \inf_{\ell \in \mathcal{F}} (-\ell) \]

so $\varphi^- = (-\varphi)^+ \leq |\ell|$ for all $\ell \in \mathcal{F}$. Then, writing $\ell(x) = ax + b$,

\[ \varphi(X)^- \leq |\ell(X)| \leq |a||X| + |b|. \]

As $X$ is integrable, $\varphi(X)^-$ is integrable.
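The mechanism of the proof (a convex function dominates its supporting lines, and expectations commute with affine maps) can be checked exactly on a finite law. In the sketch below the law of $X$ and the choice $\varphi(x) = x^2$ are our own illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical finite law of X (value -> probability), for illustration only.
law = {-1: Fraction(1, 4), 0: Fraction(1, 4), 3: Fraction(1, 2)}

def expect(g):
    """E(g(X)) for the finite law above."""
    return sum(p * g(x) for x, p in law.items())

def phi(x):
    return x * x          # a convex function on R

def tangent(x0):
    """Supporting line of phi at x0: l(x) = theta (x - x0) + phi(x0), theta = 2 x0."""
    return lambda x: 2 * x0 * (x - x0) + phi(x0)

mean = expect(lambda x: x)

# phi dominates each supporting line, and E(l(X)) = l(E X) because l is affine.
lines = [tangent(Fraction(k, 2)) for k in range(-6, 7)]
dominates = all(phi(x) >= l(x) for l in lines for x in law)
affine_commutes = all(expect(l) == l(mean) for l in lines)

# Jensen: E(phi(X)) >= phi(E X); the supporting line at x0 = E X touches phi there.
jensen_gap = expect(phi) - phi(mean)
```

For $\varphi(x) = x^2$ the gap `jensen_gap` is exactly $\operatorname{Var}(X)$.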

Jensen inequality is for probability spaces only. The following applies to all measure spaces.

Proposition 8.3 (Minkowski inequality). Let $(X, \mathcal{A}, \mu)$ be a measure space and $f, g$ measurable functions on $X$. Let $p \in [1, \infty)$ and define the $p$-norm

\[ \|f\|_p = \Big(\int_X |f|^p \, d\mu\Big)^{1/p}. \]

Then

\[ \|f + g\|_p \leq \|f\|_p + \|g\|_p. \]

Proof. wlog assume $\|f\|_p, \|g\|_p \neq 0$ and both are finite. Dividing by $\|f\|_p + \|g\|_p$, we need to show

\[ \bigg\| \frac{\|f\|_p}{\|f\|_p + \|g\|_p} \frac{f}{\|f\|_p} + \frac{\|g\|_p}{\|f\|_p + \|g\|_p} \frac{g}{\|g\|_p} \bigg\|_p \leq 1. \]

It suffices to show that for all $t \in [0, 1]$ and all measurable $F, G$ such that $\|F\|_p = \|G\|_p = 1$, we have

\[ \|tF + (1-t)G\|_p \leq 1, \]

"the unit ball is convex". For this note that

\[ [0, +\infty) \to [0, +\infty), \qquad x \mapsto x^p \]

is convex if $p \geq 1$, so

\[ |tF + (1-t)G|^p \leq (t|F| + (1-t)|G|)^p \leq t|F|^p + (1-t)|G|^p \]

and

\[ \int_X |tF + (1-t)G|^p \, d\mu \leq t \underbrace{\int_X |F|^p \, d\mu}_{=1} + (1-t) \underbrace{\int_X |G|^p \, d\mu}_{=1} = 1. \]

Proposition 8.4 (Hölder inequality). Suppose $(X, \mathcal{A}, \mu)$ is a measure space and let $f, g$ be measurable functions on $X$. Given $p, q \in (1, \infty)$ such that $\frac{1}{p} + \frac{1}{q} = 1$,

\[ \int_X |fg| \, d\mu \leq \Big(\int_X |f|^p \, d\mu\Big)^{1/p} \Big(\int_X |g|^q \, d\mu\Big)^{1/q} \]

with equality if and only if there exists $(\alpha, \beta) \neq (0, 0)$ such that $\alpha|f|^p = \beta|g|^q$ $\mu$-almost everywhere.

Lemma 8.5 (Young inequality). For all $p, q \in (1, \infty)$ such that $\frac{1}{p} + \frac{1}{q} = 1$, and for all $a, b \geq 0$, we have

\[ ab \leq \frac{a^p}{p} + \frac{b^q}{q}. \]

Proof of Hölder inequality. wlog assume $\|f\|_p, \|g\|_q \neq 0$ and both are finite. Both sides of the inequality are homogeneous in $f$ and in $g$, so by rescaling $f$ and $g$ we may assume wlog $\|f\|_p = \|g\|_q = 1$. Then by Young inequality,

\[ |fg| \leq \frac{1}{p}|f|^p + \frac{1}{q}|g|^q \]

so

\[ \int_X |fg| \, d\mu \leq \frac{1}{p} \int_X |f|^p \, d\mu + \frac{1}{q} \int_X |g|^q \, d\mu = \frac{1}{p} + \frac{1}{q} = 1. \]

Remark. Applying Jensen inequality to the convex function $\varphi(x) = x^{p'/p}$ and the random variable $|X|^p$, for $p' > p$, we have

\[ \mathbb{E}(|X|^p)^{1/p} \leq \mathbb{E}(|X|^{p'})^{1/p'}, \]

so the function $p \mapsto \mathbb{E}(|X|^p)^{1/p}$ is non-decreasing. This can be used, for example, to show that if $X$ has finite $p'$th moment then it has finite $p$th moment for $p' \geq p$.
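Both Hölder inequality and the monotonicity of $p \mapsto \mathbb{E}(|X|^p)^{1/p}$ are easy to test numerically on a finite probability space. The sketch below uses the uniform measure on five points with arbitrary sample values of our choosing, and a small floating-point tolerance.

```python
# Uniform probability measure on a finite set: E(h) is the average of h.
xs = [0.5, -1.0, 2.0, 3.5, -0.25]   # hypothetical values of f
ys = [1.0, 4.0, -0.5, 0.75, 2.0]    # hypothetical values of g

def mean(values):
    return sum(values) / len(values)

def norm(values, p):
    """(E|h|^p)^(1/p) under the uniform probability measure."""
    return mean([abs(v) ** p for v in values]) ** (1.0 / p)

# Hoelder: E|fg| <= ||f||_p ||g||_q for conjugate exponents 1/p + 1/q = 1.
holder_ok = all(
    mean([abs(x * y) for x, y in zip(xs, ys)])
    <= norm(xs, p) * norm(ys, p / (p - 1)) + 1e-12
    for p in (1.5, 2.0, 3.0, 10.0)
)

# Monotonicity on a probability space: p -> (E|f|^p)^(1/p) is non-decreasing.
exponents = [1.0, 1.5, 2.0, 3.0, 5.0, 10.0]
norms = [norm(xs, p) for p in exponents]
monotone_ok = all(a <= b + 1e-12 for a, b in zip(norms, norms[1:]))
```

Monotonicity relies on the total mass being 1; it fails in general for measures that are not probability measures.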

Definition. Let $(X, \mathcal{A}, \mu)$ be a measure space.

• For $p \geq 1$,

\[ \mathcal{L}^p(X, \mathcal{A}, \mu) = \{f : X \to \mathbb{R} \text{ measurable such that } |f|^p \text{ is } \mu\text{-integrable}\}. \]

• For $p = \infty$,

\[ \mathcal{L}^\infty(X, \mathcal{A}, \mu) = \{f : X \to \mathbb{R} \text{ measurable such that } \operatorname{essup} |f| < \infty\} \]

where

\[ \operatorname{essup} |f| = \inf\{t : |f(x)| \leq t \text{ for } \mu\text{-almost every } x\}. \]

Lemma 8.6. $\mathcal{L}^p(X, \mathcal{A}, \mu)$ is an $\mathbb{R}$-vector space.

Proof. For $p < \infty$ use Minkowski inequality. Similarly, for $p = \infty$ we can check that $\|f + g\|_\infty \leq \|f\|_\infty + \|g\|_\infty$ for all $f, g$.

Definition. We say $f$ and $g$ are $\mu$-equivalent and write $f \equiv_\mu g$ if for $\mu$-almost every $x$, $f(x) = g(x)$.

Check that this is an equivalence relation stable under addition and multiplication.

Definition ($L^p$-space). Define

\[ L^p(X, \mathcal{A}, \mu) = \mathcal{L}^p(X, \mathcal{A}, \mu) / \equiv_\mu \]

and if $[f]$ denotes the equivalence class of $f$ under $\equiv_\mu$ we define

\[ \|[f]\|_p = \|f\|_p. \]

Proposition 8.7. For $p \in [1, \infty]$, $L^p(X, \mathcal{A}, \mu)$ is a normed vector space under $\|\cdot\|_p$ and it is complete, i.e. it is a Banach space.

Proof. If $f \equiv_\mu g$ then $\|f\|_p = \|g\|_p$ so $\|\cdot\|_p$ on $L^p$ is well-defined. The triangle inequality follows from Minkowski inequality and homogeneity is obvious, so $\|\cdot\|_p$ is indeed a norm.

For completeness, pick $(f_n)_n$ a Cauchy sequence in $\mathcal{L}^p(X, \mathcal{A}, \mu)$. We need to show that there exists $f \in \mathcal{L}^p$ such that $\|f_n - f\|_p \to 0$ as $n \to \infty$. This then implies that $[f_n] \to [f]$ in $L^p$.

We can extract a subsequence $n_k \uparrow \infty$ such that $\|f_{n_{k+1}} - f_{n_k}\|_p \leq 2^{-k}$. Let

\[ S_K = \sum_{k=1}^K |f_{n_{k+1}} - f_{n_k}| \]

then

\[ \|S_K\|_p \leq \sum_{k=1}^K \|f_{n_{k+1}} - f_{n_k}\|_p \leq \sum_{k=1}^K 2^{-k} \leq 1 \]

so by monotone convergence,

\[ \lim_{K \to \infty} \int_X |S_K|^p \, d\mu = \int_X |S_\infty|^p \, d\mu \leq 1, \]

i.e. $S_\infty \in \mathcal{L}^p$. In particular for $\mu$-almost every $x$, $|S_\infty(x)| < \infty$, i.e.

\[ \sum_{k \geq 1} |f_{n_{k+1}}(x) - f_{n_k}(x)| < \infty. \]

Hence $(f_{n_k}(x))_k$ is a Cauchy sequence in $\mathbb{R}$. By completeness of $\mathbb{R}$, the limit exists and we set $f(x)$ to be it. When this limit does not exist set $f(x) = 0$. We then have, in case $p < \infty$, by Fatou's lemma

\[ \|f_n - f\|_p \leq \liminf_{k \to \infty} \|f_n - f_{n_k}\|_p \leq \varepsilon \]

for any $\varepsilon > 0$, for $n$ sufficiently large, since $(f_n)_n$ is Cauchy. Thus

\[ \lim_{n \to \infty} \|f_n - f\|_p = 0 \]

and in particular $f = f_n - (f_n - f) \in \mathcal{L}^p$.

When $p = \infty$, we use the fact that if $f_n \to f$ $\mu$-almost everywhere then

\[ \|f\|_\infty \leq \liminf_{n \to \infty} \|f_n\|_\infty. \]

Proof. Let $t > \liminf_{n \to \infty} \|f_n\|_\infty$. Then there exists $n_k \uparrow \infty$ such that

\[ \|f_{n_k}\|_\infty = \operatorname{essup} |f_{n_k}| = \inf\{s \geq 0 : |f_{n_k}(x)| \leq s \text{ for } \mu\text{-almost every } x\} < t \]

for all $k$. Thus for all $k$, for $\mu$-almost every $x$, $|f_{n_k}(x)| < t$. But by $\sigma$-additivity of $\mu$ we can swap the quantifiers, i.e. for $\mu$-almost every $x$, for all $k$, $|f_{n_k}(x)| < t$. Thus for $\mu$-almost every $x$, $|f(x)| \leq t$, i.e. $\|f\|_\infty \leq t$.

Proposition 8.8 (approximation by simple functions). Let $p \in [1, \infty)$. Let $V$ be the linear span of all simple functions on $X$. Then $V \cap \mathcal{L}^p$ is dense in $\mathcal{L}^p$.

Proof. Note that $g \in \mathcal{L}^p$ implies $g^+, g^- \in \mathcal{L}^p$. Thus by writing $f = f^+ - f^-$ and using Minkowski inequality, it suffices to show that every $f \geq 0$ in $\mathcal{L}^p$ is the limit in $\mathcal{L}^p$ of a sequence of simple functions.

Recall there exist simple functions $0 \leq g_n \leq f$ such that $g_n(x) \uparrow f(x)$ for $\mu$-almost every $x$. Then

\[ \lim_{n \to \infty} \|g_n - f\|_p^p = \lim_{n \to \infty} \int_X |g_n - f|^p \, d\mu = 0 \]

by the dominated convergence theorem ($|g_n - f| \leq g_n + f \leq 2f$, so $|g_n - f|^p \leq 2^p f^p$, which is $\mu$-integrable).

Remark. When $X = \mathbb{R}^d$, $\mathcal{A} = \mathcal{B}(\mathbb{R}^d)$ and $\mu$ is the Lebesgue measure, $C_c(\mathbb{R}^d)$, the space of continuous functions with compact support, is dense in $\mathcal{L}^p(X, \mathcal{A}, \mu)$ when $p \in [1, \infty)$ (this does not hold for $p = \infty$: a nonzero constant function does not have compact support and stays at distance at least $|c|$ from $C_c(\mathbb{R}^d)$ in $\|\cdot\|_\infty$, where $c$ is its value). See example sheet. In fact, $C_c^\infty(\mathbb{R}^d)$ suffices.

9 Hilbert space and $L^2$-methods

Definition (inner product). Let $V$ be a complex vector space. A Hermitian inner product on $V$ is a map

\[ V \times V \to \mathbb{C}, \qquad (x, y) \mapsto \langle x, y \rangle \]

such that

1. $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ for all $\alpha, \beta \in \mathbb{C}$, for all $x, y, z \in V$.

2. $\langle y, x \rangle = \overline{\langle x, y \rangle}$.

3. $\langle x, x \rangle \in \mathbb{R}$ and $\langle x, x \rangle \geq 0$, with equality if and only if $x = 0$.

Definition (Hermitian norm). The Hermitian norm is defined as $\|x\| = \sqrt{\langle x, x \rangle}$.

Lemma 9.1. Properties of norm:

1. homogeneity: $\|\lambda x\| = |\lambda| \|x\|$ for all $\lambda \in \mathbb{C}$, $x \in V$,

2. Cauchy-Schwarz: $|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$,

3. triangle inequality: $\|x + y\| \leq \|x\| + \|y\|$,

4. parallelogram identity: $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$.

Proof. Exercise. For reference see author's notes on IID Linear Analysis.

Corollary 9.2. $(V, \|\cdot\|)$ is a normed vector space.

Definition (Hilbert space). We say $(V, \|\cdot\|)$ is a Hilbert space if it is complete.

Example. Let $V = L^2(X, \mathcal{A}, \mu)$ where $(X, \mathcal{A}, \mu)$ is a measure space. Then we can define

\[ \langle f, g \rangle = \int_X f \overline{g} \, d\mu \]

which is well-defined (i.e. finite) by Cauchy-Schwarz. The axioms are easy to check, with positive-definiteness given by

\[ 0 = \langle f, f \rangle = \int_X |f|^2 \, d\mu \]

if and only if $f = 0$ $\mu$-almost everywhere, i.e. $f = 0$ in $L^2$.

Proposition 9.3. Let $H$ be a Hilbert space and let $\mathcal{C}$ be a closed convex subset of $H$. Then for all $x \in H$, there exists a unique $y \in \mathcal{C}$ such that

\[ \|x - y\| = d(x, \mathcal{C}) \]

where by definition

\[ d(x, \mathcal{C}) = \inf_{c \in \mathcal{C}} \|x - c\|. \]

This $y$ is called the orthogonal projection of $x$ on $\mathcal{C}$.

Proof. Let $c_n \in \mathcal{C}$ be a sequence such that $\|x - c_n\| \to d(x, \mathcal{C})$. We show that $(c_n)_n$ is a Cauchy sequence. By the parallelogram identity,

\[ \left\| \frac{x - c_n}{2} + \frac{x - c_m}{2} \right\|^2 + \left\| \frac{x - c_n}{2} - \frac{x - c_m}{2} \right\|^2 = \frac{1}{2}\left(\|x - c_n\|^2 + \|x - c_m\|^2\right) \]

so

\[ \underbrace{\left\| x - \frac{c_n + c_m}{2} \right\|^2}_{\ge d(x, \mathcal{C})^2} + \frac{1}{4}\|c_n - c_m\|^2 = \underbrace{\frac{1}{2}\left(\|x - c_n\|^2 + \|x - c_m\|^2\right)}_{\to d(x, \mathcal{C})^2} \]

where the first term is at least $d(x, \mathcal{C})^2$ because $\frac{c_n + c_m}{2} \in \mathcal{C}$ by convexity. Hence

\[ \lim_{n, m \to \infty} \|c_n - c_m\| = 0 \]

i.e. $(c_n)_n$ is Cauchy. By completeness the limit $\lim_{n \to \infty} c_n = y \in H$ exists. As $\mathcal{C}$ is closed, $y \in \mathcal{C}$. As $\|x - c_n\| \to d(x, \mathcal{C})$, we get $\|x - y\| = d(x, \mathcal{C})$. This shows existence of $y$.

For uniqueness, suppose $y, y' \in \mathcal{C}$ both attain the distance and use the parallelogram identity again:

\[ \underbrace{\left\| x - \frac{y + y'}{2} \right\|^2}_{\ge d(x, \mathcal{C})^2} + \frac{1}{4}\|y - y'\|^2 = \frac{1}{2}\left(\|x - y\|^2 + \|x - y'\|^2\right) = d(x, \mathcal{C})^2 \]

so $\|y - y'\| = 0$.
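As a concrete instance of Proposition 9.3, take $H = \mathbb{R}^3$ and $\mathcal{C}$ the closed box $[-1, 1]^3$, where the orthogonal projection is coordinatewise clamping. The sketch below (the names `project_box` and `dist` and the sample point are assumptions made for illustration) compares the projected point against randomly sampled points of the box.

```python
import math
import random

def project_box(x, lo, hi):
    """Orthogonal projection of x onto the closed convex box [lo, hi]^d."""
    return [min(max(t, lo), hi) for t in x]

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

random.seed(1)
x = [2.0, -3.0, 0.25]
y = project_box(x, -1.0, 1.0)   # clamps each coordinate into [-1, 1]
best = dist(x, y)

# No randomly sampled point of the box should be closer to x than y is.
samples = [[random.uniform(-1, 1) for _ in x] for _ in range(10_000)]
closest_sample = min(dist(x, c) for c in samples)
```

Here `best` equals $\sqrt{5}$, the distance from `x` to the box, and every sampled point is at least that far away.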

Corollary 9.4. Suppose $V \le H$ is a closed subspace of a Hilbert space $H$. Then

\[ H = V \oplus V^\perp \]

where

\[ V^\perp = \{x \in H : \langle x, v\rangle = 0 \text{ for all } v \in V\} \]

is the orthogonal of $V$.

Proof. $V \cap V^\perp = 0$ by positivity of the inner product. If $x \in H$ then, by Proposition 9.3, there exists a unique $y \in V$ such that $\|x - y\| = d(x, V)$. We need to show that $x - y \in V^\perp$.

For all $z \in V$,

\[ \|x - y - z\| \ge \|x - y\| \]

as $y + z \in V$. Squaring and expanding,

\[ \|x - y\|^2 + \|z\|^2 - 2 \operatorname{Re}\langle x - y, z\rangle \ge \|x - y\|^2. \]


Rearranging,

\[ 2 \operatorname{Re}\langle x - y, z\rangle \le \|z\|^2 \]

for all $z \in V$. Now substitute $tz$ for $z$, where $t > 0$, to get

\[ t \cdot 2 \operatorname{Re}\langle x - y, z\rangle \le t^2 \|z\|^2. \]

Dividing by $t$ and letting $t \to 0$,

\[ \operatorname{Re}\langle x - y, z\rangle \le 0. \]

Similarly, replace $z$ by $-z$ to conclude $\operatorname{Re}\langle x - y, z\rangle = 0$. Finally replace $z$ by $e^{i\theta} z$ to get $\langle x - y, z\rangle = 0$ for all $z \in V$. Thus $x - y \in V^\perp$.

Definition (bounded linear form). A linear form $\ell : H \to \mathbb{C}$ is bounded if there exists $C > 0$ such that $|\ell(x)| \le C\|x\|$ for all $x \in H$.

Remark. $\ell$ bounded is equivalent to $\ell$ continuous.

Theorem 9.5 (Riesz representation theorem). Let $H$ be a Hilbert space. For every bounded linear form $\ell$ there exists $v_0 \in H$ such that

\[ \ell(x) = \langle x, v_0\rangle \]

for all $x \in H$.

Proof. By boundedness of $\ell$, $\ker \ell$ is closed, so by Corollary 9.4 we can write

\[ H = \ker \ell \oplus (\ker \ell)^\perp. \]

If $\ell = 0$ then just pick $v_0 = 0$. Otherwise pick $x_0 \in (\ker \ell)^\perp \setminus \{0\}$; note that $\ell(x_0) \ne 0$ since $\ker \ell \cap (\ker \ell)^\perp = 0$. Then $(\ker \ell)^\perp$ is spanned by $x_0$: indeed for any $x \in (\ker \ell)^\perp$,

\[ \ell(x) = \frac{\ell(x)}{\ell(x_0)} \ell(x_0) \]

so

\[ \ell\left(x - \frac{\ell(x)}{\ell(x_0)} x_0\right) = 0 \]

so $x - \frac{\ell(x)}{\ell(x_0)} x_0 \in \ker \ell \cap (\ker \ell)^\perp = 0$. Now let

\[ v_0 = \frac{\overline{\ell(x_0)}}{\|x_0\|^2} x_0 \]

and observe that $x \mapsto \ell(x) - \langle x, v_0\rangle$ vanishes on $\ker \ell$ and on $(\ker \ell)^\perp = \mathbb{C} x_0$. Thus it is identically zero.
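In finite dimensions the representing vector can be written down directly. The sketch below takes $H = \mathbb{C}^3$ and a hypothetical linear form $\ell(x) = \sum_k a_k x_k$ (the coefficients `a` are made up for illustration); with the inner product linear in the first argument, the representing vector is the conjugate of the coefficient vector.

```python
def inner(x, y):
    """Hermitian inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# A hypothetical bounded linear form on C^3: ell(x) = sum_k a_k x_k.
a = [1 + 2j, -1j, 3.0 + 0j]

def ell(x):
    return sum(ak * xk for ak, xk in zip(a, x))

# Conjugating the coefficients gives <x, v0> = ell(x) for every x.
v0 = [ak.conjugate() for ak in a]

x = [0.5 - 1j, 2 + 0j, -1 + 1j]
gap = abs(ell(x) - inner(x, v0))
```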

Definition (absolutely continuous, singular measure). Let $(X, \mathcal{A})$ be a measurable space and let $\mu, \nu$ be two measures on $(X, \mathcal{A})$.

1. $\mu$ is absolutely continuous with respect to $\nu$, denoted $\mu \ll \nu$, if for every $A \in \mathcal{A}$, $\nu(A) = 0$ implies $\mu(A) = 0$.

2. $\mu$ is singular with respect to $\nu$, denoted $\mu \perp \nu$, if there exists $\Omega \in \mathcal{A}$ such that $\mu(\Omega) = 0$ and $\nu(\Omega^c) = 0$.

Example.

1. Let $\nu$ be the Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ and $d\mu = f \, d\nu$ where $f \ge 0$ is a Borel function. Then $\mu \ll \nu$.

2. If $\mu = \delta_{x_0}$, the Dirac mass at $x_0 \in \mathbb{R}$, and $\nu$ is the Lebesgue measure, then $\mu \perp \nu$.

Non-examinable theorem and proof:

Theorem 9.6 (Radon-Nikodym). Assume $\mu$ and $\nu$ are $\sigma$-finite measures on $(X, \mathcal{A})$.

1. If $\mu \ll \nu$ then there exists $g \ge 0$ measurable such that $d\mu = g \, d\nu$, namely

\[ \mu(A) = \int_A g(x) \, d\nu(x) \]

for all $A \in \mathcal{A}$. $g$ is called the density of $\mu$ with respect to $\nu$, or the Radon-Nikodym derivative, sometimes denoted by $g = \frac{d\mu}{d\nu}$.

2. For any $\sigma$-finite $\mu, \nu$, $\mu$ decomposes as

\[ \mu = \mu_a + \mu_s \]

where $\mu_a \ll \nu$ and $\mu_s \perp \nu$. Moreover this decomposition is unique.

Proof. Consider $H = L^2(X, \mathcal{A}, \mu + \nu)$, which is a Hilbert space. First assume $\mu$ and $\nu$ are finite. Consider the linear form

\[ \ell : H \to \mathbb{R}, \qquad f \mapsto \mu(f) = \int_X f \, d\mu. \]

$\ell$ is bounded by Cauchy-Schwarz and finiteness of the measures:

\[ |\mu(f)| \le \mu(|f|) \le (\mu + \nu)(|f|) \le \sqrt{(\mu + \nu)(X)} \cdot \sqrt{\int_X |f|^2 \, d(\mu + \nu)} = C \cdot \|f\|_H \]

so by the Riesz representation theorem, there exists $g_0 \in L^2(X, \mathcal{A}, \mu + \nu)$ such that

\[ \mu(f) = \int_X f g_0 \, d(\mu + \nu). \qquad (\ast) \]

Claim that for $(\mu + \nu)$-almost every $x$, $0 \le g_0(x) \le 1$: take $f = 1_{\{g_0 < 0\}}$ and plug it into $(\ast)$,

\[ 0 \le \mu(\{g_0 < 0\}) = \int_X \underbrace{1_{\{g_0 < 0\}} g_0}_{\le 0} \, d(\mu + \nu) \le 0 \]

so equality holds throughout. Thus $g_0 \ge 0$ $(\mu + \nu)$-almost everywhere. Similarly take $f = 1_{\{g_0 > 1 + \varepsilon\}}$ for $\varepsilon > 0$ and plug it into $(\ast)$,

\[ \begin{aligned}
(\mu + \nu)(\{g_0 > 1 + \varepsilon\}) &\ge \mu(\{g_0 > 1 + \varepsilon\}) \\
&= \int_X 1_{\{g_0 > 1 + \varepsilon\}} g_0 \, d(\mu + \nu) \\
&\ge (1 + \varepsilon)(\mu + \nu)(\{g_0 > 1 + \varepsilon\})
\end{aligned} \]


so we must have

\[ (\mu + \nu)(\{g_0 > 1 + \varepsilon\}) = 0 \]

for every $\varepsilon > 0$, i.e. $g_0 \le 1$ $(\mu + \nu)$-almost everywhere.

Now set $\Omega = \{x \in X : g_0(x) \in [0, 1)\}$, so on $\Omega^c$, $g_0 = 1$ $(\mu + \nu)$-almost everywhere. Then $(\ast)$ is equivalent to

\[ \int_X f (1 - g_0) \, d\mu = \int_X f g_0 \, d\nu \]

for all $f \in L^2(X, \mathcal{A}, \mu + \nu)$. Hence, by monotone convergence, this holds for all measurable $f \ge 0$. Now replace $f$ by $\frac{f}{1 - g_0} 1_\Omega$ to get

\[ \int_\Omega f \, d\mu = \int_\Omega f \frac{g_0}{1 - g_0} \, d\nu. \qquad (\ast\ast) \]

Set

\[ \mu_a(A) = \mu(A \cap \Omega), \qquad \mu_s(A) = \mu(A \cap \Omega^c). \]

Clearly $\mu = \mu_a + \mu_s$. Claim that this is the required decomposition, i.e.

1. $\mu_a \ll \nu$,

2. $\mu_s \perp \nu$,

3. $d\mu_a = g \, d\nu$ where $g = \frac{g_0}{1 - g_0} 1_\Omega$.

Proof.

1. If $\nu(A) = 0$, set $f = 1_A$ and plug into $(\ast\ast)$ to get $\mu(A \cap \Omega) = 0$, namely $\mu_a(A) = 0$.

2. Set $f = 1_{\Omega^c}$. On $\Omega^c$, $g_0 = 1$ $(\mu + \nu)$-almost everywhere. Plugging this into $(\ast)$ gives $\mu(\Omega^c) = (\mu + \nu)(\Omega^c)$, so $\nu(\Omega^c) = 0$. But $\mu_s(\Omega) = 0$, so $\mu_s \perp \nu$.

3. $(\ast\ast)$ is equivalent to $d\mu_a = g \, d\nu$ where $g = \frac{g_0}{1 - g_0} 1_\Omega$.

This settles part 2 of the theorem, and also part 1, since if $\mu \ll \nu$ then $\mu = \mu_a$.

If $\mu$ and $\nu$ are not finite but only $\sigma$-finite, use the old trick of partitioning $X$ into countably many $\mu$-finite sets and countably many $\nu$-finite sets, and taking their intersections. Suppose we get a disjoint countable union $X = \bigcup_n X_n$. Then $\mu = \sum_n \mu|_{X_n}$, where for each $n$ we can write

\[ \mu|_{X_n} = (\mu|_{X_n})_a + (\mu|_{X_n})_s. \]

Then set

\[ \mu_a = \sum_n (\mu|_{X_n})_a, \qquad \mu_s = \sum_n (\mu|_{X_n})_s. \]
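The finite-case construction can be followed step by step on a finite set, where measures are just lists of point masses. The sketch below uses two made-up measures on a four-point space (the values of `mu` and `nu` are assumptions chosen so that both an absolutely continuous and a singular part appear) and exact rational arithmetic.

```python
from fractions import Fraction as F

# Hypothetical finite measures on X = {0, 1, 2, 3}, given by point masses.
mu = [F(1), F(2), F(3), F(0)]
nu = [F(2), F(2), F(0), F(1)]
points = range(4)

# g0 is the density of mu with respect to mu + nu (here mu + nu > 0 everywhere).
g0 = [m / (m + n) for m, n in zip(mu, nu)]

# Omega = {g0 < 1}; its complement carries the singular part.
omega = [x for x in points if g0[x] < 1]
g = [g0[x] / (1 - g0[x]) if x in omega else F(0) for x in points]

mu_a = [g[x] * nu[x] for x in points]                        # d(mu_a) = g d(nu)
mu_s = [mu[x] if x not in omega else F(0) for x in points]   # lives on Omega^c
nu_of_omega_complement = sum(nu[x] for x in points if x not in omega)
```

In this example the point 2 has $\nu$-mass zero but positive $\mu$-mass, so it is exactly where the singular part sits.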


It remains to check uniqueness of the decomposition. Suppose $\mu$ can be decomposed in two ways

\[ \mu = \mu_a + \mu_s = \mu_a' + \mu_s'. \]

As $\mu_s, \mu_s' \perp \nu$, there exist $\Omega_0, \Omega_0' \in \mathcal{A}$ such that

\[ \mu_s(\Omega_0) = 0, \quad \nu(\Omega_0^c) = 0, \qquad \mu_s'(\Omega_0') = 0, \quad \nu((\Omega_0')^c) = 0. \]

Set $\Omega_1 = \Omega_0 \cap \Omega_0'$. Check that

\[ \mu_s(\Omega_1) = \mu_s'(\Omega_1) = 0, \qquad \nu(\Omega_1^c) = \nu(\Omega_0^c \cup (\Omega_0')^c) = 0. \]

Now $\mu_a, \mu_a' \ll \nu$ so

\[ \mu_a(\Omega_1^c) = \mu_a'(\Omega_1^c) = 0. \]

Hence for all $A \in \mathcal{A}$,

\[ \mu_a(A) = \mu_a(A \cap \Omega_1) = \mu(A \cap \Omega_1) = \mu_a'(A \cap \Omega_1) = \mu_a'(A) \]

so $\mu_a = \mu_a'$ and hence $\mu_s = \mu_s'$.

Proposition 9.7. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Let $\mathcal{G}$ be a $\sigma$-subalgebra of $\mathcal{F}$ and $X$ a random variable on $(\Omega, \mathcal{F}, \mathbb{P})$. Assume $X$ is integrable. Then there exists a random variable $Y$ on $(\Omega, \mathcal{G}, \mathbb{P})$ such that

\[ \mathbb{E}(1_A X) = \mathbb{E}(1_A Y) \]

for all $A \in \mathcal{G}$. Moreover $Y$ is unique almost surely.

It may seem trivial to recover a random variable on $\mathcal{G}$ from one on $\mathcal{F}$: since $\mathcal{G} \subseteq \mathcal{F}$, could we not simply restrict to a "sub-random variable"? This reasoning does not work, because random variables are functions on the space, and $X$ need not be $\mathcal{G}$-measurable. In other words, $\mathrm{id} : (\Omega, \mathcal{F}, \mathbb{P}) \to (\Omega, \mathcal{G}, \mathbb{P})$ is measurable but its inverse is not, and it is not obvious that $X$ has a pushforward.

Definition (conditional expectation). $Y$ as above is called the conditional expectation of $X$ with respect to $\mathcal{G}$, denoted by

\[ Y = \mathbb{E}(X \mid \mathcal{G}). \]

Proof. wlog assume $X \ge 0$ (in general write $X = X^+ - X^-$ and treat each part separately). Set $\mu(A) = \mathbb{E}(1_A X)$ for all $A \in \mathcal{G}$. $\mu$ is finite by integrability of $X$ and is a measure on $(\Omega, \mathcal{G})$. Moreover $\mu \ll \mathbb{P}$. Thus by Radon-Nikodym there exists $g \ge 0$ $\mathcal{G}$-measurable such that

\[ \mu(A) = \int_A g \, d\mathbb{P} = \mathbb{E}(1_A g). \]

Set $Y = g$.

Uniqueness is shown in example sheet 3.


Remark. In case $X \in L^2(\Omega, \mathcal{F}, \mathbb{P})$, $Y$ is the orthogonal projection of $X$ onto $L^2(\Omega, \mathcal{G}, \mathbb{P})$. This is well-defined since $L^2(\Omega, \mathcal{F}, \mathbb{P})$ is a Hilbert space and $L^2(\Omega, \mathcal{G}, \mathbb{P})$ is a closed subspace. In this case TFAE:

1. $\mathbb{E}((X - Y) 1_A) = 0$ for all $A \in \mathcal{G}$,

2. $\mathbb{E}((X - Y) h) = 0$ for all $h$ simple on $(\Omega, \mathcal{G}, \mathbb{P})$,

3. $\mathbb{E}((X - Y) h) = 0$ for all $h \ge 0$ $\mathcal{G}$-measurable,

4. $\mathbb{E}((X - Y) h) = 0$ for all $h \in L^2(\Omega, \mathcal{G}, \mathbb{P})$.

Remark. Special case when $\mathcal{G} = \{\emptyset, B, B^c, \Omega\}$ where $B \in \mathcal{F}$ with $0 < \mathbb{P}(B) < 1$:

\[ \mathbb{E}(1_A \mid \mathcal{G})(\omega) = \begin{cases} \mathbb{P}(A \mid B) & \omega \in B \\ \mathbb{P}(A \mid B^c) & \omega \in B^c \end{cases} \]

where

\[ \mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}. \]
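The special case $\mathcal{G} = \{\emptyset, B, B^c, \Omega\}$ can be computed by hand on a finite probability space: the conditional expectation is constant on $B$ and on $B^c$, equal to the average of $X$ over each. A minimal sketch, assuming a uniform six-point space and a made-up random variable $X(\omega) = \omega^2$:

```python
from fractions import Fraction as F

# Hypothetical finite probability space: Omega = {0, ..., 5} with uniform P.
omega = range(6)
P = {w: F(1, 6) for w in omega}
X = {w: F(w * w) for w in omega}   # an integrable random variable
B = {0, 1, 2}                      # G = {empty set, B, B^c, Omega}
Bc = set(omega) - B

def expect(Z, A):
    """E(1_A Z)."""
    return sum(P[w] * Z[w] for w in A)

def prob(A):
    return sum(P[w] for w in A)

# E(X | G) is constant on B and on B^c: the P-average of X over each piece.
Y = {
    w: expect(X, B) / prob(B) if w in B else expect(X, Bc) / prob(Bc)
    for w in omega
}
```

The defining property $\mathbb{E}(1_A X) = \mathbb{E}(1_A Y)$ can then be checked for each of the four sets in $\mathcal{G}$.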

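Proposition 9.8 (non-examinable). Properties of conditional expectation:

1. linearity: $\mathbb{E}(\alpha X + \beta Y \mid \mathcal{G}) = \alpha \mathbb{E}(X \mid \mathcal{G}) + \beta \mathbb{E}(Y \mid \mathcal{G})$.

2. if $X$ is $\mathcal{G}$-measurable then $\mathbb{E}(X \mid \mathcal{G}) = X$.

3. positivity: if $X \ge 0$ then $\mathbb{E}(X \mid \mathcal{G}) \ge 0$.

4. $\mathbb{E}(\mathbb{E}(X \mid \mathcal{G}) \mid \mathcal{H}) = \mathbb{E}(X \mid \mathcal{H})$ if $\mathcal{H} \subseteq \mathcal{G}$.

5. if $Z$ is $\mathcal{G}$-measurable and bounded then $\mathbb{E}(XZ \mid \mathcal{G}) = Z \cdot \mathbb{E}(X \mid \mathcal{G})$.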


10 Fourier transform

Definition (Fourier transform). Let $f \in L^1(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d), dx)$ where $dx$ is the Lebesgue measure. The function

\[ \hat{f} : \mathbb{R}^d \to \mathbb{C}, \qquad u \mapsto \int_{\mathbb{R}^d} f(x) e^{i\langle u, x\rangle} \, dx \]

where $\langle u, x\rangle = u_1 x_1 + \cdots + u_d x_d$, is called the Fourier transform of $f$.

Proposition 10.1.

1. $|\hat{f}(u)| \le \|f\|_1$.

2. $\hat{f}$ is continuous.

Proof. 1 is clear. 2 follows from the dominated convergence theorem.

Definition. Given a finite Borel measure $\mu$ on $\mathbb{R}^d$, its Fourier transform is given by

\[ \hat{\mu}(u) = \int_{\mathbb{R}^d} e^{i\langle u, x\rangle} \, d\mu(x). \]

Again $|\hat{\mu}(u)| \le \mu(\mathbb{R}^d)$ and $\hat{\mu}$ is continuous.

Remark. If $X$ is an $\mathbb{R}^d$-valued random variable with law $\mu_X$ then $\hat{\mu}_X$ is called the characteristic function of $X$.

Example.

1. Normalised Gaussian distribution on $\mathbb{R}$: $\mu = \mathcal{N}(0, 1)$, so $d\mu = g \, dx$ where

\[ g(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}}. \]

Claim that

\[ \hat{\mu}(u) = \hat{g}(u) = e^{-u^2/2}, \]

i.e. $\hat{g} = \sqrt{2\pi} \, g$. This is the defining characteristic of the Gaussian distribution.

Proof. Since $x \mapsto |ix \, e^{iux} e^{-x^2/2}| = |x| e^{-x^2/2}$ is in $L^1(\mathbb{R})$, we can differentiate under the integral sign to get

\[ \begin{aligned}
\frac{d}{du} \hat{g}(u) &= \frac{d}{du} \int_{\mathbb{R}} e^{iux} e^{-x^2/2} \, \frac{dx}{\sqrt{2\pi}} \\
&= \int_{\mathbb{R}} ix \, e^{iux} e^{-x^2/2} \, \frac{dx}{\sqrt{2\pi}} \\
&= -\int_{\mathbb{R}} i e^{iux} g'(x) \, dx \\
&= i \int_{\mathbb{R}} (e^{iux})' g(x) \, dx \\
&= -u \int_{\mathbb{R}} e^{iux} g(x) \, dx \\
&= -u \, \hat{g}(u)
\end{aligned} \]

using $g'(x) = -x g(x)$ in the third line and integration by parts in the fourth. Thus

\[ \frac{d}{du}\left(\hat{g}(u) e^{u^2/2}\right) = 0 \]

so

\[ \hat{g}(u) = \hat{g}(0) e^{-u^2/2}. \]

But

\[ \hat{g}(0) = \int_{\mathbb{R}} g(x) \, dx = 1 \]

so $\hat{g}(u) = e^{-u^2/2}$ as required.

2. $d$-dimensional version: $\mu = \mathcal{N}(0, I_d)$, so $d\mu(x) = G(x) \, dx$ where $dx = dx_1 \cdots dx_d$ and

\[ G(x) = \prod_{i=1}^d g(x_i) = \frac{e^{-\|x\|^2/2}}{(\sqrt{2\pi})^d}. \]

Then

\[ \begin{aligned}
\hat{G}(u) &= \int_{\mathbb{R}^d} e^{i\langle u, x\rangle} G(x) \, dx \\
&= \prod_{i=1}^d \int_{\mathbb{R}} g(x_i) e^{i u_i x_i} \, dx_i \\
&= \prod_{i=1}^d \hat{g}(u_i) \\
&= e^{-\|u\|^2/2}.
\end{aligned} \]
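The one-dimensional claim $\hat{g}(u) = e^{-u^2/2}$ is easy to test numerically. The following sketch approximates the Fourier integral by the trapezoidal rule on a truncated interval; the truncation width, step count and function names are assumptions chosen for illustration, not part of the notes.

```python
import cmath
import math

def g(x):
    """Density of N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def fourier_g(u, half_width=12.0, steps=24_000):
    """Trapezoidal approximation of the integral of e^{iux} g(x) over [-L, L]."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(1j * u * x) * g(x)
    return total * h

# Distance between the numerical transform and the claimed closed form.
errors = [abs(fourier_g(u) - math.exp(-u * u / 2)) for u in (0.0, 0.5, 1.0, 2.0)]
```

The tails beyond $|x| = 12$ are negligible for the Gaussian density, which is why truncating the integral is harmless here.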

Theorem 10.2 (Fourier inversion formula).

1. If $f \in L^1(\mathbb{R}^d)$ is such that $\hat{f} \in L^1(\mathbb{R}^d)$ then $f$ is continuous (i.e. $f$ equals a continuous function almost everywhere) and

\[ f(x) = \frac{1}{(2\pi)^d} \hat{\hat{f}}(-x). \]

2. If $\mu$ is a finite Borel measure on $\mathbb{R}^d$ such that $\hat{\mu} \in L^1(\mathbb{R}^d)$ then $\mu$ has a continuous density with respect to Lebesgue measure, i.e. $d\mu = f \, dx$ with

\[ f(x) = \frac{1}{(2\pi)^d} \hat{\hat{\mu}}(-x). \]

Remark. In other words

\[ f(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(u) e^{-i\langle u, x\rangle} \, du \]

where $\hat{f}(u)$ are the Fourier coefficients and $e^{-i\langle u, x\rangle}$ are called Fourier modes, which are characters $e^{i\langle u, \cdot\rangle} : \mathbb{R}^d \to \{z \in \mathbb{C} : |z| = 1\}$. Informally this says that every $f$ can be written as an "infinite linear combination" of Fourier modes.

Proof. (Update: 1 does not quite reduce to 2 as $f = f^+ - f^-$ does not quite work. Instead write $\hat{f} = a \hat{\mu}_X - b \hat{\mu}_Y$ where $a = \|f^+\|_1$ and $d\mu_X = \frac{f^+}{\|f^+\|_1} \, dx$, and similarly for $b$ and $\mu_Y$ with $f^-$.)

1 reduces to 2 by considering $f = f^+ - f^-$, subject to the update above. In 2 we may assume wlog that $\mu$ is a probability measure, so it is the law of some random variable $X$. Let $f(x) = \frac{1}{(2\pi)^d} \hat{\hat{\mu}}(-x)$. We need to show $d\mu = f \, dx$, which is equivalent to: for all $A \in \mathcal{B}(\mathbb{R}^d)$,

\[ \mu(A) = \int_{\mathbb{R}^d} f 1_A \, dx. \]

Let $h = 1_A$ and wlog assume $A$ is a bounded Borel set. The trick is to introduce an independent Gaussian random variable $Z \sim \mathcal{N}(0, I_d)$ with law $G \, dx$. We have

\[ \int_{\mathbb{R}^d} h(x) \, d\mu(x) = \mathbb{E}(h(X)) = \lim_{\sigma \to 0} \mathbb{E}(h(X + \sigma Z)) \]

by the dominated convergence theorem. But

\[ \begin{aligned}
\mathbb{E}(h(X + \sigma Z)) &= \mathbb{E}\left(\int_{\mathbb{R}^d} h(X + \sigma x) G(x) \, dx\right) \\
&= \mathbb{E}\left(\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} h(X + \sigma x) e^{i\langle u, x\rangle} G(u) \, \frac{du \, dx}{(\sqrt{2\pi})^d}\right)
\end{aligned} \]

as

\[ G(x) = \frac{1}{(\sqrt{2\pi})^d} \hat{G}(x) = \frac{1}{(\sqrt{2\pi})^d} \int_{\mathbb{R}^d} G(u) e^{i\langle u, x\rangle} \, du. \]

So by the change of variable $y = \sigma x$, and then $z = X + y$,

\[ \begin{aligned}
\mathbb{E}(h(X + \sigma Z)) &= \mathbb{E}\left(\int \int h(X + y) e^{i\langle u, y/\sigma\rangle} G(u) \, \frac{du}{(\sqrt{2\pi})^d} \frac{dy}{\sigma^d}\right) \\
&= \mathbb{E}\left(\int \int h(z) e^{i\langle u/\sigma, z - X\rangle} G(u) \, \frac{du}{(\sqrt{2\pi\sigma^2})^d} \, dz\right) \\
&= \int \int h(z) e^{i\langle u/\sigma, z\rangle} \hat{\mu}_X\left(-\frac{u}{\sigma}\right) G(u) \, \frac{du}{(\sqrt{2\pi\sigma^2})^d} \, dz && \text{(Tonelli-Fubini)} \\
&= \frac{1}{(2\pi)^d} \int \int \hat{\mu}_X(u) e^{-i\langle u, z\rangle} h(z) e^{-\sigma^2 \|u\|^2/2} \, du \, dz.
\end{aligned} \]

Since $\hat{\mu}_X \in L^1(\mathbb{R}^d)$ and $h$ is bounded with bounded support, dominated convergence lets us take $\sigma \to 0$ in the last line, giving $\int f(z) h(z) \, dz$ as required.


We want a condition to ensure $\hat{f} \in L^1(\mathbb{R}^d)$. Clearly continuity of $f$ is necessary. Here we use a general principle in Fourier analysis: the Fourier transform exchanges decay at infinity and smoothness. In the Fourier inversion formula, if $u$ is large then the Fourier character oscillates fast. Thus if the Fourier coefficients $\hat{f}(u)$ decay fast at infinity, the rapidly oscillating modes contribute little and the resulting function is smoother.

Proposition 10.3. If $f$, $f'$ and $f''$ exist (for example if $f$ is $C^2$) and are in $L^1$ then $\hat{f} \in L^1$.

Proof. We prove the case $d = 1$. The general case follows from Tonelli-Fubini. We show first that $f, f' \in L^1$ implies that $\hat{f}(u) = \frac{i}{u} \widehat{f'}(u)$. This easily follows from integration by parts:

\[ \begin{aligned}
\hat{f}(u) &= \int f(x) e^{iux} \, dx \\
&= \frac{1}{iu} \int f(x) (e^{iux})' \, dx \\
&= -\frac{1}{iu} \int f'(x) e^{iux} \, dx
\end{aligned} \]

so in particular $|\hat{f}(u)| \le \frac{1}{|u|} \|f'\|_1$.

Thus if $f, f', f'' \in L^1$ then $\hat{f}(u) = -\frac{1}{u^2} \widehat{f''}(u)$ so

\[ |\hat{f}(u)| \le \frac{\|f''\|_1}{|u|^2}. \]

As $\int_1^\infty \frac{1}{|u|^2} \, du < \infty$ and $\hat{f}$ is bounded by $\|f\|_1$, $\hat{f} \in L^1$.

Definition (convolution). Given two Borel measures $\mu$ and $\nu$ on $\mathbb{R}^d$, we define their convolution $\mu * \nu$ as the image of $\mu \otimes \nu$ under the addition map

\[ \Phi : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d, \qquad (x, y) \mapsto x + y \]

i.e. $\mu * \nu = \Phi_*(\mu \otimes \nu)$.

Thus given $A \in \mathcal{B}(\mathbb{R}^d)$,

\[ \Phi_*(\mu \otimes \nu)(A) = \mu \otimes \nu(\{(x, y) : x + y \in A\}). \]

Example. Given independent random variables $X, Y$ with laws $\mu, \nu$ respectively, $\mu * \nu$ is the law of $X + Y$.
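For measures supported on finitely many points, the pushforward under addition can be computed directly. The sketch below (the dictionary representation of a law and the name `convolve` are illustrative assumptions) convolves the law of a fair die with itself to obtain the law of the sum of two independent dice.

```python
from collections import defaultdict
from fractions import Fraction as F

def convolve(mu, nu):
    """Image of the product measure of mu and nu under (x, y) -> x + y."""
    out = defaultdict(F)          # missing keys start at Fraction(0)
    for x, p in mu.items():
        for y, q in nu.items():
            out[x + y] += p * q
    return dict(out)

die = {k: F(1, 6) for k in range(1, 7)}   # law of one fair die
two_dice = convolve(die, die)             # law of the sum of two independent dice
```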

Definition (convolution). If $f, g \in L^1(\mathbb{R}^d)$ define their convolution $f * g$ by

\[ (f * g)(x) = \int_{\mathbb{R}^d} f(x - t) g(t) \, dt. \]


This is well defined by Fubini: $f, g \in L^1$ so

\[ \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x - t) g(t)| \, dt \, dx < \infty \]

and

\[ \|f * g\|_1 = \int \left| \int f(x - t) g(t) \, dt \right| dx \le \int \int |f(x - t) g(t)| \, dt \, dx \le \|f\|_1 \cdot \|g\|_1. \]

Therefore $(L^1(\mathbb{R}^d), *)$ forms a Banach algebra.

Remark. If $\mu, \nu$ are two finite Borel measures on $\mathbb{R}^d$ and if $\mu, \nu \ll dx$, i.e. they are absolutely continuous, then by Radon-Nikodym there exist $f, g \in L^1(\mathbb{R}^d)$ such that

\[ d\mu = f \, dx, \qquad d\nu = g \, dx. \]

Then $\mu * \nu \ll dx$ and

\[ d(\mu * \nu) = (f * g) \, dx. \]

Proposition 10.4 (Gaussian approximation). If $f \in L^p(\mathbb{R}^d)$ where $p \in [1, \infty)$ then

\[ \lim_{\sigma \to 0} \|f * G_\sigma - f\|_p = 0 \]

where $G_\sigma$ is the density of $\mathcal{N}(0, \sigma^2 I_d)$, i.e.

\[ G_\sigma(x) = \frac{1}{(\sqrt{2\pi\sigma^2})^d} e^{-\frac{\|x\|^2}{2\sigma^2}}. \]

Lemma 10.5 (continuity of translation in $L^p$). Suppose $p \in [1, \infty)$ and $f \in L^p$. Then

\[ \lim_{t \to 0} \|\tau_t(f) - f\|_p = 0 \]

where $\tau_t(f)(x) = f(x + t)$, $t \in \mathbb{R}^d$.

Proof. Example sheet.

Proof of Gaussian approximation.

\[ (f * G_\sigma - f)(x) = \int_{\mathbb{R}^d} G_\sigma(t) (f(x - t) - f(x)) \, dt = \mathbb{E}(f(x - \sigma Z) - f(x)) \]

where $Z \sim \mathcal{N}(0, I_d)$ is Gaussian with density $G_1$. Then

\[ \|f * G_\sigma - f\|_p^p \le \mathbb{E}\left(\|f(\cdot - \sigma Z) - f\|_p^p\right) = \mathbb{E}\left(\|\tau_{-\sigma Z}(f) - f\|_p^p\right) \]

by Jensen's inequality and convexity of $x \mapsto x^p$. By the lemma, for each fixed value of $Z$,

\[ \lim_{\sigma \to 0} \|\tau_{-\sigma Z}(f) - f\|_p = 0. \]

As $\|\tau_{-\sigma Z}(f) - f\|_p \le 2\|f\|_p$, apply the dominated convergence theorem to get the required result.


Proposition 10.6.

1. If $f, g \in L^1(\mathbb{R}^d)$ then

\[ \widehat{f * g} = \hat{f} \cdot \hat{g}. \]

2. If $\mu, \nu$ are finite Borel measures then

\[ \widehat{\mu * \nu} = \hat{\mu} \cdot \hat{\nu}. \]

Proof. 1 reduces to 2 by writing

\[ f(x) \, dx = f^+(x) \, dx - f^-(x) \, dx = a \, d\mu - b \, d\nu \]

for some probability measures $\mu, \nu$.

wlog we may assume $\mu$ and $\nu$ are laws of independent random variables $X$ and $Y$. Then by a previous result $\mu * \nu$ is just the law of $X + Y$ so

\[ \begin{aligned}
\widehat{\mu * \nu}(u) &= \int \int e^{i\langle u, x + y\rangle} \, d\mu(x) \, d\nu(y) \\
&= \mathbb{E}(e^{i\langle u, X + Y\rangle}) \\
&= \mathbb{E}(e^{i\langle u, X\rangle} e^{i\langle u, Y\rangle}) && \text{(homomorphism)} \\
&= \mathbb{E}(e^{i\langle u, X\rangle}) \mathbb{E}(e^{i\langle u, Y\rangle}) && \text{(as $X, Y$ are independent)} \\
&= \hat{\mu}(u) \cdot \hat{\nu}(u).
\end{aligned} \]

In short, this is precisely because $e^{i\langle u, \cdot\rangle}$ are characters.

Theorem 10.7 (Lévy criterion). Let $(X_n)_{n \ge 1}$ and $X$ be $\mathbb{R}^d$-valued random variables. Then TFAE:

1. $X_n \to X$ in law,

2. For all $u \in \mathbb{R}^d$, $\hat{\mu}_{X_n}(u) \to \hat{\mu}_X(u)$.

In particular if $\hat{\mu}_X = \hat{\mu}_Y$ for two random variables $X$ and $Y$ then $X = Y$ in law, i.e. $\mu_X = \mu_Y$.

Thus the Fourier transform is an injection from finite Borel measures into a space of functions.

Proof.

• 1 $\implies$ 2: Clear by definition, as $x \mapsto e^{i\langle u, x\rangle}$ is continuous and bounded for all $u \in \mathbb{R}^d$.

• 2 $\implies$ 1: We need to show that for all $g \in C_b(\mathbb{R}^d)$,

\[ \mathbb{E}(g(X_n)) \to \mathbb{E}(g(X)). \]

It is enough to check this for all $g \in C_c^\infty(\mathbb{R}^d)$. For the sufficiency see the example sheet.

Note that for all $g \in C_c^\infty(\mathbb{R}^d)$, $\hat{g} \in L^1$ so by the Fourier inversion formula

\[ g(x) = \int \hat{g}(u) e^{-i\langle u, x\rangle} \, \frac{du}{(2\pi)^d}. \]


Hence

\[ \begin{aligned}
\mathbb{E}(g(X_n)) &= \int \hat{g}(u) \mathbb{E}(e^{-i\langle u, X_n\rangle}) \, \frac{du}{(2\pi)^d} \\
&= \int \hat{g}(u) \hat{\mu}_{X_n}(-u) \, \frac{du}{(2\pi)^d} \\
&\to \int \hat{g}(u) \hat{\mu}_X(-u) \, \frac{du}{(2\pi)^d} \\
&= \mathbb{E}(g(X))
\end{aligned} \]

by the dominated convergence theorem.

Theorem 10.8 (Plancherel formula).

1. If $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ then $\hat{f} \in L^2(\mathbb{R}^d)$ and

\[ \|\hat{f}\|_2^2 = (2\pi)^d \|f\|_2^2. \]

2. If $f, g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ then

\[ \langle \hat{f}, \hat{g}\rangle_{L^2} = (2\pi)^d \langle f, g\rangle_{L^2}. \]

3. The Fourier transform

\[ \mathcal{F} : L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d), \qquad f \mapsto \frac{1}{(\sqrt{2\pi})^d} \hat{f} \]

extends uniquely to a linear operator on $L^2(\mathbb{R}^d)$ which is an isometry. Moreover

\[ \mathcal{F} \circ \mathcal{F}(f) = \check{f} \]

where $\check{f}(x) = f(-x)$, for all $f \in L^2(\mathbb{R}^d)$.

Proof. First we prove 1 and 2 assuming $\hat{f}, \hat{g} \in L^1(\mathbb{R}^d)$. By the Fourier inversion formula,

\[ \begin{aligned}
\|\hat{f}\|_2^2 &= \int_{\mathbb{R}^d} |\hat{f}(u)|^2 \, du \\
&= \int \hat{f}(u) \overline{\hat{f}(u)} \, du \\
&= \int \left(\int f(x) e^{i\langle u, x\rangle} \, dx\right) \overline{\hat{f}(u)} \, du \\
&= \int \int f(x) \overline{\hat{f}(u) e^{-i\langle u, x\rangle}} \, du \, dx \\
&= \int f(x) \overline{f(x)} (2\pi)^d \, dx \\
&= (2\pi)^d \|f\|_2^2
\end{aligned} \]


and in particular $\hat{f} \in L^2(\mathbb{R}^d)$.

Similarly for 2,

\[ \begin{aligned}
\langle \hat{f}, \hat{g}\rangle_{L^2} &= \int \hat{f}(u) \overline{\hat{g}(u)} \, du \\
&= \int \left(\int f(x) e^{i\langle u, x\rangle} \, dx\right) \overline{\hat{g}(u)} \, du \\
&= \int \int f(x) \overline{\hat{g}(u) e^{-i\langle u, x\rangle}} \, du \, dx \\
&= \int f(x) \overline{g(x)} (2\pi)^d \, dx \\
&= (2\pi)^d \langle f, g\rangle_{L^2}.
\end{aligned} \]

Now for the general case we use the Gaussian as a mollifier. Consider

\[ f_\sigma = f * G_\sigma, \qquad g_\sigma = g * G_\sigma \]

and, based on results and computations before,

\[ \hat{f}_\sigma = \hat{f} \cdot \hat{G}_\sigma = \hat{f} e^{-\sigma^2 \|u\|^2/2}. \]

As $\|\hat{f}\|_\infty \le \|f\|_1$, $\hat{f}_\sigma \in L^1(\mathbb{R}^d)$. Thus $\hat{f}_\sigma \in L^2(\mathbb{R}^d)$ and $\|\hat{f}_\sigma\|_2^2 = (2\pi)^d \|f_\sigma\|_2^2$. But by Gaussian approximation we know that $f_\sigma \to f$ in $L^2(\mathbb{R}^d)$ as $\sigma \to 0$. Hence $\|f_\sigma\|_2 \to \|f\|_2$. Then

\[ \|\hat{f}_\sigma\|_2^2 = \|\hat{f} \cdot \hat{G}_\sigma\|_2^2 = \int |\hat{f}(u)|^2 e^{-\sigma^2 \|u\|^2} \, du \to \|\hat{f}\|_2^2 \]

as $\sigma \to 0$ by the monotone convergence theorem. Thus

\[ \|\hat{f}\|_2^2 = (2\pi)^d \|f\|_2^2. \]

For 2,

\[ \langle \hat{f}_\sigma, \hat{g}_\sigma\rangle = \int \hat{f} \, \overline{\hat{g}} \, e^{-\sigma^2 \|u\|^2} \, du \to \int \hat{f} \, \overline{\hat{g}} \, du \]

as $\sigma \to 0$ by the dominated convergence theorem, as $\hat{f} \, \overline{\hat{g}} \in L^1$. The result follows from Gaussian approximation.

For 3, $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$ because it contains $C_c(\mathbb{R}^d)$. Then extend by completeness: given $f \in L^2(\mathbb{R}^d)$, pick a sequence $f_n \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ such that $f_n \to f$ in $L^2(\mathbb{R}^d)$. Then define

\[ \mathcal{F} f = \lim_{n \to \infty} \mathcal{F} f_n. \]

The limit exists as $L^2(\mathbb{R}^d)$ is complete and $(\mathcal{F} f_n)_n$ is Cauchy, and $\mathcal{F}$ is well-defined, since

\[ \|\mathcal{F} f_n - \mathcal{F} f_m\|_2 = \|f_n - f_m\|_2 \]

by 1. Finally,

\[ \|\mathcal{F} f\|_2 = \|f\|_2 \]

for all $f \in L^2(\mathbb{R}^d)$ and

\[ \mathcal{F} \circ \mathcal{F}(f) = \check{f} \]

for all $f$ such that $f, \hat{f} \in L^1(\mathbb{R}^d)$. Thus by continuity this holds for all of $L^2(\mathbb{R}^d)$.


11 Gaussians

Definition (Gaussian). An $\mathbb{R}^d$-valued random variable $X$ is called Gaussian if for all $u \in \mathbb{R}^d$, $\langle X, u\rangle$ is Gaussian, namely its law has the form $\mathcal{N}(m, \sigma^2)$ for some $m \in \mathbb{R}$, $\sigma \ge 0$ (with $\sigma = 0$ meaning a point mass at $m$).

Proposition 11.1. The law of a Gaussian vector $X = (X_1, \dots, X_d) \in \mathbb{R}^d$ is uniquely determined by

1. its mean $\mathbb{E}X = (\mathbb{E}X_1, \dots, \mathbb{E}X_d)$,

2. its covariance matrix $(\operatorname{Cov}(X_i, X_j))_{ij}$ where

\[ \operatorname{Cov}(X_i, X_j) = \mathbb{E}((X_i - \mathbb{E}X_i)(X_j - \mathbb{E}X_j)). \]

Proof. If $d = 1$ then this just says that it is determined by its mean $m$ and variance $\sigma^2$, which is obviously true. For $d > 1$, compute the characteristic function

\[ \hat{\mu}_X(u) = \mathbb{E}(e^{i\langle X, u\rangle}) \]

but by assumption $\langle X, u\rangle$ is Gaussian in $d = 1$ so its law is determined by

1. the mean $\mathbb{E}(\langle X, u\rangle) = \langle \mathbb{E}X, u\rangle$,

2. the variance $\operatorname{Var}\langle X, u\rangle$. But

\[ \operatorname{Var}\langle X, u\rangle = \mathbb{E}((\langle X, u\rangle - \mathbb{E}\langle X, u\rangle)^2) = \sum_{i,j} u_i u_j \operatorname{Cov}(X_i, X_j). \]

Hence $\hat{\mu}_X$ depends only on the mean and the covariance matrix, and it determines the law of $X$ by the Lévy criterion.

In particular this shows that $(\operatorname{Cov}(X_i, X_j))_{ij}$ is a positive semidefinite symmetric matrix.

Proposition 11.2. If $X$ is a Gaussian vector then there exist $A \in \mathcal{M}_d(\mathbb{R})$, $b \in \mathbb{R}^d$ such that $X$ has the same law as $AZ + b$ where $Z = (Z_1, \dots, Z_d)$ and $(Z_i)_{i=1}^d$ are iid $\mathcal{N}(0, 1)$.

Proof. Take $A$ such that

\[ AA^* = (\operatorname{Cov}(X_i, X_j))_{ij} \]

where $A^*$ is the adjoint/transpose of $A$, and

\[ b = (\mathbb{E}X_1, \dots, \mathbb{E}X_d). \]

Check that for all $u \in \mathbb{R}^d$,

\[ \mathbb{E}(\langle X, u\rangle) = \langle b, u\rangle, \qquad \operatorname{Var}(\langle X, u\rangle) = \langle AA^* u, u\rangle = \|A^* u\|_2^2 = \operatorname{Var}(\langle AZ + b, u\rangle). \]
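One standard way to produce a matrix $A$ with $AA^* = K$ is the Cholesky factorisation, which applies when $K$ is positive definite (the proposition also covers the semidefinite case, which this sketch does not handle). The covariance matrix `K` below is a made-up example.

```python
import math

def cholesky(K):
    """Lower-triangular A with A A^T = K, for symmetric positive definite K."""
    d = len(K)
    A = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(A[i][k] * A[j][k] for k in range(j))
            if i == j:
                A[i][j] = math.sqrt(K[i][i] - s)
            else:
                A[i][j] = (K[i][j] - s) / A[j][j]
    return A

K = [[4.0, 2.0], [2.0, 3.0]]   # a hypothetical covariance matrix
A = cholesky(K)

# Reassemble A A^T to compare with K.
AAt = [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]
```

With such an $A$, a sample of $X$ is obtained as $AZ + b$ from a vector $Z$ of independent standard normal samples.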


Proposition 11.3. If $(X_1, \dots, X_d)$ is a Gaussian vector then TFAE:

1. The $X_i$ are independent.

2. The $X_i$ are pairwise independent.

3. $(\operatorname{Cov}(X_i, X_j))_{ij}$ is a diagonal matrix.

Proof. 1 $\implies$ 2 $\implies$ 3 is obvious. 3 $\implies$ 1 as we can choose $A$ to be diagonal. Thus $X$ has the same law as $(a_1 Z_1, \dots, a_d Z_d) + b$.

Theorem 11.4 (central limit theorem). Let $(X_i)_{i \ge 1}$ be $\mathbb{R}^d$-valued iid random variables with law $\mu$. Assume they have a second moment, i.e. $\mathbb{E}(\|X_1\|^2) < \infty$. Let $\mathbf{m} = \mathbb{E}(X_1) \in \mathbb{R}^d$ and

\[ Y_n = \frac{X_1 + \cdots + X_n - n \cdot \mathbf{m}}{\sqrt{n}}. \]

Then $Y_n$ converges in law to a centred Gaussian on $\mathbb{R}^d$ with law $\mathcal{N}(0, K)$ where

\[ K_{ij} = (\operatorname{Cov}(X_1))_{ij} = \int_{\mathbb{R}^d} (x_i - \mathbf{m}_i)(x_j - \mathbf{m}_j) \, d\mu(x). \]

Proof. The proof is an application of the Lévy criterion. We need to show $\hat{\mu}_{Y_n}(u) \to \hat{\mu}_Y(u)$ as $n \to \infty$ for all $u$, where $Y \sim \mathcal{N}(0, K)$. As

\[ \hat{\mu}_{Y_n}(u) = \mathbb{E}(e^{i\langle Y_n, u\rangle}), \]

this is equivalent to showing that for all $u$, $\langle Y_n, u\rangle$ converges in law to $\langle Y, u\rangle$. But

\[ \langle Y_n, u\rangle = \frac{\langle X_1, u\rangle + \cdots + \langle X_n, u\rangle - n\langle \mathbf{m}, u\rangle}{\sqrt{n}} \]

so we reduce the problem to the 1-dimensional case. By rescaling wlog $\mathbb{E}(X_1) = 0$, $\mathbb{E}(X_1^2) = 1$.

Now

\[ \begin{aligned}
\hat{\mu}_{Y_n}(u) &= \mathbb{E}(e^{iuY_n}) \\
&= \mathbb{E}\left(\exp\left(iu \frac{X_1 + \cdots + X_n}{\sqrt{n}}\right)\right) \\
&= \prod_{i=1}^n \mathbb{E}\left(\exp\left(iu \frac{X_i}{\sqrt{n}}\right)\right) \\
&= \left(\mathbb{E}\left(\exp\left(iu \frac{X_1}{\sqrt{n}}\right)\right)\right)^n \\
&= \left(\hat{\mu}\left(\frac{u}{\sqrt{n}}\right)\right)^n.
\end{aligned} \]


But $X_1$ has a second moment, with $\mathbb{E}(X_1) = 0$ and $\mathbb{E}(X_1^2) = 1$, so we can differentiate $\hat{\mu}$ twice under the integral sign:

\[ \begin{aligned}
\hat{\mu}(u) &= \int_{\mathbb{R}} e^{iux} \, d\mu(x), \\
\frac{d}{du} \hat{\mu}(u) &= \int_{\mathbb{R}} ix \, e^{iux} \, d\mu(x), & \hat{\mu}'(0) &= i\mathbb{E}(X_1) = 0, \\
\frac{d^2}{du^2} \hat{\mu}(u) &= \int_{\mathbb{R}} -x^2 e^{iux} \, d\mu(x), & \hat{\mu}''(0) &= -\mathbb{E}(X_1^2) = -1.
\end{aligned} \]

Taylor expand $\hat{\mu}$ around 0 to 2nd order,

\[ \begin{aligned}
\hat{\mu}(u) &= \hat{\mu}(0) + u \hat{\mu}'(0) + \frac{u^2}{2} \hat{\mu}''(0) + o(u^2) \\
&= 1 + 0 \cdot u - \frac{u^2}{2} + o(u^2)
\end{aligned} \]

so

\[ \hat{\mu}_{Y_n}(u) = \left(1 - \frac{u^2}{2n} + o\left(\frac{u^2}{n}\right)\right)^n \to e^{-u^2/2} = \hat{g}(u) \]

as $n \to \infty$, where $g$ is the density of $Y \sim \mathcal{N}(0, 1)$.
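The convergence $\hat{\mu}(u/\sqrt{n})^n \to e^{-u^2/2}$ can be watched in a case where $\hat{\mu}$ is explicit. The sketch below assumes iid random signs $\pm 1$ with probability $1/2$ each (mean 0, variance 1), whose characteristic function is $\cos u$; the choice of distribution and of the test point `u` is an assumption made for illustration.

```python
import math

def char_fn_normalised_sum(u, n):
    """Characteristic function of (X_1 + ... + X_n)/sqrt(n) for iid random signs."""
    return math.cos(u / math.sqrt(n)) ** n

u = 1.5
limit = math.exp(-u * u / 2)   # characteristic function of N(0, 1) at u

# Distance to the Gaussian limit for increasing n.
gaps = [abs(char_fn_normalised_sum(u, n) - limit) for n in (10, 100, 1000, 10_000)]
```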


12 Ergodic theory

Let $(X, \mathcal{A}, \mu)$ be a measure space. Let $T : X \to X$ be an $\mathcal{A}$-measurable map. We are interested in the trajectories $T^n x$ for $n \ge 0$ and their statistical behaviour. In particular we are interested in those $T$ that preserve the measure $\mu$.

Definition (measure-preserving). $T: X \to X$ is measure-preserving if $T_* \mu = \mu$. $(X, \mathcal{A}, \mu, T)$ is called a measure-preserving dynamical system.

Definition (invariant function, invariant set, invariant $\sigma$-algebra).

• A measurable function $f: X \to \mathbb{R}$ is called $T$-invariant if $f = f \circ T$.

• A set $A \in \mathcal{A}$ is $T$-invariant if $1_A$ is $T$-invariant.

• $\mathcal{T} = \{A \in \mathcal{A} : A \text{ is } T\text{-invariant}\}$ is called the $T$-invariant $\sigma$-algebra.

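Lemma 12.1. $f$ is $T$-invariant if and only if $f$ is $\mathcal{T}$-measurable.

Proof. Indeed $f = f \circ T$ if and only if for all $t \in \mathbb{R}$,

$$\{x \in X : f(x) < t\} = \{x \in X : f \circ T(x) < t\} = T^{-1}(\{x \in X : f(x) < t\}),$$

where the second equality always holds. That is, $f$ is $T$-invariant if and only if every set $\{f < t\}$ is $T$-invariant.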

Definition (ergodic). $T$ is ergodic with respect to $\mu$, or $\mu$ is ergodic with respect to $T$, if for all $A \in \mathcal{T}$, $\mu(A) = 0$ or $\mu(A^c) = 0$.

This condition asserts that $\mathcal{T}$ is trivial, i.e. its elements are either null or conull.

Lemma 12.2. $T$ is ergodic with respect to $\mu$ if and only if every invariant function $f$ is almost everywhere constant.

Proof. Exercise.

Example.

1. Let $X$ be a finite space, $T: X \to X$ a map and $\mu = \#$ the counting measure. Then $T$ is measure-preserving if and only if $T$ is a bijection. Such a $T$ is ergodic if and only if there is no partition $X = X_1 \cup X_2$ into non-empty sets such that both $X_1$ and $X_2$ are $T$-invariant, which is equivalent to: for all $x, y \in X$, there exists $n$ such that $T^n x = y$.

2. Let $X = \mathbb{R}^d / \mathbb{Z}^d$, $\mathcal{A}$ the Borel $\sigma$-algebra and $\mu$ the Lebesgue measure. Given $a \in \mathbb{R}^d$, the translation $T_a: x \mapsto x + a$ is measure-preserving. $T_a$ is ergodic with respect to $\mu$ if and only if $1, a_1, \dots, a_d$, where the $a_i$ are the coordinates of $a$, are linearly independent over $\mathbb{Q}$. See example sheet. (Hint: Fourier transform.)


3. Let $X = \mathbb{R}/\mathbb{Z}$ and again $\mathcal{A}$ the Borel $\sigma$-algebra and $\mu$ the Lebesgue measure. The doubling map $T: x \mapsto 2x - \lfloor 2x \rfloor$ is ergodic with respect to $\mu$. (Hint: again consider Fourier coefficients.) Intuitively, in the graph of this function the preimage of an interval of length $\varepsilon$ is two segments each of length $\varepsilon/2$, so $T$ is measure-preserving.

4. Furstenberg conjecture: every ergodic measure $\mu$ on $\mathbb{R}/\mathbb{Z}$ invariant under $T_2, T_3$, where $T_m: x \mapsto mx - \lfloor mx \rfloor$, must be either Lebesgue or finitely supported.
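The finite case in Example 1 can be checked mechanically. The sketch below is a minimal illustration, assuming $X = \{0, \dots, m-1\}$ with $m \ge 1$ and $T$ encoded as a list with `T[x]` the image of `x`; the function names are hypothetical and chosen only for this example.

```python
from typing import List


def is_measure_preserving(T: List[int]) -> bool:
    """For counting measure on {0, ..., m-1}, T preserves measure iff T is a bijection."""
    return sorted(T) == list(range(len(T)))


def is_ergodic(T: List[int]) -> bool:
    """A bijection T is ergodic for counting measure iff it is a single cycle.

    We follow the orbit of 0 and check that it visits every point, which is
    the condition "for all x, y there exists n with T^n x = y".
    """
    if not is_measure_preserving(T):
        raise ValueError("T must be a bijection of {0, ..., m-1}")
    seen = set()
    x = 0
    while x not in seen:
        seen.add(x)
        x = T[x]
    return len(seen) == len(T)


if __name__ == "__main__":
    print(is_ergodic([1, 2, 3, 0]))             # one 4-cycle
    print(is_ergodic([1, 0, 3, 2]))             # two 2-cycles: {0, 1} and {2, 3} are invariant
    print(is_measure_preserving([0, 0, 0, 0]))  # constant map, not a bijection
```

In the second case the sets $\{0, 1\}$ and $\{2, 3\}$ give a partition into non-empty invariant sets, which is exactly what ergodicity forbids.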

12.1 The canonical model

Let $(X_n)_{n \ge 1}$ be an $\mathbb{R}^d$-valued stochastic process on $(\Omega, \mathcal{F}, \mathbb{P})$. Let $X = (\mathbb{R}^d)^{\mathbb{N}}$ and define the sample path map

$$\Phi: \Omega \to X, \quad \omega \mapsto (X_n(\omega))_{n \ge 1}.$$

Let

$$T: X \to X, \quad (x_n)_{n \ge 1} \mapsto (x_{n+1})_{n \ge 1}$$

be the shift map. Let $x_n: X \to \mathbb{R}^d$ be the $n$th coordinate function and let $\mathcal{A} = \sigma(x_n : n \ge 1)$.

Note. $\mathcal{A}$ is the infinite product $\sigma$-algebra $\mathcal{B}(\mathbb{R}^d)^{\otimes \mathbb{N}}$ on $(\mathbb{R}^d)^{\mathbb{N}}$.

Let $\mu = \Phi_* \mathbb{P}$, a probability measure on $(X, \mathcal{A})$. This $\mu$ is called the law of the process $(X_n)_{n \ge 1}$. Now $(X, \mathcal{A}, \mu, T)$ is called the canonical model associated to $(X_n)_{n \ge 1}$.

Proposition 12.3 (stationary process). TFAE:

1. $(X, \mathcal{A}, \mu, T)$ is measure-preserving.

2. For all $k \ge 1$, the law of $(X_n, X_{n+1}, \dots, X_{n+k})$ on $(\mathbb{R}^d)^{k+1}$ is independent of $n$.

In this case we say that $(X_n)_{n \ge 1}$ is a stationary process.

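Proof.

• 1 $\Rightarrow$ 2: $\mu = T_* \mu$ implies $\mu = T^n_* \mu$ for all $n$, and this says that the law of $(X_i)_{i \ge 1}$ is the same as that of $(X_{i+n})_{i \ge 1}$.

• 1 $\Leftarrow$ 2: $\mu$ and $T^n_* \mu$ agree on cylinders $A \times (\mathbb{R}^d)^{\mathbb{N} \setminus F}$ for $F \subseteq \mathbb{N}$ finite, $A \in \mathcal{B}((\mathbb{R}^d)^F)$. The cylinders form a $\pi$-system generating $\mathcal{A}$, so the two probability measures agree on $\mathcal{A}$; in particular $\mu = T_* \mu$.

In some sense ergodic theory is the study of stationary processes.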


Proposition 12.4 (Bernoulli shift). If $(X_n)_{n \ge 1}$ are iid. then $(X, \mathcal{A}, \mu, T)$ is ergodic. It is called the Bernoulli shift associated to the law $\nu$ of $X_1$. We have

$$\mu = \nu^{\otimes \mathbb{N}}.$$

Proof. Claim that $\Phi^{-1}(\mathcal{T}) \subseteq \mathcal{C}$, the tail $\sigma$-algebra of $(X_n)_{n \ge 1}$. Then Kolmogorov 0-1 law says that if $A \in \mathcal{T}$ then $\mathbb{P}(\Phi^{-1}(A)) = 0$ or $1$, so $\mu(A) = 0$ or $1$, thus $\mu$ is $T$-ergodic.

To prove the claim, given $A \in \mathcal{T}$, $T^{-1}A = A$ so

$$\begin{aligned}
\Phi^{-1}(A) &= \{\omega \in \Omega : (X_n(\omega))_{n \ge 1} \in A\} \\
&= \{\omega \in \Omega : (X_n(\omega))_{n \ge 1} \in T^{-1}A\} \\
&= \{\omega \in \Omega : (X_{n+1}(\omega))_{n \ge 1} \in A\} \\
&= \{\omega \in \Omega : (X_{n+k}(\omega))_{n \ge 1} \in A\} \quad \text{for all } k \\
&\in \sigma(X_k, X_{k+1}, \dots)
\end{aligned}$$

for every $k$, so $\Phi^{-1}(A) \in \mathcal{C}$.

Theorem 12.5 (von Neumann mean ergodic theorem). Let $(X, \mathcal{A}, \mu, T)$ be a measure-preserving system. Let $f \in L^2(X, \mathcal{A}, \mu)$. Then the ergodic average

$$S_n f = \frac{1}{n} \sum_{i=0}^{n-1} f \circ T^i$$

converges in $L^2$ to $\bar{f}$, a $T$-invariant function. In fact $\bar{f}$ is the orthogonal projection of $f$ onto $L^2(X, \mathcal{T}, \mu)$.

The intuition is as follows: if $f$ is the indicator function of a set $A$, then $S_n f(x)$ is exactly the proportion of the times $0, \dots, n-1$ at which the orbit of $x$ lies in $A$.
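To make this intuition concrete, the following sketch computes $S_n f$ exactly on a finite system with counting measure, where $T$ is a bijection of $\{0, \dots, m-1\}$ given as a list. In that setting the invariant sets are unions of cycles, so the projection onto invariant functions is the average over the cycle of $x$. The encoding and the names `ergodic_average` and `cycle_average` are assumptions made for illustration.

```python
from fractions import Fraction
from typing import List


def ergodic_average(f: List[int], T: List[int], n: int, x: int) -> Fraction:
    """Return S_n f(x) = (1/n) * sum_{i=0}^{n-1} f(T^i x), computed exactly."""
    total = Fraction(0)
    for _ in range(n):
        total += f[x]
        x = T[x]
    return total / n


def cycle_average(f: List[int], T: List[int], x: int) -> Fraction:
    """Average of f over the T-orbit of x (T is assumed to be a bijection)."""
    orbit = [x]
    y = T[x]
    while y != x:
        orbit.append(y)
        y = T[y]
    return Fraction(sum(f[z] for z in orbit)) / len(orbit)


if __name__ == "__main__":
    T = [1, 2, 0, 4, 3]   # a 3-cycle on {0, 1, 2} and a 2-cycle on {3, 4}
    f = [1, 0, 0, 1, 0]   # indicator function of A = {0, 3}
    for x in (0, 3):
        print(x, ergodic_average(f, T, 6, x), cycle_average(f, T, x))
```

Here $n = 6$ is a common multiple of both cycle lengths, so $S_6 f$ already equals the cycle average; for other $n$ the two differ by a term of order $1/n$.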

Proof. Hilbert space argument. Let $H = L^2(X, \mathcal{A}, \mu)$ and define

$$U: H \to H, \quad f \mapsto f \circ T$$

which is an isometry: because $\mu$ is $T$-invariant, $\int |f \circ T|^2 \, d\mu = \int |f|^2 \, d\mu$. Then by Riesz representation theorem it has an adjoint

$$U^*: H \to H, \quad x \mapsto U^* x$$

which satisfies $\langle U^* x, y \rangle = \langle x, Uy \rangle$ for all $y \in H$. Let

$$W = \{\varphi - \varphi \circ T : \varphi \in H\}$$

be the coboundaries. Let $f \in W$, say $f = \varphi - \varphi \circ T$. Then

$$S_n f = \frac{1}{n} \sum_{i=0}^{n-1} (\varphi \circ T^i - \varphi \circ T^{i+1}) = \frac{\varphi - \varphi \circ T^n}{n} \to 0$$


as $n \to \infty$, since $\|\varphi - \varphi \circ T^n\| \le 2\|\varphi\|$.

Let $f \in \overline{W}$. Then again $S_n f \to 0$, because for all $\varepsilon > 0$ there exists $g \in W$ such that $\|f - g\| < \varepsilon$. Then

$$\|S_n f - S_n g\| = \|S_n(f - g)\| \le \|f - g\| \le \varepsilon$$

so $\limsup_n \|S_n f\| \le \varepsilon$.

Have $H = \overline{W} \oplus \overline{W}^\perp$ and $\overline{W}^\perp = W^\perp$. Claim $W^\perp$ is exactly the $T$-invariant functions. The theorem then follows because if $f \circ T = f$ then $S_n f = f$ for all $n$.

Proof of claim.

$$\begin{aligned}
W^\perp &= \{g \in H : \langle g, \varphi - U\varphi \rangle = 0 \text{ for all } \varphi \in H\} \\
&= \{g : \langle g, \varphi \rangle = \langle g, U\varphi \rangle \text{ for all } \varphi\} \\
&= \{g : \langle g, \varphi \rangle = \langle U^* g, \varphi \rangle \text{ for all } \varphi\} \\
&= \{g : U^* g = g\} \\
&= \{g : Ug = g\}
\end{aligned}$$

where the last equality is by

$$\|Ug - g\|^2 = 2\|g\|^2 - 2 \operatorname{Re} \langle g, Ug \rangle = 2\|g\|^2 - 2 \operatorname{Re} \langle U^* g, g \rangle,$$

which vanishes if $U^* g = g$; conversely, if $Ug = g$ then $U^* g = U^* U g = g$ since $U$ is an isometry. This shows that $W^\perp$ consists exactly of the $T$-invariant functions.

In fact we can do better:

Theorem 12.6 (Birkhoff (pointwise) ergodic theorem). Let $(X, \mathcal{A}, \mu, T)$ be a measure-preserving system. Assume $\mu$ is finite (actually $\sigma$-finite suffices) and let $f \in L^1(X, \mathcal{A}, \mu)$. Then

$$S_n f = \frac{1}{n} \sum_{i=0}^{n-1} f \circ T^i$$

converges $\mu$-almost everywhere to a $T$-invariant function $\bar{f} \in L^1$. Moreover $S_n f \to \bar{f}$ in $L^1$.
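As an illustration of the pointwise statement, take the rotation $x \mapsto x + a$ on $\mathbb{R}/\mathbb{Z}$ and $f$ the indicator of an interval $[lo, hi)$. For irrational $a$ the rotation is ergodic (the translation example with $d = 1$), so the time averages should approach the length $hi - lo$. The sketch below is a floating-point experiment under these assumptions, with a hypothetical function name and arbitrarily chosen parameters; it is not a proof.

```python
import math


def rotation_time_average(a: float, x0: float, n: int, lo: float, hi: float) -> float:
    """Fraction of the times i = 0, ..., n-1 at which x0 + i*a (mod 1) lies in [lo, hi).

    This is S_n f(x0) for the rotation by a and f the indicator of [lo, hi).
    """
    hits = 0
    for i in range(n):
        x = (x0 + i * a) % 1.0
        if lo <= x < hi:
            hits += 1
    return hits / n


if __name__ == "__main__":
    # Irrational angle: the time average should be close to the length 0.2.
    print(rotation_time_average(math.sqrt(2), 0.0, 100_000, 0.0, 0.2))
    # Rational angle 1/4: the orbit of 0.5 is {0.5, 0.75, 0.0, 0.25}, so the
    # time average is 1/4 rather than 0.2 and the rotation is not ergodic.
    print(rotation_time_average(0.25, 0.5, 1000, 0.0, 0.2))
```

The rational case shows what fails without ergodicity: the limit still exists, but it is an invariant function that is not constant, so it need not equal $\int f \, d\mu$.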

Corollary 12.7 (strong law of large numbers). Let $(X_n)_{n \ge 1}$ be a sequence of iid. random variables. Assume $\mathbb{E}(|X_1|) < \infty$. Let

$$S_n = \sum_{k=1}^n X_k,$$

then $\frac{1}{n} S_n$ converges almost surely to $\mathbb{E}(X_1)$.

Proof. Let $(X, \mathcal{A}, \mu, T)$ be the canonical model associated to $(X_n)_{n \ge 1}$, where $X = \mathbb{R}^{\mathbb{N}}$, $\mathcal{A} = \mathcal{B}(\mathbb{R})^{\otimes \mathbb{N}}$, $T$ the shift operator and $\mu = \nu^{\otimes \mathbb{N}}$ where $\nu$ is the law of $X_1$. It is a Bernoulli shift. Let

$$f: X \to \mathbb{R}, \quad x \mapsto x_1$$


be the first coordinate. Then $f \circ T^i(x) = x_{i+1}$ so

$$\frac{1}{n}(X_1 + \dots + X_n)(\omega) = S_n f(x)$$

where $x = (X_n(\omega))_{n \ge 1}$. Hence by Birkhoff ergodic theorem

$$\frac{1}{n}(X_1 + \dots + X_n) \to \bar{f} = \int f \, d\mu = \int x_1 \, d\mu = \mathbb{E}(X_1)$$

almost surely.

Remark. If $T$ is ergodic then $\bar{f}$ is almost everywhere constant. Hence, for $\mu$ a probability measure, $\bar{f} = \int f \, d\mu$.
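The strong law can be observed in a small simulation. The sketch below assumes iid samples from the uniform distribution on $[0, 1)$, so $\mathbb{E}(X_1) = 1/2$; the distribution, the seed and the helper name `running_means` are choices made for this example only.

```python
import random
from typing import Iterable, Iterator


def running_means(samples: Iterable[float]) -> Iterator[float]:
    """Yield (X_1 + ... + X_n) / n for n = 1, 2, ..."""
    total = 0.0
    for n, x in enumerate(samples, start=1):
        total += x
        yield total / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily, for reproducibility
    xs = [rng.random() for _ in range(100_000)]  # iid Uniform[0, 1) samples
    means = list(running_means(xs))
    for n in (10, 1_000, 100_000):
        print(n, means[n - 1])
```

A single run only shows one sample path, which is what almost sure convergence is about: along this path the running mean should settle near $1/2$.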

Lemma 12.8 (maximal ergodic lemma). Let $f \in L^1(X, \mathcal{A}, \mu)$ and $\alpha \in \mathbb{R}$. Let

$$E_\alpha = \{x \in X : \sup_{n \ge 1} S_n f(x) > \alpha\}$$

then

$$\alpha \mu(E_\alpha) \le \int_{E_\alpha} f \, d\mu.$$

Lemma 12.9 (maximal inequality). Let

$$f_0 = 0, \qquad f_n = n S_n f = \sum_{i=0}^{n-1} f \circ T^i \quad (n \ge 1).$$

Let

$$P_N = \{x \in X : \max_{0 \le n \le N} f_n(x) > 0\}.$$

Then

$$\int_{P_N} f \, d\mu \ge 0.$$

Proof of maximal inequality. Set $F_N = \max_{0 \le n \le N} f_n$. Observe that for all $n \le N$, $F_N \ge f_n$ and hence

$$F_N \circ T + f \ge f_n \circ T + f = f_{n+1}.$$

Now if $x \in P_N$ then $F_N(x) = \max_{1 \le n \le N} f_n(x)$, so

$$F_N(x) \le \max_{0 \le n \le N} f_{n+1}(x) \le F_N \circ T(x) + f(x).$$

Integrate to get

$$\int_{P_N} F_N \, d\mu \le \int_{P_N} F_N \circ T \, d\mu + \int_{P_N} f \, d\mu.$$

Note that $F_N(x) = 0$ if $x \notin P_N$ because $f_0 = 0$, and that $F_N \circ T \ge 0$, so

$$\int_{P_N} F_N = \int_X F_N \le \int_X F_N \circ T + \int_{P_N} f.$$


As $\mu$ is $T$-invariant, $\int_X F_N \circ T = \int_X F_N$ so

$$\int_{P_N} f \, d\mu \ge 0.$$
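The maximal inequality can be tested exactly on finite systems. The sketch below assumes counting measure on $\{0, \dots, m-1\}$, a bijection $T$ given as a list, and integer-valued $f$, so that $\int_{P_N} f \, d\mu$ is a finite sum; the function name and the random test data are hypothetical choices for illustration.

```python
import random
from typing import List


def maximal_inequality_integral(f: List[int], T: List[int], N: int) -> int:
    """Return the sum of f over P_N = {x : max_{0 <= n <= N} f_n(x) > 0}.

    Here f_0 = 0 and f_n = f + f∘T + ... + f∘T^(n-1); with counting measure
    the integral of f over P_N is just this sum.
    """
    total = 0
    for x in range(len(f)):
        partial, best, y = 0, 0, x   # partial = f_n(x); best starts at f_0 = 0
        for _ in range(N):
            partial += f[y]
            y = T[y]
            best = max(best, partial)
        if best > 0:                 # x belongs to P_N
            total += f[x]
    return total


if __name__ == "__main__":
    rng = random.Random(1)  # arbitrary fixed seed
    for _ in range(5):
        m = rng.randint(1, 8)
        T = list(range(m))
        rng.shuffle(T)      # a random bijection, hence measure-preserving
        f = [rng.randint(-5, 5) for _ in range(m)]
        print(T, f, [maximal_inequality_integral(f, T, N) for N in range(6)])
```

Every printed value should be non-negative, even when $f$ itself has negative total sum, because points where $f$ is very negative tend to fall outside $P_N$.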

Proof of maximal ergodic lemma. Apply the maximal inequality to $g = f - \alpha$. Observe that $S_n g = S_n f - \alpha$ and hence

$$E_\alpha(f) = \bigcup_{N \ge 1} P_N(g).$$

Since the sets $P_N(g)$ increase to $E_\alpha(f)$, the maximal inequality and dominated convergence give

$$\int_{E_\alpha(f)} (f - \alpha) \, d\mu \ge 0,$$

which is equivalent to

$$\alpha \mu(E_\alpha) \le \int_{E_\alpha} f \, d\mu.$$

Proof of Birkhoff (pointwise) ergodic theorem. Let

$$\bar{f} = \limsup_n S_n f, \qquad \underline{f} = \liminf_n S_n f.$$

Observe that $\bar{f} = \bar{f} \circ T$ and $\underline{f} \circ T = \underline{f}$: indeed

$$S_n f \circ T = \frac{1}{n}(f \circ T + \dots + f \circ T^n) = \frac{1}{n}((n+1) S_{n+1} f - f).$$

Need to show that $\bar{f} = \underline{f}$ $\mu$-almost everywhere. This is equivalent to showing that for all $\alpha, \beta \in \mathbb{Q}$ with $\alpha > \beta$, the set

$$E_{\alpha,\beta}(f) = \{x \in X : \underline{f}(x) < \beta, \ \bar{f}(x) > \alpha\}$$

is $\mu$-null, as then

$$\{x : \underline{f}(x) \ne \bar{f}(x)\} = \bigcup_{\alpha > \beta} E_{\alpha,\beta}$$

is $\mu$-null by subadditivity. Observe that $E_{\alpha,\beta}$ is $T$-invariant. Apply the maximal ergodic lemma to the system restricted to $E_{\alpha,\beta}$ to get

$$\alpha \mu(E_{\alpha,\beta}(f)) \le \int_{E_{\alpha,\beta}} f \, d\mu.$$

Dually

$$-\beta \mu(E_{\alpha,\beta}(f)) \le \int_{E_{\alpha,\beta}} -f \, d\mu$$


so

$$\alpha \mu(E_{\alpha,\beta}) \le \beta \mu(E_{\alpha,\beta}).$$

But $\alpha > \beta$ so $\mu(E_{\alpha,\beta}) = 0$.

We have proved that the limit $\lim_n S_n f$ exists almost everywhere, which we now define to be $\bar{f}$. It is left to show that $\bar{f} \in L^1$ and $\lim_n \|S_n f - \bar{f}\|_1 = 0$. This is an application of Fatou's lemma:

$$\begin{aligned}
\int |\bar{f}| \, d\mu &= \int \liminf_n |S_n f| \, d\mu \\
&\le \liminf_n \int |S_n f| \, d\mu \\
&= \liminf_n \|S_n f\|_1 \\
&\le \|f\|_1
\end{aligned}$$

so $\bar{f} \in L^1$, where the last inequality is because

$$\|S_n f\|_1 \le \frac{1}{n}(\|f\|_1 + \dots + \|f \circ T^{n-1}\|_1) = \|f\|_1.$$

Now to show $\|S_n f - \bar{f}\|_1 \to 0$, we truncate $f$. Let $M > 0$ and set $\varphi_M = f 1_{|f| < M}$. Note that

• $|\varphi_M| \le M$ so $|S_n \varphi_M| \le M$. Hence by dominated convergence theorem $\|S_n \varphi_M - \bar{\varphi}_M\|_1 \to 0$.

• $\varphi_M \to f$ $\mu$-almost everywhere as $M \to \infty$ and also in $L^1$ by dominated convergence theorem.

Thus by Fatou's lemma,

$$\|\bar{\varphi}_M - \bar{f}\|_1 \le \liminf_n \|S_n \varphi_M - S_n f\|_1 \le \|\varphi_M - f\|_1.$$

Finally

$$\begin{aligned}
\|S_n f - \bar{f}\|_1 &\le \|S_n f - S_n \varphi_M\|_1 + \|S_n \varphi_M - \bar{\varphi}_M\|_1 + \|\bar{\varphi}_M - \bar{f}\|_1 \\
&\le \|f - \varphi_M\|_1 + \|S_n \varphi_M - \bar{\varphi}_M\|_1 + \|f - \varphi_M\|_1
\end{aligned}$$

so

$$\limsup_n \|S_n f - \bar{f}\|_1 \le 2 \|f - \varphi_M\|_1$$

for all $M$, and the right-hand side goes to $0$ as $M \to \infty$.

Remark.

1. The theorem holds if $\mu$ is only assumed to be $\sigma$-finite.

2. The theorem holds if $f \in L^p$ for $p \in [1, \infty)$. Then $S_n f \to \bar{f}$ in $L^p$.



