University of Cambridge
Mathematics Tripos
Part II
Probability and Measure
Michaelmas, 2018
Lectures by E. Breuillard
Notes by Qiangru Kuang
Contents
1 Lebesgue measure
1.1 Boolean algebra
1.2 Jordan measure
1.3 Lebesgue measure
2 Abstract measure theory
3 Integration and measurable functions
4 Product measures
5 Foundations of probability theory
6 Independence
6.1 Useful inequalities
7 Convergence of random variables
8 $L^p$ spaces
9 Hilbert space and $L^2$-methods
10 Fourier transform
11 Gaussians
12 Ergodic theory
12.1 The canonical model
Index
1 Lebesgue measure
1.1 Boolean algebra
Definition (Boolean algebra). Let $X$ be a set. A Boolean algebra on $X$ is a family of subsets of $X$ which
1. contains $\emptyset$,
2. is stable under finite unions and complementation.
Example.
• The trivial Boolean algebra $\mathcal{B} = \{\emptyset, X\}$.
• The discrete Boolean algebra $\mathcal{B} = 2^X$, the family of all subsets of $X$.
• Less trivially, if $X$ is a topological space, the family of constructible sets forms a Boolean algebra, where a constructible set is a finite union of locally closed sets, i.e. sets of the form $E = U \cap F$ where $U$ is open and $F$ is closed.
Definition (finitely additive measure). Let $X$ be a set and $\mathcal{B}$ a Boolean algebra on $X$. A finitely additive measure on $(X, \mathcal{B})$ is a function $m : \mathcal{B} \to [0, +\infty]$ such that
1. $m(\emptyset) = 0$,
2. $m(E \cup F) = m(E) + m(F)$ whenever $E \cap F = \emptyset$.
Example.
1. Counting measure: $m(E) = \#E$, the cardinality of $E$, where $\mathcal{B}$ is the discrete Boolean algebra of $X$.
2. More generally, given $f : X \to [0, +\infty]$, define for $E \subseteq X$,
\[ m(E) = \sum_{e \in E} f(e). \]
3. Suppose $X = \bigsqcup_{i=1}^N X_i$, then define $\mathcal{B}(X)$ to be the unions of $X_i$'s. Assign a weight $a_i \ge 0$ to each $X_i$ and define $m(E) = \sum_{i : X_i \subseteq E} a_i$ for $E \in \mathcal{B}$.
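The weighted measure of the second example can be checked by hand on a small finite set. The following is a minimal sketch, not part of the lectures: the set $X$, the weights `f` and the helper name `weighted_measure` are all illustrative assumptions.

```python
from fractions import Fraction


def weighted_measure(weights):
    """Return m with m(E) = sum of weights[e] over e in E (E a finite set)."""
    def m(E):
        return sum((weights[e] for e in E), Fraction(0))
    return m


# Hypothetical weights f : X -> [0, +inf) on X = {"a", "b", "c", "d"}.
f = {"a": Fraction(1, 2), "b": Fraction(0), "c": Fraction(3), "d": Fraction(1, 4)}
m = weighted_measure(f)

E, F = {"a", "b"}, {"c"}           # two disjoint subsets of X
assert E & F == set()
assert m(set()) == 0               # m(empty set) = 0
assert m(E | F) == m(E) + m(F)     # additivity on disjoint sets

# The counting measure is the special case f = 1.
count = weighted_measure({x: Fraction(1) for x in f})
assert count(E | F) == len(E | F)
```

Exact rational arithmetic is used so that the additivity check is an equality rather than a floating-point approximation.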
1.2 Jordan measure
This section is a historical review and provides intuition for Lebesgue measure theory. We'll gloss over details of proofs in this section.
Definition. A subset of $\mathbb{R}^d$ is called elementary if it is a finite union of boxes, where a box is a set $B = I_1 \times \cdots \times I_d$ where each $I_i$ is a finite interval of $\mathbb{R}$.
Proposition 1.1. Let $B \subseteq \mathbb{R}^d$ be a box. Let $\mathcal{E}(B)$ be the family of elementary subsets of $B$. Then
1. $\mathcal{E}(B)$ is a Boolean algebra on $B$,
2. every $E \in \mathcal{E}(B)$ is a disjoint finite union of boxes,
3. if $E \in \mathcal{E}(B)$ can be written as a disjoint finite union in two ways, $E = \bigcup_{i=1}^n B_i = \bigcup_{j=1}^m B'_j$, then $\sum_{i=1}^n |B_i| = \sum_{j=1}^m |B'_j|$, where $|B| = \prod_{i=1}^d |b_i - a_i|$ if $B = I_1 \times \cdots \times I_d$ and $I_i$ has endpoints $a_i, b_i$.
Following this, we can define a finitely additive measure corresponding to our intuition of length, area, volume etc:
Proposition 1.2. Define $m(E) = \sum_{i=1}^n |B_i|$ if $E$ is any elementary set and is the disjoint union of boxes $B_i \subseteq \mathbb{R}^d$. Then $m$ is a finitely additive measure on $\mathcal{E}(B)$ for any box $B$.
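In dimension $d = 1$ the measure of an elementary set can be computed by first rewriting it as a disjoint union of intervals. The sketch below assumes intervals are given as pairs of endpoints $(a, b)$ with $a \le b$; the helper names `disjoint_pieces` and `elementary_measure` are illustrative, not from the notes.

```python
from fractions import Fraction


def disjoint_pieces(intervals):
    """Merge a finite list of intervals (a, b), a <= b, into disjoint ones.

    Whether endpoints are included is ignored: single points have length 0,
    so the merged list has the same total length as the union.
    """
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            # Overlaps or touches the previous piece: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def elementary_measure(intervals):
    """m(E) = sum of |B_i| over a disjoint decomposition of E."""
    return sum((b - a for a, b in disjoint_pieces(intervals)), Fraction(0))


# E = [0, 1] ∪ [1/2, 2] ∪ [3, 4], written with overlapping intervals.
E = [(Fraction(0), Fraction(1)), (Fraction(1, 2), Fraction(2)), (Fraction(3), Fraction(4))]

# The same set written as a different (disjoint) union of intervals.
E_alt = [(Fraction(0), Fraction(1, 2)), (Fraction(1, 2), Fraction(2)),
         (Fraction(3), Fraction(7, 2)), (Fraction(7, 2), Fraction(4))]

# Both decompositions give the same total length, as in Proposition 1.1(3).
assert elementary_measure(E) == elementary_measure(E_alt) == 3
```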
Definition. A subset $E \subseteq \mathbb{R}^d$ is Jordan measurable if for any $\varepsilon > 0$ there are elementary sets $A, B$ with $A \subseteq E \subseteq B$ and $m(B \setminus A) < \varepsilon$.
Remark. Jordan measurable sets are bounded.
Proposition 1.3. If a set $E \subseteq \mathbb{R}^d$ is Jordan measurable, then
\[ \sup_{A \subseteq E \text{ elementary}} m(A) = \inf_{B \supseteq E \text{ elementary}} m(B). \]
In which case we define the Jordan measure of $E$ as
\[ m(E) = \sup_{A \subseteq E} m(A). \]
Proof. Write $\sup$ and $\inf$ for the two quantities above. If $A \subseteq E \subseteq B$ with $A, B$ elementary then $m(A) \le m(B)$, so $\sup \le \inf$. Conversely, given $\varepsilon > 0$ pick elementary sets $A \subseteq E \subseteq B$ with $m(B \setminus A) < \varepsilon$. Then
\[ \inf \le m(B) = m(A) + m(B \setminus A) \le \sup + m(B \setminus A) \le \sup + \varepsilon \]
for arbitrary $\varepsilon > 0$, so they are equal.
Exercise.
1. If $B$ is a box, the family $\mathcal{J}(B)$ of Jordan measurable subsets of $B$ is a Boolean algebra.
2. A subset $E \subseteq [0, 1]$ is Jordan measurable if and only if $1_E$, the indicator function of $E$, is Riemann integrable.
1.3 Lebesgue measure
Although Jordan measure corresponds to the intuition of length, area and volume, it suffers from a few severe problems:
1. unbounded sets in $\mathbb{R}^d$ are not Jordan measurable.
2. $1_{\mathbb{Q} \cap [0,1]}$ is not Riemann integrable, and therefore $\mathbb{Q} \cap [0, 1]$ is not Jordan measurable.
3. pointwise limits of Riemann integrable functions need not be Riemann integrable: each $f_n := 1_{\frac{1}{n!}\mathbb{Z} \cap [0,1]}$ is Riemann integrable and $f_n \to 1_{\mathbb{Q} \cap [0,1]}$ pointwise, but the limit is not Riemann integrable.
The idea of Lebesgue is to use countable covers by boxes.
Definition. A subset $E \subseteq \mathbb{R}^d$ is Lebesgue measurable if for all $\varepsilon > 0$, there exists a countable union of boxes $C$ with $E \subseteq C$ and $m^*(C \setminus E) < \varepsilon$, where $m^*$, the Lebesgue outer measure, is defined as
\[ m^*(E) = \inf\Big\{ \sum_{n \ge 1} |B_n| : E \subseteq \bigcup_{n \ge 1} B_n,\ B_n \text{ boxes} \Big\} \]
for every subset $E \subseteq \mathbb{R}^d$.
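To see what countable covers buy us, one can cover the $n$-th rational of $[0, 1]$ by an interval of length $\varepsilon / 2^n$: the total length stays below $\varepsilon$, which suggests $m^*(\mathbb{Q} \cap [0, 1]) = 0$. The sketch below checks this for the first finitely many rationals only; the enumeration order and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import gcd


def rationals_in_unit_interval(count):
    """First `count` rationals in [0, 1], listed by increasing denominator."""
    out = [Fraction(0), Fraction(1)]
    q = 2
    while len(out) < count:
        out.extend(Fraction(p, q) for p in range(1, q) if gcd(p, q) == 1)
        q += 1
    return out[:count]


def cover(count, eps):
    """Boxes B_n centred at the n-th rational with length eps / 2^n."""
    points = rationals_in_unit_interval(count)
    return [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
            for n, x in enumerate(points, start=1)]


eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
boxes = cover(50, eps)

# Every listed rational lies in its own box ...
assert all(a < x < b for x, (a, b) in zip(points, boxes))
# ... and the total length is eps * (1 - 2^{-50}) < eps.
total = sum((b - a for a, b in boxes), Fraction(0))
assert total < eps
```

Letting the number of rationals grow, the bound $\sum_n \varepsilon/2^n = \varepsilon$ is unchanged, which is exactly the infimum argument in the definition of $m^*$.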
Remark. wlog in these definitions we may assume that boxes are open.
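Proposition 1.4. The family $\mathcal{L}$ of Lebesgue measurable subsets of $\mathbb{R}^d$ is a Boolean algebra stable under countable unions.
Lemma 1.5.
1. $m^*$ is monotone: if $E \subseteq F$ then $m^*(E) \le m^*(F)$.
2. $m^*$ is countably subadditive: if $E = \bigcup_{n \ge 1} E_n$ where $E_n \subseteq \mathbb{R}^d$ then
\[ m^*(E) \le \sum_{n \ge 1} m^*(E_n). \]
Proof. Monotonicity is obvious. For countable subadditivity, pick $\varepsilon > 0$ and let $C_n = \bigcup_{i \ge 1} C_{n,i}$ where $C_{n,i}$ are boxes such that $E_n \subseteq C_n$ and
\[ \sum_{i \ge 1} |C_{n,i}| \le m^*(E_n) + \frac{\varepsilon}{2^n}. \]
Then
\[ \sum_{n \ge 1} \sum_{i \ge 1} |C_{n,i}| \le \sum_{n \ge 1} \Big( m^*(E_n) + \frac{\varepsilon}{2^n} \Big) = \varepsilon + \sum_{n \ge 1} m^*(E_n) \]
and $E \subseteq \bigcup_{n \ge 1} C_n = \bigcup_{n \ge 1} \bigcup_{i \ge 1} C_{n,i}$ so
\[ m^*(E) \le \varepsilon + \sum_{n \ge 1} m^*(E_n) \]
for all $\varepsilon > 0$.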
Remark. Note that $m^*$ is not additive on the family of all subsets of $\mathbb{R}^d$. However, it will be on $\mathcal{L}$, as we will show later.
Lemma 1.6. If $A, B$ are disjoint compact subsets of $\mathbb{R}^d$ then
\[ m^*(A \cup B) = m^*(A) + m^*(B). \]
Proof. $\le$ holds by the previous lemma so we need to show $\ge$. Pick $\varepsilon > 0$. Let $A \cup B \subseteq \bigcup_{n \ge 1} B_n$ where $B_n$ are open boxes such that
\[ \sum_{n \ge 1} |B_n| \le m^*(A \cup B) + \varepsilon. \]
wlog we may assume that the side lengths of each $B_n$ are $< \frac{\alpha}{2}$, where
\[ \alpha = \inf\{ \|x - y\|_1 : x \in A,\ y \in B \} > 0, \]
where the inequality comes from the fact that $A$ and $B$ are disjoint and compact. wlog we may discard the $B_n$'s that do not intersect $A \cup B$. Then each remaining box meets exactly one of $A$ and $B$; the boxes missing $A$ cover $B$ and the boxes missing $B$ cover $A$, so by construction
\[ \sum_{n \ge 1} |B_n| = \sum_{n \ge 1,\ B_n \cap A = \emptyset} |B_n| + \sum_{n \ge 1,\ B_n \cap B = \emptyset} |B_n| \ge m^*(B) + m^*(A) \]
so
\[ \varepsilon + m^*(A \cup B) \ge m^*(A) + m^*(B) \]
for all $\varepsilon$.
Lemma 1.7. If $E \subseteq \mathbb{R}^d$ has $m^*(E) = 0$ then $E \in \mathcal{L}$.
Definition (null set). A set $E \subseteq \mathbb{R}^d$ such that $m^*(E) = 0$ is called a null set.
Proof. For all $\varepsilon > 0$, there exists $C = \bigcup_{n \ge 1} B_n$ where $B_n$ are boxes such that $E \subseteq C$ and $\sum_{n \ge 1} |B_n| \le \varepsilon$. But
\[ m^*(C \setminus E) \le m^*(C) \le \varepsilon. \]
Lemma 1.8. Every open subset of $\mathbb{R}^d$ and every closed subset of $\mathbb{R}^d$ is in $\mathcal{L}$.
We will prove the lemma using the fact that the family of Lebesgue measurable subsets is stable under countable unions, which itself does not use this lemma. This lemma, however, will be used to show stability under complementation. Since the proof is quite technical (it has more to do with general topology than measure theory), for brevity and flow of ideas we present the proof of the main proposition first.
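Proof of Proposition 1.4. It is obvious that $\emptyset \in \mathcal{L}$. To show it is stable under countable unions, start with $E_n \in \mathcal{L}$ for $n \ge 1$. We need to show $E := \bigcup_{n \ge 1} E_n \in \mathcal{L}$.
Pick $\varepsilon > 0$. By assumption there exist $C_n = \bigcup_{i \ge 1} B_{n,i}$ where $B_{n,i}$ are boxes such that $E_n \subseteq C_n$ and
\[ m^*(C_n \setminus E_n) < \frac{\varepsilon}{2^n}. \]
Now
\[ E = \bigcup_{n \ge 1} E_n \subseteq \bigcup_{n \ge 1} C_n =: C \]
so $C$ is again a countable union of boxes and $C \setminus E \subseteq \bigcup_{n \ge 1} (C_n \setminus E_n)$, so
\[ m^*(C \setminus E) \le \sum_{n \ge 1} m^*(C_n \setminus E_n) \le \sum_{n \ge 1} \frac{\varepsilon}{2^n} = \varepsilon \]
by countable subadditivity, so $E \in \mathcal{L}$.
To show it is stable under complementation, suppose $E \in \mathcal{L}$. By assumption there exist $C_n$, a countable union of boxes, with $E \subseteq C_n$ and $m^*(C_n \setminus E) \le \frac{1}{n}$. wlog we may assume the boxes are open so $C_n$ is open, $C_n^c$ is closed and so $C_n^c \in \mathcal{L}$. Thus $\bigcup_{n \ge 1} C_n^c \in \mathcal{L}$ by the first part of the proof. But
\[ m^*\Big(E^c \setminus \bigcup_{n \ge 1} C_n^c\Big) \le m^*(E^c \setminus C_n^c) = m^*(C_n \setminus E) \le \frac{1}{n} \]
so $m^*(E^c \setminus \bigcup_{n \ge 1} C_n^c) = 0$, so $E^c \setminus \bigcup_{n \ge 1} C_n^c \in \mathcal{L}$ since it is a null set. But
\[ E^c = \Big(E^c \setminus \bigcup_{n \ge 1} C_n^c\Big) \cup \bigcup_{n \ge 1} C_n^c, \]
both of which are in $\mathcal{L}$, so $E^c \in \mathcal{L}$.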
Proof of Lemma 1.8. Every open set in $\mathbb{R}^d$ is a countable union of boxes so is in $\mathcal{L}$. It is more subtle for closed sets. The key observation is that every closed set is a countable union of compact subsets, so we are left to show that compact subsets of $\mathbb{R}^d$ are in $\mathcal{L}$.
Let $F \subseteq \mathbb{R}^d$ be compact. For all $k \ge 1$, there exists a countable union of open boxes $O_k := \bigcup_{i \ge 1} O_{k,i}$ such that $F \subseteq O_k$ and
\[ \sum_{i \ge 1} |O_{k,i}| \le m^*(F) + \frac{1}{2^k}. \]
By compactness there exists a finite subcover so we can assume $O_k$ is a finite union of open boxes. Moreover, wlog assume that
1. the side lengths of $O_{k,i}$ are $\le \frac{1}{2^k}$.
2. for each $i$, $O_{k,i}$ intersects $F$.
3. $O_{k+1} \subseteq O_k$ (by replacing $O_{k+1}$ with $O_{k+1} \cap O_k$ iteratively).
Then $F = \bigcap_{k \ge 1} O_k$ and we are left to show $m^*(O_k \setminus F) \to 0$. By additivity on disjoint compact sets,
\[ m^*(F) + m^*(O_k \setminus O_{k+1}) = m^*(F \cup (O_k \setminus O_{k+1})) \]
so
\[ m^*(F) + m^*(O_k \setminus O_{k+1}) \le m^*(O_k) \le \sum_{i \ge 1} |O_{k,i}| \le m^*(F) + \frac{1}{2^k} \]
so $m^*(O_k \setminus O_{k+1}) \le \frac{1}{2^k}$. Finally,
\[ m^*(O_k \setminus F) = m^*\Big( \bigcup_{i \ge k} (O_i \setminus O_{i+1}) \Big) \le \sum_{i \ge k} m^*(O_i \setminus O_{i+1}) \le \sum_{i \ge k} \frac{1}{2^i} = \frac{1}{2^{k-1}}. \]
The result we're working towards is
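Proposition 1.9. $m^*$ is countably additive on $\mathcal{L}$, i.e. if $(E_n)_{n \ge 1}$ where $E_n \in \mathcal{L}$ are pairwise disjoint then
\[ m^*\Big( \bigcup_{n \ge 1} E_n \Big) = \sum_{n \ge 1} m^*(E_n). \]
Lemma 1.10. If $E \in \mathcal{L}$ then for all $\varepsilon > 0$ there exist $U$ open and $F$ closed with $F \subseteq E \subseteq U$ such that $m^*(U \setminus E) < \varepsilon$ and $m^*(E \setminus F) < \varepsilon$.
Proof. By definition of $\mathcal{L}$, there exists a countable union of open boxes with $E \subseteq \bigcup_{n \ge 1} B_n$ such that $m^*(\bigcup_{n \ge 1} B_n \setminus E) < \varepsilon$. Just take $U = \bigcup_{n \ge 1} B_n$, which is open.
For $F$ do the same with $E^c = \mathbb{R}^d \setminus E$ in place of $E$ and take the complement of the resulting open set.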
Proof of Proposition 1.9. First we assume each $E_n$ is compact. By a previous lemma $m^*$ is additive on disjoint compact sets so for all $N \in \mathbb{N}$,
\[ m^*\Big( \bigcup_{n=1}^N E_n \Big) = \sum_{n=1}^N m^*(E_n). \]
In particular
\[ \sum_{n=1}^N m^*(E_n) \le m^*\Big( \bigcup_{n \ge 1} E_n \Big) \]
since $m^*$ is monotone. Take $N \to \infty$ to get one inequality. The other direction holds by countable subadditivity of $m^*$.
Now assume that each $E_n$ is a bounded subset in $\mathcal{L}$. By the lemma there exists $K_n \subseteq E_n$ closed, so compact, such that $m^*(E_n \setminus K_n) \le \frac{\varepsilon}{2^n}$. Since the $K_n$'s are disjoint, by the previous case
\[ m^*\Big( \bigcup_{n \ge 1} K_n \Big) = \sum_{n \ge 1} m^*(K_n) \]
then
\[ \sum_{n \ge 1} m^*(E_n) \le \sum_{n \ge 1} \big( m^*(K_n) + m^*(E_n \setminus K_n) \big) \le m^*\Big( \bigcup_{n \ge 1} K_n \Big) + \sum_{n \ge 1} \frac{\varepsilon}{2^n} \le m^*\Big( \bigcup_{n \ge 1} E_n \Big) + \varepsilon \]
which gives one direction of the inequality. As before, the other direction holds by countable subadditivity of $m^*$.
For the general case, note that $\mathbb{R}^d = \bigcup_{n \in \mathbb{Z}^d} A_n$ where $A_n$ is bounded and in $\mathcal{L}$, for example by taking $A_n$ to be a product of half-open intervals of unit length. Write $E_n$ as $\bigcup_{m \in \mathbb{Z}^d} E_n \cap A_m$ and apply the previous results to $(E_n \cap A_m)_{n \ge 1,\ m \in \mathbb{Z}^d}$.
Definition (Lebesgue measure). $m^*$ when restricted to $\mathcal{L}$ is called the Lebesgue measure and is simply denoted by $m$.
Example (Vitali counterexample). Although $\mathcal{L}$ is pretty big (it includes all open and closed sets, countable unions and intersections of them, and has cardinality at least $2^{\mathfrak{c}}$ where $\mathfrak{c}$ is the continuum, by considering a null set with cardinality $\mathfrak{c}$, and each subset thereof), it does not include every subset of $\mathbb{R}^d$.
Consider $(\mathbb{Q}, +)$, the additive subgroup of $(\mathbb{R}, +)$. Pick a set $E$ of representatives of the cosets of $(\mathbb{Q}, +)$. Choose it inside $[0, 1]$. For each $x \in \mathbb{R}$, there exists a unique $e \in E$ such that $x - e \in \mathbb{Q}$ (here we require the axiom of choice). Claim that $E \notin \mathcal{L}$ and $m^*$ is not additive on the family of all subsets of $\mathbb{R}^d$.
Proof. Pick distinct rationals $p_1, \dots, p_N$ in $[0, 1]$. The sets $p_i + E$ are pairwise disjoint so if $m^*$ were additive then we would have
\[ m^*\Big( \bigcup_{i=1}^N (p_i + E) \Big) = \sum_{i=1}^N m^*(p_i + E) = N m^*(E) \]
by translation invariance of $m^*$. But then
\[ \bigcup_{i=1}^N (p_i + E) \subseteq [0, 2] \]
since $E \subseteq [0, 1]$, so by monotonicity of $m^*$ we have
\[ m^*\Big( \bigcup_{i=1}^N (p_i + E) \Big) \le 2 \]
so $N m^*(E) \le 2$ for all $N$, so $m^*(E) = 0$. But
\[ [0, 1] \subseteq \bigcup_{q \in \mathbb{Q}} (E + q) = \mathbb{R}, \]
so by countable subadditivity of $m^*$,
\[ 1 = m^*([0, 1]) \le \sum_{q \in \mathbb{Q}} m^*(E + q) = 0. \]
Absurd.
In particular $E \notin \mathcal{L}$ as $m^*$ is additive on $\mathcal{L}$.
2 Abstract measure theory
In this chapter we extend measure theory to arbitrary sets. Most of the theory was developed by Fréchet and Carathéodory.
Definition ($\sigma$-algebra). A $\sigma$-algebra on a set $X$ is a Boolean algebra stable under countable unions.
Definition (measurable space). A measurable space is a couple $(X, \mathcal{A})$ where $X$ is a set and $\mathcal{A}$ is a $\sigma$-algebra on $X$.
Definition (measure). A measure on $(X, \mathcal{A})$ is a map $\mu : \mathcal{A} \to [0, \infty]$ such that
1. $\mu(\emptyset) = 0$,
2. $\mu$ is countably additive (also known as $\sigma$-additive), i.e. for every family $(A_n)_{n \ge 1}$ of disjoint subsets in $\mathcal{A}$, we have
\[ \mu\Big( \bigcup_{n \ge 1} A_n \Big) = \sum_{n \ge 1} \mu(A_n). \]
The triple $(X, \mathcal{A}, \mu)$ is called a measure space.
Example.
1. $(\mathbb{R}^d, \mathcal{L}, m)$ is a measure space.
2. $(X, 2^X, \#)$ where $\#$ is the counting measure.
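Proposition 2.1. Let $(X, \mathcal{A}, \mu)$ be a measure space. Then
1. $\mu$ is monotone: $A \subseteq B$ implies $\mu(A) \le \mu(B)$,
2. $\mu$ is countably subadditive: $\mu(\bigcup_{n \ge 1} A_n) \le \sum_{n \ge 1} \mu(A_n)$,
3. upward monotone convergence: if
\[ E_1 \subseteq E_2 \subseteq \cdots \subseteq E_n \subseteq \dots \]
then
\[ \mu\Big( \bigcup_{n \ge 1} E_n \Big) = \lim_{n \to \infty} \mu(E_n) = \sup_{n \ge 1} \mu(E_n). \]
4. downward monotone convergence: if
\[ E_1 \supseteq E_2 \supseteq \cdots \supseteq E_n \supseteq \dots \]
and $\mu(E_1) < \infty$ then
\[ \mu\Big( \bigcap_{n \ge 1} E_n \Big) = \lim_{n \to \infty} \mu(E_n) = \inf_{n \ge 1} \mu(E_n). \]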
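Proof.
1. By additivity of $\mu$,
\[ \mu(B) = \mu(A) + \underbrace{\mu(B \setminus A)}_{\ge 0} \ge \mu(A). \]
2. See example sheet. The idea is that every countable union $\bigcup_{n \ge 1} A_n$ is a disjoint countable union $\bigcup_{n \ge 1} B_n$ where for each $n$, $B_n \subseteq A_n$. It then follows by $\sigma$-additivity.
3. Let $E_0 = \emptyset$ so
\[ \bigcup_{n \ge 1} E_n = \bigsqcup_{n \ge 1} (E_n \setminus E_{n-1}), \]
a disjoint union. By $\sigma$-additivity,
\[ \mu\Big( \bigcup_{n \ge 1} E_n \Big) = \sum_{n \ge 1} \mu(E_n \setminus E_{n-1}) \]
but for all $N$, by additivity of $\mu$,
\[ \sum_{n=1}^N \mu(E_n \setminus E_{n-1}) = \mu(E_N) \]
so take the limit. The supremum part is obvious.
4. Apply the previous result to $E_1 \setminus E_n$.
Remark. Note the $\mu(E_1) < \infty$ condition in the last part. Counterexample: $E_n = [n, \infty) \subseteq \mathbb{R}$, where every $E_n$ has infinite Lebesgue measure but $\bigcap_{n \ge 1} E_n = \emptyset$.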
Definition ($\sigma$-algebra generated by a family). Let $X$ be a set and $\mathcal{F}$ be some family of subsets of $X$. Then the intersection of all $\sigma$-algebras on $X$ containing $\mathcal{F}$ is a $\sigma$-algebra, called the $\sigma$-algebra generated by $\mathcal{F}$ and denoted by $\sigma(\mathcal{F})$.
Proof. Easy check. See example sheet.
Example.
1. Suppose $X = \bigsqcup_{i=1}^N X_i$, i.e. $X$ admits a finite partition. Let $\mathcal{F} = \{X_1, \dots, X_N\}$, then $\sigma(\mathcal{F})$ consists of all subsets that are unions of $X_i$'s.
2. Suppose $X$ is countable and let $\mathcal{F}$ be the collection of all singletons. Then $\sigma(\mathcal{F}) = 2^X$.
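On a finite set both examples can be verified by brute force, since there a $\sigma$-algebra is the same thing as a Boolean algebra. The following sketch closes a family under complements and pairwise unions until nothing new appears; the function name `generated_sigma_algebra` and the sample sets are illustrative assumptions.

```python
from itertools import combinations


def generated_sigma_algebra(X, family):
    """Smallest family of subsets of the finite set X that contains `family`
    and is closed under complement and (finite) union."""
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(F) for F in family}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        for A in current:                       # close under complement
            if X - A not in sigma:
                sigma.add(X - A)
                changed = True
        for A, B in combinations(current, 2):   # close under pairwise union
            if A | B not in sigma:
                sigma.add(A | B)
                changed = True
    return sigma


# Example 1: a partition of X = {1,...,6} into N = 3 blocks gives 2^3 sets,
# namely all unions of blocks.
partition = [{1, 2}, {3}, {4, 5, 6}]
sigma_partition = generated_sigma_algebra(range(1, 7), partition)
assert len(sigma_partition) == 2 ** 3

# Example 2: singletons of a finite set generate the full power set.
sigma_singletons = generated_sigma_algebra({1, 2, 3}, [{1}, {2}, {3}])
assert len(sigma_singletons) == 2 ** 3
```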
Definition (Borel $\sigma$-algebra). Let $X$ be a topological space. The $\sigma$-algebra generated by the open subsets of $X$ is called the Borel $\sigma$-algebra of $X$, denoted by $\mathcal{B}(X)$.
Proposition 2.2. If $X = \mathbb{R}^d$ then $\mathcal{B}(X) \subseteq \mathcal{L}$. Moreover every $A \in \mathcal{L}$ can be written as a disjoint union $A = B \cup N$ where $B \in \mathcal{B}(X)$ and $N$ is a null set.
Proof. We've shown that $\mathcal{L}$ is a $\sigma$-algebra and contains all open sets so $\mathcal{B}(X) \subseteq \mathcal{L}$. Given $A \in \mathcal{L}$, $A^c \in \mathcal{L}$ so for all $n \ge 1$ there exists $C_n$, a countable union of (open) boxes, such that $A^c \subseteq C_n$ and $m^*(C_n \setminus A^c) \le \frac{1}{n}$. Take $C = \bigcap_{n \ge 1} C_n \in \mathcal{B}(X)$. Thus $B := C^c \in \mathcal{B}(X)$, $B \subseteq A$, and $m(A \setminus B) = 0$ because $A \setminus B = C \setminus A^c$.
Remark.
1. It can be shown that $\mathcal{B}(\mathbb{R}^d) \subsetneq \mathcal{L}$. In fact $|\mathcal{L}| \ge 2^{\mathfrak{c}}$ and $|\mathcal{B}(\mathbb{R}^d)| = \mathfrak{c}$.
2. If $\mathcal{F}$ is a family of subsets of a set $X$, the Boolean algebra generated by $\mathcal{F}$ can be explicitly described as
\[ \mathcal{B}(\mathcal{F}) = \{ \text{finite unions of } F_1 \cap \cdots \cap F_N : F_i \in \mathcal{F} \text{ or } F_i^c \in \mathcal{F} \}. \]
3. However, this is not so for $\sigma(\mathcal{F})$. There is no "simple" description of the $\sigma$-algebra generated by $\mathcal{F}$ (c.f. Borel hierarchy in descriptive set theory and transfinite induction).
Definition ($\pi$-system). A family $\mathcal{F}$ of subsets of a set $X$ is called a $\pi$-system if it contains $\emptyset$ and it is closed under finite intersection.
Proposition 2.3 (measure uniqueness). Let $(X, \mathcal{A})$ be a measurable space. Assume $\mu_1$ and $\mu_2$ are two finite measures (i.e. $\mu_i(X) < \infty$) such that $\mu_1(F) = \mu_2(F)$ for every $F \in \mathcal{F}$ where $\mathcal{F}$ is a $\pi$-system with $\sigma(\mathcal{F}) = \mathcal{A}$. Then $\mu_1 = \mu_2$.
For $\mathbb{R}^d$, we only have to check open boxes.
Proof. We state first the following lemma:
Lemma 2.4 (Dynkin lemma). If $\mathcal{F}$ is a $\pi$-system on $X$ and $\mathcal{C}$ is a family of subsets of $X$ such that $\mathcal{F} \subseteq \mathcal{C}$ and $\mathcal{C}$ is stable under complementation and disjoint countable unions, then $\sigma(\mathcal{F}) \subseteq \mathcal{C}$.
Let $\mathcal{C} = \{A \in \mathcal{A} : \mu_1(A) = \mu_2(A)\}$. Then $\mathcal{C}$ is clearly stable under complementation as
\[ \mu_i(A^c) = \mu_i(X \setminus A) = \mu_i(X) - \mu_i(A). \]
$\mathcal{C}$ is also clearly stable under countable disjoint unions by $\sigma$-additivity. Thus by Dynkin lemma, $\mathcal{C} \supseteq \sigma(\mathcal{F}) = \mathcal{A}$.
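Proof of Dynkin lemma. Let $\mathcal{M}$ be the smallest family of subsets of $X$ containing $\mathcal{F}$ and stable under complementation and countable disjoint unions ($2^X$ is such a family, and we take the intersection of all such families). It is sufficient to show that $\mathcal{M}$ is a $\sigma$-algebra, as then $\mathcal{M} \subseteq \mathcal{C}$ implies $\sigma(\mathcal{F}) \subseteq \mathcal{C}$.
It suffices to show $\mathcal{M}$ is a Boolean algebra. Let
\[ \mathcal{M}' = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{F}\}. \]
$\mathcal{M}'$ again is stable under countable disjoint unions and complementation because
\[ A^c \cap B = (B^c \cup (A \cap B))^c \]
where the union is disjoint, so $A^c \cap B$ is in $\mathcal{M}$.
As $\mathcal{M}' \supseteq \mathcal{F}$, by minimality of $\mathcal{M}$ we have $\mathcal{M} = \mathcal{M}'$. Now let
\[ \mathcal{M}'' = \{A \in \mathcal{M}' : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{M}\}. \]
The same argument shows that $\mathcal{M}'' = \mathcal{M}$. Thus $\mathcal{M}$ is a Boolean algebra and a $\sigma$-algebra.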
Proposition 2.5 (uniqueness of Lebesgue measure). Lebesgue measure is the unique translation invariant measure $m$ on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ such that
\[ m([0, 1]^d) = 1. \]
Proof. Exercise. Hint: use the $\pi$-system $\mathcal{F}$ made of all boxes in $\mathbb{R}^d$ and dissect a cube into dyadic pieces. Then approximate and use monotone convergence.
Remark.
1. There is no countably additive translation invariant measure on $\mathbb{R}$ defined on all subsets of $\mathbb{R}$ (c.f. Vitali's counterexample).
2. However, the Lebesgue measure can be extended to a finitely additive measure on all subsets of $\mathbb{R}$ (the proof requires the Hahn-Banach theorem. See IID Linear Analysis).
Recall the construction of Lebesgue measure: we take boxes in $\mathbb{R}^d$, and define elementary sets, which form the Boolean algebra generated by boxes. Then we can define Jordan measure, which is finitely additive. However, this is not countably additive but analysis craves limits, so we define Lebesgue measurable sets by introducing the outer measure $m^*$, which is built from the Jordan measure. Finally we restrict this outer measure to $\mathcal{L}$. We also define the Borel $\sigma$-algebra, which is the same as the $\sigma$-algebra generated by the boxes. We show that the Borel $\sigma$-algebra is contained in $\mathcal{L}$, and every element in $\mathcal{L}$ can be written as a disjoint union of an element in the Borel $\sigma$-algebra and a measure zero set.
Suppose $\mathcal{B}$ is a Boolean algebra on a set $X$. Let $\mu$ be a finitely additive measure on $\mathcal{B}$. We are going to construct a measure on $\sigma(\mathcal{B})$.
Theorem 2.6 (Carathéodory extension theorem). Assume that $\mu$ is countably additive on $\mathcal{B}$, i.e. if $B_n \in \mathcal{B}$ are disjoint and such that $\bigcup_{n \ge 1} B_n \in \mathcal{B}$ then $\mu(\bigcup_{n \ge 1} B_n) = \sum_{n \ge 1} \mu(B_n)$, and assume that $\mu$ is $\sigma$-finite, i.e. there exist $X_m \in \mathcal{B}$ such that $X = \bigcup_{m \ge 1} X_m$ and $\mu(X_m) < \infty$. Then $\mu$ extends uniquely to a measure on $\sigma(\mathcal{B})$.
Proof. For any $E \subseteq X$, let
\[ \mu^*(E) = \inf\Big\{ \sum_{n \ge 1} \mu(B_n) : E \subseteq \bigcup_{n \ge 1} B_n,\ B_n \in \mathcal{B} \Big\} \]
and call it the outer measure associated to $\mu$. Define a subset $E \subseteq X$ to be $\mu^*$-measurable if for all $\varepsilon > 0$ there exists $C = \bigcup_{n \ge 1} B_n$ with $B_n \in \mathcal{B}$ such that $E \subseteq C$ and
\[ \mu^*(C \setminus E) \le \varepsilon. \]
We denote by $\mathcal{B}^*$ the set of $\mu^*$-measurable subsets. Claim that
1. $\mu^*$ is countably subadditive and monotone.
2. $\mu^*(B) = \mu(B)$ for all $B \in \mathcal{B}$.
3. $\mathcal{B}^*$ is a $\sigma$-algebra and contains all $\mu^*$-null sets and $\mathcal{B}$.
4. $\mu^*$ is $\sigma$-additive on $\mathcal{B}^*$.
Then existence follows from the claims as $\mathcal{B}^* \supseteq \sigma(\mathcal{B})$: $\mu^*$ will be a measure on $\mathcal{B}^*$ and thus on $\sigma(\mathcal{B})$. Uniqueness follows from a proof similar to that for Lebesgue measure, via Dynkin lemma.
Proof. This will be very easy as we only need to adapt our previous work to the general case. Note that on a few occasions we used properties of $\mathbb{R}^d$, such as openness of some sets, so be careful.
1. Same.
2. $\mu^*(B) \le \mu(B)$ for all $B \in \mathcal{B}$ by definition of $\mu^*$. For the other direction, for all $\varepsilon > 0$, there exist $B_n \in \mathcal{B}$ such that $B \subseteq \bigcup_{n \ge 1} B_n$ and $\sum_{n \ge 1} \mu(B_n) \le \mu^*(B) + \varepsilon$. But
\[ B = \bigcup_{n \ge 1} (B_n \cap B) = \bigsqcup_{n \ge 1} C_n \]
where $C_n := (B_n \cap B) \setminus \bigcup_{i < n} (B \cap B_i)$ and so $C_n \in \mathcal{B}$. Thus by countable additivity
\[ \mu(B) = \sum_{n \ge 1} \mu(C_n) \le \sum_{n \ge 1} \mu(B_n) \le \mu^*(B) + \varepsilon. \]
3. $\mu^*$-null sets and $\mathcal{B}$ are obviously in $\mathcal{B}^*$. Thus it is left to show that $\mathcal{B}^*$ is a $\sigma$-algebra. Stability under countable union is exactly the same and then we claim that $\mathcal{B}^*$ is stable under complementation. This is the bit where we used closed/open sets in $\mathbb{R}^d$ in the original proof. Here we use a lemma as a substitute.
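Lemma 2.7. Suppose $B_n \in \mathcal{B}$ then $\bigcap_{n \ge 1} B_n \in \mathcal{B}^*$.
Proof. First claim that if $E = \bigcap_{n \ge 1} I_n$ where $I_{n+1} \subseteq I_n$ and $I_n \in \mathcal{B}$ such that $\mu(I_1) < \infty$ then $\mu^*(E) = \lim_{n \to \infty} \mu(I_n)$ and $E \in \mathcal{B}^*$: by additivity of $\mu$ on $\mathcal{B}$,
\[ \sum_{n=1}^{N-1} \mu(I_n \setminus I_{n+1}) = \mu(I_1) - \mu(I_N) \]
which converges as $N \to \infty$ (because $\mu(I_{n+1}) \le \mu(I_n)$), so
\[ \sum_{n \ge N} \mu(I_n \setminus I_{n+1}) \to 0 \]
as $N \to \infty$. But the LHS is at least $\mu^*(I_N \setminus E)$ because $I_N \setminus E = \bigcup_{n \ge N} (I_n \setminus I_{n+1})$. Therefore $E \in \mathcal{B}^*$ and
\[ \mu(I_N) \le \underbrace{\mu^*(I_N \setminus E)}_{\to 0} + \underbrace{\mu^*(E)}_{\le \mu(I_N)} \]
so
\[ \lim_{N \to \infty} \mu(I_N) = \mu^*(E). \]
Now for the actual lemma, let $E = \bigcap_{n \ge 1} I_n$ where $I_n \in \mathcal{B}$. wlog we may assume $I_{n+1} \subseteq I_n$. By the $\sigma$-finiteness assumption, $X = \bigcup_{m \ge 1} X_m$ where $X_m \in \mathcal{B}$ with $\mu(X_m) < \infty$ so
\[ E = \bigcup_{m \ge 1} (E \cap X_m). \]
By the claim, $E \cap X_m \in \mathcal{B}^*$ for all $m$, so $E \in \mathcal{B}^*$.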
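From the lemma we can derive that $\mathcal{B}^*$ is also stable under complementation: given $E \in \mathcal{B}^*$, for all $n$ there exist $C_n = \bigcup_{i \ge 1} B_{n,i}$ where $B_{n,i} \in \mathcal{B}$ such that $E \subseteq C_n$ and $\mu^*(C_n \setminus E) \le \frac{1}{n}$. Now
\[ E^c = \Big( \bigcup_{n \ge 1} C_n^c \Big) \cup \Big( E^c \setminus \bigcup_{n \ge 1} C_n^c \Big) \]
but $C_n^c$ is a countable intersection $\bigcap_{i \ge 1} B_{n,i}^c$ and $E^c \setminus \bigcup_{n \ge 1} C_n^c$ is $\mu^*$-null. By the lemma, $C_n^c \in \mathcal{B}^*$, therefore their union is also in $\mathcal{B}^*$. Since we've shown that null sets are in $\mathcal{B}^*$, $E^c \in \mathcal{B}^*$.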
4. We want to show $\mu^*$ is countably additive on $\mathcal{B}^*$. Recall that $\mu$ is $\sigma$-finite: there exist $X_m \in \mathcal{B}$ such that $X = \bigcup_{m \ge 1} X_m$, $\mu(X_m) < \infty$. We say $E \subseteq X$ is bounded if there exists $m$ such that $E \subseteq X_m$. It is then enough to show countable additivity for bounded sets by the same argument as before: write $X = \bigsqcup_{m \ge 1} \tilde{X}_m$ where $\tilde{X}_m = X_m \setminus \bigcup_{i < m} X_i \in \mathcal{B}$ so this is a disjoint union. Then if $E = \bigsqcup_{n \ge 1} E_n$ as a disjoint union then
\[ E = \bigsqcup_{n \ge 1} \bigsqcup_{m \ge 1} (E_n \cap \tilde{X}_m) \]
which is also a countable disjoint union.
Given $E$, if we can show finite additivity then
\[ \sum_{n=1}^N \mu^*(E_n) = \mu^*\Big( \bigsqcup_{n=1}^N E_n \Big) \le \mu^*(E) \le \sum_{n \ge 1} \mu^*(E_n) \]
and we take the limit as $N \to \infty$ to have equality throughout.
It suffices to prove finite additivity when $E$ and $F$ are countable intersections of sets from $\mathcal{B}$: $E, F \in \mathcal{B}^*$ so for $\varepsilon > 0$ there exist $C, D$ countable intersections of sets from $\mathcal{B}$ such that $C \subseteq E$, $D \subseteq F$ and
\[ \mu^*(E) \le \mu^*(C) + \varepsilon, \qquad \mu^*(F) \le \mu^*(D) + \varepsilon. \]
As $E \cap F = \emptyset$ and $C \subseteq E$, $D \subseteq F$, we have $C \cap D = \emptyset$ so by finite additivity,
\[ \mu^*(E) + \mu^*(F) \le 2\varepsilon + \mu^*(C \cup D) \le 2\varepsilon + \mu^*(E \cup F). \]
As usual, the reverse holds by subadditivity.
Finally for $E = \bigcap_{n \ge 1} I_n$, $F = \bigcap_{n \ge 1} J_n$ bounded, wlog assume $I_{n+1} \subseteq I_n$, $J_{n+1} \subseteq J_n$ and $\mu(I_n), \mu(J_n) < \infty$. Now use the claim in the proof of Lemma 2.7,
\[ \mu^*(E) = \lim_{n \to \infty} \mu(I_n), \qquad \mu^*(F) = \lim_{n \to \infty} \mu(J_n) \]
so
\[ \mu^*(E) + \mu^*(F) = \lim_{n \to \infty} \big( \mu(I_n) + \mu(J_n) \big) = \lim_{n \to \infty} \big( \mu(I_n \cup J_n) + \mu(I_n \cap J_n) \big). \]
But
\[ \bigcap_{n \ge 1} (I_n \cap J_n) = E \cap F = \emptyset, \qquad \bigcap_{n \ge 1} (I_n \cup J_n) = E \cup F \]
so by the same claim
\[ \lim_{n \to \infty} \mu(I_n \cap J_n) = 0, \qquad \lim_{n \to \infty} \mu(I_n \cup J_n) = \mu^*(E \cup F) \]
which finishes the proof.
Remark. We prove that every set in $\mathcal{B}^*$ is a disjoint union $E = F \cup N$ where $F \in \sigma(\mathcal{B})$ and $N$ is $\mu^*$-null.
Definition (completion). We say that $\mathcal{B}^*$ is the completion of $\sigma(\mathcal{B})$ with respect to $\mu$.
Example. $\mathcal{L}$ is the completion of $\mathcal{B}(\mathbb{R}^d)$ in $\mathbb{R}^d$.
3 Integration and measurable functions
Definition (measurable function). Let $(X, \mathcal{A})$ be a measurable space. A function $f : X \to \mathbb{R}$ is called measurable or $\mathcal{A}$-measurable if for all $t \in \mathbb{R}$,
\[ \{x \in X : f(x) < t\} \in \mathcal{A}. \]
Remark. The $\sigma$-algebra generated by intervals $(-\infty, t)$ where $t \in \mathbb{R}$ is the Borel $\sigma$-algebra of $\mathbb{R}$, denoted $\mathcal{B}(\mathbb{R})$. Thus for every measurable function $f : X \to \mathbb{R}$, the preimage $f^{-1}(B) \in \mathcal{A}$ for all $B \in \mathcal{B}(\mathbb{R})$. However, it is not true in general that $f^{-1}(L) \in \mathcal{A}$ for every $L \in \mathcal{L}$.
Remark. If $f$ is allowed to take the values $+\infty$ and $-\infty$ we will say that $f$ is measurable if additionally $f^{-1}(\{+\infty\}) \in \mathcal{A}$ and $f^{-1}(\{-\infty\}) \in \mathcal{A}$.
More generally,
Definition (measurable map). Suppose $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ are measurable spaces. A map $f : X \to Y$ is measurable if for all $B \in \mathcal{B}$, $f^{-1}(B) \in \mathcal{A}$.
Proposition 3.1.
1. The composition of measurable maps is measurable.
2. If $f, g : (X, \mathcal{A}) \to \mathbb{R}$ are measurable functions then $f + g$, $fg$ and $\lambda f$ for $\lambda \in \mathbb{R}$ are also measurable.
3. If $(f_n)_{n \ge 1}$ is a sequence of measurable functions on $(X, \mathcal{A})$ then so are $\sup_n f_n$, $\inf_n f_n$, $\limsup_n f_n$ and $\liminf_n f_n$.
Proof.
1. Obvious.
2. Follows from 1 once it's shown that $+ : \mathbb{R}^2 \to \mathbb{R}$ and $\times : \mathbb{R}^2 \to \mathbb{R}$ are measurable (with respect to Borel sets). The sets
\[ \{(x, y) : x + y < t\}, \qquad \{(x, y) : xy < t\} \]
are open in $\mathbb{R}^2$ and hence Borel.
3. $\inf_n f_n(x) < t$ if and only if
\[ x \in \bigcup_n \{x : f_n(x) < t\} \]
and similarly for $\sup$. Similarly $\limsup_n f_n(x) < t$ if and only if
\[ x \in \bigcup_{m \ge 1} \bigcup_{k \ge 1} \bigcap_{n \ge k} \Big\{x : f_n(x) < t - \frac{1}{m}\Big\}. \]
Proposition 3.2. $f = (f_1, \dots, f_d) : (X, \mathcal{A}) \to (\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ where $d \ge 1$ is measurable if and only if each $f_i : X \to \mathbb{R}$ is measurable.
Proof. One direction is easy: suppose $f$ is measurable then
\[ \{x : f_i(x) < t\} = f^{-1}(\{y \in \mathbb{R}^d : y_i < t\}), \]
which is the preimage of an open set, so $f_i$ is measurable.
Conversely, suppose each $f_i$ is measurable. Then
\[ f^{-1}\Big( \prod_{i=1}^d [a_i, b_i] \Big) = \bigcap_{i=1}^d \{x : a_i \le f_i(x) \le b_i\} \in \mathcal{A}. \]
As the boxes generate the Borel sets, done.
Example.
1. Let $(X, \mathcal{A})$ be a measurable space and $E \subseteq X$. Then $E \in \mathcal{A}$ if and only if $1_E$, the indicator function of $E$, is $\mathcal{A}$-measurable.
2. Suppose $X = \bigsqcup_{i=1}^N X_i$ and $\mathcal{A}$ is the Boolean algebra generated by the $X_i$'s. A function $f : (X, \mathcal{A}) \to \mathbb{R}$ is measurable if and only if $f$ is constant on each $X_i$. In this case the vector space of measurable functions has dimension $N$.
3. Every continuous function $f : \mathbb{R}^d \to \mathbb{R}$ is measurable.
Definition (Borel measurable). If $X$ is a topological space, $f : X \to \mathbb{R}$ is Borel or Borel measurable if it is $\mathcal{B}(X)$-measurable.
Definition (simple function). A function $f$ on $(X, \mathcal{A})$ is called simple if
\[ f = \sum_{i=1}^N a_i 1_{A_i} \]
for some $a_i \ge 0$ and $A_i \in \mathcal{A}$.
Of course simple functions are measurable.
Lemma 3.3. If a simple function can be written in two ways
\[ f = \sum_{i=1}^N a_i 1_{A_i} = \sum_{j=1}^M b_j 1_{B_j} \]
then
\[ \sum_{i=1}^N a_i \mu(A_i) = \sum_{j=1}^M b_j \mu(B_j) \]
for any measure $\mu$ on $(X, \mathcal{A})$.
Proof. Example sheet 1.
Definition (integral of a simple function with respect to a measure). The $\mu$-integral of $f$ is defined by
\[ \mu(f) := \sum_{i=1}^N a_i \mu(A_i). \]
Remark.
1. The lemma says that the integral is well-defined.
2. We also use the notation $\int_X f \, d\mu$ to denote $\mu(f)$.
Proposition 3.4. The $\mu$-integral satisfies, for all simple functions $f$ and $g$,
1. linearity: for all $\alpha, \beta \ge 0$, $\mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g)$.
2. positivity: if $g \le f$ then $\mu(g) \le \mu(f)$.
3. if $\mu(f) = 0$ then $f = 0$ $\mu$-almost everywhere, i.e. $\{x \in X : f(x) \ne 0\}$ is a $\mu$-null set.
Proof. Obvious from definition and lemma.
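Well-definedness and linearity are easy to test on a finite measure space. The sketch below represents a simple function as a list of pairs $(a_i, A_i)$; the point masses `w`, the sets and the helper names `integral` and `evaluate` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction


def integral(terms, mu):
    """mu-integral of f = sum a_i 1_{A_i}, given terms = [(a_i, A_i), ...]."""
    return sum((a * mu(A) for a, A in terms), Fraction(0))


def evaluate(terms, x):
    """Value f(x) of the simple function described by `terms`."""
    return sum((a for a, A in terms if x in A), Fraction(0))


# Hypothetical measure on X = {0,...,5} given by point masses w[x].
w = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2),
     3: Fraction(0), 4: Fraction(1, 3), 5: Fraction(5)}


def mu(A):
    return sum((w[x] for x in A), Fraction(0))


# Two different representations of the same simple function f.
f1 = [(Fraction(2), {0, 1, 2}), (Fraction(3), {2, 3})]
f2 = [(Fraction(2), {0, 1}), (Fraction(5), {2}), (Fraction(3), {3})]
assert all(evaluate(f1, x) == evaluate(f2, x) for x in w)

# Lemma 3.3: the integral does not depend on the representation.
assert integral(f1, mu) == integral(f2, mu)

# Linearity: mu(alpha f + beta g) = alpha mu(f) + beta mu(g).
g = [(Fraction(1), {4, 5})]
alpha, beta = Fraction(2), Fraction(3)
combo = [(alpha * a, A) for a, A in f1] + [(beta * b, B) for b, B in g]
assert integral(combo, mu) == alpha * integral(f1, mu) + beta * integral(g, mu)
```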
Definition. If $f \ge 0$ is measurable on $(X, \mathcal{A})$, define
\[ \mu(f) = \sup\{\mu(g) : g \text{ simple},\ g \le f\} \in [0, +\infty]. \]
Remark. This is consistent with the definition for $f$ simple, due to positivity.
Definition (integrable). If $f$ is an arbitrary measurable function on $(X, \mathcal{A})$ we say $f$ is $\mu$-integrable if
\[ \mu(|f|) < \infty. \]
Definition (integral with respect to a measure). If $f$ is $\mu$-integrable, then we define its $\mu$-integral by
\[ \mu(f) = \mu(f^+) - \mu(f^-) \]
where $f^+ = \max\{0, f\}$ and $f^- = (-f)^+$.
Note.
\[ |f| = f^+ + f^-, \qquad f = f^+ - f^-. \]
Theorem 3.5 (monotone convergence theorem). Let $(f_n)_{n \ge 1}$ be a sequence of measurable functions on a measure space $(X, \mathcal{A}, \mu)$ such that
\[ 0 \le f_1 \le f_2 \le \cdots \le f_n \le \dots \]
Let $f = \lim_{n \to \infty} f_n$. Then
\[ \mu(f) = \lim_{n \to \infty} \mu(f_n). \]
Lemma 3.6. If $g$ is a simple function on $(X, \mathcal{A}, \mu)$, the map
\[ m_g : \mathcal{A} \to [0, \infty], \qquad E \mapsto \mu(1_E g) \]
is a measure on $(X, \mathcal{A})$.
Proof. Write $g = \sum_{i=1}^r a_i 1_{A_i}$ so $g 1_E = \sum_{i=1}^r a_i 1_{A_i \cap E}$ so
\[ \mu(1_E g) = \sum_{i=1}^r a_i \mu(A_i \cap E). \]
By a question on the example sheet this is well-defined. Then $\sigma$-additivity follows immediately from $\sigma$-additivity of $\mu$.
Proof of monotone convergence theorem. $f_n \le f_{n+1} \le f$ by assumption so
\[ \mu(f_n) \le \mu(f_{n+1}) \le \mu(f) \]
by definition of the integral, so
\[ \lim_{n \to \infty} \mu(f_n) \le \mu(f), \]
although the RHS may be infinite.
Let $g$ be any simple function with $g \le f$. We need to show that $\mu(g) \le \lim_{n \to \infty} \mu(f_n)$. Pick $\varepsilon > 0$. Let
\[ E_n = \{x \in X : f_n(x) \ge (1 - \varepsilon) g(x)\}. \]
Then $X = \bigcup_{n \ge 1} E_n$ and $E_n \subseteq E_{n+1}$. So we may apply upward monotone convergence for sets to the measure $m_g$ and get
\[ \lim_{n \to \infty} m_g(E_n) = m_g(X) = \mu(g 1_X) = \mu(g). \]
But
\[ (1 - \varepsilon) m_g(E_n) = \mu((1 - \varepsilon) g 1_{E_n}) \le \mu(f_n) \]
because $(1 - \varepsilon) g 1_{E_n}$ is a simple function smaller than $f_n$. Taking the limit,
\[ (1 - \varepsilon) \mu(g) \le \lim_{n \to \infty} \mu(f_n) \]
which holds for all $\varepsilon$. So
\[ \mu(g) \le \lim_{n \to \infty} \mu(f_n). \]
Lemma 3.7. If $f \ge 0$ is a measurable function on $(X, \mathcal{A})$ then there is a sequence of simple functions $(g_n)_{n \ge 1}$
\[ 0 \le g_n \le g_{n+1} \le f \]
such that for all $x \in X$, $g_n(x) \uparrow f(x)$.
Notation. $f_n \uparrow f$ means that $\lim_{n \to \infty} f_n(x) = f(x)$ and $f_{n+1} \ge f_n$.
Proof. We can take
\[ g_n = \frac{1}{2^n} \lfloor 2^n \min\{f, n\} \rfloor \]
pointwise. Check that $\lfloor 2y \rfloor \ge 2 \lfloor y \rfloor$ for all $y \ge 0$.
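The dyadic approximation in the proof can be evaluated pointwise. The sketch below checks monotonicity and the error bound $f(x) - g_n(x) < 2^{-n}$ (valid once $n \ge f(x)$) at a few sample points; the choice $f(x) = x^2$ and the sample points are illustrative assumptions.

```python
import math
from fractions import Fraction


def g(n, f, x):
    """g_n(x) = 2^{-n} * floor(2^n * min(f(x), n)), computed exactly."""
    return Fraction(math.floor(2 ** n * min(f(x), n)), 2 ** n)


def f(x):
    """Hypothetical nonnegative measurable function."""
    return x * x


samples = [Fraction(0), Fraction(3, 10), Fraction(3, 2), Fraction(16, 5)]

for x in samples:
    values = [g(n, f, x) for n in range(1, 31)]
    # 0 <= g_n <= g_{n+1} <= f at every sample point.
    assert all(0 <= a <= b <= f(x) for a, b in zip(values, values[1:]))
    # Once n >= f(x) the truncation at n is inactive and the error is < 2^{-n}.
    for n in range(1, 31):
        if n >= f(x):
            assert f(x) - g(n, f, x) < Fraction(1, 2 ** n)
```

Each $g_n$ takes finitely many values, all multiples of $2^{-n}$ in $[0, n]$, which is why it is a simple function.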
Proposition 3.8. Basic properties of the integral (for positive functions): suppose $f, g \ge 0$ are measurable on $(X, \mathcal{A}, \mu)$.
1. linearity: for all $\alpha, \beta \ge 0$, $\mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g)$.
2. positivity: if $0 \le f \le g$ then $\mu(f) \le \mu(g)$.
3. if $\mu(f) = 0$ then $f = 0$ $\mu$-almost everywhere.
4. if $f = g$ $\mu$-almost everywhere then $\mu(f) = \mu(g)$.
Proof.
1. Follows from the same property for simple functions and from Lemma 3.7 combined with the monotone convergence theorem.
2. Obvious from definition.
3. We have
\[ \{x \in X : f(x) \ne 0\} = \bigcup_{n \ge 1} \Big\{x \in X : f(x) > \frac{1}{n}\Big\}. \]
Set $g_n = \frac{1}{n} 1_{\{x \in X : f(x) > 1/n\}}$, which is simple, and $g_n \le f$ so by definition of the integral $\mu(g_n) \le \mu(f)$ so $\mu(g_n) = 0$, i.e. $\mu(\{x : f(x) > \frac{1}{n}\}) = 0$. A countable union of null sets is null.
4. Note that if $E \in \mathcal{A}$ and $\mu(E^c) = 0$ then
\[ \mu(h 1_E) = \mu(h) \]
for all $h$ simple. Thus it holds for all $h \ge 0$ measurable. Now take $E = \{x : f(x) = g(x)\}$, so that $\mu(f) = \mu(f 1_E) = \mu(g 1_E) = \mu(g)$.
Proposition 3.9 (linearity of integral). Suppose $f, g$ are $\mu$-integrable functions and $\alpha, \beta \in \mathbb{R}$. Then $\alpha f + \beta g$ is $\mu$-integrable and
\[ \mu(\alpha f + \beta g) = \alpha \mu(f) + \beta \mu(g). \]
Proof. We have shown the case when $\alpha, \beta \ge 0$ and $f, g \ge 0$. In the general case, use the positive and negative parts.
Lemma 3.10 (Fatou's lemma). Suppose $(f_n)_{n \ge 1}$ is a sequence of measurable functions on $(X, \mathcal{A}, \mu)$ such that $f_n \ge 0$ for all $n$. Then
\[ \mu(\liminf_{n \to \infty} f_n) \le \liminf_{n \to \infty} \mu(f_n). \]
Remark. We may not have equality: let $f_n = 1_{[n, n+1]}$ on $(\mathbb{R}, \mathcal{L}, m)$. Then $m(f_n) = 1$ for all $n$ but $\lim_{n \to \infty} f_n = 0$.
Proof. Let $g_n := \inf_{k \ge n} f_k$. Then $g_{n+1} \ge g_n \ge 0$ so by the monotone convergence theorem, $\mu(g_n) \uparrow \mu(g)$ as $n \to \infty$ where $g = \lim_{n \to \infty} g_n = \liminf_{n \to \infty} f_n$, and $g_n \le f_n$ so $\mu(g_n) \le \mu(f_n)$ for all $n$. Take $n \to \infty$,
\[ \mu(g) \le \liminf_{n \to \infty} \mu(f_n). \]
In both the monotone convergence theorem and Fatou's lemma we assumed that the sequence of functions is nonnegative. There is another version of the convergence theorem where we replace nonnegativity by domination:
Theorem 3.11 (Lebesgue's dominated convergence theorem). Let $(f_n)_{n \ge 1}$ be a sequence of measurable functions on $(X, \mathcal{A}, \mu)$ and $g$ a $\mu$-integrable function on $X$. Assume $|f_n| \le g$ for all $n$ (domination assumption) and assume for all $x \in X$, $\lim_{n \to \infty} f_n(x) = f(x)$. Then $f$ is $\mu$-integrable and
\[ \mu(f) = \lim_{n \to \infty} \mu(f_n). \]
This allows us to swap limit and integral.
Proof. $|f_n| \le g$ so $|f| \le g$ so $\mu(|f|) \le \mu(g) < \infty$ and $f$ is integrable. Note that $g + f_n \ge 0$ so by Fatou's lemma,
\[ \mu(\liminf_{n \to \infty} (g + f_n)) \le \liminf_{n \to \infty} \mu(g + f_n). \]
But $\liminf_{n \to \infty} (g + f_n) = g + f$ and by linearity $\mu(g + f_n) = \mu(g) + \mu(f_n)$, so
\[ \mu(g) + \mu(f) \le \mu(g) + \liminf_{n \to \infty} \mu(f_n), \]
i.e.
\[ \mu(f) \le \liminf_{n \to \infty} \mu(f_n). \]
Do the same with $g - f_n$ in place of $g + f_n$ to get
\[ \mu(-f) \le \liminf_{n \to \infty} \mu(-f_n) = -\limsup_{n \to \infty} \mu(f_n), \]
i.e. $\limsup_{n \to \infty} \mu(f_n) \le \mu(f)$, so
\[ \mu(f) = \lim_{n \to \infty} \mu(f_n). \]
Corollary 3.12 (exchanging integral and summation). Let $(X, \mathcal{A}, \mu)$ be a measure space and let $(f_n)_{n \ge 1}$ be a sequence of measurable functions on $X$.
1. If $f_n \ge 0$ then
\[ \mu\Big( \sum_{n \ge 1} f_n \Big) = \sum_{n \ge 1} \mu(f_n). \]
2. If $\sum_{n \ge 1} |f_n|$ is $\mu$-integrable then $\sum_{n \ge 1} f_n$ is $\mu$-integrable and
\[ \mu\Big( \sum_{n \ge 1} f_n \Big) = \sum_{n \ge 1} \mu(f_n). \]
Proof.
1. Let $g_N = \sum_{n=1}^N f_n$, then $g_N \uparrow \sum_{n \ge 1} f_n$ as $N \to \infty$ so the result follows from the monotone convergence theorem.
2. Let $g = \sum_{n \ge 1} |f_n|$ and $g_N$ as above. Then $|g_N| \le g$ for all $N$ so the domination assumption holds. The result thus follows from the dominated convergence theorem.
Corollary 3.13 (differentiation under integral sign). Let $(X, \mathcal{A}, \mu)$ be a measure space. Let $U \subseteq \mathbb{R}$ be an open set and let $f : U \times X \to \mathbb{R}$ be such that
1. $x \mapsto f(t, x)$ is $\mu$-integrable for all $t \in U$,
2. $t \mapsto f(t, x)$ is differentiable for all $x \in X$,
3. domination: there exists $g : X \to \mathbb{R}$ $\mu$-integrable such that for all $t \in U$, $x \in X$,
$$\Big|\frac{\partial f}{\partial t}(t, x)\Big| \le g(x).$$
Then $x \mapsto \frac{\partial f}{\partial t}(t, x)$ is $\mu$-integrable for all $t \in U$ and if we set $F(t) = \int_X f(t, x)\,d\mu$ then $F$ is differentiable and
$$F'(t) = \int_X \frac{\partial f}{\partial t}(t, x)\,d\mu.$$
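Proof. Pick $h_n > 0$, $h_n \to 0$ and define
$$g_n(t, x) := \frac{1}{h_n}\big(f(t + h_n, x) - f(t, x)\big).$$
Then
$$\lim_{n\to\infty} g_n(t, x) = \frac{\partial f}{\partial t}(t, x).$$
By mean value theorem, there exists $\theta_{t,n,x} \in [t, t + h_n]$ such that
$$g_n(t, x) = \frac{\partial f}{\partial t}(\theta_{t,n,x}, x)$$
so
$$|g_n(t, x)| \le g(x)$$
by domination assumption. Now apply dominated convergence theorem to $g_n(t, \cdot)$: the difference quotient $\frac{1}{h_n}(F(t + h_n) - F(t)) = \mu(g_n(t, \cdot))$ converges to $\mu(\frac{\partial f}{\partial t}(t, \cdot))$. The same argument applies to $h_n < 0$.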
Remark.
1. If $f : [a, b] \to \mathbb{R}$ is continuous where $a < b$ in $\mathbb{R}$, then $f$ is $m$-integrable (where $m$ is the Lebesgue measure) and $m(f) = \int_a^b f(x)\,dx$ is the Riemann integral. In general if $f$ is only assumed to be bounded, then $f$ is Riemann integrable if and only if the set of points of discontinuity of $f$ is an $m$-null set. See example sheet 2.
2. If $g \in \mathrm{GL}_d(\mathbb{R})$ and $f \ge 0$ is Borel measurable on $\mathbb{R}^d$, then
$$m(f \circ g) = \frac{1}{|\det g|}\, m(f).$$
See example sheet 2. In particular $m$ is invariant under linear transformations whose determinant has absolute value 1, e.g. rotations.
Remark. In each of monotone convergence theorem, Fatou's lemma and dominated convergence theorem, we can replace the pointwise assumptions by the corresponding $\mu$-almost everywhere ones, and the same conclusions hold. Indeed, let
$$E = \{x \in X : \text{assumptions hold at } x\}$$
so $E^c$ is a $\mu$-null set. Replace each $f_n$ (and similarly $f$, $g$ etc) by $1_E f_n$. The assumptions then hold everywhere, and the integrals are unchanged as $\mu(f 1_E) = \mu(f)$ for all $f$ measurable.
4 Product measures
Definition (product ๐-algebra). Let (๐, ๐) and (๐ , โฌ) be measurable spaces.The ๐-algebra of subsets of ๐ ร๐ generated by the product sets ๐ธ ร๐น where๐ธ โ ๐, ๐น โ โฌ is called the product ๐-algebra of ๐ and โฌ and is denoted by๐ โ ๐ต.
Remark.
1. By analogy with the notion of product topology, ๐ โ โฌ is the smallest๐-algebra of subsets of ๐ ร๐ making the two projection maps measurable.
2. โฌ(R๐1) โ โฌ(R๐2) = โฌ(R๐1+๐2). See example sheet. However this is not sofor โ(R๐).
Lemma 4.1. If ๐ธ โ ๐ ร ๐ is ๐ โ โฌ-measurable then for all ๐ฅ โ ๐, theslice
๐ธ๐ฅ = {๐ฆ โ ๐ โถ (๐ฅ, ๐ฆ) โ ๐ธ}
is in โฌ.
Proof. Letโฐ = {๐ธ โ ๐ ร ๐ โถ ๐ธ๐ฅ โ โฌ for all ๐ฅ โ ๐}.
Note that โฐ contains all product sets ๐ด ร ๐ต where ๐ด โ ๐, ๐ต โ โฌ. โฐ is a ๐-algebra: if ๐ธ โ โฐ then ๐ธ๐ โ โฐ and if ๐ธ๐ โ โฐ then โ ๐ธ๐ โ โฐ since (๐ธ๐)๐ฅ = (๐ธ๐ฅ)๐
and (โ ๐ธ๐)๐ฅ = โ(๐ธ๐)๐ฅ.
Lemma 4.2. Assume $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are $\sigma$-finite measure spaces. Let $f : X \times Y \to [0, +\infty]$ be $\mathcal{A} \otimes \mathcal{B}$-measurable. Then
1. for all $x \in X$, the function $y \mapsto f(x, y)$ is $\mathcal{B}$-measurable,
2. the map $x \mapsto \int_Y f(x, y)\,d\nu(y)$ is $\mathcal{A}$-measurable.
Proof.
1. In case $f = 1_E$ for $E \in \mathcal{A} \otimes \mathcal{B}$ the function $y \mapsto f(x, y)$ is just $y \mapsto 1_{E_x}(y)$, which is measurable by the previous lemma. More generally, the result is true for simple functions and thus for all measurable functions by taking pointwise limits.
2. By the same reduction we may assume $f = 1_E$ for some $E \in \mathcal{A} \otimes \mathcal{B}$. Now let $Y = \bigcup_{n\ge1} Y_n$ with $\nu(Y_n) < \infty$. Let
$$\mathcal{E} = \{E \in \mathcal{A} \otimes \mathcal{B} : x \mapsto \nu(E_x \cap Y_n) \text{ is } \mathcal{A}\text{-measurable for all } n\}.$$
$\mathcal{E}$ contains all product sets $E = A \times B$ where $A \in \mathcal{A}$, $B \in \mathcal{B}$ because $\nu(E_x \cap Y_n) = 1_{x \in A}\,\nu(B \cap Y_n)$. $\mathcal{E}$ is stable under complementation:
$$\nu((E^c)_x \cap Y_n) = \nu(Y_n) - \nu(Y_n \cap E_x)$$
where the right-hand side is $\mathcal{A}$-measurable, hence so is the left-hand side. $\mathcal{E}$ is stable under disjoint countable union: let $E = \bigcup_{i\ge1} E_i$ where $E_i \in \mathcal{E}$ disjoint. Then by $\sigma$-additivity
$$\nu(E_x \cap Y_n) = \sum_{i\ge1} \nu((E_i)_x \cap Y_n)$$
which is $\mathcal{A}$-measurable.
The product sets form a $\pi$-system and generate the product $\sigma$-algebra so by Dynkin lemma $\mathcal{E} = \mathcal{A} \otimes \mathcal{B}$.
Definition (product measure). Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be measure spaces and $\mu$, $\nu$ $\sigma$-finite. Then there exists a unique product measure, denoted by $\mu \otimes \nu$, on $\mathcal{A} \otimes \mathcal{B}$ such that for all $A \in \mathcal{A}$, $B \in \mathcal{B}$,
$$(\mu \otimes \nu)(A \times B) = \mu(A)\nu(B).$$
Proof. Uniqueness follows from Dynkin lemma. For existence, set
$$\sigma(E) = \int_X \nu(E_x)\,d\mu(x).$$
$\sigma$ is well-defined because $x \mapsto \nu(E_x)$ is $\mathcal{A}$-measurable by Lemma 4.2. $\sigma$ is countably additive: suppose $E = \bigcup_{n\ge1} E_n$ where $E_n \in \mathcal{A} \otimes \mathcal{B}$ disjoint, then
$$\sigma(E) = \int_X \nu(E_x)\,d\mu(x) = \int_X \sum_{n\ge1} \nu((E_n)_x)\,d\mu(x) = \sum_{n\ge1} \int_X \nu((E_n)_x)\,d\mu(x) = \sum_{n\ge1} \sigma(E_n)$$
by Corollary 3.12, a corollary of MCT.
Theorem 4.3 (Tonelli-Fubini). Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be $\sigma$-finite measure spaces.
1. Let $f : X \times Y \to [0, +\infty]$ be $\mathcal{A} \otimes \mathcal{B}$-measurable. Then
$$\int_{X \times Y} f(x, y)\,d(\mu \otimes \nu) = \int_X \int_Y f(x, y)\,d\nu(y)\,d\mu(x) = \int_Y \int_X f(x, y)\,d\mu(x)\,d\nu(y).$$
2. If $f : X \times Y \to \mathbb{R}$ is $\mu \otimes \nu$-integrable then for $\mu$-almost every $x$, $y \mapsto f(x, y)$ is $\nu$-integrable, and for $\nu$-almost every $y$, $x \mapsto f(x, y)$ is $\mu$-integrable and
$$\int_{X \times Y} f(x, y)\,d(\mu \otimes \nu) = \int_X \int_Y f(x, y)\,d\nu(y)\,d\mu(x) = \int_Y \int_X f(x, y)\,d\mu(x)\,d\nu(y).$$
Without the nonnegativity or integrability assumption, the result is false in general. For example for $X = Y = \mathbb{N}$, let $\mathcal{A} = \mathcal{B}$ be discrete $\sigma$-algebras and $\mu = \nu$ counting measure. Let $f(n, m) = 1_{n=m} - 1_{n=m+1}$. Check that
$$\sum_{n\ge1} f(n, m) = 0 \quad \text{for all } m,$$
$$\sum_{m\ge1} f(n, m) = \begin{cases} 0 & n \ge 2 \\ 1 & n = 1 \end{cases}$$
so
$$\sum_{m\ge1} \sum_{n\ge1} f(n, m) = 0 \ne 1 = \sum_{n\ge1} \sum_{m\ge1} f(n, m).$$
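The counterexample can be checked by direct computation. The sketch below is illustrative only: the helper names and the truncation level `N` are assumptions, not part of the notes. The inner sums are exact because, for a fixed index, `f` is nonzero at no more than two points.

```python
def f(n: int, m: int) -> int:
    """f(n, m) = 1_{n = m} - 1_{n = m + 1} on N x N (indices start at 1)."""
    return int(n == m) - int(n == m + 1)


def sum_over_m(n: int) -> int:
    # For fixed n, f(n, m) is nonzero only for m in {n - 1, n},
    # so summing over m = 1..n is exact.
    return sum(f(n, m) for m in range(1, n + 1))


def sum_over_n(m: int) -> int:
    # For fixed m, f(n, m) is nonzero only for n in {m, m + 1}.
    return sum(f(n, m) for n in range(1, m + 2))


N = 50  # hypothetical truncation level for the outer sums
n_outer = sum(sum_over_m(n) for n in range(1, N + 1))  # sum over n of (sum over m)
m_outer = sum(sum_over_n(m) for m in range(1, N + 1))  # sum over m of (sum over n)
print(n_outer, m_outer)
```

The two iterated sums disagree for every truncation level, which is consistent with `f` being neither nonnegative nor integrable for the counting measure.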
Proof.
1. The result holds for $f = 1_E$ where $E \in \mathcal{A} \otimes \mathcal{B}$ by the definition of product measure and Lemma 4.2, so it holds for all simple functions. Now take limits and apply MCT.
2. Write $f = f^+ - f^-$ and apply 1.
Note.
1. The Lebesgue measure $m_d$ on $\mathbb{R}^d$ is equal to $m_1 \otimes \cdots \otimes m_1$, because this is true on boxes and extends by uniqueness of measure.
2. $E \in \mathcal{A} \otimes \mathcal{B}$ is $\mu \otimes \nu$-null if and only if for $\mu$-almost every $x$, $\nu(E_x) = 0$.
5 Foundations of probability theory
Modern probability theory was founded by Kolmogorov, who formulated the axioms of probability theory in 1933 in his monograph Foundations of the Theory of Probability. He defined a probability space to be a measure space $(\Omega, \mathcal{F}, \mathbb{P})$. The interpretation is as follows: $\Omega$ is the universe of possible outcomes. However, we wouldn't be able to assign probability to every single outcome unless the space is discrete. Instead we are interested in studying some subsets of $\Omega$, which are called events and are the elements of $\mathcal{F}$. Finally $\mathbb{P}$ is a probability measure with $\mathbb{P}(\Omega) = 1$. Thus for $A \in \mathcal{F}$, $\mathbb{P}(A \text{ occurs}) \in [0, 1]$. Finite additivity of $\mathbb{P}$ says that if $A$ and $B$ never occur simultaneously then $\mathbb{P}(A \text{ or } B) = \mathbb{P}(A) + \mathbb{P}(B)$. $\sigma$-additivity is slightly more difficult to justify, and it is perhaps easier to see through the equivalent notion of continuity: if $A_{n+1} \subseteq A_n$ and $\bigcap_{n\ge1} A_n = \emptyset$ then $\mathbb{P}(A_n) \to 0$ as $n \to \infty$.
Definition (probability measure, probability space). Let ฮฉ be a set and โฑa ๐-algebra on ฮฉ. A measure ๐ on (ฮฉ, โฑ) is called a probability measure if๐(ฮฉ) = 1 and the measure space (ฮฉ, โฑ, ๐) is called a probability space.
Definition (random variable). A measurable function ๐ โถ ฮฉ โ R is calleda random variable.
We usually use a capital letter to denote a random variable.
Definition (expectation). If (ฮฉ, โฑ,P) is a probability space then the P-integral is called expectation, denoted E.
Definition (distribution/law). A random variable ๐ โถ ฮฉ โ R on a proba-bility space (ฮฉ, โฑ,P) determines a Borel measure ๐๐ on R defined by
๐๐((โโ, ๐ก]) = P(๐ โค ๐ก) = P({๐ โ ฮฉ โถ ๐(๐) โค ๐ก})
and ๐๐ is called the distribution of ๐, or the law of ๐.
Note. ๐๐ is the image of P under
ฮฉ โ R๐ โฆ ๐(๐)
Definition (distribution function). The function
๐น๐ โถ R โ [0, 1]๐ก โฆ P(๐ โค ๐ก)
is called the distribution function of ๐.
Proposition 5.1. If (ฮฉ, โฑ,P) is a probability space and ๐ โถ ฮฉ โ R is a ran-dom variable then ๐น๐ is non-decreasing, right-continuous and it determines๐๐ uniquely.
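Proof. Monotonicity is clear since $\{X \le s\} \subseteq \{X \le t\}$ for $s \le t$. Given $t_n \downarrow t$,
$$F_X(t_n) = \mathbb{P}(X \le t_n) \to \mathbb{P}\Big(\bigcap_{n\ge1} \{X \le t_n\}\Big) = \mathbb{P}(\{X \le t\}) = F_X(t)$$
by downward monotone convergence for sets. Uniqueness follows from Dynkin lemma applied to the $\pi$-system $\{\emptyset\} \cup \{(-\infty, t]\}_{t \in \mathbb{R}}$.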
Conversely,
Proposition 5.2. If $F : \mathbb{R} \to [0, 1]$ is a non-decreasing right-continuous function with
$$\lim_{t\to-\infty} F(t) = 0, \qquad \lim_{t\to+\infty} F(t) = 1$$
then there exists a unique probability measure $\mu$ on $\mathbb{R}$ such that
$$F(t) = \mu((-\infty, t])$$
for all $t \in \mathbb{R}$.
Remark. The measure $\mu$ is called the Lebesgue-Stieltjes measure on $\mathbb{R}$ associated to $F$. Furthermore for all $a \le b$ in $\mathbb{R}$,
$$\mu((a, b]) = F(b) - F(a).$$
We can also construct Lebesgue measure this way.
Proof. Uniqueness is the same as above. For existence, we use the lemma
Lemma 5.3. Let
$$g : (0, 1) \to \mathbb{R}, \quad y \mapsto \inf\{x \in \mathbb{R} : F(x) \ge y\}$$
then $g$ is non-decreasing, left-continuous and for all $x \in \mathbb{R}$, $y \in (0, 1)$, $g(y) \le x$ if and only if $F(x) \ge y$.
Proof. Let
$$I_y = \{x \in \mathbb{R} : F(x) \ge y\}.$$
Clearly if $y_1 \ge y_2$ then $I_{y_1} \subseteq I_{y_2}$ so $g(y_2) \le g(y_1)$ so $g$ is non-decreasing. $I_y$ is an interval of $\mathbb{R}$ because if $x > x_1$ and $x_1 \in I_y$ then $F(x) \ge F(x_1) \ge y$ so $x \in I_y$. So $I_y$ is an interval with endpoints $g(y)$ and $+\infty$. But $F$ is right-continuous so $g(y) = \min I_y$ and the minimum is attained. Thus $I_y = [g(y), +\infty)$.
This means that $x \ge g(y)$ if and only if $x \in I_y$ if and only if $F(x) \ge y$. Finally for left-continuity, suppose $y_n \uparrow y$ then $\bigcap_{n\ge1} I_{y_n} = I_y$ by definition of $I_y$ so $g(y_n) \to g(y)$.
Remark. If ๐น is continuous and strictly increasing then ๐ = ๐น โ1.
Now back to the proposition. Set $\mu = g_* m$ where $m$ is the Lebesgue measure on $(0, 1)$. $\mu$ is a probability measure as $g$ is Borel-measurable. By the lemma, $a < g(y) \le b$ if and only if $F(a) < y \le F(b)$, so
$$\mu((a, b]) = m(g^{-1}((a, b])) = m((F(a), F(b)]) = F(b) - F(a).$$
Proposition 5.4. If ๐ is a Borel probability measure on R then there existssome probability space (ฮฉ, โฑ,P) and a random variable ๐ on ฮฉ such that๐๐ = ๐.
In fact, one can even pick ฮฉ = (0, 1), โฑ = โฌ(0, 1) and P = ๐, theLebesgue measure.
Proof. For the first claim set $\Omega = \mathbb{R}$, $\mathcal{F} = \mathcal{B}(\mathbb{R})$, $\mathbb{P} = \mu$ and $X(x) = x$.
For the second claim, set $F(t) = \mu((-\infty, t])$ and take $X = g$ where $g$ is the auxiliary function defined in the previous lemma, namely
$$X(\omega) = \inf\{x : F(x) \ge \omega\}.$$
Check that $\mu_X = \mu$:
$$\mu_X((a, b]) = \mathbb{P}(X \in (a, b]) = m(\{\omega \in (0, 1) : a < X(\omega) \le b\}) = m(\{\omega \in (0, 1) : F(a) < \omega \le F(b)\}) = F(b) - F(a) = \mu((a, b]).$$
Remark. If $\mu$ is a Borel probability measure on $\mathbb{R}$ such that $\mu = f\,dt$ for some $f \ge 0$ measurable, we say that $\mu$ has a density (with respect to Lebesgue measure) and $f$ is called the density of $\mu$. Here $\mu = f\,dt$ means that $\mu((a, b]) = \int_a^b f(t)\,dt$.
Example.
1. uniform distribution on $[0, 1]$:
$$f(t) = 1_{[0,1]}(t), \qquad F(t) = m((-\infty, t] \cap [0, 1])$$
2. exponential distribution of rate $\lambda$:
$$f_\lambda(t) = \lambda e^{-\lambda t} 1_{t \ge 0}, \qquad F_\lambda(t) = \int_{-\infty}^t f_\lambda(s)\,ds = 1_{t \ge 0}(1 - e^{-\lambda t})$$
3. Gaussian distribution with standard deviation $\sigma$ and mean $m$:
$$f_{\sigma,m}(t) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big(-\frac{(t - m)^2}{2\sigma^2}\Big), \qquad F_{\sigma,m}(t) = \int_{-\infty}^t \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big(-\frac{(s - m)^2}{2\sigma^2}\Big)\,ds$$
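The construction $X(\omega) = \inf\{x : F(x) \ge \omega\}$ on $((0,1), \mathcal{B}(0,1), m)$ can be tried out numerically for the exponential distribution, where the generalised inverse has a closed form. This is a minimal sketch: the function names, the rate `lam = 2.0`, the seed and the sample size are illustrative assumptions, and a pseudo-random generator stands in for Lebesgue measure on $(0,1)$.

```python
import math
import random


def F(t: float, lam: float) -> float:
    """Distribution function of the exponential law of rate lam."""
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0


def g(y: float, lam: float) -> float:
    """Generalised inverse g(y) = inf{x : F(x) >= y}, for y in [0, 1)."""
    return -math.log(1.0 - y) / lam


def sample(n: int, lam: float, seed: int = 0) -> list:
    """Draw n values X = g(U) with U (pseudo-)uniform on [0, 1)."""
    rng = random.Random(seed)
    return [g(rng.random(), lam) for _ in range(n)]


lam = 2.0  # assumed rate, chosen only for illustration
xs = sample(100_000, lam)
empirical_mean = sum(xs) / len(xs)                        # compare with 1 / lam
empirical_cdf_at_1 = sum(x <= 1.0 for x in xs) / len(xs)  # compare with F(1, lam)
print(empirical_mean, empirical_cdf_at_1, F(1.0, lam))
```

The empirical mean and the empirical distribution function should be close to $1/\lambda$ and $F_\lambda$ respectively, up to sampling error.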
Definition (mean, moment, variance). If ๐ is a random variable then
1. E(๐) is called the mean,
2. E(๐๐) is called the ๐th-moment of ๐,
3. Var(๐) = E((๐ โ E๐)2) = E(๐2) โ E(๐)2 is called the variance.
Remark. Suppose $f \ge 0$ is measurable and $X$ is a random variable. Then
$$\mathbb{E}(f(X)) = \int_{\mathbb{R}} f(x)\,d\mu_X(x)$$
by definition of $\mu_X = X_*\mathbb{P}$.
6 Independence
Independence is the key notion that makes probability theory distinct from (abstract) measure theory.
Definition (independence). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A sequence of events $(A_n)_{n\ge1}$ is called independent or mutually independent if for all $F \subseteq \mathbb{N}$ finite,
$$\mathbb{P}\Big(\bigcap_{i \in F} A_i\Big) = \prod_{i \in F} \mathbb{P}(A_i).$$
Definition (independent ๐-algebra). A sequence of ๐-algebras (๐๐)๐โฅ1where ๐๐ โ โฑ is called independent if for all ๐ด๐ โ ๐๐, the family (๐ด๐)๐โฅ1is independent.
Remark.
1. To prove that (๐๐)๐โฅ1 is an independent family, it is enough to checkthe independence condition for all ๐ด๐โs with ๐ด๐ โ ฮ ๐ where ฮ ๐ is a ๐-system generating ๐๐. The proof is an application of Dynkin lemma. Forexample for ๐-algebras ๐1, ๐2, suffices to check
P(๐ด1 โฉ ๐ด2) = P(๐ด1)P(๐ด2)
for all ๐ด1 โ ฮ 1, ๐ด2 โ ฮ 2. Fix ๐ด2 โ ฮ 2, look at the measures
๐ด1 โฆ P(๐ด1 โฉ ๐ด2)๐ด1 โฆ P(๐ด1)P(๐ด2)
on ๐1. They coincide on ฮ 1 by assumption and hence everywhere on ๐1.Subsequently consider ๐2.
Notation. Suppose ๐ is a random variable. Denote by ๐(๐) the smallest๐-subalgebra ๐ of โฑ such that ๐ is ๐-measurable, i.e.
๐(๐) = ๐({๐ โ ฮฉ โถ ๐(๐) โค ๐ก}๐กโR).
Definition (independence). A sequence of random variables (๐๐)๐โฅ1 iscalled independent if the sequence of ๐-subalgebras (๐(๐๐))๐โฅ1 is indepen-dent.
Remark. This is equivalent to the condition that for all $(t_n)_{n\ge1}$, for all $n$,
$$\mathbb{P}((X_1 \le t_1) \cap \cdots \cap (X_n \le t_n)) = \prod_{i=1}^n \mathbb{P}(X_i \le t_i).$$
Yet another equivalent formulation is
$$\mu_{(X_1, \dots, X_n)} = \bigotimes_{i=1}^n \mu_{X_i}$$
as Borel probability measures on $\mathbb{R}^n$. "The joint law is the same as the product of individual laws".
Note. Independence is a property of a family, so pairwise independence is necessary but not sufficient for independence. A famous counterexample is Bernstein's example: take $X$ and $Y$ to be random variables for two independent fair coin flips. Set $Z = |X - Y|$. Then $Z = 0$ if and only if $X = Y$. Check that
$$\mathbb{P}(Z = 0) = \mathbb{P}(Z = 1) = \frac{1}{2}$$
and each pair $(X, Y)$, $(X, Z)$ and $(Y, Z)$ is independent. But $(X, Y, Z)$ is not independent.
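Bernstein's example is small enough to verify by enumerating the four equally likely outcomes with exact rational arithmetic. The sketch below is illustrative only; the helper names `prob` and `pair_independent` are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips (x, y), together with z = |x - y|.
outcomes = [(x, y, abs(x - y)) for x, y in product((0, 1), repeat=2)]
p = Fraction(1, 4)  # each of the four outcomes has probability 1/4


def prob(event) -> Fraction:
    """Probability of the set of outcomes w = (x, y, z) satisfying event(w)."""
    return sum((p for w in outcomes if event(w)), Fraction(0))


def pair_independent(i: int, j: int) -> bool:
    """Check P(W_i = a, W_j = b) = P(W_i = a) P(W_j = b) for all a, b in {0, 1}."""
    return all(
        prob(lambda w: w[i] == a and w[j] == b)
        == prob(lambda w: w[i] == a) * prob(lambda w: w[j] == b)
        for a, b in product((0, 1), repeat=2)
    )


pairwise = all(pair_independent(i, j) for i, j in [(0, 1), (0, 2), (1, 2)])

# Mutual independence would force P(X=0, Y=0, Z=1) = 1/8, but this event is empty.
joint = prob(lambda w: w == (0, 0, 1))
product_of_marginals = (
    prob(lambda w: w[0] == 0) * prob(lambda w: w[1] == 0) * prob(lambda w: w[2] == 1)
)
print(pairwise, joint, product_of_marginals)
```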
Proposition 6.1. If ๐ and ๐ are independent random variables, ๐ โฅ0, ๐ โฅ 0 then
E(๐๐ ) = E(๐)E(๐ ).
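Proof. Essentially Tonelli-Fubini:
$$\mathbb{E}(XY) = \int_{\mathbb{R}^2} xy\,d\mu_{(X,Y)}(x, y) = \int_{\mathbb{R}^2} xy\,d\mu_X(x)\,d\mu_Y(y) = \Big(\int_{\mathbb{R}} x\,d\mu_X(x)\Big)\Big(\int_{\mathbb{R}} y\,d\mu_Y(y)\Big) = \mathbb{E}(X)\mathbb{E}(Y),$$
where the second equality uses $\mu_{(X,Y)} = \mu_X \otimes \mu_Y$, which holds by independence.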
Remark. As in Tonelli-Fubini, we may require ๐๐ to be integrable instead andthe same conclusion holds.
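Example. Let $\Omega = (0, 1)$, $\mathcal{F} = \mathcal{B}(0, 1)$, $\mathbb{P} = m$ the Lebesgue measure. Write the decimal expansion of $\omega \in (0, 1)$ as
$$\omega = 0.\varepsilon_1 \varepsilon_2 \dots$$
where $\varepsilon_i(\omega) \in \{0, \dots, 9\}$. Choose a convention so that each $\omega$ has a well-defined expansion (to avoid things like $0.099\dots = 0.100\dots$). Now let $X_n(\omega) = \varepsilon_n(\omega)$. Claim that the $(X_n)_{n\ge1}$ are iid. random variables uniformly distributed on $\{0, \dots, 9\}$, where "iid." stands for independent and identically distributed.
Proof. Easy check. For example $X_1(\omega) = \lfloor 10\omega \rfloor$ so
$$\mathbb{P}(X_1 = \varepsilon_1) = \frac{1}{10}.$$
Similarly for all $n$
$$\mathbb{P}(X_1 = \varepsilon_1, \dots, X_n = \varepsilon_n) = \frac{1}{10^n}, \qquad \mathbb{P}(X_n = \varepsilon_n) = \frac{1}{10}$$
so
$$\mathbb{P}(X_1 = \varepsilon_1, \dots, X_n = \varepsilon_n) = \prod_{i=1}^n \mathbb{P}(X_i = \varepsilon_i).$$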
Remark.
$$\omega = \sum_{n\ge1} \frac{X_n(\omega)}{10^n}$$
is distributed according to Lebesgue measure so if we want we can construct Lebesgue measure as the law of this random variable.
Proposition 6.2 (infinite product of product measure). Let $(\Omega_i, \mathcal{F}_i, \mu_i)_{i\ge1}$ be a sequence of probability spaces, $\Omega = \prod_{i\ge1} \Omega_i$ and $\mathcal{E}$ be the Boolean algebra of cylinder sets, i.e. sets of the form
$$A \times \prod_{i>n} \Omega_i$$
for some $n$ and some $A \in \bigotimes_{i=1}^n \mathcal{F}_i$. Set $\mathcal{F} = \sigma(\mathcal{E})$, the infinite product $\sigma$-algebra. Then there is a unique probability measure $\mu$ on $(\Omega, \mathcal{F})$ such that it agrees with product measures on all cylinder sets, i.e.
$$\mu\Big(A \times \prod_{i>n} \Omega_i\Big) = \Big(\bigotimes_{i=1}^n \mu_i\Big)(A)$$
for all $n$ and all $A \in \bigotimes_{i=1}^n \mathcal{F}_i$.
Proof. Omitted. See example sheet 3.
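Lemma 6.3 (Borel-Cantelli). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $(A_n)_{n\ge1}$ a sequence of events.
1. If $\sum_{n\ge1} \mathbb{P}(A_n) < \infty$ then
$$\mathbb{P}(\limsup_n A_n) = 0.$$
2. Conversely, if $(A_n)_{n\ge1}$ are independent and $\sum_{n\ge1} \mathbb{P}(A_n) = \infty$ then
$$\mathbb{P}(\limsup_n A_n) = 1.$$
Note that $\limsup_n A_n = \bigcap_N \bigcup_{n\ge N} A_n$ is also called "$A_n$ io.", meaning "infinitely often".
Proof.
1. Let $Y = \sum_{n\ge1} 1_{A_n}$ be a random variable. Then
$$\mathbb{E}(Y) = \sum_{n\ge1} \mathbb{E}(1_{A_n}) = \sum_{n\ge1} \mathbb{P}(A_n).$$
Since $Y \ge 0$, recall that we proved that $\mathbb{E}(Y) < \infty$ implies that $Y < \infty$ almost surely, i.e. $\mathbb{P}$-almost everywhere. But $Y < \infty$ exactly when only finitely many $A_n$ occur.
2. Note that
$$(\limsup_n A_n)^c = \bigcup_N \bigcap_{n\ge N} A_n^c$$
so for $M \ge N$, using independence,
$$\mathbb{P}\Big(\bigcap_{n\ge N} A_n^c\Big) \le \mathbb{P}\Big(\bigcap_{n=N}^M A_n^c\Big) = \prod_{n=N}^M \mathbb{P}(A_n^c) = \prod_{n=N}^M (1 - \mathbb{P}(A_n)) \le \prod_{n=N}^M \exp(-\mathbb{P}(A_n)) = \exp\Big(-\sum_{n=N}^M \mathbb{P}(A_n)\Big) \to 0$$
as $M \to \infty$. Thus
$$\mathbb{P}\Big(\bigcap_{n\ge N} A_n^c\Big) = 0$$
for all $N$ so
$$\mathbb{P}\Big(\bigcup_N \bigcap_{n\ge N} A_n^c\Big) = 0.$$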
Definition (random/stochastic process, filtration, tail $\sigma$-algebra, tail event). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $(X_n)_{n\ge1}$ a sequence of random variables.
1. $(X_n)_{n\ge1}$ is sometimes called a random process or stochastic process.
2. $\mathcal{F}_n = \sigma(X_1, \dots, X_n) \subseteq \mathcal{F}$ is called the associated filtration. Note $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$.
3. $\mathcal{C} = \bigcap_{n\ge1} \sigma(X_n, X_{n+1}, \dots)$ is called the tail $\sigma$-algebra of the process. Its elements are called tail events.
Example. Tail events are those not affected by the first few terms in the se-quence of random variables. For example,
{๐ โ ฮฉ โถ lim๐
๐๐(๐) exists}
is a tail event, so is{๐ โ ฮฉ โถ lim sup
๐๐๐(๐) โฅ ๐ }.
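Theorem 6.4 (Kolmogorov 0-1 law). If $(X_n)_{n\ge1}$ is a sequence of mutually independent random variables then for all $A$ in the tail $\sigma$-algebra $\mathcal{C}$,
$$\mathbb{P}(A) \in \{0, 1\}.$$
Proof. Pick $A \in \mathcal{C}$. Fix $n$. For all $B \in \sigma(X_1, \dots, X_n)$,
$$\mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B)$$
as $\mathcal{C} \subseteq \sigma(X_{n+1}, X_{n+2}, \dots)$ is independent of $\sigma(X_1, \dots, X_n)$. The measures $B \mapsto \mathbb{P}(A)\mathbb{P}(B)$ and $B \mapsto \mathbb{P}(A \cap B)$ coincide on each $\mathcal{F}_n$ so on $\bigcup_{n\ge1} \mathcal{F}_n$, which is a $\pi$-system. Hence by Dynkin lemma they coincide on $\sigma(\bigcup_{n\ge1} \mathcal{F}_n) \supseteq \mathcal{C}$ so
$$\mathbb{P}(A) = \mathbb{P}(A \cap A) = \mathbb{P}(A)\mathbb{P}(A)$$
so $\mathbb{P}(A) \in \{0, 1\}$.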
6.1 Useful inequalities
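Proposition 6.5 (Cauchy-Schwarz). Suppose $X$, $Y$ are random variables. Then
$$\mathbb{E}(|XY|) \le \sqrt{\mathbb{E}(X^2) \cdot \mathbb{E}(Y^2)}.$$
Proof. For all $t \in \mathbb{R}$,
$$0 \le \mathbb{E}((|X| + t|Y|)^2) = \mathbb{E}(X^2) + 2t\,\mathbb{E}(|XY|) + t^2\,\mathbb{E}(Y^2)$$
so viewed as a quadratic in $t$, the discriminant is nonpositive, i.e.
$$(\mathbb{E}(|XY|))^2 - \mathbb{E}(X^2) \cdot \mathbb{E}(Y^2) \le 0.$$
Proposition 6.6 (Markov). Let $X \ge 0$ be a random variable. Then for all $t \ge 0$,
$$t\,\mathbb{P}(X \ge t) \le \mathbb{E}(X).$$
Proof.
$$\mathbb{E}(X) \ge \mathbb{E}(X 1_{X \ge t}) \ge \mathbb{E}(t 1_{X \ge t}) = t\,\mathbb{P}(X \ge t).$$
Proposition 6.7 (Chebyshev). Let $Y$ be a random variable with $\mathbb{E}(Y^2) < \infty$, then for all $t \ge 0$,
$$t^2\,\mathbb{P}(|Y - \mathbb{E}(Y)| \ge t) \le \operatorname{Var} Y.$$
$\mathbb{E}(Y^2) < \infty$ implies that $\mathbb{E}(|Y|) < \infty$ by Cauchy-Schwarz, so $\operatorname{Var} Y < \infty$. The converse is more subtle.
Proof. Apply Markov to $X = |Y - \mathbb{E}(Y)|^2$ with threshold $t^2$.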
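As a sanity check, Markov's and Chebyshev's inequalities can be verified exactly for a small discrete distribution. The sketch below assumes a fair six-sided die as the illustrative random variable; the helper names `E` and `P` are hypothetical and the thresholds are arbitrary.

```python
from fractions import Fraction

values = range(1, 7)  # outcomes of a fair die (illustrative choice)
p = Fraction(1, 6)    # probability of each outcome


def E(h) -> Fraction:
    """Expectation of h(X) for X uniform on {1, ..., 6}."""
    return sum((p * h(x) for x in values), Fraction(0))


def P(event) -> Fraction:
    """Probability that X satisfies the predicate `event`."""
    return sum((p for x in values if event(x)), Fraction(0))


mean = E(lambda x: x)               # 7/2
var = E(lambda x: (x - mean) ** 2)  # 35/12

# Markov: t * P(X >= t) <= E(X) for t >= 0.
markov_ok = all(t * P(lambda x: x >= t) <= mean for t in range(0, 8))

# Chebyshev: t^2 * P(|X - E X| >= t) <= Var X, checked on half-integer thresholds.
thresholds = [Fraction(k, 2) for k in range(0, 8)]
chebyshev_ok = all(
    t * t * P(lambda x: abs(x - mean) >= t) <= var for t in thresholds
)
print(mean, var, markov_ok, chebyshev_ok)
```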
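Theorem 6.8 (strong law of large numbers). Let $(X_n)_{n\ge1}$ be a sequence of iid. random variables. Assume $\mathbb{E}(|X_1|) < \infty$. Let
$$S_n = \sum_{k=1}^n X_k,$$
then $\frac{1}{n} S_n$ converges almost surely to $\mathbb{E}(X_1)$.
Proof. We prove the theorem under a stronger condition: we assume $\mathbb{E}(X_1^4) < \infty$. This implies, by Cauchy-Schwarz, $\mathbb{E}(X_1^2), \mathbb{E}(|X_1|) < \infty$. Subsequently $\mathbb{E}(|X_1|^3) < \infty$. The full proof is much harder but will be given later when we have developed enough machinery.
wlog we may assume $\mathbb{E}(X_1) = 0$ by replacing $X_n$ with $X_n - \mathbb{E}(X_1)$. Have
$$\mathbb{E}(S_n^4) = \sum_{i,j,k,\ell} \mathbb{E}(X_i X_j X_k X_\ell).$$
All terms vanish because $\mathbb{E}(X_i) = 0$ and $(X_i)_{i\ge1}$ are independent, except for $\mathbb{E}(X_i^4)$ and $\mathbb{E}(X_i^2 X_j^2)$ for $i \ne j$. For example,
$$\mathbb{E}(X_i X_j^3) = \mathbb{E}(X_i) \cdot \mathbb{E}(X_j^3) = 0$$
for $i \ne j$. Thus
$$\mathbb{E}(S_n^4) = \sum_{i=1}^n \mathbb{E}(X_i^4) + 6 \sum_{i<j} \mathbb{E}(X_i^2 X_j^2).$$
By Cauchy-Schwarz,
$$\mathbb{E}(X_i^2 X_j^2) \le \sqrt{\mathbb{E}X_i^4 \cdot \mathbb{E}X_j^4} = \mathbb{E}X_1^4$$
so
$$\mathbb{E}(S_n^4) \le \Big(n + 6 \cdot \frac{n(n-1)}{2}\Big)\mathbb{E}X_1^4$$
and asymptotically,
$$\mathbb{E}\Big(\frac{S_n}{n}\Big)^4 = O\Big(\frac{1}{n^2}\Big)$$
so
$$\mathbb{E}\Big(\sum_{n\ge1} \Big(\frac{S_n}{n}\Big)^4\Big) = \sum_{n\ge1} \mathbb{E}\Big(\frac{S_n}{n}\Big)^4 < \infty.$$
Hence $\sum_{n\ge1} (\frac{S_n}{n})^4 < \infty$ almost surely, so its terms tend to $0$ and it follows that
$$\lim_{n\to\infty} \frac{S_n}{n} = 0$$
almost surely.
Strong law of large numbers has a very important statistical implication: by averaging a large number of iid. samples we can recover features of an unknown law, at least its mean.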
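Two ingredients of this argument lend themselves to a quick computational check: the combinatorial count behind the coefficient 6 in the expansion of $\mathbb{E}(S_n^4)$, and the convergence of the sample mean itself. The sketch below is illustrative only: the function names, the uniform distribution on $[0,1)$, the seed and the sample size are all assumptions made for the example.

```python
import random
from collections import Counter
from itertools import product


def count_terms(n: int) -> tuple:
    """Count index tuples (i, j, k, l) in {0..n-1}^4 of type i=j=k=l and of type {a, a, b, b}."""
    fourth = paired = 0
    for idx in product(range(n), repeat=4):
        multiplicities = sorted(Counter(idx).values())
        if multiplicities == [4]:
            fourth += 1   # contributes E(X_i^4)
        elif multiplicities == [2, 2]:
            paired += 1   # contributes E(X_i^2 X_j^2) with i != j
        # every other pattern has an index of multiplicity 1, so its term vanishes
    return fourth, paired


def sample_mean(n: int, seed: int = 0) -> float:
    """Average of n pseudo-random draws from the uniform law on [0, 1)."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n)) / n


n = 5
fourth, paired = count_terms(n)
print(fourth, paired, 6 * n * (n - 1) // 2)  # paired count equals 6 * C(n, 2)
print(sample_mean(100_000))                  # compare with the true mean 1/2
```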
7 Convergence of random variables
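Definition (weak convergence). A sequence of probability measures $(\mu_n)_{n\ge1}$ on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ is said to converge weakly to a measure $\mu$ if for all $f \in C_b(\mathbb{R}^d)$, the set of continuous bounded functions on $\mathbb{R}^d$,
$$\lim_{n\to\infty} \mu_n(f) = \mu(f).$$
Example.
1. Let $d = 1$ and $\mu_n = \delta_{1/n}$, where for $x \in \mathbb{R}^d$, $\delta_x$ is the Dirac mass at $x$, i.e. the Borel probability measure on $\mathbb{R}^d$ such that
$$\delta_x(A) = \begin{cases} 1 & x \in A \\ 0 & x \notin A \end{cases}$$
then $\mu_n \to \delta_0$ weakly, since $\mu_n(f) = f(1/n) \to f(0)$ for continuous $f$.
2. Let $\mu_n = \mathcal{N}(0, \sigma_n^2)$, Gaussian distribution with standard deviation $\sigma_n$, where $\sigma_n \to 0$, then again $\mu_n \to \delta_0$. Indeed,
$$\mu_n(f) = \int f(x)\,d\mu_n(x) = \int f(x) \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\Big(-\frac{x^2}{2\sigma_n^2}\Big)\,dx = \int f(x\sigma_n) \frac{1}{\sqrt{2\pi}} \exp\Big(-\frac{x^2}{2}\Big)\,dx.$$
$\sigma_n \to 0$ so $f(x\sigma_n) \to f(0)$ so by dominated convergence theorem (with dominating function $\|f\|_\infty$ times the standard Gaussian density), $\mu_n(f) \to f(0) = \delta_0(f)$.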
Definition (convergence of random variable). A sequence $(X_n)_{n\ge1}$ of $\mathbb{R}^d$-valued random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ is said to converge to a random variable $X$
1. almost surely if
$$\lim_{n\to\infty} X_n(\omega) = X(\omega)$$
for $\mathbb{P}$-almost every $\omega$.
2. in probability or in measure if for all $\varepsilon > 0$,
$$\lim_{n\to\infty} \mathbb{P}(\|X_n - X\| > \varepsilon) = 0.$$
Note that all norms on $\mathbb{R}^d$ are equivalent so we don't have to specify one.
3. in distribution or in law if $\mu_{X_n} \to \mu_X$ weakly, where $\mu_X = X_*\mathbb{P}$ is the law of $X$, a Borel probability measure on $\mathbb{R}^d$. Equivalently, for all $f \in C_b(\mathbb{R}^d)$, $\mathbb{E}(f(X_n)) \to \mathbb{E}(f(X))$.
Proposition 7.1. 1 โน 2 โน 3.
Proof.
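1. 1 $\implies$ 2:
$$\mathbb{P}(\|X_n - X\| > \varepsilon) = \mathbb{E}(1_{\|X_n - X\| > \varepsilon})$$
so if $X_n \to X$ almost surely then
$$1_{\|X_n - X\| > \varepsilon} \to 0$$
$\mathbb{P}$-almost everywhere so by dominated convergence theorem $\mathbb{P}(\|X_n - X\| > \varepsilon) \to 0$.
2. 2 $\implies$ 3: given $f \in C_b(\mathbb{R}^d)$, need to show that $\mu_{X_n}(f) \to \mu_X(f)$. But
$$\mu_{X_n}(f) - \mu_X(f) = \mathbb{E}(f(X_n) - f(X)).$$
To bound this, note that $f$ is continuous and hence uniformly continuous on compact subsets of $\mathbb{R}^d$. In particular for all $\varepsilon > 0$ there exists $\delta > 0$ such that if $\|x\| < \frac{1}{\varepsilon}$ and $\|y - x\| < \delta$ then $|f(y) - f(x)| < \varepsilon$. Thus
$$|\mathbb{E}(f(X_n) - f(X))| \le \mathbb{E}\big(1_{\|X_n - X\| < \delta} 1_{\|X\| < 1/\varepsilon} |f(X_n) - f(X)|\big) + 2\|f\|_\infty \Big(\mathbb{P}(\|X_n - X\| \ge \delta) + \mathbb{P}\big(\|X\| \ge \tfrac{1}{\varepsilon}\big)\Big),$$
where the integrand in the first term is less than $\varepsilon$ and $\|f\|_\infty < \infty$. So
$$\limsup_{n\to\infty} |\mathbb{E}(f(X_n) - f(X))| \le \varepsilon + 2\|f\|_\infty\,\mathbb{P}\big(\|X\| \ge \tfrac{1}{\varepsilon}\big),$$
and the right-hand side tends to $0$ as $\varepsilon \to 0$. As $\varepsilon$ is arbitrary, the limsup is $0$.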
Remark. When $d = 1$, 3 is equivalent to $F_{X_n}(x) \to F_X(x)$ as $n \to \infty$ for every $x$ at which $F_X$ is continuous. See example sheet 3.
The strict converses do not hold but we can say something weaker.
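Proposition 7.2. If $X_n \to X$ in probability then there is a subsequence $(n_k)_k$ such that $X_{n_k} \to X$ almost surely as $k \to \infty$.
Proof. We know for all $\varepsilon > 0$, $\mathbb{P}(\|X_n - X\| > \varepsilon) \to 0$ as $n \to \infty$. So for all $k$ there exists $n_k$ (which we may take increasing in $k$) such that
$$\mathbb{P}\big(\|X_{n_k} - X\| > \tfrac{1}{k}\big) \le \frac{1}{2^k}$$
so
$$\sum_{k\ge1} \mathbb{P}\big(\|X_{n_k} - X\| > \tfrac{1}{k}\big) < \infty$$
so by the first Borel-Cantelli lemma
$$\mathbb{P}\big(\|X_{n_k} - X\| > \tfrac{1}{k} \text{ io.}\big) = 0.$$
This means that with probability 1, $\|X_{n_k} - X\| \to 0$ as $k \to \infty$.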
Definition (convergence in mean). Let (๐๐)๐โฅ1 and ๐ be R๐-valued inte-grable random variables. We say that (๐๐)๐ converges in mean or in ๐ฟ1 to๐ if
lim๐โโ
E(โ๐๐ โ ๐โ) = 0.
Remark.
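1. If $X_n \to X$ in mean then $X_n \to X$ in probability by Markov inequality:
$$\varepsilon \cdot \mathbb{P}(\|X_n - X\| > \varepsilon) \le \mathbb{E}(\|X_n - X\|).$$
2. The converse is false. For example take $\Omega = (0, 1)$, $\mathcal{F} = \mathcal{B}(0, 1)$ and $\mathbb{P}$ Lebesgue measure. Let $X_n = n 1_{[0, \frac{1}{n}]}$. Then $X_n \to 0$ almost surely (hence in probability) but $\mathbb{E}X_n = 1$ for all $n$.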
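The counterexample $X_n = n 1_{[0, 1/n]}$ can be made concrete with exact arithmetic. This is a minimal sketch: the function names and the sample point $\omega = 1/7$ are illustrative assumptions.

```python
from fractions import Fraction


def X(n: int, omega: Fraction) -> int:
    """X_n(omega) = n * 1_{[0, 1/n]}(omega) for omega in (0, 1)."""
    return n if omega <= Fraction(1, n) else 0


def expectation(n: int) -> Fraction:
    # X_n takes the value n on an interval of Lebesgue measure 1/n, and 0 elsewhere.
    return n * Fraction(1, n)


omega = Fraction(1, 7)  # an arbitrary fixed sample point
path = [X(n, omega) for n in range(1, 12)]
print(path)                                     # zero as soon as n > 1/omega
print([expectation(n) for n in (1, 10, 1000)])  # the mean stays equal to 1
```

For each fixed $\omega$ the sequence is eventually $0$, yet the expectation never moves, so there is no convergence in mean.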
When does convergence in probability imply convergence in mean? We needsome kind of domination assumption.
Definition (uniformly integrable). A sequence of random variables $(X_n)_{n\ge1}$ is uniformly integrable if
$$\lim_{M\to\infty} \limsup_{n\to\infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) = 0.$$
Remark. If $(X_n)_{n\ge1}$ are dominated, namely there exists an integrable random variable $Y \ge 0$ such that $\|X_n\| \le Y$ for all $n$, then $(X_n)_n$ is uniformly integrable by dominated convergence theorem:
$$\mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) \le \mathbb{E}(Y 1_{Y \ge M}) \to 0$$
as $M \to \infty$.
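Theorem 7.3. Let $(X_n)_{n\ge1}$ be a sequence of $\mathbb{R}^d$-valued integrable random variables. Let $X$ be another random variable. Then TFAE:
1. $X$ is integrable and $X_n \to X$ in mean,
2. $(X_n)_{n\ge1}$ is uniformly integrable and $X_n \to X$ in probability.
Proof.
• 1 $\implies$ 2: Convergence in probability follows from Markov inequality, so it remains to show uniform integrability:
$$\begin{aligned}
\mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) &\le \mathbb{E}(\|X_n - X\| 1_{\|X_n\| \ge M}) + \mathbb{E}(\|X\| 1_{\|X_n\| \ge M}) \\
&\le \mathbb{E}(\|X_n - X\|) + \mathbb{E}\big(\|X\| 1_{\|X_n\| \ge M}(1_{\|X\| \le \frac{M}{2}} + 1_{\|X\| > \frac{M}{2}})\big) \\
&\le \mathbb{E}(\|X_n - X\|) + \mathbb{E}\big(\|X\| 1_{\|X_n - X\| \ge \frac{M}{2}} 1_{\|X\| \le \frac{M}{2}}\big) + \mathbb{E}\big(\|X\| 1_{\|X\| \ge \frac{M}{2}}\big) \\
&\le \mathbb{E}(\|X_n - X\|) + \tfrac{M}{2}\,\mathbb{P}\big(\|X_n - X\| \ge \tfrac{M}{2}\big) + \mathbb{E}\big(\|X\| 1_{\|X\| \ge \frac{M}{2}}\big).
\end{aligned}$$
Take lim sup,
$$\limsup_{n\to\infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) \le 0 + 0 + \mathbb{E}\big(\|X\| 1_{\|X\| \ge \frac{M}{2}}\big) \to 0$$
as $M \to \infty$ by dominated convergence theorem.
• 2 $\implies$ 1: Prove first that $X$ is integrable. By the previous proposition, we can find a subsequence $(n_k)_k$ such that $X_{n_k} \to X$ almost surely. By Fatou's lemma,
$$\mathbb{E}(\|X\| 1_{\|X\| > M}) \le \liminf_{k\to\infty} \mathbb{E}(\|X_{n_k}\| 1_{\|X_{n_k}\| \ge M})$$
which goes to $0$ as $M \to \infty$ by uniform integrability assumption. Thus
$$\mathbb{E}(\|X\|) \le M + \mathbb{E}(\|X\| 1_{\|X\| > M}) < \infty$$
for $M$ sufficiently large. Thus $X$ is integrable.
To show convergence in mean, we use the same trick of splitting into small and big parts.
$$\begin{aligned}
\mathbb{E}(\|X_n - X\|) &= \mathbb{E}\big((1_{\|X_n - X\| \le \varepsilon} + 1_{\|X_n - X\| > \varepsilon})\|X_n - X\|\big) \\
&\le \varepsilon + \mathbb{E}\big(1_{\|X_n - X\| > \varepsilon}\|X_n - X\|(1_{\|X_n\| \le M} + 1_{\|X_n\| > M})\big) \\
&= \varepsilon + (\mathrm{I}) + (\mathrm{II})
\end{aligned}$$
where
$$\begin{aligned}
(\mathrm{I}) &= \mathbb{E}(\|X_n - X\| 1_{\|X_n - X\| > \varepsilon} 1_{\|X_n\| \le M}) \\
&\le \mathbb{E}\big((M + \|X\|) 1_{\|X_n - X\| > \varepsilon}(1_{\|X\| \le M} + 1_{\|X\| > M})\big) \\
&\le 2M\,\mathbb{P}(\|X_n - X\| > \varepsilon) + 2\mathbb{E}(\|X\| 1_{\|X\| > M})
\end{aligned}$$
so
$$\limsup_{n\to\infty} (\mathrm{I}) \le 2\mathbb{E}(\|X\| 1_{\|X\| > M}) \to 0$$
as $M \to \infty$. On the other hand
$$\begin{aligned}
(\mathrm{II}) &= \mathbb{E}(\|X_n - X\| 1_{\|X_n - X\| > \varepsilon} 1_{\|X_n\| > M}) \\
&\le \mathbb{E}\big((\|X_n\| + \|X\|) 1_{\|X_n\| \ge M}\big) \\
&\le \mathbb{E}\big(\|X_n\| 1_{\|X_n\| \ge M} + \|X\| 1_{\|X\| > M} + \|X\| 1_{\|X_n\| \ge M} 1_{\|X\| \le M}\big) \\
&\le 2\mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) + \mathbb{E}(\|X\| 1_{\|X\| > M}).
\end{aligned}$$
Taking lim sup,
$$\limsup_{n\to\infty} (\mathrm{II}) \le 2\limsup_{n\to\infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) + \mathbb{E}(\|X\| 1_{\|X\| > M}) \to 0 + 0$$
as $M \to \infty$. Hence $\limsup_{n\to\infty} \mathbb{E}(\|X_n - X\|) \le \varepsilon$, and as $\varepsilon > 0$ was arbitrary, $X_n \to X$ in mean.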
Definition. We say that a sequence of random variables $(X_n)_{n\ge1}$ is bounded in $L^p$ if there exists $C > 0$ such that $\mathbb{E}(\|X_n\|^p) \le C$ for all $n$.
Proposition 7.4. If $p > 1$ and $(X_n)_{n\ge1}$ is bounded in $L^p$ then $(X_n)_{n\ge1}$ is uniformly integrable.
Proof.
$$M^{p-1}\,\mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) \le \mathbb{E}(\|X_n\|^p 1_{\|X_n\| \ge M}) \le \mathbb{E}(\|X_n\|^p) \le C$$
so
$$\limsup_{n\to\infty} \mathbb{E}(\|X_n\| 1_{\|X_n\| \ge M}) \le \frac{C}{M^{p-1}} \to 0$$
as $M \to \infty$.
This provides a sufficient condition for uniform integrability and thus, together with convergence in probability, for convergence in mean.
8 $L^p$ spaces
Recall that $\varphi : I \to \mathbb{R}$ is convex means that for all $x, y \in I$, for all $t \in [0, 1]$,
$$\varphi(tx + (1 - t)y) \le t\varphi(x) + (1 - t)\varphi(y).$$
Proposition 8.1 (Jensen inequality). Let $I$ be an open interval of $\mathbb{R}$ and $\varphi : I \to \mathbb{R}$ a convex function. Let $X$ be a random variable on $(\Omega, \mathcal{F}, \mathbb{P})$. Assume $X$ is integrable and takes values in $I$. Then
$$\mathbb{E}(\varphi(X)) \ge \varphi(\mathbb{E}(X)).$$
Remark.
1. As $X \in I$ almost surely and $I$ is an interval, have $\mathbb{E}(X) \in I$.
2. We'll show that $\varphi(X)^-$ is integrable so
$$\mathbb{E}(\varphi(X)) = \mathbb{E}(\varphi(X)^+) - \mathbb{E}(\varphi(X)^-)$$
is well-defined, with the possibility that both sides are $+\infty$.
Lemma 8.2. TFAE:
1. $\varphi$ is convex,
2. there exists a family $\mathcal{F}$ of affine functions ($x \mapsto ax + b$) such that $\varphi = \sup_{\ell \in \mathcal{F}} \ell$ on $I$.
Proof.
1. 2 $\implies$ 1: every $\ell$ is convex and the supremum of convex functions is convex: for $\ell \in \mathcal{F}$,
$$\ell(tx + (1 - t)y) = t\ell(x) + (1 - t)\ell(y) \le t \sup_{\ell \in \mathcal{F}} \ell(x) + (1 - t) \sup_{\ell \in \mathcal{F}} \ell(y)$$
so
$$\varphi(tx + (1 - t)y) = \sup_{\ell \in \mathcal{F}} \ell(tx + (1 - t)y) \le t \sup_{\ell \in \mathcal{F}} \ell(x) + (1 - t) \sup_{\ell \in \mathcal{F}} \ell(y) = t\varphi(x) + (1 - t)\varphi(y).$$
2. 1 $\implies$ 2: We need to show that for all $x_0 \in I$ we can find an affine function
$$\ell_{x_0}(x) = \theta_{x_0}(x - x_0) + \varphi(x_0),$$
where $\theta_{x_0}$ is morally the slope at $x_0$, such that $\varphi(x) \ge \ell_{x_0}(x)$ for all $x \in I$. Then have $\varphi = \sup_{x_0 \in I} \ell_{x_0}$.
To find $\theta_{x_0}$ observe that for all $x < x_0 < y$ where $x, y \in I$, have
$$\frac{\varphi(x_0) - \varphi(x)}{x_0 - x} \le \frac{\varphi(y) - \varphi(x_0)}{y - x_0}.$$
Indeed this is the convexity of $\varphi$ on $[x, y]$ with $t = \frac{x_0 - x}{y - x}$. This holds for all $x < x_0$, $y > x_0$ so there exists $\theta \in \mathbb{R}$ such that
$$\frac{\varphi(x_0) - \varphi(x)}{x_0 - x} \le \theta \le \frac{\varphi(y) - \varphi(x_0)}{y - x_0}.$$
Then just set $\ell_{x_0}(x) = \theta(x - x_0) + \varphi(x_0)$. By construction $\varphi(x) \ge \ell_{x_0}(x)$ for all $x \in I$.
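Proof of Jensen inequality. Let $\varphi(x) = \sup_{\ell \in \mathcal{F}} \ell(x)$ where each $\ell$ is affine. Then
$$\mathbb{E}(\varphi(X)) \ge \mathbb{E}(\ell(X)) = \ell(\mathbb{E}(X))$$
for all $\ell \in \mathcal{F}$ so take supremum,
$$\mathbb{E}(\varphi(X)) \ge \sup_{\ell \in \mathcal{F}} \ell(\mathbb{E}(X)) = \varphi(\mathbb{E}(X)).$$
Also for the remark,
$$-\varphi = -\sup_{\ell \in \mathcal{F}} \ell = \inf_{\ell \in \mathcal{F}} (-\ell)$$
so $\varphi^- = (-\varphi)^+ \le |\ell|$ for all $\ell \in \mathcal{F}$. Then for $\ell(x) = ax + b$,
$$\varphi(X)^- \le |\ell(X)| \le |a||X| + |b|.$$
As $X$ is integrable, $\varphi(X)^-$ is integrable.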
Jensen inequality is for probability space only. The following applies to allmeasure spaces.
Proposition 8.3 (Minkowski inequality). Let $(X, \mathcal{A}, \mu)$ be a measure space and $f$, $g$ measurable functions on $X$. Let $p \in [1, \infty)$ and define the $p$-norm
$$\|f\|_p = \Big(\int_X |f|^p\,d\mu\Big)^{1/p}.$$
Then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p.$$
Proof. wlog assume $\|f\|_p, \|g\|_p \ne 0$ and both finite. Need to show
$$\Big\| \frac{\|f\|_p}{\|f\|_p + \|g\|_p} \cdot \frac{f}{\|f\|_p} + \frac{\|g\|_p}{\|f\|_p + \|g\|_p} \cdot \frac{g}{\|g\|_p} \Big\|_p \le 1.$$
Suffice to show for all $t \in [0, 1]$, for all $F$, $G$ measurable such that $\|F\|_p = \|G\|_p = 1$, have
$$\|tF + (1 - t)G\|_p \le 1,$$
"the unit ball is convex". For this note that
$$[0, +\infty) \to [0, +\infty), \quad x \mapsto x^p$$
is convex if $p \ge 1$ so
$$|tF + (1 - t)G|^p \le t|F|^p + (1 - t)|G|^p$$
and
$$\int_X |tF + (1 - t)G|^p\,d\mu \le t \int_X |F|^p\,d\mu + (1 - t) \int_X |G|^p\,d\mu = t + (1 - t) = 1.$$
Proposition 8.4 (Hölder inequality). Suppose $(X, \mathcal{A}, \mu)$ is a measure space and let $f$, $g$ be measurable functions on $X$. Given $p, q \in (1, \infty)$ such that $\frac{1}{p} + \frac{1}{q} = 1$,
$$\int_X |fg|\,d\mu \le \Big(\int_X |f|^p\,d\mu\Big)^{1/p} \Big(\int_X |g|^q\,d\mu\Big)^{1/q}$$
with equality if and only if there exists $(\alpha, \beta) \ne (0, 0)$ such that $\alpha|f|^p = \beta|g|^q$ $\mu$-almost everywhere.
Lemma 8.5 (Young inequality). For all $p, q \in (1, \infty)$ such that $\frac{1}{p} + \frac{1}{q} = 1$, for all $a, b \ge 0$, have
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$
Proof of Hölder inequality. wlog assume $\|f\|_p, \|g\|_q \ne 0$ and both finite. Scaling $f$ and $g$ by positive constants, wlog $\|f\|_p = \|g\|_q = 1$. Then by Young inequality,
$$|fg| \le \frac{1}{p}|f|^p + \frac{1}{q}|g|^q$$
so
$$\int_X |fg|\,d\mu \le \frac{1}{p} \int_X |f|^p\,d\mu + \frac{1}{q} \int_X |g|^q\,d\mu = \frac{1}{p} + \frac{1}{q} = 1.$$
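For the counting measure on a finite set, the $p$-norm is the usual $\ell^p$ norm of a vector, so Hölder, Minkowski and Young can be checked numerically. The sketch below is illustrative only: the vectors, the exponents $p = 3$, $q = 3/2$ and the helper name `p_norm` are arbitrary assumptions.

```python
def p_norm(v, p: float) -> float:
    """The p-norm of v with respect to counting measure on {0, ..., len(v) - 1}."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


f = [1.0, -2.0, 3.0]
g = [0.5, 4.0, -1.0]
p, q = 3.0, 1.5  # conjugate exponents: 1/p + 1/q = 1

# Hoelder: integral of |f g| is at most ||f||_p ||g||_q.
holder_lhs = sum(abs(a * b) for a, b in zip(f, g))
holder_rhs = p_norm(f, p) * p_norm(g, q)

# Minkowski: ||f + g||_p is at most ||f||_p + ||g||_p.
minkowski_lhs = p_norm([a + b for a, b in zip(f, g)], p)
minkowski_rhs = p_norm(f, p) + p_norm(g, p)

# Young: a b <= a^p / p + b^q / q for a, b >= 0, checked on a small grid.
grid = [0.0, 0.3, 1.0, 2.5]
young_ok = all(a * b <= a ** p / p + b ** q / q + 1e-12 for a in grid for b in grid)

print(holder_lhs, holder_rhs, minkowski_lhs, minkowski_rhs, young_ok)
```

A small tolerance is added in the Young check to absorb floating-point rounding.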
Remark. Applying Jensen inequality to $\varphi(x) = x^{p'/p}$ for $p' > p \ge 1$ and the random variable $|X|^p$, we have
$$\mathbb{E}(|X|^p)^{1/p} \le \mathbb{E}(|X|^{p'})^{1/p'},$$
so the function $p \mapsto \mathbb{E}(|X|^p)^{1/p}$ is non-decreasing. This can be used, for example, to show that if $X$ has finite $p'$th moment then it has finite $p$th moment for $p' \ge p$.
Definition. Let $(X, \mathcal{A}, \mu)$ be a measure space.
• For $p \ge 1$,
$$\mathcal{L}^p(X, \mathcal{A}, \mu) = \{f : X \to \mathbb{R} \text{ measurable such that } |f|^p \text{ is } \mu\text{-integrable}\}.$$
• For $p = \infty$,
$$\mathcal{L}^\infty(X, \mathcal{A}, \mu) = \{f : X \to \mathbb{R} \text{ measurable such that } \operatorname{essup} |f| < \infty\}$$
where
$$\operatorname{essup} |f| = \inf\{t : |f(x)| \le t \text{ for } \mu\text{-almost every } x\}.$$
Lemma 8.6. $\mathcal{L}^p(X, \mathcal{A}, \mu)$ is an $\mathbb{R}$-vector space.
Proof. For $p < \infty$ use Minkowski inequality. Similarly we can check that $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$ for all $f$, $g$.
Definition. We say $f$ and $g$ are $\mu$-equivalent and write $f \equiv_\mu g$ if for $\mu$-almost every $x$, $f(x) = g(x)$.
Check that this is an equivalence relation stable under addition and multiplication.
Definition ($L^p$-space). Define
$$L^p(X, \mathcal{A}, \mu) = \mathcal{L}^p(X, \mathcal{A}, \mu)/\equiv_\mu$$
and if $[f]$ denotes the equivalence class of $f$ under $\equiv_\mu$ we define
$$\|[f]\|_p = \|f\|_p.$$
Proposition 8.7. For $p \in [1, \infty]$, $L^p(X, \mathcal{A}, \mu)$ is a normed vector space under $\|\cdot\|_p$ and it is complete, i.e. it is a Banach space.
Proof. If $f \equiv_\mu g$ then $\|f\|_p = \|g\|_p$ so $\|\cdot\|_p$ on $L^p$ is well-defined. Triangle inequality follows from Minkowski inequality and homogeneity is obvious so $\|\cdot\|_p$ is indeed a norm.
For completeness, pick $(f_n)_n$ a Cauchy sequence in $\mathcal{L}^p(X, \mathcal{A}, \mu)$. Need to show that there exists $f \in \mathcal{L}^p$ such that $\|f_n - f\|_p \to 0$ as $n \to \infty$. This then implies that $[f_n] \to [f]$ in $L^p$.
We can extract a subsequence $n_k \uparrow \infty$ such that $\|f_{n_{k+1}} - f_{n_k}\|_p \le 2^{-k}$. Let
$$S_K = \sum_{k=1}^K |f_{n_{k+1}} - f_{n_k}|$$
then
$$\|S_K\|_p \le \sum_{k=1}^K \|f_{n_{k+1}} - f_{n_k}\|_p \le \sum_{k=1}^K 2^{-k} \le 1$$
so by monotone convergence,
$$\lim_{K\to\infty} \int_X |S_K|^p\,d\mu = \int_X |S_\infty|^p\,d\mu \le 1,$$
i.e. $S_\infty \in \mathcal{L}^p$. In particular for $\mu$-almost every $x$, $|S_\infty(x)| < \infty$, i.e.
$$\sum_{k\ge1} |f_{n_{k+1}}(x) - f_{n_k}(x)| < \infty.$$
Hence $(f_{n_k}(x))_k$ is a Cauchy sequence in $\mathbb{R}$. By completeness of $\mathbb{R}$, the limit exists; set $f(x)$ to be it. When this limit does not exist set $f(x) = 0$. We then have, in case $p < \infty$, by Fatou's lemma
$$\|f_n - f\|_p \le \liminf_{k\to\infty} \|f_n - f_{n_k}\|_p \le \varepsilon$$
for any $\varepsilon > 0$, for $n$ sufficiently large. Thus
$$\lim_{n\to\infty} \|f_n - f\|_p = 0.$$
When $p = \infty$, we use the fact that if $f_n \to f$ $\mu$-almost everywhere then
$$\|f\|_\infty \le \liminf_{n\to\infty} \|f_n\|_\infty.$$
Proof. Let $t > \liminf_{n\to\infty} \|f_n\|_\infty$. Then there exists $n_k \uparrow \infty$ such that
$$\|f_{n_k}\|_\infty = \operatorname{essup} |f_{n_k}| = \inf\{s \ge 0 : |f_{n_k}(x)| \le s \text{ for } \mu\text{-almost every } x\} < t$$
for all $k$. Thus for all $k$, for $\mu$-almost every $x$, $|f_{n_k}(x)| < t$. But by $\sigma$-additivity of $\mu$ a countable union of null sets is null, so we can swap the quantifiers, i.e. for $\mu$-almost every $x$, for all $k$, $|f_{n_k}(x)| < t$. Thus for $\mu$-almost every $x$, $|f(x)| \le t$.
Proposition 8.8 (approximation by simple functions). Let $p \in [1, \infty)$. Let $V$ be the linear span of all simple functions on $X$. Then $V \cap \mathcal{L}^p$ is dense in $\mathcal{L}^p$.
Proof. Note that $g \in \mathcal{L}^p$ implies $g^+, g^- \in \mathcal{L}^p$. Thus by writing $g = g^+ - g^-$ and using Minkowski inequality, suffice to show that every $g \ge 0$ in $\mathcal{L}^p$ is the limit in $\|\cdot\|_p$ of a sequence of simple functions.
Recall there exist simple functions $0 \le g_n \le g$ such that $g_n(x) \uparrow g(x)$ for $\mu$-almost every $x$. Then
$$\lim_{n\to\infty} \|g_n - g\|_p^p = \lim_{n\to\infty} \int_X |g_n - g|^p\,d\mu = 0$$
by dominated convergence theorem ($|g_n - g| \le g_n + g \le 2g$ so $|g_n - g|^p \le (2g)^p$, which is $\mu$-integrable).
Remark. When $X = \mathbb{R}^d$, $\mathcal{A} = \mathcal{B}(\mathbb{R}^d)$ and $\mu$ is the Lebesgue measure, $C_c(\mathbb{R}^d)$, the space of continuous functions with compact support, is dense in $\mathcal{L}^p(X, \mathcal{A}, \mu)$ when $p \in [1, \infty)$ (this does not hold for $p = \infty$: a nonzero constant function is at $\|\cdot\|_\infty$-distance at least its absolute value from every compactly supported function). See example sheet. In fact, $C_c^\infty(\mathbb{R}^d)$ suffices.
9 Hilbert space and ๐ฟ2-methods
Definition (inner product). Let $V$ be a complex vector space. A Hermitian inner product on $V$ is a map
$$V \times V \to \mathbb{C}, \quad (x, y) \mapsto \langle x, y \rangle$$
such that
1. $\langle \alpha x + \beta y, z \rangle = \alpha\langle x, z \rangle + \beta\langle y, z \rangle$ for all $\alpha, \beta \in \mathbb{C}$, for all $x, y, z \in V$.
2. $\langle y, x \rangle = \overline{\langle x, y \rangle}$.
3. $\langle x, x \rangle \in \mathbb{R}$ and $\langle x, x \rangle \ge 0$, with equality if and only if $x = 0$.
Definition (Hermitian norm). The Hermitian norm is defined as $\|x\| = \sqrt{\langle x, x \rangle}$.
Lemma 9.1. Properties of norm:
1. homogeneity: $\|\lambda x\| = |\lambda|\|x\|$ for all $\lambda \in \mathbb{C}$, $x \in V$,
2. Cauchy-Schwarz: $|\langle x, y \rangle| \le \|x\| \cdot \|y\|$,
3. triangle inequality: $\|x + y\| \le \|x\| + \|y\|$,
4. parallelogram identity: $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$.
Proof. Exercise. For reference see author's notes on IID Linear Analysis.
Corollary 9.2. (๐ , โโ โ) is a normed vector space.
Definition (Hilbert space). We say (๐ , โโ โ) is a Hilbert space if it is com-plete.
Example. Let $V = L^2(X, \mathcal{A}, \mu)$ where $(X, \mathcal{A}, \mu)$ is a measure space. Then we can define
\[ \langle f, g \rangle = \int_X f \overline{g} \, d\mu \]
which is well-defined (i.e. finite) by Cauchy-Schwarz. The axioms are easy to check, with positive-definiteness given by
\[ 0 = \langle f, f \rangle = \int_X |f|^2 \, d\mu \]
if and only if $f = 0$ $\mu$-almost everywhere, i.e. $f = 0$ in $L^2$.
Proposition 9.3. Let $H$ be a Hilbert space and let $C$ be a nonempty closed convex subset of $H$. Then for all $x \in H$, there exists a unique $y \in C$ such that
\[ \|x - y\| = d(x, C) \]
where by definition
\[ d(x, C) = \inf_{c \in C} \|x - c\|. \]
This $y$ is called the orthogonal projection of $x$ on $C$.
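Proof. Let $c_n \in C$ be a sequence such that $\|x - c_n\| \to d(x, C)$. Let's show that $(c_n)_n$ is a Cauchy sequence. By the parallelogram identity,
\[ \left\| \frac{x - c_n}{2} + \frac{x - c_m}{2} \right\|^2 + \left\| \frac{x - c_n}{2} - \frac{x - c_m}{2} \right\|^2 = \frac{1}{2} \left( \|x - c_n\|^2 + \|x - c_m\|^2 \right) \]
so
\[ \underbrace{\left\| x - \frac{c_n + c_m}{2} \right\|^2}_{\ge d(x, C)^2} + \frac{1}{4} \|c_n - c_m\|^2 = \underbrace{\frac{1}{2} \left( \|x - c_n\|^2 + \|x - c_m\|^2 \right)}_{\to d(x, C)^2} \]
where the first term is at least $d(x, C)^2$ because $\frac{c_n + c_m}{2} \in C$ by convexity. So
\[ \lim_{n, m \to \infty} \|c_n - c_m\| = 0, \]
i.e. $(c_n)_n$ is Cauchy. By completeness the limit $\lim_{n \to \infty} c_n = y \in H$ exists. As $C$ is closed, $y \in C$. As $\|x - c_n\| \to d(x, C)$ and $\|x - c_n\| \to \|x - y\|$, we get $\|x - y\| = d(x, C)$. This shows existence of $y$.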
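For uniqueness, suppose $y, y' \in C$ both attain the distance and use the parallelogram identity:
\[ \underbrace{\left\| x - \frac{y + y'}{2} \right\|^2}_{\ge d(x, C)^2} + \frac{1}{4} \|y - y'\|^2 = \frac{1}{2} \left( \|x - y\|^2 + \|x - y'\|^2 \right) = d(x, C)^2 \]
so $\|y - y'\| = 0$.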
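The projection in Proposition 9.3 can be illustrated numerically. The following is a minimal sketch under stated assumptions: we take $H = \mathbb{R}^2$ with the Euclidean norm and $C$ the closed unit ball (both choices are ours, not from the notes), where the projection has the closed form $x / \max(1, \|x\|)$. The helper name `project_onto_unit_ball` is hypothetical.

```python
import numpy as np

def project_onto_unit_ball(x):
    """Nearest point to x in the closed unit ball {c : ||c|| <= 1}."""
    norm = np.linalg.norm(x)
    return x if norm <= 1.0 else x / norm

rng = np.random.default_rng(0)
x = np.array([3.0, 4.0])
y = project_onto_unit_ball(x)      # the projection of x on C
dist = np.linalg.norm(x - y)       # d(x, C) = ||x|| - 1

# Sample many points of C and record the smallest distance to x among them.
samples = rng.normal(size=(10_000, 2))
samples = np.array([project_onto_unit_ball(s) for s in samples])
min_sample_dist = np.linalg.norm(samples - x, axis=1).min()
```

No sampled point of $C$ should be closer to $x$ than $y$ is, which is what `min_sample_dist >= dist` expresses.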
Corollary 9.4. Suppose $V \le H$ is a closed subspace of a Hilbert space $H$. Then
\[ H = V \oplus V^\perp \]
where
\[ V^\perp = \{ x \in H : \langle x, v \rangle = 0 \text{ for all } v \in V \} \]
is the orthogonal complement of $V$.
Proof. $V \cap V^\perp = 0$ by positivity of the inner product. If $x \in H$ then there exists a unique $y \in V$ such that $\|x - y\| = d(x, V)$. We need to show that $x - y \in V^\perp$.
For all $z \in V$,
\[ \|x - y - z\| \ge \|x - y\| \]
as $y + z \in V$. Thus
\[ \|x - y\|^2 + \|z\|^2 - 2 \operatorname{Re} \langle x - y, z \rangle \ge \|x - y\|^2. \]
Rearranging,
\[ 2 \operatorname{Re} \langle x - y, z \rangle \le \|z\|^2 \]
for all $z \in V$. Now substitute $tz$ for $z$, where $t > 0$, to get
\[ t \cdot 2 \operatorname{Re} \langle x - y, z \rangle \le t^2 \|z\|^2. \]
Dividing by $t$ and letting $t \to 0$,
\[ \operatorname{Re} \langle x - y, z \rangle \le 0. \]
Similarly replace $z$ by $-z$ to conclude $\operatorname{Re} \langle x - y, z \rangle = 0$. Finally replace $z$ by $e^{i\theta} z$ to get $\langle x - y, z \rangle = 0$ for all $z \in V$. Thus $x - y \in V^\perp$.
Definition (bounded linear form). A linear form $\ell : H \to \mathbb{C}$ is bounded if there exists $C > 0$ such that $|\ell(x)| \le C \|x\|$ for all $x \in H$.
Remark. $\ell$ bounded is equivalent to $\ell$ continuous.
Theorem 9.5 (Riesz representation theorem). Let $H$ be a Hilbert space. For every bounded linear form $\ell$ there exists $v_0 \in H$ such that
\[ \ell(x) = \langle x, v_0 \rangle \]
for all $x \in H$.
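Proof. By boundedness of $\ell$, $\ker \ell$ is closed so we can write
\[ H = \ker \ell \oplus (\ker \ell)^\perp. \]
If $\ell = 0$ then just pick $v_0 = 0$. Otherwise pick $x_0 \in (\ker \ell)^\perp \setminus \{0\}$, so that $\ell(x_0) \ne 0$. Then $(\ker \ell)^\perp$ is spanned by $x_0$: indeed for any $x \in (\ker \ell)^\perp$,
\[ \ell(x) = \frac{\ell(x)}{\ell(x_0)} \ell(x_0) \]
so
\[ \ell\left( x - \frac{\ell(x)}{\ell(x_0)} x_0 \right) = 0 \]
so $x - \frac{\ell(x)}{\ell(x_0)} x_0 \in \ker \ell \cap (\ker \ell)^\perp = 0$. Now let
\[ v_0 = \frac{\overline{\ell(x_0)}}{\|x_0\|^2} x_0 \]
and observe that $x \mapsto \ell(x) - \langle x, v_0 \rangle$ vanishes on $\ker \ell$ and on $(\ker \ell)^\perp = \mathbb{C} x_0$. Thus it is identically zero.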
Definition (absolutely continuous, singular measure). Let $(X, \mathcal{A})$ be a measurable space and let $\mu, \nu$ be two measures on $(X, \mathcal{A})$.
1. $\mu$ is absolutely continuous with respect to $\nu$, denoted $\mu \ll \nu$, if for every $A \in \mathcal{A}$, $\nu(A) = 0$ implies $\mu(A) = 0$.
2. $\mu$ is singular with respect to $\nu$, denoted $\mu \perp \nu$, if there exists $\Omega \in \mathcal{A}$ such that $\mu(\Omega) = 0$ and $\nu(\Omega^c) = 0$.
Example.
1. Let $\nu$ be the Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ and $d\mu = f \, d\nu$ where $f \ge 0$ is a Borel function. Then $\mu \ll \nu$.
2. If $\mu = \delta_{x_0}$, the Dirac mass at $x_0 \in \mathbb{R}$, then $\mu \perp \nu$.
Non-examinable theorem and proof:
Theorem 9.6 (Radon-Nikodym). Assume $\mu$ and $\nu$ are $\sigma$-finite measures on $(X, \mathcal{A})$.
1. If $\mu \ll \nu$ then there exists $g \ge 0$ measurable such that $d\mu = g \, d\nu$, namely
\[ \mu(A) = \int_A g(x) \, d\nu(x) \]
for all $A \in \mathcal{A}$. $g$ is called the density of $\mu$ with respect to $\nu$ or Radon-Nikodym derivative, sometimes denoted by $g = \frac{d\mu}{d\nu}$.
2. For any $\mu, \nu$ $\sigma$-finite, $\mu$ decomposes as
\[ \mu = \mu_a + \mu_s \]
where $\mu_a \ll \nu$ and $\mu_s \perp \nu$. Moreover this decomposition is unique.
Proof. Consider $H = L^2(X, \mathcal{A}, \mu + \nu)$, which is a Hilbert space. First assume $\mu$ and $\nu$ are finite. Consider the linear form
\[ \ell : H \to \mathbb{R}, \qquad f \mapsto \mu(f) = \int_X f \, d\mu. \]
$\ell$ is bounded by Cauchy-Schwarz and finiteness of the measures:
\[ |\mu(f)| \le \mu(|f|) \le (\mu + \nu)(|f|) \le \sqrt{(\mu + \nu)(X)} \cdot \sqrt{\int_X |f|^2 \, d(\mu + \nu)} = C \cdot \|f\|_H \]
so by the Riesz representation theorem, there exists $g_0 \in L^2(X, \mathcal{A}, \mu + \nu)$ such that
\[ \mu(f) = \int_X f g_0 \, d(\mu + \nu). \qquad (*) \]
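Claim that for $(\mu + \nu)$-almost every $x$, $0 \le g_0(x) \le 1$: take $f = 1_{\{g_0 < 0\}}$ and plug it into $(*)$,
\[ 0 \le \mu(\{g_0 < 0\}) = \int_X \underbrace{1_{\{g_0 < 0\}} g_0}_{\le 0} \, d(\mu + \nu) \le 0 \]
so equality holds throughout. Thus $g_0 \ge 0$ $(\mu + \nu)$-almost everywhere. Similarly take $f = 1_{\{g_0 > 1 + \varepsilon\}}$ for $\varepsilon > 0$ and plug it into $(*)$,
\begin{align*}
(\mu + \nu)(\{g_0 > 1 + \varepsilon\}) &\ge \mu(\{g_0 > 1 + \varepsilon\}) \\
&= \int_X 1_{\{g_0 > 1 + \varepsilon\}} g_0 \, d(\mu + \nu) \\
&\ge (1 + \varepsilon)(\mu + \nu)(\{g_0 > 1 + \varepsilon\})
\end{align*}
so, as $\mu + \nu$ is finite, we must have
\[ (\mu + \nu)(\{g_0 > 1 + \varepsilon\}) = 0 \]
for every $\varepsilon > 0$, i.e. $g_0 \le 1$ $(\mu + \nu)$-almost everywhere.
Now set $\Omega = \{x \in X : g_0(x) \in [0, 1)\}$, so on $\Omega^c$, $g_0 = 1$ $(\mu + \nu)$-almost everywhere. Then $(*)$ is equivalent to
\[ \int_X f (1 - g_0) \, d\mu = \int_X f g_0 \, d\nu \]
for all $f \in L^2(X, \mathcal{A}, \mu + \nu)$. Hence, by monotone convergence, this holds for all measurable $f \ge 0$. Now let $f$ be $\frac{h}{1 - g_0} 1_\Omega$ with $h \ge 0$ measurable, to get
\[ \int_\Omega h \, d\mu = \int_\Omega h \frac{g_0}{1 - g_0} \, d\nu. \qquad (**) \]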
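Set
\[ \mu_a(A) = \mu(A \cap \Omega), \qquad \mu_s(A) = \mu(A \cap \Omega^c). \]
Clearly $\mu = \mu_a + \mu_s$. Claim that this is the required result, i.e.
1. $\mu_a \ll \nu$,
2. $\mu_s \perp \nu$,
3. $d\mu_a = g \, d\nu$ where $g = \frac{g_0}{1 - g_0} 1_\Omega$.
Proof.
1. If $\nu(A) = 0$, set $h = 1_A$ and plug into $(**)$ to get $\mu(A \cap \Omega) = 0$, namely $\mu_a(A) = 0$.
2. Set $f = 1_{\Omega^c}$. On $\Omega^c$, $g_0 = 1$ $(\mu + \nu)$-almost everywhere. Plug this into $(*)$ to get $\mu(\Omega^c) = \mu(\Omega^c) + \nu(\Omega^c)$, so $\nu(\Omega^c) = 0$. But $\mu_s(\Omega) = 0$ so $\mu_s \perp \nu$.
3. $(**)$ is equivalent to $d\mu_a = g \, d\nu$ where $g = \frac{g_0}{1 - g_0} 1_\Omega$.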
This settles the existence in part 2 of the theorem, and also part 1, as if $\mu \ll \nu$ then $\mu = \mu_a$.
If $\mu$ and $\nu$ are not finite but only $\sigma$-finite, use the old trick of partitioning $X$ into countably many $\mu$-finite sets and countably many $\nu$-finite sets and taking their intersections. Suppose we get a disjoint countable union $X = \bigsqcup_n X_n$. Then $\mu = \sum_n \mu|_{X_n}$ where for each $n$ we can write
\[ \mu|_{X_n} = (\mu|_{X_n})_a + (\mu|_{X_n})_s. \]
Then set
\[ \mu_a = \sum_n (\mu|_{X_n})_a, \qquad \mu_s = \sum_n (\mu|_{X_n})_s. \]
It remains to check uniqueness of the decomposition. Suppose $\mu$ can be decomposed in two ways
\[ \mu = \mu_a + \mu_s = \mu_a' + \mu_s'. \]
As $\mu_s, \mu_s' \perp \nu$ there exist $\Omega_0, \Omega_0' \in \mathcal{A}$ such that
\[ \mu_s(\Omega_0) = 0, \quad \nu(\Omega_0^c) = 0, \qquad \mu_s'(\Omega_0') = 0, \quad \nu((\Omega_0')^c) = 0. \]
Set $\Omega_1 = \Omega_0 \cap \Omega_0'$. Check that
\[ \mu_s(\Omega_1) = \mu_s'(\Omega_1) = 0, \qquad \nu(\Omega_1^c) = \nu(\Omega_0^c \cup (\Omega_0')^c) = 0. \]
Now $\mu_a, \mu_a' \ll \nu$ so
\[ \mu_a(\Omega_1^c) = \mu_a'(\Omega_1^c) = 0. \]
Hence for all $A \in \mathcal{A}$,
\[ \mu_a(A) = \mu_a(A \cap \Omega_1) = \mu(A \cap \Omega_1) = \mu_a'(A \cap \Omega_1) = \mu_a'(A) \]
so $\mu_a = \mu_a'$ and hence $\mu_s = \mu_s'$.
Proposition 9.7. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Let $\mathcal{G}$ be a $\sigma$-subalgebra of $\mathcal{F}$ and $X$ a random variable on $(\Omega, \mathcal{F}, \mathbb{P})$. Assume $X$ is integrable. Then there exists a random variable $Y$ on $(\Omega, \mathcal{G}, \mathbb{P})$ such that
\[ \mathbb{E}(1_A X) = \mathbb{E}(1_A Y) \]
for all $A \in \mathcal{G}$. Moreover $Y$ is unique almost surely.
One may wonder why it is nontrivial to recover a random variable on $\mathcal{G}$ from one on $\mathcal{F}$: as $\mathcal{G} \subseteq \mathcal{F}$, it might seem that we can simply restrict to a "sub-random variable". But this reasoning makes no sense, as random variables are functions on the space $\Omega$, and $X$ need not be $\mathcal{G}$-measurable. In other words, $\mathrm{id} : (\Omega, \mathcal{F}, \mathbb{P}) \to (\Omega, \mathcal{G}, \mathbb{P})$ is measurable but its inverse is not, and it is not obvious that $X$ has a pushforward.
Definition (conditional expectation). $Y$ as above is called the conditional expectation of $X$ with respect to $\mathcal{G}$, denoted by
\[ Y = \mathbb{E}(X | \mathcal{G}). \]
Proof. wlog assume $X \ge 0$. Set $\mu(A) = \mathbb{E}(1_A X)$ for all $A \in \mathcal{G}$. $\mu$ is finite by integrability of $X$ and is a measure on $(\Omega, \mathcal{G})$. Moreover $\mu \ll \mathbb{P}$. Thus by Radon-Nikodym there exists $g \ge 0$ $\mathcal{G}$-measurable such that
\[ \mu(A) = \int_A g \, d\mathbb{P} = \mathbb{E}(1_A g). \]
Set $Y = g$.
Uniqueness is shown in example sheet 3.
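Remark. In case $X \in L^2(\Omega, \mathcal{F}, \mathbb{P})$, $Y$ is the orthogonal projection of $X$ onto $L^2(\Omega, \mathcal{G}, \mathbb{P})$. It is well-defined since $L^2(\Omega, \mathcal{F}, \mathbb{P})$ is a Hilbert space and $L^2(\Omega, \mathcal{G}, \mathbb{P})$ is a closed subspace. In this case TFAE:
1. $\mathbb{E}((X - Y) 1_A) = 0$ for all $A \in \mathcal{G}$,
2. $\mathbb{E}((X - Y) h) = 0$ for all $h$ simple on $(\Omega, \mathcal{G}, \mathbb{P})$,
3. $\mathbb{E}((X - Y) h) = 0$ for all $h \ge 0$ $\mathcal{G}$-measurable,
4. $\mathbb{E}((X - Y) h) = 0$ for all $h \in L^2(\Omega, \mathcal{G}, \mathbb{P})$.
Remark. Special case when $\mathcal{G} = \{\emptyset, B, B^c, \Omega\}$ where $B \in \mathcal{F}$:
\[ \mathbb{E}(1_A | \mathcal{G})(\omega) = \begin{cases} \mathbb{P}(A | B) & \omega \in B \\ \mathbb{P}(A | B^c) & \omega \in B^c \end{cases} \]
where
\[ \mathbb{P}(A | B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}. \]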
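Proposition 9.8 (non-examinable). Properties of conditional expectation:
1. linearity: $\mathbb{E}(\alpha X + \beta Y | \mathcal{G}) = \alpha \mathbb{E}(X | \mathcal{G}) + \beta \mathbb{E}(Y | \mathcal{G})$.
2. if $X$ is $\mathcal{G}$-measurable then $\mathbb{E}(X | \mathcal{G}) = X$.
3. positivity: if $X \ge 0$ then $\mathbb{E}(X | \mathcal{G}) \ge 0$.
4. tower property: $\mathbb{E}(\mathbb{E}(X | \mathcal{G}) | \mathcal{H}) = \mathbb{E}(X | \mathcal{H})$ if $\mathcal{H} \subseteq \mathcal{G}$.
5. if $Z$ is $\mathcal{G}$-measurable and bounded then $\mathbb{E}(XZ | \mathcal{G}) = Z \cdot \mathbb{E}(X | \mathcal{G})$.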
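On a finite probability space, the conditional expectation with respect to $\mathcal{G} = \{\emptyset, B, B^c, \Omega\}$ is just the block average over $B$ and $B^c$. The following is a minimal sketch, assuming $\Omega = \{0, \dots, 5\}$ with the uniform measure and illustrative values for $X$ and $B$ (all of these are our choices); `cond_exp` is a hypothetical helper name.

```python
import numpy as np

p = np.full(6, 1 / 6)                        # uniform P on Omega = {0, ..., 5}
X = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 30.0])
B = np.array([True, True, True, False, False, False])

def cond_exp(X, p, B):
    """E(X | G) for G = {empty, B, B^c, Omega}: P-weighted average on each block."""
    Y = np.empty_like(X)
    for block in (B, ~B):
        Y[block] = np.sum(X[block] * p[block]) / np.sum(p[block])
    return Y

Y = cond_exp(X, p, B)

# Defining property E(1_A X) = E(1_A Y) for A = B, and for A = Omega.
lhs_B, rhs_B = np.sum(p * B * X), np.sum(p * B * Y)
lhs_all, rhs_all = np.sum(p * X), np.sum(p * Y)
```

Here `Y` is constant on $B$ and on $B^c$, hence $\mathcal{G}$-measurable, and the two pairs of sums agree.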
10 Fourier transform
Definition (Fourier transform). Let $f \in L^1(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d), dx)$ where $dx$ is the Lebesgue measure. The function
\[ \hat{f} : \mathbb{R}^d \to \mathbb{C}, \qquad u \mapsto \int_{\mathbb{R}^d} f(x) e^{i \langle u, x \rangle} \, dx \]
where $\langle u, x \rangle = u_1 x_1 + \dots + u_d x_d$, is called the Fourier transform of $f$.
Proposition 10.1.
1. $|\hat{f}(u)| \le \|f\|_1$.
2. $\hat{f}$ is continuous.
Proof. 1 is clear. 2 follows from the dominated convergence theorem.
Definition. Given a finite Borel measure $\mu$ on $\mathbb{R}^d$, its Fourier transform is given by
\[ \hat{\mu}(u) = \int_{\mathbb{R}^d} e^{i \langle u, x \rangle} \, d\mu(x). \]
Again $|\hat{\mu}(u)| \le \mu(\mathbb{R}^d)$ and $\hat{\mu}$ is continuous.
Remark. If $X$ is an $\mathbb{R}^d$-valued random variable with law $\mu_X$ then $\hat{\mu}_X$ is called the characteristic function of $X$.
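Example.
1. Normalised Gaussian distribution on $\mathbb{R}$: $\mu = \mathcal{N}(0, 1)$, $d\mu = g \, dx$ where
\[ g(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}}. \]
Claim that
\[ \hat{\mu}(u) = \hat{g}(u) = e^{-u^2/2}, \]
i.e. $\hat{g} = \sqrt{2\pi} \, g$. This is the defining characteristic of the Gaussian distribution.
Proof. Since $x \mapsto \left| \frac{d}{du} e^{iux} e^{-x^2/2} \right| = |x| e^{-x^2/2}$ is in $L^1(\mathbb{R})$, we can differentiate under the integral sign to get
\begin{align*}
\frac{d}{du} \hat{g}(u) &= \frac{d}{du} \int_{\mathbb{R}} e^{iux} e^{-x^2/2} \frac{dx}{\sqrt{2\pi}} \\
&= \int_{\mathbb{R}} i x e^{iux} e^{-x^2/2} \frac{dx}{\sqrt{2\pi}} \\
&= -i \int_{\mathbb{R}} e^{iux} g'(x) \, dx \\
&= i \int_{\mathbb{R}} (e^{iux})' g(x) \, dx \\
&= -u \int_{\mathbb{R}} e^{iux} g(x) \, dx \\
&= -u \hat{g}(u)
\end{align*}
using $g'(x) = -x g(x)$ and integration by parts. Thus
\[ \frac{d}{du} \left( \hat{g}(u) e^{u^2/2} \right) = 0 \]
so
\[ \hat{g}(u) = \hat{g}(0) e^{-u^2/2}. \]
But
\[ \hat{g}(0) = \int_{\mathbb{R}} g(x) \, dx = 1 \]
so $\hat{g}(u) = e^{-u^2/2}$ as required.
2. $d$-dimensional version: $\mu = \mathcal{N}(0, I_d)$, $d\mu(x) = G(x) \, dx$ where $dx = dx_1 \cdots dx_d$ and
\[ G(x) = \prod_{i=1}^d g(x_i) = \frac{e^{-\|x\|^2/2}}{(\sqrt{2\pi})^d}. \]
Then
\begin{align*}
\hat{G}(u) &= \int_{\mathbb{R}^d} e^{i \langle u, x \rangle} G(x) \, dx \\
&= \prod_{i=1}^d \int_{\mathbb{R}} g(x_i) e^{i u_i x_i} \, dx_i \\
&= \prod_{i=1}^d \hat{g}(u_i) \\
&= e^{-\|u\|^2/2}
\end{align*}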
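The identity $\hat{g}(u) = e^{-u^2/2}$ can be sanity-checked numerically. The following is a minimal sketch, assuming a plain Riemann sum on a truncated grid is an adequate stand-in for the integral (the grid bounds, step and test frequencies are our choices); `g_hat` is a hypothetical helper name.

```python
import numpy as np

# Grid on [-12, 12]; the Gaussian tails beyond this range are negligible.
x = np.linspace(-12.0, 12.0, 240_001)
dx = x[1] - x[0]
g = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def g_hat(u):
    """Riemann-sum approximation of the integral of g(x) e^{iux} dx."""
    return np.sum(g * np.exp(1j * u * x)) * dx

us = np.array([0.0, 0.5, 1.0, 2.0, 3.0])
numeric = np.array([g_hat(u) for u in us])
exact = np.exp(-us**2 / 2)
max_error = np.max(np.abs(numeric - exact))
```

`max_error` should be tiny, and the imaginary parts of `numeric` should vanish up to rounding, since $g$ is even.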
Theorem 10.2 (Fourier inversion formula).
1. If $f \in L^1(\mathbb{R}^d)$ is such that $\hat{f} \in L^1(\mathbb{R}^d)$ then $f$ is continuous (i.e. $f$ is equal almost everywhere to a continuous function) and
\[ f(x) = \frac{1}{(2\pi)^d} \hat{\hat{f}}(-x). \]
2. If $\mu$ is a finite Borel measure on $\mathbb{R}^d$ such that $\hat{\mu} \in L^1(\mathbb{R}^d)$ then $\mu$ has a continuous density with respect to the Lebesgue measure, i.e. $d\mu = f \, dx$ with
\[ f(x) = \frac{1}{(2\pi)^d} \hat{\hat{\mu}}(-x). \]
Remark. In other words
\[ f(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(u) e^{-i \langle u, x \rangle} \, du \]
where $\hat{f}(u)$ are Fourier coefficients and $e^{-i \langle u, x \rangle}$ are called Fourier modes, which are characters $e^{i \langle u, \cdot \rangle} : \mathbb{R}^d \to \{z \in \mathbb{C} : |z| = 1\}$. Informally this says that every $f$ can be written as an "infinite linear combination" of Fourier modes.
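Proof. (!UPDATE: 1 does not quite reduce to 2 as $\mu = f^+ - f^-$ does not quite hold. Instead write $f \, dx = a \, d\mu - b \, d\nu$ where $a = \|f^+\|_1$, $d\mu = \frac{f^+}{\|f^+\|_1} dx$, and similarly for $b$ and $\nu$ with $f^-$.)
1 reduces to 2 by considering $\mu = f^+ - f^-$. In 2 we may assume wlog that $\mu$ is a probability measure, so it is the law of some random variable $X$. Let $g(x) = \frac{1}{(2\pi)^d} \hat{\hat{\mu}}(-x)$. We need to show $\mu = g \, dx$, which is equivalent to: for all $A \in \mathcal{B}(\mathbb{R}^d)$,
\[ \mu(A) = \int_{\mathbb{R}^d} g 1_A \, dx. \]
Let $h = 1_A$ and wlog assume $A$ is a bounded Borel set. The trick is to introduce an independent Gaussian random variable $N \sim \mathcal{N}(0, I_d)$ with law $G \, dx$. We have
\[ \int_{\mathbb{R}^d} h(x) \, d\mu(x) = \mathbb{E}(h(X)) = \lim_{\sigma \to 0} \mathbb{E}(h(X + \sigma N)) \]
by the dominated convergence theorem. But
\begin{align*}
\mathbb{E}(h(X + \sigma N)) &= \mathbb{E}\left( \int_{\mathbb{R}^d} h(X + \sigma x) G(x) \, dx \right) \\
&= \mathbb{E}\left( \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} h(X + \sigma x) e^{i \langle u, x \rangle} G(u) \, du \frac{dx}{(\sqrt{2\pi})^d} \right)
\end{align*}
as
\[ G(x) = \frac{1}{(\sqrt{2\pi})^d} \hat{G}(x) = \frac{1}{(\sqrt{2\pi})^d} \int_{\mathbb{R}^d} G(u) e^{i \langle u, x \rangle} \, du. \]
So by the change of variable $y = \sigma x$, followed by $z = X + y$,
\begin{align*}
\mathbb{E}(h(X + \sigma N)) &= \mathbb{E}\left( \iint h(X + y) e^{i \langle u, y/\sigma \rangle} G(u) \frac{du}{(\sqrt{2\pi})^d} \frac{dy}{\sigma^d} \right) \\
&= \mathbb{E}\left( \iint h(z) e^{i \langle u/\sigma, z - X \rangle} G(u) \frac{du}{(\sqrt{2\pi\sigma^2})^d} \, dz \right) \\
&= \iint h(z) e^{i \langle u/\sigma, z \rangle} \hat{\mu}_X\left( -\frac{u}{\sigma} \right) G(u) \frac{du}{(\sqrt{2\pi\sigma^2})^d} \, dz \qquad \text{(Tonelli-Fubini)} \\
&= \frac{1}{(2\pi)^d} \iint \hat{\mu}_X(u) e^{-i \langle u, z \rangle} h(z) e^{-\sigma^2 \|u\|^2/2} \, du \, dz
\end{align*}
where the last line substitutes $-\sigma u$ for $u$. Letting $\sigma \to 0$, dominated convergence (using $\hat{\mu}_X \in L^1$ and that $h$ is bounded with bounded support) shows that the right-hand side converges to $\int h(z) g(z) \, dz$, as required.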
We want a condition to ensure $\hat{f} \in L^1(\mathbb{R}^d)$. Clearly continuity of $f$ is necessary. Here we use a general principle in Fourier analysis: the Fourier transform exchanges decay at infinity and smoothness. In the Fourier inversion formula, if $u$ is large then the Fourier character oscillates fast. Thus if $\hat{f}$ decays fast at infinity then the high-frequency modes contribute little and the resulting function is smoother; conversely, the smoother $f$ is, the faster $\hat{f}$ decays.
Proposition 10.3. If $f$, $f'$ and $f''$ exist (for example if $f$ is $C^2$) and are in $L^1$ then $\hat{f} \in L^1$.
Proof. We prove the case $d = 1$. The general case follows from Tonelli-Fubini. We show first that $f, f' \in L^1$ implies that $\hat{f}(u) = \frac{i}{u} \widehat{f'}(u)$. This easily follows from integration by parts:
\begin{align*}
\hat{f}(u) &= \int f(x) e^{iux} \, dx \\
&= \frac{1}{iu} \int f(x) (e^{iux})' \, dx \\
&= -\frac{1}{iu} \int f'(x) e^{iux} \, dx
\end{align*}
so in particular $|\hat{f}(u)| \le \frac{1}{|u|} \|f'\|_1$.
Thus if $f, f', f'' \in L^1$ then $\hat{f}(u) = -\frac{1}{u^2} \widehat{f''}(u)$ so
\[ |\hat{f}(u)| \le \frac{\|f''\|_1}{|u|^2}. \]
As $\int_1^\infty \frac{1}{|u|^2} \, du < \infty$ and $\hat{f}$ is bounded, $\hat{f} \in L^1$.
Definition (convolution). Given two Borel measures $\mu$ and $\nu$ on $\mathbb{R}^d$, we define their convolution $\mu * \nu$ as the image of $\mu \otimes \nu$ under the addition map
\[ \Phi : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d, \qquad (x, y) \mapsto x + y \]
i.e. $\mu * \nu = \Phi_*(\mu \otimes \nu)$.
Thus given $A \in \mathcal{B}(\mathbb{R}^d)$,
\[ \Phi_*(\mu \otimes \nu)(A) = \mu \otimes \nu(\{(x, y) : x + y \in A\}). \]
Example. Given independent random variables $X, Y$ with laws $\mu, \nu$ respectively, $\mu * \nu$ is the law of $X + Y$.
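For measures supported on the integers, the convolution of laws reduces to the discrete convolution of their probability vectors. The following is a minimal sketch, assuming $X$ and $Y$ are two independent fair dice (our choice of example); it compares `numpy.convolve` with a brute-force evaluation of $\mu \otimes \nu(\{(x, y) : x + y = s\})$.

```python
import numpy as np

die = np.full(6, 1 / 6)               # law of one die: die[k] = P(X = k + 1)
law_of_sum = np.convolve(die, die)    # entry j is P(X + Y = j + 2), j = 0..10

# Brute force from the definition: sum the product measure over {x + y = s}.
brute = np.zeros(11)
for a in range(1, 7):
    for b in range(1, 7):
        brute[a + b - 2] += die[a - 1] * die[b - 1]

p_seven = law_of_sum[5]               # P(X + Y = 7)
```

Both computations give the same law on $\{2, \dots, 12\}$, with the most likely sum being 7.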
Definition (convolution). If $f, g \in L^1(\mathbb{R}^d)$, define their convolution $f * g$ by
\[ (f * g)(x) = \int_{\mathbb{R}^d} f(x - t) g(t) \, dt. \]
This is well defined by Fubini: $f, g \in L^1$ so
\[ \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x - t) g(t)| \, dt \, dx < \infty \]
and
\[ \|f * g\|_1 = \int \left| \int f(x - t) g(t) \, dt \right| dx \le \iint |f(x - t) g(t)| \, dt \, dx \le \|f\|_1 \cdot \|g\|_1. \]
Therefore $(L^1(\mathbb{R}^d), *)$ forms a Banach algebra.
Remark. If $\mu, \nu$ are two finite Borel measures on $\mathbb{R}^d$ and if $\mu, \nu \ll dx$, i.e. they are absolutely continuous, then by Radon-Nikodym there exist $f, g \in L^1(\mathbb{R}^d)$ such that
\[ d\mu = f \, dx, \qquad d\nu = g \, dx. \]
Then $\mu * \nu \ll dx$ and
\[ d(\mu * \nu) = (f * g) \, dx. \]
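Proposition 10.4 (Gaussian approximation). If $f \in L^p(\mathbb{R}^d)$ where $p \in [1, \infty)$ then
\[ \lim_{\sigma \to 0} \|f * G_\sigma - f\|_p = 0 \]
where $G_\sigma$ is the density of $\mathcal{N}(0, \sigma^2 I_d)$, i.e.
\[ G_\sigma(x) = \frac{1}{(\sqrt{2\pi\sigma^2})^d} e^{-\frac{\|x\|^2}{2\sigma^2}}. \]
Lemma 10.5 (continuity of translation in $L^p$). Suppose $p \in [1, \infty)$ and $f \in L^p$. Then
\[ \lim_{t \to 0} \|\tau_t(f) - f\|_p = 0 \]
where $\tau_t(f)(x) = f(x + t)$, $t \in \mathbb{R}^d$.
Proof. Example sheet.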
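Proof of Gaussian approximation.
\[ (f * G_\sigma - f)(x) = \int_{\mathbb{R}^d} G_\sigma(t) (f(x - t) - f(x)) \, dt = \mathbb{E}(f(x - \sigma N) - f(x)) \]
where $N \sim \mathcal{N}(0, I_d)$ is Gaussian with density $G_1$. Then
\[ \|f * G_\sigma - f\|_p^p \le \mathbb{E}\left( \|f(\cdot - \sigma N) - f\|_p^p \right) = \mathbb{E}\left( \|\tau_{-\sigma N}(f) - f\|_p^p \right) \]
by Jensen's inequality and convexity of $x \mapsto x^p$. By the lemma, for every fixed value of $N$,
\[ \lim_{\sigma \to 0} \|\tau_{-\sigma N}(f) - f\|_p = 0. \]
As $\|\tau_{-\sigma N}(f) - f\|_p \le 2 \|f\|_p$, apply the dominated convergence theorem to get the required result.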
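Proposition 10.6.
• If $f, g \in L^1(\mathbb{R}^d)$ then
\[ \widehat{f * g} = \hat{f} \cdot \hat{g}. \]
• If $\mu, \nu$ are finite Borel measures then
\[ \widehat{\mu * \nu} = \hat{\mu} \cdot \hat{\nu}. \]
Proof. 1 reduces to 2 by writing
\[ f(x) \, dx = f^+(x) \, dx - f^-(x) \, dx = a \, d\mu - b \, d\nu \]
for some constants $a, b \ge 0$ and probability measures $\mu, \nu$.
wlog we may assume $\mu$ and $\nu$ are laws of independent random variables $X$ and $Y$. Then by a previous result $\mu * \nu$ is just the law of $X + Y$ so
\begin{align*}
\widehat{\mu * \nu}(u) &= \int e^{i \langle u, x + y \rangle} \, d\mu(x) \, d\nu(y) \\
&= \mathbb{E}(e^{i \langle u, X + Y \rangle}) \\
&= \mathbb{E}(e^{i \langle u, X \rangle} e^{i \langle u, Y \rangle}) && \text{homomorphism} \\
&= \mathbb{E}(e^{i \langle u, X \rangle}) \, \mathbb{E}(e^{i \langle u, Y \rangle}) && \text{as $X$, $Y$ are independent} \\
&= \hat{\mu}(u) \cdot \hat{\nu}(u).
\end{align*}
In short, this is precisely because $e^{i \langle u, \cdot \rangle}$ are characters.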
Theorem 10.7 (Lévy criterion). Let $(X_n)_{n \ge 1}$ and $X$ be $\mathbb{R}^d$-valued random variables. Then TFAE:
1. $X_n \to X$ in law,
2. For all $u \in \mathbb{R}^d$, $\hat{\mu}_{X_n}(u) \to \hat{\mu}_X(u)$.
In particular if $\hat{\mu}_X = \hat{\mu}_Y$ for two random variables $X$ and $Y$ then $X = Y$ in law, i.e. $\mu_X = \mu_Y$.
Thus the Fourier transform is an injection from finite Borel measures to a certain function space.
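Proof.
• 1 $\implies$ 2: Clear by definition as $g(x) = e^{i \langle u, x \rangle}$ is continuous and bounded for all $u \in \mathbb{R}^d$.
• 2 $\implies$ 1: Need to show that for all $g \in C_b(\mathbb{R}^d)$,
\[ \mathbb{E}(g(X_n)) \to \mathbb{E}(g(X)). \]
It is enough to check this for all $g \in C_c^\infty(\mathbb{R}^d)$. For the sufficiency see example sheet.
Note that for all $g \in C_c^\infty(\mathbb{R}^d)$, $\hat{g} \in L^1$ so by the Fourier inversion formula
\[ g(x) = \int \hat{g}(u) e^{-i \langle u, x \rangle} \frac{du}{(2\pi)^d}. \]
Hence
\begin{align*}
\mathbb{E}(g(X_n)) &= \int \hat{g}(u) \, \mathbb{E}(e^{-i \langle u, X_n \rangle}) \frac{du}{(2\pi)^d} \\
&= \int \hat{g}(u) \hat{\mu}_{X_n}(-u) \frac{du}{(2\pi)^d} \\
&\to \int \hat{g}(u) \hat{\mu}_X(-u) \frac{du}{(2\pi)^d} \\
&= \mathbb{E}(g(X))
\end{align*}
by the dominated convergence theorem, the integrands being dominated by $|\hat{g}| \in L^1$.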
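Theorem 10.8 (Plancherel formula).
1. If $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ then $\hat{f} \in L^2(\mathbb{R}^d)$ and
\[ \|\hat{f}\|_2^2 = (2\pi)^d \|f\|_2^2. \]
2. If $f, g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ then
\[ \langle \hat{f}, \hat{g} \rangle_{L^2} = (2\pi)^d \langle f, g \rangle_{L^2}. \]
3. The Fourier transform
\[ \mathcal{F} : L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d), \qquad f \mapsto \frac{1}{(\sqrt{2\pi})^d} \hat{f} \]
extends uniquely to a linear operator on $L^2(\mathbb{R}^d)$ which is an isometry. Moreover
\[ \mathcal{F} \circ \mathcal{F}(f) = \check{f} \]
where $\check{f}(x) = f(-x)$, for all $f \in L^2(\mathbb{R}^d)$.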
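Proof. First we prove 1 and 2 assuming $\hat{f}, \hat{g} \in L^1(\mathbb{R}^d)$. By the Fourier inversion formula,
\begin{align*}
\|\hat{f}\|_2^2 &= \int_{\mathbb{R}^d} |\hat{f}(u)|^2 \, du \\
&= \int \hat{f}(u) \overline{\hat{f}(u)} \, du \\
&= \int \left( \int f(x) e^{i \langle u, x \rangle} \, dx \right) \overline{\hat{f}(u)} \, du \\
&= \int f(x) \overline{\int \hat{f}(u) e^{-i \langle u, x \rangle} \, du} \, dx \\
&= \int f(x) \overline{f(x)} (2\pi)^d \, dx \\
&= (2\pi)^d \|f\|_2^2
\end{align*}
and in particular $\hat{f} \in L^2(\mathbb{R}^d)$.
Similarly for 2,
\begin{align*}
\langle \hat{f}, \hat{g} \rangle_{L^2} &= \int \hat{f}(u) \overline{\hat{g}(u)} \, du \\
&= \int \left( \int f(x) e^{i \langle u, x \rangle} \, dx \right) \overline{\hat{g}(u)} \, du \\
&= \int f(x) \overline{\int \hat{g}(u) e^{-i \langle u, x \rangle} \, du} \, dx \\
&= \int f(x) \overline{g(x)} (2\pi)^d \, dx \\
&= (2\pi)^d \langle f, g \rangle_{L^2}
\end{align*}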
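Now for the general case we use the Gaussian as a mollifier. Consider
\[ f_\sigma = f * G_\sigma, \qquad g_\sigma = g * G_\sigma \]
and based on results and computations before,
\[ \hat{f}_\sigma = \hat{f} \cdot \hat{G}_\sigma = \hat{f} e^{-\sigma^2 \|u\|^2/2}. \]
As $\|\hat{f}\|_\infty \le \|f\|_1$, $\hat{f}_\sigma \in L^1(\mathbb{R}^d)$. Thus $\hat{f}_\sigma \in L^2(\mathbb{R}^d)$ and $\|\hat{f}_\sigma\|_2^2 = (2\pi)^d \|f_\sigma\|_2^2$. But by Gaussian approximation we know that $f_\sigma \to f$ in $L^2(\mathbb{R}^d)$ as $\sigma \to 0$. Hence $\|f_\sigma\|_2 \to \|f\|_2$. Then
\[ \|\hat{f}_\sigma\|_2^2 = \|\hat{f} \cdot \hat{G}_\sigma\|_2^2 = \int |\hat{f}(u)|^2 e^{-\sigma^2 \|u\|^2} \, du \to \|\hat{f}\|_2^2 \]
as $\sigma \to 0$ by the monotone convergence theorem. Thus
\[ \|\hat{f}\|_2^2 = (2\pi)^d \|f\|_2^2. \]
For 2,
\[ \langle \hat{f}_\sigma, \hat{g}_\sigma \rangle = \int \hat{f} \overline{\hat{g}} e^{-\sigma^2 \|u\|^2} \, du \to \int \hat{f} \overline{\hat{g}} \, du \]
as $\sigma \to 0$ by the dominated convergence theorem, as $\hat{f} \overline{\hat{g}} \in L^1$. The result follows from Gaussian approximation.
For 3, $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$ because it contains $C_c(\mathbb{R}^d)$. Then extend by completeness: given $f \in L^2(\mathbb{R}^d)$, pick a sequence $f_n \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ such that $f_n \to f$ in $L^2(\mathbb{R}^d)$. Then define
\[ \mathcal{F} f = \lim_{n \to \infty} \mathcal{F} f_n. \]
The limit exists as $L^2(\mathbb{R}^d)$ is complete and $(\mathcal{F} f_n)_n$ is Cauchy by 1. $\mathcal{F}$ is well-defined as
\[ \|\mathcal{F} f_n - \mathcal{F} g_n\|_2 = \|f_n - g_n\|_2 \]
by 1. Finally,
\[ \|\mathcal{F} f\|_2 = \|f\|_2 \]
for all $f \in L^2(\mathbb{R}^d)$ and
\[ \mathcal{F} \circ \mathcal{F}(f) = \check{f} \]
for all $f$ such that $f, \hat{f} \in L^1(\mathbb{R}^d)$. Thus by continuity this holds for all of $L^2(\mathbb{R}^d)$.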
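The Plancherel identity can be checked on a concrete function. The following is a minimal sketch for $d = 1$, assuming the test function $f(x) = e^{-|x|}$ (our choice, not from the notes) and its standard closed-form transform $\hat{f}(u) = 2/(1 + u^2)$; integrals are approximated by Riemann sums on truncated grids, and the helper names are hypothetical.

```python
import numpy as np

x = np.linspace(-40.0, 40.0, 800_001)
dx = x[1] - x[0]
f = np.exp(-np.abs(x))

def f_hat_numeric(u):
    """Riemann-sum approximation of the integral of f(x) e^{iux} dx."""
    return np.sum(f * np.exp(1j * u * x)) * dx

def f_hat_exact(u):
    """Closed form assumed for the transform of exp(-|x|)."""
    return 2.0 / (1.0 + u**2)

u = np.linspace(-1000.0, 1000.0, 2_000_001)
du = u[1] - u[0]
norm_f_sq = np.sum(f**2) * dx                      # approximates ||f||_2^2
norm_fhat_sq = np.sum(f_hat_exact(u) ** 2) * du    # approximates ||f_hat||_2^2
```

With the convention $\hat{f}(u) = \int f(x) e^{iux} \, dx$ used in these notes, `norm_fhat_sq` should be close to `2 * np.pi * norm_f_sq`.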
11 Gaussians
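Definition (Gaussian). An $\mathbb{R}^d$-valued random variable $X$ is called Gaussian if for all $u \in \mathbb{R}^d$, $\langle X, u \rangle$ is Gaussian, namely its law has the form $\mathcal{N}(m, \sigma^2)$ for some $m \in \mathbb{R}$, $\sigma \ge 0$.
Proposition 11.1. The law of a Gaussian vector $X = (X_1, \dots, X_d) \in \mathbb{R}^d$ is uniquely determined by
1. its mean $\mathbb{E}X = (\mathbb{E}X_1, \dots, \mathbb{E}X_d)$,
2. its covariance matrix $(\operatorname{Cov}(X_i, X_j))_{ij}$ where
\[ \operatorname{Cov}(X_i, X_j) = \mathbb{E}((X_i - \mathbb{E}X_i)(X_j - \mathbb{E}X_j)). \]
Proof. If $d = 1$ then this just says that the law is determined by its mean $m$ and variance $\sigma^2$, which is obviously true. For $d > 1$, compute the characteristic function
\[ \hat{\mu}_X(u) = \mathbb{E}(e^{i \langle X, u \rangle}). \]
By assumption $\langle X, u \rangle$ is a one-dimensional Gaussian, so its law, and hence $\hat{\mu}_X(u)$, is determined by
1. the mean $\mathbb{E}(\langle X, u \rangle) = \langle \mathbb{E}X, u \rangle$,
2. the variance $\operatorname{Var} \langle X, u \rangle$. But
\[ \operatorname{Var} \langle X, u \rangle = \mathbb{E}((\langle X, u \rangle - \mathbb{E} \langle X, u \rangle)^2) = \sum_{i,j} u_i u_j \operatorname{Cov}(X_i, X_j). \]
As the characteristic function determines the law, this proves the claim. In particular this shows that $(\operatorname{Cov}(X_i, X_j))_{ij}$ is a positive semidefinite symmetric matrix.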
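Proposition 11.2. If $X$ is a Gaussian vector then there exist $A \in \mathcal{M}_d(\mathbb{R})$, $b \in \mathbb{R}^d$ such that $X$ has the same law as $AN + b$ where $N = (N_1, \dots, N_d)$ and $(N_i)_{i=1}^d$ are iid. $\mathcal{N}(0, 1)$.
Proof. Take $A$ such that
\[ A A^* = (\operatorname{Cov}(X_i, X_j))_{ij} \]
where $A^*$ is the adjoint/transpose of $A$, and
\[ b = (\mathbb{E}X_1, \dots, \mathbb{E}X_d). \]
Check that for all $u \in \mathbb{R}^d$,
\begin{align*}
\mathbb{E}(\langle X, u \rangle) &= \langle b, u \rangle = \mathbb{E}(\langle AN + b, u \rangle) \\
\operatorname{Var}(\langle X, u \rangle) &= \langle A A^* u, u \rangle = \|A^* u\|_2^2 = \operatorname{Var}(\langle AN + b, u \rangle)
\end{align*}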
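The representation $X \sim AN + b$ is also how Gaussian vectors are commonly simulated. The following is a minimal sketch, assuming an illustrative covariance matrix `K` and mean `m` (both our choices) and taking $A$ to be the Cholesky factor of `K`, which is one valid choice of $A$ with $A A^* = K$ when `K` is positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
K = np.array([[2.0, 0.6], [0.6, 1.0]])   # assumed target covariance matrix
m = np.array([1.0, -1.0])                # assumed target mean

A = np.linalg.cholesky(K)                # lower triangular, A @ A.T equals K
N = rng.standard_normal(size=(200_000, 2))   # rows are iid N(0, I_2) vectors
X = N @ A.T + m                          # each row is one sample of A N + m

empirical_mean = X.mean(axis=0)
empirical_cov = np.cov(X, rowvar=False)
```

Up to sampling error, `empirical_mean` and `empirical_cov` should match `m` and `K`.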
Proposition 11.3. If $(X_1, \dots, X_d)$ is a Gaussian vector then TFAE:
1. the $X_i$'s are independent.
2. the $X_i$'s are pairwise independent.
3. $(\operatorname{Cov}(X_i, X_j))_{ij}$ is a diagonal matrix.
Proof. 1 $\implies$ 2 $\implies$ 3 is obvious. 3 $\implies$ 1 as we can choose $A$ to be diagonal. Thus $X$ has the same law as $(a_1 N_1, \dots, a_d N_d) + b$.
Theorem 11.4 (central limit theorem). Let $(X_i)_{i \ge 1}$ be $\mathbb{R}^d$-valued iid. random variables with law $\mu$. Assume they have a second moment, i.e. $\mathbb{E}(\|X_1\|^2) < \infty$. Let $y = \mathbb{E}(X_1) \in \mathbb{R}^d$ and
\[ Y_n = \frac{X_1 + \dots + X_n - n \cdot y}{\sqrt{n}}. \]
Then $Y_n$ converges in law to a centred Gaussian on $\mathbb{R}^d$ with law $\mathcal{N}(0, K)$ where
\[ K_{ij} = (\operatorname{Cov}(X_1))_{ij} = \int_{\mathbb{R}^d} (x_i - y_i)(x_j - y_j) \, d\mu(x). \]
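Proof. The proof is an application of the Lévy criterion. We need to show $\hat{\mu}_{Y_n}(u) \to \hat{\mu}_Y(u)$ as $n \to \infty$ for all $u$, where $Y \sim \mathcal{N}(0, K)$. As
\[ \hat{\mu}_{Y_n}(u) = \mathbb{E}(e^{i \langle Y_n, u \rangle}), \]
this is equivalent to showing that for all $u$, $\langle Y_n, u \rangle$ converges in law to $\langle Y, u \rangle$. But
\[ \langle Y_n, u \rangle = \frac{\langle X_1, u \rangle + \dots + \langle X_n, u \rangle - n \langle y, u \rangle}{\sqrt{n}} \]
so we reduce the problem to the 1-dimensional case. By rescaling wlog $\mathbb{E}(X_1) = 0$, $\mathbb{E}(X_1^2) = 1$.
Now
\begin{align*}
\hat{\mu}_{Y_n}(u) &= \mathbb{E}(e^{iuY_n}) \\
&= \mathbb{E}\left( \exp\left( iu \frac{X_1 + \dots + X_n}{\sqrt{n}} \right) \right) \\
&= \prod_{i=1}^n \mathbb{E}\left( \exp\left( iu \frac{X_i}{\sqrt{n}} \right) \right) \\
&= \left( \mathbb{E}\left( \exp\left( iu \frac{X_1}{\sqrt{n}} \right) \right) \right)^n \\
&= \left( \hat{\mu}\left( \frac{u}{\sqrt{n}} \right) \right)^n
\end{align*}
But $\mathbb{E}(X_1^2) < \infty$, so we can differentiate $\hat{\mu}$ twice under the integral sign:
\begin{align*}
\hat{\mu}(u) &= \int_{\mathbb{R}} e^{iux} \, d\mu(x) \\
\frac{d}{du} \hat{\mu}(u) &= \int_{\mathbb{R}} i x e^{iux} \, d\mu(x), \quad \text{which equals } i \mathbb{E}(X_1) = 0 \text{ at } u = 0 \\
\frac{d^2}{du^2} \hat{\mu}(u) &= \int_{\mathbb{R}} -x^2 e^{iux} \, d\mu(x), \quad \text{which equals } -\mathbb{E}(X_1^2) = -1 \text{ at } u = 0
\end{align*}
Taylor expand $\hat{\mu}$ around 0 to 2nd order,
\begin{align*}
\hat{\mu}(u) &= \hat{\mu}(0) + u \hat{\mu}'(0) + \frac{u^2}{2} \hat{\mu}''(0) + o(u^2) \\
&= 1 + 0 \cdot u - \frac{u^2}{2} + o(u^2)
\end{align*}
so
\[ \hat{\mu}_{Y_n}(u) = \left( 1 - \frac{u^2}{2n} + o\left( \frac{u^2}{n} \right) \right)^n \to e^{-u^2/2} = \hat{\mu}_Y(u) \]
as $n \to \infty$, where $\mu_Y = \mathcal{N}(0, 1)$ is the law of $Y$.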
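The convergence $(\hat{\mu}(u/\sqrt{n}))^n \to e^{-u^2/2}$ can be watched directly in a case where $\hat{\mu}$ is explicit. The following is a minimal sketch, assuming $X_i = \pm 1$ with probability $1/2$ each (our choice of law, with mean 0 and variance 1), for which $\hat{\mu}(u) = \cos u$; `char_fn_Yn` is a hypothetical helper name.

```python
import numpy as np

def char_fn_Yn(u, n):
    """Characteristic function of (X_1 + ... + X_n) / sqrt(n) for X_i = +/-1 fair signs."""
    return np.cos(u / np.sqrt(n)) ** n

u = np.linspace(-3.0, 3.0, 61)
limit = np.exp(-u**2 / 2)                # characteristic function of N(0, 1)

# Maximal deviation from the Gaussian limit on [-3, 3] for increasing n.
errors = {n: np.max(np.abs(char_fn_Yn(u, n) - limit)) for n in (10, 100, 1000, 10_000)}
```

The entries of `errors` shrink as $n$ grows, roughly like $1/n$, in line with the second-order Taylor expansion used in the proof.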
12 Ergodic theory
Let $(X, \mathcal{A}, \mu)$ be a measure space. Let $T : X \to X$ be an $\mathcal{A}$-measurable map. We are interested in the trajectories $T^n x$ for $n \ge 0$ and their statistical behaviour. In particular we are interested in those $T$ preserving the measure $\mu$.
Definition (measure-preserving). $T : X \to X$ is measure-preserving if $T_* \mu = \mu$. $(X, \mathcal{A}, \mu, T)$ is called a measure-preserving dynamical system.
Definition (invariant function, invariant set, invariant $\sigma$-algebra).
• A measurable function $f : X \to \mathbb{R}$ is called $T$-invariant if $f = f \circ T$.
• A set $A \in \mathcal{A}$ is $T$-invariant if $1_A$ is $T$-invariant.
• $\mathcal{T} = \{A \in \mathcal{A} : A \text{ is } T\text{-invariant}\}$ is called the $T$-invariant $\sigma$-algebra.
Lemma 12.1. $f$ is $T$-invariant if and only if $f$ is $\mathcal{T}$-measurable.
Proof. Indeed for all $t \in \mathbb{R}$,
\[ \{x \in X : f(x) < t\} = \{x \in X : f \circ T(x) < t\} = T^{-1}(\{x \in X : f(x) < t\}). \]
Definition (ergodic). $T$ is ergodic with respect to $\mu$, or $\mu$ is ergodic with respect to $T$, if for all $A \in \mathcal{T}$, $\mu(A) = 0$ or $\mu(A^c) = 0$.
This condition asserts that $\mathcal{T}$ is trivial, i.e. its elements are either null or conull.
Lemma 12.2. $T$ is ergodic with respect to $\mu$ if and only if every invariant function $f$ is almost everywhere constant.
Proof. Exercise.
Example.
1. Let $X$ be a finite set, $T : X \to X$ a map and $\mu = \#$ the counting measure. Then $T$ is measure-preserving if and only if $T$ is a bijection, and $T$ is ergodic if and only if there does not exist a partition $X = X_1 \cup X_2$ into two nonempty $T$-invariant sets, which is equivalent to: for all $x, y \in X$, there exists $n$ such that $T^n x = y$.
2. Let $X = \mathbb{R}^d / \mathbb{Z}^d$, $\mathcal{A}$ the Borel $\sigma$-algebra and $\mu$ the Lebesgue measure. Given $a \in \mathbb{R}^d$, the translation $T_a : x \mapsto x + a$ is measure-preserving. $T_a$ is ergodic with respect to $\mu$ if and only if $(1, a_1, \dots, a_d)$, where the $a_i$'s are the coordinates of $a$, are linearly independent over $\mathbb{Q}$. See example sheet. (hint: Fourier transform)
3. Let $X = \mathbb{R} / \mathbb{Z}$ and again $\mathcal{A}$ the Borel $\sigma$-algebra and $\mu$ the Lebesgue measure. The doubling map $T : x \mapsto 2x - \lfloor 2x \rfloor$ is ergodic with respect to $\mu$ (hint: again consider Fourier coefficients). Intuitively, from the graph of this function, the preimage of an interval of length $\varepsilon$ is two intervals each of length $\varepsilon / 2$, so $T$ is measure-preserving.
4. Furstenberg conjecture: every ergodic measure $\mu$ on $\mathbb{R} / \mathbb{Z}$ invariant under both $T_2$ and $T_3$ must be either Lebesgue or finitely supported.
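The ergodicity of an irrational rotation (example 2 with $d = 1$) can be observed through time averages. The following is a minimal sketch, assuming the rotation number $a = \sqrt{2}$, the starting point $x_0 = 0.1$ and the set $A = [0, 1/4)$ (all our choices); it computes the fraction of time the orbit spends in $A$.

```python
import numpy as np

a = np.sqrt(2.0)     # assumed irrational rotation number
x0 = 0.1             # assumed starting point
n = 100_000

orbit = (x0 + a * np.arange(n)) % 1.0    # T_a^i(x0) for i = 0, ..., n - 1
in_A = orbit < 0.25                      # indicator of A = [0, 1/4) along the orbit
time_average = in_A.mean()               # fraction of time spent in A
```

For an ergodic system one expects the time average to approach the space average, here the Lebesgue measure $1/4$ of $A$.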
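12.1 The canonical model
Let $(X_n)_{n \ge 1}$ be an $\mathbb{R}^d$-valued stochastic process on $(\Omega, \mathcal{F}, \mathbb{P})$. Let $X = (\mathbb{R}^d)^{\mathbb{N}}$ and define the sample path map
\[ \Phi : \Omega \to X, \qquad \omega \mapsto (X_n(\omega))_{n \ge 1}. \]
Let
\[ T : X \to X, \qquad (x_n)_{n \ge 1} \mapsto (x_{n+1})_{n \ge 1} \]
be the shift map. Let $x_n : X \to \mathbb{R}^d$ be the $n$th coordinate function and let $\mathcal{A} = \sigma(x_n : n \ge 1)$.
Note. $\mathcal{A}$ is the infinite product $\sigma$-algebra $\mathcal{B}(\mathbb{R}^d)^{\otimes \mathbb{N}}$ on $(\mathbb{R}^d)^{\mathbb{N}}$.
Let $\mu = \Phi_* \mathbb{P}$, a probability measure on $(X, \mathcal{A})$. This $\mu$ is called the law of the process $(X_n)_{n \ge 1}$. Now $(X, \mathcal{A}, \mu, T)$ is called the canonical model associated to $(X_n)_{n \ge 1}$.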
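Proposition 12.3 (stationary process). TFAE:
1. $(X, \mathcal{A}, \mu, T)$ is measure-preserving.
2. For all $k \ge 1$, the law of $(X_n, X_{n+1}, \dots, X_{n+k})$ on $(\mathbb{R}^d)^{k+1}$ is independent of $n$.
In this case we say that $(X_n)_{n \ge 1}$ is a stationary process.
Proof.
• 1 $\implies$ 2: $\mu = T_* \mu$ implies $\mu = T^n_* \mu$ for all $n$, and this says that the law of $(X_i)_{i \ge 1}$ is the same as that of $(X_{i+n})_{i \ge 1}$.
• 1 $\impliedby$ 2: $\mu$ and $T^n_* \mu$ agree on cylinders $A \times (\mathbb{R}^d)^{\mathbb{N} \setminus F}$ for $F \subseteq \mathbb{N}$ finite, $A \in \mathcal{B}((\mathbb{R}^d)^F)$. These form a $\pi$-system generating $\mathcal{A}$, so the two measures agree on $\mathcal{A}$.
In some sense ergodic theory is the study of stationary processes.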
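Proposition 12.4 (Bernoulli shift). If $(X_n)_{n \ge 1}$ are iid. then $(X, \mathcal{A}, \mu, T)$ is ergodic. It is called the Bernoulli shift associated to the law $\nu$ of $X_1$. We have
\[ \mu = \nu^{\otimes \mathbb{N}}. \]
Proof. Claim that $\Phi^{-1}(\mathcal{T}) \subseteq \mathcal{C}$, the tail $\sigma$-algebra of $(X_n)_{n \ge 1}$. Then the Kolmogorov 0-1 law says that if $A \in \mathcal{T}$ then $\mathbb{P}(\Phi^{-1}(A)) = 0$ or $1$, so $\mu(A) = 0$ or $1$, thus $\mu$ is $T$-ergodic.
To prove the claim: given $A \in \mathcal{T}$, $T^{-1} A = A$ so
\begin{align*}
\Phi^{-1}(A) &= \{\omega \in \Omega : (X_n(\omega))_{n \ge 1} \in A\} \\
&= \{\omega \in \Omega : (X_n(\omega))_{n \ge 1} \in T^{-1} A\} \\
&= \{\omega \in \Omega : (X_{n+1}(\omega))_{n \ge 1} \in A\} \\
&= \{\omega \in \Omega : (X_{n+k}(\omega))_{n \ge 1} \in A\} \quad \text{for all } k \\
&\in \sigma(X_k, X_{k+1}, \dots)
\end{align*}
for all $k \ge 1$, i.e. $\Phi^{-1}(A)$ is a tail event.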
Theorem 12.5 (von Neumann mean ergodic theorem). Let $(X, \mathcal{A}, \mu, T)$ be a measure-preserving system. Let $f \in L^2(X, \mathcal{A}, \mu)$. Then the ergodic average
\[ S_n f = \frac{1}{n} \sum_{i=0}^{n-1} f \circ T^i \]
converges in $L^2$ to $\bar{f}$, a $T$-invariant function. In fact $\bar{f}$ is the orthogonal projection of $f$ onto $L^2(X, \mathcal{T}, \mu)$.
The intuition is as follows: if $f = 1_A$ is the indicator function of a set $A$, then $S_n f(x)$ is exactly the fraction of the first $n$ time steps that the orbit of $x$ spends in $A$.
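Proof. Hilbert space argument. Let $H = L^2(X, \mathcal{A}, \mu)$ and define
\[ U : H \to H, \qquad f \mapsto f \circ T \]
which is an isometry: because $\mu$ is $T$-invariant, $\int |f \circ T|^2 \, d\mu = \int |f|^2 \, d\mu$. Then by the Riesz representation theorem it has an adjoint
\[ U^* : H \to H, \qquad x \mapsto U^* x \]
which satisfies $\langle U^* x, y \rangle = \langle x, U y \rangle$ for all $y \in H$. Let
\[ W = \{\varphi - \varphi \circ T : \varphi \in H\} \]
be the coboundaries. Let $f = \varphi - \varphi \circ T \in W$. Then
\[ S_n f = \frac{1}{n} \sum_{i=0}^{n-1} (\varphi \circ T^i - \varphi \circ T^{i+1}) = \frac{\varphi - \varphi \circ T^n}{n} \to 0 \]
as $n \to \infty$, since $\|\varphi \circ T^n\| = \|\varphi\|$.
Let $f \in \overline{W}$. Then again $S_n f \to 0$, because for all $\varepsilon > 0$ there exists $g \in W$ such that $\|f - g\| < \varepsilon$. Then
\[ \|S_n f - S_n g\| = \|S_n (f - g)\| \le \|f - g\| \le \varepsilon \]
so $\limsup_n \|S_n f\| \le \varepsilon$.
We have $H = \overline{W} \oplus \overline{W}^\perp$ and $\overline{W}^\perp = W^\perp$. Claim $W^\perp$ is exactly the $T$-invariant functions. The theorem then follows because if $f \circ T = f$ then $S_n f = f$ for all $n$.
Proof of claim.
\begin{align*}
W^\perp &= \{g \in H : \langle g, \varphi - U\varphi \rangle = 0 \text{ for all } \varphi \in H\} \\
&= \{g : \langle g, \varphi \rangle = \langle g, U\varphi \rangle \text{ for all } \varphi\} \\
&= \{g : \langle g, \varphi \rangle = \langle U^* g, \varphi \rangle \text{ for all } \varphi\} \\
&= \{g : U^* g = g\} \\
&= \{g : U g = g\}
\end{align*}
where the last equality is by
\[ \|U g - g\|^2 = 2 \|g\|^2 - 2 \operatorname{Re} \langle g, U g \rangle = 2 \|g\|^2 - 2 \operatorname{Re} \langle U^* g, g \rangle, \]
and this shows that $W^\perp$ is exactly the space of $T$-invariant functions.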
In fact we can do better:
Theorem 12.6 (Birkhoff (pointwise) ergodic theorem). Let $(X, \mathcal{A}, \mu, T)$ be a measure-preserving system. Assume $\mu$ is finite (actually $\sigma$-finite suffices) and let $f \in L^1(X, \mathcal{A}, \mu)$. Then
\[ S_n f = \frac{1}{n} \sum_{i=0}^{n-1} f \circ T^i \]
converges $\mu$-almost everywhere to a $T$-invariant function $\bar{f} \in L^1$. Moreover $S_n f \to \bar{f}$ in $L^1$.
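Corollary 12.7 (strong law of large numbers). Let $(X_n)_{n \ge 1}$ be a sequence of iid. random variables. Assume $\mathbb{E}(|X_1|) < \infty$. Let
\[ S_n = \sum_{k=1}^n X_k, \]
then $\frac{1}{n} S_n$ converges almost surely to $\mathbb{E}(X_1)$.
Proof. Let $(X, \mathcal{A}, \mu, T)$ be the canonical model associated to $(X_n)_{n \ge 1}$, where $X = \mathbb{R}^{\mathbb{N}}$, $\mathcal{A} = \mathcal{B}(\mathbb{R})^{\otimes \mathbb{N}}$, $T$ is the shift operator and $\mu = \nu^{\otimes \mathbb{N}}$ where $\nu$ is the law of $X_1$. It is a Bernoulli shift. Let
\[ f : X \to \mathbb{R}, \qquad x \mapsto x_1 \]
be the first coordinate. Then $f \circ T^i(x) = x_{i+1}$ so
\[ \frac{1}{n} (X_1 + \dots + X_n)(\omega) = S_n f(x) \]
where $x = (X_n(\omega))_{n \ge 1}$. Hence by the Birkhoff ergodic theorem
\[ \frac{1}{n} (X_1 + \dots + X_n) \to \bar{f} = \int f \, d\mu = \int x_1 \, d\mu = \mathbb{E}(X_1) \]
almost surely.
Remark. If $T$ is ergodic then $\bar{f}$ is almost everywhere constant. Hence, when $\mu$ is a probability measure, $\bar{f} = \int f \, d\mu$.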
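Lemma 12.8 (maximal ergodic lemma). Let $f \in L^1(X, \mathcal{A}, \mu)$ and $\alpha \in \mathbb{R}$. Let
\[ E_\alpha = \{x \in X : \sup_{n \ge 1} S_n f(x) > \alpha\} \]
then
\[ \alpha \mu(E_\alpha) \le \int_{E_\alpha} f \, d\mu. \]
Lemma 12.9 (maximal inequality). Let
\[ f_0 = 0, \qquad f_n = n S_n f = \sum_{i=0}^{n-1} f \circ T^i \quad (n \ge 1). \]
Let
\[ P_N = \{x \in X : \max_{0 \le n \le N} f_n(x) > 0\}. \]
Then
\[ \int_{P_N} f \, d\mu \ge 0. \]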
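Proof of maximal inequality. Set $F_N = \max_{0 \le n \le N} f_n$. Observe that for all $n \le N$, $F_N \ge f_n$ and hence
\[ F_N \circ T + f \ge f_n \circ T + f = f_{n+1}. \]
Now if $x \in P_N$ then the maximum defining $F_N(x)$ is attained at some $n \ge 1$, so
\[ F_N(x) \le \max_{0 \le n \le N} f_{n+1}(x) \le F_N \circ T(x) + f(x). \]
Integrate to get
\[ \int_{P_N} F_N \, d\mu \le \int_{P_N} F_N \circ T \, d\mu + \int_{P_N} f \, d\mu. \]
Note that $F_N(x) = 0$ if $x \notin P_N$ because $f_0 = 0$, and $F_N \ge 0$ everywhere, so
\[ \int_{P_N} F_N \, d\mu = \int_X F_N \, d\mu \le \int_X F_N \circ T \, d\mu + \int_{P_N} f \, d\mu. \]
As $\mu$ is $T$-invariant, $\int F_N \circ T \, d\mu = \int F_N \, d\mu$ so
\[ \int_{P_N} f \, d\mu \ge 0. \]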
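Proof of maximal ergodic lemma. Apply the maximal inequality to $g = f - \alpha$. Observe that
\[ E_\alpha(f) = \bigcup_{N \ge 1} P_N(g) \]
and $S_n g = S_n f - \alpha$. Thus, as the sets $P_N(g)$ increase to $E_\alpha(f)$,
\[ \int_{E_\alpha(f)} (f - \alpha) \, d\mu \ge 0, \]
which is equivalent to
\[ \alpha \mu(E_\alpha) \le \int_{E_\alpha} f \, d\mu. \]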
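Proof of Birkhoff (pointwise) ergodic theorem. Let
\[ \bar{f} = \limsup_n S_n f, \qquad \underline{f} = \liminf_n S_n f. \]
Observe that $\bar{f} = \bar{f} \circ T$ and $\underline{f} \circ T = \underline{f}$: indeed
\[ S_n f \circ T = \frac{1}{n} (f \circ T + \dots + f \circ T^n) = \frac{1}{n} ((n + 1) S_{n+1} f - f). \]
We need to show that $\bar{f} = \underline{f}$ $\mu$-almost everywhere. This is equivalent to: for all $\alpha, \beta \in \mathbb{Q}$, $\alpha > \beta$, the set
\[ E_{\alpha, \beta}(f) = \{x \in X : \underline{f}(x) < \beta, \ \bar{f}(x) > \alpha\} \]
is $\mu$-null, as then
\[ \{x : \bar{f}(x) \ne \underline{f}(x)\} = \bigcup_{\alpha > \beta} E_{\alpha, \beta} \]
is $\mu$-null by countable subadditivity. Observe that $E_{\alpha, \beta}$ is $T$-invariant. Apply the maximal ergodic lemma to the system restricted to $E_{\alpha, \beta}$ to get
\[ \alpha \mu(E_{\alpha, \beta}(f)) \le \int_{E_{\alpha, \beta}} f \, d\mu. \]
Dually, applying it to $-f$ and $-\beta$,
\[ -\beta \mu(E_{\alpha, \beta}(f)) \le \int_{E_{\alpha, \beta}} -f \, d\mu \]
so
\[ \alpha \mu(E_{\alpha, \beta}) \le \beta \mu(E_{\alpha, \beta}). \]
But $\alpha > \beta$ and $\mu$ is finite, so $\mu(E_{\alpha, \beta}) = 0$.
We have proved that the limit $\lim_n S_n f$ exists almost everywhere, which we now define to be $\bar{f}$, and it is left to show $\bar{f} \in L^1$ and $\lim_n \|S_n f - \bar{f}\|_1 = 0$. This is an application of Fatou's lemma:
\begin{align*}
\int |\bar{f}| \, d\mu &= \int \liminf_n |S_n f| \, d\mu \\
&\le \liminf_n \int |S_n f| \, d\mu \\
&= \liminf_n \|S_n f\|_1 \\
&\le \|f\|_1
\end{align*}
so $\bar{f} \in L^1$, where the last inequality is because
\[ \|S_n f\|_1 \le \frac{1}{n} (\|f\|_1 + \dots + \|f \circ T^{n-1}\|_1) = \|f\|_1. \]
Now to show $\|S_n f - \bar{f}\|_1 \to 0$, we truncate $f$. Let $M > 0$ and set $\varphi_M = f 1_{|f| < M}$. Note that
• $|\varphi_M| \le M$ so $|S_n \varphi_M| \le M$. Hence, as $\mu$ is finite, by the dominated convergence theorem $\|S_n \varphi_M - \bar{\varphi}_M\|_1 \to 0$ as $n \to \infty$.
• $\varphi_M \to f$ $\mu$-almost everywhere as $M \to \infty$, and also in $L^1$ by the dominated convergence theorem.
Thus by Fatou's lemma,
\[ \|\bar{\varphi}_M - \bar{f}\|_1 \le \liminf_n \|S_n \varphi_M - S_n f\|_1 \le \|\varphi_M - f\|_1. \]
Finally
\begin{align*}
\|S_n f - \bar{f}\|_1 &\le \|S_n f - S_n \varphi_M\|_1 + \|S_n \varphi_M - \bar{\varphi}_M\|_1 + \|\bar{\varphi}_M - \bar{f}\|_1 \\
&\le \|f - \varphi_M\|_1 + \|S_n \varphi_M - \bar{\varphi}_M\|_1 + \|f - \varphi_M\|_1
\end{align*}
so
\[ \limsup_n \|S_n f - \bar{f}\|_1 \le 2 \|f - \varphi_M\|_1 \]
for all $M$, and the right-hand side goes to 0 as $M \to \infty$.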
Remark.
1. The theorem holds if $\mu$ is only assumed to be $\sigma$-finite.
2. The theorem holds if $f \in L^p$ for $p \in [1, \infty)$. Then $S_n f \to \bar{f}$ in $L^p$.