1.4 Cauchy Sequence in R
Definition. (1.4.1)
A sequence xn ∈ R is said to converge to a limit x if
• ∀ε > 0, ∃N s.t. n > N ⇒ |xn − x| < ε.
A sequence xn ∈ R is called a Cauchy sequence if
• ∀ε > 0, ∃N s.t. n > N & m > N ⇒ |xn − xm| < ε.
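These two definitions can be exercised numerically. A minimal sketch (the sequence xn = 1/n is an illustrative choice, not from the text; a finite check can only suggest the Cauchy property, it proves nothing):

```python
# For x_n = 1/n and a given eps, N = 2/eps works in the Cauchy definition:
# |1/n - 1/m| <= 1/n + 1/m < 2/N = eps whenever n, m > N.
eps = 1e-3
N = int(2 / eps)                        # N = 2000
tail = [1.0 / n for n in range(N + 1, N + 500)]
# every pair of terms beyond N is within eps of each other
assert all(abs(a - b) < eps for a in tail for b in tail)
```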
Proposition. (1.4.2)
Every convergent sequence is a Cauchy sequence.
Proof. Assume xk → x. Let ε > 0 be given.
• ∃N s.t. n > N ⇒ |xn − x| < ε/2.
• n, m > N ⇒ |xn − xm| ≤ |x − xn| + |x − xm| < ε/2 + ε/2 = ε.
Theorem. (1.4.3; Bolzano-Weierstrass Property)
Every bounded sequence in R has a subsequence that converges to some point in R.
Proof. Suppose xn is a bounded sequence in R: ∃M such that −M ≤ xn ≤ M, n = 1, 2, · · · . Select xn0 = x1.
• Bisect I0 := [−M, M] into [−M, 0] and [0, M].
• At least one of these (either [−M, 0] or [0, M]) must contain xn for infinitely many indices n.
• Call it I1 and select n1 > n0 with xn1 ∈ I1.
• Continue in this way to get a subsequence xnk such that
  • I0 ⊃ I1 ⊃ I2 ⊃ I3 ⊃ · · · ,
  • Ik = [ak, bk] with |Ik| = 2^{1−k}M,
  • n0 < n1 < n2 < · · · with xnk ∈ Ik.
• Since ak ≤ ak+1 ≤ M (monotone and bounded), ∃x s.t. ak → x.
• Since xnk ∈ Ik and |Ik| = 2^{1−k}M, we have
  |xnk − x| ≤ |xnk − ak| + |ak − x| ≤ 2^{1−k}M + |ak − x| → 0 as k → ∞.
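The bisection argument can be mimicked numerically. The sketch below uses the illustrative bounded sequence xn = (−1)^n(1 + 1/n) (an assumption, not from the text) and replaces "contains xn for infinitely many n" by the finite proxy "contains more of the remaining terms", so it only suggests the idea:

```python
# Repeatedly halve [-M, M], keeping the half that holds more of the remaining
# terms (a finite stand-in for "infinitely many"), and pick one index per step.
def bisect_subsequence(x, M, steps):
    a, b = -M, M
    last = -1           # index of the most recently selected term
    indices = []
    for _ in range(steps):
        mid = (a + b) / 2
        left = [n for n in range(last + 1, len(x)) if a <= x[n] <= mid]
        right = [n for n in range(last + 1, len(x)) if mid < x[n] <= b]
        if len(left) >= len(right):
            b, pool = mid, left
        else:
            a, pool = mid, right
        last = pool[0]          # select the next subsequence index
        indices.append(last)
    return indices, (a + b) / 2

x = [(-1) ** n * (1 + 1 / n) for n in range(1, 5000)]
idx, limit = bisect_subsequence(x, 2.0, 10)
# the selected subsequence clusters near one of the cluster points +-1
```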
Corollary. (1.4.5; Compactness)
Every sequence in the closed interval [a, b] has a subsequence that converges to some point in [a, b].
Proof. Assume a ≤ xn ≤ b for n = 1, 2, · · · . By Theorem 1.4.3, ∃ a subsequence xnk and ∃x with a ≤ x ≤ b such that xnk → x.
Lemma. (1.4.6; Boundedness of Cauchy sequence)
If xn is a Cauchy sequence, xn is bounded.
Proof. ∃N s.t. n ≥ N ⇒ |xn − xN| < 1. Then supn |xn| ≤ 1 + max{|x1|, · · · , |xN|}. (Why?)
Theorem. (1.4.3; Completeness)
Every Cauchy sequence in R converges to an element of R.
Proof. Cauchy seq. ⇒ bounded seq. (Lemma 1.4.6) ⇒ ∃ convergent subseq. (Bolzano-Weierstrass) ⇒ the Cauchy sequence itself converges to the limit of that subsequence.
1.5. Cluster Points of the sequence xn
Definition. (1.5.1; cluster points)
A point x is called a cluster point of the sequence xn if
• ∀ε > 0, ∃ infinitely many values of n with |xn − x| < ε.
In other words, a point x is a cluster point of the sequence xn iff
∀ε > 0 & ∀N, ∃n > N s.t. |xn − x| < ε.
Example
• Both 1 and −1 are cluster points of the sequence 1, −1, 1, −1, · · · .
• The sequence xn = 1/n has the only cluster point 0.
• The sequence xn = n does not have any cluster point.
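A finite check of Definition 1.5.1 for the first example (a sketch: on a finite prefix, "infinitely many" becomes a count that grows with the prefix length):

```python
# Count terms of x_n = (-1)^n (n = 1..1000) within eps of a candidate point c.
# Both 1 and -1 attract half of the terms; 0 attracts none.
x = [(-1) ** n for n in range(1, 1001)]
eps = 0.5
def near(c):
    return sum(1 for v in x if abs(v - c) < eps)

assert near(1) == 500 and near(-1) == 500 and near(0) == 0
```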
Proposition.
1. x is a cluster point of the sequence xn iff ∃ a subsequence xnk s.t. xnk → x.
2. xn → x iff every subsequence of xn converges to x.
3. xn → x iff the sequence {xn} is bounded and x is its only cluster point.
Proof.
1. (⇒) Assume x is a cluster point. Then we can choose n1 < n2 < n3 < · · · s.t. |xnk − x| < 1/k. (Why?) This gives a subsequence xnk → x. (⇐) is immediate from the definition.
2. Trivial.
3. (⇐) If not, ∃ε > 0 and ∃ a subseq. xnk so that |xnk − x| ≥ ε. Since xnk is bounded, ∃ a convergent subseq. The limit of that subseq. would be a cluster point of the seq. xn different from x, but there is no such point. Contradiction.
Definition. (1.5.3; limit superior & limit inferior of seq xn)
Define the limit superior lim sup xn in the following way:
• If xn is bounded above, then
  lim sup_{n→∞} xn = the largest cluster point of xn,
  and lim sup xn = −∞ if the set of cluster points is empty.
• If xn is NOT bounded above, then lim sup xn = ∞.
Similarly, we can define the limit inferior lim inf xn.
Examples
• For the seq 1, 0, −1, 1, 0, −1, · · · , lim sup xn = 1 and lim inf xn = −1.
• If xn = n, then lim sup xn = ∞ = lim inf xn.
• Let xn = (−1)^n (1 + n)/n. Then lim sup xn = 1 and lim inf xn = −1.
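Numerically, lim sup and lim inf can be approximated via the equivalent characterization lim sup xn = lim_N sup_{n≥N} xn, by taking the max/min of a long tail (the tail length below is an arbitrary illustrative choice):

```python
# Tail sup/inf approximation of lim sup / lim inf for x_n = (-1)^n (n+1)/n.
x = [(-1) ** n * (n + 1) / n for n in range(1, 100001)]
tail = x[50000:]                        # terms with n > 50000
limsup_approx, liminf_approx = max(tail), min(tail)
assert abs(limsup_approx - 1) < 1e-3    # lim sup x_n = 1
assert abs(liminf_approx + 1) < 1e-3    # lim inf x_n = -1
```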
Definition. (1.6.2; Vector space)
A real vector space V is a set of elements called vectors, with given operations of vector addition + : V × V → V and scalar multiplication · : R × V → V such that the following hold for all v, u, w ∈ V and all λ, µ ∈ R:
1. v + w = w + v, (v + u) + w = v + (u + w), λ(v + w) = λv + λw, λ(µv) = (λµ)v, (λ + µ)v = λv + µv, 1v = v.
2. ∃0 ∈ V s.t. v + 0 = v. ∃ −v ∈ V s.t. v + (−v) = 0.
• A subset of V is called a subspace if it is itself a vector space with the same operations.
• W ⊂ V is a vector subspace of V iff λv + µu ∈ W whenever u, v ∈ W and λ, µ ∈ R.
• The straight line W = {(x1, x2) : x1 = 2x2} is a subspace of R2.
Euclidean space Rn & Definitions & Properties
The Euclidean n-space Rn with the operations (x1, · · · , xn) + (y1, · · · , yn) = (x1 + y1, · · · , xn + yn) and λ(x1, · · · , xn) = (λx1, · · · , λxn) is a vector space of dimension n.
• The standard basis of Rn: e1 = (1, 0, · · · , 0), · · · , en = (0, · · · , 0, 1).
• Unique representation: x = (x1, · · · , xn) ∈ Rn can be expressed uniquely as x = x1e1 + · · · + xnen.
• Inner product of x and y: 〈x, y〉 = ∑_{i=1}^n xi yi.
• Norm of x: ‖x‖ = √〈x, x〉.
• Distance between x and y: dist(x, y) = ‖x − y‖.
• Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖.
• Cauchy-Schwarz inequality: |〈x, y〉| ≤ ‖x‖ ‖y‖.
• Pythagorean theorem: If 〈x, y〉 = 0, then ‖x + y‖² = ‖x‖² + ‖y‖².
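These inequalities are easy to spot-check numerically. The sketch below tests Cauchy-Schwarz and the triangle inequality on random vectors, and the Pythagorean theorem on the orthogonal pair e1, e2 (the small tolerances only absorb floating-point rounding):

```python
import math
import random

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

random.seed(0)
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    # Cauchy-Schwarz: |<x, y>| <= ||x|| ||y||
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y) + 1e-12

# Pythagorean theorem for the orthogonal pair e1, e2 in R^2
e1, e2 = [1.0, 0.0], [0.0, 1.0]
assert inner(e1, e2) == 0
assert abs(norm([1.0, 1.0]) ** 2 - (norm(e1) ** 2 + norm(e2) ** 2)) < 1e-12
```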
Definition. (1.7.1; Metric Space (M , d) equipped with d =distance)
A metric space (M, d) is a set M and a function d : M × M → R such that
1. d(x , y) ≥ 0 for all x , y ∈ M.
2. d(x , y) = 0 iff x = y.
3. d(x , y) = d(y , x) for all x , y ∈ M.
4. d(x , y) ≤ d(x , z) + d(z , y) for all x , y ∈ M.
Example [Fingerprint Recognition] Let M be a data set of fingerprints in the Seoul city police department.
• Motivation: Design an efficient access system to find a target.
• We need to define a dissimilarity function quantifying the distance between the data. The distance d(x, y) between two data x and y must satisfy the above four rules.
• Similarity queries: For a given target x∗ ∈ M and ε > 0, find (arrest) all persons having a fingerprint y ∈ M such that d(y, x∗) < ε.
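On a finite data set the four axioms of Definition 1.7.1 can be checked exhaustively. The helper below is a hypothetical illustration (not from the text); it also shows that squaring the Euclidean distance breaks the triangle inequality, so a squared distance is a dissimilarity but not a metric:

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Exhaustively check the four metric axioms on a finite point set."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < -tol:
            return False                              # 1. nonnegativity
        if (d(x, y) < tol) != (x == y):
            return False                              # 2. d(x, y) = 0 iff x = y
        if abs(d(x, y) - d(y, x)) > tol:
            return False                              # 3. symmetry
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False                              # 4. triangle inequality
    return True

pts = [(0, 0), (1, 0), (0, 1), (2, 2)]
euclid = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
assert is_metric(pts, euclid)
assert not is_metric(pts, lambda p, q: euclid(p, q) ** 2)  # d^2 is not a metric
```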
Definition. (1.7.3; Normed Space (V, ‖ · ‖))
A normed space (V, ‖ · ‖) is a vector space V and a function ‖ · ‖ : V → R called a norm such that
1. ‖v‖ ≥ 0, ∀v ∈ V.
2. ‖v‖ = 0 iff v = 0.
3. ‖λv‖ = |λ|‖v‖, ∀v ∈ V and every scalar λ.
4. ‖v + w‖ ≤ ‖v‖ + ‖w‖, ∀v, w ∈ V.
Examples
• V = R and ‖x‖ = |x| for all x ∈ R.
• V = R2 and ‖v‖ = √(v1² + v2²) for all v = (v1, v2) ∈ R2.
• Let V = C([0, 1]) = all continuous functions on the interval [0, 1]. Define ‖f‖ = sup{|f(x)| : x ∈ [0, 1]} (called the supremum norm).
Proposition.
If (V, ‖ · ‖) is a normed vector space and d(v, w) = ‖v − w‖, then d is a metric on V.
Proof. EASY.
Examples
• For V = C([0, 1]), the metric is
  d(f, g) = ‖f − g‖ = sup{|f(x) − g(x)| : x ∈ [0, 1]}.
  The sup distance between functions is the largest vertical distance between their graphs.
Definition.
A vector space V with a function 〈·, ·〉 : V × V → R is called an inner product space if
1. 〈v, v〉 ≥ 0 for all v ∈ V.
2. 〈v, v〉 = 0 iff v = 0.
3. 〈λv, w〉 = λ〈v, w〉, ∀v, w ∈ V and every scalar λ.
4. 〈v + w, h〉 = 〈v, h〉 + 〈w, h〉, ∀v, w, h ∈ V.
5. 〈v, w〉 = 〈w, v〉, ∀v, w ∈ V.
Examples
1. V = R2 and 〈v, w〉 = v1w1 + v2w2. Two vectors v and w are orthogonal if 〈v, w〉 = 0.
2. V = C([0, 1]) and 〈f, g〉 = ∫₀¹ f(x)g(x) dx.
3. ‖v‖ = √〈v, v〉 defines a norm on V.
Theorem. (Cauchy-Schwarz inequality)
If 〈·, ·〉 is an inner product on a real vector space V, then |〈f, g〉| ≤ ‖f‖‖g‖.
Proof.
• If g = 0, both sides are 0, so suppose g ≠ 0. Let h = g/‖g‖. It suffices to prove that |〈f, h〉| ≤ ‖f‖. (Why? |〈f, g〉| ≤ ‖f‖‖g‖ iff |〈f, h〉| ≤ ‖f‖.)
• Denote α = 〈f, h〉. Then, using ‖h‖ = 1,
  0 ≤ ‖f − αh‖² = 〈f − αh, f − αh〉 = ‖f‖² − α〈h, f〉 − α〈f, h〉 + α²‖h‖² = ‖f‖² − α².
Hence, |α| = |〈f, h〉| ≤ ‖f‖. This completes the proof.
Chapter 2: Topology of M = Rn
Throughout this chapter, assume M = Rn (the Euclidean space) with the metric
d(x, y) = √(∑_{i=1}^n |xi − yi|²) = ‖x − y‖.
Definition. (D(x, ε), open, neighborhood)
• D(x, ε) := {y ∈ M : d(y, x) < ε} is called the ε-ball (or ε-disk) about x.
• A ⊂ M is open if ∀x ∈ A, ∃ε > 0 s.t. D(x, ε) ⊂ A.
• A neighborhood of x is an open set A containing x.
• Open sets: (a, b), D(x, ε), {(x, y) ∈ R2 : 0 < x < 1}.
• The union of an arbitrary collection of open subsets of M is open. (Why?)
• The intersection of a finite number of open subsets of M is open. (An infinite intersection can fail to be open: ∩_{n=1}^∞ (−1/n, 1/n) = {0} is closed, not open.)
2.2 Interior of a set A: int(A)
Definition. (2.2.1; Interior point & interior of A)
Let (M, d) be a metric space and A ⊂ M. x is called an interior point of A if ∃ε > 0 s.t. D(x, ε) ⊂ A. Denote
int(A) := the collection of all interior points of A.
Examples. Proofs are very easy.
• If A = [0, 1], then int(A) = (0, 1).
• int{(x, y) ∈ R2 : 0 < x ≤ 1} = {(x, y) ∈ R2 : 0 < x < 1}.
• If A is open, then int(A) = A.
• For x0 ∈ Rn, int{y ∈ Rn : d(y, x0) ≤ 1} = {y ∈ Rn : d(y, x0) < 1}. (In a general metric space only "⊇" is guaranteed; think of the discrete metric.)
Definition. (2.3-4: Closed sets & Accumulation Points )
• A set B in a metric space M is said to be closed if M \ B is open.
• x ∈ M is an accumulation point (or cluster point) of a set A ⊂ M if ∀ε > 0, D(x, ε) contains some y ∈ A with y ≠ x.
Prove the following:
• Closed sets: [a, b], {y ∈ R2 : d(y, x0) ≤ 1}.
• The union of a finite number of closed subsets of M is closed. (Note that ∪_{n=1}^∞ [1/n, 2 − 1/n] = (0, 2) is open.)
• The intersection of an arbitrary family of closed subsets of M is closed. (Why?)
• Every finite set in Rn is closed.
• A set A ⊂ M is closed iff every accumulation point of A belongs to A.
• A = {1, 1/2, 1/3, 1/4, · · · } ∪ {0} is closed.
Definition. ( Closure of A & Boundary of A)
Let (M, d) be a metric space and A ⊂ M.
• cl(A) := the intersection of all closed sets containing A.
• ∂A = bd(A) := cl(A) ∩ cl(M \ A) is called the boundary of A.
Examples
• Closure: cl((0, 1)) = [0, 1], cl{(x, y) ∈ R2 : x > y} = {(x, y) ∈ R2 : x ≥ y}.
• Boundary: bd((0, 1)) = {0, 1}, bd{(x, y) ∈ R2 : x > y} = {(x, y) ∈ R2 : x = y}.
Let (M, d) be a metric space and A ⊂ M. Prove that
• cl(A) = A ∪ {accumulation points of A}.
• x ∈ cl(A) iff inf{d(x, y) : y ∈ A} = 0.
• x ∈ bd(A) iff ∀ε > 0, D(x, ε) ∩ A ≠ ∅ & D(x, ε) ∩ (M \ A) ≠ ∅.
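The distance characterization of cl(A) is easy to visualize numerically. The sketch below samples A = (0, 1) on a fine grid (a finite stand-in for the actual set, so the inf becomes a min over samples):

```python
# dist(x, A) = inf{|x - y| : y in A}, approximated over a dense sample of (0, 1).
A = [k / 10000 for k in range(1, 10000)]
def dist_to_A(x):
    return min(abs(x - y) for y in A)

assert dist_to_A(0.0) < 1e-3   # 0 is in cl((0, 1)) = [0, 1]
assert dist_to_A(1.0) < 1e-3   # 1 is in cl((0, 1))
assert dist_to_A(2.0) > 0.9    # 2 is not in cl((0, 1))
```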
Definition. (Sequences & Completeness)
Let (M, d) be a metric space and xk a sequence of points in M.
• limk→∞ xk = x iff ∀ε > 0, ∃N s.t. k ≥ N ⇒ d(x, xk) < ε.
• xk is a Cauchy seq. iff ∀ε > 0, ∃N s.t. k, l ≥ N ⇒ d(xk, xl) < ε.
• xk is bounded iff ∃B > 0 & x0 ∈ M s.t. d(xk, x0) < B for all k.
• x is a cluster point of the seq. xk iff ∀ε > 0, ∃ infinitely many k with d(xk, x) < ε.
The space M is called complete if every Cauchy seq. in M converges to a point in M.
In a metric space, it is easy to prove the following:
• Every convergent seq. is a Cauchy seq.
• A Cauchy seq. is bounded.
• If a subseq. of a Cauchy seq. converges to x, then the sequence itself converges to x.
Chapter 3: Compact & Connected sets
Throughout this chapter, we assume that (M, d) is a metric space.
Definition. (3.1.1: Sequentially compact & Compact)
Let A ⊂ M.
• A is called sequentially compact if EVERY sequence in A has a subsequence that converges to a point in A.
• A is compact if EVERY open cover of A has a FINITE subcover.
  • An open cover of A is a collection {Ui} of open sets such that A ⊂ ∪iUi.
  • An open cover {Ui} of A is said to have a finite subcover if a finite subcollection of {Ui} covers A.
• In chapter 1, we proved that every sequence xn in the closed interval [a, b] has a subsequence that converges to a point in [a, b]. Hence, [a, b] is sequentially compact.
Examples of compact set
1. Prove that the entire line R is NOT compact. Proof. Clearly, {D(n, 1) : n = 0, ±1, ±2, · · · } is an open cover of R but it has no finite subcover (why?).
2. Prove that A = (0, 1] is not compact. Proof. Clearly, (0, 1] = ∪_{n=1}^∞ (1/n, 2). Hence, {(1/n, 2) : n = 1, 2, · · · } is an open cover of (0, 1] with no finite subcover.
3. Heine-Borel thm. Let A ⊂ M = Rn. A is compact iff A is closed and bounded. Proof: later.
4. Give an example of a bounded and closed set that is not compact.
   Sol’n. Let M = {en : n = 1, 2, · · · } where e1 = (1, 0, 0, · · · ), e2 = (0, 1, 0, · · · ), · · · . Let d(ei, ej) = √2 if i ≠ j. Then (M, d) is a metric space.
   • The entire metric space M is closed and bounded (why?).
   • {D(en, 1) : n = 1, 2, · · · } is an open cover of M but it has no finite subcover (why?). Hence, M is not compact.
Theorem. (3.1.3; Bolzano-Weierstrass theorem)
A ⊂ M is compact iff A is sequentially compact.
• Lemma 1: Let A ⊂ M. If A is compact, then A is closed.
  Proof. We will show M \ A is open. Let x ∈ M \ A.
  1. A ⊂ ∪_{n=1}^∞ Un where Un := {y ∈ M : d(y, x) > 1/n} is an open set (every y ∈ A has d(y, x) > 0).
  2. Since A is compact and {Un} covers A, ∃ a finite subcover, that is, ∃N s.t. A ⊂ ∪_{n=1}^N Un = UN.
  3. Hence, D(x, 1/N) ⊂ (UN)^c ⊂ A^c = M \ A, and therefore M \ A is open.
• Lemma 2: Let A ⊂ B ⊂ M. If B is compact and A is closed, then A is compact.
  Proof. Let {Ui} be an open cover of A.
  1. Set V = M \ A. Note that V is open.
  2. Thus {Ui, V} is an open cover of B.
  3. Since B is compact, B has a finite subcover, say {U1, · · · , UN, V}. Hence, A ⊂ U1 ∪ · · · ∪ UN.
• Lemma 4: If A is sequentially compact, then A is totally bounded.
  1. Definition of totally bounded: A ⊂ M is totally bounded if ∀ε > 0, ∃ a finite set {x1, · · · , xN} ⊂ M s.t. A ⊂ ∪_{i=1}^N D(xi, ε).
  2. Proof. If not, then for some ε > 0 we cannot cover A with finitely many ε-disks.
     (i) Choose x1 ∈ A and x2 ∈ A \ D(x1, ε).
     (ii) By assumption, we can repeat: choose xn ∈ A \ ∪_{i=1}^{n−1} D(xi, ε) for n = 2, 3, · · · .
     (iii) This seq. {xn} satisfies d(xn, xm) ≥ ε for all n ≠ m.
     (iv) Hence, xn has no convergent subseq., a contradiction.
• Summary. Let A ⊂ M.
  • A is compact ⇒ A is closed.
  • A closed subset of a compact set is compact.
  • A is sequentially compact ⇒ A is totally bounded.
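The construction in the proof of Lemma 4, read forwards, is the standard greedy way to build an ε-net: keep picking a point not covered by the disks chosen so far. A sketch on a concrete grid in [0, 1]² (the grid and ε = 0.3 are illustrative choices, not from the text):

```python
# Greedy eps-net: pick any uncovered point of A as a new center, repeat.
# For a totally bounded set this loop stops after finitely many centers.
def greedy_net(A, eps, dist):
    centers = []
    for x in A:
        if all(dist(x, c) >= eps for c in centers):
            centers.append(x)   # x is not yet covered: make it a center
    return centers

A = [(i / 10, j / 10) for i in range(11) for j in range(11)]   # grid in [0,1]^2
d = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
net = greedy_net(A, 0.3, d)
# every point of A lies within eps of some center, with far fewer centers
assert all(any(d(x, c) < 0.3 for c in net) for x in A)
assert len(net) < len(A)
```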
Proof of B-W thm (⇒): If A is compact, then A is sequentially compact.
Let A be compact and let {xn} be a seq. in A.
1. To derive a contradiction, assume that {xn} has no convergent subseq.
2. Then {xn} takes infinitely many distinct values {yk}, and the set {yk} has no accumulation point. (Why? If it had one, ∃ a convergent subseq.)
3. Hence, ∃ some neighborhood Uk of yk containing no other yi.
4. {yk} is closed because it has no accumulation points. Hence, {yk} is compact by Lemma 2 (a closed subset of the compact set A is compact).
5. But {Uk} is an open cover of {yk} that has no finite subcover, a contradiction.
6. Hence, xn has a convergent subsequence; its limit lies in A, since A is closed by Lemma 1.
Hence, xn has a subsequence that converges to a point in A.
Proof of B-W thm (⇐): If A is sequentially compact, then A is compact.
Suppose {Ui} is an open cover of A. We need to prove that {Ui} has a finite subcover.
• Claim: ∃r > 0 s.t. ∀y ∈ A, D(y, r) ⊂ Ui for some Ui. Why?
  1. If not, ∃yn ∈ A s.t. D(yn, 1/n) is not contained in any Ui.
  2. By assumption, {yn} has a convergent subseq., say ynk → z ∈ A. Since z ∈ A ⊂ ∪iUi, z ∈ Ui0 for some Ui0.
  3. Since Ui0 is open, ∃ε > 0 s.t. D(z, ε) ⊂ Ui0.
  4. Since ynk → z, ∃N = nk0 ≥ 2/ε s.t. yN ∈ D(z, ε/2).
  5. But then D(yN, 1/N) ⊂ D(z, ε) ⊂ Ui0 (why?), a contradiction.
• Since A is totally bounded (see Lemma 4), we can write A ⊂ D(y1, r) ∪ · · · ∪ D(yn, r) for finitely many yi.
• Since each D(yk, r) ⊂ Uik for some Uik, A ⊂ Ui1 ∪ · · · ∪ Uin, a finite subcover. Hence, A is compact.
Theorem. (3.1.5; Compact ⇔ Complete and Totally Bounded)
Let A ⊂ M. A is compact iff A is complete and totally bounded.
(Proof of ⇒) Assume A is compact.
1. A is compact ⇒ totally bounded & sequentially compact.
2. A is sequentially compact ⇒ A is complete.
(Proof of ⇐) Assume A is complete and totally bounded. It suffices to prove that A is sequentially compact. Assume that {yn} is a sequence in A.
1. We may assume that the yk are all distinct. (Why? If not, ...)
2. Since A is totally bounded, for each k = 1, 2, · · · , ∃xk1, · · · , xkLk ∈ M s.t. A ⊂ D(xk1, 1/k) ∪ · · · ∪ D(xkLk, 1/k).
3. Next page...
Theorem. (3.1.5; continued)
(Proof of ⇐, continued)
3. For k = 1, infinitely many yn lie in one of the disks D(x1j, 1). Hence, we can select a subseq. {y11, y12, · · · } lying entirely in one of these disks.
4. Repeat the previous step for k = 2 to obtain a subseq. {y21, y22, · · · } of {y11, y12, · · · } lying entirely in one of the disks D(x2j, 1/2).
5. Now choose the diagonal subsequence y11, y22, y33, · · · . This sequence is Cauchy because, for i ≤ j, both yii and yjj lie in one disk of radius 1/i, so d(yii, yjj) ≤ 2/min{i, j}.
6. Since A is complete, yii converges to a point in A.
Theorem. (3.2.1, Heine-Borel thm.)
Let A ⊂ M = Rn. A is compact iff A is closed and bounded.
Proof.
• Recall Thm 3.1.5: A is compact iff A is complete and totally bounded.
• In M = Rn, A is complete iff A is closed (a closed subset of the complete space Rn is complete).
• Since M = Rn is a Euclidean space,
  A is bounded ⇔ A is totallybounded.
Caution: If M is not a Euclidean space, the last equivalence need not hold. See Example 3.1.8 for a set A that is bounded but not totally bounded.
Theorem. (3.3.1: Nested Set Property)
Let Fk be a sequence of compact non-empty sets in a metric space M such that F1 ⊇ F2 ⊇ F3 ⊇ · · · . Then ∩_{k=1}^∞ Fk ≠ ∅.
Proof.
1. For each n, choose xn ∈ Fn.
2. Since {xn} ⊂ F1 and F1 is compact, ∃ a subseq. {xnk} that converges to some point z ∈ F1, that is, xnk → z ∈ F1.
3. Relabeling (passing to the subsequence), we may assume that xn → z. (Why is this harmless?)
4. n > N =⇒ xn ∈ Fn ⊂ FN.
5. Since limj→∞ xN+j = z, xN+j ∈ FN, and FN is compact (hence closed), it must be that
   z ∈ FN, N = 1, 2, 3, · · · .
This completes the proof.
Definition. (Path-Connected Sets)
• φ : [a, b] → M is said to be continuous if
  tk ∈ [a, b], tk → t =⇒ φ(tk) → φ(t).
• A continuous path joining x, y ∈ M is a continuous mapping φ : [a, b] → M such that φ(a) = x, φ(b) = y.
• A ⊂ M is said to be path-connected if for any x, y ∈ A, there exists a continuous path φ : [a, b] → M joining x and y such that φ([a, b]) ⊂ A.
Definition. (3.5.1: Separate, Connected Sets)
Let A be a subset of a metric space M.
• Two open sets U, V are said to separate A if
  1. U ∩ V ∩ A = ∅,
  2. U ∩ A ≠ ∅ & V ∩ A ≠ ∅,
  3. A ⊂ U ∪ V.
• A is disconnected if such sets U, V exist.
• A is connected if such sets U, V do not exist.
Theorem. (3.3.1)
Path-connected sets are connected.
1. Clearly, [a, b] is connected.
2. To derive a contradiction, suppose A is path-connected but not connected. Then ∃ open sets U, V such that
   (i) U ∩ V ∩ A = ∅ & A ⊂ U ∪ V,
   (ii) ∃x ∈ U ∩ A & ∃y ∈ V ∩ A.
3. Since A is path-connected, ∃ a continuous path φ : [a, b] → M s.t. φ(a) = x, φ(b) = y, φ([a, b]) ⊂ A.
4. By Theorem 4.2.1, which we will learn soon, φ([a, b]) is connected. This is a contradiction since U, V separate φ([a, b]).
Example 3.1
• Show that A := {x ∈ Rn : ‖x‖ ≤ 1} is compact and connected.
Proof.
1. Since A is closed and bounded, A is compact by the Heine-Borel thm.
2. To prove connectedness, let x, y ∈ A.
3. Define φ : [0, 1] → Rn by φ(t) = tx + (1 − t)y. Clearly, φ is a continuous path joining φ(0) = y and φ(1) = x.
4. ‖φ(t)‖ ≤ t‖x‖ + (1 − t)‖y‖ ≤ t + (1 − t) = 1 for t ∈ [0, 1]. Hence, φ([0, 1]) ⊂ A.
5. Hence, A is path-connected, and therefore connected.
Example 3.2
• Let A ⊂ Rn, x ∈ A and y ∈ Rn \ A. Let φ : [0, 1] → Rn be a continuous path joining x and y. Show that ∃t0 s.t. φ(t0) ∈ bd(A).
1. Let t0 = sup{t : φ([0, t]) ⊂ A}. This is well-defined because φ(0) = x ∈ A.
2. If t0 = 1, clearly y = φ(t0) ∈ bd(A).
3. Assume 0 ≤ t0 < 1. From the definition of t0, for n = 1, 2, · · · , ∃tn s.t. t0 ≤ tn ≤ t0 + 1/n & φ(tn) ∈ Ac.
4. Since φ(tn) → φ(t0) and φ(tn) ∈ Ac, we get φ(t0) ∈ cl(Ac); also φ(t0) ∈ cl(A) (why?). Hence φ(t0) ∈ bd(A).
Chapter 4. Continuous Mappings
Throughout this chapter, we assume that M = Rn and N = Rm are Euclidean spaces with the standard metrics
d(x, y) = ‖x − y‖ = √(∑_{j=1}^n (xj − yj)²), x, y ∈ M,
ρ(v, w) = ‖v − w‖ = √(∑_{j=1}^m (vj − wj)²), v, w ∈ N.
Please note that the same symbol ‖ · ‖ may denote a different norm depending on its context.
Throughout this chapter, we also assume that A ⊂ M = Rn and f : A → N = Rm is a mapping.
Definition. (4.1.1: Continuity of f : A → N)
• Suppose that x0 ∈ {accumulation points of A}. We write limx→x0 f(x) = b if ∀ε > 0, ∃δ > 0 s.t.
  0 < ‖x − x0‖ < δ & x ∈ A ⇒ ‖f(x) − b‖ < ε.
• Let x0 ∈ A. We say that f is continuous at x0 if either x0 ∉ {accumulation points of A} or limx→x0 f(x) = f(x0).
• Let B ⊂ A. f is called continuous on B if f is continuous at each point of B. If B = A, we just say that f is continuous.
Theorem. (4.1.4: Continuity of f : A → N)
The following assertions are equivalent.
1. f is continuous on A
2. For every convergent seq. xk → x0 in A, we have f(xk) → f(x0).
3. For each open set U in N, f⁻¹(U) is open relative to A; that is, f⁻¹(U) = A ∩ V for some open set V.
4. For each closed set F in N, f⁻¹(F) is closed relative to A; that is, f⁻¹(F) = A ∩ G for some closed set G.
Proof. 1 ⟹ 2 (easy), 2 ⟹ 4 (below), 4 ⟹ 3 (easy), 3 ⟹ 1 (below).
Proof of (2 =⇒ 4)
Let F ⊂ N be closed. We want to prove that f⁻¹(F) is closed relative to A. We begin by reviewing the definition of closed.
1. B is closed iff B = B ∪ {accumulation points of B}.
2. B is closed iff for every sequence {xk} ⊂ B with xk → x0, we necessarily have x0 ∈ B.
3. B ⊂ A is closed relative to A iff B = (B ∪ {accumulation points of B}) ∩ A.
4. B ⊂ A is closed relative to A iff for every sequence {xk} ⊂ B with xk → x0 ∈ A, we necessarily have x0 ∈ B.
5. Proof of (2 ⟹ 4). Let xk ∈ f⁻¹(F) and let xk → x0 ∈ A. By 2, f(xk) → f(x0). Since F is closed, f(x0) ∈ F. ∴ x0 ∈ f⁻¹(F). ∴ f⁻¹(F) is closed relative to A.
Proof of (3 =⇒ 1)
For given x0 ∈ A and ε > 0, we must find δ > 0 such that
‖x − x0‖ < δ & x ∈ A (i.e. x ∈ D(x0, δ) ∩ A) ⇒ ‖f(x) − f(x0)‖ < ε (i.e. f(x) ∈ D(f(x0), ε)).
1. Since D(f(x0), ε) is open, by 3, f⁻¹(D(f(x0), ε)) is open relative to A.
   ∴ f⁻¹(D(f(x0), ε)) = A ∩ V for some open set V.
2. Since x0 ∈ V and V is open, ∃δ > 0 s.t. D(x0, δ) ⊂ V.
3. Hence, D(x0, δ) ∩ A ⊂ f⁻¹(D(f(x0), ε)), and this completes the proof.
Theorem. (4.2.1: f (connected) is connected if f ∈ C (M))
Suppose that f : M → N is continuous and let K ⊂ M.
(i) If K is connected, so is f (K ).
(ii) If K is path-connected, so is f (K ).
Proof of (i). Suppose f(K) is not connected.
1. From the definition of disconnectedness, ∃ open U, V s.t.
   f(K) ⊂ U ∪ V, U ∩ V ∩ f(K) = ∅, U ∩ f(K) ≠ ∅, V ∩ f(K) ≠ ∅.
2. Since f is continuous, f⁻¹(U) and f⁻¹(V) are open. Moreover, K ⊂ f⁻¹(U) ∪ f⁻¹(V), f⁻¹(U) ∩ f⁻¹(V) ∩ K = ∅, f⁻¹(U) ∩ K ≠ ∅, f⁻¹(V) ∩ K ≠ ∅.
3. Hence, K is disconnected, a contradiction.
Proof of (ii). If K is path-connected, so is f (K ).
1. Let v, w ∈ f(K) and let x, y ∈ K s.t. f(x) = v, f(y) = w.
2. Since K is path-connected, ∃ a continuous curve c : [0, 1] → M s.t.
   c(t) ∈ K (0 ≤ t ≤ 1), c(0) = x, c(1) = y.
3. Since f is continuous, it is easy to show that c̃(t) := f(c(t)) ∈ f(K) for 0 ≤ t ≤ 1 and that c̃ : [0, 1] → N is a continuous path joining v and w.
4. Hence, f(K) is path-connected.
Theorem. (4.2.2: f (compact) is compact if f ∈ C (M))
Suppose that f : M → N is continuous and K ⊂ M is compact. Then f(K) is compact.
Proof. It suffices to prove that f(K) is sequentially compact.
1. Let vn ∈ f(K) and let xn ∈ K s.t. f(xn) = vn.
2. Since K is compact, ∃ a convergent subsequence, say xnk → x0 ∈ K.
3. Since f is continuous, vnk = f(xnk) → f(x0) ∈ f(K). This proves that f(K) is sequentially compact.
Examples
Let f : R2 → R be a continuous map. Denote x = (x1, x2).
• Let f(x) = x1 for x ∈ R2. If K ⊂ R2 is compact, so is f(K) = {x1 : x = (x1, x2) ∈ K}. (Why? Since f is continuous and K is compact, f(K) is compact.)
• Let f(x) = 7 for x ∈ R2. The set {7} is compact, while R2 = f⁻¹({7}) is not compact. (So the preimage of a compact set under a continuous map need not be compact.)
• The set A = {f(x) : ‖x‖ = 1} is a closed interval. (Why? K = {x ∈ R2 : ‖x‖ = 1} is compact and connected. Hence, A = f(K) is compact and connected.)
Theorem.
(1) Let f : A ⊂ M → R and g : A ⊂ M → R be continuous at x0. Then
• f ± αg is continuous at x0 for any α ∈ R.
• fg is continuous at x0.
• f/g is continuous at x0 if g(x0) ≠ 0.
(2) Suppose f : A ⊂ M → N and h : B ⊂ N → Rp are continuous and f(A) ⊂ B. Then h ◦ f : A ⊂ M → Rp is also continuous.
Proof. EASY
Theorem. (4.4.1: Maximum-Minimum Principle)
Let f : A ⊂ M → R be continuous and let K be a compact subset of A. Then,
• f (K ) is bounded.
• ∃x0, y0 ∈ K such that
  f(x0) = inf f(K) = inf_{x∈K} f(x) & f(y0) = sup f(K) = sup_{x∈K} f(x).
Proof. Since K is compact and f is continuous on K ⊂ A, f(K) is compact. Hence, f(K) is closed and bounded in R by the Heine-Borel thm. Being closed and bounded, f(K) contains both inf f(K) and sup f(K). This completes the proof.
Theorem. (4.5.1: Intermediate Value Theorem)
Let f : A ⊂ M → R be continuous. Assume K is a connected subset of A, x, y ∈ K, and f(x) < f(y). Then,
• For every number c ∈ R such that f (x) < c < f (y),
∃ z ∈ K s.t. f (z) = c
Proof. Since K is connected and f is continuous on K ⊂ A, f(K) is connected. A connected subset of R is an interval, so [f(x), f(y)] ⊂ f(K).
∴ ∃z ∈ K s.t. f(z) = c. This completes the proof.
4.6 Uniform Continuity
Throughout this section, we assume that f : A ⊂ Rn → Rm iscontinuous.
• Definition. Let B ⊂ A. f is uniformly continuous on B if for every ε > 0, there is δ > 0 s.t.
  ‖x − y‖ < δ & x, y ∈ B ⇒ ‖f(x) − f(y)‖ < ε.
• Example. Consider f : R → R, f(x) = x². Then f is continuous on R, but it is not uniformly continuous. Why? Let xn = n + 1/n and yn = n. Then |xn − yn| = 1/n → 0, while |f(xn) − f(yn)| = 2 + 1/n² ≥ 2.
• Example. Consider f : (0, 1) → R, f(x) = 1/x. Then f is continuous on (0, 1), but it is not uniformly continuous. Why? Let xn = 1/n. Then |xn+1 − xn| < 1/n → 0, while |f(xn+1) − f(xn)| = 1.
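The witness sequences from the first example can be checked directly (a numeric sketch; the tiny tolerance only absorbs floating-point rounding):

```python
# f(x) = x^2: x_n = n + 1/n and y_n = n get arbitrarily close,
# yet |f(x_n) - f(y_n)| = 2 + 1/n^2 stays >= 2, so f is not uniformly continuous.
f = lambda x: x * x
for n in range(1, 200):
    xn, yn = n + 1 / n, n
    assert abs(xn - yn) <= 1 / n + 1e-9   # the points squeeze together
    assert abs(f(xn) - f(yn)) > 1.9       # the images stay far apart
```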
Theorem. (Uniform Continuity Theorem)
Let f : A ⊂ Rn → Rm be continuous and let K ⊂ A be compact. Then f is uniformly continuous on K.
Proof.
1. Let ε > 0 be given. Since f is continuous on K, for each x ∈ K, ∃δx > 0 s.t. f(D(x, δx) ∩ K) ⊂ D(f(x), ε/2).
2. Since K ⊂ ∪_{x∈K} D(x, δx/2) and K is compact, ∃{x1, · · · , xN} ⊂ K s.t. K ⊂ ∪_{j=1}^N D(xj, δxj/2). Let δ = (1/2) min{δx1, · · · , δxN}.
3. If ‖x − y‖ < δ and x, y ∈ K, then ∃xj s.t. ‖x − xj‖ < δxj/2. Since ‖y − xj‖ ≤ ‖y − x‖ + ‖x − xj‖ < δ + δxj/2 ≤ δxj, both x and y lie in D(xj, δxj), so
   ‖f(x) − f(y)‖ ≤ ‖f(x) − f(xj)‖ + ‖f(xj) − f(y)‖ < ε/2 + ε/2 = ε.
Chapter 5. Uniform Convergence
This chapter deals with very important results in physical science:
• a basic iteration technique called the contraction mapping principle (5.7.1)
• some applications to differential and integral equations and to some problems in control theory (5.7.2, 5.7.3, 5.7.10)
To study such results, we need
• compactness in a complete metric space (5.5.3)
• uniform convergence, equi-continuity (5.6.2)
Definition. (Pointwise convergence & Uniform Convergence)
Let N be a metric space with the metric ρ, A a set, and fk : A → N, k = 1, 2, · · · .
• fk → f pointwise if for each x ∈ A, limk→∞ fk(x) = f(x), i.e.
  ∀x ∈ A, limk→∞ ρ(fk(x), f(x)) = 0.
• fk → f uniformly if limk→∞ supx∈A ρ(fk(x), f(x)) = 0, i.e.
  ∀ε > 0, ∃N s.t. k > N ⇒ supx∈A ρ(fk(x), f(x)) < ε.
Examples:
• fk(x) = x^k → 0 pointwise in (0, 1). (Why?)
• fk(x) = x^k does NOT converge to 0 uniformly in (0, 1), since supx∈(0,1) x^k = 1 for every k.
• Show that fn(x) = x^n/(1 + x^n) converges pointwise on [0, 2] but that the convergence is not uniform.
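The gap between the two modes of convergence is visible numerically. A sketch for fk(x) = x^k on a finite grid (a finite sample of (0, 1), so the computed max only approximates the sup from below):

```python
# Pointwise vs uniform: x^k -> 0 at each fixed x in (0, 1),
# but sup over (0, 1) of x^k equals 1 for every k.
grid = [j / 1000 for j in range(1, 1000)]    # sample of (0, 1)
for k in (1, 5, 50):
    assert max(x ** k for x in grid) > 0.9   # sup stays near 1: not uniform
assert 0.5 ** 50 < 1e-15                     # but each fixed x tends to 0
```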
Definition. (5.1.3: Does ∑_k gk make sense?)
Denote fn(x) = ∑_{k=1}^n gk(x).
• ∑_k gk = f pointwise if fn → f pointwise.
• ∑_k gk = f uniformly if fn → f uniformly.
Examples.
• ∑_{k=0}^∞ (−1)^k x^{2k+1}/(2k + 1)! = sin x uniformly on the interval [−100, 100].
• ∑_k x^k = 1/(1 − x) converges uniformly on [−0.9, 0.9].
• ∑_k x^k = 1/(1 − x) converges pointwise (NOT uniformly) on (−1, 1).
• ∑_k x^k does not converge on R \ (−1, 1).
The Weierstrass M-test
Theorem. (5.2.1: Cauchy Criterion)
Let V be a complete normed vector space with norm ‖ · ‖, and let A be a set. Let fk : A → V be a sequence of functions. Then fk converges uniformly on A iff
∀ε > 0, ∃N s.t. l, k > N ⇒ supx∈A ‖fk(x) − fl(x)‖ < ε.
Proof of ⇒.
1. Assume fk → f uniformly. Let ε > 0 be given.
2. Then ∃N s.t. k > N ⇒ ‖fk − f‖ := supx∈A ‖fk(x) − f(x)‖ < ε/2.
3. Hence, l, k > N ⇒ ‖fk − fl‖ ≤ ‖fk − f‖ + ‖fl − f‖ < ε/2 + ε/2 = ε.
Theorem. (5.2.1; continued)
Proof of ⇐.
1. From the assumption, fk(x) is a Cauchy sequence in V for each x ∈ A.
2. Hence, for each x ∈ A, limk fk(x) exists and we can define f(x) = limk fk(x).
3. Let ε > 0 be given. From the assumption, ∃N s.t. l, k > N ⇒ supx∈A ‖fk(x) − fl(x)‖ < ε/2.
4. From 2, ∀x ∈ A, ∃Nx s.t. l > Nx ⇒ ‖f(x) − fl(x)‖ < ε/2.
5. From 3 and 4, if k > N and x ∈ A, then, choosing any l > max{N, Nx},
   ‖fk(x) − f(x)‖ ≤ ‖fk(x) − fl(x)‖ + ‖fl(x) − f(x)‖ < ε/2 + ε/2 = ε.
6. From 5, k > N ⇒ supx∈A ‖fk(x) − f(x)‖ ≤ ε.
Theorem. (5.2.2: Weierstrass M-test)
Let V be a complete normed vector space with norm ‖ · ‖, and let A be a set. Suppose that gk : A → V are functions such that supx∈A ‖gk(x)‖ ≤ Mk and ∑_{k=1}^∞ Mk < ∞. Then ∑_{k=1}^∞ gk converges uniformly.
Proof.
1. Denote fn(x) = ∑_{k=1}^n gk(x).
2. Then ‖fn+ℓ(x) − fn(x)‖ = ‖∑_{k=n+1}^{n+ℓ} gk(x)‖ ≤ ∑_{k=n+1}^{n+ℓ} Mk.
3. Since limn→∞ ∑_{k=n}^∞ Mk = 0, it follows from 2 and Theorem 5.2.1 that fn converges uniformly.
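A numeric sketch of the M-test bound for the geometric series on [−0.9, 0.9] (the grid and the truncation order n = 100 are illustrative choices): the uniform error of the partial sum is controlled by the tail ∑_{k>n} Mk = 0.9^{n+1}/0.1.

```python
# g_k(x) = x^k on [-0.9, 0.9] with M_k = 0.9^k: the partial sums of 1/(1 - x)
# converge uniformly, with error bounded by the tail of sum M_k.
grid = [j / 100 for j in range(-90, 91)]
n = 100
partial = lambda x: sum(x ** k for k in range(n + 1))
err = max(abs(partial(x) - 1 / (1 - x)) for x in grid)
tail_bound = 0.9 ** (n + 1) / 0.1      # sum_{k > n} 0.9^k
assert err <= tail_bound + 1e-9
assert tail_bound < 3e-4               # the uniform error is already tiny
```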
5.5 The space of continuous functions
Throughout this section, we assume M = Rm, A ⊂ M, and N = Rn (N, M: complete normed spaces).
• Denote C(A, N) = {f : A → N | f is continuous}. Then C(A, N) is a vector space.
• f ∈ C(A, N) is said to be bounded if there is a constant C such that ‖f(x)‖ < C for all x ∈ A.
• Denote Cb(A, N) = {f ∈ C(A, N) : f is bounded}.
• Define ‖f‖ = supx∈A ‖f(x)‖.
• ‖f‖ is a measure of the size of f and is called the norm of f.
Theorem. (5.5.1-3: Cb(A, N) is a complete normed space)
Let A ⊂ M = Rm, N = Rn. The set Cb(A, N) is a complete normed space equipped with the norm ‖f‖ = supx∈A ‖f(x)‖; that is,
1. Cb(A, N) is a normed space:
   • ‖f‖ ≥ 0 and ‖f‖ = 0 iff f = 0.
   • ‖αf‖ = |α|‖f‖ for α ∈ R, f ∈ Cb.
   • ‖f + g‖ ≤ ‖f‖ + ‖g‖.
2. Completeness: Every Cauchy sequence {fk} in Cb(A, N) converges to a function f ∈ Cb(A, N), that is,
   limk→∞ ‖fk − f‖ = limk→∞ supx∈A ‖fk(x) − f(x)‖ = 0.
• Clearly, Cb(A, N) is a normed space. (EASY!)
• From the definition, fk → f uniformly iff fk → f in Cb.
• From Cauchy criterion (Theorem 5.2.1), Cb(A, N) is complete.
Examples
• Let B = {f ∈ C([0, 1], R) : f(x) > 0 for all x ∈ [0, 1]}. Show that B is open in C([0, 1], R).
Proof.
1. In order to prove that B is open, we must show that ∀f ∈ B, ∃ε > 0 s.t. D(f, ε) ⊂ B.
2. Let f ∈ B. Since [0, 1] is compact, f attains a minimum value, say m > 0, at some point of [0, 1]. Hence, infx∈[0,1] f(x) = m.
3. Let ε = m/2. We will show D(f, ε) ⊂ B.
   Proof. If g ∈ D(f, ε), then ‖g − f‖ < ε, and
   ∴ g(x) ≥ f(x) − |g(x) − f(x)| ≥ f(x) − ‖f − g‖ > m − ε = m/2 > 0
   for all x ∈ [0, 1]. Hence, g ∈ B. ∴ D(f, ε) ⊂ B.
• Prove that cl(B) = D := {f ∈ C([0, 1], R) : infx∈[0,1] f(x) ≥ 0}.
Proof.
1. D is closed: if fn ∈ D and fn → f uniformly, then fn(x) → f(x) pointwise and ∴ infx∈[0,1] f(x) ≥ 0.
2. If f ∈ D, then fn(x) := f(x) + 1/n ∈ B and ‖fn − f‖ = 1/n → 0, so D ⊂ cl(B).
∴ B ⊂ D ⊂ cl(B) and D is closed, hence cl(B) = D.
Examples
• Consider a sequence fn ∈ Cb such that ‖fn+1 − fn‖ ≤ rn, where ∑ rn is convergent. Prove that fn converges.
Proof.
1. Let ε > 0 be given.
2. Since ∑ rn is convergent, ∃N s.t. n > N ⇒ ∑_{k=n}^∞ rk < ε.
3. Hence, if n > N, then
   ‖fn+k − fn‖ = ‖∑_{j=n}^{n+k−1} (fj+1 − fj)‖ ≤ ∑_{j=n}^{n+k−1} ‖fj+1 − fj‖ ≤ ∑_{j=n}^∞ rj < ε.
4. From 3, fn is a Cauchy sequence in the complete space Cb, so it converges.
Arzela-Ascoli Theorem
Throughout this section, we assume that M = Rm, A ⊂ M, N = Rn (N, M: complete normed spaces).
Definition. (5.6.1: Equi-continuous)
Assume B ⊂ C(A, N).
• We say that B is equi-continuous if
  ∀ε > 0, ∃δ > 0 s.t. ‖x − y‖ < δ & x, y ∈ A ⇒ sup_{f∈B} ‖f(x) − f(y)‖ < ε.
• We say B is pointwise compact iff Bx = {f(x) : f ∈ B} is compact in N for each x ∈ A.
Example 5.6.4 (Compact sequence)
Let fn ⊂ Cb([0, 1], R) be such that f′n exists and
sup_n ‖fn‖ ≤ C & sup_n (sup_{x∈(0,1)} |f′n(x)|) ≤ C
for a positive constant C. Prove that B := {fn} is equi-continuous.
Proof.
• By the mean value theorem,
  |fn(x) − fn(y)| ≤ sup_{z∈(0,1)} |f′n(z)| · |x − y| ≤ C|x − y| for all n.
• Hence, for given ε > 0, we can choose δ = ε/C, and then
  |x − y| < δ & x, y ∈ [0, 1] ⇒ sup_n |fn(x) − fn(y)| ≤ C|x − y| < ε.
Hence, B := {fn} is equi-continuous. (So {fn} has a uniformly convergent subsequence. Why? See the Arzela-Ascoli theorem.)
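The uniform Lipschitz bound from the mean value theorem can be spot-checked on a concrete family. The choice fn(x) = sin(nx)/n below is an illustrative assumption, not from the text (it satisfies |f′n| ≤ 1, i.e. C = 1):

```python
import math

# f_n(x) = sin(nx)/n has |f_n'(x)| = |cos(nx)| <= 1, so the MVT bound gives
# |f_n(x) - f_n(y)| <= |x - y| uniformly in n: the family is equi-continuous.
for n in range(1, 100):
    f = lambda t: math.sin(n * t) / n
    for x, y in [(0.1, 0.11), (0.5, 0.52), (0.9, 0.93)]:
        assert abs(f(x) - f(y)) <= abs(x - y) + 1e-12
```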
Theorem. (5.6.2: Arzela-Ascoli theorem)
Let A be compact and B ⊂ C(A, N). If B is closed, equi-continuous, and pointwise compact, then B is compact; that is, any sequence fn in B has a uniformly convergent subsequence (with limit in B).
The proof strategy is based on the Bolzano-Weierstrass property.
Theorem. (Special case of Arzela-Ascoli theorem)
Let B ⊂ C([0, 1], R). If B is closed, equi-continuous, and bounded, then B is compact.
Proof.
1. Assume fn is a sequence in B.
2. Denote C1/n = {1/n, 2/n, · · · , (n−1)/n, 1}. Let C = ∪n C1/n.
3. Since C is countable, we can write C = {x1, x2, · · · }.
4. Since Bx1 is compact, ∃ a convergent subsequence of fn(x1). Let us denote this subsequence by
   f11(x1), f12(x1), · · · , f1k(x1), · · ·
5. Similarly, the sequence f1k(x2) has a subsequence
   f21(x2), f22(x2), · · · , f2k(x2), · · · which is convergent.
6. We proceed in this way and then set gn = fnn.
Proof of Arzela-Ascoli theorem
7. gn = fnn is obtained by picking out the diagonal:
   f11 f12 f13 · · · f1n · · · (1st subseq.)
   f21 f22 f23 · · · f2n · · · (2nd subseq.)
   ...
   fn1 fn2 fn3 · · · fnn · · · (n-th subseq.)
8. From the construction by the diagonal process,
   lim_{n→∞} gn(xi) exists for all xi ∈ C.
9. Now, we are ready to prove
   ‖gn − gm‖ = sup_{x∈[0,1]} |gn(x) − gm(x)| → 0 as m, n → ∞.
Continue...
9. Proof of lim_{n,m→∞} sup_{x∈A} |gn(x) − gm(x)| = 0.
a. Let ε > 0 be given.
b. From equi-continuity of {gn} ⊂ B, we can choose δ s.t.
   |x − y| < δ & x, y ∈ A = [0, 1] ⇒ sup_n |gn(x) − gn(y)| < ε/3.
c. Choose L ≥ 1/δ. From 8,
   ∃ N s.t. n, m > N ⇒ sup_{xi∈C_{1/L}} |gn(xi) − gm(xi)| < ε/3.
d. For each x ∈ A, there exists yj ∈ C_{1/L} s.t. |x − yj| < δ. Therefore, if n, m > N, then
   |gn(x) − gm(x)| ≤ |gn(x) − gn(yj)| + |gn(yj) − gm(yj)| + |gm(x) − gm(yj)| < ε/3 + ε/3 + ε/3 = ε.
This proves lim_{n,m→∞} sup_{x∈A} |gn(x) − gm(x)| = 0.
Continue...
10. From 9, gn is a Cauchy sequence in C([0, 1], N).
11. Since C([0, 1], N) is a complete normed space, gn converges to some g ∈ C([0, 1], N).
12. Since B is closed, it must be that g ∈ B.
13. From 1, 11, and 12, B is sequentially compact, so it is compact.
♣ ♣ ♣ ♣ ♣ ♣ ♣
The proof of the Arzela-Ascoli theorem is exactly the same as the special case discussed above except for step 2. To replace step 2, we use the fact that the compact set A is totally bounded: for each δ > 0, there exists a finite set Cδ = {y1, · · · , yk} such that A ⊂ ∪_{j=1}^k D(yj, δ).
5.7 The contraction mapping principle
Theorem. (5.7.1: Contraction mapping principle)
Let M be a complete normed space and Φ : M → M a given mapping. Assume
  ∃ k ∈ [0, 1) s.t. ‖Φ(f) − Φ(g)‖ ≤ k ‖f − g‖ for all f, g ∈ M.
Then there exists a unique fixed point f∗ ∈ M s.t. Φ(f∗) = f∗. In fact, if f0 ∈ M and fn+1 = Φ(fn), n = 0, 1, 2, · · · , then
  lim_{n→∞} ‖fn − f∗‖ = 0.
Key idea: Φ shrinks distances:
  ‖fn+1 − fn‖ = ‖Φ(fn) − Φ(fn−1)‖ ≤ k ‖fn − fn−1‖ ≤ · · · ≤ k^n ‖f1 − f0‖.
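The iteration fn+1 = Φ(fn) can be tried numerically. A minimal sketch (illustrative, not from the text): Φ(x) = cos x maps [cos 1, 1] into itself with |Φ′| ≤ sin 1 < 1 there, so it is a contraction and the iteration converges to the unique fixed point.

```python
import math

def fixed_point(phi, f0, tol=1e-12, max_iter=1000):
    """Iterate f_{n+1} = phi(f_n) until successive iterates are within tol."""
    f = f0
    for _ in range(max_iter):
        f_next = phi(f)
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    raise RuntimeError("no convergence")

# Phi(x) = cos x is a contraction near its fixed point, so the iteration
# converges to the unique x* with cos x* = x* (the Dottie number, ~0.7391).
x_star = fixed_point(math.cos, 1.0)
print(x_star)
print(abs(math.cos(x_star) - x_star) < 1e-10)  # True: x_star is a fixed point
```

The same driver works for any contraction Φ on a complete space once a suitable discretization is chosen.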
The proof of the contraction mapping principle: ∃ f∗ ∈ M s.t. Φ(f∗) = f∗
1. Let f0 ∈ M and fn+1 = Φ(fn), n = 0, 1, 2, · · · .
2. ‖f2 − f1‖ = ‖Φ(f1) − Φ(f0)‖ ≤ k ‖f1 − f0‖.
3. ‖f3 − f2‖ = ‖Φ(f2) − Φ(f1)‖ ≤ k ‖f2 − f1‖ ≤ k² ‖f1 − f0‖.
4. Inductively, ‖fn+1 − fn‖ ≤ k^n ‖f1 − f0‖.
5. Hence, ∑_{n=0}^∞ ‖fn+1 − fn‖ ≤ ‖f1 − f0‖ ∑_{n=0}^∞ k^n = ‖f1 − f0‖ · 1/(1−k) < ∞.
6. From the proof in Example 5.5.6 and 4, fn is a Cauchy sequence.
7. Since M is complete, lim_{n→∞} fn = f∗ for some f∗ ∈ M.
8. Φ is uniformly continuous because ‖Φ(f) − Φ(g)‖ ≤ k ‖f − g‖.
9. From 8, lim_{n→∞} Φ(fn) = Φ(f∗).
10. Hence, f∗ = lim_{n→∞} fn+1 = lim_{n→∞} Φ(fn) = Φ(f∗).
The proof of the contraction mapping principle: uniqueness of the fixed point f∗
11. To prove uniqueness, assume g∗ is another fixed point, i.e., Φ(g∗) = g∗.
12. Then f∗ − g∗ = Φ(f∗) − Φ(g∗) and
   ‖f∗ − g∗‖ = ‖Φ(f∗) − Φ(g∗)‖ ≤ k ‖f∗ − g∗‖.
   Hence, (1 − k) ‖f∗ − g∗‖ ≤ 0.
13. Since 0 ≤ k < 1, it must be that
   ‖f∗ − g∗‖ = 0.
   Hence, f∗ = g∗.
Theorem. (5.7.2: Existence of sol'n of differential equations)
Let A ⊂ R² be an open neighborhood of (t0, x0). Assume f : A → R is a continuous function satisfying the following Lipschitz condition:
  |f(t, x1) − f(t, x2)| ≤ K |x1 − x2| for all (t, x1), (t, x2) ∈ A.
Then, there is a δ > 0 s.t. the equation
  dx(t)/dt = f(t, x), x(t0) = x0
has a unique C¹-solution x = φ(t) with φ(t0) = x0 for t ∈ (t0 − δ, t0 + δ), i.e.,
  φ′(t) = f(t, φ(t)) for all t ∈ (t0 − δ, t0 + δ) & φ(t0) = x0.
C¹-solution = continuously differentiable solution
Get insight: Proof of Theorem 5.7.2
Before the proof, let us get some insight. Imagine that φ is the solution of dx(t)/dt = f(t, x), x(t0) = x0. Since φ′(t) = f(t, φ(t)) with φ(t0) = x0,
  φ(t) = φ(t0) + ∫_{t0}^t φ′(s) ds = x0 + ∫_{t0}^t f(s, φ(s)) ds.
Hence, φ is a fixed point of the map Φ : M → M defined by
  Φ(φ) = x0 + ∫_{t0}^t f(s, φ(s)) ds.
In order to apply the contraction mapping principle, we need to choose a suitable space M. In practice, the solution φ can be obtained from the following iterative method (Picard iteration):
  φ_{n+1}(t) = Φ(φn) = x0 + ∫_{t0}^t f(s, φn(s)) ds  &  φ0 = x0.
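This iterative method can be sketched numerically. A minimal grid version (illustrative, not from the text): replace the integral by a cumulative trapezoidal sum and repeat the map a fixed number of times; grid size and iteration count are arbitrary choices here.

```python
import math

def picard(f, t0, x0, t_end, n_iter=20, n_grid=200):
    """Approximate phi_{n+1}(t) = x0 + int_{t0}^t f(s, phi_n(s)) ds on a grid."""
    h = (t_end - t0) / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    phi = [x0] * (n_grid + 1)          # phi_0 ≡ x0
    for _ in range(n_iter):
        vals = [f(t, p) for t, p in zip(ts, phi)]
        new, acc = [x0], 0.0
        for i in range(1, n_grid + 1): # cumulative trapezoidal integral
            acc += 0.5 * h * (vals[i - 1] + vals[i])
            new.append(x0 + acc)
        phi = new
    return ts, phi

# x' = x, x(0) = 1 has solution e^t (cf. Example 5.7.6).
ts, phi = picard(lambda t, x: x, 0.0, 1.0, 1.0)
print(abs(phi[-1] - math.e))  # small discretization error
```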
Proof of Theorem 5.7.2
1. Let L = sup_{(t,x)∈A′} |f(t, x)|, where A′ ⊂ A is a closed (compact) neighborhood of (t0, x0). Since f is continuous, L < ∞.
2. Choose δ such that Kδ < 1 and
   {(t, x) : |t − t0| < δ, |x − x0| < Lδ} ⊂ A′.
3. Denote C = C([t0 − δ, t0 + δ],R). From Theorem 5.5.3, C is a complete normed space (or Banach space) with norm
   ‖φ‖ = sup_{t∈[t0−δ,t0+δ]} |φ(t)|.
4. Let
   M = {φ ∈ C : φ(t0) = x0 & |φ(t) − x0| ≤ Lδ}.
5. Then, M is also a complete normed space. (Why? M is a closed subset of C w.r.t. the norm ‖ · ‖.)
Proof of Theorem 5.7.2
5. Define Φ : M → C by (please find its motivation on the previous slide)
   Φ(φ) = x0 + ∫_{t0}^t f(s, φ(s)) ds.
6. Claim: φ ∈ M ⇒ Φ(φ) ∈ M.
   Proof. Let φ ∈ M and ψ = Φ(φ).
   • ψ(t0) = x0 and ψ ∈ C because
     lim_{h→0} |ψ(t + h) − ψ(t)| = lim_{h→0} |∫_t^{t+h} f(s, φ(s)) ds| ≤ lim_{h→0} L|h| = 0.
   • From 1,
     |t − t0| ≤ δ ⇒ |ψ(t) − x0| = |∫_{t0}^t f(s, φ(s)) ds| ≤ L |t − t0| ≤ Lδ.
   • Hence, ψ ∈ M.
7. From 6, Φ maps M to M. See the condition of Theorem 5.7.1.
Proof of Theorem 5.7.2
7. Using the Lipschitz condition,
   ‖Φ(φ1) − Φ(φ2)‖ = sup_{t∈[t0−δ,t0+δ]} |∫_{t0}^t f(s, φ1(s)) − f(s, φ2(s)) ds|
                    ≤ sup_{t∈[t0−δ,t0+δ]} |∫_{t0}^t K |φ1(s) − φ2(s)| ds| ≤ δK ‖φ1 − φ2‖.
8. Since δK < 1,
   ‖Φ(φ1) − Φ(φ2)‖ ≤ k ‖φ1 − φ2‖, k = δK ∈ [0, 1).
9. From 5.7.1, ∃ unique φ∗ ∈ M s.t. Φ(φ∗) = φ∗.
Theorem. (5.7.3: Fredholm equation)
Assume that K(x, y) is continuous on [a, b] × [a, b] and
  M = sup_{x,y∈[a,b]} |K(x, y)|.
If |λ| M |b − a| < 1, then the following Fredholm equation has a unique solution in C([a, b],R):
  f(x) = λ ∫_a^b K(x, y) f(y) dy + φ(x), x ∈ [a, b],
where λ ∈ R, φ ∈ C([a, b],R).
Proof. For f ∈ C([a, b],R), we define
  (Φ(f))(x) = λ ∫_a^b K(x, y) f(y) dy + φ(x).
Proof of 5.7.3
1. Claim: Φ maps C([a, b],R) to C([a, b],R).
   Proof. Let f ∈ C([a, b],R). We need to show that Φ(f) is continuous. Let ε > 0 be given.
   • Since [a, b] × [a, b] is compact, K(x, y) is uniformly continuous.
   • Hence, ∃ δ s.t. ‖(x1, y) − (x2, y)‖ < δ & (x1, y), (x2, y) ∈ [a, b] × [a, b] imply |K(x1, y) − K(x2, y)| < ε / (|λ| ‖f‖ |b − a| + 1).
   • If |x1 − x2| < δ and x1, x2 ∈ [a, b], then
     |(Φ(f))(x1) − (Φ(f))(x2)| ≤ |λ| ∫_a^b |K(x1, y) − K(x2, y)| |f(y)| dy ≤ |λ| ‖f‖ |b − a| · ε / (|λ| ‖f‖ |b − a| + 1) < ε.
2. Set k = |λ| M |b − a|. Then k < 1 and
   ‖Φ(f) − Φ(g)‖ = sup_{x∈[a,b]} |λ| |∫_a^b K(x, y)(f(y) − g(y)) dy| ≤ k ‖f − g‖.
3. From 5.7.1, ∃ unique f∗ ∈ C([a, b],R) s.t. Φ(f∗) = f∗.
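The fixed-point iteration behind 5.7.3 can also be carried out numerically. A hypothetical sketch (not from the text): discretize the integral with trapezoidal weights and iterate Φ; the kernel, φ, and λ below are toy choices with a known constant solution.

```python
def solve_fredholm(K, phi, lam, a, b, n=200, n_iter=200):
    """Iterate f <- lam * int_a^b K(x,y) f(y) dy + phi(x) on a uniform grid."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = h / 2                      # trapezoidal weights
    f = [phi(x) for x in xs]                  # start from phi
    for _ in range(n_iter):
        f = [lam * sum(wj * K(x, yj) * fj for wj, yj, fj in zip(w, xs, f))
             + phi(x) for x in xs]
    return xs, f

# Toy case: K ≡ 1, phi ≡ 1 on [0,1], lam = 1/2, so |lam| M |b-a| = 1/2 < 1.
# The solution is constant, f = c with c = c/2 + 1, i.e. c = 2.
xs, f = solve_fredholm(lambda x, y: 1.0, lambda x: 1.0, 0.5, 0.0, 1.0)
print(f[0])  # ≈ 2.0
```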
Theorem. (5.7.4: Volterra integral equation)
Assuming K(x, y) is continuous on [a, b] × [a, b], the Volterra integral equation f(x) = λ ∫_a^x K(x, y) f(y) dy + φ(x) has a unique solution f(x) for any λ.
Proof. For f ∈ C([a, b],R), we define
  (Φ(f))(x) = λ ∫_a^x K(x, y) f(y) dy + φ(x).
1. As in 5.7.3, Φ maps C([a, b],R) to C([a, b],R).
2. Let M = sup_{x,y∈[a,b]} |K(x, y)|. Then,
   |Φ(f)(x) − Φ(g)(x)| = |λ| |∫_a^x K(x, y)(f(y) − g(y)) dy| ≤ |λ| |x − a| M ‖f − g‖.
Proof of 5.7.4
3. From 2,
   |Φ²(f)(x) − Φ²(g)(x)| = |λ| |∫_a^x K(x, y)(Φ(f)(y) − Φ(g)(y)) dy|
                          ≤ |λ| |∫_a^x M |y − a| |λ| M ‖f − g‖ dy|
                          ≤ |λ|² M² (|b − a|²/2!) ‖f − g‖.
4. Inductively, we have
   ‖Φⁿ(f) − Φⁿ(g)‖ ≤ (|λ|ⁿ Mⁿ |b − a|ⁿ / n!) ‖f − g‖.
5. By the ratio test, ∑ |λ|ⁿ Mⁿ |b − a|ⁿ / n! converges.
6. Hence, we can choose N so that |λ|^N M^N |b − a|^N / N! < 1. ∴ Φ^N is a contraction!
Proof of 5.7.4
7. From 6, ∃ unique f∗ ∈ C([a, b],R) s.t. Φ^N(f∗) = f∗.
8. From 7, Φ^{N+1}(f∗) = Φ(f∗).
9. From 8, Φ(f∗) is a fixed point of Φ^N.
10. From 7, 9, and the uniqueness of the fixed point, it must be that f∗ = Φ(f∗).
What a cute idea this is!
Examples
• Example 5.7.5. Let Φ : R → R be defined by Φ(x) = x + 1. Then |Φ(x) − Φ(y)| = |x − y|, which is not ≤ k|x − y| for any k ∈ [0, 1), and Φ has no fixed point.
• Example 5.7.6. Solve x′(t) = x(t), x(0) = 1.
  Solution. Let Φ(φ)(t) = 1 + ∫_0^t φ(s) ds. Let φ0 = 1 and φ_{n+1} = Φ(φn), n = 0, 1, · · · . Then φn(t) = ∑_{k=0}^n t^k/k!. Hence, φn(t) → e^t.
• Example 5.7.7. Solve x′(t) = t x(t) for t near 0 and x(0) = 3.
  Solution. Let Φ(φ)(t) = 3 + ∫_0^t s φ(s) ds. Let φ0 = 3 and φ_{n+1} = Φ(φn), n = 0, 1, · · · . Then φn(t) = 3 ∑_{k=0}^n (t²/2)^k / k!. Hence, φn(t) → 3 e^{t²/2}.
Examples
• Example. Consider the integral equation
  f(x) = a + ∫_0^x x e^{−xy} f(y) dy.
  Check directly on which intervals [0, r] we get a contraction.
  Solution. Let K(x, y) = x e^{−xy} and let Φ(f)(x) = a + ∫_0^x x e^{−xy} f(y) dy. Then
  ‖Φ(f) − Φ(g)‖ = sup_{x∈[0,r]} |∫_0^x K(x, y)(f(y) − g(y)) dy|
                 ≤ sup_{x∈[0,r]} |∫_0^x K(x, y) dy| ‖f − g‖
                 = sup_{x∈[0,r]} |1 − e^{−x²}| ‖f − g‖.
  Since 0 < 1 − e^{−r²} < 1 for any r > 0, Φ is a contraction for any r.
5.8 The Stone-Weierstrass Theorem
The aim of the Weierstrass theorem is to show that any continuous function can be uniformly approximated by a function that has more easily managed properties, such as a polynomial.
Theorem. (5.8.1: Weierstrass-Bernstein)
Let f ∈ C([0, 1],R). There exists a sequence of polynomials pn such that lim_{n→∞} ‖pn − f‖ = 0. In fact,
  pn(x) = ∑_{k=0}^n (n!/(k!(n − k)!)) x^k (1 − x)^{n−k} f(k/n) → f uniformly.
• Meaning of r_k(x) := (n!/(k!(n − k)!)) x^k (1 − x)^{n−k}: imagine a coin with probability x of getting heads and, consequently, with probability 1 − x of getting tails. In n tosses, the probability of getting exactly k heads is that quantity.
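The Bernstein polynomial pn above is easy to evaluate directly. A short sketch (the test function f(x) = |x − 1/2| is my own choice, not from the slides); the uniform error shrinks as n grows, though slowly for this non-smooth f.

```python
from math import comb

def bernstein(f, n, x):
    """Bernstein polynomial p_n(x) = sum_k C(n,k) x^k (1-x)^{n-k} f(k/n)."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)
grid = [i / 100 for i in range(101)]
for n in (10, 100, 1000):
    err = max(abs(bernstein(f, n, x) - f(x)) for x in grid)
    print(n, round(err, 4))   # the error decreases with n
```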
Rough proof: Weierstrass-Bernstein
• ∑_{k=0}^n r_k(x) = 1 and ∑_{k=0}^n (k/n − x)² r_k(x) = x(1 − x)/n. Hence,
  lim_{n→∞} ∑_{|k/n−x|>δ} r_k(x) = 0, for any δ > 0,
and
  lim_{n→∞} ∑_{|k/n−x|<δ} r_k(x) = 1, for any δ > 0.
• Suppose that in a gambling game called n-tosses, f(k/n) dollars is paid out when exactly k heads turn up when n tosses are made. The average amount (after a lo∼∼ong evening of playing n-tosses) paid out when n tosses are made is
  pn(x) = ∑_{k=0}^n r_k(x) f(k/n) ≈ f(x).
The Weierstrass-Bernstein theorem can be applied to C([a, b],R) because
  g ∈ C([a, b],R) ⇒ f(x) = g(x(b − a) + a) ∈ C([0, 1],R).
Theorem. (5.8.2: Stone-Weierstrass)
Let M be a metric space, A ⊂ M a compact set, and B ⊂ C(A,R) satisfy the following:
1. B is an algebra: f, g ∈ B & α ∈ R ⇒ f + g, fg, αf ∈ B.
2. 1 ∈ B.
3. ∀ x, y ∈ A, x ≠ y, ∃ f ∈ B s.t. f(x) ≠ f(y).
Then B is dense in C(A,R), that is, B̄ = C(A,R).
The proof is easy (just technical). I just provide rough insight.
1. Since B is an algebra, f ∈ B ⇒ pn(f) ∈ B.
2. Assume that A is a finite set. Then the proof is trivial.
3. Use the concept of a finite δ-net for the compact set A.
Differentiable Mappings
Definition: Let A be an open set in Rn. A mapping f : A ⊂ Rn → Rm is said to be differentiable at x0 ∈ A if ∃ a linear map (m × n matrix) Df(x0) : Rn → Rm such that
  lim_{x→x0} ‖f(x) − f(x0) − Df(x0)(x − x0)‖ / ‖x − x0‖ = 0.
• Theorem 6.2.2. If f : A ⊂ Rn → Rm is differentiable, then the ∂fj/∂xi exist, and
  Df(x) = [ ∂f1/∂x1  ∂f1/∂x2  · · ·  ∂f1/∂xn
            ∂f2/∂x1   · · ·   · · ·  ∂f2/∂xn
              ...      ...     ...     ...
            ∂fm/∂x1   · · ·   · · ·  ∂fm/∂xn ]   (called the Jacobian matrix)
• 1-Dimension. If f : (a, b) → R is differentiable at x0, then ∃ a number m = f′(x0) such that
  lim_{x→x0} |f(x) − f(x0) − m(x − x0)| / |x − x0| = 0, or equivalently lim_{x→x0} (f(x) − f(x0))/(x − x0) = m.
Thm 6.1.2. If f : A ⊂ Rn → Rm is differentiable at a, then f is continuous at a and Df(a) is uniquely determined.
Proof of uniqueness. Let L1 and L2 be two m × n matrices (or linear mappings) satisfying
  lim_{x→a} ‖f(x) − f(a) − L1(x − a)‖/‖x − a‖ = 0 = lim_{x→a} ‖f(x) − f(a) − L2(x − a)‖/‖x − a‖.
It suffices to prove that ‖L1 ej − L2 ej‖ = 0 for j = 1, · · · , n.
  ‖L1 ej − L2 ej‖ = (1/|h|) ‖L1(h ej) − L2(h ej)‖ = ‖L1(h ej) − L2(h ej)‖ / ‖h ej‖
    = ‖f(a + h ej) − f(a) − L1(h ej) − [f(a + h ej) − f(a) − L2(h ej)]‖ / ‖h ej‖
    ≤ ‖f(a + h ej) − f(a) − L1(h ej)‖/‖h ej‖ + ‖f(a + h ej) − f(a) − L2(h ej)‖/‖h ej‖
    → 0 as h → 0.
• Proof of continuity: Since lim_{y→a} ‖f(y) − f(a) − Df(a)(y − a)‖ = 0, lim_{y→a} ‖f(y) − f(a)‖ = 0.
Thm 6.2.2. Assume f : A ⊂ Rn → Rm is differentiable at x and Df(x) = [aij]. Then ∂fj/∂xi exists and aij = ∂fj/∂xi.
Proof. Denote e1 = (1, 0, · · · , 0), e2 = (0, 1, 0, · · · , 0), · · · , en = (0, · · · , 0, 1). We have
  lim_{y→x} ‖f(y) − f(x) − Df(x)(y − x)‖/‖y − x‖ = 0
  ⇒ lim_{h→0} ‖f(x + h ei) − f(x) − Df(x)(h ei)‖/|h| = 0, i = 1, 2, · · · , n
  ⇒ lim_{h→0} √(∑_{j=1}^m |fj(x + h ei) − fj(x) − aij h|²)/|h| = 0, i = 1, 2, · · · , n
  ⇒ ∂fj/∂xi exists and aij = ∂fj/∂xi.
Thm 6.4.1. Let f : A ⊂ Rn → Rm. If each ∂fj/∂xi exists and is continuous on A, then f is differentiable on A.
[Proof for the case n = 2, m = 1.] Let Df(x) = [∂f/∂x1(x), ∂f/∂x2(x)], x ∈ A. From the mean value theorem,
  f(y) − f(x) = f(y1, y2) − f(x1, y2) + f(x1, y2) − f(x1, x2)
              = ∂f/∂x1(u1, y2) (y1 − x1) + ∂f/∂x2(x1, u2) (y2 − x2)
for some ui between xi and yi. Hence,
  f(y) − f(x) − Df(x)(y − x) = α (y1 − x1) + β (y2 − x2),
where α := [∂f/∂x1(u1, y2) − ∂f/∂x1(x)] and β := [∂f/∂x2(x1, u2) − ∂f/∂x2(x)].
Due to continuity of ∂f/∂x1 and ∂f/∂x2, α → 0 & β → 0 as y → x, and
  |f(y) − f(x) − Df(x)(y − x)| / ‖y − x‖ = |α (y1 − x1) + β (y2 − x2)| / √((y1 − x1)² + (y2 − x2)²) ≤ √(α² + β²) → 0
as y → x. This proves lim_{y→x} ‖f(y) − f(x) − Df(x)(y − x)‖/‖y − x‖ = 0.
Remark. About a differentiable map f : A ⊂ Rn → Rm.
• The proof of Thm 6.4.1 for the general case f : A ⊂ Rn → Rm is almost the same as in the special case f : A ⊂ R² → R.
• Intuitively, x ↦ f(x0) + Df(x0)(x − x0) is the best affine approximation to f near x0.
• It should be noticed that the existence of the ∂fj/∂xi does not imply that the derivative Df exists.
Directional Derivatives. Let f : A ⊂ Rn → R be a real-valued function.
• Let e ∈ Rn be a unit vector. (d/dt) f(x + te)|_{t=0} = lim_{t→0} (f(x + te) − f(x))/t is called the directional derivative of f at x in the direction e.
• If f is differentiable, then lim_{t→0} (f(x + te) − f(x))/t = Df(x) · e.
• Note that the existence of all directional derivatives at a point need not imply differentiability.
Example. Let f(x, y) = xy/(x² + y) for x² ≠ −y and f(x, y) = 0 if x² = −y. Note that f is not continuous at (0, 0), since lim_{t→0} f(t, t³ − t²) = lim_{t→0} t(t³ − t²)/(t² + t³ − t²) = −1 ≠ 0 = f(0, 0). But all directional derivatives of f at (0, 0) exist:
  lim_{t→0} f(ta, tb)/t = lim_{t→0} (1/t) · t²ab/(t²a² + tb) = a (for b ≠ 0; the limit is 0 when b = 0)
for any unit vector e = (a, b).
Chain Rule 6.5.1: Let A ⊂ Rn be open and let f : A → Rm be differentiable. Let B ⊂ Rm be open, f(A) ⊂ B, and g : B → Rp be differentiable. Then h = g ◦ f is differentiable on A and Dh(x) = Dg(f(x)) Df(x):
  D(g ◦ f)(x) = [ ∂g1/∂y1 · · · ∂g1/∂ym        [ ∂f1/∂x1 · · · ∂f1/∂xn
                    ...    ...    ...      ×       ...    ...    ...
                  ∂gp/∂y1 · · · ∂gp/∂ym ]        ∂fm/∂x1 · · · ∂fm/∂xn ]
Proof. From the assumption, it is easy to see that
  lim_{x→x0} ‖♠‖/‖x − x0‖ = 0, where ♠ := (g ◦ f)(x) − (g ◦ f)(x0) − Dg(f(x0))(f(x) − f(x0)),
  lim_{x→x0} ‖♣‖/‖x − x0‖ = 0, where ♣ := f(x) − f(x0) − Df(x0)(x − x0).
Since (g ◦ f)(x) − (g ◦ f)(x0) − Dg(f(x0)) Df(x0)(x − x0) = ♠ + Dg(f(x0))♣, it follows from the above identities that
  lim_{x→x0} ‖(g ◦ f)(x) − (g ◦ f)(x0) − Dg(f(x0)) Df(x0)(x − x0)‖/‖x − x0‖ = 0.
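The identity Dh(x) = Dg(f(x)) Df(x) can be verified numerically with finite-difference Jacobians. A small sketch (the maps f and g below are my own toy choices, not from the text):

```python
import math

def jacobian(F, x, m, eps=1e-6):
    """m x len(x) Jacobian of F at x by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        Fp, Fm = F(xp), F(xm)
        for j in range(m):
            J[j][i] = (Fp[j] - Fm[j]) / (2 * eps)
    return J

f = lambda x: [x[0] * x[1], math.sin(x[0])]   # f : R^2 -> R^2
g = lambda y: [y[0] ** 2 + y[1]]              # g : R^2 -> R
h = lambda x: g(f(x))                         # h = g ∘ f

x0 = [0.7, -0.3]
Dh = jacobian(h, x0, 1)
Dg = jacobian(g, f(x0), 1)
Df = jacobian(f, x0, 2)
chain = [[sum(Dg[0][k] * Df[k][i] for k in range(2)) for i in range(2)]]
print(max(abs(Dh[0][i] - chain[0][i]) for i in range(2)))  # small FD error
```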
Directional derivatives and examples
1. If h(r, θ) = f(r cos θ, r sin θ), then
   (∂h/∂r  ∂h/∂θ) = (∂f/∂x  ∂f/∂y) ( cos θ  −r sin θ
                                      sin θ   r cos θ ).
2. Consider a surface S defined by f(x) = constant. Then ∇f(x) is orthogonal to this surface.
   Proof. Let c : [0, 1] → Rn be a curve lying on S with c(0) = x0. Then
   0 = (d/dt) f(c(t)) = ∇f(c(t)) · c′(t).
   This means that ∇f(c(t)) is orthogonal to the tangent vector c′(t). Since this is true for an arbitrary curve on S passing through x0, ∇f(x0) is orthogonal to S at x0.
3. The direction of greatest rate of increase of f(x) is ∇f(x).
6.7.1. Mean Value Theorem. Suppose f : A ⊂ Rn → R is differentiable on an open set A. For any x, y ∈ A such that the line segment joining x and y lies in A, ∃ c on that segment such that
  f(y) − f(x) = Df(c) · (y − x).
Proof. Define h(t) = f((1 − t)x + ty). Then
  ∃ t0 ∈ (0, 1) such that h(1) − h(0) = h′(t0),
and therefore
  f(y) − f(x) = h(1) − h(0) = h′(t0) = Df((1 − t0)x + t0y) · (y − x), with c = (1 − t0)x + t0y.
• Definition. A bilinear map B : Rn × Rm → R is given by an n × m matrix (aij):
  B(x, y) = ∑_{i,j} aij xi yj = (x1, · · · , xn) [ a11 · · · a1m; ...; an1 · · · anm ] (y1, · · · , ym)ᵀ.
• Definition 6.8.4. For a positive integer r, f is said to be of class C^r if all partial derivatives up to order r exist and are continuous.
• Let f : A ⊂ Rn → R be of class C². Then
  D²f(x) = [ ∂²f/∂x1∂x1 · · · ∂²f/∂x1∂xn; ...; ∂²f/∂xn∂x1 · · · ∂²f/∂xn∂xn ].
• If D²f is continuous, D²f is symmetric.
Taylor's Theorem 6.8.5 [Case: f ∈ C³]. Let f : A ⊂ Rn → R be of class C³. Suppose x ∈ A and x + th ∈ A for 0 ≤ t ≤ 1. Then ∃ c = x + t0h, 0 < t0 < 1, such that
  f(x + h) − f(x) = ∑_{i=1}^n ∂f/∂xi(x) hi + (1/2!) ∑_{i,j=1}^n ∂²f/∂xi∂xj(x) hi hj + (1/3!) ∑_{i,j,k=1}^n ∂³f/∂xi∂xj∂xk(x + t0h) hi hj hk.
Proof.
  f(x + h) − f(x) = ∫_0^1 (d/dt) f(x + th) dt = ∫_0^1 ∑_{i=1}^n ∂f/∂xi(x + th) hi dt
    = ∑_{i=1}^n ∫_0^1 ∂f/∂xi(x + th) hi (d(t − 1)/dt) dt   (Why? d(t − 1)/dt = 1)
    = ∑_{i=1}^n [ ∂f/∂xi(x) hi − ∫_0^1 (d/dt)(∂f/∂xi(x + th) hi)(t − 1) dt ]   (integration by parts)
    = ∑_{i=1}^n ∂f/∂xi(x) hi + R1(h, x),
where
  R1(h, x) = ∑_{i,j=1}^n ∫_0^1 (1 − t) ∂²f/∂xi∂xj(x + th) hi hj dt.
Using (d/dt)(−(t − 1)²/2!) = (1 − t) and integration by parts,
  R1(h, x) = ∑_{i,j=1}^n ∫_0^1 (d/dt)(−(t − 1)²/2!) ∂²f/∂xi∂xj(x + th) hi hj dt
           = (1/2!) ∑_{i,j=1}^n ∂²f/∂xi∂xj(x) hi hj + R2(h, x),
where
  R2(h, x) := ∑_{i,j,k=1}^n ∫_0^1 ((t − 1)²/2!) ∂³f/∂xi∂xj∂xk(x + th) hi hj hk dt.
Recall the second mean value theorem for integrals:
  ∫_0^1 f(t) g(t) dt = g(t0) ∫_0^1 f(t) dt for some 0 < t0 < 1.
Hence, ∃ t0, 0 < t0 < 1, such that
  R2(h, x) = ∑_{i,j,k=1}^n ∂³f/∂xi∂xj∂xk(x + t0h) hi hj hk · ∫_0^1 ((t − 1)²/2!) dt, where ∫_0^1 ((t − 1)²/2!) dt = 1/3!.
One can proceed by induction using the same method to get the general Taylor's theorem.
6.8.5. Taylor's Theorem [General Case: f ∈ C^r]. Let f : A ⊂ Rn → R be of class C^r. Suppose x ∈ A and x + th ∈ A for 0 ≤ t ≤ 1. Then
  f(x + h) = f(x) + Df(x) · h + · · · + (1/(r − 1)!) D^{r−1}f(x) · (h, · · · , h) + R_{r−1}(x, h),
where R_{r−1}(x, h) is the remainder. Furthermore,
  R_{r−1}(x, h)/‖h‖^{r−1} → 0 as h → 0.
Another proof of Taylor's formula. Let g(t) = f(x + th) for t ∈ [0, 1]. Applying the one-dimensional Taylor formula, there exists t ∈ (0, 1) such that
  g(1) = g(0) + ∑_{k=1}^{r−1} (1/k!) g^{(k)}(0) + (1/r!) g^{(r)}(t).
Note that R_{r−1}(x, h) = (1/r!) g^{(r)}(t), g(1) = f(x + h), g(0) = f(x), and
  g′(0) = Df(x) · h = ∑_{i=1}^n ∂f/∂xi(x) hi,
  g′′(0) = D²f(x) · (h, h) = ∑_{i,j=1}^n ∂²f/∂xi∂xj(x) hi hj,
  g′′′(0) = D³f(x) · (h, h, h) = ∑_{i,j,k=1}^n ∂³f/∂xi∂xj∂xk(x) hi hj hk.
Theorem 6.9.2. If f : A ⊂ Rn → R is differentiable and x0 ∈ A is an extreme point for f, then Df(x0) = 0.
Proof. Assume Df(x0) ≠ 0. We try to prove that f(x0) is not a local extreme value.
• Let h = Df(x0)/‖Df(x0)‖. Since f is differentiable at x0,
  lim_{λ→0} (1/|λ|) |f(x0 + λh) − f(x0) − Df(x0) · (λh)| = 0.
• Hence (for given ε = ‖Df(x0)‖/2), there exists δ > 0 such that
  0 < |λ| < δ ⇒ |f(x0 + λh) − f(x0) − Df(x0) · (λh)| < (‖Df(x0)‖/2) |λ|.
  Since Df(x0) · h = ‖Df(x0)‖, we have
  −(‖Df(x0)‖/2) |λ| < f(x0 + λh) − f(x0) − ‖Df(x0)‖ λ < (‖Df(x0)‖/2) |λ|.
This leads to the following:
– for 0 < λ < δ, (‖Df(x0)‖/2) λ < f(x0 + λh) − f(x0). Hence, f(x0) is not a local maximum.
– for −δ < λ < 0, f(x0 + λh) − f(x0) < (‖Df(x0)‖/2) λ < 0. Hence, f(x0) is not a local minimum.
Theorem 6.9.4. Suppose f : A ⊂ Rn → R is a C³-function and x0 is a critical point.
• If f has a local maximum at x0, then H_{x0}(f) is negative semi-definite.
• If H_{x0}(f) is negative (positive) definite, then f has a local maximum (minimum) at x0.
Indeed, this theorem holds true for f ∈ C².
Proof. Since Df(x0) = 0, Taylor's theorem gives
  f(x0 + h) − f(x0) = (1/2) D²f(x0)(h, h) + R2(x0, h),
where lim_{h→0} R2(x0, h)/‖h‖² = 0.
If D²f(x0) is negative definite, then
  (1/2) D²f(x0)(h, h) + R2(x0, h) < 0 for sufficiently small h ≠ 0,
and therefore f(x0 + h) − f(x0) < 0 for sufficiently small h ≠ 0. Hence, f has a local maximum at x0.
• Example 6.9.5. The matrix A = ( a b; b d ) is positive definite if
  (x, y) ( a b; b d ) (x, y)ᵀ > 0 for (x, y) ≠ (0, 0).
  Hence, A is positive definite iff ax² + 2bxy + dy² > 0 for all (x, y) ≠ (0, 0). Therefore, A is positive definite iff a > 0 and ad − b² > 0.
• Example 6.9.6. Let f(x, y) = x² − xy + y². Then Df(0, 0) = (0, 0) and
  D²f(0, 0) = ( 2 −1; −1 2 ).
  Hence, the Hessian is positive definite. Thus f has a local minimum at (0, 0).
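The 2×2 criterion in Example 6.9.5 is directly checkable in code. A tiny sketch (the second test matrix is my own example of an indefinite Hessian):

```python
def is_positive_definite_2x2(a, b, d):
    """A = [[a, b], [b, d]] is positive definite iff a > 0 and a*d - b*b > 0."""
    return a > 0 and a * d - b * b > 0

# Hessian of f(x, y) = x^2 - x*y + y^2 at (0, 0) is [[2, -1], [-1, 2]].
print(is_positive_definite_2x2(2, -1, 2))   # True  -> local minimum at (0,0)
print(is_positive_definite_2x2(1, 2, 1))    # False -> not positive definite
```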
Chapter 8. Integration
Definition. Let A ⊂ ℝ² be a bounded set and let f : A → ℝ be a bounded function.
∙ We enclose A in some rectangle B = [a1, b1] × [a2, b2] and extend f to the whole rectangle by defining it to be zero outside of A.
∙ Let P be a partition of B obtained by dividing a1 = x0 < x1 < ⋅ ⋅ ⋅ < xn = b1 and a2 = y0 < y1 < ⋅ ⋅ ⋅ < ym = b2:
  P = {[xi, xi+1] × [yj, yj+1] (subrectangles R) : i = 0, 1, ⋅ ⋅ ⋅ , n − 1, j = 0, 1, ⋅ ⋅ ⋅ , m − 1}.
∙ Define the upper sum of f:
  U(f, P) := ∑_{R∈P} sup{f(x, y) ∣ (x, y) ∈ R} × (volume of R).
∙ Define the lower sum of f:
  L(f, P) := ∑_{R∈P} inf{f(x, y) ∣ (x, y) ∈ R} × (volume of R).
∙ Define the upper integral of f on A by
  ∫̄_A f = inf {U(f, P) : P is a partition of B}
and the lower integral of f on A by
  ∫__A f = sup {L(f, P) : P is a partition of B}.
∙ We say that f is Riemann integrable, or integrable, if the upper and lower integrals agree.
∙ If f is integrable on A, we denote the common value by ∫_A f.
Volume and sets of measure zero.
Definition. Let A be a bounded set of ℝⁿ.
∙ The characteristic function 1A of A is the map defined by 1A(x) = 1 if x ∈ A and 1A(x) = 0 if x ∉ A.
∙ We say that A has volume if 1A is Riemann integrable, and the volume is the number
  vol(A) = ∫_A 1A(x) dx.
∙ The set A is said to have measure zero if for every ε > 0 there is a countable number of rectangles R1, R2, ⋅ ⋅ ⋅ such that
  A ⊂ ∪_{n=1}^∞ Rn  &  ∑_{n=1}^∞ vol(Rn) < ε.
∙ Examples: The set of rational numbers has measure zero in ℝ. As a subset of ℝ², the real line has measure zero.
∙ Lebesgue's monotone convergence theorem. Let gn : [0, 1] → ℝ be integrable functions with |∫_0^1 gn(x) dx| < ∞. Suppose that 0 ≤ gn+1 ≤ gn and gn(x) → 0 for all x ∈ [0, 1]. Then
  lim_{n→∞} ∫_0^1 gn(x) dx = 0.
∙ Example: lim_{n→∞} ∫_0^1 e^{−nx²} x^p dx = 0 if p > −1.
∙ Fubini's Theorem. Let A = [a, b] × [c, d] ⊂ ℝ², and let f : A → ℝ be continuous. Then
  ∫_A f = ∫_a^b (∫_c^d f(x, y) dy) dx = ∫_c^d (∫_a^b f(x, y) dx) dy.
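Fubini's theorem can be illustrated numerically: both iterated integrals approximate the same value. A sketch (the integrand and rectangle are my own toy choices) using iterated midpoint sums:

```python
import math

def iterated_midpoint(f, a, b, c, d, n=200):
    """Midpoint approximation of int_a^b ( int_c^d f(x,y) dy ) dx."""
    hx, hy = (b - a) / n, (d - c) / n
    return sum(
        sum(f(a + (i + 0.5) * hx, c + (j + 0.5) * hy) for j in range(n)) * hy
        for i in range(n)
    ) * hx

f2 = lambda x, y: x * y + math.sin(x)
I1 = iterated_midpoint(f2, 0.0, 1.0, 0.0, 2.0)                  # dy then dx
I2 = iterated_midpoint(lambda y, x: f2(x, y), 0.0, 2.0, 0.0, 1.0)  # dx then dy
exact = 1.0 + 2 * (1 - math.cos(1.0))  # ∫∫ xy = 1, ∫∫ sin x = 2(1 - cos 1)
print(abs(I1 - exact), abs(I2 - exact))  # both small
```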
Chapter 10 Fourier Series. Fourier analysis arose historically in connection with problems in mechanics such as heat conduction and wave motion.
∙ Vibrating string. Consider a string of length l with clamped ends that is free to vibrate when plucked. Let y(t, x) be the displacement of the string at time t and x ∈ [0, l].
– y obeys the wave equation
  ∂²y/∂t² = c² ∂²y/∂x²   (force = mass × acceleration; the force comes from the tension)
– That the string has clamped ends entails that y(t, 0) = y(t, l) = 0.
∙ It is both important and remarkable that any solution y(x, t) can be decomposed into harmonics:
  y(x, t) = ∑_{n=1}^∞ cn yn(x, t) = ∑_{n=1}^∞ cn sin(nπx/l) cos(ωn t), ωn = nπc/l (the frequency),
where each yn(x, t) is a standing wave.
∙ Physically, a standing wave is a synchronous up-and-down motion that repeats its shape periodically after time 2π/ω, such as occurs when a string produces a pure note.
∙ Specific standing waves called fundamental solutions (a kind of basis) are given by
  yn(x, t) = sin(nπx/l) cos(ωn t), n = 1, 2, ⋅ ⋅ ⋅
∙ Thus a complicated-looking vibration is in reality an infinite linear combination of harmonics.
∙ The purpose of Fourier analysis is to carry out this procedure of decomposition by a general method.
Exercise: Using separation of variables, prove that any solution y(x, t) can be decomposed into harmonics
  y(x, t) = ∑_{n=1}^∞ cn yn(x, t) = ∑_{n=1}^∞ cn sin(nπx/l) cos(ωn t), ωn = nπc/l.
10.1 Review: Inner Product in ℝⁿ.
∙ For x, y ∈ ℝⁿ, define the inner product and norm:
  ⟨x, y⟩ = ∑_{j=1}^n x(j) y(j), ∥x∥ = √⟨x, x⟩.
∙ The distance (or metric) between x and y is defined by ∥x − y∥, and hence ∥x − y∥ = 0 implies x = y.
∙ If ⟨x, y⟩ = 0, x and y are said to be orthogonal.
∙ {e1, e2, ⋅ ⋅ ⋅ , en} is said to be an orthonormal basis of ℝⁿ if
1. ℝⁿ = span{e1, e2, ⋅ ⋅ ⋅ , en}
2. ∥ej∥ = 1, j = 1, ⋅ ⋅ ⋅ , n
3. ⟨ej, ei⟩ = 0 if i ≠ j.
∙ For example, e1 = (1, 0, ⋅ ⋅ ⋅ , 0), e2 = (0, 1, 0, ⋅ ⋅ ⋅ , 0), ....
∙ If {e1, e2, ⋅ ⋅ ⋅ , en} is an orthonormal basis, then every x ∈ ℝⁿ can be represented uniquely by
  x = ∑_{j=1}^n ⟨x, ej⟩ ej.
∙ If Vm = span{e1, ⋅ ⋅ ⋅ , em}, the element in Vm closest to x is
  xm = ∑_{j=1}^m ⟨x, ej⟩ ej,
with the distance ∥x − xm∥ = √(∑_{j=m+1}^n ⟨x, ej⟩²).
These useful dot-product properties of Euclidean space can be generalized to infinite-dimensional spaces by introducing Hilbert spaces.
10.1 Inner Product space C[0, 2π]
∙ Let A be the interval (0, 2π).
∙ Let V be the space of all continuous functions f : [0, 2π] → ℂ.
∙ For f, g ∈ V, we define the inner product
  ⟨f, g⟩ = ∫_0^{2π} f(x) g̅(x) dx,
where g̅(x) denotes the complex conjugate of g(x). The above inner product can be approximated by
  ⟨f, g⟩ ≈ ∑_{j=1}^n f(xj) g̅(xj) ∆x,
where we divide the interval [0, 2π] into n subintervals with endpoints x0 = 0 < x1 < ⋅ ⋅ ⋅ < xn = 2π and equal width ∆x = 2π/n.
∙ Two functions f and g are said to be orthogonal if
  ⟨f, g⟩ = ∫_0^{2π} f(x) g̅(x) dx = 0.
∙ The norm of f is defined as
  ∥f∥ = √⟨f, f⟩ = √(∫_0^{2π} |f(x)|² dx).
∙ The distance between f and g is defined by
  d(f, g) = ∥f − g∥.
∙ If {φn} is an orthogonal set of functions on the interval A with the property that ∥φn∥ = 1, then we call {φn} an orthonormal set.
∙ Example.
  { 1/√(2π), (1/√π) cos x, (1/√π) sin x, (1/√π) cos 2x, (1/√π) sin 2x, ⋅ ⋅ ⋅ }
is an orthonormal set in V.
10.1 Inner Product space
Definition. Let V be a complex vector space. An inner product on V is a mapping ⟨⋅, ⋅⟩ : V × V → ℂ with the following properties:
1. ⟨αf + βg, h⟩ = α⟨f, h⟩ + β⟨g, h⟩ for all f, g, h ∈ V and α, β ∈ ℂ.
2. ⟨f, g⟩ = ⟨g, f⟩‾ (complex conjugate).
3. ⟨f, f⟩ ≥ 0, and ⟨f, f⟩ = 0 ⇒ f = 0.
Theorem 10.1.2. The space V of the continuous functions f : [a, b] → ℂ forms an inner product space if we define
  ⟨f, g⟩ = ∫_a^b f(x) g̅(x) dx.
10.1 Inner Product space V = C[a, b]. Consider the space V of the continuous functions f : [a, b] → ℂ with the inner product ⟨f, g⟩ = ∫_a^b f(x) g̅(x) dx.
∙ Define the norm of f by ∥f∥ = √⟨f, f⟩.
∙ Define the distance between f and g by d(f, g) = ∥f − g∥.
For f, g, h ∈ V, we have
∙ Cauchy-Schwarz inequality. |⟨f, g⟩| ≤ ∥f∥∥g∥
∙ Minkowski inequality. ∥f + g∥ ≤ ∥f∥ + ∥g∥
∙ Parallelogram law. ∥f + g∥² + ∥f − g∥² = 2∥f∥² + 2∥g∥²
∙ Pythagorean Theorem. If ⟨f, g⟩ = 0, then ∥f + g∥² = ∥f∥² + ∥g∥²
Cauchy-Schwarz inequality. |⟨f, g⟩| ≤ ∥f∥∥g∥
Proof:
∙ Suppose g ≠ 0 (the case g = 0 is trivial). Let h = g/∥g∥. Then
  |⟨f, g⟩| ≤ ∥f∥∥g∥ ⇔ |⟨f, h⟩| ≤ ∥f∥.
∙ Denote α = ⟨f, h⟩. Then, using ∥h∥ = 1,
  0 ≤ ∥f − αh∥² = ⟨f − αh, f − αh⟩ = ∥f∥² − ᾱ⟨f, h⟩ − α⟨h, f⟩ + |α|² = ∥f∥² − |α|².
Hence, |α| = |⟨f, h⟩| ≤ ∥f∥. This completes the proof.
Minkowski inequality. ∥f + g∥ ≤ ∥f∥ + ∥g∥
Proof:
  ∥f + g∥² = ⟨f + g, f + g⟩ = ∥f∥² + ⟨f, g⟩ + ⟨g, f⟩ + ∥g∥²
           = ∥f∥² + 2 Re⟨f, g⟩ + ∥g∥²
           ≤ ∥f∥² + 2∥f∥∥g∥ + ∥g∥²   (Cauchy-Schwarz)
           = (∥f∥ + ∥g∥)².
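Both inequalities can be checked on the discretized inner product ⟨f, g⟩ ≈ ∑ f(xj) g̅(xj) ∆x from earlier in this section. A sketch (the two test functions are my own arbitrary choices):

```python
import cmath
import math

n = 1000
dx = 2 * math.pi / n
xs = [j * dx for j in range(n)]

def inner(f, g):
    """Discretized <f, g> = sum f(x_j) conj(g(x_j)) dx on [0, 2*pi]."""
    return sum(f(x) * g(x).conjugate() for x in xs) * dx

def norm(f):
    return math.sqrt(inner(f, f).real)

f = lambda x: cmath.exp(1j * x) + 0.3
g = lambda x: complex(math.sin(2 * x), x / 10)
fg = lambda x: f(x) + g(x)

print(abs(inner(f, g)) <= norm(f) * norm(g) + 1e-9)  # Cauchy-Schwarz: True
print(norm(fg) <= norm(f) + norm(g) + 1e-9)          # Minkowski: True
```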
Definition of convergence in an inner product space V. Let V be an inner product space and let fn be a sequence in V. We say that fn converges to f (in mean) and write fn → f if ∥fn − f∥ → 0, that is,
  ∀ ε > 0, ∃ N s.t. n ≥ N ⇒ ∥fn − f∥ < ε.
Similarly, a series ∑ gk converges to f if
  lim_{n→∞} ∥∑_{k=1}^n gk − f∥ = 0.
Examples: Let V = C([0, 1]), the space of continuous functions f : [0, 1] → ℂ.
∙ Let fn = nx χ_{[0,1/n]} + (2 − nx) χ_{(1/n,2/n]}. Then fn → 0 in mean, that is, ∫_0^1 |fn(x) − 0|² dx → 0.
∙ Let fn = n²x χ_{[0,1/n]} + (2n − n²x) χ_{(1/n,2/n]}. Then
  lim_{n→∞} fn(x) = 0 (∀ x) & lim_{n→∞} ∫_0^1 |fn(x) − 0|² dx = ∞.
Definition of Cauchy sequence. A sequence fn in an inner product space is said to be a Cauchy sequence when
  ∀ ε > 0, ∃ N s.t. n, m ≥ N ⇒ ∥fn − fm∥ < ε.
An inner product space V is called complete if every Cauchy sequence in V converges. A complete inner product space is called a Hilbert space.
Remark: The inner product space V = C([0, 2]) is not complete.
∙ Let fn(x) = xⁿ for 0 ≤ x ≤ 1 and fn(x) = 1 for 1 ≤ x ≤ 2.
∙ Then fn is a Cauchy sequence, since ∥fn − fm∥² = ∫_0^1 |xⁿ − x^m|² dx → 0 as n, m → ∞.
∙ However, fn → f in mean, where f(x) = 0 for 0 ≤ x < 1 and f(x) = 1 for 1 ≤ x ≤ 2, and f ∉ V.
A complete inner product space. To make the inner product space V = C([a, b]) complete, we need the following theorem and measure theory:
Theorem 8.3.4. If g(x) is integrable, g ≥ 0, and ∫_a^b g(x) dx = 0, then the set {x ∈ [a, b] : g(x) ≠ 0} has measure zero.
Proof. TA
♣ For any integrable function f, Theorem 8.3.4 leads to
  ∫_a^b |f(x)|² dx = 0 ⇒ f = 0 except for those x in a set of measure zero.
Regarding such an f as equivalent to zero, we have the following theorem:
Theorem 10.1.6. Let V = L²([a, b]) be the space of functions f : [a, b] → ℂ such that |f|² is integrable. Then V is an inner product space with inner product ⟨f, g⟩ = ∫_a^b f(x) g̅(x) dx and norm ∥f∥ = √⟨f, f⟩.
Proof of Theorem 8.3.4: If g(x) is integrable, g ≥ 0, and ∫_a^b g(x) dx = 0, then the set {x ∈ [a, b] : g(x) ≠ 0} has measure zero.
∙ We first show that the set Am = {x ∈ [a, b] : g(x) > 1/m} has measure zero.
∙ Recall ∫_a^b g(x) dx = inf{U(g, P) : P is any partition}.
∙ Let ε > 0 be given. There exists a partition P such that U(g, P) < ε/m.
∙ Let I1, ⋅ ⋅ ⋅ , Ik be the subintervals of the partition P such that Ii ∩ Am ≠ ∅. Then
  ∑_{i=1}^k |Ii| ≤ ∑_{i=1}^k m (sup_{Ii} g(x)) |Ii| ≤ m U(g, P) < ε,
where |Ii| is the length of the interval Ii.
∙ Since Am ⊂ ∪_{i=1}^k Ii and ∑_{i=1}^k |Ii| < ε, Am has measure zero.
∙ Since {x ∈ [a, b] : g(x) ≠ 0} ⊂ ∪_{m=1}^∞ Am, the set has measure zero.
Proof of Theorem 10.1.6: Prove that V = L²([a, b]) is an inner product space.
∙ If ∥f∥ = 0, then ∫_a^b |f(x)|² dx = 0. From Theorem 8.3.4, f = 0, since we are identifying functions that agree except on a set of measure zero.
∙ It is easy to see that ⟨f, g⟩ satisfies all the other rules of an inner product space. We only need to prove that |⟨f, g⟩| < ∞ for all f, g ∈ V.
∙ If we split f and g into real and imaginary parts, and into positive and negative parts, we are reduced to the case in which f and g are real and positive.
∙ From the Lebesgue monotone convergence theorem (page 467), it suffices to show that
  lim_{M→∞} ∫_a^b (fg)_M < ∞ (see page 462),
where (⋅)_M denotes truncation at level M.
∙ Note that 0 ≤ (fg)_M ≤ f_{√M} g_{√M} + (f²)_{√M} + (g²)_{√M}.
∙ ∫_a^b (fg)_M ≤ ∥f_{√M}∥∥g_{√M}∥ + ∥f_{√M}∥² + ∥g_{√M}∥².
∙ Hence, ∫_a^b (fg)_M ≤ ∥f∥∥g∥ + ∥f∥² + ∥g∥² < ∞.
Example 10.1.8. If f1, ⋅ ⋅ ⋅ , fn are orthonormal in an inner product space V, prove that f1, ⋅ ⋅ ⋅ , fn are linearly independent.
∙ Definition. f1, ⋅ ⋅ ⋅ , fn are said to be linearly independent if
  ∑_{i=1}^n ci fi = 0 ⇒ c1 = ⋅ ⋅ ⋅ = cn = 0.
∙ Assume that ∑_{i=1}^n ci fi = 0. We want to prove c1 = ⋅ ⋅ ⋅ = cn = 0.
∙ Due to orthonormality, we have
  ck = ck ∥fk∥² = ⟨∑_{i=1}^n ci fi, fk⟩ = ⟨0, fk⟩ = 0.
Example 10.1.8. Let V be an inner product space. Define the projection of f on g to be the vector
  h = (⟨f, g⟩/∥g∥²) g.
Show that h and f − h are orthogonal, and interpret this result geometrically.
Proof: First, let us prove it when ∥g∥ = 1:
  ⟨h, f − h⟩ = ⟨h, f⟩ − ∥h∥² = ⟨⟨f, g⟩g, f⟩ − |⟨f, g⟩|² = 0.
For the general case, repeat the above procedure with g replaced by g/∥g∥.
10.2 Orthogonal families of functions
∙ Throughout this section, we assume that V is an inner product space with an inner product ⟨⋅, ⋅⟩.
∙ A vector φ ∈ V is called normalized if ∥φ∥ = √⟨φ, φ⟩ = 1.
∙ f and g are called orthogonal if ⟨f, g⟩ = 0.
∙ Definition. An orthonormal family φ0, φ1, ⋅ ⋅ ⋅ in V is called complete if every f ∈ V can be written
  f = ∑_{k=0}^∞ ck φk (ck = ⟨f, φk⟩).
We call ∑_{k=0}^∞ ck φk the Fourier series of f with respect to φ0, φ1, ⋅ ⋅ ⋅ and ck = ⟨f, φk⟩ the Fourier coefficients.
∙ An orthonormal family {φk} in V is complete iff for every f ∈ V,
  lim_{n→∞} ∥f − ∑_{k=0}^n ⟨f, φk⟩ φk∥ = 0.
Theorem 10.2.1: Suppose f = ∑_{k=0}^∞ ck φk for an orthonormal family φ0, φ1, ⋅ ⋅ ⋅ in V (convergence in mean). Then ck = ⟨f, φk⟩.
Proof.
∙ Set sn = ∑_{k=0}^n ck φk, so that ∥sn − f∥ → 0.
∙ If n ≥ i, then ⟨sn, φi⟩ = ∑_{k=0}^n ⟨ck φk, φi⟩ = ci.
∙ Hence, if n ≥ i, |⟨f − sn, φi⟩| = |⟨f, φi⟩ − ci| ≤ ∥f − sn∥ ∥φi∥ = ∥f − sn∥ → 0 as n → ∞.
∙ Hence, ⟨f, φi⟩ = ci.
Examples of complete orthonormal families:
∙ Let V = L²([0, 2π]) be the inner product space in Theorem 10.1.6.
∙ The exponential system {φn(x) = e^{inx}/√(2π) : n = 0, ±1, ±2, ⋅ ⋅ ⋅ } is a complete orthonormal system in the space V; that is, the Fourier series for f ∈ V for this family is given by
  f = ∑_{k=−∞}^∞ ck e^{ikx}/√(2π), ck = ⟨f, φk⟩ = (1/√(2π)) ∫_0^{2π} f(x) e^{−ikx} dx.
∙ The trigonometric system 1/√(2π), cos(mx)/√π, sin(nx)/√π, m, n = 1, 2, ⋅ ⋅ ⋅ is a complete orthonormal system in V.
Proof. See the mean completeness theorem 10.3.1. (optional)
Gram-Schmidt process:
∙ Let g0, g1, g2, ⋅ ⋅ ⋅ be linearly independent functions in an inner product space V.
∙ We can form a corresponding orthonormal system φ0, φ1, ⋅ ⋅ ⋅ as follows:
  φ0 = g0/∥g0∥,
  ψ1 = g1 − ⟨g1, φ0⟩ φ0,  φ1 = ψ1/∥ψ1∥,
  ψ_{k+1} = g_{k+1} − ∑_{i=0}^k ⟨g_{k+1}, φi⟩ φi,  φ_{k+1} = ψ_{k+1}/∥ψ_{k+1}∥.
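The Gram-Schmidt recursion above translates directly into code. A sketch on the discretized inner product over [−1, 1], applied to 1, x, x² (my own choice of inputs; the result approximates the normalized Legendre polynomials):

```python
import math

n = 2000
dx = 2.0 / n
xs = [-1 + (j + 0.5) * dx for j in range(n)]   # midpoint grid on [-1, 1]

def inner(u, v):
    """Discretized <u, v> = sum u(x_j) v(x_j) dx for real grid functions."""
    return sum(a * b for a, b in zip(u, v)) * dx

def gram_schmidt(gs):
    phis = []
    for g in gs:
        psi = list(g)
        for phi in phis:                        # subtract projections
            c = inner(g, phi)
            psi = [p - c * q for p, q in zip(psi, phi)]
        nrm = math.sqrt(inner(psi, psi))        # then normalize
        phis.append([p / nrm for p in psi])
    return phis

gs = [[1.0] * n, list(xs), [x * x for x in xs]]
phis = gram_schmidt(gs)
# Orthonormality check: <phi_i, phi_j> ≈ delta_ij.
print(round(inner(phis[0], phis[0]), 6), round(inner(phis[0], phis[2]), 6))
```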
Theorem: Bessel inequality: Let φ0, φ1, ⋅ ⋅ ⋅ be an orthonormal system in an inner product space V. For each f ∈ V, the real series ∑_{i=0}^∞ |⟨f, φi⟩|² converges and
  ∑_{i=0}^∞ |⟨f, φi⟩|² ≤ ∥f∥².
Proof.
∙ Set sn = ∑_{k=0}^n ck φk where ck = ⟨f, φk⟩.
∙ Key idea 1: f − sn and sn are orthogonal.
∙ Key idea 2: Apply Pythagoras' theorem: ∥f∥² = ∥f − sn∥² + ∥sn∥².
∙ Hence, ∥sn∥² ≤ ∥f∥².
∙ Since the φi are orthonormal, ∥sn∥² = ∑_{i=0}^n |⟨f, φi⟩|².
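Bessel's inequality can be watched numerically: partial sums of |⟨f, φi⟩|² stay below ∥f∥². A sketch for f(x) = x on [0, 2π] with the trigonometric system (grid discretization and the choice of f are mine):

```python
import math

n = 4000
dx = 2 * math.pi / n
xs = [(j + 0.5) * dx for j in range(n)]
f = xs                                   # samples of f(x) = x

def inner(u, v):
    """Discretized real inner product sum u_j v_j dx on [0, 2*pi]."""
    return sum(a * b for a, b in zip(u, v)) * dx

norm2 = inner(f, f)                      # ≈ ∫ x^2 dx = 8*pi^3/3
bessel = inner(f, [1 / math.sqrt(2 * math.pi)] * n) ** 2
for k in range(1, 20):
    ck = inner(f, [math.cos(k * x) / math.sqrt(math.pi) for x in xs])
    dk = inner(f, [math.sin(k * x) / math.sqrt(math.pi) for x in xs])
    bessel += ck * ck + dk * dk
print(bessel <= norm2)  # True; the sum approaches norm2 as terms are added
```

Since this trigonometric system is complete, the gap norm2 − bessel is just the tail of the series (Parseval's theorem, next slide).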
Parseval's Theorem: Let φ0, φ1, ⋅ ⋅ ⋅ be an orthonormal system in an inner product space V. Then φ0, φ1, ⋅ ⋅ ⋅ is complete iff for every f ∈ V, we have
  ∑_{i=0}^∞ |⟨f, φi⟩|² = ∥f∥².
Proof.
∙ Set sn = ∑_{k=0}^n ck φk where ck = ⟨f, φk⟩.
∙ Then ∥f∥² = ∥f − sn∥² + ∥sn∥².
∙ If φ0, φ1, ⋅ ⋅ ⋅ is complete, then ∥f − sn∥² → 0. Therefore, letting n → ∞,
  ∥f∥² = lim_{n→∞} {∥f − sn∥² + ∥sn∥²} = 0 + ∑_{i=0}^∞ |⟨f, φi⟩|².
∙ Conversely, if ∑_{i=0}^∞ |⟨f, φi⟩|² = ∥f∥², then ∥f∥² − ∥sn∥² → 0, and so ∥f − sn∥² → 0.