1.4 Cauchy Sequence in R
Definition. (1.4.1)
A sequence xn ∈ R is said to converge to a limit x if
• ∀ε > 0, ∃N s.t. n > N ⇒ |xn − x| < ε.
A sequence xn ∈ R is called a Cauchy sequence if
• ∀ε > 0, ∃N s.t. n > N & m > N ⇒ |xn − xm| < ε.
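These two definitions can be exercised numerically. A minimal sketch (the sequence xn = 1/n is an illustrative choice, not from the text; a finite check can only suggest the Cauchy property, it proves nothing):

```python
# For x_n = 1/n and a given eps, N = 2/eps works in the Cauchy definition:
# |1/n - 1/m| <= 1/n + 1/m < 2/N = eps whenever n, m > N.
eps = 1e-3
N = int(2 / eps)                        # N = 2000
tail = [1.0 / n for n in range(N + 1, N + 500)]
# every pair of terms beyond N is within eps of each other
assert all(abs(a - b) < eps for a in tail for b in tail)
```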
Proposition. (1.4.2)
Every convergent sequence is a Cauchy sequence.
Proof. Assume xk → x. Let ε > 0 be given.
• ∃N s.t. n > N ⇒ |xn − x| < ε/2.
• n, m > N ⇒ |xn − xm| ≤ |x − xn| + |x − xm| < ε/2 + ε/2 = ε.
Theorem. (1.4.3; Bolzano-Weierstrass Property)
Every bounded sequence in R has a subsequence that converges to some point in R.
Proof. Suppose xn is a bounded sequence in R: ∃M such that −M ≤ xn ≤ M, n = 1, 2, · · · . Select xn0 = x1.
• Bisect I0 := [−M, M] into [−M, 0] and [0, M].
• At least one of these (either [−M, 0] or [0, M]) must contain xn for infinitely many indices n.
• Call it I1 and select n1 > n0 with xn1 ∈ I1.
• Continue in this way to get a subsequence xnk such that
  • I0 ⊃ I1 ⊃ I2 ⊃ I3 ⊃ · · · ,
  • Ik = [ak, bk] with |Ik| = 2^{1−k}M,
  • n0 < n1 < n2 < · · · with xnk ∈ Ik.
• Since ak ≤ ak+1 ≤ M (monotone and bounded), ∃x s.t. ak → x.
• Since xnk ∈ Ik and |Ik| = 2^{1−k}M, we have
  |xnk − x| ≤ |xnk − ak| + |ak − x| ≤ 2^{1−k}M + |ak − x| → 0 as k → ∞.
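The bisection argument can be mimicked numerically. The sketch below uses the illustrative bounded sequence xn = (−1)^n(1 + 1/n) (an assumption, not from the text) and replaces "contains xn for infinitely many n" by the finite proxy "contains more of the remaining terms", so it only suggests the idea:

```python
# Repeatedly halve [-M, M], keeping the half that holds more of the remaining
# terms (a finite stand-in for "infinitely many"), and pick one index per step.
def bisect_subsequence(x, M, steps):
    a, b = -M, M
    last = -1           # index of the most recently selected term
    indices = []
    for _ in range(steps):
        mid = (a + b) / 2
        left = [n for n in range(last + 1, len(x)) if a <= x[n] <= mid]
        right = [n for n in range(last + 1, len(x)) if mid < x[n] <= b]
        if len(left) >= len(right):
            b, pool = mid, left
        else:
            a, pool = mid, right
        last = pool[0]          # select the next subsequence index
        indices.append(last)
    return indices, (a + b) / 2

x = [(-1) ** n * (1 + 1 / n) for n in range(1, 5000)]
idx, limit = bisect_subsequence(x, 2.0, 10)
# the selected subsequence clusters near one of the cluster points +-1
```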
Corollary. (1.4.5; Compactness)
Every sequence in the closed interval [a, b] has a subsequence that converges to some point in [a, b].
Proof. Assume a ≤ xn ≤ b for n = 1, 2, · · · . By Theorem 1.4.3, ∃ a subsequence xnk and ∃x with a ≤ x ≤ b such that xnk → x.
Lemma. (1.4.6; Boundedness of Cauchy sequence)
If xn is a Cauchy sequence, xn is bounded.
Proof. ∃N s.t. n ≥ N ⇒ |xn − xN| < 1. Then supn |xn| ≤ 1 + max{|x1|, · · · , |xN|}. (Why?)
Theorem. (1.4.3; Completeness)
Every Cauchy sequence in R converges to an element of R.
Proof. Cauchy seq. ⇒ bounded seq. (Lemma 1.4.6) ⇒ ∃ convergent subseq. (Bolzano-Weierstrass) ⇒ the Cauchy sequence itself converges to the limit of that subsequence.
1.5. Cluster Points of the sequence xn
Definition. (1.5.1; cluster points)
A point x is called a cluster point of the sequence xn if
• ∀ε > 0, ∃ infinitely many values of n with |xn − x| < ε.
In other words, a point x is a cluster point of the sequence xn iff
∀ε > 0 & ∀N, ∃n > N s.t. |xn − x| < ε.
Example
• Both 1 and −1 are cluster points of the sequence 1, −1, 1, −1, · · · .
• The sequence xn = 1/n has the only cluster point 0.
• The sequence xn = n does not have any cluster point.
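A finite check of Definition 1.5.1 for the first example (a sketch: on a finite prefix, "infinitely many" becomes a count that grows with the prefix length):

```python
# Count terms of x_n = (-1)^n (n = 1..1000) within eps of a candidate point c.
# Both 1 and -1 attract half of the terms; 0 attracts none.
x = [(-1) ** n for n in range(1, 1001)]
eps = 0.5
def near(c):
    return sum(1 for v in x if abs(v - c) < eps)

assert near(1) == 500 and near(-1) == 500 and near(0) == 0
```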
Proposition.
1. x is a cluster point of the sequence xn iff ∃ a subsequence xnk s.t. xnk → x.
2. xn → x iff every subsequence of xn converges to x.
3. xn → x iff the sequence {xn} is bounded and x is its only cluster point.
Proof.
1. (⇒) Assume x is a cluster point. Then we can choose n1 < n2 < n3 < · · · s.t. |xnk − x| < 1/k. (Why?) This gives a subsequence xnk → x. (⇐) is immediate from the definition.
2. Trivial.
3. (⇐) If not, ∃ε > 0 and ∃ a subseq. xnk so that |xnk − x| ≥ ε. Since xnk is bounded, ∃ a convergent subseq. The limit of that subseq. would be a cluster point of the seq. xn different from x, but there is no such point. Contradiction.
Definition. (1.5.3; limit superior & limit inferior of seq xn)
Define the limit superior lim sup xn in the following way:
• If xn is bounded above, then
  lim sup_{n→∞} xn = the largest cluster point of xn,
  and lim sup xn = −∞ if the set of cluster points is empty.
• If xn is NOT bounded above, then lim sup xn = ∞.
Similarly, we can define the limit inferior lim inf xn.
Examples
• For the seq 1, 0, −1, 1, 0, −1, · · · , lim sup xn = 1 and lim inf xn = −1.
• If xn = n, then lim sup xn = ∞ = lim inf xn.
• Let xn = (−1)^n (1 + n)/n. Then lim sup xn = 1 and lim inf xn = −1.
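Numerically, lim sup and lim inf can be approximated via the equivalent characterization lim sup xn = lim_N sup_{n≥N} xn, by taking the max/min of a long tail (the tail length below is an arbitrary illustrative choice):

```python
# Tail sup/inf approximation of lim sup / lim inf for x_n = (-1)^n (n+1)/n.
x = [(-1) ** n * (n + 1) / n for n in range(1, 100001)]
tail = x[50000:]                        # terms with n > 50000
limsup_approx, liminf_approx = max(tail), min(tail)
assert abs(limsup_approx - 1) < 1e-3    # lim sup x_n = 1
assert abs(liminf_approx + 1) < 1e-3    # lim inf x_n = -1
```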
Definition. (1.6.2; Vector space)
A real vector space V is a set of elements called vectors, with given operations of vector addition + : V × V → V and scalar multiplication · : R × V → V such that the following hold for all v, u, w ∈ V and all λ, µ ∈ R:
1. v + w = w + v, (v + u) + w = v + (u + w), λ(v + w) = λv + λw, λ(µv) = (λµ)v, (λ + µ)v = λv + µv, 1v = v.
2. ∃0 ∈ V s.t. v + 0 = v. ∃ −v ∈ V s.t. v + (−v) = 0.
• A subset of V is called a subspace if it is itself a vector space with the same operations.
• W ⊂ V is a vector subspace of V iff λv + µu ∈ W whenever u, v ∈ W and λ, µ ∈ R.
• The straight line W = {(x1, x2) : x1 = 2x2} is a subspace of R2.
Euclidean space Rn & Definitions & Properties
The Euclidean n-space Rn with the operations (x1, · · · , xn) + (y1, · · · , yn) = (x1 + y1, · · · , xn + yn) and λ(x1, · · · , xn) = (λx1, · · · , λxn) is a vector space of dimension n.
• The standard basis of Rn: e1 = (1, 0, · · · , 0), · · · , en = (0, · · · , 0, 1).
• Unique representation: x = (x1, · · · , xn) ∈ Rn can be expressed uniquely as x = x1e1 + · · · + xnen.
• Inner product of x and y: 〈x, y〉 = ∑_{i=1}^n xi yi.
• Norm of x: ‖x‖ = √〈x, x〉.
• Distance between x and y: dist(x, y) = ‖x − y‖.
• Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖.
• Cauchy-Schwarz inequality: |〈x, y〉| ≤ ‖x‖ ‖y‖.
• Pythagorean theorem: If 〈x, y〉 = 0, then ‖x + y‖² = ‖x‖² + ‖y‖².
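These inequalities are easy to spot-check numerically. The sketch below tests Cauchy-Schwarz and the triangle inequality on random vectors, and the Pythagorean theorem on the orthogonal pair e1, e2 (the small tolerances only absorb floating-point rounding):

```python
import math
import random

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

random.seed(0)
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    # Cauchy-Schwarz: |<x, y>| <= ||x|| ||y||
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y) + 1e-12

# Pythagorean theorem for the orthogonal pair e1, e2 in R^2
e1, e2 = [1.0, 0.0], [0.0, 1.0]
assert inner(e1, e2) == 0
assert abs(norm([1.0, 1.0]) ** 2 - (norm(e1) ** 2 + norm(e2) ** 2)) < 1e-12
```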
Definition. (1.7.1; Metric Space (M , d) equipped with d =distance)
A metric space (M, d) is a set M and a function d : M × M → R such that
1. d(x , y) ≥ 0 for all x , y ∈ M.
2. d(x , y) = 0 iff x = y.
3. d(x , y) = d(y , x) for all x , y ∈ M.
4. d(x , y) ≤ d(x , z) + d(z , y) for all x , y ∈ M.
Example [Fingerprint Recognition] Let M be a data set of fingerprints in the Seoul city police department.
• Motivation: Design an efficient access system to find a target.
• We need to define a dissimilarity function quantifying the distance between the data. The distance d(x, y) between two data x and y must satisfy the above four rules.
• Similarity queries: For a given target x∗ ∈ M and ε > 0, find (arrest) all persons having a fingerprint y ∈ M such that d(y, x∗) < ε.
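On a finite data set the four axioms of Definition 1.7.1 can be checked exhaustively. The helper below is a hypothetical illustration (not from the text); it also shows that squaring the Euclidean distance breaks the triangle inequality, so a squared distance is a dissimilarity but not a metric:

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Exhaustively check the four metric axioms on a finite point set."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < -tol:
            return False                              # 1. nonnegativity
        if (d(x, y) < tol) != (x == y):
            return False                              # 2. d(x, y) = 0 iff x = y
        if abs(d(x, y) - d(y, x)) > tol:
            return False                              # 3. symmetry
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False                              # 4. triangle inequality
    return True

pts = [(0, 0), (1, 0), (0, 1), (2, 2)]
euclid = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
assert is_metric(pts, euclid)
assert not is_metric(pts, lambda p, q: euclid(p, q) ** 2)  # d^2 is not a metric
```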
Definition. (1.7.3; Normed Space (V, ‖ · ‖))
A normed space (V, ‖ · ‖) is a vector space V and a function ‖ · ‖ : V → R called a norm such that
1. ‖v‖ ≥ 0, ∀v ∈ V.
2. ‖v‖ = 0 iff v = 0.
3. ‖λv‖ = |λ|‖v‖, ∀v ∈ V and every scalar λ.
4. ‖v + w‖ ≤ ‖v‖ + ‖w‖, ∀v, w ∈ V.
Examples
• V = R and ‖x‖ = |x| for all x ∈ R.
• V = R2 and ‖v‖ = √(v1² + v2²) for all v = (v1, v2) ∈ R2.
• Let V = C([0, 1]) = all continuous functions on the interval [0, 1]. Define ‖f‖ = sup{|f(x)| : x ∈ [0, 1]} (called the supremum norm).
Proposition.
If (V, ‖ · ‖) is a normed vector space and d(v, w) = ‖v − w‖, then d is a metric on V.
Proof. EASY.
Examples
• For V = C([0, 1]), the metric is
  d(f, g) = ‖f − g‖ = sup{|f(x) − g(x)| : x ∈ [0, 1]}.
  The sup distance between functions is the largest vertical distance between their graphs.
Definition.
A vector space V with a function 〈·, ·〉 : V × V → R is called an inner product space if
1. 〈v, v〉 ≥ 0 for all v ∈ V.
2. 〈v, v〉 = 0 iff v = 0.
3. 〈λv, w〉 = λ〈v, w〉, ∀v, w ∈ V and every scalar λ.
4. 〈v + w, h〉 = 〈v, h〉 + 〈w, h〉, ∀v, w, h ∈ V.
5. 〈v, w〉 = 〈w, v〉, ∀v, w ∈ V.
Examples
1. V = R2 and 〈v, w〉 = v1w1 + v2w2. Two vectors v and w are orthogonal if 〈v, w〉 = 0.
2. V = C([0, 1]) and 〈f, g〉 = ∫₀¹ f(x)g(x) dx.
3. ‖v‖ = √〈v, v〉 defines a norm on V.
Theorem. (Cauchy-Schwarz inequality)
If 〈·, ·〉 is an inner product on a real vector space V, then |〈f, g〉| ≤ ‖f‖‖g‖.
Proof.
• If g = 0, both sides are 0, so suppose g ≠ 0. Let h = g/‖g‖. It suffices to prove that |〈f, h〉| ≤ ‖f‖. (Why? |〈f, g〉| ≤ ‖f‖‖g‖ iff |〈f, h〉| ≤ ‖f‖.)
• Denote α = 〈f, h〉. Then, using ‖h‖ = 1,
  0 ≤ ‖f − αh‖² = 〈f − αh, f − αh〉 = ‖f‖² − α〈h, f〉 − α〈f, h〉 + α²‖h‖² = ‖f‖² − α².
Hence, |α| = |〈f, h〉| ≤ ‖f‖. This completes the proof.
Chapter 2: Topology of M = Rn
Throughout this chapter, assume M = Rn (the Euclidean space) with the metric
d(x, y) = √(∑_{i=1}^n |xi − yi|²) = ‖x − y‖.
Definition. (D(x, ε), open, neighborhood)
• D(x, ε) := {y ∈ M : d(y, x) < ε} is called the ε-ball (or ε-disk) about x.
• A ⊂ M is open if ∀x ∈ A, ∃ε > 0 s.t. D(x, ε) ⊂ A.
• A neighborhood of x is an open set A containing x.
• Open sets: (a, b), D(x, ε), {(x, y) ∈ R2 : 0 < x < 1}.
• The union of an arbitrary collection of open subsets of M is open. (Why?)
• The intersection of a finite number of open subsets of M is open. (An infinite intersection can fail to be open: ∩_{n=1}^∞ (−1/n, 1/n) = {0} is closed, not open.)
2.2 Interior of a set A: int(A)
Definition. (2.2.1; Interior point & interior of A)
Let (M, d) be a metric space and A ⊂ M. x is called an interior point of A if ∃ε > 0 s.t. D(x, ε) ⊂ A. Denote
int(A) := the collection of all interior points of A.
Examples. Proofs are very easy.
• If A = [0, 1], then int(A) = (0, 1).
• int{(x, y) ∈ R2 : 0 < x ≤ 1} = {(x, y) ∈ R2 : 0 < x < 1}.
• If A is open, then int(A) = A.
• For x0 ∈ Rn, int{y ∈ Rn : d(y, x0) ≤ 1} = {y ∈ Rn : d(y, x0) < 1}. (In a general metric space only "⊇" is guaranteed; think of the discrete metric.)
Definition. (2.3-4: Closed sets & Accumulation Points )
• A set B in a metric space M is said to be closed if M \ B is open.
• x ∈ M is an accumulation point (or cluster point) of a set A ⊂ M if ∀ε > 0, D(x, ε) contains some y ∈ A with y ≠ x.
Prove the following:
• Closed sets: [a, b], {y ∈ R2 : d(y, x0) ≤ 1}.
• The union of a finite number of closed subsets of M is closed. (Note that ∪_{n=1}^∞ [1/n, 2 − 1/n] = (0, 2) is open.)
• The intersection of an arbitrary family of closed subsets of M is closed. (Why?)
• Every finite set in Rn is closed.
• A set A ⊂ M is closed iff every accumulation point of A belongs to A.
• A = {1, 1/2, 1/3, 1/4, · · · } ∪ {0} is closed.
Definition. ( Closure of A & Boundary of A)
Let (M, d) be a metric space and A ⊂ M.
• cl(A) := the intersection of all closed sets containing A.
• ∂A = bd(A) := cl(A) ∩ cl(M \ A) is called the boundary of A.
Examples
• Closure: cl((0, 1)) = [0, 1], cl{(x, y) ∈ R2 : x > y} = {(x, y) ∈ R2 : x ≥ y}.
• Boundary: bd((0, 1)) = {0, 1}, bd{(x, y) ∈ R2 : x > y} = {(x, y) ∈ R2 : x = y}.
Let (M, d) be a metric space and A ⊂ M. Prove that
• cl(A) = A ∪ {accumulation points of A}.
• x ∈ cl(A) iff inf{d(x, y) : y ∈ A} = 0.
• x ∈ bd(A) iff ∀ε > 0, D(x, ε) ∩ A ≠ ∅ & D(x, ε) ∩ (M \ A) ≠ ∅.
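The distance characterization of cl(A) is easy to visualize numerically. The sketch below samples A = (0, 1) on a fine grid (a finite stand-in for the actual set, so the inf becomes a min over samples):

```python
# dist(x, A) = inf{|x - y| : y in A}, approximated over a dense sample of (0, 1).
A = [k / 10000 for k in range(1, 10000)]
def dist_to_A(x):
    return min(abs(x - y) for y in A)

assert dist_to_A(0.0) < 1e-3   # 0 is in cl((0, 1)) = [0, 1]
assert dist_to_A(1.0) < 1e-3   # 1 is in cl((0, 1))
assert dist_to_A(2.0) > 0.9    # 2 is not in cl((0, 1))
```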
Definition. (Sequences & Completeness)
Let (M, d) be a metric space and xk a sequence of points in M.
• limk→∞ xk = x iff ∀ε > 0, ∃N s.t. k ≥ N ⇒ d(x, xk) < ε.
• xk is a Cauchy seq. iff ∀ε > 0, ∃N s.t. k, l ≥ N ⇒ d(xk, xl) < ε.
• xk is bounded iff ∃B > 0 & x0 ∈ M s.t. d(xk, x0) < B for all k.
• x is a cluster point of the seq. xk iff ∀ε > 0, ∃ infinitely many k with d(xk, x) < ε.
The space M is called complete if every Cauchy seq. in M converges to a point in M.
In a metric space, it is easy to prove the following:
• Every convergent seq. is a Cauchy seq.
• A Cauchy seq. is bounded.
• If a subseq. of a Cauchy seq. converges to x, then the sequence itself converges to x.
Chapter 3: Compact & Connected sets
Throughout this chapter, we assume that (M, d) is a metric space.
Definition. (3.1.1: Sequentially compact & Compact)
Let A ⊂ M.
• A is called sequentially compact if EVERY sequence in A has a subsequence that converges to a point in A.
• A is compact if EVERY open cover of A has a FINITE subcover.
  • An open cover of A is a collection {Ui} of open sets such that A ⊂ ∪iUi.
  • An open cover {Ui} of A is said to have a finite subcover if a finite subcollection of {Ui} covers A.
• In chapter 1, we proved that every sequence xn in the closed interval [a, b] has a subsequence that converges to a point in [a, b]. Hence, [a, b] is sequentially compact.
Examples of compact set
1. Prove that the entire line R is NOT compact. Proof. Clearly, {D(n, 1) : n = 0, ±1, ±2, · · · } is an open cover of R but it has no finite subcover (why?).
2. Prove that A = (0, 1] is not compact. Proof. Clearly, (0, 1] = ∪_{n=1}^∞ (1/n, 2). Hence, {(1/n, 2) : n = 1, 2, · · · } is an open cover of (0, 1] with no finite subcover.
3. Heine-Borel thm. Let A ⊂ M = Rn. A is compact iff A is closed and bounded. Proof: later.
4. Give an example of a bounded and closed set that is not compact.
   Sol’n. Let M = {en : n = 1, 2, · · · } where e1 = (1, 0, 0, · · · ), e2 = (0, 1, 0, · · · ), · · · . Let d(ei, ej) = √2 if i ≠ j. Then (M, d) is a metric space.
   • The entire metric space M is closed and bounded (why?).
   • {D(en, 1) : n = 1, 2, · · · } is an open cover of M but it has no finite subcover (why?). Hence, M is not compact.
Theorem. (3.1.3; Bolzano-Weierstrass theorem)
A ⊂ M is compact iff A is sequentially compact.
• Lemma 1: Let A ⊂ M. If A is compact, then A is closed.
  Proof. We will show M \ A is open. Let x ∈ M \ A.
  1. A ⊂ ∪_{n=1}^∞ Un where Un := {y ∈ M : d(y, x) > 1/n} is an open set (every y ∈ A has d(y, x) > 0).
  2. Since A is compact and {Un} covers A, ∃ a finite subcover, that is, ∃N s.t. A ⊂ ∪_{n=1}^N Un = UN.
  3. Hence, D(x, 1/N) ⊂ (UN)^c ⊂ A^c = M \ A, and therefore M \ A is open.
• Lemma 2: Let A ⊂ B ⊂ M. If B is compact and A is closed, then A is compact.
  Proof. Let {Ui} be an open cover of A.
  1. Set V = M \ A. Note that V is open.
  2. Thus {Ui, V} is an open cover of B.
  3. Since B is compact, B has a finite subcover, say {U1, · · · , UN, V}. Hence, A ⊂ U1 ∪ · · · ∪ UN.
• Lemma 4: If A is sequentially compact, then A is totally bounded.
  1. Definition of totally bounded: A ⊂ M is totally bounded if ∀ε > 0, ∃ a finite set {x1, · · · , xN} ⊂ M s.t. A ⊂ ∪_{i=1}^N D(xi, ε).
  2. Proof. If not, then for some ε > 0 we cannot cover A with finitely many ε-disks.
     (i) Choose x1 ∈ A and x2 ∈ A \ D(x1, ε).
     (ii) By assumption, we can repeat: choose xn ∈ A \ ∪_{i=1}^{n−1} D(xi, ε) for n = 2, 3, · · · .
     (iii) This seq. {xn} satisfies d(xn, xm) ≥ ε for all n ≠ m.
     (iv) Hence, xn has no convergent subseq., a contradiction.
• Summary. Let A ⊂ M.
  • A is compact ⇒ A is closed.
  • A closed subset of a compact set is compact.
  • A is sequentially compact ⇒ A is totally bounded.
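The construction in the proof of Lemma 4, read forwards, is the standard greedy way to build an ε-net: keep picking a point not covered by the disks chosen so far. A sketch on a concrete grid in [0, 1]² (the grid and ε = 0.3 are illustrative choices, not from the text):

```python
# Greedy eps-net: pick any uncovered point of A as a new center, repeat.
# For a totally bounded set this loop stops after finitely many centers.
def greedy_net(A, eps, dist):
    centers = []
    for x in A:
        if all(dist(x, c) >= eps for c in centers):
            centers.append(x)   # x is not yet covered: make it a center
    return centers

A = [(i / 10, j / 10) for i in range(11) for j in range(11)]   # grid in [0,1]^2
d = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
net = greedy_net(A, 0.3, d)
# every point of A lies within eps of some center, with far fewer centers
assert all(any(d(x, c) < 0.3 for c in net) for x in A)
assert len(net) < len(A)
```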
Proof of B-W thm (⇒): If A is compact, then A is sequentially compact.
Let A be compact and let {xn} be a seq. in A.
1. To derive a contradiction, assume that {xn} has no convergent subseq.
2. Then {xn} takes infinitely many distinct values {yk}, and the set {yk} has no accumulation point. (Why? If it had one, ∃ a convergent subseq.)
3. Hence, ∃ some neighborhood Uk of yk containing no other yi.
4. {yk} is closed because it has no accumulation points. Hence, {yk} is compact by Lemma 2 (a closed subset of the compact set A is compact).
5. But {Uk} is an open cover of {yk} that has no finite subcover, a contradiction.
6. Hence, xn has a convergent subsequence; its limit lies in A, since A is closed by Lemma 1.
Hence, xn has a subsequence that converges to a point in A.
Proof of B-W thm (⇐): If A is sequentially compact, then A is compact.
Suppose {Ui} is an open cover of A. We need to prove that {Ui} has a finite subcover.
• Claim: ∃r > 0 s.t. ∀y ∈ A, D(y, r) ⊂ Ui for some Ui. Why?
  1. If not, ∃yn ∈ A s.t. D(yn, 1/n) is not contained in any Ui.
  2. By assumption, {yn} has a convergent subseq., say ynk → z ∈ A. Since z ∈ A ⊂ ∪iUi, z ∈ Ui0 for some Ui0.
  3. Since Ui0 is open, ∃ε > 0 s.t. D(z, ε) ⊂ Ui0.
  4. Since ynk → z, ∃N = nk0 ≥ 2/ε s.t. yN ∈ D(z, ε/2).
  5. But then D(yN, 1/N) ⊂ D(z, ε) ⊂ Ui0 (why?), a contradiction.
• Since A is totally bounded (see Lemma 4), we can write A ⊂ D(y1, r) ∪ · · · ∪ D(yn, r) for finitely many yi.
• Since each D(yk, r) ⊂ Uik for some Uik, A ⊂ Ui1 ∪ · · · ∪ Uin, a finite subcover. Hence, A is compact.
Theorem. (3.1.5; Compact ⇔ Complete and Totally Bounded)
Let A ⊂ M. A is compact iff A is complete and totally bounded.
(Proof of ⇒) Assume A is compact.
1. A is compact ⇒ totally bounded & sequentially compact.
2. A is sequentially compact ⇒ A is complete.
(Proof of ⇐) Assume A is complete and totally bounded. It suffices to prove that A is sequentially compact. Assume that {yn} is a sequence in A.
1. We may assume that the yk are all distinct. (Why? If not, ...)
2. Since A is totally bounded, for each k = 1, 2, · · · , ∃xk1, · · · , xkLk ∈ M s.t. A ⊂ D(xk1, 1/k) ∪ · · · ∪ D(xkLk, 1/k).
3. Next page...
Theorem. (3.1.5; continued)
(Proof of ⇐, continued)
3. For k = 1, infinitely many yn lie in one of the disks D(x1j, 1). Hence, we can select a subseq. {y11, y12, · · · } lying entirely in one of these disks.
4. Repeat the previous step for k = 2 to obtain a subseq. {y21, y22, · · · } of {y11, y12, · · · } lying entirely in one of the disks D(x2j, 1/2).
5. Now choose the diagonal subsequence y11, y22, y33, · · · . This sequence is Cauchy because, for i ≤ j, both yii and yjj lie in one disk of radius 1/i, so d(yii, yjj) ≤ 2/min{i, j}.
6. Since A is complete, yii converges to a point in A.
Theorem. (3.2.1, Heine-Borel thm.)
Let A ⊂ M = Rn. A is compact iff A is closed and bounded.
Proof.
• Recall Thm 3.1.5: A is compact iff A is complete and totally bounded.
• In M = Rn, A is complete iff A is closed (a closed subset of the complete space Rn is complete).
• Since M = Rn is a Euclidean space,
  A is bounded ⇔ A is totallybounded.
Caution: If M is not a Euclidean space, the last equivalence need not hold. See Example 3.1.8 for a set A that is bounded but not totally bounded.
Theorem. (3.3.1: Nested Set Property)
Let Fk be a sequence of compact non-empty sets in a metric space M such that F1 ⊇ F2 ⊇ F3 ⊇ · · · . Then ∩_{k=1}^∞ Fk ≠ ∅.
Proof.
1. For each n, choose xn ∈ Fn.
2. Since {xn} ⊂ F1 and F1 is compact, ∃ a subseq. {xnk} that converges to some point z ∈ F1, that is, xnk → z ∈ F1.
3. Relabeling (passing to the subsequence), we may assume that xn → z. (Why is this harmless?)
4. n > N =⇒ xn ∈ Fn ⊂ FN.
5. Since limj→∞ xN+j = z, xN+j ∈ FN, and FN is compact (hence closed), it must be that
   z ∈ FN, N = 1, 2, 3, · · · .
This completes the proof.
Definition. (Path-Connected Sets)
• φ : [a, b] → M is said to be continuous if
  tk ∈ [a, b], tk → t =⇒ φ(tk) → φ(t).
• A continuous path joining x, y ∈ M is a continuous mapping φ : [a, b] → M such that φ(a) = x, φ(b) = y.
• A ⊂ M is said to be path-connected if for any x, y ∈ A, there exists a continuous path φ : [a, b] → M joining x and y such that φ([a, b]) ⊂ A.
Definition. (3.5.1: Separate, Connected Sets)
Let A be a subset of a metric space M.
• Two open sets U, V are said to separate A if
  1. U ∩ V ∩ A = ∅,
  2. U ∩ A ≠ ∅ & V ∩ A ≠ ∅,
  3. A ⊂ U ∪ V.
• A is disconnected if such sets U, V exist.
• A is connected if such sets U, V do not exist.
Theorem. (3.3.1)
Path-connected sets are connected.
1. Clearly, [a, b] is connected.
2. To derive a contradiction, suppose A is path-connected but not connected. Then ∃ open sets U, V such that
   (i) U ∩ V ∩ A = ∅ & A ⊂ U ∪ V,
   (ii) ∃x ∈ U ∩ A & ∃y ∈ V ∩ A.
3. Since A is path-connected, ∃ a continuous path φ : [a, b] → M s.t. φ(a) = x, φ(b) = y, φ([a, b]) ⊂ A.
4. By Theorem 4.2.1, which we will learn soon, φ([a, b]) is connected. This is a contradiction since U, V separate φ([a, b]).
Example 3.1
• Show that A := {x ∈ Rn : ‖x‖ ≤ 1} is compact and connected.
Proof.
1. Since A is closed and bounded, A is compact by the Heine-Borel thm.
2. To prove connectedness, let x, y ∈ A.
3. Define φ : [0, 1] → Rn by φ(t) = tx + (1 − t)y. Clearly, φ is a continuous path joining φ(0) = y and φ(1) = x.
4. ‖φ(t)‖ ≤ t‖x‖ + (1 − t)‖y‖ ≤ t + (1 − t) = 1 for t ∈ [0, 1]. Hence, φ([0, 1]) ⊂ A.
5. Hence, A is path-connected, and therefore connected.
Example 3.2
• Let A ⊂ Rn, x ∈ A and y ∈ Rn \ A. Let φ : [0, 1] → Rn be a continuous path joining x and y. Show that ∃t0 s.t. φ(t0) ∈ bd(A).
1. Let t0 = sup{t : φ([0, t]) ⊂ A}. This is well-defined because φ(0) = x ∈ A.
2. If t0 = 1, clearly y = φ(t0) ∈ bd(A).
3. Assume 0 ≤ t0 < 1. From the definition of t0, for n = 1, 2, · · · , ∃tn s.t. t0 ≤ tn ≤ t0 + 1/n & φ(tn) ∈ Ac.
4. Since φ(tn) → φ(t0) and φ(tn) ∈ Ac, we get φ(t0) ∈ cl(Ac); also φ(t0) ∈ cl(A) (why?). Hence φ(t0) ∈ bd(A).
Chapter 4. Continuous Mappings
Throughout this chapter, we assume that M = Rn and N = Rm are Euclidean spaces with the standard metrics
d(x, y) = ‖x − y‖ = √(∑_{j=1}^n (xj − yj)²), x, y ∈ M,
ρ(v, w) = ‖v − w‖ = √(∑_{j=1}^m (vj − wj)²), v, w ∈ N.
Please note that the same symbol ‖ · ‖ may denote a different norm depending on its context.
Throughout this chapter, we also assume that A ⊂ M = Rn and f : A → N = Rm is a mapping.
Definition. (4.1.1: Continuity of f : A → N)
• Suppose that x0 ∈ {accumulation points of A}. We write limx→x0 f(x) = b if ∀ε > 0, ∃δ > 0 s.t.
  0 < ‖x − x0‖ < δ & x ∈ A ⇒ ‖f(x) − b‖ < ε.
• Let x0 ∈ A. We say that f is continuous at x0 if either x0 ∉ {accumulation points of A} or limx→x0 f(x) = f(x0).
• Let B ⊂ A. f is called continuous on B if f is continuous at each point of B. If B = A, we just say that f is continuous.
Theorem. (4.1.4: Continuity of f : A → N)
The following assertions are equivalent.
1. f is continuous on A
2. For every convergent seq. xk → x0 in A, we have f(xk) → f(x0).
3. For each open set U in N, f⁻¹(U) is open relative to A; that is, f⁻¹(U) = A ∩ V for some open set V.
4. For each closed set F in N, f⁻¹(F) is closed relative to A; that is, f⁻¹(F) = A ∩ G for some closed set G.
Proof. 1 ⟹ 2 (easy), 2 ⟹ 4 (below), 4 ⟹ 3 (easy), 3 ⟹ 1 (below).
Proof of (2 =⇒ 4)
Let F ⊂ N be closed. We want to prove that f⁻¹(F) is closed relative to A. We begin by reviewing the definition of closed.
1. B is closed iff B = B ∪ {accumulation points of B}.
2. B is closed iff for every sequence {xk} ⊂ B with xk → x0, we necessarily have x0 ∈ B.
3. B ⊂ A is closed relative to A iff B = (B ∪ {accumulation points of B}) ∩ A.
4. B ⊂ A is closed relative to A iff for every sequence {xk} ⊂ B with xk → x0 ∈ A, we necessarily have x0 ∈ B.
5. Proof of (2 ⟹ 4). Let xk ∈ f⁻¹(F) and let xk → x0 ∈ A. By 2, f(xk) → f(x0). Since F is closed, f(x0) ∈ F. ∴ x0 ∈ f⁻¹(F). ∴ f⁻¹(F) is closed relative to A.
Proof of (3 =⇒ 1)
For given x0 ∈ A and ε > 0, we must find δ > 0 such that
‖x − x0‖ < δ & x ∈ A (i.e. x ∈ D(x0, δ) ∩ A) ⇒ ‖f(x) − f(x0)‖ < ε (i.e. f(x) ∈ D(f(x0), ε)).
1. Since D(f(x0), ε) is open, by 3, f⁻¹(D(f(x0), ε)) is open relative to A.
   ∴ f⁻¹(D(f(x0), ε)) = A ∩ V for some open set V.
2. Since x0 ∈ V and V is open, ∃δ > 0 s.t. D(x0, δ) ⊂ V.
3. Hence, D(x0, δ) ∩ A ⊂ f⁻¹(D(f(x0), ε)), and this completes the proof.
Theorem. (4.2.1: f (connected) is connected if f ∈ C (M))
Suppose that f : M → N is continuous and let K ⊂ M.
(i) If K is connected, so is f (K ).
(ii) If K is path-connected, so is f (K ).
Proof of (i). Suppose f(K) is not connected.
1. From the definition of disconnectedness, ∃ open U, V s.t.
   f(K) ⊂ U ∪ V, U ∩ V ∩ f(K) = ∅, U ∩ f(K) ≠ ∅, V ∩ f(K) ≠ ∅.
2. Since f is continuous, f⁻¹(U) and f⁻¹(V) are open. Moreover, K ⊂ f⁻¹(U) ∪ f⁻¹(V), f⁻¹(U) ∩ f⁻¹(V) ∩ K = ∅, f⁻¹(U) ∩ K ≠ ∅, f⁻¹(V) ∩ K ≠ ∅.
3. Hence, K is disconnected, a contradiction.
Proof of (ii). If K is path-connected, so is f (K ).
1. Let v, w ∈ f(K) and let x, y ∈ K s.t. f(x) = v, f(y) = w.
2. Since K is path-connected, ∃ a continuous curve c : [0, 1] → M s.t.
   c(t) ∈ K (0 ≤ t ≤ 1), c(0) = x, c(1) = y.
3. Since f is continuous, it is easy to show that c̃(t) := f(c(t)) ∈ f(K) for 0 ≤ t ≤ 1 and that c̃ : [0, 1] → N is a continuous path joining v and w.
4. Hence, f(K) is path-connected.
Theorem. (4.2.2: f (compact) is compact if f ∈ C (M))
Suppose that f : M → N is continuous and K ⊂ M is compact. Then f(K) is compact.
Proof. It suffices to prove that f(K) is sequentially compact.
1. Let vn ∈ f(K) and let xn ∈ K s.t. f(xn) = vn.
2. Since K is compact, ∃ a convergent subsequence, say xnk → x0 ∈ K.
3. Since f is continuous, vnk = f(xnk) → f(x0) ∈ f(K). This proves that f(K) is sequentially compact.
Examples
Let f : R2 → R be a continuous map. Denote x = (x1, x2).
• Let f(x) = x1 for x ∈ R2. If K ⊂ R2 is compact, so is f(K) = {x1 : x = (x1, x2) ∈ K}. (Why? Since f is continuous and K is compact, f(K) is compact.)
• Let f(x) = 7 for x ∈ R2. The set {7} is compact, while R2 = f⁻¹({7}) is not compact. (So the preimage of a compact set under a continuous map need not be compact.)
• The set A = {f(x) : ‖x‖ = 1} is a closed interval. (Why? K = {x ∈ R2 : ‖x‖ = 1} is compact and connected. Hence, A = f(K) is compact and connected.)
Theorem.
(1) Let f : A ⊂ M → R and g : A ⊂ M → R be continuous at x0. Then
• f ± αg is continuous at x0 for any α ∈ R.
• fg is continuous at x0.
• f/g is continuous at x0 if g(x0) ≠ 0.
(2) Suppose f : A ⊂ M → N and h : B ⊂ N → Rp are continuous and f(A) ⊂ B. Then h ◦ f : A ⊂ M → Rp is also continuous.
Proof. EASY
Theorem. (4.4.1: Maximum-Minimum Principle)
Let f : A ⊂ M → R be continuous and let K be a compact subset of A. Then,
• f (K ) is bounded.
• ∃x0, y0 ∈ K such that
  f(x0) = inf f(K) = inf_{x∈K} f(x) & f(y0) = sup f(K) = sup_{x∈K} f(x).
Proof. Since K is compact and f is continuous on K ⊂ A, f(K) is compact. Hence, f(K) is closed and bounded in R by the Heine-Borel thm. Being closed and bounded, f(K) contains both inf f(K) and sup f(K). This completes the proof.
Theorem. (4.5.1: Intermediate Value Theorem)
Let f : A ⊂ M → R be continuous. Assume K is a connected subset of A, x, y ∈ K, and f(x) < f(y). Then,
• For every number c ∈ R such that f (x) < c < f (y),
∃ z ∈ K s.t. f (z) = c
Proof. Since K is connected and f is continuous on K ⊂ A, f(K) is connected. A connected subset of R is an interval, so [f(x), f(y)] ⊂ f(K).
∴ ∃z ∈ K s.t. f(z) = c. This completes the proof.
4.6 Uniform Continuity
Throughout this section, we assume that f : A ⊂ Rn → Rm iscontinuous.
• Definition. Let B ⊂ A. f is uniformly continuous on B if for every ε > 0, there is δ > 0 s.t.
  ‖x − y‖ < δ & x, y ∈ B ⇒ ‖f(x) − f(y)‖ < ε.
• Example. Consider f : R → R, f(x) = x². Then f is continuous on R, but it is not uniformly continuous. Why? Let xn = n + 1/n and yn = n. Then |xn − yn| = 1/n → 0, while |f(xn) − f(yn)| = 2 + 1/n² ≥ 2.
• Example. Consider f : (0, 1) → R, f(x) = 1/x. Then f is continuous on (0, 1), but it is not uniformly continuous. Why? Let xn = 1/n. Then |xn+1 − xn| < 1/n → 0, while |f(xn+1) − f(xn)| = 1.
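The witness sequences from the first example can be checked directly (a numeric sketch; the tiny tolerance only absorbs floating-point rounding):

```python
# f(x) = x^2: x_n = n + 1/n and y_n = n get arbitrarily close,
# yet |f(x_n) - f(y_n)| = 2 + 1/n^2 stays >= 2, so f is not uniformly continuous.
f = lambda x: x * x
for n in range(1, 200):
    xn, yn = n + 1 / n, n
    assert abs(xn - yn) <= 1 / n + 1e-9   # the points squeeze together
    assert abs(f(xn) - f(yn)) > 1.9       # the images stay far apart
```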
Theorem. (Uniform Continuity Theorem)
Let f : A ⊂ Rn → Rm be continuous and let K ⊂ A be compact. Then f is uniformly continuous on K.
Proof.
1. Let ε > 0 be given. Since f is continuous on K, for each x ∈ K, ∃δx > 0 s.t. f(D(x, δx) ∩ K) ⊂ D(f(x), ε/2).
2. Since K ⊂ ∪_{x∈K} D(x, δx/2) and K is compact, ∃{x1, · · · , xN} ⊂ K s.t. K ⊂ ∪_{j=1}^N D(xj, δxj/2). Let δ = (1/2) min{δx1, · · · , δxN}.
3. If ‖x − y‖ < δ and x, y ∈ K, then ∃xj s.t. ‖x − xj‖ < δxj/2. Since ‖y − xj‖ ≤ ‖y − x‖ + ‖x − xj‖ < δ + δxj/2 ≤ δxj, both x and y lie in D(xj, δxj), so
   ‖f(x) − f(y)‖ ≤ ‖f(x) − f(xj)‖ + ‖f(xj) − f(y)‖ < ε/2 + ε/2 = ε.
Chapter 5. Uniform Convergence
This chapter deals with very important results in physical science:
• a basic iteration technique called the contraction mapping principle (5.7.1)
• some applications to differential and integral equations and to some problems in control theory (5.7.2, 5.7.3, 5.7.10)
To study such results, we need
• compactness in a complete metric space (5.5.3)
• uniform convergence, equi-continuity (5.6.2)
Definition. (Pointwise convergence & Uniform Convergence)
Let N be a metric space with the metric ρ, A a set, and fk : A → N, k = 1, 2, · · · .
• fk → f pointwise if for each x ∈ A, limk→∞ fk(x) = f(x), i.e.
  ∀x ∈ A, limk→∞ ρ(fk(x), f(x)) = 0.
• fk → f uniformly if limk→∞ supx∈A ρ(fk(x), f(x)) = 0, i.e.
  ∀ε > 0, ∃N s.t. k > N ⇒ supx∈A ρ(fk(x), f(x)) < ε.
Examples:
• fk(x) = x^k → 0 pointwise in (0, 1). (Why?)
• fk(x) = x^k does NOT converge to 0 uniformly in (0, 1), since supx∈(0,1) x^k = 1 for every k.
• Show that fn(x) = x^n/(1 + x^n) converges pointwise on [0, 2] but that the convergence is not uniform.
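The gap between the two modes of convergence is visible numerically. A sketch for fk(x) = x^k on a finite grid (a finite sample of (0, 1), so the computed max only approximates the sup from below):

```python
# Pointwise vs uniform: x^k -> 0 at each fixed x in (0, 1),
# but sup over (0, 1) of x^k equals 1 for every k.
grid = [j / 1000 for j in range(1, 1000)]    # sample of (0, 1)
for k in (1, 5, 50):
    assert max(x ** k for x in grid) > 0.9   # sup stays near 1: not uniform
assert 0.5 ** 50 < 1e-15                     # but each fixed x tends to 0
```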
Definition. (5.1.3: Does ∑_k gk make sense?)
Denote fn(x) = ∑_{k=1}^n gk(x).
• ∑_k gk = f pointwise if fn → f pointwise.
• ∑_k gk = f uniformly if fn → f uniformly.
Examples.
• ∑_{k=0}^∞ (−1)^k x^{2k+1}/(2k + 1)! = sin x uniformly on the interval [−100, 100].
• ∑_k x^k = 1/(1 − x) converges uniformly on [−0.9, 0.9].
• ∑_k x^k = 1/(1 − x) converges pointwise (NOT uniformly) on (−1, 1).
• ∑_k x^k does not converge on R \ (−1, 1).
The Weierstrass M-test
Theorem. (5.2.1: Cauchy Criterion)
Let V be a complete normed vector space with norm ‖ · ‖, and let A be a set. Let fk : A → V be a sequence of functions. Then fk converges uniformly on A iff
∀ε > 0, ∃N s.t. l, k > N ⇒ supx∈A ‖fk(x) − fl(x)‖ < ε.
Proof of ⇒.
1. Assume fk → f uniformly. Let ε > 0 be given.
2. Then ∃N s.t. k > N ⇒ ‖fk − f‖ := supx∈A ‖fk(x) − f(x)‖ < ε/2.
3. Hence, l, k > N ⇒ ‖fk − fl‖ ≤ ‖fk − f‖ + ‖fl − f‖ < ε/2 + ε/2 = ε.
Theorem. (5.2.1; continued)
Proof of ⇐.
1. From the assumption, fk(x) is a Cauchy sequence in V for each x ∈ A.
2. Hence, for each x ∈ A, limk fk(x) exists and we can define f(x) = limk fk(x).
3. Let ε > 0 be given. From the assumption, ∃N s.t. l, k > N ⇒ supx∈A ‖fk(x) − fl(x)‖ < ε/2.
4. From 2, ∀x ∈ A, ∃Nx s.t. l > Nx ⇒ ‖f(x) − fl(x)‖ < ε/2.
5. From 3 and 4, if k > N and x ∈ A, then, choosing any l > max{N, Nx},
   ‖fk(x) − f(x)‖ ≤ ‖fk(x) − fl(x)‖ + ‖fl(x) − f(x)‖ < ε/2 + ε/2 = ε.
6. From 5, k > N ⇒ supx∈A ‖fk(x) − f(x)‖ ≤ ε.
Theorem. (5.2.2: Weierstrass M-test)
Let V be a complete normed vector space with norm ‖ · ‖, and let A be a set. Suppose that gk : A → V are functions such that supx∈A ‖gk(x)‖ ≤ Mk and ∑_{k=1}^∞ Mk < ∞. Then ∑_{k=1}^∞ gk converges uniformly.
Proof.
1. Denote fn(x) = ∑_{k=1}^n gk(x).
2. Then ‖fn+ℓ(x) − fn(x)‖ = ‖∑_{k=n+1}^{n+ℓ} gk(x)‖ ≤ ∑_{k=n+1}^{n+ℓ} Mk.
3. Since limn→∞ ∑_{k=n}^∞ Mk = 0, it follows from 2 and Theorem 5.2.1 that fn converges uniformly.
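A numeric sketch of the M-test bound for the geometric series on [−0.9, 0.9] (the grid and the truncation order n = 100 are illustrative choices): the uniform error of the partial sum is controlled by the tail ∑_{k>n} Mk = 0.9^{n+1}/0.1.

```python
# g_k(x) = x^k on [-0.9, 0.9] with M_k = 0.9^k: the partial sums of 1/(1 - x)
# converge uniformly, with error bounded by the tail of sum M_k.
grid = [j / 100 for j in range(-90, 91)]
n = 100
partial = lambda x: sum(x ** k for k in range(n + 1))
err = max(abs(partial(x) - 1 / (1 - x)) for x in grid)
tail_bound = 0.9 ** (n + 1) / 0.1      # sum_{k > n} 0.9^k
assert err <= tail_bound + 1e-9
assert tail_bound < 3e-4               # the uniform error is already tiny
```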
5.5 The space of continuous functions
Throughout this section, we assume M = Rm, A ⊂ M, and N = Rn (N, M: complete normed spaces).
• Denote C(A, N) = {f : A → N | f is continuous}. Then C(A, N) is a vector space.
• f ∈ C(A, N) is said to be bounded if there is a constant C such that ‖f(x)‖ < C for all x ∈ A.
• Denote Cb(A, N) = {f ∈ C(A, N) : f is bounded}.
• Define ‖f‖ = supx∈A ‖f(x)‖.
• ‖f‖ is a measure of the size of f and is called the norm of f.
Theorem. (5.5.1-3: Cb(A, N) is a complete normed space)
Let A ⊂ M = Rm, N = Rn. The set Cb(A, N) is a complete normed space equipped with the norm ‖f‖ = supx∈A ‖f(x)‖; that is,
1. Cb(A, N) is a normed space:
   • ‖f‖ ≥ 0 and ‖f‖ = 0 iff f = 0.
   • ‖αf‖ = |α|‖f‖ for α ∈ R, f ∈ Cb.
   • ‖f + g‖ ≤ ‖f‖ + ‖g‖.
2. Completeness: Every Cauchy sequence {fk} in Cb(A, N) converges to a function f ∈ Cb(A, N), that is,
   limk→∞ ‖fk − f‖ = limk→∞ supx∈A ‖fk(x) − f(x)‖ = 0.
• Clearly, Cb(A, N) is a normed space. (EASY!)
• From the definition, fk → f uniformly iff fk → f in Cb.
• From Cauchy criterion (Theorem 5.2.1), Cb(A, N) is complete.
Examples
• Let B = {f ∈ C([0, 1], R) : f(x) > 0 for all x ∈ [0, 1]}. Show that B is open in C([0, 1], R).
Proof.
1. In order to prove that B is open, we must show that ∀f ∈ B, ∃ε > 0 s.t. D(f, ε) ⊂ B.
2. Let f ∈ B. Since [0, 1] is compact, f attains a minimum value, say m > 0, at some point of [0, 1]. Hence, infx∈[0,1] f(x) = m.
3. Let ε = m/2. We will show D(f, ε) ⊂ B.
   Proof. If g ∈ D(f, ε), then ‖g − f‖ < ε, and
   ∴ g(x) ≥ f(x) − |g(x) − f(x)| ≥ f(x) − ‖f − g‖ > m − ε = m/2 > 0
   for all x ∈ [0, 1]. Hence, g ∈ B. ∴ D(f, ε) ⊂ B.
• Prove that cl(B) = D := {f ∈ C([0, 1], R) : infx∈[0,1] f(x) ≥ 0}.
Proof.
1. D is closed: if fn ∈ D and fn → f uniformly, then fn(x) → f(x) pointwise and ∴ infx∈[0,1] f(x) ≥ 0.
2. If f ∈ D, then fn(x) := f(x) + 1/n ∈ B and ‖fn − f‖ = 1/n → 0, so D ⊂ cl(B).
∴ B ⊂ D ⊂ cl(B) and D is closed, hence cl(B) = D.
Examples
• Consider a sequence fn ∈ Cb such that ‖fn+1 − fn‖ ≤ rn, where ∑ rn is convergent. Prove that fn converges.
Proof.
1. Let ε > 0 be given.
2. Since ∑ rn is convergent, ∃N s.t. n > N ⇒ ∑_{k=n}^∞ rk < ε.
3. Hence, if n > N, then
   ‖fn+k − fn‖ = ‖∑_{j=n}^{n+k−1} (fj+1 − fj)‖ ≤ ∑_{j=n}^{n+k−1} ‖fj+1 − fj‖ ≤ ∑_{j=n}^∞ rj < ε.
4. From 3, fn is a Cauchy sequence in the complete space Cb, so it converges.
Arzela-Ascoli Theorem
Throughout this section, we assume that M = Rm, A ⊂ M, N = Rn (N, M: complete normed spaces).
Definition. (5.6.1: Equi-continuous)
Assume B ⊂ C(A, N).
• We say that B is equi-continuous if
  ∀ε > 0, ∃δ > 0 s.t. ‖x − y‖ < δ & x, y ∈ A ⇒ sup_{f∈B} ‖f(x) − f(y)‖ < ε.
• We say B is pointwise compact iff Bx = {f(x) : f ∈ B} is compact in N for each x ∈ A.
Example 5.6.4 (Compact sequence)
Let fn ⊂ Cb([0, 1], R) be such that f′n exists and
sup_n ‖fn‖ ≤ C & sup_n (sup_{x∈(0,1)} |f′n(x)|) ≤ C
for a positive constant C. Prove that B := {fn} is equi-continuous.
Proof.
• By the mean value theorem,
  |fn(x) − fn(y)| ≤ sup_{z∈(0,1)} |f′n(z)| · |x − y| ≤ C|x − y| for all n.
• Hence, for given ε > 0, we can choose δ = ε/C, and then
  |x − y| < δ & x, y ∈ [0, 1] ⇒ sup_n |fn(x) − fn(y)| ≤ C|x − y| < ε.
Hence, B := {fn} is equi-continuous. (So {fn} has a uniformly convergent subsequence. Why? See the Arzela-Ascoli theorem.)
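The uniform Lipschitz bound from the mean value theorem can be spot-checked on a concrete family. The choice fn(x) = sin(nx)/n below is an illustrative assumption, not from the text (it satisfies |f′n| ≤ 1, i.e. C = 1):

```python
import math

# f_n(x) = sin(nx)/n has |f_n'(x)| = |cos(nx)| <= 1, so the MVT bound gives
# |f_n(x) - f_n(y)| <= |x - y| uniformly in n: the family is equi-continuous.
for n in range(1, 100):
    f = lambda t: math.sin(n * t) / n
    for x, y in [(0.1, 0.11), (0.5, 0.52), (0.9, 0.93)]:
        assert abs(f(x) - f(y)) <= abs(x - y) + 1e-12
```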
Theorem. (5.6.2: Arzela-Ascoli theorem)
Let A be compact and B ⊂ C(A, N). If B is closed, equi-continuous, and pointwise compact, then B is compact; that is, any sequence fn in B has a uniformly convergent subsequence (with limit in B).
The proof strategy is based on the Bolzano-Weierstrass property.
Theorem. (Special case of Arzela-Ascoli theorem)
Let B ⊂ C([0, 1], R). If B is closed, equi-continuous, and bounded, then B is compact.
Proof.
1. Assume fn is a sequence in B.
2. Denote C1/n = {1/n, 2/n, · · · , (n−1)/n, 1}. Let C = ∪n C1/n.
3. Since C is countable, we can write C = {x1, x2, · · · }.
4. Since Bx1 is compact, ∃ a convergent subsequence of fn(x1). Let us denote this subsequence by
   f11(x1), f12(x1), · · · , f1k(x1), · · ·
5. Similarly, the sequence f1k(x2) has a subsequence
   f21(x2), f22(x2), · · · , f2k(x2), · · · which is convergent.
6. We proceed in this way and then set gn = fnn.
Proof of Arzela-Ascoli theorem
7. gn = fnn is obtained by picking out the diagonal:
   f11 f12 f13 · · · f1n · · · (1st subseq.)
   f21 f22 f23 · · · f2n · · · (2nd subseq.)
   ...
   fn1 fn2 fn3 · · · fnn · · · (n-th subseq.)
8. From the construction by the diagonal process,
   lim_{n→∞} gn(xi) exists for all xi ∈ C.
9. Now, we are ready to prove
   ‖gn − gm‖ = sup_{x∈[0,1]} |gn(x) − gm(x)| → 0 as m, n → ∞.
Continue...
9. Proof of lim_{n,m→∞} sup_{x∈A} |gn(x) − gm(x)| = 0.
a. Let ε > 0 be given.
b. From equi-continuity of {gn} ⊂ B, we can choose δ s.t.
   |x − y| < δ & x, y ∈ A = [0, 1] ⇒ sup_n |gn(x) − gn(y)| < ε/3.
c. Choose L ≥ 1/δ. From 8,
   ∃ N s.t. n, m > N ⇒ sup_{xi∈C_{1/L}} |gn(xi) − gm(xi)| < ε/3.
d. For each x ∈ A, there exists yj ∈ C_{1/L} s.t. |x − yj| < δ. Therefore, if n, m > N, then
   |gn(x) − gm(x)| ≤ |gn(x) − gn(yj)| + |gn(yj) − gm(yj)| + |gm(x) − gm(yj)| < ε/3 + ε/3 + ε/3 = ε.
This proves lim_{n,m→∞} sup_{x∈A} |gn(x) − gm(x)| = 0.
Continue...
10. From 9, gn is a Cauchy sequence in C([0, 1], N).
11. Since C([0, 1], N) is a complete normed space, gn converges to some g ∈ C([0, 1], N).
12. Since B is closed, it must be that g ∈ B.
13. From 1, 11, and 12, B is sequentially compact, so it is compact.
♣ ♣ ♣ ♣ ♣ ♣ ♣
The proof of the Arzela-Ascoli theorem is exactly the same as the special case discussed above except for step 2. To replace step 2, we use the fact that the compact set A is totally bounded: for each δ > 0, there exists a finite set Cδ = {y1, · · · , yk} such that A ⊂ ∪_{j=1}^k D(yj, δ).
5.7 The contraction mapping principle
Theorem. (5.7.1: Contraction mapping principle)
Let M be a complete normed space and Φ : M → M a given mapping. Assume
  ∃ k ∈ [0, 1) s.t. ‖Φ(f) − Φ(g)‖ ≤ k ‖f − g‖ for all f, g ∈ M.
Then there exists a unique fixed point f∗ ∈ M s.t. Φ(f∗) = f∗. In fact, if f0 ∈ M and fn+1 = Φ(fn), n = 0, 1, 2, · · · , then
  lim_{n→∞} ‖fn − f∗‖ = 0.
Key idea: Φ shrinks distances:
  ‖fn+1 − fn‖ = ‖Φ(fn) − Φ(fn−1)‖ ≤ k ‖fn − fn−1‖ ≤ · · · ≤ k^n ‖f1 − f0‖.
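The iteration fn+1 = Φ(fn) can be tried numerically. A minimal sketch (illustrative, not from the text): Φ(x) = cos x maps [cos 1, 1] into itself with |Φ′| ≤ sin 1 < 1 there, so it is a contraction and the iteration converges to the unique fixed point.

```python
import math

def fixed_point(phi, f0, tol=1e-12, max_iter=1000):
    """Iterate f_{n+1} = phi(f_n) until successive iterates are within tol."""
    f = f0
    for _ in range(max_iter):
        f_next = phi(f)
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    raise RuntimeError("no convergence")

# Phi(x) = cos x is a contraction near its fixed point, so the iteration
# converges to the unique x* with cos x* = x* (the Dottie number, ~0.7391).
x_star = fixed_point(math.cos, 1.0)
print(x_star)
print(abs(math.cos(x_star) - x_star) < 1e-10)  # True: x_star is a fixed point
```

The same driver works for any contraction Φ on a complete space once a suitable discretization is chosen.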
The proof of the contraction mapping principle: ∃ f∗ ∈ M s.t. Φ(f∗) = f∗
1. Let f0 ∈ M and fn+1 = Φ(fn), n = 0, 1, 2, · · · .
2. ‖f2 − f1‖ = ‖Φ(f1) − Φ(f0)‖ ≤ k ‖f1 − f0‖.
3. ‖f3 − f2‖ = ‖Φ(f2) − Φ(f1)‖ ≤ k ‖f2 − f1‖ ≤ k² ‖f1 − f0‖.
4. Inductively, ‖fn+1 − fn‖ ≤ k^n ‖f1 − f0‖.
5. Hence, ∑_{n=0}^∞ ‖fn+1 − fn‖ ≤ ‖f1 − f0‖ ∑_{n=0}^∞ k^n = ‖f1 − f0‖ · 1/(1−k) < ∞.
6. From the proof in Example 5.5.6 and 4, fn is a Cauchy sequence.
7. Since M is complete, lim_{n→∞} fn = f∗ for some f∗ ∈ M.
8. Φ is uniformly continuous because ‖Φ(f) − Φ(g)‖ ≤ k ‖f − g‖.
9. From 8, lim_{n→∞} Φ(fn) = Φ(f∗).
10. Hence, f∗ = lim_{n→∞} fn+1 = lim_{n→∞} Φ(fn) = Φ(f∗).
The proof of the contraction mapping principle: uniqueness of the fixed point f∗
11. To prove uniqueness, assume g∗ is another fixed point, i.e., Φ(g∗) = g∗.
12. Then f∗ − g∗ = Φ(f∗) − Φ(g∗) and
   ‖f∗ − g∗‖ = ‖Φ(f∗) − Φ(g∗)‖ ≤ k ‖f∗ − g∗‖.
   Hence, (1 − k) ‖f∗ − g∗‖ ≤ 0.
13. Since 0 ≤ k < 1, it must be that
   ‖f∗ − g∗‖ = 0.
   Hence, f∗ = g∗.
Theorem. (5.7.2: Existence of sol'n of differential equations)
Let A ⊂ R² be an open neighborhood of (t0, x0). Assume f : A → R is a continuous function satisfying the following Lipschitz condition:
  |f(t, x1) − f(t, x2)| ≤ K |x1 − x2| for all (t, x1), (t, x2) ∈ A.
Then, there is a δ > 0 s.t. the equation
  dx(t)/dt = f(t, x), x(t0) = x0
has a unique C¹-solution x = φ(t) with φ(t0) = x0 for t ∈ (t0 − δ, t0 + δ), i.e.,
  φ′(t) = f(t, φ(t)) for all t ∈ (t0 − δ, t0 + δ) & φ(t0) = x0.
C¹-solution = continuously differentiable solution
Get insight: Proof of Theorem 5.7.2
Before the proof, let us get some insight. Imagine that φ is the solution of dx(t)/dt = f(t, x), x(t0) = x0. Since φ′(t) = f(t, φ(t)) with φ(t0) = x0,
  φ(t) = φ(t0) + ∫_{t0}^t φ′(s) ds = x0 + ∫_{t0}^t f(s, φ(s)) ds.
Hence, φ is a fixed point of the map Φ : M → M defined by
  Φ(φ) = x0 + ∫_{t0}^t f(s, φ(s)) ds.
In order to apply the contraction mapping principle, we need to choose a suitable space M. In practice, the solution φ can be obtained from the following iterative method (Picard iteration):
  φ_{n+1}(t) = Φ(φn) = x0 + ∫_{t0}^t f(s, φn(s)) ds  &  φ0 = x0.
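This iterative method can be sketched numerically. A minimal grid version (illustrative, not from the text): replace the integral by a cumulative trapezoidal sum and repeat the map a fixed number of times; grid size and iteration count are arbitrary choices here.

```python
import math

def picard(f, t0, x0, t_end, n_iter=20, n_grid=200):
    """Approximate phi_{n+1}(t) = x0 + int_{t0}^t f(s, phi_n(s)) ds on a grid."""
    h = (t_end - t0) / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    phi = [x0] * (n_grid + 1)          # phi_0 ≡ x0
    for _ in range(n_iter):
        vals = [f(t, p) for t, p in zip(ts, phi)]
        new, acc = [x0], 0.0
        for i in range(1, n_grid + 1): # cumulative trapezoidal integral
            acc += 0.5 * h * (vals[i - 1] + vals[i])
            new.append(x0 + acc)
        phi = new
    return ts, phi

# x' = x, x(0) = 1 has solution e^t (cf. Example 5.7.6).
ts, phi = picard(lambda t, x: x, 0.0, 1.0, 1.0)
print(abs(phi[-1] - math.e))  # small discretization error
```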
Proof of Theorem 5.7.2
1. Let L = sup_{(t,x)∈A′} |f(t, x)|, where A′ ⊂ A is a closed (compact) neighborhood of (t0, x0). Since f is continuous, L < ∞.
2. Choose δ such that Kδ < 1 and
   {(t, x) : |t − t0| < δ, |x − x0| < Lδ} ⊂ A′.
3. Denote C = C([t0 − δ, t0 + δ],R). From Theorem 5.5.3, C is a complete normed space (or Banach space) with norm
   ‖φ‖ = sup_{t∈[t0−δ,t0+δ]} |φ(t)|.
4. Let
   M = {φ ∈ C : φ(t0) = x0 & |φ(t) − x0| ≤ Lδ}.
5. Then, M is also a complete normed space. (Why? M is a closed subset of C w.r.t. the norm ‖ · ‖.)
Proof of Theorem 5.7.2
5. Define Φ : M → C by (please find its motivation on the previous slide)
   Φ(φ) = x0 + ∫_{t0}^t f(s, φ(s)) ds.
6. Claim: φ ∈ M ⇒ Φ(φ) ∈ M.
   Proof. Let φ ∈ M and ψ = Φ(φ).
   • ψ(t0) = x0 and ψ ∈ C because
     lim_{h→0} |ψ(t + h) − ψ(t)| = lim_{h→0} |∫_t^{t+h} f(s, φ(s)) ds| ≤ lim_{h→0} L|h| = 0.
   • From 1,
     |t − t0| ≤ δ ⇒ |ψ(t) − x0| = |∫_{t0}^t f(s, φ(s)) ds| ≤ L |t − t0| ≤ Lδ.
   • Hence, ψ ∈ M.
7. From 6, Φ maps M to M. See the condition of Theorem 5.7.1.
Proof of Theorem 5.7.2
7. Using the Lipschitz condition,
   ‖Φ(φ1) − Φ(φ2)‖ = sup_{t∈[t0−δ,t0+δ]} |∫_{t0}^t f(s, φ1(s)) − f(s, φ2(s)) ds|
                    ≤ sup_{t∈[t0−δ,t0+δ]} |∫_{t0}^t K |φ1(s) − φ2(s)| ds| ≤ δK ‖φ1 − φ2‖.
8. Since δK < 1,
   ‖Φ(φ1) − Φ(φ2)‖ ≤ k ‖φ1 − φ2‖, k = δK ∈ [0, 1).
9. From 5.7.1, ∃ unique φ∗ ∈ M s.t. Φ(φ∗) = φ∗.
Theorem. (5.7.3: Fredholm equation)
Assume that K(x, y) is continuous on [a, b] × [a, b] and
  M = sup_{x,y∈[a,b]} |K(x, y)|.
If |λ| M |b − a| < 1, then the following Fredholm equation has a unique solution in C([a, b],R):
  f(x) = λ ∫_a^b K(x, y) f(y) dy + φ(x), x ∈ [a, b],
where λ ∈ R, φ ∈ C([a, b],R).
Proof. For f ∈ C([a, b],R), we define
  (Φ(f))(x) = λ ∫_a^b K(x, y) f(y) dy + φ(x).
Proof of 5.7.3
1. Claim: Φ maps C([a, b],R) to C([a, b],R).
   Proof. Let f ∈ C([a, b],R). We need to show that Φ(f) is continuous. Let ε > 0 be given.
   • Since [a, b] × [a, b] is compact, K(x, y) is uniformly continuous.
   • Hence, ∃ δ s.t. ‖(x1, y) − (x2, y)‖ < δ & (x1, y), (x2, y) ∈ [a, b] × [a, b] imply |K(x1, y) − K(x2, y)| < ε / (|λ| ‖f‖ |b − a| + 1).
   • If |x1 − x2| < δ and x1, x2 ∈ [a, b], then
     |(Φ(f))(x1) − (Φ(f))(x2)| ≤ |λ| ∫_a^b |K(x1, y) − K(x2, y)| |f(y)| dy ≤ |λ| ‖f‖ |b − a| · ε / (|λ| ‖f‖ |b − a| + 1) < ε.
2. Set k = |λ| M |b − a|. Then k < 1 and
   ‖Φ(f) − Φ(g)‖ = sup_{x∈[a,b]} |λ| |∫_a^b K(x, y)(f(y) − g(y)) dy| ≤ k ‖f − g‖.
3. From 5.7.1, ∃ unique f∗ ∈ C([a, b],R) s.t. Φ(f∗) = f∗.
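The fixed-point iteration behind 5.7.3 can also be carried out numerically. A hypothetical sketch (not from the text): discretize the integral with trapezoidal weights and iterate Φ; the kernel, φ, and λ below are toy choices with a known constant solution.

```python
def solve_fredholm(K, phi, lam, a, b, n=200, n_iter=200):
    """Iterate f <- lam * int_a^b K(x,y) f(y) dy + phi(x) on a uniform grid."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = h / 2                      # trapezoidal weights
    f = [phi(x) for x in xs]                  # start from phi
    for _ in range(n_iter):
        f = [lam * sum(wj * K(x, yj) * fj for wj, yj, fj in zip(w, xs, f))
             + phi(x) for x in xs]
    return xs, f

# Toy case: K ≡ 1, phi ≡ 1 on [0,1], lam = 1/2, so |lam| M |b-a| = 1/2 < 1.
# The solution is constant, f = c with c = c/2 + 1, i.e. c = 2.
xs, f = solve_fredholm(lambda x, y: 1.0, lambda x: 1.0, 0.5, 0.0, 1.0)
print(f[0])  # ≈ 2.0
```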
Theorem. (5.7.4: Volterra integral equation)
Assuming K(x, y) is continuous on [a, b] × [a, b], the Volterra integral equation f(x) = λ ∫_a^x K(x, y) f(y) dy + φ(x) has a unique solution f(x) for any λ.
Proof. For f ∈ C([a, b],R), we define
  (Φ(f))(x) = λ ∫_a^x K(x, y) f(y) dy + φ(x).
1. As in 5.7.3, Φ maps C([a, b],R) to C([a, b],R).
2. Let M = sup_{x,y∈[a,b]} |K(x, y)|. Then,
   |Φ(f)(x) − Φ(g)(x)| = |λ| |∫_a^x K(x, y)(f(y) − g(y)) dy| ≤ |λ| |x − a| M ‖f − g‖.
Proof of 5.7.4
3. From 2,
   |Φ²(f)(x) − Φ²(g)(x)| = |λ| |∫_a^x K(x, y)(Φ(f)(y) − Φ(g)(y)) dy|
                          ≤ |λ| |∫_a^x M |y − a| |λ| M ‖f − g‖ dy|
                          ≤ |λ|² M² (|b − a|²/2!) ‖f − g‖.
4. Inductively, we have
   ‖Φⁿ(f) − Φⁿ(g)‖ ≤ (|λ|ⁿ Mⁿ |b − a|ⁿ / n!) ‖f − g‖.
5. By the ratio test, ∑ |λ|ⁿ Mⁿ |b − a|ⁿ / n! converges.
6. Hence, we can choose N so that |λ|^N M^N |b − a|^N / N! < 1. ∴ Φ^N is a contraction!
Proof of 5.7.4
7. From 6, ∃ unique f∗ ∈ C([a, b],R) s.t. Φ^N(f∗) = f∗.
8. From 7, Φ^{N+1}(f∗) = Φ(f∗).
9. From 8, Φ(f∗) is a fixed point of Φ^N.
10. From 7, 9, and the uniqueness of the fixed point, it must be that f∗ = Φ(f∗).
What a cute idea this is!
Examples
• Example 5.7.5. Let Φ : R → R be defined by Φ(x) = x + 1. Then |Φ(x) − Φ(y)| = |x − y|, which is not ≤ k|x − y| for any k ∈ [0, 1), and Φ has no fixed point.
• Example 5.7.6. Solve x′(t) = x(t), x(0) = 1.
  Solution. Let Φ(φ)(t) = 1 + ∫_0^t φ(s) ds. Let φ0 = 1 and φ_{n+1} = Φ(φn), n = 0, 1, · · · . Then φn(t) = ∑_{k=0}^n t^k/k!. Hence, φn(t) → e^t.
• Example 5.7.7. Solve x′(t) = t x(t) for t near 0 and x(0) = 3.
  Solution. Let Φ(φ)(t) = 3 + ∫_0^t s φ(s) ds. Let φ0 = 3 and φ_{n+1} = Φ(φn), n = 0, 1, · · · . Then φn(t) = 3 ∑_{k=0}^n (t²/2)^k / k!. Hence, φn(t) → 3 e^{t²/2}.
Examples
• Example. Consider the integral equation
  f(x) = a + ∫_0^x x e^{−xy} f(y) dy.
  Check directly on which intervals [0, r] we get a contraction.
  Solution. Let K(x, y) = x e^{−xy} and let Φ(f)(x) = a + ∫_0^x x e^{−xy} f(y) dy. Then
  ‖Φ(f) − Φ(g)‖ = sup_{x∈[0,r]} |∫_0^x K(x, y)(f(y) − g(y)) dy|
                 ≤ sup_{x∈[0,r]} |∫_0^x K(x, y) dy| ‖f − g‖
                 = sup_{x∈[0,r]} |1 − e^{−x²}| ‖f − g‖.
  Since 0 < 1 − e^{−r²} < 1 for any r > 0, Φ is a contraction for any r.
5.8 The Stone-Weierstrass Theorem
The aim of the Weierstrass theorem is to show that any continuous function can be uniformly approximated by a function that has more easily managed properties, such as a polynomial.
Theorem. (5.8.1: Weierstrass-Bernstein)
Let f ∈ C([0, 1],R). There exists a sequence of polynomials pn such that lim_{n→∞} ‖pn − f‖ = 0. In fact,
  pn(x) = ∑_{k=0}^n (n!/(k!(n − k)!)) x^k (1 − x)^{n−k} f(k/n) → f uniformly.
• Meaning of r_k(x) := (n!/(k!(n − k)!)) x^k (1 − x)^{n−k}: imagine a coin with probability x of getting heads and, consequently, with probability 1 − x of getting tails. In n tosses, the probability of getting exactly k heads is that quantity.
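The Bernstein polynomial pn above is easy to evaluate directly. A short sketch (the test function f(x) = |x − 1/2| is my own choice, not from the slides); the uniform error shrinks as n grows, though slowly for this non-smooth f.

```python
from math import comb

def bernstein(f, n, x):
    """Bernstein polynomial p_n(x) = sum_k C(n,k) x^k (1-x)^{n-k} f(k/n)."""
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)
grid = [i / 100 for i in range(101)]
for n in (10, 100, 1000):
    err = max(abs(bernstein(f, n, x) - f(x)) for x in grid)
    print(n, round(err, 4))   # the error decreases with n
```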
Rough proof: Weierstrass-Bernstein
• ∑_{k=0}^n r_k(x) = 1 and ∑_{k=0}^n (k/n − x)² r_k(x) = x(1 − x)/n. Hence,
  lim_{n→∞} ∑_{|k/n−x|>δ} r_k(x) = 0, for any δ > 0,
and
  lim_{n→∞} ∑_{|k/n−x|<δ} r_k(x) = 1, for any δ > 0.
• Suppose that in a gambling game called n-tosses, f(k/n) dollars is paid out when exactly k heads turn up when n tosses are made. The average amount (after a lo∼∼ong evening of playing n-tosses) paid out when n tosses are made is
  pn(x) = ∑_{k=0}^n r_k(x) f(k/n) ≈ f(x).
The Weierstrass-Bernstein theorem can be applied to C([a, b],R) because
  g ∈ C([a, b],R) ⇒ f(x) = g(x(b − a) + a) ∈ C([0, 1],R).
Theorem. (5.8.2: Stone-Weierstrass)
Let M be a metric space, A ⊂ M a compact set, and B ⊂ C(A,R) satisfy the following:
1. B is an algebra: f, g ∈ B & α ∈ R ⇒ f + g, fg, αf ∈ B.
2. 1 ∈ B.
3. ∀ x, y ∈ A, x ≠ y, ∃ f ∈ B s.t. f(x) ≠ f(y).
Then B is dense in C(A,R), that is, B̄ = C(A,R).
The proof is easy (just technical). I just provide rough insight.
1. Since B is an algebra, f ∈ B ⇒ pn(f) ∈ B.
2. Assume that A is a finite set. Then the proof is trivial.
3. Use the concept of a finite δ-net for the compact set A.
Differentiable Mappings
Definition: Let A be an open set in Rn. A mapping f : A ⊂ Rn → Rm is said to be differentiable at x0 ∈ A if ∃ a linear map (m × n matrix) Df(x0) : Rn → Rm such that
  lim_{x→x0} ‖f(x) − f(x0) − Df(x0)(x − x0)‖ / ‖x − x0‖ = 0.
• Theorem 6.2.2. If f : A ⊂ Rn → Rm is differentiable, then the ∂fj/∂xi exist, and
  Df(x) = [ ∂f1/∂x1  ∂f1/∂x2  · · ·  ∂f1/∂xn
            ∂f2/∂x1   · · ·   · · ·  ∂f2/∂xn
              ...      ...     ...     ...
            ∂fm/∂x1   · · ·   · · ·  ∂fm/∂xn ]   (called the Jacobian matrix)
• 1-Dimension. If f : (a, b) → R is differentiable at x0, then ∃ a number m = f′(x0) such that
  lim_{x→x0} |f(x) − f(x0) − m(x − x0)| / |x − x0| = 0, or equivalently lim_{x→x0} (f(x) − f(x0))/(x − x0) = m.
Thm 6.1.2. If f : A ⊂ Rn → Rm is differentiable at a, then f is continuous at a and Df(a) is uniquely determined.
Proof of uniqueness. Let L1 and L2 be two m × n matrices (or linear mappings) satisfying
  lim_{x→a} ‖f(x) − f(a) − L1(x − a)‖/‖x − a‖ = 0 = lim_{x→a} ‖f(x) − f(a) − L2(x − a)‖/‖x − a‖.
It suffices to prove that ‖L1 ej − L2 ej‖ = 0 for j = 1, · · · , n.
  ‖L1 ej − L2 ej‖ = (1/|h|) ‖L1(h ej) − L2(h ej)‖ = ‖L1(h ej) − L2(h ej)‖ / ‖h ej‖
    = ‖f(a + h ej) − f(a) − L1(h ej) − [f(a + h ej) − f(a) − L2(h ej)]‖ / ‖h ej‖
    ≤ ‖f(a + h ej) − f(a) − L1(h ej)‖/‖h ej‖ + ‖f(a + h ej) − f(a) − L2(h ej)‖/‖h ej‖
    → 0 as h → 0.
• Proof of continuity: Since lim_{y→a} ‖f(y) − f(a) − Df(a)(y − a)‖ = 0, lim_{y→a} ‖f(y) − f(a)‖ = 0.
Thm 6.2.2. Assume f : A ⊂ Rn → Rm is differentiable at x and Df(x) = [aij]. Then ∂fj/∂xi exists and aij = ∂fj/∂xi.
Proof. Denote e1 = (1, 0, · · · , 0), e2 = (0, 1, 0, · · · , 0), · · · , en = (0, · · · , 0, 1). We have
  lim_{y→x} ‖f(y) − f(x) − Df(x)(y − x)‖/‖y − x‖ = 0
  ⇒ lim_{h→0} ‖f(x + h ei) − f(x) − Df(x)(h ei)‖/|h| = 0, i = 1, 2, · · · , n
  ⇒ lim_{h→0} √(∑_{j=1}^m |fj(x + h ei) − fj(x) − aij h|²)/|h| = 0, i = 1, 2, · · · , n
  ⇒ ∂fj/∂xi exists and aij = ∂fj/∂xi.
Thm 6.4.1. Let f : A ⊂ Rn → Rm. If each ∂fj/∂xi exists and is continuous on A, then f is differentiable on A.
[Proof for the case n = 2, m = 1.] Let Df(x) = [∂f/∂x1(x), ∂f/∂x2(x)], x ∈ A. From the mean value theorem,
  f(y) − f(x) = f(y1, y2) − f(x1, y2) + f(x1, y2) − f(x1, x2)
              = ∂f/∂x1(u1, y2) (y1 − x1) + ∂f/∂x2(x1, u2) (y2 − x2)
for some ui between xi and yi. Hence,
  f(y) − f(x) − Df(x)(y − x) = α (y1 − x1) + β (y2 − x2),
where α := [∂f/∂x1(u1, y2) − ∂f/∂x1(x)] and β := [∂f/∂x2(x1, u2) − ∂f/∂x2(x)].
Due to continuity of ∂f/∂x1 and ∂f/∂x2, α → 0 & β → 0 as y → x, and
  |f(y) − f(x) − Df(x)(y − x)| / ‖y − x‖ = |α (y1 − x1) + β (y2 − x2)| / √((y1 − x1)² + (y2 − x2)²) ≤ √(α² + β²) → 0
as y → x. This proves lim_{y→x} ‖f(y) − f(x) − Df(x)(y − x)‖/‖y − x‖ = 0.
Remark. About a differentiable map f : A ⊂ Rn → Rm.
• The proof of Thm 6.4.1 for the general case f : A ⊂ Rn → Rm is almost the same as in the special case f : A ⊂ R² → R.
• Intuitively, x ↦ f(x0) + Df(x0)(x − x0) is the best affine approximation to f near x0.
• It should be noticed that the existence of the ∂fj/∂xi does not imply that the derivative Df exists.
Directional Derivatives. Let f : A ⊂ Rn → R be a real-valued function.
• Let e ∈ Rn be a unit vector. (d/dt) f(x + te)|_{t=0} = lim_{t→0} (f(x + te) − f(x))/t is called the directional derivative of f at x in the direction e.
• If f is differentiable, then lim_{t→0} (f(x + te) − f(x))/t = Df(x) · e.
• Note that the existence of all directional derivatives at a point need not imply differentiability.
Example. Let f(x, y) = xy/(x² + y) for x² ≠ −y and f(x, y) = 0 if x² = −y. Note that f is not continuous at (0, 0), since lim_{t→0} f(t, t³ − t²) = lim_{t→0} t(t³ − t²)/(t² + t³ − t²) = −1 ≠ 0 = f(0, 0). But all directional derivatives of f at (0, 0) exist:
  lim_{t→0} f(ta, tb)/t = lim_{t→0} (1/t) · t²ab/(t²a² + tb) = a (for b ≠ 0; the limit is 0 when b = 0)
for any unit vector e = (a, b).
Chain Rule 6.5.1: Let A ⊂ Rn be open and let f : A → Rm be differentiable. Let B ⊂ Rm be open, f(A) ⊂ B, and g : B → Rp be differentiable. Then h = g ◦ f is differentiable on A and Dh(x) = Dg(f(x)) Df(x):
  D(g ◦ f)(x) = [ ∂g1/∂y1 · · · ∂g1/∂ym        [ ∂f1/∂x1 · · · ∂f1/∂xn
                    ...    ...    ...      ×       ...    ...    ...
                  ∂gp/∂y1 · · · ∂gp/∂ym ]        ∂fm/∂x1 · · · ∂fm/∂xn ]
Proof. From the assumption, it is easy to see that
  lim_{x→x0} ‖♠‖/‖x − x0‖ = 0, where ♠ := (g ◦ f)(x) − (g ◦ f)(x0) − Dg(f(x0))(f(x) − f(x0)),
  lim_{x→x0} ‖♣‖/‖x − x0‖ = 0, where ♣ := f(x) − f(x0) − Df(x0)(x − x0).
Since (g ◦ f)(x) − (g ◦ f)(x0) − Dg(f(x0)) Df(x0)(x − x0) = ♠ + Dg(f(x0))♣, it follows from the above identities that
  lim_{x→x0} ‖(g ◦ f)(x) − (g ◦ f)(x0) − Dg(f(x0)) Df(x0)(x − x0)‖/‖x − x0‖ = 0.
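The identity Dh(x) = Dg(f(x)) Df(x) can be verified numerically with finite-difference Jacobians. A small sketch (the maps f and g below are my own toy choices, not from the text):

```python
import math

def jacobian(F, x, m, eps=1e-6):
    """m x len(x) Jacobian of F at x by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        Fp, Fm = F(xp), F(xm)
        for j in range(m):
            J[j][i] = (Fp[j] - Fm[j]) / (2 * eps)
    return J

f = lambda x: [x[0] * x[1], math.sin(x[0])]   # f : R^2 -> R^2
g = lambda y: [y[0] ** 2 + y[1]]              # g : R^2 -> R
h = lambda x: g(f(x))                         # h = g ∘ f

x0 = [0.7, -0.3]
Dh = jacobian(h, x0, 1)
Dg = jacobian(g, f(x0), 1)
Df = jacobian(f, x0, 2)
chain = [[sum(Dg[0][k] * Df[k][i] for k in range(2)) for i in range(2)]]
print(max(abs(Dh[0][i] - chain[0][i]) for i in range(2)))  # small FD error
```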
Directional derivatives and examples
1. If h(r, θ) = f(r cos θ, r sin θ), then
   (∂h/∂r  ∂h/∂θ) = (∂f/∂x  ∂f/∂y) ( cos θ  −r sin θ
                                      sin θ   r cos θ ).
2. Consider a surface S defined by f(x) = constant. Then ∇f(x) is orthogonal to this surface.
   Proof. Let c : [0, 1] → Rn be a curve lying on S with c(0) = x0. Then
   0 = (d/dt) f(c(t)) = ∇f(c(t)) · c′(t).
   This means that ∇f(c(t)) is orthogonal to the tangent vector c′(t). Since this is true for an arbitrary curve on S passing through x0, ∇f(x0) is orthogonal to S at x0.
3. The direction of greatest rate of increase of f(x) is ∇f(x).
6.7.1. Mean Value Theorem. Suppose f : A ⊂ Rn → R is differentiable on an open set A. For any x, y ∈ A such that the line segment joining x and y lies in A, ∃ c on that segment such that
  f(y) − f(x) = Df(c) · (y − x).
Proof. Define h(t) = f((1 − t)x + ty). Then
  ∃ t0 ∈ (0, 1) such that h(1) − h(0) = h′(t0),
and therefore
  f(y) − f(x) = h(1) − h(0) = h′(t0) = Df((1 − t0)x + t0y) · (y − x), with c = (1 − t0)x + t0y.
• Definition. A bilinear map B : Rn × Rm → R is given by an n × m matrix (aij):
  B(x, y) = ∑_{i,j} aij xi yj = (x1, · · · , xn) [ a11 · · · a1m; ...; an1 · · · anm ] (y1, · · · , ym)ᵀ.
• Definition 6.8.4. For a positive integer r, f is said to be of class C^r if all partial derivatives up to order r exist and are continuous.
• Let f : A ⊂ Rn → R be of class C². Then
  D²f(x) = [ ∂²f/∂x1∂x1 · · · ∂²f/∂x1∂xn; ...; ∂²f/∂xn∂x1 · · · ∂²f/∂xn∂xn ].
• If D²f is continuous, D²f is symmetric.
Taylor's Theorem 6.8.5 [Case: f ∈ C³]. Let f : A ⊂ Rn → R be of class C³. Suppose x ∈ A and x + th ∈ A for 0 ≤ t ≤ 1. Then ∃ c = x + t0h, 0 < t0 < 1, such that
  f(x + h) − f(x) = ∑_{i=1}^n ∂f/∂xi(x) hi + (1/2!) ∑_{i,j=1}^n ∂²f/∂xi∂xj(x) hi hj + (1/3!) ∑_{i,j,k=1}^n ∂³f/∂xi∂xj∂xk(x + t0h) hi hj hk.
Proof.
  f(x + h) − f(x) = ∫_0^1 (d/dt) f(x + th) dt = ∫_0^1 ∑_{i=1}^n ∂f/∂xi(x + th) hi dt
    = ∑_{i=1}^n ∫_0^1 ∂f/∂xi(x + th) hi (d(t − 1)/dt) dt   (Why? d(t − 1)/dt = 1)
    = ∑_{i=1}^n [ ∂f/∂xi(x) hi − ∫_0^1 (d/dt)(∂f/∂xi(x + th) hi)(t − 1) dt ]   (integration by parts)
    = ∑_{i=1}^n ∂f/∂xi(x) hi + R1(h, x),
where
  R1(h, x) = ∑_{i,j=1}^n ∫_0^1 (1 − t) ∂²f/∂xi∂xj(x + th) hi hj dt.
Using (d/dt)(−(t − 1)²/2!) = (1 − t) and integration by parts,
  R1(h, x) = ∑_{i,j=1}^n ∫_0^1 (d/dt)(−(t − 1)²/2!) ∂²f/∂xi∂xj(x + th) hi hj dt
           = (1/2!) ∑_{i,j=1}^n ∂²f/∂xi∂xj(x) hi hj + R2(h, x),
where
  R2(h, x) := ∑_{i,j,k=1}^n ∫_0^1 ((t − 1)²/2!) ∂³f/∂xi∂xj∂xk(x + th) hi hj hk dt.
Recall the second mean value theorem for integrals:
  ∫_0^1 f(t) g(t) dt = g(t0) ∫_0^1 f(t) dt for some 0 < t0 < 1.
Hence, ∃ t0, 0 < t0 < 1, such that
  R2(h, x) = ∑_{i,j,k=1}^n ∂³f/∂xi∂xj∂xk(x + t0h) hi hj hk · ∫_0^1 ((t − 1)²/2!) dt, where ∫_0^1 ((t − 1)²/2!) dt = 1/3!.
One can proceed by induction using the same method to get the general Taylor's theorem.
6.8.5. Taylor's Theorem [General Case: f ∈ C^r]. Let f : A ⊂ Rn → R be of class C^r. Suppose x ∈ A and x + th ∈ A for 0 ≤ t ≤ 1. Then
  f(x + h) = f(x) + Df(x) · h + · · · + (1/(r − 1)!) D^{r−1}f(x) · (h, · · · , h) + R_{r−1}(x, h),
where R_{r−1}(x, h) is the remainder. Furthermore,
  R_{r−1}(x, h)/‖h‖^{r−1} → 0 as h → 0.
Another proof of Taylor's formula. Let g(t) = f(x + th) for t ∈ [0, 1]. Applying the one-dimensional Taylor formula, there exists t ∈ (0, 1) such that
  g(1) = g(0) + ∑_{k=1}^{r−1} (1/k!) g^{(k)}(0) + (1/r!) g^{(r)}(t).
Note that R_{r−1}(x, h) = (1/r!) g^{(r)}(t), g(1) = f(x + h), g(0) = f(x), and
  g′(0) = Df(x) · h = ∑_{i=1}^n ∂f/∂xi(x) hi,
  g′′(0) = D²f(x) · (h, h) = ∑_{i,j=1}^n ∂²f/∂xi∂xj(x) hi hj,
  g′′′(0) = D³f(x) · (h, h, h) = ∑_{i,j,k=1}^n ∂³f/∂xi∂xj∂xk(x) hi hj hk.
Theorem 6.9.2. If f : A ⊂ Rn → R is differentiable and x0 ∈ A is an extreme point for f, then Df(x0) = 0.
Proof. Assume Df(x0) ≠ 0. We try to prove that f(x0) is not a local extreme value.
• Let h = Df(x0)/‖Df(x0)‖. Since f is differentiable at x0,
  lim_{λ→0} (1/|λ|) |f(x0 + λh) − f(x0) − Df(x0) · (λh)| = 0.
• Hence (for given ε = ‖Df(x0)‖/2), there exists δ > 0 such that
  0 < |λ| < δ ⇒ |f(x0 + λh) − f(x0) − Df(x0) · (λh)| < (‖Df(x0)‖/2) |λ|.
  Since Df(x0) · h = ‖Df(x0)‖, we have
  −(‖Df(x0)‖/2) |λ| < f(x0 + λh) − f(x0) − ‖Df(x0)‖ λ < (‖Df(x0)‖/2) |λ|.
This leads to the following:
– for 0 < λ < δ, (‖Df(x0)‖/2) λ < f(x0 + λh) − f(x0). Hence, f(x0) is not a local maximum.
– for −δ < λ < 0, f(x0 + λh) − f(x0) < (‖Df(x0)‖/2) λ < 0. Hence, f(x0) is not a local minimum.
Theorem 6.9.4. Suppose f : A ⊂ Rn → R is a C³-function and x0 is a critical point.
• If f has a local maximum at x0, then H_{x0}(f) is negative semi-definite.
• If H_{x0}(f) is negative (positive) definite, then f has a local maximum (minimum) at x0.
Indeed, this theorem holds true for f ∈ C².
Proof. Since Df(x0) = 0, Taylor's theorem gives
  f(x0 + h) − f(x0) = (1/2) D²f(x0)(h, h) + R2(x0, h),
where lim_{h→0} R2(x0, h)/‖h‖² = 0.
If D²f(x0) is negative definite, then
  (1/2) D²f(x0)(h, h) + R2(x0, h) < 0 for sufficiently small h ≠ 0,
and therefore f(x0 + h) − f(x0) < 0 for sufficiently small h ≠ 0. Hence, f has a local maximum at x0.
• Example 6.9.5. The matrix A = ( a b; b d ) is positive definite if
  (x, y) ( a b; b d ) (x, y)ᵀ > 0 for (x, y) ≠ (0, 0).
  Hence, A is positive definite iff ax² + 2bxy + dy² > 0 for all (x, y) ≠ (0, 0). Therefore, A is positive definite iff a > 0 and ad − b² > 0.
• Example 6.9.6. Let f(x, y) = x² − xy + y². Then Df(0, 0) = (0, 0) and
  D²f(0, 0) = ( 2 −1; −1 2 ).
  Hence, the Hessian is positive definite. Thus f has a local minimum at (0, 0).
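The 2×2 criterion in Example 6.9.5 is directly checkable in code. A tiny sketch (the second test matrix is my own example of an indefinite Hessian):

```python
def is_positive_definite_2x2(a, b, d):
    """A = [[a, b], [b, d]] is positive definite iff a > 0 and a*d - b*b > 0."""
    return a > 0 and a * d - b * b > 0

# Hessian of f(x, y) = x^2 - x*y + y^2 at (0, 0) is [[2, -1], [-1, 2]].
print(is_positive_definite_2x2(2, -1, 2))   # True  -> local minimum at (0,0)
print(is_positive_definite_2x2(1, 2, 1))    # False -> not positive definite
```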
Chapter 8. Integration
Definition. Let A ⊂ ℝ² be a bounded set and let f : A → ℝ be a bounded function.
∙ We enclose A in some rectangle B = [a1, b1] × [a2, b2] and extend f to the whole rectangle by defining it to be zero outside of A.
∙ Let P be a partition of B obtained by dividing a1 = x0 < x1 < ⋅ ⋅ ⋅ < xn = b1 and a2 = y0 < y1 < ⋅ ⋅ ⋅ < ym = b2:
  P = {[xi, xi+1] × [yj, yj+1] (subrectangles R) : i = 0, 1, ⋅ ⋅ ⋅ , n − 1, j = 0, 1, ⋅ ⋅ ⋅ , m − 1}.
∙ Define the upper sum of f:
  U(f, P) := ∑_{R∈P} sup{f(x, y) ∣ (x, y) ∈ R} × (volume of R).
∙ Define the lower sum of f:
  L(f, P) := ∑_{R∈P} inf{f(x, y) ∣ (x, y) ∈ R} × (volume of R).
∙ Define the upper integral of f on A by
  ∫̄_A f = inf {U(f, P) : P is a partition of B}
and the lower integral of f on A by
  ∫__A f = sup {L(f, P) : P is a partition of B}.
∙ We say that f is Riemann integrable, or integrable, if the upper and lower integrals agree.
∙ If f is integrable on A, we denote the common value by ∫_A f.
Volume and sets of measure zero.
Definition. Let A be a bounded set of ℝⁿ.
∙ The characteristic function 1A of A is the map defined by 1A(x) = 1 if x ∈ A and 1A(x) = 0 if x ∉ A.
∙ We say that A has volume if 1A is Riemann integrable, and the volume is the number
  vol(A) = ∫_A 1A(x) dx.
∙ The set A is said to have measure zero if for every ε > 0 there is a countable number of rectangles R1, R2, ⋅ ⋅ ⋅ such that
  A ⊂ ∪_{n=1}^∞ Rn  &  ∑_{n=1}^∞ vol(Rn) < ε.
∙ Examples: The set of rational numbers has measure zero in ℝ. As a subset of ℝ², the real line has measure zero.
∙ Lebesgue's monotone convergence theorem. Let gn : [0, 1] → ℝ be integrable functions with |∫_0^1 gn(x) dx| < ∞. Suppose that 0 ≤ gn+1 ≤ gn and gn(x) → 0 for all x ∈ [0, 1]. Then
  lim_{n→∞} ∫_0^1 gn(x) dx = 0.
∙ Example: lim_{n→∞} ∫_0^1 e^{−nx²} x^p dx = 0 if p > −1.
∙ Fubini's Theorem. Let A = [a, b] × [c, d] ⊂ ℝ², and let f : A → ℝ be continuous. Then
  ∫_A f = ∫_a^b (∫_c^d f(x, y) dy) dx = ∫_c^d (∫_a^b f(x, y) dx) dy.
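Fubini's theorem can be illustrated numerically: both iterated integrals approximate the same value. A sketch (the integrand and rectangle are my own toy choices) using iterated midpoint sums:

```python
import math

def iterated_midpoint(f, a, b, c, d, n=200):
    """Midpoint approximation of int_a^b ( int_c^d f(x,y) dy ) dx."""
    hx, hy = (b - a) / n, (d - c) / n
    return sum(
        sum(f(a + (i + 0.5) * hx, c + (j + 0.5) * hy) for j in range(n)) * hy
        for i in range(n)
    ) * hx

f2 = lambda x, y: x * y + math.sin(x)
I1 = iterated_midpoint(f2, 0.0, 1.0, 0.0, 2.0)                  # dy then dx
I2 = iterated_midpoint(lambda y, x: f2(x, y), 0.0, 2.0, 0.0, 1.0)  # dx then dy
exact = 1.0 + 2 * (1 - math.cos(1.0))  # ∫∫ xy = 1, ∫∫ sin x = 2(1 - cos 1)
print(abs(I1 - exact), abs(I2 - exact))  # both small
```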
Chapter 10 Fourier Series. Fourier analysis arose historically in connection with problems in mechanics such as heat conduction and wave motion.
∙ Vibrating string. Consider a string of length l with clamped ends that is free to vibrate when plucked. Let y(t, x) be the displacement of the string at time t and x ∈ [0, l].
– y obeys the wave equation
  ∂²y/∂t² = c² ∂²y/∂x²   (force = mass × acceleration; the force comes from the tension)
– That the string has clamped ends entails that y(t, 0) = y(t, l) = 0.
∙ It is both important and remarkable that any solution y(x, t) can be decomposed into harmonics:
  y(x, t) = ∑_{n=1}^∞ cn yn(x, t) = ∑_{n=1}^∞ cn sin(nπx/l) cos(ωn t), ωn = nπc/l (the frequency),
where each yn(x, t) is a standing wave.
∙ Physically, a standing wave is a synchronous up-and-down motion that repeats its shape periodically after time 2π/ω, such as occurs when a string produces a pure note.
∙ Specific standing waves called fundamental solutions (a kind of basis) are given by
  yn(x, t) = sin(nπx/l) cos(ωn t), n = 1, 2, ⋅ ⋅ ⋅
∙ Thus a complicated-looking vibration is in reality an infinite linear combination of harmonics.
∙ The purpose of Fourier analysis is to carry out this procedure of decomposition by a general method.
Exercise: Using separation of variables, prove that any solution y(x, t) can be decomposed into harmonics
  y(x, t) = ∑_{n=1}^∞ cn yn(x, t) = ∑_{n=1}^∞ cn sin(nπx/l) cos(ωn t), ωn = nπc/l.
10.1 Review: Inner Product in ℝⁿ.
∙ For x, y ∈ ℝⁿ, define the inner product and norm:
  ⟨x, y⟩ = ∑_{j=1}^n x(j) y(j), ∥x∥ = √⟨x, x⟩.
∙ The distance (or metric) between x and y is defined by ∥x − y∥, and hence ∥x − y∥ = 0 implies x = y.
∙ If ⟨x, y⟩ = 0, x and y are said to be orthogonal.
∙ {e1, e2, ⋅ ⋅ ⋅ , en} is said to be an orthonormal basis of ℝⁿ if
1. ℝⁿ = span{e1, e2, ⋅ ⋅ ⋅ , en}
2. ∥ej∥ = 1, j = 1, ⋅ ⋅ ⋅ , n
3. ⟨ej, ei⟩ = 0 if i ≠ j.
∙ For example, e1 = (1, 0, ⋅ ⋅ ⋅ , 0), e2 = (0, 1, 0, ⋅ ⋅ ⋅ , 0), ....
∙ If {e1, e2, ⋅ ⋅ ⋅ , en} is an orthonormal basis, then every x ∈ ℝⁿ can be represented uniquely by
  x = ∑_{j=1}^n ⟨x, ej⟩ ej.
∙ If Vm = span{e1, ⋅ ⋅ ⋅ , em}, the element in Vm closest to x is
  xm = ∑_{j=1}^m ⟨x, ej⟩ ej,
with the distance ∥x − xm∥ = √(∑_{j=m+1}^n ⟨x, ej⟩²).
These useful dot-product properties of Euclidean space can be generalized to infinite-dimensional spaces by introducing Hilbert spaces.
10.1 Inner Product space C[0, 2π]
∙ Let A be the interval (0, 2π).
∙ Let V be the space of all continuous functions f : [0, 2π] → ℂ.
∙ For f, g ∈ V, we define the inner product
  ⟨f, g⟩ = ∫_0^{2π} f(x) g̅(x) dx,
where g̅(x) denotes the complex conjugate of g(x). The above inner product can be approximated by
  ⟨f, g⟩ ≈ ∑_{j=1}^n f(xj) g̅(xj) ∆x,
where we divide the interval [0, 2π] into n subintervals with endpoints x0 = 0 < x1 < ⋅ ⋅ ⋅ < xn = 2π and equal width ∆x = 2π/n.
∙ Two functions f and g are said to be orthogonal if
  ⟨f, g⟩ = ∫_0^{2π} f(x) g̅(x) dx = 0.
∙ The norm of f is defined as
  ∥f∥ = √⟨f, f⟩ = √(∫_0^{2π} |f(x)|² dx).
∙ The distance between f and g is defined by
  d(f, g) = ∥f − g∥.
∙ If {φn} is an orthogonal set of functions on the interval A with the property that ∥φn∥ = 1, then we call {φn} an orthonormal set.
∙ Example.
  { 1/√(2π), (1/√π) cos x, (1/√π) sin x, (1/√π) cos 2x, (1/√π) sin 2x, ⋅ ⋅ ⋅ }
is an orthonormal set in V.
10.1 Inner Product space
Definition. Let V be a complex vector space. An inner product on V is a mapping ⟨⋅, ⋅⟩ : V × V → ℂ with the following properties:
1. ⟨αf + βg, h⟩ = α⟨f, h⟩ + β⟨g, h⟩ for all f, g, h ∈ V and α, β ∈ ℂ.
2. ⟨f, g⟩ = ⟨g, f⟩‾ (complex conjugate).
3. ⟨f, f⟩ ≥ 0, and ⟨f, f⟩ = 0 ⇒ f = 0.
Theorem 10.1.2. The space V of the continuous functions f : [a, b] → ℂ forms an inner product space if we define
  ⟨f, g⟩ = ∫_a^b f(x) g̅(x) dx.
10.1 Inner Product space V = C[a, b]. Consider the space V of the continuous functions f : [a, b] → ℂ with the inner product ⟨f, g⟩ = ∫_a^b f(x) g̅(x) dx.
∙ Define the norm of f by ∥f∥ = √⟨f, f⟩.
∙ Define the distance between f and g by d(f, g) = ∥f − g∥.
For f, g, h ∈ V, we have
∙ Cauchy-Schwarz inequality. |⟨f, g⟩| ≤ ∥f∥∥g∥
∙ Minkowski inequality. ∥f + g∥ ≤ ∥f∥ + ∥g∥
∙ Parallelogram law. ∥f + g∥² + ∥f − g∥² = 2∥f∥² + 2∥g∥²
∙ Pythagorean Theorem. If ⟨f, g⟩ = 0, then ∥f + g∥² = ∥f∥² + ∥g∥²
Cauchy-Schwarz inequality. |⟨f, g⟩| ≤ ∥f∥∥g∥
Proof:
∙ Suppose g ≠ 0 (the case g = 0 is trivial). Let h = g/∥g∥. Then
  |⟨f, g⟩| ≤ ∥f∥∥g∥ ⇔ |⟨f, h⟩| ≤ ∥f∥.
∙ Denote α = ⟨f, h⟩. Then, using ∥h∥ = 1,
  0 ≤ ∥f − αh∥² = ⟨f − αh, f − αh⟩ = ∥f∥² − ᾱ⟨f, h⟩ − α⟨h, f⟩ + |α|² = ∥f∥² − |α|².
Hence, |α| = |⟨f, h⟩| ≤ ∥f∥. This completes the proof.
Minkowski inequality. ∥f + g∥ ≤ ∥f∥ + ∥g∥
Proof:
  ∥f + g∥² = ⟨f + g, f + g⟩ = ∥f∥² + ⟨f, g⟩ + ⟨g, f⟩ + ∥g∥²
           = ∥f∥² + 2 Re⟨f, g⟩ + ∥g∥²
           ≤ ∥f∥² + 2∥f∥∥g∥ + ∥g∥²   (Cauchy-Schwarz)
           = (∥f∥ + ∥g∥)².
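Both inequalities can be checked on the discretized inner product ⟨f, g⟩ ≈ ∑ f(xj) g̅(xj) ∆x from earlier in this section. A sketch (the two test functions are my own arbitrary choices):

```python
import cmath
import math

n = 1000
dx = 2 * math.pi / n
xs = [j * dx for j in range(n)]

def inner(f, g):
    """Discretized <f, g> = sum f(x_j) conj(g(x_j)) dx on [0, 2*pi]."""
    return sum(f(x) * g(x).conjugate() for x in xs) * dx

def norm(f):
    return math.sqrt(inner(f, f).real)

f = lambda x: cmath.exp(1j * x) + 0.3
g = lambda x: complex(math.sin(2 * x), x / 10)
fg = lambda x: f(x) + g(x)

print(abs(inner(f, g)) <= norm(f) * norm(g) + 1e-9)  # Cauchy-Schwarz: True
print(norm(fg) <= norm(f) + norm(g) + 1e-9)          # Minkowski: True
```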
Definition of convergence in an inner product space V. Let V be an inner product space and let fn be a sequence in V. We say that fn converges to f (in mean) and write fn → f if ∥fn − f∥ → 0, that is,
  ∀ ε > 0, ∃ N s.t. n ≥ N ⇒ ∥fn − f∥ < ε.
Similarly, a series ∑ gk converges to f if
  lim_{n→∞} ∥∑_{k=1}^n gk − f∥ = 0.
Examples: Let V = C([0, 1]), the space of continuous functions f : [0, 1] → ℂ.
∙ Let fn = nx χ_{[0,1/n]} + (2 − nx) χ_{(1/n,2/n]}. Then fn → 0 in mean, that is, ∫_0^1 |fn(x) − 0|² dx → 0.
∙ Let fn = n²x χ_{[0,1/n]} + (2n − n²x) χ_{(1/n,2/n]}. Then
  lim_{n→∞} fn(x) = 0 (∀ x) & lim_{n→∞} ∫_0^1 |fn(x) − 0|² dx = ∞.
Definition of Cauchy sequence. A sequence fn in an inner product space is said to be a Cauchy sequence when
  ∀ ε > 0, ∃ N s.t. n, m ≥ N ⇒ ∥fn − fm∥ < ε.
An inner product space V is called complete if every Cauchy sequence in V converges. A complete inner product space is called a Hilbert space.
Remark: The inner product space V = C([0, 2]) is not complete.
∙ Let fn(x) = xⁿ for 0 ≤ x ≤ 1 and fn(x) = 1 for 1 ≤ x ≤ 2.
∙ Then fn is a Cauchy sequence, since ∥fn − fm∥² = ∫_0^1 |xⁿ − x^m|² dx → 0 as n, m → ∞.
∙ However, fn → f in mean, where f(x) = 0 for 0 ≤ x < 1 and f(x) = 1 for 1 ≤ x ≤ 2, and f ∉ V.
A complete inner product space. To make the inner product space V = C([a, b]) complete, we need the following theorem and measure theory:
Theorem 8.3.4. If g(x) is integrable, g ≥ 0, and ∫_a^b g(x) dx = 0, then the set {x ∈ [a, b] : g(x) ≠ 0} has measure zero.
Proof. TA
♣ For any integrable function f, Theorem 8.3.4 leads to
  ∫_a^b |f(x)|² dx = 0 ⇒ f = 0 except for those x in a set of measure zero.
Regarding such an f as equivalent to zero, we have the following theorem:
Theorem 10.1.6. Let V = L²([a, b]) be the space of functions f : [a, b] → ℂ such that |f|² is integrable. Then V is an inner product space with inner product ⟨f, g⟩ = ∫_a^b f(x) g̅(x) dx and norm ∥f∥ = √⟨f, f⟩.
Proof of Theorem 8.3.4: If g(x) is integrable, g ≥ 0, and ∫_a^b g(x) dx = 0, then the set {x ∈ [a, b] : g(x) ≠ 0} has measure zero.
∙ We first show that the set Am = {x ∈ [a, b] : g(x) > 1/m} has measure zero.
∙ Recall ∫_a^b g(x) dx = inf{U(g, P) : P is any partition}.
∙ Let ε > 0 be given. There exists a partition P such that U(g, P) < ε/m.
∙ Let I1, ⋅ ⋅ ⋅ , Ik be the subintervals of the partition P such that Ii ∩ Am ≠ ∅. Then
  ∑_{i=1}^k |Ii| ≤ ∑_{i=1}^k m (sup_{Ii} g(x)) |Ii| ≤ m U(g, P) < ε,
where |Ii| is the length of the interval Ii.
∙ Since Am ⊂ ∪_{i=1}^k Ii and ∑_{i=1}^k |Ii| < ε, Am has measure zero.
∙ Since {x ∈ [a, b] : g(x) ≠ 0} ⊂ ∪_{m=1}^∞ Am, the set has measure zero.
Proof of Theorem 10.1.6: Prove that V = L²([a, b]) is an inner product space.
∙ If ∥f∥ = 0, then ∫_a^b |f(x)|² dx = 0. From Theorem 8.3.4, f = 0, since we are identifying functions that agree except on a set of measure zero.
∙ It is easy to see that ⟨f, g⟩ satisfies all the other rules of an inner product space. We only need to prove that |⟨f, g⟩| < ∞ for all f, g ∈ V.
∙ If we split f and g into real and imaginary parts, and into positive and negative parts, we are reduced to the case in which f and g are real and positive.
∙ From the Lebesgue monotone convergence theorem (page 467), it suffices to show that
  lim_{M→∞} ∫_a^b (fg)_M < ∞ (see page 462),
where (⋅)_M denotes truncation at level M.
∙ Note that 0 ≤ (fg)_M ≤ f_{√M} g_{√M} + (f²)_{√M} + (g²)_{√M}.
∙ ∫_a^b (fg)_M ≤ ∥f_{√M}∥∥g_{√M}∥ + ∥f_{√M}∥² + ∥g_{√M}∥².
∙ Hence, ∫_a^b (fg)_M ≤ ∥f∥∥g∥ + ∥f∥² + ∥g∥² < ∞.
Example 10.1.8. If f1, ⋅ ⋅ ⋅ , fn are orthonormal in an inner product space V, prove that f1, ⋅ ⋅ ⋅ , fn are linearly independent.
∙ Definition. f1, ⋅ ⋅ ⋅ , fn are said to be linearly independent if
  ∑_{i=1}^n ci fi = 0 ⇒ c1 = ⋅ ⋅ ⋅ = cn = 0.
∙ Assume that ∑_{i=1}^n ci fi = 0. We want to prove c1 = ⋅ ⋅ ⋅ = cn = 0.
∙ Due to orthonormality, we have
  ck = ck ∥fk∥² = ⟨∑_{i=1}^n ci fi, fk⟩ = ⟨0, fk⟩ = 0.
Example 10.1.8. Let V be an inner product space. Define the projection of f on g to be the vector
  h = (⟨f, g⟩/∥g∥²) g.
Show that h and f − h are orthogonal, and interpret this result geometrically.
Proof: First, let us prove it when ∥g∥ = 1:
  ⟨h, f − h⟩ = ⟨h, f⟩ − ∥h∥² = ⟨⟨f, g⟩g, f⟩ − |⟨f, g⟩|² = 0.
For the general case, repeat the above procedure with g replaced by g/∥g∥.
10.2 Orthogonal families of functions
∙ Throughout this section, we assume that V is an inner product space with an inner product ⟨⋅, ⋅⟩.
∙ A vector φ ∈ V is called normalized if ∥φ∥ = √⟨φ, φ⟩ = 1.
∙ f and g are called orthogonal if ⟨f, g⟩ = 0.
∙ Definition. An orthonormal family φ0, φ1, ⋅ ⋅ ⋅ in V is called complete if every f ∈ V can be written
  f = ∑_{k=0}^∞ ck φk (ck = ⟨f, φk⟩).
We call ∑_{k=0}^∞ ck φk the Fourier series of f with respect to φ0, φ1, ⋅ ⋅ ⋅ and ck = ⟨f, φk⟩ the Fourier coefficients.
∙ An orthonormal family {φk} in V is complete iff for every f ∈ V,
  lim_{n→∞} ∥f − ∑_{k=0}^n ⟨f, φk⟩ φk∥ = 0.
Theorem 10.2.1: Suppose f = ∑_{k=0}^∞ ck φk for an orthonormal family φ0, φ1, ⋅ ⋅ ⋅ in V (convergence in mean). Then ck = ⟨f, φk⟩.
Proof.
∙ Set sn = ∑_{k=0}^n ck φk, so that ∥sn − f∥ → 0.
∙ If n ≥ i, then ⟨sn, φi⟩ = ∑_{k=0}^n ⟨ck φk, φi⟩ = ci.
∙ Hence, if n ≥ i, |⟨f − sn, φi⟩| = |⟨f, φi⟩ − ci| ≤ ∥f − sn∥ ∥φi∥ = ∥f − sn∥ → 0 as n → ∞.
∙ Hence, ⟨f, φi⟩ = ci.
Examples of complete orthonormal families:
∙ Let V = L²([0, 2π]) be the inner product space in Theorem 10.1.6.
∙ The exponential system {φn(x) = e^{inx}/√(2π) : n = 0, ±1, ±2, ⋅ ⋅ ⋅ } is a complete orthonormal system in the space V; that is, the Fourier series for f ∈ V for this family is given by
  f = ∑_{k=−∞}^∞ ck e^{ikx}/√(2π), ck = ⟨f, φk⟩ = (1/√(2π)) ∫_0^{2π} f(x) e^{−ikx} dx.
∙ The trigonometric system 1/√(2π), cos(mx)/√π, sin(nx)/√π, m, n = 1, 2, ⋅ ⋅ ⋅ is a complete orthonormal system in V.
Proof. See the mean completeness theorem 10.3.1. (optional)
Gram-Schmidt process:
∙ Let g0, g1, g2, ⋅ ⋅ ⋅ be linearly independent functions in an inner product space V.
∙ We can form a corresponding orthonormal system φ0, φ1, ⋅ ⋅ ⋅ as follows:
  φ0 = g0/∥g0∥,
  ψ1 = g1 − ⟨g1, φ0⟩ φ0,  φ1 = ψ1/∥ψ1∥,
  ψ_{k+1} = g_{k+1} − ∑_{i=0}^k ⟨g_{k+1}, φi⟩ φi,  φ_{k+1} = ψ_{k+1}/∥ψ_{k+1}∥.
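The Gram-Schmidt recursion above translates directly into code. A sketch on the discretized inner product over [−1, 1], applied to 1, x, x² (my own choice of inputs; the result approximates the normalized Legendre polynomials):

```python
import math

n = 2000
dx = 2.0 / n
xs = [-1 + (j + 0.5) * dx for j in range(n)]   # midpoint grid on [-1, 1]

def inner(u, v):
    """Discretized <u, v> = sum u(x_j) v(x_j) dx for real grid functions."""
    return sum(a * b for a, b in zip(u, v)) * dx

def gram_schmidt(gs):
    phis = []
    for g in gs:
        psi = list(g)
        for phi in phis:                        # subtract projections
            c = inner(g, phi)
            psi = [p - c * q for p, q in zip(psi, phi)]
        nrm = math.sqrt(inner(psi, psi))        # then normalize
        phis.append([p / nrm for p in psi])
    return phis

gs = [[1.0] * n, list(xs), [x * x for x in xs]]
phis = gram_schmidt(gs)
# Orthonormality check: <phi_i, phi_j> ≈ delta_ij.
print(round(inner(phis[0], phis[0]), 6), round(inner(phis[0], phis[2]), 6))
```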
Theorem: Bessel inequality: Let φ0, φ1, ⋅ ⋅ ⋅ be an orthonormal system in an inner product space V. For each f ∈ V, the real series ∑_{i=0}^∞ |⟨f, φi⟩|² converges and
  ∑_{i=0}^∞ |⟨f, φi⟩|² ≤ ∥f∥².
Proof.
∙ Set sn = ∑_{k=0}^n ck φk where ck = ⟨f, φk⟩.
∙ Key idea 1: f − sn and sn are orthogonal.
∙ Key idea 2: Apply Pythagoras' theorem: ∥f∥² = ∥f − sn∥² + ∥sn∥².
∙ Hence, ∥sn∥² ≤ ∥f∥².
∙ Since the φi are orthonormal, ∥sn∥² = ∑_{i=0}^n |⟨f, φi⟩|².
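Bessel's inequality can be watched numerically: partial sums of |⟨f, φi⟩|² stay below ∥f∥². A sketch for f(x) = x on [0, 2π] with the trigonometric system (grid discretization and the choice of f are mine):

```python
import math

n = 4000
dx = 2 * math.pi / n
xs = [(j + 0.5) * dx for j in range(n)]
f = xs                                   # samples of f(x) = x

def inner(u, v):
    """Discretized real inner product sum u_j v_j dx on [0, 2*pi]."""
    return sum(a * b for a, b in zip(u, v)) * dx

norm2 = inner(f, f)                      # ≈ ∫ x^2 dx = 8*pi^3/3
bessel = inner(f, [1 / math.sqrt(2 * math.pi)] * n) ** 2
for k in range(1, 20):
    ck = inner(f, [math.cos(k * x) / math.sqrt(math.pi) for x in xs])
    dk = inner(f, [math.sin(k * x) / math.sqrt(math.pi) for x in xs])
    bessel += ck * ck + dk * dk
print(bessel <= norm2)  # True; the sum approaches norm2 as terms are added
```

Since this trigonometric system is complete, the gap norm2 − bessel is just the tail of the series (Parseval's theorem, next slide).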
Parseval's Theorem: Let φ0, φ1, ⋅ ⋅ ⋅ be an orthonormal system in an inner product space V. Then φ0, φ1, ⋅ ⋅ ⋅ is complete iff for every f ∈ V, we have
  ∑_{i=0}^∞ |⟨f, φi⟩|² = ∥f∥².
Proof.
∙ Set sn = ∑_{k=0}^n ck φk where ck = ⟨f, φk⟩.
∙ Then ∥f∥² = ∥f − sn∥² + ∥sn∥².
∙ If φ0, φ1, ⋅ ⋅ ⋅ is complete, then ∥f − sn∥² → 0. Therefore, letting n → ∞,
  ∥f∥² = lim_{n→∞} {∥f − sn∥² + ∥sn∥²} = 0 + ∑_{i=0}^∞ |⟨f, φi⟩|².
∙ Conversely, if ∑_{i=0}^∞ |⟨f, φi⟩|² = ∥f∥², then ∥f∥² − ∥sn∥² → 0, and so ∥f − sn∥² → 0.