Contents

0 Pre-preliminaries
  0.1 Course Overview
  0.2 Logic and inference
    0.2.1 Set operations and logical connectives
1 Preliminaries
  1.1 Quantifiers
  1.2 Infinite Sets
    1.2.1 Countable sets
  1.3 The rational numbers
    1.3.1 The abstract structure of Q
  1.4 Axiom of Choice
  1.5 A vocabulary for sequences
2 Construction of the real numbers
  2.1 Cauchy sequences
  2.2 The reals as an ordered field
  2.3 Limits and completeness
  2.4 Other constructions
3 Topology of the Real Line
  3.1 Limits and bounds
    3.1.1 Limit Points
  3.2 Open sets and closed sets
    3.2.1 Open sets
  3.3 Compact sets
4 Math 413 CONTENTS
    3.3.1 Key properties of compactness
4 Continuous functions
  4.1 Concepts of continuity
    4.1.1 Definitions
    4.1.2 Limits of functions and limits of sequences
    4.1.3 Inverse images of open sets
    4.1.4 Related definitions
  4.2 Properties of continuity
5 Differentiation
  5.1 Concepts of the derivative
    5.1.1 Definitions
    5.1.2 Continuity and differentiability
  5.2 Properties of the derivative
    5.2.1 Local properties
    5.2.2 IVT and MVT
  5.3 Calculus of derivatives
    5.3.1 Arithmetic rules
  5.4 Higher derivatives and Taylor's Thm
    5.4.1 Interpretations of f''
    5.4.2 Taylor's Thm
6 Integration
  6.1 Integrals of continuous functions
    6.1.1 Existence of the integral
  6.2 Properties of the Riemann Integral
  6.3 Improper Integrals
7 Sequences and Series of Functions
  7.1 Complex Numbers
    7.1.1 Basic properties of C
  7.2 Numerical Series and Sequences
    7.2.1 Convergence and absolute convergence
    7.2.2 Rearrangements
    7.2.3 Summation by parts
  7.3 Uniform convergence
    7.3.1 Definition of uniform convergence
    7.3.2 Criteria for uniform convergence
    7.3.3 Continuity and uniform convergence
    7.3.4 Spaces of functions
    7.3.5 Term-by-term integration
    7.3.6 Term-by-term differentiation
  7.4 Power series
    7.4.1 Radius of convergence
    7.4.2 Analytic continuation
  7.5 Approximation by polynomials
    7.5.1 Convolution and approximate identities
    7.5.2 The Stone-Weierstrass Theorem
    7.5.3 Convolution and differential equations
Chapter 0
Pre-preliminaries
0.1 Course Overview
The study of the real numbers, R, and functions of a real variable, f(x) = y, where x, y are real.
Given f : R → R which describes some system, how do we study f?
• Need a rigorous vocabulary for properties of f (definitions)
• Need to see when some properties imply others (theorems)
Result: can make inferences about the system.
Motivation for analysis: limits, the heart & soul of calculus.
Limits provide a rigorous basis for ideas like sequences, series, continuity, derivatives,
integrals. More advanced: model an arbitrary function as a limit of a sequence of "nice"
functions (polynomials, trigonometric functions) or as a sum of "nice" functions (Fourier,
wavelets). All of this requires understanding limits of numbers.
Outline:
1. Logic: not, and, or, implication; rules of inference
2. Sets: elements, intersection, union, containment; special sets
3. The real numbers: algebraic properties (+,×), order properties (<), completeness
properties
4. Sequences: types of, convergence, basic results (arithmetic, etc), subsequences, Cauchy
sequences
5. Series: convergence tests, absolute convergence, power series
6. Functions: arithmetic, behavior, continuity & limits, IVT, compact domains
7. Differentiation: MVT, L’Hopital, Taylor & linearization
8. Integrals: integrability and the Riemann integral
9. Special functions: exp, log, gamma
10. Seqs and series of functions
0.2 Logic and inference
Most theorems involve proving a statement of the form “if A is true, then B is true.” This
is written A =⇒ B and called if-then or implication. A is the hypothesis and B is the
conclusion. To say “the hypothesis is satisfied” means that A is true. In this case, one
can make the argument
A =⇒ B
A
B
and infer that B must therefore be true, also.
What does A =⇒ B mean? We use the more familiar connectives “and” and “or”
and “not” (¬) to describe it, via truth tables. Consider:
A B A and B
T T T
T F F
F T F
F F F
and
A B A or B
T T T
T F T
F T T
F F F
and
A ¬A
T F
F T
A =⇒ B means that whenever A is true, B must also be true, i.e., it CANNOT be
the case that A is true and B is false: (A =⇒ B) ≡ ¬(A and ¬B). This means that the truth
table for =⇒ can be found:
A B ¬B A and (¬B) ¬(A and ¬B)
T T F F T
T F T T F
F T F F T
F F T F T
Reading off the last column, the truth table for =⇒ is:
A B A =⇒ B
T T T
T F F
F T T
F F T
A B ¬A ¬B A =⇒ B ¬(A and ¬B) ¬A or B ¬B =⇒ ¬A B =⇒ A ¬A =⇒ ¬B
T T F F T T T T T T
T F F T F F F F T T
F T T F T T T T F F
F F T T T T T T T T
If A =⇒ B and B =⇒ A, then the statements are equivalent and we write “A if
and only if B” as A ⇐⇒ B,A ≡ B, or A iff B. This is often used in definitions.
A B A =⇒ B B =⇒ A (A =⇒ B) and (B =⇒ A) A ⇐⇒ B
T T T T T T
T F F T F F
F T T F F F
F F T T T T
If you know that A ⇐⇒ B, then you can replace A with B (or vice versa) wherever it
appears. A ≡ B is like "=" for logical statements.
One last rule (DeMorgan):
A B ¬A ¬B ¬(A and B) ¬A or ¬B ¬(A or B) ¬A and ¬B
T T F F F F F F
T F F T T T F F
F T T F T T F F
F F T T T T T T
Thus, ¬(A and B) ≡ (¬A or ¬B) and ¬(A or B) ≡ (¬A and ¬B).
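Since the connectives are defined by finite truth tables, all of these laws can be checked mechanically. A quick sketch in Python (the function name `implies` is ours, not from the notes):

```python
from itertools import product

def implies(a, b):
    """A => B, defined (as above) as not (A and not B)."""
    return not (a and not b)

# Check the laws over every truth assignment.
for A, B in product([True, False], repeat=2):
    assert (not (A and B)) == ((not A) or (not B))   # DeMorgan 1
    assert (not (A or B)) == ((not A) and (not B))   # DeMorgan 2
    assert implies(A, B) == implies(not B, not A)    # contrapositive
    assert implies(A, B) == ((not A) or B)           # cf. Question 1(a) below
```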
Example 0.2.1. Thm: a bounded increasing sequence converges.
This means: If a sequence {an} is increasing and bounded, then it converges, i.e.,
({an} increasing) and ({an} bounded) =⇒ {an} converges.
Suppose we are considering the sequence where an = 1 − 1/n. We apply the theorem and
see that an must converge (to something?).
Suppose we are considering the sequence an = (−1)n, which is known to diverge. The
theorem is still helpful; by contrapositive,
¬({an} converges) =⇒ ¬(({an} increasing) and ({an} bounded))
{an} diverges =⇒ ¬({an} increasing) or ¬({an} bounded),
using DeMorgan. So an is either not increasing or unbounded. However, an is bounded,
because every term is contained in the finite interval [−1, 1]. Thus, we can infer that an
must not be increasing. (Note: not increasing does not imply decreasing!)
How to prove A =⇒ B.
Direct proof.
1. Assume the hypothesis, i.e., assume A is true, just for now.
2. Apply this “fact” and other basic knowledge.
3. Show that B is true, based on all this.
Example 0.2.2 (direct pf). n odd =⇒ n2 odd.
1. Assume n is an odd integer.
2. Then n = 2k + 1, for some integer k, so
n2 = (2k + 1)2 = 4k2 + 4k + 1 = 2(2k2 + 2k) + 1 = 2m + 1, where m := 2k2 + 2k ∈ Z.
3. Thus, n2 is odd.
Indirect proof: Proof by contrapositive.
(A =⇒ B) ≡ (¬B =⇒ ¬A),
so show ¬B =⇒ ¬A directly.
Example 0.2.3 (contrapositive). 3n + 2 odd =⇒ n odd.
The contrapositive is: n even =⇒ 3n + 2 even.
1. Assume n is an even integer.
2. Then n = 2k, for some integer k, so
3n + 2 = 3(2k) + 2 = 6k + 2 = 2(3k + 1) = 2m, for some m ∈ Z.
3. Thus, 3n + 2 is even.
Example 0.2.4 (contrapositive). n2 even =⇒ n even.
This is just the contrapositive of the prev. example.
Indirect proof: Proof by contradiction.
In order to show that A is true by contradiction,
1. assume that A is false (assume ¬A is true)
2. derive a contradiction (show that ¬A implies something which is clearly false/impossible)
Example 0.2.5 (contradiction). √2 is irrational.
1. Assume the negation of the statement: √2 = m/n, for some m, n ∈ Z.
2. If m, n have a common factor, we can cancel it out to obtain
√2 = a/b, in lowest terms. (∗)
Squaring both sides,
2 = a2/b2
2b2 = a2.
This shows a2 is even. But we just showed in the prev ex that
a2 even =⇒ a even,
so a must be even. This means a = 2c for some integer c, so
2b2 = (2c)2 = 4c2
b2 = 2c2.
This shows that b2 is even. But then b must also be even, so a and b share the factor 2,
contradicting (∗): a/b was in lowest terms.
Mathematical (weak) induction: how to prove statements of the form
P (n) is true for every n.
1. Basis step: show that P (0) or P (1) is true.
2. Induction step: show that P (n) =⇒ P (n + 1).
Example 0.2.6 (induction). The sum of the first n odd positive integers is n2.
1. Basis step: the sum of the first 1 odd positive integers is 1 = 12.
2. Induction step: show that
[1 + 3 + 5 + · · · + (2n − 1) = n2] =⇒ [1 + 3 + 5 + · · · + (2n − 1) + (2n + 1) = (n + 1)2].
This is a statement A =⇒ B which we show directly, so assume A is true:
1 + 3 + 5 + · · ·+ (2n− 1) = n2.
(This is the induction hypothesis.)
1 + 3 + 5 + · · ·+ (2n− 1) + (2n + 1) = (1 + 3 + 5 + · · ·+ (2n− 1)) + (2n + 1)
= n2 + (2n + 1)
= (n + 1)2.
Thus we have shown that B is true, based on the assumption A. Hence, we have
proven the statement: A =⇒ B.
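The claim is also easy to spot-check numerically. A small Python sketch (evidence only; the induction above is the proof, and the function name is ours):

```python
# Check that the sum of the first n odd positive integers is n^2.
def sum_first_odds(n):
    # The k-th odd positive integer is 2k - 1.
    return sum(2 * k - 1 for k in range(1, n + 1))

for n in range(1, 101):
    assert sum_first_odds(n) == n ** 2
```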
Question 1. (a) Use the DeMorgan laws to argue that ¬(A and ¬B) ≡ (¬A or B).
(b) Use induction to show n! ≤ n^n for every n ∈ N.
0.2.1 Set operations and logical connectives.
intersection: A ∩ B = {x : x ∈ A and x ∈ B}
union: A ∪ B = {x : x ∈ A or x ∈ B}
complement: Ac = {x : x /∈ A}
difference: A \ B = {x : x ∈ A and x /∈ B} = A ∩ Bc
product: A × B = {(x, y) : x ∈ A and y ∈ B}
containment: A ⊆ B ⇐⇒ (x ∈ A =⇒ x ∈ B)
Example 0.2.7. “Convergent sequences are bounded.”
({an} is convergent) =⇒ ({an} is bounded)
The set of convergent sequences is a subset of the bounded sequences.
Question 2. (a) Use the DeMorgan laws to argue that (A∩B)c = Ac∪Bc and (A∪B)c =
Ac ∩Bc.
(b) Prove that the empty set is a subset of every set.
Chapter 1
Preliminaries
1.1 Quantifiers
Sentential logic/propositional calculus:
A,B are absolute statements about the state of affairs. Any expression like A is (globally)
true or false.
How to express more delicate ideas: relations between specific objects/individuals, etc?
Predicate logic (aka 1st order logic):
A(x), B(x) are statements about a variable x. It may be that A(n) is true but A(m) is
false!
So how to express when something is always true or sometimes true or never true?
Use quantifiers.
Definition 1.1.1 (Universal quantifier). If A(x) is true for every possible value of x
(under discussion), we say ∀x,A(x).
Example 1.1.1. x2 ≥ 0, ∀x ∈ R. TRUE
∀a, b, a < b =⇒ a2 < b2. FALSE
∀a, b ≥ 0, a < b =⇒ a2 < b2. TRUE
∀x ∈ (0, 1), ∀n ≥ 2, xn < x. TRUE (assume n ∈ N; for n = 1, xn = x).
[May 2, 2007]
Definition 1.1.2 (Existential quantifier). If A(x) is true for at least one allowable value
of x, we say ∃x,A(x), or ∃x such that A(x).
Example 1.1.2. ∃x, x = −x. TRUE (x = 0)
∃x, x2 = x. TRUE (x = 0, 1)
What are the negations?
¬∃x,A(x) means there is no x for which A(x) is true, i.e., A(x) is false for every x:
¬∃x,A(x) ≡ ∀x,¬A(x).
∃x,¬A(x) means there is an x for which A(x) is false, i.e., A(x) is not true for every x:
∃x,¬A(x) ≡ ¬∀x,A(x).
Example 1.1.3. 1. ¬∃x ∈ R, x2 < 0 is the same as ∀x ∈ R, x2 ≥ 0.
2. Not every triangle is equilateral: ¬∀t, Eq(t).
There are nonequilateral triangles: ∃t,¬Eq(t).
3. Every prime number is odd:
∀p ∈ P, Odd(p) ⇐⇒ ¬¬∀p ∈ P, Odd(p) ⇐⇒ ¬∃p ∈ P,¬Odd(p).
We used A ≡ ¬¬A. However, 2 ∈ P and ¬Odd(2) give ∃p ∈ P, ¬Odd(p), contradicting
the last form. Therefore, the original statement ∀p ∈ P, Odd(p) is false.
Order of quantifiers.
∀x, ∀y, A(x, y) ≡ ∀y, ∀x,A(x, y)
∃x, ∃y, A(x, y) ≡ ∃y, ∃x,A(x, y)
However, cannot interchange different types of quantifier!
∃x, ∀y, A(x, y) ≢ ∀y, ∃x, A(x, y)
Example 1.1.4. Suppose M(x, y) means “x is the mother of y”. Then:
∀y, ∃x,M(x, y) means that every y has a mother, but
∃x, ∀y, M(x, y) means that there is some x which is the mother of every y.
Note: one implication is valid.
∃x, ∀y, A(x, y) =⇒ ∀y, ∃x,A(x, y).
1.1.3 Exercises: #3 Due: Jan. 29
Question 3. Interpret in words: ∀x, ∃y, y > x but not ∃y, ∀x, y > x. (x, y are integers)
1.2 Infinite Sets
1.2.1 Countable sets
Definition 1.2.1. Two sets A and B have the same cardinality iff they can be put in
one-to-one correspondence, i.e., if every element of A corresponds to a unique element of
B; all elements are “paired off”.
Cardinality is “size” for finite sets, but it works for infinite sets, too.
Definition 1.2.2. A set A is infinite iff there is a proper subset B ⊆ A which has the
same cardinality as A.
Definition 1.2.3. A set A is countable iff it has the same cardinality as the natural
numbers N = {1, 2, 3, 4, . . . }, i.e., if we can write
A = {a1, a2, a3, . . . }.
Are some infinite sets larger than others? Yes! Some sets have too many elements to
count.
Example 1.2.1. N is infinite (and countable): pair each n with n + 1,
1, 2, 3, 4, . . .
2, 3, 4, 5, . . . ,
which matches N with its proper subset {2, 3, 4, . . . }.
Thus, a countable set is infinite.
Theorem 1.2.4. The set of integers Z is countable.
Proof. 0, 1, -1, 2, -2, . . .
More precisely, define a function a : N → Z by
an := n/2 (n even), −(n − 1)/2 (n odd),
so that a1 = 0, a2 = 1, a3 = −1, a4 = 2, a5 = −2, . . . ,
and convince yourself that it's a bijection.
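The pairing can be checked mechanically. A Python sketch (the function name is ours), using the formula aₙ = n/2 for even n and −(n − 1)/2 for odd n, indexed so that a1 = 0 to match the listing 0, 1, −1, 2, −2, . . . :

```python
def a(n):
    # n even -> n/2; n odd -> -(n - 1)/2, producing 0, 1, -1, 2, -2, ...
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

assert [a(n) for n in range(1, 8)] == [0, 1, -1, 2, -2, 3, -3]

# Bijectivity on a finite window: the first 101 values hit each
# integer in [-50, 50] exactly once.
values = [a(n) for n in range(1, 102)]
assert sorted(values) == list(range(-50, 51))
```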
Theorem 1.2.5. N is “the smallest” infinite set, i.e., every subset of N is either finite or
countable.
Proof. Homework. Write it out, cross ’em off.
Theorem 1.2.6. If A is countable and B is countable, then A ∪B is countable.
Proof. Since A = {a1, a2, a3, . . . } and B = {b1, b2, b3, . . . }, we can write
A ∪B = {a1, b1, a2, b2, a3, b3, . . . }.
That was a direct proof.
Theorem 1.2.7. Suppose A1, A2, A3, . . . , An are countable. Then the union is also countable:
⋃_{k=1}^{n} Ak = A1 ∪ · · · ∪ An = {x : x ∈ Ak, for some k ≤ n}.
Proof. Homework. Use induction and the previous result.
What if we take an infinite union?
Theorem 1.2.8. Suppose A1, A2, A3, . . . is a countable sequence of countable sets. Then
the union is also countable:
⋃_{k=1}^{∞} Ak = {x : ∃k such that x ∈ Ak}.
Proof. If we write Ai = {ai1, ai2, . . . }, arrange the elements in a grid (row i lists Ai) and
enumerate the grid along its diagonals; every element is reached after finitely many steps.
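The diagonal enumeration of the grid can be made concrete. A Python sketch (the function name is ours): it lists the grid positions (i, j) in the order in which the elements a_ij would be counted.

```python
def diagonal_order(diagonals):
    """List grid positions (i, j), i, j >= 1, along anti-diagonals
    i + j = const: (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), ...
    Every (i, j) appears after finitely many steps, which is why the
    grid can be listed as a single sequence."""
    out = []
    for s in range(2, diagonals + 2):   # s = i + j indexes the diagonal
        for i in range(1, s):
            out.append((i, s - i))
    return out

assert diagonal_order(3) == [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
```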
How about a product?
Theorem 1.2.9. Suppose A1, A2, A3, . . . , An are countable. Then so is the product:
∏_{k=1}^{n} Ak = A1 × A2 × A3 × · · · × An = {(a1, a2, . . . , an) : ak ∈ Ak}.
Proof. We use induction. The basis step is to see that the product of two countable sets
is countable:
A × B = {(x, y) : x ∈ A, y ∈ B} = ⋃_{x∈A} ⋃_{y∈B} {(x, y)},
a countable union of countable sets, which is countable by Theorem 1.2.8.
Now for the induction step, assume ∏_{k=1}^{n−1} Ak is countable. Then
A1 × A2 × A3 × · · · × An−1 × An = (∏_{k=1}^{n−1} Ak) × An
is a product of two countable sets, hence countable by the basis step.
Corollary 1.2.10. The set of rational numbers Q is countable.
Proof. Homework.
How about a countable product?
Theorem 1.2.11. Let A1, A2, A3, . . . all have more than one element. Then ∏_{k=1}^{∞} Ak
is uncountable.
Proof. Consider the simplest case, where each Ak = 2 = {0, 1}. Then we have the set of
all binary sequences:
2^N := ∏_{k=1}^{∞} {0, 1} = {(a1, a2, a3, . . . ) : ai = 0 or 1}.
Suppose, by way of contradiction, that 2^N were countable. Then we have a list
a1 = 0 01100101010101010 . . .
a2 = 0 1 0101010101010101 . . .
a3 = 10 1 010101101101010 . . .
a4 = 111 0 01010100010010 . . .
(the offset digit in row k is the kth entry of ak), and this list contains ALL elements of
2^N. Now consider the element a∗ = 1001 . . . obtained by flipping each diagonal entry:
a∗ differs from ak in the kth coordinate, so a∗ is not on the list, a contradiction.
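The diagonal construction can be sketched on a finite window of the list above (rows truncated to length 4; variable names are ours):

```python
# First four digits of a1, ..., a4 from the list above.
rows = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
]

# Flip the diagonal: the k-th entry of a* is 1 minus rows[k][k].
a_star = [1 - rows[k][k] for k in range(len(rows))]
assert a_star == [1, 0, 0, 1]          # matches a* = 1001...

# a* disagrees with row k in place k, so it cannot equal any row.
for k, row in enumerate(rows):
    assert a_star[k] != row[k]
```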
Corollary 1.2.12. The set of real numbers R is uncountable.
Proof. R contains all nonterminating decimal numbers of the form 0.00101011001..., and
there are uncountably many of these.
Definition 1.2.13. The power set of A is
2^A = {B : B ⊆ A} = {f : A → 2}.
Example 1.2.2. Suppose A = {1, 2, 3, 4, 5}. Then the subset {2, 3, 5} corresponds to
(0, 1, 1, 0, 1) ∈ 2^A. This is the function 1 ↦ 0, 2 ↦ 1, 3 ↦ 1, 4 ↦ 0, 5 ↦ 1.
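The correspondence subset ↔ binary tuple can be checked directly. A Python sketch (the function name is ours):

```python
from itertools import combinations

# Subsets of A = {1, ..., 5} <-> binary 5-tuples: coordinate k is 1
# exactly when k lies in the subset.
A = [1, 2, 3, 4, 5]

def indicator(subset):
    return tuple(1 if k in subset else 0 for k in A)

assert indicator({2, 3, 5}) == (0, 1, 1, 0, 1)

# The correspondence is a bijection: 2^5 = 32 distinct tuples.
tuples = {indicator(set(c)) for r in range(6) for c in combinations(A, r)}
assert len(tuples) == 32
```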
1.2.3 Exercises: #1,3 Recommended: #2,4 Due: Jan. 29
1. Every subset of N is either finite or countable.
2. If A1, A2, A3, . . . , An are countable then ⋃_{k=1}^{n} Ak is countable.
3. Show that the set of algebraic numbers is countable. A number x is algebraic iff
a0 + a1x + a2x^2 + · · · + anx^n = 0,
for some integers ai. Hint: for N ∈ N, there are only finitely many equations with
n + |a0|+ · · ·+ |an| = N.
4. Is the set of all irrational real numbers countable?
1.3 The rational numbers
The evolution of numbers ...
N = {1, 2, 3, 4 . . . }.
To solve equations like x + 5 = 2, need to add negatives:
Z = {. . . ,−2,−1, 0, 1, 2, . . . }.
To solve equations like x× 5 = 2, need to add rationals:
Q = {m/n : m, n ∈ Z, n ≠ 0}.
To solve equations like x^n = 2, need to add roots like the nth root of 2. What if we add
solutions to all polynomials anx^n + · · · + a1x + a0 = 0? We get the algebraic numbers A,
with π /∈ A, but √−1 ∈ A.
So what is R anyway? Roughly:
R ≈ {lim xn : {xn} ⊆ Q, {xn} converges}.
Problem: how to define lim xn only in terms of Q? The usual definition of limit says that
limxn = L iff
∀ε > 0, ∃N, n ≥ N =⇒ |xn − L| < ε.
This is circular: cannot use a real number L to define itself. Cauchy sequences will
overcome this.
1.3.1 The abstract structure of Q.
Definition 1.3.1. A group is a set with an associative binary operation, an identity, and
inverses. Written additively, (G, +, 0) must satisfy
1. x, y ∈ G =⇒ x + y is a well-defined element of G.
2. ∃!0 such that x + 0 = 0 + x = x, ∀x ∈ G.
3. ∀x ∈ G, ∃!y ∈ G such that x + y = y + x = 0. Write y = −x.
Written multiplicatively, (G, ×, 1) must satisfy
1. x, y ∈ G =⇒ x × y is a well-defined element of G.
2. ∃!1 such that x × 1 = 1 × x = x, ∀x ∈ G.
3. ∀x ∈ G, ∃!y ∈ G such that x × y = y × x = 1. Write y = 1/x.
Theorem 1.3.2. Z, Q, R, C are groups under addition. N, N0 are not.
Theorem 1.3.3. Let Q× = {x ∈ Q : x ≠ 0}. Then Q× is a group under multiplication.
So are R× and C×. Z× is not.
Definition 1.3.4. A set which is a group under addition, and whose nonzero elements
form a group under multiplication is called a field if the two operations behave nicely
together:
a× (b + c) = (a× b) + (a× c) (Distributive law)
and the operations +,× are commutative.
Theorem 1.3.5. Q, R, C are fields. GLn = {invertible n× n matrices} is not.
The set Q is defined
Q = {(m, n) : m, n ∈ Z, n ≠ 0},
and the operations +,× are defined on it in terms of the familiar operations in Z by:
(p, q) + (r, s) = (ps + rq, qs)
(p, q) × (r, s) = (pr, qs)
or, in the familiar notation,
p/q + r/s = (ps + qr)/(qs)
(p/q) × (r/s) = (pr)/(qs).
With these operations, Q has the algebraic structure of a field.
From now on, write × by juxtaposition or with · . There is an equivalence relation on Q:
(p, q) ≃Q (r, s) ⇐⇒ ps =Z qr, i.e., p/q ≃ r/s ⇐⇒ ps = qr.
Definition 1.3.6. Any equivalence relation satisfies the following, for all elements of the
set:
1. x ' x. (reflexivity)
2. x ' y =⇒ y ' x. (symmetry)
3. x ' y, y ' z =⇒ x ' z. (transitivity)
There is a total order structure on Q.
Definition 1.3.7. Any order relation < satisfies the following, for all elements of the set:
1. x ≮ x. (antireflexivity)
2. x < y =⇒ y ≮ x. (antisymmetry)
3. x < y, y < z =⇒ x < z. (transitivity)
Definition 1.3.8. A total order also satisfies:
∀x, y, exactly one is true: x < y, x = y, or y < x. (trichotomy)
NOTE: trichotomy may allow you to break a proof into cases!
x ≤ y is shorthand for (x < y or x = y).
Definition 1.3.9. An ordered field is a field with an order < that satisfies
1. x, y > 0 =⇒ x + y, x× y > 0
2. x < y ⇐⇒ x + z < y + z.
Theorem 1.3.10. Q, R are ordered fields. C is not.
An order structure on a field allows us to define a notion of distance. First, we define
a notion of size by
Definition 1.3.11. The absolute value (or magnitude or modulus) of a is
|a| = { a, if a ≥ 0; −a, if a < 0 }.
NOTE: obviously, −|a| ≤ a ≤ |a|.
Then the distance from one element to another is defined as the size of the difference:
dist(x, y) = |x− y|.
Theorem 1.3.12 (Triangle inequality). |a + b| ≤ |a|+ |b|.
Proof 1. Using the OBVIOUS note,
−|a| ≤ a ≤ |a| and −|b| ≤ b ≤ |b|.
Adding these,
−(|a| + |b|) ≤ a + b ≤ |a| + |b|, so |a + b| ≤ |a| + |b|.
Proof 2. |a + b|2 = (a + b)(a + b) = a2 + 2ab + b2 ≤ a2 + 2|a||b|+ b2 = (|a|+ |b|)2.
This allows for a quantitative version of "if x is close to y and y is close to z, then x is
close to z": let |x − y| < ε and |y − z| < ε. Then:
|x − z| = |x + (−y + y) − z| = |(x − y) + (y − z)| ≤ |x − y| + |y − z| < ε + ε = 2ε.
So if we want x to be within 1/2 of z, find x within 1/4 of y and z within 1/4 of y.
Other forms of the ∆ ineq:
|x − y| ≥ |x| − |y|
|x − y| ≥ ||x| − |y||
|∑_{i=1}^{n} xi| ≤ ∑_{i=1}^{n} |xi|
Proof. Fun! (And required)
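While writing the proof, the inequalities are easy to spot-check numerically. A Python sketch (evidence over random samples, not a proof; names are ours):

```python
import random

random.seed(0)
TOL = 1e-12                       # allowance for float rounding

def check(a, b):
    assert abs(a + b) <= abs(a) + abs(b) + TOL           # triangle ineq.
    assert abs(a - b) >= abs(a) - abs(b) - TOL           # first variant
    assert abs(a - b) >= abs(abs(a) - abs(b)) - TOL      # reverse form

for _ in range(1000):
    check(random.uniform(-10, 10), random.uniform(-10, 10))

# Finite-sum form: |sum x_i| <= sum |x_i|.
xs = [random.uniform(-1, 1) for _ in range(50)]
assert abs(sum(xs)) <= sum(abs(x) for x in xs) + TOL
```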
Theorem 1.3.13 (Axiom of Archimedes). Let x > 0. Given any M (no matter how
large), ∃y ∈ Q such that xy > M .
By the field properties of Q, this is equivalent to:
Let x > 0. Given any ε (no matter how small), ∃y ∈ Q such that 0 < xy < ε.
These are also true for R.
A basic idea of analysis:
a < b =⇒ ∃c ∈ (a, b) ∩ R.
I.e., a < c < b and c ∈ R.
Question 4. What does this mean?
∀ε > 0, |a− b| < ε
1.4 Axiom of Choice
Given a sequence of nonempty sets A1, A2, . . . , the product ∏_{k=1}^{∞} Ak is nonempty.
1.5 A vocabulary for sequences
Definition 1.5.1. A sequence of numbers is a countably infinite ordered list a1, a2, . . . .
Equivalently, a function a : N → R, where a(n) = an.
A sequence can be specified by giving
(i) the first few terms: {1, 1/2, 1/3, . . . },
(ii) an explicit formula for the nth term: {1/n}, or
(iii) a recurrence relation for the nth term: a1 = 1, an+1 = (n/(n + 1)) an.
Example 1.5.1. The Fibonacci numbers can be described by
(i) {1, 1, 2, 3, 5, 8, 13, 21, . . . }
(ii) { (1/√5) ((1 + √5)/2)^n − (1/√5) ((1 − √5)/2)^n }, or
(iii) a0 = 1, a1 = 1, an+2 = an+1 + an.
Definition 1.5.2. {an} is increasing iff an ≤ an+1, ∀n.
{an} is strictly increasing iff an < an+1, ∀n.
{an} is decreasing (strictly decreasing) iff an ≥ an+1 (an > an+1), ∀n.
Definition 1.5.3. {an} is monotone iff it is increasing or decreasing.
Definition 1.5.4. A sequence {an} is bounded above if there is a number B ∈ R such
that an ≤ B, ∀n. This B is an upper bound for the sequence {an}.
Definition 1.5.5. {an} is bounded below if there is a number B ∈ R such that an ≥ B, ∀n.
This B is a lower bound for the sequence {an}.
Definition 1.5.6. {an} is bounded iff it is bounded above and bounded below.
Definition 1.5.7. {an} is positive (negative), written an ≥ 0 (an ≤ 0), iff {an} is bounded
below (above) by 0.
Chapter 2
Construction of the real numbers
The completeness of R.
We have seen that Q and R are both ordered fields, so what is the difference? Topology:
R is connected, Q is not. For example, √2 is not rational: Q has a "hole" at √2.
In topology, a set X is defined to be disconnected iff there are two nonempty open sets
A, B such that A ∪ B = X and A ∩ B = ∅; X is connected iff no such pair exists. Example
(as subsets of Q):
Q = ((−∞, √2) ∩ Q) ∪ ((√2, ∞) ∩ Q).
R cannot be written in such a way.
Another way to phrase this: completeness. Let A ⊆ R be nonempty and bounded above,
say A ⊆ (b, c). Then ∃x ∈ R such that
1. x is an upper bound for A: ∀a ∈ A, a ≤ x.
2. y is an upper bound for A =⇒ x ≤ y.
We say x is the least upper bound for A or supremum of A, and write x = sup A.
Example: there is no "smallest rational number" that is larger than (or at least as
large as) every element of A = (0, √2).
2.1 Cauchy sequences
Definition 2.1.1. Let {xn} be a sequence in Q. We say the limit of {xn} is L (or that
{xn} converges to L) iff
For each m = 1, 2, . . . , ∃Nm such that n ≥ Nm =⇒ |xn − L| < 1/m.
Write lim xn = L or xn → L.
This is the same as the more familiar definition
∀ε > 0, ∃Nε, n ≥ N =⇒ |xn − L| < ε.
NOTE 1: the presence of ∀ makes the strictness of the inequality irrelevant, i.e., it is
equivalent to
∀ε > 0, ∃Nε, n ≥ N =⇒ |xn − L| ≤ ε.
NOTE 2: since ∃N occurs after ∀ε, it is implicit that N depends on ε. From now on,
drop the ε.
Definition 2.1.2. If for every N ∈ N we can find M such that n ≥ M =⇒ xn > N, then
we say lim xn = ∞, i.e., xn → ∞.
Definition 2.1.3. A sequence {xn} in Q is a Cauchy sequence iff
∀ε > 0, ∃N such that m, n ≥ N =⇒ |xm − xn| < ε, i.e., |xm − xn| → 0 as m, n → ∞.
Example 2.1.1 (Nonexample). 1, 2, 2 1/2, 3, 3 1/3, 3 2/3, 4, . . . .
Definition 2.1.4 (Alternative). A sequence {xn} in Q is a Cauchy sequence iff
∀ε > 0, ∃(a, b) such that |b − a| < ε and {xN, xN+1, xN+2, . . . } ⊆ (a, b), for some N.
Example 2.1.2. The tail of the nonCauchy sequence has no upper bound, so any interval
containing it is of the form [x,∞).
The definition of Cauchy sequence makes no claim about convergence! (To a specified
limiting object.) How to know when a Cauchy sequence converges? Define
R = {{xn} : {xn} is a Cauchy sequence in Q}.
Idea: as a real number, {xn} = lim xn.
Then prove:
Theorem 2.1.5. A sequence in R has a limit ⇐⇒ it is Cauchy.
Proof. Later: §2.3. For now, pretend it sounds good.
PROBLEM: uniqueness. What if two Cauchy sequences tend to the same L?
SOLUTION: define them to be the same if they do tend to the same L:
Definition 2.1.6. Two Cauchy sequences {xn} and {yn} are equivalent ({xn} ≃ {yn}) iff
∀ε > 0, ∃Nε such that n ≥ Nε =⇒ |xn − yn| < ε.
THINK: By Thm just above, this means: two Cauchy sequences are equivalent iff they
have the same limit.
Theorem 2.1.7. Equivalence of Cauchy sequences really is an equivalence relation.
Proof. Must show: reflexivity, symmetry, transitivity.
1. reflexive: {xn} ≃ {xn}.
∀ε > 0, ∃N such that n ≥ N =⇒ |xn − xn| = 0 < ε.
2. symmetric: {xn} ≃ {yn} =⇒ {yn} ≃ {xn}. Since |xn − yn| = |yn − xn|, this is clear.
3. transitive. Suppose {xn} ≃ {yn} and {yn} ≃ {zn}. Must show {xn} ≃ {zn}. Fix
m ∈ N. From {xn} ≃ {yn}, can find N1 such that
n ≥ N1 =⇒ |xn − yn| < 1/m. (∗)
From {yn} ≃ {zn}, can find N2 such that
n ≥ N2 =⇒ |yn − zn| < 1/m. (∗∗)
If N := max{N1, N2}, then N satisfies both (∗) and (∗∗), so for n ≥ N,
|xn − zn| = |xn − yn + yn − zn| ≤ |xn − yn| + |yn − zn| < 2/m.
PROOF REDUX: make the initial bounds < 1/(2m) to end with < 1/m.
Theorem 2.1.8 (The K-ε principle.). Suppose {an} is a sequence and for any ε > 0, it
is true that |an − L| < Kε for n ≥ N , where K > 0 is a fixed constant (doesn’t depend
on n, ε). Then lim an = L.
Proof. Exercise.
Definition 2.1.9. A real number is an (equivalence class of) Cauchy sequences of rational
numbers. Therefore, “x ∈ R” means
∃{xn} ⊆ Q, {xn} Cauchy, and x := lim xn.
So to prove something about x, y ∈ R, prove it for {xn}, {yn}.
Theorem 2.1.10 (Uniqueness of limits). A sequence xn has at most one limit.
Proof. We must show that (xn → L) and (xn → L′) =⇒ L = L′. Assume that both L
and L′ are limits of xn and suppose, by way of contradiction, that L ≠ L′. Then we may
choose ε = |L − L′|/2, so that ε > 0 and 2ε = |L − L′|. From the assumptions, we have
|xn − L| < ε and |xn − L′| < ε for all large n. Thus,
|L − L′| = |L − xn + xn − L′| ≤ |L − xn| + |xn − L′| < 2ε = |L − L′|,
i.e., |L − L′| < |L − L′|, contradicting x ≮ x.
2.2 The reals as an ordered field
A rigorous argument would construct R as equivalence classes of Cauchy sequences, and
then prove:
• R is an ordered field.
• Every Cauchy sequence in R converges to a point in R.
• R satisfies the Axiom of Archimedes.
We give the idea.
Properties defined on Q can be passed to R by the limit. For example, the field
operations:
Definition 2.2.1. Let x, y ∈ R. Then pick any rational sequences {xn}, {yn} with
limxn = x and lim yn = y. Define x + y = lim(xn + yn) and x · y = lim(xn · yn).
This definition only makes sense if {xn + yn}, {xn · yn} are Cauchy:
Theorem 2.2.2. If {xn} and {yn} Cauchy in Q, then (i) so is {xn + yn}, and (ii) so is
{xn · yn}.
Proof. (i) Given ε > 0, we can find N1, N2 such that
m, n ≥ N1 =⇒ |xm − xn| < ε, and m,n ≥ N2 =⇒ |ym − yn| < ε.
Then let N = max(N1, N2). For m,n ≥ N , we have
|(xm + ym) − (xn + yn)| = |(xm − xn) + (ym − yn)| ≤ |xm − xn| + |ym − yn| < ε + ε = 2ε.
(ii) Given k ∈ N, we can again find N1, N2 such that
m, n ≥ N1 =⇒ |xm − xn| < 1/k, and m, n ≥ N2 =⇒ |ym − yn| < 1/k.
Then let N = max(N1, N2). For m, n ≥ N,
|xmym − xnyn| = |xmym − xnym + xnym − xnyn|
≤ |xmym − xnym| + |xnym − xnyn|
= |ym||xm − xn| + |xn||ym − yn|
< |ym| · (1/k) + |xn| · (1/k) ≤ 2M/k,
where M bounds both sequences (every Cauchy sequence is bounded; see the lemma below).
Cauchy sequences are bounded
Lemma 2.2.3. Every Cauchy sequence is bounded.
Proof. Let {xn} be Cauchy, and choose ε = 1. Then there is some (a, b) and N such that
|b − a| < 1 and {xN, xN+1, . . . } ⊆ (a, b). Define M := max{|x1|, |x2|, . . . , |xN−1|, |a|, |b|}.
Since this is a finite set, it has a maximum. For each xk,
k < N =⇒ |xk| ≤ M, and
k ≥ N =⇒ xk ∈ (a, b) =⇒ |xk| ≤ max{|a|, |b|} ≤ M.
Theorem 2.2.4. R is a complete ordered field. Furthermore, there is only one complete
ordered field (up to isomorphism). Also, R satisfies the Axiom of Archimedes.
Theorem 2.2.5 (Density of Q in R). Let a < b be real numbers. Then
(i) ∃r ∈ Q such that a < r < b, and
(ii) ∃s ∈ R \Q such that a < s < b.
Proof. (i) b−a > 0, so we can find n such that n(b−a) > 1 by the Archimedean property.
Let m be an integer such that m − 1 ≤ na < m. (This is possible, since ⋃_{m∈Z} [m − 1, m) = R.)
Then
na < m ≤ 1 + na < nb,
and dividing by n gives a < m/n < b.
(ii) Homework. Use (i) and √2 /∈ Q.
This can be extended: see optional HW.
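The recipe in part (i) is effective: given a < b, it produces an explicit rational between them. A Python sketch following the proof (the function name is ours; floats stand in for the real inputs):

```python
from math import ceil, floor

def rational_between(a, b):
    """Follow the density proof: choose n with n(b - a) > 1 (the
    Archimedean property guarantees one exists), then m = floor(n*a) + 1
    satisfies m - 1 <= n*a < m, hence a < m/n < b. Returns (m, n)."""
    assert a < b
    n = ceil(1 / (b - a)) + 1      # n > 1/(b - a), so n(b - a) > 1
    m = floor(n * a) + 1           # m - 1 <= n*a < m
    return m, n

m, n = rational_between(0.5, 0.6)
assert 0.5 < m / n < 0.6
```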
2.3 Limits and completeness
Theorem 2.3.1 (Completeness of R). A sequence {x1, x2, . . . } of real numbers converges
iff it is a Cauchy sequence.
Proof. (⇒) Assume {xn} converges to x ∈ R. Then fix ε > 0 and find N such that
n ≥ N =⇒ |x− xn| < ε.
Then if j, k ≥ N ,
|xj − xk| = |xj − x + x− xk| ≤ |xj − x|+ |x− xk| < ε + ε = 2ε.
(⇐) Assume {xn} is Cauchy. We need to find a Cauchy sequence {yn} ⊆ Q to define
y, and then show lim xn = y.
1. For each xk, we can find a rational in (xk − 1/k, xk + 1/k) by density; call it yk. To
see that {yk} is Cauchy, fix a positive error distance ε > 0 and find N such that
j, k ≥ N =⇒ |xj − xk| < ε.
This is possible, since {xn} is Cauchy. Then
|yj − yk| ≤ |yj − xj| + |xj − xk| + |xk − yk| < 1/j + ε + 1/k.
So if we also pick j, k so large that 1/j + 1/k < ε, we get |yj − yk| < 2ε, and {yn} is Cauchy.
2. Now show lim xn = y:
|y − xk| ≤ |y − yk|+ |yk − xk| ≤ |y − yk|+ 1k
< ε for k >> 1.
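Step 1's rational approximations can be chosen explicitly; a hedged sketch (one valid choice among many: yk = ⌊k·xk⌋/k, which lies within 1/k of xk):

```python
import math
from fractions import Fraction

def rational_within(x, k):
    """A rational y with 0 <= x - y < 1/k (the proof only needs |x - y| <= 1/k)."""
    return Fraction(math.floor(k * x), k)

for k in (1, 10, 100, 1000):
    y = rational_within(3.14159, k)
    assert 0 <= 3.14159 - float(y) < 1 / k
```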
Theorem 2.3.2. Let xn → x and yn → y. Then
1. xn + yn → x + y and xn · yn → x · y.
2. If y ≠ 0, then yk ≠ 0 for large k and xk/yk → x/y.
3. xn ≤ yn =⇒ x ≤ y. (Limit location thm)
Proof. (i) already done. (ii) is similar. (iii) Use contradiction: suppose not.
Then xn ≤ yn but x > y. Then |x − y| > 0, so find N such that
|xn − x| < |x − y|/2 and |yn − y| < |x − y|/2 for n ≥ N. (Split the distance |x − y| in half.)
For each xn, yn past the Nth, yn < y + |x − y|/2 = x − |x − y|/2 < xn, contradicting xn ≤ yn.
Corollary 2.3.3. (Squeeze Thm) If xn → L, yn → L and xn ≤ zn ≤ yn for some
sequences {xn}, {yn}, {zn}, then lim zn = L.
NOTE: for both, even if xn < yn, can only conclude x ≤ y.
R does not have a hole at √2.
Theorem 2.3.4. For a > 0, there is a unique positive b ∈ R such that b² = a. (Write b = √a.)
Proof. First, suppose a > 1. Then a(a − 1) > 0, so
a² = a + a(a − 1) > a =⇒ 1 < a < a².
Define y1 := 1 and z1 := a². Divide-and-conquer: define the midpoint m1 := (y1 + z1)/2.
(Sketch).
Pick the interval [y2, z2] (either [y1, m1] or [m1, z1]) such that a ∈ [y2², z2²].
Find the next midpoint m2 := (y2 + z2)/2. This procedure generates two Cauchy sequences
{yn} and {zn}: {yN, yN+1, . . . } is contained in an interval of length (a² − 1)/2^(N−1), so we
can define b := lim yn = lim zn. Then
yn² ≤ a ≤ zn² =⇒ b² ≤ a ≤ b² =⇒ b² = a.
For uniqueness, note that if c² = a, then b² − c² = (b + c)(b − c) = 0. Since R is a field,
b + c = 0 or b − c = 0. Since both b, c are positive, b + c > 0, so b − c = 0 =⇒ b = c.
The case a = 1 is trivial: b = 1.
Finally, 0 < a < 1 =⇒ 1/a > 1, so by the first part there is a unique positive b ∈ R such that b² = 1/a. Then (1/b)² = 1/b² = a, so 1/b is the desired square root of a.
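The divide-and-conquer scheme in the proof is ordinary interval bisection; a numerical sketch (floats stand in for the exact reals, and we start from the simpler invariant y² ≤ a ≤ z²):

```python
def sqrt_bisect(a, steps=60):
    """Bisection for the positive b with b*b == a, as in the proof's sketch."""
    y, z = (1.0, a) if a >= 1 else (a, 1.0)    # invariant: y*y <= a <= z*z
    for _ in range(steps):
        m = (y + z) / 2                        # midpoint, as in the proof
        if m * m <= a:
            y = m                              # keep the half where the invariant holds
        else:
            z = m
    return y

assert abs(sqrt_bisect(2.0) ** 2 - 2.0) < 1e-9
```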
In the same way, but with significantly worse algebra, one can show:
Theorem 2.3.5. For every x > 0 and n ∈ N, there is a unique y > 0 such that y^n = x.
Theorem 2.3.6. a, b > 0 and n ∈ N imply (ab)^(1/n) = a^(1/n) b^(1/n).
Proof. Put α = a^(1/n) and β = b^(1/n), so that ab = α^n β^n = (αβ)^n, by commutativity of mult. The uniqueness part of the previous thm gives
(ab)^(1/n) = αβ = a^(1/n) b^(1/n).
§2.3.3 Exercise: #3 Recommended: #7,8
1. If {an} is an increasing sequence and an → L, show that an ≤ L,∀n.
2. (a) Between any two rationals, there is another rational.
(b) Between any two rationals, there is an irrational. ((ii) above)
(c) Between any two irrationals, there is an irrational.
3. Prove |a− b| ≥ ||a| − |b||.
2.4 Other constructions
(SKIP)
Chapter 3
Topology of the Real Line
“Topology”: the study of qualitative geometric properties: connected, continuous, . . . [1]
Abstractly, this amounts to studying open sets. In R, this is unions of intervals (a, b).
In analysis, topology is all about knowing when a sequence converges.
xn → L ⇐⇒ (U is an open nbd of L =⇒ xn ∈ U for all but finitely many n).
3.1 Limits and bounds
Not all sequences have a limit, so we need another idea.
Definition 3.1.1. Let A be a nonempty subset of R. Then define the supremum of A to
be the least (smallest) upper bound of A. In other words, a = sup A means:
1. A ≤ a, that is, x ∈ A =⇒ x ≤ a. So a is an upper bound of A.
2. A ≤ b =⇒ a ≤ b. So a is the smallest upper bound.
If A has no upper bound, write supA = ∞.
Definition 3.1.2. The infimum of a nonempty set A ⊆ R is the greatest lower bound of
A, defined analogously.
Example 3.1.1. Let A = {x > 0 : x² ≤ 2} and B = {x > 0 : x² ≥ 2}. Then, within the positive reals, A is the set of lower bounds of B and B is the set of upper bounds of A.
[1] May 2, 2007
Theorem 3.1.3. If x is an upper bound of A and x ∈ A, then x = sup A.
Proof. Homework.
Definition 3.1.4. An ordered set S has the least-upper-bound property iff
A ⊆ S, A nonempty and bounded above =⇒ sup A exists in S.
Every ordered set with the l-u-b property also has the greatest lower bound property:
Theorem 3.1.5. Let S have the l-u-b property, and let B ⊆ S be a nonempty set which
is bounded below. Let L be the set of all lower bounds of B. Then α = sup L exists in S
and α = inf B.
Proof. (i) B is bounded below, so L ≠ ∅.
(ii) Every x ∈ B is an upper bound of L, because
L = {y ∈ S : y ≤ x, ∀x ∈ B}.
By (i) and (ii), the l-u-b property implies that L has a supremum α := sup L. We will
show α = inf B.
To see that α ∈ L, i.e., that α is a lower bound of B, let x < α = sup L.
Idea : (x ∈ B =⇒ α ≤ x) ≡ (x < α =⇒ x /∈ B).
Then x is not an upper bound of L, by the defn of sup. Since B is the set of upper bounds
of L (just shown), this means x /∈ B. By contrapositive (Idea), we have shown that α is
a lower bound for B, hence α ∈ L.
Now since α := sup L, α is an upper bound on L and
α < β =⇒ β /∈ L.
We have shown that α is a lower bound of B, and that β is not, if α < β.
Moral: if the set has sups, it also has infs (and vice versa).
Theorem 3.1.6 (Completeness of R, alt version). R has the l-u-b property (and hence
also the g-l-b property).
Proof. Use the Cauchy sequence construction; see Strichartz.
Theorem 3.1.7. A monotone increasing sequence {an} ⊆ R is convergent iff it is bounded
above. In this case, the limit is the sup of the set {a1, a2, . . . }.
Proof. (⇒) If it’s convergent, then it’s Cauchy. If it’s Cauchy, then it’s bounded.
(⇐) Since R has the l-u-b property and the set {a1, a2, . . . } is bounded above, let α := sup{a1, a2, . . . }. Then α is an upper bound, so ak ≤ α, ∀k. Also, α − 1/m is not an upper bound for the
sequence, for any m ∈ N, i.e., there is some N for which α − 1/m < aN. Since the sequence
is monotone, this will also be true for every term thereafter:
n ≥ N =⇒ α − 1/m < an ≤ α.
By the Squeeze Thm, lim_m (α − 1/m) = α implies that lim an = α.
Example 3.1.2. We use two results from discrete math.
Binomial formula:
(1 + x)^k = 1 + kx + · · · + (k choose i) x^i + · · · + x^k.
Geometric sum (finite):
1 + r + r² + · · · + r^n = (1 − r^(n+1))/(1 − r).
When r = 1/2, this gives 1 + 1/2 + 1/4 + · · · + 1/2^n = 2(1 − 1/2^(n+1)) < 2.
Theorem 3.1.8. The sequence an = (1 + 1/2^n)^(2^n) has a limit. (The limit is e.)
Proof. By Theorem 3.1.7, it suffices to show {an} is bounded and increasing. Since n → ∞, it suffices to
consider n ≥ 2.
an is increasing: we need
(1 + 1/2^n)^(2^n) < (1 + 1/2^(n+1))^(2^(n+1)).
Now
b ≠ 0 =⇒ b² > 0 =⇒ (1 + b)² > 1 + 2b (WHY?)
=⇒ ((1 + b)²)^(2^n) > (1 + 2b)^(2^n)
=⇒ (1 + 1/2^(n+1))^(2^(n+1)) > (1 + 1/2^n)^(2^n), taking b = 1/2^(n+1).
an is bounded above. First, note that
k(k − 1) · · · (k − i + 1) ≤ k^i, and 1/i! = (1/i)(1/(i−1)) · · · (1/2) ≤ (1/2)^(i−1).
Then
(1 + 1/k)^k = 1 + k(1/k) + · · · + [k(k − 1) · · · (k − i + 1)/i!](1/k)^i + · · · + (k!/k!)(1/k)^k
≤ 1 + k(1/k) + · · · + (k^i/i!)(1/k)^i + · · · + (k^k/k!)(1/k)^k
≤ 1 + 1 + 1/2 + · · · + 1/2^(i−1) + · · · + 1/2^(k−1) < 1 + 2 = 3.
So k = 2^n shows 3 is an upper bound for an.
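A quick numerical sanity check of both claims (floating point and finitely many terms only, so this illustrates rather than proves; the limit e is approached from below):

```python
import math

def a(n):
    """a_n = (1 + 1/2^n)^(2^n), the sequence from Theorem 3.1.8."""
    return (1 + 1 / 2 ** n) ** (2 ** n)

vals = [a(n) for n in range(1, 20)]
assert all(x < y for x, y in zip(vals, vals[1:]))   # increasing
assert all(v < 3 for v in vals)                     # bounded above by 3
assert abs(vals[-1] - math.e) < 1e-4                # creeping up on e
```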
3.1.1 Limit Points
Definition 3.1.9. x is a limit point or a cluster point of the set A iff every open interval
around x contains an infinite number of points of A.
This is equivalent to:
Definition 3.1.10. x is a limit point of the sequence {xj} iff for every n ∈ N,
there are an infinite number of terms xj satisfying |x − xj | < 1/n.
Example 3.1.3. Consider {(−1)^n(1 + 1/n)}. This has limit points 1 and −1. However,
neither of these is the sup or inf.
Contrast:
Definition 3.1.11. x is an isolated point of A if x ∈ A and there is some open set U with
x ∈ U and U ∩ A = {x}. So a limit point of A is one which is not isolated.
Definition 3.1.12. If {xn} is a sequence, then a subsequence is a new sequence obtained
from the original by deleting some (possibly infinitely many) terms, but keeping the order
intact. The subsequence is denoted {xnk}.
Example 3.1.4. The sequence {(−1)^n(1 + 1/n)} has the monotone decreasing subsequence
{1 + 1/(2k)}, obtained by taking every second term. Infinitely many deletions.
xn = −2, 3/2, −4/3, 5/4, −6/5, 7/6, . . .
n = 1, 2, 3, 4, 5, 6, . . .
n1 = 2, n2 = 4, n3 = 6, . . .
xn1 = x2 = 3/2, xn2 = x4 = 5/4, xn3 = x6 = 7/6.
Note: nk ≥ k.
Example 3.1.5. The sequence {1, 1/2, 1/3, 1/4, . . . } has subsequence {1/2, 1/3, 1/4, . . . } obtained by
deleting the first term. A single deletion.
Theorem 3.1.13. Let {xn} be a sequence in R.
(i) xn → x iff every neighbourhood of x of the form (x− ε, x + ε), ε > 0 contains all but
finitely many points xn.
(ii) x ∈ R is a limit point of the set A iff there is a sequence {xn} ⊆ A with xn → x and
xn ≠ x.
(iii) xn → x iff xnk → x, for every subsequence {xnk}.
(i). (⇒) Suppose xn → x and fix any ε > 0. Corresponding to this ε, there is an N ∈ N such that
n ≥ N =⇒ |xn − x| < ε,
by defn of convergence. Thus, all points save {x1, . . . , xN−1} must lie in the interval.
(⇐) Suppose that for any ε > 0, (x− ε, x+ ε) contains all but finitely many of the xn.
Fix ε, and let
B := {xn : |xn − x| < ε}.
Then by assumption, there is an N ∈ N such that
n ≥ N =⇒ xn ∈ B.
Then |xn − x| < ε whenever n ≥ N , and we have xn → x.
(ii). (⇒) Fix ε > 0. For each n ∈ N, there is a point in A ∩ (x − 1/n, x + 1/n) which is not
x. Call it xn, so |xn − x| < 1/n. Since 1/n decreases, once N is large enough that 1/N < ε,
n ≥ N =⇒ |xn − x| < 1/n ≤ 1/N < ε.
This shows xn → x.
(⇐) Immediate from the hypothesis and definition of limit point.
(iii). Homework.
Corollary 3.1.14 (to part (ii), above). x ∈ R is a limit point of the sequence {xn} ⊆ R iff there is a subsequence {xnk} with xnk → x.
Proof. Let A = {xn}. Note: the requirement xn ≠ x is dropped because a sequence allows for
repetition.
A sequence {xn} with limit point x is like a combination of a sequence which converges
to x with a bunch of “noise”. A sequence may have many limit points.
Example 3.1.6. Consider
1 + 1, 1 + 1/2, 2 + 1/3, 1 + 1/4, 2 + 1/5, 3 + 1/6, 1 + 1/7, 2 + 1/8, 3 + 1/9, 4 + 1/10, . . .
This sequence has limit points N.
Often, the biggest or smallest limit point is useful.
Definition 3.1.15. For any sequence {xn}, the limit superior is defined by
limsup xn := lim_(n→∞) sup_(j>n) xj ,
and the limit inferior is defined by
liminf xn := lim_(n→∞) inf_(j>n) xj .
Example 3.1.7. Consider the sequence {xn} = {(−1)^n(1 + 1/n)}. We've seen that sup xn = 3/2 and inf xn = −2, but these don't describe the limiting behavior of the sequence.
This sequence has limit points 1 and −1, so limsup xn = 1 and liminf xn = −1.
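The tail-sup and tail-inf in the definition are easy to inspect on a truncation; a hedged numerical sketch (a finite tail only approximates the limits over infinite tails):

```python
xs = [(-1) ** n * (1 + 1 / n) for n in range(1, 10001)]

def tail_sup(xs, n):
    """sup of {x_j : j > n}, restricted to the finite truncation."""
    return max(xs[n:])

def tail_inf(xs, n):
    return min(xs[n:])

# tail sups decrease toward limsup = 1; tail infs increase toward liminf = -1
assert abs(tail_sup(xs, 1000) - 1) < 1e-2
assert abs(tail_inf(xs, 1000) + 1) < 1e-2
```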
Theorem 3.1.16. (i) limsup xj is a limit point of {xj}.
(ii) limsup xj is the supremum of the set of limit points of {xj}.
Proof of (i). Let y = limsup xj, and start with the case y < ∞. Given n ∈ N, we can find
K such that
k ≥ K =⇒ |y − sup_(j>k) xj | < 1/n.
Since y is finite, this shows sup_(j>k) xj is also finite, hence there is some xℓ satisfying
ℓ > k, and |xℓ − sup_(j>k) xj | < 1/n.
Together, |xℓ − y| < 2/n. By picking larger indices (say k1 > k) we can find another
point (say xℓ1, ℓ1 > ℓ) satisfying the same criteria. Hence there are infinitely many, and
y is a limit point of the sequence.
If y = ∞, then {sup_(j>k) xj} is unbounded above. Thus {xj} is unbounded above, and
∞ is a limit point. If y = −∞, then for any n ∈ N, there is K such that
k ≥ K =⇒ sup_(j>k) xj < −n.
Since this shows there are infinitely many xj with xj ≤ −n, −∞ is a limit point.
Proof of (ii). By (i) and Thm. 3.1.3 (a contained upper bound is a max), it suffices to show that y = limsup xj is an upper bound for the set of limit points. Let x be a limit point, so that x_(jk) → x for some
subsequence {x_(jk)}. Since j_(k+1) > k, we have
yk = sup_(j>k) xj = sup{xj : j > k} ≥ x_(j_(k+1)) for each k, and letting k → ∞ gives y ≥ x.
Theorem 3.1.17. {xn} converges iff liminf xn = limsup xn. (In this case, the common
value is limxn.)
Proof. (⇒) Fix ε > 0 and suppose xn → x converges. Then for |x| < ∞, we can find N
such that
n ≥ N =⇒ |xn − x| < ε.
This implies | supn>N xn − x| ≤ ε, so limsup xn = x. Similarly for liminf xn.
(⇐) Since
k > N =⇒ inf_(n>N) xn ≤ xk ≤ sup_(n>N) xn,
we apply the Squeeze Thm to the hypotheses.
This theorem extends to the case xn → ±∞. For x = ∞, lim xn = limsup xn by
the prev thm; also, the condition n ≥ N =⇒ xn > K implies that inf_(k>n) xk ≥ K for n ≥ N, so liminf xn = ∞ as well.
Similarly for x = −∞.
§3.1.3 Exercise: #3,4,9 Recommended: #2,5,12
1. If x is an upper bound of A and x ∈ A, then x = sup A.
2. Prove that the two definitions of limit point are equivalent.
3. Prove that the following is also an equivalent definition of limit point: given any
n, m ∈ N, there is a j ≥ m for which |x − xj | < 1/n.
3.2 Open sets and closed sets
3.2.1 Open sets
QUESTION: Why are open sets handy? ANSWER: They have wiggle room.
Definition 3.2.1. limxn = L iff ∀ε > 0, ∃N, n ≥ N =⇒ |xn − L| < ε.
This is equivalent in R to a more general definition:
∀U (open interval containing L), ∃N such that n ≥ N =⇒ xn ∈ U.
REASON: L ∈ (a, b) =⇒ |L − a|, |L − b| > 0. Take ε to be the smaller of the two.
Then |xn − L| < ε =⇒ xn ∈ (a, b). Other direction is similar.
Definition 3.2.2. A set U is open iff every point of U lies in an open interval which is
contained in U , i.e., if
x ∈ U =⇒ ∃a, b such that x ∈ (a, b) ⊆ U.
This means open sets are automatically “big”: they contain uncountably many points, and
contain EVERY point between the inf and sup of any subinterval. A nonempty open set
automatically has positive length.
Interpretation of defn: Roughly, no point of an open set lies on the boundary.
Definition 3.2.3. x is an interior point of A iff there is an open set U with x ∈ U ⊆ A.
So an alternative definition of open set is: A is open iff every point of A is an interior
point.
Example 3.2.1. (0, 1) is open because it does not contain its boundary points 0, 1. If we
add 0, then any tiny interval (−ε, ε) about 0 will always contain negative numbers, so
(−ε, ε) ⊄ [0, 1). So [0, 1) is not open.
NOTE: two open intervals overlap by a positive amount, or else not at all. If they overlap,
then the union is a single interval. If disjoint, then the union is two disjoint intervals.
Consider (a, b) and (c, d), where a ≤ c. Two cases: b > c (overlap) or b ≤ c (disjoint).
Theorem 3.2.4. A set U ⊆ R is open iff it is a finite or countable union of disjoint open
intervals, i.e.,
U = ⋃_(n=1)^N (an, bn), where N may be ∞.
Proof. Define an open subinterval I ⊆ U to be maximal iff
I ⊆ (a, b) ⊆ U =⇒ I = (a, b).
Then let A be the collection of maximal intervals. It is clear that ⋃_(A∈A) A ⊆ U. For the
reverse inclusion, given x ∈ U, consider the collection of open subintervals of U that contain x. The union
of all of these is again an open subinterval of U which contains x, and it is maximal. Thus
U ⊆ ⋃_(A∈A) A.
Maximal intervals are disjoint (proof by contradiction: if two distinct maximal intervals met, their union would be a strictly larger open subinterval).
Pick a rational number from each interval in A (by density of Q in R). (See HW.)
Since maximal intervals are disjoint, these numbers are all distinct. So the cardinality of
A cannot exceed the cardinality of Q.
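For a finite union of open intervals, the maximal intervals of the proof can be computed directly; a small sketch (our own helper, assuming input as (a, b) endpoint pairs):

```python
def maximal_intervals(intervals):
    """Merge a finite union of open intervals into its disjoint maximal intervals."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a < merged[-1][1]:           # overlaps the current maximal interval
            merged[-1][1] = max(merged[-1][1], b)  # absorb it
        else:
            merged.append([a, b])                  # start a new maximal interval
    return [tuple(i) for i in merged]

assert maximal_intervals([(0, 2), (1, 3), (5, 6)]) == [(0, 3), (5, 6)]
# open intervals that merely touch, like (0,1) and (1,2), stay disjoint:
assert maximal_intervals([(0, 1), (1, 2)]) == [(0, 1), (1, 2)]
```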
Theorem 3.2.5. Let A1, A2, . . . be open sets. Then
(i) Any union, like G = ⋃ An, is open.
(ii) Any finite intersection, like F = ⋂_(i=1)^k A_(n_i), is open.
Proof of (i). Pick x ∈ G. Then ∃n, x ∈ An. Since An is open, can find an open interval
U with x ∈ U ⊆ An. Then x ∈ U ⊆ G.
Proof of (ii). This is trivial if the intersection is empty, so assume it isn't. Then we can pick
x ∈ F = ⋂_(i=1)^k Ai, so ∀i = 1, . . . , k, x ∈ Ai. For each Ai, we have x ∈ (bi, ci) ⊆ Ai. Define
b = max{bi} and c = min{ci}. KEY POINT: b < x < c, since these are finite sets! Then
x ∈ (b, c) ⊆ Ai, ∀i = 1, . . . , k =⇒ x ∈ (b, c) ⊆ ⋂ Ai.
Example 3.2.2. Let An := (−1/n, 1 + 1/n). Then
⋂_(n=1)^∞ An = [0, 1],
which is not open: there is no interval about 0 or 1 which is contained in the set.
Definition 3.2.6. A neighbourhood of x is an open set U with x ∈ U . Typically, we use
neighbourhoods of the form (x− ε, x + ε), but it need not be an interval.
Definition 3.2.7. The interior of A ⊆ R is
int A := {x ∈ A : x ∈ (a, b) ⊆ A, for some a, b}.
x is an interior point of A iff x ∈ int A.
It is obvious that every set contains its interior points. For a set to be open, it means
that every point is an interior point.
Theorem 3.2.8. x = lim xn iff every neighbourhood of x contains all but finitely many
points of {xn}.
Proof. Exercise: this is basically the same as the first theorem in this section.
Definition 3.2.9. x is a limit point of A iff every neighbourhood U of x contains a point
of A, other than x itself, i.e.,
A′ := {x : (U ∩ A) \ {x} ≠ ∅ for every open nbd U of x}.
Write A′ for the set of limit points of A.
If a sequence has only a finite number of repetitions, this coincides with the defn of
limit point of a sequence:
{5, 5, 5, 5, 5, . . . }
has 5 as a limit point of the sequence, but not of the set. (This is because of the requirement
(U ∩ A) \ {x} ≠ ∅.) See HW §3.2.3 #2.
Example 3.2.3. Every point of [0, 1] is a limit point.
2 is not a limit point of [0, 1] ∪ {2}.
√2 is a limit point of Q.
Definition 3.2.10. A set C is closed iff C contains all its limit points, i.e.,
x ∈ C ′ =⇒ x ∈ C.
Theorem 3.2.11. A set is open iff its complement is closed.
Proof. (⇒) Suppose A is open. Let x be a limit point of Ac. Then every neighbourhood
U of x contains a point of Ac, and so is not contained in A; i.e., x is not an interior point
of A. Since A is open, this means x /∈ A, i.e., x ∈ Ac. Thus Ac contains all its limit points.
(⇐) Suppose Ac is closed. Pick some x ∈ A, so that x /∈ Ac and hence x is not a
limit point of Ac. Then there is a neighbourhood U of x which does not intersect Ac, i.e.,
U ⊆ A.
Example 3.2.4. 0 is a limit point of (0, 1], but isn't an element of (0, 1]. So (0, 1] is not
closed. Is it open?
(0, 1]c = (−∞, 0] ∪ (1, ∞).
1 is a limit point of this set which is not contained in it, so (0, 1]c is not closed =⇒ (0, 1]
is not open.
Alt: 1 has no open nbd contained in (0, 1]; no wiggle room to the right.
NOTE: a closed set and a disjoint compact set are separated by some positive distance, i.e., min{|x − y| : x ∈ A, y ∈ B} > 0 whenever A is closed, B is compact, and A ∩ B = ∅. (For two unbounded closed sets this can fail: N and {n + 1/n : n ≥ 2} are disjoint and closed, but the distance between them is 0.)
Theorem 3.2.12. If C1, C2, . . . are closed sets, then
(i) Any intersection, like ⋂ Cn, is closed.
(ii) Any finite union, like ⋃_(i=1)^k C_(n_i), is closed.
Proof of (i). ⋂ Cn = ⋂ (Cn^c)^c = (⋃ Cn^c)^c. Each Cn^c is open =⇒ ⋃ Cn^c is open =⇒ ⋂ Cn is closed.
Definition 3.2.13. The closure of a set is the union of the set and all its limit points:
Ā := A ∪ A′.
Theorem 3.2.14. The closure of a set A is the intersection of all closed sets containing A:
Ā = ⋂_(A⊆C, C closed) C.
Proof. (⊆) If x ∈ A ∪ A′, then x ∈ A or x ∈ A′. If x ∈ A, then x ∈ C trivially, for any
closed C ⊇ A, and we're done. So assume x ∈ A′. Let C be any closed set containing A. Then for
every open neighbourhood U of x, there is y ∈ U ∩ A ⊆ U ∩ C, y ≠ x. Thus, x is a limit
point of C, and hence contained in C. Since C was an arbitrary closed set containing A,
x lies in every closed set containing A, and hence in their intersection.
(⊇) We show x /∈ A ∪ A′ =⇒ x /∈ ⋂_(A⊆C) C. So assume x ∈ (A ∪ A′)^c = A^c ∩ (A′)^c, so
that x /∈ A and x /∈ A′. Then we can find an open neighbourhood U of x which is disjoint
from A, i.e., U ⊆ A^c. Then A ⊆ U^c, and U^c is a closed set by the prev thm. Since x ∈ U,
we have x /∈ U^c, so x ∈ ⋃_(A⊆C) C^c = (⋂_(A⊆C) C)^c, and we're done.
Corollary 3.2.15. The closure of A is the smallest closed set containing A.
Definition 3.2.16. B is a dense subset of A iff A ⊆ B̄ (the closure of B).
Example 3.2.5 (The Cantor Set). Define a nested sequence of sets Ck+1 ⊆ Ck by
C0 = [0, 1]
C1 = [0, 1/3] ∪ [2/3, 1]
C2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1]
...
The Cantor set is C := ⋂_(n=0)^∞ Cn.
Alternative definition: define f1(x) = x/3 and f2(x) = x/3 + 2/3. Then C is the unique
nonempty closed and bounded set for which f1(C) ∪ f2(C) = C.
Theorem 3.2.17. 1. The Cantor set is closed.
2. Every point of the Cantor set is a limit point.
3. The Cantor set is totally disconnected, i.e., it contains no open interval.
4. The Cantor set contains uncountably many points.
5. The Cantor set has measure zero (length 0), as seen by 1 − ∑_(j=1)^∞ 2^(j−1)(1/3)^j = 1 − 1 = 0.
6. For x ∈ [0, 1], define the ternary expansion of x by
x = d1/3 + d2/9 + d3/27 + · · · + dn/3^n + · · · = ∑_(k=1)^∞ dk/3^k,
where dk ∈ {0, 1, 2}. Then the Cantor set consists of exactly those x ∈ [0, 1] which
have a ternary expansion with dk ∈ {0, 2}, ∀k.
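Item 6 gives a computable membership test for rationals; a hedged sketch with exact Fraction arithmetic (it inspects only the first N digits of the greedy expansion, so endpoints like 1/3 = 0.0222..._3 = 0.1_3 need their alternate expansion and are not handled):

```python
from fractions import Fraction

def ternary_digits(x, n):
    """First n digits d_k of the greedy base-3 expansion of x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= 3
        d = int(x)      # floor, since x >= 0
        digits.append(d)
        x -= d
    return digits

# 1/4 = 0.020202..._3 lies in the Cantor set; 1/2 = 0.111..._3 does not.
assert all(d != 1 for d in ternary_digits(Fraction(1, 4), 30))
assert any(d == 1 for d in ternary_digits(Fraction(1, 2), 30))
```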
§3.2.3 Exercise: #1,4,7,8,13 Recommended: #2,5,14 (#7, 13 are short-answer.)
1. Suppose U is open, C is closed, and K is compact.
(a) Is U \ C open? Is C \ U closed?
(b) Is U \K open? Is C \K compact?
(c) If V is open, can U \ V be open?
(d) If J is compact, can K \ J be compact?
2. Prove the theorems about the Cantor set, using whichever of the definitions seems
best.
3.3 Compact sets
Definition 3.3.1. A set K ⊆ R is compact iff every sequence {xn} ⊆ K has a cluster
point x ∈ K.
This is an abstract version of “small” just like open was an abstract version of “big”.
We will characterize compactness in R:
Theorem 3.3.2. A set K ⊆ R is compact iff it is closed and bounded.
but first, need two theorems.
Definition 3.3.3. Let A1, A2, . . . be a sequence of sets in R. This sequence is nested iff
A1 ⊇ A2 ⊇ . . . . If this is a sequence of intervals An = [an, bn] = {x : an ≤ x ≤ bn}, then
this means
an ≤ an+1 ≤ bn+1 ≤ bn, ∀n.
Note: ⋂_(n=1)^∞ An = {x : x ∈ An, ∀n}.
Theorem 3.3.4 (Nested Intervals Thm). Suppose that An = [an, bn] is a nested sequence
of intervals with lim(bn − an) = 0. Then ⋂_(n=1)^∞ An = {L}. Also, an → L and bn → L.
Proof. There are four steps.
(i) an ≤ bm for any n, m. Suppose instead that an > bm. Then
n > m =⇒ bn ≤ bm < an, contradicting an ≤ bn;
n < m =⇒ bm < an ≤ am, contradicting am ≤ bm
(and n = m contradicts an ≤ bn immediately).
(ii) {an} is increasing by nestedness, and bounded above by (i), so it converges by completeness.
Thus, let L = lim an.
(iii) ∀n, an ≤ L ≤ bn. Part (ii) shows an ≤ L
(EXERCISE: an ↗ L =⇒ an ≤ L),
and an ≤ bm =⇒ L ≤ bm by the Limit Location Thm.
(iv) L is the only number common to all intervals, and bn → L.
Add the two convergent sequences {bn − an} and {an} to get
lim bn = lim((bn − an) + an) = lim(bn − an) + lim an = 0 + L = L.
If x ∈ ⋂ An, then an ≤ x ≤ bn for all n, so the Squeeze Thm gives x = L.
Theorem 3.3.5 (Bolzano-Weierstrass). A bounded sequence in R has a convergent subsequence.
Proof. Suppose {xn} is bounded, so that
a0 ≤ xn ≤ b0, ∀n.
By a previous theorem, it suffices to find a cluster point of the sequence.
Apply the bisection method (aka divide-and-conquer): let c be the midpoint of [a0, b0].
Then at least one of [a0, c] or [c, b0] contains infinitely many points xn; call it [a1, b1].
(Choose the first one, if both have infinitely many.) Continuing, we get a nested sequence
[a0, b0] ⊇ [a1, b1] ⊇ . . . ⊇ [am, bm] ⊇ . . .
Since
|bn − an| = |b0 − a0|/2^n → 0,
the Nested Intervals Thm gives
∃!L ∈ ⋂ [an, bn].
Claim: L is a cluster point of {xn}. Given ε > 0, choose n large enough that |bn − an| < ε.
Then [an, bn] ⊆ (L − ε, L + ε) and contains infinitely many of the xn.
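A finite caricature of the bisection in this proof (a hedged sketch: with finitely many sample points, "contains infinitely many" is replaced by "contains the most", so this only suggests the mechanism):

```python
def bisect_cluster(xs, a, b, steps=40):
    """Repeatedly keep the half-interval holding the most sample points."""
    for _ in range(steps):
        c = (a + b) / 2
        left = sum(1 for x in xs if a <= x <= c)
        right = sum(1 for x in xs if c <= x <= b)
        a, b = (a, c) if left >= right else (c, b)
    return (a + b) / 2

xs = [(-1) ** n * (1 + 1 / n) for n in range(1, 10001)]
L = bisect_cluster(xs, -2.0, 2.0)
assert min(abs(L - 1), abs(L + 1)) < 1e-2   # homes in on a cluster point (here ±1)
```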
Theorem 3.3.6. A set K ⊆ R is compact iff it is closed and bounded.
Proof. (⇒) Use contrapositive: K not (closed and bounded) implies K not compact. So
assume K is either not closed or not bounded; we show that K contains a sequence with
no limit point in K (just need one).
case (1) K is not closed. Then there is some limit point x of K with x /∈ K. Since
x is a limit point, we can always find an ∈ (x − 1/n, x + 1/n) satisfying an ∈ K and
an ≠ x. By construction, the only possible limit point of {an} is x, but x /∈ K.
case (2) K is not bounded. Then for each n ∈ N, there is an ∈ K with |an| > n.
Then {an} can have no limit point in R, and certainly not in K.
(⇐) Suppose K is closed and bounded and let {xn} be any sequence in K. Then the
Bolzano-Weierstrass Theorem gives a convergent subsequence {xnk} with xnk → x. Either
x is a term of the sequence (hence in K), or x is a limit point of K; since K is closed,
x ∈ K in either case. So x ∈ K is a cluster point of {xn}.
Corollary 3.3.7. Any infinite subset of a compact set has a cluster point.
Proof. This comes from the two previous theorems.
Definition 3.3.8. An open cover of A ⊆ R is a collection of open sets {Ui} with A ⊆ ⋃Ui.
Example 3.3.1. (0, 1) ⊆ (0, 1) ⊆ (−1, 2).
(0, 1) ⊆ (0, 2/3) ∪ (1/3, 1).
(0, 1) ⊆ ⋃n (1/n, 1 − 1/n).
R ⊆ ⋃n (−n, n).
U ⊆ ⋃_(x∈U) (x − ε, x + ε).
Theorem 3.3.9. Let K ⊆ R be compact, and let B be a closed subset of K. Then B is
compact.
Proof. K is closed and bounded, by thm, so B ⊆ K must also be bounded. Since B is
closed by hypothesis, B is compact.
Theorem 3.3.10 (Heine-Borel Thm). A set is compact iff every open cover has a finite
subcover.
Proof. (⇒) Suppose A is an open cover of a compact set K. First, reduce A to a countable
subcover B. For each open interval I with rational endpoints: if there is an open set of
A which contains I, add one such set to B; if not, skip I. Now any point of K is contained
in an open interval with rational endpoints inside its covering set, and hence contained in one of the sets from
A that was added to B. So B ⊆ A is a subcover of K which is at most countable (there are only countably many such I), so
write B = {B1, B2, . . . }.
If B is finite, we are done, so suppose not. Then for each n ∈ N, we can choose a
point xn ∈ K which is not contained in ⋃_(k=1)^n Bk. Then we have a subsequence xnk →
x ∈ K, by compactness. Since x ∈ K ⊆ ⋃ Bk, we have x ∈ BN for some N. But
{xN, xN+1, xN+2, . . . } ⊆ BN^c by construction, and BN is open, so the subsequence cannot converge to x. Contradiction.
(⇐) Note that ⋃_(n=1)^∞ (−n, n) is an open cover of R, hence also of K. Since we are
assuming the subcover property, we have K ⊆ (−n, n) for some n ≫ 1. So K is bounded.
To see K is closed, we show Kc is open.
Pick x ∈ Kc. We must produce an open set U such that x ∈ U ⊆ Kc, i.e., an open
neighbourhood U of x which is disjoint from K.
For each point y ∈ K, separate x from y by open sets: choose disjoint open neighbourhoods
Uy of x and Vy of y (Uy ∩ Vy = ∅). Since U := ⋂_y Uy is disjoint from
K, we would like to use this as a neighbourhood of x:
x ∈ U ⊆ Kc =⇒ x is an interior point of Kc.
Problem: an arbitrary intersection need not be open.
Meanwhile, {Vy} is an open cover of K. By the compactness of K, there must be
some finite open subcover of K, which we can denote by {Vi}_(i=1)^n. Then by looking at the
neighbourhoods of x which correspond to these sets Vi, we have Ui ∩ Vi = ∅, ∀i = 1, . . . , n.
Since (⋂_(j=1)^n Uj) ⊆ Ui, ∀i = 1, . . . , n, we have
(⋂ Ui) ∩ Vi = ∅, ∀i =⇒ (⋂ Ui) ∩ ⋃ Vi = ∅.
Thus
K ⊆ ⋃ Vi =⇒ (⋂ Ui) is disjoint from K.
Moreover, since ⋂_(i=1)^n Ui is a finite intersection of open sets, it is also open. Therefore,
we can take U = ⋂_(i=1)^n Ui as our open neighbourhood of x which is disjoint from K.
Theorem 3.3.11. A nested sequence A1 ⊇ A2 ⊇ . . . of nonempty compact sets has a
nonempty intersection.
Proof. For each n, choose a point xn ∈ An. Then {xn} ⊆ A1 by nesting, so it has a limit point
x ∈ A1 (since A1 is compact). But x is also a limit point of {xn, xn+1, . . . } ⊆ An, so
x ∈ An for each n (An is compact, hence closed). Thus x ∈ ⋂ An.
Theorem 3.3.12. Let K be compact, and let B be a closed subset of K. Then B is
compact. (Same as earlier, but now don’t require K ⊆ R.)
Proof. Let {Uα}α∈A be an open cover of B. We need to find a finite subcover. Since B is
closed, we know that its complement Bc is open. Then
{Uα}α∈A ∪ {Bc}
is an open cover of the whole set K, and hence has a finite open subcover
{Ui}_(i=1)^n ∪ {Bc}.
Since Bc doesn't cover any part of B, we can throw it out and still have that
{Ui}_(i=1)^n
is an open cover of B. This is a finite subcover for B, i.e., B is compact.
Example 3.3.2. Define An = (0, 1/n). Then ⋂ An = ∅. (The An are nested and nonempty but not compact, so the previous nested-intersection theorem does not apply.)
3.3.1 Key properties of compactness
K ⊆ R is compact iff
1. Every sequence {xn} ⊆ K has a limit point x ∈ K.
2. K is closed and bounded.
3. Every open cover of K has a finite subcover.
Later, after we’ve seen continuity, we’ll also want
Theorem 3.3.13. Let f : X → R be continuous, and let K ⊆ X be compact. Then there exist points
m, M ∈ K such that f(m) ≤ f(x) ≤ f(M), ∀x ∈ K.
Example 3.3.3. f : (0, 1) → R by f(x) = x² or f(x) = 1/x. The domain (0, 1) is not compact: x² attains neither a maximum nor a minimum on it, and 1/x is not even bounded above.
Theorem 3.3.14. Let f : K → Y be continuous, where K is compact. Then f(K) is
compact in Y .
Theorem 3.3.15. On a compact set, any continuous function is automatically uniformly
continuous.
§3.3.3 Exercise: #4,8 Recommended: #3,6,10
#3 is short-answer; a rigorous proof is not required.
1. If F is closed and K is compact, then F ∩K is compact.
2. Suppose K = {Kα} is a collection of compact sets. If K has the property that
the intersection of every finite subcollection is nonempty, then prove that ⋂ Kα is
nonempty. (Try contradiction or DeMorgan's.)
3. Suppose that every point of the nonempty closed set A is a limit point of A. Show
that A is uncountable. (Try contradiction, and use the previous problem.)
Chapter 4
Continuous functions
4.1 Concepts of continuity
4.1.1 Definitions
Definition 4.1.1. A function from a set D to a set R is a subset f ⊆ D × R for which
each element d ∈ D appears in exactly one pair (d, ·) ∈ f. Write f : D → R. If (x, y) ∈ f,
then we usually write f(x) = y.
D is the domain of the function: the set of elements x for which the function is defined.
R is the range: a set which contains all the points f(x). Generally, we assume the range is the set of real numbers.
Definition 4.1.2. A function is a rule of assignment x 7→ f(x), where for each x in the
domain, f(x) is a unique and well-defined element of the range.
f(x) = y means “f maps x ∈ D to f(x) ∈ R”.
Definition 4.1.3. The image of f is the subset
Im f := {y ∈ R : ∃x ∈ D, f(x) = y} ⊆ R.
The function f is surjective or onto iff Im f = R, that is,
∀y ∈ R, ∃x ∈ D, f(x) = y.
Definition 4.1.4. For f : D → R, the preimage of B ⊆ R is the subset
f−1(B) := {x : f(x) ∈ B} ⊆ D.
Example 4.1.1. The preimage of [0, 1] under f(x) = x2 is [−1, 1].
The preimage of [−1, 1] under f(x) = sin x is R.
The preimage of {1} under f(x) = sin x is {π/2 + 2kπ}k∈Z.
The preimage of [0, 1] under log x is [1, e].
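Restricted to a finite domain, the preimage is a one-line filter; a small sketch (a finite stand-in for D, not the true preimage over R):

```python
def preimage(f, B, domain):
    """The set {x in domain : f(x) in B}, i.e. the preimage of B under f."""
    return {x for x in domain if f(x) in B}

assert preimage(lambda x: x * x, {0, 1, 4}, range(-3, 4)) == {-2, -1, 0, 1, 2}
assert preimage(lambda x: x + 1, {10}, range(5)) == set()   # preimages can be empty
```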
Definition 4.1.5. A function f is injective or one-to-one iff no two distinct points in D
get mapped onto the same point in R, i.e.
f(x) = f(y) =⇒ x = y.
Example 4.1.2. f(x) = x² is injective on (0, ∞) but not on R.
f(x) = 1/x is injective on R \ {0}.
A function is usually described by a formula or by its graph. If the graph is a connected
curve, we want to call the function continuous. To formalize:
x ≈ y =⇒ f(x) ≈ f(y).
Definition 4.1.6. f is continuous at x ∈ D iff
∀ε > 0,∃δ, |x− t| < δ =⇒ |f(x)− f(t)| < ε.
Write limt→x f(t) = f(x).
IDEA: |x− t| < δ means “t → x”, and |f(x)− f(t)| < ε means “f(t) → f(x)”, i.e.
t → x =⇒ f(t) → f(x).
Example 4.1.3. Define the Heaviside function
H(x) = 0 for x < 0, and H(x) = 1 for x ≥ 0.
H is not continuous at 0: pick ε = 1/2. If t < 0, then no matter how small |0 − t| is, we still
have
|H(0) − H(t)| = |1 − 0| = 1 > 1/2 = ε,
so H(t) ↛ H(0).
MORAL: the defn of continuity prevents a function from changing too rapidly; f(x) cannot “jump”.
Definition 4.1.7. f is a continuous function iff it is continuous at each x in its domain,
i.e.,
∀ε > 0, ∀x, ∃δ, |x− t| < δ =⇒ |f(x)− f(t)| < ε.
Strengthen this idea by disallowing a function from growing faster than a “globally
controlled” rate.
Definition 4.1.8. f is uniformly continuous on D iff
∀ε > 0, ∃δ,∀x, |x− t| < δ =⇒ |f(x)− f(t)| < ε.
This δ depends only on ε, not on x; thus, it works for ALL x simultaneously. NOTE:
uniformly continuous is a global property; it makes no sense to ask if f is uniformly
continuous at x0.
Example 4.1.4. f(x) = x² is not uniformly continuous on R.
¬(∀ε > 0, ∃δ, ∀x, (|x − t| < δ =⇒ |f(x) − f(t)| < ε))
∃ε > 0, ∀δ, ∃x, ¬(|x − t| < δ =⇒ |f(x) − f(t)| < ε)
∃ε > 0, ∀δ, ∃x, (|x − t| < δ and |f(x) − f(t)| ≥ ε), (for some t).
Let δ > 0 be fixed. Then for points t of the form x + δ/2, we have |x − t| = δ/2 < δ. However,
|f(x) − f(t)| = |x² − (x + δ/2)²| = xδ + δ²/4 > xδ ≥ ε for x ≫ 1.
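Numerically, a fixed δ produces ever-larger jumps as x grows, which is exactly the failure above; a small sketch with arbitrarily chosen δ and ε:

```python
def f(x):
    return x * x

delta, eps = 0.1, 1.0
# |x - t| = delta/2 < delta for t = x + delta/2, yet the jump grows with x:
gaps = [abs(f(x) - f(x + delta / 2)) for x in (1, 10, 100, 1000)]
assert gaps == sorted(gaps)   # the jump increases with x
assert gaps[-1] >= eps        # so this delta fails for eps = 1
```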
4.1.2 Limits of functions and limits of sequences
Continuous functions are useful because they preserve limits:
lim_(t→x) f(t) = f(lim_(t→x) t) = f(x),
i.e., f is a continuous function iff it takes convergent sequences to convergent sequences.
Theorem 4.1.9. lim_(t→x) f(t) = L iff lim_(n→∞) f(xn) = L for every {xn} with xn → x.
Proof. (⇒) Choose a sequence xn → x, and fix ε > 0. Since lim_(t→x) f(t) = L, there is some δ > 0 for
which
|x − t| < δ =⇒ |f(t) − L| < ε.
Also, there is N such that
n ≥ N =⇒ |xn − x| < δ.
Thus, n ≥ N =⇒ |f(xn)− L| < ε.
(⇐) Contrapositive: suppose it is false that limt→x f(t) = L. Then:
∃ε > 0,∀δ > 0,∃t, |x− t| < δ and |f(t)− L| ≥ ε
∃ε > 0, ∀n ∈ N, ∃tn, |x − tn| < 1/n and |f(tn) − L| ≥ ε.
This produces tn → x for which it is false that limn→∞ f(tn) = L.
Corollary 4.1.10. If f has a limit at x, this limit is unique.
Proof. Combine prev thm with uniqueness thm for sequences.
4.1.3 Inverse images of open sets
Theorem 4.1.11. Suppose the domain of f is open. f is continuous iff the preimage of
every open set is open.
Proof. (⇒) Let f be continuous and let V ⊆ R be open. Must show: every point of f−1(V )
is an interior point. Pick p ∈ D for which f(p) ∈ V . Since V is open, (f(p)−ε, f(p)+ε) ⊆ V
for some ε > 0. Since f is continuous, there is a δ > 0 such that
|x− p| < δ =⇒ |f(x)− f(p)| < ε.
Thus, x ∈ f−1(V ) as soon as |x− p| < δ, i.e. x ∈ (p− δ, p + δ) ⊆ f−1(V ).
(⇐) Suppose that V open in R implies that f−1(V ) is an open subset of D. Fix any
p ∈ D and ε > 0. Choose a specific V = (f(p) − ε, f(p) + ε). Then f−1(V ) open means
that p ∈ f−1(V ) is an interior point of f−1(V ). In particular, we can find δ > 0 such that
x ∈ f−1(V ) as soon as x ∈ (p− δ, p + δ). But
x ∈ f−1(V ) =⇒ f(x) ∈ V =⇒ |f(x)− f(p)| < ε.
Corollary 4.1.12. f is continuous iff the preimage of every closed set is closed.
Proof. HW.
Connectedness
Definition 4.1.13. A set X is connected iff it CANNOT be written
X = A ∪ B, A ∩ B = ∅, A, B ≠ ∅,
for two open sets A, B.
Theorem 4.1.14. The continuous image of a connected set is connected.
Proof. Let f : X → R be continuous, and let X be connected. Must show f(X) is
connected. Suppose, by way of contradiction, that f(X) = A∪B is a separation of f(X).
Then f−1(A) and f−1(B) are disjoint nonempty open sets whose union is X, contradicting the connectedness of X. Disjointness holds because A ∩ B = ∅ =⇒ f−1(A) ∩ f−1(B) = ∅:
x ∈ f−1(A) and x ∈ f−1(B) =⇒ f(x) ∈ A ∩ B = ∅.
4.1.4 Related definitions
Definition 4.1.15. f is a Lipschitz function (or strongly continuous function) iff
|f(x)− f(y)| ≤ M |x− y|
for some constant M . (The Lipschitz constant.)
Then
|x − y| < ε/M =⇒ |f(x) − f(y)| < ε.
Definition 4.1.16. f is a Hölder function (or satisfies a Hölder condition of order α) iff
|f(x) − f(y)| ≤ M|x − y|^α, 0 < α ≤ 1,
for some constant M. (The Hölder constant.)
NOTE: if α = 0, this would just say that f is bounded (with max variation M).
Then
|x − y| < (ε/M)^{1/α} =⇒ |f(x) − f(y)| < ε.
NOTE: for α ≈ 0, (ε/M)^{1/α} → 0 fast (when ε < M).
Definition 4.1.17. f has a limit from the right at x (and is right-continuous there when L = f(x)) iff
∀ε > 0,∃δ, 0 < t− x < δ =⇒ |f(t)− L| < ε.
Write f(x+) := limt→x+ f(t) = L.
Similarly for f(x−) := limt→x− f(t):
∀ε > 0,∃δ, 0 < x− t < δ =⇒ |f(t)− L| < ε.
4.2 Properties of continuity
Theorem 4.2.1. If f, g are continuous, then so are f + g, f · g, and (if g ≠ 0) f/g.
Proof. Continuous functions preserve sequences, and limits are linear & multiplicative for
sequences.
NOTE: Define (f + g)(x) := f(x) + g(x), etc.
USE this with the thm that lim commutes with continuous functions:
Example 4.2.1. NOTE: all limits are finite, and the denominator ≠ 0!
lim_{t→1} (3t² − √t)/(t² + 1) = lim_{t→1}(3t² − √t) / lim_{t→1}(t² + 1)
= (3 lim_{t→1} t² − lim_{t→1} √t) / (lim_{t→1} t² + lim_{t→1} 1)
= (3 − √(lim_{t→1} t)) / (1 + 1)
= 2/2 = 1.
Theorem 4.2.2. Let x = g(t), c = g(b). If g(t) is continuous at b and f(x) is continuous at c, then (f◦g)(t) = f(g(t)) is continuous at b.
Proof. Given ε > 0, ∃δ > 0 such that
f(x) ≈ε f(c) for x ≈δ c continuity of f
g(t) ≈δ g(b) for t ≈α b continuity of g.
Then t ≈α b =⇒ x = g(t) ≈δ g(b) = c =⇒ f(x) ≈ε f(c).
Theorem 4.2.3 (Pasting Lemma). Suppose f : A → R and g : B → R are continuous, where A, B are closed. If f(x) = g(x) for every x ∈ A ∩ B, then h : A ∪ B → R is continuous:
h(x) :=
f(x), x ∈ A
g(x), x ∈ B.
Proof. Suppose C ⊆ R is a closed set. By elementary set theory,
h−1(C) = f−1(C) ∪ g−1(C).
f continuous =⇒ f−1(C) closed, and g continuous =⇒ g−1(C) closed, so the finite
union h−1(C) is closed.
Theorem 4.2.4. If f, g continuous then so are max{f, g} and min{f, g}.
Proof. Apply the Pasting Lemma to
max{f, g}(x) :=
f(x), f(x) ≥ g(x)
g(x), g(x) ≥ f(x).
The intersection is the set where f = g by defn, so only remains to check the two sets are
closed.
{x : f(x) ≥ g(x)} = {x : f(x) − g(x) ≥ 0} = {x : (f − g)(x) ≥ 0} = (f − g)−1([0,∞))
Theorem 4.2.5 (Intermediate Value Theorem). If f is continuous on [a, b], then f as-
sumes all values between f(a) and f(b).
Proof 1. If f(a) = f(b), it is trivial, so wlog let f(a) < f(b). Suppose c ∈ (f(a), f(b)).
Let
A := {x : f(x) < c}
B := {x : f(x) > c}.
Assume, by way of contradiction, that c ∉ f([a, b]). Then A ∪ B = [a, b] and clearly A ∩ B = ∅. Since a ∈ A and b ∈ B, neither is empty. This contradicts the fact that [a, b] is connected.
Proof 2. Again, suppose c ∈ (f(a), f(b)) and ∀x, f(x) ≠ c. Then A = (−∞, c), B = (c,∞) is a disconnection of f([a, b]), a contradiction.
Corollary 4.2.6. Let f be continuous on [a, b]. If f changes sign on the interval (say
f(a) < 0 < f(b)), then ∃c ∈ [a, b] for which f(c) = 0.
Example 4.2.2. A polynomial of odd degree has a real zero.
Proof. Consider x^{2k+1} = x(x²)^k. Then
x < 0 =⇒ x(x²)^k < 0, and x > 0 =⇒ x(x²)^k > 0.
Apply this to the polynomial a0 + a1x + · · · + anx^n, assuming an ≠ 0 (so it has degree n = 2k + 1). The lesser terms will not matter for |x| ≫ 1:
|a_{n−1}x^{n−1} + · · · + a0| ≤ |a_{n−1}||x^{n−1}| + · · · + |a0|
= |anx^n| (|a_{n−1}/(anx)| + |a_{n−2}/(anx²)| + · · · + |a0/(anx^n)|)
< |anx^n|,
since the expression in parentheses is < 1 for |x| ≫ 1.
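The existence proof is constructive in spirit: once a sign change is located, repeated halving of the interval (as in Corollary 4.2.6) converges to a zero. A sketch for the classical cubic p(x) = x³ − 2x − 5 (the polynomial is an illustrative choice, not from the notes):

```python
def bisect_zero(p, a, b, tol=1e-12):
    """Find c in [a, b] with p(c) = 0, given p(a) and p(b) have opposite signs."""
    assert p(a) * p(b) < 0
    while b - a > tol:
        m = (a + b) / 2
        if p(a) * p(m) <= 0:   # sign change persists in the left half
            b = m
        else:
            a = m
    return (a + b) / 2

p = lambda x: x**3 - 2*x - 5    # odd degree, so a real zero exists
root = bisect_zero(p, 2, 3)     # p(2) = -1 < 0 < p(3) = 16
print(root)                     # close to 2.0945514815
```

Each iteration halves the bracketing interval, so about 40 iterations suffice for 1e-12 accuracy.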
NEXT: The continuous image of a compact set is compact.
Theorem 4.2.7. Let f be continuous on K, where K is compact. Then f(K) is compact.
Proof. Let {Uα}α∈A be a covering of f(K) by open sets Uα. NTS: ∃ a finite subcover.
Since f is continuous, every set f−1(Uα) will be open in K.
Since the Uα cover f(K), we have an open cover {f−1(Uα)} of K.
K is compact by hypothesis, so there must be some finite subcover {f−1(Ui)}ni=1 of K.
K ⊆ ⋃_{i=1}^n f−1(Ui) =⇒ f(K) ⊆ f(⋃_{i=1}^n f−1(Ui))  by S&M 4(a)
= ⋃_{i=1}^n f(f−1(Ui))  by S&M 4(b)
⊆ ⋃_{i=1}^n Ui  by S&M 1(b),
and we have f(K) contained in the finite subcover {Ui}_{i=1}^n. I.e., f(K) is compact.
Corollary 4.2.8. A continuous function on a compact set is bounded and attains its sup
and inf.
Proof. If K is compact, then so is f(K), by the prev thm. So f(K) is closed and bounded: boundedness gives that sup f(K) and inf f(K) exist, and closedness gives sup f(K), inf f(K) ∈ f(K).
Theorem 4.2.9. On a compact set, continuity and uniform continuity are equivalent.
Proof. It is clear that uniform continuity implies continuity. For the converse, fix ε > 0.
Since f is continuous, for each x in the domain we can find δx such that
|x − y| < δx =⇒ |f(x) − f(y)| < ε/2.
For each x, define a neighbourhood Ux to be the δx/2-ball about x:
Ux := (x − δx/2, x + δx/2).
Then {Ux} is an open cover, so there are finitely many points {x1, . . . , xn} such that {Uxi}
is a finite subcover. We have the corresponding numbers δi := δxi . Define
δ := min_{i=1,...,n} {δi/2}.
NOTE: finiteness here is why compactness is necessary — then δ > 0.
Now show this δ satisfies the defn of uniformly continuous: pick any p, q in the domain
with |p− q| < δ. Since {Uxi} is a cover,
∃i, p ∈ Uxi =⇒ |p − xi| < δi/2.  (∗)
Then
|q − xi| ≤ |q − p| + |p − xi|
< δ + δi/2  by (∗)
≤ δi.  (∗∗)
So by the initial defn of δi, (∗) and (∗∗) give
|f(p) − f(q)| ≤ |f(p) − f(xi)| + |f(xi) − f(q)| < ε.
A monotone function on an interval has one-sided limits at all points of the domain,
finite except possibly at the endpoints.
Theorem 4.2.10. If f is monotonic increasing on (a, b), then for every x ∈ (a, b),
sup_{a<t<x} f(t) = f(x−) ≤ f(x) ≤ f(x+) = inf_{x<t<b} f(t).
Further, for a < x < y < b, f(x+) ≤ f(y−).
Proof. Let A := sup{f(t) : a < t < x}. Since f(x) is an upper bound for the set, we know A exists. To prove A = f(x−), fix ε > 0. Since A is a least upper bound, we can find δ > 0 such that
a < x − δ < x and A − ε < f(x − δ) ≤ A.
Since f is monotonic,
f(x− δ) ≤ f(t) ≤ A for x− δ < t < x.
Combine prev two eqns to get |f(t)−A| < ε. Thus, A = f(x−).
Next, for a < x < y < b, monotonicity gives
f(y−) = sup_{a<t<y} f(t) = sup_{x<t<y} f(t),
from which
f(x+) = inf_{x<t<y} f(t) ≤ sup_{x<t<y} f(t) = f(y−).
Corollary 4.2.11. A monotone function on an interval has at most a countable number
of discontinuities, all of which are jump discontinuities.
Proof. Wlog, let f be increasing, and let D be the set of discontinuities of f . For each
x ∈ D, associate a rational number r(x) such that f(x−) < r(x) < f(x+). Since
x < y =⇒ f(x+) ≤ f(y−),
the strict inequalities above give
x ≠ y =⇒ r(x) ≠ r(y).
r is injective, and thus gives a bijection between the jumps and a subset of Q.
§4 Exercise: #2,5,11 Recommended: #3,9
1. Show f is continuous iff the preimage of every closed set is closed.
2. Show that a continuous function is determined by its values on a dense subset of the
domain: let f, g : X → R be continuous, and let D be dense in X. Prove that f(D)
is dense in f(X) and that
(f(x) = g(x), ∀x ∈ D) =⇒ (f(x) = g(x), ∀x ∈ X).
3. Let I := [0, 1]. If f : I → I is continuous, show that f has a fixed point, i.e.,
∃x ∈ I, f(x) = x.
4. Consider the function which is 0 on every irrational, and takes the value 1/n if x is a rational written as m/n in lowest terms:
f(x) =
0, x ∈ R \ Q
1/n, x = m/n ∈ Q, n > 0.
Show that f is continuous at every irrational, and has a simple discontinuity at every
rational.
Chapter 5
Differentiation
5.1 Concepts of the derivative
5.1.1 Definitions
Definition 5.1.1. f is differentiable at x0 iff
∀ε > 0, ∃δ > 0, 0 < |x − x0| < δ =⇒ |(f(x) − f(x0))/(x − x0) − L| < ε,
in which case f ′(x0) = L.
Since x ≠ x0, multiply the inequality through by |x − x0| to obtain
Definition 5.1.2. f is differentiable at x0 iff
∀m > 0, ∃n > 0, |x − x0| < 1/n =⇒ |f(x) − (f(x0) + f′(x0)(x − x0))| < |x − x0|/m.
So f ≈ g for g(x) = f(x0) + f ′(x0)(x− x0).
Definition 5.1.3. If f(x)/g(x) → ∞ as x → x0, then f “blows up” faster than g. If f(x)/g(x) → 0 as x → x0, then g “blows up” faster than f; write f(x) = o(g(x)).
Write f(x) = O(g(x)) iff f(x)/g(x) ≤ b < ∞ as x → x0.
¹ May 2, 2007
Then “f is differentiable” means f(x) − g(x) = o(|x − x0|), where g is the affine
approximation to f : g(x) = f(x0) + f ′(x0)(x− x0).
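The o(|x − x0|) behaviour of the affine approximation can be observed numerically: the error shrinks quadratically, so error/h → 0. A sketch (f = sin and x0 = 1 are illustrative choices, not from the notes):

```python
import math

x0 = 1.0
f, fprime = math.sin, math.cos          # f'(x) = cos x

def err(h):
    """Error of the affine approximation g(x) = f(x0) + f'(x0)(x - x0) at x = x0 + h."""
    g = f(x0) + fprime(x0) * h
    return abs(f(x0 + h) - g)

# err(h)/h -> 0 as h -> 0, i.e. the error is o(|x - x0|)
for h in (1e-1, 1e-2, 1e-3):
    print(h, err(h) / h)
```

The printed ratios decay roughly linearly in h, consistent with the error being O(h²) for a C² function.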
5.1.2 Continuity and differentiability
Theorem 5.1.4. f is differentiable at x0 implies f is continuous at x0.
Proof. f(t) − f(x) = (f(t) − f(x))/(t − x) · (t − x) → f′(x) · 0 = 0 as t → x.
In fact, f must satisfy a Lipschitz-type bound near x0: |f(t) − f(x0)| ≤ M|t − x0| for t near x0.
Definition 5.1.5. f is differentiable on an open set U iff it is differentiable at every point
of U . f is C1 on U iff f ′ is continuous on U and Ck iff f (k) is continuous on U .
NOTE: f ∈ C0 means that f is continuous.
Example 5.1.1. A function which is C2 but not C3 on R:
f(x) =
x²/2 + x + 1, x ≤ 0
e^x, x ≥ 0.
5.2 Properties of the derivative
5.2.1 Local properties
Definition 5.2.1. f is monotone increasing at x iff f(s) ≤ f(x) ≤ f(t) for a < s < x <
t < b. f is strictly increasing at x iff f(s) < f(x) < f(t) for a < s < x < t < b.
Proposition 5.2.2. f is (monotone or strictly) increasing on (a, b) iff f is order-preserving
on (a, b).
Proof. Immediate.
Definition 5.2.3. f has a local maximum at x iff f(t) ≤ f(x) for all t ∈ (x − ε, x + ε), for some ε > 0. f has a strict local maximum at x iff f(t) < f(x) for all t ≠ x in (x − ε, x + ε).
Theorem 5.2.4. If x ∈ (a, b) is a local max or min of f, and f is differentiable at x, then f′(x) = 0.
Proof. Say x is a local max. Choose δ such that a < x − δ < x < x + δ < b. Then for x − δ < t < x, we have
(f(t) − f(x))/(t − x) ≥ 0.
Letting t → x, get f′(x) ≥ 0. Similarly, for x < t < x + δ the quotient is ≤ 0, giving f′(x) ≤ 0.
Example 5.2.1. f(x) = x3 is strictly increasing, but has derivative 0 at 0.
Also, f doesn’t have a max or min at 0.
Theorem 5.2.5 (Rolle’s). If f is continuous on [a, b] and differentiable on the interior
and f(a) = f(b), then ∃x ∈ (a, b) such that f ′(x) = 0.
Proof. If f is constant on the interval, we are done, so wlog let f(t) > f(a) somewhere.
Then f attains its max at some point x ∈ (a, b) (since f([a, b]) is compact), and the prev
thm gives f ′(x) = 0.
5.2.2 IVT and MVT
Theorem 5.2.6 (Continuity of derivatives). f is differentiable on (a, b). Then f ′ assumes
every value between f ′(s) and f ′(t), for a < s < t < b.
Proof. Let λ ∈ (f′(s), f′(t)), and define g(u) = f(u) − λu so that
g′(s) = f′(s) − λ < 0 =⇒ g(t1) < g(s) for some s < t1 < t, and
g′(t) = f′(t) − λ > 0 =⇒ g(t2) < g(t) for some s < t2 < t.
Then g attains its min on [s, t] at some point x such that s < x < t. It follows that
g′(x) = 0, hence f′(x) = λ.
Corollary 5.2.7. If f is differentiable on [a, b], then f ′ cannot have any simple or jump
discontinuities on [a, b].
Theorem 5.2.8 (Cauchy mean value thm). f, g are continuous on [a, b] and differentiable
on the interior. Then ∃x ∈ (a, b) for which
[f(b)− f(a)]g′(x) = [g(b)− g(a)]f ′(x).
Proof. Define
h(t) := [f(b)− f(a)]g(t)− [g(b)− g(a)]f(t) a ≤ t ≤ b
so that h is continuous on [a, b] and differentiable on the interior and
h(a) = f(b)g(a)− f(a)g(b) = h(b).
By Rolle’s Thm, get h′(x) = 0 for some x.
Interp: x = f(s), y = g(t) (SKETCH).
Corollary 5.2.9 (“The” Mean Value Thm). If f continuous on [a, b] and differentiable
on the interior, then ∃x ∈ (a, b) for which
f(b)− f(a) = (b− a)f ′(x).
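For a concrete instance of the MVT, take f(x) = x³ on [0, 2]: the secant slope is (8 − 0)/2 = 4, and f′(c) = 3c² = 4 at c = 2/√3 ∈ (0, 2). A quick numerical check (illustrative only):

```python
a, b = 0.0, 2.0
f = lambda x: x**3
fprime = lambda x: 3 * x**2

secant = (f(b) - f(a)) / (b - a)   # secant slope = 4
c = (4 / 3) ** 0.5                 # solves 3c^2 = 4, i.e. c = 2/sqrt(3)
print(secant, fprime(c))           # both equal 4, and 0 < c < 2
```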
Theorem 5.2.10. f is differentiable on (a, b).
1. If f ′(x) ≥ 0,∀x ∈ (a, b), then f is increasing.
2. If f ′(x) ≤ 0,∀x ∈ (a, b), then f is decreasing.
3. If f ′(x) = 0,∀x ∈ (a, b), then f is constant.
Proof. These can all be read off from the equation
f(t) − f(s) = (t − s)f′(x) for some x between s and t, given by the MVT.
5.3 Calculus of derivatives
5.3.1 Arithmetic rules
Theorem 5.3.1 (Linearity).
f, g differentiable at x =⇒ (af + bg)′(x) = af ′(x) + bg′(x),∀a, b ∈ R.
Proof. HW (follows immediately from limit defn)
Theorem 5.3.2 (Product rule).
f, g differentiable at x =⇒ (fg)′(x) = f ′(x)g(x) + f(x)g′(x).
Proof. Let h = fg so that
h(t)− h(x) = f(t)g(t)− f(t)g(x) + f(t)g(x)− f(x)g(x)
(h(t) − h(x))/(t − x) = f(t) · (g(t) − g(x))/(t − x) + g(x) · (f(t) − f(x))/(t − x).
Theorem 5.3.3 (Quotient rule).
f, g differentiable at x, g(x) ≠ 0 =⇒ (f/g)′(x) = (f′(x)g(x) − f(x)g′(x))/g²(x).
Proof. HW: Let h = f/g so that
(h(t) − h(x))/(t − x) = 1/(g(t)g(x)) · [g(x) · (f(t) − f(x))/(t − x) − f(x) · (g(t) − g(x))/(t − x)].
Theorem 5.3.4 (Chain Rule). If f is differentiable at x and g is differentiable at f(x),
then g◦f is differentiable at x with (g◦f)′(x) = g′(f(x))f ′(x).
Proof. Let y = f(x) and s = f(t) and define h(t) = g(f(t)). By the defn of derivative,
f(t)− f(x) = (t− x)[f ′(x) + o(1)] as t → x
g(s)− g(y) = (s− y)[g′(y) + o(1)] as s → y.
So: h(t) − h(x) = g(f(t)) − g(f(x))
= (s − y)[g′(y) + o(1)]
= [f(t) − f(x)][g′(f(x)) + o(1)]
= (t − x)[f′(x) + o(1)][g′(f(x)) + o(1)],
(h(t) − h(x))/(t − x) = [f′(x) + o(1)][g′(f(x)) + o(1)].
Note: s → y as t → x, by continuity of f .
Theorem 5.3.5 ((Baby) Inverse Fn Thm). f : (a, b) → (c, d) is C1 and f′(x) > 0 on (a, b). Then f is invertible and f−1 is C1 with (f−1)′(y) = 1/f′(x), if y = f(x).
Proof. Use the chain rule to differentiate both sides of the identity f−1(f(x)) = x.
5.4 Higher derivatives and Taylor’s Thm
5.4.1 Interpretations of f ′′
Definition 5.4.1. If f ′ is defined in a neighbourhood of x and differentiable at x, then f
is twice differentiable at x, with f ′′(x) = (f ′)′(x). If f ′′ is continuous, then f ∈ C2.
Theorem 5.4.2. Suppose f ′′ exists in some neighbourhood of x.
(i) f ′(x) = 0, f ′′(x) > 0 =⇒ x is a strict local min.
(ii) If x is a local max, then f ′′(x) ≤ 0.
(iii) f′′ > 0 on an interval =⇒ the graph of f lies below any secant line there.
Proof. (i) f ′′(x) > 0 means f ′ is increasing on some open interval (a, b) around x, with
f ′(a) < 0 and f ′(b) > 0 (since f ′(x) = 0). Then f is decreasing on (a, x) and
increasing on (x, b).
(ii) This is implied by the contrapositive of (1).
(iii) Let g be an affine function whose graph intersects f ’s at x = s and x = t. We need
to show f(x) < g(x), i.e. that h := f − g is negative on (s, t) (note h(s) = h(t) = 0).
If f′′ > 0 on (a, b), then h′′ > 0 also, since g′′ = 0. Suppose h were not negative
on (s, t); then it would have a local max on this interval at some point c. By Part
(ii), this would imply h′′(c) ≤ 0, a contradiction.
Part (iii) holds, in particular, relative to the tangent line g(t) = f(x) + f′(x)(t − x): for f′′ > 0 the graph lies above its tangent lines.
5.4.2 Taylor’s Thm
Definition 5.4.3. If f ∈ Cn, the Taylor expansion of f at a is
Tn(a, x) = f(a) + f′(a)(x − a) + (1/2)f′′(a)(x − a)² + · · · + (1/n!)f^(n)(a)(x − a)^n,
where f (n) = (f (n−1))′ is defined by induction.
Theorem 5.4.4. If f (n) exists at a, then Tn(a, x) is the unique polynomial of degree n in
powers of (x− a) having nth-order agreement with f(x) at a.
Proof. Suppose p is a polynomial in (x− a):
p(x) = c0 + c1(x− a) + · · ·+ cn(x− a)n.
After k differentiations, the terms c0, . . . , ck−1 vanish:
p(k)(x) = k!ck + (terms with (x− a) as a factor).
So p^(k)(a) = k!ck. If f(x) and p(x) have nth-order agreement,
f^(k)(a) = k!ck =⇒ ck = f^(k)(a)/k!, ∀k = 0, 1, . . . , n.
Theorem 5.4.5 (Taylor’s Thm). Suppose f ∈ Cn+1(I) for some open interval I contain-
ing a and x. Then for some c between a and x,
f(x) = f(a) + f′(a)(x − a) + (1/2)f′′(a)(x − a)² + · · · + (1/n!)f^(n)(a)(x − a)^n + Rn(x),
Rn(x) = f^(n+1)(c)/(n + 1)! · (x − a)^{n+1}.
NOTE: c depends on x, so the RHS is not a polynomial in x; f^(n+1)(c) is not a constant.
Proof. We show the theorem holds at x = b. Let P be defined by
P (x) = Tn(a, x) + C(x− a)n+1,
where C is chosen so that f(b) = P(b), i.e., C = (f(b) − Tn(a, b))/(b − a)^{n+1}. Let
g(x) = f(x)− P (x)
so that g(a) = g(b) = 0. Must show f (n+1)(c) = (n + 1)!C for some c ∈ (a, b). Since
g(n+1)(x) = f (n+1)(x)− (n + 1)!C, suffices to find a zero of g(n+1) on (a, b).
Since the kth derivative of Tn(a, ·) at a equals f^(k)(a) for k = 0, . . . , n, we have
g(a) = g′(a) = · · · = g^(n)(a) = 0.
Applying Rolle's theorem repeatedly to the successive derivatives of g:
g(a) = g(b) = 0 =⇒ g′(x1) = 0, some x1 ∈ (a, b)
g′(a) = g′(x1) = 0 =⇒ g′′(x2) = 0, some x2 ∈ (a, x1)
...
g^(n)(a) = g^(n)(xn) = 0 =⇒ g^(n+1)(xn+1) = 0, some xn+1 ∈ (a, xn),
each time using g^(k)(a) = 0. Take c = xn+1; then g^(n+1)(c) = 0, as required.
Corollary 5.4.6. f = Tn + o(|x− a|n) as x → a.
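The remainder estimate can be checked numerically: for f = exp at a = 0, the actual error of Tn stays below the Lagrange bound e^{|x|}|x|^{n+1}/(n+1)!. A sketch (the choices x = 1, n = 6 are illustrative):

```python
import math

def taylor_exp(x, n):
    """T_n(0, x) for f = exp: sum of x^k / k! for k = 0..n."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

x, n = 1.0, 6
actual_error = abs(math.exp(x) - taylor_exp(x, n))
# Lagrange bound: f^(n+1)(c) = e^c <= e^{|x|} for c between 0 and x
bound = math.exp(abs(x)) * abs(x)**(n + 1) / math.factorial(n + 1)
print(actual_error, bound)   # the actual error sits below the bound
```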
Theorem 5.4.7 (L'Hôpital's Rule). Suppose f, g ∈ C1(a, b) satisfy limx→c f(x) = limx→c g(x) = 0 for c ∈ (a, b). If g′(c) ≠ 0, then
lim_{x→c} f(x)/g(x) = lim_{x→c} f′(x)/g′(x).
Proof. We only need to take the limit of
f(x)/g(x) = (f(c) + f′(c)(x − c) + o(|x − c|)) / (g(c) + g′(c)(x − c) + o(|x − c|))  Taylor's Thm
= (f′(c)(x − c) + o(|x − c|)) / (g′(c)(x − c) + o(|x − c|))  f(c) = g(c) = 0 by continuity
= (f′(c) + o(1)) / (g′(c) + o(1))  cancellation.
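For the standard instance f = sin, g = id at c = 0, the rule predicts f(x)/g(x) → cos(0)/1 = 1. A quick numerical look (illustrative only):

```python
import math

# sin(x)/x for shrinking x: the ratios approach f'(0)/g'(0) = cos(0)/1 = 1
ratios = [math.sin(x) / x for x in (0.1, 0.01, 0.001)]
print(ratios)
```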
Theorem 5.4.8 (L'Hôpital's Other Rule). Suppose f, g ∈ C1(a, b) satisfy limx→c f(x) = limx→c g(x) = ∞. Then the same result holds.
§5 Exercise: # Recommended: #
1. Prove the linearity of differentiation.
2. Prove the quotient rule.
Chapter 6
Integration
6.1 Integrals of continuous functions
6.1.1 Existence of the integral
Let f(x) be a function defined on [a, b]. We want to define its integral ∫_a^b f(x) dx.
Definition 6.1.1. A partition P of the interval [a, b] is a finite set of points {a =
x0, x1, x2, . . . , xn = b}, xi < xi+1.
Definition 6.1.2. On each subinterval [xi−1, xi] of the partition P, define
Mi = sup f(x), xi−1 ≤ x ≤ xi,
mi = inf f(x), xi−1 ≤ x ≤ xi,
U(f, P) = Σ_{i=1}^n Mi(xi − xi−1),
L(f, P) = Σ_{i=1}^n mi(xi − xi−1).
U(f, P) is the upper sum of f on the partition P, and L(f, P) is the lower sum of f on the partition P.
Definition 6.1.3. If sup_P L(f, P) = inf_P U(f, P), then the integral of f on [a, b] is defined to be the common value, denoted ∫_a^b f(x) dx. We say f is (Riemann-)integrable on [a, b] and write f ∈ R[a, b].
Definition 6.1.4. The partition P ′ is a refinement of P iff P ⊆ P ′.
Note that adding points to the partition has two effects:
1. the lengths of the subintervals decrease, and
2. L(f, P ) increases and U(f, P ) decreases.
Theorem 6.1.5. For P ⊆ P′, L(f, P′) ≥ L(f, P) and U(f, P′) ≤ U(f, P).
Proof. Do the case where P ′ = P ∪ {y} first; then the sums only change on the one
subinterval. General case by repetition.
Suppose we have a nested sequence of partitions P1 ⊆ P2 ⊆ . . . . Then {L(f, Pi)} and
{U(f, Pi)} are monotonic sequences, each bounded by any element of the other. So both
converge. f is integrable when they converge to the same value.
Definition 6.1.6. The oscillation of f is Osc(f, P ) = U(f, P )− L(f, P ).
Next: ∫_a^b f(x) dx exists iff Osc(f, Pn) → 0 for some nested sequence {Pn}.
Theorem 6.1.7 (Oscillation). f ∈ R[a, b] iff ∀ε > 0, ∃P such that Osc(f, P ) < ε.
Proof. (⇒) Let f ∈ R and fix ε > 0. Then there are partitions P1, P2 such that
U(f, P2) − ∫f < ε and ∫f − L(f, P1) < ε.
Let P = P1 ∪ P2 be the common refinement. Then
U(f, P) ≤ U(f, P2) < ∫f + ε < L(f, P1) + 2ε ≤ L(f, P) + 2ε,
so that U(f, P) − L(f, P) < 2ε.
(⇐) Apply the inequality U(f, P) − L(f, P) < ε to
L(f, P) ≤ sup_P L(f, P) ≤ inf_P U(f, P) ≤ U(f, P)
to get 0 ≤ inf_P U(f, P) − sup_P L(f, P) < ε. Since this is true for any ε > 0, done.
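The oscillation criterion is easy to watch numerically: for the increasing function f(x) = x² on [0, 1] with n equal subintervals, U − L collapses to (b − a)(f(b) − f(a))/n = 1/n. A sketch (illustrative only):

```python
def osc(f, a, b, n):
    """U(f,P) - L(f,P) on the uniform partition with n subintervals,
    assuming f is increasing, so sup/inf are the endpoint values."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    U = sum(f(xs[i]) * dx for i in range(1, n + 1))       # M_i = f(x_i)
    L = sum(f(xs[i - 1]) * dx for i in range(1, n + 1))   # m_i = f(x_{i-1})
    return U - L

f = lambda x: x * x
print([osc(f, 0.0, 1.0, n) for n in (10, 100, 1000)])
# the sums telescope: Osc = (b - a)(f(b) - f(a))/n, i.e. 1/n here
```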
Definition 6.1.8. A norm on a vector space X is a function ‖ · ‖ from X to R that satisfies
(i) ‖x‖ ≥ 0, ‖x‖ = 0 ⇐⇒ x = 0.
(ii) ‖ax‖ = |a| · ‖x‖,∀a ∈ R.
(iii) ‖x− z‖ ≤ ‖x− y‖+ ‖y − z‖, ∀x, y, z ∈ X.
Example 6.1.1. On R, absolute value is a norm.
For x = (a, b) ∈ R², ‖x‖ = √(a² + b²).
For f ∈ R(D), ‖f‖1 := ∫_D |f(x)| dx.
Definition 6.1.9. A step function is one which is locally constant except for finitely many
points. Thus, a step function is usually defined in terms of a partition; it is constant on
each subinterval and can have a jump discontinuity at each point of the partition.
Example 6.1.2. Define the lower and upper step functions
f_P(x) := mi and F_P(x) := Mi, for x ∈ [xi−1, xi).
Then the previous theorem states that f is integrable iff there is a sequence of partitions {Pn} with
∫ |F_{Pn}(x) − f_{Pn}(x)| dx → 0 as n → ∞.
Since we can always take a common refinement of two partitions, this amounts to saying that f is integrable iff it can be approximated by step functions, i.e.,
∀ε > 0, ∃g ∈ Step, ‖f − g‖1 < ε.
Often, showing that f is integrable amounts to approximating it by g ∈ Step with
respect to the norm ‖ · ‖1.
For Osc arguments, showing f ∈ R(I) amounts to showing that
‖F_P − f_P‖1 < ε,
for the upper and lower step functions F_P, f_P; then g = F_P (or f_P) satisfies ‖g − f‖1 < ε. The justification is that
f_P ≤ f ≤ F_P.
Next, two sufficient conditions for integrability: continuity and monotonicity.
Theorem 6.1.10. If f is continuous on [a, b] then f ∈ R[a, b].
Proof. Fix ε > 0 and choose γ > 0 such that 0 < (b − a)γ < ε. Since f is uniformly
continuous on [a, b], there is δ > 0 such that |f(x)− f(t)| < γ whenever
|x− t| < δ, x, t ∈ [a, b].
If P is any partition of [a, b] such that |xi − xi−1| < δ,∀i, then
|x− t| < δ =⇒ |f(x)− f(t)| < γ =⇒ Mi −mi < γ.
Therefore,
Osc(f, P) = U(f, P) − L(f, P) = Σ_{i=1}^n (Mi − mi)(xi − xi−1)
≤ γ Σ_{i=1}^n (xi − xi−1)
= γ(b − a)
< ε.
By the Oscillation thm, f ∈ R.
Theorem 6.1.11. f monotonic on [a, b] =⇒ f ∈ R[a, b].
Proof. Suppose f is monotonic increasing (the proof is analogous in the other case), so that on any partition,
Mi = f(xi), mi = f(xi−1), i = 1, . . . , n.
For any n, choose the partition dividing [a, b] into n equal subintervals; then xi − xi−1 = (b − a)/n for each i = 1, . . . , n. Then
Osc(f, P) = U(f, P) − L(f, P) = Σ_{i=1}^n (f(xi) − f(xi−1))(xi − xi−1)
= (b − a)/n · [f(b) − f(a)] < ε
for large enough n. Then apply the Oscillation thm.
Theorem 6.1.12. Let f ∈ R[a, b], m ≤ f ≤ M . If ϕ is continuous on [m,M ] and
h(x) := ϕ(f(x)) for x ∈ [a, b], then h ∈ R[a, b].
Proof. Fix ε > 0. Since ϕ is uniformly continuous on [m, M], find 0 < δ < ε such that
|s − t| < δ =⇒ |ϕ(s) − ϕ(t)| < ε, s, t ∈ [m, M].
Since f ∈ R, find P = {x0, . . . , xn} such that
Osc(f, P) < δ².
Let Mi, mi be the extrema of f on [xi−1, xi], and M′i, m′i those of h. Subdivide the set of indices {1, . . . , n} into two classes:
i ∈ A ⇐⇒ Mi − mi < δ,
i ∈ B ⇐⇒ Mi − mi ≥ δ.
For i ∈ A, we have M′i − m′i ≤ ε by the choice of δ. For i ∈ B, we have M′i − m′i ≤ 2 sup_{m≤t≤M} |ϕ(t)|. By the bound Osc(f, P) < δ²,
δ Σ_{i∈B} (xi − xi−1) ≤ Σ_{i∈B} (Mi − mi)(xi − xi−1) < δ²
=⇒ Σ_{i∈B} (xi − xi−1) < δ.
Then
Osc(h, P) = Σ_{i∈A} (M′i − m′i)(xi − xi−1) + Σ_{i∈B} (M′i − m′i)(xi − xi−1)
≤ ε(b − a) + 2δ sup|ϕ(t)|
< ε(b − a + 2 sup|ϕ(t)|).
For all of these Osc arguments, showing f ∈ R(I) amounts to showing that ‖F_P − f_P‖1 < ε, i.e., that f can be approximated from above and below by step functions F_P, f_P, with respect to the norm ‖ · ‖1.
6.2 Properties of the Riemann Integral
Theorem 6.2.1. Let f, g ∈ R[a, b].
(i) (Linearity) f + g ∈ R[a, b] and cf ∈ R[a, b], ∀c ∈ R, with ∫(f + g) dx = ∫f dx + ∫g dx and ∫cf dx = c∫f dx.
(ii) max{f, g}, min{f, g} ∈ R[a, b].
(iii) fg ∈ R[a, b].
(iv) f ≤ g on [a, b] =⇒ ∫_a^b f(x) dx ≤ ∫_a^b g(x) dx.
(v) |f| ∈ R[a, b] and |∫_a^b f(x) dx| ≤ ∫_a^b |f(x)| dx.
(vi) For a < c < b, ∫_a^c f(x) dx + ∫_c^b f(x) dx = ∫_a^b f(x) dx.
(vii) If |f(x)| ≤ M on [a, b], then ∫_a^b f(x) dx ≤ M(b − a).
Theorem 6.2.2. Let f, g ∈ R[a, b].
(i) (Linearity) f + g ∈ R[a, b] and cf ∈ R[a, b], ∀c ∈ R, with ∫(f + g) dx = ∫f dx + ∫g dx and ∫cf dx = c∫f dx.
Proof. Fix ε > 0 and suppose f = f1 + f2. Find partitions P1, P2 such that
Osc(f1, P1) < ε and Osc(f2, P2) < ε.
Let P = P1 ∪ P2 be the common refinement. Then these inequalities are still true, and
L(f1, P) + L(f2, P) ≤ L(f, P) ≤ U(f, P) ≤ U(f1, P) + U(f2, P),
which implies Osc(f, P) < 2ε. Hence, f ∈ R, and
U(fj, P) < ∫fj(x) dx + ε, j = 1, 2,
which implies (by the long inequality above) that
∫f dx ≤ U(f, P) < ∫f1 dx + ∫f2 dx + 2ε.
Since ε was arbitrary, this gives ∫f dx ≤ ∫f1 dx + ∫f2 dx. Similarly for the other inequality. For cf, use Σ cMi(xi − xi−1) = c Σ Mi(xi − xi−1), etc.
(ii) max{f, g}, min{f, g} ∈ R[a, b].
Proof. Let h(x) := max{f(x), g(x)}. On each subinterval, the oscillation of h is at most the sum of the oscillations of f and g, so for a common refinement P,
Osc(h, P) ≤ Osc(f, P) + Osc(g, P) < 2ε.
(iii) fg ∈ R[a, b].
Proof. Use ϕ(t) = t2 to get f2 ∈ R, then observe 4fg = (f + g)2 − (f − g)2.
(iv) f ≤ g on [a, b] =⇒ ∫_a^b f(x) dx ≤ ∫_a^b g(x) dx.
Proof. HW. Use g − f ≥ 0 and part (i).
(v) |f| ∈ R[a, b] and |∫_a^b f(x) dx| ≤ ∫_a^b |f(x)| dx.
Proof. Use ϕ(t) = |t| to get |f| ∈ R, then choose c = ±1 to make c∫f ≥ 0 and observe
cf ≤ |f| =⇒ |∫f| = c∫f = ∫cf ≤ ∫|f|.
(vi) For a < c < b, ∫_a^c f(x) dx + ∫_c^b f(x) dx = ∫_a^b f(x) dx.
Proof. HW. Note that you can always refine the partition by adding c.
(vii) If |f(x)| ≤ M on [a, b], then ∫_a^b f(x) dx ≤ M(b − a).
Proof. HW. Use (iv) and show ∫_a^b 1 dx = b − a.
“The fundamental theorem(s) of calculus” (next two thms) shows that integration and
differentiation are almost inverse operations.
Theorem 6.2.3 (Differentiation of integral). If f ∈ R[a, b], then F(x) = ∫_a^x f(t) dt ∈ C0[a, b], and F′(c) = f(c) if f is continuous at c.
Proof. Since f ∈ R, |f(t)| ≤ M for a ≤ t ≤ b. For a ≤ x < y ≤ b,
|F(y) − F(x)| = |∫_x^y f(t) dt| ≤ ∫_x^y |f(t)| dt ≤ M(y − x).
This Lipschitz condition gives uniform continuity of F on [a, b].
If f is continuous at c, then given ε > 0, have δ > 0 such that
|t− c| < δ =⇒ |f(t)− f(c)| < ε, ∀t ∈ [a, b].
If we choose s < t in [a, b] such that c − δ < s ≤ c ≤ t < c + δ, then
|(F(t) − F(s))/(t − s) − f(c)| = |1/(t − s) · ∫_s^t [f(u) − f(c)] du| < ε
shows that F′(c) = f(c).
NOTE: f ∈ C0[a, b] =⇒ F ∈ C1[a, b].
Theorem 6.2.4 (Integration of derivative). If f ∈ R[a, b] and f has a primitive F which is differentiable on [a, b], then ∫_a^b f(x) dx = F(b) − F(a).
Proof. Fix ε > 0 and choose a partition P = {x0, . . . , xn} of [a, b] such that Osc(f, P ) < ε.
By the MVT, get ti ∈ [xi−1, xi] such that
f(ti)(xi − xi−1) = F(xi) − F(xi−1), i = 1, . . . , n,
Σ_{i=1}^n f(ti)(xi − xi−1) = Σ_{i=1}^n [F(xi) − F(xi−1)] = F(b) − F(a).
Since L(f, P) ≤ Σ f(ti)(xi − xi−1) ≤ U(f, P),
Osc(f, P) < ε =⇒ |Σ_{i=1}^n f(ti)(xi − xi−1) − ∫_a^b f(x) dx| < ε.
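A numerical sanity check of the theorem: a Riemann sum for f = cos on [0, π/2] approaches F(b) − F(a) = sin(π/2) − sin(0) = 1. A sketch (the midpoint rule is an illustrative choice):

```python
import math

def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

approx = riemann_sum(math.cos, 0.0, math.pi / 2, 1000)
print(approx)   # close to 1 = sin(pi/2) - sin(0)
```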
Theorem 6.2.5. Any two primitives of f differ only by a constant.
Proof. Let F, G both be primitives of f . Then
(F −G)′ = F ′ −G′ = f − f = 0 =⇒ F −G = c, for some c ∈ R.
Putting this minor result together with the previous two shows that
D(I(f)) = f, but I(D(f)) = f + c,
so integration and differentiation are almost inverse operations.
Theorem 6.2.6 (Integration by parts). If f, g ∈ C1[a, b], then
∫_a^b f(x)g′(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f′(x)g(x) dx.
Proof. Put h(x) = f(x)g(x), so h, h′ ∈ R by Integral properties thm. Use the Integration
of derivative thm on h′.
NOTE: IBP is product rule in reverse, just like CoV is chain rule in reverse.
Theorem 6.2.7 (Change of variable). If g ∈ C1[a, b], g is increasing, and f ∈ R[g(a), g(b)],
then f ◦g ∈ R[a, b] and
∫_a^b f(g(x))g′(x) dx = ∫_{g(a)}^{g(b)} f(y) dy.
Proof. First, show f◦g ∈ R. To each partition P = {x0, . . . , xn} of [a, b], there corresponds a partition Q = {y0, . . . , yn} of [g(a), g(b)], with yi = g(xi) (and conversely, since g is increasing). Since the range of f on [yi−1, yi] is the same as the range of f◦g on [xi−1, xi], the suprema and infima on corresponding subintervals agree. Since f ∈ R, we can choose Q (hence P) so that Osc(f, Q) < ε and the subintervals of P are small, in which case Osc(f◦g, P) < ε, so f◦g ∈ R[a, b].
Next, let F be a primitive of f, so F′ = f. Then the chain rule gives
(F(g(x)))′ = f(g(x))g′(x),
so
∫_a^b f(g(x))g′(x) dx = ∫_a^b (F(g(x)))′ dx = F(g(b)) − F(g(a)) = ∫_{g(a)}^{g(b)} f(y) dy,
where the last two equalities come from the Integration of derivative thm.
Theorem 6.2.8. If f is bounded and has only finitely many discontinuities on [a, b], then
f ∈ R[a, b].
Proof. We build a clever partition by working around the “bad points” of f . Away from
the bad points, we use continuity to get integrability. At the bad points, we use the bound
on f (and a “horizontal squeeze”) to estimate the oscillation.
Let α1 < α2 < · · · < αn be the discontinuities of f on [a, b]. Let Ik = (αk − ε/2n, αk + ε/2n) be an open interval about αk. Then the total measure of these intervals is
|⋃_{k=1}^n Ik| ≤ Σ_{k=1}^n |Ik| = Σ_{k=1}^n ε/n = ε.
If we remove the intervals Ik from [a, b], the remaining set K = [a, b] \ ⋃Ik is compact, and so f is uniformly continuous on K. Choose δ > 0 such that
|s − t| < δ =⇒ |f(s) − f(t)| < ε, s, t ∈ K.
Now let P = {a = x0, x1, . . . , xm = b} be any partition of [a, b] such that
(a) P contains the endpoints {u1, . . . , un, v1, . . . , vn} of the “removed” intervals Ik.
(b) P contains no point x which lies inside an interval Ik.
(c) Whenever xi is not one of the uk, then xi − xi−1 < δ.
Put M = sup |f(x)|. Then Mk −mk ≤ 2M . In fact, we have Mi −mi ≤ ε unless xi−1
is one of the uk.
Osc(f, P) = U(f, P) − L(f, P) = Σ_i (Mi − mi)(xi − xi−1)
= Σ_{contin} (Mi − mi)(xi − xi−1) + Σ_k (Mk − mk)(vk − uk)
≤ ε Σ_{contin} (xi − xi−1) + 2M Σ_k (vk − uk)
≤ ε(b − a) + 2Mε.
By Oscillation thm and K-ε (with K = (b− a) + 2M), f ∈ R.
6.3 Improper Integrals
Definition 6.3.1. Roughly, an improper integral is an integral which can only be defined
on a domain by taking a limit of its values on subdomains.
Example 6.3.1. f(x) = xα has an integrable singularity at x = 0 iff α > −1 and an
integrable singularity at ∞ iff α < −1.
Solution.
∫_s^t x^α dx = [x^{α+1}/(α+1)]_s^t = (t^{α+1} − s^{α+1})/(α + 1) for α ≠ −1, and
∫_s^t x^{−1} dx = [log x]_s^t = log t − log s = log(t/s).
Let α > −1, so α + 1 > 0. Then we have the improper integrals
∫_0^t x^α dx = lim_{s→0} ∫_s^t x^α dx = t^{α+1}/(α + 1),
∫_s^∞ x^α dx = lim_{t→∞} ∫_s^t x^α dx = ∞.
Let α < −1, so α + 1 < 0. Then we have the improper integrals
∫_0^t x^α dx = lim_{s→0} ∫_s^t x^α dx = ∞,
∫_s^∞ x^α dx = lim_{t→∞} ∫_s^t x^α dx = −s^{α+1}/(α + 1).
If α = −1, then log(t/s) diverges as s → 0 or as t → ∞, so neither improper integral exists.
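One can watch the α = −1 threshold numerically: ∫_ε^1 x^α dx stays bounded as ε → 0 exactly when α > −1. A sketch using the antiderivative computed above (the sample exponents ±1/2 off the threshold are illustrative):

```python
def integral_eps_to_1(alpha, eps):
    """Exact value of the integral of x^alpha over [eps, 1], for alpha != -1."""
    return (1 - eps**(alpha + 1)) / (alpha + 1)

for eps in (1e-2, 1e-4, 1e-6):
    print(integral_eps_to_1(-0.5, eps),   # converges to 1/(alpha+1) = 2
          integral_eps_to_1(-1.5, eps))   # blows up as eps -> 0
```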
On an infinite domain, a new possibility arises: the integral can fail to exist due to
oscillations.
Example 6.3.2. lim_{t→∞} ∫_0^t (sin x)/x dx exists, but lim_{t→∞} ∫_0^t |sin x / x| dx does not.
Proof. lim_{x→0} (sin x)/x = 1 (L'Hôp), so the endpoint 0 is not a problem. If 1 ≤ s < t, then integration by parts gives
∫_0^t (sin x)/x dx − ∫_0^s (sin x)/x dx = ∫_s^t (sin x)/x dx = −(cos t)/t + (cos s)/s − ∫_s^t (cos x)/x² dx,
so
|∫_0^t (sin x)/x dx − ∫_0^s (sin x)/x dx| ≤ |cos t|/t + |cos s|/s + ∫_s^t |cos x|/x² dx
≤ 1/t + 1/s + ∫_s^t dx/x²
= 1/t + 1/s + (1/s − 1/t)
≤ 2/s
< ε, for s ≫ 1,
so the partial integrals are Cauchy and the limit exists.
However, note that sin x > √2/2 on certain intervals: for any k ∈ N,
2kπ + π/4 ≤ x ≤ 2kπ + 3π/4 =⇒ sin x > √2/2,
(2k+1)π + π/4 ≤ x ≤ (2k+1)π + 3π/4 =⇒ sin x < −√2/2.
Each of these intervals has length π/2, so for M = (m+1)π,
∫_0^M |(sin x)/x| dx ≥ ∑_{j=0}^m ∫_{(j+1/4)π}^{(j+3/4)π} |(sin x)/x| dx
    ≥ ∑_{j=0}^m ∫_{(j+1/4)π}^{(j+3/4)π} (1/√2)(1/x) dx
    ≥ (1/√2) ∑_{j=0}^m (π/2) · 1/((j+3/4)π)
    = (π/(2√2)) ∑_{j=0}^m 1/((j+3/4)π),
and this last sum diverges as m → ∞, by comparison with the harmonic series.
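The divergence is slow (logarithmic in m). A small numerical illustration of the lower bound derived above (the helper name lower_bound is ours):

```python
import math

def lower_bound(m):
    """The bound (pi / (2*sqrt(2))) * sum_{j=0}^{m} 1/((j + 3/4)*pi)."""
    return (math.pi / (2 * math.sqrt(2))) * sum(
        1.0 / ((j + 0.75) * math.pi) for j in range(m + 1)
    )
```

Evaluating for increasing m shows the bound growing without limit, at harmonic (logarithmic) speed.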
Definition 6.3.2. Suppose f is defined on [0, 1] and f ∈ R[ε, 1] for every ε > 0. f has an absolutely convergent improper integral on [0, 1] iff lim_{ε→0} ∫_ε^1 |f(x)| dx exists.
NOTE: This implies that lim_{ε→0} ∫_ε^1 f(x) dx exists.
Definition 6.3.3. The Cauchy principal value integral P.V. ∫_{−1}^1 f(x) dx is defined to be
lim_{ε→0} (∫_{−1}^{−ε} f(x) dx + ∫_ε^1 f(x) dx),
if the limit exists.
§6 Exercise: # Recommended: #
1.
Chapter 7
Sequences and Series of Functions
7.1 Complex Numbers
7.1.1 Basic properties of C
Consider the plane
R² := {(x, y) : x, y ∈ R}.
This space already has a vector space structure:
(x1, y1) + (x2, y2) = (x1 + x2, y1 + y2)
a(x1, y1) = (ax1, ay1).
We can also endow it with a multiplicative structure by defining
(a, b) · (c, d) := (ac− bd, ad + bc).
This makes R² into a field called C. To check that this alleged field has inverses, note that for (x, y) ≠ (0, 0),
(x, y) · (1/(x² + y²))(x, −y) = (1, 0) = 1.
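These field operations are easy to check mechanically; a minimal sketch (the function names cmul and cinv are ours):

```python
def cmul(z, w):
    """(a, b) * (c, d) := (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def cinv(z):
    """Inverse: (x, y)^{-1} = (1/(x^2 + y^2)) * (x, -y), for (x, y) != (0, 0)."""
    x, y = z
    n = x * x + y * y
    return (x / n, -y / n)
```

For instance, cmul((0, 1), (0, 1)) returns (-1, 0), which is the computation i² = −1 below.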
(Lecture of May 2, 2007.)
Typically, one writes the basis vectors of R2 as 1 = (1, 0) and i = (0, 1), so that general
elements are
z = x(1, 0) + y(0, 1) = x + yi.
Then
i² = (0, 1)(0, 1) = (0 − 1, 0 + 0) = −(1, 0) = −1.
Theorem 7.1.1. Let p ∈ C[x] be a polynomial of degree n. Then p has n roots in C.
That is, if
p(x) = a_0 + a_1 x + a_2 x² + · · · + a_n x^n,   a_j ∈ C,
then one can write
p(x) = c_0 (x − c_1) · · · (x − c_n),   c_j ∈ C.
Remarkably, every known proof relies on topology (completeness) somehow.
NOTE: a > 0 implies automatically that a ∈ R, not C. Earlier: C is complete, but
not ordered.
Theorem 7.1.2. For any real numbers a and b,
(a, 0) + (b, 0) = (a + b, 0) (a, 0) · (b, 0) = (ab, 0).
This allows us to identify R with the subfield of C consisting of the elements (x, 0).
C has another useful operation.
Definition 7.1.3. The conjugate of z = x + yi is z̄ := x − yi.
This corresponds to reflection in the horizontal axis R.
Note: conjugation z ↦ z̄ is a continuous function C → C.
Theorem 7.1.4. For z, w ∈ C,
1. (z + w)‾ = z̄ + w̄,
2. (zw)‾ = z̄ w̄,
3. if z = x + iy, then z + z̄ = 2 Re(z) = 2x and z − z̄ = 2i Im(z) = 2iy,
4. z z̄ = |z|² ≥ 0, with equality iff z = 0.
Definition 7.1.5. We can extend |x| to |z|:
|z| := (z z̄)^{1/2}, or |x + iy| := √(x² + y²).
Note: |·| : C → R⁺ is a continuous function.
Theorem 7.1.6.
1. |z̄| = |z|.
2. |zw| = |z||w|.
Proof. HW: show |zw|² = |z|²|w|² and take √.
3. |Re z| ≤ |z|.
Proof. a² ≤ a² + b², then take √.
4. |z + w| ≤ |z| + |w|.
Proof.
|z + w|² = (z + w)(z̄ + w̄) = z z̄ + z w̄ + z̄ w + w w̄
    = |z|² + 2 Re(z w̄) + |w|²
    ≤ |z|² + 2|z w̄| + |w|²   (by 3)
    = |z|² + 2|z||w| + |w|²
    = (|z| + |w|)².
Note that (x, y) · (0, 1) = (−y, x), so multiplying by i corresponds to rotation by π/2 (ccw). In general, if |α| = 1, then z ↦ αz corresponds to a rotation of z about 0.
e^{iθ} = ∑_{n=0}^∞ (iθ)^n/n!
    = ∑_{k=0}^∞ (iθ)^{2k}/(2k)! + ∑_{k=0}^∞ (iθ)^{2k+1}/(2k+1)!
    = ∑_{k=0}^∞ (−1)^k θ^{2k}/(2k)! + i ∑_{k=0}^∞ (−1)^k θ^{2k+1}/(2k+1)!
    = cos θ + i sin θ = (cos θ, sin θ).
This shows that any complex number with unit norm can be written e^{iθ}, and that any e^{iθ} has unit norm.
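This identity can be sanity-checked numerically by summing the series in real arithmetic, exactly following the even/odd split above (the helper exp_i is ours):

```python
import math

def exp_i(theta, terms=30):
    """Partial sum of sum_n (i*theta)^n / n!, tracked as (real, imag)."""
    re = im = 0.0
    z_re, z_im = 1.0, 0.0            # current term (i*theta)^n / n!, n = 0
    for n in range(terms):
        re += z_re
        im += z_im
        # next term: multiply by (i*theta), divide by (n + 1)
        z_re, z_im = -z_im * theta / (n + 1), z_re * theta / (n + 1)
    return re, im
```

The returned pair matches (cos θ, sin θ), and in particular has unit norm.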
Definition 7.1.7. An ε-ball around a point z ∈ C is
B(z, ε) = {z + re^{iθ} : 0 ≤ r < ε, θ ∈ R} = {w ∈ C : |z − w| < ε}.
This means we can still use the same notation to discuss complex-valued functions; e.g., continuity is
∀ε > 0, ∃δ > 0, |z − w| < δ =⇒ |f(z) − f(w)| < ε.
Functions of a complex variable are discussed in a different class, but we can still
discuss complex-valued functions of a real variable, i.e. f : R→ C.
If f, g are real functions, then Z(x) = f(x) + g(x)i is a complex function. Most (but
not all) theorems will remain true for complex functions. Exceptions: without order, there
is no notion of “between” in the range, so IVT, Bolzano, MVT, Taylor remainder don’t
make much sense.
One can always decompose a complex-valued function f(x) = u(x) + iv(x), where
u(x) = Re(f(x)) and v(x) = Im(f(x)).
IDEA: since the domain is 1-dimensional, the image will be a curve (1-dimensional) in C,
which is topologically identical to R2.
Theorem 7.1.8. If f : [a, b] → C and |f| ∈ R[a, b], then
|∫_a^b f(x) dx| ≤ ∫_a^b |f(x)| dx.
Proof. Use ϕ(t) = |t| to get |f| ∈ R by composition. Since z = ∫f ∈ C, there is some α ∈ C, |α| = 1, such that αz ∈ R, i.e., αz = |z|. Let u = Re(αf). Then u ≤ |αf| = |f|, and since ∫αf = αz is real, ∫αf = ∫u. So
|∫f| = α∫f = ∫αf = ∫u ≤ ∫|f|.
§7.1 Exercise: # Recommended: #
1.
7.2 Numerical Series and Sequences
7.2.1 Convergence and absolute convergence
Definition 7.2.1. An (infinite) series is a sum of a sequence {a_k}:
∑_{k=0}^∞ a_k = a_0 + a_1 + a_2 + · · · .
To make it clear that the terms of the sequence {a_k} are added in order, define
∑_{k=0}^∞ a_k = lim_{n→∞} ∑_{k=0}^n a_k = lim s_n,
where s_n := ∑_{k=0}^n a_k. The series ∑a_k converges or diverges as the sequence s_n does.
Thus, a series is any sequence which can be written in a certain simple recursive form:
s_n = s_{n−1} + f(n).
Example 7.2.1. Geometric series: 1 + r + r² + · · · = ∑_{k=0}^∞ r^k is the limit of
s_n = 1 + r + r² + · · · + r^n = s_{n−1} + r^n.
Example 7.2.2. Harmonic series: 1 + 1/2 + 1/3 + · · · = ∑_{k=1}^∞ 1/k is the limit of
s_n = 1 + 1/2 + 1/3 + · · · + 1/n = s_{n−1} + 1/n.
Definition 7.2.2. A telescoping series is one that can be written in the form
∑_{k=0}^∞ (a_{k+1} − a_k).
A telescoping series has partial sums
s_n = (a_1 − a_0) + (a_2 − a_1) + (a_3 − a_2) + · · · + (a_n − a_{n−1}) = a_n − a_0.
So the sum can be found as
∑_{k=0}^∞ (a_{k+1} − a_k) = lim s_n = lim a_n − a_0.
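The collapsing of partial sums is easy to verify numerically (the helper telescoping_sum and the choice a(k) = 1/(k+1) are ours):

```python
def telescoping_sum(a, n):
    """Partial sum s_n = sum_{k=0}^{n-1} (a(k+1) - a(k)); equals a(n) - a(0)."""
    return sum(a(k + 1) - a(k) for k in range(n))

def a(k):
    """Sample sequence: a_k = 1/(k+1), so the telescoping sum tends to -1."""
    return 1.0 / (k + 1)
```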
Given any sequence, this provides a way to write a series which has, as partial sums, the terms of the original sequence:
1. Start with any sequence {x_n}.
2. Define a_0 = x_0 and, for n ≥ 1, a_n := x_n − x_{n−1}.
3. Then x_n is the nth partial sum:
∑_{k=0}^n a_k = x_0 + (x_1 − x_0) + (x_2 − x_1) + · · · + (x_n − x_{n−1}) = x_n.
Theorem 7.2.3. ∑a_n converges =⇒ a_n → 0.
Proof. HW.
Let s_n be the nth partial sum of the series and S = lim s_n. Then s_n = s_{n−1} + a_n, so a_n = s_n − s_{n−1} and
lim a_n = lim(s_n − s_{n−1}) = lim s_n − lim s_{n−1} = S − S = 0.
Theorem 7.2.4 (Tail-convergence). ∑_{n=0}^∞ a_n converges ⇐⇒ ∑_{n=N}^∞ a_n converges for every N ⇐⇒ ∑_{n=N_0}^∞ a_n converges for some N_0.
Proof. Idea: lim_n s_n = lim_n s_{n+N}.
Theorem 7.2.5 (Cauchy Criterion for series). ∑a_n converges iff
∀ε > 0, m > n >> 1 =⇒ |∑_{k=n+1}^m a_k| < ε.
Proof. This is just the Cauchy Criterion for sequences applied to the partial sums: for n < m,
|s_n − s_m| = |∑_{k=0}^n a_k − ∑_{k=0}^m a_k| = |∑_{k=n+1}^m a_k|.
Theorem 7.2.6 (Linearity). ∀p, q ∈ R, if ∑a_n and ∑b_n converge, then ∑(p a_n + q b_n) converges and
∑(p a_n + q b_n) = p ∑a_n + q ∑b_n.
Proof. lim(p s_n + q t_n) = p lim s_n + q lim t_n.
NOTE: series are not multiplicative the way sequences are: ∑a_n b_n ≠ (∑a_n)(∑b_n), because
a_1 b_1 + a_2 b_2 + · · · + a_n b_n ≠ (a_1 + · · · + a_n)(b_1 + · · · + b_n).
(More cross terms on the right.)
Theorem 7.2.7 (Increasing & bounded). If a_n ≥ 0, ∀n, then ∑a_n converges iff the partial sums are bounded.
Proof. (⇒) lim s_n exists =⇒ {s_n} bounded.
(⇐) s_n = s_{n−1} + a_n ≥ s_{n−1}, so {s_n} is monotone. Then {s_n} bounded implies {s_n} convergent, by completeness.
Definition 7.2.8. ∑a_n is absolutely convergent iff ∑|a_n| converges. ∑a_n is conditionally convergent iff ∑|a_n| diverges but ∑a_n converges.
Example 7.2.3.
1. For a positive-term series, convergence ≡ absolute convergence.
2. ∑(−1)^n/2^n and ∑(−1)^n/n! are absolutely convergent, since ∑1/2^n and ∑1/n! are convergent.
3. ∑(−1)^n/n is conditionally convergent, since the harmonic series diverges.
Most comparison tests actually establish absolute convergence. In a moment, we’ll see
that absolute convergence implies convergence, so this is a stronger result.
Theorem 7.2.9 ((Direct) Comparison Thm). If |a_n| ≤ b_n, ∀n, then
∑b_n converges =⇒ ∑|a_n| converges.
In this case, |∑a_n| ≤ ∑b_n.
Proof. Fix ε > 0 and apply the Cauchy Criterion and the ∆-inequality:
|∑_{k=n}^m a_k| ≤ ∑_{k=n}^m |a_k| ≤ ∑_{k=n}^m b_k < ε, for n, m >> 1.
The contrapositive of this last theorem is also quite helpful:
0 ≤ a_n ≤ b_n and ∑a_n diverges to ∞ =⇒ ∑b_n diverges to ∞.
Theorem 7.2.10 (Absolute convergence thm). ∑|a_n| converges =⇒ ∑a_n converges.
Proof 1. Split the series into positive and negative components:
a_n^+ := max{a_n, 0} = |a_n| if a_n ≥ 0, and 0 if a_n ≤ 0;
a_n^− := −min{a_n, 0} = |a_n| if a_n ≤ 0, and 0 if a_n ≥ 0.
Then a_n = a_n^+ − a_n^−, and the Comparison Thm gives
0 ≤ a_n^+ ≤ |a_n| =⇒ ∑a_n^+ ≤ ∑|a_n| < ∞,
and the same for a_n^−. The Linearity Thm lets us combine the two convergent series: ∑a_n = ∑a_n^+ − ∑a_n^−.
Proof 2. Apply the Cauchy criterion to |∑_{k=n}^m a_k| ≤ ∑_{k=n}^m |a_k|.
Completeness is what implies the Comparison Thm, so it also implies this theorem. Strangely, this property is equivalent to completeness! First, we need a couple of definitions.
Definition 7.2.11. A vector space is a set X where any two elements of X can be added, or multiplied by a number in R. (There are more details, but this is all we'll need.)
Definition 7.2.12. A norm on a vector space X is a function ‖·‖ : X → R that satisfies
(i) ‖x‖ ≥ 0, with ‖x‖ = 0 ⇐⇒ x = 0,
(ii) ‖ax‖ = |a|·‖x‖, ∀a ∈ R,
(iii) ‖x − z‖ ≤ ‖x − y‖ + ‖y − z‖, ∀x, y, z ∈ X.
NOTE: the scalars in these two definitions can be replaced by Q, C, or any other field.
Example 7.2.4.
- R^n with ‖x‖ = (∑_{i=1}^n x_i²)^{1/2}.
- M_n(R) with ‖A‖ = ∑_{i,j=1}^n |a_{ij}|.
- The continuous functions on an interval, C(I), with ‖f‖_∞ = sup_{x∈I} |f(x)|.
- C(I) with ‖f‖_1 = ∫_I |f(x)| dx.
- C(I) with ‖f‖_2 = (∫_I |f(x)|² dx)^{1/2}.
Now we can show that in any normed vector space, completeness (defined as conver-
gence of Cauchy sequences) is equivalent to summability of absolutely convergent series.
Theorem 7.2.13. Suppose we have a vector space (X, ‖·‖). Then X is complete iff every
absolutely convergent series in X converges.
Proof. (⇒) Suppose that every Cauchy sequence in X converges and that ∑_{k=1}^∞ ‖x_k‖ converges. Must show that ∑_{k=1}^∞ x_k converges.
We show that the sequence of partial sums is Cauchy, hence converges. Let s_n = ∑_{k=1}^n x_k. Then for n > m, we have
‖s_n − s_m‖ = ‖∑_{k=1}^n x_k − ∑_{k=1}^m x_k‖ = ‖∑_{k=m+1}^n x_k‖
    ≤ ∑_{k=m+1}^n ‖x_k‖   (∆ ineq)
    < ε, for m >> 1,
since ∑_{k=1}^∞ ‖x_k‖ converges, so ∑_{k=N}^∞ ‖x_k‖ → 0 as N → ∞, by the Tail-Convergence Thm.
(⇐) Suppose that ∑_{k=1}^∞ ‖x_k‖ converges =⇒ ∑_{k=1}^∞ x_k converges. Use this to show that any Cauchy sequence converges.
Let {x_n} be Cauchy. Then
∀ε > 0, ∃N such that m, n ≥ N =⇒ ‖x_n − x_m‖ < ε, so
∀j ∈ N, ∃n_j such that m, n ≥ n_j =⇒ ‖x_n − x_m‖ < 1/2^j.
So we can find a subsequence {x_{n_j}}, choosing n_1 < n_2 < · · · . Define
y_1 = x_{n_1},   y_j = x_{n_j} − x_{n_{j−1}}, j > 1.
Then ∑_{j=1}^k y_j = x_{n_k} (by telescoping), and ‖y_j‖ < 1/2^{j−1} for j > 1, so
∑_{j=1}^∞ ‖y_j‖ ≤ ‖y_1‖ + ∑_{j=1}^∞ 1/2^j = ‖y_1‖ + 1 < ∞.
So lim x_{n_k} = ∑y_j exists, i.e., x_{n_j} → x ∈ X. Since {x_n} is Cauchy, it must also converge to the same limit (REC HW): for m, n ≥ N with N >> 1,
‖x_n − x‖ = ‖x_n − x_{n_k} + x_{n_k} − x‖ ≤ ‖x_n − x_{n_k}‖ + ‖x_{n_k} − x‖ < 2ε.
Recap: (⇒) comes by writing a series as a sequence, (⇐) comes by writing a sequence
as a series, using the telescoping trick.
Theorem 7.2.14 (Ratio test). Suppose a_n ≠ 0 for n >> 1, and |a_{n+1}/a_n| < r < 1 for n >> 1. Then ∑a_n converges absolutely. Conversely, if |a_{n+1}/a_n| > r > 1 for n >> 1, then ∑a_n diverges.
Proof. Compare ∑|a_n| to a geometric series.
Case (1): r < 1. Pick M such that r < M < 1. Then for n >> 1, we get a recursion relation
|a_{n+1}/a_n| < M =⇒ |a_{n+1}| < |a_n| M.
Applying this to a_{N+k} and iterating,
|a_{N+k}| < |a_{N+k−1}| M < |a_{N+k−2}| M² < · · · < |a_N| M^k, so
∑_{k=0}^∞ |a_{N+k}| ≤ ∑_{k=0}^∞ |a_N| M^k = |a_N| ∑_{k=0}^∞ M^k.
The RHS converges (geometric), so the Comparison Thm gives convergence of the LHS. Then the Tail-Convergence Thm gives convergence of ∑|a_n|.
Case (2): r > 1. Then |a_n| is eventually increasing, so a_n ↛ 0 and ∑a_n diverges by the nth-term test.
Theorem 7.2.15 (Root test). Suppose |a_n|^{1/n} < r < 1 for n >> 1. Then ∑a_n converges absolutely. Similarly, if |a_n|^{1/n} > r > 1 for n >> 1, then ∑a_n diverges.
Proof. By cases, as for the Ratio Test.
The Ratio Test is easier and more common, but the Root Test is stronger:
Theorem 7.2.16. If |a_{n+1}/a_n| < r for n >> 1, then |a_n|^{1/n} < r for n >> 1.
Proof. HW.
Theorem 7.2.17 (Integral Test). Suppose f(x) ≥ 0 and f is decreasing for x ≥ N ∈ N. Then ∑f(n) converges iff ∫_N^∞ f(x) dx converges.
Proof. Define the area A_n := ∫_n^{n+1} f(x) dx. The rectangle under the graph on [n, n+1] has area f(n+1)·|(n+1) − n| = f(n+1). Thus,
0 ≤ f(n+1) ≤ A_n, for n ≥ N.
Shifting one unit to the right, the region under the graph is contained in the rectangle of height f(n), so
0 ≤ A_n ≤ f(n), for n ≥ N.
Summing these bounds gives
∑_{k=N+1}^{N+n} f(k) ≤ ∫_N^{N+n} f(x) dx ≤ ∑_{k=N}^{N+n−1} f(k),
and so the series and the integral converge or diverge together.
Theorem 7.2.18 (p-series). ∑1/n^p converges iff p > 1.
Proof. For p ≥ 0, 1/n^p is decreasing, so apply the integral test to
∫_1^∞ (1/x^p) dx = lim_{r→∞} ∫_1^r (1/x^p) dx = lim_{r→∞} (r^{1−p} − 1)/(1 − p) for p ≠ 1, and lim_{r→∞} log r for p = 1.
For p = 1, log r → ∞ and both diverge.
For p > 1, r^{1−p} → 0, so both are finite.
For 0 ≤ p < 1, r^{1−p} → ∞, so both diverge.
Finally, consider p < 0 and put q = −p > 0. Then ∑n^q diverges by the nth-term test.
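Partial sums make the dichotomy visible (the helper partial_sum is ours; the numbers illustrate, they don't prove anything):

```python
def partial_sum(p, n):
    """s_n = sum_{k=1}^{n} 1/k^p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))
```

For p = 2 the partial sums stay below the limit π²/6 ≈ 1.645, while for p = 1 they keep growing by about log 10 ≈ 2.3 per extra factor of 10 in n.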
Theorem 7.2.19 (Asymptotic comparison test). If lim |a_n|/|b_n| = L, where L ∈ (0, ∞), then
∑|a_n| converges ⇐⇒ ∑|b_n| converges.
Proof. HW: reduce to Direct Comparison for n >> 1.
Theorem 7.2.20 (Alternating series test). If {a_n} is positive and decreasing with a_n → 0, then ∑(−1)^n a_n converges.
Proof. Briefly postponed.
Corollary 7.2.21. For an alternating series, e_n = |s_n − S| < a_{n+1}, where S = ∑(−1)^n a_n.
Proof. HW.
Example 7.2.5. |sin x − x| < |x|³/3!, etc.
Theorem 7.2.22 (Cauchy's condensation test). Suppose a_1 ≥ a_2 ≥ · · · ≥ 0. Then
∑_{n=1}^∞ a_n converges ⇐⇒ ∑_{k=0}^∞ 2^k a_{2^k} converges.
Compare the terms involved:
a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, . . .
a_1, 2a_2, 4a_4, 8a_8, . . .
Sketch of proof. Since both are positive-term series, it is enough to show boundedness of partial sums. Define
s_n := a_1 + a_2 + · · · + a_n,
t_k := a_1 + 2a_2 + · · · + 2^k a_{2^k}.
Note that n and k are independent here. Grouping terms and using monotonicity: for n < 2^{k+1}, s_n ≤ t_k; for n ≥ 2^k, 2s_n ≥ t_k. Hence the partial sums {s_n} and {t_k} are either both bounded or both unbounded.
7.2.2 Rearrangements
Theorem 7.2.23. If ∑a_n is absolutely convergent, then any rearrangement of it is also convergent, and has the same sum.
Proof. First, suppose a_n ≥ 0. Let ∑a′_n denote a rearrangement of the original series, with partial sums s_n and s′_n. Fix ε > 0.
The hypothesis means that (by Tail-convergence) for some N,
∑_{k=N}^∞ |a_k| < ε.
By going far enough in the rearranged series, we can ensure that
{a_1, a_2, . . . , a_N} ⊆ {a′_1, a′_2, . . . , a′_p},
so that
∑_{k=p+1}^∞ |a′_k| ≤ ∑_{k=N}^∞ |a_k| < ε,
and hence |s_n − s′_n| < 2ε for n >> 1, so the two series have the same sum.
Theorem 7.2.24. If ∑a_n is conditionally convergent, then for any x ∈ R, there is a rearrangement which sums to x (or which diverges to ±∞).
Sketch of proof. To be conditionally convergent, the series must have infinitely many positive and negative terms, so separate it into ∑a_n^+ and ∑a_n^−, and reindex so that each is decreasing. For x ≥ 0, form a rearrangement as follows:
1. Add positive terms until the sum exceeds x: stop as soon as ∑_{j=1}^J a_j^+ ≥ x.
2. Subtract negative terms until the sum drops below x: stop as soon as ∑_{j=1}^J a_j^+ − ∑_{k=1}^K a_k^− ≤ x.
3. Repeat.
Since ∑a_n^+ = ∞ and ∑a_n^− = ∞, neither step (1) nor step (2) can go on for infinitely many steps. Since ∑a_n is conditionally convergent, a_n → 0, and each crossing of x overshoots by at most the size of the last term used, so the partial sums converge to x.
To make the series diverge to ∞ instead, add positive terms until the sum exceeds 1 more than the first negative term; then add that negative term. Repeat.
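The greedy procedure in the sketch can be run directly on the alternating harmonic series 1 − 1/2 + 1/3 − · · · , which converges conditionally (the function rearrange_to is ours):

```python
import math

def rearrange_to(x, n_terms=10000):
    """Greedily rearrange 1 - 1/2 + 1/3 - ... : add the next positive term
    1, 1/3, 1/5, ... while the partial sum is <= x; otherwise subtract
    the next negative term 1/2, 1/4, ... (steps 1-3 of the sketch)."""
    s = 0.0
    pos, neg = 1, 2          # next odd / even denominator
    for _ in range(n_terms):
        if s <= x:
            s += 1.0 / pos
            pos += 2
        else:
            s -= 1.0 / neg
            neg += 2
    return s
```

The same terms, reordered, can be made to approach log 2 (the original sum) or any other target.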
Absolute convergence allows for the possibility of working with double sums: ∑_i ∑_j a_{ij}.
Example 7.2.6. Define a doubly indexed series by
a_{ij} = 0 for i < j, −1 for i = j, and 2^{j−i} for i > j.
As an array (row sums at right, column sums at bottom):
 −1    0    0    0   . . .  | −1
 1/2  −1    0    0   . . .  | −1/2
 1/4  1/2  −1    0   . . .  | −1/4
 1/8  1/4  1/2  −1   . . .  | −1/8
  .    .    .    .
  0    0    0    0   . . .    0 or −2?
Each column sums to 0, so summing columns first gives 0; but row i sums to −2^{1−i}, so summing rows first gives −2. The double series is not absolutely summable, so the order of summation matters.
7.2.3 Summation by parts
Theorem 7.2.25 (Summation by parts). Given two sequences {a_n}, {b_n}, write A_n = ∑_{k=0}^n a_k and A_{−1} := 0. Then
∑_{n=p}^q a_n b_n = ∑_{n=p}^{q−1} A_n (b_n − b_{n+1}) + A_q b_q − A_{p−1} b_p.
Proof. Bang it out.
Theorem 7.2.26 (Dirichlet's test). Suppose that the partial sums A_n = ∑_{i=1}^n a_i form a bounded sequence, and suppose there is a sequence {b_i} with b_i ≥ b_{i+1} ≥ 0 and b_i → 0. Then ∑a_i b_i converges.
Proof. We will use the Cauchy criterion on ∑a_i b_i. Let |A_n| ≤ M and fix ε > 0. For some N, we have b_N < ε, and for q ≥ p ≥ N, summation by parts gives
|∑_{n=p}^q a_n b_n| = |∑_{n=p}^{q−1} A_n (b_n − b_{n+1}) + A_q b_q − A_{p−1} b_p|
    ≤ M (∑_{n=p}^{q−1} (b_n − b_{n+1}) + b_q + b_p)   (b_n ≥ 0 and b_n − b_{n+1} ≥ 0)
    = M ((b_p − b_q) + b_q + b_p) = 2M b_p
    ≤ 2M b_N < 2Mε.
By K-ε, the Cauchy criterion gives convergence.
Theorem 7.2.27 (Alternating series test). If {c_n} is positive and strictly decreasing with c_n → 0, then ∑(−1)^n c_n converges.
Proof. Use Dirichlet's Test with a_n = (−1)^n and b_n = c_n.
Theorem 7.2.28 (Cauchy Product). Suppose that ∑a_n = A and ∑b_n = B, and at least one of them converges absolutely. Define c_n := ∑_{k=0}^n a_k b_{n−k}. Then ∑c_n = AB.
Proof. Wlog, suppose it is ∑a_n that converges absolutely, and define
A_n := ∑_{k=0}^n a_k,   B_n := ∑_{k=0}^n b_k,   C_n := ∑_{k=0}^n c_k,   β_n := B_n − B.
Thus B + β_n = B_n. Then we use an error-term estimate:
C_n = a_0 b_0 + (a_0 b_1 + a_1 b_0) + · · · + (a_0 b_n + a_1 b_{n−1} + · · · + a_n b_0)
    = a_0 B_n + a_1 B_{n−1} + · · · + a_n B_0
    = a_0 (B + β_n) + a_1 (B + β_{n−1}) + · · · + a_n (B + β_0)
    = A_n B + a_0 β_n + · · · + a_n β_0.
Since A_n B → AB, letting e_n := a_0 β_n + · · · + a_n β_0, it will suffice to show e_n → 0.
To use the absolute convergence of ∑a_n, let α := ∑|a_n|.
Fix ε > 0. Since ∑b_n converges, β_n → 0, so choose N such that n ≥ N =⇒ |β_n| < ε. Then
|e_n| ≤ |a_0 β_n + · · · + a_{n−N−1} β_{N+1}| + |a_{n−N} β_N + · · · + a_n β_0|
    ≤ εα + |a_{n−N} β_N + · · · + a_n β_0|.
Now since a_n → 0, for N fixed and n >> 1 we can make |a_{n−N} β_N + · · · + a_n β_0| < ε. Then |e_n| < (α + 1)ε.
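A numerical instance with two absolutely convergent geometric series (the helper cauchy_product is ours): ∑(1/2)^n = 2 and ∑(1/3)^n = 3/2, so the product series should sum to 3.

```python
def cauchy_product(a, b):
    """Partial sum of sum_n c_n, with c_n = sum_{k=0}^n a_k * b_{n-k}."""
    n_terms = min(len(a), len(b))
    c = [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]
    return sum(c)

N = 60
a = [0.5 ** n for n in range(N)]
b = [(1.0 / 3.0) ** n for n in range(N)]
prod = cauchy_product(a, b)
```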
§7.2 Exercise: # Recommended: #
1. ∑a_n converges =⇒ a_n → 0.
2. If {x_n} is Cauchy in R and some subsequence {x_{n_k}} converges to x ∈ R, then prove the full sequence {x_n} also converges to x.
3. If |a_{n+1}/a_n| < r for n >> 1, then |a_n|^{1/n} < r for n >> 1.
4. If lim |a_n|/|b_n| = 1, then ∑|a_n| converges ⇐⇒ ∑|b_n| converges.
7.3 Uniform convergence
What does it mean to say a sequence of functions {f_n} converges? I.e., how should we define lim f_n(x) = f(x)? There are different (nonequivalent) ways to define such a limit.
What does it mean to say a sum of functions ∑f_n converges? I.e., how should we define ∑f_n(x) = f(x)? For example, power series have f_n(x) = a_n x^n. What about other kinds of functions?
We want to know when operations like the following are valid, for f(x) = ∑f_n(x):
f′(x) =? ∑f′_n(x),
∫f(x) dx =? ∑∫f_n(x) dx.
The Gamma function is defined by Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt. Is it valid to compute
Γ′(x) =? ∫_0^∞ ∂/∂x (t^{x−1} e^{−t}) dt = ∫_0^∞ t^{x−1} (log t) e^{−t} dt?
These operations all involve interchanging the order of limits; series, integrals and
derivatives are all defined in terms of limits.
Definition 7.3.1. Let {f_n} be a sequence of functions all defined on some common domain I. Then f_n converges pointwise iff lim_n f_n(x) exists for every x ∈ I. In this case, we can define the limit function by
f(x) := lim_n f_n(x),
and write f_n → f pointwise. The definition is equivalent to:
∀ε > 0, ∀x ∈ I, ∃N, n ≥ N =⇒ f_n(x) ≈_ε f(x).
Example 7.3.1. Let f_n(x) = x/(x + n) on R. Then
lim_{x→∞} lim_{n→∞} f_n(x) = lim_{x→∞} 0 = 0, but
lim_{n→∞} lim_{x→∞} f_n(x) = lim_{n→∞} 1 = 1.
Example 7.3.2. Let f_n(x) = x^n on I = [0, 1]. Then {f_n} converges pointwise with
f(x) = 0 for 0 ≤ x < 1, and f(1) = 1.
So a sequence of continuous functions can converge pointwise to something which is not continuous! In fact, f_n ∈ C^∞(I), but f ∉ C(I)!
Even worse:
Example 7.3.3. Let f_k(x) = lim_{n→∞} (cos k!xπ)^{2n}. Then whenever k!x is an integer, f_k(x) = 1. If x = p/q is rational, then for k ≥ q, f_k(x) = 1. If k!x is not an integer (for example, if x is irrational), then f_k(x) = 0. We obtain an everywhere discontinuous limit function
f(x) = lim_{k→∞} lim_{n→∞} (cos k!xπ)^{2n} = 0 for x ∈ R \ Q, and 1 for x ∈ Q.
Example 7.3.4. Let f_n(x) = x²/(1 + x²)^n on R and consider
f(x) = ∑_{n=0}^∞ f_n(x) = ∑_{n=0}^∞ x²/(1 + x²)^n = 0 for x = 0, and 1 + x² for x ≠ 0,
since the series is geometric for x ≠ 0. So a series of continuous functions can converge pointwise to something which is not continuous! (Not even integrable!)
Example 7.3.5. Let f_n(x) = (sin nx)/√n on R. Then
f(x) = lim_{n→∞} f_n(x) = 0, ∀x ∈ R,
so f′(x) = 0. On the other hand, f′_n(x) = √n cos(nx), so that lim_{n→∞} f′_n(x) ≠ f′(x).
Example 7.3.6. Let f_n(x) = n²x(1 − x²)^n on [0, 1]. Then lim_{n→∞} f_n(x) = 0 for any x ∈ [0, 1]. Thus, trivially, ∫_0^1 lim_{n→∞} f_n(x) dx = 0. However,
lim_{n→∞} ∫_0^1 f_n(x) dx = lim_{n→∞} n²/(2n + 2) = ∞.
CONCLUSION: pointwise convergence sucks.
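Example 7.3.6 is easy to reproduce numerically (the names f_n and riemann_integral are ours; a simple midpoint rule stands in for the Riemann sums):

```python
def f_n(n, x):
    """f_n(x) = n^2 * x * (1 - x^2)^n, which -> 0 pointwise on [0, 1]."""
    return n ** 2 * x * (1.0 - x * x) ** n

def riemann_integral(n, steps=50000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1];
    the exact value is n^2 / (2n + 2)."""
    h = 1.0 / steps
    return h * sum(f_n(n, (i + 0.5) * h) for i in range(steps))
```

The pointwise values shrink while the integrals grow roughly like n/2.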
7.3.1 Definition of uniform convergence
Definition 7.3.2. Let {f_n} be a sequence of functions with a common domain I. The sequence converges uniformly to f on I iff there exists some function f for which
∀ε > 0, ∃N, n ≥ N =⇒ f_n(x) ≈_ε f(x), ∀x ∈ I.
We write f_n → f uniformly.
NOTE: ∀x appears at the end: N does not depend on x. This is the "uniform" nature of the convergence; one N works globally for all of I.
NOTE: uniform convergence implies pointwise convergence.
Theorem 7.3.3. Suppose f(x) = lim_n f_n(x) pointwise. Then
f_n → f uniformly ⇐⇒ sup_{x∈I} |f_n(x) − f(x)| → 0 as n → ∞.
Proof. |f_n(x) − f(x)| < ε, ∀x, is equivalent to the condition sup_{x∈I} |f_n(x) − f(x)| ≤ ε.
Example 7.3.7. x^n does not converge uniformly on [0, 1).
For f(x) ≡ 0, sup |f_n(x) − f(x)| = 1 ↛ 0.
More directly, choose ε = 1/2. For any fixed n, one can find x close enough to 1 that x^n > 1/2 = ε.
Definition 7.3.4. ∑f_n converges pointwise or uniformly iff the corresponding sequence of partial sums converges pointwise or uniformly.
Example 7.3.8. ∑x^n/n! converges uniformly to e^x on any compact interval [−R, R], but not on R.
Since |c| < R =⇒ 0 < e^c < e^R, Taylor's Thm with Lagrange remainder gives
|e^x − (1 + x + x²/2 + · · · + x^n/n!)| ≤ e^c |x|^{n+1}/(n+1)! ≤ e^R R^{n+1}/(n+1)! → 0 as n → ∞.
To see that the convergence is not uniform on R, note that for any given (fixed) n,
n = 2k =⇒ lim_{x→−∞} s_n(x) = ∞,
n = 2k+1 =⇒ lim_{x→−∞} s_n(x) = −∞,
whereas lim_{x→−∞} e^x = 0. Hence the sup is ∞ for every n and cannot go to 0.
7.3.2 Criteria for uniform convergence
Theorem 7.3.5 (Cauchy Criterion). {f_n} converges uniformly on I iff
∀ε > 0, ∃N, m, n ≥ N =⇒ |f_n(x) − f_m(x)| < ε, ∀x.
Proof. HW. (⇒): use the ∆ ineq. (⇐): use the pointwise Cauchy Criterion to obtain the limit f.
Theorem 7.3.6 (Weierstrass M-test). Let {f_n} be defined on I and satisfy |f_n(x)| ≤ M_n, ∀x ∈ I. If ∑M_n converges, then ∑f_n(x) converges uniformly on I.
Proof. Fix ε > 0. Then
|∑_{i=n}^m f_i(x)| ≤ ∑_{i=n}^m |f_i(x)| ≤ ∑_{i=n}^m M_i < ε, ∀x ∈ I,
for n, m >> 1, because ∑M_n converges. The result follows from the previous thm.
Example 7.3.9. ∑(cos nx)/n² converges uniformly on R.
Note that |(cos nx)/n²| ≤ 1/n², and ∑1/n² converges.
In fact, ∑(cos f_n(x))/n² converges uniformly on R for arbitrary functions f_n(x).
7.3.3 Continuity and uniform convergence
Theorem 7.3.7. A uniform limit of continuous functions is continuous.
Proof. Suppose we have f_n → f uniformly, where each f_n ∈ C⁰(I). NTS: f is continuous at an arbitrary point c ∈ I. Given ε > 0, use uniform convergence to pick n such that
f_n(x) ≈_ε f(x), ∀x ∈ I.
Then, since this f_n is continuous at c, there is δ > 0 such that
x ≈_δ c =⇒ f_n(x) ≈_ε f_n(c).
Combine the two to obtain, for x ≈_δ c,
f(x) ≈_ε f_n(x) ≈_ε f_n(c) ≈_ε f(c) =⇒ f(x) ≈_{3ε} f(c).
In ∆-ineq form, we used estimates on the RHS of
|f(x)− f(c)| ≤ |f(x)− fn(x)|+ |fn(x)− fn(c)|+ |fn(c)− f(c)|.
Corollary 7.3.8. If f_n → f uniformly on I and each f_n is uniformly continuous on I, then f is uniformly continuous.
Proof. HW.
Corollary 7.3.9. If ∑f_n(x) converges uniformly on I and each f_n is continuous, then it converges to a continuous function. In particular, a power series is continuous inside its interval of convergence.
Proof. We’ll prove that it’s differentiable in a moment, so wait until then.
Theorem 7.3.10 (Unrestricted convergence). Let f_n, f ∈ C(D), where D is compact. Then f_n → f uniformly iff f_n(x_n) → f(x) whenever x_n → x.
Proof. Skip.
Theorem 7.3.11. Let f ∈ R[a, b] and |f(x)| ≤ B for a ≤ x ≤ b. Then there is a sequence {f_n} ⊆ C([a, b]) with |f_n(x)| ≤ B for all x ∈ [a, b] and n ∈ N such that
∫_a^b |f_n(x) − f(x)| dx → 0 as n → ∞.
Proof. Fix ε > 0. Since f is integrable, choose a partition P = {x_0, . . . , x_N} for which Osc(f, P) < ε/2. Now consider this partition as defining a step function g whose integral approximates f's: define g piecewise on P by
g(x) := sup_{t∈[x_{i−1}, x_i]} f(t), for x ∈ [x_{i−1}, x_i).
Clearly g is integrable, and ∫_a^b |f − g| dx ≤ Osc(f, P). Although g is not continuous, we can fix it to be. Let A := sup_i (M_i − m_i). Choose δ > 0 such that
(2δ)·N·A < ε and δ < (1/2) min |x_i − x_{i−1}|.
Now define h(x) to be equal to g, except in the δ-ball about each point of the partition, where we let h be the affine function joining (x_i − δ, g(x_i − δ)) to (x_i + δ, g(x_i + δ)).
7.3.4 Spaces of functions
The previous result may be rephrased as: C[a, b] is dense in R[a, b] in the ‖·‖_1-norm. Is C[a, b] dense in R[a, b] in the ‖·‖_∞-norm? No: consider the Heaviside function
H(x) = 0 for x < 0, and 1 for x ≥ 0.
For any f ∈ C(R), sup |H(x) − f(x)| ≥ 1/2 by the IVT, so the sup cannot be made less than ε for any ε < 1/2.
This is one manifestation of the fact that the topologies associated with the ‖·‖_1-norm and the ‖·‖_∞-norm are different. Here is another: a sequence which converges with respect to the ‖·‖_1-norm but not the ‖·‖_∞-norm. Define
f_n(x) := 2nx for x ∈ [0, 1/(2n)], 2 − 2nx for x ∈ [1/(2n), 1/n], and 0 else.
Then f_n is a triangular "tent function" with a symmetric peak of height 1 over the point x = 1/(2n) and support [0, 1/n]. {f_n} converges to f(x) ≡ 0 in the ‖·‖_1-norm (and also pointwise) but not in the ‖·‖_∞-norm, since sup |f_n(x) − f(x)| = 1 ↛ 0.
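The tent-function computation can be checked directly (the names tent and norm1 are ours): ‖f_n‖_1 = area = 1/(2n) → 0, while the peak keeps ‖f_n‖_∞ = 1.

```python
def tent(n, x):
    """The tent function f_n: peak 1 at x = 1/(2n), support [0, 1/n]."""
    if 0.0 <= x <= 1.0 / (2 * n):
        return 2 * n * x
    if 1.0 / (2 * n) < x <= 1.0 / n:
        return 2.0 - 2 * n * x
    return 0.0

def norm1(n, steps=100000):
    """Midpoint approximation of the L^1 norm of f_n on [0, 1]."""
    h = 1.0 / steps
    return h * sum(tent(n, (i + 0.5) * h) for i in range(steps))
```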
Definition 7.3.12. A step function is one which is locally constant except at finitely many points. Thus, a step function is usually defined in terms of a partition: it is constant on each subinterval and can have a jump discontinuity at each point of the partition.
Step(I) = pw-const(I) ⊆ pw-contin(I) ⊆ R(I),
C(I) ⊆ R(I),
monotone(I) ⊆ R(I).
7.3.5 Term-by-term integration
Theorem 7.3.13 (Integration of a uniform limit). Let f_n → f uniformly, where each f_n ∈ R[a, b]. Then f ∈ R[a, b] and lim ∫_a^b f_n(x) dx = ∫_a^b f(x) dx.
Proof. Put e_n := sup_{a≤x≤b} |f_n(x) − f(x)|, so that f_n − e_n ≤ f ≤ f_n + e_n. Then the upper and lower sums satisfy
∫_a^b (f_n − e_n) dx ≤ L(f, P) ≤ U(f, P) ≤ ∫_a^b (f_n + e_n) dx,   (∗)
and thus 0 ≤ Osc(f, P) ≤ 2e_n (b − a) → 0 as n → ∞. Thus f ∈ R[a, b]. Now (∗) becomes
∫_a^b (f_n − e_n) dx ≤ ∫_a^b f dx ≤ ∫_a^b (f_n + e_n) dx,
which gives
|∫_a^b f_n dx − ∫_a^b f dx| ≤ 2e_n (b − a) → 0 as n → ∞.
Theorem 7.3.14 (Term-by-term integration of a series). If f(x) = ∑f_k(x) converges uniformly on [a, b] and each f_k ∈ R[a, b], then ∫_a^b f dx = ∑∫_a^b f_k(x) dx.
Proof.
∫_a^b f dx = ∫_a^b (∑_{k=0}^∞ f_k(x)) dx = lim_{n→∞} ∫_a^b ∑_{k=0}^n f_k(x) dx   (prev thm)
    = lim_{n→∞} ∑_{k=0}^n ∫_a^b f_k(x) dx   (linearity).
Example 7.3.10 (Sawtooth function). It can be shown (using Fourier series) that
f(x) = π/2 − (4/π) ∑_{k=0}^∞ cos((2k+1)x)/(2k+1)²
converges to the function g(x) = x for 0 ≤ x ≤ π. By the Weierstrass M-test, it converges uniformly (prev example). Integrating term-by-term,
x²/2 = πx/2 − (4/π)(sin x + (sin 3x)/3³ + (sin 5x)/5³ + · · ·).
Since the sum converges uniformly, f(x) ∈ C⁰(R). In fact, f is 2π-periodic and an even function. Thus, f is the sawtooth: /\/\/\/\/\
Theorem 7.3.15 ((Baby) dominated convergence thm). Suppose f_n ∈ R[a, b] for every 0 < a < b < ∞, and suppose f_n → f uniformly on every compact subset of (0, ∞). If g ∈ R[0, ∞), then
|f_n| ≤ g =⇒ lim_{n→∞} ∫_0^∞ f_n(x) dx = ∫_0^∞ f(x) dx.
Proof. HW.
Proof. HW.
Theorem 7.3.16 (Stirling's Formula). lim_{x→∞} Γ(x+1)/((x/e)^x √(2πx)) = 1.
Proof. HW.
Often stated as lim_{n→∞} n!/((n/e)^n √(2πn)) = 1, meaning that n! ∼ (n/e)^n √(2πn).
7.3.6 Term-by-term differentiation
Example 7.3.11. Recall f_n(x) = (sin nx)/√n. This is uniformly dominated by 1/√n, so it converges uniformly to f ≡ 0, but f′_n(x) ↛ f′(x)! Not even uniform convergence can save us now! We need a stronger hypothesis.
Theorem 7.3.17. Let f_n ∈ C¹(I), f_n → f pointwise, and f′_n → g uniformly. Then f ∈ C¹(I) and f′(x) = g(x).
Proof. Fix a point a ∈ I. Then FToC1 gives
f_n(x) − f_n(a) = ∫_a^x f′_n(t) dt → ∫_a^x g(t) dt as n → ∞.
However, we also have f_n(x) − f_n(a) → f(x) − f(a), so apply FToC2 to
f(x) − f(a) = ∫_a^x g(t) dt
to see that f ∈ C¹(I) with f′(x) = g(x).
This can be strengthened:
Theorem 7.3.18. Suppose f_n ∈ C¹(I) and {f′_n} converges uniformly. If {f_n(c)} converges for some c ∈ I, then f_n → f ∈ C¹(I) uniformly and lim_{n→∞} f′_n(x) = f′(x).
Proof. Not for the faint of heart.
Corollary 7.3.19. Let f_k ∈ C¹(I). If ∑f_k converges pointwise and ∑f′_k converges uniformly, then f(x) := ∑f_k(x) ∈ C¹(I) and f′(x) = ∑f′_k(x).
Proof. Let s_n(x) := ∑_{k=0}^n f_k(x). Then s′_n(x) = ∑_{k=0}^n f′_k(x) converges uniformly, s_n ∈ C¹(I), and s_n → f pointwise, so apply the previous thm.
Example 7.3.12 (Sawtooth function). Recall the uniformly convergent series
f(x) = π/2 − (4/π) ∑_{k=0}^∞ cos((2k+1)x)/(2k+1)² = x, 0 ≤ x ≤ π.
Differentiating term-by-term,
1 = f′(x) ?=? (4/π)(sin x + (sin 3x)/3 + (sin 5x)/5 + · · ·).
To establish this equality, we'd need uniform convergence of the series on the right. Unfortunately, it doesn't converge uniformly on R. If it did, the prev thm would give f′ ∈ C⁰(R), but the sawtooth is clearly nondifferentiable at x = kπ.
It turns out that it does converge uniformly on each interval (kπ, (k+1)π). Moral: convergence of Fourier series can be subtle.
§7.3 Exercise: # Recommended: #
1. If f_n → f uniformly on I and each f_n is uniformly continuous on I, then f is uniformly continuous.
2. (Dominated convergence thm) Suppose f_n ∈ R[a, b] for every 0 < a < b < ∞, and suppose f_n → f uniformly on every compact subset of (0, ∞). If g ∈ R[0, ∞), then
|f_n| ≤ g =⇒ lim_{n→∞} ∫_0^∞ f_n(x) dx = ∫_0^∞ f(x) dx.
7.4 Power series
7.4.1 Radius of convergence
Definition 7.4.1. A power series is a series of the form ∑a_n x^n, where x is a variable. The nth term of a power series is f_n(x) = a_n x^n (rather than just a_n).
∑a_n x^n is a family of series, one for each value of x. We are interested in the subfamily corresponding to
A = {x ∈ R : ∑|a_n x^n| converges}.
Then we can define a function
f : A → R, f(x) = ∑a_n x^n.
Definition 7.4.2. For any power series ∑a_n x^n, ∃! R ≥ 0 such that
∑a_n x^n converges absolutely for |x| < R, and
∑a_n x^n diverges for |x| > R.
R is the radius of convergence of the power series. (Note: we may have R = 0.) By convention, R = ∞ iff the series converges ∀x ∈ R.
Now we must validate the assertion of the definition: that such an R exists. This will be shown by computing R explicitly.
Theorem 7.4.3. The radius R of a power series ∑c_n z^n is given by
1/R = limsup_{n→∞} |c_n|^{1/n}.
Proof. Suppose |z| < R, where R is defined as above. Observe that
limsup_{n→∞} |a_n|^{1/n} < 1 ⇐⇒ |a_n|^{1/n} ≤ r < 1 for n >> 1.
Thus we can apply the Root Test to ∑a_n, where a_n = c_n z^n:
limsup_{n→∞} |a_n|^{1/n} = |z| limsup_{n→∞} |c_n|^{1/n} = |z|/R < 1.
The case |z| > R follows BSA.
NOTE: sup A < p =⇒ ∃q, a ≤ q < p, ∀a ∈ A. (Proof: let q = sup A.)
Theorem 7.4.4 (Unif convergence of power series). If ∑a_n x^n has radius of convergence R, then the series converges uniformly on [−L, L] whenever 0 ≤ L < R.
Proof. We know ∑a_n x^n converges absolutely for |x| ≤ L < R, so apply the Weierstrass M-test with |a_n x^n| ≤ |a_n| L^n = M_n.
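The formula 1/R = limsup |c_n|^{1/n} suggests a crude numerical estimate when the n-th roots actually converge (the helper radius_estimate is ours; a single large n stands in for the limsup):

```python
def radius_estimate(coef, n=200):
    """Estimate R = 1 / |c_n|^(1/n) at one large n (adequate when the
    n-th roots converge, as in the two examples below)."""
    c = abs(coef(n))
    return float('inf') if c == 0.0 else 1.0 / (c ** (1.0 / n))

r_half = radius_estimate(lambda n: 2.0 ** n)       # c_n = 2^n   -> R = 1/2
r_three = radius_estimate(lambda n: 3.0 ** (-n))   # c_n = 3^-n  -> R = 3
```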
7.4.2 Analytic continuation
Definition 7.4.5. A function f is analytic iff it has a power series expansion about every
point in its domain.
Heaviside fails analyticity at only one point ...
Theorem 7.4.6. f is analytic iff Tn(a, x) → f(x), ∀a.
Theorem 7.4.7. If ∑a_n (x − c)^n converges with radius R, define f(x) := ∑a_n (x − c)^n. Then f has a power series expansion about each point c′ ∈ B(c, R), which converges on any B(c′, r) ⊆ B(c, R).
Corollary 7.4.8. Any function defined as a power series is automatically analytic.
If ∑a_n (x − c)^n converges to f in B(c, r), then a_n = f^{(n)}(c)/n! by Taylor's Thm. Thus,
∑a_n (x − c)^n = ∑b_n (x − c)^n =⇒ a_n = b_n, ∀n,
so the expansion is unique at c. However, one can have x ∈ B(c_1, R_1) ∩ B(c_2, R_2) with
f(x) = ∑a_n (x − c_1)^n = ∑b_n (x − c_2)^n, a_n ≠ b_n.
Definition 7.4.9. Moving from a power series expansion at one point to a power series
expansion about a different point is called analytic continuation (or magic).
Suppose you know the values of an analytic function f on an open set, so that you know them on some B(c, R). Since power series can be differentiated term-by-term on their interval of convergence, you can determine all the derivatives f^(n)(c), hence all the coefficients a_n = f^(n)(c)/n!, hence all values f(x) for x ∈ B(c, R). Then for any c′ ∈ B(c, R), choosing B(c′, r) ⊆ B(c, R), you can start the process over.
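The continuation process can be sketched numerically. A minimal example (my choice of function, not from the notes): f(x) = 1/(1 − x) has expansion ∑ x^n about c = 0 with R = 1, and re-expanding about c′ = 1/2 gives ∑ (x − 1/2)^n / (1/2)^{n+1} with radius 1/2; the two expansions must agree on the overlap of their intervals of convergence.

```python
# f(x) = 1/(1-x), expanded about 0 (radius 1) and re-expanded about 1/2 (radius 1/2).
def expand_at_0(x: float, terms: int = 60) -> float:
    """Partial sum of sum x^n, the expansion of 1/(1-x) about c = 0."""
    return sum(x ** n for n in range(terms))

def expand_at_half(x: float, terms: int = 60) -> float:
    """Partial sum of sum (x - 1/2)^n / (1/2)^(n+1), the expansion about c' = 1/2."""
    return sum((x - 0.5) ** n / 0.5 ** (n + 1) for n in range(terms))
```

At x = 0.7, which lies in B(0, 1) ∩ B(1/2, 1/2), both partial sums agree with the exact value 1/(1 − 0.7) to high accuracy.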
Theorem 7.4.10. If f is analytic on an interval containing (a, b) and f(x) = 0 for all x ∈ (a, b), then f ≡ 0.
Theorem 7.4.11. If f, g are analytic, then so are f ± g, f · g, f/g (where g ≠ 0), and f ◦ g (provided Im g ⊆ dom f).
§7.4 Exercise: # Recommended: #
1.
7.5 Approximation by polynomials
7.5.1 Convolution and approximate identities
Definition 7.5.1. The support of f is the closure of the largest open set on which f ≠ 0. Since the zero set of a continuous function is closed,

spt f = cl({x : f(x) = 0}^C) = cl({x : f(x) ≠ 0}).

Note that f need not be nonzero at every point of spt f; it may vanish on the boundary.

The support of f is where f does anything interesting. For example:

Theorem 7.5.2. For f ∈ R(D),

∫_D f(x) dx = ∫_{spt f} f(x) dx.
Definition 7.5.3. If f, g ∈ R(R), then their convolution f ∗ g is defined by

(f ∗ g)(x) := ∫ f(x − y) g(y) dy.
Convolution is a weighted average of translates.
NOTE: if at least one of f, g has compact support, then the convolution will exist. For
the rest of this section, we assume that functions have compact support, or equivalently,
that we are working on a compact interval.
Convolution as a product
Theorem 7.5.4. Suppose f, g, h ∈ R(D), D compact.

(i) (Linearity) f ∗ (g + h) = (f ∗ g) + (f ∗ h) and (cf) ∗ g = c(f ∗ g) = f ∗ (cg), ∀c ∈ C.

Proof. Immediate from linearity of the integral.

(ii) (Commutativity) f ∗ g = g ∗ f.

Proof. Change of variables: y ↦ x − y.

(iii) (Associativity) f ∗ (g ∗ h) = (f ∗ g) ∗ h.

Proof. Fubini's theorem: ∫∫ ϕ(x, y) dx dy = ∫∫ ϕ(x, y) dy dx.

(iv) (Fourier transform) The Fourier transform converts convolution to multiplication: (f ∗ g)ˆ(ξ) = fˆ(ξ) gˆ(ξ).
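A quick numerical sketch of these algebraic properties (the bump functions below are my own examples), approximating the convolution integral by a midpoint Riemann sum and checking commutativity at a sample point:

```python
def conv(f, g, x: float, a: float = -1.0, b: float = 1.0, n: int = 2000) -> float:
    """Midpoint Riemann sum for (f*g)(x) = integral of f(x-y) g(y) dy over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        y = a + (k + 0.5) * h
        total += f(x - y) * g(y)
    return total * h

f = lambda t: max(0.0, 1.0 - abs(t))   # tent function, supported in [-1, 1]
g = lambda t: max(0.0, 1.0 - t * t)    # parabolic bump, supported in [-1, 1]
```

With both supports inside [−1, 1], the discretized values of (f ∗ g)(0.3) and (g ∗ f)(0.3) agree to within the quadrature error.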
Convolution is a smoothing operation.
Theorem 7.5.5. For f, g ∈ R(D) with D compact, f ∗ g is continuous.
Proof. First, prove it for the case when f, g are continuous. Fix ε > 0 and c ∈ D. Compact support gives a bound for the continuous function f, say |f(x)| ≤ B. It also gives uniform continuity of g, so find δ > 0 such that

|x − c| < δ =⇒ |(x − y) − (c − y)| < δ =⇒ |g(x − y) − g(c − y)| < ε.

Now for |x − c| < δ, we have

|(f ∗ g)(x) − (f ∗ g)(c)| = |∫_D f(y)[g(x − y) − g(c − y)] dy|
≤ ∫_D |f(y)| · |g(x − y) − g(c − y)| dy
≤ Bε · |spt f|.
Now suppose that f, g are not necessarily continuous. Let {f_n} ⊆ C(D) be a sequence with |f_n(x)| ≤ B and ∫_D |f_n(x) − f(x)| dx → 0 as n → ∞, and similarly {g_n} ⊆ C(D). Then

f ∗ g − f_n ∗ g_n = (f − f_n) ∗ g + f_n ∗ (g − g_n).

Consequently, we have

|((f − f_n) ∗ g)(x)| ≤ ∫_D |f(x − y) − f_n(x − y)| · |g(y)| dy
≤ sup_{y∈D} |g(y)| · ∫_D |f(x − y) − f_n(x − y)| dy → 0 as n → ∞.

Note that this actually shows sup_{x∈D} |((f − f_n) ∗ g)(x)| → 0, so that (f − f_n) ∗ g → 0 uniformly on D. Similarly, f_n ∗ (g − g_n) → 0 uniformly on D. Since each f_n ∗ g_n is continuous, the uniform limit f ∗ g is, too.
In fact, convolution is almost magically smoothing:
Theorem 7.5.6. Let f, g ∈ C^1(D), D compact. Then f′ ∗ g = f ∗ g′ = (f ∗ g)′.

Proof. The first equality is immediate from integration by parts; the boundary terms vanish, since the functions have compact support. For the other equality,

[(f ∗ g)(x) − (f ∗ g)(c)] / (x − c)
= [1/(x − c)] ( ∫ f(x − y) g(y) dy − ∫ f(c − y) g(y) dy )
= ∫ ( [f(x − y) − f(c − y)] / (x − c) ) g(y) dy.

Thus, it suffices to see that [f(x − y) − f(c − y)]/(x − c) → f′(c − y) uniformly in y as x → c. The MVT gives a point z between c and x for which

f′(z − y) = [f(x − y) − f(c − y)] / (x − c).

Then f ∈ C^1(D) implies f′ is uniformly continuous on the compact set D, so

x → c =⇒ z → c =⇒ f′(z − y) → f′(c − y) uniformly in y.

That is, since f′ is uniformly continuous, we get δ such that |x − c| < δ implies sup_y |f′(z − y) − f′(c − y)| < ε.
Theorem 7.5.7. If f ∈ C^n and g ∈ C^m, then f ∗ g ∈ C^{n+m}.

Proof. By the previous theorem, (f ∗ g)^{(n+m)} = f^{(n)} ∗ g^{(m)}. Then f^{(n)} and g^{(m)} are continuous (by hypothesis), so their convolution is continuous.
In other words, convolving against a smooth function produces something which is at
least as smooth as your original function.
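Theorem 7.5.6 can be checked numerically. A sketch with a C^1 bump of my own choosing (chosen so the boundary terms vanish), comparing a finite-difference derivative of f ∗ g against f ∗ g′:

```python
def bump(t: float) -> float:
    """f(t) = (1 - t^2)^2 on [-1, 1]: C^1, vanishing with its derivative at ±1."""
    return (1.0 - t * t) ** 2 if abs(t) <= 1.0 else 0.0

def bump_prime(t: float) -> float:
    """f'(t) = -4t(1 - t^2) on [-1, 1]."""
    return -4.0 * t * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def conv(f, g, x: float, n: int = 2000) -> float:
    """Midpoint Riemann sum for (f*g)(x) over y in [-1, 1]."""
    h = 2.0 / n
    return sum(f(x - (-1.0 + (k + 0.5) * h)) * g(-1.0 + (k + 0.5) * h)
               for k in range(n)) * h

x0, eps = 0.3, 1e-4
lhs = (conv(bump, bump, x0 + eps) - conv(bump, bump, x0 - eps)) / (2 * eps)  # (f*g)'(x0)
rhs = conv(bump, bump_prime, x0)                                             # (f*g')(x0)
```

The two quantities agree up to the quadrature and finite-difference error.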
Proposition 7.5.8. If g is a polynomial and f is continuous with compact support, then
f ∗ g is a polynomial.
Proof. Let g(x) = ∑_{k=0}^n a_k x^k. Then

(f ∗ g)(x) = ∫ g(x − y) f(y) dy = ∫ ( ∑_{k=0}^n a_k (x − y)^k ) f(y) dy
= ∫ ∑_{k=0}^n ∑_{j=0}^k (k choose j) (−1)^{k−j} a_k x^j y^{k−j} f(y) dy
= ∑_{j=0}^n [ ∑_{k=j}^n (k choose j) (−1)^{k−j} a_k ∫ y^{k−j} f(y) dy ] x^j.
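A concrete check of this computation (the example f and g are mine, not from the notes): for the tent function f(y) = max(0, 1 − |y|) and g(x) = x^2, the moments are ∫f = 1, ∫yf = 0, ∫y^2 f = 1/6, so the formula above predicts (f ∗ g)(x) = x^2 + 1/6.

```python
def tent(y: float) -> float:
    return max(0.0, 1.0 - abs(y))

def f_conv_g(x: float, n: int = 4000) -> float:
    """Midpoint Riemann sum for integral of (x - y)^2 * tent(y) dy over [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        y = -1.0 + (k + 0.5) * h
        total += (x - y) ** 2 * tent(y)
    return total * h
```

The result is a polynomial in x even far outside spt f, e.g. f_conv_g(2.0) ≈ 4 + 1/6.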
Definition 7.5.9. An approximate identity (or a family of good kernels) is a sequence of continuous (or smooth) functions {g_n} such that

(i) g_n ≥ 0,

(ii) ∫ g_n(x) dx = 1, and

(iii) for every δ > 0, lim_{n→∞} ∫_{|x|≥δ} g_n(x) dx = 0.

Since f ∗ g_n is the weighted average of the translates f(x − y), with weights given by g_n, condition (iii) says that g_n concentrates all of its weight at the origin as n → ∞. Note that (iii) follows from the stronger condition

(iii′) ∀δ > 0, spt g_n ⊆ (−δ, δ) for n ≫ 1.
It turns out that the Dirac delta δ is the identity with respect to convolution: f ∗ δ = δ ∗ f = f, for any f. An approximate identity is the best one can do at a continuous approximation of δ. Even though f ∗ g_n ≠ f, we do have the following.
Theorem 7.5.10. Let {g_n} be an approximate identity, and let f ∈ R(I). Then

f is continuous at x =⇒ lim_{n→∞} (f ∗ g_n)(x) = f(x).

If f is continuous, then the convergence is uniform.
Proof. If ε > 0 and f is continuous at x, choose δ > 0 such that

|y| < δ =⇒ |f(x − y) − f(x)| < ε.

Then by the properties of the approximate identity,

(f ∗ g_n)(x) − f(x) = ∫ g_n(y) f(x − y) dy − f(x)
= ∫ g_n(y) f(x − y) dy − ∫ g_n(y) f(x) dy
= ∫ g_n(y) [f(x − y) − f(x)] dy,

so

|(f ∗ g_n)(x) − f(x)| ≤ ∫ g_n(y) |f(x − y) − f(x)| dy
= ∫_{|y|<δ} g_n(y) |f(x − y) − f(x)| dy + ∫_{|y|≥δ} g_n(y) |f(x − y) − f(x)| dy
≤ ε ∫_{|y|<δ} g_n(y) dy + (2 sup |f|) · ∫_{|y|≥δ} g_n(y) dy.

Now the second integral goes to 0 by AI property (iii), and the first integral is bounded above by ε ∫ g_n = ε. By the K-ε principle, (f ∗ g_n)(x) → f(x).

If f is continuous, then it is uniformly continuous on its compact support, so we can pick a δ that works for all x. Repeating the same argument gives sup_x |(f ∗ g_n)(x) − f(x)| ≤ Kε, and the convergence is uniform.
Theorem 7.5.11 (Weierstrass approximation theorem). Let f ∈ C[a, b]. Then there is a sequence of polynomials {g_n} converging uniformly to f on [a, b].

Proof. First, observe some simplifications.

1. We can assume f vanishes outside [a, b] (else extend linearly on [a − 1, a] and [b, b + 1] so that the extension vanishes outside [a − 1, b + 1]).

2. If p(x) is a polynomial on [a − b, b − a], then f ∗ p will be a polynomial for x ∈ [a, b].

3. Wlog, we may assume [a, b] = [−1, 1], since the general case can be obtained by composing with an appropriate affine function.

Define an approximate identity for f ∈ C_c[−1, 1], f(−1) = f(1) = 0, by

g_n(x) := c_n (1 − x^2)^n for x ∈ [−1, 1], and g_n(x) := 0 otherwise,

where c_n = 1 / ∫_{−1}^{1} (1 − x^2)^n dx. The first two AI properties are clearly satisfied, but we must show (iii).

Since g_n is even, it suffices to show that for any δ > 0, we have g_n → 0 uniformly on [δ, 1]. But g_n is decreasing on this interval, so it suffices to see that sup_{[δ,1]} g_n = g_n(δ) → 0 for every δ > 0. For this, we need an upper bound on c_n.

By the Binomial Theorem, for |x| ≤ 1/(2√n),

(1 − x^2)^n ≥ 1 − nx^2 ≥ 1 − n (1/(2√n))^2 = 1 − 1/4 = 3/4.

Then

∫_{−1}^{1} (1 − x^2)^n dx ≥ ∫_{−1/(2√n)}^{1/(2√n)} (1 − x^2)^n dx ≥ (3/4) n^{−1/2} =⇒ c_n ≤ (4/3) n^{1/2},

and hence

g_n(δ) = c_n (1 − δ^2)^n ≤ (4/3) n^{1/2} (1 − δ^2)^n → 0 as n → ∞,

since n^{1/2} a^n → 0 for 0 ≤ a < 1 (old HW). Now f ∗ g_n is a sequence of polynomials converging uniformly to f on [−1, 1].
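The kernel in this proof can be exercised numerically. A sketch (the target function and parameters are my own choices; c_n is computed by a Riemann sum rather than in closed form):

```python
def landau_norm(n: int, m: int = 4000) -> float:
    """Midpoint Riemann sum for integral of (1 - x^2)^n over [-1, 1]; c_n = 1/this."""
    h = 2.0 / m
    return sum((1.0 - (-1.0 + (k + 0.5) * h) ** 2) ** n for k in range(m)) * h

def smooth(f, n: int, x: float, m: int = 2000) -> float:
    """(f * g_n)(x) with g_n(t) = c_n (1 - t^2)^n on [-1, 1], by a midpoint sum."""
    c_n = 1.0 / landau_norm(n)
    h = 2.0 / m
    total = 0.0
    for k in range(m):
        y = -1.0 + (k + 0.5) * h
        t = x - y
        if abs(t) <= 1.0:
            total += f(y) * c_n * (1.0 - t * t) ** n * h
    return total

f = lambda t: max(0.0, 1.0 - 2.0 * abs(t))  # tent supported in [-1/2, 1/2]
```

For each fixed x, the values (f ∗ g_n)(x) approach f(x) as n grows, as the theorem predicts.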
Corollary 7.5.12. For every interval [−a, a], there is a sequence of real polynomials f_n such that f_n(0) = 0 and f_n(x) → |x| uniformly on [−a, a].

Proof. By Weierstrass' Theorem, there is a sequence {g_n} of real polynomials converging uniformly to |x| on [−a, a]. Then let

f_n(x) := g_n(x) − g_n(0).

Since g_n(0) → |0| = 0, the f_n still converge uniformly to |x|, and f_n(0) = 0.
7.5.2 The Stone-Weierstrass Theorem
We will isolate the properties of the polynomials which make such an approximation possible.

Definition 7.5.13. A family of functions A defined on a set X is an algebra iff (i) f + g ∈ A, (ii) fg ∈ A, and (iii) cf ∈ A, whenever f, g ∈ A and c is a constant. (If A is an algebra of complex functions, then c ∈ C.)

Definition 7.5.14. An algebra A is uniformly closed (or closed in the topology of uniform convergence) iff

f_n ∈ A and f_n → f uniformly =⇒ f ∈ A.

The uniform closure of A is the set of all functions which are limits of uniformly convergent sequences of elements of A.
Example 7.5.1. The polynomials on [a, b] form an algebra of functions.
Weierstrass’ Thm states that C[a, b] is the uniform closure of the polynomials on [a, b].
Definition 7.5.15. A set of functions A on I is said to separate points iff for every x ≠ y ∈ I, there is an f ∈ A for which f(x) ≠ f(y).

Example 7.5.2. The algebra of polynomials separates points, but the algebra of even polynomials on any symmetric interval does not, since f(x) = f(−x).

Definition 7.5.16. A set of functions A on I is said to vanish at no point of I iff for every x ∈ I, there is an f ∈ A for which f(x) ≠ 0.
Theorem 7.5.17. Let A be an algebra of functions on I that vanishes at no point of I and separates points. Suppose x, y ∈ I are distinct and a, b are constants (real, if A is a real algebra). Then A contains a function f such that f(x) = a, f(y) = b.

Proof. From the hypotheses, we have g, h, k ∈ A such that

g(x) ≠ g(y), h(x) ≠ 0, k(y) ≠ 0.

Then the algebra also contains the functions ϕ, ψ defined by

ϕ := gk − g(x)k, ψ := gh − g(y)h.

From these definitions, ϕ(x) = ψ(y) = 0, ϕ(y) ≠ 0, ψ(x) ≠ 0. Then the function we want is

f = (a/ψ(x)) ψ + (b/ϕ(y)) ϕ.
Theorem 7.5.18 (Stone-Weierstrass). Let A be an algebra of real continuous functions on a compact interval K. If A separates points and vanishes at no point of K, then the uniform closure B of A is all of C(K).
Proof. Step (1) If f ∈ B, then |f| ∈ B.

Proof. Fix ε > 0 and let a = sup_{x∈K} |f(x)|. By the previous corollary, we can find c_1, ..., c_n ∈ R such that for y ∈ [−a, a],

| ∑_{i=1}^n c_i y^i − |y| | < ε.

Then since B is an algebra, the function g = ∑_{i=1}^n c_i f^i is an element of B. By the displayed inequality and the definition of a, we have

| g(x) − |f(x)| | < ε, x ∈ K.

Since B is uniformly closed, this shows that |f| ∈ B.
Step (2) If f, g ∈ B, then max{f, g} and min{f, g} are, too.

Proof. This follows from the previous step and the identities

max{f, g} = (f + g)/2 + |f − g|/2,
min{f, g} = (f + g)/2 − |f − g|/2.

By iteration, this gives max{f_1, f_2, ..., f_n} ∈ B, etc.
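The two identities are easy to sanity-check pointwise; a minimal sketch:

```python
def max_via_abs(f: float, g: float) -> float:
    """max{f, g} = (f + g)/2 + |f - g|/2, evaluated pointwise."""
    return (f + g) / 2 + abs(f - g) / 2

def min_via_abs(f: float, g: float) -> float:
    """min{f, g} = (f + g)/2 - |f - g|/2, evaluated pointwise."""
    return (f + g) / 2 - abs(f - g) / 2
```

Since the identities hold at each point, they hold for functions f, g pointwise, which is exactly how Step 2 uses them.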
Step (3) If f ∈ C(K), x ∈ K, and ε > 0, then there is a function g_x ∈ B such that g_x(x) = f(x) and g_x(t) > f(t) − ε for t ∈ K.

Proof. Note that x and ε are fixed. Since A ⊆ B and A satisfies the hypotheses of the previous theorem, so does B. Then for any fixed y ∈ K, we can find h_y ∈ B that agrees with f at x and y:

h_y(x) = f(x), h_y(y) = f(y).

By the continuity of h_y and the positivity theorem, there exists an open neighbourhood U_y of y such that

h_y(t) > f(t) − ε, t ∈ U_y.

Since K is compact, there is a finite set of points y_1, ..., y_n such that

K ⊆ U_{y_1} ∪ ··· ∪ U_{y_n}.

Put g_x := max{h_{y_1}, ..., h_{y_n}}. By the last step, g_x ∈ B, and g_x has the required properties.
Step (4) If f ∈ C(K) and ε > 0, there is h ∈ B such that |h(t) − f(t)| < ε for t ∈ K.

Proof. For each x ∈ K, we have the function g_x ∈ B from the previous step. By the Positivity Theorem, each x ∈ K has an open neighbourhood V_x for which

g_x(t) < f(t) + ε, t ∈ V_x.

Again using the compactness of K, choose a finite subcover

K ⊆ V_{x_1} ∪ ··· ∪ V_{x_m}.

Put h := min{g_{x_1}, ..., g_{x_m}}, so that h ∈ B by Step 2. Then for all t ∈ K,

f(t) − ε < h(t), by Step 3, since each g_{x_i} > f − ε, and
h(t) < f(t) + ε, by this construction, since t lies in some V_{x_i},

=⇒ |h(t) − f(t)| < ε.
Since B is uniformly closed, this proves the theorem.
7.5.3 Convolution and differential equations
Above, we said that δ turns out to be the identity for ∗. In fact, this is almost how it’s
defined: δ has the property that
(f ∗ δ)(x) = ∫ f(x − y) δ(y) dy = f(x).

The closest thing to a function that makes this work is something like

δ(x) := ∞ for x = 0, and δ(x) := 0 otherwise,

but of course this isn't a function. However, δ can be defined honestly as a measure,

δ(E) := 1 if 0 ∈ E, and δ(E) := 0 if 0 ∉ E,

or as a distribution (that is, a linear map δ : C^∞ → R),

δ(f) = ⟨δ, f⟩ := f(0).
Note that f(0) ∈ R and δ is linear because
δ(af + bg) = (af + bg)(0) = af(0) + bg(0) = aδ(f) + bδ(g), a, b ∈ R.
In any case, (δ ∗ f)(x) = (f ∗ δ)(x) = f(x) is true, so that δ ∗ f = f ∗ δ = f .
Consider a linear partial differential equation like

a(x_1, x_2) ∂^2 u/∂x_1 ∂x_2 + b(x_1, x_2) ∂u/∂x_1 + c(x_1, x_2) ∂^2 u/∂x_2^2 = f(x_1, x_2),

where a, b, c, f ∈ C^∞(R^2). In general, such an equation is written

Pu = ∑ a_α ∂^α u = f,

where P = ∑ a_α ∂^α is the general form of a linear differential operator and α = (α_1, α_2, ..., α_n) is a multi-index, denoting, e.g., ∂^{(1,2)} = (∂/∂x_1)(∂^2/∂x_2^2).
Definition 7.5.19. Call E a fundamental solution of P iff PE = δ. (What is E?)

Then a solution of Pu = f can be found by convolving with E:

P(E ∗ f) = (PE) ∗ f, by the magic property (Theorem 7.5.6),
= δ ∗ f, since E is a fundamental solution,
= f, since δ is the identity for ∗,

so that u = E ∗ f is a solution. Note also that E ∗ (Pu) = E ∗ f = u, so that convolution with E is a left and right inverse for P.
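A concrete sketch of this machinery in one variable (the example is mine, not from the notes): for P = d/dx, the Heaviside function H is a fundamental solution, since H′ = δ in the sense of distributions; then u = H ∗ f = ∫_{y≤x} f(y) dy should satisfy u′ = f.

```python
def f(y: float) -> float:
    """Right-hand side: a bump supported in [-1, 1]."""
    return max(0.0, 1.0 - y * y)

def u(x: float, n: int = 4000) -> float:
    """(H * f)(x) = integral of f over y <= x, by a midpoint Riemann sum."""
    b = min(x, 1.0)
    if b <= -1.0:
        return 0.0
    h = (b + 1.0) / n
    return sum(f(-1.0 + (k + 0.5) * h) for k in range(n)) * h

# u should solve u' = f; check at a sample point with a centered difference.
x0, eps = 0.4, 1e-4
deriv = (u(x0 + eps) - u(x0 - eps)) / (2 * eps)
```

The centered difference of u at x0 = 0.4 recovers f(0.4) up to discretization error, illustrating P(E ∗ f) = f.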