
Analysis I for Engineers

Joachim STUBBE¹

January 9, 2015

¹ EPFL SB MATHGEOM, Station 8, CH-1015 Lausanne, [email protected]

Abstract

Objective: Introduce Analysis as the science of the infinite and apply the concepts of infinity and convergence (limit process) to the algebraic structures.

Contents

1 Basic concepts
  1.1 Sets and Functions
    1.1.1 Sets and subsets
    1.1.2 Boolean operations
    1.1.3 A first description of sets of numbers
    1.1.4 Cartesian product set
    1.1.5 Functions
  1.2 Algebraic structures and order structures
  1.3 Natural numbers and induction principle
    1.3.1 Sum and product
    1.3.2 Examples/exercises
    1.3.3 Prime numbers
  1.4 Integers
  1.5 The ordered fields Q and R
    1.5.1 Properties of the rational numbers
    1.5.2 Subsets of Q and of R
    1.5.3 Properties of real numbers
    1.5.4 Intervals
    1.5.5 Subsets of R
    1.5.6 Absolute value
  1.6 Some real-valued functions
    1.6.1 Monotonic functions
    1.6.2 Functions defined by parts
    1.6.3 Trigonometric functions
  1.7 Introduction to complex numbers
    1.7.1 The field C
    1.7.2 Modulus and complex conjugate
    1.7.3 Representation of complex numbers and polar form
    1.7.4 Roots of a complex number
  1.8 Solving equations
    1.8.1 Equations of order two
    1.8.2 A few general results

2 Sequences, Limits and Continuity
  2.1 Sequences and subsequences
    2.1.1 Sequences
    2.1.2 Bounded sequences
    2.1.3 Monotonic sequences
    2.1.4 Subsequences
  2.2 Convergent sequences and limits
    2.2.1 Limit of a sequence
    2.2.2 Properties of the limit values
  2.3 Convergence criteria
  2.4 The Bolzano-Weierstrass theorem
  2.5 Cauchy sequences
  2.6 Continuous functions
    2.6.1 Examples of continuous functions
    2.6.2 Applications of the Bolzano-Weierstrass theorem to continuous functions over [a, b]
  2.7 Strongly divergent sequences
  2.8 Limits of recurrent sequences
    2.8.1 Linear recurrent sequences
    2.8.2 Non-linear recurrent sequences
    2.8.3 Banach's fixed point theorem
  2.9 Supplement: Sequences of complex numbers
    2.9.1 Continuous functions

3 Series
  3.1 Convergence of a series
  3.2 Series with non-negative terms
  3.3 Alternating series
  3.4 Absolute convergence: convergence criteria
  3.5 Order of the terms in a series
  3.6 The exponential series

4 Real Functions and Limit process
  4.1 Limit of a function
    4.1.1 Limit of a function - the definitions
    4.1.2 Punctured limit of a function - the definitions
    4.1.3 Infinite limits and limits at infinity
    4.1.4 Properties of limit values
    4.1.5 Examples
    4.1.6 Asymptotic behavior and asymptotes
  4.2 Uniformly continuous functions

5 Differential calculus
  5.1 The derivative
    5.1.1 Properties of the derivative
    5.1.2 Unilateral derivative
    5.1.3 Derivative and local behavior
    5.1.4 An application of the derivative: de l'Hospital's rule
    5.1.5 The class C^1(I)
    5.1.6 Supplement: the derivative of functions f : R → C
  5.2 Mean value theorem
    5.2.1 Local extrema and Rolle theorem
    5.2.2 Mean value theorem
  5.3 Higher order derivatives
    5.3.1 The class C^n(I)
    5.3.2 Leibniz's rule
    5.3.3 Convex function and second derivative
    5.3.4 Local extrema and second derivative
    5.3.5 Applications
    5.3.6 Inflexion points and second derivative
  5.4 Higher order derivatives and series expansions
    5.4.1 Polynomial functions
    5.4.2 Truncated Taylor expansion
    5.4.3 Computation of Taylor polynomials
    5.4.4 Power series
  5.5 Study of a function

6 Integration
  6.1 The Riemann integral
  6.2 Properties of the Riemann integral
  6.3 The derivative and the integral
  6.4 Integration Techniques
    6.4.1 Integration by parts
    6.4.2 Variable substitution
    6.4.3 Examples - techniques and frequent integrals
    6.4.4 Supplement: Integral of functions f : [a, b] → C
  6.5 Generalized integrals
  6.6 The Gamma function

A Derivatives and antiderivatives

B Generalized Integrals

Chapter 1

Basic concepts: Numbers, Structures and Functions

Objectives I. Review (or introduce) standard mathematical concepts: elementary properties of sets (and functions), sets of numbers, writing numbers by introducing sup and inf, a short introduction to complex numbers.

Objectives II. The student must learn to reason simply and to carry out operations that are elementary but nevertheless of a certain complexity. The student must also realize the necessity of rigor and the requirement of a certain level of abstraction.

Notions. Sets and boolean operations, relations and functions, indicator function, equivalence relation and quotient set, the sets N, Z, Q, R, axioms, supremum, infimum, open and closed sets, the interior, the boundary and the closure of a set, isolated point, accumulation point and limit point, absolute value, real-valued function, C, Euler formula, de Moivre formula, roots of polynomials.

Skills to acquire. Master computing with elementary functions, conduct a proof by induction or by contradiction, deduce simple relations from axioms, understand and know how to apply the notions above to examples, compute with the sum and finite product symbols, understand the meaning of the absolute value in R (resp. of the modulus in C), know how to solve inequalities and apply the inequalities to the real numbers, compute the roots of a complex number, solve a second-degree equation with complex coefficients.

1.1 Sets and Functions

1.1.1 Sets and subsets

A set E is a collection of objects called elements. If a is an element of E, we say that a belongs to E or that E contains a, and we write a ∈ E. If a is not an element of E, we write a ∉ E. If the elements a, b, . . . form the set E, we write E = {a, b, . . .}. A set E can have a finite or an infinite number of elements. The empty set, denoted by { } or ∅, has no element. The set E is a subset (we also say part) of the set F if each element of E is an element of F. We write E ⊂ F. If E ⊂ F and F ⊂ E, then E and F contain the same elements, and we write E = F. The following is always true: ∅ ⊂ E. Given a set E, we define the set P(E) whose elements are the subsets of E, including ∅ and E itself. We call P(E) the set of all subsets of E.

Example. Note that {a, b} = {b, a} (such a set is also called a pair) but {a, b} ≠ {{a}, {b}}. If E = {a, b}, then P(E) = {∅, {a}, {b}, {a, b}}.

Example. If a ∈ E, then {a} ⊂ E and {a} ∈ P(E).
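The set P(E) of a small finite set can be enumerated mechanically. The following is a minimal sketch, assuming Python and a helper name `power_set` that is not part of the course; subsets are represented as `frozenset` objects so that they can themselves be elements of a set.

```python
from itertools import combinations

def power_set(E):
    """Return P(E), the set of all subsets of the finite set E."""
    elements = list(E)
    return {
        frozenset(subset)
        for size in range(len(elements) + 1)        # subsets with 0, 1, ..., |E| elements
        for subset in combinations(elements, size)
    }

P = power_set({"a", "b"})
print(len(P))                    # 4 subsets: ∅, {a}, {b}, {a, b}
print(frozenset({"a"}) in P)     # {a} ∈ P(E) because a ∈ E
```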

1.1.2 Boolean operations

If E and F are two sets, we define the union of E and F as the set of the elements belonging to E or to F:

E ∪ F = {x : x ∈ E or x ∈ F}

We define the intersection of E and F as the set of elements that belong to E and to F:

E ∩ F = {x : x ∈ E and x ∈ F}

If E is a subset of F we define the complement of E in F as the set of elements of F that are not elements of E:

E^c = F \ E = {x : x ∈ F and x ∉ E}

More generally, if S is a set and E, F ⊂ S we define the difference of F and E as:

F \ E = {x : x ∈ F and x ∉ E} = F ∩ E^c

where E^c denotes the complement of E in S: E^c = S \ E.

Table: Properties of boolean operations

Commutativity:    E ∩ F = F ∩ E                        E ∪ F = F ∪ E

Associativity:    D ∩ (E ∩ F) = (D ∩ E) ∩ F            D ∪ (E ∪ F) = (D ∪ E) ∪ F

Distributivity:   D ∩ (E ∪ F) = (D ∩ E) ∪ (D ∩ F)      D ∪ (E ∩ F) = (D ∪ E) ∩ (D ∪ F)

De Morgan laws:   (E ∩ F)^c = E^c ∪ F^c                (E ∪ F)^c = E^c ∩ F^c
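The rules in the table can be spot-checked on concrete finite sets. This is only an illustration on one example and not a proof; the ambient set `S`, the three subsets and the function names are arbitrary choices made for this sketch.

```python
S = set(range(10))  # ambient set, an arbitrary choice for illustration
D, E, F = {0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7, 9}

def complement(A, ambient=S):
    """Complement of A in the ambient set."""
    return ambient - A

def boolean_laws_hold(D, E, F):
    """Check every row of the table for the given subsets of S."""
    return all([
        E & F == F & E, E | F == F | E,                               # commutativity
        D & (E & F) == (D & E) & F, D | (E | F) == (D | E) | F,       # associativity
        D & (E | F) == (D & E) | (D & F),                             # distributivity
        D | (E & F) == (D | E) & (D | F),
        complement(E & F) == complement(E) | complement(F),           # De Morgan
        complement(E | F) == complement(E) & complement(F),
    ])

print(boolean_laws_hold(D, E, F))
```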


1.1.3 A first description of sets of numbers

We denote by N the set of natural numbers

N := {0, 1, 2, . . .},

(where ”:=” means ”is defined as”), Z the set of integers

Z := {. . . ,−2,−1, 0, 1, 2, . . .},

Q the set of rational numbers

Q := {p/q : p, q ∈ Z, q ≠ 0}

and R the set of real numbers. We have the following inclusions:

∅ ⊂ N ⊂ Z ⊂ Q ⊂ R.

To give a simple characterization of Q and of R, we use the decimal representation:

Proposition 1.1.1. A real number is a rational number if and only if its decimal representation is eventually periodic.

Here we apply the convention that a terminating decimal has the period 0, for example $0.25 = 0.25\overline{0}$.

Proof. Exercise.

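Example.

$$\frac{41}{70} = 0.5\overline{857142}, \qquad 0.5\overline{857142} = \frac{5}{10} + \frac{857142}{9999990} = \frac{41}{70},$$

$$\frac{1}{17} = 0.\overline{0588235294117647}, \qquad 0.\overline{0588235294117647} = \frac{588235294117647}{9999999999999999} = \frac{1}{17},$$

$$\pi = 3.141592653589793238462643\ldots,$$

$$e = 2.718281828459045235360287\ldots$$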
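The periodic part of the decimal representation of a rational number can be found by long division: the period starts as soon as a remainder repeats. The sketch below assumes Python and a hypothetical helper `decimal_expansion`; it follows the convention above that a terminating decimal has the period 0.

```python
def decimal_expansion(p, q):
    """Return (integer part, non-periodic digits, periodic digits) of p/q for p >= 0, q > 0."""
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}  # remainder -> position of the digit produced from it
    while remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, q)
        digits.append(str(digit))
    start = seen[remainder]  # the period begins where this remainder first occurred
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(41, 70))  # (0, '5', '857142')
print(decimal_expansion(1, 17))   # (0, '', '0588235294117647')
print(decimal_expansion(1, 4))    # (0, '25', '0')
```

Since there are only q possible remainders, the loop stops after at most q steps, which is the idea behind one direction of Proposition 1.1.1.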

We also write Z+ = {1, 2, . . .} (or N*) for the set of positive integers. Obviously

Z+ ⊂ N, Z+ ∪ {0} = N.

The complement of Z+ in N is {0}.

1.1.4 Cartesian product set.

Let E, F be two sets. We define the cartesian product of E and F as the set of ordered pairs (x, y) where x ∈ E and y ∈ F:

E × F = {(x, y) : x ∈ E and y ∈ F}.

Likewise, we define the cartesian product of n sets E_1, . . . , E_n:

E_1 × . . . × E_n = {(x_1, . . . , x_n) : x_1 ∈ E_1, . . . , x_n ∈ E_n}.

If E_i = E for all i, we write E^n for the cartesian product of the E_i.


Warning: note that ordered pair and pair are different notions: (x, y) ≠ (y, x) unless x = y. Hence, in general, E × F ≠ F × E.

Relations, Equivalence relations. We can interpret any subset of E × F as a correspondence between the elements of E and of F. In particular, if E = F, any subset R ⊂ E × E is called a relation on E. The symbols "=" and "≤" are common examples of relations on the sets N, Z, Q, R (see 1.2). We need equivalence relations in order to formally define the sets Z, Q, R (see 1.4 and 1.5). The goal is to classify the elements of E according to their type. A relation R ⊂ E × E is an equivalence relation, written x ∼ y for (x, y) ∈ R, if the following properties are satisfied:

1. x ∼ x for all x ∈ E (reflexivity),

2. x ∼ y implies y ∼ x (symmetry),

3. x ∼ y and y ∼ z imply x ∼ z (transitivity).

The set [[x]] := {y ∈ E : y ∼ x} is called the equivalence class of x. We have [[x]] = [[y]] if and only if x ∼ y, and every element of [[x]] (in particular x itself) is called a representative of the equivalence class [[x]]. The quotient set, written E/∼, is the set of equivalence classes. The relation given by the sign "=" always defines an equivalence relation.

Examples

1. We introduce an equivalence relation over the natural numbers by the relation "to have the same parity". The equivalence classes are the even numbers and the odd numbers, for which we choose the representatives 0 and 1. So, 0 ∼ 2, 1 ∼ 5, and the quotient set is given by {[[0]], [[1]]}. To facilitate notation we often identify an equivalence class with its privileged representative. With this convention the quotient set is given by {0, 1}.

2. Over Z we introduce an equivalence relation by the relation "to have the same square". The equivalence classes are given by [[0]] := {0}, [[1]] := {−1, 1}, [[2]] := {−2, 2} and so on. We can identify the quotient set with N.
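Both examples are relations of the form "x ∼ y if and only if key(x) = key(y)", which is always an equivalence relation. A minimal sketch of the resulting partition on a finite part of N or Z, assuming Python; the names `quotient_set` and `key` are illustrative only.

```python
def quotient_set(E, key):
    """Partition E into the classes of the relation: x ~ y iff key(x) == key(y)."""
    classes = {}
    for x in E:
        classes.setdefault(key(x), set()).add(x)
    return list(classes.values())

# Example 1 restricted to {0, ..., 9}: same parity
parity_classes = quotient_set(range(10), key=lambda n: n % 2)

# Example 2 restricted to {-3, ..., 3}: same square
square_classes = quotient_set(range(-3, 4), key=lambda n: n * n)

print(parity_classes)
print(square_classes)
```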

1.1.5 Functions

Definitions. A correspondence which associates to every element x ∈ E an element y ∈ F is called a function from E to F and is written f : E → F. To indicate that f(x) is the element of F associated to x, we use the notation x ↦ f(x). We say that f(x) is the value that f takes at the point x or the image of x under f. We call E the domain¹ and F the codomain of f. The subset of F given by f[E] := {f(x) : x ∈ E} is called the image of f (also written Im(f)). Finally, the graph of a function, written G_f, is the subset of E × F given by

G_f = {(x, f(x)) : x ∈ E}.

In general it is represented in a coordinate system. For example, for a real function (i.e. E, F ⊂ R) the graph is represented by a curve in the plane with cartesian coordinates. Furthermore we introduce the following notions:

¹ We often write f : E → F for a function even if its domain D_f is smaller than E if, for example, we want to give general properties of a class of functions that do not depend on the domain D_f. With this convention we say that f : E → F is a function if and only if D_f = E.


Surjective function. A function f : E → F is said to be surjective if f[E] = F or, in other words, if every y ∈ F is the image under f of at least one element x ∈ E.

Injective function. A function f : E → F is said to be injective if x_1 ≠ x_2 implies f(x_1) ≠ f(x_2) for all x_1, x_2 ∈ E. In other words, every y ∈ f[E] is the image under f of only one element x ∈ E.

Bijective function. A function f : E → F is said to be bijective if it is both surjective and injective.

Identity function. The function Id_E : E → E defined by Id_E(x) = x is called the identity function over E. The identity function is bijective.

Constant function. A function f : E → F is said to be constant if f(x_1) = f(x_2) for all x_1, x_2 ∈ E.

Function composition. Let f : E → F and g : A → B be two functions such that f[E] ⊂ A. Then the function g ◦ f : E → B, defined by (g ◦ f)(x) := g(f(x)), is called the composite function of g and f. Composition is associative: let h : C → D be a function such that g[A] ⊂ C. Then

h ◦ (g ◦ f) = (h ◦ g) ◦ f,

because for all x ∈ E

(h ◦ (g ◦ f))(x) = h((g ◦ f)(x)) = h(g(f(x))) = (h ◦ g)(f(x)) = ((h ◦ g) ◦ f)(x).

Inverse function. When f : E → F is bijective, we can define a function f^{-1} : F → E which associates to each y ∈ F the element x ∈ E that is the unique solution of the equation y = f(x). The function f^{-1} is called the inverse function of f. It is bijective and f^{-1} ◦ f = Id_E, f ◦ f^{-1} = Id_F.

Table: Properties of f : E → F in terms of the equation y = f(x), y ∈ F

f : E → F surjective:   y = f(x) has at least one solution x ∈ E

f : E → F injective:    y = f(x) has at most one solution x ∈ E

f : E → F bijective:    y = f(x) has exactly one solution x ∈ E
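For functions between finite sets, the three properties of the table can be tested by enumeration. The following sketch assumes Python; the function names are hypothetical, and `is_surjective` presupposes that f really maps E into F.

```python
def is_injective(f, E):
    """No two elements of E have the same image."""
    images = [f(x) for x in E]
    return len(set(images)) == len(images)

def is_surjective(f, E, F):
    """Every y in F is attained, i.e. f[E] = F (assuming f maps E into F)."""
    return {f(x) for x in E} == set(F)

def is_bijective(f, E, F):
    return is_injective(f, E) and is_surjective(f, E, F)

E = [-2, -1, 0, 1, 2]
square = lambda x: x * x
print(is_surjective(square, E, {0, 1, 4}))   # True: every value is attained
print(is_injective(square, E))               # False: square(-1) == square(1)
print(is_bijective(lambda x: x + 1, [0, 1, 2], {1, 2, 3}))  # True
```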

Restriction of a function. Let f : E → F, let D be a subset of the domain E of f, and let g : D → F be the function such that g(x) = f(x) for all x ∈ D. We call g the restriction of f to D and we write f|_D (read: f restricted to D).

Extension of a function. Let f : E → F and E ⊂ D. A function g : D → F is called an extension of f if f is the restriction of g to E, i.e. g|_E = f.

Indicator function. Let E, R be two sets such that E ⊂ R. The indicator function of E, written χ_E, is defined by

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \in R \setminus E. \end{cases}$$

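If E, F are two subsets of a set R, then

$$\chi_E(x) \cdot \chi_F(x) = \chi_{E \cap F}(x)$$

and

$$\chi_E(x) + \chi_F(x) = \chi_{E \cup F}(x) + \chi_{E \cap F}(x).$$

In probability, this relation is called the inclusion-exclusion principle. If E ⊂ S, F ⊂ T, then E × F ⊂ S × T and

$$\chi_{E \times F}(x, y) = \chi_E(x)\,\chi_F(y) = \begin{cases} 1 & \text{if } x \in E \text{ and } y \in F, \\ 0 & \text{if } (x, y) \in (S \times T) \setminus (E \times F). \end{cases}$$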
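The two identities for indicator functions can be checked pointwise on a small example. This is a sketch under assumed names (`indicator`, the ambient set `R` and the subsets `E`, `F` are arbitrary choices), not a proof.

```python
def indicator(E):
    """Return the indicator function of the set E."""
    return lambda x: 1 if x in E else 0

R = range(12)                       # ambient set, chosen for illustration
E, F = {1, 2, 3, 4}, {3, 4, 5, 6}
chi_E, chi_F = indicator(E), indicator(F)
chi_cap, chi_cup = indicator(E & F), indicator(E | F)

# chi_E * chi_F = chi_(E ∩ F) at every point of R
product_ok = all(chi_E(x) * chi_F(x) == chi_cap(x) for x in R)
# chi_E + chi_F = chi_(E ∪ F) + chi_(E ∩ F) at every point of R
sum_ok = all(chi_E(x) + chi_F(x) == chi_cup(x) + chi_cap(x) for x in R)

print(product_ok, sum_ok)
```

Summing the second identity over all x in R gives card(E) + card(F) = card(E ∪ F) + card(E ∩ F) for finite sets.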

Example - a function over finite sets: the cardinality. Let E be a set that has a finite number of elements. We then call this number the cardinality of E, written card(E). In this case, we can define the cardinality for any subset of E and view the cardinality as a function defined on the subsets of E: card : P(E) → N. Note that card(∅) = 0.

Example - addition and multiplication. The algebraic operations +, · can be seen as functions. For example, addition on integers is a function from E = Z × Z to F = Z given by f((n, m)) = n + m.

1.2 Algebraic structures and order structures

We first recall the usual computational rules by giving a list of axioms. We will then present a sketch of a way of building and describing sets in a more axiomatic way starting from the set of natural numbers.

Algebraic axioms - properties of a field. Let x, y, z ∈ Q or R.

A1 x+ (y + z) = (x+ y) + z and x · (y · z) = (x · y) · z.

A2 x+ y = y + x and x · y = y · x.

A3 There exists an element written 0 (called identity element for addition) such that for all x: 0 + x = x.

A4 For all x there exists an element −x such that x + (−x) = 0.

A5 There exists an element 1 ≠ 0 (called identity element for multiplication) such that for all x: 1 · x = x.

A6 For all x ≠ 0 there exists an element written x^{-1} such that x · x^{-1} = 1.

A7 x · (y + z) = x · y + x · z.


Sets N and Z. We recall that the set N does not satisfy the axioms A4 and A6, and the set Z does not satisfy the axiom A6.

Order axioms. Let x, y, z ∈ N,Z,Q or R.

O1 x ≤ y and y ≤ z imply x ≤ z.

O2 x ≤ y and y ≤ x imply x = y.

O3 For all x, y we have either x < y, or x = y, or x > y.

O4 If x ≤ y, then for all z: x+ z ≤ y + z.

O5 If 0 ≤ x and if 0 ≤ y, then 0 ≤ xy.

Remark. We recall the definition of the signs < and > from ≤:

x < y if x ≤ y and x ≠ y,

x > y if y ≤ x and x ≠ y.

We write x ≥ y if y ≤ x. If the relation x = y is not verified we write x ≠ y. The axioms O1, O2 and O3 signify that the sets N, Z, Q and R are ordered.

Proposition 1.2.1. The sets Q and R are two ordered fields.

Elementary consequences of the axioms. The algebraic and order axioms A1 - A7 and O1 - O5 imply many other elementary properties, among which we mention the following:

1. The identity elements 0 and 1 are unique.

2. 0 · x = 0 for all x ∈ R (or x ∈ Q).

3. We denote by −a the additive inverse of a. Then (−1) · (−1) = 1.

4. x^2 ≥ 0 for all x ∈ R and x^2 > 0 if and only if x ≠ 0.

5. Let a, b ∈ R, a ≠ 0. Then the equation ax + b = 0 has a unique solution x = −b/a.

1.3 Natural numbers and induction principle

The set of natural numbers. We denote by N the set of natural numbers:

N = {0, 1, 2, . . .}

On the set of natural numbers there is an algebraic structure given by the addition operation, written "+", and the multiplication operation, written "·" (the axioms A1, A2, A3, A5 and A7). Moreover the set of natural numbers is ordered. However we need a supplementary property to describe the set N.²

² We say that N is the smallest infinite ordinal. We categorize linguistically the ordinals by the words first, second, and so on.


Well-ordering property. Any non-empty subset of N has a smallest element.

This property implies

Theorem 1.1 (Induction principle). Let P(n) be a property depending on a natural number n such that:

1. the property P(0) is true.

2. P(n) implies P(n + 1) for every n ∈ N.

Then P(n) is true for every natural number n.³

Example. Let P(n) be the assertion that $f(n) := 2^{2n+3} + 7^{n+1}$ is divisible by 3. Then P(0) is true because f(0) = 15 is divisible by 3. Supposing that f(n) is divisible by 3, we must show that f(n + 1) is divisible by 3. With a simple computation we find:

$$f(n+1) = 2^{2(n+1)+3} + 7^{(n+1)+1} = 2^2 f(n) + 7^{n+1}(7 - 2^2) = 4 f(n) + 3 \cdot 7^{n+1},$$

hence the assertion. It is essential to check that P(0) is true: with $f(n) = 2^{2n+4} + 7^{n+1}$ we can still show that P(n) implies P(n + 1), but P(0) is false ($2^4 + 7 = 23$ is not divisible by 3), so P(n) is not proved.
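A numerical check does not replace the induction proof, but it illustrates why the base case matters. In the sketch below (Python assumed; `g` is a name introduced here for the second function of the example), both functions satisfy the same recurrence used in the inductive step, yet only f is divisible by 3.

```python
def f(n):
    return 2 ** (2 * n + 3) + 7 ** (n + 1)

def g(n):
    # the variant from the example whose base case P(0) fails
    return 2 ** (2 * n + 4) + 7 ** (n + 1)

N = range(50)
# the identity behind the inductive step holds for both functions
step_f = all(f(n + 1) == 4 * f(n) + 3 * 7 ** (n + 1) for n in N)
step_g = all(g(n + 1) == 4 * g(n) + 3 * 7 ** (n + 1) for n in N)

print(step_f, all(f(n) % 3 == 0 for n in N))   # True True
print(step_g, any(g(n) % 3 == 0 for n in N))   # True False
```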

In the sequel, we will present some other applications of proofs by induction,notably to derive a few useful sums and products. We first recall some propertiesholding for sums and for products.

1.3.1 Sum and product

Definition. Let m ≤ n be two integers. For every integer k such that m ≤ k ≤ n, let a_k be a real number. We define⁴:

$$\sum_{k=m}^{n} a_k = a_m + a_{m+1} + \ldots + a_n,$$

$$\prod_{k=m}^{n} a_k = a_m \cdot a_{m+1} \cdot \ldots \cdot a_n = a_m a_{m+1} \ldots a_n.$$

If m = n, then the sum (the product) contains only the number a_n. For n = m − 1 we define:

$$\sum_{k=m}^{m-1} a_k := 0 \quad \text{(empty sum)},$$

$$\prod_{k=m}^{m-1} a_k := 1 \quad \text{(empty product)}.$$

³ Otherwise there exists a smallest integer m such that P(m) is false, and m > 0 because P(0) is true. It follows that m − 1 is a natural number and P(m − 1) is true, so P(m) is true, hence the contradiction. The theorem is even equivalent to the property that every non-empty subset of N has a smallest element.

⁴ The symbols ∑ and ∏ evoke the Greek capital letters Sigma and Pi.


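A few computation rules. If l − 1 ≤ m ≤ n, then:

$$\sum_{k=l}^{m} a_k + \sum_{k=m+1}^{n} a_k = \sum_{k=l}^{n} a_k,$$

$$\left(\prod_{k=l}^{m} a_k\right) \cdot \left(\prod_{k=m+1}^{n} a_k\right) = \prod_{k=l}^{n} a_k.$$

For any integer k such that m ≤ k ≤ n let a_k, b_k be real numbers. Then

$$\sum_{k=m}^{n} (a_k + b_k) = \sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k,$$

$$\prod_{k=m}^{n} a_k b_k = \prod_{k=m}^{n} a_k \cdot \prod_{k=m}^{n} b_k.$$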

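Moreover, let λ be a number (real, complex). Then

$$\sum_{k=m}^{n} \lambda a_k = \lambda \sum_{k=m}^{n} a_k,$$

$$\prod_{k=m}^{n} (\lambda a_k) = \lambda^{n-m+1} \prod_{k=m}^{n} a_k.$$

Change of indices.

$$\sum_{k=m}^{n} a_k = \sum_{j=m-l}^{n-l} a_{j+l},$$

$$\prod_{k=m}^{n} a_k = \prod_{j=m-l}^{n-l} a_{j+l}.$$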

A useful technique consists in changing the order of the a_k in a sum or in a product: let σ be a permutation of {1, . . . , n}, i.e. σ is a function from E = {1, . . . , n} to E such that σ(j) ≠ σ(k) for all j ≠ k. Since E is finite, such a σ is bijective. Hence

$$\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} a_{\sigma(k)},$$

$$\prod_{k=1}^{n} a_k = \prod_{k=1}^{n} a_{\sigma(k)}.$$

In particular, the order inversions a_1 + . . . + a_n = a_n + . . . + a_1, respectively a_1 · . . . · a_n = a_n · . . . · a_1, yield

$$\sum_{k=1}^{n} a_k = \sum_{k=1}^{n} a_{n+1-k},$$

$$\prod_{k=1}^{n} a_k = \prod_{k=1}^{n} a_{n+1-k}.$$


Example - An application of the order inversion. The order inversion allows us to easily compute the sum of the first n positive integers⁵:

$$\sum_{k=1}^{n} k = \frac{1}{2}\sum_{k=1}^{n} k + \frac{1}{2}\sum_{k=1}^{n} (n+1-k) = \frac{1}{2}\sum_{k=1}^{n} (k + n + 1 - k) = \frac{n(n+1)}{2}.$$

Product of two sums. For indices j, k such that m ≤ j, k ≤ n:

$$\sum_{j=m}^{n} \sum_{k=m}^{n} a_j b_k = \left(\sum_{j=m}^{n} a_j\right)\left(\sum_{k=m}^{n} b_k\right) = \sum_{j=m}^{n} a_j \sum_{k=m}^{n} b_k.$$

1.3.2 Examples/exercises

The following formulas and inequalities can be proved either by induction or by elementary transformations of sums or products:

Arithmetic sum of natural numbers.

$$\sum_{k=1}^{n} k = \frac{1}{2}\sum_{k=1}^{n} k + \frac{1}{2}\sum_{k=1}^{n} (n+1-k) = \frac{1}{2}\sum_{k=1}^{n} (k + n + 1 - k) = \frac{n(n+1)}{2}.$$

Binomial coefficients, binomial formula - geometric progression.

$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

$$a^n - b^n = (a-b) \cdot \sum_{k=0}^{n-1} a^{n-k-1} b^k. \tag{1.1}$$

$$\sum_{k=0}^{n} a^k = \frac{1 - a^{n+1}}{1 - a}, \quad a \neq 1.$$

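Other finite sums

$$S_n := \sum_{k=1}^{n} \frac{1}{k(k+1)} = \frac{n}{n+1}, \quad n \in \mathbb{Z}_+. \tag{1.2}$$

$$\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6},$$

$$\sum_{k=0}^{n} (-1)^{n-k} k^2 = \frac{n(n+1)}{2}.$$

$$\prod_{k=1}^{n} \left(1 + \frac{1}{k}\right)^k = \frac{(n+1)^n}{n!}.$$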
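Before proving such closed forms by induction, it can help to test them for small n with exact rational arithmetic. The sketch below assumes Python 3.8 or later (for `math.comb` and `math.prod`); the function name `identities_hold` and the test values of a and b are arbitrary choices, with a ≠ 1 as required by the geometric sum.

```python
from fractions import Fraction
from math import comb, factorial, prod

def identities_hold(n, a=Fraction(3, 2), b=Fraction(2, 5)):
    """Check the closed forms of this section for one n >= 1 (a != 1 assumed)."""
    ks = range(1, n + 1)
    return all([
        sum(ks) == Fraction(n * (n + 1), 2),
        (a + b) ** n == sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1)),
        a**n - b**n == (a - b) * sum(a**(n - k - 1) * b**k for k in range(n)),
        sum(a**k for k in range(n + 1)) == (1 - a**(n + 1)) / (1 - a),
        sum(Fraction(1, k * (k + 1)) for k in ks) == Fraction(n, n + 1),
        sum(k * k for k in ks) == Fraction(n * (n + 1) * (2 * n + 1), 6),
        sum((-1) ** (n - k) * k * k for k in range(n + 1)) == Fraction(n * (n + 1), 2),
        prod((1 + Fraction(1, k)) ** k for k in ks) == Fraction((n + 1) ** n, factorial(n)),
    ])

print(all(identities_hold(n) for n in range(1, 12)))
```

Using `Fraction` instead of floating-point numbers avoids rounding errors, so each comparison is exact.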

⁵ On a side note, it was the young (unruly!) nine-year-old student Carl Friedrich Gauss (1777-1855) who reinvented and applied this method to compute the sum of the first hundred positive integers.


Telescopic sums.

$$f(n+1) - f(0) = \sum_{k=0}^{n} \bigl(f(k+1) - f(k)\bigr)$$

Inequalities: Bernoulli, geometric-arithmetic, Cauchy-Schwarz.

1.3.3 Prime numbers

We say that a natural number p is prime if p ≥ 2 and if it is divisible only by 1 and by itself. Every integer n ≥ 2 can be uniquely written as a product of prime numbers (proof by induction, exercise):

$$n = \prod_{i=1}^{m} p_i^{k_i}, \quad p_1 < p_2 < \ldots < p_m, \quad k_i \in \mathbb{N}^*. \tag{1.3}$$

There exists an infinity of prime numbers (proof by contradiction, exercise). For a, b ∈ N* we can apply (1.3) to find the greatest common divisor of a and b, written gcd(a, b), and the least common multiple of a and b, written lcm(a, b) (see "Savoir faire en mathematiques", Y.Biollay, A. Chaabouni, J.St.).
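As an illustration of how (1.3) yields gcd and lcm: take, for each prime, the minimum (resp. maximum) of its exponents in a and b. The sketch below assumes Python 3.8 or later and uses simple trial division, which is only practical for small numbers; the names `factorize` and `gcd_lcm` are introduced here for illustration.

```python
from collections import Counter
from math import prod

def factorize(n):
    """Prime factorization of an integer n >= 1 as a Counter {prime: exponent}."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:      # divide out p as often as possible
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                  # what remains is a prime factor
        factors[n] += 1
    return factors

def gcd_lcm(a, b):
    """gcd and lcm of a, b >= 1 from the exponents in their factorizations."""
    fa, fb = factorize(a), factorize(b)
    primes = set(fa) | set(fb)
    g = prod(p ** min(fa[p], fb[p]) for p in primes)
    l = prod(p ** max(fa[p], fb[p]) for p in primes)
    return g, l

print(factorize(360))     # 360 = 2^3 · 3^2 · 5
print(gcd_lcm(12, 18))    # (6, 36)
```

Since min(j, k) + max(j, k) = j + k, this construction also shows that gcd(a, b) · lcm(a, b) = a · b.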

1.4 Integers

The equation x + m = n, m, n ∈ N, has a solution x ∈ N if and only if m ≤ n, written x = n − m. If m > n we can formally define the solution by x = −(m − n). This definition is justified if we build Z by equivalence classes in N × N: (n_1, m_1) ∼ (n_2, m_2) if n_1 + m_2 = n_2 + m_1 (i.e. by pairs (n, m) with constant difference). We define an equivalence class by

[[(n, m)]] := {(n_1, m_1) ∈ N × N : (n_1, m_1) ∼ (n, m)}

In particular, [[(n, m)]] = [[(n − m, 0)]] if m ≤ n and [[(n, m)]] = [[(0, m − n)]] if m > n. The quotient set (N × N)/∼ is the set of the equivalence classes [[(n, m)]] and we define

Z := (N × N)/∼. (1.4)

Over this quotient set, we define addition and multiplication as follows:

[[(n, m)]] + [[(n′, m′)]] = [[(n + n′, m + m′)]],

[[(n, m)]] · [[(n′, m′)]] = [[(nn′ + mm′, nm′ + mn′)]]. (1.5)

We then identify the classes [[(n, 0)]] with n and [[(0, n)]] with −n (such an identification is called an isomorphism because the structures of addition and of multiplication are preserved).

The set Z is countable, which means that there exists a bijective function i : N → Z, for example i(0) = 0, i(1) = 1, i(2) = −1, i(3) = 2, . . .
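The construction of Z by pairs and the enumeration i can be sketched in a few lines. The code assumes Python; the names `normalize`, `add`, `mul`, `to_int` and `enumerate_Z` are introduced here only for illustration. A class [[(n, m)]] is stored through its privileged representative (n − m, 0) or (0, m − n).

```python
def normalize(n, m):
    """Privileged representative of [[(n, m)]]: (n - m, 0) if m <= n, else (0, m - n)."""
    return (n - m, 0) if m <= n else (0, m - n)

def add(x, y):
    """Addition of classes as in (1.5)."""
    return normalize(x[0] + y[0], x[1] + y[1])

def mul(x, y):
    """Multiplication of classes as in (1.5)."""
    (n, m), (n2, m2) = x, y
    return normalize(n * n2 + m * m2, n * m2 + m * n2)

def to_int(x):
    """Identify [[(n, 0)]] with n and [[(0, n)]] with -n."""
    return x[0] - x[1]

def enumerate_Z(k):
    """The bijection i : N -> Z with i(0) = 0, i(1) = 1, i(2) = -1, i(3) = 2, ..."""
    return (k + 1) // 2 if k % 2 else -(k // 2)

print(to_int(mul((0, 2), (0, 3))))            # (-2) · (-3) = 6
print([enumerate_Z(k) for k in range(7)])     # [0, 1, -1, 2, -2, 3, -3]
```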


1.5 The ordered fields Q and R

We have seen that Q and R are ordered fields, i.e. they satisfy the same algebraic and order axioms. By introducing a new notion, namely the supremum and infimum of a set, we will be able to show that Q is not complete and that R has this supplementary property, called the upper bound property (see 1.5.3).

1.5.1 Properties of the rational numbers

Construction of the rationals. We define Q by an equivalence relation in Z × (Z \ {0}): (p, q) ∼ (p′, q′) if pq′ = p′q (that is, by ordered pairs (p, q) with constant quotient). The privileged representative of the equivalence class (hence of a rational number) is the pair (p, q) with q > 0 and gcd(|p|, q) = 1. The set Q is countable (see the illustration).

Illustration - Q is countable.

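Arithmetics of rationals. Let a, b, c, d ∈ Z, b, d ≠ 0. Then

$$\frac{a}{b} + \frac{c}{d} := \frac{ad + bc}{bd}, \qquad \frac{a}{b} \cdot \frac{c}{d} := \frac{a \cdot c}{b \cdot d}.$$

For the inverse elements:

$$-\left(\frac{a}{b}\right) = \frac{-a}{b} = \frac{a}{-b}$$

and if a ≠ 0:

$$\left(\frac{a}{b}\right)^{-1} = \frac{1}{\frac{a}{b}} = \frac{b}{a}.$$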

The order relation over rationals. Let a, b, c, d ∈ Z with b, d > 0 (a negative denominator can always be made positive by changing the sign of the numerator). Then

$$\frac{a}{b} \le \frac{c}{d} \iff \frac{ad}{bd} \le \frac{bc}{bd} \iff ad \le bc.$$

Let us also note that the set Q is dense in the following sense: between two rationals a < b there is always another rational number, for example the arithmetic mean (a + b)/2, and, consequently, there is an infinite number of rational numbers between a and b. The set Q also satisfies the axiom of Archimedes:

Proposition 1.5.1. For all (x, y) ∈ Q × Q satisfying x > 0 and y ≥ 0 there exists a positive integer n such that nx > y.

Proof. If y = 0 we can choose n = 1 (or any other positive integer). Otherwise let $x := \frac{a}{b} > 0$ and $y := \frac{c}{d} > 0$ with a, b, c, d ∈ Z+. By applying the order relation we have nx > y if and only if nad > bc. This inequality is valid for the choice n = bc + 1 since (bc + 1)ad ≥ bc + 1 > bc.
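The proof is constructive: it gives an explicit n. A minimal sketch of this choice, assuming Python's `fractions.Fraction` (which stores rationals in lowest terms with a positive denominator) and a hypothetical function name `archimedean_n`:

```python
from fractions import Fraction

def archimedean_n(x, y):
    """Return the n = bc + 1 of the proof, for rationals x = a/b > 0 and y = c/d >= 0."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0 or y < 0:
        raise ValueError("need x > 0 and y >= 0")
    if y == 0:
        return 1
    b, c = x.denominator, y.numerator
    return b * c + 1

x, y = Fraction(1, 1000), Fraction(22, 7)
n = archimedean_n(x, y)
print(n, n * x > y)   # 22001 True
```

The n obtained this way is usually far from the smallest possible one; the proof only needs existence.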


Quadratic equations. In order to illustrate the fact that Q is not algebraically closed we recall that the equation x^2 = 2 has no rational solutions.

Proposition 1.5.2. There is no x ∈ Q that satisfies x^2 = 2.

Proof. Let us suppose that there exists x = p/q with p, q ∈ Z+ such that x^2 = 2. We can also suppose that p and q have no common divisor other than 1, that is, that their greatest common divisor is 1: gcd(p, q) = 1. We have

$$\frac{p^2}{q^2} = 2, \quad \text{i.e.} \quad p^2 = 2q^2,$$

and consequently p^2 is even. Hence p is even and there exists an integer p′ such that p = 2p′ (since the square of an odd integer is odd; indeed (2n + 1)^2 = 2(2n^2 + 2n) + 1 is odd). So

$$p^2 = 4p'^2 = 2q^2, \quad \text{i.e.} \quad 2p'^2 = q^2.$$

Hence q must be even as well. This contradicts our original hypothesis that p and q have no common divisor other than 1. Hence there is no rational number x such that x^2 = 2.

1.5.2 Subsets of Q and of R

To describe the supplementary property of R, we need an appropriate language concerning some subsets of an ordered set. In the sequel let A ≠ ∅ be a subset of the ordered set S = Q or S = R.

Upper bound. A is said to be bounded above if there exists b ∈ S such that for all x ∈ A we have x ≤ b. The number b is called an upper bound of A.

Lower bound. A is said to be bounded below if there exists a ∈ S such that for all x ∈ A we have x ≥ a. The number a is called a lower bound of A.

Remark. If A has a lower bound (upper bound) then it has infinitely many, since every smaller (larger) number is again a lower (upper) bound.

Bounded subset. A is said to be bounded if it is both bounded above and bounded below.

Example. Let us consider the set A = {x : 0 ≤ x^2 < 2, x ∈ Q}. The set A is bounded. A lower bound is a = −2 and an upper bound is b = 2 because (−2)^2 = 2^2 = 4 > 2. Another pair (lower bound, upper bound) is (a = −3/2, b = 3/2) because 3^2/2^2 = 9/4 > 2. Our argument is justified thanks to the following inequalities:

if x, y > 0 : x < y ⇔ x^2 < y^2 and if x, y < 0 : x < y ⇔ x^2 > y^2.

Indeed, simply note that x^2 − y^2 = (x + y)(x − y).


Supremum. An upper bound b ∈ S is said to be the supremum of A, written b = sup A, if b is the smallest upper bound, that is if every upper bound b′ of A satisfies b′ ≥ b. If A is not bounded above, we write sup A = +∞ or simply sup A = ∞.⁶

Remark - Uniqueness. If the supremum exists, it is unique (so our terminology is justified). Indeed, let us suppose that there exist two smallest upper bounds b_1 and b_2. We have b_1 ≥ b_2 because b_2 is the smallest upper bound, but also b_2 ≥ b_1 because b_1 is the smallest upper bound, so b_1 = b_2.

Infimum. A lower bound a ∈ S is said to be the infimum of A, written a = inf A, if a is the biggest lower bound, that is if every lower bound a′ of A satisfies a′ ≤ a. If A is not bounded below, we write inf A = −∞.

Remark - Uniqueness. If the infimum exists, it is unique.

Maximum. An upper bound b ∈ S is said to be the maximum of A, written b = max A, if b = sup A and b ∈ A.

Minimum. A lower bound a ∈ S is said to be the minimum of A, written a = min A, if a = inf A and a ∈ A.

Example - A bounded subset of the rationals with no supremum. We will show that the set A = {x : 0 ≤ x^2 < 2, x ∈ Q} has no supremum (and no infimum) in S = Q. In other words, there is no b ∈ Q such that b = sup{x : 0 ≤ x^2 < 2, x ∈ Q}. The proof is by contradiction.

Proposition 1.5.3. The set A = {x : 0 ≤ x^2 < 2, x ∈ Q} has no supremum and no infimum in S = Q.

Proof. We only give the proof for the supremum. Let us then suppose that b = sup A ∈ Q. Of course b > 0. We either have b^2 < 2, or b^2 = 2, or b^2 > 2. The case b^2 = 2 is excluded by Proposition 1.5.2.

If b^2 > 2, then b > 2/b. Let us reason by contradiction. The goal is to build an upper bound x ∈ Q such that x < b. Let $x = \frac{1}{2}\left(b + \frac{2}{b}\right)$, the arithmetic mean of b and 2/b. Obviously we have x ∈ Q. Furthermore, x < b because 2/b < b, or by an explicit computation

$$x - b = \frac{1}{2}\left(\frac{2}{b} - b\right) = \frac{1}{2b}(2 - b^2) < 0.$$

⁶ The symbol ∞ represents infinity and was introduced by John Wallis, De sectionibus conicis nova methodo expositis tractatus, section I, Prop. 1, p. 4 (1655).


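We show that x² > 2. Indeed

\[ x^2 - 2 = \frac{1}{4}\left(\frac{2}{b} + b\right)^2 - 2 = \frac{1}{4}\left(\frac{4}{b^2} + 4 + b^2\right) - 2 = \frac{1}{4}\left(\frac{4}{b^2} - 4 + b^2\right) = \frac{1}{4}\left(\frac{2}{b} - b\right)^2 = (x - b)^2 > 0. \]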

Since x > 0 and x² > 2, every element of A is smaller than x, so x is an upper bound of A. Hence we have found an upper bound x < b = sup A, which contradicts the definition of supremum. The case b² > 2 is then impossible.

If b² < 2, then b < 2/b. We will construct a number y ∈ Q such that y > b and y² < 2. Let $x = \frac{1}{2}\left(b + \frac{2}{b}\right)$ and $y = \frac{2}{x}$. Obviously we have y ∈ Q. The number y satisfies y > b because x < 2/b:

\[ y - b = \frac{2}{x} - b = \frac{1}{x}\,(2 - bx) = \frac{1}{2x}\,(2 - b^2) > 0. \]

Moreover, y² < 2, i.e. y ∈ A, since

\[ y^2 - 2 = \frac{2}{x^2}\,(2 - x^2) = -\frac{2}{x^2}\,(x - b)^2 < 0. \]

Consequently, we have found y ∈ A with y > b = sup A, contradicting the definition of supremum.
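The two constructions used in the proof can be checked with exact rational arithmetic. The following is only an illustrative sketch: the function names `smaller_upper_bound` and `larger_element` and the sample values 3/2 and 7/5 are assumptions chosen for the illustration, not part of the course text.

```python
from fractions import Fraction


def smaller_upper_bound(b: Fraction) -> Fraction:
    """For a rational b > 0 with b*b > 2, return x = (b + 2/b)/2.

    According to the proof, x is rational, x < b and x*x > 2.
    """
    return (b + 2 / b) / 2


def larger_element(b: Fraction) -> Fraction:
    """For a rational b > 0 with b*b < 2, return y = 2/x with x = (b + 2/b)/2.

    According to the proof, y is rational, y > b and y*y < 2.
    """
    x = (b + 2 / b) / 2
    return 2 / x


# Case b^2 > 2: a candidate supremum that is too large.
b_high = Fraction(3, 2)
x = smaller_upper_bound(b_high)

# Case b^2 < 2: a candidate supremum that is too small.
b_low = Fraction(7, 5)
y = larger_element(b_low)

print(x, y)
```

Applying either function repeatedly produces rationals that approach the gap at √2 from above or from below, which illustrates why no rational number can be the supremum.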

1.5.3 Properties of real numbers

Upper bound axiom. The set R is complete, i.e. any subset A ≠ ∅ that is bounded above has a supremum.

Remark. This axiom implies that any subset of real numbers A ≠ ∅ that is bounded below has an infimum because

inf A = −sup {−x : x ∈ A}.

We only use the upper bound property to characterize real numbers. The construction of real numbers related to this property was proposed by Richard Dedekind⁷. Let us also remark that R is, up to isomorphism, the only totally ordered field that satisfies the upper bound property. The upper bound axiom implies that the equation x² = 2 (and hence any quadratic equation that can be transformed to the form x² = r, r > 0) has real solutions.

Proposition 1.5.4. The equation x² = 2 has two solutions in R, written $\sqrt{2}$ and $-\sqrt{2}$.

Proof. We only prove the existence of the positive solution $\sqrt{2}$. Let us consider the set A = {x : 0 ≤ x² < 2, x ∈ R}. The set A is bounded, so there exists b = sup A ∈ R and b > 0. As in Proposition 1.5.3, we can exclude the cases b² > 2 and b² < 2. Hence b² = 2.

⁷ Richard Dedekind (1831 - 1916), German mathematician


Consequently, Q ⊂ R and the elements in R \ Q, the complement of Q in R, are the irrational numbers. The upper bound axiom implies that the set R is Archimedean:

Proposition 1.5.5. - R is Archimedean. For all (x, y) ∈ R × R satisfying x > 0 and y ≥ 0, there exists a positive integer n such that nx > y.

Proof. If y = 0 take n = 1. Let y > 0. Let us suppose that there exists no positive integer n such that nx > y. Consequently the non-empty subset X of real numbers given by X := {nx : n ∈ Z+} is bounded above (an upper bound is y). By the upper bound property, sup X exists in R and (n + 1)x ≤ sup X for all n ∈ Z+, hence nx ≤ sup X − x for all n ∈ Z+. It follows that sup X − x is an upper bound for X which is strictly smaller than sup X, hence the contradiction.

Remark - A characterization of the real number 0. Archimedes' axiom implies that if a is a real number such that $0 \le a < \frac{1}{n}$ for all positive integers n, then a = 0.

Floor function. Consequently, we can associate to any real number x aunique integer [x], called the floor of x, such that

[x] ≤ x < [x] + 1.

Proposition 1.5.6. Q is dense in R, that is, there always exists a rational number between two real numbers a < b.

Proof. Thanks to Archimedes' axiom there exists a positive integer n such that n(b − a) > 1 (choose x = b − a and y = 1); consequently $b > \frac{na+1}{n}$. Let us take $r := \frac{[na+1]}{n}$. Obviously r ∈ Q and we have the following chain of inequalities:

\[ b > \frac{na+1}{n} \ge \frac{[na+1]}{n} = r = \frac{[na]+1}{n} > \frac{na}{n} = a. \]
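The proof is constructive, so it can be turned into a small procedure. The sketch below is a minimal illustration, assuming that a and b are given as exact `Fraction` values so that the floor [na] is computed without rounding; the name `rational_between` is hypothetical.

```python
from fractions import Fraction
from math import floor


def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Return r = ([n*a] + 1)/n with a < r < b, following the density proof."""
    if not a < b:
        raise ValueError("expected a < b")
    # Archimedes: pick a positive integer n with n*(b - a) > 1.
    n = floor(1 / (b - a)) + 1
    # r = ([na] + 1)/n, where [.] is the floor function.
    return Fraction(floor(n * a) + 1, n)


r_pos = rational_between(Fraction(1, 3), Fraction(1, 2))
r_neg = rational_between(Fraction(-1, 2), Fraction(-1, 3))
print(r_pos, r_neg)
```

The choice n = [1/(b − a)] + 1 is just one convenient way to satisfy n(b − a) > 1; any larger n works as well.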

Proposition 1.5.7. R is not countable.

Proof. The proof is by contradiction and uses an argument called Cantor's diagonal argument⁸. We suppose that R is countable. Note that R is countable if and only if ]0, 1[ is countable because the function f : ]0, 1[ → R given by

\[ f(x) = \frac{x - \frac{1}{2}}{x(x-1)} \]

is bijective. Thus there exists a bijective function g : N → ]0, 1[ such that $g(n) = 0.x_{n0}x_{n1}x_{n2}\ldots$ with $x_{nj} \in \{0, 1, 2, \ldots, 9\}$ (decimal expansion). We define $y_n$, n ∈ N, by

\[ y_n = \begin{cases} 1 & \text{if } x_{nn} \ne 1, \\ 2 & \text{if } x_{nn} = 1. \end{cases} \]

Then the number $0.y_0y_1y_2\ldots$ differs from g(n) in the digit of index n for every n ∈ N, so it is not represented, which yields a contradiction.
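The diagonal construction can be illustrated on a finite list of digit strings. This is only a sketch of the mechanism: the real argument works with infinite decimal expansions, and the name `diagonal_digits` as well as the sample rows are assumptions made for the illustration.

```python
def diagonal_digits(rows):
    """Build the digits y_n from the rule y_n = 2 if x_nn = 1, else y_n = 1.

    rows[n] is the digit string x_n0 x_n1 x_n2 ... of the n-th listed number;
    each row must contain at least len(rows) digits.
    """
    return "".join("2" if rows[n][n] == "1" else "1" for n in range(len(rows)))


rows = ["1415", "7082", "3313", "0009"]
d = diagonal_digits(rows)
print("0." + d)
```

By construction the new digit string differs from row n in position n, so it cannot coincide with any row of the list.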

⁸ Georg Cantor (1845 - 1918), German mathematician


1.5.4 Intervals

Bounded intervals. An interval is a subset A ≠ ∅ of R which contains all the numbers between inf A and sup A. For bounded intervals there are four alternatives, according to whether inf A and sup A belong to A or not. Let −∞ < a < b < +∞.

Open interval. ]a, b[= {x ∈ R : a < x < b}.

Closed interval. [a, b] = {x ∈ R : a ≤ x ≤ b}.

Left-open interval. ]a, b] = {x ∈ R : a < x ≤ b}.

Right-open interval. [a, b[= {x ∈ R : a ≤ x < b}.

Unbounded intervals.

Open interval. ]−∞, b[= {x ∈ R : x < b}, ]b,∞[= {x ∈ R : x > b}.

Closed interval. ]−∞, b] = {x ∈ R : x ≤ b}, [b,∞[= {x ∈ R : x ≥ b}.

1.5.5 Subsets of R.

An interval A is open if and only if inf A ∉ A and sup A ∉ A. An open interval possesses the property that for all x ∈ A there exists an interval A′ = ]a′, b′[ such that x ∈ A′ and A′ ⊂ A. This property will be useful later in order to characterize any open subset:

Open and closed subsets. A set A is said to be open if for all x ∈ A there exists an interval A′ = ]a′, b′[ such that x ∈ A′ and A′ ⊂ A. Note that according to this definition, the interval ]a, b[ is an open set. A set A is said to be closed if A is the complement (in R) of an open set. For example the closed interval [a, b] is the complement of the open set ]−∞, a[ ∪ ]b, ∞[. The set R (as well as the empty set ∅) is both open and closed. Finite unions and finite intersections of open (closed) sets are open (closed). Arbitrary unions of open sets are open, and arbitrary intersections of closed sets are closed.

The interior and the boundary of a set. Let E ⊂ R and a ∈ E. We say that a is in the interior of E if there exists an open interval ]a − ε, a + ε[, ε > 0, such that ]a − ε, a + ε[ ⊂ E. The set of interior points of E is called the interior of E and is written $\mathring{E}$. A point a ∈ E is said to be isolated if there exists an interval ]a − ε, a + ε[, ε > 0, such that ]a − ε, a + ε[ ∩ E = {a}. A point a ∈ R is called a boundary point of E if any open interval ]a − ε, a + ε[, ε > 0, contains points of E and points of R \ E. The set of boundary points of E is called the boundary of E and is written ∂E. The boundary of a bounded interval I (open, closed, left-open, right-open) is given by {inf I, sup I}, the interior is ]inf I, sup I[. The real axis is equal to its interior. For the other unbounded intervals, the boundary consists of a single point (inf I or sup I).


Example - Finite sets. Let E ⊂ R be a finite set, that is E = {x0, . . . , xN}, N ∈ N, and without loss of generality x0 < . . . < xN. Thus E is closed because its complement in R is a union of open intervals:

R \ E = ]−∞, x0[ ∪ ]x0, x1[ ∪ . . . ∪ ]xN−1, xN[ ∪ ]xN, ∞[.

Note that E = ∂E and $\mathring{E} = \emptyset$.

Closed and bounded sets. Let E ⊂ R be non-empty, bounded and closed. Then inf E ∈ E and sup E ∈ E, that is inf E = min E and sup E = max E. In other words, E has a smallest and a biggest element. This property will be of interest for us for the study of functions. We are looking, for example, for simple criteria to guarantee that the image of a function is bounded and closed (see "continuous functions", Ch. 3). To prove that sup E ∈ E, we suppose that sup E ∉ E, hence sup E ∈ Eᶜ. The set Eᶜ is open, so there exists an open interval ]a, b[ such that sup E ∈ ]a, b[ and ]a, b[ ⊂ Eᶜ. This is a contradiction: by the definition of the supremum, any interval ]a, b[ containing the real number sup E must contain some elements of E. Since E is closed, we have $E = \mathring{E} \cup \partial E$ (see the proposition below).

Example - an infinity of points. Let $E = \{\frac{1}{n} : n \in \mathbb{N}^*\}$. E is not open because any interval around a point in E also contains points in R \ E. The set E is not closed because its complement is not open. In fact, any open interval ]−ε, ε[, ε > 0, around the point 0 ∈ R \ E contains points in E. The set $\bar{E} = E \cup \{0\}$ is closed. In fact, let $x \in \mathbb{R} \setminus \bar{E}$. If x < 0, then $]-\infty, 0[ \subset (\mathbb{R} \setminus \bar{E})$. If x > 1 take, for example, the interval ]1, ∞[. If 0 < x < 1 there exists a unique n ∈ N* such that $\frac{1}{n+1} < x < \frac{1}{n}$. Take the interval $]\frac{1}{n+1}, \frac{1}{n}[$. The interior of E is empty and ∂E = E ∪ {0}.

The closure of a set. Let E ⊂ R and a ∈ R. We say that a is a closure point of E if for any interval ]a − r, a + r[, r > 0:

]a − r, a + r[ ∩ E ≠ ∅.

The set of closure points of E is called the closure of E and is written $\bar{E}$.

The limit points of a set. Let E ⊂ R and a ∈ R. We say that a is a limit point of E if any interval ]a − r, a + r[, r > 0, contains an x ∈ E such that x ≠ a:

(]a − r, a + r[ \ {a}) ∩ E ≠ ∅.

This condition is equivalent to ]a − r, a + r[ ∩ (E \ {a}) ≠ ∅. If a is a limit point of E, then a is a closure point of E. An isolated point in E is not a limit point.


Table: Classification of points relative to a set E ⊂ R

Interior point of E    ]a − r, a + r[ ⊂ E for some r > 0; we write $a \in \mathring{E}$
Isolated point of E    ]a − r, a + r[ ∩ E = {a} for some r > 0
Boundary point of E    ]a − r, a + r[ ∩ E ≠ ∅ and ]a − r, a + r[ ∩ (R \ E) ≠ ∅ for all r > 0
Closure point of E     ]a − r, a + r[ ∩ E ≠ ∅ for all r > 0; we write $a \in \bar{E}$
Limit point of E       (]a − r, a + r[ \ {a}) ∩ E ≠ ∅ for all r > 0

The next proposition follows directly from the definitions:

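Proposition 1.5.8. Let E ⊂ R.

1. $\mathring{E} \subset E \subset \bar{E}$.

2. $\bar{E} = \mathring{E} \cup \partial E$.

3. E is open if and only if $E = \mathring{E}$.

4. E is closed if and only if $E = \bar{E}$.

5. $\partial E = \bar{E} \cap \overline{\mathbb{R} \setminus E}$.

6. The closure of E is the disjoint union of the set of limit points and the set of isolated points in E.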

1.5.6 Absolute value.

To each real number x, we can associate the non-negative real number defined by

\[ |x| = \begin{cases} x & \text{if } x > 0 \\ -x & \text{otherwise} \end{cases} \tag{1.6} \]

and |x| is called the absolute value of x. Note that |x| = max(x, −x) where max(x, y) denotes the maximum of x and y. The definition (1.6) is equivalent to $|x| = \sqrt{x^2}$.

Properties. For x, y ∈ R we have

1. Positivity: |x| ≥ 0 and |x| = 0⇔ x = 0


2. Homogeneity: |yx| = |y||x|

3. Triangle inequality: |x+ y| ≤ |x|+ |y|

4. If y ≠ 0, $\left|\frac{x}{y}\right| = \frac{|x|}{|y|}$

5. $\big||x| - |y|\big| \le |x - y|$

6. Let r > 0, a ∈ R. |x− a| < r ⇔ −r < x− a < r and |x− a| ≤ r ⇔ −r ≤x− a ≤ r. In other words:

]a−r, a+r[= {x ∈ R : |x−a| < r}, [a−r, a+r] = {x ∈ R : |x−a| ≤ r}

7. Let r > 0, a ∈ R. |x− a| > r ⇔ x < a− r or x > a+ r and |x− a| ≥ r ⇔x ≤ a− r or x ≥ a+ r. In other words:

]−∞, a− r[∪]a+ r,∞[= {x ∈ R : |x− a| > r},]−∞, a− r] ∪ [a+ r,∞[= {x ∈ R : |x− a| ≥ r}

8. If |x| < ε for all ε > 0 then x = 0

Remark. We can interpret the absolute value as a function with values in R+. Properties 1, 2 and 3 are the properties of a norm over a vector space. Properties 4 and 5 are consequences of 2 and 3. Properties 6, 7 and 8 follow directly from the definition.

Remark. Property 8 is a consequence of positivity and gives us an important characterization of the number 0:

x = 0 ⇔ |x| < ε for all ε > 0.

The implication ⇒ is obvious. The implication ⇐ means that if |x| is arbitrarily small, then x = 0. It is particularly important for the concept of limit described in the next chapter.

Remark. For x, y ∈ R, we can define the distance d(x, y) between two numbers x and y by d(x, y) = |x − y|. The distance d(x, y) satisfies the following properties:

1. Positivity: d(x, y) ≥ 0 and d(x, y) = 0⇔ x = y

2. Symmetry: d(x, y) = d(y, x)

3. Triangle inequality: d(x, y) ≤ d(x, z) + d(z, y) for all z ∈ R.

The identities ”min-max” of the absolute value. For all x, y ∈ R:

|x+ y|+ |x− y| = |x|+ |y|+ ||x| − |y|| = 2 max(|x|, |y|). (1.7)

The proof is left to the reader as an exercise. By replacing x by x + y and y by x − y in equation (1.7) we find

||x+ y| − |x− y|| = |x|+ |y| − ||x| − |y|| = 2 min(|x|, |y|). (1.8)
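Before proving the identities (1.7) and (1.8), it can be reassuring to test them numerically. The following sketch checks both identities on a grid of integers, where the arithmetic is exact; the helper name `minmax_identities_hold` is an assumption made for this illustration and a finite check is of course not a proof.

```python
def minmax_identities_hold(x, y):
    """Check the identities (1.7) and (1.8) for one pair (x, y)."""
    big = 2 * max(abs(x), abs(y))
    small = 2 * min(abs(x), abs(y))
    # (1.7): |x+y| + |x-y| = |x| + |y| + ||x| - |y|| = 2 max(|x|, |y|)
    ok_17 = (abs(x + y) + abs(x - y) == big
             and abs(x) + abs(y) + abs(abs(x) - abs(y)) == big)
    # (1.8): ||x+y| - |x-y|| = |x| + |y| - ||x| - |y|| = 2 min(|x|, |y|)
    ok_18 = (abs(abs(x + y) - abs(x - y)) == small
             and abs(x) + abs(y) - abs(abs(x) - abs(y)) == small)
    return ok_17 and ok_18


all_ok = all(minmax_identities_hold(x, y)
             for x in range(-6, 7) for y in range(-6, 7))
print(all_ok)
```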


1.6 Some real-valued functions.

We give a first list of functions f : R→ R. For polynomial and rational functionssee ”Savoir faire en mathematiques, Y. Biollay, A. Chaabouni, J.St.”.

1.6.1 Monotonic functions.

Let E and F be non-empty subsets of R and f : E → F a real function. Let x, x1, x2 ∈ E.

Increasing function. A function f is said to be increasing over E if x1 < x2 implies f(x1) ≤ f(x2).

Strictly increasing function. A function f is said to be strictly increasing if x1 < x2 implies f(x1) < f(x2).

Decreasing function. A function f is said to be decreasing if x1 < x2 implies f(x1) ≥ f(x2).

Strictly decreasing function. A function f is said to be strictly decreasing if x1 < x2 implies f(x1) > f(x2).

Example.

1. The function f : R → R given by f(x) = x³ is strictly increasing because

\[ x_2^3 - x_1^3 = (x_2 - x_1)(x_1^2 + x_1x_2 + x_2^2) = (x_2 - x_1)\left(\frac{x_1^2}{2} + \frac{x_2^2}{2} + \frac{(x_1 + x_2)^2}{2}\right) > 0 \]

if x1 < x2.

2. The function f : Q → R defined by $f(x) = 2^x$ is strictly increasing (exercise).

3. Let f : E → F be a strictly decreasing bijective function. Then its inverse function f⁻¹ : F → E is strictly decreasing.

1.6.2 Functions defined by parts

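The absolute value function is defined by

\[ |x| = \begin{cases} x & \text{if } x > 0 \\ -x & \text{otherwise.} \end{cases} \]

The sign function is defined by

\[ \operatorname{sign}(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases} \]

The Heaviside function⁹ is defined by

\[ \operatorname{Heaviside}(x) = H(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases} \]

The Gaussian step function (or floor function) is defined by

G(x) = [x]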
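Functions defined by parts translate directly into conditional expressions. The sketch below mirrors the four definitions above; the Python function names are assumptions chosen for the illustration, and the Heaviside function follows the convention of this text, H(0) = 0.

```python
import math


def absolute_value(x):
    """|x| = x if x > 0, and -x otherwise."""
    return x if x > 0 else -x


def sign(x):
    """sign(x) = 1, 0 or -1 for x > 0, x = 0, x < 0 respectively."""
    if x > 0:
        return 1
    if x == 0:
        return 0
    return -1


def heaviside(x):
    """H(x) = 1 if x > 0 and 0 if x <= 0 (convention of this text)."""
    return 1 if x > 0 else 0


def gauss_floor(x):
    """G(x) = [x], the unique integer with [x] <= x < [x] + 1."""
    return math.floor(x)


print(absolute_value(-2.5), sign(-2.5), heaviside(0), gauss_floor(-1.5))
```

Note that the floor of a negative non-integer is rounded towards −∞, which differs from truncating the decimal part.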

1.6.3 Trigonometric functions.

The trigonometric functions are defined by the unit circle (see "Savoir faire en mathematiques, Y. Biollay, A. Chaabouni, J.St.").

[Figure: graphs of sin and cos]

sin : ]−∞, ∞[ → [−1, 1], cos : ]−∞, ∞[ → [−1, 1]

[Figure: graphs of tan and cotan]

$\tan : ]\frac{\pi}{2} + k\pi, \frac{\pi}{2} + (k+1)\pi[ \to ]-\infty, \infty[$, $\cot : ]k\pi, (k+1)\pi[ \to ]-\infty, \infty[$, k ∈ Z.

9Named after Oliver Heaviside (1850 - 1925), an English engineer and physicist


Important formulas.

sin²α + cos²α = 1

sin(α + β) = sin α cos β + cos α sin β

cos(α + β) = cos α cos β − sin α sin β

1.7 Introduction to complex numbers

The equation x² = −1 has no real solutions. In other words, the square root is not defined for all real numbers. Later we will see that we need square roots of negative real numbers to solve third-order equations even in the case where these equations have three real solutions. In order to eliminate this constraint, we must extend the set of real numbers. A possible approach consists in introducing the symbol $i = \sqrt{-1}$ and defining a complex number z as a sum of the form z = x + iy where x and y are real numbers.

1.7.1 The field C

Complex numbers. We denote by C the set of complex numbers, whose elements are all the expressions of the form z = x + iy where x, y ∈ R and i² = −1:

C = {z = x + iy : (x, y) ∈ R × R, i² = −1}.

Addition and multiplication. The set C is provided with an addition and a multiplication: if z1 = x1 + iy1 and z2 = x2 + iy2, then

z1 + z2 = (x1 + x2) + i(y1 + y2)

z1 · z2 = (x1x2 − y1y2) + i(x1y2 + x2y1).

Remark. The product of two complex numbers is obtained by applying the distributive rule:

(x1 + iy1) · (x2 + iy2) = x1x2 + i(x1y2 + x2y1) + y1y2 i² = (x1x2 − y1y2) + i(x1y2 + x2y1).

Example.

(2 + 7i) · (5 + 3i) = 2 · 5 + (2 · 3 + 7 · 5)i + 7 · 3 i² = −11 + 41i
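The multiplication rule can be checked against a built-in complex type. The sketch below represents a complex number as a pair (x, y) of real and imaginary parts; the helper names `cadd` and `cmul` are assumptions for this illustration, and the comparison uses Python's own `complex` type.

```python
def cadd(z1, z2):
    """(x1 + i y1) + (x2 + i y2) = (x1 + x2) + i (y1 + y2)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 + x2, y1 + y2)


def cmul(z1, z2):
    """(x1 + i y1)(x2 + i y2) = (x1 x2 - y1 y2) + i (x1 y2 + x2 y1)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)


product = cmul((2, 7), (5, 3))           # the example from the text
builtin = complex(2, 7) * complex(5, 3)  # same product with the built-in type
print(product, builtin)
```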

[Figure: addition of the complex numbers z1 = x1 + iy1 and z2 in the plane, with z = z1 + z2]

Proposition 1.7.1. The set C is a field.


Powers of i. i² = −1, i³ = −i, i⁴ = 1, i⁵ = i.

1.7.2 Modulus and complex conjugate.

Real and imaginary part. Let z = x + iy. The real number x is called the real part of z and we denote it by x = ℜ(z) or x = Re z, whereas the real number y is called the imaginary part of z and is written y = ℑ(z) or y = Im z. If y = 0, z is real. If x = 0 and y ≠ 0, we say that z is imaginary. Let us note that

ℜ(z) = ℑ(z) = 0 ⇔ z = 0.

Modulus. The real number $|z| = \sqrt{x^2 + y^2}$ is called the modulus of z. If z is real the modulus of z is equal to its absolute value.

Complex conjugates. The numbers z = x + iy and $\bar{z} = x - iy$ are called complex conjugates.

Properties of complex conjugation. For z, z1, z2 ∈ C, we have

1. $\bar{\bar{z}} = z$

2. $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$

3. $\overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2$

4. If z2 ≠ 0, $\overline{\left(\frac{z_1}{z_2}\right)} = \frac{\bar{z}_1}{\bar{z}_2}$

5. $z \cdot \bar{z} = |z|^2$ and $|\bar{z}| = |z|$

6. If z ≠ 0, $z^{-1} = \frac{1}{z} = \frac{\bar{z}}{|z|^2}$

7. $\Re(z) = \frac{z + \bar{z}}{2}$, $\Im(z) = \frac{z - \bar{z}}{2i}$.

[Figure: The unit circle, z, $\bar{z}$ and $\frac{1}{z}$]


Properties of the modulus. For z, z1, z2 ∈ C, we have

1. Positivity: |z| ≥ 0 and |z| = 0⇔ z = 0

2. Homogeneity: |z1z2| = |z1||z2|

3. Triangle inequality: |z1 + z2| ≤ |z1|+ |z2|

4. If z2 ≠ 0, $\left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}$

5. $\big||z_1| - |z_2|\big| \le |z_1 - z_2|$

6. If |z| < ε for all ε > 0 then z = 0.

Distance. Starting from the modulus we can define the distance d(z1, z2) of two complex numbers z1 and z2 by d(z1, z2) = |z1 − z2|. As for the distance between two real numbers, the distance d(z1, z2) satisfies the three following properties.

1. Positivity: d(z1, z2) ≥ 0 and d(z1, z2) = 0⇔ z1 = z2

2. Symmetry: d(z1, z2) = d(z2, z1)

3. Triangle inequality: d(z1, z2) ≤ d(z1, z) + d(z, z2) for all z ∈ C.

1.7.3 Representation of complex numbers and polar form

The complex plane. We can represent a complex number z = x + iy in the plane R² by the vector joining the origin to the point (x, y). The x-axis represents the real numbers and the y-axis represents the imaginary numbers. This representation allows us to give a geometric interpretation of complex numbers. For example, the modulus of a complex number is the distance from the point (x, y) to the origin (0, 0). Addition of two complex numbers corresponds to the addition of two vectors. In order to interpret the product of two complex numbers, we go from cartesian coordinates to polar coordinates in the plane.

Polar coordinates. Let z ≠ 0. Then r = |z| ≠ 0 and $\frac{z}{r}$ corresponds to a unique point of the unit circle (i.e. the circle of radius 1 and center (0, 0)). Thus there exists a unique θ ∈ [0, 2π[ such that

\[ \begin{cases} x = r\cos\theta \\ y = r\sin\theta \end{cases} \quad\text{or}\quad z = r(\cos\theta + i\sin\theta). \tag{1.9} \]

θ is called the argument of z and we denote it as θ = arg z. Let us underline that the argument of z is only defined up to an integer multiple of 2π, i.e. up to 2kπ with k ∈ Z. If x ≠ 0 we always have $\tan\theta = \frac{y}{x}$, but only if x > 0 and y ≥ 0 do we have $\theta = \tan^{-1}\frac{y}{x} = \arctan\frac{y}{x}$.

[Figure: polar representation of z = x + iy in the complex plane, with modulus r and argument θ]


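To sum up, equation (1.9) defines a bijective function ]0, ∞[ × [0, 2π[ → C \ {0}. In Complex Analysis (see Analysis 4), we consider equation (1.9) over the domain ]0, ∞[ × ]−π, π], which significantly simplifies the representation of the angle θ thanks to the half-angle formula $\tan\frac{\theta}{2} = \frac{\sin\theta}{1 + \cos\theta}$ (see also "Savoir-Faire en Maths", p. 129). Indeed, with sin θ = y/r and cos θ = x/r we obtain

\[ \theta = \begin{cases} 2\arctan\dfrac{y}{x + r} & \text{if } (x, y) \notin\, ]-\infty, 0] \times \{0\}, \\[1ex] \pi & \text{if } (x, y) \in\, ]-\infty, 0[ \times \{0\}. \end{cases} \]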
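As a plausibility check, the half-angle representation can be compared with a library routine. The sketch below assumes the formula θ = 2 arctan(y/(x + r)), obtained from tan(θ/2) = sin θ/(1 + cos θ) with sin θ = y/r and cos θ = x/r, and the value θ = π on the negative real axis; the function name `principal_argument` is hypothetical, and the reference values come from Python's `math.atan2`.

```python
import math


def principal_argument(x, y):
    """Argument of z = x + iy in ]-pi, pi] via theta = 2*arctan(y / (x + r))."""
    r = math.hypot(x, y)
    if r == 0:
        raise ValueError("the argument of 0 is not defined")
    if y == 0 and x < 0:
        # Negative real axis: x + r = 0, so the formula does not apply.
        return math.pi
    return 2 * math.atan(y / (x + r))


points = [(1, 0), (0, 1), (-1, 0), (-1, -1), (0, -1), (3, 4)]
angles = [principal_argument(x, y) for x, y in points]
print(angles)
```

Unlike arctan(y/x), this expression needs no case distinction on the quadrant; only the negative real axis is treated separately.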

Euler formula. For θ ∈ R we define the exponential of an imaginary number by:

\[ e^{i\theta} = \cos\theta + i\sin\theta. \]

Using the Euler formula, any complex number z can be written in the polar form

\[ z = re^{i\theta} = r(\cos\theta + i\sin\theta) \]

where r = |z| and θ = arg z. We will prove in Chapter 3 that the base e is Euler's number, i.e. e = 2.71828 . . ., and we will give a rigorous proof of the link between the exponential function and trigonometric functions in Chapter 5. By applying the addition formulas for sine and cosine, we only show here that the exponential defined by Euler's formula satisfies the usual properties.

Proposition 1.7.2. Let α, β ∈ R. Then

\[ e^{i(\alpha+\beta)} = e^{i\alpha}e^{i\beta}. \]

Proof. We easily compute

\[ e^{i\alpha}e^{i\beta} = (\cos\alpha\cos\beta - \sin\alpha\sin\beta) + i(\cos\alpha\sin\beta + \sin\alpha\cos\beta) = \cos(\alpha+\beta) + i\sin(\alpha+\beta). \]

By using $e^{-i\theta} = \cos\theta - i\sin\theta$ we can represent sin θ and cos θ in terms of the exponential.

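Proposition 1.7.3. Let θ ∈ R. Then

\[ \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} \quad\text{and}\quad \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}. \]

Particular values. Let n be an integer.

\[ e^{2n\pi i} = 1, \qquad e^{(2n+1)\pi i} = -1, \]

\[ e^{\frac{(4n+1)\pi i}{2}} = e^{\frac{i\pi}{2}} = i, \qquad e^{\frac{(4n+3)\pi i}{2}} = e^{-\frac{i\pi}{2}} = -i. \]

De Moivre formula. For all integers n and all θ ∈ R:

\[ \left(e^{i\theta}\right)^n = (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta = e^{in\theta}. \]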

Computing in polar representation. Let $z_1 = r_1e^{i\theta_1}$, $z_2 = r_2e^{i\theta_2}$, $z = re^{i\theta}$.

1. $\bar{z} = re^{-i\theta}$

2. If z ≠ 0, $\frac{1}{z} = \frac{1}{r}e^{-i\theta}$

3. $z_1 \cdot z_2 = r_1r_2e^{i(\theta_1+\theta_2)}$

4. arg(z1 · z2) = arg z1 + arg z2 + 2kπ, k ∈ Z.

Hence multiplying by a complex number $z = re^{i\theta} \ne 0$ corresponds to a homothety centered at the origin with ratio r, followed by a rotation about the origin through the angle θ.

1.7.4 Roots of a complex number

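Proposition 1.7.4. Let s > 0, β ∈ R and n a positive integer. The equation

\[ z^n = se^{i\beta} \]

has n distinct solutions of the form

\[ z = \sqrt[n]{s} \cdot e^{i\theta} \quad\text{where}\quad \theta = \frac{\beta + 2k\pi}{n}, \quad k = 0, 1, \ldots, n-1. \]

Square root. Let z = x + iy. If y > 0

\[ \sqrt{z} = \sqrt{\frac{x + \sqrt{x^2 + y^2}}{2}} + i\sqrt{\frac{\sqrt{x^2 + y^2} - x}{2}} \]

and if y < 0

\[ \sqrt{z} = \sqrt{\frac{x + \sqrt{x^2 + y^2}}{2}} - i\sqrt{\frac{\sqrt{x^2 + y^2} - x}{2}}. \]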
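The root formula of Proposition 1.7.4 is easy to evaluate numerically. The following sketch lists the n roots of zⁿ = s e^{iβ} in floating-point arithmetic using Python's standard `cmath.rect`; the function name `nth_roots` and the sample equation z³ = −8 are assumptions made for the illustration.

```python
import cmath
import math


def nth_roots(s, beta, n):
    """Return the n solutions of z**n = s*exp(i*beta), for s > 0 and n >= 1.

    Each root has modulus s**(1/n) and argument (beta + 2*k*pi)/n, k = 0..n-1.
    """
    radius = s ** (1.0 / n)
    return [cmath.rect(radius, (beta + 2 * math.pi * k) / n) for k in range(n)]


# z^3 = -8 = 8 e^{i pi}: three roots on the circle of radius 2.
roots = nth_roots(8, math.pi, 3)
for z in roots:
    print(z, z ** 3)
```

The roots are the vertices of a regular n-gon centered at the origin; here the root with k = 1 is the real root −2.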

1.8 Solving equations

Equations of order n. We consider an equation of the form

\[ a_nz^n + a_{n-1}z^{n-1} + \ldots + a_1z + a_0 = 0 \]

for z ∈ C or z ∈ R. The coefficients a0, . . . , an are complex numbers. If an ≠ 0, we call this equation an equation of order n. In this case we can transform it to the normal form by letting $b_k = \frac{a_k}{a_n}$ for all k = 0, . . . , n:

\[ z^n + b_{n-1}z^{n-1} + \ldots + b_1z + b_0 = 0. \]

We have an equation with real coefficients if the $b_k$ are real. We can prove that this equation always has at least one complex root (see Analysis 4).


1.8.1 Equations of order two

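Normal form. We consider the equation

\[ z^2 + pz + q = 0 \]

where p, q ∈ C. We find the solution by completing the square:

\[ \left(z + \frac{p}{2}\right)^2 = \left(\frac{p}{2}\right)^2 - q. \]

We obtain two roots given by:

\[ z_1 = -\frac{p}{2} + \sqrt{\left(\frac{p}{2}\right)^2 - q}, \qquad z_2 = -\frac{p}{2} - \sqrt{\left(\frac{p}{2}\right)^2 - q}. \]

In particular, if p, q ∈ R, we have the three following cases:

z² + pz + q = 0, p, q ∈ R, with $z_{1,2} = -\frac{p}{2} \pm \sqrt{\left(\frac{p}{2}\right)^2 - q}$

$\left(\frac{p}{2}\right)^2 - q > 0$    two real roots
$\left(\frac{p}{2}\right)^2 - q = 0$    a double real root
$\left(\frac{p}{2}\right)^2 - q < 0$    two complex conjugate roots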

Relations between roots and coefficients - Viète formulas. The roots z1, z2 satisfy the formulas of Viète¹⁰

z1 + z2 = −p, z1z2 = q.
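The solution formula and the relations between roots and coefficients can be checked together. The sketch below is a minimal illustration that uses `cmath.sqrt` so that negative and complex discriminants are handled uniformly; the function name `solve_normal_quadratic` and the sample coefficients are assumptions.

```python
import cmath


def solve_normal_quadratic(p, q):
    """Roots of z**2 + p*z + q = 0 obtained by completing the square."""
    root = cmath.sqrt((p / 2) ** 2 - q)
    return -p / 2 + root, -p / 2 - root


# (p/2)^2 - q = -4 < 0: two complex conjugate roots.
z1, z2 = solve_normal_quadratic(2, 5)
# (p/2)^2 - q = 1/4 > 0: two real roots.
w1, w2 = solve_normal_quadratic(-3, 2)

print(z1, z2, z1 + z2, z1 * z2)
print(w1, w2, w1 + w2, w1 * w2)
```

In both cases the sum of the roots equals −p and their product equals q, as stated above.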

1.8.2 A few general results

Order reduction. If we know a root z1 of the equation

\[ z^n + b_{n-1}z^{n-1} + \ldots + b_1z + b_0 = 0 \]

we can reduce the order of the equation (exercise):

\[ \frac{z^n + b_{n-1}z^{n-1} + \ldots + b_1z + b_0}{z - z_1} = z^{n-1} + c_{n-2}z^{n-2} + \ldots + c_1z + c_0. \]

Then, we determine the solutions of

\[ z^{n-1} + c_{n-2}z^{n-2} + \ldots + c_1z + c_0 = 0. \]

Equations with real coefficients. We consider the equation

\[ z^n + b_{n-1}z^{n-1} + \ldots + b_1z + b_0 = 0 \]

with coefficients $b_k \in \mathbb{R}$. If z1 is a root of this equation, then the complex conjugate $\bar{z}_1$ is also a root.

10Francois Viete or Franciscus Vieta (1540 - 1603) was a French mathematician and lawyer


Example. We can verify that z1 = i is a root of the equation

\[ 6z^4 - z^3 + 5z^2 - z - 1 = 0. \]

Consequently z2 = −i is another root and (z + i)(z − i) = z² + 1 divides 6z⁴ − z³ + 5z² − z − 1. Indeed,

\[ 6z^4 - z^3 + 5z^2 - z - 1 = 6(z^2 + 1)\left(z^2 - \frac{1}{6}z - \frac{1}{6}\right) = 6(z^2 + 1)\left(z - \frac{1}{2}\right)\left(z + \frac{1}{3}\right). \]
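The order reduction in this example can be reproduced with synthetic division (Horner's scheme), one standard way to divide a polynomial by z − z1. The sketch below is only an illustration; the function name `deflate` and the convention of listing coefficients from the highest degree down are assumptions.

```python
def deflate(coeffs, root):
    """Divide a polynomial by (z - root) using Horner's scheme.

    coeffs lists the coefficients from the highest degree down.
    Returns (quotient coefficients, remainder); the remainder is the value
    of the polynomial at `root`, so it vanishes when `root` is a root.
    """
    out = [coeffs[0]]
    for c in coeffs[1:]:
        out.append(c + root * out[-1])
    return out[:-1], out[-1]


p = [6, -1, 5, -1, -1]              # 6z^4 - z^3 + 5z^2 - z - 1
q1, rem1 = deflate(p, 1j)           # divide by (z - i)
q2, rem2 = deflate(q1, -1j)         # divide by (z + i)
print(q2, rem1, rem2)
```

After removing the two conjugate roots, the remaining quadratic 6z² − z − 1 has the real roots 1/2 and −1/3, in agreement with the factorization above.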

Chapter 2

Sequences, Limits and Continuity

In this chapter, we introduce a central concept in Analysis, the limit process. This concept is motivated by the following fact: we cannot exactly compute the real number $\sqrt{2}$ in a finite number of steps. But $\sqrt{2}$ can be approached with any arbitrary precision. Approaching a number with an arbitrary precision means representing it as a limit of a sequence. The concept of limit process is based on the topological structure of the set of real numbers given by the open sets (and the distance between two real numbers is defined by the absolute value). The algebraic and order axioms allow us to treat this concept using computation because when the limit exists, the limit process preserves these structures, i.e. it commutes with the algebraic operations and the order relation. Furthermore, we introduce a class of real functions that also commute with the limit process: the continuous functions.

Notions to learn. Sequence, subsequence, bounded sequence, convergent sequence, limit of a sequence, Cauchy sequence, convergence criteria, limit superior and limit inferior of a sequence, accumulation point, divergent sequence, strongly divergent sequence, geometric sequence, Euler's number, continuous function, continuous extension of a function, limit of a function, Bolzano-Weierstrass theorem and its applications to sequences and continuous functions, intermediate value theorem, recurrent sequence, Banach's fixed point theorem.

Skills to acquire. Understand and know how to apply the computation rules for limits, understand and know how to apply the convergence criteria and prove convergence or divergence of a given sequence or recurrent sequence by means of these criteria, know how to verify the continuity of a function, know how to determine a continuous extension and to compute the limit of a function, know how to apply the intermediate value theorem and Banach's fixed point theorem.


2.1 Sequences and subsequences

2.1.1 Sequences

Definition - sequence. A sequence is a mapping f from N to R, written f : N → R, that is, a correspondence which associates to each n ∈ N a real number f(n). We set xn = f(n) and we denote the sequence by (xn)n∈N or (x0, x1, x2, . . .).

More generally, let n0 be an integer, then (xn)n≥n0 also defines a sequence.

Examples of sequences.

1. Sequences given by an explicit expression:

(a) Let xn = x for all n ∈ N. The sequence (xn)n∈N is a constant sequence (x, x, x, x, . . .).

(b) Let $x_n = \frac{1}{n}$ for all n ∈ N \ {0}, hence $(x_n)_{n \in \mathbb{N}^*} = (1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots)$.

(c) Let $x_n = (-1)^n$ for all n ∈ N, so (xn)n∈N = (1, −1, 1, −1, . . .).

(d) Let a, q ∈ R \ {0}. If we define $x_n = aq^n$ for all n ∈ N, then the sequence (xn)n∈N is called a geometric sequence: $(x_n)_{n \in \mathbb{N}} = (a, aq, aq^2, aq^3, \ldots)$.

(e) Let $x_n = \frac{(n+2)(n+3)}{n^2+n+1}$ for all n ∈ N, hence $(x_n)_{n \in \mathbb{N}} = (6, 4, \frac{20}{7}, \frac{30}{13}, \ldots)$.

2. Recurrent sequences. The elements of the sequence are given by a recurrence relation and initial values.

(a) First order recurrence relations: $x_{n+1} = f(x_n)$ where f is a real function and x0 ∈ R is a given initial condition. For example, let x0 = 2 and $x_{n+1} = \frac{1}{2}\left(x_n + \frac{2}{x_n}\right)$ for all n ∈ N, thus $(x_n)_{n \in \mathbb{N}} = (2, \frac{3}{2}, \frac{17}{12}, \frac{577}{408}, \ldots)$. We will later prove that this sequence approaches $\sqrt{2}$. A geometric sequence can be uniquely defined by the recurrence relation $x_{n+1} = qx_n$ and the initial condition x0 = a.

(b) Second order recurrence relations: $x_{n+1}$ depends on $x_n$ and on $x_{n-1}$, which we can express by the relation $x_{n+1} = f(x_n, x_{n-1})$ where f is a real-valued function and x0, x1 ∈ R are given initial conditions. For example, the relation $x_{n+1} = x_n + x_{n-1}$, x0 = 0, x1 = 1 yields the sequence of Fibonacci numbers.

3. Series. Let (xk)k∈N be a sequence. We define the sequence of the partial sums $S_n = \sum_{k=0}^{n} x_k$ for all n ≥ 0. For example, if $x_k = \frac{1}{(k+1)(k+2)}$ for all k ∈ N, then $(S_n)_{n \ge 0} = (\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \ldots)$. Series will be considered in Chapter 3.
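The recurrent sequences and the partial sums listed above can be generated with exact rational arithmetic. The following sketch reproduces the first terms given in the examples; the function names are assumptions made for the illustration, and the partial sums are taken from k = 0, which is the indexing that reproduces the listed values 1/2, 2/3, 3/4, 4/5.

```python
from fractions import Fraction
from itertools import accumulate


def heron_sqrt2(steps):
    """x_0 = 2 and x_{n+1} = (x_n + 2/x_n)/2; returns [x_0, ..., x_steps]."""
    x = Fraction(2)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms


def fibonacci(count):
    """First `count` terms of x_{n+1} = x_n + x_{n-1}, x_0 = 0, x_1 = 1."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def partial_sums(count):
    """S_0, ..., S_{count-1} for x_k = 1/((k+1)(k+2)), summed from k = 0."""
    return list(accumulate(Fraction(1, (k + 1) * (k + 2)) for k in range(count)))


print(heron_sqrt2(3))
print(fibonacci(8))
print(partial_sums(4))
```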

2.1.2 Bounded sequences

Set of images of a sequence. Let (xn)n∈N be a sequence. We call set of values of (xn)n∈N or set of images of (xn)n∈N the set of values taken by (xn)n∈N, i.e. the set {x0, x1, x2, . . .}. A sequence is bounded precisely when its set of values is a bounded set.


Definition - bounded sequence. A sequence (xn)n∈N is said to be bounded below if there exists a ∈ R such that for all n ∈ N we have xn ≥ a. A sequence (xn)n∈N is said to be bounded above if there exists b ∈ R such that for all n ∈ N we have xn ≤ b. A sequence (xn)n∈N is said to be bounded if (xn)n∈N is both bounded below and bounded above.

Example. The geometric sequence $(x_n)_{n \in \mathbb{N}} = (q^n)_{n \in \mathbb{N}}$ is bounded if |q| ≤ 1. If q > 1 it is only bounded below, if q < −1 it is neither bounded below nor bounded above.

Proposition. The sequence (xn) is bounded if and only if there exists a constant c ≥ 0 such that |xn| ≤ c for all n ∈ N.

2.1.3 Monotonic sequences

Definition - monotonic sequences.

1. A sequence (xn)n∈N is said to be increasing if xn ≤ xn+1 for all n ∈ N.

2. A sequence (xn)n∈N is said to be strictly increasing if xn < xn+1 for all n ∈ N.

3. A sequence (xn)n∈N is said to be decreasing if xn ≥ xn+1 for all n ∈ N.

4. A sequence (xn)n∈N is said to be strictly decreasing if xn > xn+1 for all n ∈ N.

A sequence (xn)n∈N is said to be monotonic if it is increasing or decreasing.

Example. The sequence $(\frac{1}{n})_{n \in \mathbb{N}^*}$ is strictly decreasing. The geometric sequence $(x_n)_{n \in \mathbb{N}} = (q^n)_{n \in \mathbb{N}}$ is strictly increasing if q > 1, constant if q = 1 and strictly decreasing if 0 < q < 1.

2.1.4 Subsequences

Example. Let $x_n = (-1)^n$ for all n ∈ N. We can extract a sequence by only considering the even indices $n_k = 2k$, k ∈ N. This gives a sequence defined by $y_k = x_{n_k} = x_{2k}$. Such a sequence is called a subsequence of (xn)n∈N. In our case, we obtain a constant subsequence because $x_{n_k} = 1$ for all indices $n_k$. More generally, we have the

Definition - subsequence. If $(n_k)_{k \in \mathbb{N}}$ is a strictly increasing sequence of natural numbers, we say that $(x_{n_k})_{k \in \mathbb{N}}$ is a subsequence of the sequence (xn)n∈N.

Example. For $x_n = (-1)^n$ and $n_k$ of the form $n_k = 2k + 1$, we obtain the subsequence given by $x_{n_k} = -1$. If $n_k = 3k$, we have the subsequence $(x_{n_k})_{k \in \mathbb{N}} = (1, -1, 1, -1, \ldots)$.


2.2 Convergent sequences and limits

2.2.1 Limit of a sequence

Introduction. For certain sequences, the elements xn approach a well-defined real number when the index increases. For example, the sequence $(x_n)_{n \in \mathbb{N}} = (\frac{1}{n+1})_{n \in \mathbb{N}} = (1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots)$ approaches 0. We also say that the sequence (xn)n∈N converges towards 0. More precisely we have the definition

Definition - convergent sequence. A sequence (xn)n∈N converges towards x ∈ R if to every ε > 0 we can associate a natural number Nε such that for all n ≥ Nε we have |xn − x| < ε. We then write

\[ \lim_{n \to +\infty} x_n = x. \]

We also say that the sequence (xn)n∈N is convergent and that its limit is x ∈ R. A non-convergent sequence is said to be divergent.

Alternative notations. If $\lim_{n \to +\infty} x_n = x$, we also write xn → x when n → +∞.

Metric interpretation of convergence. The definition means that the distance d(xn, x) = |xn − x| between the elements xn of the sequence and the point x becomes arbitrarily small for all indices n that are large enough. In other words,

\[ \lim_{n \to +\infty} x_n = x \;\Leftrightarrow\; \lim_{n \to +\infty} d(x_n, x) = 0. \tag{2.1} \]

The definition of the limit process by means of a distance (which is said to be metric) will allow us to extend it to sequences of complex numbers, of vectors and of functions.

Remark - Uniqueness of the limit. Whenever the limit exists, it is unique, in other words, any sequence has at most one limit. Indeed, if there exists y ∈ R such that |xn − y| < ε for all n ≥ Mε, then for all n ≥ max(Nε, Mε)

\[ |x - y| = |x - x_n + x_n - y| \le |x - x_n| + |x_n - y| < \varepsilon + \varepsilon = 2\varepsilon, \]

where the first inequality is the triangle inequality. Since ε > 0 is arbitrary, property 8 of the absolute value gives x = y.

Remark - the set of images of a convergent sequence. The sequence (xn)n∈N converges towards x if for any open interval of the form ]x − ε, x + ε[ all the xn, starting from an index N = Nε, are in ]x − ε, x + ε[. Consequently, only a finite number of elements xn are outside this interval. By noticing that any finite set is bounded, this remark implies the following proposition:

Proposition 2.2.1. Any convergent sequence is bounded. Any subsequence ofa convergent sequence converges towards the same limit.


Remark - Definition of a convergent sequence in practice. We will explain how we will deal with the number ε. Let us suppose that we have proved that for all ε > 0 there exists a natural number Nε such that for all n ≥ Nε the inequality |xn − x| < Cε is valid, where the constant C > 0 depends neither on ε nor on n. We assert that x is the limit of the sequence (xn)n∈N. The only difference with the definition is the constant C in front of the ε. In order to get the inequality in the definition, we take ε′ = ε/C. Then there exists a natural number N′ such that for all n ≥ N′ we have |xn − x| < Cε′. Consequently, for all n ≥ N′

|xn − x| < Cε′ = ε.

In the literature, the estimates are usually presented such that we obtain "< ε" at the end by doing the rearrangements as above. In this course we often keep constants in front of ε.

Remark - Usual formulation of the limit process. Instead of saying that there exists a natural number N such that a certain assertion is true for all n ≥ N, we often simply say for all n large enough.

Elementary examples.

1. The constant sequence xn = x where x ∈ R and n ∈ N satisfies limn→+∞

xn =

x because for all ε > 0 and for all n ∈ N we have |xn − x| = 0 < ε.

2. Let xn = 1n for n ≥ 1. For all ε > 0 we have | 1n | < ε if n > 1

ε . Thus wechoose a natural number Nε such that Nε >

1ε , for example Nε = [ 1

ε ] + 1.Consequently, for all ε > 0, we have | 1n | < ε for all n ≥ Nε, that is

limn→+∞

1

n= 0.

3. The sequence with elements $x_n = (-1)^n$, $n \in \mathbb{N}$, is divergent. To prove this assertion, suppose that $(x_n)$ converges to a real number $x$. Then for $\varepsilon = 1$ there exists a natural number $N$ such that $|x_n - x| < 1$ for all $n \ge N$. So for all $n \ge N$, thanks to the triangle inequality, we have

\[ 2 = |x_n - x_{n+1}| = |x_n - x + x - x_{n+1}| \le |x_n - x| + |x_{n+1} - x| < 1 + 1 = 2, \]

that is $2 < 2$, a contradiction. Hence the sequence cannot converge towards any $x$.

4. Let us consider the geometric sequence defined by $x_n = q^n$, $n \in \mathbb{N}$ and $q \in \mathbb{R}$. If $q = 1$ the sequence is constant and hence convergent (see 1.). If $q = -1$ the sequence is divergent (see 3.). If $|q| > 1$, then the sequence is not convergent because $|q|^n$ is not bounded above, i.e. for all $C > 0$ there exists $n \in \mathbb{N}$ such that $|q|^n > C$. To prove this assertion notice that by Bernoulli's inequality we have for every natural number $n$:

\[ |q|^n = (1 + |q| - 1)^n \ge 1 + n(|q| - 1). \]

Let $C > 0$ be arbitrary. By the axiom of Archimedes there exists a natural number $n$ such that $n(|q| - 1) > C$. Consequently, for this value of $n$,

\[ |q|^n \ge 1 + n(|q| - 1) > 1 + C > C. \]


The case |q| < 1 remains:

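Proposition 2.2.2. Let $|q| < 1$. Then the geometric sequence $(q^n)_{n\in\mathbb{N}}$ is convergent and

\[ \lim_{n\to+\infty} q^n = 0. \]

Proof. If $q = 0$ the assertion is trivial, so let $q \ne 0$. Notice that $\frac{1}{|q|} > 1$. By example 4 the sequence $\left(\frac{1}{|q|}\right)^n$ is not bounded above. Thus for all $\varepsilon > 0$ there exists a natural number $N$ such that $\left(\frac{1}{|q|}\right)^N > \frac{1}{\varepsilon}$, that is $|q|^N < \varepsilon$. Since $|q| < 1$, this implies $|q|^n \le |q|^N < \varepsilon$ for all $n \ge N$.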
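The index $N_\varepsilon$ appearing in example 2 and in Proposition 2.2.2 can be made concrete. The following Python lines are only a minimal sketch: the function names `n_eps_harmonic` and `n_eps_geometric` are illustrative choices, and exact rational arithmetic is assumed so that no rounding interferes with the strict inequalities.

```python
from fractions import Fraction
from math import floor


def n_eps_harmonic(eps):
    """Index N_eps = [1/eps] + 1 from example 2 ([.] is the integer part)."""
    return floor(1 / eps) + 1


def n_eps_geometric(q, eps):
    """Smallest N with |q|**N < eps, for |q| < 1, found by direct search."""
    N, power = 0, Fraction(1)
    while power >= eps:
        power *= abs(q)
        N += 1
    return N


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
    N = n_eps_harmonic(eps)
    # spot check: the next 1000 indices n >= N all satisfy 1/n < eps
    assert all(Fraction(1, n) < eps for n in range(N, N + 1000))
    print("eps =", eps, " harmonic N =", N,
          " geometric N (q = 1/2) =", n_eps_geometric(Fraction(1, 2), eps))
```

The finite check over 1000 indices is of course not a proof; it only illustrates the choice of $N_\varepsilon$ made in the text.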

2.2.2 Properties of the limit values

We present the computation rules relative to limit values.

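Theorem 2.1. - Computation rules for limits. Let us suppose that

\[ \lim_{n\to+\infty} x_n = x \quad\text{and}\quad \lim_{n\to+\infty} y_n = y. \]

Then for all $\alpha, \beta \in \mathbb{R}$, we have

\[ \lim_{n\to+\infty} (\alpha x_n + \beta y_n) = \alpha x + \beta y. \tag{2.2} \]

\[ \lim_{n\to+\infty} x_n y_n = xy. \tag{2.3} \]

\[ \lim_{n\to+\infty} \frac{x_n}{y_n} = \frac{x}{y} \quad\text{if } y \ne 0. \tag{2.4} \]

\[ \lim_{n\to+\infty} |x_n| = |x| \ \left(= \left|\lim_{n\to+\infty} x_n\right|\right). \tag{2.5} \]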

Proof. - Rule (2.2). For all $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that for all $n \ge N$

\[ |x_n - x| < \varepsilon \quad\text{and}\quad |y_n - y| < \varepsilon. \]

Moreover, the sequences are bounded (because they are convergent): there exist $C_1, C_2 > 0$ such that $|x_n| \le C_1$ and $|y_n| \le C_2$ for all $n$, and hence also $|x| \le C_1$. Then for all natural $n \ge N$

\[ |\alpha x_n + \beta y_n - (\alpha x + \beta y)| \le |\alpha||x_n - x| + |\beta||y_n - y| < |\alpha|\varepsilon + |\beta|\varepsilon = (|\alpha| + |\beta|)\varepsilon. \]

According to the remark above, this inequality implies that the sequence $(\alpha x_n + \beta y_n)$ converges towards $\alpha x + \beta y$.

- Rule (2.3). For all natural $n \ge N$ we have

\[ |x_n y_n - xy| = |(x_n - x)y_n + x(y_n - y)| < C_2\varepsilon + C_1\varepsilon = (C_1 + C_2)\varepsilon. \]

- Rule (2.4). If $y \ne 0$ there exists a natural $N_0$ such that $|y_n - y| < \frac{|y|}{2}$ for all $n \ge N_0$ (choose $\varepsilon = \frac{|y|}{2} > 0$ in the definition). This inequality implies that $|y_n| > \frac{|y|}{2}$ if $n \ge N_0$. In particular, $y_n \ne 0$ if $n \ge N_0$, which also justifies the construction of the sequence given by $\frac{x_n}{y_n}$ for all $n$ large enough. Then for all $n \ge \max(N, N_0)$

\[ \left|\frac{x_n}{y_n} - \frac{x}{y}\right| = \left|\frac{(x_n - x)y + x(y - y_n)}{y\,y_n}\right| = \frac{|(x_n - x)y + x(y - y_n)|}{|y||y_n|} < \frac{2(|y| + |x|)\varepsilon}{|y|^2}. \]

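- Rule (2.5). The fact that $(|x_n|)$ converges towards $|x|$ is a consequence of the triangle inequality: $\big||x_n| - |x|\big| \le |x_n - x|$.

Example. Let $x_n = \frac{(n+2)(n+3)}{n^2+n+1}$ for all $n \in \mathbb{N}$. To apply the theorem we must factor out the term $n^2$ from the numerator and the denominator, that is, for $n \ge 1$ we write $x_n$ as follows:

\[ x_n = \frac{n^2(1 + \frac{2}{n})(1 + \frac{3}{n})}{n^2(1 + \frac{1}{n} + \frac{1}{n^2})} = \frac{(1 + \frac{2}{n})(1 + \frac{3}{n})}{1 + \frac{1}{n} + \frac{1}{n^2}}. \]

Let us notice that $\lim_{n\to+\infty} \frac{1}{n} = 0$ implies $\lim_{n\to+\infty} \frac{1}{n^2} = \lim_{n\to+\infty} \frac{1}{n} \cdot \lim_{n\to+\infty} \frac{1}{n} = 0$. Consequently

\[ \lim_{n\to+\infty} x_n = \lim_{n\to+\infty} \frac{(1 + \frac{2}{n})(1 + \frac{3}{n})}{1 + \frac{1}{n} + \frac{1}{n^2}} = \frac{\left(\lim\limits_{n\to+\infty} 1 + \lim\limits_{n\to+\infty} \frac{2}{n}\right)\left(\lim\limits_{n\to+\infty} 1 + \lim\limits_{n\to+\infty} \frac{3}{n}\right)}{\lim\limits_{n\to+\infty} 1 + \lim\limits_{n\to+\infty} \frac{1}{n} + \lim\limits_{n\to+\infty} \frac{1}{n^2}} = \frac{(1 + 0)(1 + 0)}{1 + 0 + 0} = 1. \]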
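As a quick numerical illustration of this example (not a proof), one can evaluate $x_n$ exactly for a few indices. The sketch below assumes exact rational arithmetic; the function name `x` simply mirrors the notation $x_n$ of the text. A direct computation gives $x_n - 1 = \frac{4n+5}{n^2+n+1}$, which is the quantity printed in the last column.

```python
from fractions import Fraction


def x(n):
    """x_n = (n+2)(n+3) / (n^2 + n + 1), computed exactly."""
    return Fraction((n + 2) * (n + 3), n * n + n + 1)


for n in (1, 10, 100, 1000, 10**6):
    # the gap x_n - 1 = (4n + 5) / (n^2 + n + 1) shrinks roughly like 4/n
    print(n, float(x(n)), float(x(n) - 1))
```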

Proposition 2.2.3. Let $(x_n)_{n\in\mathbb{N}}$ be a bounded sequence and $(y_n)_{n\in\mathbb{N}}$ a sequence which converges towards 0. Then the sequence $(x_n y_n)_{n\in\mathbb{N}}$ converges towards 0.

Proof. Exercise.

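Example. Compute $\lim_{n\to+\infty} \frac{\sin n}{n}$. The sequence $(x_n)_{n\in\mathbb{N}} = (\sin n)_{n\in\mathbb{N}}$ is bounded and $(y_n)_{n\in\mathbb{N}^*} = (\frac{1}{n})_{n\in\mathbb{N}^*}$ converges towards 0. Then

\[ \lim_{n\to+\infty} \frac{\sin n}{n} = 0. \]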

2.3 Convergence criteria

Proposition 2.3.1. - Limit process and order structure. Let $(x_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}}$ be two sequences satisfying both of the following properties:

1. $(x_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}}$ converge respectively towards $x$ and $y$.

2. There exists a natural number $N_0$ such that for all $n \ge N_0$: $x_n \le y_n$.

Then $x \le y$.

Proof. We argue by contradiction. Let us suppose that $x > y$. Take $\varepsilon = \frac{x-y}{2} > 0$. For this $\varepsilon$ there exists $N \ge N_0$ such that for all $n \ge N$:

\[ |x_n - x| < \varepsilon, \quad |y_n - y| < \varepsilon \quad\text{and}\quad x_n \le y_n. \]

In particular,

\[ x - \varepsilon < x_n \le y_n < y + \varepsilon. \]

This is a contradiction because $x - \varepsilon = y + \varepsilon = \frac{x+y}{2}$.

The sandwich theorem is a simple consequence of this Proposition.

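Theorem 2.2. - Sandwich Theorem¹. Let $(x_n)_{n\in\mathbb{N}}$, $(u_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ be three sequences satisfying both of the following properties:

1. $(u_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ converge towards the same limit $L$.

2. There exists a natural number $N_0$ such that for all $n \ge N_0$: $u_n \le x_n \le v_n$.

Then $(x_n)_{n\in\mathbb{N}}$ converges towards $L$.

Proof. We give a direct proof. For all $\varepsilon > 0$ there exists a natural number $N_1$ such that for all $n \ge N_1$

\[ |u_n - L| < \varepsilon, \quad\text{i.e.}\quad -\varepsilon < u_n - L < \varepsilon, \]

and

\[ |v_n - L| < \varepsilon, \quad\text{i.e.}\quad -\varepsilon < v_n - L < \varepsilon. \]

Then for all $n \ge N = \max(N_0, N_1)$

\[ -\varepsilon < u_n - L \le x_n - L \le v_n - L < \varepsilon \]

and thus $|x_n - L| < \varepsilon$.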

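Example. Let $a > 0$. Compute $\lim_{n\to+\infty} \sqrt{1 + \frac{a}{n}}$. Notice that

\[ 1 \le \sqrt{1 + \frac{a}{n}} \le 1 + \frac{a}{2n}. \]

Hence $\lim_{n\to+\infty} \sqrt{1 + \frac{a}{n}} = 1$ by the sandwich theorem.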

Example. Let $a > 0$. Show that

\[ \lim_{n\to+\infty} \sqrt[n]{a} = 1. \]

Let $x_n = \sqrt[n]{a}$. If $a > 1$ we have $x_n \ge 1$ and by Bernoulli's inequality

\[ a = (x_n)^n = (1 + x_n - 1)^n \ge 1 + n(x_n - 1), \quad\text{i.e.}\quad x_n \le 1 + \frac{a-1}{n}, \]

and the sandwich theorem gives the desired result. If $a = 1$ the result is trivial, and if $a < 1$ we obtain the result thanks to the identity $\sqrt[n]{a} = \frac{1}{\sqrt[n]{1/a}}$.

¹ We also call it the squeeze theorem.


Example. Show that

\[ \lim_{n\to+\infty} \sqrt[n]{n} = 1. \]

Applied directly, Bernoulli's inequality no longer gives a bound which converges towards 1. So let us consider the sequence defined by $y_n = \sqrt{x_n}$ where $x_n = \sqrt[n]{n}$. Let us notice that $y_n \ge 1$. By Bernoulli's inequality, we find that for all $n \ge 1$

\[ \sqrt{n} = (y_n)^n = (1 + y_n - 1)^n \ge 1 + n(y_n - 1), \quad\text{i.e.}\quad y_n \le 1 + \frac{\sqrt{n} - 1}{n}, \]

and by the sandwich theorem we obtain $\lim_{n\to+\infty} y_n = 1$ since $\lim_{n\to+\infty} \frac{\sqrt{n} - 1}{n} = 0$. Consequently

\[ \lim_{n\to+\infty} \sqrt[n]{n} = \lim_{n\to+\infty} x_n = \lim_{n\to+\infty} y_n^2 = \left(\lim_{n\to+\infty} y_n\right)^2 = 1. \]
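The two-sided bound used in this example, $1 \le \sqrt[n]{n} = y_n^2 \le \left(1 + \frac{\sqrt{n}-1}{n}\right)^2$, can be observed numerically. The following is a small sketch in floating-point arithmetic; the function names `root_n` and `upper` are illustrative, and rounding errors of the order of machine precision are to be expected.

```python
import math


def root_n(n):
    """x_n = n**(1/n), the n-th root of n."""
    return n ** (1.0 / n)


def upper(n):
    """Square of the Bernoulli bound 1 + (sqrt(n) - 1)/n on y_n = sqrt(x_n)."""
    return (1.0 + (math.sqrt(n) - 1.0) / n) ** 2


for n in (1, 2, 10, 100, 10**4, 10**6):
    # lower bound 1, the sequence itself, and the upper bound from the text
    print(n, 1.0, root_n(n), upper(n))
```

Both the middle and the right column approach 1, the right one from above, which is what the sandwich theorem exploits.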

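Theorem 2.3. - The "quotient criterion". Let $(x_n)_{n\in\mathbb{N}}$ be a sequence for which the limit

\[ \rho = \lim_{n\to+\infty} \left|\frac{x_{n+1}}{x_n}\right| \]

exists. Then, if $\rho < 1$ the sequence $(x_n)_{n\in\mathbb{N}}$ converges towards 0, whereas if $\rho > 1$ it diverges.

Proof. The hypothesis of the theorem allows us to compare the sequence $(x_n)_{n\in\mathbb{N}}$ with a geometric sequence. Without loss of generality let us suppose that $x_n \ne 0$ for all $n$. If $\rho < 1$ we choose $\varepsilon > 0$ such that $\rho + \varepsilon < 1$. There exists a natural number $N_\varepsilon$ such that for all $n \ge N_\varepsilon$:

\[ \left|\,\left|\frac{x_{n+1}}{x_n}\right| - \rho\,\right| < \varepsilon, \]

hence by the triangle inequality $|x_{n+1}| \le (\rho + \varepsilon)|x_n|$. By induction (see also Chapter 2.8) we conclude that $|x_n| \le (\rho + \varepsilon)^{n - N_\varepsilon} |x_{N_\varepsilon}|$ for all $n \ge N_\varepsilon$. Since $0 \le \rho + \varepsilon < 1$, the right-hand side tends to 0 by Proposition 2.2.2, hence the assertion of the theorem. We leave the case $\rho > 1$ as an exercise.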

Remark. If ρ = 1 the sequence (xn)n∈N can be convergent (for example xn =1) or divergent (for example xn = (−1)n).

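Example. The geometric sequence $(x_n)_{n\in\mathbb{N}} = (q^n)_{n\in\mathbb{N}}$, $|q| \ne 1$, satisfies this criterion because $x_{n+1} = q x_n$.

Example. Let $a \in \mathbb{R}$. Show that

\[ \lim_{n\to+\infty} \frac{a^n}{n!} = 0. \]

With $x_n = \frac{a^n}{n!}$ we have

\[ \lim_{n\to+\infty} \left|\frac{x_{n+1}}{x_n}\right| = \lim_{n\to+\infty} \frac{|a|}{n+1} = 0 < 1. \]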
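To see the quotient criterion at work on this example, the sketch below evaluates $x_n = \frac{a^n}{n!}$ exactly for the (arbitrarily chosen) value $a = 10$. The function name `x` is an illustrative choice. The terms first grow, as long as $n + 1 < |a|$, and then decay very quickly once the quotient $\frac{|a|}{n+1}$ drops below 1.

```python
from fractions import Fraction
from math import factorial


def x(a, n):
    """x_n = a**n / n!, computed exactly for an integer or rational a."""
    return Fraction(a) ** n / factorial(n)


a = 10
for n in (0, 5, 10, 20, 30, 50):
    ratio = x(a, n + 1) / x(a, n)  # equals a / (n + 1)
    print(n, float(x(a, n)), float(ratio))
```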

Theorem 2.4. - Monotonicity criteria.

1. Any increasing and bounded above sequence converges towards the supremum of its set of values.

2. Any decreasing and bounded below sequence converges towards the infimum of its set of values.

3. Let $(x_n)_{n\in\mathbb{N}}$ be an increasing sequence and $(y_n)_{n\in\mathbb{N}}$ be a decreasing sequence such that

\[ \lim_{n\to+\infty} (x_n - y_n) = 0. \]

Then

(a) for all $n \in \mathbb{N}$: $x_0 \le x_n \le x_{n+1} \le y_{n+1} \le y_n \le y_0$;

(b) $(x_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}}$ converge towards the same limit.

Proof. Exercise.

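Example - Euler's number.² Let us consider the sequences $(x_n)_{n\in\mathbb{N}^*}$ and $(y_n)_{n\in\mathbb{N}^*}$ defined by

\[ x_n = \left(1 + \frac{1}{n}\right)^n, \qquad y_n = \left(1 + \frac{1}{n}\right)^{n+1}. \]

We assert that $(x_n)_{n\in\mathbb{N}^*}$ is strictly increasing and that $(y_n)_{n\in\mathbb{N}^*}$ is strictly decreasing. Indeed, by applying Bernoulli's inequality we have

\[ \frac{x_{n+1}}{x_n} = \frac{(n+2)^{n+1} n^n}{(n+1)^{2n+1}} = \frac{(n^2+2n)^{n+1}}{(n+1)^{2(n+1)}} \cdot \frac{n+1}{n} = \left(1 - \frac{1}{(n+1)^2}\right)^{n+1} \frac{n+1}{n} > \left(1 - (n+1)\frac{1}{(n+1)^2}\right) \frac{n+1}{n} = 1 \]

and

\[ \frac{y_n}{y_{n+1}} = \frac{(n+1)^{2n+3}}{(n+2)^{n+2} n^{n+1}} = \frac{(n^2+2n+1)^{n+2}}{(n^2+2n)^{n+2}} \cdot \frac{n}{n+1} = \left(1 + \frac{1}{n(n+2)}\right)^{n+2} \frac{n}{n+1} > \left(1 + (n+2)\frac{1}{n(n+2)}\right) \frac{n}{n+1} = 1. \]

Since $x_n < y_n$ for all $n$, this implies that $(x_n)_{n\in\mathbb{N}^*}$ and $(y_n)_{n\in\mathbb{N}^*}$ are bounded and that

\[ x_1 \le x_n \le x_{n+1} \le y_{n+1} \le y_n \le y_1. \]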

² Named after the mathematician and physicist Leonhard Euler (1707-1783), this number is sometimes also called Neper's constant.


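Moreover

\[ \lim_{n\to+\infty} (x_n - y_n) = \lim_{n\to+\infty} -\frac{1}{n}\left(1 + \frac{1}{n}\right)^n = 0. \]

Then the sequences $(x_n)_{n\in\mathbb{N}^*}$ and $(y_n)_{n\in\mathbb{N}^*}$ converge towards the same limit and we define

\[ e = \lim_{n\to+\infty} \left(1 + \frac{1}{n}\right)^n. \]

The sequences $(x_n)_{n\in\mathbb{N}^*}$ and $(y_n)_{n\in\mathbb{N}^*}$ are not well suited to the numerical computation of the number $e$ because they converge only slowly. We will show later that

\[ e = \sum_{k=0}^{\infty} \frac{1}{k!} = \lim_{n\to+\infty} \sum_{k=0}^{n} \frac{1}{k!}. \]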

This series expansion converges much faster towards the limit (see Chapter 3.6).
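The difference in speed between the two approaches to $e$ can be observed with a few lines of Python. This is only an illustrative sketch: the names `x`, `y` and `s` follow the notation of the text ($s_n$ being the $n$-th partial sum of the series, a name chosen here), exact fractions are used internally, and `math.e` serves as the reference value.

```python
import math
from fractions import Fraction


def x(n):
    """x_n = (1 + 1/n)**n, increasing towards e."""
    return (1 + Fraction(1, n)) ** n


def y(n):
    """y_n = (1 + 1/n)**(n+1), decreasing towards e."""
    return (1 + Fraction(1, n)) ** (n + 1)


def s(n):
    """Partial sum of the series: sum of 1/k! for k = 0, ..., n."""
    return sum(Fraction(1, math.factorial(k)) for k in range(n + 1))


for n in (1, 10, 100, 1000):
    print(n, float(x(n)), float(y(n)))       # e is enclosed: x_n < e < y_n
print("partial sum s_10 =", float(s(10)), " math.e =", math.e)
```

With $n = 1000$ the enclosure $x_n < e < y_n$ still has a width of about $e/n$, whereas the partial sum with only eleven terms already agrees with $e$ to about seven decimal places.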

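Example - a recurrent sequence for the square root. Let $a > 0$. Let us consider the sequence $(x_n)_{n\in\mathbb{N}}$ defined by

\[ x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) \quad\text{and}\quad x_0 > 0. \]

Proposition 2.3.2. $\lim_{n\to+\infty} x_n = \sqrt{a}$.

Proof. First method: the sequence $(x_n)_{n\in\mathbb{N}}$ satisfies the following properties.

1. $x_n > 0$ for all $n \in \mathbb{N}$, because $x_0 > 0$ and $x_n > 0$ imply $x_{n+1} > 0$.

2. $x_{n+1}^2 \ge a$ for all $n \in \mathbb{N}$, because $x_{n+1}^2 - a = \frac{1}{4}\left(x_n - \frac{a}{x_n}\right)^2 \ge 0$.

3. $x_{n+1} \le x_n$ for all $n \ge 1$, because $x_n - x_{n+1} = \frac{1}{2x_n}\left(x_n^2 - a\right) \ge 0$ by property 2.

Consequently, for $n \ge 1$, $(x_n)$ is a decreasing sequence that is bounded below by $\frac{a}{x_1}$, because $x_n \ge \frac{a}{x_n} \ge \frac{a}{x_1}$ thanks to properties 2 and 3. Hence $x = \lim_{n\to+\infty} x_n$ exists and $x \ge \frac{a}{x_1} > 0$. By the recurrence relation, we have

\[ \lim_{n\to+\infty} x_{n+1} = \frac{1}{2}\left(\lim_{n\to+\infty} x_n + \frac{a}{\lim\limits_{n\to+\infty} x_n}\right), \]

that is

\[ x = \frac{1}{2}\left(x + \frac{a}{x}\right), \]

hence $x^2 = a$ and, since $x > 0$, $x = \sqrt{a}$.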

Proof. Second method: if the sequence is convergent it must converge towards $\sqrt{a}$. We define $y_n = x_n - \sqrt{a}$, which satisfies the recurrence relation

\[ y_{n+1} = \frac{1}{2}\,\frac{y_n}{y_n + \sqrt{a}}\,y_n. \]

$y_0 > -\sqrt{a}$ implies $y_1 \ge 0$ and hence by induction $y_n \ge 0$ for all $n \ge 1$, which yields

\[ y_{n+1} \le \frac{1}{2} y_n \]

for all $n \ge 1$. This inequality implies by induction that $0 \le y_n \le 2^{1-n} y_1$. By the sandwich theorem, $\lim_{n\to+\infty} y_n = 0$, i.e. $\lim_{n\to+\infty} x_n = \sqrt{a}$.

We present other recurrent sequences in Chapter 2.8 in the context of exact resolution methods for certain recurrent sequences as well as methods to study their convergence.

Speed of convergence. After each iteration step, we can estimate the approximation error of $\sqrt{a}$ thanks to the inequalities

\[ \frac{a}{x_n} \le \sqrt{a} \le x_n, \]

which hold for $n \ge 1$. We consider the relative errors defined by $x_n = \sqrt{a}(1 + u_n)$ and $\frac{a}{x_n} = \sqrt{a}(1 - v_n)$. So $u_n \ge 0$ for $n \ge 1$ and $v_n = \frac{u_n}{1 + u_n} \le u_n$. The $u_n$ satisfy the recurrence relation

\[ u_{n+1} = \frac{1}{2}\,\frac{u_n^2}{1 + u_n} \]

and we can estimate $u_n$ by $u_{n+1} \le \frac{1}{2}\min(u_n, u_n^2)$. For example, if for some $n$ the error is smaller than one percent, that is $u_n \le 10^{-2}$, then $u_{n+1} \le 5 \cdot 10^{-5}$ and $u_{n+2} \le 1.25 \cdot 10^{-9}$. In fact, if $0 < u_0 \le 1$, then $0 < u_n \le 2^{1-2^n} u_0^{2^n}$ (exercise).
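The recurrence for the square root and its error estimate can be tried out directly. The sketch below is a minimal illustration; the function name `heron` (the iteration is commonly attributed to Heron of Alexandria) and the choice $a = 2$, $x_0 = 1$ are assumptions made for the example. It works with floats as well as with exact fractions, because only the operations `+` and `/` are used.

```python
import math
from fractions import Fraction


def heron(a, x0, steps):
    """Return [x_0, x_1, ..., x_steps] for x_{n+1} = (x_n + a / x_n) / 2."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + a / x) / 2)
    return xs


# floating-point run: relative error u_n = x_n / sqrt(a) - 1
a = 2.0
for n, x in enumerate(heron(a, 1.0, 6)):
    print(n, x, x / math.sqrt(a) - 1.0)

# exact run: the first iterates are the fractions 3/2, 17/12, 577/408, ...
print(heron(Fraction(2), Fraction(1), 3))
```

In the floating-point run the number of correct digits roughly doubles at each step, in line with the estimate $u_{n+1} \le \frac{1}{2} u_n^2$.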

2.4 The Bolzano-Weierstrass theorem

Introduction. We have seen that any convergent sequence is bounded. A bounded sequence is not always convergent. We will show that from any bounded sequence we can extract a convergent subsequence.

Theorem 2.5. - Bolzano-Weierstrass theorem³. From any bounded sequence $(x_n)_{n\in\mathbb{N}}$, we can extract a convergent subsequence $(x_{n_k})_{k\in\mathbb{N}}$.

Interpretation of the Bolzano-Weierstrass theorem. The Bolzano-Weierstrass theorem means that for every bounded sequence there is always (at least) one real number $x$ for which every neighborhood contains infinitely many elements of this sequence. In other words, the elements of a bounded sequence (or of infinitely many numbers in a bounded interval) accumulate or concentrate around (at least) one real number.

To prove this result we build a convergent sequence starting from $(x_n)_{n\in\mathbb{N}}$. Its limit is called the limit superior of the sequence $(x_n)_{n\in\mathbb{N}}$.

³ Named after Bernhard Bolzano (1781-1848) and Karl Weierstrass (1815-1897). This theorem is one of the fundamental theorems of Analysis.


Limit superior and limit inferior. Let $(x_n)_{n\in\mathbb{N}}$ be a bounded sequence. We define the sequence $(y_n)_{n\in\mathbb{N}}$ by taking

\[ y_n = \sup\{x_k : k \ge n\}. \]

The sequence $(y_n)_{n\in\mathbb{N}}$ is decreasing and bounded below, and hence convergent. Its limit is called the limit superior of the sequence $(x_n)_{n\in\mathbb{N}}$ and we denote it by $\limsup_{n\to+\infty} x_n$:

\[ \limsup_{n\to+\infty} x_n := \lim_{n\to+\infty} \sup\{x_k : k \ge n\}. \tag{2.6} \]

The sequence defined by

\[ z_n = \inf\{x_k : k \ge n\} \]

is increasing and bounded above and we denote its limit, called the limit inferior of the sequence $(x_n)_{n\in\mathbb{N}}$, by $\liminf_{n\to+\infty} x_n$:

\[ \liminf_{n\to+\infty} x_n := \lim_{n\to+\infty} \inf\{x_k : k \ge n\}. \tag{2.7} \]

By definition, we always have

\[ \liminf_{n\to+\infty} x_n \le \limsup_{n\to+\infty} x_n. \]

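Monotonic sequence and lim sup, lim inf. Let $(x_n)_{n\in\mathbb{N}}$ be a bounded increasing sequence. Then $y_n = y_0 = \sup\{x_k : k \ge 0\}$ for all $n \in \mathbb{N}$ and $z_n = x_n$ for all $n \in \mathbb{N}$. If $(x_n)_{n\in\mathbb{N}}$ is a bounded decreasing sequence, then $y_n = x_n$ and $z_n = z_0 = \inf\{x_k : k \ge 0\}$.

Example. The sequence $x_n = (-1)^n(1 + \frac{1}{n})$, $n \ge 1$, is bounded but is not convergent. We have

\[ y_n = \sup\{x_k : k \ge n\} = \begin{cases} 1 + \frac{1}{n} & \text{if } n \text{ is even}, \\ 1 + \frac{1}{n+1} & \text{if } n \text{ is odd}, \end{cases} \]

and hence $\limsup_{n\to+\infty} x_n = 1$. Similarly we show that $\liminf_{n\to+\infty} x_n = -1$.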
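The tail suprema $y_n$ and tail infima $z_n$ of this example can be tabulated as follows. A computer can only inspect finitely many terms, so the sketch replaces the infinite tail $\{x_k : k \ge n\}$ by a finite window; this is an assumption that happens to be harmless here, because for this particular sequence the supremum (infimum) of the tail is attained at the first even (odd) index $k \ge n$. The names `tail_sup`, `tail_inf` and the parameter `window` are illustrative.

```python
from fractions import Fraction


def x(k):
    """x_k = (-1)**k * (1 + 1/k) for k >= 1."""
    return (-1) ** k * (1 + Fraction(1, k))


def tail_sup(n, window=1000):
    """max of x_k over n <= k < n + window (equals y_n for this sequence)."""
    return max(x(k) for k in range(n, n + window))


def tail_inf(n, window=1000):
    """min of x_k over n <= k < n + window (equals z_n for this sequence)."""
    return min(x(k) for k in range(n, n + window))


for n in (1, 2, 3, 10, 11, 100, 101):
    print(n, tail_sup(n), tail_inf(n))
```

The first column of values decreases towards 1 and the second increases towards -1, which matches $\limsup_{n\to+\infty} x_n = 1$ and $\liminf_{n\to+\infty} x_n = -1$.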

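Proof. (of the Bolzano-Weierstrass theorem) Let $y_n = \sup\{x_k : k \ge n\}$ and $y = \lim_{n\to+\infty} y_n$. For all $\varepsilon > 0$ and for all natural $N$ there exists an $n \ge N$ such that

\[ |y_n - y| < \frac{1}{2}\varepsilon. \]

From the construction of the sequence $(y_n)$, there exists an index $n_1 \ge n$ such that $|x_{n_1} - y_n| < \frac{1}{2}\varepsilon$. So $n_1 \ge N$ and

\[ |x_{n_1} - y| = |x_{n_1} - y_n + y_n - y| \le |x_{n_1} - y_n| + |y_n - y| < \frac{1}{2}\varepsilon + \frac{1}{2}\varepsilon = \varepsilon. \]

This implies that for all $\varepsilon > 0$ there exist infinitely many indices $n_k \in \mathbb{N}$, $n_k \ge N$, such that $|x_{n_k} - y| < \varepsilon$ (if $n_k$ is the index found, choose $N = n_k + 1$ to find an $n_{k+1} > n_k$, and so on). Choosing $\varepsilon = \frac{1}{k}$ at the $k$-th step yields a subsequence $(x_{n_k})$ that converges towards $y$.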


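Proposition 2.4.1. - lim inf, lim sup and convergent sequences. Let $(x_n)_{n\in\mathbb{N}}$ be a bounded sequence and $x$ be a real number. Then $\lim_{n\to+\infty} x_n = x$ if and only if $\limsup_{n\to+\infty} x_n = \liminf_{n\to+\infty} x_n = x$. In this case

\[ \lim_{n\to+\infty} x_n = \limsup_{n\to+\infty} x_n = \liminf_{n\to+\infty} x_n. \tag{2.8} \]

Proof. If $\limsup_{n\to+\infty} x_n = \liminf_{n\to+\infty} x_n = x$, then for the $y_n$, $z_n$ built above we have $z_n \le x_n \le y_n$ and we conclude using the sandwich theorem. Conversely, let $\lim_{n\to+\infty} x_n = x$. Then the sequence $(x_n)_{n\in\mathbb{N}}$ is bounded and $\limsup_{n\to+\infty} x_n = y$ and $\liminf_{n\to+\infty} x_n = z$ exist. The convergence of the $x_n$ implies that for all $\varepsilon > 0$ there exists a natural number $N = N_\varepsilon$ such that $x_n \in\, ]x - \varepsilon, x + \varepsilon[$ for all $n \ge N_\varepsilon$, hence $y_n, z_n \in [x - \varepsilon, x + \varepsilon]$ for all $n \ge N_\varepsilon$. Letting $n \to +\infty$ we obtain $|y - x| \le \varepsilon$ and $|z - x| \le \varepsilon$ for every $\varepsilon > 0$. Consequently, $x = y = z$.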

Accumulation point of a sequence. We say that $p \in \mathbb{R}$ is an accumulation point of a sequence $(x_n)_{n\in\mathbb{N}}$ if there exists a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of $(x_n)_{n\in\mathbb{N}}$ such that $\lim_{k\to+\infty} x_{n_k} = p$.

Proposition 2.4.2. - Closure of a set and accumulation point of a sequence. A point $a$ is a closure point of a set $E \subset \mathbb{R}$ if and only if there exists a sequence of elements $a_n \in E$ that converges towards $a$.

Proof. If $a_n \in E$ converges towards $a$, then for all $r > 0$ there exists a natural number $N$ such that $|a_n - a| < r$ for all $n \ge N$, that is $a_n \in\, ]a - r, a + r[\, \cap E$. Conversely, if $a$ is a closure point of $E$, then for all $n \ge 1$ there exists $a_n \in\, ]a - \frac{1}{n}, a + \frac{1}{n}[\, \cap E$. The sequence $(a_n)$ converges towards $a$.

Remark. If $a \in E$, we can choose the constant sequence: $a_n = a$. If $E$ is bounded above and $\sup E \notin E$, there exists a sequence of elements $a_n \in E$ that converges towards $\sup E$. If $E$ is bounded below and $\inf E \notin E$, there exists a sequence of elements $a_n \in E$ that converges towards $\inf E$.

Example - the closure of the set of values of a bounded sequence. Let $(x_n)_{n\in\mathbb{N}}$ be a bounded sequence and let us consider its set of values $\{x_n : n \in \mathbb{N}\}$. By the previous proposition, the union of $\{x_n : n \in \mathbb{N}\}$ with the set of accumulation points of the sequence $(x_n)_{n\in\mathbb{N}}$ gives the closure of $\{x_n : n \in \mathbb{N}\}$.

Example - the closure of the rationals. Let $(x_n)_{n\in\mathbb{N}} = (r_1, r_2, \ldots)$ be an enumeration of the rationals in $]0, 1[$. Then any $x \in [0, 1]$ is an accumulation point of this sequence. Hence the closure of its set of values is $[0, 1]$.

Corollary 2.6. - Sequence in a closed set. Let $E \subset \mathbb{R}$ be a closed set and $(x_n)$ be a sequence of elements $x_n \in E$. Then any accumulation point of $(x_n)$ is in $E$. In particular, if $\lim_{n\to+\infty} x_n = x$, then $x \in E$.

Proof. Let $x$ be any accumulation point of $(x_n)$. By Proposition 2.4.2, $x$ belongs to the closure $\bar{E}$ of $E$, and $\bar{E} = E$ since $E$ is closed.


2.5 Cauchy sequences

Introduction. Let $(x_n)_{n\in\mathbb{N}}$ be a sequence that converges towards $x$. We observe that the distances between the elements of the sequence become arbitrarily small. More precisely, if $(x_n)_{n\in\mathbb{N}}$ converges towards $x$, for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $|x_n - x| < \frac{1}{2}\varepsilon$ for all $n \ge N$. Thus, for all integers $m, n \ge N$

\[ |x_n - x_m| = |x_n - x + x - x_m| \le |x_n - x| + |x_m - x| < \frac{1}{2}\varepsilon + \frac{1}{2}\varepsilon = \varepsilon. \]

It is this last property that we use as a definition for a family of sequences called the Cauchy sequences.

Cauchy sequences. A sequence $(x_n)_{n\in\mathbb{N}}$ is said to be a Cauchy sequence⁴ if to all $\varepsilon > 0$ we can associate an $N = N_\varepsilon \in \mathbb{N}$ such that for all $m, n \ge N$ we have $|x_n - x_m| < \varepsilon$.

Theorem 2.7. - Fundamental theorem on Cauchy sequences. A sequence $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence if and only if it converges.

Proof. We have already seen that any convergent sequence is a Cauchy sequence. To show that any Cauchy sequence converges, notice that if $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence, then $(x_n)_{n\in\mathbb{N}}$ is bounded because, for a given $\varepsilon$, there exists a natural number $N$ such that $|x_n - x_N| < \varepsilon$ for all $n \ge N$. By the Bolzano-Weierstrass theorem, there exists a subsequence of $(x_n)_{n\in\mathbb{N}}$ that converges. Let $x$ be its limit. We will prove that the entire sequence $(x_n)_{n\in\mathbb{N}}$ converges towards $x$. The sequence $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence, so for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for all $m, n \ge N$

\[ |x_n - x_m| < \frac{1}{2}\varepsilon. \]

There exists an element $x_m$ of the subsequence such that $m \ge N$ and $|x_m - x| < \frac{1}{2}\varepsilon$. Hence for all $n \ge N$

\[ |x_n - x| = |x_n - x_m + x_m - x| \le |x_n - x_m| + |x_m - x| < \frac{1}{2}\varepsilon + \frac{1}{2}\varepsilon = \varepsilon. \]

Remark. We can show that this theorem is equivalent to the axiom stating that any set that is bounded above has a supremum. We can thus characterize the property stating that $\mathbb{R}$ is complete by the fact that any Cauchy sequence converges.

2.6 Continuous functions

Continuous function. Let $f : D_f \to \mathbb{R}$ be a function and $x \in D_f$. We say that $f$ is continuous at the point $x$ if for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ such that $\lim_{n\to+\infty} x_n = x$ we have

\[ \lim_{n\to+\infty} f(x_n) = f(x) = f\left(\lim_{n\to+\infty} x_n\right). \tag{2.9} \]

⁴ Named after Augustin Louis Cauchy (1789-1857).


We denote this property as follows (see also Chapter 4):

\[ \lim_{\eta\to x} f(\eta) = f(x). \tag{2.10} \]

We say that the limit of $f(\eta)$ when $\eta$ tends to $x$ is equal to $f(x)$. Even if this notation no longer refers explicitly to the domain $D_f$ of the function $f$, the limit $\lim_{\eta\to x} f(\eta)$ depends on $D_f$ since we can only consider sequences of elements in $D_f$ (see example 6 below). A function $f$ is continuous over $D_f$ if it is continuous at every $x \in D_f$. The computation rules for limits of theorem 2.1 and the definition of continuity yield the following rules:

Operations on continuous functions.

1. The sum and the product of continuous functions are continuous.

2. The composition of continuous functions is continuous.

More precisely we have:

Theorem 2.8. - Operations over continuous functions.

1. Let $f, g : \mathbb{R} \to \mathbb{R}$ be continuous at $a$. Then the functions $\lambda \cdot f$ with $\lambda \in \mathbb{R}$, $f + g$ and $f \cdot g$ are continuous at $a$.

2. Let $f, g : \mathbb{R} \to \mathbb{R}$ be such that $f$ is continuous at $a$ and $g$ is continuous at $b = f(a)$. Then the function $g \circ f$ is continuous at $a$.

2.6.1 Examples of continuous functions

1. The polynomial functions $f(x) = \sum_{k=0}^{n} a_k x^k$, $a_k \in \mathbb{R}$, are continuous at all $x \in \mathbb{R}$: if $\lim_{i\to+\infty} x_i = x$, then by the computation rules for limits $\lim_{i\to+\infty} x_i^2 = x^2$ and by induction $\lim_{i\to+\infty} x_i^k = x^k$ for any positive integer $k$, and the sum of convergent sequences is convergent.

2. By the same argument, rational functions are continuous over their domain of definition.

3. $f(x) = |x|$ is continuous at every $x \in \mathbb{R}$: this is rule (2.5) of theorem 2.1.

4. The functions $\sin x$, $\cos x$ are continuous at all $x \in \mathbb{R}$: first notice that $\sin x$ is continuous at $x = 0$ since $|\sin x| \le |x|$, hence for any sequence such that $\lim_{i\to+\infty} x_i = 0$ it follows that $\lim_{i\to+\infty} \sin x_i = 0$. By the Pythagorean theorem ($\cos^2 x + \sin^2 x = 1$), $\cos x$ is continuous at $x = 0$. Continuity at every $x \in \mathbb{R}$ is a consequence of the addition formulas. For example, if $\lim_{i\to+\infty} x_i = x$, then

\[ \lim_{i\to+\infty} \sin x_i = \lim_{i\to+\infty} \sin(x_i - x + x) = \lim_{i\to+\infty} \big(\sin(x_i - x)\cos x + \cos(x_i - x)\sin x\big) = 0 \cdot \cos x + 1 \cdot \sin x = \sin x. \]


5. To illustrate the fact that the definition of continuity also applies to functions that are not defined over an interval, let us consider the function $f : \mathbb{Q} \to \mathbb{R}$ defined by $f(x) = a^x$ where $a$ is a positive real number. This function satisfies the definition of a continuous function over its domain $D_f = \mathbb{Q}$, with $f(0) = a^0 = 1$. In fact, let $(x_k)_{k\in\mathbb{N}}$ be a sequence of rationals such that $\lim_{k\to+\infty} x_k = x \in \mathbb{Q}$. By using $a^{x_k} - a^x = a^x(a^{x_k - x} - 1)$ we notice that it is sufficient to study the case $x = 0$. Thus we need to show that $\lim_{k\to+\infty} x_k = 0$ with $x_k \in \mathbb{Q}$ implies $\lim_{k\to+\infty} a^{x_k} = 1 =: a^0$. If $a = 1$ the assertion is trivial. If $a > 1$ we notice that $f(x)$ is strictly increasing. For any positive integer $n$ there exists a natural number $M = M_n$ such that $-\frac{1}{n} < x_k < \frac{1}{n}$ for all $k \ge M_n$ (we have chosen $\varepsilon = \frac{1}{n}$ in the definition of the limit). Consequently,

\[ a^{-\frac{1}{n}} < a^{x_k} < a^{\frac{1}{n}}. \]

We have seen that $\lim_{n\to+\infty} a^{\frac{1}{n}} = 1$, hence also $\lim_{n\to+\infty} a^{-\frac{1}{n}} = 1$, and the assertion follows by the sandwich theorem. By the same argument, we prove continuity if $a < 1$ for any sequence of rationals $(x_k)_{k\in\mathbb{N}}$ such that $\lim_{k\to+\infty} x_k = x$.

6. Let $f : \mathbb{Q} \to \mathbb{R}$ be defined by $f(x) = \chi_{\mathbb{Q}}(x)$. Then $f$ is constant over its domain, hence continuous, and we have $\lim_{\eta\to x} f(\eta) = f(x)\ (= 1)$ at every $x \in \mathbb{Q}$. If we consider $f$ as a function defined over $\mathbb{R}$, that is $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = \chi_{\mathbb{Q}}(x)$, then $\lim_{\eta\to x} f(\eta)$ no longer exists: by density of the rationals and of the irrationals in $\mathbb{R}$, for all $x \in \mathbb{R}$ we can take sequences of rationals and sequences of irrationals that converge towards $x$, yielding the constant sequences 1 or 0 for the values of $f$.

7. If $x \in D_f$ is an isolated point of the domain of $f$, then $f$ is always continuous at $x$, since the only sequences of elements in $D_f$ that converge towards $x$ are those that are constant equal to $x$ from some index onwards. Consequently any function $f : \mathbb{Z} \to \mathbb{R}$ is continuous!

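Continuous extension. Let $f : D_f \to \mathbb{R}$ be a function and $x \notin D_f$ be a closure point of $D_f$. If there exists $l \in \mathbb{R}$ such that for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ with $\lim_{n\to+\infty} x_n = x$ we have

\[ \lim_{n\to+\infty} f(x_n) = l, \tag{2.11} \]

also written $\lim_{\eta\to x} f(\eta) = l$, then the extension of $f$, written $\bar{f}$ and defined by

\[ \bar{f}(\eta) := \begin{cases} f(\eta), & \text{if } \eta \in D_f; \\ l, & \text{if } \eta = x, \end{cases} \tag{2.12} \]

is continuous at $\eta = x$. We call $\bar{f}$ the continuous extension of $f$ at $\eta = x$.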


Examples.

1. Let $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be defined by $f(t) = \frac{\sin t}{t}$. Then $f$ admits a continuous extension $\bar{f}$ at $t = 0$ defined by $\bar{f}(0) = 1$. This is a consequence of the sandwich theorem, by applying the inequalities

\[ \cos t \le \frac{\sin t}{t} \le 1 \]

for all $t \ne 0$, $|t| \le \frac{\pi}{2}$, and the continuity of $\cos t$.

2. Let $a \in \mathbb{R}$ and $g : \mathbb{R} \setminus \{a\} \to \mathbb{R}$ be defined by $g(x) = \frac{x^2 - a^2}{x - a}$. Then $g$ admits a continuous extension $\bar{g}$ at $x = a$ defined by $\bar{g}(a) = 2a$. In fact, let us first notice that $a$ is in the closure of the domain of $g$. Then for any sequence of elements $x_n \in \mathbb{R} \setminus \{a\}$ that converges towards $a$ we have

\[ \lim_{n\to+\infty} g(x_n) = \lim_{n\to+\infty} \frac{(x_n - a)(x_n + a)}{x_n - a} = \lim_{n\to+\infty} (x_n + a) = 2a. \]

The function $g$ gives the slopes of the secant lines of the function $h(x) = x^2$ between the points $(a, h(a))$ and $(x, h(x))$, $x \ne a$. The existence of a continuous extension means that the graph of $h(x) = x^2$ has a tangent at $(a, h(a))$ with the slope given by the continuous extension at $x = a$: $\bar{g}(a) = 2a$. We will see in Chapter 5 that this condition will help us to define the derivative of a function.

3. Let us again consider the function $f : \mathbb{Q} \to \mathbb{R}$ defined by $f(x) = a^x$ where $a$ is a positive real number. We can give a continuous extension defined for all $x \in \mathbb{R}$ by the definition

\[ a^x := \lim_{k\to+\infty} a^{x_k}, \tag{2.13} \]

where $(x_k)_{k\in\mathbb{N}}$ is any sequence of rationals that converges towards $x$. By the same argument that was presented above, we can check that this extension gives a continuous function over $\mathbb{R}$ since $\mathbb{Q}$ is dense in $\mathbb{R}$. Using a different but more elegant method we will prove in Chapter 3 that there exists in fact a unique continuous extension given by equation (2.13).

Limit of a function. The following definition covers both situations discussed above. Let $f : D_f \to \mathbb{R}$ be a function and $a \in \mathbb{R}$ a closure point of $D_f$. If there exists an $l \in \mathbb{R}$ such that for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ with $\lim_{n\to+\infty} x_n = a$ we have $\lim_{n\to+\infty} f(x_n) = l$, we say that the limit of $f(x)$ when $x$ tends to $a$ is equal to $l$ and we write

\[ \lim_{x\to a} f(x) = l. \tag{2.14} \]

It follows that for both cases $a \in D_f$ and $a \notin D_f$:

1. if $a \in D_f$, then the limit $\lim_{x\to a} f(x)$ exists if and only if $f$ is continuous at $a$, and in this case $l = f(a)$ (take the constant sequence with $x_n = a$);

2. if $a \notin D_f$, then the limit $\lim_{x\to a} f(x)$ exists if and only if $f$ admits a continuous extension at $a$, written $\bar{f} : D_f \cup \{a\} \to \mathbb{R}$, and in this case $l = \bar{f}(a)$.

We will discuss the notion of limit of a function in more detail in Chapter 4, in which we also extend this notion to the strongly divergent sequences that will be presented in Chapter 2.7.

2.6.2 Applications of the Bolzano-Weierstrass theorem to continuous functions over [a, b]

Introduction. The important properties of continuous functions defined over a bounded and closed interval are consequences of the Bolzano-Weierstrass theorem. Here we show that any continuous function over a closed and bounded interval attains its extremal values. For another consequence about uniform continuity of functions see Chapter 4.

Maximum and minimum of a function. Let $D, T \subset \mathbb{R}$ and $f : D \to T$. Then $\sup\{f(x) : x \in D\}$ and $\inf\{f(x) : x \in D\}$ are called the supremum, respectively the infimum, of $f$ and we denote them by $\sup_{x\in D} f(x)$, respectively $\inf_{x\in D} f(x)$. We say that $f$ attains its maximum (respectively its minimum) at $a \in D$ if $f(a) = \sup_{x\in D} f(x)$ (resp. $\inf_{x\in D} f(x)$) and we write $f(a) = \max_{x\in D} f(x)$ (resp. $f(a) = \min_{x\in D} f(x)$).

Theorem 2.9. Let $f$ be a continuous function defined over the closed and bounded interval $[a, b]$. Then $f$ attains its maximum and its minimum.

Proof. It suffices to show that the function $f$ attains its maximum because $\min_{x\in[a,b]} f(x) = -\max_{x\in[a,b]} (-f(x))$ and $-f$ is continuous. Let $S := \sup_{x\in[a,b]} f(x)$. (Note that $S = \infty$ if $f$ is not bounded.) There exists a sequence $(x_n)_{n\in\mathbb{N}}$ of elements $x_n \in [a, b]$ such that $\lim_{n\to+\infty} f(x_n) = S$. The sequence $(x_n)_{n\in\mathbb{N}}$ is bounded (because $[a, b]$ is bounded), hence by the Bolzano-Weierstrass theorem there exists a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ that converges. Let $p$ be the limit of this subsequence. The interval $[a, b]$ is closed, hence $p \in [a, b]$. From the continuity of $f$ we have

\[ f(p) = \lim_{k\to+\infty} f(x_{n_k}) = S. \]

In particular, $S < \infty$, so $f$ is bounded and attains its maximum.

Remark. If the interval is open, semi-open or unbounded this result does not hold. For example, the identity function $x \mapsto x$ defined over $]0, 1[$ is bounded and continuous but attains neither its supremum 1 nor its infimum 0.

Theorem 2.10. - Intermediate value theorem. Let $f$ be a continuous function defined over a closed and bounded interval $[a, b]$ with $f(a) < f(b)$. Then for all real $r$ such that $f(a) < r < f(b)$, there exists an element $c \in\, ]a, b[$ such that $f(c) = r$. The number $r$ is said to be an intermediate value.


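Proof. We consider the bounded set $E = \{x \in [a, b] : f(x) \le r\}$. Notice that $E \ne \emptyset$ since $a \in E$. We let $c := \sup E$. The point $c$ is in the closure of $E$, so there exists a sequence of elements $x_n \in E$ that converges towards $c$. From the continuity of $f$:

\[ f(c) = \lim_{n\to+\infty} f(x_n) \le r. \]

$r < f(b)$ implies $c < b$. Consequently, the left-open interval $]c, b]$ is non-empty and $f(x) > r$ for all $x \in\, ]c, b]$. Consequently, considering only those $n$ large enough that $c + \frac{1}{n} \le b$,

\[ f(c) = \lim_{n\to+\infty} f\left(c + \frac{1}{n}\right) \ge r, \]

hence $f(c) = r$. Since $f(a) < r < f(b)$, we have $c \ne a$ and $c \ne b$, that is $c \in\, ]a, b[$.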

Remark. If $f(a) > f(b)$, the conclusion of the intermediate value theorem obviously remains true by applying the theorem to $-f$.

Theorem 2.11. - Theorem of transport of intervals. Let $f$ be a continuous function defined over the bounded and closed interval $[a, b]$. Then the set of images of $f$ is the bounded and closed interval $[\min_{x\in[a,b]} f(x), \max_{x\in[a,b]} f(x)]$.

Proof. The theorem of transport of intervals is a corollary of theorem 2.9 and of the intermediate value theorem.

[Figure: The image of a continuous function over [a, b] and the intermediate value theorem.]

Corollary (Solutions of equations). Let $f$ be a continuous function defined over the bounded and closed interval $[a, b]$ with $f(a)f(b) \le 0$. Then the equation $f(x) = 0$ has at least one solution in $[a, b]$.
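The corollary is an existence statement, but it also suggests a simple way to locate a solution numerically: halve the interval repeatedly while keeping the sign condition $f(a)f(b) \le 0$. This bisection method is not part of the text above; the sketch below, with the hypothetical function name `bisect_root` and an arbitrarily chosen tolerance `tol`, is only meant to illustrate how the hypothesis of the corollary is used.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a solution of f(x) = 0 in [a, b], assuming f is continuous
    and f(a) * f(b) <= 0, by repeatedly halving the interval."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("the corollary requires f(a) * f(b) <= 0")
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0:
            b, fb = m, fm      # a solution lies in [a, m]
        else:
            a, fa = m, fm      # otherwise f(m) * f(b) <= 0: keep [m, b]
    return 0.5 * (a + b)


# usage: f(x) = x^2 - 2 changes sign on [0, 2]; the solution is sqrt(2)
print(bisect_root(lambda x: x * x - 2.0, 0.0, 2.0))
```

At every step the sign condition holds on the current interval, so by the corollary it always contains a solution; its length is halved each time, which gives convergence of the midpoints.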

Proposition 2.6.1. Let $h : [0, 1] \to [0, 1]$ be continuous. Then the equation

\[ x = h(x) \]

has at least one solution $x$ in $[0, 1]$. We say that $x$ is a fixed point of $h$⁵. More generally, let $g : [0, 1] \to [0, 1]$ be a continuous and surjective function. Then the equation

\[ g(x) = h(x) \]

has at least one solution $x$ in $[0, 1]$.

Proof. Let us take $f(x) = g(x) - h(x)$. The function $g$ is surjective, so there exist $c, d \in [0, 1]$ such that $g(c) = 0$ and $g(d) = 1$. Consequently $f(c) = -h(c) \le 0$ and $f(d) = 1 - h(d) \ge 0$. By the above corollary, applied on the interval with endpoints $c$ and $d$, there exists an $x$ such that $f(x) = 0$. In particular, if $g(x) = x$, then $g$ is surjective and there exists a fixed point for the function $h$.

Proposition 2.6.2. 1. Let $I$ be an interval and $f : I \to J$ be a continuous and injective function. Then $f$ is strictly monotonic.

2. Let $I$ be an interval and $f : I \to J$ be continuous and bijective. Then its inverse function $f^{-1} : J \to I$ is continuous.

Proof. 1. It suffices to consider the case $I = [a, b]$; otherwise take $a, b \in I$, $a < b$, arbitrarily to prove strict monotonicity over any bounded and closed subinterval of $I$, and hence over $I$. Without loss of generality we can suppose that $f(a) < f(b)$. Let $x_1, x_2 \in\, ]a, b[$, $x_1 < x_2$, be arbitrary. Injectivity of $f$ implies that $f(x_1), f(x_2) \in\, ]f(a), f(b)[$ since if, for example, $f(x_1) < f(a)$, then the intermediate value theorem guarantees the existence of a $c \in\, ]x_1, b[$ such that $f(c) = f(a)$, which contradicts injectivity. Moreover $f(x_1) \ne f(x_2)$. If $f(x_1) > f(x_2)$, by the intermediate value theorem there exists $c \in\, ]x_2, b[$ such that $f(c) = f(x_1)$, hence the contradiction.

2. As earlier it is enough to consider the case $I = [a, b]$. Without loss of generality we can suppose that $f(a) < f(b)$, that is $f$ is strictly increasing, hence $J = [f(a), f(b)]$. Let $(y_n)$ be a sequence of elements $y_n \in J$ that converges towards $y$. Notice that $y \in J$ because $J$ is closed. We define $x_n := f^{-1}(y_n)$. Then $x_n \in I = [a, b]$. By the Bolzano-Weierstrass theorem, there exists a subsequence $(x_{n_k}) = (f^{-1}(y_{n_k}))$ that converges towards an $x \in I$:

\[ x = \lim_{k\to+\infty} f^{-1}(y_{n_k}). \]

By applying the continuous function $f$ to this identity we obtain:

\[ f(x) = f\left(\lim_{k\to+\infty} f^{-1}(y_{n_k})\right) = \lim_{k\to+\infty} f(f^{-1}(y_{n_k})) = y. \]

If there is another subsequence $(x_{m_k}) = (f^{-1}(y_{m_k}))$ with limit $\tilde{x} \in I$, we find by the same argument that $f(\tilde{x}) = y$, hence $\tilde{x} = x$ thanks to the injectivity of $f$. Consequently, all convergent subsequences of the bounded sequence $(x_n)$ have the same limit $x$, so the whole sequence $(x_n)$ converges towards $x$. It follows that for any sequence $y_n \in J$ that converges towards $y$:

\[ \lim_{n\to+\infty} f^{-1}(y_n) = x = f^{-1}(y), \]

that is $f^{-1}$ is continuous at every $y \in J$.

5It is Brouwer’s theorem (Brouwer, 1881-1966) for functions f : R→ R.


[Figure: A strictly monotonic continuous function over [a, b] and its inverse function.]

2.7 Strongly divergent sequences

Introduction. Among all divergent sequences, we will particularly distinguish those which tend to infinity.

Definition. We say that the sequence $(x_n)$ tends to $\infty$ (respectively $-\infty$) if for all $C \in \mathbb{R}$ there exists an $N \in \mathbb{N}$ such that for all $n \ge N$, $x_n > C$ (respectively $x_n < C$), and we write $\lim_{n\to+\infty} x_n = \infty$ (respectively $\lim_{n\to+\infty} x_n = -\infty$). A sequence is said to be strongly divergent if it tends to $\infty$ or to $-\infty$.

Example. The arithmetic sequence defined by $x_{n+1} = x_n + d$ and $x_0 = a$ tends to $\infty$ if $d > 0$ and tends to $-\infty$ if $d < 0$. Indeed, by induction we can easily prove that $x_n = a + nd$.

Computation rules for limit values. In certain cases, the computation rules (2.2)-(2.4) for limit values of convergent sequences can be extended to strongly divergent sequences if we define the following rules.

Theorem 2.12. - Computation rules for the symbol $\infty$.

\[ \infty + \infty = \infty, \quad \infty \cdot \infty = \infty, \quad 0/\infty = 0, \]

\[ c + \infty = \infty, \quad c/\infty = 0, \quad \text{for all } c \in \mathbb{R}, \]

\[ c \cdot \infty = \infty, \quad \text{for all } c > 0. \]

Remark. Obviously we do not define expressions that are undefined, like $0 \cdot \infty$, $\infty/\infty$, $\infty - \infty$, $\infty/0$ or $c/0$.

2.8 Limits of recurrent sequences

We present a fairly general approach to exercises that ask for the limit of a recurrent sequence. The sequences under study are defined by linear recurrence relations of the form $x_{n+1} = qx_n + b$ and by non-linear recurrence relations $x_{n+1} = f(x_n)$. The theory and the algorithm to solve linear recurrences explicitly are dealt with in a Linear Algebra course or in a course covering differential equations and dynamical systems. Here we only look at the convergence problem. For non-linear sequences, we present a general method to study the convergence problem.

2.8.1 Linear recurrent sequences

The recurrence relation xn+1 = qxn + b

Geometric sequences. Let $a, q \in \mathbb{R}$ (or more generally $a, q \in \mathbb{C}$). Let us consider the geometric sequence $(x_n)_{n\in\mathbb{N}}$ defined by

$$x_n = a q^n \tag{2.15}$$

A geometric sequence is characterized by the property $x_{n+1} = q x_n$ and the initial condition $x_0 = a$. The sequence $(x_n)_{n\in\mathbb{N}}$ is convergent if $|q| < 1$ or $q = 1$; otherwise it is divergent (except if $a = 0$). More precisely, if $(x_n)$ is a sequence satisfying $x_{n+1} = q x_n$ and $x_0 = a \neq 0$, then

$$\lim_{n\to+\infty} x_n = \begin{cases} 0 & \text{if } |q| < 1, \\ a & \text{if } q = 1, \\ +\infty \cdot \operatorname{sgn}(a) & \text{if } q > 1, \\ \text{does not exist} & \text{otherwise.} \end{cases} \tag{2.16}$$

Recurrent sequences. Let $x_n$ be defined by

$$x_{n+1} = q x_n + b \quad \text{and} \quad x_0 = a \tag{2.17}$$

for $a, b \neq 0$. We suppose that the sequence $(x_n)_{n\in\mathbb{N}}$ converges to a limit $x$. The limit $x$ is necessarily a solution of the linear equation

$$x = qx + b, \quad \text{i.e.} \quad x = \frac{b}{1 - q}. \tag{2.18}$$

Consequently, if $q = 1$ the sequence does not converge. We define $y_n$ by $y_n = x_n - x$. In case of convergence, the sequence $(y_n)_{n\in\mathbb{N}}$ must converge towards 0. By using $x = qx + b$ we find

$$x_{n+1} - x = q x_n + b - qx - b = q(x_n - x)$$

or

$$y_{n+1} = q y_n \tag{2.19}$$


and $y_0 = x_0 - x = a - x$. We then deduce the following result:

Proposition 2.8.1. - Universality of the limit. The recurrent sequence defined by (2.17) converges for any initial condition $x_0$ if and only if $|q| < 1$. In this case

$$\lim_{n\to+\infty} x_n = \frac{b}{1 - q}.$$

In particular, the sequence $(x_n)_{n\in\mathbb{N}}$ is given by

$$x_n = \frac{b}{1 - q} - \frac{b q^n}{1 - q} + x_0 q^n.$$

Example. Compute the limit of the sequence $(x_n)_{n\in\mathbb{N}}$ defined by

$$x_{n+1} = \frac{1}{4}(3x_n + 1), \quad x_0 = 0.$$

Solution. Let us suppose that $(x_n)$ converges and denote by $x$ its limit, which satisfies $x = \frac{1}{4}(3x + 1)$, i.e. $x = 1$. Take $y_n = x_n - x = x_n - 1$. Then

$$x_{n+1} - 1 = \frac{1}{4}(3x_n + 1) - 1 = \frac{3}{4}(x_n - 1)$$

or

$$y_{n+1} = \frac{3}{4} y_n$$

and $y_0 = x_0 - 1 = -1$. The sequence $(y_n)_{n\in\mathbb{N}}$ is a geometric sequence with $q = \frac{3}{4}$. Hence $(y_n)_{n\in\mathbb{N}}$ converges towards 0. Consequently, $x_n = y_n + x = y_n + 1$ converges towards 1.
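The closed formula of Proposition 2.8.1 can be compared numerically with a direct iteration of the recurrence. The following is only a small sketch in Python; the function names `iterate_linear` and `closed_form` are chosen here for illustration and are not part of the course.

```python
def iterate_linear(q, b, x0, n):
    """Apply x_{k+1} = q * x_k + b exactly n times, starting from x0."""
    x = x0
    for _ in range(n):
        x = q * x + b
    return x


def closed_form(q, b, x0, n):
    """Explicit formula x_n = b/(1-q) - b*q^n/(1-q) + x0*q^n (requires q != 1)."""
    return b / (1 - q) - b * q**n / (1 - q) + x0 * q**n


if __name__ == "__main__":
    # The example above: x_{n+1} = (3 x_n + 1) / 4, x_0 = 0, expected limit 1.
    q, b, x0 = 0.75, 0.25, 0.0
    for n in (1, 5, 20, 80):
        print(n, iterate_linear(q, b, x0, n), closed_form(q, b, x0, n))
```

Both columns should agree up to rounding and approach $b/(1-q) = 1$ as $n$ grows.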

2.8.2 Non-linear recurrent sequences

A general method. Let xn be defined by

xn+1 = f(xn) and x0 = a (2.20)

for a continuous real function $f$. We suppose that the sequence $(x_n)_{n\in\mathbb{N}}$ converges to a limit $x^*$. The limit $x^*$ is necessarily a solution of the equation

$$x^* = f(x^*). \tag{2.21}$$

Such a solution $x^*$ is also called a fixed point of $f$. Let us suppose that this equation has only one solution. (If there exist several solutions, we take the one which seems to be the correct limit, or we search for bounds of the sequence $(x_n)_{n\in\mathbb{N}}$ to exclude all the solutions except one; if there is no solution, then $(x_n)_{n\in\mathbb{N}}$ cannot converge.) As in the linear case, we define $y_n$ by $y_n = x_n - x^*$. In case of convergence, $(y_n)_{n\in\mathbb{N}}$ must converge towards 0. By using $x^* = f(x^*)$ we find

$$x_{n+1} - x^* = f(x_n) - f(x^*)$$

or

$$y_{n+1} = f(x^* + y_n) - f(x^*) \tag{2.22}$$


We then try to find estimates of $y_n$ which guarantee that $|f(x^* + y_n) - f(x^*)| \le q|y_n|$ for a constant $q \in \mathbb{R}$ such that $0 < q < 1$. In this case

$$|y_{n+1}| \le q|y_n|$$

and hence by induction

$$|y_n| \le q^n |y_0|,$$

hence

$$\lim_{n\to+\infty} y_n = 0, \quad \text{or} \quad \lim_{n\to+\infty} x_n = x^*.$$

2.8.3 Banach’s fixed point theorem

By generalizing the method presented in 2.8.2 we can prove a famous (and important!) theorem:

Theorem 2.13. - Banach's fixed point theorem.⁶ Let $I$ be a closed interval and $f : I \to I$ be a contraction mapping, that is, there exists $0 < q < 1$ such that

$$|f(x) - f(x')| \le q|x - x'| \tag{2.23}$$

for all $x, x' \in I$. Then $f$ has a unique fixed point $x^*$ in $I$.

Proof. For $x_0 \in I$, we consider the recurrence relation (2.20). Let us first notice that $x_n \in I$ implies $x_{n+1} \in I$ since $f : I \to I$. By induction on $n$:

$$|x_{n+1} - x_n| = |f(x_n) - f(x_{n-1})| \le q|x_n - x_{n-1}| \le \dots \le q^n |x_1 - x_0|.$$

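The sequence $(x_n)_{n\in\mathbb{N}}$ is a Cauchy sequence since for all non-negative integers $n, k$, by a telescopic sum and the triangle inequality:

$$|x_{n+k} - x_n| = \Big| \sum_{i=1}^{k} (x_{n+i} - x_{n+i-1}) \Big| \le \sum_{i=1}^{k} |x_{n+i} - x_{n+i-1}| \le \sum_{i=1}^{k} q^{n+i-1} |x_1 - x_0| = q^n |x_1 - x_0| \sum_{i=1}^{k} q^{i-1} \le \frac{q^n |x_1 - x_0|}{1 - q},$$

and the right-hand side tends to 0 when $n \to \infty$, independently of $k$.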

Let $x^*$ be the limit of this sequence. Notice that $x^* \in I$ since $I$ is closed. We show that $x^*$ is a fixed point of $f$. For all $n$:

$$|f(x^*) - x^*| = |f(x^*) - f(x_n) + x_{n+1} - x^*| \le |f(x^*) - f(x_n)| + |x_{n+1} - x^*| \le q|x_n - x^*| + |x_{n+1} - x^*| \to 0 \quad \text{when } n \to \infty.$$

We still have to prove uniqueness. Let $x^{**} \in I$ be another fixed point, then

$$|x^{**} - x^*| = |f(x^{**}) - f(x^*)| \le q|x^{**} - x^*|,$$

hence $(1 - q)|x^{**} - x^*| \le 0$ and, since $q < 1$, $x^{**} = x^*$.

⁶ Named after Stefan Banach (1892 - 1945). We consider it together with the Bolzano-Weierstrass theorem as one of the fundamental theorems of Analysis. Its main applications to differential calculus and integration will be discussed in Analysis 2.


Remark. Being a contraction mapping is not by itself enough to guarantee the existence of a fixed point in $I$: $f[I]$ must be contained in $I$ (in order to apply the theorem, this adaptation of $f$ often represents the hard part!). For example, let $I = [0, 1]$ and $f(x) = \frac{x + a}{2}$, $a \ge 0$. Then $f$ is a contraction mapping (with $q = \frac{1}{2}$) and $f[I] = [\frac{a}{2}, \frac{1 + a}{2}]$. The theorem ensures the existence of a unique fixed point in $I$ if $a \le 1$. If $a > 1$, then $f[I]$ is no longer contained in $I$. By an explicit computation we see that $x^* = a$ is the only fixed point of $f$ in $\mathbb{R}$, but $a \notin I$, so $f$ does not have any fixed point in $I$.

Remark. The fixed point theorem ensures the existence of a unique solution $x^*$ to the equation $x = f(x)$ in $I$ for any contraction mapping $f : I \to I$. It follows from the proof that any recurrent sequence defined by $x_{n+1} = f(x_n)$ with initial condition $x_0$ in $I$ converges towards $x^*$. Hence the theorem allows us to investigate these recurrence relations more deeply. To illustrate it, we will reexamine a previous example:

Example. We consider the recurrence relation given by

$$x_{n+1} = \frac{1}{3}(3x_n - x_n^2 + 4).$$

In order to be able to apply the fixed point theorem we hope to find a closed interval $I$ over which $f(x) := \frac{1}{3}(3x - x^2 + 4)$ is a contraction mapping $f : I \to I$. We have already shown that $f(x) \le \frac{25}{12} = f(\frac{3}{2})$ for all $x \in \mathbb{R}$ and $f(x) \ge 0$ for all $x \in [-1, 4]$ (the zeros are located at $-1$ and $4$). Consequently, $f : [-1, 4] \to [0, \frac{25}{12}] \subset [-1, 4]$. We must still check whether or not $f$ is a contraction mapping over this domain. We have

$$f(x) - f(x') = (x - x') \, \frac{3 - x - x'}{3}.$$

Hence $f$ is not a contraction mapping over this domain because by choosing $x = 3$, $x' = 4$ the second factor becomes $-4/3$ and is hence greater than 1 in absolute value (the same problem occurs for $x, x' \le 0$). We try a smaller domain. A sufficient choice is, for example, $I = [1, \frac{5}{2}]$ since $f(x) \in [1, \frac{25}{12}] \subset I$ for all $x \in I$ and

$$-\frac{2}{3} \le \frac{3 - x - x'}{3} \le \frac{1}{3} \quad \text{for all } x, x' \in I,$$

so the choice $q = 2/3$ is admissible. By Banach's fixed point theorem, for any initial condition $x_0 \in I$, the sequence defined by this recurrence relation converges towards the fixed point $x^* = 2$. Note that the initial condition $x_0 = 0$ is not in $I$, but $x_1 = 4/3 \in I$, hence the sequence still converges towards $x^* = 2$.


The graph of $f(x) := \frac{1}{3}(3x - x^2 + 4)$ and the line of equation $y = x$.
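The convergence promised by Banach's fixed point theorem can be observed by iterating the map of the example. The sketch below is in Python; the helper `fixed_point_iteration`, its tolerance and its iteration cap are illustrative choices, not part of the course.

```python
def f(x):
    """The map of the example: f(x) = (3x - x^2 + 4) / 3."""
    return (3 * x - x**2 + 4) / 3


def fixed_point_iteration(g, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = g(x_n) until two successive iterates differ by less than tol.

    Returns the last iterate and the number of iterations used.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    # x0 = 0 lies outside I = [1, 5/2], but x1 = 4/3 already lies in I.
    for start in (0.0, 1.0, 2.5):
        limit, steps = fixed_point_iteration(f, start)
        print(start, limit, steps)
```

All three starting values should lead to iterates close to $x^* = 2$. The stopping rule on successive iterates is a practical heuristic; the rigorous error control is the estimate $|x_{n+k} - x_n| \le q^n |x_1 - x_0| / (1 - q)$ from the proof.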

2.9 Supplement: Sequences of complex numbers

The results of this chapter are also valid for sequences of complex numbers, with the exception of the convergence criteria which are based upon properties of an ordered field. The modulus $|\cdot|$ replaces the absolute value to measure the distance between numbers. Consequently, in analogy with the metric formulation of the limit process in equation (2.1) we define convergence for a sequence of complex numbers $z_n$:

$$\lim_{n\to+\infty} z_n = z \iff \lim_{n\to+\infty} |z_n - z| = 0. \tag{2.24}$$

A sequence $(z_n)_{n\in\mathbb{N}}$ is bounded if and only if there exists a constant $C > 0$ such that $|z_n| \le C$ for all $n \in \mathbb{N}$. Any bounded sequence satisfies the Bolzano-Weierstrass theorem:

Theorem 2.14. - Bolzano-Weierstrass theorem for sequences in $\mathbb{C}$. From any bounded sequence $(z_n)_{n\in\mathbb{N}}$, we can extract a convergent subsequence $(z_{n_k})_{k\in\mathbb{N}}$.

Proof. For all $n \in \mathbb{N}$ we let $z_n = x_n + i y_n$ with $x_n, y_n \in \mathbb{R}$ being the real part and the imaginary part of $z_n$. By the definition of the modulus, the sequences $(x_n)_{n\in\mathbb{N}}$, $(y_n)_{n\in\mathbb{N}}$ are bounded since $|x_n| \le |z_n| \le C$ and $|y_n| \le |z_n| \le C$ for a constant $C > 0$. By the Bolzano-Weierstrass theorem for real sequences, there exists a subsequence of $(x_n)_{n\in\mathbb{N}}$, written $(x_{n_m})_{m\in\mathbb{N}}$, that is convergent. Then we apply the Bolzano-Weierstrass theorem to the subsequence $(y_{n_m})_{m\in\mathbb{N}}$ to extract a convergent subsequence, written $(y_{n_{m_k}})_{k\in\mathbb{N}}$. By construction, the sequence $(x_{n_{m_k}})_{k\in\mathbb{N}}$ is a subsequence of the convergent subsequence $(x_{n_m})_{m\in\mathbb{N}}$ and is hence convergent. It follows that $(z_{n_{m_k}})_{k\in\mathbb{N}}$ is convergent.


As previously, the Bolzano-Weierstrass theorem implies the fundamental theorem on Cauchy sequences:

Theorem 2.15. - fundamental theorem on Cauchy sequences in $\mathbb{C}$. A sequence $(z_n)_{n\in\mathbb{N}}$ is a Cauchy sequence if and only if it converges.

2.9.1 Continuous functions

The definition of continuous functions $f : \mathbb{R} \to \mathbb{C}$, $f : \mathbb{C} \to \mathbb{R}$ or $f : \mathbb{C} \to \mathbb{C}$ is given by the commutativity property with the limit process as in equation (2.9).

Examples of continuous functions.

1. The function $f : \mathbb{R} \to \mathbb{C}$ given by $f(x) = e^{ipx} := \cos(px) + i \sin(px)$, $p \in \mathbb{R}$, is continuous at every $x \in \mathbb{R}$.

2. The functions $f : \mathbb{C} \to \mathbb{R}$ given by $f(z) = z + \bar{z}$, $f(z) = i(z - \bar{z})$, $f(z) = z\bar{z}$ and $f(z) = |z|$ are continuous at every $z \in \mathbb{C}$.

3. The polynomial functions $f : \mathbb{C} \to \mathbb{C}$ given by $f(z) = \sum_{k=0}^{n} a_k z^k$, $a_k \in \mathbb{C}$, are continuous at every $z \in \mathbb{C}$.

4. If f : C −→ C is continuous at z0 ∈ C, then f(z) and f(z) are continuousat z0 ∈ C.

Chapter 3

Series

We apply the results from Chapter 2 to the computation of series. We will show that Euler's number can be represented as a series and we define the exponential series.

Notions to learn. Convergent or divergent series, absolutely convergent series, geometric series, comparison criterion, d'Alembert's criterion, Cauchy's criterion, alternating series, Leibniz's criterion, exponential series, series for sin, cos, sinh and cosh.

Skills to acquire. Understand the different notions of convergence, understand and know how to apply the convergence criteria and to prove convergence or divergence of a given series by means of these criteria, compute the value of a convergent series in simple cases by means of the established computation rules.

3.1 Convergence of a series

Definition - series. Let $(a_k)_{k\in\mathbb{N}}$ be a sequence of elements in $\mathbb{R}$. Any formal expression of the form

$$\sum_{k=0}^{\infty} a_k = a_0 + a_1 + a_2 + \dots \tag{3.1}$$

is called a series. The $a_k$ are called the terms of the series.

The expression (3.1) is formal because it requires an infinite number of additions. In order to give a meaning to (3.1), we will study the sequence $(S_n)$ of partial sums defined by

$$S_n = \sum_{k=0}^{n} a_k.$$

Definition - convergence of a series. The series $\sum_{k=0}^{\infty} a_k$ is said to be convergent if the sequence $(S_n)_{n\in\mathbb{N}}$ of the partial sums converges; otherwise it is said to be divergent. The limit $S = \lim_{n\to+\infty} S_n$ is called the value of the series $\sum_{k=0}^{\infty} a_k$.

If the series $\sum_{k=0}^{\infty} a_k$ converges, we also say that the sequence of the $a_k$ is summable, or summable of sum $S$. If the sequence of the $S_n$ diverges strongly, that is if $\lim_{n\to+\infty} S_n = \infty$ or $\lim_{n\to+\infty} S_n = -\infty$, we respectively write $\sum_{k=0}^{\infty} a_k = \infty$ and $\sum_{k=0}^{\infty} a_k = -\infty$.

Recurrence relation. The partial sums $S_n$ satisfy the recurrence relation $S_{n+1} = S_n + a_{n+1}$.

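Theorem 3.1. - Computation rule for convergent series. Let $\sum_{k=0}^{\infty} a_k$, $\sum_{k=0}^{\infty} b_k$ be two convergent series. Then for all $\lambda, \mu \in \mathbb{R}$:

$$\lambda \sum_{k=0}^{\infty} a_k + \mu \sum_{k=0}^{\infty} b_k = \sum_{k=0}^{\infty} (\lambda a_k + \mu b_k). \tag{3.2}$$

Proof. For all $n \in \mathbb{N}$ we have:

$$\lambda \sum_{k=0}^{n} a_k + \mu \sum_{k=0}^{n} b_k = \sum_{k=0}^{n} (\lambda a_k + \mu b_k),$$

hence we get firstly the existence of the limit

$$\sum_{k=0}^{\infty} (\lambda a_k + \mu b_k) = \lim_{n\to+\infty} \sum_{k=0}^{n} (\lambda a_k + \mu b_k)$$

and secondly the assertion (3.2).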

Theorem 3.2. - Principal criterion for the convergence of a series. The series $\sum_{k=0}^{\infty} a_k$ converges if and only if for all $\varepsilon > 0$ there exists a natural number $N$ such that for all $n > m \ge N$ we have

$$\Big| \sum_{k=m+1}^{n} a_k \Big| < \varepsilon.$$

Proof. A series converges if and only if the sequence of partial sums $(S_n)_{n\in\mathbb{N}}$ converges, that is, if and only if $(S_n)_{n\in\mathbb{N}}$ is a Cauchy sequence: for all $\varepsilon > 0$ there exists a natural number $N$ such that for all $n > m \ge N$ we have $|S_n - S_m| < \varepsilon$. This proves the theorem because $S_n - S_m = \sum_{k=m+1}^{n} a_k$.

For the case $n = m + 1$, Theorem 3.2 gives the following necessary condition for convergence:

Corollary 3.3. - necessary condition for the convergence of a series. If the series $\sum_{k=0}^{\infty} a_k$ converges, then the sequence of the $a_k$ converges towards zero.

Example. The series $\sum_{k=0}^{\infty} \sin(k)$ is divergent because the sequence of the $\sin(k)$ does not converge towards zero (it is even divergent).

We now introduce a second concept of convergence for a series, called absolute convergence. We will see that this concept is stronger in the sense that absolute convergence implies convergence, but not the other way around. Thus there exist convergent series that are not absolutely convergent. The advantage of the concept of absolute convergence is that it allows us to establish easily applicable convergence criteria. Furthermore, we will see that absolute convergence is a necessary and sufficient condition for the value of a series to be independent of the indexation of the elements in the sequence $(a_k)_{k\in\mathbb{N}}$.

Definition - absolute convergence of a series. The series $\sum_{k=0}^{\infty} a_k$ is said to be absolutely convergent if the series $\sum_{k=0}^{\infty} |a_k|$ of the absolute values converges.

Theorem 3.4. - absolutely convergent implies convergent. If the series of absolute values $\sum_{k=0}^{\infty} |a_k|$ converges, then the series $\sum_{k=0}^{\infty} a_k$ converges.

Proof. This theorem is a consequence of the triangle inequality for the absolute value, which implies

$$|S_n - S_m| = \Big| \sum_{k=m+1}^{n} a_k \Big| \le \sum_{k=m+1}^{n} |a_k|.$$

Since the series is absolutely convergent we have by Theorem 3.2 that for all $\varepsilon > 0$ there exists a natural number $N$ such that for all $n > m \ge N$,

$$\Big| \sum_{k=m+1}^{n} |a_k| \Big| = \sum_{k=m+1}^{n} |a_k| < \varepsilon.$$

This shows that the sequence $(S_n)$ of partial sums is a Cauchy sequence.


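Example - geometric series. Let $q \in \mathbb{R}$. The series with terms $a_k = q^k$ is called a geometric series. It converges absolutely if $|q| < 1$ and

$$\sum_{k=0}^{\infty} q^k = \frac{1}{1 - q}. \tag{3.3}$$

For $q \le -1$ the series diverges and for $q \ge 1$ it diverges strongly.

Proof. For $q \neq 1$ we have

$$S_n = \sum_{k=0}^{n} q^k = \frac{1 - q^{n+1}}{1 - q} = \frac{1}{1 - q} - \frac{q^{n+1}}{1 - q}.$$

If $|q| < 1$, then

$$\Big| \sum_{k=0}^{n} q^k - \frac{1}{1 - q} \Big| \le \frac{1}{1 - q} |q|^{n+1} \to 0 \quad \text{when } n \to \infty.$$

For $q \le -1$, the $S_n$ are non-negative for even $n$ and non-positive for odd $n$; more precisely $S_n \ge 1$ for even $n$ and $S_n \le 0$ for odd $n$, so $(S_n)$ cannot converge. For $q > 1$ the $S_n$ are non-negative and $\lim_{n\to\infty} \frac{1 - q^{n+1}}{1 - q} = \infty$. For $q = 1$ we have $S_n = n + 1 \to \infty$ when $n \to \infty$.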
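The three regimes of the geometric series can be observed on the partial sums. A minimal sketch in Python (the function name `geometric_partial_sum` is an illustrative choice):

```python
def geometric_partial_sum(q, n):
    """Return S_n = sum_{k=0}^{n} q^k by direct summation."""
    return sum(q**k for k in range(n + 1))


if __name__ == "__main__":
    # |q| < 1: the partial sums approach 1 / (1 - q).
    q = 0.5
    print([geometric_partial_sum(q, n) for n in (1, 5, 20)], 1 / (1 - q))
    # q = -1: the partial sums oscillate between 1 and 0 (divergence).
    print([geometric_partial_sum(-1, n) for n in range(6)])
    # q = 2: the partial sums grow without bound (strong divergence).
    print([geometric_partial_sum(2, n) for n in range(6)])
```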

Example - harmonic series. The harmonic series $\sum_{k=1}^{\infty} \frac{1}{k}$ is divergent. It follows that Corollary 3.3 gives a necessary but not sufficient condition for the convergence of a series.

Proof. Let us prove divergence of the harmonic series. Let $p$ be a natural number. We will show by induction that

$$S_{2^p} = \sum_{k=1}^{2^p} \frac{1}{k} \ge 1 + \frac{p}{2}.$$

For $p = 0$, this inequality is true. If it holds for $p$, then

$$S_{2^{p+1}} = S_{2^p} + \sum_{k=2^p+1}^{2^{p+1}} \frac{1}{k} \ge 1 + \frac{p}{2} + \sum_{k=2^p+1}^{2^{p+1}} \frac{1}{k} \ge 1 + \frac{p}{2} + (2^{p+1} - 2^p) \frac{1}{2^{p+1}} = 1 + \frac{p + 1}{2}.$$

The subsequence $(S_{2^p})$ diverges and consequently the sequence $(S_n)$ diverges too.
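The lower bound $S_{2^p} \ge 1 + p/2$ can be checked with exact rational arithmetic for small $p$. This is only an illustration, not a substitute for the induction; the function name `harmonic` is chosen here.

```python
from fractions import Fraction


def harmonic(n):
    """Return the n-th harmonic partial sum 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


if __name__ == "__main__":
    for p in range(8):
        s = harmonic(2**p)
        bound = 1 + Fraction(p, 2)
        # Print the partial sum, the lower bound, and whether the bound holds.
        print(p, float(s), float(bound), s >= bound)
```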

Example - alternating harmonic series. From the previous example, we deduce that the alternating harmonic series $\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}$ is not absolutely convergent. We will show in Section 3.3 that it is nonetheless convergent. Hence there exist series that are convergent without being absolutely convergent.

Motivated by these examples, we now analyze two situations for which we can easily establish convergence criteria: the series with non-negative terms as well as the alternating series.


3.2 Series with non-negative terms

Let $\sum_{k=0}^{\infty} a_k$ be a series with non-negative terms $a_k$. For these series convergence and absolute convergence are equivalent. The sequence of partial sums $S_n$ satisfies the recurrence relation $S_{n+1} = S_n + a_{n+1}$ and thus we get $S_{n+1} \ge S_n$. By the Bolzano-Weierstrass theorem, the series $\sum_{k=0}^{\infty} a_k$ is convergent if and only if the sequence of partial sums is bounded, and the series is strongly divergent otherwise. We write $\sum_{k=0}^{\infty} a_k < \infty$ if the series converges and $\sum_{k=0}^{\infty} a_k = \infty$ if the series diverges strongly.

Example. The series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ converges because $S_n = \frac{n}{n+1}$ (see Formula (1.2)), and hence

$$\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1.$$

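Proposition 3.2.1. - comparison criterion. Let $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ be two sequences with non-negative elements and let $n_0 \in \mathbb{N}$ be such that for any integer $n \ge n_0$, $a_n \le b_n$. Then

$$\sum_{k=0}^{\infty} b_k < \infty \quad \text{implies that} \quad \sum_{k=0}^{\infty} a_k < \infty,$$

$$\sum_{k=0}^{\infty} a_k = \infty \quad \text{implies that} \quad \sum_{k=0}^{\infty} b_k = \infty.$$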

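Example. The series $\sum_{k=1}^{\infty} \frac{1}{k^\alpha}$ diverges if $0 < \alpha \le 1$ and converges if $\alpha > 1$.

Proof. Indeed, if $0 < \alpha \le 1$ we have $\frac{1}{k^\alpha} \ge \frac{1}{k}$ and hence

$$\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{k^\alpha} \ge \lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{k} = \infty,$$

because the harmonic series diverges. If $\alpha \ge 2$, we use (for $k \ge 2$) the inequalities $\frac{1}{k^\alpha} \le \frac{1}{k^2} < \frac{1}{k(k-1)}$ and hence

$$\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{k^\alpha} \le \lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{k^2} \le 1 + \lim_{n\to\infty} \sum_{k=2}^{n} \frac{1}{k(k-1)} = 1 + \lim_{n\to\infty} \sum_{k=1}^{n-1} \frac{1}{k(k+1)} = 2.$$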


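If $1 < \alpha \le 2$, we show inductively that for any positive integer $p$

$$S_{2^p - 1} = \sum_{k=1}^{2^p - 1} \frac{1}{k^\alpha} \le \sum_{n=0}^{p-1} \Big( \frac{1}{2^{\alpha-1}} \Big)^n.$$

For $p = 1$ this inequality is true. If it holds for $p$, then

$$S_{2^{p+1} - 1} = S_{2^p - 1} + \sum_{k=2^p}^{2^{p+1} - 1} \frac{1}{k^\alpha} \le \sum_{n=0}^{p-1} \Big( \frac{1}{2^{\alpha-1}} \Big)^n + \sum_{k=2^p}^{2^{p+1} - 1} \frac{1}{k^\alpha}$$

$$\le \sum_{n=0}^{p-1} \Big( \frac{1}{2^{\alpha-1}} \Big)^n + (2^{p+1} - 2^p) \cdot \frac{1}{2^{\alpha p}} = \sum_{n=0}^{p-1} \Big( \frac{1}{2^{\alpha-1}} \Big)^n + \Big( \frac{1}{2^{\alpha-1}} \Big)^p = \sum_{n=0}^{p} \Big( \frac{1}{2^{\alpha-1}} \Big)^n.$$

The sequence of partial sums $(S_n)_{n\in\mathbb{N}}$ is increasing. For all $n$ there exists $p$ such that $n \le 2^p - 1$ and consequently

$$S_n \le S_{2^p - 1} \le \sum_{n=0}^{\infty} \Big( \frac{1}{2^{\alpha-1}} \Big)^n = \frac{1}{1 - 2^{1-\alpha}},$$

hence the convergence of $(S_n)_{n\in\mathbb{N}}$.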
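The uniform bound $S_n \le \frac{1}{1 - 2^{1-\alpha}}$ obtained above can be compared with computed partial sums. The sketch below uses floating-point arithmetic; the function names are illustrative and the bound is the one from the proof, which is not sharp.

```python
def p_series_partial_sum(alpha, n):
    """Return sum_{k=1}^{n} 1 / k^alpha."""
    return sum(1.0 / k**alpha for k in range(1, n + 1))


def densification_bound(alpha):
    """Upper bound 1 / (1 - 2^(1 - alpha)) from the proof, valid for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the bound requires alpha > 1")
    return 1.0 / (1.0 - 2.0 ** (1.0 - alpha))


if __name__ == "__main__":
    for alpha in (1.1, 1.5, 2.0):
        print(alpha, p_series_partial_sum(alpha, 10_000), densification_bound(alpha))
```

For $\alpha = 2$ the bound equals 2, in agreement with the estimate obtained for $\alpha \ge 2$.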

This technique, which consists in "densifying" a series, also extends to a more general situation:

Proposition - Densification criterion. Let $(a_n)_{n\in\mathbb{N}}$ be a decreasing sequence of non-negative elements. Then $\sum_{k=0}^{\infty} a_k$ converges if and only if $\sum_{k=0}^{\infty} 2^k a_{2^k}$ converges.

3.3 Alternating series

Definition - alternating series. A series of the form $\sum_{k=0}^{\infty} (-1)^k b_k$, with $(b_k)_{k\in\mathbb{N}}$ being a sequence of non-negative elements, is called an alternating series.

Remark. If an alternating series converges we have

$$\sum_{k=0}^{\infty} (-1)^k b_k = -\sum_{k=0}^{\infty} (-1)^{k+1} b_k,$$

because for all $n \ge 0$ we have

$$\sum_{k=0}^{n} (-1)^k b_k = -\sum_{k=0}^{n} (-1)^{k+1} b_k.$$


Theorem 3.5. - Leibniz's convergence criterion. Let $(b_k)_{k\in\mathbb{N}}$ be a decreasing sequence of non-negative elements such that $\lim_{k\to+\infty} b_k = 0$. Then the alternating series $\sum_{k=0}^{\infty} (-1)^k b_k$ converges.

Proof. For $n \in \mathbb{N}$ let $x_n = S_{2n+1}$ and $y_n = S_{2n}$. We show that the sequences $(x_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}}$ satisfy the third monotonicity criterion from Section 2.3. The monotonicity of $(b_n)_{n\in\mathbb{N}}$ implies that $(x_n)_{n\in\mathbb{N}}$ is increasing and that $(y_n)_{n\in\mathbb{N}}$ is decreasing because

$$x_{n+1} - x_n = S_{2n+3} - S_{2n+1} = -b_{2n+3} + b_{2n+2} \ge 0$$

and

$$y_{n+1} - y_n = S_{2n+2} - S_{2n} = b_{2n+2} - b_{2n+1} \le 0.$$

Moreover $x_n - y_n = S_{2n+1} - S_{2n} = -b_{2n+1} \le 0$. Then the sequences $(x_n)_{n\in\mathbb{N}}$ and $(y_n)_{n\in\mathbb{N}}$ are bounded and

$$\lim_{n\to+\infty} (x_n - y_n) = \lim_{n\to+\infty} (-b_{2n+1}) = 0.$$

The third monotonicity criterion implies that the sequences $(x_n)$ and $(y_n)$ converge towards the same limit. In this way, the sequence of partial sums $(S_n)$ converges.

Remark. For an alternating series satisfying the conditions in Leibniz's theorem we have the following bounds for all $n \in \mathbb{N}$:

$$S_{2n+1} \le \sum_{k=0}^{\infty} (-1)^k b_k \le S_{2n}.$$

Example - a convergent alternating series that is not absolutely convergent. By Leibniz's criterion, the alternating harmonic series $\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}$ is convergent. By applying the estimates above for $n = 2$ and $n = 3$ we obtain

$$\frac{7}{12} < \frac{37}{60} < \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} < \frac{47}{60} < \frac{5}{6}.$$

This example shows that there are convergent series that are not absolutely convergent.
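The four fractions in the estimate above are partial sums of the alternating harmonic series with 4, 6, 5 and 3 terms respectively. They can be reproduced exactly with rational arithmetic; the following Python sketch uses an illustrative function name.

```python
from fractions import Fraction


def alternating_harmonic_partial_sum(n):
    """Return 1 - 1/2 + 1/3 - ... with n terms, as an exact fraction."""
    return sum(Fraction((-1) ** (k + 1), k) for k in range(1, n + 1))


if __name__ == "__main__":
    # An even number of terms gives a lower bound, an odd number an upper bound.
    for n in (3, 4, 5, 6):
        s = alternating_harmonic_partial_sum(n)
        print(n, s, float(s))
```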

Example - an absolutely convergent alternating series. By Leibniz's criterion, the series $\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k^2}$ is convergent. This series is in fact absolutely convergent because

$$\sum_{k=1}^{\infty} \frac{1}{k^2} \le 1 + \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 2.$$


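Example - a divergent alternating series. For any positive integer $k$ let

$$b_k = \begin{cases} \dfrac{1}{\sqrt{k} + \sqrt{k+2}} & \text{if } k \text{ is even,} \\[2mm] \dfrac{1}{k(k+2)} & \text{if } k \text{ is odd.} \end{cases}$$

We assert that the series $\sum_{k=1}^{\infty} (-1)^k b_k$ is divergent. The sequence $(b_k)_{k\in\mathbb{N}}$ satisfies $\lim_{k\to+\infty} b_k = 0$ but it is not decreasing; hence Leibniz's criterion is not applicable. By using

$$b_{2j} = \frac{\sqrt{2}(\sqrt{j+1} - \sqrt{j})}{2} \quad \text{and} \quad b_{2j-1} = \frac{1}{2} \Big( \frac{1}{2j-1} - \frac{1}{2j+1} \Big),$$

we obtain for any non-negative integer $n$:

$$S_{2n} = \sum_{j=1}^{n} b_{2j} - \sum_{j=1}^{n} b_{2j-1} = \frac{\sqrt{2}(\sqrt{n+1} - 1)}{2} - \frac{1}{2} \Big( 1 - \frac{1}{2n+1} \Big),$$

hence $\lim_{n\to+\infty} S_{2n} = \infty$.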

3.4 Absolute convergence: convergence criteria

In general it is easier to study whether a series is absolutely convergent because we can apply the comparison criteria that we have established for series with non-negative terms.

Theorem - comparison criteria. Let $\sum_{k=0}^{\infty} b_k$ be an absolutely convergent series and $(a_k)_{k\in\mathbb{N}}$ be a sequence such that for all $k \in \mathbb{N}$, $|a_k| \le |b_k|$. Then the series $\sum_{k=0}^{\infty} a_k$ converges absolutely.

Let $\sum_{k=0}^{\infty} b_k$ be a series that is not absolutely convergent and $(a_k)_{k\in\mathbb{N}}$ be a sequence such that for all $k \in \mathbb{N}$, $|a_k| \ge |b_k|$. Then the series $\sum_{k=0}^{\infty} a_k$ does not converge absolutely.

The following criteria are consequences of the comparison criterion where $(b_k)_{k\in\mathbb{N}}$ is a geometric sequence, that is $b_k = C\rho^k$. See equation (3.3).

Theorem - d'Alembert's criterion. Let $(a_k)_{k\in\mathbb{N}}$ be a sequence of elements in $\mathbb{R}^*$ for which $\lim_{k\to+\infty} \big| \frac{a_{k+1}}{a_k} \big| = \rho$. Then if $\rho < 1$, the series $\sum_{k=0}^{\infty} a_k$ converges absolutely, and it diverges if $\rho > 1$.

Theorem - generalized d'Alembert's criterion (the convergent case). Let $(a_k)_{k\in\mathbb{N}}$ be a sequence of elements in $\mathbb{R}^*$ for which there exist $n_0 \in \mathbb{N}$ and $\rho < 1$ such that

$$\Big| \frac{a_{n+1}}{a_n} \Big| \le \rho \quad \text{for all } n \ge n_0;$$

then the series $\sum_{k=0}^{\infty} a_k$ converges absolutely. In particular, if

$$\limsup_{k\to+\infty} \Big| \frac{a_{k+1}}{a_k} \Big| = \rho$$

for a $\rho < 1$, then the series $\sum_{k=0}^{\infty} a_k$ converges absolutely.

Theorem - Cauchy's criterion. Let $(a_k)_{k\in\mathbb{N}}$ be a sequence of elements in $\mathbb{R}$ for which $\limsup_{k\to+\infty} \sqrt[k]{|a_k|} = \rho$. Then if $\rho < 1$, the series $\sum_{k=0}^{\infty} a_k$ converges absolutely, and it diverges if $\rho > 1$.

Remark. If the sequence $(\sqrt[k]{|a_k|})_k$ is convergent, then $\limsup_{k\to+\infty} \sqrt[k]{|a_k|} = \lim_{k\to+\infty} \sqrt[k]{|a_k|}$.

Application to geometric series. Take the geometric series with $a_k = a_0 q^k$. Then $\rho = |q|$ since

$$\lim_{k\to+\infty} \Big| \frac{a_{k+1}}{a_k} \Big| = \lim_{k\to+\infty} \sqrt[k]{|a_k|} = |q|.$$

Remark - Range of application of these criteria. Let us recall that d'Alembert's criterion and Cauchy's criterion do not give any information when the sequence $(a_k)$ has polynomial decay. For example if $a_k = \frac{1}{k^\alpha}$, $\alpha > 0$, $k \ge 1$, we obtain $\rho = 1$ for both criteria. Indeed,

$$\lim_{k\to+\infty} \Big| \frac{a_{k+1}}{a_k} \Big| = \lim_{k\to+\infty} \Big( \frac{k}{k+1} \Big)^\alpha = 1$$

and

$$\lim_{k\to+\infty} \sqrt[k]{\frac{1}{k^\alpha}} = 1.$$

In this case we must directly apply the comparison criterion. Also notice that

$$\Big| \frac{a_{k+1}}{a_k} \Big| = \Big( \frac{k}{k+1} \Big)^\alpha < 1 \quad \text{for all } k,$$

but there does not exist an $n_0$ and a $\rho < 1$ such that $\big| \frac{a_{k+1}}{a_k} \big| \le \rho$ for all $k \ge n_0$.

The fact that Cauchy's criterion gives the same value for $\rho$ as d'Alembert's criterion is not specific to this example.


Remark - Compare d'Alembert and Cauchy. If $\lim_{k\to+\infty} \big| \frac{a_{k+1}}{a_k} \big| = \rho$ then $\lim_{k\to+\infty} \sqrt[k]{|a_k|} = \rho$. If we find $\rho = 1$ by d'Alembert's criterion (hence d'Alembert does not give any information on the convergence or divergence of the series), we do not have to test Cauchy because it will give $\rho = 1$ as well. On the other hand, there are series for which the limit given by d'Alembert's criterion does not exist but Cauchy's criterion is applicable, as shown in the following example.

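Example. For $k \in \mathbb{N}$ let $a_k$ be defined by

$$a_k = \big( 3 - 2 \cdot (-1)^k \big) \Big( \frac{3}{4} \Big)^k, \quad \text{that is} \quad a_k = \begin{cases} \big( \frac{3}{4} \big)^k & \text{if } k \text{ is even,} \\ 5 \big( \frac{3}{4} \big)^k & \text{if } k \text{ is odd.} \end{cases}$$

By applying the comparison criterion with $b_k = 5 \big( \frac{3}{4} \big)^k$, we conclude that the series $\sum_{k=0}^{\infty} a_k$ converges absolutely. d'Alembert's criterion is not applicable because

$$\Big| \frac{a_{2m+1}}{a_{2m}} \Big| = \frac{15}{4} \quad \text{and} \quad \Big| \frac{a_{2m}}{a_{2m-1}} \Big| = \frac{3}{20},$$

hence $\big| \frac{a_{k+1}}{a_k} \big|$ does not converge. The generalization of d'Alembert's criterion is not applicable either. Cauchy's criterion gives us the convergence of $\sum_{k=0}^{\infty} a_k$ because

$$\sqrt[2m]{|a_{2m}|} = \frac{3}{4} \quad \text{and} \quad \sqrt[2m+1]{|a_{2m+1}|} = \frac{3}{4} \sqrt[2m+1]{5}$$

and consequently

$$\limsup_{k\to+\infty} \sqrt[k]{|a_k|} = \lim_{k\to+\infty} \frac{3}{4} \sqrt[k]{5} = \frac{3}{4} < 1.$$

We compute the series:

$$\sum_{k=0}^{\infty} a_k = \sum_{m=0}^{\infty} a_{2m} + \sum_{m=0}^{\infty} a_{2m+1} = \sum_{m=0}^{\infty} \Big( \frac{3^2}{4^2} \Big)^m + 5 \cdot \frac{3}{4} \sum_{m=0}^{\infty} \Big( \frac{3^2}{4^2} \Big)^m = \frac{1}{1 - (\frac{3}{4})^2} + 5 \cdot \frac{3}{4} \cdot \frac{1}{1 - (\frac{3}{4})^2} = \frac{76}{7}.$$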

3.5 Order of the terms in a series

In the last example of the previous section, we have changed the order of the terms in the series to compute the sum separately over the even and the odd indices. We have tacitly supposed that this does not change the result. This property is far from being obvious. We can prove that the series that are absolutely convergent are the only ones for which the order of the terms has no importance. We will distinguish two situations: the change of order of the $a_k$ given by a permutation and the change of order by summation over subsequences.


Theorem - Commutativity for absolutely convergent series. For an absolutely convergent series $\sum_{k=0}^{\infty} a_k$ the limit does not depend on the order of the terms $a_k$. In particular:

1. For any bijective mapping $\sigma$ from $\mathbb{N}$ to $\mathbb{N}$, we have

$$\sum_{k=0}^{\infty} a_{\sigma(k)} = \sum_{k=0}^{\infty} a_k.$$

2. For any pair of disjoint subsequences $(m_k)_{k\in\mathbb{N}}$, $(n_k)_{k\in\mathbb{N}}$ in $\mathbb{N}$ such that $\{m_k : k \in \mathbb{N}\} \cup \{n_k : k \in \mathbb{N}\} = \mathbb{N}$ we have

$$\sum_{k=0}^{\infty} a_k = \sum_{k=0}^{\infty} a_{m_k} + \sum_{k=0}^{\infty} a_{n_k}.$$

By choosing $m_k = 2k$, $n_k = 2k + 1$ we obtain in particular

$$\sum_{k=0}^{\infty} a_k = \sum_{k=0}^{\infty} a_{2k} + \sum_{k=0}^{\infty} a_{2k+1}.$$

We give an example to show that it is possible to reorder the terms of a convergent but not absolutely convergent series in such a way that the resulting series converges towards another limit.

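Example. The series

$$\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \frac{1}{9} - \frac{1}{10} + \dots$$

is convergent but not absolutely convergent. We show that the terms of this series cannot be reordered without changing its value. Let us first notice that

$$\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} < \frac{5}{6}.$$

We will reorder the terms in the following way:

$$\Big( 1 + \frac{1}{3} - \frac{1}{2} \Big) + \Big( \frac{1}{5} + \frac{1}{7} - \frac{1}{4} \Big) + \Big( \frac{1}{9} + \frac{1}{11} - \frac{1}{6} \Big) + \dots$$

Each parenthesis, written $p_m$, $m \ge 1$, contains three terms:

$$p_m = \frac{1}{4m-3} + \frac{1}{4m-1} - \frac{1}{2m} = \frac{8m - 3}{2m(4m-1)(4m-3)} > 0.$$

Hence

$$\sum_{m=1}^{\infty} p_m = \frac{5}{6} + \sum_{m=2}^{\infty} p_m > \frac{5}{6}$$

and consequently

$$\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} \neq \sum_{m=1}^{\infty} \Big( \frac{1}{4m-3} + \frac{1}{4m-1} - \frac{1}{2m} \Big).$$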
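The effect of the reordering can be observed numerically. In the Python sketch below (function names chosen for illustration), the original partial sums approach $\ln 2 \approx 0.693$, while the rearranged ones approach a larger value; it is a classical fact, not proved in these notes, that this particular rearrangement sums to $\frac{3}{2} \ln 2$.

```python
import math


def original_partial_sum(n):
    """Partial sum of 1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


def rearranged_partial_sum(m_max):
    """Sum of the first m_max blocks p_m = 1/(4m-3) + 1/(4m-1) - 1/(2m)."""
    return sum(
        1 / (4 * m - 3) + 1 / (4 * m - 1) - 1 / (2 * m)
        for m in range(1, m_max + 1)
    )


if __name__ == "__main__":
    print(original_partial_sum(10_000), math.log(2))
    print(rearranged_partial_sum(10_000), 1.5 * math.log(2))
```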


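Remark. For the series

$$\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k^2} \quad \text{and} \quad \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k(k+1)},$$

we can reorder the terms as above without changing the result because both series are absolutely convergent.

Product of two absolutely convergent series. Let $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ be two absolutely convergent series. For all $n \in \mathbb{N}$ we let

$$c_n = \sum_{k=0}^{n} a_k b_{n-k}.$$

Then the series $\sum_{n=0}^{\infty} c_n$ is absolutely convergent and

$$\sum_{n=0}^{\infty} c_n = \sum_{k=0}^{\infty} a_k \cdot \sum_{k=0}^{\infty} b_k.$$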
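As an illustration of the product formula, one can take $a_k = b_k = q^k$ with $|q| < 1$: then $c_n = (n+1) q^n$ and the product of the two sums is $1/(1-q)^2$. The Python sketch below works with truncated lists of terms; `cauchy_product_terms` is a hypothetical helper name.

```python
def cauchy_product_terms(a, b):
    """Return c_0, ..., c_{N-1} with c_n = sum_{k=0}^{n} a[k] * b[n-k].

    a and b are lists holding the first terms of the two series;
    N is the length of the shorter list.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


if __name__ == "__main__":
    q = 0.5
    geometric = [q**k for k in range(60)]
    c = cauchy_product_terms(geometric, geometric)
    print(c[:4])                     # (n + 1) * q^n for n = 0, 1, 2, 3
    print(sum(c), 1 / (1 - q) ** 2)  # both close to 4
```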

3.6 The exponential series

We define the exponential function by means of a series and we study its properties. The trigonometric functions sin, cos and the hyperbolic functions sinh, cosh are built from the exponential series. We also present a few elementary properties.

Theorem 3.6. - exponential series.

1. For all $x \in \mathbb{R}$, we define a number $\exp(x)$ by the absolutely convergent series

$$\exp(x) := \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

2. Euler's number $e$ defined by $e = \lim_{n\to+\infty} \big( 1 + \frac{1}{n} \big)^n$ satisfies $e = \exp(1)$, that is

$$e = \sum_{k=0}^{\infty} \frac{1}{k!}.$$

3. For all $x, y \in \mathbb{R}$, we have $\exp(x + y) = \exp(x) \exp(y)$, that is

$$\sum_{n=0}^{\infty} \frac{(x + y)^n}{n!} = \Big( \sum_{k=0}^{\infty} \frac{x^k}{k!} \Big) \cdot \Big( \sum_{k=0}^{\infty} \frac{y^k}{k!} \Big).$$

4. The exponential function $\exp : \mathbb{R} \to ]0, \infty[$ is strictly increasing and continuous over $\mathbb{R}$.

Proof. 1. Let us show that the series is absolutely convergent. We let $a_k = \frac{x^k}{k!}$. By d'Alembert's criterion

$$\lim_{k\to+\infty} \Big| \frac{a_{k+1}}{a_k} \Big| = \lim_{k\to+\infty} \Big| \frac{x}{k + 1} \Big| = 0.$$


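2. For all $n \ge 1$, by Newton's binomial formula we have

$$\Big( 1 + \frac{1}{n} \Big)^n = \sum_{k=0}^{n} \binom{n}{k} n^{-k} = \sum_{k=0}^{n} \frac{1}{k!} \frac{n(n-1) \cdots (n-k+1)}{n^k} = \sum_{k=0}^{n} \frac{1}{k!} \prod_{j=0}^{k-1} \Big( 1 - \frac{j}{n} \Big).$$

By using $1 - \frac{j}{n} \le 1$ we obtain the upper bound

$$\Big( 1 + \frac{1}{n} \Big)^n \le \sum_{k=0}^{n} \frac{1}{k!} =: S_n.$$

By Bernoulli's inequality, we get that for $1 \le k \le n$

$$\prod_{j=0}^{k-1} \Big( 1 - \frac{j}{n} \Big) \ge \Big( 1 - \frac{k-1}{n} \Big)^k \ge 1 - \frac{k(k-1)}{n}$$

and hence, for all $n \ge 1$,

$$\Big( 1 + \frac{1}{n} \Big)^n = \sum_{k=0}^{n} \frac{1}{k!} \prod_{j=0}^{k-1} \Big( 1 - \frac{j}{n} \Big) \ge \sum_{k=0}^{n} \frac{1}{k!} \Big( 1 - \frac{k(k-1)}{n} \Big) = \sum_{k=0}^{n} \frac{1}{k!} - \frac{1}{n} \sum_{k=2}^{n} \frac{1}{(k-2)!} = S_n - \frac{1}{n} \cdot S_{n-2}.$$

Since $(S_n)$ converges towards $\exp(1)$ and $\frac{1}{n} S_{n-2} \to 0$, we get $e = \exp(1)$ by the sandwich theorem.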
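The two-sided estimate $S_n - \frac{1}{n} S_{n-2} \le (1 + \frac{1}{n})^n \le S_n$ used in the proof can be observed numerically. A small floating-point sketch in Python, with function names chosen for illustration:

```python
def exp_partial_sum(n):
    """Return S_n = sum_{k=0}^{n} 1/k! (and 0 for n < 0, the empty sum)."""
    total, term = 0.0, 1.0
    for k in range(n + 1):
        if k > 0:
            term /= k  # term now equals 1/k!
        total += term
    return total


def compound(n):
    """Return (1 + 1/n)^n."""
    return (1 + 1 / n) ** n


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        lower = exp_partial_sum(n) - exp_partial_sum(n - 2) / n
        # Print the lower bound, the compound expression and the upper bound S_n.
        print(n, lower, compound(n), exp_partial_sum(n))
```

The gap between the two bounds is $\frac{1}{n} S_{n-2}$, which explains why $(1 + \frac{1}{n})^n$ approaches $e$ only at rate $1/n$, whereas $S_n$ converges much faster.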

3. By applying the result on the product of two absolutely convergent series, we find that

$$c_n = \sum_{k=0}^{n} \frac{x^k}{k!} \frac{y^{n-k}}{(n-k)!} = \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} = \frac{(x + y)^n}{n!}.$$

4. Let us first notice that $\exp(x) > 0$ for all $x \in \mathbb{R}$ and $\exp(x) > 1$ for all $x > 0$. Let $x_1 < x_2$. Then

$$\exp(x_2) - \exp(x_1) = \exp(x_1)(\exp(x_2 - x_1) - 1) > 0.$$

This identity also implies that it is sufficient to prove continuity of $\exp(x)$ at $x = 0$. Let $(x_n)_{n\in\mathbb{N}}$ be a sequence that converges towards 0. We can suppose that $|x_n| < 1$. By using

$$\exp(x_n) - 1 = x_n \sum_{k=1}^{\infty} \frac{1}{k} \frac{x_n^{k-1}}{(k-1)!}$$

we obtain the estimate

$$|\exp(x_n) - 1| \le |x_n| \sum_{k=1}^{\infty} \frac{|x_n|^{k-1}}{(k-1)!} \le |x_n| \, e,$$

hence $\lim_{n\to+\infty} \exp(x_n) = 1 = \exp(0)$.

Practical computing.

Logarithmic function. It follows from Theorem 3.6 that $\exp : \mathbb{R} \to ]0, \infty[$ is bijective: injectivity is a consequence of strict monotonicity. To prove surjectivity, notice first that $e^n = \exp(n)$ for all $n \in \mathbb{Z}$ (proof by induction) and

$$\lim_{n\to-\infty} \exp(n) = 0, \quad \lim_{n\to+\infty} \exp(n) = +\infty.$$

Consequently, for all $y > 0$ there exists $n \in \mathbb{Z}$ such that $\exp(n) \le y \le \exp(n + 1)$ and we conclude by the intermediate value theorem. Hence $\exp : \mathbb{R} \to ]0, \infty[$ has a strictly increasing and continuous inverse function.

This inverse function is called the natural logarithm or Napierian logarithm and is written $\ln(x)$:

$$\ln : \mathbb{R}_+ \setminus \{0\} \to \mathbb{R}, \quad \ln(\exp(x)) = x \text{ for all } x \in \mathbb{R}$$

and $\exp(\ln(y)) = y$ for all $y > 0$. The logarithm satisfies the property

$$\ln(st) = \ln(s) + \ln(t), \quad s, t > 0.$$

Moreover, $\ln 1 = 0$, $\ln e = 1$. We will see in Chapter 5 that the logarithm has a series expansion, but this series does not converge over all of $\mathbb{R}_+ \setminus \{0\}$.

Generalized exponential series. The logarithm allows us to write the exponential function in base $a > 0$, $a \neq 1$:

$$\exp_a(x) = \exp(x \ln a)$$

and its inverse function is given by $\ln x / \ln a$ for $x > 0$. By induction, we first get $a^n = \exp_a(n)$ for all $n \in \mathbb{Z}$ and then $a^r = \exp_a(r)$ for all $r = \frac{p}{q} \in \mathbb{Q}$. This is why we let:

$$a^x := \exp_a(x) \quad \text{for all } x \in \mathbb{R}, \ a > 0. \tag{3.4}$$

This notation is justified by the fact that $\exp_a(x)$ is the only continuous extension of $a^x : \mathbb{Q} \to ]0, \infty[$ thanks to the following result:

Proposition. Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function that satisfies the functional equation

$$f(x + y) = f(x) f(y) \quad \text{for all } x, y \in \mathbb{R}.$$

Then, either $f(x) = 0$ for all $x \in \mathbb{R}$, or $f(1) =: a > 0$ and $f(x) = a^x$ for all $x \in \mathbb{R}$.

Proof. We have $f(1) \ge 0$ because $f(1) = f(\frac{1}{2} + \frac{1}{2}) = f(\frac{1}{2})^2$. If $f(1) = 0$, then

$$f(x) = f(x - 1 + 1) = f(x - 1) f(1) = 0 \quad \text{for all } x \in \mathbb{R}.$$

If $f(1) > 0$ we let $a := f(1)$. We have $f(0) = 1$ and $f(-1) = a^{-1}$ because

$$a = f(1 + 0) = f(1) f(0) = a f(0) \quad \text{and} \quad 1 = f(0) = f(1) f(-1) = a f(-1).$$

By induction, we have $f(n) = a^n = \exp_a(n)$ for any integer $n$ and then $f(\frac{p}{q}) = a^{p/q} = \exp_a(\frac{p}{q})$ for all $p, q \in \mathbb{Z}$, $q \neq 0$, by using $a^p = f(p) = f(q \cdot \frac{p}{q}) = f(\frac{p}{q})^q$. Hence $f(x) = a^x = \exp_a(x)$ for all $x \in \mathbb{Q}$. For all $x \in \mathbb{R}$ there exists a sequence of elements $x_n \in \mathbb{Q}$ such that $\lim_{n\to+\infty} x_n = x$ (because $\mathbb{Q}$ is dense in $\mathbb{R}$). From the continuity of $f$ and of $a^x = \exp_a(x)$, we have

$$f(x) = \lim_{n\to+\infty} f(x_n) = \lim_{n\to+\infty} a^{x_n} = \lim_{n\to+\infty} \exp_a(x_n) = \exp_a(x) = a^x.$$

Definition - exponential of a complex number. We can extend the definition of the exponential to complex numbers. For $z \in \mathbb{C}$ let

$$e^z = \exp(z) := \sum_{k=0}^{\infty} \frac{z^k}{k!} \tag{3.5}$$

This series is absolutely convergent (the modulus of a complex number replaces the absolute value). Notice that $\exp(z)$ is continuous over $\mathbb{C}$ and also satisfies the property $\exp(z_1 + z_2) = \exp(z_1) \exp(z_2)$ for all $z_1, z_2 \in \mathbb{C}$. In particular, for all $z = x + iy$, with $x, y \in \mathbb{R}$:

$$e^z = e^{x + iy} = e^x e^{iy}. \tag{3.6}$$

Exponential of an imaginary number. Let $\theta \in \mathbb{R}$. We have

$$\exp(i\theta) = \sum_{k=0}^{\infty} \frac{(i\theta)^k}{k!} = \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k+1}}{(2k+1)!}$$

because $i^{2k} = (-1)^k$ and $i^{2k+1} = i(-1)^k$. From this we deduce the series for sine and cosine.

Series for sine and cosine. The series

$$\cos(\theta) = \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k}}{(2k)!}, \quad \sin(\theta) = \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k+1}}{(2k+1)!}$$

are absolutely convergent for all $\theta \in \mathbb{R}$. The methods from differential calculus allow us to prove that these series indeed correspond to the functions sin and cos defined over the unit circle (see Chapter 5):

$$\sin : \mathbb{R} \to [-1, 1], \quad \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \tag{3.7}$$

$$\cos : \mathbb{R} \to [-1, 1], \quad \cos x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!} \tag{3.8}$$

For $z \in \mathbb{C}$, we define $\sin z$ and $\cos z$ by these series.
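The series (3.7) and (3.8) can be evaluated by accumulating partial sums. The following Python sketch is illustrative only: the function names and the default of 30 terms are choices made here, and the truncated sums are accurate only for moderate $|x|$.

```python
import math


def sin_series(x, terms=30):
    """Partial sum of sum_{k>=0} (-1)^k x^(2k+1) / (2k+1)! with the given number of terms."""
    total, term = 0.0, x  # term holds (-1)^k x^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def cos_series(x, terms=30):
    """Partial sum of sum_{k>=0} (-1)^k x^(2k) / (2k)! with the given number of terms."""
    total, term = 0.0, 1.0  # term holds (-1)^k x^(2k) / (2k)!
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


if __name__ == "__main__":
    for x in (0.0, 1.0, math.pi / 6, 3.0):
        print(x, sin_series(x), math.sin(x), cos_series(x), math.cos(x))
```

Each term is obtained from the previous one by a multiplication, which avoids computing large factorials and powers separately.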

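Trigonometric functions. The functions sin and cos over the real numbers are periodic functions with period $T = 2\pi$, that is

$$\sin(x + 2\pi) = \sin x, \qquad \cos(x + 2\pi) = \cos x$$

for all $x \in \mathbb{R}$. Obviously they are also periodic with period $T = 2k\pi$ where $k \in \mathbb{Z}^+$. For all $x, y \in \mathbb{R}$, sin and cos satisfy

$$\sin^2 x + \cos^2 x = 1 \quad \text{(Pythagorean theorem)}$$

$$\sin(x + y) = \sin x \cos y + \cos x \sin y$$

$$\cos(x + y) = \cos x \cos y - \sin x \sin y$$

and

$$\sin x \pm \sin y = 2 \sin\left(\frac{x \pm y}{2}\right) \cos\left(\frac{x \mp y}{2}\right)$$

$$\cos x + \cos y = 2 \cos\left(\frac{x + y}{2}\right) \cos\left(\frac{x - y}{2}\right)$$

$$\cos x - \cos y = -2 \sin\left(\frac{x + y}{2}\right) \sin\left(\frac{x - y}{2}\right)$$

A few special values:

$$\sin 0 = 0, \quad \sin\tfrac{\pi}{6} = \tfrac{1}{2}, \quad \sin\tfrac{\pi}{4} = \tfrac{1}{2}\sqrt{2}, \quad \sin\tfrac{\pi}{3} = \tfrac{1}{2}\sqrt{3}, \quad \sin\tfrac{\pi}{2} = 1$$

$$\cos 0 = 1, \quad \cos\tfrac{\pi}{6} = \tfrac{1}{2}\sqrt{3}, \quad \cos\tfrac{\pi}{4} = \tfrac{1}{2}\sqrt{2}, \quad \cos\tfrac{\pi}{3} = \tfrac{1}{2}, \quad \cos\tfrac{\pi}{2} = 0$$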

The functions sin and cos are invertible over the domain $D = [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]$, respectively $D = [0, \pi]$. The corresponding inverse functions are denoted as arcsin and arccos. We define the trigonometric functions tan and cot by

$$\tan : \mathbb{R} \setminus \{(k + \tfrac{1}{2})\pi,\ k \in \mathbb{Z}\} \to \mathbb{R}, \quad \tan x = \frac{\sin x}{\cos x}$$

$$\cot : \mathbb{R} \setminus \{k\pi,\ k \in \mathbb{Z}\} \to \mathbb{R}, \quad \cot x = \frac{\cos x}{\sin x}$$

The functions tan and cot are invertible over the domain $D = ]-\tfrac{\pi}{2}, \tfrac{\pi}{2}[$, respectively $D = ]0, \pi[$. The corresponding inverse functions are denoted as arctan and arccot.


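Hyperbolic functions. We define the hyperbolic functions by

$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \sinh : \mathbb{R} \to \mathbb{R}$$

$$\cosh x = \frac{e^x + e^{-x}}{2}, \qquad \cosh : \mathbb{R} \to [1, \infty[$$

$$\tanh x = \frac{\sinh x}{\cosh x}, \qquad \tanh : \mathbb{R} \to ]-1, 1[$$

$$\coth x = \frac{\cosh x}{\sinh x}, \qquad \coth : \mathbb{R} \setminus \{0\} \to ]-\infty, -1[ \,\cup\, ]1, \infty[$$

hence the series for sinh and cosh:

$$\sinh x = \sum_{k=0}^{\infty} \frac{x^{2k+1}}{(2k+1)!}, \qquad \cosh x = \sum_{k=0}^{\infty} \frac{x^{2k}}{(2k)!} \qquad (3.9)$$

For $z \in \mathbb{C}$, we define $\sinh z$ and $\cosh z$ by these series. The inverse functions are written

$$\operatorname{arcsinh} x = \ln\left(x + \sqrt{x^2 + 1}\right), \qquad \operatorname{arcsinh} : \mathbb{R} \to \mathbb{R}$$

$$\operatorname{arccosh} x = \ln\left(x + \sqrt{x^2 - 1}\right), \qquad \operatorname{arccosh} : [1, \infty[ \to [0, \infty[$$

$$\operatorname{arctanh} x = \frac{1}{2} \ln\frac{1 + x}{1 - x}, \qquad \operatorname{arctanh} : ]-1, 1[ \to \mathbb{R}$$

$$\operatorname{arccoth} x = \operatorname{arctanh}\frac{1}{x}, \qquad \operatorname{arccoth} : ]-\infty, -1[ \,\cup\, ]1, \infty[ \to \mathbb{R} \setminus \{0\}$$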
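The logarithmic formulas for the inverse hyperbolic functions can be sanity-checked by composing them with `math.sinh`, `math.cosh` and `math.tanh`. The sketch below is illustrative only; the Python function names mirror the notation of the text and are not library functions.

```python
import math

def arcsinh(x: float) -> float:
    return math.log(x + math.sqrt(x * x + 1))      # defined for all real x

def arccosh(x: float) -> float:
    return math.log(x + math.sqrt(x * x - 1))      # requires x >= 1

def arctanh(x: float) -> float:
    return 0.5 * math.log((1 + x) / (1 - x))       # requires -1 < x < 1

def arccoth(x: float) -> float:
    return arctanh(1 / x)                          # requires |x| > 1

# each composition should return the starting value up to rounding
print(arcsinh(math.sinh(0.7)), arccosh(math.cosh(1.3)))
print(arctanh(math.tanh(-0.4)), arccoth(1 / math.tanh(2.0)))
```

Note that `arccosh` recovers only nonnegative arguments of cosh, in agreement with the stated range $[0, \infty[$.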

Chapter 4

Real Functions and Limit Process

We extend the concept of limit to functions in order to study local and asymptotic behavior. A more geometric interpretation of the limit allows us to introduce uniformly continuous functions and uniform convergence of sequences of continuous functions.

Notions to learn. Limit of a function, punctured limit of a function, left-sided limit, right-sided limit, infinite limit, continuous function and uniformly continuous function.

Skills to acquire. Know how to compute the limit of a function and apply the techniques from Chapter 2 to functions, know how to check continuity and uniform continuity of a function, understand the properties relative to uniformly continuous functions, know how to give the asymptotic behavior of a function.

4.1 Limit of a function

4.1.1 Limit of a function - the definitions

Introduction. The concept of limit of a function is fundamental in Analysis. Unfortunately, it is often a source of confusion for learners since there are two different definitions in the literature. The first one analyzes the function exclusively without considering the point at which we want to define the limit, and is historically based on the works of Weierstrass and Cauchy. The other one, more modern, allows us to easily extend the concept of limit that we have already used to define continuous functions and continuous extensions. We have adopted this idea to define the limit of a function. We will compare this definition to the former one, also called the punctured limit of a function. We recall that in order to define the limit of a continuous function $f : D_f \to \mathbb{R}$ at a point $a$ of its domain $D_f$, we have used the sequences of elements in $D_f$ that converge towards $a$ (see section 2.6 for more details). From this definition, we obtain the following formulation: a function $f$ is continuous at $a$ if the limit of $f(x)$ when $x$ tends to $a$ is equal to $f(a)$. Hence the limit process and the function $f$ are interchangeable. When we extended this case to any kind of function, we observed that for $a \in D_f$ the limit $l := \lim_{x\to a} f(x)$ can only exist if the function is continuous at this point, since taking the constant sequence yields $l = f(a)$. We then defined the limit of a function at points $a \notin D_f$ in the closure of $D_f$. For these points, by proposition 2.4.2 there exists a sequence of elements in $D_f$ that converges towards $a$. This construction brought us to continuous extensions. In this chapter we will introduce a formalism (said to be topological) by using the open intervals $B_r(a) := ]a - r, a + r[ = \{x : |x - a| < r\}$ around $a$ (also called open balls in view of their generalization in $\mathbb{R}^n$) and their image under $f$. Using this method we will obtain an equivalent definition, the "epsilon-delta" definition. If by these definitions the function has a limit at a closure point $a \notin D_f$ of $D_f$, this limit corresponds to the continuous extension of $f$ at the point $a$.

Limit of a function - Definition 1. Let $a \in \mathbb{R}$ be a closure point of $D_f$. We say that $f$ has a limit $l \in \mathbb{R}$ when $x$ tends to $a$ if for any sequence of elements $x_n \in D_f$ such that $\lim_{n\to+\infty} x_n = a$, the sequence $(f(x_n))_{n\in\mathbb{N}}$ converges towards $l$, i.e. $\lim_{n\to+\infty} f(x_n) = l$. We then write

$$\lim_{x\to a} f(x) = l.$$

Limit of a function - Definition 2. Let $a \in \mathbb{R}$ be a closure point of $D_f$. We say that $f$ has a limit $l \in \mathbb{R}$ when $x$ tends to $a$ if to each $\varepsilon > 0$, we can associate $\delta > 0$ such that $|f(x) - l| < \varepsilon$ if $x \in D_f$ and $|x - a| < \delta$, i.e. $x \in B_\delta(a) \cap D_f$. We then write

$$\lim_{x\to a} f(x) = l.$$

Proposition 4.1.1. Definitions 1 and 2 are equivalent.

Proof. 1. 2 ⇒ 1. Let $(x_n)_{n\in\mathbb{N}}$ be a sequence in $D_f$ that converges towards $a$. We will prove that $\lim_{n\to+\infty} f(x_n) = l$, that is, for all $\varepsilon > 0$ there exists a natural number $N$ such that $|f(x_n) - l| < \varepsilon$ for all $n \geq N$. By definition 2, there exists a $\delta > 0$ such that $x \in D_f$ and $|x - a| < \delta$ imply $|f(x) - l| < \varepsilon$. For this $\delta$, there exists an $N$ such that $n \geq N$ implies $|x_n - a| < \delta$. Hence $|f(x_n) - l| < \varepsilon$ for all $n \geq N$.

2. 1 ⇒ 2. Let $\lim_{n\to+\infty} f(x_n) = l$ for any sequence $(x_n)_{n\in\mathbb{N}}$ in $D_f$ that converges towards $a$. If $l$ is not the limit according to definition 2, there exists an $\varepsilon > 0$ such that for all $\delta > 0$ there exists a $y_\delta \in D_f$ such that $|y_\delta - a| < \delta$ and $|f(y_\delta) - l| \geq \varepsilon$. Taking $\delta = \frac{1}{n}$, we build a sequence of elements $(y_n)$ with this property. The sequence $(x_1, y_1, x_2, y_2, \ldots)$ thus converges towards $a$ but the sequence $(f(x_1), f(y_1), f(x_2), f(y_2), \ldots)$ does not converge, which contradicts definition 1.

Remark on the application of the definitions. In practice, we often use Definition 1 to show that $\lim_{x\to a} f(x)$ does not exist and Definition 2 to prove existence of the limit value of a function.

Remark on notation. The notation $\lim_{x\to a} f(x)$ does not take into account the fact that the limit depends on the domain $D_f$, since we can only consider the elements $x \in B_\delta(a) \cap D_f$. We can also write $\lim_{h\to 0} f(a + h) = l$ by considering $a + h \in D_f$.

Continuous function at a. If for an $a \in D_f$ the limit $\lim_{x\to a} f(x) = l$ exists, it follows that $f$ is continuous at $a$ since we necessarily have $l = f(a)$ (take the constant sequence $x_n = a$). According to Definition 2 a function $f$ is continuous at $a \in D_f$ if to all $\varepsilon > 0$, we can associate $\delta > 0$ such that $|f(x) - f(a)| < \varepsilon$ if $x \in D_f$ and $|x - a| < \delta$. We then write

$$\lim_{x\to a} f(x) = f(a).$$

As proved in proposition 4.1.1, this identity is equivalent to the commutativity of $f$ with the limit process, that is, for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ such that $\lim_{n\to+\infty} x_n = a$, we have

$$\lim_{n\to+\infty} f(x_n) = f(a) = f\left(\lim_{n\to+\infty} x_n\right).$$

4.1.2 Punctured limit of a function - the definitions

Discussion. By applying the definition of the limit of a function to "exotic" examples we obtain some rather surprising results. Let $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ be given by $f(x) = 1 - x^2$. The function $f$ is not defined at $a = 0$, which is a closure point of the domain of $f$, and we have

$$\lim_{x\to 0} f(x) = 1.$$

On the other hand, if we consider $g : \mathbb{R} \to \mathbb{R}$ given by

$$g(x) = \begin{cases} 1 - x^2 & \text{if } x \neq 0; \\ 0.5 & \text{if } x = 0, \end{cases}$$

then $\lim_{x\to 0} g(x)$ does not exist. For example, we can take the constant sequence $x_n = 0$ and the sequence $y_n = \frac{1}{n+1}$ to see that

$$\lim_{n\to\infty} g(x_n) = 0.5 \neq 1 = \lim_{n\to\infty} g(y_n)$$

or else the sequence $z_n = \frac{1 + (-1)^n}{n+1}$ to observe that $\lim_{n\to\infty} g(z_n)$ does not exist. The function $g$ is not continuous at 0.
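The behavior of $g$ along the three sequences of this discussion can be tabulated directly. The following is a small sketch; the helper names `x_seq`, `y_seq` and `z_seq` are introduced here only for illustration.

```python
def g(x: float) -> float:
    """The function g of the discussion: 1 - x^2 away from 0, and 0.5 at 0."""
    return 1 - x * x if x != 0 else 0.5

def x_seq(n: int) -> float:
    return 0.0                          # constant sequence x_n = 0

def y_seq(n: int) -> float:
    return 1 / (n + 1)                  # y_n = 1/(n+1), never equal to 0

def z_seq(n: int) -> float:
    return (1 + (-1) ** n) / (n + 1)    # 0 for odd n, 2/(n+1) for even n

for n in (10, 11, 1000, 1001):
    print(n, g(x_seq(n)), g(y_seq(n)), g(z_seq(n)))
```

Along `x_seq` the values stay at 0.5, along `y_seq` they approach 1, and along `z_seq` they alternate between 0.5 and values close to 1, so no common limit exists.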


For the function on the left the limit exists, for the function on the right the limit does not exist.

The definition of the limit allows us to detect the points of discontinuity of a function over its domain, but it does not give the value of the continuous extension at these points (unlike at the closure points that are not in the domain).

Introduce the punctured limit. This is why we can define the limit of a function at a point $a$ belonging to the closure of the domain by considering only the $x \neq a$, even if $a$ belongs to the domain. For the sequences that converge towards $a$, the elements will always be different from $a$: $x_n \neq a$. This is no longer possible if $a$ is an isolated point of the domain $D_f$, since there exists no sequence of elements $x_n \neq a$ that converges towards $a$. This is why we must refer to the notion of limit point of the domain $D_f$ for which, by definition, there exists such a sequence. We recall that the notion of limit point coincides with that of closure point if $a \notin D_f$. The difference between the set of closure points and the set of limit points is the set of isolated points of the domain. The limit built in this way is called the punctured limit.¹ For a limit point $a$ in the domain $D_f$, the punctured limit of $f$ is the limit of the restriction of $f$ to $D_f \setminus \{a\}$. We also notice that several authors do not use the notion of limit point of a set and replace it by a stronger hypothesis called "function defined over a neighborhood of $a$", which means that there exists $r > 0$ such that $]a - r, a + r[ \subset D_f \cup \{a\}$.² This condition is too restrictive and does not allow us, for example, to build the continuous extensions of functions defined over the rationals. We underline again the fact that we obtain the same conclusions for continuous functions (the interchangeability of the limit process and of $f$) with limit points belonging to the domain of $f$.

¹ We point out that authors who use this definition do not add the adjective "punctured". A good reference that uses this construction to define the limit of a function is: Srishti D. Chatterjee, Cours d’Analyse, Tome 1, PPUR 1997, chapters 1.4.6.

² For example, Jacques Douchet and Bruno Zwahlen, Calcul differentiel et integral, PPUR, 2011 or Jacques Douchet, Analyse (Volume 1), PPUR 2012.


The punctured limit exists in both cases and yields the same value.

If $f(x) = 1 - x^2$ for all real $x$ (that is, $f$ is continuous at $x = 0$), we obviously have $\lim_{x\to 0} f(x) = 1$, as for the punctured limit.

For a continuous function the limit and the punctured limit coincide. This is also true for closure points that do not belong to the domain.

Punctured limit of a function - Definition 1. Let $a \in \mathbb{R}$ be a limit point of $D_f$. We say that $f$ has a punctured limit $l \in \mathbb{R}$ when $x$ tends to $a$ if for any sequence of elements $x_n \in D_f \setminus \{a\}$ such that $\lim_{n\to+\infty} x_n = a$, the sequence $(f(x_n))_{n\in\mathbb{N}}$ converges towards $l$, i.e. $\lim_{n\to+\infty} f(x_n) = l$. We then write

$$\lim_{x\to a,\ x\neq a} f(x) = l.$$

Punctured limit of a function - Definition 2. Let $a$ be a limit point of $D_f$. We say that $f$ has a punctured limit $l \in \mathbb{R}$ when $x$ tends to $a$ if to every $\varepsilon > 0$, we can associate $\delta > 0$ such that $|f(x) - l| < \varepsilon$ if $x \in D_f$ and $0 < |x - a| < \delta$, i.e. $x \in B_\delta(a) \cap D_f \setminus \{a\}$. We then write

$$\lim_{x\to a,\ x\neq a} f(x) = l.$$


Particular cases - right-sided limit and left-sided limit. We can use the total order on $\mathbb{R}$ to distinguish the two alternatives $x < a$ and $x > a$ in the definition of the punctured limit. This brings us to the concept of unilateral limits, more precisely to the notions of right-sided limit and left-sided limit. We give the sequential definition (definition 1):

Right-sided limit. Let $a \in \mathbb{R}$ be a limit point of $D_f$ such that there exists a sequence of elements $a_n \in D_f$, $a_n > a$, with $\lim_{n\to+\infty} a_n = a$. We say that $f$ has a right-sided limit $l \in \mathbb{R}$ when $x$ tends to $a$ if for any sequence of elements $x_n \in D_f$ such that $x_n > a$ and $\lim_{n\to+\infty} x_n = a$, the sequence $f(x_n)$ converges towards $l$, i.e. $\lim_{n\to+\infty} f(x_n) = l$. We then write

$$\lim_{x\to a+} f(x) := \lim_{x\to a,\ x>a} f(x) = l.$$

In the scope of punctured limits a function $f$ is said to be right-continuous at $a \in D_f$ if $\lim_{x\to a+} f(x) = f(a)$.

Left-sided limit. Let $a \in \mathbb{R}$ be a limit point of $D_f$ such that there exists a sequence of elements $a_n \in D_f$, $a_n < a$, with $\lim_{n\to+\infty} a_n = a$. We say that $f$ has a left-sided limit $l \in \mathbb{R}$ when $x$ tends to $a$ if for any sequence of elements $x_n \in D_f$ such that $x_n < a$ and $\lim_{n\to+\infty} x_n = a$, the sequence $f(x_n)$ converges towards $l$, i.e. $\lim_{n\to+\infty} f(x_n) = l$. We then write

$$\lim_{x\to a-} f(x) := \lim_{x\to a,\ x<a} f(x) = l.$$

$f$ is said to be left-continuous at $a \in D_f$ if $\lim_{x\to a-} f(x) = f(a)$.

Unilateral limits and punctured limits - equivalence condition. We have $\lim_{x\to a,\ x\neq a} f(x) = l$ if and only if

$$\lim_{x\to a-} f(x) = \lim_{x\to a+} f(x) = l.$$

4.1.3 Infinite limits and limits at infinity

Introduction. Let $f : D_f \to \mathbb{R}$. Let $a \notin D_f$ be in the closure of $D_f$. First we describe the behavior of strongly divergent functions at the point $a$. If $D_f$ contains a sequence of elements that diverges strongly towards $+\infty$ (or towards $-\infty$), we say, with a slight abuse of terminology, that $+\infty$ (or $-\infty$) is in the closure of $D_f$ or is a closure point of $D_f$. In this case, notably if $D_f = \mathbb{R}$, we can describe the asymptotic behavior of $f$ when $x$ tends to infinity. Under these hypotheses, we give the definitions in the terminology used in Definition 1 to define the limit of a function.

Infinite limits. We write $\lim_{x\to a} f(x) = \pm\infty$ if for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ that converges towards $a$, we have $\lim_{n\to+\infty} f(x_n) = \pm\infty$.

Limits at infinity. We write $\lim_{x\to\pm\infty} f(x) = l$ if for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ that diverges strongly towards $\pm\infty$, we have $\lim_{n\to+\infty} f(x_n) = l$.

Infinite limits at infinity. We write $\lim_{x\to+\infty} f(x) = \pm\infty$ if for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ that diverges strongly towards $+\infty$, we have $\lim_{n\to+\infty} f(x_n) = \pm\infty$. We write $\lim_{x\to-\infty} f(x) = \pm\infty$ if for any sequence $(x_n)_{n\in\mathbb{N}}$ of elements in $D_f$ that diverges strongly towards $-\infty$, we have $\lim_{n\to+\infty} f(x_n) = \pm\infty$.

4.1.4 Properties of limit values

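The computation rules for the limit values of sequences also apply to limits of functions. For infinite limits we recall the computation rules given by theorem 2.12:

$$\infty + \infty = \infty, \quad \infty \cdot \infty = \infty, \quad 0/\infty = 0$$

$$c + \infty = \infty, \quad c/\infty = 0, \quad \text{for all } c \in \mathbb{R}$$

$$c \cdot \infty = \infty, \quad \text{for all } c > 0$$

By applying these rules if the limits are infinite we get the following result:

Theorem 4.1. Let $f : D_f \to \mathbb{R}$, $g : D_g \to \mathbb{R}$ and $a \in \mathbb{R} \cup \{-\infty, \infty\}$ in the closure of $D_f \cap D_g$. Let us suppose that

$$\lim_{x\to a} f(x) = l_1 \in \mathbb{R} \cup \{-\infty, \infty\} \quad \text{and} \quad \lim_{x\to a} g(x) = l_2 \in \mathbb{R} \cup \{-\infty, \infty\}.$$

Then for all $\alpha, \beta \in \mathbb{R}$, whenever the right-hand side is defined by the rules above, we have

$$\lim_{x\to a} (\alpha f(x) + \beta g(x)) = \alpha l_1 + \beta l_2. \qquad (4.1)$$

$$\lim_{x\to a} f(x)g(x) = l_1 l_2. \qquad (4.2)$$

$$\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{l_1}{l_2} \quad \text{if } l_2 \neq 0. \qquad (4.3)$$

$$\lim_{x\to a} |f(x)| = |l_1| \ \left(= \left|\lim_{x\to a} f(x)\right|\right). \qquad (4.4)$$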

Theorem 4.2. Let $f : D_f \to T$ and $g : A \to B$ be two real functions such that $f[D_f] \subset A$ and

$$\lim_{x\to a} f(x) = b \quad \text{and} \quad \lim_{y\to b} g(y) = l.$$

Then $\lim_{x\to a} g(f(x)) = l$.

Bounded function. A function $f : D_f \to T$ is said to be bounded if there exists a constant $C > 0$ such that $|f(x)| \leq C$ for all $x \in D_f$.

A convergence criterion. Let $f$ be a bounded function and $\lim_{x\to a} g(x) = 0$ and $a \in D_f \cap D_g$. Then $\lim_{x\to a} f(x)g(x) = 0$.

Proposition 4.1.2. Let $\lim_{x\to a} f(x) = l_1$ and $\lim_{x\to a} g(x) = l_2$ and let us suppose that there exists an $r > 0$ such that $f(x) \leq g(x)$ in $B_r(a) \cap D_f \cap D_g \setminus \{a\}$. Then $l_1 \leq l_2$.

Proof. This is a direct consequence of proposition 2.3.1.

As for sequences, this proposition implies a sandwich theorem.

Proposition 4.1.3. - sandwich theorem. Let $f, g, h : D \to T$ be three real functions such that $\lim_{x\to a} g(x) = l$ and $\lim_{x\to a} h(x) = l$ and let us suppose that there exists an $r > 0$ such that $g(x) \leq f(x) \leq h(x)$ in $B_r(a) \cap D_f \cap D_g \cap D_h \setminus \{a\}$. Then $\lim_{x\to a} f(x) = l$.

4.1.5 Examples

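1. Compute

$$\lim_{x\to 0} \frac{\sinh x}{x}.$$

For $x > 0$ we have the following inequalities: $x < \sinh x < x \cosh x$, that is $1 < \frac{\sinh x}{x} < \cosh x$. Moreover $\frac{\sinh(-x)}{-x} = \frac{\sinh x}{x}$. Then from the continuity of cosh, $\cosh(0) = 1$ and the sandwich theorem:

$$\lim_{x\to 0} \frac{\sinh x}{x} = 1.$$

We now study the function $f : \mathbb{R} \setminus \{0\} \to \mathbb{R}$ defined by $f(x) = \frac{\sinh x}{x}$. The function $f(x)$ is continuous over $\mathbb{R} \setminus \{0\}$. Thanks to the previous results, it is possible to define an extension of $f(x) = \frac{\sinh x}{x}$ that is continuous at all $x \in \mathbb{R}$:

$$f(x) = \begin{cases} \frac{\sinh x}{x} & \text{if } x \neq 0, \\ 1 & \text{if } x = 0. \end{cases}$$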

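2. Show that

$$\lim_{x\to 0} \sin\frac{1}{x}$$

does not exist. To prove this we consider the sequence $(x_n)$ defined by $x_n = \frac{1}{\pi(n + \frac{1}{2})}$ for $n \geq 0$. The sequence $(x_n)$ converges towards 0 and

$$\sin\frac{1}{x_n} = \sin \pi\left(n + \frac{1}{2}\right) = (-1)^n$$

does not converge.

3. Let $f : \mathbb{R} \to \mathbb{R}$ be a polynomial function defined by

$$f(x) = \sum_{k=0}^{n} a_k x^k, \quad a_k \in \mathbb{R}, \quad a_n > 0.$$

We have

$$\lim_{x\to\pm\infty} f(x) = \pm\infty \quad \text{if } n \text{ is odd}$$

and

$$\lim_{x\to\pm\infty} f(x) = +\infty \quad \text{if } n \text{ is even.}$$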

4. For all $n \in \mathbb{N}$ and $x > 0$, we have $e^x \geq x^{n+1}/(n+1)!$. Then

$$\lim_{x\to\infty} \frac{e^x}{x^n} = \infty,$$

that is, exponential growth at infinity is faster than the growth of a polynomial function.

5.

$$\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x = \lim_{y\to 0+} (1 + y)^{\frac{1}{y}} = e$$

We prove it as follows: let $x > 0$. We let $n = [x]$. The inequality $n \leq x < n + 1$ implies

$$1 + \frac{1}{n+1} \leq 1 + \frac{1}{x} \leq 1 + \frac{1}{n}.$$

The function $a^x$ is increasing if $a > 1$ and satisfies $a^0 = 1$. Then

$$\left(1 + \frac{1}{n+1}\right)^n \leq \left(1 + \frac{1}{x}\right)^x \leq \left(1 + \frac{1}{n}\right)^{n+1}$$

and the sandwich theorem gives the desired result.

6. Moreover, for all $a \in \mathbb{R}$

$$\lim_{x\to\infty} \left(1 + \frac{a}{x}\right)^x = \lim_{y\to 0} (1 + y)^{\frac{a}{y}} = e^a$$

thanks to the continuity of $x^p$. By using the fact that $\ln x$ is continuous for $x > 0$, we also obtain

$$\lim_{y\to 0+} \frac{\ln(1 + y)}{y} = 1.$$

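7.

$$\lim_{x\to\infty} \tanh x = 1$$

because

$$\lim_{x\to\infty} \tanh x = \lim_{x\to\infty} \frac{e^x - e^{-x}}{e^x + e^{-x}} = \lim_{x\to\infty} \frac{1 - e^{-2x}}{1 + e^{-2x}} = \frac{1 - 0}{1 + 0} = 1.$$

8.

$$\lim_{x\to\infty} x^{\frac{1}{x}} = 1$$

Proof. Let $x \geq 1$. We take $n = [x]$, then $n \leq x < n + 1$ and $\frac{1}{n+1} < \frac{1}{x} \leq \frac{1}{n}$. Consequently

$$n^{\frac{1}{n+1}} < x^{\frac{1}{x}} < (n + 1)^{\frac{1}{n}}$$

and the result is a consequence of $\lim_{n\to\infty} n^{\frac{1}{n}} = 1$.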

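9.

$$\lim_{x\to\infty} \frac{\ln x}{x} = 0 \quad \text{and} \quad \lim_{x\to\infty} \frac{\ln x}{x^p} = 0 \quad \text{for all } p > 0.$$

Proof. By the previous example

$$\ln\left(\lim_{x\to\infty} x^{\frac{1}{x}}\right) = \ln 1.$$

The function ln is continuous and satisfies $\ln a^b = b \ln a$ and $\ln 1 = 0$. Hence

$$0 = \ln 1 = \ln\left(\lim_{x\to\infty} x^{\frac{1}{x}}\right) = \lim_{x\to\infty} \ln\left(x^{\frac{1}{x}}\right) = \lim_{x\to\infty} \frac{\ln x}{x}.$$

By taking $x = y^p$, we have that $\frac{\ln x}{x} = \frac{p \ln y}{y^p}$ tends to 0 when $y$ tends to infinity.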
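Several of the limits in these examples can be observed numerically by evaluating the expressions at small or large arguments. This is only a sketch for building intuition, not a proof; the function names below are illustrative.

```python
import math

def sinh_over_x(x: float) -> float:
    return math.sinh(x) / x                 # example 1, x != 0

def compound(x: float, a: float = 1.0) -> float:
    return (1 + a / x) ** x                 # examples 5 and 6

def root(x: float) -> float:
    return x ** (1 / x)                     # example 8

def log_ratio(x: float, p: float = 1.0) -> float:
    return math.log(x) / x ** p             # example 9

for x in (1e-1, 1e-2, 1e-4):
    print("sinh(x)/x at", x, "=", sinh_over_x(x))
for x in (1e2, 1e4, 1e6):
    print(x, compound(x), compound(x, 2.0), root(x), log_ratio(x, 0.5))
print("e =", math.e, " e^2 =", math.exp(2.0))
```

The convergence of $(1 + a/x)^x$ is slow (the error behaves roughly like a constant times $1/x$), which is visible in the printed values.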

4.1.6 Asymptotic behavior and asymptotes

Horizontal asymptote. Let $f : \mathbb{R} \to \mathbb{R}$ be a function such that $\lim_{x\to+\infty} f(x) = L$. The line with equation $y = L$ is a horizontal asymptote of the function $f$ at $+\infty$.

Asymptotic behavior. Let $f : \mathbb{R} \to \mathbb{R}$ be a function such that $\lim_{x\to+\infty} f(x) = -\infty$ or $\lim_{x\to+\infty} f(x) = +\infty$. We would like to describe more precisely the asymptotic behavior of $f$ by means of a simpler function $h$. If $h : \mathbb{R} \to \mathbb{R}$ is such that

$$\lim_{x\to+\infty} \left(f(x) - h(x)\right) = 0,$$

then $f$ and $h$ have the same asymptotic behavior.

Oblique asymptote. If $h$ is a linear function, i.e. $h(x) = ax + b$, then the line defined by $h$ is an oblique asymptote of the function $f$ at $+\infty$. In particular,

$$a = \lim_{x\to+\infty} \frac{f(x)}{x}.$$

Similarly, we describe the asymptotic behavior of a function $f$ at $-\infty$. A more systematic way to study asymptotic behavior by means of a Taylor expansion will be given in the next chapter.

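Examples.

1. Give the oblique asymptotes of $f(x) = \sqrt{3x^2 + 2x + 1}$ at $-\infty$ and $+\infty$. Regarding the asymptote at $+\infty$, we are looking for $a, b$ such that

$$\lim_{x\to+\infty} \left(f(x) - (ax + b)\right) = 0.$$

This yields $a = \sqrt{3}$. Notice that

$$f(x) - \sqrt{3}x = \frac{3x^2 + 2x + 1 - 3x^2}{\sqrt{3x^2 + 2x + 1} + \sqrt{3}x} = \frac{2x + 1}{\sqrt{3}x\left(\sqrt{1 + \frac{2}{3x} + \frac{1}{3x^2}} + 1\right)}$$

hence $b = \sqrt{3}/3$ since $\lim_{x\to+\infty} (f(x) - \sqrt{3}x) = \frac{1}{\sqrt{3}}$. At $-\infty$ the oblique asymptote is the line with equation $y = -\sqrt{3}x - \sqrt{3}/3$.

2. Let $f(x) = \sqrt{x^4 + 4x^3 + 1}$. Find the quadratic polynomial $h(x) = ax^2 + bx + c$ such that

$$\lim_{x\to+\infty} \left(f(x) - h(x)\right) = 0.$$

We compute the coefficients successively. Obviously

$$\lim_{x\to+\infty} \frac{f(x)}{x^2} = 1.$$

So $a = 1$. Then we have

$$\lim_{x\to+\infty} \frac{f(x) - x^2}{x} = \lim_{x\to+\infty} \frac{4x^3 + 1}{x\left(\sqrt{x^4 + 4x^3 + 1} + x^2\right)} = 2,$$

hence $b = 2$, and finally

$$\lim_{x\to+\infty} \left(f(x) - x^2 - 2x\right) = \lim_{x\to+\infty} \frac{-4x^2 + 1}{\sqrt{x^4 + 4x^3 + 1} + x^2 + 2x} = -2,$$

hence $c = -2$.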
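The asymptotes found in the two examples can be checked numerically by evaluating the gap between the function and its candidate asymptote at large arguments. The sketch below uses illustrative names (`f1`, `line_plus`, `line_minus`, `f2`, `h2`) that do not appear in the text.

```python
import math

def f1(x: float) -> float:
    return math.sqrt(3 * x * x + 2 * x + 1)

def line_plus(x: float) -> float:
    return math.sqrt(3) * x + math.sqrt(3) / 3      # asymptote at +infinity

def line_minus(x: float) -> float:
    return -math.sqrt(3) * x - math.sqrt(3) / 3     # asymptote at -infinity

def f2(x: float) -> float:
    return math.sqrt(x ** 4 + 4 * x ** 3 + 1)

def h2(x: float) -> float:
    return x * x + 2 * x - 2                        # a = 1, b = 2, c = -2

for x in (1e2, 1e4, 1e6):
    print(x, f1(x) - line_plus(x), f1(-x) - line_minus(-x))
for x in (1e1, 1e2, 1e4):
    print(x, f2(x) - h2(x))
```

For very large $x$ the subtraction of two nearly equal floating-point numbers loses accuracy, so the gaps should be read as indicative only.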

4.2 Uniformly continuous functions

Uniform continuity. Let $I$ be an interval. A function $f$ defined over $I$ is said to be uniformly continuous over $I$ if to all $\varepsilon > 0$ we can associate $\delta > 0$ such that $x, y \in I$ and $|x - y| < \delta$ imply $|f(x) - f(y)| < \varepsilon$.

Lipschitz-continuous functions. Let $I$ be an interval. A function $f$ defined over $I$ is said to be Lipschitz-continuous over $I$ if there exists $L > 0$ such that $|f(x) - f(y)| \leq L|x - y|$ for all $x, y \in I$.

Proposition 4.2.1. - Uniformly continuous functions.

1. A uniformly continuous function over an interval I is continuous.

2. Let f be a continuous function defined over the bounded and closed interval[a, b]. Then f is uniformly continuous over [a, b].

3. A Lipschitz-continuous function over I is uniformly continuous over I.

Proof. The first assertion is obvious. To prove 2, let us suppose that the assertion is false for a certain function $f$. Then there exists $\varepsilon > 0$ such that for all $n \in \mathbb{N}$, $n \geq 1$, there exist $x_n, y_n \in [a, b]$ such that

$$|x_n - y_n| < \frac{1}{n}, \qquad |f(x_n) - f(y_n)| \geq \varepsilon.$$

By the Bolzano-Weierstrass theorem, there exist subsequences $(x_{n_k})$, $(y_{n_k})$ that converge towards $x$ and $y$ in $[a, b]$ respectively. From the first inequality $x = y$. By continuity of $f$, $f(x_{n_k}) - f(y_{n_k})$ converges towards $f(x) - f(y) = 0$, which contradicts the second inequality. To prove 3, choose $\delta = \varepsilon/L$ in the definition of a uniformly continuous function.


Approximation of a continuous function by a step function. Uniform continuity plays an important role in the construction of the integral. We will show in Chapter 6 that a uniformly continuous function can be approximated by a step function with an arbitrary precision.

Examples of continuous but not uniformly continuous functions. The functions $f, g : ]0, 3[ \to \mathbb{R}$ defined by $f(x) = \frac{1}{x}$ and $g(x) = \sin\frac{1}{x}$ are not uniformly continuous. In fact for $f$, consider the sequences defined by $x_n = \frac{1}{n}$ and $y_n = \frac{x_n}{2}$, $n$ a positive integer. Then $x_n - y_n$ converges towards 0 when $n$ tends to infinity but $|f(y_n) - f(x_n)| = n$ diverges. In order to analyze $g$, let $x_n = \frac{1}{\pi(n + \frac{1}{2})}$ and $y_n = \frac{1}{\pi n}$. Then $x_n - y_n$ converges towards 0 when $n$ tends to infinity but $|g(y_n) - g(x_n)| = 1$ does not converge towards 0. The function $h : \mathbb{R} \to \mathbb{R}$ defined by $h(x) = x^2$ is uniformly continuous over any bounded and closed interval but it is not uniformly continuous over an unbounded interval. Take, for instance, $x_n = n$ and $y_n = n + \frac{1}{n}$. Then $x_n - y_n$ converges towards 0 when $n$ tends to infinity but $|h(y_n) - h(x_n)| = 2 + \frac{1}{n^2}$ does not converge to 0.
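The three counterexamples can be illustrated by computing, for each pair of sequences, the distance between the arguments and the distance between the images. This is a minimal sketch; the helper names `gap_f`, `gap_g` and `gap_h` are assumptions made for the illustration.

```python
import math

def gap_f(n: int):
    """f(x) = 1/x with x_n = 1/n, y_n = x_n/2: returns (|x_n - y_n|, |f(y_n) - f(x_n)|)."""
    x = 1 / n
    y = x / 2
    return abs(x - y), abs(1 / y - 1 / x)

def gap_g(n: int):
    """g(x) = sin(1/x) with x_n = 1/(pi(n + 1/2)), y_n = 1/(pi n)."""
    x = 1 / (math.pi * (n + 0.5))
    y = 1 / (math.pi * n)
    return abs(x - y), abs(math.sin(1 / y) - math.sin(1 / x))

def gap_h(n: int):
    """h(x) = x^2 with x_n = n, y_n = n + 1/n."""
    x = float(n)
    y = n + 1 / n
    return abs(x - y), abs(y * y - x * x)

for n in (10, 100, 1000):
    print(n, gap_f(n), gap_g(n), gap_h(n))
```

In each case the first component tends to 0 while the second does not, so no single $\delta$ can work for a given small $\varepsilon$.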

Example - a uniformly continuous function that is not Lipschitz-continuous. The function $f(x) = \sqrt{x}$ is uniformly continuous over $[0, 1]$ but there does not exist a constant $L > 0$ such that $\sqrt{x} - \sqrt{0} \leq L(x - 0)$ for all $x \in [0, 1]$.

Chapter 5

Differential calculus

We present the basic techniques of differential calculus.

Notions to learn. Derivative and differentiable function, chain rule, mean value theorem, the class of functions $C^n$, Taylor polynomials, power series, local extrema, inflexion point, asymptote, convex function, De l’Hospital’s rule.

Skills to acquire. Know the derivatives of elementary functions, understand and know how to apply the properties of derivation to the computation of derivatives, know how to compute the truncated Taylor expansion of a function, know how to apply De l’Hospital’s rule to compute limits, know how to study a function, know how to compute the power series of elementary functions and to determine the radius of convergence.

5.1 The derivative

Let $f : D \to \mathbb{R}$ and $a \in D$ be a limit point of $D$. These conditions are verified for all $a \in D$ if $D = I$ is an interval containing a non-empty open interval.

Definition. A function $f : D \to \mathbb{R}$ is said to be differentiable at $a \in D$ if the limit

$$\lim_{x\to a} \frac{f(x) - f(a)}{x - a}$$

exists. This limit, when it exists, is called the derivative of the function $f$ at the point $a$ and we denote it by $f'(a)$.

Remark. Notice that the quotient $\frac{f(x) - f(a)}{x - a}$ is not defined at $a$, hence $x \neq a$ (and $x \in D$). This is why we do not refer explicitly to the punctured limit $\lim_{x\to a,\ x\neq a} \frac{f(x) - f(a)}{x - a}$. Letting $h = x - a$ with $a + h \in D \setminus \{a\}$ (that is $h \neq 0$) we can also write

$$f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a} = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h} \qquad (5.1)$$

Geometric interpretation - the tangent problem. Let $f : D \to \mathbb{R}$ be a function. We want to build the tangent at a point $(a, f(a))$ of a given curve $(x, f(x))$. This problem leads to differential calculus because we would like to know if the slope of the secant given by the line joining $(a, f(a))$ to $(x, f(x))$ for $x \in D \setminus \{a\}$ has a limit value, the slope of the tangent at $(a, f(a))$. Now the slope of the secant is given by

$$m_a(x) = \frac{f(x) - f(a)}{x - a} \quad \text{if } x \in D \setminus \{a\}$$

and if the limit exists when $x$ tends to $a$, the derivative $f'(a)$ is equal to the continuous extension of $m_a(x)$ at $a$, i.e.:

$$m_a(x) = \begin{cases} \frac{f(x) - f(a)}{x - a} & \text{if } x \in D \setminus \{a\}, \\ f'(a) & \text{if } x = a \end{cases} \qquad (5.2)$$

is a continuous function at $a \in D$. The equation of the tangent at $a$ is given by

$$t_a(x) = f(a) + f'(a)(x - a) \quad \text{or} \quad t_a(a + h) = f(a) + f'(a)h.$$

Notice that for all $x \in D$ we can write

$$f(x) = f(a) + (x - a)m_a(x).$$

This implies

Proposition. If a function $f : D \to \mathbb{R}$ is differentiable at $a \in D$, then $f$ is continuous at $a$.

[Figure: the graph of $f(x)$, the secant segment $f(a) + \frac{f(a+h) - f(a)}{h}(x - a)$ and the tangent $t_a(x) = f(a) + f'(a)(x - a)$; the points $a$ and $a + h$ are marked on the horizontal axis, the values $f(a)$ and $f(a + h)$ on the vertical axis.]


Approximation of a function in a neighborhood. We expect to have $f(x) \approx f(a) + f'(a)(x - a)$ for $x$ close enough to $a$. Or, more precisely, the property of being differentiable is equivalent to the property that $f$ can be approximated by a linear function. This result and the quality of the approximation are given in the following proposition. We first introduce a new notation:

Notation O(h) and o(h). We say that an expression $f(h)$ is $O(h)$ (read: big O of $h$) when $h$ tends to zero and we write $f(h) = O(h)$ if there exists a constant $C > 0$ such that $|f(h)| \leq C|h|$ for all small enough $h$. In particular, it follows that $\lim_{h\to 0} f(h) = 0$. We say that an expression $f(h)$ is $o(h)$ (read: small o of $h$) when $h$ tends to zero and we write $f(h) = o(h)$ if $\lim_{h\to 0} \frac{f(h)}{h} = 0$. These notions represent a convention to describe the qualitative behavior when the argument tends to zero. In general, in this type of computation only the dominant term is preserved, for example:

o(h) + o(h) = o(h)

o(h) +O(h) = O(h)

O(h) +O(h2) = O(h)

o(h) +O(h2) = o(h)

hp ·O(h) = O(hp+1), p > 0

hp · o(h) = o(hp+1), p > 0

O(h2) +O(h3) + h · o(h) = O(h2)

o(h) · o(h) = o(h2)

O(h) · o(h) = o(h2)

These identities signify: "the sum of two functions which are $o(h)$ is $o(h)$", "the sum of a function which is $o(h)$ and of a function which is $O(h)$ is $O(h)$", etc. If $f$ is differentiable at $a$ we can characterize the gap between $f$ and its tangent at $a$ by the following result:

Theorem 5.1. - Taylor expansion of first order. $f : D \to \mathbb{R}$ is differentiable at $a \in D$ if and only if

$$f(a + h) = f(a) + f'(a)h + o(h) \qquad (5.3)$$

for all $a + h \in D$.

Proof. If $f$ is differentiable, it follows from (5.1) that

$$0 = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h} - f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a) - f'(a)h}{h}$$

hence from the definition of $o(h)$: $f(a + h) - f(a) - f'(a)h = o(h)$. If $f$ satisfies (5.3), then from the definition of $o(h)$:

$$0 = \lim_{h\to 0} \frac{o(h)}{h} = \lim_{h\to 0} \frac{f(a + h) - f(a) - f'(a)h}{h}$$

hence (5.1).
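The characterization (5.3) says that the remainder $f(a+h) - f(a) - f'(a)h$ divided by $h$ tends to 0. The following sketch evaluates this ratio for a differentiable function and for the absolute value at 0, where no choice of slope works; the function name `remainder_ratio` and its parameter `slope` are illustrative assumptions.

```python
import math

def remainder_ratio(f, a: float, slope: float, h: float) -> float:
    """(f(a+h) - f(a) - slope*h) / h, which tends to 0 exactly when slope = f'(a)."""
    return (f(a + h) - f(a) - slope * h) / h

# exp at a = 0 with slope f'(0) = 1: the ratio behaves like h/2
for h in (1e-1, 1e-2, 1e-4):
    print(h, remainder_ratio(math.exp, 0.0, 1.0, h))

# |x| at a = 0 with candidate slope 0: the ratio is +1 for h > 0 and -1 for h < 0
print(remainder_ratio(abs, 0.0, 0.0, 1e-6), remainder_ratio(abs, 0.0, 0.0, -1e-6))
```

For the absolute value the ratio equals $\operatorname{sign}(h) - s$ for any candidate slope $s$, so it cannot tend to 0 from both sides.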


Definition. Let $D = I$ be an interval. We say that a function $f : D \to \mathbb{R}$ is differentiable over $I$ if it is differentiable at all $a \in D$. The function $f'$ defined over $I$ is called the derivative of $f$.

Corollary. A differentiable function over $I$ is continuous over $I$.

Notation. We also denote the derivative $f'(x)$ by $\frac{d f(x)}{d x}$ or by $\frac{d}{d x} f(x)$. In particular, the last notation shows that differentiating a function $f$ is an operation on $f$ whose result is a function $f'$, the derivative of $f$. We can compare the action of $\frac{d}{d x}$ on a function with that of a matrix on a vector. The derivative at a point $a \in D$ is denoted by $f'(a)$ or by $\left.\frac{d f(x)}{d x}\right|_{x=a}$ or by $\left.\frac{d}{d x}\right|_{x=a} f(x)$. If the variable $x$ represents time we also denote the derivative $f'(x)$ by $\dot{f}(x)$.

Examples - compute the derivative by means of the definition.

1. The constant functions, i.e. f(x) = c for all x ∈ R have derivative f ′(x) =0.

2. The identity function f(x) = x is differentiable at all x ∈ R and f ′(x) = 1.

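3. Let $f(x) = x^2$. Then for all $x \in \mathbb{R}$

$$\lim_{h\to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h\to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h\to 0} \frac{2xh + h^2}{h} = 2x,$$

that is $f'(x) = 2x$. Notice that $(x + h)^2 - x^2 - 2xh = h^2$, hence $o(h) = h^2$.

4. Let $f(x) = \sqrt{x}$. For all $x > 0$

$$\lim_{h\to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} = \lim_{h\to 0} \frac{x + h - x}{h(\sqrt{x + h} + \sqrt{x})} = \lim_{h\to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{2\sqrt{x}}$$

because $x \mapsto \sqrt{x}$ is a continuous function. Let $x = 1$. We can write

$$\sqrt{1 + h} = \sqrt{1} + \frac{1}{2\sqrt{1}} h + o(h) \approx 1 + \frac{h}{2}$$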

A first approximation yields√

1 + h ≈ 1 + h2 . The error is smaller than

10−3 if |h| < 0.087 and√

1+h1+h/2 < 10−3 if |h| < 0.093.

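5. Let $f(x) = e^x$. For all $x \in \mathbb{R}$, the function $e^x$ is differentiable and $f'(x) = e^x$ because

$$\lim_{h\to 0} \frac{e^{x+h} - e^x}{h} = \lim_{h\to 0} e^x \frac{e^h - 1}{h} = e^x \lim_{h\to 0} \frac{1 + h + r_2(h) - 1}{h} = e^x$$

since $r_2(h)/h$ converges towards 0 when $h$ tends to 0.

6. Show that for all $x \in \mathbb{R}$

$$\frac{d \sin x}{dx} = \cos x, \qquad \frac{d \cos x}{dx} = -\sin x.$$

Using $\sin y - \sin x = 2 \sin\left(\frac{y - x}{2}\right) \cos\left(\frac{x + y}{2}\right)$, we have

$$\lim_{h\to 0} \frac{\sin(x + h) - \sin x}{h} = \lim_{h\to 0} \frac{2 \sin\left(\frac{h}{2}\right) \cos\left(\frac{2x + h}{2}\right)}{h} = \lim_{h\to 0} \frac{\sin\left(\frac{h}{2}\right)}{\frac{h}{2}} \cdot \lim_{h\to 0} \cos\left(x + \frac{h}{2}\right) = \cos x$$

because $\lim_{t\to 0} \frac{\sin t}{t} = 1$ and $\cos x$ is continuous. To compute the derivative of cosine, we use the identity $\cos y - \cos x = -2 \sin\left(\frac{y - x}{2}\right) \sin\left(\frac{x + y}{2}\right)$ and the continuity of the function sin.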
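The derivatives computed in these examples can be compared with the difference quotient of definition (5.1) for a small step. This is a sketch only: the name `diff_quotient` and the step $h = 10^{-6}$ are illustrative choices, and the quotient approximates the derivative without being equal to it.

```python
import math

def diff_quotient(f, x: float, h: float) -> float:
    """Difference quotient (f(x+h) - f(x)) / h from definition (5.1), for h != 0."""
    return (f(x + h) - f(x)) / h

h = 1e-6
print(diff_quotient(lambda t: t * t, 3.0, h), 2 * 3.0)           # example 3
print(diff_quotient(math.sqrt, 4.0, h), 1 / (2 * math.sqrt(4)))  # example 4
print(diff_quotient(math.exp, 1.0, h), math.exp(1.0))            # example 5
print(diff_quotient(math.sin, 0.7, h), math.cos(0.7))            # example 6
print(diff_quotient(math.cos, 0.7, h), -math.sin(0.7))           # example 6
```

Taking $h$ much smaller does not necessarily improve the agreement, because the subtraction $f(x+h) - f(x)$ then suffers from rounding errors.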

5.1.1 Properties of the derivative

Theorem 5.2. - Computation rules for differential calculus. Let $f, g : D \to \mathbb{R}$ be two differentiable functions at $a \in D$. Then

$$(\alpha f + \beta g)'(a) = \alpha f'(a) + \beta g'(a) \quad \text{for } \alpha, \beta \in \mathbb{R} \quad \text{(linearity)} \qquad (5.4)$$

$$(fg)'(a) = f'(a)g(a) + f(a)g'(a) \quad \text{(product rule)} \qquad (5.5)$$

$$\left(\frac{f}{g}\right)'(a) = \frac{f'(a)g(a) - f(a)g'(a)}{g^2(a)} \quad \text{if } g(a) \neq 0 \quad \text{(quotient rule)} \qquad (5.6)$$

Proof. We use either the quotient (5.1) in the definition or the first order Taylor expansion (5.3). We have $f(a + h) = f(a) + f'(a)h + o(h)$ and $g(a + h) = g(a) + g'(a)h + o(h)$ for $a + h \in D$. The linearity of the derivative (5.4) follows. For the product rule (5.5) notice that

$$f(a + h)g(a + h) = \left(f(a) + f'(a)h + o(h)\right) \cdot \left(g(a) + g'(a)h + o(h)\right) = f(a)g(a) + \left(f'(a)g(a) + f(a)g'(a)\right)h + o(h)$$

by applying the conventions for the $o(h)$. It follows that the product $fg$ admits a first order Taylor expansion at $a \in D$. The linear term in $h$ gives the derivative of $fg$ at $a \in D$, hence (5.5). The proof of the quotient rule (5.6) is left as an exercise.

Examples.

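1. Show that for all $n \in \mathbb{N}$ and $x \in \mathbb{R}$

$$\frac{d x^n}{d x} = n x^{n-1}.$$

By induction and the product rule

$$\frac{d x^n}{d x} = \frac{d (x \cdot x^{n-1})}{d x} = x \frac{d x^{n-1}}{d x} + x^{n-1} \frac{d x}{d x} = (n - 1)x \cdot x^{n-2} + x^{n-1} = n x^{n-1}.$$

2. By the quotient rule we have

$$(e^{-x})' = \left(\frac{1}{e^x}\right)' = -\frac{e^x}{e^{2x}} = -e^{-x}.$$

3. Show that for all $x \in \mathbb{R}$

$$\frac{d \sinh x}{dx} = \cosh x, \qquad \frac{d \cosh x}{dx} = \sinh x.$$

From the linearity of the derivative we have

$$(\sinh x)' = \left(\frac{e^x - e^{-x}}{2}\right)' = \frac{(e^x)' - (e^{-x})'}{2} = \frac{e^x - (-e^{-x})}{2} = \cosh x$$

and

$$(\cosh x)' = \left(\frac{e^x + e^{-x}}{2}\right)' = \frac{(e^x)' + (e^{-x})'}{2} = \frac{e^x + (-e^{-x})}{2} = \sinh x.$$

4. Show that for all $x \in \mathbb{R}$

$$\frac{d \tanh x}{dx} = \frac{1}{\cosh^2 x}.$$

From the definition of tanh and the quotient rule

$$(\tanh x)' = \left(\frac{\sinh x}{\cosh x}\right)' = \frac{(\sinh x)' \cosh x - \sinh x (\cosh x)'}{\cosh^2 x} = \frac{\cosh x \cosh x - \sinh x \sinh x}{\cosh^2 x} = \frac{1}{\cosh^2 x} = 1 - \tanh^2 x.$$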

Theorem 5.3. - Chain rule. If $f : D \to T$ and $g : T \to B$ are differentiable respectively at $a \in D$ and $b = f(a)$, then the composite function $g \circ f$ is differentiable at $a$ and

$$(g \circ f)'(a) = g'(f(a)) f'(a) \qquad (5.7)$$

Remark. We also write

$$\left.\frac{d g(f(x))}{d x}\right|_{x=a} = \left.\frac{d g(y)}{d y}\right|_{y=f(a)} \cdot \left.\frac{d f(x)}{d x}\right|_{x=a}$$

or more briefly $(g \circ f)' = (g' \circ f) \cdot f'$.

Proof. Apply the first order Taylor expansion to $g(f(a + h))$. Exercise.

Corollary 5.4. - derivative of the inverse function. Let $f$ be invertible, continuous and differentiable at $a$. If $f'(a) \neq 0$, then $f^{-1}$ is differentiable at $b = f(a)$ and

$$(f^{-1})'(f(a)) = \frac{1}{f'(a)} \qquad (5.8)$$

or by making the substitution $b = f(a)$

$$(f^{-1})'(b) = \frac{1}{f'(f^{-1}(b))}. \qquad (5.9)$$

Remark. We also write

$$\left.\frac{d f^{-1}(y)}{d y}\right|_{y=f(a)} = \frac{1}{\left.\frac{d f(x)}{d x}\right|_{x=a}} \quad \text{or} \quad \left.\frac{d f^{-1}(y)}{d y}\right|_{y=b} = \frac{1}{\left.\frac{d f(x)}{d x}\right|_{x=f^{-1}(b)}}$$

Proof of the corollary. We apply the chain rule to the identity $f^{-1}(f(x)) = x$.

Examples.

1. To compute the derivative of h(x) = (3x2 + 5x+ 2)n, we apply the chainrule for f(x) = 3x2 + 5x + 2 and g(y) = yn. With f ′(x) = 6x + 5 andg′(y) = nyn−1 we obtain

h′(x) =d g(y)

d y

∣∣∣∣y=f(x)

· d f(x)

d x

= nf(x)n−1f ′(x)

= n(6x+ 5)(3x2 + 5x+ 2)n−1.

2. Show that (lnx)′ = 1x for all x > 0. By using the formula for the derivative

of an inverse function we have

d ln y

d y=

1d ex

d x

∣∣x=ln y

=1

ex∣∣x=ln y

=1

eln y=

1

y.

3. The logarithmic derivative of a function. Let f(x) > 0 be a differ-entiable function. The derivative

d ln(f(x))

d x=d ln y

d y

∣∣∣∣y=f(x)

· d f(x)

d x=f ′(x)

f(x)

is called the logarithmic derivative of f(x).

4. Let g be a differentiable function and λ ≠ 0 a real constant. We consider the function $g_\lambda$ defined by $g_\lambda(x) = g(\lambda x)$. The derivative of $g_\lambda$ is given by (apply the chain rule with outer function g and inner function f(x) = λx)

\[
\frac{d\,g_\lambda(x)}{dx} = \frac{d\,g(\lambda x)}{dx} = \lambda g'(\lambda x).
\]

In particular, for λ = −1, we have

\[
\frac{d\,g(-x)}{dx} = -g'(-x).
\]

For example, $\frac{d\,e^{-x}}{dx} = -e^{-x}$.


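5. We compute the derivative of the function arcsin : ]−1, 1[ → ]−π/2, π/2[.

\[
\frac{d \arcsin y}{dy}
= \frac{1}{\left.\dfrac{d \sin x}{dx}\right|_{x=\arcsin y}}
= \frac{1}{\cos(\arcsin y)}
= \frac{1}{\sqrt{1-\sin^2(\arcsin y)}}
= \frac{1}{\sqrt{1-y^2}},
\]

where we used that cos x > 0 for x ∈ ]−π/2, π/2[.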
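The inverse-function formula (5.9) can be checked numerically. The sketch below is only an illustration and not part of the course material: the helper names `central_diff` and `inverse_derivative` are hypothetical, and the symmetric difference quotient is an assumed approximation of the derivative.

```python
import math

def central_diff(func, x, h=1e-6):
    """Approximate func'(x) by a symmetric difference quotient (assumed step h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def inverse_derivative(f_prime, f_inverse, b):
    """Evaluate 1 / f'(f^{-1}(b)), the right-hand side of formula (5.9)."""
    return 1.0 / f_prime(f_inverse(b))

# Example 2: f = exp, f^{-1} = ln, evaluated at b = 3.
ln_formula = inverse_derivative(math.exp, math.log, 3.0)
ln_numeric = central_diff(math.log, 3.0)

# Example 5: f = sin, f^{-1} = arcsin, evaluated at b = 0.5.
asin_formula = inverse_derivative(math.cos, math.asin, 0.5)
asin_numeric = central_diff(math.asin, 0.5)
asin_closed = 1.0 / math.sqrt(1.0 - 0.5 ** 2)

print(ln_formula, ln_numeric)
print(asin_formula, asin_numeric, asin_closed)
```

In both cases the value given by the formula and the difference quotient should agree up to the discretisation error of the quotient.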

5.1.2 Unilateral derivative

Right derivative. A function f : D → T is said to be right differentiable at a ∈ D if the limit

\[
\lim_{x \to a^+} \frac{f(x) - f(a)}{x - a}
\]

exists. This limit, when it exists, is called the right derivative of the function f at the point a.

Left derivative. A function f : D → T is said to be left differentiable at a ∈ D if the limit

\[
\lim_{x \to a^-} \frac{f(x) - f(a)}{x - a}
\]

exists. This limit, when it exists, is called the left derivative of the function f at the point a.

Derivative and unilateral derivatives at an interior point of the domain. A function f : D → T is differentiable at an interior point $a \in \mathring{D}$ if and only if its right derivative and its left derivative at a exist and are equal.

Example. The function abs(x) = |x| is differentiable at all a ≠ 0 and abs′(a) = sign(a). At a = 0 the unilateral derivatives exist: the right derivative is +1 and the left derivative is −1. Hence abs(x) is not differentiable at a = 0.

5.1.3 Derivative and local behavior

The first order Taylor expansion (see equation (5.3)) enables the analysis of the behavior of a function f : D → T in the neighborhood of an interior point a if f′(a) ≠ 0:

Theorem 5.5. - transversality if f′(a) ≠ 0. Let f : D → T be differentiable at $a \in \mathring{D}$ with f′(a) ≠ 0. Then there exists δ > 0 such that the function h ↦ f(a + h) − f(a) changes sign at h = 0 and f(a + h) − f(a) ≠ 0 if h ∈ ]−δ, δ[ and h ≠ 0. More precisely, if f′(a) > 0, then

\[
f(a+h) - f(a) > 0 \ \text{ if } 0 < h < \delta,
\qquad
f(a+h) - f(a) < 0 \ \text{ if } -\delta < h < 0.
\]


Proof. The hypothesis $a \in \mathring{D}$ implies that there is a neighborhood ]a − r, a + r[ ⊂ D. From equation (5.3), f(a + h) − f(a) − f′(a)h = o(h). From the definition of o(h), for all ε > 0 there exists δ > 0 such that |o(h)| < ε|h| if 0 < |h| < δ. In particular, for ε = |f′(a)| there exists δ > 0 such that |f(a + h) − f(a) − f′(a)h| = |o(h)| < |f′(a)||h| for all 0 < |h| < δ. Hence f(a + h) − f(a) = f′(a)h + o(h) ≠ 0 and it has the same sign as f′(a)h.

5.1.4 An application of the derivative: de l’Hospital’s rule.

The undefined expression 0/0. Let f, g be two real functions such that f(x), g(x) → 0 when x tends to a. Then the limit $\lim_{x\to a} \frac{f(x)}{g(x)}$ is an undefined expression of the form 0/0. We can evaluate this limit if f, g are differentiable at a and if g′(a) ≠ 0:

Theorem 5.6. - De l’Hospital’s rule I. Let f, g : D → R be two functions that are differentiable at a such that f(a) = g(a) = 0 and g′(a) ≠ 0. Then there exists δ > 0 such that g(a + h) ≠ 0 for all h such that 0 < |h| < δ and

\[
\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{h\to 0} \frac{f(a+h)}{g(a+h)} = \frac{f'(a)}{g'(a)}.
\]

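Proof. From the definition of the derivative

\[
f(a+h) = f(a) + f'(a)h + o(h) = f'(a)h + o(h),
\qquad
g(a+h) = g(a) + g'(a)h + o(h) = g'(a)h + o(h)
\]

for all a + h ∈ D. By theorem 5.5, g(a + h) = g′(a)h + o(h) ≠ 0 for 0 < |h| < δ. Dividing the numerator and the denominator by h, we get

\[
\frac{f(a+h)}{g(a+h)} = \frac{f'(a)h + o(h)}{g'(a)h + o(h)} = \frac{f'(a) + o(h)/h}{g'(a) + o(h)/h} \to \frac{f'(a)}{g'(a)}
\]

when h tends to zero.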

Examples.

1.
\[
\lim_{x\to 0} \frac{2x + \sin x}{e^x - 1} = \lim_{x\to 0} \frac{2 + \cos x}{e^x} = 3
\]
since the derivatives of the functions are continuous.

2.
\[
\lim_{x\to 0} \frac{\tan 2x}{\tan x} = \lim_{x\to 0} \frac{2 + 2\tan^2 2x}{1 + \tan^2 x} = 2.
\]
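The two limits above can be observed numerically by evaluating the quotients at points approaching 0. This is a hedged sketch: the helper `quotient` and the list of step sizes are illustrative choices, not part of the notes.

```python
import math

def quotient(f, g, x):
    """Evaluate f(x)/g(x) at a point where g(x) != 0."""
    return f(x) / g(x)

# Example 1: (2x + sin x)/(e^x - 1); the rule predicts f'(0)/g'(0) = 3/1.
f1 = lambda x: 2.0 * x + math.sin(x)
g1 = lambda x: math.expm1(x)  # e^x - 1, computed accurately near 0

# Example 2: tan(2x)/tan(x); the rule predicts f'(0)/g'(0) = 2/1.
f2 = lambda x: math.tan(2.0 * x)
g2 = lambda x: math.tan(x)

steps = [1e-1, 1e-2, 1e-3, 1e-4]
table1 = [quotient(f1, g1, h) for h in steps]
table2 = [quotient(f2, g2, h) for h in steps]

for h, q1, q2 in zip(steps, table1, table2):
    print(h, q1, q2)
```

The printed quotients approach 3 and 2 respectively as the step shrinks, in agreement with theorem 5.6.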

5.1.5 The class C1(I)

Definition. Let I be an open interval. We say that f : I → R is of class $C^1(I)$ if f is differentiable over I and its derivative f′ is continuous over I.

Remark. By definition $C^0(I)$ is the set (or the class) of continuous functions over I. Hence a function f is of class $C^1(I)$ if f is differentiable over I and f′ is of class $C^0(I)$. Since every differentiable function is continuous, $C^1(I) \subset C^0(I)$.


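Remark. There exist functions that are differentiable over I but whose derivatives are not continuous over I. For example, the function f : R → R defined by

\[
f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0 \end{cases}
\]

is differentiable over R with derivative given by

\[
f'(x) = \begin{cases} 2x \sin\frac{1}{x} - \cos\frac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}
\]

The derivative f′ is not continuous at x = 0, since cos(1/x) has no limit when x tends to 0.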
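The discontinuity of f′ at 0 can be made visible by evaluating f′ along two sequences that both tend to 0. The sketch below is illustrative only; the sequences $x_k = 1/(2\pi k)$ and $y_k = 1/((2k+1)\pi)$ are chosen here for convenience and do not appear in the notes.

```python
import math

def f_prime(x):
    """Derivative of f(x) = x^2 sin(1/x) for x != 0, with f'(0) = 0."""
    if x == 0.0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

ks = [10, 100, 1000]
# Along x_k = 1/(2 pi k): sin(1/x_k) = 0 and cos(1/x_k) = 1, so f'(x_k) is close to -1.
along_x = [f_prime(1.0 / (2.0 * math.pi * k)) for k in ks]
# Along y_k = 1/((2k+1) pi): cos(1/y_k) = -1, so f'(y_k) is close to +1.
along_y = [f_prime(1.0 / ((2 * k + 1) * math.pi)) for k in ks]

print(along_x)
print(along_y)
```

Both sequences tend to 0, but f′ stays near −1 on the first and near +1 on the second, so f′ cannot have the limit f′(0) = 0 at the origin.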

5.1.6 Supplement: the derivative of functions f : R → C

Differentiable function. Let D ⊂ R, a ∈ D a limit point of D and f = u + iv : D → C with u, v : D → R. The function f is said to be differentiable at a ∈ D if the real functions u and v are differentiable at a. In this case the derivative of the function f at the point a, written f′(a), is given by

\[
f'(a) = u'(a) + i\,v'(a).
\]

Example. Let p be real. The function f : R → C defined by $f(x) = e^{ipx}$ is differentiable at all x ∈ R and $f'(x) = ip\,e^{ipx}$. In fact, by using f(x) = cos(px) + i sin(px) we find

\[
f'(x) = -p\sin(px) + ip\cos(px) = ip\big(i\sin(px) + \cos(px)\big) = ip\,e^{ipx}.
\]

5.2 Mean value theorem

5.2.1 Local extrema and Rolle theorem

Monotonic function and derivative. Let f : D → R be a function that is differentiable at $a \in \mathring{D}$. If f is increasing in a neighborhood $B_\varepsilon(a)$ then f′(a) ≥ 0. If f is decreasing in a neighborhood $B_\varepsilon(a)$ then f′(a) ≤ 0. This is a consequence of the fact that a function f is increasing (respectively decreasing) if and only if for any pair $(x_1, x_2)$ we have $(f(x_2) - f(x_1))(x_2 - x_1) \ge 0$ (respectively $(f(x_2) - f(x_1))(x_2 - x_1) \le 0$).

Behavior if f′(a) ≠ 0. If f′(a) > 0, by theorem 5.5 there exists a neighborhood $B_\varepsilon(a)$ such that x < a implies f(x) < f(a) and x > a implies f(x) > f(a) (the curve y = f(x) crosses the line y = f(a) at x = a). However, from f′(a) ≠ 0 alone we cannot deduce that f is monotonic in a neighborhood of a. For example, let us consider the function f defined by

\[
f(x) = \begin{cases} x & \text{if } x \in \mathbb{Q}, \\ x + x^2 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}
\]

It is differentiable at x = 0 with f′(0) = 1, but it is not monotonic over any neighborhood of 0.


Stationary point. We say that a ∈ D is a stationary point for the function f : D → T if f′(a) = 0.

Local extrema. We say that the function f : D → T admits a local minimum at $a \in \mathring{D}$ if there exists a neighborhood $B_\delta(a)$ such that x ∈ $B_\delta(a)$ ∩ D implies f(a) ≤ f(x). We say that the function f : D → T admits a local maximum at $a \in \mathring{D}$ if there exists a neighborhood $B_\delta(a)$ such that x ∈ $B_\delta(a)$ ∩ D implies f(a) ≥ f(x). The minimum (respectively the maximum) is called strict if the inequality is strict in $B_\delta(a) \setminus \{a\}$.

Theorem 5.7. - Necessary condition for local extrema. Let $a \in \mathring{D}$ (the interior of D) be a local maximum (or minimum) of f : D → R and let f be differentiable at a. Then f′(a) = 0.

Proof. The function (which gives the slope of the secant lines and of the tangent line defined by equation 5.2)

\[
m_a(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a} & \text{if } x \in D \setminus \{a\}, \\ f'(a) & \text{if } x = a \end{cases}
\]

is continuous at x = a, since f is differentiable at a. If f admits a local extremum at a, the expression f(x) − f(a) does not change sign near a, whereas x − a does. Hence $m_a(x) \le 0$ on one side of a and $m_a(x) \ge 0$ on the other side. The continuity of $m_a$ at a then implies $f'(a) = m_a(a) = 0$.

Remark. The theorem gives a necessary but not sufficient condition for a local extremum of a differentiable function. For example, the function $f(x) = x^3$ admits a stationary point at x = 0 that is not a local extremum.

Continuous and differentiable functions. The fact that a continuous function that is defined over a bounded and closed interval [a, b] attains its minimum (and its maximum) allows us to distinguish the following alternatives:

1. The minimum is attained at a or at b.

2. The minimum is attained at a stationary point of f.

3. The minimum is attained at a point where f′ does not exist.

From these three alternatives, we derive the following result:

Theorem 5.8. - Rolle’s theorem¹. Let f : [a, b] → R be continuous over [a, b] and differentiable over ]a, b[ such that f(a) = f(b). Then the function f admits at least one stationary point in ]a, b[.

Proof. The continuous function f attains its minimum and its maximum in [a, b]. From the hypothesis that f is differentiable, the third alternative is excluded. If f attains both its minimum and its maximum on the border of the interval, the condition f(a) = f(b) (which means the minimum of f is equal to the maximum of f) implies that f is constant. Hence f′(x) = 0 at all x ∈ ]a, b[. We still have to deal with the case where an extremum is attained at an interior point c of the interval. This yields f′(c) = 0 by theorem 5.7.

¹ Named after Michel Rolle (1652 - 1719), a French mathematician.


5.2.2 Mean value theorem

Theorem 5.9. Lagrange’s mean value theorem². Let f : [a, b] → R be continuous over [a, b] and differentiable over ]a, b[. Then there exists at least one point c ∈ ]a, b[ for which we have

\[
\frac{f(b) - f(a)}{b - a} = f'(c).
\]

Proof. The function g : [a, b] → R defined by

\[
g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}\,(x - a)
\]

satisfies the conditions of Rolle’s theorem, since g(a) = g(b) = 0. Hence there exists a stationary point c ∈ ]a, b[ for g, that is,

\[
0 = g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}.
\]
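The theorem only asserts that a point c exists. For a concrete function one can try to locate such a point numerically. The following is a minimal sketch under extra assumptions that the theorem does not require: f′ is taken to be continuous and f′ minus the mean slope is assumed to have opposite signs at the two endpoints, so that bisection applies. The function name `mean_value_point` is hypothetical.

```python
import math

def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Locate c in ]a, b[ with f'(c) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes f' is continuous and f' - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = f_prime(lo) - slope
    g_hi = f_prime(hi) - slope
    if g_lo * g_hi > 0:
        raise ValueError("f' - slope has the same sign at both endpoints")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = f_prime(mid) - slope
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid              # sign change lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

# f(x) = x^3 on [0, 2]: the mean slope is 4, so 3 c^2 = 4 and c = 2 / sqrt(3).
c = mean_value_point(lambda x: x ** 3, lambda x: 3.0 * x ** 2, 0.0, 2.0)
print(c, 2.0 / math.sqrt(3.0))
```

For this example the computed point agrees with the exact value c = 2/√3 obtained by solving 3c² = 4.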

The mean value theorem allows us to draw conclusions about the monotonicity of a function by looking at its derivative.

Corollary 5.10. Let f : [a, b] → R be continuous over [a, b] and differentiable over ]a, b[. If f′(x) ≥ 0 (respectively f′(x) > 0) for all x ∈ ]a, b[, then f is increasing (respectively strictly increasing) over [a, b]. If f′(x) ≤ 0 (respectively f′(x) < 0) for all x ∈ ]a, b[, then f is decreasing (respectively strictly decreasing) over [a, b].

Proof. By the mean value theorem, for all x, y ∈ [a, b] with x < y there exists c = c(x, y) ∈ ]x, y[ such that f(y) − f(x) = f′(c)(y − x). Hence f(y) − f(x) has the same sign as f′(c).

Combined with the result stating that the derivative of a monotonic function has a constant sign, this corollary implies:

Corollary 5.11. Let f : [a, b] → R be continuous over [a, b] and differentiable over ]a, b[. Then

1. f′(x) ≥ 0 for all x ∈ ]a, b[ if and only if f is increasing.

2. f′(x) ≤ 0 for all x ∈ ]a, b[ if and only if f is decreasing.

The following corollaries play an important role in the theory of differential equations.

Corollary 5.12. Let f : [a, b] → R be continuous over [a, b], differentiable over ]a, b[ and f′(x) = 0 for all x ∈ ]a, b[. Then f is constant over [a, b].

Proof. By the mean value theorem, for all x ∈ ]a, b] there exists c = c(x) ∈ ]a, x[ such that f(x) − f(a) = f′(c)(x − a) = 0, i.e. f(x) = f(a).

² Named after Giuseppe Lodovico de Lagrangia (1736 - 1813) (in French: Joseph Louis, comte de Lagrange), an Italian mathematician and astronomer.


Remark. Consequently, a continuous and differentiable function f is constant if and only if f′(x) = 0 for all x. The following result holds:

Corollary 5.13. Let f, g : [a, b] → R be two continuous functions that are differentiable over ]a, b[ and such that f′(x) = g′(x) for all x ∈ ]a, b[. Then there exists a real constant c such that f(x) = g(x) + c. (Apply corollary 5.12 to the function f − g.)

Theorem 5.14. Cauchy’s mean value theorem. Let f, g : [a, b] → R be two continuous functions that are differentiable over ]a, b[, with g′(x) ≠ 0 for all x ∈ ]a, b[. Then there exists at least one point c ∈ ]a, b[ for which we have

\[
\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}.
\]

Proof. Note first that g(b) ≠ g(a), since otherwise Rolle’s theorem would give a point where g′ vanishes. The function h : [a, b] → R defined by

\[
h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\,\big(g(x) - g(a)\big)
\]

satisfies the conditions of Rolle’s theorem. Then there exists a stationary point c ∈ ]a, b[ for h. Thus

\[
0 = h'(c) = f'(c) - \frac{f(b) - f(a)}{g(b) - g(a)}\, g'(c).
\]

This theorem allows us to formulate a generalization of de l’Hospital’s rule.

Theorem 5.15. - de l’Hospital’s rule II. Let f, g : ]a, b[ → R be two differentiable functions with g(x) ≠ 0 and g′(x) ≠ 0 for all x ∈ ]a, b[. Moreover, we suppose that

1. $\lim_{x\to a^+} f(x) = \lim_{x\to a^+} g(x) = l$ with l = 0, −∞ or +∞,

2. $\lim_{x\to a^+} \frac{f'(x)}{g'(x)} = r$ with r ∈ R ∪ {−∞, +∞}.

Then

\[
\lim_{x\to a^+} \frac{f(x)}{g(x)} = r.
\]

Remark. This rule remains valid if x tends to b−, a, or ±∞.

Examples. It is important to verify that the limit $\lim_{x\to a^+} \frac{f'(x)}{g'(x)}$ indeed exists in order to be able to apply de l’Hospital’s rule. However, we often apply de l’Hospital’s rule formally, and if we are able to compute $\lim_{x\to a^+} \frac{f'(x)}{g'(x)}$, the validity of de l’Hospital’s rule is justified a posteriori.

1.
\[
\lim_{x\to+\infty} \frac{\ln(x)}{x} = \lim_{x\to+\infty} \frac{\frac{1}{x}}{1} = 0.
\]

2.
\[
\lim_{x\to 0} \frac{\sin(x)}{x} = \lim_{x\to 0} \frac{\cos(x)}{1} = 1.
\]

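3. To compute $\lim_{x\to\frac{\pi}{2}^+} \frac{\tan(3x)}{\tan(x)}$, first notice that from de l’Hospital’s rule

\[
\lim_{x\to\frac{\pi}{2}^+} \frac{\cos(x)}{\cos(3x)} = \lim_{x\to\frac{\pi}{2}^+} \frac{\sin(x)}{3\sin(3x)} = -\frac{1}{3}
\]

because $\sin\frac{\pi}{2} = -\sin\frac{3\pi}{2} = 1$ and sine is a continuous function. Consequently,

\[
\lim_{x\to\frac{\pi}{2}^+} \frac{\tan(3x)}{\tan(x)} = \lim_{x\to\frac{\pi}{2}^+} \frac{\sin(3x)}{\sin(x)} \cdot \frac{\cos(x)}{\cos(3x)} = (-1)\cdot\left(-\frac{1}{3}\right) = \frac{1}{3}.
\]

If we directly apply de l’Hospital’s rule, we have

\[
\lim_{x\to\frac{\pi}{2}^+} \frac{\tan(3x)}{\tan(x)} = \lim_{x\to\frac{\pi}{2}^+} \frac{3\cos^{-2}(3x)}{\cos^{-2}(x)} = 3\left(\lim_{x\to\frac{\pi}{2}^+} \frac{\cos(x)}{\cos(3x)}\right)^2 = 3\cdot\frac{1}{9} = \frac{1}{3}
\]

by using the result for $\lim_{x\to\frac{\pi}{2}^+} \frac{\cos(x)}{\cos(3x)}$.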

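Other undefined expressions. To compute other undefined expressions with de l’Hospital’s rule, we transform them using the following methods:

1. Expression "0·∞": compute $\lim_{x\to a^+} f(x)g(x)$ if $\lim_{x\to a^+} f(x) = 0$ and $\lim_{x\to a^+} g(x) = \infty$. We either write $f(x)g(x) = \frac{f(x)}{1/g(x)}$ ("0/0") or $f(x)g(x) = \frac{g(x)}{1/f(x)}$ ("∞/∞"). For example,

\[
\lim_{x\to 0^+} x\ln x = \lim_{x\to 0^+} \frac{\ln x}{\frac{1}{x}} = \lim_{x\to 0^+} \frac{\frac{1}{x}}{-\frac{1}{x^2}} = \lim_{x\to 0^+} (-x) = 0.
\]

2. Expression "∞−∞": compute $\lim_{x\to a^+} \big(f(x) - g(x)\big)$ if $\lim_{x\to a^+} f(x) = \infty$ and $\lim_{x\to a^+} g(x) = \infty$. We write

\[
f(x) - g(x) = \frac{\frac{1}{g(x)} - \frac{1}{f(x)}}{\frac{1}{f(x)g(x)}} \quad (\text{"0/0"}).
\]

For example, for all a ≥ 0

\[
\lim_{x\to 0^+} \left(\frac{1}{\sin x} - \frac{1}{x + ax^2}\right)
= \lim_{x\to 0^+} \frac{x + ax^2 - \sin x}{(x + ax^2)\sin x}
= \lim_{x\to 0^+} \frac{1 + 2ax - \cos x}{(1 + 2ax)\sin x + (x + ax^2)\cos x}
\]
\[
= \lim_{x\to 0^+} \frac{2a + \frac{1 - \cos x}{x}}{\frac{\sin x}{x} + 2a\sin x + (1 + ax)\cos x}
= \frac{2a}{2} = a.
\]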


5.3 Higher order derivatives

Definition. Let I be an open interval and f : I → R be a differentiable function over I. If the derivative f′ : I → R is also differentiable over I, its derivative is called the second derivative of f and is written f″. More generally, the successive derivatives of f, if they exist, are denoted by primes or by a natural number placed between parentheses as a superscript:

\[
f(x) = f^{(0)}(x),\quad f'(x) = f^{(1)}(x),\quad f''(x) = f^{(2)}(x),\quad \dots,\quad \big(f^{(n-1)}(x)\big)' = f^{(n)}(x).
\]

The function $f^{(n)}$ is called the nth derivative of f. We also write

\[
f^{(n)}(x) = \frac{d^n f(x)}{dx^n} = \frac{d^n}{dx^n} f(x).
\]

Examples.

1. Let m be a natural integer and $f(x) = x^m$. For any natural integer n, we have

\[
f^{(n)}(x) = \frac{d^n}{dx^n} x^m = \begin{cases} \dfrac{m!}{(m-n)!}\, x^{m-n} & \text{if } n \le m, \\ 0 & \text{otherwise.} \end{cases}
\]

2. Let $f(x) = e^{\lambda x}$. For any natural integer n, we have

\[
\frac{d^n}{dx^n} e^{\lambda x} = \lambda^n e^{\lambda x}.
\]

3. Let f(x) = sin x. We have

\[
\frac{d \sin x}{dx} = \cos x,\quad \frac{d^2 \sin x}{dx^2} = -\sin x,\quad \frac{d^3 \sin x}{dx^3} = -\cos x,\quad \frac{d^4 \sin x}{dx^4} = \sin x,
\]

and consequently, for any natural integer n:

\[
f^{(n)}(x) = \begin{cases} (-1)^m \cos x & \text{if } n = 2m + 1, \\ (-1)^m \sin x & \text{if } n = 2m. \end{cases}
\]

4. A few special functions are defined by taking the higher order derivatives of an elementary function. For example, for n ∈ N let

\[
H_n(x) := (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2} \tag{5.10}
\]

be the Hermite polynomials. We get:

\[
H_0(x) = 1,\quad H_1(x) = 2x,\quad H_2(x) = 4x^2 - 2,\quad H_3(x) = 8x^3 - 12x.
\]


5.3.1 The class Cn(I)

Definition. Let I be an open interval. We say that f : I → R is of class $C^n(I)$ if f is n times differentiable over I and its nth derivative $f^{(n)}$ is continuous over I. We refer to $C^\infty(I)$ as the set of functions whose derivatives of all orders exist and are continuous over I.

Remark. The set $C^n(I)$ is a vector space because if f, g ∈ $C^n(I)$ then αf + βg ∈ $C^n(I)$ for α, β ∈ R. We have the following inclusions:

\[
C^\infty(I) \subset C^{n+1}(I) \subset C^n(I) \subset \cdots \subset C^1(I) \subset C^0(I).
\]

5.3.2 Leibniz’s rule

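Leibniz’s rule. Let f, g ∈ $C^n(I)$. Then fg ∈ $C^n(I)$ and

\[
(fg)^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x)\, g^{(n-k)}(x)
\]

for all x ∈ I.

Proof. We prove Leibniz’s rule by induction. For n = 1 we have the product rule (5.2). We can suppose that the formula is true for n. Then,

\[
\begin{aligned}
(fg)^{(n+1)}(x) &= \frac{d}{dx}(fg)^{(n)}(x) = \frac{d}{dx}\sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x)\, g^{(n-k)}(x) \\
&= \sum_{k=0}^{n} \binom{n}{k} \Big(f^{(k+1)}(x)\, g^{(n-k)}(x) + f^{(k)}(x)\, g^{(n+1-k)}(x)\Big) \quad \text{by (5.1) and (5.2)} \\
&= \sum_{k=0}^{n} \binom{n}{k} \Big(f^{(k+1)}(x)\, g^{(n+1-k-1)}(x) + f^{(k)}(x)\, g^{(n+1-k)}(x)\Big) \\
&= \sum_{j=1}^{n+1} \binom{n}{j-1} f^{(j)}(x)\, g^{(n+1-j)}(x) + \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x)\, g^{(n+1-k)}(x) \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} f^{(k)}(x)\, g^{(n+1-k)}(x)
\end{aligned}
\]

where at the last step we use Pascal’s triangle for binomial coefficients:

\[
\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}.
\]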

Application to Hermite polynomials. We can also show that for all n ∈ N: $H_n'(x) = 2xH_n(x) - H_{n+1}(x)$, and for all integers n ≥ 1: $H_n'(x) = 2nH_{n-1}(x)$. For the first identity, notice that

\[
H_n'(x) = (-1)^n \frac{d}{dx}\left(e^{x^2} \frac{d^n}{dx^n} e^{-x^2}\right)
= 2xH_n(x) + (-1)^n e^{x^2} \frac{d^{n+1}}{dx^{n+1}} e^{-x^2}
= 2xH_n(x) - H_{n+1}(x).
\]

For the second identity, we use this computation to get

\[
H_n'(x) = 2xH_n(x) + (-1)^n e^{x^2} \frac{d^{n+1}}{dx^{n+1}} e^{-x^2}
= 2xH_n(x) + (-1)^n e^{x^2} \frac{d^n}{dx^n}\left(-2x\, e^{-x^2}\right),
\]

hence by Leibniz’s rule:

\[
H_n'(x) = 2xH_n(x) - 2xH_n(x) - (-1)^n e^{x^2}\, 2n\, \frac{d^{n-1}}{dx^{n-1}} e^{-x^2} = 2nH_{n-1}(x).
\]
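The identity $H_{n+1}(x) = 2xH_n(x) - H_n'(x)$ gives a simple way to generate the Hermite polynomials. The following sketch is an illustration, not the method of the notes: polynomials are represented as coefficient lists with the constant term first (a convention chosen here), and the function names are hypothetical.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (constant term first)."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def hermite(n):
    """Coefficients of H_n, built from H_0 = 1 and H_{k+1} = 2x H_k - H_k'."""
    h = [1]
    for _ in range(n):
        two_x_h = [0] + [2 * c for c in h]         # multiply H_k by 2x
        d = derivative(h)
        d = d + [0] * (len(two_x_h) - len(d))      # pad H_k' to the same length
        h = [p - q for p, q in zip(two_x_h, d)]
    return h

for n in range(5):
    print(n, hermite(n))
```

For n = 0, ..., 3 the lists reproduce the polynomials listed after (5.10), for instance `hermite(3)` is `[0, -12, 0, 8]`, i.e. 8x³ − 12x. The second identity can be checked on the same representation: `derivative(hermite(4))` equals 8 times `hermite(3)`.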

5.3.3 Convex function and second derivative

Definition - convex function. Let I be an open interval. A function f : I → R is said to be convex over I if for any pair $x_1, x_2$ in I and any t ∈ [0, 1]:

\[
f(tx_1 + (1-t)x_2) \le t f(x_1) + (1-t) f(x_2).
\]

The function f is said to be strictly convex if for any pair $x_1 \ne x_2$ in I and all t ∈ ]0, 1[:

\[
f(tx_1 + (1-t)x_2) < t f(x_1) + (1-t) f(x_2).
\]

A function f is said to be (strictly) concave if −f is (strictly) convex.

Jensen’s inequality. Let I be an interval. A function f : I → R is convex over I if and only if for any n-tuple $x_1, x_2, \dots, x_n$ of I and any n-tuple $t_1, t_2, \dots, t_n \in [0, 1]$ verifying $\sum_{k=1}^{n} t_k = 1$, we have

\[
f\Big(\sum_{k=1}^{n} t_k x_k\Big) \le \sum_{k=1}^{n} t_k f(x_k).
\]

We can prove Jensen’s inequality by a simple induction. Below we present a proof under the hypothesis that f is differentiable over I.

Proposition 5.3.1. Let I be an open interval and f : I → R be a differentiable function over I. Then f is convex if and only if f′(x) is an increasing function over I.

Proof. We can suppose that $x_1 \le x_2$. Then for all t ∈ ]0, 1[ we have $x_1 \le tx_1 + (1-t)x_2 \le x_2$.

1. Let f be convex and differentiable. Then

\[
f'(x_1) = \lim_{\substack{t\to 1 \\ t \ne 1}} \frac{f(tx_1 + (1-t)x_2) - f(x_1)}{(1-t)(x_2 - x_1)} \le \frac{f(x_2) - f(x_1)}{x_2 - x_1}
\]

and

\[
f'(x_2) = \lim_{\substack{t\to 0 \\ t \ne 0}} \frac{f(tx_1 + (1-t)x_2) - f(x_2)}{-t(x_2 - x_1)} \ge \frac{f(x_2) - f(x_1)}{x_2 - x_1}.
\]

Hence

\[
f'(x_1) \le \frac{f(x_2) - f(x_1)}{x_2 - x_1} \le f'(x_2).
\]

2. Let f′ be increasing and $x_1 < x_2$. By Lagrange’s mean value theorem, for all t ∈ ]0, 1[ there exist $c_1, c_2$ such that $x_1 \le c_1 \le tx_1 + (1-t)x_2 \le c_2 \le x_2$ and

\[
\frac{f(tx_1 + (1-t)x_2) - f(x_1)}{(1-t)(x_2 - x_1)} = f'(c_1) \le f'(c_2) = \frac{f(x_2) - f(tx_1 + (1-t)x_2)}{t(x_2 - x_1)}.
\]

Hence

\[
t\big(f(tx_1 + (1-t)x_2) - f(x_1)\big) \le (1-t)\big(f(x_2) - f(tx_1 + (1-t)x_2)\big),
\]

that is, f is convex.

Consequence - inequalities for convex and differentiable functions. Let I be an open interval and f : I → R be a convex and differentiable function over I. Then for all x, y ∈ I, we have

\[
f'(x)(y - x) \le f(y) - f(x) \le f'(y)(y - x).
\]

If f is strictly convex, the inequalities are strict if x ≠ y.

Application - Jensen’s inequality. We use the second inequality to prove Jensen’s inequality if f is a convex and differentiable function. We let $y = \sum_{k=1}^{n} t_k x_k$. Obviously y ∈ I. For all $x_k \in I$, we have

\[
f(y) - f(x_k) \le f'(y)(y - x_k).
\]

Hence

\[
\sum_{k=1}^{n} t_k \big(f(y) - f(x_k)\big) \le \sum_{k=1}^{n} t_k f'(y)(y - x_k).
\]

By using $1 = \sum_{k=1}^{n} t_k$ and $y = \sum_{k=1}^{n} t_k x_k$, the right-hand side vanishes and this inequality becomes

\[
f(y) - \sum_{k=1}^{n} t_k f(x_k) \le 0.
\]

Corollary 5.16. Let I be an open interval and f : I → R be a function that is twice differentiable over I. Then

1. f is convex if and only if f″(x) ≥ 0 for all x ∈ I.

2. If f″(x) > 0 for all x ∈ I, then f is strictly convex over I.

Proof. The derivative f′(x) is increasing if and only if f″(x) ≥ 0 for all x ∈ I. If f″(x) > 0, then f′(x) is strictly increasing over I, from which we deduce that f is strictly convex over I.


Example. The function $f(x) = x^2$ satisfies f′(x) = 2x and f″(x) = 2. Consequently, $f(x) = x^2$ is strictly convex over R. For $x_k \in R$, Jensen’s inequality reads as follows:

\[
\Big(\sum_{k=1}^{n} t_k x_k\Big)^2 \le \sum_{k=1}^{n} t_k x_k^2.
\]

By letting $t_k = \frac{1}{n}$ we obtain

\[
\Big(\frac{1}{n}\sum_{k=1}^{n} x_k\Big)^2 \le \frac{1}{n}\sum_{k=1}^{n} x_k^2.
\]

Example. The function f(x) = ln x is of class $C^\infty(R_+)$. We have $f'(x) = \frac{1}{x}$ and $f''(x) = -\frac{1}{x^2}$. Consequently, f(x) = ln x is strictly concave over $R_+$. For $x_k > 0$, Jensen’s inequality reads as follows:

\[
\ln\Big(\sum_{k=1}^{n} t_k x_k\Big) \ge \sum_{k=1}^{n} t_k \ln(x_k).
\]

By letting $t_k = \frac{1}{n}$, we obtain

\[
\ln\Big(\frac{1}{n}\sum_{k=1}^{n} x_k\Big) \ge \frac{1}{n}\sum_{k=1}^{n} \ln(x_k).
\]

By taking the exponential on both sides of this inequality, we get the inequality between the arithmetic mean and the geometric mean:

\[
\frac{1}{n}\sum_{k=1}^{n} x_k \ge \prod_{k=1}^{n} x_k^{\frac{1}{n}}.
\]
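The inequality between the two means is easy to test on sample data. The sketch below is illustrative only; the function names and the sample values are chosen here and are not part of the notes. The geometric mean is computed through logarithms, mirroring the derivation from Jensen’s inequality for ln.

```python
import math

def arithmetic_mean(xs):
    """(1/n) * sum of the x_k."""
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """Product of x_k^(1/n) for positive x_k, computed as exp of the mean of logs."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

samples = [1.0, 2.0, 4.0, 8.0]
am = arithmetic_mean(samples)
gm = geometric_mean(samples)
print(am, gm)

# Equality case: all values equal.
equal = [5.0, 5.0, 5.0]
print(arithmetic_mean(equal), geometric_mean(equal))
```

For the sample 1, 2, 4, 8 the arithmetic mean is 3.75 and the geometric mean is $64^{1/4} = 2^{3/2}$, so the inequality holds strictly. When all values coincide the two means agree up to rounding.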

5.3.4 Local extrema and second derivative

Introduction. Let I be an open interval and f : I → R be a differentiable function over I. Let us suppose that $x_0 \in I$ is a stationary point of f, i.e. $f'(x_0) = 0$. The stationary point $x_0$ is a local minimum of f if in a neighborhood of $x_0$ the function f is strictly decreasing for $x < x_0$ and strictly increasing for $x > x_0$. Consequently, if in a neighborhood of $x_0$ the derivative f′ satisfies

\[
f'(x) < 0 \ \text{ if } x < x_0, \qquad f'(x) > 0 \ \text{ if } x > x_0,
\]

the function f admits a local minimum at $x_0$.

Behavior if $f''(x_0) > 0$. Let $f'(x_0) = 0$. If $f''(x_0) > 0$, then by theorem 5.5 applied to f′ there exists a neighborhood $V_\varepsilon(x_0)$ such that $x < x_0$ implies f′(x) < 0 and $x > x_0$ implies f′(x) > 0. Consequently, we have the following result:

Theorem 5.17. Second derivative criterion. Let I be an open interval and f : I → R be a differentiable function over I. Let $x_0$ be a stationary point of f and let us suppose that the second derivative of f exists at $x_0$. If $f''(x_0) > 0$ (respectively $f''(x_0) < 0$), then the function f admits a strict local minimum (respectively maximum) at $x_0$.


Corollary 5.18. Let I be an open interval and f : I → R be a (strictly) convex/concave and differentiable function over I. Let $x_0$ be a stationary point of f. Then the function f attains its (strict) minimum/maximum at $x_0$.

Remark. In particular, if the second derivative of f exists over I and f″(x) ≥ 0 (respectively f″(x) ≤ 0) for all x ∈ I, then the function f satisfies the hypotheses of the corollary.

5.3.5 Applications

1. Bernoulli’s inequality. Let p > 1. Show that for all x > −1 we have the inequality

\[
(1 + x)^p \ge 1 + px
\]

with strict inequality if x ≠ 0.

2. Young’s inequality. Let p > 1 and q be defined by the relation $\frac{1}{p} + \frac{1}{q} = 1$. Show that for any pair x, y ≥ 0 we have the inequality

\[
xy \le \frac{x^p}{p} + \frac{y^q}{q}
\]

with strict inequality if $x^{p-1} \ne y$.

3. Least squares method. Let $a_k \in R$ for k = 1, . . . , n. Which values of x minimize the function

\[
f(x) = \sum_{k=1}^{n} (x - a_k)^2\ ?
\]

For example, the $a_k$ represent the results of n measurements of a quantity. The task is to find the best approximation of the value a of this quantity.

5.3.6 Inflexion points and second derivative

Definition. Let f : I → R be a function that is differentiable at a ∈ I. We say that f admits an inflexion point at a if the graph of f crosses its tangent at the point (a, f(a)), that is, f(x) − f(a) − f′(a)(x − a) changes sign at x = a.

Proposition (The inflexion point criterion). Let f : I → R be a function that is differentiable at a ∈ I. Then f admits an inflexion point at a if and only if there exists a neighborhood $B_\delta(a)$ such that one of the following two inequalities holds for all x ∈ $B_\delta(a) \setminus \{a\}$:

\[
f'(a) < \frac{f(x) - f(a)}{x - a} \quad\text{or}\quad f'(a) > \frac{f(x) - f(a)}{x - a}.
\]

Proposition (necessary condition). Let I be an open interval and f : I → R be a function of class $C^2(I)$. If f admits an inflexion point at a, then f″(a) = 0.


Proposition (sufficient condition). Let I be an open interval and f : I → R be a function of class $C^2(I)$. If there exists a neighborhood $V_\delta(a)$ such that one of the following two inequalities holds for all x ∈ $V_\delta(a) \setminus \{a\}$:

\[
(x - a) f''(x) < 0 \quad\text{or}\quad (x - a) f''(x) > 0,
\]

then f admits an inflexion point at a. In particular, if f″(a) = 0, f is three times differentiable at a and f‴(a) ≠ 0, then f admits an inflexion point at a.

Example. The function tanh : R → R admits an inflexion point at a = 0:

\[
(\tanh x)' = \frac{1}{\cosh^2 x} = 1 - \tanh^2 x,
\qquad
(\tanh x)'' = \frac{-2\sinh x}{\cosh^3 x} = -2\tanh x\,(1 - \tanh^2 x).
\]

Hence (tanh x)″ = 0 if x = 0 (necessary condition) and

\[
\left.\frac{d^3 \tanh x}{dx^3}\right|_{x=0} = -2 \ne 0
\]

implies that this point also satisfies the sufficient condition.

5.4 Higher order derivatives and series expansions

Introduction. We have shown in 5.1 that a function f : I → T that is differentiable at a ∈ I can be approximated by a polynomial of degree 1 in a neighborhood of a. More precisely, let $P_1(x) = f(a) + f'(a)(x - a)$, then

\[
f(x) = P_1(x) + R_1(x) \quad\text{and}\quad \lim_{x\to a} \frac{R_1(x)}{x - a} = 0,
\]

which we have also written as $f(x) = P_1(x) + o(x - a)$. These expressions are called the first order Taylor expansion of the function f around the point a. We will extend this concept to functions that are n-times differentiable and prove that, under suitable hypotheses, for all m ≤ n, we can write

\[
f(x) = P_m(x) + R_m(x) \quad\text{and}\quad \lim_{x\to a} \frac{R_m(x)}{(x - a)^m} = 0
\]

where $P_m(x)$ is a polynomial of degree m. This expression $P_m(x) + R_m(x)$ with $R_m(x) = o((x - a)^m)$ (or better) is called the Taylor expansion of order m.

5.4.1 Polynomial functions

General polynomial function. Let f be the polynomial of degree n given by

\[
f(x) = \sum_{k=0}^{n} c_k x^k, \qquad c_n \ne 0.
\]

We can easily prove by induction that

\[
c_k = \frac{f^{(k)}(0)}{k!}
\]

and more generally

\[
f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k.
\]

Any polynomial function is entirely determined by the values of its derivatives at a single point a. If f is a general function whose derivatives are known up to the order n, we will show that this sum is a convenient approximation of f in a neighborhood of a.

Representation of the remainders $R_m(x)$. For a Taylor expansion of order m < n we want to estimate the error $R_m(x)$ in a neighborhood of a. For a polynomial function, we can use the (m + 1)th derivative of f. If m = 0, the mean value theorem states that there exists a $c = c_x$ between x and a such that

\[
f(x) = f(a) + f'(c)(x - a)
\]

i.e. $R_0(x) = f'(c_x)(x - a)$. For m ≥ 1 we need a generalization of the mean value theorem to prove that there exists a $c = c_x$ between x and a such that

\[
R_m(x) = \frac{f^{(m+1)}(c_x)}{(m + 1)!}\,(x - a)^{m+1}.
\]

In the case of a general function f, the goal is to express the remainder $R_m(x)$ only by the mth derivative or, if f is even (m + 1)-times differentiable, by the (m + 1)th derivative of f in order to estimate the error of the truncated Taylor expansion.

5.4.2 Truncated Taylor expansion

Taylor polynomials and Taylor remainders. Let f : I → T be a function that is n-times differentiable at a ∈ I. Then for all m ≤ n, the polynomial function $P_m$ : R → R of degree at most m defined by

\[
P_m(x) = \sum_{k=0}^{m} \frac{f^{(k)}(a)}{k!}\,(x - a)^k
\]

is called the Taylor polynomial of order m of the function f at the point a. The function

\[
R_m(x) = f(x) - P_m(x)
\]

is called the mth Taylor remainder of f at the point a.

Truncated Taylor expansion. The expression

\[
f(x) = P_m(x) + R_m(x)
\]

is called the Taylor expansion of order m of the function f around a. We can prove that the Taylor polynomial of order m is the unique polynomial of degree at most m approximating the function f at the order ≥ m.

Theorem 5.19. - Taylor’s formula for truncated expansion. Let I be an open interval and f : I → R be a function of class $C^n(I)$. Let a ∈ I and m be an integer such that 0 ≤ m ≤ n. Then for all x ∈ I, there exists $c_{x,m}$ between a and x such that

\[
R_m(x) = \frac{f^{(m+1)}(c_{x,m})}{(m + 1)!}\,(x - a)^{m+1} \quad\text{if } m < n
\]

and

\[
R_n(x) = \frac{f^{(n)}(c_{x,n}) - f^{(n)}(a)}{n!}\,(x - a)^n \quad\text{if } m = n.
\]

Corollary 5.20. For any integer m such that 0 ≤ m ≤ n, we have

\[
\lim_{x\to a} \frac{R_m(x)}{(x - a)^m} = 0.
\]

We often write $f(x) = P_m(x) + O((x - a)^{m+1})$ if m < n, respectively $f(x) = P_n(x) + o((x - a)^n)$ if m = n. Reminder: $O(x^k)$ represents a function with the property $|O(x^k)| \le C|x^k|$ in a neighborhood of x = 0 and $o(x^k)$ represents a function with the property $\lim_{x\to 0} \frac{o(x^k)}{x^k} = 0$.

Proof of the theorem. The theorem is a consequence of the following result:

Theorem (generalized Cauchy theorem). Let I = [a, b] be an interval and g, h : I → R be two functions of class $C^{m+1}(I)$ such that $g^{(k)}(a) = h^{(k)}(a) = 0$ for k = 0, 1, . . . , m and $h^{(m+1)}(x) \ne 0$ for all x ∈ I. Then $h^{(k)}(x) \ne 0$ for all k = 0, 1, . . . , m and all x ∈ ]a, b], and there exists a c ∈ ]a, b[ such that

\[
\frac{g(b)}{h(b)} = \frac{g^{(m+1)}(c)}{h^{(m+1)}(c)}.
\]

Proof of the generalized Cauchy theorem. We first show that $h^{(k)}(x) \ne 0$ for all x ∈ ]a, b]. Suppose that there exists x ∈ ]a, b] such that $h^{(m)}(x) = 0$. The mth derivative $h^{(m)}$ hence satisfies the conditions of Rolle’s theorem over [a, x], i.e. $h^{(m)}$ is differentiable and $h^{(m)}(a) = h^{(m)}(x) = 0$. Thus there exists a $c_x \in\, ]a, x[$ such that

\[
\big(h^{(m)}\big)'(c_x) = h^{(m+1)}(c_x) = 0
\]

which contradicts the hypothesis $h^{(m+1)}(x) \ne 0$ for all x ∈ I. By induction (on decreasing k), $h^{(k)}(x) \ne 0$ for all x ∈ ]a, b] and all k = 0, 1, . . . , m. By Cauchy’s theorem, there exists $c_1 \in\, ]a, b[$ such that

\[
\frac{g(b)}{h(b)} = \frac{g(b) - g(a)}{h(b) - h(a)} = \frac{g^{(1)}(c_1)}{h^{(1)}(c_1)}.
\]

Then, with $g^{(1)}(a) = h^{(1)}(a) = 0$, there exists $c_2 \in\, ]a, c_1[$ such that

\[
\frac{g^{(1)}(c_1)}{h^{(1)}(c_1)} = \frac{g^{(2)}(c_2)}{h^{(2)}(c_2)},
\]

hence

\[
\frac{g(b)}{h(b)} = \frac{g^{(2)}(c_2)}{h^{(2)}(c_2)},
\]

and we obtain the theorem by induction.

Proof of the theorem on the truncated Taylor expansion. Let m < n. The functions

\[
R_m(x) = f(x) - P_m(x) \quad\text{and}\quad h_m(x) = (x - a)^{m+1}
\]

satisfy the conditions of the generalized Cauchy theorem over I = [a, x]. These functions are (m + 1)-times differentiable with

\[
R_m^{(m+1)}(x) = f^{(m+1)}(x) \quad\text{and}\quad h_m^{(m+1)}(x) = (m + 1)!
\]

Then there exists $c_{x,m}$ between a and x such that

\[
\frac{R_m(x)}{h_m(x)} = \frac{f^{(m+1)}(c_{x,m})}{(m + 1)!}.
\]

If m = n notice that

\[
R_n(x) = f(x) - P_n(x) = f(x) - P_{n-1}(x) - \frac{f^{(n)}(a)}{n!}\,(x - a)^n = R_{n-1}(x) - \frac{f^{(n)}(a)}{n!}\,(x - a)^n.
\]

Then we apply the previous result to the remainder $R_{n-1}(x)$.

Examples.

1. We are looking for the Taylor expansion of order 4 of the function f(x) = ln(1 + x) around a = 0. Notice that for all n ≥ 1

\[
f^{(n)}(x) = (-1)^{n-1}(n - 1)!\,(1 + x)^{-n}
\]

hence $f^{(n)}(0) = (-1)^{n-1}(n - 1)!$. The Taylor polynomial of degree 4 is given by

\[
P_4(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4}.
\]

There exists a c between 0 and x such that

\[
R_4(x) = \frac{1}{5(1 + c)^5}\, x^5.
\]

Hence, for x ≥ 0, we have $0 \le R_4(x) \le \frac{x^5}{5}$, and for −1 < x ≤ 0 we get the estimate $|R_4(x)| \le \frac{|x|^5}{5(1 + x)^5}$, since 1 + c ≥ 1 + x > 0 in this case.

2. We are looking for the Taylor expansion of order 4 of the function f(x) = sin(x) around a = 0. The Taylor polynomial of degree 4 is given by

\[
P_4(x) = P_3(x) = x - \frac{x^3}{3!}.
\]

There exists a c between 0 and x such that

\[
R_4(x) = \frac{\cos c}{5!}\, x^5.
\]
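The remainder estimate of the first example can be compared with the actual error. The sketch below is only an illustration with hypothetical function names. The bound used for −1 < x < 0 is the one obtained from the Lagrange form $R_4(x) = x^5 / (5(1+c)^5)$ with c between x and 0, namely $|x|^5 / (5(1+x)^5)$.

```python
import math

def taylor_p4_log(x):
    """Taylor polynomial of order 4 of ln(1 + x) around 0."""
    return x - x ** 2 / 2 + x ** 3 / 3 - x ** 4 / 4

def remainder_r4(x):
    """Actual remainder R4(x) = ln(1 + x) - P4(x)."""
    return math.log1p(x) - taylor_p4_log(x)

def lagrange_bound(x):
    """Upper bound for |R4(x)| from R4(x) = x^5 / (5 (1 + c)^5), c between 0 and x."""
    if x >= 0:
        return x ** 5 / 5                      # 1 + c >= 1
    return abs(x) ** 5 / (5 * (1 + x) ** 5)    # 1 + c >= 1 + x > 0

points = [0.5, 0.1, -0.1, -0.5]
for x in points:
    print(x, remainder_r4(x), lagrange_bound(x))
```

At each sample point the absolute value of the remainder stays below the bound. For x > 0 the remainder is positive, in agreement with the sign of $x^5$.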

Application to the computation of limits. Let f, g ∈ $C^n(I)$ be two functions such that $f^{(k)}(a) = g^{(k)}(a) = 0$ for 0 ≤ k < n and $g^{(n)}(a) \ne 0$. Then

\[
\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{f^{(n)}(a)}{g^{(n)}(a)}. \tag{5.11}
\]

In fact f and g have the following expansions:

\[
f(x) = \frac{f^{(n)}(a)}{n!}\,(x - a)^n + R_{n,f}(x), \qquad g(x) = \frac{g^{(n)}(a)}{n!}\,(x - a)^n + R_{n,g}(x),
\]

where both remainders are $o((x - a)^n)$. We can compute the limits of undefined expressions by truncated Taylor expansions. For example, for all a ≥ 0

\[
\lim_{x\to 0^+} \left(\frac{1}{\sin x} - \frac{1}{x + ax^2}\right)
= \lim_{x\to 0^+} \frac{x + ax^2 - \sin x}{(x + ax^2)\sin x}
= \lim_{x\to 0^+} \frac{ax^2 + O(x^3)}{(x + ax^2)(x + O(x^3))}
= \lim_{x\to 0^+} \frac{ax^2 + O(x^3)}{x^2 + O(x^3)} = a
\]

by only using $\sin x = x + O(x^3)$.

Application to extrema. Let f ∈ $C^n(I)$ be a function such that $f^{(k)}(a) = 0$ for 1 ≤ k < n and $f^{(n)}(a) \ne 0$. Then

1. If n is even and $f^{(n)}(a) > 0$, the function f admits a strict local minimum at a.

2. If n is even and $f^{(n)}(a) < 0$, the function f admits a strict local maximum at a.

3. If n is odd, the function f admits an inflexion point at a.

5.4.3 Computation of Taylor polynomials

Introduction. Let f, g ∈ $C^n(I)$. Their Taylor polynomials of order m ≤ n are given by

\[
P_m(x) = \sum_{k=0}^{m} \frac{f^{(k)}(a)}{k!}\,(x - a)^k
\]

respectively

\[
Q_m(x) = \sum_{k=0}^{m} \frac{g^{(k)}(a)}{k!}\,(x - a)^k.
\]

The goal is to compute the Taylor polynomials of order m ≤ n for the functions αf + βg, fg, f/g and g ◦ f from $P_m$ and $Q_m$.

Linearity. The Taylor polynomial of order m ≤ n for αf + βg is given by $\alpha P_m + \beta Q_m$.

Product of two functions. The Taylor polynomial of order m ≤ n for fg is obtained from the product $P_m Q_m$ by only keeping the terms of degree ≤ m.

Quotient of two functions. The Taylor polynomial of order m ≤ n for f/g is obtained from the expansion of the quotient $P_m/Q_m$ by only keeping the terms of degree ≤ m. We can also compute this Taylor polynomial by polynomial division $P_m(x)/Q_m(x)$ up to the order m, starting with the order 0 (see example 2 below).

Composite function. Let f(a) = 0. The Taylor polynomial of order m ≤ n for g ◦ f around 0 is obtained from $Q_m(P_m(x))$ by only keeping the terms of degree ≤ m:

\[
Q_m(P_m(x)) = \sum_{k=0}^{m} \frac{g^{(k)}(0)}{k!}\, P_m(x)^k.
\]
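The product rule for Taylor polynomials stated above can be sketched on coefficient lists. This is an illustrative sketch, not part of the notes: polynomials around a = 0 are stored as lists of exact fractions with the constant term first, and the name `truncated_product` is hypothetical. As a check, the product of the Taylor polynomial of $e^x$ with itself is compared with the Taylor polynomial of $e^{2x}$, whose coefficients are $2^k/k!$.

```python
from fractions import Fraction
from math import factorial

def truncated_product(p, q, m):
    """Multiply two coefficient lists and keep only the terms of degree <= m."""
    out = [Fraction(0)] * (m + 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if i + j <= m:
                out[i + j] += p_i * q_j
    return out

m = 4
# Taylor polynomial of order m of exp(x) around 0: coefficients 1/k!.
exp_coeffs = [Fraction(1, factorial(k)) for k in range(m + 1)]
# Truncated product P_m * P_m, to be compared with the expansion of exp(2x).
square = truncated_product(exp_coeffs, exp_coeffs, m)
expected = [Fraction(2 ** k, factorial(k)) for k in range(m + 1)]

print(square)
print(expected)
```

The two lists coincide, as expected from $e^x \cdot e^x = e^{2x}$. The discarded terms of degree greater than m only contribute to the remainder.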

Examples.

5.4.4 Power series

Definition. Let f be a function of class $C^\infty(I)$. We call

\[
P_\infty(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}\,(x - a)^k
\]

the Taylor series (or power series) of the function f expanded at the point a. For every m ∈ N, the Taylor polynomial $P_m(x)$ is a partial sum of this series.

Remark. The Taylor series is convergent for x = a and $P_\infty(a) = f(a)$, but it is not necessarily convergent for x ≠ a. On the other hand, even if the series $P_\infty(x)$ converges, we do not always have $f(x) = P_\infty(x)$. For example, the function f : R → R defined by

\[
f(x) = \begin{cases} \exp\!\big(-\frac{1}{x^2}\big) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0 \end{cases}
\]

is of class $C^\infty(R)$ and $f^{(n)}(0) = 0$ for all n ∈ N. Hence $P_\infty(x) = 0$ for all x ∈ R, whereas f(x) > 0 for x ≠ 0. The class of functions for which $f(x) = P_\infty(x)$ is called the class of power series or the class of analytic real functions.

Power series and its radius of convergence. Let $(a_k)$ be a sequence of real numbers and a ∈ R. The series

\[
\sum_{k=0}^{\infty} a_k (x - a)^k
\]

is called a power series. Moreover, let R be defined by

\[
\frac{1}{R} = \limsup_{k\to+\infty} \sqrt[k]{|a_k|}
\]

by using the convention

R = 0 if lim supk→+∞

k√|ak| = +∞

andR = +∞ si lim sup

k→+∞

k√|ak| = 0.

R is called the radius of convergence of the series. Consequently, for |x−a| < Rthe series is absolutely convergent and for |x− a| > R the series diverges.
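The root formula for $R$ can be explored numerically. The sketch below evaluates $\sqrt[k]{|a_k|}$ for one large index $k$; the coefficient sequence $a_k = k/2^k$ (for which $R = 2$) and the function name are assumptions chosen for illustration, and a single term only approximates the lim sup.

```python
def root_test_estimate(coeff, k):
    """Return |a_k|^(1/k), the k-th term of the sequence whose lim sup equals 1/R."""
    return abs(coeff(k)) ** (1.0 / k)

# Assumed example: a_k = k / 2^k, so |a_k|^(1/k) = k^(1/k) / 2 -> 1/2 and R = 2.
estimate = root_test_estimate(lambda k: k / 2.0**k, 500)
radius = 1.0 / estimate
print(radius)
```

The convergence of $k^{1/k}$ to 1 is slow, so the estimate approaches 2 only gradually as $k$ grows.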

For $|x-a| < R$ we define the function $f$ by

$$f(x) = \sum_{k=0}^{\infty} a_k (x-a)^k$$

Theorem 5.21. Differentiating a power series. The function $f(x)$ is of class $C^\infty(]a-R, a+R[)$. In particular,

$$f'(x) = \sum_{k=0}^{\infty} k a_k (x-a)^{k-1}$$

and

$$a_k = \frac{f^{(k)}(a)}{k!}$$

i.e. $f(x) = P_\infty(x)$ for $|x-a| < R$.

This result is a consequence of a more general result for uniformly convergent series of functions. We give an elementary proof of this result:

Proof. Without loss of generality, we can take $a = 0$. First notice that the series $\sum_{k=0}^{\infty} a_k (x-a)^k$ and $\sum_{k=0}^{\infty} k a_k (x-a)^{k-1}$ have the same radius of convergence. To prove the theorem, we need the following inequality. For all $k \in \mathbb{N}$ and $x, h \in \mathbb{R}$, $h \neq 0$:

$$\left| \frac{(x+h)^k - x^k}{h} - k x^{k-1} \right| \le \frac{k(k-1)}{2} |h| (|x| + |h|)^{k-2}. \qquad (5.12)$$

By applying Newton's binomial formula to the term $(x+h)^k$, we get the identity

$$\frac{(x+h)^k - x^k}{h} - k x^{k-1} = h \sum_{j=0}^{k-2} \frac{k(k-1)}{(k-j)(k-j-1)} \binom{k-2}{j} x^j h^{k-2-j}.$$

By taking the absolute value and by using the inequality $\frac{k(k-1)}{(k-j)(k-j-1)} \le \frac{k(k-1)}{2}$ for $0 \le j \le k-2$, we obtain

$$\left| \frac{(x+h)^k - x^k}{h} - k x^{k-1} \right| \le |h| \sum_{j=0}^{k-2} \frac{k(k-1)}{2} \binom{k-2}{j} |x|^j |h|^{k-2-j}$$

and the inequality (5.12) follows by applying Newton's binomial formula to the right-hand side.

Let $|x| < R$ (reminder: $a = 0$). Then there exists $\delta > 0$ such that $|x| < R - \delta$ and $|x| + |h| < R - \delta$ for all $|h| \le \delta$. Thanks to the absolute convergence, we can change the order of the terms in the series. Then for $0 < |h| < \delta$:

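$$\frac{f(x+h) - f(x)}{h} - \sum_{k=0}^{\infty} k a_k x^{k-1} = \sum_{k=0}^{\infty} a_k \left( \frac{(x+h)^k - x^k}{h} - k x^{k-1} \right)$$

hence by the inequality (5.12)

$$\left| \frac{f(x+h) - f(x)}{h} - \sum_{k=0}^{\infty} k a_k x^{k-1} \right| \le |h| \sum_{k=0}^{\infty} \frac{k(k-1)}{2} |a_k| (|x| + |h|)^{k-2}$$

and this series converges thanks to the hypothesis $|x| + |h| < R - \delta$; its sum is bounded independently of $h$ by the value of the same series at $R - \delta$. It follows that

$$\lim_{h \to 0} \left( \frac{f(x+h) - f(x)}{h} - \sum_{k=0}^{\infty} k a_k x^{k-1} \right) = 0$$

hence the assertion $f'(x) = \sum_{k=0}^{\infty} k a_k (x-a)^{k-1}$. By the same argument, the series $f'(x) = \sum_{k=0}^{\infty} k a_k (x-a)^{k-1}$ is differentiable and we conclude that $f \in C^\infty(]a-R, a+R[)$ by induction.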

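Examples.

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}, \quad x \in \mathbb{R}$$

$$\sinh x = \sum_{k=0}^{\infty} \frac{x^{2k+1}}{(2k+1)!}, \quad x \in \mathbb{R}$$

$$\cosh x = \sum_{k=0}^{\infty} \frac{x^{2k}}{(2k)!}, \quad x \in \mathbb{R}$$

$$\sin x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}, \quad x \in \mathbb{R}$$

$$\cos x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}, \quad x \in \mathbb{R}$$

$$\ln(1+x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} x^k}{k}, \quad x \in ]-1, 1[$$

$$\frac{1}{1+x} = \sum_{k=0}^{\infty} (-1)^k x^k, \quad x \in ]-1, 1[$$

$$\arctan x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{2k+1}, \quad x \in ]-1, 1[$$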
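To see one of these expansions at work, the sketch below compares a partial sum of the series for $\ln(1+x)$ with the library value. The evaluation point $x = 0.5$ and the number of terms are arbitrary choices for illustration.

```python
import math

def log1p_partial_sum(x, m):
    """Partial sum sum_{k=1}^m (-1)^(k+1) x^k / k of the series for ln(1 + x)."""
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, m + 1))

x = 0.5
approx = log1p_partial_sum(x, 30)
error = abs(approx - math.log1p(x))
print(approx, error)
```

Since the series is alternating with decreasing terms for $0 < x < 1$, the error is bounded by the first omitted term $x^{31}/31$.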


5.5 Study of a function

The study of a real function $f$ generally consists in determining its properties according to the following list:

1. Domain and range (if it is already possible without referring to the points below), belonging to a class $C^k$.

2. Parity, other symmetries, periodicity.

3. Continuity and discontinuity points with computation of the limits at these points.

4. Values or limits at the boundary points of its domain, computation of the asymptotes, asymptotic behavior.

5. Differentiability with computation of the derivative $f'$.

6. Stationary points and their nature, local extrema.

7. Inflexion points.

8. If possible give the variation and draw the graph of $f$.

Often these properties cannot be determined by following the order of this list. For example, we often determine the range of $f$ at the end, after the computation of the extrema, or we compute the limits at the boundary points of the domain and the asymptotes by means of $f'$.


Chapter 6

Integration

We present the basic techniques of integration.

Notions to learn. Definite integral, indefinite integral, antiderivative, fundamental theorem of calculus, integration by parts, substitution, generalized integral, Gamma function, Stirling's formula, Gaussian integral.

Skills to acquire. Know the antiderivatives of elementary functions, know how to apply integration by parts and variable substitution to the computation of definite, indefinite and generalized integrals.

6.1 The Riemann integral

Introduction. The task of computing the area of geometric figures whose boundary is not always a straight line is at the root of integral calculus. The resolution of Newton's equations in classical mechanics leads us to similar problems. Let us consider, for example, the movement on a straight line. If the speed $v(t)$ of an object, as a function of time $t$, is known, what is the distance traveled, denoted by $s$, in the time interval $[a, b]$? If the speed is constant, $v(t) = v$, the answer is obviously $s = v(b-a)$, which corresponds to the area of the region delimited by the graph of $v(t)$ and the horizontal axis. If $v(t)$ is piecewise constant, that is, $v(t)$ is a step function, then the distance traveled $s$ is again equal to the area of the region delimited by the graph of $v(t)$ and the horizontal axis. For a general function $v(t)$, we consider an approximation of $v(t)$ by step functions. The result $s$ of this limit process is called the integral and we write

$$s = \int_a^b v(t)\,dt.$$

The motion equation

$$\dot{s}(t) \equiv \frac{d s(t)}{d t} = v(t), \qquad s(a) = 0,$$

where $s(t)$ denotes the distance traveled in the interval $[a, t]$, gives us another, more important interpretation of the integral. The integral

$$s(t) = \int_a^t v(\tau)\,d\tau$$

is the inverse of the differentiation operator. We call the function $s(t)$ the antiderivative of $v(t)$. To study the behavior of the motion for large $t$ we look at the limit

$$\lim_{t \to \infty} s(t) = \int_a^{\infty} v(\tau)\,d\tau$$

called the generalized integral.

Subdivision of a closed interval. Let $a < b$. The ordered and finite subset

$$\sigma = \{a_0, \ldots, a_n : a_0 = a < a_1 < \ldots < a_n = b\}$$

is called a subdivision or a partition of the interval $[a, b]$. The number $P(\sigma) = \max\{a_i - a_{i-1} : i = 1, \ldots, n\}$ is called the step of the partition. In particular, the partition

$$\sigma = \left\{a_0, \ldots, a_n : a_k = a + k\,\frac{b-a}{n}\right\}$$

is called the regular partition of order $n$ of $[a, b]$.

Step function. Let $\sigma$ be a partition of $[a, b]$ and $y_1, \ldots, y_n \in \mathbb{R}$. For $i = 1, \ldots, n$, let $g_i : ]a_{i-1}, a_i[ \to \mathbb{R}$ be such that $g_i(x) = y_i$ for $x \in ]a_{i-1}, a_i[$. A function $f : ]a, b[ \to \mathbb{R}$ satisfying $f(x) = g_i(x)$ for $x \in ]a_{i-1}, a_i[$, $i = 1, \ldots, n$, is called a step function. Any linear combination of step functions is a step function.

Integral of a step function. Let $f$ be a step function. We define the integral $I_f(a, b)$ of $f$ by

$$I_f := \int_a^b f(x)\,dx := \sum_{k=1}^{n} y_k (a_k - a_{k-1}).$$

This definition does not depend on the partition $\sigma$. The second step consists in extending this construction to more general functions.

Riemann sums. Let $f : [a, b] \to \mathbb{R}$ be a bounded function and $\sigma$ be a partition of $[a, b]$. We let

$$m_k = \inf\{f(x) : x \in ]a_{k-1}, a_k[\}$$

$$M_k = \sup\{f(x) : x \in ]a_{k-1}, a_k[\}$$

We call

$$\underline{S}_\sigma(f) = \sum_{k=1}^{n} m_k (a_k - a_{k-1})$$

$$\overline{S}_\sigma(f) = \sum_{k=1}^{n} M_k (a_k - a_{k-1})$$

the lower Riemann¹ sum, respectively the upper Riemann sum, of the function $f$ relative to the partition $\sigma$. Obviously, we have $\underline{S}_\sigma(f) \le \overline{S}_\sigma(f)$.

¹ Bernhard Riemann, 1826 - 1866, a German mathematician, defined the integral in this way.

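Integrable function and integral. We let

$$\underline{S}(f) = \sup_\sigma \underline{S}_\sigma(f), \qquad \overline{S}(f) = \inf_\sigma \overline{S}_\sigma(f).$$

We have $\underline{S}(f) \le \overline{S}(f)$. A bounded function over $[a, b]$ is said to be integrable over $[a, b]$ if $\underline{S}(f) = \overline{S}(f)$. In this case, we write

$$\underline{S}(f) = \overline{S}(f) = I_f = \int_a^b f(x)\,dx.$$

By convention:

$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx.$$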

Theorem 6.1. - Integrability criteria

1. A bounded function $f : [a, b] \to \mathbb{R}$ is integrable if and only if for all $\varepsilon > 0$ there exists a partition $\sigma$ of $[a, b]$ such that

$$\overline{S}_\sigma(f) \le \underline{S}_\sigma(f) + \varepsilon.$$

2. If $f : [a, b] \to \mathbb{R}$ is a continuous function, then $f$ is integrable.

3. If $f : [a, b] \to \mathbb{R}$ is an increasing (or decreasing) function, then $f$ is integrable.

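Proof. 1. From the definition of $\overline{S}(f)$ and $\underline{S}(f)$, the condition implies

$$0 \le \overline{S}(f) - \underline{S}(f) \le \overline{S}_\sigma(f) - \underline{S}_\sigma(f) \le \varepsilon$$

hence the assertion, since $\varepsilon > 0$ is arbitrary.

2. Let $\sigma$ be a regular partition of order $n$. Then

$$\overline{S}_\sigma(f) - \underline{S}_\sigma(f) = \sum_{k=1}^{n} (M_k - m_k) \left(\frac{b-a}{n}\right)$$

From the continuity of $f$, $f$ is uniformly continuous over the closed and bounded interval $[a, b]$. Hence for all $\varepsilon > 0$, there exists $n \in \mathbb{N}$ such that for all $k$ between 1 and $n$, we have $M_k - m_k \le \varepsilon$. Then

$$\sum_{k=1}^{n} (M_k - m_k) \left(\frac{b-a}{n}\right) \le \varepsilon (b-a).$$

3. Let $\sigma$ be a regular partition of order $n$. If $f$ is increasing, we have $m_k = f(a_{k-1})$ and $M_k = f(a_k)$. Consequently

$$\overline{S}_\sigma(f) - \underline{S}_\sigma(f) = \sum_{k=1}^{n} (M_k - m_k)(a_k - a_{k-1}) = \left(\frac{b-a}{n}\right) \sum_{k=1}^{n} \left(f(a_k) - f(a_{k-1})\right) = \left(\frac{b-a}{n}\right) (f(b) - f(a)),$$

which tends to 0 when $n \to \infty$.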

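A function that is not integrable. Let $f : [0, 1] \to \mathbb{R}$ be defined by

$$f(x) = \begin{cases} 0 & \text{if } x \in \mathbb{Q}, \\ 1 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$

Then for any partition $\sigma$ of $[0, 1]$ we have $\underline{S}_\sigma(f) = 0$ and $\overline{S}_\sigma(f) = 1$.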

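Computation of the integral by means of the definition.

1. Let $f(x) = x^2$. We will show that for all $b > 0$

$$\int_0^b x^2\,dx = \frac{b^3}{3}.$$

Let $\sigma$ be a regular partition of order $n$. Then, since $f$ is increasing over $[0, b]$,

$$\overline{S}_\sigma(f) = \left(\frac{b-0}{n}\right) \sum_{k=1}^{n} \left(0 + k\,\frac{b-0}{n}\right)^2 = \left(\frac{b}{n}\right)^3 \sum_{k=1}^{n} k^2 = \frac{b^3}{n^3} \cdot \frac{n\left(n + \frac{1}{2}\right)(n+1)}{3} \longrightarrow \frac{b^3}{3} \quad \text{when } n \to \infty.$$

The lower sum differs from the upper sum by $\frac{b}{n}(f(b) - f(0)) = \frac{b^3}{n}$, so it has the same limit.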
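The computation above can be reproduced with exact arithmetic. The following sketch evaluates the lower and upper Riemann sums of $x^2$ on a regular partition; the function name and the choice $b = 1$, $n = 100$ are assumptions for illustration.

```python
from fractions import Fraction

def riemann_sums_square(b, n):
    """Lower and upper Riemann sums of f(x) = x^2 on [0, b], regular partition of order n."""
    h = Fraction(b) / n
    points = [k * h for k in range(n + 1)]
    lower = sum(points[k - 1] ** 2 * h for k in range(1, n + 1))  # m_k = f(a_{k-1})
    upper = sum(points[k] ** 2 * h for k in range(1, n + 1))      # M_k = f(a_k)
    return lower, upper

lower, upper = riemann_sums_square(1, 100)
print(lower, upper, upper - lower)
```

Both sums enclose $b^3/3$, and their difference equals $b^3/n$, in agreement with the proof of Theorem 6.1 for increasing functions.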

6.2 Properties of the Riemann integral

Decomposition of a function into its positive part and its negative part. We define the positive part, respectively the negative part, of a function $f : D_f \to \mathbb{R}$ by

$$f_+(x) = \max\{f(x), 0\} = \frac{f(x) + |f(x)|}{2}, \qquad f_+ : D_f \to \mathbb{R}_+$$

$$f_-(x) = -\min\{f(x), 0\} = \frac{|f(x)| - f(x)}{2}, \qquad f_- : D_f \to \mathbb{R}_+$$

Obviously $f(x) = f_+(x) - f_-(x)$ and $|f(x)| = f_+(x) + f_-(x)$. If $f : [a, b] \to \mathbb{R}$ is bounded and integrable, then $f_-, f_+$ are integrable.

Theorem 6.2. - Properties of the Riemann integral. Let $f, g : [a, b] \to \mathbb{R}$ be two integrable functions. Then

$$\int_a^b (\alpha f(x) + \beta g(x))\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx \quad \text{for all } \alpha, \beta \in \mathbb{R}. \qquad (6.1)$$

If $f(x) \le g(x)$ for all $x \in [a, b]$, then

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \qquad (6.2)$$

In particular,

$$\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \qquad (6.3)$$

For all $c \in ]a, b[$

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \qquad (6.4)$$

Proof. We only give a short proof because the linearity of the integral is a consequence of the linearity of the derivative if $f, g$ are continuous, thanks to the existence of an antiderivative (see below). In the general case of integrable functions $f, g$ it suffices to prove that

$$\int_a^b (-f(x))\,dx = -\int_a^b f(x)\,dx, \qquad \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

The first assertion is a consequence of $\inf(-f) = -\sup f$ and of $\sup(-f) = -\inf f$ in the Riemann sums. For the second one, notice that for all $\varepsilon > 0$ we can choose a common partition $\sigma$ such that

$$\underline{S}_\sigma(f) + \underline{S}_\sigma(g) \le \underline{S}_\sigma(f+g) \le \overline{S}_\sigma(f+g) \le \overline{S}_\sigma(f) + \overline{S}_\sigma(g) \le \underline{S}_\sigma(f) + \underline{S}_\sigma(g) + \varepsilon$$

since $\inf f + \inf g \le \inf(f+g) \le \sup(f+g) \le \sup f + \sup g$ in the sums. The last three properties follow directly from the properties of Riemann sums.

Theorem 6.3. - Mean value theorem. Let $f, g : [a, b] \to \mathbb{R}$ be two continuous functions and $g \ge 0$. Then there exists (at least) a real number $c \in ]a, b[$ such that

$$\int_a^b f(x) g(x)\,dx = f(c) \int_a^b g(x)\,dx$$

In particular, if $g = 1$:

$$\int_a^b f(x)\,dx = f(c)(b-a).$$

Proof. We let $m := \min_{x \in [a,b]} f(x)$, $M := \max_{x \in [a,b]} f(x)$. Then $m g(x) \le f(x) g(x) \le M g(x)$ over $[a, b]$ and from the monotonicity of the integral

$$m \int_a^b g(x)\,dx \le \int_a^b f(x) g(x)\,dx \le M \int_a^b g(x)\,dx.$$

If $\int_a^b g(x)\,dx = 0$, then $\int_a^b f(x) g(x)\,dx = 0$ and the assertion is true for all $c \in ]a, b[$. If $\int_a^b g(x)\,dx > 0$, then

$$m \le \frac{\int_a^b f(x) g(x)\,dx}{\int_a^b g(x)\,dx} \le M$$

and the assertion is a consequence of the intermediate value theorem for the continuous function $f$.

6.3 The derivative and the integral

Antiderivatives. Let $f : [a, b] \to \mathbb{R}$ be an integrable function. A continuous function $F : [a, b] \to \mathbb{R}$ is called an antiderivative of $f$ if for all $x \in ]a, b[$: $F'(x) = f(x)$. If $G$ is another antiderivative of $f$, then $F - G$ is constant.

Existence of antiderivatives for continuous functions and the indefinite integral. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then the function $F : [a, b] \to \mathbb{R}$ defined by the indefinite integral

$$F(x) = \int_a^x f(t)\,dt$$

is an antiderivative of $f$. In fact, for all $x, x+h \in [a, b]$, $h \neq 0$, there exists by the mean value theorem 6.3 a real number $c = c_h$ between $x$ and $x+h$ such that

$$\frac{F(x+h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t)\,dt = f(c_h)$$

and $\lim_{h \to 0} f(c_h) = f(x)$, hence the assertion. The fundamental theorem of integral calculus follows:

Theorem 6.4. - Fundamental theorem of integral calculus. Let $f : [a, b] \to \mathbb{R}$ be continuous. Then if $F : [a, b] \to \mathbb{R}$ is an antiderivative of $f$, we can write

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

Remark. We often write $F(x)\big|_a^b$ instead of $F(b) - F(a)$.
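A quick numerical check of the fundamental theorem: the sketch below compares a Riemann sum with $F(b) - F(a)$. The integrand $\cos$ on $[0, \pi/2]$, the midpoint evaluation points and the helper name `midpoint_sum` are choices made for this illustration, not part of the theorem.

```python
import math

def midpoint_sum(f, a, b, n):
    """Riemann sum of f on [a, b] using midpoints of the regular partition of order n."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

a, b = 0.0, math.pi / 2
numeric = midpoint_sum(math.cos, a, b, 1000)
exact = math.sin(b) - math.sin(a)   # F(b) - F(a) with the antiderivative F = sin
print(numeric, exact)
```

The midpoint value of $f$ on each subinterval lies between $m_k$ and $M_k$, so this sum is squeezed between the lower and upper Riemann sums.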

Remark. For the indefinite integral we find in tables the notation

$$\int f(x)\,dx = F(x) \quad \text{or} \quad \int f(x)\,dx = F(x) + C \quad \text{or} \quad \int^x f(t)\,dt = F(x).$$

Examples. From the fundamental theorem of integral calculus, a simple method to determine antiderivatives consists in computing the derivatives of known functions. For example,

1. $$\int x^p\,dx = \frac{x^{p+1}}{p+1} \quad \text{if } p \neq -1 \text{ and } x > 0.$$

2. $$\int \frac{1}{x}\,dx = \ln |x|, \quad x \neq 0$$

3. $$\int \sin x\,dx = -\cos x, \qquad \int \cos x\,dx = \sin x$$

4. Let $f : [a, b] \to \mathbb{R}$ be of class $C^1$ and $f > 0$. Then

$$\int \frac{f'(x)}{f(x)}\,dx = \ln f(x)$$

and

$$\int f'(x) f(x)^p\,dx = \frac{f(x)^{p+1}}{p+1} \quad \text{if } p \neq -1.$$

6.4 Integration Techniques

Introduction. The main rules of integral calculus are a consequence of the fundamental theorem of integral calculus and of the rules established for differential calculus.

6.4.1 Integration by parts

Integration by parts is a consequence of the differentiation rule for the product of two functions.

Integration by parts. Let $I$ be an open interval, $a, b \in I$ and $f, g : [a, b] \to \mathbb{R}$ be two functions of class $C^1$. Then,

$$\int_a^b f(x) g'(x)\,dx = f(x) g(x)\Big|_a^b - \int_a^b f'(x) g(x)\,dx.$$

Application - truncated Taylor series and Taylor's formula. Let $I$ be an open interval, $a \in I$ and $f : I \to \mathbb{R}$ be a function of class $C^{n+1}$. Then, for all $x \in I$:

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{1}{n!} \int_a^x (x-t)^n f^{(n+1)}(t)\,dt$$

that is, the Taylor remainder is explicitly given by

$$R_n(x) = \frac{1}{n!} \int_a^x (x-t)^n f^{(n+1)}(t)\,dt$$
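The notes state this formula without proof. One possible derivation, sketched here as an induction on $n$ using integration by parts (with $R_n$ as defined above), runs as follows. For $n = 0$ the formula is the fundamental theorem of integral calculus. For the induction step, integrate $R_{n-1}$ by parts with respect to $t$:

```latex
\begin{aligned}
f(x) &= f(a) + \int_a^x f'(t)\,dt = f(a) + R_0(x), \\
R_{n-1}(x) &= \frac{1}{(n-1)!}\int_a^x (x-t)^{n-1} f^{(n)}(t)\,dt \\
  &= \left[-\frac{(x-t)^n}{n!}\, f^{(n)}(t)\right]_{t=a}^{t=x}
     + \frac{1}{n!}\int_a^x (x-t)^n f^{(n+1)}(t)\,dt \\
  &= \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n(x).
\end{aligned}
```

Each step therefore moves one more term of the Taylor polynomial out of the remainder.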

6.4.2 Variable substitution

The variable substitution rule is a consequence of the derivative of a composite function.

An application of the chain rule to antiderivatives. Let $a < b$ and $f : [a, b] \to \mathbb{R}$ be a continuous function. Let $I$ be an open interval, and $g, h : I \to [a, b]$ be two functions of class $C^1$. Then the function $K : I \to \mathbb{R}$ defined by

$$K(x) = \int_{g(x)}^{h(x)} f(t)\,dt$$

is of class $C^1$ and for all $x \in I$:

$$K'(x) = f(h(x)) h'(x) - f(g(x)) g'(x)$$

since $f$ admits an antiderivative $F$ and, from the fundamental theorem of integral calculus, $K(x) = F(h(x)) - F(g(x))$. The formula for variable substitution follows.

Variable substitution. Let $a < b$ and $f : [a, b] \to \mathbb{R}$ be a continuous function. Let $I$ be an open interval, $\alpha, \beta \in I$, $\alpha < \beta$ and $\varphi : I \to \mathbb{R}$ be a function of class $C^1$ such that $\varphi([\alpha, \beta]) \subset [a, b]$. Then,

$$\int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx = \int_\alpha^\beta f(\varphi(t)) \varphi'(t)\,dt$$

The transformation $x = \varphi(t)$ is called variable substitution.

Remark. Intuitively, we write

$$dx = \frac{dx}{dt}\,dt.$$
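The substitution formula can be checked numerically on a concrete case. The sketch below takes $f(x) = \sqrt{1 - x^2}$ and $\varphi(t) = \sin t$ on $[0, \pi/6]$ (choices made for this example) and compares both sides with the closed form $\pi/12 + \sqrt{3}/8$ obtained from $\int \cos^2 t\,dt = t/2 + \sin(2t)/4$.

```python
import math

def midpoint_sum(f, a, b, n=2000):
    """Riemann sum of f on [a, b] using midpoints of the regular partition of order n."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

f = lambda x: math.sqrt(1.0 - x * x)
phi, dphi = math.sin, math.cos
alpha, beta = 0.0, math.pi / 6           # phi(alpha) = 0, phi(beta) = 1/2

lhs = midpoint_sum(f, phi(alpha), phi(beta))                     # integral in x
rhs = midpoint_sum(lambda t: f(phi(t)) * dphi(t), alpha, beta)   # integral in t
exact = math.pi / 12 + math.sqrt(3) / 8
print(lhs, rhs, exact)
```

Both Riemann sums approximate the same number, as the formula predicts.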

6.4.3 Examples - techniques and frequent integrals

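1. Polynomials and exponential function: $\int^x P_n(t) e^{\lambda t}\,dt$. We use that for all $\lambda \in \mathbb{R}$, $\lambda \neq 0$,

$$\int^x e^{\lambda t}\,dt = \frac{e^{\lambda x}}{\lambda}, \quad \text{i.e. } e^{\lambda t} = \left(\frac{e^{\lambda t}}{\lambda}\right)'$$

and integration by parts. Let $P_n(t)$ be a polynomial of degree $n$. We need $n$ integrations by parts to "eliminate" $P_n(t)$:

$$\begin{aligned}
\int^x P_n(t) e^{\lambda t}\,dt &= \int^x P_n(t) \left(\frac{e^{\lambda t}}{\lambda}\right)' dt \\
&= P_n(t) \frac{e^{\lambda t}}{\lambda}\Big|^x - \int^x P_n'(t) \frac{e^{\lambda t}}{\lambda}\,dt \\
&= \lambda^{-1} P_n(x) e^{\lambda x} - \lambda^{-1} \int^x P_n'(t) e^{\lambda t}\,dt
\end{aligned}$$

and $P_n'(t)$ is a polynomial of degree $n-1$. We can prove by induction that

$$\int^x P_n(t) e^{\lambda t}\,dt = e^{\lambda x} \sum_{k=0}^{n} (-1)^k \lambda^{-k-1} P_n^{(k)}(x)$$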

2. Polynomials, sine and cosine: $\int^x P_n(t) \sin t\,dt$ and $\int^x P_n(t) \cos t\,dt$. We use the derivatives

$$\sin t = (-\cos t)', \qquad (\sin t)' = \cos t$$

and $n$ integrations by parts to "eliminate" $P_n(t)$:

$$\begin{aligned}
\int^x P_n(t) \cos t\,dt &= \int^x P_n(t) (\sin t)'\,dt \\
&= P_n(t) \sin t\Big|^x - \int^x P_n'(t) \sin t\,dt \\
&= P_n(t) \sin t\Big|^x - \int^x P_n'(t) (-\cos t)'\,dt \\
&= \left(P_n(t) \sin t + P_n'(t) \cos t\right)\Big|^x - \int^x P_n''(t) \cos t\,dt \quad \text{etc.}
\end{aligned}$$

3. A recurrent sequence for $I_n := \int^x \sin^n t\,dt$. We use $\sin^2 + \cos^2 = 1$, the integrals

$$I_0 = \int^x \sin^0 t\,dt = x, \qquad I_1 = \int^x \sin t\,dt = -\cos x$$

and the derivatives

$$\sin t = (-\cos t)', \qquad (\sin t)' = \cos t$$

Example.

$$\begin{aligned}
I_2 = \int^x \sin^2 t\,dt &= \int^x \sin t\,(-\cos t)'\,dt \\
&= -\sin t \cos t\Big|^x - \int^x (\sin t)' (-\cos t)\,dt \\
&= -\sin x \cos x + \int^x \cos^2 t\,dt \\
&= -\sin x \cos x + \int^x (1 - \sin^2 t)\,dt \\
&= -\sin x \cos x + x - I_2
\end{aligned}$$

Hence $2 I_2 = -\sin x \cos x + x$.

Application - the Wallis product.

$$\frac{\pi}{2} = \prod_{k=1}^{\infty} \frac{4k^2}{4k^2 - 1} \quad \text{(Wallis product)}. \qquad (6.5)$$

This product will be useful to prove Stirling's formula.

4. Integral of $\sqrt{1 - x^2}$. The integrand is defined between its zeros $-1$ and $1$. We apply the variable substitution $x = \sin t$. Let $-1 < a < b < 1$. Notice that for $t \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ the function $\sin t$ is increasing and $\cos t = \sqrt{1 - \sin^2 t} \ge 0$.

5. Integral of functions containing $\sqrt{1 + x^2}$. The integrand is defined for all $x \in \mathbb{R}$. We apply the variable substitution $x = \sinh t$.

6. Computation of $\int_a^b 1/\sqrt{x^2 - 1}\,dx$ and of $\int_a^b \sqrt{x^2 - 1}\,dx$. We apply the variable substitution $x = \cosh t$. Let $a, b > 1$.

7. Integrals $\int_a^b \frac{1}{1 + x^2}\,dx$ and $\int_a^b \frac{1}{1 - x^2}\,dx$.

8. Integral of rational functions $\frac{P(x)}{Q(x)}$ - Partial fraction decomposition.

9. Integral of the inverse of a function.

10. The trapezoidal rule I. Let $f$ be a function of class $C^2([a, b])$. Then, there exists $c \in ]a, b[$ such that

$$\int_a^b f(x)\,dx = \frac{f(a) + f(b)}{2}(b-a) - \frac{1}{12} f''(c) (b-a)^3$$

11. The trapezoidal rule II. Let $f$ be a function of class $C^2([a, b])$. Then, there exists $d \in ]a, b[$ such that

$$\int_a^b f(x)\,dx = f\left(\frac{a+b}{2}\right)(b-a) + \frac{1}{24} f''(d) (b-a)^3$$
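The two trapezoidal rules (items 10 and 11) can be verified exactly on a function whose second derivative is constant, because then the unknown points $c$ and $d$ do not matter. The sketch below uses $f(x) = x^2$ on $[0, 1]$, an example chosen for illustration.

```python
from fractions import Fraction

a, b = Fraction(0), Fraction(1)
f = lambda x: x * x          # f'' = 2 everywhere, so c and d play no role
second_derivative = 2

integral = Fraction(1, 3)    # known value of the integral of x^2 over [0, 1]

# Rule I: value from the endpoints plus the error term -(1/12) f''(c) (b - a)^3
rule_one = (f(a) + f(b)) / 2 * (b - a) - Fraction(1, 12) * second_derivative * (b - a) ** 3

# Rule II: value from the midpoint plus the error term +(1/24) f''(d) (b - a)^3
rule_two = f((a + b) / 2) * (b - a) + Fraction(1, 24) * second_derivative * (b - a) ** 3

print(rule_one, rule_two)
```

Both expressions reproduce the integral exactly; note the opposite signs of the two error terms, and that the midpoint error is half the size of the endpoint error.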

6.4.4 Supplement: Integral of functions f : [a, b] → C

We can easily extend integral calculus and its techniques to functions $f : [a, b] \to \mathbb{C}$ (but not to functions $f : \mathbb{C} \to \mathbb{C}$!)

Integrable function. A function $f : [a, b] \to \mathbb{C}$, $f(x) = u(x) + i v(x)$, is integrable if $u, v : [a, b] \to \mathbb{R}$ are bounded and integrable. We define

$$\int_a^b f(x)\,dx := \int_a^b u(x)\,dx + i \int_a^b v(x)\,dx. \qquad (6.6)$$

If $u, v$ are continuous functions they admit antiderivatives $U, V$ and we define $F(x) := U(x) + i V(x)$. We have $F'(x) = f(x)$ and

$$\int_a^b f(x)\,dx = F(b) - F(a) = U(b) - U(a) + i (V(b) - V(a)). \qquad (6.7)$$

Example. If $f : [a, b] \to \mathbb{C}$ is given by $f(x) = e^{ipx} = \cos px + i \sin px$, $p \in \mathbb{R}$, $p \neq 0$, then

$$F(x) = \frac{e^{ipx}}{ip} = \frac{\sin px}{p} - i\,\frac{\cos px}{p} \quad \text{and} \quad \int_a^b e^{ipx}\,dx = \frac{e^{ipb} - e^{ipa}}{ip}.$$
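The example can be confirmed numerically by integrating real and imaginary parts together, as in definition (6.6). The values $p = 3$, $[a, b] = [0, 1]$ and the helper name `complex_midpoint_sum` are assumptions for this sketch.

```python
import cmath

def complex_midpoint_sum(f, a, b, n=2000):
    """Midpoint Riemann sum of a complex-valued f on the real interval [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

p, a, b = 3.0, 0.0, 1.0
numeric = complex_midpoint_sum(lambda x: cmath.exp(1j * p * x), a, b)
exact = (cmath.exp(1j * p * b) - cmath.exp(1j * p * a)) / (1j * p)
print(numeric, exact)
```

Summing complex values term by term is the same as summing the real and imaginary parts separately, which is how the integral was defined.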

6.5 Generalized integrals

Introduction. We have defined the integral for a subset of bounded functions $f : [a, b] \to \mathbb{R}$ that are said to be integrable. Can we extend this definition to non-bounded functions defined over an open interval or to functions defined over non-bounded intervals?

Definition. Let $I$ be an interval and $a = \inf I < b = \sup I$. Let $f : I \to \mathbb{R}$ be a function that is bounded and integrable over any interval $[c, d] \subset I$ where $a < c < d < b$. We say that $f$ is integrable over $I$ if and only if for a point $p \in ]a, b[$ the limits

$$\lim_{x \to a^+} \int_x^p f(t)\,dt \quad \text{and} \quad \lim_{x \to b^-} \int_p^x f(t)\,dt$$

exist. In this case, we define the generalized integral of $f$ over $I$ by

$$\int_a^b f(t)\,dt = \lim_{x \to a^+} \int_x^p f(t)\,dt + \lim_{x \to b^-} \int_p^x f(t)\,dt$$

We say that the generalized integral of $f$ over $I$ is absolutely convergent if the generalized integral of $|f|$ over $I$ exists. In this case the generalized integral of $f$ over $I$ is convergent (as is the case for series). Just as for absolutely convergent series, we can build a comparison criterion for absolutely convergent integrals.

Comparison criterion. Let $f, g : I \to \mathbb{R}$ be integrable over $I$ such that $0 \le |f|(x) \le |g|(x)$. Then,

$$\int_I |g(t)|\,dt < +\infty \;\Rightarrow\; \int_I |f(t)|\,dt < +\infty \qquad (6.8)$$

$$\int_I |f(t)|\,dt = +\infty \;\Rightarrow\; \int_I |g(t)|\,dt = +\infty. \qquad (6.9)$$

Application: absolute convergence implies convergence. Let $f, g : I \to \mathbb{R}$ be integrable over $I$ such that $0 \le |f|(x) \le g(x)$. Then, the generalized integral of $f$ over $I$ is absolutely convergent and

$$-\infty < -\int_I g(t)\,dt \le -\int_I |f(t)|\,dt \le \int_I f(t)\,dt \le \int_I |f(t)|\,dt \le \int_I g(t)\,dt < +\infty$$

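Examples.

1. $\int_0^\infty \frac{\sin x}{x^{1+p}}\,dx$ is absolutely convergent for all $0 < p < 1$ since $\int_0^b \frac{\sin x}{x^{1+p}}\,dx$ is absolutely convergent for all $p < 1$, $b > 0$ and $\int_b^\infty \frac{\sin x}{x^{1+p}}\,dx$ is absolutely convergent for all $0 < p$, $b > 0$.

Proof: since $|\sin x| \le |x|$, for all $s, t \in ]0, b[$, $s < t$ and $p < 1$:

$$\int_s^t \left| \frac{\sin x}{x^{1+p}} \right| dx \le \int_s^t \left| \frac{1}{x^p} \right| dx = \left| \frac{t^{1-p} - s^{1-p}}{1-p} \right|$$

Moreover, since $|\sin x| \le 1$,

$$\int_b^\infty \left| \frac{\sin x}{x^{1+p}} \right| dx \le \int_b^\infty \left| \frac{1}{x^{1+p}} \right| dx = \frac{1}{p\,b^p}.$$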

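2. $\int_0^\infty \frac{\sin t}{t}\,dt$ is convergent but not absolutely convergent. The function $\frac{\sin t}{t}$ is continuous and bounded over any interval $]0, b[$, hence it suffices to consider

$$\int_b^\infty \frac{\sin t}{t}\,dt, \quad b > 0.$$

Integrating by parts yields

$$\int_b^\infty \frac{\sin t}{t}\,dt = \frac{-\cos t}{t}\bigg|_b^\infty - \int_b^\infty \frac{\cos t}{t^2}\,dt$$

hence convergence, by the same comparison as in example (1). To prove that the integral does not converge absolutely we bound $|\sin x|$ from below as follows: for $x \in [0, \pi]$:

$$\sin x \ge g(x) := \begin{cases} 0, & \text{if } 0 \le x < \pi/4; \\ \sin \pi/4, & \text{if } \pi/4 \le x < 3\pi/4; \\ 0, & \text{if } 3\pi/4 \le x \le \pi. \end{cases}$$

The function $|\sin x|$ is periodic with period $\pi$ and we apply the same bound over each period. It follows that for any positive integer $n$

$$\begin{aligned}
\int_0^{n\pi} \frac{|\sin t|}{t}\,dt &\ge \sin\frac{\pi}{4} \sum_{k=0}^{n-1} \int_{\frac{\pi}{4} + \pi k}^{\frac{3\pi}{4} + \pi k} \frac{1}{t}\,dt \\
&\ge \sin\frac{\pi}{4} \sum_{k=0}^{n-1} \int_{\frac{\pi}{4} + \pi k}^{\frac{3\pi}{4} + \pi k} \frac{1}{\frac{3\pi}{4} + \pi k}\,dt \\
&= \sin\frac{\pi}{4} \sum_{k=0}^{n-1} \frac{\pi}{2} \cdot \frac{1}{\frac{3\pi}{4} + \pi k}.
\end{aligned}$$

This expression diverges when $n \to \infty$, by comparison with the harmonic series.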

3. We consider the functions $f_n : [0, \infty[ \to \mathbb{R}$, $n$ a non-negative integer, defined by

$$f_n(x) = n (1 - 2^n |x - n|)\, \chi_{[n - 2^{-n},\, n + 2^{-n}]}(x).$$

We define

$$f(x) := \sum_{n=0}^{\infty} f_n(x).$$

The function $f$ is non-negative, continuous, unbounded and integrable over $[0, \infty[$.

The graph of $f_n$ is a triangle of height $n$ centered at $x = n$. The area of the triangles is $n 2^{-n}$.

Remark: Alternatively, we can write the $f_n$ as follows:

$$f_n(x) = n 2^{n-1} \left( |x - n - 2^{-n}| + |x - n + 2^{-n}| - 2|x - n| \right).$$

The antiderivative $F(x) = \int_0^x f(t)\,dt$ such that $F(0) = 0$ exists and $F$ is a non-negative, increasing, bounded and differentiable function. In particular,

$$F(x) \le \sum_{n=0}^{\infty} n 2^{-n} = 2$$

and

$$\int_0^\infty f(t)\,dt = 2.$$
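The value 2 in example 3 is the limit of the accumulated triangle areas. The sketch below sums them exactly; the function names are chosen for this illustration, and the closed form $2 - (N+2)/2^N$ for the partial sums is a standard identity, not taken from the notes.

```python
from fractions import Fraction

def triangle_area(n):
    """Area of the triangle with base 2 * 2^(-n) and height n, i.e. the integral of f_n."""
    base = 2 * Fraction(1, 2**n)
    return base * n / 2

def partial_area(N):
    """Sum of the areas of the triangles f_0, ..., f_N."""
    return sum(triangle_area(n) for n in range(N + 1))

areas = [partial_area(N) for N in (5, 10, 20)]
print([float(a) for a in areas])
```

The partial sums increase towards 2 although the heights of the triangles are unbounded, which is why $f$ is unbounded and still integrable.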

Integration rules and techniques for generalized integrals. The generalized integral satisfies the properties (6.1) - (6.4). Integration by parts and variable substitution also extend to generalized integrals. For the integration by parts of generalized integrals

$$\int_a^b f(x) g'(x)\,dx = f(x) g(x)\bigg|_a^b - \int_a^b f'(x) g(x)\,dx$$

the term $f(x) g(x)\big|_a^b$ is given by

$$f(x) g(x)\bigg|_a^b = \lim_{x \to b^-} f(x) g(x) - \lim_{x \to a^+} f(x) g(x).$$

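Examples. Let $f$ be an integrable function over $\mathbb{R}$. For all $a \in \mathbb{R}$

$$\int_{-\infty}^{\infty} f(x + a)\,dx = \int_{-\infty}^{\infty} f(x)\,dx \quad \text{(invariance under translation)}$$

and, for all $a > 0$,

$$\int_{-\infty}^{\infty} a f(ax)\,dx = \int_{-\infty}^{\infty} f(x)\,dx \quad \text{(invariance under scaling)}.$$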

6.6 The Gamma function

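Definition. For $x > 0$ we let

$$\Gamma(x) := \int_0^\infty t^{x-1} e^{-t}\,dt.$$

This generalized integral is well defined because

$$\int_0^1 t^{x-1} e^{-t}\,dt \le \int_0^1 t^{x-1}\,dt = \frac{1}{x}$$

and, by using $t^{x-1} \le t^x$ for $t \ge 1$ and $t^x e^{-t/2} \le (2x)^x e^{-x}$ for all $t > 0$ (the maximum is attained at $t = 2x$),

$$\int_1^\infty t^{x-1} e^{-t}\,dt \le (2x)^x e^{-x} \int_1^\infty e^{-t/2}\,dt \le 2 (2x)^x e^{-x}$$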

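Theorem 6.5. - Gamma Function.

1. For all $n \in \mathbb{Z}_+$

$$\Gamma(n + 1) = n!$$

and for all $x > 0$

$$\Gamma(x + 1) = x \Gamma(x)$$

2. For all $x, y > 0$:

$$\Gamma\left(\frac{x + y}{2}\right) \le \sqrt{\Gamma(x) \Gamma(y)}$$

3. $$\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$$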

Proof. 1. By an integration by parts

$$\Gamma(x + 1) := \int_0^\infty t^x e^{-t}\,dt = -t^x e^{-t}\bigg|_0^\infty + x \int_0^\infty t^{x-1} e^{-t}\,dt = x \Gamma(x)$$

since $\lim_{t \to \infty} t^x e^{-t} = 0$ and $\lim_{t \to 0^+} t^x e^{-t} = 0$. Furthermore, $\Gamma(1) = 1 = 0!$. By induction we show that

$$\Gamma(n + 1) = n \Gamma(n) = n (n-1)! = n!$$

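2. Let us recall the inequality $\lambda a^2 + \lambda^{-1} b^2 \ge 2ab$ for all $a, b, \lambda > 0$. Then, for all $x, y, \lambda > 0$:

$$\begin{aligned}
\Gamma\left(\frac{x + y}{2}\right) &= \int_0^\infty t^{\frac{x+y}{2} - 1} e^{-t}\,dt = \int_0^\infty t^{\frac{x}{2}}\, t^{\frac{y}{2}}\, t^{-1} e^{-t}\,dt \\
&\le \frac{1}{2} \int_0^\infty \left(\lambda t^x + \lambda^{-1} t^y\right) t^{-1} e^{-t}\,dt \\
&= \frac{\lambda}{2} \Gamma(x) + \frac{1}{2\lambda} \Gamma(y).
\end{aligned}$$

We minimize with respect to $\lambda$: choose $\lambda > 0$ such that $\lambda^2 = \frac{\Gamma(y)}{\Gamma(x)}$. This choice leads to the desired result.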

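3. We would like to use the Wallis product

$$\frac{\pi}{2} = \prod_{k=1}^{\infty} \frac{4k^2}{4k^2 - 1} = \lim_{n \to \infty} \prod_{k=1}^{n} \frac{4k^2}{4k^2 - 1}.$$

The idea is to express $\Gamma(\frac{1}{2})$ through the relations

$$\Gamma\left(n + \frac{1}{2}\right) = 2^{-n}\, \Gamma\left(\frac{1}{2}\right) \prod_{k=1}^{n} (2k - 1)$$

and, by shifting the index in the product,

$$\Gamma\left(n + \frac{3}{2}\right) = 2^{-n-1}\, \Gamma\left(\frac{1}{2}\right) \prod_{k=1}^{n+1} (2k - 1) = 2^{-n-1}\, \Gamma\left(\frac{1}{2}\right) \prod_{k=1}^{n} (2k + 1).$$

Hence

$$\Gamma\left(\frac{1}{2}\right)^2 = \frac{\Gamma(n + \frac{1}{2})\, \Gamma(n + \frac{3}{2})\, 2^{2n+1}}{\prod_{k=1}^{n} (4k^2 - 1)} = 2 \left( \prod_{k=1}^{n} \frac{4k^2}{4k^2 - 1} \right) \cdot \frac{\Gamma(n + \frac{1}{2})\, \Gamma(n + \frac{3}{2})}{\Gamma(n + 1)^2}$$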

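From the convexity of $\ln \Gamma$ (property 2):

$$\Gamma(n + 1)^2 = \Gamma\left( \frac{(n + \frac{1}{2}) + (n + \frac{3}{2})}{2} \right)^2 \le \Gamma\left(n + \frac{1}{2}\right) \Gamma\left(n + \frac{3}{2}\right)$$

and

$$\begin{aligned}
\Gamma\left(n + \frac{1}{2}\right) \Gamma\left(n + \frac{3}{2}\right) &= \left(n + \frac{1}{2}\right) \Gamma\left(n + \frac{1}{2}\right)^2 \\
&= \left(n + \frac{1}{2}\right) \Gamma\left( \frac{n + (n + 1)}{2} \right)^2 \le \left(n + \frac{1}{2}\right) \Gamma(n)\, \Gamma(n + 1) \\
&= \frac{n + \frac{1}{2}}{n}\, \Gamma(n + 1)^2.
\end{aligned}$$

This yields the following estimates

$$2 \left( \prod_{k=1}^{n} \frac{4k^2}{4k^2 - 1} \right) \le \Gamma\left(\frac{1}{2}\right)^2 \le \frac{n + \frac{1}{2}}{n} \cdot 2 \left( \prod_{k=1}^{n} \frac{4k^2}{4k^2 - 1} \right)$$

hence, letting $n \to \infty$, $\Gamma(\frac{1}{2})^2 = \pi$.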
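The three statements of Theorem 6.5 can be spot-checked with the Gamma function from Python's standard library. This is a numerical illustration at a few sample points (chosen arbitrarily), not a substitute for the proof.

```python
import math

# 1. Gamma(n + 1) = n! for small non-negative integers n
factorial_ok = all(
    math.isclose(math.gamma(n + 1), math.factorial(n)) for n in range(10)
)

# 2. Gamma((x + y)/2) <= sqrt(Gamma(x) * Gamma(y)) at a few sample points
samples = [(0.5, 3.0), (1.0, 2.0), (2.5, 7.0)]
log_convex_ok = all(
    math.gamma((x + y) / 2) <= math.sqrt(math.gamma(x) * math.gamma(y))
    for x, y in samples
)

# 3. Gamma(1/2) = sqrt(pi)
half_value = math.gamma(0.5)

print(factorial_ok, log_convex_ok, half_value, math.sqrt(math.pi))
```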

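Gaussian integral. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be the function defined by

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$$

In probability the function $\varphi$ is called the standard normal density function. The function $\varphi$ satisfies the property

$$\int_{-\infty}^{\infty} \varphi(x)\,dx = 1.$$

In fact, $\varphi$ is an even function and by the variable substitution $t = x^2/2$ we have

$$\int_{-\infty}^{\infty} \varphi(x)\,dx = 2 \int_0^\infty \varphi(x)\,dx = \frac{2}{\sqrt{2\pi}} \cdot \frac{\sqrt{2}}{2} \int_0^\infty t^{-\frac{1}{2}} e^{-t}\,dt = \frac{1}{\sqrt{\pi}}\, \Gamma\left(\frac{1}{2}\right) = 1$$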

Theorem 6.6. - Stirling's formula. We have

$$\lim_{n \to \infty} \frac{n!}{\sqrt{2\pi}\, n^{n + \frac{1}{2}}\, e^{-n}} = 1 \qquad (6.10)$$

Remark. Instead of the limit we also write

$$n! \sim \sqrt{2\pi}\, n^{n + \frac{1}{2}}\, e^{-n}.$$
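To get a feeling for how fast the limit (6.10) is approached, the following sketch evaluates the ratio for a few values of $n$. Working with logarithms (via `math.lgamma`, using $\ln n! = \ln \Gamma(n+1)$) is an implementation choice made here to avoid overflow; the function name is hypothetical.

```python
import math

def stirling_ratio(n):
    """n! divided by sqrt(2*pi) * n^(n + 1/2) * e^(-n), computed via logarithms."""
    log_stirling = 0.5 * math.log(2 * math.pi) + (n + 0.5) * math.log(n) - n
    return math.exp(math.lgamma(n + 1) - log_stirling)

ratios = {n: stirling_ratio(n) for n in (1, 10, 100, 1000)}
for n, ratio in ratios.items():
    print(n, ratio)
```

The ratio stays above 1 and decreases towards 1 as $n$ grows, consistent with a decreasing sequence $(a_n)$ converging to $\sqrt{2\pi}$.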

Proof. The idea is to write

$$\ln n! = \sum_{k=1}^{n} \ln k = \int_1^n \ln x\,dx + \text{corrections}$$

From the trapezoidal rule and the fact that $(\ln x)'' = -\frac{1}{x^2}$, for any integer $k = 1, \ldots, n-1$ there exists $c_k \in ]k, k+1[$ such that

$$\int_k^{k+1} \ln x\,dx = \frac{\ln(k) + \ln(k+1)}{2} + \frac{1}{12 c_k^2}$$

By summing over $k = 1, \ldots, n-1$ we obtain

$$\int_1^n \ln x\,dx = \sum_{k=1}^{n} \ln k - \frac{1}{2} \ln n + \frac{1}{12} \sum_{k=1}^{n-1} c_k^{-2}$$

With

$$\int_1^n \ln x\,dx = n \ln n - n + 1$$

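we have

$$\ln n! = \sum_{k=1}^{n} \ln k = \left(n + \frac{1}{2}\right) \ln n - n + b_n$$

where

$$b_n = 1 - \frac{1}{12} \sum_{k=1}^{n-1} c_k^{-2}$$

The sequence $(b_n)$ is decreasing and bounded because $c_k^{-2} \le k^{-2}$. Hence $(b_n)$ is convergent. We define $a_n = e^{b_n}$. By taking the exponential we find

$$n! = n^{n + \frac{1}{2}} e^{-n} a_n \quad \text{i.e.} \quad a_n = \frac{n!}{n^{n + \frac{1}{2}} e^{-n}}$$

From the continuity of the exponential, $a = \lim_{n \to \infty} a_n$ exists and $a \neq 0$. We will show that $a = \sqrt{2\pi}$. We have

$$a^2 = \frac{a^4}{a^2} = \frac{\lim_{n \to \infty} a_n^4}{\lim_{n \to \infty} a_{2n}^2} = \lim_{n \to \infty} \frac{a_n^4}{a_{2n}^2} = \lim_{n \to \infty} \frac{(n!)^4 (2n)^{4n+1}}{n^{4n+2} ((2n)!)^2} = \lim_{n \to \infty} \frac{(n!)^4\, 2^{4n+1}}{n\, ((2n)!)^2}.$$

We have that

$$(2n)! = 2^n n! \prod_{k=1}^{n} (2k - 1) = \frac{2^n n!}{2n + 1} \prod_{k=1}^{n} (2k + 1).$$

Consequently,

$$\frac{a_n^4}{a_{2n}^2} = \frac{(n!)^4\, 2^{4n+1}}{n\, ((2n)!)^2} = \frac{4n + 2}{n} \prod_{k=1}^{n} \frac{4k^2}{4k^2 - 1}$$

hence, by the Wallis product (6.5), $a^2 = 4 \cdot \frac{\pi}{2} = 2\pi$.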

Corollary 6.7. For $n$ large

$$2^{-2n} \binom{2n}{n} \sim \frac{1}{\sqrt{\pi n}}$$

Proof. Notice that $2^{-2n} \binom{2n}{n} = \dfrac{a_{2n} \sqrt{2}}{a_n^2 \sqrt{n}}$.
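The corollary can be observed numerically with exact binomial coefficients. In the sketch below the normalization by $\sqrt{\pi n}$ and the sample values of $n$ are choices made for illustration; `math.comb` requires Python 3.8 or later.

```python
import math

def normalized_central_binomial(n):
    """sqrt(pi * n) * 2^(-2n) * C(2n, n), which should approach 1 for large n."""
    return math.sqrt(math.pi * n) * (math.comb(2 * n, n) / 4**n)

values = [normalized_central_binomial(n) for n in (10, 100, 1000)]
print(values)
```

The integer quotient is formed before converting to floating point, so no overflow occurs even though $\binom{2000}{1000}$ has about 600 digits.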


Appendix A

Derivatives and antiderivatives of elementary functions

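| $f(x)$ | Derivative of $f$ | An antiderivative of $f$ | Conditions |
|---|---|---|---|
| $a$ | $0$ | $ax$ | |
| $x^\alpha$ | $\alpha x^{\alpha-1}$ | $\frac{1}{\alpha+1} x^{\alpha+1}$ | $\alpha \neq -1$, $x > 0$ |
| $\frac{1}{x}$ | $\frac{-1}{x^2}$ | $\ln \lvert x \rvert$ | $x \neq 0$ |
| $\frac{1}{a^2+x^2}$ | $\frac{-2x}{(a^2+x^2)^2}$ | $\frac{1}{a} \arctan \frac{x}{a}$ | $a \neq 0$ |
| $\frac{1}{a^2-x^2}$ | $\frac{2x}{(a^2-x^2)^2}$ | $\frac{1}{2a} \ln \left\lvert \frac{a+x}{a-x} \right\rvert$ | $a \neq 0$, $x \neq \pm a$ |
| $\frac{1}{x^2-a^2}$ | $\frac{-2x}{(x^2-a^2)^2}$ | $\frac{1}{2a} \ln \left\lvert \frac{x-a}{x+a} \right\rvert$ | $a \neq 0$, $x \neq \pm a$ |
| $\sqrt{x^2+a^2}$ | $\frac{x}{\sqrt{x^2+a^2}}$ | $\frac{x}{2}\sqrt{x^2+a^2} + \frac{a^2}{2} \ln(x+\sqrt{x^2+a^2})$ | $a \neq 0$ |
| $\sqrt{x^2-a^2}$ | $\frac{x}{\sqrt{x^2-a^2}}$ | $\frac{x}{2}\sqrt{x^2-a^2} - \frac{a^2}{2} \ln \lvert x+\sqrt{x^2-a^2} \rvert$ | $a \neq 0$, $\lvert x \rvert > \lvert a \rvert$ |
| $\sqrt{a^2-x^2}$ | $\frac{-x}{\sqrt{a^2-x^2}}$ | $\frac{x}{2}\sqrt{a^2-x^2} + \frac{a^2}{2} \arcsin \frac{x}{a}$ | $a > 0$, $\lvert x \rvert < \lvert a \rvert$ |
| $\frac{1}{\sqrt{x^2+a^2}}$ | $\frac{-x}{\sqrt{(x^2+a^2)^3}}$ | $\ln(x+\sqrt{x^2+a^2})$ | $a \neq 0$ |
| $\frac{1}{\sqrt{x^2-a^2}}$ | $\frac{-x}{\sqrt{(x^2-a^2)^3}}$ | $\ln \lvert x+\sqrt{x^2-a^2} \rvert$ | $a \neq 0$, $\lvert x \rvert > \lvert a \rvert$ |
| $\frac{1}{\sqrt{a^2-x^2}}$ | $\frac{x}{\sqrt{(a^2-x^2)^3}}$ | $\arcsin \frac{x}{a}$ | $a > 0$, $\lvert x \rvert < \lvert a \rvert$ |
| $e^x$ | $e^x$ | $e^x$ | |
| $a^x$ | $a^x \ln a$ | $\frac{a^x}{\ln a}$ | $a > 0$, $a \neq 1$ |
| $\ln x$ | $\frac{1}{x}$ | $x \ln x - x$ | $x > 0$ |
| $\sin x$ | $\cos x$ | $-\cos x$ | |
| $\cos x$ | $-\sin x$ | $\sin x$ | |
| $\tan x$ | $1 + \tan^2 x$ | $-\ln \lvert \cos x \rvert$ | $x \neq \pi/2 + k\pi$ |
| $\operatorname{cotan} x$ | $-(1 + \operatorname{cotan}^2 x)$ | $\ln \lvert \sin x \rvert$ | $x \neq k\pi$ |
| $\frac{1}{\sin x}$ | $\frac{-\cos x}{\sin^2 x}$ | $\ln \left\lvert \tan \frac{x}{2} \right\rvert$ | $x \neq k\pi$ |
| $\frac{1}{\cos x}$ | $\frac{\sin x}{\cos^2 x}$ | $\ln \left\lvert \tan\left(\frac{x}{2} + \frac{\pi}{4}\right) \right\rvert$ | $x \neq \frac{\pi}{2} + k\pi$ |
| $\arcsin x$ | $\frac{1}{\sqrt{1-x^2}}$ | $x \arcsin x + \sqrt{1-x^2}$ | $\lvert x \rvert < 1$ |
| $\arccos x$ | $\frac{-1}{\sqrt{1-x^2}}$ | $x \arccos x - \sqrt{1-x^2}$ | $\lvert x \rvert < 1$ |
| $\arctan x$ | $\frac{1}{1+x^2}$ | $x \arctan x - \ln \sqrt{1+x^2}$ | |
| $\operatorname{arccotan} x$ | $\frac{-1}{1+x^2}$ | $x \operatorname{arccotan} x + \ln \sqrt{1+x^2}$ | |
| $\sinh x$ | $\cosh x$ | $\cosh x$ | |
| $\cosh x$ | $\sinh x$ | $\sinh x$ | |
| $\tanh x$ | $1 - \tanh^2 x$ | $\ln \cosh x$ | |
| $\operatorname{cotanh} x$ | $1 - \operatorname{cotanh}^2 x$ | $\ln \lvert \sinh x \rvert$ | $x \neq 0$ |
| $\operatorname{arcsinh} x$ | $\frac{1}{\sqrt{1+x^2}}$ | $x \operatorname{arcsinh} x - \sqrt{x^2+1}$ | |
| $\operatorname{arccosh} x$ | $\frac{1}{\sqrt{x^2-1}}$ | $x \operatorname{arccosh} x - \sqrt{x^2-1}$ | $x > 1$ |
| $\operatorname{arctanh} x$ | $\frac{1}{1-x^2}$ | $x \operatorname{arctanh} x + \ln \sqrt{1-x^2}$ | $\lvert x \rvert < 1$ |
| $\operatorname{arccotanh} x$ | $\frac{1}{1-x^2}$ | $x \operatorname{arccotanh} x + \ln \sqrt{x^2-1}$ | $\lvert x \rvert > 1$ |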

Appendix B

Generalized Integrals

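$$\int_{-1}^{1} \frac{1}{\sqrt{1 - t^2}}\,dt = \pi$$

$$\int_0^\infty t^{x-1} e^{-t}\,dt = \Gamma(x)$$

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/2}\,dt = 1$$

$$\int_{-\infty}^{\infty} e^{-(at^2 + 2bt + c)}\,dt = \sqrt{\frac{\pi}{a}}\; e^{\frac{b^2 - ac}{a}}, \quad a > 0$$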
