+ All Categories
Home > Documents > M1 Numerical Analysis Course

M1 Numerical Analysis Course

Date post: 07-Dec-2014
Category:
Upload: franco-nelson
View: 24 times
Download: 1 times
Share this document with a friend
Description:
useful
205
Transcript
Page 1: M1 Numerical Analysis Course

Numerical AnalysisM1 SMA International

Ecole Centrale de Nantes

Anthony NOUY

[email protected]

Oce : F231

Page 2: M1 Numerical Analysis Course
Page 3: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Part I

Introduction

1 Origin of problems in numerical analysis

2 References

Page 4: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Part I

Introduction

1 Origin of problems in numerical analysis

2 References

Page 5: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Origin of problems in numerical analysis I

How to interpret the reality with a computer language: from a continuousworld to a discrete world.

Numerical solution of a dierential equation

Find u : x ∈ Ω 7→ u(x) such that

A(u) = b

Example 1 (1D diusion equation, beam in traction, ...)

− d

dx(α

du

dx) = b(x) for x ∈ Ω = (0, 1), u(0) = u(1) = 0

Page 6: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Origin of problems in numerical analysis II

Approximation (from a continuous to a discrete representation)

Represent a function u on a (nite-dimensional) approximation space :

u(x) =n∑i=1

uihi (x)

The solution is then represented by u = (u1, . . . , un) ∈ Rn.

For the denition of the expansion, dierent alternatives such as methodsbased on a weak formulation of the problem.

Example 2 (Galerkin approximation)

Find u ∈ V = v : (0, 1)→ R; v(0) = v(1) = 0 such that∫Ω

dv

dxαdu

dxdx =

∫Ω

v b dx ∀v ∈ V

and replace function space V by approximation spaceVn = v(x) =

∑n

i=1vihi (x) ⊂ V

Page 7: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Origin of problems in numerical analysis III

If A is a linear operator, the initial continuous equation is then transformed into

Linear systems of equations

Find u ∈ Rn such thatAu = b

where A ∈ Rn×n is a matrix and b ∈ Rn a vector.

In order to construct the system of equation (matrix A and right-hand-side b):

Numerical Integration ∫Ω

f (x) dx ≈K∑k=1

ωk f (xk)

If A is a nonlinear operator:

Page 8: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Origin of problems in numerical analysis IV

Nonlinear system of equations

Find u ∈ Rn such thatA(u) = b

where A : u ∈ Rn 7→ A(u) ∈ Rn.

Remedy: iterative solution techniques which transform the solution of anonlinear equation into solution of linear equations.

Example 3

− d

dx(α(x , u)

du

dx) = b(x , u) for x ∈ Ω = (0, 1), u(0) = u(1) = 0

Eigenproblems

Find (u, λ) ∈ Cn × C such that

Au = λu or Au = λBu

where A,B ∈ Cn×n are matrices.

Page 9: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Origin of problems in numerical analysis V

Example 4 (Eigenmodes of a beam)

Wave equation: solution u(x , t) such that

− ∂

∂x(α∂u

∂x) + ρ

∂2u

∂t2= 0 for x ∈ Ω = (0, 1), u(0, t) = u(1, t) = 0

for which we search solutions of the form u(x , t) = w(x) cos(ωt):

w ∈ Vn,

∫Ω

∂v

∂xα∂w

∂xdx = ω2

∫Ω

ρv w dx ∀v ∈ Vn

Ordinary dierential equations in time

d

dtu(t) + A(u(t); t) = b(t)

Page 10: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

Part I

Introduction

1 Origin of problems in numerical analysis

2 References

Page 11: M1 Numerical Analysis Course

Origin of problems in numerical analysis References

References for the course

G. Allaire and S. M. Kaber.

Numerical linear algebra.Springer, 2007. → materials for chapters 1 (Linear Algebra), 2 (Linearsystems), 3(Eigenvalues)

K. Atkinson and W. Han.

Theoretical Numerical Analysis: A Functional Analysis Framework.Springer, 2009.→ materials for chapters 4 (Nonlinear equations), 5(Approximation/Interpolation)→ a quite abstract introduction to numerical analysis (very instructive),with an introduction to functional analysis

E. Suli and D. Mayers.

An Introduction to Numerical Analysis.Cambridge University Press, 2003.→ a clear and simple presentation of all the ingredients of the course

G. Allaire.

Numerical Analysis and Optimization.Cambridge University Press, 2007.→ additional material for numerical solution of PDE and optimizationproblems→ a natural continuation of the course

Page 12: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Part II

Linear algebra

3 Matrices

4 Reduction of matrices

5 Vector and matrix norms

Page 13: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Part II

Linear algebra

3 Matrices

4 Reduction of matrices

5 Vector and matrix norms

Page 14: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Vector space

Let V be a vector space with nite dimension n, on the eld K (R or C).Let E = e1, . . . , en be a basis of V . A vector v ∈ V admits a uniquedecomposition

v =n∑i=1

viei

where the (vi )ni=1 are the components of v on the basis E . When a basis is

chosen and when there is no ambiguity, we can identify V to Kn (Rn or Cn)and let v = (vi )

ni=1, represented by the column vector

v =

v1...vn

We denote respectively by vT and vH the transpose and conjugate transpose ofv , which are the following row vectors

vT =(v1 . . . vn

)vH =

(v1 . . . vn

)where a denotes the complex conjugate of a.

Page 15: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Canonical inner product

We denote by (·, ·) : V × V → K the canonical inner product dened for allu, v ∈ V by

(u, v) = uT v = vTu =n∑i=1

uivi if K = R

(u, v) = uHv = vHu =n∑i=1

uivi if K = C

It is called euclidian inner product if K = R and hermitian inner product ifK = C.

Page 16: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Orthogonality

Orthogonality on a vector space V must be thought with respect to an innerproduct (·, ·). If not mentioned, we classically consider the canonical innerproduct.Two vectors u, v ∈ V are said orthogonal with respect to inner product (·, ·) ifand only if (u, v) = 0.A vector v is said orthogonal to a linear subspace U ⊂ V , which is denotedv ⊥ U, if and only if (v , u) = 0 for all u ∈ U. Two linear subspaces U ⊂ V andU ′ ⊂ V are said orthogonal, and it is denoted U ⊥ U ′, if

(u, u′) = 0 ∀u ∈ U, ∀u′ ∈ U ′

For a given subspace U ⊂ V , we denote by U⊥ its orthogonal complement,which is the largest subspace orthogonal to U. The orthogonal complement ofa vector v ∈ V is denoted by v⊥.

Page 17: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Matrices

Let V and W be two vector spaces with dimension n and m respectively, withbases E = (ei )

ni=1 and F = (fi )

mi=1. A linear map A : V →W , relatively to

those bases, is represented by a matrix A with m rows and n columns

A =

a11 a12 . . . a1na21 a22 . . . a2n...

......

am1 am2 . . . amn

where the coecients aij are such that

Aej =m∑i=1

aij fi , 1 6 j 6 n

We denote (A)ij = aij . The j-th column of A represents the vector Aej in thebasis F .

Denition 5

The set of matrices with m rows and n columns with entries in the eld K is avector space denotedMm,n(K) or Km×n.

Page 18: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Transpose

We denote AH the adjoint (or conjugate transpose) matrix of a complex matrixA = (aij ) ∈ Cm×n, dened by

(AH)ij = aji

We denote AT the transpose of a real matrix A = (aij ) ∈ Rn×m, dened by

(AT )ij = aji

We have the following characterization of AH and AT :

(Au, v) = (u,AHv) ∀u ∈ Cn, v ∈ Cm

(Au, v) = (u,AT v) ∀u ∈ Rn, v ∈ Rm

Page 19: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Product

To the composition of two linear maps corresponds the multiplication of theassociated matrices. If A = (aik) ∈ Km×q and B = (bkj ) ∈ Kq×n, the productAB ∈ Km×n is dened by

(AB)ij =

q∑k=1

aikbkj

We have(AB)T = BTAT , (AB)H = BHAH

The set of square matricesMn,n(K) is simply denotedMn(K) = Kn×n. In thefollowing, unless it is mentioned, we only consider square matrices.

Page 20: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Inverse

We denote by In the identity matrix on Kn×n, associated with the identity mapfrom V to V . If there is no ambiguity, we simply denote In = I and

(I )ij = δij

where δij is the Knonecker delta.A matrix is invertible if there exists a matrix denoted A−1 (unique if it exists)and called the inverse matrix of A, such that AA−1 = A−1A = I . A matrixwhich is not invertible is said singular. If A and B are invertible, we have

(AB)−1 = B−1A−1, (AT )−1 = (A−1)T ≡ A−T , (AH)−1 = (A−1)H ≡ A−H

Page 21: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Particular matrices

Denition 6

A matrix A ∈ Cn×n is said

Hermitian if A = AH

Normal if AAH = AHA

Unitary if AAH = AHA = I

Denition 7

A matrix A ∈ Rn×n is said

Symmetric if A = AT

Orthogonal if AAT = ATA = I

Page 22: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Particular matrices

A matrix A ∈ Kn×n is said diagonal if aij = 0 for i 6= j and we denote

A = diag(aii ) = diag(a11, . . . , ann) =

a11 0 . . . 0

0. . .

. . ....

.... . .

. . . 00 . . . 0 ann

A matrix A is said upper triangular if aij = 0 for i > j :

A =

a11 a12 . . . a1n0 a22 . . . a2n...

. . .. . .

...0 . . . 0 ann

A matrix A is said lower triangular if aij = 0 for j > i :

A =

a11 0 . . . 0

a21 a22. . .

......

.... . . 0

an1 an2 . . . ann

Page 23: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Properties of triangular matrices

Let Ln ⊂ Kn×n be the set of lower triangular matrices, and Un ⊂ Kn×n be theset of upper triangular matrices.

Theorem 8

If A,B ∈ Ln, then AB ∈ Ln

If A,B ∈ Un, then AB ∈ Un

A ∈ Ln (or Un) is invertible if and only if all its diagonal terms are nonzero.

If A ∈ Ln, A−1 ∈ Ln (if it exists)

If A ∈ Un, A−1 ∈ Un (if it exists)

Page 24: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Trace

Denition 9

The trace of a matrix A ∈ Kn×n is dened as

tr(A) =n∑i=1

aii

Property 10

tr(A + B) = tr(A) + tr(B), tr(AB) = tr(BA)

Page 25: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Determinant

Let Sn denote the set of permutations of 1, . . . , n. For σ ∈ Sn, we denote bysign(σ) the signature of the permutation, with sign(σ) = +1 (resp. −1) if σ isan even (resp. odd) permutation of 1, . . . , n.

Denition 11

The determinant of a matrix A ∈ Kn×n is dened as

det(A) =∑σ∈Sn

sign(σ)aσ(1)1 . . . aσ(n)n

Property 12

det(AB) = det(BA) = det(A)det(B)

Page 26: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Image, Kernel I

Denition 13

The image of A ∈ Km×n is a linear subspace of Km dened by

Im(A) = Av ; v ∈ Kn

The rank of a matrix A, denoted rank(A), is the dimension of Im(A):

rank(A) = dim(Im(A)) 6 min(m, n)

Denition 14

The kernel of A ∈ Km×n is a linear subspace of Kn dened by

Ker(A) = v ∈ Kn;Av = 0

The dimension of Ker(A) is called the nullity of A.

Property 15

dim(Im(A)) + dim(Ker(A)) = n

Page 27: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Image, Kernel II

Property 16

For A ∈ Rm×n,

Ker(AT ) + Im(A) = Rm, Ker(AT ) = Im(A)⊥

Ker(A) + Im(AT ) = Rn, Ker(A) = Im(AT )⊥

Proof.

Let us prove that Ker(AT ) = Im(A)⊥, which implies Ker(AT ) + Im(A) = Rm.First, u ∈ Ker(AT )⇒ ATu = 0 ⇒ vTATu = 0 ∀v ⇒ uT y = 0 ∀y ∈ Im(A) ⇒Ker(AT ) ⊂ Im(A)⊥.Secondly, u ∈ Im(A)⊥ ⇒ uTAv = 0 ∀v ⇒ vT (ATu) = 0 ∀v ⇒ ATu = 0 ⇒Im(A)⊥ ⊂ Ker(AT ).

Exercice.

Finish the proof.

Page 28: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Eigenvalues and eigenvectors I

Denition 17

Eigenvalues λi = λi (A), 1 6 i 6 n, of a matrix A ∈ Kn×n are the n roots of itscharacteristic polynomial

pA : λ ∈ C 7→ pA(λ) = det(A− λI )

The eigenvalues may be real or complex. An eigenvalue is said of multiplicity kif it is a root of pA with multiplicity k. The spectrum of matrix A is thefollowing subset of the complex plane

sp(A) = λi (A)ni=1

We have

tr(A) =n∑i=1

λi (A), det(A) =n∏i=1

λi (A)

Page 29: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Eigenvalues and eigenvectors II

Denition 18

The spectral radius ρ(A) of a matrix A is dened by

ρ(A) = max16i6n

|λi (A)|

Property 19

λ ∈ sp(A) if and only if the following equation has (at least) a nontrivialsolution v ∈ Cn\0:

Av = λv

Denition 20

For λ ∈ sp(A), a vector v satisfying Av = λv is called an eigenvector of Aassociated with λ. The linear subspace v ∈ Kn;Av = λv (with dimension atleast one) is called the eigenspace associated with λ.

Page 30: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Part II

Linear algebra

3 Matrices

4 Reduction of matrices

5 Vector and matrix norms

Page 31: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Reduction of matrices

Let V be a vector space with dimension n and A : V → V a linear map on V .Let A be the matrix associated with A, relatively to the basis E = (ei )

ni=1 of V .

Relatively to another basis F = (fi )ni=1 of V , the application A is associated

with another matrix B such that

B = P−1AP

where P is an invertible matrix whose j-th column is composed by thecomponents of fj on the basis E .

Denition 21

Matrices A and B are said similar when they represent the same linear map intwo dierent basis, i.e. when there exists an invertible matrix P such thatB = P−1AP.

Page 32: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Theorem 22 (Triangularization)

For A ∈ Cn×n, there exists a unitary matrix U such that U−1AU is a triangularmatrix, called the Schur form of A (if upper triangular).

Remark.

The previous theorem says that there exists a nested sequence of A-invariantsubspaces 0 = V0 ⊂ V1 ⊂ . . . ⊂ Vn = Cn and there exists an orthonormalbasis of Cn such that Vi is the span of the rst i basis vectors.

Theorem 23 (Diagonalization)

For a normal matrix A ∈ Cn×n, i.e. such that AHA = AAH , there exists aunitary matrix U such that U−1AU is diagonal.

For a symmetric matrix A ∈ Rn×n, there exists an orthogonal matrix Osuch that O−1AO is diagonal.

Page 33: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Singular values and vectors

Denition 24

The singular values of A ∈ Km×n are the eigenvalues of√AHA ∈ Kn×n.

Singular values of A are real non-negative numbers.

Denition 25

σ ∈ R+ is a singular value of A if and only if there exists normalized vectorsu ∈ Km and v ∈ Kn such that we have simultaneously

Av = σu and AHu = σv

u and v are respectively called the left and right singular vectors of Aassociated with singular value σ.

Page 34: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Singular value decomposition (SVD) I

Theorem 26

For A ∈ Km×n, there exist two orthogonal (if K = R) or unitary (if K = C)matrices U ∈ Km×m and V ∈ Kn×n such that

A = USVH

where S = diag(σi ) ∈ Rm×n is a diagonal matrix, with σi the singular values ofA. The columns of U are the left singular vectors of A, and the columns of Vare the right singular vectors of A.

If n = m, S = diag(σi ) =

σ1. . .

σm

. If n 6= m,

S = diag(σi ) ∈ Rm×n must be interpreted as follows (0kl is a k × l matrix withzero entries):

σ1. . . 0m(n−m)

σn

if n > m,

σ1

. . .

σn0(m−n)n

if n < m,

Page 35: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Truncated Singular Value Decomposition (SVD)

The SVD of A can be written

A = USVH =

min(n,m)∑i=1

σiuivHi

After ordering the singular values by decreasing values (σ1 ≥ σ2 ≥ . . .), matrixA can be approximated by a rank-K matrix AK obtained by a truncation of theSVD:

AK =K∑i=1

σiuivHi

We have the following error estimate:

‖A− AK‖F2‖A‖F

=

min(n,m)∑i=K+1

σ2i

Page 36: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Illustration: SVD for data compression

Initial image (778× 643) Singular values Rank-10 SVD

Page 37: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Illustration: SVD for data compression

Initial image (778× 643) Singular values Rank-20 SVD

Page 38: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Illustration: SVD for data compression

Initial image (778× 643) Singular values Rank-30 SVD

Page 39: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Illustration: SVD for data compression

Initial image (778× 643) Singular values Rank-40 SVD

Page 40: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Illustration: SVD for data compression

Initial image (778× 643) Singular values Rank-50 SVD

Page 41: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Illustration: SVD for data compression

Initial image (778× 643) Singular values Rank-100 SVD

Page 42: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Part II

Linear algebra

3 Matrices

4 Reduction of matrices

5 Vector and matrix norms

Page 43: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Vector norms

Denition 27

A norm on vector space V is an application ‖ · ‖ : V → R+ verifying

‖v‖ = 0 if and only if v = 0

‖αv‖ = |α|‖v‖ for all v ∈ V and ∀α ∈ K‖u + v‖ 6 ‖u‖+ ‖v‖ for all u, v ∈ V (triangle inequality)

Example 28 (For V = Kn)

(2-norm) ‖v‖2 =(∑n

i=1|vi |2

)1/2(1-norm) ‖v‖1 =

∑n

i=1|vi |

(∞-norm) ‖v‖∞ = maxi∈1,...,n |vi |

(p-norm) ‖v‖p =(∑n

i=1|vi |p

)1/pfor p > 1.

Page 44: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Useful inequalities

(·, ·) denote the canonical inner product.

Theorem 29 (Cauchy-Schwartz inequality)

|(u, v)| 6 ‖u‖2‖v‖2

Theorem 30 (Hölder's inequality)

Let 1 ≤ p, q ≤ ∞ such that 1

p+ 1

q= 1, then

|(u, v)| 6 ‖u‖p‖v‖q

Theorem 31 (Minkowski inequality)

Let 1 6 p 6∞, then‖u + v‖p 6 ‖u‖p + ‖v‖p

Minkowski inequality is in fact the triangular inequality for the norm ‖ · ‖p.

Page 45: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Matrix norms I

Denition 32

A norm on Km×n is a map ‖ · ‖ : Km×n → R+ which veries

‖A‖ = 0 is and only if A = 0

‖αA‖ = |α|‖A‖ for all A ∈ Km×n and ∀α ∈ K‖A + B‖ 6 ‖A‖+ ‖B‖ for all A,B ∈ Km×n (triangle inequality)

For square matrices (n = m), a matrix norm is a norm which satises thefollowing additional inequality

‖AB‖ 6 ‖A‖‖B‖ for all A ∈ Kn×n, B ∈ Kn×n

An important class of matrix norms is the class of subordinate matrix norms.

Denition 33 (subordinate matrix norm)

Given norms ‖ · ‖ on Kn and Km, we can dene a natural norm on Km×n,subordinate to the vectors norms, and dened by

‖A‖ = maxv∈Cn :v 6=0

‖Av‖‖v‖ = max

v∈Cn :‖v‖61

‖Av‖ = maxv∈Cn :‖v‖=1

‖Av‖

Page 46: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Matrix norms II

Example 34

When considering classical vector norms on Kn, we have the followingcharacterization of the subordinate norms of a square matrix A ∈ Kn×n:

‖A‖1 = maxv‖Av‖1‖v‖1 = maxj

∑i |aij |

‖A‖∞ = maxv‖Av‖∞‖v‖∞ = maxi

∑j |aij |

‖A‖2 = maxv‖Av‖2‖v‖2 =

√ρ(AHA) =

√ρ(AAH) = ‖AH‖2.

Note that ‖A‖2 corresponds to the dominant singular value of A.

Property 35

For all unitary matrix U (i.e. UUH = I ), we have

‖A‖2 = ‖AU‖2 = ‖UA‖2 = ‖UHAU‖2

If A is normal (i.e. AAH = AHA), then ‖A‖2 = ρ(A).

Page 47: M1 Numerical Analysis Course

Matrices Reduction of matrices Vector and matrix norms

Matrix norms III

Theorem 36

Let A be a square matrix and ‖ · ‖ an arbitrary matrix norm. Then

ρ(A) 6 ‖A‖

For ε > 0, there exists at least one subordinate matrix norm such that

‖A‖ 6 ρ(A) + ε

Page 48: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 49: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods

The aim is to introduce dierent strategies for the solution of a system of linearequations

Ax = b

with A ∈ Rn×n, b ∈ Rn.

Page 50: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 51: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods

Condition number

Let consider the following two systems of equations10 7 8 77 5 6 58 6 10 97 5 9 10

x =

32233331

⇒ x =

1111

10 7 8 77 5 6 58 6 10 97 5 9 10

x =

32.122.933.130.9

⇒ x =

9.2−12.64.5−1.1

We observe that a little modication of the right-hand side leads a largemodication in the solution.If an error is made on the input data (here the right-hand side), the error onthe solution may be drastically amplied.This phenomenon is due to a bad conditioning of the matrix A. It reveals thatfor badly conditioned matrices, the solution of systems of equations obtainedwith nite precision computers has to be considered carefully or even notconsidered as a good solution.

Page 52: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods

Denition 37

Let A ∈ Kn×n be an invertible matrix and let ‖ · ‖ be a matrix normsubordinate to the vector norm ‖ · ‖. The condition number of A is dened as

cond(A) = ‖A‖‖A−1‖

Let b ∈ Kn be the right-hand side of a system and let δA ∈ Kn×n and δb ∈ Kn

be perturbations of matrix A and vector b.

Property 38

If x and xε are solutions of the following systems

Ax = b, Aεxε = bε,

with ‖A− Aε‖ = O(ε) and ‖b − bε‖ = O(ε), then

‖x − xε‖‖x‖ 6 cond(A)

(‖A− Aε‖‖A‖ +

‖b − bε‖‖b‖

)+ O(ε2)

Page 53: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods

Property 39

For every matrix A and every matrix norm, cond(A) > 1,cond(A) = cond(A−1), cond(αA) = cond(A), ∀α 6= 0.

For every matrix A, the condition number cond2(A) = ‖A‖2‖A−1‖2associated with the 2-norm veries

cond2(A) =maxi σi (A)

mini σi (A)

where the σi (A) are the singular values of A.

For a normal matrix A,

cond2(A) =maxi |λi (A)|mini |λi (A)|

where the λi (A) are the eigenvalues of A.

For unitary or orthogonal matrix A, the condition number cond2(A) = 1.

The condition number cond2(A) is invariant trough unitarytransformation: cond2(A) = cond2(AU) = cond2(UA) = cond2(UHAU)for every unitary matrix U.

Page 54: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 55: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Principle of direct methods I

For solving

Ax = b,

direct methods consist in determining an invertible matrix M such that

MAx = Mb

is an upper triangular system. This is called the elimination step. Then, asimple backward substitution can be performed to solve this triangular system.

Do not compute the inverse !!!

In practice, the solution x of Ax = b is not obtained by rst computing theinverse A−1 and then computing the matrix-vector product A−1b. Indeed, itwould be equivalent to solving n systems of linear equations.

For simplicity, we use sometimes the notation M−1x but the inverse is nevercomputed in practise. This operation corresponds to the solution of a system ofequations (generally easy due to properties of M: diagonal, triangular).

Page 56: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 57: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Triangular systems of equations I

If A is lower triangular, the systema11 0 . . . 0

a21 a22. . .

......

.... . . 0

an1 an2 . . . ann

x1

...xn

=

b1...bn

is solved by a forward substitution

Algorithm 40 (Forward substitution for lower triangular system)

Step 1. a11x1 = b1

Step 2. a22x2 = −a21x1...

Step n. annxn = bn −∑n−1

j=1anjbj

Page 58: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Triangular systems of equations II

If A is upper triangular, the systema11 a12 . . . a1n0 a22 . . . a2n...

. . .. . .

...0 . . . 0 ann

x1

...xn

=

b1...bn

is solved by a backward substitution

Algorithm 41 (Backward substitution for upper triangular system)

Step 1. annxn = bn

Step 2. an−1,n−1xn−1 = −an−1,nxn...

Step n. a11x1 = b1 −∑n

j=2a1jbj

Page 59: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 60: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination I

Denition 42 (Pivoting matrix)

A pivoting matrix P(i , j), associated with a linear mapping written in a basisE = (ei )

ni=1, is dened as follows

P(i , j) = I − (ei − ej )(ei − ej )H

For A ∈ Kn×n, P(i , j)A is the matrix A with permuted lines i and j , andAP(i , j) is the matrix A with permuted columns i and j . Let us note thatP(i , i) = I .

We now describe the Gauss elimination procedure

Step 1.

Let A = A1 = (a1ij ). Select a nonzero element a1i∗1 of the rst column andpermute the lines 1 and i∗. Let P1 = P(1, i∗) and set

A1 = P1A1 = (a1ij )

Page 61: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination II

Let introduce the matrix

E1 =

1

− a121a111

. . .

.... . .

− a1n1a111

1

such that

A2 = E1A1 =

a111 a112 . . . a11n0 a222 . . . a22n...

......

0 a2n2 . . . a2nn

Step 2.

We have det(A2) = det(E1P1A1) = det(E1)det(P1)det(A) = ±det(A)(−det(A) if a line permutation has been made, +det(A) if not). Therefore A2

is invertible, and so is the submatrix (A2)ij , 2 6 i , j 6 n. We can then operateas in step 1 for this submatrix for eliminating the subdiagonal elements of

Page 62: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination III

column 2: introduce a permutation matrix P2 = P(2, i∗), with i∗ > 2, and aline operation matrix E2, and let A2 = P3A2 and A3 = E3A2.

Step k − 1.After k − 1 steps, we have the matrix

Ak = Ek−1Pk−1 . . .E1P1A1 =

ak11 ak12 . . . . . . . . . ak1nak22 . . . . . . . . . ak2n

. . ....

akkk . . . akkn...

...

aknk . . . aknn

After an eventual pivoting with a pivoting matrix Pk , we dene Ak = PkAk andAk+1 = Ek Ak with

Page 63: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination IV

Ek =

1. . .

1

−akk+1,k

akkk

. . .

.... . .

− aknk

akkk

1

Last step

After n − 1 steps, by we obtain an upper triangular matrix

An = En−1Pn−1 . . .E1P1A

The invertible matrix M = En−1Pn−1 . . .E1P1 is then an invertible matrix suchthat MA is upper triangular.

Page 64: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination V

Remark. Choice of pivoting

In order to avoid dramatic roundo errors with nite precision computers, weadopt one of the following pivoting strategies.

Partial pivoting. At step k, we select Pk = P(k, i∗) such that|aki∗k | = max

k6i6n|akik |

Total pivoting. At step k, we select i∗ and j∗ such that|aki∗j∗ | = max

i>k,j6n|akij | and we permute lines and columns by dening

Ak = P(k, i∗)AkP(j∗, k).

Page 65: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination VI

Remark. Computing the determinant of a matrix

The Gauss elimination is an ecient technique for computing the determinantof a matrix. Indeed,

det(A) = det(An)det(M)−1 = ±n∏i=1

anii

where the sign depends on the number of pivoting operations that have beenperformed.

Remark.

In practice, for solving a system Ax = b, we don't compute the matrix M. Werather operate simultaneously on b by computing

Mb = bn = En−1Pn−1 . . .E1P1b

Then, we solve the triangular system MAx = MB, or equivalently Anx = bn.

Page 66: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Gauss elimination VII

Computational work of Gauss Elimination

O(2

3n3)

For an arbitrary matrix, it seems that this computational work in O(n3) is nearthe optimal that we can expect. That is the reason why Gauss elimination canbe used when no additional information is given on the matrix.

Theorem 43

For A ∈ Kn (inversible or not), there exists at least one invertible matrix Msuch that MA is an upper triangular matrix.

Proof.

For A invertible, the Gauss elimination procedure is a constructive proof for thistheorem. Otherwise, the matrix A is singular if and only there exists a matrixAk with elements akik = 0 for k 6 i 6 n. In this case, we can set Ek = I andPk = I at step k of the Gauss elimination and go to the next step.

Page 67: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 68: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

LU factorization I

The LU factorization of a matrix consists in constructing lower and uppertriangular matrices L and U such that A = LU. In fact, this factorization isobtained by the Gauss elimination procedure.Let us consider the Gauss elimination without pivoting, i.e. by letting Ak = Ak .It is possible if at step k, akkk 6= 0. We then let

M = En−1 . . .E1

and obtainMA = U

where U is the desired upper triangular matrixa111 a112 . . . a11n

a222 . . . a22n. . .

...annn

M being a product of lower triangular matrices, it is a lower triangular matrixand so is its inverse M−1. We then have the desired decomposition with

L = M−1 = E−11 . . .E−1n−1

Page 69: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

LU factorization II

Matrix L = (lij ) is directly obtained from matrices Ek

Ek =

1. . .

1

−lk+1,k

. . ....

. . .

−lnk 1

, E−1k =

1. . .

1

lk+1,k

. . ....

. . .

lnk 1

Page 70: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

LU factorization III

Theorem 44

Let A ∈ Kn×n be such that the diagonal submatricesa11 . . . a1k...

...ak1 . . . akk

∈ Kk×k are invertible. Then, there exists a lower triangular

matrix L and an upper triangular matrix U such that

A = LU

If we further impose that the diagonal elements of L are equal to 1, thisdecomposition is unique.

Proof.

The condition on the invertibility of submatrices ensures that at step k, thediagonal term akkk is nonzero and therefore that pivoting can be omitted.

Page 71: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 72: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Cholesky factorization I

Theorem 45

If A ∈ Rn×n is a symmetric denite positive matrix, there exists at least onelower triangular matrix B = (bij ) ∈ Rn×n such that

A = BBT

If we further impose that the diagonal elements bii > 0, the decomposition isunique.

Page 73: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Cholesky factorization II

Proof.

We simply show that the diagonal submatrices ∆k = (aij ), 1 6 i , j 6 k, arepositive denite. Therefore, they are invertible and there exists a unique LUfactorization A = LU such that L has unit diagonal terms. Since the ∆k arepositive denite, we have

∏k

i=1uii = det(∆kk) > 0, for all k > 1. We then

dene the diagonal matrix D = diag(√uii ) and we write

A = (LΛ)(Λ−1U) = BC

where B = LΛ and C = Λ−1U have both diagonal terms bii = cii =√uii . The

symmetry of matrix A imposes that BC = CTBT and therefore

CB−T =

1 × . . . ×

1 . . . ×. . .

...1

=

1× 1...

. . .

× . . . × 1

= B−1CT

and this last equality is only possible if CB−T = I ⇒ C = BT . (Prove theuniqueness of the decomposition).

Page 74: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 75: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Householder matrices

Denition 46

For v a nonzero vector in Cn, we introduce the following matrix, calledHouseholder matrix associated with v :

H(v) = I − 2vvH

vHv

We will consider, although incorrect, that the identity I is a Householder matrix.

Theorem 47

For x = (xi )ni=1 ∈ Cn, there exists two householder matrices H such that

(Hx)i = 0 for i > 2.

Proof.

Denoting by e1 the rst basis vector of Cn, one veries that the twohouseholder matrices H(v) are associated with the vectors v = x ± ‖x‖2e iαe1,where α ∈ R is the argument of x1 ∈ C, i.e. x1 = |x1|e iα, and we have

H(v)x = ∓‖x‖2e1

Page 76: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Householder method I

The Householder method for solving Ax = b consists in nding n − 1householder matrices Hin−1i=1

such that Hn−1 . . .H1A is upper triangular.Then, we solve the following triangular system by backward substitution:

Hn−1 . . .H1Ax = Hn−1 . . .H1b

Suppose that Ak = Hk−1 . . .H1A is under the form

Ak =

ak11 ak12 . . . . . . . . . ak1na222 . . . . . . . . . a22n

. . ....

akkk . . . akkn...

...

aknk . . . aknn

Let c = (ci )

n−k+1

i=1∈ Cn−k+1 be the vector with components ci = aki+k−1. There

exists a Householder matrix H(vk), with vk ∈ Cn−k+1, such that H(vk)c has

Page 77: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Householder method II

zero components except the rst one. Then, we denote vk =

(0vk

)∈ Cn and

we let Hk = H(vk) the householder matrix associated with vk . Let us note that

Hk = H(vk) =

(Ik−1 00 H(vk)

)Performing this operation for k = 1 . . . n − 1, we obtain the desired uppertriangular matrix An = Hn−1 . . .H1A.

Page 78: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

QR factorization I

The QR factorization is a matrix interpretation of the Householder method.

Theorem 48

For A ∈ Kn×n, there exist a unitary matrix Q ∈ Kn×n and an upper triangularmatrix R ∈ Kn×n such that

A = QR

Moreover, one can choose the diagonal elements of R > 0. Then, if A isinvertible, the corresponding QR factorization is unique.

Page 79: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

QR factorization II

Proof.

The previous householder construction proves the existence of an uppertriangular matrix

R = Hn−1 . . .H1A

where the Hi are householder matrices. The matrix

Q = (Hn−1 . . .H1)−1 = H−11 . . .H−1n−1 = H1 . . .Hn−1

is unitary (recall that the Hk are unitary and hermitian, i.e. H−1k = HHk = Hk).

This proves this existence of a QR decomposition. Let now denote by αi ∈ Rthe arguments of the diagonal elements rkk = |rkk |e iαk and let D = diag(e iαk ).The matrix Q = QD is still unitary and the matrix R = D−1R is still uppertriangular with all its diagonal elements greater than 0. We then have theexistence of a QR factorization A = QR with rkk > 0. We can then show theuniqueness of this decomposition (let as an exercice).

Remark.

If A ∈ Rn×n, Q,R ∈ Rn×n, with Q an orthogonal matrix.

Page 80: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 81: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Triangular systems Gauss elimination LU factorization Cholesky factorization Householder method and QR factorization Computational work

Computational complexity

With classical algorithms...

Algorithm Operations

LU O( 23n3)

Cholesky O( 13n3)

QR O( 23n3)

Page 82: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 83: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 84: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Basic iterative methods I

For the solution of a linear system of equations Ax = b, basic iterative methodsconsist in constructing a sequence xkk≥0 dened by

xk+1 = Bxk + c

from an initial vector x0. Matrix B and vector c are to be dened such that theiterative method converges towards the solution x , i.e.

limk→∞

xk = x

B and c are chosen such that I − B is invertible and such that x is the uniquesolution of x = Bx + c.

Theorem 49

Let B ∈ Kn×n. The following assertions are equivalent

(1) limk→∞ Bk = 0

(2) limk→∞ Bkv = 0 ∀v(3) ρ(B) < 1

(4) ‖B‖ < 1 for at least one subordinate matrix norm ‖ · ‖

Page 85: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Basic iterative methods II

Proof.

(1)⇒ (2). ‖Bkv‖ ≤ ‖Bk‖‖v‖ −→k→∞

0

(2)⇒ (3). If ρ(B) ≥ 1, there exists a vector v 6= 0 such that Bv = λv with|λ| ≥ 1 and then Bkv = λkv does not converge towards 0, a contradiction.(3)⇒ (4). Consequence of theorem 36(4)⇒ (1). ‖Bk‖ ≤ ‖B‖k −→

k→∞0.

Theorem 50

The following assertions are equivalent

(i) The iterative method is convergent

(ii) ρ(B) < 1

(iii) ‖B‖ < 1 for at least one subordinate matrix norm ‖ · ‖

Proof.

The iterative method is convergent if and only if limk→∞ ek = 0, withek = xk − x = Bke0. The proof then results from theorem 49.

Page 86: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 87: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Jacobi, Gauss-Seidel, Relaxation (SOR) I

We decompose A under the form

A = M − N

where M is an invertible matrix and then

Ax = b ⇔ Mx = Nx + b

and we compute the sequence

xk+1 = M−1Nxk + M−1b ≡ Bxk + c

In practice, at each iteration, we solve the system Mxk+1 = Nxk + b. Themethod is then ecient if M have a simple form (diagonal or triangular).

Denition 51

We decompose A = D − E − F where D is the diagonal part of A, −E and −Fits strict lower and upper parts.

Page 88: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Jacobi, Gauss-Seidel, Relaxation (SOR) II

Denition 52 (Jacobi)

M = D, N = E + F

Denition 53 (Gauss-Seidel)

M = D − E , N = F

Denition 54 (Successive Over Relaxation (SOR))

M = ω−1D − E , N = ω−1(1− ω)D + F

Page 89: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Convergence results I

Theorem 55

Let A a positive denite hermitian matrix, decomposed under the formA = M −N with M invertible. If the matrix (MH + N) is positive denite, thenρ(M−1N) < 1.

Page 90: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Convergence results II

Proof.

From theorem 36, we know that it suces to nd a matrix norm for which‖M−1N‖ < 1. We will show this property for the matrix norm subordinate tothe vector norm ‖v‖ =

√vHAv . Let rst note that (MH +N) is hermitian since

(MH + N)H = M + NH = A + N + NH = AH + NH + N = MH + N.

We have‖M−1N‖ = ‖I −M−1A‖ = sup

‖v‖=1

‖v −M−1Av‖

Denoting w = M−1Av , we have, for v such that ‖v‖ = 1,

‖v − w‖2 = 1− vHAw − wHAv + wHAw

= 1− wHMHw − wHMw + wHAw = 1− wH(MH + N)w︸ ︷︷ ︸>0

Therefore ‖v‖ = 1⇒ ‖v −M−1Av‖ < 1. The functionv ∈ Cn 7→ ‖v −M−1Av‖ ∈ R is continuous on the unit sphere, which is acompact set, and therefore the supremum is reached.

Page 91: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Convergence results III

Theorem 56 (Sucient condition for convergence of relaxation)

If A is hermitian positive denite, relaxation method converges if 0 < ω < 2.

Proof.

We show that MH + N = 2−ωω

D. Since A is denite positive, we have for the

canonical basis vectors vi , vHi Avi = vHi Dvi > 0. Matrix MH + N is then

hermitian positive denite if and only if 0 < ω < 2, and the proof ends withtheorem 55.

Theorem 57 (Necessary condition for convergence of relaxation)

The spectral radius of the matrix Bω = M−1N of the relaxation method veries

ρ(Bω) ≥ |ω − 1|

and therefore, relaxation method converges only if 0 < ω < 2.

Page 92: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Convergence results IV

Proof.

We haveBω = (ω−1D − E)−1(ω−1(1− ω)D + F )

and then

det(Bω) = (1− ω)n =n∏i=1

λi (Bω)

Then

ρ(Bω) ≥

(n∏i=1

λi (Bω)

)1/n

= |1− ω|

Page 93: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 94: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Projection methods I

We consider a real system of equations Ax = b. Projection techniques consistsin searching an approximate solution x in a subspace V of Rn. Theapproximate solution is then dened by

x ∈ V, b − Ax ⊥ W

where W is a subspace of Rn with the same dimension of V. The approximatesolution is then dened by orthogonality constraints on the residual. x is calleda projection of x onto the subspace V and parallel to subspace W. The caseV =W corresponds to an orthogonal projection and the orthogonalityconstraint is called Galerkin orthogonality. The case V 6=W corresponds to anoblique projection and the orthogonality constraint is called Petrov-Galerkinorthogonality.Let V = (v1, . . . , vm) and W = (w1, . . . ,wm) dene bases of V and W, theapproximation is then dened by x = Vy , with y ∈ Rm such that

WTAVy = WTb ⇒ y = (WTAV )−1WTb

Page 95: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Projection methods II

Projection method

Until convergence

1 Select V = (v1, . . . , vm) and W = (w1, . . . ,wm)

2 r = b − Ax

3 y = (WTAV )−1WT r

4 x = x + Vy

Subspaces must be chosen such that WTAV is nonsingular. Two importantparticular choices satises this property.

Theorem 58

WTAV is nonsingular for either one the following conditions

A is positive denite and V =WA is nonsingular and W = AV.

Page 96: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Projection methods III

Theorem 59

Assume that A is symmetric denite positive and V =W. Then, x ∈ V is suchthat Ax − b ⊥ V if and only if

‖x − x‖2A = minx∈V‖x − x‖2A, ‖x‖2A = xTAx

Theorem 60

Let A a nonsingular matrix and W = AV. Then, x ∈ V is such thatAx − b ⊥ W if and only if it minimizes the 2-norm of the residual

‖b − Ax‖2 = minx∈V‖b − Ax‖2

Page 97: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Basic one-dimensional projection algorithms I

Basic one-dimensional projection schemes consist in selecting V and W withdimension 1. Let us denote V = spanv and W = spanw. Denotingr = b − Axk the residual at iteration k, the next iterate is dened by

xk+1 = xk + αv , α =(w , r)

(w ,Av)=

wT r

wTAv

Denition 61 (Steepest descent)

We let v = r and w = r . We then have

xk+1 = xk + αr , α =(r , r)

(Ar , r)

If A is symmetric positive denite matrix, xk+1 is the solution of

minα

f (xk + αr), f (x) = ‖x − x‖2A = (x − x ,A(x − x))

We note that −∇f (xk) = A(x − xk) = b − Axk = r , and thereforexk+1 = xk − α∇f (xk). It then corresponds to a steepest descent algorithm forminimizing the convex function f (x), with an optimal choice of step α.

Page 98: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Basic one-dimensional projection algorithms II

Theorem 62 (Convergence of steepest descent)

If A is symmetric positive denite matrix, the steepest descent algorithmconverges.

Denition 63 (Minimal residual)

We let v = r and w = Ar . We then have

xk+1 = xk + αr , α =(Ar , r)

(Ar ,Ar)

which is the solution ofminα‖b − A(xk + αr)‖2

Theorem 64

If A is positive denite, minimal residual algorithm converges.

Page 99: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Basic one-dimensional projection algorithms III

Denition 65 (Residual norm steepest descent)

We let v = AT r and w = Av = AAT r . We then have

xk+1 = xk + αAT r , α =(Av , r)

(Av ,Av)=‖v‖2

‖Av‖2

which is the solution of

minα

f (xk + αv), f (x) = ‖b − Ax‖2 = (Ax − b,Ax − b)

Note that −∇f (xk) = AT (b − Axk) = AT r = v . It then corresponds to asteepest descent algorithm on convex function f (x), with an optimal choice ofstep α.

Theorem 66

If A is nonsingular, residual norm steepest descent algorithm converges.

Page 100: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Part III

Systems of linear equations

6 Conditioning

7 Direct methodsTriangular systemsGauss eliminationLU factorizationCholesky factorizationHouseholder method and QR factorizationComputational work

8 Iterative methodsGeneralitiesJacobi, Gauss-Seidel, RelaxationProjection methodsKrylov subspace methods

Page 101: M1 Numerical Analysis Course

Conditioning Direct methods Iterative methods Generalities Jacobi, Gauss-Seidel, Relaxation Projection methods Krylov subspace methods

Krylov subspace methods

Krylov subspace methods are projection methods which consists in deningsubspace V as the m-dimensional Krylov subspace of matrix A, associated withr0 = b − Ax0, where x0 is an initial guess. This Krylov subspace is dened by

V = Km(A, r0) = spanr0,Ar0, . . . ,Am−1r0

The dierent Krylov subspace methods dier from the choice of space W andfrom the choice of a preconditioner. First class of methods consisting in takingW = Km(A, r0) or W = AKm(A, r0). Second class of methods consisting intaking W = Km(AT , r0).

A complete reference about iterative methods

Yousef Saad.Iterative Methods for Sparse Linear Systems.SIAM, 2003.

Page 102: M1 Numerical Analysis Course

Jacobi Givens-Householder QR Power iterations Krylov

Part IV

Eigenvalue problems

9 Jacobi method

10 Givens-Householder method

11 QR method

12 Power iterations

13 Methods based on Krylov subspaces

Page 103: M1 Numerical Analysis Course

Jacobi Givens-Householder QR Power iterations Krylov

Eigenvalue problems

The aim is to present dierent techniques for nding the eigenvalues andeigenvectors (λi , vi ) of a matrix A:

Avi = λivi

Page 104: M1 Numerical Analysis Course

Jacobi Givens-Householder QR Power iterations Krylov

Part IV

Eigenvalue problems

9 Jacobi method

10 Givens-Householder method

11 QR method

12 Power iterations

13 Methods based on Krylov subspaces

Page 105: M1 Numerical Analysis Course

Jacobi Givens-Householder QR Power iterations Krylov

Jacobi method I

Jacobi method allows to nd all the eigenvalues of a symmetric matrix A. It iswell adapted to full matrices.There exists an orthogonal matrix O such that OTAO = diag(λ1, . . . , λn),where the λi are the eigenvalues of A, distinct or not. The Jacobi methodconsists in constructing a sequence of elementary orthogonal matrices (Ωk)k≥1such that the sequence (Ak)k≥1, dened by

Ak+1 = ΩTk AkΩk = (Ω1 . . .Ωk)TAk(Ω1 . . .Ωk) = OT

k AOk

converges towards the diagonal matrix diag(λ1, . . . , λn) (with an eventualpermutation).Each transformation Ak → Ak+1 consists in eliminating two symmetricextra-diagonal terms by a rotation. Let A = Ak and B = Ak+1. The matrix Ωk

is selected as follows

Ωk = I + (cos(θ)− 1)(epeTp + eqe

Tq ) + sin(θ)epe

Tq − sin(θ)eqe

Tp

where θ ∈ (−π/4, π/4)\0 is the unique angle such that bpq = bqp = 0. θ issolution of

cotan(2θ) =aqq − app2apq

Page 106: M1 Numerical Analysis Course

Jacobi Givens-Householder QR Power iterations Krylov

Jacobi method II

Theorem 67 (Convergence of eigenvalues)

The sequence (Ak)k≥1 obtained with the Jacobi method converges and

limk→∞

Ak = diag(λσ(i))

where σ is a permutation of 1, ..., n.

Theorem 68 (Convergence of eigenvectors)

We suppose that all eigenvalues of A are distinct. Then, the sequence (Ok)k≥1in the Jacobi method converges to an orthogonal matrix whose columns forman orthonormal set of eigenvectors of A.

Givens-Householder method I

The Givens-Householder method is adapted to the computation of selected eigenvalues of a symmetric matrix A, such as the eigenvalues lying in a given interval. It proceeds in two steps:

1 Determine an orthogonal matrix P such that P^T A P is tridiagonal, using the Householder method.

2 Compute the eigenvalues of the tridiagonal symmetric matrix with the Givens method.

Theorem 69

For a symmetric matrix A, there exists an orthogonal matrix P, a product of n − 2 Householder matrices H_k, such that P^T A P is tridiagonal: P = H_1 H_2 · · · H_{n−2}

H_1^T A H_1 = \begin{pmatrix} × & × & 0 & 0 & \cdots \\ × & × & × & × & \cdots \\ 0 & × & × & × & \cdots \\ 0 & × & × & × & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \quad H_2^T H_1^T A H_1 H_2 = \begin{pmatrix} × & × & 0 & 0 & \cdots \\ × & × & × & 0 & \cdots \\ 0 & × & × & × & \cdots \\ 0 & 0 & × & × & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \ldots

QR method I

The QR method is the most commonly used method to compute the whole set of eigenvalues of an arbitrary matrix A, even a nonsymmetric one.

QR algorithm

Let A_1 = A. For k ≥ 1, repeat until convergence:

A_k = Q_k R_k   (QR factorization)

A_{k+1} = R_k Q_k

All the matrices A_k are similar to the matrix A. Under certain conditions, A_k converges towards an upper triangular matrix (the Schur form of A), whose diagonal entries are the eigenvalues of A.
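Below is a minimal sketch of the basic (unshifted) iteration just described, added for illustration. It is not the practical algorithm, which first reduces A to Hessenberg form and uses shifts and deflation; the function name and iteration count are arbitrary choices of this sketch.

import numpy as np

def qr_eigenvalues(A, n_iter=200):
    """Unshifted QR iteration (illustrative sketch)."""
    Ak = A.astype(float).copy()
    for _ in range(n_iter):
        Q, R = np.linalg.qr(Ak)   # A_k = Q_k R_k
        Ak = R @ Q                # A_{k+1} = R_k Q_k, similar to A
    return np.diag(Ak)            # approximate eigenvalues on the diagonal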

Power iterations method I

The power iteration method allows the capture of the dominant (largest-magnitude) eigenvalue and an associated eigenvector of a real matrix A.

Power iteration algorithm

Start with an arbitrary normalized vector x^{(0)} and compute the sequence

x^{(k+1)} = \frac{A x^{(k)}}{‖A x^{(k)}‖}   and   β^{(k+1)} = (A x^{(k)}, x^{(k)})

Theorem 70

If the dominant eigenvalue is real and of multiplicity 1, the sequences (x^{(k)})_{k≥0} and (β^{(k)})_{k≥0} converge respectively towards the dominant eigenvector and eigenvalue.

Power iterations method II

Proof.

Let us prove the convergence of the method when A is symmetric. Then there exists an orthonormal basis of eigenvectors (v_1, . . . , v_n), associated with eigenvalues (λ_1, . . . , λ_n). Assume that |λ_1| > |λ_i| for all i > 1. The initial vector x^{(0)} can be decomposed on this basis: x^{(0)} = \sum_{i=1}^n a_i v_i, with a_1 ≠ 0, and then, since A v_i = λ_i v_i,

x^{(k)} = \frac{A x^{(k−1)}}{‖A x^{(k−1)}‖} = \frac{A^k x^{(0)}}{‖A^k x^{(0)}‖}

A^k x^{(0)} = \sum_{i=1}^n a_i λ_i^k v_i = a_1 λ_1^k w^{(k)},   w^{(k)} = v_1 + \sum_{i=2}^n \frac{a_i}{a_1} \left(\frac{λ_i}{λ_1}\right)^k v_i

and since w^{(k)} → v_1, we obtain

x^{(k)} = \frac{a_1 λ_1^k w^{(k)}}{‖a_1 λ_1^k w^{(k)}‖} \underset{k→∞}{\longrightarrow} sign(a_1 λ_1^k) v_1,   β^{(k)} \underset{k→∞}{\longrightarrow} (A v_1, v_1) = λ_1

Let us note that for general matrices, a proof based on the Jordan form can be given.

Power iterations method III

Exercise: power method with deflation

Under certain conditions, the power method with deflation allows one to compute the whole set of eigenvalues of a matrix. See the exercises.

Definition 71 (Inverse power method)

For an invertible matrix A, applying the power method to the matrix A^{−1} allows one to obtain the eigenvalue of A with smallest magnitude and the associated eigenvector (provided this smallest-magnitude eigenvalue has multiplicity 1).

Denition 72 (Shifted inverse power method)

The shifted inverse power method consists in applying the inverse powermethod to the shifted matrix Aσ = (A− σI ). It allows the capture of theeigenvalue (and associated eigenvector) which is the closest from the value σ.Indeed, if we denote by (vi , λi ) the eigenpairs of matrix A, Aσ has foreigenpairs (vi , λi − σ). Therefore the inverse power method on Aσ willconverge towards the eigenvalue (λi − σ) such that |λi − σ| = minj |λj − σ|.

Methods based on Krylov subspaces

A comprehensive reference on the solution of eigenvalue problems:

Yousef Saad. Numerical Methods for Large Eigenvalue Problems. SIAM, 2011.

Part V

Nonlinear equations

14 Fixed point theorem

15 Nonlinear equations with monotone operators

16 Differential calculus for nonlinear operators

17 Newton method

Solving nonlinear equations

The aim is to introduce different techniques for finding the solution u of a nonlinear equation

A(u) = b, u ∈ K ⊂ V

where K is a subset of a vector space V and A : K → V is a nonlinear mapping. We will equivalently consider the nonlinear equation

F (u) = 0, u ∈ K ⊂ V

where F : K → V .

Infinite-dimensional framework

Definition 73

A Banach space V is a complete normed vector space. That means that it is a vector space (over the real or complex field) equipped with a norm ‖ · ‖ and such that every Cauchy sequence with respect to this norm has a limit in V.

Definition 74

A Hilbert space is a Banach space V whose norm ‖ · ‖ is associated with a scalar (or hermitian) product (·, ·), with ‖v‖^2 = (v, v).

Example 75

V = R^n equipped with the natural euclidean scalar product is a finite-dimensional Hilbert space. V = C^n equipped with the natural hermitian product is a finite-dimensional Hilbert space over the complex field.

Fixed point theorem I

We consider here nonlinear problems of the form

T (u) = u, u ∈ K ⊂ V (1)

where T : K → V is a nonlinear operator.

Definition 76

A solution u of the equation T(u) = u is called a fixed point of the mapping T.

We are interested in the existence of a solution to equation (1) and in the possibility of approaching this solution by the sequence (u_k)_{k≥0} defined by

u_{k+1} = T(u_k)

Remark.

Let us note that a nonlinear equation F(u) = 0 can be recast (in different ways) in the form (1), by letting

T(u) = F(u) + u,   T(u) = αF(u) + u,   . . .

Fixed point theorem II

Definition 77

Let V be a Banach space endowed with a norm ‖ · ‖. A mapping T : K ⊂ V → V is said to be

contractive if there exists a constant α, with 0 ≤ α < 1, such that

‖T (u)− T (v)‖ ≤ α‖u − v‖ ∀u, v ∈ K

α is called the contractivity constant.

non-expansive if

‖T (u)− T (v)‖ ≤ ‖u − v‖ ∀u, v ∈ K

Lipschitz continuous if there exists a constant β ≥ 0 such that

‖T (u)− T (v)‖ ≤ β‖u − v‖ ∀u, v ∈ K

β is called the Lipschitz-continuity constant.

Fixed point theorem III

Theorem 78 (Banach fixed-point theorem)

Assume that K is a closed set in a Banach space V and that T : K → K is a contractive mapping with contractivity constant α. Then we have the following results:

There exists a unique u ∈ K such that T (u) = u

For any u_0 ∈ K, the sequence (u_k)_{k≥0} in K, defined by u_{k+1} = T(u_k), converges to u, i.e.

‖u − u_k‖ \underset{k→∞}{\longrightarrow} 0

Fixed point theorem IV

Proof.

Let us prove that (u_k) is a Cauchy sequence. We have

‖u_{k+1} − u_k‖ = ‖T(u_k) − T(u_{k−1})‖ ≤ α‖u_k − u_{k−1}‖ ≤ α^k ‖u_1 − u_0‖

For m ≥ k ≥ 1, we then have

‖u_m − u_k‖ ≤ \sum_{i=k}^{m−1} ‖u_{i+1} − u_i‖ ≤ \sum_{i=k}^{m−1} α^i ‖u_1 − u_0‖ = ‖u_1 − u_0‖ α^k \sum_{i=0}^{m−1−k} α^i = \frac{α^k (1 − α^{m−k})}{1 − α} ‖u_1 − u_0‖ ≤ \frac{α^k}{1 − α} ‖u_1 − u_0‖

Since α ∈ [0, 1), ‖u_m − u_k‖ → 0 as m, k → ∞, and therefore (u_k) is a Cauchy sequence. Since the sequence (u_k) is Cauchy in the Banach space V, it converges to some u ∈ V, and since K is closed, the limit u ∈ K. Taking the limit k → ∞ in the relation u_{k+1} = T(u_k), we obtain u = T(u) by continuity of T. Then u is a fixed point of T. For the uniqueness, suppose that u_1 and u_2 are two fixed points. Then we have

‖u2 − u1‖ = ‖T (u2)− T (u1)‖ ≤ α‖u2 − u1‖

which is possible only if u2 = u1.

Fixed point theorem V

Example 79

Let V = R and T(x) = ax + b. If a ≠ 1, the sequence x_{k+1} = T(x_k) is characterized by

x_k = a x_{k−1} + b = a^k x_0 + \frac{1 − a^k}{1 − a} b

If |a| < 1, x_k converges to \frac{b}{1 − a}, which is the unique fixed point of T. If |a| > 1, the sequence diverges. Let us note that

|T(x) − T(y)| = |a| |x − y|

and therefore T is a contractive mapping if |a| < 1.

Nonlinear equations with monotone operators I

We consider the application of the fixed point theorem to the analysis of the solvability of a class of nonlinear equations

A(u) = b,   u ∈ V

where V is a Hilbert space and A : V → V is a Lipschitz continuous and strongly monotone operator.

Definition 80 (Monotone operator)

A mapping A : V → V on a Hilbert space V is said to be

monotone if

(A(u) − A(v), u − v) ≥ 0   ∀u, v ∈ V

strictly monotone if

(A(u) − A(v), u − v) > 0   ∀u, v ∈ V, u ≠ v

strongly monotone if there exists a constant α > 0 such that

(A(u) − A(v), u − v) ≥ α‖u − v‖^2   ∀u, v ∈ V

α is called the strong monotonicity constant.

Nonlinear equations with monotone operators II

Theorem 81

Let V be a Hilbert space and A : V → V a strongly monotone and Lipschitz continuous operator, with monotonicity constant α and Lipschitz-continuity constant β. Then, for any b ∈ V, there exists a unique u ∈ V such that

A(u) = b

Moreover, if A(u1) = b1 and A(u2) = b2, then

‖u_1 − u_2‖ ≤ \frac{1}{α} ‖b_1 − b_2‖

which means that the solution depends continuously on the right-hand side b.

Nonlinear equations with monotone operators III

Proof.

The equation A(u) = b is equivalent to T_γ(u) = u, with T_γ(u) = u − γ(A(u) − b), for any γ ≠ 0. The idea is to prove that there exists a γ such that T_γ : V → V is contractive. The application of the Banach fixed point theorem will then give the existence and uniqueness of a fixed point of T_γ, and therefore the existence and uniqueness of a solution to A(u) = b. We have

‖T_γ(w) − T_γ(v)‖^2 = ‖(w − v) − γ(A(w) − A(v))‖^2
= ‖w − v‖^2 − 2γ(A(w) − A(v), w − v) + γ^2 ‖A(w) − A(v)‖^2
≤ (1 − 2γα + γ^2 β^2) ‖w − v‖^2

For 0 < γ < 2α/β^2, we have 1 − 2γα + γ^2 β^2 < 1, and T_γ is a contraction. Now if A(u_1) = b_1 and A(u_2) = b_2, we have A(u_1) − A(u_2) = b_1 − b_2 and

α‖u_1 − u_2‖^2 ≤ (A(u_1) − A(u_2), u_1 − u_2) = (b_1 − b_2, u_1 − u_2) ≤ ‖b_1 − b_2‖ ‖u_1 − u_2‖

where the second inequality is the Cauchy-Schwarz inequality satisfied by the inner product of a Hilbert space. This proves the continuity of the solution u with respect to b.

Fréchet and Gâteaux derivatives I

Let F : K ⊂ V → W be a nonlinear mapping, where K is a subset of a normed space V and W is a normed space. We denote by L(V, W) the set of continuous linear maps from V to W.

Definition 82 (Fréchet derivative)

F is Fréchet-differentiable at u if and only if there exists A ∈ L(V, W) such that

F(u + v) = F(u) + Av + o(‖v‖) as ‖v‖ → 0

A is denoted F′(u) and is called the Fréchet derivative of F at u. If F is Fréchet-differentiable at all points in K, we denote by F′ : K ⊂ V → L(V, W) the Fréchet derivative of F on K.

Property 83

If F admits a Fréchet derivative F ′(u) at u, then F is continuous at u.

Fréchet and Gâteaux derivatives II

Definition 84 (Gâteaux derivative)

F is Gâteaux-differentiable at u if and only if there exists A ∈ L(V, W) such that

\lim_{t→0} \frac{F(u + tv) − F(u)}{t} = Av   ∀v ∈ V   (2)

A is denoted F′(u) and is called the Gâteaux derivative of F at u. If F is Gâteaux-differentiable at all points in K, we denote by F′ : K ⊂ V → L(V, W) the Gâteaux derivative of F on K.

Property 85

If a mapping F is Fréchet-differentiable, it is also Gâteaux-differentiable and the derivatives F′ coincide. Conversely, if a mapping F is Gâteaux-differentiable at u and if the Gâteaux derivative F′ is continuous at u, or if the limit in (2) is uniform over v with ‖v‖ = 1, then F is also Fréchet-differentiable and the two derivatives coincide.

Convex functions I

Definition 86

A subset K of a vector space V is said to be convex if

∀u, v ∈ K, ∀t ∈ [0, 1], tu + (1 − t)v ∈ K

Definition 87

A function J : K → R, defined on a convex set K of V, is said to be

convex if for all u, v ∈ K,

J(tu + (1 − t)v) ≤ tJ(u) + (1 − t)J(v)   ∀t ∈ [0, 1]

strictly convex if for all u, v ∈ K with u ≠ v,

J(tu + (1 − t)v) < tJ(u) + (1 − t)J(v)   ∀t ∈ (0, 1)

Convex functions II

Theorem 88

Let J : K ⊂ V → R be Gâteaux-differentiable. The following statements are equivalent:

(1) J is convex

(2) J(v) ≥ J(u) + (J ′(u), v − u), for all u, v ∈ K

(3) J ′ is monotone, i.e. (J ′(v)− J ′(u), v − u) ≥ 0 , for all u, v ∈ K

Theorem 89

Let J : K ⊂ V → R be Gâteaux-differentiable. The following statements are equivalent:

(1) J is strictly convex

(2) J(v) > J(u) + (J′(u), v − u), for all u, v ∈ K with u ≠ v

(3) J′ is strictly monotone, i.e. (J′(v) − J′(u), v − u) > 0, for all u, v ∈ K with u ≠ v

Convex functions III

Definition 90

A function J : K ⊂ V → R is said to be strongly convex if it is Gâteaux-differentiable and if its Gâteaux derivative is strongly monotone, i.e. if there exists a constant α > 0 such that

(J′(v) − J′(u), v − u) ≥ α‖u − v‖^2

Convex optimization I

Theorem 91

Let K be a closed convex subset of a Hilbert space V. Assume that J : K → R is a convex and Gâteaux-differentiable mapping. Then u ∈ K satisfies

J(u) = \inf_{v∈K} J(v)   (3)

if and only if

(J ′(u), v − u) ≥ 0 ∀v ∈ K (4)

When K is a linear subspace, the last inequality reduces to

(J ′(u), v) = 0 ∀v ∈ K (5)

Convex optimization II

Proof.

Assume (3). Then ∀v ∈ K and ∀t ∈ [0, 1],

J(u) ≤ J(tv + (1 − t)u) ≤ tJ(v) + (1 − t)J(u)

and then

0 ≤ \frac{J(u + t(v − u)) − J(u)}{t} ≤ J(v) − J(u)   ∀t ∈ (0, 1]

Taking the limit t → 0+, we obtain

0 ≤ (J′(u), v − u)

which is (4). Now, assume (4). Since J is convex, we have, ∀v ∈ K,

J(v) ≥ J(u) + (J′(u), v − u) ≥ J(u)

which proves (3).

Finally, if K is a subspace, then for all v ∈ K , u ± v ∈ K and therefore

(J′(u),±v) ≥ 0 ⇒ (J′(u), v) = 0 ∀v ∈ K

Newton method I

Let U and V be two Banach spaces and F : U → V a Fréchet-differentiable function. We want to solve

F (u) = 0

The Newton method consists in constructing a sequence (u_n)_{n∈N} by solving successive linearized problems. At iteration n, we introduce the linearization F̃ of F at u_n, defined by

F̃(v) = F(u_n) + F′(u_n)(v − u_n)

and we define u_{n+1} such that F̃(u_{n+1}) = 0. The Newton iterations are then defined as follows.

Newton iterations

Start from an initial guess u_0 and compute the sequence (u_n)_{n∈N} defined by

u_{n+1} = u_n − F′(u_n)^{−1} F(u_n)

Newton method II

Theorem 92 (local convergence of Newton method)

Assume u^∗ is a solution of F(u^∗) = 0 and assume that F′(u^∗)^{−1} exists and is a continuous linear map from V to U. Assume that F′ is locally Lipschitz continuous at u^∗, i.e.

‖F′(u) − F′(v)‖ ≤ L‖u − v‖   ∀u, v ∈ N(u^∗)

where N(u^∗) is a neighborhood of u^∗. Then there exists δ > 0 such that if ‖u_0 − u^∗‖ ≤ δ, the sequence (u_n)_{n≥1} of the Newton method is well-defined and converges to u^∗. Moreover, there exists a constant M < 1/δ such that

‖u_{n+1} − u^∗‖ ≤ M ‖u_n − u^∗‖^2   and   ‖u_n − u^∗‖ ≤ (Mδ)^{2^n} / M

Proof.

See [Atkinson & Han (2009, section 5.4)]

Newton method for nonlinear systems of equations I

Let F : R^m → R^m and consider the nonlinear system of equations

F (u) = 0

The iterations of the Newton method are defined by

u_{n+1} = u_n − F′(u_n)^{−1} F(u_n)

where F′(u_n) ∈ R^{m×m} is called the tangent matrix at u_n. In algebraic notation, F(u) and F′(u) can be expressed as follows:

u = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}, \quad F(u) = \begin{pmatrix} F_1(a_1, . . . , a_m) \\ \vdots \\ F_m(a_1, . . . , a_m) \end{pmatrix}, \quad F′(u) = \begin{pmatrix} \frac{∂F_1}{∂a_1}(u) & \cdots & \frac{∂F_1}{∂a_m}(u) \\ \vdots & & \vdots \\ \frac{∂F_m}{∂a_1}(u) & \cdots & \frac{∂F_m}{∂a_m}(u) \end{pmatrix}
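As an illustration (not part of the original slides), here is a minimal Python sketch of Newton iterations for a small system, where each step solves the linearized problem F′(u_n) δ_n = −F(u_n). The example system (intersection of a circle and a line), its Jacobian and the stopping rule are assumptions made for this sketch.

import numpy as np

def newton_system(F, Fprime, u0, tol=1e-12, max_iter=50):
    """Newton method u_{n+1} = u_n - F'(u_n)^{-1} F(u_n) (illustrative sketch)."""
    u = np.asarray(u0, dtype=float)
    for _ in range(max_iter):
        delta = np.linalg.solve(Fprime(u), -F(u))   # linearized problem
        u = u + delta
        if np.linalg.norm(delta) < tol:
            break
    return u

# example: F(u) = (u1^2 + u2^2 - 1, u1 - u2)
F = lambda u: np.array([u[0]**2 + u[1]**2 - 1.0, u[0] - u[1]])
J = lambda u: np.array([[2 * u[0], 2 * u[1]], [1.0, -1.0]])
u_star = newton_system(F, J, np.array([1.0, 0.5]))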

Modified Newton method

One iteration of the (full) Newton method can be written as a linear system of equations

A_n δ_n = −F(u_n),   δ_n = u_{n+1} − u_n

where A_n = F′(u_n). In order to avoid the computation of the tangent matrix F′(u_n) at each iteration, we can use modified Newton iterations where A_n is only an approximation of F′(u_n). For example, we could update A_n when the convergence is too slow or only every k iterations:

A_n = F′(u_m) for n = mk + j,   j ∈ {0, . . . , k − 1}

Remark.

The convergence of the modified Newton method is usually slower than that of the (full) Newton method, but more iterations can be performed for the same computational time.

Part VI

Interpolation / Approximation

18 Interpolation: Lagrange interpolation, Hermite interpolation, Trigonometric interpolation

19 Best approximation: Elements on topological vector spaces, General existence results, Existence and uniqueness of best approximation, Best approximation in Hilbert spaces

20 Orthogonal polynomials: Weighted L2 spaces, Classical orthogonal polynomials

Introduction

Principle of approximation

The aim is to replace a function f, known exactly or approximately, by an approximating function p which is more convenient for numerical computation.

The most commonly used approximating functions p are polynomials, piecewise polynomials or trigonometric polynomials. There are several ways of defining the approximating function among a given class of functions: interpolation, projection, ...

Preliminary definitions

We denote by P_n(I) the space of polynomials of degree less than or equal to n defined on the closed interval I ⊂ R:

P_n(I) = {v : I → R;  v(x) = \sum_{i=0}^n v_i x^i,  v_i ∈ R}

We denote by C(I) the space of continuous functions f : I → R. C(I) is a Banach space when equipped with the norm

‖f‖_{C(I)} = \sup_{x∈I} |f(x)|

We denote by f^{(i)} the i-th derivative of f. We denote by C^m(I) the space of m times differentiable functions f such that all the derivatives f^{(i)} of order i ≤ m are continuous. C^m(I) is a Banach space when equipped with the norm

‖f‖_{C^m(I)} = \max_{i≤m} ‖f^{(i)}‖_{C(I)}

Lagrange interpolation

Let f ∈ C([a, b]) be a continuous function defined on the interval [a, b]. We introduce a set of n + 1 distinct points {x_i}_{i=0}^n on [a, b], such that

a ≤ x0 < . . . < xn ≤ b

The Lagrange interpolant p_n ∈ P_n of f is the unique polynomial of degree at most n such that

p_n(x_i) = f(x_i) for all i ∈ {0, . . . , n}

We can represent p_n as follows:

p_n(x) = \sum_{i=0}^n f(x_i) ℓ_i(x),   ℓ_i(x) = \prod_{j=0, j≠i}^n \frac{x − x_j}{x_i − x_j}

where the {ℓ_i}_{i=0}^n form a basis of P_n, called the Lagrange interpolation basis. It is the unique basis of functions satisfying the interpolation conditions

ℓ_i(x_j) = δ_{ij}   ∀i, j ∈ {0, . . . , n}
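For illustration (not part of the original slides), here is a minimal Python sketch that evaluates the Lagrange interpolant directly from the basis formula above; it works for a scalar evaluation point, and the function names and the sin example are arbitrary choices of the sketch.

import numpy as np

def lagrange_interpolant(x_nodes, f_values):
    """Return the interpolant p_n built from the Lagrange basis (sketch)."""
    x_nodes = np.asarray(x_nodes, dtype=float)

    def p(x):
        total = 0.0
        for i, xi in enumerate(x_nodes):
            others = np.delete(x_nodes, i)
            ell_i = np.prod((x - others) / (xi - others))   # ell_i(x)
            total += f_values[i] * ell_i
        return total

    return p

# interpolate f(x) = sin(x) at 6 points of [0, pi]
nodes = np.linspace(0.0, np.pi, 6)
p = lagrange_interpolant(nodes, np.sin(nodes))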

Lagrange interpolation

Theorem 93

Assume f ∈ C^{n+1}([a, b]). Then, for x ∈ [a, b], there exists ξ_x ∈ [a, b] such that

f(x) − p_n(x) = \frac{ω_n(x)}{(n + 1)!} f^{(n+1)}(ξ_x),   ω_n(x) = \prod_{i=0}^n (x − x_i)

Influence of the interpolation grid: function ω_n(x) on [−1, 1], for n = 5.
Gauss-Legendre grid (blue), uniform grid (red), random grid (black). [Figure]

Influence of the interpolation grid: function ω_n(x) on [−1, 1], for n = 11.
Gauss-Legendre grid (blue), uniform grid (red), random grid (black). [Figure]

Lagrange interpolation: a famous example...

Runge function f(x) = \frac{1}{1 + x^2} on [−5, 5].

Uniform grid: n = 5, 11, 19. [Figures: as n increases, the interpolant develops large oscillations near the endpoints of the interval.]

Gauss-Legendre grid: n = 5, 11, 19. [Figures: the interpolant remains close to f on the whole interval.]

Hermite polynomial interpolation: first order interpolation

First order Hermite polynomial interpolation consists in interpolating a function f(x) and its derivative f′(x). Assume f ∈ C^1([a, b]). We introduce a set of n + 1 distinct points {x_i}_{i=0}^n on [a, b], with

a ≤ x_0 < . . . < x_n ≤ b

The Hermite interpolant p_{2n+1} ∈ P_{2n+1} of f is uniquely defined by the following interpolation conditions:

p_{2n+1}(x_i) = f(x_i),   p′_{2n+1}(x_i) = f′(x_i),   0 ≤ i ≤ n

General Hermite polynomial interpolation: higher order interpolation

Hermite interpolation can be generalized to the interpolation of higher order derivatives. At a given point x_i, it interpolates the function and its derivatives up to the order m_i ∈ N. Let N = \sum_{i=0}^n (m_i + 1) − 1. A generalized Hermite interpolant p_N ∈ P_N is uniquely defined by the following conditions

p_N^{(j)}(x_i) = f^{(j)}(x_i),   0 ≤ j ≤ m_i,   0 ≤ i ≤ n

Theorem 94

Assume f ∈ C^{N+1}([a, b]). Then, for x ∈ [a, b], there exists ξ_x ∈ [a, b] such that

f(x) − p_N(x) = \frac{ω_N(x)}{(N + 1)!} f^{(N+1)}(ξ_x),   ω_N(x) = \prod_{i=0}^n (x − x_i)^{m_i + 1}

Trigonometric polynomials

A trigonometric polynomial is defined as follows:

p_n(x) = a_0 + \sum_{j=1}^n (a_j cos(jx) + b_j sin(jx)),   x ∈ [0, 2π)

p_n is said to be of degree n if |a_n| + |b_n| ≠ 0. An equivalent notation is as follows:

p_n(x) = \sum_{j=−n}^n c_j e^{ijx},   with   a_0 = c_0,  a_j = c_j + c_{−j},  b_j = i(c_j − c_{−j})

or equivalently (in a polynomial-like form)

p_n(x) = \sum_{j=−n}^n c_j z^j = z^{−n} \sum_{k=0}^{2n} c_{k−n} z^k,   z = e^{ix}

Trigonometric interpolation

We introduce 2n + 1 distinct interpolation points {x_j}_{j=0}^{2n} in [0, 2π). Classically, we use uniformly distributed points

x_j = j \frac{2π}{2n + 1},   0 ≤ j ≤ 2n

The trigonometric interpolant of degree n of a function f is defined by the following conditions

p_n(x_j) = f(x_j),   0 ≤ j ≤ 2n

It can be equivalently reformulated as an interpolation problem in the complex plane: find {c_k}_{k=−n}^n such that

\sum_{k=0}^{2n} c_{k−n} z_j^k = z_j^n f(x_j),   0 ≤ j ≤ 2n

where we have introduced the complex points z_j = e^{i x_j}.
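A minimal Python sketch of this complex reformulation, added for illustration: it assembles the Vandermonde-like system above at the uniform points and solves it directly. For these uniform points the coefficients could also be obtained with a discrete Fourier transform; the direct solve, the function name and the example function are choices made for this sketch.

import numpy as np

def trig_interpolant_coeffs(f, n):
    """Coefficients c_{-n..n} of the degree-n trigonometric interpolant (sketch)."""
    m = 2 * n + 1
    x = 2 * np.pi * np.arange(m) / m
    z = np.exp(1j * x)
    V = z[:, None] ** np.arange(m)[None, :]     # V[j, k] = z_j^k
    rhs = z**n * f(x)                           # z_j^n f(x_j)
    c = np.linalg.solve(V, rhs)                 # c[k] corresponds to c_{k-n}
    return c

c = trig_interpolant_coeffs(lambda x: np.exp(np.cos(x)), 8)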

The problem of the best approximation

The aim is to find the best approximation p of a function f in a set of functions K (e.g. a polynomial space, a piecewise polynomial space, ...)

\min_{p∈K} ‖f − p‖

The obtained best approximation p depends on the norm selected for measuring the error (e.g. the L^2-norm, the L^∞-norm, ...).

We will first introduce some general results about optimization problems

\inf_{v∈K} J(v)

by giving some general conditions on the set K and the function J for the existence of a minimizer.

A first comprehensive case: extrema of real-valued functions I

Consider a real-valued continuous function J ∈ C([a, b]). The problem is to find a minimizer of J

\inf_{v∈[a,b]} J(v)

The classical result of Weierstrass states that a continuous function on a closed interval K = [a, b] has a minimum in K (and a maximum). We recall the main steps of a typical proof in order to obtain more general requirements on K and J.

1 We denote by

α = \inf_{v∈K} J(v)

By definition of the infimum, there exists a sequence {v_n} ⊂ K such that \lim_{n→∞} J(v_n) = α.

2 K is a closed and bounded interval in R, and therefore it is a compact set. Therefore, from the sequence {v_n} ⊂ K, we can extract a subsequence {v_{n_k}} which converges to some v^∗ ∈ K,

v_{n_k} \underset{k→∞}{\longrightarrow} v^∗

A first comprehensive case: extrema of real-valued functions II

3 Using the continuity of J, we obtain

J(v^∗) = \lim_{k→∞} J(v_{n_k}) = α

which proves that v^∗ is a minimizer of J in K.

Now we come back to the different steps of the proof in order to generalize the existence result to functionals J defined on a subset K of a Banach space V.

1 The existence of a minimizing sequence {v_n} ⊂ K is the definition of the infimum.

2 In an infinite-dimensional Banach space V, a bounded sequence does not necessarily admit a converging subsequence. However, for a reflexive Banach space V, there exists a weakly convergent subsequence. We then suppose that V is a reflexive Banach space and K ⊂ V is a bounded set. In order for K to contain the limit of this subsequence, K has to be weakly closed.

3 Finally, we want the weak limit of the subsequence to be a minimizer of J. We could then require J to be continuous with respect to weak convergence. However, this condition is too restrictive and it is sufficient to require that J be weakly lower semicontinuous (allowing discontinuities).

Elements on topological vector spaces I

In the following, V denotes a normed space, i.e. a vector space equipped with a norm ‖ · ‖.

Definition 95 (Strong convergence on V)

A sequence {v_n} ⊂ V is said to converge strongly to v ∈ V if

\lim_{n→∞} ‖v_n − v‖ = 0

It is denoted v_n → v.

Definition 96 (Cauchy sequence)

A sequence {v_n} ⊂ V is Cauchy if

\lim_{n→∞} \sup_{i,j≥n} ‖v_i − v_j‖ = 0

or equivalently, if ∀ε > 0, there exists n ∈ N such that for all i, j ≥ n, ‖v_i − v_j‖ ≤ ε.

Elements on topological vector spaces II

Definition 97 (Closed set)

A subset K ⊂ V is said to be closed if it contains all the limits of its convergent sequences:

{v_n} ⊂ K and v_n → v  ⇒  v ∈ K

The closure K̄ of a set K is the union of this set and of the limits of all converging sequences in K.

Definition 98 (Compact set)

A subset K of a normed space V is said to be (sequentially) compact if every sequence {v_n}_{n∈N} contains a subsequence {v_{n_k}}_{k∈N} converging to an element in K. A set K whose closure K̄ is compact is said to be relatively compact.

Definition 99 (Banach space)

A Banach space is a complete normed vector space, i.e. a normed vector space such that every Cauchy sequence in V has a limit in V.

Elements on topological vector spaces III

Definition 100 (Dual of a normed space V)

The dual space of a normed space V is the space V′ = L(V, R) of linear continuous maps from V to R. V′ is a Banach space for the norm

‖L‖ = \sup_{v∈V: ‖v‖≤1} |L(v)| = \sup_{v∈V, v≠0} \frac{|L(v)|}{‖v‖},   L ∈ V′

Definition 101 (Reflexive normed space)

A normed space V is said to be reflexive if V′′ = V, where V′′ = (V′)′ is the dual of the dual of V, also called the bidual of V.

Definition 102 (Strong convergence on V′)

A sequence {L_n} ⊂ V′ is said to converge strongly to L ∈ V′ if

\lim_{n→∞} ‖L_n − L‖ = 0

Elements on topological vector spaces IV

The dual space can be used to define a new topology on V, called the weak topology. The notions of convergence, closure, continuity, ... can be redefined with respect to this new topology.

Definition 103 (Weak convergence on V)

A sequence {v_n} ⊂ V is said to converge weakly to v ∈ V if

\lim_{n→∞} L(v − v_n) = 0   ∀L ∈ V′

It is denoted v_n ⇀ v.

Definition 104 (Weakly closed set in V)

A subset K ⊂ V is said to be weakly closed if it contains all the limits of its weakly convergent sequences:

{v_n} ⊂ K and v_n ⇀ v  ⇒  v ∈ K

Elements on topological vector spaces V

Definition 105 (Weakly compact set)

A subset K of a normed space V is said to be weakly compact if every sequence {v_n}_{n∈N} contains a subsequence {v_{n_k}}_{k∈N} weakly converging to an element in K. A set K whose closure in the weak topology is weakly compact is said to be weakly relatively compact.

Theorem 106 (Reflexive Banach spaces and converging bounded sequences)

A Banach space V is reflexive if and only if every bounded sequence in V has a subsequence weakly converging to an element in V.

Let us note that the above theorem could be reformulated as follows: a Banach space is reflexive if and only if the unit ball is relatively compact in the weak topology.

Theorem 107

In a reflexive Banach space V, a set K is bounded and weakly closed if and only if it is weakly compact.

Lower semicontinuity I

Definition 108 (Lower semicontinuity)

A function J : V → R is lower semicontinuous (l.s.c.) if

{v_n} ⊂ V and v_n → v  ⇒  J(v) ≤ \liminf_{n→∞} J(v_n)

Definition 109 (Weak lower semicontinuity)

A function J : V → R is weakly lower semicontinuous (w.l.s.c.) if

{v_n} ⊂ V and v_n ⇀ v  ⇒  J(v) ≤ \liminf_{n→∞} J(v_n)

Proposition 110

Continuity implies lower semicontinuity (but the converse statement is not true).

Weak lower semicontinuity implies lower semicontinuity (but the converse statement is not true).

Lower semicontinuity II

Example 111

Let us prove that the norm function ‖·‖ : v ∈ V ↦ ‖v‖ ∈ R on a normed space V is w.l.s.c. Let {v_n} ⊂ V be a weakly convergent sequence with v_n ⇀ v. There exists a linear form L ∈ V′ such that L(v) = ‖v‖ and ‖L‖ = 1 (corollary of the generalized Hahn-Banach theorem). We then have

L(v_n) ≤ ‖L‖ ‖v_n‖ = ‖v_n‖

and therefore

‖v‖ = L(v) = \lim_{n→∞} L(v_n) ≤ \liminf_{n→∞} ‖v_n‖

If V is an inner product space, we have a simpler proof. Indeed,

‖v‖^2 = (v, v) = \lim_{n→∞} (v_n, v) ≤ \liminf_{n→∞} ‖v‖ ‖v_n‖

General existence results I

We introduce the problem

\inf_{v∈K} J(v)   (π)

Theorem 112

Assume V is a reflexive Banach space. Let K ⊂ V denote a bounded and weakly closed set. Let J : V → R denote a weakly l.s.c. function. Then problem (π) has a solution in K.

Proof.

Denote α = \inf_{v∈K} J(v) and let {v_n} ⊂ K be a minimizing sequence such that \lim_{n→∞} J(v_n) = α. Since K is bounded, {v_n} is a bounded sequence in a reflexive Banach space and therefore we can extract a subsequence {v_{n_k}} weakly converging to some u ∈ V. Since K is weakly closed, u ∈ K. Since J is w.l.s.c.,

J(u) ≤ \liminf_{k→∞} J(v_{n_k}) = α

and therefore u ∈ K is a minimizer of J.

General existence results II

We now remove the boundedness of the set K by adding a coercivity condition on J.

Definition 113

A functional J : V → R is said to be coercive if

J(v) → +∞ as ‖v‖ → ∞

Theorem 114

Assume V is a reflexive Banach space. Let K ⊂ V denote a weakly closed set. Let J : V → R denote a weakly l.s.c. and coercive function. Then the problem (π) has a solution in K.

General existence results III

Proof.

Pick an element v_0 ∈ K with J(v_0) < ∞ and let K_0 = {v ∈ K; J(v) ≤ J(v_0)}. Since J is coercive, K_0 is bounded. Moreover, K_0 is weakly closed. Indeed, if {v_n} ⊂ K_0 is such that v_n ⇀ v^∗, then v^∗ ∈ K (since K is weakly closed) and J(v^∗) ≤ \liminf_n J(v_n) ≤ J(v_0), and therefore v^∗ ∈ K_0. The optimization problem is then equivalent to the optimization problem

\inf_{v∈K_0} J(v)

of a w.l.s.c. function on a bounded and weakly closed set. Theorem 112 allows us to conclude on the existence of a minimizer.

Lemma 115 (Convex closed sets are weakly closed)

A convex and closed set K ⊂ V is weakly closed.

Lemma 116 (Convex l.s.c. functions are w.l.s.c.)

A convex and l.s.c. function is also w.l.s.c.

General existence results IV

For convex sets and convex functions, Theorems 112 and 114 can then be replaced by the following theorem.

Theorem 117

Assume V is a reflexive Banach space. Let K ⊂ V denote a convex and closed set. Let J : V → R denote a convex l.s.c. function. Then, if either (i) K is bounded, or (ii) J is coercive on K, the minimization problem (π) has a solution in K. Moreover, if J is strictly convex, this solution is unique.

Proof.

The existence simply follows from Theorems 112 and 114 and from Lemmas 115 and 116. It remains to prove the uniqueness when J is strictly convex. Assume that u_1, u_2 ∈ K are two solutions such that u_1 ≠ u_2. We have J(u_1) = J(u_2) = \min_{v∈K} J(v). Since K is convex, αu_1 + (1 − α)u_2 ∈ K for α ∈ (0, 1), and by strict convexity of J, we have

J(αu_1 + (1 − α)u_2) < αJ(u_1) + (1 − α)J(u_2) = \min_{v∈K} J(v)

which contradicts the fact that u_1 and u_2 are solutions.

General existence results V

In the case of a non-reflexive Banach space V (e.g. V = C([a, b])), the above theorems do not apply. However, reflexivity is only used for the extraction of a weakly convergent subsequence from a bounded sequence in K; this property is needed only for the set K and not for the whole space V. In particular, for a finite-dimensional subset K, we have the following.

Theorem 118

Assume V is a normed space. Let K ⊂ V denote a finite-dimensional convex and closed set. Let J : V → R denote a convex l.s.c. function. Then, if either (i) K is bounded, or (ii) J is coercive on K, the minimization problem (π) has a solution in K. Moreover, if J is strictly convex, this solution is unique.

Existence and uniqueness of best approximation I

We apply the general results about optimization to the following best approximation problem. For a given element u ∈ V, where V is a normed space, we want to find the elements in a subset K ⊂ V which are the closest to u. The problem writes

\inf_{v∈K} ‖u − v‖

Denoting J(v) = ‖u − v‖, the problem can then be written in the form \inf_{v∈K} J(v).

Property 119

The function J(v) = ‖u − v‖ is convex, continuous (and hence w.l.s.c.), and coercive.

We then have the two existence results.

Theorem 120

Let V be a reflexive Banach space and K ⊂ V a closed convex subset. Then there exists a best approximation ū ∈ K verifying

‖u − ū‖ = \min_{v∈K} ‖u − v‖

Existence and uniqueness of best approximation II

Theorem 121

Let V be a normed space and K ⊂ V a finite-dimensional closed convex subset. Then there exists a best approximation ū ∈ K verifying

‖u − ū‖ = \min_{v∈K} ‖u − v‖

For the uniqueness of the best approximation, we have to look at the properties of the norm.

Theorem 122

If there exists p > 1 such that v ↦ ‖v‖^p is strictly convex, then the solution ū of the best approximation problem is unique.

Example 123

If V is a Hilbert space equipped with the inner product (·, ·) and associated norm ‖ · ‖, then v ↦ ‖v‖^2 is a strictly convex function.

If V = L^p(Ω) with p ∈ (1, +∞), then v ↦ ‖v‖^p_{L^p(Ω)} is strictly convex.

Best approximation in Hilbert spaces I

Let V be a Hilbert space equipped with an inner product (·, ·) and associated norm ‖ · ‖.

Lemma 124

Let K be a closed convex set in a Hilbert space V. ū ∈ K is a best approximation of u ∈ V if and only if

(u − ū, v − ū) ≤ 0   ∀v ∈ K

Best approximation in Hilbert spaces II

Proof.

First suppose that ū ∈ K is a best approximation of u ∈ V. Then

‖u − ū‖^2 ≤ ‖w − u‖^2   ∀w ∈ K.

By selecting w = ū + α(v − ū), with α ∈ (0, 1) and v ∈ K (so that w ∈ K by convexity), we have

0 ≥ ‖u − ū‖^2 − ‖(ū − u) + α(v − ū)‖^2 = −α^2 (v − ū, v − ū) − 2α(ū − u, v − ū)

for all α ∈ (0, 1). Dividing by α and letting α → 0, this implies (ū − u, v − ū) ≥ 0, i.e. (u − ū, v − ū) ≤ 0, ∀v ∈ K.

Conversely, if (u − ū, v − ū) ≤ 0 ∀v ∈ K, then

‖v − u‖^2 = ‖(v − ū) + (ū − u)‖^2 = ‖v − ū‖^2 + 2(v − ū, ū − u) + ‖ū − u‖^2 ≥ ‖ū − u‖^2

for all v ∈ K.

Best approximation in Hilbert spaces III

Corollary 125

Let K be a closed convex set in a Hilbert space V. For any u ∈ V, the best approximation of u in K is unique.

Proof.

Let ū_1, ū_2 ∈ K be two best approximations of u ∈ V. Then (u − ū_1, ū_2 − ū_1) ≤ 0 and (u − ū_2, ū_1 − ū_2) ≤ 0. Adding these inequalities, we obtain

(ū_2 − ū_1, ū_2 − ū_1) = ‖ū_2 − ū_1‖^2 ≤ 0

and therefore ū_1 = ū_2.

We then conclude with the following theorem.

Best approximation in Hilbert spaces IV

Theorem 126

Let K ⊂ V be a nonempty closed convex set in a Hilbert space V. For any u ∈ V, there exists a unique best approximation ū ∈ K defined by

‖u − ū‖ = \min_{v∈K} ‖u − v‖

Best approximation in Hilbert spaces V

Remark.

Let us give another classical proof of the existence of a best approximation, which uses the inner product structure of the space V. Let {u_n}_{n∈N} ⊂ K be a minimizing sequence such that \lim_{n→∞} ‖u − u_n‖ = α = \inf_{v∈K} ‖u − v‖. Using the parallelogram law satisfied by the norm ‖·‖ of an inner product space, we have

2‖u − u_n‖^2 + 2‖u − u_m‖^2 = ‖u_n − u_m‖^2 + ‖2u − u_n − u_m‖^2

Since K is convex, we have (u_n + u_m)/2 ∈ K and therefore

‖u_n − u_m‖^2 = 2‖u − u_n‖^2 + 2‖u − u_m‖^2 − 4‖u − (u_n + u_m)/2‖^2 ≤ 2‖u − u_n‖^2 + 2‖u − u_m‖^2 − 4α^2 \underset{m,n→∞}{\longrightarrow} 0

which proves that (u_n) is a Cauchy sequence. Since V is complete, {u_n} ⊂ K converges to an element ū ∈ V, and since K is closed, ū ∈ K.

Best approximation in Hilbert spaces: Projection I

Definition 127 (Projector on a convex set)

The best approximation ū ∈ K of u ∈ V in a closed convex set K is called the projection of u onto K and is denoted

ū = P_K(u)

where P_K : V → K is called the projection operator of V onto K.

Proposition 128

The projection operator is monotone:

(P_K(v) − P_K(u), v − u) ≥ 0   ∀u, v ∈ V

and non-expansive:

‖P_K(v) − P_K(u)‖ ≤ ‖v − u‖   ∀u, v ∈ V

Best approximation in Hilbert spaces: Projection II

Proof.

From the characterizations of P_K(u) ∈ K and P_K(v) ∈ K, we have respectively

(P_K(u) − u, P_K(v) − P_K(u)) ≥ 0,   (P_K(v) − v, P_K(u) − P_K(v)) ≥ 0

Adding these inequalities, we obtain

(v − u, P_K(v) − P_K(u)) ≥ (P_K(v) − P_K(u), P_K(v) − P_K(u)) ≥ 0

and

‖P_K(v) − P_K(u)‖^2 ≤ (v − u, P_K(v) − P_K(u)) ≤ ‖v − u‖ ‖P_K(v) − P_K(u)‖

We now introduce the following particular case when K is a subspace of V .

Best approximation in Hilbert spaces: Projection III

Theorem 129 (Projection on linear subspaces)

Let K be a complete subspace of V. Then, for any u ∈ V, there exists a unique best approximation ū = P_K(u) ∈ K characterized by

(u − P_K(u), v) = 0   ∀v ∈ K

Proof.

We have

(u − ū, w − ū) ≤ 0   ∀w ∈ K

and since K is a subspace, for all v ∈ K, w = ū ± v ∈ K, and therefore

±(u − ū, v) ≤ 0   ∀v ∈ K

In the case where K is a subspace, u − P_K(u) is orthogonal to K, and therefore P_K is called an orthogonal projection operator.

Best approximation in Hilbert spaces: Projection IV

Let us consider that we know an orthonormal basis {φ_i}_{i=1}^n of K = K_n. The projection P_{K_n}(u) is characterized by

P_{K_n}(u) = \sum_{i=1}^n (φ_i, u) φ_i

Example 130 (Least squares approximation by polynomials)

Let V = L^2(−1, 1) and K_n = P_n(−1, 1) the space of polynomials of degree less than or equal to n. An orthonormal basis of K_n is given by the Legendre polynomials {L_i}_{i=0}^n defined by

L_i(x) = \sqrt{(2i + 1)/2} \, \frac{1}{2^i i!} \frac{d^i}{dx^i} \left( (x^2 − 1)^i \right)

Best approximation in Hilbert spaces: Projection V

Example 131 (Least squares approximation by trigonometric polynomials)

Let V = L^2(0, 2π) and K_n the space of trigonometric polynomials of degree less than or equal to n. The best approximation u_n = P_{K_n}(u) is characterized by

u_n(x) = a_0/2 + \sum_{j=1}^n (a_j cos(jx) + b_j sin(jx))

with

a_j = \frac{1}{(cos(jx), cos(jx))} (u(x), cos(jx)) = \frac{1}{π} \int_0^{2π} u(x) cos(jx) dx,   j ≥ 0

b_j = \frac{1}{(sin(jx), sin(jx))} (u(x), sin(jx)) = \frac{1}{π} \int_0^{2π} u(x) sin(jx) dx,   j ≥ 1

Note that u_n is the n-th partial sum of the well-known Fourier series expansion of u.

Weighted L2 spaces

Let I ⊂ R and let ω : I → R be a weight function which is integrable on I and almost everywhere positive. We introduce the weighted function space

L^2_ω(I) = {v : I → R;  v is measurable on I,  \int_I |v(x)|^2 ω(x) dx < +∞}

L^2_ω(I) is a Hilbert space for the inner product

(u, v) = \int_I u(x) v(x) ω(x) dx

and associated norm

‖u‖ = \left( \int_I u(x)^2 ω(x) dx \right)^{1/2}

Two functions u, v ∈ L^2_ω(I) are said to be orthogonal if (u, v) = 0.

Classical orthogonal polynomials I

A system of orthonormal polynomials {p_n}_{n≥0}, with p_n ∈ P_n(I), can be constructed by applying the Gram-Schmidt procedure to the basis of monomials {1, x, x^2, . . .}. For a given interval I and weight function ω, this leads to a uniquely defined system of polynomials. In the following table, we indicate the classical families of polynomials for different interval domains I and weight functions.

Classical orthogonal polynomials

I          ω(x)                                              pn
(−1, 1)    (1+x)^(a−1) (1−x)^(b−1) / (2^(a+b−1) B(a, b))     Jacobi
(−1, 1)    1/2                                               Legendre
(−1, 1)    (1−x²)^(−1/2) / B(1/2, 1/2)                       Chebyshev of first kind
(−1, 1)    (1−x²)^(1/2) / (4 B(3/2, 3/2))                    Chebyshev of second kind
R          exp(−x²/2) / √(2π)                                Hermite
(0, +∞)    x^(a−1) exp(−x) / Γ(a)                            Laguerre
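A small numerical check of the table (not in the slides), assuming NumPy and SciPy: the weight exp(−x²/2)/√(2π) corresponds to the probabilists' Hermite polynomials He_n, available in NumPy as the hermite_e family.

```python
import numpy as np
from scipy.integrate import quad

# Probabilists' Hermite polynomials He_n are orthogonal for the weight
# w(x) = exp(-x^2/2)/sqrt(2*pi) on R (the standard Gaussian density).
w = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
He = [np.polynomial.hermite_e.HermiteE.basis(n) for n in range(3)]

for m in range(3):
    for n in range(3):
        val, _ = quad(lambda x: He[m](x) * He[n](x) * w(x), -np.inf, np.inf)
        print(m, n, round(val, 10))
# Off-diagonal entries vanish; the diagonal gives (He_n, He_n) = n!
```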

Page 195: M1 Numerical Analysis Course


Classical orthogonal polynomials II

Γ denotes the Euler Gamma function defined by

\[
\Gamma(a) = \int_0^{\infty} x^{a-1} \exp(-x)\, dx
\]

B(a, b) denotes the Euler Beta function defined by

\[
B(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a + b)}
\]

Remark.

The given weight functions are such that

\[
\int_I \omega(x)\, dx = 1
\]

Each of them therefore defines a measure µ with density ω (dµ(x) = ω(x) dx) and with unit mass. Equivalently, µ (resp. ω) can be interpreted as the probability law (resp. probability density function) of a random variable.

Page 196: M1 Numerical Analysis Course


Classical orthogonal polynomials III

Exercise.

Construct, by the Gram-Schmidt procedure, the orthonormal polynomials of degree n = 0, 1, 2 on the interval I = (0, 1) for the weight function ω(x) = log(1/x).
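Not a closed-form solution of the exercise, but a numerical Gram-Schmidt sketch in the same setting, assuming SciPy for the weighted integrals; inner and orthonormal_polynomials are illustrative helper names.

```python
import numpy as np
from numpy.polynomial import Polynomial
from scipy.integrate import quad

# Weighted inner product on I = (0, 1) for the weight w(x) = log(1/x)
def inner(p, q):
    val, _ = quad(lambda x: p(x) * q(x) * np.log(1.0 / x), 0.0, 1.0)
    return val

# Gram-Schmidt applied to the monomial basis 1, x, x^2, ...
def orthonormal_polynomials(nmax):
    ortho = []
    for k in range(nmax + 1):
        p = Polynomial.basis(k)                           # the monomial x^k
        for q in ortho:
            p = p - inner(p, q) * q                       # remove the component along q
        ortho.append(p * (1.0 / np.sqrt(inner(p, p))))    # normalize
    return ortho

for p in orthonormal_polynomials(2):
    print(p.coef)    # coefficients in the monomial basis (degrees 0, 1, 2)
```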

Page 197: M1 Numerical Analysis Course


Part VII

Numerical integration

21 Basic quadrature formulas

22 Gauss quadrature

Page 198: M1 Numerical Analysis Course


Numerical integration

Given a function f : Ω → R, the aim is to approximate the value of the integral

\[
I(f) = \int_\Omega f(x)\, dx
\]

using evaluations of the function

\[
I(f) \approx \sum_{k=1}^{n} \omega_k\, f(x_k)
\]

or possibly of the function and its derivatives

\[
I(f) \approx \sum_{k=1}^{n} \omega_k\, f(x_k) + \sum_{k=1}^{n} \tilde{\omega}_k\, f'(x_k) + \ldots
\]

These approximations are called quadrature formulas. A quadrature formula is said to be of interpolation type if it uses only evaluations of the function.
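As an illustration (not in the slides), a quadrature formula of interpolation type is fully described by its points and weights; a minimal sketch assuming NumPy, with apply_quadrature as an illustrative helper.

```python
import numpy as np

# A quadrature formula of interpolation type is a set of points x_k and
# weights w_k: I(f) is approximated by sum_k w_k * f(x_k).
def apply_quadrature(f, points, weights):
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.dot(weights, f(points))

# Example: midpoint (rectangle) rule on (0, 1) applied to f(x) = x^2
print(apply_quadrature(lambda x: x**2, [0.5], [1.0]))   # 0.25, exact value is 1/3
```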

Page 199: M1 Numerical Analysis Course


Integration error and precision

We denote by In(f ) the quadrature formula.

Definition 132

A quadrature formula has degree of precision k if it integrates exactly all polynomials of degree less than or equal to k:

\[
I_n(f) = I(f) \quad \forall f \in P_k(\Omega),
\]
\[
I_n(f) \neq I(f) \quad \text{for some } f \in P_{k+1}(\Omega).
\]
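A possible empirical check of Definition 132 (not in the slides), assuming NumPy; degree_of_precision is an illustrative helper that tests a rule on successive monomials.

```python
import numpy as np

# Empirically determine the degree of precision of a quadrature rule on (a, b)
# by testing it on the monomials 1, x, x^2, ... (whose exact integrals are known).
def degree_of_precision(rule, a, b, kmax=10, tol=1e-12):
    for k in range(kmax + 1):
        exact = (b**(k + 1) - a**(k + 1)) / (k + 1)   # int_a^b x^k dx
        if abs(rule(lambda x: x**k, a, b) - exact) > tol:
            return k - 1
    return kmax

# Trapezoidal rule: (b - a) * (f(a) + f(b)) / 2
trapezoid = lambda f, a, b: (b - a) * (f(a) + f(b)) / 2.0
print(degree_of_precision(trapezoid, 0.0, 1.0))   # 1
```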


Page 201: M1 Numerical Analysis Course


Basic quadrature formulas

Rectangle (midpoint) formula (precision degree 1)

\[
\int_a^b f(x)\, dx \approx (b - a)\, f\left(\frac{a + b}{2}\right)
\]

Trapezoidal formula (precision degree 1)

\[
\int_a^b f(x)\, dx \approx (b - a)\, \frac{f(a) + f(b)}{2}
\]

Simpson formula (precision degree 3)

\[
\int_a^b f(x)\, dx \approx \frac{b - a}{6} \left( f(a) + 4 f\left(\frac{a + b}{2}\right) + f(b) \right)
\]

...
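A minimal sketch of these three formulas in Python (not in the slides; the function names are illustrative), compared on ∫₀¹ exp(x) dx = e − 1.

```python
import numpy as np

# The three basic formulas on a single interval (a, b)
rectangle = lambda f, a, b: (b - a) * f((a + b) / 2.0)
trapezoid = lambda f, a, b: (b - a) * (f(a) + f(b)) / 2.0
simpson   = lambda f, a, b: (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))

# Compare on int_0^1 exp(x) dx = e - 1
f, exact = np.exp, np.e - 1.0
for name, rule in [("rectangle", rectangle), ("trapezoid", trapezoid), ("simpson", simpson)]:
    print(name, abs(rule(f, 0.0, 1.0) - exact))
# Simpson is markedly more accurate, consistent with its higher degree of precision.
```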

Page 202: M1 Numerical Analysis Course


Composite quadrature formulas

In order to compute

\[
I(f; \Omega) = \int_\Omega f(x)\, dx,
\]

we divide the domain Ω into m subdomains \{\Omega_i\}_{i=1}^{m} such that

\[
I(f; \Omega) = \sum_{i=1}^{m} I(f; \Omega_i)
\]

and we introduce a basic quadrature formula on each subdomain:

\[
I(f; \Omega) \approx \sum_{i=1}^{m} I_n(f; \Omega_i)
\]
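A minimal sketch of a composite rule (not in the slides), using the basic Simpson formula on each subinterval; composite_simpson is an illustrative helper.

```python
import numpy as np

# Composite Simpson rule: split (a, b) into m equal subintervals and apply the
# basic Simpson formula on each one.
def composite_simpson(f, a, b, m):
    edges = np.linspace(a, b, m + 1)
    total = 0.0
    for ai, bi in zip(edges[:-1], edges[1:]):
        total += (bi - ai) / 6.0 * (f(ai) + 4.0 * f((ai + bi) / 2.0) + f(bi))
    return total

# The error decreases roughly like 1/m^4 for a smooth integrand (int_0^pi sin = 2)
for m in (2, 4, 8, 16):
    print(m, abs(composite_simpson(np.sin, 0.0, np.pi, m) - 2.0))
```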


Page 204: M1 Numerical Analysis Course


Gauss quadrature I

We want to approximate the weighted integral of a function f,

\[
I_w(f) = \int_a^b f(x)\, w(x)\, dx,
\]

where w(x) dx defines a measure of integration. A Gauss quadrature formula with n points is defined by

\[
I_w(f) \approx I_w^n(f) = \sum_{i=1}^{n} \omega_i\, f(x_i)
\]

with points and weights such that it integrates exactly all polynomials f ∈ P2n−1(a, b). The xi (resp. ωi) are called Gauss points (resp. Gauss weights) associated with the present measure. We introduce the function space L2w(a, b) and its natural inner product

\[
(f, g)_w = \int_a^b f(x)\, g(x)\, w(x)\, dx
\]
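For the unweighted case (a, b) = (−1, 1), w(x) = 1, NumPy provides Gauss-Legendre points and weights via leggauss; the sketch below (not in the slides) checks exactness on a polynomial of degree 2n − 1.

```python
import numpy as np

# Gauss-Legendre quadrature on (-1, 1) with weight w(x) = 1:
# n points integrate exactly all polynomials of degree <= 2n - 1.
n = 4
x, w = np.polynomial.legendre.leggauss(n)   # Gauss points and weights

f = lambda t: t**7 + 3.0 * t**2 + 1.0       # degree 7 = 2n - 1
exact = 4.0                                 # int_{-1}^{1} (t^7 + 3 t^2 + 1) dt = 0 + 2 + 2
print(abs(np.dot(w, f(x)) - exact))         # ~ machine precision
```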

Page 205: M1 Numerical Analysis Course


Gauss quadrature II

Theorem 133

I_w^n(f) = I_w(f) for all f ∈ P2n−1(a, b) if and only if

• the points xi are such that the polynomial zn(x) = ∏_{i=1}^{n} (x − xi) ∈ Pn(a, b) is orthogonal to Pn−1(a, b), i.e.

\[
(z_n, p)_w = 0 \quad \forall p \in P_{n-1}(a, b)
\]

• the weights are defined by ωi = Iw(Li), where Li is the Lagrange interpolant at xi, defined by

\[
L_i(x) = \prod_{j=1,\, j \neq i}^{n} \frac{x - x_j}{x_i - x_j}
\]

Corollary 134

The n Gauss points of an n-point Gauss quadrature are the n roots of the orthogonal polynomial of degree n.

For (a, b) = (−1, 1) and w(x) = 1, the xi are the n roots of the degree-n Legendre polynomial.

For (a, b) = (−∞, +∞) and w(x) = exp(−x²), the xi are the n roots of the degree-n Hermite polynomial.

...
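A quick numerical check of the Legendre case of Corollary 134 (not in the slides), assuming NumPy.

```python
import numpy as np

# The Gauss points returned by leggauss coincide with the roots of the
# degree-n Legendre polynomial.
n = 5
gauss_points, _ = np.polynomial.legendre.leggauss(n)
legendre_roots = np.polynomial.legendre.Legendre.basis(n).roots()
print(np.allclose(np.sort(gauss_points), np.sort(legendre_roots)))   # True
```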

