MATH 3795 Lecture 6. Sensitivity of the Solution of a Linear System

Dmitriy Leykekhman

Fall 2008

Goals

- Understand how the solution of $Ax = b$ changes when $A$ or $b$ changes.

- Condition number of a matrix (with respect to inversion).

- Vector and matrix norms.

D. Leykekhman - MATH 3795 Introduction to Computational Mathematics Sensitivity of the Solution – 1

Linear Systems

- Given $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, we are interested in the solution $x \in \mathbb{R}^n$ of

$$Ax = b.$$

- Suppose that instead of $A$ and $b$ we are given $A + \Delta A$ and $b + \Delta b$, where $\Delta A \in \mathbb{R}^{n \times n}$ and $\Delta b \in \mathbb{R}^n$. How do these perturbations in the data change the solution of the linear system?

- First we need to understand how to measure the size of vectors and of matrices. This leads to vector norms and matrix norms.


Vector Norms

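Definition. A (vector) norm on $\mathbb{R}^n$ is a function

$$\|\cdot\| : \mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto \|x\|,$$

which for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ satisfies

1. $\|x\| \ge 0$, and $\|x\| = 0 \Leftrightarrow x = 0$,

2. $\|\alpha x\| = |\alpha| \, \|x\|$,

3. $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality).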


Vector Norms

The most frequently used norms on $\mathbb{R}^n$ are the following.

The 2-norm:

$$\|x\|_2 = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.$$

In MATLAB, use the built-in function norm(x) or norm(x, 2).

More generally, for any $p \in [1, \infty)$, the p-norm:

$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.$$

In MATLAB, use norm(x, p).

The $\infty$-norm:

$$\|x\|_\infty = \max_{i=1,\dots,n} |x_i|.$$

In MATLAB, use norm(x, inf).


Vector Norms

Example. Let $x = (1, -2, 3, -4)^T$. Then

$$\|x\|_1 = 1 + 2 + 3 + 4 = 10,$$

$$\|x\|_2 = \sqrt{1 + 4 + 9 + 16} = \sqrt{30} \approx 5.48,$$

$$\|x\|_\infty = \max\{1, 2, 3, 4\} = 4.$$
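The example above can be checked with a few lines of code. The following is a minimal Python sketch (the lecture itself uses MATLAB's norm); the helper name p_norm is an assumption made for illustration and is not part of any library.

```python
import math

def p_norm(x, p):
    """Return the vector p-norm of x (hypothetical helper).

    p is a number >= 1, or math.inf for the infinity-norm.
    """
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [1, -2, 3, -4]
norm_1 = p_norm(x, 1)           # sum of absolute values
norm_2 = p_norm(x, 2)           # Euclidean length, sqrt(30)
norm_inf = p_norm(x, math.inf)  # largest absolute entry
print(norm_1, norm_2, norm_inf)
```

The infinity-norm is handled as a separate case because the formula with the exponent 1/p is only defined for finite p.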


Vector Norms

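The boundaries of the unit balls defined by

$$\{x \in \mathbb{R}^n : \|x\|_p \le 1\}.$$

One can show the following useful inequalities:

- $\|x\|_\infty \le \|x\|_2 \le \|x\|_1$.

- If $\|\cdot\|$ is any vector norm on $\mathbb{R}^n$, then

$$\|x + y\| \ge \big| \|x\| - \|y\| \big| \quad \text{for all } x, y \in \mathbb{R}^n.$$

- Cauchy-Schwarz inequality:

$$x^T y \le \|x\|_2 \|y\|_2 \quad \text{for all } x, y \in \mathbb{R}^n.$$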


Vector Norms

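Theorem. Vector norms on $\mathbb{R}^n$ are equivalent, i.e., for every two vector norms $\|\cdot\|_a$ and $\|\cdot\|_b$ on $\mathbb{R}^n$ there exist positive constants $c_{ab}$, $C_{ab}$ (depending on the vector norms $\|\cdot\|_a$ and $\|\cdot\|_b$, but not on $x$) such that

$$c_{ab} \|x\|_b \le \|x\|_a \le C_{ab} \|x\|_b \quad \forall x \in \mathbb{R}^n.$$

In particular, for any $x \in \mathbb{R}^n$ we have the inequalities

$$\frac{1}{\sqrt{n}} \|x\|_1 \le \|x\|_2 \le \|x\|_1,$$

$$\|x\|_\infty \le \|x\|_2 \le \sqrt{n} \, \|x\|_\infty,$$

$$\|x\|_\infty \le \|x\|_1 \le n \|x\|_\infty.$$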
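As a sanity check (not a proof), the three pairs of inequalities can be tested on random vectors. The Python sketch below makes several illustrative assumptions: the dimension n = 5, the sample size, the fixed random seed, and a small tolerance tol that absorbs floating point rounding.

```python
import math
import random

def norm_1(x):
    return sum(abs(v) for v in x)

def norm_2(x):
    return math.sqrt(sum(v * v for v in x))

def norm_inf(x):
    return max(abs(v) for v in x)

random.seed(0)   # fixed seed so the run is reproducible
n = 5            # illustrative dimension
tol = 1e-12      # slack for rounding errors
all_hold = True
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    n1, n2, ninf = norm_1(x), norm_2(x), norm_inf(x)
    ok = (n1 / math.sqrt(n) <= n2 + tol and n2 <= n1 + tol
          and ninf <= n2 + tol and n2 <= math.sqrt(n) * ninf + tol
          and ninf <= n1 + tol and n1 <= n * ninf + tol)
    all_hold = all_hold and ok
print(all_hold)
```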


Matrix Norms

Definition. A matrix norm on $\mathbb{R}^{m \times n}$ is a function

$$\|\cdot\| : \mathbb{R}^{m \times n} \to \mathbb{R}, \qquad A \mapsto \|A\|,$$

which for all $A, B \in \mathbb{R}^{m \times n}$ and $\alpha \in \mathbb{R}$ satisfies

1. $\|A\| \ge 0$, and $\|A\| = 0 \Leftrightarrow A = 0$ (zero matrix),

2. $\|\alpha A\| = |\alpha| \, \|A\|$,

3. $\|A + B\| \le \|A\| + \|B\|$ (triangle inequality).

Warning: Matrix norms and vector norms are denoted by the same symbol $\|\cdot\|$. However, as we will see shortly, vector norms and matrix norms are computed very differently. Thus, before computing a norm we need to examine carefully whether it is applied to a vector or to a matrix. It should be clear from the context which norm, a vector norm or a matrix norm, is used.


Matrix Norms. First Approach.

- View a matrix $A \in \mathbb{R}^{m \times n}$ as a vector in $\mathbb{R}^{mn}$, by stacking the columns of the matrix into a long vector.

- Apply the vector norms to this vector of length $mn$.

- This will give matrix norms. For example, if we apply the 2-vector-norm, then

$$\|A\|_F = \left( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \right)^{1/2}.$$

This is called the Frobenius norm. (We will use $\|A\|_2$ to denote a different matrix norm.)

- This approach is not very useful.






Matrix Norms. Second Approach.

- We want to solve linear systems $Ax = b$: find a vector $x$ such that if we multiply $A$ by this vector (we apply $A$ to this vector), then we obtain $b$.

- View a matrix $A \in \mathbb{R}^{m \times n}$ as a linear mapping, which maps a vector $x \in \mathbb{R}^n$ into a vector $Ax \in \mathbb{R}^m$:

$$A : \mathbb{R}^n \to \mathbb{R}^m, \qquad x \mapsto Ax.$$

- How do we define the size of a linear mapping?

- Compare the size of the image $Ax \in \mathbb{R}^m$ with the size of $x$. This leads us to look at

$$\sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$

Here $Ax \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$ are vectors and $\|\cdot\|$ are vector norms (in $\mathbb{R}^m$ and $\mathbb{R}^n$).





Matrix Norms

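- Let $p \in [1, \infty]$. The following identities are valid:

$$\sup_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \sup_{\|x\|_p = 1} \|Ax\|_p = \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p.$$

- One can show that the following defines a matrix norm, the p-matrix-norm:

$$\|A\|_p = \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p}. \tag{1}$$

Note that on the left hand side in (1) the symbol $\|\cdot\|_p$ refers to the p-matrix-norm, while on the right hand side in (1) the symbol $\|\cdot\|_p$ refers to the p-vector-norm applied to the vectors $Ax \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$, respectively.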



Matrix Norms

For the most commonly used matrix norms (1) with $p = 1$, $p = 2$, or $p = \infty$, there exist rather simple representations.

Let $\|\cdot\|_p$ be the matrix norm defined in (1). Then

$$\|A\|_1 = \max_{j=1,\dots,n} \sum_{i=1}^{m} |a_{ij}| \quad \text{(maximum column norm)};$$

$$\|A\|_\infty = \max_{i=1,\dots,m} \sum_{j=1}^{n} |a_{ij}| \quad \text{(maximum row norm)};$$

$$\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} \quad \text{(spectral norm)},$$

where $\lambda_{\max}(A^T A)$ is the largest eigenvalue of $A^T A$.


Matrix Norms

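Example. Let

$$A = \begin{pmatrix} 1 & 3 & -6 \\ -2 & 4 & 2 \\ 2 & 1 & -1 \end{pmatrix}.$$

Then

$$\|A\|_1 = \max\{5, 8, 9\} = 9,$$

$$\|A\|_\infty = \max\{10, 8, 4\} = 10,$$

$$\|A\|_2 = \sqrt{\max\{3.07, 23.86, 49.06\}} \approx 7.0045,$$

$$\|A\|_F = \sqrt{76} \approx 8.718.$$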
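The four norms in this example can be reproduced with the Python sketch below. The 1-, infinity- and Frobenius norms follow their formulas directly. For the 2-norm the sketch uses power iteration on $A^T A$ as a stand-in for an eigenvalue solver; this is an illustrative choice, not how MATLAB's norm computes it, and it assumes the starting vector is not orthogonal to the dominant eigenvector. All function names are hypothetical.

```python
import math

A = [[1, 3, -6],
     [-2, 4, 2],
     [2, 1, -1]]

def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_fro(A):
    """Square root of the sum of all squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def norm_2(A, iterations=200):
    """Estimate sqrt(lambda_max(A^T A)) by power iteration on A^T A."""
    m, n = len(A), len(A[0])
    v = [1.0] * n  # assumed to have a component along the dominant eigenvector
    for _ in range(iterations):
        Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        w = [sum(A[i][j] * Av[i] for i in range(m)) for j in range(n)]  # A^T (A v)
        length = math.sqrt(sum(wj * wj for wj in w))
        v = [wj / length for wj in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(a * a for a in Av))  # ||A v||_2 with ||v||_2 = 1

print(norm_1(A), norm_inf(A), norm_2(A), norm_fro(A))
```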


Matrix Norms. Two important inequalities.

Theorem. For any $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times k}$ and $x \in \mathbb{R}^n$, the following inequalities hold:

$$\|Ax\|_p \le \|A\|_p \|x\|_p \quad \text{(compatibility of matrix and vector norm)}$$

and

$$\|AB\|_p \le \|A\|_p \|B\|_p \quad \text{(submultiplicativity of matrix norms)}.$$

Note that for the identity matrix $I$,

$$\|I\|_p = \max_{x \ne 0} \frac{\|Ix\|_p}{\|x\|_p} = 1.$$

Compare this with the first approach, in which we view $I$ as a vector of length $n^2$. For example, the Frobenius norm (2-vector-norm) is

$$\|I\|_F = \sqrt{n}.$$


Error Analysis

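- Let

$$Ax = b \tag{2}$$

be the original system, where $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$.

- Let

$$(A + \Delta A)\tilde{x} = b + \Delta b \tag{3}$$

be the perturbed system, where $\Delta A \in \mathbb{R}^{n \times n}$ and $\Delta b \in \mathbb{R}^n$ represent the perturbations in $A$ and $b$, respectively.

- What is the error $\Delta x = \tilde{x} - x$ between the solution $x$ of the exact linear system (2) and the solution $\tilde{x}$ of the perturbed linear system (3)?

- Use the representation $\tilde{x} = x + \Delta x$.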


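Error Analysis. Perturbation in b only

The original linear system is

$$Ax = b,$$

where $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$. The perturbed linear system is

$$A(x + \Delta x) = b + \Delta b,$$

where $\Delta b \in \mathbb{R}^n$ represents the perturbations in $b$. Subtracting, we get

$$A \Delta x = \Delta b, \quad \text{or} \quad \Delta x = A^{-1} \Delta b.$$

Take norms:

$$\|\Delta x\| = \|A^{-1} \Delta b\| \le \|A^{-1}\| \|\Delta b\|. \tag{4}$$

To estimate the relative error, note that $Ax = b$ and as a result

$$\|b\| = \|Ax\| \le \|A\| \|x\| \quad \Rightarrow \quad \frac{1}{\|x\|} \le \|A\| \frac{1}{\|b\|}. \tag{5}$$

Combining (4) and (5) we get

$$\frac{\|\Delta x\|}{\|x\|} \le \|A\| \|A^{-1}\| \frac{\|\Delta b\|}{\|b\|}. \tag{6}$$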


Error Analysis. Perturbation in b only

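Definition. The ($p$-) condition number $\kappa_p(A)$ of a matrix $A$ (with respect to inversion) is defined by

$$\kappa_p(A) = \|A\|_p \|A^{-1}\|_p.$$

Set $\kappa_p(A) = \infty$ if $A$ is not invertible. In MATLAB, use the built-in function cond(A), which returns the 2-norm condition number, or cond(A, p) for another norm.

If

$$Ax = b$$

and

$$A(x + \Delta x) = b + \Delta b,$$

then the relative error between the solutions obeys

$$\frac{\|\Delta x\|_p}{\|x\|_p} \le \kappa_p(A) \frac{\|\Delta b\|_p}{\|b\|_p}.$$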
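A small worked case may make the bound concrete. The Python sketch below uses an illustrative 2-by-2 matrix, solution and perturbation chosen for this note (none of them come from the lecture) and exact rational arithmetic, so rounding plays no role. For this particular choice the bound happens to hold with equality in the infinity-norm.

```python
from fractions import Fraction as F

def inf_norm_vec(v):
    return max(abs(t) for t in v)

def inf_norm_mat(M):
    return max(sum(abs(t) for t in row) for row in M)

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Illustrative data (assumptions, not from the lecture).
A = [[F(1), F(2)],
     [F(3), F(4)]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]  # explicit 2x2 inverse

kappa = inf_norm_mat(A) * inf_norm_mat(A_inv)  # condition number in the inf-norm

x = [F(1), F(1)]
b = matvec(A, x)                     # right hand side of the exact system
delta_b = [F(1, 100), F(-1, 100)]    # perturbation of b
delta_x = matvec(A_inv, delta_b)     # resulting change in the solution

rel_err = inf_norm_vec(delta_x) / inf_norm_vec(x)
bound = kappa * inf_norm_vec(delta_b) / inf_norm_vec(b)
print(kappa, rel_err, bound)
```

Here the relative change in $b$ is 1/700, while the relative change in $x$ is 3/100, a factor of exactly $\kappa_\infty(A) = 21$ larger.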


Error Analysis. General Case.

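- Let

$$Ax = b \tag{7}$$

be the original system, and let

$$(A + \Delta A)(x + \Delta x) = b + \Delta b \tag{8}$$

be the perturbed system.

- If $\|A^{-1}\|_p \|\Delta A\|_p < 1$, then

$$\frac{\|\Delta x\|_p}{\|x\|_p} \le \frac{\kappa_p(A)}{1 - \kappa_p(A) \dfrac{\|\Delta A\|_p}{\|A\|_p}} \left( \frac{\|\Delta A\|_p}{\|A\|_p} + \frac{\|\Delta b\|_p}{\|b\|_p} \right). \tag{9}$$

If $\kappa_p(A)$ is small, we say that the linear system is well conditioned. Otherwise, we say that the linear system is ill conditioned.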
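The slides state (9) without proof. One possible derivation, sketched here along the same lines as the argument leading to (6), uses only the compatibility of matrix and vector norms, the triangle inequality, and inequality (5):

```latex
% Subtract (7) from (8) and solve for \Delta x:
A \,\Delta x = \Delta b - \Delta A\, x - \Delta A\, \Delta x
\quad\Longrightarrow\quad
\Delta x = A^{-1} \left( \Delta b - \Delta A\, x - \Delta A\, \Delta x \right).

% Take norms:
\|\Delta x\|_p \le \|A^{-1}\|_p
  \left( \|\Delta b\|_p + \|\Delta A\|_p \|x\|_p + \|\Delta A\|_p \|\Delta x\|_p \right).

% Collect the \|\Delta x\|_p terms; the factor on the left is positive
% because \|A^{-1}\|_p \|\Delta A\|_p < 1:
\left( 1 - \|A^{-1}\|_p \|\Delta A\|_p \right) \|\Delta x\|_p
  \le \|A^{-1}\|_p \left( \|\Delta b\|_p + \|\Delta A\|_p \|x\|_p \right).

% Divide by \|x\|_p and use 1/\|x\|_p \le \|A\|_p / \|b\|_p from (5):
\frac{\|\Delta x\|_p}{\|x\|_p}
  \le \frac{\|A\|_p \|A^{-1}\|_p}{1 - \|A^{-1}\|_p \|\Delta A\|_p}
  \left( \frac{\|\Delta b\|_p}{\|b\|_p} + \frac{\|\Delta A\|_p}{\|A\|_p} \right).
```

Since $\|A^{-1}\|_p \|\Delta A\|_p = \kappa_p(A) \|\Delta A\|_p / \|A\|_p$, the last line is exactly (9).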


Error Analysis. Example. Hilbert Matrix

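Example. Hilbert matrix $H \in \mathbb{R}^{n \times n}$ with entries

$$h_{ij} = \int_0^1 x^{i+j-2} \, dx = \frac{1}{i + j - 1}.$$

For $n = 4$,

$$H = \begin{pmatrix}
1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\
\frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\
\frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\
\frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7}
\end{pmatrix},$$

$$H^{-1} = \begin{pmatrix}
16 & -120 & 240 & -140 \\
-120 & 1200 & -2700 & 1680 \\
240 & -2700 & 6480 & -4200 \\
-140 & 1680 & -4200 & 2800
\end{pmatrix}.$$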


Error Analysis. Example. Hilbert Matrix.

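Example. We compute that the condition number of a Hilbert matrix grows very fast with $n$. For $n = 4$,

$$\|H\|_1 = \frac{25}{12}, \quad \|H^{-1}\|_1 = 13620, \quad \kappa_1(H) = 28375,$$

$$\|H\|_\infty = \frac{25}{12}, \quad \|H^{-1}\|_\infty = 13620, \quad \kappa_\infty(H) = 28375,$$

$$\|H\|_2 \approx 1.5, \quad \|H^{-1}\|_2 \approx 1.03 \cdot 10^4, \quad \kappa_2(H) \approx 1.55 \cdot 10^4.$$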
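The 1-norm values for $n = 4$ can be reproduced exactly. The Python sketch below builds the Hilbert matrix with rational entries and inverts it by Gauss-Jordan elimination in exact arithmetic. The function names are hypothetical, and the elimination routine is a plain textbook version that assumes the matrix is invertible.

```python
from fractions import Fraction

def hilbert(n):
    """Hilbert matrix with exact entries 1 / (i + j - 1)."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def inverse(M):
    """Invert a square matrix exactly by Gauss-Jordan elimination.

    Assumes M is invertible; next() raises StopIteration otherwise.
    """
    n = len(M)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def norm_1(M):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M)))

H = hilbert(4)
H_inv = inverse(H)
kappa_1 = norm_1(H) * norm_1(H_inv)
print(norm_1(H), norm_1(H_inv), kappa_1)
```

Because $H$ and $H^{-1}$ are symmetric, the infinity-norm values coincide with the 1-norm values, which matches the two identical rows of numbers above.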


Error Analysis. Example. Hilbert Matrix.

Example. We consider the linear systems

$$Hx = b.$$

For given $n$ we set $x_{ex} = (1, \dots, 1)^T \in \mathbb{R}^n$ and compute $b = H x_{ex}$. Then we compute the solution of the linear system $Hx = b$ using the LU-decomposition and compute the relative error between the exact solution $x_{ex}$ and the computed solution $x$.

n     $\kappa_\infty(H)$     $\|x_{ex} - x\|_\infty / \|x_{ex}\|_\infty$
4     2.837500e+004          2.958744e-013
5     9.436560e+005          5.129452e-012
6     2.907028e+007          5.096734e-011
7     9.851949e+008          2.214796e-008
8     3.387279e+010          1.973904e-007
9     1.099651e+012          4.215144e-005
10    3.535372e+013          5.382182e-004
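An experiment of this kind can be sketched in Python as follows. This is not the code behind the table: it uses a hand-written Gaussian elimination with partial pivoting in double precision (in place of MATLAB's LU-based solver), and the function names are hypothetical. The printed errors depend on the machine and on the order of operations, so they need not match the table digit for digit, but they should show the same growth with $n$.

```python
def hilbert(n):
    """Hilbert matrix in floating point."""
    return [[1.0 / (i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

errors = {}
for n in range(4, 11):
    H = hilbert(n)
    x_ex = [1.0] * n
    b = [sum(row) for row in H]  # b = H * x_ex, since x_ex is all ones
    x = solve(H, b)
    # Relative error in the infinity-norm; note that ||x_ex||_inf = 1.
    errors[n] = max(abs(xe - xi) for xe, xi in zip(x_ex, x))
    print(n, errors[n])
```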


Error Analysis.

- If we use finite precision arithmetic, then rounding causes errors in the input data. In $m$-digit floating point arithmetic it holds that

$$\frac{|x - fl(x)|}{|x|} \le 0.5 \cdot 10^{-m+1}.$$

- Thus, if we solve the linear system in $m$-digit floating point arithmetic, then, as a rule of thumb, we may approximate the input errors due to rounding by

$$\frac{\|\Delta A\|}{\|A\|} \approx 0.5 \cdot 10^{-m+1}, \qquad \frac{\|\Delta b\|}{\|b\|} \approx 0.5 \cdot 10^{-m+1}.$$

- If the condition number of $A$ is $\kappa(A) = 10^{\alpha}$, then

$$\frac{\|\Delta x\|}{\|x\|} \le \frac{10^{\alpha}}{1 - 10^{\alpha - m + 1}} \left( 0.5 \cdot 10^{-m+1} + 0.5 \cdot 10^{-m+1} \right) \approx 10^{\alpha - m + 1},$$

provided $10^{\alpha - m + 1} < 1$.

- Rule of thumb: If the linear system is solved in $m$-digit floating point arithmetic and if the condition number of $A$ is of the order $10^{\alpha}$, then only $m - \alpha - 1$ digits in the solution can be trusted.
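The rule of thumb can be compared with the Hilbert matrix table from the earlier example. The Python sketch below copies the condition numbers and relative errors from that table and assumes $m = 16$ decimal digits for double precision, which is an approximation introduced here rather than a value from the lecture. It prints the number of digits the rule predicts next to the number of correct digits actually observed.

```python
import math

m = 16  # assumption: double precision carries roughly 16 decimal digits

# n: (kappa_inf(H), observed relative error), copied from the table above
table = {
    4: (2.837500e+04, 2.958744e-13),
    5: (9.436560e+05, 5.129452e-12),
    6: (2.907028e+07, 5.096734e-11),
    7: (9.851949e+08, 2.214796e-08),
    8: (3.387279e+10, 1.973904e-07),
    9: (1.099651e+12, 4.215144e-05),
    10: (3.535372e+13, 5.382182e-04),
}

results = []
for n, (kappa, rel_err) in table.items():
    alpha = math.log10(kappa)          # kappa = 10^alpha
    predicted = m - alpha - 1          # digits the rule of thumb trusts
    observed = -math.log10(rel_err)    # digits that were actually correct
    results.append((n, predicted, observed))
    print(n, round(predicted, 1), round(observed, 1))
```

Under this assumption the observed accuracy is somewhat better than predicted in every row, which is consistent with the rule being derived from an upper bound.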


Summary.

- If the condition number of a matrix $A$ is large, then small errors in the data may lead to large errors in the solution.

- Rule of thumb: If the linear system is solved in $m$-digit floating point arithmetic and if the condition number of $A$ is of the order $10^{\alpha}$, then only $m - \alpha - 1$ digits in the solution can be trusted.

