
Numerical Linear Algebra
Chap. 3: Eigenvalue Problems

Heinrich Voss
[email protected]

Hamburg University of Technology
Institute of Numerical Simulation


Eigenvalues

λ ∈ C is an eigenvalue of A ∈ C^{n×n} if the homogeneous linear system of equations

Ax = λx

has a nontrivial solution x ∈ C^n \ {0}. Then x is called an eigenvector of A corresponding to λ.

The set of all eigenvalues of A is called the spectrum of A and is denoted by σ(A).

λ is an eigenvalue of A if and only if

det(A− λI) = 0.

χ(λ) := det(A − λI) is a polynomial of degree n, the characteristic polynomial of A.


Eigenvalues ct.

If λ is a root of χ of multiplicity k (i.e. the polynomial χ(z) is divisible by (z − λ)^k but not by (z − λ)^{k+1}), then k is called the algebraic multiplicity of λ. The algebraic multiplicity of λ is denoted by α(λ).

For A ∈ C^{n×n} the characteristic polynomial χ has degree n. Hence the sum of all algebraic multiplicities of eigenvalues equals n.

If λ is an eigenvalue of A then

E_λ := {x ∈ C^n : (A − λI)x = 0}

is a subspace of C^n, which is called the eigenspace of A corresponding to λ.

γ(λ) := dim E_λ is the geometric multiplicity of an eigenvalue λ of A.

It can be shown that γ(λ) ≤ α(λ) for every eigenvalue λ.


Similar matrices

Let X ∈ Cn×n be nonsingular. Then

A and B := X^{-1}AX

are called similar matrices. The map A ↦ X^{-1}AX is called a similarity transformation.

Since

det(B − λI) = det(X^{-1}(A − λI)X) = det(X^{-1}) det(A − λI) det(X) = det(A − λI),

similar matrices have the same eigenvalues, including their algebraic multiplicities.

It can be shown that the geometric multiplicities coincide as well.


Diagonalizable matrix

Let Ax^j = λ_j x^j, j = 1, …, k, where λ_i ≠ λ_j for i ≠ j. Then the set {x^1, …, x^k} is linearly independent.

Let x = ∑_{j=1}^k α_j x^j = 0. For j ∈ {1, …, k} it follows

(A − λ_1 I) ⋯ (A − λ_{j−1} I)(A − λ_{j+1} I) ⋯ (A − λ_k I) x = α_j ∏_{i=1, i≠j}^k (λ_j − λ_i) x^j = 0,

and therefore α_j = 0.

In particular, if A has n different eigenvalues λ_j with eigenvectors x^j, then X := (x^1, …, x^n) is nonsingular, and it holds

AX = (Ax^1, …, Ax^n) = (λ_1 x^1, …, λ_n x^n) = XΛ  ⟺  X^{-1}AX = Λ,

where Λ := diag(λ_1, …, λ_n) denotes the diagonal matrix with entries λ_1, …, λ_n.

Hence, A is diagonalizable, i.e. similar to a diagonal matrix.
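A quick numerical check of this diagonalization (a minimal NumPy sketch; the 2×2 matrix is my own example, not from the slides):

import numpy as np

# A small matrix with two different eigenvalues, hence diagonalizable.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

lam, X = np.linalg.eig(A)          # columns of X are eigenvectors x^j
Lam = np.linalg.inv(X) @ A @ X     # similarity transformation X^{-1} A X

assert np.allclose(Lam, np.diag(lam))   # X^{-1} A X = diag(lambda_1, ..., lambda_n)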


Diagonalizable matrix ct.

More generally, if for all eigenvalues λ_j, j = 1, …, k of A the algebraic and geometric multiplicities coincide (α(λ_j) = γ(λ_j)), then choosing in each of the eigenspaces E_{λ_j} a basis x^{j,1}, …, x^{j,α(λ_j)}, the matrix

X = (x^{1,1}, …, x^{1,α(λ_1)}, x^{2,1}, …, x^{k,α(λ_k)})

is nonsingular, and it diagonalizes A.

It can be shown that A is diagonalizable if and only if α(λ_j) = γ(λ_j) for every eigenvalue λ_j of A.

For

A = ( 0  1 )
    ( 0  0 )

it holds α(0) = 2 ≠ 1 = γ(0), and therefore not every matrix is diagonalizable.


Jordan's canonical form

Let A ∈ C^{n×n} with distinct eigenvalues λ_1, …, λ_k. Then there exists a nonsingular matrix X such that

X^{-1}AX = diag(J_1, …, J_k) := ( J_1  O    …  O   )
                                ( O    J_2  …  O   )
                                ( ⋮         ⋱   ⋮  )
                                ( O    O    …  J_k )

is a block diagonal matrix.

Each of the diagonal blocks J_j = diag(J_{j,1}, …, J_{j,γ(λ_j)}) is a block diagonal matrix of dimension α(λ_j) with γ(λ_j) blocks, where

J_{j,i} = ( λ_j  1    …   0   )
          ( 0    λ_j  ⋱   ⋮   )
          ( ⋮    ⋱    ⋱   1   )
          ( 0    …    0   λ_j )


Hermitian matrices

A ∈ R^{n×n} is symmetric if A = A^T. More generally, A ∈ C^{n×n} is a Hermitian matrix if A^H := Ā^T = A, where Ā denotes the matrix obtained from A by replacing each of its entries by its complex conjugate.

All eigenvalues of a Hermitian matrix are real: for Ax = λx it holds

x^H A x = x^H (λx) = λ x^H x  and  x^H A x = (A^H x)^H x = (Ax)^H x = (λx)^H x = λ̄ x^H x,

from which we get λ̄ = λ, i.e. λ ∈ R.

Eigenvectors of a Hermitian matrix corresponding to distinct eigenvalues are orthogonal: for Ax = λx, Ay = µy and λ ≠ µ it holds

y^H A x = λ y^H x  and  y^H A x = (A^H y)^H x = (Ay)^H x = µ y^H x.

Hence (λ − µ) y^H x = 0, and λ ≠ µ implies y^H x = 0.


Invariant subspace

A subspace V of Cn is an invariant subspace of A if Ax ∈ V for every x ∈ V .

Every invariant subspace of A contains an eigenvector of A.

Let x^1, …, x^k ∈ C^n be a basis of V. Then for j = 1, …, k there exist b_{ij} ∈ C such that Ax^j = ∑_{i=1}^k b_{ij} x^i.

Let λ be an eigenvalue of B = (b_{ij}) ∈ C^{k×k} with eigenvector ξ = (ξ_1, …, ξ_k)^T, and let x := ∑_{i=1}^k ξ_i x^i ≠ 0. Then

Ax = ∑_{j=1}^k ξ_j A x^j = ∑_{j=1}^k ∑_{i=1}^k ξ_j b_{ij} x^i = ∑_{i=1}^k (∑_{j=1}^k b_{ij} ξ_j) x^i = ∑_{i=1}^k λ ξ_i x^i = λx.


Hermitian matrices are diagonalizable

Let A be a Hermitian matrix. Then there exists a unitary matrix U ∈ C^{n×n} (i.e. U^H U = I) such that

U^H A U = diag(λ_1, …, λ_n).

Let x^1 be an eigenvector of A such that Ax^1 = λ_1 x^1 and (x^1)^H x^1 = 1. Then for x ∈ C^n such that x^H x^1 = 0 it holds

(Ax)^H x^1 = x^H A^H x^1 = x^H (Ax^1) = λ_1 x^H x^1 = 0.

Hence V_1 := {x ∈ C^n : x^H x^1 = 0} is an invariant subspace of A, and therefore it contains an eigenvector x^2, which can be normalized such that (x^2)^H x^2 = 1.

If x^1, …, x^j are j orthogonal eigenvectors of A, then in the same way as before

V_j := {x^1, …, x^j}^⊥ = {x ∈ C^n : x^H x^i = 0, i = 1, …, j}

is an invariant subspace of A, and hence there exists an eigenvector x^{j+1} which is orthogonal to x^1, …, x^j. Then U = (x^1, …, x^n) has the desired property.


Rayleigh’s principle

Let A ∈ C^{n×n} be a Hermitian matrix. Then for x ≠ 0

R_A(x) := (x^H A x)/(x^H x)

is called the Rayleigh quotient of A at x.

Let λ_1 ≤ λ_2 ≤ ⋯ ≤ λ_n be the eigenvalues of A, and let x^1, …, x^n be a set of corresponding orthogonalized eigenvectors. Then it holds

λ_1 = min_{x≠0} R_A(x)  and  λ_n = max_{x≠0} R_A(x).

For i = 1, 2, …, n it holds

λ_i = min{R_A(x) : x ∈ C^n, x^H x^j = 0, j = 1, …, i−1}
    = max{R_A(x) : x ∈ C^n, x^H x^j = 0, j = i+1, …, n}.
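Rayleigh's principle is easy to verify numerically; a minimal NumPy sketch (the random symmetric test matrix is my own choice) checks that every Rayleigh quotient lies between λ_1 and λ_n:

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                        # random real symmetric (Hermitian) matrix

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)         # R_A(x) = x^H A x / x^H x

lam = np.linalg.eigvalsh(A)              # sorted: lam[0] <= ... <= lam[-1]
q = [rayleigh(A, rng.standard_normal(5)) for _ in range(1000)]

assert lam[0] <= min(q) and max(q) <= lam[-1]   # lambda_1 = min R_A, lambda_n = max R_A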


Proof of Rayleigh’s principle

Let x^1, …, x^n be an orthonormal system of eigenvectors of A ∈ C^{n×n}, where Ax^j = λ_j x^j.

For x ∈ C^n, x ≠ 0, let x = ∑_{j=1}^n ξ_j x^j. Then

x^H x = (∑_{j=1}^n ξ_j x^j)^H (∑_{k=1}^n ξ_k x^k) = ∑_{j,k=1}^n ξ̄_j ξ_k (x^j)^H x^k = ∑_{j=1}^n |ξ_j|²

x^H A x = (∑_{j=1}^n ξ_j x^j)^H A (∑_{k=1}^n ξ_k x^k) = (∑_{j=1}^n ξ_j x^j)^H (∑_{k=1}^n ξ_k λ_k x^k) = ∑_{j=1}^n λ_j |ξ_j|²

Hence,

R_A(x) = ∑_{j=1}^n α_j λ_j,  with  α_j = |ξ_j|² / ∑_{k=1}^n |ξ_k|².


Proof of Rayleigh’s principle

From 0 ≤ α_j ≤ 1 and ∑_{j=1}^n α_j = 1 one obtains

λ_1 = ∑_{j=1}^n α_j λ_1 ≤ ∑_{j=1}^n α_j λ_j ≤ ∑_{j=1}^n α_j λ_n = λ_n,

and λ_1 = R_A(x^1), λ_n = R_A(x^n).

λ_i = min{R_A(x) : x ∈ C^n, x^H x^j = 0, j = 1, …, i−1}
    = max{R_A(x) : x ∈ C^n, x^H x^j = 0, j = i+1, …, n}

follow in a similar way, since ξ_1 = ⋯ = ξ_{i−1} = 0 if x^H x^j = 0 for j = 1, …, i−1.


Numerical methods

Linear systems of equations Ax = b can be solved by a finite algorithm (i.e. a finite number of operations) like Gaussian elimination.

Determining an eigenvalue of a matrix A ∈ R^{n×n} is equivalent to finding a root of the characteristic polynomial

χ(λ) := det(A − λI) = 0.

It is known (theorem of Abel) that for n ≥ 5 there is no formula for solving

det(A − λI) = 0

for λ. Hence, the eigenvalue problem Ax = λx can usually be solved only by iterative methods.


Example

A = ( 0.2  0.3  0.4 )
    ( 0.6  0.2  0.5 )
    ( 0.2  0.5  0.1 )

Choose any vector x^0 ∈ R^3 and compute the sequence

x^k := A x^{k−1},  k = 1, 2, 3, …

After a small number of steps (≈ 10) we obtain

x^k = ( 0.5122 )
      ( 0.6974 )
      ( 0.5013 )

and ‖A x^k − x^k‖ is small.

x^k seems to be an eigenvector corresponding to the eigenvalue λ = 1. Is this a miracle?
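The experiment is easy to reproduce; a short NumPy sketch (the starting vector is arbitrary, and the iterate is scaled to unit norm only for comparison with the vector above):

import numpy as np

A = np.array([[0.2, 0.3, 0.4],
              [0.6, 0.2, 0.5],
              [0.2, 0.5, 0.1]])

x = np.ones(3)                       # any starting vector x^0
for k in range(10):
    x = A @ x                        # x^k := A x^{k-1}

x /= np.linalg.norm(x)               # scale for comparison
print(x)                             # ~ (0.5122, 0.6974, 0.5013)^T
print(np.linalg.norm(A @ x - x))     # small: x is nearly an eigenvector for lambda = 1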


A is stochastic

All elements of A are nonnegative, and every column of A adds to 1. Matrices with these properties are called stochastic. They describe the behavior of Markov chains.

If A is stochastic, then every row of A^T adds to 1, and therefore (1, 1, …, 1)^T is an eigenvector of A^T corresponding to the eigenvalue 1.

det(A − λI) = det(A^T − λI)

implies that the eigenvalues of A and A^T coincide. Hence, every stochastic matrix has the eigenvalue λ = 1.


Power method

Assume that A is diagonalizable, i.e. there exist n linearly independent eigenvectors u^1, …, u^n of A, and assume that λ_1 is a dominant eigenvalue:

|λ_1| > |λ_2|, |λ_3|, …, |λ_n|.

The initial vector x^0 can be represented as

x^0 = ∑_{j=1}^n α_j u^j.

A x^0 = A (∑_{j=1}^n α_j u^j) = ∑_{j=1}^n α_j A u^j = ∑_{j=1}^n α_j λ_j u^j


Power method ct.

A² x^0 = A (∑_{j=1}^n α_j λ_j u^j) = ∑_{j=1}^n α_j λ_j A u^j = ∑_{j=1}^n α_j λ_j² u^j

By induction it follows

A^m x^0 = ∑_{j=1}^n α_j λ_j^m u^j = λ_1^m (α_1 u^1 + ∑_{j=2}^n α_j (λ_j/λ_1)^m u^j).

From |λ_j|/|λ_1| < 1 it follows that (λ_j/λ_1)^m → 0. Hence, if α_1 ≠ 0, then the sequence

λ_1^{−m} A^m x^0 = α_1 u^1 + ∑_{j=2}^n α_j (λ_j/λ_1)^m u^j

converges to an eigenvector corresponding to λ_1.


Power method ct.

If |λ_1| ≠ 1, then for increasing m one obtains overflow or underflow.

Apply the method to

B = ( 0.2   0.3  0.4 )
    ( 0.6  −0.1  0.5 )
    ( 0.2   0.5  0.1 )

The sequence x^m converges to the null vector. The largest eigenvalue of B in modulus seems to be smaller than 1.


Power method

Normalize xm in each step to avoid underflow or overflow.

Power method
1: Given initial vector x^0
2: for m = 0, 1, 2, … until convergence do
3:   y^{m+1} = A x^m
4:   k_{m+1} = ‖y^{m+1}‖
5:   x^{m+1} = y^{m+1} / k_{m+1}
6: end for

With this modification the power method converges in a reasonable number of steps to an eigenvector of B corresponding to the dominant eigenvalue λ_1 = 0.9304.
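A direct translation of this pseudocode into NumPy might look as follows (the stopping test is my own choice; the sign factor guards against the harmless sign flips of the normalized iterates):

import numpy as np

def power_method(A, x0, tol=1e-10, maxit=1000):
    # Steps 1-6 of the power method above, with normalization in each step.
    x = x0 / np.linalg.norm(x0)
    k = 0.0
    for m in range(maxit):
        y = A @ x                          # y^{m+1} = A x^m
        k = np.linalg.norm(y)              # k_{m+1} = ||y^{m+1}||
        x_new = y / k                      # x^{m+1} = y^{m+1} / k_{m+1}
        if np.linalg.norm(x_new - np.sign(x_new @ x) * x) < tol:
            return k, x_new                # k approximates |lambda_1|
        x = x_new
    return k, x

B = np.array([[0.2, 0.3, 0.4],
              [0.6, -0.1, 0.5],
              [0.2, 0.5, 0.1]])
k, x = power_method(B, np.ones(3))
print(k)                                   # ~ 0.9304, dominant eigenvalue of B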


Observations

λ_1^{−m} A^m x^0 = α_1 u^1 + ∑_{j=2}^n α_j (λ_j/λ_1)^m u^j

demonstrates that the speed of convergence depends on

q := max_{j=2,…,n} |λ_j| / |λ_1|.

The smaller q is, the faster the convergence of the power method.

If the initial vector x^0 has no component of the eigenvector corresponding to the dominant eigenvalue (i.e. α_1 = 0), then in the course of the algorithm rounding errors usually produce a component of u^1, which is amplified in further iterations until convergence.

Starting the power method for A with a linear combination of eigenvectors corresponding to λ_2 and λ_3, one obtains a reasonable approximation to an eigenvector corresponding to λ_1 after 40 iterations.


Observations ct.

If λ_1 is a multiple dominant eigenvalue of A,

λ_1 = λ_2 = ⋯ = λ_p,  |λ_1| > |λ_j| for j = p+1, …, n,

and A is diagonalizable, then all considerations above stay true.

For

|λ_1| = |λ_2| > |λ_j| for j = 3, …, n, and λ_1 ≠ λ_2,

one does not obtain convergence of the power method.

In steps 4 and 5 of the power method the normalization can be replaced by a scaling

k_{m+1} = ℓ^T y^{m+1},

where ℓ ∈ R^n is a vector which is not orthogonal to the eigenvector u^1 corresponding to the dominant eigenvalue.


Inverse iteration

Applying the power method to the inverse matrix A^{-1}, one can determine the smallest eigenvalue in modulus.

Inverse iteration
Given initial vector x^0
for m = 0, 1, 2, … until convergence do
  Solve A y^{m+1} = x^m for y^{m+1}
  k_{m+1} = ‖y^{m+1}‖
  x^{m+1} = y^{m+1} / k_{m+1}
end for

Applying inverse iteration to the matrix B, one gets fast convergence to an eigenvector corresponding to the smallest eigenvalue λ_3 = −0.2111. For A the convergence is very slow. What is the difference?
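A NumPy/SciPy sketch of inverse iteration (assuming A is nonsingular; the LU factorization is computed once and reused for all solves):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, x0, steps=100):
    # Power method applied to A^{-1}: solve A y^{m+1} = x^m in every step.
    lu = lu_factor(A)
    x = x0 / np.linalg.norm(x0)
    for m in range(steps):
        y = lu_solve(lu, x)                # y^{m+1} = A^{-1} x^m
        k = np.linalg.norm(y)              # k_{m+1} = ||y^{m+1}||
        x = y / k                          # x^{m+1}
    return 1.0 / k, x                      # 1/k approximates the smallest |lambda|

B = np.array([[0.2, 0.3, 0.4],
              [0.6, -0.1, 0.5],
              [0.2, 0.5, 0.1]])
lam, x = inverse_iteration(B, np.ones(3))
print(lam)                                 # ~ 0.2111 = |lambda_3|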


Inverse iteration ct.

The shifted matrix A − λI has eigenvalues λ_j − λ, if λ_j are the eigenvalues of A.

If λ is not an eigenvalue of A, then (A − λI)^{-1} has eigenvalues 1/(λ_j − λ).

If |λ_p − λ| < |λ_j − λ| for j = 1, …, n, j ≠ p, then

Inverse iteration with fixed shift
Given initial vector x^0
for m = 0, 1, 2, … until convergence do
  Solve (A − λI) y^{m+1} = x^m for y^{m+1}
  k_{m+1} = ℓ^T y^{m+1}
  x^{m+1} = y^{m+1} / k_{m+1}
end for

converges to an eigenvector corresponding to λ_p. The rate of convergence is

q = max_{j≠p} |λ_p − λ| / |λ_j − λ|.


Inverse iteration with variable shifts

For large m it holds that x^m is an approximate eigenvector corresponding to λ_p and ℓ^T x^m = 1. Hence,

k_{m+1} = ℓ^T y^{m+1} = ℓ^T ((A − λI)^{-1} x^m) ≈ (1/(λ_p − λ)) ℓ^T x^m = 1/(λ_p − λ).

This observation suggests iterating the shift as well:

k_{m+1} ≈ 1/(λ_p − λ_m)  ⟹  λ_{m+1} := λ_m + 1/k_{m+1}

Inverse iteration with variable shifts
Given initial vector x^0 and initial approximation λ_0
for m = 0, 1, 2, … until convergence do
  Solve (A − λ_m I) y^{m+1} = x^m for y^{m+1}
  k_{m+1} = ℓ^T y^{m+1}
  x^{m+1} = y^{m+1} / k_{m+1}
  λ_{m+1} = λ_m + 1/k_{m+1}
end for
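A minimal NumPy sketch of this pseudocode (the scaling vector ℓ and the stopping test are my own choices):

import numpy as np

def inverse_iteration_variable_shifts(A, x0, lam0, maxit=50, tol=1e-12):
    n = A.shape[0]
    ell = np.ones(n)                                 # any l with l^T u != 0
    x, lam = x0, lam0
    for m in range(maxit):
        y = np.linalg.solve(A - lam * np.eye(n), x)  # (A - lam_m I) y^{m+1} = x^m
        k = ell @ y                                  # k_{m+1} = l^T y^{m+1}
        x = y / k                                    # x^{m+1}, so l^T x^{m+1} = 1
        lam_new = lam + 1.0 / k                      # lam_{m+1} = lam_m + 1/k_{m+1}
        if abs(lam_new - lam) < tol:
            return lam_new, x
        lam = lam_new
    return lam, x

B = np.array([[0.2, 0.3, 0.4],
              [0.6, -0.1, 0.5],
              [0.2, 0.5, 0.1]])
lam, x = inverse_iteration_variable_shifts(B, np.ones(3), -0.3)
print(lam, np.linalg.norm(B @ x - lam * x) / np.linalg.norm(x))  # eigenvalue near the start shift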

Quadratic convergence

Let λ be an algebraically simple eigenvalue of A (i.e. λ is a simple root of det(A − λI) = 0), and let u be a corresponding eigenvector such that ℓ^T u = 1.

Then inverse iteration with variable shifts converges locally and quadratically to (λ, u): there exists some constant C > 0 such that, if λ_0 is sufficiently close to λ and x^0 is sufficiently close to u, then it holds

|λ − λ_{m+1}| ≤ C |λ − λ_m|²  and  ‖u − x^{m+1}‖ ≤ C ‖u − x^m‖².


Deflation

Assume that we have already obtained the largest (smallest, closest to a given shift) eigenvalue λ and a corresponding eigenvector u. How can we compute further eigenpairs by the power method?

Let y be a left eigenvector of A corresponding to some eigenvalue µ ≠ λ, i.e. y^T A = µ y^T.

Then it holds

µ y^T u = (y^T A) u = y^T (Au) = λ y^T u  ⟹  y^T u = 0.


Deflation ct.

Let B := A − u w^T, where w ∈ R^n satisfies w^T u ≠ 0. Then

Bu = Au − u w^T u = (λ − w^T u) u,

i.e. u is an eigenvector of B corresponding to the eigenvalue λ − w^T u.

For an eigenvalue µ ≠ λ of A and its corresponding left eigenvector y it holds (using y^T u = 0)

y^T B = y^T A − y^T u w^T = µ y^T.

Hence, all eigenvalues µ ≠ λ of A are kept (only the right eigenvectors can change), whereas the eigenvalue λ − w^T u can be moved anywhere by the choice of w (for instance to 0, to compute the second largest eigenvalue of A in modulus).
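A small NumPy sketch of this deflation for the stochastic matrix A from above, with the choice w = λu/(u^T u), so that w^T u = λ and the known eigenvalue λ = 1 is moved to 0 (any w with w^T u = λ would do):

import numpy as np

A = np.array([[0.2, 0.3, 0.4],
              [0.6, 0.2, 0.5],
              [0.2, 0.5, 0.1]])

lam_all, V = np.linalg.eig(A)
i = np.argmax(lam_all.real)
lam, u = lam_all[i].real, V[:, i].real     # known eigenpair: lambda = 1

w = lam * u / (u @ u)                      # w^T u = lambda
Bdef = A - np.outer(u, w)                  # B = A - u w^T

print(np.linalg.eigvals(A))                # contains 1
print(np.linalg.eigvals(Bdef))             # 1 replaced by 0, the others unchanged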


Symmetric matrices

Let A = A^T ∈ R^{n×n} be a symmetric matrix, λ an eigenvalue of A, and u a corresponding eigenvector such that ‖u‖ = 1. Let

B = A − λ u u^T.

If v ∈ R^n is an eigenvector of A (Av = µv) such that v^T u = 0, then

Bv = Av − λ u u^T v = Av = µv.

Hence, all eigenvalues of A which are different from λ are eigenvalues of B as well. 0 is an eigenvalue of B replacing λ. If λ is a multiple eigenvalue of A, then λ is still an eigenvalue of B, but its multiplicity is reduced by 1.
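The symmetric variant in a few NumPy lines (random symmetric test matrix, my own example):

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                     # symmetric test matrix

lam, V = np.linalg.eigh(A)
u = V[:, -1]                          # unit eigenvector of the largest eigenvalue

B = A - lam[-1] * np.outer(u, u)      # B = A - lambda u u^T

print(lam)                            # eigenvalues of A
print(np.linalg.eigvalsh(B))          # largest eigenvalue replaced by 0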


QR Algorithm

QR algorithm
A_0 := A
for m = 0, 1, 2, … until convergence do
  Factorize A_m = Q_m R_m
  A_{m+1} = R_m Q_m
end for

A_{m+1} = R_m Q_m = Q_m^T (Q_m R_m) Q_m = Q_m^T A_m Q_m

Hence, all A_m are (orthogonally) similar, and therefore they have the same eigenvalues.
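The basic QR algorithm in NumPy (a sketch with a fixed number of steps instead of a convergence test; the test matrix is the matrix A from the Examples slide below):

import numpy as np

def qr_algorithm(A, steps=100):
    Am = np.array(A, dtype=float)
    for m in range(steps):
        Q, R = np.linalg.qr(Am)       # factorize A_m = Q_m R_m
        Am = R @ Q                    # A_{m+1} = R_m Q_m is similar to A_m
    return Am

A = np.array([[1.0, -1.0, -1.0],
              [4.0, 6.0, 3.0],
              [-4.0, -4.0, -1.0]])
Am = qr_algorithm(A)
print(np.diag(Am))                                 # approximates the eigenvalues of A
print(np.sort(np.linalg.eigvals(A).real)[::-1])    # reference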


QR algorithm ct.

If the eigenvalues of A are pairwise distinct in modulus,

|λ_1| > |λ_2| > ⋯ > |λ_n|,

and if a further technical condition is satisfied, then the QR algorithm converges in the following sense:

If (A_m)_{jk} = a_{jk}^{(m)}, then

lim_{m→∞} a_{jk}^{(m)} = 0 for j > k,
lim_{m→∞} a_{jj}^{(m)} = λ_j for j = 1, …, n.


QR algorithm and power method

With

U_m = Q_1 Q_2 ⋯ Q_m,  S_m = R_m R_{m−1} ⋯ R_1

it holds

A^m = U_m S_m.  (∗)

For m = 1 the statement is trivial: A = Q_1 R_1 = U_1 S_1.

A_{m+1} = R_m Q_m = Q_m^T A_m Q_m yields by induction A_{m+1} = U_m^T A U_m.


QR algorithm and power method ct.

If (∗) is valid for some m − 1, then it follows from the definition of A_{m+1}

R_m = A_{m+1} Q_m^T = U_m^T A U_m Q_m^T = U_m^T A U_{m−1}.

Multiplying by S_{m−1} from the right and by U_m from the left, we obtain

U_m S_m = A U_{m−1} S_{m−1} = A^m,

which is the proposition for m.

From (∗) we obtain for the first unit vector e^1 and ρ := (S_m)_{11}

A^m e^1 = U_m S_m e^1 = ρ U_m e^1.

Hence, the first column of U_m has the same direction as the m-th iterate of the power method with initial vector e^1, and it is not surprising that r_{11} converges to the largest eigenvalue of A in modulus and the first column converges to a corresponding eigenvector.


Examples

For

A = (  1  −1  −1 )
    (  4   6   3 )
    ( −4  −4  −1 )

the upper triangular form appears after approximately 10 steps, and the diagonal elements are in the right order.

For

B = (  1   0   1 )
    (  2   3  −1 )
    ( −2  −2   2 )

the upper triangular form is reached after approximately 20 steps, but the diagonal elements are not ordered by magnitude (so the technical condition of the last theorem is not satisfied). After a further 50 steps the diagonal elements are ordered by magnitude.


QR algorithm with shifts

QR algorithm with shifts
A_0 := A
for m = 0, 1, 2, … until convergence do
  Choose a suitable shift κ_m
  Factorize A_m − κ_m I = Q_m R_m
  A_{m+1} = R_m Q_m + κ_m I
end for

Again all matrices A_m are similar:

A_{m+1} = R_m Q_m + κ_m I = Q_m^T (Q_m R_m) Q_m + κ_m I = Q_m^T (A_m − κ_m I) Q_m + κ_m I = Q_m^T A_m Q_m,

and therefore all eigenvalues of the matrices A_m coincide.
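In NumPy the shifted iteration is a two-line change from the unshifted sketch above; here with the shift κ_m = a_{n,n}^{(m)} that is motivated on the following slides (a sketch without the deflation a practical code would add):

import numpy as np

def qr_algorithm_shifted(A, steps=50):
    Am = np.array(A, dtype=float)
    n = Am.shape[0]
    I = np.eye(n)
    for m in range(steps):
        kappa = Am[-1, -1]                    # shift kappa_m = a_{n,n}^{(m)}
        Q, R = np.linalg.qr(Am - kappa * I)   # factorize A_m - kappa_m I = Q_m R_m
        Am = R @ Q + kappa * I                # A_{m+1} = R_m Q_m + kappa_m I
    return Am

B = np.array([[1.0, 0.0, 1.0],
              [2.0, 3.0, -1.0],
              [-2.0, -2.0, 2.0]])
print(np.diag(qr_algorithm_shifted(B)))       # approximates the eigenvalues of B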


Choice of shifts

Let Q_j and R_j be the orthogonal and upper triangular matrices obtained in the QR algorithm with shifts κ_j, and let

U_m = Q_1 Q_2 ⋯ Q_m,  S_m = R_m R_{m−1} ⋯ R_1.

Then

U_m S_m = (A − κ_m I)(A − κ_{m−1} I) ⋯ (A − κ_1 I).  (+)

From A_{m+1} = Q_m^T A_m Q_m it follows immediately by induction that A_{m+1} = U_m^T A U_m.

For m = 1 equation (+) reads

U_1 S_1 = Q_1 R_1 = A − κ_1 I,

which is the decomposition in the first step of the QR algorithm with shifts.


Choice of shifts ct.

Assume that (+) holds for some m − 1. From the definition of A_{m+1} it follows

R_m = (A_{m+1} − κ_m I) Q_m^T = U_m^T (A − κ_m I) U_m Q_m^T = U_m^T (A − κ_m I) U_{m−1}.

Multiplying with S_{m−1} from the right and U_m from the left, one obtains

U_m S_m = (A − κ_m I) U_{m−1} S_{m−1} = (A − κ_m I)(A − κ_{m−1} I) ⋯ (A − κ_1 I).


Choice of shifts ct.

From (+) one gets for the last unit vector e^n

(A^T − κ_m I)^{-1} ⋯ (A^T − κ_1 I)^{-1} e^n = U_m (S_m^T)^{-1} e^n.

Since S_m^T and (S_m^T)^{-1} are lower triangular matrices, it holds that U_m (S_m^T)^{-1} e^n = σ U_m e^n for some σ.

Hence

(A^T − κ_m I)^{-1} ⋯ (A^T − κ_1 I)^{-1} e^n = σ U_m e^n,

and the last column of U_m can be interpreted as the result of m steps of inverse iteration with shifts κ_1, …, κ_m and initial vector e^n.

This suggests choosing κ_m = a_{n,n}^{(m)}, which is expected to converge to λ_n.


Reducing the cost

The most expensive part of the QR algorithm (shifted or not) is the computation of the QR factorization in every step.

This cost can be reduced considerably if the matrix is transformed to upper Hessenberg form first:

A = ( a_11  a_12  a_13  …  a_{1,n−1}  a_1n )
    ( a_21  a_22  a_23  …  a_{2,n−1}  a_2n )
    ( 0     a_32  a_33  …  a_{3,n−1}  a_3n )
    ( ⋮           ⋱     ⋱             ⋮    )
    ( 0     0     0     …  a_{n,n−1}  a_nn )

A has upper Hessenberg form if a_{jk} = 0 for j > k + 1.


Reducing the cost ct.

Assume that A_m has upper Hessenberg form. Then a QR decomposition can be obtained in the following way:

Multiply A_m from the left by a rotation in the plane spanned by the first two unit vectors e^1 and e^2, i.e. by a matrix

U_12 = ( cos θ   sin θ  0  0  …  0 )
       ( −sin θ  cos θ  0  0  …  0 )
       ( 0       0      1  0  …  0 )
       ( 0       0      0  1  …  0 )
       ( ⋮       ⋮      ⋮  ⋮  ⋱  ⋮ )
       ( 0       0      0  0  …  1 )

Then U_12 A_m contains in its first two rows linear combinations of the first two rows of A_m, and rows 3, …, n are the same as in A_m. The rotation angle can be chosen such that the element in position (2, 1) is annihilated.


Reducing the cost ct.

Multiplying U_12 A_m from the left by a rotation matrix U_23 corresponding to rows 2 and 3, we annihilate the element in position (3, 2), which does not change the 0 in position (2, 1).

Continuing in this way, we annihilate the elements in positions (i + 1, i) by a rotation U_{i,i+1} in the plane spanned by e^i and e^{i+1}.

We finally arrive at

U_{n−1,n} ⋯ U_23 U_12 A_m = R,  i.e.  A_m = QR,  Q = U_12^T ⋯ U_{n−1,n}^T.
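One complete QR step on an upper Hessenberg matrix, as just described, might be sketched in NumPy as follows (each Givens rotation touches only the two affected rows; the same rotations are then applied from the right, which is the A_{m+1} = RQ update of the next slide):

import numpy as np

def qr_step_hessenberg(H):
    n = H.shape[0]
    R = np.array(H, dtype=float)
    rot = []
    for i in range(n - 1):                    # annihilate (i+1, i) by U_{i,i+1}
        a, b = R[i, i], R[i + 1, i]
        r = np.hypot(a, b)
        c, s = (1.0, 0.0) if r == 0 else (a / r, b / r)
        G = np.array([[c, s], [-s, c]])       # rotation acting on rows i, i+1
        R[i:i + 2, i:] = G @ R[i:i + 2, i:]
        rot.append((i, c, s))
    A_next = R                                # now form A_{m+1} = R Q
    for i, c, s in rot:                       # multiply by U_{i,i+1}^T from the right
        G = np.array([[c, -s], [s, c]])
        A_next[:, i:i + 2] = A_next[:, i:i + 2] @ G
    return A_next                             # upper Hessenberg again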


Reducing the cost ct.

A_{m+1} = R Q = R U_12^T ⋯ U_{n−1,n}^T

Multiplying R by U_12^T combines the first two columns of R and leaves the other columns unchanged. Multiplying by U_23^T combines columns 2 and 3 and leaves the others unchanged, etc.

Obviously

A_{m+1} = R U_12^T ⋯ U_{n−1,n}^T

becomes an upper Hessenberg matrix again.


Reduction to Hessenberg form

A given matrix can be transformed to upper Hessenberg form using Householder matrices.

For

A = ( a_11  c^T )
    ( b     B   ),  B ∈ R^{(n−1)×(n−1)},  b, c ∈ R^{n−1},

let w ∈ R^{n−1}, ‖w‖ = 1, be such that the Householder matrix Q_1 = I − 2ww^T maps b to a multiple of the first unit vector in R^{n−1}.

Then with

P_1 = ( 1  0   )
      ( 0  Q_1 )

we get

A_1 := P_1 A P_1 = ( a_11   c^T Q_1   )
                   ( k e^1  Q_1 B Q_1 ),

where Q_1 b = k e^1, and the first column has already obtained the desired form. The following columns can be transformed in a similar way.
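A NumPy sketch of this Householder reduction (looping over all columns; the sign choice in w avoids cancellation):

import numpy as np

def hessenberg(A):
    H = np.array(A, dtype=float)
    n = H.shape[0]
    for j in range(n - 2):
        b = H[j + 1:, j]                      # part of column j below the diagonal
        sigma = np.linalg.norm(b)
        if sigma == 0.0:
            continue                          # column already has the desired form
        v = b.copy()
        v[0] += np.copysign(sigma, b[0])      # reflect b onto a multiple of e^1
        w = v / np.linalg.norm(v)
        # Similarity transform with P = diag(I, Q), Q = I - 2 w w^T:
        H[j + 1:, :] -= 2.0 * np.outer(w, w @ H[j + 1:, :])
        H[:, j + 1:] -= 2.0 * np.outer(H[:, j + 1:] @ w, w)
    return H

A = np.random.default_rng(2).standard_normal((5, 5))
H = hessenberg(A)
print(np.allclose(np.tril(H, -2), 0))         # True: upper Hessenberg form
print(np.sort_complex(np.linalg.eigvals(A)))
print(np.sort_complex(np.linalg.eigvals(H)))  # same spectrum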


