  • CHAPTER 3: JACOBI–DAVIDSON METHOD

    Heinrich Voss
    [email protected]

    Hamburg University of Technology


    Davidson Method

    The Davidson method is a popular technique to compute a few of the smallest (or largest) eigenvalues of a large sparse real symmetric matrix.

    It is effective when the matrix is nearly diagonal, i.e. if the matrix of eigenvectors is close to the identity matrix. It is mainly used for problems of theoretical chemistry (ab initio calculations in quantum chemistry), where the matrices are strongly diagonally dominant.

    Similar to the Lanczos method, Davidson’s method is an iterative projection method which, however, does not take advantage of Krylov subspaces but uses the Rayleigh–Ritz procedure with non-Krylov spaces and expands the search spaces in a different way.


    Davidson method

    Let U_k := [u_1, . . . , u_k] ∈ R^(n,k) be a matrix with orthonormal columns, and let (θ, s) be an eigenpair of the projected problem

        U_k^H A U_k s = θ s,

    and y = U_k s.

    Davidson (1975) suggested to expand the ansatz space span U_k by the direction

        t := (D_A − θI)^{−1} r,

    where r := Ay − θy is the residual of the Ritz pair (θ, y) and D_A denotes the diagonal of A. u_{k+1} is obtained by orthogonalizing t against U_k.

    A little irritating is the fact that the method fails for diagonal matrices: if A is diagonal and (θ, y) is a Ritz pair, then

        t = (D_A − θI)^{−1} r = y ∈ span U_j,

    and the space span U_j will not be expanded.

    Davidson Algorithm

    1: Choose initial vector u_1 with ‖u_1‖ = 1, U_1 = [u_1]
    2: for j = 1, 2, . . . do
    3:   w_j = A u_j
    4:   for k = 1, . . . , j − 1 do
    5:     b_kj = (u_k)^H w_j
    6:     b_jk = (u_j)^H w_k
    7:   end for
    8:   b_jj = (u_j)^H w_j
    9:   compute the largest eigenvalue θ of B and a corresponding eigenvector s with ‖s‖ = 1
    10:  y = U_j s
    11:  r = A y − θ y
    12:  t = (D_A − θI)^{−1} r
    13:  t = t − U_j U_j^H t
    14:  u_{j+1} = t/‖t‖
    15:  U_{j+1} = [U_j, u_{j+1}]
    16: end for

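    To make this concrete, here is a minimal NumPy sketch of the algorithm above. The function name davidson, the iteration cap and the stopping tolerance are illustrative additions, not part of the lecture, and for brevity the projected matrix B is recomputed in every sweep instead of being updated entry by entry as in steps 4-8.

        import numpy as np

        def davidson(A, u1, maxit=50, tol=1e-10):
            """Davidson sketch for the largest eigenvalue of a symmetric matrix A."""
            D = np.diag(A)                          # diagonal D_A used in the correction
            U = (u1 / np.linalg.norm(u1))[:, None]  # orthonormal basis of the search space
            for _ in range(maxit):
                B = U.T @ (A @ U)                   # projected matrix B = U^H A U
                w, S = np.linalg.eigh(B)
                theta, s = w[-1], S[:, -1]          # largest Ritz value, primitive Ritz vector
                y = U @ s                           # Ritz vector
                r = A @ y - theta * y               # residual
                if np.linalg.norm(r) < tol:
                    break
                t = r / (D - theta)                 # expansion t = (D_A - theta I)^{-1} r
                t -= U @ (U.T @ t)                  # orthogonalize t against U
                U = np.hstack([U, (t / np.linalg.norm(t))[:, None]])
            return theta, y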

  • Jacobi’s method

    In addition to the well-known method for determining all eigenvalues (and eigenvectors) of a symmetric matrix, Jacobi suggested the following method for improving known eigenvalue–eigenvector approximations.

    Assume that A is diagonally dominant, and let α := a_11 be the maximum diagonal element. Then α is an approximation to the maximum eigenvalue and e_1 an approximation to the corresponding eigenvector.

    To improve these approximations we have to solve the problem

        A [ 1 ]   [ α  c^T ] [ 1 ]     [ 1 ]
          [ z ] = [ b   F  ] [ z ] = λ [ z ].

    This eigenproblem is equivalent to

        λ = α + c^T z,
        (F − λI) z = −b.


    Jacobi’s method ct.

    Jacobi suggested to solve this problem iteratively:

        θ_k = α + c^T z_k,
        (D − θ_k I) z_{k+1} = (D − F) z_k − b,

    where D denotes the diagonal of F.
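    For illustration, a minimal NumPy sketch of this fixed-point iteration, assuming the block partition of A with α = a_11 as above (function name and iteration count are illustrative, and no convergence check is made):

        import numpy as np

        def jacobi_improve(A, iters=30):
            """Jacobi's improvement of the eigenpair approximation (a_11, e_1)."""
            alpha, c = A[0, 0], A[0, 1:]            # A = [[alpha, c^T], [b, F]]
            b, F = A[1:, 0], A[1:, 1:]
            D = np.diag(F)                          # diagonal of F, as a vector
            z = np.zeros(A.shape[0] - 1)
            for _ in range(iters):
                theta = alpha + c @ z               # theta_k = alpha + c^T z_k
                z = ((np.diag(D) - F) @ z - b) / (D - theta)  # diagonal solve
            x = np.concatenate(([1.0], z))          # eigenvector approximation (1; z)
            return theta, x / np.linalg.norm(x)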

    Sleijpen & van der Vorst (1996) suggested to combine the approach of Davidson and the improvement of Jacobi in an iterative projection method, i.e. given a Ritz pair (θ, u) corresponding to some subspace U, to expand U by a direction which is orthogonal to u and satisfies (approximately) a correction equation.


    Jacobi–Davidson method

    Let u be an approximation to an eigenvector of A, and let θ be the Ritz value corresponding to u. Similarly to Jacobi’s approach we determine an improvement of u which is orthogonal to u.

    If ‖u‖ = 1, then the orthogonal projection of the matrix A to u^⊥ reads

        B = (I − u u^H) A (I − u u^H).

    From θ = u^H A u one gets

        A = B + A u u^H + u u^H A − θ u u^H.

    An eigenvalue–eigenvector pair (λ, x), x = u + v, v ⊥ u, satisfies

        A(u + v) = λ(u + v),

    and from Bu = 0 it follows that

        (B − λI) v = −r + (λ − θ − u^H A v) u,

    where r = Au − θu.


    Jacobi–Davidson method ct.

    By the orthogonality of (B − λI)v and u and of r and u it follows that λ − θ − u^H A v = 0, and therefore the improvement v satisfies

        (B − λI) v = −r = −(A − θI) u.

    Since λ is not known, we replace it by the Ritz value θ (or, if θ is not yet close to a wanted eigenvalue, by a target value in the vicinity of which we are looking for eigenvalues) to obtain the correction equation

        (I − u u^H)(A − θI)(I − u u^H) v = −r.

    We expand the search space which yielded the Ritz pair (θ, u) by the solution of the correction equation, and determine with this expanded search space the next Ritz pair.

    This is the basis of the Jacobi-Davidson method introduced by Sleijpen and van der Vorst (1996).


  • Jacobi–Davidson method ct.

    v ∈ u^⊥ yields (I − u u^H) v = v. Hence, the correction equation can be rewritten as

        (A − θI) v = −r + α u,

    where α ∈ C has to be determined such that v ⊥ u. This implies

        v = −(A − θI)^{−1} r + α (A − θI)^{−1} u = −u + α (A − θI)^{−1} u.

    Since the search space is expanded by v and since u is already contained in the current search space, the new search space will in particular contain the vector t := (A − θI)^{−1} u.

    t is the improvement of the Ritz pair (θ, u) by one step of inverse iteration with shift θ and initial vector u. Hence, the Jacobi-Davidson method can be considered as an acceleration of inverse iteration, and can be expected to converge at least as fast as inverse iteration (i.e. quadratically or even cubically).


    Jacobi–Davidson method ct.

    It is a disadvantage of the Jacobi–Davidson method that in every step one has to solve a linear system with a varying coefficient matrix A − θI.

    It was observed that fast convergence was maintained if the correction equation was solved only approximately.

    Sleijpen and van der Vorst suggest to use MINRES if A is symmetric and GMRES or BiCGStab otherwise, and to use a suitable preconditioner K for A − θI in any case.

    Any other approximate solution method is fine as well, as long as the projector I − u u^H is taken into account.
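    As an illustration, a sketch of an approximate solve of the correction equation by GMRES applied to the projected operator. SciPy's LinearOperator and gmres are used; the loose tolerance is an arbitrary choice, and a recent SciPy (keyword rtol) is assumed.

        import numpy as np
        from scipy.sparse.linalg import LinearOperator, gmres

        def solve_correction(A, u, theta, r, tol=1e-2):
            """Approximately solve (I-uu^H)(A-theta I)(I-uu^H) t = -r, t ⊥ u."""
            n = A.shape[0]
            def proj(x):                            # orthogonal projector I - u u^H
                return x - u * (u.conj() @ x)
            def matvec(x):
                px = proj(x)
                return proj(A @ px - theta * px)    # projected, shifted operator
            op = LinearOperator((n, n), matvec=matvec, dtype=A.dtype)
            t, _ = gmres(op, -r, rtol=tol)          # an approximate solve suffices
            return proj(t)                          # enforce t ⊥ u against roundoff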


    A different motivation, V. 2006

    Given a search space V ⊂ C^n, expand V by a direction such that the expanded space has a high approximation potential for the next wanted eigenvector.

    Let the columns of V form an orthonormal basis of V, and let θ be an eigenvalue of the projected problem

        V^H A V y = θ y

    and x = Vy, ‖x‖ = 1, a corresponding Ritz vector.

    Because of its good approximation property we want to expand the search space by the direction of Rayleigh quotient iteration,

        v = (A − θI)^{−1} x / ‖(A − θI)^{−1} x‖.

    However, in a truly large problem the vector v will not be accessible, but only an inexact solution ṽ := v + e of (A − θI) v = x, and the next iterate will be a solution of the projection of Ax = λx onto the space Ṽ := span{V, ṽ}.


    Expansion of search space

    We assume that x is already a good approximation to an eigenvector of A. Then v will be an even better approximation, and therefore the eigenvector we are looking for will be very close to the plane E := span{x, v}.

    We therefore neglect the influence of the orthogonal complement of x in V on the next iterate and discuss the nearness of the planes E and Ẽ := span{x, ṽ}.

    If the angle between these two planes is small, then the projection of Ax = λx onto Ṽ should be similar to the one onto span{V, v}, and the approximation properties of inverse iteration should be maintained.

    If this angle can become large, then it is not surprising that the convergence properties of inverse iteration are not reflected by the projection method.


  • Theorem

    Let φ_0 = arccos(x^T v) denote the angle between x and v, and denote the relative error of ṽ by ε := ‖e‖.

    Then the maximal possible acute angle between the planes E and Ẽ is

        β(ε) = arccos √(1 − ε²/sin²φ_0)   if ε ≤ |sin φ_0|,
        β(ε) = π/2                        if ε ≥ |sin φ_0|.


    Proof

    For ε > |sin φ_0| the ball with center v and radius ε contains vectors of span{x} (the distance from v to span{x} is |sin φ_0|), and therefore the maximum acute angle between E and Ẽ is π/2.

    For ε ≤ |sin φ_0| we assume without loss of generality that v = (1, 0, 0)^T, ṽ = (1 + e_1, e_2, e_3)^T, and x = (cos φ_0, sin φ_0, 0)^T.

    Obviously the angle between E and Ẽ is maximal if the plane Ẽ is tangential to the ball B with center v and radius ε.

    Then ṽ is the common point of ∂B and the plane Ẽ, i.e. the normal vector ñ of Ẽ has the same direction as the perturbation vector e.

    An easy calculation now yields the stated formula:


    Proof ct.

        e = (e_1, e_2, e_3)^T = γ ñ = γ (cos φ_0, sin φ_0, 0)^T × (1 + e_1, e_2, e_3)^T
          = γ (e_3 sin φ_0, −e_3 cos φ_0, e_2 cos φ_0 − (1 + e_1) sin φ_0)^T.

    Hence, we have

        e_1 = γ sin φ_0 e_3,   e_2 = −γ cos φ_0 e_3,

    and the third component yields

        e_3 = γ(−γ cos²φ_0 e_3 − (1 + γ sin φ_0 e_3) sin φ_0) = −γ² e_3 − γ sin φ_0,

    i.e.

        e_3 = − (γ/(1 + γ²)) sin φ_0.   (1)


    Proof ct.

    Moreover, from

        ε² = e_1² + e_2² + e_3² = γ² sin²φ_0 e_3² + γ² cos²φ_0 e_3² + e_3² = (1 + γ²) e_3²,

    we obtain

        ε² = (γ²/(1 + γ²)) sin²φ_0,   i.e.   γ² = ε²/(sin²φ_0 − ε²).

    Inserting into equation (1) yields

        e_3² = (1/(1 + γ²)) ε² = (1 − ε²/sin²φ_0) ε²,

    and since the normal vector of E is n = (0, 0, 1)^T, we finally get

        cos β(ε) = e^T n/(‖n‖ · ‖e‖) = e_3/ε = √(1 − ε²/sin²φ_0). □


  • Expansion by inexact inverse iteration

    Obviously, for every α ∈ R, α ≠ 0, the plane E is also spanned by x and x + αv.

    If Ẽ(α) is the plane which is spanned by x and a perturbed realization x + αv + e of x + αv, then by the same arguments as in the proof of the Theorem the maximum angle between E and Ẽ(α) is

        γ(α, ε) = arccos √(1 − ε²/sin²φ(α))   if ε ≤ |sin φ(α)|,
        γ(α, ε) = π/2                         if ε ≥ |sin φ(α)|,

    where φ(α) denotes the angle between x and x + αv.

    Since the mapping

        φ ↦ arccos √(1 − ε²/sin²φ)

    decreases monotonically, the expansion of the search space by an inexact realization of t := x + αv is most robust with respect to small perturbations if α is chosen such that x and x + αv are orthogonal.


    Expansion by inexact inverse iteration

    x^T (x + αv) = 0 if and only if

        t = x − (x^H x / (x^H (A − θI)^{−1} x)) (A − θI)^{−1} x,   (∗)

    which is the solution of the correction equation

        (I − x x^H)(A − θI)(I − x x^H) t = (A − θI) x,   t ⊥ x,

    of the Jacobi–Davidson method.

    Hence, the Jacobi–Davidson method is the most robust realization of an expansion of a search space such that the direction of inverse iteration is contained in the expanded space, in the sense that it is least sensitive to inexact solves of linear systems (A − θI) v = x.

    For this orthogonal choice the maximum acute angle between E and Ẽ(α) satisfies

        γ(α, ε) = arccos √(1 − ε²)   if ε ≤ 1,
        γ(α, ε) = π/2                if ε ≥ 1.


    Inexact inverse iteration

    For truly large problems v is not available, but has to be replaced by an inexact solution v + e of (A − θI) v = x.


    Orthogonal expansion

    The expansion span{x, x + αv}, with α such that x^T (x + αv) = 0, is more robust with respect to the perturbation e (i.e. to replacing x + αv by x + αv + e) than the expansion span{x, v}.



    Subspace extraction

    Projection methods usually yield good approximations to extreme eigenvalues, but often they do not work well if one is interested in interior eigenvalues.

    The reason is that Ritz values converge ’monotonically’ to exterior eigenvalues. A Ritz value that is close to a target value in the interior of the spectrum may be on its way to some exterior eigenvalue, and the corresponding Ritz vector may have small components in the directions of eigenvectors close to the target value. Clearly, such a Ritz vector is a poor candidate for search space expansion or restart.

    Example

        A = diag{1, 2, 3},   x = (1, 0, 1)^T   ⇒   R_A(x) = x^T A x / x^T x = 2 = λ_2.

    R_A(x) is an eigenvalue of A, but x is far away from being a corresponding eigenvector.

    For interior eigenvalues Rayleigh–Ritz generally gives poor approximate eigenvectors.


  • Harmonic Ritz values/vectors (Morgan 1991)

    If eigenvalues close to some τ are desired, consider Ritz values of (A − τI)^{−1} with respect to some subspace W = span(W):

        W^H (A − τI)^{−1} W z = (1/(θ − τ)) W^H W z.

    To avoid the inverse matrix, choose W := (A − τI) V for some V ∈ R^{n×k} (obtained in the Jacobi–Davidson method, e.g.):

        V^H (A − τI)^H V z = (1/(θ − τ)) V^H (A − τI)^H (A − τI) V z.   (∗)

    If 1/(θ − τ) is an extreme eigenvalue and z a corresponding eigenvector, then (1/(θ − τ), Wz) is an approximate eigenpair of (A − τI)^{−1}, and (θ, Wz) is an approximate eigenpair of A.

    Vz = (A − τI)^{−1} Wz generally is a better eigenvector approximation than Wz, because it is the result of applying one step of shifted inverse iteration to Wz with shift τ. Vz is called a harmonic Ritz vector, and θ a harmonic Ritz value.

    The Rayleigh quotient ρ of Vz with respect to A often is a better approximation to an eigenvalue close to τ than θ.
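    A sketch of this extraction for a given orthonormal basis V (dense linear algebra for clarity; the helper name is mine, and the final Rayleigh quotient is taken real, which presumes a Hermitean A):

        import numpy as np
        from scipy.linalg import eig

        def harmonic_ritz(A, V, tau):
            """Harmonic Ritz pair of A nearest the target tau, from span(V)."""
            W = A @ V - tau * V                     # W = (A - tau I) V
            # (*) as a generalized eigenproblem: (W^H V) z = mu (W^H W) z, mu = 1/(theta - tau)
            mu, Z = eig(W.conj().T @ V, W.conj().T @ W)
            k = np.argmax(np.abs(mu))               # extreme eigenvalue mu
            z = Z[:, k]
            theta = tau + 1.0 / mu[k]               # harmonic Ritz value
            u = V @ z                               # harmonic Ritz vector Vz
            u /= np.linalg.norm(u)
            rho = (u.conj() @ (A @ u)).real         # Rayleigh quotient, often better than theta
            return theta, u, rho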


    Refined Ritz vectors

    Given an approximate eigenvalue λ̃ (for instance a Ritz value θ or a target value τ) and a basis V of a search space V, the refined Ritz vector is defined as

        ũ = Vz   where   z = argmin_{‖y‖=1} ‖(A − λ̃I) V y‖.

    Often the refined Ritz vector ũ is a much better approximation to an eigenvector corresponding to an eigenvalue close to λ̃ than an ordinary Ritz vector.

    Again, the Rayleigh quotient of ũ often is a better approximate eigenvalue than λ̃.

    Computing the refined Ritz vector is expensive (if V is not a Krylov space), because it requires the singular value decomposition of an n × k matrix at an additional cost of O(nk²) flops at iteration k. Moreover, since θ changes from one iteration to the next, we cannot update the SVD from the previous iteration, but have to compute a new SVD in each iteration.
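    A short sketch of the refined extraction via the SVD (the helper name is mine):

        import numpy as np

        def refined_ritz(A, V, lam):
            """Refined Ritz vector: minimizes ||(A - lam I) V y|| over unit y."""
            M = A @ V - lam * V                     # the n-by-k matrix (A - lam I) V
            _, _, Vh = np.linalg.svd(M, full_matrices=False)
            z = Vh[-1].conj()                       # right singular vector of smallest sigma
            u = V @ z
            return u / np.linalg.norm(u)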


    Solving the correction equation

    The correction equation

        (I − u_k u_k^H)(A − θI)(I − u_k u_k^H) t = −(A − θI) u_k,   t ⊥ u_k,

    is solved iteratively by a preconditioned Krylov solver.

    Suppose that K is a preconditioner for A − θI, i.e. K^{−1}(A − θI) ≈ I.

    Since the eigenvalue approximation θ varies in the course of the algorithm, the preconditioner should be altered as well. However, frequently K can be kept fixed for several iterations (and sometimes even when computing several eigenvalues).

    When solving the correction equation one has to take into account the restriction to the orthogonal complement of the current approximation u_k. This suggests to consider, for fixed K with K^{−1}(A − θI) ≈ I, the preconditioner

        K̃ := (I − u_k u_k^H) K (I − u_k u_k^H).


    Solving the correction equation ct.

    Consider a Krylov subspace solver for the correction equation with initial vector t_0 = 0 and left preconditioning.

    Since the initial vector and u_k are orthogonal, the orthogonality is maintained for all iterates. Hence, in every step we have to determine a vector z := K̃^{−1} Ã w, where

        Ã := (I − u_k u_k^H)(A − θ_k I)(I − u_k u_k^H).

    Firstly, it follows from u_k^H w = 0 that

        Ã w = (I − u_k u_k^H)(A − θ_k I)(I − u_k u_k^H) w = (I − u_k u_k^H) y

    with y = (A − θ_k I) w.


  • Solving the correction equation ct.

    With this notation we have to solve

        K̃ z = (I − u_k u_k^H) y.

    From z ⊥ u_k we obtain that z has to satisfy

        K z = y − α u_k,   i.e.   z = K^{−1} y − α K^{−1} u_k,

    and the requirement z ⊥ u_k yields

        α = u_k^H K^{−1} y / (u_k^H K^{−1} u_k).

    Hence, in every step of the preconditioned Krylov method we have to solve the linear system K ỹ = y, and additionally, at the beginning, the system K ũ = u_k.


    Solving the correction equation ct.

    The approximate solution of the correction equation by a preconditioned Krylov solver has the following form:

    – Solve K ũ = u for ũ, and compute µ = u^H ũ.
    – Determine r̃ from
      (i) solve K r̂ = r for r̂,
      (ii) set r̃ = r̂ − ((u^H r̂)/µ) ũ.
    – Apply the Krylov solver with initial vector t_0 = 0, matrix K̃^{−1} Ã and right-hand side −r̃. In each step determine z = K̃^{−1} Ã w according to
      (a) y = (A − θI) w,
      (b) solve K ŷ = y for ŷ,
      (c) z = ŷ − ((u^H ŷ)/µ) ũ.

    A code sketch of steps (a)–(c) follows below.
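    A sketch of one application z = K̃^{−1} Ã w following steps (a)–(c); solve_K stands for any routine applying K^{−1} and is an assumption of this sketch:

        import numpy as np

        def make_projected_precond_apply(A, theta, u, solve_K):
            """Return a function applying w -> K~^{-1} A~ w for the correction equation."""
            u_t = solve_K(u)                 # u~ = K^{-1} u, computed once up front
            mu = u.conj() @ u_t              # mu = u^H u~
            def apply(w):                    # the Krylov solver keeps w ⊥ u
                y = A @ w - theta * w        # (a) y = (A - theta I) w
                y_hat = solve_K(y)           # (b) y^ = K^{-1} y
                return y_hat - ((u.conj() @ y_hat) / mu) * u_t   # (c) project along u~
            return apply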


    Jacobi-Davidson method; largest eigenvalue

    1: Choose initial vector t, and set U = [ ], V = [ ]
    2: for m = 1, 2, . . . do
    3:   t = t − U U^H t
    4:   u = t/‖t‖; U = [U, u]; v = A u; V = [V, v]
    5:   C(1:m−1, m) = U(:, 1:m−1)^H V(:, m)
    6:   C(m, 1:m−1) = U(:, m)^H V(:, 1:m−1)
    7:   C(m, m) = U(:, m)^H V(:, m)
    8:   compute the eigenvalue θ of C that is largest in modulus and a corresponding eigenvector s with ‖s‖ = 1
    9:   y = U s
    10:  r = A y − θ y
    11:  if ‖r‖ ≤ ε then
    12:    λ = θ, x = y, STOP
    13:  end if
    14:  solve approximately (I − y y^H)(A − θI)(I − y y^H) t = −r, t ⊥ y, for t
    15: end for

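    The loop above as a compact Python sketch; it reuses solve_correction from the sketch given earlier and recomputes the projected matrix C wholesale instead of updating single rows and columns:

        import numpy as np

        def jacobi_davidson(A, t, maxit=100, eps=1e-8):
            """Jacobi-Davidson sketch for the eigenvalue of A largest in modulus."""
            n = A.shape[0]
            U = np.zeros((n, 0), dtype=A.dtype)
            for _ in range(maxit):
                t = t - U @ (U.conj().T @ t)            # orthogonalize the expansion vector
                U = np.hstack([U, (t / np.linalg.norm(t))[:, None]])
                C = U.conj().T @ (A @ U)                # projected matrix C
                w, S = np.linalg.eig(C)
                k = np.argmax(np.abs(w))                # eigenvalue of C largest in modulus
                theta, s = w[k], S[:, k] / np.linalg.norm(S[:, k])
                y = U @ s                               # Ritz vector
                r = A @ y - theta * y                   # residual
                if np.linalg.norm(r) <= eps:
                    break
                t = solve_correction(A, y, theta, r)    # approximate correction solve
            return theta, y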

    Comments

    In step 3 the classical Gram–Schmidt method for orthogonalizing t against U can be replaced by the modified Gram–Schmidt method. Observe, however, that the classical method takes advantage of BLAS level-2 (matrix–vector) operations, whereas the modified method uses only level-1 operations.

    It is reasonable to repeat the orthogonalization if ‖t − U U^H t‖/‖t‖ is small, i.e. if the angle between t and span U is small. To this end we choose a κ which is not too small, for instance κ = 0.25, and replace step 3 by

    1: τ = ‖t‖
    2: t = t − U U^H t
    3: if ‖t‖/τ ≤ κ then
    4:   t = t − U U^H t
    5: end if

    Again (in both appearances) we can replace the classical by the modified Gram–Schmidt method.


  • Comments ct.

    If A is Hermitean, then step 6 can be replaced by

        C(m, 1:m−1) = C(1:m−1, m)^H,

    and the cost for updating the projection of A is approximately cut in half.

    Step 10 can be replaced by

        r = V s − θ y

    (since A y = A U s = V s), if m is smaller than the average number of nonzero entries per row of A.

    It is not necessary to solve the correction equation in step 14 very accurately. Sleijpen and van der Vorst suggest a relative accuracy of 2^{−m} in the m-th step, following the recommendation for the inexact Newton method. Notice that if the correction equation is solved exactly, the Jacobi–Davidson method is a generalization of inverse iteration, which can be interpreted as Newton’s method.


    Comments ct.

    As for the Arnoldi method, the increasing storage and the computational overhead for increasing dimension of the ansatz space may make it necessary to restart.

    An obvious way is to restart with the most recent approximation to an eigenvector as initial vector. However, this destroys valuable information contained in the discarded part of the ansatz space, and the speed of convergence will be slowed down.

    Therefore it is often a better strategy to restart with a subspace spanned by a small number of Ritz vectors or harmonic Ritz vectors corresponding to approximations close to the wanted eigenvalue (thick restart).

    Notice that iterative projection methods which do not use Krylov subspaces are more flexible. Especially at restart, they can retain any number of approximate eigenvectors without needing the implicit restarting framework.


    Several eigenpairs

    Quite often one is interested not only in one but in several eigenpairs.

    The Jacobi–Davidson method often converges very rapidly to an eigenvalue (here the largest one), and the convergence often is faster than for Arnoldi’s method. On the other hand, Arnoldi yields approximations to several eigenvectors simultaneously, whereas Jacobi-Davidson deliberately aims at one particular eigenvalue closest to some target.

    Way out: Render converged eigenvectors harmless by deflation


    Deflation

    After an eigenvector has converged we continue with subspaces spanned by the remaining eigenvectors.

    For Hermitean matrices this is no problem since the eigenvectors are orthogonal. In the correction equation one has to supplement the current Ritz vector by the already converged eigenvectors in the projector to the orthogonal complement.

    For non-Hermitean matrices one works with Schur vectors.


  • Deflation ct.

    If u_1, . . . , u_k denotes an orthonormal basis of the current search space and C = U_k^H A U_k, then we compute the Schur factorization C V = V S of C, where V is a unitary matrix and S is an upper triangular matrix (for instance by the QR algorithm for dense matrices).

    Next we reorder S such that |s_ii − τ| is nondecreasing for some target value τ. This reordering can be done while preserving the upper triangular structure of S.

    The first diagonal elements of S then represent the eigenvalue approximations closest to τ, and the corresponding columns of U_k V span an approximation to the invariant subspace of A corresponding to these eigenvalues.

    The decomposition A(U_k V) = (U_k V) S can be used in a restart if one discards the columns of S and U_k V corresponding to unwanted s_ii.


    Deflation ct.

    Assume that a partial Schur form A U_k = U_k R_k is known, which we want to complement by a new column u such that

        A [U_k, u] = [U_k, u] [ R_k  s ]
                              [ 0    λ ],   U_k^H u = 0,

    i.e.

        A U_k = U_k R_k,
        A u = U_k s + λ u,   U_k^H u = 0.

    Multiplying the second equation by U_k^H yields

        U_k^H A u = U_k^H U_k s + λ U_k^H u = s,


    Deflation ct.

    and substituting in the second equation one gets

        A u = U_k U_k^H A u + λ u,

    i.e.

        (A − λI) u = U_k U_k^H A u.

    From (A − λI) U_k U_k^H u = 0 it follows that

        (A − λI)(I − U_k U_k^H) u = U_k U_k^H A u,

    from which we obtain

        (I − U_k U_k^H)(A − λI)(I − U_k U_k^H) u = 0.


    Deflation ct.

    The last equation demonstrates that the new pair (λ, u) is an eigenpair of

        Ã := (I − U_k U_k^H) A (I − U_k U_k^H),

    which can be determined by the Jacobi–Davidson method.

    To determine the next Schur vector u_{k+1} by the Jacobi–Davidson method one has to solve the correction equation

        P_{k+1} (I − U_k U_k^H)(A − θI)(I − U_k U_k^H) P_{k+1} t = −r,

    where P_{k+1} = I − u u^H and (θ, u) is the current Ritz pair.

    Fokkema, Sleijpen & van der Vorst (1998) contains an example demonstrating that the explicit deflation has to be used in the correction equation, but that in the projection of A to the search space of the Jacobi–Davidson method the column vectors of U_k do not have to be taken into account.
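    Since u is orthogonal to the converged Schur vectors, the two projectors combine into I − Q Q^H with Q = [U_k, u]; a sketch of an approximate solve under the same SciPy assumptions as before:

        import numpy as np
        from scipy.sparse.linalg import LinearOperator, gmres

        def solve_deflated_correction(A, Uk, u, theta, r, tol=1e-2):
            """Approximate solve of the deflated correction equation, t ⊥ [Uk, u]."""
            Q = np.hstack([Uk, u[:, None]])       # converged Schur vectors plus Ritz vector
            def proj(x):                          # projector I - Q Q^H
                return x - Q @ (Q.conj().T @ x)
            def matvec(x):
                px = proj(x)
                return proj(A @ px - theta * px)
            n = A.shape[0]
            op = LinearOperator((n, n), matvec=matvec, dtype=A.dtype)
            t, _ = gmres(op, -r, rtol=tol)
            return proj(t)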


  • Preconditioning

    Preconditioning an iterative solver for the correction equation with explicit deflation is only insignificantly more expensive than preconditioning for the computation of the largest eigenvalue alone.

    Let K be a preconditioner of A − θI, let (θ, u) be the current Ritz pair, and Ũ = [U_k, u].

    The preconditioner K has to be restricted to the orthogonal complement of Ũ, i.e. actually one has to precondition by

        K̃ = (I − Ũ Ũ^H) K (I − Ũ Ũ^H).

    This can be done in a similar way as for Ũ = u.


    Preconditioning ct.

    As for the largest eigenvalue it holds: if we apply a Krylov solver with initial vector t_0 = 0 and left preconditioning, then all iterates are contained in the orthogonal complement of Ũ.

    For a given vector v we have to find z := K̃^{−1} Ã v in this space, where

        Ã := (I − Ũ Ũ^H)(A − θI)(I − Ũ Ũ^H).

    Again this is done in two steps. First determine

        Ã v = (I − Ũ Ũ^H)(A − θI)(I − Ũ Ũ^H) v = (I − Ũ Ũ^H) y

    with y := (A − θI) v (remember Ũ^H v = 0).


    Preconditioning ct.

    Next we have to determine z ⊥ Ũ such that

        K̃ z = (I − Ũ Ũ^H) y,

    i.e., on account of Ũ^H z = 0,

        (I − Ũ Ũ^H) K (I − Ũ Ũ^H) z = (I − Ũ Ũ^H) K z = (I − Ũ Ũ^H) y,

    and hence

        (I − Ũ Ũ^H)(K z − y) = 0.

    Therefore z satisfies

        K z = y − Ũ α,   i.e.   z = K^{−1} y − K^{−1} Ũ α,

    and the condition Ũ^H z = 0 yields

        α = (Ũ^H K^{−1} Ũ)^{−1} Ũ^H K^{−1} y.


    Preconditioning ct.

    In each step of the preconditioned Krylov solver one needs the vector ŷ = K^{−1} y and the matrix Û = K^{−1} Ũ.

    As before, Û is identical in all iteration steps. Therefore, to perform m iteration steps when solving the correction equation, one has to solve m + k + 1 linear systems with system matrix K.

    If the preconditioner is kept fixed while computing several eigenvalues, then Û can be reused for all of these eigenvalues.

    Implementations of this algorithm in Fortran 77 and in MATLAB can be downloaded from the homepage of Gerard Sleijpen.


  • Generalized Hermitean eigenproblems

    Consider the generalized eigenvalue problem

        A x = λ B x,   where A = A^H, B = B^H, B positive definite.

    Given an approximation (θ, x) to an eigenpair, inverse iteration is defined by v := (A − θB)^{−1} B x. We expand the current search space by t := x + α v, where α is chosen such that x and x + α v are orthogonal.

    Measuring angles with respect to the scalar product ⟨x, y⟩_B := x^H B y, the robustness requirement ⟨x, x + α v⟩_B = 0 yields the expansion

        t = x − (x^H B x / (x^H B (A − θB)^{−1} B x)) (A − θB)^{−1} B x,

    which is the solution of the symmetric correction equation

        (I − B x x^H/(x^H B x)) (A − θB) (I − x x^H B/(x^H B x)) t = (A − θB) x,   t ⊥_B x.

    This is the correction equation introduced by Sleijpen, Booten, Fokkema & van der Vorst (1996) in a different way.
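    For illustration, a dense-solve sketch of this B-orthogonal expansion (a direct solve stands in for the inexact solve one would use in practice; the helper name is mine):

        import numpy as np

        def jd_expansion_generalized(A, B, x, theta):
            """Expansion t with <x, t>_B = 0 containing the inverse iteration direction."""
            w = np.linalg.solve(A - theta * B, B @ x)             # w = (A - theta B)^{-1} B x
            alpha = -(x.conj() @ (B @ x)) / (x.conj() @ (B @ w))  # from <x, x + alpha w>_B = 0
            return x + alpha * w                                  # equals the formula for t above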


    Generalized Hermitean eigenproblems ct.

    The operator

        (I − B x x^H/(x^H B x)) (A − θB) (I − x x^H B/(x^H B x))

    maps the space (Bx)^⊥ onto the space x^⊥, so that preconditioning is always required if we use a Krylov solver in order to get a mapping of (Bx)^⊥ into itself.

    Choosing

        K̃ = (I − B x x^H/(x^H B x)) K (I − x x^H B/(x^H B x))

    with K^{−1}(A − θB) ≈ I, the preconditioner can be implemented in a similar way as for the standard eigenproblem.

    In particular, taking into account the projector when solving the correction equation with a preconditioned Krylov solver requires only one additional solve with the preconditioner K.


    Generalized Hermitean eigenproblems ct.

    As for the standard eigenvalue problem, the Jacobi-Davidson method can be combined with restart and deflation.

    If we want to work with orthogonal operators in the deflation, then we have to work with B-orthogonal matrices that reduce the given generalized system to (truncated) Schur form

        A Q_k = Z_k D_k,

    where Z_k = B Q_k and Q_k has B-orthogonal columns.

    D_k is diagonal with the k computed eigenvalues on its diagonal, and the columns of Q_k are eigenvectors of the pencil A − λB.

    This leads to the projection for the deflation with the first k eigenvectors,

        (I − Z_k Q_k^H)(A − λB)(I − Q_k Z_k^H).

    It is easy to verify that the deflated operator B is still symmetric and positive definite with respect to the space (B Q_k)^⊥.


    Generalized eigenproblems

    Consider the generalized eigenvalue problem

        A x = λ B x

    with general matrices A, B ∈ C^{n×n}, B nonsingular.

    Given an approximation (θ, x) to an eigenpair, inverse iteration is defined by v := (A − θB)^{−1} B x. Expanding the current search space by t := x + α v, where α is chosen such that x and x + α v are orthogonal with respect to the Euclidean inner product, we obtain

        t = x − (x^H x / (x^H (A − θB)^{−1} B x)) (A − θB)^{−1} B x.

    This is the solution of the well-known correction equation

        (I − B x x^H/(x^H B x)) (A − θB) (I − x x^H/(x^H x)) t = (A − θB) x,   t ⊥ x,

    of the Jacobi–Davidson method; cf. Fokkema, Sleijpen & van der Vorst (1998).


  • Two-sided Jacobi-Davidson

    We already mentioned that the Jacobi-Davidson method can be considered as an acceleration of inverse iteration with Rayleigh quotient shifts (Rayleigh quotient iteration):

        ũ_{k+1} = (A − θ_k I)^{−1} u_k,   u_{k+1} = ũ_{k+1}/‖ũ_{k+1}‖,   θ_{k+1} = (u_{k+1})^H A u_{k+1}.

    The Rayleigh quotient iteration (RQI) is known to converge cubically to simple eigenvalues of normal matrices. For a (nonnormal) eigenvalue of a nonnormal matrix it converges locally quadratically at best (cf. Parlett 1974).

    Ostrowski’s two-sided RQI works with the two-sided (or generalized) Rayleigh quotient

        θ(u, v) := v^H A u / (v^H u),

    where u and v are approximate right and left eigenvectors, respectively.

    In Parlett (1975) it was shown that the two-sided RQI converges locally and cubically to simple eigenvalues.


    Two-sided Rayleigh quotient iteration

    1: Choose initial vectors u_1 and v_1 with unit norm such that v_1^H u_1 ≠ 0
    2: for k = 1, 2, . . . do
    3:   compute θ_k = v_k^H A u_k / (v_k^H u_k)
    4:   if A − θ_k I is sufficiently singular then
    5:     solve (A − θ_k I) x = 0 and (A^H − θ̄_k I) y = 0
    6:     STOP
    7:   end if
    8:   solve (A − θ_k I) u_{k+1} = u_k and normalize u_{k+1}
    9:   solve (A^H − θ̄_k I) v_{k+1} = v_k and normalize v_{k+1}
    10:  if v_{k+1}^H u_{k+1} = 0 then
    11:    the method fails
    12:  end if
    13: end for

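    A dense NumPy sketch of this iteration (tolerances and the breakdown threshold are arbitrary choices):

        import numpy as np

        def two_sided_rqi(A, u, v, maxit=20, tol=1e-12):
            """Two-sided Rayleigh quotient iteration for a simple eigentriple of A."""
            n = A.shape[0]
            I = np.eye(n, dtype=A.dtype)
            theta = (v.conj() @ (A @ u)) / (v.conj() @ u)   # two-sided Rayleigh quotient
            for _ in range(maxit):
                if np.linalg.norm(A @ u - theta * u) < tol:
                    break                                   # A - theta*I (nearly) singular
                u = np.linalg.solve(A - theta * I, u)       # right eigenvector update
                u /= np.linalg.norm(u)
                v = np.linalg.solve(A.conj().T - np.conj(theta) * I, v)  # left update
                v /= np.linalg.norm(v)
                if abs(v.conj() @ u) < 1e-14:
                    raise RuntimeError("two-sided RQI breakdown: v^H u = 0")
                theta = (v.conj() @ (A @ u)) / (v.conj() @ u)
            return theta, u, v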

    Two-sided Jacobi-Davidson

    The two-sided RQI suggests to work with two search spaces: U for the right eigenvector and V for the left eigenvector.

    Suppose that we have k-dimensional search spaces U and V, and approximations u ∈ U and v ∈ V to right and left eigenvectors with v^H u ≠ 0. Then we choose the Rayleigh quotient

        θ = θ(u, v) := v^H A u / (v^H u)   (1)

    as approximation to the corresponding eigenvalue.

    Note that (1) holds if and only if

        A u − θ u ⊥ v   and   A^H v − θ̄ v ⊥ u.

    u = U c and v = V d (where the columns of U and V form orthonormal bases of U and V, respectively) are chosen as eigenvectors of the oblique projection of A to U along V, i.e.

        V^H A U c = θ V^H U c   and   U^H A^H V d = θ̄ U^H V d.


    Two-sided Jacobi-Davidson ct.

    Similar to the original Jacobi-Davidson method, we expand the search spaces such that the directions of inverse iteration are contained in the augmented spaces:

        s = u + α (A − θI)^{−1} u   and   t = v + β (A^H − θ̄I)^{−1} v.

    For robustness reasons we choose α and β such that s ⊥ u and t ⊥ v, i.e.

        α = −‖u‖² / (u^H (A − θI)^{−1} u)   and   β = −‖v‖² / (v^H (A^H − θ̄I)^{−1} v).

    Hence, s is a solution of a projected problem

        (I − u p^H/(p^H u)) (A − θI) (I − u u^H) s = (A − θI) u,   s ⊥ u,

    for some p ∈ C^n, and for consistency reasons we choose p = v, since by construction (A − θI) u ∈ v^⊥.

    Similarly, the correction equation for t reads

        (I − v u^H/(u^H v)) (A^H − θ̄I) (I − v v^H) t = (A^H − θ̄I) v,   t ⊥ v.


  • Two-sided Jacobi-Davidson algorithm

    1: Choose initial vectors u_1 and v_1 with unit norm such that v_1^H u_1 ≠ 0
    2: s = u_1; t = v_1; U_0 = [ ]; V_0 = [ ]
    3: for k = 1, 2, . . . do
    4:   U_k = MGS(U_{k−1}, s); V_k = MGS(V_{k−1}, t)
    5:   compute the k-th column of W_k = A U_k
    6:   compute the k-th row and column of H_k = V_k^H W_k
    7:   compute the wanted eigentriple (θ, c, d) of the pencil (V_k^H A U_k, V_k^H U_k)
    8:   u = U_k c; v = V_k d
    9:   r_u = (A − θI) u = W_k c − θ u
    10:  r_v = (A^H − θ̄I) v
    11:  STOP if min{‖r_u‖, ‖r_v‖} ≤ tol
    12:  solve (approximately) for s ⊥ u and t ⊥ v from
           (I − u v^H/(v^H u)) (A − θI) (I − u u^H) s = (A − θI) u,   s ⊥ u,
           (I − v u^H/(u^H v)) (A^H − θ̄I) (I − v v^H) t = (A^H − θ̄I) v,   t ⊥ v
    13: end for

    Remarks

    Hochstenbach & Sleijpen (2003) proposed the two-sided Jacobi-Davidson method with a different motivation. They also suggested a variant in which the bases U_k and V_k are constructed to be bi-orthogonal.

    Here the expansions s and t of U_k and V_k satisfy the correction equations

        (I − u v^H/(v^H u)) (A − θI) (I − u v^H/(v^H u)) s = (A − θI) u,   s ⊥ v,
        (I − v u^H/(u^H v)) (A^H − θ̄I) (I − v u^H/(u^H v)) t = (A^H − θ̄I) v,   t ⊥ u.

    Schwetlick & Schreiber (2006) used arguments from bifurcation theory to motivate a variant of the Jacobi-Davidson method which is based on the correction equations

        (I − v v^H) (A − θI) (I − u u^H) s = (A − θI) u,   s ⊥ u,
        (I − u u^H) (A^H − θ̄I) (I − v v^H) t = (A^H − θ̄I) v,   t ⊥ v.

    All three methods converge locally and cubically.

