Iterative Methods in Linear Algebra (part 2)
Stan Tomov
Innovative Computing Laboratory, Computer Science Department
The University of Tennessee
Wednesday April 27, 2011
CS 594, 04-27-2011
Outline
Part I: Krylov iterative solvers
Part II: Convergence and preconditioning
Part III: Iterative eigen-solvers
Part I
Krylov iterative solvers
Krylov iterative solvers
Building blocks for Krylov iterative solvers covered so far
Projection/minimization in a subspace
Petrov-Galerkin conditions; least squares minimization, etc.
Orthogonalization
CGS and MGS; Cholesky or Householder based QR
Krylov iterative solvers
We also covered abstract formulations for iterative solvers and eigen-solvers
What are the goals of this lecture?
Give specific examples of Krylov solvers
Show how examples relate to the abstract formulation
Show how the examples relate to the building blocks covered so far, specifically to
projection, and orthogonalization
But we are not going into the details!
Krylov iterative solvers
How are these techniques related to Krylov iterative solvers?
Remember the projection slides 26 & 27 from Lecture 7
Projection in a subspace is the basis for an iterative method
Here the projection is in V. In Krylov methods V is the Krylov subspace
Km(A, r0) = span{r0, Ar0, A^2 r0, . . . , A^{m−1} r0}
where r0 ≡ b − Ax0 and x0 is an initial guess.
Often V or W are orthonormalized
The projection is 'easier' to find when we work with an orthonormal basis (e.g. problem 4 from homework 5: projection in a general vs. an orthonormal basis). The orthonormalization can be CGS, MGS, Cholesky or Householder based, etc.
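As a small illustration (a numpy sketch with made-up dimensions, not from the lecture): with an orthonormal basis the projection coefficients are just inner products, while a general basis requires solving a Gram system.

```python
import numpy as np

np.random.seed(0)
n, m = 8, 3
r = np.random.rand(n)
B = np.random.rand(n, m)                      # a general (non-orthonormal) basis

# General basis: solve the Gram system (B^T B) c = B^T r
c = np.linalg.solve(B.T @ B, B.T @ r)
proj_general = B @ c

# Orthonormal basis for the same subspace (via QR): projection is V (V^T r)
V, _ = np.linalg.qr(B)
proj_ortho = V @ (V.T @ r)

print(np.allclose(proj_general, proj_ortho))  # True: same projection
```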
Krylov Iterative Methods
To summarize, Krylov iterative methods in general
expand the Krylov subspace by a matrix-vector product, and
do a projection in it.
Various methods result from specific choices of the expansion and projection.
Krylov Iterative Methods
A specific example with the
Conjugate Gradient Method (CG)
Conjugate Gradient Method
The method is for SPD matrices
Both V and W are the Krylov subspaces, i.e. at iteration i
V ≡ W ≡ Ki(A, r0) ≡ span{r0, Ar0, . . . , A^{i−1} r0}
The projection xi ∈ Ki(A, r0) satisfies the Petrov-Galerkin conditions
(Axi, φ) = (b, φ) for all φ ∈ Ki(A, r0)
Conjugate Gradient Method (continued)
At every iteration there is a way (to be shown later) to construct a new search direction pi such that
span{p0, p1, . . . , pi} ≡ Ki+1(A, r0) and (Api, pj) = 0 for i ≠ j.
Note: A is SPD ⇒ (Api, pj) ≡ (pi, pj)A can be used as an inner product, i.e. p0, . . . , pi is an (·, ·)A-orthogonal basis for Ki+1(A, r0)
⇒ we can easily find xi+1 ≈ x as
xi+1 = x0 + α0 p0 + · · · + αi pi s.t.
(Axi+1, pj) = (b, pj) for j = 0, . . . , i
Namely, because of the (·, ·)A orthogonality of p0, . . . , pi, at iteration i + 1 we only have to find αi:
(Axi+1, pi) = (A(xi + αi pi), pi) = (b, pi) ⇒ αi = (ri, pi) / (Api, pi)
Note: xi above can actually be replaced by any x0 + v, v ∈ Ki(A, r0) (Why?)
Conjugate Gradient Method (continued)
Conjugate Gradient Method
1: Compute r0 = b − Ax0 for some initial guess x0
2: for i = 0, 1, . . . do
3:    ρi = ri^T ri
4:    if i = 0 then
5:       p0 = r0
6:    else
7:       pi = ri + (ρi / ρi−1) pi−1
8:    end if
9:    qi = Api
10:   αi = ρi / (pi^T qi)
11:   xi+1 = xi + αi pi
12:   ri+1 = ri − αi qi
13:   check convergence; continue if necessary
14: end for
Note:
One matrix-vector product per iteration (at line 9)
Two inner products per iteration (lines 3 and 10)
In exact arithmetic ri+1 = b − Axi+1 (apply A to both sides of line 11 and subtract from b to get line 12)
The update for xi+1 is as pointed out before, i.e. with
αi = (ri, ri) / (Api, pi) = (ri, pi) / (Api, pi)
since (ri, pi−1) = 0 (exercise)
Other relations to be proved (exercise): the pi's span the Krylov subspace; the pi's are (·, ·)A-orthogonal, etc.
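A runnable numpy sketch of this pseudocode (the function name, tolerance, and SPD test matrix are my own illustrations, not from the lecture):

```python
import numpy as np

def cg(A, b, x0, tol=1e-10, max_iter=1000):
    """Conjugate Gradient sketch following the pseudocode above (SPD A)."""
    x = x0.copy()
    r = b - A @ x                        # line 1: initial residual
    rho_old = 1.0
    p = np.zeros_like(b)
    for i in range(max_iter):
        rho = r @ r                      # line 3: rho_i = r_i^T r_i
        if i == 0:
            p = r.copy()                 # line 5
        else:
            p = r + (rho / rho_old) * p  # line 7: new search direction
        q = A @ p                        # line 9: the one matvec per iteration
        alpha = rho / (p @ q)            # line 10
        x = x + alpha * p                # line 11
        r = r - alpha * q                # line 12: updated residual
        if np.linalg.norm(r) < tol:      # line 13: convergence check
            return x
        rho_old = rho
    return x

# Small SPD test problem (illustrative)
n = 100
M = np.random.rand(n, n)
A = M @ M.T + n * np.eye(n)              # SPD by construction
b = np.random.rand(n)
x = cg(A, b, np.zeros(n))
print(np.linalg.norm(b - A @ x))         # residual ~ tol
```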
Conjugate Gradient Method (continued)
To sum it up:
In exact arithmetic we get the exact solution in at most n steps, i.e.
x = x0 + α0 p0 + · · · + αi pi + αi+1 pi+1 + · · · + αn−1 pn−1
At every iteration one more term αj pj is added to the current approximation
xi = x0 + α0 p0 + · · · + αi−1 pi−1
xi+1 = x0 + α0 p0 + · · · + αi−1 pi−1 + αi pi ≡ xi + αi pi
Note: we do not have to solve a linear system at every iteration because of the A-orthogonal basis that we manage to maintain and expand at every iteration. It can be proved that the error ei = x − xi satisfies
||ei||A ≤ 2 ( (√κ(A) − 1) / (√κ(A) + 1) )^i ||e0||A
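To make the bound concrete, a small back-of-the-envelope sketch (my own illustration, not from the lecture): the iteration count this bound predicts for reducing the A-norm error by a factor ε grows roughly like √κ(A).

```python
import numpy as np

def cg_iteration_bound(kappa, eps):
    """Smallest i with 2*((sqrt(kappa)-1)/(sqrt(kappa)+1))**i <= eps."""
    rho = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
    return int(np.ceil(np.log(eps / 2) / np.log(rho)))

for kappa in (10, 1_000, 100_000):
    # each 100x increase in kappa costs roughly 10x more iterations
    print(kappa, cg_iteration_bound(kappa, 1e-8))
```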
Building orthogonal basis for a Krylov subspace
We have seen the importance of defining projections
not just for linear solvers
abstract linear solver and eigen-solver formulations
A specific example
in CG, where the basis for the Krylov subspaces is A-orthogonal (A is SPD)
We have seen how to build it
CGS, MGS, Cholesky or Householder based, etc.
These techniques can be used in a method specifically designed for Krylov subspaces (general non-Hermitian matrix), namely in
Arnoldi’s Method
Arnoldi’s Method
Arnoldi’s method:
Build an orthogonal basis for Km(A, r0); A can be general, non-Hermitian
1: v1 = r0/||r0||2
2: for j = 1 to m do
3:    hij = (Avj, vi) for i = 1, . . . , j
4:    wj = Avj − h1j v1 − · · · − hjj vj
5:    hj+1,j = ||wj||2
6:    if hj+1,j = 0 then Stop
7:    vj+1 = wj / hj+1,j
8: end for
Note:
This orthogonalization is based on CGS (line 4):
wj = Avj − (Avj, v1) v1 − · · · − (Avj, vj) vj
⇒ up to iteration j the vectors v1, . . . , vj are orthogonal
The space of this orthogonal basis grows by taking the next vector to be Avj
If we do not exit at step 6 we will have
Km(A, r0) = span{v1, v2, . . . , vm}
(exercise)
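A numpy sketch of the algorithm, assuming the CGS variant from line 4 (the function name and the convention of returning the full (m+1)-column V and (m+1)×m H are my own):

```python
import numpy as np

def arnoldi(A, r0, m):
    """CGS-based Arnoldi sketch. Returns V with m+1 orthonormal columns
    spanning K_{m+1}(A, r0) and the (m+1) x m Hessenberg matrix H_{m+1}."""
    n = r0.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r0 / np.linalg.norm(r0)          # line 1
    for j in range(m):
        w = A @ V[:, j]
        H[: j + 1, j] = V[:, : j + 1].T @ w    # line 3: h_ij = (A v_j, v_i)
        w = w - V[:, : j + 1] @ H[: j + 1, j]  # line 4: CGS orthogonalization
        H[j + 1, j] = np.linalg.norm(w)        # line 5
        if H[j + 1, j] == 0:                   # line 6: breakdown (invariant subspace)
            break
        V[:, j + 1] = w / H[j + 1, j]          # line 7
    return V, H
```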
Arnoldi’s Method (continued)
Arnoldi’s method in matrix notation
Denote
Vm ≡ [v1, . . . , vm], Hm+1 = {hij}, an (m + 1) × m matrix,
and by Hm the matrix Hm+1 without the last row.
Note that Hm is upper Hessenberg (zeros below the first sub-diagonal) and
AVm = Vm Hm + wm em^T
Vm^T A Vm = Hm
(exercise)
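A quick numerical check of these two identities, continuing from the arnoldi sketch above (test dimensions are arbitrary):

```python
# Continuing from the arnoldi sketch above:
n, m = 30, 8
A = np.random.rand(n, n)
r0 = np.random.rand(n)
V, H = arnoldi(A, r0, m)
Vm, Hm = V[:, :m], H[:m, :]          # H_m is H_{m+1} without its last row
wm = H[m, m - 1] * V[:, m]           # w_m = h_{m+1,m} v_{m+1}
em = np.zeros(m); em[-1] = 1.0
print(np.allclose(A @ Vm, Vm @ Hm + np.outer(wm, em)))  # True
print(np.allclose(Vm.T @ A @ Vm, Hm))                   # True
```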
Arnoldi’s Method (continued)
Variations:
Explained using CGS
Can be implemented with MGS, Householder, etc.
How to use it in linear solvers?
Example with the Full Orthogonalization Method (FOM)
FOM
1: β = ||r0||2
2: Compute v1, . . . , vm with Arnoldi
3: ym = β Hm^{−1} e1
4: xm = x0 + Vm ym
Look for a solution of the form
xm = x0 + ym(1) v1 + · · · + ym(m) vm ≡ x0 + Vm ym
The Petrov-Galerkin conditions will be
Vm^T A xm = Vm^T b
⇒ Vm^T A (x0 + Vm ym) = Vm^T b
⇒ Vm^T A Vm ym = Vm^T r0
⇒ Hm ym = Vm^T r0 = β e1
which is given by steps 3 and 4 of the algorithm
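A minimal sketch of these four steps, reusing the arnoldi function from above (the solve in step 3 assumes Hm is nonsingular):

```python
def fom(A, b, x0, m):
    """FOM sketch, reusing the arnoldi sketch from above."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)           # line 1
    V, H = arnoldi(A, r0, m)            # line 2
    rhs = np.zeros(m); rhs[0] = beta    # beta * e1
    y = np.linalg.solve(H[:m, :], rhs)  # line 3: solve H_m y = beta e1
    return x0 + V[:, :m] @ y            # line 4
```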
Restarted FOM
What happens when m increases?
computation grows at least as O(m^2 n)
memory is O(mn)
A remedy is to restart the algorithm, leading to restarted FOM
FOM(m)
1: β = ||r0||2
2: Compute v1, . . . , vm with Arnoldi
3: ym = β Hm^{−1} e1
4: xm = x0 + Vm ym. Stop if the residual is small enough.
5: Set x0 := xm and go to 1
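A sketch of the restarted cycle, reusing the fom function above (the restart count and tolerance are illustrative):

```python
def fom_restarted(A, b, x0, m, tol=1e-10, max_restarts=50):
    """FOM(m) sketch: cap memory at O(mn) by restarting every m steps."""
    x = x0
    for _ in range(max_restarts):
        x = fom(A, b, x, m)                    # steps 1-4: one m-step FOM cycle
        if np.linalg.norm(b - A @ x) < tol:    # step 4: stop on small residual
            return x
    return x                                   # step 5 otherwise: x0 := xm, repeat
```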
GMRES
Generalized Minimum Residual Method (GMRES)
Similar to FOM
Again look for a solution
xm = x0 + Vm ym
where Vm is from the Arnoldi process (i.e. for Km(A, r0)). The test conditions Wm from the abstract formulation (slide 27, Lecture 7)
Wm^T A Vm ym = Wm^T r0
are given by Wm = AVm.
The difference shows up in step 3 of FOM, namely
ym = β Hm^{−1} e1
being replaced by
ym = argmin_y ||β e1 − Hm+1 y||2
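A sketch of GMRES along these lines, again reusing arnoldi from above; the small least squares problem is handed to numpy's lstsq here rather than the incremental QR used in practical implementations:

```python
def gmres(A, b, x0, m):
    """GMRES sketch: minimize the residual over x0 + K_m(A, r0)."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    V, H = arnoldi(A, r0, m)                    # V: n x (m+1), H: (m+1) x m
    rhs = np.zeros(m + 1); rhs[0] = beta        # beta * e1
    y = np.linalg.lstsq(H, rhs, rcond=None)[0]  # argmin ||beta e1 - H_{m+1} y||
    return x0 + V[:, :m] @ y
```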
GMRES
Similarly to FOM, GMRES can be defined with
Various orthogonalizations in the Arnoldi process
Restart
Note:
Solving the least squares (LS) problem
argmin_y ||β e1 − Hm+1 y||2
can be done with QR factorization, as discussed in Lecture 7, Slide 25
Lanczos Algorithm
Can we improve on Arnoldi if A is symmetric?
Yes! Hm becomes symmetric, so it is tridiagonal
the simplification of Arnoldi in this case leads to the Lanczos Algorithm
Lanczos can be used in deriving CG
The Lanczos Algorithm
1: v1 = r0/||r0||2, β1 = 0, v0 = 0
2: for j = 1 to m do
3:    wj = Avj − βj vj−1
4:    αj = (wj, vj)
5:    wj = wj − αj vj
6:    βj+1 = ||wj||2. If βj+1 = 0 then Stop
7:    vj+1 = wj / βj+1
8: end for
The matrix Hm here is tridiagonal, with diagonal
hii = αi
and off-diagonal
hi,i+1 = βi+1
In exact arithmetic the vi's are orthogonal, but in practice orthogonality is lost rapidly
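A numpy sketch of the recurrence (the function name and return convention are my own; there is no reorthogonalization, so the orthogonality loss noted above applies):

```python
def lanczos(A, r0, m):
    """Lanczos sketch for symmetric A: the three-term recurrence replaces
    Arnoldi's full orthogonalization."""
    n = r0.shape[0]
    V = np.zeros((n, m + 1))
    alpha = np.zeros(m)
    beta = np.zeros(m + 1)                     # beta[0] plays the role of beta_1 = 0
    V[:, 0] = r0 / np.linalg.norm(r0)          # line 1
    for j in range(m):
        w = A @ V[:, j] - beta[j] * (V[:, j - 1] if j > 0 else 0)  # line 3
        alpha[j] = w @ V[:, j]                 # line 4
        w = w - alpha[j] * V[:, j]             # line 5
        beta[j + 1] = np.linalg.norm(w)        # line 6
        if beta[j + 1] == 0:
            break
        V[:, j + 1] = w / beta[j + 1]          # line 7
    # tridiagonal H_m: diagonal alpha, off-diagonals beta_2, ..., beta_m
    H = np.diag(alpha) + np.diag(beta[1:m], 1) + np.diag(beta[1:m], -1)
    return V[:, :m], H
```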
Choice of basis for the Krylov subspace
We saw that the choice of basis for the Krylov subspace is characteristic of each method, e.g.
GMRES uses an orthogonal basis
CG uses an A-orthogonal basis
This is true for other methods as well
Conjugate Residual (CR; for symmetric problems) uses an A^T A-orthogonal basis (i.e. the Api's are orthogonal)
An A^T A-orthogonal basis can be generalized to the non-symmetric case as well, e.g. in the Generalized Conjugate Residual (GCR)
Other Krylov methods
We considered various methods that construct a basis for the Krylov subspaces
Another big class of methods is based on biorthogonalization (an algorithm due to Lanczos)
For non-symmetric matrices, build a pair of bi-orthogonal bases for the two subspaces
Km(A, v1) = span{v1, Av1, . . . , A^{m−1} v1}
Km(A^T, w1) = span{w1, A^T w1, . . . , (A^T)^{m−1} w1}
Examples here are BCG and QMR (not to be discussed)
These methods are more difficult to analyze
Part II
Convergence and preconditioning
Convergence
Convergence can be analyzed by
exploiting the optimality properties (of the projection) when such properties exist
a useful tool here is Chebyshev polynomials
Convergence rates depend on the condition number of the matrix, e.g.
in CG it is
||ei||A ≤ 2 ( (√κ(A) − 1) / (√κ(A) + 1) )^i ||e0||A
Preconditioning
Convergence can be slow or even stagnate
for ill-conditioned matrices (with large condition number)
But can be improved with preconditioning
xi+1 = xi + P(b − Axi )
Think of P as a preconditioner, an operator/matrix P ≈ A−1
for P = A^{−1} the iteration converges in one step
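A sketch of this iteration, with the preconditioner passed as a function applying P to a vector (the names and tolerances are illustrative):

```python
import numpy as np

def richardson(A, b, apply_P, x0, tol=1e-10, max_iter=1000):
    """Preconditioned iteration x_{i+1} = x_i + P(b - A x_i);
    apply_P applies the preconditioner P ~ A^{-1} to a vector."""
    x = x0.copy()
    for _ in range(max_iter):
        r = b - A @ x
        if np.linalg.norm(r) < tol:
            break
        x = x + apply_P(r)
    return x

# With apply_P = lambda r: np.linalg.solve(A, r), i.e. P = A^{-1},
# the iteration converges in one step, as noted above.
```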
Preconditioning
Properties desired in a preconditioner:
Should approximate A−1
Should be easy to compute, apply to a vector, and store
Iterative solvers can be extended to support preconditioning(How?)
Preconditioning
Extending iterative solvers to support preconditioning
The same solver can be used but on a modified problem, e.g.
Problem Ax = b is transformed into
PAx = Pb
known as left preconditioning
Problem Ax = b is transformed into
APu = b, x = Pu
known as right preconditioning
Convergence of the modified problem will depend on κ(PA) (e.g. with left preconditioning)
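A hedged illustration of left preconditioning, reusing the gmres sketch from above with a Jacobi preconditioner (the test matrix is my own; it is made diagonally dominant so Jacobi is sensible):

```python
# Left preconditioning: solve PAx = Pb instead of Ax = b,
# here with Jacobi, P = inverse of the diagonal of A.
n = 100
A = np.random.rand(n, n) + n * np.eye(n)   # diagonally dominant test matrix
b = np.random.rand(n)
P = np.diag(1.0 / np.diag(A))
x = gmres(P @ A, P @ b, np.zeros(n), m=40)
print(np.linalg.norm(b - A @ x))           # residual of the original system
```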
Preconditioning
Examples:
Incomplete LU factorization (e.g. ILU(0))
Jacobi (inverse of the diagonal)
Other stationary iterative solvers (GS, SOR, SSOR)
Block preconditioners and domain decomposition
Additive Schwarz (think of Block-Jacobi; a sketch follows)
Multiplicative Schwarz (think of Block-GS)
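A minimal sketch of the Block-Jacobi idea behind additive Schwarz (the function and block layout are my own illustration; it can serve as apply_P in the richardson sketch above):

```python
def apply_block_jacobi(A, r, bs):
    """Block-Jacobi preconditioner sketch: solve independently with the
    bs x bs diagonal blocks of A (the blocks could be done in parallel)."""
    z = np.zeros_like(r)
    for s in range(0, len(r), bs):
        sl = slice(s, s + bs)
        z[sl] = np.linalg.solve(A[sl, sl], r[sl])
    return z

# e.g.: x = richardson(A, b, lambda r: apply_block_jacobi(A, r, 10), np.zeros(n))
```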
Preconditioning
Examples so far:
algebraic preconditioners, i.e. based exclusively on the matrix
Often, for problems coming from PDEs, PDE and discretization information can be used in designing a preconditioner, e.g.
FFTs can be used to approximate differential operators on regular grids (as in Fourier space the operators are diagonal matrices)
Grid and problem information can define multigrid preconditioners
Indefinite problems are often composed of sub-blocks that are definite: this is used in defining specific preconditioners, and even in modifying solvers for these needs, etc.
Part III
Iterative eigen-solvers
Iterative Eigen-Solvers
How are iterative eigensolvers related to Krylov subspaces?
Remember the projection slides 29 & 30 from Lecture 7
Again, as in linear solvers, projection in a subspace is the basis for an iterative eigen-solver
V and W are often based on Krylov subspaces
Km(A, r0) = span{r0, Ar0, A^2 r0, . . . , A^{m−1} r0}
where r0 ≡ b − Ax0 and x0 is an initial guess.
Often parts of V or W are orthogonalized for stability. The orthogonalization can be CGS, MGS, Cholesky or Householder based, etc. The smaller Rayleigh-Ritz problems are usually solved with LAPACK routines.
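A sketch of the Rayleigh-Ritz step for a symmetric A, reusing the lanczos function from above; the small projected eigenproblem goes to numpy.linalg.eigh (a LAPACK symmetric eigensolver underneath):

```python
def rayleigh_ritz(A, r0, m):
    """Rayleigh-Ritz sketch on K_m(A, r0) for symmetric A."""
    V, H = lanczos(A, r0, m)         # orthonormal basis and projected H = V^T A V
    theta, S = np.linalg.eigh(H)     # Ritz values and small eigenvectors
    return theta, V @ S              # Ritz pairs approximating A's extreme eigenpairs
```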
Learning Goals
A brief introduction to Krylov iterative solvers and eigen-solvers
Links to building blocks that we have already covered
Abstract formulation; projection, and orthogonalization
Specific examples and issues (preconditioning, parallelization, etc.)