A short course on: Preconditioned Krylov subspace methods

Yousef Saad, University of Minnesota

Dept. of Computer Science and Engineering

Universite du Littoral, Jan 19-30, 2005

Outline

Part 1

• Introd., discretization of PDEs

• Sparse matrices and sparsity

• Basic iterative methods (relaxation, ...)

Part 2

• Projection methods

• Krylov subspace methods

Part 3

• Preconditioned iterations

• Preconditioning techniques

• Parallel implementations

Part 4

• Eigenvalue problems

• Applications –

Preconditioning – Basic principles

Basic idea: use the Krylov subspace method on a modified system, such as

M−1Ax = M−1b.

• The matrix M−1A need not be formed explicitly; we only need to solve Mw = v whenever needed.

• Consequence: the fundamental requirement is that it should be easy to compute M−1v for an arbitrary vector v.
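A minimal sketch (assuming SciPy; the matrix, tolerance, and factorization choice are illustrative, not from the slides) of how this requirement looks in code: the preconditioner is handed to the Krylov solver as an operator whose only job is to apply M−1 to a vector, here by reusing the triangular factors of an incomplete LU factorization.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-4)                  # M ~ A, kept in factored form
M = spla.LinearOperator((n, n), matvec=ilu.solve)   # applies w = M^{-1} v only

x, info = spla.gmres(A, b, M=M)                     # the solver never forms M^{-1}A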

Left, Right, and Split preconditioning

Left preconditioning

M−1Ax = M−1b

Right preconditioning

AM−1u = b, with x = M−1u

Split preconditioning. Assume M is factored: M = M_L M_R. Then

M_L^{-1} A M_R^{-1} u = M_L^{-1} b,   with   x = M_R^{-1} u

Preconditioned CG (PCG)

II Assume: A and M are both SPD.

II Applying CG directly to M−1Ax = M−1b or AM−1u = b won't work because the coefficient matrices are not symmetric.

II Alternative: when M = LLT, use the split preconditioner option.

II Second alternative: observe that M−1A is self-adjoint with respect to the M inner product:

(M−1Ax, y)M = (Ax, y) = (x, Ay) = (x, M−1Ay)M

Preconditioned CG (PCG)

ALGORITHM : 1 Preconditioned Conjugate Gradient

1. Compute r0 := b−Ax0, z0 = M−1r0, and p0 := z0

2. For j = 0, 1, . . ., until convergence Do:

3. αj := (rj, zj)/(Apj, pj)

4. xj+1 := xj + αjpj

5. rj+1 := rj − αjApj

6. zj+1 := M−1rj+1

7. βj := (rj+1, zj+1)/(rj, zj)

8. pj+1 := zj+1 + βjpj

9. EndDo
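A minimal sketch (NumPy; solve_M is a user-supplied routine returning M−1 times a vector, and the stopping test is illustrative) of Algorithm 1 as listed above:

import numpy as np

def pcg(A, b, solve_M, x0=None, tol=1e-8, maxiter=500):
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.copy()
    r = b - A @ x                      # r0 := b - A x0
    z = solve_M(r)                     # z0 := M^{-1} r0
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rz / (Ap @ p)          # alpha_j := (r_j, z_j)/(A p_j, p_j)
        x += alpha * p                 # x_{j+1} := x_j + alpha_j p_j
        r -= alpha * Ap                # r_{j+1} := r_j - alpha_j A p_j
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = solve_M(r)                 # z_{j+1} := M^{-1} r_{j+1}
        rz_new = r @ z
        beta = rz_new / rz             # beta_j := (r_{j+1}, z_{j+1})/(r_j, z_j)
        p = z + beta * p               # p_{j+1} := z_{j+1} + beta_j p_j
        rz = rz_new
    return x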

Note that M−1A is also self-adjoint with respect to (., .)A:

(M−1Ax, y)A = (AM−1Ax, y) = (x, AM−1Ay) = (x, M−1Ay)A

II Can obtain a similar algorithm.

II Assume that M is available as a Cholesky product M = LLT. Then another possibility is the split preconditioning option, which applies CG to the system

L−1AL−Tu = L−1b,   with   x = L−Tu

II Notation: Â = L−1AL−T. All quantities related to the preconditioned system are indicated by a hat (ˆ).

ALGORITHM : 2 Conjugate Gradient with Split Preconditioner

1. Compute r0 := b − Ax0; r̂0 = L−1r0; and p0 := L−T r̂0.

2. For j = 0, 1, . . ., until convergence Do:

3. αj := (r̂j, r̂j)/(Apj, pj)

4. xj+1 := xj + αjpj

5. r̂j+1 := r̂j − αjL−1Apj

6. βj := (r̂j+1, r̂j+1)/(r̂j, r̂j)

7. pj+1 := L−T r̂j+1 + βjpj

8. EndDo

II The xj’s produced by the above algorithm and PCG are identical

(if same initial guess is used).

Flexible accelerators

Question: what can we do in case M is defined only approximately, i.e., if it can vary from one step to the other?

Applications:

II Iterative techniques as preconditioners: block-SOR, SSOR, multigrid, etc.

II Chaotic relaxation type preconditioners (e.g., in a parallel computing environment)

II Mixing preconditioners – mixing coarse mesh / fine mesh preconditioners

ALGORITHM : 3 GMRES – No preconditioning

1. Start: Choose x0 and a dimension m of the Krylov subspaces.

2. Arnoldi process:

• Compute r0 = b − Ax0, β = ‖r0‖2 and v1 = r0/β.

• For j = 1, ..., m do
  – Compute w := Avj
  – for i = 1, . . . , j, do
      hi,j := (w, vi)
      w := w − hi,jvi
  – hj+1,j := ‖w‖2; vj+1 := w/hj+1,j

• Define Vm := [v1, ..., vm] and H̄m = {hi,j}.

3. Form the approximate solution: Compute xm = x0 + Vmym where ym = argminy ‖βe1 − H̄my‖2 and e1 = [1, 0, . . . , 0]T.

4. Restart: If satisfied stop, else set x0 ← xm and goto 2.

ALGORITHM : 4 GMRES – with (right) Preconditioning

1. Start: Choose x0 and a dimension m of the Krylov subspaces.

2. Arnoldi process:

• Compute r0 = b − Ax0, β = ‖r0‖2 and v1 = r0/β.

• For j = 1, ..., m do
  – Compute zj := M−1vj
  – Compute w := Azj
  – for i = 1, . . . , j, do:
      hi,j := (w, vi)
      w := w − hi,jvi
  – hj+1,j := ‖w‖2; vj+1 := w/hj+1,j

• Define Vm := [v1, ..., vm] and H̄m = {hi,j}.

3. Form the approximate solution: Compute xm = x0 + M−1Vmym where ym = argminy ‖βe1 − H̄my‖2 and e1 = [1, 0, . . . , 0]T.

4. Restart: If satisfied stop, else set x0 ← xm and goto 2.

ALGORITHM : 5 GMRES – with variable Preconditioning

1. Start: Choose x0 and a dimension m of the Krylov subspaces.

2. Arnoldi process:

• Compute r0 = b − Ax0, β = ‖r0‖2 and v1 = r0/β.

• For j = 1, ..., m do
  – Compute zj := M_j^{-1}vj ; Compute w := Azj
  – for i = 1, . . . , j, do:
      hi,j := (w, vi)
      w := w − hi,jvi
  – hj+1,j := ‖w‖2; vj+1 := w/hj+1,j

• Define Zm := [z1, ..., zm] and H̄m = {hi,j}.

3. Form the approximate solution: Compute xm = x0 + Zmym where ym = argminy ‖βe1 − H̄my‖2 and e1 = [1, 0, . . . , 0]T.

4. Restart: If satisfied stop, else set x0 ← xm and goto 2.
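A minimal dense-algebra sketch (NumPy; the names and the least-squares solve are illustrative) of the flexible loop above: apply_prec(j, v) stands for M_j^{-1}v and may change from step to step, and the zj's are stored so that xm = x0 + Zmym can be formed.

import numpy as np

def fgmres(A, b, apply_prec, x0=None, m=30, tol=1e-8, max_restarts=20):
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.copy()
    for _ in range(max_restarts):
        r0 = b - A @ x
        beta = np.linalg.norm(r0)
        if beta <= tol * np.linalg.norm(b):
            return x
        V = np.zeros((n, m + 1)); Z = np.zeros((n, m)); H = np.zeros((m + 1, m))
        V[:, 0] = r0 / beta
        for j in range(m):
            Z[:, j] = apply_prec(j, V[:, j])        # z_j := M_j^{-1} v_j
            w = A @ Z[:, j]
            for i in range(j + 1):                  # modified Gram-Schmidt
                H[i, j] = w @ V[:, i]
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] == 0.0:                  # lucky breakdown
                m_eff = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        else:
            m_eff = m
        e1 = np.zeros(m_eff + 1); e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:m_eff + 1, :m_eff], e1, rcond=None)
        x = x + Z[:, :m_eff] @ y                    # x_m = x_0 + Z_m y_m
    return x

With apply_prec independent of j this reduces to right-preconditioned GMRES(m).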

Properties

• xm minimizes ‖b − Ax‖2 over x ∈ x0 + Span{Zm}.

• If Azj = vj (i.e., if preconditioning is 'exact' at step j) then the approximation xj is exact.

• If Mj is constant then the method is ≡ to right-preconditioned GMRES.

Additional costs:

• Arithmetic: none.

• Memory: must save the additional set of vectors {zj}j=1,...,m

Advantage: flexibility

Standard preconditioners

• Simplest preconditioner: M = Diag(A)  II poor convergence.

• Next to simplest: SSOR

M = (D − ωE)D−1(D − ωF )

• Still simple but often more efficient: ILU(0).

• ILU(p) – ILU with level of fill p – more complex.

• Class of ILU preconditioners with threshold

• Class of approximate inverse preconditioners

• Class of multilevel ILU preconditioners

• Algebraic multigrid preconditioners

An observation. Introduction to Preconditioning

II Take a look back at basic relaxation methods: Jacobi, Gauss-Seidel, SOR, SSOR, ...

II These are iterations of the form x(k+1) = Mx(k) + f where M is of the form M = I − P−1A. For example, for SSOR,

PSSOR = (D − ωE)D−1(D − ωF )

II SSOR attempts to solve the equivalent system

P−1Ax = P−1b

where P ≡ PSSOR, by the fixed-point iteration

x(k+1) = (I − P−1A)x(k) + P−1b     (iteration matrix M = I − P−1A)

instead of x(k+1) = (I − A)x(k) + b.

In other words:

Relaxation Scheme ⇐⇒ Preconditioned Fixed-Point Iteration

The SOR/SSOR preconditioner

[Figure: splitting of A into its diagonal D, strict lower part −E, and strict upper part −F.]

II SOR preconditioning

MSOR = (D − ωE)

II SSOR preconditioning

MSSOR = (D − ωE)D−1(D − ωF )

II MSSOR = LU, with L = unit lower triangular and U = upper triangular. One solve with MSSOR ≈ same cost as a MAT-VEC.
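A minimal sketch (SciPy; ω = 1, and the helper name is illustrative) of applying MSSOR = (D − E)D−1(D − F) as a preconditioner: one application of M−1 is a forward triangular solve with (D − E), a scaling by D, and a backward triangular solve with (D − F).

import scipy.sparse as sp
import scipy.sparse.linalg as spla

def ssor_preconditioner(A):
    A = A.tocsr()
    d = A.diagonal()
    D = sp.diags(d)
    DmE = (D + sp.tril(A, k=-1)).tocsr()   # D - E: the strict lower part of A is -E
    DmF = (D + sp.triu(A, k=1)).tocsr()    # D - F: the strict upper part of A is -F

    def apply(v):
        y = spla.spsolve_triangular(DmE, v, lower=True)          # (D - E)^{-1} v
        return spla.spsolve_triangular(DmF, d * y, lower=False)  # (D - F)^{-1} D y

    return spla.LinearOperator(A.shape, matvec=apply)

The resulting LinearOperator can be passed as M to a Krylov solver exactly as in the earlier sketch.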

II k-step SOR (resp. SSOR) preconditioning: k steps of SOR (resp. SSOR).

II Questions: best ω? For preconditioning one can take ω = 1:

M = (D − E)D−1(D − F )

Observe: M = LU + R with R = ED−1F.

II Best k? k = 1 is rarely the best. Substantial difference in performance.

[Figure: CPU time versus the number of SOR steps k for SOR(k)-preconditioned GMRES(10) and GMRES(20).]

ILU(0) and IC(0) preconditioners

II Notation: NZ(X) = {(i, j) | Xi,j ≠ 0}

II Formal definition of ILU(0): A = LU + R, with NZ(L) ∪ NZ(U) = NZ(A) and rij = 0 for (i, j) ∈ NZ(A).

II This does not define ILU(0) in a unique way.

Constructive definition: compute the LU factorization of A but drop any fill-in in L and U outside of Struct(A).

II ILU factorizations are often based on the i, k, j version of GE.

What is the IKJ version of GE?

ALGORITHM : 6 Gaussian Elimination – IKJ Variant

1. For i = 2, . . . , n Do:

2. For k = 1, . . . , i− 1 Do:

3. aik := aik/akk

4. For j = k + 1, . . . , n Do:

5. aij := aij − aik ∗ akj

6. EndDo

7. EndDo

8. EndDo
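A minimal dense, in-place sketch (NumPy) of the IKJ variant above: at step i the previously computed rows k < i are combined into row i, and the rows below i are not touched.

import numpy as np

def ikj_lu(A):
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for i in range(1, n):                            # i = 2, ..., n
        for k in range(i):                           # k = 1, ..., i-1
            A[i, k] /= A[k, k]                       # multiplier, becomes L[i, k]
            A[i, k+1:] -= A[i, k] * A[k, k+1:]       # update the rest of row i
    return A                                         # unit L below the diagonal, U on/above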

[Figure: access pattern of the IKJ variant at step i — rows 1, ..., i−1 are accessed but not modified, row i is accessed and modified, and the remaining rows are not accessed.]

ILU(0) – zero-fill ILU

ALGORITHM : 7 ILU(0)

For i = 1, . . . , N Do:
  For k = 1, . . . , i − 1 and if (i, k) ∈ NZ(A) Do:
    Compute aik := aik/akk
    For j = k + 1, . . . , N and if (i, j) ∈ NZ(A), Do:
      compute aij := aij − aik akj
    EndFor
  EndFor
EndFor

II When A is SPD, the ILU(0) factorization = the incomplete Choleski factorization, IC(0). Meijerink and Van der Vorst [1977].
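A minimal sketch (NumPy, with dense storage purely for clarity; a real implementation works directly on the CSR structure) of the rule above: run the IKJ elimination but only store multipliers and apply updates at positions that are nonzero in A.

import numpy as np

def ilu0(A):
    A = np.asarray(A, dtype=float)
    nz = A != 0                            # NZ(A): the pattern that is kept
    LU = A.copy()
    n = A.shape[0]
    for i in range(1, n):
        for k in range(i):
            if not nz[i, k]:
                continue                   # (i, k) outside NZ(A): no multiplier
            LU[i, k] /= LU[k, k]
            for j in range(k + 1, n):
                if nz[i, j]:               # fill-in outside NZ(A) is dropped
                    LU[i, j] -= LU[i, k] * LU[k, j]
    return LU                              # unit L strictly below the diagonal, U on/above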

Typical eigenvalue distribution

Pattern of ILU(0) for 5-point matrix

[Figure: sparsity patterns of A, of the ILU(0) factors L and U, and of the product LU.]

Stencils and ILU factorization

Stencils of A and the L and U parts of A:

[Figure: stencil of A, stencil of L, and stencil of U.]

Stencil of the product LU:

[Figure: stencil of LU; the entries outside the stencil of A are the fill-ins.]

Higher order ILU factorization

II Higher accuracy incomplete Choleski: for regularly structured problems, IC(p) allows p additional diagonals in L.

II Can be generalized to irregular sparse matrices using the notion of level of fill-in [Watts III, 1979]

• Initially: Lev(aij) = 0 if aij ≠ 0, and Lev(aij) = ∞ if aij = 0

• At a given step i of Gaussian elimination:

Lev(akj) = min{ Lev(akj), Lev(aki) + Lev(aij) + 1 }

II ILU(p) strategy = drop anything with level of fill-in exceeding p.

* Increasing the level of fill-in usually results in a more accurate ILU and...

* ...typically in fewer steps and fewer arithmetic operations.

ILU(1)

[Figure: patterns of the augmented matrix, of the ILU(1) factors L1 and U1, and of the product L1U1.]

ALGORITHM : 8 ILU(p)

For i = 2, . . . , N Do:
  For each k = 1, . . . , i − 1 and if aik ≠ 0 Do:
    Compute aik := aik/akk
    Compute ai,∗ := ai,∗ − aik ak,∗
    Update the levels of ai,∗
  EndFor
  Replace any element in row i with lev(aij) > p by zero
EndFor

II The algorithm can be split into a symbolic and a numerical phase; the levels of fill are handled in the symbolic phase.

ILU with threshold – generic algorithms

ILU(p) factorizations are based on structure only and not on numerical values  II  potential problems for non-M-matrices.

II One remedy: ILU with threshold (generic name: ILUT).

Two broad approaches:

First approach [derived from direct solvers]: use any (direct) sparse solver and incorporate a dropping strategy. [Munksgaard (?), Osterby & Zlatev, Sameh & Zlatev [90], D. Young et al. (Boeing), etc.]

Second approach [derived from the 'iterative solvers' viewpoint]:

1. use a (row or column) version of the (i, k, j) version of GE;

2. apply a drop strategy to the element lik as it is computed;

3. perform the linear combinations to get ai∗, using a full row expansion of ai∗;

4. apply a drop strategy to fill-ins.

ILU with threshold: ILUT(k, ε)

• Do the i, k, j version of Gaussian elimination (GE).

• During the i-th step of GE, discard any pivot or fill-in whose value is below ε‖rowi(A)‖.

• Once the i-th row of L + U (L-part + U-part) is computed, retain only the k largest elements in each part.

II Advantages: controlled fill-in; smaller memory overhead.

II Easy to implement – much more so than preconditioners derived from direct solvers.

II Can be made quite inexpensive.
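A minimal sketch (SciPy's spilu, which wraps SuperLU's threshold-based incomplete LU — the same dual-dropping spirit as ILUT, though not this exact algorithm; the matrix and parameter values are illustrative): drop_tol plays the role of the relative threshold and fill_factor bounds the memory overhead.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
A = sp.random(n, n, density=0.02, format="csc", random_state=0) + 10 * sp.eye(n, format="csc")
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-3, fill_factor=10)   # threshold ILU factors of A
M = spla.LinearOperator((n, n), matvec=ilu.solve)    # M^{-1} v via the two triangular solves
x, info = spla.gmres(A, b, M=M)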

Restarting methods for linear systems

Motivation / Goal: to use the information generated in the current GMRES loop to improve convergence at the next GMRES restart.

References:

II R. A. Nicolaides (87): Deflated CG.

II R. Morgan (92): Deflated GMRES.

II S. Kharchenko & A. Yeremin (92): pole placement ideas.

II K. Burrage, J. Ehrel, and B. Pohl (93): Deflated GMRES.

II E. de Sturler: use SVD information in GMRES.

II Can help improve convergence and prevent stagnation of GMRES in some cases.

Generally speaking: one should not expect to solve very hard problems with eigenvalue deflation preconditioning alone.

II Question: can the same effects be achieved with block-Krylov methods?

Using the Flexible GMRES framework

Method: deflation can be achieved by 'enriching' the Krylov subspace with approximate eigenvectors obtained from previous runs. We can use Flexible GMRES and append these vectors at the end. [See R. Morgan (92), Chapman & YS (95).]

II Vectors v1, . . . , vm−p = standard Arnoldi vectors

II Vectors vm−p+1, . . . , vm = computed as in FGMRES, where the new vectors zj are previously computed eigenvectors.

II Storage: we need to store v1, . . . , vm and zm−p+1, . . . , zm  II  p additional vectors, with typically p << m.

GMRES with deflation

1. Deflated Arnoldi process: r0 := b − Ax0, v1 := r0/(β := ‖r0‖2).
   For j = 1, ..., m do
     If j ≤ m − p then zj := vj else zj := uj−(m−p) (an eigenvector)
     w := Azj
     For i = 1, . . . , j, do
       hi,j := (w, vi)
       w := w − hi,jvi
     EndDo
     hj+1,j := ‖w‖2, vj+1 := w/‖w‖2
   EndDo
   Define Zm := [z1, ..., zm] and H̄m = {hi,j}.

2. Form the approximate solution: Compute xm = x0 + Zmym where ym = argminy ‖βe1 − H̄my‖2.

3. Get the next eigenvector estimates u1, . . . , up from H̄m, Vm, Zm, ...

4. Restart: If satisfied stop, else set x0 ← xm and goto 1.

Question 1: which eigenvectors to add?

II Answer: those associated with the smallest eigenvalues.

Question 2: how to compute eigenvectors from the Flexible GMRES step?

II Answer: use the relation

A Zm = Vm+1 H̄m

Approximate eigenpair: (λ, u = Zm y)

The Galerkin condition r ⊥ A Zm gives the generalized eigenvalue problem

H̄m^H H̄m y = λ H̄m^H Vm+1^H Zm y

In addition, in GMRES: H̄m = QmRm, so H̄m^H H̄m = Rm^H Rm.

See: Morgan (1993).

An example: Shell problems

II Can be very hard to solve!

II A matrix of size N=38,002, with Nz = 949,452 nonzero elements.

II Actually symmetric. Not exploited in test.

II Most simplistic methods fail.

II ILUT(50,0) does not work even with GMRES(80).

II This is an example where a large subspace is required.

[Figure: log10 of the residual norm versus the number of GMRES steps for the shell problem (N = 38,002, Nz = 949,452, m = 80), with 0, 5, 10, 20, and 30 added eigenvectors.]

An example

A matrix arising from Euler’s equations on unstructured mesh

[II Contributed by Larry Wigton from Boeing]

Size = 3,864. (966 mesh points).

Nonzero elements: 238,252 (about 62 per row).

II Dif£cult to solve in spite of its small size.

II Results with ILUT(lfil, ε), iterations to reach tol = 10−8:

lfil    Iterations    estimate of ‖(LU)−1‖
100     *             0.19E+56
110     *             0.34E+09
120     30            0.70E+05
130     25            0.33E+07
140     20            0.17E+04
150     19            0.69E+04

Results with Block Jacobi Preconditioning with Eigenvalue Deflation

Reduction in residual norm in 1200 GMRES steps with m = 49:

          4x4 block    16x16 block
p = 0     0.8 E 0      0.8 E 0
p = 4     0.8 E 0      4.0 E-5
p = 8     1.2 E-2      2.9 E-7
p = 12    1.9 E-2      3.8 E-6

Theory – (Hermitian case only)

Assume that A is SPD and let K = Km + W, where W is such that

dist(AW, U) = ε

with U = the exact invariant subspace associated with λ1, .., λs. Then the residual r obtained from the minimal residual projection process onto the augmented Krylov subspace K satisfies the inequality

‖r‖2 ≤ ‖r0‖2 · sqrt( 1/Tm(γ)^2 + ε^2 )

where γ ≡ (λn + λs+1)/(λn − λs+1) and Tm ≡ the Chebyshev polynomial of degree m of the first kind.

II See [YS, SIMAX vol. 4, pp 43-66 (1997)] for other results.

SPECIAL FORMS OF ILUS

Crout-based ILUT (ILUTC)

Terminology: Crout versions of LU compute the k-th row of U and the k-th column of L at the k-th step.

[Figure: computational pattern at step k — black = part computed at step k, blue = part accessed.]

Main advantages:

1. Less expensive than ILUT (avoids sorting)

2. Allows better techniques for dropping

References:

[1] M. Jones and P. Plassman. An improved incomplete Cholesky factorization. ACM Transactions on Mathematical Software, 21:5–17, 1995.

[2] S. C. Eisenstat, M. H. Schultz, and A. H. Sherman. Algorithms and data structures for sparse symmetric Gaussian elimination. SIAM Journal on Scientific Computing, 2:225–237, 1981.

[3] M. Bollhöfer. A robust ILU with pivoting based on monitoring the growth of the inverse factors. Linear Algebra and its Applications, 338(1–3):201–218, 2001.

[4] N. Li, Y. Saad, and E. Chow. Crout versions of ILU. MSI technical report, 2002.

Crout LU (dense case)

II Go back to the delayed update algorithm (IKJ alg.) and observe: we could do both a column and a row version.

II Left: U computed by rows. Right: L computed by columns.

Note: the entries 1 : k − 1 in the k-th row in the left figure need not be computed; they are available from the already computed columns of L. Similar observation for L (right).

ALGORITHM : 9 Crout LU Factorization (dense case)

1. For k = 1 : n Do:
2.   For i = 1 : k − 1 and if aki ≠ 0 Do:
3.     ak,k:n = ak,k:n − aki ∗ ai,k:n
4.   EndDo
5.   For i = 1 : k − 1 and if aik ≠ 0 Do:
6.     ak+1:n,k = ak+1:n,k − aik ∗ ak+1:n,i
7.   EndDo
8.   aik = aik/akk for i = k + 1, ..., n
9. EndDo
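A minimal dense, in-place sketch (NumPy) of Algorithm 9: at step k, row k of U and column k of L are updated using the previously computed rows and columns, and the column is then scaled by the pivot.

import numpy as np

def crout_lu(A):
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(n):
        for i in range(k):                           # lines 2-4: update row k of U
            if A[k, i] != 0:
                A[k, k:] -= A[k, i] * A[i, k:]
        for i in range(k):                           # lines 5-7: update column k of L
            if A[i, k] != 0:
                A[k+1:, k] -= A[i, k] * A[k+1:, i]
        A[k+1:, k] /= A[k, k]                        # line 8: scale the column of L
    return A                                         # U on/above the diagonal, unit L below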

ALGORITHM : 10 ILUC - Crout version of ILU

1. For k = 1 : n Do:
2.   Initialize row z: z1:k−1 = 0, zk:n = ak,k:n
3.   For i | 1 ≤ i ≤ k − 1 and lki ≠ 0 Do:
4.     zk:n = zk:n − lki ∗ ui,k:n
5.   EndDo
6.   Initialize column w: w1:k = 0, wk+1:n = ak+1:n,k
7.   For i | 1 ≤ i ≤ k − 1 and uik ≠ 0 Do:
8.     wk+1:n = wk+1:n − uik ∗ lk+1:n,i
9.   EndDo
10.  Apply a dropping rule to row z
11.  Apply a dropping rule to column w
12.  uk,: = z; l:,k = w/ukk; lkk = 1
13. EndDo

II Notice that the updates to the k-th row of U (resp. the k-th column of L) can be made in any order.

II Operations in lines 4 and 8 are sparse vector updates (must be done in sparse mode).

Comparison with standard techniques

[Figure: preconditioner construction time versus lfil for the matrix RAEFSKY3 (n = 21,200; nnz = 1,488,768): ILUC (solid), row-ILUT (circles), column-ILUT (triangles), and row-ILUT with binary search trees (stars).]

ILUS – ILU for Sparse Skyline format

II Often in CFD codes the matrices are generated in a format consisting of a sparse row representation of the decomposition

A = D + L1 + L2^T

where D is the diagonal of A and L1, L2 are strictly lower triangular.

[Figure: the lower part of A stored by sparse rows, the upper part by sparse columns.]

II Can develop ILU versions based on this data structure.

II Advantages: (1) savings when A has a symmetric structure; (2) graceful degradation to an incomplete Choleski factorization when A is symmetric (or nearly symmetric); (3) a little more convenient than ILUT for handling 'instability' of the factorization.

Let

A_{k+1} = [ A_k  v_k ;  w_k  α_{k+1} ]

If A_k = L_k D_k U_k, then

A_{k+1} = [ L_k  0 ;  y_k  1 ] [ D_k  0 ;  0  d_{k+1} ] [ U_k  z_k ;  0  1 ]

with

z_k = D_k^{-1} L_k^{-1} v_k ;   y_k = w_k U_k^{-1} D_k^{-1} ;   d_{k+1} = α_{k+1} − y_k D_k z_k

II To get the next column z_k we need to solve a system with a sparse L_k and a sparse right-hand side v_k. Similarly for y_k.

II How can we approximately solve such systems inexpensively?

Note: sparse RHS and sparse L

II Simplest possibility: truncated Neumann series,

z_k = D_k^{-1} L_k^{-1} v_k = D_k^{-1} (I + E_k + E_k^2 + . . . + E_k^p) v_k

The vector z_k gets denser as the 'level-of-fill' p increases.

II We also use sparse-sparse mode GMRES.

II The idea of sparse-sparse mode computations is quite useful in developing preconditioners.

Approximate Inverse preconditioners

Motivation:

• L-U solves in ILU may be 'unstable'

• Parallelism in L-U solves is limited

Idea: approximate the inverse of A directly, M ≈ A−1

II Right preconditioning: find M such that

AM ≈ I

II Left preconditioning: find M such that

MA ≈ I

II Factored approximate inverse: find L and U s.t.

LAU ≈ D

Some references

• Benson and Frederickson (’82): approximate inverse using stencils

• Grote and Simon (’93): Choose M to be banded

• Cosgrove, Dıaz and Griewank (’91) : Procedure to add £ll-ins to M

• Kolotilina and Yeremin (’93) : Factorized symmetric precondition-

ings M = GTLGL

• Huckle and Grote (’95) : Procedure to £nd good pattern for M

• Chow and YS (’95): Find pattern dynamically by using dropping.

• M. Benzi & Tuma (’96, ’97,..): Factored app. inv.

One (of many) options:

Try to find M to approximately minimize ‖I − AM‖F.

Note (with mj = Mej the j-th column of M):

min ‖I − AM‖F^2 = min Σj=1..n ‖ej − AMej‖2^2 = Σj=1..n min ‖ej − Amj‖2^2

II The problem decouples into n independent least-squares systems.

II In each of these systems the matrix and the RHS are sparse.

Two paths:

1. Can find a good sparsity pattern for M first, then compute M using this pattern.

2. Can find the pattern dynamically [similar to ILUT]

Approximate inverse with drop-tolerance [Chow & YS, 1994]

Find  min ‖ej − Amj‖2^2,  1 ≤ j ≤ n,  by solving approximately

Amj = ej,  1 ≤ j ≤ n

with a few steps of GMRES, starting with a sparse mj.

• The iterative method works in sparse mode: the Krylov vectors are sparse.

• Use sparse-vector by sparse-vector and sparse-matrix by sparse-vector operations.

• A dropping strategy is applied on mj.

• Exploit 'self-preconditioning'.

Sparse-Krylov MINRES and GMRES

• Dual threshold dropping strategy: drop tolerance and maximum number of nonzeros per column

• In MINRES, dropping is performed on the solution after each inner iteration

• In GMRES, dropping is performed on the Krylov basis at each iteration

• Use sparse-vector by sparse-vector operations

Self-preconditioning

The system Amj = ej may be preconditioned with the current M. This is even more effective if the columns are computed in sequence.

• Actually use FGMRES

• Leads to inner and outer iteration approach

• Quadratic convergence if no dropping is done

• Effect of reordering?

A few remarks

II There is no guarantee that M is nonsingular, unless the accuracy is high enough.

II There are many cases in which APINV preconditioners work while ILU or ILUTP [with reasonable fill] won't.

II The best use of APINV preconditioners may be in combining them with other techniques. For example,

minimize ‖B − AM‖F

where B is some other preconditioner (e.g. block-diagonal).

II Preconditioner for A → MB−1.

Approximate inverses for block-partitioned matrices

Motivation. Domain Decomposition

The reordered matrix has the block-bordered form

A =
| B1                  F1 |
|     B2              F2 |
|         . . .      ... |
|               Bn    Fn |
| E1  E2   . . .  En   C |

that is, A = [ B  F ; E  C ].

Note the factorization:

[ B  F ; E  C ] = [ B  0 ; E  S ] × [ I  B−1F ; 0  I ]

in which S is the Schur complement,

S = C − EB−1F.
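A minimal dense sketch (NumPy; solve_MS is an illustrative stand-in for whatever approximation of S is chosen) of applying this block factorization as a preconditioner: one application of M−1 amounts to a solve with B, a solve with the Schur-complement approximation, and a back-substitution.

import numpy as np

def apply_block_prec(B, E, F, solve_MS, rhs):
    nB = B.shape[0]
    f, g = rhs[:nB], rhs[nB:]
    # forward sweep with L = [B 0; E M_S]
    y = np.linalg.solve(B, f)                 # B y = f
    z = solve_MS(g - E @ y)                   # M_S z = g - E y, with M_S ~ S = C - E B^{-1} F
    # backward sweep with U = [I B^{-1}F; 0 I]
    y = y - np.linalg.solve(B, F @ z)         # y := y - B^{-1} F z
    return np.concatenate([y, z])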

One idea: compute M = LU in which

L = [ B  0 ; E  MS ]   and   U = [ I  B−1F ; 0  I ]

II MS = some preconditioner to S.

One option: MS = S̃ = a sparse approximation to S:

S̃ = C − EY   where   Y ≈ B−1F

II Need to find a sparse matrix Y such that

BY ≈ F

where F and B are sparse.

Preconditioning the Normal Equations

II Why not solve

ATAx = ATb   or   AATy = b ?

II Advantage: symmetric positive definite systems.

II Disadvantages: • worse conditioning  • not easy to precondition.

II Generally speaking, the disadvantages outweigh the advantages.

Incomplete Cholesky and SSOR for Normal Equations

II First observation: IC(0) does not necessarily exist for SPD matrices.

II Can shift the matrix: perform IC(0) on AAT + αI, for example. Hard to find good values of α for general matrices.

II Can modify the dropping strategy: exploit the relation between IC(0) and incomplete modified Gram-Schmidt on A → ICMGS, [Wang & Gallivan, 1993]

II Can also get L from an incomplete LQ factorization [Saad, 1989]. Advantage: arbitrary accuracy. Disadvantage: need the Q factor.

II We never need to form the matrix B = ATA or B = AAT in the implementation.

II Alternative: use SSOR [equivalent to the Kaczmarz algorithm]. No difficulties with shifts [take ω = 1], trivial to implement, no additional storage required.

ILUM AND ARMS

Independent set orderings & ILUM (Background)

Independent set orderings permute a matrix into the form

[ B  F ; E  C ]

where B is a diagonal matrix.

II Unknowns associated with the B block form an independent set (IS).

II An IS is maximal if it cannot be augmented by other nodes to form another IS.

II IS ordering can be viewed as a "simplification" of multicoloring.
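A minimal sketch (plain Python; the adjacency representation is an assumption for illustration) of a greedy traversal that produces an independent set from the pattern of A: a node is accepted only if none of its neighbors has already been accepted.

def greedy_independent_set(adj):
    # adj[i] = iterable of the neighbors of node i (pattern of A, diagonal excluded)
    n = len(adj)
    excluded = [False] * n
    iset = []
    for i in range(n):
        if not excluded[i]:
            iset.append(i)            # accept i into the independent set
            for j in adj[i]:
                excluded[j] = True    # its neighbors can no longer be accepted
    return iset

Reordering the accepted nodes first gives the block form above with a diagonal B, since no two accepted nodes are coupled.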

Main observation: the reduced system obtained by eliminating the unknowns associated with the IS is still sparse, since its coefficient matrix is the Schur complement

S = C − EB−1F

II Idea: apply the IS reduction recursively.

II When the reduced system is small enough, solve it by any method.

II Can devise an ILU factorization based on this strategy.

II See work by [Botta-Wubbs '96, '97, YS '94, '96 (ILUM), Leuze '89, ...]

ALGORITHM : 11 ILUM

For lev = 1, nlev Do

a. Get an independent set for A.

b. Form the reduced system associated with this set;

c. Apply a dropping strategy to this system;

d. Set A := current reduced matrix and go back to (a).

EndDo

Group Independent Sets / Aggregates

II Generalizes (common) independent sets.

Main goal: to improve robustness.

Main idea: use independent sets of "cliques", or "aggregates", with no coupling between the aggregates.

[Figure: group independent sets — no coupling between the aggregates.]

II Reorder the equations so that the nodes of the independent sets come first.

Algebraic Recursive Multilevel Solver (ARMS)

Original matrix A, and reordered matrix A0 = P0^T A P0:

[Figure: sparsity patterns of the original and of the reordered matrix (nz = 3155 in both).]

II Block ILU factorization (diagonal blocks treated as sparse):

Pl^T Al Pl = [ Bl  Fl ; El  Cl ] ≈ [ Ll  0 ; El Ul^{-1}  I ] × [ I  0 ; 0  Al+1 ] × [ Ul  Ll^{-1}Fl ; 0  I ]

Problem: Fill-in

[Figure: fill-in pattern without dropping, nz = 12205.]

Remedy: dropping strategy

[Figure: pattern after applying the dropping strategy, nz = 4255.]

II Next step: treat the Schur complement recursively

Algebraic Recursive Multilevel Solver (ARMS)

[ B  F ; E  C ] [ y ; z ] = [ f ; g ]

[ L  0 ; EU−1  I ] × [ U  L−1F ; 0  S ] [ y ; z ] = [ f ; g ]

where S = C − EB−1F = the Schur complement.

II Idea: perform the above block factorization recursively on S.

II Blocks in B are treated as sparse. They can be as large or small as desired.

II The algorithm is fully recursive.

II Incorporates so-called W-cycles.

II A stability criterion is added to the block independent sets algorithm.

Factorization:

Pl^T Al Pl = [ Bl  Fl ; El  Cl ] ≈ [ Ll  0 ; El Ul^{-1}  I ] × [ I  0 ; 0  Al+1 ] × [ Ul  Ll^{-1}Fl ; 0  I ]

II L-solve ∼ restriction operation. U-solve ∼ prolongation.

II Solve Last level system with, e.g., ILUT+GMRES

ALGORITHM : 12 ARMS(Alev ) factorization

1. If lev = last_lev then
2.   Compute Alev ≈ Llev Ulev
3. Else:
4.   Find an independent set permutation Plev
5.   Apply the permutation Alev := Plev^T Alev Plev
6.   Compute the factorization
7.   Call ARMS(Alev+1)
8. EndIf

Inner-Outer inter-level iterations

Idea: use an iteration at level l to reduce the residual norm by a tolerance τ.

[Figure: forward (descending) and backward (ascending) sweeps between the original system and the reduced system, which is solved by any means.]

II Many possible variants.

Three options for inner-outer inter-level iterations

(VARMS) Descend, using the level structure. At the last level use GMRES-ILUT. Ascend back to the current level.

(WARMS) Use a few steps of GMRES+VARMS to solve the reduced system. At the last level: use GMRES-ILUT.

(WARMS*) Use a few steps of FGMRES to solve the reduced system – preconditioner: WARMS* (recursive). Last level: use ILUT-GMRES.

II WARMS* can be expensive! Use with a small number of levels.

II Iterating allows the use of less costly factorizations [memory].

Storage:

II At each level except the last, store: Li, Ui, Fi, Ei.

[Figure: nested storage layout with L0, U0, F0, E0 at the outer level, L1, U1, F1, E1 inside, and A2 at the center.]

II For WARMS: need to multiply by the intermediate Ai's.

II Al+1 × w is computed as (Cl − El Ul^{-1} Ll^{-1} Fl) × w  II  need to store the above 4 matrices + Cl.

Group Independent Set reordering

[Figure: group independent set reordering — first block and separator.]

Simple strategy used: do a Cuthill-McKee ordering until there are enough points to make a block; reverse the ordering; start a new block from a non-visited node; continue until all points are visited. Add a criterion for rejecting "not sufficiently diagonally dominant rows."

Original matrix

[Figure: original matrix, entries ranging from 0.10E-06 to 0.19E+07.]

Block size of 6

[Figure: reordered matrix with block size 6, same value range.]

Block size of 20

[Figure: reordered matrix with block size 20, same value range.]

PARALLEL PRECONDITIONERS

Introduction

II In recent years: a big thrust of parallel computing techniques into application areas. Problems are becoming larger and harder.

II In general: very large machines (e.g. Cray T3E) are gone. Exception: big US government labs.

II Replaced by 'medium' size machines (e.g. IBM SP2, SGI Origin).

II Programming model: message-passing seems to be King (MPI).

II OpenMP and threads for small numbers of processors.

II Important new reality: parallel programming has penetrated the 'applications' areas [sciences and engineering + industry].

Parallel preconditioners: A few approaches

"Parallel matrix computation" viewpoint:

• Local preconditioners: polynomial (in the 80s), sparse approximate inverses [M. Benzi, Tuma et al. '99, E. Chow '00]

• Distributed versions of ILU [Ma & YS '94, Hysom & Pothen '00]

• Use of multicoloring to unravel parallelism

Domain Decomposition ideas:

• Schwarz-type preconditioners [e.g. Widlund, Bramble-Pasciak-Xu, X. Cai, D. Keyes, Smith, ...]

• Schur-complement techniques [Gropp & Smith, Farhat et al. (FETI), T.F. Chan et al., YS and Sosonkina '97, J. Zhang '00, ...]

Multigrid / AMG viewpoint:

• Multi-level multigrid-like preconditioners [e.g., Shadid-Tuminaro et al (Aztec project), ...]

II In practice: variants of additive Schwarz are very common (simplicity).

Intrinsically parallel preconditioners

Some alternatives

(1) Polynomial preconditioners;

(2) Approximate inverse preconditioners;

(3) Multi-coloring + independent set ordering;

(4) Domain decomposition approach.

POLYNOMIAL PRECONDITIONING

Principle: M−1 = s(A) where s is a (low) degree polynomial:

s(A)Ax = s(A)b

Problem: how to obtain s? Note: s(A) ≈ A−1

II Several approaches:

* Chebyshev polynomials

* Least-squares polynomials

* Others

II Polynomial preconditioners are seldom used in practice.
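A minimal sketch (SciPy; the Neumann-series choice of s shown here is one simple way to get a low-degree polynomial, not the Chebyshev or least-squares constructions listed above, and it assumes A has been scaled so that the series converges, e.g. ‖I − A‖ < 1): s(A) = I + N + ... + N^p with N = I − A.

import scipy.sparse.linalg as spla

def neumann_poly_prec(A, degree=3):
    n = A.shape[0]
    def apply(v):                       # computes s(A) v via the recurrence z := v + (I - A) z
        z = v.copy()
        for _ in range(degree):
            z = v + z - A @ z
        return z
    return spla.LinearOperator((n, n), matvec=apply)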

Domain Decomposition

Problem:

∆u = f in Ω,   u = uΓ on Γ = ∂Ω.

Domain:  Ω = ∪_{i=1}^{s} Ωi

[Figure: a domain Ω split into subdomains Ω1, Ω2, Ω3 with interfaces Γ12, Γ13.]

II Domain decomposition or substructuring methods attempt to solve a PDE problem (e.g.) on the entire domain from problem solutions on the subdomains Ωi.

[Figure: discretization of the domain — a finite difference mesh with nodes numbered 1–40.]

Coefficient Matrix

[Figure: sparsity pattern of the resulting coefficient matrix.]

Types of mappings

[Figure: three partitionings of a small 3 × 4 mesh between Ω1 and Ω2.]

(a) Vertex-based, (b) edge-based, and (c) element-based partitioning

DISTRIBUTED SPARSE MATRICES

Generalization: Distributed Sparse Systems

II Simple illustration: block assignment — assign equation i and unknown i to a given processor.

[Figure: block row assignment of the matrix to the processors.]

Partitioning a sparse matrix

II Use a graph partitioner to partition the adjacency graph:

[Figure: a 3 × 4 mesh partitioned among processors P1, P2, P3, P4.]

II Can allow overlap.

II The partition can be very general.

Given: a general mapping of pairs equation/unknown to processors, i.e., a set of p subsets of variables.

II The subsets can be arbitrary and overlap is allowed. The mapping can be obtained from graph partitioners.

Problem: build the local data structures needed for the iteration phase.

II Several graph partitioners are available: Metis, Chaco, Scotch, ...

Distributed Sparse matrices (continued)

II Once a good partitioning is found, the questions are:

1. How to represent this partitioning?

2. What is a good data structure for representing distributed sparse matrices?

3. How to set up the various "local objects" (matrices, vectors, ..)?

4. What can be done to prepare for the communication that will be required during execution?

Two views of a distributed sparse matrix

[Figure: a subdomain with its internal nodes, local interface nodes, and external interface nodes; the local matrix Ai acts on the local vector xi.]

II Local interface variables are always ordered last.

Local view of the distributed matrix:

[Figure: the local rows split into a square matrix Aloc, acting on the local data (internal points followed by the local interface points), and a rectangular matrix Bext, acting on the external data.]

Distributed Sparse Matrix-Vector Product Kernel

Algorithm:

1. Communicate: exchange boundary data.
   Scatter xbound to the neighbors – gather xext from the neighbors.

2. Local matrix-vector product:
   y = Aloc xloc

3. External matrix-vector product:
   y = y + Bext xext

NOTE: 1 and 2 are independent and can be overlapped.

Distributed Sparse Matrix-Vector Product

Main part of the code:

      call MSG_bdx_send(nloc,x,y,nproc,proc,ix,ipr,ptrn,ierr)
c
c     do local matrix-vector product for local points
c
      call amux(nloc,x,y,aloc,jaloc,ialoc)
c
c     receive the boundary information
c
      call MSG_bdx_receive(nloc,x,y,nproc,proc,ix,ipr,ptrn,ierr)
c
c     do local matrix-vector product for external points
c
      nrow = nloc - nbnd + 1
      call amux1(nrow,x,y(nbnd),aloc,jaloc,ialoc(nloc+1))
c
      return

The local exchange information

II List of adjacent processors (or subdomains).

II For each of these processors, lists of boundary nodes to be sent to / received from the adjacent PEs.

II The receiving processor must have a matrix ordered consistently with the order in which data is received.

Requirements

II The 'set-up' routines should handle overlapping.

II Should use minimal storage (only arrays of size nloc allowed).

Main Operations in (F) GMRES :

1. Saxpy’s – local operation – no communication

2. Dot products – global operation

3. Matrix-vector products – local operation – local communication

4. Preconditioning operations – locality varies.

Distributed Dot Product

/*-------------------- call blas1 function */
tloc = DDOT(n, x, incx, y, incy);

/*-------------------- call global reduction */
MPI_Allreduce(&tloc, &ro, 1, MPI_DOUBLE, MPI_SUM, comm);

PARALLEL PRECONDITIONERS

Three approaches:

• Schwarz Preconditioners

• Schur-complement based Preconditioners

• Multi-level ILU-type Preconditioners

II Observation: often, in practical applications, only Schwarz preconditioners are used.

Domain-Decomposition-Type Preconditioners

Local view of distributed matrix:

[Figure: local view of the distributed matrix — the local diagonal block A_i, together with the coupling blocks X_i that act on external data.]

Block Jacobi Iteration (Additive Schwarz):

1. Obtain external data y_i

2. Compute (update) local residual r_i = (b - Ax)_i = b_i - A_i x_i - B_i y_i

3. Solve A_i δ_i = r_i

4. Update solution x_i = x_i + δ_i
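
In code, one additive Schwarz (block Jacobi) step on a processor looks roughly as follows. This is a hedged sketch: the three helper routines stand for the communication, the local residual update, and the local solve of the steps above; they are not actual library calls.

/* Hedged sketch of one additive-Schwarz (block-Jacobi) step on a processor. */
void schwarz_step(int nloc, const double *bi, double *xi, double *yext,
                  void (*get_external)(double *yext),
                  void (*local_residual)(const double *xi, const double *yext,
                                         const double *bi, double *ri),
                  void (*local_solve)(const double *ri, double *di))
{
    double ri[nloc], di[nloc];

    get_external(yext);               /* 1. obtain external data y_i            */
    local_residual(xi, yext, bi, ri); /* 2. r_i = b_i - A_i x_i - B_i y_i       */
    local_solve(ri, di);              /* 3. solve A_i d_i = r_i (ILU or Krylov) */
    for (int i = 0; i < nloc; i++)    /* 4. x_i <- x_i + d_i                    */
        xi[i] += di[i];
}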


II Multiplicative Schwarz. Need a coloring of the subdomains.

[Figure: a partition of the domain into subdomains colored 1 through 4 so that adjacent subdomains have different colors.]


Multicolor Block SOR Iteration (Multiplicative Schwarz):

1. Do col = 1, . . . , numcols

2. If (col.eq.mycol) Then

3. Obtain external data yi

4. Update local residual r_i = (b - Ax)_i

5. Solve A_i δ_i = r_i

6. Update solution x_i = x_i + δ_i

7. EndIf

8. EndDo
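
The same local step, embedded in the sequential color loop, can be sketched as below (again with hypothetical helpers). Only the processor whose color matches the current color is active, which is the idle-time issue addressed on the next slides.

/* Hedged sketch of the multicolor block-SOR (multiplicative Schwarz) sweep. */
void multicolor_sweep(int numcols, int mycol, int nloc,
                      const double *bi, double *xi, double *yext,
                      void (*get_external)(double *yext),
                      void (*local_residual)(const double *xi, const double *yext,
                                             const double *bi, double *ri),
                      void (*local_solve)(const double *ri, double *di))
{
    double ri[nloc], di[nloc];

    for (int col = 1; col <= numcols; col++) {
        if (col == mycol) {                  /* my turn in the color order      */
            get_external(yext);              /* neighbors of lower colors have
                                                already been updated this sweep */
            local_residual(xi, yext, bi, ri);
            local_solve(ri, di);
            for (int i = 0; i < nloc; i++)
                xi[i] += di[i];
        }
        /* processors of the other colors are idle during this pass */
    }
}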


Breaking the sequential color loop

II The “color” loop is sequential. It can be broken in several different ways.

(1) Have a few subdomains per processor

[Figure: a partition into many small subdomains, colored 1 through 4, with several subdomains assigned to each processor.]


(2) Separate interior nodes from interface nodes (2-level blocking)

[Figure: two-level blocking — the interior nodes form color 1, while the interface nodes are split into colors 2 and 3.]

(3) Use a block-GMRES algorithm with block size = number of colors. The SOR step targets a different color on each column of the block, so there is no idle time.


Local Solves

II Each local system A_i δ_i = r_i can be solved in three ways:

1. By a (sparse) direct solver

2. Using a standard preconditioned Krylov solver

3. Doing a forward/backward solve associated with an accurate ILU (e.g., ILUT) preconditioner

II We only use (2) with a small number of inner steps (up to 10) or

(3).


Performance comparison for different machines

[Figure: Overlapped Schwarz with ILU solver, matrix VENKAT01 — time in seconds versus number of processors (up to 32) on an IBM-COW (FC), an SP2, an SGI-COW (HiPPI), and a T3E.]


[Figure: Jacobi overlap with LU solver, matrix VENKAT01, CRAY T3E — time in seconds versus number of processors, broken down into total, preconditioning, matvec, and fgmres times.]


[Figure: Multicolor SOR preconditioning, matrix VENKAT01 — time in seconds versus number of processors for multicolor SOR with ILU solver, multicolor SOR, multicolor segregated SOR with ILU solver, and multicolor segregated SOR with ILU solver and GMRES.]


SCHUR COMPLEMENT-BASED PRECONDITIONERS


Schur complement system

Local system can be written as

A_i x_i + X_i y_{i,ext} = b_i.    (1)

[Figure: local view of the distributed matrix, as before — local block A_i and couplings X_i to the external data.]

x_i = vector of local unknowns, y_{i,ext} = external interface variables, and b_i = local part of the RHS.


II Local equations:

\begin{pmatrix} B_i & F_i \\ E_i & C_i \end{pmatrix}
\begin{pmatrix} u_i \\ y_i \end{pmatrix} +
\begin{pmatrix} 0 \\ \sum_{j \in N_i} E_{ij} y_j \end{pmatrix} =
\begin{pmatrix} f_i \\ g_i \end{pmatrix}.    (2)

II Eliminate u_i from the above system: the first block row gives u_i = B_i^{-1} (f_i - F_i y_i), and substituting into the second block row yields

S_i y_i + \sum_{j \in N_i} E_{ij} y_j = g_i - E_i B_i^{-1} f_i \equiv g'_i,

where S_i is the “local” Schur complement

S_i = C_i - E_i B_i^{-1} F_i.    (3)


Structure of Schur complement system

II Schur complement system

S y = g'

with

\begin{pmatrix}
S_1    & E_{12} & \cdots & E_{1p} \\
E_{21} & S_2    & \cdots & E_{2p} \\
\vdots &        & \ddots & \vdots \\
E_{p1} & E_{p2} & \cdots & S_p
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{pmatrix} =
\begin{pmatrix} g'_1 \\ g'_2 \\ \vdots \\ g'_p \end{pmatrix}.


Simplest idea: Schur Complement Iterations

[Figure: the local unknown vector split into internal variables u_i and interface variables y_i.]

II Do a global primary iteration (e.g., block-Jacobi)

II Then accelerate only the y variables (with a Krylov method)

Still need to precondition..


Approximate Schur-LU

II Two-level method based on an induced preconditioner. The global system can also be viewed as

\begin{pmatrix} B & F \\ E & C \end{pmatrix}
\begin{pmatrix} u \\ y \end{pmatrix} =
\begin{pmatrix} f \\ g \end{pmatrix},
\qquad \text{with} \qquad
\begin{pmatrix} B & F \\ E & C \end{pmatrix} =
\begin{pmatrix}
B_1 &     &        &     & F_1    \\
    & B_2 &        &     & F_2    \\
    &     & \ddots &     & \vdots \\
    &     &        & B_p & F_p    \\
E_1 & E_2 & \cdots & E_p & C
\end{pmatrix}.

Block LU factorization of A:

\begin{pmatrix} B & F \\ E & C \end{pmatrix} =
\begin{pmatrix} B & 0 \\ E & S \end{pmatrix}
\begin{pmatrix} I & B^{-1} F \\ 0 & I \end{pmatrix}.


Preconditioning:

L = \begin{pmatrix} B & 0 \\ E & M_S \end{pmatrix}
\qquad \text{and} \qquad
U = \begin{pmatrix} I & B^{-1} F \\ 0 & I \end{pmatrix}

with M_S = some approximation to S.

II A preconditioner for the global system can be induced from any preconditioner for the Schur complement.
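
As a worked step (not spelled out on the slide), applying the induced preconditioner M = LU to a vector (f, g) amounts to a forward and a backward block sweep:

L \begin{pmatrix} w \\ v \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}
\;\Longrightarrow\; w = B^{-1} f, \quad v = M_S^{-1} (g - E w),

U \begin{pmatrix} u \\ y \end{pmatrix} = \begin{pmatrix} w \\ v \end{pmatrix}
\;\Longrightarrow\; y = v, \quad u = w - B^{-1} F y.

Each application therefore costs two solves with the block-diagonal (hence purely local) matrix B and one solve with M_S on the interface variables.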

Rewrite the local Schur system as

y_i + S_i^{-1} \sum_{j \in N_i} E_{ij} y_j = S_i^{-1} \left[ g_i - E_i B_i^{-1} f_i \right].

II equivalent to Block-Jacobi preconditioner for Schur complement.

II Solve with a few steps (e.g., 5) of GMRES

II Question: How to solve with S_i?


Two approaches:

(1) Compute an approximation to S_i using approximate-inverse techniques (M. Sosonkina)

(2) Simply use the LU factorization of A_i. Exploit the property:

\text{If } A_i =
\begin{pmatrix} L_{B_i} & 0 \\ E_i U_{B_i}^{-1} & L_{S_i} \end{pmatrix}
\begin{pmatrix} U_{B_i} & L_{B_i}^{-1} F_i \\ 0 & U_{S_i} \end{pmatrix}
\quad \text{then} \quad L_{S_i} U_{S_i} = S_i.
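
A quick verification of this property (added here as a worked step): multiplying out the two factors gives

\begin{pmatrix} L_{B_i} & 0 \\ E_i U_{B_i}^{-1} & L_{S_i} \end{pmatrix}
\begin{pmatrix} U_{B_i} & L_{B_i}^{-1} F_i \\ 0 & U_{S_i} \end{pmatrix} =
\begin{pmatrix} B_i & F_i \\ E_i & E_i B_i^{-1} F_i + L_{S_i} U_{S_i} \end{pmatrix},

and matching the (2,2) block with C_i gives L_{S_i} U_{S_i} = C_i - E_i B_i^{-1} F_i = S_i. In the approximate (ILU) case, the same identification yields approximate L/U factors of S_i from the trailing part of the local factorization.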


Name      Precon    lfil |   4     8    16    24    36    40
-------------------------------------------------------------
raefsky1  SAPINV     10  |  14    13    10    11     8     8
                     20  |  12    11     9     9     8     8
          SAPINVS    10  |  16    13    10    11     8     8
                     20  |  13    11     9     9     8     8
          SLU        10  | 215   197   198   194   166   171
                     20  |  48    50    40    42    41    41
          BJ         10  |  85   171   173   273   252   263
                     20  |  82   170   173   271   259   259

Number of FGMRES(20) iterations for the RAEFSKY1 problem.


Name     Precon    lfil |  16    24    32    40    56    64    80    96
------------------------------------------------------------------------
af23560  SAPINV     20  |  32    36    27    29    73    35    71    61
                    30  |  32    35    23    29    46    60    33    52
         SAPINVS    20  |  32    35    24    29    55    35    37    59
                    30  |  32    34    23    28    43    45    23    35
         SLU        20  |  81   105    94    88    90    76    85    71
                    30  |  38    34    37    39    38    39    38    35
         BJ         20  |  37   153    53    60    77    80    95     *
                    30  |  36    41    53    57    81    87    97   115

Number of FGMRES(20) iterations for the AF23560 problem.


[Figure: two plots for the 360 × 360 mesh — CPU time (T3E seconds) and iteration counts versus number of processors; solid line: BJ, dash-dot line: SAPINV, dash-star line: SLU.]

Times and iteration counts for solving a 360×360 discretized Laplacean problem with 3 different preconditioners using flexible GMRES(10).


II Solution times for a Laplacean problem with various local subproblem sizes using FGMRES(10) with 3 different preconditioners (BJ, SAPINV, SLU) and the Schur complement iteration (SI).

[Figure: T3E seconds versus number of processors for a 50 × 50 mesh in each PE (solid: BJ, dash-dot: SAPINV, dash-star: SLU, dash-circle: SI) and for a 70 × 70 mesh.]


PARALLEL ARMS


Parallel implementation of ARMS

Three types of points: interior points (independent sets), local interfaces, and inter-domain (global) interfaces.

[Figure: a partitioned domain showing interior points, local interface points, and inter-domain interface points.]

Main ideas: (1) exploit recursion; (2) distinguish two phases: elimination of the interior points first, then of the interface points.


Result: a 2-part Schur complement, one part corresponding to the local interfaces and the other to the inter-domain interfaces.

[Figure: structure of the reduced matrix, with blocks labeled IS, I1, I2, and Bext.]


Three approaches

Method 1: Simple additive Schwarz using ILUT or ARMS locally

Method 2: Schur complement approach. Solve Schur complement

system (both I1 and I2) with either a block Jacobi (M. Sosonkina and

YS, ’99) or multicolor ILU(0).

Method 3: Do independent set reduction across subdomains. Requires construction of global group independent sets.

II Current status: Methods 1 and 2.


Construction of global group independent sets: a two-level strategy

1. Color subdomains

2. Find group independent sets locally

3. Color groups consistently

[Figure: four subdomains (Proc 1 to Proc 4) with their local group independent sets colored consistently across the subdomains.]


[Figure: the interface of a subdomain, with internal interface points and external interface points belonging to neighbors of colors 1 through 4.]

Algorithm: Multicolor Distributed ILU(0)

1. Eliminate local rows
2. Receive external interface rows from PEs s.t. color(PE) < MyColor
3. Process local interface rows
4. Send local interface rows to PEs s.t. color(PE) > MyColor
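
A schematic C rendering of these four steps follows. It is a hedged sketch: the factor_* routines and the packed row buffers are hypothetical placeholders, not the pARMS internals.

#include <mpi.h>

/* Hypothetical local kernels standing for steps 1 and 3. */
void factor_local_rows(void);
void factor_interface_rows(void);

/* nbr[i]/nbrcolor[i]: rank and color of the i-th neighbor; rbuf/sbuf with
   roff/soff and rlen/slen describe the packed interface rows exchanged.   */
void multicolor_ilu0(int mycolor, int nnbr,
                     const int *nbr, const int *nbrcolor,
                     double *rbuf, const int *roff, const int *rlen,
                     double *sbuf, const int *soff, const int *slen,
                     MPI_Comm comm)
{
    factor_local_rows();                      /* 1. eliminate local rows      */

    for (int i = 0; i < nnbr; i++)            /* 2. rows from lower colors    */
        if (nbrcolor[i] < mycolor)
            MPI_Recv(rbuf + roff[i], rlen[i], MPI_DOUBLE,
                     nbr[i], 0, comm, MPI_STATUS_IGNORE);

    factor_interface_rows();                  /* 3. local interface rows      */

    for (int i = 0; i < nnbr; i++)            /* 4. rows to higher colors     */
        if (nbrcolor[i] > mycolor)
            MPI_Send(sbuf + soff[i], slen[i], MPI_DOUBLE,
                     nbr[i], 0, comm);
}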


Methods implemented in pARMS:

add_x     Additive Schwarz procedure, with method x used for the subdomains (x = ILUT, ILUK, or ARMS); with or without overlap.

sch_x     Schur complement technique, with method x = factorization used for the local submatrix (ILUT, ILUK, or ARMS). Equivalent to an additive Schwarz preconditioner on the Schur complement.

sch_sgs   Multicolor multiplicative Schwarz (block Gauss-Seidel) preconditioning is used instead of additive Schwarz for the Schur complement.

sch_gilu0 ILU(0) preconditioning is used for solving the global Schur complement system obtained from the ARMS reduction.


Test problem

1. Scalability experiment: sample finite difference problem

   -∆u + γ ( e^{xy} ∂u/∂x + e^{-xy} ∂u/∂y ) + α u = f,

   Dirichlet boundary conditions; γ = 100, α = -10; centered-difference discretization (a stencil sketch is given after this list).

   II Keep the size constant on each processor [100 × 100] II Global linear system with 10,000 × nproc unknowns.

2. Comparison with a parallel direct solver – symmetric problems

3. Large irregular matrix example arising from magnetohydrodynamics.
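
As an illustration of the centered-difference discretization used in item 1, here is a small C helper computing the 5-point stencil coefficients at a grid point. It is a sketch under the usual sign and scaling conventions (h is the mesh size); it is not taken from the course codes.

#include <math.h>

/* 5-point centered-difference stencil for
   -Laplacian(u) + gamma*(e^{xy} u_x + e^{-xy} u_y) + alpha*u
   at the grid point (x, y) with mesh size h (illustrative sketch). */
void stencil(double x, double y, double h, double gamma, double alpha,
             double *center, double *east, double *west,
             double *north, double *south)
{
    double d  = 1.0 / (h * h);                    /* second-difference scale */
    double cx = gamma * exp( x * y) / (2.0 * h);  /* centered u_x term       */
    double cy = gamma * exp(-x * y) / (2.0 * h);  /* centered u_y term       */

    *center = 4.0 * d + alpha;
    *east   = -d + cx;     /* coefficient of u(i+1, j) */
    *west   = -d - cx;     /* coefficient of u(i-1, j) */
    *north  = -d + cy;     /* coefficient of u(i, j+1) */
    *south  = -d - cy;     /* coefficient of u(i, j-1) */
}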


[Figure: wall-clock time in seconds on an Origin 3800 versus number of processors, 100 × 100 mesh per processor; curves for add_arms and add_ilut, each with/without overlap and with/without inner iterations.]

Solution times for the 2D PDE problem with fixed subproblem size.


[Figure: iteration counts versus number of processors for the same runs (add_arms and add_ilut, with/without overlap and inner iterations).]

Iterations for the 2D PDE problem with fixed subproblem size.


[Figure: wall-clock time in seconds on an Origin 3800 versus number of processors, 100 × 100 mesh per processor; curves for add_arms (no inner iterations, with/without overlap), sch_arms, sch_gilu0 (with/without inner iterations), and sch_sgs (with/without inner iterations).]

Solution times for a 2D PDE problem with the fixed subproblem size using different preconditioners.


[Figure: iteration counts versus number of processors for the same preconditioners.]

Iterations for a 2D PDE problem with the fixed subproblem size using different preconditioners.
