
Post on 15-Jul-2020


A new algorithm for computing quadrature-based bounds in conjugate gradients

Petr Tichý

Czech Academy of Sciences
Charles University in Prague

joint work with

Gérard Meurant and Zdeněk Strakoš

June 08–13, 2014
Householder Symposium XIX, Spa, Belgium

Problem formulation

Consider a system

Ax = b

where $A \in \mathbb{R}^{n \times n}$ is symmetric, positive definite.

Without loss of generality, $\|b\| = 1$ and $x_0 = 0$.

The conjugate gradient method

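input A, b
r_0 = b, p_0 = r_0
for k = 1, 2, ... do
    γ_{k−1} = (r_{k−1}^T r_{k−1}) / (p_{k−1}^T A p_{k−1})
    x_k = x_{k−1} + γ_{k−1} p_{k−1}
    r_k = r_{k−1} − γ_{k−1} A p_{k−1}
    δ_k = (r_k^T r_k) / (r_{k−1}^T r_{k−1})
    p_k = r_k + δ_k p_{k−1}
    test quality of x_k
end for

The CG coefficients define the matrices

\[
D_k = \begin{pmatrix} \gamma_0^{-1} & & \\ & \ddots & \\ & & \gamma_{k-1}^{-1} \end{pmatrix},
\qquad
L_k = \begin{pmatrix} 1 & & & \\ \sqrt{\delta_1} & \ddots & & \\ & \ddots & \ddots & \\ & & \sqrt{\delta_{k-1}} & 1 \end{pmatrix}.
\]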
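The CG recurrences above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the names `dot`, `cg`, `matvec` and the small diagonal test matrix are assumptions made for the example. The function also records the coefficients γ_{k−1} and δ_k, which define D_k and L_k.

```python
import math

def dot(u, v):
    """Euclidean inner product of two lists."""
    return sum(a * c for a, c in zip(u, v))

def cg(matvec, b, num_iters):
    """Run num_iters CG steps from x0 = 0.

    Returns x_k together with gammas = [gamma_0, ..., gamma_{k-1}]
    and deltas = [delta_1, ..., delta_k].
    """
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rr = dot(r, r)
    gammas, deltas = [], []
    for _ in range(num_iters):
        Ap = matvec(p)
        gamma = rr / dot(p, Ap)
        x = [xi + gamma * pi for xi, pi in zip(x, p)]
        r = [ri - gamma * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        delta = rr_new / rr
        p = [ri + delta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        gammas.append(gamma)
        deltas.append(delta)
    return x, gammas, deltas

# Hypothetical test problem: A = diag(1, ..., 5), ||b|| = 1.
eigs = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0 / math.sqrt(5)] * 5

def matvec(v):
    return [lam * vi for lam, vi in zip(eigs, v)]

x, gammas, deltas = cg(matvec, b, 5)
```

With five distinct eigenvalues, five steps should reproduce the solution b_i/λ_i up to rounding.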

How to measure quality of an approximation in CG?
A practically relevant question

Using residual information:
– normwise backward error,
– relative residual norm.
“Using of the residual vector r_k as a measure of the “goodness” of the estimate x_k is not reliable” [Hestenes & Stiefel 1952]

Using error estimates:
– the A-norm of the error,
– the Euclidean norm of the error.
“The function (x − x_k, A(x − x_k)) can be used as a measure of the “goodness” of x_k as an estimate of x.” [Hestenes & Stiefel 1952]

The A-norm of the error plays an important role in stopping criteria [Deuflhard 1994], [Arioli 2004], [Jiránek, Strakoš, Vohralík 2006].

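The Lanczos algorithm
Let A be symmetric, compute an orthonormal basis of $\mathcal{K}_k(A, b)$

input A, b
v_1 = b/‖b‖, δ_1 = 0
β_0 = 0, v_0 = 0
for k = 1, 2, ... do
    α_k = v_k^T A v_k
    w = A v_k − α_k v_k − β_{k−1} v_{k−1}
    β_k = ‖w‖
    v_{k+1} = w/β_k
end for

\[
T_k = \begin{pmatrix}
\alpha_1 & \beta_1 & & \\
\beta_1 & \ddots & \ddots & \\
 & \ddots & \ddots & \beta_{k-1} \\
 & & \beta_{k-1} & \alpha_k
\end{pmatrix}
\]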
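The three-term recurrence can be tried out with the following plain-Python sketch. The names `lanczos`, `matvec` and the diagonal test matrix are hypothetical and chosen only for illustration; no reorthogonalization is done, so the basis is orthonormal only up to rounding.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def lanczos(matvec, b, num_steps):
    """Return (alphas, betas, basis) after num_steps Lanczos steps.

    alphas = [alpha_1, ..., alpha_k], betas = [beta_1, ..., beta_k],
    basis  = [v_1, ..., v_{k+1}].
    """
    norm_b = math.sqrt(dot(b, b))
    v = [bi / norm_b for bi in b]
    v_prev = [0.0] * len(b)
    beta_prev = 0.0
    alphas, betas, basis = [], [], [v]
    for _ in range(num_steps):
        Av = matvec(v)
        alpha = dot(v, Av)
        w = [avi - alpha * vi - beta_prev * vpi
             for avi, vi, vpi in zip(Av, v, v_prev)]
        beta = math.sqrt(dot(w, w))
        alphas.append(alpha)
        betas.append(beta)
        v_prev, v, beta_prev = v, [wi / beta for wi in w], beta
        basis.append(v)
    return alphas, betas, basis

# Hypothetical test problem: A = diag(1, ..., 6), b = (1, ..., 1).
eigs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

def matvec(v):
    return [lam * vi for lam, vi in zip(eigs, v)]

alphas, betas, basis = lanczos(matvec, [1.0] * 6, 3)
```

The sketch assumes num_steps is smaller than the dimension of the full Krylov space, so that β_k stays nonzero.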

CG versus Lanczos
Let A be symmetric, positive definite

Lanczos ($T_k$) ←→ CG ($D_k$, $L_k$)

Both algorithms generate an orthogonal basis of $\mathcal{K}_k(A, b)$.

Lanczos uses a three-term recurrence → $T_k$.

CG uses a coupled two-term recurrence → $D_k$, $L_k$.

\[ T_k = L_k D_k L_k^T . \]
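Multiplying out $L_k D_k L_k^T$ entrywise gives α_1 = 1/γ_0, α_j = 1/γ_{j−1} + δ_{j−1}/γ_{j−2} and β_j = √δ_j/γ_{j−1}. The sketch below checks this numerically against a direct Lanczos run. All function names and the test matrix are assumptions made for the example.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def cg_coefficients(matvec, b, k):
    """gammas = [gamma_0..gamma_{k-1}], deltas = [delta_1..delta_k] (x0 = 0)."""
    r, p = list(b), list(b)
    rr = dot(r, r)
    gammas, deltas = [], []
    for _ in range(k):
        Ap = matvec(p)
        gamma = rr / dot(p, Ap)
        r = [ri - gamma * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        delta = rr_new / rr
        p = [ri + delta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        gammas.append(gamma)
        deltas.append(delta)
    return gammas, deltas

def lanczos_coefficients(matvec, b, k):
    """alphas = [alpha_1..alpha_k], betas = [beta_1..beta_k]."""
    norm_b = math.sqrt(dot(b, b))
    v, v_prev, beta_prev = [bi / norm_b for bi in b], [0.0] * len(b), 0.0
    alphas, betas = [], []
    for _ in range(k):
        Av = matvec(v)
        alpha = dot(v, Av)
        w = [a - alpha * vi - beta_prev * vp for a, vi, vp in zip(Av, v, v_prev)]
        beta = math.sqrt(dot(w, w))
        alphas.append(alpha)
        betas.append(beta)
        v_prev, v, beta_prev = v, [wi / beta for wi in w], beta
    return alphas, betas

def tridiag_from_cg(gammas, deltas):
    """Entries of T_k = L_k D_k L_k^T computed from the CG coefficients."""
    alphas = [1.0 / gammas[0]]
    for j in range(1, len(gammas)):
        alphas.append(1.0 / gammas[j] + deltas[j - 1] / gammas[j - 1])
    betas = [math.sqrt(deltas[j]) / gammas[j] for j in range(len(gammas) - 1)]
    return alphas, betas

# Hypothetical test problem: a small diagonal SPD matrix.
eigs = [1.0, 2.0, 3.5, 5.0, 7.0, 8.0, 9.5, 12.0]

def matvec(v):
    return [lam * vi for lam, vi in zip(eigs, v)]

b = [1.0] * len(eigs)
gammas, deltas = cg_coefficients(matvec, b, 4)
alphas_cg, betas_cg = tridiag_from_cg(gammas, deltas)
alphas_lz, betas_lz = lanczos_coefficients(matvec, b, 4)
```

For a few steps on a well-conditioned problem, the two sets of coefficients should agree to roughly machine precision.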

CG, Lanczos and Gauss quadrature

CG, Lanczos

Gauss quadrature, orthogonal polynomials, Jacobi matrices

At any iteration step k, CG (implicitly) determines the weights and nodes of the k-point Gauss quadrature

\[
\int_{\zeta}^{\xi} f(\lambda)\, d\omega(\lambda) = \sum_{i=1}^{k} \omega_i f(\theta_i) + R_k[f] .
\]

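Gauss quadrature for $f(\lambda) \equiv \lambda^{-1}$

Gauss quadrature:
\[
\int_{\zeta}^{\xi} \lambda^{-1}\, d\omega(\lambda) = \sum_{i=1}^{k} \frac{\omega_i}{\theta_i} + R_k[\lambda^{-1}] .
\]

Lanczos:
\[
\left(T_n^{-1}\right)_{1,1} = \left(T_k^{-1}\right)_{1,1} + R_k[\lambda^{-1}] .
\]

CG:
\[
\|x\|_A^2 = \underbrace{\sum_{j=0}^{k-1} \gamma_j \|r_j\|^2}_{\tau_k} + \|x - x_k\|_A^2 .
\]

Important: $R_k[\lambda^{-1}] > 0$.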
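The CG form of the identity is easy to check numerically. The sketch below is only an illustration under stated assumptions: A is a hypothetical diagonal matrix, so the exact solution b_i/λ_i and all A-norms are available in closed form, and the name `cg_tau` is invented for the example.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def cg_tau(eigs, b, k):
    """k CG steps for A = diag(eigs), x0 = 0; returns x_k and tau_k."""
    x = [0.0] * len(b)
    r, p = list(b), list(b)
    rr = dot(r, r)
    tau = 0.0
    for _ in range(k):
        Ap = [lam * pi for lam, pi in zip(eigs, p)]
        gamma = rr / dot(p, Ap)
        tau += gamma * rr                 # adds gamma_j * ||r_j||^2
        x = [xi + gamma * pi for xi, pi in zip(x, p)]
        r = [ri - gamma * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, tau

# Hypothetical test problem with ||b|| = 1.
n = 10
eigs = [0.5 + i for i in range(n)]
b = [1.0 / math.sqrt(n)] * n

x_k, tau_k = cg_tau(eigs, b, 4)
# ||x||_A^2 and ||x - x_k||_A^2 for the exact solution x_i = b_i / lambda_i.
x_norm_sq = sum(bi * bi / lam for bi, lam in zip(b, eigs))
err_sq = sum(lam * (bi / lam - xi) ** 2 for lam, bi, xi in zip(eigs, b, x_k))
print(x_norm_sq, tau_k + err_sq)
```

The two printed numbers should coincide up to rounding, and τ_k alone underestimates ‖x‖²_A because the remainder is positive.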

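Gauss-Radau quadrature for $f(\lambda) = \lambda^{-1}$

µ is prescribed:
\[
\int_{\zeta}^{\xi} f(\lambda)\, d\omega(\lambda) =
\underbrace{\sum_{i=1}^{k} \tilde{\omega}_i f(\tilde{\theta}_i) + \tilde{\omega}_{k+1} f(\mu)}_{\left(\tilde{T}_{k+1}^{-1}\right)_{1,1} \,\equiv\, \tilde{\tau}_{k+1}}
+ R_k[f] ,
\]
where
\[
\tilde{T}_{k+1} = \begin{pmatrix}
\alpha_1 & \beta_1 & & & \\
\beta_1 & \ddots & \ddots & & \\
 & \ddots & \ddots & \beta_{k-1} & \\
 & & \beta_{k-1} & \alpha_k & \beta_k \\
 & & & \beta_k & \tilde{\alpha}_{k+1}
\end{pmatrix}
\]
and µ is an eigenvalue of $\tilde{T}_{k+1}$.

Important: if $0 < \mu \le \lambda_{\min}$, then $R_k[\lambda^{-1}] < 0$.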
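One way to obtain the modified entry is the recurrence α̃_{j+1} = µ + β_j²/(α_j − α̃_j) with α̃_1 = µ, which also appears in the CGQL algorithm of this talk. The sketch below applies it to a small hypothetical Jacobi matrix and checks that det(T̃_{k+1} − µI) vanishes, i.e. that µ is an eigenvalue. The function names and the numbers are assumptions made for illustration, and µ is assumed to lie below the smallest eigenvalue of T_k so that no denominator vanishes.

```python
def radau_alpha(alphas, betas, mu):
    """Return alpha~_{k+1} for T_k with diagonal alphas = [alpha_1..alpha_k].

    betas = [beta_1..beta_k]; the last entry couples row k to row k+1.
    """
    alpha_mu = mu                          # alpha~_1 = mu
    for alpha, beta in zip(alphas, betas):
        alpha_mu = mu + beta ** 2 / (alpha - alpha_mu)
    return alpha_mu

def shifted_determinant(diag, offdiag, mu):
    """det(T - mu*I) for a symmetric tridiagonal T (three-term recurrence)."""
    prev, cur = 1.0, diag[0] - mu
    for a, beta in zip(diag[1:], offdiag):
        prev, cur = cur, (a - mu) * cur - beta ** 2 * prev
    return cur

# Hypothetical data: T_3 = tridiag(1, 2, 1), whose smallest eigenvalue is about 0.586.
alphas = [2.0, 2.0, 2.0]
betas = [1.0, 1.0, 1.0]
mu = 0.1

alpha_new = radau_alpha(alphas, betas, mu)
det = shifted_determinant(alphas + [alpha_new], betas, mu)
print(alpha_new, det)
```

The printed determinant should be zero up to rounding.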

Idea of estimating the A-norm of the error
[Golub & Strakoš 1994], [Golub & Meurant 1994, 1997]

Consider two quadrature rules at steps k and k + d, d > 0,
\[
\|x\|_A^2 = \tau_k + \|x - x_k\|_A^2 ,
\qquad
\|x\|_A^2 = \hat{\tau}_{k+d} + \hat{R}_{k+d} .
\]
Then
\[
\|x - x_k\|_A^2 = \hat{\tau}_{k+d} - \tau_k + \hat{R}_{k+d} .
\]

Gauss quadrature: $\hat{R}_{k+d} > 0$ → lower bound,
Radau quadrature: $\hat{R}_{k+d} < 0$ → upper bound.

How to compute $\hat{\tau}_{k+d} - \tau_k$ efficiently?

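How to compute efficiently $\hat{\tau}_{k+d} - \tau_k$?

\[
\|x - x_k\|_A^2 = \hat{\tau}_{k+d} - \tau_k + \hat{R}_{k+d} .
\]

For numerical reasons, it is not convenient to compute $\tau_k$ and $\hat{\tau}_{k+d}$ explicitly. Instead, write the difference as a telescoping sum,
\[
\hat{\tau}_{k+d} - \tau_k
= \sum_{j=k}^{k+d-2} (\tau_{j+1} - \tau_j) + (\hat{\tau}_{k+d} - \tau_{k+d-1})
\equiv \sum_{j=k}^{k+d-2} \Delta_j + \hat{\Delta}_{k+d-1} ,
\]
and update the $\Delta_j$'s without subtractions. Recall $\tau_j = \left(T_j^{-1}\right)_{1,1}$.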
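For the Gauss rule the last term is itself a plain increment, so the estimate of ‖x − x_k‖²_A is the sum of ∆_k, ..., ∆_{k+d−1} with ∆_j = γ_j‖r_j‖², available after k + d iterations. The following sketch illustrates the effect of the delay d. The diagonal test matrix and the names `cg_history` and `lower_bound_sq` are assumptions made for the example.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def cg_history(eigs, b, num_iters):
    """CG for A = diag(eigs), x0 = 0.

    Returns deltas[j] = gamma_j * ||r_j||^2 and err_sq[j] = ||x - x_j||_A^2.
    """
    exact = [bi / lam for bi, lam in zip(b, eigs)]

    def a_err_sq(x):
        return sum(lam * (e - xi) ** 2 for lam, e, xi in zip(eigs, exact, x))

    x = [0.0] * len(b)
    r, p = list(b), list(b)
    rr = dot(r, r)
    deltas, err_sq = [], [a_err_sq(x)]
    for _ in range(num_iters):
        Ap = [lam * pi for lam, pi in zip(eigs, p)]
        gamma = rr / dot(p, Ap)
        deltas.append(gamma * rr)
        x = [xi + gamma * pi for xi, pi in zip(x, p)]
        r = [ri - gamma * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
        err_sq.append(a_err_sq(x))
    return deltas, err_sq

def lower_bound_sq(deltas, k, d):
    """Gauss lower bound on ||x - x_k||_A^2 using d further iterations."""
    return sum(deltas[k:k + d])

# Hypothetical test problem: A = diag(1, ..., 20), ||b|| = 1.
eigs = [1.0 + i for i in range(20)]
b = [1.0 / math.sqrt(20)] * 20
deltas, err_sq = cg_history(eigs, b, 10)

k = 3
bound_d1 = lower_bound_sq(deltas, k, 1)
bound_d4 = lower_bound_sq(deltas, k, 4)
print(math.sqrt(bound_d1), math.sqrt(bound_d4), math.sqrt(err_sq[k]))
```

A larger d tightens the lower bound at the price of d extra iterations before the estimate for step k becomes available.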

Golub and Meurant approach

[Golub & Meurant 1994, 1997]: Use tridiagonal matrices

CG → $T_k$ → $T_k - \mu I$ → $\tilde{T}_k$

and compute the ∆'s using updating strategies; there is no need to store the tridiagonal matrices.

Use the formulas
\[
\|x - x_k\|_A^2 = \sum_{j=k}^{k+d-1} \Delta_j + \|x - x_{k+d}\|_A^2 ,
\]
\[
\|x - x_k\|_A^2 = \sum_{j=k}^{k+d-2} \Delta_j + \Delta_{k+d-1}^{(\mu)} + R_{k+d}^{(R)} .
\]

CGQL (Conjugate Gradients and Quadrature via Lanczos)

input A, b, x_0, µ
r_0 = b − A x_0, p_0 = r_0
δ_0 = 0, γ_{−1} = 1, c_1 = 1, β_0 = 0, d_0 = 1, α̃_1^{(µ)} = µ
for k = 1, ..., until convergence do
    CG-iteration(k)
    α_k = 1/γ_{k−1} + δ_{k−1}/γ_{k−2},    β_k^2 = δ_k/γ_{k−1}^2
    d_k = α_k − β_{k−1}^2/d_{k−1},    ∆_{k−1} = ‖r_0‖^2 c_k^2/d_k
    α̃_{k+1}^{(µ)} = µ + β_k^2/(α_k − α̃_k^{(µ)})
    ∆_k^{(µ)} = ‖r_0‖^2 β_k^2 c_k^2 / ( d_k (α̃_{k+1}^{(µ)} d_k − β_k^2) ),    c_{k+1}^2 = β_k^2 c_k^2/d_k^2
    Estimates(k, d)
end for

Our approach

[Meurant & Tichý 2013]: Update the $LDL^T$ decompositions of $T_k$ and $\tilde{T}_k$

CG → $L_k D_k L_k^T$ → $\tilde{L}_k \tilde{D}_k \tilde{L}_k^T$

We use tridiagonal matrices only implicitly.

We get very simple formulas for updating $\Delta_{k-1}$ and $\Delta_k^{(\mu)}$.

In [Meurant & Tichý 2013], this idea is also used for other types of quadrature (Gauss-Lobatto, anti-Gauss).

CGQ (Conjugate Gradients and Quadrature)

[Meurant & Tichý 2013]

input A, b, x_0, µ
r_0 = b − A x_0, p_0 = r_0
∆_0^{(µ)} = ‖r_0‖^2/µ
for k = 1, ..., until convergence do
    CG-iteration(k)
    ∆_{k−1} = γ_{k−1} ‖r_{k−1}‖^2
    ∆_k^{(µ)} = ‖r_k‖^2 (∆_{k−1}^{(µ)} − ∆_{k−1}) / ( µ (∆_{k−1}^{(µ)} − ∆_{k−1}) + ‖r_k‖^2 )
    Estimates(k, d)
end for
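The CGQ updates translate almost line by line into code. The following is a sketch under stated assumptions rather than the reference implementation: x_0 = 0, d = 1, a hypothetical diagonal test matrix, and µ set to half the smallest eigenvalue. The names `cgq` and `a_norm_error_sq` are invented for the example.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def cgq(matvec, b, mu, num_iters):
    """CG from x0 = 0 with the CGQ updates; returns (x, delta, delta_mu).

    delta[j]    = Delta_j       (Gauss),       lower bound on ||x - x_j||_A^2
    delta_mu[j] = Delta_j^(mu)  (Gauss-Radau), upper bound on ||x - x_j||_A^2
    """
    x = [0.0] * len(b)
    r, p = list(b), list(b)
    rr = dot(r, r)
    delta, delta_mu = [], [rr / mu]        # Delta_0^(mu) = ||r_0||^2 / mu
    for _ in range(num_iters):
        # CG iteration
        Ap = matvec(p)
        gamma = rr / dot(p, Ap)
        x = [xi + gamma * pi for xi, pi in zip(x, p)]
        r = [ri - gamma * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        # Quadrature updates (scalar operations only)
        gauss = gamma * rr
        diff = delta_mu[-1] - gauss
        delta.append(gauss)
        delta_mu.append(rr_new * diff / (mu * diff + rr_new))
        rr = rr_new
    return x, delta, delta_mu

# Hypothetical test problem: A = diag(1, ..., 20), ||b|| = 1.
eigs = [1.0 + i for i in range(20)]
b = [1.0 / math.sqrt(20)] * 20

def matvec(v):
    return [lam * vi for lam, vi in zip(eigs, v)]

def a_norm_error_sq(x_k):
    """||x - x_k||_A^2 for the exact solution x_i = b_i / lambda_i."""
    return sum(lam * (bi / lam - xi) ** 2 for lam, bi, xi in zip(eigs, b, x_k))

mu = 0.5 * min(eigs)
num_iters = 8
x, delta, delta_mu = cgq(matvec, b, mu, num_iters)
```

On this well-conditioned example the true squared error at step j should lie between `delta[j]` and `delta_mu[j]`. Note that `delta[j]` only becomes available one iteration after `x_j`.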

Conclusions (theoretical part)

Simple formulas for computing bounds on ‖x − x_k‖_A.

Almost for free: only a few scalar operations per CG iteration.

The bounds also work well with preconditioning.

Behaviour in finite precision arithmetic?

CG in finite precision arithmetic

Orthogonality is lost, convergence is delayed!

[Figure: convergence curves over iterations 0 to 120, logarithmic vertical axis from 10^{−16} to 10^{0}. Curves: (E) ‖x − x_j‖_A; (FP) ‖x − x_j‖_A; (FP) ‖I − V_j^T V_j‖_F.]

Identities need not hold in finite precision arithmetic!

Bounds in finite precision arithmetic

Observation: CGQL and CGQ give the same results (up to a small inaccuracy).

Do the bounds correspond to ‖x − x_k‖_A?

Gauss quadrature lower bound → yes [Strakoš & Tichý 2002].

What about the Gauss-Radau upper bound?

\[
\|x - x_k\|_A^2 = \Delta_k^{(\mu)} + R_{k+1}^{(R)} ,
\qquad
\|x - x_k\|_A \le \sqrt{\Delta_k^{(\mu)}} .
\]
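The observation that CGQL and CGQ agree can be reproduced on a small example by evaluating both sets of updates inside one CG loop. This is a sketch under stated assumptions (x_0 = 0, a hypothetical well-conditioned diagonal matrix, µ equal to half the smallest eigenvalue); the function name `cgql_and_cgq` is invented, and the iterate x_k is not formed because only the scalar quantities are compared.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def cgql_and_cgq(matvec, b, mu, num_iters):
    """Return two lists of pairs (Delta_{k-1}, Delta_k^(mu)), k = 1..num_iters:
    the first computed by the CGQL formulas, the second by the CGQ formulas."""
    r, p = list(b), list(b)
    rr = dot(r, r)
    r0_sq = rr
    # CGQL state: delta_0, gamma_{-1}, c_1^2, beta_0^2, d_0, alpha~_1
    delta_prev, gamma_prev = 0.0, 1.0
    c_sq, beta_sq_prev, d_prev, alpha_mu = 1.0, 0.0, 1.0, mu
    # CGQ state: Delta_0^(mu)
    radau_q = r0_sq / mu
    cgql, cgq = [], []
    for _ in range(num_iters):
        # CG iteration (residual and direction only)
        Ap = matvec(p)
        gamma = rr / dot(p, Ap)
        r = [ri - gamma * ai for ri, ai in zip(r, Ap)]
        rr_new = dot(r, r)
        delta = rr_new / rr
        p = [ri + delta * pi for ri, pi in zip(r, p)]
        # CGQL: go through the Lanczos coefficients
        alpha = 1.0 / gamma + delta_prev / gamma_prev
        beta_sq = delta / gamma ** 2
        d = alpha - beta_sq_prev / d_prev
        gauss_l = r0_sq * c_sq / d
        alpha_mu = mu + beta_sq / (alpha - alpha_mu)
        radau_l = r0_sq * beta_sq * c_sq / (d * (alpha_mu * d - beta_sq))
        c_sq = beta_sq * c_sq / d ** 2
        cgql.append((gauss_l, radau_l))
        # CGQ: use the CG quantities directly
        gauss_q = gamma * rr
        diff = radau_q - gauss_q
        radau_q = rr_new * diff / (mu * diff + rr_new)
        cgq.append((gauss_q, radau_q))
        # Shift indices for the next iteration
        delta_prev, gamma_prev = delta, gamma
        beta_sq_prev, d_prev, rr = beta_sq, d, rr_new
    return cgql, cgq

# Hypothetical test problem: A = diag(1, ..., 20), ||b|| = 1.
eigs = [1.0 + i for i in range(20)]
b = [1.0 / math.sqrt(20)] * 20

def matvec(v):
    return [lam * vi for lam, vi in zip(eigs, v)]

cgql, cgq = cgql_and_cgq(matvec, b, 0.5 * min(eigs), 6)
```

On harder problems, or with µ close to λ_1, the two variants may differ more visibly. The sketch only covers the benign case.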

Gauss-Radau upper bound, exact arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1

[Figure: CGQ − strakos48. Iterations 0 to 50, logarithmic vertical axis from 10^{−3} to 10^{1}. Curves: ‖x − x_k‖_A; upper bound, µ = 0.5 λ_1; upper bound, µ = λ_1.]
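The slides give only the parameters of the test matrix. In the literature, the Strakoš matrix is usually defined as a diagonal matrix with eigenvalues λ_i = λ_1 + ((i − 1)/(n − 1)) (λ_n − λ_1) ρ^{n−i}, i = 1, ..., n. Assuming that definition is the one used here, the spectrum can be generated as follows; the function name is hypothetical.

```python
def strakos_eigenvalues(n, lam_1, lam_n, rho):
    """Eigenvalues of the diagonal Strakos test matrix (assumed standard form)."""
    return [lam_1 + (i - 1) / (n - 1) * (lam_n - lam_1) * rho ** (n - i)
            for i in range(1, n + 1)]

# Parameters quoted on the slide.
eigs = strakos_eigenvalues(48, 0.1, 1000.0, 0.9)
print(eigs[0], eigs[1], eigs[-1])
```

For ρ < 1 the eigenvalues accumulate near λ_1, and only a few are of the order of λ_n.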

Gauss-Radau upper bound, finite precision arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1

[Figure: CGQ − strakos48. Iterations 0 to 120, logarithmic vertical axis from 10^{−16} to 10^{2}. Curves: ‖x − x_k‖_A; upper bound, µ = 0.5 λ_1; upper bound, µ = λ_1.]

Gauss-Radau upper bound, finite precision arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1

µ = α λ_1, α ∈ (0, 1], α ≈ 1 (red), α ≈ 0 (magenta)

[Figure: CGQ − strakos48. Iterations 80 to 120, logarithmic vertical axis from 10^{−16} to 10^{−2}. Plotted quantity: $\sqrt{\Delta_k^{(\mu)}}$.]

Gauss-Radau upper bound, finite precision arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1

µ = α λ_1, α ∈ (0, 1], α ≈ 1 (red), α ≈ 0 (magenta)

[Figure: CGQ − strakos48. Iterations 80 to 120, logarithmic vertical axis from 10^{−16} to 10^{−2}. Plotted quantity: $\sqrt{\alpha\, \Delta_k^{(\mu)}}$.]

Conclusions (numerical observation)
Gauss-Radau upper bound

It seems that √ε is a limiting level for the accuracy of the Gauss-Radau upper bound.

We cannot avoid subtractions in computing this bound.
If µ ≈ λ_1, then T_k − µI may be ill conditioned.

Simple formulas → investigation of numerical behaviour.

Understanding can help
– in suggesting another approach,
– in improving the Gauss quadrature lower bound (adaptive choice of d).

Related papers

G. Meurant and P. Tichý, On computing quadrature-based bounds for the A-norm of the error in CG, Numer. Algorithms, 62 (2013), pp. 163–191.

G. H. Golub and G. Meurant, Matrices, moments and quadrature with applications, Princeton University Press, USA, 2010.

Z. Strakoš and P. Tichý, On error estimation in CG and why it works in finite precision computations, ETNA, 13 (2002), pp. 56–80.

G. H. Golub and G. Meurant, Matrices, moments and quadrature. II, BIT, 37 (1997), pp. 687–705.

G. H. Golub and Z. Strakoš, Estimates in quadratic formulas, Numer. Algorithms, 8 (1994), pp. 241–268.

Thank you for your attention!