Iterative Methods and QR Factorization
Lecture 5
Alessandra Nardi
Thanks to Prof. Jacob White, Suvranu De, Deepak Ramaswamy, Michal Rewienski, and Karen Veroy
Last lecture review
• Solution of systems of linear equations Mx=b
• Gaussian Elimination basics
– LU factorization (M=LU)
– Pivoting for accuracy enhancement
– Error Mechanisms (Round-off)
• Ill-conditioning
• Numerical Stability
– Complexity: O(N^3)
• Gaussian Elimination for Sparse Matrices
– Improved computational cost: factor in O(N^1.5)
– Data structure
– Pivoting for sparsity (Markowitz Reordering)
– Graph Based Approach
Solving Linear Systems
• Direct methods: find the exact solution in a finite number of steps
– Gaussian Elimination
• Iterative methods: produce a sequence of approximate solutions hopefully converging to the exact solution
– Stationary
• Jacobi
• Gauss-Seidel
• SOR (Successive Overrelaxation Method)
– Non-Stationary
• GCR, CG, GMRES, …
Iterative Methods
Iterative methods can be expressed in the general form: x(k) = F(x(k-1))
where s s.t. F(s) = s is called a Fixed Point
Hopefully: x(k) → s (the solution of my problem)
• Will it converge? How rapidly?
Iterative Methods
Stationary:
x(k+1) = G x(k) + c
where G and c do not depend on the iteration count k
Non-Stationary:
x(k+1) = x(k) + α_k p(k)
where the computation involves information that changes at each iteration
Iterative – Stationary: Jacobi
In the i-th equation, solve for the value of x_i while assuming the other entries of x remain fixed:

x_i^(k) = ( b_i − Σ_{j≠i} m_ij x_j^(k−1) ) / m_ii

In matrix terms the method becomes:

x^(k) = D^−1 (L + U) x^(k−1) + D^−1 b

where D, −L and −U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M.
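The Jacobi update above can be sketched in NumPy as follows; the function name `jacobi`, the tolerance, and the small test system are illustrative choices, not part of the lecture:

```python
import numpy as np

def jacobi(M, b, x0=None, tol=1e-10, max_iter=500):
    """Jacobi iteration: x(k) = D^-1 ((L+U) x(k-1) + b), with M = D - L - U."""
    M = np.asarray(M, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float)
    D = np.diag(M)            # diagonal entries of M
    R = M - np.diagflat(D)    # off-diagonal part, i.e. -(L + U)
    for _ in range(max_iter):
        x_new = (b - R @ x) / D   # all entries updated from the old iterate
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

# Strictly diagonally dominant example, for which Jacobi converges
M = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
x = jacobi(M, b)
```

Convergence is only guaranteed for special classes of M (e.g. strictly diagonally dominant matrices, as in the example); in general the iteration may diverge.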
Iterative – Stationary: Gauss-Seidel
Like Jacobi, but now assume that previously computed results are used as soon as they are available:

x_i^(k) = ( b_i − Σ_{j<i} m_ij x_j^(k) − Σ_{j>i} m_ij x_j^(k−1) ) / m_ii

In matrix terms the method becomes:

x^(k) = (D − L)^−1 ( U x^(k−1) + b )

where D, −L and −U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M.
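A NumPy sketch of the Gauss-Seidel sweep; note that `x[:i]` already holds the freshly updated entries, which is the only difference from Jacobi. The names and test system are illustrative:

```python
import numpy as np

def gauss_seidel(M, b, x0=None, tol=1e-10, max_iter=500):
    """Gauss-Seidel: use updated entries x_j^(k) (j < i) as soon as available."""
    M = np.asarray(M, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s1 = M[i, :i] @ x[:i]          # already-updated entries (j < i)
            s2 = M[i, i+1:] @ x_old[i+1:]  # previous-iteration entries (j > i)
            x[i] = (b[i] - s1 - s2) / M[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x
    return x

M = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
x = gauss_seidel(M, b)
```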
Iterative – Stationary: Successive Overrelaxation (SOR)

Devised by applying extrapolation to Gauss-Seidel in the form of a weighted average between the Gauss-Seidel value and the previous iterate:

x_i^GS = ( b_i − Σ_{j<i} m_ij x_j^(k) − Σ_{j>i} m_ij x_j^(k−1) ) / m_ii

x_i^(k) = w x_i^GS + (1 − w) x_i^(k−1)

In matrix terms the method becomes:

x^(k) = (D − wL)^−1 ( wU + (1 − w) D ) x^(k−1) + w (D − wL)^−1 b

where D, −L and −U represent the diagonal, the strictly lower-triangular, and the strictly upper-triangular parts of M; the weight w is chosen to accelerate convergence.
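SOR only changes one line of the Gauss-Seidel sweep: the new entry is a weighted average of the Gauss-Seidel value and the old iterate. The weight w = 1.1 below is an arbitrary illustrative choice (w = 1 recovers Gauss-Seidel):

```python
import numpy as np

def sor(M, b, w=1.1, x0=None, tol=1e-10, max_iter=500):
    """SOR: weighted average of the Gauss-Seidel update and the previous iterate."""
    M = np.asarray(M, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s1 = M[i, :i] @ x[:i]
            s2 = M[i, i+1:] @ x_old[i+1:]
            x_gs = (b[i] - s1 - s2) / M[i, i]     # Gauss-Seidel value
            x[i] = w * x_gs + (1 - w) * x_old[i]  # weighted average
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x
    return x

M = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
x = sor(M, b, w=1.1)
```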
Iterative – Non Stationary
The iterates x(k) are updated at each iteration by a multiple α_k of the search direction vector p(k):
x(k+1) = x(k) + α_k p(k)
Convergence depends on the spectral properties of the matrix M.
• Where does all this come from? What are the search directions? How do I choose α_k?
Will explore in detail in the next lectures
Outline

• QR Factorization
– Direct method to solve linear systems
• Problems that generate singular matrices
– Modified Gram-Schmidt Algorithm
– QR Pivoting
• Matrix must be singular; move the zero column to the end
– Minimization viewpoint: link to iterative non-stationary methods (Krylov subspace)
[Figure: circuit with nodes v1, v2, v3, v4 driven by unit current sources]
The resulting nodal matrix is SINGULAR, but a solution exists!
LU Factorization fails – Singular Example
[ 1 −1  0  0 ] [v1]   [−1]
[−1  1  0  0 ] [v2] = [ 1]
[ 0  0  1 −1 ] [v3]   [−1]
[ 0  0 −1  2 ] [v4]   [ 0]
The resulting nodal matrix is SINGULAR, but a solution exists!
Solution (from picture):
v4 = −1
v3 = −2
v2 = anything you want (infinitely many solutions)
v1 = v2 − 1
LU Factorization fails – Singular Example
[ 1 −1  0  0 ]                 [ 1 −1  0  0 ]
[−1  1  0  0 ]   One step      [ 0  0  0  0 ]
[ 0  0  1 −1 ]   of GE  ⇒      [ 0  0  1 −1 ]
[ 0  0 −1  2 ]                 [ 0  0 −1  2 ]

The second pivot is zero, so LU factorization fails even though a solution exists.
[ M1  M2  …  MN ] [x1; x2; …; xN] = b   ⇔   x1 M1 + x2 M2 + … + xN MN = b
Recall weighted sum of columns view of systems of equations
M is singular but b is in the span of the columns of M
QR Factorization – Singular Example
Orthogonal columns implies: Mi^T Mj = 0 for i ≠ j

Multiplying the weighted-columns equation by the i-th column:

Mi^T ( x1 M1 + x2 M2 + … + xN MN ) = Mi^T b

Simplifying using orthogonality:

xi ( Mi^T Mi ) = Mi^T b   ⇒   xi = ( Mi^T b ) / ( Mi^T Mi )

QR Factorization – Key idea: If M has orthogonal columns
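This projection formula can be checked numerically: with orthogonal columns, each x_i costs just two inner products and no factorization. The 2×2 matrix and right-hand side below are made-up examples:

```python
import numpy as np

# Matrix with mutually orthogonal (but not normalized) columns
M = np.array([[1.0,  2.0],
              [2.0, -1.0]])
b = np.array([3.0, 1.0])
assert M[:, 0] @ M[:, 1] == 0        # columns really are orthogonal

# x_i = (M_i^T b) / (M_i^T M_i), one column at a time
x = np.array([(M[:, i] @ b) / (M[:, i] @ M[:, i]) for i in range(2)])
```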
Picture for the two-dimensional case
[Figure: orthogonal case — b decomposes directly into components x1 M1 and x2 M2 along the columns; non-orthogonal case — the components along M1 and M2 do not give the solution directly]
QR Factorization - M orthonormal
M is orthonormal if:  Mi^T Mj = 0 for i ≠ j  and  Mi^T Mi = 1

Original matrix:
[ M1  M2  …  MN ] [x1; x2; …; xN] = [b1; b2; …; bN]

Matrix with orthonormal columns:
[ Q1  Q2  …  QN ] [y1; y2; …; yN] = [b1; b2; …; bN]

Q y = b   ⇒   y = Q^T b
How to perform the conversion?
QR Factorization – Key idea
Given M1 and M2, find Q2 so that:  Q2 = M2 − r12 M1

M1^T Q2 = M1^T M2 − r12 M1^T M1 = 0

⇒  r12 = ( M1^T M2 ) / ( M1^T M1 )

[Figure: Q2 obtained by subtracting from M2 its projection r12 M1 along M1]
QR Factorization – Projection formula
Formulas simplify if we normalize:

Q1 = (1/r11) M1,   r11 = ( M1^T M1 )^(1/2)   ⇒   Q1^T Q1 = 1

r12 = Q1^T M2

Finally:  Q2 = (1/r22) ( M2 − r12 Q1 ),   r22 = ( (M2 − r12 Q1)^T (M2 − r12 Q1) )^(1/2)
QR Factorization – Normalization
M x = x1 M1 + x2 M2 = y1 Q1 + y2 Q2 = Q y

M1 = r11 Q1,   M2 = r12 Q1 + r22 Q2

Mx = b and Qy = b  ⇒  Mx = Qy:

[ r11 r12 ] [x1]   [y1]
[  0  r22 ] [x2] = [y2]
QR Factorization – 2x2 case
M x = [ M1 M2 ] [x1]  =  [ Q1 Q2 ] [ r11 r12 ] [x1]  =  [b1]
                [x2]               [  0  r22 ] [x2]     [b2]
                         orthonormal  upper triangular

Two-step solve, given M = QR:
Step 1)  QRx = b  ⇒  Rx = Q^T b ≡ b̃
Step 2)  Backsolve Rx = b̃
QR Factorization – 2x2 case
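The two-step solve can be sketched with NumPy's built-in factorization (`numpy.linalg.qr`, used here for convenience instead of the Gram-Schmidt construction developed below) plus an explicit backsolve; the 2×2 system is an invented example:

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [4.0, 2.0]])
b = np.array([5.0, 6.0])

Q, R = np.linalg.qr(M)   # M = QR: Q has orthonormal columns, R is upper triangular
b_tilde = Q.T @ b        # Step 1: Rx = Q^T b

# Step 2: backsolve the upper-triangular system Rx = b_tilde
n = len(b)
x = np.zeros(n)
for i in range(n - 1, -1, -1):
    x[i] = (b_tilde[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
```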
Q3 = M3 − r13 M1 − r23 M2

To ensure the third column is orthogonal:

M1^T Q3 = M1^T M3 − r13 M1^T M1 − r23 M1^T M2 = 0
M2^T Q3 = M2^T M3 − r13 M2^T M1 − r23 M2^T M2 = 0
QR Factorization – General case
M1^T M3 − r13 M1^T M1 − r23 M1^T M2 = 0
M2^T M3 − r13 M2^T M1 − r23 M2^T M2 = 0

i.e. a 2×2 linear system for the coefficients:

[ M1^T M1   M1^T M2 ] [r13]   [ M1^T M3 ]
[ M2^T M1   M2^T M2 ] [r23] = [ M2^T M3 ]
QR Factorization – General case
In general, to orthogonalize the Nth vector one must solve an (N−1)×(N−1) dense linear system for the coefficients:

[ M1^T M1       …  M1^T M_(N−1)     ] [ r_1N     ]   [ M1^T MN     ]
[    ⋮                  ⋮           ] [   ⋮      ] = [     ⋮       ]
[ M_(N−1)^T M1  …  M_(N−1)^T M_(N−1)] [ r_(N−1)N ]   [ M_(N−1)^T MN]

≈ N^2 inner products  ⇒  O(N^3) work
QR Factorization – General case
If we instead orthogonalize against the already-orthonormal columns Q1, Q2:

Q3 = M3 − r13 Q1 − r23 Q2

To ensure the third column is orthogonal:

Q1^T Q3 = Q1^T M3 − r13 Q1^T Q1 − r23 Q1^T Q2 = 0   ⇒   r13 = Q1^T M3
Q2^T Q3 = Q2^T M3 − r13 Q2^T Q1 − r23 Q2^T Q2 = 0   ⇒   r23 = Q2^T M3
QR Factorization – General caseModified Gram-Schmidt Algorithm
For i = 1 to N            "For each Source Column"
    r_ii = ( Mi^T Mi )^(1/2)
    Qi = Mi / r_ii                          (Normalize)
    For j = i+1 to N      "For each Target Column right of Source"
        r_ij = Qi^T Mj
        Mj = Mj − r_ij Qi
    end
end

Cost: the inner update costs ≈ 2N operations per (i, j) pair, so the total is
Σ_{i=1}^{N} 2(N−i)N ≈ N^3 operations.
QR FactorizationModified Gram-Schmidt Algorithm
(Source-column oriented approach)
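The source-column-oriented algorithm above translates directly into NumPy; the function name `mgs_qr` and the 3×3 test matrix are illustrative choices:

```python
import numpy as np

def mgs_qr(A):
    """Modified Gram-Schmidt, source-column order: once Q_i is formed, it is
    immediately projected out of every column to its right."""
    M = np.array(A, dtype=float)       # work on a copy; columns get overwritten
    n = M.shape[1]
    Q = np.zeros_like(M)
    R = np.zeros((n, n))
    for i in range(n):                 # for each source column
        R[i, i] = np.linalg.norm(M[:, i])
        Q[:, i] = M[:, i] / R[i, i]    # normalize
        for j in range(i + 1, n):      # for each target column to the right
            R[i, j] = Q[:, i] @ M[:, j]
            M[:, j] -= R[i, j] * Q[:, i]
    return Q, R

A = np.array([[3.0, 1.0, 2.0],
              [4.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
Q, R = mgs_qr(A)
```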
[Figure: source-column MGS on columns M1…M4. Normalizing M1 gives Q1 and r11; Q1 then produces r12, r13, r14 and updates M2, M3, M4. Normalizing the updated M2 gives Q2 and r22, then r23, r24, and so on through r33, r34, r44: R is filled row by row.]
QR Factorization – By picture
x = x1 e1 + x2 e2 + … + xN eN,   where ei is the i-th unit vector:
e1 = [1; 0; …; 0],  e2 = [0; 1; …; 0],  …,  eN = [0; …; 0; 1]

M x = x1 M e1 + x2 M e2 + … + xN M eN

Suppose only matrix-vector products were available? Then it is more convenient to use another approach.
QR Factorization – Matrix-Vector Product View
For i = 1 to N            "For each Target Column"
    Mi = M ei                               ("matrix-vector product")
    For j = 1 to i−1      "For each Source Column left of Target"
        r_ji = Qj^T Mi
        Mi = Mi − r_ji Qj
    end
    r_ii = ( Mi^T Mi )^(1/2)
    Qi = Mi / r_ii                          (Normalize)
end

Cost: again ≈ Σ_{i=1}^{N} 2(N−i)N ≈ N^3 operations.
QR FactorizationModified Gram-Schmidt Algorithm
(Target-column oriented approach)
[Figure: target-column MGS on columns M1…M4. Each new column Mi is orthogonalized against all previously computed Q1, …, Q_(i−1), then normalized to give Qi: R is filled column by column.]
QR Factorization
[Figure: the two orderings produce the same factors Q1…Q4 and R. Source-column order fills R row by row (r11; r12 r13 r14; then r22; r23 r24; …), target-column order fills R column by column (r11; then r12, r22; then r13, r23, r33; …).]
What if a column becomes zero?

Q3 = M3 − r13 Q1 − r23 Q2 = [0; 0; …; 0]

The matrix MUST be singular!
1) Do not try to normalize the column.
2) Do not use the column as a source for orthogonalization.
3) Perform backward substitution as well as possible.
QR Factorization – Zero Column
Resulting QR factorization (zero column left in place):

Q = [ Q1  Q2  0  Q4  …  QN ]

    [ r11 r12 r13  …  r1N ]
    [  0  r22 r23  …  r2N ]
R = [  0   0   0   …   0  ]
    [  0   0   0  r44  …  ]
    [             ⋱       ]
    [  0   …       0  rNN ]

The third row of R is zero because the zero column is never normalized or used as a source.
QR Factorization – Zero Column
[ M1  M2  …  MN ] [x1; x2; …; xN] = b   ⇔   x1 M1 + x2 M2 + … + xN MN = b
Recall weighted sum of columns view of systems of equations
M is singular but b is in the span of the columns of M
QR Factorization – Zero Column
Reasons for QR Factorization

• QR factorization to solve Mx=b
– Mx=b ⇒ QRx=b ⇒ Rx=Q^T b
where Q is orthogonal and R is upper triangular
• O(N^3), same as Gaussian Elimination
• Nice for singular matrices
– Least-squares problems: Mx=b where M is m×n with m>n
• Pointer to Krylov-subspace methods
– through the minimization point of view
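For the least-squares case, a minimal NumPy sketch: with m > n the reduced QR of M still yields a square, invertible R, and solving Rx = Q^T b gives the x minimizing ||b − Mx||_2. The 3×2 matrix below is an invented example:

```python
import numpy as np

# Overdetermined system: M is 3x2 (m > n), so Mx = b generally has no exact solution
M = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(M)            # reduced QR: Q is 3x2, R is 2x2
x = np.linalg.solve(R, Q.T @ b)   # solve Rx = Q^T b
```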
Definition of the residual:  R(x) = b − Mx

Find x which satisfies Mx = b

⇔ Minimize over all x:  R(x)^T R(x) = Σ_{i=1}^{N} R_i(x)^2

Equivalent if b ∈ span(cols M):  Mx = b  ⇔  min_x R(x)^T R(x) = 0

Minimization is more general!
QR Factorization – Minimization View
One-dimensional minimization:

Suppose x = x1 e1, and therefore Mx = x1 M e1.

R(x)^T R(x) = (b − x1 Me1)^T (b − x1 Me1)
            = b^T b − 2 x1 b^T (Me1) + x1^2 (Me1)^T (Me1)

d/dx1 [ R(x)^T R(x) ] = −2 b^T (Me1) + 2 x1 (Me1)^T (Me1) = 0

⇒  x1 = ( b^T Me1 ) / ( e1^T M^T M e1 )
QR Factorization – Minimization ViewOne-Dimensional Minimization
[Figure: b and the line spanned by Me1 = M1; the minimizing x1 places x1·Me1 at the orthogonal projection of b onto M1]

One-dimensional minimization yields the same result as projection on the column:

x1 = ( b^T Me1 ) / ( e1^T M^T M e1 )
QR Factorization – Minimization ViewOne-Dimensional Minimization: Picture
Now x = x1 e1 + x2 e2 and Mx = x1 Me1 + x2 Me2.

Residual minimization:

R(x)^T R(x) = (b − x1 Me1 − x2 Me2)^T (b − x1 Me1 − x2 Me2)
            = b^T b − 2 x1 b^T (Me1) + x1^2 (Me1)^T (Me1)
                    − 2 x2 b^T (Me2) + x2^2 (Me2)^T (Me2)
                    + 2 x1 x2 (Me1)^T (Me2)        ← coupling term
QR Factorization – Minimization ViewTwo-Dimensional Minimization
QR Factorization – Minimization ViewTwo-Dimensional Minimization: Residual Minimization
R(x)^T R(x) = (b − x1 Me1 − x2 Me2)^T (b − x1 Me1 − x2 Me2)
            = b^T b − 2 x1 b^T (Me1) + x1^2 (Me1)^T (Me1)
                    − 2 x2 b^T (Me2) + x2^2 (Me2)^T (Me2)
                    + 2 x1 x2 (Me1)^T (Me2)        ← coupling term

Setting the partial derivatives to zero:

d[R(x)^T R(x)]/dx1 = 0 = −2 b^T (Me1) + 2 x1 (Me1)^T (Me1) + (coupling term)
d[R(x)^T R(x)]/dx2 = 0 = −2 b^T (Me2) + 2 x2 (Me2)^T (Me2) + (coupling term)

To eliminate the coupling term: we change search directions!
More general search directions: instead of x = x1 e1 + x2 e2, take

x = v1 p1 + v2 p2   and   Mx = v1 Mp1 + v2 Mp2

R(x)^T R(x) = b^T b − 2 v1 b^T (Mp1) + v1^2 (Mp1)^T (Mp1)
                    − 2 v2 b^T (Mp2) + v2^2 (Mp2)^T (Mp2)
                    + 2 v1 v2 (Mp1)^T (Mp2)        ← coupling term

with span{p1, p2} = span{e1, e2}

If p1^T M^T M p2 = 0  ⇒  the minimizations decouple!
QR Factorization – Minimization ViewTwo-Dimensional Minimization
QR Factorization – Minimization ViewTwo-Dimensional Minimization
Goal: find a set of search directions such that
p_i^T M^T M p_j = 0 when i ≠ j

In this case the minimization decouples! Such p_i and p_j are called M^T M-orthogonal.

p_i = e_i − Σ_{j=1}^{i−1} r_ji p_j   so that   p_i^T M^T M p_j = 0 for j < i

The i-th search direction equals the orthogonalized unit vector, with

r_ji = ( (Mp_j)^T (Me_i) ) / ( (Mp_j)^T (Mp_j) )

using the previously orthogonalized search directions.
QR Factorization – Minimization ViewForming MTM orthogonal Minimization Directions
Minimize:  v_i^2 (Mp_i)^T (Mp_i) − 2 v_i b^T (Mp_i)

Differentiating:  2 v_i (Mp_i)^T (Mp_i) − 2 b^T (Mp_i) = 0

⇒  v_i = ( b^T Mp_i ) / ( (Mp_i)^T (Mp_i) )
QR Factorization – Minimization ViewMinimizing in the Search Direction
When the search directions p_j are M^T M-orthogonal, residual minimization becomes:

For i = 1 to N            "For each Target Direction"
    p_i = e_i
    For j = 1 to i−1      "For each Source Direction left of Target"
        r_ji = (Mp_j)^T (Mp_i)
        p_i = p_i − r_ji p_j                (Orthogonalize search direction)
    end
    r_ii = ( (Mp_i)^T (Mp_i) )^(1/2)
    p_i = p_i / r_ii                        (Normalize)
    v_i = b^T (Mp_i)                        (since (Mp_i)^T (Mp_i) = 1 after normalizing)
    x = x + v_i p_i
end
QR Factorization – Minimization ViewMinimization Algorithm
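The minimization algorithm can be sketched as below; `minimization_solve` is a hypothetical name, and caching the products Mp_j (so each is computed once) is an implementation choice, not from the slides:

```python
import numpy as np

def minimization_solve(M, b):
    """Solve Mx = b by minimizing ||b - Mx||^2 along M^T M-orthogonal
    search directions built from the unit vectors e_i."""
    n = len(b)
    x = np.zeros(n)
    P, MP = [], []                      # p_j and cached Mp_j (Mp_j normalized)
    for i in range(n):
        p = np.zeros(n)
        p[i] = 1.0                      # p_i starts as the unit vector e_i
        Mp = M @ p
        for pj, Mpj in zip(P, MP):      # orthogonalize: make (Mp_j)^T (Mp_i) = 0
            r = Mpj @ Mp
            p = p - r * pj
            Mp = Mp - r * Mpj           # keep Mp consistent with M @ p
        r_ii = np.linalg.norm(Mp)       # normalize so (Mp_i)^T (Mp_i) = 1
        p, Mp = p / r_ii, Mp / r_ii
        v = b @ Mp                      # 1-D residual minimization along p_i
        x = x + v * p
        P.append(p)
        MP.append(Mp)
    return x

M = np.array([[3.0, 1.0],
              [4.0, 2.0]])
b = np.array([5.0, 6.0])
x = minimization_solve(M, b)
```

After N steps the vectors Mp_i form an orthonormal basis, so for a nonsingular M the result is the exact solution; stopping early gives the best approximation within the directions used so far, which is the link to the iterative non-stationary methods.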
Intuitive summary

• QR factorization ⇔ Minimization view
(Direct)              (Iterative)
• Compose the vector x along search directions:
– Direct: composition along Qi (orthonormalized columns of M) ⇒ need to factorize M
– Iterative: composition along certain search directions ⇒ you can stop halfway
• About the search directions:
– Chosen so that the minimization is easy to do (decoupling): the p_j are M^T M-orthogonal
– Each step: try to minimize the residual
p1 = e1 / r11

p2 = ( e2 − r12 p1 ) / r22

⋮

pN = ( eN − Σ_{i=1}^{N−1} r_iN p_i ) / rNN

M p1 = Q1,   M p2 = Q2,   …,   M pN = QN

{p_i}: M^T M-orthonormal   ⇔   {Q_i}: orthonormal
Compare Minimization and QR
Summary
• Iterative Methods Overview
– Stationary
– Non-Stationary
• QR factorization to solve Mx=b
– Modified Gram-Schmidt Algorithm
– QR Pivoting
– Minimization View of QR
• Basic minimization approach
• Orthogonalized search directions
• Pointer to Krylov-subspace methods