Solving Linear Equations in Interior-Point Methods
Tongryeol Seol
Computing and Software Department, McMaster University
March 1, 2004
Optimization Seminar
1 / 21
Outline
• Linear Equations in IPMs
• How to Solve Sparse Linear Equations
• Introduction to McSML
• Conclusion
• References
2 / 21
Linear Equations in IPMs for LO and QO
• The key implementation issue of IPMs is the solution of the linear systems of equations arising from the Newton system:
• At every iteration, we solve this system with a different H.
• Solving these equations takes on average 60–90% of the total time of solving a problem by an IPM.
Augmented system in LO:
$$\begin{bmatrix} -H^{-1} & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \begin{bmatrix} f_L \\ h_L \end{bmatrix}$$

Normal equation in LO:
$$A H A^T \, \Delta y = A H f_L + h_L$$

Augmented system in QO:
$$\begin{bmatrix} -Q - H^{-1} & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \begin{bmatrix} f_Q \\ h_Q \end{bmatrix}$$

Normal equation in QO:
$$A (Q + H^{-1})^{-1} A^T \, \Delta y = A (Q + H^{-1})^{-1} f_Q + h_Q$$
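To see where the normal equation comes from, eliminate Δx from the augmented system (shown for the LO case; the QO case is identical with Q + H⁻¹ in place of H⁻¹):
$$-H^{-1} \Delta x + A^T \Delta y = f_L \;\Longrightarrow\; \Delta x = H (A^T \Delta y - f_L),$$
$$A \Delta x = h_L \;\Longrightarrow\; A H A^T \Delta y = A H f_L + h_L.$$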
3 / 21
How to Solve Linear Equations in IPMs
Direct Methods
• Cholesky factorization (LLᵀ) is the most popular choice for the normal equation approach.
– Cholesky factorization is a symmetric variant of LU factorization.
– The data structures for L can be determined in advance and fixed.
• LDLᵀ factorization is used for the augmented system approach.
– D is a block diagonal matrix if 2×2 pivots are applied, otherwise just a diagonal matrix.

Iterative Methods
• The conjugate gradient method is considered as an alternative in IPMs for network flow optimization.
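Below is a minimal MATLAB sketch of both direct approaches on a synthetic instance (the data, the sizes, and the built-in chol/ldl calls are illustrative assumptions, not McSML code):

  m = 40; n = 100;
  A = [speye(m), sprandn(m, n - m, 0.1)];   % full row rank by construction
  H = spdiags(rand(n, 1) + 0.1, 0, n, n);   % positive diagonal scaling matrix
  b = randn(m, 1);
  % Normal equation approach: AHA' is SPD, so Cholesky applies.
  S  = A * H * A';
  L  = chol(S, 'lower');
  dy = L' \ (L \ b);
  % Augmented system approach: indefinite, so use LDL' instead.
  K = [-spdiags(1 ./ diag(H), 0, n, n), A'; A, sparse(m, m)];
  [Lk, Dk, Pk] = ldl(K);                    % P'*K*P = L*D*L'
  sol = Pk * (Lk' \ (Dk \ (Lk \ (Pk' * [zeros(n, 1); b]))));
  dy2 = sol(n+1:end);                       % agrees with dy up to rounding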
4 / 21
Normal Equation Approach
• In IPMs for LO, the normal equation approach is popular because AHAᵀ is positive definite and H is a diagonal matrix.
• If A has one or more dense columns, AHAᵀ loses sparsity.
• What about the normal equation in QO? Even when Q is sparse, (Q + H⁻¹)⁻¹ is generally dense.

$$A H A^T \, \Delta y = A H f_L + h_L \quad \text{(normal equation in LO)}$$
$$A (Q + H^{-1})^{-1} A^T \, \Delta y = A (Q + H^{-1})^{-1} f_Q + h_Q \quad \text{(normal equation in QO)}$$

[Figure: nonzero pattern of A for fit1p (nz = 9,868) vs. nonzero pattern of AAᵀ (nz = 393,129)]
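The dense-column effect is easy to reproduce in MATLAB (a hypothetical toy instance, not one of the Netlib problems):

  n = 1000;
  A = speye(n);                 % perfectly sparse constraint matrix
  A(:, 1) = 1;                  % a single dense column
  fprintf('nnz(A) = %d, nnz(A*A'') = %d\n', nnz(A), nnz(A * A'));
  % The one dense column makes A*A' completely dense: nnz grows from 1,999 to 1,000,000.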
5 / 21
Augmented System Approach
• The augmented system is free from the dense-column and Q-matrix problems, but it loses the positive definiteness of the normal equation.
• There are two approaches to solving the augmented system:

Symmetric Block Factorization with 2×2 Pivoting
– Theory: Bunch-Parlett (or Bunch-Kaufman) pivoting strategy
– Implementation: LINPACK, Harwell Library (MA27, MA47), Fourer and Mehrotra (fo1aug)

LDLᵀ Factorization with Regularization
– Theory: quasidefinite matrices, proximal point algorithm
– Implementation: Mészáros, Gondzio
6 / 21
Block Factorization with 2× 2 Pivoting
• We cannot apply the Cholesky factorization to indefinite matrices, e.g.
$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
• The Bunch-Kaufman pivoting strategy uses a 2×2 pivot when no 1×1 pivot is acceptable.
– First it tries a partial pivoting search (cases 1–3).
– If no 1×1 pivot is acceptable, it performs a 2×2 pivot.

For the current column, let d be its diagonal entry and λ its largest off-diagonal absolute value, attained in row r; let c be the diagonal entry of column r and σ the largest off-diagonal absolute value in column r. With α ≈ 0.6404 (= (1 + √17)/8):

case 1. |d| ≥ α|λ|: use d as a 1×1 pivot.
case 2. |d|σ ≥ α|λ|²: use d as a 1×1 pivot.
case 3. |c| ≥ ασ: use c as a 1×1 pivot.
case 4. otherwise: use $\begin{bmatrix} d & \lambda \\ \lambda & c \end{bmatrix}$ as a 2×2 pivot.
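The case analysis can be written down directly. The following MATLAB function is a hypothetical sketch of the pivot selection for one column k < n (selection only; the elimination step and the symmetric permutations are omitted):

  function piv = bk_select(K, k)
  % Bunch-Kaufman pivot selection for column k of a symmetric matrix K (sketch).
  alpha = (1 + sqrt(17)) / 8;               % ~0.6404, bounds element growth
  n = size(K, 1);
  d = K(k, k);
  [lam, r] = max(abs(K(k+1:n, k)));         % lambda and the row r where it occurs
  r = k + r;
  if abs(d) >= alpha * lam
      piv = 'case 1: use d as a 1x1 pivot';
  else
      sig = max(abs(K([k:r-1, r+1:n], r))); % largest off-diagonal in column r
      if abs(d) * sig >= alpha * lam^2
          piv = 'case 2: use d as a 1x1 pivot';
      elseif abs(K(r, r)) >= alpha * sig
          piv = 'case 3: use c = K(r,r) as a 1x1 pivot';
      else
          piv = 'case 4: use [d lam; lam c] as a 2x2 pivot';
      end
  end
  end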
7 / 21
Solving Linear Equations by Regularization
• Quasidefinite matrices are strongly factorizable.
– A symmetric matrix K is called quasidefinite if it has the form
$$K = \begin{bmatrix} -H & A^T \\ A & F \end{bmatrix},$$
where H and F are positive definite and A has full row rank.
– Factorizable: there exist D and L such that K = LDLᵀ.
– Strongly factorizable: for any permutation matrix P, the factorization PKPᵀ = LDLᵀ exists.
• Regularization can be applied to achieve stability in the linear algebra kernel.
– Primal-dual regularization:
$$M = \begin{bmatrix} -Q - H^{-1} & A^T \\ A & 0 \end{bmatrix} + \begin{bmatrix} -R_p & 0 \\ 0 & R_d \end{bmatrix},$$
where −R_p makes the (1,1) block more negative definite and R_d makes the (2,2) block more positive definite.
– However, we keep the regularizations small, since we do not want to change the behaviour of the IPM greatly.
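A small MATLAB sketch of building and factorizing the regularized system (the sizes, the stand-in Q, and the 1e-8 regularization weights are assumptions for illustration):

  m = 40; n = 100;
  A    = [speye(m), sprandn(m, n - m, 0.05)];  % full row rank
  H    = spdiags(rand(n, 1) + 0.1, 0, n, n);   % positive diagonal
  Hinv = spdiags(1 ./ diag(H), 0, n, n);
  Q    = speye(n);                             % stand-in QO Hessian
  Rp   = 1e-8 * speye(n);                      % primal regularization
  Rd   = 1e-8 * speye(m);                      % dual regularization
  M    = [-(Q + Hinv) - Rp, A'; A, Rd];        % regularized augmented system
  [L, D, P] = ldl(M);                          % quasidefinite, safely factorizable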
8 / 21
Sparse Gaussian Elimination
• First step of Gaussian elimination:
$$M = \begin{bmatrix} \alpha & w^T \\ v & C \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ v/\alpha & I \end{bmatrix} \begin{bmatrix} \alpha & w^T \\ 0 & C - v w^T / \alpha \end{bmatrix}$$
– Repeating Gaussian elimination on C results in a factorization M = LU, with L unit lower triangular and U upper triangular.
– In symmetric cases, M = L(U = DLᵀ) = LDLᵀ, with D a diagonal matrix.
• For sparse M: if v and w are not zero, C − vwᵀ/α is not zero where C is zero, so new nonzeros (fill-ins) appear. A different order of pivoting can reduce the number of fill-ins.

[Figure: elimination on an arrow matrix. With the dense row and column eliminated first, one step fills the entire remaining submatrix; with them ordered last, no fill-in occurs. A demo follows.]
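The arrow-matrix example in the figure can be reproduced in MATLAB (a hypothetical demo; the SPD arrow matrix is an assumption chosen so that chol applies):

  n = 6;
  M = speye(n); M(1, :) = 1; M(:, 1) = 1; M(1, 1) = n;  % dense first row/column
  L1 = chol(M, 'lower');          % dense row/column first: L fills in completely
  p  = n:-1:1;                    % reverse ordering puts the dense row/column last
  L2 = chol(M(p, p), 'lower');    % no fill-in at all
  fprintf('nnz(L): natural %d vs reordered %d\n', nnz(L1), nnz(L2));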
9 / 21
Control of Nonzeros (Fill-ins)
• When numerical stability is not an issue, ordering, symbolic factorization, and numerical factorization can be performed as separate steps (a sketch of the three steps follows).
• Ordering permutes equations and variables to reduce the fill-in in L.
– Finding an optimal permutation is an NP-complete problem.
– Local heuristics (e.g., minimum degree ordering)
– Global heuristics (e.g., nested dissection)
• Symbolic factorization determines the locations of the nonzeros in L, so the data structures for storing them can be set up in advance.
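A minimal MATLAB sketch of the three separate steps (the built-ins symamd/symbfact/chol stand in for McSML's own routines; the grid Laplacian is an assumed test matrix):

  S = delsq(numgrid('S', 60));    % SPD 5-point Laplacian on a square grid
  p = symamd(S);                  % 1. ordering (minimum degree heuristic)
  c = symbfact(S(p, p));          % 2. symbolic factorization: column counts of L
  fprintf('predicted nnz(L): %d with ordering vs %d natural\n', ...
          sum(c), sum(symbfact(S)));
  L = chol(S(p, p), 'lower');     % 3. numerical factorization into fixed structures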
10 / 21
Local Ordering Heuristics
• The Markowitz criterion in Gaussian elimination: a candidate pivot in row i and column j gets the count (rᵢ − 1)(cⱼ − 1), where rᵢ and cⱼ are the numbers of nonzeros in row i and column j. The count estimates the number of elements that become nonzero, and the pivot with the smallest count is chosen.
• Minimum degree ordering is the symmetric variant of Markowitz ordering: the count becomes (rᵢ − 1)², so minimizing it amounts to picking the node of minimum degree in the graph.
• Graph representation of the sparsity pattern of a matrix:
– Nodes: diagonal elements
– Edges: off-diagonal elements

[Figure: Markowitz counts on a small example before and after one elimination step, and the elimination graph of a 5×5 matrix with its nodes numbered ①–⑤ in minimum degree order. A sketch of the count computation follows.]
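A hypothetical MATLAB sketch of computing Markowitz counts for all candidate pivots (the random test matrix is an assumption):

  M = sprand(8, 8, 0.3) + speye(8);   % example sparse matrix with nonzero diagonal
  r = full(sum(spones(M), 2));        % r_i: nonzeros in row i
  c = full(sum(spones(M), 1)).';      % c_j: nonzeros in column j
  [i, j] = find(M);                   % candidate pivots = nonzero entries
  mc = (r(i) - 1) .* (c(j) - 1);      % Markowitz count (r_i - 1)(c_j - 1)
  [~, k] = min(mc);
  fprintf('pivot (%d,%d) has minimum count %d\n', i(k), j(k), mc(k));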
11 / 21
Global Ordering Heuristics
• Nested dissection ordering
– Its roots are in finite-element substructuring.
– It brings a matrix into doubly-bordered block diagonal form.
– It finds a small separator that disconnects a given graph into components M1 and M2 of approximately equal size.

[Figure: a 9×9 grid split by a separator into components M1 and M2, and the corresponding 81×81 matrix permuted into doubly-bordered block diagonal form with blocks M11, M22 and the separator border.]
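MATLAB's built-in dissect (R2017b+) can reproduce the grid example; a hypothetical comparison against minimum degree:

  S  = delsq(numgrid('S', 11));   % Laplacian on the 9x9 interior grid: an 81x81 matrix
  pd = dissect(S);                % nested dissection ordering
  pm = symamd(S);                 % minimum degree ordering, for comparison
  fprintf('nnz of Cholesky factor: dissect %d, symamd %d\n', ...
          nnz(chol(S(pd, pd))), nnz(chol(S(pm, pm))));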
12 / 21
Notion of Supernode
• Consecutive columns with an identical sparsity structure can often be found in L.
• A supernode is a group of consecutive columns {j, j+1, …, j+t} such that
– columns j to j+t have a dense diagonal block, and
– columns j to j+t have an identical sparsity pattern below row j+t.
• Benefits of supernodes (a detection sketch follows):
– They reduce inefficient indirect addressing,
– take advantage of cache memory, and
– allow a compact representation of the sparsity structure of L.
– We can expect about a 30% speed-up by using supernode structures.
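A hypothetical MATLAB sketch of detecting supernode start columns from the pattern of a sparse factor L (McSML's actual detection may differ):

  function starts = snodes(L)
  % Columns j-1 and j belong to one supernode iff their patterns on rows
  % j..n are identical (this also forces the dense diagonal-block entry L(j,j-1)).
  n = size(L, 1);
  starts = 1;                     % column 1 always starts a supernode
  for j = 2:n
      if ~isequal(find(L(j:n, j)), find(L(j:n, j-1)))
          starts(end+1) = j;      %#ok<AGROW>
      end
  end
  end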
13 / 21
Introduction to McSML
• McSML consists of 11 C-MEX files for the solution of linear equations in IPMs.
• ANALYSIS
– Minimum (external) degree ordering with multiple elimination
• FACTORIZE and SOLVE
– Left-looking Cholesky/LDLᵀ factorization
– Supernodal techniques: compact row indices, loop unrolling, etc.
– Iterative refinement: conjugate gradient method (see the sketch below)
• MISC.
– We maintain A and Aᵀ together.
– Normal equation builder and augmented system builder.
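A plain-MATLAB sketch of CG-based refinement using the Cholesky factor as preconditioner (the names S, L, P, rhs and the tolerances are assumptions; McSML performs this inside its MEX routines):

  S    = A * H * A';                       % normal equation matrix
  prec = @(v) P' * (L' \ (L \ (P * v)));   % apply (P'LL'P)^{-1}, assuming P*S*P' = L*L'
  [x, flag, relres] = pcg(S, rhs, 1e-12, 20, prec);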
14 / 21
Normal Equation Solver of McSML
AT=sm_trans(A);
P=sf_mmd_ne(A);                    % minimum degree ordering; P holds the ordering info
[L,SINFO]=sf_scc_ne(A,AT,P);       % builds the data structures for L and creates the supernode structure (SINFO)
D=nf_aat(A,AT,H,P,L,SINFO);        % computes AHA'
nf_scc_ne(L,SINFO,D);              % performs the numerical factorization
x=nf_sub_ne(A,H,L,SINFO,D,P,rhs);  % solves AHA'x = rhs and refines the solution

We assume that every matrix is sparse and every vector is full.
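For testing, the result can be checked against a plain MATLAB backslash solve of the same system (a hypothetical cross-check, not part of McSML):

  x_ref = (A * H * A') \ rhs;      % reference solve of AHA'x = rhs
  fprintf('||x - x_ref|| = %e\n', norm(x - x_ref));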
15 / 21
Augmented System Solver of McSML
AT=sm_trans(A);
P=sf_mmd_as(A,AT,H,F);                  % minimum degree ordering for the augmented system
[L,SINFO]=sf_scc_as(A,AT,H,F,P);        % builds the data structures for L and the supernode structure (SINFO)
D=nf_aug(A,AT,H,F,P,L,SINFO,Rp,Rd);     % builds [-H A'; A F] + [-Rp 0; 0 Rd]
nf_scc_as(L,SINFO,D);                   % performs the numerical factorization
x=nf_sub_as(A,AT,H,F,L,SINFO,D,P,rhs);  % solves the equation and refines the solution
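The same kind of cross-check works here, forming the regularized matrix that nf_aug builds (hypothetical, for testing only):

  M = [-(H + Rp), A'; A, F + Rd];  % i.e. [-H A'; A F] + [-Rp 0; 0 Rd]
  x_ref = M \ rhs;
  fprintf('||x - x_ref|| = %e\n', norm(x - x_ref));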
16 / 21
Performance of McSML – Accuracy Benchmarking
Solution Error 1 = ‖b − AAᵀx‖∞ / ‖b‖∞,  Solution Error 2 = ‖b − AAᵀx‖₂

Problem     McSMLne                      LIPSOL (ORNL Cholesky)       MATLAB (LAPACK-DPOTRF)
            Error 1       Error 2        Error 1       Error 2        Error 1       Error 2
qap15       2.026405E-12  7.620719E-09   4.664558E-08  1.120950E-04   1.824042E-13  2.108489E-09
cre_d       2.185784E-15  3.458121E-07   3.715834E-15  4.624334E-07   5.607903E-15  3.656559E-06
ken_13      2.963212E-13  6.074663E-07   1.235207E-13  2.726310E-07   2.846440E-13  5.722383E-07
pds_20      1.302376E-14  3.723433E-08   5.002513E-15  1.846481E-08   9.263912E-15  3.371902E-08
qap12       2.762969E-12  4.836873E-09   3.819078E-09  5.972575E-06   4.186702E-12  2.697597E-08
pds_10      4.556325E-15  1.229784E-08   2.779174E-15  1.540756E-08   4.614586E-15  2.471212E-08
osa_60      1.608902E-13  3.667600E-04   1.062017E-11  2.755767E-02   9.793590E-12  2.436846E-02
cre_b       1.994308E-15  5.262877E-07   1.994308E-15  5.997905E-07   2.446411E-13  2.453067E-04
stocfor3    2.118858E-13  8.061074E-09   1.850241E-13  6.462990E-09   2.457402E-13  7.965795E-09
maros_r7    2.895742E-15  4.792537E-10   3.474891E-15  4.535023E-10   2.779912E-15  4.963837E-10
dfl001      6.693688E-14  7.146214E-06   4.457410E-14  7.685944E-06   3.434812E-14  1.264779E-05
osa_30      5.873346E-14  5.723799E-05   2.874662E-12  3.308398E-03   2.700201E-12  3.061775E-03
pilot87     1.323670E-14  2.168695E-10   1.613404E-14  2.526719E-10   3.186449E-14  3.405521E-10
80bau3b     3.993961E-16  6.414922E-12   1.774429E-15  1.293002E-11   1.927925E-15  1.305830E-11

Selected Netlib Problems, IBM RS/6000 W/S
17 / 21
Performance of McSML – Speed Benchmarking
Times in seconds; Total* = Symbolic Comp. Time + 20 × Numeric Comp. Time.

Problem     McSMLne                      LIPSOL (ORNL Cholesky)       MATLAB (LAPACK-DPOTRF)
            Sym.    Num.    Total*       Sym.    Num.    Total*       Sym.   Num.    Total*
qap15       2.98    140.69  2,816.78     9.46    70.23   1,414.06     1.31   314.91  6,299.51
cre_d       0.81    1.47    30.21        2.17    1.66    35.37        1.60   146.63  2,934.20
ken_13      1.62    0.29    7.42         0.81    0.56    12.01        0.81   0.77    16.21
pds_20      8.00    65.92   1,326.40     210.08  24.90   708.08       1.77   818.06  16,362.97
qap12       0.76    16.58   332.36       1.52    8.73    176.12       0.40   52.17   1,043.80
pds_10      1.69    7.99    161.49       25.58   3.16    88.78        0.61   9.66    193.81
osa_60      2.73    1.51    32.93        52.86   3.65    125.86       51.46  2.46    100.66
cre_b       0.86    2.01    41.06        1.74    1.87    39.14        1.70   218.29  4,367.50
stocfor3    0.61    0.11    2.81         0.15    0.45    9.15         0.45   0.54    11.25
maros_r7    0.52    2.45    49.52        0.40    2.50    50.40        1.35   5.79    117.15
dfl001      0.81    8.56    172.01       1.15    3.40    69.15        0.29   54.32   1,086.69
osa_30      0.92    0.57    12.32        9.95    1.50    39.95        9.82   0.95    28.82
pilot87     0.27    0.92    18.67        0.82    1.02    21.22        0.73   2.05    41.73
80bau3b     0.10    0.07    1.50         0.09    0.11    2.29         0.13   0.09    1.93

Selected Netlib Problems, IBM RS/6000 W/S
18 / 21
Performance of McSML – N.E. vs. A.S.
Selected Netlib Problems, IBM RS/6000 W/S
For each problem, the slide reports per-phase times (in seconds) for the augmented system solver (McSMLas: ordering*, symbolic factorization, building the augmented system, numeric factorization, solve with refinement – refinement iterations in parentheses – and accuracy) and the same phases for the normal equation solver (McSMLne).

McSMLne:
Problem     Ord.   NE build   Accuracy
qap15       1.83   0.39       7.620719E-09
cre_d       0.47   0.09       3.458121E-07
ken_13      0.37   0.05       6.074663E-07
pds_20      4.58   0.45       3.723433E-08
qap12       0.42   0.10       4.836873E-09
pds_10      1.01   0.11       1.229784E-08
osa_60      1.35   0.56       3.667600E-04
cre_b       0.50   0.10       5.262877E-07
stocfor3    0.15   0.03       8.061074E-09
maros_r7    0.28   0.22       4.792537E-10
dfl001      0.51   0.09       7.146214E-06
osa_30      0.22   0.52       5.723799E-05
pilot87     0.10   0.15       2.168695E-10
80bau3b     0.00   0.03       6.414922E-12

[Table: the remaining per-phase columns (S.F., N.F., S.R.) for McSMLne and all McSMLas columns (Ord.*, S.F., AS, N.F., S.R., Accuracy). The McSMLas accuracies range from 8.280160E-11 to 4.787706E+00 (one clear failure), with 3 to 37 refinement iterations per solve.]

* Tentative version of minimum degree ordering for augmented system
19 / 21
Performance of McSML – McIPM with McSML
McIPM with McSMLne:
Problem     It.  Time  Primal Feas.  Dual Feas.    Duality Gap
lotfi       23*  0.98  1.51093E-06   9.61793E-08   5.89444E-08
scagr7      3*   0.74  3.27941E+03   2.00629E-11   7.39163E-09
stocfor1    18*  0.44  1.69120E-06   2.59837E-07   9.22093E-08
share2b     13*  0.36  5.82778E-05   1.59447E-11   2.85713E-08
vtp_base    17   0.77  7.08901E-09   6.11396E-12   1.21127E-09
recipe      12   0.45  1.44868E-08   3.60120E-09   7.17196E-10
sc205       13   0.48  1.08636E-10   1.30478E-05   6.31605E-08
adlittle    21*  0.55  6.31986E-05   3.17591E+02   1.34671E+00
sc105       12   0.31  1.32147E-09   4.65285E-05   6.13231E-08
sc50a       11   0.23  1.19769E-09   1.14794E-10   1.81016E-08
sc50b       10   0.24  2.11146E-09   3.97911E-09   1.88798E-11
blend       11   0.35  1.65276E-09   3.08955E-09   3.89770E-09
kb2         17   0.48  2.96139E-11   3.44373E-09   2.17643E-09
afiro       10   0.67  3.75415E-09   3.33908E-09   1.43719E-10

McIPM with LIPSOL's equation solver:
Problem     It.  Time  Primal Feas.  Dual Feas.    Duality Gap
lotfi       23   1.07  2.07036E-12   1.99680E-10   2.65988E-09
scagr7      15   0.47  8.44625E-09   2.02139E-11   7.46069E-09
stocfor1    15   0.47  2.28081E-10   1.03668E-08   7.69573E-09
share2b     12   0.40  8.67057E-09   1.59437E-11   2.85714E-08
vtp_base    17   1.02  7.15516E-09   8.48931E-13   1.21127E-09
recipe      12   0.53  1.44868E-08   4.24861E-09   7.28516E-10
sc205       13   0.51  7.68271E-12   1.75965E-09   9.67817E-12
adlittle    14   0.46  8.43804E-10   1.15036E-09   2.33736E-11
sc105       12   0.38  1.31704E-09   6.43470E-10   2.96888E-11
sc50a       11   0.30  1.06181E-09   1.14724E-10   1.80975E-08
sc50b       10   0.27  1.95050E-09   3.97786E-09   1.88593E-11
blend       11   0.81  1.04876E-09   2.93585E-09   3.89769E-09
kb2         17   0.41  2.94229E-11   3.31878E-09   2.17643E-09
afiro       10   0.20  8.93729E-10   7.34164E-10   1.43661E-10

* Numerical difficulty occurred.
Selected Netlib Problems, IBM RS/6000 W/S
20 / 21
Conclusion and Future Work
• We implemented sparse linear equation solvers for IPMs.
• McIPMne is competitive with LIPSOL's linear equation solver on most Netlib problems, but it is slower on some big problems. We need to improve the ordering quality.
• McIPMas is not numerically stable yet, and refinement by PCG does not work well either. We need to (1) check the operations in factorization and substitution and (2) implement the Bunch-Parlett method.
• On some problems, McIPM with McSML fails due to numerical instability in the last iterations (H = X⁻¹Z has a big range in values).
21 / 21
Selected References
Books
• Duff, I. S., Erisman, A. M. and Reid, J. K. (1989) Direct Methods for Sparse Matrices, Oxford University Press, New York.
• George, A. and Liu, J. W. H. (1981) Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs.

Papers
• Gondzio, J. (1993) Implementing Cholesky Factorization for IPMs of LP, Optimization.
• Altman, A. and Gondzio, J. (1998) Regularized Symmetric Indefinite Systems in IPMs for Linear and Quadratic Optimization, Opt. Methods & Soft.
• Mészáros, C. (1996) Fast Cholesky Factorization for IPMs of LP, Comp. & Math. Appl.
• Mészáros, C. (1997) The Augmented System Variant of IPMs in Two-Stage Stochastic LP Computation, EJOR.
• Maros, I. and Mészáros, C. (1998) The Role of the Augmented System in IPMs, EJOR.
• Vanderbei, R. J. (1995) Symmetric Quasidefinite Matrices, SIAM J. Opt.
• Liu, J. W. H. (1990) The Multifrontal Method for Sparse Matrix Solution: Theory and Practice, SIAM Review.
• Fourer, R. and Mehrotra, S. (1993) Solving Symmetric Indefinite Systems in an IPM for LP, Math. Prog.
• Duff, I. S. and Reid, J. K. (1995) Exploiting Zeros on the Diagonal in the Direct Solution of Indefinite Sparse Symmetric Linear Systems, ACM Trans. Math. Soft.