Fill Reduction Algorithm Using Diagonal Markowitz Scheme with Local Symmetrization
Patrick Amestoy
ENSEEIHT-IRIT, France
Xiaoye S. Li
Esmond Ng
Lawrence Berkeley National Laboratory
SIAM CSE03, Feb 10-13, 2003
Contents
Motivation
Graph models for Gaussian elimination
Minimum priority metrics
Experimental results
Summary
Add: Runtime and space complexity
Motivation -- New Sparse LU Factorization Algorithms
• Inexpensive pre/post-processing
  - Equilibration (or scaling)
  - Pre-permute rows or columns of A to maximize its diagonal: find a matching with maximum weight for the bipartite graph of A. Example: MC64 [Duff/Koster ‘99]
  - Iterative refinement
• GESP (static pivoting) [Li/Demmel ‘98, SuperLU_DIST]
  - Pivots are chosen from the diagonal
  - Allow half-precision perturbation of small diagonals
• Unsymmetrized multifrontal [Amestoy/Puglisi ‘00, MA41_NEW]
  - Prefer diagonal pivoting, but threshold pivoting is possible
  - Allow unsymmetric fronts, but dependency graph is still a tree
• Diagonal is (almost) good
[Figure labels: Struct(L’), Struct(U)]
Existing Ordering Strategies to Preserve Sparsity
• Symmetric ordering algorithms on A’+A
  - Greedy algorithms, e.g., minimum degree, minimum deficiency, etc.
  - Graph partitioning
  - Hybrid
• Problem: unsymmetric structure is not respected!
Structural Gaussian Elimination -- Symmetric Case
[Figure: matrix and undirected graph with vertices 1, i, j, k, before and after eliminating vertex 1]
• Undirected graph
• After a vertex is eliminated, all its neighbors become a clique
• The edges of the clique are the potential fills (upper bound!)
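The clique rule can be sketched in a few lines of Python. This is only an illustration: the adjacency-set representation and the function name `eliminate` are assumptions made here, and the example assumes that i, j, k are not connected to each other before vertex 1 is eliminated.

```python
from itertools import combinations

def eliminate(adj, v):
    """Eliminate vertex v from an undirected graph stored as adjacency sets.

    The neighbors of v are made pairwise adjacent (a clique). The edges that
    had to be added are returned; they are the potential fills.
    """
    nbrs = adj.pop(v)
    for u in nbrs:
        adj[u].discard(v)
    fill = set()
    for a, b in combinations(sorted(nbrs), 2):
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            fill.add((a, b))
    return fill

# Hypothetical example: vertex 1 is adjacent to i, j, k only.
adj = {1: {"i", "j", "k"}, "i": {1}, "j": {1}, "k": {1}}
fill = eliminate(adj, 1)
print(sorted(fill))
```

Because the model is purely structural, numerical cancellation is ignored, which is why the clique edges are an upper bound on the actual fill.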
Structural Gaussian Elimination -- Unsymmetric Case
[Figure: matrix and bipartite graph with pivot 1, row vertices r1, r2 and column vertices c1, c2, c3, before and after eliminating 1]
• Bipartite graph
• After a vertex is eliminated, all the row & column vertices adjacent to it become fully connected – “bi-clique” (assuming diagonal pivot)
• The edges of the bi-clique are the potential fills (upper bound!)
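A minimal sketch of the bi-clique rule, assuming the matrix pattern is stored as one set of column indices per row plus the transposed sets. The names `transpose` and `eliminate_diagonal` and the small 6 x 6 pattern are hypothetical and chosen only for illustration.

```python
def transpose(rows):
    """Build the column-wise pattern {j: set of row indices} from the row-wise one."""
    cols = {}
    for i, js in rows.items():
        for j in js:
            cols.setdefault(j, set()).add(i)
    return cols

def eliminate_diagonal(rows, cols, p):
    """Eliminate the diagonal pivot (p, p) and return the fill positions (i, j).

    Every row with a nonzero in column p is connected to every column with a
    nonzero in row p, which forms the bi-clique.
    """
    pivot_cols = rows.pop(p) - {p}   # columns adjacent to the pivot row
    pivot_rows = cols.pop(p) - {p}   # rows adjacent to the pivot column
    for i in pivot_rows:
        rows[i].discard(p)
    for j in pivot_cols:
        cols[j].discard(p)
    fill = set()
    for i in pivot_rows:
        for j in pivot_cols:
            if j not in rows[i]:
                rows[i].add(j)
                cols[j].add(i)
                fill.add((i, j))
    return fill

# Hypothetical pattern: pivot 0, rows 1 and 2 touch column 0,
# and columns 3, 4, 5 touch row 0.
rows = {0: {0, 3, 4, 5}, 1: {0, 1}, 2: {0, 2}, 3: {3}, 4: {4}, 5: {5}}
cols = transpose(rows)
fill = eliminate_diagonal(rows, cols, 0)
print(len(fill))
```

In this example the pivot row has 4 entries and the pivot column has 3, so the bi-clique has (4 - 1)(3 - 1) = 6 edges, all of which are fills here.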
Ordering Algorithms Revisited
• Markowitz [1957] for unsymmetric matrices
  - At step k, pick pivot $a_{ij}$ in the trailing submatrix so that:
      it has minimum $(r_i - 1)(c_j - 1)$, where $r_i$ and $c_j$ are the nonzero counts of row i and column j, and
      it is bounded by a numerical threshold
  - This bounds the size of the rank-1 update matrix
  - Expensive to implement because it is mixed with numerical consideration
  - Examples: MA48 (HSL), etc.
• “Restricted” Markowitz -- only look ahead a few candidate columns (rows) with the lowest degrees [Zlatev ‘80]
• Minimum degree [Tinney/Walker ‘67]
  - Special case of Markowitz for SPD systems
  - Efficient implementation, because:
      the diagonal is stable as numerical pivot
      the quotient graph is used as a compact representation without regard to numerical values
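The Markowitz selection rule with a numerical threshold can be sketched as follows. This is an illustration under stated assumptions, not the MA48 implementation: the function name `markowitz_pivot`, the dictionary storage, and the specific test "|a_ij| >= u times the largest magnitude in column j" with u = 0.1 are all choices made here for the example.

```python
def markowitz_pivot(A, u=0.1):
    """Pick a pivot from the trailing submatrix A = {(i, j): value}.

    Among entries that pass the (assumed) threshold test
    |a_ij| >= u * max_k |a_kj|, return the one with the smallest
    Markowitz cost (r_i - 1) * (c_j - 1), together with that cost.
    """
    row_count, col_count, col_max = {}, {}, {}
    for (i, j), a in A.items():
        row_count[i] = row_count.get(i, 0) + 1
        col_count[j] = col_count.get(j, 0) + 1
        col_max[j] = max(col_max.get(j, 0.0), abs(a))

    best, best_cost = None, None
    for (i, j), a in sorted(A.items()):      # sorted for a deterministic tie-break
        if abs(a) < u * col_max[j]:
            continue                         # numerically unacceptable pivot
        cost = (row_count[i] - 1) * (col_count[j] - 1)
        if best_cost is None or cost < best_cost:
            best, best_cost = (i, j), cost
    return best, best_cost

# Hypothetical 3 x 3 example; entry (2, 2) is sparse-friendly but tiny.
A = {(0, 0): 4.0, (0, 1): 1.0, (0, 2): 1.0,
     (1, 0): 1.0, (1, 1): 3.0,
     (2, 0): 1.0, (2, 2): 0.001}
pivot, cost = markowitz_pivot(A)
print(pivot, cost)
```

The sketch rescans every entry at each call, which hints at why mixing structural and numerical criteria is expensive in practice.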
Simulation Result
• Order(A) vs. Order(A’+A) (Markowitz vs. min degree), with diagonal pivoting

                                         Mean fill ratio   Mean flops ratio
88 unsymmetric matrices                  0.90              0.79
54 very unsymmetric (symmetry <= 0.5)    0.85              0.56
Quotient Graph – Symmetric Case
• Elements -- representative nodes of the connected components in the eliminated subgraph
• Variables -- uneliminated nodes
• Current pivot p:
  - If variable v is adjacent to e1, it will be adjacent to p
  - e1 can be absorbed by p
  - p is representative of conn. comp. {e1, e2, p}
  - $L_p = A_p \cup L_{e_1} \cup L_{e_2}$
[Figure: matrix with eliminated nodes e1 and e2, pivot p, and variable v; row p consists of the element list {e1, e2} and the variable list $A_p$]
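As a rough sketch of this step, the structure of the pivot can be formed as the union of its variable list and the variable lists of its adjacent elements, after which those elements are absorbed. The dictionary names (`var_adj`, `elem_adj`, `elem_struct`) and the function `form_element` are hypothetical, removing p itself from the union is an assumption of this sketch, and the pruning of variable-variable edges covered by the new element is omitted.

```python
def form_element(p, var_adj, elem_adj, elem_struct):
    """Turn pivot variable p into an element of the quotient graph.

    var_adj[v]     : variables adjacent to variable v (the variable list)
    elem_adj[v]    : elements adjacent to variable v (the element list)
    elem_struct[e] : variables belonging to element e
    Returns the variable set of the new element p.
    """
    Lp = set(var_adj[p])
    for e in elem_adj[p]:
        Lp |= elem_struct[e]          # variables reached through e, path length 2
    Lp.discard(p)

    absorbed = set(elem_adj[p])
    for e in absorbed:
        del elem_struct[e]            # e is absorbed by p
    elem_struct[p] = Lp

    for v in Lp:
        elem_adj[v] = (elem_adj[v] - absorbed) | {p}
        var_adj[v].discard(p)         # p is no longer a variable
    del var_adj[p], elem_adj[p]
    return Lp

# Hypothetical state: elements e1 and e2 were eliminated earlier.
var_adj = {"p": {"x"}, "x": {"p"}, "v": set(), "w": set()}
elem_adj = {"p": {"e1", "e2"}, "v": {"e1"}, "w": {"e2"}, "x": set()}
elem_struct = {"e1": {"p", "v"}, "e2": {"p", "w"}}
Lp = form_element("p", var_adj, elem_adj, elem_struct)
print(sorted(Lp))
```

After the call, variable v, which was adjacent to e1, is adjacent to p instead, and p alone represents the component {e1, e2, p}.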
Quotient Graph -- Unsymmetric Case
• Current pivot p:
  - p cannot absorb e1 or e2
  - must search path e1, e2
• Difficulty: path length may be greater than 2!
[Formulas: relations among the L and U structures of p, e1, e2, and v]
[Figure: matrix with eliminated nodes e1 and e2, pivot p, and variable v; nonzeros marked x]
Quotient Graph -- “Local Symmetrization”
• Current pivot p:
  - p can absorb e1 and e2
  - p is representative of conn. comp. {e1, e2, p}
• Advantage: path length bounded by 2!
• Disadvantage:
  - Lose some asymmetry
  - More fill
[Formulas: relations among the L and U structures of p, e1, and e2]
[Figure: matrix with eliminated nodes e1 and e2, pivot p, and variable v; entries marked x and s]
Cost of Implementation

             G(A) via Reachable Set        Quotient Graph                                      Elim. Graph G(L+U)
Symmetric    Long search path, in-place    Path length 2, in-place [George/Liu ‘81]            Not in-place
Unsym.                                     Long search path, in-place [Pagallo/Maulino ‘83]
Local Sym.                                 Path length 2, in-place

Elimination models can be implemented using standard graphs or quotient graphs, with different costs in time & space.
Minimum Priority Metrics
• Metrics are based on “approximate degree” in the sense of AMD, and can be implemented efficiently
• Almost the same cost using various metrics:
  - Based on row & column counts: PRODUCT (a.k.a. Markowitz), SUM, MIN, MAX, etc.
  - Minimum fill: areas associated with the existing cliques are deducted
  - ...
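The count-based metrics differ only in how the row and column counts of a candidate diagonal pivot are combined, which is one way to see why they cost almost the same. The sketch below assumes exact off-diagonal counts instead of approximate degrees, and the names `METRICS` and `select_pivot` are hypothetical.

```python
# Each metric maps (row count, column count) of a diagonal candidate to a score.
# Counts are assumed to exclude the diagonal entry, so PRODUCT is r * c.
METRICS = {
    "PRODUCT": lambda r, c: r * c,   # Markowitz cost
    "SUM": lambda r, c: r + c,
    "MIN": min,
    "MAX": max,
}

def select_pivot(row_count, col_count, metric):
    """Return the candidate with the smallest score (smallest index wins ties)."""
    score = METRICS[metric]
    return min(sorted(row_count), key=lambda k: score(row_count[k], col_count[k]))

# Hypothetical counts for three candidate pivots.
row_count = {1: 1, 2: 3, 3: 2}
col_count = {1: 6, 2: 3, 3: 2}
for name in METRICS:
    print(name, select_pivot(row_count, col_count, name))
```

In this small example PRODUCT, SUM, and MAX all pick candidate 3, while MIN picks candidate 1 because it ignores that candidate's long column.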
Preliminary Results with Local Symmetrization
• Matrices: 98 unsymmetric in structure
• Metrics: based on row/column counts or fill
• Solvers:
  - MA41_NEW: unsymmetrized multifrontal (local symmetrization ordering is ideal for this solver)
  - SuperLU_DIST: GESP
Compare Different Metrics
Solver: MA41_NEW
Average fill ratio using various metrics with respect to Markowitz (product of row & col counts)

Metric                      Mean fill ratio
SUM row & col counts        0.999
MAX row & col counts        6.079
MIN row & col counts        15.94
Approx. min fill (AMF1)     0.965
Approx. min fill (AMF4)     0.959
Compare with AMD(A’+A) using Min Fill -- All Unsymmetric
MA41_NEW
SuperLU_DIST

               Fill ratio     Flops ratio
Mean           0.96           0.92
Best / worst   0.41 / 1.27    0.13 / 2.38

               Fill ratio     Flops ratio
Mean           0.96           0.96
Best / worst   0.38 / 2.36    0.009 / 6.00
Compare with AMD(A’+A) using Min Fill -- Very Unsymmetric
MA41_NEW
SuperLU_DIST

               Fill           Flops
Mean           0.88           0.77
Best / worst   0.38 / 1.18    0.009 / 1.69

               Fill           Flops
Mean           0.95           0.89
Best / worst   0.41 / 1.27    0.13 / 2.38
Summary
• First implementation based on the BQG (bipartite quotient graph) model
  - Features: supervariable, element absorption, mass elimination
  - Using approximate degree (degree upper bound)
• Tried various metrics on a large collection of matrices
  - PRODUCT, SUM, MIN-FILL, etc.
  - Not a single one is universally best; MIN-FILL is often better
• Local symmetrization
  - Cheaper to implement, harder to understand behavior
  - Especially suitable for unsymmetrized multifrontal, also benefits GESP
  - Respectable gain for very unsymmetric matrices
Summary (cont’d)
• Results for very unsymmetric matrices

                     Local Sym.   Unsym. (simulation)
Fill reduction       0.88         0.85
Flops reduction      0.77         0.56

• Future work
  - Work underway for a fully unsymmetric version
  - Extend to graph partitioning strategy
The End
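Example
[Figure: a 7 x 7 sparse matrix A with diagonal entries labeled 1 to 7 and off-diagonal nonzeros marked x, and its bipartite graph G(A) with row vertices 1 to 7 and column vertices 1 to 7]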