On the role of Linear Algebra in the development of Interior Point algorithms
and software
Due Giorni di Algebra Lineare Numerica, Bologna, 6-7 March 2007
D. di Serafino, Second University of Naples, [email protected]
joint work with
S. Cafieri, École Polytechnique, Paris
M. D’Apuzzo, V. De Simone, Second University of Naples
G. Toraldo, University of Naples “Federico II”
D. di Serafino On the role of Linear Algebra in the development of IP algorithms and software
Outline
Motivations
KKT system and IP methods
Iterative solution of KKT system
Constraint Preconditioner
Adaptive termination control
Numerical experiments
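Motivations
KKT system
\[
\begin{bmatrix} H & J^T \\ J & -D \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} c \\ d \end{bmatrix},
\qquad H \in \mathbb{R}^{n \times n},\; J \in \mathbb{R}^{m \times n},\; D \in \mathbb{R}^{m \times m}
\]
H symmetric, D diagonal positive semidefinite, J full rank: the matrix is symmetric indefinite
ubiquitous in optimization algorithms
critical issue in an effective implementation of optimization methods
Focus on IP methods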
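As a small illustration of why the KKT matrix is called symmetric indefinite, the following NumPy sketch assembles a toy instance and counts the signs of its eigenvalues. The sizes, the random data and the helper names `kkt_matrix` and `inertia` are illustrative assumptions; H is taken positive definite here, which is stronger than what the slide requires.

```python
import numpy as np

def kkt_matrix(H, J, D):
    """Assemble K = [[H, J^T], [J, -D]]."""
    return np.block([[H, J.T], [J, -D]])

def inertia(K, tol=1e-10):
    """Return (#positive, #negative, #zero) eigenvalues of a symmetric matrix."""
    eigs = np.linalg.eigvalsh(K)
    return (int(np.sum(eigs > tol)),
            int(np.sum(eigs < -tol)),
            int(np.sum(np.abs(eigs) <= tol)))

rng = np.random.default_rng(0)
n, m = 6, 3
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)        # symmetric positive definite (assumption)
J = rng.standard_normal((m, n))    # full row rank with probability 1
D = np.diag([0.0, 0.5, 2.0])       # diagonal positive semidefinite

K = kkt_matrix(H, J, D)
print(inertia(K))                  # n positive and m negative eigenvalues
```

With these assumptions K is congruent to diag(H, -(D + J H^{-1} J^T)), so it has n positive and m negative eigenvalues.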
Motivations
Development of an IP-based software package for solving large-scale (convex) Quadratic Programming problems
PRQP - Potential Reduction software for Quadratic Programming problems
Primal problem:
\[
\min\; q(x) = \tfrac{1}{2} x^T Q x + c^T x
\quad \text{s.t.} \quad
J_i x - s = b, \quad J_e x = d, \quad x + v = u, \quad (x, s, v) \ge 0
\]
Dual problem:
\[
\max\; p(x, y, \lambda, t) = -\tfrac{1}{2} x^T Q x + b^T y + d^T \lambda - u^T t
\quad \text{s.t.} \quad
-Qx + J_i^T y + J_e^T \lambda + z - t = c, \quad (x, y, z, t) \ge 0
\]
Q \in \mathbb{R}^{n \times n} spsd, J_i \in \mathbb{R}^{m_1 \times n}, J_e \in \mathbb{R}^{m_2 \times n}, m = m_1 + m_2
x, y, \lambda, t primal and dual variables; s, v, z slack variables
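The primal-dual pair can be sanity-checked numerically: at a point that is feasible for both problems, the difference q - p equals x^T z + s^T y + v^T t. The sketch below assumes the sign conventions J_i x - s = b, x + v = u and -Qx + J_i^T y + J_e^T λ + z - t = c, and uses `lam` as a name for the equality multipliers; the data are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m1, m2 = 5, 3, 2

A = rng.standard_normal((n, n))
Q = A @ A.T                         # symmetric positive semidefinite
Ji = rng.standard_normal((m1, n))   # inequality constraint matrix
Je = rng.standard_normal((m2, n))   # equality constraint matrix

# Strictly positive primal, dual and slack variables; lam is free.
x, v, z, t = (rng.uniform(0.5, 2.0, n) for _ in range(4))
s, y = (rng.uniform(0.5, 2.0, m1) for _ in range(2))
lam = rng.standard_normal(m2)

# Choose the problem data so that the point is primal and dual feasible.
b = Ji @ x - s
d = Je @ x
u = x + v
c = -Q @ x + Ji.T @ y + Je.T @ lam + z - t

q = 0.5 * x @ Q @ x + c @ x                       # primal objective
p = -0.5 * x @ Q @ x + b @ y + d @ lam - u @ t    # dual objective
gap = x @ z + s @ y + v @ t                       # complementarity products
print(q - p, gap)
```

The two printed numbers agree up to rounding, which is the duality gap that the potential function drives to zero.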
KKT system and IP methods
OPTIMIZATION → LINEAR ALGEBRA
type of problem, formulation of the method → different blocks in the KKT system
large-scale problems → sparse direct or iterative solvers
local convexity of the problem → KKT matrix inertia control
accuracy requirements → inexact solution of the KKT system, adaptive stopping criteria
increasing ill conditioning as the iterates approach the solution → preconditioning techniques
Infeasible primal-dual PR framework
potential function (Tanabe, 1988; Todd & Ye, 1990)
\[
\min\; \phi(x, y, s, z, v, t) = \rho \ln(x^T z + s^T y + v^T t)
 - \sum_{i=1}^{n} \ln(x_i z_i) - \sum_{j=1}^{m} \ln(s_j y_j) - \sum_{i=1}^{n} \ln(v_i t_i)
\]
\[
\text{s.t.} \quad \tilde{w} = (x, y, s, z, v, t) > 0, \qquad \frac{\sigma}{\sigma_0} \le \frac{\Delta}{\Delta_0}
\]
\Delta = x^T z + s^T y + v^T t: duality gap; the logarithmic sums are barrier functions
(Kojima, Mizuno & Todd, 1995): reduce the infeasibility at least at the same rate as the duality gap
\[
\sigma = \| (r_{p_1}, r_{p_2}, r_{p_3}, r_d) \|_2
\]
primal infeasibility: r_{p_1} = J_i x - s - b, r_{p_2} = x + v - u, r_{p_3} = J_e x - d
dual infeasibility: r_d = -Qx + J_i^T y + J_e^T \lambda + z - t - c
\Delta_0, \sigma_0 corresponding to the initial point w_0
\sigma = 0: feasible version
PR basic steps
1. Given the current interior iterate w = (x, y, s, z, \lambda, v, t), apply a Newton step to the perturbed KKT conditions; the Newton system involves G'(w), the Jacobian of the KKT conditions, and a perturbation parameter
2. Update w along the Newton direction, with a suitably chosen step length
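Newton system reduction: KKT system
\[
\begin{bmatrix} H & J^T \\ J & -D \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\]
H = Q + E, E = X^{-1} Z + V^{-1} T; E accounts for bound constraints
equality + bound constraints:
\[
\begin{bmatrix} H & J_e^T \\ J_e & 0 \end{bmatrix}
\]
inequality + bound constraints:
\[
\begin{bmatrix} H & J_i^T \\ J_i & -D_i \end{bmatrix}, \qquad D_i \text{ pd}
\]
ineq. + eq. + bound constraints:
\[
\begin{bmatrix} H & J_i^T & J_e^T \\ J_i & -D_i & 0 \\ J_e & 0 & 0 \end{bmatrix}, \qquad D_i \text{ pd}
\]
bound constraints only (J = I): condensed system in x_1, with x_2 = V^{-1} T (x_1 - b_2) recovered through a diagonal scaling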
Inherent ill conditioning
approaching the solution, some entries of D and E may become very large, producing an increasing ill conditioning in the matrix
\[
\begin{bmatrix} Q + E & J^T \\ J & -D \end{bmatrix},
\qquad E = X^{-1} Z + V^{-1} T,
\qquad D = \begin{bmatrix} Y^{-1} S & 0 \\ 0 & 0 \end{bmatrix}
\]
it is crucial to use preconditioning strategies
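The growth of the condition number can be observed on a toy example. The sketch below is an assumption-laden illustration, not the PRQP setting: it uses Q = I, a fixed 2x4 matrix J, D = 0, only the X^{-1} Z part of E, and complementarity products x_i z_i = mu, with two variables converging to their bound.

```python
import numpy as np

def kkt_for_mu(mu):
    """Toy KKT matrix [[Q + E, J^T], [J, 0]] for a barrier-like parameter mu."""
    Q = np.eye(4)
    J = np.array([[1.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0]])
    # Variables 0 and 1 are "active" (x -> 0, z stays O(1)); variables 2 and 3
    # are "inactive" (x stays O(1), z -> 0). In both cases x_i * z_i = mu.
    x = np.array([mu, mu, 1.0, 1.0])
    z = mu / x
    E = np.diag(z / x)               # E = X^{-1} Z
    return np.block([[Q + E, J.T], [J, np.zeros((2, 2))]])

for mu in (1e-2, 1e-4, 1e-6):
    print(f"mu = {mu:.0e}  cond(K) = {np.linalg.cond(kkt_for_mu(mu)):.2e}")
```

The entries of E associated with the active variables behave like 1/mu, so the condition number grows roughly like 1/mu as mu decreases.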
KKT system solvers: direct vs iterative
Direct solvers
Widely used in well-established IP software (e.g. Mosek, PCx, Loqo, OOQP, KNITRO-Interior/Direct, IPOPT)
Ill conditioning not a severe problem (S. Wright, 1997; M. Wright, 1998)
Computational cost may become prohibitive for large-scale problems
Iterative solvers
Increasing attention by the IP community in the last 15 years (implemented in KNITRO-Interior/CG, HOPDM, PRQP)
Require suitable preconditioning techniques
Adaptive termination criteria
Constraint Preconditioner (CP)
\[
K = \begin{bmatrix} H & J^T \\ J & -D \end{bmatrix},
\qquad
P = \begin{bmatrix} M & J^T \\ J & -D \end{bmatrix}
\]
Indefinite preconditioner with the same structure as the KKT matrix
M “simple” approximation to H such that M + J_i^T D_i^{-1} J_i is spd on ker(J_e). Common choice: M diagonal.
CP variants investigated by many researchers (Axelsson, 1979; Golub & Wathen, 1998; Luksan & Vlcek, 1998; Keller et al., 2000; Rozloznik & Simoncini, 2002; Durazzi & Ruggiero, 2003; Gondzio et al., 2004; Dollar et al., 2006-2007; di Serafino et al., 2007; Forsgren et al., 2007; …)
CP: spectral properties
P^{-1}K has at least m unit eigenvalues
If rank(D) = p, then P^{-1}K has additional m-p unit eigenvalues
The remaining eigenvalues are real positive (bounds are available)
(Keller et al., 2000; Durazzi & Ruggiero, 2003; Bergamaschi et al., 2004; Dollar, 2007)
When the iterate approaches the solution, if q entries of D get close to zero, then additional q eigenvalues tend to be clustered around 1
We expect the preconditioner to become more effective as the IP method progresses
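These spectral properties can be observed on a small dense example. The sketch below assumes H positive definite, M = diag(H) and D positive definite (so p = m and only the first m unit eigenvalues are guaranteed); the sizes and random data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 3
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)              # spd (1,1) block of K (assumption)
M = np.diag(np.diag(H))                  # common choice: M diagonal
J = rng.standard_normal((m, n))
D = np.diag(rng.uniform(0.5, 1.5, m))    # positive definite, so rank(D) = m

K = np.block([[H, J.T], [J, -D]])        # KKT matrix
P = np.block([[M, J.T], [J, -D]])        # constraint preconditioner

eigs = np.linalg.eigvals(np.linalg.solve(P, K))
n_unit = int(np.sum(np.abs(eigs - 1.0) < 1e-8))
print("unit eigenvalues:", n_unit, "of", n + m)
print("largest imaginary part:", np.max(np.abs(eigs.imag)))
print("smallest real part:", np.min(eigs.real))
```

In this setting the non-unit eigenvalues are those of the pencil (H + J^T D^{-1} J, M + J^T D^{-1} J), which is why they come out real and positive.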
CP: use of Conjugate Gradient (CG) method
Preconditioned Projected CG
(Gould, Hribar, Nocedal, 2001; Rozloznik, Simoncini, 2002; Cafieri, D’Apuzzo, De Simone, di Serafino, 2007; Dollar, 2007)
Starting guess (x_1^0, x_{21}^0, x_{22}^0) such that
\[
\begin{bmatrix} J_i & -D_i \\ J_e & 0 \end{bmatrix}
\begin{bmatrix} x_1^0 \\ x_{21}^0 \end{bmatrix}
=
\begin{bmatrix} b_{21} \\ b_{22} \end{bmatrix}
\]
CP+CG applied to the KKT system behaves as CG applied to a reduced system with matrix Z^T (H + J_i^T D_i^{-1} J_i) Z, using Z^T (M + J_i^T D_i^{-1} J_i) Z as preconditioner (Z basis of ker(J_e))
No breakdown
Convergence in at most n-m+p iterations, p = rank(D)
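A minimal dense sketch of a preconditioned projected CG iteration, in the style of Gould, Hribar and Nocedal, is given below for the simplest case of equality constraints only (D = 0). The function name `projected_pcg`, the relative stopping test and the use of a dense solve in place of a factorization of P are assumptions made for illustration; this is not the PRQP implementation.

```python
import numpy as np

def projected_pcg(H, J, M, c, d, tol=1e-10, max_iter=100):
    """Solve [[H, J^T], [J, 0]] [x; y] = [c; d] for x by projected preconditioned CG."""
    n, m = H.shape[0], J.shape[0]
    P = np.block([[M, J.T], [J, np.zeros((m, m))]])   # constraint preconditioner

    def project(r):
        # Apply P^{-1} to (r, 0); the first n entries lie in ker(J).
        return np.linalg.solve(P, np.concatenate([r, np.zeros(m)]))[:n]

    x = J.T @ np.linalg.solve(J @ J.T, d)   # starting guess with J x = d
    r = H @ x - c
    g = project(r)
    p = -g
    rg = r @ g
    rg0 = rg
    for it in range(max_iter):
        if rg <= tol * rg0:                 # relative test on r^T g
            return x, it
        Hp = H @ p
        alpha = rg / (p @ Hp)
        x = x + alpha * p                   # the step stays in ker(J)
        r = r + alpha * Hp
        g = project(r)
        rg_new = r @ g
        p = -g + (rg_new / rg) * p
        rg = rg_new
    return x, max_iter

rng = np.random.default_rng(3)
n, m = 8, 3
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)
J = rng.standard_normal((m, n))
M = np.diag(np.diag(H))
c, d = rng.standard_normal(n), rng.standard_normal(m)

x, its = projected_pcg(H, J, M, c, d)
K = np.block([[H, J.T], [J, np.zeros((m, m))]])
x_direct = np.linalg.solve(K, np.concatenate([c, d]))[:n]
print(its, np.linalg.norm(x - x_direct))
```

Every iterate satisfies J x = d because the starting guess does and all search directions are projected into ker(J); this is the role of the starting-guess condition on the slide.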
Building CP
Explicit
LBL^T factorization of P (B block diagonal with 1x1 or 2x2 blocks)
LBL^T factorization of the Schur complement of -M
\[
P = \begin{bmatrix} M & J^T \\ J & -D \end{bmatrix}
= \begin{bmatrix} I & 0 \\ J M^{-1} & L_0 \end{bmatrix}
\begin{bmatrix} M & 0 \\ 0 & -B_0 \end{bmatrix}
\begin{bmatrix} I & M^{-1} J^T \\ 0 & L_0^T \end{bmatrix},
\qquad S = D + J M^{-1} J^T = L_0 B_0 L_0^T
\]
Implicit
Schilders factorization of P (Dollar et al., 2006)
May still account for a large part of the computational cost!
“Requiring a factorization of P may still be considered a disadvantage, and methods which avoid this are urgently needed” (Gould, Orban, Toint, Acta Numerica, 2005)
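The Schur-complement route can be sketched as follows for diagonal M and D. The helper name `make_cp_solver` is hypothetical, and a dense Cholesky factorization is swapped in for the sparse LBL^T factorization mentioned on the slide, which is legitimate here only because S = D + J M^{-1} J^T is positive definite when M is positive definite and J has full row rank.

```python
import numpy as np

def make_cp_solver(m_diag, J, d_diag):
    """Return a function applying P^{-1}, P = [[M, J^T], [J, -D]], M and D diagonal."""
    JMinv = J / m_diag                     # J M^{-1} (scales the columns of J)
    S = np.diag(d_diag) + JMinv @ J.T      # S = D + J M^{-1} J^T
    L = np.linalg.cholesky(S)              # factorized once, reused at every application

    def apply(r1, r2):
        rhs = JMinv @ r1 - r2              # S w = J M^{-1} r1 - r2
        w = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        u = (r1 - J.T @ w) / m_diag        # u = M^{-1} (r1 - J^T w)
        return u, w

    return apply

rng = np.random.default_rng(4)
n, m = 7, 3
m_diag = rng.uniform(1.0, 3.0, n)
d_diag = np.array([0.0, 0.5, 1.0])
J = rng.standard_normal((m, n))
r1, r2 = rng.standard_normal(n), rng.standard_normal(m)

apply_cp = make_cp_solver(m_diag, J, d_diag)
u, w = apply_cp(r1, r2)

P = np.block([[np.diag(m_diag), J.T], [J, -np.diag(d_diag)]])
ref = np.linalg.solve(P, np.concatenate([r1, r2]))
print(np.linalg.norm(np.concatenate([u, w]) - ref))
```

Only the m x m matrix S is factorized, but forming and factorizing it is still the dominant cost, which is the point made on the slide.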
Reducing the cost of building CP
\[
P = \begin{bmatrix} M & J^T \\ J & -D \end{bmatrix}
\]
Always set the (2,2)-block to 0 (Dollar, Gould, Schilders, Wathen, 2007)
Approximate J by dropping entries below a prescribed tolerance and outside a fixed band (case D = 0) (Bergamaschi, Gondzio, Venturin, Zilli, 2006)
Approximate the Schur complement J M^{-1} J^T via incomplete Cholesky factorization (case D = 0) (Benzi, Simoncini, 2006)
Reuse CP for some IP iterations (Cafieri, D’Apuzzo, De Simone, di Serafino, 2007b)
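Reusing CP
The idea is not new (condensed system: Carpenter & Shanno, 1993; Karmarkar & Ramakrishnan, 1991)
Reusing CP → use an approximate CP
\[
\tilde{P} = \begin{bmatrix} \tilde{M} & J^T \\ J & -\tilde{D} \end{bmatrix},
\qquad
\tilde{M} = \mathrm{diag}(Q + \tilde{X}^{-1} \tilde{Z} + \tilde{V}^{-1} \tilde{T}),
\qquad
\tilde{D} = \begin{bmatrix} \tilde{Y}^{-1} \tilde{S} & 0 \\ 0 & 0 \end{bmatrix}
\]
\tilde{X}, \tilde{S}, \tilde{V}, \tilde{T}, \tilde{Y}, \tilde{Z} computed at a previous outer step
CG cannot be used, apply SQMR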
Reusing CP: spectral properties (work in progress)
rank(D) = p ≤ m, rank(D - \tilde{D}) = j ≤ p
\tilde{P}^{-1} K has
an eigenvalue at 1 with multiplicity at least 2m-p-j
at most 2j eigenvalues with nonzero imaginary part
The eigenvalues λ satisfy bounds expressed through the extreme eigenvalues (λ_min, λ_max) of auxiliary matrices built from H, \tilde{M}, J, \tilde{D} and D - \tilde{D}
Reusing CP
CP is reused until its effectiveness deteriorates:
the number of inner iterations must not increase too much
the number of outer steps for which CP is reused must be bounded
by taking into account the increasing ill conditioning and the accuracy requirements
while (PR stopping criterion not satisfied) do
    …
    k = k+1
    factorize P_k
    apply to the KKT system SQMR with P_k
    j = k; l = 0
    while (iit(k) ≤ α ∙ iit(j) and l ≤ β ∙ lmax) do
        apply to the KKT system SQMR with P_j
        k = k+1; l = l+1
    endwhile
    lmax = l
    …
endwhile
iit(k) = # inner iter. at outer step k
α = 2, β = 1 (by numerical experiments)
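The reuse strategy can be read as a small control loop around the inner solver. The sketch below is one possible interpretation, not the PRQP code: `solve` and `done` are hypothetical callbacks, the initial value of `lmax` is an assumption, and the outer index is advanced before each reused solve so that iit(k) is always defined when it is tested.

```python
def reuse_schedule(solve, done, alpha=2.0, beta=1.0, lmax=0):
    """Decide at which outer steps the CP is refactorized or reused.

    solve(k, j) -> number of inner iterations at outer step k when the
                   preconditioner factorized at step j is used (hypothetical callback).
    done(k)     -> True when the outer (PR) stopping criterion holds after step k.
    Returns the list of (k, j) pairs that were executed.
    """
    log = []
    k = 0
    while not done(k):
        k += 1
        j = k                          # factorize P_k and use it at step k
        iit_j = iit_k = solve(k, j)
        log.append((k, j))
        l = 0
        while iit_k <= alpha * iit_j and l <= beta * lmax and not done(k):
            k += 1
            l += 1
            iit_k = solve(k, j)        # reuse P_j at the new outer step
            log.append((k, j))
        lmax = l
    return log

# Toy model: each reuse of an old factorization costs 15 extra inner iterations.
log = reuse_schedule(solve=lambda k, j: 10 + 15 * (k - j), done=lambda k: k >= 6)
print(log)
```

In the toy model a reused preconditioner already exceeds the factor α = 2 after one step, so the schedule alternates between a fresh factorization and a single reuse.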
Termination control
Adaptive stopping criteria, according to the quality of the IP iterate (to reduce the number of inner iterations)
Basic idea: at early outer iterations, low accuracy in solving the KKT system; as the iterates approach the optimal solution, the accuracy requirement grows
Regard the IP method as an inexact Newton method applied to h(w) = 0, the KKT cond. of the original pb.
IP step: Newton step whose total residual \bar{r}^{(k)} collects the perturbation of the KKT cond. of the original pb. (with a suitably chosen perturbation parameter) and the residual r^{(k)} of the KKT system
condition for convergence:
\[
\| \bar{r}^{(k)} \| \le \eta^{(k)} \, \| h(w^{(k)}) \|, \qquad \eta^{(k)} < 1
\]
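Termination control
Inexact PR convergence (Cafieri, D’Apuzzo, De Simone, di Serafino, Toraldo, 2007c)
r: residual of the approx. KKT solution, with a prescribed bound on \| r \|
there exists a step length such that the potential function is sufficiently reduced at each step
polynomial convergence: for every ε ∈ (0,1) there exists K = O((2n+m) |ln ε|) such that the accuracy ε is reached for all k ≥ K
assumptions on the starting point w_0 and on the norm of the opt. solution w^*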
Termination control: theory + comput. study
CG/SQMR stopping criterion: \| r \| ≤ TOL
Theory: TOL prescribed by the convergence analysis
Computational study (Cafieri, D’Apuzzo, De Simone, di Serafino, 2007d): TOL chosen through a rule depending on the relative duality gap δ = Δ / (1 + |q(x)|), the PR tolerance ε_PR and parameters τ, ρ, with safeguard strategies
Further reduce the number of inner iterations
Avoid slowdown in decreasing the infeasibility
Software architecture of PRQP
SIF, AMPL interfaces
Pre/post-processing utilities
PRQP core
linear algebra kernels: CG, SQMR, CP, MA27 (LBLT), ICFS, BLAS
Testing details
PROBLEM n m1 m2
AUG3DQP* 27543 0 8000
CVXQP1 10000 0 5000
CVXQP2 10000 0 2500
CVXQP3 10000 0 7500
GOULDQP3 19999 0 9999
LISWET5* 10002 1000 0
QPBAND 50000 25000 0
CVXQP1-M 10000 5000 0
CVXQP2-M 10000 2500 0
CVXQP3-M 10000 7500 0
Selected CUTEr problems
sparsity of Hessian and constr. matrices ≥ 99%
* = diagonal Hessian
PR stopping criteria: δ ≤ 10^{-7}, σ ≤ 10^{-8}
δ = relative duality gap, σ = relative infeasibility
# PR iterations ≤ 80
Computational environment: 2.53 GHz Pentium IV, 1.256 GB RAM, 128 KB L1 cache
Linux Ubuntu 4.1.2, g77 3.4.6 and gcc 4.1.3 compilers
PRQP: selected results
            Direct                                       CG + CP
PROBLEM     PR IT  FACT TIME  SOLVE TIME  TOTAL TIME     PR/CG IT  CP FACT. TIME  SOLVE TIME  TOTAL TIME
AUG3DQP*    19     3.04e+1    2.72e-1     3.11e+1        19/19     3.20e+1        7.28e-1     3.35e+1
CVXQP1      20     1.16e+4    5.27e+0     1.16e+4        20/351    1.21e+1        3.84e+0     1.63e+1
CVXQP2      19     3.03e+3    2.15e+0     3.03e+3        20/393    1.48e-1        1.10e+0     1.45e+0
CVXQP3      16     2.63e+4    8.01e+0     2.63e+4        16/228    6.47e+1        5.76e+0     7.10e+1
GOULDQP3    21     1.32e+0    8.35e-2     1.82e+0        21/90     5.47e-1        6.70e-1     1.79e+0
LISWET5*    25     6.16e-1    3.82e-2     8.77e-1        25/25     6.27e-1        2.49e-1     1.13e+0
QPBAND      14     1.37e+0    1.16e-1     2.32e+0        14/901    7.56e-1        1.74e+1     1.91e+1
CVXQP1-M    26     2.79e+4    1.29e+1     2.79e+4        25/423    1.23e+1        4.30e+0     1.69e+1
CVXQP2-M    24     4.32e+3    3.19e+0     4.32e+3        24/415    1.82e-1        1.19e+0     1.62e+0
CVXQP3-M    30     4.73e+4    1.40e+1     4.72e+4        30/606    1.06e+2        1.36e+1     1.20e+2
* = diagonal Hessian
PRQP: selected results
            CG + CP                                           SQMR + reused CP
PROBLEM     PR/CG IT  CP FACT. TIME  SOLVE TIME  TOTAL TIME   PR/QMR IT  CP FACT. TIME  SOLVE TIME  TOTAL TIME
AUG3DQP*    19/19     3.20e+1        7.28e-1     3.35e+1      19/300     1.57e+1        7.02e+0     2.31e+1
CVXQP1      20/351    1.21e+1        3.84e+0     1.63e+1      20/738     3.64e+0        8.72e+0     1.26e+1
CVXQP2      20/393    1.48e-1        1.10e+0     1.45e+0      20/614     5.96e-2        2.15e+0     2.40e+0
CVXQP3      16/228    6.47e+1        5.76e+0     7.10e+1      16/607     1.88e+1        1.53e+1     3.42e+1
LISWET5*    25/25     6.27e-1        2.49e-1     1.13e+0      25/200     2.99e-1        1.28e+0     1.81e+0
QPBAND      14/901    7.56e-1        1.74e+1     1.91e+1      14/1531    3.73e-1        3.81e+1     3.92e+1
CVXQP1-M    25/423    1.23e+1        4.30e+0     1.69e+1      25/650     5.92e+0        6.81e+0     1.30e+1
CVXQP2-M    24/415    1.82e-1        1.19e+0     1.62e+0      24/624     9.44e-2        2.20e+0     2.53e+0
CVXQP3-M    30/606    1.06e+2        1.36e+1     1.20e+2      30/858     4.50e+1        1.96e+1     6.50e+1
* = diagonal Hessian
Future work
Devising and experimenting with new strategies for CP approximation
Applying and analysing CPs in the context of IP methods for nonconvex (quadratic) programming
Developing and experimenting with inertia-controlling iterative solvers
Thanks for your attention!