Multilevel Optimization Methods for Engineering Design and PDE-Constrained Optimization
Stephen G. Nash, George Mason University
Joint with R. Michael Lewis, College of William & Mary
snash@gmu.edu
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Setting
• Optimize a high-fidelity model: minimize f_h(a) subject to <constraints>
• Also available: an easier-to-solve low-fidelity model: minimize f_H(a) subject to <constraints>
• How can you exploit the low-fidelity model?
Some Applications
• PDE-constrained optimization
• Aeronautical design
• Nano-porous materials
• Image processing
• VLSI design
In many cases, there may be a hierarchy of lower-fidelity models.
Example: Minimal Surface
• N = 3²,  flops = 8×10⁴
• N = 9²,  flops = 2×10⁶
• N = 18², flops = 2×10⁷
• N = 27², flops = 1×10⁸
An Example: Model Framework
• An optimization model governed by a system of differential equations:
  minimize f(a) = F(a, u(a))
  subject to S(a, u(a)) = 0
• S(a,u) = 0: a system of PDEs
• Design variables: a; state variables: u
• Vary the discretization
User-supplied Information
• Procedure to solve S(a,u) = 0 for u, given a
• Procedure to evaluate F_h(a,u) and ∇_a F_h(a,u) for any level h
• Procedures to implement the downdate I_h^H and update I_H^h operators, with
  I_H^h = <constant> × (I_h^H)^T
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Some Simplifications (for this talk)
• Either no constraints in the optimization models, or constraint equations solved exactly
• But the computational approaches are designed to extend to the constrained case:
  – avoid explicit use of the (reduced) Hessian
  – only need Hessian-vector products
  – do not assume sparsity or a known sparsity pattern
Model Management: Algorithmic Template
• Given some initial guess a_k of the solution: set a^(0) ← a_k
• (pre-smoothing) Partially minimize f_h to get a^(1)
• (recursion)
  – Compute v = ∇f_H(a^(1)) − ∇f_h(a^(1))
  – Obtain a^(2) by solving min_a f_s(a) = f_H(a) − v^T a, subject to bounds on a
  – Define the search direction e = a^(2) − a^(1)
  – Line search: a^(3) ← a^(1) + α e
• (post-smoothing) Partially minimize f_h to get a^(4)
• Set a_{k+1} ← a^(4)
Multilevel (no coarsening): Algorithmic Template
• Given some initial guess a_k of the solution: set a^(0) ← a_k
• (pre-smoothing) Partially minimize f_h to get a^(1)
• (recursion)
  – Compute v = ∇f_H(a^(1)) − ∇f_h(a^(1))
  – Obtain a^(2) by solving min_a f_s(a) = f_H(a) − v^T a, subject to bounds on a
  – Define the search direction e = a^(2) − a^(1)
  – Line search: a^(3) ← a^(1) + α e
• (post-smoothing) Partially minimize f_h to get a^(4)
• Set a_{k+1} ← a^(4)
Multilevel: MG/Opt Algorithmic Template
• Given some initial guess a_k of the solution: set a^(0) ← a_k
• (pre-smoothing) Partially minimize f_h to get a^(1)
• (recursion)
  – Compute v = ∇f_H(I_h^H a^(1)) − I_h^H ∇f_h(a^(1))
  – Obtain a^(2) by solving min_a f_s(a) = f_H(a) − v^T a, subject to bounds on a
  – Define the search direction e = I_H^h (a^(2) − I_h^H a^(1))
  – Line search: a^(3) ← a^(1) + α e
• (post-smoothing) Partially minimize f_h to get a^(4)
• Set a_{k+1} ← a^(4)
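The template above can be sketched in code. This is a minimal two-level illustration under simplifications of ours: a fixed-step gradient-descent smoother, a fixed unit line-search step, and explicit downdate/update matrices `IhH` and `IHh`.

```python
import numpy as np

def smooth(g, a, iters=5, lr=0.1):
    """Pre-/post-smoothing: a few fixed-step gradient-descent iterations."""
    for _ in range(iters):
        a = a - lr * g(a)
    return a

def mgopt_cycle(gh, gH, IhH, IHh, ak, alpha=1.0):
    """One two-level MG/Opt cycle following the slide's template.
    gh, gH: fine/coarse gradient routines; IhH, IHh: downdate/update."""
    a1 = smooth(gh, ak)                      # pre-smoothing -> a^(1)
    v = gH(IhH @ a1) - IhH @ gh(a1)          # first-order (additive) correction
    gs = lambda a: gH(a) - v                 # gradient of f_s(a) = f_H(a) - v^T a
    a2 = smooth(gs, IhH @ a1, iters=50)      # coarse subproblem -> a^(2)
    e = IHh @ (a2 - IhH @ a1)                # multilevel search direction
    a3 = a1 + alpha * e                      # line-search step (fixed alpha here)
    return smooth(gh, a3)                    # post-smoothing -> a^(4)
```

A real implementation would recurse through a hierarchy of levels and use a safeguarded line search; this sketch only shows the data flow of one V-cycle.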
The Reduced Hessian
• Properties of the reduced Hessian govern the behavior of MG/Opt
• Its type need not match that of the PDE S(a,u) = 0: e.g., a hyperbolic PDE can have an elliptic reduced Hessian
• If L is the Lagrangian and S_a, S_u are the Jacobians:
  ∇²f = L_aa − L_au S_u^{−1} S_a − S_a^* S_u^{−*} L_ua + S_a^* S_u^{−*} L_uu S_u^{−1} S_a
• We don’t know its properties or sparsity pattern
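For orientation, the expression above follows from differentiating f(a) = F(a, u(a)) with the implicit function theorem; a sketch of the first-order step:

```latex
S(a, u(a)) = 0 \;\Longrightarrow\; \frac{du}{da} = -S_u^{-1} S_a,
\qquad
\nabla f(a) = F_a + \Bigl(\frac{du}{da}\Bigr)^{*} F_u
            = F_a - S_a^{*} S_u^{-*} F_u .
```

Differentiating once more, with L the Lagrangian, produces the four-term reduced Hessian on the slide. Note that each term can be applied to a vector using only PDE and adjoint solves, which is why Hessian-vector products are available even when the matrix itself is not.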
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Some of the Justifications
• Richer class of models
• Guarantees of convergence
• Better operator properties than for PDEs alone
• Good performance (even far from the solution)
• Connection to other optimization methods
Optimization Models are More Flexible
• Applies to a large variety of optimization models and constraints, not just to solving PDEs
• Can add additional constraints: bounds, inequalities
• A true generalization of multigrid
Analogy: Nonlinear Equations vs. Optimization
• If we solve the optimality conditions ∇f(x) = 0 as a nonlinear system: lim ‖∇f(x^k)‖ = 0
• If we minimize f(x): again lim ‖∇f(x^k)‖ = 0, but with the objective as a merit function
Convergence
• If the underlying optimization algorithm is guaranteed to converge (to a stationary point) without the multilevel strategy,
• then MG/Opt is guaranteed to converge (to a stationary point)
When will MG/Opt work well?
• convex ≈ elliptic ≈ positive definite ≈ “nice”
• The reduced Hessian will be positive (semi)definite at the solution
• Multigrid works well for elliptic PDEs
• Optimization methods work well on convex problems
A Sample Model Problem
• Match a target function u*:
  minimize f(a, u(a)) = ‖u − u*‖² + ‖u_x − u*_x‖²
• where u(a) solves the 1-way wave equation
  u_t + c u_x = 0,  u(x,0) = a(x)
• with 0 ≤ x ≤ 1, 0 ≤ t ≤ 1
• Computations use c = constant = 1
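To make the model problem concrete, here is a small sketch of the forward solve and the objective. The discretization choices (first-order upwind, periodic boundary, CFL number 1) are ours for illustration, not the talk's.

```python
import numpy as np

def solve_wave(a, c=1.0, T=1.0):
    """March u_t + c*u_x = 0 to time T with a first-order upwind scheme,
    u(x,0) = a(x), periodic in x.  With CFL number c*dt/dx = 1 the
    upwind update is an exact grid shift."""
    n = a.size
    dx = 1.0 / n
    nt = int(round(c * T / dx))
    u = a.copy()
    for _ in range(nt):
        u = u - 1.0 * (u - np.roll(u, 1))  # lam = c*dt/dx = 1 (backward diff, c > 0)
    return u

def objective(a, u_star):
    """Discrete analogue of f = ||u - u*||^2 + ||u_x - u*_x||^2."""
    u = solve_wave(a)
    ux, ux_star = np.gradient(u), np.gradient(u_star)
    return float(np.sum((u - u_star) ** 2) + np.sum((ux - ux_star) ** 2))
```

With c = T = 1 and a periodic domain, one full period returns the initial pulse unchanged, matching the slide's point that the solution moves without dissipation or dispersion.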
Model Problem: Wave Eqn.
• Hyperbolic equation
• The initial value moves without dissipation or dispersion
• Multigrid methods (applied to the constraint alone) are not ideal: the usual approach is to march forward in time
[Figures: “Initial Solution” and “Solution at t=1”; x ∈ [−5, 5], u ∈ [−2, 2]]
Model Problem: Analysis of Continuous Problem
• The reduced Hessian is I − d²/dx²
• This is like the 1-dimensional Laplacian: ideal for multigrid
• Likely to cause difficulties for general-purpose large-scale optimization methods
• Analogous results hold for the discretized model problem
Model Problem: Computations (cont.)
• Number of design variables: n = 1025 [1,051,650 total variables: n(n+1)]

                          n=1025  n=512  n=257  n=129  n=65  n=33
Optimization         it       99
                     ls      100
                     cg      967
Successive           it       23     25     25     25    25    19
refinement           ls      100     26     26     26    26    20
                     cg      956    216    225    214   220   145
MG/Opt               it       10     12     14     16    18   232
                     ls       20     24     28     32    36   242
                     cg       57     65     79     89   112  1974
Choice of Comparative Algorithms
• Why compare MG/Opt only with traditional optimization algorithms (and not with MG for systems of equations)?
  – Inequality constraints may be present
  – Optimality conditions are not elliptic in the constrained case
  – Hard to derive the reduced Hessian/system (thus hard to identify a good preconditioner)
  – No obvious relationship between the original optimization model and the reduced system
MG/Opt & Steepest Descent
• The coarse-level problem is a first-order approximation to the fine-level problem:
  the gradient of the coarse-level problem at a_H = I_h^H a_h equals I_h^H [the gradient of the fine-level problem at a_h]
• Analogous to the first-order approximation used to derive the steepest-descent method
MG/Opt & Newton’s Method
• Multilevel line search: let s(α) = f(a^(1) + α e_h)
  – Well-scaled search direction: s′(1) ≈ 0
  – Search direction of the form e_h = I_H^h e_H
• If the subproblems are solved accurately, then
  s′(1) = e_H^T (I_h^H ∇²f_h I_H^h − ∇²f_H) e_H + O(‖e_H‖³)
• So the search direction is “Newton-like”
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Scenario
• Apply an algorithm (e.g., model management); suppose that it does not work well
• Why not? Examine the results of diagnostic tests, performed as part of the optimization algorithm
  – Diagnostic tests have low overhead
  – Analogous to condition-number estimators
• Now what options do you have? Manual versus automatic
Critical Condition
• Multilevel: ∇f_s(I_h^H a^(1)) = I_h^H ∇f_h(a^(1))
• Can be automatically guaranteed through additive (as here) or multiplicative corrections
• Convergence is guaranteed regardless of the quality of the approximate models
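A small numeric check, on toy matrices of our own choosing, that the additive correction enforces this condition for any coarse model:

```python
import numpy as np

rng = np.random.default_rng(0)
nh, nH = 8, 4

# Hypothetical fine/coarse gradients and a random downdate operator.
Ah = rng.standard_normal((nh, nh)); Ah = Ah + Ah.T
AH = np.eye(nH)
IhH = rng.standard_normal((nH, nh))
gh = lambda a: Ah @ a                # fine gradient
gH = lambda a: AH @ a                # uncorrected coarse gradient

a1 = rng.standard_normal(nh)
v = gH(IhH @ a1) - IhH @ gh(a1)      # additive correction term
gs = lambda a: gH(a) - v             # gradient of f_s(a) = f_H(a) - v^T a

# First-order consistency holds by construction:
assert np.allclose(gs(IhH @ a1), IhH @ gh(a1))
print("critical condition satisfied")
```

The correction v is linear in a, so it shifts the coarse gradient without changing the coarse Hessian; that is why consistency can be guaranteed regardless of model quality.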
Sufficient to Consider Four Properties
• Nonlinearity
• Model consistency
• Level complementarity
• Separability across levels
Some assessment tests assume use of a truncated-Newton method (TN) based on the conjugate-gradient method (CG).
Some tests assume coarsening: a_h → a_H.
Diagnostic Test #1: Nonlinearity
TN Search Directions
• Let v(α) = f(a + α p)
• The line search approximates min_α v(α)
• For the search directions from TN: v′(1) = O(‖p‖³)
• Test: Is v′(1) ≈ 0?
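The test can be sketched numerically. The toy quadratic, the Newton step, and the finite-difference estimate below are all our illustrative choices:

```python
import numpy as np

def vprime1(f, a, p, h=1e-6):
    """Central-difference estimate of v'(1), where v(alpha) = f(a + alpha*p)."""
    return (f(a + (1 + h) * p) - f(a + (1 - h) * p)) / (2 * h)

# Toy quadratic: f(a) = 0.5 a^T A a - b^T a, with the exact Newton step p.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
f = lambda a: 0.5 * a @ A @ a - b @ a
a = np.array([1.0, 1.0])
p = np.linalg.solve(A, b - A @ a)   # Newton direction: A p = -grad f(a)
print(abs(vprime1(f, a, p)))        # nearly 0: the unit step is well scaled
```

On a quadratic the Newton step lands exactly on the minimizer of v, so v′(1) vanishes; far from the solution, on a genuinely nonlinear f, a large v′(1) flags strong nonlinearity.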
Diagnostic Test #2: Model Consistency
Compare predicted and actual reductions in the multilevel line search
Predicted & Actual Reduction
• Predicted reduction: the reduction in the coarse-level objective (via standard optimization)
• Actual reduction: the reduction in the fine-level objective (via the multilevel line search)
• Difference between (scaled) actual and predicted reductions:
  ½ e_H^T [∇f_H(I_h^H a^(0)) − I_h^H ∇f_h(a^(0))] + O(‖e_H‖³)
  where the first term measures the consistency of the problems and the second measures nonlinearity
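A toy two-level quadratic setup (entirely our construction: a 1-D Laplacian, linear-interpolation update, Galerkin coarse model) shows the predicted and actual reductions lining up when the models are consistent:

```python
import numpy as np

nh, nH = 9, 5
A = 2 * np.eye(nh) - np.eye(nh, k=1) - np.eye(nh, k=-1)   # fine 'reduced Hessian'
b = np.ones(nh)
P = np.zeros((nh, nH))                                    # update (interpolation)
for j in range(nH):
    P[2 * j, j] = 1.0
    if j + 1 < nH:
        P[2 * j + 1, j] = P[2 * j + 1, j + 1] = 0.5
R = 0.5 * P.T                                             # downdate
AH, bH = R @ A @ P, R @ b                                 # Galerkin coarse model

fh = lambda a: 0.5 * a @ A @ a - b @ a
gh = lambda a: A @ a - b
fH = lambda a: 0.5 * a @ AH @ a - bH @ a

a1 = np.linspace(0.0, 1.0, nh)           # current fine iterate
v = (AH @ (R @ a1) - bH) - R @ gh(a1)    # additive correction
fs = lambda a: fH(a) - v @ a
a2 = np.linalg.solve(AH, bH + v)         # exact coarse minimizer of f_s
e = P @ (a2 - R @ a1)                    # multilevel step (unit line-search step)

predicted = fs(R @ a1) - fs(a2)          # reduction promised by the coarse model
actual = fh(a1) - fh(a1 + e)             # reduction achieved on the fine level
print(predicted, actual)                 # both positive; compare after scaling
```

In this construction the actual reduction is exactly twice the predicted one, reflecting the c = 1/2 downdate scaling; that is the kind of scaling the test must account for before comparing the two numbers.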
Diagnostic Test #3: Level Complementarity
Does the coarse level correspond to the near null space of the fine-level Hessian?
Algebraic Smoothness
• Optimizer: TN based on the conjugate-gradient method
  – CG reduces error corresponding to large eigenvalues on the fine level
  – Complementary components correspond to small eigenvalues (the “near null space”)
• Does the coarse level correspond to the near null space of the reduced Hessian?
  – Extend ideas from adaptive algebraic multigrid for linear problems …
Near Null-Space
• The error in the design variables should lie in the near null-space of the reduced Hessian
• The generalized Rayleigh quotient should be small:
  RQ(e_h) = (G_h e_h)^T (G_h e_h) / (‖G_h‖² e_h^T e_h)
  where G_h is the reduced Hessian (not known) and e_h is the error in the design variables (not known)
Practical Test
• We must estimate:
  – the norm of the reduced Hessian (estimate via the CG method)
  – the error in the design variables (use the multilevel search direction)
• Test: Is R(e_h) small?
  R(e_h) = (G_h e_h)^T (G_h e_h) / (‖G_h‖² e_h^T e_h)
  with e_h taken from the multilevel search direction, ‖G_h‖ from a CG norm estimate, and G_h e_h from a matrix-vector product (as in TN)
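A sketch of one plausible reading of this indicator; the diagonal toy "reduced Hessian" and the test vectors are our assumptions:

```python
import numpy as np

def near_null_indicator(Gv, G_norm, e):
    """R(e) = ||G e||^2 / (||G||^2 ||e||^2): small when e lies in the near
    null space of G.  Gv is a Hessian-vector product routine (as in TN);
    G_norm would come from a CG-based norm estimate in practice."""
    Ge = Gv(e)
    return float(Ge @ Ge / (G_norm ** 2 * (e @ e)))

# Toy diagonal 'reduced Hessian' with an obvious near null space.
d = np.array([1e-3, 1e-2, 1.0, 10.0])
Gv = lambda v: d * v
e_smooth = np.array([1.0, 1.0, 0.0, 0.0])   # lies in the near null space
e_rough = np.array([0.0, 0.0, 0.0, 1.0])    # aligned with the largest eigenvalue
print(near_null_indicator(Gv, d.max(), e_smooth))  # tiny
print(near_null_indicator(Gv, d.max(), e_rough))   # 1.0
```

The indicator is scale-invariant in both G and e, so it can be computed from the rough estimates available inside TN without knowing either quantity exactly.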
Diagnostic Test #4: Separability across Levels
Compare corresponding fine-level and coarse-level Hessian-vector products
Separability?
• Can the fine-level and coarse-level components of the solution be computed separately?
• How much do they interact?
• Is the reduced Hessian (nearly) block diagonal in terms of fine-level and coarse-level components?
Is it Possible to Test for Separability?
• How do you test for separability of the reduced Hessian when:
  – you don’t compute the Hessian
  – you can’t construct/analyze the Hessian
  – you only have function & gradient values and the update & downdate operators?
• Our test is based on Hessian-vector products, already estimated by TN
Rough Idea
• Write the reduced Hessian in block form, based on high/low frequencies:
  G_h = [ G_hh  G_hH ; G_hH^T  G_HH ]
• Use “perfect” update/downdate operators:
  I_H^h p_H = (0, p_H),  I_h^H (p_h, p_H) = p_H
• Compare coarse- and fine-level Hessian-vector products:
  I_h^H (G_h p_h) − G_H (I_h^H p_h) → 0 if separable
Perturbation Analysis
• Apply MG/Opt to
  min_a F_h(a) = ½ a^T G_h a − b_h^T a + higher-order terms
• Assume the user-supplied procedures are correct
• Assume the nonlinearity test is satisfied
• Then MG/Opt solves a perturbed problem
  min_a ½ a^T (G_h + ΔG_h) a − (b_h + Δb_h)^T a
• How large are the perturbations?
Perturbation Analysis (cont.)
• The perturbed coarse data involve Schur-complement-like terms such as G_hH^T G_hh^{−1} G_hH and G_hH^T G_hh^{−1} b_h (plus similar terms)
• Each contribution is small when the corresponding property holds:
  – small coupling G_hH: separability
  – small: model consistency
  – small: level complementarity
What if the diagnostic tests are not satisfied?
• Further analysis based on problem-specific techniques
• Nonlinearity: Is it worthwhile to use a sophisticated optimization method far from the solution?
• Model consistency: Over-coarsening? Programming errors?
• Level complementarity: Add or improve a preconditioner
• Separability: Use a different optimization method? Delay using multilevel until closer to the solution of the optimization problem?
Computational Tests
• Tests based on specified choices for the reduced Hessian
• Test problems chosen to isolate a particular property and measure the sensitivity of the diagnostic tests; multilevel is already known to work well on these problems
• Ideal case: the reduced Hessian is a discretized Laplacian
• Assume the nonlinearity test is satisfied: use quadratic optimization problems (the nonlinearity test has been studied in other contexts)
Level Complementarity
• Laplacian versus Laplacian with permuted eigenvalues
• Satisfies separability and problem consistency

   nH    nh   Laplacian  Permuted  Ratio
    7    15      0.03      0.34     11.7
   15    31      0.04      0.21      4.8
   31    63      0.04      0.57     13.6
   63   127      0.07      0.30      4.5
  127   255      0.13      0.25      1.9
  255   511      0.20      0.34      1.7
  511  1023      0.26      0.77      3.0
Separability
• Diagonalize the Laplacian:
  G_h = V [ D_h  0 ; 0  D_H ] V^T
• Test problems:
  G_h(ε) = V [ D_h  εR^T ; εR  D_H ] V^T
• R is random, with norm 1
• Satisfies problem consistency and, for small values of ε, level complementarity
Model Consistency
• Test problems derived from the discretized Laplacian:
  G̃_H = Q G_H Q^T,  Q = (I + εR)(I − εR)^{−1}
• Q is orthogonal
• R is random, with norm 1
• Satisfies level complementarity and separability
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Related Research
• Vast literature on multigrid methods for PDEs
• Optimization-based multigrid methods
  – Based on the full approximation scheme (Brandt, 1977) applied to the optimality conditions of the optimization model
  – Lewis & Nash (2005), SIAM J. Sci. Comput., v. 26, pp. 1811-1837
• Model management
  – Alexandrov & Lewis (2001), Optimization and Engineering, v. 2, pp. 413-430
Related Research (cont.)
• Diagnostic tests and related ideas for optimization-based multilevel methods
  – Nash & Lewis (2008), www.math.wm.edu/~buckaroo/pubs/LeNa08a.pdf
• Adaptive algebraic multigrid
  – Brandt (1977), Math. Comp., v. 31, pp. 333-390
  – Brannick & Zikatanov (2006), Tech. Report, Penn. State University
  – Brezina et al. (2006), SIAM J. Sci. Comput., v. 27, pp. 1261-1286
• Stopping rules for inexact Newton methods
  – Eisenstat & Walker (1996), SIAM J. Sci. Comput., v. 17, pp. 16-32