Optimization Methods
• Unconstrained optimization of an objective function F
  – Deterministic, gradient-based methods
    • Running a PDE: will cover later in course
    • Gradient-based (ascent/descent) methods
  – Stochastic methods
    • Simulated annealing
      – Theoretically but not practically interesting
    • Evolutionary (genetic) algorithms
  – Multiscale methods
    • Mean field annealing, graduated nonconvexity, etc.
• Constrained optimization
  – Lagrange multipliers
Our Assumptions for Optimization Methods
• With objective function F(p)
  – Dimension(p) >> 1, and frequently quite large
  – Evaluating F at any p is very expensive
  – Evaluating D1F at any p is very, very expensive
  – Evaluating D2F at any p is extremely expensive
•True in most image analysis and graphics applications
Order of Convergence for Iterative Methods
• |εi+1| = k|εi|^α in the limit
  – α is the order of convergence
  – α is the major factor in the speed of convergence
• N steps of a method with order of convergence α together have order of convergence α^N
• Thus the issue is linear convergence (α = 1) vs. superlinear convergence (α > 1)
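The distinction can be seen numerically. A sketch (example problem and solvers chosen for illustration, not from the slides): bisection converges linearly (α = 1) to the root of f(x) = x² − 2, while Newton's method converges quadratically (α = 2), so |εi+1| / |εi|² stays bounded.

```python
import math

# Estimate the order of convergence empirically for two root-finders
# applied to f(x) = x^2 - 2, whose root is sqrt(2).  For bisection the
# error shrinks linearly; for Newton, |e_{i+1}| / |e_i|^2 stays bounded.
root = math.sqrt(2.0)

# Bisection: linear convergence (interval halves each step).
lo, hi = 0.0, 2.0
bisect_errs = []
for _ in range(20):
    mid = 0.5 * (lo + hi)
    bisect_errs.append(abs(mid - root))
    if mid * mid < 2.0:
        lo = mid
    else:
        hi = mid

# Newton: quadratic convergence, x_{i+1} = x_i - f(x_i)/f'(x_i).
x = 2.0
newton_errs = []
for _ in range(6):
    x = x - (x * x - 2.0) / (2.0 * x)
    newton_errs.append(abs(x - root))

linear_ratios = [bisect_errs[i + 1] / bisect_errs[i] for i in range(10)]
quad_ratios = [newton_errs[i + 1] / newton_errs[i] ** 2
               for i in range(3) if newton_errs[i] ** 2 > 0]
print(linear_ratios[:3], quad_ratios[:2])
```

After only four Newton steps the error is already near machine precision, while bisection still has several correct digits to go.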
Ascent/Descent Methods
• At a maximum, D1F (i.e., ∇F) = 0.
• Pick direction of ascent/descent
• Find approximate maximum in that direction: two possibilities
  – Calculate a stepsize that will approximately reach the maximum
  – In the search direction, find the actual max within some range
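The first option (calculating a stepsize) is commonly implemented with a backtracking (Armijo) line search. A minimal sketch, written for minimization; the function names and constants are illustrative, not from the slides:

```python
import numpy as np

# One descent step with an Armijo backtracking line search: shrink the
# trial step t until it produces a sufficient decrease in F.
def backtracking_step(F, gradF, p, c=0.1, shrink=0.5, t0=1.0):
    g = gradF(p)
    d = -g                                   # descent direction
    t = t0
    # Shrink t until the Armijo sufficient-decrease condition holds.
    while F(p + t * d) > F(p) + c * t * g.dot(d):
        t *= shrink
    return p + t * d

# Usage on the elongated quadratic F(p) = x^2 + 10 y^2.
F = lambda p: p[0] ** 2 + 10.0 * p[1] ** 2
gradF = lambda p: np.array([2.0 * p[0], 20.0 * p[1]])
p = np.array([3.0, -2.0])
for _ in range(50):
    p = backtracking_step(F, gradF, p)
print(F(p))   # far below F at the starting point
```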
Gradient Ascent/Descent Methods
• Direction of ascent/descent is D1F.
• If you move to the optimum in that direction, the next direction will be orthogonal to this one
  – Guarantees zigzag
  – Bad behavior for narrow ridges (valleys) of F
  – Linear convergence
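The zigzag is easy to demonstrate. The sketch below (matrix and starting point invented for illustration) runs steepest descent with exact line minimization on an elongated quadratic F(p) = ½ pᵀAp: each exact line minimization makes the next gradient orthogonal to the current one, and progress toward the minimum is slow when A is badly conditioned.

```python
import numpy as np

# Steepest descent with exact line search on F(p) = 1/2 p^T A p.
A = np.diag([1.0, 25.0])          # condition number 25: a narrow valley
grad = lambda p: A @ p

p = np.array([25.0, 1.0])
dots = []
for _ in range(20):
    g = grad(p)
    t = g @ g / (g @ (A @ g))     # exact minimizer along -g for a quadratic
    p_next = p - t * g
    dots.append(abs(grad(p_next) @ g))   # next gradient vs current direction
    p = p_next

print(max(dots))    # ~0: successive directions orthogonal
print(p)            # still far from the origin: slow zigzag progress
```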
Newton and Secant Ascent/Descent Methods for F(p)
• We are solving D1F = 0
  – Use a Newton or secant equation-solving method
• Newton's method to solve f(p) = 0 is pi+1 = pi – [D1f(pi)]-1 f(pi)
• Newton for optimization
  – Move from p to p – (D2F)-1 D1F
• Is the direction of ascent/descent the gradient direction D1F?
  – No: methods that ascend/descend in the D1F (gradient) direction are inferior
  – The real direction of ascent/descent is the direction of (D2F)-1 D1F, which also gives the step size in that direction
• Secant
  – Same as Newton, except replace D2F and D1F by discrete approximations to them from this and the last n iterates
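A minimal sketch of the Newton update p ← p − (D2F)-1 D1F, on a convex example function chosen for illustration (its minimum is at the origin, where D1F = 0):

```python
import numpy as np

# Newton's method for optimization on the convex function
# F(x, y) = e^x + e^y + x^2 + y^2 - x - y, minimized at the origin.
# The Hessian happens to be diagonal here, but np.linalg.solve is used
# as one would in the general case.
def grad(p):
    return np.exp(p) + 2.0 * p - 1.0          # D1F, componentwise

def hess(p):
    return np.diag(np.exp(p) + 2.0)           # D2F

p = np.array([1.5, -2.0])
for _ in range(8):
    p = p - np.linalg.solve(hess(p), grad(p)) # Newton step
print(p)   # close to [0. 0.]
```

Replacing `hess` (and possibly `grad`) with discrete approximations built from recent iterates gives the secant (quasi-Newton) variant described above.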
Conjugate gradient method
• Preferable to gradient descent/ascent methods
• Two major aspects
  – Successive directions for descent/ascent are conjugate: <hi+1, D2F hi> = 0 in the limit for convex F
    • If true at all steps (quadratic F), convergence in at most n steps, with n = dim(p)
    • Improvements available using more previous directions
  – In the search direction, find the actual max/min within some range
    • Quadratic convergence depends on <D1F(xi), hi> = 0, i.e., F a local minimum in the hi direction
• References
  – Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain (http://www-2.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf)
  – Numerical Recipes
  – Polak, Computational Methods in Optimization, Academic Press
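For a quadratic F(p) = ½ pᵀAp − bᵀp, minimizing F amounts to solving Ap = b, and the conjugate directions can be generated in closed form. A sketch of the standard linear conjugate gradient iteration (the Shewchuk notes cited above derive it); the test matrix is invented for illustration:

```python
import numpy as np

# Linear conjugate gradient: in exact arithmetic it converges in at most
# n steps for an n-dimensional quadratic, and successive directions
# satisfy <h_{i+1}, A h_i> = 0.
def conjugate_gradient(A, b, p0, tol=1e-8):
    p = p0.copy()
    r = b - A @ p                       # residual = -D1F(p)
    h = r.copy()                        # first direction: steepest descent
    steps = 0
    while np.linalg.norm(r) > tol:
        Ah = A @ h
        t = (r @ r) / (h @ Ah)          # exact line minimization along h
        p = p + t * h
        r_new = r - t * Ah
        beta = (r_new @ r_new) / (r @ r)  # Fletcher-Reeves coefficient
        h = r_new + beta * h              # next direction, A-conjugate to h
        r = r_new
        steps += 1
    return p, steps

n = 6
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)             # symmetric positive definite
b = rng.standard_normal(n)
p, steps = conjugate_gradient(A, b, np.zeros(n))
print(steps)                            # at most n = 6
```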
Conjugate gradient method issues
• Preferable to gradient descent/ascent methods
• Must find a local minimum in the search direction
• Will have trouble with
  – Bumpy objective functions
  – Extremely elongated minimum/maximum regions
Multiscale Gradient-Based Optimization
To avoid local optima
• Smooth the objective function to put the initial estimate on the hillside of its global optimum
  – E.g., by using larger-scale measurements
• Find its optimum
• Iterate
  – Decrease the scale of the objective function
  – Use the previous optimum as the starting point for the new optimization
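The coarse-to-fine procedure can be sketched on a one-dimensional example. The function and the scale schedule below are invented for illustration; F(x) = x² + 5 cos(10x) is bumpy, and blurring the cosine with a Gaussian of scale s multiplies its amplitude by exp(−50 s²), so the smoothed family is available in closed form.

```python
import math

# Gradient of the smoothed objective F_s(x) = x^2 + 5 e^{-50 s^2} cos(10 x).
def grad_Fs(x, s):
    damp = math.exp(-50.0 * s * s)
    return 2.0 * x - 50.0 * damp * math.sin(10.0 * x)

def descend(x, s, lr=1e-3, iters=2000):
    for _ in range(iters):
        x -= lr * grad_Fs(x, s)
    return x

x = 4.0                                # bad start, far from the global optimum
for s in [1.0, 0.4, 0.3, 0.2, 0.1, 0.05, 0.0]:   # decrease scale, reuse optimum
    x = descend(x, s)
print(x)   # near a global minimizer of F (about +/- 0.31)

# Direct descent from the same start gets trapped in a distant local minimum:
print(descend(4.0, 0.0))
```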
Multiscale Gradient-Based Optimization
Example Methods
• General methods
  – Graduated non-convexity [Blake & Zisserman, 1987]
  – Mean field annealing [Bilbro, Snyder, et al., 1992]
• In image analysis
  – Vary the degree of globality of the geometric representation
Optimization under Constraints by Lagrange Multiplier(s)
• To optimize F(p) over p subject to gi(p) = 0, i = 1, 2, …, N, with p having n parameters
  – Create the function F(p) + Σi λi gi(p)
  – Find a critical point for it over p and λ
• Solve D1p,λ[F(p) + Σi λi gi(p)] = 0
  – n+N equations in n+N unknowns
  – N of the equations are just gi(p) = 0, i = 1, 2, …, N
• The critical point will need to be an optimum w.r.t. p
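When F is quadratic and the constraints are linear, the n+N stationarity equations form a linear system. A sketch with an example problem chosen for illustration (minimize ½|p|² − 1·p subject to p1 + p2 + p3 = 1):

```python
import numpy as np

# For F(p) = 1/2 p^T Q p - c^T p with linear constraints A p - b = 0, the
# condition D1_{p,lam}[F(p) + sum_i lam_i g_i(p)] = 0 is linear in the
# n + N unknowns (p, lam):
#   Q p + A^T lam = c      (stationarity in p)
#   A p           = b      (stationarity in lam: the constraints themselves)
n, N = 3, 1
Q = np.eye(n)                     # F(p) = 1/2 |p|^2 - c.p
c = np.ones(n)
A = np.ones((N, n))               # constraint: p1 + p2 + p3 = 1
b = np.array([1.0])

K = np.block([[Q, A.T], [A, np.zeros((N, N))]])   # (n+N) x (n+N) system
rhs = np.concatenate([c, b])
sol = np.linalg.solve(K, rhs)
p, lam = sol[:n], sol[n:]
print(p, lam)   # p = [1/3, 1/3, 1/3], lam = [2/3]
```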
Stochastic Methods
• Needed when the objective function is bumpy, has many variables, or has a gradient that is hard to compute
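A minimal simulated annealing sketch (the proposal scale and cooling schedule are illustrative choices, not prescribed by the slides): only evaluations of the bumpy objective are needed, no gradients. Worse moves are accepted with probability exp(−ΔF/T), and the temperature T is lowered geometrically.

```python
import math
import random

# Simulated annealing on the bumpy objective F(x) = x^2 + 5 cos(10 x).
def F(x):
    return x * x + 5.0 * math.cos(10.0 * x)

random.seed(0)
x = 4.0                       # start in a far-off local basin
best = x
T = 10.0
while T > 1e-3:
    for _ in range(200):
        cand = x + random.gauss(0.0, 0.5)      # random proposal
        dF = F(cand) - F(x)
        if dF < 0 or random.random() < math.exp(-dF / T):
            x = cand                           # accept (always if downhill)
            if F(x) < F(best):
                best = x
    T *= 0.9                                   # geometric cooling
print(best)   # a deep minimum of F, escaping the starting basin
```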