Outline: Existence of optima, Optimality conditions, Duality, Equality constraints, Frank-Wolfe method, Penalty method, Barrier methods
Constrained optimization
Mauro Passacantando
Department of Computer Science, University of Pisa, [email protected]
Numerical Methods and Optimization, Master in Computer Science, University of Pisa
M. Passacantando Constrained optimization 1 / 58 –
Existence of global minima
A constrained optimization problem is defined as
min f(x)  s.t.  g(x) ≤ 0,  h(x) = 0
where Ω = {x ∈ D : g(x) ≤ 0, h(x) = 0} is the feasible region.
Theorem. If all the functions f, gi, hj are continuous, the domain D is closed and the feasible region Ω is nonempty and bounded, then there exists a global minimum.
Example. min x1 + x2  s.t.  x1² + x2² − 4 ≤ 0
admits a global minimum. Where?
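A quick numerical check of the answer (my own illustration, not part of the slides): since f is linear, the minimum over the disk lies on the boundary circle of radius 2, which can be sampled densely. The sampling resolution below is an arbitrary choice.

```python
import math

# f(x) = x1 + x2 over the disk x1^2 + x2^2 <= 4: the minimum of a linear
# function over a disk is attained on the boundary circle, sampled here.
best_val, best_pt = float("inf"), None
n = 400_000
for k in range(n):
    t = 2 * math.pi * k / n
    x = (2 * math.cos(t), 2 * math.sin(t))
    v = x[0] + x[1]
    if v < best_val:
        best_val, best_pt = v, x

# The minimizer is (approximately) (-sqrt(2), -sqrt(2)), value -2*sqrt(2).
```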
Existence of global optima
Theorem. If f is continuous, Ω is closed and there exists α ∈ R such that the α-sublevel set
{x ∈ Ω : f(x) ≤ α}
is nonempty and bounded, then there exists a global minimum.
Example. min e^(x1+x2)  s.t.  x1 − x2 ≤ 0,  −2x1 + x2 ≤ 0
Ω is closed and unbounded. The sublevel set {x ∈ Ω : f(x) ≤ 2} is nonempty and bounded, thus there exists a global minimum.
Existence of global optima
Theorem. If f is continuous and coercive, i.e.,
lim_{‖x‖→∞} f(x) = +∞,
and Ω is closed, then there exists a global minimum.
Example. min x⁴ + 3x³ − 5x² + x − 2  s.t.  x ≥ 0
Since f is coercive and Ω = R+ is closed, there exists a global minimum.
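As a sanity check (my own illustration, not from the slides), a coarse grid search over x ≥ 0 locates the guaranteed global minimum of this quartic; the grid bounds and resolution are arbitrary choices.

```python
# Grid search for min x^4 + 3x^3 - 5x^2 + x - 2 over x >= 0.  Coercivity
# confines the minimum to a bounded interval; [0, 5] is wide enough here,
# since f is increasing and much larger than f(0) = -2 for x >= 5.
f = lambda x: x**4 + 3 * x**3 - 5 * x**2 + x - 2

xs = [i / 10_000 for i in range(50_001)]   # grid on [0, 5], step 1e-4
x_min = min(xs, key=f)
# The minimizer is interior (near x = 0.73) and strictly better than the
# boundary value f(0) = -2.
```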
Existence and uniqueness of global optima
Corollary
- If f is strongly convex and Ω is closed, then there exists a global minimum.
- If f is strongly convex and Ω is closed and convex, then there exists a unique global minimum.
Example. Any quadratic programming problem
min (1/2)xTQx + cTx  s.t.  Ax ≤ b
where Q is a positive definite matrix has a unique global minimum.
What if Q is positive semidefinite or indefinite?
Existence of global optima for quadratic programming problems
Consider
min (1/2)xTQx + cTx  s.t.  Ax ≤ b   (P)
The recession cone of Ω is rec(Ω) = {d : Ad ≤ 0}.
Theorem (Eaves)
(P) has a global minimum if and only if the following conditions hold:
(a) dTQd ≥ 0 for any d ∈ rec(Ω);
(b) dT(Qx + c) ≥ 0 for any x ∈ Ω and any d ∈ rec(Ω) s.t. dTQd = 0.
Existence of global optima for quadratic programming problems
Special cases:
- If Q = 0 (i.e., linear programming), then (P) has an optimal solution if and only if dTc ≥ 0 for all d ∈ rec(Ω)
- If Q is positive definite, then (a) and (b) are satisfied
- If Ω is bounded, then (a) and (b) are satisfied
Exercise 1. Prove that the quadratic programming problem
min (1/2)x1² − (1/2)x2² + x1 − 2x2  s.t.  −x1 + x2 ≤ −1,  −x2 ≤ 0
has a global minimum.
Constrained problems
Example. min x1 + x2  s.t.  x1² + x2² − 4 ≤ 0
Ω = B(0, 2), the global minimum is x∗ = (−√2, −√2), ∇f(x∗) = (1, 1).
Definition (Tangent cone)
TΩ(x) = { d ∈ Rn : ∃ {zk} ⊂ Ω, ∃ tk > 0, zk → x, tk → 0, lim_{k→∞} (zk − x)/tk = d }
Example (continued). What is TΩ(x∗)?
First order necessary optimality condition
Theorem. If x∗ is a local minimum, then there is no descent direction in TΩ(x∗), i.e.,
dT∇f(x∗) ≥ 0, ∀ d ∈ TΩ(x∗).
Proof. By contradiction, assume that there exists d ∈ TΩ(x∗) s.t. dT∇f(x∗) < 0. Take sequences {zk} ⊂ Ω and {tk} s.t. lim_{k→∞} (zk − x∗)/tk = d. Then zk = x∗ + tk d + o(tk), where o(tk)/tk → 0. The first-order approximation of f gives
f(zk) = f(x∗) + tk dT∇f(x∗) + o(tk),
thus there is k0 ∈ N s.t.
(f(zk) − f(x∗))/tk = dT∇f(x∗) + o(tk)/tk < 0, ∀ k > k0,
i.e., f(zk) < f(x∗) for all k > k0, which is impossible because x∗ is a local minimum.
First order optimality condition for convex problems
Theorem. If Ω is convex, then Ω ⊆ TΩ(x) + x for any x ∈ Ω.
Optimality condition for constrained convex problems
If the optimization problem is convex, then x∗ is a global minimum if and only if
(y − x∗)T∇f (x∗) ≥ 0, ∀ y ∈ Ω.
Exercise 2. Prove the latter result.
Properties of the tangent cone
TΩ(x) is related to geometric properties of Ω. What is the relation between TΩ(x) and the constraints g, h defining Ω?
Example (continued). g(x) = x1² + x2² − 4, ∇g(x∗) = (−2√2, −2√2),
TΩ(x∗) = {d ∈ R2 : dT∇g(x∗) ≤ 0}
Definition (First-order feasible direction cone)
Given x ∈ Ω, A(x) = {i : gi(x) = 0} denotes the set of inequality constraints which are active at x. The set
D(x) = { d ∈ Rn : dT∇gi(x) ≤ 0 ∀ i ∈ A(x),  dT∇hj(x) = 0 ∀ j = 1, …, p }
is called the first-order feasible direction cone at point x.
Properties of the tangent cone
Theorem. TΩ(x) ⊆ D(x) for all x ∈ Ω.
Is TΩ(x) = D(x) true for all x ∈ Ω? NO.
Example. min x1 + x2  s.t.  (x1 − 1)² + (x2 − 1)² − 1 ≤ 0,  x2 ≤ 0
Ω = {(1, 0)}, TΩ(1, 0) = {(0, 0)}.
∇g1(1, 0) = (0, −2), ∇g2(1, 0) = (0, 1), D(1, 0) = {d ∈ R2 : d2 = 0}.
Properties of the tangent cone
Theorem (Constraint qualifications)
a) (Affine constraints) If gi and hj are affine for all i = 1, …, m and j = 1, …, p, then TΩ(x) = D(x) for all x ∈ Ω.
b) (Slater condition) If the gi are convex for all i = 1, …, m, the hj are affine for all j = 1, …, p, and there exists x̄ ∈ int(D) s.t. g(x̄) < 0 and h(x̄) = 0, then TΩ(x) = D(x) for all x ∈ Ω.
c) (Linear independence of the gradients of active constraints) If x ∈ Ω and the vectors ∇gi(x) for i ∈ A(x) and ∇hj(x) for j = 1, …, p are linearly independent, then TΩ(x) = D(x).
Karush-Kuhn-Tucker Theorem
Why is the condition TΩ(x) = D(x) important?
Theorem (Karush-Kuhn-Tucker)
If x∗ is a local minimum and TΩ(x∗) = D(x∗), then there exist λ∗ ∈ Rm and µ∗ ∈ Rp s.t. (x∗, λ∗, µ∗) satisfies the KKT system:
∇f(x∗) + ∑_{i=1}^m λ∗i ∇gi(x∗) + ∑_{j=1}^p µ∗j ∇hj(x∗) = 0
λ∗i gi(x∗) = 0  ∀ i = 1, …, m
λ∗ ≥ 0,  g(x∗) ≤ 0,  h(x∗) = 0
Exercise 3. Use the KKT system to solve
min x1 − x2  s.t.  x1² + x2² − 2 ≤ 0
Karush-Kuhn-Tucker Theorem
The assumption TΩ(x∗) = D(x∗) is crucial.
Example. min x1 + x2  s.t.  (x1 − 1)² + (x2 − 1)² − 1 ≤ 0,  x2 ≤ 0
x∗ = (1, 0) is the global minimum, but TΩ(x∗) ≠ D(x∗).
∇f(x∗) = (1, 1), ∇g1(x∗) = (0, −2), ∇g2(x∗) = (0, 1).
There is no λ∗ s.t. (x∗, λ∗) solves the KKT system.
Karush-Kuhn-Tucker Theorem
The KKT Theorem gives necessary optimality conditions, but not sufficient ones.
Example. min x1 + x2  s.t.  −x1² − x2² + 2 ≤ 0
x∗ = (1, 1), λ∗ = 1/2 solves the KKT system, but x∗ is not a local minimum.
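One can verify numerically that the KKT point fails to be a local minimum: moving slightly along the boundary circle of radius √2 keeps the point feasible while decreasing f. A small sketch (my own illustration, not from the slides):

```python
import math

# Feasible region: x1^2 + x2^2 >= 2 (outside the disk of radius sqrt(2)).
# The KKT point x* = (1, 1) lies on the boundary, at angle pi/4.
f = lambda x: x[0] + x[1]
r = math.sqrt(2)
x_star = (1.0, 1.0)

# Move slightly along the boundary circle: the point stays feasible,
# but the objective value strictly decreases.
eps = 1e-3
x_near = (r * math.cos(math.pi / 4 + eps), r * math.sin(math.pi / 4 + eps))
assert x_near[0]**2 + x_near[1]**2 >= 2 - 1e-12   # still feasible
assert f(x_near) < f(x_star)                       # strictly smaller value
```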
KKT Theorem for convex problems
If the optimization problem is convex and (x∗, λ∗, µ∗) solves the KKT system, then x∗ is a global minimum.
Exercise 4. Prove the latter result.
Karush-Kuhn-Tucker Theorem
Exercise 5. Compute the distance between a point z ∈ Rn and the hyperplane H = {x ∈ Rn : aTx = b}.
Exercise 6. Compute the distance between two parallel hyperplanes H1 = {x ∈ Rn : aTx = b1} and H2 = {x ∈ Rn : aTx = b2}, with b1 ≠ b2.
Exercise 7. Compute the projection of a point z ∈ Rn onto the ball with center x0 and radius r.
Exercise 8. Compute the projection of a point z ∈ R2 onto the box {x ∈ R2 : a1 ≤ x1 ≤ b1, a2 ≤ x2 ≤ b2}.
Critical cone
Consider now a nonconvex optimization problem and suppose (x∗, λ∗, µ∗) solves the KKT system. Is x∗ a local minimum?
Definition (Critical cone)
Let (x∗, λ∗, µ∗) solve the KKT system. The critical cone is
C(x∗, λ∗, µ∗) = { d ∈ Rn :
  dT∇gi(x∗) = 0 ∀ i ∈ A(x∗) with λ∗i > 0,
  dT∇gi(x∗) ≤ 0 ∀ i ∈ A(x∗) with λ∗i = 0,
  dT∇hj(x∗) = 0 ∀ j = 1, …, p }
Theorem. C(x∗, λ∗, µ∗) = {d ∈ D(x∗) : dT∇f(x∗) = 0}
Second order necessary optimality condition
The Lagrangian function is defined as
L(x, λ, µ) := f(x) + ∑_{i=1}^m λi gi(x) + ∑_{j=1}^p µj hj(x)
Necessary condition
Assume that (x∗, λ∗, µ∗) solves the KKT system and the gradients of the active constraints at x∗ are linearly independent. If x∗ is a local minimum, then
dT∇²xxL(x∗, λ∗, µ∗) d ≥ 0  ∀ d ∈ C(x∗, λ∗, µ∗).
Special case: unconstrained problems
If x∗ is a local minimum, then ∇2f (x∗) is positive semidefinite.
Second order necessary optimality condition
The previous theorem does not give a sufficient optimality condition.
Example. min x1³ + x2  s.t.  −x2 ≤ 0
x∗ = (0, 0), λ∗ = 1 is the unique solution of the KKT system. The linear constraint is active at x∗ and ∇g(x∗) = (0, −1) ≠ 0. The matrix ∇²xxL(x∗, λ∗) = 0 is positive semidefinite, but x∗ is not a local minimum because f(t, 0) < f(0, 0) for all t < 0.
Second order sufficient optimality condition
Sufficient conditionAssume that (x∗, λ∗, µ∗) solves KKT system and
dT∇2xxL(x∗, λ∗, µ∗) d > 0 ∀ d ∈ C (x∗, λ∗, µ∗) s.t. d 6= 0,
then x∗ is a local minimum.
Special case: unconstrained problems.
If ∇f (x∗) = 0 and ∇2f (x∗) is positive definite, then x∗ is a local minimum.
Second order optimality conditions
Exercise 9. Find the local and global minima of the following problems:
a) min −x1² − 2x2²  s.t.  −x1 + 1 ≤ 0,  −x2 + 1 ≤ 0,  x1 + x2 − 6 ≤ 0
b) min −x1 + x2²  s.t.  −x1² − x2² + 4 ≤ 0
c) min x1³ + x2³  s.t.  −x1 − 1 ≤ 0,  −x2 − 1 ≤ 0
Lagrangian relaxation
Consider the general optimization problem
min f(x)  s.t.  g(x) ≤ 0,  h(x) = 0   (P)
where x ∈ D and the optimal value is denoted v(P).
The Lagrangian function L : D × Rm × Rp → R is defined as
L(x, λ, µ) := f(x) + ∑_{i=1}^m λi gi(x) + ∑_{j=1}^p µj hj(x)
Lagrangian relaxation and dual function
Definition. Given λ ≥ 0 and µ ∈ Rp, the problem
min L(x, λ, µ)  s.t.  x ∈ D
is called the Lagrangian relaxation of (P), and ψ(λ, µ) = inf_{x∈D} L(x, λ, µ) is the Lagrangian dual function.
The dual function ψ
- is concave, being the infimum of a family of functions that are affine in (λ, µ)
- can be equal to −∞
- may be nondifferentiable
Lagrangian relaxation and dual function
Theorem. Given λ ≥ 0 and µ ∈ Rp, we have ψ(λ, µ) ≤ v(P).
Proof. If x ∈ Ω, i.e., g(x) ≤ 0 and h(x) = 0, then the equality-constraint terms vanish and
L(x, λ, µ) = f(x) + ∑_{i=1}^m λi gi(x) ≤ f(x),
hence
ψ(λ, µ) = inf_{x∈D} L(x, λ, µ) ≤ inf_{x∈Ω} L(x, λ, µ) ≤ inf_{x∈Ω} f(x) = v(P)
Lagrangian dual problem
The problem
max ψ(λ, µ)  s.t.  λ ≥ 0   (D)
is called the Lagrangian dual problem of (P).
- The dual problem consists in finding the best lower bound on v(P).
- The dual problem is a convex problem (maximization of a concave function), even if (P) is not convex.
Lagrangian dual problem
Example - Linear Programming.
Primal problem: min cTx  s.t.  Ax ≥ b   (P)
Lagrangian function: L(x, λ) = cTx + λT(b − Ax) = λTb + (cT − λTA)x
Dual function:
ψ(λ) = min_{x∈Rn} L(x, λ) = { −∞ if cT − λTA ≠ 0;  λTb if cT − λTA = 0 }
Dual problem: max ψ(λ) s.t. λ ≥ 0, which can be rewritten as
max λTb  s.t.  λTA = cT,  λ ≥ 0   (D)
a linear programming problem.
Exercise 10. What is the dual of (D)?
Lagrangian dual problem
Example - Least norm solution of linear equations.
Primal problem: min xTx  s.t.  Ax = b   (P)
Lagrangian function: L(x, µ) = xTx + µT(Ax − b).
Dual function: ψ(µ) = min_{x∈Rn} L(x, µ).
L(x, µ) is quadratic and strongly convex w.r.t. x, thus
∇xL = 2x + ATµ = 0  ⟺  x = −(1/2)ATµ,
hence ψ(µ) = −(1/4)µTAATµ − bTµ.
Dual problem:
max −(1/4)µTAATµ − bTµ  s.t.  µ ∈ Rp   (D)
is an unconstrained convex quadratic programming problem (maximization of a concave quadratic).
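For the special case of a single equation aTx = b (p = 1), both the primal and dual optima have closed forms, which makes strong duality easy to verify numerically. The data below are an arbitrary illustration, not from the slides:

```python
# Least-norm solution of a^T x = b: primal optimum x* = b*a/||a||^2, and
# dual function psi(mu) = -(1/4)*mu^2*||a||^2 - b*mu maximized at
# mu* = -2*b/||a||^2.
a, b = [3.0, 4.0], 10.0

na2 = sum(ai * ai for ai in a)          # ||a||^2 = 25
x = [b * ai / na2 for ai in a]          # primal solution (1.2, 1.6)
primal = sum(xi * xi for xi in x)       # optimal value x^T x

mu = -2 * b / na2                       # stationary point of psi
dual = -0.25 * mu * mu * na2 - b * mu   # psi(mu*)

# x is feasible and the primal value equals the dual value.
```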
Lagrangian dual problem
Exercise 11. Find the dual problem of a generic quadratic programming problem
min (1/2)xTQx + cTx  s.t.  Ax ≤ b   (P)
where Q is a symmetric positive definite matrix.
Weak duality
Theorem (weak duality). For any optimization problem we have v(D) ≤ v(P).
Strong duality, i.e., v(D) = v(P), does not hold in general.
Example. min −x²  s.t.  x − 1 ≤ 0,  −x ≤ 0, with v(P) = −1.
L(x, λ) = −x² + λ1(x − 1) − λ2x,
ψ(λ) = min_{x∈R} L(x, λ) = −∞ for all λ,
hence v(D) = −∞.
Weak duality
Example. min 2x⁴ + x³ − 20x² + x  s.t.  x² − 2x − 3 ≤ 0
[Figure: plot of the primal objective over the feasible interval, and plot of the dual function.]
Primal optimal solution x∗ ≈ 2.0427, v(P) ≈ −38.0648.
Dual optimal solution λ∗ ≈ 2.68, v(D) ≈ −46.0838.
Strong duality
Theorem (strong duality)
If the problem min f(x) s.t. g(x) ≤ 0, h(x) = 0 is convex, there exists an optimal solution x∗, and TΩ(x∗) = D(x∗), then the KKT multipliers (λ∗, µ∗) associated with x∗ are an optimal solution of the dual problem and v(D) = v(P).
Proof. L(x, λ∗, µ∗) is convex with respect to x, thus
v(D) ≥ ψ(λ∗, µ∗) = min_x L(x, λ∗, µ∗) = L(x∗, λ∗, µ∗) = f(x∗) = v(P) ≥ v(D)
Strong duality
Strong duality can also hold for some nonconvex problems.
Example. min −x1² − x2²  s.t.  x1² + x2² − 1 ≤ 0, with v(P) = −1.
L(x, λ) = −x1² − x2² + λ(x1² + x2² − 1) = (λ − 1)x1² + (λ − 1)x2² − λ,
ψ(λ) = { −∞ if λ < 1;  −λ if λ ≥ 1 }
hence λ∗ = 1 is the dual optimum and v(D) = −1.
Exercises
Exercise 12. Consider the problem
min ∑_{i=1}^n xi²  s.t.  ∑_{i=1}^n xi ≥ 1
- Discuss existence and uniqueness of optimal solutions
- Find the optimal solution and the optimal value
- Write the dual problem
- Solve the dual problem and check whether strong duality holds
Exercise 13. Given a, b ∈ R with a < b, consider the problem
min x²  s.t.  a ≤ x ≤ b
- Find the optimal solution and the optimal value for any a, b
- Solve the dual problem and check whether strong duality holds
Equality constrained problems
Consider
min f(x)  s.t.  Ax = b
with
- f strongly convex and twice continuously differentiable
- A a p × n matrix with rank(A) = p
It is equivalent to an unconstrained problem: write A = (AB, AN) with det(AB) ≠ 0; then Ax = b is equivalent to
ABxB + ANxN = b  ⟹  xB = AB⁻¹(b − ANxN),
thus min f(x) s.t. Ax = b is equivalent to
min f(AB⁻¹(b − ANxN), xN)  s.t.  xN ∈ Rn−p
Equality constrained problems
Example. Consider
min x1² + x2² + x3²  s.t.  x1 + x3 = 1,  x1 + x2 − x3 = 2
Since x1 = 1 − x3 and x2 = 2 − x1 + x3 = 1 + 2x3, the original constrained problem is equivalent to the following unconstrained problem:
min (1 − x3)² + (1 + 2x3)² + x3² = 6x3² + 2x3 + 2  s.t.  x3 ∈ R
Therefore, the optimal solution is x3 = −1/6, x1 = 7/6, x2 = 2/3.
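The substitution above can be double-checked numerically; a brief sketch:

```python
# Verify the worked example: min x1^2 + x2^2 + x3^2 s.t. x1 + x3 = 1,
# x1 + x2 - x3 = 2.  The reduced objective is g(x3) = 6*x3^2 + 2*x3 + 2.
g = lambda x3: 6 * x3**2 + 2 * x3 + 2

# Stationary point of the reduced problem: g'(x3) = 12*x3 + 2 = 0.
x3 = -1 / 6
x1, x2 = 1 - x3, 1 + 2 * x3          # back-substitution

# Both equality constraints hold and the optimal value is g(-1/6) = 11/6.
```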
Frank-Wolfe method
Consider the problem
min f(x)  s.t.  Ax ≤ b   (P)
where f is convex and continuously differentiable and the polyhedron Ω = {x : Ax ≤ b} is bounded.
Linearize the objective function at xk, i.e., f(x) ≈ f(xk) + ∇f(xk)T(x − xk), and solve the linear programming problem:
min f(xk) + ∇f(xk)T(x − xk)  s.t.  Ax ≤ b
The optimal solution yk of the linearized problem gives a lower bound for (P):
f(xk) ≥ v(P) = min_{x∈Ω} f(x) ≥ min_{x∈Ω} [f(xk) + ∇f(xk)T(x − xk)] = f(xk) + ∇f(xk)T(yk − xk)
If ∇f(xk)T(yk − xk) = 0 then xk solves (P); otherwise ∇f(xk)T(yk − xk) < 0, i.e., yk − xk is a descent direction for f at xk.
Frank-Wolfe method
Frank-Wolfe method
0. Choose a feasible point x0 and set k = 0
1. Compute an optimal solution yk of the LP problem min ∇f(xk)Tx s.t. Ax ≤ b
2. If ∇f(xk)T(yk − xk) = 0 then STOP, else compute a step size tk
3. Set xk+1 = xk + tk(yk − xk), k = k + 1 and go to step 1.
The step size tk can be predetermined, e.g., tk = 1/(k + 1), or computed by an (exact/inexact) line search.
Theorem. For any starting point x0 the generated sequence {xk} is bounded and any of its cluster points is a global minimum.
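The steps above can be sketched in Python for the special case of a quadratic over a box (the exercises ask for MATLAB; this is only an illustration). Over a box the LP step is solved coordinate-wise by the sign of the gradient, and the exact line search for a quadratic with Hessian 2I has a closed form; the objective and bounds below are those of the worked example that follows.

```python
def grad(x):
    # Gradient of f(x) = (x1 - 3)^2 + (x2 - 1)^2
    return [2 * (x[0] - 3), 2 * (x[1] - 1)]

def frank_wolfe_box(x, lo=0.0, hi=2.0, tol=1e-9, max_iter=100):
    for _ in range(max_iter):
        g = grad(x)
        # LP step over the box: minimize g^T y coordinate-wise
        y = [lo if gi > 0 else hi for gi in g]
        d = [yi - xi for yi, xi in zip(y, x)]
        gap = sum(gi * di for gi, di in zip(g, d))   # grad^T (y - x) <= 0
        if gap > -tol:
            return x                                  # optimality reached
        # Exact line search: f(x + t d) = f(x) + t g^T d + t^2 ||d||^2,
        # minimized at t = -g^T d / (2 ||d||^2), clipped to [0, 1]
        dd = sum(di * di for di in d)
        t = min(1.0, max(0.0, -gap / (2 * dd)))
        x = [xi + t * di for xi, di in zip(x, d)]
    return x

x_opt = frank_wolfe_box([0.0, 0.0])   # reaches (2, 1) in three iterations
```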
Frank-Wolfe method
Example. Solve the problem
min (x1 − 3)² + (x2 − 1)²  s.t.  0 ≤ x1 ≤ 2,  0 ≤ x2 ≤ 2
by means of the Frank-Wolfe method starting from the point x0 = (0, 0) and using an exact line search to compute the step size.
∇f(x) = (2x1 − 6, 2x2 − 2) and ∇f(x0) = (−6, −2); the optimal solution of the linearized problem
min −6x1 − 2x2  s.t.  x ∈ Ω
is y0 = (2, 2). Since ∇f(x0)T(y0 − x0) = −16, a line search is needed: the step size t0 is the optimal solution of
min (x1 − 3)² + (x2 − 1)²  s.t.  x1 = 2t,  x2 = 2t,  t ∈ [0, 1]
Frank-Wolfe method
which is equivalent to min 8t² − 16t s.t. t ∈ [0, 1], thus t0 = 1 and x1 = (2, 2). Since ∇f(x1) = (−2, 2), the optimal solution of the linearized problem
min −2x1 + 2x2  s.t.  x ∈ Ω
is y1 = (2, 0). Now ∇f(x1)T(y1 − x1) = −4, and the step size is the optimal solution of
min (x1 − 3)² + (x2 − 1)²  s.t.  x1 = 2,  x2 = 2 − 2t,  t ∈ [0, 1]
hence t1 = 1/2, x2 = (2, 1) and ∇f(x2) = (−2, 0), thus y2 = (2, 0) (an alternative optimal solution is y2 = (2, 2)). Since ∇f(x2)T(y2 − x2) = 0, the point x2 is the optimal solution of the original problem.
Frank-Wolfe method
Exercise 14. Implement in MATLAB the Frank-Wolfe method with exact line search for solving the problem
min (1/2)xTQx + cTx  s.t.  Ax ≤ b
where Q is a positive definite matrix.
Exercise 15. Run the Frank-Wolfe method with exact line search for solving the problem
min (1/2)(x1 − 3)² + (x2 − 2)²  s.t.  −2x1 + x2 ≤ 0,  x1 + x2 ≤ 4,  −x2 ≤ 0
starting from the point (0, 0). [Use ∇f(xk)T(xk − yk) < 10⁻³ as stopping criterion.]
Exercise 16. Solve the previous problem by means of the Frank-Wolfe method with predetermined step size tk = 1/(k + 1) starting from (0, 0).
Penalty method
Consider a constrained optimization problem
min f(x)  s.t.  gi(x) ≤ 0 ∀ i = 1, …, m   (P)
Define the penalty function
p(x) = ∑_{i=1}^m (max{0, gi(x)})²
and consider the (unconstrained) penalized problem
min pε(x) := f(x) + (1/ε) p(x)  s.t.  x ∈ Rn   (Pε)
Note that
pε(x) = f(x) if x ∈ Ω,  pε(x) > f(x) if x ∉ Ω.
Penalty method
Proposition.
- If f, gi are continuously differentiable, then pε is continuously differentiable
- If f, gi are convex, then pε is convex
- (Pε) is a relaxation of (P), i.e., v(Pε) ≤ v(P) for any ε > 0
- Let x∗ε be an optimal solution of (Pε):
  - if x∗ε ∈ Ω, then x∗ε is optimal also for (P)
  - if x∗ε ∉ Ω, then v(Pε) > v(Pε′) for any ε′ > ε
Penalty method
0. Set ε0 > 0, τ ∈ (0, 1), k = 0
1. Find an optimal solution xk of (Pεk)
2. If xk ∈ Ω then STOP, else set εk+1 = τ εk, k = k + 1 and go to step 1.
Theorem. If f is coercive, then the sequence {xk} is bounded and any of its cluster points is an optimal solution of (P).
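A penalty-method sketch on a one-dimensional toy problem (my own illustration, not from the slides): min (x − 3)² s.t. x − 1 ≤ 0. Each penalized subproblem happens to have a closed-form minimizer, so only the outer loop on ε is shown.

```python
# Penalized problem: min (x-3)^2 + (1/eps) * max(0, x-1)^2.  Setting the
# derivative to zero on the branch x > 1 gives x = (3*eps + 1)/(eps + 1),
# which approaches the constrained optimum x* = 1 from outside as eps -> 0.
eps, tau = 5.0, 0.5
for _ in range(60):
    x = (3 * eps + 1) / (eps + 1)   # minimizer of the penalized problem
    if x - 1 <= 1e-6:               # (approximately) feasible: stop
        break
    eps *= tau

# x is now within 1e-6 of the true optimum x* = 1, approached from outside.
```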
Penalty method
Example. Solve the problem
min (1/2)(x1 − 3)² + (x2 − 2)²  s.t.  −2x1 + x2 ≤ 0,  x1 + x2 ≤ 4,  −x2 ≤ 0
by means of the penalty method starting from ε0 = 5 with τ = 0.5.
[Figure: iterates of the penalty method approaching the feasible region.]
Penalty method
Exercise 17. Implement in MATLAB the penalty method for solving the problem
min (1/2)xTQx + cTx  s.t.  Ax ≤ b
where Q is a positive definite matrix.
Exercise 18. Run the penalty method with τ = 0.5 and ε0 = 5 for solving the problem
min (1/2)(x1 − 3)² + (x2 − 2)²  s.t.  −2x1 + x2 ≤ 0,  x1 + x2 ≤ 4,  −x2 ≤ 0
[Use min(b − Ax) > −10⁻³ as stopping criterion.]
Barrier methods
Consider min f(x) s.t. g(x) ≤ 0, where
- f, gi are convex and twice continuously differentiable
- there is no isolated point in Ω
- there exists an optimal solution (e.g., f coercive or {x : g(x) ≤ 0} bounded)
- the Slater constraint qualification holds: there exists x̄ such that x̄ ∈ dom(f), gi(x̄) < 0, i = 1, …, m
Hence strong duality holds.
Special cases: linear programming, convex quadratic programming
Unconstrained reformulation
The constrained problem min f(x) s.t. g(x) ≤ 0 is equivalent to the unconstrained problem
min f(x) + ∑_{i=1}^m I−(gi(x))  s.t.  x ∈ Rn
where
I−(u) = { 0 if u ≤ 0;  +∞ if u > 0 }
is the indicator function of R−.
Barrier function
I− is neither finite nor differentiable. It can be approximated by the smooth convex function
u ↦ −ε/u, for u < 0,
with parameter ε > 0; the approximation improves as ε → 0. We approximate the problem
min f(x) + ∑_{i=1}^m I−(gi(x))  s.t.  x ∈ Rn
with
min f(x) + ε ∑_{i=1}^m 1/(−gi(x))  s.t.  x ∈ int(Ω)
B(x) = ∑_{i=1}^m 1/(−gi(x)) is called the barrier function;
dom(B) = int(Ω), and B is convex and smooth.
Barrier method
If x∗(ε) is the optimal solution of min f(x) + ε ∑_{i=1}^m 1/(−gi(x)) s.t. x ∈ int(Ω), then
∇f(x∗(ε)) + ε ∑_{i=1}^m (1/[gi(x∗(ε))]²) ∇gi(x∗(ε)) = 0.
Define λ∗i(ε) = ε/[gi(x∗(ε))]² > 0, for any i = 1, …, m. Then the Lagrangian function
L(x, λ∗(ε)) = f(x) + ∑_{i=1}^m λ∗i(ε) gi(x)
is convex and ∇xL(x∗(ε), λ∗(ε)) = 0, hence
f(x∗(ε)) ≥ v(P) ≥ ψ(λ∗(ε)) = min_x L(x, λ∗(ε)) = L(x∗(ε), λ∗(ε)) = f(x∗(ε)) − εB(x∗(ε))
Barrier method
Barrier method
0. Set τ < 1 and ε1 > 0. Choose x0 ∈ int(Ω), set k = 1
1. Find the optimal solution xk of min f(x) + εk ∑_{i=1}^m 1/(−gi(x)) s.t. x ∈ int(Ω), using xk−1 as starting point
2. If εk B(xk) = 0 then STOP, else set εk+1 = τ εk, k = k + 1 and go to step 1
Theorem. If Ω is bounded, then the sequence {xk} is bounded and any of its cluster points is an optimal solution of (P).
Logarithmic barrier
The indicator function I− can also be approximated by the smooth convex function
u ↦ −ε log(−u), with ε > 0.
We approximate the problem
min f(x) + ∑_{i=1}^m I−(gi(x))  s.t.  x ∈ Rn
with
min f(x) − ε ∑_{i=1}^m log(−gi(x))  s.t.  x ∈ int(Ω)
Logarithmic barrier
Logarithmic barrier function
B(x) = −∑_{i=1}^m log(−gi(x))
- dom(B) = int(Ω)
- B is convex
- B is smooth, with
∇B(x) = −∑_{i=1}^m (1/gi(x)) ∇gi(x)
∇²B(x) = ∑_{i=1}^m (1/gi(x)²) ∇gi(x)∇gi(x)T + ∑_{i=1}^m (1/(−gi(x))) ∇²gi(x)
Logarithmic barrier
If x∗(ε) is the optimal solution of min f(x) − ε ∑_{i=1}^m log(−gi(x)) s.t. x ∈ int(Ω), then
∇f(x∗(ε)) + ∑_{i=1}^m (ε/(−gi(x∗(ε)))) ∇gi(x∗(ε)) = 0.
Define λ∗i(ε) = ε/(−gi(x∗(ε))) > 0, for any i = 1, …, m. Then the Lagrangian function
L(x, λ∗(ε)) = f(x) + ∑_{i=1}^m λ∗i(ε) gi(x)
is convex and ∇xL(x∗(ε), λ∗(ε)) = 0, hence
f(x∗(ε)) ≥ v(P) ≥ ψ(λ∗(ε)) = min_x L(x, λ∗(ε)) = L(x∗(ε), λ∗(ε)) = f(x∗(ε)) − mε
Interpretation via KKT conditions
The KKT system of the original problem is
∇f(x) + ∑_{i=1}^m λi ∇gi(x) = 0
−λi gi(x) = 0,  λ ≥ 0,  g(x) ≤ 0
(x∗(ε), λ∗(ε)) solves the system
∇f(x) + ∑_{i=1}^m λi ∇gi(x) = 0
−λi gi(x) = ε,  λ ≥ 0,  g(x) ≤ 0
which is an approximation of the KKT system.
Logarithmic barrier method
Logarithmic barrier method
0. Set tolerance δ > 0, τ < 1 and ε1 > 0. Choose x0 ∈ int(Ω), set k = 1
1. Find the optimal solution xk of min f(x) − εk ∑_{i=1}^m log(−gi(x)) s.t. x ∈ int(Ω), using xk−1 as starting point
2. If m εk < δ then STOP, else set εk+1 = τ εk, k = k + 1 and go to step 1
The choice of τ involves a trade-off: a smaller τ means fewer outer iterations but more inner iterations per subproblem.
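A logarithmic-barrier sketch on a one-dimensional toy problem (my own illustration, not from the slides): min (x − 3)² s.t. x − 1 ≤ 0. Each barrier subproblem has a closed-form stationary point, so only the outer loop on ε is shown.

```python
import math

# Barrier subproblem: min (x-3)^2 - eps*log(1 - x) over x < 1.  Its
# stationary point solves 2*(x-3) + eps/(1-x) = 0, i.e.
# 2*x^2 - 8*x + 6 - eps = 0, whose root with x < 1 is used below.
delta, tau, eps, m = 1e-3, 0.5, 1.0, 1
while m * eps >= delta:
    x = 2 - math.sqrt(4 + 2 * eps) / 2   # minimizer of the subproblem
    eps *= tau

# x approaches the constrained optimum x* = 1 from inside the feasible
# region, and f(x(eps)) - v(P) <= m*eps, matching the slide's bound.
```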
Choice of starting point
How to find x0 ∈ int(Ω)? Consider the auxiliary problem
min s  s.t.  gi(x) ≤ s
- Take any x̄ ∈ Rn and s̄ > max_i gi(x̄), so that (x̄, s̄) is in the interior of the feasible region of the auxiliary problem
- Find an optimal solution (x∗, s∗) of the auxiliary problem with the barrier method starting from (x̄, s̄)
- If s∗ < 0, then x∗ ∈ int(Ω); otherwise int(Ω) = ∅
Logarithmic barrier method
Example. Solve the problem
min (1/2)(x1 − 3)² + (x2 − 2)²  s.t.  −2x1 + x2 ≤ 0,  x1 + x2 ≤ 4,  −x2 ≤ 0
by means of the logarithmic barrier method with δ = 10⁻³, τ = 0.5, ε1 = 1 and x0 = (1, 1).
[Figure: iterates of the logarithmic barrier method inside the feasible region.]
Logarithmic barrier method
Exercise 19. Implement in MATLAB the logarithmic barrier method for solving the problem
min (1/2)xTQx + cTx  s.t.  Ax ≤ b
where Q is a positive definite matrix.
Exercise 20. Run the logarithmic barrier method with δ = 10⁻³, τ = 0.5, ε1 = 1 and x0 = (1, 1) for solving the problem
min (1/2)(x1 − 3)² + (x2 − 2)²  s.t.  −2x1 + x2 ≤ 0,  x1 + x2 ≤ 4,  −x2 ≤ 0