Introduction to convex optimization I
Sergio García1
The University of Edinburgh, UK
June 2018
1 Based on lecture notes by Dr Paresh Date, Brunel University
Outline
• Introduction to convex problems
• Special classes of convex problems
1 Linear programming
2 Convex quadratic programming
The convex optimization problem I
• The problems of interest are of the form

      minimize    f_0(x),
      subject to  f_i(x) ≤ 0,  i = 1, 2, ..., m,
      and         h_i(x) = 0,  i = 1, 2, ..., p,        (1)

  where the functions f_i : dom(f_i) ⊆ R^n → R, i = 0, 1, 2, ..., m, are convex and
  the h_i(x) = a_i^T x − b_i, i = 1, 2, ..., p, are affine.
• Maximization of a concave function subject to convex constraints is also a
convex optimization problem.
The convex optimization problem II
• The set

      D = ⋂_{i=0}^{m} dom(f_i) ∩ ⋂_{i=1}^{p} dom(h_i)

  is the domain of the optimization problem (1). D is convex, since it is an
  intersection of convex sets.
• A point x ∈ D is said to be a feasible point for (1) if it satisfies
  f_i(x) ≤ 0, i = 1, 2, ..., m, and h_i(x) = 0, i = 1, 2, ..., p. The set F of all
  feasible points is called the feasible set or the constraint set.
• The optimal value p* is defined as

      p* = inf {f_0(x) : x ∈ F},

  where p* = −∞ if the problem is unbounded from below.
• A point x* is said to be an optimal point if it is feasible and f_0(x*) = p*.
The convex optimization problem III
• We say that (1) is solvable and the optimum is attained if x* exists; the problem
  is unsolvable if F is empty or if p* = −∞.
• A point x is ε-suboptimal if it is feasible and f_0(x) ≤ p* + ε, where ε > 0.
• A feasible point x*_l is said to be locally optimal if there exists r > 0 such that

      f_0(x*_l) = inf {f_0(x) : x ∈ F, ‖x − x*_l‖_2 ≤ r},

  and is said to be globally optimal if it is optimal over all x ∈ F.
• For convex optimization problems, any local optimum is also a global
  optimum, and the set of points that achieve this optimum is convex.
• This means that if we are searching for an optimum, we can stop once we find a
  local one: there is no better optimum elsewhere in the domain.
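• A quick numerical sketch of this fact (not from the slides; assumes NumPy, with made-up data): for a convex quadratic, gradient descent started from several different points always reaches the same minimizer.

```python
import numpy as np

# Convex quadratic f(x) = 0.5 x^T Q x - b^T x with Q positive definite;
# its unique (global) minimizer solves Qx = b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = np.linalg.solve(Q, b)

def grad(x):
    return Q @ x - b

# Gradient descent from several random starts: every run converges to the
# same point, because any local optimum of a convex problem is global.
rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.normal(size=2) * 10.0
    for _ in range(2000):
        x -= 0.1 * grad(x)
    assert np.allclose(x, x_star, atol=1e-6)
```

For a non-convex objective the same experiment would typically end in different local minima depending on the starting point.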
A simple equivalent formulation
• Note that problem (1) is also equivalent to

      minimize    t,
      subject to  f_0(x) ≤ t,  f_i(x) ≤ 0,  i = 1, 2, ..., m,
      and         h_i(x) = 0,  i = 1, 2, ..., p.        (2)
• Since there is no further constraint on the new decision variable t, at the
  optimum we can simply set t* = f_0(x*). This is also called the epigraph
  formulation.
• This added variable t comes in handy in many cases when f_0(x) itself is less
  convenient to deal with, as we shall see.
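• A minimal sketch of the epigraph formulation in code (assumes SciPy; the toy problem is made up): minimize |x − 2| by introducing t with x − 2 ≤ t and −(x − 2) ≤ t, then minimizing t as an LP.

```python
import numpy as np
from scipy.optimize import linprog

# Variables z = [x, t]; minimize t subject to |x - 2| <= t.
c = np.array([0.0, 1.0])             # objective: minimize t
A_ub = np.array([[1.0, -1.0],        #  x - t <= 2
                 [-1.0, -1.0]])      # -x - t <= -2
b_ub = np.array([2.0, -2.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None), (None, None)])
x_opt, t_opt = res.x                 # optimum: x = 2, t = |x - 2| = 0
```

Note that linprog's default bounds are x ≥ 0, so the bounds must be overridden to make both variables free here.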
Applications of convex optimization
• Within OR, convex optimization problems occur in supply chain planning,
  capacity location, financial portfolio optimization, asset and liability
  management, ...
• Elsewhere, they also occur in data analysis (curve fitting), signal processing,
  control system design, structural optimization, antenna array design, ...
• Special types of (extremely useful) convex optimization problems: linear
  programming (LP), quadratic programming (QP) and semi-definite
  programming (SDP).
• A very significant body of theoretical research, as well as software
  implementations, exists for each of these.
The linear programming problem
• In LP, both the objective function and the constraint functions are linear:
      minimize    c^T x
      subject to  Ax = b,
                  x ≥ 0.        (3)
• The vectors c, b and the matrix A are the problem parameters specifying the
  objective function and the constraint functions.
• Applied convex programming starts with LP; Dantzig's simplex method
  (∼1947–48) made mathematical optimization tractable.
• LP is still a workhorse within financial optimization. You will learn about
  solving large-scale LPs in this course.
Example of LP: the diet problem
• Suppose that there are m basic nutrients.
• A healthy diet needs b_j units of the j-th nutrient per day.
• There are n different food items available, with one unit of item i containing
  a_ji units of nutrient j.
• The price of food item i is c_i per unit.
• How do we minimize the cost of food per day, while keeping the diet healthy?
The diet problem (continued)
• This leads, precisely, to
      minimize    c^T x
      subject to  Ax = b,
                  x ≥ 0,        (4)

  where x_i is the number of units of food item i to be purchased.
• There might be other linear constraints on x, e.g. on the number of units of any
one food item purchased.
• Note: increasing the number of food items from, say, 20 to 200 makes very little
  difference in computational complexity, but ...
• Saying `use any 10 out of 20 food items' makes obtaining an exact solution far
  more difficult, or practically impossible.
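• The diet LP above can be sketched with SciPy's linprog (the nutrient data below are hypothetical, purely for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: m = 2 nutrients, n = 3 food items.
# A[j, i] = units of nutrient j in one unit of food i.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 3.0]])
b = np.array([4.0, 5.0])            # required units of each nutrient per day
c = np.array([1.5, 1.0, 2.0])       # price per unit of each food item

# minimize c^T x  subject to  Ax = b, x >= 0  (linprog's default bounds).
res = linprog(c, A_eq=A, b_eq=b)
# Cheapest healthy diet: x = [1, 2, 0] at daily cost 3.5.
```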
The quadratic programming problem
• In QP, the objective function is convex quadratic and the constraint functions
  are linear, i.e. the problem is of the form

      minimize    (1/2) x^T P x + q^T x + r
      subject to  Gx ≤ h,        (5)
                  Ax = b.
• The matrices P, G, A, the vectors q, h, b and the scalar r are the problem
  parameters.
• The vector inequality in (5) indicates that h − Gx has all non-negative elements.
• The matrix P is required to be positive semi-definite for this problem to be
  convex (x^T P x ≥ 0 for all x).
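• For an unconstrained convex QP the minimizer has a closed form: setting the gradient Px + q to zero gives Px = −q. A minimal NumPy sketch with made-up data:

```python
import numpy as np

# Minimize 0.5 x^T P x + q^T x + r with P positive definite (hence convex).
P = np.array([[2.0, 0.0], [0.0, 4.0]])
q = np.array([-2.0, -8.0])
r = 1.0

# Convexity check: all eigenvalues of P are non-negative.
assert np.all(np.linalg.eigvalsh(P) >= 0)

# Unconstrained optimum: gradient Px + q = 0  =>  x* = -P^{-1} q.
x_star = np.linalg.solve(P, -q)
f_star = 0.5 * x_star @ P @ x_star + q @ x_star + r
# x* = [1, 2], optimal value f* = -8
```

With inequality constraints Gx ≤ h the problem no longer has a closed form and a numerical QP solver is needed.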
Examples of QP: least squares data fitting
• In data fitting problems,

      b = Ax + v,

  where b is a vector of measurements and the perturbation v is assumed to be
  small; we are trying to find a vector x which minimizes the Euclidean norm of
  this perturbation. This leads to the QP

      minimize ‖Ax − b‖_2^2.        (6)
• This has a closed-form solution if there are no constraints on x; it needs to be
  solved numerically if there are constraints, e.g. x ≥ 0.
• In interpolation problems, the matrix A has entries of the form (A)_ij = θ_i^(j−1)
  for given θ_i, i = 1, 2, ..., m, and the problem is to find the coefficient vector x
  of a polynomial p(θ) of a prescribed degree n which best matches the set of
  points (θ_i, b_i).
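• A minimal sketch of the unconstrained case (assumes NumPy; the points are made up): the matrix A is a Vandermonde matrix, and np.linalg.lstsq gives the closed-form least squares solution.

```python
import numpy as np

# Fit p(θ) = x_0 + x_1 θ to points (θ_i, b_i) in the least squares sense.
theta = np.array([0.0, 1.0, 2.0])
b = np.array([1.0, 3.0, 5.0])                # lies exactly on b = 1 + 2θ

A = np.vander(theta, N=2, increasing=True)   # columns: θ^0, θ^1
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
# x = [1, 2]: intercept 1, slope 2, zero residual
```

With a constraint such as x ≥ 0 the closed form no longer applies and the QP must be solved numerically.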
Data fitting in l1 norm as LP
• Recall least squares data fitting; in general, minimizing any vector norm of
  Ax − b is a convex problem.
• In particular, since ‖z‖_1 = Σ_i |z_i|, we can re-formulate minimizing ‖Ax − b‖_1
  over x as a linear program:

      minimize    Σ_i t_i
      subject to  (Ax − b)_i ≤ t_i,
                  (Ax − b)_i ≥ −t_i,

  with one auxiliary decision variable t_i per component of Ax − b.
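• This LP can be sketched with SciPy's linprog (toy data, made up to show robustness to one gross outlier; the l1 fit of a constant is the median):

```python
import numpy as np
from scipy.optimize import linprog

# Fit a single constant x to b in the l1 sense; one entry is a gross outlier.
A = np.ones((3, 1))
b = np.array([0.0, 0.0, 10.0])
m, n = A.shape

# Decision variables z = [x (n entries), t (m entries)]; minimize sum(t).
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[A, -np.eye(m)],        #   Ax - b <= t
                 [-A, -np.eye(m)]])      # -(Ax - b) <= t
b_ub = np.concatenate([b, -b])
bounds = [(None, None)] * n + [(0, None)] * m

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x_l1 = res.x[:n]
# x_l1 = 0 (the median of b); a least squares fit would give the mean 10/3.
```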
Data fitting in l∞ norm as LP
• The infinity norm of a vector is defined by ‖z‖_∞ = max_i |z_i|.
• We can re-formulate minimizing ‖Ax − b‖_∞ as a linear program:

      minimize    t
      subject to  (Ax − b)_i ≤ t,
                  (Ax − b)_i ≥ −t,

  with t as a single auxiliary decision variable.
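• The same pattern with a single auxiliary variable gives the l∞ fit (a SciPy sketch on the same made-up data as the l1 example; the l∞ fit of a constant is the midrange):

```python
import numpy as np
from scipy.optimize import linprog

# Fit a single constant x to b in the l-infinity sense.
A = np.ones((3, 1))
b = np.array([0.0, 0.0, 10.0])
m, n = A.shape

# Decision variables z = [x (n entries), t (one scalar)]; minimize t.
c = np.concatenate([np.zeros(n), [1.0]])
ones = np.ones((m, 1))
A_ub = np.block([[A, -ones],             #   Ax - b <= t
                 [-A, -ones]])           # -(Ax - b) <= t
b_ub = np.concatenate([b, -b])
bounds = [(None, None)] * n + [(0, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x_inf, t_opt = res.x[:n], res.x[n]
# x_inf = 5 (the midrange of {0, 10}); worst-case residual t = 5.
```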
Data �tting: what should you use?
• Least squares is usually the quickest.
• If you want a solution robust to outliers: use the l1-norm.
• If you want the `best worst case' fit: use the l∞-norm.
• For the same set of points (y = 2x + 1 + random noise) with two outliers, we
  can compare the fits obtained by minimising different norms.
Data fitting with outliers: least squares fit

[Figure: scatter plot "Data fitting in the presence of outliers" (x from 0 to 1,
y from 0.5 to 3.5) showing the true points and the least squares fit.]
Least squares vs 1-norm fit

[Figure: the same scatter plot with the true points, the least squares fit and
the one-norm fit overlaid.]
Least squares vs ∞-norm fit

[Figure: the same scatter plot with the true points, the least squares fit and
the ∞-norm fit overlaid.]
Recognizing convex problems
• See if you can re-formulate the problem as LP/QP or SDP (next lecture);
• See if you can re-formulate it as a quasiconvex problem (next lecture);
• Can you arrive at your objective function and constraints via composition of
simpler convex functions?
• Check convexity of functions via the gradient or Hessian, or by testing
  convexity along an arbitrary line.
Recognizing convex problems - example
• Given a decision vector x specifying variables such as retail price and advertising
  spend, let the probability of a consumer buying your product be defined by

      f(x) = exp(a^T x + b) / (1 + exp(a^T x + b)).

  How would you maximize f(x) over x? Assume that there are suitable
  constraints on x, and that a^T x + b ≥ 0.
• h(y) = e^y / (1 + e^y) is concave and non-decreasing for y ≥ 0, and
  g(x) = a^T x + b is affine, so f(x) = h(g(x)) is concave on the feasible region.
  Moreover, since h is strictly increasing, maximizing f is equivalent to
  maximizing g.
• Maximizing g(x) = a^T x + b is a simple linear programming problem if the
  constraints on x are affine.
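• A sketch of the resulting LP with SciPy (the values of a, b and the box constraints below are hypothetical):

```python
import numpy as np
from scipy.optimize import linprog

# Since h is increasing, maximizing f(x) = h(a^T x + b) reduces to
# maximizing the affine function a^T x + b over the feasible set.
a = np.array([1.0, 2.0])
b = 0.5
bounds = [(0.0, 1.0), (0.0, 1.0)]   # assumed affine (box) constraints on x

# linprog minimizes, so minimize -a^T x; the constant b is irrelevant.
res = linprog(-a, bounds=bounds)
x_opt = res.x                       # both coordinates pushed to their upper bound
p = np.exp(a @ x_opt + b) / (1 + np.exp(a @ x_opt + b))
# x_opt = [1, 1]; a^T x + b = 3.5 >= 0, so the concavity assumption holds
```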
Next steps
• Having looked at a few different types of convex optimization problems,
• we will next look at one more special, and important, class of problems
  (semidefinite programs).
• Then we will look at some theoretical analysis of optimization and (finally!) how
  to actually solve these problems.
• This will also include a detour on modelling and solving quasiconvex
  optimization problems.
◦ ◦ ◦