In LP ... the objective function & constraints are linear and the problems are “easy” to solve.
Most real-world problems have nonlinear elements and are hard to solve.
Nonlinear Programming Models
Minimize f(x)
s.t. gi(x) (≤, ≥, =) bi, i = 1,…,m
x is the n-dimensional vector of decision variables
f(x) is the objective function
gi(x) are the constraint functions
bi are fixed known constants
General NLP
Example 1 Max 3x1 + 2x2
4
s.t. x1 + x2 1, x1 0, x2 unrestricted
2
Example 2 Max $e^{c_1 x_1} e^{c_2 x_2} \cdots e^{c_n x_n}$
s.t. Ax = b, x ≥ 0
Example 3 Min $\sum_{j=1}^{n} f_j(x_j)$
s.t. Ax = b, x ≥ 0
where each fj(xj) is of the form
[Figure: fj(xj) plotted against xj]
Problems with “decreasing efficiencies”
Examples 2 and 3 can be reformulated as LPs
Max f(x1, x2) = x1x2
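s.t. 4x1 + x2 ≤ 8
x1, x2 ≥ 0
[Figure: feasible region in the (x1, x2) plane with axis intercepts x1 = 2 and x2 = 8 and the contours f(x) = 1 and f(x) = 2]
Optimal solution will lie on the line g(x) = 4x1 + x2 – 8 = 0.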
NLP Graphical Solution Method
• Solution is not a vertex of feasible region.
• For this particular problem the solution is on the boundary of the feasible region.
• This is not always the case.
Solution Characteristics
Gradient of f(x) = ∇f(x1, x2) ≡ (∂f/∂x1, ∂f/∂x2)^T
This gives ∂f/∂x1 = x2, ∂f/∂x2 = x1
and ∂g/∂x1 = 4, ∂g/∂x2 = 1
At optimality we have ∇f(x1, x2) = λ∇g(x1, x2),
so x2 = 4λ and x1 = λ; substituting into 4x1 + x2 = 8 gives λ = 1,
or x2* = 4 and x1* = 1
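The optimality condition above can be checked numerically. The following is a minimal sketch, assuming the optimum lies on the boundary 4x1 + x2 = 8 as the slide states; the helper name `best_on_boundary` and the grid size are illustrative choices, not part of the original material.

```python
# Maximize f(x1, x2) = x1 * x2 along the line 4*x1 + x2 = 8 by a simple scan.
def f(x1, x2):
    return x1 * x2

def best_on_boundary(steps=2000):
    """Scan x1 in [0, 2] with x2 = 8 - 4*x1 and return the best point found."""
    best = (0.0, 8.0, 0.0)
    for k in range(steps + 1):
        x1 = 2.0 * k / steps
        x2 = 8.0 - 4.0 * x1
        value = f(x1, x2)
        if value > best[2]:
            best = (x1, x2, value)
    return best

x1_opt, x2_opt, f_opt = best_on_boundary()

# Gradients at the point found: grad f = (x2, x1), grad g = (4, 1).
grad_f = (x2_opt, x1_opt)
grad_g = (4.0, 1.0)
lam = grad_f[0] / grad_g[0]  # multiplier implied by the first component
print(x1_opt, x2_opt, f_opt, lam)
```

If the scan lands on (1, 4), both components of grad f equal the same multiple of grad g, which is the proportionality the slide describes.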
[Figure: plot of f(x) against x with two local minima, a local maximum, the global maximum, and a stationary point marked]
Nonconvex Function
Let S ⊆ R^n be the set of feasible solutions to an NLP.
Definition: A global minimum is any x0 ∈ S such that
f(x0) ≤ f(x)
for all feasible x not equal to x0.
Function with Unique Global Minimum at x = (1, –3)
What is the optimal solution if x1 0 and x2 0 ?
Min {f(x) = sin(x) : 0 ≤ x ≤ 5}
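For a one-dimensional problem on a bounded interval such as this one, a brute-force grid scan gives a quick estimate of the global minimum. The sketch below is only an illustration; `grid_minimum` and the number of grid points are assumptions introduced here, and a grid scan does not scale to many variables.

```python
import math

def grid_minimum(func, lo, hi, steps=100000):
    """Return the grid point in [lo, hi] with the smallest function value."""
    best_x, best_val = lo, func(lo)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        val = func(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Global minimum of sin(x) on [0, 5]; analytically it is at x = 3*pi/2.
x_star, f_star = grid_minimum(math.sin, 0.0, 5.0)
print(x_star, f_star)
```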
Function with Multiple Maxima and Minima
Constrained Function with Unique Global Maximum and Unique Global Minimum
Convex for univariate f: d²f(x)/dx² ≥ 0 for all x.
Convex function: If you draw a straight line between any two points on f(x), the line will lie on or above the graph of f(x).
Concave function: If f(x) is convex, then –f(x) is concave.
Linear functions are both convex and concave.
Convexity
Definition of Convexity
Let x1 and x2 be two points in S ⊆ R^n. A function f(x) is convex if and only if
f(λx1 + (1–λ)x2) ≤ λf(x1) + (1–λ)f(x2)
for all 0 < λ < 1. It is strictly convex if the inequality sign ≤ is replaced with the sign <.
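The defining inequality can be probed by random sampling. The sketch below is a heuristic under stated assumptions: the helper `violates_convexity`, the sample count, and the tolerance are all illustrative. Finding a violation proves a function is not convex on the interval; finding none is only evidence, not a proof.

```python
import math
import random

def violates_convexity(func, lo, hi, trials=10000, tol=1e-9, seed=0):
    """Search for x1, x2, lam with f(lam*x1 + (1-lam)*x2) > lam*f(x1) + (1-lam)*f(x2)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x2 = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.random()
        lhs = func(lam * x1 + (1 - lam) * x2)
        rhs = lam * func(x1) + (1 - lam) * func(x2)
        if lhs > rhs + tol:
            return True
    return False

# x^2 is convex everywhere; sin(x) is not convex on [0, 5].
square_ok = not violates_convexity(lambda x: x * x, -10.0, 10.0)
sine_ok = not violates_convexity(math.sin, 0.0, 5.0)
print(square_ok, sine_ok)
```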
[Figure: 1-dimensional example. Between x1 and x2, the chord value λf(x1) + (1–λ)f(x2) lies on or above the function value f(λx1 + (1–λ)x2) at the point λx1 + (1–λ)x2]
[Figure: Nonconvex -- Nonconcave Function, f(x) plotted against x]
A positively weighted sum of convex functions is convex:
if fk(x), k = 1,…,m, are convex and α1,…,αm ≥ 0,
then $f(x) = \sum_{k=1}^{m} \alpha_k f_k(x)$ is convex.
Theoretical Result for Convex Functions
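Hessian of f at x:
$$\nabla^2 f(x) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}$$
Example: $f(x) = 2x_1^3 + 3x_2^2 - 4x_1^2 x_2 + 5x_1 - 8$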
Determining Convexity
Single Dimensional Functions:
A function f(x) ∈ C1 is convex if and only if it is underestimated by linear extrapolation; i.e.,
f(x2) ≥ f(x1) + (df(x1)/dx)(x2 – x1) for all x1 and x2.
A function f(x) ∈ C2 is convex if and only if its second derivative is nonnegative.
d²f(x)/dx² ≥ 0 for all x
If the inequality is strict (>), the function is strictly convex.
[Figure: f(x) with the points x1 and x2 marked on the x-axis]
Multiple Dimensional Functions
Definition: The Hessian matrix H(x) associated with
f(x) is the n × n symmetric matrix of second partial
derivatives of f(x) with respect to the components of x.
Example: f(x) = 3(x1)^2 + 4(x2)^3 – 5x1x2 + 4x1
$$\nabla f(x) = \begin{bmatrix} 6x_1 - 5x_2 + 4 \\ 12x_2^2 - 5x_1 \end{bmatrix} \quad \text{and} \quad H(x) = \begin{bmatrix} 6 & -5 \\ -5 & 24x_2 \end{bmatrix}$$
When f(x) is quadratic, H(x) has only constant terms;
when f(x) is linear, H(x) is the zero matrix.
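A hand-derived Hessian is easy to sanity-check with central finite differences. The sketch below does this for the example f(x) = 3(x1)^2 + 4(x2)^3 – 5x1x2 + 4x1, whose second partials work out to 6, –5 and 24·x2; the function names, the step size `h`, and the test point are assumptions made for illustration.

```python
def f(x1, x2):
    return 3 * x1**2 + 4 * x2**3 - 5 * x1 * x2 + 4 * x1

def hessian_analytic(x1, x2):
    """Second partial derivatives of f worked out by hand."""
    return [[6.0, -5.0], [-5.0, 24.0 * x2]]

def hessian_numeric(func, x1, x2, h=1e-3):
    """Central-difference approximation of the 2x2 Hessian."""
    f0 = func(x1, x2)
    d11 = (func(x1 + h, x2) - 2 * f0 + func(x1 - h, x2)) / h**2
    d22 = (func(x1, x2 + h) - 2 * f0 + func(x1, x2 - h)) / h**2
    d12 = (func(x1 + h, x2 + h) - func(x1 + h, x2 - h)
           - func(x1 - h, x2 + h) + func(x1 - h, x2 - h)) / (4 * h**2)
    return [[d11, d12], [d12, d22]]

exact = hessian_analytic(1.0, 2.0)
approx = hessian_numeric(f, 1.0, 2.0)
print(exact)
print(approx)
```

Only the (2, 2) entry depends on x, which is why this f is convex on the region x2 ≥ 25/144 (where 6·24x2 – 25 ≥ 0) but not everywhere.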
Properties of the Hessian
• H(x) is positive semi-definite (PSD) if and only if x^T H x ≥ 0 for all x; it is PSD but not positive definite when there also exists an x ≠ 0 such that x^T H x = 0.
• H(x) is positive definite (PD) if and only if x^T H x > 0 for all x ≠ 0.
• H(x) is indefinite if and only if x^T H x > 0 for some x, and x^T H x < 0 for some other x.
How can we use the Hessian to determine whether or not f(x) is convex?
Multiple Dimensional Functions and Convexity
• f(x) is convex if and only if f(x2) ≥ f(x1) + ∇^T f(x1)(x2 – x1) for all x1 and x2.
• f(x) is convex (strictly convex) if its associated Hessian matrix H(x) is positive semi-definite (definite) for all x.
• f(x) is concave if and only if f(x2) ≤ f(x1) + ∇^T f(x1)(x2 – x1) for all x1 and x2.
• f(x) is concave (strictly concave) if its associated Hessian matrix H(x) is negative semi-definite (definite) for all x.
• f(x) is neither convex nor concave if its associated Hessian matrix H(x) is indefinite.
Testing for Definiteness
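Let the Hessian be
$$H = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1n} \\ h_{21} & h_{22} & \cdots & h_{2n} \\ \vdots & \vdots & & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} \end{bmatrix}$$
Definition: The ith leading principal submatrix of H is the matrix
formed by taking the intersection of its first i rows and i columns.
Let Hi be the value of the corresponding determinant:
$$H_1 = h_{11}, \quad H_2 = \begin{vmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{vmatrix}, \quad \text{and so on until } H_n \text{ is obtained.}$$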
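• Definition – The kth order principal submatrices of an n × n symmetric matrix A are the k × k matrices obtained by deleting n - k rows and the corresponding n - k columns of A (where k = 1, ... , n).
• Example
$$A = \begin{bmatrix} 2 & 1 & 8 \\ 0 & 7 & 5 \\ 6 & 4 & 3 \end{bmatrix}$$
Principal submatrices of order 1: [2], [7], [3]
Principal submatrices of order 2:
$$\begin{bmatrix} 2 & 1 \\ 0 & 7 \end{bmatrix}, \quad \begin{bmatrix} 2 & 8 \\ 6 & 3 \end{bmatrix}, \quad \begin{bmatrix} 7 & 5 \\ 4 & 3 \end{bmatrix}$$
Principal submatrix of order 3: A itself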
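Example:
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Principal submatrices of order 1 (PS_1(A)):
$$[a_{11}], \quad [a_{22}], \quad [a_{33}]$$
Principal submatrices of order 2 (PS_2(A)):
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}, \quad \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}$$
Principal submatrix of order 3: A itself
Leading principal submatrices (LPS(A)):
$$[a_{11}], \quad \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$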
Rules for Definiteness
• H is positive definite if and only if the determinants of all the leading principal submatrices are positive; i.e., Hi > 0 for i = 1,…,n.
• H is negative definite if and only if H1 < 0 and the remaining leading principal determinants alternate in sign: H2 > 0, H3 < 0, H4 > 0, . . .
• H is positive semi-definite if and only if all principal submatrices (not only the leading ones) have nonnegative determinants.
• H is negative semi-definite if and only if every principal submatrix of odd order has determinant ≤ 0 and every principal submatrix of even order has determinant ≥ 0.
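The determinant rules above translate directly into a small classifier. This is a sketch for small symmetric matrices only: the names `det`, `principal_minors` and `classify`, the tolerance, and the test matrices are illustrative assumptions, and cofactor expansion is far too slow for large n (an eigenvalue routine would be the usual choice there).

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def principal_minors(h, k):
    """Determinants of all k-th order principal submatrices of h."""
    n = len(h)
    return [det([[h[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), k)]

def classify(h, tol=1e-12):
    """Classify a symmetric matrix as PD, ND, PSD, NSD or indefinite."""
    n = len(h)
    leading = [det([row[:k] for row in h[:k]]) for k in range(1, n + 1)]
    if all(d > tol for d in leading):
        return "PD"
    if all((d < -tol) if k % 2 == 1 else (d > tol)
           for k, d in enumerate(leading, 1)):
        return "ND"
    minors = {k: principal_minors(h, k) for k in range(1, n + 1)}
    if all(d >= -tol for ds in minors.values() for d in ds):
        return "PSD"
    if all((d <= tol) if k % 2 == 1 else (d >= -tol)
           for k, ds in minors.items() for d in ds):
        return "NSD"
    return "indefinite"

print(classify([[2, 3], [3, 6]]))      # leading minors 2 and 3
print(classify([[18, 24], [24, 32]]))  # second leading minor is 0
print(classify([[-2, 1], [1, -2]]))    # leading minors -2 and 3
print(classify([[1, 2], [2, 1]]))      # leading minors 1 and -3
```

Note that the semi-definite branches look at every principal minor, not just the leading ones, which is the distinction the rules draw.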
Quadratic Functions
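Example 1: f(x) = 3x1x2 + (x1)^2 + 3(x2)^2
$$\nabla f(x) = \begin{bmatrix} 3x_2 + 2x_1 \\ 3x_1 + 6x_2 \end{bmatrix} \quad \text{and} \quad H(x) = \begin{bmatrix} 2 & 3 \\ 3 & 6 \end{bmatrix}$$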
so H1 = 2 and H2 = 12 – 9 = 3
Conclusion f(x) is strictly convex because
H(x) is positive definite.
Quadratic Functions
Example 2: f(x) = 24x1x2 + 9(x1)^2 + 16(x2)^2
$$\nabla f(x) = \begin{bmatrix} 24x_2 + 18x_1 \\ 24x_1 + 32x_2 \end{bmatrix} \quad \text{and} \quad H(x) = \begin{bmatrix} 18 & 24 \\ 24 & 32 \end{bmatrix}$$
H1 = 18 and H2 = 576 – 576 = 0 → H is not PD
• H is positive semi-definite (determinants of all principal submatrices are nonnegative) → f(x) is convex.
• Note, x^T H x = 18(x1 + (4/3)x2)^2 ≥ 0.
Nonquadratic Functions
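Example 3: f(x) = (x2 – (x1)^2)^2 + (1 – x1)^2
$$H(x) = \begin{bmatrix} 12x_1^2 - 4x_2 + 2 & -4x_1 \\ -4x_1 & 2 \end{bmatrix}$$
Thus the Hessian depends on the point under consideration:
At x = (1, 1), $H(1,1) = \begin{bmatrix} 10 & -4 \\ -4 & 2 \end{bmatrix}$, which is positive definite.
At x = (0, 1), $H(0,1) = \begin{bmatrix} -2 & 0 \\ 0 & 2 \end{bmatrix}$, which is indefinite.
Thus f(x) is not convex although it is strictly convex near (1, 1).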
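For a 2 × 2 symmetric matrix the eigenvalues have a closed form, which gives an independent check on the definiteness claims for f(x) = (x2 – x1^2)^2 + (1 – x1)^2. In the sketch below the Hessian entries are differentiated by hand and the helper names are assumptions introduced for illustration.

```python
import math

def hessian(x1, x2):
    """Hessian of f(x) = (x2 - x1^2)^2 + (1 - x1)^2, derived by hand."""
    return [[12 * x1**2 - 4 * x2 + 2, -4 * x1],
            [-4 * x1, 2]]

def eigenvalues_2x2(h):
    """Eigenvalues (smallest first) of a symmetric 2x2 matrix."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

low_11, high_11 = eigenvalues_2x2(hessian(1, 1))  # both positive: PD
low_01, high_01 = eigenvalues_2x2(hessian(0, 1))  # mixed signs: indefinite
print(low_11, high_11)
print(low_01, high_01)
```

Both eigenvalues positive corresponds to positive definite; one negative and one positive corresponds to indefinite.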
Example
$$A = \begin{bmatrix} 0 & 60 \\ 60 & 180 \end{bmatrix}$$
Is matrix A PD, PSD, ND, NSD, or indefinite?
Convex Sets
Definition: A set S ⊆ R^n is convex if any point on the line
segment connecting any two points x1, x2 ∈ S is also in S.
Mathematically, this is equivalent to
x0 = λx1 + (1–λ)x2 ∈ S for all λ such that 0 ≤ λ ≤ 1.
[Figure: example sets, each with a line segment joining two points x1 and x2]
S = {(x1, x2) : (0.5x1 – 0.6)x2 ≤ 1,
2(x1)^2 + 3(x2)^2 ≥ 27; x1, x2 ≥ 0}
(Nonconvex) Feasible Region
Convex Sets and Optimization
Let S = { x ∈ R^n : gi(x) ≤ bi, i = 1,…,m }
Fact: If gi(x) is a convex function for each i = 1,…,m then S is a convex set.
Convex Programming Theorem: Let x ∈ R^n and let f(x) be a
convex function defined over a convex constraint set S. If a
finite solution exists to the problem
Minimize {f(x) : x ∈ S}
then all local optima are global optima. If f(x) is strictly
convex, the optimum is unique.
Note
• Let s = { x ∈ R^n : g(x) ≤ b }. Fact: If g(x) is a convex function, then s is a convex set.
• Let S = { x ∈ R^n : gi(x) ≤ bi, i = 1,…,m }
Fact: If gi(x) is a convex function for each i = 1,…,m then S is a convex set.
• Let t = { x ∈ R^n : g(x) ≥ b }. Fact: If g(x) is a concave function, then t is a convex set.
• Let T = { x ∈ R^n : gi(x) ≥ bi, i = 1,…,m }
Fact: If gi(x) is a concave function for each i = 1,…,m then T is a convex set.
Max f(x1,…,xn)
s.t. gi(x1,…,xn) ≤ bi, i = 1,…,m
x1 ≥ 0,…,xn ≥ 0
is a convex program if f is concave and each gi is convex.
Convex Programming
Min f(x1,…,xn)
s.t. gi(x1,…,xn) ≤ bi, i = 1,…,m
x1 ≥ 0,…,xn ≥ 0
is a convex program if f is convex and each gi is convex.
[Figure: feasible region in the (x1, x2) plane, both axes marked from 1 to 5]
Maximize f(x) = (x1 – 2)^2 + (x2 – 2)^2
subject to –3x1 – 2x2 ≤ –6
–x1 + x2 ≤ 3
x1 + x2 ≤ 7
2x1 – 3x2 ≤ 4
Linearly Constrained Convex Function with Unique Global Maximum
(Nonconvex) Optimization Problem
Commercial optimization software cannot guarantee that a solution is globally optimal to a nonconvex program.
Importance of Convex Programs
NLP algorithms try to find a point where the gradient of the Lagrangian function is zero – a stationary point – and complementary slackness holds.
Given L(x, λ) = f(x) + λ(g(x) – b)
we want
∇L(x, λ) = 0, g(x) – b ≤ 0, λ[g(x) – b] = 0, x ≥ 0, λ ≥ 0
However, for a convex program, all local solutions are global optima.
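The stationarity and complementary slackness conditions can be evaluated mechanically at a candidate point. The sketch below applies them to the earlier graphical example, rewritten as a minimization (min –x1·x2 subject to 4x1 + x2 ≤ 8), which is an assumption made here so that the Lagrangian has the form f + λ(g – b); the function names and tolerance are illustrative.

```python
def kkt_residuals(x1, x2, lam):
    """Residuals for min -x1*x2 s.t. 4*x1 + x2 <= 8 (a minimization rewrite)."""
    # L(x, lam) = -x1*x2 + lam * (4*x1 + x2 - 8)
    grad_l = (-x2 + 4 * lam, -x1 + lam)  # gradient of L with respect to x
    slack = 4 * x1 + x2 - 8              # g(x) - b, must be <= 0
    comp = lam * slack                   # complementary slackness, must be 0
    return grad_l, slack, comp

def is_kkt_point(x1, x2, lam, tol=1e-9):
    grad_l, slack, comp = kkt_residuals(x1, x2, lam)
    return (all(abs(g) <= tol for g in grad_l)
            and slack <= tol
            and abs(comp) <= tol
            and lam >= -tol and x1 >= -tol and x2 >= -tol)

print(is_kkt_point(1.0, 4.0, 1.0))  # the optimum found graphically
print(is_kkt_point(0.5, 6.0, 1.0))  # feasible boundary point, not stationary
print(is_kkt_point(0.0, 0.0, 0.0))  # also satisfies the conditions
```

The origin passes the same test even though its objective value is worse than at (1, 4). The objective –x1·x2 is not convex, so this is a small instance of why satisfying these conditions does not certify a global optimum outside the convex case.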
We want to build a cylinder (with a top and a bottom) of maximum volume such that its surface area is no more than S units.
Max V(r,h) = πr^2 h
s.t. 2πr^2 + 2πrh = S
r ≥ 0, h ≥ 0
[Figure: cylinder with radius r and height h]
There are a number of ways to approach this problem. One way is to solve the surface area constraint for h and substitute the result into the objective function.
Example: Cylinder Design
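$$h = \frac{S - 2\pi r^2}{2\pi r}$$
$$V = \pi r^2 \left[ \frac{S - 2\pi r^2}{2\pi r} \right] = \frac{rS}{2} - \pi r^3$$
$$\frac{dV}{dr} = 0 \;\Rightarrow\; r = \left(\frac{S}{6\pi}\right)^{1/2}, \quad h = \frac{S}{2\pi r} - r = 2\left(\frac{S}{6\pi}\right)^{1/2}$$
$$V = \pi r^2 h = 2\pi \left(\frac{S}{6\pi}\right)^{3/2}$$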
Is this a global optimal solution?
Solution by Substitution
$$V(r) = \frac{rS}{2} - \pi r^3, \qquad \frac{dV(r)}{dr} = \frac{S}{2} - 3\pi r^2, \qquad \frac{d^2V(r)}{dr^2} = -6\pi r$$
d²V/dr² ≤ 0 for all r ≥ 0
Thus V(r) is concave on r ≥ 0 so the solution is a global maximum.
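The closed-form cylinder solution can be compared against a direct scan of V(r). This is a sketch under assumptions: the surface area S = 100 is a hypothetical value chosen only for the check, and the function names and grid size are illustrative.

```python
import math

def cylinder_optimum(s):
    """Closed-form optimum: r = sqrt(S / (6*pi)), h = 2r."""
    r = math.sqrt(s / (6 * math.pi))
    h = 2 * r
    v = math.pi * r**2 * h
    return r, h, v

def volume(r, s):
    """Volume after substituting the surface-area constraint for h."""
    return r * s / 2 - math.pi * r**3

S = 100.0  # hypothetical surface area, chosen only for illustration
r_opt, h_opt, v_opt = cylinder_optimum(S)

# Scan r from 0 up to the radius at which h would drop to zero.
r_max = math.sqrt(S / (2 * math.pi))
grid_best = max(volume(r_max * k / 10000, S) for k in range(10001))
print(r_opt, h_opt, v_opt, grid_best)
```

The scan should never beat the closed-form value, which is consistent with V(r) being concave for r ≥ 0.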
Test for Convexity
• A company wants to advertise in two regions.
• The marketing department says that if $x1 is spent in region 1, sales volume will be 6(x1)^(1/2).
• If $x2 is spent in region 2 the sales volume will be 4(x2)^(1/2).
• The advertising budget is $100.
Model: Max f(x) = 6(x1)^(1/2) + 4(x2)^(1/2)
s.t. x1 + x2 ≤ 100, x1 ≥ 0, x2 ≥ 0
Advertising (with Diminishing Returns)
Solution: x1* = 69.2, x2* = 30.8, f(x*) = 72.1
Is this a global optimum?
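One way to answer the question is to solve the model by hand and compare. Because the objective is increasing in both variables, the budget constraint is binding, and equating marginal returns (3/√x1 = 2/√x2) gives x1 = 100·36/52. The sketch below codes that closed form; the function `allocate` and its parameter names are assumptions introduced for illustration.

```python
import math

def allocate(a, c, budget):
    """Maximize a*sqrt(x1) + c*sqrt(x2) subject to x1 + x2 = budget."""
    x1 = budget * a**2 / (a**2 + c**2)
    x2 = budget - x1
    return x1, x2, a * math.sqrt(x1) + c * math.sqrt(x2)

x1, x2, sales = allocate(6.0, 4.0, 100.0)

# Marginal return per dollar in each region at the solution.
marginal_1 = 3.0 / math.sqrt(x1)
marginal_2 = 2.0 / math.sqrt(x2)
print(round(x1, 1), round(x2, 1), round(sales, 1))
```

Since the objective is a sum of concave functions and the constraints are linear, this is a convex program, so the stationary point is a global maximum.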
Excel Add-in Solution
[Spreadsheet screenshot: Excel add-in nonlinear model]
Name: Adv100   Type: NLP   Goal: Max   Solver: Excel Solver   Status: Optimal   Comp. Time: 00:00
Objective: 72.111 (Linear: 0, NonLinear 1: 72.111, NonLinear 2: 0)
Variables               X1       X2
Values:                 69.231   30.769
Lower Bounds:           0        0
Linear Obj. Coef.:      0        0
Nonlinear Obj. Terms:   8.3205   5.547
Nonlinear Obj. Coef.:   6        4
Constraints
Num.  Name  Value  Rel.  RHS    Linear Constraint Coefficients
1     Con1  100    <=    100    1  1
2     Con2  0      <=    10000  0  0
Let μj = expected return
σjj = variance of return
We are also concerned with the covariance terms:
σij = cov(ri, rj)
If σij > 0 then returns on i and j are positively correlated.
If σij < 0 returns are negatively correlated.
Portfolio Selection with Risky Assets (Markowitz)
• Suppose that we may invest in (up to) n stocks.
• Investors worry about (1) expected gain (2) risk.
Decision Variables: xj = # of shares of stock j purchased
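Expected return of the portfolio: $R(x) = \sum_{j=1}^{n} \mu_j x_j$
Variance (measure of risk): $V(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j$
Example If x1 = x2 = 1 and
$$\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ -2 & 2 \end{bmatrix}$$
we get
V(x) = σ11x1x1 + σ12x1x2 + σ21x2x1 + σ22x2x2
= 2 + (–2) + (–2) + 2 = 0
Thus we can construct a “risk-free” portfolio (from variance point of view) if we can find stocks “fully” negatively correlated.
If
$$\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix} = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$$
then purchasing stock 2 is just like
purchasing additional shares of stock 1.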
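The variance calculation is a double sum over the covariance terms, so it is easy to reproduce. The sketch below evaluates it for two illustrative covariance matrices, one with variances 2 and covariance –2 and one with variances 2 and covariance +2; the function names and the expected-return figures `mu` are assumptions made for the example.

```python
def expected_return(mu, x):
    """R(x): sum of mu_j * x_j."""
    return sum(m * xj for m, xj in zip(mu, x))

def portfolio_variance(cov, x):
    """V(x): double sum of sigma_ij * x_i * x_j."""
    n = len(x)
    return sum(cov[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

negatively_correlated = [[2.0, -2.0], [-2.0, 2.0]]
positively_correlated = [[2.0, 2.0], [2.0, 2.0]]

v_neg = portfolio_variance(negatively_correlated, [1, 1])
v_pos = portfolio_variance(positively_correlated, [1, 1])
v_single = portfolio_variance(positively_correlated, [2, 0])

mu = [0.10, 0.10]  # hypothetical expected returns per share
r_mixed = expected_return(mu, [1, 1])
print(v_neg, v_pos, v_single, r_mixed)
```

With fully positive correlation, one share of each stock carries the same variance as two shares of stock 1, which is the sense in which stock 2 adds no diversification.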
Nonlinear optimization models …
Let pj = price of stock j,
b = our total budget
β = risk-aversion factor (β = 0: risk is not a factor)
Consider 3 different models:
1) Max f(x) = R(x) – βV(x)
s.t. $\sum_{j=1}^{n} p_j x_j \le b$, xj ≥ 0, j = 1,…,n
where β ≥ 0 is determined by the decision maker
2) Max f(x) = R(x)
s.t. V(x) ≤ α, $\sum_{j=1}^{n} p_j x_j \le b$, xj ≥ 0, j = 1,…,n
where α ≥ 0 is determined by the investor. Smaller
values of α represent greater risk aversion.
3) Min f(x) = V(x)
s.t. R(x) ≥ γ, $\sum_{j=1}^{n} p_j x_j \le b$, xj ≥ 0, j = 1,…,n
where γ ≥ 0 is the desired rate of return
(minimum expectation) selected by the investor.