Copyright ©1991-2009 by K. Pattipati
Lecture 1: Introduction, Necessary and Sufficient
Conditions for Minima & Convex Analysis
Prof. Krishna R. Pattipati
Dept. of Electrical and Computer Engineering
University of Connecticut
Contact: [email protected], (860) 486-2890
Fall 2009
September 1, 2009
ECE 6437: Computational Methods for Optimization
Introduction
Contact Information
• Room number: ITE 350
• Tel/Fax: (860) 486-2890/5585
• E-mail: [email protected]
Office Hours: Tuesday – Thursday: 11:00-12:00 Noon
Mission or goal
• Provide systems analysts with the central concepts of widely
used optimization techniques
• Requires skills from both mathematics and computer science
• Need a strong background in multivariable calculus and
linear algebra
Outline of Lecture 1
Three Recurrent Themes
• Problem, Algorithms, Convergence Analysis
Optimization Applications
What is an Optimization Problem?
Classification of Optimization Problems
Three Basic Questions of Optimization
• Optimality conditions, algorithm, convergence
Optimality Conditions for Single-Variable and Multivariable Functions
Elementary Convexity Theory
Three Recurrent Themes
Need to mathematically understand the
optimization problem to be solved
Design an algorithm to solve the problem,
that is, a step-by-step procedure for
solving the problem
Convergence Analysis
• How fast does the algorithm converge?
• What is the relationship between rate of
convergence and the size of the
problem?
[Figure: the three interlocking themes of ECE 6437 (Computational Methods in Optimization): the optimization problem (application-specific), algorithmic techniques, and convergence analysis (complexity).]
Applications of Optimization
Sample Applications
• Scheduling in manufacturing systems
• Scheduling of police patrol officers in a city
• Reducing fuel costs in the electric power industry (unit commitment)
• Gasoline blending at TEXACO
• Scheduling trucks at North American Van Lines
• Advertising to reach a certain % of each income group
• Investment portfolio selection to maximize expected returns, subject to
constraints on risk
Technical Areas
• Operations Research, Systems theory (Optimal Control),
Statistics (Design of Experiments), Computer Science, Chemical
and Civil Engineering, Economics, Medicine, Physics, Math,….
What is an Optimization Problem?

Three Attributes:
1. A set of independent variables or parameters x = (x_1, x_2, ..., x_n):
   • x ∈ R^n (continuous vector)
   • x ∈ Z^n, Z = {..., −2, −1, 0, 1, 2, ...} (integer)
   • x ∈ {0, 1}^n (binary)
2. Conditions or restrictions on the acceptable values of the variables: the
   constraints of the optimization problem (e.g., x ≥ 0)
3. A single measure of goodness, termed the objective (utility) function
   or cost function or goal, which depends on (x_1, x_2, ..., x_n):
      f(x) = f(x_1, x_2, ..., x_n), with f : R^n → R
      (f : Z^n → R if x ∈ Z^n; f : {0, 1}^n → R if x is binary)
Abstract Formulation

Abstract Formulation: "Minimize f(x) subject to x ∈ Ω"
   Ω: the feasible set, assumed closed and bounded
• Such problems have been investigated at least since 825 A.D., when the Persian
  author Abu Ja'far Muhammad ibn Musa Al-Khwarizmi wrote one of the first
  books on algebra; the word "algorithm" derives from his name.
• Since the 1950s, a hierarchy of optimization problems has emerged under
  the general heading "Mathematical Programming". The solution
  approach is algorithmic in nature, i.e., construct a sequence
     x^0 → x^1 → ... → x*, where x* minimizes f(x).
A Classification of Mathematical Programming Problems

[Figure: taxonomy of mathematical programming problems. Continuous (x ∈ R^n): nonlinear programming problems (NLP, ECE 6437), with convex programs as a special case and LP (ECE 6108) within those, which in turn contains network programming, separable/resource allocation problems, and assignment problems. Discrete (x ∈ Z^n): NP-hard combinatorial problems (research; no course yet).]
Computational Methods in Optimization: ECE 6437

Unconstrained NLP: no constraints on x ∈ R^n
• Steepest descent (gradient) method
• Conjugate gradient method
• Newton, Gauss-Newton methods & variations
• Quasi-Newton (or) variable metric methods

Constrained NLP: defined by
   h_i(x) = 0, i = 1, 2, ..., m ≤ n   (equality constraints)
   g_i(x) ≥ 0, i = 1, 2, ..., p       (inequality constraints)
   x_i^LB ≤ x_i ≤ x_i^UB, i = 1, 2, ..., n   (simple bound constraints)
• Penalty methods
• Multiplier or augmented Lagrangian methods
• Reduced gradient method
• Recursive quadratic programming
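As a preview of the unconstrained methods listed above, here is a minimal sketch of steepest descent with an Armijo backtracking line search (the test function, starting point, and constants are illustrative assumptions, not from the lecture):

```python
def grad_descent(f, grad, x0, tol=1e-8, max_iter=500):
    """Steepest descent: step along -grad(f) with Armijo backtracking."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        gnorm2 = sum(gi * gi for gi in g)
        if gnorm2 ** 0.5 < tol:          # stop when the gradient is nearly zero
            break
        t, fx = 1.0, f(x)
        # shrink t until f decreases enough: f(x - t g) <= f(x) - 1e-4 t ||g||^2
        while f([xi - t * gi for xi, gi in zip(x, g)]) > fx - 1e-4 * t * gnorm2:
            t *= 0.5
        x = [xi - t * gi for xi, gi in zip(x, g)]
    return x

# Example: f(x) = (x1 - 1)^2 + 10 (x2 + 2)^2, whose minimizer is (1, -2)
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]
x_star = grad_descent(f, grad, [0.0, 0.0])
```

Backtracking guarantees descent at every iteration, which is what makes the simple update x ← x − t ∇f(x) reliable even when a full step of t = 1 would overshoot.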
Computational Methods in Optimization: ECE 6437 (Cont'd)

Special Case 1: Convex programming problem (CPP)
• Convex cost function with convex constraints:
   − f(x) is convex (defined later)
   − g_i(x) is concave (or) −g_i(x) is convex
   − h_i(x) linear: Ax = b, i.e., a_i^T x = b_i, i = 1, 2, ..., m; b ∈ R^m
• Local minimum = global minimum
Special Case 1.1: Linear Programming (LP) Problem
• f(x) is linear: f(x) = c_1 x_1 + c_2 x_2 + ... + c_n x_n = c^T x
• g_i(x) linear: a_i^T x ≥ b_i; i = 1, 2, ..., p
• x_i ≥ 0, i = 1, 2, ..., n; Ax = b, where A is an m × n matrix
• A striking feature of this problem is that the number of basic feasible solutions is
  finite: N ≤ C(n + p, m + p), a binomial coefficient
• Efficient algorithms exist for this problem
   − Revised simplex
   − Interior point algorithms (application of specialized NLP to LP)
• One of the most widely used models in production planning.

Special Cases 1.1.x: Network Flows (LP on networks, i.e., graphs with weights)
• Shortest paths
• Maximum flow problem
• Transportation problem
• Assignment problem
(Linear Programming and Network Flows: ECE 6108)
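The finiteness of candidate solutions can be seen by brute force on a toy LP (the two-variable problem below is an illustrative assumption): every vertex of the feasible polygon is the intersection of two constraint boundaries, so enumerating all such intersections and keeping the best feasible one solves the problem.

```python
from itertools import combinations

# Toy LP: minimize c^T x  s.t.  a_i^T x >= b_i  (illustrative data)
# Written as >= constraints: x1 <= 4, x2 <= 6, 3x1 + 2x2 <= 18, x1 >= 0, x2 >= 0
c = [-3.0, -5.0]                       # minimizing -3x1 - 5x2
A = [[-1.0, 0.0], [0.0, -1.0],
     [-3.0, -2.0], [1.0, 0.0], [0.0, 1.0]]
b = [-4.0, -6.0, -18.0, 0.0, 0.0]

def solve_2x2(a1, a2, b1, b2):
    """Intersection of two constraint boundaries (Cramer's rule)."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        return None                    # parallel boundaries: no vertex
    return [(b1 * a2[1] - b2 * a1[1]) / det,
            (a1[0] * b2 - a2[0] * b1) / det]

best = None
for i, j in combinations(range(len(A)), 2):
    x = solve_2x2(A[i], A[j], b[i], b[j])
    if x and all(A[k][0] * x[0] + A[k][1] * x[1] >= b[k] - 1e-9
                 for k in range(len(A))):
        val = c[0] * x[0] + c[1] * x[1]
        if best is None or val < best[0]:
            best = (val, x)
```

This enumeration is exponential in the problem size, which is exactly why the revised simplex and interior point methods matter in practice.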
Integer Programming (combinatorial optimization) contains intractable
problems with exponential worst-case computational complexity, e.g.:
• Traveling salesperson problem
• VLSI routing
• Testing
• Multi-processor scheduling to minimize makespan
• Bin packing
• Knapsack problem
• …
In ECE 6437, our focus will be on the following problems:
• Unconstrained NLP
• Constrained NLP
• Convex Programming
Integer Programming
Three Basic Questions of Optimization

1. Static Question: How can one determine whether a given point x*
   is a minimum? → Provides theory, stopping criteria, etc.
2. Dynamic Question: If a given point x is not a minimum, then how
   does one go about finding a solution that is a minimum? → Algorithm:
   construct a sequence x^0 → x^1 → x^2 → ... → x*
3. Convergence Analysis:
   • Does the algorithm in 2 converge?
   • If so, how fast? How does ||x_k − x*|| or |f(x_k) − f(x*)| behave?

Let us consider the third question first.

Rate of Convergence Concepts: Suppose we have an algorithm that generates a
sequence {x_k} with a stationary limit point x*. Define a scalar error function
e : R^n → R, e.g., e_k = ||x_k − x*||.
Rate of Convergence - 1

Here ||x|| is defined as any Hölder p-norm:
   ||x||_p = (Σ_{i=1}^n |x_i|^p)^{1/p}
Typically p = 1, 2, or ∞:
   ||x||_1 = Σ_{i=1}^n |x_i|;  ||x||_2 = (x_1^2 + x_2^2 + ... + x_n^2)^{1/2} = (x^T x)^{1/2};  ||x||_∞ = max_i |x_i|
You may also define e_k = |f(x_k) − f(x*)|.

The behavior of e_k as a function of k is directly related to computational efficiency:
   Time complexity = (cost per step) × (number of iterations)

In order to investigate the behavior of e_k, we compare it to
"standard" sequences. One standard form is to look for
   e_{k+1} ≈ β e_k^r as k → ∞
   r: order of convergence (or) asymptotic rate of convergence
   β: convergence ratio or asymptotic error constant

[Figure: error e_k versus iteration k for four sequences A, B, C, D with different convergence rates.]
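The order r and ratio β can be estimated from an observed error sequence, since log e_{k+1} ≈ log β + r log e_k is a straight line in log-log coordinates (a small sketch; the two test sequences are assumptions):

```python
import math

def estimate_order(errors):
    """Least-squares fit of log e_{k+1} = log(beta) + r*log(e_k); returns (r, beta)."""
    xs = [math.log(e) for e in errors[:-1]]
    ys = [math.log(e) for e in errors[1:]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    beta = math.exp(my - r * mx)
    return r, beta

linear = [0.5 ** k for k in range(1, 12)]         # e_{k+1} = 0.5 e_k: r = 1, beta = 0.5
quadratic = [0.5 ** (2 ** k) for k in range(1, 6)]  # e_{k+1} = e_k^2: r = 2, beta = 1
r_lin, beta_lin = estimate_order(linear)
r_quad, beta_quad = estimate_order(quadratic)
```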
Rate of Convergence - 2

   r = 1: linear convergence (geometric); converges if β < 1
   r = 2: quadratic (fast) convergence
   r = 3: cubic (superfast) convergence

If r = 1, β < 1: linear, since lim_{k→∞} e_{k+1}/e_k = β < 1
   r = 1, β = 1: sublinear, since lim_{k→∞} e_{k+1}/e_k = 1
   r = 1, β = 0: superlinear, since lim_{k→∞} e_{k+1}/e_k = 0
   r > 1: superlinear, since lim_{k→∞} e_{k+1}/e_k = lim_{k→∞} β e_k^{r−1} = 0

Examples:
1) e_k = β^k, β < 1: linear (binary search, golden section search, gradient method, regula falsi)
2) e_k = 1/k: sublinear, since e_{k+1}/e_k = k/(k+1) → 1
3) e_k = 1/k!: superlinear (r = 1, β = 0), since e_{k+1}/e_k = 1/(k+1) → 0 as k → ∞
Rate of Convergence - 3

Examples (cont'd):
4) e_{k+1} = q_1 e_k for k even, e_{k+1} = q_2 e_k for k odd (q_1, q_2 < 1): linear
5) For the sequence in 4), lim_{k→∞} e_{k+1}/e_k does not exist (the ratio alternates
   between q_1 and q_2), and yet (e_{k+2}/e_k)^{1/2} = (q_1 q_2)^{1/2} < 1, so the
   convergence is still linear with average ratio (q_1 q_2)^{1/2}
6) e_{k+1} = a e_k^2: r = 2, quadratic (Newton's method)
7) e_{k+1} = M e_k e_{k−1}: r = (1 + √5)/2 ≈ 1.618, the golden section number
   r > 1 ⇒ superlinear convergence rate
   Examples: secant method (r ≈ 1.618); quadratic fit (r ≈ 1.3)
8) e_{k+1} = a_k e_k with a_k → 1/2: linear (r = 1) with β = 1/2,
   since lim_{k→∞} e_{k+1}/e_k = lim_{k→∞} a_k = 1/2 < 1

Most of the methods that we discuss will have 1 ≤ r ≤ 2.
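These rates are easy to observe numerically. The sketch below (solving x^2 = 2, an illustrative choice) compares bisection, whose error is halved each step (r = 1, β = 1/2), with Newton's method, whose error is roughly squared each step (r = 2):

```python
import math

root = math.sqrt(2.0)       # solve f(x) = x^2 - 2 = 0

# Bisection on [1, 2]: the bracketing interval halves every iteration (linear)
lo, hi = 1.0, 2.0
bis_err = []
for _ in range(40):
    mid = 0.5 * (lo + hi)
    if mid * mid < 2.0:
        lo = mid
    else:
        hi = mid
    bis_err.append(abs(mid - root))

# Newton: x <- x - f(x)/f'(x); the error is roughly squared per iteration
x = 2.0
newt_err = []
for _ in range(6):
    x = x - (x * x - 2.0) / (2.0 * x)
    newt_err.append(abs(x - root))
```

After 40 bisection steps the error is about 2^(-41), while Newton is at machine precision after roughly 5 steps, illustrating why quadratic convergence is called "fast".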
Static Question: Necessary and Sufficient
Conditions for Minimum-1

Example: f(x) = x^4 − 12x^3 + 47x^2 − 60x = x(x − 3)(x − 4)(x − 5)

Minima:
• Local or relative: weak (several equivalent minima) or strong (strict) → strict local minimum
• Global: weak or strong (strict) → strict global minimum

[Figure: plot of f(x) versus x showing weak local minima, a strict local minimum, a local minimum, and the strict global minimum.]

Extends to multivariable functions readily.
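For the quartic example above, the stationary points can be located numerically by applying Newton's method to f'(x) = 4x^3 − 36x^2 + 94x − 60; which minimum is found depends on the starting point (a sketch; the starting points are arbitrary assumptions):

```python
def fp(x):    # f'(x) for f(x) = x^4 - 12x^3 + 47x^2 - 60x
    return 4 * x ** 3 - 36 * x ** 2 + 94 * x - 60

def fpp(x):   # f''(x)
    return 12 * x ** 2 - 72 * x + 94

def newton(x, iters=50):
    """Newton's method on f'(x) = 0, i.e., x <- x - f'(x)/f''(x)."""
    for _ in range(iters):
        x = x - fp(x) / fpp(x)
    return x

f = lambda x: x ** 4 - 12 * x ** 3 + 47 * x ** 2 - 60 * x
a = newton(1.0)   # converges to the left local minimizer (near x ~ 0.94)
b = newton(5.0)   # converges to the right local minimizer (near x ~ 4.60)
```

Both points satisfy the first- and second-order conditions (f' = 0, f'' > 0), but only the left one is the global minimum: local information alone cannot distinguish them.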
Static Question: Necessary and Sufficient
Conditions for Minimum-2

Definition: x* is a local minimum of f(x) over Ω if, for some ε > 0, we have
   f(x*) ≤ f(x) ∀ x ∈ Ω with ||x − x*|| < ε,
i.e., f(x*) ≤ f(x) ∀ x ∈ N(x*, ε), where N(x*, ε) = {x : ||x − x*|| < ε}
is an ε-neighbourhood of x*.

Definition: x* is a strict local minimum if f(x*) < f(x) ∀ x ∈ N(x*, ε) \ {x*}.

Definition: x* is a weak (strict) global minimum of f(x) over Ω if
   f(x*) ≤ f(x) (respectively, f(x*) < f(x) for x ≠ x*) ∀ x ∈ Ω.

Remark:
   strict global minimum ⇒ strict local minimum
   strict local minimum ⇏ strict global minimum, except for convex functions

[Figure: an ε-neighbourhood N(x*, ε) around x* in the (x_1, x_2) plane.]
Optimality Conditions of Univariate Functions:
Necessary Conditions

For univariate functions:
• Tangent is horizontal ⇒ slope f′(x*) = (df/dx)|_{x=x*} = 0   (1st order condition)
• Curvature is up ⇒ second derivative f″(x*) = (d²f/dx²)|_{x=x*} ≥ 0   (2nd order condition)

Proof: Suppose x* is a local minimum. Let y = x* + Δx. Then,
by the mean value theorem,
   f(y) = f(x* + Δx) = f(x*) + f′(x*)Δx + (1/2) f″(x* + θΔx)(Δx)², 0 ≤ θ ≤ 1.
Suppose f′(x*) ≠ 0. Then pick Δx = −ε f′(x*), with ε > 0 sufficiently small:
   f(y) − f(x*) = −ε (f′(x*))² + (1/2) ε² (f′(x*))² f″(·) < 0,
a contradiction ⇒ need f′(x*) = 0.
From the first order condition, we have
   f(x* + Δx) − f(x*) = (1/2) f″(x* + θΔx)(Δx)²; 0 ≤ θ ≤ 1.
If f″(x*) < 0, then f″(x* + θΔx) < 0 for some small Δx by continuity
   ⇒ f(x* + Δx) < f(x*), a contradiction ⇒ f″(x*) ≥ 0.
Optimality Conditions of Univariate Functions:
Remarks

For univariate functions:
1. The proof provides a method of advancing from one x to the next:
   take a step Δx = −ε f′(x) s.t. f(x + Δx) = f(x − ε f′(x)) < f(x)
   ⇒ steepest descent or gradient or Cauchy method.
2. These are only necessary conditions. They are not sufficient.
   Example: f(x) = x³; f′(0) = 0, f″(0) = 0, but x = 0 is not a local minimum.
   Such a point is called a saddle point or point of inflection.
3. Note that the first order condition is satisfied by minima, maxima and
   saddle points. Such points are referred to as stationary points.

[Figure: plot of f(x) = x³ with a stationary point of inflection at x = 0.]
Sufficient Conditions of Optimality for a
Univariate Function

For univariate functions:
(i) f′(x*) = 0
(ii) f″(x*) > 0
Necessity of (i) was proven earlier. To show sufficiency, note that
   f(x* + Δx) − f(x*) = (1/2) f″(x* + θΔx)(Δx)² > 0
whenever f″(x* + θΔx) > 0, which by continuity holds for all sufficiently
small Δx when f″(x*) > 0. Hence x* is a strict local minimum.

The above results extend directly to multivariable functions, i.e.,
functions of several variables. Assume f ∈ C², i.e., ∂f/∂x_i and
∂²f/∂x_i∂x_j exist and are continuous.

Univariate → Multivariate:
• derivative → gradient: vector of first order partial derivatives
• second derivative → Hessian: matrix of second order partial derivatives
Conditions of Optimality for a
Multivariate Function-1

Gradient:
   ∇f(x) = g(x) = [∂f/∂x_1, ∂f/∂x_2, ..., ∂f/∂x_n]^T

   ∂f/∂x_i = lim_{Δx_i→0} [f(x_1, ..., x_i + Δx_i, ..., x_n) − f(x_1, x_2, ..., x_n)] / Δx_i
           = lim_{Δ→0} [f(x + Δe_i) − f(x)] / Δ
• Rate of change of f along the direction x_i,
  or slope of the tangent line along x_i,
  or direction of increase in f at x.

Example: f(x) = x_1² + x_1 cos x_2
   ∇f(x) = [2x_1 + cos x_2, −x_1 sin x_2]^T
Conditions of Optimality for a
Multivariate Function-2

Hessian:
   ∇²f(x) = F(x) = [∂²f/∂x_i∂x_j]: an n × n matrix
   F(x) = F^T(x), since f_ij = ∂²f/∂x_i∂x_j = ∂²f/∂x_j∂x_i = f_ji (F is symmetric)
   ⇒ Need only n(n + 1)/2 elements

Example: f(x) = x_1² + x_1 cos x_2
   ∇²f(x) = [ 2          −sin x_2
              −sin x_2   −x_1 cos x_2 ]

A quadratic function:
   f(x) = (1/2) x^T Q x + b^T x + c = (1/2) Σ_{i=1}^n Σ_{j=1}^n q_ij x_i x_j + Σ_{i=1}^n b_i x_i + c
   ∇f(x) = Qx + b;  ∇²f(x) = Q
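The quadratic-function formulas ∇f(x) = Qx + b and ∇²f(x) = Q can be confirmed with central finite differences (a small sketch; the particular Q, b, c, and evaluation point are arbitrary assumptions):

```python
import numpy as np

Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric
b = np.array([-1.0, 2.0])
c = 5.0
f = lambda x: 0.5 * x @ Q @ x + b @ x + c

x0 = np.array([0.7, -1.2])
h = 1e-5
n = len(x0)
grad = np.zeros(n)
hess = np.zeros((n, n))
for i in range(n):
    ei = np.zeros(n); ei[i] = h
    # central difference for the gradient component
    grad[i] = (f(x0 + ei) - f(x0 - ei)) / (2 * h)
    for j in range(n):
        ej = np.zeros(n); ej[j] = h
        # central difference for the Hessian entry
        hess[i, j] = (f(x0 + ei + ej) - f(x0 + ei - ej)
                      - f(x0 - ei + ej) + f(x0 - ei - ej)) / (4 * h * h)
```

For a quadratic, central differences are exact up to floating-point rounding, so the computed grad and hess match Qx0 + b and Q almost to machine precision.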
Summary of Conditions of Optimality for a
Multivariate Function-3

Necessary conditions:
1. ∇f(x*) = 0
2. ∇²f(x*) ≥ 0 (PSD matrix)

Sufficient conditions:
1. ∇f(x*) = 0
2. ∇²f(x*) > 0 (PD matrix)

• PD: Positive Definite
• PSD: Positive Semi-definite

Matrix facts:
1. A symmetric matrix A is PSD iff
   • x^T A x ≥ 0 ∀ x ∈ R^n
   • all principal minors of A have non-negative determinants
   • A can be factored as A = L D L^T with d_i ≥ 0, L unit lower triangular
   A symmetric matrix A is PD iff
   • x^T A x > 0 ∀ x ≠ 0, x ∈ R^n
   • all leading principal minors of A have positive determinants
   • A = L D L^T with d_i > 0 (computation: O(n³/6) operations)
2. For any symmetric matrix A with λ_1 ≥ λ_2 ≥ ... ≥ λ_n, we have the
   Rayleigh inequality:
      λ_n x^T x ≤ x^T A x ≤ λ_1 x^T x
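The factorization test and the Rayleigh inequality can both be checked numerically (a sketch using NumPy; the matrices are arbitrary assumptions). NumPy's Cholesky routine succeeds exactly when a symmetric matrix is PD, which makes it a practical PD test:

```python
import numpy as np

def is_pd(A):
    """PD test via Cholesky: succeeds iff symmetric A is positive definite."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

A = np.array([[2.0, -1.0], [-1.0, 2.0]])   # PD: leading minors 2 and 3 are > 0
B = np.array([[1.0, 2.0], [2.0, 1.0]])     # indefinite: det = -3 < 0

# Rayleigh inequality: lam_min * x'x <= x'Ax <= lam_max * x'x
lam = np.linalg.eigvalsh(A)                # eigenvalues in ascending order: [1, 3]
rng = np.random.default_rng(0)
x = rng.normal(size=2)
q = x @ A @ x
```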
Proof of Optimality Conditions - 1

Proof of necessity:
From the mean value theorem, we have, for any x and y,
   f(y) = f(x) + ∇f(x)^T (y − x) + (1/2)(y − x)^T ∇²f(x + θ(y − x))(y − x); θ ∈ [0, 1]
Take y = x* + λd, where ||d|| = 1 for any norm (usually p = 1, 2, ∞):
   f(x* + λd) = f(x*) + λ ∇f(x*)^T d + (1/2) λ² d^T ∇²f(x* + θλd) d ≜ g(λ)
If x* is a minimum, the scalar function g(λ) has a minimum at λ = 0 ⇒ g′(0) = 0, g″(0) ≥ 0.
   g′(0) = ∇f(x*)^T d = 0 ∀ d ∈ R^n
Taking d = e_i ⇒ ∂f/∂x_i = 0; similarly d = −e_i, and since d is arbitrary,
   ∇f(x*) = 0 (1st order condition); so ||∇f|| will be small near a minimum.
For a local minimum, we also need
   g″(0) = d^T ∇²f(x*) d ≥ 0 ∀ d ∈ R^n ⇒ ∇²f(x*) is PSD.
Proof of Optimality Conditions - 2

Sufficiency: Suppose ∇f(x*) = 0 and ∇²f(x*) > 0. Let ε > 0 be the smallest
eigenvalue of ∇²f(x*). Then
   f(x* + λd) = f(x*) + λ ∇f(x*)^T d + (1/2) λ² d^T ∇²f(x* + θλd) d; θ ∈ [0, 1].
For sufficiently small λ, ∇²f(x* + θλd) > 0 by continuity, and
   f(x* + λd) − f(x*) = (1/2) λ² d^T ∇²f(x* + θλd) d ≥ (ε/2) λ² ||d||² + o(λ²) > 0
   (recall the Rayleigh inequality)
⇒ x* is a strict local minimum.

Note: x* is a strict local maximum if ∇²f(x*) < 0, and a saddle point if ∇²f(x*) is indefinite.

Example: f(x_1, x_2) = x_1² + 2x_2² − 6x_1 − 4x_2 + 5
   ∇f(x) = [2x_1 − 6, 4x_2 − 4]^T = 0 ⇒ x* = (3, 1); ∇²f(x*) = [2 0; 0 4] > 0
   ⇒ strict local minimum (it is also the global minimum. Why?)

Example: f(x_1, x_2) = 2x_1² + 2x_1x_2 − 2x_2² − 14x_1 + 22x_2 + 8
   ∇f(x) = [4x_1 + 2x_2 − 14, 2x_1 − 4x_2 + 22]^T = 0 ⇒ x* = (3/5, 29/5)
   ∇²f(x*) = [4 2; 2 −4]; λ_1,2 = ±√20 ⇒ indefinite ⇒ saddle point
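The second-order test for the first example can be confirmed numerically (a small sketch; it classifies a stationary point from the eigenvalues of the Hessian, here for the quadratic with Hessian diag(2, 4) and stationary point (3, 1)):

```python
import numpy as np

# f(x1, x2) = x1^2 + 2*x2^2 - 6*x1 - 4*x2 + 5
grad = lambda x: np.array([2 * x[0] - 6, 4 * x[1] - 4])
H = np.array([[2.0, 0.0], [0.0, 4.0]])   # constant Hessian of the quadratic

def classify(g, H, tol=1e-10):
    """Second-order classification of a stationary point (g must be ~0)."""
    assert np.linalg.norm(g) < tol, "not a stationary point"
    eigs = np.linalg.eigvalsh(H)
    if eigs[0] > 0:
        return "strict local minimum"
    if eigs[-1] < 0:
        return "strict local maximum"
    return "saddle point or degenerate"

x_star = np.array([3.0, 1.0])
kind = classify(grad(x_star), H)
```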
Convex Sets - 1

Important because local optimum ⇒ global optimum.

Definition: A set Ω ⊂ R^n is convex if, for any two points x^1, x^2 ∈ Ω
and any λ ∈ [0, 1], we have λx^1 + (1 − λ)x^2 ∈ Ω. In words, Ω is convex
if, for every two points x^1 and x^2 in Ω, the line segment joining x^1
and x^2 is also in Ω.

[Figure: examples of convex and nonconvex sets.]
A convex set is one whose boundaries do not bulge inward, (or) do not
have indentations.
Convex Sets - 2

Examples:
1. A hyperplane {x : a^T x = b} is convex.
2. Half spaces H⁺ = {x : a^T x ≥ b} and H⁻ = {x : a^T x ≤ b} are convex.
3. The intersection ∩_i C_i of convex sets is convex. The union need not be.
4. The sum or difference of convex sets is convex (e.g., C + D, C − D).
5. Expansions or contractions of convex sets are convex (e.g., 2C, C/2).
6. The empty set is convex (by definition).

[Figure: the hyperplane x_1 + x_2 − 1 = 0 in the (x_1, x_2) plane, and sketches of C + D, C − D, 2C and C/2 for convex sets C and D.]
Convex Functions - 1

Consider f : Ω ⊂ R^n → R; f(x) is a scalar multivariable function.
f(x) is a convex function on a convex set Ω if, for any two points
x^1, x^2 ∈ Ω,
   f(λx^1 + (1 − λ)x^2) ≤ λf(x^1) + (1 − λ)f(x^2), 0 ≤ λ ≤ 1.

• A convex function bends up.
• A line segment (chord, secant) between any two points never lies below the graph.
• Linear interpolation between any two points x^1 and x^2 overestimates the function.
• f(x) is concave if −f(x) is convex.

[Figure: graph of a convex function; the chord value λf(x^1) + (1 − λ)f(x^2) lies above the curve at λx^1 + (1 − λ)x^2; sketches of convex, concave, and non-convex functions.]
Convex Functions - 2

Examples:
1. A linear function f(x) = c^T x is convex:
      f(λx^1 + (1 − λ)x^2) = λ c^T x^1 + (1 − λ) c^T x^2 = λf(x^1) + (1 − λ)f(x^2)
2. A quadratic function f(x) = x^T Q x is convex iff Q is PSD:
      λf(x^1) + (1 − λ)f(x^2) − f(λx^1 + (1 − λ)x^2)
         = λ (x^1)^T Q x^1 + (1 − λ)(x^2)^T Q x^2 − [λx^1 + (1 − λ)x^2]^T Q [λx^1 + (1 − λ)x^2]
         = λ(1 − λ)(x^1 − x^2)^T Q (x^1 − x^2) ≥ 0, 0 ≤ λ ≤ 1, iff Q is PSD
3. In general, f(Σ_i λ_i x^i) ≤ Σ_i λ_i f(x^i); Σ_i λ_i = 1; λ_i ≥ 0
   (JENSEN'S INEQUALITY): f(E[x]) ≤ E[f(x)]
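Jensen's inequality f(E[x]) ≤ E[f(x)] is easy to verify by simulation for a convex function (a sketch; the choice f(x) = x² and the Gaussian samples are assumptions):

```python
import random

random.seed(0)
f = lambda x: x * x                 # convex
xs = [random.gauss(0.0, 1.0) for _ in range(10000)]

mean_x = sum(xs) / len(xs)
f_of_mean = f(mean_x)                            # f(E[x]): ~0
mean_of_f = sum(f(x) for x in xs) / len(xs)      # E[f(x)]: ~Var(x) = 1
```

Note that the sample version holds exactly (not just in expectation): f applied to the empirical mean never exceeds the empirical mean of f for a convex f.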
Convex Functions - 3

Examples (cont'd):
4. The linear extrapolation at a point underestimates a convex function:
   assume f(x) ∈ C¹; then f(x^2) ≥ f(x^1) + ∇f(x^1)^T (x^2 − x^1)
   (the right-hand side defines the tangent plane at x^1)

Proof:
(only if) f convex ⇒ f(x^1 + λ(x^2 − x^1)) = f(λx^2 + (1 − λ)x^1) ≤ λf(x^2) + (1 − λ)f(x^1)
   ⇒ f(x^2) − f(x^1) ≥ [f(x^1 + λ(x^2 − x^1)) − f(x^1)] / λ
   Taking the limit as λ → 0: f(x^2) − f(x^1) ≥ ∇f(x^1)^T (x^2 − x^1).
(if) Assume the result is true at x^0 = λx^1 + (1 − λ)x^2. Then
   f(x^1) ≥ f(x^0) + ∇f(x^0)^T (x^1 − x^0)
   f(x^2) ≥ f(x^0) + ∇f(x^0)^T (x^2 − x^0)
Multiply the first by λ and the second by (1 − λ) and add
(note λ(x^1 − x^0) + (1 − λ)(x^2 − x^0) = 0):
   λf(x^1) + (1 − λ)f(x^2) ≥ f(x^0) = f(λx^1 + (1 − λ)x^2) ⇒ f is convex.

[Figure: a convex curve lying above its tangent line at x^1.]
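Property 4 can be spot-checked numerically: for a convex function, the tangent line drawn at any point never overestimates the function anywhere (a sketch; f(x) = e^x and the random sample points are assumptions):

```python
import math
import random

random.seed(1)
f = lambda x: math.exp(x)     # convex
df = lambda x: math.exp(x)    # f'(x)

ok = True
for _ in range(1000):
    x1 = random.uniform(-2.0, 2.0)   # point where the tangent is drawn
    x2 = random.uniform(-2.0, 2.0)   # point where both are evaluated
    # first-order underestimate: f(x2) >= f(x1) + f'(x1)(x2 - x1)
    if f(x2) < f(x1) + df(x1) * (x2 - x1) - 1e-12:
        ok = False
```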
Convex Functions - 4

Examples:
5. f(x) ∈ C² is convex iff ∇²f(x) is PSD over a convex Ω.
   (if) By the mean value theorem,
      f(x^2) = f(x^1) + ∇f(x^1)^T (x^2 − x^1)
               + (1/2)(x^2 − x^1)^T ∇²f(x^1 + θ(x^2 − x^1))(x^2 − x^1).
   If ∇²f ≥ 0, the quadratic term is ≥ 0, so
      f(x^2) ≥ f(x^1) + ∇f(x^1)^T (x^2 − x^1) ⇒ f is convex (by Example 4).
   (only if) Suppose d^T ∇²f(x^1) d < 0 for some d. By continuity we can find
   N(x^1, ε) in which, for x^2 = x^1 + λd with λ sufficiently small,
      f(x^2) < f(x^1) + ∇f(x^1)^T (x^2 − x^1),
   a contradiction.
6. The sum of convex functions is convex.
7. The level set Γ = {x : f(x) ≤ c} is convex for all c if f(x) is convex
   (so is the epigraph {(x, y) : y ≥ f(x)}).
   Proof: Let x^1, x^2 ∈ Γ, i.e., f(x^1) ≤ c and f(x^2) ≤ c. Then
      f(λx^1 + (1 − λ)x^2) ≤ λf(x^1) + (1 − λ)f(x^2) ≤ c, 0 ≤ λ ≤ 1,
   so λx^1 + (1 − λ)x^2 ∈ Γ.

[Figure: level contours {x : f(x) = constant} of a convex function in the (x_1, x_2) plane.]
Convex Functions - 5

Examples:
8. Convex programming problem:
      min f(x), f convex
      s.t. Ax = b; g_i(x) ≥ 0, with each g_i concave (−g_i convex)
   Ω_i = {x : g_i(x) ≥ 0} = {x : −g_i(x) ≤ 0} is convex (a level set of a convex
   function); ∩_i Ω_i is convex; {x : Ax = b} is an intersection of hyperplanes
   ⇒ a convex set ⇒ the feasible set is convex.
9. Local optimum ⇒ global optimum.
   (Global ⇒ local is always true!!!)
   To prove local ⇒ global, let x* be a local minimum, but suppose y* ≠ x*
   is a global minimum with f(y*) < f(x*).
   Consider x = λx* + (1 − λ)y* with λ → 1.
   Convexity ⇒ f(λx* + (1 − λ)y*) ≤ λf(x*) + (1 − λ)f(y*) < f(x*),
   so points arbitrarily close to x* have strictly lower cost ⇒ x* cannot be
   a local minimum, a contradiction.
   As a worst case, local minima must be bunched together as shown.

[Figure: a convex f(x) whose minimizers form a single flat segment.]
Convex Functions - 6

Examples:
10. The first order necessary condition is also sufficient: if ∇f(x*) = 0, then
       f(x) ≥ f(x*) + ∇f(x*)^T (x − x*) = f(x*) ∀ x ∈ R^n.
11. f(x) is convex iff the scalar function g(λ) = f(x + λd) is convex ∀ x and d.
12. Since near a minimum x*, ∇²f(x*) ≥ 0, we can apply convex analysis locally.
    In addition, from the Taylor series, for x near x*:
       f(x) ≈ f(x*) + ∇f(x*)^T (x − x*) + (1/2)(x − x*)^T ∇²f(x*)(x − x*)
            = [f(x*) − ∇f(x*)^T x* + (1/2) x*^T ∇²f(x*) x*]   (the constant c)
              + [∇f(x*) − ∇²f(x*) x*]^T x                      (the linear term b^T x)
              + (1/2) x^T ∇²f(x*) x                            (the quadratic term, Q = ∇²f(x*))
            = c + b^T x + (1/2) x^T Q x
    ⇒ a quadratic approximation near x*.
Convex Functions - 7

Example:
   f(x) = −ln(1 − x_1 − x_2) − ln x_1 − ln x_2

   ∇f(x) = [ 1/(1 − x_1 − x_2) − 1/x_1
             1/(1 − x_1 − x_2) − 1/x_2 ] = 0 at x* = (1/3, 1/3)

   ∇²f(x) = [ 1/(1 − x_1 − x_2)² + 1/x_1²     1/(1 − x_1 − x_2)²
              1/(1 − x_1 − x_2)²              1/(1 − x_1 − x_2)² + 1/x_2² ] > 0
   on Ω = {x : x_1 > 0, x_2 > 0, x_1 + x_2 < 1}

   ⇒ Strictly convex.
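A quick numerical check of this example (the interior evaluation points are arbitrary assumptions): the gradient of the barrier function vanishes at (1/3, 1/3), and the Hessian formula yields strictly positive eigenvalues inside Ω.

```python
import numpy as np

# f(x) = -ln(1 - x1 - x2) - ln(x1) - ln(x2) on x1 > 0, x2 > 0, x1 + x2 < 1
def grad(x1, x2):
    s = 1.0 - x1 - x2
    return np.array([1.0 / s - 1.0 / x1, 1.0 / s - 1.0 / x2])

def hess(x1, x2):
    s = 1.0 - x1 - x2
    return np.array([[1.0 / s ** 2 + 1.0 / x1 ** 2, 1.0 / s ** 2],
                     [1.0 / s ** 2, 1.0 / s ** 2 + 1.0 / x2 ** 2]])

g = grad(1.0 / 3.0, 1.0 / 3.0)              # stationary point of the barrier
eigs = np.linalg.eigvalsh(hess(0.2, 0.3))   # any interior point of Omega
```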
Summary
Abstract Definition of an Optimization Problem
Classification of Optimization Problems
Three Basic Questions of Optimization
• Optimality conditions, algorithm, convergence
Optimality Conditions for Single-Variable and Multivariable Functions
Elementary Convexity Theory