Combinatorial Optimization 1 A gentle introduction to LP and duality Guy Kortsarz
  • Combinatorial Optimization 1

    A gentle introduction to LP and duality

    Guy Kortsarz

  • Combinatorial Optimization 2

    What is an LP

    LP stands for linear programming: minimizing a linear objective function subject to linear constraints.

    Example:

    Minimize −x− 2y

    Subject to

    x+ y ≤ 4

    2x+ y ≤ 6

    −x− y ≤ −1.

    x ≥ 0, y ≥ 0.

  • Combinatorial Optimization 3

    Intersection of half-planes

    See the board.

    The pointy places are called corners, and also Basic Feasible Solutions (BFS).

    Next an example.

  • Combinatorial Optimization 4

    Geometrical interpretation in the

    plane

    We will see on the board the feasible region of the LP. I will call the points in which edges meet corners.

    Claim: there is always a corner that is optimal.

    Proof: on the board.

  • Combinatorial Optimization 5

    The Corners

    1. (0, 1)

    2. (1, 0)

    3. (2, 2)

    4. (0, 4)

    5. (3, 0)

    The optimal point is (0, 4). The minimum is

    −8.
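    As a sanity check, the objective can be evaluated at the five corners above (a small Python sketch, with the corners hardcoded from this example):

```python
# Corners of the feasible region of the example LP.
corners = [(0, 1), (1, 0), (2, 2), (0, 4), (3, 0)]

def objective(x, y):
    return -x - 2 * y

# By the claim, the optimum is attained at a corner.
best = min(corners, key=lambda c: objective(*c))
print(best, objective(*best))  # (0, 4) -8
```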

  • Combinatorial Optimization 6

    How to get a corner?

    Every two constraints whose intersection point is feasible give a corner. See on the board.

    Check that a BFS is indeed a corner

  • Combinatorial Optimization 7

    Linear Algebra

    Consider an n× n matrix A. We say that the columns v1, v2, . . . , vn are independent if there do not exist n numbers c1, . . . , cn, some of which are not 0, so that

    c1 · v1 + c2 · v2 + . . .+ cn · vn = 0

    Another way to write it: A · x = 0 has no nonzero solution.

    If the rows are independent we say that the rank of the matrix is n. Otherwise the rank of the matrix is the maximum number of independent rows.

    Theorem: an n× n matrix A has an inverse and nonzero determinant iff all the rows are independent.

  • Combinatorial Optimization 8

    The rank of the rows and columns

    Given a collection of m rows, its rank is the maximum number of rows that we can take that are independent. The same holds for the columns.

    Theorem (no proof): the rank of the rows

    equals the rank of the columns

    If the matrix is n× n and we want a unique

    solution for Ax = b all the rows (hence

    columns) have to be independent

  • Combinatorial Optimization 9

    The rank of our LPs

    Note that we have for every variable x the constraint x ≥ 0. If you take these rows only, you get the n× n identity matrix, whose rank is n. This is the rank, since it can not be more than n.

    Since the rank of the columns equals the rank of the rows, the rank of the rows is always n, namely the number of variables (note: this is a special case).

  • Combinatorial Optimization 10

    How to find one BFS

    1. Choose n independent rows (see the case of

    2× 2).

    2. Choose the n× n matrix associated with these inequalities and the respective part of b, and change the inequalities to equalities.

    3. Note that the n× n matrix is invertible since it has rank n by choice.

    4. Solve via Gauss elimination.

    5. Return this solution (each such choice gives a corner).
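    For the 2-variable running example, the recipe above can be sketched in a few lines (a sketch only; Cramer's rule stands in for Gauss elimination in the 2× 2 case):

```python
def corner_from_rows(a, b):
    """Solve the 2x2 equality system a x = b by Cramer's rule.
    Returns None if the two rows are dependent (determinant 0)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        return None  # dependent rows: no unique corner from this pair
    x = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    y = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return (x, y)

# Tighten x + y <= 4 and 2x + y <= 6 from the example to equalities:
print(corner_from_rows([[1, 1], [2, 1]], [4, 6]))  # (2.0, 2.0)
```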

  • Combinatorial Optimization 11

    A very slow algorithm: going on

    all corners

    1. opt←∞

    2. While there is a subset of n rows not

    chosen do:

    (a) Choose a collection of n rows not tested

    yet.

    (b) Define the n× n resulting matrix.

    (c) Check if the determinant is not 0.

    (d) If it is not 0, find the inverse matrix and the unique solution, and plug it into the objective function.

    (e) If the solution is feasible and plugging it in gives a smaller value opti, update opt← opti.

    3. Return opt (and the respective corner)
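    A sketch of this brute-force algorithm on the 2-variable example from earlier (with an explicit feasibility check, since the intersection of two constraints may fall outside the feasible region):

```python
from itertools import combinations

# All constraints of the example, written as a1*x + a2*y <= c
# (the constraints x >= 0 and y >= 0 included in the same form).
rows = [((1, 1), 4), ((2, 1), 6), ((-1, -1), -1), ((-1, 0), 0), ((0, -1), 0)]

def feasible(x, y):
    return all(a1 * x + a2 * y <= c + 1e-9 for (a1, a2), c in rows)

opt, corner = float("inf"), None
for ((a1, a2), c1), ((b1, b2), c2) in combinations(rows, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:              # step (c): skip dependent pairs
        continue
    x = (c1 * b2 - a2 * c2) / det   # step (d): the unique solution (Cramer)
    y = (a1 * c2 - c1 * b1) / det
    if feasible(x, y):
        val = -x - 2 * y            # plug into the objective
        if val < opt:
            opt, corner = val, (x, y)

print(opt, corner)  # -8.0 (0.0, 4.0)
```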

  • Combinatorial Optimization 12

    The vertex cover problem

    Input: A graph G and a cost cv for every v.

    Output: A minimum cost U ⊆ V so that for every e = uv, either u ∈ U or v ∈ U or both.

    Integer programming is the (very hard)

    problem of obtaining solutions with integral

    values.

  • Combinatorial Optimization 13

    Integer programming for VC

    Minimize ∑v cv · xv

    subject to

    xv + xu ≥ 1 for every e = (u, v)

    xv ∈ {0, 1}

  • Combinatorial Optimization 14

    What does it mean?

    If xv and xu can only be 0 and 1, the vertices

    whose value is 1 give a minimum vertex cover.

    Check that a VC corresponds to an IP solution and vice versa.

    Thus IP is NPC.

    However, we later sketch the proof for:

    Theorem: The Linear programming problem

    (in which fractional values are allowed) can be

    solved in time polynomial in the input

  • Combinatorial Optimization 15

    It is very easy to pose almost every

    NPC problem as an IP

    A relaxation is replacing a constraint x ∈ {0, 1} by the constraint x ≥ 0 (this works for minimization problems).

    The following LP is called the fractional LP for

    Vertex Cover

    Minimize ∑v cv · xv

    subject to

    xv + xu ≥ 1 for every e = (u, v)

    xv, xu ≥ 0

  • Combinatorial Optimization 16

    Why is this a relaxation

    The strict xu ∈ {0, 1} is made much less strict

    by allowing any fractional solution.

    Why do we do that? Because LP can be solved in polynomial time.

    What is the “damage”?

    What does it mean that xv = 2/3? A vertex v can not be 2/3 inside the solution.

    We lost the meaning the IP had.

  • Combinatorial Optimization 17

    Why would a fractional solution be

    useful for us?

    An LP solution was used countless times to

    derive approximation algorithms.

    The main reason is the following observation:

    Let opt be the value of the best IP solution for our problem (which also gives the best solution itself). Let optf be the (optimum) LP value.

    Fact: optf ≤ opt.

    Because LP minimizes over a much larger universe than {0, 1}^n.

    So we can develop a very simple plan to get an

    approximation.

  • Combinatorial Optimization 18

    Approximating via LP

    1) Write an IP equivalent to the problem (this

    is usually very easy).

    2) Relax the constraints: allow fractional solutions. Find the best fractional solution in polynomial time.

    3) Try to set each fraction xi in x to 0 or 1 (for example, if xi = 0.9 it will very likely be set to 1).

    4) Show that the cost of the new 0, 1 solution is at most ρ · optf for some ρ > 1.

    Remark: This last step may be highly difficult, and may not be possible without losing a lot.

    Clearly, this implies a ρ approximation.

    The above process is called “rounding”. There is an almost infinite theory about rounding (easily enough material for 10 courses).

  • Combinatorial Optimization 19

    For VC with vertex costs an

    approximation of 2 is trivial.

    Set S to be {v | xv ≥ 1/2}.

    By the constraint of an edge (u, v), either xv or xu is at least 1/2, so the solution is feasible.

    Clearly the LP optimum is not larger than the IP optimum (because the LP minimizes over a larger domain).

    Thus while xu may contribute only cu/2 to the objective function, we pay cu. This is a ratio 2 algorithm.
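    The rounding step can be sketched in a few lines (the fractional values below are a hypothetical LP optimum for a triangle with unit costs, hardcoded for illustration):

```python
# Threshold rounding for vertex cover on a triangle with unit costs.
# frac is a (hypothetical) optimal fractional LP solution: 1/2 per vertex.
edges = [(0, 1), (1, 2), (0, 2)]
frac = {0: 0.5, 1: 0.5, 2: 0.5}

S = {v for v, xv in frac.items() if xv >= 0.5}  # keep vertices with x_v >= 1/2

assert all(u in S or v in S for u, v in edges)  # every edge is covered
lp_value = sum(frac.values())                   # 1.5
cost = len(S)                                   # 3, exactly 2 * lp_value
print(S, cost, lp_value)
```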

  • Combinatorial Optimization 20

    Integrality gap

    We denote throughout the optimum of our

    combinatorial problem by opt and the value for

    the LP as optf .

    Usually opt and optf are not the same (albeit there are many special cases in which they are; see the theory of totally unimodular matrices, and even beyond that).

    For minimization problems, the maximum of opt/optf over all instances is called the integrality gap.

    If we base an approximation on LP alone, the ratio can not be better than the integrality gap, because approximation is defined with respect to the worst case.

  • Combinatorial Optimization 21

    The integrality gap for VC is tight

    Consider a clique of size n. In a clique every two vertices are connected by an edge (namely n(n− 1)/2 edges in total).

    Note that we must take a set of (any) n− 1 vertices for the solution: if two vertices u, v are not taken then, since every two vertices are neighbors, the edge uv is not covered.

    However, in a fractional solution we can give 1/2 to every vertex.

    Thus n− 1 versus n/2: an integrality gap of 2 (up to low order terms).

  • Combinatorial Optimization 22

    Convexity

    An area is convex if for any two points in it, the line segment between them is contained in it.

    In LP, quite trivially, the feasible region is a polytope (also sometimes called a simplex), which is a convex body.

    We can only solve programs if the feasible region is convex, and even that only under some constraints. See later.

    The segment between two vectors y and z is the set of points α · y + (1− α) · z for 0 ≤ α ≤ 1.

  • Combinatorial Optimization 23

    Proving that any BFS in the case

    of VC is half integral

    This means that the entries are only {0, 1/2, 1}.

    We know in fact more (no proof): there is an optimum so that all the vertices with value 1 are in the solution, and all those with value 0 are not in the solution. About the 1/2 values? Unclear.

    An easy theorem about BFS: a BFS can not be posed as a convex combination

    x = (1− α) · y + α · z

    of two distinct feasible points y, z with 0 < α < 1.

    Let x be a BFS and say it is not half integral. This means that x has some entries in (1/2, 1) or in (0, 1/2).

  • Combinatorial Optimization 24

    Half integrality proof

    Define a new solution y that depends on x: for every xi ∈ (1/2, 1) we set yi = xi − ε, and for every xi ∈ (0, 1/2) we set yi = xi + ε; all other entries stay as in x.

    Define z as a function of x with the opposite perturbation: for every xi ∈ (1/2, 1) we set zi = xi + ε, and for every xi ∈ (0, 1/2) we set zi = xi − ε.

    No entry reaches 1/2 (or leaves its interval) if ε is small enough.

    It is clear that x = (y + z)/2.

    We now prove that both y and z are feasible, a contradiction.

  • Combinatorial Optimization 25

    Feasibility of y, z

    Consider a constraint xi + xj ≥ 1 for some e = ij, and consider y. The only worry is an entry that decreases, namely xi ∈ (1/2, 1) with yi = xi − ε.

    If xj ∈ (0, 1/2), then yj = xj + ε and so yi + yj = xi + xj ≥ 1.

    Otherwise xj ≥ 1/2. But then, since xi > 1/2, we have xi + xj > 1 strictly, so for ε small enough yi + yj ≥ 1 still holds. In particular, if both xi, xj > 1/2 we have no problem.

    The same proof holds for z.

    Can we break the ratio of 2? It is widely believed that we can not.

  • Combinatorial Optimization 26

    LP can not ALWAYS help in

    approximation

    Consider the maximum independent set

    problem.

    Given an undirected graph G(V,E) find the

    largest possible set U ⊆ V so that no two

    vertices in U share an edge.

    The algebraic constraint for two neighbors is

    that at most one of them can belong to a

    solution. Let xv be a variable corresponding to

    a vertex v.

    Clearly, for every e = (u, v), xu + xv ≤ 1, since we think of xv = 1 as v being in the solution.

  • Combinatorial Optimization 27

    The IP for the Independent Set

    problem

    Note that we can maximize (maximize x is

    minimizing −x which is still linear).

    Maximize ∑v∈V xv

    Subject to

    e = (u, v) xu + xv ≤ 1, for every e ∈ E.

    xv ∈ {0, 1} for every v.

    Convince yourself that this IP is equivalent to

    maximum independent set

  • Combinatorial Optimization 28

    The LP

    Maximize ∑v∈V xv

    Subject to

    xu + xv ≤ 1, for every e = (u, v) ∈ E.

    xv ≥ 0 for every v.

    And it is polynomially solvable.

    But does the LP solution give any information

    on the maximum size independent set?

  • Combinatorial Optimization 29

    Say that the input is a Clique

    Namely a graph in which for every u, v with u ≠ v, (u, v) ∈ E.

    Since every two vertices are connected by an

    edge, we can not take even two vertices into

    any independent set.

    In other words: the integral optimum is 1.

    The fractional solution can ”cheat”.

    Give 1/2 value to all xv.

    The inequalities are met.

    The value of the LP is n/2.
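    This fractional “cheat” is easy to verify mechanically (a sketch; the clique size n below is arbitrary):

```python
# The fractional cheat on a clique K_n: every vertex gets 1/2.
n = 10  # illustrative clique size

# Every edge constraint x_u + x_v <= 1 is met with equality:
assert 0.5 + 0.5 <= 1

fractional_value = n * 0.5  # LP value n/2
integral_value = 1          # no two clique vertices are independent
print(fractional_value, integral_value)  # 5.0 1
```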

  • Combinatorial Optimization 30

    Duality in LP

    Example:

    Minimize x+ 2y

    Subject to

    x+ y ≥ 4

    2x+ 5y ≥ 12

    x ≥ 0, y ≥ 0.

  • Combinatorial Optimization 31

    Valid inequalities

    Say that x, y is feasible.

    Consider 1/2 · (x+ y) + 1/4 · (2x+ 5y). At x we get coefficient 1/2 + 1/2 = 1, and at y we get coefficient 1/2 + 5/4 = 7/4 < 2.

    Thus, since the coefficients are at most those of the objective and x, y ≥ 0, this combination is at most the objective x+ 2y at any feasible x, y. On the other hand, the constraints give that the combination is at least 1/2 · 4 + 1/4 · 12 = 5, so every feasible point has value at least 5.

  • Combinatorial Optimization 32

    A brilliant idea

    Let us do this with parameters. Consider

    w1 · (x+ y) + w2 · (2x+ 5y), with w1, w2 ≥ 0.

    We want the coefficients of x and y to be at most those of the objective function.

    This gives w1 · x+ 2w2 · x ≤ 1 · x

    and

    w1 · y + 5w2 · y ≤ 2y

  • Combinatorial Optimization 33

    Cancel x, y

    w1 + 2w2 ≤ 1

    and w1 + 5w2 ≤ 2

  • Combinatorial Optimization 34

    But do we have an objective function?

    We started with w1 · (x+ y) + w2 · (2x+ 5y).

    The constraints give x+ y ≥ 4 and 2x+ 5y ≥ 12. Therefore, since w1, w2 ≥ 0, replacing x+ y by 4 and 2x+ 5y by 12 can only decrease the value, so 4w1 + 12w2 is never larger than the primal objective.

  • Combinatorial Optimization 35

    The dual program

    Maximize 4w1 + 12w2

    Subject to

    w1 + 2w2 ≤ 1

    w1 + 5w2 ≤ 2

    w1, w2 ≥ 0

    Every feasible solution for the dual is a lower bound on the primal.

  • Combinatorial Optimization 36

    General Duality

    Primal:

    Minimize cT · x

    subject to

    A · x ≥ b

    x ≥ 0

    Dual:

    Maximize bT · y

    subject to

    yT ·A ≤ c

    y ≥ 0

  • Combinatorial Optimization 37

    Duality theory

    The value of any feasible solution for the dual

    is at most the value of any feasible solution for

    the primal.

    cT · x = xT · c ≥ xT ·AT · y = (A · x)T · y ≥ bT · y

    We used here AT · y ≤ c with x ≥ 0 (first inequality), and A · x ≥ b with y ≥ 0 (second inequality).

    • For every dual-feasible y and primal-feasible x, bT · y ≤ cT · x.

    This is called the weak duality theorem.

    But the following is also true:

    max{bT · y : AT · y ≤ c, y ≥ 0} = min{cT · x : A · x ≥ b, x ≥ 0}.

    This is called the strong duality theorem (non trivial proof).
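    The chain of inequalities can be checked numerically on a tiny instance (the data below is made up, chosen only so that both sides are feasible):

```python
# A numeric check of weak duality, b^T y <= c^T x, for
# min c.x  s.t.  A x >= b, x >= 0   (illustrative data).
A = [[1, 1], [2, 5]]
b = [4, 12]
c = [1, 2]

x = [2, 2]        # primal feasible: A x = [4, 14] >= b and x >= 0
y = [0.5, 0.25]   # dual feasible: y^T A = [1.0, 1.75] <= c and y >= 0

assert all(sum(A[i][j] * x[j] for j in range(2)) >= b[i] for i in range(2))
assert all(sum(y[i] * A[i][j] for i in range(2)) <= c[j] for j in range(2))

primal = sum(ci * xi for ci, xi in zip(c, x))  # 6
dual = sum(bi * yi for bi, yi in zip(b, y))    # 5.0
assert dual <= primal
print(dual, primal)  # 5.0 6
```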

  • Combinatorial Optimization 38

    Complementary slackness

    Consider again:

    cT · x = xT · c ≥ xT ·AT · y = (A · x)T · y ≥ bT · y

    And say that x and y are optimum. Thus all the inequalities above must be equalities, by strong duality.

    Consider what happens in coordinate i. If (A · x)i > bi, then the last inequality is an equality only if yi = 0. The same reasoning implies that if (yT ·A)i < ci then xi = 0 must hold. Conversely, if these conditions hold then all the inequalities are equalities, and so x is optimal and so is y.

    Theorem 1. The following conditions are necessary and sufficient for optimality:

    1. For every i, if (A · x)i > bi then yi = 0, and

    2. for every i, if (yT ·A)i < ci then xi = 0.

    This is called complementary slackness

  • Combinatorial Optimization 39

    Example: Vertex cover and

    matching

    Fractional VC: Primal

    Minimize c · xv

    subject to

    xv + xu ≥ 1 for every e = (u, v)

    xv ≥ 0

    Fractional Matching: Dual

    Maximize∑

    xe

    subject to∑

    e∈E(v) ye ≤ 1 for every v ∈ V

    ye ≥ 0
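    On a small example this primal/dual pair can be checked by hand; a Python sketch for the triangle with unit costs (the values below are hypothetical feasible solutions, not produced by an LP solver):

```python
# Triangle graph, unit vertex costs.
edges = [(0, 1), (1, 2), (0, 2)]

# A feasible fractional matching (dual): 1/2 per edge.
y = {e: 0.5 for e in edges}
for v in range(3):
    load = sum(ye for e, ye in y.items() if v in e)
    assert load <= 1  # each vertex sees 1/2 + 1/2 = 1

# A feasible fractional vertex cover (primal): 1/2 per vertex.
x = {v: 0.5 for v in range(3)}
for u, v in edges:
    assert x[u] + x[v] >= 1

matching_value = sum(y.values())   # 1.5
vc_value = sum(x.values())         # 1.5
assert matching_value <= vc_value  # weak duality, tight here
print(matching_value, vc_value)
```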

  • Combinatorial Optimization 40

    The Traveling Sales Person

    Given a graph, a metric on the graph, and a start vertex, return a path that goes via every vertex exactly once, and among those solutions (also called Hamilton paths) select the one with minimum cost.

    This is the best known problem in computer science, and when the newspapers talk about a computer science problem, this problem is usually chosen.

  • Combinatorial Optimization 41

    An LP for TSP

    Let δ(S), for S ⊆ V , be the set of edges with exactly one endpoint in S. For {v} we simply write δ(v).

    Fractional TSP:

    Minimize ∑e ce · xe

    subject to

    ∑e∈δ(S) xe ≥ 2 for every ∅ ≠ S ⊊ V

    ∑e∈δ(v) xe = 2 for every v ∈ V

    xe ≥ 0

    The number of constraints is exponential in |V |; however, we later see that the LP can be solved in polynomial time.

  • Combinatorial Optimization 42

    Integrality gap

    The above program is known to have an integrality gap of 4/3 or worse. Namely, the ratio of integral optimum over fractional optimum is at least 4/3 in some cases.

    On the other hand, a 4/3 upper bound on the

    integrality gap is not known.

    It is known that the Christofides Heuristic

    delivers a solution that is within 3/2 of the LP,

    establishing an upper bound of 3/2 on the

    integrality gap (Wolsey).

    This was improved for some specific metrics.

    Very recently, the directed version of this problem, called asymmetric TSP, was given a constant ratio.

  • Combinatorial Optimization 43

    Sketch of the ideas in the Ellipsoid

    algorithm

    If we use LP we need to show how to solve it in

    polynomial time.

    The linear programming problem:

    Minimize cx

    subject to

    Ax ≥ b, x ≥ 0

    Is polynomially equivalent to answering the question: does A′x ≥ b′ have at least one solution?

    We can use it to look for the smallest z so that Ax ≥ b and −cx ≥ −z, using binary search over z.

  • Combinatorial Optimization 44

    Running time independent of m

    Let L = n+ logP , where P is the largest number in the input. Note that m is not taken into account, because the running time can be seen to depend only on the number of variables and the largest entry.

    Then there is a basic feasible solution (corner) of the polytope whose coordinates are rational numbers yi = pi/qi with pi, qi ≤ 2^L.

  • Combinatorial Optimization 45

    Ellipsoids continued

    Taking a sphere with radius n · 2^L will contain our solution, and the minimum possible volume of a nonempty polytope is 2^(−(n+2)·L). If the volume drops below that, we can show the polytope is empty.

    In each iteration we change the body to a smaller volume body that still contains our solution.

    An ellipsoid is a generalization of the ellipse to n dimensions. If you map a sphere by a positive definite matrix Q you get an ellipsoid.

    The volume of the ellipsoid is proportional to det(Q).

    We start with some ellipsoid E1 that contains the polytope P = {x | A′x ≥ b′}.

    The reduction in the volume is by:

    vol(Ei+1) ≤ vol(Ei) / 2^(1/(2(n+1))).

  • Combinatorial Optimization 46

    Ellipsoids continued

    The number of iterations:

    After 2(n+ 1) iterations the volume is cut in half.

    Recall that the volume starts at 2^(2·n^2·L) and we stop once it is below 2^(−(n+2)·L).

    Thus the number of iterations is Θ(n^4 + n^3 · logP ).

  • Combinatorial Optimization 47

    What do we do at every iteration?

    We multiply n× n positive semi definite matrices. This takes polynomial time; we do not care much which polynomial, since the total time is already high.

    Therefore we get a total time that is polynomial in the input.

    This is important for theory: we can not use LP if we can not solve it in polynomial time.

    The time is not strongly polynomial. In addition there are issues of how many digits after the decimal point you should keep so that rounding errors do not cause problems.

    It is not a good algorithm, but it runs in polynomial time. Later, better polynomial time algorithms (of interior point type) were invented.

  • Combinatorial Optimization 48

    How do we reduce the volume?

    The idea: use the “good center property” of

    the ellipsoid.

  • Combinatorial Optimization 49

    The good center property

    There is a point in the ellipsoid that is called the center.

    If we pass a line through the center of a circle, it divides the circle into two parts of equal area.

    In the ellipsoid there is a center so that no matter what hyperplane you pass through the center, the two parts are both not too large.

    However, each part now looks like a “broken egg”, not an ellipsoid.

    So we replace it by the smallest possible ellipsoid that contains the “correct” half egg.

  • Combinatorial Optimization 50

    Violated constraint

    We need to know which part of the Ellipsoid to

    choose. For this we need to find a violated

    constraint.

    Let x be the center of the ellipsoid

    If x is not a feasible solution, then there is a constraint violated by x.

    Take a parallel hyperplane to the violated

    constraint that goes via the center.

    It cuts the ellipsoid into two “half ellipsoids”

    Encompass the “appropriate half” by Ei+1:

    The minimum possible new ellipsoid that

    contains this half. Use the violated constraint

    to know which half is right.

    It is possible to show that the volume decreases as stated above.

  • Combinatorial Optimization 51

    Geometric illustration


    Figure 1: The hyperplane through the center is

    parallel to the violated constraint

    Thus at the end the time is a (big) polynomial.

  • Combinatorial Optimization 52

    Programs with exponential size

    Note: m, the number of constraints, does not appear in the running time.

    Polynomial time is implied if we can find a violated constraint in polynomial time.

    Example: Fractional TSP:

    Minimize ∑e ce · xe

    subject to

    ∑e∈δ(S) xe ≥ 2 for every ∅ ≠ S ⊊ V

    ∑e∈δ(v) xe = 2 for every v ∈ V

    xe ≥ 0

    Finding a violated constraint amounts to a minimum cut computation: we need to check whether there is a cut of capacity less than 2. If there is, the minimum cut gives a violated constraint.

