(c) A. Mason www.esc.auckland.ac.nz/Mason/
This material is not to be distributed; contact the author for the latest version and permission to distribute.
PRACTICAL SET PARTITIONING AND COLUMN GENERATION
Andrew Mason
[email protected], www.esc.auckland.ac.nz/Mason
Linkoping 1999, Auckland 2000, Auckland 2001
Set Partitioning Problems
Given a set of objects (with index set I), find a minimal cost partition of I into mutually disjoint subsets.
Example: Copying 2 CD’s onto C60 tapes.
Set of objects:
Possible Subsets:
Cost of Subsets (assuming minimisation objective):
This particular problem is known as:
ENIGMA MCMXC                      Mins
1  The Voice of Enigma            2.13
2  Sadeness                       4.25
3  Find Love                      4.82
4  Sadeness (Reprise)             2.80
5  Callas Went Away               4.48
6  Mea Culpa                      4.87
7  The Voice & The Snake          1.75
8  Knocking on Forbidden Doors    4.45
9  Way to Eternity                2.30
10 Hallelujah                     4.25
11 The Rivers of Belief           3.52
12 Sadeness II                    2.72
13 Mea Culpa II                   6.07
14 Principles of Lust             4.83
15 The Rivers of Belief II        7.07
Total: 60.30
ENIGMA The CROSS of Changes       Mins
1  Second Chapter                 2.27
2  The Eyes of Truth              7.22
3  Return to Innocence            4.28
4  I Love You… I'll Kill You      8.85
5  Silent Warrior                 6.17
6  The Dream of the Dolphin       2.78
7  Age of Loneliness              5.37
8  Out from the Deep              4.88
9  The CROSS of changes           2.38
Total: 44.20
Formal Set Partitioning Definition
Given:
1/ I = {1, …, m}
2/ a collection of subsets P = {P1, P2, …, Pn}, where each Pj ⊆ I
3/ a cost function c(Pj)
then J ⊆ {1, …, n} defines a partition of I if and only if:
1/ ∪_{j∈J} Pj = I (all elements in a subset)
2/ Pj ∩ Pk = ∅ for all j, k ∈ J, j ≠ k (subsets are mutually disjoint)
We seek a minimum cost partition: min Σ_{j∈J} c(Pj) st J partitions I.
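The two partition conditions can be checked directly. A minimal Python sketch (illustrative only, not from the notes; names invented):

```python
def is_partition(I, subsets, J):
    """True iff the subsets indexed by J partition I:
    (1) their union covers every element of I, and
    (2) they are pairwise disjoint."""
    chosen = [subsets[j] for j in J]
    union = set().union(*chosen) if chosen else set()
    total = sum(len(s) for s in chosen)
    # Pairwise disjoint exactly when the sizes add up with no overlap
    return union == set(I) and total == len(union)
```

For example, with I = {1, 2, 3, 4} and subsets {1,2}, {3}, {4}, {2,3}, the first three form a partition but any selection covering element 2 twice does not.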
Integer Programming (IP) Formulation of Set Partitioning:
• Rows correspond to elements of I
• Columns are elements of P
• aij = 1 if element i is in Pj, aij = 0 otherwise
• cj = cost of Pj
• xj = 1 if Pj is in the partition
• all items must be included in the solution

min cᵀx st Ax = 1, x ≥ 0 and integer

Variables: integer (forced 0/1 by the constraints)
Matrix Coefficients: binary
Right hand side: all 1's
Constraints: equalities
LP Dual Variables: unrestricted in sign
Note: x integer ⇒ x will be binary to satisfy the constraints
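For a tiny instance, the IP above can be solved by sheer enumeration of the 0/1 vectors x. A Python sketch (illustrative only; this shows the model, not a practical method):

```python
def solve_set_partitioning(m, columns, costs):
    """Exhaustively find a minimum-cost partition of rows {0,...,m-1}.
    columns[j] is the set of rows covered by subset j; costs[j] its cost.
    Only practical for tiny n -- illustrates Ax = 1 with x binary."""
    best, best_J = None, None
    n = len(columns)
    for mask in range(1 << n):                 # every 0/1 choice of x
        J = [j for j in range(n) if mask >> j & 1]
        covered = sorted(r for j in J for r in columns[j])
        if covered == list(range(m)):          # each row covered exactly once
            cost = sum(costs[j] for j in J)
            if best is None or cost < best:
                best, best_J = cost, J
    return best, best_J
```

With columns {0,1}, {2}, {0}, {1,2} and costs 3, 2, 2, 4, the cheapest partition uses the first two columns at cost 5.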
Set Partitioning Example 1: Airline Planning (Pairings, Tours of Duty) Problem
Partition … … … … … ..… .… … … … … … ..… into … … ..(These tours will later be allocated to people.)
[Figure: Mon-Wed timeline (0000, 0600, 1200, 1800 each day) showing flights between AKL, CHC, WLG, HNL, LAX, MEL, SYD, SIN, LHR, BNE, CNS, DPS, …]
Example and Picture from Air New Zealand
Costs include:
Rules for building columns include:
Integer Programming Formulation:
min cᵀx
st Ax = 1 (one tour covers each flight)
x binary
Set Partitioning Example 2: Political Districting
(Images stolen from http://www.elections.org.nz/elections/general/electorates/index.html)
Partition … … … .… census area units… … … … … … … … .into … … … … … electorates… …
Costs: Shape of electorate, natural unit (eg not split by rivers), deviation from desired population size!
Possible Additional Constraint: Must have 61 electorates
Other Set Partitioning Examples:
Vehicle Routing – columns are routes, rows are deliveries to make
Bin Packing
Set Packing Problems
Variables: Integer (0/1)
Matrix Coefficients: Binary
Right hand side: Binary
Constraints: ≤
LP Dual Variables: ≥ 0 (for the maximisation objective)
Set Packing Example: Cutting of Boards
[Figure: four cutting patterns on a board of 15 numbered pieces (1-15, a 3×5 grid), each pattern selecting a subset of the pieces]

max cᵀx
st Ax ≤ 1 (rows 1-15: each piece cut at most once)
x binary
Set Covering Problems
Variables: Integer (0/1)
Matrix Coefficients: Binary
Right hand side: Binary
Constraints: ≥
LP Dual Variables: ≥ 0 (for the minimisation objective)
Set Covering Example: Mail Deliveries
• Must walk along each street in town to deliver mail for that street.
• Each person starts at 5am, must be finished by 7am.
• If two people walk a street, only 1 does the deliveries (hence 'covering')
Columns:
Cost:
Set Covering vs Set Packing
If changing any 1 in a column into a 0 gives a valid, no more expensive column, then the set covering and set packing solutions are the same.
Set Partitioning/Covering/Packing Generalisations:
Different Possibilities:
Right Hand Side: Binary or Integer (or Real)
Variables: Binary or Integer
A-Matrix coefficients: Binary or Integer (or Real)
Constraints: Mix of <, =, >
Generalised Set Covering Example: Single-Day Shift Generation
Columns are:
Costs:
b = (b1, b2, …, bm)ᵀ is the vector of staffing requirements, one per planning period.

A has one column per possible shift. Each column has a consecutive run of 1's covering the periods the shift works, with a block of columns for each shift length (3 hour shifts, 4 hour shifts, 5 hour shifts, 8 hour shifts), each block containing one column per possible start time – a staircase structure.
Variables: integer ≥ 0 (number of staff on each shift)
Matrix Coefficients: binary (consecutive 1's)
Right hand side: integer staffing requirements
Constraints: ≥
[Figure: Officers on Duty (0-25) by time of day (0500-2300), showing Part time, Full time, Smoothed Workload, Arrivals, Departures and NS+Mon-Min]
Generalised Set Covering: Example from NZ Customs
Generalised Set Covering Example: Personalised Single-Day Shift Generation
Note: The Gx=e constraints are known as… GUB (generalised upper bound) OR convexity constraints.
min cᵀx
st Ax ≥ b (staffing requirements)
Gx = e (one shift per staff member)
x binary

Variables: binary
Matrix Coefficients: binary
Right hand side: b integer, e a vector of 1's
Constraints: mix of ≥ and =
Set Partitioning Example: ToD Allocation to Crew ("Rostering")
The problem here is to take the optimal Tours of Duty (ToDs) produced earlier, and allocate them to staff, eg cabin crew. We assume that each ToD requires 4 cabin crew.
[Figure: ToDs 1-12 plotted against days (M T W T F S S) over a four-week roster, and again over a two-week roster]
min cᵀx
st Ax = b (rows 1-12: one row per ToD)
x integer

Variables: integer (number of crew on each roster line)
Matrix Coefficients: binary
Right hand side: 4 (crew required per ToD)
Constraints: =
Generalised Set Covering Example: Group Single-Day Shift Generation
Building shifts with couples who prefer to work together.
min cᵀx
st Ax ≥ b
Gx = e
x binary

Variables: binary
Matrix Coefficients: integer (eg a 2 in each period a couple works together)
Right hand side: integer
Constraints: mix of ≥ and =
Fractional matrix coefficients can arise, eg…
Elastic Constraints
Eg for generalised set partitioning:
Introduce costed slack and surplus variables
Min (cᵀ | c_slackᵀ | c_surplusᵀ) (x; xslack; xsurplus)
St. (A | I | −I) (x; xslack; xsurplus) = b

i.e. each row gains a slack variable with a +1 coefficient and cost c_slack, and a surplus variable with a −1 coefficient and cost c_surplus.
Notes: we don't normally enforce integrality of the slack/surplus variables (it happens naturally)
can put bounds on u and s, and/or piecewise linear costs
This problem is more stable in the sense that the LP Dual Variables are now… bounded: −c_surplus ≤ π ≤ c_slack (from the reduced costs of the slack and surplus columns).
Solution Strategies
"Enumeration with Implication" (Constraint Logic Programming) for Set Partitioning
Worked example (6 rows, 10 columns):

     x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
min ( 5  4  3  5  2  2  2  1  9  6 ) x,  st Ax = 1 (binary A-matrix, constraints 1-6)

Branch on the columns that can cover row 1:
• x2 (c=4). Possible successors: x6, x10. Constraint 5 then forces x10 = 1. {x2, x10} is feasible, cost = 7.
• x4 (c=5). Possible successors: x7, x8. Constraint 6 ⇒ infeasible.
• x9 (c=9). Possible successors: x1, x6, x7. Constraint 6 forces x6 (cost now 11), and then x7 is required. {x9, x6, x7} is feasible, cost = 13.
Comments:
Could we bound? Yes, if all cj > 0; eg the solution {x9, x6} can be bounded by the first solution found.
But the bounds are weak: we only have partial solutions, with no LP to give better bounds.
Can incorporate implication into IP B&B… see later.
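The enumeration-with-implication scheme above can be sketched in a few lines of Python (illustrative only; a real CLP solver propagates much stronger implications):

```python
def enumerate_partitions(m, columns, costs):
    """Depth-first enumeration with implication for set partitioning:
    always branch on the lowest uncovered row, trying only columns that
    cover it and do not clash with rows already covered.  Uses the
    simple cost bound from the notes (valid when all costs >= 0)."""
    best = [float('inf'), None]

    def recurse(covered, J, cost):
        if cost >= best[0]:                   # bound by best solution so far
            return
        if covered == set(range(m)):          # all rows covered exactly once
            best[0], best[1] = cost, list(J)
            return
        r = min(set(range(m)) - covered)      # lowest uncovered row
        for j, col in enumerate(columns):
            if r in col and not (col & covered):
                recurse(covered | col, J + [j], cost + costs[j])

    recurse(set(), [], 0)
    return best[0], best[1]
```

On the tiny instance with columns {0,1}, {2}, {0}, {1,2} and costs 3, 2, 2, 4, it finds the cost-5 partition and prunes the cost-6 alternative.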
Can often preprocess A matrix to identify cost/constraint implications
Can use for Covering/Packing, but implications not so strong
CLP can be much better than IP
NB: Cplex preprocessing “probing” is close to CLP
References:eg INFORMS Journal on Computing Volume 10, Number 3, 1998 (pubsonline.informs.org)
Heuristics
Many are based on Lagrangian relaxation, genetic algorithms, simulated annealing etc. Some are very good.
COLUMN GENERATION AND DECOMPOSITION
Andrew Mason
[email protected], www.esc.auckland.ac.nz/Mason
Linkoping 1999, Auckland 2000, Auckland 2001
Integer Programming for Set Partitioning
Solve the linear programming relaxation, then use "branch and bound" or "branch and cut" to integerise.
When there is a choice of set partitioning or set covering as a formulation, set covering is preferred (Barnhart et al 1998):
• Its linear programming relaxation is numerically far more stable and thus easier to solve;
• It is trivial to construct a feasible integer solution from a solution to the linear programming relaxation.
Solving the Set Partitioning Linear Programming Relaxation
We assume A is an m×n matrix, with n >> m. We solve the LP relaxation of:

Min z = cᵀx
st Ax = b
x ≥ 0, integer

NB: Generally hard to solve as these LP's are very… degenerate.
Standard LP procedure:
Repeat
    Price all (non-basic) columns to find an entering variable
    If an entering variable is found
        Enter variable into basis, remove leaving column, update x and the π's
Until no entering column is found
Note:
Let aj denote the j'th column of A, so A ≡ (a1 | a2 | a3 | … | an).
For binary A matrices, let I(aj) = {i : aij = 1} be the indices of the rows column j contributes to.
Now, the reduced cost for xj is given by
rc(xj) = cj − πᵀaj = cj − Σ_{i ∈ I(aj)} πi
Does it matter if we price basic columns? No
Why? They will have 0 reduced cost
But beware numerical error… we don't want a basic column to enter!
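For binary A matrices the pricing formula above is just a sum of duals over the column's rows. A minimal Python sketch (illustrative only, names invented):

```python
def reduced_cost(c_j, rows_j, pi):
    """rc(x_j) = c_j - pi'a_j = c_j - sum of duals over the rows that
    column j covers (binary A-matrix, as in the notes)."""
    return c_j - sum(pi[i] for i in rows_j)
```

A column prices out as an entering candidate when this value is negative; basic columns price to (numerically near) zero.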
Standard LP with Partial Pricing
Assume the variables are divided (perhaps naturally) into p subsets X1, X2, …, Xp.
Eg, for the Personalised Shift Generation problem, Xs = columns for staff member s.
[Problem structure: min cᵀx st Ax ≥ b, with one GUB row per staff member s = 1, …, p; the columns in Xs share the s'th GUB row]
Repeat
Repeat
Price all (non-basic) columns in subset Xs
s = s+1
if (s>p) s=1
Until a ‘sufficiently good’ entering variable is found or all columns priced
If an entering variable is found
Enter variable into basis, remove leaving column, update x, and Pi’s
Until no entering column is found
Absolutely vital for fast solution of large problems. Much more efficient memory access.
What is sufficiently good?
[Figure: LP cost vs. time]
LP with Sprint Pricing (Multiple Pricing)
Maintain an active set.
Use the active set for fast iterations.
Do a big pricing pass occasionally.
Let Aactive denote the active subset of columns from A.
s = 1
Repeat
Repeat
Price all (non-basic) columns in active subset Aactive
If a ‘sufficiently good’ entering variable is found
Enter variable into basis, remove leaving column, update x, Pi’s
until no ‘sufficiently good’ entering column is found
Price A to find a set of good entering columns (-ve r.c.)
if good (or any) entering columns are found
Add entering columns to Aactive
Remove non-basic (high reduced cost?) columns from Aactive
Until no entering column is found
Note: The “Price A” step does not need to price all columns except in the final iterations.
Advantages:
Small active set in memory. Can bring in columns off disk if required.
We stop our minor iterations and price A when the most negative reduced cost in Aactive is not "sufficiently good". What's good enough?
[Figure: LP cost vs. time]
SPRINT successfully used within IBM for a number of big problems. Ideas also appear in Lagrangian-based heuristics.
Efficient Pricing in the LP – Simple (Trivial?) Column Generation
Pricing is all about giving the basis a new entering column if one exists.
Example 1: The A-matrix consists of 2^m columns, being all possibilities of a 1 or a 0 in each position. Cost of each column is 1.
min (1 1 1 … 1) x
st Ax = b, duals π1, …, π6

where the columns of A run through every 0/1 vector on the m = 6 rows.
We could store all our columns in an A-matrix. Then our pricing algorithm could price all the columns in sequence using rc(xj) = cj − πᵀaj, and then return that column (if any) with the most negative reduced cost.
Or we could be smart:
We do not store any columns (except those in the basis).
Whenever we need to price columns, use the following algorithm:
    Place 1's in each row i with a positive πi (each such row reduces rc = 1 − Σ πi).
    Calculate the reduced cost.
    Return the constructed column if r.c. < 0.
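This implicit pricing step is a one-liner per row. A Python sketch (illustrative only; the all-zero column is excluded by construction):

```python
def best_cost1_column(pi):
    """Price all 2^m binary columns of cost 1 without storing any:
    the most negative reduced cost column has a 1 in exactly the rows
    with positive duals, since rc = 1 - sum(pi_i over chosen rows)."""
    col = [1 if p > 0 else 0 for p in pi]
    rc = 1 - sum(p for p in pi if p > 0)
    return (col, rc) if rc < 0 else (None, rc)   # None: no entering column
```

This prices an exponential family of columns in O(m) time, which is the whole point of the "smart" approach.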
Example 2: The A-matrix consists of all columns with 1, 2, 3, 4, or 5 1's in any rows. Cost of a column is the number of 1's it contains.
min cᵀx
st Ax = b, duals π1, …, π6

where cj is the number of 1's in column j (so the costs run 1, 1, …, 2, 2, …, 3, 3, …) and the columns of A run through every 0/1 vector with between 1 and 5 ones.
The Smart approach:
Do not store an A-matrix.
To find an entering column:
    For each row, calculate hi = 1 − πi.
    Find the rows (up to 5) with the most negative hi's.
    Place 1's in these rows.
    Return the column constructed if it has negative reduced cost.
Gilmore-Gomory (Delayed) Column Generation for Stock Cutting

The Better Food Company produces cream-filled sponge rolls with a standard width of 20 cm each. Each 20cm roll costs the company $2.00 to produce. Special customer orders with different widths are produced by cutting (slitting) the standard rolls of sponge into shorter lengths. Typical orders (which may vary from day to day) are summarized in the following table. These orders need to be met at least cost.
Order   Desired Width (cm)   Desired Number of Rolls
A       5                    150
B       7                    200
C       9                    300
An order is filled by setting the cutting knives to the desired widths. Usually, there are a number of ways in which a standard roll can be slit to fill a given order. The figure below shows three possible knife settings for the 20-cm roll. Although there are other feasible settings, we limit the discussion for the moment to considering settings 1, 2 and 3 in the figure. Note that the shaded area in each diagram represents lengths of sponge that are too short to be used in meeting orders, and so these pieces must be thrown away. Such wastage is called trim loss.
[Figure: three knife settings for the 20-cm roll – Setting 1: 7 + 9 (4 cm trim); Setting 2: 5 + 5 + 7 (3 cm trim); Setting 3: 5 + 5 + 9 (1 cm trim)]
The effect of all the different ‘sensible’ cutting patterns is summarised in the following table.
                     Pattern 1  Pattern 2  Pattern 3  Pattern 4  Pattern 5  Pattern 6
5 cm rolls produced      0          2          2          4          1          0
7 cm rolls produced      1          1          0          0          2          0
9 cm rolls produced      1          0          1          0          0          2
We note that each pattern uses no more than 20cm
Mathematical Representation
We seek to determine the knife setting combinations (variables) that will fill the required orders (constraints) while using the least number of rolls (objective).
To express the model mathematically, we define the variables as
xj = number of standard rolls to be slit according to pattern j, j = 1, 2, … , 6
Objective:We wish to minimise the number of the rolls we cut:
min x1 + x2 + x3 + x4 + x5 + x6
ConstraintsWe must ensure we cut at least the number of 5, 7 and 9 cm rolls ordered.
5-cm rolls: 2x2 + 2x3 + 4x4 + x5 ≥ 150
7-cm rolls: x1 + x2 + 2x5 ≥ 200
9-cm rolls: x1 + x3 + 2x6 ≥ 300
Logical constraints:
x1 , x2 , x3 , x4 , x5 , x6 ≥ 0, integer
Finding the Entering Column
The LP relaxation to the above IP can be written
           x1  x2  x3  x4  x5  x6        orders
min         1   1   1   1   1   1   x
s.t.  5cm       2   2   4   1       x  ≥ 150
      7cm   1   1           2          ≥ 200
      9cm   1       1           2      ≥ 300
Assume we are solving the LP relaxation, and we want to find the best entering variable (most negative reduced cost). Now, the reduced costs for all columns (including the basic ones) are:
r.c.(xj) = cj - πTaj
Now, any column of the A-matrix can be represented as a vector:

cj = 1,   aj = (y1, y2, y3)ᵀ   (5cm, 7cm, 9cm rows)

where y1, y2, & y3 are the integer number of 5cm, 7cm, and 9cm lengths cut from the 20cm roll.
When we generated the A-matrix, we considered all (sensible) combinations for which

5y1 + 7y2 + 9y3 ≤ 20

All combinations of y1, y2, & y3 that satisfy this constraint (and are 'sensible' in that they could not fit another roll) appear in the A matrix, and so represent possible entering columns.
The reduced cost of this general 'y' column is: 1 − y1π1 − y2π2 − y3π3
Therefore, the problem of generating our most negative reduced cost column can be formulated:

min 1 − y1π1 − y2π2 − y3π3,   or equivalently   max y1π1 + y2π2 + y3π3
s.t. 5y1 + 7y2 + 9y3 ≤ 20        (a Knapsack Problem)
y1, y2, y3 ≥ 0, integer
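This knapsack pricing problem can be solved by dynamic programming over the roll capacity. A Python sketch (illustrative only; a sketch of the pricing step, not the full Gilmore-Gomory loop):

```python
def knapsack_pricing(widths, pi, W):
    """Unbounded knapsack by DP over capacity: find the cutting pattern y
    maximising sum(pi_i * y_i) with sum(width_i * y_i) <= W, y integer.
    Returns (pattern y, reduced cost 1 - value)."""
    best = [0.0] * (W + 1)      # best[c] = max dual value within capacity c
    last = [-1] * (W + 1)       # index of the last piece added at capacity c
    for c in range(1, W + 1):
        best[c] = best[c - 1]
        for i, w in enumerate(widths):
            if w <= c and best[c - w] + pi[i] > best[c]:
                best[c] = best[c - w] + pi[i]
                last[c] = i
    y, c = [0] * len(widths), W          # recover the pattern from the trace
    while c > 0:
        if last[c] == -1:
            c -= 1
        else:
            y[last[c]] += 1
            c -= widths[last[c]]
    return y, 1.0 - best[W]
```

For example, with widths (5, 7, 9), duals (0.3, 0.4, 0.5) (hypothetical values) and W = 20, the best pattern is four 5cm rolls, which prices out with negative reduced cost.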
Notes:
This idea was first used by Gilmore and Gomory in the early 1960's.
How do we implement the column generator?
[Figure: the knapsack column generator formulated as a shortest path problem, 21 nodes]
Extreme Columns…
Will all the columns above actually be generated by the column generator? Consider a simplified example:
Columns are a mix of 5cm and 7cm pieces cut from 30cm
           x1  x2  x3  x4  x5        orders
min         1   1   1   1   1   x
s.t.  5cm   6   4   3   1   0   x  ≥ 9    π1
      7cm   0   1   2   3   4      ≥ 6    π2
Consider the column generator problem of finding the most negative reduced cost given π1, π2:

min 1 − y1π1 − y2π2   or   max y1π1 + y2π2
s.t. 5y1 + 7y2 ≤ 30        (Knapsack Problem)
y1, y2 ≥ 0, integer
What do the solutions to this problem look like? Note all duals are non-negative.
Case 1: π1 > (2/3)π2. Most negative reduced cost column is: (6, 0)ᵀ
Case 2: π1 < (2/3)π2. Most negative reduced cost column is: (0, 4)ᵀ
What is the best optimal LP solution using the generated columns?
x1 = 1.5, x5 = 1.5 (all other xj = 0), Cost = 3
What is the best integer solution?
x1 = 1, x3 = 1, x5 = 1, or x3 = 3; Cost = 3 in either case.
Note: Column x3 exists in the LP solution only as a linear combination of other columns: (3, 2)ᵀ = ½(6, 0)ᵀ + ½(0, 4)ᵀ.
Moral of the story:
The column generator gives extreme columns.
May need to generate new columns during integerisation.
Don't believe AMPL's/CPLEX's stock cutting example! (CPLEX does not generate in B&B, but only uses columns generated during the LP solve.)
[Figure: feasible integer points (y1, y2) of the knapsack 5y1 + 7y2 ≤ 30, showing the extreme solutions (6, 0) and (0, 4)]
Dantzig Wolfe Decomposition and Column Generation
Dantzig & Wolfe developed a technique that takes some LP or IP problem, and forms from it a new problem. This new problem can, if we wish, be solved using column generation.

Example: We have 2 staff available to cover today's 3 shifts, each 4 hours long. We require 1 or more staff on each shift. Each shift costs 1 unit to staff. Person A can work between 4 and 8 hours, person B between 8 and 12 hours.
Let yij = 1 if person i does shift j, and 0 otherwise, yij∈{0,1}
We can write this problem out as follows to emphasise the independence of the yA's and yB's:
min yA1 + yA2 + yA3 + yB1 + yB2 + yB3
st yA1 + yB1 ≥ 1
   yA2 + yB2 ≥ 1    ← Complicated constraints (involve both yA's and yB's)
   yA3 + yB3 ≥ 1
   yA1 + yA2 + yA3 ≥ 1      yB1 + yB2 + yB3 ≥ 2
   yA1 + yA2 + yA3 ≤ 2      yB1 + yB2 + yB3 ≤ 3    ← Easy constraints
   yA1, yA2, yA3 ∈ {0,1}    yB1, yB2, yB3 ∈ {0,1}
If we replace the 'easy' constraints 4 and 5, and 6 and 7, by the set of solutions that they (and the binary restrictions on the y's) allow, we can write this as:
min yA1 + yA2 + yA3 + yB1 + yB2 + yB3
st yA1 + yB1 ≥ 1
   yA2 + yB2 ≥ 1
   yA3 + yB3 ≥ 1
   (yA1, yA2, yA3) ∈ SA,  (yB1, yB2, yB3) ∈ SB
where SA is the set of all possible legal values for (yA1, yA2, yA3):

SA = { (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1) }

and SB is the set of all possible legal values for (yB1, yB2, yB3):

SB = { (1,1,0), (1,0,1), (0,1,1), (1,1,1) }
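The sets SA and SB are small enough to enumerate directly. A Python sketch (illustrative only, names invented):

```python
from itertools import product

def feasible_set(lo, hi, n=3):
    """Enumerate S = {y in {0,1}^n : lo <= sum(y) <= hi} -- the 'easy'
    constraint sets of the Dantzig-Wolfe example."""
    return [y for y in product((0, 1), repeat=n) if lo <= sum(y) <= hi]

S_A = feasible_set(1, 2)   # person A works 1 or 2 shifts (4-8 hours)
S_B = feasible_set(2, 3)   # person B works 2 or 3 shifts (8-12 hours)
```

Enumeration confirms |SA| = 6 and |SB| = 4, matching the master variables xA1…xA6 and xB1…xB4 below.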
We can now represent each set as an integer convex combination of its members:

(yA1, yA2, yA3)ᵀ = xA1(1,0,0)ᵀ + xA2(0,1,0)ᵀ + xA3(0,0,1)ᵀ + xA4(1,1,0)ᵀ + xA5(1,0,1)ᵀ + xA6(0,1,1)ᵀ,
with Σ_{j=1..6} xAj = 1, xAj binary

(yB1, yB2, yB3)ᵀ = xB1(1,1,0)ᵀ + xB2(1,0,1)ᵀ + xB3(0,1,1)ᵀ + xB4(1,1,1)ᵀ,
with Σ_{j=1..4} xBj = 1, xBj binary
Notice the use of convexity constraints, also termed GUB (generalised upper bound) constraints.
The above give us expressions for yA1 etc in terms of the x's, so we can now substitute back into the original formulation,
min yA1 + yA2 + yA3+ yB1 + yB2 + yB3
st yA1 + yB1 ≥ 1
   yA2 + yB2 ≥ 1
   yA3 + yB3 ≥ 1,
to form a new problem in which the decision variables are the x's. This new problem has to include the extra convexity constraints and the binary restrictions on the x's that are used to define SA and SB. This gives us our new formulation:
        xA1 xA2 xA3 xA4 xA5 xA6 xB1 xB2 xB3 xB4
min       1   1   1   2   2   2   2   2   2   3   x
st        1           1   1       1   1       1       ≥ 1
              1       1       1   1       1   1       ≥ 1
                  1       1   1       1   1   1   x   ≥ 1
          1   1   1   1   1   1                       = 1
                                  1   1   1   1       = 1
x binary
This new problem is called the IP Master Problem.
Its LP relaxation is called the LP Master Problem.
Note that in the relaxed (LP) master problem, the x's can be fractional, and so the requirements of yA and yB belonging to SA and SB respectively are relaxed instead to yA and yB being in their convex hulls (polyhedra), conv(SA) and conv(SB), respectively. The key ideas in Dantzig-Wolfe are:
(1) the convex hulls of SA and SB can be defined by their extreme points;
(2) in general, SA and SB could have millions of members, and indeed millions of extreme points, so we can't add all of these to the master;
(3) instead, we generate new extreme points (columns) and add these columns to the master whenever these new columns will improve the objective (i.e. have negative reduced cost).
Note: When the master has only a subset of the columns, we say it is restricted.
We have decomposed the original problem into a master and 2 column-generation subproblems, 1 for each person.
The Column Generation SubProblems:
In this example, we know that Person A's columns in the master have the form (yA1, yA2, yA3, 1, 0)ᵀ (the three coverage rows, then a 1 in A's convexity row and a 0 in B's), where the points (yA1, yA2, yA3) ∈ SA, i.e. are the solutions to

1 ≤ yA1 + yA2 + yA3 ≤ 2,  yA1, yA2, yA3 binary.

Each column defined by (yA1, yA2, yA3) has a cost given by yA1 + yA2 + yA3.
As part of our pricing, we want to find the column (yA1, yA2, yA3) in SA that has the most negative reduced cost (i.e. is the best possible 'Person A' entering column). Now, given a vector of duals (π1, π2, π3, πA, πB), any column defined by (yA1, yA2, yA3) has reduced cost:

rc = yA1 + yA2 + yA3 − yA1π1 − yA2π2 − yA3π3 − πA

Thus the problem of finding the most negative reduced cost column for person A, i.e. the 'Person A' column generation problem for the LP Master, is

min rc = yA1 + yA2 + yA3 − yA1π1 − yA2π2 − yA3π3 − πA
s.t. 1 ≤ yA1 + yA2 + yA3 ≤ 2,  yA1, yA2, yA3 binary
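Since SA has only six members, this subproblem can be solved by enumeration. A Python sketch (illustrative only; a real generator would exploit structure rather than enumerate):

```python
from itertools import product

def person_A_pricing(pi, pi_A):
    """Solve the 'Person A' subproblem by enumeration:
    min rc = sum(y) - sum(pi_i*y_i) - pi_A over y in S_A,
    where S_A = {y binary : 1 <= y1+y2+y3 <= 2}."""
    candidates = [y for y in product((0, 1), repeat=3) if 1 <= sum(y) <= 2]

    def rc(y):
        return sum(y) - sum(p * v for p, v in zip(pi, y)) - pi_A

    y_best = min(candidates, key=rc)
    return y_best, rc(y_best)
```

With hypothetical duals π = (2.0, 0.5, 1.5) and πA = 0.25, shifts 1 and 3 each have 1 − πi < 0, so the best column works exactly those two shifts.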
Note 1:
If the members of SA (or SB) are all (0,1) vectors (i.e. the y's are binary), then SA is exactly the set of extreme points of the convex hull conv(SA) of SA. However, this is not true if the y's are general integer. Column generators tend to produce extreme points, and so, in the latter case, some feasible integer columns may never be generated for the LP Restricted Master.
Eg… the stock cutting problem seen before.
Note 2:
In this case, SA and SB could be described by linear constraints; the column generators were easy problems. However, this need not be the case. Indeed, the beauty of column generation is that we can embed very complicated rules in the column generators. The column generator handles the complexity, not the IP.
Note 3:
Given any (possibly fractional) x's for the LP, we can calculate the original variables, i.e. the y's. Eg, for our example:

yA1 = xA1 + xA4 + xA5
yA2 = xA2 + xA4 + xA6
yA3 = xA3 + xA5 + xA6
Note 4:
The Relaxed LP Master is often stronger than the original formulation, as some fractional solutions of the original may not be solutions to the Dantzig-Wolfe reformulation; i.e. the Dantzig-Wolfe reformulation has a tighter (for minimisation, higher) LP objective than the original formulation, and thus is easier to integerise. However, if the column generation problems are not NP-hard (eg their LP forms have naturally integer solutions), then the Dantzig-Wolfe reformulation and the original problem have… the same LP objective.
So, for our example above, the subproblem is naturally integer, and so the Dantzig-Wolfe reformulation is not stronger.
Branch and Bound with Column Generation
If our original problem was an IP, then so is our new Dantzig-Wolfe master problem. Solving integer programs using column generation almost always requires that we generate during branch and bound. If we generate columns during the branch and bound process, we call it "Branch and Price" (Barnhart et al, 1998), or "IP Column Generation" (Wolsey 1998).
Branching Possibilities
If column generating, branches must be respected by the column generator. That is, columns must satisfy the branches imposed. We don't want this to complicate the generator too much.
Variable Branching:
Force a variable xj up or down to ⌈xj^i⌉ or ⌊xj^i⌋ respectively (where xj^i is the value of xj at node i), ie adding xj ≥ ⌈xj^i⌉ or xj ≤ ⌊xj^i⌋.
Why not use variable branching? Problems occur if variables are forced down:
• Applying an upper bound on a variable means it cannot be generated again in the column generator. How do we stop it reappearing? Need the k'th shortest path – hard!
• With binary variables, a zero branch (xj = 0) says little about the solution; many feasible solutions remain. (The 1-branch xj = 1 is much more powerful.)
Constraint Branching: Binary Variables, Binary A-matrix, GUB Constraints
• Developed by David Ryan and Brian Foster in 1981
• Column Generation Friendly
Fractional LP solution (duals π1, π2, π3 on the coverage rows, πA, πB on the convexity rows):

xA1 = 1/2, xA5 = 1/4, xA6 = 1/4, xB1 = 1/3, xB4 = 2/3

giving yA1 = 3/4, yA2 = 1/4, yA3 = 1/2 and yB1 = 1, yB2 = 1, yB3 = 2/3.
Constraint branch on constraint pair 3, 4 (ie on yA3: does person A work shift 3?):
1-Branch: ban the A-columns without shift 3, ie xA1, xA2, xA4
0-Branch: ban the A-columns containing shift 3, ie xA3, xA5, xA6
We branch to force (1-branch) or ban (0-branch) tasks for a specific person. Branch choice is based on the 'original y variables' in the Dantzig-Wolfe view.

Two branches possible above. Pick one of these. Each side of the branch is enforced by banning columns.

These branches are easy to enforce in a column generator. Eg, to force person A to undertake task 3, we tell the Person A column generator that π3 = ∞.
Column Generator Structures
Column generators are typically:
• Shortest Path (Dynamic Programming)
• Nested shortest path
• Shortest Path with Resource Constraints
• TSP solutions (vehicle routing)
• TSP solutions with resource constraints (eg vehicle routing)
• General IP's
• Enumerators
• Randomised enumerators
Note: Shortest Path/Dynamic Programming is only useful if there is significant merging of states. Otherwise, it is just inefficient enumeration.
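State merging is what makes the DP pay off: many partial paths reach the same (day, hours-worked) state and each state is solved once. A generic Python sketch of this idea (illustrative only; a toy DAG, not the lecture's roster network):

```python
from functools import lru_cache

def min_cost_path(arcs, start, end):
    """Cheapest start->end path in a DAG by recursive DP.
    arcs: {node: [(next_node, cost), ...]}.  Each state (node) is
    evaluated once and cached -- the 'merging of states' that makes
    DP better than plain enumeration of all paths."""
    @lru_cache(maxsize=None)
    def f(node):
        if node == end:
            return 0.0
        return min((cost + f(nxt) for nxt, cost in arcs.get(node, [])),
                   default=float('inf'))
    return f(start)

# Hypothetical toy network: two routes merge at state 'b'
arcs = {'s': [('a', 1), ('b', 4)], 'a': [('b', 1), ('t', 5)], 'b': [('t', 1)]}
```

Here `min_cost_path(arcs, 's', 't')` evaluates state 'b' only once even though two partial paths reach it.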
Example: Simplified rostering. We must determine which days are worked, and which days are off, for each staff member. Staff like to work 5 days on, then 2 days off (a '5/2'), but can also work 3/1, 4/2, 6/2, or 6/3. Each day worked is paid 8 hours. Staff need to work 80 hours over the fortnight roster period.
[Figure: the column generator as a network. Nodes are (day of roster, hours worked) states, with days Monday…Sunday over the two weeks and hours 0h, 8h, 16h, …, 80h, connected from a start node to an end node; alongside, the master min cᵀx st Ax = b, x binary.]
L.P. Dantzig-Wolfe Decomposition (Formally)

Consider the L.P.

min z = cᵀx
st   Sx = b_s
     Tx = b_T
     x ≥ 0

We assume Sx = b_s involves m_s constraints and Tx = b_T involves m_T constraints, i.e. S is m_s × n, T is m_T × n.

Let R_T = {x : Tx = b_T, x ≥ 0}. Then we can change our LP into:

min z = cᵀx  st  Sx = b_s,  x ∈ R_T

R_T is the set of feasible solutions to a subset of the constraints, Tx = b_T, x ≥ 0. Now, R_T is polyhedral (defined by hyperplanes), and so any point in a bounded polyhedral set can be written as a convex combination of its extreme points, i.e. if x ∈ R_T then (assuming R_T bounded)

x = Σ_j λ_j x^j,  eᵀλ = 1,  λ ≥ 0

ie x = Xλ, eᵀλ = 1, λ ≥ 0,

where X = (x^1 x^2 … x^t) is an n × t matrix (t is the number of extreme points), with x^j being the j'th extreme point of R_T. (Don't worry yet about how to find x^j or how many there are.)

We can substitute x = Xλ, eᵀλ = 1, λ ≥ 0 into the original LP, i.e. our LP becomes:

min z = cᵀx = cᵀXλ = fᵀλ
st   Sx = SXλ = Pλ = b_s
     eᵀλ = 1
     λ ≥ 0

where P = SX, i.e. p^j = Sx^j, and fᵀ = cᵀX, i.e. f_j = cᵀx^j. That is, the new LP has columns P = SX, and costs fᵀ = cᵀX.
NB: 1. This LP is called the Master LP (MLP). It has m_s + 1 explicit constraints and as many variables as extreme points of R_T (often large).

2. The new columns are P = [ p_1 = Sx_1 … p_t = Sx_t ].

3. The explicit constraints in the new problem are

    Pλ = b_s
    eTλ = 1

4. If m_T is large ⇒
   a. m_s + m_T constraints in the original LP (large), but just m_s + 1 in the Master LP.
   b. n vars in the original L.P., but t vars in the new Master L.P. (t often very large).
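The mapping from extreme points to master columns (p_j = Sx_j with cost f_j = cTx_j) can be sketched in a few lines of Python. This is only a toy illustration: the matrix S, the costs c, and the extreme points below are invented numbers, not from any particular problem.

```python
# Toy sketch: build master-LP columns p_j = S x_j and costs f_j = c^T x_j
# from a list of extreme points of R_T. All numbers are illustrative.

def build_master_columns(S, c, extreme_points):
    """Return (P, f): master columns p_j = S x_j and costs f_j = c^T x_j."""
    P, f = [], []
    for x in extreme_points:
        # p_j = S x_j (one entry per shared constraint)
        p = [sum(S_row[k] * x[k] for k in range(len(x))) for S_row in S]
        P.append(p)
        # f_j = c^T x_j
        f.append(sum(c[k] * x[k] for k in range(len(x))))
    return P, f

S = [[1, 2, 0],
     [0, 1, 1]]          # m_s = 2 shared constraints, n = 3 variables (invented)
c = [3, 1, 2]
extreme_points = [[1, 0, 0], [0, 2, 0], [0, 0, 4]]  # hypothetical x_j in R_T

P, f = build_master_columns(S, c, extreme_points)
print(P)  # [[1, 0], [4, 2], [0, 4]]
print(f)  # [3, 2, 8]
```

Each master column would additionally carry a 1 in the convexity row, as in note 2 above.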
Pricing using a Column Generator
Each extreme point x_j ∈ R_T leads to a column in the master of the form

    ( Sx_j )
    (  1   )

with cost cTx_j. We want to find an entering variable for the master given some duals πT, i.e. we want some x_j ∈ R_T with reduced cost rc(x_j) < 0.

If we write πT = (π1T, π0), where π1 is the vector of duals on the Sx = b_s constraints and π0 is the scalar dual on the convexity constraint, we see that the reduced cost rc(x_j) is given by

    rc(x_j) = (cT − π1TS)x_j − π0

Here x_j is the variable we are choosing; π0 is a constant, and can be ignored for choosing x_j.

The usual Simplex criterion for the entering variable is to minimise reduced cost, and we therefore define a linear column generation sub-problem:

Given some master duals πT = (π1T, π0), find the extreme point x ∈ R_T, R_T = { x : Tx = b_T, x ≥ 0 }, that has the most negative reduced cost (cT − π1TS)x − π0, i.e.

    min (cT − π1TS)x
    st   Tx = b_T
         x ≥ 0

Q: Do we know we will get an extreme point of R_T? Why?
Yes; the LP only finds extreme points.

Q: Do we need to solve this problem to optimality? Why?
Often no; we can stop if we get a −ve reduced cost column.
(c) A. Mason www.esc.auckland.ac.nz/Mason/ 18+
This material is not to be distrubted; contact the author for the latest version and permission to distribute.
The solution x_s of this LP subproblem defines the new variable λ_s, which enters the master program if the reduced cost is negative, i.e. if (cT − π1TS)x_s − π0 < 0. The column that enters the Master LP basis is then

    p_s = ( Sx_s )
          (  1   )

with objective coefficient f_s = cTx_s.

Having found a new variable λ_s to enter the Master LP basis, we can determine a leaving variable using the usual LP criterion (applied to the MLP). The Master LP basis is then updated, thus leading to a new π-vector, etc. (Each π-vector defines a new subproblem, and each subproblem generates a column for the master, i.e. an extreme point of R_T.)
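The pricing test rc(x_j) = (cT − π1TS)x_j − π0 for a candidate extreme point can be sketched as below. All the numbers (S, c, the duals, the candidate) are invented for illustration.

```python
# Sketch of the master pricing test for a candidate extreme point x_j.
# S, c, the duals pi1, pi0, and the candidate are illustrative values only.

def reduced_cost(c, S, pi1, pi0, x):
    """rc(x) = (c^T - pi1^T S) x - pi0 for a candidate extreme point x."""
    n = len(x)
    # modified cost vector c^T - pi1^T S, one entry per original variable
    cbar = [c[k] - sum(pi1[i] * S[i][k] for i in range(len(S)))
            for k in range(n)]
    return sum(cbar[k] * x[k] for k in range(n)) - pi0

S = [[1, 2, 0],
     [0, 1, 1]]
c = [3, 1, 2]
pi1, pi0 = [1.0, 0.5], 0.25      # current master duals (invented)

x_j = [0, 2, 0]                  # candidate extreme point of R_T
rc = reduced_cost(c, S, pi1, pi0, x_j)
print(rc)   # cbar = [2, -1.5, 1.5], so rc = -1.5*2 - 0.25 = -3.25
```

A negative value means λ_s would enter the master; a real column generator would of course obtain x_j by solving the subproblem LP rather than testing a fixed candidate.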
Special Case - Multiple Sub-Problems
Consider some L.P.

    min z = cTx
    st   Sx = b_s
         Tx = b_T
         x ≥ 0

Consider the case where T is block diagonal:

    T = [ T_1             ]
        [      T_2        ]
        [          ...    ]
        [             T_p ]

Write cT = [c_1T, c_2T, …, c_pT] and S = [S_1 S_2 … S_p] to correspond with the structure of T. (Same with x and b_T.)

The LP is then

    min z = c_1Tx_1 + … + c_pTx_p
    st   S_1x_1 + … + S_px_p = b_s
         T_1x_1 = b_1
             ...
         T_px_p = b_p
         x_1, x_2, …, x_p ≥ 0
Master L.P.
The master looks like it did before:
    min z = fTλ
    st   Pλ = b_s
         eTλ = 1
         λ ≥ 0

where f_j = cTx_j and p_j = Sx_j.
Subproblem
    min (cT − π1TS)x
    st   Tx = b_T
         x ≥ 0
Using the above matrix partitions, we find the subproblem becomes

    min Σ_i (c_iT − π1TS_i)x_i
    st   T_ix_i = b_i   ∀i
         x_i ≥ 0        ∀i.
This sub-problem can be treated as p separate problems, since the x_i's don't affect each other in the constraints. We therefore solve p subproblems

    min (c_iT − π1TS_i)x_i
    st   T_ix_i = b_i
         x_i ≥ 0

with optimal solution x_s^i.
Then

    x_s = ( x_s^1 )
          ( x_s^2 )
          (  ...  )
          ( x_s^p )

so the new column is

    ( p_s )
    (  1  )

with cost f_s, where p_s = Sx_s and f_s = cTx_s.
NB: Each subproblem is usually relatively small.
Alternative Approach for the above problem
In the above example, the column generator returned one large column that contained solutions for each sub-problem given by T_ix_i = b_i, x_i ≥ 0. An alternative way of solving this problem is to add a convexity constraint to the master for each sub-problem T_ix_i = b_i, x_i ≥ 0, i = 1, 2, …, p. This gives the rostering-type formulations we saw before, with p column generators, one solving each sub-problem defined over T_ix_i = b_i, x_i ≥ 0.
Interpretation of Decomposition:

The objective function of each subproblem has the form (c_iT − π1TS_i)x_i. The i'th subsystem proposes a solution, say x_s^i, to the master program. The master finds that this involves S_i x_s^i units of the shared resources, which are therefore not available to other subsystems. The direct cost of x_s^i is given by c_iT x_s^i, but the i'th subsystem, in proposing x_s^i, must pay for its use of shared resources. The MLP (with knowledge of the other subsystems' demands) charges the price π1 on the shared resources. Note that there is one element of π1 for each shared resource (i.e. constraint) of the MLP.

Then dz_MLP = π1T db_MLP, and if the i'th subproblem uses the k'th resource then db_k < 0 (i.e. less is available for other subproblems, i.e. it is in demand), and if the k'th resource is valuable then dz > 0 (if we seek to minimise); therefore π1_k < 0.

The indirect cost incurred by the i'th subsystem in choosing x_s^i is represented by −π1T S_i x_s^i, where S_i x_s^i is the amount (≥ 0) of each resource used. Therefore the indirect cost on the i'th subproblem is > 0 (i.e. a charge against the i'th subproblem).
The MLP then repeatedly asks the subproblems to propose solutions, and each time adjusts the prices of the shared resources to:
a. discourage the use of shared resources in great demand, i.e. sets π1_k << 0;
b. encourage the use of underutilised resources, i.e. sets π1_k = 0 and charges each subproblem nothing to use them.
The solution of the LP is provided by the MLP, which computes weights (i.e. λ) for each of the proposals provided by the subproblems. As proposals are considered by the MLP and π1 is adjusted, early proposals will no longer be attractive and will be removed from consideration by setting λ = 0 (i.e. nonbasic). When no profitable new proposal is made, the solution is given as

    x = Xλ = Σ_j λ_j x_j.
Solving the LP: Different Pricing Calculations for the Entering Variable

Steepest Edge Pricing
eg: John J. Forrest and Donald Goldfarb, Steepest-edge simplex algorithms for linear programming, Mathematical Programming 57 (1992), pp. 341-374
Minimisation Example:
[Spreadsheet: LP with costs c = (1, 1, 0, 0, 0, 0); the current basis is {x1, x2, x3, x4} with x = (4, 4.6, 19.8, 36.8, 0, 0), total cost 8.60, and reduced costs rc = (0, 0, 0, 0, −0.1, −0.3). The A matrix, b, duals π, and basis inverse are shown, and a plot of x1 vs x2 marks the current vertex of the feasible region.]
Which is the most negative reduced cost (termed Dantzig reduced cost) entering variable? x6
If we increase this variable by 1, we move to the following (non-basic) solution.
[Spreadsheet: after increasing x6 to 1, the (non-basic) point is x = (3.67, 4.67, 24.3, 32.3, 0, 1) with total cost 8.33. The x1-x2 plot shows a large move away from the current vertex.]
Why is this a bad choice of edge to be travelling along?
What happens if we try the other direction?
[Spreadsheet: after increasing x5 to 1 instead, x = (4, 4.5, 19, 38, 1, 0) with total cost 8.50. The x1-x2 plot shows only a small move.]
Comment: The step taken when increasing x5 is smaller than the step taken when increasing x6.
This makes x5's reduced cost smaller in magnitude.
We can scale the problem to make the step sizes similar.
[Spreadsheet: the same LP with column 5 scaled by 4, so rc = (0, 0, 0, 0, −0.4, −0.3); increasing x5 to 1 now gives x = (4, 4.2, 16.6, 41.6, 1, 0) with total cost 8.20.]
Comment: x5 now has the better reduced cost
Steepest Edge Pricing: Calculate rc(xj) / normj.

The scale factor 'normj' normalises the reduced costs to avoid the above effect.

Must calculate initial scale factors... eg CPlex's "Steepest Edge with Slack Initial Norms".

Can make a huge improvement (particularly for degenerate problems).

Scale factors have to be updated at each pivot for all non-basic variables... possibly slow (eg increases time per iteration by 8%).

But what about the Sprint approach...? We have to get norms for variables added to the active set.
Steepest Edge Pricing - formally
In the Dantzig rule we compute the variable with the smallest reduced cost cj − πTaj. If xj is measured in different units, this has the effect of scaling cj and aj by some value γ, say. Then rc(xj) = γ(cj − πTaj). We would prefer that the choice of entering variable be independent of the scaling of the variables. Steepest edge pricing is one way of addressing this issue.
In a single RSM iteration we have

    x̂ = x + α ys,

where x = (xB, 0) is a basic feasible solution, and

    ys = ( −B⁻¹as, 0, …, 0, 1, 0, …, 0 )T,   with the 1 in position s,

gives the change in all variables (basic and non-basic) when non-basic variable xs, s > m, is increasing in value. (Note that as is the column of the A matrix corresponding to xs, and so −B⁻¹as is the change in the basic variables xB as xs increases.)
For xs to be the entering variable, the direction ys must be downhill with respect to c, i.e., we have the common entering variable condition for a negative reduced cost rc(xs):

    rc(xs) = cTys = cs − cBTB⁻¹as = cs − πTas < 0.
Steepest edge pricing involves choosing the direction that is most downhill with respect to c, i.e., at the greatest angle to c.

[Figure: the candidate direction ys, the objective vector c, the angle θ between them, and the ideal direction.]
The angle θ can be found from cTys = ||c|| ||ys|| cos θ, i.e.

    cos θ = cTys / ( ||c|| ||ys|| ).

For θ close to 180°, we want cos θ as small as possible, i.e., choose the entering (hence non-basic) variable index s so that

    cTys / ||ys|| = min_j cTyj / ||yj|| ,   i.e.   s = argmin_j rc(xj) / ||yj|| ,

since ||c|| is a constant. (We see that we are simply scaling each reduced cost rc(xj) by its associated step size ||yj||, and then choosing the best of these scaled values.) However, we need to compute

    yj = ( −B⁻¹aj, 0, …, 0, 1, 0, …, 0 )T

for each candidate entering variable, and this requires calculating B⁻¹aj for each possible entering variable xj; this will be slow. However, recurrences have been developed to keep track of these vectors from iteration to iteration so they do not have to be calculated from scratch.
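A toy comparison of the Dantzig and steepest-edge choices can be sketched as follows. The basis inverse, columns, and reduced costs below are invented numbers chosen so that the two rules disagree.

```python
import math

# Sketch: pick the entering variable by rc(x_j) / ||y_j||, where
# y_j = (-B^{-1} a_j, 0,...,1,...,0), so ||y_j||^2 = ||B^{-1} a_j||^2 + 1.
# Binv, the columns a_j, and the reduced costs are illustrative only.

def steepest_edge_choice(Binv, cols, rcs):
    best_j, best_val = None, 0.0
    for j, (a, rc) in enumerate(zip(cols, rcs)):
        if rc >= 0:
            continue  # only downhill directions are candidates
        w = [sum(Binv[i][k] * a[k] for k in range(len(a)))
             for i in range(len(Binv))]                    # B^{-1} a_j
        norm = math.sqrt(sum(v * v for v in w) + 1.0)      # ||y_j||
        if rc / norm < best_val:
            best_j, best_val = j, rc / norm
    return best_j

Binv = [[1.0, 0.0], [0.0, 1.0]]
cols = [[10.0, 0.0], [1.0, 1.0]]   # two non-basic columns
rcs  = [-0.3, -0.2]                # the Dantzig rule would pick index 0

j = steepest_edge_choice(Binv, cols, rcs)
print(j)  # index 1: -0.2/sqrt(3) beats -0.3/sqrt(101)
```

The long column (large ||y_j||) is penalised, so the geometrically steeper edge wins even though its plain reduced cost is smaller in magnitude.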
PRICING STRATEGIES:

Up to now, we have discussed a single method for determining the entering variable, i.e. entering var = the one with minimum reduced cost, i.e. s = argmin{ cj − πTaj : j ∈ N }, and xs is the entering variable. This is known as the Dantzig rule. But there are many pricing strategies:

(a) Full pricing (see above)

(b) Multiple pricing: find the best p r.c.'s. Then price only on this subset (p = 5 or 10) until the r.c.'s are not sufficiently negative.

(c) Partial pricing: Like multiple pricing, but one just chooses any p columns (not necessarily ones that have negative r.c.).

(d) Steepest edge (see later)

(e) Lambda pricing: Similar to steepest edge pricing in that it attempts to avoid problems with the scaling of variables.

(f) Column generation: (see later).
Lambda Pricing
Try to take into account the objective function cj:

    λj = cj / ( cj − rc(xj) ) = cj / ( πTaj )

Example:

    cj      πTaj    rc(xj) = cj − πTaj    λj
    1000    1010    −10                   0.990
    10      12      −2                    0.833
    1       2       −1                    0.5

Of those columns with negative reduced cost, choose that with the smallest λj.
Not used much in practice (eg not in CPlex)
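The lambda-pricing calculation on the three columns from the table above can be sketched directly (the (cj, πTaj) pairs are the table's values):

```python
# Sketch of lambda pricing using the three columns from the example table.
def lambda_price(cands):
    """cands: list of (c_j, pi_a_j) pairs.
    Among columns with negative reduced cost c_j - pi^T a_j,
    choose the one with the smallest lambda_j = c_j / (pi^T a_j)."""
    neg = [(cj / pa, cj, pa) for cj, pa in cands if cj - pa < 0]
    return min(neg)   # tuple comparison: smallest lambda first

cands = [(1000, 1010), (10, 12), (1, 2)]
lam, cj, pa = lambda_price(cands)
print(round(lam, 3), cj)  # 0.5 1 -> the c_j = 1 column is chosen
```

Note how the Dantzig rule would have picked the cj = 1000 column (rc = −10), whereas lambda pricing prefers the column whose dual value dominates its cost.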
Integerisation
In general our LP solutions will be fractional, and so we have to integerise. But, for some choices of A, we get naturally integer solutions.

Totally Unimodular Matrices
See Hoffman + Kruskal, 1956, and attachment. These naturally give integer x for any rhs b.

Totally unimodular (0,1) matrices arise if there is a unique subsequence... there is an ordering of the rows in which all columns with a one in row i have their next one (if any) in row j.

If the ones are ordered activities, this means that all columns doing activity i do the same next activity, activity j. Limited sub-sequence can lead to solutions that are close to integer.
Balanced 0/1 Matrices with Unit Right Hand Side
Berge (1972), Fulkerson, Hoffman & Oppenheim (1974)

For 0/1 right hand sides and ≥, ≤ or = constraints (i.e. set covering, partitioning, and packing), fractions can only occur if there exist odd-order 2-cycles, i.e. p×p sub-matrices with p odd having row and column sums of 2 (Berge (1972)).
Some sample fractional structures demonstrating the odd-order 2-cycles:

    A: 1* 0  1*   = 1
       1* 1* 0    = 1     "2 from 3" (each of the rows (& columns) has 2 1's)
       0  1* 1*   = 1
    x: ½  ½  ½

    A: 1* 0  1* 1   = 1
       1* 1* 0  1   = 1   "3 from 4"
       1  1  1  0   = 1
       0  1* 1* 1   = 1
    x: 1/3 1/3 1/3 1/3

    A: 1* 0  1*  0  0  0  0  0
       1* 1* 0   0  0  0  0  0
       0  1* 1*  0  0  0  0  0
       0  1  0   1  0  0  0  0
       0  0  0   1# 0  0  0  1#    a "2 from 3" cycle (columns 1-3, *)
       0  0  0   1# 1# 0  0  0     linked to a 5-cycle (columns 4-8, #)
       0  0  0   0  1# 1# 0  0
       0  0  0   0  0  1# 1# 0
       0  0  0   0  0  0  1# 1#
    x: ½  ½  ½   ½  ½  ½  ½  ½
Odd-order 2-cycles are necessary, but not sufficient, for there to be fractions. (See next.) Constraint branching (see next) removes columns, breaking cycles.
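As a tiny sanity check, the "2 from 3" structure above really does admit the fractional point x = (½, ½, ½):

```python
# Verify that the "2 from 3" odd-order 2-cycle admits the
# fractional solution x = (1/2, 1/2, 1/2) to Ax = e.
A = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
x = [0.5, 0.5, 0.5]

lhs = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
print(lhs)                               # [1.0, 1.0, 1.0] -- feasible, fractional
print(all(row.count(1) == 2 for row in A))  # every row has exactly two 1's: True
```

The same check works for the "3 from 4" matrix with x = 1/3 throughout.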
Perfect Matrices with Unit Right-hand Side (Padberg 1974)

A (0,1) matrix is perfect if all extreme points of {x : Ax ≤ e, x ≥ 0} are integer. Note: Adding slacks turns this set packing problem into set partitioning (but not set covering).

In perfect matrices, odd-order 2-cycles can occur, but are neutralised (stopped from producing fractions) by other constraints.
Eg:

    A: 1  1  1   ≤ 1   <- eg '≤' GUB constraint
       1* 0  1*  ≤ 1
       1* 1* 0   ≤ 1   "2 from 3" (each of the rows (& columns) has 2 1's)
       0  1* 1*  ≤ 1

The solution ½ ½ ½ (which we had before) is now not feasible because of the GUB constraint.
A non-singular p×p sub-matrix with row and column sums equal to β may have a fractional solution with each variable being 1/β. If β ≥ 2, these variables are fractional. However, if some other 'integerising' ≤1 constraint on these variables includes more than β of these variables (has more than β 1's), this 1/β solution becomes infeasible because it violates this constraint.

Perfect matrices guarantee integer solutions because they do not contain any sub-matrices with row and column sums of β unless these sub-matrices have associated integerising constraints.

Note: GUB constraints are "integerising" for all the columns they contain. So, if a solution is fractional, it must fractionate across the GUB constraints.
    A: 1  1  1  0  0  0   = 1   GUB 1
       0  0  0  1  1  1   = 1   GUB 2
       0  1* 0  0  1* 1   = 1
       0  1* 1* 0  0  1   = 1
       0  1  1  0  1  0   = 1
       0  0  1* 0  1* 1   = 1
    x: 1/3 1/3 1/3 1/3 1/3 1/3
NB: Adding cuts adds ‘perfect matrix’ structure to matrices.
Ideal Matrices
P. Nobili, A. Sassano, (0, ±1) Ideal Matrices, Mathematical Programming 80 (1998) 265-281
• Give integer solutions to set covering (Ax ≥ e), but not well characterised (yet!)
Branch and Bound
Basically "Enumeration with Upper and Lower Bounds". Also called "divide and conquer".
• Upper bounds from heuristics and naturally integer solutions to LP relaxations
• Lower bounds from LP relaxation

If we generate columns during the branch and bound process, we call it "Branch and Price" (Barnhart et al, 1998), or "IP Column Generation" (Wolsey 1998).

See [LS97] Linderoth, J.T., Savelsbergh, M.W.P., A computational study of search strategies for mixed integer programming, Georgia Inst. of Technology, 1997, for a good analysis of branch and bound strategies; much of this summary comes from here.
Branch and Bound Issues
• Which node to explore next?
• What branch to make next?
• How much processing to do at each step?
• Do we want to prove optimality?
• Big or small LP - IP duality gap?
• Will we be back-tracking?
• Is feasibility hard? Is 'good quality' hard?
Branching Decisions
• "What about our solution are we going to decide next?"
  • How will our child nodes differ?
  • Which variable, constraint, SOS, or other branch do we choose?
• How balanced is the branch (number of solutions on each side of the branch)?
  • Unbalanced is ok if we don't intend to explore the other side
  • Variable branching on binary variables is very unbalanced
• Make the important decisions first
  • If a costly decision has to be made, make that branch earlier, not later
• How much do we change the solution?
  • Branching 0.9 to 1 ('gentle branch'); small objective increase
  • Branching 0.5 to 0 or 1; both sides may increase the objective
• Are we trying to find a good solution, or prove that some solution is optimal?
  • Finding a good solution - choose the gentle branches
  • Proving optimality - choose the "0.5" branches
• How much do we believe the LP vs our external knowledge?
Predicting LP-Objective after Branching
We can often estimate the impact of a branch on the LP objective function.

Assume zi is the objective at the current node, node i. Assume some xji is fractional at node i's optimal solution. Let zi(xji↑) and zi(xji↓) be the new objective when xji is increased to (at least) ⌈xji⌉ or decreased to (no more than) ⌊xji⌋ respectively, and the problem is resolved.
Three methods to estimate zi(xji↑). (Estimating zi(xji↓) is analogous.)

(1) Find some variable xhi that, when it enters and increases to value xhi_new, increases xji up to ⌈xji⌉. Then zi(xji↑) ≈ zi + rc(xhi)(xhi_new − xhi). (See Nemhauser and Wolsey 1998, p364.)

(2) Strong Branching (see below): test the branch by branching the variable, and performing a limited number of iterations on the new problem. (One dual pivot works well.) Provides a lower bound on zi(xji↑).

(3) PseudoCosts [LS97]: Let pj↑(k) = [zk(xjk↑) − zk] / [⌈xjk⌉ − xjk] be the actual rate of change in the objective function that occurred when xjk was increased from xjk to ⌈xjk⌉ at some node k. Let pj↑ be the average of these over all nodes where xj was branched up; pj↑ is called xj's '(up-)pseudo cost'. We assume the objective will change at the same rate this time:

    zi(xji↑) ≈ zi + pj↑ ( ⌈xji⌉ − xji )

If xj has never been branched before, we can put pj↑ = cj (not very good), or test the branch using partial solving (above) (recommended).

Not so useful with many 0/1 variables, as the same variable won't be branched often in the tree. What about constraint branching?

These methods can be combined; eg a weighted combination of pseudo-cost (global) information and (local) strong branching results.
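The pseudo-cost estimate (method 3) is simple arithmetic, sketched below; the node histories and values are entirely invented for illustration.

```python
import math

# Sketch of a pseudo-cost up-estimate (method 3). Node data is invented.
def up_pseudocost(history):
    """history: list of (z_parent, z_after_up_branch, x_fractional_value)
    observations for variable x_j at past nodes k.
    p_j_up = average of [z^k(up) - z^k] / [ceil(x_j^k) - x_j^k]."""
    rates = [(zu - z) / (math.ceil(x) - x) for z, zu, x in history]
    return sum(rates) / len(rates)

history = [(100.0, 103.0, 0.25),   # rate (103-100)/0.75 = 4
           (80.0, 81.0, 0.75)]     # rate (81-80)/0.25  = 4
p_up = up_pseudocost(history)      # average = 4.0

z_i, x_ji = 90.0, 0.4
z_est = z_i + p_up * (math.ceil(x_ji) - x_ji)   # 90 + 4*0.6 = 92.4
print(p_up, z_est)
```

The down-estimate is the mirror image, dividing by xjk − ⌊xjk⌋ instead.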
Predicting Integer Solution Objectives
These are mainly used in node selection rules (see later).

Techniques exist for obtaining bounds on the best integer solution that can be obtained from a fractional LP solution. These use reduced costs and the requirement that variables be integer; see [LS97]. Other techniques include:

• Best (Integer) Projection
  • Integer infeasibility at node i is si = Σj min( xji − ⌊xji⌋, ⌈xji⌉ − xji )
  • The best integer solution that can be found from fractional node i, zi[int], is given by zi[int] ≈ zi + si (zU − z0)/s0, where zU is the current upper bound, and node 0 is the root node.
• Best (Pseudo-cost + Rounding) Estimate
  • Modify the above approach to use pseudo costs, assuming each fractional variable can round in its cheapest direction.
• Probabilistic Pseudo-cost Estimate
  • Modify the best estimate to assume a variable rounds down or up with various probabilities.
  • Probabilities are based on the number of non-zeros in heuristically obtained solutions.
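The best-projection estimate above is a one-liner per node; here is a sketch with invented node data:

```python
import math

# Sketch of the best-projection estimate. All node data is invented.
def integer_infeasibility(x):
    """s_i = sum over j of min(x_j - floor(x_j), ceil(x_j) - x_j)."""
    return sum(min(xj - math.floor(xj), math.ceil(xj) - xj) for xj in x)

x_i = [0.5, 0.9, 2.0]             # LP solution at node i
s_i = integer_infeasibility(x_i)  # 0.5 + 0.1 + 0 = 0.6

z_i, z_U, z_0, s_0 = 95.0, 110.0, 90.0, 1.2   # node obj, incumbent, root data
z_int = z_i + s_i * (z_U - z_0) / s_0          # 95 + 0.6*20/1.2 = 105
print(s_i, z_int)
```

A node whose infeasibility is small relative to its objective thus gets an optimistic integer-solution estimate, which is what the node-selection rules below exploit.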
Choice of Branching Variable
Range of choices to pick which variable to branch on:
• Pick the most fractional (good for proving optimality)
• Pick the least fractional that is not integer (good for finding integer solutions)
• Use a user-provided priority order on variables
• Enhanced Branching: Of a set of most (or sufficiently) fractional variables, pick the variable with greatest cost [Savelsbergh's MINTO default]
• Use one of the above LP objective estimates, eg:
  • Strong Branching (developed by CPlex). Given a set of fractional variables, try each side of each branch by performing a fixed number of simplex iterations.
  • Must use the zi(xji↑) and zi(xji↓) estimates in some sensible way; examples in the literature include:
    • eg, choose the branch that maximises min[ zi(xji↓), zi(xji↑) ]
    • eg, choose the branch that maximises zi(xji↓) + zi(xji↑)
    • eg, choose the branch that maximises zi(xji↓) − zi(xji↑)

For an example of strong branching (using the last rule above) with constraint branching and column generation, see:
• Klabjan, D., Johnson, E.L., Nemhauser, G.L., Solving Large Airline Crew Scheduling Problems: Random Pairing Generation and Strong Branching, Georgia Inst. of Technology, 1999 [KJN99]
Node Choice
• "Which partial solution are we going to explore next?"
• Depth first search = "LP Dive"
  • Choose one of the just-created children
  • If both children are bounded or infeasible, step back up the tree to the first unexplored node
  • Focuses on finding a (hopefully good) solution quickly
  • Number of active (unexplored, unbounded and feasible) nodes stays small
  • Finding a solution early allows bounding
  • May be hard to prove optimality
• Best first search
  • Always explore the best active node in the tree
  • Choice of 'best' can use the LP and IP estimates discussed above
  • Good for (eventually) finding a very good solution
  • Large number of active nodes at any time
• To choose between children
  • Choose the least fractional (good for quick solutions) or use the estimates discussed above
  • Can backtrack if the first child gives a big objective increase
  • If the branch is unclear, eg on 0.5's,
    • use the estimates discussed above
    • fully solve both children, and continue from the better
• Blended strategies often used in practice
  • 'Depth first' to find a solution
  • Switch to 'best first' to prove optimality
  • 'Multi-start' depth first
    • if depth first starts back-tracking, switch to a new better (higher) node
  • eg use depth first to explore both sides of
    • 'uncertain branches', eg 0.5's
    • critical branches, eg at the top of the tree
• Avoid 'playing in the muck'; don't generate sequences of similar solutions at the bottom of the tree
(c) A. Mason www.esc.auckland.ac.nz/Mason/ 31+
This material is not to be distrubted; contact the author for the latest version and permission to distribute.
Node Evaluation
• What solution do we start from?
  • Normally use the parent LP solution
• What algorithm do we use to re-optimise with the new branch?
  • Dual simplex can't use a column generator
    • Resolve without column generation?
  • Use primal simplex
    • Phase 1 or Big M to drive out banned columns, or push variables to new bounds
• How much time do we spend now vs later evaluating a node?
  • Can partially solve a node (eg using a limited number of iterations) to get bounds on the objective function; see the estimate methods above
  • Use heuristics to get node upper bounds
  • Can solve to 'objective function optimality' only
    • If the objective equals the parent's, it must be optimal, even if not dual feasible
  • Can leave a node totally un-evaluated
    • Allows multiple branches to be applied in succession
Partial Node Solution
We can partially solve nodes:
• Stop the LP before optimality is obtained
  • Eg a limited number of iterations, or stop when reduced costs become near zero
• Obtaining the optimal solution is not important
  • The LP objective is only used for bounding
    • Can generate lower bounds on the optimal LP value from a partial solution (see below), allowing the node to be bounded in the normal way
  • The LP optimal variable values are only used for branching decisions
    • The LP decision variables (probably) change only slightly in the final iterations
• Saves time spent in 'tail-off'
  • Particularly important for column generators, as finding an entering column often gets harder as the master gets closer to optimality
• Objective function bounds can help node selection
• Save the basis of partially solved nodes for re-use if the node is explored later.
Bounds on LP Solutions
Consider some minimisation LP solution that is sub-optimal, i.e. there are columns with −ve reduced costs. What can we say about the optimal solution?

Clearly, the current solution is an upper bound. We seek good lower bounds.

Two types of lower bounds: dual variable based and Lagrangian. Both require a full pricing of the variables, or perhaps an iterative application (modify duals, generate, repeat if not dual feasible) if column generation is used.
Farley's Dual-Based Bounds - Dual Variable Scaling
See A.A. Farley, "A note on bounding a class of linear programming problems, including cutting stock problems." Operations Research 38, 1990, p922.
Basic Idea: Modify the duals to become dual feasible; by weak duality, this dual solution gives a lower bound on the primal problem.

Example: Reduce all dual variables to 0. For positive column costs, this solution is dual feasible. The dual objective πTb is 0. This is a (useless but valid) lower bound on cTx.

Consider the primal/dual pair:

    P: min cTx : Ax ≥ b, x ≥ 0
    D: max bTπ : ATπ ≤ c, π ≥ 0

As usual, we will assume c ≥ 0.

Assume we have some basic sub-optimal solution to P and associated duals π. We are sub-optimal (i.e. not dual feasible), so for some j,

    rc(xj) = cj − (Aj)Tπ < 0

Consider some new set of duals π' = απ, 0 ≤ α < 1. We know α = 0 gives dual feasibility (example above), and that α = 1 is not dual feasible. What is the largest value of α that is dual feasible?

Solve: max α : rc'(xj) = cj − (Aj)T(απ) ≥ 0 for all columns j in A.

This gives α = min_j { cj / (Aj)Tπ : (Aj)Tπ > 0 }.

Noting that π ≥ 0 and hence π' = απ ≥ 0, the set of new duals π' = απ satisfies both feasibility requirements for (D) above. From weak duality, any feasible solution to D is a lower bound on a feasible solution to P. Therefore, a lower bound on the optimal objective value cTx* to (P) is given by

    cTx* ≥ bTπ' = bT(απ) = α bTπ = α πTb = α cBTB⁻¹b = α cTx.  ∎
Notes:• For column generation, must be applied iteratively as finding α is hard
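The α computation itself is a single pass over the columns; a sketch with invented data:

```python
# Sketch of Farley's dual scaling with invented data: find the largest
# alpha in [0,1] such that the scaled duals alpha*pi are dual feasible.
def farley_alpha(c, A_cols, pi):
    """alpha = min over columns j of c_j / (a_j^T pi), taken over a_j^T pi > 0."""
    ratios = []
    for cj, a in zip(c, A_cols):
        pa = sum(ai * pi_i for ai, pi_i in zip(a, pi))
        if pa > 0:
            ratios.append(cj / pa)
    return min(ratios)

c = [3.0, 2.0, 4.0]
A_cols = [[1, 0], [1, 1], [0, 2]]   # columns a_j (invented)
pi = [2.0, 1.0]                     # current (dual-infeasible) duals

alpha = farley_alpha(c, A_cols, pi)
# ratios: 3/2, 2/3, 4/2 -> alpha = 2/3; the bound is alpha * c^T x
print(alpha)
```

With column generation this pass is exactly a pricing sweep, which is why the note above says it must be applied iteratively.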
Lower Bounds from Dual Feasibility by Additive π Changes
A dual feasible solution can often be formed by decreasing the dual variables one by one, for successive constraints, until all reduced costs are non-negative. If none of the dual variables becomes negative for any ≥ constraints, this will allow a lower bound to be calculated. (This technique is used in some Lagrangian heuristics.)
Lower Bounds from Dual Feasibility with GUB Constraints
Consider a problem with equality GUB constraints. We will assume that any columns that are not part of a GUB constraint (eg slacks) have non-negative reduced costs. Let Xi be the set of columns aj associated with GUB constraint i, and let rcmin(Xi) = min( rc(xj) : aj ∈ Xi ) be the minimum reduced cost in Xi for some current solution with objective value z. A lower bound on the optimal solution, z* = cTx*, is given by

    z* ≥ z + Σi rcmin(Xi)

To see this, we note that at least one column in each Xi is basic, and so rcmin(Xi) ≤ 0 for all Xi. Assume rcmin(Xi) < 0 for some i, and hence the solution is not dual feasible. If the dual variable πi associated with GUB constraint i is replaced by πi' = πi + rcmin(Xi), then all rc(xj) : aj ∈ Xi become non-negative. Applying this process to all required Xi gives a dual feasible solution, with a bound (π')Tb. The result follows from the unit right hand sides on the GUB constraints, and πTb = cTx at the current solution.

Implementation: Partial pricing means we don't normally price all individuals, and so cannot calculate the bound. However, we can bound the bound as follows! Assume we stop whenever an individual i has a best reduced cost rcmin(Xi) > −εi, and when the objective z < z* + ε, where z* is the unknown optimal solution. We assume ε < Σi εi, i.e. we assume the bound on z is tighter than the bound that follows from the individual εi bounds. Assume we last priced up to individual p in our partial pricing. Then, in the next pricing step, we price the next individuals p+1, then p+2, p+3, ..., p+k (assuming p+k wraps around back to 1) until rcmin(Xp+k) > −εp+k, in which case individual p+k's column(s) enter, or until Σ_{p<i≤p+k} rcmin(Xi) < −ε, at which point the best reduced cost column(s) found so far enter the basis.
Lagrangian Lower Bound with GUB constraints
Standard Lagrangian techniques can be used to get lower bounds using the current set of LP duals. Assuming the GUB constraints are not relaxed, the Lagrangian sub-problem solution for a given set of duals is simply the set of most-negative reduced cost columns, one from each Xi (ignoring any slacks/surpluses).
Branching Possibilities
If column generating, branches must be respected by the column generator. That is, columns must satisfy the branches imposed. We don't want this to complicate the generator too much.
Variable Branching:
Force a variable xj up or down to ⌈xji⌉ or ⌊xji⌋ respectively (xji is the value of xj at node i), i.e. adding xj ≥ ⌈xji⌉ or xj ≤ ⌊xji⌋.

Why not use variable branching? Problems occur if variables are forced down:
• Applying an upper bound on a variable means it cannot be generated again in the column generator. How do we stop it reappearing? Need the k'th shortest path.
• With binary variables, a zero branch (xj = 0) says little about the solution; many feasible solutions remain. (The 1-branch xj = 1 is much more powerful.)

Fixing variables at 1 that are 1 in the LP can be a useful heuristic for big IP's.

General form: "The number of times the column ai appears in the solution must be integer."
Constraint Branching: Binary Variables, Binary A-matrix, GUB Constraints

[Worked example: columns xA1-xA6 (person A) and xB1-xB4 (person B); one equality GUB constraint per person, plus task-coverage rows (≥ 1). The LP solution is fractional (values ½, ¼, ¼, 1/3, 2/3), and the implied 'original' y variables are yA1 = 1, yA2 = 0, yA3 = 3/4 and yB1 = 1, yB2 = 1, yB3 = 2/3. Arrows mark the constraints chosen for the 1-branch and 0-branch.]

We branch to force (1-branch) or ban (0-branch) tasks for a specific person. The branch choice is based on the 'original y variables' in the Dantzig-Wolfe view.

Two branches are possible above. Pick one of these. Each side of the branch is enforced by banning columns.

These branches are easy to enforce in a column generator. Eg, to force person A to undertake task 1, we put πA1 = ∞ in the generator's pricing objective.

General form: "The number of times we have person p working shift q in the solution must be integer (0 or 1 in fact)."
Constraint Branching: Binary Variables, Binary A-matrix, no GUB Constraints
General case of the above.

[Worked example: columns xA1-xA10 over five constraint rows, with fractional LP values 3/8, 1/8, 1/8, ½, ½. A 'Coverage' table lists, for each constraint pair (s,t), the total fractional value of the columns covering both constraints; pairs with fractional coverage (eg 0.5) are candidates for a constraint branch. Arrows mark the chosen pair and its 1-branch and 0-branch.]

General form: "The number of times shifts p and q occur together in a solution must be integer (0 or 1 in fact)."
Follow-on Branching

Special case of the above where the 'constraint pair' is restricted to tasks that occur in immediate succession. E.g., if after flight sector 5 you can do one of sectors 6, 7, or 8, then a constraint branch can force (or ban) each of these options. Good for column generation, as the sectors become one 'multi-sector': you either choose all of it or none of it. E.g., see [KJN99].
General Constraint Branching: Binary Variables, Binary A-Matrix (both above cases)
Developed by Ryan and Foster.

Let J(s,t) = { j | asj = 1 and atj = 1 }. J(s,t) is the set of columns covering both constraints s and t. Suppose activities (constraints) s and t appear together (i.e. occur in the same column) at a fractional value in the optimal LP solution, i.e.

0 < Σj∈J(s,t) xj < 1.

Then in an integer solution:
either activities s and t must occur together, i.e. Σj∈J(s,t) xj = 1,
or activities s and t must not occur together, i.e. Σj∈J(s,t) xj = 0.
So we find constraints s and t with

maxs,t Σj∈J(s,t) xj < 1,

i.e. the pair whose joint coverage is largest while still fractional, and then force s and t to occur together by setting xj = 0 for all j ∈ Jban1(s,t), where

Jban1(s,t) = { j | (asj = 1 and atj = 0) or (asj = 0 and atj = 1) }.

This is called the 1-branch.
In the 0-branch, we force s and t not to occur together by banning the variables where they happen together, i.e. we ban the variables in Jban0(s,t), where

Jban0(s,t) = { j | asj = 1 and atj = 1 }.
Note: A sequence of constraint branches leads (eventually) to a balanced matrix, and hence an integer solution for all variables in at least one constraint with unit right hand side.
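The Ryan and Foster rule above can be sketched in a few lines (a minimal illustration, not production code; the 0/1 matrix `A` and fractional LP solution `x` are hypothetical stand-ins for a real tableau):

```python
from itertools import combinations

def ryan_foster_pair(A, x, tol=1e-6):
    """Return the constraint pair (s, t) maximising sum_{j in J(s,t)} x_j
    over pairs whose coverage is strictly fractional (between 0 and 1).
    A: 0/1 matrix as a list of rows; x: fractional LP solution."""
    best, best_pair = -1.0, None
    for s, t in combinations(range(len(A)), 2):
        cover = sum(xj for j, xj in enumerate(x)
                    if A[s][j] == 1 and A[t][j] == 1)
        if tol < cover < 1 - tol and cover > best:
            best, best_pair = cover, (s, t)
    return best_pair

def banned_columns(A, s, t, branch):
    """Columns fixed to zero on each side of the branch on (s, t)."""
    n = len(A[0])
    if branch == 1:  # 1-branch: force s, t together; ban columns covering one only
        return [j for j in range(n) if A[s][j] != A[t][j]]
    return [j for j in range(n) if A[s][j] == 1 and A[t][j] == 1]  # 0-branch
```

On the small instance A = [[1,1,0],[1,0,1]] with x = (1/2, 1/2, 1/2), the pair (s,t) = (0,1) has coverage 1/2; the 1-branch bans columns 1 and 2, and the 0-branch bans column 0.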
Constraint Branching: Binary Variables, Integer A, GUB Constraints

         xA1 xA2 xA3 xA4 xA5 xA6   xB1 xB2 xB3 xB4
soln:    (fractional values 1/2, 1/4, 1/4 among the xA's; 1/3, 2/3 among the xB's)
st        1   1   1   1   1   1                      = 1
                                    1   1   1   1    = 1
          3   6   6   1   1   7     1   3   6   3    ≥ 5    yA1 = 3 1/2, yB1 = 2 1/3
          4   1   3   7   2   1     4   3   6   6    ≥ 8    yA2 = 2 3/4, yB2 = 5 1/3
          2   2   1   2   1   1     3   2   0   1    ≥ 3    yA3 = 1 1/2, yB3 = 1 2/3

Constraint branch on a constraint pair: 1-Branch / 0-Branch.
We impose branches on the y variables of the form ypq ≤ ⌊ypqi⌋ or ypq ≥ ⌈ypqi⌉, where ypqi is the value of ypq at node i. These are enforced by banning all columns for person p which have apq > ⌊ypqi⌋ or apq < ⌈ypqi⌉ respectively for some task (non-GUB constraint) q, where A = (apq) is the non-GUB matrix in the problem.
Note that branching may be required even if ypqi is integer. This can occur when fractional x values sum to give an integer ypqi.

Branches are generally easy to enforce in column generators.

General form: “The number of times person p works shift q must be integer.”
SOS Branching: Binary Variables, Arbitrary A, GUB Constraints

          xA1 xA2 xA3 xA4 xA5 xA6   xB1 xB2 xB3 xB4
soln:     (fractional values 1/2, 1/4, 1/4 among the xA's; 1/3, 2/3 among the xB's)
st         1   1   1   1   1   1                      = 1
                                     1   1   1   1    = 1
           3   6   6   1   1   7     1  2.5  6   3    ≥ 5
           4   1   3   7   2   1     4   3   6   6    ≥ 8
           2   2   1   2   1   1     3   2   0   1    ≥ 3
weight:    2   3   4   6   7   9     2   3   5   7

Solution weights: 5 (set A), 5 1/3 (set B).

SOS branch on a GUB constraint: 1-Branch / 0-Branch.
We form 'specially ordered sets' (SOSs) of variables with the property that at most one variable from the set can appear in the solution. Specially ordered sets arise from =1 (or ≤1) GUB constraints. Consider some specially ordered set Xp. Each variable xj in Xp has a (unique) weight wj. At some node i in the branch and bound tree, the average weight w̄i(Xp) of the variables in Xp is given by

w̄i(Xp) = Σj∈Xp wj xji

We impose branches on w̄(Xp) of the form w̄(Xp) ≤ w̄i(Xp) or w̄(Xp) ≥ w̄i(Xp). These are enforced by banning all columns in Xp which have wj > w̄i(Xp) or wj < w̄i(Xp) respectively.
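The average-weight computation and the two ban sets can be sketched as follows (a minimal illustration with hypothetical weights and fractional values):

```python
def sos_average_weight(weights, x):
    """Average weight w_bar(Xp) = sum_j w_j * x_j of one SOS at this node
    (for a GUB =1 set, the x_j sum to 1, so this is a weighted average)."""
    return sum(w * xj for w, xj in zip(weights, x))

def sos_banned(weights, x, branch):
    """Columns of the SOS banned on each side of the branch on w_bar."""
    w_bar = sos_average_weight(weights, x)
    if branch == "le":  # enforce w_bar <= current value: ban heavier columns
        return [j for j, w in enumerate(weights) if w > w_bar]
    return [j for j, w in enumerate(weights) if w < w_bar]  # "ge" branch
```

With weights (2, 3, 4) and x = (1/2, 1/4, 1/4), w̄ = 2.75; the ≤ branch bans the columns with weights 3 and 4, and the ≥ branch bans the column with weight 2.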
Note: If the weights are not unique, SOS branching may be insufficient to form an integer solution for the xj in Xp. Can randomly perturb the wj before starting if required.

The above description is for 'Type 1' SOS branching; it can be used for any type of variables (integer, binary, real), but without a GUB constraint this information must be given externally (it is not in the model). 'Type 2' allows up to 2 variables from Xp to be in the solution as long as they are adjacent in Xp, where Xp is now ordered. Type 3 allows only +1's and -1's in the coefficients, but decreases the right hand side for each -1.
SOS branching works with overlapping Xp, i.e. a column can belong to any number of sets.

Typically easy to handle in column generators if the weight is 'column generation friendly', i.e. derived from a column property associated with an arc or a node in a shortest path.

Example: wj = 'time spent once sector 5 is completed before starting the next sector' (KJN99).

General form: “The number of times that a column from Xp with a weight above (below) some specified value appears in the solution must be integer (0 or 1 in fact).”
Generalised SOS Branching ('Attribute Branching'): Integer Variables, Arbitrary A, no GUB Constraints

          xA1 xA2 xA3 xA4 xA5 xA6 xA7 xA8 xA9 xA10
soln:     (fractional values 1 3/5, 3 1/3, 2 1/5, 1/4, 1 3/4, 2/3 among the columns)
st         1   1   1   1   5   1   0   2   2   1    ≥ 2
           0   2   2   1   0   0   1   0   2   4    ≥ 3
           3   6   6   1   1   7   1  2.5  6   3    ≥ 5
           4   5   5   7   2   1   4   3   5   6    ≥ 8
           2   2   3   2   1   1   3   2   2   1    ≥ 3

Branch on a set X: 1-Branch / 0-Branch.
Consider any set X of integer variables. At some node i in the branch and bound tree, the number of times columns from X are used is given by

ni(X) = Σj∈X xji

We impose branches on n(X) of the form n(X) ≤ ⌊ni(X)⌋ or n(X) ≥ ⌈ni(X)⌉.
Each branch is enforced by adding a constraint (cut) to the problem. These cuts are local; they are not valid inequalities, and have to be removed when stepping back up the tree.

Where possible, choose X to capture some 'attribute' that is important to cost and is 'column generator friendly'. E.g., branch on the “number of full weekends off”. For each full weekend off that the column generator includes, the reduced cost changes by the π for the added cut.

General form: “The total number of times that we use columns from X in the solution must be integer.”
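The branch bounds on n(X) can be sketched as below (a minimal illustration; the floor/ceiling split is the standard way to branch on a fractional quantity):

```python
import math

def attribute_branch(x, X, tol=1e-6):
    """Branch on n(X) = sum_{j in X} x_j for an attribute set X.
    Returns (floor, ceil) bounds for the down- and up-branches,
    or None if n(X) is already integral at this node."""
    n_X = sum(x[j] for j in X)
    if abs(n_X - round(n_X)) < tol:
        return None  # nothing to branch on for this X
    # down-branch cut: n(X) <= floor(n_X); up-branch cut: n(X) >= ceil(n_X)
    return math.floor(n_X), math.ceil(n_X)
```

E.g. x = (1.6, 0, 2.2) with X = {0, 2} gives n(X) = 3.8, so the two branches add the local cuts n(X) ≤ 3 and n(X) ≥ 4.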
Generalised Constraint Branching: Integer Variables, Arbitrary A-matrix, no GUB Constraints

          xA1 xA2 xA3 xA4 xA5 xA6 xA7 xA8 xA9 xA10
soln:     (fractional values 1 3/5, 3 1/3, 2 1/5, 1/4, 1 3/4, 2/3 among the columns)
st         1   1   1   1   5   1   0   2   2   1    ≥ 2
           0   2   2   1   0   0   1   0   2   4    ≥ 3
           3   6   6   1   1   7   1  2.5  6   3    ≥ 5
           4   5   5   7   2   1   4   3   5   6    ≥ 8
           2   2   3   2   1   1   3   2   2   1    ≥ 3

Branch on a reference column a: 1-Branch / 0-Branch.
We impose branches of the generalised SOS form, where
X = X≥a = { j : aj ≥ a },

where 'a' is some 'reference column' (typically chosen from the A-matrix).

To choose our reference column at a non-integer node i (Vanderbeck and Wolsey, 1996): find some column aj for which xji is the only fractional value in X≥a.

Note that aj is any maximal (undominated) column from Fi = { j : xji is fractional }.
Must be enforced by adding a constraint.

Vanderbeck and Wolsey show how these constraint branches can be enforced in general IP-based column generators by using 0/1 variables that determine when a column belongs to X≥a, and hence when its reduced cost includes the dual variable for the cut associated with X≥a.

General form: “The number of times we have columns aj ≥ a appearing in the solution must be integer.”
See: F. Vanderbeck and L.A. Wolsey, "An exact algorithm for IP column generation", Operations Research Letters 19(4) (1996), pp. 151-159.
Column Generator Structures

Column generators are typically:
• Shortest path (dynamic programming)
• Nested shortest path
• Shortest path with resource constraints
• TSP solutions (vehicle routing)
• TSP solutions with resource constraints (e.g. vehicle routing)
• General IPs
• Enumerators
• Randomised enumerators

Note: Shortest path / dynamic programming is only useful if there is significant merging of states. Otherwise, it is just inefficient enumeration.

Can blend enumeration and shortest path:
• Enumerate the high-level structure of the column, e.g. 'all days on/off that give a 40 hour week'
• Fill in the column detail using dual information, e.g. via shortest path: 'what is done during days on'
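The 'merging of states' point can be illustrated with a toy label-setting pricer over a staged graph (the interface and data here are hypothetical; a real pricer would use the duals π to define the arc reduced costs):

```python
def dp_pricer(stages, arc_cost):
    """Toy dynamic-programming pricer: states are grouped into stages and
    arc_cost(u, v) is the reduced-cost contribution of moving u -> v
    (None if infeasible). Keeping only the cheapest label per state is
    the 'merging' that makes DP beat plain enumeration."""
    labels = {s: (0.0, [s]) for s in stages[0]}
    for stage in stages[1:]:
        new = {}
        for v in stage:
            for u, (cost, path) in labels.items():
                c = arc_cost(u, v)
                if c is None:
                    continue
                if v not in new or cost + c < new[v][0]:
                    new[v] = (cost + c, path + [v])  # merge: keep best label
        labels = new
    return min(labels.values())  # cheapest column found and its path
```

With stages [['a'], ['b', 'c'], ['d']] and arc costs a-b = 1, a-c = 3, b-d = 2, c-d = -1, the pricer returns cost 2.0 via the path a, c, d.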
(c) A. Mason www.esc.auckland.ac.nz/Mason/ 39+
This material is not to be distrubted; contact the author for the latest version and permission to distribute.
Column Generation Issues
• Many issues are similar to SPRINT pricing.
• How many columns do we return in each generate?
  • Dynamic programs often end in multiple states, offering many 'good' columns
  • The number to return depends on the speed of the generator
  • When many columns are available, select a range of different columns
• When do we call the generator?
[Figure: objective value (1000-1250) and reduced cost of the entering variable (-1000-0) plotted against iteration (4000-7600); column generation occurs at the dark bars.]
Impact of Calling the Column Generator
• Crashing the basis
  • Best to start from a good basis.
  • One strategy is to use the 'remaining b' (remaining workload to cover), possibly scaled, as a substitute for π, updating the remaining workload as columns are generated and contribute to covering the work.
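The 'remaining b' crash can be sketched as follows (a minimal illustration with hypothetical coverage data; the names are not from any real code):

```python
def crash_duals(b, generated_columns):
    """Surrogate duals for the next pricing call before any LP has been
    solved: the workload still uncovered ('remaining b'), updated as each
    accepted column contributes to covering the work."""
    remaining = list(b)
    for col in generated_columns:
        remaining = [max(0, r - a) for r, a in zip(remaining, col)]
    return remaining  # optionally scale before using in place of pi
```

E.g. with b = (2, 1, 1), accepting the column (1, 1, 0) leaves surrogate duals (1, 0, 1), steering the generator towards the still-uncovered rows.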
Plots taken from: Mark Smith, Optimal Nurse Scheduling using Column Generation, Masters thesis, Department of Engineering Science, School of Engineering, University of Auckland, 1995.
[Figure: solution time (s, 0-1200) against minimum quality of returned columns (0-1), for the LP relaxation and for B&B. The effect of multiple column generation (fewer columns per generate on the right).]
Integerisation Strategies: Branch and Bound

The course will introduce set partitioning, set covering and set packing models and illustrate their use with several case studies.

Column generation will be discussed from both a 'natural' and a Dantzig-Wolfe viewpoint.

We will then consider branch and bound and use this to motivate constraint branching and its links to perfect and balanced matrices, and also limited subsequence.

Constraint branching will be discussed, and the alternatives contrasted. This will include recent work on constraint branching via cuts for non-binary problems. General problem-motivated branching will be mentioned.

The choice of alternative column generators (enumerative, (k'th-)shortest path, and 'delicate' blends of the two) will be covered.

Other topics covered will include a selection from branch and cut, practical IP implementations, heuristic solution processes, 'lift and project' cuts, new work on stabilized column generation, heuristics for set partitioning, and bounded early LP termination. Some of this material will be covered in seminars researched and presented by students enrolled in the course.
[Tableau: minimise yA1 + yA2 + yA3 + yB1 + yB2 + yB3 subject to three covering rows (two 1's each, ≥ 1; duals π1, π2, π3), one covering row (three 1's, ≥ 1; π4), and three side constraints with three 1's each: ≤ 2 (π5), ≥ 2 (π6), ≤ 3 (π7); yA1, yA2, yA3, yB1, yB2, yB3 ∈ {0,1}.]
Note 3:
The original objective function can be arbitrarily complex in each of the sub-problem variables but, for linear programming, must be additive across sub-problems, e.g.

min z = fA(yA1, yA2, yA3) + fB(yB1, yB2, yB3).

This is possible because we evaluate fA(yA1, yA2, yA3) for each column generated.

However, a complex fA(yA1, yA2, yA3) can make for a complex generator.
Note 4:
If some column, e.g. aAp, is a convex combination of other columns,

aAp = Σk λk aAk, where Σk λk = 1, λk ≥ 0,

but

fA(aAp) > Σk λk fA(aAk),

then column aAp will never appear in the LP solution.
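A quick numeric check of Note 4 (with hypothetical columns and costs): writing rc(a) = f(a) - πᵀa, a dominated convex combination satisfies rc(aAp) = Σk λk rc(aAk) + (fA(aAp) - Σk λk fA(aAk)) > mink rc(aAk) for any duals π, so it never prices out best:

```python
import random

def reduced_cost(f_val, col, pi):
    """Reduced cost f(a) - pi^T a of a column for duals pi."""
    return f_val - sum(p * a for p, a in zip(pi, col))

# Hypothetical data: ap = 0.5*a1 + 0.5*a2, but fp > 0.5*f1 + 0.5*f2 = 4.0,
# so ap is a dominated convex combination of a1 and a2.
a1, a2, ap = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]
f1, f2, fp = 3.0, 5.0, 4.5

random.seed(0)
for _ in range(100):
    pi = [random.uniform(-10, 10) for _ in range(2)]
    # rc(ap) = 0.5*rc(a1) + 0.5*rc(a2) + 0.5 >= min(rc) + 0.5 > min(rc)
    assert reduced_cost(fp, ap, pi) > min(reduced_cost(f1, a1, pi),
                                          reduced_cost(f2, a2, pi))
```

The fractional column ap is of course not a genuine 0/1 partition column; it simply makes the convex-combination argument concrete.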
Our original formulation:

[Tableau: minimise yA1 + yA2 + yA3 + yB1 + yB2 + yB3 subject to three covering rows (two 1's each, ≥ 1; duals π1, π2, π3), one covering row (three 1's, ≥ 1; π4), and three side constraints with three 1's each: ≤ 2 (π5), ≥ 2 (π6), ≤ 3 (π7); yA1, yA2, yA3, yB1, yB2, yB3 ∈ {0,1}.]
Our Column Generation Reformulation:
[Tableau: minimise ΣxAj + ΣxBj over xA1-xA6, xB1-xB4, subject to three covering rows (six 1's each, ≥ 1) and the two convexity constraints ΣxAj = 1 and ΣxBj = 1; xij ∈ {0,1}.]
Example solution feasible for the original, but not the new, formulation:

[Table of yA1, yA2, yA3, yB1, yB2, yB3 values]
Note 7:
Where is the 'convexity constraint' in the stock cutting problem?

Hint: The stock cutting problem does not have natural sub-problems, so ... create them! All the sub-problems are the same.
FIXES TO MAKE

2001:
• We started with a simple rostering example, then moved into column generation forms.
• Add something about NP-hardness in the sub-problem for LP complexity in Dantzig-Wolfe.
• Draw a price/pivot diagram.
• Do Benders by introducing the form of the master first, then talking about how we find the cuts; merge this into the main notes.

2000: DONE!
In generating extreme columns, the 2 cases are not π(1) < π(2) and π(1) > π(2), but instead more complicated. (They depend on the slope of the frontier; in this case, the frontier is a line from (6,0) to (0,4). Ratio is 2/3?)

In Dantzig-Wolfe, the costs are wrong... we have unit costs for the master and the original. Also, the sub-problem has to have the dual for πA.

An example of a solution in the original but not the master is 0.5 for all the x's.

The master problem has a generator sub-problem that is naturally integer, hence the LP is NOT strengthened, contrary to the example in the notes.

See the 2001 second handout for more material, including a rostering column generator.