Mixed Integer Programming: Analyzing 12 Years of Progress

Post on 01-Jul-2015

description

This presentation was first given at INFORMS in November 2013. It presents an analysis of the features that had the most impact on MIP solver performance during the last 12 years. More presentations are available at https://www.ibm.com/developerworks/community/groups/community/DecisionOptimization

transcript

Mixed Integer Programming: Analyzing 12 Years of Progress

© 2013 IBM Corporation

Background

2001: Manfred Padberg’s 60th birthday
– Bixby et al., “Mixed-Integer Programming: A Progress Report”, in: Grötschel (ed.) The Sharpest Cut: The Impact of Manfred Padberg and His Work, MPS-SIAM Series on Optimization, pp. 309-325, SIAM, Philadelphia (2004)
• Analysis of the relative contributions of the key ingredients of Branch-and-Cut Algorithms for solving MIPs

2013: Martin Grötschel’s 65th birthday
– T. Achterberg and R.W., “Mixed Integer Programming: Analyzing 12 Years of Progress”, in: Jünger and Reinelt (eds.) Facets of Combinatorial Optimization, Festschrift for Martin Grötschel, pp. 449-481, Springer, Berlin-Heidelberg (2013)
• What stayed?
• What changed?
• Why?

Agenda

Methodology
– How to Benchmark
– How to Measure Importance of Features

Analysis
– Presolving
– Cuts
– Heuristics
– Parallelism

Summary

Benchmarking

Run competing algorithms on set of problem instances
– Measure and compare runtime, timeouts
– Use geometric mean for aggregation

Performance Variability
– Seemingly performance neutral changes (random seed, platform, permutation of variables, …) have drastic impact on solution time
– Has been observed for a long time, e.g.
• Emilie Danna: Performance variability in mixed integer programming, Presentation at Workshop on Mixed Integer Programming 2008
– Friend or foe for Solvers?
• Tuesday Oct 08, 08:00 - 09:30, Andrea Lodi: Performance Variability in Mixed-integer Programming
• Wednesday Oct 09, 15:30 - 17:00, Andrea Tramontani: Concurrent root cut loops to exploit random performance variability
– Definitely foe for Benchmarking
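The aggregation step above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmarking code behind the slides: the function names, the sample runtimes, and the choice to charge a timeout at the full time limit are all assumptions made for the example.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need a non-empty sequence of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def runtime_ratio(times_a, times_b, time_limit):
    """Geometric mean of per-instance ratios time_a / time_b.

    A runtime of None marks a timeout; as an assumption of this sketch,
    timeouts are charged the full time limit.
    """
    ratios = []
    for ta, tb in zip(times_a, times_b):
        ta = time_limit if ta is None else min(ta, time_limit)
        tb = time_limit if tb is None else min(tb, time_limit)
        ratios.append(ta / tb)
    return geometric_mean(ratios)

# Hypothetical runtimes in seconds for three instances (None = timeout).
times_a = [2.0, 50.0, None]
times_b = [1.0, 100.0, 400.0]
ratio = runtime_ratio(times_a, times_b, time_limit=1000.0)
```

The geometric mean treats a 2x slowdown on one instance and a 2x speedup on another as cancelling out, which an arithmetic mean of ratios does not.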

A benchmarking myth

Compare Solver S against reference Solver R
– Solver S claimed to be faster than R on “hard problems”

Model Set M
– Solution times tS(m) and tR(m), m ∈ M, for S and R are in (0 sec, 100 sec]

Classify models into hard models H and easy models E
– m ∈ H, if tR(m) > 80 sec
– m ∈ E, otherwise

Computational confirmation of speedup:
– 1.8x faster on hard models
– 0.8x as fast (i.e. slower) on easy models

Times for S and R are uniform random numbers:
– <tR(E)> = 40, <tR(H)> = 90
– <tS(E)> = 50, <tS(H)> = 50
– Speedup: 4/5 on E, 9/5 on H

[Figure: bar chart of speedup, with bars at 0.80 and 1.80 against a baseline of 1.00]
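The numbers on this slide can be reproduced with a small simulation. The sketch below is an illustration only: the function name, the sample size, and the seed are assumptions. Both solvers draw independent uniform times, so neither is actually faster, and the speedup is taken as the ratio of mean times, as in the 4/5 and 9/5 figures above.

```python
import random

def simulate_myth(n_models=100_000, threshold=80.0, seed=0):
    """Classify models by the reference solver's time and report apparent speedups."""
    rng = random.Random(seed)
    # Independent uniform times in (0, 100] for R and S.
    t_r = [100.0 * (1.0 - rng.random()) for _ in range(n_models)]
    t_s = [100.0 * (1.0 - rng.random()) for _ in range(n_models)]

    # The classification uses only the reference solver R: this is the source of the bias.
    hard = [i for i in range(n_models) if t_r[i] > threshold]
    easy = [i for i in range(n_models) if t_r[i] <= threshold]

    def mean(times, indices):
        return sum(times[i] for i in indices) / len(indices)

    return {
        "easy": mean(t_r, easy) / mean(t_s, easy),  # close to 40/50
        "hard": mean(t_r, hard) / mean(t_s, hard),  # close to 90/50
    }

result = simulate_myth()
```

Because S is sampled independently of the classification, its mean stays near 50 in both classes, while R's mean is pushed to about 40 and 90 by construction.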

Avoiding the bias

The problem is real: 2 different random seeds for CPLEX 12.5

Problem comes from using times from one solver to define subsets of problems

Solution
– Use (max) times from all solvers to define subsets of problems

Note
– 250 models cannot measure performance difference of less than 10%
– Will use [10,10k] bracket

biased subsets      # of problems    geomean
All                 3159             1.00
[0,10k]             3082             1.00
[1,10k]             1848             0.99
[10,10k]            1074             0.95
[100,10k]           552              0.87
[1k,10k]            207              0.76

unbiased subsets    # of problems    geomean
All                 3159             1.00
[0,10k]             3082             1.00
[1,10k]             1879             1.00
[10,10k]            1121             1.01
[100,10k]           604              1.01
[1k,10k]            238              1.08
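The difference between the two selection rules can be sketched as follows. This is a hedged illustration with made-up data: the function names, the run labels `seed1` and `seed2`, and the four toy models are hypothetical, and dropping models that no run solved within the limit is an assumption about how the upper end of a bracket is handled.

```python
import math

TIME_LIMIT = 10_000.0  # seconds, matching the 10k upper end of the brackets

def select_bracket(runs, lo, biased_by=None):
    """Return models whose time is at least `lo` seconds.

    runs maps model -> {run label -> seconds}. With biased_by set, only that
    run's time decides membership; otherwise the max over all runs is used.
    """
    selected = []
    for model, times in runs.items():
        if min(times.values()) >= TIME_LIMIT:
            continue  # assumption: skip models that no run solved within the limit
        key = times[biased_by] if biased_by is not None else max(times.values())
        if key >= lo:
            selected.append(model)
    return selected

def geomean_ratio(runs, models, num, den):
    """Geometric mean over `models` of time[num] / time[den]."""
    logs = [math.log(runs[m][num] / runs[m][den]) for m in models]
    return math.exp(sum(logs) / len(logs))

# Hypothetical times for two runs that differ only in random seed.
runs = {
    "m1": {"seed1": 5.0, "seed2": 20.0},
    "m2": {"seed1": 20.0, "seed2": 5.0},
    "m3": {"seed1": 40.0, "seed2": 40.0},
    "m4": {"seed1": 2.0, "seed2": 3.0},
}
biased = select_bracket(runs, lo=10.0, biased_by="seed1")
unbiased = select_bracket(runs, lo=10.0)
biased_ratio = geomean_ratio(runs, biased, "seed2", "seed1")
unbiased_ratio = geomean_ratio(runs, unbiased, "seed2", "seed1")
```

In this toy data the biased selection drops `m1`, where the first run happened to be fast, so the second run looks twice as fast on the subset; the max-based selection keeps `m1` and the ratio returns to 1.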

Measuring Impact

MIP is a bag of tricks
– Presolving
– Cutting planes
– Branching
– Heuristics
– ...

How important is each trick? Compare runs with feature turned on and off
– Solution time degradation (geometric mean)
– # of solved models
• Essential or just speedup?
– Number of affected models
• General or problem specific?

Bixby et al. 2001

Feature              Degradation
No cuts              53.7x
No presolve          10.8x
Trivial branching    2.9x
No heuristics        1.4x
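The three measures named on this slide could be computed along the following lines. This is a sketch under stated assumptions, not the evaluation code used for the presentation: the function name and data layout are hypothetical, timeouts are charged the time limit, and a model is counted as "affected" when the two runs differ in node count or solved status, which is a definition chosen here for illustration.

```python
import math

def feature_impact(on, off, time_limit):
    """Compare runs with a feature on and off.

    on / off map model -> (seconds or None for a timeout, node count).
    """
    logs = []
    affected = 0
    solved_on = solved_off = 0
    for model in on:
        t_on, nodes_on = on[model]
        t_off, nodes_off = off[model]
        solved_on += t_on is not None
        solved_off += t_off is not None
        # Assumption: "affected" means the runs differ in node count or solved status.
        if nodes_on != nodes_off or (t_on is None) != (t_off is None):
            affected += 1
        # Assumption: a timeout is charged the full time limit.
        t_on = time_limit if t_on is None else t_on
        t_off = time_limit if t_off is None else t_off
        logs.append(math.log(t_off / t_on))
    return {
        "degradation": math.exp(sum(logs) / len(logs)),  # geometric mean slowdown
        "solved_lost": solved_on - solved_off,
        "affected_fraction": affected / len(on),
    }

# Hypothetical results for three models.
on = {"a": (10.0, 100), "b": (50.0, 2000), "c": (5.0, 10)}
off = {"a": (40.0, 900), "b": (None, 50000), "c": (5.0, 10)}
impact = feature_impact(on, off, time_limit=200.0)
```

Reporting all three values together separates a feature that is essential (models become unsolvable) from one that only speeds things up, and a general feature from a problem-specific one.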

Component Impact CPLEX 12.5 Summary

Benchmarking setup

• 1769 models
• 12 core Intel Xeon 2.66 GHz
• Unbiased: At least one of all the test runs took at least 10 sec

[Figure: component impact chart; % affected labels: 99%, 82%, 91%, 26%, 93%, 91%, 46%, 83%, 65%]

Component Impact CPLEX 12.5 Summary

Fundamental Features

• Lots of models unsolvable without

• Apply to most models

Component Impact CPLEX 12.5 Summary

Important Features

• Many models unsolvable without

• Apply to most models

Component Impact CPLEX 12.5 Summary

Parallelism is not important

• Turning off == 12x fewer cycles, i.e. just a tighter time limit

• Hardware cannot defeat combinatorial explosion

Component Impact CPLEX 12.5 Summary

Special Features

• Few models unsolvable without

• Apply to few models

Component Impact CPLEX 12.5 Summary

• Only 106 models

• Biased towards Cuts due to selection of “hard” models for CPLEX 5.0

• Impact of other components matches

Component Impact CPLEX 12.5 – Presolve

Component Impact CPLEX 12.5 – Cutting Planes

[Figure: cutting plane impact chart, with a comparison to Bixby et al. 2001; data labels: 2.52, 1.40, 1.19, 1.22, 1.04, 1.02, 1.83, 1.02]

Component Impact CPLEX 12.5 – Primal Heuristics

Parallelism in CPLEX

Types of Parallelism
– Opportunistic Parallelism
– Deterministic Parallelism
• Identical runs produce same solution path and results
• Use deterministic locks
• Based on deterministic time
• Implemented via counting memory accesses

Parallel Tasks
– Root node parallelism
• Concurrent LP solve
• Heuristics concurrent to cutting phase
• Parallel Cut loop
– Parallel Processing of B&C Tree

The Cost of Determinism

[Figure: number of timeouts (axis 0 to 1400) and total speedup (axis 0 to 250) by year, 1998 to 2013; series: 10 sec, 100 sec, 1000 sec]

Date: 28 September 2013
Testset: 3147 models (1792 in 10 sec, 1554 in 100 sec, 1384 in 1000 sec)
Machine: Intel X5650 @ 2.67 GHz, 24 GB RAM, 12 threads (deterministic since CPLEX 11.0)
Timelimit: 10,000 sec

Related presentations

Recent Developments in CPLEX Monday, 08:00 - 09:30 Tobias Achterberg

Non-convex Quadratic Programming in CPLEX Tuesday, 11:00 - 12:30 Christian Bliek and Pierre Bonami

Tutorial: Performance Variability in Mixed-integer Programming Tuesday, 8:00 – 9:30 Andrea Lodi and Andrea Tramontani

Software Tutorial: Expert Tips and Tricks for Using CPLEX Optimization Studio Tuesday Oct 08, 08:00 - 09:30

Lift-and-Project Cuts in CPLEX 12.5.1 Wednesday, 13:30 - 15:00 Andrea Tramontani

Concurrent root cut loops to exploit random performance variability Wednesday, 15:30 - 17:00 A. Tramontani, M. Fischetti, A. Lodi, D. Salvagnin, M. Monaci

Interesting Use Cases for the CPLEX Remote Object Wednesday, 15:30 - 17:00 Laszlo Ladanyi and Daniel Junglas

Component Impact CPLEX 12.5 – Branching
