
Date post: 22-Jan-2018
Upload: alkis-vazacopoulos

Alkis Vazacopoulos, Robert Ashford

Optimization Direct Inc.

CPLEX

Very Large Optimization Models and Parallel Processing

Summary

• Challenges of Large Scale Optimization

• Heuristic Software Approach

• Scheduling, supply chain and telecomms examples

• Benefits and problems of parallelism

Large Scale Optimization

• Many models are now solved routinely which would have been impossible (‘unsolvable’) a few years ago

• BUT: solving effort grows super-linearly as model size/complexity increases

• AND: customer models keep getting larger
  • Globalized business has larger and more complex supply chains
  • Optimization is expanding into new areas, especially scheduling
  • Detailed models are easier to sell to management and end-users

The Curse of Dimensionality: Size Matters

• Super-linear solve time growth often supposed

• The reality is worse

• Few data sets available to support this

• Look at randomly selected sub-models of two scheduling models
  • Simple basic model
  • More complex model with additional entity types
  • Two hour time limit on each solve
  • 8 threads on 4 core hyperthreaded Intel i7-4790K

• See how solve time varies with integers after presolve
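One way to quantify how solve time varies with the number of integers is to fit a power law, time ≈ c · size^p, by least squares on a log-log scale. The sketch below is a minimal illustration and not the authors' analysis: the function name `growth_exponent` and the data points are hypothetical, with the times generated artificially from a quadratic law.

```python
import math

def growth_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope of 1 means linear growth, 2 quadratic, and so on.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data, NOT taken from the slides: times are generated as
# 1e-6 * size**2, so the fitted exponent should come out very close to 2.
sizes = [1000, 2000, 5000, 10000, 20000]
times = [1e-6 * s ** 2 for s in sizes]
exponent = growth_exponent(sizes, times)
print(f"fitted growth exponent: {exponent:.3f}")
```

With real measurements, solves that hit the two hour limit would need to be excluded or treated as censored, since their true solve time is unknown.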

Simple model

[Figure: solution time in seconds (0 to 8000) against number of integer entities (0 to 50000).]

Complex model

[Figure: solution time (0 to 8000) against number of integer entities (0 to 30000).]

Why size matters

• Solver has to
  • (Presolve, probe and) solve LP relaxation
  • Find and apply cuts
  • Branch on remaining infeasibilities (and find and apply cuts too)
  • Look for feasible solutions with heuristics all the while

• Simplex relaxation solves are theoretically exponential in the worst case, but in practice effort grows between linearly and quadratically

• Barrier solver effort grows more slowly, but:
  • cross-over still grows quickly
  • usually get more integer infeasibilities
  • can’t use last solution/basis to accelerate

• Cutting grows faster than quadratic: each cut requires more effort, more cuts/round, more rounds of cuts, each round harder to apply.

• Branching is exponential: 2^n in number of (say) binaries n
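To give a feel for the 2^n figure, the short sketch below counts the leaves and total nodes of a complete binary branching tree on n binaries. It assumes the worst case in which no node is pruned by bounds, cuts or infeasibility, which real solvers rarely reach; the function names are illustrative only.

```python
def leaves(n_binaries):
    """Leaf nodes when every one of n binaries is branched on: 2^n."""
    return 2 ** n_binaries

def full_tree_nodes(n_binaries):
    """All nodes of the complete binary tree of depth n: 2^(n+1) - 1."""
    return 2 ** (n_binaries + 1) - 1

for n in (10, 20, 30, 40):
    print(f"n={n:2d}  leaves={leaves(n):>16,d}  nodes={full_tree_nodes(n):>16,d}")
```

Each extra binary doubles the worst-case tree, which is why tight formulations and good bounds matter more than raw node throughput.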

What can be done?

• Decomposition

• Solve smaller models

• Use ‘good’ formulations
  • As tight as possible
  • Avoid symmetry

• Settle for a good solution
  • Use heuristics

• Use more powerful hardware
  • Can usually make minor improvements by having a faster processor and memory
  • Big potential is from parallel processing
  • Use 4, 8, 12, 24, … or more threads on separate ‘processors’

Parallel Processing

• CPU clock speed is not improving
  • Power consumed (and heat generated) rises according to the square of the speed

• Memory speed is not improving (much)

• Can fit more processors onto a single chip

• Can get wider registers

• Vectorization
  • do several (up to 8) fp calculations at once on 512 bit register (SIMD)
  • of limited use in sparse optimization

• Cannot use full processor capability if using many cores
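The "up to 8" figure follows from dividing the register width by the width of one floating point value, assuming double precision (64 bit) arithmetic, which is the usual choice in LP/MIP codes. A minimal arithmetic sketch, with a helper name chosen here for illustration:

```python
def simd_lanes(register_bits, element_bits):
    """Number of values of a given width that fit in one vector register."""
    return register_bits // element_bits

print(simd_lanes(512, 64))  # 8 double precision values per 512 bit register
print(simd_lanes(512, 32))  # 16 single precision values per 512 bit register
```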

Heuristic Solution Software

• Proof of optimality may be impractical, but want good solutions to within (say) 20%

• Uses CPLEX callable library

• First-feasible-solution heuristics

• Improvement heuristic
  • Solves sequence of smaller sub-model(s)
  • approach used e.g. by RINS and local branching
  • Use model structure to create sub-models and combine solutions

• Assess solution quality by a very aggressive root solve of whole model

• Multiple instances can be run concurrently with different seeds

• Can run on only one core

• Can interrupt at any point and take best solution so far: time limit / call-back / SIGINT
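The improvement loop described above can be sketched on a toy problem. This is not the authors' software and does not call CPLEX: it assumes a small 0/1 knapsack, picks a random subset of variables to free while fixing the rest at their incumbent values, and solves that sub-model exactly by enumeration (standing in for a MIP solve). The function `improve`, its parameters and the instance data are all hypothetical.

```python
import itertools
import random
import time

def improve(values, weights, capacity, seed, time_limit=1.0,
            free_count=6, max_rounds=200):
    """Toy improvement heuristic for a 0/1 knapsack (maximize value)."""
    rng = random.Random(seed)
    n = len(values)
    best = [0] * n          # trivial first feasible solution: take nothing
    best_value = 0
    deadline = time.monotonic() + time_limit
    for _ in range(max_rounds):
        if time.monotonic() >= deadline:
            break           # interrupted: keep the best solution so far
        free = rng.sample(range(n), min(free_count, n))
        free_set = set(free)
        # Contribution of the variables that stay fixed at incumbent values.
        fixed_w = sum(weights[i] for i in range(n)
                      if best[i] and i not in free_set)
        fixed_v = sum(values[i] for i in range(n)
                      if best[i] and i not in free_set)
        # Solve the sub-model exactly by enumerating the free variables.
        sub_bits, sub_value = None, best_value
        for bits in itertools.product((0, 1), repeat=len(free)):
            w = fixed_w + sum(weights[i] for i, b in zip(free, bits) if b)
            v = fixed_v + sum(values[i] for i, b in zip(free, bits) if b)
            if w <= capacity and v > sub_value:
                sub_bits, sub_value = bits, v
        if sub_bits is not None:    # the sub-model improved the incumbent
            for i, b in zip(free, sub_bits):
                best[i] = b
            best_value = sub_value
    return best_value, best

# Hypothetical instance, generated only to exercise the sketch.
data_rng = random.Random(0)
values = [data_rng.randint(10, 100) for _ in range(30)]
weights = [data_rng.randint(5, 50) for _ in range(30)]
capacity = 200
value, solution = improve(values, weights, capacity, seed=1)
print(value, sum(w for w, x in zip(weights, solution) if x))
```

Because the incumbent is only ever replaced by a strictly better feasible solution, stopping at any round (here by the time limit) still returns a usable answer.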

Parallel Heuristic Approach

• Run several heuristic threads with different seeds simultaneously

• CPLEX callable library very flexible, so
  • Exchange solution information between runs
  • Kill sub-model solves when done better elsewhere

• Improves sub-model selection

• Opportunistic or deterministic

• 5 instances run on 24 core Intel 2 x Xeon E5-2690v3 running at up to 3.5 GHz (DDR4-2133 memory)
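A minimal sketch of several seeded workers exchanging solution information through a shared incumbent is given below. It is an illustration under stated assumptions, not the authors' implementation: the problem is a toy 0/1 knapsack, the `Incumbent` class and `worker` function are hypothetical, and Python threads are used only to show the coordination pattern (CPython threads will not give a real speed-up for this CPU-bound loop). The sketch is opportunistic: the order in which workers update the incumbent depends on thread timing, so runs are not reproducible.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

class Incumbent:
    """Best solution found so far, shared by all workers."""

    def __init__(self, n):
        self._lock = threading.Lock()
        self.value = 0
        self.solution = [0] * n

    def offer(self, value, solution):
        """Install a solution if it beats the current incumbent."""
        with self._lock:
            if value > self.value:
                self.value, self.solution = value, list(solution)
                return True
            return False

    def snapshot(self):
        with self._lock:
            return self.value, list(self.solution)

def worker(seed, values, weights, capacity, shared, rounds=2000):
    rng = random.Random(seed)       # each worker has its own seed
    n = len(values)
    for _ in range(rounds):
        _, sol = shared.snapshot()  # start from the best found anywhere
        # Flip two randomly chosen items (a repeated index cancels out).
        for i in (rng.randrange(n), rng.randrange(n)):
            sol[i] ^= 1
        if sum(w for w, x in zip(weights, sol) if x) <= capacity:
            shared.offer(sum(v for v, x in zip(values, sol) if x), sol)

# Hypothetical instance, generated only to exercise the sketch.
data_rng = random.Random(0)
values = [data_rng.randint(10, 100) for _ in range(30)]
weights = [data_rng.randint(5, 50) for _ in range(30)]
capacity = 200

shared = Incumbent(len(values))
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(worker, seed, values, weights, capacity, shared)
               for seed in range(5)]
    for future in futures:
        future.result()             # re-raise any worker exception
print(shared.value)
```

A deterministic variant would instead synchronize workers at fixed points and merge their results in a fixed order, trading some throughput for repeatability.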

Example: Large Scale Scheduling, Supply Chain and Telecomms Models

Model        entities      rows       cols   integers
Easy              314    299288      57804      57804
Medium†           314    389560      94200      94200
Difficult†        406    371964     149132     149132
Large          302965   2836736    4892396   18271400
Huge            27000   2577916   12944400   12944400

† No usable (say within 30% gap) solution after 3 days run time on fastest hardware (Intel i7 4790K ‘Devil’s Canyon’)

Heuristic Results on Scheduling Models

8 Cores on Intel 2 x Xeon E5-2690v3 3GHz

Model        Solution     Time (s)   Gap
Easy         96           231.06     0%
Medium       113          28800      ≤ 13%
Difficult    773.8        28800      ≤ 56%
Large        1.1493E+7    28800      ≤ 1%
Huge         370          28800      ≤ 61%

Lower bounds established by separate CPLEX runs. Times are in seconds.
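The slides do not state how the gap is computed. The sketch below assumes the relative gap definition documented for CPLEX, |best bound − best integer| / (1e-10 + |best integer|), applied to a minimization problem; the function names are hypothetical. Under that assumption, a reported gap can be turned back into the lower bound it implies.

```python
def relative_gap(solution, lower_bound):
    """Relative gap between a heuristic solution and a proven lower bound."""
    return abs(solution - lower_bound) / (1e-10 + abs(solution))

def implied_lower_bound(solution, gap):
    """Lower bound implied by a solution value and a relative gap
    (minimization, positive objective assumed)."""
    return solution * (1.0 - gap)

print(f"{relative_gap(100.0, 90.0):.2%}")        # illustrative numbers only
print(f"{implied_lower_bound(113, 0.13):.2f}")   # Medium row: 113 at <= 13%
```

On that reading, the Medium result of 113 with a gap of at most 13% would correspond to a lower bound of at least about 98.3.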

Huge Model Heuristic Behavior

[Figure: solution value (300 to 1500) against time in seconds (0 to 30000), one series each for 1, 4, 8, 12, 16 and 24 threads.]

Large Model Heuristic Behavior

[Figure: solution value (0.00E+00 to 5.00E+09) against time in seconds (0 to 30000), one series each for 1, 4, 8, 12, 16 and 24 threads.]

Large Model Heuristic Behavior

[Figure: solution value on a narrower scale (11000000 to 15000000) against time in seconds (0 to 30000), one series each for 1, 4, 8, 12, 16 and 24 threads.]

Difficult Model Heuristic Behavior

[Figure: solution value (300 to 1500) against time in seconds (0 to 30000), one series each for 1, 4, 8, 12, 16 and 24 threads.]

Medium Model Heuristic Behavior

[Figure: solution value (100 to 200) against time in seconds (0 to 30000), one series each for 1, 4, 8, 12, 16 and 24 threads.]

Easy Model Heuristic Behavior

[Figure: solution value (0 to 2500) against time in seconds (0 to 600), one series each for 1, 4, 8, 12, 16 and 24 threads.]

What’s Gone Wrong?

• Size (or rather density) matters again

• More threads can give worse results

• Used 24 core Intel Xeon E5-2690v3 server

• Hyperthreading not used

• Memory transfer speed is ~20GB/sec

• MIP solves become memory bus bound
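A rough way to see why more threads can hurt is to divide the quoted ~20GB/sec evenly among the threads. This is a simplifying assumption (real traffic is uneven, and a two socket server has more than one memory path), and the helper name is illustrative only.

```python
def bandwidth_per_thread(total_gb_per_sec, threads):
    """Memory bandwidth available to each thread if shared evenly."""
    return total_gb_per_sec / threads

for threads in (1, 4, 8, 12, 16, 24):
    share = bandwidth_per_thread(20.0, threads)
    print(f"{threads:2d} threads: {share:5.2f} GB/sec each")
```

At 24 threads each one is left with under 1 GB/sec on this estimate, so a solve that streams large sparse matrices through memory spends much of its time waiting.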

Work Rate

[Figure: work rate in ticks/thread/sec (0 to 1400) against number of threads (axis ticks 1 to 21), one series per model: Easy (13.0), Medium (10.5), Difficult (8.5), Large (2.7), Huge (4.2).]
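The work rate metric is per thread, so total throughput is the rate multiplied by the thread count, and the ratio to the single thread rate gives a parallel efficiency. The sketch below uses hypothetical readings, NOT values taken from the chart, and assumes the ticks are CPLEX deterministic time ticks.

```python
def total_work_rate(rate_per_thread, threads):
    """Aggregate ticks/sec across all threads."""
    return rate_per_thread * threads

def parallel_efficiency(rate_per_thread, single_thread_rate):
    """Fraction of the single thread work rate each thread still achieves."""
    return rate_per_thread / single_thread_rate

# Hypothetical readings (threads -> ticks/thread/sec), for illustration only.
readings = {1: 1200.0, 8: 900.0, 24: 300.0}
for threads, rate in readings.items():
    total = total_work_rate(rate, threads)
    eff = parallel_efficiency(rate, readings[1])
    print(f"{threads:2d} threads: total {total:7.0f} ticks/sec, "
          f"efficiency {eff:.0%}")
```

In this made-up example the total at 24 threads equals the total at 8 threads: once the memory bus saturates, extra threads only divide the same throughput more finely.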

Conclusions

• Solve (to optimality) times run into hard size barriers

• May have to be satisfied with solutions of unproven quality/optimality

• Heuristic methods demand serious consideration

• Parallel solution methods best way of exploiting modern hardware

• Even these are limited by memory bus speeds

Thanks for listening

Robert Ashford

