Date posted: 22-Jan-2018 · Category: Data & Analytics · Uploaded by: alkis-vazacopoulos
Alkis Vazacopoulos, Robert Ashford
Optimization Direct Inc.

CPLEX: Very Large Optimization Models and Parallel Processing
Summary
• Challenges of large-scale optimization
• Heuristic software approach
• Scheduling, supply chain and telecomms examples
• Benefits and problems of parallelism
Large Scale Optimization
• Many models are now solved routinely that would have been impossible ('unsolvable') a few years ago
• BUT: solving effort grows super-linearly as model size/complexity increases
• AND: customer models keep getting larger
• Globalized business has larger and more complex supply chains
• Optimization is expanding into new areas, especially scheduling
• Detailed models are easier to sell to management and end-users
The Curse of Dimensionality: Size Matters
• Super-linear solve-time growth is often supposed
• The reality is worse
• Few data sets are available to support this
• Look at randomly selected sub-models of two scheduling models
  • Simple basic model
  • More complex model with additional entity types
  • Two-hour time limit on each solve
  • 8 threads on a 4-core hyperthreaded Intel i7-4790K
• See how solve time varies with the number of integers after presolve
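An experiment like this can be summarized with a power-law fit. The sketch below uses hypothetical (integers-after-presolve, seconds) pairs purely for illustration; fitting log-time against log-size gives an exponent b, and b > 1 indicates super-linear growth:

```python
import numpy as np

# Hypothetical (integers-after-presolve, solve-seconds) measurements for
# illustration only; the real data came from randomly selected sub-models
# of two scheduling models.
n = np.array([5000.0, 10000.0, 20000.0, 30000.0, 40000.0])
t = np.array([40.0, 190.0, 900.0, 2400.0, 5100.0])

# Fit log t = log a + b log n by least squares.
b, log_a = np.polyfit(np.log(n), np.log(t), 1)
print(f"time ~ {np.exp(log_a):.2e} * n^{b:.2f}")  # b > 1 means super-linear
```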
[Figure: Simple model — solution time in seconds (0–8000) against number of integer entities after presolve (0–50,000)]
[Figure: Complex model — solution time in seconds (0–8000) against number of integer entities after presolve (0–30,000)]
Why Size Matters
• The solver has to:
  • (Presolve, probe and) solve the LP relaxation
  • Find and apply cuts
  • Branch on remaining infeasibilities (finding and applying cuts there too)
  • Look for feasible solutions with heuristics all the while
• Simplex relaxation solves are exponential in theory, but in practice effort grows between linearly and quadratically
• Barrier solver effort grows more slowly, but:
  • cross-over still grows quickly
  • it usually leaves more integer infeasibilities
  • it can't use the last solution/basis to accelerate
• Cutting grows faster than quadratically: each cut requires more effort, there are more cuts per round and more rounds of cuts, and each round is harder to apply
• Branching is exponential: 2^n in the number of (say) binaries n
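Branching's exponential worst case, and how bounding fights it, can be seen in a toy branch-and-bound (an illustrative Python sketch, not the authors' code): a complete tree over n binaries has 2^(n+1) − 1 nodes, but a fractional (LP-style) bound prunes the vast majority.

```python
import random

def knapsack_bb(values, weights, capacity):
    """Branch and bound over x_i in {0,1}; returns (best value, nodes explored)."""
    # Sort by value density so the greedy fractional bound is valid and tight.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    best, nodes = 0, 0

    def fractional_bound(k, value, room):
        # Greedy fractional relaxation over items k.. (an LP-style upper bound).
        for v, w in items[k:]:
            if w <= room:
                value, room = value + v, room - w
            else:
                return value + v * room / w
        return value

    def branch(k, value, room):
        nonlocal best, nodes
        nodes += 1
        best = max(best, value)  # every partial selection is feasible here
        if k == len(items) or fractional_bound(k, value, room) <= best:
            return  # prune: the bound proves no improvement below this node
        v, w = items[k]
        if w <= room:
            branch(k + 1, value + v, room - w)  # branch x_k = 1
        branch(k + 1, value, room)              # branch x_k = 0

    branch(0, 0, capacity)
    return best, nodes

rng = random.Random(1)
n = 25
values = [rng.randint(1, 100) for _ in range(n)]
weights = [rng.randint(1, 100) for _ in range(n)]
best, nodes = knapsack_bb(values, weights, sum(weights) // 2)
print(f"n={n}: best={best}, explored {nodes} of {2 ** (n + 1) - 1} possible nodes")
```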
What Can Be Done?
• Decomposition
• Solve smaller models
• Use 'good' formulations
  • As tight as possible
  • Avoid symmetry
• Settle for a good solution
  • Use heuristics
• Use more powerful hardware
  • A faster processor and memory usually give only minor improvements
  • The big potential is in parallel processing
  • Use 4, 8, 12, 24, … or more threads on separate 'processors'
Parallel Processing
• CPU clock speed is not improving
• Power consumed (and heat generated) grows with the square of the clock speed
• Memory speed is not improving (much)
• More processors can be fitted onto a single chip
• Registers are getting wider
• Vectorization
  • do several (up to 8) floating-point calculations at once on a 512-bit register (SIMD)
  • of limited use in sparse optimization
• Cannot use full processor capability when using many cores
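The sparse-optimization caveat is about memory access patterns. A hedged illustration (generic CSR code, not from the slides): in a sparse matrix-vector product, the indexed gather `x[indices[j]]` is irregular, so the wide SIMD registers sit mostly idle.

```python
def csr_matvec(indptr, indices, data, x):
    """y = A @ x with A stored in CSR form. The gather x[indices[j]] is the
    irregular access that limits SIMD use in sparse optimization codes."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        s = 0.0
        for j in range(indptr[row], indptr[row + 1]):
            s += data[j] * x[indices[j]]  # non-contiguous, data-dependent read
        y[row] = s
    return y

# A = [[2, 0, 1],
#      [0, 3, 0]]
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [2.0, 1.0, 3.0]
print(csr_matvec(indptr, indices, data, [1.0, 1.0, 1.0]))  # [3.0, 3.0]
```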
Heuristic Solution Software
• Proof of optimality may be impractical, but we still want good solutions to within (say) 20%
• Uses the CPLEX callable library
• First-feasible-solution heuristics
• Improvement heuristic
  • Solves a sequence of smaller sub-model(s)
  • the approach used e.g. by RINS and local branching
  • Uses model structure to create sub-models and combine solutions
• Solution quality is assessed by a very aggressive root solve of the whole model
• Multiple instances can be run concurrently with different seeds
• Can run on only one core
• Can be interrupted at any point (time limit / callback / SIGINT) to take the best solution so far
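The improvement-heuristic loop (fix most variables at their incumbent values, re-optimize a small sub-model, as RINS and local branching do) can be sketched on a toy knapsack. This is an illustrative stand-in in plain Python; the software described here works through the CPLEX callable library on real models:

```python
import random
from itertools import product

def lns_improve(values, weights, capacity, start, free_size=8, rounds=50, seed=0):
    """RINS-style improvement sketch: repeatedly fix most binaries at their
    incumbent values and exhaustively re-optimize a small random 'sub-model'."""
    rng = random.Random(seed)
    n = len(values)
    best = list(start)

    def total(coeffs, sol):
        return sum(c for c, xi in zip(coeffs, sol) if xi)

    for _ in range(rounds):
        free = rng.sample(range(n), free_size)
        # Capacity left after accounting for the fixed (non-free) selection.
        room = capacity - (total(weights, best)
                           - sum(weights[i] for i in free if best[i]))
        # Exhaustively solve the sub-model over just the free variables.
        best_bits, best_val = None, -1
        for bits in product((0, 1), repeat=free_size):
            w = sum(weights[i] for i, bval in zip(free, bits) if bval)
            if w <= room:
                v = sum(values[i] for i, bval in zip(free, bits) if bval)
                if v > best_val:
                    best_bits, best_val = bits, v
        for i, bval in zip(free, best_bits):
            best[i] = bval  # objective never decreases: old bits were candidates
    return best, total(values, best)

rng = random.Random(1)
n = 40
values = [rng.randint(1, 100) for _ in range(n)]
weights = [rng.randint(1, 100) for _ in range(n)]
capacity = sum(weights) // 3
sol, obj = lns_improve(values, weights, capacity, [0] * n)  # trivial first feasible
print("improved objective:", obj)
```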
Parallel Heuristic Approach
• Run several heuristic threads with different seeds simultaneously
• The CPLEX callable library is very flexible, so we can:
  • Exchange solution information between runs
  • Kill sub-model solves when another run has already done better
• Improves sub-model selection
• Opportunistic or deterministic
• 5 instances run on a 24-core 2 × Intel Xeon E5-2690v3 server running at up to 3.5 GHz (DDR4-2133 memory)
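A minimal sketch of the seed-parallel pattern (Python threads standing in for the CPLEX-level machinery; the solution exchange and cross-run kills described above are omitted): run the same randomized heuristic under several seeds and keep the best incumbent.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def randomized_heuristic(seed, rounds=2000):
    """Stand-in for one heuristic instance: a random-restart search whose
    trajectory depends on its seed (here, maximizing a bumpy toy objective)."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(rounds):
        x = rng.uniform(-10, 10)
        score = -(x - 3.0) ** 2 + 5.0 * (1 if int(x) % 2 == 0 else 0)
        best = max(best, score)
    return best

seeds = range(24)  # one instance per seed; different seeds explore differently
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(randomized_heuristic, seeds))
print("best over all seeds:", max(results))
```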
Example: Large Scale Scheduling, Supply Chain and Telecomms Models

Model       Entities   Rows        Columns      Integers
Easy        314        299,288     57,804       57,804
Medium†     314        389,560     94,200       94,200
Difficult†  406        371,964     149,132      149,132
Large       302,965    2,836,736   4,892,396    18,271,400
Huge        27,000     2,577,916   12,944,400   12,944,400

† No usable solution (say within a 30% gap) after 3 days' run time on the fastest hardware (Intel i7-4790K 'Devil's Canyon')
Heuristic Results on Scheduling Models
8 cores on a 2 × Intel Xeon E5-2690v3 @ 3 GHz

Model       Solution     Time (s)   Gap
Easy        96           231.06     0%
Medium      113          28,800     ≤ 13%
Difficult   773.8        28,800     ≤ 56%
Large       1.1493E+7    28,800     ≤ 1%
Huge        370          28,800     ≤ 61%

Lower bounds established by separate CPLEX runs; times are in seconds.
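For reference, the arithmetic behind the gap column (a generic minimization-gap convention; the bound value below is hypothetical, not from the table):

```python
def relative_gap(incumbent, lower_bound):
    """MIP gap for a minimization problem: an upper bound on how far the
    incumbent can be from optimal, given a valid lower bound."""
    return abs(incumbent - lower_bound) / max(abs(incumbent), 1e-10)

# Hypothetical numbers: an incumbent of 113 against a lower bound of 100
# would certify a gap of about 11.5%.
print(f"{relative_gap(113, 100):.1%}")
```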
Huge Model Heuristic Behavior
[Figure: solution value (300–1500) against time in seconds (0–30,000) for 1, 4, 8, 12, 16 and 24 threads]
Large Model Heuristic Behavior
[Figure: solution value (0 to 5.0E+09) against time in seconds (0–30,000) for 1, 4, 8, 12, 16 and 24 threads]
Large Model Heuristic Behavior (detail)
[Figure: solution value (1.1E+07 to 1.5E+07) against time in seconds (0–30,000) for 1, 4, 8, 12, 16 and 24 threads]
Difficult Model Heuristic Behavior
[Figure: solution value (300–1500) against time in seconds (0–30,000) for 1, 4, 8, 12, 16 and 24 threads]
Medium Model Heuristic Behavior
[Figure: solution value (100–200) against time in seconds (0–30,000) for 1, 4, 8, 12, 16 and 24 threads]
Easy Model Heuristic Behavior
[Figure: solution value (0–2500) against time in seconds (0–600) for 1, 4, 8, 12, 16 and 24 threads]
What’s Gone Wrong?
• Size (or rather density) matters again
• More threads can give worse results
• Used 24 core Intel Xeon E5-2690v3 server
• Not used hyperthreading
• Memory transfer speed is ~20GB/sec
• MIP solves become memory bus bound
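The memory-bus-bound claim is consistent with a simple saturation model (an illustrative sketch with made-up constants, not a fit to the measured data): each thread runs at its CPU-limited rate until combined traffic saturates a fixed bus capacity, after which per-thread work rate falls as 1/threads.

```python
def work_rate_per_thread(threads, cpu_rate=1300.0, bus_capacity=6000.0):
    """Ticks/thread/sec in a toy model: per-thread rate is the lesser of the
    CPU-limited rate and an equal share of total memory-bus capacity.
    Both constants are illustrative, not measurements from the slides."""
    return min(cpu_rate, bus_capacity / threads)

for t in (1, 6, 11, 16, 21):
    print(t, work_rate_per_thread(t))
```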
Work Rate
[Figure: work rate in ticks/thread/sec (0–1400) against number of threads (1–24) for the Easy (13.0), Medium (10.5), Difficult (8.5), Large (2.7) and Huge (4.2) models]
Conclusions
• There are hard size barriers on solve (to optimality) times
• We may have to be satisfied with solutions of unproven quality/optimality
• Heuristic methods demand serious consideration
• Parallel solution methods are the best way of exploiting modern hardware
• Even these are limited by memory bus speed