
Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

Page 1: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

1

Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

Milind Chabbi, John Mellor-Crummey, Keith Cooper

RICE UNIVERSITY, DEPARTMENT OF COMPUTER SCIENCE

This work is funded by the Defense Advanced Research Projects Agency (DARPA) through the Air Force Research Lab (AFRL).

Page 2: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

2

Compiler Optimization Phase-Ordering Problem

Order of application of compiler optimizations drastically changes measured performance
• Kulkarni et al. [CGO '06] show 38% average code size reduction
• Zhao et al. [CGO '09] show up to 32% speedup
• Production compilers still use a fixed order

Figure credit: Zhao et al. [CGO '09]

Exascale systems multiply the cost of poor node performance

Page 3: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

3

Phase-Order Selection Is Hard

Selecting the best phase order is non-trivial
• Program dependent
• Relations between optimizations are complex
  • One optimization can enable/disable another

Exhaustive empirical exploration is expensive and unrealistic
• 20 optimizations => 2.5 * 10^18 possible optimization sequences
• "Exhaustive optimization phase order space exploration." [Kulkarni et al. CGO '06]
  • Many optimization orders lead to structurally identical function instances

Approaches
• Analytically modeling code and the effects of optimization is non-trivial and still in its infancy
  • "A framework for exploring optimization properties." [Zhao et al. CC '09]
• Other techniques have been tried and proven to be effective
  • Genetic algorithms [Cooper et al. SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems 1999]

Page 4: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

4

Roadmap

Phase order selection using pairwise constraints between optimizations

Graph model

Regression model

Conditional Sampling model

We will show effectiveness on the sample numerical program FMIN throughout the discussion, with dynamic instruction count (DIC) as our optimization metric

Page 5: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

5

Interaction Is Significant Between Pairs

Interaction is significant between pairs
Capture the ordering of pairs without regard to their absolute positions

[Figure: sequences in which a precedes b are labeled Good; sequences in which b precedes a are labeled Bad]

Page 6: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

6

Pruning Using Pairwise Constraints

Generate all possible optimization sequences of length 2 and record their performance characteristics
• n(n-1) pairs to empirically evaluate
  • 20 optimizations => 380 pairs vs. 2.5 * 10^18 sequences
• For k-wise, it will be n!/(n-k)! groups to empirically evaluate

Compare the performance of each pair with its reverse to build pair-ordering constraints
• If ab < ba then the final sequence will look like: ... a ... b ...

Reduces search space from O(n!) to O(n^2)

Not a silver bullet strategy
• Can be used to augment other search space pruning techniques
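The counts quoted on this slide (380 ordered pairs for 20 optimizations versus roughly 2.5 * 10^18 full sequences) can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and do not come from the slides, and the k-wise count assumes ordered groups of distinct optimizations.

```python
import math

def ordered_groups(n: int, k: int) -> int:
    """Ordered length-k sequences of distinct optimizations: n! / (n - k)!."""
    return math.perm(n, k)  # math.perm requires Python 3.8+

def full_sequences(n: int) -> int:
    """Orderings that apply each of the n optimizations exactly once."""
    return math.factorial(n)

for n in (13, 20):
    # Pairwise evaluation grows quadratically, full enumeration factorially.
    print(n, ordered_groups(n, 2), full_sequences(n))
```

For n = 13, the FMIN configuration used later in the talk, the same functions give 156 ordered pairs against about 6 billion full sequences.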

Page 7: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

7

Background And Effectiveness Of Pairwise Pruning

Used by the testing community
• In software testing, multiple input variables taking multiple values cause combinatorial explosion
• Pairwise (a.k.a. all-pairs) testing is based on the observation that most faults are caused by interactions of at most two factors
• Pairwise-generated test suites cover all combinations of two factors and are therefore much smaller than exhaustive ones, yet still very effective in finding defects
  • K. Burr and W. Young [STAR '98]
  • D. R. Wallace and D. R. Kuhn [International Journal of Reliability, Quality and Safety Engineering, 2001]

Page 8: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

8

Roadmap

Phase order selection using pairwise constraints between optimizations

Graph model

Regression model

Conditional Sampling model

Page 9: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

9

Graph Model

Nodes represent optimizations, e.g. {a, b, c}
Directed edges represent optimization orders

Graph construction
• Empirically evaluate all pairs to add edges
  • ab < ba => edge (a,b)
  • ac < ca => edge (a,c)
  • cb < bc => edge (c,b)
• Add weights to edges based on profitability
  • E.g. (ab) vs. (ba) has profit of 20%

[Figure: directed graph on nodes a, b, c with edge weights 20, 15, 30]

Graph may be cyclic or acyclic

Page 10: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

10

Phase Order Selection For Acyclic Graphs

Topologically sort graph nodes to get a sequence
• Such a sequence (if it exists) maintains all pair-ordering constraints

[Figure: acyclic graph on nodes a, b, c with edge weights 20, 15, 30, labeled "Model found best sequence"]
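A minimal sketch of graph construction followed by topological sorting, using Python's standard `graphlib` module (Python 3.9+). The pair measurements in `pair_dic` are hypothetical numbers chosen to reproduce the edges (a,b), (a,c), (c,b) of the earlier example, and the absolute-difference weights are an illustrative choice rather than the authors' implementation.

```python
from graphlib import TopologicalSorter, CycleError

def build_edges(pair_dic):
    """Return {(x, y): weight} with an edge x -> y whenever sequence xy beats yx."""
    edges = {}
    for (x, y), dic_xy in pair_dic.items():
        dic_yx = pair_dic[(y, x)]
        if dic_xy < dic_yx:  # lower dynamic instruction count is better
            edges[(x, y)] = dic_yx - dic_xy
    return edges

def phase_order(nodes, edges):
    """Topologically sort the constraint graph; return None if it contains a cycle."""
    sorter = TopologicalSorter({n: set() for n in nodes})
    for x, y in edges:
        sorter.add(y, x)  # x is a predecessor of y: apply x before y
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# Hypothetical DIC measurements for every ordered pair of {a, b, c}.
pair_dic = {
    ("a", "b"): 1000, ("b", "a"): 1020,
    ("a", "c"): 1000, ("c", "a"): 1015,
    ("c", "b"): 1000, ("b", "c"): 1030,
}
edges = build_edges(pair_dic)
print(phase_order("abc", edges))
```

When the constraints are cyclic, `phase_order` returns `None`, which is the case the next slide addresses.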

Page 11: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

11

Phase Order Selection For Graphs With Cycles

Cyclic ordering constraints:
• ab < ba => edge (a,b)
• bc < cb => edge (b,c)
• ca < ac => edge (c,a)

Select an edge to break in each cycle
• Select edges to minimize the total weight of deleted edges (minimizes the cost of pair-ordering constraint violations)
• E.g. break edge (c,a)

Optimal sequence is: abc

[Figure: cyclic graph on nodes a, b, c with edge weights 20, 15, 30]
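One way to realize "minimize the total weight of deleted edges" on a tiny graph is to score every ordering by the weight of the edges it violates and keep the cheapest. The sketch below does exactly that by brute force. The slides do not say how the edge to break is chosen in practice, and the assignment of the weights 20, 30, 15 to edges (a,b), (b,c), (c,a) is an assumption made so that (c,a) is the cheapest edge to break.

```python
from itertools import permutations

def violation_cost(seq, edges):
    """Total weight of edges (x, y) whose preferred order x-before-y is broken by seq."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return sum(w for (x, y), w in edges.items() if pos[x] > pos[y])

def best_order(nodes, edges):
    """Brute-force search over all orderings; only feasible for a handful of nodes."""
    return min(permutations(nodes), key=lambda seq: violation_cost(seq, edges))

# Assumed weights for the cyclic example: a -> b (20), b -> c (30), c -> a (15).
edges = {("a", "b"): 20, ("b", "c"): 30, ("c", "a"): 15}
order = best_order("abc", edges)
print("".join(order), violation_cost(order, edges))
```

Because the search enumerates n! orderings, it only illustrates the objective. Minimum-weight cycle breaking is NP-hard in general, so a real implementation would need a heuristic.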

Page 12: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

12

Graph Model On FMIN

• 13 optimizations
• ~6 billion search space
• Measure benefit of 13 * 12 = 156 pairwise orderings
• Model found best had 1111 DIC
• 1103 DIC was best among 5000 random sequences
• We were within 0.73% of the best
• 3.9% of the sequences in the random sampling were better than the model found best

Page 13: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

13

Performance Estimation

Want to predict performance of any random sequence

Useful to ensure that a sequence optimized for one objective function does not dramatically worsen another objective, e.g. speed vs. code size

Provides an analytical model for performance prediction

Page 14: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

14

Graph Model For Performance Estimation

Graph model has built-in ability to estimate performance of a given sequence

To estimate the performance of a random sequence:
• Perform a walk on the graph using the given sequence
• Add the weights of violated ordering preferences along the walk to the performance number of the model found best sequence (already known)

Page 15: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

15

Example Graph Model For Performance Estimation

Let observed performance of model found best sequence (abcd) be 1200 instructions

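Estimated performance of sequence dacb is: 1200 + 30 + 40 + 50 + 20 = 1340

Edges are decorated with absolute difference, not relative %

[Figure: graph on nodes a, b, c, d with edge weights 120, 20, 30, 40, 60, 50; the walk d, a, c, b picks up the violated-edge weights 30, 40, 50, 20]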
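The estimate for dacb can be reproduced with a short sketch. The transcript does not preserve which weight sits on which edge, so the assignment below is an assumption chosen so that the four preferences violated by dacb carry the weights 30, 40, 50, and 20. The function name is illustrative.

```python
def estimate_dic(seq, best_dic, edges):
    """Best-sequence DIC plus the weight of every ordering preference that seq violates."""
    pos = {opt: i for i, opt in enumerate(seq)}
    penalty = sum(w for (x, y), w in edges.items() if pos[x] > pos[y])
    return best_dic + penalty

# Assumed edge weights (absolute DIC differences); every edge points x -> y,
# meaning "x before y" is the preferred order, so abcd violates nothing.
edges = {
    ("a", "b"): 120, ("a", "c"): 60, ("a", "d"): 30,
    ("b", "c"): 20, ("b", "d"): 50, ("c", "d"): 40,
}

print(estimate_dic("abcd", 1200, edges))  # model found best sequence
print(estimate_dic("dacb", 1200, edges))  # violates a->d, b->c, b->d, c->d
```

With these weights the second call gives 1340, matching the slide: d is moved ahead of a, b, and c, and c is moved ahead of b.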

Page 16: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

16

Performance Estimation With Graph Model On FMIN

6 optimizations i.e. 720 sequences

[Figure: graphical model-predicted DIC and observed DIC (y-axis: DIC, ticks from 1221 to 1401) for the 720 optimization sequences sorted by DIC; annotation: divergence + phase mismatch]

Page 17: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

17

Issues With Graph Model

Considered just pairs of optimizations of length 2

Neglected global behavior of optimizations

Assumed weights or behaviors of pairs to be context-insensitive (i.e. same even in full length sequence)

Want a model that is context-sensitive

Page 18: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

18

Roadmap

Phase order selection using pairwise constraints between optimizations

Graph model

Regression model

Conditional Sampling model

Page 19: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

19

Getting Context Sensitive With Regression Model

Take into account context of the pairs by sampling full-length sequences

Represent sequences by regression equations
• Represent all possible pairs as a parameter vector
• Presence / absence of pairs in a sequence as input variables
• Observed performance of a sequence as measured value

[Input variables] x [Parameter vector] = [Measured value]

Page 20: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

20

Example Linear Regression Model

Optimizations: {a, b, c}

Parameter vector: Xab, Xba, Xac, Xca, Xbc, Xcb

Sequence   Xab  Xba  Xac  Xca  Xbc  Xcb   Measured value
a b c       1    0    1    0    1    0    1045
b a c       0    1    1    0    1    0    1050
...

Equation for sequence a b c: Xab + Xac + Xbc = 1045
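The 0/1 rows of this example can be generated mechanically. The sketch below is illustrative rather than the authors' code: it builds the column order Xab, Xba, Xac, Xca, Xbc, Xcb and encodes a sequence by marking which ordered pairs it contains. The helper names are hypothetical.

```python
from itertools import combinations

def pair_columns(opts):
    """Column order used in the example: for each unordered pair, (x, y) then (y, x)."""
    cols = []
    for x, y in combinations(opts, 2):
        cols.extend([(x, y), (y, x)])
    return cols

def encode(seq, cols):
    """1 if x appears before y in seq (at any distance), else 0, for every column (x, y)."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return [1 if pos[x] < pos[y] else 0 for x, y in cols]

cols = pair_columns("abc")
print(["X" + x + y for x, y in cols])
print(encode("abc", cols))
print(encode("bac", cols))
```

Stacking one such row per sampled sequence, with the measured DIC on the right-hand side, gives the linear system that the regression solves.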

Page 21: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

21

Analytical Model For A Sequence

Sample unique sequences

Solve the linear regression to obtain the value of each Xij

Given a sequence: c b a

Analytically projected performance is: Xcb + Xca + Xba
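Once the Xij values are known, projecting the performance of a sequence is a sum over the ordered pairs it contains. The parameter values below are hypothetical stand-ins for a fitted model. They were picked only so that abc and bac evaluate to 1045 and 1050, the measured values in the earlier example, and are not results from the talk.

```python
from itertools import combinations

def present_pairs(seq):
    """Ordered pairs (x, y) with x before y; combinations() preserves sequence order."""
    return list(combinations(seq, 2))

def predict(seq, params):
    """Projected performance: sum of the fitted parameter of every pair present in seq."""
    return sum(params[pair] for pair in present_pairs(seq))

# Hypothetical fitted parameters, not values from the talk.
params = {
    ("a", "b"): 345, ("b", "a"): 350,
    ("a", "c"): 350, ("c", "a"): 355,
    ("b", "c"): 350, ("c", "b"): 352,
}

print(present_pairs("cba"))    # the pairs behind Xcb + Xca + Xba
print(predict("cba", params))
```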

Page 22: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

22

Regression Model On FMIN

Sequence of length 6 => 6! = 720 total sequences

[Figure: observed DIC and model-predicted DIC (y-axis: DIC, ticks from 1221 to 1291) for optimization sequences sorted by observed DIC; annotation: no phase mismatch, less divergence]

Page 23: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

23

Analysis of Regression-equation: Optimization Grouping Effect

Sequence of length 6 => 6! = 720 total sequences

[Figure: observed DIC and model-predicted DIC (y-axis: DIC, ticks from 1221 to 1291) for optimization sequences sorted by observed DIC; regions of the curve are annotated with the pair groups gn, ln, mn and lg, lm (the latter shown twice)]

Page 24: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

24

Refined Regression Model

100% sampling to solve regression equation

[Figure: observed DIC and model-predicted DIC after augmentation (y-axis: DIC, ticks from 1221 to 1291) for optimization sequences sorted by DIC]

Superior projections, perfect correlation

Page 25: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

25

Regression Model With Reduced Sampling Rate

[Figure: observed DIC and model-predicted DIC with 12% sampling (y-axis: DIC, ticks from 1221 to 1291) for optimization sequences sorted by DIC]

Page 26: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

26

Roadmap

Phase order selection using pairwise constraints between optimizations

Graph model

Regression model

Conditional Sampling model

Page 27: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

27

Properties of Pairs Across Phase Shifts

[Figure: observed DIC (y-axis ticks from 1221 to 1281) for optimization sequences sorted by observed DIC. On one side of the phase shift: (m,n) = 0%, (l,n) = 0%, (g,n) = 0%. On the other side: (m,n) = 66.6%, (l,n) = 66.6%, (g,n) = 66.6%]

Page 28: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

28

Properties of Pairs Across Phase Shifts

[Figure: observed DIC (y-axis ticks from 1221 to 1281) for optimization sequences sorted by observed DIC. Annotations: the mn, ln, gn shift; (l,g) = 0% and (l,g) = 75% (each shown twice); (l,m) = 0% and (l,m) = 75% (each shown twice)]

Page 29: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

29

Properties of Pairs Across Phase Shifts

[Figure: observed DIC (y-axis ticks from 1221 to 1281) for optimization sequences sorted by observed DIC. Annotations: the mn, ln, gn shift; the lm, lg shift; (c,d) = 0% and (c,d) = 100%, with the same 0% / 100% pattern marked at three further shifts]

Page 30: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

30

Conditional Sampling Model

Sample k << n! full-length sequences that satisfy a set of pairwise ordering constraints C
• Initially C = {}
• We sampled 100 sequences in our implementation

Identify the largest phase shift

Obtain the pattern on either side of the largest phase shift
• e.g. pairs present with 100% or 0% on one side

Add pairwise constraints favoring better performance to C

Repeat sampling and refining C until we reach a performance plateau
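The loop above can be sketched as follows. Everything here is an assumption made for illustration: rejection sampling as the way to draw constrained sequences, the largest jump between neighbouring sorted DIC values as the "largest phase shift", the strict 100% / 0% pattern test, and the toy cost function `toy_dic`, which stands in for actually compiling and running a program. The sketch also assumes k >= 2.

```python
import random
from itertools import permutations

def satisfies(seq, constraints):
    """True if every (x, y) in constraints has x before y in seq."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return all(pos[x] < pos[y] for x, y in constraints)

def sample(opts, constraints, k, rng):
    """Rejection-sample k random full-length sequences that satisfy the constraints."""
    samples = []
    while len(samples) < k:
        seq = list(opts)
        rng.shuffle(seq)
        if satisfies(seq, constraints):
            samples.append(tuple(seq))
    return samples

def refine(opts, constraints, measure, k, rng):
    """One round: sample, locate the largest DIC jump, add pairs that separate its sides."""
    ranked = sorted(sample(opts, constraints, k, rng), key=measure)
    dics = [measure(s) for s in ranked]
    gaps = [dics[i + 1] - dics[i] for i in range(len(dics) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__) + 1
    if gaps[cut - 1] == 0:
        return set(constraints)  # plateau: all samples perform identically
    better, worse = ranked[:cut], ranked[cut:]
    refined = set(constraints)
    for pair in permutations(opts, 2):
        always_in_better = all(satisfies(s, [pair]) for s in better)
        ever_in_worse = any(satisfies(s, [pair]) for s in worse)
        if always_in_better and not ever_in_worse:  # 100% on one side, 0% on the other
            refined.add(pair)
    return refined

def toy_dic(seq):
    """Toy stand-in for a measurement: d before o costs 100, d before v costs 5."""
    return 1103 + 100 * (seq.index("d") < seq.index("o")) + 5 * (seq.index("d") < seq.index("v"))

rng = random.Random(0)
c1 = refine("adov", set(), toy_dic, 100, rng)  # first round
c2 = refine("adov", c1, toy_dic, 100, rng)     # second round, sampling under c1
print(sorted(c1), sorted(c2))
```

On this toy cost the first round should pick up the constraint (o,d) and the second (v,d), mirroring the order in which the FMIN slides that follow discover od and vd. Rejection sampling becomes slow as C grows, so a practical version would generate constrained sequences directly.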

Page 31: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

31

Conditional Sampling On FMIN

13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}

[Figure: DIC (y-axis ticks from 1103 to 1303) for sampled optimization sequences sorted by DIC; (o,d) = 100% on one side of the largest shift and (o,d) = 17% on the other]

Conditions: od

Page 32: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

32

Conditional Sampling On FMIN

13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}

[Figure: DIC (y-axis ticks from 1103 to 1303) for sampled optimization sequences sorted by DIC; (v,d) = 100% on one side of the marked shift and (v,d) = 60% on the other]

Conditions: od; vd

Page 33: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

33

Conditional Sampling On FMIN

[Figure: DIC (y-axis ticks from 1103 to 1138) for sampled optimization sequences sorted by DIC, with the largest shift marked]

On one side of the shift: an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov = 100%

On the other side of the shift: an = 39%, oa = 80%, bn = 46%, cn = 39%, dn = 43%, gn = 37%, ln = 37%, ol = 79%, mn = 40%, on = 71%, qn = 37%, tn = 37%, vn = 13%, zn = 61%, oq = 79%, ov = 100%

Conditions: od; vd; an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov

Page 34: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

34

Conditional Sampling On FMIN

13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}

[Figure: DIC (y-axis ticks from 1103 to 1105.5) for sampled optimization sequences sorted by DIC; (c,d) = 100% and (c,v) = 100% on one side of the marked shift, (c,d) = 0% and (c,v) = 0% on the other]

Conditions: od; vd; an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov; cd, cv

Page 35: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

35

Conditional Sampling On FMIN

[Figure: DIC (y-axis ticks from 1103 to 1105.5) for sampled optimization sequences sorted by DIC]

Required 500 samples, i.e. 8 * 10^-6 % sampling

Conditions: od; vd; an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov; cd, cv

Page 36: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

36

Summary

Order of application of compiler optimizations has a dramatic effect on performance

"Pairwise pruning" reduces the empirical search space by several orders of magnitude, yet remains effective

Three models of pairwise pruning
• Context-insensitive graph model
• Context-sensitive regression model
• Context-sensitive Conditional Sampling model

Initial results are encouraging

Technique can be used to augment other search space pruning techniques

Page 37: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

37

Backup slides

In our implementation we represent presence of a pair by 1 and absence by -1
• Reduces unknowns to n(n-1)/2

We add a residue term Xresidue to account for the residual minimum advantage of applying each optimization i

Each Xij accounts only for the advantage/disadvantage of the ordered pair (i,j)

Standalone strength of optimizations i and j are accounted in Xresidue

Page 38: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

38

Challenges And Opportunities

Not a silver bullet strategy

Sometimes patterns may not be as distinct as 0% or 100%; we may have to choose a pattern based on a higher percentage on one side
• E.g. 90% on left vs. 30% on right

In our experiments we always took 100 samples; this can be tuned with various techniques
• Vuduc et al. [International Journal of High Performance Computing Applications, 2004] suggest a statistical early stopping criterion that indicates when sampling can be stopped

Page 39: Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

39

Graph Model On FMIN

Six optimizations : {c,d,g,l,m,n}

Model found optimal sequence : cndgml

Model found sequence had dynamic instruction count of 1221 which was best among entire 720 possible sequences

