Post on 29-Mar-2015
Efficient Program Compilation through Machine Learning Techniques
Gennady Pekhimenko, IBM Canada
Angela Demke Brown, University of Toronto
Motivation
My cool program
Compiler -O2: DCE, Peephole, Unroll, Inline
Executable: compiled in a few seconds
But what to do if the executable is slow?
Replace -O2 with -O5
Compiler -O5: Unroll, Inline, Peephole, DCE, ..., Optimization 100
New Fast Executable: compiled in 1-10 minutes
Motivation (2)
Our cool Operating System
Compiler -O2: 1 hour
Executable: Too slow!
Compiler -O5: 20 hours
New Executable
We do not have that much time
Why did it happen?
Basic Idea
Unroll, ..., Optimization 100
Do we need all these optimizations for every function?
Probably not.
Compiler writers can typically solve this problem, but how?
1. Description of every function
2. Classification based on the description
3. Only certain optimizations for every class
Machine Learning is good for solving this kind of problem
Overview
• Motivation
• System Overview
• Experiments and Results
• Related Work
• Conclusions
• Future Work
Initial Experiment
3X difference on average
Initial Experiment (2)
[Figure: SPEC2000 execution time at -O3 and -qhot -O3. X-axis: Benchmarks (bzip2, crafty, eon, gap, gzip, mcf, vortex, vpr, ammp, applu, art, equake, facerec, fma3d, galgel, lucas, mesa, mgrid, sixtrack, swim, wupwise). Y-axis: Time, secs (0 to 500). Series: "-O3", "-qhot -O3".]
Classification parameters
Our System
Offline
Prepare
• extract features
• modify heuristic values
• choose transformations
• find hot methods
Gather Training Data
• Compile
• Measure run time
Learn
• Logistic Regression Classifier
• Best feature settings
Online
Deploy
• TPO/XL Compiler
• set heuristic values
Data Preparation
Three key elements:
1. Feature extraction
• Total # of insts
• Loop nest level
• # and % of Loads, Stores, Branches
• Loop characteristics
• Float and Integer # and %
2. Heuristic values modification
• Existing XL compiler is missing functionality
• Extension was made to the existing Heuristic Context approach
3. Target set of transformations
• Unroll
• Wandwaving
• If-conversion
• Unswitching
• CSE
• Index Splitting
• …
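The feature list above can be turned into one numeric vector per function. A minimal sketch, assuming a hypothetical toy representation in which each instruction is an (opcode, loop_depth) pair; the opcode names and `extract_features` are illustrative only and are not part of the XL compiler.

```python
from collections import Counter

# Hypothetical opcode classes for a toy IR (not the XL compiler's IR).
OPCODE_GROUPS = {
    "loads": {"load"},
    "stores": {"store"},
    "branches": {"br", "jmp"},
    "float_ops": {"fadd", "fmul", "fdiv"},
    "int_ops": {"add", "sub", "mul"},
}


def extract_features(instructions):
    """Build a feature dict from a list of (opcode, loop_depth) pairs."""
    total = len(instructions)
    counts = Counter(opcode for opcode, _ in instructions)

    features = {
        "total_insts": total,
        # Deepest loop nest level seen in the function (0 if no instructions).
        "max_loop_nest": max((depth for _, depth in instructions), default=0),
    }
    for name, opcodes in OPCODE_GROUPS.items():
        n = sum(counts[op] for op in opcodes)
        features[name] = n  # absolute count
        features[name + "_pct"] = 100.0 * n / total if total else 0.0
    return features


# Usage on a made-up four-instruction function body.
body = [("load", 1), ("fadd", 2), ("store", 2), ("br", 1)]
print(extract_features(body))
```

Counts and percentages are both kept, following the "# and %" wording of the slide.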
Gather Training Data
• Try to “cut” transformations backwards (from last to first)
• If run time is not worse than before, the transformation can be skipped
• Otherwise we keep it
• We do this for every hot function of every test
The main benefit is linear complexity.
Late Inlining, Unroll, Wandwaving
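The backward "cut" procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `measure_run_time` is a hypothetical callback that compiles the hot function with only the given transformations enabled and returns the measured run time, and the cost model in the usage example is made up.

```python
def prune_transformations(transformations, measure_run_time):
    """Try to drop each transformation once, from last to first.

    measure_run_time(enabled) is a hypothetical callback returning the run
    time with only `enabled` transformations applied. It is called once up
    front and once per transformation, so the cost is linear in their number.
    """
    enabled = list(transformations)
    best_time = measure_run_time(enabled)
    for t in reversed(transformations):
        candidate = [x for x in enabled if x != t]
        time_without = measure_run_time(candidate)
        if time_without <= best_time:
            # Run time is not worse than before: skip this transformation.
            enabled = candidate
            best_time = time_without
        # Otherwise we keep it and move on to the previous transformation.
    return enabled


def fake_run_time(enabled):
    # Made-up cost model for illustration: only "Unroll" helps.
    return 10.0 - (3.0 if "Unroll" in enabled else 0.0)


kept = prune_transformations(["Late Inlining", "Unroll", "Wandwaving"], fake_run_time)
print(kept)  # ['Unroll']
```

Each transformation is tested once rather than in every combination, which is where the linear complexity comes from; the trade-off is that interactions between transformations are only seen in this one order.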
Learn with Logistic Regression
Input: Function Descriptions, Best Heuristic Values
Classifier:
• Logistic Regression
• Neural Networks
• Genetic Programming
Output: .hpredict files
Compiler + Heuristic Values
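As a rough illustration of the learning step, the sketch below fits a logistic regression model with plain batch gradient descent. It assumes one binary label per function (1 if a given transformation was kept during training-data gathering, 0 if it was skipped); the function names, learning rate, epoch count, and toy data are all assumptions, and the slides do not describe the actual training code.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(features, labels, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss.

    features: list of equal-length feature vectors, one per function.
    labels:   1 if the transformation was kept, 0 if it was skipped.
    """
    n_features = len(features[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(features, labels):
            z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
            error = sigmoid(z) - y
            grad[0] += error
            for i, xi in enumerate(x):
                grad[i + 1] += error * xi
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Estimated probability that the transformation is worth applying."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return sigmoid(z)


# Toy data: a single normalized feature; larger values favor the transformation.
X = [[0.0], [0.2], [0.8], [1.0]]
y = [0, 0, 1, 1]
w = train_logistic_regression(X, y)
print(predict(w, [0.2]) < 0.5, predict(w, [0.8]) > 0.5)
```

In practice features should be scaled to comparable ranges before training, since raw instruction counts and percentages differ by orders of magnitude.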
Deployment
Online phase, for every function:
• Calculate the feature vector
• Compute the prediction
• Use this prediction as heuristic context
Overhead is negligible
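The online phase can be sketched as one dot product per transformation per function, which is consistent with the negligible overhead claimed above. The model layout (a bias and weight list per transformation), the 0.5 threshold, and the name `predict_heuristic_context` are assumptions for illustration; the real compiler reads its predictions from the trained output files.

```python
import math


def predict_heuristic_context(feature_vector, models, threshold=0.5):
    """Return {transformation name: enabled?} for one function.

    models maps a transformation name to (bias, weights), a hypothetical
    in-memory form of a trained logistic regression classifier.
    """
    context = {}
    for name, (bias, weights) in models.items():
        z = bias + sum(w * x for w, x in zip(weights, feature_vector))
        z = max(-60.0, min(60.0, z))  # clamp so exp() stays in range
        probability = 1.0 / (1.0 + math.exp(-z))
        context[name] = probability >= threshold
    return context


# Made-up weights for two transformations and a two-feature function.
models = {
    "Unroll": (-1.0, [2.0, 0.0]),
    "If-conversion": (0.5, [0.0, -1.0]),
}
print(predict_heuristic_context([1.0, 2.0], models))
```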
Experiments
Benchmarks:
• SPEC2000
• Others from IBM customers
Platform:
• IBM server, 4 x Power5 1.9 GHz, 32 GB RAM
• Running AIX 5.3
Results: compilation time
[Figure: compilation time per benchmark. X-axis: Benchmarks (bzip2, crafty, eon, gap, gzip, mcf, vortex, vpr, ammp, applu, art, equake, facerec, fma3d, galgel, lucas, mesa, mgrid, sixtrack, swim, wupwise, GeoMean). Y-axis: Normalized Time (0 to 1). Series: Oracle, Classifier.]
2x average speedup
Results: execution time
[Figure: execution time per benchmark. X-axis: Benchmarks (bzip2, crafty, eon, gap, gzip, mcf, vortex, vpr, ammp, applu, art, equake, facerec, fma3d, galgel, lucas, mesa, mgrid, sixtrack, swim, wupwise). Y-axis: Time, secs (0 to 350). Series: Baseline, Oracle, Classifier.]
New benchmarks: compilation time
[Figure: compilation time on new benchmarks. X-axis: Benchmarks. Y-axis: Normalized Time (0 to 1). Series: Classifier.]
New benchmarks: execution time
[Figure: execution time on new benchmarks. X-axis: Benchmarks (apsi, parser, twolf, dmo, argonne). Y-axis: Time, secs (0 to 350). Series: Baseline, Classifier.]
4% speedup
Related Work
Iterative Compilation
• Pan and Eigenmann
• Agakov, et al.
Single Heuristic Tuning
• Calder, et al.
• Stephenson, et al.
Multiple Heuristic Tuning
• Cavazos, et al.
• MILEPOST GCC
Conclusions and Future Work
• 2x average compile time decrease
Future work
• Execution time improvement
• -O5 level
• Performance Counters for better method description
Other benefits
• Heuristic Context Infrastructure
• Bug Finding
Thank you
Raul Silvera, Arie Tal, Greg Steffan, Mathew Zaleski
Questions?