transcript
- Slide 1
- Electrical Engineering & Computer Science - University of
Kansas Colloquium. Exhaustive Phase Order Search Space
Exploration and Evaluation, by Prasad Kulkarni (Florida State
University)
- Slide 2
- Compiler Optimizations: Goal is to improve the efficiency of
compiler-generated code. Optimization phases require enabling
conditions: they need specific patterns in the code, and many also
need available registers. Phases interact with each other. Applying
optimizations in different orders generates different code.
- Slide 3
- Phase Ordering Problem: To find an ordering of optimization
phases that produces optimal code with respect to possible phase
orderings. Evaluating each sequence involves compiling, assembling,
linking, executing, and verifying results. The best optimization
phase ordering depends on the source application, the target
platform, and the implementation of the optimization phases.
Long-standing problem in compiler optimization!
- Slide 4
- Phase Ordering Space: Current compilers incorporate numerous
different optimization phases. There are 15 distinct phases in our
compiler backend: $15! = 1{,}307{,}674{,}368{,}000$. Phases can
enable each other, so any phase can be active multiple times:
$15^{15} = 437{,}893{,}890{,}380{,}859{,}375$. We cannot restrict
the sequence length to 15: $15^{44} = 5.598 \times 10^{51}$.
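
The three counts on this slide can be checked with a few lines of Python. This is only an arithmetic sketch: the variable names are illustrative, and the exponent 44 is taken from the slide as given, without assuming what it measures.

```python
import math

NUM_PHASES = 15   # distinct re-orderable phases (from the slide)
EXPONENT = 44     # longer sequence length used on the slide (taken as given)

# Each phase applied exactly once, in every possible order.
each_phase_once = math.factorial(NUM_PHASES)

# Sequences of length 15 in which any phase may repeat.
length_15_with_repeats = NUM_PHASES ** NUM_PHASES

# Sequences of length 44 in which any phase may repeat.
length_44_with_repeats = NUM_PHASES ** EXPONENT

print(f"15!   = {each_phase_once:,}")
print(f"15^15 = {length_15_with_repeats:,}")
print(f"15^44 = {length_44_with_repeats:.3e}")
```

Python integers are arbitrary precision, so the first two values are exact; only the last line is rounded for display.
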
- Slide 5
- Addressing Phase Ordering: Exhaustive search is universally
considered intractable. We are now able to exhaustively evaluate
the optimization phase order space.
- Slide 6
- Re-stating of Phase Ordering: Earlier approach: explicitly
enumerate all possible optimization phase orderings. Our approach:
explicitly enumerate all function instances that can be produced
by any combination of phases.
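
A minimal sketch of the difference, assuming a toy model that is not from the talk: a "function instance" is a short string, and each "phase" is a string rewrite that either changes the string or leaves it untouched. The names `enumerate_instances` and `TOY_PHASES` are hypothetical. The search tracks distinct instances rather than phase sequences, so an instance reached by several orderings is expanded only once.

```python
from collections import deque

# Hypothetical toy phases: each rewrite shortens the string, so the
# set of reachable instances is finite.
TOY_PHASES = {
    "a": lambda s: s.replace("xx", "x"),
    "b": lambda s: s.replace("xy", "z"),
    "c": lambda s: s.replace("zz", "z"),
}

def enumerate_instances(start, phases):
    """Return every distinct instance reachable from `start`
    by applying the phases in any order."""
    seen = {start}
    queue = deque([start])
    while queue:
        instance = queue.popleft()
        for phase in phases.values():
            result = phase(instance)
            # Skip phases that changed nothing and instances already found.
            if result != instance and result not in seen:
                seen.add(result)
                queue.append(result)
    return seen

instances = enumerate_instances("xxyxy", TOY_PHASES)
print(sorted(instances))
```

The cost of this loop grows with the number of distinct instances, not with the number of phase orderings, which is the point of the restated problem.
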
- Slide 7
- Outline: Experimental framework; Exhaustive phase order space
evaluation; Faster conventional compilation; Conclusions; Summary
of my other work; Future research directions
- Slide 8
- Outline: Experimental framework; Exhaustive phase order space
evaluation; Faster conventional compilation; Conclusions; Summary
of my other work; Future research directions
- Slide 9
- Experimental Framework: We used the VPO compilation system, an
established compiler framework whose development started in 1988,
with performance comparable to gcc -O2. VPO performs all
transformations on a single representation (RTLs), so it is
possible to perform most phases in an arbitrary order. Experiments
use all 15 re-orderable optimization phases in VPO. The target
architecture was the StrongARM SA-100 processor.
- Slide 10
- VPO Optimization Phases (ID: optimization phase):
b: branch chaining;
c: common subexpression elimination;
d: remove unreachable code;
g: loop unrolling;
h: dead assignment elimination;
i: block reordering;
j: minimize loop jumps;
k: register allocation;
l: loop transformations;
n: code abstraction;
o: evaluation order determination;
q: strength reduction;
r: reverse branches;
s: instruction selection;
u: remove useless jumps
- Slide 11
- Disclaimers: Did not include optimization phases normally
associated with compiler front ends: no memory hierarchy
optimizations, and no inlining or other interprocedural
optimizations. Did not vary how phases are applied. Did not include
optimizations that require profile data.
- Slide 12
- Benchmarks: 12 MiBench benchmarks; 244 functions
(category: program, description):
auto: bitcount, test processor bit manipulation abilities;
auto: qsort, sort strings using quicksort sorting algorithm;
network: dijkstra, Dijkstra's shortest path algorithm;
network: patricia, construct patricia trie for IP traffic;
telecomm: fft, fast Fourier transform;
telecomm: adpcm, compress 16-bit linear PCM samples to 4-bit;
consumer: jpeg, image compression and decompression;
consumer: tiff2bw, convert color .tiff image to b&w image;
security: sha, secure hash algorithm;
security: blowfish, symmetric block cipher with variable length key;
office: stringsearch, searches for given words in phrases;
office: ispell, fast spelling checker
- Slide 13
- Outline: Experimental framework; Exhaustive phase order space
evaluation; Faster conventional compilation; Conclusions; Summary
of my other work; Future research directions
- Slide 14
- Terminology: Active phase: an optimization phase that modifies
the function representation. Dormant phase: a phase that is unable
to find any opportunity to change the function. Function instance:
any semantically, syntactically, and functionally correct
representation of the source function (that can be produced by our
compiler).
- Slide 15
- Naive Optimization Phase Order Space: All combinations of
optimization phase sequences are attempted. [Figure: search tree
over phases a, b, c, d, with levels L0, L1, and L2.]
- Slide 16
- Eliminating Consecutively Applied Phases: A phase just applied in
our compiler cannot be immediately active again. [Figure: search
tree over phases a, b, c, d, with levels L0, L1, and L2, after
removing consecutively repeated phases.]
- Slide 17
- Eliminating Dormant Phases: Get feedback from the compiler
indicating whether any transformations were successfully applied in
a phase. [Figure: search tree over phases a, b, c, d, with levels
L0, L1, and L2, after removing dormant phases.]
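
The effect of the two pruning rules on the size of the search tree can be sketched with a toy model. Everything here is an assumption for illustration: the string-rewriting phases, the names `make_phase` and `count_attempts`, and the start string are hypothetical and unrelated to VPO. Each toy phase runs to a fixed point, so it satisfies the property that a phase just applied cannot be active again immediately.

```python
def make_phase(old, new):
    """Toy phase: rewrite `old` to `new` until no occurrence is left."""
    def phase(s):
        while old in s:
            s = s.replace(old, new)
        return s
    return phase

# Hypothetical phases; every rewrite shortens the string, so each terminates.
PHASES = {
    "a": make_phase("xx", "x"),
    "b": make_phase("xy", "z"),
    "c": make_phase("zz", "z"),
    "d": make_phase("q", ""),
}

def count_attempts(start, phases, depth, prune_repeat=False, prune_dormant=False):
    """Count the phase applications attempted at each level of the tree."""
    counts = []
    frontier = [(start, None)]  # (instance, name of the phase just applied)
    for _ in range(depth):
        attempted = 0
        next_frontier = []
        for instance, last in frontier:
            for name, phase in phases.items():
                if prune_repeat and name == last:
                    continue  # the phase just applied would be dormant
                attempted += 1
                result = phase(instance)
                if prune_dormant and result == instance:
                    continue  # dormant phase: do not expand this branch
                next_frontier.append((result, name))
        counts.append(attempted)
        frontier = next_frontier
    return counts

naive = count_attempts("xxyxy", PHASES, depth=2)
no_repeats = count_attempts("xxyxy", PHASES, depth=2, prune_repeat=True)
both = count_attempts("xxyxy", PHASES, depth=2, prune_repeat=True, prune_dormant=True)
print(naive, no_repeats, both)
```

With four phases and two levels, the naive tree attempts 4 and then 16 applications; skipping the phase just applied cuts the second level to 12, and dropping dormant branches cuts it further for this particular start string.
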
- Slide 18
- Identical Function Instances: Some optimization phases are
independent (example: branch chaining and register allocation).
Different phase sequences can produce the same code.
First sequence: r[2] = 1; r[3] = r[4] + r[2];
after instruction selection: r[3] = r[4] + 1;
Second sequence: r[2] = 1; r[3] = r[4] + r[2];
after constant propagation: r[2] = 1; r[3] = r[4] + 1;
after dead assignment elimination: r[3] = r[4] + 1;
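
The two sequences on this slide can be replayed on a toy representation to show them converging on the same code. This is a sketch under stated assumptions, not VPO's implementation: the tuple encoding of RTLs, the three simplified passes, and the assumption that only r[3] is live after the block are all hypothetical.

```python
# Toy RTL: (destination, expression tokens). Tokens are register names,
# operators, or integer constants.
BLOCK = (("r[2]", (1,)), ("r[3]", ("r[4]", "+", "r[2]")))
LIVE_OUT = {"r[3]"}  # assumption: only r[3] is read after this block

def is_const(expr):
    return len(expr) == 1 and isinstance(expr[0], int)

def constant_propagation(block):
    """Replace uses of registers known to hold a constant."""
    known, out = {}, []
    for dest, expr in block:
        expr = tuple(known.get(tok, tok) for tok in expr)
        out.append((dest, expr))
        if is_const(expr):
            known[dest] = expr[0]
        else:
            known.pop(dest, None)
    return tuple(out)

def dead_assignment_elimination(block, live_out):
    """Drop assignments whose value is never read (backward liveness)."""
    live, kept = set(live_out), []
    for dest, expr in reversed(block):
        if dest in live:
            kept.append((dest, expr))
            live.discard(dest)
            live.update(t for t in expr if isinstance(t, str) and t.startswith("r["))
    return tuple(reversed(kept))

def instruction_selection(block, live_out):
    """Toy combiner: fold 'rX = const' into the next instruction
    when rX is not needed anywhere else."""
    out, i = list(block), 0
    while i + 1 < len(out):
        (d1, e1), (d2, e2) = out[i], out[i + 1]
        used_later = any(d1 in e for _, e in out[i + 2:])
        if (is_const(e1) and d1 in e2 and d1 != d2
                and d1 not in live_out and not used_later):
            out[i:i + 2] = [(d2, tuple(e1[0] if t == d1 else t for t in e2))]
        else:
            i += 1
    return tuple(out)

first = instruction_selection(BLOCK, LIVE_OUT)
second = dead_assignment_elimination(constant_propagation(BLOCK), LIVE_OUT)
print(first == second, first)
```

Because both results are plain tuples, comparing or hashing them is enough to recognize that the two phase sequences reached an identical function instance.
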
- Slide 19
- Equivalent Function Instances:
Source code: sum = 0; for (i = 0; i < 1000; i++) sum += a[i];
r[10]=0; r[12]=HI[a]; r[12]=r[12]+LO[a]; r[1]=r[12];
r[9]=4000+r[12]; L3 r[8]=M[r[1]]; r[10]=r[10]+r[8]; r[1]=r[1]+4;
IC=r[1]?r[9]; PC=IC