CS2210 Compiler Design 2004/5
Control Flow Analysis
CS2210 Lecture 11
Reading & Topics
■ Muchnick: chapter 7
■ Optimization Overview
■ Control Flow Analysis
■ Maybe start data flow analysis
Optimization Overview
■ Two step process
  ■ Analyze program to learn things about it (“program analysis”)
    ■ Determine when transformations are legal & profitable
  ■ Transform the program based on this information into a semantically equivalent but better output program
■ Optimization is a misnomer
  ■ Almost never optimal
  ■ Sometimes slows some programs down on some inputs (try to speed up most programs on most inputs)
Semantics
■ Subtleties
  ■ Evaluation order
  ■ Arithmetic properties (e.g. associativity)
  ■ Behavior in error cases
■ Some languages very precise
  ■ E.g., Ada
■ Some weaker
  ■ Potentially more optimization opportunity
Analysis Scope
■ Peephole
  ■ Across small number of adjacent instructions
  ■ Trivial
■ Local
  ■ Within a basic block
  ■ Simple
■ Intraprocedural (aka global)
  ■ Across basic blocks within a procedure
  ■ More complex: branches, merges, loops
■ Interprocedural
  ■ Across procedures, within whole program
  ■ Even more complex: calls, returns
  ■ More useful for higher-level languages
  ■ Hard with separate compilation
■ Whole-program
  ■ Useful for safety properties
  ■ Most complex
Catalog of Optimizations
■ Arithmetic simplification
  ■ Constant folding
      x := 3+4    =>   x := 7
  ■ Strength reduction
      x := y*4    =>   x := y<<2
■ Constant propagation
      x := 5           x := 5
      y := x+2    =>   y := 5+2   =>   y := 7
■ Copy propagation
      x := y           x := y
      w := w+x    =>   w := w+y
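The constant folding and constant propagation entries above can be sketched on straight-line code. This is a minimal illustration, not a production pass: the tuple format (dest, op, arg1, arg2) and the names OPS and fold_and_propagate are assumptions made for this example only.

```python
import operator

# Hypothetical three-address form: (dest, op, arg1, arg2).
# ("x", "const", 5, None) stands for x := 5.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_and_propagate(code):
    """Constant propagation followed by folding inside one basic block."""
    consts = {}   # variable -> constant value known at this point
    out = []
    for dest, op, a, b in code:
        # Constant propagation: replace operands that hold known constants.
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "const":
            consts[dest] = a
            out.append((dest, "const", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            # Constant folding: both operands are known at compile time.
            value = OPS[op](a, b)
            consts[dest] = value
            out.append((dest, "const", value, None))
        else:
            consts.pop(dest, None)   # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out

# x := 5; y := x+2  becomes  x := 5; y := 7
EXAMPLE = fold_and_propagate([("x", "const", 5, None), ("y", "+", "x", 2)])
```

The sketch only handles a single basic block; across merges a data flow analysis is needed to know which constants are still valid.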
Catalog (2)
■ Common Subexpression Elimination (CSE)
      x := a+b         x := a+b
      …           =>   …
      y := a+b         y := x
  ■ Can also eliminate redundant memory references, branch tests
■ Partial Redundancy Elimination (PRE)
  ■ Like CSE but earlier expression available only along some path
      if … then              if … then
        x := a+b               t := a+b; x := t
      end              =>    else t := a+b end
      …                      …
      y := a+b               y := t
Catalog (3)
■ Pointer analysis
      p := &x          p := &x
      *p := 5     =>   *p := 5
      y := x+1         y := 6
■ Dead assignment elimination
      x := y * z
      … /* no use of x */
      x := 6
■ Dead code elimination
      if (false) then …
■ Integer range analysis
      for (i=0; i<10; i++) {
        if (i>=10) goto error
        a[i] := 0;
      }
Loop Optimizations (1)
■ Loop-invariant code motion
      for j := 1 to 10
        for i := 1 to 10
          a[i] := a[i] + b[j]
    ⇒
      for j := 1 to 10
        t := b[j]
        for i := 1 to 10
          a[i] := a[i] + t
■ Induction variable elimination
      for i := 1 to 10
        a[i] := a[i] + 1
    ⇒
      for p := &a[1] to &a[10]
        *p := *p + 1
  ■ a[i] is several instructions, *p is one
Loop Optimizations (2)
■ Loop unrolling
      for i := 1 to N
        a[i] := a[i] + 1
    ⇒
      for i := 1 to N by 4
        a[i]   := a[i]   + 1
        a[i+1] := a[i+1] + 1
        a[i+2] := a[i+2] + 1
        a[i+3] := a[i+3] + 1
  ■ Creates more optimization opportunities in loop body
■ Parallelization
■ Interchange
■ Reversal
■ Fusion
■ Blocking / tiling
  ■ Data cache locality optimization
Call Optimizations
■ Inlining
      l := …                l := …           l := …
      w := 4           ⇒    w := 4      ⇒    w := 4
      a := area(l,w)        a := l*w         a := l<<2
  ■ Many simple optimizations become important after inlining
■ Interprocedural constant propagation
More Call Optimizations
■ Static binding of dynamic calls
  ■ Calls through function pointers in imperative languages
  ■ Call of computed function in functional language
  ■ OO-dispatch in OO languages (e.g., COOL)
  ■ If receiver class can be deduced, can replace with direct call
  ■ Other optimizations possible even when multiple targets (e.g., using PICs = polymorphic inline caches)
■ Procedure specialization
■ Partial evaluation
Machine-dependent Optimizations
■ Register allocation
■ Instruction selection
  ■ Important for CISCs
■ Instruction scheduling
  ■ Particularly important with long-delay instructions and on wide-issue machines (superscalar + VLIW)
The Phase Ordering Problem
■ In what order should optimizations be performed?
  ■ Some optimizations create opportunities for others (order according to this dependence)
  ■ Can make some optimizations simple
    ■ Later optimization will “clean up”
■ What about adverse interactions?
  ■ Common subexpression elimination ⇔ register allocation
  ■ Register allocation ⇔ instruction scheduling
■ What about cyclic dependences?
  ■ Constant folding ⇔ constant propagation
Control Flow Analysis
Approaches
■ Dominator-based
  ■ Control flow graph with dominator relation to identify loops
  ■ Most commonly used
■ Interval-based
  ■ Nested regions (= intervals)
  ■ Control tree
  ■ Special case: structural analysis
    ■ Most sophisticated
    ■ Classifies control structures (not just loops)
Basic Blocks and Control Flow Graphs (CFGs)
■ Basic block = maximal sequence of instructions entered only from first and exited from last
■ Entry can be
  ■ Procedure entry point
  ■ Branch target
  ■ Instruction immediately following branch or return
■ Entry instructions are called leaders
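The leader rules above can be turned into a short sketch that splits a linear instruction list into basic blocks. The instruction encoding (label, kind, target) and the names find_leaders, basic_blocks and FIB are assumptions for illustration; FIB encodes a Fibonacci routine with an explicit goto closing its loop.

```python
def find_leaders(instrs):
    """instrs: list of (label_or_None, kind, target_or_None),
    with kind in {"plain", "branch", "goto", "return"}."""
    label_index = {lab: i for i, (lab, _, _) in enumerate(instrs) if lab}
    leaders = {0}                                   # procedure entry point
    for i, (_, kind, target) in enumerate(instrs):
        if kind in ("branch", "goto"):
            leaders.add(label_index[target])        # branch target
        if kind in ("branch", "goto", "return") and i + 1 < len(instrs):
            leaders.add(i + 1)                      # instruction after branch/return
    return sorted(leaders)

def basic_blocks(instrs):
    """Each block runs from one leader up to (not including) the next."""
    leaders = find_leaders(instrs)
    bounds = leaders + [len(instrs)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]

FIB = [
    (None, "plain", None),    # 0: receive m
    (None, "plain", None),    # 1: f0 <- 0
    (None, "plain", None),    # 2: f1 <- 1
    (None, "branch", "L3"),   # 3: if m <= 1 goto L3
    (None, "plain", None),    # 4: i <- 2
    ("L1", "branch", "L2"),   # 5: L1: if i <= m goto L2
    (None, "return", None),   # 6: return f2
    ("L2", "plain", None),    # 7: L2: f2 <- f0 + f1
    (None, "plain", None),    # 8: f0 <- f1
    (None, "plain", None),    # 9: f1 <- f2
    (None, "plain", None),    # 10: i <- i+1
    (None, "goto", "L1"),     # 11: goto L1
    ("L3", "return", None),   # 12: L3: return m
]
BLOCKS = basic_blocks(FIB)
```

For this input the sketch yields six blocks, one per leader; adding CFG edges would be a separate step that looks at the last instruction of each block.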
Example
    receive m
    f0 <- 0
    f1 <- 1
    if m <= 1 goto L3
    i <- 2
L1: if i <= m goto L2
    return f2
L2: f2 <- f0 + f1
    f0 <- f1
    f1 <- f2
    i <- i+1
    goto L1
L3: return m
[Figure: flowchart of the code above, with tests m <= 1 and i <= m and branch edges labelled Y/N]
Example CFG
[Figure: control flow graph with nodes entry, B1 to B6, and exit; branch edges labelled Y/N]
Dominators & Postdominators
■ Binary relation useful to determine loops
■ Node d dom i :⇔ every possible execution path from entry to i includes d
■ Dominance relation is
  ■ Reflexive: d dom d
  ■ Transitive: if a dom b and b dom c then a dom c
  ■ Anti-symmetric: if a dom b and b dom a then a = b
■ Immediate dominance
  ■ For a ≠ b: a idom b iff a dom b and ∀c ≠ b: (a dom c and c dom b) ⇒ c = a
Dominators & Postdominators
■ p pdom i :⇔ every possible execution path from node i to exit includes p
  ■ Dual relation: p dom i in the CFG with edges reversed and entry and exit switched
■ a strictly dominates b :⇔ a dom b and a ≠ b
Dominator Example
[Figure: control flow graph with nodes entry, B1 to B6, and exit; branch edges labelled Y/N]
Computing Dominators (easy but slow)
■ Initialize dom(i) = set of all nodes for i ≠ entry, dom(entry) = {entry}
■ While changes occur do
  ■ For each node i ≠ entry: dom(i) = {i} ∪ (∩ dom(p) over all predecessors p of i)
■ Works fastest if nodes are processed in reverse postorder
■ O(n²e) complexity, n number of nodes, e number of edges
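The iteration above can be written down directly with Python sets. This is a sketch under assumptions: the function name dominators and the edge list in CFG are hypothetical (an 8-node graph with one loop between B4 and B6), and nodes are visited in arbitrary order rather than reverse postorder, which affects speed but not the result.

```python
def dominators(succ, entry):
    """Iterative dominator computation; succ maps every node to its successors."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    # Start from "everything dominates everything" and shrink.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Assumed edges, for illustration only.
CFG = {
    "entry": ["B1"], "B1": ["B2", "B3"], "B2": ["exit"], "B3": ["B4"],
    "B4": ["B5", "B6"], "B5": ["exit"], "B6": ["B4"], "exit": [],
}
DOM = dominators(CFG, "entry")
```

With these edges, exit is dominated only by entry, B1 and itself, because it can be reached both through B2 and through B5.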
Another Example
Preorder: 0 - 1 - 2 - 3 - 4 - 5 - 6
Postorder: 5 - 4 - 3 - 6 - 2 - 1 - 0
Reverse Postorder: 0 - 1 - 2 - 6 - 3 - 4 - 5
Computing IDOM
■ Compute dominators
■ tmp(i) := dom(i) - {i}
■ Remove all n ∈ tmp(i) from tmp(i) for which ∃ x (≠ n) ∈ tmp(i) such that n ∈ tmp(x)
■ Afterwards tmp(i) contains only the immediate dominator of i
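The pruning step can be sketched as follows. The function name immediate_dominators and the sample dominator sets in DOM_SETS are assumptions for illustration; the sketch tests membership against the unmodified strict-dominator sets, a simplification that gives the same answer as pruning in place.

```python
def immediate_dominators(dom):
    """dom: node -> set of its dominators (including the node itself)."""
    tmp = {i: set(d) - {i} for i, d in dom.items()}   # strict dominators
    idom = {}
    for i in dom:
        # Drop every n that strictly dominates another strict dominator x of i;
        # what remains is the closest strict dominator.
        keep = {n for n in tmp[i]
                if not any(n in tmp[x] for x in tmp[i] if x != n)}
        idom[i] = next(iter(keep)) if keep else None   # entry has no idom
    return idom

# Hypothetical dominator sets for a chain entry -> B1 -> B3 -> B4.
DOM_SETS = {
    "entry": {"entry"},
    "B1": {"entry", "B1"},
    "B3": {"entry", "B1", "B3"},
    "B4": {"entry", "B1", "B3", "B4"},
}
IDOM = immediate_dominators(DOM_SETS)
```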
Computing Dominators Faster
■ Lengauer & Tarjan’s algorithm
  ■ Described in the book
  ■ O(e α(e,n)) running time where α is the inverse of Ackermann’s function
Loops & SCCs
■ An edge e = (m,n) is called a back edge iff n dom m (head dominates tail)
■ The natural loop of a back edge m->n is the subgraph containing n and all nodes from which m can be reached without passing through n, together with the edges that connect those nodes
  ■ n is called the loop header
  ■ Preheader: a node inserted immediately before the loop header
    ■ Useful in many loop optimizations as a “landing pad” for code from the loop body
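The natural loop definition suggests a backward search from m that never expands the header n. The sketch below assumes a predecessor map; the name natural_loop and the edges in PRED are hypothetical.

```python
def natural_loop(pred, m, n):
    """Node set of the natural loop of back edge m -> n (n is the header)."""
    loop = {n, m}
    stack = [m] if m != n else []
    while stack:
        x = stack.pop()
        for p in pred[x]:
            if p not in loop:      # n is already in loop, so the search stops there
                loop.add(p)
                stack.append(p)
    return loop

# Assumed predecessor lists: header H, body A -> B, back edge B -> H.
PRED = {"entry": [], "H": ["entry", "B"], "A": ["H"], "B": ["A"], "exit": ["H"]}
LOOP = natural_loop(PRED, "B", "H")
```

The search assumes m -> n really is a back edge (n dom m); for other edges the result is not a natural loop.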
Headers and Preheaders
[Figure: a loop with header and blocks B1, B2, B3, shown without and with an inserted preheader]
Natural Loop Properties
■ Natural loops with different headers are either
  ■ Nested
  ■ Disjoint
■ What about natural loops with the same header?
[Figure: example flow graph with blocks B1, B2, B3]
Strongly Connected Components
■ Generalization of loops
■ An SCC is a subgraph G_S = <N_S, E_S> in which every node in N_S is reachable from every other node in N_S by a path including only edges from E_S
■ An SCC S is maximal iff every SCC containing it is S itself
Reducible Flow Graphs
■ A flow graph G = (N,E) is reducible (aka well-structured) if E can be partitioned into E = E_B ∪ E_F, where E_B is the back edge set, so that (N, E_F) forms a DAG in which all nodes are reachable from the entry node
■ Patterns that make CFGs irreducible are called improper regions
  ■ Impossible in some languages (e.g., Modula-2)
Dealing with Irreducibility
■ Cannot use structural analysis directly
■ Use iterative data flow analysis instead
■ Make graph well-structured using node splitting
■ Induced iteration on the lattice of monotone functions from the lattice to itself (more on this later)
Control Trees & Interval Analysis
■ Idea:
  ■ Divide CFG into regions of various kinds
  ■ Combine each region into a new node (abstract node)
  ■ Obtain an abstract flow graph
  ■ Final result is called control tree
    ■ Root of control tree is abstract flow graph representing original flow graph
    ■ Leaves of control tree are basic blocks
    ■ Nodes between root and leaves represent regions
    ■ Edges represent relationships between abstract node and its descendants
Example: T1-T2 Analysis
[Figure: T1 collapses a self-loop on B1 into B1a; T2 merges B1 and B2 into B1a]
Interval Analysis
■ A (maximal) interval I_M(h) with header h is the maximal single-entry subgraph with h as only entry node and with all closed subpaths in the subgraph containing h
  ■ Like natural loop but with “acyclic stuff dangling off loop exits”
■ A (minimal) interval is either
  ■ A natural loop
  ■ A maximal acyclic subgraph
  ■ A minimal irreducible region
Interval Analysis Steps
■ Iterate until done:
  ■ Postorder traversal of CFG looking for loop headers
  ■ Construct natural loop for each loop header and reduce the loop to an abstract region “natural loop”
  ■ For each set of entries of an improper region construct minimal SCC and reduce it to “improper region”
  ■ For entry node and each immediate descendant of a node in a natural loop or irreducible region construct maximal acyclic graph with that node as root: if more than one node results, reduce to “acyclic region”
Structural Analysis
■ A refinement of interval analysis
■ Advantage compared to standard iterative data flow analysis
  ■ Uses specialized flow functions for recognized structures that are much faster
  ■ Data flow equations are determined by the syntax and semantics of the (source) language
■ Recognizes more structures than standard interval analysis
Region Types
■ Blocks
■ If-then
■ If-then-else
■ Case-switch
■ Self loop
■ While loop
■ Natural loop
■ Improper interval
■ Proper interval
Data Flow Analysis
Reading
■ Muchnick Chapter 8
■ Some material not in book, just in lecture notes
Data Flow Analysis
■ Compute information about program
  ■ At program points
  ■ To identify optimization opportunities
■ Can model as solving a system of constraints
  ■ Each CFG node imposes a constraint relating predecessor and successor info
  ■ Solution to constraints is result of analysis
  ■ Solution must be
    ■ Sound (aka safe)
    ■ Can be conservative
Key Issues
■ Do constraints define analysis correctly?
■ How to efficiently represent information?
■ How to represent and solve constraints efficiently?
■ What about multiple solutions?
■ How to synchronize transformation with analysis?
Example: Reaching Definitions
■ For each program point want to know
  ■ What set of definitions (statements) may reach that point
  ■ Reach = the last definition of a variable
■ Info = set of var -> LIR bindings, e.g., {x->s1, y->s5, y->s8}
■ Can use the info to
  ■ Build def-use chains
  ■ Do constant and copy propagation
■ Safety
  ■ Can have more bindings than the “true” answer
Constraints for Reaching Defs
■ Strong update
    s: x := …
    info_succ = (info_pred - {x->s’ | ∀s’}) ∪ {x->s}
■ Weak update
    s: *p := …
    info_succ = info_pred ∪ {x->s | ∀x ∈ may-point-to(p)}
■ Other statements do nothing
    info_succ = info_pred
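The strong and weak update constraints can be sketched as two small functions over sets of (variable, statement) pairs. The names strong_update and weak_update and the sample statement labels are assumptions for illustration.

```python
def strong_update(info, s, x):
    """s: x := ...  Removes every earlier definition of x, then adds x->s."""
    return {(v, d) for (v, d) in info if v != x} | {(x, s)}

def weak_update(info, s, may_point_to):
    """s: *p := ...  May define any variable p can point to; nothing is removed."""
    return set(info) | {(x, s) for x in may_point_to}

INFO = {("x", "s1"), ("y", "s5")}
AFTER_STRONG = strong_update(INFO, "s9", "x")
AFTER_WEAK = weak_update(INFO, "s7", {"x", "y"})
```

The weak update keeps the old bindings because the store through p is only a possible definition of each target, which is what makes the result safe for a may analysis.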
More Constraints for RDEFs
■ Branches
    info_succ[i] = info_pred
■ Control flow merges
  ■ We don’t know which path is taken at run time
  ■ Be conservative:
    info_succ = ∪i info_pred[i]
■ Procedure entry:
  ■ receive x
    info = {x->entry}
Solving Constraints
■ For reaching definitions can solve by traversing instructions in forward topological order
[Figure: CFG with definitions (1) x := …, (2) y := …, (3) y := …, (4) p := …, (5) x := …, (6) x := …, (7) *p := …, (8) y := …, and uses …x… and …y…]
Another Example
[Figure: CFG with definitions (1) x := …, (2) y := …, (3) y := …, (4) p := …, (5) x := …, (6) x := …, (7) *p := …, (8) y := …, and uses …x… and …y…]
Loop Schema:
[Figure: loop schema]
Constraints and Loops
■ Constraints are now recursively defined
  ■ info_loop-head = info_loop-entry ∪ info_back-edge
  ■ info_back-edge = …info_loop-head…
■ Substituting the definition of info_back-edge:
    info_loop-head = info_loop-entry ∪ (…info_loop-head…)
  Summarizing the rhs as F:
    info_loop-head = F(info_loop-head)
■ A legal solution to the constraints is a fixed-point of F
Recursive Constraints
■ Many solutions possible
■ Want least or greatest fixed-point, whichever corresponds to most precise answer
■ How can fixed-point be found?
  ■ Interval & structural analysis for certain CFGs
  ■ Iterative approximation for arbitrary CFGs
Iterative Data Flow Analysis
■ Start with initial guess of info at loop head
    info_loop-head = guess
■ Solve equations for body
    info_back-edge = F_body(info_loop-head)
    info_loop-head’ = info_loop-entry ∪ info_back-edge
■ Test if fixed-point found: info_loop-head’ = info_loop-head
  ■ If same, done
  ■ Else use result as new (better) guess:
    info_back-edge’ = F_body(info_loop-head’)
    info_loop-head’’ = info_loop-entry ∪ info_back-edge’
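The iteration scheme can be sketched for a single loop with reaching definitions. The names solve_loop, body, ENTRY and HEAD are assumptions; the loop body here is a hypothetical single statement s3: x := x + 1, and the initial guess is the info at loop entry.

```python
def solve_loop(info_entry, f_body):
    """Iterate info_head = info_entry ∪ f_body(info_head) until it stops changing."""
    head = set(info_entry)              # initial guess: info at loop entry
    while True:
        back = f_body(head)             # info on the back edge
        new_head = set(info_entry) | back
        if new_head == head:            # fixed-point reached
            return head
        head = new_head                 # use result as the next guess

def body(info):
    # Hypothetical loop body with one statement s3: x := x + 1.
    return {(v, d) for (v, d) in info if v != "x"} | {("x", "s3")}

ENTRY = {("x", "s1"), ("y", "s2")}
HEAD = solve_loop(ENTRY, body)
```

Termination relies on the info being drawn from a finite domain and on the body function being monotone, so each guess can only grow.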
When does Iterating work?
■ Have to be able to make initial guess
■ info_n+1 must be closer to fixed-point than info_n
■ Must eventually reach fixed-point; info must be drawn from a finite-height domain
■ To reach best fixed-point, initial guess should be optimistic
  ■ info_loop-head = info_loop-entry
■ Even if guess is overly optimistic, iteration will ensure that we get a safe result
■ Iteration speed
  ■ Ideal: solve constraints along shortest path from loop head to tail
  ■ Practical: avoid solving constraints outside of loop until fixed-point reached inside loop
Data Flow Analysis Direction
■ Constraints are declarative, so may require mix of forward and backward
■ Frequently directional propagation & iteration
  ■ Forward or backward
  ■ Topological traversal of acyclic subgraph minimizes analysis time
■ Directional constraints are called flow functions
    RDEF_{s: x := …}(in) = (in - {x->s’ | ∀s’}) ∪ {x->s}
GEN & KILL sets
■ Flow functions can often be described by their GEN and KILL sets
  ■ GEN = new information added
  ■ KILL = old information removed
  ■ F_instr(in) = (in - KILL_instr) ∪ GEN_instr
■ Example: reaching definitions
    GEN_{s: x := …} = {x->s}
    KILL_{s: x := …} = {x->s’ | ∀s’}
Bit Vectors
■ Encode data flow information and GEN/KILL sets as bit vectors
  ■ Works when info can be expressed abstractly as a set of things
  ■ Each gets a specific bit position
  ■ Reaching defs: info = bit vector over statements, each bit represents a specific statement, defined variable is implied by statement
■ Advantages
  ■ Compact representation
  ■ Fast union & difference operations
  ■ Can combine GEN / KILL sets of whole basic block into one GEN/KILL set for faster iteration
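A small sketch of the bit vector encoding, using Python integers as bit vectors where bit i stands for statement i. The names transfer and compose and the three-statement numbering are assumptions for illustration.

```python
def transfer(in_bits, gen, kill):
    """F(in) = (in - KILL) ∪ GEN on bit vectors."""
    return (in_bits & ~kill) | gen

def compose(first, second):
    """Combine the (gen, kill) pairs of two consecutive instructions into one."""
    g1, k1 = first
    g2, k2 = second
    return ((g1 & ~k2) | g2, k1 | k2)

# Hypothetical statements: bit 0 = s0: x := ..., bit 1 = s1: y := ..., bit 2 = s2: x := ...
S1 = (0b010, 0b000)   # s1 generates itself; there is no other definition of y to kill
S2 = (0b100, 0b001)   # s2 generates itself and kills s0, the other definition of x
BLOCK = compose(S1, S2)                  # summary for the block "s1; s2"
OUT = transfer(0b001, *BLOCK)            # s0 reaches the block entry
```

Applying the combined pair once gives the same result as applying the two instruction-level functions in sequence, which is why iteration can work on whole blocks.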
Example: Constant Propagation
■ What info should be represented?
■ CP_{x := N}(in) =
  CP_{x := y+x}(in) =
■ Merge function?
■ Initial info?
■ Direction?
■ Can bit vectors be used?
Another Example: Constant Propagation
[Figure: CFG with blocks containing x := 3; w := 3, y := x*2, z := y+5; x := x+1, w := 3; w := w*2]
May vs. Must Info
■ Some kinds of info imply guarantees: must info
■ Some kinds of info imply possibilities: may info

              | May                                | Must
Desired info  | Small set                          | Big set
Safe          | Overly big set                     | Overly small set
GEN           | Add everything that might be true  | Add only if guaranteed true
KILL          | Remove only if guaranteed wrong    | Remove everything possibly wrong
MERGE         | ∪                                  | ∩
Live Variables Analysis
■ Desired info: set of variables live at each program point
  ■ Live = might be used later in the program
  ■ Supports dead assignment elimination, register allocation
■ May or must?
■ Direction?
■ Merge function?
■ Bit vectors usable?
■ Initial info?
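One common way to set up live variables is as a backward may analysis with union as merge and the empty set as initial info; the sketch below takes that formulation as an assumption rather than as the answer intended on the slide. The names liveness, BLOCKS and SUCC and the three-block program are hypothetical.

```python
def liveness(blocks, succ):
    """blocks: name -> list of (defined_var_or_None, used_vars), in program order.
    Returns (live_in, live_out) per block."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # Merge: a variable is live out if it is live into any successor.
            out = set().union(*(live_in[s] for s in succ[b]))
            live = set(out)
            for d, uses in reversed(blocks[b]):   # walk the block backwards
                live.discard(d)                   # KILL: the definition
                live |= set(uses)                 # GEN: the uses
            if out != live_out[b] or live != live_in[b]:
                live_out[b], live_in[b] = out, live
                changed = True
    return live_in, live_out

BLOCKS = {
    "B1": [("x", []), ("y", ["x"])],      # x := 5; y := x*2
    "B2": [("y", ["x"]), ("x", ["x"])],   # y := x+10; x := x+1
    "B3": [(None, ["y"])],                # ...y...
}
SUCC = {"B1": ["B2"], "B2": ["B3"], "B3": []}
LIVE_IN, LIVE_OUT = liveness(BLOCKS, SUCC)
```

In this hypothetical program y is not live after B1, so the assignment y := x*2 would be a candidate for dead assignment elimination.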
Live Variables Example
[Figure: CFG with statements x := 5, y := x*2; y := x+10, x := x+1; and a use …y…]
Lattice-theoretic Data Flow Analysis Framework
■ Goals
  ■ Provide single, formal model to describe all DFAs
  ■ Formalize “safe”, “conservative”, “optimistic”
  ■ Precise bounds on time complexity
  ■ Connect analysis with underlying semantics to enable correctness proofs
■ Plan
  ■ Define domain of program properties computed by DFA
    ■ Each domain has set of elements
    ■ Each element represents one possible value of the property
    ■ Order sets to reflect relative precision
    ■ Domain = set of elements + order over elements = lattice
  ■ Define flow/merge functions using lattice operators
  ■ Benefit from lattice theory for realizing goals
Lattices
■ D = (S, ≤)
  ■ S set of elements
  ■ ≤ induces a partial order
    ■ Reflexive, transitive & anti-symmetric
  ■ ∀ x,y ∈ S: meet x ^ y (greatest lower bound) and join x v y (least upper bound) are in S (closure property)
  ■ Unique Top (T) & Bottom (⊥) elements:
    ■ x ^ ⊥ = ⊥ and x v T = T
  ■ Meet and join are commutative and associative
■ Height of lattice: longest path through partial order from top to bottom
Lattices in Data Flow Analysis
■ Model information by elements of a lattice domain
  ■ Top = best case info
  ■ Bottom = worst case info
  ■ Initial info for optimistic analyses (at least back edges): top
  ■ If a ≤ b then a is a conservative approximation of b
  ■ Merge function = meet (^): the most precise element that’s a conservative approximation of both input elements
Some Typical Lattice Domains
■ Two point lattice: ⊥ and T
  ■ Boolean property
  ■ A tuple of two point lattices = bit vector
■ Lifted set: set of incomparable values and ⊥ and T
  ■ Example?
■ Powerset lattice: set of all subsets of S, ordered somehow (often by ⊆)
  ■ T = {} and ⊥ = S, or vice versa
  ■ Collecting analysis
  ■ Isomorphic to tuple of booleans indicating membership in subset of elements of S
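As one possible illustration of a lifted set, the sketch below lifts the integers, as is often done for the value of a single variable in constant propagation. Treating this as the example is an assumption, and the names TOP, BOTTOM, meet and leq are hypothetical.

```python
TOP, BOTTOM = "TOP", "BOTTOM"   # hypothetical markers for T and ⊥

def meet(a, b):
    """Greatest lower bound in the lifted lattice of integer constants."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == b:
        return a          # same constant (or both BOTTOM)
    return BOTTOM         # two different constants are incomparable

def leq(a, b):
    """a ≤ b iff a is a conservative approximation of b, i.e. a ^ b = a."""
    return meet(a, b) == a
```

With meet as the merge function, a variable that is 3 on both incoming paths stays 3, while one that is 3 on one path and 4 on the other drops to ⊥ (not a constant). The lattice has height 2 regardless of how many constants there are.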
Tuples of Lattices
■ Often useful to break complex lattice into a tuple of lattices, one per variable analyzed
■ D_T = <S_T, ≤_T> = <S, ≤>^N
  ■ S_T = S_1 × S_2 × … × S_N
  ■ ≤_T pointwise ordering
  ■ T_T = <T_D, …, T_D>, bottom = tuple of bottoms
■ Height(D_T) = N * height(D)
■ Example?