Program Analysis Mooly Sagiv Tel Aviv University 640-6706 Sunday 18-21 Scrieber 8 Monday 10-12...

Post on 20-Jan-2018


Program Analysis
Mooly Sagiv

http://www.math.tau.ac.il/~sagiv/courses/pa.html

Tel Aviv University
640-6706

Sunday 18-21 Schreiber 8
Monday 10-12 Schreiber 317

Textbook: Dataflow Analysis Chapter 2 & Appendix A

Monotone Frameworks and Precision

Outline
– Lattice Theory
– Monotone Dataflow Frameworks
– Precision of Data Flow Analysis

Lattice Theory

The foundation of
– Denotational semantics
– Program analysis

Special topology theory
Generalizes powersets and integers

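Partial Orders

Consider a set $P$
A partial order is a relation $\sqsubseteq\ : P \times P \to \{\mathit{false}, \mathit{true}\}$ such that:
– $\sqsubseteq$ is reflexive: $\forall p \in P : p \sqsubseteq p$
– $\sqsubseteq$ is transitive: $\forall p_1, p_2, p_3 \in P : p_1 \sqsubseteq p_2 \land p_2 \sqsubseteq p_3 \Rightarrow p_1 \sqsubseteq p_3$
– $\sqsubseteq$ is anti-symmetric: $\forall p_1, p_2 \in P : p_1 \sqsubseteq p_2 \land p_2 \sqsubseteq p_1 \Rightarrow p_1 = p_2$

Partially ordered sets (posets) $(P, \sqsubseteq)$
Examples
– $(\mathbb{R}, \le)$
– $(\mathcal{P}(S), \subseteq)$
– $(\mathcal{P}(S), \supseteq)$
– (Alphanumeric-Strings, Lexicographic-order)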
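The three axioms can be checked mechanically on a finite set. The following is a minimal sketch, not part of the original slides: the helper names `is_partial_order` and `powerset` are made up for illustration, and the check is brute force over all pairs and triples.

```python
from itertools import product

def is_partial_order(elems, leq):
    """Check reflexivity, transitivity and anti-symmetry of leq on a finite set."""
    elems = list(elems)
    reflexive = all(leq(p, p) for p in elems)
    transitive = all(leq(p, r) for p, q, r in product(elems, repeat=3)
                     if leq(p, q) and leq(q, r))
    antisymmetric = all(p == q for p, q in product(elems, repeat=2)
                        if leq(p, q) and leq(q, p))
    return reflexive and transitive and antisymmetric

def powerset(s):
    """All subsets of s, as frozensets."""
    subsets = [frozenset()]
    for x in s:
        subsets += [t | {x} for t in subsets]
    return subsets

P = powerset({"a", "b", "c"})
subset_ok = is_partial_order(P, lambda x, y: x <= y)    # (P(S), subset)
superset_ok = is_partial_order(P, lambda x, y: x >= y)  # (P(S), superset)
strict_ok = is_partial_order(P, lambda x, y: x < y)     # strict inclusion is not reflexive
```

Both inclusion orders on the powerset pass the check, while strict inclusion fails because it is not reflexive.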

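Upper Bounds

Consider a poset $(P, \sqsubseteq)$
An element $u \in P$ is an upper bound of a subset $S \subseteq P$ if $\forall s \in S : s \sqsubseteq u$
An element $u \in P$ is a least upper bound of a subset $S \subseteq P$ if
– $u$ is an upper bound of $S$
– For every upper bound $u'$ of $S$: $u \sqsubseteq u'$

The least upper bound of $S$ is unique if it exists (denoted by $\bigsqcup S$)
For $S = \{p_1, p_2\}$: $p_1 \sqcup p_2 = \bigsqcup \{p_1, p_2\}$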

Lower Bounds

Consider a poset $(P, \sqsubseteq)$
An element $l \in P$ is a lower bound of a subset $S \subseteq P$ if $\forall s \in S : l \sqsubseteq s$
An element $l \in P$ is a greatest lower bound of a subset $S \subseteq P$ if
– $l$ is a lower bound of $S$
– For every lower bound $l'$ of $S$: $l' \sqsubseteq l$

The greatest lower bound of $S$ is unique if it exists (denoted by $\bigsqcap S$)
For $S = \{p_1, p_2\}$: $p_1 \sqcap p_2 = \bigsqcap \{p_1, p_2\}$

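Complete Lattices

A poset $(L, \sqsubseteq)$ in which $\bigsqcup S$ and $\bigsqcap S$ are both defined for every subset $S \subseteq L$ is called a complete lattice

Denoted by $(L, \sqsubseteq) = (L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
$\bot$ is the minimum value: $\bot = \bigsqcup \emptyset = \bigsqcap L$
$\top$ is the maximum value: $\top = \bigsqcup L = \bigsqcap \emptyset$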
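For a finite poset, least upper bounds and greatest lower bounds can be computed directly from their definitions. The sketch below is illustrative only: the function names are hypothetical, and the example poset (the divisors of 12 ordered by divisibility) is an assumption chosen because its joins and meets are the familiar lcm and gcd.

```python
def upper_bounds(P, leq, S):
    """All u in P with s below u for every s in S."""
    return [u for u in P if all(leq(s, u) for s in S)]

def lub(P, leq, S):
    """Least upper bound of S in the finite poset (P, leq), or None if undefined."""
    ubs = upper_bounds(P, leq, S)
    least = [u for u in ubs if all(leq(u, v) for v in ubs)]
    return least[0] if least else None

def glb(P, leq, S):
    """Greatest lower bound, computed as the lub in the reversed order."""
    return lub(P, lambda x, y: leq(y, x), S)

# Divisors of 12 ordered by divisibility: x is below y iff x divides y.
D12 = [1, 2, 3, 4, 6, 12]
divides = lambda x, y: y % x == 0

join_4_6 = lub(D12, divides, [4, 6])   # behaves like the least common multiple
meet_4_6 = glb(D12, divides, [4, 6])   # behaves like the greatest common divisor
bottom = lub(D12, divides, [])         # lub of the empty set is the minimum
top = glb(D12, divides, [])            # glb of the empty set is the maximum
# In {1, 2, 3} the elements 2 and 3 have no common upper bound.
missing = lub([1, 2, 3], divides, [2, 3])
```

The last line shows a poset that is not a complete lattice: the subset {2, 3} has no least upper bound there.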

Lattices in Program Analysis

The poset $(L, \sqsubseteq)$ describes “potential pieces of abstract information” (known when the analysis begins)

$l_1 \sqsubseteq l_2$ means:
– $l_1$ is at least as precise as $l_2$
– $l_2$ describes at least the program states described by $l_1$

$\bot$ describes an empty set of program states
$\top$ describes all the program states (trivial solution)

$l_1 \sqcup l_2$ is the effect of integrating $l_1$ and $l_2$ from different control-flow paths

$l_1 \sqcap l_2$ is the effect of integrating $l_1$ and $l_2$ from the same control-flow path

Lemma A.2

Given a poset $(P, \sqsubseteq)$ the following claims are equivalent:
– (i) $P$ is a complete lattice
– (ii) for every subset $S \subseteq P$, $\bigsqcup S$ is defined
– (iii) for every subset $S \subseteq P$, $\bigsqcap S$ is defined

Chains

Consider a poset $(P, \sqsubseteq)$
A chain is a subset $S \subseteq P$ which is totally ordered
– for every $s_1, s_2 \in S$: $s_1 \sqsubseteq s_2$ or $s_2 \sqsubseteq s_1$
$P$ satisfies the ascending chain condition if all the ascending chains in $P$ are finite
$P$ has a finite height $h$ if all chains contain at most $h+1$ elements

Construction of Complete Lattices

It is possible to construct lattices from other lattices (like compound data-types)

Allows natural generalizations of static analysis algorithms

Examples:
– Cartesian products
– Total function space

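Cartesian Products

Consider lattices
– $(L_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$
– $(L_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$

Define $L = (L_1 \times L_2, \sqsubseteq)$ where $(l_1, l_2) \sqsubseteq (u_1, u_2)$ if $l_1 \sqsubseteq_1 u_1$ and $l_2 \sqsubseteq_2 u_2$

$L$ is a complete lattice
– $\bigsqcup S = (\bigsqcup_1 \{l_1 : \exists l_2 : (l_1, l_2) \in S\},\ \bigsqcup_2 \{l_2 : \exists l_1 : (l_1, l_2) \in S\})$
– $\bigsqcap S = (\bigsqcap_1 \{l_1 : \exists l_2 : (l_1, l_2) \in S\},\ \bigsqcap_2 \{l_2 : \exists l_1 : (l_1, l_2) \in S\})$
– $\bot = (\bot_1, \bot_2)$
– $\top = (\top_1, \top_2)$
– If $L_1$ has a finite height $h_1$ and $L_2$ has a finite height $h_2$ then ...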

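Total Function Space

Consider
– A lattice $(L_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$
– A set $S$

Define $L = (S \to L_1, \sqsubseteq)$ where $f_1 \sqsubseteq f_2$ if for every $s \in S$: $f_1(s) \sqsubseteq_1 f_2(s)$

$L$ is a complete lattice
– $(\bigsqcup Y)(s) = \bigsqcup_1 \{f(s) : f \in Y\}$
– $(\bigsqcap Y)(s) = \bigsqcap_1 \{f(s) : f \in Y\}$
– $\bot(s) = \bot_1$
– $\top(s) = \top_1$
– If $L_1$ has a finite height $h_1$ and $S$ is finite then ...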
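The pointwise construction is easy to express in code when $S$ is finite. The following sketch makes several assumptions that are not in the slides: functions $S \to L_1$ are represented as Python dicts, $L_1$ is taken to be a small powerset ordered by inclusion, and the helper names `lift_join` and `lift_leq` are hypothetical.

```python
from functools import reduce

def lift_join(S, join1, bottom1):
    """Build the join of the function space S -> L1 from the binary join of L1."""
    def join_all(Y):
        # (join Y)(s) = join1 { f(s) : f in Y }, computed pointwise
        return {s: reduce(join1, (f[s] for f in Y), bottom1) for s in S}
    return join_all

def lift_leq(S, leq1):
    """f1 is below f2 iff f1(s) is below f2(s) for every s in S."""
    return lambda f1, f2: all(leq1(f1[s], f2[s]) for s in S)

# Assumed instance: L1 = P({0, 1}) ordered by inclusion, S = {"x", "y"}.
S = ["x", "y"]
join_all = lift_join(S, lambda a, b: a | b, frozenset())
leq = lift_leq(S, lambda a, b: a <= b)

f = {"x": frozenset({0}), "y": frozenset()}
g = {"x": frozenset({1}), "y": frozenset({1})}
h = join_all([f, g])          # pointwise union of f and g
bottom_fn = join_all([])      # join of the empty set maps every s to bottom1
```

Here `h` is an upper bound of both `f` and `g`, and joining the empty family yields the bottom function, matching the bottom element defined above.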

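Properties of Functions

Consider a function $f : L_1 \to L_2$ where $(L_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$ and $(L_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$ are complete lattices

$f$ is strict if $f(\bot_1) = \bot_2$

$f$ is monotone (or order-preserving) if $\forall s_1, b_1 \in L_1 : s_1 \sqsubseteq_1 b_1 \Rightarrow f(s_1) \sqsubseteq_2 f(b_1)$

$f$ is additive (or distributive) if $\forall s_1, b_1 \in L_1 : f(s_1 \sqcup_1 b_1) = f(s_1) \sqcup_2 f(b_1)$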
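On a finite lattice these three properties can be tested by enumeration. The sketch below is an illustration under assumptions: the lattice is the powerset of {a, b} ordered by inclusion, and the two sample functions `kill_gen` and `both` are invented to show that monotone does not imply additive.

```python
from itertools import product

def powerset(s):
    """All subsets of s, as frozensets."""
    subsets = [frozenset()]
    for x in s:
        subsets += [t | {x} for t in subsets]
    return subsets

L = powerset({"a", "b"})   # complete lattice (P({a, b}), subset) with join = union

def is_strict(f):
    return f(frozenset()) == frozenset()

def is_monotone(f):
    return all(f(s) <= f(b) for s, b in product(L, repeat=2) if s <= b)

def is_additive(f):
    return all(f(s | b) == f(s) | f(b) for s, b in product(L, repeat=2))

def kill_gen(x):
    """A kill/gen style function: kill a, generate b."""
    return (x - {"a"}) | {"b"}

def both(x):
    """Returns the full set only when both a and b are present."""
    return x if x == frozenset({"a", "b"}) else frozenset()
```

In this instance `kill_gen` is monotone and additive but not strict, while `both` is strict and monotone but not additive.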

Fixed Points

Consider a function $f : L \to L$ where $(L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ is a complete lattice
Let $Fix(f)$ be the set of fixed points of $f$: $Fix(f) = \{ l \mid f(l) = l \}$
– $lfp(f)$ is the least element in $Fix(f)$ (unique if it exists)
– $gfp(f)$ is the greatest element in $Fix(f)$ (unique if it exists)

Let $Pre(f)$ be the set of pre-fixed points of $f$: $Pre(f) = \{ l \mid f(l) \sqsubseteq l \}$ (also written $Red(f)$)

Let $Post(f)$ be the set of post-fixed points of $f$: $Post(f) = \{ l \mid l \sqsubseteq f(l) \}$ (also written $Ext(f)$)

Tarski’s Theorem: if $f$ is monotone then:
– $lfp(f) = \bigsqcap Pre(f)$
– $gfp(f) = \bigsqcup Post(f)$

Constructive Version of Tarski’s Theorem

Define the sequence:
– $l_0 = \bot$
– $l_{i+1} = f(l_i)$

For monotone $f$: $l_i \sqsubseteq lfp(f)$ for every $i$
If $L$ has height $h$: $l_h = lfp(f)$
Improvements
– Stop when no more changes occur
– Chaotic iterations
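The sequence above, together with the "stop when no more changes occur" improvement, fits in a few lines. This is a sketch under assumptions: the function name `lfp`, the small graph, and the choice of reachability as the example are all illustrative and not taken from the slides.

```python
def lfp(f, bottom):
    """Iterate l0 = bottom, l_{i+1} = f(l_i) until no more changes occur.

    Returns the fixed point and the number of strictly increasing steps.
    Termination assumes f is monotone and the lattice has finite height.
    """
    current, steps = bottom, 0
    while True:
        nxt = f(current)
        if nxt == current:
            return current, steps
        current, steps = nxt, steps + 1

# Reachability from node 1 as a least fixed point over (P(Nodes), subset).
edges = {1: [2], 2: [3], 3: [1], 4: [5], 5: []}

def step(reached):
    """Monotone: node 1, what was reached, and all direct successors."""
    return frozenset({1}) | reached | {m for n in reached for m in edges[n]}

reachable, steps = lfp(step, frozenset())
```

The lattice here has height 5 (one more node per step at most), and the iteration stabilizes after 3 steps on the set {1, 2, 3}.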

Monotone Frameworks

Generalizes Kill/Gen problems
A complete lattice $(L, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ describes the “potential pieces of information”
The initial value at entry is specified by $\iota \in L$
The effect of every basic block at $l$ is described by a monotone function $f_l : L \to L$ (transfer function)
Solve the following system of equations (forward):

$$DF_{entry}(l) = \begin{cases} \iota & \text{if } l = init(S_*) \\ \bigsqcup \{ DF_{exit}(l') \mid (l', l) \in flow(S_*) \} & \text{otherwise} \end{cases}$$

$$DF_{exit}(l) = f_l(DF_{entry}(l))$$

Instances of Monotone Frameworks

Kill/Gen problems
– $\sqsubseteq\ =\ \subseteq$ or $\sqsubseteq\ =\ \supseteq$
– $f_l(entry(l)) = (entry(l) - kill(l)) \cup gen(l)$

May be uninitialized (garbage) variables
Constant propagation
Truly-live variables
Points-to analysis

May-be-garbage variables

A variable may-be-garbage at a label $l$ if there may be a path to $l$ in which it is either uninitialized or set using an uninitialized variable

    [x := 5]^1 ;
    if [z > 2]^2
        then [y := 17]^3 ;
        else [skip]^4 ;
    [t := y + x]^5 ;

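May-be-garbage variables (cont.)

$L = (\mathcal{P}(Var_*), \subseteq, \cup, \cap, \emptyset, Var_*)$
Initial value $\iota = Var_*$

Transfer functions $f_l(DF_{entry}(l))$:
– $[x := a]^l$: if $FV(a) \cap DF_{entry}(l) \neq \emptyset$ then $DF_{entry}(l) \cup \{x\}$ else $DF_{entry}(l) - \{x\}$
– $[skip]^l$: $DF_{entry}(l)$
– $[b]^l$: $DF_{entry}(l)$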
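The transfer functions for this analysis translate almost literally into code. The sketch below assumes that the free variables of each right-hand side are supplied by hand (no parser is involved), and the names `assign` and `identity` are hypothetical. It follows the example program along both branches.

```python
def assign(x, free_vars):
    """Transfer function for [x := a]^l, where free_vars stands for FV(a)."""
    def f(df_entry):
        if free_vars & df_entry:      # a reads a variable that may be garbage
            return df_entry | {x}
        return df_entry - {x}
    return f

def identity(df_entry):
    """Transfer function for [skip]^l and for tests [b]^l."""
    return df_entry

VARS = frozenset({"x", "y", "z", "t"})            # initial value: every variable

after_1 = assign("x", frozenset())(VARS)          # [x := 5]^1
then_3 = assign("y", frozenset())(after_1)        # [y := 17]^3
else_4 = identity(after_1)                        # [skip]^4
use_t = assign("t", frozenset({"x", "y"}))        # [t := y + x]^5
t_via_then = use_t(then_3)
t_via_else = use_t(else_4)
```

Along the then-branch `t` ends up initialized, while along the else-branch `y` is still possibly garbage and therefore so is `t`.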

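Constant Propagation

Determine variables with constant values
Information lattice
– Extended integer lattice $(L_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$
    » $L_1 = \mathbb{Z} \cup \{\bot_1, \top_1\}$
    » $\bot_1 \sqsubseteq_1 z \sqsubseteq_1 \top_1$ for every $z \in \mathbb{Z}$
– Define $L = (S \to L_1, \sqsubseteq)$ where $S = Var_*$

Transfer functions, with $A_{cp} : AExp \to (L \to L_1)$, $f_l(DF_{entry}(l))$:
– $[x := a]^l$: $DF_{entry}(l)[x \mapsto A_{cp}[\![a]\!](DF_{entry}(l))]$
– $[skip]^l$: $DF_{entry}(l)$
– $[b]^l$: $DF_{entry}(l)$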
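A small sketch of the extended integer lattice and the assignment transfer function follows. Several details are assumptions made for illustration rather than part of the slides: abstract states are dicts from variable names to values, expressions are encoded as nested tuples, and the evaluator `acp` returns bottom when an operand is bottom and top when an operand is top.

```python
BOT, TOP = "bot", "top"        # the two elements added to the integers

def join1(a, b):
    """Join in the extended integer lattice: bot below every z, top above."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def acp(expr, state):
    """Abstract value of expr, encoded as a nested tuple (an assumed encoding)."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return state[expr[1]]
    left, right = acp(expr[1], state), acp(expr[2], state)
    if BOT in (left, right):
        return BOT
    if TOP in (left, right):
        return TOP
    return left + right if kind == "+" else left * right

def assign(x, expr):
    """Transfer function for [x := a]^l: update x with the abstract value of a."""
    return lambda state: {**state, x: acp(expr, state)}

state0 = {"x": TOP, "y": TOP}
state1 = assign("x", ("const", 5))(state0)
state2 = assign("y", ("+", ("var", "x"), ("const", 1)))(state1)
state3 = assign("x", ("+", ("var", "x"), ("var", "z")))({**state2, "z": TOP})
```

After the first two assignments both variables are known constants; adding the unknown `z` sends `x` back to top.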

Chaotic Iterations

for $l \in Lab_*$ do
    $DF_{entry}(l) := \bot$
    $DF_{exit}(l) := \bot$
$DF_{entry}(init(S_*)) := \iota$
$WL := Lab_*$
while $WL \neq \emptyset$ do
    Select and remove an arbitrary $l \in WL$
    $temp := f_l(DF_{entry}(l))$
    if ($temp \neq DF_{exit}(l)$)
        $DF_{exit}(l) := temp$
        for $l'$ such that $(l, l') \in flow(S_*)$ do
            $DF_{entry}(l') := DF_{entry}(l') \sqcup DF_{exit}(l)$
            $WL := WL \cup \{l'\}$
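The worklist algorithm can be sketched as a generic function parameterized by the lattice and the transfer functions. This is one possible rendering, not the course's reference implementation: the parameter names are hypothetical, and the instance below reuses the may-be-garbage analysis on the five-label example program, with free variables supplied by hand.

```python
def chaotic_iteration(labels, flow, init, iota, bottom, join, transfer):
    """Worklist solver for a forward monotone framework."""
    df_entry = {l: bottom for l in labels}
    df_exit = {l: bottom for l in labels}
    df_entry[init] = iota
    worklist = set(labels)
    while worklist:
        l = worklist.pop()                       # arbitrary choice
        temp = transfer[l](df_entry[l])
        if temp != df_exit[l]:
            df_exit[l] = temp
            for src, dst in flow:
                if src == l:
                    df_entry[dst] = join(df_entry[dst], df_exit[l])
                    worklist.add(dst)
    return df_entry, df_exit

def assign(x, free_vars):
    """May-be-garbage transfer function for [x := a]^l with FV(a) = free_vars."""
    return lambda d: d | {x} if free_vars & d else d - {x}

identity = lambda d: d                           # [skip]^l and tests [b]^l

transfer = {
    1: assign("x", frozenset()),                 # [x := 5]^1
    2: identity,                                 # [z > 2]^2
    3: assign("y", frozenset()),                 # [y := 17]^3
    4: identity,                                 # [skip]^4
    5: assign("t", frozenset({"x", "y"})),       # [t := y + x]^5
}
flow = {(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)}
df_entry, df_exit = chaotic_iteration(
    labels=[1, 2, 3, 4, 5], flow=flow, init=1,
    iota=frozenset({"x", "y", "z", "t"}), bottom=frozenset(),
    join=lambda a, b: a | b, transfer=transfer)
```

Because the transfer functions are monotone, the result does not depend on the order in which labels are removed from the worklist. At label 5 the analysis reports that `y` may be garbage (via the else-branch), so `t` may be garbage after the assignment.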

Complexity of Chaotic Iterations

Parameters:
– $|Lab|$ labels
– $k$ is the maximum outdegree of $flow(S_*)$
– A lattice of height $h$
– $c$ is the maximum cost of
    » applying $f_l$
    » $L$ comparisons

Complexity: $O(|Lab| \cdot h \cdot c \cdot k)$

Soundness of Chaotic Iterations

Define an abstraction $\alpha$ : Collecting-States $\to L$
Show that for every $l$: $\alpha(\{ [b]^l(s) \mid s \in CS \}) \sqsubseteq f_l(\alpha(CS))$
Conclude that the DF solution of chaotic iterations satisfies for every $l$:
– $\alpha(CS_{entry}(l)) \sqsubseteq DF_{entry}(l)$
– $\alpha(CS_{exit}(l)) \sqsubseteq DF_{exit}(l)$

But it may be that chaotic iterations yield $DF_{entry}(l) = \top$ and yet $\alpha(CS_{entry}(l)) = \bot$

How to measure precision?

Precision of Chaotic Iterations

Optimal
– $\alpha(CS_{entry}(l)) = DF_{entry}(l)$
– $\alpha(CS_{exit}(l)) = DF_{exit}(l)$

Join-over-all-paths: no loss of information w.r.t. straight-line code

Relatively optimal (induced) w.r.t. the abstraction

Compare at run-time
Good enough for the used optimization

The Join-Over-All-Paths (JOP)

Let $paths(init(S_*), l)$ denote the potentially infinite set of paths from $init(S_*)$ to $l$ (written as sequences of labels)

For a sequence of labels $[l_1, l_2, \ldots, l_n]$ define $f_{[l_1, l_2, \ldots, l_n]} : L \to L$ by composing the effects of basic blocks:
$f_{[l_1, l_2, \ldots, l_n]}(s) = f_{l_n}(\ldots(f_{l_2}(f_{l_1}(s)))\ldots)$

$JOP_l = \bigsqcup \{ f_{[l_1, l_2, \ldots, l]}(\iota) \mid [l_1, l_2, \ldots, l] \in paths(init(S_*), l) \}$

JOP vs. Least Solution

The DF solution obtained by chaotic iteration satisfies for every $l$:
– $JOP_l \sqsubseteq DF_{entry}(l)$

If every $f_l$ is additive (distributive) for all the labels $l$:
– $JOP_l = DF_{entry}(l)$
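Constant propagation is a standard case where the transfer functions are monotone but not additive, so the inequality can be strict. The sketch below is an illustration under assumptions: the two-branch program, the dict-based states, and the names `add_xy`, `then_branch` and `else_branch` are invented, and the two results are compared after the final block rather than at its entry.

```python
BOT, TOP = "bot", "top"

def join1(a, b):
    """Join in the extended integer lattice."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def join(s1, s2):
    """Pointwise join of two abstract states over the same variables."""
    return {v: join1(s1[v], s2[v]) for v in s1}

def add_xy(state):
    """Transfer function for [z := x + y]."""
    x, y = state["x"], state["y"]
    if BOT in (x, y):
        z = BOT
    elif TOP in (x, y):
        z = TOP
    else:
        z = x + y
    return {**state, "z": z}

then_branch = lambda s: {**s, "x": 1, "y": 2}    # [x := 1]; [y := 2]
else_branch = lambda s: {**s, "x": 2, "y": 1}    # [x := 2]; [y := 1]
iota = {"x": TOP, "y": TOP, "z": TOP}

# Join over all paths: run each path to the end, then join the results.
jop = join(add_xy(then_branch(iota)), add_xy(else_branch(iota)))
# Dataflow solution: join at the merge point, then apply the last block.
df = add_xy(join(then_branch(iota), else_branch(iota)))
```

On both paths `z` is 3, so the join over all paths keeps the constant, whereas the dataflow solution first merges `x` and `y` to top and loses it.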

Static Analysis Problems beyond Monotone Frameworks

Infinite heights
– Integer intervals
– Linear relationships between variables

Bi-directional problems
Procedures

Conclusions

Many dataflow problems can be solved via the chaotic iteration algorithm
Monotone frameworks provide a tool to understand precision