Materialization in Shape Analysis with Structural Invariant Checkers Bor-Yuh Evan Chang Xavier Rival...

Post on 21-Jan-2016


Materialization in Shape Analysis with Structural Invariant Checkers

Bor-Yuh Evan Chang, Xavier Rival, George C. Necula

University of California, Berkeley

August 27, 2007, ITU Copenhagen

What's shape analysis? What's special?

• Memory manipulation
  – Particularly important in systems code (in C)

• Flow-sensitive
  – Many important properties
    • E.g., Is an object freed? Is a file open?
  – Heap abstracted differently at different points
    • E.g., Not based on allocation site

Shape analysis tracks memory manipulation in a flow-sensitive manner.

Example: Typestate with shape analysis

  cur = l;
  while (cur != null) {
    assert(cur is red);
    make_purple(cur);
    cur = cur->next;
  }

[Figure: "Concrete Example" and "Abstraction" panels. Before the loop, l points to a "red list". Inside the loop, l points to a "purple list segment" ending at cur, and cur points to a "red list". Annotations: program-specific predicate; flow-sensitive heap abstraction.]

make_purple(·) could be
• lock(·)
• free(·)
• open(·)
• …

Shape analysis is not yet practical

Usability: Choosing the heap abstraction is difficult

TVLA [Sagiv et al.]
  Parametric in low-level, analyzer-oriented predicates
  "red list": red(n) ∧ n ∈ reach(l)
  ++ Very general and expressive
  -- Hard for non-expert

Space Invader [Distefano et al.]
  Built-in high-level predicates ("red list")
  ++ No additional user effort
  -- Hard to extend

Our Proposal
  Parametric in high-level, developer-oriented predicates ("red list")
  ++ Extensible
  ++ Easier for developers

Shape analysis is not yet practical

Scalability: Finding the right level of abstraction is difficult
Over-reliance on disjunction for precision

[Figure: the developer pictures a single graph (l, then a "purple list segment" ending at cur, then a "red list"). The shape analyzer instead tracks a large disjunction (∨) of graphs that differ in where l and cur point, including the cases where l and cur are aliased and where the heap is emp.]

Hypothesis

The developer can describe the memory in a compact manner at an abstraction level sufficient for the properties of interest (at least informally).
• Good abstraction is program-specific

[Figure: the developer's abstraction (l, then a "purple list segment" ending at cur, then a "red list") and the shape analyzer, with "??" marking the missing link between them.]

Observation

  bool redlist(List* l) {
    if (l == null)
      return true;
    else
      return l->color == red
          && redlist(l->next);
  }

Checking code expresses a shape invariant and an intended usage pattern.
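The checker on this slide can be written as ordinary, compilable C. The following is a minimal sketch, assuming a node layout with only the two fields the slide mentions (color and next); the names `struct List`, `enum Color`, `RED`, and `PURPLE` are illustrative and do not come from the talk.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout: the slide only uses the fields color and next. */
enum Color { RED, PURPLE };

struct List {
    enum Color color;
    struct List *next;
};

/* Checker from the slide: every node reachable from l through next is red.
   It is pure, recursive, and reads each field of each node at most once. */
bool redlist(const struct List *l) {
    if (l == NULL)
        return true;
    return l->color == RED && redlist(l->next);
}
```

Run on a concrete heap, the checker validates the invariant dynamically; the analysis described in the talk instead treats the same definition as an inductive predicate.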

Proposal

An automated shape analysis with a memory abstraction parameterized by invariant checkers.

• Extensible
  – Abstraction based on the developer-supplied checkers
• Targeted for Usability
  – Global data structure specification, local invariant inference
• Targeted for Scalability
  – Based on the hypothesis

[Figure: checkers are the input to the shape analyzer.]

  bool redlist(List* l) {
    if (l == null)
      return true;
    else
      return l->color == red
          && redlist(l->next);
  }

Shape analysis is an abstract interpretation on memory states with …

• Materialization (partial concretization)
  – To perform strong updates
• And widening for termination

[Figure: abstract states for the loop. Early iterates have l and cur at the head of a "red list", then cur further along the list. The widened state is l, then a "purple list segment" ending at cur, then a "red list".]

Outline

• Memory abstraction
  – Restrictions on checkers
  – Challenge: Intermediate invariants
• Materialization by forward unfolding
  – Where and how
  – Challenge: Unfolding segments
• Materialization by backward unfolding
  – Challenge: Back pointers
• Deciding where to unfold generically

Abstract memory using checkers

Graphs
• Nodes α: values (address or null)
• Edge labeled f from α to β: points-to relation α@f ↦ β
• Edge labeled c leaving α: checker run c(α)
• Edge labeled c from α to β: partial run ?
  – "Some number of points-to edges that satisfies checker c"
• Distinct edges describe disjoint memory regions (∗)

Example
"Disjointly, α->next = β, γ->next = β, and β is a list."

[Figure: next edges from α to β and from γ to β, and a list edge leaving β.]

Checkers as inductive definitions

  bool list(List* l) {
    if (l == null)
      return true;
    else
      return list(l->next);
  }

α.list := (emp ∧ α = null) ∨ ∃β. (α@next ↦ β ∗ β.list ∧ α ≠ null)

[Figure: the call tree list(l), …, list(…) of a checker run.]

Disjointness
A checker run can dereference any object field only once.

What can a checker do?

• In this talk, a checker …
  – is a pure, recursive function
  – dereferences any object field only once during a run
  – dereferences only one of its arguments (the traversal argument)
  – has only additional pointer parameters

  bool dll(Dll* l, Dll* prev) {
    if (l == null)
      return true;
    else
      return l->prev == prev
          && dll(l->next, l);
  }

α.dll(ρ) := (emp ∧ α = null) ∨ ∃β. (α@next ↦ β ∗ α@prev ↦ ρ ∗ β.dll(α) ∧ α ≠ null)

Annotations: l (α) is the traversal argument; only fields from the traversal argument are dereferenced.
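For reference, here is a compilable C sketch of the doubly-linked list checker. It assumes the recursive call passes the current node as the new prev argument, which is what the inductive definition β.dll(α) on the slide states; the `struct Dll` layout is an illustrative assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout with the two fields the checker reads. */
struct Dll {
    struct Dll *next;
    struct Dll *prev;
};

/* l is the traversal argument (the only one dereferenced);
   prev is an additional pointer parameter that is only compared. */
bool dll(const struct Dll *l, const struct Dll *prev) {
    if (l == NULL)
        return true;
    return l->prev == prev && dll(l->next, l);
}
```

A well-formed list headed by h satisfies dll(h, NULL); a single wrong back pointer makes the checker fail.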

Example checker: Two-level skip list

α.skip1 := (emp ∧ α = null) ∨ ∃β,γ. (α@skip ↦ β ∗ α@next ↦ γ ∗ γ.skip0(β) ∗ β.skip1 ∧ α ≠ null)

α.skip0(γ) := (emp ∧ α = γ) ∨ ∃β. (α@skip ↦ null ∗ α@next ↦ β ∗ β.skip0(γ) ∧ α ≠ γ)

[Figure: a concrete two-level skip list in which every node has a skip field and a next field.]
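The two skip-list definitions can be read back as a pair of C checkers. This is a sketch under stated assumptions: the arguments of the skip0 calls are reconstructed from the shape of the definitions (a level-0 segment must end at the node that the skip pointer targets), the `struct Skip` layout is illustrative, and the extra NULL guard in skip0 is an addition so that the function stays safe on malformed inputs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout: one pointer per level. */
struct Skip {
    struct Skip *skip;  /* upper level */
    struct Skip *next;  /* lower level */
};

/* Level-0 segment from l up to (not including) end:
   every node on it has a null skip pointer. */
bool skip0(const struct Skip *l, const struct Skip *end) {
    if (l == end)
        return true;
    if (l == NULL)      /* added guard: ran off the list before reaching end */
        return false;
    return l->skip == NULL && skip0(l->next, end);
}

/* Level-1 list: the next chain of l reaches l->skip through a level-0
   segment, and l->skip is again a level-1 list. */
bool skip1(const struct Skip *l) {
    if (l == NULL)
        return true;
    return skip0(l->next, l->skip) && skip1(l->skip);
}
```

Each field of each node is read at most once in a run of skip1, which matches the disjointness restriction placed on checkers.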

back to the abstract domain …

[Figure: checkers are the input to the shape analyzer.]

  bool redlist(List* l) {
    if (l == null)
      return true;
    else
      return l->color == red
          && redlist(l->next);
  }

Challenge: Intermediate invariants

  assert(redlist(l));
  cur = l;
  while (cur != null) {
    make_purple(cur);
    cur = cur->next;
  }
  assert(purplelist(l));

[Figure: abstract states. Before the loop: l with a redlist edge. In the loop: l with a purplelist segment ending at cur, and cur with a redlist edge. After the loop: l with a purplelist edge.]

Prefix segment: described by ?
Suffix: described by checkers
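To make the intermediate invariant concrete, the sketch below checks it dynamically at every loop head: a purple segment from l to cur, followed by a red list from cur. This is a runtime analogue and not the abstract domain of the talk; `purple_seg`, `paint_purple_checked`, and the node layout are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum Color { RED, PURPLE };

struct List {
    enum Color color;
    struct List *next;
};

/* Suffix: described by a complete checker run. */
bool redlist(const struct List *l) {
    if (l == NULL)
        return true;
    return l->color == RED && redlist(l->next);
}

/* Prefix: a partial run of the purple checker that stops at end. */
bool purple_seg(const struct List *l, const struct List *end) {
    if (l == end)
        return true;
    if (l == NULL)
        return false;
    return l->color == PURPLE && purple_seg(l->next, end);
}

/* Runs the loop from the slide and reports whether the invariant
   "purple segment l..cur, then red list from cur" held at every loop head
   and whether the whole list is purple at the end. */
bool paint_purple_checked(struct List *l) {
    struct List *cur = l;
    while (cur != NULL) {
        if (!(purple_seg(l, cur) && redlist(cur)))
            return false;
        cur->color = PURPLE;   /* make_purple(cur) */
        cur = cur->next;
    }
    return purple_seg(l, NULL); /* same as purplelist(l) */
}
```

The segment checker takes the stopping point as a second argument, which is the concrete counterpart of describing the prefix as a partial checker run.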

Prefix segments as partial checker runs

[Figure, three rows.
Abstraction: l with a purplelist segment edge ending at cur; in general, an edge labeled c between two nodes.
Checker Run: the call tree purplelist(l), …, purplelist(…), cut off at the nested call purplelist(cur); in general, a run of c(…) cut off at a nested call c(…).
Formula: c() ∗– c() ??]

Doesn't quite work because we need materialization

Outline

• Memory abstraction
  – Restrictions on checkers
  – Challenge: Intermediate invariants
• Materialization by forward unfolding
  – Where and how
  – Challenge: Unfolding segments
• Materialization by backward unfolding
  – Challenge: Back pointers
• Deciding where to unfold generically

Flow function: Unfold and update edges

  x->next = x->next->next;

• Unfold inductive definition
  – materialize: x->next, x->next->next
• Strong updates using disjointness of regions
  – update: x->next = x->next->next

[Figure: graphs for x before and after the statement. The list checker edge is unfolded into next points-to edges, which produces a disjunction (∨) of cases, and then the next edge of x is updated.]
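The unfold-then-update step can be sketched on a toy symbolic heap. This is a simplified illustration and not the analyzer from the talk: each symbolic node has at most one outgoing edge, which is either a list summary or a next points-to edge, and only the non-null disjunct of the unfolding is followed (the disjunct where the list is empty is dropped). All names (`AbsHeap`, `unfold_list`, `materialize_next`, `assign_next_next`) are hypothetical.

```c
#include <stdbool.h>

#define MAX_NODES 8

/* Outgoing edge of a symbolic node: a folded checker run (n.list)
   or an unfolded points-to edge (n@next |-> target). */
enum EdgeKind { SUMMARY_LIST, POINTS_TO_NEXT };

struct AbsHeap {
    int count;                      /* number of symbolic nodes in use */
    enum EdgeKind kind[MAX_NODES];
    int target[MAX_NODES];          /* meaningful only for POINTS_TO_NEXT */
};

/* Unfold n.list along its non-null disjunct:
   n@next |-> fresh * fresh.list. Returns fresh, or -1 on failure. */
int unfold_list(struct AbsHeap *h, int n) {
    if (n < 0 || n >= h->count)
        return -1;
    if (h->kind[n] != SUMMARY_LIST || h->count >= MAX_NODES)
        return -1;
    int fresh = h->count++;
    h->kind[fresh] = SUMMARY_LIST;
    h->target[fresh] = -1;
    h->kind[n] = POINTS_TO_NEXT;
    h->target[n] = fresh;
    return fresh;
}

/* Materialize n->next: reuse an existing points-to edge or unfold. */
int materialize_next(struct AbsHeap *h, int n) {
    if (n < 0 || n >= h->count)
        return -1;
    if (h->kind[n] == POINTS_TO_NEXT)
        return h->target[n];
    return unfold_list(h, n);
}

/* Abstract effect of x->next = x->next->next. */
bool assign_next_next(struct AbsHeap *h, int x) {
    int n1 = materialize_next(h, x);    /* x->next */
    if (n1 < 0)
        return false;
    int n2 = materialize_next(h, n1);   /* x->next->next */
    if (n2 < 0)
        return false;
    /* Strong update: regions are disjoint, so only this edge changes. */
    h->target[x] = n2;
    return true;
}
```

Starting from x@next ↦ α ∗ α.list, the call unfolds α.list once and then redirects the next edge of x, leaving the new summary edge untouched.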

Unfolding: where, how, and why ok

  x->next = x->next->next;

materialize: x->next, x->next->next

• Where
  – "Reach" a traversal argument with x->next
• How and Why Ok (concretizations same)
  – By definition

[Figure: the list checker edge reached from x is replaced by the disjunction (∨) of the cases of its inductive definition.]

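What about unfolding segments?

materialize: x->next

[Figure: x at α with a list segment edge from α to β, y at β, and a list edge leaving β. Unfolding the segment gives a disjunction (∨): either α = β, so x and y are aliased, or x has a next points-to edge to γ followed by a list segment from γ to β.]

list(α) ∗– list(β)
unfolds to
(emp ∧ α = β) ∨ (α@f ↦ γ ∗ (list(γ) ∗– list(β)))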

Segment connector (for unfolding)

Concrete store σ : Val → Val
Valuation ν : SymVal → Val

Inductive definitions
  c(α) := … ∨ (Mu ∗ Mf ∧ F) ∨ …
  where Mu is the "unfolded" points-to part, Mf the "folded" recursive calls, and F a pure formula

σ, ν ⊨ c(α) ∗= c′(α′)
  iff there exists an i such that σ, ν ⊨ c(α) ∗=_i c′(α′)

[·], ν ⊨ c(α) ∗=_0 c(α′)
  iff ν(α) = ν(α′)

σ, ν ⊨ c(α) ∗=_{i+1} c′(α′)
  iff there exists a disjunct (Mu ∗ Mf ∗ c″(β) ∧ F) such that
    ν satisfies [actuals/formals]F and
    σ, ν ⊨ [actuals/formals](Mu ∗ Mf ∗ c″(β) ∗=_i c′(α′))

Basic properties of segments

• If σ, ν ⊨ c(α) ∗= c′(α′), then σ, ν ⊨ c(α) ∗– c′(α′)
  – If σ, ν ⊨ (c(α) ∗= c′(α′)) ∗ c′(α′), then σ, ν ⊨ c(α) (elimination)
• [·], ν ⊨ c(α) ∗= c(α) (reflexivity)
• If σ, ν ⊨ (c(α) ∗= c′(α′)) ∗ (c′(α′) ∗= c″(α″)), then σ, ν ⊨ c(α) ∗= c″(α″) (transitivity)

[Figure: a segment from α (checker c) to α′ (checker c′) followed by a segment from α′ to α″ (checker c″).]

Outline

• Memory abstraction
  – Restrictions on checkers
  – Challenge: Intermediate invariants
• Materialization by forward unfolding
  – Where and how
  – Challenge: Unfolding segments
• Materialization by backward unfolding
  – Challenge: Back pointers
• Deciding where to unfold generically

Challenge: Back pointers

Example: Removal in doubly-linked lists

α.dll(ρ) := (emp ∧ α = null) ∨ ∃β. (α@next ↦ β ∗ α@prev ↦ ρ ∗ β.dll(α) ∧ α ≠ null)

α.dll′(ρ) := (emp ∧ α = null) ∨ ∃β. (α@prev ↦ β ∗ α@next ↦ ρ ∗ β.dll′(α) ∧ α ≠ null)

• Traversal on 'next' field to find element to remove:
  [Figure: l with a dll(null) segment ending at cur, and cur with a dll(γ) edge.]

• Materialize 'cur->prev' and remove 'cur':
  [Figure: the node γ just before cur must be exposed, with its next and prev edges, out of the dll(null) segment.]

Need to unfold "backward"
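The concrete operation behind this slide is ordinary doubly-linked list removal. The sketch below, with hypothetical names `dll_remove` and `struct Dll`, shows why cur->prev has to be materialized: the update writes through cur->prev->next, a field of the node that sits just before cur and is therefore inside the already-traversed segment.

```c
#include <stdbool.h>
#include <stddef.h>

struct Dll {
    struct Dll *next;
    struct Dll *prev;
};

/* Checker used to validate the invariant before and after removal. */
bool dll(const struct Dll *l, const struct Dll *prev) {
    if (l == NULL)
        return true;
    return l->prev == prev && dll(l->next, l);
}

/* Unlink cur from the list headed by *head. */
void dll_remove(struct Dll **head, struct Dll *cur) {
    if (cur->prev != NULL)
        cur->prev->next = cur->next;  /* write into the node before cur */
    else
        *head = cur->next;            /* cur was the first node */
    if (cur->next != NULL)
        cur->next->prev = cur->prev;
    cur->next = NULL;
    cur->prev = NULL;
}
```

After the call, the remaining nodes again satisfy dll(*head, NULL), which is the structural invariant the analysis is meant to verify statically.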

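Backwards unfolding by forwards unfolding

[Figure: a dll(null) segment of length i+1 ending at β, where β has a prev edge to γ and a dll(γ) edge. Steps shown:
1. split (lemma): the segment of length i+1 is split at a fresh node δ into a segment of length i and a segment of length 1.
2. unfold forward at δ: the length-1 segment becomes next and prev points-to edges at δ, with the next edge going to η, followed by a segment of length 0.
3. reduce: η = β, δ = γ.]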

Outline

• Memory abstraction
  – Restrictions on checkers
  – Challenge: Intermediate invariants
• Materialization by forward unfolding
  – Where and how
  – Challenge: Unfolding segments
• Materialization by backward unfolding
  – Challenge: Back pointers
• Deciding where to unfold generically

Deciding where to unfold

• Observation: Can indicate (with types) what fields are materialized for a checker parameter

  types  τ ::= { f1⟨l1⟩, …, fn⟨ln⟩ }   (a pointer that may materialize these fields)
  levels l ::= n | unk                  (where in the traversal it may be materialized)

• Levels
  [Figure: the sequence of checker calls c−n, …, c−1, c0, c1, …, cm.]
  – Level 0: Materialized in this call.
  – Level -1: Materialized just before this call.

Example: Doubly-linked lists

α.dll(ρ) := (emp ∧ α = null) ∨ ∃(β : {next⟨1⟩, prev⟨1⟩}). (α@next ↦ β ∗ α@prev ↦ ρ ∗ β.dll(α) ∧ α ≠ null)

α : {next⟨0⟩, prev⟨0⟩}, ρ : {next⟨-1⟩, prev⟨-1⟩}

Before: Traversal argument had level 0 fields (implicitly)

Backward unfolding parameter ρ has level -1

Example: Alternative doubly-linked list

α.npdll := (emp ∧ α = null) ∨ ∃(β : {next⟨2⟩, prev⟨1⟩}). (α@next ↦ β ∗ β.npdll′(α) ∧ α ≠ null)

α : {next⟨0⟩, prev⟨-1⟩}

α.npdll′(ρ) := (emp ∧ α = null) ∨ ∃(β : {next⟨1⟩, prev⟨1⟩}). (α@prev ↦ ρ ∗ α.npdll ∧ α ≠ null)

α : {next⟨1⟩, prev⟨0⟩}, ρ : {next⟨-1⟩, prev⟨-2⟩}

Types can be inferred automatically

Checking
• For a points-to edge on field f from α: { f⟨0⟩ } <: typeof(α)
• For a checker call on β: typeof(β) − 1 <: declared_typeof(π) (where c(π) := …)

[Figure: lattice of types with elements { }, { f⟨0⟩ }, { g⟨1⟩ }, { f⟨0⟩, g⟨unk⟩ }, and { f⟨unk⟩, g⟨unk⟩ }.]

Inference using a fixed-point computation with types initialized to { }

Summary: Enabling materialization anywhere

• Defined segments as partial checker runs directly (inductively)
  – For forward unfolding
  – Backward unfolding derived from forward unfolding
• Checker parameter types with levels
  – For deciding where to unfold
  – Inferable and does not affect soundness

Summary: Given checkers, everything is automatic

  bool redlist(List* l) {
    if (l == null)
      return true;
    else
      return l->color == red
          && redlist(l->next);
  }

[Figure: checkers are the input to the shape analyzer, which consists of a type pre-analysis followed by an abstract interpretation with unfolding and update, and widening.]

Conclusion

• Invariant checkers can form the basis of a memory abstraction that
  – Is easily extensible on a per-program basis
  – Expresses developer intent
    • Critical for usability
    • Prerequisite for scalability
• Enabling materialization anywhere
  – Inductive segments
  – Pre-analysis on checkers to decide where to unfold robustly

What can checker-based shape analysis do for you?

Challenge: Termination and precision

  last = l;
  cur = l->next;
  while (cur != null) {
    // … cur, last …
    if (…) last = cur;
    cur = cur->next;
  }

[Figure: successive loop iterates over l, last, and cur. Each iterate has one more next edge before the trailing list edge. Widening (canonicalize, blur) folds the growing prefix into list checker edges.]

Observation
Previous iterates are "less unfolded"

Fold into checker edges
But where and how much?

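History-guided folding

  last = l;
  cur = l->next;
  while (cur != null) {
    if (…) last = cur;
    cur = cur->next;
  }

• Match edges to identify where to fold
• Apply local folding rules

[Figure: the graph from the previous iterate (l and last aliased, a next edge to cur, then list) is matched against the current iterate (l, last, and cur on successive next edges, then list). Where a region of the current graph is included (⊑) in a list edge, the answer is "Yes" and the region is folded into that checker edge.]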

Summary: Enabling checker-based shape analysis

• Built-in disjointness of memory regions
  – As in separation logic
  – Checkers read any object field only once in a run
• Generalized segment abstraction
  – Based on partial checker runs
• Generalized folding into inductive predicates
  – Based on iteration history (i.e., a widening operator)

[Figure: a segment edge labeled c, and iterates over l and cur folded into list edges.]

Experimental results

Benchmark             Lines of   Analysis   Max. Num. Graphs     Max. Num. Iterations
                      Code       Time       at a Program Point   at a Program Point
list reverse             19      0.007s             1                     3
list remove element      27      0.016s             4                     6
list insertion sort      56      0.021s             4                     7
search tree find         23      0.010s             2                     4
skip list rebalance      33      0.087s             6                     7
scull driver            894      9.710s             4                    16

• Verified structural invariants as given by checkers are preserved across data structure manipulation
• Limitations (in scull driver)
  – Arrays not handled (rewrote as linked list), char arrays ignored
• Promising as far as number of disjuncts