
End-User Program Analysis

Bor-Yuh Evan Chang, University of California, Berkeley

Dissertation Talk, August 28, 2008

Advisor: George C. Necula, Collaborator: Xavier Rival (INRIA)

2Bor-Yuh Evan Chang - End-User Program Analysis

Software errors cost a lot

~$60 billion annually (~0.5% of US GDP)
– 2002 National Institute of Standards and Technology report

[Slide graphic: logo comparisons captioned “total annual revenue of” and “>10x annual budget of”]

But there’s hope in program analysis

Microsoft uses and distributes the Static Driver Verifier

Airbus applies the Astrée Static Analyzer

Companies, such as Coverity and Fortify, market static source code analysis tools

Because program analysis can eliminate entire classes of bugs

For example,
– Reading from a closed file: read( );
– Reacquiring a locked lock: acquire( );

How?
– Systematically examine the program
– Simulate running program on “all inputs”
– “Automated code review”

Program analysis by example: Checking for double acquires

Simulate running program on “all inputs”

… code …
// x now points to an unlocked lock
acquire(x);
… code …
acquire(x);
… code …

[Diagram: analysis state, with x pointing to a lock]
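The double-acquire check on this slide can be sketched as a tiny typestate simulation. This is a minimal sketch under stated assumptions: the operation encoding and the names `LockState` and `check_double_acquire` are hypothetical, chosen only for illustration, and the sketch handles straight-line code only (no branches or loops).

```python
from enum import Enum


class LockState(Enum):
    UNLOCKED = "unlocked"
    LOCKED = "locked"


def check_double_acquire(ops, initial):
    """Simulate lock operations on abstract lock states.

    ops:     list of (operation, variable) pairs, e.g. ("acquire", "x")
    initial: dict mapping each variable to its initial LockState
    Returns the indices of operations that acquire an already locked lock.
    """
    state = dict(initial)
    errors = []
    for index, (op, var) in enumerate(ops):
        if op == "acquire":
            if state[var] is LockState.LOCKED:
                errors.append(index)  # double acquire detected
            state[var] = LockState.LOCKED
        elif op == "release":
            state[var] = LockState.UNLOCKED
    return errors


# x starts unlocked; the second acquire is the bug the analysis should flag.
program = [("acquire", "x"), ("acquire", "x")]
print(check_double_acquire(program, {"x": LockState.UNLOCKED}))
```

Tracking one state per variable is enough here because x names a single lock; the following slides show why this stops being enough once x points into a linked list.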

Program analysis by example: Checking for double acquires

Simulate running program on “all inputs”

… code …
// x now points to an unlocked lock in a linked list
acquire(x);
… code …

[Diagram: ideal analysis state, enumerating the possible lists that x points into: … or … or … or …]

undecidability

Must abstract

… code …
// x now points to an unlocked lock in a linked list
acquire(x);
… code …

[Diagram: ideal analysis state (… or … or … or …) next to the abstracted analysis state, where the state of x is unknown (?)]

For decidability, must abstract: “model all inputs” (e.g., merge objects)

Abstraction too coarse or not precise enough (e.g., lost the fact that x is always unlocked)

mislabels good code as buggy
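The precision loss from merging objects can be illustrated with a small sketch. Everything here is an assumption made for illustration: the three-valued lock state, the `join` operator, and the node names `n1` and `n2` are hypothetical and do not come from the talk.

```python
UNLOCKED, LOCKED, UNKNOWN = "unlocked", "locked", "unknown"


def join(a, b):
    """Merge two abstract lock states; disagreement loses the information."""
    return a if a == b else UNKNOWN


def may_double_acquire(state):
    """An acquire is safe only if the lock is definitely unlocked."""
    return state != UNLOCKED


# Precise view: one abstract state per list node, and x points to n2.
nodes = {"n1": LOCKED, "n2": UNLOCKED}
precise_alarm = may_double_acquire(nodes["n2"])

# Coarse view: all list nodes merged into a single summary object.
merged = join(nodes["n1"], nodes["n2"])
coarse_alarm = may_double_acquire(merged)

print(precise_alarm, coarse_alarm)
```

With the per-node view the acquire on x is accepted; after merging, the analysis can no longer tell that x is unlocked and raises an alarm on correct code.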


To address the precision challenge

Traditional program analysis mentality:

“Why can’t developers write more specifications for our analysis? Then, we could verify so much more.”

“Since developers won’t write specifications, we will use default abstractions (perhaps coarse) that work hopefully most of the time.”

End-user approach:

“Can we design program analyses around the user? Developers write testing code. Can we adapt the analysis to use those as specifications?”


Summary of overview

Challenge in analysis: Finding a good abstraction

precise enough but not more than necessary

Powerful, generic abstractions: expensive, hard to use and understand

Built-in, default abstractions: often not precise enough (e.g., data structures)

End-user approach: Must involve the user in abstraction

without expecting the user to be a program analysis expert

Overview of contributions

Extensible Inductive Shape Analysis [POPL’08, SAS’07]

Precise inference of data structure properties
– Able to check, for instance, the locking example

Targeted to software developers
– Uses data structure checking code for guidance
– Turns testing code into a specification for static analysis

Efficient
– ~10-100x speed-up over generic approaches
– Builds abstraction out of developer-supplied checking code

Extensible Inductive Shape Analysis

Precise inference of data structure properties

End-user approach

[POPL’08, SAS’07]

Shape analysis is a fundamental analysis

Data structures are at the core of
– Traditional languages (C, C++, Java)
– Emerging web scripting languages

Improves verifiers that try to
– Eliminate resource usage bugs (locks, file handles)
– Eliminate memory errors (leaks, dangling pointers)
– Eliminate concurrency errors (data races)
– Validate developer assertions

Enables program transformations
– Compile-time garbage collection
– Data structure refactorings

Shape analysis by example: Removing duplicates

// l is a sorted doubly-linked list
for each node cur in list l {
  remove cur if duplicate;
}
assert l is sorted, doubly-linked with no duplicates;

Example/Testing (concrete lists):
– before the loop: l = 2 2 4 4
– during the loop: l = 2 4 4, with cur pointing into the list
– after the loop: l = 2 4

Code Review/Static Analysis (program-specific abstractions):
– before the loop: l is a “sorted dl list”
– during the loop: l starts a “segment with no duplicates” up to cur, followed by a “sorted dl list” (intermediate state more complicated)
– after the loop: l has “no duplicates”

Shape analysis is not yet practical

Choosing the heap abstraction difficult for precision

Traditional approaches:

TVLA [Sagiv et al.]: Parametric in low-level, analyzer-oriented predicates
+ Very general and expressive
- Hard for non-expert

Space Invader [Distefano et al.]: Built-in high-level predicates
- Hard to extend
+ No additional user effort (if precise enough)

End-user approach:

Xisa: Parametric in high-level, developer-oriented predicates
+ Extensible
+ Targeted to developers

Key insight for being developer-friendly and efficient

Utilize “run-time checking code” as specification for static analysis.

assert(sorted_dll(l,…));
for each node cur in list l {
  remove cur if duplicate;
}
assert(sorted_dll_nodup(l,…));

[Diagram: abstract states for l before the loop, during the loop (with cur), and after the loop]

checker:

dll(h, p) =
  if (h = null) then
    true
  else
    h→prev = p and dll(h→next, h)

• p specifies where prev should point

Contribution: Build the abstraction for analysis out of developer-specified checking code

Contribution: Automatically generalize checkers for complicated intermediate states
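As a rough illustration of “run-time checking code”, the dll checker can be written as an ordinary recursive function and used in assertions around the duplicate-removal loop. This is a sketch, not the tool's input language: the `Node` class and the helpers `build`, `values`, and `remove_duplicates` are hypothetical, and only the doubly-linked invariant is checked because the slide elides the parameters of `sorted_dll`.

```python
class Node:
    """Doubly-linked list node; the field names prev/next follow the slide."""

    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


def dll(h, p):
    """Checker from the slide: each node's prev points to its predecessor p."""
    if h is None:
        return True
    return h.prev is p and dll(h.next, h)


def build(vals):
    """Hypothetical helper: build a well-formed list and return its head."""
    head = last = None
    for v in vals:
        node = Node(v)
        node.prev = last
        if last is None:
            head = node
        else:
            last.next = node
        last = node
    return head


def values(l):
    """Hypothetical helper: collect the values by following next pointers."""
    out = []
    while l is not None:
        out.append(l.val)
        l = l.next
    return out


def remove_duplicates(l):
    """Unlink every node whose value equals its predecessor's (input sorted)."""
    cur = l
    while cur is not None:
        nxt = cur.next
        if cur.prev is not None and cur.prev.val == cur.val:
            cur.prev.next = cur.next
            if cur.next is not None:
                cur.next.prev = cur.prev
        cur = nxt
    return l


l = build([2, 2, 4, 4])
assert dll(l, None)   # invariant holds before the loop
remove_duplicates(l)
assert dll(l, None)   # and is preserved by the loop
```

At run time these assertions test one input; the point of the talk is that the same checker definition can serve as the specification that a static analysis verifies for all inputs.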

Our framework is …

An automated shape analysis with a precise memory abstraction based around invariant checkers.

• Extensible and targeted for developers
– Parametric in developer-supplied checkers

• Precise yet compact abstraction for efficiency
– Data structure-specific based on properties of interest to the developer

[Diagram: checkers, such as dll(h, p), feed into the shape analyzer]

Shape analysis is an abstract interpretation on abstract memory descriptions with …

Splitting of summaries
– To reflect updates precisely

And summarizing for termination

[Diagram: six abstract memory graphs over l and cur illustrating splitting and summarizing]

Outline

[Diagram: checkers (dll) → type inference on checker definitions → shape analyzer (abstract interpretation: splitting and interpreting update; summarizing); parts numbered 1 to 3]

Learn information about the checker to use it as an abstraction

Compare and contrast manual code review and our automated shape analysis

Overview: Split summaries to interpret updates precisely

Want abstract update to be “exact”, that is, to update one “concrete memory cell”.

The example at a high-level: iterate using cur changing the doubly-linked list from purple to red.

[Diagram: abstract list l with cur; split at cur; update cur purple to red]

Challenge: How does the analysis “split” summaries and know where to “split”?

“Split forward” by unfolding inductive definition

dll(h, p) =
  if (h = null) then
    true
  else
    h→prev = p and dll(h→next, h)

get: cur→next

[Diagram: the summary dll(cur, p) at cur unfolds into two cases: cur is null, or cur is a cell with prev p and next n followed by the summary dll(n, cur)]

Analysis doesn’t forget the empty case
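One unfolding step of the dll summary can be sketched symbolically as follows. The dictionary encoding of a case (pure constraints, materialized cells, remaining summaries) and the names `unfold_dll` and `read_field` are assumptions made for this illustration, not the representation used by the tool.

```python
def unfold_dll(h, p, fresh):
    """Unfold the summary dll(h, p) once into its two cases.

    fresh is a hypothetical name for the node that h→next points to.
    """
    empty = {
        "pure": [(h, "=", "null")],
        "cells": [],
        "summaries": [],
    }
    nonempty = {
        "pure": [(h, "!=", "null")],
        # The else branch of the checker materializes these two cells.
        "cells": [(h, "prev", p), (h, "next", fresh)],
        # The recursive call dll(h→next, h) remains as a summary.
        "summaries": [("dll", fresh, h)],
    }
    return [empty, nonempty]


def read_field(case, node, field):
    """Return the target of node→field if this case has materialized it."""
    for src, fld, dst in case["cells"]:
        if src == node and fld == field:
            return dst
    return None


cases = unfold_dll("cur", "p", fresh="n")
for case in cases:
    print(case["pure"], read_field(case, "cur", "next"))
```

Both cases are kept: in the empty case cur→next has no cell to read, and in the non-empty case the read of cur→next yields n with dll(n, cur) still summarizing the rest of the list.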

“Split backward” also possible and necessary

for each node cur in list l {
  remove cur if duplicate;
}
assert l is sorted, doubly-linked with no duplicates;

dll(h, p) =
  if (h = null) then
    true
  else
    h→prev = p and dll(h→next, h)

cur→prev→next = cur→next;

get: cur→prev→next

[Diagram: l reaches cur through a “dll segment”; cur has prev p and next n, followed by dll(n, cur). Unfolding the “dll segment” backward from cur exposes the cell p′ before cur.]

Technical Details:
How does the analysis do this unfolding?
Why is this unfolding allowed?
(Key: Segments are also inductively defined) [POPL’08]

How does the analysis know to do this unfolding?

Outline

[Diagram: checkers (dll) → type inference on checker definitions → shape analyzer (abstract interpretation: splitting and interpreting update; summarizing); parts numbered 1 to 3]

How do we decide where to unfold?

Derives additional information to guide unfolding

Contribution: Turns testing code into specification for static analysis

Abstract memory as graphs

Make endpoints and segments explicit, yet high-level

dll(h, p) =
  if (h = null) then
    true
  else
    h→prev = p and dll(h→next, h)

[Diagram: abstract memory graph with l at α and cur at γ. A “dll segment” runs from α, labeled dll(null), to γ, labeled dll(β). From γ, a prev edge points to β and a next edge points to δ. A checker summary dll(γ) holds at δ, that is, dll(δ, γ).]

Legend:
– node: memory address (value)
– thin edge: memory cell (points-to: γ→next = δ)
– thick edge: checker summary (inductive pred)
– thick edge between two nodes: segment summary, standing for some number of memory cells (thin edges)

Contribution: Generalization of checker
(Intuitively, dll(α, null) up to dll(γ, β).)

Which summary (thick edge), in what direction, and how far do we unfold to get the edge β→next (cur→prev→next)?

Types for deciding where to unfold

Checker Definition:

dll(h, p) =
  if (h = null) then
    true
  else
    h→prev = p and dll(h→next, h)

h : {next⟨0⟩, prev⟨0⟩}
p : {next⟨-1⟩, prev⟨-1⟩}

Says:
For h→next/h→prev, unfold from h
For p→next/p→prev, unfold before h

Checker “Run” (call tree/derivation), with levels relative to dll(γ, β):

level -2: dll(α, null)
level -1: dll(β, α)
level 0: dll(γ, β)
level 1: dll(δ, γ)
then: dll(null, δ)

Instance: null, α, β, γ, δ, null

[Diagram: the corresponding summary graph, with a segment from dll(null) at α up to dll(β) and a checker summary dll(β) at γ]

If it exists, where is:
γ→next ? level 0
β→next ? level -1

Types make the analysis robust with respect to how checkers are written

Different types for different unfolding

Doubly-linked list checker (as before):

dll(h, p) =
  if (h = null) then
    true
  else
    h→prev = p and dll(h→next, h)

h : {next⟨0⟩, prev⟨0⟩}
p : {next⟨-1⟩, prev⟨-1⟩}

Alternative doubly-linked list checker:

dll′(h) =
  if (h→next = null) then
    true
  else
    h→next→prev = h and dll′(h→next)

h : {next⟨0⟩, prev⟨-1⟩}

γ→prev ? level -1

[Diagram: for each checker, an instance (α, β, γ, null) and the corresponding summary around β and γ]


Summary of checker parameter types

Tell where to unfold for which fields

Make analysis robust with respect to how checkers are written

Learn where in summaries unfolding won’t help

Can be inferred automatically with a fixed-point computation on the checker definitions
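To make the fixed-point idea concrete, here is a toy sketch that recovers the parameter types shown on the earlier slides for the two doubly-linked list checkers. It is a strong simplification under stated assumptions: the checker encoding (`params`, `derefs`, `rec_args`) and the function `infer_levels` are hypothetical, only a single recursive call is modeled, and the inference described in the talk and papers is more general.

```python
def infer_levels(params, derefs, rec_args):
    """Infer, per checker parameter, the level at which each field is exposed.

    params:   tuple of parameter names, e.g. ("h", "p")
    derefs:   set of (path, field) pairs read in one unfolding; a path is a
              tuple such as ("h",) for h or ("h", "next") for h→next
    rec_args: tuple of paths passed to the recursive call, one per parameter
    """
    types = {p: {} for p in params}
    # Fields read directly off a parameter are exposed by the current
    # unfolding: level 0.
    for path, field in derefs:
        if len(path) == 1 and path[0] in types:
            types[path[0]][field] = 0

    changed = True
    while changed:  # iterate until no type changes (fixed point)
        changed = False
        for callee_param, arg in zip(params, rec_args):
            # Fields of the argument known in the caller's unfolding.
            known = {field: 0 for path, field in derefs if path == arg}
            if len(arg) == 1 and arg[0] in types:
                for field, level in types[arg[0]].items():
                    known[field] = max(level, known.get(field, level))
            # Seen from the callee, the caller's unfolding is one level back.
            for field, level in known.items():
                shifted = level - 1
                if types[callee_param].get(field, shifted - 1) < shifted:
                    types[callee_param][field] = shifted
                    changed = True
    return types


# dll(h, p): reads h→prev and h→next, recursive call dll(h→next, h)
dll_types = infer_levels(
    ("h", "p"),
    {(("h",), "prev"), (("h",), "next")},
    (("h", "next"), ("h",)),
)

# dll′(h): reads h→next and h→next→prev, recursive call dll′(h→next)
dll_alt_types = infer_levels(
    ("h",),
    {(("h",), "next"), (("h", "next"), "prev")},
    (("h", "next"),),
)

print(dll_types)
print(dll_alt_types)
```

In this encoding the levels can only increase toward 0 after they are first set, so the loop terminates; the result says, for example, that p→next is exposed by the unfolding one level before the one that binds p.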

Summary of interpreting updates

Splitting of summaries needed for precision

Unfolding checkers is a natural way to do splitting
– When checker traversal matches code traversal

Checker parameter types
– Enable, for example, “back pointer” traversal without blindly guessing where to unfold

Outline

[Diagram: checkers (dll) → type inference on checker definitions → shape analyzer (abstract interpretation: splitting and interpreting update; summarizing); parts numbered 1 to 3]

Summarize by folding into inductive predicates

last = l;
cur = l→next;
while (cur != null) {
  // … cur, last …
  if (…) last = cur;
  cur = cur→next;
}

[Diagram: abstract graphs after successive iterations, in which l reaches last and cur through a growing chain of next edges followed by a list summary. Summarizing folds the chain between l and last into a list segment and keeps the next edge from last to cur.]

Challenge: Precision (e.g., last, cur separated by at least one step)

Previous approaches guess where to fold for each graph.

Contribution: Determine where by comparing graphs across history
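A small sketch can show what folding does to a graph. Note that it implements a simple per-graph heuristic (fold through nodes that no program variable names), which is the kind of guess the slide contrasts against, and not the history-guided approach named as the contribution. The function `summarize` and its list-based graph encoding are hypothetical.

```python
def summarize(chain, named):
    """Fold a chain of next edges into list-segment summaries.

    chain: node names in next order, e.g. ["a", "b", "c", "d"]
    named: set of nodes that some program variable points to; the first
           node of the chain is assumed to be named
    Returns edges as (kind, src, dst) with kind "next" or "lseg".
    """
    kept = [(i, node) for i, node in enumerate(chain) if node in named]
    edges = []
    for (i, src), (j, dst) in zip(kept, kept[1:]):
        # Adjacent named nodes keep their exact next edge; anything longer
        # is folded into a list-segment summary.
        kind = "next" if j - i == 1 else "lseg"
        edges.append((kind, src, dst))
    return edges


# l points to a, last points to c, cur points to d; b is unnamed.
print(summarize(["a", "b", "c", "d"], {"a", "c", "d"}))
```

In this example the unnamed node between l and last disappears into a segment, while the single next edge from last to cur survives, which is the kind of precision the slide says must be preserved.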

Summary: Given checkers, everything is automatic

[Diagram: checkers (dll) → type inference on checker definitions → shape analyzer (abstract interpretation: splitting and interpreting update; summarizing)]

Results: Performance

Benchmark                                   Max. Num. Graphs at a Program Pt   Analysis Time (ms)
singly-linked list reverse                  1                                  0.6
doubly-linked list reverse                  1                                  1.4
doubly-linked list copy                     2                                  5.3
doubly-linked list remove                   5                                  6.5
doubly-linked list remove and back          5                                  6.8
search tree with parent insert              5                                  8.3
search tree with parent insert and back     5                                  47.0
two-level skip list rebalance               6                                  87.0
Linux scull driver (894 loc)                4                                  9710.0
  (char arrays ignored, functions inlined)

Verified shape invariant as given by the checker is preserved across the operation.

Times negligible for data structure operations (often in sec or 1/10 sec)

Expressiveness: Different data structures

Comparison annotations on the slide:
– TVLA: 850 ms
– TVLA: 290 ms
– Space Invader only analyzes lists (built-in)

Demo: Doubly-linked list reversal

http://xisa.cs.berkeley.edu

Body of loop over the elements: Swaps the next and prev fields of curr.

[Demo display: already reversed segment; node whose next and prev fields were swapped; not yet reversed list]
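For readers without access to the demo, the reversal loop it describes might look like the following. This is a sketch of the program being analyzed, not of the analyzer or the demo's actual source: the `Node` class and the helpers `build`, `values`, and `dll` are hypothetical, with `dll` written after the checker shown earlier in the talk.

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


def build(vals):
    """Hypothetical helper: build a well-formed list and return its head."""
    head = last = None
    for v in vals:
        node = Node(v)
        node.prev = last
        if last is None:
            head = node
        else:
            last.next = node
        last = node
    return head


def values(l):
    """Hypothetical helper: collect the values by following next pointers."""
    out = []
    while l is not None:
        out.append(l.val)
        l = l.next
    return out


def dll(h, p):
    """Doubly-linked list checker: each node's prev points to p."""
    if h is None:
        return True
    return h.prev is p and dll(h.next, h)


def reverse_dll(head):
    """Reverse in place by swapping the next and prev fields of each node."""
    curr = head
    new_head = None
    while curr is not None:
        curr.prev, curr.next = curr.next, curr.prev
        new_head = curr
        curr = curr.prev  # after the swap, prev holds the old next
    return new_head


reversed_list = reverse_dll(build([1, 2, 3]))
assert dll(reversed_list, None)  # the shape invariant is preserved
print(values(reversed_list))
```

During the loop the heap matches the three regions listed on the slide: nodes behind curr are already reversed, curr has just had its fields swapped, and the nodes ahead are not yet reversed.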

Experience with the tool

Checkers are easy to write and try out
– Enlightening (e.g., red-black tree checker in 6 lines)
– Harder to “reverse engineer” for someone else’s code
– Default checkers based on types useful

Future expressiveness and usability improvements
– Pointer arithmetic and arrays
– More generic checkers:
  polymorphic: “element kind unspecified”
  higher-order: parameterized by other predicates

Future evaluation: user study

Summary of Extensible Inductive Shape Analysis

Key Insight: Checkers as specifications
– Developer View: Global, Expressed in a familiar style
– Analysis View: Capture developer intent, Not arbitrary inductive definitions

Constructing the program analysis
– Intermediate states: Generalized segment predicates, e.g., a segment from α to β labeled c(γ) up to c′(γ′)
– Splitting: Checker parameter types with levels, e.g., h : {next⟨0⟩, prev⟨0⟩}, p : {next⟨-1⟩, prev⟨-1⟩}
– Summarizing: History-guided approach

Conclusion

Extensible Inductive Shape Analysis: precision demanding program analysis improved by novel user interaction

Developer: Gets results corresponding to intuition

Analysis: Focused on what’s important to the developer

Practical precise tools for better software with an end-user approach!

What can inductive shape analysis do for you?

http://xisa.cs.berkeley.edu

