USC Information Sciences Institute

Pedro C. Diniz

University of Southern California / Information Sciences Institute
4676 Admiralty Way, Suite 1001
Marina del Rey, California 90292

[email protected]

Increasing the Accuracy of Shape and Safety Analysis for Pointer-based Codes*

* This work is partly funded by the National Science Foundation (NSF) under award number CCR-0209228.

Introduction and Motivation

Static Shape Analysis
• Understand Topological Properties of Data Structures
  • Tree, DAG, Graph
  • Topology Induced by a Subset of the Pointer Fields

Focus:
• C Codes that Allocate Memory via malloc/free Functions
• Traverse and Change Data Structure through Pointers

Applications:
• Redundant Load/Store Elimination
• Instruction Scheduling
• Parallelization
• “Bug” Finding

Basic Approach

Abstract Interpretation
• Execute Each Statement: Materialize & Abstract
• Fix-Point
• Invariants: {next.prev, prev.next}
• Abstract Storage Graph (ASG)

Loss of Accuracy
• Need for Summarization
• Abstract After Each Statement
• Ignores Control Flow Predicates
• Ignores Node “Configurations”

[Figure: control-flow graph built from statement (stat) and condition (cond) nodes]

Example

1: t = NULL;
2: while(p != NULL){
3:   if (p->data < item)
4:     break;
5:   t = p;
6:   p = p->next;
7: }

Observations:
1. Loop Does not Modify Data Structure
2. “Scan” structure along “next”
3. If body executes (stats 5 & 6) then on exit (t != NULL)
4. AND (p == t->next) holds
5. On exit a few “contexts” hold
6. This loop is “safe”, i.e. no null pointer is ever dereferenced.
7. The loop terminates:
   • Iff the structure is acyclic along “next”, when it exits from stat 2
   • Acyclicity is only a sufficient condition when it exits from stat 4

Contexts on exit:
Context #1: t == NULL; p == NULL
Context #2: t == NULL; p != NULL; p->data < item
Context #3: t != NULL; p != NULL; p->data < item; p == t->next; p == p->next(k)
Context #4: t != NULL; p == NULL; p->data < item; p == t->next; p == p->next(k)
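
The observations above can be checked on concrete lists. The following is a minimal sketch, assuming a hypothetical `struct node` with only the `data` and `next` fields the slide uses; the names `scan` and `enum exit_context` are illustrative and not part of the original analysis. It runs the scan loop and classifies the exit state by the values of `t` and `p` alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the slide only shows the fields data and next. */
struct node {
    int data;
    struct node *next;
};

/* Exit contexts of the scan loop, numbered as on the slide. */
enum exit_context {
    CTX_1 = 1, /* t == NULL, p == NULL: zero-trip, empty list           */
    CTX_2,     /* t == NULL, p != NULL: zero-trip, break on first node  */
    CTX_3,     /* t != NULL, p != NULL: body ran, loop left via break   */
    CTX_4      /* t != NULL, p == NULL: body ran, loop ran off the end  */
};

/* Runs the scan loop and reports which context holds on exit. */
enum exit_context scan(struct node *p, int item,
                       struct node **t_out, struct node **p_out)
{
    struct node *t = NULL;
    while (p != NULL) {
        if (p->data < item)
            break;
        t = p;
        p = p->next;
    }
    /* Observations 3 and 4: if the body ran, t != NULL and p == t->next. */
    if (t != NULL)
        assert(p == t->next);
    *t_out = t;
    *p_out = p;
    if (t == NULL)
        return p == NULL ? CTX_1 : CTX_2;
    return p != NULL ? CTX_3 : CTX_4;
}
```

Calling `scan` on an empty list yields Context #1, and on a list whose first node already satisfies `p->data < item` it yields Context #2; the two remaining contexts require at least one executed loop body.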

Example

if(p->next != NULL){
  p->next->prev = temp;
  temp->next = p->next;
  p->next = temp;
  temp->prev = p;
}

Observations:
1. Modification for a node s.t. p->next != NULL
2. Need to know relation between temp and p

[Figure: ASG with nodes labeled p and temp]
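
To illustrate why the relation between `temp` and `p` matters, here is a minimal sketch, assuming a hypothetical `struct dnode` with only `next` and `prev` fields. The helper `links_consistent` is an illustrative runtime check of the `next.prev` / `prev.next` invariants and is not the static analysis itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical doubly linked node; the slide only uses next and prev. */
struct dnode {
    struct dnode *next;
    struct dnode *prev;
};

/* Checks the invariants n->next->prev == n and n->prev->next == n. */
int links_consistent(const struct dnode *n)
{
    if (n->next != NULL && n->next->prev != n) return 0;
    if (n->prev != NULL && n->prev->next != n) return 0;
    return 1;
}

/* The update from the slide: splice temp in after p, but only when p
   has a successor. The statements are unchanged from the example. */
void insert_after(struct dnode *p, struct dnode *temp)
{
    if (p->next != NULL) {
        p->next->prev = temp;
        temp->next = p->next;
        p->next = temp;
        temp->prev = p;
    }
}
```

When `temp` is a fresh node the invariants hold for `p`, `temp`, and the old successor after the update. When `temp` aliases `p->next`, the same four statements leave `temp->next == temp`, a self-loop, so the outcome depends on exactly the relation the second observation asks about.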

What’s the Point?

Programmers Fundamentally Encode "State" via Conditionals and Loop Constructs

A Typical Programming Style is to Use
• Loop constructs to scan the structures to position pointer variables at nodes that should be modified.
• Conditional statements to define which operations should be performed.

Shape Analysis and Safety Algorithms should Exploit the Information Conveyed in these Statements.

Basic Analysis

Structural Fields & Node Configurations

Scan Loops

Assumed/Verified Properties

Context Tracing

Scan Loops

Typical Scan Loops are Short!
• Read-Only Heap Pointer Values
• Use Stack/Global Variables

Symbolic Pointer Analysis
• Symbolically Execute Loop Statements for
  • Zero-trip
  • Multi-trips
  • Relationships between Pointers on Exits
• Symbolic Value Numbering (iteration-based) Across All Loop Internal Paths
• Reach Closed-form Expressions (see HN90)
  • Convert Loop into a Multi-way Statement

Scan Loop and Tracing Contexts

t = NULL;
while(p != NULL){
  if(p->data < item)
    break;
  t = p;
  p = p->next;
}

C0 = {t -> t(0), p -> p(0)}

C1 = {t -> NULL, p -> p(0)}

Symbolic Loop Transfer Function

T(i,i+1) = {
  t(i+1) -> p(i);
  p(i+1) -> p(i)->next;
  p(i+1) == t(i+1)->next;
  p(i) != NULL;
  t(i+1) != NULL;
}

T(0,i+1) = {
  t -> t(i+1);
  p -> p(i+1);
  t(i+1) = p(0)(->next)^i;
  p(i+1) = p(0)(->next)^(i+1);
  p(i+1) = t(i+1)->next;
  p(i) != NULL;
  t(i+1) != NULL;
}

[Figure: control-flow graph of the loop with nodes t = NULL; p != NULL; p->data < item; t = p; p = p->next; the two loop exits are labeled with contexts C2, C3 and C4, C5]
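
The closed forms in T(0,i+1) can be spot-checked at run time. This is a minimal sketch, assuming a hypothetical `struct node` and illustrative helper names `follow_next` and `scan_and_check`; it counts completed loop bodies and asserts that `t` and `p` equal the initial `p` followed by the expected number of `next` links.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the slide only shows the fields data and next. */
struct node {
    int data;
    struct node *next;
};

/* Follows next k times from p0 (stops early at NULL): the closed form
   written p(0)(->next)^k in the transfer function. */
struct node *follow_next(struct node *p0, unsigned k)
{
    for (unsigned i = 0; i < k && p0 != NULL; i++)
        p0 = p0->next;
    return p0;
}

/* Runs the scan loop while counting completed bodies (n plays the role
   of i + 1), then checks the closed forms. Returns n. */
unsigned scan_and_check(struct node *p0, int item)
{
    struct node *t = NULL, *p = p0;
    unsigned n = 0;
    while (p != NULL) {
        if (p->data < item)
            break;
        t = p;
        p = p->next;
        n++;
    }
    if (n > 0) {
        /* Multi-trip: t = p(0)(->next)^(n-1), p = p(0)(->next)^n. */
        assert(t == follow_next(p0, n - 1));
        assert(p == follow_next(p0, n));
        assert(t != NULL && p == t->next);
    } else {
        /* Zero-trip: context C1 is unchanged. */
        assert(t == NULL && p == p0);
    }
    return n;
}
```

The zero-trip and multi-trip branches of the final check correspond to the two cases the symbolic execution distinguishes; a static analysis would derive these relations once, rather than test them per run.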

Contexts On Exit of Scan Loop

Zero-Trip, Exit #1:
C2 = {
  t -> NULL, p -> p(0)
  t = NULL
  p = NULL
}

Zero-Trip, Exit #2:
C4 = {
  t -> NULL, p -> p(0)
  t = NULL
  p != NULL
}

Multi-Trip, Exit #1:
C3 = {
  t -> t(i+1), p -> p(i+1)
  p(i+1) = NULL
  t(i+1) = p(0)(->next)^i
  p(i+1) = p(0)(->next)^(i+1)
  p(i+1) = t(i+1)->next
  t(i+1) != NULL
}

Multi-Trip, Exit #2:
C5 = {
  t -> t(i+1), p -> p(i+1)
  p(i+1) != NULL
  t(i+1) = p(0)(->next)^i
  p(i+1) = p(0)(->next)^(i+1)
  p(i+1) = t(i+1)->next
  t(i+1) != NULL
}

Why Are Contexts Important?

Establish Symbolic Pointer/Values Relationships
• Allow Analyses to Discriminate Between “Nodes” of an Abstract Shape Representation for Increased Accuracy
• Identify Potential Non-Trivial “Bugs”

t = NULL;
while(p != NULL){
  if(p->data < item)
    break;
  t = p;
  p = p->next;
}

if(t != NULL){
  stat; // with p = t->next
}

if(p != t->next){
  t->next = NULL;
}
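
A typical client of this scan loop is an ordered insertion, sketched below under stated assumptions: `struct node` and `insert_sorted` are hypothetical names, and the list is assumed to be kept in non-increasing `data` order so that the loop's break condition marks the insertion point. The sketch shows how the exit contexts decide which statements are legal after the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the slide only shows the fields data and next. */
struct node {
    int data;
    struct node *next;
};

/* Inserts n before the first node whose data is smaller than n->data.
   Returns the (possibly new) head of the list. */
struct node *insert_sorted(struct node *head, struct node *n)
{
    struct node *t = NULL, *p = head;
    while (p != NULL) {
        if (p->data < n->data)
            break;
        t = p;
        p = p->next;
    }
    n->next = p;
    if (t != NULL) {
        /* Multi-trip contexts: p == t->next is known to hold, so n is
           spliced between t and p. */
        assert(p == t->next);
        t->next = n;
        return head;
    }
    /* Zero-trip contexts: t == NULL, so t->next must not be evaluated. */
    return n;
}
```

On one reading of the slide, the second snippet, `if(p != t->next)`, is the kind of non-trivial bug the contexts expose: it evaluates `t->next` without a guard, which is unsafe in the zero-trip contexts where `t` is NULL, and in the multi-trip contexts its condition can never be true.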

Termination

Derive Sufficient Termination Conditions
• Look at Loop Transfer Function(s)
• Exit Predicates

T(i,i+1) = {
  t(i+1) -> p(i),
  p(i+1) -> p(i)->next
}

Predicates:
p != NULL
p->data < item

Acyclic(next) = TRUE
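
The assumed property Acyclic(next) can also be tested dynamically. The sketch below uses Floyd's tortoise-and-hare cycle detection, which is a swapped-in runtime technique and not the static derivation described on the slide; `struct node` and `acyclic_next` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the slide only shows the fields data and next. */
struct node {
    int data;
    struct node *next;
};

/* Returns 1 if following next from head reaches NULL, 0 if the chain
   enters a cycle. Uses Floyd's two-pointer cycle detection. */
int acyclic_next(const struct node *head)
{
    const struct node *slow = head, *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;        /* advances one link  */
        fast = fast->next->next;  /* advances two links */
        if (slow == fast)
            return 0;             /* pointers met inside a cycle */
    }
    return 1;                     /* fast reached the end of the chain */
}
```

If `acyclic_next` returns 1, repeated application of `p = p->next` must reach NULL, so the scan loop terminates through the `p != NULL` test even when the break is never taken.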

Safety (non-Nil Dereferencing)

Examine Contexts
• Check Whether Predicates Ensure a Safe Dereference
• If Not, Derive (Minimal) Predicates that Do

[Figure: control-flow graph of the loop (t = NULL; p != NULL; p->data < item; t = p; p = p->next; exits labeled C2, C3 and C4, C5) annotated with the predicates {p(i) != NULL}, {p(i) != NULL}, {t(i+1) != NULL; p(i+1) ?} and the query p->next != NULL ?]
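
As a small illustration of deriving a guard, the sketch below dereferences `p` after the loop. It assumes a hypothetical `struct node` and an illustrative function name `after_scan_position`. Inside the loop every `p->` is covered by the loop test; after the loop `p` may be NULL, so the access needs the predicate `p != NULL`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the slide only shows the fields data and next. */
struct node {
    int data;
    struct node *next;
};

/* Returns the node after the position where the scan stopped, or NULL
   when the scan ran off the end of the list. */
struct node *after_scan_position(struct node *p, int item)
{
    while (p != NULL) {          /* guards every p-> inside the body */
        if (p->data < item)
            break;
        p = p->next;
    }
    /* On the exit through the loop test p is NULL, so p->next is only
       safe under the derived guard p != NULL. */
    return (p != NULL) ? p->next : NULL;
}
```

The guard is the weakest one that makes the access safe here: it holds on the exit through `break` and fails only on the exit through the loop test.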

How Frequent are Scan Loops?

Program    Lines  #Loops  #PtrLoops  #ScanLoops
bintree      200       5          1           1
em3d         148      11          6           5
hash          96       7          3           3
blocks2      560      45         17           6
chomp        298      24         10           4
sparse      1170      89         76          56
graphics     686      28         14           4
paraffins    166      17          8           2
nbody        808      13         11           2
pug         2958      77         39          26

Putting the Pieces Together

[Figure: flowchart with the following boxes and labels]
• Coarse-Grain Shape Analysis (GH:POPL96)
• Scan Loop
• Termination & Safety
• Context Tracing
• Fine-Grain Shape Analysis
• Use Results from Coarse-Grain Analysis
• Abstract Storage Graph (ASG)
• Properties Hold: YES / NO
• Assumed Properties
• Use Results from Fine-Grain Shape Analysis

Related Work

Shape Analysis
• LH88:PLDI88, CWZ90:PLDI90
• PKC93:LCPC93
• Deutsch94:PLDI94
• SRW98:TOPLAS98, POPL99
• HHN94:IPPS94, HHN94:PLDI94, GH96:POPL96
• CAZ:LCPC01
• KR:POPL02

(Static) Safety Analysis
• Colby97:LoyolaUnivTechRep97
• Evans96:PLDI96
• DRS98:PASTE98

Program Checking
• NL98:PLDI98
• Ball:PLDI01

Summary

Symbolic Analyses
• Structural Fields and Node Configurations
• Scan Loops
• Assumed and Verified Properties for Termination
• Context Tracing for Accurate Pointer Relationships

Thesis:

In order to increase the accuracy of shape and safety analysis algorithms, compilers must uncover and exploit the knowledge encoded in conditional statements.

