Department of Computer Sciences
Dynamic Shape Analysis via Degree Metrics
Maria Jump & Kathryn S. McKinley
Department of Computer Sciences
The University of Texas at Austin
{mjump,mckinley}@cs.utexas.edu
20-Jun-2009 Jump & McKinley
Software is Dynamic
Software is always changing
  New expectations
  New algorithms
  New applications
  New users
  New notions of what is cool
Limited only by human ambition
First Law of Software: Software is a gas! It expands to fit the container it is in!
[Nathan Myhrvold, Former CTO, Microsoft]
Software Complexity
Developed by large teams
Few, if any, understand all parts
Need tools to help
Property of a system that is directly proportional to the difficulty one has in comprehending the system at a level and detail necessary to make changes without introducing instability or functional regressions
[Peter Rosser, Microsoft Software Engineer]
Program Analysis
Program = Code + Data
Software complexity leads to heap complexity
Heap Complexity
Exacerbated by modern languages
  Objects are smaller and more numerous
  Most objects allocated on the heap
Objects encode program state
Heap contains semantic, memory, and concurrency bugs
Program analysis is incomplete without heap analysis
An Opportunity
Automatic memory management examines live data on a regular basis
  Understand heap usage
  Detect heap-based bugs
  Optimize program based on heap usage
Opportunity to perform heap analysis dynamically
Shape Analysis
Characterize the “shape” of heap-allocated pointer-based data structures and verify shape-preserving properties
Shape Analysis
Program = Code + Data
Static
  Conservatively characterizes all “possible” shapes at every program point
  Only works on small programs
Dynamic
  Discover shape of current data structure
  Generate assertions for verification
  Monitor shape during entire execution
Shape Analysis
[Figure: bar chart of % of heap (0 to 100) held in home-brewed (custom) versus library data structures for compress, jess, raytrace, javac, mpegaudio, jack, antlr, bloat, chart, eclipse, fop, jython, luindex, lusearch, pmd, xalan, and the average. Callouts: LinkedHashMap: 99.3%, OctTree: 98%, BinaryTree: 35.8%, HashMap: 59.5%. Summary labels: Custom: 33%, Library: 58%.]
ShapeUp
Desired goals
  Discover shape of current data structure
  Discover dynamic invariants
  Monitor expected shape of current data structure during execution
Heap summarization graph [POPL ‘07]
  Class field-wise graph (CFWG)
  Identifies recursive data structures (RDS)
  Summarize dynamic degree invariants
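The slide names the class field-wise graph but does not define it, so the following is only a minimal sketch under stated assumptions: the graph is taken to have one edge per (source class, field, target class) triple observed in the heap, and a class is flagged as a recursive data structure when it has a field that refers to its own class. The names Node, class_field_wise_graph, and recursive_classes are hypothetical and are not taken from ShapeUp.

```python
from collections import Counter

class Node:
    """Hypothetical list node: 'item' is a payload, 'next' is another Node."""
    def __init__(self, item):
        self.item = item
        self.next = None

def class_field_wise_graph(roots):
    """Count (source class, field, target class) edges over reachable objects.

    Assumption: this triple-based summary stands in for the CFWG.
    """
    edges = Counter()
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        # Skip visited objects and objects without instance fields (e.g. str).
        if id(obj) in seen or not hasattr(obj, "__dict__"):
            continue
        seen.add(id(obj))
        for field, target in vars(obj).items():
            if target is None:
                continue
            edges[(type(obj).__name__, field, type(target).__name__)] += 1
            stack.append(target)
    return edges

def recursive_classes(edges):
    """Classes with a direct self-edge (assumed RDS criterion)."""
    return {src for (src, _field, dst) in edges if src == dst}

# Three-node singly-linked list: a -> b -> c
a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.next = b, c
edges = class_field_wise_graph([a])
print(edges)
print(recursive_classes(edges))
```

The summary stays proportional to the number of classes and fields rather than the number of objects, which is one plausible reason a class-level graph can be kept with low overhead.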
Class Field-Wise Graph (CFWG)
[Figure: a heap and its corresponding CFWG for a home-brewed data structure from SPECjbb2000 (simplified)]
Degree Metrics
Degree metrics per instance:
  Out-degree: # of outgoing references
  In-degree: # of incoming references
  Only count edges between objects of same class
  Use word in header to track in-degree of backbone objects
Summarize by class per data structure instance in a degree profile
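As a rough illustration of the definitions above, the sketch below counts in-degree and out-degree per object, using only edges between objects of the same class, and then summarizes them as a histogram (how many objects have degree 0, 1, 2, and so on). The histogram form of the degree profile and the names Cell and degree_profile are assumptions for illustration; the header-word trick for tracking in-degree inside a collector is not modeled here.

```python
from collections import Counter

class Cell:
    """Hypothetical singly-linked list cell."""
    def __init__(self):
        self.next = None

def degree_profile(nodes, successors):
    """Histogram of in/out-degree over same-class edges.

    nodes: objects of one class belonging to one data structure instance.
    successors(n): iterable of objects referenced by n (None allowed).
    """
    members = {id(n) for n in nodes}
    in_deg = Counter()
    out_deg = {}
    for n in nodes:
        # Only edges whose target is in the same set of same-class objects count.
        targets = [t for t in successors(n) if t is not None and id(t) in members]
        out_deg[id(n)] = len(targets)
        for t in targets:
            in_deg[id(t)] += 1
    return {
        "in": dict(Counter(in_deg[id(n)] for n in nodes)),
        "out": dict(Counter(out_deg[id(n)] for n in nodes)),
    }

# Singly-linked list with 4 cells.
cells = [Cell() for _ in range(4)]
for a, b in zip(cells, cells[1:]):
    a.next = b
profile = degree_profile(cells, lambda c: [c.next])
print(profile)
```

For this list, one cell (the head) has in-degree 0 and one cell (the tail) has out-degree 0, while the remaining n-1 entries have degree 1.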
Calculating Degree Profile
[Animation: the degree profile is built up over the CFWG as objects are visited one at a time, n = 1 through n = 15, for a Complete Binary Tree with left and right fields. In the final profile each entry is marked as either a constant or a range.]
Interpreting Degree Profile
Monitoring degree profile shows errors:
  Error: constant violation
  Warning: range violation [min-,max+]
Indicate problem with RDS during development and testing
Monitor shape after deployment for dynamically introduced errors
What kind of errors can be detected?
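One way to picture the monitoring step is the sketch below, which compares an observed profile against learned invariants and reports an error for a constant violation and a warning for a range violation. The dictionary layout, the key names such as "out=1", the numeric values, and the function check_profile are all hypothetical; any slack implied by the [min-,max+] notation is not modeled.

```python
def check_profile(observed, invariants):
    """Return (severity, key, value) for each violated invariant.

    invariants maps a profile key to ("constant", v) or ("range", lo, hi).
    """
    findings = []
    for key, inv in invariants.items():
        value = observed.get(key, 0)
        if inv[0] == "constant" and value != inv[1]:
            findings.append(("error", key, value))
        elif inv[0] == "range" and not (inv[1] <= value <= inv[2]):
            findings.append(("warning", key, value))
    return findings

# Illustrative invariants for the out-degree of a binary tree (percent of nodes).
invariants = {
    "out=0": ("range", 50.0, 50.2),
    "out=1": ("constant", 0),
    "out=2": ("range", 49.8, 50.0),
}
observed = {"out=0": 50.0, "out=1": 1.0, "out=2": 49.0}
findings = check_profile(observed, invariants)
print(findings)
```

Treating constants as hard errors and ranges as warnings mirrors the two severities listed on the slide.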
Implementation and Methodology
Jikes RVM
SPECjvm98, DaCapo, SPECjbb2000
  Heap composition
  Showed degree metrics are stable by class
  Overheads
    <8% total time
    <1% space overhead [used bits in header]
Microbenchmarks
  Single RDS in isolation
  Random error injection
  Singly-linked list
  Doubly-linked list
  Binary Tree
  Binary Tree w/ PP
Looking for errors
TRAINING
  Ran microbenchmark 100 times
  Merged together dynamic invariants
ERROR DETECTION
  Created RDS with 100,000 nodes
  Errors: 1, 2, 3, 4, 5, 10, 50, and 100
  Detected dynamic invariant violations
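The slide does not say how invariants from the training runs are merged, so the following sketch assumes the simplest rule: a profile entry that takes the same value in every run becomes a constant, and any other entry becomes a [min, max] range. The function merge_invariants and the sample values are hypothetical.

```python
def merge_invariants(runs):
    """Merge per-run profiles into constant or range invariants.

    runs: list of dicts mapping a profile key to its observed value.
    Assumption: identical values across runs give a constant, otherwise a range.
    """
    keys = set().union(*runs)
    merged = {}
    for key in sorted(keys):
        values = [run.get(key, 0) for run in runs]
        lo, hi = min(values), max(values)
        merged[key] = ("constant", lo) if lo == hi else ("range", lo, hi)
    return merged

# Two illustrative training runs (values are percent of nodes).
runs = [
    {"out=0": 50.0, "out=1": 0, "out=2": 50.0},
    {"out=0": 50.2, "out=1": 0, "out=2": 49.8},
]
merged = merge_invariants(runs)
print(merged)
```

With more training runs the ranges can only widen, which reduces false warnings at the cost of missing small deviations.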
Doubly-linked List
        =0    =1    =2
in      0     2     n-2
out     0     2     n-2
Injected errors: circle, cyclic, disconnect, skip
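The profile in the table can be reproduced with a small model, sketched below. The "circle" injection here borrows the definition given for the singly-linked list (attach head to tail) and applies it by pointing the tail's next field at the head; that reading, along with the names DNode, make_dll, and profile, is an assumption.

```python
from collections import Counter

class DNode:
    """Hypothetical doubly-linked list node."""
    def __init__(self):
        self.prev = None
        self.next = None

def make_dll(n):
    nodes = [DNode() for _ in range(n)]
    for a, b in zip(nodes, nodes[1:]):
        a.next, b.prev = b, a
    return nodes

def profile(nodes):
    """Return (in-degree histogram, out-degree histogram)."""
    in_deg = Counter()
    out_hist = Counter()
    for node in nodes:
        targets = [t for t in (node.prev, node.next) if t is not None]
        out_hist[len(targets)] += 1
        for t in targets:
            in_deg[id(t)] += 1
    in_hist = Counter(in_deg[id(node)] for node in nodes)
    return dict(in_hist), dict(out_hist)

nodes = make_dll(6)
clean = profile(nodes)        # 2 nodes of degree 1, n-2 nodes of degree 2
nodes[-1].next = nodes[0]     # assumed "circle" error: tail now points at head
broken = profile(nodes)
print(clean, broken)
```

After the injection only one node has degree 1 instead of two, so the constant entry in the =1 column is violated.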
Doubly-linked List
[Figure: % detection (0 to 100) for Circle, Disconnect, Cyclic, Skip, and Random errors, with 1, 2, 3, 4, 5, 10, 50, and 100 injected errors]
Binary Tree
                        =0            =1            =2
Complete Binary Tree
  in                    1             n-1           0
  out                   [50.0,50.2]   0             [49.8,50.0]
Full Binary Tree
  in                    1             n-1           0
  out                   [50.0,50.1]   0             [49.9,50.0]
Random Binary Tree
  in                    1             n-1           0
  out                   [33.6,35.7]   [28.7,35.1]   [32.3,35.6]
Injected error: link
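To see where the roughly 50/50 split in the out-degree rows comes from, the sketch below models a complete binary tree with array-style indexing (children of node i are 2i+1 and 2i+2) and tallies degrees. The indexing scheme, the choice of n = 1023, and the function name complete_tree_profile are assumptions for illustration, not the microbenchmark itself.

```python
def complete_tree_profile(n):
    """In/out-degree histograms for a complete binary tree of n nodes.

    Model assumption: node i has children 2*i + 1 and 2*i + 2 when they are < n.
    """
    in_hist = {0: 0, 1: 0, 2: 0}
    out_hist = {0: 0, 1: 0, 2: 0}
    in_deg = [0] * n
    for i in range(n):
        children = [c for c in (2 * i + 1, 2 * i + 2) if c < n]
        out_hist[len(children)] += 1
        for c in children:
            in_deg[c] += 1
    for d in in_deg:
        in_hist[d] += 1
    return in_hist, out_hist

n = 1023
in_hist, out_hist = complete_tree_profile(n)
out_percent = {d: round(100.0 * count / n, 2) for d, count in out_hist.items()}
print(in_hist, out_hist, out_percent)
```

Only the root has in-degree 0, every other node has in-degree 1, and the leaves and two-child nodes each make up about half of the tree.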
Binary Tree
[Figure: % detection (0 to 100) for BTComplete, BTFull, and BTRandom, with 1, 2, 3, 4, 5, 10, 50, and 100 injected errors]
Binary Tree w/ PP
                              =1            =2            =3
Complete Binary Tree w/ PP
  in                          [50.0,50.0]   2             [49.9,50.0]
  out                         [50.0,50.0]   2             [49.9,50.0]
Full Binary Tree w/ PP
  in                          [50.0,50.1]   1             [49.9,50.0]
  out                         [50.0,50.1]   1             [49.9,50.0]
Random Binary Tree w/ PP
  in                          [33.8,35.2]   [30.0,32.6]   [33.6,34.8]
  out                         [33.8,35.2]   [30.0,32.6]   [33.6,34.8]
Injected errors: link, disconnect
Microbenchmark: Binary Tree w/ PP
[Figure: % detection (0 to 100) for BTPComplete, BTPFull, and BTPRandom under Link, Disconnect, and Random errors, with 1, 2, 3, 4, 5, 10, 50, and 100 injected errors]
ShapeUp’s Contributions
Performs dynamic heap analysis to determine shape of current data structure
Summarizes degree metrics using a class field-wise graph with low overheads
Shows whole-heap degree metrics are not stable, but class degree metrics are
Introduces degree profiles for RDS
  Regular RDS have more invariants than random RDS
  Degree profiles can detect some shape errors
Thank You!
Space Overhead
                     jess      Eclipse   Geomean
# of types
  bm+VM              1744      3365      1747
  avg                318       667       334
  max                319       775       346
# of edges
  avg                844       4090      904
  max                861       7585      1142
Increased Alloc %    0.094%    0.167%    0.233%
Slide callouts: 19%, 2.7X, 0.233%
Time Overhead
[Figure: normalized total time versus heap size relative to minimum]
Singly-linked List
circle := attaches head to tail
cyclic := creates a cycle from tail to random node
        =0    =1
in      1     n-1
out     1     n-1
[Figure: detection % (0 to 100) for SLL Circle and SLL Cyclic with 1 error]
Microbenchmark: HashMap
        =0            =1            =2
in      [31.9,51.6]   [48.4,68.1]   {0}
out     [31.9,51.6]   [48.4,68.1]   {0}
link := creates a connection between buckets from a null ptr to a random entry
[Figure: % detection (0 to 100) for HM Link, with 1, 2, 3, 4, 5, 10, 50, and 100 injected errors]
Data Structures in Benchmarks
Benchmark   RDS Node                         Library (L) or Home-brewed (H)?   %
Raytrace    OctNode                          H                                 78.0
Jack        LinkedHashMap$LinkedHashEntry    L                                 44.2
            RuntimeNfaState                  H                                 9.4
Antlr       Object[]                         H                                 76.2
Bloat       HashMap$HashEntry                L                                 56.6
            CallMethodExpr                   H                                 3.7
Eclipse     LinkedHashMap$LinkedHashEntry    L                                 59.0
            AND_AND_Expression               H                                 0.4
Fop         HashMap$HashEntry                L                                 51.4
            PropertyList                     H                                 4.4
Jython      Pyframe                          H                                 94.6
Luindex     LinkedHashMap$LinkedHashEntry    L                                 99.3
Lusearch    WeakHashMap$WeakBucket           L                                 47.5
            HitDoc                           H                                 2.0
Xalan       ChildIterator                    H                                 34.6