Post on 04-Jan-2016
Free-Me: A Static Analysis for Individual
Object Reclamation
Samuel Z. Guyer, Tufts University
Kathryn S. McKinley, University of Texas at Austin
Daniel Frampton, Australian National University
2
Motivation
Automatic memory reclamation (GC)
  No need for explicit “free”
  Garbage collector reclaims memory
  Eliminates many programming errors
Problem: when do we get memory back?
  Frequent GCs: Reclaim memory quickly, with high overhead
  Infrequent GCs: Lower overhead, but lots of garbage in memory
3
Example
Notice: String idName is often garbage
[Figure: memory]
void parse(InputStream stream) {
  while (not_done) {
    // Read a token (new String)
    String idName = stream.readToken();
    // Look up in symbol table
    Identifier id = symbolTable.lookup(idName);
    // If not there, create new identifier, add to symbol table
    if (id == null) {
      id = new Identifier(idName);
      symbolTable.add(idName, id);
    }
    // Compute on identifier
    computeOn(id);
  }
}
4
Solution
Garbage does not accumulate
[Figure: memory]
void parse(InputStream stream) {
  while (not_done) {
    String idName = stream.readToken();
    Identifier id = symbolTable.lookup(idName);
    if (id == null) {
      id = new Identifier(idName);
      symbolTable.add(idName, id);
    } else
      // String idName is garbage, free immediately
      free(idName);
    computeOn(id);
  }
}
5
Our approach
Adds free() automatically
  FreeMe compiler pass inserts calls to free()
  Preserve software engineering benefits
Can’t determine lifetimes for all objects
  Works with the garbage collector
  Implementation of free() depends on collector
Goal: Incremental, “eager” memory reclamation
Results: reduce GC load, improve performance
Potential: 1.7X performance, malloc/free vs GC in tight heaps (Hertz & Berger, OOPSLA 2005)
6
Outline
Motivation
Analysis
Results
Related work
Conclusions
7
FreeMe Analysis
Goal: Determine when an object becomes unreachable
  Not a whole-program analysis*
Idea: pointer analysis + liveness
  Pointer analysis for reachability
  Liveness analysis for when
Within a method, for allocation site “p = new A”, where can we place a call to “free(p)”?
* I’ll describe the interprocedural parts later
8
Pointer Analysis
Connectivity graph
  Variables
  Allocation sites
  Globals (statics)
Analysis algorithm
  Flow-insensitive, field-insensitive
String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);
[Figure: connectivity graph with nodes idName, id, symbolTable (global), readToken String, Identifier]
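The connectivity graph for this loop body can be sketched in Java as below. This is a minimal illustration, not the FreeMe implementation: the class name ConnectivityGraph, the node names, and the specific edges (including the assumption that the Identifier constructor stores its argument) are hypothetical. Being flow-insensitive, it keeps a single edge set for the whole body; being field-insensitive, it records only which node may point to which, not through which field.

```java
import java.util.*;

/** Hypothetical sketch: nodes are variables, allocation sites, and globals. */
class ConnectivityGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    /** Records that node `from` may point to node `to`. */
    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** Returns every node reachable from `root` by following edges. */
    Set<String> reachableFrom(String root) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty()) {
            String n = work.pop();
            for (String m : edges.getOrDefault(n, Collections.emptySet())) {
                if (seen.add(m)) work.push(m);
            }
        }
        return seen;
    }

    /** Builds one edge set for the whole loop body (flow-insensitive). */
    static ConnectivityGraph forParseExample() {
        ConnectivityGraph g = new ConnectivityGraph();
        g.addEdge("idName", "readTokenString");      // idName = stream.readToken()
        g.addEdge("id", "Identifier");               // id = new Identifier(idName)
        g.addEdge("Identifier", "readTokenString");  // assumed: constructor stores idName
        g.addEdge("symbolTable", "readTokenString"); // symbolTable.add(idName, id)
        g.addEdge("symbolTable", "Identifier");      // symbolTable.add(idName, id)
        return g;
    }
}
```

In this sketch the String allocated by readToken is reachable from the global symbolTable, which is why the variable idName alone does not decide when it can be freed.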
9
Adding liveness
Key: An object is reachable only when all incoming pointers are live
  From a variable: Live range of the variable
  From a global: Live from the pointer store onward
  From other object: Live from the pointer store until source object becomes unreachable
Reachability is union of all these live ranges
[Figure: connectivity graph with nodes idName, readToken String, Identifier, (global)]
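The three kinds of incoming pointer can be sketched as live ranges over numbered program points, with reachability taken as their union. This is an illustrative simplification: it assumes straight-line code whose program points are numbered 0..lastPoint, and the class and method names are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical sketch: live ranges as sets of numbered program points. */
class IncomingPointerRanges {
    /** Points from..to inclusive; also serves as a variable's live range. */
    static BitSet range(int from, int to) {
        BitSet b = new BitSet();
        if (from <= to) b.set(from, to + 1); // BitSet.set's upper bound is exclusive
        return b;
    }

    /** From a global: live from the pointer store onward. */
    static BitSet fromGlobal(int storePoint, int lastPoint) {
        return range(storePoint, lastPoint);
    }

    /** From another object: live from the store until the source is unreachable. */
    static BitSet fromObject(int storePoint, int lastPoint, BitSet sourceReachable) {
        BitSet r = range(storePoint, lastPoint);
        r.and(sourceReachable);
        return r;
    }

    /** Reachability of the target object: union of all incoming live ranges. */
    static BitSet union(BitSet... ranges) {
        BitSet result = new BitSet();
        for (BitSet r : ranges) result.or(r);
        return result;
    }
}
```

For example, a variable live over points 0..1 combined with a pointer stored at point 2 into an object that stays reachable through point 5 gives reachability over points 0..5.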
10
Liveness Analysis
Computed as sets of edges
  Variables
  Heap pointers
[Figure: control-flow graph of the loop body, annotated with the graph nodes idName, (global), readToken String, Identifier]
String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null)
  id = new Identifier(idName);
  symbolTable.add(idName, id);
computeOn(id);
11
Where can we free it?
Where object exists -minus- Where reachable
[Figure: control-flow graph of the loop body, annotated for the readToken String object]
String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null)
  id = new Identifier(idName);
  symbolTable.add(idName, id);
computeOn(id);
Compiler inserts call to free(idName)
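The "exists minus reachable" step can be sketched as a set difference over control-flow edges. The edge names and the hand-written live ranges below are assumptions made for illustration of the parse() loop body; a real analysis would compute them by dataflow.

```java
import java.util.*;

/** Hypothetical sketch: free candidates = edges where object exists, minus edges where reachable. */
class FreePlacement {
    static Set<String> freeCandidates(Set<String> exists, List<Set<String>> liveRanges) {
        Set<String> candidates = new TreeSet<>(exists);
        for (Set<String> range : liveRanges) candidates.removeAll(range);
        return candidates;
    }

    /** Edge sets for the readToken String, written out by hand (assumed). */
    static Set<String> parseExample() {
        // The String exists on every edge after readToken() returns.
        Set<String> exists = new HashSet<>(Arrays.asList(
            "read->lookup", "lookup->test", "test->new",
            "new->add", "add->compute", "test->compute"));
        // Variable idName is live until its last use in symbolTable.add().
        Set<String> idNameLive = new HashSet<>(Arrays.asList(
            "read->lookup", "lookup->test", "test->new", "new->add"));
        // The pointer from the global table is live from the store onward.
        Set<String> globalLive = new HashSet<>(Arrays.asList("add->compute"));
        return freeCandidates(exists, Arrays.asList(idNameLive, globalLive));
    }
}
```

Under these assumptions the only remaining edge is "test->compute", the path on which the lookup succeeded, which corresponds to placing free(idName) in the else branch.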
12
Interprocedural component
Detection of factory methods
  Return value is a new object
  Can be freed by the caller
Effects of methods called
  Describes how parameters are connected
Compilation strategy:
  Summaries pre-computed for all methods
  Free-me only applied to hot methods
String idName = stream.readToken();
symbolTable.add(idName, id);
Hashtable.add: (0 → 1) (0 → 2)
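A method summary of this form can be sketched as a list of parameter-index pairs applied at each call site. The sketch reads (0 → 1) as "parameter 0, the receiver, may point to parameter 1"; that reading, the class name MethodSummary, and the edge-string format are assumptions for illustration.

```java
import java.util.*;

/** Hypothetical sketch of a pre-computed method summary. */
class MethodSummary {
    final boolean isFactory;   // true if the return value is a new object
    final int[][] paramEdges;  // pairs {from, to} of parameter indices

    MethodSummary(boolean isFactory, int[][] paramEdges) {
        this.isFactory = isFactory;
        this.paramEdges = paramEdges;
    }

    /** Maps parameter edges onto the actual arguments; args.get(0) is the receiver. */
    List<String> apply(List<String> args) {
        List<String> edges = new ArrayList<>();
        for (int[] e : paramEdges) {
            edges.add(args.get(e[0]) + " -> " + args.get(e[1]));
        }
        return edges;
    }
}
```

Applying the summary {{0, 1}, {0, 2}} to the call symbolTable.add(idName, id) yields the edges "symbolTable -> idName" and "symbolTable -> id", which the caller's analysis can add to its connectivity graph without analyzing the callee's body.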
13
Implementation in JikesRVM
FreeMe added to OPT compiler
Run-time: depends on collector
  Mark/sweep
    Free-list: free() operation
  Generational mark/sweep
    Unbump: move nursery “bump pointer” backward
    Unreserve: reduce copy reserve
  Very low overhead
  Run longer without collecting
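The "unbump" idea can be sketched with a toy bump-pointer nursery. This is not JikesRVM code: the class BumpNursery, its integer addresses, and the rule that free() only succeeds for the most recently allocated object are assumptions made to keep the illustration small.

```java
/** Hypothetical sketch of a bump-pointer nursery with an "unbump" free. */
class BumpNursery {
    private final int capacity;
    private int cursor = 0; // the bump pointer: next free address

    BumpNursery(int capacity) {
        this.capacity = capacity;
    }

    /** Allocates `size` bytes; returns the start address, or -1 if the nursery is full. */
    int alloc(int size) {
        if (size <= 0 || cursor + size > capacity) return -1;
        int addr = cursor;
        cursor += size;
        return addr;
    }

    /** Unbump: moves the pointer backward if the object is the last one allocated. */
    boolean free(int addr, int size) {
        if (addr + size == cursor) {
            cursor = addr;
            return true;
        }
        return false; // left for the collector to reclaim
    }

    int used() {
        return cursor;
    }
}
```

In this sketch a free() that cannot unbump simply does nothing, so correctness never depends on it; the garbage collector still reclaims whatever free() misses.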
14
Volume freed – in MB
[Figure: bar chart, vertical axis 0% to 100%; each benchmark is labeled with a size in MB; each suite is ordered by increasing alloc size]
SPEC benchmarks: compress 105, jess 263, raytrace 91, mtrt 98, javac 183, jack 271, pseudojbb 180, db 74
DaCapo benchmarks: xalan 8195, antlr 1544, bloat 716, fop 103, hsqldb 515, jython 348, pmd 822, ps 523
15
Volume freed – in MB
[Figure: the same bar chart with the volume freed by FreeMe added for each benchmark]
FreeMe mean: 32%
16
Compare to stack-like behavior
Notice: Stacks and regions won’t work for example
  idName escapes some of the time
  Not biased: 35% vs 65%
Comparison: restrict placement of free()
  Object must not escape
  No conditional free
  No factory methods
Other approaches:
  Optimistic, dynamic stack allocation [Azul, Corry 06]
  Scalar replacement
17
Volume freed – in MB
[Figure: the same bar chart with the volume freed under the stack-like restriction for each benchmark]
Stack-like mean: 20%
18
Mark/sweep – GC time
[Figure: GC time, all benchmarks; callouts: 30%, 9%]
19
Mark/sweep – time
[Figure: total time, all benchmarks; callouts: 20%, 15%, 6%]
20
GenMS – time
All benchmarks
21
GenMS – GC time
[Figure: GC time, all benchmarks]
Why doesn’t this help?
Note: the number of GCs is greatly reduced
FreeMe mostly finding short-lived objects
Nursery reclaims dead objects for free (cost ~ survivors)
22
Bloat – GC time
12%
23
Related work
Compile-time memory management
  Functional languages [Barth 77, Hughes 92, Mazur 01]
  Shape analysis [Shaham 03, Cherem 06]
Stack allocation [Gay 00, Blanchet 03, Choi 03]
  Tied to stack frames or other scopes
  Objects from a site must not escape
Region inference [Tofte 96, Hallenberg 02]
  Cheap reclamation – no scoping constraints
  Similar all-or-nothing limitation
24
Conclusions
FreeMe analysis
  Finds many objects to free: often 30% - 60%
  Most are short-lived objects
GC + explicit free()
  Advantage over stack/region allocation: no need to make decision at allocation time
Generational collectors
  Nursery works very well
  Abandon techniques that replace nursery?
Mark-sweep collectors
  50% to 200% speedup
  Works better as memory gets tighter
Embedded applications:
  Compile-ahead
  Memory constrained
  Non-moving collectors
25
Thank You