Improving Productivity With Fine-grain Compiler-based
Checkpointing
Chuck (Chengyan) Zhao, Prof. Greg Steffan, Prof. Cristiana Amza, Allan Kielstra*
Dept. of Electrical and Computer Engineering, University of Toronto
IBM Toronto Lab*
Nov. 10, 2011
Productivity and Compilers
Programmer’s productivity: important
  computers: fast, cheap
  programmers: slow (relatively), expensive
New way for compiler to help?
  automatic fine-grain checkpointing (CKPT)
  optimizations to reduce checkpoint overhead
Applications of checkpointing
  accelerate bug-finding process
  automated support for backtracking algorithms
A compiler can improve programmer’s productivity via automatic CKPT
Compiler Checkpointing (CKPT) Framework
[Figure: compilation pipeline. Annotated C/C++ source code passes through the LLVM frontend to LLVM IR, then through two stages, and finally through the backend process.]
Enable Checkpointing
  Callsite Analysis
  Intra-procedural Transformations
  Inter-procedural Transformations
  Special Cases Handling
Optimize Checkpointing
  1. CKPT Inlining
  2. Pre Optimize
  3. Redundancy Eliminations
  4. Hoisting
  5. Aggregation
  6. Non Rollback Exposed Store Elimination
  7. Heap Optimize
  8. Array Optimize
  9. Post Optimize
Backend targets: x86, x64, …, POWER, C/C++
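Compiler-based checkpointing basics
[Figure: the main program executes "… a = 5; b = 7; …". Before each store, the old value is recorded in the checkpoint buffer as (&a, 0) and (&b, 0). On failure recovery, the buffer is used to restore a and b in main memory.]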
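The buffer mechanism in the figure can be sketched as a small undo log. This is a minimal illustration, not the framework's runtime: the fixed capacities, the entry layout, and the reading of the stop_ckpt() argument as "nonzero means roll back" are all assumptions made for the example. Only the names start_ckpt, backup, and stop_ckpt come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capacities, chosen only to keep the sketch small. */
#define CKPT_MAX_ENTRIES 64
#define CKPT_MAX_BYTES   16

typedef struct {
    void         *addr;                /* location about to be overwritten */
    size_t        len;                 /* number of bytes saved */
    unsigned char old[CKPT_MAX_BYTES]; /* copy of the old contents */
} ckpt_entry;

static ckpt_entry ckpt_buf[CKPT_MAX_ENTRIES];
static size_t     ckpt_count = 0;

/* Begin a checkpoint region with an empty buffer. */
void start_ckpt(void) { ckpt_count = 0; }

/* Save the current contents of [addr, addr + len) before a store. */
void backup(void *addr, size_t len) {
    assert(ckpt_count < CKPT_MAX_ENTRIES && len <= CKPT_MAX_BYTES);
    ckpt_entry *e = &ckpt_buf[ckpt_count++];
    e->addr = addr;
    e->len  = len;
    memcpy(e->old, addr, len);
}

/* End the region. Assumption: a nonzero argument requests rollback.
 * Entries are restored newest-first so the oldest saved value wins. */
void stop_ckpt(int rollback) {
    if (rollback) {
        while (ckpt_count > 0) {
            ckpt_entry *e = &ckpt_buf[--ckpt_count];
            memcpy(e->addr, e->old, e->len);
        }
    }
    ckpt_count = 0;
}
```

With a and b initially 0, calling backup(&a, sizeof(a)) before a = 5 and backup(&b, sizeof(b)) before b = 7 logs the pairs shown in the figure, and a rollback returns both variables to 0.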
start_ckpt(); …
backup(&a, sizeof(a));
a = …;
handleMemcpy(…);
memcpy(d, s, len);
foo_ckpt();
foo();
…
stop_ckpt(cond);
foo(…) { /* body of foo() */ }
foo_ckpt(…) { /* body of foo_ckpt() */ }
…
Transformations to Enable Checkpointing
3 steps:
1. Callsite analysis
2. Intra-procedural transformation
3. Inter-procedural transformation
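A rough picture of what the three steps produce for one function, written by hand rather than generated by a compiler. The backup() here is a stub that only records which addresses were logged, and counter and caller() are hypothetical names introduced for the example. The region markers appear as comments so the sketch stays focused on the function cloning.

```c
#include <assert.h>
#include <stddef.h>

/* Stub runtime (assumption): records addresses only, saves no values. */
#define MAX_CALLS 16
const void *backed_up[MAX_CALLS];
size_t      n_backed_up = 0;

void backup(void *addr, size_t len) {
    (void)len;                        /* a real runtime would copy len bytes */
    assert(n_backed_up < MAX_CALLS);
    backed_up[n_backed_up++] = addr;
}

int counter = 0;                      /* hypothetical global written by foo() */

/* Original function, still used by callsites outside any region. */
void foo(void) { counter = counter + 1; }

/* Clone for use inside a region: same body, with a backup call
 * inserted before the store (intra-procedural step). */
void foo_ckpt(void) {
    backup(&counter, sizeof(counter));
    counter = counter + 1;
}

/* Caller after transformation: callsite analysis found one call inside
 * the region, and it was redirected to the clone (inter-procedural step). */
void caller(void) {
    foo();        /* outside the region: unchanged */
    /* start_ckpt(); */
    foo_ckpt();   /* inside the region: instrumented clone */
    /* stop_ckpt(cond); */
}
```

Keeping both versions means code outside a checkpoint region pays no logging cost, at the price of some code growth.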
Optimize Checkpointing
Checkpointing Optimization Framework
1. CKPT Inlining
2. Pre Optimization
3. Redundancy Eliminations (3 REs)
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. DynMem (Heap) Optimization
8. Array Optimization
9. Post Optimization
Redundancy Elimination Optimization
start_ckpt();
…
if (C) {
    backup(&a, sizeof(a));
    a = …;
}
…
backup(&a, sizeof(a));
a = …;
…
backup(&a, sizeof(a));
a = …;
…
stop_ckpt(cond);
Algorithm
  establish dominating relationship among backup calls
  promote leading backup call
  re-establish dominating relationship among backup calls
  eliminate all non-leading backup call(s)
RE1: remove all non-leading backup call(s)
[Figure annotations: "dom" edges between backup calls; stop_ckpt() marker]
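The elimination step can be sketched on a toy control-flow graph. This is an illustrative model, not the LLVM pass: basic blocks are small integers, predecessor and dominator sets are bitmasks, and the backup_call record is a hypothetical encoding. It assumes all calls lie in one checkpoint region, that equal variable ids mean the same address, and it leaves out the promotion step.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int  block;    /* basic block containing the call */
    int  pos;      /* position of the call inside that block */
    int  var;      /* id of the backed-up variable (hypothetical encoding) */
    bool removed;  /* set when the call is found to be redundant */
} backup_call;

/* Iterative dominator computation. Bit d of dom[b] is set iff block d
 * dominates block b. Block 0 is the entry; preds[b] is a bitmask. */
void compute_dominators(const unsigned preds[], int n, unsigned dom[]) {
    unsigned all = (1u << n) - 1u;
    dom[0] = 1u;
    for (int b = 1; b < n; b++) dom[b] = all;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 1; b < n; b++) {
            unsigned meet = all;
            for (int p = 0; p < n; p++)
                if (preds[b] & (1u << p)) meet &= dom[p];
            unsigned next = meet | (1u << b);
            if (next != dom[b]) { dom[b] = next; changed = true; }
        }
    }
}

/* Mark each backup call that is dominated by another backup of the same
 * variable; only the leading call survives. Returns the number removed. */
int eliminate_non_leading(backup_call calls[], int n_calls,
                          const unsigned dom[]) {
    int removed = 0;
    for (int i = 0; i < n_calls; i++) {
        for (int j = 0; j < n_calls; j++) {
            if (i == j || calls[i].var != calls[j].var) continue;
            bool earlier_in_block = calls[j].block == calls[i].block &&
                                    calls[j].pos < calls[i].pos;
            bool strictly_dominates =
                calls[j].block != calls[i].block &&
                (dom[calls[i].block] & (1u << calls[j].block)) != 0;
            if (earlier_in_block || strictly_dominates) {
                calls[i].removed = true;
                removed++;
                break;
            }
        }
    }
    return removed;
}
```

The removal is safe because the leading call has already saved the value that 'a' held at the start of the region, which is the only value a rollback needs.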
Definition: Rollback Exposed Store
int a, b;
…
start_ckpt();
…
b = … a op …;
…
backup(&a, sizeof(a));
a = …;
…
stop_ckpt(cond);
We must back up 'a' because the prior load of 'a' must access the "old" value on rollback, i.e., 'a' is "rollback exposed".
Rollback Exposed Store: a store to a location with a possible previous load of that location
A Rollback Exposed Store needs a backup
Non-Rollback Exposed Store Elimination (NRESE)
int a, b;
…
start_ckpt();
…
backup(&a, sizeof(a));
a = …;
…
stop_ckpt(cond);
Algorithm Description
  no use of the address (&a) on any path
  the backup address (&a) isn’t aliased to anything
    empty points-to set
No prior use of 'a', hence it is non-rollback-exposed
We can eliminate the backup of 'a'
NRESE is a new, checkpoint-specific optimization
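The rollback-exposed test can be illustrated on a straight-line trace of memory accesses. This sketch assumes a single path, and location ids that never alias. The actual analysis described above must consider every path and the points-to sets. The access record and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { LOAD, STORE } access_kind;

typedef struct {
    access_kind kind;
    int         loc;   /* id of the accessed location (assumed alias-free) */
} access;

/* True if the store at index i is rollback exposed, i.e. some earlier
 * access in the region loads the same location. */
bool is_rollback_exposed(const access trace[], int i) {
    assert(trace[i].kind == STORE);
    for (int k = 0; k < i; k++)
        if (trace[k].kind == LOAD && trace[k].loc == trace[i].loc)
            return true;
    return false;
}

/* Count the stores in the region that still need a backup call. */
int count_required_backups(const access trace[], int n) {
    int needed = 0;
    for (int i = 0; i < n; i++)
        if (trace[i].kind == STORE && is_rollback_exposed(trace, i))
            needed++;
    return needed;
}
```

For the trace "load a, store b, store a", only the store to 'a' is exposed. For a region that only stores to 'a', no backup is required.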
Applications
App1: CKPT enabled debugging
[Figure: execution timeline with a CKPT region and the points T, P, and Q]
T: safe point, earlier than P, that the program can reach through checkpoint recovery
P: root cause of a bug
Q: place where the bug manifests (a user or programmer notices the bug at this point)
Key benefits
  execution rewinding
  arbitrarily large region
  unlimited # of retries
  no restart from beginning
App2: CKPT enabled backtracking
[Figure: CKPT region spanning points T and Q]
Proceed with VPR’s random/simulated-annealing based algorithm
T: pick a pair of blocks to swap
Q: keep swap if improvement, discard otherwise
Key benefits
  automate support for backtracking
    backup actions
    abort
    commit
  cover arbitrarily complex algorithm
  cleaner code, simplify programming
    programmer focus on algorithm
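The swap-and-keep-or-discard pattern can be sketched with a tiny undo log. The backup calls are written by hand here, whereas the approach in the slides has the compiler insert them. The cost function, the integer-only log, and the name try_swap are hypothetical and not taken from VPR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal integer-only undo log (assumed layout, fixed capacity). */
#define LOG_CAP 32
static int   *log_addr[LOG_CAP];
static int    log_old[LOG_CAP];
static size_t log_len = 0;

void start_ckpt(void) { log_len = 0; }

void backup_int(int *p) {
    assert(log_len < LOG_CAP);
    log_addr[log_len] = p;
    log_old[log_len]  = *p;
    log_len++;
}

/* Nonzero argument aborts (restores old values); zero commits. */
void stop_ckpt(int abort_region) {
    if (abort_region)
        while (log_len > 0) {
            log_len--;
            *log_addr[log_len] = log_old[log_len];
        }
    log_len = 0;
}

/* Hypothetical placement cost: total distance of each block from slot i. */
int cost(const int pos[], int n) {
    int total = 0;
    for (int i = 0; i < n; i++) total += abs(pos[i] - i);
    return total;
}

/* Swap two blocks inside a checkpoint region; keep the swap only if the
 * cost improves, otherwise roll it back. Returns 1 if the swap was kept. */
int try_swap(int pos[], int n, int i, int j) {
    int before = cost(pos, n);
    start_ckpt();
    backup_int(&pos[i]);
    backup_int(&pos[j]);
    int tmp = pos[i];
    pos[i] = pos[j];
    pos[j] = tmp;
    int improved = cost(pos, n) < before;
    stop_ckpt(!improved);
    return improved;
}
```

The algorithm code contains no undo logic of its own; discarding a bad move is a single stop_ckpt call, however much state the move touched.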
Evaluation
Platform and Benchmarks
Evaluation Platform
  Core i7 920, 12GB DDR3, 200GB SATA
  Debian6-i386, gcc/g++-4.4.5
  LLVM-2.9
Benchmarks
  BugBench: 1.2.0
    5 programs with buffer-overflow bugs
    3 CKPT regions per program: Small, Medium, Large
  VPR: 5.0.2
    FPGA CAD tool, 1 CKPT region
CKPT Comparison
  libCKPT: U. Tennessee
  ICCSTM: Intel ICC based STM
Compare with Coarse-grain Scheme: libCKPT
HUGE gain over coarse-grain libCKPT
Compare with Fine-grain Scheme: ICCSTM
better than best-known fine-grain ICCSTM
RE1 Optimization: buffer size reduction
[Figure: bar chart of % of buffer size reduction (axis 0 to 120) for INLINE and +RE1]
RE1 is the single most-effective optimization
Post RE1 Optimization: buffer size reduction
[Figure: bar chart of % of buffer size reduction (axis 0 to 9) for +RE2, +RE3, +Hoist, +Aggr, +NRESE, +HeapOpti, +ArrayOpti]
Other optimizations also contribute
Conclusion
CKPT Optimization Framework
  compiler-driven
  automatic
  software-only
  compiler analysis and optimizations
  100-1000X less overhead: over coarse-grain scheme
  4-50X improvement: over fine-grain scheme
CKPT-supported Apps
  debugger: execution rewind in time
    up to: 98% of CKPT buffer size reduction
    up to: 95% of backup call reduction
  VPR: automatic software backtracking
    only 15% CKPT overhead
Questions and Answers
Algorithm: Redundancy Elimination 1
1. Build dominating relationship (DOM) among backup calls
2. Identify leading backup call
3. Promote suitable leading backup call
4. Remove non-leading backup call(s)
Algorithm: NRESE
Backup address is NOT aliased to anything
  points-to set is empty
AND
On any path from the beginning of the CKPT region to the respective write, there is no use of the backup address
  the value can be re-generated independently, without needing its own old value
1D array vs. Hash Tables Buffer Schemes
Compare with Coarse-grain Scheme: libCKPT
[Figure: comparison chart with axis ticks at 10X, 100X, 1KX, 10KX, 100KX]
HUGE gain over coarse-grain libCKPT
Redundancy Elimination Optimization 1
start_ckpt();
…
backup(&a, sizeof(a));
a = …;
…
backup(&a, sizeof(a));
a = …;
…
if (C) {
    backup(&a, sizeof(a));
    a = …;
    …
}
…
stop_ckpt(c);
Algorithm
  establish dominating relationship among backup calls
  promote leading backup call
  eliminate all non-leading backup call(s)
RE1: keep only dominating backup call
CKPT Support for Automatic Backtracking (VPR)
[Figure: flowchart. Initial guess, then obtain a new result (manual CKPT), then check result. Good: commit and continue. Bad: abort and try next.]
CKPT automates the process, regardless of backtracking complexity
App2: CKPT enabled backtracking
[Figure: flowchart. Initial Guess, then Evaluate (manual CKPT). Bad: Reset Data. Good: Commit Data. Finish when the stop condition is reached.]
How Can A Compiler Help Checkpointing?
Enable CKPT
  compiler transformations
Optimize CKPT
  do standard optimizations apply?
  support CKPT-specific optimizations?
CKPT Uses
  debugging
  backtracking
Optimization: buffer size reduction
[Figure: bar chart of % of buffer size reduction (axis 0 to 120) for INLINE, +RE1, +RE2, +RE3, +Hoist, +Aggr, +NRESE, +HeapOpti, +ArrayOpti]
up to 98% of CKPT buffer size reduction