Improving Productivity With Fine-grain Compiler-based Checkpointing. Chuck (Chengyan) Zhao, Prof. Greg Steffan, Prof. Cristiana Amza, Allan Kielstra*. Dept. of Electrical and Computer Engineering, University of Toronto; *IBM Toronto Lab. Nov. 10, 2011
Transcript
Page 1: Title

Improving Productivity With Fine-grain Compiler-based Checkpointing

Chuck (Chengyan) Zhao, Prof. Greg Steffan, Prof. Cristiana Amza, Allan Kielstra*

Dept. of Electrical and Computer Engineering, University of Toronto

*IBM Toronto Lab

Nov. 10, 2011

Page 2: Productivity and Compilers

Programmer’s productivity is important:
- computers: fast, cheap
- programmers: slow (relatively), expensive

A new way for the compiler to help?
- automatic fine-grain checkpointing (CKPT)
- optimizations to reduce checkpoint overhead

Applications of checkpointing:
- accelerate the bug-finding process
- automated support for backtracking algorithms

A compiler can improve programmer’s productivity via automatic CKPT.

Page 3: Compiler Checkpointing (CKPT) Framework

[Figure: compiler pipeline from annotated C/C++ source code through the LLVM frontend, LLVM IR, the Enable Checkpointing and Optimize Checkpointing stages, and the backend process to x86, x64, …, POWER or C/C++ output.]

Enable Checkpointing:
- Callsite Analysis
- Inter-procedural Transformations
- Intra-procedural Transformations
- Special Cases Handling

Optimize Checkpointing:
1. CKPT Inlining
2. Pre Optimize
3. Redundancy Eliminations
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. Heap Optimize
8. Array Optimize
9. Post Optimize

Page 4: Compiler-based checkpointing basics

[Figure: the main program executes … a = 5; b = 7;. Main memory holds a and b (0 and 0 before the stores, 5 and 7 after). The checkpoint buffer records (&a, 0) and (&b, 0), the old values used for failure recovery.]
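The checkpoint buffer on this slide can be sketched as a small undo log. This is a minimal illustration, not the framework's actual runtime: the entry layout, the capacity limits, and the meaning of the stop_ckpt() argument (nonzero means "failure, roll back") are assumptions; only the names start_ckpt, backup, and stop_ckpt come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical undo-log layout; the slide only shows (address, old value) pairs. */
enum { MAX_ENTRIES = 64, MAX_BYTES = 16 };

typedef struct {
    void  *addr;                  /* location that is about to be overwritten */
    size_t len;                   /* number of bytes saved */
    unsigned char old[MAX_BYTES]; /* copy of the old contents */
} ckpt_entry;

static ckpt_entry ckpt_buf[MAX_ENTRIES];
static size_t ckpt_count = 0;

/* Start a checkpoint region with an empty buffer. */
void start_ckpt(void) { ckpt_count = 0; }

/* Save the current contents of [addr, addr + len) before a store. */
void backup(void *addr, size_t len) {
    assert(ckpt_count < MAX_ENTRIES && len <= MAX_BYTES);
    ckpt_entry *e = &ckpt_buf[ckpt_count++];
    e->addr = addr;
    e->len = len;
    memcpy(e->old, addr, len);
}

/* End the region. Assumption: nonzero means failure, so old values are
 * restored newest-first; zero means commit, so the log is just dropped. */
void stop_ckpt(int failed) {
    if (failed) {
        while (ckpt_count > 0) {
            ckpt_entry *e = &ckpt_buf[--ckpt_count];
            memcpy(e->addr, e->old, e->len);
        }
    }
    ckpt_count = 0;
}

/* The slide's example: a and b start at 0, become 5 and 7, then roll back. */
int basics_demo(void) {
    int a = 0, b = 0;
    start_ckpt();
    backup(&a, sizeof(a)); a = 5;   /* buffer now holds (&a, 0) */
    backup(&b, sizeof(b)); b = 7;   /* buffer now holds (&a, 0), (&b, 0) */
    stop_ckpt(1);                   /* simulate a failure */
    return a == 0 && b == 0;        /* 1 if both old values were restored */
}
```

Restoring newest-first matters when the same address is logged more than once: the oldest entry is applied last, so the location ends up with its value from the start of the region.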

Page 5: Transformations to Enable Checkpointing

3 steps:
1. Callsite analysis
2. Intra-procedural transformation
3. Inter-procedural transformation

    start_ckpt();
    …
    backup(&a, sizeof(a));
    a = …;
    handleMemcpy(…);
    memcpy(d, s, len);
    foo_ckpt();
    foo();
    stop_ckpt(cond);

    foo(…){ /* body of foo() */ }
    foo_ckpt(…){ /* body of foo_ckpt() */ }
    …
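A rough sketch of what the enabled code might look like after these steps. The runtime below is a stand-in, and several details are assumptions rather than the framework's behavior: the handleMemcpy signature (the slide elides its arguments), the idea that foo_ckpt() is a clone of foo() with backups before its stores, and the flag passed to stop_ckpt().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical runtime: old bytes are appended to one flat log. */
static unsigned char log_bytes[256];
static struct { void *addr; size_t len; size_t off; } log_ent[16];
static size_t n_ent, n_bytes;

void start_ckpt(void) { n_ent = 0; n_bytes = 0; }

void backup(void *addr, size_t len) {
    assert(n_ent < 16 && n_bytes + len <= sizeof log_bytes);
    log_ent[n_ent].addr = addr;
    log_ent[n_ent].len = len;
    log_ent[n_ent].off = n_bytes;
    memcpy(log_bytes + n_bytes, addr, len);
    n_ent++;
    n_bytes += len;
}

/* Assumed meaning of the flag: nonzero rolls back, zero commits. */
void stop_ckpt(int rollback) {
    while (rollback && n_ent > 0) {
        n_ent--;
        memcpy(log_ent[n_ent].addr, log_bytes + log_ent[n_ent].off,
               log_ent[n_ent].len);
    }
    n_ent = 0;
    n_bytes = 0;
}

/* Intra-procedural step (assumed signature): a memcpy inside the region
 * first saves the destination range it is about to overwrite. */
void handleMemcpy(void *dst, size_t len) { backup(dst, len); }

static int counter;

/* Inter-procedural step: foo() keeps its original body ... */
void foo(void) { counter++; }

/* ... and a clone with instrumented stores is called inside the region. */
void foo_ckpt(void) {
    backup(&counter, sizeof counter);
    counter++;
}

int enable_demo(void) {
    char d[8] = "old", s[8] = "new";
    counter = 1;
    start_ckpt();
    handleMemcpy(d, sizeof d);
    memcpy(d, s, sizeof d);
    foo_ckpt();               /* callsite rewritten from foo() */
    stop_ckpt(1);             /* roll back: d and counter are restored */
    foo();                    /* outside the region the original runs */
    return strcmp(d, "old") == 0 && counter == 2;
}
```

Keeping both foo() and foo_ckpt() means code outside a checkpoint region pays no backup cost, at the price of a larger binary.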

Page 6: Checkpointing Optimization Framework

Optimize Checkpointing:
1. CKPT Inlining
2. Pre Optimization
3. Redundancy Eliminations (3 REs)
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. DynMem (Heap) Optimization
8. Array Optimization
9. Post Optimization

Page 7: Redundancy Elimination Optimization (RE1)

    start_ckpt();
    …
    if (C){
      backup(&a, sizeof(a));
      a = …;
    }
    …
    backup(&a, sizeof(a));
    a = …;
    …
    backup(&a, sizeof(a));
    a = …;
    …
    stop_ckpt(cond);

Algorithm:
- establish dominating relationship among backup calls
- promote leading backup call
- re-establish dominating relationship among backup calls
- eliminate all non-leading backup call(s)

[Figure annotations: stop_ckpt() marker; "dom" edges between backup calls.]

RE1: remove all non-leading backup call(s)
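The effect of RE1 on the slide's example can be sketched as a before/after pair. This is an illustration under assumptions: backup_int and rollback are hypothetical helpers standing in for backup(&a, sizeof(a)) and the recovery path, and the "after" version is hand-written to show what promotion plus elimination would leave behind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical int-only undo log. */
static struct { int *addr; int old; } log_buf[16];
static size_t log_len;

void start_ckpt(void) { log_len = 0; }

void backup_int(int *addr) {
    assert(log_len < 16);
    log_buf[log_len].addr = addr;
    log_buf[log_len].old = *addr;
    log_len++;
}

/* Restore newest-first so the oldest saved value wins. */
void rollback(void) {
    while (log_len > 0) {
        log_len--;
        *log_buf[log_len].addr = log_buf[log_len].old;
    }
}

/* Before RE1: every store to a is preceded by its own backup. */
size_t region_before(int *a, int c) {
    start_ckpt();
    if (c) { backup_int(a); *a = 1; }
    backup_int(a); *a = 2;
    backup_int(a); *a = 3;
    return log_len;
}

/* After RE1: the leading backup is promoted out of the if so that it
 * dominates all later stores; the non-leading backups are removed. */
size_t region_after(int *a, int c) {
    start_ckpt();
    backup_int(a);
    if (c) { *a = 1; }
    *a = 2;
    *a = 3;
    return log_len;
}

/* Returns 1 if both versions restore a and the optimized one logs once. */
int re1_equivalent(int c) {
    int a = 9, b = 9;
    size_t n1 = region_before(&a, c); rollback();
    size_t n2 = region_after(&b, c);  rollback();
    return a == 9 && b == 9 && n2 == 1 && n1 >= n2;
}
```

Only the first saved value of a location is needed for recovery, which is why the later backups are redundant once one backup dominates them.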

Page 8: Definition: Rollback Exposed Store

    int a, b;
    …
    start_ckpt();
    …
    b = … a op …;
    …
    backup(&a, sizeof(a));
    a = …;
    …
    stop_ckpt(cond);

We must back up 'a' because the prior load of 'a' must access the "old" value on rollback, i.e., 'a' is "rollback exposed".

Rollback Exposed Store: a store to a location with a possible previous load of that location.

A Rollback Exposed Store needs backup.
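A small sketch of why the backup is needed, assuming the region is re-executed after a rollback. The function names and the concrete values are made up for illustration; saved_a plays the role of backup(&a, sizeof(a)).

```c
#include <assert.h>

/* One pass over the region: b = ... a op ...; then a = ...; */
static void region_body(int *a, int *b) {
    *b = *a + 1;   /* load of a happens before the store */
    *a = 100;      /* rollback-exposed store */
}

/* Runs the region, treats the first attempt as failed, then re-executes.
 * Returns the value of b computed by the second attempt. */
int run_with_retry(int with_backup) {
    int a = 5, b = 0;
    int saved_a = a;                  /* stands in for backup(&a, sizeof(a)) */
    region_body(&a, &b);              /* first attempt */
    if (with_backup) a = saved_a;     /* rollback restores the old a */
    region_body(&a, &b);              /* retry */
    return b;
}
```

With the backup the retry reads the old value of a and computes b = 6; without it the retry reads the value stored by the failed attempt and computes b = 101.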

Page 9: Non-Rollback Exposed Store Elimination (NRESE)

    int a, b;
    …
    start_ckpt();
    …
    backup(&a, sizeof(a));
    a = …;
    …
    stop_ckpt(cond);

There is no prior use of 'a', hence it is non-rollback-exposed; we can eliminate the backup of 'a'.

Algorithm description:
- the backup address (&a) isn’t aliased to anything (empty points-to set)
- no use of the address (&a) on any path

NRESE is a new, checkpoint-specific optimization.
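A companion sketch for the non-exposed case, again assuming the region is re-executed after a rollback. Names and values are illustrative only; saved_a stands in for the backup that NRESE would remove.

```c
#include <assert.h>

/* Region body: a is written before any read of a inside the region. */
static int region_body(int *a, int input) {
    *a = input * 2;   /* store with no prior load of a */
    return *a + 1;    /* later loads see the value written in this pass */
}

/* Runs the region, treats the first attempt as failed, then re-executes.
 * Returns the result of the second attempt. */
int rerun_result(int with_backup) {
    int a = 7;
    int saved_a = a;                 /* the backup NRESE would eliminate */
    (void)region_body(&a, 10);       /* first attempt */
    if (with_backup) a = saved_a;    /* rollback, if the backup was kept */
    return region_body(&a, 10);      /* retry */
}
```

Both variants return 21, because the retry overwrites a before reading it; the stale value left by the failed attempt is never observed.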

Page 10: Applications

Page 11: App1: CKPT enabled debugging

[Figure: execution timeline with a CKPT region and the points T, P, and Q.]
- T: safe point, earlier than P, that the program can reach through checkpoint recovery
- P: root cause of a bug
- Q: place where the bug manifests (a user or programmer notices the bug at this point)

Key benefits:
- execution rewinding
- arbitrarily large region
- unlimited # of retries
- no restart from the beginning

Page 12: App2: CKPT enabled backtracking

[Figure: flow with a CKPT region and the points T and Q.]
- T: pick a pair of blocks to swap
- Q: keep swap if improvement, discard otherwise

Proceed with VPR’s random/simulated-annealing based algorithm.

Key benefits:
- automate support for backtracking: backup actions, abort, commit
- cover arbitrarily complex algorithm
- cleaner code, simplify programming
- programmer focus on algorithm
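The swap-and-check loop can be sketched on a toy placement problem. Everything here is hypothetical: the cost function, the array size, and the helper names are invented for illustration and are not VPR code. The backup_int calls are written by hand to stand in for what the compiler would insert automatically.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4 };

/* Hypothetical undo log, large enough for one swap. */
static struct { int *addr; int old; } undo[2];
static size_t undo_len;

void backup_int(int *addr) {
    assert(undo_len < 2);
    undo[undo_len].addr = addr;
    undo[undo_len].old = *addr;
    undo_len++;
}

/* Commit: keep the new state and drop the log. */
void commit_ckpt(void) { undo_len = 0; }

/* Abort: restore the saved values, newest first. */
void abort_ckpt(void) {
    while (undo_len > 0) {
        undo_len--;
        *undo[undo_len].addr = undo[undo_len].old;
    }
}

/* Toy cost: total distance of each block from its ideal slot. */
int cost(const int place[N]) {
    int total = 0;
    for (int i = 0; i < N; i++) {
        int d = place[i] - i;
        total += d < 0 ? -d : d;
    }
    return total;
}

/* T: pick a pair to swap. Q: keep the swap if it improves the cost,
 * discard it otherwise. Returns 1 if the swap was kept. */
int try_swap(int place[N], int i, int j) {
    int before = cost(place);
    backup_int(&place[i]);
    backup_int(&place[j]);
    int tmp = place[i];
    place[i] = place[j];
    place[j] = tmp;
    if (cost(place) < before) { commit_ckpt(); return 1; }
    abort_ckpt();
    return 0;
}

int backtrack_demo(void) {
    int place[N] = {1, 0, 2, 3};
    int kept_first = try_swap(place, 0, 1);   /* cost 2 -> 0: keep */
    int kept_second = try_swap(place, 2, 3);  /* cost 0 -> 2: discard */
    return kept_first == 1 && kept_second == 0 && cost(place) == 0;
}
```

With compiler-inserted backups, try_swap would contain only the swap and the keep-or-discard decision, which is the "programmer focus on algorithm" benefit listed above.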

Page 13: Evaluation

Page 14: Platform and Benchmarks

Evaluation platform:
- Core i7 920, 12GB DDR3, 200GB SATA
- Debian6-i386, gcc/g++-4.4.5
- LLVM-2.9

Benchmarks:
- BugBench 1.2.0: 5 programs with buffer-overflow bugs; 3 CKPT regions per program (Small, Medium, Large)
- VPR 5.0.2: FPGA CAD tool, 1 CKPT region

CKPT comparison:
- libCKPT: U. Tennessee
- ICCSTM: Intel ICC based STM

Page 15: Compare with Coarse-grain Scheme: libCKPT

HUGE gain over coarse-grain libCKPT

Page 16: Compare with Fine-grain Scheme: ICCSTM

better than best-known fine-grain ICCSTM

Page 17: RE1 Optimization: buffer size reduction

[Figure: bar chart of % of buffer size reduction (axis 0 to 120) for INLINE and +RE1.]

RE1 is the single most-effective optimization

Page 18: Post RE1 Optimization: buffer size reduction

[Figure: bar chart of % of buffer size reduction (axis 0 to 9) for +RE2, +RE3, +Hoist, +Aggr, +NRESE, +HeapOpti, +ArrayOpti.]

Other optimizations also contribute

Page 19: Conclusion

CKPT Optimization Framework:
- compiler-driven
- automatic
- software-only
- compiler analysis and optimizations
- 100-1000X less overhead than the coarse-grain scheme
- 4-50X improvement over the fine-grain scheme

CKPT-supported apps:
- debugger: execution rewind in time
  - up to 98% CKPT buffer size reduction
  - up to 95% backup call reduction
- VPR: automatic software backtracking
  - only 15% CKPT overhead

Page 20: Questions and Answers

Page 21: Algorithm: Redundancy Elimination 1

1. Build dominating relationship (DOM) among backup calls
2. Identify leading backup call
3. Promote suitable leading backup call
4. Remove non-leading backup call(s)
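Steps 1, 2, and 4 can be sketched on a tiny control-flow graph using a textbook iterative dominator computation. This is a simplified model, not the framework's LLVM pass: blocks are numbered 0 to 3 with block 0 as the entry, all blocks are assumed reachable, each block holds at most one backup of the same address, and promotion (step 3) is not modeled.

```c
#include <assert.h>

enum { NB = 4 };  /* number of basic blocks in the toy CFG */

/* pred[n] is a bitmask of the predecessors of block n.
 * On return, dom[n] is a bitmask of the blocks that dominate n. */
void compute_dom(const unsigned pred[NB], unsigned dom[NB]) {
    const unsigned all = (1u << NB) - 1u;
    dom[0] = 1u;                          /* the entry dominates only itself */
    for (int n = 1; n < NB; n++) dom[n] = all;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int n = 1; n < NB; n++) {
            unsigned meet = all;          /* intersect over all predecessors */
            for (int p = 0; p < NB; p++)
                if (pred[n] & (1u << p)) meet &= dom[p];
            unsigned next = meet | (1u << n);
            if (next != dom[n]) { dom[n] = next; changed = 1; }
        }
    }
}

/* has_backup is a bitmask of blocks containing a backup of the same address.
 * A backup is non-leading (redundant) if another block with a backup
 * strictly dominates its block. Returns the bitmask of such blocks. */
unsigned redundant_backups(const unsigned dom[NB], unsigned has_backup) {
    unsigned redundant = 0;
    for (int n = 0; n < NB; n++) {
        if (!(has_backup & (1u << n))) continue;
        unsigned strict = dom[n] & ~(1u << n);
        if (strict & has_backup) redundant |= 1u << n;
    }
    return redundant;
}

unsigned demo_redundant(unsigned has_backup) {
    /* Hypothetical CFG: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 3 (block 1 is an if-body). */
    const unsigned pred[NB] = { 0u, 1u << 0, (1u << 0) | (1u << 1), 1u << 2 };
    unsigned dom[NB];
    compute_dom(pred, dom);
    return redundant_backups(dom, has_backup);
}
```

With backups in blocks 0, 1, and 2 the leading backup in block 0 dominates the others, so blocks 1 and 2 are reported. With backups in blocks 1, 2, and 3 the if-body does not dominate the join block, so only block 3 is reported; this is the case where promoting the leading backup would expose more redundancy.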

Page 22: Algorithm: NRESE

- The backup address is NOT aliased to anything (its points-to set is empty)

AND

- On any path from the beginning of the CKPT region to the respective write, there is no use of the backup address (the value can be regenerated independently, without needing its own old value)

Page 23: 1D array vs. Hash Tables Buffer Schemes

Page 24: Compare with Coarse-grain Scheme: libCKPT

[Figure: chart with axis ticks at 10X, 100X, 1KX, 10KX, 100KX.]

HUGE gain over coarse-grain libCKPT

Page 25: Compiler Checkpointing (CKPT) Framework

[Figure: compiler pipeline from annotated C/C++ source code through LLVM IR, the Enable Checkpointing and Optimize Checkpointing stages, and the backend process to x86, x64, …, Power or C/C++ output.]

Optimize Checkpointing:
1. CKPT Inlining
2. Pre Optimize
3. Redundancy Eliminations
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. Heap Optimize
8. Array Optimize
9. Post Optimize

Page 26: CKPT Enabled Debugging

Key benefits:
- execution rewinding
- arbitrarily large region
- unlimited # of retries
- no restart

Page 27: Compare with Fine-grain Scheme: ICCSTM

better than best-known fine-grain solution

Page 28: Redundancy Elimination Optimization 1

    start_ckpt();
    …
    backup(&a, sizeof(a));
    a = …;
    …
    backup(&a, sizeof(a));
    a = …;
    …
    if (C){
      backup(&a, sizeof(a));
      a = …;
      …
    }
    …
    stop_ckpt(c);

Algorithm:
- establish dominating relationship among backup calls
- promote leading backup call
- eliminate all non-leading backup call(s)

RE1: keep only dominating backup call

Page 29: CKPT Support for Automatic Backtracking (VPR)

[Figure: backtracking flow.]
1. initial guess
2. obtain a new result (manual CKPT)
3. check result
   - good: commit and continue
   - bad: abort and try next

CKPT automates the process, regardless of backtracking complexity


Page 31: Key benefits

- automate support for backtracking: backup actions, abort, commit
- cover arbitrarily complex algorithm
- cleaner code, simplify programming
- programmer focus on algorithm

Page 32: App2: CKPT enabled backtracking

[Figure: flow chart with the steps Initial Guess and Evaluate (manual CKPT); a good result leads to Commit Data, a bad result to Reset Data; Finish when the stop condition is reached.]

Key benefits:
- automate support for backtracking: backup actions, abort, commit
- cover arbitrarily complex algorithm
- cleaner code, simplify programming
- programmer focus on algorithm

Page 33: Key benefits

- automate CKPT process: backup actions, abort, commit
- cover arbitrarily complex algorithm
- simplify programming
- programmer focus on algorithm

Page 34: Checkpointing optimizations

1. CKPT Inlining
2. Pre Optimize
3. Redundancy Eliminations
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. Heap Optimize
8. Array Optimize
9. Post Optimize

Page 35: How Can A Compiler Help Checkpointing?

Enable CKPT:
- compiler transformations

Optimize CKPT:
- do standard optimizations apply?
- support CKPT-specific optimizations?

CKPT uses:
- debugging
- backtracking

Page 36: Optimization: buffer size reduction

[Figure: bar chart of % of buffer size reduction (axis 0 to 120) for INLINE, +RE1, +RE2, +RE3, +Hoist, +Aggr, +NRESE, +HeapOpti, +ArrayOpti.]

up to 98% of CKPT buffer size reduction

