
Efficient Checkpointing of Java Software using Context-Sensitive Capture and Replay

Date post: 30-Jan-2016
Guoqing Xu, Atanas Rountev, Yan Tang, Feng Qin. Ohio State University. ESEC/FSE 07.

PRESTO: Program Analyses and Software Tools Research Group, Ohio State University

Efficient Checkpointing of Java Software using Context-Sensitive Capture and Replay

Guoqing Xu, Atanas Rountev, Yan Tang, Feng Qin

Ohio State University

ESEC/FSE 07

Outline

Motivation
- Challenges for checkpointing/replaying Java software
- Summary of our approach
Contributions
- Static analyses
- Multiple execution regions
- Experimental evaluation
Conclusions

Motivation

Checkpointing/replaying has been used for a variety of purposes at system level
- Originally designed to support fault tolerance
- Debugging of OS and of parallel and distributed software
Checkpointing can benefit a number of software engineering tasks
- Reduce the cost of manual debugging and testing
- Support for automated techniques for debugging and testing: e.g., dynamic slicing and delta-debugging
- Inspired by both system-level checkpointing [Pan-PDD88, Dunlap-OSDI02, King-USENIX05] and “saving-and-restoring” software engineering techniques [Saff-ASE05, Orso-WODA05, Orso-WODA06, Elbaum-FSE06]

Challenges

Ease of use and deployment
- Application-level checkpointing: no JVM/runtime support, just code analysis and instrumentation
- Challenge: no direct access to the call stack; no control over thread scheduling or external resources (files, etc.)
Reduce the size of the recorded state
- Dumping the entire heap may be prohibitively expensive, especially for large programs
- Challenge: static analyses to prune redundant state
Static and dynamic overhead
- Static analysis cost is amortized over multiple runs
- Approach is intended for long-running applications

Summary of Our Approach

Tool input: program + checkpoint definition
Performs static analyses and code instrumentation
Tool output: two program versions
First, an augmented checkpointing version is executed once to record (parts of) the run-time program states
- At the checkpoint: heap objects, static fields, locals
- At certain points along the call chain leading to the checkpoint
Next, a pruned replaying version is executed multiple times
- Restore variables saved at the checkpoint
- Restore variables saved at points along the call chain
How do we resume execution from the checkpoint?
- Step 1: control flow quickly reaches the checkpoint
- Step 2: recover state at checkpoint
- Step 3: incrementally recover state after call sites along the call chain leading to the checkpoint
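
The slides do not say how the recorded state is written out or read back. As a minimal sketch, assuming standard Java serialization is an acceptable stand-in for the tool's run-time support, the record-once/restore-many idea for a heap object graph might look like the following. The class name StateCapture and its save/restore methods are hypothetical and are not taken from the tool.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper: an illustration only, not the tool's actual run-time library.
class StateCapture {
    // Checkpointing side: serialize everything reachable from root.
    static byte[] save(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Replaying side: rebuild the object graph from the recorded bytes.
    static Object restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two selected locals that alias the same heap object.
        HashSet<String> bodyPacks = new HashSet<>(Arrays.asList("jtp", "jop"));
        ArrayList<Object> selected = new ArrayList<>();
        selected.add(bodyPacks);
        selected.add(bodyPacks);

        byte[] snapshot = save(selected);                // done once, at the checkpoint
        List<?> recovered = (List<?>) restore(snapshot); // done on every replay

        System.out.println(recovered.get(0).equals(bodyPacks));
        // Aliasing inside one recorded graph is preserved by serialization.
        System.out.println(recovered.get(0) == recovered.get(1));
    }
}
```

Recording the selected roots as one graph, rather than one object at a time, keeps shared references shared after restoration, which matters when post-checkpoint code compares references.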

Definitions

Crosscut call chain (CC-chain)
- A programmer-specified call chain that leads to the method that contains the checkpoint
- E.g. main(44) -> run(28)
Decision points
- A call site on the CC-chain (e.g. m.run), due to polymorphism
- A predicate on which a decision point or the checkpoint is control-dependent
At a decision point, the checkpointing version records the control-flow outcome
The replaying version uses this info to force the control flow to reach the checkpoint
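
The record-then-force idea at decision points can be sketched as an ordered log: the checkpointing run appends each outcome, and the replaying run consumes the outcomes in the same order. This is a sketch under assumptions. DecisionLog and its method names are hypothetical, and the in-memory queue stands in for whatever persistent storage the tool actually uses.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical log of decision-point outcomes (illustration only).
class DecisionLog {
    private final Deque<String> entries = new ArrayDeque<>();

    // Checkpointing version: record the outcome of a predicate decision point.
    boolean saveBranch(boolean outcome) {
        entries.addLast(Boolean.toString(outcome));
        return outcome;
    }

    // Checkpointing version: record the run-time type of a call-site receiver.
    <T> T saveReceiverType(T receiver) {
        entries.addLast(receiver.getClass().getName());
        return receiver;
    }

    // Replaying version: read back the next recorded branch outcome.
    boolean readBranch() {
        return Boolean.parseBoolean(entries.removeFirst());
    }

    // Replaying version: read back the next recorded receiver type name.
    String readReceiverType() {
        return entries.removeFirst();
    }

    public static void main(String[] args) {
        DecisionLog log = new DecisionLog();

        // Checkpointing run: outcomes are appended in execution order.
        log.saveBranch(args.length == 0);
        log.saveReceiverType(new StringBuilder());

        // Replaying run: the same outcomes come back in the same order,
        // so the original predicates do not need to be re-evaluated.
        System.out.println(log.readBranch());
        System.out.println(log.readReceiverType());
    }
}
```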

Replaying, Step 1: Recover the Call Stack

Predicate decision point: recover boolean value
Call site decision point o.m(a1, …, an)
- Recover the run-time type of the receiver object; instantiated during replaying using sun.misc.Unsafe
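
One way a receiver of the recorded type could be created without running its constructor is sun.misc.Unsafe.allocateInstance. The sketch below assumes a JDK that still exposes sun.misc.Unsafe through the jdk.unsupported module and obtains the singleton through the commonly used reflective lookup of the theUnsafe field. ReceiverAllocator and its nested Main class are hypothetical names for illustration, and the slides do not show how the tool itself calls Unsafe.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical illustration of constructor-free allocation (not the tool's code).
class ReceiverAllocator {
    // Stand-in for a class on the CC-chain whose constructor may be expensive.
    static class Main {
        boolean constructed = false;

        Main() {
            constructed = true; // never runs when allocated through Unsafe
        }

        String run() {
            return "run() reached";
        }
    }

    // Allocates an instance of the named class without invoking any constructor.
    static Object allocateWithoutConstructor(String className) {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);
            return unsafe.allocateInstance(Class.forName(className));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The type name would come from the value recorded at the call-site decision point.
        String recordedType = Main.class.getName();
        Main m = (Main) allocateWithoutConstructor(recordedType);

        System.out.println(m.constructed); // fields keep their default values
        System.out.println(m.run());       // dispatch still reaches the right method
    }
}
```

The allocated object has only default field values, so it is useful for dispatching the call toward the checkpoint, not as a substitute for the state restored later.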

Checkpointing Version

void run(String[] args) {
    processCmdLine(args);
    loadNecessaryClasses();
    Set wp_packs = getWpacks();
    Set body_packs = getBpacks();
    boolean b = Options.v().whole_jimple();
=>  save(b);
    if (b) { // DP
        getPack("cg").apply();
        // --- checkpoint ---
=>      save(…);
        getPack("wjtp").apply();
        getPack("wjop").apply();
        getPack("wjap").apply();
    }
    retrieveAllBodies();
    …
}

static void main(String[] args) {
    Main m = new Main();
    boolean b = args.length != 0;
=>  save(b);
    if (b) { // DP
=>      save(type_of(m));
        m.run(args); // DP
    }
}

Replaying Version

void run(String[] args) {
    processCmdLine(args);
    loadNecessaryClasses();
    Set wp_packs = getWpacks();
    Set body_packs = getBpacks();
    boolean b = Options.v().whole_jimple();
=>  read(b);
    if (b) { // DP
        getPack("cg").apply();
        // --- checkpoint ---
=>      read(…);
        getPack("wjtp").apply();
        getPack("wjop").apply();
        getPack("wjap").apply();
    }
    retrieveAllBodies();
    …
}

static void main(String[] args) {
    Main m = new Main();
    boolean b = args.length != 0;
=>  read(b);
    if (b) { // DP
=>      read(type_of(m));
=>      unsafe.allocate(m);
=>      args = null;
        m.run(args); // DP
    }
}

Step 2: Recover at the Checkpoint

Our static analysis selects locals for recording (for checkpointing) / recovering (for replaying) when
- They are written before the checkpoint
- They are read after the checkpoint
Record primitive-typed values or entire object graphs on the heap (all reachable objects)
Static fields are selected based on the same idea

void run(String[] args) {
    processCmdLine(args);
    loadNecessaryClasses();
    Set wp_packs = getWpacks();
    Set body_packs = getBpacks();
    if (Options.v().whole_jimple()) {
        getPack("cg").apply();
        // --- checkpoint ---
        getPack("wjtp").apply();
        getPack("wjop").apply();
        getPack("wjap").apply();
    }
    retrieveAllBodies();
    for (Iterator i = body_packs.iterator(); i.hasNext();) { … }
    …
}

Selected local: body_packs
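
The selection rule for locals (written before the checkpoint and read after it) amounts to a set intersection. The sketch below assumes the two input sets have already been computed by some analysis; here they are written by hand to mirror the run() example. LocalSelection and selectLocals are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the selection rule (not the tool's analysis).
class LocalSelection {
    // A local is selected only if it is written before AND read after the checkpoint.
    static Set<String> selectLocals(Set<String> writtenBefore, Set<String> readAfter) {
        Set<String> selected = new TreeSet<>(writtenBefore);
        selected.retainAll(readAfter);
        return selected;
    }

    public static void main(String[] args) {
        // Hand-written inputs for the run() example; a real tool would derive them statically.
        Set<String> writtenBefore = new HashSet<>(Arrays.asList("args", "wp_packs", "body_packs"));
        Set<String> readAfter = new HashSet<>(Arrays.asList("body_packs"));

        System.out.println(selectLocals(writtenBefore, readAfter)); // prints [body_packs]
    }
}
```

In this example wp_packs is dropped because nothing after the checkpoint reads it, so neither it nor the objects reachable only from it need to be recorded.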

Selection of Static Fields

A whole-program Mod/Use analysis
- A static field is “written” if its value is changed, or any heap object reachable from it is mutated
- A static field is “read” if its value is directly read
Analysis algorithm
- Context-sensitive and flow-insensitive; uses the points-to solution and the call graph from Spark [Lhotak CC-03]
- Bottom-up traversal of the SCC-DAG of the call graph
- For each method m, a set Cm is maintained to contain all objects from which a mutated object can be reached
- Propagate backwards the objects in Cm that escape a callee method to its callers
- Select a static field fld if PointsToSet(fld) ∩ Cm ≠ ∅
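
The final selection test, PointsToSet(fld) ∩ Cm ≠ ∅, can be sketched with plain sets. This illustration covers only that last step. It assumes the points-to sets and Cm are already available, and it leaves out the context sensitivity and the bottom-up propagation over the call graph. The field names and abstract object labels (o1, o2, ...) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the selection test only (not the full analysis).
class StaticFieldSelection {
    // Selects every field whose points-to set shares at least one object with cm.
    static List<String> selectFields(Map<String, Set<String>> pointsTo, Set<String> cm) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : pointsTo.entrySet()) {
            if (!Collections.disjoint(e.getValue(), cm)) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up points-to sets over abstract objects o1..o3.
        Map<String, Set<String>> pointsTo = new LinkedHashMap<>();
        pointsTo.put("G.cache", new HashSet<>(Arrays.asList("o1", "o2")));
        pointsTo.put("G.name", new HashSet<>(Arrays.asList("o3")));

        // Made-up Cm: objects from which a mutated object can be reached.
        Set<String> cm = new HashSet<>(Arrays.asList("o2", "o5"));

        System.out.println(selectFields(pointsTo, cm)); // prints [G.cache]
    }
}
```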

Step 3: Recover after the Checkpoint

Replaying only at decision points and the checkpoint is not enough to guarantee correct execution after the checkpoint
Additionally record/recover local variables that will be read after each call site in CC-chain

void main() {
    Set hs = new HashSet();
    B b = new B(hs);
    //-- reco/rest (type_of(b))
    b.m();
    //-- extra reco/rest (hs)
    if (hs == b.s) { … }
}

class B {
    Set s;
    void m() {
        B r0 = this;
        r0.s = new HashSet();
        //-- checkpoint
        //-- reco/rest (r0)
        r0.s.add("");
    }
}

hs uninitialized

Additional Issues

A checkpoint can have multiple run-time instances
If a method in CC-chain has callers that are not in the chain, it has to be replicated
Currently do not support multi-threaded programs
Our technique does not guarantee the correctness of the execution when the post-checkpoint part of the program
- Depends on external resources, such as files, databases
- Depends on unique-per-execution values, such as clock
- Is modified with new cross-checkpoint dependencies
Multiple execution regions
- Designated by a starting point and an ending point
- Specified by two CC-chains

Study 1: Static Analysis

Program       #R   #IP
jb-6.1         3     5
jlex-1.2.6     3     8
db             2     5
jtar-1.21      2     4
jflex          2     8
violet         4     9
jess           3     8
sablecc        4    11
javacup        4     9
soot-2.2.3    10    35
raytrace       3    10
socksecho      3    14
socksproxy     3    11
compress       1     6
muffine        3    20

Static Analysis: Locals Reduction

[Bar chart: Total Locals vs. Selected Locals for each program (compress, socksproxy, socksecho, raytrace, soot-2.2.3, muffine, sablecc, jess, violet, javacup, jtar-1.21, db, jflex, jb-6.1, jlex-1.2.6); y-axis from 0 to 1800]

Static Analysis: Static Fields Reduction

[Bar chart: Total SF vs. Selected SF; y-axis from 0 to 3500]

Static Analysis: Removed/Inserted Statements

[Bar chart: Stmts Left after Pruning (%) and Stmts Inserted (%) for each program (compress, socksproxy, socksecho, raytrace, soot-2.2.3, muffine, sablecc, jess, violet, javacup, jtar-1.21, db, jflex, jb-6.1, jlex-1.2.6); y-axis from 0 to 120]

Static Analysis Cost

Phase 1: Soot infrastructure cost
- Between 1.64ms and 30.6ms per thousand Jimple statements
- On average, 11.1ms/1000 statements
Phase 2: Our analysis cost
- Between 1.67ms and 26.6ms per thousand Jimple statements
- On average, 9.4ms/1000 statements
This should be amortized across multiple runs of the replaying version
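
As a rough worked example of the amortization argument: assuming the two phase averages were measured over the same set of programs, they can be added to give a combined static cost. The program size of 100,000 Jimple statements and the count of k = 10 replay runs below are illustrative assumptions, not figures from the study.

```latex
\begin{align*}
c_{\text{static}} &= 11.1 + 9.4 = 20.5 \ \text{ms per 1000 Jimple statements} \\
T_{\text{static}} &= 20.5 \ \text{ms} \times \frac{100{,}000}{1000} = 2050 \ \text{ms} \approx 2.05 \ \text{s} \\
\frac{T_{\text{static}}}{k} &= \frac{2.05 \ \text{s}}{10} = 0.205 \ \text{s per replay run for } k = 10
\end{align*}
```

The per-run share of the one-time analysis cost shrinks in proportion to the number of replay runs.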

Study 2: Run-Time Performance (compress)

Original program: compressing and decompressing 5 big tar files several times
Evaluated for five checkpoint definitions
- One checkpoint, close to the beginning of the program
- Two regions of compression and decompression
- A region containing the process of compression
- A region containing the process of decompression
- One checkpoint, close to the end of the program

compress Performance

[Chart: normalized running times of the checkpointing version and the replaying version for checkpoint definitions 1 to 5; y-axis from 0 to 140]
[Chart: normalized size of captured program state, Size of Heap vs. Size of Captured Program State, for checkpoint definitions 1 to 5; y-axis from 0 to 100]

Study 2: Run-Time Performance (soot)

Input: soot-2.2.3 itself containing 2227333 methods
Phases
- Enabling cg.spark, wjtp, wjop.ji, wjap.uft, jtp, jop.cp
Evaluated for six checkpoint definitions
- Before whole-program packs
- After cg
- After wjtp
- After wjop
- After wjap
- After body packs

soot Performance

[Chart: normalized running times of the checkpointing version and the replaying version for checkpoint definitions 1 to 6; y-axis from 0 to 120]
[Chart: normalized captured program state, Size of Heap vs. Size of Captured Program State, for checkpoint definitions 1 to 6; y-axis from 0 to 100]

Study 2: Run-Time Performance (jflex-1.4.1)

Input: a .flex grammar file corresponding to a DFA containing 21769 states
Evaluated for four checkpoint definitions
- After NFA is generated
- After NFA is converted to DFA
- After minimization
- After emission

jflex Performance

[Chart: normalized running time of the checkpointing version and the replaying version for checkpoint definitions 1 to 4; y-axis from 0 to 150]
[Chart: normalized size of captured state, Size of Heap vs. Size of Captured Program State, for checkpoint definitions 1 to 4; y-axis from 0 to 100]

Summary of Evaluation

Static analysis successfully reduces the size of program state recorded and recovered
It is more meaningful to checkpoint/replay long-running programs
Checkpoints are better taken after a phase of long-running computation with (relatively) small output state
- √ compress: small program state, short running time
- √ soot: large program state, but very long computation time
- X jflex: large program state, short running time

Conclusions

A static-analysis-based checkpointing/replaying technique
An implementation and an evaluation that show our technique can be an interesting candidate for testing, debugging, and dynamic slicing of long-running programs
Future work
- Language-level checkpointing/replaying of multi-threaded programs
- More precise static analyses could be employed to reduce the size of program state to be captured
- The run-time support for object reading and writing could be improved


Questions?

compress

Original running time: 4.05s

Run  #Objects  Space      %Heap    Time_c (s) (%w_io)  Time_r (s) (%r_io)
1    31        471 bytes  0.17%    4.19 (0.74%)        4.14 (0.38%)
2    545       89.7M      28.8%    5.22 (10.4%)        3.19 (11.8%)
3    22        89.7by     28.9%    5.38 (9.0%)         2.17 (12.8%)
4    578       89M        26.7%    4.70 (12.3%)        1.39 (24.7%)
5    31        296 bytes  0.008%   4.17 (8.1%)         47 (34.0%)

soot

Original running time: 4665.7s

Run  #Objects  Space    %Heap   Time_c (s) (%w_io)  Time_r (s) (%r_io)
1    461058    36.2M    36.3%   4695.3 (0.4%)       4643.5 (0.5%)
2    65648481  745M     73.2%   4712.2 (7.2%)       4410.5 (9.1%)
3    65648481  745M     73.2%   4688.4 (6.9%)       4387.3 (8.7%)
4    77739391  806.4M   79.0%   4770.1 (8.0%)       511.5 (95.2%)
5    77767256  806.5M   63.5%   4972.8 (8.0%)       533.1 (97.8%)
6    75668735  795.3M   72.8%   4661.6 (8.0%)       411.5 (96.5%)

jflex

Original running time: 52.6s

Run  #Objects  Space    %Heap     Time_c (s) (%w_io)  Time_r (s) (%r_io)
1    6606489   259.8M   86.1%     64.9 (8.0%)         68.8 (18.3%)
2    6695173   385.1M   68.1%     65.2 (12.3%)        55.6 (26.1%)
3    6695172   385.1M   68.1%     63.9 (12.1%)        55.4 (26.0%)
4    21        2K       0.0003%   56.2 (0.14%)        0.063 (50.8%)

