Execution Replay and Debugging
Transcript
Page 1: Execution Replay and Debugging

Execution Replay and Debugging

Page 2: Contents

Contents

Page 3: Introduction

Introduction

• Parallel program: set of co-operating processes
• Co-operation using
  – shared variables
  – message passing
• Developing parallel programs is considered difficult:
  – normal errors, as in sequential programs
  – synchronisation errors (deadlock, races)
  – performance errors
• We need good development tools

Page 4: Debugging of parallel programs

Debugging of parallel programs

• Most used technique: cyclic debugging
• Requires repeatable, equivalent executions
• This is a problem for parallel programs: lots of non-determinism present
• Solution: an execution replay mechanism:
  – record phase: trace information about the non-deterministic choices
  – replay phase: force an equivalent re-execution using the trace, allowing the use of intrusive debugging techniques

Page 5: Non-determinism

Non-determinism

• Classes:
  – external vs. internal non-determinism
  – desired vs. undesired non-determinism
• Important: the amount of non-determinism depends on the abstraction level. E.g. a semaphore P() operation can be fully deterministic while consisting of a number of non-deterministic spinlocking operations.

Page 6: Causes of Non-determinism

Causes of Non-determinism

• In sequential programs:
  – program code (self-modifying code?)
  – program input (disk, keyboard, network, ...)
  – certain system calls (gettimeofday())
  – interrupts, signals, ...
• In parallel programs:
  – accesses to shared variables: race conditions (synchronisation races and data races)
• In distributed programs:
  – promiscuous receive operations
  – test operations for non-blocking message operations

Page 7: Main Issues in Execution Replay

Main Issues in Execution Replay

• recorded execution = original execution:
  – trace as little as possible in order to limit the overhead
    • in time
    • in space
• replayed execution = recorded execution:
  – faithful re-execution: trace enough

Page 8: Execution Replay Methods

Execution Replay Methods

• Two types: content-based vs. ordering-based
  – content-based: force each process to read the same value or to receive the same message as during the original execution
  – ordering-based: force each process to access the variables or to receive the messages in the same logical order as during the original execution

Page 9: Logical Clocks for Ordering-based Methods

Logical Clocks for Ordering-based Methods

• A clock C() attaches a timestamp C(x) to each event x
• Used for tracing the logical order of events
• Clock condition: a → b ⇒ C(a) < C(b)
• Clocks are strongly consistent if a → b ⇔ C(a) < C(b)
• New timestamp is the increment of the maximum of the old timestamps of the process and the object

Page 10: Scalar Clocks

Scalar Clocks

• Aka Lamport clocks
• Simple and fast update algorithm:

  SC'(p) = SC'(o) = max(SC(p), SC(o)) + 1

• Scales very well with the number of processes
• Provides only limited information:

  SC(a) < SC(b) ⇒ (a → b) or (a ∥ b)
  SC(a) > SC(b) ⇒ (b → a) or (a ∥ b)
  SC(a) = SC(b) ⇒ a ∥ b
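A minimal sketch of this update rule in C (the names scalar_clock and sc_tick are illustrative, not from the slides; a real recorder would perform the update atomically together with the instrumented operation):

    /* Scalar (Lamport) clock: SC'(p) = SC'(o) = max(SC(p), SC(o)) + 1 */
    typedef struct { unsigned long c; } scalar_clock;

    /* Called when process p performs an operation on shared object o;
       returns the timestamp attached to that event. */
    static unsigned long sc_tick(scalar_clock *p, scalar_clock *o)
    {
        unsigned long m = (p->c > o->c) ? p->c : o->c;
        p->c = o->c = m + 1;
        return p->c;
    }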

Page 11: Vector Clocks

Vector Clocks

• A vector clock for a program using N processes consists of N scalar values
• Update algorithm:

  VC'(p) = VC'(o) = sup(VC(p), VC(o)) + (0, ..., 0, 1, 0, ..., 0)

• Such a clock is strongly consistent: by comparing vector timestamps one can deduce concurrency information:

  VC(a) < VC(b) ⇔ a → b
  VC(a) ∥ VC(b) ⇔ a ∥ b
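A sketch of the update and the comparison in C for a fixed number of processes (N, vc_tick, vc_happened_before and vc_concurrent are illustrative names, not from the slides):

    #define N 2                      /* number of processes (assumption for the example) */

    typedef struct { unsigned long v[N]; } vclock;

    /* VC'(p) = VC'(o) = sup(VC(p), VC(o)) + e_pid */
    static void vc_tick(vclock *p, vclock *o, int pid)
    {
        for (int i = 0; i < N; i++) {
            unsigned long m = (p->v[i] > o->v[i]) ? p->v[i] : o->v[i];
            p->v[i] = o->v[i] = m;
        }
        p->v[pid]++;
        o->v[pid] = p->v[pid];
    }

    /* a -> b iff VC(a) < VC(b): no component larger, at least one smaller */
    static int vc_happened_before(const vclock *a, const vclock *b)
    {
        int smaller = 0;
        for (int i = 0; i < N; i++) {
            if (a->v[i] > b->v[i]) return 0;
            if (a->v[i] < b->v[i]) smaller = 1;
        }
        return smaller;
    }

    /* a || b iff neither a -> b nor b -> a */
    static int vc_concurrent(const vclock *a, const vclock *b)
    {
        return !vc_happened_before(a, b) && !vc_happened_before(b, a);
    }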

Page 12: An Example Program

An Example Program

• A parallel program with two threads, communicating using the shared variables A, B, MA and MB. Local variables are x and y.
• MA and MB are used as mutexes, built on an atomic swap operation provided by the CPU:

  swap(memloc, value):      (atomic)
      old = [memloc]
      [memloc] = value
      return old

Page 13: An Example Program (II)

An Example Program (II)

• Lock operation on a mutex M is implemented (in a library) as:

  while (swap(M, 1) == 1);

• Unlock operation on a mutex M is implemented as:

  M = 0;

• All variables are initially 0
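A hedged sketch of these two operations in C11, using atomic_exchange from <stdatomic.h> in place of the CPU's swap instruction (an assumption; the slides only require some atomic swap):

    #include <stdatomic.h>

    typedef atomic_int mutex_t;          /* 0 = free, 1 = held; all variables start at 0 */

    /* L(M): spin until the swap returns 0, i.e. the mutex was free */
    static void L(mutex_t *m)
    {
        while (atomic_exchange(m, 1) == 1)
            ;                            /* each failed swap is a synchronisation race */
    }

    /* U(M): release by storing 0 */
    static void U(mutex_t *m)
    {
        atomic_store(m, 0);
    }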

Page 14: An Example Program (III)

An Example Program (III)

• The example program:

  Thread 1:                 Thread 2:
  L(MA); A=8; U(MA);        B=6;
  L(MB); B=7; U(MB);        L(MB); x=B; U(MB);
                            L(MA); y=A; U(MA);
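For concreteness, a runnable version of this example with POSIX threads, re-declaring the swap-based mutex from the previous sketch (the final printf is only there to observe the outcome; it is not part of the slides):

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Shared variables, all initially 0, as in the slides. */
    static int A, B;
    static atomic_int MA, MB;            /* mutexes built on an atomic swap */

    static void L(atomic_int *m) { while (atomic_exchange(m, 1) == 1); }
    static void U(atomic_int *m) { atomic_store(m, 0); }

    static void *thread1(void *arg)
    {
        (void)arg;
        L(&MA); A = 8; U(&MA);
        L(&MB); B = 7; U(&MB);
        return NULL;
    }

    static void *thread2(void *arg)
    {
        int x, y;
        (void)arg;
        B = 6;                           /* unsynchronised write: data race with B = 7 */
        L(&MB); x = B; U(&MB);
        L(&MA); y = A; U(&MA);
        printf("x=%d y=%d\n", x, y);     /* result depends on the interleaving */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_create(&t2, NULL, thread2, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }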

Page 15: A Possible Execution: Low Level View

A Possible Execution: Low Level View

[Timeline diagram of one possible interleaving at the level of individual memory operations.
 Thread 1: swap(MA,1)→0, A=8, MA=0, swap(MB,1)→0, B=7, MB=0.
 Thread 2: B=6, three failed swap(MB,1)→1 attempts while thread 1 holds MB, swap(MB,1)→0, x=B, MB=0, swap(MA,1)→0, y=A, MA=0.]

Page 16: A Possible Execution: High Level View

A Possible Execution: High Level View

[The same execution at the level of L()/U() operations (time runs downwards).
 Thread 1: L(MA), A=8, U(MA), L(MB), B=7, U(MB).
 Thread 2: B=6, L(MB), x=B, U(MB), L(MA), y=A, U(MA).]

Page 17: Recap

Recap

• A content-based replay method: the value read by each load operation is stored
• Trace generation of 1 MB/s was measured on a VAX 11/780
• Hardly doable in practice: the time needed to record the large amount of trace information modifies the initial execution
• One advantage: it is possible to replay a subset of the processes in isolation
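A minimal sketch of the idea behind such content-based tracing (traced_load, the global flags and the text-format trace are hypothetical; a real system instruments every load of shared data):

    #include <stdio.h>

    static FILE *trace_file;     /* opened elsewhere: "w" for record, "r" for replay */
    static int   replaying;      /* 0 = record phase, 1 = replay phase */

    /* Instrumented load of a shared int. */
    static int traced_load(const int *addr)
    {
        int v;
        if (replaying) {
            if (fscanf(trace_file, "%d", &v) != 1)
                v = 0;                        /* trace exhausted (should not happen) */
        } else {
            v = *addr;                        /* real load */
            fprintf(trace_file, "%d\n", v);   /* record the value that was read */
        }
        return v;
    }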

Page 18: Recap: Example

Recap: Example

[The same low-level timeline, now annotated with the value recorded for each load:
 0 for each successful swap, 1 for each of the three failed swap(MB,1) attempts, 7 for x=B and 8 for y=A.]

Page 19: Instant Replay

Instant Replay

• First ordering-based replay method
• Developed for CREW (concurrent-read, exclusive-write) algorithms
• Each shared object receives a version number that is updated or logged at each CREW operation:
  – read: the version number is logged
  – write:
    • the version number is incremented
    • the number of preceding read operations is logged
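A sketch of the record-phase bookkeeping this describes (the struct and function names are illustrative, and the CREW acquire is assumed to already serialise these updates):

    /* Per shared object: version number plus readers of the current version. */
    struct object_log {
        unsigned long version;
        unsigned long readers;
    };

    /* Reader, after acquiring the object in read mode: log the version read. */
    static unsigned long record_read(struct object_log *o)
    {
        o->readers++;
        return o->version;              /* value written to the reader's trace */
    }

    /* Writer, after acquiring the object in write mode: bump the version and
       log how many reads of the previous version preceded this write. */
    static unsigned long record_write(struct object_log *o)
    {
        unsigned long preceding_reads = o->readers;
        o->version++;
        o->readers = 0;
        return preceding_reads;         /* value written to the writer's trace */
    }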

Page 20: Instant Replay: Example

Instant Replay: Example

[The high-level timeline annotated with Instant Replay bookkeeping:
 thread 1 writes A and B under write locks Lw/Uw (each write logs "version: 1, 0 reads");
 thread 2 reads B and A under read locks Lr/Ur (each read logs "version 1").
 The unsynchronised write B=6 is not covered by any CREW operation (marked PROBLEM in the slide).]

Page 21: Netzer

Netzer

• Widely cited method
• Attaches a vector clock to each process. The clocks attach a timestamp to each memory operation.
• Uses vector clocks to detect concurrent (racing) memory operations
• Automatically traces the transitive reduction of the dependencies

Page 22: Netzer: Basic Idea

Netzer: Basic Idea

[Diagram: the write B=6 by thread 2 and the write B=7 by thread 1 (performed after its swap(MB,1)→0), with the question: "Is this order guaranteed?"]

Page 23: Netzer: Transitive Reduction

Netzer: Transitive Reduction

[Diagram: the dependency from thread 1's B=7 to thread 2's x=B need not be traced explicitly; it follows transitively from the traced dependency between MB=0 and the successful swap(MB,1)→0.]

Page 24: Netzer: Example

Netzer: Example

[The same low-level timeline as on page 15.]

Page 25: Netzer: Example

Netzer: Example

[The same timeline annotated with vector timestamps.
 Thread 1: swap(MA,1) (1,0), A=8 (2,0), MA=0 (3,0), swap(MB,1) (4,0), B=7 (5,1), MB=0 (6,4).
 Thread 2: B=6 (0,1), failed swaps on MB (4,2) (4,3) (4,4), swap(MB,1) (6,5), x=B (6,6), MB=0 (6,7), swap(MA,1) (6,8), y=A (6,9), MA=0 (6,10).]

Page 26: Netzer: Example

Netzer: Example

[The same vector-timestamped timeline as on page 25, repeated.]

Page 27: Netzer: Problems

Netzer: Problems

• The size of a vector clock grows with the number of processes
  – the method doesn't scale well
  – what about programs that create threads dynamically?
• A vector timestamp has to be attached to all shared memory locations: huge space overhead
• The method basically detects all data and synchronisation races and replays them

Page 28: ROLT

ROLT

• Attaches a Lamport clock to each process. The clocks attach a timestamp to each memory operation.
• Does not detect racing operations, but merely re-executes them in the same order.
• Also automatically traces the transitive reduction of the dependencies.
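A hedged sketch of what the record phase could look like, judging from the traced pairs in the example that follows: each process keeps a Lamport clock and logs a pair (previous own timestamp, new timestamp) only when the clock jumps by more than one, which is what remains after the transitive reduction (rolt_record and the trace format are assumptions, not the published algorithm verbatim):

    #include <stdio.h>

    struct proc_clock {
        unsigned long c;        /* this process's Lamport clock */
        FILE *trace;            /* per-process trace file */
    };

    /* Called for every instrumented operation on a shared object whose
       Lamport clock is *obj_clock. */
    static void rolt_record(struct proc_clock *p, unsigned long *obj_clock)
    {
        unsigned long prev = p->c;
        unsigned long next = ((p->c > *obj_clock) ? p->c : *obj_clock) + 1;
        if (next > prev + 1)                         /* a jump: this ordering must be traced */
            fprintf(p->trace, "(%lu,%lu)\n", prev, next);
        p->c = *obj_clock = next;
    }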

Page 29: ROLT: Example

ROLT: Example

[The same timeline annotated with Lamport timestamps.
 Thread 1: swap(MA,1) 1, A=8 2, MA=0 3, swap(MB,1) 4, B=7 5, MB=0 8.
 Thread 2: B=6 1, failed swaps on MB 5, 6, 7, swap(MB,1) 9, x=B 10, MB=0 11, swap(MA,1) 12, y=A 13, MA=0 14.]

Page 30: ROLT: Example

ROLT: Example

[The same Lamport-timestamped timeline. Traced: only the clock jumps, i.e. (5,8) for thread 1 and (1,5),(7,9) for thread 2.]

Page 31: ROLT: Example

ROLT: Example

[The same Lamport-timestamped timeline as on page 29, repeated.]

Page 32: ROLT: Example

ROLT: Example

[The same Lamport-timestamped timeline as on page 29, repeated.]

Page 33: ROLT using three phases

ROLT using three phases

• Problem: high overhead due to the tracing of all memory operations
• Solution: only record/replay the synchronisation operations (a subset of all race conditions)
• Problem: no correct replay is possible if the execution contains a data race
• Solution: add a third phase for detecting the data races

Page 34: ROLT using three phases

ROLT using three phases

• Phase 1: record the order of the synchronisation races
• Phase 2: replay the synchronisation races while using intrusive data race detection techniques
• Phase 3: replay the synchronisation races and use cyclic debugging techniques to find the `normal' errors

Page 35: ROLT: Example

ROLT: Example

[High-level timeline with Lamport timestamps attached to the synchronisation operations only.
 Thread 1: L(MA) 1, U(MA) 2, L(MB) 3, U(MB) 4.
 Thread 2: L(MB) 5, U(MB) 6, L(MA) 7, U(MA) 8.
 Traced: nothing for thread 1, (0,5) for thread 2.]

Page 36: ROLT

ROLT

• ROLT replays synchronisation races and detects data races.
• The method scales well and has a small space and time overhead.
• Produces small trace files.
• A total order is imposed, which introduces artificial dependencies.

Page 37: Conclusions

Conclusions

