
Distributed Watchpoints: Debugging Very Large Ensembles of Robots

De Rosa, Goldstein, Lee, Campbell, Pillai

Aug 19, 2006

8/19/2006 Distributed Watchpoints2

Motivation

• Distributed errors are hard to find with traditional debugging tools

• Centralized snapshot algorithms
  – Expensive
  – Geared towards detecting one error at a time

• Special-purpose debugging code is difficult to write and may itself contain errors

Expressing and Detecting Distributed Conditions

“How can we represent, detect, and trigger on distributed conditions in very large multi-robot systems?”

• Generic detection framework, well suited to debugging

• Detect conditions that are not observable via the local state of one robot

• Support algorithm-level debugging (not code/HW debugging)

• Trigger arbitrary actions when condition is met

• Asynchronous, bandwidth/CPU-limited systems

Distributed/Parallel Debugging: State of the Art

Modes:

• Parallel: powerful nodes, regular (static) topology, shared memory

• Distributed: weak, mobile nodes

Tools:

• GDB

• printf()

• Race detectors

• Declarative network systems with debugging support (à la P2)

Example Errors: Leader Election

Scenario: One Leader Per Two-Hop Radius

Example Errors: Token Passing

Scenario: If a node has the token, exactly one of its neighbors must have had it in the previous timestep

Example Errors: Gradient Field

Scenario: Gradient Values Must Be Smooth

Expressing Distributed Error Conditions

Requirements:

• Ability to specify shape of trigger groups

• Temporal operators

• Simple syntax (reduce programmer effort/learning curve)

A Solution:

• Inspired by Linear Temporal Logic (LTL)
  – A simple extension to first-order logic
  – Proven technique for single-robot debugging [Lamine01]

• Assumption: Trigger groups must be connected
  – For practical/efficiency reasons

Watchpoint Primitives

• Modules (implicitly quantified over all connected sub-ensembles)

• Topological restrictions (pairwise neighbor relations)

• Boolean connectives

• State variable comparisons (distributed)

• Temporal operators

nodes(a,b,c); n(b,c) & (a.var > b.var) & (c.prev.var != 2)
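
To make the primitives concrete, here is a minimal brute-force sketch of how the example expression could be evaluated on a tiny ensemble. It assumes that `n(x,y)` means "x and y are neighbors", that `x.prev.var` is the value of `var` one timestep earlier, and that every ordered assignment of distinct modules forming a connected group is tried. The function names, the adjacency map, and the state values are all hypothetical; this is not the authors' implementation.

```python
from itertools import permutations

def is_connected(group, adj):
    """True if the modules in `group` form a connected subgraph of `adj`."""
    group = set(group)
    start = next(iter(group))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nb in adj[node]:
            if nb in group and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == group

def find_matches(adj, now, prev):
    """Brute-force evaluation of
    nodes(a,b,c); n(b,c) & (a.var > b.var) & (c.prev.var != 2)"""
    matches = []
    for a, b, c in permutations(sorted(adj), 3):
        if not is_connected((a, b, c), adj):
            continue  # trigger groups are assumed to be connected
        if c in adj[b] and now[a]["var"] > now[b]["var"] and prev[c]["var"] != 2:
            matches.append((a, b, c))
    return matches

# Hypothetical line of four modules: 1 - 2 - 3 - 4
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
now = {1: {"var": 5}, 2: {"var": 1}, 3: {"var": 0}, 4: {"var": 0}}
prev = {1: {"var": 2}, 2: {"var": 2}, 3: {"var": 7}, 4: {"var": 2}}
matches = find_matches(adj, now, prev)
```

Trying every ordered triple is only practical for very small ensembles; it is shown here for clarity rather than efficiency.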

Distributed Errors: Example Watchpoints

nodes(a,b,c); n(a,b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1)

nodes(a,b,c); n(a,b) & n(a,c) & (a.token == 1) & (b.prev.token == 1) & (c.prev.token == 1)

nodes(a,b); (a.state - b.state > 1)
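
As an illustration, the three watchpoints above can be written as plain Python checks over a snapshot of the ensemble. This sketch assumes each module's state is a dictionary, that `prev` holds the previous timestep, and that the two-module gradient watchpoint only applies to neighbors (since trigger groups must be connected). All function and variable names are hypothetical.

```python
from itertools import permutations

def two_leaders(adj, now):
    """nodes(a,b,c); n(a,b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1)"""
    return [(a, b, c)
            for b in sorted(adj)
            for a, c in permutations(sorted(adj[b]), 2)
            if now[a]["isLeader"] == 1 and now[c]["isLeader"] == 1]

def token_from_two(adj, now, prev):
    """nodes(a,b,c); n(a,b) & n(a,c) & (a.token == 1)
       & (b.prev.token == 1) & (c.prev.token == 1)"""
    return [(a, b, c)
            for a in sorted(adj) if now[a]["token"] == 1
            for b, c in permutations(sorted(adj[a]), 2)
            if prev[b]["token"] == 1 and prev[c]["token"] == 1]

def rough_gradient(adj, now):
    """nodes(a,b); (a.state - b.state > 1), with a and b neighbors."""
    return [(a, b)
            for a in sorted(adj)
            for b in sorted(adj[a])
            if now[a]["state"] - now[b]["state"] > 1]

# Hypothetical line of three modules, 1 - 2 - 3, seeded with all three errors.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
now = {1: {"isLeader": 1, "token": 0, "state": 0},
       2: {"isLeader": 0, "token": 1, "state": 1},
       3: {"isLeader": 1, "token": 0, "state": 3}}
prev = {1: {"token": 1}, 2: {"token": 0}, 3: {"token": 1}}

leader_hits = two_leaders(adj, now)
token_hits = token_from_two(adj, now, prev)
gradient_hits = rough_gradient(adj, now)
```

Each match is reported once per ordered assignment, so a symmetric error such as two leaders around module 2 shows up as both `(1, 2, 3)` and `(3, 2, 1)`.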

Watchpoint Execution

nodes(a,b,c)…

[Figure: watchpoint execution on an ensemble of modules numbered 1 to 32. Matchers shown: (1), (2), (3); then (1 2), (1 9), …; then (1 9 2) and (1 9 10) √]
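
The execution pictured on this slide can be approximated by the following sketch, which is an assumption about the mechanism rather than the authors' code: one matcher starts at every module, each matcher is extended one slot at a time with an unused neighbor of a module it already holds, and a matcher is dropped as soon as the clauses it can already evaluate fail. The names `run_matchers`, `prefix_ok`, and `whole_ok` are hypothetical, and the two-leader watchpoint is used as the test expression.

```python
def run_matchers(adj, size, partial_ok, full_ok):
    """Grow matchers one slot at a time; drop any that fail partial_ok.

    Returns (complete matches, number of matchers created).
    """
    created = len(adj)  # one single-slot matcher per module
    frontier = [(m,) for m in sorted(adj) if partial_ok((m,))]
    for _ in range(size - 1):
        grown = []
        for slots in frontier:
            # Next slot: any unused neighbor of a module already in the matcher.
            candidates = set().union(*(adj[m] for m in slots)) - set(slots)
            for nxt in sorted(candidates):
                created += 1
                if partial_ok(slots + (nxt,)):
                    grown.append(slots + (nxt,))
        frontier = grown
    return [s for s in frontier if full_ok(s)], created

# Hypothetical line of modules 1 - 2 - 3 - 4, with 1 and 3 both claiming leadership.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
leader = {1: 1, 2: 0, 3: 1, 4: 0}

def prefix_ok(slots):
    """Clauses of the two-leader watchpoint decidable from the filled slots."""
    a = slots[0]
    if leader[a] != 1:                                      # a.isLeader == 1
        return False
    if len(slots) >= 2 and slots[1] not in adj[a]:          # n(a,b)
        return False
    if len(slots) >= 3 and slots[2] not in adj[slots[1]]:   # n(b,c)
        return False
    return True

def whole_ok(slots):
    return prefix_ok(slots) and leader[slots[2]] == 1       # c.isLeader == 1

pruned, pruned_created = run_matchers(adj, 3, prefix_ok, whole_ok)
unpruned, unpruned_created = run_matchers(adj, 3, lambda s: True, whole_ok)
```

Both runs report the same matches, but the run that checks clauses as slots are filled creates fewer matchers, which is the intuition behind terminating matchers early.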

Performance: Watchpoint Size

• 1000 modules, running for 100 timesteps

• Simulator overhead excluded

• Application: data aggregation with landmark routing

• Watchpoint: are the first and last robots in the watchpoint in the same state?

[Chart: Watchpoint Size vs. Simulation Time. X axis: Size (slots): none, 2, 3, 4. Y axis: Time (s), 0 to 900]

Performance: Number of Matchers

• This particular watchpoint never terminates early

• Number of matchers increases exponentially

• Time per matcher remains within factor of 2

• Details of the watchpoint expression more important than size
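
A rough way to see why the matcher count climbs so steeply when no matcher terminates early is to count every ordered, connected assignment of modules to slots. The sketch below does this on a hypothetical 4 x 8 grid of modules with 4-way connectivity; the grid, the counting model, and the function names are assumptions for illustration and do not reproduce the 1000-module experiment.

```python
def matchers_per_size(adj, max_size):
    """Count matchers alive at each size when none terminates early."""
    frontier = [(m,) for m in adj]
    counts = [len(frontier)]
    for _ in range(max_size - 1):
        # Extend every matcher with each unused neighbor of its current modules.
        frontier = [slots + (nxt,)
                    for slots in frontier
                    for nxt in set().union(*(adj[m] for m in slots)) - set(slots)]
        counts.append(len(frontier))
    return counts

def grid(rows, cols):
    """4-connected rows x cols grid, modules keyed by (row, col)."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            adj[(r, c)] = {(rr, cc) for rr, cc in nbrs
                           if 0 <= rr < rows and 0 <= cc < cols}
    return adj

counts = matchers_per_size(grid(4, 8), 4)
print(counts)  # matchers of size 1, 2, 3, 4
```

Because every additional slot multiplies the number of live matchers by roughly the number of reachable neighbors, a clause that rejects matchers early has far more effect than the nominal watchpoint size.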

[Chart: Watchpoint Size vs. Number of Matchers. X axis: Size (slots): none, 2, 3, 4. Y axis: Matchers, 0 to 16,000,000]

Performance: Periodically Running Watchpoints

[Chart: Watchpoint Activity % vs. Time. X axis: Activity (%): 100%, 50%, 33%, 25%, 20%, never. Y axis: Time (ms), 0 to 70]

Future Work

• Distributed implementation

• More optimization

• User validation

• Additional predicates

Conclusions

• Simple, yet highly descriptive syntax

• Able to detect errors missed by more conventional techniques

• Low simulation overhead

Thank You

Backup Slides

Optimizations

• Temporal span

• Early termination

• Neighbor culling

• (one slide per)

