Avrora: Scalable Sensor Simulation with Precise Timing

Ben L. Titzer

UCLA CENS Seminar, February 18, 2005

IPSN 2005

http://compilers.cs.ucla.edu/avrora


Background - WSNs

Wireless Sensor Networks
– Microcontroller and battery powered
– Wireless communication
– Event-driven programming model
– Programmed with TinyOS and nesC

[Figure: Mica2 Dot, based on the Atmel AVR microcontroller]


Background - Microcontrollers

Microcontrollers are small
– 128 KB code, 4 KB RAM, 4 KB EEPROM
– Processor, memory, I/O on a single chip
– 4 to 16 MHz clock speed
– Interrupt-driven programming model
– No operating system


Motivation

Developing sensor software is hard
– Constrained resources, bare hardware
– Narrow interface for debugging
– Delicately timed driver code
– Distributed communication

Precise measurements are difficult

Current tools do a poor job
– TOSSIM, AtEmu


The Question

Can we achieve simulation of entire sensor networks?

– [1] And make it precise?
– [2] And make it flexible?
– [3] And make it fast?
– [4] And make it scalable?


The Goals of Avrora

Build a simulator for sensor networks
– Cycle accurate
– Energy accurate
– Simulates sensor devices
– Scales to large sensor networks

Allow detailed profiling and instrumentation


[1] Precision

Can we make it precise?
– Instruction-level simulation
– Cycle accurate
– Accurate device models
– Accurate radio / interference model
– Well-known


[2] Flexibility

Can we make the simulator flexible?
– Well-designed software architecture
– Clear interfaces
– Implemented in Java, object-oriented
– Instrumentation infrastructure
  • “Nonintrusive Precision Instrumentation of Microcontroller Software” submitted to LCTES 2005


Avrora Software Architecture

[Figure: block diagram with the labels Platform, Microcontroller, Simulator, Interpreter, on-chip devices (SPI, Timer, Ports), and off-chip devices (Radio, LEDs), joined by the IOReg, event queue, pin, and SPI interfaces.]

On-chip devices are controlled by the program through IOReg objects

Off-chip devices are controlled through individual pins or through UART and SPI interfaces

Time-triggered behavior is accomplished by inserting events into the event queue
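The interaction between IOReg objects and the event queue can be pictured with a small sketch. This is not Avrora's Java API: the class names `EventQueue`, `IOReg`, and `TimerControlReg`, and the 256-cycle overflow delay, are hypothetical and chosen only to show a register write scheduling time-triggered behavior.

```python
import heapq


class EventQueue:
    """Minimal time-ordered event queue (hypothetical stand-in for the simulator's queue)."""

    def __init__(self):
        self.now = 0
        self._heap = []
        self._seq = 0  # tie-breaker so callbacks are never compared

    def insert(self, callback, delay):
        heapq.heappush(self._heap, (self.now + delay, self._seq, callback))
        self._seq += 1

    def advance(self, cycles):
        target = self.now + cycles
        while self._heap and self._heap[0][0] <= target:
            when, _, callback = heapq.heappop(self._heap)
            self.now = when
            callback()
        self.now = target


class IOReg:
    """An 8-bit I/O register that the simulated program reads and writes."""

    def __init__(self):
        self.value = 0

    def read(self):
        return self.value

    def write(self, value):
        self.value = value & 0xFF


class TimerControlReg(IOReg):
    """Hypothetical on-chip timer: a nonzero write schedules an overflow event."""

    OVERFLOW_CYCLES = 256  # assumed delay, for illustration only

    def __init__(self, queue, on_overflow):
        super().__init__()
        self._queue = queue
        self._on_overflow = on_overflow

    def write(self, value):
        super().write(value)
        if self.value:
            self._queue.insert(self._on_overflow, self.OVERFLOW_CYCLES)
```

The device never polls the clock: it reacts to a register write and leaves the timing to the queue, which is the separation the slide describes.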


[3] Speed - Event Queue

How can we achieve speed while retaining cycle accuracy?
– Naïve implementation scales poorly
– Event interface simplifies devices
– Better performance
– Key to achieving parallelism for sensor network simulations


Event Queue Illustration

[Figure: the Simulator and its Interpreter beside a DeltaQueue whose entries are labeled 4, 24, 7, 12, 83 and hold events such as Timer0Event, ProfilingEvent, and UARTEvent.]

Interpreter tracks cycles consumed by each instruction

Decrement head of queue and fire event(s) when necessary

Retains cycle accuracy

Allows for sleep optimization
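A delta queue stores each entry's delay relative to the entry before it, so advancing time only touches the head. The following is a minimal sketch of that idea, not Avrora's actual DeltaQueue; the method names `insert` and `advance` are assumptions made for illustration.

```python
class DeltaQueue:
    """Event queue in which each entry holds its delay relative to the previous entry."""

    def __init__(self):
        self._entries = []  # each entry is [delta_cycles, [callbacks]]

    def insert(self, callback, cycles):
        """Schedule `callback` to fire `cycles` cycles from now."""
        remaining = cycles
        for i, entry in enumerate(self._entries):
            if remaining == entry[0]:
                entry[1].append(callback)  # same firing time: share the slot
                return
            if remaining < entry[0]:
                entry[0] -= remaining      # keep the later entry's absolute time
                self._entries.insert(i, [remaining, [callback]])
                return
            remaining -= entry[0]
        self._entries.append([remaining, [callback]])

    def advance(self, cycles):
        """Consume `cycles` cycles, firing every event that becomes due."""
        while self._entries and cycles >= self._entries[0][0]:
            delta, callbacks = self._entries.pop(0)
            cycles -= delta
            for callback in callbacks:
                callback()
        if self._entries:
            self._entries[0][0] -= cycles  # only the head is decremented
```

In this sketch the interpreter would call `advance` with the cycle count of each instruction it executes. A sleeping node could instead jump straight to the head of the queue, which is one way to read the "sleep optimization" bullet.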


Single-node Performance

[Figure: Performance Comparison: AVR Simulators. Simulation speed in MHz (0 to 35) on the benchmarks Livermore Loop 1, Livermore Loop 2, Livermore Loop 5, and Bubblesort, for Avrora (-server), Avrora, ATEMU, and simulavr.]


[4] Scalability

Sensor networks have many nodes (tens to thousands)

Software-controlled radios

Microsecond-level interactions

High-fidelity simulation needed for precise measurements


The AtEmu Approach

Introduce global clock

Step all nodes one clock cycle at a time

Compute radio waveform (bit level)

Problems:
– Slow
– Scales poorly - O(n^2) interactions
– No parallelism


Observations

Communication has latency

Nodes only influence each other through communications

Other than that, nodes run in parallel

Hmm…


Parallel Simulation

Allow all nodes to run in parallel
– One thread per node
– Extends single-node simulation to network
– Better overall simulation performance

New Problem:
– Synchronization necessary to preserve timing and order of communications
– Efficient solutions?


Send-Receive Problem

Nodes send bytes to each other
– No node should be allowed to run too far ahead of other nodes that might try to send a byte to it

[Figure: timelines for Node A and Node B starting at T=0. Node A sends byte A1 at T=k; Node B receives A1 at T=k+L.]

Node B should never be more than L cycles ahead of A
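The send-receive constraint can be written as a simple bound on how far a node may advance. This is a minimal sketch under the assumption that every other node is a potential sender; the function names `safe_horizon` and `may_advance` are hypothetical.

```python
import math


def safe_horizon(node, clocks, latency):
    """Latest simulated time `node` may reach: the slowest other node's clock plus L."""
    others = [t for n, t in clocks.items() if n != node]
    return min(others) + latency if others else math.inf


def may_advance(node, clocks, latency, target):
    """True if `node` can run up to `target` without getting more than L cycles ahead."""
    return target <= safe_horizon(node, clocks, latency)
```

For example, with A at cycle 1000, B at cycle 3500, and L = 3072, B may run up to cycle 4072 but no further until A advances.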


Sampling Problem

Nodes can sample current radio traffic
– Sample cannot be computed until all possible transmitters have passed the time when sampling was begun

[Figure: timelines for Node A and Node B starting at T=0. Node A sends byte A1; Node B takes an RSSI sample from T=k to T=k+S.]

Node B cannot complete sample until node A passes time k
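The sampling condition amounts to checking which possible transmitters are still behind the start of the sample. A minimal sketch follows, assuming a node counts as having passed time k once its clock is at least k; the function name `lagging_transmitters` is hypothetical.

```python
def lagging_transmitters(sampler, sample_start, clocks):
    """Nodes whose clocks have not yet reached the start of the sample.

    The sample can be completed only when this list is empty.
    """
    return sorted(
        node
        for node, time in clocks.items()
        if node != sampler and time < sample_start
    )
```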


Reality

RSSI sampled infrequently

Nodes both send and receive

Latency L to send a byte on mica2
– 7,372,800 Hz / 2400 bytes per second = 3072 cycles

Sampling time S to estimate RSSI
– 13 ADC cycles * 64 = 832 cycles
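The two figures can be checked with a few lines of arithmetic. This sketch reads the 2400 figure as bytes per second, since the result is the latency of one byte, and reads the factor of 64 as CPU cycles per ADC cycle; both readings are assumptions, and the constant names are made up for illustration.

```python
CLOCK_HZ = 7_372_800         # mica2 CPU clock, from the slide
BYTES_PER_SECOND = 2400      # radio rate, read as bytes per second (assumption)
ADC_CYCLES_PER_SAMPLE = 13   # ADC cycles per conversion, from the slide
CPU_CYCLES_PER_ADC_CYCLE = 64  # assumed meaning of the "* 64" factor

# Latency L: CPU cycles that elapse while one byte is on the air.
latency_cycles = CLOCK_HZ // BYTES_PER_SECOND

# Sampling time S: CPU cycles needed for one RSSI conversion.
sampling_cycles = ADC_CYCLES_PER_SAMPLE * CPU_CYCLES_PER_ADC_CYCLE
```

Since S is shorter than L, an RSSI sample ties a node to its neighbors for a shorter window than a byte reception does.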


Two Approaches

Synchronization Intervals
– Threads can’t run too far ahead
– Period has to be smaller than L
– Utilize event queue of each simulator

Wait for Neighbors
– Each thread waits for neighbors when necessary (sample or receive)
– Requires fast global data structure

Avrora uses both
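The combination of the two approaches can be sketched with one thread per node, a barrier at the end of every synchronization period, and a shared clock table for waiting on neighbors. This is an illustrative sketch, not Avrora's implementation: a plain loop stands in for the per-simulator event queue, a dict guarded by a condition variable stands in for the fast global data structure, and all names (`GlobalClock`, `run_node`, `simulate`, `PERIOD`) are hypothetical.

```python
import threading

LATENCY = 3072  # cycles to send one byte on mica2, from the slides
PERIOD = 3000   # synchronization period; must be smaller than LATENCY


class GlobalClock:
    """Shared table of node clocks, guarded by a condition variable."""

    def __init__(self, node_ids):
        self._clocks = {n: 0 for n in node_ids}
        self._cond = threading.Condition()

    def publish(self, node, time):
        with self._cond:
            self._clocks[node] = time
            self._cond.notify_all()

    def wait_for_neighbors(self, node, time):
        """Block until every other node has reached `time`."""
        with self._cond:
            self._cond.wait_for(
                lambda: all(t >= time for n, t in self._clocks.items() if n != node)
            )

    def snapshot(self):
        with self._cond:
            return dict(self._clocks)


def run_node(node, clock, barrier, intervals, sample_times, log):
    now = 0
    for _ in range(intervals):
        end = now + PERIOD
        # Wait for neighbors: before a sample, publish our own time first
        # (so two samplers cannot block each other), then wait.
        for t in sorted(s for s in sample_times if now <= s < end):
            clock.publish(node, t)
            clock.wait_for_neighbors(node, t)
            log.append((node, t))
        # Synchronization interval: nobody starts the next period early.
        now = end
        clock.publish(node, now)
        barrier.wait()


def simulate(node_ids, intervals, samples):
    clock = GlobalClock(node_ids)
    barrier = threading.Barrier(len(node_ids))
    log = []
    threads = [
        threading.Thread(
            target=run_node,
            args=(n, clock, barrier, intervals, samples.get(n, []), log),
        )
        for n in node_ids
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clock, log
```

Calling `simulate(["A", "B", "C"], 4, {"B": [1000, 7000]})` runs three node threads for four periods, with node B pausing at cycles 1000 and 7000 until A and C have caught up.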


Synchronization Illustration

[Animated figure, built up over twelve slides: timelines for Nodes A through E on a time axis marked T=0, T=1L, T=2L, T=3L, beside a diagram of the network connecting A, B, C, D, and E. The legend distinguishes starting points, synchronization points, and delivery points. Over the course of the animation, RSSI samples are taken, Node C sends bytes C1 and C2, Node E sends byte E1, C1 and C2 are delivered, and the overlapping transmission C2+E1 is delivered.]


Results - Scalability

[Figure: Network Scalability Comparison: Sensor Network Simulators. Execution time in seconds (log scale, 0.1 to 10000) versus number of nodes (1 to 1024) for ATEMU, Avrora, and TOSSIM.]


Results - Parallelism

[Figure: Multiprocessor Scalability. Normalized execution time (0 to 1.2) versus number of processors (1 to 8) for simulations of 1 node, 4 nodes, and 32 nodes.]


Measurements

Accurate timing useful for
– AEON: power and lifetime estimation
– MAC layer tuning
– Debugging driver code
– Latency estimation for in-network processing
– Real-time monitoring


Channel Utilization

[Figure: Channel Utilization. Bytes per second (0 to 5000) versus number of senders (1 to 512), with series for Attempts, Delivered, Corrupted, and Channel Limit.]


Partial Preamble Loss

Real radios take time to lock on
– First few bits of transmission lost
– Subsequent bytes misaligned
– MAC software layer must compensate

Latency L between transmission and first reception larger
– Admits more concurrency in simulation


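Adaptive Synchronization

Assume the first $k$ bits, with $k \in [k_l, k_h]$, of the first bytes transmitted are lost

Latency for first byte is then:

$L_f = L + k_l \cdot \mathrm{cycles}_{\mathrm{bit}}$

[Figure: timelines for Node A and Node B with time marks T=0, T=k, T=k+L, T=k+2L, T=k+3L. Node A sends bytes A1 through A4 (S: A1 ... S: A4); Node B's first reception is labeled R: A2+A3, and an interval labeled $k_l$ is marked.]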
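The first-byte latency formula is easy to evaluate once the number of cycles per bit is known. The sketch below assumes 8 bits per byte with no framing overhead, so that cycles per bit is L / 8, and uses a made-up lower bound of 4 lost bits; neither value comes from the slides, and the function name is hypothetical.

```python
def first_byte_latency(latency_cycles, k_low_bits, cycles_per_bit):
    """L_f = L + k_l * cycles_bit: latency until the first byte can be received."""
    return latency_cycles + k_low_bits * cycles_per_bit


L = 3072                 # cycles per byte on mica2, from the earlier slide
CYCLES_PER_BIT = L // 8  # assumes 8 bits per byte, no framing overhead
K_LOW = 4                # hypothetical lower bound on lost preamble bits

first_latency = first_byte_latency(L, K_LOW, CYCLES_PER_BIT)
```

With these illustrative numbers the first byte arrives no earlier than 4608 cycles after transmission begins, so threads could be allowed to run that much further apart before the first delivery.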


Back to the Question

Can we achieve simulation of entire sensor networks?

– [1] And make it precise? yes
– [2] And make it flexible? yes
– [3] And make it fast? yes
– [4] And make it scalable? yes


Future Work

Performance Improvements
– Sleeping nodes, M:N thread model
– Single-node improvements

Port to other mote platforms

Co-simulation with real network

Implement partial preamble loss
– Measure properties of k


Acknowledgements

NSF: money
Jens Palsberg: patience
Daniel Lee: device implementations
Simon Han: testing, timing validation
Olaf Landsiedel: AEON energy model
CENS: access to a stupidly big Sun V880 machine
Sun: for donating said machine