Avrora: Scalable Sensor Simulation with Precise Timing
Ben L. Titzer
UCLA CENS Seminar, February 18, 2005
IPSN 2005
http://compilers.cs.ucla.edu/avrora
Background - WSNs
Wireless Sensor Networks
– Microcontroller and battery powered
– Wireless communication
– Event-driven programming model
– Programmed with TinyOS and nesC
Mica2 Dot - based on Atmel AVR microcontroller
Background - Microcontrollers
Microcontrollers are small
– 128 KB code, 4 KB RAM, 4 KB EEPROM
– Processor, memory, IO on a single chip
– 4–16 MHz clock speed
– Interrupt-driven programming model
– No operating system
Motivation
Developing sensor software is hard
– Constrained resources, bare hardware
– Narrow interface for debugging
– Delicately timed driver code
– Distributed communication
Precise measurements are difficult
Current tools do a poor job
– TOSSIM, AtEmu
The Question
Can we achieve simulation of entire sensor networks?
– [1] And make it precise?
– [2] And make it flexible?
– [3] And make it fast?
– [4] And make it scalable?
The Goals of Avrora
Build a simulator for sensor networks
– Cycle accurate
– Energy accurate
– Simulates sensor devices
– Scales to large sensor networks
Allow detailed profiling and instrumentation
[1] Precision
Can we make it precise?
– Instruction-level simulation
– Cycle accurate
– Accurate device models
– Accurate radio / interference model
– Well-known
[2] Flexibility
Can we make the simulator flexible?
– Well-designed software architecture
– Clear interfaces
– Implemented in Java, object-oriented
– Instrumentation infrastructure
  • “Nonintrusive Precision Instrumentation of Microcontroller Software” submitted to LCTES 2005
Avrora Software Architecture
[Diagram: a Platform containing a Microcontroller (Simulator, Interpreter, and on-chip devices such as SPI, Timer, and Ports, reached through the IOReg interface and the event queue interface) and off-chip devices (Radio, LEDs) attached through the Pin interface and the SPI interface.]
On-chip devices are controlled by the program through IOReg objects
Off-chip devices are controlled through individual pins or through UART and SPI interfaces
Time-triggered behavior is accomplished by inserting events into the event queue
[3] Speed - Event Queue
How can we achieve speed while retaining cycle accuracy?
– Naïve implementation scales poorly
– Event interface simplifies devices
– Better performance
– Key to achieving parallelism for sensor network simulations
Event Queue Illustration
[Diagram: the Simulator's Interpreter drives a DeltaQueue holding events such as Timer0Event, ProfilingEvent, and UARTEvent; the queue entries carry relative delays 4, 24, 7, 12, 83.]
Interpreter tracks cycles consumed by each instruction
Decrement head of queue and fire event(s) when necessary
Retains cycle accuracy
Allows for sleep optimization
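The delta-queue idea on this slide can be sketched in a few lines of Python. This is a minimal illustration and not Avrora's Java implementation: the class and method names are hypothetical, and pairing the first three event names with the delays 4, 24, and 7 is an assumption based on the labels in the diagram.

```python
class DeltaQueue:
    """Event queue in which each entry stores its delay relative to the entry before it."""

    def __init__(self):
        self.entries = []  # list of [delta_cycles, [callbacks]]

    def insert(self, delay, callback):
        """Schedule `callback` to fire `delay` cycles from now."""
        i = 0
        # Walk past earlier entries, converting the absolute delay to a relative one.
        while i < len(self.entries) and delay > self.entries[i][0]:
            delay -= self.entries[i][0]
            i += 1
        if i < len(self.entries) and delay == self.entries[i][0]:
            self.entries[i][1].append(callback)   # same cycle: share the entry
        else:
            if i < len(self.entries):
                self.entries[i][0] -= delay       # keep the successor's absolute time
            self.entries.insert(i, [delay, [callback]])

    def advance(self, cycles):
        """Consume `cycles` cycles, firing every event that becomes due."""
        while self.entries and cycles >= self.entries[0][0]:
            cycles -= self.entries[0][0]
            _, callbacks = self.entries.pop(0)
            for cb in callbacks:
                cb()
        if self.entries:
            self.entries[0][0] -= cycles          # only the head is decremented


fired = []
q = DeltaQueue()
q.insert(4, lambda: fired.append("Timer0Event"))
q.insert(28, lambda: fired.append("ProfilingEvent"))  # stored as delta 24
q.insert(35, lambda: fired.append("UARTEvent"))       # stored as delta 7
q.advance(2)  # a 2-cycle instruction: nothing is due yet
q.advance(2)  # Timer0Event becomes due at cycle 4
print(fired, [entry[0] for entry in q.entries])
```

Because only the head entry is decremented per instruction, the per-instruction cost stays constant however many events are pending, and a sleeping node could skip straight to the head's delay.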
Single-node Performance
[Chart: Performance Comparison: AVR Simulators. Simulation speed in MHz (axis 0 to 35) on the benchmarks Livermore Loop 1, Livermore Loop 2, Livermore Loop 5, and Bubblesort, for Avrora (-server), Avrora, ATEMU, and simulavr.]
[4] Scalability
Sensor networks have many nodes (10s–1000s)
Software controlled radios
Microsecond-level interactions
High-fidelity simulation needed for precise measurements
The AtEmu Approach
Introduce global clock
Step all nodes one clock cycle at a time
Compute radio waveform (bit level)
Problems:
– Slow
– Scales poorly - O(n^2) interactions
– No parallelism
Observations
Communication has latency
Nodes only influence each other through communications
Other than that, nodes run in parallel
Hmm…
Parallel Simulation
Allow all nodes to run in parallel
– One thread per node
– Extends single-node simulation to network
– Better overall simulation performance
New Problem:
– Synchronization necessary to preserve timing and order of communications
– Efficient solutions?
Send-Receive Problem
Nodes send bytes to each other
– No node should be allowed to run too far ahead of other nodes that might try to send a byte to it
[Diagram: timelines for Node A and Node B starting at T=0; Node A sends byte A1 at T=k and Node B receives A1 at T=k+L.]
Node B should never be more than L cycles ahead of A
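The constraint above can be written as a small check. This is a sketch with hypothetical function names, assuming every potential sender exposes its current local cycle count; it is not taken from Avrora.

```python
def safe_horizon(sender_clocks, latency):
    """Latest local time a receiver may reach: the slowest sender's clock plus L."""
    return min(sender_clocks) + latency


def may_advance_to(target_time, sender_clocks, latency):
    """True if running to `target_time` keeps the receiver within L of every sender."""
    return target_time <= safe_horizon(sender_clocks, latency)


L = 3072  # cycles to send one byte on mica2, from the "Reality" slide
print(safe_horizon([1000, 5000], L))          # 4072
print(may_advance_to(4000, [1000, 5000], L))  # True
print(may_advance_to(4100, [1000, 5000], L))  # False
```

A byte sent by the slowest neighbor at its current time cannot arrive before that time plus L, so anything the receiver does before the horizon cannot be invalidated later.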
Sampling Problem
Nodes can sample current radio traffic
– Sample cannot be computed until all possible transmitters have passed the time when sampling was begun
[Diagram: timelines for Node A and Node B starting at T=0; Node A sends A1 while Node B samples RSSI over the interval from T=k to T=k+S.]
Node B cannot complete sample until node A passes time k
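The sampling rule can be expressed as a wait condition. The sketch below uses hypothetical names and treats "passed time k" as "local clock at least k", which is an assumption about the exact boundary.

```python
def lagging_transmitters(sample_start, transmitter_clocks):
    """Names of transmitters whose local clock has not yet reached the sample start."""
    return [name for name, clock in transmitter_clocks.items() if clock < sample_start]


def can_complete_sample(sample_start, transmitter_clocks):
    """True once every possible transmitter has reached the time sampling began."""
    return not lagging_transmitters(sample_start, transmitter_clocks)


clocks = {"A": 900, "C": 1500}
print(lagging_transmitters(1000, clocks))  # ['A']: node B must wait for node A
clocks["A"] = 1200
print(can_complete_sample(1000, clocks))   # True
```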
Reality
RSSI sampled infrequently
Nodes both send and receive
Latency L to send a byte on mica2
– 7372800 Hz / 2400 bps = 3072 cycles
Sampling time S to estimate RSSI
– 13 ADC cycles * 64 = 832 cycles
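The two constants can be reproduced with a short calculation. The variable names are illustrative. Two readings are assumptions: that the factor 64 is the number of CPU cycles per ADC cycle, and that the 2400 figure is a per-byte rate, since L is stated as the latency of one byte.

```python
CLOCK_HZ = 7_372_800            # mica2 CPU clock, from the slide
RADIO_RATE_PER_SECOND = 2400    # as written on the slide; treated here as bytes per second (assumption)
ADC_CYCLES_PER_SAMPLE = 13      # from the slide
CPU_CYCLES_PER_ADC_CYCLE = 64   # assumption: how the factor 64 is read here

L = CLOCK_HZ // RADIO_RATE_PER_SECOND                  # cycles to send one byte
S = ADC_CYCLES_PER_SAMPLE * CPU_CYCLES_PER_ADC_CYCLE   # cycles to take one RSSI sample

print(L, S)                     # 3072 832
print(L / CLOCK_HZ * 1e6)       # byte latency in microseconds
```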
Two Approaches
Synchronization Intervals
– Threads can’t run too far ahead
– Period has to be smaller than L
– Utilize event queue of each simulator
Wait for Neighbors
– Each thread waits for neighbors when necessary (sample or receive)
– Requires fast global data structure
Avrora uses both
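The synchronization-interval approach can be sketched with one thread per node and a barrier every `period` cycles. This is an illustrative model with hypothetical names, not Avrora's implementation: the slide says Avrora schedules synchronization through each simulator's event queue, whereas this sketch replaces instruction interpretation with a plain clock increment and uses Python's `threading.Barrier`.

```python
import threading


def run_network(num_nodes, period, latency, end_time):
    """Run node threads that meet at a barrier every `period` simulated cycles."""
    if period >= latency:
        raise ValueError("synchronization period must be smaller than L")
    clocks = [0] * num_nodes                 # local cycle count of each node
    barrier = threading.Barrier(num_nodes)
    lock = threading.Lock()
    max_skew = 0

    def node(i):
        nonlocal max_skew
        while clocks[i] < end_time:
            with lock:
                # Stand-in for interpreting instructions up to the next sync point.
                clocks[i] += period
                max_skew = max(max_skew, max(clocks) - min(clocks))
            barrier.wait()                   # nobody continues until all nodes arrive

    threads = [threading.Thread(target=node, args=(i,)) for i in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clocks, max_skew


clocks, skew = run_network(num_nodes=5, period=1024, latency=3072, end_time=10240)
print(clocks, skew)
```

Since no thread can pass a barrier before all others reach it, no two local clocks ever differ by more than one period, which is below L by construction.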
Synchronization Illustration
[Figure, animated over 12 slides: timelines for Nodes A, B, C, D, and E from T=0 to T=3L, with a legend for starting point, synchronization point, and delivery point, and a small network diagram of nodes A to E. Events appearing over the course of the animation include RSSI samples, Send C1 and Send C2 from node C, Send E1 from node E, and deliveries labeled C1, C2, and C2+E1.]
Results - Scalability
[Chart: Network Scalability Comparison: Sensor Network Simulators. Seconds (log scale, 0.1 to 10000) versus number of nodes (1 to 1024) for ATEMU, Avrora, and TOSSIM.]
Results - Parallelism
[Chart: Multiprocessor Scalability. Normalized execution time (0 to 1.2) versus number of processors (1 to 8) for networks of 1 node, 4 nodes, and 32 nodes.]
Measurements
Accurate timing useful for
– AEON: power and lifetime estimation
– MAC layer tuning
– Debugging driver code
– Latency estimation for in-network processing
– Real-time monitoring
Channel Utilization
[Chart: Channel Utilization. Bytes per second (0 to 5000) versus number of senders (1 to 512); series: Attempts, Delivered, Corrupted, Channel Limit.]
Partial Preamble Loss
Real radios take time to lock on
– First few bits of transmission lost
– Subsequent bytes misaligned
– MAC software layer must compensate
Latency L between transmission and first reception larger
– Admits more concurrency in simulation
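Adaptive Synchronization
Assume the first $k \in [k_l, k_h]$ bits of the first bytes transmitted are lost
Latency for first byte is then:
$L_f = L + k_l \cdot \mathit{cycles}_{bit}$
[Diagram: timelines for Node A and Node B with time marks T=0, T=k, T=k+L, T=k+2L, and T=k+3L; Node A sends bytes A1 to A4 (S: A1 … S: A4), Node B receives (R: A2+A3), and $k_l$ marks the lost bits at the start of the transmission.]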
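A worked instance of the first-byte latency formula, as a sketch: the function name is hypothetical, the cycles-per-bit value assumes L covers one 8-bit byte, and the lower bound of 4 lost bits is an invented example value, since the slides list measuring k as future work.

```python
def first_byte_latency(byte_latency, k_low, bits_per_byte=8):
    """L_f = L + k_l * cycles_bit, with cycles_bit derived from the per-byte latency."""
    cycles_per_bit = byte_latency // bits_per_byte
    return byte_latency + k_low * cycles_per_bit


L = 3072                               # cycles per byte on mica2, from the "Reality" slide
print(first_byte_latency(L, k_low=4))  # 3072 + 4 * 384 = 4608
```

A larger first-byte latency gives a longer window in which a receiver can run ahead of a sender that has not yet started transmitting.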
Back to the Question
Can we achieve simulation of entire sensor networks?
– [1] And make it precise? yes
– [2] And make it flexible? yes
– [3] And make it fast? yes
– [4] And make it scalable? yes
Future Work
Performance Improvements
– Sleeping nodes, M:N thread model
– Single-node improvements
Port to other mote platforms
Co-simulation with real network
Implement partial preamble loss
– Measure properties of k
Acknowledgements
NSF: money
Jens Palsberg: patience
Daniel Lee: device implementations
Simon Han: testing, timing validation
Olaf Landsiedel: AEON energy model
CENS: access to a stupidly big Sun V880 machine
Sun: for donating said machine