Fabián E. Bustamante, Winter 2007
t-kernel – Reliable OS support for WSN
L. Gu and J. Stankovic, in Proc. of the 4th SenSys, Oct. 2006. Best paper award.
Presenter: F. E. Bustamante
EECS 443 Advanced Operating Systems, Northwestern University
Motivation
Wireless sensor networks (WSNs) with
– Resource-constrained embedded microcontrollers
– Complex application requirements
OS support is very limited; applications (and their developers) could benefit from
– OS protection
– Virtual memory
– Preemptive scheduling
But microcontrollers do not have hardware support for this
How can we efficiently provide such support without hardware help?
Context – Resource constrained devices
Very-low-power configurations for
– Energy efficiency
– Small form factor
– Low cost
Minimum assumptions made here (REM)
– Reprogrammable
– External nonvolatile storage
– A certain amount of RAM available (4KB)
Context – Complex application requirements
Three real-world scenarios
– VM - VigilNet – large-scale surveillance network
• 30 middleware services & 40K source lines of code
• Using overlays in the absence of VM is not really an answer: application-specific, inefficient, labor-intensive, error-prone
– Scheduling - Acoustic sampling app
• Timing of microphone sampling is inaccurate due to FIFO task scheduling and a long-running computation task
• Avoiding tasks and using only interrupts doesn't solve it – now the clock handler's switch statement is your problem!
– OS control - Extreme scaling
• To ensure the OS gets the CPU back: grenade timer or periodic reboot
• Coarse control granularity; applications must adapt to this; long time without OS control
Approach - Naturalization
[Figure: Application, Naturalized program, OS (t-kernel)]
Load-time code modification
Naturalized program becomes a cooperative program supporting OS protection, VM & preemptive scheduling
Naturalizer
– In: binary instructions; Out: naturalized instructions (natin)
– Page by page
Paging
– Storage management
Dispatcher
– Controls execution
Naturalization and control
Modify all branching instructions
– Save registers, save destination and go to homeGate (welcomeHome)
– welcomeHome – retrieve the destination, look for a natin page (or create one) and transfer control to it
– Transferring control flow to an entry point – go to the natin page and go through the cascading branch chain
– Bridging for optimization – basically, avoid going to the kernel when it's safe to do so
[Figure: application code (ALU instructions and branches) mapped to a natin page with entry points, K-A transitions, A-K transitions, ignored instructions, and a page description; a miss case is also shown]
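A toy sketch of the branch rewriting described on this slide, in Python rather than real AVR code. The instruction tuples and the names `naturalize_page`, `save_regs` and `save_dest` are hypothetical stand-ins for illustration, not the actual t-kernel naturalizer; only `homeGate` comes from the slide.

```python
from typing import List, Tuple

# Hypothetical instruction model: ("alu", text) or ("branch", destination).
Instr = Tuple[str, ...]


def naturalize_page(page: List[Instr]) -> List[Instr]:
    """Rewrite every branch so that control goes back to the kernel."""
    natin: List[Instr] = []
    for instr in page:
        if instr[0] == "branch":
            # Save registers, save the destination, then go to homeGate.
            natin.append(("save_regs",))
            natin.append(("save_dest", instr[1]))
            natin.append(("jump", "homeGate"))
        else:
            # Non-branching instructions are copied unchanged in this toy model.
            natin.append(instr)
    return natin


page = [("alu", "add"), ("branch", "0x0120"), ("alu", "sub")]
natin_page = naturalize_page(page)
for instr in natin_page:
    print(instr)
```

The sketch leaves out bridging, which would let some branches skip the kernel when that is safe.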
VPC mapping
[Figure: same application-code-to-natin-page diagram as on the previous slide]
Minimum: 0 bytes in RAM
To enhance speed, index cache
– Hash has 16 possible results
[Figure: VPC bit fields; labels: Hash Index, VPC Offset, Tag; widths shown: 6 bits, 8 bits, 3 bits]
Three-level lookup for a VPC
Each VPC is hashed to a number of natin pages; all their entry points must be checked to decide whether there is a hit
1. VPC look-aside buffer (fast)
2. Two-associative VPC table
3. Brute-force search on the natin pages (slow)
[Figure: lookup flow. VPC → VPC look-aside buffer (hit: execute; miss: look at the 2-associative VPC table) → 2-associative VPC table (hit: execute; miss: search the hashed area) → hashed area (miss: translate). The 128K physical program memory holds kernel space, natin space and interrupt handlers.]
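The three lookup levels can be sketched as below. This is a minimal model under stated assumptions, not the t-kernel code: the `Dispatcher` class, the direct-mapped look-aside buffer, the modulo hashing, the oldest-first eviction and the `translate` callback are all hypothetical. Only the sizes (64 look-aside entries, 256 table entries in 2 ways) come from the implementation slide.

```python
LOOKASIDE_SLOTS = 64   # from the slides: 64-entry look-aside buffer
NUM_SETS = 128         # assumption: 256 table entries / 2 ways


class Dispatcher:
    def __init__(self, translate):
        self.translate = translate                   # naturalizes a new page
        self.lookaside = [None] * LOOKASIDE_SLOTS    # slot -> (vpc, addr)
        self.table = [[] for _ in range(NUM_SETS)]   # set -> up to 2 (vpc, addr)
        self.entry_points = []                       # stand-in for natin pages

    def _remember(self, vpc, addr):
        ways = self.table[vpc % NUM_SETS]
        if len(ways) == 2:
            ways.pop(0)                              # evict the older way
        ways.append((vpc, addr))
        self.lookaside[vpc % LOOKASIDE_SLOTS] = (vpc, addr)

    def lookup(self, vpc):
        # Level 1: VPC look-aside buffer (fast).
        entry = self.lookaside[vpc % LOOKASIDE_SLOTS]
        if entry is not None and entry[0] == vpc:
            return entry[1], "look-aside"
        # Level 2: two-associative VPC table.
        for v, addr in self.table[vpc % NUM_SETS]:
            if v == vpc:
                self.lookaside[vpc % LOOKASIDE_SLOTS] = (vpc, addr)
                return addr, "table"
        # Level 3: brute-force search over known entry points (slow).
        for v, addr in self.entry_points:
            if v == vpc:
                self._remember(vpc, addr)
                return addr, "search"
        # Miss everywhere: translate (naturalize) and record the result.
        addr = self.translate(vpc)
        self.entry_points.append((vpc, addr))
        self._remember(vpc, addr)
        return addr, "translated"


d = Dispatcher(translate=lambda vpc: 0x8000 + vpc)
print(d.lookup(0))    # first use: every level misses, so the page is translated
print(d.lookup(0))    # second use: served from the look-aside buffer
```

Each slower level refills the faster ones on a hit, so repeated branches to the same destination settle at level 1.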
Three memory areas
Physical address sensitive memory (PASM)
– Virtual/physical addresses are the same
– The fastest access
Stack memory
– Virtual/physical addresses directly mapped
– Fast access with boundary checks
Heap memory
– May involve a transition to kernel
– The slowest, sometimes involves swapping
Example configuration: PASM 0x0 to 0x100, stack memory 0x100 to 0x1000, heap memory 0x1000 to 0xFFFF
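A small sketch of how an address falls into one of the three areas in the example configuration. The boundaries are the ones on the slide; the function name `memory_area` and the choice of which side of each boundary is inclusive are assumptions.

```python
PASM_END = 0x100     # PASM: 0x0 up to (not including) 0x100, an assumption
STACK_END = 0x1000   # stack: 0x100 up to (not including) 0x1000
MEM_END = 0xFFFF     # heap: 0x1000 up to and including 0xFFFF


def memory_area(addr: int) -> str:
    """Classify a virtual data address by memory area."""
    if not 0 <= addr <= MEM_END:
        raise ValueError(f"address out of range: {addr:#x}")
    if addr < PASM_END:
        return "PASM"    # virtual == physical, fastest
    if addr < STACK_END:
        return "stack"   # directly mapped, boundary checked
    return "heap"        # may go through the kernel and swap


for a in (0x20, 0x400, 0x2000):
    print(hex(a), memory_area(a))
```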
Swapping area organization
Challenge
– After 10k writes, a flash page can no longer be used
– If swap-outs are evenly distributed over all pages, lifetime is maximized
Super page: associative cache, fast swaps
Overflow partition extends longevity – use it when a super page is gone (O(N) seek time, however)
System parameter – Af – associativity of super page (the larger it is, the faster the swaps but the shorter the lifetime)
[Figure: swap pages map to page slots in super pages while the maximum write number is not reached; once the maximum write number of the slots in the super page is reached, the overflow area is used]
32 bytes in RAM, 266 days (1,000 swaps/hour), 20% fast swaps
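A back-of-the-envelope check of what the 266-day figure implies, assuming each swap-out costs exactly one flash page write (an assumption; the slides do not say so). The variable names are illustrative.

```python
SWAPS_PER_HOUR = 1_000      # from the slide
LIFETIME_DAYS = 266         # from the slide
WRITES_PER_PAGE = 10_000    # flash page endurance from the slide

# Total swap-outs over the quoted lifetime.
total_swaps = LIFETIME_DAYS * 24 * SWAPS_PER_HOUR

# Page endurances consumed if every swap-out is one page write.
pages_worn = total_swaps / WRITES_PER_PAGE

print(total_swaps)   # 6384000
print(pages_worn)    # about 638.4
```

Under that assumption, the quoted lifetime corresponds to roughly 6.4 million swap-outs, or the full write endurance of about 638 flash pages.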
Implementation
Hardware parameters (MICA2)
– Data RAM: 4KB
– External flash: 512KB
– Program mem: 128KB
OS parameters
– Virtual mem.: 64KB
– Data frame: 64 frames
– Look-aside buffer: 64 entries
– 2-associative VPC: 256 entries
– System stack: 1KB
– I/O buffer: 516 bytes
Implementation details
– Code size (source): 10K source lines of code
– Code (binary): 29KB
Overhead of naturalization
Kernel transition time: ~20 cycles (2.6 µs on MICA2 @ ~8 MHz)
Kernel transition
– Saves/restores registers
– Checks the stack pointers
– Increments system counters
– May need to
• Look for destination address
• Trigger naturalization of a new page
• Re-link naturalized page
[Figure: relative execution time with iter]
Overhead from the app’s perspective
PeriodicTask
– Wake-up/poll-sensors/communicate
– Varying the amount of computation in each task
– Keep in mind the CPU idle ratio of TinyOS apps
Overhead of naturalized VM
Slowest stack access: 16 cycles
Heap access w/o swapping: 15 cycles
Heap access w/ swapping: 149,815 cycles
Swap out time: 20.3ms (near hardware’s limit)
Balance between swapping speed and longevity
An example: Associativity = 2
266 days, 20% fast swaps (assuming 1,000 swaps/hr)
[Figure: number of page faults and swap-outs for slidingwin]
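A quick consistency check of the numbers on this slide. The clock value of 7.3728 MHz is an assumption (the slides only say ~8 MHz for MICA2), as are the helper names and the example fault rate. The sketch also assumes the 149,815-cycle figure covers the whole faulting access.

```python
CLOCK_HZ = 7_372_800          # assumption: nominal MICA2 CPU clock
HEAP_HIT_CYCLES = 15          # heap access without swapping (slide)
HEAP_SWAP_CYCLES = 149_815    # heap access with swapping (slide)


def cycles_to_ms(cycles: float) -> float:
    """Convert CPU cycles to milliseconds at CLOCK_HZ."""
    return cycles / CLOCK_HZ * 1e3


def avg_heap_cycles(fault_rate: float) -> float:
    """Expected cycles per heap access for a given swap (fault) rate."""
    return (1 - fault_rate) * HEAP_HIT_CYCLES + fault_rate * HEAP_SWAP_CYCLES


print(round(cycles_to_ms(HEAP_SWAP_CYCLES), 2))   # close to the 20.3 ms swap-out time
print(round(avg_heap_cycles(0.001), 1))           # hypothetical 0.1% fault rate
```

Even a 0.1% fault rate raises the average heap access from 15 to roughly 165 cycles, which is why the number of swaps matters so much.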
Comparison to VM approach
Comparing with Maté, a VM for TinyOS
– A stack-based virtual architecture
– Comparison with an insertion-sort program
– Initial cost of t-kernel comes from naturalization
• After 100, it grows slowly since naturalization is a one-time overhead
– In contrast, bytecode translation has to be done every time
• And sophisticated optimizations for VMs cannot save you here
Of course, you could build Maté/TinyOS on top of t-kernel
Conclusions & Future Work
Optimizing compilers
Real-time specification in t-kernel
Characterizing WSN applications
WSN benchmarks
What if power were not an issue?
Prof. Alan Epstein - Using computer-chip fabrication techniques to make a gas-turbine engine that fits in the palm of his hand (MIT).