Sentomist: Unveiling transient WSN bugs via symptom mining 1
Sentomist: Unveiling Transient Sensor Network Bugs via Symptom Mining
Yangfan Zhou, Xinyu Chen, Michael R. Lyu
Dept. of Computer Science & Engineering
The Chinese University of Hong Kong
Jiangchuan Liu
School of Computing Science
Simon Fraser University
The 30th International Conference on Distributed Computing Systems
Wireless sensor networks
• Wireless Sensor Networks (WSNs)
  – For environmental data collection and monitoring
  – Networked wireless sensor nodes
• Sensor nodes
  – Sensor + Processor + Wireless
  – Simple hardware, e.g., Crossbow MicaZ
    • RAM: 8K
    • Processor: Atmel ATmega128 series (16 MHz)
  – Cannot run large software
Wireless sensor networks
• Most WSN applications look short and simple
  – With TinyOS, fewer than 100 lines of custom code are enough for a sensor node to sense and forward data
• But WSN deployments notoriously keep encountering system failures
  – Caused by software bugs
  – A major barrier to extensive application of WSNs
• Potential WSN users rank reliability as their top concern in adopting WSNs
Objective of this research
• Fighting WSN bugs is critical to maturing WSN applications
Contents
Preliminary: nature of WSN programs
A motivating example
Sentomist: our bug symptom mining approach
Case studies
Conclusions
Concurrency model of WSN applications
Why is simple code still so buggy?
• Special concurrency model
  – Event-driven
  – Multi-tasking
• Not extensively tested before deployment
Concurrency model of WSN applications
• Event-driven programming model
• Purpose: energy saving
  – Enter a power-conserving sleep mode when there is no event
  – Wake up upon event arrivals
• Events: interrupts
  – Packet arrival
  – Timer timeout
• Event procedure
  – Specific application logic for handling an event
Concurrency model of WSN applications
• Split an event procedure into two phases
  – Interrupt handler: triggered immediately
  – Tasks: put in a queue and executed in a FIFO manner (deferred function calls)
• Purpose
  – Multi-tasking: avoid resource monopolization
• Rules
  – Interrupt handler: triggered only by its corresponding hardware interrupt
  – Interrupt handlers and tasks: all run to completion unless preempted by other interrupt handlers
  – Tasks are posted by interrupt handlers or other tasks and executed in a FIFO manner
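The two-phase scheme can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class `Mote`, its methods, and the handler names are hypothetical and are not TinyOS APIs. It shows interrupt handlers running immediately while the tasks they post wait in a FIFO queue.

```python
from collections import deque

class Mote:
    """Toy model (hypothetical, not TinyOS code): handlers plus a FIFO task queue."""

    def __init__(self):
        self.tasks = deque()   # pending tasks, oldest first
        self.log = []          # what ran, in order

    def post(self, name, fn):
        """Defer work: append a task to the tail of the queue."""
        self.tasks.append((name, fn))

    def interrupt(self, name, handler):
        """Run an interrupt handler immediately, to completion."""
        self.log.append(f"int:{name}")
        handler(self)

    def run_next_task(self):
        """Run the oldest pending task to completion; return False when idle."""
        if not self.tasks:
            return False
        name, fn = self.tasks.popleft()
        self.log.append(f"task:{name}")
        fn(self)
        return True


def timer_handler(mote):
    # The handler does the minimum and defers the rest to a task,
    # which in turn posts a follow-up task.
    mote.post("sample", lambda m: m.post("send", lambda m2: None))

def radio_handler(mote):
    mote.post("forward", lambda m: None)


mote = Mote()
mote.interrupt("timer", timer_handler)
mote.interrupt("radio", radio_handler)   # arrives before any task has run
while mote.run_next_task():
    pass
print(mote.log)
```

Tasks of the two event procedures end up interleaved in the queue ("sample", "forward", "send"), which is the kind of interleaving the following slides are concerned with.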
Concurrency model of WSN applications
Event procedure
An event procedure instance starts at the entry of its corresponding interrupt handler.
If the interrupt handler posts tasks, the instance ends when its last task has been executed; otherwise, it ends when the interrupt handler exits.
Concurrency model of WSN applications
• Concurrency model of WSN applications
  – Random starting and random interleaving of event procedures
  – A new and complicated concurrency model
• Resulting in transient bugs
  – Caused by occasional interleaving of event procedures bearing implicit dependencies
  – Hard to trigger with simple testing scenarios
  – Symptoms are hard to identify in a long-running system
  – May cause fatal results
A motivating example: data pollution
Figure: a buggy function in an ADC event procedure, where packet->data will be polluted if the function is called again before the task prepareAndSendPacket runs.
Motivating example
• Data-race bug
  – packet->data is not protected
• Caused by the interleaving of event procedures
  – Triggered when a new data item arrives before the task prepareAndSendPacket runs
• Hard to trigger
  – Needs a variety of random interleaving scenarios to hit the condition
• Even if it is triggered, the symptom of the bug is not obvious
  – No way to detect the data pollution automatically
  – However, current testing and troubleshooting approaches depend on knowing whether an application runs correctly
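The race can be reproduced in a few lines of Python. The original listing is not available here, so the buffer logic below is an assumption based on the description (three readings per packet, a task posted once the buffer is full); `Node`, `adc_data_ready`, and `run_tasks` are hypothetical names, and `data` stands in for packet->data.

```python
from collections import deque

BUF_LEN = 3  # readings per packet (assumed from the case-study description)

class Node:
    def __init__(self):
        self.data = [0] * BUF_LEN   # shared buffer, stands in for packet->data
        self.count = 0
        self.tasks = deque()
        self.sent = []

    def adc_data_ready(self, reading):
        """ADC interrupt handler: store the reading, post the send task when full."""
        self.data[self.count % BUF_LEN] = reading
        self.count += 1
        if self.count % BUF_LEN == 0:
            self.tasks.append(self.prepare_and_send_packet)

    def prepare_and_send_packet(self):
        """Deferred task: reads the shared buffer with no protection."""
        self.sent.append(list(self.data))

    def run_tasks(self):
        while self.tasks:
            self.tasks.popleft()()


normal = Node()
for r in (1, 2, 3):
    normal.adc_data_ready(r)
normal.run_tasks()

racy = Node()
for r in (1, 2, 3):
    racy.adc_data_ready(r)
racy.adc_data_ready(4)   # next sample arrives before the task runs
racy.run_tasks()

print(normal.sent, racy.sent)
```

In the second run the packet is sent with the first slot already overwritten, and nothing in the program signals that anything went wrong.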
Motivating example
• A new approach to fighting transient bugs is critical for WSN applications
• Observation
  – Most execution patterns are
    • ADC interrupt, interrupt exit
    • ADC interrupt, posting a task, interrupt exit, running the task
  – When the bug is triggered, the pattern is something like
    • ADC interrupt, posting a task, interrupt exit, ADC interrupt, interrupt exit, running the task
    • Outlier!
Notion of Sentomist
• An important notion
  – Use the transient nature of such bugs
  – Although a very large number of testing scenarios is needed to trigger a bug, the application behaves normally in most of them
  – Hence, we can summarize the normal behaviors, since they are the dominant features
  – Outlier behaviors indicate transient bug symptoms
Sentomist: Sensor Application Anatomist
1. Generate a large number of testing scenarios and run the tests
2. Anatomize the program runtime into event procedure instances
3. Quantify each event procedure instance with an instruction counter
4. Mine bug symptoms with an outlier detection approach
Sentomist design
• Three critical issues in implementing Sentomist
  – How to decompose the program runtime into a set of time intervals
    • The program behaviors in the majority of the intervals should exhibit statistical similarity (normal system behaviors)
    • A natural granularity: the runtime of an event procedure instance
    • But how?
  – How to select a set of good attributes to characterize each interval
    • They must distinguish normal system behaviors from outliers
  – We need a generic outlier detection algorithm to find the intervals containing bug symptoms
Task 1: Identify event procedure in TinyOS
• Rules
  – Interrupt handler: triggered only by its corresponding hardware interrupt
  – Interrupt handlers and tasks: all run to completion unless preempted by other interrupt handlers
  – Tasks are posted by interrupt handlers or other tasks and executed in a FIFO manner
• Track task functions and interrupt handlers
  – Post a task (put it into the task queue)
  – Run a task (get it from the task queue)
  – Interrupt entry and exit
• Analyzing this sequence tells us when each event procedure instance starts and ends
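A minimal sketch of this analysis, not Sentomist's actual implementation, could look as follows in Python. The event names ("int", "reti", "post", "run", "done") are hypothetical, and the explicit "done" marker for task completion is an assumption made to keep the sketch short. The trace is assumed to start with an interrupt, so every post has an owner.

```python
from collections import deque

def split_event_procedures(trace):
    """Return (start, end) trace indices for each event procedure instance.

    trace is a list of hypothetical markers:
    "int" (interrupt entry), "reti" (interrupt exit),
    "post" (task posted), "run" (task starts), "done" (task finishes).
    """
    spans = []          # spans[k] = [start_index, end_index]
    pending = []        # pending[k] = tasks of instance k posted but not finished
    queue = deque()     # owning instance of each queued task, FIFO order
    int_stack = []      # instances whose handler is active (nesting = preemption)
    running = None      # instance owning the task currently running

    for i, ev in enumerate(trace):
        if ev == "int":
            spans.append([i, None])
            pending.append(0)
            int_stack.append(len(spans) - 1)
        elif ev == "post":
            # A post belongs to the active handler, else to the running task.
            owner = int_stack[-1] if int_stack else running
            queue.append(owner)
            pending[owner] += 1
        elif ev == "reti":
            k = int_stack.pop()
            if pending[k] == 0:
                spans[k][1] = i          # no task posted: ends at handler exit
        elif ev == "run":
            running = queue.popleft()    # FIFO: oldest posted task runs first
        elif ev == "done":
            pending[running] -= 1
            if pending[running] == 0:
                spans[running][1] = i    # last task of the instance finished
            running = None
    return [tuple(s) for s in spans]


normal = ["int", "reti", "int", "post", "reti", "run", "done"]
buggy = ["int", "post", "reti", "int", "reti", "run", "done"]
print(split_event_procedures(normal))
print(split_event_procedures(buggy))
```

In the second trace the first instance spans indices 0 to 6 and fully contains the second one (3 to 4), matching the interleaved pattern from the motivating example.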
Characterizing event procedures
Instruction counter
An instruction counter of an event procedure instance consists of N elements, where N is the total number of instructions in the program's machine code.
The ith element is the number of times the ith instruction is executed during the runtime of the event procedure instance.
It characterizes the system's behavior during an event procedure well.
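As a small illustration of the definition, the Python sketch below builds an instruction counter from the list of instruction indices executed during one instance. The function name and the toy traces are hypothetical; in practice the indices would come from the emulator's instruction-level trace.

```python
def instruction_counter(executed, n_instructions):
    """Count how many times each of the N instructions ran in one instance.

    executed: instruction indices (0 .. N-1) in execution order.
    """
    counts = [0] * n_instructions
    for idx in executed:
        counts[idx] += 1
    return counts


N = 6  # toy program with six instructions
straight = instruction_counter([0, 1, 2, 5], N)
looped = instruction_counter([0, 1, 2, 1, 2, 5], N)
print(straight)
print(looped)
```

Two instances that take different paths through the code yield different vectors, which is what lets an outlier detector tell them apart.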
Outlier detection approach
• One-class support vector machine (SVM)
  – Assumes that all data belong to one class, the normal class, and that the origin belongs to another class, the outlier class
  – Models the majority characteristics of a set of unlabelled samples
  – Most input samples belong to the normal class
    • If a sample is on the normal side, the closer it is to the boundary, the more suspicious it is as an outlier
    • Otherwise, the farther it is from the boundary, the more certain it is to be an outlier
• Rank the samples by how likely each is to contain bug symptoms
  – The ranking directs the order of human inspection to check whether a bug manifests
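The ranking step can be illustrated without an SVM library. The Python sketch below is deliberately NOT a one-class SVM: it swaps in a much simpler stand-in, scoring each instruction counter by its Euclidean distance from the component-wise median and ranking by that score. The function name and the sample vectors are made up for illustration.

```python
import math
from statistics import median

def rank_by_outlier_score(samples):
    """Rank samples from most to least suspicious.

    Stand-in scoring (not a one-class SVM): distance from the
    component-wise median of all samples.
    """
    center = [median(col) for col in zip(*samples)]
    scores = [math.dist(s, center) for s in samples]
    order = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Made-up instruction counters for six event procedure instances.
counters = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],   # takes a different path
    [2, 1, 2, 0],   # repeats part of the usual path
    [1, 1, 1, 0],
]
order, scores = rank_by_outlier_score(counters)
print(order[:2])
```

The instances at the head of the ranking are the ones a human would inspect first. A one-class SVM plays the same role but learns a boundary around the normal class instead of relying on a single center point.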
Emulation Environment
• AVRORA
  – A state-of-the-art emulator for real WSN applications
  – Runs a binary WSN application at the instruction level
  – Provides cycle-accurate emulation of the sensor node hardware functions and their interactions
    • Exactly meets our requirements: we target transient bugs caused by interleaved executions of event procedures, where timing accuracy is critical
• Why emulation
  – To extensively explore the program execution space for triggering transient bugs
  – Real deployment is not cost-effective
Case study I: data pollution
• The aforementioned motivating example
  – Each sensor node requests its sensor reading periodically with a timer
  – After collecting three sensor readings, it posts a task to send the three readings in a data packet
• Testing scenario
  – Data sampling periods (D) are 20ms, 40ms, 60ms, 80ms, and 100ms
• Collect 1099 instances of the ADC event procedure
• Sentomist outputs
• Cause: data race
• Solution: data protection
Case study I: data pollution
• Even for this simple application, the program trace of each testing run is very long
  – e.g., when the sampling period D = 20ms, the function-level log can reach tens of megabytes
  – Without Sentomist, it is labor-intensive to manually inspect whether the WSN application runs correctly in each testing run
Case study II: Packet loss
• Testing program: a multi-hop packet forwarding protocol based on BlinkToRadio, distributed with TinyOS
• Testing scenarios
  – Three motes are located in a straight line
  – Node 2 sends messages to node 0 via node 1
  – The packet forwarding mechanism at node 1
    • Obtain the packet content
    • Forward the packet to node 0 immediately
  – Randomize the packet sending rate of node 2 to inject a random sequence of packet arrival events for node 1
Figure: nodes 0, 1, and 2 in a line
Case study II: Packet loss
• Monitor each instance of the event procedure when node 1 forwards messages
• Collect 195 instances
• Sentomist output
Case study II: Packet loss
event message_t * Receive.receive(message_t * msg,
                                  void * payload, uint8_t len)
{
  ...
  if (TOS_NODE_ID == SINK) // TOS_NODE_ID is the node id.
  {
    ... // Copy the packet to serial send buffer.
    // Post a task for sending packet to serial port.
    post serialSendTask();
  }
  else
  {
    // Forward message to next hop.
    call AMSend.send(nextHopId, msg, msglen);
  }
  return msg;
}
Cause: improper design, attempting to send a packet immediately upon receiving it
Solution: queue up received packets and send them when the busy flag is cleared
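The cause and the fix can be contrasted in a small Python model. This is a sketch under assumptions, not the TinyOS code: `Radio`, `ImmediateForwarder`, and `QueueingForwarder` are hypothetical classes, and the radio is assumed to reject a send request while a previous packet is still in flight.

```python
from collections import deque

class Radio:
    """Hypothetical radio: one packet in flight at a time."""
    def __init__(self):
        self.busy = False
        self.delivered = []
        self._in_flight = None

    def send(self, pkt):
        if self.busy:
            return False                  # request rejected while busy
        self.busy, self._in_flight = True, pkt
        return True

    def send_done(self):
        """Transmission finished: deliver the packet, clear the busy flag."""
        self.delivered.append(self._in_flight)
        self.busy, self._in_flight = False, None


class ImmediateForwarder:
    """Buggy design: forward on receipt and ignore the send result."""
    def __init__(self, radio):
        self.radio = radio
    def on_receive(self, pkt):
        self.radio.send(pkt)              # packet is lost if the radio is busy
    def on_send_done(self):
        self.radio.send_done()


class QueueingForwarder:
    """Fixed design: queue packets, send when the busy flag is cleared."""
    def __init__(self, radio):
        self.radio, self.queue = radio, deque()
    def on_receive(self, pkt):
        self.queue.append(pkt)
        self._pump()
    def on_send_done(self):
        self.radio.send_done()
        self._pump()
    def _pump(self):
        if self.queue and not self.radio.busy:
            self.radio.send(self.queue.popleft())


def run(forwarder_cls):
    radio = Radio()
    fwd = forwarder_cls(radio)
    fwd.on_receive(1)
    fwd.on_receive(2)        # arrives before the first send completes
    fwd.on_send_done()
    if radio.busy:           # a queued packet was started; let it finish
        fwd.on_send_done()
    return radio.delivered


print(run(ImmediateForwarder), run(QueueingForwarder))
```

With back-to-back arrivals the immediate design silently drops the second packet, while the queueing design delivers both.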
Conclusions
• Transient bugs in WSN applications caused by random interleaving of event procedures are very difficult to identify
  – Long-term execution is needed to trigger the bugs
  – Identifying bug symptoms in long-term system execution data is labor-intensive
• We utilize the transient nature of such bugs
  – Most event procedure instances behave similarly
  – Outliers are indicators of bug symptoms
• We design Sentomist
  – Anatomize the long-term system runtime data into a set of event procedure instances
  – Detect abnormal event procedure instances with a plug-in outlier detection algorithm
• The effectiveness of Sentomist is demonstrated via representative case studies
Q&A
• Sentomist is a GUI-based open-source tool
• Download at
  – http://www.hkcloud.net/Sentomist
  – Including all case studies
Thank you!