Post on 16-Jan-2016
Department of Computer Science, University of Massachusetts, Amherst
TSAR*: A Two Tier Sensor Storage Architecture Using Interval Skip Graphs
Peter Desnoyers, Deepak Ganesan, and Prashant Shenoy
University of Massachusetts, Amherst
(*Tiered Storage ARchitecture)
UNIVERSITY OF MASSACHUSETTS, AMHERST
Why do we need archival storage?
Applications need historical sensor information. Why? Trigger events:
• Traffic monitoring - crash
• Surveillance - break-in
• Environmental monitoring - natural disaster
lead to requests for past information.
This requires archival storage.
Existing storage and indexing approaches
◊ Streaming query systems: TinyDB (Madden 2005), etc.
  Data storage and indexing is performed outside of the network. Optimized for continuous queries. High energy cost if used for archival: data must be transmitted to a central data store.
◊ In-network storage and indexing: DCS, GHT (Ratnasamy 2002), Dimensions (Ganesan 2003), Directed Diffusion (Intanagonwiwat 2000).
  Limited by lack of sufficient, energy-efficient storage and of communication and computation resources on current sensor platforms.
Technology Trends
Platform     Radio J/byte   Flash J/byte   Max flash size
Mica2        30             4.5            0.5MB
MicaZ        3.4            4.5            0.5MB
Telos        3.4            1              1MB
UMass NAND   -              0.01           >1GB

Annotations: 100x (flash energy per byte), 1000x (max flash size).
New flash technologies enable large storage systems on small energy-constrained sensors.
Hierarchical Storage and Indexing
Hierarchical deployments are being used to provide scaling:
• James Reserve (CENS)
Higher powered micro-servers are deployed alongside resource constrained sensor nodes.
Key challenge:
• Exploit proxy resources to perform intelligent search across data on resource-constrained nodes.
[Figure: three tiers: Sensors, Proxies, Application]
Key Ideas in TSAR
◊ Exploit storage trends for archival. Use cheap, low-power, high-capacity flash memory in preference to communication.
◊ Index at proxies and store at sensors. Exploit proxy resources to conserve sensor resources and improve system performance.
◊ Extract key searchable attributes. Distill sensor data into concise attributes, such as ranges of time or value, that may be used for location and retrieval but require less energy to transmit.
TSAR Architecture
1. Interval Skip Graph-based index between proxies.
• Exploit proxy resources to locate data stored on sensors in response to queries.
2. Summarization process
• Extracts identifying information: e.g. time period during which events were detected, range of event values, etc.
3. Local sensor data archive
• Stores detailed sensor information, e.g. images, events.
[Figure labels: Distributed index, Summarization function, Sensor node archive]
Example - Camera Sensing
Sensor archives information and transmits summary to proxy.
[Figure: on the sensor node, a Cyclops camera image is summarized; the image goes to storage, which yields a handle <id>, and the summary Birds(t1,t2)=1 is sent along with <id>.]
Example - Indexing
Summary and location information are stored and indexed at proxy.
[Figure: the summary Birds(t1,t2)=1 with handle <id> arrives at a proxy in the index network of proxies and is stored as the index entry: Birds | t1,t2 | 1 | <id>.]
Example - Querying and Retrieval
Query: Birds in interval (t1,t2)?
1. Query is sent to any proxy.
2. Index is used to locate sensors holding matching records.
3. Record is retrieved from storage and returned to application.
[Figure: a proxy holding the index entry Birds | t1,t2 | 1 | <id>, and two sensor nodes with Cyclops cameras; the handle <id> identifies the stored record.]
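The camera example (archive at the sensor, index the summary at a proxy, query, retrieve) can be sketched in a few lines of Python. This is an illustrative sketch only: the class and method names (`SensorNode`, `Proxy`, `store`, `locate`, `query`) are hypothetical, and a flat list stands in for the distributed index, which the talk builds with an Interval Skip Graph.

```python
from dataclasses import dataclass, field


@dataclass
class Summary:
    """Concise searchable attributes plus a handle to the archived record."""
    attribute: str
    t1: float
    t2: float
    value: int
    handle: tuple  # (sensor id, position in that sensor's archive)


@dataclass
class SensorNode:
    node_id: str
    archive: list = field(default_factory=list)  # stands in for local flash

    def store(self, attribute, t1, t2, record):
        # Archive the detailed record locally; only the summary is transmitted.
        self.archive.append(record)
        handle = (self.node_id, len(self.archive) - 1)
        return Summary(attribute, t1, t2, 1, handle)

    def fetch(self, handle):
        return self.archive[handle[1]]


class Proxy:
    def __init__(self):
        self.index = []  # flat list used here in place of the real index

    def insert(self, summary):
        self.index.append(summary)

    def locate(self, attribute, t1, t2):
        # Handles of summaries whose time interval overlaps [t1, t2].
        return [s.handle for s in self.index
                if s.attribute == attribute and s.t1 <= t2 and t1 <= s.t2]


def query(proxy, sensors, attribute, t1, t2):
    # Locate matching records via the proxy, then retrieve them from sensors.
    return [sensors[h[0]].fetch(h) for h in proxy.locate(attribute, t1, t2)]


sensors = {"s1": SensorNode("s1"), "s2": SensorNode("s2")}
proxy = Proxy()
proxy.insert(sensors["s1"].store("Birds", 10, 20, "image-A"))
proxy.insert(sensors["s2"].store("Birds", 40, 50, "image-B"))
print(query(proxy, sensors, "Birds", 15, 30))  # ['image-A']
```

Only the sensor holding a matching record is contacted, which is the point of indexing at the proxies.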
Outline of Talk
◊ Introduction and Motivation
◊ Architecture
◊ Example
◊ Design
  • Skip Graph
  • Interval Search
  • Interval and Sparse Interval Skip Graph
◊ Experimental Results
◊ Related Work
◊ Conclusion and Future Directions
Goals of Index Structure
The index should:
• support range queries over time or value,
• be fully distributed among proxies, and
• support interval keys indicating a range in time or value.
[Figure: an interval is inserted into the distributed index, and an interval query is issued against it.]
What is a Skip Graph?
Distributed extension of Skip Lists (Pugh '90) (Aspnes & Shah 2003, Harvey et al. 2003):
• Probabilistically balanced - no global rebalancing needed.
• Ordered by key - provides efficient range queries.
• Fully distributed - data is indexed in place.
Properties:
• log(N) search and insert
• No single root - load balancing, robustness
[Figure: skip graph over keys 2 3 5 6 9 12 18 19, highlighting a single key and its associated pointers.]
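As background, here is a minimal sketch of the single-machine skip list that skip graphs extend. It is not a skip graph: it has one head node, whereas a skip graph places every node in a list at each level and has no single root. The class name, the `MAX_LEVEL` bound, and the coin-flip probability of 0.5 are assumptions made for illustration.

```python
import random


class SkipList:
    MAX_LEVEL = 8  # assumed cap on tower height

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        # Each node is [key, forward_pointers]; the head spans all levels.
        self._head = [None, [None] * self.MAX_LEVEL]

    def _random_level(self):
        # Coin flips give probabilistic balance; no global rebalancing.
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key):
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for lvl in reversed(range(self.MAX_LEVEL)):
            while node[1][lvl] is not None and node[1][lvl][0] < key:
                node = node[1][lvl]
            update[lvl] = node  # last node before `key` at this level
        new = [key, [None] * self._random_level()]
        for lvl in range(len(new[1])):
            new[1][lvl] = update[lvl][1][lvl]
            update[lvl][1][lvl] = new

    def range(self, lo, hi):
        # Descend to the last node with key < lo, then walk level 0 in order.
        node = self._head
        for lvl in reversed(range(self.MAX_LEVEL)):
            while node[1][lvl] is not None and node[1][lvl][0] < lo:
                node = node[1][lvl]
        node = node[1][0]
        found = []
        while node is not None and node[0] <= hi:
            found.append(node[0])
            node = node[1][0]
        return found


keys = SkipList()
for k in [9, 2, 19, 5, 12, 3, 18, 6]:
    keys.insert(k)
print(keys.range(5, 12))  # [5, 6, 9, 12]
```

Because keys stay ordered at the bottom level, a range query is a search for the lower bound followed by a linear walk.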
Interval search
Given intervals [low,high] and query X:
1 - order by low
2 - find first interval with high >= X
3 - search until low > X
[Figure: number line 0 to 10 with intervals 0 3, 1 5, 2 3, 42, 5 8, 6 10, 8 9. Query: x=4]
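A minimal sketch of this search, finding every interval that contains the query point, using the clearly legible intervals from the figure. The function name `stabbing_query` is hypothetical, and the scan starts from the front of the ordered list rather than using an index to jump to the first candidate.

```python
def stabbing_query(intervals, x):
    """Return the intervals [low, high] that contain x."""
    ordered = sorted(intervals, key=lambda iv: iv[0])  # order by low
    hits = []
    for low, high in ordered:
        if low > x:
            break  # every later interval starts after x
        if high >= x:
            hits.append((low, high))
    return hits


intervals = [(0, 3), (5, 8), (6, 10), (8, 9), (2, 3), (1, 5)]
print(stabbing_query(intervals, 4))  # [(1, 5)]
```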
Simple Interval Skip Graph
Derived from Interval Tree, Cormen et al. 1990.
Method: index two increasing values, low and max. Search on either as needed.
low-high:  0-3   0-1   1-5   2-4   5-8   6-10  8-9   9-12
max:       3     3     5     5     8     10    10    12
• Interval keys: YES
• log N search: YES
• log N update: NO (worst case O(N))
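The two indexed values can be illustrated on a plain sorted array, reading `max` as the largest `high` seen so far in low order (an interpretation that reproduces the slide's row 3 3 5 5 8 10 10 12). This sketch uses Python's `bisect` on a list rather than a skip graph, and the function names are hypothetical; it also shows why an update can touch all N entries.

```python
from bisect import bisect_left
from itertools import accumulate


def build_index(intervals):
    # Order by low (stable), then compute the running maximum of high.
    ordered = sorted(intervals, key=lambda iv: iv[0])
    maxes = list(accumulate((high for _, high in ordered), max))
    return ordered, maxes


def stab(ordered, maxes, x):
    # maxes is non-decreasing, so binary search finds the first max >= x.
    start = bisect_left(maxes, x)
    hits = []
    for low, high in ordered[start:]:
        if low > x:
            break
        if high >= x:
            hits.append((low, high))
    return hits


slide = [(0, 3), (0, 1), (1, 5), (2, 4), (5, 8), (6, 10), (8, 9), (9, 12)]
ordered, maxes = build_index(slide)
print(maxes)                   # [3, 3, 5, 5, 8, 10, 10, 12]
print(stab(ordered, maxes, 4))  # [(1, 5), (2, 4)]

# Worst-case update: one early interval with a large high changes every max.
_, new_maxes = build_index([(0, 20)] + slide)
changed = sum(a != b for a, b in zip(new_maxes[1:], maxes))
print(changed)                 # 8
```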
Sparse Interval Skip Graph
Goal: efficient update of max(high) values in Interval Skip Graph.
Approach: take advantage of ratio of proxies (M) to data items (N).
Solution: eliminate redundant links and corresponding updates.
Before: complete search tree rooted at each data item.
After: retain M trees, one rooted at each proxy, keeping robustness and load balancing properties.
Worst-case complexity:
• Search: O(log M)
• Update: O(M)
[Figure: M proxies above N data items.]
Adaptive Summarization
How accurately should the summary information represent the original data?
• Detailed summaries = more summaries, precise index. Precise index = fewer wasted queries.
• Approximate summaries = fewer summaries, imprecise index. Imprecise index = more wasted queries.
[Figure: updates and queries]
Adaptive Summarization
Goal: balance update and query cost.
Approach: adaptation.
= summarization (summaries / data)
r = EWMA( wasted queries / data )
Target range: r0
Decrease if: r > r0
Increase if: r < r0
[Figure: updates and queries]
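One way to picture the feedback loop is the sketch below. It assumes the adapted quantity is a coarseness value (readings covered by one summary), so that decreasing it yields more detailed summaries when wasted queries exceed the target r0. The class name, the EWMA weight, the multiplicative step, and the bounds are all assumptions, not values from the talk.

```python
class SummaryAdapter:
    """EWMA-driven adjustment of a hypothetical summary coarseness."""

    def __init__(self, target, weight=0.25, coarseness=8.0,
                 step=1.5, lowest=1.0, highest=64.0):
        self.target = target          # r0 on the slide
        self.weight = weight          # EWMA smoothing weight (assumed)
        self.coarseness = coarseness  # readings per summary (assumed meaning)
        self.step = step
        self.lowest = lowest
        self.highest = highest
        self.r = 0.0                  # smoothed wasted-queries / data ratio

    def observe(self, wasted_queries, data_items):
        if data_items == 0:
            return self.coarseness
        sample = wasted_queries / data_items
        self.r = (1 - self.weight) * self.r + self.weight * sample
        if self.r > self.target:
            # Too many wasted queries: summarize in finer detail.
            self.coarseness = max(self.lowest, self.coarseness / self.step)
        elif self.r < self.target:
            # Few wasted queries: coarser summaries save update traffic.
            self.coarseness = min(self.highest, self.coarseness * self.step)
        return self.coarseness


adapter = SummaryAdapter(target=0.1)
after_burst = adapter.observe(wasted_queries=10, data_items=10)
for _ in range(30):
    settled = adapter.observe(wasted_queries=0, data_items=10)
print(after_burst < 8.0, settled)  # True 64.0
```

A burst of wasted queries pushes the summaries finer; a long quiet period lets them drift back to the coarsest allowed setting.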
Prototype and Experiments
◊ Software: Em* (proxy), TinyOS (sensor)
◊ Hardware: Stargate, Mica2 mote
◊ Network: 802.11 ad-hoc, multihop BMAC 11%
◊ Data: James Reserve [CENS] dataset, 30s temperature readings, 34 days
For physical experiments, data stream was stored on sensor node and replayed.
Index performance
How does the index performance scale with the number of proxies and size of dataset?
Tested in: Em* emulation
Tasks: insert, query
Variables: number of proxies (1-48), size of dataset
Metric: proxy-to-proxy messages
[Figure: queries and sensor data feeding the interval skip graph index.]
Index results
Sparse skip graph provides >2x decrease in message traffic for small numbers of proxies.
Sparse skip graph shows virtually flat message cost for larger index sizes.
Query performance
What is the query performance on real hardware and real data?
Tested on: 4 Stargate proxies, 12 Mica2 sensors in tree configuration
Task: query
Variables: size of dataset
Metric: query latency (ms)
[Figure: data and queries over a 4-proxy network and a 3-level multi-hop sensor field.]
Query results
• Sensor link latency dominates.
• Proxy link delay is negligible.
The sensor communication consists only of a query and a response - the minimal communication needed to retrieve the data.
Validates the approach of using proxy resources to minimize the number of expensive sensor operations.
Adaptive Summarization
How well does the adaptation mechanism track changes in conditions?
Tested in: Em*, EMTOSSIM emulation
Task: data and queries
Variables: query/data ratio
Metric: summarization factor
Summary algorithm adapts to data and query dynamics.
[Figure: varied query rate, with query/data = 0.2, 0.03, and 0.1; the summary rate adapts.]
Related Work
◊ In-network Storage: DCS (Ratnasamy 2002), Dimensions (Ganesan 2003), …
◊ In-network Indexing: GHT (Ratnasamy 2002), DIFS (Greenstein 2003), DIM (Li 2003), …
◊ Hierarchical Sensor Systems: Tenet (CENS, USC)
◊ Sensor Flash File Systems: ELF (Dai 2004), Matchbox (Hill et al. 2000)
Conclusions and Future Work
◊ Proposed novel Interval Skip Graph-based index structure and adaptive summarization mechanism for multi-tier sensor archival storage.
◊ Implemented these ideas in the TSAR system.
◊ Demonstrated index scalability, query performance, and adaptation of summarization factor, both in emulation and running on real hardware.
Future Work
◊ Investigate other index structures.
◊ Alternate interval- and non-interval-based summary mechanisms.
For more information: http://presto.cs.umass.edu