Page 1

Data Centric Storage: GHT

Brad Karp, UCL Computer Science

CS 4C38 / Z25
17th January, 2006

Page 2

One View of Sensor Networks: Querying Zebra Sightings

• User remote; connected via base station
• How do users pose queries?
  – by event name (e.g., “Zebra?”)
• Query(“Zebra”) → {(“Zebra”, i, [u, v]); (“Zebra”, j, [x, y])}
• Geographic Hash Table (GHT)
  – In-network storage of data
  – Data placement, query routing built on geographic routing

[Figure: a remote user queries a sensor field in which sensors i at (u, v) and j at (x, y) have sighted zebras.]

Page 3

Problem: Data Dissemination in Sensornets

• Sensors numerous and widely dispersed
• Sensed data must reach remote user
• Data dissemination problem:
  – How best can we supply measured data to users?
• Design drivers for system:
  – Energy scarce
  – Wireless media prone to contention

Page 4

Context: Directed Diffusion [Estrin et al., 2000]

• Data-centric routing: flood queries (interests) by name
• Return any responses along reverse paths (sketched in code below)

[Figure: the query “Zebra?” is flooded through the network; sensors i at (u, v) and j at (x, y) return (“Zebra”, i, [u, v]) and (“Zebra”, j, [x, y]) along the reverse paths.]
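The two bullets above compress the whole mechanism, so here is a minimal Python sketch of the flood-then-reverse-path pattern. It is a toy over a hypothetical topology, not Directed Diffusion's actual implementation (which adds interest timeouts and gradient reinforcement): each node remembers the neighbour it first heard the interest from, and a matching response retraces those hops back to the sink.

```python
from collections import deque

# Hypothetical connectivity graph; "sink" is the querying base station.
adjacency = {
    "sink": ["a", "b"],
    "a": ["sink", "b", "i"],
    "b": ["sink", "a", "j"],
    "i": ["a"],
    "j": ["b"],
}

def flood_interest(source):
    """Flood an interest; record each node's first-heard-from neighbour."""
    next_hop = {source: None}          # gradient pointing back toward sink
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in next_hop:      # first copy heard wins
                next_hop[neighbour] = node
                frontier.append(neighbour)
    return next_hop

def reverse_path(detector, next_hop):
    """Route a matching event from a detecting node back to the sink."""
    path = [detector]
    while next_hop[path[-1]] is not None:
        path.append(next_hop[path[-1]])
    return path

gradients = flood_interest("sink")
print(reverse_path("i", gradients))    # e.g. ['i', 'a', 'sink']
```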

Page 5

Assumptions, Metrics, Terminology

• Large-scale networks with known geographic boundaries

• Users on WAN; a few APs with WAN uplinks
• Nodes know own geographic locations; often needed to annotate sensed data
• Energy metrics
  – Total usage: total number of packet txs
  – Hotspot usage: max. number of txs by one node (see sketch below)
• Event: discrete, named object recognized by sensor (e.g., “Zebra”)
• Query: request from user for data under same naming scheme
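The two energy metrics reduce to a sum and a max over per-node transmission counts; a short sketch with a hypothetical tally pins down the definitions:

```python
# Hypothetical per-node packet-transmission counts from one run.
txs = {"node_a": 120, "node_b": 87, "node_c": 301, "ap": 950}

total_usage = sum(txs.values())    # total usage: all packet txs
hotspot_usage = max(txs.values())  # hotspot usage: busiest node (the AP here)
print(total_usage, hotspot_usage)  # 1458 950
```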

Page 6

Outline

• Motivation and Context
• Canonical Data Dissemination Approaches
• Geographic Hash Table (GHT) Service
• Evaluation in Simulation
• Summary

Page 7

Canonical Approach: Local Storage

For n nodes, Q event names queried for, and Dq events detected with those names, cost (in pkts):
– Total: Qn + Dq√n
– Hotspot: Q + Dq (at access point)

[Figure: the query “Zebra?” is flooded; sensors i and j store their events (“Zebra”, i, [u, v]) and (“Zebra”, j, [x, y]) locally and reply to matching queries.]

Page 8

Canonical Approach: External Storage

For n nodes, Dt total events detected, cost (in pkts):
– Total: Dt√n
– Hotspot: Dt (at access point)

[Figure: every detected event, e.g. (“Zebra”, i, [u, v]), (“Zebra”, j, [x, y]), and (“Cat”, k, [s, t]), is sent to the access point for external storage.]

Page 9

Canonical Approach: Data-Centric Storage (DCS)

For n nodes, Q names queried, Dq of those events detected (and Dt events detected in total), cost (in pkts):
– Total (full enumeration): Q√n + Dt√n + Dq√n
– Total (summarization): Q√n + Dt√n + Q√n
– Hotspot (full enumeration): Q + Dq (at access point)
– Hotspot (summarization): 2Q (at access point)

[Figure: the user’s “Zebra?” query is routed to the single node that stores the zebra events from sensors i and j.]

Page 10

Cost Comparison of Canonical Approaches

• Local storage incurs greatest total message count as n grows

• External storage always sends fewer total messages than DCS

• When many more event types detected than queried for, DCS incurs least hotspot message count

• DCS permits summarization of events (return multiple events in one packet); see the worked example below
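Plugging one hypothetical workload into the cost formulas from the three preceding slides makes these comparisons concrete; the numbers are chosen so that many more events are detected (Dt) than belong to queried-for names (Dq):

```python
import math

# Hypothetical workload, using the slides' symbols n, Q, Dt, Dq.
n, Q, Dt, Dq = 10_000, 50, 5_000, 500
s = math.sqrt(n)            # ~hops for one point-to-point message

total = {
    "local":         Q * n + Dq * s,         # flood queries, return matches
    "external":      Dt * s,                 # ship every event to the AP
    "DCS (enum)":    Q * s + Dt * s + Dq * s,
    "DCS (summary)": Q * s + Dt * s + Q * s,
}
hotspot = {
    "local":         Q + Dq,
    "external":      Dt,
    "DCS (enum)":    Q + Dq,
    "DCS (summary)": 2 * Q,
}
for name in total:
    print(f"{name:14s} total={total[name]:>9,.0f}  hotspot={hotspot[name]:,}")
# local          total=  550,000  hotspot=550
# external       total=  500,000  hotspot=5,000
# DCS (enum)     total=  555,000  hotspot=550
# DCS (summary)  total=  510,000  hotspot=100
```

With Dt much larger than Dq, external storage sends the fewest total messages but concentrates every event at the access point, while DCS with summarization carries the lightest hotspot load.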

Page 11

Outline

• Motivation and Context
• Canonical Data Dissemination Approaches
• Geographic Hash Table (GHT) Service
• Evaluation in Simulation
• Summary

Page 12

Geographic Hash Table: A Sketch

• Two operations:
  – Put(k, v) stores event v under key k
  – Get(k) retrieves events associated with key k
• Hash key k into geo coordinates; store and retrieve events for that key at that location (sketched in code below)
  – Spreads key space storage load evenly across network!

[Figure: H(“Zebra”) = (a, b); zebra events from sensors i and j are stored at the node nearest (a, b), and the user’s “Zebra?” query is routed there.]
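A minimal Python sketch of that mapping, under stated assumptions: a hypothetical rectangular boundary, SHA-1 as the hash, and geographic delivery reduced to "hand the packet to the node closest to the hashed point" (the home node of the next slide, which real GHT reaches via GPSR routing):

```python
import hashlib
import math

WIDTH, HEIGHT = 1600.0, 1600.0     # assumed network boundary, in metres

def geographic_hash(key):
    """Hash an event name to a point inside the network boundary."""
    d = hashlib.sha1(key.encode()).digest()
    x = int.from_bytes(d[:8], "big") / 2**64 * WIDTH
    y = int.from_bytes(d[8:16], "big") / 2**64 * HEIGHT
    return x, y

class Node:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.store = {}            # key -> list of stored events

def home_node(nodes, point):
    """Stand-in for geographic routing: the node closest to the point."""
    return min(nodes, key=lambda n: math.dist((n.x, n.y), point))

def put(nodes, key, event):
    """Put(k, v): store the event at the key's home node."""
    home_node(nodes, geographic_hash(key)).store.setdefault(key, []).append(event)

def get(nodes, key):
    """Get(k): the same hash leads any querier to the same node."""
    return home_node(nodes, geographic_hash(key)).store.get(key, [])

field = [Node(200, 300), Node(900, 1200), Node(1400, 100)]
put(field, "Zebra", ("Zebra", "i", (200, 300)))
put(field, "Zebra", ("Zebra", "j", (900, 1200)))
print(get(field, "Zebra"))         # both zebra events from one lookup
```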

Page 13

Design Criteria for Scalable, Robust DCS

• Storage system must offer persistence despite node and link failures
  – If node holding k changes, queries and data must make consistent choice of new node

• Storage shouldn’t concentrate at any one node

• Storage capacity should increase with node count

• As ever, avoid traffic concentration, minimize message count

Page 14

GHT: Home Nodes and Perimeters

• Likely no node exactly at H(k); hash function ignorant of topology

• Home node: closest node to point output by H(k)
• Home perimeter: perimeter enclosing point output by H(k)

Page 15

Consistency: Perimeter Refresh Protocol (PRP)

• (k,v) pairs replicated at all nodes on home perimeter
• Non-home nodes on home perimeter: replica nodes
• Home node sends refresh packets every Th seconds, containing all (k,v), to H(k)
• Receiver of refresh who is closer to H(k) than originator consumes it, initiates its own
• Replica node becomes home node if its own refresh returns
• Upon forwarding a refresh, node resets takeover timer for Tt seconds; upon expiration, node generates a refresh for k
• Death timer: all nodes expire (k,v) pairs they cache after Td seconds; reset every time refresh for k received (timer interplay sketched below)
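The timer interplay is easier to follow in code. Below is a single-node, single-key skeleton: Th is the deck's refresh interval, while the Tt and Td values and the stubbed-out perimeter routing (including the "own refresh returned" test) are assumptions, not the paper's implementation.

```python
import time

TH = 10.0    # home node refresh interval (value from this deck)
TT = 30.0    # takeover timer -- assumed value
TD = 90.0    # death timer    -- assumed value

class PerimeterNode:
    def __init__(self, node_id, is_home=False):
        self.node_id = node_id
        self.is_home = is_home
        self.pairs = []                        # cached (k,v) pairs
        now = time.monotonic()
        self.next_refresh = now + TH           # used only while home node
        self.takeover_at = now + TT
        self.expire_at = now + TD

    def send_refresh(self, k):
        """Stub: route a refresh carrying self.pairs toward H(k)."""
        print(f"{self.node_id}: refresh({k})")

    def on_refresh(self, k, pairs, closer_to_hash_than_originator):
        """A refresh circulating the home perimeter reaches this node."""
        self.pairs = list(pairs)               # replicate on perimeter
        now = time.monotonic()
        self.expire_at = now + TD              # reset death timer
        self.takeover_at = now + TT            # reset takeover timer
        if closer_to_hash_than_originator:
            self.is_home = True                # consume it, originate our own
            self.send_refresh(k)

    def tick(self, k):
        """Called periodically to fire whichever timers have expired."""
        now = time.monotonic()
        if self.is_home and now >= self.next_refresh:
            self.send_refresh(k)               # home refreshes every TH s
            self.next_refresh = now + TH
        if not self.is_home and now >= self.takeover_at:
            self.send_refresh(k)               # home presumed dead: take over
            self.takeover_at = now + TT
        if now >= self.expire_at:
            self.pairs.clear()                 # stale (k,v) pairs die
```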

Page 16

Outline

• Motivation and Context
• Canonical Data Dissemination Approaches
• Geographic Hash Table (GHT) Service
• Evaluation in Simulation
• Summary

Page 17

Simulation Parameters

Radio Type: 802.11 MAC and PHY, 40 m range
Node Density: 1 node / 256 m²
Mobility Rate: 0.0, 0.1, 1.0 m/s
Number of Nodes: 50, 100, 150, 200
Query Generation Rate: 2 qps
Query Start Time: 42 s
Refresh Interval: 10 s
Event Types: 20
Simulation Time: 300 s
Events Detected: 10 / type
Up/Down Duty Cycle: [0, 120] s up; [0, 60] s down

Page 18

Query Success Rate w/ Node Failures (100 Nodes)

Page 19

Storage per Node w/ Node Failures (100 Nodes)

Page 20

Further Scaling and Robustness Results

• Mean and maximum storage load per node decrease as node population increases

• Query success rate above 96% for mobility rates of 0.1 m/s and 1 m/s

• Query success rate degrades gracefully as alternation between up/down states accelerates

• Validation of relative message costs of three canonical approaches in simulations of up to 100,000 nodes

Page 21

Follow-On Work in DCS

• Mapping geographic boundaries of a network; support hashing to inside a network with changing boundaries

• DCS without geographic routing: GEM [NeSo03]

• Range queries for GHT using K-D trees: DIM [LiGo03]

• Assigning coordinates for geographic routing using only topological knowledge (not, e.g., GPS) [RaRa03]

• Dealing with non-uniform node distributions; multiple hash functions [GaEs03]

Page 22

DCS: Summary

• Three canonical approaches will be useful in data dissemination for sensor networks: local storage, external storage, and data-centric storage
• Summarization is a key advantage of the DCS approach in reducing hotspot usage and total usage; the home node is a useful aggregation point
• Sensor applications with many nodes and many event types, not all of them queried for, are those where DCS offers the most attractive performance vs. the other canonical approaches
• GHT spreads storage load evenly across sensor networks
• GHT offers robust persistence under node failures and mobility, because it binds data to fixed locations, rather than to “volatile” nodes

