Medians and Beyond: New Aggregation Techniques for Sensor Networks

Date post: 31-Dec-2015

1

Medians and Beyond: New Aggregation Techniques for Sensor Networks

By: Gang Zhou
Computer Science Department
University of Virginia

CS851 Seminar Presentation

2

CS851 2005, Gang Zhou

Outline

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions

Be prepared! I have questions for you!

3

Motivations

- Trade computation for communication
  - Transmitting one bit over radio is at least three orders of magnitude more expensive in terms of energy consumption than executing a single instruction
- Support aggregation queries
  - Need an aggregated answer, not a single raw reading
  - Quantile query: Nth -> value
  - Reverse quantile query: value -> Nth
  - Consensus query: most frequent?
  - Histogram
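As a point of reference for the query types listed above, here is a minimal sketch of their exact (non-approximate) versions over a raw list of readings. The function names, the 1-based rank convention (rounding qn up), and the sample readings are assumptions made for illustration; they are not taken from the presentation.

```python
import math
from collections import Counter

def quantile(values, q):
    """Exact quantile: the value whose 1-based rank in sorted order is ceil(q * n)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def reverse_quantile(values, x):
    """Exact reverse quantile: the rank of x, taken here as the number of readings <= x."""
    return sum(1 for v in values if v <= x)

def consensus(values):
    """Exact consensus: the most frequent reading and how often it occurs."""
    value, count = Counter(values).most_common(1)[0]
    return value, count

# Hypothetical sensor readings.
readings = [3, 1, 4, 1, 5, 9, 2, 6]
median = quantile(readings, 0.5)          # rank 4 of [1, 1, 2, 3, 4, 5, 6, 9]
rank_of_4 = reverse_quantile(readings, 4)
most_frequent = consensus(readings)
```

Each of these needs every raw reading at one place, which is the communication cost that in-network aggregation tries to avoid.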

4

State of Art

- TinyDB project in Berkeley & Cougar project in Cornell
  - Pros:
    - Energy-efficient in-network data aggregation
    - Work very well for simple aggregates over single sensor values
      - MIN, MAX, AVERAGE, SUM, COUNT
  - Cons:
    - Do not deal with complex aggregate measures
      - Median, Quantile, Reverse Quantile, Consensus
- [Zhao et al. 2003]
  - Algorithms for constructing summaries like MAX, AVG
  - Focus more on network monitoring and maintenance
- [Przydatek et al. 2003]
  - Secure aggregation

5

Contributions

- Propose Q-Digest for approximated aggregation
- Provide strict theoretical guarantees on the approximation quality of the queries in terms of the message size
- Evaluate the performance of Q-Digest in simulation

6

Roadmap

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions and Discussions

7

Properties of Q-Digest

- Each node v in tree T is a bucket:
  - Its range [v.min, v.max] defines the position and width of the bucket;
  - It has a counter count(v);
- Given the compression parameter K, a node v is in the q-digest iff it satisfies:
  - (1) If not a leaf, no high count;
  - (2) If not the root, the node together with its parent and sibling should not have a low combined count;
- A q-digest is a set of buckets of different sizes and their associated counts;
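The two digest properties can be checked mechanically. The sketch below assumes the usual q-digest formulation, which the slide only paraphrases: "high count" means count(v) > floor(n/K), and "low count" means count(v) + count(parent) + count(sibling) <= floor(n/K). It also assumes heap-style node ids (root 1, children 2v and 2v+1, leaves sigma to 2*sigma-1) and a value range sigma that is a power of two. The function name and the sample counts are hypothetical; n = 15, K = 5, sigma = 8 are the parameters of the confidence factor example later in the deck.

```python
def satisfies_digest_properties(digest, n, k, sigma):
    """Check properties (1) and (2) for every node stored in the digest.

    digest maps node id -> count. Ids are heap-style: root is 1, the children
    of v are 2v and 2v+1, and the leaf for value x is sigma + x - 1.
    """
    threshold = n // k
    for v, count in digest.items():
        is_leaf = v >= sigma
        is_root = v == 1
        # Property (1): a non-leaf node must not have a high count.
        if not is_leaf and count > threshold:
            return False
        # Property (2): a non-root node, its parent and its sibling
        # must together exceed the threshold.
        if not is_root:
            parent, sibling = v // 2, v ^ 1
            if count + digest.get(parent, 0) + digest.get(sibling, 0) <= threshold:
                return False
    return True

# Assumed example: a compressed digest and the raw leaf counts it came from.
compressed = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}
raw_leaves = {8: 1, 10: 4, 11: 6, 12: 1, 13: 1, 14: 1, 15: 1}
compressed_ok = satisfies_digest_properties(compressed, 15, 5, 8)
raw_ok = satisfies_digest_properties(raw_leaves, 15, 5, 8)
```

The raw leaf counts fail the check because several sparse leaves violate property (2). Under these assumptions, the root is exempt from property (2) and leaves are exempt from property (1).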

8

Building a Q-Digest

- Go bottom up and check whether any node violates digest property (2);
  - If yes, delete the node and its sibling, and merge their counts into the parent;
- Key feature of q-digest: detailed information about data values that occur frequently is preserved in the digest, while less frequently occurring values are lumped into larger buckets, resulting in information loss.
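The bottom-up pass described above can be sketched as follows. This is an illustration under assumptions, not the presenter's code: node ids are heap-style (root 1, children 2v and 2v+1, leaf for value x at sigma + x - 1), sigma is a power of two, and a sibling pair "violates property (2)" when its two counts plus the parent's count are at most floor(n/K). The names `leaf_digest` and `compress` and the input values are hypothetical.

```python
from collections import Counter

def leaf_digest(values, sigma):
    """Exact histogram stored on the leaves: one bucket per distinct value in 1..sigma."""
    return {sigma + x - 1: c for x, c in Counter(values).items()}

def compress(digest, n, k, sigma):
    """One bottom-up pass that merges low-count sibling pairs into their parent."""
    threshold = n // k
    out = {v: c for v, c in digest.items() if c > 0}
    level_start = sigma                     # ids sigma .. 2*sigma-1 are the leaves
    while level_start > 1:                  # stop after the children of the root
        for left in range(level_start, 2 * level_start, 2):
            right, parent = left + 1, left // 2
            total = out.get(left, 0) + out.get(right, 0) + out.get(parent, 0)
            if (left in out or right in out) and total <= threshold:
                out[parent] = total         # parent absorbs both children
                out.pop(left, None)
                out.pop(right, None)
        level_start //= 2
    return out

# Assumed input: 15 readings in the range 1..8, with values 3 and 4 dominating.
values = [1] + [3] * 4 + [4] * 6 + [5, 6, 7, 8]
digest = compress(leaf_digest(values, 8), n=15, k=5, sigma=8)
```

In this example the frequent values 3 and 4 keep their own leaf buckets (ids 10 and 11), while the sparse values are folded into wider buckets, which is the key feature stated on the slide.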

9

Merging Q-Digest

- Parent node merges Q1(n1, K) and Q2(n2, K) from its children
- How about merging Q1(n1, K1) and Q2(n2, K2)?
  - Each node has a different communication ability
  - Each node has a different power level
  - A powerful node can have a bigger K, while a less powerful node can have a smaller K. Can we still get the same accuracy? Is that feasible?
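For the equal-K case, merging can be sketched as adding the counts of matching buckets and then running the same bottom-up compression pass described under Building a Q-Digest with n = n1 + n2. The code below is a sketch under assumptions: heap-style node ids (root 1, children 2v and 2v+1), sigma a power of two, a merge threshold of floor(n/K), and hypothetical function names and inputs. It does not re-verify the digest properties afterwards, and it does not address the open question of different K values.

```python
def compress(digest, n, k, sigma):
    """One bottom-up pass that merges low-count sibling pairs into their parent."""
    threshold = n // k
    out = {v: c for v, c in digest.items() if c > 0}
    level_start = sigma
    while level_start > 1:
        for left in range(level_start, 2 * level_start, 2):
            right, parent = left + 1, left // 2
            total = out.get(left, 0) + out.get(right, 0) + out.get(parent, 0)
            if (left in out or right in out) and total <= threshold:
                out[parent] = total
                out.pop(left, None)
                out.pop(right, None)
        level_start //= 2
    return out

def merge(q1, n1, q2, n2, k, sigma):
    """Union two digests built with the same K and sigma, then compress."""
    combined = dict(q1)
    for v, c in q2.items():
        combined[v] = combined.get(v, 0) + c   # add counts of matching buckets
    n = n1 + n2
    return compress(combined, n, k, sigma), n

# Assumed child digests, each summarizing 15 readings with K = 5, sigma = 8.
q1 = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}
q2 = {9: 5, 14: 10}
merged, n = merge(q1, 15, q2, 15, k=5, sigma=8)
```

Because n doubles, the threshold rises from 3 to 6, so buckets that were large enough in a child digest (for example the pair with ids 6 and 7) can be folded upward in the parent.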

10

Space Complexity and Error Bound (1/4)

What does 3K mean? 3K bits?
The root node does not satisfy property (2).??
3K means 3K <nodeID(v), count(v)> pairs

11

Space Complexity and Error Bound (2/4)

What about the leaf node, which does not satisfy property (1)?
It doesn't matter, because a leaf node is not the ancestor of any node.
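The remark about leaves fits the usual error argument for q-digests, which can be written out as below. This is a reconstruction under assumptions, since the slide's own derivation did not survive extraction: it takes property (1) to mean count(x) <= floor(n/K) for every non-leaf node x, and it uses the fact that a leaf in a tree over a value range of size sigma has log sigma ancestors.

```latex
% Readings that belong to bucket v but were merged upward can only sit in
% ancestors of v, and every ancestor is a non-leaf node bounded by property (1).
\[
  \mathrm{error}(v)
  \;\le\; \sum_{x \,\in\, \mathrm{ancestors}(v)} \mathrm{count}(x)
  \;\le\; \log\sigma \cdot \left\lfloor \frac{n}{K} \right\rfloor
\]
% Dividing by n gives the relative error.
\[
  \frac{\mathrm{error}(v)}{n} \;\le\; \frac{\log\sigma}{K} \;=\; \frac{3\log\sigma}{3K}
\]
```

A leaf may hold an arbitrarily large count without affecting this bound because it never appears in the sum over ancestors. If the message size m is taken to be 3K, the last expression has the same form as the 3 * log 8 / 3K term in the confidence factor example.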

12

Space Complexity and Error Bound (3/4)

13

Space Complexity and Error Bound (4/4)

14

Representation of a Q-Digest

Now to transmit the q-digest we send a set of tuples of the following form: <nodeID(v), count(v)>, which requires a total of $\log(2\sigma) + \log n$ bits for each tuple.
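As a worked instance of the per-tuple cost, take the parameters used in the confidence factor example later in the deck (n = 15, K = 5, sigma = 8). Rounding each logarithm up to whole bits and using 3K tuples as the digest size are assumptions of this sketch.

```latex
\[
  \underbrace{\log_2(2\sigma)}_{\text{nodeID}} + \underbrace{\lceil \log_2 n \rceil}_{\text{count}}
  \;=\; \log_2 16 + \lceil \log_2 15 \rceil
  \;=\; 4 + 4 \;=\; 8 \ \text{bits per tuple}
\]
\[
  3K \ \text{tuples} \times 8 \ \text{bits} \;=\; 15 \times 8 \;=\; 120 \ \text{bits} \;=\; 15 \ \text{bytes}
\]
```

The nodeID term uses log(2 sigma) because a complete binary tree over sigma leaves has fewer than 2 sigma nodes, and the count term uses log n because no bucket can hold more than n readings.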

15

Roadmap

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions and Discussions

16

Quantile Query (1/3)

- Quantile query:
  - Given a fraction 0 < q < 1, find the value whose rank in the sorted sequence of the n values is qn.
- Answer the query:
  - Sort the nodes in the q-digest in increasing v.max, breaking ties by putting smaller ranges first;
  - Scan the sorted list and add up the counts of the nodes;
  - At some node v the sum becomes more than qn, and v.max is reported as the estimate of the quantile;
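The answering procedure above can be sketched as follows. The heap-style node ids (root 1, children 2v and 2v+1, leaf for value x at sigma + x - 1), the helper names, and the sample digest are assumptions made for illustration.

```python
def node_range(v, sigma):
    """Return (v.min, v.max) for heap-style node id v over the value range 1..sigma."""
    depth = v.bit_length() - 1          # root has depth 0
    width = sigma >> depth              # bucket width halves at each level
    lo = (v - (1 << depth)) * width + 1
    return lo, lo + width - 1

def quantile_query(digest, sigma, q):
    """Estimate the q-quantile from a digest mapping node id -> count."""
    n = sum(digest.values())

    def sort_key(v):
        lo, hi = node_range(v, sigma)
        return (hi, hi - lo)            # increasing v.max, smaller ranges first

    running = 0
    for v in sorted(digest, key=sort_key):
        running += digest[v]
        if running > q * n:             # the sum becomes more than qn
            return node_range(v, sigma)[1]
    return sigma                        # fallback, only reached for an empty digest

# Assumed digest summarizing 15 readings in the range 1..8.
digest = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}
median_estimate = quantile_query(digest, sigma=8, q=0.5)
p90_estimate = quantile_query(digest, sigma=8, q=0.9)
```

In this example the buckets sort as [3,3], [4,4], [5,6], [7,8], [1,8], with running sums 4, 10, 12, 14, 15, so the scan passes qn = 7.5 at the bucket [4,4].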

17

Quantile Query (2/3)

- The confidence factor
  - Why need this?
    - $3\log\sigma / m$ is the worst-case error estimate, which only occurs for a very pathological input case
  - What is it?
    - Confidence factor is defined as: (maximum weight of any path from root to leaf in Q)/n

18

Confidence Factor Example

n = 15, k = 5, σ = 8

1 1 5 7 3 3 3 3

(maximum weight of any path from root to leaf in Q)/n = 7/15
< $3\log\sigma / m$ = 3 * log 8 / 3K = 3*3/(3*5) = 9/15
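The arithmetic in this example can be reproduced with a short sketch. The digest below is an assumption: the slide's figure did not survive extraction, and this particular digest was chosen because its root-to-leaf path weights match the row 1 1 5 7 3 3 3 3 on the slide. Node ids are assumed heap-style (root 1, children 2v and 2v+1), and the function name is hypothetical.

```python
import math
from fractions import Fraction

def path_weights(digest, sigma):
    """Sum of bucket counts on the path from the root to each leaf, left to right."""
    weights = []
    for leaf in range(sigma, 2 * sigma):
        v, total = leaf, 0
        while v >= 1:                   # walk from the leaf up to the root
            total += digest.get(v, 0)
            v //= 2
        weights.append(total)
    return weights

n, k, sigma = 15, 5, 8
digest = {1: 1, 6: 2, 7: 2, 10: 4, 11: 6}    # assumed digest, see lead-in

weights = path_weights(digest, sigma)
confidence = Fraction(max(weights), n)        # heaviest path over n
worst_case = Fraction(3 * int(math.log2(sigma)), 3 * k)   # 3 log(sigma) / m with m = 3K
```

Here the heaviest path carries 7 of the 15 readings, so the confidence factor 7/15 is tighter than the worst-case value 9/15.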

19

Roadmap

- Motivations, State of Art, Contributions
- The Q-Digest Scheme
- Queries on Q-Digest
- Experimental Evaluation
- Conclusions and Discussions

20

Performance Evaluation

- Settings
  - Routing tree
    - Breadth first search tree
  - Sensor field
    - 1000 x 1000 area with 1000 sensor nodes
    - 2000 x 2000 area with 4000 sensor nodes
  - Sensor value
    - Random
    - Correlated: United States Geological Survey
- Compare with List scheme:
  - List: report all (value, count) back to the base station; no in-network aggregation;

21

Error and Message Size

- A message size of 160 bytes yields 5% error
- A message size of 400 bytes yields 2% error

22

Total Data Transmission

- Q-digest transmits less data than List
- Random input needs more transmission than correlated data

23

Residual Power

- For every byte transmitted, one unit out of 40000 units of power is depleted. (How about reception?)
- In List, 0.02% of nodes have a residual power fraction less than ½. (???)

24

Conclusions

- Propose Q-Digest for approximated aggregation
- Provide strict theoretical guarantees on the approximation quality of the queries in terms of the message size
- Evaluate the performance of Q-Digest in simulation

25

Thank you!

