
Adaptive Cleaning for RFID Data Streams Shawn Jeffery Minos Garofalakis Michael Franklin UC Berkeley...

Date post: 19-Dec-2015

Adaptive Cleaning for RFID Data Streams

Shawn Jeffery (UC Berkeley), Minos Garofalakis (Intel Research Berkeley), Michael Franklin (UC Berkeley)

Presented by: Hamid Haidarian Shahri

Where Are We? Look at the Signs!

Looking at Signs – Before Jumping In

• S. Chaudhuri, U. Dayal, "An Overview of Data Warehousing and OLAP Technology," SIGMOD Record, 1997. 800+ citations

• DW and information integration
• “Data cleaning” term publicized
  - Identified its importance in integration

• Extensive research followed

VLDB 2001

• Session R12: DATA QUALITY & CLEANING

• Declarative data cleaning: language, model, and algorithms. Helena Galhardas (INRIA Rocquencourt), Daniela Florescu (Propel), Dennis Shasha (NYU), Eric Simon, and Cristian-Augustin Saita (INRIA Rocquencourt)

• Potter's wheel: an interactive data cleaning system. Vijayshankar Raman and Joseph M. Hellerstein (University of California at Berkeley)

• Update propagation strategies for improving the quality of data on the Web. Alexandros Labrinidis and Nick Roussopoulos (University of Maryland)

Data Cleaning Previous Work - 2006

• Hamid Haidarian Shahri, S.H. Shahri, “Eliminating Duplicates in Information Integration: An Adaptive, Extensible Framework,” IEEE Intelligent Systems, Vol. 21, No. 5, 2006.

Putting Things into Context

• Data cleaning required after integration
  - No unified standard across sources
  - NOW: sensor/hardware errors inevitable; research opportunity

• Data modeling (Amol Deshpande)
  - An important use case is cleaning

VLDB 2006 – Three weeks ago

• Research Session 5: Sensor Data (dedicated to cleaning!)

• Title: Adaptive Cleaning for RFID Data Streams
  Authors: Shawn R. Jeffery, Minos Garofalakis, Michael J. Franklin

• Title: A Deferred Cleansing Method for RFID Data Analytics
  Authors: Jun Rao, Sangeeta Doraiswamy, Hetal Thakkar, Latha S. Colby

• Title: Online Outlier Detection in Sensor Data Using Non-Parametric Models
  Authors: Sharmila Subramaniam, Themis Palpanas, Dimitris Papadopoulos, Vana Kalogeraki, Dimitrios Gunopulos

RFID: Radio Frequency IDentification

RFID data is dirty

[Figure: experimental setup with two shelves (Shelf 0 and Shelf 1), RFID readers, static tags, and mobile tags; labeled distances of 1.5 ft, 3 ft, 9 ft, and 15 ft.]

A simple experiment:

• 2 RFID-enabled shelves
• 10 static tags
• 5 mobile tags

RFID Data Cleaning

[Figure: raw readings and the smoothed output of a smoothing filter, plotted over time.]

• RFID data has many dropped readings
• Typically, use a smoothing filter to interpolate

    SELECT distinct tag_id
    FROM RFID_stream [RANGE '5 sec']
    GROUP BY tag_id

But, how to set the size of the window?
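To make the fixed-window smoothing filter concrete, here is a minimal Python sketch. It is not the CQL query from the slide: it assumes readings arrive as one set of tag IDs per epoch, and the function name `smooth_fixed_window` and the sample trace are hypothetical.

```python
from typing import List, Set


def smooth_fixed_window(readings: List[Set[str]], window: int) -> List[Set[str]]:
    """Report a tag as present in an epoch if it was read in any of the
    last `window` epochs (the current epoch included)."""
    smoothed = []
    for t in range(len(readings)):
        start = max(0, t - window + 1)
        present: Set[str] = set()
        for epoch in readings[start:t + 1]:
            present |= epoch
        smoothed.append(present)
    return smoothed


# Hypothetical trace: the tag is read twice with a dropped reading in
# between, then leaves the read range after epoch 2.
raw = [{"fido"}, set(), {"fido"}, set(), set(), set()]
small = smooth_fixed_window(raw, window=1)  # keeps the dropped reading
large = smooth_fixed_window(raw, window=3)  # fills the gap, but lags the exit
```

With a window of one epoch the dropped reading at epoch 1 stays missing; with a window of three epochs the gap is filled, but the tag is still reported at epochs 3 and 4 after it has left.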

Window Size for RFID Smoothing

[Figure: timelines for a tag ("Fido moving", then "Fido resting") comparing reality, raw readings, the output of a small window, and the output of a large window.]

Need to balance completeness vs. capturing tag movement

Truly Declarative Smoothing

• Problem: window size non-declarative
  - Application wants a clean stream of data
  - Window size is how to get it

• Solution: adapt the window size in response to data

Itinerary

• Introduction: RFID data cleaning
• A statistical sampling perspective
• SMURF
  - Per-tag cleaning
  - Multi-tag cleaning
• Ongoing work
• Conclusions

A Statistical Sampling Perspective

• Key Insight: RFID data is a random sample of the tags that are present

• Map RFID smoothing to a sampling experiment

RFID’s Gory Details

[Figure: an antenna and reader scanning Tags 1 to 4 over read cycles (epochs) E0 to E9; each read cycle produces a tag list.]

Tag list (for Alien readers):

Epoch  TagID  ReadRate
0      1      .9
0      2      .6
0      3      .3

RFID Smoothing to Sampling

RFID                  Sampling
Read cycle (epoch)    Sample trial
Reading               Single sample
Smoothing window      Repeated trials
Read rate             Probability of inclusion (p_i)

Now use sampling theory to drive adaptation!

SMURF

• Statistical Smoothing for Unreliable RFID Data

• Adapts window based on statistical properties

• Mechanisms for per-tag and multi-tag cleaning

[Figure: SMURF architecture. Raw RFID streams enter SMURF, which contains per-tag cleaning and multi-tag cleaning; applications receive cleaned per-tag readings and cleaned count readings.]

Per-Tag Smoothing: Model and Background

• Use a binomial sampling model

[Figure: p_i (read rate of tag i, from 0 to 1) over time (epochs E0 to E9); the smoothing window spans w_i Bernoulli trials and is annotated with p_i^avg and S_i.]

Per-Tag Smoothing: Completeness

• If the tag is there, read it with high probability
  - Want a large window

[Figure: p_i (from 0 to 1) over time (epochs E0 to E9); a reading with a low p_i leads to expanding the window.]

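Per-Tag Smoothing: Completeness

Desired window size for tag i:

$$ w_i \ge \left\lceil \frac{1}{p_i^{avg}} \cdot \ln\frac{1}{\delta} \right\rceil $$

Here 1 / p_i^avg is the expected number of epochs needed to read the tag, and the ln(1/δ) factor ensures the tag is read with probability 1 - δ.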
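As a worked illustration of the completeness bound, the following Python sketch computes the smallest window that satisfies it. It assumes the bound has the form w_i ≥ ⌈(1 / p_i^avg) · ln(1 / δ)⌉; the function names and the choice δ = 0.05 are hypothetical.

```python
import math


def completeness_window(p_avg: float, delta: float) -> int:
    """Smallest window (in epochs) satisfying w >= (1 / p_avg) * ln(1 / delta)."""
    if not 0.0 < p_avg <= 1.0:
        raise ValueError("p_avg must be in (0, 1]")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return max(1, math.ceil(math.log(1.0 / delta) / p_avg))


def miss_probability(p_avg: float, window: int) -> float:
    """Probability that a present tag is never read in `window` independent epochs."""
    return (1.0 - p_avg) ** window


# Read rates taken from the tag list example (.9, .6, .3); delta is an assumption.
for rate in (0.9, 0.6, 0.3):
    w = completeness_window(rate, delta=0.05)
    print(rate, w, miss_probability(rate, w))
```

A tag with a low read rate needs a much larger window than a tag with a high one, which is why a single fixed window cannot suit every tag.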

Per-Tag Smoothing: Transitions

• Detect transitions as statistically significant changes in the data

[Figure: p_i (from 0 to 1) over time (epochs E0 to E9); a statistically significant difference flags a transition and shrinks the window, since the tag has likely left by this point.]

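Per-Tag Smoothing: Transitions

Is the difference between the number of observed readings and the number of expected readings “statistically significant”?

$$ \bigl|\, |S_i| - w_i \cdot p_i^{avg} \,\bigr| > 2 \cdot \sqrt{w_i \cdot p_i^{avg} \cdot (1 - p_i^{avg})} $$

Here |S_i| is the number of observed readings and w_i · p_i^avg is the number of expected readings.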
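The transition test can be sketched in a few lines of Python. This assumes the binomial model from the earlier slide, takes |S_i| to be the number of epochs in the window in which the tag was read, and compares its distance from the expected count w_i · p_i^avg against two binomial standard deviations. The function name `is_transition` is hypothetical.

```python
import math


def is_transition(num_observed: int, window: int, p_avg: float) -> bool:
    """Flag a transition when the observed number of readings differs from
    the expected number by more than two binomial standard deviations."""
    expected = window * p_avg
    std_dev = math.sqrt(window * p_avg * (1.0 - p_avg))
    return abs(num_observed - expected) > 2.0 * std_dev


# Hypothetical window of 20 epochs and an average read rate of 0.8:
# 16 readings are expected, with a standard deviation of about 1.79.
print(is_transition(15, window=20, p_avg=0.8))  # close to the expected count
print(is_transition(10, window=20, p_avg=0.8))  # far below the expected count
```

Per the slides, a flagged transition is the signal to shrink the window, since the tag has likely left.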

SMURF in Action

[Figure: SMURF output for a tag over time ("Fido moving", then "Fido resting").]

Experiments with real and simulated data show similar results

Multi-tag Cleaning

• Some applications only need aggregates
  - E.g., count of items on each shelf
  - Don’t need to track each tag!

• Use statistical mechanisms for both:
  - Aggregate computation
  - Window adaptation

Aggregate Computation

• π-estimators (Horvitz-Thompson)
• Count: $\hat{N}_w = \sum_{i \in S_w} \frac{1}{\pi_i}$
• P[tag i seen in a window of size w]: $\pi_i = 1 - (1 - p_i^{avg})^w$

Use small windows to capture movement
Use the estimator to compensate for lost readings
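A minimal Python sketch of the count estimator follows. It assumes each tag seen in the window contributes 1/π_i, with π_i = 1 - (1 - p_i^avg)^w; the dictionary layout (tag ID to average read rate) and the function names are hypothetical.

```python
from typing import Dict


def inclusion_probability(p_avg: float, window: int) -> float:
    """Probability that a present tag is read at least once in `window` epochs."""
    return 1.0 - (1.0 - p_avg) ** window


def estimate_count(seen_read_rates: Dict[str, float], window: int) -> float:
    """Horvitz-Thompson style count: each tag seen in the window is weighted
    by the inverse of its inclusion probability."""
    return sum(
        1.0 / inclusion_probability(p_avg, window)
        for p_avg in seen_read_rates.values()
    )


# Hypothetical window: two tags were seen, each with an average read rate of 0.5.
seen = {"tag-a": 0.5, "tag-b": 0.5}
print(estimate_count(seen, window=1))  # each seen tag stands in for two tags
print(estimate_count(seen, window=2))  # larger window, smaller correction
```

The weighting is what lets a small window still produce a reasonable count: tags that were probably present but missed are accounted for by the tags that were seen.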

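Window Adaptation

• Upper bound window similar to per-tag: $w \le \left\lceil \frac{1}{p^{avg}} \cdot \ln\frac{1}{\delta} \right\rceil$
• “Transition” based on variance within subwindows: $|\hat{N}_w - \hat{N}_{w'}| > 2\left(\sqrt{\mathrm{Var}[\hat{N}_w]} + \sqrt{\mathrm{Var}[\hat{N}_{w'}]}\right)$

[Figure: count over time (epochs E0 to E9), comparing the estimate N_w over the window with the estimate N_w' over a subwindow.]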
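The subwindow comparison can be sketched as follows. The sketch rests on two assumptions that the slide text does not confirm: that the test compares the difference between the window and subwindow count estimates against twice the sum of their standard deviations, and that each variance is estimated with the standard Horvitz-Thompson form for independent reads, Σ (1 - π_i) / π_i². All function names are hypothetical.

```python
import math
from typing import Iterable


def ht_count_variance(inclusion_probs: Iterable[float]) -> float:
    """Variance estimate for a Horvitz-Thompson count, assuming tags are
    read independently: sum of (1 - pi_i) / pi_i**2 over the tags seen."""
    return sum((1.0 - pi) / (pi * pi) for pi in inclusion_probs)


def count_transition(n_w: float, var_w: float, n_sub: float, var_sub: float) -> bool:
    """Flag a transition when the window and subwindow count estimates differ
    by more than twice the sum of their standard deviations."""
    threshold = 2.0 * (math.sqrt(var_w) + math.sqrt(var_sub))
    return abs(n_w - n_sub) > threshold


# Hypothetical estimates: the full window suggests 20 tags, the subwindow 10.
print(count_transition(n_w=20.0, var_w=4.0, n_sub=10.0, var_sub=1.0))
```

When the two estimates disagree by more than their combined uncertainty, the count has likely changed, which is the cue to shrink the window.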

Multi-tag Scenario

Ongoing Work: Spatial Smoothing

• With multiple readers, more complicated
  - Reinforcement: A? B? A ∪ B? A ∩ B?
  - Arbitration: A? C?

All are addressed by statistical framework!

[Figure: two rooms, two readers per room (readers A, B, C, and D).]

Beyond RFID

• π-estimator for other aggregates
  - Use SMURF for sensor networks
• Use SMURF in general streaming systems (e.g., TelegraphCQ)
  - Remove RANGE clause from CQL

Other sensor data

Other streaming data

Related Work

• Commercial RFID middleware
  - Smoothing filters: need to set smoothing window

• RFID-related work
  - Rao et al., StreamClean: complementary
  - Intel Seattle, HiFi, ESP: static window size

• BBQ, MauveDB
  - Heavyweight, model-based
  - SMURF is non-parametric, sampling-based

• Statistical filters (digital signal processing & DB)
  - Non-linear digital filters inspired SMURF design

Conclusions

• Current smoothing filters not adequate

• Not declarative!

• SMURF: Declarative smoothing filter

• Uses statistical sampling to adapt window size

Thanks!

Questions?

