Distributed Spatio-Temporal Similarity Search Demetrios Zeinalipour-Yazti [email protected] University of Cyprus Song Lin [email protected] University of California - Riverside Dimitrios Gunopulos [email protected] University of California - Riverside ICDE 2006 Song Lin University of California, Riverside http://www.cs.ucr.edu/~slin
Page 1: Distributed Spatio-Temporal Similarity Search Demetrios Zeinalipour-Yazti dzeina@cs.ucy.ac.cy dzeina@cs.ucy.ac.cy University of Cyprus Song Lin slin@cs.ucr.edu.

Distributed Spatio-Temporal Similarity Search

Demetrios Zeinalipour-Yazti dzeina@cs.ucy.ac.cy University of Cyprus

Song Lin slin@cs.ucr.edu University of California - Riverside

Dimitrios Gunopulos University of California - Riverside

ICDE 2006

Song Lin University of California, Riverside

http://www.cs.ucr.edu/~slin


Trajectories are everywhere


Trajectory Similarity Search

• Habitat monitoring
  – Animal migration patterns
• Sign language detection
  – Movement of fingers
• Store surveillance video
  – Customer movement patterns
• Camera sensor network
  – Each sensor can monitor the movement of objects within a small area

Distributed Similarity Search

• The setting
  – Monitoring area G with m objects moving inside
  – G is segmented into n non-overlapping cells, each having a camera sensor
  – Each record of the trajectory is stored locally at the closest sensor
• Problem
  – Given a query trajectory Q, retrieve the top K trajectories which are most similar to Q.
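As a rough illustration of this setting, the sketch below assigns each trajectory record to a grid cell. The uniform square grid, the (t, x, y) record layout, and the names `split_by_cell`, `cell_size` and `cols` are assumptions made for this example; the slides only state that each record is stored at the closest sensor.

```python
from collections import defaultdict


def split_by_cell(trajectory, cell_size, cols):
    """Group (t, x, y) records by the grid cell that contains them.

    Assumes square cells of side `cell_size`, numbered row by row
    with `cols` cells per row (a hypothetical layout).
    """
    per_cell = defaultdict(list)
    for t, x, y in trajectory:
        cell = int(y // cell_size) * cols + int(x // cell_size)
        per_cell[cell].append((t, x, y))
    return dict(per_cell)


# One trajectory crossing a 2x2 grid of 10x10 cells.
a1 = [(0, 1.0, 1.0), (1, 6.0, 2.0), (2, 12.0, 3.0), (3, 14.0, 12.0)]
parts = split_by_cell(a1, cell_size=10.0, cols=2)
for cell_id, records in sorted(parts.items()):
    print(cell_id, records)
```

Each cell ends up holding only a fragment of the trajectory, which is why a similarity score cannot simply be read off from a single sensor.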


An example

• Distributed top-K problem
  – The trajectories of objects are distributed at different cells
  – It is expensive to collect all the trajectories centrally.

Figure: a) Map View and b) Cell View of monitoring area G, divided into cells C1-C4, showing trajectories A1-A6 and the query trajectory Q.

Finding K most similar trajectories

• We have to define what is similar
  – We use well-known similarity measures for trajectories
    • Euclidean
    • Dynamic Time Warping (DTW)
      Berndt D., Clifford J., “Using Dynamic Time Warping to Find Patterns in Time Series”, In KDD’94, Menlo Park, CA, pp. 229-248, 1994.
    • Longest Common SubSequence (LCSS)
      Das G., Gunopulos D., Mannila H., “Finding Similar Time Series”, In PKDD’97, Trondheim, Norway, pp. 88-100, LNCS 1263, 1997.
• We have to find the most similar trajectories
  – We focus on LCSS, but the techniques work for DTW as well.

Similarity Measures

Figure (courtesy of Dr. Eamonn Keogh): A) Euclidean Matching, B) Dynamic Time Warping Matching, C) Longest Common SubSequence Matching

Longest Common Sub_Sequence (LCSS)

Figure (LCSS, courtesy of Dr. Eamonn Keogh): Out-of-phase Match, positions 1 to n.

• Used in string matching problems
• Captures out-of-phase matches and tolerates outliers (outliers are ignored in the matching)

Longest Common Sub_Sequence (LCSS)

• LCSS can be computed in O(δ(l1+l2)) by a dynamic programming algorithm.
• In general, it is expensive to compute this similarity exactly, so we can also compute bounds on it.

$$
LCSS_{\delta,\varepsilon}(A,B) =
\begin{cases}
0, & \text{if } A \text{ or } B \text{ is empty} \\
1 + LCSS_{\delta,\varepsilon}(Tail(A), Tail(B)), & \text{if } |a_{i_1} - b_{i_2}| < \varepsilon \text{ and } |i_1 - i_2| \le \delta \\
\max\big(LCSS_{\delta,\varepsilon}(Tail(A), B),\ LCSS_{\delta,\varepsilon}(A, Tail(B))\big), & \text{otherwise}
\end{cases}
$$
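The recurrence can be evaluated bottom-up with a table, as in the minimal sketch below. It assumes 1-D sequences whose index doubles as the timestamp, a strict `< eps` value test and a `<= delta` time test; the function name `lcss` is illustrative. For simplicity it fills the whole table, so it runs in O(l1·l2) rather than the banded O(δ(l1+l2)) quoted on the slide.

```python
def lcss(a, b, delta, eps):
    """LCSS_{delta,eps} of two 1-D sequences, following the Tail-style recurrence.

    dp[i][j] holds the LCSS of the suffixes a[i:] and b[j:].
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if abs(a[i] - b[j]) < eps and abs(i - j) <= delta:
                # Heads match in value and are close enough in time.
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                # Drop the head of one sequence or the other.
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    return dp[0][0]


q = [1, 2, 3, 4]
a = [1, 2, 9, 3, 4]  # same shape as q, with one outlier (9) inserted
print(lcss(q, a, delta=1, eps=0.5))  # outlier skipped, shifted points still match
print(lcss(q, a, delta=0, eps=0.5))  # no time shift allowed
```

With `delta=1` the outlier is skipped and all four points of `q` are matched; with `delta=0` only the first two points line up.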

Centralized LCSS UpperBound

$$
EnvLow \le MBE(Q) \le EnvHigh, \quad \text{where} \quad
\begin{aligned}
EnvHigh[i] &= \max\{Q[j] + \varepsilon\}, \quad |i - j| \le \delta \\
EnvLow[i] &= \min\{Q[j] - \varepsilon\}, \quad |i - j| \le \delta
\end{aligned}
$$

$$
LCSS_{\delta,\varepsilon}(MBE(Q), A) = \sum_{i=1}^{|A|}
\begin{cases}
1, & \text{if } A[i] \text{ within envelope} \\
0, & \text{otherwise}
\end{cases}
$$

$$
\text{Theorem: } LCSS_{\delta,\varepsilon}(Q, A) \le LCSS_{\delta,\varepsilon}(MBE(Q), A)
$$
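A small sketch of the envelope test may help here. It assumes 1-D sequences indexed by timestamp and treats the envelope as a closed interval; `mbe` and `lcss_upper_bound` are hypothetical names, not taken from the slides.

```python
def mbe(q, delta, eps, length):
    """Return (env_low, env_high) for positions 0..length-1.

    An entry is None where no Q[j] lies within delta of position i.
    """
    low, high = [], []
    for i in range(length):
        window = q[max(0, i - delta): i + delta + 1]
        low.append(min(window) - eps if window else None)
        high.append(max(window) + eps if window else None)
    return low, high


def lcss_upper_bound(q, a, delta, eps):
    """Count the points of A that fall inside the envelope of Q."""
    low, high = mbe(q, delta, eps, len(a))
    return sum(
        1
        for i, value in enumerate(a)
        if low[i] is not None and low[i] <= value <= high[i]
    )


q = [1, 2, 3, 4]
print(lcss_upper_bound(q, [1, 2, 9, 3, 4], delta=1, eps=0.5))
print(lcss_upper_bound(q, [2, 1, 4, 3], delta=1, eps=0.5))
```

The count needs only one pass over A and no alignment between points, which is what makes it cheap compared with the exact LCSS. It can overestimate, because each point is tested independently of the order in which matches would have to occur.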

Problem with distributed computation of LCSS

• In a distributed setting, computing LCSS is difficult, because
  – Sequential matching problem
  – Matching may occur across cells

Figure: Cell 1, Cell 2, Cell 3, Cell 4.

Our Solution

• We compute a lower bound and an upper bound of the LCSS similarity distributively.

• We develop new distributed top-K algorithms (UB-K, UBLB-K) that use these bounds to find the most similar trajectories.

Distributed LCSS UpperBound

• Each cell uses LCSSδ,ε(MBE(Q), Aij) to calculate the similarity of each local sub_trajectory Aij to MBE(Q)
• Upper bound DUB_LCSS(Q,Ai) is computed by adding the n local results

Theorem 1

$$
\sum_{j=1}^{n} LCSS_{\delta,\varepsilon}(MBE(Q), A_{ij}) \ge LCSS_{\delta,\varepsilon}(Q, A_i)
$$
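The two steps above can be sketched as follows, assuming each cell stores its sub-trajectory as (timestamp, value) pairs with non-negative integer timestamps that index into Q. The helper names `envelope`, `local_upper_bound` and `dub_lcss` are made up for this illustration.

```python
def envelope(q, delta, eps, ts):
    """Envelope (low, high) of Q at timestamp ts, or None if Q has no point nearby."""
    window = q[max(0, ts - delta): ts + delta + 1]
    if not window:
        return None
    return min(window) - eps, max(window) + eps


def local_upper_bound(q, sub_trajectory, delta, eps):
    """Work done inside one cell: count local points that lie inside MBE(Q)."""
    count = 0
    for ts, value in sub_trajectory:
        env = envelope(q, delta, eps, ts)
        if env is not None and env[0] <= value <= env[1]:
            count += 1
    return count


def dub_lcss(q, cells, delta, eps):
    """Add the per-cell results to obtain the distributed upper bound."""
    return sum(local_upper_bound(q, sub, delta, eps) for sub in cells)


q = [1, 2, 3, 4]
# Trajectory A_i = [2, 1, 4, 3], split across two cells by timestamp.
cells = [[(0, 2), (1, 1)], [(2, 4), (3, 3)]]
print(dub_lcss(q, cells, delta=1, eps=0.5))
```

Because the envelope test looks at one point at a time, the per-cell counts add up to the same value a centralized envelope test would give, regardless of how the trajectory is split.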

Distributed LCSS LowerBound

• For each trajectory Ai, cell cj finds the time region Tij = {ts(p)|p in Aij} when Ai stays in cell cj. Filter Q into Q′ij such that Q′ij is in the same time intervals as Aij, Q′ij = {p|p in Q and ts(p) in Tij}.
• Each cell performs a local computation of LCSSδ,ε(Q′ij, Aij)
• The lower bound DLB_LCSS(Q,Ai) is computed by adding the n local results

Theorem 2

$$
\sum_{j=1}^{n} LCSS_{\delta,\varepsilon}(Q'_{ij}, A_{ij}) \le LCSS_{\delta,\varepsilon}(Q, A_i)
$$
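A possible reading of these steps in code, assuming both Q and the sub-trajectories are lists of (timestamp, value) points and that the time test compares timestamps directly. `lcss_ts` and `dlb_lcss` are hypothetical names for this sketch.

```python
def lcss_ts(q, a, delta, eps):
    """LCSS over (timestamp, value) points, using the Tail-style recurrence."""
    n, m = len(q), len(a)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            (tq, vq), (ta, va) = q[i], a[j]
            if abs(vq - va) < eps and abs(tq - ta) <= delta:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    return dp[0][0]


def dlb_lcss(q, cells, delta, eps):
    """Sum of local LCSS values, each computed against Q restricted to T_ij."""
    total = 0
    for sub in cells:
        t_ij = {ts for ts, _ in sub}              # time region of A_i in this cell
        q_ij = [p for p in q if p[0] in t_ij]     # Q filtered to that region
        total += lcss_ts(q_ij, sub, delta, eps)
    return total


q = [(0, 1), (1, 2), (2, 3), (3, 4)]
cells = [[(0, 2), (1, 1)], [(2, 4), (3, 3)]]
whole = [p for sub in cells for p in sub]
print(dlb_lcss(q, cells, delta=1, eps=0.5), lcss_ts(q, whole, delta=1, eps=0.5))
```

In this toy example the local matches happen to recover the full similarity. In general, matches that cross a cell boundary are missed, so the sum can fall below the exact value.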

Distributed top-K algorithms

• Threshold Algorithm (TA)
  Fagin R., Lotem A. and Naor M., “Optimal Aggregation Algorithms For Middleware”, In PODS’01, Santa Barbara, CA, pp. 102-113, 2001.
• Three-Phase Uniform Threshold (TPUT)
  P. Cao and Z. Wang. Efficient Top-K Query Calculation in Distributed Networks. In PODC, Newfoundland, Canada, 2004.
• Threshold Join Algorithm (TJA)
  D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava. The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks. In DMSN, Trondheim, Norway, 2005.

Problem with existing approaches

• They assume that the exact partial scores are available
• The exact scores at each cell cannot be computed efficiently (recall that a match may span neighboring cells)
• We use upper (lower) bounds to perform distributed top-k computation (based on Theorem 1 and Theorem 2)

Distributed top-K computation with bounds

• Now we have the Lower and Upper Bounds rather than Exact scores.
• e.g. instead of sim(A0,Q)=20 it gives us [A0,15,25]
• We propose UB-K and UBLB-K algorithms to compute the top-K results.

METADATA: lists of (id,lb,ub) tuples for m trajectories, one list per cell (v1, v2, v3 shown, n cells in total):

A2,3,6    A0,4,8    A4,5,10   A7,7,9    A3,8,11   A9,8,9    ....
A4,4,5    A2,5,6    A0,5,7    A3,5,6    A9,8,10   A7,12,13  ....
A4,1,3    A0,6,10   A2,5,7    A9,6,7    A3,7,10   A7,11,13  ....

Summing the three lists per id gives:

A4,10,18  A2,13,19  A0,15,25  A3,20,27  A9,22,26  A7,30,35  ....
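The relation between the per-cell lists and the combined bounds can be checked with a few lines of code. The sketch below simply sums the bounds centrally, using the numbers from the slide; it is not the TJA protocol, and `aggregate_bounds` is a name chosen for this example.

```python
# Per-cell METADATA lists from the slide, as id -> (lb, ub).
cell_lists = [
    {"A2": (3, 6), "A0": (4, 8), "A4": (5, 10), "A7": (7, 9), "A3": (8, 11), "A9": (8, 9)},
    {"A4": (4, 5), "A2": (5, 6), "A0": (5, 7), "A3": (5, 6), "A9": (8, 10), "A7": (12, 13)},
    {"A4": (1, 3), "A0": (6, 10), "A2": (5, 7), "A9": (6, 7), "A3": (7, 10), "A7": (11, 13)},
]


def aggregate_bounds(lists):
    """Add lower and upper bounds per trajectory id across all cells."""
    totals = {}
    for cell in lists:
        for tid, (lb, ub) in cell.items():
            cur_lb, cur_ub = totals.get(tid, (0, 0))
            totals[tid] = (cur_lb + lb, cur_ub + ub)
    return totals


totals = aggregate_bounds(cell_lists)
for tid in sorted(totals, key=lambda t: totals[t][0]):
    print(tid, totals[tid])
```

For A0 this gives (4+5+6, 8+7+10) = (15, 25), the [A0,15,25] interval quoted in the bullet above.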


UB-K Algorithm

Query: Find the K=2 highest ranked answers

METADATA (id, ub): A4,30  A2,27  A0,25  A3,20  A9,18  A7,12  ....
DATA (id, exact score): A4,23  A2,22  A0,16  A3,18  A9,15  A7,10  ....

Diagram labels: TJA rounds over the METADATA (λ+1, then 2λ+1 entries), λ DATA objects, and a "≥?" test between the exact scores and the upper bounds.

Why not stop at 25? Because we might have another object X [UB:24, Real:23]
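The loop suggested by this example can be sketched as below. This is one reading of the diagram, not the authors' code: each round takes the αλ+1 highest upper bounds, fetches exact scores for the first αλ of them, and stops once the Kth best exact score is at least the remaining upper bound. A sorted local list stands in for a TJA round, and `ub_k`, `upper_bounds` and `exact_score` are hypothetical names.

```python
def ub_k(upper_bounds, exact_score, k, lam):
    """Return (top-k list of (id, exact), number of DATA objects fetched)."""
    if lam < 1:
        raise ValueError("lam must be at least 1")
    ranked = sorted(upper_bounds.items(), key=lambda kv: kv[1], reverse=True)
    exact = {}
    alpha = 1
    while True:
        top = ranked[: alpha * lam + 1]          # stand-in for a TJA round
        candidates, threshold = top[: alpha * lam], top[alpha * lam:]
        for tid, _ in candidates:
            if tid not in exact:
                exact[tid] = exact_score[tid]    # stand-in for fetching DATA
        best = sorted(exact.items(), key=lambda kv: kv[1], reverse=True)[:k]
        if not threshold:                        # every trajectory was fetched
            return best, len(exact)
        if len(best) == k and best[-1][1] >= threshold[0][1]:
            return best, len(exact)
        alpha += 1


ubs = {"A4": 30, "A2": 27, "A0": 25, "A3": 20, "A9": 18, "A7": 12}
real = {"A4": 23, "A2": 22, "A0": 16, "A3": 18, "A9": 15, "A7": 10}
print(ub_k(ubs, real, k=2, lam=2))
```

With the slide's numbers the first round cannot stop, because the second-best exact score (22) is below the next upper bound (25). The second round stops, since 22 is at least 18, after fetching four DATA objects.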


UBLB-K Algorithm

METADATA (id, lb, ub): A4,22,30  A2,21,27  A0,15,25  A3,13,20  A9,14,18  A7,10,12  ....
DATA (id, exact score): A4,23  A2,22  A0,16  A3,18  A9,15  A7,10  ....

Diagram labels: TJA rounds over the METADATA (λ+1, then 2λ+1 entries), a "≥?" test on LB,UB, and the exact scores.

Note: Kth highest LB is: 21. Therefore A3 (UB:20) and below are not necessary
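The pruning rule in the note can be sketched as follows. This is an interpretation of the slide rather than the published algorithm: METADATA is read in rounds of αλ+1 entries until the Kth highest lower bound is at least the last upper bound seen, then only trajectories whose upper bound reaches that lower bound are fetched in one bulk step. `ublb_k`, `bounds` and `exact_score` are hypothetical names, and a sorted local list stands in for TJA.

```python
def ublb_k(bounds, exact_score, k, lam):
    """bounds maps id -> (lb, ub). Returns (top-k list, ids whose DATA was fetched)."""
    if lam < 1:
        raise ValueError("lam must be at least 1")
    ranked = sorted(bounds.items(), key=lambda kv: kv[1][1], reverse=True)
    if not ranked:
        return [], []
    alpha = 1
    while True:
        top = ranked[: alpha * lam + 1]               # stand-in for a TJA round
        lbs = sorted((lb for _, (lb, _ub) in top), reverse=True)
        kth_lb = lbs[k - 1] if len(lbs) >= k else float("-inf")
        last_ub = top[-1][1][1]
        if kth_lb >= last_ub or len(top) == len(ranked):
            break
        alpha += 1
    # Final bulk transfer: only ids that could still reach the top K.
    candidates = [tid for tid, (_lb, ub) in top if ub >= kth_lb]
    exact = {tid: exact_score[tid] for tid in candidates}
    best = sorted(exact.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return best, candidates


bounds = {"A4": (22, 30), "A2": (21, 27), "A0": (15, 25),
          "A3": (13, 20), "A9": (14, 18), "A7": (10, 12)}
real = {"A4": 23, "A2": 22, "A0": 16, "A3": 18, "A9": 15, "A7": 10}
print(ublb_k(bounds, real, k=2, lam=2))
```

With K=2 the second highest lower bound is 21, so A3 (UB 20) and everything below it is skipped, and only A4, A2 and A0 are transferred.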


UB-K vs. UBLB-K

• Both fetch METADATA objects incrementally (αλ+1).
• UB-K uses upper bounds, while UBLB-K uses both upper bounds and lower bounds.
• UB-K always fetches αλ+1 (α: step increment) DATA objects, while UBLB-K may fetch fewer DATA objects.
• UB-K fetches DATA incrementally, while UBLB-K uses a final bulk DATA transfer.

Experimental Evaluation

• Comparison system
  – Centralized
  – UB-K
  – UBLB-K
• Dataset
  – 25,000 trajectories generated over the Oldenburg street map, using the Network Based Generator of Moving Objects*.

* Brinkhoff T., “A Framework for Generating Network-Based Moving Objects”. In GeoInformatica, 6(2), 2002.

Performance Evaluation


Scalability Evaluation


Varying K and λ


Summary

• We described and analyzed well-known similarity measures for trajectories

• DUB_LCSS and DLB_LCSS for bounding the similarity of two trajectories distributively

• UB-K and UBLB-K to find the K most similar trajectories

• Easily extended to DTW and other similarity measures
