
FTW: Fast Similarity Search under the Time Warping Distance

Page 1: FTW: Fast Similarity Search under the Time Warping Distance

FTW: Fast Similarity Search under the Time Warping Distance

Yasushi Sakurai (NTT Cyber Space Labs)
Masatoshi Yoshikawa (Nagoya Univ.)
Christos Faloutsos (Carnegie Mellon Univ.)

Page 2: FTW: Fast Similarity Search under the Time Warping Distance

PODS 2005 Y. Sakurai et al 2

Motivation

- Time-series data
  - Many applications
    - Computational biology, astrophysics, geology, meteorology, multimedia, economics
- Similarity search
  - Euclidean distance
  - DTW (Dynamic Time Warping)
    - Useful for different sequence lengths
    - Different sampling rates
    - Scaling along the time axis

Page 3: FTW: Fast Similarity Search under the Time Warping Distance

Mini-introduction to DTW

- DTW allows sequences to be stretched along the time axis
  - Minimize the distance between the sequences
  - Insert 'stutters' into a sequence
  - THEN compute the (Euclidean) distance

[Figure: an original sequence and the same sequence with 'stutters' inserted]

Page 4: FTW: Fast Similarity Search under the Time Warping Distance

Mini-introduction to DTW

- DTW is computed by dynamic programming
  - Warping path: set of grid cells in the time warping matrix

[Figure: time warping matrix for data sequence P of length N (p1 … pi … pN) and query sequence Q of length M (q1 … qj … qM), showing p-stutters, q-stutters, and the optimum warping path (the best alignment)]

Page 5: FTW: Fast Similarity Search under the Time Warping Distance

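Mini-introduction to DTW

- DTW is computed by dynamic programming
  - f(i, j) is computed over the prefixes p1, p2, …, pi and q1, q2, …, qj

$$D_{dtw}(P, Q) = f(N, M)$$

$$f(i, j) = \|p_i - q_j\| + \min \begin{cases} f(i, j-1) \\ f(i-1, j) \\ f(i-1, j-1) \end{cases}$$

The three candidates under the min correspond to the two stutter moves (p-stutter and q-stutter) and the no-stutter diagonal move f(i-1, j-1).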
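The dynamic-programming recurrence on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `dtw_distance`, the absolute difference as the pointwise distance, and the boundary values (zero for two empty prefixes, infinity otherwise) are assumptions made here.

```python
import math

def dtw_distance(p, q):
    """Exact DTW distance between sequences p and q via dynamic programming."""
    n, m = len(p), len(q)
    # f[i][j] is the cost of the best warping path aligning p[:i] with q[:j].
    f = [[math.inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0  # assumed boundary: two empty prefixes align at zero cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(p[i - 1] - q[j - 1])  # assumed pointwise distance
            # Two stutter moves plus the diagonal (no stutter) move.
            f[i][j] = cost + min(f[i][j - 1], f[i - 1][j], f[i - 1][j - 1])
    return f[n][m]

print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: one stutter aligns them exactly
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0
```

The table has (N+1) x (M+1) cells, which is where the O(NM) cost mentioned later in the deck comes from.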

Page 6: FTW: Fast Similarity Search under the Time Warping Distance

Mini-introduction to DTW

- Global constraints limit the warping scope
  - Warping scope: area that the warping path is allowed to visit

[Figure: two time warping matrices (P: p1 … pi … pN vs. Q: q1 … qj … qM), one showing the Sakoe-Chiba Band and one showing the Itakura Parallelogram]

Page 7: FTW: Fast Similarity Search under the Time Warping Distance

Mini-introduction to DTW

- Width of the warping scope W is user-defined

[Figure: Sakoe-Chiba Band on the time warping matrix (P: p1 … pi … pN vs. Q: q1 … qj … qM), drawn for two widths W1 and W2]
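A band-limited version of the DTW computation might look like the sketch below. It assumes the Sakoe-Chiba Band of width `w` admits exactly the cells with |i - j| <= w; the slides do not spell out the definition, so treat that, the name `dtw_band`, and the absolute-difference pointwise distance as assumptions.

```python
import math

def dtw_band(p, q, w):
    """DTW restricted to cells with |i - j| <= w (assumed band definition)."""
    n, m = len(p), len(q)
    f = [[math.inf] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 0.0
    for i in range(1, n + 1):
        # Visit only the columns of row i that lie inside the band.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(p[i - 1] - q[j - 1])
            f[i][j] = cost + min(f[i][j - 1], f[i - 1][j], f[i - 1][j - 1])
    # Stays infinite when the band is too narrow to reach cell (n, m).
    return f[n][m]

print(dtw_band([1, 2, 3], [2, 3, 4], 0))  # 3.0: no warping allowed
print(dtw_band([1, 2, 3], [2, 3, 4], 1))  # 2.0: same as unconstrained DTW here
```

Under this band definition a path exists only when w is at least the difference in sequence lengths; a narrower band yields infinity.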

Page 8: FTW: Fast Similarity Search under the Time Warping Distance

Motivation

- Similarity search for time-series data
  - DTW (Dynamic Time Warping)
    - Scaling along the time axis

But…
- High search cost: O(NM)
- Prohibitive for long sequences

Page 9: FTW: Fast Similarity Search under the Time Warping Distance

Our Solution, FTW

- Requirements:
  1. Fast
  2. No false dismissals
  3. No restriction on the sequence length
     - It should handle data sequences of different lengths
  4. Support for any, as well as for no, restriction on “warping scope”

Page 10: FTW: Fast Similarity Search under the Time Warping Distance

Problem Definition

- Given
  - S time-series data sequences of unequal lengths {P1, P2, …, PS},
  - a query sequence Q,
  - an integer k,
  - (optionally) a warping scope W,
- Find the k-nearest neighbors of Q from the data sequence set by using DTW with W

Page 11: FTW: Fast Similarity Search under the Time Warping Distance

Overview

- Introduction
- Related work
- Main ideas
- Experimental results
- Conclusions

Page 12: FTW: Fast Similarity Search under the Time Warping Distance

Related Work

- Sequence indexing
  - Agrawal et al. (FODO 1998)
  - Keogh et al. (SIGMOD 2001)
  - …
- Subsequence matching
  - Faloutsos et al. (SIGMOD 1994)
  - Moon et al. (SIGMOD 2002)
  - …

Page 13: FTW: Fast Similarity Search under the Time Warping Distance

Related Work

- Fast sequence matching for DTW
  - Yi et al. (ICDE 1998)
  - Kim et al. (ICDE 2001)
  - Chu et al. (SDM 2002)
  - Keogh (VLDB 2002)
  - Zhu et al. (SIGMOD 2003)
  - …
- None of the existing methods for DTW fulfills all the requirements

Page 14: FTW: Fast Similarity Search under the Time Warping Distance

Overview

- Introduction
- Related work
- Main ideas
- Experimental results
- Conclusions

Page 15: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (1) - LBS

- LBS (Lower Bounding distance measure with Segmentation)
- $P^A$: approximate sequences
  - $p_i^R = (p_i^L : p_i^U)$: segment range
  - $p_i^U$: upper value
  - $p_i^L$: lower value
  - t: length of time intervals

[Figure: a sequence approximated as $P^A$ by four segment ranges $p_1^R$, $p_2^R$, $p_3^R$, $p_4^R$, each covering a time interval of length t]
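One way to build such an approximation is to cut the sequence into consecutive windows of length t and keep the lowest and highest value of each. The sketch below does this; the name `segment_ranges`, the (lower, upper) tuple layout, and letting the last window be shorter than t are assumptions made for illustration.

```python
def segment_ranges(seq, t):
    """Approximate seq by one (lower, upper) value range per window of length t.

    The final window may hold fewer than t values when len(seq) is not a
    multiple of t (an assumption; the slides do not cover this case).
    """
    return [(min(seq[s:s + t]), max(seq[s:s + t]))
            for s in range(0, len(seq), t)]

print(segment_ranges([1, 3, 2, 5, 4, 4, 0], 2))
# [(1, 3), (2, 5), (4, 4), (0, 0)]
```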

Page 16: FTW: Fast Similarity Search under the Time Warping Distance

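Main Idea (1) - LBS

- Compute lower bounding distance
  - Distance of the two ranges $p_i^R$ and $q_j^R$: distance of their two closest points

[Figure: two value-vs-time plots of the ranges $p_i^R$ and $q_j^R$; in one the ranges are apart and the gap between them is the lower bound, in the other they overlap and the lower bound is 0]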

Page 17: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (1) - LBS

- Compute lower bounding distance
  - Distance of the two ranges $p_i^R$ and $q_j^R$: distance of their two closest points

$$D_{seg}(p_i^R, q_j^R) = \begin{cases} \|p_i^L - q_j^U\| & (p_i^L > q_j^U) \\ \|q_j^L - p_i^U\| & (q_j^L > p_i^U) \\ 0 & (\text{otherwise}) \end{cases}$$
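The range distance translates directly into code. In this sketch a range is a (lower, upper) tuple and the distance between two points is their absolute difference; both are illustrative assumptions, as is the name `dseg`.

```python
def dseg(p_range, q_range):
    """Distance between the two closest points of two value ranges."""
    p_lo, p_hi = p_range
    q_lo, q_hi = q_range
    if p_lo > q_hi:        # p's range lies entirely above q's range
        return p_lo - q_hi
    if q_lo > p_hi:        # q's range lies entirely above p's range
        return q_lo - p_hi
    return 0.0             # the ranges overlap

print(dseg((5, 7), (1, 3)))  # 2
print(dseg((1, 4), (3, 6)))  # 0.0
```

Because every value of a segment lies inside its range, this quantity never exceeds the distance between any actual pair of points drawn from the two segments.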

Page 18: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (1) - LBS

- Exact DTW distance

[Figure: exact DTW computed on the original sequences P and Q]

Page 19: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (1) - LBS

- Compute lower bounding distance from $P^A$ and $Q^A$
- Use a dynamic programming approach

$$D_{lbs}(P^A, Q^A) \le D_{dtw}(P, Q)$$

[Figure: lower bounding distance computed on the approximate sequences $P^A$ and $Q^A$]

Page 20: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (1) - LBS

- Compute lower bounding distance from $P^A$ and $Q^A$
- Use a dynamic programming approach

$$D_{lbs}(P^A, Q^A) \le D_{dtw}(P, Q)$$

[Figure: the approximate sequences $P^A$ and $Q^A$ shown together with the original sequences P and Q]
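The slides do not give the recurrence behind D_lbs, so the sketch below uses a deliberately simplified stand-in rather than the authors' definition: it runs the DTW recurrence over segment ranges and charges each visited pair of segments one range distance. Any warping path over the original points passes through at least one point pair in every segment pair it visits, so this value cannot exceed the exact DTW distance, although it is likely looser than the bound used in FTW. All names and the absolute-difference distance are assumptions.

```python
import math

def segment_ranges(seq, t):
    """(lower, upper) value range of each length-t window of seq."""
    return [(min(seq[s:s + t]), max(seq[s:s + t])) for s in range(0, len(seq), t)]

def dseg(a, b):
    """Gap between two value ranges; 0 if they overlap."""
    if a[0] > b[1]:
        return a[0] - b[1]
    if b[0] > a[1]:
        return b[0] - a[1]
    return 0.0

def warp(n, m, cost):
    """DTW-style recurrence over an n-by-m grid with cell cost cost(i, j)."""
    g = [[math.inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g[i][j] = cost(i - 1, j - 1) + min(g[i][j - 1], g[i - 1][j], g[i - 1][j - 1])
    return g[n][m]

def dtw_distance(p, q):
    """Exact DTW on the original points."""
    return warp(len(p), len(q), lambda i, j: abs(p[i] - q[j]))

def lbs_unit(p, q, t):
    """Simplified lower bound: one range distance per visited segment pair."""
    pa, qa = segment_ranges(p, t), segment_ranges(q, t)
    return warp(len(pa), len(qa), lambda i, j: dseg(pa[i], qa[j]))

p = [0, 0, 1, 1, 5, 5, 6, 6]
q = [4, 4, 5, 5, 9, 9]
print(lbs_unit(p, q, 2), dtw_distance(p, q))  # the first value never exceeds the second
```

With t = 1 every range collapses to a single point and the bound coincides with the exact distance; larger t makes the table smaller and the bound weaker.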

Page 21: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (2) - EarlyStopping

- Exploit the fact that we have already found k near neighbors at distance $d_{cb}$
  - $d_{cb}$: k-nearest neighbor distance (the Current Best), i.e. the exact distance of the k-th best candidate found so far

Page 22: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (2) - EarlyStopping

- Exclude useless warping paths by using $d_{cb}$
  - Omit g(1,3) if $g(1,2) > d_{cb}$
  - Omit g(4,1) if $g(3,1) > d_{cb}$

[Figure: time warping matrix of $P^A$ and $Q^A$ with cells g(1,2) and g(3,1) marked]
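The pruning rule can be illustrated on a plain DTW table. In this sketch a cell whose accumulated cost exceeds `d_cb` is dropped, so no path is extended through it, and the computation stops as soon as a whole row has been dropped. The slides apply the idea to the cells g(i, j) of the approximate matrix; applying it to exact DTW here, the function name, and the absolute-difference distance are assumptions.

```python
import math

def dtw_early_stop(p, q, d_cb):
    """Return the DTW distance if it is <= d_cb, otherwise math.inf."""
    n, m = len(p), len(q)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            best = min(cur[j - 1], prev[j], prev[j - 1])
            if best == math.inf:
                continue  # no surviving path reaches this cell
            g = abs(p[i - 1] - q[j - 1]) + best
            # Costs are non-negative, so a cell above d_cb cannot lead to a
            # final distance within d_cb; leave it at infinity.
            if g <= d_cb:
                cur[j] = g
        if all(v == math.inf for v in cur):
            return math.inf  # every partial path already exceeds d_cb
        prev = cur
    return prev[m]

print(dtw_early_stop([1, 2, 3], [2, 3, 4], 5))    # 2.0
print(dtw_early_stop([0, 0, 0], [9, 9, 9], 10))   # inf, stops after the second row
```

Since a sequence whose true distance is within `d_cb` has an optimal path on which no prefix exceeds `d_cb`, that path is never pruned and the exact distance is still returned.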

Page 23: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (3) - Refinement

- Q: How to choose t (length of time intervals)?

[Figure: time warping matrices of $P^A$ and $Q^A$ for two different interval lengths t, with cells g(1,2) and g(3,1) marked]

Page 24: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (3) - Refinement

- Q: How to choose t (length of intervals)?
- A: Use multiple granularities, as follows:

[Figure: time warping matrices of $P^A$ and $Q^A$ for two different interval lengths t, with cells g(1,2) and g(3,1) marked]

Page 25: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (3) - Refinement

- Compute the lower bounding distance from the coarsest sequences as the first refinement step
- Ignore P if $D_{lbs}(P^A, Q^A) > d_{cb}$, otherwise:

[Figure: time warping matrix of the coarsest $P^A$ and $Q^A$, with cells g(1,2) and g(3,1) marked]

Page 26: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (3) - Refinement

- … compute the distance from more accurate sequences as the second refinement step
- … repeat

[Figure: time warping matrices of $P^A$ and $Q^A$ at a coarser and a finer granularity]

Page 27: FTW: Fast Similarity Search under the Time Warping Distance

Main Idea (3) - Refinement

- … until the finest granularity
- Update the list of k-nearest neighbors if $D_{dtw}(P, Q) \le d_{cb}$

[Figure: time warping matrix of the original sequences P and Q]
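Put together, the refinement loop might be organised as in the sketch below: each data sequence is tested against lower bounds from the coarsest interval to the finest, dropped as soon as one bound exceeds d_cb, and only the survivors pay for exact DTW. This is an illustration under assumptions, not the FTW implementation: `lower_bound` is a simplified segment-based bound (one range distance per visited segment pair), no warping scope is applied, and the function names, the default `intervals`, and the absolute-difference distance are all hypothetical.

```python
import heapq
import math

def ranges(seq, t):
    """(lower, upper) value range of each length-t window of seq."""
    return [(min(seq[s:s + t]), max(seq[s:s + t])) for s in range(0, len(seq), t)]

def dseg(a, b):
    """Gap between two value ranges; 0 if they overlap."""
    if a[0] > b[1]:
        return a[0] - b[1]
    if b[0] > a[1]:
        return b[0] - a[1]
    return 0.0

def warp(n, m, cost):
    """DTW-style recurrence over an n-by-m grid with cell cost cost(i, j)."""
    g = [[math.inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g[i][j] = cost(i - 1, j - 1) + min(g[i][j - 1], g[i - 1][j], g[i - 1][j - 1])
    return g[n][m]

def dtw(p, q):
    return warp(len(p), len(q), lambda i, j: abs(p[i] - q[j]))

def lower_bound(p, q, t):
    pa, qa = ranges(p, t), ranges(q, t)
    return warp(len(pa), len(qa), lambda i, j: dseg(pa[i], qa[j]))

def knn_search(data, query, k, intervals=(8, 2)):
    """k nearest neighbors of query under DTW; intervals are coarsest first."""
    best = []  # max-heap of the k best so far, stored as (-distance, index)
    for idx, p in enumerate(data):
        d_cb = -best[0][0] if len(best) == k else math.inf
        # any() stops at the first granularity whose bound exceeds d_cb.
        if any(lower_bound(p, query, t) > d_cb for t in intervals):
            continue
        d = dtw(p, query)  # finest granularity: the exact distance
        if d <= d_cb:
            heapq.heappush(best, (-d, idx))
            if len(best) > k:
                heapq.heappop(best)  # drop the current worst candidate
    return sorted((-neg_d, idx) for neg_d, idx in best)

data = [[0, 0, 1, 1, 2, 2, 3, 3], [5, 5, 6, 6, 7, 7, 8, 8], [0, 1, 2, 3], [9] * 6]
print(knn_search(data, [0, 1, 1, 2, 3, 3], k=2))
```

Because each filter is a lower bound on the exact distance, a sequence is skipped only when it cannot enter the current top k, which is how the "no false dismissals" requirement is preserved in this sketch.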

Page 28: FTW: Fast Similarity Search under the Time Warping Distance

Overview

- Introduction
- Related work
- Main ideas
- Experimental results
- Conclusions

Page 29: FTW: Fast Similarity Search under the Time Warping Distance

Experimental results

- Setup
  - Intel Xeon 2.8GHz, 1GB memory, Linux
  - Datasets: Temperature, Fintime, RandomWalk
  - Four different time intervals (for n=2048): t1=2, t2=8, t3=32, t4=128
- Evaluation
  - Compared FTW with LB_PAA (the best so far)
  - Mainly computation time

Page 30: FTW: Fast Similarity Search under the Time Warping Distance

Outline of experiments

- Speed vs db size
- Speed vs warping scope W
- Effect of filtering
- Effect of varying-length data sequences

Page 31: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Itakura Parallelogram

[Figure: Itakura Parallelogram warping scope on the time warping matrix (P: p1 … pi … pN vs. Q: q1 … qj … qM)]

Page 32: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Wall clock time as a function of data set size
- Temperature

FTW is up to 50 times faster!

Page 33: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Wall clock time as a function of data set size
- Fintime

FTW is up to 40 times faster!

Page 34: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Wall clock time as a function of data set size
- RandomWalk

FTW is up to 40 times faster!

More effective as the size grows

Page 35: FTW: Fast Similarity Search under the Time Warping Distance

Outline of experiments

- Speed vs db size
- Speed vs warping scope W
- Effect of filtering
- Effect of varying-length data sequences

Page 36: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Sakoe-Chiba Band

[Figure: Sakoe-Chiba Band on the time warping matrix (P: p1 … pi … pN vs. Q: q1 … qj … qM), drawn for two widths W1 and W2]

Page 37: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Wall clock time as a function of warping scope
- Temperature

FTW is up to 220 times faster!

Page 38: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Wall clock time as a function of warping scope
- Fintime

FTW is up to 70 times faster!

Page 39: FTW: Fast Similarity Search under the Time Warping Distance

Search Performance

- Wall clock time as a function of warping scope
- RandomWalk

FTW is up to 100 times faster!

Page 40: FTW: Fast Similarity Search under the Time Warping Distance

Outline of experiments

- Speed vs db size
- Speed vs warping scope W
- Effect of filtering
- Effect of varying-length data sequences

Page 41: FTW: Fast Similarity Search under the Time Warping Distance

Effect of filtering

- Most of the data sequences are excluded by the coarser approximations (t4=128 and t3=32)
  - Using multiple granularities has significant advantages

[Figure: frequency of approximation use]

Page 42: FTW: Fast Similarity Search under the Time Warping Distance

Outline of experiments

- Speed vs db size
- Speed vs warping scope W
- Effect of filtering
- Effect of varying-length sequences

Page 43: FTW: Fast Similarity Search under the Time Warping Distance

Difference in Sequence Lengths

- 5 sequence data sets
  - Random(2048,0): length 2048 +/- 0
  - Random(2048,32): length 2048 +/- 16
  - Random(2048,64), Random(2048,128), Random(2048,256)

FTW outperforms by 2+ orders of magnitude

LB_PAA cannot handle sequences of different lengths

Page 44: FTW: Fast Similarity Search under the Time Warping Distance

Overview

- Introduction
- Related work
- Main ideas
- Experimental results
- Conclusions

Page 45: FTW: Fast Similarity Search under the Time Warping Distance

Conclusions

- Design goals:
  1. Fast
  2. No false dismissals
  3. No restriction on the sequence length
  4. Support for any, as well as for no, restriction on “warping scope”

Page 46: FTW: Fast Similarity Search under the Time Warping Distance

Conclusions

- Design goals:
  1. Fast (up to 220 times faster)
  2. No false dismissals
  3. No restriction on the sequence length
  4. Support for any, as well as for no, restriction on “warping scope”

Page 47: FTW: Fast Similarity Search under the Time Warping Distance

Page Accesses

- Sequential scan of feature data should boost performance (speed-up factors SF=5, SF=10)
  - $PA_{ds}$: page accesses for data sequences
  - $PA_{fd}$: page accesses for feature data

$$PA_{SF} = PA_{ds} + \frac{PA_{fd}}{SF}$$
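Reading the formula on this slide as PA_SF = PA_ds + PA_fd / SF, the effect of the speed-up factor can be tried out with a small helper. The function name and the page counts below are hypothetical and chosen only to show the arithmetic; they are not results from the experiments.

```python
def effective_page_accesses(pa_ds, pa_fd, sf):
    """Page-access cost when feature-data pages are read sf times faster."""
    return pa_ds + pa_fd / sf

# Hypothetical counts: 100 data-sequence pages, 50 feature-data pages.
for sf in (5, 10):
    print(sf, effective_page_accesses(100, 50, sf))
# 5 110.0
# 10 105.0
```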

