Indexing of Time Series by Major Minima and Maxima

Post on 08-Jan-2016


Eugene Fink, Kevin B. Pratt, Harith S. Gandhi

Time series

A time series is a sequence of real values measured at equal intervals.

Example: 0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0

[Figure: plot of the example series, with values from 0 to 4.]

Results

• Compression of a time series by extracting its major minima and maxima

• Indexing of compressed time series

• Retrieval of series similar to a given pattern

• Experiments with stock and weather series

Outline

• Compression

• Indexing

• Retrieval

• Experiments

Compression

We select major minima and maxima, along with the start point and end point, and discard the other points.

We use a positive parameter R to control the compression rate.

Major minima

A point a[m] in a[1..n] is a major minimum if there are i and j, where i < m < j, such that:

• a[m] is a minimum among a[i..j], and

• a[i] – a[m] ≥ R and a[j] – a[m] ≥ R.

[Figure: a major minimum a[m], with a[i] and a[j] each at least R above it.]

Major maxima

A point a[m] in a[1..n] is a major maximum if there are i and j, where i < m < j, such that:

• a[m] is a maximum among a[i..j], and

• a[m] – a[i] ≥ R and a[m] – a[j] ≥ R.

[Figure: a major maximum a[m], with a[i] and a[j] each at least R below it.]
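The two definitions can be checked directly, point by point. The sketch below is a brute-force illustration, not the compression procedure itself. It assumes the comparisons are "at least R" and uses 0-based indices; the function names are illustrative.

```python
def is_major_minimum(a, m, R):
    """Return True if a[m] satisfies the major-minimum definition (0-based m)."""
    def side_ok(indices):
        # Walk away from m: succeed at the first point at least R above a[m],
        # fail if a point below a[m] comes first (a[m] would not be a minimum).
        for k in indices:
            if a[k] < a[m]:
                return False
            if a[k] - a[m] >= R:
                return True
        return False

    return side_ok(range(m - 1, -1, -1)) and side_ok(range(m + 1, len(a)))


def is_major_maximum(a, m, R):
    """A major maximum of a is a major minimum of the negated series."""
    return is_major_minimum([-x for x in a], m, R)


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
minima = [m for m in range(len(series)) if is_major_minimum(series, m, 3)]
maxima = [m for m in range(len(series)) if is_major_maximum(series, m, 3)]
print(minima, maxima)  # [4, 8] [1, 7, 11]
```

With R = 3, the example series keeps two major minima and three major maxima. The final 0-valued points are not major minima because the series never rises by 3 after them.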

Compression procedure

The procedure performs one pass through a given series.

It can compress a live series without storing it in memory.

It takes linear time and constant memory.
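A minimal sketch of a one-pass compressor with these properties follows. It is one possible reading of the definitions, not the authors' procedure: the name `compress`, the three-state design, and the choice to keep the first of several tied extrema are assumptions.

```python
def compress(series, R):
    """Yield (index, value) for the start point, major extrema, and end point.

    Works on any iterable, so a live stream is never stored in memory.
    """
    first = last = cand = lo = hi = None
    state = 0  # 0: no trend yet, 1: rising toward a maximum, -1: falling
    for point in enumerate(series):
        _, v = point
        if first is None:
            first = lo = hi = point
            yield point  # the start point is always kept
        elif state == 0:
            # Track the running extremes until the series moves by R.
            if v < lo[1]:
                lo = point
            if v > hi[1]:
                hi = point
            if v - lo[1] >= R:
                state, cand = 1, point
            elif hi[1] - v >= R:
                state, cand = -1, point
        elif state == 1:
            if v > cand[1]:
                cand = point  # higher candidate maximum
            elif cand[1] - v >= R:
                yield cand  # dropped by R: the candidate is a major maximum
                state, cand = -1, point
        else:
            if v < cand[1]:
                cand = point  # lower candidate minimum
            elif v - cand[1] >= R:
                yield cand  # rose by R: the candidate is a major minimum
                state, cand = 1, point
        last = point
    if last is not None and last[0] != first[0]:
        yield last  # the end point is always kept


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
print(list(compress(series, 3)))
# [(0, 0), (1, 3), (4, 0), (7, 3), (8, 0), (11, 4), (14, 0)]
```

The sketch keeps only a handful of points at any moment, which matches the constant-memory claim, and it looks at each input value once.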

Indexing of series

We index series in a database by their major inclines, which are upward and downward segments of the series.

Major inclines

A segment a[i..j] is a major upward incline if

• a[i] is a major minimum;

• a[j] is a major maximum;

• for every m ∈ [i..j], a[i] < a[m] < a[j].

[Figure: a major upward incline from a[i] to a[j].]

The definition of a major downward incline is symmetric.

Identification of inclines

The procedure performs two passes through a list of major minima and maxima.

Its time is linear in the number of inclines.
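The incline definition can be illustrated with a direct scan. The sketch below is a brute-force check, not the two-pass procedure described above. It assumes the strict inequalities apply to the interior points i < m < j, uses 0-based indices, and takes the major minima and maxima as given; the indices shown are those of the example series for R = 3.

```python
def upward_inclines(a, minima, maxima):
    """Return (i, j) pairs where a[i..j] is a major upward incline."""
    maxima = set(maxima)
    result = []
    for i in minima:
        highest = a[i]  # highest value seen in a[i..j-1]
        for j in range(i + 1, len(a)):
            if a[j] <= a[i]:
                break  # any longer segment would contain a point not above a[i]
            if j in maxima and a[j] > highest:
                result.append((i, j))  # all interior points lie below a[j]
            highest = max(highest, a[j])
    return result


def downward_inclines(a, minima, maxima):
    """Symmetric case: negate the series and swap the roles of the extrema."""
    return upward_inclines([-x for x in a], maxima, minima)


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
minima, maxima = [4, 8], [1, 7, 11]  # major extrema of the example for R = 3
print(upward_inclines(series, minima, maxima))    # [(4, 7), (8, 11)]
print(downward_inclines(series, minima, maxima))  # [(1, 4), (7, 8)]
```

Note that the segment from index 4 to index 11 is not an incline, because the value at index 8 is not above the value at index 4.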

Indexing of inclines

We index major inclines of series in a database by their lengths and heights.

We use a range tree, which supports indexing of points by two coordinates.

[Figure: an incline plotted as a point in the plane, with length and height as the two coordinates.]
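The kind of query that such an index answers can be sketched with a much simpler structure. The class below is a stand-in, not a range tree: it sorts inclines by length, narrows the length range by binary search, and then filters by height. The class name `InclineIndex` is hypothetical, and taking length as j – i and height as |a[j] – a[i]| is an assumption.

```python
import bisect


class InclineIndex:
    """Simplified two-coordinate index over (length, height, payload) triples."""

    def __init__(self, inclines):
        self._items = sorted(inclines, key=lambda t: t[0])
        self._lengths = [t[0] for t in self._items]

    def query(self, min_len, max_len, min_height, max_height):
        """Return all triples whose length and height fall in the given ranges."""
        lo = bisect.bisect_left(self._lengths, min_len)
        hi = bisect.bisect_right(self._lengths, max_len)
        return [t for t in self._items[lo:hi] if min_height <= t[1] <= max_height]


# Inclines of the example series for R = 3, stored as (length, height, (i, j)).
index = InclineIndex([
    (3, 3, (4, 7)),
    (3, 4, (8, 11)),
    (3, 3, (1, 4)),
    (1, 3, (7, 8)),
])
print(index.query(2, 4, 3.5, 5))  # [(3, 4, (8, 11))]
```

A real range tree answers the same rectangle query without scanning every incline in the length range, which matters when many inclines share similar lengths.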

Retrieval

The procedure inputs a pattern series and searches for similar segments in a database.

[Figure: an example pattern and a database series with segments labeled 1, 2, and 3.]

Main steps:

• Find the pattern’s inclines with the greatest height

• Retrieve all segments that have similar inclines

• Compare each of these segments with the pattern

Highest inclines

First, the retrieval procedure identifies the major inclines in the pattern and selects the highest inclines.

[Figure: a pattern with its two highest inclines, labeled 1 and 2, with lengths length1 and length2.]
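Selecting the highest inclines amounts to ranking a pattern's inclines by height. A minimal sketch follows; the function name, the parameter `k` (how many inclines to keep), and the height measure |a[j] – a[i]| are assumptions.

```python
def highest_inclines(a, inclines, k=2):
    """Return the k inclines (i, j) with the greatest height |a[j] - a[i]|."""
    def height(incline):
        i, j = incline
        return abs(a[j] - a[i])

    # sorted() is stable, so inclines of equal height keep their input order.
    return sorted(inclines, key=height, reverse=True)[:k]


pattern = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
inclines = [(1, 4), (4, 7), (7, 8), (8, 11)]  # major inclines for R = 3
print(highest_inclines(pattern, inclines))  # [(8, 11), (1, 4)]
```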

Candidate segments

Second, the procedure retrieves segments with similar inclines from the database.

An incline is considered similar if

• its height is between height / C and height · C;

• its length is between length / D and length · D.

We use the range tree to retrieve similar inclines.

[Figure: a pattern incline and the rectangle of similar inclines, bounded by length / D, length · D, height / C, and height · C.]
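The similarity condition defines a rectangle in the length-height plane, which is the query passed to the index. A minimal sketch follows; the function names are illustrative, and it assumes C ≥ 1 and D ≥ 1, as in the experiments.

```python
def similar_box(length, height, C, D):
    """Return (min_len, max_len, min_height, max_height) for similar inclines."""
    return (length / D, length * D, height / C, height * C)


def is_similar(cand_length, cand_height, length, height, C, D):
    """Check whether a candidate incline falls inside the pattern incline's box."""
    min_len, max_len, min_h, max_h = similar_box(length, height, C, D)
    return min_len <= cand_length <= max_len and min_h <= cand_height <= max_h


# Pattern incline of length 4 and height 3, with C = 1.5 and D = 2.
print(similar_box(4, 3, 1.5, 2))       # (2.0, 8, 2.0, 4.5)
print(is_similar(3, 4, 4, 3, 1.5, 2))  # True
print(is_similar(1, 3, 4, 3, 1.5, 2))  # False: length 1 is below 4 / 2
```

Larger values of C and D widen the rectangle, so more candidate segments are retrieved and the later comparison step takes longer.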

Similarity test

Third, the procedure compares the retrieved segments with the pattern, using a given similarity test.

Experiments

We have tested a Visual Basic implementation on a 2.4-GHz Pentium computer.

Data sets:

• Stock prices: 98 series, 60,000 points

• Air and sea temperatures: 136 series, 450,000 points

Stock prices (60,000 points): search for 100-point patterns

The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking).

[Figure: three plots of perfect ranking against fast ranking, one per setting.]

• C = D = 5: time 0.05 sec

• C = D = 2: time 0.02 sec

• C = D = 1.5: time 0.01 sec

Stock prices (60,000 points): search for 500-point patterns

The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking).

[Figure: three plots of perfect ranking against fast ranking, one per setting.]

• C = D = 5: time 0.31 sec

• C = D = 2: time 0.12 sec

• C = D = 1.5: time 0.09 sec

Temperatures (450,000 points): search for 200-point patterns

The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking).

[Figure: three plots of perfect ranking against fast ranking, one per setting.]

• C = D = 5: time 1.18 sec

• C = D = 2: time 0.27 sec

• C = D = 1.5: time 0.14 sec

Conclusions

Main results: Compression and indexing of time series by major minima and maxima.

Current work: Hierarchical indexing by importance levels of minima and maxima.
