
COMP 5331 Project 26-11-2015 1 Roadmap I will give a brief introduction (e.g. notation) on time...

Date post: 21-Jan-2016
Upload: gerard-nelson

COMP 5331 Project
26-11-2015

Roadmap

I will give a brief introduction (e.g. notation) to time series, to give a notion of what we are playing with.

I will talk about the similarity (i.e. distance measure) between two time series and what invariances are.

Then I will formulate a problem on music data and claim that the existing distance measures are not good enough.

Finally, I will introduce a novel distance measure.

I hope I can cover all of this in 15 minutes!

What are Time Series?

A time series is a collection of observations made sequentially in time.

It is an ordered list of real-valued numbers: $T = t_1, t_2, \ldots, t_m$.
Length of T = |T| = m
Measurement: can be anything, a scalar or a vector.
Sample rate: e.g. 1 per day, or 40 Hz (40 times per second).
[Figure: a time series plotted as measurement against time]

A subsequence $T_{i,k}$ of time series T is a shorter time series of length k that starts at position i.

The time axis is not important, because the measurements are always taken at a consistent sample rate. Only the order is important, so we only store the measured values.

There are m-k+1 subsequences of length k in T. They can be extracted with a sliding window.
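As a minimal sketch of the sliding window, assuming the time series is held in a plain Python list (the function name `subsequences` is made up for illustration), all m-k+1 subsequences of length k could be enumerated like this:

```python
def subsequences(series, k):
    """Return every length-k subsequence of `series`, in order."""
    if k < 1 or k > len(series):
        raise ValueError("k must be between 1 and len(series)")
    # The window start runs over m - k + 1 positions (0-based here).
    return [series[i:i + k] for i in range(len(series) - k + 1)]


T = [3, 1, 4, 1, 5, 9]
windows = subsequences(T, 3)
print(len(windows))  # m - k + 1 = 6 - 3 + 1 = 4
print(windows[0], windows[-1])
```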

Some data are obviously time series:
Blood pressure of my grandma OVER time
Salary of my dad OVER time
My GPA OVER time

But a time series is not necessarily related to time.
Other data are not time series, but we can convert them!

An Algorithm to Convert an Image to a Time Series
Compute the central point of the shape.
Arbitrarily choose a starting point o on the contour.
Calculate the distance between the central point and the contour: start from o, go around the contour and finish at o.
[Figure labels: Outlook; Cut and Stretch]
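The image-to-time-series steps above could be sketched as follows. This is only an illustration under two assumptions that are not in the slide: the contour is already available as an ordered list of (x, y) points, and the central point is taken to be the mean of those points. The function name `contour_to_series` is hypothetical.

```python
import math


def contour_to_series(contour, start=0):
    """Distances from the central point to each contour point, beginning at index `start`."""
    # Assumption: the central point is the mean of the contour points.
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    # Walk around the contour once, starting from the chosen point o.
    ordered = contour[start:] + contour[:start]
    return [math.hypot(x - cx, y - cy) for x, y in ordered]


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(contour_to_series(square))
```

Choosing a different `start` only rotates the resulting series, which is why the starting point can be arbitrary.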

[Figure: Handwriting data (label: Outlook)]

gtttatgtagcttaccccctcaaagcaatacactgaaaatgtttcgacgggtttacatcaccccataaacaaacaggtttggtcctagcctttctattag...

An Algorithm to Convert DNA to Time Series
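The transcript does not preserve the algorithm itself. One possible mapping is a cumulative walk in which each base moves the current value up or down by a fixed step. The step values and the function name `dna_to_series` below are assumptions made for illustration, not taken from the slide.

```python
# Assumed step per base; these values are illustrative, not from the slide.
STEP = {"a": 2, "g": 1, "c": -1, "t": -2}


def dna_to_series(dna):
    """Turn a DNA string into a time series by accumulating one step per base."""
    series, level = [], 0
    for base in dna.lower():
        level += STEP[base]
        series.append(level)
    return series


print(dna_to_series("gtttatgtagcttacc"))
```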

For some complex objects such as proteins, we can use enzymes to break them into their linear building blocks, peptides. (Mentioned by group 2.)

MFCC
MFCC stands for Mel Frequency Cepstral Coefficient.
It is a speech analysis method based on human perception experiments.
It concentrates on only certain frequency components.
[Figure: audio, converted by MFCC into a vector with 13 numbers]

I use the first coefficient in the experiment.

What can we ask about time series?

Clustering

Classification

Motif Discovery (repeated pattern)
On subsequences in a long time series stream, or on different individual time series.

Rule Discovery

Query

We need to define similarity between two time series!

Similarity
For classification of time series, simple nearest neighbor classification performs VERY well.
The choice of distance measure is important.

Distance function, D(A,B)
We want it to have these properties:
$D(A,B) = D(B,A)$ (Symmetry)
$D(A,A) = 0$ (Constancy)
$D(A,B) = 0$ iff $A = B$ (Positivity)
$D(A,B) \le D(A,C) + D(B,C)$ (Triangular Inequality; not essential but better to have, because it makes it easy to find a lower bound and do indexing)

The triangular inequality helps us compute a lower bound for some distance measures.

Lower bound
Computing the actual measure is expensive, while computing the lower bound is cheap.
So we always compute the cheap lower bound first. With the lower bound, we can check whether computing the actual measure is necessary for the current task. This strategy saves a lot of work.

Example

Our task is to find the nearest point to the query point, Q.
Visit a. Calculate the actual measure. The result is 2, so 2 is our best-so-far answer.
Visit b. Calculate the actual measure. The result is 7.81, so 2 remains our best-so-far answer.

Visit c. We only need to calculate the lower bound.
$D(Q,b) \le D(Q,c) + D(b,c)$
Rearranging, $D(Q,c) \ge D(Q,b) - D(b,c)$, which here gives
$5.51 \le D(Q,c)$
Since the lower bound of D(Q,c) is worse than the best-so-far, we don't need to compute the actual D(Q,c).

This works because we have already computed the pairwise distances between all the data points (i.e. a, b, c).
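The pruning idea in this example could be sketched as below. It is a sketch under assumptions rather than the method from the slides: the data are made-up 2-D points (not the slide's a, b, c), the distance is the ordinary Euclidean distance from `math.dist`, and the function name `nearest_neighbor` is hypothetical.

```python
import math


def nearest_neighbor(query, points):
    """Return (index of the nearest point, number of actual distances computed)."""
    # Pairwise distances between the data points, computed once in advance.
    pair = [[math.dist(p, q) for q in points] for p in points]
    best_index, best = None, math.inf
    actual = {}  # index -> actual distance from the query
    for c in range(len(points)):
        # Triangular inequality: D(Q,c) >= |D(Q,b) - D(b,c)| for every visited b.
        lower = max((abs(d - pair[b][c]) for b, d in actual.items()), default=0.0)
        if lower >= best:
            continue  # c cannot beat the best-so-far, so skip the actual measure
        actual[c] = math.dist(query, points[c])
        if actual[c] < best:
            best_index, best = c, actual[c]
    return best_index, len(actual)


# Made-up 2-D points; Q is the query.
Q = (0.0, 0.0)
data = [(2.0, 0.0), (5.0, 6.0), (6.0, 8.0)]
print(nearest_neighbor(Q, data))
```

The second value returned shows how many actual distances were needed. How much is saved depends on the order in which the points are visited.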

Euclidean Distance (ED)

Given two time series of length n,
$Q = q_1, q_2, \ldots, q_i, \ldots, q_n$
$C = c_1, c_2, \ldots, c_i, \ldots, c_n$

$D(Q,C) = \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2}$
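A minimal sketch of the Euclidean distance between two equal-length series, assuming they are plain Python lists (the function name `euclidean_distance` is chosen here for illustration):

```python
import math


def euclidean_distance(q, c):
    """Euclidean distance between two time series of the same length n."""
    if len(q) != len(c):
        raise ValueError("ED is only defined for series of equal length")
    # Points are compared one to one: q_i with c_i.
    return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))


print(euclidean_distance([0, 0, 0], [1, 2, 2]))  # sqrt(1 + 4 + 4) = 3.0
```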

We can do a little bit speed up by using

Normalization
ED is sensitive to distortion, so we need to normalize the time series.

Amplitude and offset invariance can be achieved by z-normalization.
[Figure: the green time series has a greater amplitude than the blue one]
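For illustration, z-normalization subtracts the mean and divides by the standard deviation. The sketch below assumes the population standard deviation and returns all zeros for a constant series; both choices, and the name `z_normalize`, are assumptions rather than something stated on the slide.

```python
import math


def z_normalize(series):
    """Shift a series to zero mean and scale it to unit standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0] * len(series)  # constant series: nothing to scale
    return [(x - mean) / std for x in series]


blue = [1.0, 2.0, 3.0, 2.0]
green = [10.0 * x + 5.0 for x in blue]  # same shape, larger amplitude and offset
print(z_normalize(blue))
print(z_normalize(green))
```

After normalization the two series should coincide up to floating-point error, so ED no longer penalizes the amplitude and offset difference.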

Linear trend invariance

Noise invariance

For linear trend invariance:
Find the least-squares line: $L = At^2 + Bt + C$
Compute the discrete points on L: $L = l_1, l_2, \ldots, l_n$
The new time series: $T' = t_1 - l_1, t_2 - l_2, \ldots, t_n - l_n$
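The trend-removal steps could be sketched as follows. Note the simplification: the formula on the slide includes a $t^2$ term, while this sketch fits only a straight line (that is, it assumes A = 0) with the closed-form least-squares solution. The time index 0, 1, 2, ... and the name `remove_linear_trend` are also assumptions.

```python
def remove_linear_trend(series):
    """Subtract the least-squares straight line from a series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    # Closed-form least-squares slope and intercept for t = 0, 1, ..., n-1.
    slope = 0.0
    if denom:
        slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series)) / denom
    intercept = y_mean - slope * t_mean
    # New series: original point minus the point on the fitted line.
    return [y - (slope * t + intercept) for t, y in enumerate(series)]


print(remove_linear_trend([1.0, 3.0, 5.0, 7.0]))  # a pure trend leaves only zeros
```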

For noise invariance:
Create a window with n points in it.
Compute the mean value of these n points.
Create a new point with this mean value and remove these n points.
Go to the next set of n points.

Attention!
Different domains require different invariances.
Cardiology (heart) data requires invariance to the mean value.
Some invariances should not be considered in specific domains: adding rotation invariance would make it impossible to distinguish the shapes 'p' and 'd'.
In music data, I think the following invariances are important:
Local scaling invariance
Occlusion invariance
Uniform scaling invariance
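Returning to the window-averaging steps for noise invariance listed above, a minimal sketch could look like this. It assumes non-overlapping windows and averages a shorter final window over the points it actually has; the slide does not say how leftovers are handled, and the name `window_average` is hypothetical.

```python
def window_average(series, n):
    """Replace each block of n consecutive points by their mean value."""
    if n < 1:
        raise ValueError("window size must be at least 1")
    smoothed = []
    for start in range(0, len(series), n):
        window = series[start:start + n]  # the last window may be shorter
        smoothed.append(sum(window) / len(window))
    return smoothed


print(window_average([1, 3, 5, 7, 9], 2))  # [2.0, 6.0, 9.0]
```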

Local Scaling Invariance
We think these two time series are similar because they are made of the same parts.
Each of them has first a static part, then a peak, a static part, a valley and finally a static part, but the parts have different lengths (i.e. numbers of sample points).
ED returns a poor result because the peaks and the valleys are in different positions.
This can be solved by Dynamic Time Warping (DTW).

DTW
The one-to-many property allows similar shapes to match even if they are out of phase and have different lengths.
ED: one to one
DTW: one to many

The one-to-many property of DTW makes it perform better than ED, but it also makes DTW computationally expensive.

[Figure: warping matrix with Time Series A and Time Series B along its axes]
The red dots in the matrix are the path $P = p_1, p_2, \ldots, p_s, \ldots, p_k$, where $p_s = (i_s, j_s)$, which shows the alignment of points between A and B.

The path is found by dynamic programming so as to minimize the total distance between the aligned points.
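The dynamic programming behind DTW could be sketched as below. It is a plain, unconstrained version that returns only the total distance, not the path. The local cost |a_i - b_j| is an assumption (squared differences are another common choice), picked because the worked example later in these slides sums absolute differences. The name `dtw_distance` is chosen for illustration.

```python
def dtw_distance(a, b):
    """Total distance along the best warping path between series a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best total distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three neighbouring alignments.
            cost[i][j] = local + min(cost[i - 1][j],      # a_i matched again
                                     cost[i][j - 1],      # b_j matched again
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


print(dtw_distance([0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0]))
```

Filling the whole table takes time proportional to the product of the two lengths, which is where the extra cost compared with ED comes from.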

Occlusion Invariance
Some part of one time series is missing.

This can be handled by variants of DTW that have the extra ability to ignore sections that are difficult to match (possibly with some penalty).

For example, for the valley in the lower time series, DTW cannot find any corresponding part (nearby) in the upper time series.

Uniform Scaling Invariance
We can easily see that the right time series is just a rescaled version of the left one (rescale rate = 2X).
If we uniformly scale (US) the left one by a factor of 2, ED will give a good result. For simplicity, I call the operation US+ED simply US.
However, we do not know the factor beforehand, so we are forced to test all possible factors.

DTW is not a generalization of uniform scaling
Sometimes, US performs better than DTW.
First, we must understand what rescaling is.
For example, we rescale a time series T={0, 3, 2, 1} from |T|=4 to 10 and form a new series called t={0, 1, 2, 3, 2.67, 2.3, 2, 1.6, 1.3, 1}, where the red points are the original points and the blue points are the created intermediate points.
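The rescaling in this example is consistent with linear interpolation between the original points, so a sketch under that assumption could look like this (the function name `rescale` is hypothetical):

```python
def rescale(series, new_length):
    """Stretch or shrink a series to new_length points by linear interpolation."""
    n = len(series)
    if new_length == 1 or n == 1:
        return [float(series[0])] * new_length
    result = []
    for j in range(new_length):
        # Position of the j-th new point on the original index axis 0 .. n-1.
        pos = j * (n - 1) / (new_length - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        result.append(series[lo] * (1 - frac) + series[hi] * frac)
    return result


print([round(v, 2) for v in rescale([0, 3, 2, 1], 10)])
```

The original points survive unchanged and the intermediate points lie on the straight segments between them. The slide lists 2.3 and 1.6 where the interpolation gives about 2.33 and 1.67.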

Suppose there are two time series:
A={0, 3, 2, 1}
B={0, 1, 2, 3, 2.67, 2.3, 2, 1.6, 1.3, 1}

If we rescale A from length 4 to 10 and form A'={0, 1, 2, 3, 2.67, 2.3, 2, 1.6, 1.3, 1}, then
ED(A',B) = 0
DTW(A,B) = |0-0| + |1-0| + |2-3| + |3-3| + |2.67-3| + |2.3-2| + |2-2| + |1.6-2| + |1.3-1| + |1-1| = 3.33
Using US matches our intuition.
[Figure labels: A; B]

So there is a compound method that first applies US and then DTW. It is called SWM, which stands for Scaled and Warped Matching.

[Figure labels: By Uniform Scaling; By DTW]
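A heavily simplified sketch of the "US first, then DTW" idea is shown below. It is not the SWM algorithm as published: it just tries every candidate length, rescales the query by linear interpolation, runs an unconstrained DTW with absolute-difference cost, and keeps the smallest distance. All function names and the choice of candidate lengths are assumptions.

```python
def rescale(series, new_length):
    """Linear-interpolation rescaling (assumed form of uniform scaling)."""
    n = len(series)
    if new_length == 1 or n == 1:
        return [float(series[0])] * new_length
    out = []
    for j in range(new_length):
        pos = j * (n - 1) / (new_length - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        out.append(series[lo] * (1 - (pos - lo)) + series[hi] * (pos - lo))
    return out


def dtw_distance(a, b):
    """Unconstrained DTW with |a_i - b_j| as the local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def swm_distance(query, candidate, lengths):
    """Smallest DTW distance over the query rescaled to each length in `lengths`."""
    return min(dtw_distance(rescale(query, n), candidate) for n in lengths)


A = [0, 3, 2, 1]
B = [0, 1, 2, 3, 2.67, 2.3, 2, 1.6, 1.3, 1]
print(dtw_distance(A, B), swm_distance(A, B, range(4, 11)))
```

On the A and B from the earlier example, the scaled variant should come out much smaller than plain DTW, because one of the candidate lengths reproduces B almost exactly.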

Why do I think these three are important?
Imagine you are a beginner at playing the piano.
You may not manage to play some notes if many of them appear in a short period of time. (Occlusion invariance)
You may have a slight tempo (speed of music) difference in some bars. (Local scaling invariance)

There are several parts in a music piece. You may play the first part faster than in the other performance and play the second part at the same speed as in the other performance. (Uniform scaling invariance)

Experiment
Data: 2 music pieces with exactly the same content but different lengths.
Content: a segment extracted from my favourite song, Red Shadow in the Candle Light, performed by me.

Slower one: 12s
Faster one: 9s

I use MFCC to convert a music piece into a matrix with 13 rows.
I only use the first row in the analysis.

Slower one

Faster one

I call the faster time series candle_fast (913 points) and the slower one candle_slow (1203 points).

ED, DTW, US, SWM
Since candle_fast and candle_slow have different lengths:
I extract the first 913 points from candle_slow to form a new time series called candle_slow_pruned.
I rescale candle_fast to the same length as candle_slow to form a new time series called candle_fast_lengthen.

ED(candle_fast, candle_slow_pruned) = 2531
DTW(candle_fast, candle_slow) = 662
US(candle_fast, candle_slow) = ED(candle_fast_lengthen, candle_slow) = 3965
SWM(candle_fast, candle_slow) = 1260

I originally expected SWM to be the best, but DTW gives the smallest distance. This may be because the longer series cannot be obtained from the shorter one by uniform scaling.

However, even SWM is not good enough
They have similar lengths.
To us, they look similar with the aid of color.
The left time series is formed by four non-overlapping subsequences.
The right time series is formed by the same set of subsequences, in the same order, but with different scaling factors.

ED performs worse.
US performs worse: by choosing a suitable scaling factor, US can only take care of one subsequence.
DTW performs better.

If we know ahead of time that the two time series are formed by the same set of subsequences but with different scaling factors, we can separate the time series and do a uniform scaling on each part. This is piecewise uniform scaling.
After the piecewise uniform scaling, do the Scaled and Warped Matching.

Observe that:

There are four , which may imply that the two time series are formed by the same set of 4 subsequences.

[Figure label: DTW]
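Piecewise uniform scaling, as described on this slide, could be sketched as follows. The sketch assumes that the boundaries between the parts are already known and that each part is rescaled by linear interpolation to the length of the matching part in the other series. The names `rescale` and `piecewise_rescale` and the example data are made up for illustration.

```python
def rescale(series, new_length):
    """Linear-interpolation rescaling (assumed form of uniform scaling)."""
    n = len(series)
    if new_length == 1 or n == 1:
        return [float(series[0])] * new_length
    out = []
    for j in range(new_length):
        pos = j * (n - 1) / (new_length - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        out.append(series[lo] * (1 - (pos - lo)) + series[hi] * (pos - lo))
    return out


def piecewise_rescale(series, breakpoints, target_lengths):
    """Split `series` at `breakpoints` and rescale each part to its target length."""
    bounds = [0] + list(breakpoints) + [len(series)]
    parts = [series[s:e] for s, e in zip(bounds, bounds[1:])]
    if len(parts) != len(target_lengths):
        raise ValueError("need exactly one target length per part")
    result = []
    for part, length in zip(parts, target_lengths):
        result.extend(rescale(part, length))  # each part gets its own factor
    return result


# Two parts: the first is stretched from 3 to 5 points, the second keeps its length.
print(piecewise_rescale([0, 2, 4, 10, 10], breakpoints=[3], target_lengths=[5, 2]))
```

The rescaled series can then be passed to whichever matching step follows, for example the Scaled and Warped Matching mentioned above.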

For music data, the story is much simpler.

Between different parts, there is usually a rest.

By using this information, we can separate the time series and do the piecewise uniform scaling on it.
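One way the rests might be used to separate the parts is sketched below. It assumes that a rest shows up as a run of at least `min_rest` consecutive values at or below some `threshold`. Both parameters and the function name `split_on_rests` are hypothetical, and suitable values would depend on the feature being used.

```python
def split_on_rests(series, threshold, min_rest):
    """Split a series into parts separated by sufficiently long quiet runs."""
    parts, current, rest_run = [], [], []
    for value in series:
        if value <= threshold:
            rest_run.append(value)  # possibly inside a rest
            continue
        if len(rest_run) >= min_rest:
            # A long enough rest closes the current part.
            if current:
                parts.append(current)
            current = []
        else:
            # Too short to count as a rest: keep those samples in the part.
            current.extend(rest_run)
        rest_run = []
        current.append(value)
    if len(rest_run) < min_rest:
        current.extend(rest_run)
    if current:
        parts.append(current)
    return parts


print(split_on_rests([5, 6, 0, 0, 0, 7, 8, 0, 9], threshold=1, min_rest=2))
```

In this made-up example the three consecutive zeros are treated as a rest, while the single zero later on is kept inside the second part.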

Thank You
Questions and Comments
And get the Coupon

