Mining Event Periodicity from Incomplete Observations

Post on 11-Jan-2016


description

Mining Event Periodicity from Incomplete Observations. Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han, University of Illinois at Urbana-Champaign (*now at Penn State University). KDD 2012, Beijing, China. Prologue: Detect Periodicity in Movements [Li et al., KDD’10].

transcript

Zhenhui Jessie Li 1

Mining Event Periodicity from Incomplete Observations

Zhenhui (Jessie) Li*, Jingjing Wang, Jiawei Han
University of Illinois at Urbana-Champaign

*Now at Penn State University

KDD 2012, Beijing, China


Prologue: Detect Periodicity in Movements [Li et al., KDD’10]

Problem: What is the periodicity of the movement?

Bee example: 8 hours in the hive, 16 hours flying nearby


Prologue: Detect Periodicity in Movements [Li et al., KDD’10]

Observe the in-and-out movements from the reference spot (i.e., hive).

[Figure: two-dimensional movement mapped to a one-dimensional binary sequence over time (in hive / outside hive).]

The periodicity is easy to see in the binary sequence.
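The mapping from positions to an in/out sequence can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_binary_sequence`, the circular `radius` around the reference spot, and the synthetic hourly track are all assumptions made for the example.

```python
import math

def to_binary_sequence(points, reference, radius):
    """Return 1 for each (x, y) point within `radius` of `reference`, else 0."""
    rx, ry = reference
    return [1 if math.hypot(x - rx, y - ry) <= radius else 0 for x, y in points]

# Hypothetical hourly track over two days: the bee sits at the hive (origin)
# for the first 8 hours of each day and is elsewhere for the other 16.
track = [(0.0, 0.0) if (hour % 24) < 8 else (5.0, 5.0) for hour in range(48)]
seq = to_binary_sequence(track, reference=(0.0, 0.0), radius=1.0)
```

With complete hourly samples like these, the repeating block of eight 1s followed by sixteen 0s is visible directly in `seq`.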


Challenge: Periodicity Detection for Incomplete Observations

• Two factors result in incomplete observations: inconsistent sampling + low sampling rate
• Movement data collection in real scenarios:
– Human movement data collected from cellphones: locations are reported only when making calls
– Animal movement data: 2~3 locations in 3~5 days

2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
…

[Figure: complete observations vs. incomplete observations (in hive / outside hive).]
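Sparse records like the ones listed on this slide can be turned into (timestamp, value) pairs before any periodicity analysis. The sketch below is only one possible preprocessing step and is an assumption of this example: it uses hourly time units, takes midnight of the first record's date as time zero, and introduces the hypothetical helper `parse_observations`.

```python
from datetime import datetime

def parse_observations(lines):
    """Parse 'YYYY-MM-DD HH:MM in|out' lines into (hour_index, 0/1) pairs.

    hour_index counts whole hours since midnight of the first record's date,
    so hour_index % 24 is the hour of day. Records are assumed to be sorted.
    """
    records = []
    for line in lines:
        date_s, time_s, label = line.split()
        ts = datetime.strptime(f"{date_s} {time_s}", "%Y-%m-%d %H:%M")
        records.append((ts, 1 if label == "in" else 0))
    origin = records[0][0].replace(hour=0, minute=0)
    return [(int((ts - origin).total_seconds() // 3600), x) for ts, x in records]

raw = [
    "2009-05-02 01:03 in",
    "2009-05-03 11:30 out",
    "2009-05-05 03:12 in",
    "2009-05-09 12:03 in",
    "2009-05-10 11:14 out",
    "2009-05-11 02:15 in",
]
obs = parse_observations(raw)
```

Only six of the roughly 220 hourly slots in this span carry an observation, which is what makes the sequence hard to analyze with methods that expect evenly sampled data.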


A Challenging Case of Detecting Periodicity for Incomplete Observations

Sparse Raw Data:

2009-05-02 01:03 in
2009-05-03 11:30 out
2009-05-05 03:12 in
2009-05-09 12:03 in
2009-05-10 11:14 out
2009-05-11 02:15 in
…

[Figure: the sparse raw data shown as a sequence: in, out, in, …]

Any periodicity in the above sequence?


Mining Periodicity in Incomplete Data

• The event has a period of 20
• Occurrences of the event happen between 20k+5 and 20k+10


A Probabilistic Model for Periodic Event

Example:
• Human daily periodicity of visiting the office
• Period is 24 (hours)
• Visiting the office at 10:00-11:00 and 14:00-16:00


A Probabilistic Model for Periodic Event with Random Observation

[Figure: the model generates observations, e.g. x(5)=1, x(62)=0.]
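A small simulation may help make the idea of a periodic event with random observation concrete. The slides do not spell out the model, so the sketch below rests on assumptions: the event value x(t) is drawn as a Bernoulli variable whose probability depends only on t mod T, and each timestamp is observed independently with probability `obs_prob`. The names `generate_observations`, `p`, and `obs_prob` are hypothetical.

```python
import random

def generate_observations(p, length, obs_prob, seed=0):
    """Simulate sparse observations (t, x(t)) of a periodic event.

    p[i] is the assumed probability that the event occurs at any t with
    t % len(p) == i. Each timestamp is kept with probability obs_prob.
    """
    rng = random.Random(seed)
    T = len(p)
    observations = []
    for t in range(length):
        if rng.random() < obs_prob:  # is this timestamp observed at all?
            x = 1 if rng.random() < p[t % T] else 0
            observations.append((t, x))
    return observations

# Period-20 event that occurs between 20k+5 and 20k+10 (treated as inclusive here).
T = 20
p = [1.0 if 5 <= i <= 10 else 0.0 for i in range(T)]
obs = generate_observations(p, length=1000, obs_prob=0.1)
```

Each element of `obs` plays the role of an entry such as x(5)=1 or x(62)=0; most timestamps simply never appear.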


Periodicity Detection by Overlaying Observations

[Figure: observations overlaid under two candidate periods.]

True period: skewed distribution
Wrong period: even distribution
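Overlaying amounts to folding every timestamp modulo a candidate period and counting positive and negative observations per position. The following is a minimal sketch under that reading; the helper name `overlay` and the synthetic data (a period-24 event sampled every 7 hours) are assumptions chosen for illustration.

```python
def overlay(observations, T):
    """Count positive and negative observations at each position t % T."""
    pos = [0] * T
    neg = [0] * T
    for t, x in observations:
        if x == 1:
            pos[t % T] += 1
        else:
            neg[t % T] += 1
    return pos, neg

# Hypothetical sparse data: the event occurs at hours 10 and 11 of each day,
# but only every 7th hour is observed over 30 days.
obs = [(t, 1 if t % 24 in (10, 11) else 0) for t in range(0, 720, 7)]

pos24, _ = overlay(obs, 24)
pos23, _ = overlay(obs, 23)
bins24 = [i for i, c in enumerate(pos24) if c > 0]
bins23 = [i for i, c in enumerate(pos23) if c > 0]
```

Under T=24 all positive observations pile up in positions 10 and 11, while under T=23 the same observations are scattered across many positions.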


Relationship between Observation Ratio and Probabilistic Model

[Figure labels: Pos/Neg Ratio; Periodic Distribution Vector]


Discrepancy Score to Measure Periodicity

If T (=24) is the correct period, the discrepancy score should be large for a certain set of timestamps.

If T (=23) is a wrong period, the discrepancy score is likely to be close to zero for any set of timestamps.
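The slides do not give the formula for the discrepancy score, so the sketch below assumes one plausible definition: for a set I of positions within the period, the score is the fraction of positive observations that fall in I minus the fraction of negative observations that fall in I. The function name `discrepancy` and the synthetic data are hypothetical.

```python
def discrepancy(observations, T, I):
    """Assumed score: share of positives in I minus share of negatives in I,
    where each observation is placed at position t % T."""
    pos = [t % T for t, x in observations if x == 1]
    neg = [t % T for t, x in observations if x == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative observation")
    mu_pos = sum(b in I for b in pos) / len(pos)
    mu_neg = sum(b in I for b in neg) / len(neg)
    return mu_pos - mu_neg

# Hypothetical sparse data: a period-24 event at hours 10 and 11,
# observed only every 7th hour over 30 days.
obs = [(t, 1 if t % 24 in (10, 11) else 0) for t in range(0, 720, 7)]

d24 = discrepancy(obs, 24, {10, 11})  # correct period, well-chosen set
d23 = discrepancy(obs, 23, {10, 11})  # wrong period, same set
```

For this data the correct period separates positives from negatives completely on the set {10, 11}, whereas under T=23 the two fractions nearly cancel.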


Periodicity Measure
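The formula for the periodicity measure is not reproduced in this transcript. One way to turn a set-based discrepancy into a single number per candidate period is to take the best possible set of positions, which simply sums the positive per-position differences between the share of positive and the share of negative observations. The sketch below makes exactly that assumption; `periodicity_score` and `scan_periods` are hypothetical names, and any normalization the authors apply is not included.

```python
def periodicity_score(observations, T):
    """Assumed measure: the largest difference (positive share minus negative
    share) achievable by any set of positions modulo T."""
    pos = [0] * T
    neg = [0] * T
    for t, x in observations:
        if x == 1:
            pos[t % T] += 1
        else:
            neg[t % T] += 1
    n_pos, n_neg = sum(pos), sum(neg)
    if n_pos == 0 or n_neg == 0:
        return 0.0
    # The best set contains exactly the positions where positives dominate.
    return sum(max(p / n_pos - n / n_neg, 0.0) for p, n in zip(pos, neg))

def scan_periods(observations, candidates):
    """Score every candidate period."""
    return {T: periodicity_score(observations, T) for T in candidates}

# Hypothetical sparse data: period-24 event at hours 10 and 11, every 7th hour observed.
obs = [(t, 1 if t % 24 in (10, 11) else 0) for t in range(0, 720, 7)]
scores = scan_periods(obs, range(2, 31))
```

With only about a hundred observations, wrong periods also receive scores well above zero, so the scores are meaningful relative to each other rather than in absolute terms.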


Performance Comparisons

Sampling rate (ratio of observed points in the complete sequence)


Experiment on Real Human Data

One person’s visits to a specific location

Sampling rate: 20 min

Sampling rate: 1 hour


Problems with Using Fourier Transform to Detect Periodicity

T=4

T=16


Summary: Mining Event Periodicity from Incomplete Observations

• Motivation
– Challenge of real data: incomplete observations (inconsistent + low sampling rate)
• Method
– Overlay the segments and measure the “skewness” of the distribution
– Theoretically prove the correctness of the method
• Application
– Location prediction
– 2nd place in Nokia Mobile Data Challenge 2012
– Periodicity-based feature + SVM

Thanks! Questions?