Post on 24-Feb-2016
1
The Keogh Lab
Data Mining and Structure Retrieval
Presented by Abdullah Mueen
2
Overview of our work
• Our Goal: Extract information from raw, noisy, massive, unstructured data.
• We develop algorithms for
  – Classification
  – Clustering
  – Rule finding
  – Motif discovery
  – Discord discovery
  – Shapelet discovery
  – Linkage discovery
• We work closely with the domain experts.
  – For collecting new data.
  – To verify our results.
3
Case 1: Motif Discovery
Beet Leafhopper (Circulifer tenellus)
[Figure: recording setup. Labels: plant membrane, stylet, voltage source, input resistor, voltage reading (V), conductive glue to insect, lead to soil near plant.]
Exact Discovery of Time Series Motifs. Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover. SDM 2009.
MK motif discovery
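As a rough illustration of the problem that motif discovery solves, the sketch below finds the closest pair of non-overlapping subsequences in a time series by comparing every pair. It is not the MK algorithm from the cited paper, only a brute-force baseline. The function names, the z-normalization step, and the rule for skipping overlapping pairs are assumptions made for this example.

```python
import math

def znorm(seq):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences. Compares every pair, so it is quadratic in
    the number of subsequences."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, None, None)
    for i in range(len(subs)):
        # Start at i + m so the two windows never overlap (trivial matches).
        for j in range(i + m, len(subs)):
            d = euclidean(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy series (made up): the pattern 0, 1, 4, 2 occurs at positions 0 and 8.
series = [0, 1, 4, 2, 0, 7, 3, 9, 0, 1, 4, 2, 5]
dist, first, second = brute_force_motif(series, 4)
print(first, second, dist)
```

On real recordings the quadratic scan becomes expensive quickly, which is the motivation for faster exact methods.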
4
Case 2: Shapelet Discovery
[Figure: false nettles and stinging nettles, with the shapelet marked.]
Time Series Shapelets: A New Primitive for Data Mining.
Lexiang Ye and Eamonn Keogh. SIGKDD 2009
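To make the idea concrete, here is a minimal sketch of how a shapelet could be used once it has been found: measure how closely the best-matching window of a series resembles the shapelet, then compare that distance to a threshold. The function names, the threshold, the labels, and the toy data are all hypothetical, and the sketch works on raw values without normalization. It does not show how the shapelet itself is discovered.

```python
import math

def subsequence_dist(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series."""
    m = len(shapelet)
    best = math.inf
    for i in range(len(series) - m + 1):
        d = math.sqrt(sum((series[i + k] - shapelet[k]) ** 2 for k in range(m)))
        best = min(best, d)
    return best

def classify(series, shapelet, threshold, near_label, far_label):
    """Assign near_label if some window of the series lies within
    `threshold` of the shapelet, otherwise far_label."""
    if subsequence_dist(series, shapelet) <= threshold:
        return near_label
    return far_label

# Toy data (made up): a small bump plays the role of the shapelet.
shapelet = [0, 2, 0]
with_bump = [1, 1, 0, 2, 0, 1]
without_bump = [1, 1, 1, 1, 1, 1]
print(classify(with_bump, shapelet, 1.0, "class A", "class B"))
print(classify(without_bump, shapelet, 1.0, "class A", "class B"))
```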
5
Case 3: Linkage Discovery
[Figure: CK-1 Distance Measure. Two example comparisons with CK-1 values 0.6291 and 0.9033.]
[Figure: Single Linkage Dendrogram over CK-1 distance (axis from 0.6 to 0.9), with groups labeled Print House 1 and Print House 2.]
A Compression Based Distance Measure for Texture. Bilson Campana and Eamonn Keogh. SDM 2010.
[Figure: a hand-press book. Labels: text, character matrix, ornaments.]
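The slides do not define CK-1, so the sketch below is not CK-1. It only illustrates the general idea behind a compression-based distance: two objects are similar if compressing them together costs little more than compressing each with itself. It uses zlib on byte strings as a stand-in compressor, and the function names and toy inputs are assumptions made for this example.

```python
import zlib

def csize(data):
    """Compressed size in bytes, using zlib as a stand-in compressor."""
    return len(zlib.compress(data, 9))

def compression_distance(x, y):
    """Compression-based dissimilarity between two byte strings.
    Identical inputs give exactly 0; unrelated inputs give larger values."""
    cross = csize(x + y) + csize(y + x)
    own = csize(x + x) + csize(y + y)
    return cross / own - 1

# Toy inputs (made up): `near` differs from `base` by one byte,
# while `far` shares no structure with it.
base = b"abcabcabc" * 20
near = base[:90] + b"X" + base[91:]
far = bytes(range(256))
print(compression_distance(base, near))
print(compression_distance(base, far))
```

Pairwise distances of this kind can then be fed to a hierarchical clustering method such as the single linkage dendrogram shown on the slide.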
6
Lab Members
Dr. Eamonn Keogh
Dr. Gustavo Batista
Abdullah Mueen
Qiang Zhu
Bilson Campana
Thanawin Art R.
Bing Hu
Yuan Hao
Jesin Zakaria
7
Motif in Online Data
• Maintain the motif in streaming data without introducing latency.
8
Motion Motif
• Find repeated motions in motion capture data, which is a 32-dimensional time series.