Post on 20-Dec-2015
Time Series Bitmap Experiments
• This file contains full-color, large-scale versions of the experiments shown in the paper, and additional experiments that were omitted because of space constraints.
• Note that in every case, all the data is freely available.
Figure 1 Expanded
Four time series files represented as time series bitmaps.
While they are all examples of EEGs, example_a.dat is from a normal trace, whereas the others contain examples of spike-wave discharges. The fact that one dataset differs from all the rest is immediately apparent from a casual inspection of the bitmap representation.
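The slides do not spell out how a bitmap is computed, so the following is only a minimal sketch under stated assumptions: the series is z-normalized, each value is mapped to one of four symbols using the usual equiprobable Gaussian breakpoints, and the bitmap is the table of subword frequencies scaled so the most common subword has value 1. The names `discretize` and `bitmap` are hypothetical, and the sketch leaves out any windowing or averaging step the real tool may apply.

```python
from collections import Counter
from statistics import mean, pstdev

# Breakpoints splitting a standard normal into 4 equiprobable regions.
# Assumed for illustration; the slides do not state the discretization.
BREAKPOINTS = (-0.6745, 0.0, 0.6745)
ALPHABET = "abcd"


def discretize(series):
    """Z-normalize a series and map each value to one of four symbols."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        sigma = 1.0  # constant series: avoid division by zero
    word = []
    for x in series:
        z = (x - mu) / sigma
        word.append(ALPHABET[sum(z > b for b in BREAKPOINTS)])
    return "".join(word)


def bitmap(series, level=1):
    """Return subword frequencies, scaled so the largest equals 1.0."""
    word = discretize(series)
    counts = Counter(word[i:i + level] for i in range(len(word) - level + 1))
    peak = max(counts.values())
    return {sub: c / peak for sub, c in counts.items()}


if __name__ == "__main__":
    print(bitmap([0, 0, 0, 0, 10, 10, 10, 10], level=2))
```

Each entry of the returned table would correspond to one pixel of the icon; two files with similar structure should then produce similar tables.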
Figure 3 Expanded
The gene sequences of mitochondrial DNA of four animals, used to create their own file icons using a chaos game representation.
Note that Pan troglodytes is the familiar Chimpanzee, and Loxodonta africana and Elephas maximus are the African and Indian Elephants, respectively. The file icons show that humans and chimpanzees have similar genomes, as do the African and Indian elephants.
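As a rough illustration of the chaos game idea, the sketch below counts every k-letter substring of a DNA string and places each count in a 2^k by 2^k grid by recursively choosing a quadrant per base. The quadrant layout in `QUADRANT` is one common convention and is an assumption here, as are the function names; the slides do not say which layout the icons use.

```python
from collections import Counter
from itertools import product

# (row_bit, col_bit) for each base. Assumed layout, chosen for illustration.
QUADRANT = {"C": (0, 0), "G": (0, 1), "A": (1, 0), "T": (1, 1)}


def cgr_counts(dna, k=2):
    """Count each k-mer over A, C, G, T; k-mers never seen get 0."""
    dna = dna.upper()
    seen = Counter(dna[i:i + k] for i in range(len(dna) - k + 1))
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: seen.get(kmer, 0) for kmer in kmers}


def cgr_grid(dna, k=2):
    """Arrange k-mer counts in a 2**k x 2**k grid, one quadrant per base."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for kmer, count in cgr_counts(dna, k).items():
        row = col = 0
        for base in kmer:
            r, c = QUADRANT[base]
            row, col = row * 2 + r, col * 2 + c
        grid[row][col] = count
    return grid


if __name__ == "__main__":
    for line in cgr_grid("ACGTACGGTTAC", k=2):
        print(line)
```

Mapping each grid cell to a color would give an icon; substrings containing letters other than A, C, G, T are simply ignored in this sketch.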
Figure 6 Expanded
A snapshot of a folder containing cardiograms when its files are arranged by the “Cluster” option.
Five cardiograms have been grouped into two different clusters based on their similarity.
Cluster 1 (eeg 1 ~ 3):
BIDMC Congestive Heart Failure Database (chfdb): record chf02
Start times at 0, 82, 150, respectively
Cluster 2 (eeg 6 ~ 7):
BIDMC Congestive Heart Failure Database (chfdb): record chf15
Start times at 0, 82 respectively
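One way such a grouping could be produced is sketched below, assuming each file has already been summarized as a fixed-length bitmap vector. The sketch uses single-linkage agglomerative clustering with Euclidean distance; this is an illustrative choice, not necessarily what the “Cluster” option does, and the file names and vectors are made up.

```python
from math import dist


def cluster(vectors, n_clusters):
    """Single-linkage agglomerative clustering of named vectors."""
    clusters = [[name] for name in vectors]

    def linkage(a, b):
        # Distance between two clusters = closest pair of members.
        return min(dist(vectors[x], vectors[y]) for x in a for y in b)

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


if __name__ == "__main__":
    # Hypothetical bitmap vectors for five files.
    files = {
        "f1": [0.9, 0.1, 0.2, 1.0],
        "f2": [1.0, 0.1, 0.3, 0.9],
        "f3": [0.9, 0.2, 0.2, 1.0],
        "f4": [0.1, 1.0, 0.8, 0.2],
        "f5": [0.2, 0.9, 1.0, 0.1],
    }
    print(cluster(files, n_clusters=2))
```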
Figure 7 Expanded
[Figure: the 20 datasets, shown twice in the order 1, 2, 3, 4, 5, 11, 13, 12, 14, 15, 6, 9, 10, 7, 8, 16, 18, 17, 19, 20.]
Data Key
Cluster 1 (datasets 1 ~ 5):
BIDMC Congestive Heart Failure Database (chfdb): record chf02
Start times at 0, 82, 150, 200, 250, respectively
Cluster 2 (datasets 6 ~ 10):
BIDMC Congestive Heart Failure Database (chfdb): record chf15
Start times at 0, 82, 150, 200, 250, respectively
Cluster 3 (datasets 11 ~ 15):
Long Term ST Database (ltstdb): record 20021
Start times at 0, 50, 100, 150, 200, respectively
Cluster 4 (datasets 16 ~ 20):
MIT-BIH Noise Stress Test Database (nstdb): record 118e6
Start times at 0, 50, 100, 150, 200, respectively
Section 5.1 Expanded
In Ge and Smyth 2000, this dataset was explored with segmental hidden Markov models. After carefully adjusting the parameters, they reported 98% classification accuracy. Using time series bitmaps with virtually any parameter settings, we get perfect classification and clustering.
We can get perfect classification using one-nearest-neighbor classification, or we can project the data into two-dimensional space (see next slide) and get perfect accuracy using a simple linear classifier, a decision tree or SVD.
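For readers unfamiliar with one-nearest-neighbor classification, here is a minimal sketch, assuming each instance is already a bitmap vector with a class label. Accuracy is estimated by leave-one-out: each instance is classified using all the others. Euclidean distance, the function names and the toy data are assumptions for illustration, not the experiment's actual setup or results.

```python
from math import dist


def nearest_label(query, training):
    """Return the label of the training vector closest to `query`.

    `training` is a list of (vector, label) pairs.
    """
    _, label = min(training, key=lambda item: dist(query, item[0]))
    return label


def leave_one_out_accuracy(data):
    """Classify each instance against all the others; return hit rate."""
    hits = 0
    for i, (vector, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += nearest_label(vector, rest) == label
    return hits / len(data)


if __name__ == "__main__":
    # Toy vectors standing in for bitmaps of two classes.
    toy = [
        ([0.9, 0.1], "class_1"),
        ([1.0, 0.2], "class_1"),
        ([0.1, 0.9], "class_2"),
        ([0.2, 1.0], "class_2"),
    ]
    print(leave_one_out_accuracy(toy))
```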
(Dataset donated by Padhraic Smyth and Seyoung Kim)
[Figure: two panels showing orderings of the 56 instances; one panel is labeled "Segmental Markov model [1]". The instance labels did not survive extraction.]
Parameters: Level 1, N = 60, n = 12
[Figure: the 56 instances plotted as labeled points; axis ticks run from 0.35 to 0.55.]
Parameters: Level 1, N = 60, n = 12
Figure 8 Expanded
The MIT ECG Arrhythmia dataset projected into 2D space using only the information from a level 2 time series bitmap. The two classes are easily separated by a simple linear classifier (gray line).
Here the bitmaps are almost the same.
Here the bitmaps are very different. This is the most unusual section of the time series, and it coincides with the PVC.
Here is a Premature Ventricular Contraction (PVC)
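The comparison of bitmaps at different points of the series can be sketched as follows. This is an assumption-laden illustration rather than the authors' exact procedure: two adjacent windows slide along the series, each is summarized by its subword frequencies, and the squared difference between the two summaries is the anomaly score at that position. The discretization, the window size and all names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumed breakpoints for a 4-symbol alphabet (equiprobable under a normal).
BREAKPOINTS = (-0.6745, 0.0, 0.6745)


def symbols(series):
    """Z-normalize the whole series and map each value to a, b, c or d."""
    mu, sigma = mean(series), pstdev(series) or 1.0
    return "".join(
        "abcd"[sum((x - mu) / sigma > b for b in BREAKPOINTS)] for x in series
    )


def frequencies(word, level):
    """Relative frequency of each subword of length `level` in `word`."""
    counts = Counter(word[i:i + level] for i in range(len(word) - level + 1))
    total = sum(counts.values())
    return {sub: c / total for sub, c in counts.items()}


def anomaly_scores(series, window, level=2):
    """Score each split point t by how much the bitmaps on either side differ.

    scores[i] belongs to split point t = window + i. Requires window >= level.
    """
    word = symbols(series)
    scores = []
    for t in range(window, len(word) - window + 1):
        lag = frequencies(word[t - window:t], level)
        lead = frequencies(word[t:t + window], level)
        keys = set(lag) | set(lead)
        scores.append(sum((lag.get(k, 0) - lead.get(k, 0)) ** 2 for k in keys))
    return scores


if __name__ == "__main__":
    series = [0, 1] * 20 + [5, 5, 5, 5] + [0, 1] * 20
    scores = anomaly_scores(series, window=8)
    peak = max(range(len(scores)), key=scores.__getitem__) + 8
    print("most unusual split point:", peak)
```

Where the two windows look alike the score stays near zero; it rises only where one window covers the unusual section.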
Figure 9 Expanded
Below are some more examples of anomaly detection in ECG with our bitmap approach.
These examples did not make it into the paper because of space limitations.
[Figure labels: Premature ventricular contraction; Premature ventricular contraction; Supraventricular escape beat]
Annotations by a cardiologist