Adaptive Learning and Mining for Data Streams and Frequent Patterns
Albert Bifet
Laboratory for Relational Algorithmics, Complexity and Learning (LARCA)
Departament de Llenguatges i Sistemes Informàtics
Universitat Politècnica de Catalunya
Ph.D. dissertation, 24 April 2009
Advisors: Ricard Gavaldà and José L. Balcázar
Future Data Mining
Structured data
Find interesting patterns
Predictions
On-line processing
Mining Evolving Massive Structured Data
The Disintegration of the Persistence of Memory, 1952–54
Salvador Dalí
The basic problem: finding interesting structure in data
Mining massive data
Mining time-varying data
Mining in real time
Mining structured data
Data Streams
Data Streams
The sequence is potentially infinite
High amount of data: sublinear space
High speed of arrival: sublinear time per example
Once an element from a data stream has been processed, it is discarded or archived

Approximation algorithms
Small error rate with high probability
An algorithm (ε,δ)-approximates F if it outputs F̃ for which Pr[|F̃ − F| > εF] < δ.
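To make the definition concrete, here is a hedged sketch assuming a [0,1]-valued source: by the Hoeffding bound, averaging m = ⌈ln(2/δ)/(2ε²)⌉ independent draws gives an additive (ε,δ)-guarantee (the slide's definition is the relative-error analogue, with error εF).

```python
import math
import random

def sample_size(eps, delta):
    """Hoeffding bound: m >= ln(2/delta) / (2 eps^2) samples of a [0,1]
    variable put the empirical mean within eps of the true mean with
    probability at least 1 - delta (additive-error version of the
    (eps, delta)-approximation on this slide)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def approx_mean(draw, eps=0.05, delta=0.01):
    m = sample_size(eps, delta)
    return sum(draw() for _ in range(m)) / m

# e.g. a Bernoulli(0.3) source: Pr[|estimate - 0.3| > 0.05] < 0.01
print(approx_mean(lambda: 1 if random.random() < 0.3 else 0))
```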
Tree Pattern Mining
“Trees are sanctuaries. Whoever knows how to listen to them, can learn the truth.”
— Hermann Hesse
Given a dataset of trees, find the complete set of frequent subtrees.

Frequent Tree Pattern (FT):
Include all the trees whose support is no less than min_sup

Closed Frequent Tree Pattern (CT):
Include no tree that has a super-tree with the same support

CT ⊆ FT
Outline
Mining Evolving Data Streams
1 Framework
2 ADWIN
3 Classifiers
4 MOA
5 ASHT

Tree Mining
6 Closure Operator on Trees
7 Unlabeled Tree Mining Methods
8 Deterministic Association Rules
9 Implicit Rules

Mining Evolving Tree Data Streams
10 Incremental Method
11 Sliding Window Method
12 Adaptive Method
13 Logarithmic Relaxed Support
14 XML Classification
Outline
1 Introduction
2 Mining Evolving Data Streams
3 Tree Mining
4 Mining Evolving Tree Data Streams
5 Conclusions
Data Mining Algorithms with Concept Drift

Figure: without concept drift, the DM algorithm simply maintains counters over the input stream. With concept drift, a change detector monitors the (static) model and signals when it must be revised; in a further refinement, each counter is replaced by an estimator.
Time Change Detectors and Predictors
(1) General Framework

Problem
Given an input sequence x1, x2, ..., xt, ..., we want to output at instant t:
a prediction x̂t+1 minimizing the prediction error |x̂t+1 − xt+1|
an alert if change is detected, considering distribution changes over time
Time Change Detectors and Predictors
(1) General Framework

Figure: each input xt feeds an Estimator that outputs an Estimation; a Change Detector raises an Alarm, and a Memory module interacts with both.
Optimal Change Detector and Predictor
(1) General Framework

High accuracy
Fast detection of change
Low false positive and false negative ratios
Low computational cost: minimum space and time needed
Theoretical guarantees
No parameters needed
Estimator with Memory and Change Detector
Algorithm ADaptive Sliding WINdow
(2) ADWIN

Example
W = 101010110111111
Every split W = W0 · W1 is checked in turn: W0 = 1, then W0 = 10, W0 = 101, ..., up to W0 = 10101011. At the split W0 = 101010110, W1 = 111111 we get |µ̂W0 − µ̂W1| ≥ εc: CHANGE DETECTED! Elements are then dropped from the tail of W, leaving W = 01010110111111.

ADWIN: ADAPTIVE WINDOWING ALGORITHM
1 Initialize window W
2 for each t > 0
3    do W ← W ∪ {xt} (i.e., add xt to the head of W)
4       repeat drop elements from the tail of W
5       until |µ̂W0 − µ̂W1| < εc holds
6             for every split of W into W = W0 · W1
7       Output µ̂W

A minimal Python sketch of one step of this loop follows.
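The sketch below is unoptimized: it checks every split in O(|W|²) time, whereas the bucket-based ADWIN achieves O(log W). The threshold εc follows the Hoeffding-style cut expression from the ADWIN paper, and the example stream is hypothetical.

```python
import math
import random
from collections import deque

def adwin_cut(window, delta=0.01):
    """One ADWIN step (unoptimized sketch): while some split
    W = W0 . W1 has |mean(W0) - mean(W1)| >= eps_c, drop the oldest
    element. Returns True if the window shrank (change detected)."""
    shrank = False
    while len(window) >= 2:
        buf = list(window)
        n, total = len(buf), sum(buf)
        head, cut = 0.0, False
        for i in range(1, n):                # W0 = buf[:i], W1 = buf[i:]
            head += buf[i - 1]
            n0, n1 = i, n - i
            mu0, mu1 = head / n0, (total - head) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)  # "harmonic" sample size
            eps_c = math.sqrt(math.log(4.0 * n / delta) / (2.0 * m))
            if abs(mu0 - mu1) >= eps_c:
                window.popleft()             # drop from the tail (oldest)
                shrank = cut = True
                break
        if not cut:
            break
    return shrank

# usage: a Bernoulli stream whose mean jumps from 0.2 to 0.8 at t = 1000
w = deque()
for t in range(2000):
    w.append(1 if random.random() < (0.2 if t < 1000 else 0.8) else 0)
    if adwin_cut(w):
        print("change detected around t =", t)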
Algorithm ADaptive Sliding WINdow
(2) ADWIN

Theorem
At every time step we have:
1 (False positive rate bound). If µt remains constant within W, the probability that ADWIN shrinks the window at this step is at most δ.
2 (False negative rate bound). Suppose that for some partition of W in two parts W0W1 (where W1 contains the most recent items) we have |µW0 − µW1| > 2εc. Then with probability 1 − δ, ADWIN shrinks W to W1, or shorter.

ADWIN tunes itself to the data stream at hand, with no need for the user to hardwire or precompute parameters.
Algorithm ADaptive Sliding WINdow
(2) ADWIN

ADWIN, using a Data Stream Sliding Window Model, can provide the exact counts of 1’s in O(1) time per point. It
tries O(log W) cutpoints
uses O((1/ε) log W) memory words
has a processing time per example of O(log W) (amortized and worst-case)

Sliding Window Model
Buckets:  1010101 | 101 | 11 | 1 | 1
Content:        4 |   2 |  2 | 1 | 1
Capacity:       7 |   3 |  2 | 1 | 1
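The bucket structure above is in the spirit of exponential histograms. The following is a hedged sketch of the classical merging rule (Datar et al.), not the exact variant used in the thesis, whose bucket capacities grow differently.

```python
def eh_insert(buckets, bit, k=2):
    """Exponential-histogram insert (classical sketch): `buckets` lists
    bucket sizes, newest first; at most k+1 buckets may share a size,
    and the two oldest buckets of a size merge into one of double size."""
    if not bit:
        return buckets
    buckets.insert(0, 1)                    # newest bucket holds one 1
    size = 1
    while buckets.count(size) > k + 1:
        oldest = len(buckets) - 1 - buckets[::-1].index(size)
        del buckets[oldest]                 # merge the two oldest...
        nxt = len(buckets) - 1 - buckets[::-1].index(size)
        buckets[nxt] = 2 * size             # ...into one of double size
        size *= 2                           # and cascade upward
    return buckets

b = []
for bit in [1, 1, 0, 1, 1, 1, 0, 1, 1]:
    eh_insert(b, bit)
print(b)   # [1, 1, 1, 2, 2]; sum(b) == 7 ones seen so far
```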
K-ADWIN = ADWIN + Kalman Filtering
(2) ADWIN

Figure: xt feeds a Kalman filter that outputs the Estimation; ADWIN acts both as change detector (raising the Alarm) and as memory for the filter.

R = W²/50 and Q = 200/W (theoretically justifiable), where W is the length of the window maintained by ADWIN.
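As a concrete illustration, here is a minimal one-dimensional Kalman filter sketch using the R and Q settings above. Passing in the ADWIN window length is an assumption of this sketch; a real K-ADWIN would query its ADWIN instance.

```python
class KalmanKAdwin:
    """1-D Kalman filter with noise covariances tied to the ADWIN
    window length W: R = W^2 / 50 (measurement), Q = 200 / W (process)."""
    def __init__(self):
        self.x = 0.0    # state estimate
        self.p = 1.0    # estimate covariance
    def update(self, z, w_len):
        r = w_len * w_len / 50.0
        q = 200.0 / max(w_len, 1)
        self.p += q                       # predict step
        k = self.p / (self.p + r)         # Kalman gain
        self.x += k * (z - self.x)        # correct with observation z
        self.p *= (1.0 - k)
        return self.x
```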
Classification
(3) Mining Algorithms

Definition
Given nC different classes, a classifier algorithm builds a model that predicts, with high accuracy, the class C to which every unlabeled instance I belongs.

Classification Mining Algorithms
Naïve Bayes
Decision Trees
Ensemble Methods
Hoeffding Tree / CVFDT
(3) Mining Algorithms

Hoeffding Tree: VFDT
Pedro Domingos and Geoff Hulten. Mining high-speed data streams. 2000

With high probability, constructs a model identical to the one a traditional (greedy) batch method would learn
With theoretical guarantees on the error rate

Figure: an example tree — Time = Day → Contains “Money”? (Yes → YES, No → NO); Time = Night → YES.
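The split decision behind the guarantee can be sketched in a few lines: a leaf splits only when the observed gain advantage of the best attribute over the runner-up exceeds the Hoeffding bound ε. The constants below (δ, the gain range for two classes) are illustrative assumptions.

```python
import math

def hoeffding_bound(value_range, delta, n):
    """After n observations, the true mean of a variable with the given
    range lies within eps of the empirical mean, with prob. 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(gain_best, gain_second, n, delta=1e-7, value_range=1.0):
    # value_range = log2(#classes) = 1.0 for binary information gain
    return (gain_best - gain_second) > hoeffding_bound(value_range, delta, n)

print(should_split(0.30, 0.25, n=5000))   # True once enough examples seen
```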
VFDT / CVFDT
(3) Mining Algorithms

Concept-adapting Very Fast Decision Trees: CVFDT
G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. 2001

It keeps its model consistent with a sliding window of examples
Constructs “alternative branches” as preparation for changes
If the alternative branch becomes more accurate, a switch of tree branches occurs
Decision Trees: CVFDT
(3) Mining Algorithms

Figure: the same example tree — Time = Day → Contains “Money”? (Yes → YES, No → NO); Time = Night → YES.

No theoretical guarantees on the error rate of CVFDT

CVFDT parameters:
1 W: the example window size
2 T0: number of examples used to check at each node if the splitting attribute is still the best
3 T1: number of examples used to build the alternate tree
4 T2: number of examples used to test the accuracy of the alternate tree
Decision Trees: Hoeffding Adaptive Tree
(3) Mining Algorithms

Hoeffding Adaptive Tree:
replace frequency statistics counters by estimators
no need for a window to store examples, since the required statistics are maintained by estimators
change the way alternate subtrees are checked for substitution, using a change detector with theoretical guarantees

Summary:
1 Theoretical guarantees
2 No parameters
What is MOA?
{M}assive {O}nline {A}nalysis is a framework for online learning from data streams.
It is closely related to WEKA
It includes a collection of offline and online methods as well as tools for evaluation:
boosting and bagging
Hoeffding Trees, with and without Naïve Bayes classifiers at the leaves
Ensemble Methods
(4) MOA for Evolving Data Streams

http://www.cs.waikato.ac.nz/~abifet/MOA/

New ensemble methods:
ADWIN bagging: when a change is detected, the worst classifier is removed and a new classifier is added (see the sketch below)
Adaptive-Size Hoeffding Tree bagging
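A hedged sketch of ADWIN bagging follows. It reuses the adwin_cut sketch from the ADWIN slide as a change detector, samples instance weights with Poisson(1) as in online bagging, and assumes a hypothetical base-model interface with learn_one / predict_one; this is not MOA's actual (Java) API.

```python
import random
from collections import deque

class Adwin:
    """Tiny wrapper around the adwin_cut sketch shown earlier."""
    def __init__(self, delta=0.01):
        self.w, self.delta = deque(), delta
    def add(self, x):                       # returns True on change
        self.w.append(x)
        return adwin_cut(self.w, self.delta)
    def estimate(self):                     # current error estimate
        return sum(self.w) / len(self.w) if self.w else 0.0

def poisson1():
    """Poisson(1) sample, used as the online-bagging instance weight."""
    k, p = 0, random.random()
    while p > 0.36787944117144233:          # e^{-1}
        k += 1
        p *= random.random()
    return k

class AdwinBagging:
    def __init__(self, make_model, n=10):
        self.make = make_model
        self.models = [make_model() for _ in range(n)]
        self.adwins = [Adwin() for _ in range(n)]
    def learn_one(self, x, y):
        changed = False
        for m, a in zip(self.models, self.adwins):
            for _ in range(poisson1()):     # weighted online bagging
                m.learn_one(x, y)
            changed |= a.add(0.0 if m.predict_one(x) == y else 1.0)
        if changed:                          # replace the worst member
            worst = max(range(len(self.models)),
                        key=lambda i: self.adwins[i].estimate())
            self.models[worst] = self.make()
            self.adwins[worst] = Adwin()
    def predict_one(self, x):
        votes = [m.predict_one(x) for m in self.models]
        return max(set(votes), key=votes.count)
```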
Adaptive-Size Hoeffding Tree
(5) ASHT

Figure: an ensemble of trees T1, T2, T3, T4 of increasing size.

Ensemble of trees of different size:
smaller trees adapt more quickly to changes
larger trees do better during periods with little change
diversity
A minimal sketch of one ensemble member follows.
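In this hedged sketch, each tree has a maximum size and (in one of the thesis's strategies) is restarted when it grows past it; the HoeffdingTree stub stands in for a real incremental learner, and the doubling sizes follow the thesis's construction.

```python
class HoeffdingTree:
    """Stand-in stub so the sketch runs; a real implementation would
    grow the tree with Hoeffding-bound split decisions."""
    def __init__(self):
        self.nodes = 1
    def learn_one(self, x, y):
        pass                              # placeholder: no actual growth
    def n_nodes(self):
        return self.nodes

class ASHTMember:
    """One ensemble member: a Hoeffding tree with a hard size limit."""
    def __init__(self, max_size):
        self.max_size = max_size
        self.tree = HoeffdingTree()
    def learn_one(self, x, y):
        self.tree.learn_one(x, y)
        if self.tree.n_nodes() > self.max_size:
            self.tree = HoeffdingTree()   # restart when too large
                                          # (an alternative strategy prunes)

# sizes double across the ensemble: 2, 4, 8, ... split nodes
members = [ASHTMember(2 ** (i + 1)) for i in range(10)]
```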
Adaptive-Size Hoeffding Tree
(5) ASHT

Figure: Kappa-Error diagrams (error versus kappa) for ASHT bagging (left) and bagging (right) on dataset RandomRBF with drift, plotting 90 pairs of classifiers.
Adaptive-Size Hoeffding Tree
(5) ASHT

Figure: Accuracy and size on dataset LED with three concept drifts.
Main contributions (i)
Mining Evolving Data Streams

1 General Framework for Time Change Detectors and Predictors
2 ADWIN
3 Mining methods: Naïve Bayes, Decision Trees, Ensemble Methods
4 MOA for Evolving Data Streams
5 Adaptive-Size Hoeffding Tree
Mining Closed Frequent Trees

Our trees are:
Labeled and Unlabeled
Ordered and Unordered

Our subtrees are:
Induced
Top-down

Figure: two different ordered trees, but the same unordered tree.
A tale of two trees

Consider D = {A, B} (two example trees, shown as figures), and let min_sup = 2.

Figure: the frequent subtrees of A and B, and among them the closed subtrees.
Mining Closed Unordered Subtrees
(6) Unlabeled Closed Frequent Tree Method

CLOSED_SUBTREES(t, D, min_sup, T)
1  if not CANONICAL_REPRESENTATIVE(t)
2     then return T
3  for every t′ that can be extended from t in one step
4     do if Support(t′) ≥ min_sup
5        then T ← CLOSED_SUBTREES(t′, D, min_sup, T)
6        if Support(t′) = Support(t)
7           then t is not closed
8  if t is closed
9     then insert t into T
10 return T
Example
D = {A, B}, min_sup = 2.
〈A〉 = (0,1,2,3,2,1)    〈B〉 = (0,1,2,3,1,2,2)

Candidate subtrees, as depth sequences, are explored by one-step extensions:
(0) → (0,1) → (0,1,1), (0,1,2) → (0,1,2,1), (0,1,2,2), (0,1,2,3) → (0,1,2,2,1), (0,1,2,3,1)
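The depth sequences 〈A〉 and 〈B〉 are the natural representations used throughout. A small sketch of the encoding, with A and B reconstructed as nested lists consistent with the sequences above:

```python
def depth_sequence(tree, depth=0):
    """Preorder depth sequence (natural representation) of a rooted
    ordered tree encoded as nested lists: a node is its list of children."""
    seq = [depth]
    for child in tree:
        seq.extend(depth_sequence(child, depth + 1))
    return seq

# trees consistent with the sequences on this slide
A = [[[[]], []], []]         # <A> = (0,1,2,3,2,1)
B = [[[[]]], [[], []]]       # <B> = (0,1,2,3,1,2,2)
assert depth_sequence(A) == [0, 1, 2, 3, 2, 1]
assert depth_sequence(B) == [0, 1, 2, 3, 1, 2, 2]
```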
Experimental results
(6) Unlabeled Closed Frequent Tree Method

TreeNat: unlabeled trees, top-down subtrees, no occurrences
CMTreeMiner: labeled trees, induced subtrees, occurrences
Closure Operator on Trees
(7) Closure Operator

D: the finite input dataset of trees
T: the (infinite) set of all trees

Definition
We define the following Galois connection pair:
For finite A ⊆ D, σ(A) is the set of common subtrees (in T) of the trees of A:
σ(A) = {t ∈ T | ∀t′ ∈ A (t ⪯ t′)}
For finite B ⊂ T, τD(B) is the set of trees of D that are supertrees of the trees of B:
τD(B) = {t′ ∈ D | ∀t ∈ B (t ⪯ t′)}

Closure Operator
The composition ΓD = σ ◦ τD is a closure operator.
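A minimal sketch of the pair, assuming a containment test is_subtree(t, t′) and restricting the (in theory infinite) universe T to a finite candidate set:

```python
def sigma(A, T, is_subtree):
    """Common subtrees: trees of the universe T contained in every tree of A."""
    return {t for t in T if all(is_subtree(t, tp) for tp in A)}

def tau(B, D, is_subtree):
    """Supertrees in the dataset D of every tree of B."""
    return {tp for tp in D if all(is_subtree(t, tp) for t in B)}

def gamma(B, D, T, is_subtree):
    """Closure operator Gamma_D = sigma . tau_D."""
    return sigma(tau(B, D, is_subtree), T, is_subtree)
```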
Galois Lattice of closed set of trees
(7) Closure Operator

Figure: the Galois lattice over an example dataset D of three trees, with closed sets at nodes 1, 2, 3, 12, 13, 23, 123.

Example: for a set B containing a single tree, τD(B) is the pair of trees of D that are supertrees of it, and ΓD(B) = σ ◦ τD(B) consists of a tree and its subtrees.
Mining Implications from Lattices of Closed Trees
(8) Association Rules

Problem
Given a dataset D of rooted, unlabeled and unordered trees, find a “basis”: a set of rules that are sufficient to infer all the rules that hold in the dataset D.

Figure: example rules of the form t1 ∧ t2 → t3 over the trees of D.
Mining Implications from Lattices of Closed Trees

Set of rules: A → ΓD(A).
Antecedents are obtained through a computation akin to a hypergraph transversal
Consequents follow from an application of the closure operator

Figure: the rules read off the lattice of closed sets 1, 2, 3, 12, 13, 23, 123.
Association Rule Computation Example
(8) Association Rules

Figure: a rule computation walked through on the lattice of closed sets (nodes 1, 2, 3, 12, 13, 23, 123); node 23 is highlighted, yielding a rule.
Model transformation
(8) Association Rules

Intuition
One propositional variable vt is assigned to each possible subtree t.
A set of trees A corresponds in a natural way to a model mA.
Let mA be a model: we impose on mA the constraint that if mA(vt) = 1 for a variable vt, then mA(vt′) = 1 for all those variables vt′ such that t′ is a subtree of the tree represented by vt.

R0 = {vt′ → vt | t ⪯ t′, t ∈ U, t′ ∈ U}
Implicit Rules Definition
(9) Implicit Rules

Given three trees t1, t2, t3, we say that t1 ∧ t2 → t3 is an implicit Horn rule (abbreviated: an implicit rule) if for every tree t it holds that
t1 ⪯ t ∧ t2 ⪯ t ↔ t3 ⪯ t.
t1 and t2 have implicit rules if t1 ∧ t2 → t is an implicit rule for some t.

Figure: an implicit rule and a non-implicit rule over the trees of D. In the non-implicit case, some supertree of the antecedents is NOT a supertree of the consequent.
Implicit Rules Characterization
(9) Implicit Rules

Theorem
All trees a, b such that a ⪯ b have implicit rules.

Theorem
Suppose that b has only one component. Then a and b have implicit rules if and only if a has a maximum component which is a subtree of the component of b:
for all i < n: ai ⪯ an ⪯ b1
(a1 · · · an−1 an) ∧ (b1) → (a1 · · · an−1 b1)
Main contributions (ii)
Tree Mining

6 Closure Operator on Trees
7 Unlabeled Closed Frequent Tree Mining
8 A way of extracting high-confidence association rules from datasets consisting of unlabeled trees:
antecedents are obtained through a computation akin to a hypergraph transversal
consequents follow from an application of the closure operator
9 Detection of some cases of implicit rules: rules that always hold, independently of the dataset
Mining Evolving Tree Data Streams
(10,11,12) Incremental, Sliding Window, and Adaptive Tree Mining Methods

Problem
Given a data stream D of rooted, unlabeled and unordered trees, find frequent closed trees.

We provide three algorithms, of increasing power:
Incremental
Sliding Window
Adaptive
Relaxed Support
(13) Logarithmic Relaxed Support

Guojie Song, Dongqing Yang, Bin Cui, Baihua Zheng, Yunfeng Liu and Kunqing Xie. CLAIM: An Efficient Method for Relaxed Frequent Closed Itemsets Mining over Stream Data

Linear Relaxed Interval: the support space of all subpatterns can be divided into n = ⌈1/εr⌉ intervals, where εr is a user-specified relaxed factor, and each interval can be denoted by Ii = [li, ui), where li = (n − i) · εr ≥ 0, ui = (n − i + 1) · εr ≤ 1 and i ≤ n.
Linear Relaxed closed subpattern t: if and only if there exists no proper superpattern t′ of t such that their supports belong to the same interval Ii.

As the number of closed frequent patterns is not linear with respect to support, we introduce a new relaxed support:
Logarithmic Relaxed Interval: the support space of all subpatterns can be divided into n = ⌈1/εr⌉ intervals, where εr is a user-specified relaxed factor, and each interval can be denoted by Ii = [li, ui), where li = ⌈c^i⌉, ui = ⌈c^(i+1) − 1⌉ and i ≤ n.
Logarithmic Relaxed closed subpattern t: if and only if there exists no proper superpattern t′ of t such that their supports belong to the same interval Ii.
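A quick sketch of the interval test behind logarithmically relaxed closure, assuming integer supports and an interval base c (an illustrative parameter):

```python
import math

def log_interval(support, c=2.0):
    """Index i of the logarithmic interval I_i = [c^i, c^(i+1) - 1]
    containing the (integer, >= 1) support."""
    return int(math.log(support, c))

def log_relaxed_closed(support_t, superpattern_supports, c=2.0):
    """t is logarithmically relaxed closed iff no proper superpattern's
    support falls in the same interval as t's own support."""
    i = log_interval(support_t, c)
    return all(log_interval(s, c) != i for s in superpattern_supports)

print(log_interval(5), log_interval(8))   # 2 3  (intervals [4,7], [8,15])
print(log_relaxed_closed(9, [8, 5]))      # False: support 8 shares [8,15]
```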
Algorithms
(10,11,12) Incremental, Sliding Window, and Adaptive Tree Mining Methods

Algorithms
Incremental: INCTREENAT
Sliding Window: WINTREENAT
Adaptive: ADATREENAT, which uses ADWIN to monitor change

ADWIN
An adaptive sliding window whose size is recomputed online according to the rate of change observed.
ADWIN has rigorous guarantees (theorems):
on the ratio of false positives and false negatives
on the relation between the size of the current window and change rates
Experimental Validation: TN1
(10,11,12) Incremental, Sliding Window, and Adaptive Tree Mining Methods

Figure: experiments on ordered trees with the TN1 dataset — time (sec.) versus dataset size (millions), comparing INCTREENAT and CMTreeMiner.
Adaptive XML Tree Classification on evolving data streams
(14) XML Tree Classification

Figure: a dataset example — four trees over node labels A, B, C, D, labeled CLASS1, CLASS2, CLASS1, CLASS2.
Adaptive XML Tree Classification on evolving data streams
(14) XML Tree Classification

Figure: closed and non-closed frequent trees, with their occurrence vectors over the four example trees — pattern c1 occurs in trees 1 and 3 (1 0 1 0), pattern c2 in trees 1 and 4 (1 0 0 1).
Adaptive XML Tree Classification on evolving data streams
(14) XML Tree Classification

Frequent trees (closed patterns c1–c4 and their frequent subtrees f):

Id | c1 f1^1 | c2 f2^1 f2^2 f2^3 | c3 f3^1 | c4 f4^1 f4^2 f4^3 f4^4 f4^5
 1 |  1   1  |  1   1    1    1  |  0   0  |  1   1    1    1    1    1
 2 |  0   0  |  0   0    0    0  |  1   1  |  1   1    1    1    1    1
 3 |  1   1  |  0   0    0    0  |  1   1  |  1   1    1    1    1    1
 4 |  0   0  |  1   1    1    1  |  1   1  |  1   1    1    1    1    1

Closed and maximal trees:

Id | Closed c1 c2 c3 c4 | Maximal c1 c2 c3 | Class
 1 |        1  1  0  1  |         1  1  0  | CLASS1
 2 |        0  0  1  1  |         0  0  1  | CLASS2
 3 |        1  0  1  1  |         1  0  1  | CLASS1
 4 |        0  1  1  1  |         0  1  1  | CLASS2
Adaptive XML Tree Framework on evolving data streams
(14) XML Tree Classification

XML Tree Classification Framework Components
An XML closed frequent tree miner
A data stream classifier algorithm, which we feed with tuples to be classified online
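A hedged sketch of how the two components fit together: each incoming tree is encoded as a binary tuple over the currently mined closed frequent trees, then passed to an online classifier in test-then-train fashion. Here miner, classifier and is_subtree are assumed interfaces, not the thesis's actual classes.

```python
def tree_to_tuple(tree, patterns, is_subtree):
    """Binary feature tuple: one bit per mined closed frequent tree."""
    return [1 if is_subtree(p, tree) else 0 for p in patterns]

def run(stream, miner, classifier, is_subtree):
    correct = total = 0
    for tree, label in stream:
        miner.add(tree)                          # keep patterns up to date
        x = tree_to_tuple(tree, miner.closed(), is_subtree)
        correct += classifier.predict_one(x) == label   # test ...
        classifier.learn_one(x, label)                  # ... then train
        total += 1
    return correct / total
```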
Adaptive XML Tree Framework on evolving data streams
(14) XML Tree Classification

Table: BAGGING on unordered trees.

                       Maximal               Closed
Dataset   # Trees   Att.  Acc.  Mem.     Att.  Acc.  Mem.
CSLOG12    15483     84  79.64  1.2       228  78.12  2.54
CSLOG23    15037     88  79.81  1.21      243  78.77  2.75
CSLOG31    15702     86  79.94  1.25      243  77.60  2.73
CSLOG123   23111     84  80.02  1.7       228  78.91  4.18
Main contributions (iii)
Mining Evolving Tree Data Streams

10 Incremental Method
11 Sliding Window Method
12 Adaptive Method
13 Logarithmic Relaxed Support
14 XML Classification
Main contributions
Mining Evolving Data Streams
1 Framework
2 ADWIN
3 Classifiers
4 MOA
5 ASHT

Tree Mining
6 Closure Operator on Trees
7 Unlabeled Tree Mining Methods
8 Deterministic Association Rules
9 Implicit Rules

Mining Evolving Tree Data Streams
10 Incremental Method
11 Sliding Window Method
12 Adaptive Method
13 Logarithmic Relaxed Support
14 XML Classification
Future Lines (i)
Adaptive Kalman Filter
Kalman filter adaptively computing Q and R, without using the size of the window of ADWIN.

Extend the MOA framework:
Support vector machines
Clustering
Itemset mining
Association rules
Future Lines (ii)
Adaptive Deterministic Association Rules
Deterministic association rules computed on evolving data streams

General Implicit Rules Characterization
Find a characterization of implicit rules with any number of components

Non-Deterministic Association Rules
Find a basis of association rules for trees with confidence lower than 100%
Future Lines (iii)
Closed Frequent Graph Mining
Mining methods to obtain closed frequent graphs:
Not incremental
Incremental
Sliding Window
Adaptive

Graph Classification
Classifiers of graphs using maximal and closed frequent subgraphs.
Relevant publications
Albert Bifet and Ricard Gavaldà. Kalman filters and adaptive windows for learning in data streams. DS’06
Albert Bifet and Ricard Gavaldà. Learning from time-changing data with adaptive windowing. SDM’07
Albert Bifet and Ricard Gavaldà. Adaptive parameter-free learning from evolving data streams. Tech-Rep R09-9
A. Bifet, G. Holmes, B. Pfahringer, R. Kirkby, and R. Gavaldà. New ensemble methods for evolving data streams. KDD’09
José L. Balcázar, Albert Bifet, and Antoni Lozano. Mining frequent closed unordered trees through natural representations. ICCS’07
José L. Balcázar, Albert Bifet, and Antoni Lozano. Subtree testing and closed tree mining through natural representations. DEXA’07
José L. Balcázar, Albert Bifet, and Antoni Lozano. Mining implications from lattices of closed trees. EGC’2008
José L. Balcázar, Albert Bifet, and Antoni Lozano. Mining Frequent Closed Rooted Trees. MLJ’09
Albert Bifet and Ricard Gavaldà. Mining adaptively frequent closed unlabeled rooted trees in data streams. KDD’08
Albert Bifet and Ricard Gavaldà. Adaptive XML Tree Classification on evolving data streams