
Generating Non-Redundant Association Rules Mohammed J. Zaki.

Date post: 20-Dec-2015
Page 1: Generating Non-Redundant Association Rules Mohammed J. Zaki.

Generating Non-Redundant Association Rules

Mohammed J. Zaki

Page 2: Outline

Yaeer Master©

Introduction
Association Rules (reminder)
Closed Frequent Itemsets
Generating Rules
Complexity Analysis
Experimental Evaluation

Page 3: Introduction

Association rule discovery: the set of association rules can grow to be unwieldy, especially as we lower the frequency requirement (support).
Many rules are redundant.
The number of redundant rules can be exponential in the length of the longest frequent itemset.
For dense datasets it is not feasible to mine all frequent itemsets.

Page 4: Introduction

Solution: use closed frequent itemsets.
The set of closed frequent itemsets can be orders of magnitude smaller than the set of all frequent itemsets.
No information is lost.
A "generating set" of rules is created from it.

Algorithm for mining closed itemsets: CHARM

Page 5: Association Rules

Page 6: Mining Association Rules

Page 7: Mining Association Rules

Find all frequent itemsets:
The search space has $2^m$ itemsets, where $m$ is the number of items; the problem is NP-complete.
Assuming a bound on transaction length, the cost is $O(r \cdot n \cdot 2^l)$, where $r$ is the number of maximal frequent itemsets, $n$ is the number of transactions, and $l$ is the length of the longest frequent itemset.

Generating confident rules:
For each frequent itemset of size $k$ there are $2^k$ potential rules.
Complexity: $O(f \cdot 2^l)$, where $f$ is the number of frequent itemsets and $l$ is the length of the longest frequent itemset.
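
As a rough illustration of the rule-generation count above, the following Python sketch enumerates the candidate rules for a single itemset. The function name `candidate_rules` is made up for this example, and it skips the two trivial splits (empty antecedent or empty consequent), so it yields $2^k - 2$ of the $2^k$ subsets.

```python
from itertools import combinations

def candidate_rules(itemset):
    """Yield (antecedent, consequent) pairs that split `itemset` into two non-empty parts."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(x for x in items if x not in antecedent)
            yield antecedent, consequent

rules = list(candidate_rules("ACW"))
print(len(rules))  # 6, i.e. 2**3 - 2
for antecedent, consequent in rules:
    print("".join(antecedent), "->", "".join(consequent))
```

Each candidate still has to be checked against the confidence threshold, which is where the per-itemset factor of $2^l$ in the bound comes from.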

Page 8: Closed Frequent Itemsets (Defining a Galois Connection)

The mappings: let $I$ be the set of items and $T$ the set of transaction identifiers (tids). For an itemset $X \subseteq I$, $t(X)$ is the set of tids of the transactions that contain every item of $X$; for a tidset $Y \subseteq T$, $i(Y)$ is the set of items common to all transactions in $Y$.

These mappings define a Galois connection between the partially ordered sets $P(I)$ and $P(T)$.

Galois connection: for all $a$ in $A$ and $b$ in $B$,

$F(a) \le b \iff G(b) \le a$

For $t$ and $i$, read $\le$ as reverse inclusion $\supseteq$: $t(X) \supseteq Y \iff i(Y) \supseteq X$.

Page 9: Galois Connection (Cont.)

Properties:

1. $X_1 \subseteq X_2 \Rightarrow t(X_1) \supseteq t(X_2)$
2. $Y_1 \subseteq Y_2 \Rightarrow i(Y_1) \supseteq i(Y_2)$
3. $X \subseteq i(t(X))$ and $Y \subseteq t(i(Y))$
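
The three properties can be checked exhaustively on a small database. The sketch below is only an illustration: the six-transaction `DB` is an assumption, reconstructed to agree with the tidsets quoted in the example slides (such as t(A) = 1345 and t(C) = 123456), and the helper `powerset` is a name chosen here.

```python
from itertools import combinations

# Assumed example database: tid -> items, reconstructed from values on the slides.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}
ITEMS = frozenset().union(*DB.values())
TIDS = frozenset(DB)

def t(itemset):
    """Tids of the transactions that contain every item of `itemset`."""
    return frozenset(tid for tid in TIDS if set(itemset) <= set(DB[tid]))

def i(tidset):
    """Items common to all transactions in `tidset` (all items for the empty tidset)."""
    common = set(ITEMS)
    for tid in tidset:
        common &= set(DB[tid])
    return frozenset(common)

def powerset(universe):
    elems = sorted(universe)
    return [frozenset(c) for k in range(len(elems) + 1) for c in combinations(elems, k)]

itemsets, tidsets = powerset(ITEMS), powerset(TIDS)

# Property 1: t reverses inclusion; property 2: i reverses inclusion.
prop1 = all(t(x1) >= t(x2) for x1 in itemsets for x2 in itemsets if x1 <= x2)
prop2 = all(i(y1) >= i(y2) for y1 in tidsets for y2 in tidsets if y1 <= y2)
# Property 3: a round trip can only grow a set.
prop3 = all(x <= i(t(x)) for x in itemsets) and all(y <= t(i(y)) for y in tidsets)
print(prop1, prop2, prop3)  # True True True
```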

Page 10: Galois Connection

Page 11: Example

$t(ACW) = t(A) \cap t(C) \cap t(W) = 1345 \cap 123456 \cap 12345 = 1345$

$i(245) = CDW$

$ACW \subseteq ACDW \Rightarrow t(ACW) = 1345 \supseteq 45 = t(ACDW)$
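
The numbers in this example can be reproduced with a few lines of Python. This is a sketch under an assumption: the database `DB` below is reconstructed so that it matches the tidsets quoted in the example, and the helper `show` is a name chosen here to print sets in the slides' compact notation.

```python
# Assumed example database: tid -> items.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def t(itemset):
    """Tids of the transactions that contain every item in `itemset`."""
    return {tid for tid, items in DB.items() if set(itemset) <= set(items)}

def i(tidset):
    """Items shared by all transactions in the non-empty `tidset`."""
    return set.intersection(*(set(DB[tid]) for tid in tidset))

def show(values):
    return "".join(str(v) for v in sorted(values))

print(show(t("A")), show(t("C")), show(t("W")))  # 1345 123456 12345
print(show(t("ACW")))                            # 1345
print(show(i({2, 4, 5})))                        # CDW
print(t("ACW") >= t("ACDW"))                     # True
```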

Page 12: Closure Operator

$c: P(S) \to P(S)$ is a closure operator if it satisfies the following:

1. Extension: $X \subseteq c(X)$
2. Monotonicity: $X \subseteq Y \Rightarrow c(X) \subseteq c(Y)$
3. Idempotency: $c(c(X)) = c(X)$

Closure by composition:

$c_{it}(X) = i \circ t(X) = i(t(X))$
$c_{ti}(Y) = t \circ i(Y) = t(i(Y))$

Page 13: Closure Operator (Round Trip)

Page 14: Closed Itemset (Definition)

A closed itemset $X$ is an itemset that is the same as its closure: $X = c_{it}(X)$.

Example: $c_{it}(AC) = i(t(AC)) = i(1345) = ACW$

Conclusion: AC is not closed; ACW is closed.
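
A minimal sketch of the closure test, assuming the same reconstructed six-transaction database used for the examples (the `DB` dictionary is an assumption, and `closure` and `is_closed` are names chosen here). It brute-forces all itemsets, which is fine for five items but is not how a miner such as CHARM works.

```python
from itertools import combinations

# Assumed example database: tid -> items.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}
ITEMS = sorted(set().union(*DB.values()))

def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= set(items)}

def i(tidset):
    """Items common to all transactions in `tidset` (all items if `tidset` is empty)."""
    common = set(ITEMS)
    for tid in tidset:
        common &= set(DB[tid])
    return common

def closure(itemset):
    """c_it(X) = i(t(X))."""
    return i(t(itemset))

def is_closed(itemset):
    return set(itemset) == closure(itemset)

print("".join(sorted(closure("AC"))))      # ACW
print(is_closed("AC"), is_closed("ACW"))   # False True

# Brute-force list of every closed itemset in the assumed database.
closed_itemsets = ["".join(c) for k in range(1, len(ITEMS) + 1)
                   for c in combinations(ITEMS, k) if is_closed(c)]
print(closed_itemsets)
```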

Page 15: Closed vs. Frequent Itemsets

Page 16: Concept (Definition)

For any closed itemset $X$ there exists a closed tidset $Y$ with the property $Y = t(X)$ and $X = i(Y)$.

The pair $X \times Y$ is called a concept.

Page 17: Galois Lattice

A concept $X_1 \times Y_1$ is a sub-concept of $X_2 \times Y_2$, written $X_1 \times Y_1 \le X_2 \times Y_2$, if $X_1 \subseteq X_2$ (equivalently, $Y_2 \subseteq Y_1$).

Let $B(\delta)$ be the set of all concepts. The ordered set $(B(\delta), \le)$ is a complete lattice, called the Galois lattice.

Page 18: Galois Lattice of Concepts

Page 19: Frequent Closed Itemsets vs. Frequent Itemsets

Lattice operations:

Join: $(X_1 \times Y_1) \vee (X_2 \times Y_2) = c_{it}(X_1 \cup X_2) \times (Y_1 \cap Y_2)$
Meet: $(X_1 \times Y_1) \wedge (X_2 \times Y_2) = (X_1 \cap X_2) \times c_{ti}(Y_1 \cup Y_2)$

Frequent concept: a concept whose support is at least minsup, where the support of a concept is defined as the cardinality of its closed tidset.

Page 20: Join and Meet Example

Join:

$(ACDW \times 45) \vee (CDT \times 56) = c_{it}(ACDW \cup CDT) \times (45 \cap 56) = ACDTW \times 5$

Meet:

$(ACDW \times 45) \wedge (CDT \times 56) = (ACDW \cap CDT) \times c_{ti}(45 \cup 56) = CD \times 2456$
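
The join and meet above can be sketched in Python as follows. The database `DB` is an assumption (reconstructed to match the tidsets in the example), and a concept is represented here, by choice, as a pair of frozensets built by the hypothetical helper `concept`.

```python
# Assumed example database: tid -> items.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}
ALL_ITEMS = frozenset().union(*DB.values())

def t(itemset):
    return frozenset(tid for tid, items in DB.items() if set(itemset) <= set(items))

def i(tidset):
    common = set(ALL_ITEMS)
    for tid in tidset:
        common &= set(DB[tid])
    return frozenset(common)

def concept(items, tids):
    return frozenset(items), frozenset(tids)

def join(c1, c2):
    """Close the union of the itemsets; intersect the tidsets."""
    (x1, y1), (x2, y2) = c1, c2
    return i(t(x1 | x2)), y1 & y2

def meet(c1, c2):
    """Intersect the itemsets; close the union of the tidsets."""
    (x1, y1), (x2, y2) = c1, c2
    return x1 & x2, t(i(y1 | y2))

a = concept("ACDW", {4, 5})
b = concept("CDT", {5, 6})
print(join(a, b) == concept("ACDTW", {5}))        # True
print(meet(a, b) == concept("CD", {2, 4, 5, 6}))  # True
```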

Page 21: Frequent Concepts

Page 22: Frequent Concepts

Lemma 1: The support of an itemset $X$ is equal to the support of its closure, i.e. $\sigma(X) = \sigma(c_{it}(X))$.

Therefore all frequent itemsets are uniquely determined by the closed itemsets, and can be determined by the join operation on the frequent concepts.
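
One way to see why Lemma 1 holds is the short derivation below. It assumes support is measured as $\sigma(X) = |t(X)|$ and uses only two facts about the mappings: $t$ reverses inclusion, and a round trip can only grow a set ($X \subseteq i(t(X))$, $Y \subseteq t(i(Y))$).

```latex
\begin{align*}
X \subseteq i(t(X)) \;&\Rightarrow\; t(X) \supseteq t(i(t(X)))
  && \text{($t$ reverses inclusion)} \\
t(X) \;&\subseteq\; t(i(t(X)))
  && \text{($Y \subseteq t(i(Y))$ with $Y = t(X)$)} \\
\text{hence}\quad t(X) \;&=\; t(i(t(X))) \;=\; t(c_{it}(X)) \\
\sigma(X) \;&=\; |t(X)| \;=\; |t(c_{it}(X))| \;=\; \sigma(c_{it}(X))
\end{align*}
```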

Page 23: Redundant Rules

Definition: Let $R_i$ denote the rule $X_1^i \xrightarrow{p_i} X_2^i$. A rule $R_1$ is more general than a rule $R_2$, denoted $R_1 \preceq R_2$, provided that $R_2$ can be generated by adding additional items to the antecedent or consequent of $R_1$.

The non-redundant rules are those that are most general among rules of equal confidence.

Page 24: Rule Generation

Lemma 2 (transitivity): Let $X_1, X_2, X_3$ be frequent closed itemsets with $X_1 \subseteq X_2 \subseteq X_3$. If $X_1 \xrightarrow{p} X_2$ and $X_2 \xrightarrow{q} X_3$, then $X_1 \xrightarrow{pq} X_3$.

Observation: it is sufficient to consider rules among adjacent concepts.
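
Why the confidences multiply can be seen in one line. The sketch assumes the usual definition of confidence, $\sigma(X \cup Y) / \sigma(X)$ for a rule $X \to Y$, which the slides do not spell out, and uses the chain $X_1 \subseteq X_2 \subseteq X_3$, so that each union collapses to the larger itemset.

```latex
\begin{align*}
p &= \frac{\sigma(X_1 \cup X_2)}{\sigma(X_1)} = \frac{\sigma(X_2)}{\sigma(X_1)},
\qquad
q = \frac{\sigma(X_2 \cup X_3)}{\sigma(X_2)} = \frac{\sigma(X_3)}{\sigma(X_2)}, \\
pq &= \frac{\sigma(X_2)}{\sigma(X_1)} \cdot \frac{\sigma(X_3)}{\sigma(X_2)}
    = \frac{\sigma(X_3)}{\sigma(X_1)}
    = \frac{\sigma(X_1 \cup X_3)}{\sigma(X_1)}.
\end{align*}
```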

Page 25: Rule Generation (100% Confidence)

Lemma 3: An association rule $X_1 \xrightarrow{1.0} X_2$ has confidence $p = 1.0$ if and only if $t(X_1) \subseteq t(X_2)$.

100% confidence rules are those directed from a super-concept to a sub-concept, i.e. down-arcs.
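
Lemma 3 can be checked by brute force on a small example. The sketch assumes the reconstructed six-transaction database `DB` (an assumption, chosen to agree with the tidsets in the example slides) and the usual definition of confidence as $|t(X_1 \cup X_2)| / |t(X_1)|$; the function name `confidence` is chosen here.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example database: tid -> items.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}
ITEMS = sorted(set().union(*DB.values()))

def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= set(items)}

def confidence(x1, x2):
    """Confidence of x1 -> x2 as an exact fraction."""
    return Fraction(len(t(set(x1) | set(x2))), len(t(x1)))

# Every non-empty itemset; each has a non-empty tidset because tid 5 holds all items.
itemsets = [c for k in range(1, len(ITEMS) + 1) for c in combinations(ITEMS, k)]

lemma_holds = all(
    (confidence(x1, x2) == 1) == (t(x1) <= t(x2))
    for x1 in itemsets for x2 in itemsets
)
print(lemma_holds)                               # True
print(confidence("TW", "A"), t("TW") <= t("A"))  # 1 True
```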

Page 26: Rule Generation (100% Confidence)

Page 27: Rule Generation (100% Confidence)

Theorem 1: Let $R = \{R_1, \ldots, R_n\}$ be a set of rules with 100% confidence ($p_i = 1.0$ for all $i$), such that $I_1 = c_{it}(X_1^i \cup X_2^i)$ and $I_2 = c_{it}(X_2^i)$ for all rules $R_i$. Let $R_I$ denote the 100% confidence rule $I_1 \xrightarrow{1.0} I_2$. Then all rules $R_i \ne R_I$ are more specific than $R_I$, and thus are redundant.

Page 28: Rule Generation (100% Confidence)

Example: $TW \to A$, $TW \to AC$, $CTW \to A$

$c_{it}(TW \cup A) = c_{it}(ATW) = ACTW$
$c_{it}(TW \cup AC) = ACTW$
$c_{it}(CTW \cup A) = ACTW$

The most general rule is $TW \to A$.
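
The grouping in this example can be sketched as follows. The database `DB` is an assumption (reconstructed to agree with the tidsets in the example slides), and `more_general` is a hypothetical helper that encodes "adds items to the antecedent or consequent". The code prints, for each rule, the closure of antecedent plus consequent and the closure of the consequent, then picks the rule that is more general than all the others.

```python
# Assumed example database: tid -> items.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}
ALL_ITEMS = set().union(*DB.values())

def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= set(items)}

def closure(itemset):
    """c_it(X): items common to every transaction that contains X."""
    common = set(ALL_ITEMS)
    for tid in t(itemset):
        common &= set(DB[tid])
    return common

def label(items):
    return "".join(sorted(items))

rules = [("TW", "A"), ("TW", "AC"), ("CTW", "A")]
for x1, x2 in rules:
    print(x1, "->", x2, "|", label(closure(set(x1) | set(x2))), label(closure(x2)))

def more_general(r1, r2):
    """True if r2 can be obtained from r1 by adding items to either side (or r1 == r2)."""
    return set(r1[0]) <= set(r2[0]) and set(r1[1]) <= set(r2[1])

most_general = [r for r in rules if all(more_general(r, other) for other in rules)]
print(most_general)  # [('TW', 'A')]
```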

Page 29: Rule Generation (Confidence < 100%)

Rules with confidence below 100% go from sub-concepts to super-concepts, i.e. they correspond to up-arcs.

Rules between non-adjacent concepts can be derived by transitivity. For example:

$C \to W$ (with $p = 0.83$) and $W \to A$ (with $q = 0.8$) give $C \to A$ (with $pq = 0.67$).
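
The arithmetic in this example can be verified directly. The sketch assumes the reconstructed six-transaction database `DB` (an assumption consistent with the tidsets in the example slides) and uses exact fractions so that the product is not blurred by rounding.

```python
from fractions import Fraction

# Assumed example database: tid -> items.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}

def support_count(itemset):
    return sum(1 for items in DB.values() if set(itemset) <= set(items))

def confidence(x1, x2):
    return Fraction(support_count(set(x1) | set(x2)), support_count(x1))

p = confidence("C", "W")
q = confidence("W", "A")
print(p, q, p * q, confidence("C", "A"))  # 5/6 4/5 2/3 2/3
print(round(float(p), 2), round(float(q), 2), round(float(p * q), 2))  # 0.83 0.8 0.67
```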

Page 30: Rule Generation (Confidence < 100%)

Page 31: Rule Generation (Confidence < 100%)

Theorem 2: Let $R = \{R_1, \ldots, R_n\}$ be a set of rules with confidence $p < 1.0$ ($p_i = p$ for all $i$), such that $I_1 = c_{it}(X_1^i)$ and $I_2 = c_{it}(X_1^i \cup X_2^i)$ for all rules $R_i$. Let $R_I$ denote the rule $I_1 \xrightarrow{p} I_2$. Then all rules $R_i \ne R_I$ are more specific than $R_I$, and thus are redundant.

Page 32: Generating Set

Combining the two sets gives a generating set for rules with minsup = 50% and minconf = 80%:

$\{TW \to A,\ A \to W,\ W \to C,\ T \to C,\ D \to C,\ W \to A\ (0.8),\ C \to W\ (0.83)\}$

All association rules can be derived from this set.
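
The part of the generating set with confidence below 100% can be recomputed with a brute-force sketch. Everything here is an assumption for illustration: the database `DB` is reconstructed from the tidsets in the example slides, the thresholds are taken as minsup = 50% and minconf = 80%, and rules are reported between closed itemsets (C to CW, CW to ACW), which correspond to the rules C→W and W→A in the set above. Reducing each rule to its most general form and finding the 100% rules are not covered.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example database and thresholds.
DB = {1: "ACTW", 2: "CDW", 3: "ACTW", 4: "ACDW", 5: "ACDTW", 6: "CDT"}
ITEMS = sorted(set().union(*DB.values()))
MINSUP, MINCONF = Fraction(1, 2), Fraction(4, 5)

def t(itemset):
    return frozenset(tid for tid, items in DB.items() if set(itemset) <= set(items))

def closure(itemset):
    common = set(ITEMS)
    for tid in t(itemset):
        common &= set(DB[tid])
    return frozenset(common)

def support(itemset):
    return Fraction(len(t(itemset)), len(DB))

# Closed itemsets are exactly the closures of all non-empty itemsets.
closed = {closure(c) for k in range(1, len(ITEMS) + 1) for c in combinations(ITEMS, k)}
frequent_closed = sorted((x for x in closed if support(x) >= MINSUP),
                         key=lambda x: (len(x), sorted(x)))

def adjacent(x1, x2):
    """x2 is an immediate superset of x1 among the frequent closed itemsets."""
    return x1 < x2 and not any(x1 < z < x2 for z in frequent_closed)

# Up-arcs between adjacent frequent concepts that meet the confidence threshold.
up_arc_rules = {}
for x1 in frequent_closed:
    for x2 in frequent_closed:
        if adjacent(x1, x2):
            conf = support(x2) / support(x1)
            if conf >= MINCONF:
                up_arc_rules[("".join(sorted(x1)), "".join(sorted(x2)))] = conf

print(["".join(sorted(x)) for x in frequent_closed])
print(up_arc_rules)
```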

Page 33: Complexity of Rule Generation

Traditional:

$\sum_{i=0}^{l} \binom{l}{i} 2^i = O\left(\sum_{i=0}^{l} \binom{l}{i} 2^l\right) = O\left(2^l \sum_{i=0}^{l} \binom{l}{i}\right) = O(2^l \cdot 2^l) = O(2^{2l})$

New framework:

Best case: one closed itemset, no rules.
Worst case: all frequent itemsets are closed.
Number of rules: $\sum_{i=0}^{l} \binom{l}{i} \cdot i = O\left(l \sum_{i=0}^{l} \binom{l}{i}\right) = O(l \cdot 2^l)$
Reduction factor: $O\left(\frac{2^{2l}}{l \cdot 2^l}\right) = O\left(\frac{2^l}{l}\right)$
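
The two counts can be compared numerically. This sketch rests on an assumed reading of the formulas: the traditional count is taken as $\sum_i \binom{l}{i} 2^i$ and the worst-case count in the new framework as $\sum_i \binom{l}{i} \cdot i$, for a single frequent itemset of length $l$; the function names are chosen here.

```python
from math import comb

def traditional_rules(l):
    """Assumed traditional count: every subset of size i contributes 2**i candidate rules."""
    return sum(comb(l, i) * 2**i for i in range(l + 1))

def adjacent_rules_worst_case(l):
    """Assumed worst case for the closed-itemset framework: i rules per subset of size i."""
    return sum(comb(l, i) * i for i in range(l + 1))

for l in range(1, 11):
    # The first sum equals 3**l, which is within the O(2**(2l)) bound.
    assert traditional_rules(l) == 3**l <= 2**(2 * l)
    # The second sum equals l * 2**(l - 1), i.e. O(l * 2**l).
    assert adjacent_rules_worst_case(l) == l * 2**(l - 1)

l = 10
print(traditional_rules(l), adjacent_rules_worst_case(l))  # 59049 5120
```

Under these assumptions the gap grows quickly with $l$, in line with the stated reduction factor.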

Page 34: Experimental Evaluation

Page 35: Experimental Evaluation

Page 36: Experimental Evaluation

Page 37: Number of Rules (Traditional vs. Closed Itemset)

Page 38: Number of Rules (Traditional vs. Closed Itemset)

Page 39: Conclusion

The new framework based on closed itemsets can drastically reduce the rule set, which can then be presented to the user in a succinct manner.

Future work:
Interactive visualization and exploration of mined associations, generating rules on demand based on the user's interest.
Finding a minimal generating set.

