Post on 15-Oct-2015
Chapter 9: Market Basket Analysis and Association Rules
2
Data Mining Techniques So Far
Chapter 5 Statistics
Chapter 6 Decision Trees
Chapter 7 Neural Networks
Chapter 8 Nearest Neighbor Approaches: Memory-Based Reasoning and Collaborative Filtering
3
Questions related to Market Basket
4
What can be inferred?
I purchase diapers
I purchase a new car
I purchase OTC cough medicine
I purchase a prescription medication
I don't show up for class
5
Market Basket Analysis
Retail: each customer purchases a different set of products, in different quantities, at different times.
MBA uses this information to:
Identify who customers are (not by name)
Understand why they make certain purchases
Gain insight about its merchandise (products):
  Fast and slow movers
  Products which are purchased together
  Products which might benefit from promotion
Take action:
  Store layouts
  Which products to put on specials, promote, coupons
Combining all of this with a customer loyalty card makes it even more valuable.
6
Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated
AR represent patterns in the data without a specified target variable
Good example of undirected data mining
Whether patterns make sense is up to humanoids (us!)
7
Association Rules Apply Elsewhere
Items purchased on a credit card, such as rental cars and hotel rooms, provide insight into the next product that customers are likely to purchase.
Optional services purchased by telecommunications customers (call waiting, call forwarding, DSL, speed call, and so on) help determine how to bundle these services together to maximize revenue.
Banking services used by retail customers (money market accounts, CDs, investment services, car loans, and so on) identify customers likely to want other services.
Unusual combinations of insurance claims can be a sign of fraud and can spark further investigation.
Medical patient histories can give indications of likely complications based on certain combinations of treatments.
8
Market Basket Analysis Drill-Down
MBA is a set of techniques, Association Rules being most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data)
Customers
Orders (basic purchase data or baskets or item sets)
Items (merchandise/services purchased)
9
Typical Data Structure (Relational Database)
Lots of questions can be answered
Avg # of orders/customer
Avg # unique items/order
Avg # of items/order
For a product:
  What % of customers have purchased it
  Avg # of orders/customer that include it
  Avg quantity of it purchased/order
Visualization is extremely helpful (next slide)
Transaction Data
10
Combining data
These measures give broad insight into the business.
In some cases, there are few repeat customers, so the number of orders per customer is close to 1.
This suggests a business opportunity to increase the number of sales per customer.
Or, the number of products per order may be close to 1, suggesting an opportunity for cross-selling during the process of making an order.
It can be useful to compare these measures to each other.
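As a minimal sketch of how these measures could be computed, the snippet below works on a small, made-up list of order lines. The tuple layout (customer, order, item, quantity) and all the values are assumptions for illustration, not data from the slides.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, order_id, item, quantity).
order_lines = [
    ("c1", "o1", "beer", 2),
    ("c1", "o1", "diaper", 1),
    ("c1", "o2", "beer", 1),
    ("c2", "o3", "milk", 1),
    ("c3", "o4", "beer", 6),
    ("c3", "o4", "chocolate", 2),
]

orders_by_customer = defaultdict(set)  # customer -> order ids
items_by_order = defaultdict(set)      # order -> distinct items
units_by_order = defaultdict(int)      # order -> total quantity

for customer, order, item, qty in order_lines:
    orders_by_customer[customer].add(order)
    items_by_order[order].add(item)
    units_by_order[order] += qty

n_customers = len(orders_by_customer)
n_orders = len(items_by_order)  # assumes order ids are unique across customers

avg_orders_per_customer = n_orders / n_customers
avg_unique_items_per_order = sum(len(s) for s in items_by_order.values()) / n_orders
avg_units_per_order = sum(units_by_order.values()) / n_orders


def pct_customers_buying(product):
    """Percentage of customers with at least one order containing `product`."""
    buyers = {c for c, _, item, _ in order_lines if item == product}
    return 100 * len(buyers) / n_customers


print(avg_orders_per_customer, avg_unique_items_per_order, avg_units_per_order)
print(pct_customers_buying("beer"))
```

An orders-per-customer value near 1 would flag the repeat-purchase opportunity described above, and an items-per-order value near 1 would flag the cross-selling opportunity.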
11
Questions about ...
Sales Order Characteristics
Item Popularity
Tracking Marketing Interventions
Clustering Products by Usage
12
Sales Order Characteristics
Customer purchases have additional interesting characteristics.
For instance, the average order size varies by time and region
For Web purchases and mail-order transactions, additional information may also be gathered at the point of sale:
Did the order use gift wrap?
Is the order going to the same address as the billing address?
Did the purchaser accept or decline a particular cross-sell offer?
13
Item popularity
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?
14
Tracking Marketing Interventions
Including marketing interventions along with the product sales over time makes it possible to see the effect of the interventions.
Prior to the intervention, sales are hovering at 50 units / week.
After the intervention, they peak at 7-8 times that amount.
A challenge in answering this question is determining whether the additional sales are incremental or are made by customers who would purchase the product anyway at some later time.
We can also look at the number of baskets containing the item.
If the number of customers is not increasing, there is evidence that existing customers are simply stocking up on the item at a lower cost.
15
Clustering Products by Usage
What groups of products often appear together?
Such groups of products are very useful for making recommendations to customers: customers who have purchased some of the products may be interested in the rest of them.
A lot of information available about products.
In addition to the product hierarchy, such information includes the color of clothes, whether food is low calorie, whether a poster includes a frame, and so on
Questions: Do diet products tend to sell together?
Are customers purchasing similar colors of clothing at the same time?
Do customers who purchase framed posters also buy other products?
16
Pivoting for Cluster Algorithms
17
Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners
So what?
18
Famous Rules: Beer & Diapers
19
Famous Rules: Beer & Diapers
WHY?
Beer drinkers do not want to interrupt their enjoyment of televised sports, so they buy diapers to reduce trips to the bathroom. No, that's not it.
Families with young children are preparing for the weekend.
What can a retailer do with this information?
Put the beer and diapers close together, so when one is purchased, customers remember to buy the other one.
Put them as far apart as possible, so that customers have the opportunity to buy yet more items on the way from one to the other.
Put higher-margin diapers a bit closer to the beer, although mixing baby products and alcohol would probably be unseemly.
20
Association Rules
Examples:
If buy Diaper Then buy Beer
If buy Beer, Diaper Then buy Cheese, Chocolate
Shoppers who buy Diaper are very likely to buy Beer.
Shoppers who buy Beer and Diaper are likely to buy Cheese and Chocolate.
For a frequent itemset {Diaper, Beer}, is Diaper promoting the purchase of Beer, or Beer increasing the chance of Diaper purchase?
We need directions.
21
Association Rules
Rule format:
If {set of items} Then {set of items}
LHS implies RHS *
Example: If {Diaper, Baby Food} Then {Beer, Wine}
LHS → RHS
An association rule is valid if it satisfies some evaluation measures
* RHS = "Right Hand Side", LHS = "Left Hand Side"
22
Association Rules
Association rule types:
Actionable Rules: contain high-quality, actionable information
Trivial Rules: information already well-known by those familiar with the business
  Results from market basket analysis may simply be measuring the success of previous marketing campaigns
Inexplicable Rules: no explanation and do not suggest action
Trivial and Inexplicable Rules occur most often
23
Rule Evaluation
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Wine
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
...
Milk & Wine co-occur, but only 2 out of 200K transactions contain these items.
24
Rule Evaluation - Support
Support: the frequency with which the items in LHS and RHS co-occur.
Support = (No. of transactions containing items in LHS and RHS) / (Total No. of transactions in the dataset)
E.g., the support of the {Diaper} → {Beer} rule is 3/5: 60% of the transactions contain both items.
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Shampoo
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
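The support calculation can be sketched in a few lines of Python. The `support` helper and the set-based encoding of the five transactions are illustrative choices, not part of the original slides.

```python
# The five example transactions from the slide, one set of items per basket.
transactions = [
    {"Beer", "Diaper", "Chocolate"},   # 100
    {"Milk", "Chocolate", "Shampoo"},  # 101
    {"Beer", "Wine", "Vodka"},         # 102
    {"Beer", "Cheese", "Diaper"},      # 103
    {"Ice Cream", "Diaper", "Beer"},   # 104
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# Support of {Diaper} -> {Beer} is the support of the combined itemset.
print(support({"Diaper", "Beer"}, transactions))  # 0.6
```

Support is symmetric: {Diaper} → {Beer} and {Beer} → {Diaper} have the same support, because both count the baskets containing the two items together.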
25
Rule Evaluation - Confidence
Is Beer leading to Diaper purchase or Diaper leading to Beer purchase?
Among the transactions with Diaper, 100% have Beer. P(Beer|Diaper) = 100%
Among the transactions with Beer, 75% have Diaper. P(Diaper|Beer) = 75%
Confidence = (No. of transactions containing both LHS and RHS) / (No. of transactions containing LHS)
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Shampoo
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
Confidence for {Diaper} → {Beer}: 3/3
When Diaper is purchased, the likelihood of Beer purchase is 100%
Confidence for {Beer} → {Diaper}: 3/4
When Beer is purchased, the likelihood of Diaper purchase is 75%
So, {Diaper} → {Beer} is a more important rule according to confidence.
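A short sketch of the confidence calculation on the same five transactions follows. The `confidence` function name and the choice to return 0.0 for an LHS that never occurs are assumptions made for the example.

```python
transactions = [
    {"Beer", "Diaper", "Chocolate"},   # 100
    {"Milk", "Chocolate", "Shampoo"},  # 101
    {"Beer", "Wine", "Vodka"},         # 102
    {"Beer", "Cheese", "Diaper"},      # 103
    {"Ice Cream", "Diaper", "Beer"},   # 104
]


def confidence(lhs, rhs, transactions):
    """Share of the transactions containing LHS that also contain RHS."""
    lhs, rhs = set(lhs), set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if (lhs | rhs) <= t)
    # Assumption: a rule whose LHS never occurs is given confidence 0.
    return n_both / n_lhs if n_lhs else 0.0


print(confidence({"Diaper"}, {"Beer"}, transactions))  # 1.0
print(confidence({"Beer"}, {"Diaper"}, transactions))  # 0.75
```

Unlike support, confidence depends on direction, which is why the two rules built from the same itemset score differently.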
26
Rule Evaluation - Lift
Transaction No.  Item 1  Item 2     Item 3     Item 4
100              Beer    Diaper     Chocolate
101              Milk    Chocolate  Shampoo
102              Beer    Milk       Vodka      Chocolate
103              Beer    Milk       Diaper     Chocolate
104              Milk    Diaper     Beer
What's the support and confidence for rule {Chocolate} → {Milk}?
Support = 3/5, Confidence = 3/4
Very high support and confidence. Does Chocolate really lead to Milk purchase?
No! Because Milk occurs in 4 out of 5 transactions. Chocolate is even decreasing the chance of Milk purchase: 3/4 < 4/5, i.e. P(Milk|Chocolate) < P(Milk).
When lift > 1, the rule is better at predicting the result than guessing.
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing.
27
Rule Evaluation - Lift (cont.)
Measures how much more likely is the RHS given the LHS than merely the RHS
Lift = confidence of the rule / probability of the RHS
i.e. = P(RHS|LHS)/P(RHS)
Example: {Diaper} → {Beer}
Total number of customers in database: 1000
No. of customers buying Diaper: 200
No. of customers buying beer: 50
No. of customers buying Diaper & beer: 20
Probability of Beer = 50/1000 (5%)
Confidence = 20/200 (10%)
Lift = 10%/5% = 2
Lift higher than 1 implies people have a higher chance of buying Beer when they buy Diaper. Lift lower than 1 implies people have a lower chance of buying Milk when they buy Chocolate.
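The lift arithmetic can be checked with a small function that works directly from counts. The function name and argument names are illustrative. The second call reuses the Chocolate and Milk counts from the earlier five-transaction table (4 baskets with Chocolate, 4 with Milk, 3 with both).

```python
def lift(n_total, n_lhs, n_rhs, n_both):
    """Lift = P(RHS|LHS) / P(RHS), computed from raw counts."""
    # (n_both / n_lhs) / (n_rhs / n_total), rearranged into one division.
    return (n_both * n_total) / (n_lhs * n_rhs)


# {Diaper} -> {Beer}: 1000 customers, 200 buy Diaper, 50 buy Beer, 20 buy both.
diaper_beer = lift(n_total=1000, n_lhs=200, n_rhs=50, n_both=20)

# {Chocolate} -> {Milk}: 5 baskets, 4 with Chocolate, 4 with Milk, 3 with both.
chocolate_milk = lift(n_total=5, n_lhs=4, n_rhs=4, n_both=3)

print(diaper_beer)     # 2.0
print(chocolate_milk)  # 0.9375
```

The first rule has lift above 1, the second below 1, matching the two cases described above.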
28
Rule Evaluation - Practical Impact
Most methods for extracting association rules find too many trivial rules. Most are either obvious or uninteresting.
Example: If Maternity Ward then patient is a woman. Confidence 100%, support 100%
Need to screen for rules that are of particular interest and significance.
Actionable: Keep only rules that can be acted upon.
Interestingness: Various measures for how surprising or unexpected a rule is.
Example: A rule is interesting if it contradicts what is currently known (e.g., it contradicts a rule that was previously discovered).
29
Creating Association Rules
1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items
30
Creating Association Rules
31
Creating Association Rules
Choosing the right set of items
Within a grocery store where there are tens of thousands of products on the shelves, a frozen pizza might be considered an item for analysis purposes, regardless of its toppings (extra cheese, pepperoni, or mushrooms), its crust (extra thick, whole wheat, or white), or its size.
On the other hand, the manager of frozen foods or a chain of pizza restaurants may be very interested in the particular combinations of toppings that are ordered.
32
Creating Association Rules
Choosing the right set of items
What level of the product hierarchy is the right one to use?
Market basket analysis produces the best results when the items occur in roughly the same number of transactions in the data. This helps prevent rules from being dominated by the most common items. Product hierarchies can help here. Roll up rare items to higher levels in the hierarchy, so they become more frequent. More common items may not have to be rolled up at all.
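The roll-up idea can be sketched as follows. The hierarchy mapping, the sample baskets, and the `MIN_COUNT` threshold are all hypothetical. The point is only that rare items are replaced by their parent category while common items are left alone.

```python
from collections import Counter

# Hypothetical product hierarchy: specific item -> parent category.
parent = {
    "pepperoni frozen pizza": "frozen pizza",
    "extra cheese frozen pizza": "frozen pizza",
    "mushroom frozen pizza": "frozen pizza",
}

# Hypothetical baskets.
transactions = [
    {"milk", "pepperoni frozen pizza"},
    {"milk", "mushroom frozen pizza"},
    {"milk", "extra cheese frozen pizza", "soda"},
    {"milk", "soda"},
]

MIN_COUNT = 2  # assumed threshold: items seen in fewer baskets are "rare"

counts = Counter(item for t in transactions for item in t)
rare = {item for item, n in counts.items() if n < MIN_COUNT}


def roll_up(transaction):
    """Replace rare items by their parent category, if one is defined."""
    return {parent.get(i, i) if i in rare else i for i in transaction}


rolled = [roll_up(t) for t in transactions]
rolled_counts = Counter(item for t in rolled for item in t)
print(rolled_counts)
```

After the roll-up, "frozen pizza" appears in three baskets, which puts it in the same range as "milk" and "soda" instead of three separate one-off items.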
33
Creating Association Rules
Generating rules by deciphering the counts in the co-occurrence matrix
if condition, then result.
if Barbie doll, then candy bar
= if a customer purchases a Barbie doll, then the customer is also expected to purchase a candy bar.
Saying that the rule if B and C then A has a confidence of 0.33 is equivalent to saying that when B and C appear in a transaction, there is a 33 percent chance that A also appears in it.
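A co-occurrence matrix for pairs of items can be built with a simple counter, as in the sketch below. The five baskets over items A, B and C are made up so that the rule "if B and C then A" comes out with the 0.33 confidence mentioned above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets over three items.
transactions = [
    {"A", "B", "C"},
    {"B", "C"},
    {"B", "C"},
    {"A", "B"},
    {"A"},
]

# cooccur[(x, y)] = number of baskets containing both x and y.
# The diagonal cooccur[(x, x)] is simply the count of baskets containing x.
cooccur = Counter()
for t in transactions:
    for item in t:
        cooccur[(item, item)] += 1
    for a, b in combinations(sorted(t), 2):
        cooccur[(a, b)] += 1
        cooccur[(b, a)] += 1

# Confidence of "if B and C then A": baskets with A, B and C
# divided by baskets with B and C.
n_abc = sum(1 for t in transactions if {"A", "B", "C"} <= t)
conf_bc_a = n_abc / cooccur[("B", "C")]

print(cooccur[("B", "C")])   # 3
print(round(conf_bc_a, 2))   # 0.33
```

The pair matrix alone is enough for rules with one item on each side. Rules with two items on the left need three-way counts, computed here directly from the baskets.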
34
Creating Association Rules
Overcoming the practical limits imposed by thousands or tens of thousands of unique items
1. Generate co-occurrence matrix for single items: if OJ then soda
2. Generate co-occurrence matrix for two items: if OJ and Milk then soda
3. Generate co-occurrence matrix for three items: if OJ and Milk and Window Cleaner then soda
4. And so on
35
Algorithm to Extract Association Rules
The standard algorithm: Apriori
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
The Association Rules problem was defined as:
Generate all association rules that have
support greater than the user-specified minimum support
and confidence greater than the user-specified minimum confidence
The base algorithm uses support and confidence, but we can also use lift to rank the rules discovered by Apriori.
The algorithm performs an efficient search over the data to find all such rules.
36
Finding Association Rules from Data
Association rules discovery problem is decomposed into two sub-problems:
1. Find all sets of items (itemsets) whose support is above minimum support - called frequent itemsets or large itemsets
2. From each frequent itemset, generate rules whose confidence is above minimum confidence.
Given a large itemset Y, and X a subset of Y:
Calculate the confidence of the rule X → (Y - X).
If its confidence is above the minimum confidence, then X → (Y - X) is an association rule we are looking for.
37
Example
A data set with 5 transactions
Transaction No.  Item 1     Item 2     Item 3
100              Beer       Diaper     Chocolate
101              Milk       Chocolate  Shampoo
102              Beer       Wine       Vodka
103              Beer       Cheese     Diaper
104              Ice Cream  Diaper     Beer
Minimum support = 40%, Minimum confidence = 80%
Phase 1: Find all frequent itemsets
{Beer} (support = 80%)
{Diaper} (60%)
{Chocolate} (40%)
{Beer, Diaper} (60%)
Phase 2: Generate association rules
Beer → Diaper (conf. 3/4 = 75%)
Diaper → Beer (conf. 3/3 = 100%)
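The two phases can be reproduced end to end with a deliberately naive sketch. Phase 1 here tests every possible itemset by brute force, which is only workable because the example is tiny. Variable names and thresholds-as-fractions are illustrative choices.

```python
from itertools import combinations

transactions = [
    {"Beer", "Diaper", "Chocolate"},   # 100
    {"Milk", "Chocolate", "Shampoo"},  # 101
    {"Beer", "Wine", "Vodka"},         # 102
    {"Beer", "Cheese", "Diaper"},      # 103
    {"Ice Cream", "Diaper", "Beer"},   # 104
]
MIN_SUPPORT, MIN_CONFIDENCE = 0.4, 0.8


def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


# Phase 1 (brute force): try every itemset up to the largest basket size.
items = sorted(set().union(*transactions))
max_size = max(len(t) for t in transactions)
frequent = {}
for size in range(1, max_size + 1):
    for combo in combinations(items, size):
        s = support(frozenset(combo))
        if s >= MIN_SUPPORT:
            frequent[frozenset(combo)] = s

# Phase 2: split each frequent itemset into LHS -> RHS and keep the
# rules whose confidence reaches the minimum.
rules = []
for itemset, sup in frequent.items():
    for k in range(1, len(itemset)):
        for lhs in combinations(sorted(itemset), k):
            lhs = frozenset(lhs)
            conf = sup / support(lhs)
            if conf >= MIN_CONFIDENCE:
                rules.append((set(lhs), set(itemset - lhs), conf))

print(sorted(sorted(s) for s in frequent))
print(rules)
```

With a minimum confidence of 80%, only Diaper → Beer survives. Beer → Diaper, at 75%, is dropped.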
38
Phase 1: Finding all frequent itemsets
How to perform an efficient search of all frequent itemsets?
A naive way is to calculate the support for every possible itemset. There are 2^N possible itemsets given N items: impossible to do!
Need a smart method: every frequent itemset of size n contains itemsets of size n-1 that must also be frequent.
Example: if {diaper, beer} is frequent then {diaper} and {beer} are each frequent as well.
This means that if an itemset is not frequent (e.g., {wine}) then no itemset that includes wine can be frequent either, such as {wine, beer}.
We therefore first find all itemsets of size 1 that are frequent.
Then try to expand these by counting the frequency of all itemsets of size 2 that include frequent itemsets of size 1.
Example: If {wine} is not frequent we need not try to find out whether {wine, beer} is frequent. But if both {wine} and {beer} were frequent then it is possible (though not guaranteed) that {wine, beer} is also frequent.
Then take only itemsets of size 2 that are frequent, and try to expand those, etc.
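The level-wise search described on this slide can be sketched as below. This is a simplified illustration of the Apriori idea rather than the optimized algorithm from the paper: candidates are formed by taking unions of frequent itemsets from the previous level, and the function name and data encoding are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: grow itemsets one item at a time, keeping only
    candidates whose smaller subsets were all frequent."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    level = {s for s in level if sup(s) >= min_support}

    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = sup(s)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop candidates with any infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        # Count: only the surviving candidates are checked against the data.
        level = {c for c in candidates if sup(c) >= min_support}
    return frequent


transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]
result = frequent_itemsets(transactions, min_support=0.4)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Because {Wine} is not frequent at level 1, no pair containing Wine is ever generated, let alone counted.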
39
Phase 2: Generating Association Rules
Assume {Milk, Bread, Butter} is a frequent itemset.
Using items contained in the itemset, list all possible rules:
{Milk} → {Bread, Butter}
{Bread} → {Milk, Butter}
{Butter} → {Milk, Bread}
{Milk, Bread} → {Butter}
{Milk, Butter} → {Bread}
{Bread, Butter} → {Milk}
Calculate the confidence of each rule.
Pick the rules with confidence above the minimum confidence.
Confidence of {Milk} → {Bread, Butter}:
= Support {Milk, Bread, Butter} / Support {Milk}
= (No. of transactions that support {Milk, Bread, Butter}) / (No. of transactions that support {Milk})
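Listing the candidate rules for an itemset amounts to enumerating every way of splitting it into a non-empty LHS and a non-empty RHS. The sketch below does that for {Milk, Bread, Butter}. The support counts used in the confidence line are hypothetical, since the slide gives no data for this itemset.

```python
from itertools import combinations


def candidate_rules(itemset):
    """All (LHS, RHS) splits of `itemset` with both sides non-empty."""
    items = sorted(itemset)
    rules = []
    for size in range(1, len(items)):
        for lhs in combinations(items, size):
            rhs = tuple(i for i in items if i not in lhs)
            rules.append((lhs, rhs))
    return rules


rules = candidate_rules({"Milk", "Bread", "Butter"})
for lhs, rhs in rules:
    print(set(lhs), "->", set(rhs))

# Hypothetical support counts, for illustration only.
support_count = {
    frozenset({"Milk", "Bread", "Butter"}): 30,
    frozenset({"Milk"}): 50,
}
conf_milk = (
    support_count[frozenset({"Milk", "Bread", "Butter"})]
    / support_count[frozenset({"Milk"})]
)
print(conf_milk)  # 0.6
```

An itemset with n items yields 2^n - 2 candidate rules, which is 6 for the three-item case here.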
40
Agrawal (94)'s Apriori Algorithm - An Example
Transactions
T-ID  Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E
1st scan: C1
Itemset  sup
{A}      2
{B}      3
{C}      3
{D}      1
{E}      3
L1
Itemset  sup
{A}      2
{B}      3
{C}      3
{E}      3
C2
Itemset
{A, B}
{A, C}
{A, E}
{B, C}
{B, E}
{C, E}
2nd scan: C2 with counts
Itemset  sup
{A, B}   1
{A, C}   2
{A, E}   1
{B, C}   2
{B, E}   3
{C, E}   2
L2
Itemset  sup
{A, C}   2
{B, C}   2
{B, E}   3
{C, E}   2
C3
Itemset
{B, C, E}
3rd scan: L3
Itemset    sup
{B, C, E}  2
{A, B, C}, {A, C, E}?
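The question on the slide, why {A, B, C} and {A, C, E} never reach the third scan, can be checked with a short sketch of the candidate-generation step. The join below uses plain set unions, a simplification of the prefix-based join in the original paper. The prune step is the part that answers the question.

```python
from itertools import combinations

transactions = {
    10: {"A", "C", "D"},
    20: {"B", "C", "E"},
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}

# L2 as shown on the slide: the frequent 2-itemsets.
L2 = {frozenset("AC"), frozenset("BC"), frozenset("BE"), frozenset("CE")}

# Join: every 3-item union of two frequent 2-itemsets.
joined = {a | b for a in L2 for b in L2 if len(a | b) == 3}

# Prune: a 3-itemset can only be frequent if all of its 2-item subsets are.
C3 = {
    c for c in joined
    if all(frozenset(pair) in L2 for pair in combinations(c, 2))
}

# 3rd scan: count only the surviving candidates.
L3 = {c: sum(1 for t in transactions.values() if c <= t) for c in C3}

print(sorted(sorted(c) for c in joined))  # ABC, ACE and BCE are joined
print(sorted(sorted(c) for c in C3))      # only BCE survives pruning
print(L3)
```

{A, B, C} is pruned because {A, B} is not in L2, and {A, C, E} is pruned because {A, E} is not in L2, so neither needs to be counted against the data.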
41
The number of combinations with n items is proportional to the number of items raised to the nth power - a number that gets very large, very fast.
42
Final Thought on Association Rules: The Problem of Lots of Data
Fast food restaurant: could have 100 items on its menu
  How many combinations are there with 3 different menu items? 161,700!
Supermarket: 10,000 or more unique items
  50 million 2-item combinations
  100 billion 3-item combinations
How to reduce data:
  Use of product hierarchies (groupings)
  Pruning: reducing the number of items and combinations of items being considered at each step
  Minimum support pruning requires that a rule hold on a minimum number of transactions. If there are one million transactions and the minimum support is 1%, then only rules supported by at least 10,000 transactions are of interest.
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze).
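These counts are binomial coefficients and can be checked with Python's standard library. The figures on the slide are rounded: for exactly 10,000 items the 2-item count is just under 50 million and the 3-item count is about 167 billion, so "100 billion" is best read as an order of magnitude.

```python
from math import comb

# Number of ways to choose k distinct items out of n.
print(comb(100, 3))      # 161700
print(comb(10_000, 2))   # 49995000
print(comb(10_000, 3))   # 166616670000

# Minimum support expressed as a transaction count:
# 1% of one million transactions.
n_transactions = 1_000_000
min_support_count = n_transactions // 100
print(min_support_count)  # 10000
```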
43
Using Association Rules to Compare Stores
EX: compare sales at store openings versus existing stores:
1. Gather data for a specific period (such as 2 weeks) from store openings. Augment each of the transactions in this data with a virtual item saying that the transaction is from a store opening.
2. Gather about the same amount of data from existing stores. Here you might use a sample across all existing stores, or you might take all the data from stores in comparable locations. Augment the transactions in this data with a virtual item saying that the transaction is from an existing store.
3. Apply market basket analysis to find association rules in each set.
4. Pay particular attention to association rules containing the virtual items.
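The virtual-item step can be sketched as follows. The baskets, the virtual item labels, and the helper names are all hypothetical. The example only shows how tagging transactions lets a rule such as "if store opening then toilet bowl cleaner" be evaluated like any other rule.

```python
def tag(transactions, virtual_item):
    """Add a virtual item to every transaction in the list."""
    return [set(t) | {virtual_item} for t in transactions]


def confidence(lhs, rhs, transactions):
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if (lhs | rhs) <= t)
    return n_both / n_lhs if n_lhs else 0.0


# Hypothetical baskets from store openings and from existing stores.
openings = tag(
    [{"toilet bowl cleaner", "paint"},
     {"toilet bowl cleaner", "light bulbs"},
     {"paint"}],
    "STORE_OPENING",
)
existing = tag(
    [{"paint", "brushes"},
     {"light bulbs"},
     {"toilet bowl cleaner", "paint"},
     {"paint"}],
    "EXISTING_STORE",
)

# Rules containing the virtual item, evaluated within each data set.
c_open = confidence({"STORE_OPENING"}, {"toilet bowl cleaner"}, openings)
c_exist = confidence({"EXISTING_STORE"}, {"toilet bowl cleaner"}, existing)
print(round(c_open, 2), c_exist)
```

A large gap between the two confidences would point to a product that behaves differently at store openings.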
44
Dissociation Rules
if A and not B, then C
Dissociation rules can be generated by a simple adaptation of the basic market basket analysis algorithm.
Downsides to including new items:
doubling the number of items seriously degrades performance
the size of a typical transaction grows because it now includes inverted items
the frequency of the inverse items tends to be much larger than the frequency of the original items.
So, minimum support constraints tend to produce rules in which all items are inverted, such as if NOT A and NOT B then NOT C.
These rules are less likely to be actionable.
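The downsides listed above can be seen on a small example. The sketch below adds an inverted item "NOT x" to each basket for every item x that the basket lacks. The helper name and the "NOT " prefix are illustrative, and the baskets are the five-transaction example used earlier in the deck.

```python
from collections import Counter

transactions = [
    {"Beer", "Diaper", "Chocolate"},
    {"Milk", "Chocolate", "Shampoo"},
    {"Beer", "Wine", "Vodka"},
    {"Beer", "Cheese", "Diaper"},
    {"Ice Cream", "Diaper", "Beer"},
]


def add_inverted_items(transactions):
    """Add 'NOT x' to each transaction for every item x it does not contain."""
    all_items = set().union(*transactions)
    return [set(t) | {f"NOT {i}" for i in all_items - set(t)} for t in transactions]


augmented = add_inverted_items(transactions)
counts = Counter(item for t in augmented for item in t)

print(len(set().union(*transactions)), "->", len(counts))  # distinct items
print(len(transactions[0]), "->", len(augmented[0]))       # items per basket
print(counts["Milk"], counts["NOT Milk"])                  # original vs inverse
```

The number of distinct items doubles (9 to 18), each basket triples in size (3 to 9), and "NOT Milk" is four times as frequent as "Milk", which is why inverted items tend to dominate under a minimum support constraint.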
45
Sequential Analysis Using Association Rules
Association rules find things that happen at the same time: what items are purchased at a given time.
The next natural question concerns sequences of events and what they mean.
Examples:
New homeowners purchase shower curtains before purchasing furniture.
Customers who purchase new lawnmowers are very likely to purchase a new garden hose in the following 6 weeks.
When a customer goes into a bank branch and asks for an account reconciliation, there is a good chance that he or she will close all his or her accounts.
In order to consider time-series analyses on your customers, there has to be some way of identifying customers. Without a way of tracking individual customers, there is no way to analyze their behavior over time.
46
Sequential Patterns
Instead of finding associations between items in a single transaction, find associations between items across related transactions over time.
Customer ID  Transaction Date  Item 1                 Item 2
AA           2/2/2001          Laptop                 Case
AA           1/13/2002         Wireless network card  Router
BB           4/5/2002          Laptop                 iPaq
BB           8/10/2002         Wireless network card  Router
Sequence : {Laptop}, {Wireless Card, Router}
A sequence has to satisfy some predetermined minimum support
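Sequence support can be sketched by checking, customer by customer, whether the steps of the sequence occur in date order. The code below encodes the four transactions from the table. It assumes the dates are month/day/year and lower-cases the item names, and the function names are illustrative.

```python
from datetime import date

# Purchase history per customer: (transaction date, items bought).
history = {
    "AA": [(date(2001, 2, 2), {"laptop", "case"}),
           (date(2002, 1, 13), {"wireless network card", "router"})],
    "BB": [(date(2002, 4, 5), {"laptop", "ipaq"}),
           (date(2002, 8, 10), {"wireless network card", "router"})],
}


def contains_sequence(baskets, sequence):
    """True if the itemsets in `sequence` appear, in order, in successive
    (not necessarily adjacent) transactions of one customer."""
    pos = 0
    for _, items in sorted(baskets, key=lambda b: b[0]):
        if pos < len(sequence) and sequence[pos] <= items:
            pos += 1
    return pos == len(sequence)


def sequence_support(history, sequence):
    """Fraction of customers whose history contains the sequence."""
    hits = sum(contains_sequence(b, sequence) for b in history.values())
    return hits / len(history)


seq = [{"laptop"}, {"wireless network card", "router"}]
print(sequence_support(history, seq))        # 1.0
print(sequence_support(history, seq[::-1]))  # 0.0
```

Support is counted per customer rather than per transaction, which is why a way of identifying customers is a prerequisite for this kind of analysis.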
47
Exercise 1 by hand
Given the list of transactions below, do the following:
1) Find all the frequent itemsets (minimum support 40%)
2) Find all the association rules (minimum confidence 70%)
3) For the discovered association rules, calculate the lift
Transaction No.  Item 1  Item 2     Item 3     Item 4
100              Beer    Diaper     Chocolate
101              Milk    Chocolate  Shampoo
102              Beer    Soap       Vodka
103              Beer    Cheese     Wine
104              Milk    Diaper     Beer       Chocolate
48
RapidMiner Practice
To see:
RapidMiner Tutorial example 2 / 26
To practice:
Do the exercise presented in the tutorial using the file Iris.ioo.
49
Exercise 1 using RapidMiner
Take Beer.xls file and find the association rules
First process the data into the right format (Beer1.xls)
50
RapidMiner Practice
To see:
Training Videos\05 - Akhtar Fareed -RapidMinerTutorial\RapidMiner Tutorial (part 9_9) Association Rules
To practice:
Do the exercises presented in the movie using the file BalanceScale.xls.
51
RapidMiner Practice
Data Preprocessing
  Bank.xls → Bank.ioo
  Save as .ioo format
Process design
  Take a look at the .ioo file and attributes / variables
  Process the attributes using Select Attributes
  Rules can only handle categorical data types
Find association rules
  Use operators: FP-Growth, then Create Association Rules
Association Rules
  Read and interpret the results