Date posted: 09-May-2015
Category: Technology
Uploaded by: 7segments
Automated Data-Mining
www.7segments.com
About me
+ (421) 918 666 238
BANKS
• BCR ERSTE BANK (RO)
• TATRABANKA
• PSS
TELCO & RETAIL
• T-MOBILE
• VODAFONE (CZ)
• SOS ELECTRONIC
• EXISPORT
UTILITIES & OTHER
• SPP
• PIXEL FEDERATION
ABOUT PREDICTIVE ANALYTICS
INTRODUCTION
CRISP DM Methodology
Business Understanding
• Set a business goal and ask a question
– We need to grow sales in the SME segment
– Which SMEs are interested in our offer?
• Transform it into a task for data-mining
– Divide the SME portfolio into 2 groups: will accept offer, will not accept offer
Data-mining starts by asking the right question
Data understanding
[Figure: all SMEs, with those that accepted offer X highlighted; all customers vs. success in last 30 days]
Goal: Target fewer customers and achieve the same results
Data understanding
Imagine a predictive model splitting customers into segments: green, orange, red
Segment            Customers  Share of customers  Accepted offer  Share of accepted
Best customers     10 000     20%                 4 000           80%
Average customers  20 000     40%                 750             15%
Worst customers    20 000     40%                 250             5%
All customers      50 000     100%                5 000           100%
Success in last 30 days over all customers: 10% (5 000 of 50 000)
Target just 20% of customers to get 80% of max profit.
[Figure: segments separated by Rule 1 and Rule 2]
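The arithmetic behind the "20% of customers, 80% of profit" claim can be checked with a short sketch. It uses only the example figures from the slide (10 000 / 20 000 / 20 000 customers and 4 000 / 750 / 250 accepted offers); the dictionary layout and function name are illustrative assumptions.

```python
# Segment sizes and accepted offers, taken from the slide's example figures.
segments = {
    "best": {"customers": 10_000, "accepted": 4_000},
    "average": {"customers": 20_000, "accepted": 750},
    "worst": {"customers": 20_000, "accepted": 250},
}

total_customers = sum(s["customers"] for s in segments.values())
total_accepted = sum(s["accepted"] for s in segments.values())
base_rate = total_accepted / total_customers  # overall response rate


def segment_summary(name):
    """Return response rate, share of all acceptances and lift for one segment."""
    s = segments[name]
    rate = s["accepted"] / s["customers"]
    return {
        "response_rate": rate,
        "share_of_accepted": s["accepted"] / total_accepted,
        "lift": rate / base_rate,  # how much better than targeting everybody
    }


for name in segments:
    print(name, segment_summary(name))
```

Lift here is the segment's response rate divided by the overall response rate, so a lift above 1 means the segment responds better than the portfolio as a whole.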
Data preparation
• To build a good model you need relevant data
Customer Age Location Purchases Target
1 10 East 120 Yes
2 5 East 42 No
3 24 West 23 Yes
4 2 West 50 Yes
5 1 West 19 No
More attributes available = better chance of a good prediction
Modeling: Split buyers and non-buyers
• Logistic Regression
• Linear Regression
• Decision Tree
• Neural Networks
Model Evaluation
• Train the model on historical data
• Test on unseen data
– 50%:50% split
– different history
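One way to read "50%:50%, different history" is a split by time, where the older half of the records trains the model and the newer half tests it. The sketch below assumes that reading; the records, field order and function names are hypothetical.

```python
from datetime import date

# Hypothetical history: (customer_id, snapshot_date, accepted_offer)
history = [
    (1, date(2014, 3, 1), True),
    (2, date(2014, 5, 1), False),
    (3, date(2014, 8, 1), True),
    (4, date(2014, 10, 1), False),
    (5, date(2015, 1, 1), True),
    (6, date(2015, 3, 1), False),
]


def median_date(rows):
    """Pick a cutoff that puts roughly half of the rows on each side."""
    dates = sorted(r[1] for r in rows)
    return dates[len(dates) // 2]


def split_by_time(rows, cutoff):
    """Rows before `cutoff` go to train, rows on or after it go to test."""
    train = [r for r in rows if r[1] < cutoff]
    test = [r for r in rows if r[1] >= cutoff]
    return train, test


cutoff = median_date(history)
train, test = split_by_time(history, cutoff)
print(len(train), len(test))
```

Splitting by time rather than at random keeps the test closer to real use, where the model is always applied to data newer than what it was trained on.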
Model vs. random selection on history
[Chart: x-axis: % of total SME, in groups of 10% from 10% to 100%; y-axis: % of buyers in group; series: model, random]
Model, % of buyers per group: 40%, 25%, 14%, 8%, 4%, 3%, 2%, 2%, 1%, 1%
• A good model moves buyers into the first groups
• A weak model is like random selection: buyers are everywhere
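The comparison with random selection can be made concrete by accumulating the per-group shares from the chart (40%, 25%, 14%, 8%, 4%, 3%, 2%, 2%, 1%, 1%). This is a minimal sketch; the variable and function names are illustrative, and the random baseline assumes buyers are spread evenly, 10% per group.

```python
# Share of all buyers found in each 10% group of SMEs (model series from the chart).
model_share = [40, 25, 14, 8, 4, 3, 2, 2, 1, 1]
# Random selection spreads buyers evenly: 10% of buyers in every 10% group.
random_share = [10] * 10


def cumulative(shares):
    """Running total of buyer share as more groups are targeted."""
    total, out = 0, []
    for s in shares:
        total += s
        out.append(total)
    return out


model_gain = cumulative(model_share)
random_gain = cumulative(random_share)

for i, (m, r) in enumerate(zip(model_gain, random_gain), start=1):
    print(f"top {i * 10:3d}% of SMEs: model {m:3d}% of buyers, "
          f"random {r:3d}%, lift {m / r:.2f}")
```

With these figures, targeting the first three groups reaches 79% of buyers, against 30% for random selection.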
Deployment
• Target the campaign at the best customers
• Usual process:
– Import predictions to CRM
– Select top customers
– Evaluate the campaign after XY days
HOW PREDICTIONS WORK IN 7SEGMENTS
DEMO
Usually it looks like this :-)
IBM SPSS Modeler 16
Smart software can simplify the process
Step of methodology      Level of automation in our solution
Business understanding   Requires user input
Data understanding       Automated & supervised
Data preparation         Automated & supervised
Modeling                 Automated & supervised
Evaluation               Automated & supervised
Deployment               Semi-automated
Use a wizard for task definition or select a pre-defined task
Let the user check results and decide on actions
Who is eligible for prediction?
All customers who bought “Shoes” in the last 3 years
What would you like to predict?
Identify those likely to: make another purchase above 5€ in the next 300 days
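The two wizard questions translate into an eligibility filter and a target label. The sketch below shows one possible translation; the event tuples, the reference date and the function name are hypothetical, "3 years" is approximated as 3 × 365 days, and "above 5€" is read as strictly greater than 5.

```python
from datetime import date, timedelta

# Hypothetical purchase events: (customer_id, date, product, amount_eur)
events = [
    (1, date(2013, 6, 1), "Shoes", 40.0),
    (1, date(2014, 9, 15), "Shirt", 12.0),
    (2, date(2013, 11, 20), "Shoes", 55.0),
    (2, date(2014, 8, 1), "Pen", 3.0),
    (3, date(2012, 2, 1), "Shirt", 20.0),
    (3, date(2014, 7, 10), "Ball", 25.0),
]


def build_target(events, as_of, product="Shoes", lookback_days=3 * 365,
                 horizon_days=300, min_amount=5.0):
    """Return {customer_id: target} for eligible customers only.

    Eligible: bought `product` in the lookback window ending at `as_of`.
    Target:   made any purchase above `min_amount` within `horizon_days` after `as_of`.
    """
    start = as_of - timedelta(days=lookback_days)
    end = as_of + timedelta(days=horizon_days)
    eligible = {c for c, d, p, _ in events if p == product and start <= d <= as_of}
    buyers = {c for c, d, _, amt in events if as_of < d <= end and amt > min_amount}
    return {c: c in buyers for c in sorted(eligible)}


targets = build_target(events, as_of=date(2014, 6, 1))
print(targets)
```

Customer 3 never bought “Shoes”, so it is left out of the prediction task altogether rather than labelled as a non-buyer.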
Data understanding
Check default response rate & predictors
Training decision tree
Model Evaluation and Profitability Analysis
Deployment
Use these rules in your next campaign:
Summary
• Non-coders can understand and manage it
• Works well on any data
• Useful: instant deployment into campaigns delivers immediate value
UNDER COVER
INSIGHTS
Under Cover: Data
• We use available customer data for prediction
• We generate predictors from customer events
Customer Time Event Amount Product
1 Monday Purchase 5 Shirt
1 Tuesday Purchase 10 Shoes
1 Thursday Purchase 20 Ball
1 Saturday Purchase 5 Pen
1 Sunday Purchase 2 Pin
Count of (event.purchase) = 5
Count of (event.purchase) in (last 3 days) = 2
Sum (event.purchase.amount) = 42
Sum (event.purchase.amount) in (last 3 days) = 7
Last / First / Most frequent event.purchase.product = Pin / Shirt / -
….
For every event and all its properties
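The aggregates above can be reproduced from the event table with a few lines of code. This is a minimal sketch, not the product's implementation: the predictor names are made up, "today" is assumed to be Sunday, and "last 3 days" is assumed to mean Friday to Sunday. Note that the amounts in the table (5, 10, 20, 5, 2) add up to 42.

```python
from collections import Counter

# Day names mapped to numbers so that a "last N days" window can be computed.
DAY = {"Monday": 1, "Tuesday": 2, "Wednesday": 3, "Thursday": 4,
       "Friday": 5, "Saturday": 6, "Sunday": 7}

# Event log from the slide (customer 1 only).
events = [
    {"day": "Monday", "event": "purchase", "amount": 5, "product": "Shirt"},
    {"day": "Tuesday", "event": "purchase", "amount": 10, "product": "Shoes"},
    {"day": "Thursday", "event": "purchase", "amount": 20, "product": "Ball"},
    {"day": "Saturday", "event": "purchase", "amount": 5, "product": "Pen"},
    {"day": "Sunday", "event": "purchase", "amount": 2, "product": "Pin"},
]


def purchase_predictors(events, today="Sunday", window_days=3):
    """Aggregate purchase events into predictors (hypothetical names)."""
    purchases = sorted(
        (e for e in events if e["event"] == "purchase"),
        key=lambda e: DAY[e["day"]],
    )
    if not purchases:
        return {"count": 0}
    first_day_in_window = DAY[today] - window_days + 1
    recent = [e for e in purchases if DAY[e["day"]] >= first_day_in_window]
    counts = Counter(e["product"] for e in purchases)
    top_count = max(counts.values())
    leaders = [p for p, n in counts.items() if n == top_count]
    return {
        "count": len(purchases),
        "count_in_window": len(recent),
        "sum_amount": sum(e["amount"] for e in purchases),
        "sum_amount_in_window": sum(e["amount"] for e in recent),
        "last_product": purchases[-1]["product"],
        "first_product": purchases[0]["product"],
        # No single most frequent product when the top count is tied.
        "most_frequent_product": leaders[0] if len(leaders) == 1 else None,
    }


predictors = purchase_predictors(events)
print(predictors)
```

The same function could be applied to any event type and any numeric property, which is the idea behind generating predictors "for every event and all its properties".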
Under Cover: Training & validation
• Automatic building of train, test and actual datasets
– train is from a different time period than test
– actual is the most recent data
• Rules for attribute transformations
– Discretization, coarse classing, removing noise, replacing missing values
• Variation of the CHAID decision tree
– It’s almost non-parametric
– The chi-square test works fine on unbalanced data compared to Gini in CART
– Generate heuristics for stopping criteria
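To illustrate the chi-square criterion that CHAID-style trees rely on, the sketch below computes the Pearson chi-square statistic for two candidate binary splits and picks the stronger one. It is a simplification: real CHAID also merges categories and compares adjusted p-values, and the candidate splits and counts here are invented for the example.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    Rows are the two branches of a candidate split,
    columns are buyers / non-buyers.
    """
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def best_split(candidates):
    """Pick the candidate split with the largest chi-square statistic."""
    return max(candidates, key=lambda name: chi_square_2x2(*candidates[name]))


# Hypothetical candidates:
# (buyers_left, non_buyers_left, buyers_right, non_buyers_right)
candidates = {
    "location == East": (30, 470, 70, 430),
    "purchases >= 50": (80, 320, 20, 580),
}

for name, counts in candidates.items():
    print(name, round(chi_square_2x2(*counts), 2))
print("chosen:", best_split(candidates))
```

Both candidates have one degree of freedom, so comparing the raw statistics gives the same ranking as comparing p-values; with splits of different sizes the p-value would be the fairer measure.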
Under Cover: Deployment
• Real-time scoring
– Calculate only the final predictors, which is an easy task for a single customer
– Apply the rule set to get a prediction
– Keep the scoring for post-evaluation
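The scoring steps can be sketched as an ordered rule set applied to one customer's predictors, with every score logged for later evaluation. The rules, probabilities, predictor names and fallback score below are all hypothetical; a trained tree would supply its own.

```python
# Hypothetical rule set, as a decision tree might export it.
# Rules are checked in order and the first match wins.
RULES = [
    (lambda p: p["purchase_count"] >= 5 and p["amount_last_3_days"] > 0, 0.40),
    (lambda p: p["purchase_count"] >= 2, 0.10),
]
DEFAULT_SCORE = 0.02  # assumed fallback when no rule matches

score_log = []  # kept so predictions can be compared with outcomes later


def score(customer_id, predictors):
    """Apply the rule set to one customer's predictors and log the result."""
    for condition, probability in RULES:
        if condition(predictors):
            break
    else:
        probability = DEFAULT_SCORE
    score_log.append((customer_id, probability))
    return probability


print(score(1, {"purchase_count": 5, "amount_last_3_days": 7}))
```

Because only the predictors that appear in the rules are needed, scoring a single customer stays cheap enough to run at the moment of an event.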
Under Cover: Near Future
• Apply various algorithms
– For better visualization & insights: logistic regression produces a visual scorecard with points
– To reduce the risk of overfitting or low AUC
• Generate fewer and better predictors
– [Average] [profit] in [category=value] in the [last/first] N [periods]
• Identify the right time to retrain models
(Almost) Full List of Features
Campaigns
• Automated & Planned
• Scenario Designer
• Email Designer & Gateway
• SMS Gateway
• Survey Designer
• Social Pusher (FB, Twitter, LinkedIn)
CRM
• Customer List
• Custom Attributes
• Events
• Data Management
• Import Wizard
• Aggregator and Derivator
Analytics
• Trends
• Funnels
• Descriptive Statistics
• Session Length
• Custom Report Designer
• Dashboard Designer
Optimization
• Segmentations
• Predictions
• Product Recommendation Engine
• Market Basket Analysis
• Real-time Customer Scoring
API and SDK for JavaScript, PHP, Python, Flash, Unity (mobile), Linux shell, …
Thank you for your attention!
Call Jozo Kovac (+421) 918 666 238 and schedule live demo!