Automated Data Mining for Everyone

Date post: 09-May-2015
Upload: 7segments
Description:
The biggest challenge in data mining isn't just building the best model. The real challenge is to solve more problems more easily. Take a look at the data-mining solution we've created at 7SEGMENTS: data mining that is easier and more powerful than ever before.
Transcript
Page 1: Automated Data Mining for Everyone

Automated Data-Mining

ALL SME

Anybody Success

Page 2: Automated Data Mining for Everyone

www.7segments.com

About me

+ (421) 918 666 238

BANKS
• BCR ERSTE BANK (RO)
• TATRABANKA
• PSS

TELCO & RETAIL
• T-MOBILE
• VODAFONE (CZ)
• SOS ELECTRONIC
• EXISPORT

UTILITIES & OTHER
• SPP
• PIXEL FEDERATION

Page 3: Automated Data Mining for Everyone

ABOUT PREDICTIVE ANALYTICS: INTRODUCTION

Page 4: Automated Data Mining for Everyone

CRISP DM Methodology

Page 5: Automated Data Mining for Everyone

Business Understanding

• Set a business goal and ask a question
– We need to grow sales in the SME segment
– Which SMEs are interested in our offer?
• Transform it into a data-mining task
– Divide the SME portfolio into 2 groups
  • Will accept offer
  • Will not accept offer

Data-mining starts by asking the right question

Page 6: Automated Data Mining for Everyone

Data understanding

ALL SME

ACCEPTED OFFER X

Goal: Target fewer customers and achieve the same results

All customers Success in last 30 days

Page 7: Automated Data Mining for Everyone

Data understanding

Imagine a predictive model splitting customers into segments: green, orange, red.

Target just 20% of customers to get 80% of max profit.

[Figure: all SME customers split into three segments, with split labels Rule 1 and Rule 2]

Segment            Customers       Accepted offer (share of all accepted)
Best customers     10 000 (20%)    4 000 (80%)
Average customers  20 000 (40%)    750 (15%)
Worst customers    20 000 (40%)    250 (5%)
All customers      50 000 (100%)   5 000 (100%)

Success in last 30 days: 5 000 of 50 000 customers accepted the offer (10%).
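The arithmetic behind the "20% of customers, 80% of profit" claim can be checked with a short sketch. It uses only the segment sizes and accepted-offer counts from the slide; the variable names and the `stats` dictionary are illustrative and are not part of the 7SEGMENTS product.

```python
# (segment, customers, accepted offers) as shown on the slide
SEGMENTS = [
    ("best", 10_000, 4_000),
    ("average", 20_000, 750),
    ("worst", 20_000, 250),
]

total_customers = sum(size for _, size, _ in SEGMENTS)
total_accepted = sum(accepted for _, _, accepted in SEGMENTS)
base_rate = total_accepted / total_customers  # overall response rate

stats = {}
for name, size, accepted in SEGMENTS:
    rate = accepted / size  # response rate inside the segment
    stats[name] = {
        "rate": rate,
        "share": accepted / total_accepted,  # share of all accepted offers
        "lift": rate / base_rate,            # segment rate relative to the average
    }
    print(f"{name:8s} rate={rate:.2%} "
          f"share={stats[name]['share']:.0%} lift={stats[name]['lift']:.2f}")
```

The best segment responds at four times the overall rate, which is why targeting it alone captures 80% of the accepted offers.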

Page 8: Automated Data Mining for Everyone

Data preparation

• To build a good model you need relevant data

Customer  Age  Location  Purchases  Target
1         10   East      120        Yes
2         5    East      42         No
3         24   West      23         Yes
4         2    West      50         Yes
5         1    West      19         No

More attributes available = better chance for good prediction

Page 9: Automated Data Mining for Everyone

Modeling: Split buyers and non-buyers

Logistic Regression
Linear Regression
Decision Tree
Neural Networks

Page 10: Automated Data Mining for Everyone

Model Evaluation

• Train the model on historical data
• Test on unseen data
– 50%:50%
– different history
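One reading of "different history" is that the test set comes from a later time period than the training set. The sketch below shows such a time-based split under that assumption; the rows, the cutoff date and the function name `split_by_time` are made up for illustration.

```python
from datetime import date

def split_by_time(rows, cutoff):
    """Rows dated before `cutoff` go to train, the rest to test."""
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test

# Hypothetical labelled observations; only the date matters for the split.
rows = [
    {"customer": 1, "date": date(2014, 2, 10), "target": "Yes"},
    {"customer": 2, "date": date(2014, 4, 3), "target": "No"},
    {"customer": 3, "date": date(2014, 8, 21), "target": "Yes"},
    {"customer": 4, "date": date(2014, 11, 5), "target": "No"},
]

train, test = split_by_time(rows, cutoff=date(2014, 7, 1))
print(len(train), "train rows,", len(test), "test rows")
```

Splitting by time rather than at random means the model is judged on data it could not have seen when it was trained, which is closer to how it will be used in a campaign.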

Model vs. random selection on history

[Chart: % of buyers in each group (y-axis, 0% to 45%) against % of total SME (x-axis, 10% to 100%); series: model, random]

% of total SME             10%  20%  30%  40%  50%  60%  70%  80%  90%  100%
% buyers in group (model)  40%  25%  14%  8%   4%   3%   2%   2%   1%   1%

• A good model moves buyers into the first groups

• A weak model is like random selection: buyers are everywhere
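The per-group percentages from the chart (40%, 25%, 14%, ...) can be turned into a cumulative gains curve and compared with random selection. This is a minimal sketch: the model values are read off the slide, while the flat 10% per group for random selection is the expected value for a random ordering rather than a number taken from the chart.

```python
# % of all buyers that fall into each 10% group, ranked by model score
MODEL_SHARE = [40, 25, 14, 8, 4, 3, 2, 2, 1, 1]
# Random ordering: each 10% group holds 10% of the buyers in expectation
RANDOM_SHARE = [10] * 10

def cumulative(shares):
    """Running total of the per-group shares."""
    total, out = 0, []
    for share in shares:
        total += share
        out.append(total)
    return out

model_gains = cumulative(MODEL_SHARE)
random_gains = cumulative(RANDOM_SHARE)
lift = [m / r for m, r in zip(model_gains, random_gains)]

for i, (m, r, l) in enumerate(zip(model_gains, random_gains, lift), start=1):
    print(f"top {i * 10:3d}%: model {m:3d}%  random {r:3d}%  lift {l:.2f}")
```

With these numbers the top two groups (20% of SMEs) already contain 65% of the buyers, against 20% for random selection.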

Page 11: Automated Data Mining for Everyone

Deployment

• Target the campaign at the best customers
• Usual process:
– Import predictions to CRM
– Select top customers
– Evaluate the campaign after XY days

Page 12: Automated Data Mining for Everyone

HOW PREDICTIONS WORK IN 7SEGMENTS

DEMO

Page 13: Automated Data Mining for Everyone

Usually it looks like this :-)

IBM SPSS Modeler 16

Page 14: Automated Data Mining for Everyone

Smart software can simplify the process

Step of methodology     Level of automation in our solution
Business understanding  Requires user input
Data understanding      Automated & supervised
Data preparation        Automated & supervised
Modeling                Automated & supervised
Evaluation              Automated & supervised
Deployment              Semi-automated

Use a wizard for task definition or select a pre-defined task

Let the user check results and decide on actions

Page 15: Automated Data Mining for Everyone

Who is eligible for prediction?

For all customers who bought any “Shoes” in the last 3 years

Page 16: Automated Data Mining for Everyone

What would you like to predict?

Identify those likely to: make another purchase above 5 € in the next 300 days
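The two wizard questions (who is eligible, and what to predict) can be expressed as two small functions over a customer's event history. This is a sketch under assumptions: the event fields (`date`, `product`, `amount`), the reference date and the function names are hypothetical, and 3 years is approximated as 3 × 365 days.

```python
from datetime import date, timedelta

def is_eligible(events, ref, product="Shoes", years=3):
    """True if the customer bought `product` in the `years` before `ref`."""
    start = ref - timedelta(days=365 * years)
    return any(e["product"] == product and start <= e["date"] < ref
               for e in events)

def target(events, ref, min_amount=5, horizon_days=300):
    """True if a purchase above `min_amount` occurs within the horizon after `ref`."""
    end = ref + timedelta(days=horizon_days)
    return any(e["amount"] > min_amount and ref <= e["date"] < end
               for e in events)

# Hypothetical history of one customer
events = [
    {"date": date(2013, 5, 1), "product": "Shoes", "amount": 40},
    {"date": date(2014, 9, 15), "product": "Ball", "amount": 20},
]
ref = date(2014, 1, 1)  # point in time at which the prediction is made

print("eligible:", is_eligible(events, ref))
print("target:", target(events, ref))
```

For training, the reference date lies far enough in the past that the 300-day outcome window is already known; for live predictions it is today, and only eligibility can be evaluated.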

Page 17: Automated Data Mining for Everyone

Data understanding

Check default response rate & predictors

Page 18: Automated Data Mining for Everyone

Training decision tree

Page 19: Automated Data Mining for Everyone

Model Evaluation and Profitability Analysis

Page 20: Automated Data Mining for Everyone

Deployment

Use these rules in your next campaign:

Page 21: Automated Data Mining for Everyone

Summary

• Non-coders can understand and manage it

• Works well on any data

• Useful: instant deployment into campaigns delivers immediate value

Page 22: Automated Data Mining for Everyone

UNDER COVER: INSIGHTS

Page 23: Automated Data Mining for Everyone

Under Cover: Data

• We use available customer data for prediction
• We generate predictors from customer events

Customer  Time      Event     Amount  Product
1         Monday    Purchase  5       Shirt
1         Tuesday   Purchase  10      Shoes
1         Thursday  Purchase  20      Ball
1         Saturday  Purchase  5       Pen
1         Sunday    Purchase  2       Pin

Count of (event.purchase) = 5
Count of (event.purchase) in (last 3 days) = 2
Sum (event.purchase.amount) = 42
Sum (event.purchase.amount) in (last 3 days) = 7
Last / First / Most frequent event.purchase.product = Pin / Shirt / -
….

For every event and all its properties
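A minimal sketch of how such predictors could be derived from the event table follows. It is not the 7SEGMENTS implementation: the function name, the dictionary keys, and the reading of "last 3 days" as Friday to Sunday are assumptions. Note that the five amounts in the table (5, 10, 20, 5, 2) add up to 42.

```python
from collections import Counter

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# (day, amount, product) purchase events of customer 1, in chronological order
EVENTS = [
    ("Monday", 5, "Shirt"),
    ("Tuesday", 10, "Shoes"),
    ("Thursday", 20, "Ball"),
    ("Saturday", 5, "Pen"),
    ("Sunday", 2, "Pin"),
]

def build_predictors(events, today="Sunday", window_days=3):
    """Aggregate a non-empty, chronologically ordered event list."""
    today_idx = DAYS.index(today)
    # Events whose day falls inside the last `window_days` days, today included
    recent = [e for e in events
              if 0 <= today_idx - DAYS.index(e[0]) < window_days]
    products = [product for _, _, product in events]
    top_product, top_count = Counter(products).most_common(1)[0]
    return {
        "count": len(events),
        "count_recent": len(recent),
        "sum": sum(amount for _, amount, _ in events),
        "sum_recent": sum(amount for _, amount, _ in recent),
        "first": products[0],
        "last": products[-1],
        # No product repeats in the sample, so there is no "most frequent" one
        "most_frequent": top_product if top_count > 1 else None,
    }

predictors = build_predictors(EVENTS)
for name, value in predictors.items():
    print(f"{name}: {value}")
```

The same aggregations (count, sum, first, last, most frequent, each optionally restricted to a time window) can be repeated for every event type and every property, which is how a short event log expands into many candidate predictors.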

Page 24: Automated Data Mining for Everyone

Under Cover: Training & validation

• Automatic building of train, test and actual datasets
– train is from a different time period than test
– actual is the most recent data
• Rules for attribute transformations
– Discretization, coarse classing, noise removal, missing-value replacement
• Variation of CHAID decision tree
– It’s almost non-parametric
– The chi-square test works well on unbalanced data compared to Gini in CART
– Generate heuristics for stopping criteria
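CHAID chooses splits using the chi-square test of independence between a predictor and the target. The sketch below computes the Pearson chi-square statistic for a candidate split in plain Python; the contingency counts are invented for illustration, and the code shows the test itself, not the decision-tree variation used in 7SEGMENTS.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table.

    `table` is a list of rows, e.g. one row per predictor category and one
    column per target class. Every row and column total must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = Location (East, West), columns = Target (Yes, No)
by_location = [[30, 70], [10, 90]]
# A predictor unrelated to the target: same Yes/No mix in both rows
unrelated = [[20, 80], [20, 80]]

print("location:", chi_square(by_location))
print("unrelated:", chi_square(unrelated))
```

For a 2×2 table (one degree of freedom) the 5% critical value is 3.841, so the first candidate would count as a significant split and the second would not.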

Page 25: Automated Data Mining for Everyone

Under Cover: Deployment

• Real-time scoring
– Calculate only the final predictors, which is an easy task for a single customer
– Apply the rule-set to get a prediction
– Keep scores for post-evaluation
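The scoring steps can be sketched as an ordered list of rules applied to one customer's predictors, with every score logged for later evaluation. The rules, thresholds, predictor names and segment labels below are hypothetical; a real rule-set would come from the trained decision tree.

```python
# Hypothetical rule-set, evaluated top to bottom; the first match wins.
RULES = [
    (lambda p: p["count_recent"] >= 2 and p["sum"] > 40, "best"),
    (lambda p: p["count"] >= 3, "average"),
]
DEFAULT_SEGMENT = "worst"

score_log = []  # kept so predictions can be compared with outcomes later

def score(predictors):
    """Return the segment of the first rule whose condition holds."""
    for condition, segment in RULES:
        if condition(predictors):
            return segment
    return DEFAULT_SEGMENT

def score_and_log(customer_id, predictors):
    """Score one customer and remember the result for post-evaluation."""
    segment = score(predictors)
    score_log.append((customer_id, segment))
    return segment

print(score_and_log(1, {"count": 5, "count_recent": 2, "sum": 42}))
print(score_and_log(2, {"count": 3, "count_recent": 0, "sum": 10}))
print(score_and_log(3, {"count": 1, "count_recent": 0, "sum": 4}))
```

Because only the predictors that appear in the rules need to be computed, scoring a single customer stays cheap enough to run in real time.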

Page 26: Automated Data Mining for Everyone

Under Cover: Near Future

• Apply various algorithms
– For better visualization & insights
  • Logistic regression produces a visual scorecard with points
– To reduce the risk of overfitting or low AUC
• Generate fewer and better predictors
– [Average] [profit] in [category=value] in the [last/first] N [periods]
• Identify the right time to retrain models

Page 27: Automated Data Mining for Everyone

(Almost) Full List of Features

Campaigns
• Automated & Planned
• Scenario Designer
• Email Designer & Gateway
• SMS Gateway
• Survey Designer
• Social Pusher (FB, Twitter, LinkedIn)

CRM
• Customer List
• Custom Attributes
• Events
• Data Management
• Import Wizard
• Aggregator and Derivator

Analytics
• Trends
• Funnels
• Descriptive Statistics
• Session Length
• Custom Report Designer
• Dashboard Designer

Optimization
• Segmentations
• Predictions
• Product Recommendation Engine
• Market Basket Analysis
• Real-time Customer Scoring

API and SDK for JavaScript, PHP, Python, Flash, Unity (mobile), Linux shell, …

Page 28: Automated Data Mining for Everyone

Thank you for your attention!

Call Jozo Kovac (+421) 918 666 238 and schedule a live demo!

www.7segments.com

+ (421) 918 666 238

