Panel: Driving Factors for Prospective Sectors
Panelists: Hristo Hadjitchonev, Angel Marchev Jr., Nikola Toshev, Emil Ivanov, Sergi Sergiev
Moderated by: Angel Mitev
ROI of Data Science Projects
▪ Improve Revenue
▪ Reduce Cost
▪ Optimize Performance
a4everyone.com
A4E Case Studies - Nedelya
Real-time forecasting of demand
Automation of the supply-demand chain decision process, covering production facility and distribution
Analytics for location and marketing performance
Waste reduced to ~2% (industry average: 7%)
Solution delivered via A4E's proprietary cloud-based analytics platform
Usage of big data – enriched weather data (historical and forecast)
MRR
Industry: Pastry and cakes, retail and production
Size: 37+ retail locations
Revenue: € 9M (2016)
Web: http://nedelya.com
A4E Case Studies – Coca Cola
Analysis of past marketing performance
Geo targeting of marketing Campaign (300,000 households)
New product marketing - 750ml glass pack
20% better performance compared to the previous campaigns
Big Data utilization – rent per sq.m., income per district, NIS public data (age, gender, households, etc. distributions)
Project delivered based on proprietary modeling algorithms
ARR
Industry: Beverages
Size: 10000 +
Web: http://www.coca-cola.com
A4E Case Studies – Sport Depot
Analysis of historical sales data
Model and forecast of next winter season demand
Supply order recommendations for 23 stores and distribution based on:
• Sport
• Gender
• Color
• Size
• Pricing category
Project delivery based on proprietary forecasting algorithms
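The forecasting algorithms in this case study are proprietary and not described here. As a stand-in only, the sketch below shows the general shape of an order recommendation keyed by store and the five attributes listed above: it sums last season's units per key and scales them by a growth factor. The sample rows, the `growth` parameter, and the function name are all hypothetical.

```python
from collections import defaultdict

# Each record: (store, sport, gender, color, size, price_category, units_sold).
# Made-up illustration data, not taken from the case study.
last_winter = [
    ("store_01", "ski", "women", "red", "M", "mid", 40),
    ("store_01", "ski", "women", "red", "M", "mid", 20),
    ("store_02", "ski", "men", "black", "L", "premium", 10),
]


def recommend_orders(sales, growth=1.0):
    """Naive baseline: last season's units per key, scaled by `growth`."""
    totals = defaultdict(int)
    for *key, units in sales:
        totals[tuple(key)] += units
    # Round to whole units, since orders are placed in whole items.
    return {key: round(total * growth) for key, total in totals.items()}


orders = recommend_orders(last_winter, growth=1.1)
for key, units in sorted(orders.items()):
    print(key, units)
```

A real forecast would replace the single growth factor with a model per key; the grouping step stays the same.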
Project based and ARR
Industry: Sporting goods retail
Brands: 60+
Locations: 21
Revenue: €35M
Web: sportdepot.bg
A4E Case Studies – Credissimo
Credit Score as a Service
Automated scorecard updated on a biweekly basis
5 seconds response time per request
Integration of the business rules
80% automated decisions
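A minimal sketch of how a scorecard and business rules might combine into automated decisions, with only the uncertain cases sent to a person. The thresholds, the age rule, and all names are hypothetical; the actual scorecard and rules are not described in the slides.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
APPROVE_AT = 0.80
REJECT_BELOW = 0.40
MIN_AGE = 18  # example business rule, not from the source


@dataclass
class Application:
    age: int
    score: float  # scorecard output in [0, 1]; higher means lower risk


def decide(app: Application) -> str:
    """Return 'approve', 'reject', or 'manual_review'."""
    # Business rules run first and can short-circuit the score.
    if app.age < MIN_AGE:
        return "reject"
    if app.score >= APPROVE_AT:
        return "approve"
    if app.score < REJECT_BELOW:
        return "reject"
    # Only the uncertain middle band goes to a human underwriter.
    return "manual_review"


def automation_rate(apps: list[Application]) -> float:
    """Share of applications decided without manual review."""
    decisions = [decide(a) for a in apps]
    return sum(d != "manual_review" for d in decisions) / len(decisions)
```

Widening or narrowing the band between the two thresholds is what moves the automation rate up or down.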
Industry: Financial Services
Users: 2M +
Markets: Bulgaria, Macedonia, Poland
Revenue: € 11M
Web: credissimo.bg
Project “Next best offer”, DSK Bank
(Data mining driven sales)
Essence:
Computing individual probabilities of buying a credit product, using statistical methods, to extract
maximum sales potential from the bank's own customer data.
Target group:
Retail banking
Goals:
1. Discover implicit models (outside of the normal business logic) and better understand the motives for
buying a credit product;
2. Generate leads with high propensity to buy at the individual level;
3. Gain a deeper understanding of the (groups of) factors that determine the propensity to buy;
4. Target bigger groups (“nuggets”) of customers with high propensity to buy.
Business understanding
Data understanding & preparation
Modelling
Evaluation & implementation
Target variable:
Applied for <personal> loans
Predictor variables:
3965 features, describing products, demographics, and behaviors
3 216 831 customers
Main challenges:
Unstructured data, which required a large amount of additional work to structure and merge
Very sparse data (many missing values, which led to dropping cases and/or variables)
Data prep:
Reduction of the 3965 features to the 413 most significant features and factors that increase
propensity to buy
General data cleansing
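The slides do not say how the feature reduction was carried out. As one illustrative first step only, the sketch below screens out features that are missing for too many customers, which is the sparsity problem named under the main challenges. The `max_missing` threshold, the field names, and the sample rows are hypothetical.

```python
def missing_share(rows, feature):
    """Fraction of rows where `feature` is absent or None."""
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def drop_sparse_features(rows, features, max_missing=0.5):
    """Keep features whose missing share is at most `max_missing`."""
    return [f for f in features if missing_share(rows, f) <= max_missing]


# Made-up customer records for illustration.
customers = [
    {"age": 34, "has_card": 1, "last_branch_visit": None},
    {"age": 51, "has_card": 0},
    {"age": None, "has_card": 1, "last_branch_visit": None},
    {"age": 28, "has_card": 1, "last_branch_visit": 12},
]

kept = drop_sparse_features(customers, ["age", "has_card", "last_branch_visit"])
print(kept)
```

Selecting the most significant of the remaining features would be a separate, model-driven step.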
Best Method:
CHAID – Chi-squared Automatic Interaction Detection
(Gordon Kass, 1980):
Handles numeric and categorical variables
Uses cases with missing values
Chi-squared procedure that allows splits into more than two branches
The model with best relations among the nodes
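To make the chi-squared idea concrete, here is a minimal sketch of the statistic CHAID relies on when judging a candidate split: a contingency table of predictor category against target class, and the Pearson chi-squared value for it. This is not a full CHAID implementation (category merging and adjusted p-values are omitted), and the sample data is made up.

```python
from collections import Counter


def contingency(values, targets):
    """Rows are predictor categories, columns are target classes."""
    counts = Counter(zip(values, targets))
    categories = sorted(set(values))
    classes = sorted(set(targets))
    return [[counts[(c, t)] for t in classes] for c in categories]


def chi_squared(table):
    """Pearson chi-squared statistic; assumes no empty row or column."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical predictor with three categories, so a three-way split.
segment = ["A"] * 10 + ["B"] * 10 + ["C"] * 10
applied = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 5 + [1] * 2 + [0] * 8

table = contingency(segment, applied)
print(table, chi_squared(table))
```

A larger statistic for the same table shape means the categories differ more strongly in their share of applicants.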
Takeaways:
• Big data needs strong computing power (hours of calculation)
• The model needs to be updated periodically
Evaluation:
Accuracy is high: 76.8% for actual/predicted
customers and 81.3% overall
Area under curve (AUC) = 0.865
Main outputs:
Individual Raw Propensity (RP) score – to be
used for identifying prospective (groups of)
customers
Ruleset for generation of leads towards groups of
customers and for further analyses.
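For reference, the two evaluation metrics above can be computed from propensity scores as sketched below. The labels, scores, and the 0.5 cut-off are made up for illustration; AUC is computed here as the probability that a randomly chosen buyer receives a higher score than a randomly chosen non-buyer.

```python
def accuracy(labels, scores, threshold=0.5):
    """Share of cases where the thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = applied for a loan) and propensity scores.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(accuracy(labels, scores), auc(labels, scores))
```

The pairwise form is quadratic in the number of cases, so it suits a small check like this one rather than millions of customers.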
Comparison by efficiency: two campaigns of the same length, using leads from:
- data mining results from the model
- business rules regularly practiced at the bank
Campaign                 Credit applications  Credit deals signed  Contacted leads  Applications to contacts (%)  Deals to contacts (%)
Leads by data mining     288                  148                  9674             3.0                           1.5
Leads by business rules  201                  166                  13487            1.5                           1.2
Increased efficiency                                                                99.8%                         24.3%
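The efficiency figures can be reproduced from the raw counts. The sketch below assumes that "increased efficiency" is the relative change in the unrounded conversion rates; this reading matches the reported 99.8% and 24.3%, whereas the rounded rates (3.0 vs 1.5) would suggest exactly 100%.

```python
campaigns = {
    "data_mining": {"applications": 288, "deals": 148, "contacted": 9674},
    "business_rules": {"applications": 201, "deals": 166, "contacted": 13487},
}


def rate(part, whole):
    """Conversion rate in percent."""
    return 100.0 * part / whole


def uplift(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new / old - 1.0)


dm = campaigns["data_mining"]
br = campaigns["business_rules"]

app_uplift = uplift(rate(dm["applications"], dm["contacted"]),
                    rate(br["applications"], br["contacted"]))
deal_uplift = uplift(rate(dm["deals"], dm["contacted"]),
                     rate(br["deals"], br["contacted"]))

print(round(app_uplift, 1), round(deal_uplift, 1))  # 99.8 24.3
```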
[Diagram: Hotel Chain (…) and Travel Agent (…)]
Look to book
KPIs
Goal                                                              KPI             Target
Shield suppliers from high transaction volume                     Utilisation     70-80%
                                                                  Look to Book    4,000
Protect booking opportunities by maintaining high cache accuracy  Accuracy        95%
                                                                  Booking Error   <1%
Enable certain channel access to rich rate diversity              Booking Growth  10%
Drive low average response times by answering from cache          Response time   <300ms
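A small sketch of how some of these KPIs might be computed from raw counts. The definitions are assumptions, since the slides do not define them: look-to-book as search requests per booking, accuracy as the share of cache answers that were still correct, and booking error as the share of bookings that failed. All counts are made up.

```python
def look_to_book(searches, bookings):
    """Search requests per confirmed booking (assumed definition)."""
    return searches / bookings


def percentage(part, whole):
    return 100.0 * part / whole


# Made-up daily counts for illustration only.
searches, bookings, failed_bookings = 2_000_000, 500, 4
cache_answers, correct_cache_answers = 10_000, 9_600
avg_response_ms = 180

report = {
    "look_to_book": look_to_book(searches, bookings),
    "accuracy_pct": percentage(correct_cache_answers, cache_answers),
    "booking_error_pct": percentage(failed_bookings, bookings),
    "response_ms": avg_response_ms,
}

# Targets taken from the KPI table above.
meets_targets = {
    "accuracy_pct": report["accuracy_pct"] >= 95,
    "booking_error_pct": report["booking_error_pct"] < 1,
    "response_ms": report["response_ms"] < 300,
}
print(report, meets_targets)
```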
Concept
● Build a cache based on machine learning to improve performance (KPIs):
○ Recognize patterns in history:
■ Availability
■ Frequency of rate change
■ Revenue management hierarchy
○ Predict search requests
○ Infer expiration time
● Use historical transaction data
Solution
● No proactive search requests
● Estimate validity, not expiration time
● Travel-specific feature engineering to capture the right correlations
● New cache structure
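The solution points above can be sketched as a cache that asks a model whether an entry is still valid instead of checking a fixed expiry, and that contacts the supplier only when a real request cannot be answered. Everything here is hypothetical: the class, the `min_validity` threshold, and especially `toy_validity`, which is a simple stand-in for a trained model.

```python
import time
from dataclasses import dataclass


@dataclass
class CacheEntry:
    response: dict
    stored_at: float  # seconds since the epoch


def toy_validity(request, entry, age_h):
    """Stand-in for a trained model: validity fades as the entry ages
    relative to the lead time. Purely illustrative."""
    return max(0.0, 1.0 - age_h / request["lead_time_h"])


class SmartCache:
    def __init__(self, validity_model, fetch, min_validity=0.95, clock=time.time):
        self._validity_model = validity_model
        self._fetch = fetch  # hypothetical callable that queries the supplier
        self._min_validity = min_validity
        self._clock = clock
        self._entries = {}

    def get(self, key, request):
        """Return (response, source); source is 'cache' or 'supplier'."""
        entry = self._entries.get(key)
        if entry is not None:
            age_h = (self._clock() - entry.stored_at) / 3600
            if self._validity_model(request, entry, age_h) >= self._min_validity:
                return entry.response, "cache"
        # No proactive refresh: the supplier is queried only for a real request.
        response = self._fetch(request)
        self._entries[key] = CacheEntry(response, self._clock())
        return response, "supplier"
```

Passing the clock and the fetch function in as parameters keeps the sketch testable without a live supplier.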
Smart Cache
Feature Engineering
Lead time to arrival day  Cache entry age  Booking spike in area  Hotel type  All key request fields  All cached response fields  Weather prediction  ...  Is cache entry valid?
200h                      12.5h            3σ                     Airport     ...                     ...                         ...                 ...  No
48h                       1h               -0.5σ                  Resort      ...                     ...                         ...                 ...  Yes
...                       ...              0.2σ                   City        ...                     ...                         ...                 ...  ...
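A sketch of how the first few features in the table might be derived. It assumes the booking spike is a z-score, that is, recent bookings in the area measured in standard deviations from the historical mean; the σ units suggest this, but the slides do not confirm it. The function names and sample values are hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev


def hours_between(earlier, later):
    return (later - earlier).total_seconds() / 3600


def booking_spike(recent_bookings, history):
    """Deviation of recent bookings from the historical mean, in sigmas."""
    sigma = pstdev(history)
    return (recent_bookings - mean(history)) / sigma if sigma else 0.0


def features(now, arrival, cached_at, recent_bookings, history, hotel_type):
    return {
        "lead_time_h": hours_between(now, arrival),
        "cache_entry_age_h": hours_between(cached_at, now),
        "booking_spike_sigma": booking_spike(recent_bookings, history),
        "hotel_type": hotel_type,
    }


# Made-up request resembling the second table row.
row = features(
    now=datetime(2017, 5, 1, 12, 0),
    arrival=datetime(2017, 5, 3, 12, 0),
    cached_at=datetime(2017, 5, 1, 11, 0),
    recent_bookings=9,
    history=[8, 12, 8, 12],
    hotel_type="Resort",
)
print(row)
```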
Online data vs offline data?
Retail Statistics
91% of all purchases happen in physical stores *
20% of purchases are tracked by loyalty cards
71% of customers’ data is not collected and utilized
>90% of shoppers use their mobile while shopping **
Sources: * RetailNext, ** SessionM
Technology
3G / 4G, Bluetooth, WiFi
50-60% Link Online, Unique key
Patterns Behaviour Mobile App Cashier Free WiFi
Loyalty cards, Bills, CRM
Customer Behavior Analytics
Add-on for every brick and mortar
ShopUp
WiFi Data
Mobile Applications
Door Counters
Weather Databases
Combine
Sales Databases
Analyze
Traffic
Retention
Cross-Shopping
Customer Behaviour Customer Flow
Venue Analytics Heatmap
Predict
Profiling
Customer basket
Customer cross-products
Free loyalty card
Traffic
Cross-Sell, Up-Sell
Recommendation engine
Preferences
Event recommendation
Notifications
Outside
Products Promotions
Shifts
Maintenance
Dwell time
Cannibalization
Customer
understanding
(revenue)
Employees
feedback
(performance)
Merchandising
(performance)
Hot and cold zones,
early notifications and
alerts
Cross- and up-sell
mechanics
(revenue)
Holistic view
Optimized shifts, KPI
and realistic metrics