Three Ways to Fail Your Data Lab Implementation
Dataiku DSS
Data Labs
10 M€ in 2014
121 499 M€ in 2014
3 029 M€ in 2015
5 454 M€ in 2014
816 M€ in 2014
10 M€ in 2008
Marketing / Web
✓ Behavioral segmentation
✓ Churn prediction
✓ Sales forecast
✓ Dynamic pricing
Industry & Infrastructure
✓ Predictive maintenance
✓ Logistics optimization
✓ Smart cities
Bank & Insurance
✓ Fraud detection
✓ Risk anticipation
✓ Lifetime moment detection
Why a Data Lab?
• One single workflow: from a segmented workflow to a transversal one
• Several use cases: ability to address many different data-centric topics within a single unit
• Multiple competences: a business-focused approach mixing many different competences
• End-to-end projects: combining data from different sources to handle several aspects of a single topic
Deployment of the predictions
Dataiku DSS for fraud prediction
Client service
Sensor data
Garage data
Administration
• 1 Project Owner (IT)
• 1 Project Manager (Business)
• 1 in-house data scientist
• 3 data scientists from 3 different firms
• 3 consultants from 3 different firms
• 1 architect (external)
Accepted file
INVESTIGATE!
Transactions are blocked depending on how far they deviate from the business rules and behavioral patterns
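The decision logic described above could be sketched as follows. This is a minimal illustration, not the logic of the actual project: the `Transaction` fields, the two rules, the thresholds, and the function names are all hypothetical. It only shows how a gap from business rules and a gap from usual behavior can be combined into an accept / investigate / block decision.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float
    country: str
    usual_amount: float  # hypothetical: the customer's typical amount, assumed > 0


def rule_gap(tx, max_amount=1000.0, allowed_countries=("FR", "DE")):
    """Gap from (hypothetical) business rules: 0.0 when every rule passes."""
    gap = 0.0
    if tx.amount > max_amount:
        # Relative overshoot above the amount ceiling.
        gap += (tx.amount - max_amount) / max_amount
    if tx.country not in allowed_countries:
        gap += 1.0
    return gap


def behavior_gap(tx):
    """Relative deviation from the customer's usual amount."""
    return abs(tx.amount - tx.usual_amount) / tx.usual_amount


def decide(tx, investigate_at=0.5, block_at=1.5):
    """Map the combined gap to a decision; thresholds are illustrative only."""
    gap = rule_gap(tx) + behavior_gap(tx)
    if gap >= block_at:
        return "block"
    if gap >= investigate_at:
        return "investigate"
    return "accept"


if __name__ == "__main__":
    for tx in (
        Transaction(50.0, "FR", 45.0),
        Transaction(80.0, "FR", 45.0),
        Transaction(5000.0, "US", 100.0),
    ):
        print(tx, decide(tx))
```

In a real deployment the behavioral gap would more likely come from a trained model score than from a single ratio; the ratio is used here only to keep the sketch self-contained.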
Welcome to Technoslavia!
Focus on the framework, not on the input
Data Acquisition & Understanding
Data Preparation
Model Creation
Evaluation
Deployment
Scored dataset
Iteration 1
Iteration 2
Iteration n
✓ Read and import raw data
✓ Detect schemas and structure
✓ Analyze distributions
✓ Assess quality: outliers, missing values...
✓ Performance metrics
✓ Robustness & generalization (cross-validation)
✓ Insights (e.g. variable importance)
✓ Create derived and aggregated variables
→ Analytical dataset
→ Report
✓ Feature selection
✓ Compare algorithms
✓ Scoring engine
✓ Publish predictions
✓ Monitor performance
✓ API
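Two of the checklist items above, quality assessment (outliers, missing values) and cross-validation, can be illustrated with a short standard-library sketch. The function names are hypothetical, and the 1.5 × IQR outlier rule and the interleaved fold assignment are assumptions chosen for brevity; the slides do not prescribe a particular method.

```python
import statistics


def assess_quality(values):
    """Count missing values (None) and flag outliers with the 1.5 * IQR rule."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]
    return {"missing": len(values) - len(present), "outliers": outliers}


def k_fold_indices(n_rows, k):
    """Yield (train, test) row-index lists for k-fold cross-validation.

    Rows are assigned to folds in an interleaved way (row i goes to fold i % k);
    each fold serves as the test set exactly once.
    """
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test


if __name__ == "__main__":
    print(assess_quality([10, 12, 11, None, 13, 12, 95, 11]))
    for train, test in k_fold_indices(6, 3):
        print("train:", train, "test:", test)
```

Averaging a performance metric over the test folds gives a rough view of how well a model generalizes, which is what the "Robustness & generalization" item refers to.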
Business Understanding
Adapted from the CRISP-DM methodology
Dataset 1
Dataset 2
Dataset n
People and Governance
Polyglot vs. dictator?
Problems:
• Collaboration between technical and non-technical profiles inside a single project
• Necessary collaboration between business and tech teams to address transversal projects accurately
Focus:
• Promote diversity
• …within a workflow-centric environment
End to end, from prototyping into production
Do it your way…
…and scale!
Data Lab Organisation
Data Lab
Lab Environment
Multidisciplinary Team:
Direction / Project Management
Business Analysts
Data Miners / Data Scientists
Production Environment
Business needs
Internal Data sources
External data sources
Missions:
Prioritisation of the business needs
Prototyping / Agile solution engineering
Support for Apps deployment
Business Applications
Marketing Campaign Automation
Web analytics reporting
Data as a Service Platform
Design of “DATA PRODUCTS”
Integration of Data Products
Optimisation Engine
Real Time Scoring
Data Flow
Insights & Services
Processing chain
API Deployment
Thank you!