Abstract
We are now in the era of "competing on analytics" (also known as data science). Unfortunately, this is mostly understood as predictive modelling. We would like to show that data science is much more than that. We present the general architecture of a data science solution, using selected case studies from our projects.
Copyright (c) WLOG Solutions 2
Motivating example
Customer
• I want to improve demand forecasting model(s).
Me
• What do you need the forecasts for?
Customer
• To allocate resources optimally.
…
• …
Data → Forecasts → Resource allocation
Stages of the whole decision process
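The three stages can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: the site names, the per-unit values, the moving-average forecast and the greedy allocation rule are all hypothetical placeholders for whatever models a real project would use.

```python
from statistics import mean


def forecast_demand(history, window=3):
    """Naive forecast (assumption): mean of the last `window` observations."""
    return mean(history[-window:])


def allocate(forecasts, unit_value, capacity):
    """Greedy allocation (assumption): serve the most valuable demand first,
    never granting more than the forecast or the remaining capacity."""
    allocation = {site: 0.0 for site in forecasts}
    remaining = capacity
    for site in sorted(forecasts, key=lambda s: unit_value[s], reverse=True):
        granted = min(forecasts[site], remaining)
        allocation[site] = granted
        remaining -= granted
    return allocation


# Stage 1: data (hypothetical demand history per site)
history = {"north": [10, 12, 14], "south": [30, 28, 32], "east": [5, 5, 8]}
unit_value = {"north": 3.0, "south": 1.0, "east": 2.0}

# Stage 2: forecasts
forecasts = {site: forecast_demand(obs) for site, obs in history.items()}

# Stage 3: resource allocation under a limited capacity
plan = allocate(forecasts, unit_value, capacity=40)
```

The point of the sketch is that the forecast is an intermediate product: the customer's actual question is answered only by `plan`.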
Why is it important?
• Companies want to compete on analytics at every business level, which means:
• predictive analytics is not enough,
• optimization is a must to generate optimal recommendations at the lowest business levels,
• simulation is a must to understand the influence of unpredictable factors,
• the analytical toolbox must be flexible to minimize time-to-market.
What can one gain?
• More precise recommendations lead to better decisions.
• An automated decision-making process leads to controllable costs and a safer business.
• Less expert guessing leads to a healthier business.
• Tackling problems in a general way minimizes time-to-market.
Analytics without loss of generality
Data analysis
What are the facts?
How can we use them for our organisation?
How do the facts support or contradict our expert knowledge?
Optimization
What is the optimal course of action?
How far are my decisions from being optimal?
Simulation
What are the possible future scenarios?
How can I measure the risks that our organisation is exposed to?
What are the main requirements for an analytical framework?
Flexible
• Can tackle any business problem
Accessible
• Can produce a prototype quickly and cheaply
Scalable
• Can scale out by adding more machines
Efficient
• Can produce good-quality models fast
WLOG* Analytics architecture™ (1)
*WLOG = Without Loss of Generality
WLOG* Analytics architecture™ (2)
Flexible
• R
• Python
Accessible
• Open-source where possible
Scalable
• Spark
• Cloud
Efficient
• Selected and tested libraries
• Java if needed
Our toolbox (2)
ETL
• Spark
• R
• Python
Predictive modelling
• R
• H2O
• MXNet
• XGBoost
Optimization
• COIN-OR
• ECLiPSe
• Choco
• Gecode
• Java (!)
Simulation
• R
• MASON
• Spark
• Julia (testing)
Visualization
• Python
• JavaScript
What did we get as a result?
A platform to tackle almost any business problem that is:
• flexible,
• productive,
• scalable,
• and offers a great price-to-quality ratio.
Case studies
• Cash optimization in Deutsche Bank
• Midterm Energy price simulation
Cash optimization in Deutsche Bank (1)
[Diagram: cash flows between the central bank, a group of cooperating banks, vaults, branches, ATMs, corporate customers and retail customers. Flow labels: deposit or withdraw, buy or sell, closed payment, cash transfer.]
Cash optimization in Deutsche Bank (2)
ETL
• End-of-day process
• Current balances
Demand forecasting
• More than 800 forecasts
• Done in R
Cash movement recommendations
• Done in CBC (the COIN-OR branch-and-cut solver)
• Workflow in R
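To give a feel for the recommendation step, here is a toy sketch for a single vault. It is not the model used in the project: instead of a mixed-integer program solved with CBC, it brute-forces the smallest set of deposit transport days that keeps the end-of-day balance within a holding limit. The inflow figures, the limit and the rule that a transport empties the vault down to `floor` are all assumptions made for illustration.

```python
from itertools import combinations


def is_feasible(inflows, transport_days, start, floor, limit):
    """Return True if the end-of-day balance never exceeds `limit`."""
    balance = start
    for day, inflow in enumerate(inflows):
        balance += inflow
        if day in transport_days:
            balance = floor  # assumption: a transport removes all surplus cash
        if balance > limit:
            return False
    return True


def fewest_transports(inflows, start=0, floor=0, limit=100):
    """Try schedules with 0, 1, 2, ... transports and return the first
    feasible one, which is therefore minimal in the number of transports."""
    days = range(len(inflows))
    for k in range(len(inflows) + 1):
        for chosen in combinations(days, k):
            if is_feasible(inflows, set(chosen), start, floor, limit):
                return list(chosen)
    return None


# Hypothetical forecast of daily net cash inflow into the vault
forecast_inflows = [40, 30, 50, 20, 60, 10, 45, 35]
schedule = fewest_transports(forecast_inflows, limit=100)
print("Transport on days:", schedule)
```

Exhaustive search is only workable for a handful of days and one vault. With hundreds of forecasts and many linked locations, the same question has to be posed as an optimization model and handed to a solver, which is the role CBC plays on the slide above.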
Example
Before optimization:
Time: 3 months
Deposit transports: 20
Withdrawal transports: 0
After optimization:
Time: 3 months
Deposit transports: 3
Withdrawal transports: 0
Midterm Energy price simulation (1)
Statistical model for SPOT prices
Marginal costs fundamental model
Balancing market fundamental model
Demand model
Supply model
• Complex model consisting of 5 submodels.
• Simulation is done on model parameters:
• Temperature
• Wind
• Regulation
• Fuel prices (e.g. coal)
Example
9 scenarios (3 temperature levels × 3 wind levels):
1. Temperature: low, medium, high
2. Wind: weak, medium, strong
Monte Carlo simulations:
1. Temperature & wind
2. Peak probabilities
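The two simulation modes can be illustrated with a small sketch. Everything numeric here is an assumption: `toy_price` is a hypothetical stand-in for the five-submodel price model, and the temperature and wind levels and distributions are invented for the example. Only the structure (a 3 × 3 scenario grid, then random draws of temperature and wind to estimate a peak probability) follows the slide.

```python
import random
from itertools import product

# Illustrative levels (assumptions): degrees Celsius and metres per second
TEMPERATURE = {"low": -5.0, "medium": 10.0, "high": 25.0}
WIND = {"weak": 2.0, "medium": 6.0, "strong": 12.0}


def toy_price(temperature, wind):
    """Hypothetical price model: colder and calmer weather means higher prices."""
    heating_demand = max(0.0, 18.0 - temperature)
    return 40.0 + 1.5 * heating_demand - 2.0 * wind


# Scenario analysis: every combination of temperature and wind level
scenarios = {
    (t_name, w_name): toy_price(t, w)
    for (t_name, t), (w_name, w) in product(TEMPERATURE.items(), WIND.items())
}


def peak_probability(threshold, n_draws=10_000, seed=42):
    """Monte Carlo estimate of the probability that the price exceeds `threshold`."""
    rng = random.Random(seed)
    peaks = 0
    for _ in range(n_draws):
        temperature = rng.gauss(10.0, 8.0)      # assumed distribution
        wind = max(0.0, rng.gauss(6.0, 3.0))    # assumed distribution, no negative wind
        if toy_price(temperature, wind) > threshold:
            peaks += 1
    return peaks / n_draws


for (t_name, w_name), price in scenarios.items():
    print(f"temperature={t_name:6s} wind={w_name:6s} price={price:6.1f}")
print("Estimated peak probability:", peak_probability(threshold=60.0))
```

The scenario grid answers "what happens if", while the Monte Carlo run attaches a probability to an outcome of interest, here a price peak above a chosen threshold.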
Key points
• Prediction is just one step towards making an (optimal) decision.
• Data science is expected to support any business decision.
• One should have tools covering three aspects: predictive, optimization and simulation models.
• Open source gives us flexibility and a great price-to-quality ratio.
Wit Jakuczun, PhD
CEO
Email: [email protected]
Mobile: +48 601 820 620
Skype: jakuczun
WWW: http://www.wlogsolutions.com/en
Thank you!