Date post: | 16-Apr-2017 |
50 AVENUE DES CHAMPS-ÉLYSÉES 75008 PARIS > FRANCE > WWW.OCTO.COM
HADOOP SUMMIT 2016 - DUBLIN
PRACTICAL ADVICE TO BUILD A DATA DRIVEN COMPANY
Simon MABY (@simonmaby)
OCTO TECHNOLOGY > THERE IS A BETTER WAY
Story : Data Driven E-Commerce
Continuous improvement of all business processes through smart use of data: all the time, everywhere, and for every purpose
BEING DATA DRIVEN IS BEING LEAN
IDEA -> BUILD -> CODE -> MEASURE -> DATA -> LEARN -> IDEA
REQUIREMENTS
IDEA: Business must be aware of opportunities to use algorithms
CODE: Data science projects should have the lowest possible time to market
DATA: Data must be easily accessible
DATA
DATA: Data must be easily accessible
Your Datalake is a service to your company. It should be managed like a startup.
Your employees are your first clients. The more they use it, the more Data Driven you are.
FOCUS ON USABILITY OVER ARCHITECTURE
Services
Datalake
Datalake Team: Ops, Devs, Designers
End Users and projects
Design services for usability and grant support
Gather requirements and usage metrics
FOCUS ON USABILITY OVER ARCHITECTURE : EXAMPLES
How simple is it to share data with other projects?
How simple is it to subscribe to a data feed?
Is it possible to run a full-text search on available datasets?
Is it possible to ask other projects for details about their data through a social network?
Is there auto-completion based on SQL queries from other projects?
Bookmarking, sharing, upvoting datasets, tagging metadata…
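A few of these usability questions (search, tagging, upvoting) can be sketched as a tiny in-memory dataset catalog. This is a hypothetical illustration only: the `Dataset` and `Catalog` names, their fields, and the ranking rule are assumptions, not something described in the talk or taken from a specific datalake product.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical catalog entry describing one dataset in the datalake."""
    name: str
    owner: str
    description: str
    tags: set = field(default_factory=set)
    upvotes: int = 0


class Catalog:
    """Minimal in-memory catalog: register, tag, upvote, and search datasets."""

    def __init__(self):
        self._datasets = {}

    def register(self, dataset):
        self._datasets[dataset.name] = dataset

    def tag(self, name, tag):
        self._datasets[name].tags.add(tag.lower())

    def upvote(self, name):
        self._datasets[name].upvotes += 1

    def search(self, query):
        """Return datasets matching every query word, most upvoted first."""
        words = query.lower().split()
        hits = []
        for ds in self._datasets.values():
            # Search over name, description, and tags (assumed searchable fields).
            haystack = " ".join([ds.name, ds.description, *sorted(ds.tags)]).lower()
            if all(word in haystack for word in words):
                hits.append(ds)
        return sorted(hits, key=lambda ds: (-ds.upvotes, ds.name))


catalog = Catalog()
catalog.register(Dataset("web_orders", "team-checkout", "Orders placed on the e-commerce site"))
catalog.register(Dataset("store_orders", "team-retail", "Orders placed in physical stores"))
catalog.tag("web_orders", "Sales")
catalog.upvote("store_orders")

print([ds.name for ds in catalog.search("orders")])
print([ds.name for ds in catalog.search("sales")])
```

The point of the sketch is that usage signals (tags, upvotes) feed directly into how easily the next user finds a dataset, which is the kind of usage metric the Datalake team can track.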
CODE
CODE: Data science projects should have the lowest possible time to market
EXPLORATION VERSUS PREDICTION
Explore as quickly as possible
Deliver frequently in production
(Not so) Big Data Infrastructure (for exploration)
WHAT IF WE GIVE LESS DATA TO OUR ALGORITHMS?
Cf. Zoltan Prekopcsak, Hadoop Summit EU. 2015
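One way to ask this question of your own data is to fit the same model on growing random samples and check where the result stops changing. The sketch below is an illustration under stated assumptions, not the result from the referenced talk: the data is synthetic, the numbers are made up, and the model is a deliberately simple least-squares line written in plain Python.

```python
import random


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


rng = random.Random(42)  # fixed seed so the run is reproducible

# Synthetic "full" dataset: y = 3x + 2 plus Gaussian noise (assumed values).
full_data = []
for _ in range(100_000):
    x = rng.uniform(0, 10)
    full_data.append((x, 3 * x + 2 + rng.gauss(0, 1)))

full_slope, _ = fit_line(full_data)

# Refit on random samples and compare with the model trained on everything.
slope_gap = {}
for size in (100, 1_000, 10_000):
    sample = rng.sample(full_data, size)
    sample_slope, _ = fit_line(sample)
    slope_gap[size] = abs(sample_slope - full_slope)
    print(f"{size:>6} rows: slope gap vs full data = {slope_gap[size]:.4f}")
```

If the gap is already negligible on a sample that fits on one machine, exploration may not need a large cluster at all.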
FEATURE TEAMS TO DELIVER CODE READY FOR PRODUCTION
Business rep.
Developer
Data Sc.
MESSAGE BROKER TO REUSE DATA FLOWS
Before (point-to-point flows between App A, App B, DW, DB X): ? ? ?
- Custom dev
- Data formats?
- SLA?
- Scheduling?
- …
After (App A, App B, App C, DW, DB X connected through Kafka):
- Standard format
- Prod ready
- Exploration and prod will share the same formats
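The broker idea can be sketched in a few lines of plain Python. This is a toy in-memory stand-in, not the Kafka API: the `Broker` class, the `orders` topic, and the message fields are assumptions made for illustration. It shows why a late-arriving App C can reuse an existing flow without any custom development on the producer side.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory stand-in for a message broker (not the Kafka API)."""

    def __init__(self):
        self._log = defaultdict(list)          # topic -> append-only list of messages
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def publish(self, topic, message):
        """Append to the topic log and push to current subscribers."""
        self._log[topic].append(message)
        for callback in self._subscribers[topic]:
            callback(message)

    def subscribe(self, topic, callback, from_beginning=False):
        """Register a consumer, optionally replaying the retained history first."""
        if from_beginning:
            for message in self._log[topic]:
                callback(message)
        self._subscribers[topic].append(callback)


broker = Broker()
warehouse, app_c = [], []

# The data warehouse subscribes first.
broker.subscribe("orders", warehouse.append)

# App A publishes once, in one agreed format, without knowing its consumers.
broker.publish("orders", {"order_id": 1, "amount": 40.0})
broker.publish("orders", {"order_id": 2, "amount": 15.5})

# App C arrives later and reuses the same flow, including past messages,
# with no change to App A.
broker.subscribe("orders", app_c.append, from_beginning=True)
broker.publish("orders", {"order_id": 3, "amount": 9.9})

print(len(warehouse), len(app_c))
```

Because every consumer reads the same messages in the same format, an exploration notebook and a production job see identical data.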
KAPPA ARCHITECTURE : EVERYTHING IS A STREAM
Stream Data (Kafka): Topic
Stream Processing: Streaming app v1, Streaming app v2
Serving DB: Result data v1, Result data v2
- Batch jobs are just historical data you send into a streaming app
- Application code is decoupled from technical requirements
- One-shot exploration code respecting the stream abstraction can go into production easily
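A minimal sketch of the stream abstraction, assuming a topic can be modelled as an ordered, replayable list of events. The function names, event fields, and the "ignore cancelled orders" rule in v2 are hypothetical. Both versions of the streaming app consume the same log and write separate results, and "batch" is nothing more than replaying history through the same code.

```python
def running_revenue_v1(events):
    """Streaming app v1: running total of order amounts."""
    total = 0.0
    for event in events:
        total += event["amount"]
        yield total


def running_revenue_v2(events):
    """Streaming app v2: same idea, but ignores cancelled orders."""
    total = 0.0
    for event in events:
        if not event.get("cancelled", False):
            total += event["amount"]
        yield total


# The topic is an ordered, replayable log of events (made-up data).
topic = [
    {"amount": 10.0},
    {"amount": 5.0, "cancelled": True},
    {"amount": 20.0},
]

# "Batch" is just replaying the historical log through the streaming code.
result_v1 = list(running_revenue_v1(iter(topic)))
result_v2 = list(running_revenue_v2(iter(topic)))

print(result_v1)  # [10.0, 15.0, 35.0]
print(result_v2)  # [10.0, 10.0, 30.0]
```

Since each app only depends on "an iterable of events", the same function can be fed a historical replay during exploration and a live consumer in production.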
IDEAS
IDEAS: Business must be aware of the opportunities to use algorithms
MIX THESE PEOPLE
Business: knows what is valuable
Data Scientist: knows what is feasible
Culture & Collaboration
FEATURE TEAMS ONCE AGAIN
Business rep.
Developer
Data Sc.
EXPLAIN TO THEM THAT MACHINE LEARNING IS EASY (IT’S METHODOLOGY)
EXPLAIN TO THEM THAT MACHINE LEARNING IS EASY (IT’S MAGIC)
SPEND TIME TOGETHER
Show them the data
Pair Programming
Swap roles for one day
SOFTWARE IS EATING THE WORLD : MAKE THEM CODE
Story : Octo Datascience Competition Platform
HOW WIDELY DATA DRIVEN IS YOUR COMPANY?
Everybody is willing to make value out of the available data
Data serves not only the core business but every single function
Data is used in day-to-day activity in real-time
HOW DEEPLY DATA DRIVEN IS YOUR COMPANY?
You are using cutting-edge algorithms to automate processes
You are used to running data-based A/B tests every week
You combine multiple data sources to build insights and models
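As an illustration of what a weekly A/B test can boil down to, here is a minimal sketch of a two-proportion z-test in plain Python. The function name, the visitor and conversion counts, and the choice of this particular test are assumptions made for the example; the talk does not prescribe a method.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the normal approximation: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: variant A converts 200 of 5000 visitors, variant B 260 of 5000.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be noise, under the usual assumptions of independent visitors and large enough samples.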