Date post: | 18-Jan-2017 |
Category: | Software |
Upload: | patrick-di-loreto |
Presented by Patrick Di Loreto, Head of Engineering
Site: https://developer.williamhill.com
Blog: http://patricknoir.blogspot.com
Twitter: https://twitter.com/patricknoir
Modernizing with Microservices and Fast Data
Big Data in Numbers
By the end of 2016 there will be more than 25,000,000,000 devices connected to the internet
In 2013 we produced more data in 2 days than in the whole of human history since its origin
What does it mean for us
- 160TB of data flow through our system every day
- We push more than 5 million price changes in real time
- On a busy day we have ½ million simultaneous customers on our platform
The Challenge
Build a data platform suitable for the development of modern applications
Requirements
- Be able to process large amounts of data in near real time
- Respect non-functional requirements such as:
  - FAULT TOLERANCE
  - HIGH AVAILABILITY
  - SCALABILITY
- Deal with existing/legacy systems
- Scale team delivery capability through adoption of Microservices Architecture
False Myth: Microservices Architecture 1/2
• Microservices are not exclusively STATELESS applications!
[Diagram: a monolith and services A, B and C]
False Myth: Microservices Architecture 2/2
• Achieve great ISOLATION without using synchronous protocols
[Diagram: services A to E, a DB, a Message Bus and the Monolith]
Omnia – Distributed Data Management Platform
Respecting Reactive Principles. Based on a Lambda Architecture:
• Chronos – Data Source
• Fates – Batch Layer
• NeoCortex – Speed Layer
• Hermes – Serving Layer
Omnia Chronos – Data Source
Chronos is in charge of collecting/intercepting data from different sources and making it available as streams of observable events.
Observable[ ]
• Social media
– Facebook
– Twitter
• Affiliates
• Page viewing
– Articles read, following and followers, bets etc…
• Sports related
– Tweets
– News
• Gaming
• Web Analytics
– Activities within our applications
Internal: Product Centric
External: Customer Centric
{
  "type" : "bet",
  "version" : "1.0",
  "time" : "2015-06-03 08:00:31",
  "acquisitionTime" : ". . .",
  "source" : "WHBetSystem",
  "payload" : { … any valid json }
}
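The envelope above could be modelled and validated roughly as follows. This is a minimal sketch, not the Omnia implementation: the `Incident` class, the `parse_incident` helper and the sample `acquisitionTime` and `payload` values are illustrative assumptions, since the slide elides them.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Incident:
    """Envelope fields taken from the slide; the payload is left opaque."""
    type: str
    version: str
    time: str
    acquisition_time: str
    source: str
    payload: Any


REQUIRED_FIELDS = ("type", "version", "time", "acquisitionTime", "source", "payload")


def parse_incident(raw: str) -> Incident:
    """Parse a JSON string into an Incident, rejecting incomplete envelopes."""
    doc = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in doc]
    if missing:
        raise ValueError(f"incident is missing fields: {missing}")
    return Incident(
        type=doc["type"],
        version=doc["version"],
        time=doc["time"],
        acquisition_time=doc["acquisitionTime"],
        source=doc["source"],
        payload=doc["payload"],
    )


example = parse_incident(json.dumps({
    "type": "bet",
    "version": "1.0",
    "time": "2015-06-03 08:00:31",
    "acquisitionTime": "2015-06-03 08:00:32",  # illustrative value, elided on the slide
    "source": "WHBetSystem",
    "payload": {"stake": 10},  # illustrative payload, any valid JSON is allowed
}))
```

Keeping the payload untyped mirrors the "any valid json" note: only the envelope is checked, so new incident types can be added without changing the parser.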
Omnia Chronos
In Chronos you define streams that collect data and convert/persist it into a stream of Observable[Incident].
Stream: Adapter → Converter → PersistenceManager
[Diagram: a Chronos node hosting Stream 1, Stream 2 and Stream 3]
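The three stages of a Chronos stream can be pictured as a small pipeline. The sketch below is an assumption-laden illustration in plain Python: the function names, the dictionary-based incident and the in-memory list standing in for the durable log are all hypothetical and not part of Chronos.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Incident = Dict[str, object]


def adapter(raw_lines: Iterable[str]) -> Iterator[str]:
    """Adapter: pulls raw records from a source, skipping blank ones."""
    for line in raw_lines:
        if line.strip():
            yield line.strip()


def converter(records: Iterable[str], source: str) -> Iterator[Incident]:
    """Converter: wraps each raw record in an incident-like envelope."""
    for record in records:
        yield {"type": "raw", "source": source, "payload": record}


class PersistenceManager:
    """Stands in for the durable log; here it only appends to a list."""

    def __init__(self) -> None:
        self.log: List[Incident] = []

    def persist(self, incidents: Iterable[Incident]) -> Iterator[Incident]:
        for incident in incidents:
            self.log.append(incident)
            yield incident  # pass it on so observers still receive it


def run_stream(raw_lines: Iterable[str], source: str, store: PersistenceManager,
               on_next: Callable[[Incident], None]) -> None:
    """Wire the three stages together and push every incident to an observer."""
    for incident in store.persist(converter(adapter(raw_lines), source)):
        on_next(incident)


store = PersistenceManager()
seen: List[Incident] = []
run_stream(["bet placed", "", "deposit"], "WHBetSystem", store, seen.append)
```

Each stage is a generator, so records flow through one at a time and an incident is persisted before any observer sees it.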
Omnia Chronos - Clustering
[Diagram: a cluster of Chronos 1, Chronos 2 and Chronos 3]
Distributed System Properties:
1. Concurrency
2. Distribution
3. Mobility
Omnia Chronos
• Chronos is built on top of Akka to leverage:
– Location transparency (Mobility)
– Error Kernel Pattern (fail fast and in isolation)
– Concurrency and Distribution for Horizontal and Vertical Scalability
• We use the Scala Rx API to promote non-blocking APIs and achieve Vertical Scalability
• Data are persisted in Kafka for durability:
– Fast write operations with Zero Copy and Filesystem Cache
– Compaction and Compression to optimise message consumption
Vertical Scalability vs Horizontal Scalability
Horizontal – Distribute the load across different machines (Akka Cluster)
Vertical – Maximise local resource utilisation (Non blocking IO + Non blocking API)
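To illustrate the vertical side, here is a small sketch of non-blocking IO using Python's `asyncio` rather than the Akka and Rx stack named in the talk. The `fetch` coroutine and its delays are hypothetical stand-ins for real network calls.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    """Simulates a non-blocking IO call: the thread is released while waiting."""
    await asyncio.sleep(delay)
    return name


async def fetch_all() -> list:
    # Both waits overlap on a single thread instead of running back to back.
    return await asyncio.gather(fetch("prices", 0.2), fetch("bets", 0.2))


results = asyncio.run(fetch_all())
```

Because neither call blocks the thread while it waits, one core can keep many such requests in flight, which is the point of maximising local resource utilisation.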
Timing for Machine Operations
Instruction | Time
Execute typical instruction | 1/1,000,000,000 sec = 1 nanosec
Fetch from L1 cache memory | 0.5 nanosec
Branch misprediction | 5 nanosec
Fetch from L2 cache memory | 7 nanosec
Mutex lock/unlock | 25 nanosec
Fetch from main memory | 100 nanosec
Send 2K bytes over 1Gbps network | 20,000 nanosec (20µs)
Read 1MB sequentially from memory | 250,000 nanosec (250µs)
Fetch from new disk location (seek) | 8,000,000 nanosec (8ms)
Read 1MB sequentially from disk | 20,000,000 nanosec (20ms)
Send packet US to Europe and back | 150,000,000 nanosec (150ms)
Humanised Time (1 nanosec scaled to 1 s)
Instruction | Time
Execute typical instruction | 1 s
Fetch from L1 cache memory | 0.5 s
Branch misprediction | 5 s
Fetch from L2 cache memory | 7 s
Mutex lock/unlock | ½ min
Fetch from main memory | 1½ min
Send 2K bytes over 1Gbps network | 5½ hours
Read 1MB sequentially from memory | 3 days
Fetch from new disk location (seek) | 13 weeks
Read 1MB sequentially from disk | about 7½ months
Send packet US to Europe and back | 5 years
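The humanised figures follow from treating every nanosecond as one second. A minimal sketch of that arithmetic, where the `humanise` helper and its choice of units are assumptions made for illustration:

```python
def humanise(nanoseconds: float) -> str:
    """Scale by 10**9 (1 ns becomes 1 s) and express the result in a readable unit."""
    seconds = float(nanoseconds)  # after scaling, the count of ns is the count of s
    units = [
        ("years", 365 * 24 * 3600),
        ("weeks", 7 * 24 * 3600),
        ("days", 24 * 3600),
        ("hours", 3600),
        ("min", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:g} s"


main_memory = humanise(100)            # fetch from main memory
network_send = humanise(20_000)        # send 2K bytes over 1Gbps network
round_trip = humanise(150_000_000)     # packet US to Europe and back
```

On this scale a main-memory fetch takes over a minute and a transatlantic round trip takes years, which is why avoiding blocking on IO matters so much.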
Omnia Fates
Fates represents the long-term memory of Omnia. It is in charge of organising all the incidents recorded by Chronos into timelines and of creating new information as views by using machine learning, logical reasoning and time series analysis.
• A timeline represents the history, the sequence of incidents performed by a specific entity over time. Timelines are organised per category. An example is the customer timeline, which might contain all the bets placed, deposit and withdrawal activities, tweets etc. performed by a specific customer. A timeline category is not limited to customers; it can be anything, for example a sport event such as a football match or a competition.
• Views are the result of jobs that elaborate data from:
– Timelines
– Other Views
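The relationship between incidents, timelines and views can be sketched as follows. This is only an illustration under assumptions: the tuple layout of an incident, the `build_timelines` and `count_view` helpers and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (category, entity_id, timestamp, description); this layout is an assumption
Incident = Tuple[str, str, str, str]


def build_timelines(incidents: List[Incident]) -> Dict[Tuple[str, str], List[Incident]]:
    """Group incidents by (category, entity) and order each group by timestamp."""
    timelines: Dict[Tuple[str, str], List[Incident]] = defaultdict(list)
    for incident in incidents:
        category, entity_id, _, _ = incident
        timelines[(category, entity_id)].append(incident)
    for events in timelines.values():
        events.sort(key=lambda item: item[2])  # ISO timestamps sort lexicographically
    return dict(timelines)


def count_view(timeline: List[Incident]) -> Dict[str, int]:
    """A trivial view: how many incidents of each kind a timeline holds."""
    counts: Dict[str, int] = defaultdict(int)
    for _, _, _, description in timeline:
        counts[description] += 1
    return dict(counts)


incidents = [
    ("customer", "123", "2015-06-03T08:05:00", "bet placed"),
    ("customer", "123", "2015-06-03T08:00:31", "login"),
    ("event", "78", "2015-06-03T15:00:00", "started"),
    ("customer", "123", "2015-06-03T08:10:00", "bet placed"),
]
timelines = build_timelines(incidents)
customer_view = count_view(timelines[("customer", "123")])
```

A real view would come from a machine learning or time series job; the count here only shows that a view is derived data computed from one or more timelines.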
Fates: Batch layer
Omnia: Distributed & Reactive platform for data management
Timelines & Views
Customer: 123 | Login, Deposit, Bet placed, …, Logout
Event: 78 | Started, Fault, Penalty, …, Goal
Bets | Deposits | Events | Session
Omnia Fates
Timelines are created from timeline streams: each timeline stream reads data from a Chronos stream and feeds the right timeline.
[Diagram: Chronos streams feeding Fates timelines]
Omnia Fates - Cassandra
• Fates persists timelines of incidents.
• Column Family Name: <TimelineCategory>_tl
• Key Definition: ( (entityId, date), timestamp ), with date = yyyy-MM-dd
• The partition key (entityId, date) is a strong hash key: it keeps the Cassandra cluster well balanced
• Composite key: incidents are ordered by timestamp under a specific entity within a day
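The key definition can be made concrete with a small sketch that derives the column family name and the key for one incident. The helper names are hypothetical, and the use of UTC and epoch milliseconds for the timestamp is an assumption the slides do not confirm.

```python
from datetime import datetime, timezone
from typing import Tuple


def column_family(timeline_category: str) -> str:
    """Column family name as given on the slide: <TimelineCategory>_tl."""
    return f"{timeline_category}_tl"


def timeline_key(entity_id: str, when: datetime) -> Tuple[Tuple[str, str], int]:
    """Build ((entityId, date), timestamp) for one incident.

    Using UTC and epoch milliseconds is an assumption of this sketch.
    """
    utc = when.astimezone(timezone.utc)
    partition = (entity_id, utc.strftime("%Y-%m-%d"))  # one partition per entity per day
    clustering = int(utc.timestamp() * 1000)           # orders incidents inside the partition
    return partition, clustering


morning = datetime(2015, 6, 3, 8, 0, 31, tzinfo=timezone.utc)
evening = datetime(2015, 6, 3, 20, 15, 0, tzinfo=timezone.utc)
next_day = datetime(2015, 6, 4, 9, 0, 0, tzinfo=timezone.utc)
```

Incidents for the same entity on the same day share a partition and are ordered by timestamp, while the next day starts a new partition, which bounds partition size and spreads load across the cluster.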
Omnia Fates – Separation of Concerns
• Multi Data Center application for operation and analytics/reporting
• Online analysis instead of ETL!
Omnia Fates
• We build views with jobs able to perform:
– Logical Reasoning: Deduction, Induction, Abduction
– Timeline analysis: Trends, Cycles, Seasonality
– Other ML: Classification, Clustering, Predictions
• Jobs are performed on top of NeoCortex
Omnia Neo Cortex
• NeoCortex is a runtime platform and a set of libraries to perform concurrent and distributed computations in a highly resilient way.
• It was initially designed as a library on top of Spark (Streaming), but it evolved into a platform for Reactive Microservices which allows applications to be built in:
– SPARK STREAMING
– AKKA STREAMS
– WILLIAM HILL LAMBDAS
• Applications are deployed in NeoCortex as Docker-isolated microservices; they can interact with each other using Chronos streams and with client applications through Hermes.
Omnia Neo Cortex – SPARK STREAMING
Omnia Neo Cortex - Parallelism
[Diagram: chronos stream, Driver, Executors 1 to 4 running Stage 1 (map) and Stage 2 (reduceByKey), with outputs to Hermes (Serving Layer) and Fates (timelines, views)]
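The two stages named in the diagram, map and reduceByKey, can be sketched in plain Python without Spark. The incident fields (`customer`, `stake`) and the sample data are assumptions made for illustration; in Spark the same steps would run in parallel across executors.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def map_stage(incidents: Iterable[dict]) -> List[Tuple[str, float]]:
    """Stage 1 (map): turn each bet incident into a (customer, stake) pair."""
    return [(i["customer"], i["stake"]) for i in incidents]


def reduce_by_key(pairs: Iterable[Tuple[str, float]],
                  combine: Callable[[float, float], float]) -> Dict[str, float]:
    """Stage 2 (reduceByKey): merge all values that share a key."""
    result: Dict[str, float] = {}
    for key, value in pairs:
        result[key] = combine(result[key], value) if key in result else value
    return result


bets = [
    {"customer": "123", "stake": 10.0},
    {"customer": "456", "stake": 5.0},
    {"customer": "123", "stake": 2.5},
]
totals = reduce_by_key(map_stage(bets), lambda a, b: a + b)
```

The map stage needs no coordination between records, whereas reduceByKey has to bring equal keys together, which is why it forms a separate stage.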
Neocortex - Hiding Complexity
Omnia Hermes
Hermes is the layer in which data is represented for consumption, both B2B and B2C. At its foundation, microservices, notifications and data as an API are key aspects of the design.
Scalable and simple full duplex communication for the web
Express the correlation between the entities of the model
Inspired by Falcor (Netflix) and GraphQL (Facebook)
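One way to picture "data as API" with correlated entities is a graph in which entities are stored once and linked by reference, loosely in the spirit of Falcor's JSON Graph. The sketch below is not the Hermes API: the `{"$ref": [...]}` convention, the `resolve` helper and the sample data are assumptions.

```python
from typing import Any, Dict, List

# A tiny graph: entities are stored once and linked by reference.
# The {"$ref": [...]} convention is an assumption of this sketch.
GRAPH: Dict[str, Any] = {
    "events": {"78": {"name": "Home v Away", "status": "started"}},
    "customers": {"123": {"name": "Alice", "lastBet": {"$ref": ["bets", "9"]}}},
    "bets": {"9": {"stake": 10, "event": {"$ref": ["events", "78"]}}},
}


def resolve(graph: Dict[str, Any], path: List[str]) -> Any:
    """Walk a path through the graph, following references between entities."""
    node: Any = graph
    for key in path:
        node = node[key]
        # Follow references transparently, so callers never see them.
        while isinstance(node, dict) and "$ref" in node:
            node = resolve(graph, node["$ref"])
    return node


status = resolve(GRAPH, ["customers", "123", "lastBet", "event", "status"])
```

A client asks for a path such as customer, last bet, event, status, and the serving layer follows the links, so the correlation between entities lives in the model rather than in each client.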
Omnia Hermes
[Diagram: Hermes Node with Local Cache, Subscription Manager, Client Manager, Authentication Handler and Dispatcher (HTTP, WS, TCP); Hermes Distributed Cache; clients: Browser (Hermes JS) and WH Apps; data source: Chronos]
Omnia Infrastructure – Mesos/Marathon/Docker
[Diagram: Omnia on Docker, on Marathon, on Mesos, running across a set of Nodes]
Questions