Date posted: 29-Aug-2014
Replacing Traditional Technologies with MongoDB
A Single Platform for all Financial Market Data
June 2014
James Blackburn & Gary Collier
Opinions expressed are those of the author and may not be shared by all personnel of Man Group plc (‘Man’). These opinions are subject to change without notice, and are for information purposes only and do not constitute an offer or invitation to make an investment in any financial instrument or in any product to which any member of Man’s group of companies provides investment advisory or any other services. Any forward-looking statements speak only as of the date on which they are made and are subject to risks and uncertainties that may cause actual results to differ materially from those contained in the statements. Unless stated otherwise this information is communicated by Man Investments Limited and AHL Partners LLP which are both authorised and regulated in the UK by the Financial Conduct Authority.
© Man 2014 2
Legal Stuff
Introductions
Gary Collier James Blackburn
Agenda
The Story of MongoDB at AHL
1. What is a Systematic Fund Manager?
2. Low Frequency Futures and FX Data
3. Single Stock Equity Trading
4. Building a Tick Store
5. Now and the Future?
Prologue
AHL – A Systematic Fund Manager
Systematic Fund Management
Removing the first impedance mismatch…
Quants and Techies Speak the Same Language
Disparate Data Sources
Data API
But…
All Data is Behind an API
Performance User Experience
Cluster Compute
Onboarding New Data Impedance Mismatch
Mix of Technologies
Is there one Technology which could address all of these?
Many Moving Parts
Reliability
Chapter 1
Starting Small: Low Frequency Data
The Data
8,000 rows x 200 markets: 100 MB
5,000,000 rows x 250 markets: 500 GB
Parallel Filesystem
Previous Solution
[Diagram: data spread across many separate HDF5, Prop and RDBMS stores]
The Challenge
Fast?
Reliable?
Versionable?
Easy to extend?
MongoDB Solution
[Diagram: cluster compute nodes (node 1 … node 96) reading from a sharded MongoDB Cluster]
MongoDB Cluster: shard 1 to shard 4 (each drawn three times), SSD storage
Linux, 24 cores, 96 GB RAM
Adapters: Bloomberg, JPM, Markit, GS
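The adapter boxes in the diagram suggest that each vendor feed is normalised before it reaches the cluster. The following is only a minimal sketch of that idea, not the speakers' implementation: the vendor record layouts, the common document shape (`symbol`, `source`, `date`, `values`), the function names and the sample values are all hypothetical, and a plain list stands in for a MongoDB collection.

```python
from datetime import date

# Hypothetical adapters: each maps one vendor's record layout onto a common
# document shape. The vendor field names are invented for illustration.

def bloomberg_adapter(record):
    return {
        "symbol": record["ticker"],
        "source": "bloomberg",
        "date": date.fromisoformat(record["dt"]),
        "values": {"close": float(record["px_last"])},
    }

def markit_adapter(record):
    return {
        "symbol": record["id"],
        "source": "markit",
        "date": date.fromisoformat(record["as_of"]),
        # Whatever fields the vendor sends are kept; no schema change needed.
        "values": {k: float(v) for k, v in record["fields"].items()},
    }

ADAPTERS = {"bloomberg": bloomberg_adapter, "markit": markit_adapter}

def onboard(collection, source, records):
    """Normalise records from `source` and append them to `collection`."""
    adapter = ADAPTERS[source]
    docs = [adapter(r) for r in records]
    collection.extend(docs)  # stand-in for an insert into a MongoDB collection
    return len(docs)

# Made-up sample records, purely to exercise the code.
collection = []
onboard(collection, "bloomberg",
        [{"ticker": "MARKET_A", "dt": "2014-06-02", "px_last": "1.25"}])
onboard(collection, "markit",
        [{"id": "INDEX_B", "as_of": "2014-06-02",
          "fields": {"spread": "62.5", "upfront": "0.5"}}])
```

Under these assumptions, adding a new source means writing one more small function; documents with different `values` keys can sit side by side in the same collection.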
Performance: 200 Future Markets
[Chart: retrieval times, Previous Solution vs MongoDB]
100x faster to retrieve data
Consistent retrieval times
Performance: EURUSD 1-Minute Data
[Chart: retrieval times, Previous Solution vs MongoDB]
2-5x faster to retrieve data
Consistent retrieval times
Low Frequency Data - Conclusions
MongoDB faster than previous RDBMS/File Solution at…
• ALL data sizes and ALL client load levels
• …consistently
Game changing new features:
• No impedance mismatch: onboard new data in minutes
• Version Store: can ask “What did the data look like?”
Cost Savings:
• Proprietary parallel filesystem replaced by commodity SSDs
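The Version Store idea (being able to ask “What did the data look like?”) can be illustrated with a small in-memory sketch. This is an assumption about the general technique, not a description of AHL's system: the class name `VersionStore`, its methods and the sample values are hypothetical, and a dictionary stands in for MongoDB. The key point is that writes never overwrite; each write appends a new version, and reads can be pinned to a point in time.

```python
from datetime import datetime

class VersionStore:
    """In-memory stand-in for a versioned store; nothing is overwritten."""

    def __init__(self):
        self._versions = {}  # symbol -> list of (version, written_at, data)

    def write(self, symbol, data, written_at):
        """Append a new version of `symbol` and return its version number."""
        history = self._versions.setdefault(symbol, [])
        version = len(history) + 1
        history.append((version, written_at, data))
        return version

    def read(self, symbol, as_of=None):
        """Return the latest data, or the data as it looked at `as_of`."""
        candidates = [
            v for v in self._versions.get(symbol, [])
            if as_of is None or v[1] <= as_of
        ]
        if not candidates:
            raise KeyError(f"no version of {symbol!r} at {as_of}")
        return max(candidates, key=lambda v: v[1])[2]

store = VersionStore()
store.write("MARKET_A", [1.0, 2.0], datetime(2014, 1, 1))
store.write("MARKET_A", [1.0, 2.5], datetime(2014, 2, 1))  # a later correction

latest = store.read("MARKET_A")
earlier = store.read("MARKET_A", as_of=datetime(2014, 1, 15))
```

Here `latest` reflects the correction while `earlier` returns the series as it stood in mid-January, which is what makes past research results reproducible.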
Chapter 2
Getting Bigger: Single Stock Equities
Single Stock Data - Scale
Thousands of Stocks
Many years of Time-series Data
Tens of different Data Items for each Stock
Complex trading models with many Quants sharing the Data
[Diagram: Raw Data Items feed Derived Data Items, which feed a Trading Signal]
Multi-user, versioned, interactive graph-based computation
Single Stock Data
Source Data (Managed RDBMS)
[Diagram: Raw Data Items feed Derived Data Items, which feed a Trading Signal, stored in a sharded MongoDB Cluster (shard 1 to shard 4)]
MongoDB Cluster: ~1 TB Data
~10,000 Stocks, ~20 Years
250 Data Items; each Item is 600 MB
Single model: ~150 GB
Many Quants and models
Hours → Minutes
Single Stock Trading - Conclusions
MongoDB faster than previous RDBMS/File Solution at…
• Fast interactive research
• Read/write a 600MB Data item in < 1 second
• Rebuild complex model: hours → minutes
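The sizing on the previous slide is self-consistent: 250 Data Items at 600 MB each is 150,000 MB, matching the ~150 GB quoted for a single model. Because a single MongoDB document is limited to 16 MB of BSON, a 600 MB item cannot be stored as one document. The talk does not say how items are laid out, so the following is only a sketch of one common approach, assuming a hypothetical chunk document layout (`symbol`, `segment`, `n_segments`, `sha256`, `data`): split the serialised item into ordered chunks and reassemble on read.

```python
import hashlib

MAX_CHUNK_BYTES = 15 * 1024 * 1024  # stay under MongoDB's 16 MB document limit

def to_chunks(symbol, payload, chunk_size=MAX_CHUNK_BYTES):
    """Split a serialised data item into ordered chunk documents."""
    checksum = hashlib.sha256(payload).hexdigest()
    n_chunks = max(1, -(-len(payload) // chunk_size))  # ceiling division
    return [
        {
            "symbol": symbol,
            "segment": i,
            "n_segments": n_chunks,
            "sha256": checksum,
            "data": payload[i * chunk_size:(i + 1) * chunk_size],
        }
        for i in range(n_chunks)
    ]

def from_chunks(docs):
    """Reassemble the payload, checking completeness and integrity."""
    docs = sorted(docs, key=lambda d: d["segment"])
    if len(docs) != docs[0]["n_segments"]:
        raise ValueError("missing chunks")
    payload = b"".join(d["data"] for d in docs)
    if hashlib.sha256(payload).hexdigest() != docs[0]["sha256"]:
        raise ValueError("checksum mismatch")
    return payload

# A small payload and chunk size stand in for a 600 MB item.
payload = bytes(range(256)) * 1000
docs = to_chunks("ITEM_A", payload, chunk_size=100_000)
restored = from_chunks(reversed(docs))  # retrieval order should not matter
```

Chunks of one item can be fetched in parallel and from several shards, which is one plausible reason large reads can stay fast; that link is an inference, not a claim made in the slides.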
Chapter 3
MongoDB as a Tick Store
Almost, but not quite
Big Data?
30TB Historic Data
Ticks/1000 per second
Sparse Data
Third-Party Tick Stores
Typically…
• Expensive
• Proprietary query languages
• Database-centric architectures, so…
• Not ideal for cluster compute
• Unless you pay for lots of cores…
• Expensive!
So…
• A real $$$ saving opportunity!
Architecture
[Diagram components: Reuters, RMDS Message Bus, Bloomberg, Banks, three Kafka Queues, MongoDB Cluster, Parallel Access]
MongoDB Cluster: 16 shard cluster, Master + 1 replica
Linux, 12 cores, 256 GB RAM, 96 TB Disk
Infiniband network
LZ4 compressed data
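The slide notes that tick data is stored LZ4 compressed. As a rough sketch of how a block of ticks might be packed column-wise and compressed into one document, the example below uses `zlib` from the Python standard library purely as a stand-in for LZ4 (which is a third-party package). The document layout, the function names and the synthetic ticks are hypothetical, not taken from the talk.

```python
import struct
import zlib

def pack_ticks(symbol, ticks):
    """Pack a non-empty list of (timestamp_ms, price) ticks into one block."""
    n = len(ticks)
    times = struct.pack(f"<{n}q", *(t for t, _ in ticks))   # int64 column
    prices = struct.pack(f"<{n}d", *(p for _, p in ticks))  # float64 column
    return {
        "symbol": symbol,
        "start": ticks[0][0],   # first timestamp in the block
        "end": ticks[-1][0],    # last timestamp in the block
        "count": n,
        # zlib is a stand-in here; the slide names LZ4.
        "times": zlib.compress(times),
        "prices": zlib.compress(prices),
    }

def unpack_ticks(doc):
    """Decompress a block document back into (timestamp_ms, price) ticks."""
    n = doc["count"]
    times = struct.unpack(f"<{n}q", zlib.decompress(doc["times"]))
    prices = struct.unpack(f"<{n}d", zlib.decompress(doc["prices"]))
    return list(zip(times, prices))

# Synthetic ticks, purely to exercise the code.
ticks = [(1_400_000_000_000 + i * 250, 100.0 + (i % 4) * 0.25)
         for i in range(1000)]
block = pack_ticks("SYM_A", ticks)
roundtrip = unpack_ticks(block)
```

Storing many ticks per document keeps the document count low, and the `start`/`end` fields are one assumed way to let a reader select only the blocks that overlap a requested time range.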
Tick Store Performance
Infiniband saturated
25x greater tick throughput
With just 2 machines!
Tick Store: System Load
[Chart: system load, Other Tick vs Mongo (x2), N Tasks = 32]
Tick Store - Conclusions
Happy Quants!
• 25x improvement in tick throughput
• So fit models 25x as fast
Happy Accountants!
• >40x cost saving of MongoDB Support compared to previous Tick Store licensing.
Epilogue
Where are we now and where next?
Performance
Low Frequency Data: 100x faster
Equities Models: Hours → Seconds
Tick Data: 25x faster
Key Facts
Cost Savings
Parallel File System → Commodity SSDs
Proprietary Tick Store → MongoDB
Orders of magnitude $$$ savings…
Efficiencies
4 storage technologies → 1
Fully utilise expensive HPC resources
Support load on team down > 50%
Game Changers
Onboard Data: Days → Minutes
Data Versioning
The technology is no longer the bottleneck
“Peopleware”
Attract and retain great Quants
Attract and retain great Techies
And attend a great conference
Where Next?
1. Extend the data ecosystem further
2. Broader application across the company as a whole
3. Open Source?