Deep Value
Hadoop Summit June 2013
Deep Value, Inc.
Outline of talk
• Who are we
• What do we do
• What is HFT
• What is the structure of our technology effort
• How we use Hadoop
• Focus on what we've built at the top level, and lessons learned
• Next steps? Open source with a founding team
Deep Value
• Started in 2006 to provide high-performance execution algorithms on a "paid for performance" basis
• Execution algorithms take large client orders and split them into small pieces to execute through the day
• Routinely trade 0.5–1% of US stock market volume; highest day in 2012 was ~4%, and ~3% this year
• Exchange-sponsored execution algorithms for NYSE floor brokers
• 45 people based in the US and India
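The order-splitting idea above can be sketched in a few lines. This is a minimal, illustrative even split (TWAP-style); the function name and the even-sizing rule are assumptions for illustration, not Deep Value's actual algorithm, which would vary slice sizes with forecast volume.

```python
def slice_order(total_shares, num_slices):
    """Split a large parent order into evenly sized child orders.

    A real execution algorithm would vary slice sizes with forecast
    volume (e.g. a VWAP profile); this even split is illustrative.
    """
    base, rem = divmod(total_shares, num_slices)
    # Spread the remainder over the first `rem` slices so sizes sum exactly.
    return [base + (1 if i < rem else 0) for i in range(num_slices)]

# Example: a 100,000-share parent order split across 390 one-minute bins
children = slice_order(100_000, 390)
```

Every child order sums back to the parent quantity, and no two slices differ by more than one share.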
What do we do
• Utilize sophisticated math and statistics to find patterns in the data and derive trading tactics
• Use simulation to test whether trading ideas in fact work
• Core business is providing tools (algos) to mutual funds and others to avoid being gamed by pure HFT traders
• Ability to harness compute resources is a key determinant of success: Hadoop
• All compute resources are now cluster-based and need a grid platform to utilize them: Hadoop
What is HFT?
• Look at every order in the market and make real-time decisions on what to do next
• Look to receive rebates by providing liquidity when it is sensible to do so
  – Citibank stock was a favourite for many years due to its low price and thus large percentage spread
• Some amount of "sniffing out" of large orders
• Often a speed game: faster routers, shorter wires, FPGAs
• We use smarts to try not to show our hand
Trading Systems
• Order Management Systems (OMS) / Execution Management Systems (EMS)
• Take in market data representing every order placed in every market
• Send out orders to market, manipulate those orders (replace/cancel), and receive fills
  – Via a name-value protocol called FIX
• Fills represent actual trades
• Log what the system is doing via structured logging
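The FIX protocol mentioned above really is just name-value pairs: `tag=value` fields separated by the SOH (0x01) byte. A minimal parser sketch (repeating groups ignored; not a production FIX engine):

```python
SOH = "\x01"  # FIX field delimiter

def parse_fix(raw: str) -> dict:
    """Parse a flat FIX message into a tag -> value dict.

    Repeating groups and checksum validation are deliberately ignored;
    this only illustrates the name-value structure of the protocol.
    """
    fields = {}
    for part in raw.strip(SOH).split(SOH):
        tag, _, value = part.partition("=")
        fields[tag] = value
    return fields

# Standard FIX tags: 35=D is NewOrderSingle, 55 is Symbol,
# 54=1 is Buy, 38 is OrderQty.
msg = SOH.join(["8=FIX.4.2", "35=D", "55=IBM", "54=1", "38=500"]) + SOH
order = parse_fix(msg)
```

After parsing, `order["55"]` is the symbol and `order["38"]` the quantity, as strings.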
Cloe
Lessons from building a grid
• Cluster-wide locks are the problem
  – Focus on these in design
  – Batch changes and acquire the lock once
• Build for the performance case, and let the failure case be potentially slower / more complex
  – Regular message processing doesn't take cluster locks
• Hybrid of message passing & centralized control
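The "batch changes and acquire the lock once" point can be sketched with an ordinary in-process lock standing in for the far more expensive cluster-wide one. The class and method names are hypothetical, purely for illustration:

```python
import threading

class ClusterState:
    """Toy shared state guarded by one coarse lock.

    The lock here stands in for a cluster-wide lock, which is orders
    of magnitude more expensive to acquire than a local mutex.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def apply_batch(self, updates: dict):
        # Acquire the expensive lock once for the whole batch,
        # rather than once per key.
        with self._lock:
            self._data.update(updates)

    def snapshot(self) -> dict:
        with self._lock:
            return dict(self._data)

state = ClusterState()
state.apply_batch({"order_1": "filled", "order_2": "cancelled"})
```

Grouping N updates into one `apply_batch` call turns N lock round-trips into one, which is the design focus the slide describes.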
Questions to solve: Hadoop
• What is the algorithm actually doing?
  – Complexity, e.g. feedback loops
  – Testing against intentions
• Can we do better next time?
  – Back-testing
  – Improved research process
• Log and historical market data management
DV Research Process
• Want to be able to look at "raw" market data to prove ideas
  – Typically non-programmers with a statistical background
  – R project, including RHadoop
• Want to be able to make a change to production code and test via simulation whether it works better
  – Does it work better? How? When?
• Roll out code to production easily
Hadoop-ifying Cloe
• Realized we could run Cloe under Hadoop
• Drive "orders" into Cloe via Hadoop
• Pass in market data quote files via HBase
• Store simulation results in Hadoop/HBase
• Market Simulation Framework outputs fills
• Cascading to allow complex analysis by senior coders
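Conceptually, the setup above maps each parent order through the simulation engine against historical quotes and collects the fills. A minimal sequential sketch of that shape, with entirely hypothetical function names and a naive "take the ask" fill model (not Cloe's actual API or simulation logic):

```python
def simulate_order(order, quotes):
    """Return simulated fills for one order against a quote stream.

    Naive model: lift the offered ask size at each quote until the
    order is done. A real market simulation models queues, latency,
    and impact; this only illustrates the data flow.
    """
    fills, remaining = [], order["qty"]
    for quote in quotes:
        if remaining <= 0:
            break
        take = min(remaining, quote["ask_size"])
        fills.append({"px": quote["ask"], "qty": take})
        remaining -= take
    return fills

def run_simulation(orders, quotes_by_symbol):
    # In the Hadoop-ified setup this loop becomes a map phase, with
    # quotes served from HBase; here it is a plain sequential loop.
    return {o["id"]: simulate_order(o, quotes_by_symbol[o["symbol"]])
            for o in orders}

orders = [{"id": "o1", "symbol": "XYZ", "qty": 300}]
quotes = {"XYZ": [{"ask": 10.00, "ask_size": 200},
                  {"ask": 10.01, "ask_size": 200}]}
fills = run_simulation(orders, quotes)
```

Each order is independent, which is exactly what makes the problem embarrassingly parallel and a good fit for driving through Hadoop.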
Lessons learned – Hadoop
• EC2 costs can mount quickly
  – Had a hybrid plan (either our own hardware or EC2)
  – Built our own 50-node cluster; see the DV blog
• Smaller files should be in HBase, not HDFS: HDFS has a NameNode limitation
  – All file pointers are held in NameNode memory
• Different tasks with different resource requirements don't play nicely in a single cluster
  – YARN should solve this
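The small-files limitation above is easy to quantify back-of-the-envelope. A commonly cited rule of thumb (an assumption here, and heap-version dependent) is roughly 150 bytes of NameNode heap per file/block object:

```python
# Rough capacity check: the NameNode keeps every file and block object
# in heap. ~150 bytes per object is a rule of thumb, not an exact figure.
BYTES_PER_OBJECT = 150

def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Approximate NameNode heap consumed by file metadata alone."""
    objects = num_files * (1 + blocks_per_file)  # one inode + its blocks
    return objects * BYTES_PER_OBJECT

# 100 million small single-block files cost on the order of 30 GB of
# heap for metadata alone, regardless of how tiny the files are.
heap_gb = namenode_heap_bytes(100_000_000) / 1e9
```

Since the cost is per file rather than per byte, packing many small records into HBase (or into larger HDFS files) sidesteps the limit.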
Lessons learned – Hadoop (continued)
• Make developer machine setup turn-key
  – We use extensive scripting to make getting a dev environment running a one-step process
  – The dev environment was controlled to stay close to the cluster environment
• Cascading is great for complex analysis
• Importance of cluster configuration
  – Memory, threads, and cores for your jobs
Next steps
• Considering open-sourcing via the Apache license
• Bring some sanity to the traditional execution technology space
• Looking for a founding team
• Please talk to me afterward if you're interested in investigating further
End