Date posted: 25-Jun-2015
The Data Mullet: From All SQL to No SQL back to Some SQL
Alexis Lê-Quôc @alq
This Talk
• A (mostly) DIRTy Architecture for...
• A new application (datadoghq.com) on a limited budget
• Running on a public cloud
• Focussing on data stores.
Some context
Servers
Monitoring
IaaS, PaaS
Usage Analytics
Perf. Management
Apps
Hosting
CDNs
Asset Management
SDLC
Ops team
Dev team
Dev & Ops “collaborate”
Concretely, what does Datadog do?
etc.
Watching real time feeds
Looking for patterns
Constant telemetry
Real-time
Bursty batches
Share
Data Taxonomy
• Metrics: unique visitors, load, transaction duration...
• Events: conversations, alerts, builds & deploys...
Unit of scale
• 1 source, typically a server
• 100 metrics
• Every 15 s
• 24,000 points per hour
• ~3 bytes per point
• 100 KB/hour, 850 MB/year
• Events
• whenever they occur
• Highest resolution: 1s
• Small payload + metadata
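The per-source arithmetic above can be checked with a short sketch. The constant names are illustrative. Note that 3 bytes per point alone gives about 72 KB/hour, so the 100 KB/hour figure presumably includes rounding or overhead; that reading is an assumption, not something the slide states.

```python
# Back-of-the-envelope check of the per-source numbers on this slide.
METRICS_PER_SOURCE = 100
INTERVAL_S = 15
BYTES_PER_POINT = 3           # "~3 bytes per point" from the slide
HOURS_PER_YEAR = 24 * 365

points_per_hour = METRICS_PER_SOURCE * (3600 // INTERVAL_S)
raw_bytes_per_hour = points_per_hour * BYTES_PER_POINT

# Assumption: take the slide's rounded 100 KB/hour and derive the yearly figure.
slide_kb_per_hour = 100
mb_per_year = slide_kb_per_hour * HOURS_PER_YEAR / 1024

print(points_per_hour)       # 24000
print(raw_bytes_per_hour)    # 72000
print(round(mb_per_year))    # 855
```

Under these assumptions the yearly volume lands at roughly 850 MB per source, in line with the slide.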
ACID, BASE & DIRT
• ACID
• http://en.wikipedia.org/wiki/ACID
• BASE
• http://en.wikipedia.org/wiki/Eventual_consistency
• DIRT (Bryan Cantrill at Surge 2010)
• http://dtrace.org/resources/bmc/DIRT.pdf
Let’s dig some DIRT
DI-RealTime
The Consequences of DIRT? Latency
• Data consumed by people (and machines)
• Low end-to-end latency (5-15s)
• Psycho-physiological Factor
• Same order of magnitude as email/SMS*
* http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.2465&rep=rep1&type=pdf
The Consequences of DIRT? Concurrency
• Concurrent events & data points show up in sync
• Access Patterns?
• All recent data, e.g. last 24 hours
The Consequences of DIRT? Tolerance to noise
• Not a System of Record
• “Real-time” decisions
• Drop (some) individual data points rather than be late
• Applies to metrics, not events
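One way to picture "drop rather than be late" is a bounded buffer that evicts the oldest metric points when the consumer falls behind. This is a minimal sketch under that assumption; `LossyBuffer` is a hypothetical name and the talk does not describe how Datadog actually sheds load. Events would not go through such a buffer, since the tolerance applies to metrics only.

```python
from collections import deque


class LossyBuffer:
    """Bounded buffer that drops the oldest points when it is full."""

    def __init__(self, capacity):
        self._points = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, point):
        # A full deque with maxlen silently evicts its oldest item on append.
        if len(self._points) == self._points.maxlen:
            self.dropped += 1
        self._points.append(point)

    def drain(self):
        """Return all buffered points, oldest first, and empty the buffer."""
        points = list(self._points)
        self._points.clear()
        return points


buf = LossyBuffer(capacity=3)
for value in range(5):
    buf.push(value)
drained = buf.drain()
print(drained, buf.dropped)   # [2, 3, 4] 2
```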
Cross here? Or here?
Noise but no Latency
Latency but no Noise
DataIntensive-RT
The Consequences of DIRT? Storage
• Business Cycles
• Retention Policy > Business Cycle
• E.g. retail, education 12 months
• Elastic Storage
• !CAPEX
The Consequences of DIRT? Latency
• Datadog, a data exploration app for people
• Looking for patterns
• Ideal: 300 ms round-trip
• Access patterns for long-term data?
• Storage trade-off: precompute oft-used properties
• Run-time Trade-off: want longer timespan, get lower resolution
• != RRD
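The run-time trade-off (longer timespan, lower resolution) can be sketched as a rollup whose bucket width grows with the requested span. Averaging, the `max_points` budget and both function names are assumptions made for illustration; the talk does not say which aggregates are precomputed.

```python
from collections import defaultdict


def rollup(points, bucket_s):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_s].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


def pick_bucket(span_s, max_points=300):
    """Choose a bucket width (in seconds) so a query returns at most max_points."""
    return max(1, -(-span_s // max_points))  # ceiling division


raw = [(t, float(t)) for t in range(0, 120, 15)]   # one point every 15 s
coarse = rollup(raw, bucket_s=60)
print(coarse)                  # [(0, 22.5), (60, 82.5)]
print(pick_bucket(24 * 3600))  # 288
```

With a budget of 300 points, a 24-hour query would be served from 288-second buckets instead of the raw 15-second points.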
Aggregate: constant data influx, large data sets (BASE, DIRT)
Look for Patterns: on-demand visualization, background data analysis (BASE)
Watch & Share: real-time updates, on-the-fly data analysis (DIRT)
Datadog = DIRT + BASE + a tiny bit of ACID
How It All Fits Together
The Mullet: All SQL in front, NoSQL party in the back
Actual Stack
Choices, choices
• 5 axes
• Volume of Data
• Latency
• Ops: wake-up-in-the-middle-of-the-night factor
• Dev: community & tools
• Cost as in “a function of X”
Choosing Elastic Storage
Durable, Large-Scale Storage
• Postgres
• Itemized data points in a time series are useless
• BLOB management not fun
• Mongo
• Durability in question in 2010
• SciDB
• Very very early
• Our pick: Cassandra
• (Riak)
Cassandra: Volume of Data
• 100s of hosts, 150TB at FB in 2010
• Easy to distribute data, durable quorum writes
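As a reminder of what a quorum write buys, here is the standard arithmetic: a quorum is a majority of replicas, and reads see the latest write whenever read and write sets must overlap (R + W > N). The replication factor of 3 is an illustrative assumption; the talk does not state the value used.

```python
def quorum(replication_factor):
    """Number of replicas that must acknowledge a quorum operation."""
    return replication_factor // 2 + 1


def overlaps(replication_factor, write_acks, read_acks):
    """True when every read set intersects every write set (R + W > N)."""
    return read_acks + write_acks > replication_factor


rf = 3                   # assumed replication factor, for illustration only
w = r = quorum(rf)
print(w, overlaps(rf, w, r))   # 2 True
print(overlaps(rf, 1, 1))      # False
```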
Cassandra: Latency
• < 10ms on writes
• reads more variable (on EC2)*
* More on this in a bit
Cassandra: Ops
• Release Engineering too aggressive
• ~10 releases since 1/2011 on 0.7 branch
• Good resilience to node loss in the later 0.7 versions
• Annoying idiosyncrasies (cassandra.yaml, predictability of disk use)
Cassandra: Dev
• Bizarre nomenclature (rows, columns... families?)
• Cumbersome data access
• Limited semantics if you are used to SQL
• Good libraries
Cassandra: Cost
• Ops time
• I/O limits raised by increasing number of nodes
• Thereby increasing costs
Riak
• Prototyped out of spite for Cassandra 0.7[0123]
• We ♡ Erlang
• Great folks
• But Cassandra pain subsided, priorities shifted.
• git merge datadog/riak did not happen
Choosing In-Mem
In-memory DB
• We started with Redis
• Then we stopped looking :)
Redis
• Volume of Data
• Limited by available RAM, easy partitioning in our case
• Latency
• << 5 ms, dominated by network
• Ops
• Low-maintenance, stable, predictable, replicated, boringly rock-solid
• Dev
• Brilliant, clear docs, simple protocol, oft-used native data structures
• Cost
• ~ cost of RAM on EC2
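The talk does not say how data is partitioned across Redis instances. A minimal sketch of one common approach is deterministic hashing of a source identifier onto a fixed list of nodes. The node names, the `shard_for` helper and the choice of CRC32 are all hypothetical.

```python
import zlib


def shard_for(source_id, num_shards):
    """Map a source (e.g. a server name) to a shard index, deterministically."""
    return zlib.crc32(source_id.encode("utf-8")) % num_shards


SHARDS = ["redis-0", "redis-1", "redis-2", "redis-3"]   # hypothetical node names
node = SHARDS[shard_for("web-42", len(SHARDS))]
print(node)
```

Modulo hashing like this reshuffles most keys when the number of shards changes, which is acceptable for a cache of recent data but worth keeping in mind when the working set is limited by available RAM.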
Choosing a SQL Data Store
General-purpose data store
• We ♡ SQL
• Oracle
• Postgres
Oracle in numbers
• base license: $47.5K
• clustered db: $23K
• replication: $10K
• partitioning: $11.5K
• analytics: $23K
• in-mem cache: $23K
• total: $138,000
• for 2 cores
• + 22% annual support
• Just in licenses...
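The total can be reproduced by reading the line items as thousands of US dollars, an assumption that the $138,000 total supports. The sketch below also works out what 22% annual support would come to on that total.

```python
# Line items from the slide, read as thousands of US dollars (assumption).
LICENSES_K = {
    "base license": 47.5,
    "clustered db": 23,
    "replication": 10,
    "partitioning": 11.5,
    "analytics": 23,
    "in-mem cache": 23,
}
SUPPORT_RATE = 0.22   # "+ 22% annual support"

total = sum(LICENSES_K.values()) * 1000
annual_support = total * SUPPORT_RATE

print(round(total))            # 138000
print(round(annual_support))   # 30360
```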
General-purpose data store
• Oracle
• Postgres
Postgres
• Volume of Data
• High GBs, Low TBs
• Latency
• 10-100 ms after EXPLAIN ANALYZE
• Ops
• Low-maintenance, stable, predictable, replicated, boringly rock-solid
• Dev
• Well understood by (a certain class of) engineers
• Cost, a function of storage latency
Not forgetting...
• VoltDB
• RAM-based, potentially a match for our DIRTy parts
• Stored procedures, an acquired taste
• Home-grown data stores (soon)
• Rainbird
• ...
The Data Mullet
• All open-source, good if you’re ready to dive in code
• $0 CAPEX
• All OPEX on EC2
The Data Mullet on EC2
Structural Weakness: I/O latency at moderate throughputs
One “bad” Cassandra query
Clogging the I/O pipes on EC2
Maximum Average Wait: up to 670 ms
Maximum Service Time: up to 5 ms
While writing 100 MB/s and reading 30 MB/s
Another “Bad” Query
            DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz  await  svctm  %util
03:35:02 PM dev8-80   380     24000       5.7        62        47    130    1.3     47
03:35:02 PM dev8-96   370     24000       5.6        63        46    120    1.2     45
03:35:02 PM dev8-112  380     24000       5.5        63        46    120    1.2     46
03:35:02 PM dev8-128  380     24000       7.2        63        56    150    1.3     50
• svctm: average service time in ms
• await: average wait in ms
• rd_sec/s: read throughput in sectors/s. Total: 46 MB/s
• tps: transfers per second. Consumer HD: ~75 tps; SSD: 1-30 Ktps
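The 46 MB/s total follows from the rd_sec/s column in the sample above, given that sysstat reports sectors of 512 bytes. A short sketch of the conversion:

```python
SECTOR_BYTES = 512   # sysstat reports rd_sec/s in 512-byte sectors

# rd_sec/s for the four devices in the sample
read_sectors_per_s = {
    "dev8-80": 24000,
    "dev8-96": 24000,
    "dev8-112": 24000,
    "dev8-128": 24000,
}

total_bytes_per_s = sum(read_sectors_per_s.values()) * SECTOR_BYTES
total_mib_per_s = total_bytes_per_s / (1024 * 1024)
print(round(total_mib_per_s, 1))   # 46.9
```

That is under 50 MB/s of aggregate reads across four volumes, yet average waits are already 120-150 ms against a service time of about 1.3 ms, which is the "latency at moderate throughput" weakness the slide points at.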
Mitigation of I/O issues?
Cassandra: I/O Mitigation
• More nodes, more RAM, more partitions, more parallelism
• $$$
Postgres: I/O Mitigation
• Scale up to a point
• Replicate
• Move to bare metal => $$$
• A well-trodden but difficult path
Better yet...
• Less dependency on low-latency, durable storage
• Move more data to RAM (Redis)
• Archive immutable data
• S3/Cloudfront?
A digression: Your Very Own Chaos Monkey
• Instances go bye-bye
• https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/741224
• Instances go bye-bye, take 2 (high load)
• https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/708920
Takeaway
• By mixing and matching open-source SQL (PG) and NoSQL (Redis, Cassandra), Datadog has been able to get up and running quickly and simply with a “$0” down payment on infrastructure.