Date post: | 06-May-2015 |
Category: | Technology |
Upload: | tim-lossen |
Tim Lossen
Key-Value-Stores: The Key to Scaling?
Who?
• @tlossen
• backend developer
- Ruby, Rails, Sinatra ...
• passionate about technology
Problem
Challenge
• backend for Facebook game
• expected load:
- 1 million daily active users
- 20 million total users
- 100 KB data per user
Challenge
• expected peak traffic:
- 10,000 concurrent users
- 200,000 requests / minute
• write-heavy workload
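The load figures above can be turned into a few derived numbers. The following is only a back-of-the-envelope sketch: the function names are made up for illustration, and decimal units (1 KB = 1,000 bytes) are an assumption.

```python
# Back-of-the-envelope figures derived from the numbers on the slides.
# Assumption: decimal units (1 KB = 1,000 bytes, 1 GB = 10**9 bytes).
DAILY_ACTIVE_USERS = 1_000_000
TOTAL_USERS = 20_000_000
DATA_PER_USER_KB = 100
CONCURRENT_USERS = 10_000
PEAK_REQUESTS_PER_MINUTE = 200_000


def peak_requests_per_second() -> float:
    """Peak request rate per second."""
    return PEAK_REQUESTS_PER_MINUTE / 60


def requests_per_user_per_minute() -> float:
    """Average requests per concurrent user per minute at peak."""
    return PEAK_REQUESTS_PER_MINUTE / CONCURRENT_USERS


def dataset_size_gb(users: int) -> float:
    """Raw data volume in GB for the given number of users."""
    return users * DATA_PER_USER_KB * 1_000 / 1_000_000_000


if __name__ == "__main__":
    print("peak requests / second:", round(peak_requests_per_second()))
    print("requests / user / minute:", requests_per_user_per_minute())
    print("all users (GB):", dataset_size_gb(TOTAL_USERS))
    print("daily active users (GB):", dataset_size_gb(DAILY_ACTIVE_USERS))
    print("concurrent users (GB):", dataset_size_gb(CONCURRENT_USERS))
```

Under these assumptions the full data set is about 2 TB, while the data of the users who are online at the same time is only about 1 GB.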
Wanted
• scalable database
• with high throughput
- especially for writes
Options
• relational database
- with sharding
• NoSQL database
- key-value-store
- document db
- graph db
Shortlist
• Cassandra
• Redis
• Membase
Cassandra
Facts
• written in Java
- 55,000 lines of code
• Thrift API
- clients for Java, Ruby, Python ...
History
• originally developed by Facebook
- in production for “Inbox Search”
• later open-sourced
- top-level Apache project
Features
• high availability
- no single point of failure
• incremental scalability
• eventual consistency
Architecture
• Dynamo-like hash ring
- partitioning + replication
- all nodes are equal
Hash Ring
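The slide's diagram is not part of this transcript. As a rough illustration of the idea, here is a toy consistent-hash ring; it is not Cassandra's implementation. The class name, the use of MD5 for tokens, and the single token per node are all assumptions made for the sketch.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: a key belongs to the first node clockwise."""

    def __init__(self, nodes, replicas=2):
        if not nodes:
            raise ValueError("at least one node is required")
        self.replicas = replicas
        # Each node gets one position (token) on the ring; keep them sorted.
        self._ring = sorted((self._token(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _token(value: str) -> int:
        # Assumption: MD5 is used only to spread values evenly over the ring.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def nodes_for(self, key: str):
        """Return the primary node followed by the next nodes as replicas."""
        size = len(self._ring)
        # First node whose token is greater than the key's token (wraps around).
        start = bisect.bisect_right(self._tokens, self._token(key)) % size
        count = min(self.replicas, size)
        return [self._ring[(start + i) % size][1] for i in range(count)]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"], replicas=2)
    print(ring.nodes_for("user:42"))
```

In this sketch, adding a node only takes over keys from the arc directly before its token; every other key keeps its primary node, which is what makes the scheme incrementally scalable.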
Architecture
• Bigtable data model
- column families
- supercolumns
“Cassandra aims to run on an infrastructure of hundreds of nodes.”
Redis
Facts
• written in C
- 13,000 lines of code
• socket API
- redis-cli
- client libs for all major languages
Features
• high read & write throughput
- 50,000 to 100,000 ops / second
• interesting data structures
- lists, hashes, (sorted) sets
- atomic operations
• strong consistency
Architecture
• in-memory database
- append-only log on disk
- virtual memory
• single instance
- master-slave replication
- clustering is on roadmap
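The combination of an in-memory data set with an append-only log can be illustrated with a small sketch. This is not how Redis itself is implemented: `TinyStore`, its JSON log format, and the file name are hypothetical, and the sketch only calls `flush()`, so it makes no real durability guarantee.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy in-memory key-value store that logs every write to a file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild the in-memory state from the log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                op, key, value = json.loads(line)
                if op == "set":
                    self.data[key] = value
                elif op == "del":
                    self.data.pop(key, None)

    def _append(self, record):
        # Writes are only ever appended; existing log entries are never changed.
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()

    def set(self, key, value):
        self._append(["set", key, value])
        self.data[key] = value

    def delete(self, key):
        self._append(["del", key, None])
        self.data.pop(key, None)

    def get(self, key):
        # Reads are served from memory only.
        return self.data.get(key)

    def close(self):
        self._log.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "store.log")
    store = TinyStore(path)
    store.set("score", 10)
    store.close()
    print(TinyStore(path).get("score"))
```

Reads never touch the disk, and after a restart the state is recovered by replaying the log from the beginning.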
“Memory is the new disk, disk is the new tape.”
— Jim Gray
Membase
Facts
• written in C and Erlang
• API-compatible with Memcached
- same protocol
• client libs for all major languages
History
• developed by NorthScale & Zynga
- used in production (Farmville)
• released in June 2010
- Apache 2.0 License
Features
• “Memcached with persistence”
- extremely fast
- throughput scales linearly
• automatic data placement
- memory, SSD, disk
• configurable replica count
Architecture
• cluster
- all nodes are alike
- one elected as “coordinator”
• each node is master for part of key space
- handles all reads & writes
Mapping Scheme
“simple, fast, elastic”
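The mapping-scheme diagram is not reproduced in this transcript. One way to give each node a part of the key space is a two-step lookup: a key is hashed to a fixed bucket, and a table maps each bucket to its master node. The sketch below assumes such a scheme; the bucket count, the use of CRC32, and all function names are illustrative and not taken from the slides.

```python
import zlib

NUM_BUCKETS = 16  # illustrative; a real cluster would use many more


def bucket_for(key: str) -> int:
    """Step 1: hash the key to a fixed bucket (never changes for a key)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def build_bucket_map(servers):
    """Step 2: assign every bucket to a master server, round-robin."""
    return [servers[i % len(servers)] for i in range(NUM_BUCKETS)]


def server_for(key: str, bucket_map) -> str:
    """Look up the master that handles all reads and writes for the key."""
    return bucket_map[bucket_for(key)]


if __name__ == "__main__":
    bucket_map = build_bucket_map(["server-1", "server-2"])
    print(server_for("user:42", bucket_map))

    # Growing the cluster: hand one bucket to a new server by editing the map.
    moved = bucket_for("user:42")
    bucket_map[moved] = "server-3"
    print(server_for("user:42", bucket_map))
```

With this design, growing the cluster only means reassigning buckets in the table; the key-to-bucket hash stays the same.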
Solution
Which one would you pick?
Decision
• Cassandra?
- too big, too complicated
• Membase?
- not yet available (then)
• Redis!
Motivation
• keep operations simple
• use as few machines as possible
- ideally, only one
Design
• two machines (+ load balancer)
- Redis master handles all reads / writes
- Redis slave as hot standby
- both machines used as app servers
• dedicated hardware
Data model
• one Redis hash per user
- key: Facebook ID
• store data as serialized JSON
- booleans, strings, numbers, timestamps ...
Advantages
• turns Redis into “document db”
- efficient to swap user data in / out
- atomic ops on parts
• easy to dump / restore user data
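The data model above can be sketched as follows. To keep the example runnable without a server, `FakeRedis` is a hypothetical in-memory stand-in that mimics the Redis hash commands HSET, HGET and HGETALL. The helper names, the field names, and the choice to serialize each hash field separately are assumptions, not the talk's actual code.

```python
import json


class FakeRedis:
    """In-memory stand-in for the Redis hash commands HSET / HGET / HGETALL."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))


def save_field(db, facebook_id, field, value):
    """Update one part of a user's data; the value is stored as JSON."""
    db.hset(str(facebook_id), field, json.dumps(value))


def load_field(db, facebook_id, field):
    raw = db.hget(str(facebook_id), field)
    return None if raw is None else json.loads(raw)


def dump_user(db, facebook_id):
    """Read the whole user hash back into plain Python values."""
    raw = db.hgetall(str(facebook_id))
    return {field: json.loads(value) for field, value in raw.items()}


def restore_user(db, facebook_id, snapshot):
    for field, value in snapshot.items():
        save_field(db, facebook_id, field, value)


if __name__ == "__main__":
    db = FakeRedis()
    save_field(db, 1234567, "coins", 250)
    save_field(db, 1234567, "tutorial_done", True)
    print(dump_user(db, 1234567))
```

Because each field is written on its own, one part of a user's data can be changed without rewriting the rest, and the whole hash can still be dumped or restored in one pass.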
Capacity
• 4 GB memory for 20 million integer keys
- keys always stay in memory!
• 2 GB memory for 10,000 user hashes
- others can be swapped out
• 3.6 million ops / minute
- sufficient for 200,000 requests / minute
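The capacity figures can be checked with simple arithmetic. The sketch below only divides the numbers from the slide; decimal gigabytes are an assumption, and the function names are made up for illustration.

```python
GB = 1_000_000_000  # assumption: decimal gigabytes

KEY_MEMORY_BYTES = 4 * GB
TOTAL_KEYS = 20_000_000
HASH_MEMORY_BYTES = 2 * GB
HASHES_IN_MEMORY = 10_000
OPS_PER_MINUTE = 3_600_000
REQUESTS_PER_MINUTE = 200_000


def bytes_per_key() -> float:
    """Memory budget per key when all 20 million keys stay in memory."""
    return KEY_MEMORY_BYTES / TOTAL_KEYS


def bytes_per_resident_hash() -> float:
    """Memory budget per user hash that is kept in memory."""
    return HASH_MEMORY_BYTES / HASHES_IN_MEMORY


def ops_per_second() -> float:
    return OPS_PER_MINUTE / 60


def ops_per_request() -> float:
    """How many Redis operations each request may use at peak load."""
    return OPS_PER_MINUTE / REQUESTS_PER_MINUTE


if __name__ == "__main__":
    print("bytes per key:", bytes_per_key())
    print("bytes per resident hash:", bytes_per_resident_hash())
    print("ops per second:", ops_per_second())
    print("ops per request:", ops_per_request())
```

Under these assumptions the budget is about 200 bytes per key and about 200 KB per resident user hash, twice the 100 KB per user quoted earlier. A rate of 3.6 million ops per minute is 60,000 ops per second, which leaves up to 18 operations per request at peak.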
Status
• game was launched in August
- currently still in beta
• expect to reach 1 million daily active users in Q1/2011
• will try to stick to 2 or 3 machines
- possibly bigger / faster ones
Conclusions
• use the right tool for the job
• keep it simple
- avoid sharding, if possible
• don’t scale out too early
- but have a viable “plan b”
• use dedicated hardware
Q & A
Links
• cassandra.apache.org
• redis.io
• membase.org
• tim.lossen.de