Post on 05-Dec-2014
CMB – A Message Bus for the Cloud
CQS – Queuing Service
CNS – Topic-based Pub/Sub Service
Why did we build our own?
• General purpose message bus to replace project-driven one-off solutions
• Smooth data center failover, maybe even “active-active” queues
• Must scale to millions of queues and 1000s of messages/sec (for example 1 queue per STB)
• Tight latency requirements (“10ms response time 95th pct”)
• Evaluated other options to arrive at AWS SQS/SNS
AWS SQS Primer
“Simple Queue Service”
• Focus on guaranteed delivery
• Best effort on orderly delivery, duplicates
• Few simple core APIs: SendMessage() ReceiveMessage() DeleteMessage()
• Do not trust message recipients
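The three core calls and the "do not trust message recipients" rule can be illustrated with a small in-memory model. This is a sketch only, not how SQS or CQS is implemented: the class name `ToyQueue`, the injectable `clock`, and deleting by message ID (the real API uses a receipt handle) are all assumptions made for brevity. The point it shows is that receiving a message merely hides it for a visibility timeout, so a recipient that never calls delete causes redelivery.

```python
import time
import uuid


class ToyQueue:
    """In-memory model of SQS-style semantics (illustration only)."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock
        self.messages = {}         # message id -> body
        self.invisible_until = {}  # message id -> time it becomes visible again

    def send_message(self, body):
        msg_id = str(uuid.uuid4())
        self.messages[msg_id] = body
        return msg_id

    def receive_message(self):
        now = self.clock()
        for msg_id, body in self.messages.items():
            if self.invisible_until.get(msg_id, 0.0) <= now:
                # Hide the message instead of removing it: if the recipient
                # never deletes it, it becomes visible again after the timeout.
                self.invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete_message(self, msg_id):
        # Only an explicit delete removes the message for good.
        self.messages.pop(msg_id, None)
        self.invisible_until.pop(msg_id, None)
```

A recipient that crashes between receive and delete simply lets the timeout expire, after which another receive call returns the same message.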
Advantages of adopting an API
If you do it on your own:
• API design typically biased towards first use case
• Almost guaranteed: You won’t get it right the first time (iterations)
• Difficult for new users to adopt: Documentation, tools, community, …
Why did we build our own? AWS SQS
Guaranteed Delivery +
Simple, Robust API +
Scalability +
Active-Active ?
DC Failover ?
Latency & Throughput ?
Limitations (Msg Size, # Artifacts, …) ?
“Build a horizontally scalable queuing service on top of Cassandra (and Redis) which is API compatible with AWS SQS / SNS API”
CQS over Cassandra and Redis
Cassandra
• Cross-DC persistence and replication
• Proven horizontal scalability
Redis
• Meet latency requirements
• Help with best effort ordering
• Handle Visibility Timeout (VTO)
Cassandra Data Modeling
How to represent queued messages in Cassandra?
• Single Column Queue
• Single Row Queue
• Multi-Row Queue
Single Column Queue
Single Row Queue
Multi-Row Queue
CQS Data Flow Example
1. SendMessage(MSG1)
2. SendMessage(MSG2)
3. SendMessage(MSG3)
4. MSG1 = ReceiveMessage()
5. DeleteMessage(MSG1)
CQS Architecture Recap
Cassandra Persistence Layer
• Messages sharded across 100 rows per queue
• Avoid wide rows (> 500K)
• Minimize churn (Tombstones)
• Distribute queue among Cassandra nodes
Redis Caching Layer
• To meet latency requirements
• Payload cache (kicks in after first miss, pre-load next 10k)
• Improve FIFOness by storing Msg IDs in Redis List
• Handle message visibility entirely in Redis (Hashtable)
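The idea of spreading one queue over 100 rows can be sketched as follows. The slides do not show the actual row-key layout or how CQS picks a row, so `row_key`, `pick_row`, the `queue_shard` naming, and the random choice are hypothetical; a plain dictionary stands in for Cassandra.

```python
import random
from collections import defaultdict

NUM_SHARDS = 100  # "100 rows per queue" from the slide


def row_key(queue_name, shard):
    # Hypothetical row-key layout; the slides do not show the real one.
    return f"{queue_name}_{shard}"


def pick_row(queue_name, rng=random):
    """Choose one of the queue's rows for a new message (assumed: at random)."""
    return row_key(queue_name, rng.randrange(NUM_SHARDS))


rows = defaultdict(list)  # stand-in for Cassandra rows
rng = random.Random(42)
for i in range(10_000):
    rows[pick_row("stb-queue", rng)].append(f"MSG{i}")

# No single row holds the whole queue, so rows stay narrow and the
# queue's data can be distributed among several Cassandra nodes.
widest = max(len(r) for r in rows.values())
```

With this kind of scheme the width of any one row grows with roughly one hundredth of the queue depth, which is what keeps rows under the wide-row limit mentioned above.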
CQS Key Cassandra Features
Persistence and failover
• Cross-DC replication in combination with Local Quorum Reads/Writes (tunable consistency)
Millions of queues, spiky traffic patterns
• Massive horizontal scalability
Message order (FIFOness) / future dated messages
• Wide rows, composite column keys / TimeUUID and column sort order
Message retention period (expiration)
• TTL
Fast lookup of static metadata (Queues, Users etc.)
• Row Cache, Secondary Indexes
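As a worked example of why Local Quorum matters for cross-DC deployments: a Cassandra quorum is a majority of replicas, and Local Quorum counts only the replicas in the coordinator's data center. The replica counts below (two data centers with three replicas each) are an assumed layout for illustration, not the configuration used by CMB.

```python
def quorum(replication_factor):
    """Replicas that must acknowledge a quorum read or write."""
    return replication_factor // 2 + 1


# Assumed layout for illustration: two data centers, three replicas each.
replicas_per_dc = {"dc1": 3, "dc2": 3}

local = quorum(replicas_per_dc["dc1"])             # LOCAL_QUORUM in dc1
cluster_wide = quorum(sum(replicas_per_dc.values()))  # QUORUM across both DCs

print(local, cluster_wide)  # prints "2 4"
```

Under these assumptions a Local Quorum request waits for two replicas in the local data center only, so it does not pay cross-DC latency, while replication to the other data center still happens in the background.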
CQS Scalability
• Send(), Receive() and Delete() scale with Cassandra Ring, API Servers (stateless) and Redis Shards
• Are constant time operations
• Queues not sharded across Redis servers!
CQS Availability
• Depends on availability of Cassandra
• Service functions without Redis!
CQS DC Failover
AWS SNS API
“Simple Notification Service”
• Topic based Publish/Subscribe Service
• Supported protocols: HTTP/CQS/SQS
• Few simple core APIs: CreateTopic() / DeleteTopic() Subscribe() / Unsubscribe() ConfirmSubscription() Publish()
• Do not trust message recipients (redelivery policy)
CNS Data Flow Example
• Single operation: Publish message MSG1 to a topic T with four Subscribers S1, S2, S5, S6.
• S1, S2 are HTTP endpoints
• S5, S6 are CQS queues
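The fan-out in this example can be modeled in a few lines. The sketch below is an in-memory illustration, not the CNS implementation: the `Subscriber` and `Topic` classes are hypothetical, and `deliver` just appends to a list where a real system would issue an HTTP POST or a CQS SendMessage call.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    name: str
    protocol: str  # "http" or "cqs"
    inbox: list = field(default_factory=list)

    def deliver(self, message):
        # Stand-in for an HTTP POST or a CQS SendMessage call.
        self.inbox.append(message)


@dataclass
class Topic:
    name: str
    subscribers: list = field(default_factory=list)

    def subscribe(self, subscriber):
        self.subscribers.append(subscriber)

    def publish(self, message):
        """Deliver one published message to every subscriber."""
        for sub in self.subscribers:
            sub.deliver(message)
        return len(self.subscribers)


topic = Topic("T")
for name, protocol in [("S1", "http"), ("S2", "http"), ("S5", "cqs"), ("S6", "cqs")]:
    topic.subscribe(Subscriber(name, protocol))

deliveries = topic.publish("MSG1")  # one Publish call, four deliveries
```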
CNS Architecture Recap
• CQS Queue preserves messages when Publish Workers are down or overloaded
• CQS Visibility Timeout takes care of guaranteed delivery
• Retry policy improves guaranteed delivery for temporarily unavailable endpoints (http)
• Publish Workers hardened for rogue endpoints (failing endpoints, slow endpoints, …)
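A retry policy for temporarily unavailable endpoints might look roughly like the sketch below. The function name `deliver_with_retry`, the attempt limit, and the exponential backoff schedule are assumptions chosen for illustration; the slides do not describe the actual policy parameters used by the Publish Workers.

```python
import time


def deliver_with_retry(send, message, max_attempts=3, base_delay=1.0,
                       sleep=time.sleep):
    """Call send(message) until it succeeds or max_attempts is reached.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(message)
            return attempt
        except Exception:
            if attempt < max_attempts:
                # Assumed schedule: wait 1x, 2x, 4x, ... the base delay.
                sleep(base_delay * 2 ** (attempt - 1))
    return None
```

Bounding the number of attempts is what keeps a permanently failing endpoint from tying up a worker indefinitely; a real worker would also need a timeout on each `send` call to cope with slow endpoints.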
Differences SQS/SNS and CQS/CNS
Goal: Full API compatibility
Current state:
• All APIs implemented, most parameters supported
• Can use AWS Java SDK and others
Limitations:
• AWS4 signatures not supported (V1 and V2 ok)
• SMS endpoints not supported, limited email support
Enhancements:
• Additional APIs for monitoring and management
• Unlimited number of queues, topics and subscriptions
• Adjustable message size and other parameters (MSG <= 64KB, LP <= 20 sec, DS <= 900 sec, RP, …)
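The adjustable limits can be pictured as a small configuration table checked on each request. In the sketch below the dictionary keys and function names are hypothetical, and reading "LP" as the long-poll wait time and "DS" as the delay in seconds is an assumption based on the matching SQS defaults; "RP" is left out because the slide does not give a value for it.

```python
DEFAULT_LIMITS = {
    "max_message_bytes": 64 * 1024,  # MSG <= 64KB
    "max_wait_seconds": 20,          # LP <= 20 sec (assumed: long-poll wait)
    "max_delay_seconds": 900,        # DS <= 900 sec (assumed: delay seconds)
}


def check_send(body, delay_seconds=0, limits=DEFAULT_LIMITS):
    """Return a list of limit violations for a send request (empty if OK)."""
    problems = []
    if len(body.encode("utf-8")) > limits["max_message_bytes"]:
        problems.append("message too large")
    if not 0 <= delay_seconds <= limits["max_delay_seconds"]:
        problems.append("delay out of range")
    return problems


def check_receive(wait_seconds=0, limits=DEFAULT_LIMITS):
    """Return a list of limit violations for a receive request."""
    problems = []
    if not 0 <= wait_seconds <= limits["max_wait_seconds"]:
        problems.append("wait time out of range")
    return problems
```

Because the limits are data rather than constants baked into the checks, an operator could raise the message size for one deployment by passing a modified copy of `DEFAULT_LIMITS`.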
CMB Ready for Production Use?
• Code of CMB Core is stable
• Extensive testing done (including throughput scalability testing)
• In use at Comcast (Sports, DVR, …)
Testing Goals
• Functional testing (unit tests, good code coverage)
• Stress testing (simulate Redis outage, data center failover)
• Endurance testing
• Load testing: Verify linear horizontal scalability (CQS / CNS throughput scalability)
CQS Throughput Scalability
• Throughput as a function of Cassandra Ring size
• Increase load until throughput (msg/sec) reaches a maximum
• Increase ring size and re-test
• Ensure sufficient API and Redis capacity to support largest ring
• Deployment: 10 API Servers, 5 Redis Shards, 4-16 Node Cassandra Ring
CQS Throughput Scalability
# Load Gen  # API Servers  # Redis Shards  Ring Size  API / Sec  P99
5           10             5               4          2832       <= 100 ms
5           10             5               8          6072       <= 100 ms
5           10             5               12         9472       <= 100 ms
6           12             6               16         11667      <= 100 ms
8           15             7               20         13514      <= 100 ms
8           15             7               24         15365      <= 100 ms
[Chart: API/sec (0 to 18000) versus Cassandra ring size (4 to 24 nodes)]
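One way to read the throughput table above is to divide each measured rate by the ring size; if scaling were perfectly linear, the per-node figure would stay constant. The short calculation below uses only the ring sizes and API/sec values from the table, and the variable names are illustrative.

```python
# (ring size, API calls per second) taken from the table above
results = [(4, 2832), (8, 6072), (12, 9472), (16, 11667), (20, 13514), (24, 15365)]

per_node = [api_per_sec / ring_size for ring_size, api_per_sec in results]

for (ring_size, _), rate in zip(results, per_node):
    print(f"{ring_size:>2} nodes: {rate:6.1f} API calls/sec per node")
```

The per-node rate stays between roughly 640 and 790 calls per second across the whole range, which is consistent with near-linear scaling; note that the number of load generators, API servers and Redis shards was also increased for the larger rings.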
CNS Throughput Scalability
• Most important metric: End-to-end latency
• Fixed number of subscribers, gradually increase #msg/sec published until system is “overwhelmed”
• Increase number of Publish Workers and re-test
• Deployment: 8 node Cassandra Ring, 2 API Servers, 2 Redis Shards, 3-6 Publish Workers
• Test setup: Single topic with 100 HTTP subscribers, 10 min test duration
CNS Throughput Scalability 3 Publish Workers
#PUB/SEC  #MSG/SEC  AVG(LAT)  API AVG(RT)  API P95(RT)  API AVG(CQS)  API P95(CQS)  PROD AVG(RT)  PROD P95(RT)  PROD AVG(CQS)  PROD P95(CQS)  CONS AVG(RT)  CONS P95(RT)  CONS AVG(CQS)  CONS P95(CQS)  HTTP AVG(RT)  HTTP P95(CQS)
5         500       198       21           44           10            26            77            150           77             155            47            96            38             85             12            18
10        1000      177       15           30           8             16            68            119           68             118            39            70            31             59             12            17
20        2000      160       16           29           9             17            69            120           69             120            40            69            31             59             9             16
40        4000      209       18           37           10            22            78            138           78             139            41            74            32             62             11            18
80        8000      75656     28           61           14            27            237           1020          237            1020           143           790           131            770            14            21
CNS Throughput Scalability 6 Publish Workers
#PUB/SEC  #MSG/SEC  AVG(LAT)  API AVG(RT)  API P95(RT)  API AVG(CQS)  API P95(CQS)  PROD AVG(RT)  PROD P95(RT)  PROD AVG(CQS)  PROD P95(CQS)  CONS AVG(RT)  CONS P95(RT)  CONS AVG(CQS)  CONS P95(CQS)  HTTP AVG(RT)  HTTP P95(CQS)
5         500       247       37           117          19            66            141           290           139            290            77            200           57             160            18            34
10        1000      226       41           130          21            79            163           380           162            370            79            200           58             160            17            39
20        2000      199       37           118          20            74            133           280           133            280            68            180           50             150            11            21
40        4000      225       45           140          25            110           148           320           148            320            76            210           53             170            18            25
80        8000      267       48           126          25            80            149           300           149            300            77            180           58             150            22            38
160       16000     145135    76           180          41            120           228           460           228            460            115           280           97             250            28            70
CNS Throughput Scalability
[Chart: latency (ms) versus throughput (msgs/sec, 500 to 8000) for 3 workers and 6 workers]
Use Case: X1 Sports App
Moving Forward
• Follow SNS / SQS APIs
• More load and stress testing
• Ease of deployment and scale up
• More in-house production deployments (currently isolated by application)
• CQS as a Service
Thank You!
http://github.com/Comcast/cmb
http://groups.google.com/forum/#!forum/cmb-user-forum
bwolf@sv.comcast.com
BACKUP
CNS Endurance Test
• Single topic with 5 HTTP subscribers, 65 msg/sec published
• 14 million messages published over 12 hrs
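The endurance figures can be cross-checked with simple arithmetic. The calculation below assumes the publish rate was held constant for the full 12 hours; the interpretation in the comments (that the 14 million figure counts deliveries to subscribers rather than Publish calls) is an inference from the numbers, not something the slide states.

```python
publish_rate = 65     # messages published per second
duration_hours = 12
subscribers = 5

published = publish_rate * duration_hours * 3600  # Publish calls in total
delivered = published * subscribers               # one delivery per subscriber

print(published)  # prints 2808000
print(delivered)  # prints 14040000
```

Under these assumptions about 2.8 million Publish calls fan out to about 14 million HTTP deliveries, which matches the reported total.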
Use Case: EAS