Scaling the Britain's Got Talent Buzzer

Date post: 28-Nov-2014
Upload: malcolm-box
Description:
How Live Talkback scaled the Britain's Got Talent buzzer to support 50,000 requests/second
Page 1: Scaling the Britain's Got Talent Buzzer

Powering the Britain’s Got Talent buzzer*

*And Big Data

Big Data Meetup, London 25/5/2011

Thursday, 26 May 2011

Page 2: Scaling the Britain's Got Talent Buzzer

What we do

Page 3: Scaling the Britain's Got Talent Buzzer

Me

Malcolm Box, Co-founder & CTO

[email protected]

@malcolmbox

Page 4: Scaling the Britain's Got Talent Buzzer

The Buzzer

BIG DATA

Page 5: Scaling the Britain's Got Talent Buzzer

The challenge

10 Million+ viewers

Design goal of 50,000 requests/s, 10,000 buzzes/second

Equivalent to 130 Billion requests/month

But just on Saturday night

And four weeks to build
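
The monthly figure above follows from sustaining the design rate around the clock. A minimal sketch of that arithmetic, assuming a 30-day month (the slide does not say which month length was used; the function name is illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: the slide does not state the month length


def monthly_requests(requests_per_second, days=DAYS_PER_MONTH):
    """Requests served in a month if the given rate were sustained 24/7."""
    return requests_per_second * SECONDS_PER_DAY * days


total = monthly_requests(50_000)
print(f"{total:,} requests/month")  # prints "129,600,000,000 requests/month"
```

That is roughly 130 billion, matching the slide. The point of the comparison is that the real load arrives only during the Saturday night broadcast, so the platform has to be sized for the peak rate rather than the monthly average.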

Page 6: Scaling the Britain's Got Talent Buzzer

The challenge

Source: http://www.google.com/adplanner/static/top1000/#

Where do 130 Billion requests fit?

Page 7: Scaling the Britain's Got Talent Buzzer

Where we started....

[Architecture diagram; labels only]

app.livetalkback.com: ELB, 2 x Webserver (Django, Ubuntu), MySQL

cdn.livetalkback.com: S3, CloudFront

Control plane: Zabbix

Page 8: Scaling the Britain's Got Talent Buzzer

Step 1: Testing

Started with a platform with a previous peak of 100 requests/s

No idea where it would break

Tsung! http://tsung.erlang-projects.org/

Page 9: Scaling the Britain's Got Talent Buzzer

Step 2: ELB

Amazon Elastic Load Balancer

“Infinite capacity”

BUT very long impulse response and NO controls :(

HAProxy to the rescue

5K requests/s per node
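
The per-node figure above gives a rough way to size the load-balancer tier against the 50,000 requests/s design goal. A small sketch of that calculation; the `headroom` parameter and the function name are illustrative assumptions, not something the talk specifies:

```python
import math


def nodes_needed(target_rps, per_node_rps, headroom=0.0):
    """Smallest node count whose combined capacity covers the target rate.

    headroom is a hypothetical safety margin, e.g. 0.25 for 25% spare capacity.
    """
    if per_node_rps <= 0:
        raise ValueError("per_node_rps must be positive")
    return math.ceil(target_rps * (1 + headroom) / per_node_rps)


print(nodes_needed(50_000, 5_000))                 # prints 10
print(nodes_needed(50_000, 5_000, headroom=0.25))  # prints 13
```

Unlike the ELB, a fixed pool sized this way can be brought up before the show starts, so it does not need to react to the traffic spike after it has begun.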

Page 10: Scaling the Britain's Got Talent Buzzer

Step 3: Avoid the DB

MySQL was never going to be able to handle 10,000 writes/s, nor 50,000 reads/s

“Hey, Django does memcached. Problem solved”

Help, our memcached server I/O is maxed out :(

Two-layer cache: https://gist.github.com/953524

Write-behind data
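
The two ideas on this slide can be sketched as follows. This is not the code from the gist linked above; it is a minimal illustration under stated assumptions: a short-lived per-process cache sits in front of the shared cache so hot keys stop hitting the memcached server on every request, and writes are buffered and flushed in batches. A plain dict stands in for memcached, and all class and parameter names are hypothetical.

```python
import time


class TwoLayerCache:
    """Per-process cache (layer 1) in front of a shared cache (layer 2)."""

    def __init__(self, shared, local_ttl=1.0, clock=time.monotonic):
        self.shared = shared        # stand-in for a memcached client
        self.local_ttl = local_ttl  # seconds a value may be served locally
        self.clock = clock
        self.local = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self.clock()
        entry = self.local.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # layer-1 hit: no network I/O
        value = self.shared.get(key)     # layer-2 lookup
        if value is not None:
            self.local[key] = (now + self.local_ttl, value)
        return value


class WriteBehindBuffer:
    """Collects records in memory and hands them on in batches."""

    def __init__(self, flush_fn, max_items=1000):
        self.flush_fn = flush_fn    # e.g. a bulk insert into the datastore
        self.max_items = max_items
        self.pending = []

    def add(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.max_items:
            self.flush()

    def flush(self):
        if self.pending:
            batch, self.pending = self.pending, []
            self.flush_fn(batch)


class CountingStore(dict):
    """Dict that counts reads, standing in for a remote memcached server."""

    reads = 0

    def get(self, key, default=None):
        self.reads += 1
        return super().get(key, default)


shared = CountingStore(current_show="bgt-final")
cache = TwoLayerCache(shared, local_ttl=60)
for _ in range(100):
    cache.get("current_show")
print(shared.reads)  # prints 1: the other 99 reads were served locally
```

The trade-off is staleness: a value can be up to `local_ttl` seconds out of date on any given web server, and records still sitting in a write-behind buffer are lost if the process dies before the next flush.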

Page 11: Scaling the Britain's Got Talent Buzzer

But we want analytics!

Now 10K things to write to disk every second

Logging? Database?

This is starting to look like BIG DATA

Page 12: Scaling the Britain's Got Talent Buzzer

Step 4: Baby

Page 13: Scaling the Britain's Got Talent Buzzer

Step 5: Cassandra

Deployed Cassandra cluster on EC2 to handle buzz records

Tested to > 10K writes/s

All good!

“So how many users did we have last night?”
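
The closing question on this slide is the catch: a store tuned for fast appends does not answer ad-hoc questions for free. The sketch below illustrates the general trade-off with an in-memory stand-in; it does not use any Cassandra API, and the row layout (buzzes keyed by show and minute) is an assumption made only for illustration.

```python
from collections import defaultdict


class BuzzLog:
    """In-memory stand-in for a write-optimised store of buzz records."""

    def __init__(self):
        self.rows = defaultdict(list)          # (show, minute) -> [user_id, ...]
        self.users_by_show = defaultdict(set)  # aggregate kept at write time

    def record_buzz(self, show, minute, user_id):
        # The write path is a cheap append...
        self.rows[(show, minute)].append(user_id)
        # ...plus one extra update if the question is known in advance.
        self.users_by_show[show].add(user_id)

    def distinct_users_by_scan(self, show):
        """Ad-hoc answer: has to walk every row belonging to the show."""
        seen = set()
        for (row_show, _minute), users in self.rows.items():
            if row_show == show:
                seen.update(users)
        return len(seen)

    def distinct_users_precomputed(self, show):
        """Cheap answer, but only for questions anticipated at write time."""
        return len(self.users_by_show[show])


log = BuzzLog()
for minute, user in [(0, "u1"), (0, "u2"), (1, "u1"), (2, "u3")]:
    log.record_buzz("bgt", minute, user)
print(log.distinct_users_by_scan("bgt"))      # prints 3
print(log.distinct_users_precomputed("bgt"))  # prints 3
```

Questions nobody planned for need either a full scan or a separate batch-processing layer over the stored records.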

Page 14: Scaling the Britain's Got Talent Buzzer

Where we ended...

[Architecture diagram; labels only]

app.livetalkback.com: 2 x HAProxy, 2 x Webserver (Django, Ubuntu), 2 x Memcached, 2 x Cassandra, RDS Master

cdn.livetalkback.com: S3, CloudFront

Control plane: Chef, Zabbix

Scale labels: 10 nodes, 100+ nodes

Page 15: Scaling the Britain's Got Talent Buzzer

Scaling up - and down

Configuring 100+ servers by hand each week would have been a pain

Used Chef to automate

Also builds the test swarm

http://wiki.opscode.com/display/chef/Home

Page 16: Scaling the Britain's Got Talent Buzzer

Now what?

Still challenges with analytics & ad-hoc queries

Looking at Brisk and Hadoop

We’re sucking the Twitter firehose for Tellybug

MySQL is coping so far, but only just

Page 17: Scaling the Britain's Got Talent Buzzer

Questions?

[email protected]

@malcolmbox
