Performance - what it actually is
well, code which does what it's supposed to, and doesn't do it as slow as Rails 3.0's boot time.
performance is treated very differently in every part of a project's lifecycle.
When you're young
When you're young and naive
when you start a project, and it's still low on traffic, write naive code!
Do TDD!
To avoid this :)
Write short and concise code
Don't bother with premature optimization
(when you prematurely optimize, this happens)
READ!
prepare for growth, because you're optimistic and all that. make sure you'll know what to do when shit gets real.
be naive but not TOO naive, though
there are some things which just scream - don't do this! it's gonna suck, BAD! The n+1 query issue is a good example of too-naive code.
The controller
The view
The problem: we have an array of users, and when we iterate over that array we reach for profile_image and for posts, which triggers two queries to the DB for each user. We end up with 2n+1 queries, n being the number of users.
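The original controller and view were screenshots; as a rough sketch of the same pattern (plain Ruby, not ActiveRecord - the class names and the user count are made up), lazy loading looks like this:

```ruby
# Toy model of lazy association loading. Every association access fires its
# own "query", so rendering n users costs 2n+1 queries.

class FakeDB
  attr_reader :query_count
  def initialize; @query_count = 0; end
  def query(_sql); @query_count += 1; end
end

DB = FakeDB.new

class User
  def self.all
    DB.query("SELECT * FROM users")  # 1 query for the whole list
    Array.new(3) { new }             # pretend we got 3 users back
  end

  def profile_image
    DB.query("SELECT * FROM profile_images WHERE user_id = ?")  # 1 per user
    "image.png"
  end

  def posts
    DB.query("SELECT * FROM posts WHERE user_id = ?")           # 1 per user
    []
  end
end

# The "view": iterate and touch both associations, as described above.
User.all.each { |u| u.profile_image; u.posts }

puts DB.query_count  # 2 * 3 + 1 = 7 queries for just 3 users
```

With 3 users that's already 7 queries; with 100 users on the page it's 201.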
ActiveRecord's includes prefetches the associations, so the extra lookups turn into two queries instead of 2n.
The solution
The new controller
now there are only 3 queries, instead of 2n+1 (n being the number of users). Note that this might not be the right thing to do in larger-scale projects. You might want to cache the profile image in Redis, for instance, and completely avoid bringing in the profile_image object from the database.
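The new controller was a screenshot too; in real Rails it's roughly a one-liner like User.includes(:profile_image, :posts) (association names assumed). A toy model of what that prefetch does, continuing the sketch above:

```ruby
# Toy model of what ActiveRecord's `includes` does: fetch each association
# in one batched query up front, then serve it from memory while rendering.

class FakeDB
  attr_reader :query_count
  def initialize; @query_count = 0; end
  def query(_sql); @query_count += 1; end
end

DB = FakeDB.new

class User
  attr_accessor :profile_image, :posts

  def self.all_with_includes
    DB.query("SELECT * FROM users")                                  # query 1
    users = Array.new(3) { new }
    DB.query("SELECT * FROM profile_images WHERE user_id IN (...)")  # query 2
    DB.query("SELECT * FROM posts WHERE user_id IN (...)")           # query 3
    users.each { |u| u.profile_image = "image.png"; u.posts = [] }
    users
  end
end

# Iterating now hits memory, not the database:
User.all_with_includes.each { |u| u.profile_image; u.posts }

puts DB.query_count  # 3 queries, regardless of how many users there are
```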
The importance of TDD
One of the roles I took on upon arriving at FTBPro was kickstarting and leading the move to TDD; we also wrote a bunch of specs for our legacy code. The difference was incredible.
Daily deploys
(instead of weekly deploys)
New code's clean and awesome
More focus on features
because the code is fairly well covered, fewer issues come up in production (fewer being relative, yeah?)
Upgrading made easy
we moved from Rails 3.0 to 3.2 within two weeks, mostly because the vast majority of the issues were discovered in tests.
But this talk is about performance!
When doing TDD, your code will be faster.
● TDD forces you to write short and atomic methods
● we try to make these methods fast because we hate slow specs :)
● code doesn't fail in production, because if it fails, we know about it before deployment
● no long-running methods, because they're short and concise
More performance specific TDD
Using RSpec you can test the time a method takes to run, and set a threshold above which the spec fails! When using the bullet gem, you can set a limit on the number of queries you allow a controller to run. Do benchmarks and performance tests.
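A minimal sketch of that kind of threshold check, written with plain Benchmark and a raise instead of RSpec matchers (slug_for is a made-up method under test, and the 1-second budget is arbitrary):

```ruby
require "benchmark"

# Made-up method under test: turn a title into a URL slug.
def slug_for(title)
  title.downcase.strip.gsub(/\s+/, "-")
end

# Time 1,000 calls; fail loudly if we blow the performance budget.
elapsed = Benchmark.realtime { 1_000.times { slug_for("  Hello World  ") } }
raise "too slow: #{elapsed}s" if elapsed > 1.0

puts format("1000 calls in %.4fs", elapsed)
```

In RSpec you'd wrap the same Benchmark.realtime call in an expectation; the idea is identical - a spec that goes red when the method gets slow.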
original code - written without tests
Rewrite - the specs
the actual code does exactly the same thing, but it's much shorter and much more readable. Because it was written with TDD, every method does only one thing, and is well tested.
Conclusion - do TDD!
● code is shorter
● easier to maintain
● it's tested, so when it breaks we know before it's in production
● when we need to refactor or change it, we can be fairly certain it will still work as intended, because of the tests.
When you're growing
Now, you start growing, and there are growing pains
● because you've done TDD, when you optimize you're not going to break anything (or you are, but you'll see it when the tests run)
● your code is short and concise, so optimizing it will be easy
● because you didn't optimize anything yet, you'll feel what needs to be optimized first (using New Relic and the like)
● again, don't optimize what's easy to optimize; optimize the parts which start causing pain.
How to get the feelin'
New Relic
shows you what's hurting the most
And gives you a breakdown of that
Google Analytics
Browse your site (crazy, right?!)
Listen to users
they may come and complain, or they may just go away. Use Google Analytics to look for pages with an unusually high bounce rate.
Custom tools
statsd and graphite can be quite handy
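statsd speaks a tiny plaintext protocol over UDP, so emitting a custom metric is nearly a one-liner. A hedged sketch (metric name and port are made up; a real app would use a statsd client gem rather than a raw socket):

```ruby
require "socket"

# statsd wire format is "name:value|type"; "ms" marks a timing metric.
payload = "league_page.render_ms:512|ms"

# UDP is fire-and-forget: the send succeeds even with no statsd listening,
# so instrumenting code never blocks or breaks the request.
sock = UDPSocket.new
sock.send(payload, 0, "127.0.0.1", 8125)
sock.close

puts payload
```

Graphite then graphs whatever statsd aggregates, which is how you get dashboards for the parts New Relic doesn't see.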
Real life example
at FTBPro, we have a score table for each league; it gets updated daily(ish) from an external source. We noticed in New Relic that the league page took a long time to load. A short investigation pointed to the table, which led to a tiny change in the code.
Before
After
What? wait! it looks the same!
well, almost. There are two changes: one is a tiny change in variable names to make the code more readable. The second is that we used a caching mechanism to bring in the team (called Subject in our code) without making any queries.
the difference was HUGE: time to build the table with a cold cache went down from 7 seconds to 0.5 seconds.
So - what have we done exactly?
● we removed an n+1 query not by including stuff, but by avoiding the query altogether
● we used a caching mechanism for teams, which takes the team's nick (Barcelona can be referred to as barca, or F.C. Barcelona) and returns the cached team.
● used that cache to speed up a very painful part of the site by a lot.
● and yes, of course the view is cached so the rebuild of the table only happens once a day.
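A hedged sketch of the nick-to-team cache described above. The real code presumably sits on Rails.cache or Redis and looks up Subject records; here a Hash stands in, and all the names are invented:

```ruby
# Toy nick-keyed team cache: many nicks ("barca", "F.C. Barcelona", ...)
# resolve to one cached team, and only a cold lookup touches the database.

class TeamCache
  attr_reader :db_hits

  def initialize
    @store = {}
    @db_hits = 0
  end

  def fetch(nick)
    key = nick.downcase
    @store.fetch(key) do
      @db_hits += 1                     # cold path: one "query", then cached
      team = { name: "FC Barcelona" }   # stand-in for a Subject DB lookup
      @store[key] = team
    end
  end
end

cache = TeamCache.new
cache.fetch("barca")
cache.fetch("Barca")   # warm: served straight from memory
puts cache.db_hits     # 1
```

Building the score table then reads every team from this cache instead of firing a query per row - which is the whole 7s-to-0.5s story.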
When you need to refactor, or rewrite.
refactoring is taking code and changing it, while rewriting is starting from scratch. There are different reasons for refactoring or rewriting:
● code is causing performance issues
● code is too clumsy, and makes debugging very hard and costly
● code just looks horrid
● Tom said so.
But when do we rewrite, and when is it enough just to refactor?
When to refactor
● code is generally ok, maintainable and worth keeping
● small changes would get the desired result easily
● code is well covered with specs
● we're too damn lazy to rewrite it all (yes, it's a valid reason, lazy programmers create short code)
When should we just throw it away and rewrite?
● if maintaining the code costs more than rewriting it, rewrite, and do it well!
● if the code does not have any test coverage and is untestable.
● when code looks like the Flying Spaghetti Monster
● when it was written by Avi Tzurel :)
make sure the new code is good; if you rewrite shit code into new shit code, you've done nothing!
A little bit about queues
DelayedJob, Resque, Sidekiq: they've all got strange names with typos in them, and they all save us from hell.
Move long running stuff to the background!
Let's talk about user registration: a user comes to the site, signs in with Facebook, we get their image, their Facebook friends, etc. It takes a while, maybe even a long while.
Put it aside!
Fetching all that stuff takes a long time. It doesn't have to be that way. We really only need to save the user's name and Facebook details, and that's it. We'll do the rest in the background, using one of the queueing mechanisms Ruby has to offer. This gives the user a better, faster experience.
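The "save now, enrich later" flow can be sketched with a plain Ruby Queue and a worker thread standing in for Resque/Sidekiq (method names and the job body are invented):

```ruby
# One shared job queue and one background worker, simulating a queueing gem.
JOBS = Queue.new

worker = Thread.new do
  loop do
    job = JOBS.pop
    break if job == :shutdown
    job.call   # the slow part: fetch Facebook image, friends, etc.
  end
end

enriched = []

# The request cycle: persist the bare minimum, enqueue the slow work,
# and return to the user immediately.
def register(name, enriched_log)
  user = { name: name }   # stand-in for User.create!(name: name)
  JOBS << -> { enriched_log << "#{name}: fetched friends + image" }
  user
end

register("Alice", enriched)

JOBS << :shutdown
worker.join
puts enriched.first
```

With Sidekiq or Resque the lambda becomes a worker class and the queue lives in Redis, but the shape of the flow is the same.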
Starting to get seriously huge
(ok, maybe this isn't a good image)
Hitting large scale
Q: when do you know you've hit large scale?
A: when your servers crash daily.
now, when you've reached that point, you know you need to do some really drastic stuff to adjust to your new position.
A quick detour to the land of DevOps
● handling large scale requires a lot of resources, and managing these resources effectively.
● cloud services such as Amazon AWS give companies some simple tools to handle scale very well.
● but if you don't know what you're doing, call for help :)
FTBpro's setup on AWS
MySQL with RDS
RDS is Amazon's managed MySQL. It's optimized and easy to set up, and saves us a lot of time on system administration.
Memcached with ElastiCache
ElastiCache is Amazon's memcached service. Same as RDS, it saves us the bother of messing with memcached servers.
Custom redis server
we're thinking about moving it to a cloud service to save us the trouble.
Web servers with nginx+unicorn
nginx+unicorn are like milk and cookies. With the right setup we also get zero-downtime deploys, which are awesome.
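A minimal sketch of the kind of unicorn.rb that enables zero-downtime (USR2) restarts; the worker count and paths are placeholders, not FTBPro's actual config:

```ruby
# unicorn.rb - placeholder values throughout
worker_processes 4
preload_app true

listen "/var/www/app/shared/unicorn.sock", backlog: 64
pid    "/var/www/app/shared/unicorn.pid"

before_fork do |server, worker|
  # On a USR2 restart a new master boots alongside the old one. Once the
  # new workers start forking, quietly retire the old master so there is
  # never a moment with no workers serving requests.
  old_pid = "#{server.config[:pid]}.oldbin"
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill(:QUIT, File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # old master already gone - nothing to do
    end
  end
end
```

nginx just proxies to the unix socket, which stays in place across restarts - that's the "milk and cookies" part.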
Resque servers
they're also built for automatic scaling, just because we're awesome!
CDN cache with Cotendo (Akamai)
logged-out users don't even touch the web servers; their content is served by the CDN.
Build it for quick and automatic scale
● self-deploying servers - when you start the server from its image, it will deploy to itself and start serving traffic / run resque workers
● adding servers is automatic - when there's high traffic, start them up, then kill them when traffic's low
● this allows us to pay the minimum for hosting while staying scalable
careful with these self-deploying robots! make sure they know the robot rules...
The rules:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
ok, back to Ruby (kind of)
When reaching massive scale, we'd start looking for custom solutions. Relational DBs would stay forever, but some things should be moved to other, customized solutions.
● consider using Mongo for document-like data
● consider using Neo4j or another graph DB for representing graph data (sorry Avi, Mongo ain't no graph DB!)
And don't forget to stay naive!
being large scale but still fun and lean can be hard, but pulling it off is worth it!
Thanks for not falling asleep!