HOW TO SCALE YOUR APP AND WIN THE CLOUD CHALLENGE
QUENTIN ADAM
@WAXZCE
2013
Quentin ADAM
Clever Cloud CEO
@waxzce on twitter
http://www.waxzce.org
WHO AM I?
Java, Scala, Python, Node.js, PHP… apps scaling automatically in the cloud.
We cover your ass, you can focus on your own stuff
http://www.clever-cloud.com
PAAS PROVIDER
WHEN YOU NEED TO SCALE
THERE ARE 2 WAYS
GROWING AND GROWING UNTIL YOU EXPLODE OR BECOME WEIRD
OR SPLIT THE WORK AND MAKE YOUR SOFTWARE WORK AS A TEAM
Build an army of fat apps
YOU CAN DO BOTH
SO WE NEED TO BE ABLE TO DISPATCH THE WORK
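Dispatching can be as simple as handing each request to the next worker in turn. A minimal round-robin sketch, assuming three identical workers; the worker names and the `make_dispatcher` helper are hypothetical, invented for illustration:

```python
import itertools

# Hypothetical pool of identical workers; the names are illustrative only.
WORKERS = ["worker-1", "worker-2", "worker-3"]


def make_dispatcher(workers):
    """Return a function that assigns each job to the next worker in turn."""
    rotation = itertools.cycle(workers)

    def dispatch(job):
        return next(rotation), job

    return dispatch


dispatch = make_dispatcher(WORKERS)
assignments = [dispatch(f"request-{i}")[0] for i in range(4)]
```

This only works if any worker can take any request, which is why the rest of the talk insists on statelessness.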
SCALE OUT: BEST LONG-TERM SOLUTION
• Many workers doing the same thing
• No SPOF (single point of failure)
• Growing is easier
• Introduces best practices
SCALE UP
• 1 Fat instance
• 1 Fat application
• SPOF
• Hard to maintain
• Always has a limit
• Only makes sense in the short term
SO, HOW TO SCALE OUT?
JUST SOME FACTS
SPLIT PROCESS AND STORAGE
Storage
• Databases
• Files
• Sessions
• Events
• …
Code
• Can be replicated
• Stateless
• Process
Picking one instance or another doesn’t matter
STATELESSNESS IS THE KEY
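A minimal sketch of what "picking one instance or another doesn't matter" means in practice. The `SessionStore` and `AppInstance` classes are hypothetical; the dict-backed store only stands in for a real external datastore shared by all instances:

```python
class SessionStore:
    """Stand-in for an external datastore shared by every instance."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class AppInstance:
    """A stateless process: it keeps no request data between calls."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared storage, not local memory

    def handle_visit(self, session_id):
        # Read-then-write is fine for a demo; a real datastore would
        # offer an atomic increment for concurrent instances.
        visits = self.store.get(session_id, 0) + 1
        self.store.set(session_id, visits)
        return visits


store = SessionStore()
a = AppInstance("instance-a", store)
b = AppInstance("instance-b", store)

first = a.handle_visit("session-42")
second = b.handle_visit("session-42")  # another instance continues the count
```

Because the count lives in the store, either instance can be killed or replaced without losing it.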
CONSIDER MORE THINGS AS DATA
• User accounts
• Users data
• Files
• Sessions
• Events
CHOOSE YOUR DATASTORE WISELY
YOU CAN (AND SHOULD) USE MANY DATASTORES
DATASTORE CHOICES ARE DRIVEN BY USAGE
Make decisions based on needs
Do I need atomicity of requests?
Do I need concurrent access?
Do I mostly read or write?
Do I need a relational model?
Do I need big storage capacity?
Do I need high availability?
QUICK EXAMPLE
I need to store sessions
• Not a big volume
• The DB has to manage data TTL
• Data model: K/V
• Multiple writes at the same time
• High availability
QUICK EXAMPLE
I need to store sessions: PostgreSQL (PG)
• Not a big volume: it's OK, PG can handle a small quantity of data
• The DB has to manage data TTL: no, I have to do it manually
• Data model K/V: no, PG is (mainly) relational
• Multiple writes at the same time: no, PG is atomic
• High availability: PG is awesome ;-) PgBouncer or similar allows good clustering
QUICK EXAMPLE
I need to store sessions: Redis
• Not a big volume: it's OK, Redis can handle a small quantity of data
• The DB has to manage data TTL: yes, Redis can do it
• Data model K/V: yes
• Multiple writes at the same time: no, Redis is pseudo-atomic (master/slave)
• High availability: Redis is great, but clustering is tough…
QUICK EXAMPLE
I need to store sessions: Couchbase (CB)
• Not a big volume: it's OK, CB can handle a small quantity of data
• The DB has to manage data TTL: yes, CB can do it
• Data model K/V: yes
• Multiple writes at the same time: OK, this is possible with the memcached protocol
• High availability: clustering is built in, no downtime
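The "K/V with TTL managed by the datastore" requirement can be sketched as follows. This is an in-memory stand-in written for illustration, not the API of Redis, Couchbase, or any real client library; the `TTLSessionStore` class and its injected clock are assumptions:

```python
import time


class TTLSessionStore:
    """In-memory stand-in for a key/value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expiry timestamp)

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: the store cleans up, not the app
            return None
        return value


# A fake clock makes the expiry visible without sleeping.
now = [0.0]
store = TTLSessionStore(clock=lambda: now[0])
store.set("session-42", {"user": "alice"}, ttl_seconds=30)

before_expiry = store.get("session-42")
now[0] = 31.0
after_expiry = store.get("session-42")
```

The application only calls `set` and `get`; expiry is the store's job, which is the point of the TTL requirement. In production the store would run as a separate service, not inside the app's memory.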
USE ONLINE DATABASES
BE READY TO TEST IN JUST A FEW MINUTES
NO NEED TO TRASH YOUR COMPUTER
DON’T BE THAT GUY
DO NOT USE A TECHNOLOGY BECAUSE YOU <3 IT OR BECAUSE IT’S HYPED: USE IT BECAUSE IT FITS YOUR NEEDS
BALANCE YOUR LEARNING CURVE WITH THE TIME SAVED
DO NOT CREATE MONSTERS
COMMON MISTAKES
DO NOT USE MEMORY AS A DATABASE
LIKE: SHARED / GLOBAL VARIABLES, CACHE “IN THE CODE”, INTENSIVE SESSION USAGE…
DO NOT USE A VARIABLE FOR MORE THAN ONE REQUEST
2 + 2 = 4
FOR SAME INPUT, SAME OUTPUT
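A small sketch contrasting the two styles. The function names are hypothetical; the first version keeps a variable alive across calls, the second receives everything it needs as input:

```python
# Anti-pattern: state that outlives a single request.
_last_total = 0


def add_to_total_stateful(amount):
    global _last_total
    _last_total += amount  # the result depends on every earlier call
    return _last_total


# Preferred: everything the function needs arrives as input.
def add_to_total(current_total, amount):
    return current_total + amount


stateful_results = [add_to_total_stateful(2), add_to_total_stateful(2)]
pure_results = [add_to_total(2, 2), add_to_total(2, 2)]
```

The stateful version answers differently to the same call, and would answer differently again on another instance; the pure version always gives 2 + 2 = 4.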
BE HTTP CONSISTENT
GET does not change data on the server
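A minimal sketch of a handler that respects this rule. The `handle` function and the `articles` dict are hypothetical and stand in for a real web framework and datastore:

```python
# Hypothetical in-process "datastore", for illustration only.
articles = {1: "Hello"}


def handle(method, article_id, body=None):
    """Tiny request handler: GET only reads, PUT is the only path that writes."""
    if method == "GET":
        if article_id not in articles:
            return 404, None
        return 200, articles[article_id]
    if method == "PUT":
        articles[article_id] = body
        return 200, body
    return 405, None


snapshot = dict(articles)
get_twice = [handle("GET", 1), handle("GET", 1)]
unchanged_after_get = articles == snapshot
put_result = handle("PUT", 1, "Updated")
```

Since GET never writes, a reverse proxy or cache can replay or serve it from any instance safely.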
CODE WILL FAIL
And data will be lost
DO NOT USE FILE SYSTEM AS DATASTORE
File systems are POSIX compliant
• POSIX is ACID
• POSIX is powerful but is a bottleneck
• A file system is an ops nightmare
• A file system creates coupling (host provider/OS/language)
• A SPOF-free, multi-tenant file system is a unicorn
STORE FILES IN A DATABASE, OR IN A DATASTORE DEDICATED TO FILE MANAGEMENT, LIKE S3 (AWS)
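One way to keep the code independent of the local disk is to hide file storage behind a small interface. A sketch under assumptions: `BlobStore`, `InMemoryBlobStore` and `save_avatar` are hypothetical names, and the in-memory class is only a test double; a production implementation would call S3 or a similar service.

```python
from typing import Protocol


class BlobStore(Protocol):
    """What the application needs from a file datastore."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """Test double; a real implementation would talk to an object store."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


def save_avatar(store: BlobStore, user_id: int, image: bytes) -> str:
    """Application code never touches the local file system."""
    key = f"avatars/{user_id}.png"
    store.put(key, image)
    return key


store = InMemoryBlobStore()
key = save_avatar(store, 7, b"fake-png-bytes")
```

Swapping the backing store then means writing one new class, not hunting for `open()` calls across the code base.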
CAREFUL USE OF DARK MAGIC
SPLIT THE CODE : MODULES
• Smaller code bases
• Deploy modules as services for each other
• Focus on best technology for a problem
SCALE YOUR TEAM
MODULARIZE YOUR TEAM
USE AN EVENT BROKER TO MODULARIZE YOUR APP
• AMQP
• Celery
• 0MQ
• Redis
• JMS
• In some cases: Hadoop, Akka…
• …
CRON is not an event queue
MAKE HARD COMPUTATION ASYNC
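A minimal sketch of the idea, using Python's standard `queue` and `threading` modules as a stand-in for a real broker such as the ones listed above. The `slow_square` and `handle_request` functions are hypothetical placeholders:

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for a broker queue
results = {}


def slow_square(n):
    """Placeholder for an expensive computation."""
    return n * n


def worker():
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut the worker down
            jobs.task_done()
            break
        job_id, n = job
        results[job_id] = slow_square(n)
        jobs.task_done()


def handle_request(job_id, n):
    """Enqueue the heavy work and answer immediately."""
    jobs.put((job_id, n))
    return {"status": "accepted", "job": job_id}


thread = threading.Thread(target=worker, daemon=True)
thread.start()

response = handle_request("job-1", 12)
jobs.join()  # only for the demo: wait until the worker has finished
jobs.put(None)
thread.join()
```

The request handler returns before the computation runs; with a real broker the worker would live in a separate process or on a separate machine.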
ALWAYS USE A REVERSE PROXY
Y U NOT USE ONE ?
DO NOT BUILD “THE SERVER” WITH NO DOC
USE PROCESS DEPLOYMENT
EASY MIGRATION AND INCIDENT MANAGEMENT
KEEP CALM UNDER FIRE
TRACK BUGS & GET METRICS