Date post: | 01-Nov-2014 |
Upload: | marc-worrell |
ZOTONIC: Make it fast(er)
Marc Worrell — WhatWebWhat / Maximonster
Cross Functional Amsterdam — 2013, Jan 15th.
Let’s make a website
(Just because I need one)
I have <? PHP ?>
It is on this machine.
Everyone uses it.
So it must be good.
Let’s use it... (and think later)
I use <? PHP ?>
Done!
I use <? PHP ?>
Hurray visitors!
I use <? PHP ?>
Oh no! visitors!
I used <? PHP ?>
• What happened?
• I got mentioned on popular blog
• Too many PHP+Apache processes
• Meltdown
I can use <? PHP ?>
• Of course you can
• Use more hardware
• Use caching proxy
• Use xyz and a bit of abc
• Add complexity
• And keep it all running, all the time
I can use ... (fill in)
• Same for RoR, Django, ...
• Problem is not that you can’t scale
• The problem is that you need to scale immediately
What happened?
• Many people followed a popular link
• A process per request
• Death by too many processes
• ... doing the same thing!
• Known as the “Slashdot effect”
Is this typical?
• Web sites are quite small (less than a million pages)
• Except for a couple of huge ones
• Not visited that much (less than 10 pages per second)
• Unless linked from a popular place
• Relatively small set of “hot data”
Then we just cache those pages!?
• Modern web, realtime
• Push, Web Sockets, Personalization
• That means open connections
• More xyz and abc needed
• And... cache invalidation
That is why we made Zotonic
With the help of Open Source, a great community, and ... you?
Zotonic because ...
• A web server should easily handle the load of 99.9999% of all web sites.
• You shouldn’t need to bother with xyz and abc to make your site real time
• The front end developer needs to be happy and in the driver’s seat
• Do more with less hardware (and watts)
Erlang because ...
• Many parallel connections
• Share everything (more later)
• Modern CPUs are multi core, your software should be using them
• Fault tolerance makes a developer’s life easier
• It is stable, very stable
Make it efficient
• Start with a monorail solution
• Fast enough for that 99.9999% of sites
• Rethink the way requests are handled
The stack
(Diagram of the stack; component labels:)
• Database
• HTTP server
• Controllers
• Templates
• Services / API
• Websockets
• Models
• Middleware processes (caching, sessions, renderer, image resizer, module manager, translation tables, ...)
• URL dispatcher
Steps for a request
• Accept
• Parse request
• Dispatch (match host, controller)
• Render template (fetch data)
• Serve result
Where is time spent?
• Simple request: 6000/sec
• Lots of content: 10/sec (or less)
• Fetching data and rendering should be optimized
What takes time?
• Fetch data from database
• Simple query round trip is 1 to 10 msec
• Fetch data from a caching server
• Network round trip is 0.5 msec
• Don’t hit the network or the database
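A back-of-the-envelope sketch of why round trips dominate. The latencies are the ones quoted above; the number of fetches per page (20) is a hypothetical figure chosen for illustration, and the Python here is only a calculator, not Zotonic code.

```python
# Upper bound on pages/sec for one request handler that fetches its data
# sequentially and does nothing but wait on round trips.

def pages_per_second(fetches_per_page: int, round_trip_ms: float) -> float:
    """Pages per second if each page needs `fetches_per_page` round trips."""
    return 1000.0 / (fetches_per_page * round_trip_ms)

FETCHES = 20  # assumption: data fetches needed by a content-heavy page

for label, rtt_ms in [("database, 1 msec query", 1.0),
                      ("database, 10 msec query", 10.0),
                      ("caching server, 0.5 msec", 0.5)]:
    rate = pages_per_second(FETCHES, rtt_ms)
    print(f"{label}: at most {rate:.0f} pages/sec per handler")
```

Even the caching server caps a single handler at a rate far below the 6000/sec of a simple request, which is the argument for keeping hot data in the same memory as the request handler.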
What saves time?
• Don’t repeat work that could have been done once, long ago
• Like HTML escaping and content filtering
• Zotonic stores sanitized content
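A minimal sketch of the "sanitize on write" idea, in Python for illustration only. The `Store` class is hypothetical, and `html.escape` stands in for whatever escaping and content filtering the real system applies; the point is only that the expensive step runs once per edit instead of once per page view.

```python
import html

class Store:
    """Toy in-memory store that sanitizes once, at write time."""

    def __init__(self):
        self._rows = {}

    def save(self, key, user_text):
        # Runs once per edit: escape before storing.
        self._rows[key] = html.escape(user_text)

    def render(self, key):
        # Runs on every page view: no escaping, only string assembly.
        return "<p>" + self._rows[key] + "</p>"

store = Store()
store.save("intro", "<script>alert(1)</script> & more")
print(store.render("intro"))
```

The trade-off is that the store now holds output-ready text, so anything that needs the raw input must be handled before `save`.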
What saves ...?
• Combine similar actions into one
• Especially when they are happening at the same time
• Think of requests, db results, calculations etc.
• Share database connections, protect your database
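One way to picture "combine similar actions into one" is a single-flight helper: when several requests ask for the same thing at the same moment, one of them does the work and the others wait for its result. This is a sketch with Python threads, not Zotonic's Erlang implementation; `SingleFlight`, `slow_query` and the key name are all hypothetical.

```python
import threading
import time

class SingleFlight:
    """Collapse concurrent calls for the same key into one computation."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done_event, result_holder)

    def do(self, key, fn):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
        done, holder = entry
        if leader:
            try:
                holder["value"] = fn()      # only the leader hits the database
            except Exception as exc:
                holder["error"] = exc       # followers see the same failure
            finally:
                with self._lock:
                    del self._inflight[key]
                done.set()
        else:
            done.wait()                     # followers just wait for the leader
        if "error" in holder:
            raise holder["error"]
        return holder["value"]

calls = []

def slow_query():
    calls.append(1)      # count how often the "database" is really queried
    time.sleep(0.5)
    return "rows"

flight = SingleFlight()
results = []
threads = [threading.Thread(
               target=lambda: results.append(flight.do("front-page", slow_query)))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("queries:", len(calls), "answers:", len(results))
```

Ten simultaneous requests should produce ten answers from a single query, which is also what protects the database during a traffic spike.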
Life of a request
• Client side
• Server side
• Templates
• In memory caching
Client side
• Let client (and proxies) cache css, javascript, images etc.
• Combine css and javascript requests: http://example.org/lib/bootstrap/css/bootstrap~bootstrap-responsive~bootstrap-base-site~/css/jquery.loadmask~z.growl~z.modal~site~63523081976.css
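The combined URL above appears to join file names with `~`, to repeat the directory only when it changes, and to end with a timestamp before the extension. The Python sketch below rebuilds that URL under those assumptions. The naming rules are inferred from the one example on the slide, and the function name and file list are hypothetical; this is not taken from Zotonic's source.

```python
import posixpath

def combined_lib_url(files, stamp, base="/lib/"):
    """Join files into one URL; the directory is repeated only when it changes."""
    ext = posixpath.splitext(files[0])[1]
    parts = []
    prev_dir = None
    for path in files:
        directory, name = posixpath.split(path)
        stem = posixpath.splitext(name)[0]
        if prev_dir is None:
            parts.append(posixpath.join(directory, stem))        # first file
        elif directory != prev_dir:
            parts.append("/" + posixpath.join(directory, stem))  # new directory
        else:
            parts.append(stem)                                   # same directory
        prev_dir = directory
    parts.append(str(stamp))  # changes when a file changes, so caches refresh
    return base + "~".join(parts) + ext

url = combined_lib_url(
    ["bootstrap/css/bootstrap.css",
     "bootstrap/css/bootstrap-responsive.css",
     "bootstrap/css/bootstrap-base-site.css",
     "css/jquery.loadmask.css",
     "css/z.growl.css",
     "css/z.modal.css",
     "css/site.css"],
    stamp=63523081976,
    base="http://example.org/lib/",
)
print(url)
```

Seven stylesheets become one request, and because the timestamp is part of the URL the client can cache the result for a long time.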
Server side
• File requests are easily cached
• Checks on timestamp, modification dates
• Cache both compressed and uncompressed versions
• Still access control checks for content (images, pdfs etc.)
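A sketch of the file-serving checks listed above: access control first, then a modification-date comparison that lets the client reuse its copy. The function `serve_file` and its arguments are hypothetical and the code is Python for illustration; only the `If-Modified-Since` / `Last-Modified` / 304 mechanics are standard HTTP.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime

def serve_file(request_headers, mtime, body, allowed=True):
    """Return (status, response_headers, body) for a static file request."""
    if not allowed:
        # Access control runs even for content that is otherwise cacheable.
        return 403, {}, b""
    headers = {"Last-Modified": formatdate(mtime, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            when = parsedate_to_datetime(since)
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            if int(mtime) <= when.timestamp():
                return 304, headers, b""   # client copy is still fresh
        except (TypeError, ValueError):
            pass                           # unparseable date: send the file
    return 200, headers, body

MTIME = 1420070400  # 2015-01-01 00:00:00 UTC, an arbitrary example

status, headers, _ = serve_file({}, MTIME, b"image-bytes")
print(status, headers["Last-Modified"])
revisit = {"If-Modified-Since": headers["Last-Modified"]}
print(serve_file(revisit, MTIME, b"image-bytes")[0])
print(serve_file(revisit, MTIME, b"image-bytes", allowed=False)[0])
```

A repeat visitor gets an empty 304 response, while a visitor who has lost access gets a 403 regardless of what is in their cache.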
Templates
• Do all the model interactions
• Drive page rendering
• Compiled into Erlang byte code
• Whole and partial caching possible
• Cached results are still dynamic
In memory caching
• Two layers
• Memo cache in process dictionary of request handler
• Central shared cache for whole site: depcache
Memo cache
• In process heap of request handler
• Quick access to often used values
• Resources, ACL checks etc.
• Flushed on writes and when growing too big
• Always enabled when rendering templates
Depcache
• Central cache per site
• Key dependencies for consistency
• Garbage collector thread
• Memo functionality for sharing results between processes
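To make "key dependencies for consistency" concrete, here is a small dependency-aware cache in Python. It is a sketch of the idea only: the class, method names and keys are hypothetical and do not mirror the real depcache API, and expired entries are dropped lazily on read instead of by a separate garbage collector.

```python
import time

class DepCache:
    """Cache whose entries are flushed when a key they depend on is flushed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}     # key -> (value, expires_at)
        self._dependents = {}  # dependency key -> set of dependent cache keys

    def set(self, key, value, max_age, depends=()):
        self._entries[key] = (value, self._clock() + max_age)
        for dep in depends:
            self._dependents.setdefault(dep, set()).add(key)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # lazy expiry
            return None
        return value

    def flush(self, key):
        """Drop `key` and, transitively, everything that depends on it."""
        pending, seen = [key], set()
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            self._entries.pop(current, None)
            pending.extend(self._dependents.pop(current, ()))

cache = DepCache()
cache.set("page:home", "<html>home</html>", 3600, depends=["rsc:1", "rsc:2"])
cache.set("menu", "<ul>...</ul>", 3600, depends=["rsc:2"])
cache.flush("rsc:1")   # resource 1 was edited
print(cache.get("page:home"), cache.get("menu"))
```

Editing one resource invalidates exactly the cached fragments that were built from it, so the rest of the cache stays warm.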
Erlang VM considerations
• Cheap processes
• Expensive data copying on messages
• Binaries have their own heap
• String processing is expensive (as in any language)
Erlang VM and Zotonic
• Big data structure #context{}
• Do most work in a single process
• Prune when messaging
• Don’t pass #context{} when messaging (use interface functions)
• Messaging binaries is ok
Aside: webmachine
• HTTP protocol abstraction
• Great for writing controllers
• Needed some work for Zotonic:
• Dispatch list copying
• Memo of some lookups
• Optimizations (process dictionary removal, combine data structures)
Slam dunk protection
• Happens on startup, change of images, templates, cache flush etc.
• Let individual requests fail, but keep the system running
• Methods:
• Single template compiler
• Single image resize process
• Memo cache (share computation)
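The "single process" methods above can be pictured as one worker behind a short queue: expensive jobs run one at a time, and when the queue is full new requests fail immediately instead of piling up. This Python sketch is an illustration of that pattern, not Zotonic's code; `SingleWorker`, `max_pending` and the error message are hypothetical.

```python
import queue
import threading
from concurrent.futures import Future

class SingleWorker:
    """Run expensive jobs one at a time; reject work when the queue is full."""

    def __init__(self, max_pending=2):
        self._jobs = queue.Queue(maxsize=max_pending)
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, fn, *args):
        future = Future()
        try:
            self._jobs.put_nowait((future, fn, args))
        except queue.Full:
            # This one request fails fast; the system as a whole keeps going.
            raise RuntimeError("busy, try again later") from None
        return future

    def _run(self):
        while True:
            future, fn, args = self._jobs.get()
            try:
                future.set_result(fn(*args))
            except Exception as exc:
                future.set_exception(exc)

worker = SingleWorker()
print(worker.submit(str.upper, "thumbnail").result(timeout=5))
```

After a cache flush, a hundred requests for the same missing thumbnail then cost at most a few queued resize jobs instead of a hundred parallel ones.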
Database
• Complaint: “It is SQL, it can’t scale”
• Answer: “Wrong question”
Why PostgreSQL?
• Great stability
• Scales well with multiple cores
• Mature driver
• Full text indexing
• SQL gives good query support
• Known problems and performance
(And we serialize most data into a blob anyway)
NoSQL?
• Tooling (backup, command line)
• Indexing has unproven performance
• Moving target
• Vendor lock-in: the differences between products are too big
• Unproven disk stores
(No)SQL
• Stop worrying
• Optimize the layers above your data store
• Think how you query your data
• Think how you protect your data
Future
• Use more binaries (replace MochiWeb?)
• Faster template compilation and startup
• Pluggable databases
• Better resource pooling
• Circuit breakers for stability
• Add options to use stale data
Future: zynamo
• Distributed version of Zotonic
• For better availability
• ... and scale to more data
• Each node is created equal
• Still SQL stores
• Ongoing work
It works
Come and join us!
• See us at http://zotonic.com/ (we have a movie)
• Follow us at @zotonic
• Join the community, we have exciting stuff to do
¿Questions?