Why I still develop synchronous
web in the asyncIO era
April 7th, 2017
Giovanni Barillari - pycon otto - Firenze, Italy
Who am I?
I’m Gio! (pronounced as Joe)
trust me, I’m a physicist :)
code principally in Python, Ruby, Java, Javascript
work with web applications since 2012
CTO @ Sellf since 2016
I love Open Source
contributor of web2py framework since 2012
maintainer of pydal library since 2015
author of weppy framework since 2014
Disclaimer
This is quite a subjective talk
I consider the need for a database in web development
asynchronous IO
approach used to achieve concurrency by allowing processing to continue while responses from IO operations
are still being waited upon
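A minimal sketch of that definition, assuming Python 3.7+ for `asyncio.run`. The `fake_io` coroutine is a hypothetical stand-in for a real IO call: two waits of 0.2 seconds overlap, so the total is close to 0.2 seconds rather than 0.4.

```python
import asyncio
import time


async def fake_io(name, delay):
    # asyncio.sleep stands in for a slow IO operation (hypothetical workload).
    await asyncio.sleep(delay)
    return name


async def main():
    start = time.perf_counter()
    # Both waits are in flight at the same time; the loop keeps processing
    # while each response is still pending.
    results = await asyncio.gather(fake_io("a", 0.2), fake_io("b", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
```

Note that nothing here got faster per operation: the waits merely overlap.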
the event loop
asyncio programming is heavily centered on the notion of an event loop, which in its most classic form uses callback functions that receive a call once their corresponding IO
request has data available
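The classic callback form can be sketched with the standard `selectors` module. This is an illustration only, not how asyncio is implemented internally; `run_once` and `on_readable` are hypothetical names.

```python
import selectors
import socket


def run_once():
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    received = []

    def on_readable(sock):
        # Called by the loop only once data is available on the socket.
        received.append(sock.recv(1024))
        sel.unregister(sock)

    # Register the callback for "readable" events on the reader socket.
    sel.register(reader, selectors.EVENT_READ, on_readable)
    writer.send(b"hello")

    # The event loop: wait for ready IO, then dispatch to the callbacks.
    while sel.get_map():
        for key, _events in sel.select(timeout=1):
            key.data(key.fileobj)

    reader.close()
    writer.close()
    sel.close()
    return received
```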
the asyncio era
started with javascript
Javascript was designed to be a client side scripting language for browsers.
Browsers, like any other GUI app, are essentially event machines. All they do is respond to user-initiated events.
then came node
and Javascript became a server side language.
Reason for its success: it fully embraces the event-driven programming paradigm that client-side programmers are
already well-versed in and comfortable with.
the non-blocking IO approach, appropriate for the classic case of lots of usually asleep or arbitrarily slow connections,
became the de facto style in which all web-oriented software should be written.
we all hate threads
asynchronous programming being used to criticise multithreaded programming since:
threads are expensive to create and maintain in an application
threaded programming is difficult and non-deterministic
the python world
continued confusion over what the GIL does and does not do provided fertile ground for the async model to take root
strongly
Since Python 3.3 we moved from the implicit async IO paradigm offered by eventlet and gevent to the futures
and coroutines concepts of the asyncio module.
are you sure?
the throughput myth
since you avoid the wait for I/O context switches,
asynchronous programming styles are innately
superior for concurrent performance in nearly all
cases.
Are you sure your application is I/O bound?
I/O Bound refers to a condition in which the time it
takes to complete a computation is determined
principally by the period spent waiting for
input/output operations to be completed.
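One rough way to check this is to compare wall-clock time with CPU time for a piece of work. The helper below is a sketch under that assumption; `io_wait_share` is a hypothetical name, and the result is only an estimate, since time spent preempted by other processes also counts as "waiting".

```python
import time


def io_wait_share(func, *args, **kwargs):
    """Estimate the fraction of wall time NOT spent on the CPU by this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args, **kwargs)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    if wall <= 0:
        return 0.0
    # Clamp to [0, 1]: timer resolution can push the raw value slightly outside.
    return max(0.0, min(1.0, 1.0 - cpu / wall))


# time.sleep stands in for a blocking IO call: almost all wall time is waiting.
share = io_wait_share(time.sleep, 0.2)
```

A share close to 1 suggests the work is I/O bound; a share close to 0 suggests the time is going into Python code, where an event loop does not help.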
Are you sure context-switching is the bottleneck in your real world application?
Python is slow
I mean REALLY slow
Web applications deal with databases
communication with the database takes up a
majority of the time spent in a database-centric
application
This is common wisdom for compiled languages, but Python is very slow compared to such systems.
image made by zzzeek
asyncio is slow
Insert into postgres a few million rows:
Python 2.7.8 threads (22k r/sec, 22k r/sec)
Python 3.4.1 threads (10k r/sec, 21k r/sec)
Python 2.7.8 gevent (18k r/sec, 19k r/sec)
Python 3.4.1 asyncio (8k r/sec, 10k r/sec)
benchmarks made by zzzeek
uvloop
benchmarks made by magicstack
but these are just TCP connections
dude wait, the magicstack guys also published asyncpg!
where the core parsing is written in Cython
and sadly is not DBAPI v2 compatible (PEP249)
web development is something more than network
what are we benchmarking?
the benchmarks’ fairy dust
benchmarks from Falcon website
benchmarks made by TechEmpower
we should stop looking just at numbers
let’s make a pointless benchmark
i5-6600k - OSX 10.11.6 - docker 17.03.0 - python 3.6.0 - wrk -d 15 -c [8-128] -t 4
serialize {“message”: “Hello, World!”} in json
let’s make a realistic benchmark
i5-6600k - OSX 10.11.6 - docker 17.03.0 - python 3.6.0 - wrk -d 15 -c [8-128] -t 4
load 20 records from postgres and serialize in json
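The shape of that workload can be sketched as follows. To stay self-contained the sketch swaps in an in-memory `sqlite3` database for postgres, and the `records` table and its columns are assumptions, so it shows the handler's work rather than reproducing the benchmark.

```python
import json
import sqlite3


def make_database():
    # In-memory sqlite3 stands in for postgres here (an assumption of this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO records (name) VALUES (?)",
        [("record %d" % i,) for i in range(100)],
    )
    conn.commit()
    return conn


def load_records_as_json(conn, limit=20):
    # The per-request work: one query, row conversion, JSON serialization.
    rows = conn.execute(
        "SELECT id, name FROM records ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    return json.dumps([{"id": row[0], "name": row[1]} for row in rows])


conn = make_database()
body = load_records_as_json(conn)
```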
when you do benchmarks, be sure of what you’re actually benchmarking
the json serialization benchmark on sanic amounts to benchmarking:
MagicStack’s httptools library vs gunicorn/uwsgi HTTP parsing
ujson vs standard json library
which are faster independently of asyncio
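The serializer part can be measured in isolation, with no server or event loop involved. A minimal sketch using only the standard library; `time_serializer` is a hypothetical helper that accepts any `dumps`-style callable.

```python
import json
import timeit

PAYLOAD = {"message": "Hello, World!"}


def time_serializer(dumps, payload=PAYLOAD, number=10000):
    """Return the average seconds per call of dumps(payload)."""
    total = timeit.timeit(lambda: dumps(payload), number=number)
    return total / number


stdlib_seconds = time_serializer(json.dumps)
```

If ujson is installed, `time_serializer(ujson.dumps)` gives the figure to compare against, and the difference is there whether the framework is async or not.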
the code simplicity
threads are bad. asyncio code is more explicit and
you’ll have fewer bugs in your program.
The principle is basically:
I want context switches syntactically explicit in my
code. If they aren’t, reasoning about it is
exponentially harder.
In practice, you’ll end up with so many yield from or await lines in your code that you end up with
well, I guess I could context switch just about
anywhere
Which is the problem you were trying to avoid in the first place.
def get_account_data():
    user = database.get_user()
    preferences = database.get_user_preferences(user)
    post_count = database.get_post_count(user)
    return locals()
async def get_account_data():
    user = await database.get_user()
    preferences = await database.get_user_preferences(user)
    post_count = await database.get_post_count(user)
    return locals()
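To see why every await is a place where another request may run, here is a small runnable sketch, assuming Python 3.7+. `FakeDatabase`, `other_request` and the `trace` list are hypothetical; `asyncio.sleep(0)` stands in for a real database round trip.

```python
import asyncio


class FakeDatabase:
    # Each call yields to the event loop once, like a real network round trip.
    async def get_user(self):
        await asyncio.sleep(0)
        return "gio"

    async def get_user_preferences(self, user):
        await asyncio.sleep(0)
        return {"theme": "dark"}

    async def get_post_count(self, user):
        await asyncio.sleep(0)
        return 42


async def get_account_data(database, trace):
    trace.append("account: start")
    user = await database.get_user()
    trace.append("account: got user")
    preferences = await database.get_user_preferences(user)
    trace.append("account: got preferences")
    post_count = await database.get_post_count(user)
    trace.append("account: got post_count")
    return {"user": user, "preferences": preferences, "post_count": post_count}


async def other_request(trace):
    # A second handler that gets to run whenever the first one awaits.
    for _ in range(3):
        trace.append("other request")
        await asyncio.sleep(0)


async def main():
    trace = []
    await asyncio.gather(
        get_account_data(FakeDatabase(), trace), other_request(trace)
    )
    return trace


trace = asyncio.run(main())
```

Printing `trace` shows entries from the other request landing between the steps of `get_account_data`, which is the interleaving you still have to reason about.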
forget about thread locals
from framework import request, response, session

def message(value):
    return {'message': value, 'page': request.page}

@app.route('/foo')
def foo():
    return message('foo')

@app.route('/bar')
def bar():
    return message('bar')
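How a module-level `request` can be safe in a threaded server is worth a sketch. The following is an assumption about the general technique, not the code of any particular framework: `_RequestProxy`, `Request`, `handle` and `serve` are hypothetical names built on `threading.local`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_state = threading.local()


class Request:
    def __init__(self, page):
        self.page = page


class _RequestProxy:
    # Forwards attribute access to the request bound to the current thread.
    def __getattr__(self, name):
        return getattr(_state.request, name)


request = _RequestProxy()


def message(value):
    # No request argument needed: the proxy resolves it per thread.
    return {"message": value, "page": request.page}


def handle(page):
    # What a threaded server would do around each handler call.
    _state.request = Request(page)
    try:
        return message("foo")
    finally:
        del _state.request


def serve(pages):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(handle, pages))


responses = serve(["/a", "/b", "/c"])
```

With a single-threaded event loop, all requests share one thread, so this trick no longer isolates them and the request has to be passed around explicitly.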
def message(value, request):
    return {'message': value, 'page': request.page}

@app.route('/foo')
async def foo(request, response, session):
    return message('foo', request)

@app.route('/bar')
async def bar(request, response, session):
    return message('bar', request)
your code is just less DRY
are we re-inventing the wheel?
Remember tornado?
aiohttp, muffin, Kyoukai, sanic..
can you use any of these in production to write a real app?
are we just moving the dust?
Before asyncio these were your 2 best friends
nginx
uwsgi
why?
nginx uses an event loop to process requests
it is pure C
with asyncio you would use gunicorn or other wsgi/asgi servers
and I still ask myself
is it better to put it behind nginx or not?
does it mean we’re moving the HTTP stack to python code?
you saying you don’t use asyncio?
Of course I use it.
when I do HTTP requests
import json
from urllib.parse import urlencode

import aiohttp
import aiohttp.web

@app.register('/oauth/{provider}')
def oauth(request):
    provider = request.match_info.get('provider')
    client, _ = yield from app.ps.oauth.login(provider, request)
    user, data = yield from client.user_info()
    url = app.cfg['MOON_HOST'] + '/v1/oauth/' + provider
    resp = yield from aiohttp.request(
        'POST', url, data=json.dumps({'user': user, 'data': data}))
    redir_url = app.cfg['APP_HOST'] + '/?'
    rv = yield from resp.json()
    redir_url += urlencode(rv)
    raise aiohttp.web.HTTPFound(redir_url)
summing up
asyncio is awesome compared to the nodejs world
uvloop is just amazing
with python 3.6 asyncio seems pretty stable
pypy started support of async code in the latest rc
concurrency doesn’t mean things go faster
there’s no need to asyncify everything
avoid Hype Driven Development
async or not, the performance of your application highly depends on your application code
The future is bright, but we’re not there yet
I still see more cons than pros in turning web development async
Thank you.
Let’s discuss.