Page 1: PyCon2013

Async I/O for Python 3 (PyCon 2013 keynote)

Guido van Rossum

Page 2: PyCon2013

This all started on python-ideas...

• When someone proposed to fix asyncore.py
• http://mail.python.org/pipermail/python-ideas/2012-September/016185.html
  – Subject: asyncore: included batteries don't fit
  – Date: September 22, 2012
  – By October 6 it was a centithread
  – On October 12 I started several new threads
  – On December 12 I first posted PEP 3156

Page 3: PyCon2013

Take a deep breath

Page 4: PyCon2013

What is async I/O?

• Do something else while waiting for I/O
• It's an old idea (as old as computers)
• With lots of approaches
  – threads, callbacks, events...
• I'll come back to this later

Page 5: PyCon2013

Why async I/O?

• I/O is slow compared to other work
  – the CPU is not needed to do I/O
• Keep a UI responsive
  – avoid beach ball while loading a url
• Want to do several/many I/O things at once
  – some complex client apps
  – typical server apps

Page 6: PyCon2013

Why not use threads?

• (Actually you may if they work for you!)
• OS threads are relatively expensive
• Max # open sockets >> max # threads
• Preemptive scheduling causes races
  – "solved" with locks

Page 7: PyCon2013

Async I/O without threads

• select(), poll(), etc.
• asyncore :-(
• write your own
• frameworks, e.g. Twisted, Tornado, zeroMQ
• Wrap C libraries, e.g. libevent, libev, libuv
• Stackless, gevent, eventlet
• (Some overlap)

Page 8: PyCon2013

Downsides

• Too many choices
• Nobody likes callbacks
• APIs not always easy
• Standard library doesn't cooperate

Page 9: PyCon2013

So, about gevent...

• Scary implementation details
  – x86 CPython specific stack-copying code
• Monkey-patching
  – "patch-and-pray"
• Don't know when it task switches
  – could be not enough
  – could be unexpected

Page 10: PyCon2013

So what to do?

Page 11: PyCon2013

No, really!

Page 12: PyCon2013

Let's standardize the event loop

• At the bottom of all of these is an event loop
  – (that is, all except OS threads)
• Event loop multiplexes I/O
• Various other features also common

Page 13: PyCon2013

Why is the event loop special?

• Serializes event handling
  – handle only one event at a time
• There should be only one
  – otherwise it's not serializing events
• Each framework has its own event loop API
  – even though the functionality has much overlap

Page 14: PyCon2013

What functionality is needed?

• start, stop running the loop
  – variant: always running
• schedule callback DT in the future (may be 0)
  – also: repeated timer callback
• set callback for file descriptor when ready
  – variant: call when I/O done
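
The three requirements above can be sketched as a toy loop. This is only an illustration, not Tulip's implementation: the names TinyLoop and demo are hypothetical, and the sketch assumes the stdlib selectors module (added in Python 3.4) for readiness notification and a heap for timers.

```python
import heapq
import selectors
import socket
import time


class TinyLoop:
    """Toy event loop: timed callbacks plus read-readiness callbacks."""

    def __init__(self):
        self._timers = []      # heap of (when, seq, callback, args)
        self._seq = 0          # tie-breaker so callbacks are never compared
        self._selector = selectors.DefaultSelector()
        self._running = False

    def call_later(self, delay, callback, *args):
        self._seq += 1
        when = time.monotonic() + delay
        heapq.heappush(self._timers, (when, self._seq, callback, args))

    def add_reader(self, fileobj, callback, *args):
        self._selector.register(fileobj, selectors.EVENT_READ, (callback, args))

    def remove_reader(self, fileobj):
        self._selector.unregister(fileobj)

    def stop(self):
        self._running = False

    def run(self):
        # Runs until stop() is called or nothing is left to wait for.
        self._running = True
        while self._running and (self._timers or self._selector.get_map()):
            # Sleep in select() only until the next timer is due.
            timeout = None
            if self._timers:
                timeout = max(0.0, self._timers[0][0] - time.monotonic())
            for key, _events in self._selector.select(timeout):
                callback, args = key.data
                callback(*args)
            # Run every timer whose deadline has passed, one at a time.
            now = time.monotonic()
            while self._timers and self._timers[0][0] <= now:
                _when, _seq, callback, args = heapq.heappop(self._timers)
                callback(*args)


def demo():
    loop = TinyLoop()
    events = []
    a, b = socket.socketpair()

    def on_readable():
        events.append(("read", b.recv(100)))
        loop.remove_reader(b)

    loop.add_reader(b, on_readable)
    loop.call_later(0, a.send, b"ping")      # DT of 0: run as soon as possible
    loop.call_later(0.05, events.append, ("timer", None))
    loop.run()
    a.close()
    b.close()
    return events


if __name__ == "__main__":
    print(demo())
```

Every callback runs to completion before the next one starts, which is the "serializes event handling" property from the previous slide.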

Page 15: PyCon2013

Interop

• Most frameworks don't interoperate
• There's a small cottage industry adapting the event loop from framework X to be usable with framework Y
  – Tornado now maintains a Twisted adapter
  – There's also a zeroMQ adapter for Tornado
  – I hear there's a gevent fork of Tornado
  – etc.

Page 16: PyCon2013

Enter PEP 3156 and Tulip

Page 17: PyCon2013

I know this is madness

• Why can't we all just use Tornado?
• Let's just import Twisted into the stdlib
• Standardizing gevent solves all its problems
  – no more monkey-patching
  – greenlets in the language
• Or maybe use Stackless Python?
• Why reinvent the wheel?
  – libevent/ev/uv is the industry standard

Page 18: PyCon2013

Again: PEP 3156 and Tulip

• I like to write clean code from scratch
• I also like to learn from others
• I really like clean interfaces
• PEP 3156 and Tulip satisfy all my cravings

Page 19: PyCon2013

What is PEP 3156? What is Tulip?

• PEP 3156:
  – standard event loop interface
  – slated for Python 3.4
• Tulip:
  – experimental prototype (currently)
  – reference implementation (eventually)
  – additional functionality (maybe)
  – works with Python 3.3 (always)

Page 20: PyCon2013

PEP 3156 is not just an event loop

• It's also an interface to change the event loop implementation (to another conforming one)
  – this is the path to framework interop
  – (even gevent!)
• It also proposes a new way of writing callbacks
  – (that doesn't actually use callbacks)

Page 21: PyCon2013

But first, the event loop

• Influenced by both Twisted and Tornado
• Reviewed by (some) other stakeholders
• The PEP is not in ideal state yet
• I am going to sprint Mon-Tue on PEP and Tulip

Page 22: PyCon2013

Event loop method groups

• starting/stopping the loop
• basic callbacks
• I/O callbacks
• thread interactions
• socket I/O operations
• higher-level network operations

Page 23: PyCon2013

Starting/stopping the event loop

• run()  # runs until nothing more to do
• run_forever()
• run_once([timeout])
• run_until_complete(future, [timeout])
• stop()

• May change these around a bit
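
For comparison, a small sketch against asyncio, the stdlib module that PEP 3156 became in Python 3.4. The method list did change: asyncio kept run_forever(), run_until_complete() and stop(), but has no public run() or run_once(), and run_until_complete() takes no timeout. The sketch uses async/await syntax (Python 3.5+) rather than yield-from, and the function names are made up for illustration.

```python
import asyncio


async def answer():
    await asyncio.sleep(0.01)    # stand-in for real I/O
    return 42


def run_to_completion():
    loop = asyncio.new_event_loop()
    try:
        # Runs the loop until the coroutine finishes, then returns its result.
        return loop.run_until_complete(answer())
    finally:
        loop.close()


def run_until_stopped():
    loop = asyncio.new_event_loop()
    ticks = []
    loop.call_later(0.01, ticks.append, "tick")
    loop.call_later(0.02, loop.stop)    # without this, run_forever() never returns
    try:
        loop.run_forever()
    finally:
        loop.close()
    return ticks


if __name__ == "__main__":
    print(run_to_completion(), run_until_stopped())
```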

Page 24: PyCon2013

Basic callbacks

• call_soon(callback, *args)
• call_later(delay, callback, *args)
• call_repeatedly(interval, callback, *args)
• call_soon_threadsafe(callback, *args)

• All return a Handler instance which can be used to cancel the callback
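
A hedged sketch of these calls as they exist in today's asyncio, where the returned object is called a Handle (TimerHandle for call_later) and call_repeatedly() was not kept. The function name demo_callbacks is hypothetical.

```python
import asyncio


def demo_callbacks():
    loop = asyncio.new_event_loop()
    log = []
    loop.call_soon(log.append, "soon")              # runs on the next loop iteration
    loop.call_later(0.01, log.append, "later")      # runs after a delay in seconds
    doomed = loop.call_later(0.02, log.append, "never")
    doomed.cancel()                                 # the returned handle cancels the callback
    loop.call_later(0.05, loop.stop)
    try:
        loop.run_forever()
    finally:
        loop.close()
    return log


if __name__ == "__main__":
    print(demo_callbacks())
```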

Page 25: PyCon2013

I/O callbacks

• add_reader(fd, callback, *args) -> Handler
• remove_reader(fd)
• add_writer(fd, callback, *args) -> Handler
• remove_writer(fd)

• Not all fd types are always acceptable• fd may be an object with a fileno() method
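
A minimal sketch of a readiness callback, assuming asyncio's selector-based loop (the only kind that is guaranteed to offer add_reader). The socket pair stands in for a real connection and demo_reader is a made-up name.

```python
import asyncio
import socket


def demo_reader():
    loop = asyncio.SelectorEventLoop()
    rsock, wsock = socket.socketpair()
    rsock.setblocking(False)
    received = []

    def on_readable():
        received.append(rsock.recv(100))
        loop.remove_reader(rsock)    # stop watching, otherwise the callback keeps firing
        loop.stop()

    loop.add_reader(rsock, on_readable)     # a socket is accepted: it has fileno()
    loop.call_soon(wsock.send, b"hello")
    try:
        loop.run_forever()
    finally:
        loop.close()
        rsock.close()
        wsock.close()
    return received


if __name__ == "__main__":
    print(demo_reader())
```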

Page 26: PyCon2013

UNIX signals

• add_signal_handler(sig, callback, *args) -> Handler

• remove_signal_handler(sig)

• Raise RuntimeError if signals are unsupported

Page 27: PyCon2013

Thread interactions

• wrap_future(future) -> Future
• run_in_executor(executor, callback, *args) -> Future

• Used to run code in another thread
  – sometimes there is no alternative
  – e.g. getaddrinfo(), database connections

• Threads may use call_soon_threadsafe()
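
A sketch of pushing blocking work onto a thread, written against current asyncio. There, run_in_executor() is still a loop method while wrap_future() is a module-level function. blocking_lookup is a hypothetical stand-in for something like a database call.

```python
import asyncio
import concurrent.futures
import time


def blocking_lookup(name):
    time.sleep(0.01)             # stand-in for a blocking call
    return name.upper()


async def main():
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor.
    first = await loop.run_in_executor(None, blocking_lookup, "alpha")
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        # wrap_future adapts a concurrent.futures.Future so it can be awaited.
        second = await asyncio.wrap_future(pool.submit(blocking_lookup, "beta"))
    return first, second


if __name__ == "__main__":
    print(asyncio.run(main()))
```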

Page 28: PyCon2013

Socket I/O operations

• sock_recv(sock, nbytes) -> Future
• sock_sendall(sock, data) -> Future
• sock_accept(sock) -> Future
• sock_connect(sock, address) -> Future

• Only transports should use these
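
For illustration only, since application code is not supposed to call these directly: a sketch using the same method names in current asyncio, with a local socket pair instead of a network connection. ping_pong is a made-up name.

```python
import asyncio
import socket


async def ping_pong():
    loop = asyncio.get_running_loop()
    a, b = socket.socketpair()
    a.setblocking(False)         # the loop.sock_* methods require non-blocking sockets
    b.setblocking(False)
    try:
        await loop.sock_sendall(a, b"xyzzy")
        return await loop.sock_recv(b, 100)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(asyncio.run(ping_pong()))
```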

Page 29: PyCon2013

High-level network operations

• getaddrinfo(host, port, ...) -> Future
• getnameinfo(address, [flags]) -> Future
• create_connection(factory, host, port, ...) -> Future
• start_serving(factory, host, port, ...) -> Future

• Use these in your high-level code

Page 30: PyCon2013

Um, Futures?

• Like PEP 3148 Futures (new in Python 3.2):
  – from concurrent.futures import Future
  – f.set_result(x), f.set_exception(e)
  – f.result(), f.exception()
  – f.add_done_callback(func)
  – wait(fs, [timeout, [flags]]) -> (done, not_done)
  – as_completed(fs, [timeout]) -> <iterator>

• However, adapted for use with coroutines

Page 31: PyCon2013

Um, coroutines?

• Whoops, let me get back to that later

Page 32: PyCon2013

What's a Future?

• Abstraction for a value to be produced later
  – Also known as Promises (check wikipedia)
  – Per wikipedia, these are explicit futures
• API:
  – result() blocks until result is ready
  – an exception is a "result" too: will be raised!
  – exception() blocks and checks for exceptions
  – done callbacks called when result/exc is ready
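
The blocking PEP 3148 flavor of this API can be tried with a thread pool. A small sketch; double, fail and demo_futures are hypothetical names.

```python
import concurrent.futures


def double(x):
    return 2 * x


def fail():
    raise ValueError("boom")


def demo_futures():
    seen = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        ok = pool.submit(double, 21)
        bad = pool.submit(fail)
        # Called when the result is ready (immediately if it already is).
        ok.add_done_callback(lambda f: seen.append(f.result()))
        done, not_done = concurrent.futures.wait([ok, bad], timeout=5)
    # result() returns the value; exception() returns the stored exception.
    # Calling bad.result() would re-raise the ValueError instead.
    return ok.result(), type(bad.exception()).__name__, len(done), len(not_done), seen


if __name__ == "__main__":
    print(demo_futures())
```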

Page 33: PyCon2013

Futures and coroutines

• Not the concurrent.futures.Future class!
• Nor exactly the same API
• Where PEP 3148 "blocks", we must use...

Page 34: PyCon2013

Drum roll, please

Page 35: PyCon2013

PEP 380: yield-from

• @coroutine
  def getresp():
      s = socket()
      yield from loop.sock_connect(s, (host, port))
      yield from loop.sock_sendall(s, b'xyzzy')
      data = yield from loop.sock_recv(s, 100)

• Yes, you can now return from a generator!
• Please, do not write real code like this! :-)
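
The PEP 380 mechanics can be seen without any event loop. In this sketch the drive function is a hypothetical stand-in for the loop: it answers each request the generator yields and collects the final return value.

```python
def read_reply():
    # Each bare yield hands control to whoever drives the generator
    # and receives the value passed to send().
    header = yield "want header"
    body = yield "want body"
    return header + body                 # a generator may return a value (PEP 380)


def get_response():
    reply = yield from read_reply()      # delegates, then receives the return value
    return reply.upper()


def drive(gen, answers):
    """Hypothetical driver standing in for an event loop."""
    requests = []
    try:
        requests.append(next(gen))
        for answer in answers:
            requests.append(gen.send(answer))
    except StopIteration as stop:
        # The generator's return value travels on the StopIteration.
        return requests, stop.value
    raise RuntimeError("generator did not finish")


if __name__ == "__main__":
    print(drive(get_response(), ["ab", "cd"]))
```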

Page 36: PyCon2013

I cannot possibly do this justice

• The best way to think about this is that yield-from is magic that "blocks" your current task but does not block your application

• It's almost best to pretend it isn't there when you squint (but things don't work without it)

• PS. @coroutine / yield-from are very close to async / await in C#

Page 37: PyCon2013

How to think about Futures

• Most of the time you can forget they are there
• Just pretend that:
      data = yield from <function_returning_future>
  is equivalent to:
      data = <equivalent_blocking_function>
  ...and keep calm and carry on
• Also forget about result(), exception(), and done-callbacks

Page 38: PyCon2013

Error handling

• Futures can raise exceptions too
• Just put a try/except around the yield-from:
      try:
          data = yield from loop.sock_connect(s, (h, p))
      except OSError:
          <error handling code>
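
The same pattern as a runnable sketch against current asyncio. It assumes async/await (Python 3.5+) in place of @coroutine and yield-from, because generator-based coroutines were removed from asyncio in Python 3.11. try_connect is a made-up name.

```python
import asyncio
import socket


async def try_connect(address):
    loop = asyncio.get_running_loop()
    s = socket.socket()
    s.setblocking(False)
    try:
        # A failed connection surfaces here as an ordinary exception.
        await loop.sock_connect(s, address)
    except OSError as exc:
        return "failed: " + type(exc).__name__
    finally:
        s.close()
    return "connected"


if __name__ == "__main__":
    # Port 9 on localhost is only an example target; it is usually closed.
    print(asyncio.run(try_connect(("127.0.0.1", 9))))
```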

Page 39: PyCon2013

Coroutines

• Yield-from must be used inside a generator
• Use @coroutine decorator to indicate that you're using yield-from to pass Futures
• Coroutines are driven by the yield-from
• Without yield-from a coroutine doesn't run

• What if you want an autonomous task?

Page 40: PyCon2013

Tasks

• Tasks run as long as the event loop runs
• A Task is a coroutine wrapped in a Future
• Two ways to create Tasks:
  – @task decorator (instead of @coroutine)
  – f = Task(some_coroutine())
• The Task makes sure the coroutine runs
• Task is a subclass of Future
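
A sketch of an autonomous task in current asyncio, where asyncio.create_task() is the usual way to wrap a coroutine (the @task decorator from this slide is not part of the shipped module). worker and main are hypothetical names.

```python
import asyncio


async def worker(name, delay, log):
    await asyncio.sleep(delay)
    log.append(name)
    return name


async def main():
    log = []
    # Wrapping the coroutine in a Task schedules it right away; it makes
    # progress whenever main() is waiting on something else.
    task = asyncio.create_task(worker("background", 0.01, log))
    await worker("foreground", 0.03, log)
    finished_unawaited = task.done()
    return log, finished_unawaited, await task, isinstance(task, asyncio.Future)


if __name__ == "__main__":
    print(asyncio.run(main()))
```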

Page 41: PyCon2013

Back to higher-level network ops

• Consider: loop.create_connection(factory, host, port)
• This will block and create a TCP connection
• It returns a Future when ready
• The factory is a protocol class
  – or a factory function returning a protocol instance

• Future's result is a (transport, protocol) tuple
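
A sketch of the factory and the (transport, protocol) result, using asyncio as shipped. The server side uses create_server(), the name that start_serving() ended up with. EchoServer and OneShotClient are hypothetical protocol classes written for this example.

```python
import asyncio


class EchoServer(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data)       # echo the bytes back


class OneShotClient(asyncio.Protocol):
    def __init__(self, message, reply):
        self.message = message
        self.reply = reply               # Future resolved with the echoed bytes
        self.buffer = b""

    def connection_made(self, transport):
        transport.write(self.message)

    def data_received(self, data):
        self.buffer += data
        if len(self.buffer) >= len(self.message) and not self.reply.done():
            self.reply.set_result(self.buffer)


async def main():
    loop = asyncio.get_running_loop()
    # Port 0 asks the OS for any free port.
    server = await loop.create_server(EchoServer, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reply = loop.create_future()
    # The factory is any callable returning a protocol instance.
    transport, protocol = await loop.create_connection(
        lambda: OneShotClient(b"hello", reply), "127.0.0.1", port)
    try:
        return await asyncio.wait_for(reply, timeout=5)
    finally:
        transport.close()
        server.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```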

Page 42: PyCon2013

Wait; transports and protocols?!

• PEP 3153 (async I/O) explains why transport and protocol is the right abstraction
  – transport: provides two byte streams
    • e.g. TCP or SSL or pipes
  – protocol: implements application logic
    • e.g. SMTP or FTP or IRC

• Only this abstraction level supports both ready- (select) and done-callbacks (IOCP)

Page 43: PyCon2013

Below the event loop

• Lowest level factored out
  – selector classes: uniform API to select, poll, etc.
  – will be stdlib classes in their own right
  – also an IOCP "proactor" (not the same API)

• Not part of the PEP (uncontroversial)
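
The selector classes did land in the stdlib as the selectors module (Python 3.4). A minimal sketch of the uniform API; wait_for_data is a made-up name.

```python
import selectors
import socket


def wait_for_data():
    sel = selectors.DefaultSelector()    # picks epoll, kqueue, poll or select as available
    a, b = socket.socketpair()
    try:
        sel.register(b, selectors.EVENT_READ, data="b is readable")
        before = sel.select(timeout=0)   # nothing sent yet, so nothing is ready
        a.send(b"x")
        after = sel.select(timeout=1)
        ready = [(key.data, bool(events & selectors.EVENT_READ))
                 for key, events in after]
        return before, ready
    finally:
        sel.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print(wait_for_data())
```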

Page 44: PyCon2013

There's a lot more...

Page 45: PyCon2013

But I'm out of time :-(

• StreamReader class: like a file whose methods return Futures (e.g. readline())

• Datagram protocol (under development)
• Various types of locks (experimental)
• Exemplary HTTP client and server protocols
  – (may base client on Requests, HTTP for humans)

• Subprocess support (mostly TBD)
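
A sketch of the StreamReader idea using the streams API of current asyncio (asyncio.start_server and asyncio.open_connection). The upper-casing handler is a hypothetical example, not something from Tulip.

```python
import asyncio


async def handle(reader, writer):
    line = await reader.readline()       # file-like call that waits without blocking the loop
    writer.write(line.upper())
    await writer.drain()
    writer.close()


async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    server.close()
    return reply


if __name__ == "__main__":
    print(asyncio.run(main()))
```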

Page 46: PyCon2013

More about interop...

• Write code against standard event loop API
• May use yield-from, don't have to
• Will interop with other code written like that
• Will also work with adapted event loop
  – e.g. Twisted reactor
  – code using legacy event loop API will also work
  – Ideally most of Twisted will work with any standard event loop

Page 47: PyCon2013

Using Futures w/o yield-from

• You can use Futures without yield-from!
• Just use add_done_callback() and set_result()
• This is how Twisted can adapt the event loop
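
A callback-only sketch against current asyncio, with no coroutine anywhere. demo_callback_style is a hypothetical name, and the timer stands in for whatever produces the result.

```python
import asyncio


def demo_callback_style():
    loop = asyncio.new_event_loop()
    results = []
    fut = loop.create_future()

    def on_done(f):
        results.append(f.result())
        loop.stop()

    fut.add_done_callback(on_done)                   # consumer side
    loop.call_later(0.01, fut.set_result, "ready")   # producer side
    try:
        loop.run_forever()
    finally:
        loop.close()
    return results


if __name__ == "__main__":
    print(demo_callback_style())
```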

Page 48: PyCon2013

When can I have it?

• Tulip works but is in flux and undocumented
• PEP 3156 still to be reviewed thoroughly
• Push to be ready for Python 3.4 (Feb 2014)
  – 3.4.0 beta 1 cutoff date Nov 23, 2013
• Tulip (3rd party) will work with vanilla 3.3
• Will keep Tulip around for a few releases
• PS. stdlib version won't be named "tulip"

Page 49: PyCon2013

And the rest of the stdlib?

• We'll start thinking about that in earnest once 3.4 is out of the door

• We may eventually have to deprecate urllib, socketserver etc.

• Or emulate them on top of PEP 3156
• But that will take years

Page 50: PyCon2013

What about older Python versions?

• Sorry, you're out of luck :-(
• yield-from only available in 3.3
• Much of Tulip depends on yield-from
  – even the parts that just use Futures
• Consider this a carrot for porting to 3.3 :-)
• However, someone could implement a PEP-conforming event loop in Python 2.7
  – just use yield instead of yield-from

Page 51: PyCon2013

Acknowledgments

• Greg Ewing for PEP 380 (yield-from)
• Glyph and SF Twisted folks for meetings
• Richard Oudkerk for the IOCP proactor work
• Nikolay Kim for much of the code and tests
• Charles-François Natali for the Selectors
• Eli Bendersky, Geert Jansen, Saúl Ibarra Corretgé, Steve Dower, Dino Viehland, Ben Darnell, Laurens van Houtven, Giampaolo Rodolà, and everyone on python-ideas...

Page 52: PyCon2013

Oh yeah, I'm sprinting

• Will be here Monday - Tuesday

