Transcript
Page 1: The Vanishing Pattern: from iterators to generators in Python

The Vanishing Pattern: from iterators to generators in Python
Luciano Ramalho

@ramalhoorg

Page 2: The Vanishing Pattern: from iterators to generators in Python

Demo: laziness in the Django Shell

Page 6: The Vanishing Pattern: from iterators to generators in Python

>>> from django.db import connection
>>> q = connection.queries
>>> q
[]
>>> from municipios.models import *
>>> res = Municipio.objects.all()[:5]
>>> q
[]
>>> for m in res: print m.uf, m.nome
...
GO Abadia de Goiás
MG Abadia dos Dourados
GO Abadiânia
MG Abaeté
PA Abaetetuba
>>> q
[{'time': '0.000', 'sql': u'SELECT "municipios_municipio"."id", "municipios_municipio"."uf", "municipios_municipio"."nome", "municipios_municipio"."nome_ascii", "municipios_municipio"."meso_regiao_id", "municipios_municipio"."capital", "municipios_municipio"."latitude", "municipios_municipio"."longitude", "municipios_municipio"."geohash" FROM "municipios_municipio" ORDER BY "municipios_municipio"."nome_ascii" ASC LIMIT 5'}]

• Municipio.objects.all()[:5]: this expression makes a Django QuerySet

• QuerySets are “lazy”: no database access so far (q is still empty)

• the query is made only when we iterate over the results

Page 8: The Vanishing Pattern: from iterators to generators in Python

QuerySet is a lazy iterable

technical term

Page 9: The Vanishing Pattern: from iterators to generators in Python

Lazy

• Avoids unnecessary work, by postponing it as long as possible

• The opposite of eager

In Computer Science, being “lazy” is often a good thing!
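
The difference between eager and lazy evaluation can be sketched in a few lines of Python 3. The names `square`, `eager_squares`, `lazy_squares` and the `log` list are illustrative assumptions, not part of the talk:

```python
log = []  # records when each item is actually computed


def square(n):
    log.append(n)
    return n * n


def eager_squares(numbers):
    # eager: every result is computed up front and stored in a list
    return [square(n) for n in numbers]


def lazy_squares(numbers):
    # lazy: a generator expression computes nothing until it is consumed
    return (square(n) for n in numbers)


eager = eager_squares([1, 2, 3])
work_done_by_eager = len(log)       # all three items were computed

log.clear()
lazy = lazy_squares([1, 2, 3])
work_done_before_use = len(log)     # nothing has been computed yet
first = next(lazy)                  # computes exactly one item
work_done_after_one = len(log)
```

The lazy version does work only when an item is requested, which is the same behavior the QuerySet shows in the Django demo.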

Page 10: The Vanishing Pattern: from iterators to generators in Python

Now, back to basics...

Page 11: The Vanishing Pattern: from iterators to generators in Python

Iteration: C and Python

#include <stdio.h>

int main(int argc, char *argv[]) {
    int i;
    for(i = 0; i < argc; i++)
        printf("%s\n", argv[i]);
    return 0;
}

import sys

for arg in sys.argv:
    print arg

Page 12: The Vanishing Pattern: from iterators to generators in Python

Iteration: Java (classic)

class Arguments {
    public static void main(String[] args) {
        for (int i=0; i < args.length; i++)
            System.out.println(args[i]);
    }
}

$ java Arguments alfa bravo charlie
alfa
bravo
charlie

Page 13: The Vanishing Pattern: from iterators to generators in Python

Iteration: Java ≥1.5

• Enhanced for (a.k.a. foreach), since 2004

class Arguments2 {
    public static void main(String[] args) {
        for (String arg : args)
            System.out.println(arg);
    }
}

$ java Arguments2 alfa bravo charlie
alfa
bravo
charlie

Page 14: The Vanishing Pattern: from iterators to generators in Python

Iteration: Java ≥1.5

• Enhanced for (a.k.a. foreach)

class Arguments2 {
    public static void main(String[] args) {
        for (String arg : args)
            System.out.println(arg);
    }
}

since 2004

import sys

for arg in sys.argv:
    print arg

since 1991

Page 15: The Vanishing Pattern: from iterators to generators in Python

You can iterate over many Python objects

• strings

• files

• XML: ElementTree nodes

• not limited to built-in types:

• Django QuerySet

• etc.

Page 16: The Vanishing Pattern: from iterators to generators in Python

So, what is an iterable?

• Informal, recursive definition:

• iterable: fit to be iterated

• just as: edible: fit to be eaten

Page 17: The Vanishing Pattern: from iterators to generators in Python

The for loop statement is not the only construct that handles iterables...

Page 18: The Vanishing Pattern: from iterators to generators in Python

List comprehension

• An expression that builds a list from any iterable

• Example, using all the elements: L2 = [n*10 for n in L]

>>> s = 'abracadabra'
>>> l = [ord(c) for c in s]
>>> l
[97, 98, 114, 97, 99, 97, 100, 97, 98, 114, 97]

input: any iterable object

output: a list (always)

Page 19: The Vanishing Pattern: from iterators to generators in Python

Set comprehension

• An expression that builds a set from any iterable

>>> s = 'abracadabra'
>>> set(s)
{'b', 'r', 'a', 'd', 'c'}
>>> {ord(c) for c in s}
{97, 98, 99, 100, 114}

Page 20: The Vanishing Pattern: from iterators to generators in Python

Dict comprehensions

• An expression that builds a dict from any iterable

>>> s = 'abracadabra'
>>> {c:ord(c) for c in s}
{'a': 97, 'r': 114, 'b': 98, 'c': 99, 'd': 100}

Page 21: The Vanishing Pattern: from iterators to generators in Python

Syntactic support for iterables

• Tuple unpacking, parallel assignment

>>> a, b, c = 'XYZ'
>>> a
'X'
>>> b
'Y'
>>> c
'Z'

>>> l = [(c, ord(c)) for c in 'XYZ']
>>> l
[('X', 88), ('Y', 89), ('Z', 90)]
>>> for char, code in l:
...     print char, '->', code
...
X -> 88
Y -> 89
Z -> 90

Page 22: The Vanishing Pattern: from iterators to generators in Python

Syntactic support for iterables (2)

• Function calls: exploding arguments with *

>>> import math
>>> def hypotenuse(a, b):
...     return math.sqrt(a*a + b*b)
...
>>> hypotenuse(3, 4)
5.0
>>> sides = (3, 4)
>>> hypotenuse(sides)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: hypotenuse() takes exactly 2 arguments (1 given)
>>> hypotenuse(*sides)
5.0

Page 23: The Vanishing Pattern: from iterators to generators in Python

Built-in iterable types

• basestring

• str

• unicode

• dict

• file

• frozenset

• list

• set

• tuple

• xrange

Page 24: The Vanishing Pattern: from iterators to generators in Python

Built-in functions that take iterable arguments

• all

• any

• filter

• iter

• len

• map

• max

• min

• reduce

• sorted

• sum

• zip (unrelated to compression)

Page 25: The Vanishing Pattern: from iterators to generators in Python

Classic iterables in Python

Page 26: The Vanishing Pattern: from iterators to generators in Python

Iterator is...

• a classic design pattern

Design Patterns
Gamma, Helm, Johnson & Vlissides
Addison-Wesley, ISBN 0-201-63361-2

Page 28: The Vanishing Pattern: from iterators to generators in Python

Head First Design Patterns Poster
O'Reilly, ISBN 0-596-10214-3

“The Iterator Pattern provides a way to access the elements of an aggregate object sequentially without exposing the underlying representation.”

Page 29: The Vanishing Pattern: from iterators to generators in Python

An iterable Train class

>>> train = Train(4)
>>> for car in train:
...     print(car)
car #1
car #2
car #3
car #4
>>>

Page 30: The Vanishing Pattern: from iterators to generators in Python

An iterable Train with iterator

class Train(object):  # the iterable

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):
        return self.cars

    def __iter__(self):
        return TrainIterator(self)

class TrainIterator(object):  # the iterator

    def __init__(self, train):
        self.train = train
        self.current = 0

    def __next__(self):  # Python 3
        if self.current < len(self.train):
            self.current += 1
            return 'car #%s' % (self.current)
        else:
            raise StopIteration()

Page 31: The Vanishing Pattern: from iterators to generators in Python

Iterable ABC

• collections.Iterable abstract base class (collections.abc.Iterable since Python 3.3)

• A concrete subclass of Iterable must implement .__iter__

• .__iter__ returns an Iterator

• You don’t usually call .__iter__ directly

• when needed, call iter(x)

Page 32: The Vanishing Pattern: from iterators to generators in Python

Iterator ABC

• Iterator provides .next (Python 2) or .__next__ (Python 3)

• .__next__ returns the next item

• You don’t usually call .__next__ directly

• when needed, call next(x) (Python ≥ 2.6)

Page 33: The Vanishing Pattern: from iterators to generators in Python

Train with iterator

(same Train and TrainIterator classes as on page 30)

>>> train = Train(3)
>>> for car in train:
...     print(car)
car #1
car #2
car #3

for car in train:

• calls iter(train) to obtain a TrainIterator (this invokes Train.__iter__)

• makes repeated calls to next(aTrainIterator) until it raises StopIteration (this invokes TrainIterator.__next__)
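
What the for statement does behind the scenes can be approximated with an explicit while loop. This is a sketch in Python 3, not the interpreter's actual implementation. `manual_for` is a hypothetical helper, and an `__iter__` method returning self is added to TrainIterator here (an addition to the slide's code) because the iterator protocol expects iterators to be iterable too:

```python
class Train:

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):
        return self.cars

    def __iter__(self):
        return TrainIterator(self)


class TrainIterator:

    def __init__(self, train):
        self.train = train
        self.current = 0

    def __iter__(self):
        # assumption added for this sketch: an iterator returns itself
        return self

    def __next__(self):
        if self.current < len(self.train):
            self.current += 1
            return 'car #%s' % self.current
        raise StopIteration()


def manual_for(iterable):
    """Collect the items a for loop would visit, using iter() and next()."""
    results = []
    iterator = iter(iterable)        # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)    # step 2: fetch items one by one
        except StopIteration:
            break                    # the for statement swallows this too
        results.append(item)
    return results


cars = manual_for(Train(3))
```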

Page 34: The Vanishing Pattern: from iterators to generators in Python

Image: Richard Bartz/Wikipedia

Page 35: The Vanishing Pattern: from iterators to generators in Python

Iterable duck-like creatures

Page 36: The Vanishing Pattern: from iterators to generators in Python

Design patterns in dynamic languages

• Dynamic languages: Lisp, Smalltalk, Python, Ruby, PHP, JavaScript...

• Many features not found in C++, where most of the original 23 Design Patterns were identified

• Java is more dynamic than C++, but much more static than Lisp, Python etc.

Gamma, Helm, Johnson, Vlissides a.k.a. the Gang of Four (GoF)

Page 37: The Vanishing Pattern: from iterators to generators in Python

Peter Norvig: “Design Patterns in Dynamic Languages”

Page 38: The Vanishing Pattern: from iterators to generators in Python

Dynamic types

• No need to declare types or interfaces

• It does not matter what an object claims to be, only what it is capable of doing

Page 39: The Vanishing Pattern: from iterators to generators in Python

Duck typing

“In other words, don't check whether it is-a duck: check whether it quacks-like-a duck, walks-like-a duck, etc, etc, depending on exactly what subset of duck-like behaviour you need to play your language-games with.”

Alex Martelli, comp.lang.python (2000)

Page 40: The Vanishing Pattern: from iterators to generators in Python

A Python iterable is...

• An object from which the iter function can produce an iterator

• The iter(x) call:

  • invokes x.__iter__() to obtain an iterator (the Iterable interface)

  • but, if x has no __iter__:

    • iter makes an iterator which tries to fetch items from x by doing x[0], x[1], x[2]... (the sequence protocol)
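
The fallback described above can be approximated in Python 3 with a small generator. This is only a sketch of the behavior, not how iter is implemented. `sequence_iter` and the `Squares` class are hypothetical names introduced for illustration:

```python
def sequence_iter(x):
    """Roughly what iter(x) falls back to when x has __getitem__ but no __iter__."""
    index = 0
    while True:
        try:
            item = x[index]      # x[0], x[1], x[2]...
        except IndexError:
            return               # IndexError signals the end of the sequence
        yield item
        index += 1


class Squares:
    """Hypothetical class that implements only __getitem__."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, index):
        if 0 <= index < self.size:
            return index * index
        raise IndexError(index)


via_sketch = list(sequence_iter(Squares(4)))
via_builtin = list(Squares(4))   # the real iter() fallback gives the same items
```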

Page 41: The Vanishing Pattern: from iterators to generators in Python

Train: a sequence of cars

train = Train(4)

train[0] train[1] train[2] train[3]

Page 42: The Vanishing Pattern: from iterators to generators in Python

Train: a sequence of cars

>>> train = Train(4)
>>> len(train)
4
>>> train[0]
'car #1'
>>> train[3]
'car #4'
>>> train[-1]
'car #4'
>>> train[4]
Traceback (most recent call last):
  ...
IndexError: no car at 4

>>> for car in train:
...     print(car)
car #1
car #2
car #3
car #4

Page 43: The Vanishing Pattern: from iterators to generators in Python

Train: a sequence of cars

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):  # needed by the len(self) call below
        return self.cars

    def __getitem__(self, key):
        index = key if key >= 0 else self.cars + key
        if 0 <= index < len(self):  # index 2 -> car #3
            return 'car #%s' % (index + 1)
        else:
            raise IndexError('no car at %s' % key)

if __getitem__ exists, iteration “just works”

Page 44: The Vanishing Pattern: from iterators to generators in Python

The sequence protocol at work

>>> t = Train(4)
>>> len(t)
4
>>> t[0]
'car #1'
>>> t[3]
'car #4'
>>> t[-1]
'car #4'
>>> for car in t:
...     print(car)
car #1
car #2
car #3
car #4

• len(t) uses __len__

• t[0], t[3] and t[-1] use __getitem__

• the for loop also uses __getitem__

Page 45: The Vanishing Pattern: from iterators to generators in Python

Protocol

• protocol: a synonym for interface used in dynamic languages like Smalltalk, Python, Ruby, Lisp...

• not declared, and not enforced by static checks

Page 46: The Vanishing Pattern: from iterators to generators in Python

Sequence protocol

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):
        return self.cars

    def __getitem__(self, key):
        index = key if key >= 0 else self.cars + key
        if 0 <= index < len(self):  # index 2 -> car #3
            return 'car #%s' % (index + 1)
        else:
            raise IndexError('no car at %s' % key)

__len__ and __getitem__ implement the immutable sequence protocol

Page 47: The Vanishing Pattern: from iterators to generators in Python

Sequence ABC

• collections.Sequence abstract base class (Python ≥ 2.6; collections.abc.Sequence since Python 3.3)

import collections

class Train(collections.Sequence):

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):  # abstract method
        return self.cars

    def __getitem__(self, key):  # abstract method
        index = key if key >= 0 else self.cars + key
        if 0 <= index < len(self):  # index 2 -> car #3
            return 'car #%s' % (index + 1)
        else:
            raise IndexError('no car at %s' % key)

Page 48: The Vanishing Pattern: from iterators to generators in Python

Sequence ABC

• collections.Sequence abstract base class

• implement these 2: __len__ and __getitem__ (same Train class as on the previous page)

Page 49: The Vanishing Pattern: from iterators to generators in Python

Sequence ABC

• collections.Sequence abstract base class

• inherit these 5: __contains__, __iter__, __reversed__, index and count (same Train class as on the previous pages)

Page 50: The Vanishing Pattern: from iterators to generators in Python

Sequence ABC

• collections.Sequence abstract base class

>>> train = Train(4)
>>> 'car #2' in train
True
>>> 'car #7' in train
False
>>> for car in reversed(train):
...     print(car)
car #4
car #3
car #2
car #1
>>> train.index('car #3')
2
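
In current Python 3 the ABCs live in collections.abc (the collections.Sequence alias was removed in Python 3.10), so a version of the slide's Train that runs today might look like the sketch below. The variable names at the end are illustrative:

```python
from collections.abc import Sequence


class Train(Sequence):

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):                  # abstract method 1
        return self.cars

    def __getitem__(self, key):         # abstract method 2
        index = key if key >= 0 else self.cars + key
        if 0 <= index < len(self):      # index 2 -> car #3
            return 'car #%s' % (index + 1)
        raise IndexError('no car at %s' % key)


train = Train(4)
has_car_2 = 'car #2' in train           # inherited __contains__
backwards = list(reversed(train))       # inherited __reversed__
position = train.index('car #3')        # inherited index
occurrences = train.count('car #1')     # inherited count
forwards = list(train)                  # inherited __iter__
```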

Page 51: The Vanishing Pattern: from iterators to generators in Python

Image: U.S. NRC/Wikipedia

Page 52: The Vanishing Pattern: from iterators to generators in Python

Generators

Page 53: The Vanishing Pattern: from iterators to generators in Python

Iteration in C (example 2)

#include <stdio.h>

int main(int argc, char *argv[]) {
    int i;
    for(i = 0; i < argc; i++)
        printf("%d : %s\n", i, argv[i]);
    return 0;
}

$ ./args2 alfa bravo charlie
0 : ./args2
1 : alfa
2 : bravo
3 : charlie

Page 54: The Vanishing Pattern: from iterators to generators in Python

Iteration in Python (ex. 2)

import sys

for i in range(len(sys.argv)):
    print i, ':', sys.argv[i]

$ python args2.py alfa bravo charlie
0 : args2.py
1 : alfa
2 : bravo
3 : charlie

not Pythonic

Page 55: The Vanishing Pattern: from iterators to generators in Python

Iteration in Python (ex. 2)

import sys

for i, arg in enumerate(sys.argv):
    print i, ':', arg

$ python args2.py alfa bravo charlie
0 : args2.py
1 : alfa
2 : bravo
3 : charlie

Pythonic!

Page 56: The Vanishing Pattern: from iterators to generators in Python

Iteration in Python (ex. 2)

import sys

for i, arg in enumerate(sys.argv):
    print i, ':', arg

$ python args2.py alfa bravo charlie
0 : args2.py
1 : alfa
2 : bravo
3 : charlie

• enumerate(sys.argv): this returns a lazy iterable object

• that object yields tuples (index, item) on demand, at each iteration

Page 58: The Vanishing Pattern: from iterators to generators in Python

What enumerate does

>>> e = enumerate('Turing')
>>> e
<enumerate object at 0x...>
>>> for item in e:
...     print item
...
(0, 'T')
(1, 'u')
(2, 'r')
(3, 'i')
(4, 'n')
(5, 'g')
>>>

enumerate builds an enumerate object, a generator

and that is iterable

Page 59: The Vanishing Pattern: from iterators to generators in Python

What enumerate does

>>> e = enumerate('Turing')
>>> e
<enumerate object at 0x...>
>>> next(e)
(0, 'T')
>>> next(e)
(1, 'u')
>>> next(e)
(2, 'r')
>>> next(e)
(3, 'i')
>>> next(e)
(4, 'n')
>>> next(e)
(5, 'g')
>>> next(e)
Traceback (most recent...):
  ...
StopIteration

the enumerate object produces an (index, item) tuple for each next(e) call

• The enumerate object is an example of a generator

Page 60: The Vanishing Pattern: from iterators to generators in Python

Iterator x generator

• By definition (in GoF) an iterator retrieves successive items from an existing collection

• A generator implements the iterator interface (next) but produces items not necessarily in a collection

• a generator may iterate over a collection, but return the items decorated in some way, skip some items...

• it may also produce items independently of any existing data source (e.g. Fibonacci sequence generator)
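
Both kinds of generator mentioned above can be sketched in Python 3. `fibonacci` produces items without any underlying collection, while `skip_vowels` iterates over an existing string, skipping some items and decorating the rest. Both function names are illustrative assumptions:

```python
from itertools import islice


def fibonacci():
    """Boundless generator: items come from a computation, not a collection."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def skip_vowels(text):
    """Iterates over a collection, skipping vowels and upper-casing the rest."""
    for char in text:
        if char.lower() not in 'aeiou':
            yield char.upper()


# islice takes a finite slice of the endless series
first_ten = list(islice(fibonacci(), 10))
consonants = ''.join(skip_vowels('Turing'))
```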

Page 61: The Vanishing Pattern: from iterators to generators in Python

Faraday disc (Wikipedia)

Page 62: The Vanishing Pattern: from iterators to generators in Python

Very simple generators

Page 63: The Vanishing Pattern: from iterators to generators in Python

Generator function

• Any function that has the yield keyword in its body is a generator function

>>> def gen_123():
...     yield 1
...     yield 2
...     yield 3
...
>>> for i in gen_123(): print(i)
1
2
3
>>>

the keyword gen was considered for defining generator functions, but def prevailed

Page 64: The Vanishing Pattern: from iterators to generators in Python

Generator function

• When invoked, a generator function returns a generator object

>>> def gen_123():
...     yield 1
...     yield 2
...     yield 3
...
>>> for i in gen_123(): print(i)
1
2
3
>>> g = gen_123()
>>> g
<generator object gen_123 at ...>

Page 65: The Vanishing Pattern: from iterators to generators in Python

Generator function

• Generator objects implement the Iterator interface

>>> def gen_123():
...     yield 1
...     yield 2
...     yield 3
...
>>> g = gen_123()
>>> g
<generator object gen_123 at ...>
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
Traceback (most recent call last):
...
StopIteration

Page 66: The Vanishing Pattern: from iterators to generators in Python

Generator behavior

• Note how the output of the generator function is interleaved with the output of the calling code

>>> def gen_AB():
...     print('START')
...     yield 'A'
...     print('CONTINUE')
...     yield 'B'
...     print('END.')
...
>>> for c in gen_AB():
...     print('--->', c)
...
START
---> A
CONTINUE
---> B
END.
>>>

Page 67: The Vanishing Pattern: from iterators to generators in Python

Generator behavior

• The body is executed only when next is called, and it runs only up to the following yield

>>> def gen_AB():
...     print('START')
...     yield 'A'
...     print('CONTINUE')
...     yield 'B'
...     print('END.')
...
>>> g = gen_AB()
>>> next(g)
START
'A'
>>>

Page 68: The Vanishing Pattern: from iterators to generators in Python

Generator behavior

• When the body of the function returns, the generator object raises StopIteration

• The for statement catches that for you

>>> def gen_AB():
...     print('START')
...     yield 'A'
...     print('CONTINUE')
...     yield 'B'
...     print('END.')
...
>>> g = gen_AB()
>>> next(g)
START
'A'
>>> next(g)
CONTINUE
'B'
>>> next(g)
END.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Page 69: The Vanishing Pattern: from iterators to generators in Python

Train with generator function

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __iter__(self):
        for i in range(self.cars):
            # index 2 is car #3
            yield 'car #%s' % (i+1)

>>> train = Train(3)
>>> for car in train:
...     print(car)
car #1
car #2
car #3

for car in train:

• calls iter(train) to obtain a generator

• makes repeated calls to next(generator) until the function returns, which raises StopIteration

Page 70: The Vanishing Pattern: from iterators to generators in Python

Classic iterator x generator

Classic iterator: 2 classes, 12 lines of code

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __len__(self):
        return self.cars

    def __iter__(self):
        return TrainIterator(self)

class TrainIterator(object):

    def __init__(self, train):
        self.train = train
        self.current = 0

    def __next__(self):  # Python 3
        if self.current < len(self.train):
            self.current += 1
            return 'car #%s' % (self.current)
        else:
            raise StopIteration()

Generator: 1 class, 3 lines of code

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __iter__(self):
        for i in range(self.cars):
            yield 'car #%s' % (i+1)

Page 71: The Vanishing Pattern: from iterators to generators in Python

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __iter__(self):
        for i in range(self.cars):
            yield 'car #%s' % (i+1)

The pattern just vanished

Page 72: The Vanishing Pattern: from iterators to generators in Python

“When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I'm using abstractions that aren't powerful enough -- often that I'm generating by hand the expansions of some macro that I need to write.”

Paul Graham, Revenge of the Nerds (2002)

Page 73: The Vanishing Pattern: from iterators to generators in Python

Generator expression (genexp)

>>> g = (c for c in 'ABC')
>>> g
<generator object <genexpr> at 0x10045a410>
>>> for l in g:
...     print(l)
...
A
B
C
>>>

Page 74: The Vanishing Pattern: from iterators to generators in Python

Generator expression (genexp)

• When evaluated, returns a generator object

>>> g = (n for n in [1, 2, 3])
>>> g
<generator object <genexpr> at 0x...>
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Page 76: The Vanishing Pattern: from iterators to generators in Python

Train with generator expression

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __iter__(self):
        return ('car #%s' % (i+1) for i in range(self.cars))

>>> train = Train(3)
>>> for car in train:
...     print(car)
car #1
car #2
car #3

for car in train:

• calls iter(train) to obtain a generator

• makes repeated calls to next(generator) until the generator expression is exhausted, which raises StopIteration

Page 77: The Vanishing Pattern: from iterators to generators in Python

Generator function x genexp

Generator expression:

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __iter__(self):
        return ('car #%s' % (i+1) for i in range(self.cars))

Generator function:

class Train(object):

    def __init__(self, cars):
        self.cars = cars

    def __iter__(self):
        for i in range(self.cars):
            yield 'car #%s' % (i+1)

Page 78: The Vanishing Pattern: from iterators to generators in Python

Built-in functions that return iterables, iterators or generators

• dict

• enumerate

• frozenset

• list

• reversed

• set

• tuple

Page 79: The Vanishing Pattern: from iterators to generators in Python

The itertools module

Don’t reinvent the wheel, use itertools!

• boundless generators

  • count(), cycle(), repeat()

• generators which combine several iterables:

  • chain(), tee(), izip(), imap(), product(), compress()...

• generators which select or group items:

  • compress(), dropwhile(), groupby(), ifilter(), islice()...

• generators producing combinations of items:

  • product(), permutations(), combinations()...

this was not reinvented: ported from Haskell

great for MapReduce
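
A few of these functions in action, as a Python 3 sketch. Note that izip(), imap() and ifilter() exist only in Python 2; in Python 3 the built-in zip, map and filter are already lazy. The variable names are illustrative:

```python
import itertools

# boundless generator, cut to a finite size with islice
evens = list(itertools.islice(itertools.count(0, 2), 5))

# combining several iterables into one stream
chained = list(itertools.chain('AB', [1, 2]))

# selecting items with a parallel sequence of truthy/falsy flags
selected = list(itertools.compress('ABCDEF', [1, 0, 1, 0, 1, 1]))

# grouping consecutive equal items
grouped = [(key, len(list(group)))
           for key, group in itertools.groupby('aaabccdd')]

# producing combinations of items
pairs = list(itertools.combinations('ABC', 2))
```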

Page 80: The Vanishing Pattern: from iterators to generators in Python

Generators in Python 3

• Several functions and methods of the standard library that used to return lists, now return generators and other lazy iterables in Python 3

• dict.keys(), dict.items(), dict.values()...

• range(...)

• like xrange in Python 2.x (more than a generator)

• If you really need a list, just pass the generator to the list constructor. Eg.: list(range(10))
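
The "more than a generator" remark about range can be checked with the ABCs in collections.abc. In this Python 3 sketch the variable names are illustrative. A range is a lazy sequence that supports len and indexing, while map returns a one-shot iterator:

```python
import collections.abc

numbers = range(10)
keys = {'a': 1, 'b': 2}.keys()
squares = map(lambda n: n * n, numbers)

# range is lazy but is not an iterator: it is a full sequence
range_is_iterator = isinstance(numbers, collections.abc.Iterator)
range_is_sequence = isinstance(numbers, collections.abc.Sequence)

# dict.keys() returns a view object, iterable but not an iterator
keys_is_iterator = isinstance(keys, collections.abc.Iterator)
keys_is_iterable = isinstance(keys, collections.abc.Iterable)

# map returns an iterator that can be consumed only once
map_is_iterator = isinstance(squares, collections.abc.Iterator)

# when a list is really needed, build it explicitly
as_list = list(range(10))
```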

Page 81: The Vanishing Pattern: from iterators to generators in Python

A practical example using generator functions

• Generator functions to decouple reading and writing logic in a database conversion tool designed to handle large datasets

https://github.com/ramalho/isis2json

Page 82: The Vanishing Pattern: from iterators to generators in Python

Main loop writes JSON file

Page 83: The Vanishing Pattern: from iterators to generators in Python

Another loop reads the input records

Page 84: The Vanishing Pattern: from iterators to generators in Python

One implementation: same loop reads/writes

Page 85: The Vanishing Pattern: from iterators to generators in Python

But what if we need to read another format?

Page 86: The Vanishing Pattern: from iterators to generators in Python

Functions in the script

• iterMstRecords*

• iterIsoRecords*

• writeJsonArray

• main

* generator functions

Page 87: The Vanishing Pattern: from iterators to generators in Python

main: read command line arguments

Page 88: The Vanishing Pattern: from iterators to generators in Python

main: determine input format

input generator function is selected based on the input file extension

selected generator function is passed as an argument

Page 89: The Vanishing Pattern: from iterators to generators in Python

writeJsonArray: write JSON records

Page 90: The Vanishing Pattern: from iterators to generators in Python

writeJsonArray: iterates over one of the input generator functions

selected generator function received as an argument...

and called to produce input generator
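
The overall design can be sketched in Python 3 as follows. This is not the isis2json code: the slides show it only as screenshots, so every name here (`iter_csv_records`, `iter_tsv_records`, `write_json_array`, `convert`) and both toy input formats are assumptions made for illustration. The point is that the writer receives a generator function as an argument and never needs to know which input format is being read:

```python
import io
import json


def iter_csv_records(source):
    """Hypothetical reader: yields one new dict per 'key,value' line."""
    for line in source:
        key, value = line.strip().split(',')
        yield {key: value}


def iter_tsv_records(source):
    """Hypothetical reader for a second format: 'key<TAB>value' lines."""
    for line in source:
        key, value = line.strip().split('\t')
        yield {key: value}


def write_json_array(iter_records, source, output):
    """Writer loop: calls the selected generator function and iterates."""
    output.write('[')
    for i, record in enumerate(iter_records(source)):
        if i > 0:
            output.write(',\n')
        output.write(json.dumps(record))
    output.write(']\n')


def convert(source, extension, output):
    # select the input generator function based on the file extension
    readers = {'.csv': iter_csv_records, '.tsv': iter_tsv_records}
    write_json_array(readers[extension], source, output)


out = io.StringIO()
convert(io.StringIO('a,1\nb,2\n'), '.csv', out)
result = json.loads(out.getvalue())
```

Because each reader yields one record at a time, only a single record needs to be in memory at once, which matters for large datasets.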

Page 91: The Vanishing Pattern: from iterators to generators in Python

iterIsoRecords: read records from ISO-2709 format file

generator function!

Page 92: The Vanishing Pattern: from iterators to generators in Python

iterIsoRecords

yields one record, structured as a dict

creates a new dict in each iteration

Page 93: The Vanishing Pattern: from iterators to generators in Python

iterMstRecords: read records from ISIS .MST file

generator function!

Page 94: The Vanishing Pattern: from iterators to generators in Python

iterIsoRecords, iterMstRecords

yields one record, structured as a dict

creates a new dict in each iteration

Page 95: The Vanishing Pattern: from iterators to generators in Python

Generators at work

Page 98: The Vanishing Pattern: from iterators to generators in Python

We did not cover

• other generator methods:

• gen.close(): causes a GeneratorExit exception to be raised within the generator body, at the point where it is paused

• gen.throw(e): causes any exception e to be raised within the generator body, at the point where it is paused

Mostly useful for long-running processes. Often not needed in batch processing scripts.
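
For reference, a minimal Python 3 sketch of both methods. The `ticker` generator and the `events` list are hypothetical, introduced only to make the control flow visible:

```python
events = []  # records what happens inside the generator


def ticker():
    """Counts upward; a ValueError thrown in resets the count to zero."""
    n = 0
    try:
        while True:
            try:
                yield n
                n += 1
            except ValueError:
                events.append('reset')
                n = 0
    finally:
        # runs when close() raises GeneratorExit at the paused yield
        events.append('cleanup')


gen = ticker()
first = next(gen)
second = next(gen)
# the exception is raised at the paused yield, handled, and the loop goes on
after_throw = gen.throw(ValueError('reset requested'))
gen.close()
```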

Page 99: The Vanishing Pattern: from iterators to generators in Python

We did not cover

• generator delegation with yield from (Python ≥ 3.3)

• sending data into a generator function with the gen.send(x) method (instead of next(gen)), and using yield as an expression to get the data sent

• using generator functions as coroutines

The last two are not useful in the context of iteration:

“Coroutines are not related to iteration”
David Beazley
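
For readers who want a first look at these features anyway, here is a minimal Python 3 sketch. The function names `inner`, `outer` and `running_total` are illustrative assumptions:

```python
def inner():
    yield 1
    yield 2


def outer():
    yield 0
    yield from inner()    # delegation: items from inner() pass straight through
    yield 3


def running_total():
    """Coroutine-style generator: receives values through send()."""
    total = 0
    while True:
        value = yield total   # yield used as an expression
        total += value


delegated = list(outer())

acc = running_total()
primed = next(acc)        # advance to the first yield before sending
after_ten = acc.send(10)
after_five = acc.send(5)
```

The second generator is driven by the caller pushing data in, not by a for loop pulling items out, which is why the slides treat it as a separate subject.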

Page 100: The Vanishing Pattern: from iterators to generators in Python

How to learn generators

• Forget about .send() and coroutines: that is a completely different subject. Look into that only after mastering and becoming really comfortable using generators for iteration.

• Study and use the itertools module

• Don’t worry about .close() and .throw() initially. You can be productive with generators without using these methods.

• yield from is only available in Python 3.3 and later, and only relevant if you need to use .close() and .throw()