How Pony ORM translates Python generators to SQL queries

Post on 10-Sep-2014

description

Pony ORM is an Object-Relational Mapper implemented in Python. It uses an unusual approach for writing database queries using Python generators. Pony analyzes the abstract syntax tree of a generator and translates it to its SQL equivalent. The translation process consists of several non-trivial stages. This talk was given at EuroPython 2014 and reveals the internal details of the translation process.

transcript

Object-Relational Mapper
Alexey Malashkevich @ponyorm

What makes Pony ORM different?

Fast object-relational mapper which uses Python generators for writing database queries

A Python generator

(p.name for p in product_list if p.price > 100)

A Python generator vs a SQL query

(p.name for p in product_list if p.price > 100)

SELECT p.name
FROM Products p
WHERE p.price > 100
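The correspondence between the generator and the SQL query can be checked with nothing but the standard library. The sketch below is illustrative only: the sample rows and the `Product` named tuple are assumptions made for the example, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
from collections import namedtuple

# Hypothetical sample data, shared by the in-memory list and the table.
Product = namedtuple("Product", "name price")
rows = [("Laptop", 900), ("Mouse", 25), ("Monitor", 250)]
product_list = [Product(*row) for row in rows]

# Plain Python: the generator filters objects that are already in memory.
from_python = list(p.name for p in product_list if p.price > 100)

# SQL: the same filter, evaluated by the database engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (name TEXT, price INTEGER)")
conn.executemany("INSERT INTO Products VALUES (?, ?)", rows)
from_sql = [name for (name,) in conn.execute(
    "SELECT p.name FROM Products p WHERE p.price > 100 ORDER BY p.rowid")]

print(from_python)
print(from_sql)
```

Both lists contain the same names; the difference is only where the filtering happens.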

The same query in Pony

SELECT p.name
FROM Products p
WHERE p.price > 100

select(p.name for p in Product if p.price > 100)

• Pony ORM

• Django

• SQLAlchemy

Query syntax comparison

Pony ORM:

select(p for p in Product if p.name.startswith('A') and p.image is None or p.added.year < 2014)

Query syntax comparison

Django:

Product.objects.filter( Q(name__startswith='A', image__isnull=True) | Q(added__year__lt=2014))

Query syntax comparison

SQLAlchemy:

session.query(Product).filter( (Product.name.startswith('A') & (Product.image == None)) | (extract('year', Product.added) < 2014))

Query translation

select(p for p in Product if p.name.startswith('A') and p.image is None or p.added.year < 2014)

• Translation from the bytecode is fast
• The bytecode translation result is cached
• The Python generator object is used as a cache key

Python generator object
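A minimal sketch of how such a cache could work. It assumes the cache is keyed on the generator's code object (`gen.gi_code`), which is shared by every generator created from the same source expression. The names `translation_cache`, `expensive_translate` and `cached_select` are hypothetical and are not Pony's API.

```python
translation_cache = {}   # hypothetical cache: code object -> translated query
translate_calls = 0      # counts how often the slow path runs


def expensive_translate(code):
    """Stand-in for the decompile-and-translate step (hypothetical)."""
    global translate_calls
    translate_calls += 1
    return "<SQL for %s>" % code.co_name


def cached_select(gen):
    code = gen.gi_code   # same object every time the same expression runs
    if code not in translation_cache:
        translation_cache[code] = expensive_translate(code)
    return translation_cache[code]


def find_expensive(products, limit):
    return cached_select(p for p in products if p > limit)


first = find_expensive([50, 150], 100)
second = find_expensive([10, 20], 5)   # new generator object, same code object
print(first == second, translate_calls)
```

The second call creates a new generator object, but the translation step is not repeated because the code object is already in the cache.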

Building a query step by step

q = select(o for o in Order if o.customer.id == some_id)
q = q.filter(lambda o: o.state != 'DELIVERED')
q = q.filter(lambda o: len(o.items) > 2)
q = q.order_by(Order.date_created)
q = q[10:20]

SELECT "o"."id"
FROM "Order" "o"
  LEFT JOIN "OrderItem" "orderitem-1"
    ON "o"."id" = "orderitem-1"."order"
WHERE "o"."customer" = ?
  AND "o"."state" <> 'DELIVERED'
GROUP BY "o"."id"
HAVING COUNT("orderitem-1"."ROWID") > 2
ORDER BY "o"."date_created"
LIMIT 10 OFFSET 10

How does Pony translate generator expressions to SQL?

Python generator to SQL translation

1. Decompile bytecode and restore AST

2. Translate AST to ‘abstract SQL’

3. Translate ‘abstract SQL’ to a specific SQL dialect

Bytecode decompilation

• Using the Visitor pattern
• Methods of the Visitor object correspond to the bytecode commands
• Pony keeps fragments of the AST on the stack
• Each method either adds a new part of the AST or combines existing parts

(a + b.c) in x.y
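To look at real bytecode for this expression, the standard `dis` module can be used. This is only a sketch: the expression is compiled on its own here, so names are loaded with `LOAD_NAME` rather than `LOAD_FAST` or `LOAD_GLOBAL`, and the exact opcode names depend on the CPython version (newer releases use, for example, `BINARY_OP` and `CONTAINS_OP` where the slides show `BINARY_ADD` and `COMPARE_OP in`).

```python
import dis

# Compile the expression alone; inside a generator the loads would differ.
code = compile("(a + b.c) in x.y", "<expr>", "eval")

instructions = list(dis.get_instructions(code))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.opname, ins.argrepr)
```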

Bytecode decompilation

(a + b.c) in x.y

LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in

Stack after each instruction (top of the stack first):

> LOAD_GLOBAL a
Stack: Name('a')

> LOAD_FAST b
Stack: Name('b'), Name('a')

> LOAD_ATTR c
Stack: Getattr(Name('b'), 'c'), Name('a')

> BINARY_ADD
Stack: Add(Name('a'), Getattr(Name('b'), 'c'))

> LOAD_FAST x
Stack: Name('x'), Add(Name('a'), Getattr(Name('b'), 'c'))

> LOAD_ATTR y
Stack: Getattr(Name('x'), 'y'), Add(Name('a'), Getattr(Name('b'), 'c'))

> COMPARE_OP in
Stack: Compare('in', Add(…), Getattr(…))

Abstract Syntax Tree (AST)

(a + b.c) in x.y

in
├── +
│   ├── a
│   └── .c
│       └── b
└── .y
    └── x
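The stack-based decompilation can be sketched as a small visitor. This is a toy under stated assumptions, not Pony's decompiler: the instruction list is written out by hand from the slides instead of being read from a real code object, the `Decompiler` class is hypothetical, and AST nodes are plain tuples.

```python
class Decompiler:
    """Toy visitor: one method per bytecode command, AST fragments on a stack."""

    def __init__(self):
        self.stack = []

    def decompile(self, instructions):
        for opname, arg in instructions:
            getattr(self, opname)(arg)       # dispatch by instruction name
        return self.stack.pop()

    def LOAD_GLOBAL(self, name):
        self.stack.append(("Name", name))    # adds a new AST fragment

    LOAD_FAST = LOAD_GLOBAL                  # both simply push a name node

    def LOAD_ATTR(self, attr):
        obj = self.stack.pop()
        self.stack.append(("Getattr", obj, attr))

    def BINARY_ADD(self, _arg):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(("Add", left, right))   # combines two fragments

    def COMPARE_OP(self, op):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(("Compare", op, left, right))


# The instruction listing from the slides, written out by hand.
instructions = [
    ("LOAD_GLOBAL", "a"), ("LOAD_FAST", "b"), ("LOAD_ATTR", "c"),
    ("BINARY_ADD", None), ("LOAD_FAST", "x"), ("LOAD_ATTR", "y"),
    ("COMPARE_OP", "in"),
]
ast = Decompiler().decompile(instructions)
print(ast)
```

After the last instruction a single fragment remains on the stack, and that fragment is the restored AST of the whole expression.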

Python generator to SQL translation

1. Decompile bytecode and restore AST

2. Translate AST to ‘abstract SQL’

3. Translate ‘abstract SQL’ to a concrete SQL dialect

What SQL should it be translated to?

(a + b.c) in x.y

It depends on the variable types!

• If a and c are numbers, y is a collection:
(? + "b"."c") IN (SELECT …)

• If a and c are strings, y is a collection:
CONCAT(?, "b"."c") IN (SELECT …)

• If a, c and y are strings:
"x"."y" LIKE CONCAT('%', ?, "b"."c", '%')

What SQL should it be translated to?

(a + b.c) in x.y

• The result of translation depends on types
• If the translator analyzes node types by itself, the logic becomes too complex
• Pony uses Monads to keep it simple

AST to SQL Translation

(a + b.c) in x.y

A Monad

• Encapsulates the node translation logic
• Generates the result of translation: the ‘abstract SQL’
• Can combine itself with other monads

The translator delegates the logic of translation to monads

Monad types

• StringAttrMonad
• StringParamMonad
• StringExprMonad
• StringConstMonad
• DatetimeAttrMonad
• DatetimeParamMonad
• ObjectAttrMonad
• CmpMonad
• etc…

Each monad defines a set of allowed operations and can translate itself into a part of the resulting SQL query

AST Translation

• Using the Visitor pattern
• Walk the tree in depth-first order
• Create monads when leaving each node

AST to SQL Translation

(a + b.c) in x.y

in
├── +
│   ├── a
│   └── .c
│       └── b
└── .y
    └── x

Monads created when leaving each node, in depth-first order:

a → StringParamMonad
b → ObjectIterMonad
.c → StringAttrMonad
+ → StringExprMonad
x → ObjectIterMonad
.y → StringAttrMonad
in → CmpMonad

Abstract SQL

(a + b.c) in x.y

['LIKE', ['COLUMN', 't1', 'y'],
    ['CONCAT', ['VALUE', '%'], ['PARAM', 'p1'],
        ['COLUMN', 't2', 'c'], ['VALUE', '%']
    ]
]

Abstract SQL makes it possible to put aside the SQL dialect differences
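How monads could combine into this abstract SQL can be sketched in a few classes. The class names follow the slides, but the bodies, the method names (`getsql`, `parts`, `contains`) and the aliases `t1`, `t2`, `p1` are assumptions made for illustration, not Pony's implementation. The `ObjectIterMonad` step is left out to keep the sketch short, and only the all-strings case is handled.

```python
class StringMonad:
    """Base for string-typed monads; it knows how strings combine."""

    def parts(self):
        return [self.getsql()]

    def __add__(self, other):
        if not isinstance(other, StringMonad):
            raise TypeError("cannot add a string and %s" % type(other).__name__)
        return StringExprMonad(self.parts() + other.parts())

    def contains(self, item):
        # `item in self` for two strings is a substring test: LIKE '%item%'.
        if not isinstance(item, StringMonad):
            raise TypeError("a string can only contain a string")
        pattern = ["CONCAT", ["VALUE", "%"]] + item.parts() + [["VALUE", "%"]]
        return CmpMonad(["LIKE", self.getsql(), pattern])


class StringParamMonad(StringMonad):
    def __init__(self, name):
        self.name = name

    def getsql(self):
        return ["PARAM", self.name]


class StringAttrMonad(StringMonad):
    def __init__(self, alias, column):
        self.alias, self.column = alias, column

    def getsql(self):
        return ["COLUMN", self.alias, self.column]


class StringExprMonad(StringMonad):
    def __init__(self, parts):
        self._parts = parts

    def parts(self):
        return list(self._parts)

    def getsql(self):
        return ["CONCAT"] + self._parts


class CmpMonad:
    def __init__(self, sql):
        self.sql = sql

    def getsql(self):
        return self.sql


a = StringParamMonad("p1")           # a
b_c = StringAttrMonad("t2", "c")     # b.c
x_y = StringAttrMonad("t1", "y")     # x.y
abstract_sql = x_y.contains(a + b_c).getsql()   # (a + b.c) in x.y
print(abstract_sql)
```

Because each monad decides for itself which operations it supports, a type error such as adding a string to a number surfaces in the monad and the translator needs no type-specific branches.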

Python generator to SQL translation

1. Decompile bytecode and restore AST

2. Translate AST to ‘abstract SQL’

3. Translate ‘abstract SQL’ to a specific SQL dialect

Specific SQL dialects

['LIKE', ['COLUMN', 't1', 'y'], ['CONCAT', ['VALUE', '%'], ['PARAM', 'p1'], ['COLUMN', 't2', 'c'], ['VALUE', '%'] ]]

MySQL:

`t1`.`y` LIKE CONCAT('%', ?, `t2`.`c`, '%')

SQLite:

"t1"."y" LIKE '%' || ? || "t2"."c" || '%'
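The last stage can be sketched as a recursive renderer over the nested lists. The `render` function below is a hypothetical helper that supports only the five node kinds used in this example and only the two dialects shown on the slide.

```python
def render(node, dialect):
    """Render a nested-list 'abstract SQL' node for 'mysql' or 'sqlite'."""
    kind, args = node[0], node[1:]
    quote = "`%s`" if dialect == "mysql" else '"%s"'
    if kind == "COLUMN":
        alias, column = args
        return "%s.%s" % (quote % alias, quote % column)
    if kind == "VALUE":
        return "'%s'" % args[0].replace("'", "''")   # escape single quotes
    if kind == "PARAM":
        return "?"
    if kind == "CONCAT":
        rendered = [render(arg, dialect) for arg in args]
        if dialect == "mysql":
            return "CONCAT(%s)" % ", ".join(rendered)
        return " || ".join(rendered)                  # SQLite concatenation
    if kind == "LIKE":
        left, right = args
        return "%s LIKE %s" % (render(left, dialect), render(right, dialect))
    raise ValueError("unknown node kind: %r" % kind)


abstract_sql = ["LIKE", ["COLUMN", "t1", "y"],
                ["CONCAT", ["VALUE", "%"], ["PARAM", "p1"],
                 ["COLUMN", "t2", "c"], ["VALUE", "%"]]]

mysql_sql = render(abstract_sql, "mysql")
sqlite_sql = render(abstract_sql, "sqlite")
print(mysql_sql)
print(sqlite_sql)
```

Only the renderer knows about quoting rules and the concatenation syntax, so the earlier stages stay identical for every database.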

Other Pony ORM features

• Identity Map
• Automatic query optimization
• N+1 Query Problem solution
• Optimistic transactions
• Online ER Diagram Editor

Django ORM

s1 = Student.objects.get(pk=123)
print s1.name, s1.group.id
s2 = Student.objects.get(pk=456)
print s2.name, s2.group.id

• How many SQL queries will be executed?
• How many objects will be created?

Objects in memory after each line:

1. Student 123
2. Student 123, Group 1
3. Student 123, Student 456, Group 1
4. Student 123, Student 456, Group 1, Group 1

Pony ORM – seeds, IdentityMap

s1 = Student[123]
print s1.name, s1.group.id
s2 = Student[456]
print s2.name, s2.group.id

Objects in memory:

• After Student 123 is loaded: Student 123, Group 1 (seed)
• After Student 456 is loaded: Student 123, Student 456, Group 1 (seed)
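The idea of an identity map with seeds can be sketched with a dictionary keyed by entity name and primary key. Everything here is an assumption made for illustration: the `IdentityMap` class, its methods, the dictionary-based objects and the sample names are not Pony's internals.

```python
class IdentityMap:
    """Toy per-session cache: at most one object per (entity, primary key)."""

    def __init__(self):
        self.objects = {}

    def get(self, entity, pk):
        key = (entity, pk)
        if key not in self.objects:
            # A 'seed': only the primary key is known, no row was loaded yet.
            self.objects[key] = {"entity": entity, "id": pk, "loaded": False}
        return self.objects[key]

    def load_row(self, entity, row, references=()):
        obj = self.get(entity, row["id"])
        obj.update(row, loaded=True)
        for attr, target_entity in references:
            # Replace the foreign key value with the shared object or seed.
            obj[attr] = self.get(target_entity, row[attr])
        return obj


cache = IdentityMap()
s1 = cache.load_row("Student", {"id": 123, "name": "Ann", "group": 1},
                    references=[("group", "Group")])
s2 = cache.load_row("Student", {"id": 456, "name": "Bob", "group": 1},
                    references=[("group", "Group")])

# Both students share one Group object, and its id is known without a query.
print(s1["group"] is s2["group"], s1["group"]["id"], s1["group"]["loaded"])
```

Reading `group.id` needs no extra query because the seed already carries the primary key taken from the student's row.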

Solution for the N+1 Query Problem

orders = select(o for o in Order if o.total_price > 1000) \
    .order_by(desc(Order.id)).page(1, pagesize=5)

for o in orders:
    print o.total_price, o.customer.name

SELECT o.id, o.total_price, o.customer_id, ...
FROM "Order" o
WHERE o.total_price > 1000
ORDER BY o.id DESC
LIMIT 5

Orders loaded: Order 1, Order 3, Order 4, Order 7, Order 9
Customers referenced: Customer 1, Customer 4, Customer 7

One SQL query

SELECT c.id, c.name, …
FROM "Customer" c
WHERE c.id IN (?, ?, ?)
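The two-query pattern can be reproduced by hand with `sqlite3`. This is a sketch of the technique, not of Pony's code: the schema, the sample rows and the `run` helper that counts queries are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "Order" (id INTEGER PRIMARY KEY, total_price REAL,
                          customer_id INTEGER REFERENCES Customer(id));
    INSERT INTO Customer VALUES (1, 'Ann'), (4, 'Bob'), (7, 'Eve');
    INSERT INTO "Order" VALUES (1, 1500, 1), (3, 2000, 4), (4, 1200, 1),
                               (7, 3000, 7), (9, 1100, 4), (10, 50, 7);
""")

queries = 0


def run(sql, params=()):
    """Execute a query and count it (hypothetical helper)."""
    global queries
    queries += 1
    return conn.execute(sql, params).fetchall()


# Query 1: the page of orders; each row carries only the customer's id.
orders = run('SELECT o.id, o.total_price, o.customer_id FROM "Order" o '
             'WHERE o.total_price > 1000 ORDER BY o.id DESC LIMIT 5')

# Query 2: all referenced customers at once instead of one query per order.
customer_ids = sorted({customer_id for _, _, customer_id in orders})
placeholders = ", ".join("?" * len(customer_ids))
customers = dict(run("SELECT c.id, c.name FROM Customer c "
                     "WHERE c.id IN (%s)" % placeholders, customer_ids))

for order_id, total_price, customer_id in orders:
    print(total_price, customers[customer_id])

print("queries executed:", queries)
```

Five orders are printed with their customer names after two queries in total, instead of one query for the orders plus one per order.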

Automatic query optimization

select(c for c in Customer if sum(c.orders.total_price) > 1000)

SELECT "c"."id", "c"."email", "c"."password", "c"."name", "c"."country", "c"."address"
FROM "Customer" "c"
WHERE (
    SELECT coalesce(SUM("order-1"."total_price"), 0)
    FROM "Order" "order-1"
    WHERE "c"."id" = "order-1"."customer"
  ) > 1000

SELECT "c"."id"
FROM "Customer" "c"
  LEFT JOIN "Order" "order-1"
    ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
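That the correlated-subquery form and the LEFT JOIN form select the same customers can be checked on a small SQLite database. The schema and rows below are assumptions for the example, and both queries are reduced to the `id` column so the results are directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "Order" (id INTEGER PRIMARY KEY, total_price REAL,
                          customer INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Eve');
    INSERT INTO "Order" VALUES (1, 700, 1), (2, 600, 1), (3, 900, 2);
""")

subquery_form = """
    SELECT "c"."id" FROM "Customer" "c"
    WHERE (SELECT coalesce(SUM("order-1"."total_price"), 0)
           FROM "Order" "order-1"
           WHERE "c"."id" = "order-1"."customer") > 1000
"""

join_form = """
    SELECT "c"."id" FROM "Customer" "c"
    LEFT JOIN "Order" "order-1" ON "c"."id" = "order-1"."customer"
    GROUP BY "c"."id"
    HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
"""

ids_subquery = sorted(row[0] for row in conn.execute(subquery_form))
ids_join = sorted(row[0] for row in conn.execute(join_form))
print(ids_subquery, ids_join)
```

The LEFT JOIN keeps customers without orders in the grouped result, and `coalesce` turns their empty sum into 0, which is why the two forms agree.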

Transactions

Django ORM

def transfer_money(id1, id2, amount):
    account1 = Account.objects.get(pk=id1)
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account2 = Account.objects.get(pk=id2)
    account1.amount -= amount
    account1.save()
    account2.amount += amount
    account2.save()

Transactions

Django ORM

@transaction.atomic
def transfer_money(id1, id2, amount):
    account1 = Account.objects.get(pk=id1)
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account2 = Account.objects.get(pk=id2)
    account1.amount -= amount
    account1.save()
    account2.amount += amount
    account2.save()

Transactions

Django ORM

@transaction.atomic
def transfer_money(id1, id2, amount):
    account1 = Account.objects \
        .select_for_update().get(pk=id1)
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account2 = Account.objects \
        .select_for_update().get(pk=id2)
    account1.amount -= amount
    account1.save()
    account2.amount += amount
    account2.save()

Transactions

Pony ORM

@db_session
def transfer_money(id1, id2, amount):
    account1 = Account[id1]
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account1.amount -= amount
    Account[id2].amount += amount

Transactions

db_session

• Pony tracks which objects were changed
• No need to call save()
• Pony saves all updated objects in a single transaction automatically on leaving the db_session scope

Optimistic Locking

UPDATE Account
SET amount = :new_value
WHERE id = :id
  AND amount = :old_value

• Pony tracks attributes which were read and updated
• If an object wasn’t locked using the for_update method, Pony uses optimistic locking automatically
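The optimistic check in the UPDATE statement can be tried out with `sqlite3`. This is a hand-written sketch of the technique under stated assumptions: the table contents and the `optimistic_update` helper are hypothetical, and what an ORM does when the check fails (typically raising an error and rolling back) is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO Account VALUES (1, 100)")


def optimistic_update(conn, account_id, old_value, new_value):
    """Write new_value only if amount still equals the value read earlier."""
    cursor = conn.execute(
        "UPDATE Account SET amount = :new_value "
        "WHERE id = :id AND amount = :old_value",
        {"new_value": new_value, "id": account_id, "old_value": old_value})
    return cursor.rowcount == 1   # zero rows means someone else changed it


# Two sessions both read amount = 100 and then try to write.
first_ok = optimistic_update(conn, 1, old_value=100, new_value=70)
second_ok = optimistic_update(conn, 1, old_value=100, new_value=150)  # stale

final_amount = conn.execute(
    "SELECT amount FROM Account WHERE id = 1").fetchone()[0]
print(first_ok, second_ok, final_amount)
```

The second write matches no row because the stored amount no longer equals the value that session read, so the lost update is detected without holding a lock.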

Entity-Relationship Diagram Editor
https://editor.ponyorm.com

Wrapping up

Main Pony ORM features:

• Using generators for database queries
• Identity Map
• Solution for N+1 Query Problem
• Automatic query optimization
• Optimistic transactions
• Online ER-diagram editor

Pony roadmap

• Python 3
• Microsoft SQL Server support
• Improved documentation
• Migrations
• Async queries

Thank you!

Pony ORM

• Site: ponyorm.com
• Twitter: @ponyorm
• GitHub: github.com/ponyorm/pony
• ER-Diagram editor: editor.ponyorm.com
• Installation: pip install pony