How Pony ORM translates Python generators to SQL queries
DESCRIPTION
Pony ORM is an Object-Relational Mapper implemented in Python. It uses an unusual approach for writing database queries using Python generators. Pony analyzes the abstract syntax tree of a generator and translates it to its SQL equivalent. The translation process consists of several non-trivial stages. This talk was given at EuroPython 2014 and reveals the internal details of the translation process.
TRANSCRIPT
Object-Relational Mapper
Alexey Malashkevich @ponyorm
What makes Pony ORM different?
Fast object-relational mapper which uses Python generators for writing database queries
A Python generator
(p.name for p in product_list if p.price > 100)
A Python generator vs a SQL query
(p.name for p in product_list if p.price > 100)
SELECT p.name
FROM Products p
WHERE p.price > 100
The same query in Pony
SELECT p.name
FROM Products p
WHERE p.price > 100
select(p.name for p in Product if p.price > 100)
Query syntax comparison
• Pony ORM
• Django
• SQLAlchemy
Pony ORM:
select(p for p in Product if p.name.startswith('A') and p.image is None or p.added.year < 2014)
Django:
Product.objects.filter( Q(name__startswith='A', image__isnull=True) | Q(added__year__lt=2014))
SQLAlchemy:
session.query(Product).filter( (Product.name.startswith('A') & (Product.image == None)) | (extract('year', Product.added) < 2014))
Query translation
select(p for p in Product if p.name.startswith('A') and p.image is None or p.added.year < 2014)
• Translation from the bytecode is fast
• The bytecode translation result is cached
• The Python generator object is used as a cache key
Python generator object
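The caching idea can be sketched with the standard library alone. This is not Pony's code: `translation_cache` and `translate` are hypothetical names, the "translation" is just a list of instruction names, and keying on the generator's code object (`gi_code`) is an assumption made here so that two generators created from the same source line share one cache entry.

```python
import dis

# Hypothetical cache: code object -> translation result.
translation_cache = {}

def translate(gen):
    """Return a stand-in 'translation' for a generator, reusing cached results."""
    code = gen.gi_code  # the compiled code object behind the generator
    if code not in translation_cache:
        # Stand-in for the expensive work: list the bytecode instruction names.
        translation_cache[code] = [ins.opname for ins in dis.get_instructions(code)]
    return translation_cache[code]

def expensive_products(prices, min_price):
    # Every call builds a new generator object from the same code object.
    return (p for p in prices if p > min_price)

first = translate(expensive_products([50, 150], 100))
second = translate(expensive_products([10, 20], 5))
```

Both calls return the very same list object, so the second call does no translation work even though the parameter values differ.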
Building a query step by step
q = select(o for o in Order if o.customer.id == some_id)
q = q.filter(lambda o: o.state != 'DELIVERED')
q = q.filter(lambda o: len(o.items) > 2)
q = q.order_by(Order.date_created)
q = q[10:20]
SELECT "o"."id"
FROM "Order" "o"
  LEFT JOIN "OrderItem" "orderitem-1"
    ON "o"."id" = "orderitem-1"."order"
WHERE "o"."customer" = ?
  AND "o"."state" <> 'DELIVERED'
GROUP BY "o"."id"
HAVING COUNT("orderitem-1"."ROWID") > 2
ORDER BY "o"."date_created"
LIMIT 10 OFFSET 10
How does Pony translate generator expressions to SQL?
Python generator to SQL translation
1. Decompile bytecode and restore AST
2. Translate AST to ‘abstract SQL’
3. Translate ‘abstract SQL’ to a specific SQL dialect
Bytecode decompilation
• Using the Visitor pattern
• Methods of the Visitor object correspond to the bytecode instructions
• Pony keeps fragments of the AST on the stack
• Each method either adds a new part of the AST or combines existing parts
(a + b.c) in x.y
LOAD_GLOBAL a
LOAD_FAST b
LOAD_ATTR c
BINARY_ADD
LOAD_FAST x
LOAD_ATTR y
COMPARE_OP in
Stack after each instruction (top of the stack first):
LOAD_GLOBAL a    Name('a')
LOAD_FAST b      Name('b'), Name('a')
LOAD_ATTR c      Getattr(Name('b'), 'c'), Name('a')
BINARY_ADD       Add(Name('a'), Getattr(Name('b'), 'c'))
LOAD_FAST x      Name('x'), Add(Name('a'), Getattr(Name('b'), 'c'))
LOAD_ATTR y      Getattr(Name('x'), 'y'), Add(Name('a'), Getattr(Name('b'), 'c'))
COMPARE_OP in    Compare('in', Add(…), Getattr(…))
Abstract Syntax Tree (AST)
(a + b.c) in x.y
in
├── +
│   ├── a
│   └── .c
│       └── b
└── .y
    └── x
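The stack-based decompilation can be sketched as a toy stack machine. This is an illustration, not Pony's decompiler: the `Decompiler` class is hypothetical, the instruction list is written by hand from the slide (real bytecode differs between Python versions), and the AST is built from plain tuples rather than real AST node classes.

```python
class Decompiler:
    """Toy stack machine: one method per (simplified) bytecode instruction."""

    def __init__(self):
        self.stack = []

    def run(self, instructions):
        for opname, arg in instructions:
            getattr(self, opname)(arg)   # dispatch on the instruction name
        (tree,) = self.stack             # exactly one finished AST must remain
        return tree

    def LOAD_GLOBAL(self, name):
        self.stack.append(("Name", name))

    LOAD_FAST = LOAD_GLOBAL              # both simply push a name node here

    def LOAD_ATTR(self, attr):
        self.stack.append(("Getattr", self.stack.pop(), attr))

    def BINARY_ADD(self, _arg):
        right, left = self.stack.pop(), self.stack.pop()
        self.stack.append(("Add", left, right))

    def COMPARE_OP(self, op):
        right, left = self.stack.pop(), self.stack.pop()
        self.stack.append(("Compare", op, left, right))


# Hand-written instruction list for: (a + b.c) in x.y
program = [
    ("LOAD_GLOBAL", "a"), ("LOAD_FAST", "b"), ("LOAD_ATTR", "c"),
    ("BINARY_ADD", None), ("LOAD_FAST", "x"), ("LOAD_ATTR", "y"),
    ("COMPARE_OP", "in"),
]
tree = Decompiler().run(program)
```

Loading instructions push a new fragment, while `LOAD_ATTR`, `BINARY_ADD` and `COMPARE_OP` pop existing fragments and push a combined one, so the stack ends with a single tree.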
What SQL should it be translated to?
(a + b.c) in x.y
It depends on the variable types!
• If a and c are numbers and y is a collection:
  (? + "b"."c") IN (SELECT …)
• If a and c are strings and y is a collection:
  CONCAT(?, "b"."c") IN (SELECT …)
• If a, c and y are strings:
  "x"."y" LIKE CONCAT('%', ?, "b"."c", '%')
• The result of translation depends on types
• If the translator analyzes node types by itself, the logic becomes too complex
• Pony uses Monads to keep it simple
AST to SQL Translation
(a + b.c) in x.y
The translator delegates the logic of translation to monads
A Monad
• Encapsulates the node translation logic
• Generates the result of translation, the ‘abstract SQL’
• Can combine itself with other monads
Monad types
• StringAttrMonad
• StringParamMonad
• StringExprMonad
• StringConstMonad
• DatetimeAttrMonad
• DatetimeParamMonad
• ObjectAttrMonad
• CmpMonad
• etc…
Each monad defines a set of allowed operations and can translate itself into a part of the resulting SQL query
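A minimal sketch of how type-specific monads keep the translator simple. The classes `Monad`, `NumericMonad`, `StringMonad` and the helper `concat` are hypothetical and far smaller than Pony's real monads; the flattening of nested CONCAT nodes and the 'ADD' node name are assumptions chosen so that the string case yields the nested-list form of 'abstract SQL' used in this talk.

```python
def concat(*parts):
    """Build a CONCAT node, splicing in the arguments of nested CONCAT nodes."""
    args = []
    for part in parts:
        if part[0] == "CONCAT":
            args.extend(part[1:])
        else:
            args.append(part)
    return ["CONCAT", *args]


class Monad:
    def __init__(self, sql):
        self.sql = sql  # fragment of 'abstract SQL' as nested lists


class NumericMonad(Monad):
    def __add__(self, other):
        if not isinstance(other, NumericMonad):
            raise TypeError("a number can only be added to a number")
        return NumericMonad(["ADD", self.sql, other.sql])


class StringMonad(Monad):
    def __add__(self, other):
        if not isinstance(other, StringMonad):
            raise TypeError("a string can only be concatenated with a string")
        return StringMonad(concat(self.sql, other.sql))

    def contains(self, item):
        # Substring test: self LIKE '%' + item + '%'
        if not isinstance(item, StringMonad):
            raise TypeError("a string can only contain a string")
        pattern = concat(["VALUE", "%"], item.sql, ["VALUE", "%"])
        return Monad(["LIKE", self.sql, pattern])


# (a + b.c) in x.y, with a, c and y all strings
a = StringMonad(["PARAM", "p1"])
c = StringMonad(["COLUMN", "t2", "c"])
y = StringMonad(["COLUMN", "t1", "y"])
string_case = y.contains(a + c).sql

# The same '+' on numbers produces a different node.
numeric_case = (NumericMonad(["PARAM", "p1"]) + NumericMonad(["COLUMN", "t2", "c"])).sql
```

The caller never inspects types: it just asks the monads to combine, and each monad either returns a new monad or rejects the operation.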
AST Translation
• Using the Visitor pattern
• Walk the tree in depth-first order
• Create monads when leaving each node
(a + b.c) in x.y
Monads created while walking the tree, in order, on leaving each node:
a     StringParamMonad
b     ObjectIterMonad
.c    StringAttrMonad
+     StringExprMonad
x     ObjectIterMonad
.y    StringAttrMonad
in    CmpMonad
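The depth-first walk can be reproduced with Python's standard `ast` module as a stand-in for Pony's own tree. The type information is hard-coded as an assumption (`a` is a string parameter, other bare names are query loop variables, every attribute is a string column), and `monad_name`, `label` and `walk` are hypothetical helpers that only record monad names instead of creating real monads.

```python
import ast

# Assumption for this sketch: 'a' is a string query parameter; every other
# bare name is a loop variable over entities; every attribute is a string column.
STRING_PARAMS = {"a"}

def monad_name(node):
    """Pick the monad type (names taken from the slides) for one AST node."""
    if isinstance(node, ast.Name):
        return "StringParamMonad" if node.id in STRING_PARAMS else "ObjectIterMonad"
    if isinstance(node, ast.Attribute):
        return "StringAttrMonad"
    if isinstance(node, ast.BinOp):
        return "StringExprMonad"
    if isinstance(node, ast.Compare):
        return "CmpMonad"
    raise NotImplementedError(type(node).__name__)

def label(node):
    """Short display label matching the tree drawn on the slides."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return "." + node.attr
    return "+" if isinstance(node, ast.BinOp) else "in"

def walk(node, created):
    for child in ast.iter_child_nodes(node):
        if isinstance(child, ast.expr):      # skip operators and load contexts
            walk(child, created)
    created.append((label(node), monad_name(node)))  # record on leaving the node
    return created

tree = ast.parse("(a + b.c) in x.y", mode="eval").body
created = walk(tree, [])
```

Because a node is recorded only after all of its children, each monad can be built from the monads of its operands.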
Abstract SQL
(a + b.c) in x.y
['LIKE', ['COLUMN', 't1', 'y'],
         ['CONCAT', ['VALUE', '%'], ['PARAM', 'p1'],
                    ['COLUMN', 't2', 'c'], ['VALUE', '%']]]
Allows putting aside the SQL dialect differences
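To illustrate why a dialect-neutral tree is convenient, here is a small renderer that turns the nested list above into SQL text for two dialects. The `render` function and the dialect names `'mysql'` and `'sqlite'` are hypothetical; it handles only the five node kinds in this example and is not how Pony's SQL builders are actually organized.

```python
def render(node, dialect):
    """Render a nested-list 'abstract SQL' node as SQL text for one dialect."""
    kind, *args = node
    if kind == "COLUMN":
        quote = "`" if dialect == "mysql" else '"'
        return ".".join(f"{quote}{name}{quote}" for name in args)
    if kind == "VALUE":
        # String literal; single quotes inside the value are doubled.
        return "'" + str(args[0]).replace("'", "''") + "'"
    if kind == "PARAM":
        return "?"
    if kind == "CONCAT":
        parts = [render(arg, dialect) for arg in args]
        if dialect == "mysql":
            return "CONCAT(" + ", ".join(parts) + ")"
        return " || ".join(parts)            # SQLite concatenation operator
    if kind == "LIKE":
        left, right = (render(arg, dialect) for arg in args)
        return f"{left} LIKE {right}"
    raise NotImplementedError(kind)


abstract_sql = ["LIKE", ["COLUMN", "t1", "y"],
                ["CONCAT", ["VALUE", "%"], ["PARAM", "p1"],
                 ["COLUMN", "t2", "c"], ["VALUE", "%"]]]

mysql_sql = render(abstract_sql, "mysql")
sqlite_sql = render(abstract_sql, "sqlite")
```

Only the `COLUMN` and `CONCAT` branches differ between the two dialects; everything upstream of the renderer stays the same.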
Specific SQL dialects
['LIKE', ['COLUMN', 't1', 'y'], ['CONCAT', ['VALUE', '%'], ['PARAM', 'p1'], ['COLUMN', 't2', 'c'], ['VALUE', '%'] ]]
MySQL:
`t1`.`y` LIKE CONCAT('%', ?, `t2`.`c`, '%')
SQLite:
"t1"."y" LIKE '%' || ? || "t2"."c" || '%'
Other Pony ORM features
• Identity Map
• Automatic query optimization
• N+1 Query Problem solution
• Optimistic transactions
• Online ER Diagram Editor
Django ORM
s1 = Student.objects.get(pk=123)
print s1.name, s1.group.id
s2 = Student.objects.get(pk=456)
print s2.name, s2.group.id
• How many SQL queries will be executed?
• How many objects will be created?
Objects created, in order:
Student 123
Group 1
Student 456
Group 1
Pony ORM
s1 = Student[123]
print s1.name, s1.group.id
s2 = Student[456]
print s2.name, s2.group.id
Pony ORM – seeds, IdentityMap
Objects created, in order:
Student 123
Group 1 (seed)
Student 456
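The identity map and the seed idea can be sketched in a few lines. `Entity` and `IdentityMap` are hypothetical toy classes, not Pony's implementation; the sketch assumes a seed is an object for which only the primary key is known and that the map is keyed by entity name and primary key.

```python
class Entity:
    def __init__(self, pk):
        self.pk = pk
        self.loaded = False   # a seed knows only its primary key


class IdentityMap:
    """At most one in-memory object per (entity name, primary key)."""

    def __init__(self):
        self._objects = {}

    def get(self, entity_name, pk):
        key = (entity_name, pk)
        if key not in self._objects:
            self._objects[key] = Entity(pk)   # created as a seed
        return self._objects[key]

    def __len__(self):
        return len(self._objects)


cache = IdentityMap()

s1 = cache.get("Student", 123)
s1.loaded = True                   # pretend the student row was fetched
s1.group = cache.get("Group", 1)   # only the foreign key is known: a seed

s2 = cache.get("Student", 456)
s2.loaded = True
s2.group = cache.get("Group", 1)   # same key, so the same object is reused
```

Both students end up pointing at one shared `Group 1` object, and reading its primary key needs no data beyond the foreign key value already stored in the student row.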
Solution for the N+1 Query Problem
orders = select(o for o in Order if o.total_price > 1000) \
    .order_by(desc(Order.id)).page(1, pagesize=5)
for o in orders:
    print o.total_price, o.customer.name
SELECT o.id, o.total_price, o.customer_id, ...
FROM "Order" o
WHERE o.total_price > 1000
ORDER BY o.id DESC
LIMIT 5
Order 1, Order 3, Order 4, Order 7, Order 9
Customer 1, Customer 4, Customer 7
One SQL query
SELECT c.id, c.name, …
FROM "Customer" c
WHERE c.id IN (?, ?, ?)
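The two-query pattern can be reproduced by hand with the standard `sqlite3` module. The lowercase table names, the sample rows and the `run` helper are assumptions made for this sketch; Pony performs the batching automatically, whereas here the second query is written explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "order" (id INTEGER PRIMARY KEY, total_price REAL,
                          customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Ann'), (4, 'Bob'), (7, 'Eve');
    INSERT INTO "order" VALUES (1, 1500, 1), (3, 2000, 4), (4, 1200, 1),
                               (7, 3000, 7), (9, 1100, 4), (10, 500, 7);
""")

executed = []  # every SQL statement sent by run()

def run(sql, params=()):
    executed.append(sql)
    return conn.execute(sql, params).fetchall()

# Query 1: the page of orders.
orders = run('SELECT id, total_price, customer_id FROM "order" '
             'WHERE total_price > 1000 ORDER BY id DESC LIMIT 5')

# Query 2: all referenced customers at once, instead of one query per order.
customer_ids = sorted({customer_id for _, _, customer_id in orders})
placeholders = ", ".join("?" * len(customer_ids))
customers = dict(run(
    f"SELECT id, name FROM customer WHERE id IN ({placeholders})",
    customer_ids,
))

report = [(price, customers[customer_id]) for _, price, customer_id in orders]
```

Five orders reference three distinct customers, and the whole report needs two statements rather than one per order plus one.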
Automatic query optimization
select(c for c in Customer if sum(c.orders.total_price) > 1000)
SELECT "c"."id", "c"."email", "c"."password", "c"."name", "c"."country", "c"."address"
FROM "Customer" "c"
WHERE (
    SELECT coalesce(SUM("order-1"."total_price"), 0)
    FROM "Order" "order-1"
    WHERE "c"."id" = "order-1"."customer"
) > 1000
SELECT "c"."id"
FROM "Customer" "c"
  LEFT JOIN "Order" "order-1"
    ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
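A quick way to convince yourself that the correlated-subquery form and the JOIN with GROUP BY form select the same customers is to run both against a small SQLite database. The schema and rows below are invented for this sketch, and both queries are trimmed to return only the customer id so the results can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Customer" ("id" INTEGER PRIMARY KEY, "name" TEXT);
    CREATE TABLE "Order" ("id" INTEGER PRIMARY KEY, "total_price" REAL,
                          "customer" INTEGER);
    INSERT INTO "Customer" VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Eve');
    INSERT INTO "Order" VALUES (1, 700, 1), (2, 400, 1), (3, 900, 2);
""")

subquery_form = """
    SELECT "c"."id"
    FROM "Customer" "c"
    WHERE (
        SELECT coalesce(SUM("order-1"."total_price"), 0)
        FROM "Order" "order-1"
        WHERE "c"."id" = "order-1"."customer"
    ) > 1000
"""

join_form = """
    SELECT "c"."id"
    FROM "Customer" "c"
      LEFT JOIN "Order" "order-1"
        ON "c"."id" = "order-1"."customer"
    GROUP BY "c"."id"
    HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000
"""

subquery_ids = sorted(row[0] for row in conn.execute(subquery_form))
join_ids = sorted(row[0] for row in conn.execute(join_form))
```

Only the first customer has orders totalling more than 1000, and the customer without any orders is handled by `coalesce` in both forms.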
Transactions
Django ORM
def transfer_money(id1, id2, amount):
    account1 = Account.objects.get(pk=id1)
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account2 = Account.objects.get(pk=id2)
    account1.amount -= amount
    account1.save()
    account2.amount += amount
    account2.save()
Django ORM
@transaction.atomic
def transfer_money(id1, id2, amount):
    account1 = Account.objects.get(pk=id1)
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account2 = Account.objects.get(pk=id2)
    account1.amount -= amount
    account1.save()
    account2.amount += amount
    account2.save()
Django ORM
@transaction.atomic
def transfer_money(id1, id2, amount):
    account1 = Account.objects \
        .select_for_update().get(pk=id1)
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account2 = Account.objects \
        .select_for_update().get(pk=id2)
    account1.amount -= amount
    account1.save()
    account2.amount += amount
    account2.save()
Pony ORM
@db_session
def transfer_money(id1, id2, amount):
    account1 = Account[id1]
    if account1.amount < amount:
        raise ValueError('Not enough funds!')
    account1.amount -= amount
    Account[id2].amount += amount
Transactions
db_session
• Pony tracks which objects were changed
• No need to call save()
• Pony saves all updated objects in a single transaction automatically on leaving the db_session scope
Optimistic Locking
UPDATE Account
SET amount = :new_value
WHERE id = :id
AND amount = :old_value
• Pony tracks attributes which were read and updated
• If an object wasn’t locked using the for_update method, Pony uses optimistic locking automatically
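The optimistic check boils down to comparing the number of rows the UPDATE actually changed. Below is a sketch using the standard `sqlite3` module; `optimistic_update` is a hypothetical helper and the table is created just for the demonstration. Reporting the conflict as a boolean is a choice made here and says nothing about how Pony itself signals it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO Account VALUES (1, 100)")

def optimistic_update(conn, account_id, old_value, new_value):
    """Apply the update only if 'amount' still holds the value that was read."""
    cursor = conn.execute(
        "UPDATE Account SET amount = :new_value "
        "WHERE id = :id AND amount = :old_value",
        {"new_value": new_value, "id": account_id, "old_value": old_value},
    )
    return cursor.rowcount == 1   # zero rows means someone else changed it

# Two writers both read amount = 100 before either of them wrote.
first_ok = optimistic_update(conn, 1, old_value=100, new_value=70)
second_ok = optimistic_update(conn, 1, old_value=100, new_value=50)

final_amount = conn.execute(
    "SELECT amount FROM Account WHERE id = 1").fetchone()[0]
```

The second writer's stale read is detected without holding any lock between reading and writing, which is the trade-off compared with `select_for_update`.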
Entity-Relationship Diagram Editor
https://editor.ponyorm.com
Wrapping up
Main Pony ORM features:
• Using generators for database queries
• Identity Map
• Solution for N+1 Query Problem
• Automatic query optimization
• Optimistic transactions
• Online ER-diagram editor
Pony roadmap
• Python 3
• Microsoft SQL Server support
• Improved documentation
• Migrations
• Async queries
• Site ponyorm.com
• Twitter @ponyorm
• Github github.com/ponyorm/pony
• ER-Diagram editor editor.ponyorm.com
• Installation: pip install pony
Thank you!
Pony ORM