Post on 21-Dec-2015


10. The PyPy translation tool chain

Toon Verwaest

Thanks to Carl Friedrich Bolz for his kind permission to reuse and adapt his notes.

© Toon Verwaest

The PyPy tool chain


Roadmap

> What is PyPy?
> The PyPy interpreter
> The PyPy translation tool chain


What is PyPy?

> Reimplementation of Python in Python

> Framework for building interpreters and VMs

> L * O * P configurations
  - L dynamic languages
  - O optimizations
  - P platforms


The PyPy Interpreter

> Python: imperative, object-oriented dynamic language

> Stack-based bytecode interpreter (like JVM, Smalltalk)

    def f(x):
        return x + 1

    >>> import dis
    >>> dis.dis(f)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_CONST               1 (1)
                  6 BINARY_ADD
                  7 RETURN_VALUE
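To make the stack-based model concrete, here is a minimal sketch of a stack machine that evaluates the four instructions from the listing. The function `run` and the instruction encoding are hypothetical and chosen for illustration only; the real PyPy interpreter is structured differently.

```python
# Toy stack machine: a sketch of how a bytecode interpreter evaluates
# the instructions of f(x) = x + 1. All names here are hypothetical.

def run(instructions, constants, local_vars):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    for opcode, arg in instructions:
        if opcode == "LOAD_FAST":       # push a local variable
            stack.append(local_vars[arg])
        elif opcode == "LOAD_CONST":    # push a constant
            stack.append(constants[arg])
        elif opcode == "BINARY_ADD":    # pop two operands, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "RETURN_VALUE":  # result is the top of the stack
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opcode)
    raise ValueError("fell off the end without RETURN_VALUE")

# Bytecode of f, transcribed from the dis listing.
F_CODE = [
    ("LOAD_FAST", 0),        # x
    ("LOAD_CONST", 1),       # 1
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
F_CONSTS = [None, 1]         # constant 0 is None, constant 1 is the integer 1

result = run(F_CODE, F_CONSTS, [41])
print(result)
```

Each instruction only touches the top of the stack, which is why the interpreter loop can stay this small.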


The PyPy Bytecode Compiler

> Written in Python

> .py to .pyc

> Standard, flexible compiler
  - Lexer
  - Parser
  - AST builder
  - Bytecode generator

> You only have to build this once
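The stages can be observed from Python itself. The sketch below uses CPython's built-in `ast` module and `compile` function as a stand-in; it does not use PyPy's own compiler, and the file name `<example>` is a placeholder.

```python
import ast

SOURCE = "def f(x):\n    return x + 1\n"

# Lexing and parsing: source text -> abstract syntax tree.
tree = ast.parse(SOURCE, filename="<example>")
function_def = tree.body[0]

# Bytecode generation: AST -> code object (what a .pyc file stores).
module_code = compile(tree, "<example>", "exec")

# Running the module code defines f in the given namespace.
namespace = {}
exec(module_code, namespace)
f = namespace["f"]

print(type(function_def).__name__)
print(f(41))
```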


Bytecode interpreter

> Focuses on language semantics. No low-level details!

> Written in RPython
  - Run untranslated on top of CPython it is very slow: about 2000x slower than CPython

> PyPy's Python bytecode compiler and interpreter are not the hot topic of the PyPy project!


The PyPy Translation Tool Chain

> Model-driven interpreter (VM) development
  - Focus on language model rather than implementation details
  - Executable models (meta-circular Python)

> Translate models to low-level (LL) back-ends
  - Considerably lower than Python
  - Weave in implementation details (GC, JIT)
  - Allow compilation to different back-ends (OO, procedural)


Inside the Translation Tool Chain


PyPy “Parser”

> Tool chain starts from loaded Python bytecode
> Translator shares Python environment with the target
> Relies on Python's reflective capabilities
> Allows meta-programming (runtime initialization)

    def a_decorator(an_f):
        def g(b):
            an_f(b+10)
        return g

    @a_decorator
    def f(a):
        print a

    f(4) -> 14
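Because the tool chain starts from live objects rather than source text, anything computed while the module loads is already in place when translation begins. The following sketch (Python 3 syntax; `make_adder` and `ADDERS` are hypothetical names) shows load-time initialization and the reflective access to bytecode that this relies on.

```python
# Code that runs at import time: build a dispatch table programmatically.
def make_adder(amount):
    def adder(value):
        return value + amount
    return adder

# By the time any tool inspects this module, the loop has already run
# and ADDERS holds three ordinary function objects.
ADDERS = {}
for n in (1, 10, 100):
    ADDERS[n] = make_adder(n)

def a_decorator(an_f):
    def g(b):
        return an_f(b + 10)
    return g

@a_decorator
def f(a):
    return a

# Reflection: the name "f" is bound to the wrapper g, and its bytecode
# is available as a code object.
wrapper_name = f.__name__
wrapper_code = f.__code__
print(wrapper_name, wrapper_code.co_name, f(4), ADDERS[10](4))
```

A tool that reads `f.__code__` never sees the decorator syntax, only the function object the decorator produced.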


PyPy Control-Flow Graph

> Consists of Blocks and Links

> Starting from entry_point

> “Static Single Information” form

    def f(n):
        return 3*n+2

    Block(v1):     # input argument
        v2 = mul(Constant(3), v1)
        v3 = add(v2, Constant(2))
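A graph of blocks and links can be modelled with a handful of small classes. The sketch below is a simplified, hypothetical data model (each block has a single outgoing link, and the class layout is an assumption); it builds the graph for f(n) = 3*n + 2 and interprets it.

```python
class Variable(object):
    def __init__(self, name):
        self.name = name

class Constant(object):
    def __init__(self, value):
        self.value = value

class Operation(object):
    def __init__(self, opname, args, result):
        self.opname = opname    # e.g. "mul" or "add"
        self.args = args        # list of Variable / Constant
        self.result = result    # Variable that receives the value

class Link(object):
    def __init__(self, args, target):
        self.args = args        # values passed to the target block
        self.target = target    # next Block, or None for "return"

class Block(object):
    def __init__(self, inputargs):
        self.inputargs = inputargs
        self.operations = []
        self.exit = None        # single outgoing Link in this sketch

# Graph for f(n) = 3*n + 2.
v1, v2, v3 = Variable("v1"), Variable("v2"), Variable("v3")
start = Block([v1])
start.operations.append(Operation("mul", [Constant(3), v1], v2))
start.operations.append(Operation("add", [v2, Constant(2)], v3))
start.exit = Link([v3], None)   # return v3

OPS = {"mul": lambda a, b: a * b, "add": lambda a, b: a + b}

def run_graph(block, values):
    """Interpret the graph: follow links until one targets None."""
    while True:
        env = dict(zip(block.inputargs, values))
        def lookup(arg):
            return arg.value if isinstance(arg, Constant) else env[arg]
        for op in block.operations:
            env[op.result] = OPS[op.opname](*[lookup(a) for a in op.args])
        values = [lookup(a) for a in block.exit.args]
        if block.exit.target is None:
            return values[0]
        block = block.exit.target

print(run_graph(start, [5]))
```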


PyPy CFG: “Static Single Information”

> Remember SSA: PHIs at dominance frontiers


    def test(a):
        if a > 0:
            if a > 5:
                return 10
            return 4
        if a < -10:
            return 3
        return 10

> SSI: “PHIs” for all used variables
  - Blocks as “functions without branches”


Type Inference


Why type inference?

> Python is dynamically typed

> We want to translate to statically typed code
  - For efficiency reasons

What do we need to infer?

> Type for every variable

> Messages sent to an object must be defined in the compile-time type or a supertype


How to infer types?

> Starting from entry_point
  - Can reach the whole program
  - We know type of arguments and return-value

> Forward propagation
  - Iteratively, until all links in the CFG have been followed at least once
  - Results in a large dictionary mapping variables to types
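Forward propagation can be sketched as a worklist over links. The toy annotator below is a deliberately small model, assuming a graph encoded as plain tuples and types written as strings; `union`, `annotate`, `RESULT_TYPES`, and `UnionError` are names made up for this sketch.

```python
class UnionError(Exception):
    pass  # hypothetical name for "no common type"

def union(t1, t2):
    """Merge two type annotations; None means 'not seen yet'."""
    if t1 is None:
        return t2
    if t2 is None or t1 == t2:
        return t1
    raise UnionError("cannot merge %s and %s" % (t1, t2))

# Result type of each operation, keyed by (operation, argument types).
RESULT_TYPES = {
    ("add", "int", "int"): "int",
    ("mul", "int", "int"): "int",
    ("gt", "int", "int"): "bool",
    ("add", "str", "str"): "str",
}

# A toy graph: block name -> (input variables, operations, links).
# An operation is (result, opname, arg1, arg2); arguments are variable
# names or integer constants. A link is (target block, argument names).
GRAPH = {
    "start": (["n"],
              [("v1", "mul", 3, "n"), ("v2", "gt", "v1", 10)],
              [("loop", ["v1"]), ("done", ["v1"])]),
    "loop":  (["x"],
              [("v3", "add", "x", 1)],
              [("done", ["v3"])]),
    "done":  (["result"], [], []),
}

def annotate(graph, entry, entry_types):
    bindings = {}                       # variable name -> type
    pending = [(entry, entry_types)]    # links still to follow
    while pending:
        name, incoming = pending.pop()
        inputargs, operations, links = graph[name]
        changed = False
        for var, new_type in zip(inputargs, incoming):
            merged = union(bindings.get(var), new_type)
            if merged != bindings.get(var):
                bindings[var] = merged
                changed = True
        if not changed:
            continue                    # nothing new flows into this block
        for result, opname, a, b in operations:
            ta = "int" if isinstance(a, int) else bindings[a]
            tb = "int" if isinstance(b, int) else bindings[b]
            bindings[result] = RESULT_TYPES[(opname, ta, tb)]
        for target, args in links:
            pending.append((target, [bindings[v] for v in args]))
    return bindings

types = annotate(GRAPH, "start", ["int"])
print(sorted(types.items()))
```

The returned dictionary is the small-scale counterpart of the large variable-to-type mapping; a link that carries a type incompatible with what a block has already seen makes `union` fail.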


Implications of applying type inference

Applying type inference restricts the type of input programs


RPython: Demo

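    def plus(a, b):
        return a + b

    def entry_point(arv=None):
        # plus is called with integers and with strings, so no single
        # type can be inferred for a and b
        print plus(20, 22)
        print plus("4", "2")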


RPython: Demo

    @objectmodel.specialize.argtype(0)
    def plus(a, b):
        return a + b

    def entry_point(arv=None):
        print plus(20, 22)
        print plus("4", "2")
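The effect of specializing on an argument type can be imitated in plain Python: each distinct type of the chosen argument gets its own copy of the function, so every copy only ever sees one type. The decorator `specialize_argtype` below is a hypothetical stand-in written for this sketch. It dispatches at run time, whereas the tool chain creates the copies during translation.

```python
def specialize_argtype(index):
    """Hypothetical pure-Python stand-in for specialization by argument type."""
    def decorator(func):
        copies = {}                       # argument type -> specialized copy
        def dispatcher(*args):
            key = type(args[index])
            if key not in copies:
                def specialized(*inner_args):
                    return func(*inner_args)
                specialized.__name__ = "%s__%s" % (func.__name__, key.__name__)
                copies[key] = specialized
            return copies[key](*args)
        dispatcher.copies = copies
        return dispatcher
    return decorator

@specialize_argtype(0)
def plus(a, b):
    return a + b

int_result = plus(20, 22)       # served by the copy named plus__int
str_result = plus("4", "2")     # served by the copy named plus__str
copy_names = sorted(c.__name__ for c in plus.copies.values())
print(int_result, str_result, copy_names)
```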

RPython is Zen

> Subset of Python

> Informally: The subset of Python which is type inferable

> Actually: type inferable stabilized bytecode
  - Allows load-time meta-programming (see parser)
  - Messages sent to an object must be defined in the compile-time type or supertype


RTyper

> Bridge between annotator and low-level code generators

> Different low-level models for different target groups
  - LLTypeSystem: C-style (structures, pointers and arrays)
  - OOTypeSystem: JVM, CLI, Squeak (trade-off: single inheritance)

> Does not need to iterate until a fixpoint is reached

> Replaces all operations by low-level ones
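Since the types are already known, lowering can be a single pass that looks up each operation together with its annotated type. The sketch below assumes a hypothetical lowering table; the `int_*` names imitate the naming style of typed low-level operations, and the string helper name is illustrative only.

```python
# Hypothetical lowering table: (high-level operation, annotated type)
# -> low-level operation name.
LOWERING = {
    ("add", "int"): "int_add",
    ("mul", "int"): "int_mul",
    ("add", "str"): "ll_strconcat",
}

def rtype(operations, types):
    """Replace each (result, opname, args) by its typed low-level form.

    One pass suffices: the types are fixed, so nothing has to be
    recomputed or iterated.
    """
    lowered = []
    for result, opname, args in operations:
        low_name = LOWERING[(opname, types[result])]
        lowered.append((result, low_name, args))
    return lowered

# f(n) = 3*n + 2 with the types an annotator would have produced.
high_level = [("v2", "mul", (3, "v1")), ("v3", "add", ("v2", 2))]
types = {"v1": "int", "v2": "int", "v3": "int"}

low_level = rtype(high_level, types)
for op in low_level:
    print(op)
```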


Back-end Optimizations

> Some general optimizations
  - Inlining
  - Constant folding
  - Escape analysis (allocating objects on the stack)

> Partly assume that code is generated for an optimizing back-end
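Constant folding is the easiest of these optimizations to show in isolation. The sketch below works on a list of tuple-encoded operations; the encoding and the operation names are assumptions made for illustration.

```python
import operator

FOLDABLE = {"int_add": operator.add, "int_mul": operator.mul}

def constant_fold(operations):
    """Evaluate operations whose arguments are all known constants.

    Operations are (result, opname, (arg, arg)); an argument is either an
    int constant or a variable name. Folded results are substituted into
    later operations, which may make those foldable too.
    """
    known = {}          # variable name -> constant value
    remaining = []
    for result, opname, args in operations:
        args = tuple(known.get(a, a) if isinstance(a, str) else a for a in args)
        if opname in FOLDABLE and all(isinstance(a, int) for a in args):
            known[result] = FOLDABLE[opname](*args)
        else:
            remaining.append((result, opname, args))
    return remaining, known

ops = [
    ("v1", "int_mul", (3, 4)),        # both arguments constant
    ("v2", "int_add", ("v1", "n")),   # depends on the input n
    ("v3", "int_add", ("v1", 2)),     # constant once v1 is known
]
remaining, known = constant_fold(ops)
print(remaining)
print(known)
```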


Back-end Optimizations: “Object Explosion”

> OO: lots of helper objects

> Allocating objects is expensive

> Replace unneeded objects with direct calls
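A before/after pair illustrates the idea. The class `Pair` and both functions are hypothetical; the second function is what the first could look like once the helper object that never leaves the function has been replaced by its fields.

```python
# Before: a short-lived helper object carries two values.
class Pair(object):
    def __init__(self, first, second):
        self.first = first
        self.second = second

def sum_with_helper(a, b):
    pair = Pair(a, b)            # allocation that never escapes
    return pair.first + pair.second

# After: the object is "exploded" into its fields, which become plain
# local variables; no allocation remains.
def sum_exploded(a, b):
    pair_first = a
    pair_second = b
    return pair_first + pair_second

print(sum_with_helper(20, 22), sum_exploded(20, 22))
```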


Preparation for Source Generation


Exception Handling and Memory Management

> C has no support for:
  - automatic memory management
  - exception handling

> Translate explicit exception handling to flags and if/else

> Memory management in PyPy spirit:
  - not language specific
  - weave garbage collector in during translation
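The exception transformation can be shown in Python itself: a raise becomes "set a flag and return a dummy value", and every call site checks the flag with a plain if/else. This is a sketch of the shape of the transformed code, with hypothetical names, not what the tool chain literally emits.

```python
# Before: ordinary exception handling.
def checked_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a // b

def safe_ratio(a, b):
    try:
        return checked_div(a, b)
    except ZeroDivisionError:
        return -1

# After: the same logic with an explicit "exception pending" flag, a
# shape that maps directly onto C.
class ExcState(object):
    pending = None      # the current exception, or None

exc_state = ExcState()

def checked_div_flagged(a, b):
    if b == 0:
        exc_state.pending = ZeroDivisionError("division by zero")
        return 0        # dummy value; the caller must check the flag
    return a // b

def safe_ratio_flagged(a, b):
    result = checked_div_flagged(a, b)
    if exc_state.pending is not None:            # did the call raise?
        if isinstance(exc_state.pending, ZeroDivisionError):
            exc_state.pending = None             # handled here
            return -1
        return 0        # not handled here: leave the flag set, propagate
    return result

print(safe_ratio_flagged(7, 2), safe_ratio_flagged(1, 0))
```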


JIT Compiler

> Makes VMs fast
  - Dynamic information is key

> Is an implementation detail

> Still under development

> “As you surely know, the key idea of PyPy is that we are too lazy to write a JIT of our own: so, instead of passing nights writing a JIT, we pass years coding a JIT generator that writes the JIT for us :-)”

Weave in while translating to low-level!


Code Generation

> One C-function per Control-Flow Graph

> All low-level statements can be translated directly

> Gets compiled to binary format with C compiler
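Once every operation is low-level, emitting C is mostly string formatting. The generator below handles only a single-block graph of integer operations; the tuple encoding, the use of `long`, and the function `generate_c` are assumptions of this sketch.

```python
C_OPERATORS = {"int_add": "+", "int_mul": "*"}

def c_expr(arg):
    """Constants are emitted as literals, variables by name."""
    return str(arg)

def generate_c(name, inputargs, operations, return_var):
    """Emit one C function for a single-block graph of int operations."""
    params = ", ".join("long " + a for a in inputargs)
    lines = ["long %s(%s) {" % (name, params)]
    for result, opname, (left, right) in operations:
        lines.append("    long %s = %s %s %s;" % (
            result, c_expr(left), C_OPERATORS[opname], c_expr(right)))
    lines.append("    return %s;" % return_var)
    lines.append("}")
    return "\n".join(lines)

# f(n) = 3*n + 2 after lowering to typed integer operations.
source = generate_c(
    "f", ["v1"],
    [("v2", "int_mul", (3, "v1")), ("v3", "int_add", ("v2", 2))],
    "v3")
print(source)
```

Each low-level operation becomes exactly one C statement, which is what makes the direct translation possible.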


Translation Demo


PyPy Performance

> Translator
  - Slow
  - Uses quite some memory
  - Produces lots of source code (200 kloc for 5 kloc source)
  - But: our models are executable (2000x slower than CPython)

> Resulting Interpreter
  - Currently: two times slower to two times faster than CPython
  - First experiments with JIT: up to 500x faster for special cases
  - But most importantly: very adaptable!


More PyPy & Getting Involved

> http://codespeak.net/pypy
> http://morepypy.blogspot.com
> irc://irc.freenode.org/pypy
> PyPy sprints


Summary

> PyPy project has two main parts
  - Language interpreter models
  - PyPy translation tool chain

> PyPy translation tool chain
  - Has no typical parser
  - Uses SSI
  - Applies type inference
    - Limits input from Python to RPython
  - Compiles to low-level and object-oriented back-ends
  - Weaves in implementation details


What you should know!

> What is the goal of the PyPy project?
> What are the main steps of the PyPy toolchain?
> When is a program RPython?


Can you answer these questions?

> Why do we want to keep the language model separated from implementation details?

> Why wouldn't we want to keep those details separated?

> Why is it not really a problem that the tool chain can only compile RPython code?


License

> http://creativecommons.org/licenses/by-sa/2.5/

Attribution-ShareAlike 2.5

You are free:
• to copy, distribute, display, and perform the work
• to make derivative works
• to make commercial use of the work

Under the following conditions:

Attribution. You must attribute the work in the manner specified by the author or licensor.

Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.

• For any reuse or distribution, you must make clear to others the license terms of this work.
• Any of these conditions can be waived if you get permission from the copyright holder.

Your fair use and other rights are in no way affected by the above.