an insight into cpython compiler design

Post on 06-May-2015

2.080 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Describes the construction of the Python compiler by taking apart the Unladen Swallow codebase

TRANSCRIPT

An Insight into CPython Compiler Design

Ramkumar Ramachandra

FOSS.IN/2009

03 December 2009

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 1 / 12

Short Discussion of the CPython Compiler

1 Short Discussion of the CPython Compiler

2 A Gentle Introduction to LLVM

3 Enter: Unladen Swallow

4 Conclusion

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 2 / 12

Short Discussion of the CPython Compiler

How Python is compiled

Do the boring grammar parsing

Compile the parse tree to bytecode

Apply optimizations

Interpret the bytecode

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 2 / 12

Short Discussion of the CPython Compiler

The various stages of compilation

PyAST FromNode() in Python/ast.c | Parse tree → AST

PyAST Compile() in compile.c | AST → CFG → Bytecode

PyAST Compile() calls PySymtable Build() and compiler mod() | AST → CFG

assemble() | Post-order DFS | CFG → Bytecode

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 3 / 12

Short Discussion of the CPython Compiler

What the final bytecode looks like

a, b = 1, 0if a or b:

print "Hello", a

1 0 LOAD_CONST 4 ((1, 0))3 UNPACK_SEQUENCE 26 STORE_NAME 0 (a)9 STORE_NAME 1 (b)

2 12 LOAD_NAME 0 (a)15 JUMP_IF_TRUE 7 (to 25)18 POP_TOP19 LOAD_NAME 1 (b)22 JUMP_IF_FALSE 13 (to 38)

>> 25 POP_TOP

3 26 LOAD_CONST 2 ('Hello ')29 PRINT_ITEM30 LOAD_NAME 0 (a)33 PRINT_ITEM34 PRINT_NEWLINE35 JUMP_FORWARD 1 (to 39)

>> 38 POP_TOP>> 39 LOAD_CONST 3 (None)

42 RETURN_VALUE

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 4 / 12

Short Discussion of the CPython Compiler

Execute the bytecode

1 PyObject *PyEval_EvalFrameEx(PyFrameObject *f, int throwflag) {2 PyObject *result;3 result = PyEval_EvalFrame(f);4 return result;5 }

1 PyObject *PyEval_EvalFrame(PyFrameObject *f)2 {3 register PyObject ** stack_pointer; /* Next free slot */4 register unsigned char *next_instr;5 register int opcode; /* Current opcode */6 register int oparg; /* Current opcode argument , if any */7 PyObject *retval = NULL; /* Return value */8 PyCodeObject *co; /* Code object */9 }

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 5 / 12

A Gentle Introduction to LLVM

What is LLVM and why is it relevant?

Compiler infrastructure

Invents a new IR

Replaces lower levels of GCC

Provides static GCC-like compilation and JIT

Python frontend possible

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 6 / 12

Enter: Unladen Swallow

How Unladen Swallow started

Objective: To speed up CPython

Experiment with Psyco

Temporarily use VMgen for eval loop

Remove rarely used opcodes

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 7 / 12

Enter: Unladen Swallow

Compile Python bytecode to LLVM IR

1 extern "C" _LlvmFunction *2 _PyCode_ToLlvmIr(PyCodeObject *code)3 {4 _LlvmFunction *wrapper = new _LlvmFunction ();5 /* fbuilder functions in llvm_fbuilder.cc */6 wrapper ->lf_function = fbuilder.function ();7 return wrapper;8 }

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 8 / 12

Enter: Unladen Swallow

Changes to the eval loop

1 static int2 mark_called_and_maybe_compile(PyCodeObject *co , PyFrameObject *f)3 {4 co->co_hotness += 10;5 if (co->co_hotness > PY_HOTNESS_THRESHOLD) {6 if (co->co_llvm_function == NULL) {7 int target_optimization =8 std::max(Py_DEFAULT_JIT_OPT_LEVEL ,9 Py_OptimizeFlag);

10 if (co ->co_optimization < target_optimization) {11 // If the LLVM version of the function wasn't12 // created yet , setting the optimization level13 // will create it.14 r = _PyCode_ToOptimizedLlvmIr(co , target_optimization);15 }16 }17 if (co->co_native_function == NULL) {18 // Now try to JIT the IR function to machine code.19 co->co_native_function =20 _LlvmFunction_Jit(co->co_llvm_function);21 }22 }23 return 0;24 }

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 9 / 12

Enter: Unladen Swallow

Implement feedback-directed optimization

Optimize native code, not bytecode

Speed up builtin lookups/ inline simple builtins

Don’t compile cold branches

Inline simple operators using type feedback

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 10 / 12

Conclusion

References

[1] Abelson, H., Sussman, G. J., and Sussman, J. Structure and Interpretation of ComputerPrograms (SICP). The MIT Press, 1984.

[2] Aho, A. V., Sethi, R., and Ullman, J. D. Compilers: Principles, Techniques, and Tools.Pearson Education, Inc, 2006.

[3] Aycock, J. Compiling Little Languages in Python.

[4] Cannon, B. Design of the CPython Compiler, 2005.

[5] Ertl, M. A., Gregg, D., Krall, A., and Paysan, B. vmgen — a generator of efficientvirtual machine interpreters. Software—Practice and Experience 32, 3 (2002), 265–294.

[6] Montanaro, S. In A Peephole Optimizer for Python (1998).

[7] Wang, D. C., Appel, A. W., Korn, J. L., and Serra, C. S. In The Zephyr AbstractSyntax Description Language (1997).

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 11 / 12

Conclusion

Contact information

Ramkumar Ramachandraartagnon@gmail.comhttp://artagnon.com

Indian Institute of Technology, KharagpurPresentation source available on http://github.com/artagnon/foss.in

Ramkumar Ramachandra (FOSS.IN/2009) An Insight into CPython Compiler Design 03 December 2009 12 / 12

top related