Numba: NumPy-aware dynamic Python compiler

NYC Python Meetup, October 24, 2012

Travis E. Oliphant

Friday, October 26, 12

Motivation

• Python is great for rapid development and high-level thinking-in-code

• It is slow for interior loops because the lack of type information leads to a lot of indirection and “extra” code.

Motivation

• NumPy users have a lot of type information --- but currently have only one-size-fits-all, pre-compiled, vectorized loops.

• Many of the new features envisioned will need the ability to compile high-level expressions to machine code.

Goals

• Most developers should not have to write anything but Python -- or perhaps an even higher-level Domain Specific Language (DSL).

• Create faster code using array-expressions from NumPy users -- Fortran is the initial target

• Take advantage of multi-core and GPUs for a subset of Python.

Why Not PyPy?

• PyPy does not work with CPython

• PyPy is a (meta) “tracing” JIT. Machine code is generated on the fly so there is no “build step” -- but we want to support a “build step” when justified

• PyPy tries to speed up everything -- we want to optimize more specifically on numeric codes (including complex numbers)

More to the story...

Why not Cython?

• Cython is great for what it does, but...

• Cython creates extension modules which cannot be “unloaded” dynamically

• Cython requires a full C-compiler

• Cython doesn’t do type inference -- you have to declare types on everything

• Cython is another syntax to learn

More Ranting

• The world needs more array-oriented compilers --- Python has needed one for a decade at least (Numeric provided typed multi-dimensional arrays in 1995)

• Array-oriented computing needs more light in CS curricula (secrets of APL and N’IAL)

• Most domain experts can write what they want at a high level. Commonly this is then “translated” to a lower level, and only then does the compiler get a hold of it. This is sub-optimal.

More Ranting

• Today’s vector machines (and vector co-processors, or GPUs) were made for array-oriented computing.

• The software stack has just not caught up --- unfortunate because APL came out in 1963.

• There is a reason Fortran remains popular.

Array-Oriented Computing

• Loosely defined as “organize data together” and operate on it together (or in cache-size chunks) with array-level operations (e.g. NumPy)

[Figure: six separate objects, each holding Attr1, Attr2, and Attr3, contrasted with a single table whose columns are Attr1, Attr2, Attr3 and whose rows are Object1 through Object6]
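The contrast between per-object storage and column storage can be sketched in a few lines of NumPy. This is only an illustration: the `Record` class, the attribute values, and the weighted-sum computation are hypothetical and do not come from the slides.

```python
import numpy as np


class Record:
    """Hypothetical object holding three attributes, as in the figure."""

    def __init__(self, attr1, attr2, attr3):
        self.attr1 = attr1
        self.attr2 = attr2
        self.attr3 = attr3


# Object-oriented layout: six separate objects, each with its own attributes.
records = [Record(float(i), 2.0 * i, 1.0) for i in range(6)]

# The interpreter visits one object at a time and looks up each attribute.
total_objects = sum(r.attr1 * r.attr3 for r in records)

# Array-oriented layout: one contiguous array per attribute (column).
attr1 = np.array([r.attr1 for r in records])
attr3 = np.array([r.attr3 for r in records])

# A single array-level operation processes the whole column at once.
total_arrays = float(np.dot(attr1, attr3))

print(total_objects, total_arrays)
```

Both totals agree; the difference is that the second form hands a whole column to one pre-compiled loop instead of dispatching per object.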

Goal:

Numba should be the world’s best array-oriented compiler.

NumPy + Mamba = Numba

[Figure: Python Function -> LLVM-PY -> LLVM Library -> Machine Code. Vendor labels: Intel, Nvidia, Apple, AMD, ARM. Technology labels: OpenCL, ISPC, CUDA, CLANG, OpenMP]

Uses of Numba

[Figure: Python Function -> Numba -> function pointer -> NumPy Runtime]

NumPy Runtime:

• Ufuncs

• Generalized UFuncs

• Function-based Indexing

• Memory Filters

• Window Kernel Funcs

• I/O Filters

• Reduction Filters

• Computed Columns
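To make the first item concrete, here is a minimal sketch of turning a scalar Python function into a ufunc. It uses NumPy's own `np.frompyfunc`, which still calls the interpreted function for every element; it is not Numba, and it only shows the interface that a compiled function pointer would plug into. The function `clipped_diff` and the sample data are hypothetical.

```python
import numpy as np


def clipped_diff(a, b):
    """Hypothetical scalar kernel: a - b, clipped below at zero."""
    d = a - b
    return d if d > 0 else 0.0


# Wrap the scalar function as a ufunc taking 2 inputs and returning 1 output.
clipped_diff_ufunc = np.frompyfunc(clipped_diff, 2, 1)

x = np.array([1.0, 5.0, 3.0])
y = np.array([2.0, 2.0, 2.0])

# frompyfunc returns object arrays, so convert back to float64.
result = clipped_diff_ufunc(x, y).astype(np.float64)

# Like any ufunc, it also broadcasts a scalar against an array.
broadcast_result = clipped_diff_ufunc(x, 2.0).astype(np.float64)

print(result, broadcast_result)
```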

Uses of Numba in SciPy

• integrate

• optimize

• ode

• special

Writing more of SciPy at a high level

Numba --- a deeper look

Numba is a Python-to-LLVM translator. It translates Python to LLVM IR (the LLVM machinery is then used to create machine code from there). Numba is NumPy aware --- it understands NumPy’s type system, methods, C-API, and data-structures.
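As a rough sketch of what this looks like from the user's side, the snippet below decorates a plain numeric loop. It assumes the `njit` decorator from current Numba releases, which postdates this talk and may differ from the 0.2 API presented here; the function name is hypothetical, and a no-op fallback is included so the code runs even without Numba installed.

```python
# Assumption: `njit` is the decorator in current Numba releases,
# not necessarily the API available when this talk was given.
try:
    from numba import njit
except ImportError:
    # Fallback so the example still runs as ordinary interpreted Python.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # A tight numeric loop: the kind of code whose types can be inferred
    # and translated to LLVM IR instead of being interpreted.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))  # 0 + 1 + 4 + ... + 81 = 285
```

When Numba is installed, the first call triggers compilation for the argument types seen; in current releases the generated IR can be examined with the compiled function's `inspect_llvm()` method.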

Numba -- written in Python

• Numba itself is pure Python -- it uses (an updated) LLVM-py to interact with the LLVM C++ library to build a representation of the code in LLVM assembler.

• LLVM then creates machine code (or a “bitcode” module which can be persisted or sent to another machine)

• Machine-code is equivalent to a C-level function-pointer (e.g. a ctypes function)
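The idea that a function pointer is just an address plus a C signature can be sketched with the standard `ctypes` module. This is an illustration only: the pointer here targets a ctypes callback wrapping a Python function, not Numba-generated machine code, and the names `UNARY_DOUBLE` and `square` are hypothetical.

```python
import ctypes

# C signature: double f(double)
UNARY_DOUBLE = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


def square(x):
    return x * x


# A C-callable function pointer. Keep a reference to it so the
# underlying callback stays alive while the address is in use.
c_square = UNARY_DOUBLE(square)

# The pointer is ultimately just an integer address.
address = ctypes.cast(c_square, ctypes.c_void_p).value

# Given only the address and the signature, the function can be called again.
same_function = UNARY_DOUBLE(address)

print(same_function(3.0))  # 9.0
```

Any consumer that accepts a C function pointer with this signature could be handed the same address, which is what makes compiled functions easy to plug into existing C-level machinery.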

Examples

[Example slides: content not preserved in this transcript]

Demo

Status and Future

• My early bytecode branch was further developed by Jon Riehl (Resilient Science), sponsored by Continuum Analytics, Inc. --- it interprets bytecode directly

• Current trunk works with the AST directly and is making rapid progress

  - Mark Florisson (minivect)

  - Siu Kwan Lam (pymothoa)
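The two starting points mentioned above, bytecode and the AST, can both be inspected with the standard library. The sketch below is not Numba code; it only shows the two representations of one hypothetical function, `add`, that a translator could begin from.

```python
import ast
import dis

SOURCE = "def add(a, b):\n    return a + b\n"

# AST view: a tree of syntax nodes that preserves expression structure.
tree = ast.parse(SOURCE)
func_node = tree.body[0]          # the FunctionDef node
return_node = func_node.body[0]   # the Return node
print(ast.dump(return_node.value))

# Bytecode view: a flat sequence of stack-machine instructions.
namespace = {}
exec(compile(tree, "<example>", "exec"), namespace)
add = namespace["add"]
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

The exact opcode names vary between CPython versions, whereas the AST node types are more stable, which is one practical difference between the two approaches.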

RoadMap

• Numba 0.2 available now

• GitHub trunk has many changes -- 0.3 will support

  - structures

  - Python code

  - objects with inheritance

Check out NumbaPro for cutting-edge features

Software Stack Future?

[Figure: a stack diagram with the labels LLVM; Python, C, OBJC, FORTRAN, R, C++; “Plateaus of Code re-use + DSLs”; Matlab, SQL, TDPL]

Join Us!

http://numba.pydata.org
