Strata Beijing 2017: Jumpy, a python interface for nd4j


Upload: adam-gibson

Post on 23-Jan-2018


Category:

Data & Analytics




Page 1: Strata Beijing 2017: Jumpy, a python interface for nd4j
Page 2: Strata Beijing 2017: Jumpy, a python interface for nd4j

Who are we?

This slide shows that GPUs should complement the big data stack in the Hadoop ecosystem rather than replace Hadoop outright. Wholesale replacement of the big data stack would be cost-prohibitive for many clients. We believe the right approach is to sell GPUs for accelerated computation and a few other use cases. That’s our beachhead. (Obviously, the widening functionality of Volta will change the GPU ecosystem.)

Founded 2014

Distributed worldwide

Lots of activity in China

Page 3: Strata Beijing 2017: Jumpy, a python interface for nd4j

Skymind in China

Page 4: Strata Beijing 2017: Jumpy, a python interface for nd4j

Most JVM python interfaces

● Network based. Requires gateway and py4j
● Tons of overhead. Often a bottleneck with real Spark jobs
● Places a focus on “pushing logic down to scala”
● Doesn’t interop well with existing python ecosystem
● Often api compatibility issues
● “Good enough” for basic use cases despite overhead

Page 5: Strata Beijing 2017: Jumpy, a python interface for nd4j

Basic facts about overhead

● In depth paper: https://arxiv.org/pdf/1612.01437.pdf
● Python vs scala: 15x slower
● Much of this is due to network traffic
● Serialization is another big problem
● Imagine saving objects every time you run compute.
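The serialization point can be illustrated with a minimal sketch that uses only the Python standard library. It is not code from the talk or from any Spark bridge: `roundtrip_by_serialization` and `share_by_view` are hypothetical names, and `pickle` stands in for whatever wire format a real gateway would use. The sketch contrasts copying data through a serialized payload with exposing the same buffer through a view.

```python
import array
import pickle


def roundtrip_by_serialization(values):
    """Copy data the way a network bridge would: serialize, then deserialize."""
    payload = pickle.dumps(values)  # bytes that would travel over the socket
    return pickle.loads(payload), len(payload)


def share_by_view(values):
    """Expose the same underlying buffer without copying it."""
    return memoryview(values)


data = array.array("d", range(1000))  # 1000 doubles, 8000 bytes of raw data
copied, payload_size = roundtrip_by_serialization(data)
view = share_by_view(data)

# Write to the original after both handles were created.
data[0] = 42.0

print(copied[0])     # the serialized copy does not see the write
print(view[0])       # the view reads the same memory as the original
print(payload_size)  # at least the raw size of the data
```

Every call that crosses a serialized boundary pays for a payload of at least the raw data size, which is the "saving objects every time you run compute" cost the slide describes.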

Page 6: Strata Beijing 2017: Jumpy, a python interface for nd4j

Distributed Deep Learning bottlenecks

● Network overhead from param servers
● Data movement between cpu and gpu
● Buffer allocation for compute
● Data Loading and input creation (creating tensors from data)
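One common way to reduce the buffer allocation and data loading costs listed above is to allocate a buffer once and read each batch into it. The following is a rough sketch under stated assumptions: `batches_with_reused_buffer` is a hypothetical helper, the input is a stream of native-endian doubles, and every read returns a whole number of doubles. It is not how ND4J or any particular framework loads data.

```python
import io
import struct


def batches_with_reused_buffer(stream, batch_floats):
    """Yield batches of doubles, reading into one preallocated buffer."""
    buffer = bytearray(batch_floats * 8)  # allocated once, reused for every batch
    view = memoryview(buffer)
    while True:
        n = stream.readinto(buffer)  # fills the existing buffer, returns bytes read
        if n == 0:
            break
        # cast() reinterprets the bytes as doubles without copying;
        # tolist() then copies them out so the demo can keep each batch.
        yield view[:n].cast("d").tolist()


raw = struct.pack("6d", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
batches = list(batches_with_reused_buffer(io.BytesIO(raw), batch_floats=4))
print(batches)
```

In a real pipeline the consumer would work on the view directly instead of calling `tolist()`, so that no per-batch copy is made.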

Page 7: Strata Beijing 2017: Jumpy, a python interface for nd4j

Linear Algebra in python

● C based internally
● Python is just an interface
● Tend to interop with numpy pointers directly
● Supports cpu and gpu
● For DL often varied engines (MPI,GRPC,..)
● Often extended in C
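Interop through direct pointers can be sketched with the standard library alone. This example uses `array.array` and `ctypes` instead of numpy so that it runs without extra packages; numpy exposes the equivalent raw address through `ndarray.ctypes.data`. The variable names are illustrative only.

```python
import array
import ctypes

values = array.array("d", [1.0, 2.0, 3.0, 4.0])

# buffer_info() returns the raw memory address and the element count.
address, length = values.buffer_info()

# Reinterpret that address as a C array of doubles. No data is copied:
# both objects now refer to the same memory.
c_view = (ctypes.c_double * length).from_address(address)

# A write through the C view is visible in the original Python array.
c_view[2] = 30.0
print(values[2])
print(list(c_view))
```

A C or C++ library handed `address` and `length` can operate on the data in place, which is why the Python layer can stay a thin interface. The caller has to keep `values` alive and unresized for as long as the address is in use.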

Page 8: Strata Beijing 2017: Jumpy, a python interface for nd4j

Linear Algebra in spark

● Based on Breeze and netlib-java (not maintained anymore, limited to cpu)
● Most routines are Scala based
● On heap memory (bad for latency)
● Cuda support is sparse at best
● Doesn’t conform with industry standards (python)
● Not meant for heavy compute (hardware accel)
● Relies on spark for most ops (you can’t do this with deep learning)

Page 9: Strata Beijing 2017: Jumpy, a python interface for nd4j

Minor conclusions

● 1 of these is not like the other
● Hard to interop with python ecosystem
● Spark tries to be something it’s not re: linear algebra
● Spark should do data loading, not linear algebra, which is better handled by c++ (simd, gpus, ..)
● Alternatives are needed (more specialization) (a focus on c++ with pythonic conventions)

Page 10: Strata Beijing 2017: Jumpy, a python interface for nd4j

Nd4j

● Java based api, c++ core
● Own off heap memory management (even for gpu)
● Soon: Autodiff and graph execution (graph of operations) and sparse
● Similar architecture to numpy (easy interop) (http://nd4j.org/userguide)
● Works with blas/lapack
● Generally faster than numpy even from python (as we’ll see soon)
● It’s not python though!

Page 11: Strata Beijing 2017: Jumpy, a python interface for nd4j

Nd4j Parameter Server Aeron: More stable latency than GRPC and way faster (25x!) than TF

Page 12: Strata Beijing 2017: Jumpy, a python interface for nd4j

Jumpy: A better python interface

● Low latency using c internally
● Interface with nd4j <-> numpy via direct pointers
● Syntax sugar similar to numpy
● Uses jnius underneath (https://github.com/kivy/pyjnius)
● JNIUS starts and manages a JVM for you. Interops via JNI and Cython
● Easy to extend
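The "syntax sugar similar to numpy" idea can be sketched as a thin wrapper that maps Python operators onto a method-style backend. Everything here is hypothetical: `JavaStyleArray` is a pure-Python stand-in for a JVM-side array object and `SugarArray` is an illustrative wrapper. Neither is Jumpy's or ND4J's actual API, and no JVM is started.

```python
class JavaStyleArray:
    """Stand-in for a JVM-side array with method-style operations (hypothetical)."""

    def __init__(self, values):
        self.values = list(values)

    def add(self, other):
        return JavaStyleArray(a + b for a, b in zip(self.values, other.values))

    def mul(self, other):
        return JavaStyleArray(a * b for a, b in zip(self.values, other.values))


class SugarArray:
    """Thin wrapper that turns + and * into calls on the backend object."""

    def __init__(self, backend):
        self.backend = backend

    def __add__(self, other):
        return SugarArray(self.backend.add(other.backend))

    def __mul__(self, other):
        return SugarArray(self.backend.mul(other.backend))

    def tolist(self):
        return list(self.backend.values)


x = SugarArray(JavaStyleArray([1.0, 2.0, 3.0]))
y = SugarArray(JavaStyleArray([10.0, 20.0, 30.0]))

# Reads like numpy, but each operator is one method call on the backend.
result = (x + y) * y
print(result.tolist())
```

The wrapper holds only a reference to the backend object, so the data stays where the backend put it and the Python layer adds just one method dispatch per operation.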

Page 13: Strata Beijing 2017: Jumpy, a python interface for nd4j

Jumpy examples

Page 14: Strata Beijing 2017: Jumpy, a python interface for nd4j

Thanks! Join our QQ group:

Page 15: Strata Beijing 2017: Jumpy, a python interface for nd4j

Conclusions and future work

● No networks! An actual path to improvement
● Reflection can be a bottleneck
● Like most useful things in python, most of it is c!
● Plans to optimize pyjnius itself
● Can enable us to interop with other parts of python

Page 16: Strata Beijing 2017: Jumpy, a python interface for nd4j