Apache Toree: A Jupyter Kernel for Spark, by Marius van Niekerk


Upload: spark-summit

Post on 21-Feb-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: APACHE TOREE: A JUPYTER KERNEL FOR SPARK by Marius van Niekerk

Apache Toree: A Jupyter Kernel for Spark

Marius van Niekerk
Apache Toree Contributor

Page 2

Apache Toree

A Jupyter kernel to connect to Spark and create interactive applications

Page 3

Ecosystem

Page 4

Jupyter
• Open source, interactive data science and scientific computing across over 40 programming languages.
• Toree is an implementation of the Jupyter Kernel Protocol
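
To make the Kernel Protocol bullet more concrete, here is a minimal sketch in Python of the kind of `execute_request` message a Jupyter front end sends to a kernel such as Toree. The field names follow the Jupyter messaging specification; the helper name `make_execute_request`, the `demo` username, and the sample Scala snippet are illustrative assumptions, and the HMAC signing and ZeroMQ transport a real client needs are left out.

```python
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id, username="demo"):
    """Build the dict form of a Jupyter `execute_request` message.

    Illustrative helper, not part of Jupyter or Toree. The HMAC signing
    and ZeroMQ framing that a real client performs are omitted.
    """
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session_id,
            "username": username,
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.0",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
    }


# Hypothetical cell contents: Scala code for the kernel to evaluate.
msg = make_execute_request(
    "sc.parallelize(1 to 10).sum()", session_id=uuid.uuid4().hex
)
print(msg["header"]["msg_type"])
```

Because the kernel only sees messages of this shape, any front end that speaks the protocol (notebook, dashboard, or custom application) can drive it.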

Page 5

Apache Toree History
• Started as ibm/spark-kernel in 2013
• Started Apache incubation in late 2015
• First Apache release is coming soon

Page 6

[Diagram: Notebooks, Dashboards, a NodeJS Application, and an Interactive Application as clients of Toree kernels; components shown between the clients and Toree are Jupyter, Kernel Gateway, and EclairJS.]

Page 7

Compatibility
• Toree 0.1.x supports Spark 1.6.x
• Toree 0.2.x supports Spark 2.x
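
The pairing above can be written down as a small lookup, which is handy when scripting an environment check. This is only a sketch: the table holds exactly the two pairings from the slide, the function name `toree_supports` is made up for illustration, and any Toree series not listed is treated as unknown rather than guessed.

```python
# Pairings as stated on the slide: Toree series -> Spark version prefix.
SUPPORTED = {"0.1": "1.6", "0.2": "2"}


def toree_supports(toree_version, spark_version):
    """Return True if the slide lists this Toree series for this Spark line.

    Hypothetical helper; it does not query Toree or Spark.
    """
    toree_series = ".".join(toree_version.split(".")[:2])
    spark_prefix = SUPPORTED.get(toree_series)
    if spark_prefix is None:
        # Series not mentioned on the slide: report as unsupported.
        return False
    # Match "1.6" itself or any "1.6.z"; likewise "2" or any "2.y.z".
    return spark_version == spark_prefix or spark_version.startswith(
        spark_prefix + "."
    )


print(toree_supports("0.1.0", "1.6.3"))
print(toree_supports("0.1.0", "2.0.0"))
```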

Page 8

Features
• Kernel languages: Scala, Python, R
• Magics
• Tab completion
• Plugin system
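
As a rough illustration of the Magics bullet: Jupyter-style magics are conventionally written with a `%` prefix for a one-line magic (for example `%AddJar`) and a `%%` prefix for a magic that applies to the whole cell (for example `%%SQL`). The sketch below shows how a cell could be classified on that basis. The function `classify_cell` and the example jar URL are hypothetical, and this is not how Toree itself parses magics.

```python
def classify_cell(source):
    """Split a cell into (kind, magic_name, payload).

    kind is "cell_magic", "line_magic", or "code". For a cell magic the
    payload is the rest of the cell; for a line magic it is the arguments
    on the same line. Illustrative only.
    """
    text = source.lstrip()
    first_line, _, rest = text.partition("\n")
    if first_line.startswith("%%"):
        parts = first_line[2:].split(None, 1)
        name = parts[0] if parts else ""
        return ("cell_magic", name, rest)
    if first_line.startswith("%"):
        parts = first_line[1:].split(None, 1)
        name = parts[0] if parts else ""
        args = parts[1] if len(parts) > 1 else ""
        return ("line_magic", name, args)
    # No magic prefix: treat the whole cell as ordinary code.
    return ("code", "", source)


print(classify_cell("%AddJar http://example.com/lib.jar"))
print(classify_cell("%%SQL\nSELECT 1"))
print(classify_cell("val x = 1"))
```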

Page 9

Try it out
• Welcome to Spark with Scala:
  – https://tmpnb.org
• Docker
  – docker run -it --rm -p 8888:8888 jupyter/all-spark-notebook
  – This is a very large container (~4 GB)

Page 10

Page 11

Examples
• github.com/Lull3rSkat3r/apache-toree-demos
• github.com/apache/incubator-toree/tree/master/etc/examples/notebooks

Page 12

Extending on top of Toree
• github.com/Brunel-Visualization/Brunel
• github.com/jupyter/declarativewidgets

Page 13

Other Alternatives
• sparkmagic (Livy)
• jupyter-scala
• Apache Zeppelin
• spylon-kernel
• Databricks

Page 14

Help out
Looking for contributors

• Web - toree.apache.org
• Mailing - [email protected]
• Chat - gitter.im/apache/toree

Page 15

Thank You.
GitHub: @mariusvniekerk
Twitter: @__mvn__