TRANSCRIPT
![Page 1: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/1.jpg)
DISTRIBUTED DEEP LEARNING WITH KERAS AND TENSORFLOW ON APACHE SPARK
GUGLIELMO IOZZIA
MSD
MOSCOW, OCTOBER 10TH 2019
#guglielmoiozzia
![Page 2: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/2.jpg)
ABOUT ME
Currently at
Previously at
Author
I got some awards lately
I love cooking
Champion
#guglielmoiozzia
![Page 3: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/3.jpg)
MSD IRELAND
50+ years
Approx. 2,000 employees
$2.5 billion investment to date
Approx. 50% of MSD’s top 20 products manufactured here
Export to 60+ countries
€6.1 billion turnover in 2017
2017: 300+ jobs & €280m investment
MSD Biotech, Dublin, coming in 2021
![Page 4: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/4.jpg)
CORE TOPICS
• Deep Learning: What is it?
• Keras and TensorFlow: 2 of the most popular frameworks for DL
• Why Distributed Deep Learning on Spark? Why is it so difficult?
• DL in Python on the JVM: Why and How?
![Page 5: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/5.jpg)
DEEP LEARNING
It is a subset of Machine Learning which is based on Multilayer Neural Networks.
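To make "multilayer" concrete, here is a minimal sketch of a forward pass through a two-layer network in plain Python. The layer sizes, weights, and function names are hypothetical and chosen only for illustration; real frameworks such as Keras or DL4J do the same computation with optimized tensor libraries.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W.x + b), one weight row per output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Feed x through each (weights, biases, activation) layer in order."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical hand-picked weights: 2 inputs -> 2 hidden units -> 1 output.
network = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [-0.5], sigmoid),
]

output = forward([1.0, 0.0], network)
```

Each layer consumes the output of the previous one; stacking more `(weights, biases, activation)` entries in `network` makes the model deeper.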
![Page 6: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/6.jpg)
DEEP LEARNING
http://www.asimovinstitute.org/wp-content/uploads/2019/04/NeuralNetworkZoo20042019.png
![Page 7: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/7.jpg)
DL FRAMEWORKS POPULARITY
![Page 8: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/8.jpg)
TENSORFLOW
It is an end-to-end open source platform for ML. It has a comprehensive, flexible ecosystem of tools, libraries and community resources for researchers and developers.
https://www.tensorflow.org/
![Page 9: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/9.jpg)
KERAS
Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.
It allows for easy prototyping and runs seamlessly on CPUs and GPUs.
https://keras.io/
![Page 10: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/10.jpg)
KERAS & TENSORFLOW
Starting from TensorFlow r1.14
![Page 11: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/11.jpg)
APACHE SPARK
Speed
It achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.
Ease of Use
It offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells.
Generality
Combine SQL, streaming, and complex analytics.
Runs Everywhere
It runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources.
![Page 12: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/12.jpg)
CHALLENGES OF TRAINING MNNS IN SPARK
• Different execution models between Spark and the DL frameworks
• GPU configuration and management
• Performance
• Accuracy
![Page 13: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/13.jpg)
WHY DISTRIBUTED DL ON THE JVM?
![Page 14: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/14.jpg)
DEEPLEARNING4J
It is an Open Source, distributed, Deep Learning framework written for JVM languages.
It is integrated with Hadoop and Apache Spark.
It can be used on distributed GPUs and CPUs.
![Page 15: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/15.jpg)
WHY DISTRIBUTED DL ON THE JVM?
TensorFlow
![Page 16: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/16.jpg)
DL4J MODULES
• DataVec
• Arbiter
• NN
• Datasets
• RL4J
• DL4J-Spark
• Model Import
• ND4J
It is an Open Source linear algebra and matrix manipulation library which supports n-dimensional arrays and it is integrated with Apache Hadoop and Spark.
![Page 17: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/17.jpg)
DL4J + APACHE SPARK
• DL4J provides a high-level API to design, configure, train and evaluate MNNs.
• Spark performance is excellent, in particular for ETL/streaming, but in terms of computation, in an MNN training context, some data transformation/aggregation needs to be done using a low-level language.
• DL4J uses ND4J, which is a C++ library that provides a high-level Scala API to developers.
![Page 18: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/18.jpg)
DATA PARALLELISM AND MODEL PARALLELISM
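As a minimal sketch of the data-parallel case only: every worker holds a full copy of the model and computes gradients on its own shard of the batch, and the results are then combined. Model parallelism, where the layers of a single model are split across workers, is not shown. The toy linear model, the dataset, and all names below are assumptions made for illustration; with equally sized shards, the average of the per-worker gradients matches the gradient over the whole batch.

```python
def mse_gradient(w, b, batch):
    """Mean gradient of (w*x + b - y)^2 over a batch of (x, y) pairs."""
    n = len(batch)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return grad_w, grad_b

def shard(data, num_workers):
    """Split data into num_workers contiguous, equally sized shards."""
    size = len(data) // num_workers
    return [data[i * size:(i + 1) * size] for i in range(num_workers)]

# Hypothetical toy dataset generated from y = 3x + 1.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]
w, b = 0.5, 0.0  # current parameters, identical on every worker

# Single-machine reference: gradient over the whole batch.
full_gradient = mse_gradient(w, b, data)

# Data parallelism: each worker computes the gradient on its own shard.
worker_gradients = [mse_gradient(w, b, part) for part in shard(data, 4)]
averaged_gradient = tuple(
    sum(g[i] for g in worker_gradients) / len(worker_gradients)
    for i in range(2)
)
```

In a real cluster the list comprehension over shards would run on separate executors, and the averaging step is where the network communication happens.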
![Page 19: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/19.jpg)
HOW TRAINING HAPPENS IN SPARK WITH DL4J
Parameter Averaging (DL4J 1.0.0-alpha)
Asynchronous SGD (DL4J 1.0.0-beta+)
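The idea behind parameter averaging can be simulated in a few lines of plain Python. This is only a sketch under stated assumptions, not DL4J's implementation: the toy linear model, the learning rate, the number of local steps, and every function name are hypothetical. In each round, every worker starts from the current parameters, runs a few local SGD steps on its own shard, and the master then averages the resulting parameters and redistributes them.

```python
def local_sgd(params, shard, learning_rate, steps):
    """Run a few gradient steps on one worker's shard, starting from params."""
    w, b = params
    n = len(shard)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def average(param_sets):
    """Element-wise mean of the parameter tuples returned by the workers."""
    count = len(param_sets)
    return tuple(
        sum(p[i] for p in param_sets) / count
        for i in range(len(param_sets[0]))
    )

def train(shards, rounds, learning_rate=0.05, local_steps=5):
    params = (0.0, 0.0)  # the master's initial parameters
    for _ in range(rounds):
        # Simulated workers: in Spark, each call would run on a different executor.
        worker_params = [
            local_sgd(params, s, learning_rate, local_steps) for s in shards
        ]
        # The master averages the parameters and redistributes the result.
        params = average(worker_params)
    return params

# Hypothetical noise-free data from y = 2x + 1, interleaved across 4 workers.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(40)]
shards = [data[i::4] for i in range(4)]
w, b = train(shards, rounds=200)
```

Raising `local_steps` reduces how often the workers synchronize, at the cost of letting their local copies drift further apart between averaging rounds.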
![Page 20: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/20.jpg)
HOW TRAINING HAPPENS IN SPARK WITH DL4J
The key classes users should be familiar with to get started with distributed training in DL4J are:
• TrainingMaster: It specifies how distributed training will be conducted in practice. Implementations include Gradient Sharing and Parameter Averaging.
• SparkDl4jMultiLayer and SparkComputationGraph: They are wrappers around the MultiLayerNetwork and ComputationGraph classes in DL4J that enable the functionality related to distributed training.
• RDD<DataSet> and RDD<MultiDataSet>: Spark RDDs with DL4J’s DataSet or MultiDataSet classes that define the source of the training or evaluation data.
![Page 21: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/21.jpg)
MODEL IMPORT IN DL4J
Keras (KerasModelImport):
• Train the Model
• Save it as .h5
• Load Model and Weights
• Load New Data
• Predict
TensorFlow (TFGraphMapper):
• Train the Model
• Save it as .pb
• Load Model and Weights
• Load New Data
• Predict
![Page 22: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/22.jpg)
![Page 23: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/23.jpg)
MODEL IMPORT IN DL4J: EXAMPLE
Keras:
• Train the Model
• Save it as .h5
• Load Model and Weights
• Load New Data
• Predict
Import the VGG16 Model.
Test it.
![Page 24: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/24.jpg)
![Page 25: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/25.jpg)
![Page 26: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/26.jpg)
![Page 27: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/27.jpg)
DL4J MODEL IMPORT IN ACTION
![Page 28: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/28.jpg)
KERAS MODEL IMPORT: SUPPORTED FEATURES
• Layers
• Losses
• Activations
• Initializers
• Regularizers
• Constraints
• Metrics
• Optimizers
![Page 29: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/29.jpg)
DL4J VISUAL FACILITIES
![Page 30: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/30.jpg)
MEMORY UTILIZATION: SOMETHING TO TAKE CARE OF
![Page 31: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/31.jpg)
More on DL with DL4J on Spark in my book
http://tinyurl.com/y9jkvtuy
![Page 32: DISTRIBUTED DEEP LEARNING WITH KERAS AND … MODULES •DataVec •Arbiter •NN •Datasets •RL4J •DL4J-Spark •Model Import •ND4J It is an Open Source linear algebra and matrix](https://reader033.vdocument.in/reader033/viewer/2022042307/5ed37162b458607d8231cc0c/html5/thumbnails/32.jpg)
Thanks!
Any questions?
You can find me at
@GuglielmoIozzia
https://ie.linkedin.com/in/giozzia
googlielmo.blogspot.com