chug dl presentation

Introducing a new exascale era for innovation

DEEP LEARNING ON GPUS

by Alex Volkov


AGENDA OUTLINE

• Introduction: Brief discussion of what is motivating the popularity of GPUs.

• Why GPUs? Moore’s law and the breakdown of Dennard scaling are resulting in the emergence of domain-specific architectures (DSAs).

• GPUs match the processor architecture to the application (graphics, virtual reality, neural networks and deep learning, massive matrix computations).

• What is deep learning and how does it differ from traditional data analytics?

• Traditionally, experts manually extract relevant features. These features drive some process or model for information discovery or decision making. Deep learning models are driven by raw data directly, with no manual feature extraction.

• Example: image processing. In the past one would use image processing techniques such as edge detection, optical flow, etc. With deep learning, feed the image directly to the network and use CNN layers (convolutional neural networks) or RNN/LSTM layers (recurrent neural networks and long short-term memory networks). These layers achieve, more or less automatically, what used to be done manually, but better.

• Discussion of different types of deep learning networks. Attached image reference.

• Discussion of how analytics/big data can be enhanced by DL, with technology examples.

• Yahoo projects: CaffeOnSpark, TensorFlowOnSpark.

• DL4J on Spark.

• Tentatively, a quick demo, or an overview of a demo done ahead of time, of TensorFlow on Spark accelerated with GPUs.

• Conclusion.


DEEP LEARNING ON GPUS

Abstract:

Modern-day computational challenges are going beyond the capabilities of traditional multiprocessors. Graphics Processing Units (GPUs) are filling the performance gap by taking advantage of their massively parallel architecture. GPUs enable practical applications of Artificial Intelligence and Deep Learning (DL), Machine Learning, and state-of-the-art analytics methodologies. The presentation will give a general overview of DL and the areas where GPUs can help accelerate analytics workflows using DL. DL applications will illustrate the challenges that data scientists face and how DL is meeting these challenges. A GPU-enabled Spark ecosystem using the TensorFlow DL framework will be presented to demonstrate the advantages that GPUs bring to datacenters and data scientists.

Data Analytics and Deep Learning on GPUs


INNOVATION: A HISTORICAL VIEW


WHY GPUS

GPUs are cost-effective computing engines for the demands of the Exascale Era

End of Dennard scaling places a cap on single-threaded performance

Increasing application performance will require fine-grain parallel code with significant computational intensity

AI and data science emerging as important new components of scientific discovery

Dramatic improvements in accuracy, completeness and response time yield increased insight from huge volumes of data

Cloud-based usage models, in-situ execution and visualization emerging as new workflows critical to the science process and productivity

Tight coupling of interactive simulation, visualization, data analysis/AI


RISE OF GPU COMPUTING ARCHITECTURE

Moore’s Law is NOT dead, as transistor count keeps growing

[Figure: trends from 1980 to 2020 on a logarithmic axis (10^2 to 10^7). Annotations: single-threaded perf, 1.5X per year, then 1.1X per year; GPU-computing perf, 1.5X per year, 1000X by 2025. Stack labels: Applications, Systems, Algorithms, CUDA, Architecture.]

Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
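The growth rates quoted on this slide can be sanity-checked with a little compound-growth arithmetic. This is only a sketch: it assumes the 1000X figure is the result of compounding 1.5X per year, which the slide does not state explicitly, and the function names are illustrative.

```python
import math


def years_to_reach(factor, rate_per_year):
    """Years of compound growth at rate_per_year needed to reach factor."""
    return math.log(factor) / math.log(rate_per_year)


def growth_over(years, rate_per_year):
    """Total growth factor after compounding rate_per_year for years."""
    return rate_per_year ** years


# Assumption: 1000X is reached by compounding 1.5X per year.
gpu_years = years_to_reach(1000, 1.5)

# Over the same span, what does 1.1X per year compound to?
single_thread_gain = growth_over(gpu_years, 1.1)

print(f"years at 1.5X/year to reach 1000X: {gpu_years:.1f}")
print(f"gain at 1.1X/year over that span: {single_thread_gain:.1f}X")
```

Under that assumption, 1000X takes roughly 17 years of 1.5X annual growth, while 1.1X per year over the same span compounds to only about 5X.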


SO WHAT IS DEEP LEARNING (DL)?

And what is a Neural Network? Fundamentally it is just a geometric transformation.

“A geometric transformation is a function whose domain and range are sets of points.”

A DL model is a sophisticated neural network, built by stacking many layers
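To make the "geometric transformation" view concrete, here is a minimal sketch of one fully connected layer acting on 2-D points: a linear map, a shift, and a pointwise nonlinearity. The weights, bias, and function name are hypothetical values chosen for illustration, not taken from any trained model.

```python
import math


def dense_layer(points, weights, bias):
    """Map each input point x to tanh(W x + b)."""
    mapped = []
    for x in points:
        y = []
        for row, b in zip(weights, bias):
            z = sum(w * xi for w, xi in zip(row, x)) + b
            y.append(math.tanh(z))
        mapped.append(y)
    return mapped


# Hypothetical parameters: rotate by 90 degrees, then shift along the first axis.
W = [[0.0, -1.0],
     [1.0, 0.0]]
b = [0.5, 0.0]

points = [[1.0, 0.0], [0.0, 1.0]]

one_layer = dense_layer(points, W, b)      # points moved to new positions
two_layers = dense_layer(one_layer, W, b)  # stacking layers composes the maps

print(one_layer)
print(two_layers)
```

Each layer sends a set of points to a new set of points; a deep network is a composition of many such maps, with the parameters fitted to data instead of chosen by hand.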


MANY TYPES OF NEURAL NETWORKS

DL does not require manual feature extraction

Automation is the name of the game

https://www.asimovinstitute.org/author/fjodorvanveen/

• Popular Neural Nets in DL

• CNN – convolutional neural network

• RNN – recurrent neural network

• LSTM – long short-term memory

• GAN – Generative Adversarial Network

• GRU – Gated Recurrent Unit

“Success of deep learning so far has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data. Doing this well is a game-changer for essentially every industry.”

https://blog.keras.io/the-limitations-of-deep-learning.html
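As a concrete contrast with manual feature extraction, the sketch below applies a hand-crafted Sobel kernel that responds to vertical edges. A CNN layer runs the same sliding-window operation, but its kernel values are learned from data rather than written by hand. The tiny test image and the function name are illustrative assumptions.

```python
def correlate2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hand-crafted feature extractor: Sobel kernel for horizontal intensity change.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Illustrative 4x4 image: dark left half, bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

flat = [[1] * 4 for _ in range(4)]  # uniform image, no edges

edges = correlate2d(image, SOBEL_X)
no_edges = correlate2d(flat, SOBEL_X)

print(edges)
print(no_edges)
```

The kernel gives a strong response where the intensity changes and zero on the uniform image. In a CNN the nine kernel entries would be trainable parameters, so the network discovers useful filters on its own.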


CONVERGENCE OF DATA ANALYTICS AND DL

GPU acceleration at the heart of the envisioned converged analytics (Lambda) architecture

[Diagram labels: Data Sources, Ingest, Storage, Stream Processing, Batch Processing, Serving Layer, Notebook Visualization, Syslog, Netflow, Graph Visualization, Interactivity, Query, Speed, cuSTINGER]


GOAI: GPU OPEN ANALYTICS INITIATIVE

GDF (GPU Data Frame): data remains resident on the GPU, which avoids I/O bottlenecks

https://github.com/gpuopenanalytics/pygdf

In the past, GPU data had to be copied unnecessarily between host and device memory, resulting in I/O bottlenecks
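A rough way to see why keeping data resident on the GPU matters is to count the bytes that cross the host/device boundary in a multi-stage pipeline. The sketch below is a toy accounting model, not the PyGDF API: the function name, the stage count, and the frame size are hypothetical, and it assumes every stage produces output of the same size as its input.

```python
def bytes_copied(num_stages, frame_bytes, resident):
    """Count host<->device bytes moved for a pipeline of GPU stages.

    resident=False: every stage copies its input to the device and its
    output back to the host.
    resident=True: one copy in before the first stage, one copy out after
    the last stage; intermediate results stay on the device.
    """
    if num_stages <= 0:
        return 0
    round_trips = 1 if resident else num_stages
    return 2 * frame_bytes * round_trips


frame = 1_000_000_000  # hypothetical 1 GB data frame
stages = 5             # hypothetical number of GPU processing stages

copy_every_stage = bytes_copied(stages, frame, resident=False)
stay_on_gpu = bytes_copied(stages, frame, resident=True)

print(copy_every_stage, stay_on_gpu, copy_every_stage // stay_on_gpu)
```

In this model the copy traffic grows with the number of stages when data bounces through host memory, and stays constant when the data frame remains on the device.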


GPU ANALYTICS SOFTWARE STACK

Achieve unprecedented speedup in your day-to-day workflows

GBM training benchmark comparing a dual-Xeon CPU-only system to a system with multiple Tesla P100 GPUs.

https://devblogs.nvidia.com/parallelforall/goai-open-gpu-accelerated-data-analytics/


BATCH PROCESSING WITH SPARK AND DL

Spark is not an efficient computation layer for DL calculations, but it can be used for fast ETL. Typically, Spark orchestrates the jobs and hands the DL computation off to GPUs. Popular frameworks for Spark and DL:

DL4J (Java and Scala based): https://deeplearning4j.org/spark, https://github.com/deeplearning4j/scalnet

Yahoo: https://github.com/yahoo/CaffeOnSpark, https://mapr.com/blog/distributed-deep-learning-caffe-using-mapr-cluster/, https://github.com/yahoo/TensorFlowOnSpark

Databricks: https://github.com/databricks/tensorframes

CERNDB: https://github.com/cerndb/dist-keras

https://github.com/adatao/tensorspark

List of frameworks to run DL on Spark clusters
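The division of labor described on this slide (ETL on the cluster, DL computation on GPUs) can be sketched in plain Python. This is not Spark or any listed framework's API: the function names, the comma-separated record format, and the round-robin placement are all assumptions made for illustration; the frameworks above handle the real scheduling.

```python
def etl(raw_lines):
    """ETL stage (the part the slide assigns to Spark): parse and drop bad rows."""
    rows = []
    for line in raw_lines:
        try:
            rows.append([float(field) for field in line.strip().split(",")])
        except ValueError:
            continue  # malformed record, skip it
    return rows


def assign_to_gpus(partitions, num_gpus):
    """Hypothetical orchestration step: round-robin partition index -> GPU id."""
    return {index: index % num_gpus for index in range(len(partitions))}


# Hypothetical raw input, already split into three partitions.
raw_partitions = [
    ["1,2", "bad,3"],
    ["4,5"],
    ["6,7"],
]

clean_partitions = [etl(part) for part in raw_partitions]
placement = assign_to_gpus(clean_partitions, num_gpus=2)

print(clean_partitions)
print(placement)
```

Each cleaned partition would then be fed to a training process pinned to the assigned GPU, which is the step the Spark and DL frameworks automate.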


TENSORFLOW ON SPARK

spark-tensorflow-connector: a library for loading and storing TensorFlow records with Apache Spark.

https://github.com/tensorflow/ecosystem/tree/master/spark/spark-tensorflow-connector

Demo of Yahoo’s TensorFlowOnSpark implementation.

https://github.com/yahoo/TensorFlowOnSpark/tree/master/examples

https://github.com/yahoo/TensorFlowOnSpark/wiki

https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_YARN

https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_Standalone

https://github.com/yahoo/TensorFlowOnSpark/wiki/Conversion-Guide

DEMO TF on SPARK
