using containers for gpu...

20
Jonathan Calmels, Felix Abecassis, 05/08/2017 USING CONTAINERS FOR GPU APPLICATIONS

Upload: phamnhu

Post on 09-Sep-2018

224 views

Category:

Documents


0 download

TRANSCRIPT

Jonathan Calmels, Felix Abecassis, 05/08/2017

USING CONTAINERS FOR GPU APPLICATIONS

2

OUR TEAM

Enable GPUs in the container ecosystem:

• Monitoring

• Orchestration

• Images

• Runtime

• OS

Core Container Technologies

3

CHALLENGESA Typical Cluster

Ubuntu 14.04Drivers 3674x Maxwell

CentOS 7Drivers 3614x Kepler

Ubuntu 16.04Drivers 3758x Pascal

CUDA 7.5

CUDA 7.0cuDNN 4

CUDA 7.5cuDNN 6

CUDA 8.0

Patches

4

CONTAINERS

Portable and reproducible builds

Ease of deployment

Isolation of resources

Run across heterogeneous CUDA toolkit environments (sharing the host driver)

Bare Metal Performance

Facilitate collaboration

To the rescue

5

VIRTUAL MACHINES VS CONTAINERSNot so similar

6

7

8

NVIDIA-DOCKERgithub.com/NVIDIA/nvidia-docker

9

DOCKERHUB IMAGES

CUDA 8.0 runtime

CUDA 8.0 runtime

cuDNN v5 runtime

CUDA 8.0 devel

cuDNN v5 runtime

CUDA 8.0 devel

cuDNN v5 devel

NVIDIA/Caffe0.15.13

TensorFlowDIGITS

5.0

Ubuntu14.04

Ubuntu16.04

CNTK

cuDNN v5devel

cuDNN v6devel

PyTorch

Multiple flavors

10

CONTAINERS ON NVIDIA DGX-1NVIDIA’s Deep Learning System

11

NVIDIA-DOCKER 1.0Internals

nvidia-docker

dockerdockerd

nvidia-docker-plugin

http+unix

http+unix

http

GPU information

cuda+nvmlnvidia driver

container process

$ NV_GPU=0 nvidia-docker run -ti nvidia/cuda

12

LIMITS OF NVIDIA-DOCKER 1.0

Only for Docker CLI

Docker plugins are difficult to manage

Not extensible (OpenGL, Vulkan, InfiniBand, KVM, etc.)

Challenging to support new architectures (Power, ARM)

Difficult to integrate into the container ecosystem

Limited scope

13

DOCKER COMPONENTS

14

WHAT’S A CONTAINER

Init system

Namespaces

Cgroups

BPF

LSM

Netfilter

Netlink

Capabilities

Seccomp

UnionFS

KVM

...

Kernel building blocks

15

LIBNVIDIA-CONTAINERgithub.com/NVIDIA/libnvidia-container

Integrates with the container internals

Agnostic of the container runtime

Drop-in GPU support for runtime developers

Better stability, follows driver releases

Brings features seamlessly (Graphics, Display, Exclusive mode, VM, etc.)

16

NVIDIA-DOCKER 2.0 Internals

dockerdlibnvidia-container

http(s)(+unix)

cuda+nvml

nvidia driverdocker-containerd

+ shim

nvidia-runc nvidia-oci-runtime

grpc+unix

container process

docker

$ docker run -ti -e NVIDIA_VISIBLE_DEVICES=0 --runtime=nvidia nvidia/cuda

17

DEMO!

18

CONTAINER FUTURE

nvidia-docker 2.0 release

Multi-arch support (Power, ARM)

Support other container runtimes (LXC/LXD, Rkt)

Additional Docker images

Additional features (OpenGL, Vulkan, InfiniBand, KVM, etc.)

Support for GPU monitoring (cAdvisor)

Enable GPUs everywhere

19

QUESTIONS?