QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

2014-11-05, Christian Kniep. insideHPC Edition: slides slightly modified in comparison to the HPC Advisory Council presentation.

Upload: insidehpc

Post on 02-Jul-2015


DESCRIPTION

In this deck, Christian Kniep presents: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads. Watch the video presentation: http://wp.me/p3RLHQ-dvM

TRANSCRIPT

Page 1: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

QNIBTerminal plus InfiniBand: Containerized MPI Workloads

2014-11-05, Christian Kniep

insideHPC Edition: slides slightly modified in comparison to the HPC Advisory Council presentation

Page 2: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

Agenda

• Docker in a Nutshell
• QNIBTerminal
• Testbed
• MPI Benchmark
• HPCG-Results
• Future Work
• Conclusion

Page 6: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

Docker in a Nutshell

• (chroot on steroids)²
• Builds on top of LinuX Containers (LXC)
• Kernel namespaces (isolation)
• cgroups (resource mgmt)
• Intuitive build system
• Public repositories
• Red Hat backing
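The two kernel features named on this slide, namespaces and cgroups, can be inspected from user space. The following is a minimal Python sketch, assuming a Linux host that exposes `/proc/<pid>/ns` and `/proc/<pid>/cgroup`; the function names and the `proc_root` parameter are illustrative and are not part of Docker or of the original slides.

```python
import os


def list_namespaces(pid="self", proc_root="/proc"):
    """Return {namespace_type: identifier} for a process.

    Reads the symlinks under <proc_root>/<pid>/ns. Returns an empty dict
    when that directory does not exist (for example on non-Linux systems).
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    if not os.path.isdir(ns_dir):
        return {}
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        path = os.path.join(ns_dir, name)
        if os.path.islink(path):
            # The link target looks like "net:[4026531992]".
            namespaces[name] = os.readlink(path)
    return namespaces


def parse_cgroup_file(text):
    """Parse /proc/<pid>/cgroup content into (hierarchy_id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append((hierarchy_id, controllers.split(",") if controllers else [], path))
    return entries


if __name__ == "__main__":
    for ns_type, ident in list_namespaces().items():
        print(ns_type, ident)
```

Running the script once on the host and once inside a container should print different identifiers for every namespace type the container isolates.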

Page 7: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

Traditional vs. Lightweight Layers

[Figure: two layer stacks side by side. Traditional Virtualisation: SERVER, HOST KERNEL, HYPERVISOR, then per guest a KERNEL, Userland (OS) and SERVICE. Containerisation: SERVER, HOST KERNEL, then per container a Userland (OS) and SERVICE. Both stacks carry an IB label.]

Page 8: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

QNIBTerminal Motivation

Plain Metrics

Page 9: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

QNIBTerminal Motivation

Plain Log Events

Page 10: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

QNIBTerminal Motivation

Overlap Metrics/Log Events

Page 11: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

QNIBTerminal Overview

[Diagram: containers, given as container name / service where both appear: haproxy / haproxy, dns / helixdns, elk (elasticsearch, logstash, kibana), etcd, carbon / carbon, graphite-web / graphite-web, graphite-api / graphite-api, grafana / grafana, slurmctld / slurmctld, compute0 / slurmd, compute<N> / slurmd. The containers are grouped into Log/Events, Services, Performance and Compute.]

One Node Setup

• All network traffic over bridge
• Crippled MPI workload

Page 12: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

Testbed

• 8 nodes (CentOS 7, 2x 4-core Xeon, 32 GB, Mellanox ConnectX-2)
• Multiple Open MPI versions installed
• gcc versions
• 3 containers on top (CentOS 6, CentOS 7, Ubuntu 12)
• SLURM Resource Scheduler
• 1 native partition
• 3 container partitions
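One way to use such a partition layout is to submit the same job once per partition. The sketch below only builds the `sbatch` command lines and does not execute them. The partition names (`native`, `cos7`, `cos6`, `u12`, borrowed from the chart legends later in the deck), the script name `run_hpcg.sh` and the node and task counts are assumptions, not the testbed's documented configuration.

```python
import shlex

# Assumed partition names: one native partition plus three container partitions.
PARTITIONS = ["native", "cos7", "cos6", "u12"]


def sbatch_command(partition, script, nodes, tasks_per_node):
    """Build the argv list for submitting `script` to one SLURM partition."""
    return [
        "sbatch",
        "--partition={}".format(partition),
        "--nodes={}".format(nodes),
        "--ntasks-per-node={}".format(tasks_per_node),
        "--job-name=bench-{}".format(partition),
        script,
    ]


def build_submissions(script="run_hpcg.sh", nodes=8, tasks_per_node=8):
    """Return one sbatch argv per partition (defaults: 8 nodes, 2x 4 cores each)."""
    return [sbatch_command(p, script, nodes, tasks_per_node) for p in PARTITIONS]


if __name__ == "__main__":
    for argv in build_submissions():
        # Print instead of executing, so the sketch runs without SLURM installed.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the job script identical and varying only the partition makes the native and containerized runs directly comparable.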

Page 13: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

MPI Benchmark

• osu-micro-benchmarks-4.4.1
• osu_alltoall with two tasks on two hosts

MPI benchmark was not in the original HPC Advisory Council presentation

$ mpirun -np 2 -H venus001,venus002 $(pwd)/osu_alltoall
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
# Size       Avg Latency(us)
1                1.83
2                1.82
4                1.74
8                1.63
16               1.62
32               1.68
64               1.80
128              2.77
256              3.11
512              3.51
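The later Open MPI comparison is labelled avg(1B->64B), which suggests a mean over the small-message latencies. The sketch below parses the `osu_alltoall` output shown on this slide and computes that mean. The function names are illustrative, and reading the label as an arithmetic mean over message sizes of 1 to 64 bytes is an assumption.

```python
# Output of the osu_alltoall run from the slide (size in bytes, latency in us).
OSU_OUTPUT = """\
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
# Size       Avg Latency(us)
1                1.83
2                1.82
4                1.74
8                1.63
16               1.62
32               1.68
64               1.80
128              2.77
256              3.11
512              3.51
"""


def parse_osu_latency(text):
    """Return {message_size_bytes: avg_latency_us}, skipping '#' comment lines."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        size, latency = line.split()[:2]
        results[int(size)] = float(latency)
    return results


def mean_latency(results, min_size=1, max_size=64):
    """Average the latency over all message sizes in [min_size, max_size]."""
    selected = [lat for size, lat in results.items() if min_size <= size <= max_size]
    if not selected:
        raise ValueError("no message sizes in the requested range")
    return sum(selected) / len(selected)


if __name__ == "__main__":
    latencies = parse_osu_latency(OSU_OUTPUT)
    print("avg(1B->64B) = {:.2f} us".format(mean_latency(latencies)))
```

For the run above, the seven values from 1 to 64 bytes sum to 12.12 us, giving a mean of about 1.73 us.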

Page 14: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

MPI Benchmark: distribution's results [2 tasks @ 2 nodes]

[Figure: latency [us] (axis 0 to 5) over Message Size (KB) (4 to 1024) for native, cos7, cos6 and u12]

MPI benchmark was not in the original HPC Advisory Council presentation

Page 15: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

MPI Benchmark: Open MPI comparison [2 tasks @ 2 nodes, avg(1B->64B)]

[Figure: latency [us] (axis 0 to 2.8) for native, cos7, cos6 and u12; series: distribution, 1.5.4, 1.6.4, 1.8.3]

Distribution defaults:

• native: oMPI 1.6.4
• cos7: oMPI 1.6.4
• cos6: oMPI 1.5.4
• u12: oMPI 1.5.4

Page 16: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

HPCG Benchmark

• Mimics thermodynamic application workload
• Linpack corrective / successor in the long term?

Page 18: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

HPCG Benchmark: distribution's results

[Figure: GFLOP/s (axis 3 to 6) for native, cos7, cos6 and u12]

• CentOS 7.0, oMPI 1.6.4, gcc 4.8.2
• CentOS 6.5, oMPI 1.5.4, gcc 4.4.7
• Ubuntu 12.04, oMPI 1.5.4, gcc 4.6.3

Page 21: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

HPCG Benchmark: Open MPI comparison

[Figure: GFLOP/s (axis 3 to 6) for native, cos7, cos6 and u12; series: distribution, 1.5.4, 1.6.4, 1.8.4]

Distribution defaults:

• native: oMPI 1.6.4, gcc 4.8.2
• cos7: oMPI 1.6.4, gcc 4.8.2
• cos6: oMPI 1.5.4, gcc 4.4.7
• u12: oMPI 1.5.4, gcc 4.6.3

Page 22: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

Future Work

• Benchmark real-world applications
• Security evaluations
• Compare different frameworks to orchestrate
• Use of SR-IOV (Keynote earlier today)
• Compare with tuned bare-metal
• Tune Docker installation

Page 23: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

Conclusion

• Bunch of tooling within the Docker ecosystem
• Abstraction between bare-metal and application works fine
• Low performance overhead
• Out-of-the-box: container beats bare-metal
• Continuous testing/deployment of containerized workloads
• Bare-metal kernel provides access to IB
• Container in charge from MPI upwards
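The split named in the conclusion, with the bare-metal kernel providing access to IB and the container in charge from MPI upwards, is commonly realised by passing the host's InfiniBand device nodes into the container. The sketch below assembles such a `docker run` command line. It is an assumption about a typical setup rather than the configuration used in the talk; the image name `example/openmpi` and the device list are hypothetical.

```python
import shlex

# Hypothetical device nodes; the actual set depends on the HCA and loaded drivers.
DEFAULT_IB_DEVICES = [
    "/dev/infiniband/uverbs0",
    "/dev/infiniband/rdma_cm",
]


def docker_run_with_ib(image, command, devices=None):
    """Build a `docker run` argv that exposes host InfiniBand devices to a container."""
    devices = DEFAULT_IB_DEVICES if devices is None else devices
    argv = ["docker", "run", "--rm"]
    for dev in devices:
        # --device maps a host device node into the container.
        argv.append("--device={}".format(dev))
    argv.append(image)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # ibv_devinfo (from the libibverbs utilities) lists the visible HCAs.
    argv = docker_run_with_ib("example/openmpi", ["ibv_devinfo"])
    print(" ".join(shlex.quote(a) for a in argv))
```

The command is printed rather than executed, so the sketch runs on machines without Docker or InfiniBand hardware.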

Page 27: QNIBTerminal Plus InfiniBand - Containerized MPI Workloads

La Fin

• Interested?
  • Docker Pitch today
  • Internal Evaluations
  • Workshops / Talks
• Questions?
• Paper: http://doc.qnib.org/
• Contact
  • @CQnib / @_qnib
  • [email protected]
  • http://qnib.org

Image: https://www.flickr.com/photos/dharmabum1964/3108162671