shifter and singularity on blue waters...22 performance of mpi applications in shifter doi:...

31
Shifter and Singularity on Blue Waters Maxim Belkin June 7, 2018

Upload: others

Post on 28-Feb-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Shifter and Singularity on Blue Waters

Maxim BelkinJune 7, 2018

Page 2: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

A simplistic view of a scientific application

My Application

DATA RESULTS

Page 3: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Your Computer

My Application

DATA RESULTS i

Blue Waters

My Application

DATA RESULTS

Received an allocation on Blue Waters!

Page 4: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

A more realistic view of a scientific application

My Application

LIBRARY X

DATA RESULTS

EXECUTABLE Y HARDWARE Z

Your Computer

Page 5: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

My Application

DATA

LIBRARY A EXECUTABLE B HARDWARE C

RESULTS

Blue Waters

A more realistic view of a scientific application

Page 6: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

My Application

LIBRARY X

DATA RESULTS

EXECUTABLE Y HARDWARE C

SoftwareContainers

Page 7: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

https://github.com/NERSC/shifter https://github.com/singularityware/singularityhttps://github.com/docker

DockerShifter Singularity

Not suitablefor HPC

Designedfor HPC

Can be used inHPC environment

Page 8: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Docker

FROM centos:7RUN yum update && ... wget …COPY file.ext inside/container/RUN ./configure && make && …WORKDIR /home/user

1) Build: docker build -t repo/name:tag .2) Upload: docker push repo/name:tag3) Download: docker pull repo/name:tag4) Run: docker run repo/name:tag4a) Execute: docker exec <container> command

Dockerfile

Page 9: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

https://github.com/NERSC/shifter https://github.com/singularityware/singularity

Page 10: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Dockerssh qsub aprun

Login nodes

User

MOM nodes

Compute nodes

-l gres=shifter16

Page 11: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Dockerqsub

Login nodesMOM nodes

-l gres=shifter16

-v UDI=repo/name:tag

-l gres=shifter16-v UDI=repo/name:tagexport CRAY_ROOTFS=SHIFTERaprun -b ... --

app-in-container --options

Compute nodes

application

applicationapplication

application

application

applicationapplicationapplication

UDI: User-Defined Image

Page 12: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Dockerqsub

Login nodesMOM nodes

-l gres=shifter16

-l gres=shifter16 module load shifteraprun -b ... --shifter --image=repo/name:tag --app-in-container --options

Compute nodes

application

applicationapplication

application

application

applicationapplicationapplication

Page 13: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Shifter moduleshiftershifterimg

images

lookup

pull

login

returns info about available UDIs

returns an ID of an image (if available)

launches an app from UDI on compute nodes

downloads Docker image and converts it to UDI

authenticates to DockerHub (for private images)

Page 14: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Examplesshifterimg imagesshifterimg lookup ubuntu:zesty

shifterimg pull centos:latest

shifterimg login

shifterimg pull --user bw_user prvimg1:latest

shifterimg pull --group bw_grp prvimg2:latest

image on DockerHub

image on Blue Waters

Page 15: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

My Application

LIBRARY X

DATA RESULTS

EXECUTABLE Y HARDWARE Z

Page 16: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

MPI in Shifter: requirements

1. glibc 2.17+ (CentOS 7, Ubuntu 14)2. Use one of the following MPIs (see MPICH-ABI compatibility)

3. Do not use package manager to install MPI*

In a Container:

• MPICH v3.1+• Intel® MPI Library v5.0+• Cray MPT v7.0.0+

• MVAPICH2 2.0+• Parastation MPI 5.1.7-1+• IBM MPI v2.1+

Page 17: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

MPI in Shifter: requirementsOn Blue Waters:

1. Use Intel or GNU Programming Environment2. Load MPICH ABI-compatible MPI (on BW: cray-mpich-abi)3. Set LD_LIBRARY_PATH to the location of MPICH-ABI libraries

Page 18: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

MPI in Shifter: in DockerfileFROM centos:7RUN yum -y install file gcc make gcc-gfortran gcc-c++ wget curlRUN cd /usr/local/src/ && \

wget http://www.mpich.org/static/downloads/3.2/mpich-3.2.tar.gz && \tar xf mpich-3.2.tar.gz && \rm mpich-3.2.tar.gz && \cd mpich-3.2 && \./configure && \make && make install && \cd /usr/local/src && \rm -rf mpich-3.2*

Page 19: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

module unload PrgEnv-craymodule unload ccemodule load PrgEnv-gnumodule unload cray-mpichmodule load cray-mpich-abi

LIBS=$CRAY_LD_LIBRARY_PATH:/opt/cray/wlm_detect/default/lib64CACHE=$PWD/cache.$PBS_JOBIDmkdir -p $CACHEfor dir in $( echo $LIBS | tr ":" " " ); do cp -L -r $dir $CACHE; done

export LD_LIBRARY_PATH=$CACHE/lib:$CACHE/lib64

19

123456789

101112

MPI in Shifter: job batch script snippet

Page 20: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

qsub -l gres=shifter16 -l nodes=2:ppn=32:xemodule load shifteraprun -b -n 64 -N 32 -- shifter --image=repo/image:tag -- appaprun -b -n 64 -N 32 -- shifter --image=repo/image2:tag -- app2

20

MPI in Shifter

qsub -l gres=shifter16 -v UDI=repo/image:tag -l nodes=2:ppn=32:xeexport CRAY_ROOTFS=SHIFTERaprun -b -n 64 -N 32 -- app

1

2

Page 21: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

21

MPI in Shifter

2 4 8 16 32 64 128 256 512 1024 2048 4096

Nodes

2

4

8

16

32

64

128

256

Job

star

t-up

time,

s 1.7 GBCLI

36 MB

prologue

ppn = 1

1 2 4 8 16

processes per node (ppn)

4

8

16

32

64

Job

star

t-up

time,

s 1.7 GBCLI

36 MB

prologue

80 nodes

a

b

T(N, ppn=1)

T(N=80, ppn)1

2

DOI: 10.1145/3219104.3219145

Hon Wai Leong

Page 22: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

22

Performance of MPI applications in Shifter

DOI: 10.1145/3219104.3219145

Tim

e, µs

Message size, bytes

107

106

105

104

103

102 10 106 105 104 103 102 1

a CLEMPI_AlltoallMPI_Alltoallv

Shifter

0

1

2

3

4

0 200 400 600 800 1000 1200

b

Tim

e, m

s

Message size, kilobytes

CLEMPI_BcastMPI_Reduce

Shifter

7.5

8.0

8.5

IOR

Rea

d, G

B/s

Benchmark number0 10 20 30

2.5

3.0

3.5

IOR

Writ

e, G

B/s

Cray Linux EnvironmentShifter

Cray Linux EnvironmentShifter

Galen Arnold

Page 23: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

23

Using GPUs in Shifter

1. Use Shifter CLI:shifter --image ...

2. Set CUDA_VISIBLE_DEVICES:export CUDA_VISIBLE_DEVICES=0

aprun -b ... -- shifter --image=centos:latest -- nvidia-smi

Page 24: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

24

Using MPI + GPUs in Shifter

Caveat: shifter command removes LD_LIBRARY_PATH

Solution: use a script to pass LD_LIBRARY_PATH

aprun -b -n ... -N 1 -e VAR="$LD_LIBRARY_PATH” ... \shifter --image=repo/name:tag ... -- script.sh

script.sh:export LD_LIBRARY_PATH=“$VAR”application

Page 25: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Directory mapping between BW and Shifter UDI

shifter ... --volume=/path/on/BW:/path/in/UDI

Important: /path/in/UDI must exist in UDI

qsub -v UDI="centos:latest -v /u/sciteam/$USER:/home"

/projects, /scratch, and “Home” folders are mapped between Blue Waters and UDI !

Page 26: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

https://github.com/NERSC/shifter https://github.com/singularityware/singularity

Page 27: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Dockerssh qsub aprun

Login nodes

User

MOM nodes

Compute nodes

-l gres=singularity

Page 28: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Docker

qsub

Login nodes

MOM nodes

-l gres=singularity

-l gres=singularity

module load singularity

singularity build IMAGE.simg docker://centos:latest

singularity exec -H /u:/home IMAGE.simg -- app

Compute nodes

application

application

application

application

application

application

application

application

Page 29: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

Directory mapping between BW and Singularity

singularity exec -B /path/on/BW:/path/in/singularity

And... /path/in/singularity does not have to exist.

/projects on BW is /mnt/abc/projects in SingularityNote: /projects on BW is /mnt/abc/projects in Singularity/scratch on BW is /mnt/abc/scratch in Singularity

Page 30: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

GPUs in Singularity

singularity exec --nv ...

If you’re using BW directories, make sure they are mapped into Singularity imageExample:aprun -n 16 -N 8 -b -- singularity exec -B

/dsl/opt/cray/nvidia/default/bin:/home/staff/mbelkin/home/staff/mbelkin/nvidia-smi

Page 31: Shifter and Singularity on Blue Waters...22 Performance of MPI applications in Shifter DOI: 10.1145/3219104.3219145 Time, µ s Message size, bytes 107 106 105 104 103 102 1 10 102

38

Thank you!