Using Docker Containers for Scientific Environments — On-Premises and in the Cloud
Sergey Yakubov, Martin Gasthuber, Birgit Lewendel
KEK, Tsukuba, 18.10.2017
Contents
Introduction
Scientific environments on-premises
• IT-Managed containers
• Custom user containers
Scientific environments in hybrid clouds
• HNSciCloud project
• Using cloud to extend local resources
Conclusions and outlook
| Sergey Yakubov | 18.10.2017 | Hepix Fall 2017 | KEK, Tsukuba
Introduction
Compute resources at DESY
• Batch farm (HTCondor) – see talk by T. Finnern
• HPC cluster Maxwell (SLURM)
• Large storage, fast network and CPUs
• 12,000 cores, InfiniBand, 76 TB memory, 3.3 PB storage
• Used mostly for offline data analyses/numerical simulations
• But also for online analyses (more in the future)
• Docker containers
Introduction
Containerized scientific environments
• Using Docker container technology we can create environments that allow us to:
  • separate IT and user requirements/dependencies
  • separate responsibilities: IT focuses on scaling and container template construction, physicists on application development
  • provide compute resources dynamically and quickly, whether on top of existing local resources or in the cloud
  • control provisioned resources: storage, CPUs, memory, networks, …
• Can we do this with OpenStack & Co? Probably yes, but …
Scientific environments on-premises
IT-Managed Containers
• A Dockerfile is created by IT/group admins (e.g. a Debian image with software for a specific experiment) and stored as a Puppet resource
• Puppet automatically rebuilds the image whenever the Dockerfile changes and pushes it to DESY’s Docker registry
• Compute resources are reserved via SLURM
• At the specified time, a SLURM job starts a Docker container running an sshd daemon on each of the allocated compute nodes
• Users with the corresponding rights can log in and do their work
[Diagram labels: admin, user, ssh]
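The rebuild-on-change step can be illustrated with a small shell sketch. This is not the Puppet resource used at DESY, which the slides do not show; it only mimics the logic "rebuild and push when the Dockerfile changes". The registry host `registry.example.org`, the image name `experiment-env`, the stamp file and the `apply` argument are all hypothetical.

```shell
#!/bin/sh
# Illustrative stand-in for the Puppet-driven rebuild; all names are hypothetical.
IMAGE="${IMAGE:-registry.example.org/experiment-env:latest}"
DOCKERFILE="${DOCKERFILE:-Dockerfile}"
STAMP="${STAMP:-.dockerfile.sha256}"

# Print the SHA-256 checksum of the Dockerfile.
dockerfile_sum() {
    sha256sum "$DOCKERFILE" | cut -d' ' -f1
}

# Succeed when the Dockerfile differs from the last recorded checksum.
dockerfile_changed() {
    [ "$(dockerfile_sum)" != "$(cat "$STAMP" 2>/dev/null)" ]
}

# Build the image, push it to the registry, then record the new checksum.
build_and_push() {
    docker build --tag "$IMAGE" --file "$DOCKERFILE" "$(dirname "$DOCKERFILE")" &&
        docker push "$IMAGE" &&
        dockerfile_sum > "$STAMP"
}

# Act only when invoked as "<script> apply", so sourcing the file is harmless.
if [ "${1:-}" = "apply" ] && dockerfile_changed; then
    build_and_push
fi
```

Recording the checksum only after a successful push means that a failed build is retried on the next run.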
Scientific environments on-premises
Custom user containers
• User submits a SLURM job script with Docker commands
• Compute resources are allocated via SLURM
• SLURM executes the specified Docker containers on each of the allocated compute nodes
• Any Docker image can be used
• A Docker authorization plugin takes care of security
[Diagram label: user]
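A user job script of this kind might look roughly like the sketch below. The node count, time limit and the command run in the container are arbitrary choices for illustration, and the `SRUN`/`DOCKER` override variables are a hypothetical convenience for dry runs, not part of SLURM or Docker.

```shell
#!/bin/sh
#SBATCH --nodes=2
#SBATCH --time=01:00:00

# Any image could be named here; centos:7 is the one used later in this talk.
IMAGE="${IMAGE:-centos:7}"

# Launch one container per allocated node and run the given command in it.
# SRUN and DOCKER can be overridden (e.g. SRUN=echo) to print the command
# line instead of executing it.
run_on_all_nodes() {
    ${SRUN:-srun} --ntasks-per-node=1 ${DOCKER:-docker} run --rm "$IMAGE" "$@"
}

# SLURM sets SLURM_JOB_ID inside a job; outside a job nothing is started.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_on_all_nodes cat /etc/os-release
fi
```

Submitted with `sbatch`, the script would start the container once on each allocated node; whether a given `docker run` option is permitted depends on the site's authorization plugin.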
Scientific environments on-premises
Example - SIMEX
SimEx - photon science simulation platform
https://github.com/eucall-software/simex_platform
Scientific environments on-premises
Example - SIMEX
X-ray wavefront propagation calculator
• Propagation of light through optical elements
• Utilizes SRW (Synchrotron Radiation Workshop) library
• C++ core + Python wrappers
• Hybrid OpenMP/MPI parallelization
[Plot, single source file: speed-up (axis 0 to 14) versus N cores (axis 0 to 40)]
Table, 40 source files:
| Threads x MPI processes | Number of nodes | Total time | Time/file |
|---|---|---|---|
| 1x1 | 1 | 11 h | 1031 s |
| 40x1 | 1 | 65 min | 98 s |
| 4x10 | 4 | 7.5 min | 45 s |
| 8x5 | 8 | 4.2 min | 51 s |
160x speed-up
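The quoted figure can be checked against the total times in the table. The slide does not say which rows are compared; presumably it is the 1x1 run against the 8x5 run on 8 nodes. The symbol S (speed-up relative to the 1x1 run) is notation introduced here, with 11 h taken as 660 min:

```latex
\begin{align*}
S_{40\times 1} &= \frac{660\ \mathrm{min}}{65\ \mathrm{min}} \approx 10 \\
S_{4\times 10} &= \frac{660\ \mathrm{min}}{7.5\ \mathrm{min}} = 88 \\
S_{8\times 5}  &= \frac{660\ \mathrm{min}}{4.2\ \mathrm{min}} \approx 157 \approx 160
\end{align*}
```

The multi-node totals are also consistent with the files being spread evenly over the nodes: 40/4 x 45 s = 450 s = 7.5 min, and 40/8 x 51 s = 255 s, about 4.2 min.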
Helix Nebula Science Cloud*
Joint Pre-Commercial Procurement
Procurers: CERN, CNRS, DESY, EMBL-EBI, ESRF, IFAE, INFN, KIT, STFC, SURFSara
Experts: Trust-IT & EGI.eu
The group of procurers has committed:
• Procurement funds
• Manpower for testing/evaluation
• Use-cases with applications & data
• In-house IT resources
Resulting services will be made available to end-users from many research communities
Co-funded via H2020 Grant Agreement 687614
Total procurement budget >5M€
* Thanks to the CERN IT Group for the provided HNSciCloud slides
Helix Nebula Science Cloud*
Technical challenges
• Compute and storage
  • support a range of virtual machine and container configurations, including HPC, working with datasets in the petabyte range
• Transparent Data Access
  • give users transparent access to on-premises data from the cloud
• Network connectivity
  • provide high-end network capacity via GEANT for the whole platform
• Federated Identity Management
  • provide common identity and access management
Helix Nebula Science Cloud*
Project phases
[Timeline: Jan’16 to Dec’18, with a "We are here" marker]
Preparation
• Analysis of requirements, current market offers and relevant standards
• Build stakeholder group
• Develop tender material
Implementation & Sharing
• Milestones: Tender Jul’16, Call-off Feb’17, Call-off Dec’17
• Steps: 4 Designs, 3 Prototypes, 2 Pilots
Each step is competitive - only contractors that successfully complete the previous step can bid in the next
Scientific environments in hybrid clouds
Using cloud to extend local resources
Resources, fast network and transparent data access from HNSciCloud, combined with SLURM Elastic Computing
[Diagram labels: control node, compute nodes]
Scientific environments in hybrid clouds
Using cloud to extend local resources
[Diagram labels: control node, compute nodes, cloud compute nodes]
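With SLURM's elastic computing support, the controller calls a site-provided script (configured as `ResumeProgram` in slurm.conf) when cloud nodes are needed, and passes it the node names as a hostlist expression. The slides do not show DESY's script; the sketch below only illustrates the general shape. `start_vm` is a hypothetical placeholder for the cloud provider's own tooling and merely prints a message.

```shell
#!/bin/sh
# Sketch of a SLURM ResumeProgram; the provider-specific part is a placeholder.

# Hypothetical hook: replace the echo with the cloud provider's CLI or API call.
start_vm() {
    echo "would start VM for node $1"
}

# SLURM passes a hostlist expression such as cloud[001-003];
# "scontrol show hostnames" expands it to one node name per line.
resume_nodes() {
    scontrol show hostnames "$1" | while read -r node; do
        start_vm "$node"
    done
}

# Do nothing unless a hostlist argument was supplied.
if [ "$#" -gt 0 ]; then
    resume_nodes "$1"
fi
```

A matching `SuspendProgram` would release the same VMs once the nodes have been idle for the configured time.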
Scientific environments in hybrid clouds
Example
test.sh:
#!/bin/sh
#SBATCH --partition=cloudXXX
#SBATCH --workdir=/test_id
#SBATCH --nodes=1

# Record the numeric user ID on the cloud node and inside a container
id -u > cloud_id.txt
docker run centos:7 id -u > cloud_docker_id.txt

Submission and result:
local-node$ sbatch test.sh
local-node$ id -u
12345
local-node$ cat cloud_id.txt
12345
local-node$ cat cloud_docker_id.txt
12345
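The example shows the same user ID locally, on the cloud node and inside the container. The slides do not say how the ID is carried into the container (an authorization plugin is mentioned for the on-premises setup). One common way to get this behaviour by hand, assumed here purely for illustration, is Docker's `--user` option. The helper name `docker_as_me` and the `DOCKER` override variable are hypothetical.

```shell
#!/bin/sh
# Run a container command under the caller's numeric UID and GID.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the arguments
# instead of starting a container.
docker_as_me() {
    ${DOCKER:-docker} run --rm --user "$(id -u):$(id -g)" "$@"
}

# Usage on a node with Docker available:
#   docker_as_me centos:7 id -u
```

Without such an option, or a plugin that enforces it, a container process typically runs as root unless the image sets another user.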
Conclusions and outlook
Containerized scientific environment
• Implemented via Docker
• Isolates work of different users/groups
• Same performance as on underlying infrastructure
• Portable
• More user experience to be gained
Hybrid clouds
• Dynamic cloud resource allocation/deallocation
• Transparent to the user
  • user submits job to local scheduler
  • transparent data access from the cloud
  • thanks to Docker, no need to install user software on the cloud VM
• Performance to be tested
Thank you for your attention!