
Page 1

8 May 2019

Community Forum Data-Intensive Computing
Support Teams for bwForCluster MLS&WISO, SDS@hd, and bwVisu

URZ 1/41

Page 2

Common User Meeting

bwForTreff
- for users of bwHPC systems
- for students and scientists interested in HPC

SDS@hd user meeting
- for users of Scientific Data Storage SDS@hd
- for students and scientists interested in data storage

Community Forum Data-Intensive Computing (DIC)
- Get status and news of the HPC and storage systems
- Learn how to use both systems for different workflows
- Network with the support team and other users

Page 3

Community Forum DIC

Agenda
- Scientific Data Storage SDS@hd
- bwForCluster MLS&WISO
- News
- Job Efficiency
- Container Environment
- Announcements
- Archive Service SDA@hd
- Demos
- Singularity Containers
- Remote Visualization Service bwVisu
- Discussion

Page 4

bwHPC and bwHPC-S5

Page 5

Baden-Württemberg Framework Program 2018-2024

- Concept of the Universities of Baden-Württemberg for High Performance Computing (HPC), Data Intensive Computing (DIC), and Large Scale Scientific Data Management (LS2DM)
  ⇒ provides compute resources
- bwHPC-S5: Scientific Simulation and Storage Support Services
  ⇒ provides user support

Information on the website: http://www.bwhpc.de
Best Practices Guide: http://www.bwhpc.de/wiki

Page 6

bwHPC-S5 Project

Page 7

bwHPC-S5 Support

- Information seminars, hands-on training, HPC and storage workshops:
  http://www.bwhpc.de/courses_a_tutorials.html
- Documentation & best-practices repository: http://www.bwhpc.de/wiki
- Providing, maintaining, and developing:
  - simplified access to all bwHPC resources
  - cluster-independent & unified user environment
  - software portfolio
  - tools for data transfer and management
  - interface to archive and repository services
- Migration support:
  - code adaptation, e.g. MPI or OpenMP parallelization
  - code porting (from desktop or old HPC clusters)
  - migration to higher HPC levels

Page 8

SDS@hd – Scientific Data Storage

https://sds-hd.urz.uni-heidelberg.de

Page 9

Scientific Data Storage SDS@hd
Service Features and Storage Hardware

- High-performance data access for large scientific "hot data"
- Hardware: LSDF2 & LSDF2b
- High availability, no single point of failure
- Hardware spatially distributed across separate fire compartments
- IBM Spectrum Scale Advanced (GPFS)
- Current usable capacity: 5.8 PB net
- Data security, encryption
- Protocols: SMB, NFSv4, SSHFS, SFTP
- Flexible group and access management
- Collaborations among universities are possible
- 'Landesdienst' (state-wide service) for scientists of Baden-Württemberg universities
- In Heidelberg: a valid Uni-ID is necessary (if needed, via 'unentgeltliche Tätigkeit', i.e. an unpaid appointment)
- Connection to other services: bwHPC, bwVisu, heiCLOUD

Page 10

Scientific Data Storage SDS@hd
News

Extension of the LSDF2 hardware:
- The DFG proposal for LSDF2 Phase 2 (end of 2017) was granted!
- Extension of the storage capacity to 20 PB!
- The extension of the LSDF2 storage systems with the granted funding is currently being prepared and will lead to a public procurement process in 2019
- A further proposal will follow if more storage capacity is needed ⇒ Reporting

Page 11

Scientific Data Storage SDS@hd
Reporting

The use of SDS@hd must be acknowledged in publications!
- Highly important for DFG reviews and future funding of LSDF2
- Publications attest the scientific necessity and usefulness of the service

List of scientific publications:
- Send information about new publications to [email protected]
- Submit the list when applying for renewal of a Speichervorhaben (SV)

Text for the acknowledgement in publications:
"The authors gratefully acknowledge the data storage service SDS@hd supported by the Ministry of Science, Research and the Arts Baden-Württemberg (MWK) and the German Research Foundation (DFG) through grant INST 35/1314-1 FUGG."

See: https://sds-hd.urz.uni-heidelberg.de/reporting

Page 12

Scientific Data Storage SDS@hd
Registration

Entitlement
- Authorization to use the service
- Delivered by your home organization

SDS@hd management
- Register a Speichervorhaben (SV, storage project) or participate in an existing SV
- Signatures are needed on the contracts
- https://sds-hd.urz.uni-heidelberg.de/management

Registration server
- Register and manage your service account, e.g. set a personal service password
- https://bwservices.uni-heidelberg.de

Page 13

Scientific Data Storage SDS@hd
Costs

- The investment was funded by MWK (50%) and DFG (50%) (Art. 91b GG); only operating costs have to be paid
- Price: 0.25 ct/GB/month (0.0025 EUR/GB/month, = 30 EUR/TB/year)
- Heidelberg users: 0.1 ct/GB/month (12 EUR/TB/year, subsidized by central funds)
- Measurement: average of the used quota
- Backup is already included
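As a sanity check on the pricing, the annual per-terabyte figures follow directly from the monthly per-gigabyte rates (using decimal 1 TB = 1000 GB, as is usual for storage pricing):

```shell
# Convert SDS@hd prices from ct/GB/month to EUR/TB/year.
ct_per_gb_month=0.25      # standard rate
hd_ct_per_gb_month=0.10   # subsidized Heidelberg rate

awk -v r="$ct_per_gb_month" -v h="$hd_ct_per_gb_month" 'BEGIN {
    # ct -> EUR: /100; GB -> TB: *1000; month -> year: *12
    printf "standard:   %g EUR/TB/year\n", r / 100 * 1000 * 12
    printf "Heidelberg: %g EUR/TB/year\n", h / 100 * 1000 * 12
}'
# prints: standard:   30 EUR/TB/year
#         Heidelberg: 12 EUR/TB/year
```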

Page 14

Scientific Data Storage SDS@hd
Billing

- Costs have to be charged
- Measurement: average of the used quota per month
- Billing begins in 2019
- Some "Kostenstellen" (cost centers) on existing contracts are incorrect! Affected groups will be contacted for correction in the coming weeks.
- Medical faculties:
  - Service registration is only available to scientists of the universities
  - Not for usage as a member of the university hospital
  - Register as a scientific member of the medical faculty
  - If you have a Uni-ID, everything is OK!

Page 15

Scientific Data Storage SDS@hd
Data Access Protocols

NFSv4 with Kerberos
- Operating system: Linux
- Installation effort: one-time setup (Kerberos, ID mapping)
- Possible bandwidth: up to 40 Gbit/s
- Fault tolerance: highly available
- Encryption (transfer): possible
- Firewall ports: 2049 (nfs)
- Recommended for: workstations, servers, microscopes, ...

SMB 2.x, 3.x
- Operating system: Windows (since Vista), OS X Mavericks (10.9)
- Installation effort: minimal; mapping of a network drive
- Possible bandwidth: up to 40 Gbit/s
- Fault tolerance: highly available
- Encryption (transfer): possible
- Firewall ports: 445 (smb), 139 (netbios), 135 (rpc)
- Recommended for: workstations, servers, microscopes, ...

SFTP
- Operating system: all
- Installation effort: installation of an SFTP client, e.g. sshfs, WinSCP, FileZilla
- Possible bandwidth: up to 40 Gbit/s
- Fault tolerance: –
- Encryption (transfer): possible
- Firewall ports: 22 (ssh)
- Recommended for: Linux PCs, and machines in restrictive or external networks
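For the SMB route on a Linux client, the network-drive mapping can also be made permanent via /etc/fstab. A hypothetical sketch; the server name, share name, and paths below are placeholders, not taken from the slides:

```
# Hypothetical fstab entry for an SDS@hd SMB share (all names are placeholders).
# The credentials file holds the service account name and password.
//<sds-smb-server>/<sv-name>  /mnt/sds-hd  cifs  credentials=/etc/sds-credentials,vers=3.0,_netdev  0  0
```

The _netdev option delays mounting until the network is up, which matters for shares reached over the campus network.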

Page 16

bwForCluster MLS&WISO

Page 17

bwForCluster MLS&WISO
System Layout

[System layout diagram: the Development part (IWR) and the Production part (URZ/RUM), located in Heidelberg and Mannheim, each with login and admin nodes and several node classes (standard, fat, best, ...) on IB-QDR and IB-FDR fabrics; a long-distance (28 km) InfiniBand connection (160 Gbit/s) links the two sites; WORK and HOME file systems; 4x FDR and 40 GE uplink to LSDF2.]

Page 18

bwForCluster MLS&WISO
Production

Access prerequisites
- bwForCluster entitlement for your LocalUserID from your home university
- Membership in an RV for bwForCluster MLS&WISO (Production) at ZAS

Registration
- Register at https://bwservices.uni-heidelberg.de
- Choose "bwForCluster MLS&WISO - Production"
- Set a service password

Login
- Login servers:
  bwfor.cluster.uni-mannheim.de
  bwforcluster.bwservices.uni-heidelberg.de
- User ID: SitePrefix + LocalUserID (= "hd" + Uni-ID for Heidelberg users)
- Password: the service password

Page 19

bwForCluster MLS&WISO
Production

File Systems

                 $TMPDIR             $HOME      Workspaces          SDS@hd
  Visibility     node-local          global     global              global (1)
  Lifetime       batch job walltime  permanent  workspace lifetime  permanent
  Disk space     128 GB              36 TB      384 TB              5.7 PB
  Quotas         no                  100 GB     no                  yes (2)
  Backup         no                  no         no                  yes (3)
  File system    xfs                 BeeGFS     BeeGFS              GPFS

(1) on all compute nodes except standard nodes
(2) disk quota set by SDS@hd per SV
(3) done by SDS@hd

Page 20

SDS@hd on bwHPC clusters

  Cluster       Location             Connection mode                              Status
  bwUniCluster  Karlsruhe            via data mover rdata                         in production
  bwForCluster  Heidelberg/Mannheim  via data mover & direct access on            in production
                                     compute nodes
  bwForCluster  Tübingen             via data mover                               in production
  bwForCluster  Ulm                  not needed                                   in evaluation
  bwForCluster  Freiburg             not needed                                   in evaluation
  ForHLR        Karlsruhe            via data mover rdata                         in production
  HazelHen      Stuttgart            in evaluation                                in preparation

Page 21

Workspaces

What are workspaces?
Workspaces are allocated folders with a limited lifetime.

Workspace tools:
  ws_allocate foo 10   Allocate a workspace named foo for 10 days.
  ws_list -a           List all your workspaces.
  ws_find foo          Get the absolute path of workspace foo.
  ws_extend foo 5      Extend the lifetime of workspace foo by 5 days from now.
  ws_release foo       Manually release (erase) your workspace foo.

Limits for bwForCluster MLS&WISO - Production:
- Max. duration of a single allocation: 90 days
- Number of extensions: 10
- Max. total lifetime of a workspace: 990 days
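The 990-day maximum follows from the other two limits: one initial allocation of up to 90 days plus 10 extensions of up to 90 days each, in the best case where each extension is requested just before expiry. A quick arithmetic check:

```shell
# Maximum workspace lifetime = initial allocation + 10 extensions,
# each of at most 90 days.
max_alloc_days=90
max_extensions=10
echo $(( max_alloc_days * (1 + max_extensions) ))   # prints 990
```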

Page 22

bwHPC Training

HPC courses
- Duration: half a day up to one week
- Courses from all bwHPC partners in Baden-Württemberg:
  http://www.bwhpc.de/courses_a_tutorials.html
- Planned courses in 2019:
  - Hands-on workshop bwHPC (in German), 15 May 2019 in Mannheim
  - OpenMP and/or MPI workshop in 2019
  - Nvidia OpenACC workshop in 2019

Page 23

News

Page 24

Job Efficiency
Job Feedback Script

- Generated after job completion
- Appended to the job output file
- Includes basic job statistics
- Gives additional hints and advice
- Available since 11/2018

Page 25

Container Environment
Singularity on bwForCluster MLS&WISO

- Reproducible runtime environment on the HPC system
- Comparable to Docker; Docker images can be re-used
- Modules, workspaces & SDS@hd available
- Available via the module system since 12/2018

Page 26

Announcements

Page 27

Scientific Data Archive SDA@hd

- Heidelberg service for archiving on tape (in development)
- For all types of research data
- Typical use case: the DFG's 10-year retention obligation
- Base set of metadata
- SDS@hd connection, including migration paths
- Costs: 0.05 ct/GB/month (= 6 EUR/TB/year)
- Contact: [email protected]

Page 28

Singularity

Page 29

Singularity

- Open-source container runtime focused on HPC
- Focus on mobility of applications: encapsulates applications with all their dependencies
- Easy to use
- Containers are started (and behave) like regular applications
- Integrated support for MPI and GPUs

Page 30

Singularity & Docker

- Easy conversion Docker → Singularity:
    singularity pull tensorflow.sif docker://tensorflow/tensorflow:latest
  Docker layers are squashed into a single image.
- User rights:
  - The Docker daemon needs to run as root → who is allowed to start containers?
  - In Singularity, containers are started with the user's own rights

Page 31

Definition File

- Similar to Dockerfiles
- No multiple "RUN" instructions
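The definition file shown on this slide is not reproduced in the transcript. A minimal sketch of the format; the Ubuntu base image and the Python payload are purely illustrative assumptions:

```
Bootstrap: docker
From: ubuntu:18.04

%post
    # unlike a Dockerfile's many RUN lines, all build steps go into this one section
    apt-get update && apt-get install -y python3

%environment
    export LC_ALL=C

%runscript
    # entrypoint used by "singularity run"
    exec python3 "$@"
```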

Page 32

Useful Commands

- singularity build paraview-v5.6.0.sif paraview.def
  Uses a definition file to build an immutable image.
- singularity run paraview-v5.6.0.sif
  Runs the entrypoint and commands specified in the definition file.
- singularity shell -B /mnt/... paraview-v5.6.0.sif
  Opens a shell inside the container (with the given host path bind-mounted).

Page 33

bwVisu

Page 34

bwVisu

- State project for remote visualization of large scientific data
- Main focus: user-friendliness

Funded by / Our project partners: (logos shown on the slide are not reproduced in the transcript)

Page 35

Powerful Hardware

- Can serve up to 288 concurrent users with GPUs
- Both NVIDIA and AMD GPUs available; users can choose the optimal hardware for their use case
- Powerful CPUs
- High-bandwidth connection to SDS@hd and bwForCluster

Page 36

Application Images

- Containerized apps (Docker and Singularity)
- Reproducibility of scientific work
- High control over the software environment
- Simple deployment

Page 37

Easy-to-Use Software Interface

- Web frontend for managing jobs
- Connect directly to visualization jobs from the browser
- An image registry allows users to easily develop and deploy new applications to bwVisu
- Containerized images of the most common visualization applications (e.g. ParaView, VMD, VisIt, ...) are readily available

Page 38

Other Key Features

- Mount file storage services like SDS@hd
- Use your custom visualization applications on bwVisu
- Restrict visibility of and access to your applications
- Share visualization sessions with other users simultaneously

Page 39

bwVisu Web

- Intuitive GUI
- Uses the remote desktop tool Xpra
- Only requirement: an HTML5-enabled browser (platform independent)
- A local client is available for low latency

Page 40

Discussion

- Questions?
- Your use cases?
- Change requests?
- Feature requests?

Page 41

Thank you!

Support units
- HPC: [email protected]
- SDS@hd: [email protected]