Interactive supercomputing at the crossroads of HPC and cloud technologies GTC Europe Sadaf Alam, CSCS October 11, 2018


  • Interactive supercomputing at the crossroads of HPC and cloud technologies

    GTC Europe

    Sadaf Alam, CSCS

    October 11, 2018

  • GTC Europe 2018 2

    CSCS—Swiss National Supercomputing Centre

    CSCS develops and operates cutting-edge high-performance computing systems as an essential service facility for Swiss researchers. These computing systems are used by scientists for a diverse range of purposes – from high-resolution simulations to the analysis of complex data.

    10s of peta (10^15) floating-point operations per second

    100s of peta (10^15) bytes of storage

    https://www.cscs.ch/


  • Science Highlights

    Synthesis of atomically precise graphene nanoribbons with "zigzag" edges. Precursor monomers deposited on gold self-assemble and react producing an atomically precise nanomaterial. Ribbons with "zigzag" edges are predicted to host spin-polarized electronic edge states that make them interesting for spintronic applications. (Image: EMPA / Carlo Pignedoli)

    Supercomputer used to design nanoelectronics of the future

    For materials researchers, computer simulations stand alongside experiment and theory as an everyday tool in their work. Simulations play a key role in understanding the complex issues inherent in nanomaterials research.

  • Science Highlights

    Computer simulations have helped a team of researchers led by ETH professor Viola Vogel to develop a peptide that is able to detect the tensional state of tissue fibers. This paves the way for completely novel research approaches in medicine and pharmacology.

    Cells are surrounded by extracellular matrix fibres, which they stretch and thereby change their functionality. The cell nuclei (blue) are shown together with fibronectin fibres (green), whereby the relaxed fibres are stained with a bacterial peptide (red). (Image: Viola Vogel group, ETH Zürich)

    The bacterial peptide (blue) attaches to a fibronectin fibre (white) over several binding sites. (Graphics: Samuel Hertig)

  • MeteoSwiss Operational Weather Forecasting System

  • Emergence of Interactive Usage of Resources

  • Access to Resources

    #!/bin/bash -l
    #SBATCH --job-name=job_name
    #SBATCH --time=01:00:00
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-core=2
    #SBATCH --ntasks-per-node=12
    #SBATCH --cpus-per-task=2
    #SBATCH --partition=normal
    #SBATCH --constraint=gpu

    # One OpenMP thread per CPU allocated to each task
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
    # Let several tasks on a node share its GPU through CUDA MPS
    export CRAY_CUDA_MPS=1
    module load daint-gpu
    srun ./executable.x

    Job → queue → runs according to policy and availability of resources
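A minimal sketch of the batch script above expressed as data, which makes the task layout that the scheduler is asked for easy to check before the job is queued. The `JobSpec` class and its method names are illustrative assumptions, not a CSCS or Slurm API; only the `#SBATCH` options themselves come from the slide.

```python
from dataclasses import dataclass


@dataclass
class JobSpec:
    """Subset of the Slurm options used in the batch script above (hypothetical helper)."""
    job_name: str = "job_name"
    time: str = "01:00:00"
    nodes: int = 2
    ntasks_per_core: int = 2
    ntasks_per_node: int = 12
    cpus_per_task: int = 2
    partition: str = "normal"
    constraint: str = "gpu"

    def total_tasks(self) -> int:
        # Slurm starts ntasks-per-node tasks on every allocated node.
        return self.nodes * self.ntasks_per_node

    def logical_cpus_per_node(self) -> int:
        # Logical CPUs each node must provide for this layout.
        return self.ntasks_per_node * self.cpus_per_task

    def render(self, executable: str = "./executable.x") -> str:
        # Produce the same directives and commands as the script on the slide.
        lines = [
            "#!/bin/bash -l",
            f"#SBATCH --job-name={self.job_name}",
            f"#SBATCH --time={self.time}",
            f"#SBATCH --nodes={self.nodes}",
            f"#SBATCH --ntasks-per-core={self.ntasks_per_core}",
            f"#SBATCH --ntasks-per-node={self.ntasks_per_node}",
            f"#SBATCH --cpus-per-task={self.cpus_per_task}",
            f"#SBATCH --partition={self.partition}",
            f"#SBATCH --constraint={self.constraint}",
            "",
            "export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK",
            "export CRAY_CUDA_MPS=1",
            "module load daint-gpu",
            f"srun {executable}",
        ]
        return "\n".join(lines) + "\n"


spec = JobSpec()
script = spec.render()
print(script)
print("tasks:", spec.total_tasks(), "logical CPUs per node:", spec.logical_cpus_per_node())
```

The rendered text would still be handed to `sbatch`, so the job waits in the queue like any other batch job; the sketch only makes the request explicit.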



  • Technological Options – Cloud & HPC Convergence

    § HPC features
      § Parallel computing
      § Parallel file system technologies (POSIX based)
      § Bulk processing, scale out with fast, integrated ecosystem
      § High bandwidth networking subsystems
      § Internal and external connectivity for high throughput data transfers
      § …

    § Cloud features
      § IaaS (PSI system engineers own infrastructure and services)
      § On-demand
      § High availability through service migration
      § Role-based access control
      § Storage models for role-based access controls
      § Isolation, security and QoS
      § …

  • Use Cases for Interactive Access and X-as-a-Service

  • Materials Cloud

    § NCCR MARVEL Mission
      § Accelerate the design and discovery of novel materials
      § Use quantum-mechanical simulations

    § Materials Cloud
      § A platform for open science with educational, research, and archiving tools, simulation software and services, curated and raw data
      § Web portal: uses Jupyter to manage workflows by using “App”

    § Integration with CSCS
      § Web portal running on OpenStack at CSCS
      § Capable of executing jobs on Piz Daint by using SSH from the web portal

    https://www.materialscloud.org/
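The SSH-based job execution mentioned for the web portal could look roughly like the sketch below. It is an assumption about the general pattern, not a description of how Materials Cloud is implemented: the user, host name and script path are hypothetical, and the only external commands used are `ssh` (with `-o BatchMode=yes`) and `sbatch --parsable`.

```python
import shlex
import subprocess


def build_submit_command(user: str, host: str, remote_script: str) -> list:
    """Build an ssh command line that submits a batch script on the remote system."""
    # Quote the path so that spaces or shell metacharacters survive the remote shell.
    remote = f"sbatch --parsable {shlex.quote(remote_script)}"
    # BatchMode=yes makes ssh fail instead of prompting, which suits a web service.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote]


def submit(user: str, host: str, remote_script: str) -> str:
    """Run the command and return the job id (requires real SSH access; not called here)."""
    result = subprocess.run(
        build_submit_command(user, host, remote_script),
        capture_output=True, text=True, check=True,
    )
    # With --parsable, sbatch prints "jobid" or "jobid;cluster".
    return result.stdout.strip().split(";")[0]


# Hypothetical account, host and path, used only to show the resulting command.
cmd = build_submit_command("portal_user", "hpc.example.org", "/scratch/portal_user/run 1/job.sh")
print(cmd)
```

Passing the command as a list avoids a local shell, so only the remote side needs quoting.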

  • Paul Scherrer Institute

    § PSI Mission
      § Study the internal structure of a wide range of different materials
      § Research facilities: the Swiss Light Source (SLS), the free-electron X-ray laser SwissFEL, the SINQ neutron source and the SμS muon source

    § PSI facility users reserve a scientific device for a period of time
      § Compute power should also be available
      § Storage and archive availability during the experiment
      § Data retrievable after experiment by the users of PSI facilities (not PSI)

    § Proposal to interface Piz Daint with their workflow
      § Use an API to access compute and data services (job scheduler, data mover)
      § Create a reservation service to reserve computation nodes
      § Provide a portal running on OpenStack to let PSI users access archived data at CSCS

    https://www.psi.ch/media/overview-swissfel
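The reservation idea above (compute nodes booked for the same period as a scientific device) can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the class names, the node count and the beamline labels are hypothetical, and a real service would sit in front of the job scheduler rather than keep its own list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Reservation:
    owner: str
    nodes: int
    start: datetime
    end: datetime


class ReservationService:
    """Hypothetical node reservation book-keeping for a fixed pool of nodes."""

    def __init__(self, total_nodes: int) -> None:
        self.total_nodes = total_nodes
        self.reservations: List[Reservation] = []

    def nodes_in_use(self, start: datetime, end: datetime) -> int:
        # Conservative estimate: count every reservation that overlaps the
        # half-open window [start, end) at any point.
        return sum(r.nodes for r in self.reservations if r.start < end and start < r.end)

    def reserve(self, owner: str, nodes: int, start: datetime, end: datetime) -> Optional[Reservation]:
        if end <= start:
            raise ValueError("reservation must end after it starts")
        if self.nodes_in_use(start, end) + nodes > self.total_nodes:
            return None  # not enough free nodes during the requested window
        reservation = Reservation(owner, nodes, start, end)
        self.reservations.append(reservation)
        return reservation


day = datetime(2018, 10, 11)
service = ReservationService(total_nodes=8)
first = service.reserve("beamline_a", 6, day + timedelta(hours=9), day + timedelta(hours=17))
second = service.reserve("beamline_b", 4, day + timedelta(hours=12), day + timedelta(hours=14))
third = service.reserve("beamline_b", 4, day + timedelta(hours=17), day + timedelta(hours=19))
print(first is not None, second is None, third is not None)
```

The second request is refused because it overlaps the first and would exceed the pool; the third starts exactly when the first ends and is accepted.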

  • Human Brain Project

    § HBP Mission
      § To build a research infrastructure to help advance neuroscience, medicine and computing
      § To develop six ICT research Platforms that form the heart of the HBP infrastructure
      § To undertake targeted research and theoretical studies, and explore brain structure and function in humans, rodents and other species

    § ICT Research Platforms
      § Neuroinformatics (access to shared brain data)
      § Brain Simulation (replication of brain architecture and activity on computers)
      § High Performance Analytics and Computing (providing the required computing and analytics capabilities)
      § Medical Informatics (access to patient data, identification of disease signatures)
      § Neuromorphic Computing (development of brain-inspired computing)
      § Neurorobotics (use of robots to test brain simulations)

    § Federated e-infrastructure project to support diverse data sources and computing models

    https://www.humanbrainproject.eu/

  • Development of HPC+X-as-a-Service Architecture

  • Managing New Access Controls

    Users accessing resources (traditional)

    Services accessing resources (new)

    Experimental sites: privileged access (new++)
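One way to picture the three kinds of access listed above is a small policy table keyed by the kind of principal. The sketch is illustrative only: the action names and the specific rights granted to each kind are assumptions made for the example, since the slide names the categories but not their permissions.

```python
from enum import Enum


class Principal(Enum):
    USER = "user"                        # traditional: a person logging in
    SERVICE = "service"                  # new: a portal or workflow acting via an API
    EXPERIMENT_SITE = "experiment_site"  # new++: an experimental site with privileged access


# Hypothetical policy: each kind of principal gets a set of allowed actions.
POLICY = {
    Principal.USER: frozenset({"submit_job"}),
    Principal.SERVICE: frozenset({"submit_job", "move_data"}),
    Principal.EXPERIMENT_SITE: frozenset({"submit_job", "move_data", "reserve_nodes"}),
}


def is_allowed(principal: Principal, action: str) -> bool:
    """Return True if the policy grants the action to this kind of principal."""
    return action in POLICY.get(principal, frozenset())


for principal in Principal:
    print(principal.value, sorted(POLICY[principal]))
```

Keeping the policy as data means a new kind of principal is added by extending the table rather than the checking code.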

  • Architecture for Multiple Access Controls

    [Figure: architecture diagram. Labels: User, SSH, Internet, Gate to CSCS (ela), Login node, CN, HPC parallel FS, Public IPs, Gateway, API, Local storage and in memory, Local storage, Software defined infrastructure, Containers, Archives, Data science workflows, Interactive compute]

  • [Figure panels: Existing Implementation; Future Implementation]

  • Interactive Computing E-Infrastructure

    Federated infrastructure for data, compute and additional services by five leading European HPC data centres (BSC in Spain, CEA in France, CINECA in Italy, CSCS in Switzerland and Juelich in Germany)

    https://fenix-ri.eu/

  • Federated IaaS and Sites’ Autonomy

  • Fenix Services Implemented by ICEI

    § Consumable and accountable services
      § Interactive Computing Services
      § Scalable Computing Services
      § Virtual Machine Services
      § Active Data Repositories
      § Archival Data Repositories

    § Underlying and building block services
      § Internal interconnect
      § External interconnect
      § Authentication/Authorization Services
      § Data Mover Services
      § Data Transfer Services

    § User and customer support services
      § Fenix User and Resource Management Service (FURMS)
      § Monitoring Services
      § User Support Services

  • Addressing Key Challenges

    § Authentication and Authorization infrastructure
      § Enable multiple identity providers (not only users known by the HPC centre)
      § Identify “who” (workflow, scientific device, web portal, …) is authorized to use HPC services

    § Data management
      § Complex data ownership with multiple identity providers
      § Security: running VM with root access and using parallel HPC filesystem
      § Automated staging in/out and transformation of data: Data Broker

    § Workflow systems
      § Which workflow engines or standards to support?
      § Enable access to HPC services via a REST API (compute, data, reservation)
      § Interactive service and batch scheduling (job waiting in queue)
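To make the REST API item above concrete, the sketch below builds (but does not send) an authenticated request for one of the three service groups named on the slide. Everything specific is an assumption for illustration: the base URL, the endpoint layout, the JSON payload and the bearer-token scheme are hypothetical and do not describe an existing CSCS interface.

```python
import json
import urllib.request

BASE_URL = "https://hpc-api.example.org/v1"  # hypothetical endpoint
RESOURCES = ("compute", "data", "reservation")  # service groups named on the slide


def build_request(resource: str, payload: dict, token: str) -> urllib.request.Request:
    """Prepare a POST request for one service group; nothing is sent over the network."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    return urllib.request.Request(
        url=f"{BASE_URL}/{resource}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The token would come from one of the federated identity providers.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request("compute", {"script": "job.sh", "nodes": 2}, token="TOKEN")
print(request.get_method(), request.full_url)
```

A caller would pass the prepared object to `urllib.request.urlopen` once a real endpoint and token exist. The submitted job would still wait in the batch queue, which is the tension with interactive use noted in the last bullet.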

  • Abstract

    High “performance” computing, networking and storage technologies have been among the driving forces behind numerous scientific discoveries and breakthroughs for decades. Recently, the X-as-a-service model offered by several cloud technologies has enabled researchers, particularly in the fields of data science, to access resources and services in an on-demand and elastic manner. Complex workflows in different domains, such as those of the European Human Brain Project (HBP), however, require a converged, consolidated and flexible set of infrastructure services to support their performance and accessibility requirements. This talk covers the background and distinguishing features of the European Interactive Computing E-Infrastructure (ICEI) project, which will offer a set of federated services to realize the Fenix infrastructure (https://fenix-ri.eu).

  • Thank you for your attention.