e-Infrastructures & ELIXIR CZ Luděk Matyska CERIT-SC (ICS MU) & CESNET


Post on 20-May-2020


Page 1: e-Infrastructures & ELIXIR CZ · e-Infrastructures and ELIXIR: Motivation •ELIXIR is a research infrastructure dealing with data and tools ... •Key components up and running already

e-Infrastructures & ELIXIR CZ

Luděk Matyska

CERIT-SC (ICS MU) & CESNET


e-Infrastructures and ELIXIR: Motivation

• ELIXIR is a research infrastructure dealing with data and tools to process/analyze them

• A very specific case of an e-infrastructure

• Interaction with “classical” e-infrastructures critical

• User-provider model not satisfactory

• Synergistic development and evolution needed on both sides

• Collaborative model with higher potential

• Solution for ELIXIR CZ:

Make the e-infrastructures prime partners


National e-infrastructure landscape

• CESNET

• Legal body

• Owned by all Czech public universities plus the Czech Academy of Sciences

• Primary national network infrastructure

• Also coordinating national grid, cloud and storage activities

• Plus cyberinfrastructure

• CERIT-SC

• At Masaryk University

• Flexible clouds, development, innovations

• IT4Innovations

• At Technical University Ostrava

• National Supercomputing Centre


National e-infrastructure landscape

• Till 2019 all three e-infrastructures independent

• All on the national roadmap

• Now a single national e-infrastructure, e-INFRA CZ

• A consortium of all three coordinated by CESNET

• Not a new legal body, but formal consortium agreement in place

• CESNET & CERIT-SC/MU have a long collaboration history

• Formally till 1998

• MetaCentrum – distributed grid, cloud

• CESNET & CERIT-SC are founding members of ELIXIR CZ


e-Infrastructures and ELIXIR CZ

• CESNET & CERIT-SC founding partners of ELIXIR CZ

• Primary goal: to take responsibility for the IT resources and capacity building

• The IT capacity is fully purchased and operated by the e-infrastructures

• However, this does not mean a single locality of the resources

• Building a distributed e-infrastructure for ELIXIR CZ

• Fully integrated into the national e-infrastructure landscape

• Building on top of their expertise and expanding it towards Life Science community needs

• Control over this capacity is in the hands of ELIXIR CZ, not the e-infrastructures


IT capacity building

• Separate funding stream

• European development funds for capital investment

• Specific to ELIXIR CZ; together with RI funding, covers the total cost (including operational costs like personnel and electricity)

• The current project is primarily around the e-infrastructure partners within ELIXIR CZ

• CERIT-SC/MU coordinating

• CERIT-SC and CESNET delivering

• Not only hardware, but also the commercial software covered


IT capacity building

• Primarily computing capacity, storage and some software

• Computing capacity

• Clusters with either thin or thick (more CPU, more memory) nodes

• GPU acceleration

• Extensively coordinated with the capital investment of e-infrastructures

• Synergies recognized and fully utilized

• Storage capacity

• Large (archival) storage provided by CESNET within its own capacity

• ELIXIR CZ contribution towards working storage space


IT capacity in more detail

• Compute cluster elmo

• 42 servers, 1500 CPUs

• Location: ICS MU and IOCB Prague

• Part dedicated for service/tools development

• Part dedicated to specific ELIXIR services

• RepeatExplorer, (FireProt)

• Part dedicated for Galaxy or Chipster initiated workflows

• Specific MetaCentrum queue dedicated to ELIXIR CZ

• Part of MetaCentrum capacity in Brno

• 45 servers, ELIXIR CZ priority access

• International activities


IT capacity in more detail

• Storage capacity

• Originally 400 TB in Brno and 100 TB in Prague

• Currently extended to 2 PB

• Object storage for cloud use

• Usage

• Next to clusters for data processing

• Data of ELIXIR CZ services, copies of used databases

• Readily available copies of data in home directories


Software

• Run on centrally provided resources

• Virtualization used to extend the availability

• E.g. PEAKS software

• Windows based

• Run in a virtual appliance and thus shared by more groups

• Uses the capacity of the national high speed network

• Ongoing discussions about the feasibility of commercial software

• Black box, rather expensive, not prepared for sharing (setup, licensing model, …)


Current plans

• Cloud (compute and storage) for processing of highly sensitive data

• Support for healthcare oriented research

• First version at CERIT-SC/MU, technology before end of 2019

• However, the technology is the easiest part of this activity

• Extension of compute/processing capacity

• Including GPU acceleration (also for AI)

• Thick nodes with 2-3 TB RAM

• Capacity at IMG


Pilot at IMG

• Coordination of building IT capacities across several research infrastructures

• ELIXIR CZ, CZBI, OPENSCREEN, CCP and e-INFRA CZ

• With contribution from the Institute

• Coordinated by ELIXIR CZ

• Support for “data lake” model

• Combining data from several sources/infrastructures for easier processing and cross correlations

• A model for similar setups at other localities/institutions


General services

• Operation of stable services in secured environment

• Including virtualization services and support for containerization

• Cppredictor, ccmi, Chipster, web ELIXIR CZ

• Resources for services (including workflow engines)

• Chipster, Galaxy, RepeatExplorer, FireProt

• Complex environment for service/tools development

• Including GUI for bioinformaticians (Chipster and Galaxy)

• Application software

• Commercial like CLC Workbench, PEAKS Studio, or Mascot

• Open source systems available through the whole MetaCentrum


General services

• Storage for working copies or archives

• For archives CESNET data services are used

• Support for training and education

• UCHT: Genomics: analysis and algorithms

• Platform to run training

• But also learning how to adapt e-infrastructure to specific Life Science needs:

• Discussion on SLA, responsibility of individual partners

• Monitoring, acceptable use policies and rules

• https://wiki.metacentrum.cz/wiki/Elixir


International scope

• ELIXIR CZ is deeply integrated into the European-wide ELIXIR activities

• e-Infrastructure partners play a very active role

• Co-leadership of the ELIXIR Compute Platform

• Task co-leadership in AAI and Clouds

• Leadership of the EOSC-Life project workpackage on Access control and management

• And active extensive involvement in the cloud workpackage

• Leadership of the CINECA project workpackage (AAI)

• ELIXIR Authentication and Authorization Infrastructure (ELIXIR AAI): a major internationally (globally) visible contribution


ELIXIR AAI (authentication and authorisation infrastructure)

• ELIXIR Compute platform’s service portfolio for authenticating researchers and helping other services to decide their permissions

• Being implemented as an ELIXIR infrastructure service from 2019

• Will underpin Life Science AAI with other RIs

• Uses standards as much as possible

• Involved in GA4GH DURI

Statistics 09/2019

• 3000 users

• ~4500 logins/month

• 67 relying services

• https://login.elixir-czech.org/statistics

• https://login.elixir-czech.org/services


ELIXIR AAI

• ELIXIR Compute platform’s service portfolio for authenticating researchers and helping other services to decide their permissions

• Being implemented as an ELIXIR infrastructure service from 2019

• The first accepted ELIXIR infrastructure service

• Will underpin Life Science AAI with other RIs

• Activity within EOSC-Life under our leadership

• Based on standards developed by e-infrastructures

• Actively participated in these activities

• Involved in GA4GH DURI

• (Global Alliance for Genomics and Health, global impact)


ELIXIR AAI Statistics (09/2019)

• 3000 users

• ~4500 logins/month

• 67 relying services

• 67 services in pilot

More info on:

• https://login.elixir-czech.org/statistics

• https://login.elixir-czech.org/services


ELIXIR AAI history – where we are now

• Use case gathering – Autumn 2014

• https://docs.google.com/document/d/12fBLl8WenlxABQDzYEAJ6NtOhIXnZBPiGQLuhFUCyqo/edit

• Requirements and design – Spring 2015

• https://docs.google.com/document/d/1CMY1np3GyvPD8LcKvXljXcRO04V2zu3n_Jcg19jgNOw/edit

• Deployment starts – Autumn 2015 – EXCELERATE WP4.3.1

• Part of ELIXIR Compute platform, operated by CZ and FI

• First release – Autumn 2016

• Until then, ELIXIR AAI in pilot status

• Key components up and running already


ELIXIR AAI design

[Diagram: ELIXIR AAI design]

• External authentication (e-infrastructures): eduGAIN IdPs, Common IdPs

• ELIXIR AAI: ELIXIR Proxy IdP, ELIXIR Directory, Bona fide management, Dataset authorisation management (REMS), Group/role management (PERUN), Credential translation, Attribute self-management, Step-up AuthN

• Relying services: EGA, eLearning, Cloud, Beacon, wiki, Data archive, …


ELIXIR AAI design


ELIXIR Proxy IdP

- User has one ELIXIR identity
- User can authenticate using (several) external identities
- Proxy IdP consolidates the IDs
- Acts as SAML2 or OpenID Connect IdP for relying services
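
The consolidation idea on this slide can be illustrated with a minimal sketch. It is a toy model only and not the ELIXIR AAI implementation: the class name `ProxyIdP`, its methods, the `elixir-000001` ID format and the example IdP names are all assumptions made for illustration.

```python
from itertools import count


class ProxyIdP:
    """Toy model of identity consolidation (hypothetical, for illustration)."""

    def __init__(self):
        self._links = {}          # (idp, external subject) -> ELIXIR ID
        self._serial = count(1)   # source of new hypothetical ELIXIR IDs

    def register(self, idp, subject):
        """First login with an external identity creates one ELIXIR ID."""
        key = (idp, subject)
        if key not in self._links:
            self._links[key] = f"elixir-{next(self._serial):06d}"
        return self._links[key]

    def link(self, elixir_id, idp, subject):
        """Attach a further external identity to an existing ELIXIR ID."""
        if elixir_id not in self._links.values():
            raise KeyError(f"unknown ELIXIR ID: {elixir_id}")
        existing = self._links.setdefault((idp, subject), elixir_id)
        if existing != elixir_id:
            raise ValueError("identity already linked to another ELIXIR ID")

    def authenticate(self, idp, subject):
        """Return the single ELIXIR ID that relying services would see."""
        return self._links[(idp, subject)]


proxy = ProxyIdP()
alice = proxy.register("university-idp", "alice@uni.example")
proxy.link(alice, "social-idp", "alice.researcher")
# Both external logins resolve to the same consolidated identity.
print(proxy.authenticate("social-idp", "alice.researcher") == alice)
```

Whichever external identity the user logs in with, a relying service only ever sees the one consolidated ID, which is the point of placing a proxy between the external IdPs and the services.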


ELIXIR AAI design


Multi-factor authentication

- when requested by the relying party
- with a smartphone app
- based on the TOTP standard
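
For readers unfamiliar with TOTP, the following is a minimal sketch of the algorithm as specified in RFC 6238 (built on HOTP, RFC 4226), using only the Python standard library. It shows how a smartphone app and a server can derive the same short code from a shared secret and the current time. The function names and the one-step tolerance window in `verify` are assumptions for illustration; nothing here is taken from the ELIXIR AAI code.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP value (RFC 4226) for a shared secret and a counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """TOTP value (RFC 6238): HOTP over the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, submitted: str, at=None, step: int = 30,
           digits: int = 6, window: int = 1) -> bool:
    """Accept codes from neighbouring time steps to tolerate clock drift.

    The +/- one step window is an assumed policy, not part of the source.
    """
    now = time.time() if at is None else at
    for drift in range(-window, window + 1):
        expected = totp(secret, now + drift * step, step, digits)
        if hmac.compare_digest(expected, submitted):
            return True
    return False


# Shared secret from the RFC 6238 test vectors (ASCII "12345678901234567890").
SECRET = b"12345678901234567890"
print(totp(SECRET, at=59, digits=8))  # RFC 6238 test vector: 94287082
```

The relying party never sees the secret; it only asks the AAI for a stronger authentication, and the AAI checks the submitted code against its own computation.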


ELIXIR AAI design


Bona fide researchers

- Anyone can have an ELIXIR ID
- Bona fide researcher: a member of the bioinformatics community in good standing
- For instance: access to a registered-access beacon
- GA4GH


Operations

• Failover configuration

• 3x machines in Czech Republic and 1x machine in Finland

• Disaster recovery procedures defined for critical components

• The whole infrastructure is monitored

• High-level view via https://login.elixir-czech.org/monitor

• Access statistics: https://login.elixir-czech.org/statistics

• List of connected services: https://login.elixir-czech.org/services

• Privacy Policy: https://perun.elixir-czech.cz/docs/ELIXIRAAIPrivacyPolicy-v1.pdf
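
The slide does not say how failover is actually implemented. As one possible illustration, the sketch below shows a simple ordered-failover check that a monitor or client might run across the four instances. The hostnames in `INSTANCES` are placeholders, and `first_healthy` and `http_probe` are hypothetical helper names, not part of any ELIXIR tooling.

```python
import urllib.request
from typing import Callable, Iterable, Optional

# Placeholder hostnames standing in for the 3 CZ + 1 FI machines (assumed).
INSTANCES = [
    "https://aai-cz-1.example.org/health",
    "https://aai-cz-2.example.org/health",
    "https://aai-cz-3.example.org/health",
    "https://aai-fi-1.example.org/health",
]


def http_probe(url: str, timeout: float = 3.0) -> bool:
    """Treat an HTTP 200 answer within the timeout as healthy."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status == 200


def first_healthy(endpoints: Iterable[str],
                  is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint whose probe succeeds, or None if all fail."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except OSError:
            # Connection errors and timeouts count as "down"; try the next one.
            continue
    return None


# Example with a fake probe: every Czech instance is down, Finland answers.
print(first_healthy(INSTANCES, lambda url: "-fi-" in url))
```

Passing the probe in as a function keeps the selection logic testable without network access; in real use `http_probe` would be supplied instead of the fake.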


e-Infrastructures and ELIXIR CZ

• Providing and operating the IT resources

• https://wiki.metacentrum.cz/wiki/Elixir

[email protected]

• Developing specific IT solutions

• From configuration of resources to software development

• Always fully integrated with national and international “pure” e-infrastructure activities (e.g. now EOSC)

• Helping collaboration with other research infrastructures

• IT part shared

• Internationally involved and visible

• Very active component of ELIXIR CZ


Thank you