e-Infrastructures & ELIXIR CZ
Luděk Matyska
CERIT-SC (ICS MU) & CESNET
e-Infrastructures and ELIXIR: Motivation
• ELIXIR is a research infrastructure dealing with data and the tools to process and analyze them
• A very specific case of an e-infrastructure
• Interaction with “classical” e-infrastructures is critical
• User-provider model not satisfactory
• Synergistic development and evolution needed on both sides
• Collaborative model with higher potential
• Solution for ELIXIR CZ:
Make the e-infrastructures prime partners
National e-infrastructure landscape
• CESNET
• Legal body
• All Czech public universities plus the Czech Academy of Sciences are owners
• Primary national network infrastructure
• Also coordinating national grid, cloud and storage activities
• Plus cyberinfrastructure
• CERIT-SC
• At Masaryk University
• Flexible clouds, development, innovations
• IT4Innovations
• At Technical University Ostrava
• National Supercomputing Centre
National e-infrastructure landscape
• Until 2019 all three e-infrastructures were independent
• All on the national roadmap
• Now a single national e-infrastructure, e-INFRA CZ
• A consortium of all three, coordinated by CESNET
• Not a new legal body, but formal consortium agreement in place
• CESNET & CERIT-SC/MU have a long collaboration history
• Formally dating back to 1998
• MetaCentrum – distributed grid, cloud
• CESNET & CERIT-SC are founding members of ELIXIR CZ
e-Infrastructures and ELIXIR CZ
• CESNET & CERIT-SC founding partners of ELIXIR CZ
• Primary goal: to take responsibility for the IT resources and capacity building
• The IT capacity is fully purchased and operated by the e-infrastructures
• However, this does not mean a single locality of the resources
• Building a distributed e-infrastructure for ELIXIR CZ
• Fully integrated into the national e-infrastructure landscape
• Building on top of their expertise and expanding it towards Life Science community needs
• Control over this capacity is in the hands of ELIXIR CZ, not the e-infrastructures
IT capacity building
• Separate funding stream
• European development funds for capital investment
• Specific to ELIXIR CZ; together with RI funding it covers the total cost (including operational costs like personnel and electricity)
• The current project is built primarily around the e-infrastructure partners within ELIXIR CZ
• CERIT-SC/MU coordinating
• CERIT-SC and CESNET delivering
• Not only hardware, but also commercial software is covered
IT capacity building
• Primarily computing capacity, storage and some software
• Computing capacity
• Clusters with either thin or thick (more CPU, more memory) nodes
• GPU acceleration
• Extensively coordinated with the capital investment of e-infrastructures
• Synergies recognized and fully utilized
• Storage capacity
• Large one (archival) provided by CESNET in its own capacity
• ELIXIR CZ contribution towards working storage space
IT capacity in more detail
• Compute cluster elmo
• 42 servers, 1500 CPUs
• Location: ICS MU and IOCB Prague
• Part dedicated for service/tools development
• Part dedicated to specific ELIXIR services
• Repeat Explorer, (FireProt)
• Part dedicated for Galaxy or Chipster initiated workflows
• Specific MetaCentrum queue dedicated to ELIXIR CZ
• Part of MetaCentrum capacity in Brno
• 45 servers, ELIXIR CZ priority access
• International activities
IT capacity in more detail
• Storage capacity
• Originally 400 TB in Brno and 100 TB in Prague
• Currently extended to 2 PB
• Object storage for cloud use
• Usage
• Next to clusters for data processing
• Data of ELIXIR CZ services, copies of used databases
• Readily available copies of data in home directories
Software
• Run on centrally provided resources
• Virtualization used to extend the availability
• E.g. PEAKS software
• Windows based
• Run in a virtual appliance and thus shared by more groups
• Uses the capacity of the national high speed network
• Ongoing discussions about the feasibility of commercial software
• Black box, rather expensive, not prepared for sharing (setup, licensing model, …)
Current plans
• Cloud (compute and storage) for processing of highly sensitive data
• Support for healthcare oriented research
• First version at CERIT-SC/MU, technology before end of 2019
• However, the technology is the easiest part of this activity
• Extension of compute/processing capacity
• Including GPU acceleration (also for AI)
• Thick nodes with 2-3 TB RAM
• Capacity at IMG
Pilot at IMG
• Coordination of building IT capacities across several research infrastructures
• ELIXIR CZ, CZBI, OPENSCREEN, CCP and e-INFRA CZ
• With contribution from the Institute
• Coordinated by ELIXIR CZ
• Support for “data lake” model
• Combining data from several sources/infrastructures for easier processing and cross correlations
• A model for similar setups at other localities/institutions
General services
• Operation of stable services in a secured environment
• Including virtualization services and support for containerization
• Cppredictor, ccmi, Chipster, web ELIXIR CZ
• Resources for services (including workflow engines)
• Chipster, Galaxy, RepeatExplorer, FireProt
• Complex environment for service/tools development
• Including GUI for bioinformaticians (Chipster and Galaxy)
• Application software
• Commercial, like CLC Workbench, PEAKS Studio, or Mascot
• Open source systems available through the whole MetaCentrum
General services
• Storage for working copies or archives
• For archives CESNET data services are used
• Support for training and education
• UCHT: Genomics: analysis and algorithms
• Platform to run training
• But also learning how to adapt e-infrastructure to specific Life Science needs:
• Discussion on SLA, responsibility of individual partners
• Monitoring, acceptable use policies and rules
• https://wiki.metacentrum.cz/wiki/Elixir
International scope
• ELIXIR CZ is deeply integrated into the European-wide ELIXIR activities
• e-Infrastructure partners play a very active role
• Co-leadership of the ELIXIR Compute Platform
• Task co-leadership in AAI and Clouds
• Leadership of the EOSC-Life project workpackage on Access control and management
• And active, extensive involvement in the cloud workpackage
• Leadership of the CINECA project workpackage (AAI)
• ELIXIR Authentication and Authorization Infrastructure (ELIXIR AAI) is a major internationally (globally) visible contribution
ELIXIR AAI (authentication and authorisation infrastructure)
• ELIXIR Compute platform’s service portfolio for authenticating researchers and helping other services to decide their permissions
• Being implemented as an ELIXIR infrastructure service from 2019
• Will underpin Life Science AAI with other RIs
• Uses standards as much as possible
• Involved in GA4GH DURI
Statistics 09/2019
• 3000 users
• ~4500 logins/month
• 67 relying services
• https://login.elixir-czech.org/statistics
• https://login.elixir-czech.org/services
ELIXIR AAI
• ELIXIR Compute platform’s service portfolio for authenticating researchers and helping other services to decide their permissions
• Being implemented as an ELIXIR infrastructure service from 2019
• The first accepted ELIXIR infrastructure service
• Will underpin Life Science AAI with other RIs
• Activity within EOSC-Life under our leadership
• Based on standards developed by e-infrastructures
• Actively participated in these activities
• Involved in GA4GH DURI
• (Global Alliance for Genomics and Health, global impact)
ELIXIR AAI Statistics (09/2019)
• 3000 users
• ~4500 logins/month
• 67 relying services
• 67 services in pilot
More info on
• https://login.elixir-czech.org/statistics
• https://login.elixir-czech.org/services
ELIXIR AAI history – where we are now
• Use case gathering -- Autumn 2014
• https://docs.google.com/document/d/12fBLl8WenlxABQDzYEAJ6NtOhIXnZBPiGQLuhFUCyqo/edit
• Requirements and design -- Spring 2015
• https://docs.google.com/document/d/1CMY1np3GyvPD8LcKvXljXcRO04V2zu3n_Jcg19jgNOw/edit
• Deployment starts -- Autumn 2015 -- EXCELERATE WP4.3.1
• Part of ELIXIR Compute platform, operated by CZ and FI
• First release -- Autumn 2016
• Until then, ELIXIR AAI in pilot status
• Key components up and running already
ELIXIR AAI design
[Figure: ELIXIR AAI architecture diagram]
• External authentication (e-infrastructures): eduGAIN IdPs, common IdPs
• ELIXIR AAI components: ELIXIR Proxy IdP, ELIXIR Directory, bona fide management, dataset authorisation management (REMS), group/role management (Perun), credential translation, attribute self-management, step-up AuthN
• Relying services: EGA, eLearning, cloud, Beacon, wiki, data archive, …
ELIXIR AAI design
ELIXIR Proxy IdP
• User has one ELIXIR identity
• User can authenticate using (several) external identities
• Proxy IdP consolidates the IDs
• Acts as SAML2 or OpenID Connect IdP for relying services
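The consolidation step described for the Proxy IdP can be illustrated with a minimal sketch. It is not the actual ELIXIR AAI or Perun implementation; the class names, the `(issuer, subject)` key, and the example identifiers are all assumptions made for illustration. The idea shown is only that several external identities map to one internal identity, which is what a relying service eventually sees.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

ExternalId = Tuple[str, str]  # (issuer, subject) as asserted by an external IdP


@dataclass
class Account:
    """One consolidated identity (hypothetical model, not the real schema)."""
    elixir_id: str
    linked: Set[ExternalId] = field(default_factory=set)


class IdentityDirectory:
    """Maps many external identities onto a single internal identity."""

    def __init__(self) -> None:
        self._accounts: Dict[str, Account] = {}
        self._owner: Dict[ExternalId, str] = {}

    def register(self, elixir_id: str, issuer: str, subject: str) -> Account:
        """Create an account (if needed) and link its first external identity."""
        account = self._accounts.setdefault(elixir_id, Account(elixir_id))
        self.link(elixir_id, issuer, subject)
        return account

    def link(self, elixir_id: str, issuer: str, subject: str) -> None:
        """Attach a further external identity to an existing account."""
        if elixir_id not in self._accounts:
            raise KeyError(f"unknown account: {elixir_id}")
        key = (issuer, subject)
        owner = self._owner.get(key)
        if owner is not None and owner != elixir_id:
            raise ValueError("external identity already linked to another account")
        self._accounts[elixir_id].linked.add(key)
        self._owner[key] = elixir_id

    def resolve(self, issuer: str, subject: str) -> Optional[str]:
        """Return the internal identity for a login, or None if unknown."""
        return self._owner.get((issuer, subject))


directory = IdentityDirectory()
directory.register("user-0001", "https://idp.university.example", "alice")
directory.link("user-0001", "https://social.example", "a-12345")
```

Whichever external identity the user logs in with, `resolve` returns the same internal identifier, so relying services never need to know which external IdP was used.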
ELIXIR AAI design
Multi-factor authentication (step-up AuthN)
• When requested by the relying party
• With a smartphone app
• Based on the TOTP standard
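For readers unfamiliar with TOTP, the following is a minimal sketch of how a time-based one-time password is computed and checked according to RFC 6238 (HMAC-SHA1 variant, built on HOTP from RFC 4226). It uses only the Python standard library. It is not the ELIXIR AAI code; the function names, the 30-second step, and the one-step tolerance window are assumptions chosen for illustration.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP value (RFC 4226) for a shared secret and an 8-byte counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP value (RFC 6238): HOTP over the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, candidate: str, at: Optional[float] = None,
           step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side,
    to tolerate small clock drift between phone and server."""
    now = time.time() if at is None else at
    current = int(now // step)
    for delta in range(-window, window + 1):
        counter = current + delta
        if counter < 0:
            continue
        expected = hotp(secret, counter, len(candidate))
        if hmac.compare_digest(expected, candidate):
            return True
    return False


# Shared secret from the RFC 6238 test vectors (ASCII "12345678901234567890").
RFC_SECRET = b"12345678901234567890"
print(totp(RFC_SECRET, at=59, digits=8))  # RFC 6238 test vector: 94287082
```

The smartphone app and the server hold the same secret and compute the same code independently; the server only compares the submitted value, which is why `verify` uses a constant-time comparison.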
ELIXIR AAI design
Bona Fide researchers
• Anyone can have an ELIXIR ID
• Bona Fide researcher: a member of the bioinformatics community in good standing
• For instance: access to a registered access beacon
• GA4GH
Operations
• Failover configuration
• 3x machines in Czech Republic and 1x machine in Finland
• Disaster recovery procedures defined for critical components
• The whole infrastructure is monitored
• High-level view via https://login.elixir-czech.org/monitor
• Access statistics
• https://login.elixir-czech.org/statistics
• List of connected services
• https://login.elixir-czech.org/services
• Privacy Policy
• https://perun.elixir-czech.cz/docs/ELIXIRAAIPrivacyPolicy-v1.pdf
e-Infrastructures and ELIXIR CZ
• Providing and operating the IT resources
• https://wiki.metacentrum.cz/wiki/Elixir
• Developing specific IT solutions
• From configuration of resources to software development
• Always fully integrated with national and international “pure” e-infrastructure activities (e.g. now EOSC)
• Helping collaboration with other research infrastructures
• IT part shared
• Internationally involved and visible
• Very active component of the ELIXIR CZ
Thank you