www.elixir-europe.org
@ELIXIREurope
ELIXIR-EXCELERATE is funded by the European Commission within the
Research Infrastructures programme of Horizon 2020, grant agreement number
676559.
ELIXIR’s Human Data Communities IMI FAIRplus project
Jen Harrow, ELIXIR Tools Platform Coordinator
agriculture
medicine
bioindustries
environment
ELIXIR’s mission is to operate a sustainable European infrastructure for biological information, supporting life-science research and its translation to society, the bio-industries, environment and medicine
ELIXIR’s strategy is to connect national bioinformatics centres and EMBL-EBI into a distributed infrastructure built from coordinated national and international data resources, tools and services
Development of ELIXIR
EOSC-Life, EJP-RD, FAIRplus
ELIXIR in numbers
• 22 Members and 1 Observer
• ~ 180 institutes involved
• 700+ staff
• 18 Core Data Resources
• 21 Implementation Studies ongoing or soon to start
• 27 papers in ELIXIR F1000 channel
• 300 live events in TeSS
• 400 companies attended Innovation and SME programme
Overview of ELIXIR Platforms:
Data Platform: Supporting the whole ELIXIR ecosystem of data resources
ELIXIR Core Data Resources
“of fundamental importance to all research in the life sciences”
ELIXIR Data Resources
Prioritised by national programmes; reviewed by ELIXIR SAB
Huge importance to specific research communities
Working towards the launch of Global BioData Coalition
• Global BioData Coalition is now being formed
• Coalition of funders - worldwide - to coordinate and sustain global biodata landscape
• HIRO support, WT, NIH and AMED (Japan) leading
• 2 year project to develop coalition
• Initial funding from WT, NIH - approaching additional funders globally
• Project plan agreed, start Q4 2018
• ELIXIR: Lead consultation on indicators and procedure with global stakeholders
• Based on ELIXIR Methodology
Interoperability Platform
Findable, Accessible, Interoperable, Reusable
Tech
Services
Standards
Adoption
Interoperability services and practices to support FAIR data and interoperability activities
Use case focused
Data providers
Data integrators
With international initiatives, from community grassroots to government programmes.
Capacity building
The 2018 Recommendations of RIRs
Resource: Description
FAIRsharing (UK): Registry of curated metadata of DBs, policies and standards
g:Profiler (EE): Gene-centric data integrator, with web UI and API services
Identifiers.org (EBI): Identification and resolution system for life science, provider of compact identifiers and URIs
InterMine (UK): Framework to integrate life sciences data based on an extensible data model, providing a web interface and RESTful web services
ISA Framework (UK): ISA (Investigation > Study > Assay) helps researchers to provide rich description of experimental metadata so that the resulting data and discoveries are reproducible and reusable
Ontology Lookup Service (EBI): Repository for biomedical ontologies that aims to provide a single point of access to the latest ontology versions through a web UI or RESTful API
3DBIONOTES API* (ES): A reusable platform-independent API call component for protein metadata alignment, annotation, and integration across major protein data resources
BridgeDb (NL): A combination of a software framework and an API for mapping identifiers for related objects in life sciences
DisGeNET API* (ES): API and SPARQL endpoint for genetic variants (human disease data)
MOLGENIS (NL): A software package to help researchers set up an online database application that supports data queries and allows data sharing
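The compact identifiers listed for Identifiers.org take the form prefix:accession and are resolved by appending them to https://identifiers.org/. The following is a minimal sketch of that idea only: the function names and the simplified pattern are assumptions made for illustration, not the official grammar or client, and no network request is made.

```python
import re

# Public resolver base URL for Identifiers.org compact identifiers.
RESOLVER_BASE = "https://identifiers.org/"

# Simplified shape check (an assumption of this sketch, not the official
# grammar): a lowercase namespace prefix, a colon, then a non-empty accession.
COMPACT_ID = re.compile(r"^(?P<prefix>[a-z0-9._]+):(?P<accession>\S+)$")


def parse_compact_id(compact_id: str) -> tuple[str, str]:
    """Split 'prefix:accession' into its two parts."""
    match = COMPACT_ID.match(compact_id.strip())
    if match is None:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return match.group("prefix"), match.group("accession")


def resolver_url(compact_id: str) -> str:
    """Build the resolver URL for a compact identifier (no network access)."""
    prefix, accession = parse_compact_id(compact_id)
    return f"{RESOLVER_BASE}{prefix}:{accession}"


if __name__ == "__main__":
    print(resolver_url("taxonomy:9606"))
```

Namespaces whose accessions embed an uppercase prefix are outside this simplified pattern and would need the registry's own rules.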
The Tools Platform
• Raise software quality and sustainability, by producing and promoting software best practices and developing training activities
• bio.tools, a discovery portal for bioinformatics software information, providing curated descriptions of tools and data services
• OpenEBench, an infrastructure providing services for hosting scientific benchmark activities and technical monitoring of bioinformatics tools and services
• To support efforts around software packaging & containers, e.g. BioConda/BioContainers, and support sustainable integration into bio.tools and OpenEBench
• To drive the development of execution platforms (e.g. Galaxy) and ensure integration with bio.tools, OpenEBench and workflows using CWL
• Tools Interoperability: guidelines and resources for guaranteeing platform integration within the ELIXIR Tools Platform ecosystem, with other platforms at ELIXIR and beyond
Compute Platform: Access, exchange and storage
• Mission: Develop distributed solutions for cloud, compute, storage services including user authentication and access control
• Coordination of dependencies with e-infrastructures (esp. GÉANT, EGI, EUDAT) in collaboration with biological and medical research infrastructures (CORBEL)
Tommi Nyrönen, FI; Luděk Matyska, CZ; Steven Newhouse, EBI
ELIXIR Authorisation and Authentication Infrastructure
Enables life science researchers to use their institutional IDs to access services and data:
• Reduced bureaucracy and costs
• Improved vetting: federated identities provide greater confidence to the service and data providers
• Regular updates: as researchers join or leave institutions, their affiliation information is maintained regularly
• Improved access to usage metrics: consistent use of accounts allows service providers to better analyse the use of their services
• Applicable to other research infrastructures (CORBEL)
Training Platform: TeSS Portal (Training eSupport System)
• Platform to disseminate, discover & package training resources, training materials and events – led by ELIXIR UK
• Aggregating information from ELIXIR nodes and various 3rd-party content providers
http://tess.elixir-uk.org
ELIXIR Communities connect infrastructure with life-science research experts across Europe
• ELIXIR Communities are formed around domain experts in our Nodes
• Include non-ELIXIR partners
• ELIXIR Communities provide a mechanism for long-term collaborations with other ESFRI and large-scale initiatives
• ELIXIR Communities will drive the service developments in the ELIXIR Platforms and provide framework to develop and maintain community standards
Partnerships and community formation
ELIXIR Human Data Communities
• Federated Human Data
• Rare Diseases
• Human Copy Number Variation
ELIXIR Human Genomics & Translational Data: Tools developed within ELIXIR
Data Discoverability: Federating lightweight discoverability of data and datasets across ELIXIR
Data Archival: Utilising the ELIXIR Deposition Databases to ensure secure, long-term, efficient archival of data
Federated Data Access: Coordinating a collection of interoperable EGA-like resources to ensure secure management of sensitive data across the ELIXIR Nodes
Data Analysis: Bringing ‘analysis to data’ via common workflow languages, workflows, containers, and tools
ELIXIR Beacon - GA4GH Driver Project
ELIXIR Federated Human Data Community - htsget/htsref
bio.tools
Serena Scollen and Gary Saunders
Developing standards and tools: ELIXIR and GA4GH to develop strategic partnership
Simplify the way people search for and request access to potentially identifiable data in international and national genomic data resources
Working towards GA4GH standards, APIs and toolkits to be used throughout
8 GA4GH Workstreams
8 new driver projects announced in Feb
15/22 ELIXIR Nodes involved
Public data discovery web-service: Beacon Driver Project
Query: Do you have information about the allele “C at position 32,936,732 on chromosome 13”?
Response: Yes / No (+ optional metadata about the allele)
Beacon X: Yes
Beacon Y: No
Beacon Z: No
…
https://beacon-network.org
www.elixir-europe.org/beacons
9 Nodes have lit Beacons
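The Beacon exchange on this slide can be sketched as a yes/no lookup asked of several independent resources. This is an in-memory toy under stated assumptions: the class and function names are hypothetical, the allele holdings are made up to mirror the slide, and it does not use the GA4GH Beacon API or contact any real Beacon.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AlleleQuery:
    """The question on the slide: a given base at a position on a chromosome."""
    chromosome: str
    position: int
    allele: str


@dataclass
class Beacon:
    """A resource that only reveals whether it has seen an allele."""
    name: str
    known_alleles: set = field(default_factory=set)  # illustrative holdings

    def has_allele(self, query: AlleleQuery) -> bool:
        return query in self.known_alleles


def ask_network(beacons, query):
    """Ask every Beacon the same question and collect Yes/No answers."""
    return {b.name: ("Yes" if b.has_allele(query) else "No") for b in beacons}


if __name__ == "__main__":
    question = AlleleQuery(chromosome="13", position=32936732, allele="C")
    network = [
        Beacon("Beacon X", {question}),  # made-up: X holds the allele
        Beacon("Beacon Y"),
        Beacon("Beacon Z"),
    ]
    for name, answer in ask_network(network, question).items():
        print(f"{name}: {answer}")
```

The design point the sketch tries to keep is that each Beacon answers for itself and returns only presence or absence, never the underlying records.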
Federation of human genome data
• Many national datasets from human research participants need to be stored locally
• ELIXIR developing a federation with shared metadata (FAIR) and local data store (secure)
• Linking local EGA to
• national clouds
• international access (ELIXIR-AAI - Authentication and Authorisation Infrastructure)
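The split between shared metadata and a local, secure data store can be illustrated with a small sketch. Everything here is hypothetical: the class names, the record fields and the plain user id (standing in for an identity that ELIXIR-AAI would vouch for) are assumptions for illustration, not how EGA or ELIXIR-AAI are implemented.

```python
class LocalNode:
    """A national node: data stays here and access is decided here."""

    def __init__(self, name):
        self.name = name
        self._data = {}        # dataset_id -> payload, held locally
        self._permitted = {}   # dataset_id -> set of user ids allowed to read

    def deposit(self, dataset_id, payload, description):
        """Store data locally and return only the metadata to be shared."""
        self._data[dataset_id] = payload
        self._permitted[dataset_id] = set()
        return {"dataset_id": dataset_id, "node": self.name,
                "description": description}

    def grant(self, dataset_id, user_id):
        """Record that a (hypothetical) authenticated user may read a dataset."""
        self._permitted[dataset_id].add(user_id)

    def fetch(self, dataset_id, user_id):
        """Release data only to users this node has approved."""
        if user_id not in self._permitted.get(dataset_id, set()):
            raise PermissionError(f"{user_id} may not read {dataset_id}")
        return self._data[dataset_id]


class SharedCatalogue:
    """Federation-wide index that holds metadata only, never the data."""

    def __init__(self):
        self._records = []

    def register(self, record):
        self._records.append(record)

    def search(self, keyword):
        needle = keyword.lower()
        return [r for r in self._records if needle in r["description"].lower()]
```

A user would search the catalogue to find which node holds a dataset, then request access from that node, so discovery is federated while the data itself never leaves local control without approval.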
Sharing genomic data across borders
EU declaration - 2018
Currently signed by:
Austria, Bulgaria, Croatia, Cyprus, Czech Republic, Estonia, Finland, Greece, Italy, Latvia, Lithuania, Luxembourg, Malta, Portugal, Slovenia, Spain, Sweden, Netherlands and the UK
ELIXIR members
“Leveraging European infrastructures to access one million human genomes by 2022”
• Coordinated, secure, federated environment will enable population scale genomic, phenotypic, and biomolecular data to be accessible across international borders
• Lessons learned & solutions developed should be taken from existing infrastructures, and ongoing data sharing efforts in cancer, population genetics & rare disease areas
• Need to empower data scientists with knowledge and tools
Saunders G et al., pre-submission acceptance to Nature Reviews Genetics
New ELIXIR initiative: FAIRplus, to develop tools and guidelines for making life science data FAIR
ELIXIR - Project Coordinator
Janssen - Project Leader
22 participants: 12 academic, 7 EFPIA, 3 SME
€8.23M budget: €4M H2020 EC funding + €4.23M EFPIA in-kind
42 months (Jan 2019 - June 2022)
FAIRplus: Aims
• Establish a value-based process for prioritisation and selection of Innovative Medicines Initiative (IMI) project databases
• Develop a FAIRification toolkit, e.g. develop guidelines, tools and metrics - FAIR Cookbook
• Apply this toolkit to FAIRify datasets from selected IMI projects (>20 selected using a value-based selection process) and EFPIA companies
• Deliver training for data handlers (academia, SMEs and pharmaceuticals) to change and sustain the data management culture, e.g. Fellowship scheme
• Foster an innovation ecosystem on FAIR open data to power future reuse, knowledge generation and societal benefit, e.g. FAIR innovation and SME events
Our consortia
Concept
Tools available to FAIRplus
OmicsDI
Identifiers.org
Bioschemas
ELIXIR-LU Data catalogue
Containerisation tools
FAIR cross capability
CMMI (Capability Maturity Model Integration) for processes and datasets
1. Initial
2. Repeatable
3. Defined
4. Managed
5. Optimizing
Using a “design, use and refine” cycle we will iterate through the processes and products
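The five maturity levels above can be treated as an ordered scale, which makes it easy to see how far each FAIR capability is from a target. The sketch below is illustrative only: the assessment values and the helper name are hypothetical and do not come from the FAIRplus project.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    """The five levels listed on the slide, ordered from lowest to highest."""
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    OPTIMIZING = 5


def levels_to_climb(assessment, target):
    """For each capability, count how many levels remain to reach the target."""
    return {capability: max(0, target - level)
            for capability, level in assessment.items()}


if __name__ == "__main__":
    # Hypothetical assessment of one dataset, one score per FAIR capability.
    dataset = {
        "Findable": MaturityLevel.DEFINED,
        "Accessible": MaturityLevel.REPEATABLE,
        "Interoperable": MaturityLevel.INITIAL,
        "Reusable": MaturityLevel.MANAGED,
    }
    print(levels_to_climb(dataset, MaturityLevel.DEFINED))
```

Re-running such an assessment after each pass fits the “design, use and refine” cycle described on the slide.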
Use Cases: Strong links to past Innovative Medicines Initiative (IMI) projects
STAGE 1 CONSORTIUM PARTICIPANTS:
ADAPT-SMART: LYG
ADVANCE: Synapse
AETIONOMY*: UL
AMYPAD: Synapse
APPROACH: LYG, ITTM
BEAT-DKD: SIB
BioVacSafe: CDISC
DDMoRE: EMBL-EBI, LYG
DRIVE AB: Synapse
EBiSC: EMBL-EBI, Fraunhofer
EPAD: Synapse
Ebola+: HYVE
EHR4CR: EMBL-EBI, CDISC
ELF: LYG
EMIF: EMBL-EBI, IMIM, HYVE, ITTM, Synapse
EMTRAIN: EMBL-EBI
e-Tox*: EMBL-EBI, BSC, IMIM, Synapse
eTRANSAFE: ELIXIR Hub, EMBL-EBI, BSC, IMIM
eTRIKS*: UOXF, UL, ICL, HYVE, OntoForce, CDISC, ITTM
EU-AIMs: EMBL-EBI
HARMONY: Synapse
IMPRiND: UOXF
IMIDIA: SIB
iPiE: IMIM, Synapse
K4DD: Fraunhofer
ND4BB*TRANSLOCATION: HYVE, Fraunhofer
Open PHACTS*: EMBL-EBI, UNIMAN, BSC, IMIM, HWU, UM, PHACTS, OntoForce
RADAR-CNS: LYG, HYVE
RESCEU: Synapse
RHAPSODY: SIB
ROADMAP: Synapse
SAFE-T: ITTM
TransQST: EMBL-EBI, IMIM, UM, Synapse
BigData@Heart: HYVE
EFPIA PARTICIPANTS (selected examples with highest relevance to this topic):
JANSSEN: OncoTrack, OpenPHACTS, eTRIKS, DO->IT, HARMONY, eTOX, eTRANSAFE, ELF, K4DD
AZ: OncoTrack, OpenPHACTS, eTRIKS, eTOX, eTRANSAFE, ELF, K4DD
LILLY: OncoTrack, OpenPHACTS, eTRIKS, DO->IT
GSK: OpenPHACTS, eTRIKS, DO->IT, eTOX, K4DD
NOVARTIS: OpenPHACTS, DO->IT, HARMONY, eTOX, eTRANSAFE
BAYER: OncoTrack, eTRIKS, DO->IT, HARMONY, eTOX, eTRANSAFE, ELF, K4DD
BI: OncoTrack, DO->IT, eTOX, eTRANSAFE
Plus ReSOLUTE
Communications
FAIRplus Website: www.fairplus-project.eu
FAIRplus Twitter: https://twitter.com/FAIRplus_eu
● Tweet using #FAIRplus
IMI forum LinkedIn: https://www.linkedin.com/company/innovative-medicine-initiative-joint-undertaking/
Contact us: [email protected]
CONFIDENTIALITY: Please respect that this project is EC funded. If you would like to disseminate or use any of the information from the work plan or outcomes, please contact the Coordinators [email protected]
18th-22nd Nov 2019 Paris
Themes can include: text mining, structured metadata, identifiers, data distribution, data integration, data validation, tools, containers, tools discovery, and training materials
Opportunities for companies to submit hacking topics
Contact: [email protected]
Thank you!