HIGH THROUGHPUT SEQUENCING, BIOINFORMATICS
AND COMPUTATIONAL GENOMICS
AND INTERNATIONAL ANIMAL HEALTH INFORMATION
THE OIE STRATEGY
Vincenzo Caporale Incheon, 15 October, 2014
ACKNOWLEDGEMENT
OIE-AD HOC GROUP ON HIGH THROUGHPUT SEQUENCING,
BIOINFORMATICS AND COMPUTATIONAL GENOMICS (HTS-BCG)
Dr Sándor Belak
Dr Carlos Barroso
Dr Peter Daniels
Prof Massimo Palmarini
OIE - Collaborating Center for Viral Genomics and Bioinformatics
OIE – Collaborating Center for Development and Production of Vaccines,
Pharmaceutical Products and Veterinary Diagnostic Systems using Biotechnology
OIE – Collaborating Center for Biotechnology-based Diagnosis of Infectious
Diseases in Veterinary Medicine
OIE – Reference Laboratory for Bluetongue
PILOT PROJECT & MOCK UP
Prof Massimo Palmarini
Dr Robert Gifford
Dr Antonino Caminiti
Dr Emiliana Brocchi
Dr Marco Caporale
Dr Silvano Galasso
University of Glasgow Centre for Virus Research - Glasgow
Istituto Zooprofilattico della Lombardia e Emilia Romagna - Brescia
The role of genetic material profiling and complete genome sequencing in microbial infections is increasing in:
a. pathogenesis and immunity
b. diagnosis
c. management
d. characterisation of the infectious agent
e. likely distribution of their spread in time and space
In the current and future scientific environment a microorganism is not identified satisfactorily unless essential features of its genome are described
GENETIC PROFILING AND ANIMAL HEALTH
The increasing reliance on generating and using sequence information, in particular the trend toward whole genome sequencing and metagenomics
The inevitable introduction of devices and systems for the generation of sequence data away from the laboratory and closer to the point of sampling on the farm or at other points along the “value chain”
a. THE ROLE, DESIGN AND MANAGEMENT OF VETERINARY LABORATORIES
The ever increasing trend toward global open information systems
b. THE NOTIFICATION AND MANAGEMENT OF INFECTIOUS DISEASES AND FOOD-BORNE INFECTIONS
HIGH THROUGHPUT SEQUENCING, BIOINFORMATICS AND COMPUTATIONAL GENOMICS
HAVE CRUCIAL AND FAR REACHING IMPLICATIONS FOR
OIE RESPONSIBILITY AND INSTITUTIONAL ROLE IN
THE GLOBAL MANAGEMENT OF ANIMAL
HEALTH AND FOOD SAFETY
THE OIE HAS THE RESPONSIBILITY OF A LEADING AND CENTRAL INSTITUTIONAL ROLE
a. the collection and distribution of animal health information worldwide
b. the development and adoption of international standards for animal health and welfare, including diagnostic tests and vaccines
OIE RESPONSIBILITY IN THE GLOBAL MANAGEMENT OF ANIMAL
HEALTH AND FOOD SAFETY
THE OIE STRATEGY
1. ADOPTION OF STANDARDS FOR SEQUENCE PRODUCTION, ASSEMBLY AND USE IN THE LABORATORY, ON THE FARM AND AT ANY POINT ALONG THE “VALUE CHAIN” LINKING ANIMALS TO CONSUMERS
2. CREATION OF AN OIE PLATFORM FOR THE COLLECTION AND MANAGEMENT OF PATHOGEN GENOMIC SEQUENCES AND RELATED METADATA INTEGRATED WITH EXISTING WORLD ANIMAL HEALTH INFORMATION SYSTEM (WAHIS)
DEVELOPING THE OIE
STANDARDS
1. Sequence and sequence analysis data should be considered an integral and necessary part of the reporting of cases and outbreaks to the OIE
THE OIE STRATEGY DEVELOPING THE OIE STANDARDS
The use of new technological tools, such as HTS-BCG, in the management of diseases of animals
2. Standards for sequence production and assembly should be generated
a. adopted within the context of tried and accepted processes for the management of animal health and food safety, including the laboratory quality assurance system
b. appropriate to the purpose of the investigation
c. sampling and interpretation of the results must take account of the pathogenesis and epidemiology of the infection in the animal species under study, and must be carried out by appropriately qualified veterinary investigators within the appropriate regulatory framework
Methods to obtain pathogen genomic sequences, including HTS
Sequencing quality (quality of reads has to be appropriate to the purpose)
Procedures for sequence assembly
Reference genomes
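As an illustration of checking that read quality is "appropriate to the purpose", the sketch below decodes a FASTQ quality string into Phred scores and compares the mean per-base error probability with a purpose-specific limit. It assumes the common Phred+33 encoding; the function names and any limit passed in are hypothetical and are not OIE standards.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def mean_error_probability(quality_line, offset=33):
    """Average per-base error probability, using p = 10 ** (-Q / 10)."""
    scores = phred_scores(quality_line, offset)
    if not scores:
        raise ValueError("empty quality string")
    return sum(10 ** (-q / 10) for q in scores) / len(scores)


def read_is_fit_for_purpose(quality_line, max_mean_error):
    """True when the read's mean error probability is within the stated limit.

    max_mean_error is a hypothetical, purpose-specific threshold chosen by
    the investigator; it is not a value taken from any OIE standard.
    """
    return mean_error_probability(quality_line) <= max_mean_error
```

A stricter limit might be chosen for a reference genome than for a rapid field identification; the point of the sketch is only that the limit is a parameter tied to the purpose of the investigation.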
DEVELOPING AN INFORMATION
PLATFORM
THE OIE STRATEGY
1. To create the OIE platform for the collection and management of pathogen genomic sequences and related metadata
2. To implement a process to reach consensus to define:
Policy of data governance: property rights, accessibility, proper use
Data standards to describe sequence production and assembly
File formats for storage
Policy to support the use of open source software and licensing
The OIE Reference Centres and other national sources should provide the expertise for:
developing policies, practices and standards to generate, manage and use sequence information in animal diseases and related food safety infections, diagnosis and management
the scientific understanding of the use of sequence information of microorganisms, in relation to animal health and food safety
THE OIE STRATEGY THE ROLE OF THE GLOBAL NETWORK OF OIE REFERENCE CENTRES
THE PILOT PROJECT
OBJECTIVES OF THE PILOT PROJECT
1. Provide the blueprint for the creation of an open access platform for the collection and management of pathogen genomic sequences and related metadata, managed and coordinated by the OIE and fully integrated into the World Animal Health Information System (WAHIS), defining, among other things, property rights, accessibility and proper use of data
2. Update the OIE Manuals by:
defining reference genomes
defining standards for production, assembly and storage of genomic sequences to be integrated into the Manuals
3. The network of OIE Collaborating Centres will be the main players in creating the platform and in defining the standards
4. Facilitate the participation of non-OIE Reference Laboratories in the platform and the uploading of their own sequences
A pilot project has been initiated
THE PLATFORM
THE OIE PLATFORM THE PLATFORM
The open access platform for the collection and management of pathogen genomic sequences and related metadata will
Provide assistance and expertise for sequence assembly and data analyses within and outside the OIE Reference Laboratory network
Gather genomic sequences from the OIE Reference Laboratories, national laboratories, laboratories of scientific institutions, etc.
Provide services for data storage
Provide sequencing information within and outside the OIE Reference Laboratory network
THE OIE PLATFORM OVERVIEW
[Diagram components: Service Center at OIE Headquarters (epidemiological database, sequence database); OIE Reference Laboratories with local database (epidemiological database, sequence database); OIE Reference Laboratories without local database; other laboratories with local database (epidemiological database, sequence database); other laboratories without local database; web services carrying sequences and epidemiological data; reports; users; OIE-WAHIS; other genome and epidemiological databases (e.g. GMI, FAO)]
THE SERVICE CENTER &
THE PERIPHERAL NODES
THE OIE PLATFORM THE SERVICE CENTER AND THE PERIPHERAL NODES
Data exchange and services (such as those for sequence assembly, sequence comparison and data analysis, etc.) will be assured by the Service center to all laboratories within the system
Access and property rights to the data will be ensured for the laboratory that provided them, regardless of where the data are physically stored
THE SERVICE CENTER IS LOCATED AT THE HEADQUARTERS
to assure any other information exchange within and outside the platform
to manage the distributed database of the genomic sequences
to link the database and WAHIS
to provide access to data
THE KEY PERIPHERAL NODES ARE THE OIE REFERENCE LABORATORIES
to supply the genomic sequences and the epidemiological data to the Service center
to provide assistance and expertise for sequence assembly and data analyses within and outside the OIE Reference Laboratory network
THE ARCHITECTURE
THE OIE PLATFORM THE ARCHITECTURE
OIE PATHOGEN GENOMICS PLATFORM
[Diagram components: Database (store epidemiological and sequence data; index, search, retrieve); Connection module (link to WAHIS; interoperability and services); authentication and authorization management; Administration module (visibility and rights management; workflow management); Analytical module (sequence assembly; sequence comparison; other tools for data analysis). Connected parties: Headquarters, OIE Reference Laboratories, other laboratories, other users]
THE BASIC COMPONENTS
The genomic sequence database will allow the storing, indexing and searching of genomic sequences
THE OIE PLATFORM THE BASIC COMPONENTS
The database will be characterized by a high degree of flexibility
to manage files that differ in structure (type and organisation of the information in the file), format (plain text, compressed formats, etc.) and size (from a few megabytes to many gigabytes)
to allow pipelines to be run that vary in programming language, data input and scope (sequence assembly, quality evaluation, phylogeny, etc.)
For each genomic sequence, the database will collect and store data on:
the sequencer equipment used
the pipelines chosen
the epidemiological findings
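The per-sequence metadata described here could be sketched as a simple record type. This is only an illustration: the class name, field names and example values below are hypothetical, since the presentation does not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class SequenceRecord:
    """Hypothetical metadata stored alongside one genomic sequence."""
    sequence_id: str
    pathogen: str
    submitting_laboratory: str
    sequencer: str                                   # sequencer equipment used
    pipelines: List[str] = field(default_factory=list)  # pipelines chosen
    epidemiological_findings: Dict[str, str] = field(default_factory=dict)

    def to_json(self):
        """Serialise the record as plain JSON text for storage or exchange."""
        return json.dumps(asdict(self), sort_keys=True)


# Example values are invented for illustration only.
record = SequenceRecord(
    sequence_id="SEQ-0001",
    pathogen="Bluetongue virus",
    submitting_laboratory="Example Laboratory",
    sequencer="example benchtop sequencer",
    pipelines=["quality evaluation", "sequence assembly"],
    epidemiological_findings={"species": "sheep", "country": "Exampleland"},
)
```

Keeping the record as plain, serialisable data is one way to stay neutral about file formats and storage back ends, which the platform has yet to standardise.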
1. GENOMIC SEQUENCE DATABASE
The database will be distributed over a network of interconnected computers located in the laboratory network
In all cases responsibility for data quality and data property will belong to the laboratory inputting them.
Laboratories that do not possess their own database will have the possibility of participating in the worldwide OIE genomic database that will provide them with the necessary infrastructure for raw data management
a. Each laboratory represents an independent node with its own system
b. Each laboratory maintains control over the data produced locally and accesses the entire database, as if it were centralised, to carry out analyses and reporting
c. Each laboratory system must guarantee the same standardised services and functions to ensure platform interoperability
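A minimal sketch of the distributed design, under stated assumptions: each laboratory is an independent node holding its own records, every node exposes the same `search` function (the standardised service), and a federated view queries all nodes as if the database were centralised. The class and method names are hypothetical and the in-memory storage stands in for whatever each laboratory actually runs.

```python
class LaboratoryNode:
    """An independent node that keeps control of its locally produced data."""

    def __init__(self, name):
        self.name = name
        self._records = {}

    def store(self, sequence_id, pathogen):
        # The owning laboratory is recorded with every sequence.
        self._records[sequence_id] = {
            "sequence_id": sequence_id,
            "pathogen": pathogen,
            "owner": self.name,
        }

    def search(self, pathogen):
        # Return copies so callers cannot modify the node's own data.
        return [dict(r) for r in self._records.values()
                if r["pathogen"] == pathogen]


class FederatedDatabase:
    """Queries every node through the same interface and merges the results."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def search(self, pathogen):
        results = []
        for node in self.nodes:
            results.extend(node.search(pathogen))
        return sorted(results, key=lambda r: r["sequence_id"])


lab_a = LaboratoryNode("Lab A")
lab_b = LaboratoryNode("Lab B")
lab_a.store("SEQ-1", "BTV")
lab_b.store("SEQ-2", "BTV")
lab_b.store("SEQ-3", "FMDV")

database = FederatedDatabase([lab_a, lab_b])
hits = database.search("BTV")
```

The search spans both laboratories, yet each hit still names its owning laboratory, which reflects the principle that data property stays with the laboratory that entered the data.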
The connection module assures the connectivity of the various components within the platform and with other relevant information systems
It has two main functions accessible by the operator at the laboratory level:
1. linking sequences to the related immediate notifications, follow-ups and other types of official reports stored in WAHIS
2. enriching genomic sequences with epidemiological data
2. THE CONNECTION MODULE
The epidemiological data enriching the sequence data will be
1. those present in WAHIS, in cases where the sequences are related to an immediate notification
2. a minimal set of epidemiological data to be defined in all other cases
Epidemiological data not related to official reporting will be subject to verification by the OIE before being made publicly available, to assure compliance with the existing international veterinary standards
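The enrichment rule can be sketched as follows: a sequence linked to an immediate notification takes its epidemiological data from WAHIS, while any other sequence must carry a minimal data set and is flagged for OIE verification before publication. The function name, the dictionary layout and, in particular, the contents of `MINIMAL_FIELDS` are hypothetical; the presentation states that the minimal set is still to be defined.

```python
# Hypothetical placeholder: the minimal data set is "to be defined" in the source.
MINIMAL_FIELDS = ("species", "country", "sampling_date")


def enrich_sequence(sequence_id, wahis_notification=None, minimal_data=None):
    """Attach epidemiological data to a sequence according to its origin."""
    if wahis_notification is not None:
        # Case 1: the sequence relates to an immediate notification in WAHIS.
        return {
            "sequence_id": sequence_id,
            "epidemiological_data": dict(wahis_notification),
            "source": "WAHIS",
            "needs_oie_verification": False,
        }

    # Case 2: all other sequences must supply the minimal data set.
    minimal_data = minimal_data or {}
    missing = [f for f in MINIMAL_FIELDS if f not in minimal_data]
    if missing:
        raise ValueError("missing minimal epidemiological data: " + ", ".join(missing))
    return {
        "sequence_id": sequence_id,
        "epidemiological_data": {f: minimal_data[f] for f in MINIMAL_FIELDS},
        "source": "laboratory",
        # Data outside official reporting is verified by the OIE before release.
        "needs_oie_verification": True,
    }
```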
The OIE platform for pathogen genomics will collect and store sensitive information with related security challenges
3. THE ADMINISTRATION MODULE
An Administration module and appropriate standards must be implemented:
To manage visibility and access of data
To guarantee compliance with quality requirements during the uploading of the sequences
To ensure the respect of property rights
To trace information on the use of data
The administration module will integrate specific mechanisms to expand the platform with new services and functions by adding or updating modules.
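Three of the administration functions listed above (managing visibility and access, respecting property rights, tracing the use of data) can be illustrated with a small sketch. The class, its methods and the access rule (owner or public) are assumptions made for illustration; the presentation does not specify an access model.

```python
class AdministrationModule:
    """Hypothetical access control with an audit trail of data use."""

    def __init__(self):
        self._owner = {}    # sequence_id -> owning laboratory
        self._public = {}   # sequence_id -> visibility flag set by the owner
        self.audit_log = [] # (sequence_id, laboratory, outcome) tuples

    def register(self, sequence_id, owner, public=False):
        self._owner[sequence_id] = owner
        self._public[sequence_id] = public

    def can_read(self, sequence_id, laboratory):
        # Assumed rule: the owner can always read; others only if public.
        return self._public[sequence_id] or self._owner[sequence_id] == laboratory

    def read(self, sequence_id, laboratory):
        allowed = self.can_read(sequence_id, laboratory)
        # Every attempt is traced, whether or not access is granted.
        outcome = "granted" if allowed else "denied"
        self.audit_log.append((sequence_id, laboratory, outcome))
        if not allowed:
            raise PermissionError(f"{laboratory} may not read {sequence_id}")
        return {"sequence_id": sequence_id, "owner": self._owner[sequence_id]}
```

Logging denied attempts as well as granted ones keeps the trace of data use complete, which matters when the platform holds sensitive information.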
MODULES FOR SEQUENCE ASSEMBLY
AND COMPARISON &
DATA ANALYSIS
THE OIE PLATFORM MODULES FOR SEQUENCE ASSEMBLY & COMPARISON AND DATA ANALYSIS
The development and sharing of common tools would promote not only knowledge exchange but also the participatory development of standards
The OIE platform for genomics will offer a set of modules, accessible to laboratories participating in the platform, for:
sequence assembly
sequence comparison
data analysis
The modules could be developed by a single Laboratory or a group of Centres within the platform, and then offered to the other members of the platform
DATA FLOW
THE OIE PLATFORM DATA FLOW
The OIE requires that Member Countries:
a. notify any event of epidemiological significance (immediate notifications and follow-ups)
b. transmit periodic reports on the presence or absence of OIE listed diseases (6-monthly
Official communications between Member Countries and the OIE are subject to strict procedures, and the publication of data follows a series of VERIFICATION AND VALIDATION steps before information is made public
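The principle that data passes through verification and validation before it becomes public can be sketched as a small staged workflow. The stage names and the strictly linear order are assumptions for illustration; the actual OIE procedures are not detailed in the presentation.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names; the order mirrors "verification and
    # validation steps before information is made public".
    SUBMITTED = 1
    VERIFIED = 2
    VALIDATED = 3
    PUBLISHED = 4


class Report:
    def __init__(self, report_id):
        self.report_id = report_id
        self.stage = Stage.SUBMITTED

    def advance(self):
        """Move to the next stage; no stage can be skipped."""
        if self.stage is Stage.PUBLISHED:
            raise ValueError("report is already published")
        self.stage = Stage(self.stage.value + 1)
        return self.stage

    @property
    def is_public(self):
        return self.stage is Stage.PUBLISHED
```

Because `advance` only ever moves one step, a report cannot reach publication without first passing through both intermediate stages.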
THE OIE PLATFORM OIE IMMEDIATE NOTIFICATION DATA FLOW
[Diagram: columns for Veterinary Service, Laboratory, Delegate**, OIE Reference Laboratory and OIE WAHIS/WAHID interface. Steps shown: sample*, sample analysis, confirmation, epidemiological data, immediate notification, data verification & validation, data publication]
* Confirmation from an OIE Reference Laboratory is required in case of certain diseases (e.g. FMD)
** Chief Veterinary Officer
THE OIE PLATFORM SUGGESTION FOR IMMEDIATE NOTIFICATION AND SEQUENCE TRANSMISSION DATA FLOW
[Diagram: columns for Veterinary Service, Laboratory, CVO, OIE Reference Laboratory and OIE WAHIS/WAHID interface. Steps shown: sample*, sample analysis, sequence generation, confirmation & sequence, epidemiological data, sequence & epidemiological data**, immediate notification, data verification & validation, data publication]
* Confirmation from an OIE Reference Laboratory is required in case of certain diseases (e.g. FMD)
** The relationship between laboratory and OIE Reference Laboratory regarding sequence transmission needs to be defined
THE OIE PLATFORM GENERIC TRANSMISSION OF SEQUENCES AND EPIDEMIOLOGICAL DATA
[Diagram: columns for Laboratory / OIE Reference Laboratory, CVO and OIE WAHIS/WAHID interface. Steps shown: sample analysis, sequence generation, sequence & epidemiological data, data verification & validation (verification if required), data publication]