![Page 1: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/1.jpg)
Context is Everything: Integrating Genomics,
Epidemiological and Clinical Data Using GenEpiO
On behalf of the GenEpiO Development Team
Will Hsiao & Damion Dooley (BC Public Health Lab),
Emma Griffiths & Fiona Brinkman (SFU)
1
![Page 2: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/2.jpg)
Sequencing & Bioinformatics
• Sequencing, Assembly Pipeline Parameters
• QA/QC Metrics
• Tree Construction Details
Isolate Source
• Food, Clinical, Environment
• Food category, Body Product
• Dates, Location
Clinical and Epi Details
• Demographics
• Host disease, Symptoms
• Lab Test Results
• Exposures
2
Genomic sequences don’t mean much without contextual information.
![Page 3: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/3.jpg)
When requisitioning metadata, you need to anticipate the needs of downstream users.
Descriptive – Organized – Standardized
FAIR Principles of Data Management:
F – Findable
A – Accessible
I – Interoperable
R – Reusable
Published in Scientific Data (Nature Publishing Group), March 2016 (BioScience experts, ELIXIR)
3
![Page 4: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/4.jpg)
Sample Source
Lab Information Management Systems: Collection of isolates, metadata, result reporting
Public Repositories: Sequence submission, distribution of sequences
Downstream Users: Public Health agencies, Surveillance Programs, Researchers
Variable Metadata Collection at Source
Metadata Silos: Barriers to Data Sharing
Minimal Metadata Requirements in Public Repositories: Barriers to Data Integration
Downstream Food Safety and Public Health Activities Requiring Integrated Contextual Information:
• Outbreak Investigation
• Source Attribution
• Risk Assessment
• More…
Disconnect between Upstream Sequence Providers and Needs of Downstream Users
• Information siloed
• Data loss
• Lack of Standardization
The metadata collected at the source affects information propagation and downstream use.
4
![Page 5: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/5.jpg)
The
of Contextual Information
Isn’t STANDARDIZED
• Free text, shorthand, granularity, misspelling, paper format
5
![Page 6: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/6.jpg)
When Words Can Mean Different Things.
Lack of standardization results in semantic ambiguity.
6
![Page 7: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/7.jpg)
• Ontologies help resolve issues of taxonomy, granularity and specificity
• Reduce time-consuming manual processing and mining
Leafy Greens “has_disposition” Transmission Vehicle (E. coli)
Spinach
Lettuce
Endive “is_a” Lettuce Type
Iceberg
Spinacia oleracea
Amaranthus hybridus
S. Africa
7
Ontology, A Way of Structuring Information
• Standardized, well-defined hierarchy of terms
• Interconnected with logical relationships e.g. Endive “is_a” Lettuce type
Integrates Epi Exposure data
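The "is_a" relationships on this slide can be illustrated with a minimal sketch. It assumes a toy hierarchy built from the slide's leafy-greens example; the labels are plain strings rather than real ontology identifiers, and "Chicken" is an invented exposure added only to show a non-match.

```python
# Toy is_a hierarchy based on the slide's leafy-greens example.
# Labels are illustrative strings, not real ontology identifiers.
IS_A = {
    "Endive": "Lettuce",
    "Iceberg": "Lettuce",
    "Lettuce": "Leafy Greens",
    "Spinach": "Leafy Greens",
}


def ancestors(term):
    """Return every broader term reachable by following is_a links."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def is_a(term, broader):
    """True if `term` is the same as, or a descendant of, `broader`."""
    return term == broader or broader in ancestors(term)


# "Chicken" is a hypothetical exposure that is not in the hierarchy.
exposures = ["Endive", "Spinach", "Chicken"]
leafy = [food for food in exposures if is_a(food, "Leafy Greens")]
print(leafy)
```

Because the relationship is followed transitively, a case recorded as "Endive" is found by a query for "Leafy Greens" without anyone re-coding the record by hand.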
![Page 8: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/8.jpg)
Ontology Acts Like A Mapping Tool.
• Humans AND computers can read it
• Mapping allows interoperability AND customization
• Facilitates data sharing, reproducibility
8
http://eil.stanford.edu/supply_chain/case/mapping_example.JPG
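One way to picture the mapping idea is a lookup from locally used labels to a shared term, so each site keeps its own vocabulary while the data stay comparable. The sketch below is only an illustration: the "EX:" identifiers, the local labels, and the `normalize` helper are all hypothetical and do not come from GenEpiO.

```python
# Hypothetical mapping from locally used labels to one shared term.
# The "EX:" identifiers are made up for this sketch.
LOCAL_TO_STANDARD = {
    "spinach": "EX:0001",
    "baby spinach": "EX:0001",
    "spinich": "EX:0001",  # misspelling of the kind seen in free text
    "iceberg": "EX:0002",
    "iceberg lettuce": "EX:0002",
}

STANDARD_LABELS = {"EX:0001": "Spinach", "EX:0002": "Iceberg lettuce"}


def normalize(raw):
    """Map a free-text entry to (term_id, label), or None if unmapped."""
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    term_id = LOCAL_TO_STANDARD.get(key)
    if term_id is None:
        return None
    return term_id, STANDARD_LABELS[term_id]


for entry in ["Spinich", "  Baby   Spinach ", "kale"]:
    print(repr(entry), "->", normalize(entry))
```

Unmapped entries return None rather than a guess, so they can be routed to a curator instead of silently entering the dataset.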
![Page 9: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/9.jpg)
Ontologies are different from program interfaces.
Ontologies: How information is structured and linked
Interfaces: How information is presented to users
9
![Page 10: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/10.jpg)
10
Data standards and ontologies help sort data at the source.
Sorting at the source… benefits from clarity
Metadata Challenges:
• Collection
• Organization
• Storage/Archiving
![Page 11: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/11.jpg)
Prospective Metadata Collection is Easier and More Efficient than
Retrospective Retrieval.
11
![Page 12: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/12.jpg)
12
Standardization of Contextual Information
Facilitates Reporting and Quality Control
• Reproducibility
![Page 13: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/13.jpg)
To develop a useful ontology, engaging the end users is your TOP priority.
Interview users:
• Medical & Environmental Microbiologists
• Bioinformaticians
• Surveillance Analysts & Lab Personnel
• Epidemiologists
Examine resources:
• Software and Work Flows
• Investigation Tools
• Instrumentation
• Existing Standards: MIxS, SNOMED-CT, LOINC
Interview users + Examine resources = GenEpiO (Genomic Epidemiology Application Ontology)
13
![Page 14: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/14.jpg)
Public Health Surveillance
Case Cluster Analysis
Result Reporting
Infectious Disease Epidemiology (from case to intervention)
Lab Surveillance (from sample to strain typing results)
Evidence Collection & Outbreak Investigation
Sample Collection & Processing
Sequence Data Generation & Processing
Bioinformatics Analysis
Result Reporting
Whole Genome Sequencing (SO, ERO, OBI etc)
Quality Control (OBI, ERO)
Legend
Primary Focus
GenEpiO (OBO)
Other
Anatomy (FMA)
Environment (Envo)
Food (FoodOn)
Clinical Sampling (OBI)
Custom LIMS
Quality Control (OBI, ERO)
AMR (ARO)
Virulence (PATO)
Phylogenetic Clustering (EDAM)
Mobile Elements (MobiO)
Quality Control (OBI, ERO)
Nomenclature & Taxonomy (NCBItaxon)
AMR (ARO) LOINC
Surveillance (SurvO)
Demographics (SIO)
Patient History (SIO)
Symptoms (SYMP)
Exposures (ExO)
Source Attribution (IDO)
Travel (IDO)
Transmission (TRANS)
Food (FoodOn)
Geography (OMRSE)
Outbreak Protocols
Surveillance (SurvO)
Food (FoodOn)
Surveillance (SurvO)
Mobile Elements (MobiO)
Infectious Disease (IDO)
Typing (TypON)
14
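The slide pairs each data domain with the ontology it draws terms from. A small lookup table makes that structure explicit. The sketch below copies a subset of the pairs shown on the slide; the table layout and the `domains_by_ontology` helper are assumptions made for illustration, not part of GenEpiO itself.

```python
from collections import defaultdict

# Subset of the domain -> source ontology pairs named on the slide.
DOMAIN_SOURCES = {
    "Food": ["FoodOn"],
    "AMR": ["ARO"],
    "Surveillance": ["SurvO"],
    "Symptoms": ["SYMP"],
    "Exposures": ["ExO"],
    "Travel": ["IDO"],
    "Source Attribution": ["IDO"],
    "Infectious Disease": ["IDO"],
    "Quality Control": ["OBI", "ERO"],
    "Clinical Sampling": ["OBI"],
}


def domains_by_ontology(domain_sources):
    """Invert the table: which domains draw terms from each ontology?"""
    inverted = defaultdict(list)
    for domain, ontologies in domain_sources.items():
        for ontology in ontologies:
            inverted[ontology].append(domain)
    return dict(inverted)


by_ontology = domains_by_ontology(DOMAIN_SOURCES)
print(by_ontology["IDO"])
print(by_ontology["OBI"])
```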
![Page 15: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/15.jpg)
GenEpiO: Combining Different Epi, Lab, Genomics and Clinical Data Fields.
Lab Analytics: Genomics, PFGE, Serotyping, Phage typing, MLST, AMR
Sample Metadata: Isolation Source (Food, Host Body Product, Environmental), BioSample
Epidemiology Investigation: Exposures
Clinical Data: Patient demographics, Medical History, Comorbidities, Symptoms, Health Status
Reporting: Case/Investigation Status
GenEpiO (Genomic Epidemiology Application Ontology)
15
See draft version at https://github.com/GenEpiO/genepio/wiki
![Page 16: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/16.jpg)
Current Ontology Development Focuses On 3 Key Areas
Food (FoodON)
Disease Surveillance (SurvO)
Antimicrobial Resistance (ARO)
16
![Page 17: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/17.jpg)
Resources:
• LanguaL (USDA)
• FoodEx2 (EFSA)
• Codex Alimentarius (WHO)
• USDA Nutrient Database
• Painter Classification (CDC)
• FoodO/FooDB
• AGROVOC
• Food Safety Information Network
• Compendium of Analytical Methods (HC)
• More…
FoodON: A Farm-to-Fork Food Ontology
Aim: To provide descriptors for food items, ingredients and production environments, to facilitate outbreak investigations, risk assessments, source attribution, etc.
See draft version at https://github.com/FoodOntology/foodon
17
![Page 18: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/18.jpg)
Ontologies are commonly encoded using OWL (Web Ontology Language).
• Markup language for sharing ontologies on the web
• Machine and human-readable
• OWL statements written in RDF (XML syntax)
• Protégé (editor)
• Ontology lookup services:
18
Explore GenEpiO with Proofsheet: http://tinyurl.com/uiproofsheet
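To give a feel for what "OWL statements written in RDF (XML syntax)" look like, the sketch below builds two classes and one subclass axiom using only Python's standard library. It is an illustration, not an excerpt from GenEpiO: the `http://example.org/demo#` namespace and the `owl_class` helper are placeholders, and a real ontology would normally be edited in a tool such as Protégé. In OWL, an "is_a" link between classes is written as `rdfs:subClassOf`.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/demo#"  # placeholder namespace, not GenEpiO's

# Use the conventional prefixes when serializing.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def owl_class(parent, name, label, subclass_of=None):
    """Append an owl:Class element, optionally with rdfs:subClassOf."""
    cls = ET.SubElement(parent, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + name})
    ET.SubElement(cls, f"{{{RDFS}}}label").text = label
    if subclass_of is not None:
        ET.SubElement(
            cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": BASE + subclass_of}
        )
    return cls


root = ET.Element(f"{{{RDF}}}RDF")
owl_class(root, "Lettuce", "Lettuce")
owl_class(root, "Endive", "Endive", subclass_of="Lettuce")  # Endive is_a Lettuce

if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
    ET.indent(root)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The same file is readable by a person and parseable by software, which is the "machine and human-readable" point made on the slide.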
![Page 19: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/19.jpg)
GenEpiO and FoodON are now part of the OBO Foundry library of ontologies.
• Prescribes best practices for ontology development
• Committed to common use, interoperability, collaborative development
• Common relations and syntax
• Accessible definitions, good documentation
Open Biomedical Ontologies - http://www.obofoundry.org/
144 ontologies accepted or under development, describing everything from genes and phylogenies to diseases and anatomy
Ontology for the description of life-science and clinical investigations
19
![Page 20: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/20.jpg)
GenEpiO will be implemented in different interfaces.
Metadata Manager: Data entry portal
• Metadata Requisition
• Implements GenEpiO terms
• Facilitates descriptive metadata
• Sorts at point-of-entry
• Creates BioSample-compliant genome submission forms
20
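Sorting "at point-of-entry" can be pictured as a form that checks each field against a controlled vocabulary before the record is accepted. The sketch below is a hypothetical illustration, not the Metadata Manager's actual code: the field names, the `validate` function, and the sample records are invented, and the allowed isolation sources are the three categories named earlier in this deck.

```python
from datetime import date

# Hypothetical form definition: field name -> allowed values (None = free text).
FIELDS = {
    "isolation_source": {"Food", "Clinical", "Environment"},
    "collection_date": None,  # format is checked separately below
    "location": None,
}


def validate(record):
    """Return a list of problems; an empty list means the record is accepted."""
    problems = []
    for field, allowed in FIELDS.items():
        value = record.get(field)
        if not value:
            problems.append(f"{field}: missing")
        elif allowed is not None and value not in allowed:
            problems.append(f"{field}: '{value}' is not a controlled term")
    if record.get("collection_date"):
        try:
            date.fromisoformat(record["collection_date"])
        except ValueError:
            problems.append("collection_date: expected YYYY-MM-DD")
    return problems


# Invented sample records for illustration only.
good = {"isolation_source": "Food", "collection_date": "2015-03-01", "location": "Vancouver"}
bad = {"isolation_source": "spinach salad", "collection_date": "03/01/2015"}

print(validate(good))
for problem in validate(bad):
    print(problem)
```

Rejecting a free-text value at entry is cheaper than cleaning it retrospectively, which matches the deck's earlier point about prospective collection.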
![Page 21: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/21.jpg)
Use computers to identify common exposures, symptoms etc among genomics clusters
Example: Automating case definition generation
Correlate Genomics Salmonella Cluster A cases between 01 Mar 2015 and 15 Mar 2015 with High-Risk Food Types Spinach, Leafy Greens and Geographical Location of Vancouver
GenEpiO Will Help Integrate Genomics and Epidemiological Data.
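The case definition in the example can be read as a filter over case records: genomic cluster, onset window, food exposure, and location. The sketch below shows one possible shape for such a query. All case records are invented, the `IS_A` table is a toy stand-in for an ontology, and `case_definition` is a hypothetical helper rather than a GenEpiO or IRIDA function.

```python
from datetime import date

# Toy is_a links; labels only, not real ontology identifiers.
IS_A = {"Spinach": "Leafy Greens", "Lettuce": "Leafy Greens"}


def matches_food(food, target):
    """True if `food` equals `target` or sits below it in the hierarchy."""
    while food is not None:
        if food == target:
            return True
        food = IS_A.get(food)
    return False


# Hypothetical case records; every value here is invented.
CASES = [
    {"id": "c1", "cluster": "A", "onset": date(2015, 3, 2), "food": "Spinach", "city": "Vancouver"},
    {"id": "c2", "cluster": "A", "onset": date(2015, 3, 20), "food": "Spinach", "city": "Vancouver"},
    {"id": "c3", "cluster": "A", "onset": date(2015, 3, 10), "food": "Lettuce", "city": "Vancouver"},
    {"id": "c4", "cluster": "B", "onset": date(2015, 3, 5), "food": "Spinach", "city": "Vancouver"},
    {"id": "c5", "cluster": "A", "onset": date(2015, 3, 8), "food": "Chicken", "city": "Vancouver"},
]


def case_definition(cases, cluster, start, end, food, city):
    """Select the ids of cases meeting every criterion of the case definition."""
    return [
        c["id"]
        for c in cases
        if c["cluster"] == cluster
        and start <= c["onset"] <= end
        and matches_food(c["food"], food)
        and c["city"] == city
    ]


selected = case_definition(
    CASES, "A", date(2015, 3, 1), date(2015, 3, 15), "Leafy Greens", "Vancouver"
)
print(selected)
```

The food criterion is matched through the hierarchy, so cases recorded at different levels of granularity (Spinach, Lettuce) both satisfy a "Leafy Greens" definition.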
21
![Page 22: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/22.jpg)
Line List Visualizations of Selectable Data Based on GenEpiO Fields.
1. Line List View
2. Timeline View
Hideable cases
Selectable fields
Travel
Symptoms and Onset
Exposure Types
Hospitalization
22
![Page 23: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/23.jpg)
Summary of the Advantages Genomic Epidemiology Ontology offers Public Health.
Improved Public Health
Investigation power!
1. Eliminates semantic ambiguity
2. Term-mapping allows customization
3. Faster data integration
4. Standardized quality control and result reporting trigger actionable events in the same way
5. Reproducibility (accreditation, validation)
23
![Page 24: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/24.jpg)
Genomic Epidemiology Ontology is like instrumentation for your contextual information…
it needs maintenance and continual improvement
To achieve consensus and uptake: International GenEpiO and FoodON Consortia (>60 members from 15 countries)
Join us! E-mail: [email protected]
GenEpiO facilitates data sharing!
24
![Page 25: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/25.jpg)
Metadata management is crucial, but tricky.
GenEpiO helps sort it out.
25
![Page 26: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/26.jpg)
Context is everything in foodborne outbreak investigations.
GenEpiO helps fit the data together.
26
![Page 28: Context is Everything: Integrating Genomics ...genepio.org/wp-content/uploads/2016/10/Gen...Context is Everything: Integrating Genomics, Epidemiological and Clinical Data Using GenEpiO](https://reader034.vdocument.in/reader034/viewer/2022050206/5f59c8dcc607392dc36dcff9/html5/thumbnails/28.jpg)
28
Project Leaders: Fiona Brinkman – SFU, Will Hsiao – PHMRL, Gary Van Domselaar – NML
Simon Fraser University (SFU): Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon
McMaster University: Andrew McArthur, Daim Sardar
European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olsen, Tara Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Morag Graham, Chrystal Berry, Lorelee Tschetter, Eduardo Toboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
University of Lisbon: João Carriço
European Bioinformatics Institute: Melanie Courtot, Helen Parkinson
BC Public Laboratory and BC Centre for Disease Control (BCCDC): Damion Dooley, Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Jennifer Law
University of Maryland: Lynn Schriml
Canadian Food Inspection Agency (CFIA): Adam Koziol, Burton Blais, Catherine Carrillo
Dalhousie University: Rob Beiko, Alex Keddy
www.irida.ca