Enabling faster analysis of vaccine adverse event reports with ontology support

Mélanie Courtot, Ph.D. candidate, Brinkman lab
Knowledge Translation Seminar, March 21st 2013

Upload: melanie-courtot

Post on 18-Nov-2014


DESCRIPTION

A description of my PhD project, which aims to use ontologies to support automated analysis of adverse event reports in the domain of immunization

TRANSCRIPT

Page 1: Enabling faster analysis of vaccine adverse event reports with ontology support

ENABLING FASTER ANALYSIS OF VACCINE ADVERSE EVENT REPORTS WITH ONTOLOGY SUPPORT

Mélanie Courtot, Ph.D. candidate, Brinkman lab

Knowledge Translation Seminar, March 21st 2013

Page 2: Enabling faster analysis of vaccine adverse event reports with ontology support

Outline

• Problem statement and significance
• The Adverse Event Reporting Ontology (AERO) for adverse event report analysis
  • Clinical standard
  • Logical encoding
  • Classified dataset
  • Classification using MedDRA annotations and text mining
• AERO for data integration
  • The Semantic Web
  • VAERS as linked data

Page 3: Enabling faster analysis of vaccine adverse event reports with ontology support

Problem statement

• Importance of monitoring adverse events
  • Long-term effects, various demographics
  • Detection of abnormal events in the population leads to withdrawals, etc.
• Current adverse events following immunization (AEFI) reporting systems use different standards (if any) to encode reports
• The resulting lack of consistency limits the ability to query and assess potential safety issues
  • Reports are manually assessed: time-consuming and costly
  • Inability to assess all reports carefully

Page 4: Enabling faster analysis of vaccine adverse event reports with ontology support

Goal and significance of my work

• Goal: Improve safety signal detection in vaccine AEFI reports
  • Step 1: Augment existing standards with logically formalized elements
  • Step 2: Perform automatic case classification
  • Step 3: Test classification utility to detect safety signals
• Significance: Increase the timeliness and cost-effectiveness of reliable adverse event signal detection

Page 5: Enabling faster analysis of vaccine adverse event reports with ontology support

4 steps to automated classification

1. We agree on a standard to describe adverse events

2. We encode that standard in a computer-amenable format

3. We map the clinical standard to current adverse events annotations

4. We classify reports of adverse events according to established guidelines

Page 6: Enabling faster analysis of vaccine adverse event reports with ontology support

Existing standard: the Brighton Collaboration

• https://brightoncollaboration.org
• Provides case definitions and guidelines to standardize reporting
• Well-established network (adopted as a standard in Canada in 2009)
• Benefits of working with Brighton:
  • Existing software tool
  • Extensive network of collaborators, shared vision

Page 7: Enabling faster analysis of vaccine adverse event reports with ontology support

What is missing?

Page 8: Enabling faster analysis of vaccine adverse event reports with ontology support

Strategy for encoding adverse event reports

• Model the domain using an ontology
• Ontologies typically have two distinct components:
  • Names for important concepts in the domain
    • Prokaryotic cells
    • Eukaryotic cells
  • Background knowledge/constraints on the domain
    • Nothing can be both a prokaryotic and a eukaryotic cell

Page 9: Enabling faster analysis of vaccine adverse event reports with ontology support

Strategy for encoding adverse event reports

• Ontology encoded using the Web Ontology Language (OWL 2)

• The Open Biological and Biomedical Ontologies (OBO) Foundry helps with quality, interoperability and avoiding redundant work
  • More than 100 biomedical ontologies in the suite, e.g., Gene Ontology (GO)

• Reuse of resources (ontologies and tools)

Page 10: Enabling faster analysis of vaccine adverse event reports with ontology support

Reasoning is critical

• Prokaryotic cell and Eukaryotic cell are declared disjoint

• Fungal cell is a Eukaryotic cell

• Spore is a Fungal cell and a Prokaryotic cell

=> inconsistency

doi:10.1371/journal.pone.0022006.g003
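
The check described on this slide can be sketched in a few lines of Python. This is a toy illustration, not an OWL 2 reasoner: the class names are taken from the slide, while the dictionary layout and function names are assumptions made for the example. Strictly speaking, a reasoner would report Spore as an unsatisfiable class, and the ontology as a whole becomes inconsistent once such a class has an instance.

```python
# Toy illustration only: a real OWL 2 reasoner does far more than this.
# Asserted subclass links, taken from the slide's example.
SUBCLASS_OF = {
    "FungalCell": {"EukaryoticCell"},
    "Spore": {"FungalCell", "ProkaryoticCell"},
}
# Pairs of classes declared disjoint.
DISJOINT = [("ProkaryoticCell", "EukaryoticCell")]


def ancestors(cls):
    """Return cls plus every class it is (transitively) a subclass of."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unsatisfiable_classes():
    """List classes that fall under both members of a disjoint pair."""
    classes = set(SUBCLASS_OF) | {p for ps in SUBCLASS_OF.values() for p in ps}
    problems = []
    for cls in sorted(classes):
        supers = ancestors(cls)
        for a, b in DISJOINT:
            if a in supers and b in supers:
                problems.append((cls, a, b))
    return problems


print(unsatisfiable_classes())
```

Only Spore is flagged, because it inherits Eukaryotic cell through Fungal cell while also being asserted as a Prokaryotic cell.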

Page 11: Enabling faster analysis of vaccine adverse event reports with ontology support

Clinical guideline in AERO

• Goal: provide a pattern to encode adverse event following immunization guidelines
  • This pattern should be applicable to any type of clinical guideline
  • Enable the reports to be annotated with a diagnosis according to a specific guideline (and keep track of which guideline it is)
• We want to:
  • Encode the guideline in OWL
  • Be able to infer the correct classification (i.e., perform accurate diagnosis)
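
The idea of annotating a report with a diagnosis while keeping track of the guideline behind it can be sketched as follows. Everything here is hypothetical: the finding names, the rule, and the `Guideline` and `Diagnosis` classes are placeholders for illustration, not the Brighton case definition and not the OWL pattern used in AERO.

```python
from dataclasses import dataclass


# HYPOTHETICAL guideline: the criteria names and the rule below are
# placeholders, not the Brighton case definition or the AERO encoding.
@dataclass(frozen=True)
class Guideline:
    name: str
    required_all: frozenset  # every one of these findings must be present
    required_any: frozenset  # at least one of these must be present

    def matches(self, findings):
        return self.required_all <= findings and bool(self.required_any & findings)


@dataclass(frozen=True)
class Diagnosis:
    label: str
    guideline: str  # provenance: which guideline produced the label


def classify(findings, guideline, label):
    """Return a Diagnosis tagged with its guideline, or None if no match."""
    if guideline.matches(frozenset(findings)):
        return Diagnosis(label=label, guideline=guideline.name)
    return None


toy = Guideline(
    name="toy-guideline-v1",
    required_all=frozenset({"skin_finding"}),
    required_any=frozenset({"respiratory_finding", "cardiovascular_finding"}),
)
print(classify({"skin_finding", "respiratory_finding"}, toy, "toy case"))
print(classify({"skin_finding"}, toy, "toy case"))
```

Storing the guideline name on the result means two diagnoses made under different guidelines stay distinguishable later on.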

Page 12: Enabling faster analysis of vaccine adverse event reports with ontology support
Page 13: Enabling faster analysis of vaccine adverse event reports with ontology support

Current status

• Pattern implemented in the OWL file for anaphylaxis
• Has been successfully used to model the WHO malaria clinical guidelines
• Paper submitted (yay)
• Need to add other guidelines

Jie Zheng, UPenn

Page 14: Enabling faster analysis of vaccine adverse event reports with ontology support

VAERS dataset

• VAERS = Vaccine Adverse Event Reporting System
• Co-managed by the Centers for Disease Control and Prevention (CDC) and the Food and Drug Administration (FDA) in the United States
• Spontaneous reporting system
  • Issues with underreporting, quality of reporting
• Uses MedDRA annotations (Medical Dictionary for Regulatory Activities)

Page 15: Enabling faster analysis of vaccine adverse event reports with ontology support

Example VAERS report

Page 16: Enabling faster analysis of vaccine adverse event reports with ontology support

Classified VAERS data

• Unclassified files available publicly
• Classified dataset available upon request (in this case, the H1N1 dataset)
• Cleanup
  • No default NULL value: “none”, “null”, “”…
  • Multiple languages: encoding issue with Spanish
  • 5 MedDRA terms per report, or duplicates
• Pre-processing required
  • Load into database
  • Match to public records
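
Two of the cleanup issues above, inconsistent NULL spellings and duplicated MedDRA terms, can be handled with a small normalization step. This is a minimal sketch under stated assumptions: the function names are made up, the "n/a" token is an extra guess not listed on the slide, and the sample terms are only example strings.

```python
# Strings treated as "missing". The slide lists "none", "null" and "";
# "n/a" is an extra guess added for illustration.
NULL_TOKENS = {"", "none", "null", "n/a"}


def normalize_value(raw):
    """Map the various spellings of 'missing' to None; trim everything else."""
    if raw is None:
        return None
    text = raw.strip()
    return None if text.lower() in NULL_TOKENS else text


def clean_terms(raw_terms):
    """Drop missing entries and case-insensitive duplicates, keeping order."""
    seen = set()
    cleaned = []
    for raw in raw_terms:
        term = normalize_value(raw)
        if term is None or term.lower() in seen:
            continue
        seen.add(term.lower())
        cleaned.append(term)
    return cleaned


print(clean_terms(["Pyrexia", "none", " pyrexia ", "", "Rash", "NULL"]))
```

The first spelling of each term is kept, so the output preserves the order in which terms appeared in the report.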

Page 17: Enabling faster analysis of vaccine adverse event reports with ontology support

Classification using MedDRA annotations

• Goal is to map the current Brighton terms in AERO to their MedDRA counterparts

• Then try and classify the MedDRA-annotated reports using the Brighton criteria

• Compare that with classification done by medical experts

Page 18: Enabling faster analysis of vaccine adverse event reports with ontology support

Mapping to MedDRA

• Translate, as well as possible, MedDRA annotations to Brighton symptoms
• Import selected MedDRA terms into OWL, following the general strategy of the Minimum Information to Reference an External Ontology Term (MIREOT) (Courtot et al. 2011)

• Standardized MedDRA Queries provide useful documentation on how to interpret MedDRA

• OWL used to define Brighton symptoms in terms of MedDRA terms (this will be only approximate)
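
Defining a Brighton symptom in terms of MedDRA terms amounts to saying that any one of several MedDRA annotations counts as evidence for that symptom. The sketch below mimics this with plain Python sets rather than OWL. The mapping itself is hypothetical: which MedDRA terms support which symptom is an illustrative guess, not the mapping used in AERO.

```python
# HYPOTHETICAL mapping: which MedDRA terms count as evidence for which
# Brighton symptom is an illustrative guess, not the mapping used in AERO.
BRIGHTON_FROM_MEDDRA = {
    "generalized urticaria": {"Urticaria", "Rash generalised"},
    "bronchospasm": {"Bronchospasm", "Wheezing"},
}


def brighton_symptoms(meddra_terms):
    """Return Brighton symptoms supported by at least one MedDRA term."""
    terms = set(meddra_terms)
    return sorted(
        symptom
        for symptom, evidence in BRIGHTON_FROM_MEDDRA.items()
        if evidence & terms
    )


def unmapped_terms(meddra_terms):
    """Return MedDRA terms that no Brighton symptom accounts for."""
    known = set().union(*BRIGHTON_FROM_MEDDRA.values())
    return sorted(set(meddra_terms) - known)


report_terms = ["Urticaria", "Pyrexia"]
print(brighton_symptoms(report_terms))
print(unmapped_terms(report_terms))
```

Listing the unmapped terms makes the approximate nature of the translation visible: anything left over is information the mapping does not yet capture.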

Page 19: Enabling faster analysis of vaccine adverse event reports with ontology support

Classification using text

• In collaboration with Seeker Solutions, a Victoria-based company
• Goal is to use the text part of the reports to classify them
• Process:
  • Training data: a set of reports that have been manually classified
  • Machine learning algorithm learns patterns leading to correct classification
  • The model is applied to new testing data
• 2 types of classification tested:
  • Likelihood
  • Topic modeling
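
The train-then-apply process can be illustrated with a very small text classifier. The talk does not name the algorithm used, so multinomial naive Bayes with add-one smoothing is swapped in here purely as an example; the training sentences and labels are made up.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(labelled_reports):
    """Count words per class from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in labelled_reports:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training sentences; real reports are longer and noisier.
training = [
    ("hives and wheezing after injection", "anaphylaxis"),
    ("throat swelling hives low blood pressure", "anaphylaxis"),
    ("sore arm and mild fever", "other"),
    ("headache and fever next day", "other"),
]
model = train(training)
print(predict(model, "wheezing and hives"))
print(predict(model, "fever and sore arm"))
```

With this toy data the first text is labelled "anaphylaxis" and the second "other"; a real evaluation would hold out manually classified reports as the testing data.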

Page 20: Enabling faster analysis of vaccine adverse event reports with ontology support

Likelihood ordering

Page 21: Enabling faster analysis of vaccine adverse event reports with ontology support

Topic modeling

Page 22: Enabling faster analysis of vaccine adverse event reports with ontology support

Current status

• Testing classification with the MedDRA terms

• Need to work on the MedDRA mapping

• Test classification with AERO (and compare with the one with MedDRA)

• Refine text classification
  • Using the ontology to guide clustering
  • Using Canadian dataset

Page 23: Enabling faster analysis of vaccine adverse event reports with ontology support

AERO for data integration

Page 24: Enabling faster analysis of vaccine adverse event reports with ontology support

The Semantic Web

• From a web of documents to a web of data
• HTML pages can’t be understood by machines; humans have to manually follow hyperlinks
• The Semantic Web uses standards for data representation, querying and vocabularies to link data behind the scenes
• Use of Uniform Resource Identifiers (URIs) and the Resource Description Framework (RDF)

Page 25: Enabling faster analysis of vaccine adverse event reports with ontology support

RDF and URIs

• RDF: a language used to represent information about resources on the web
  • RDF statement: subject, predicate, object
• URI: unique identifiers for things
  • http://purl.obolibrary.org/obo/AERO_0000244: major dermatological criterion for anaphylaxis according to Brighton
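
A single RDF statement can be written down as a subject, predicate, object triple. The sketch below uses the AERO URI from the slide as the subject and the standard `rdfs:label` property as the predicate. Using the slide's description as the label text is an assumption, and the `Triple` class and `to_ntriples` helper are illustrative names, not part of any RDF library.

```python
from typing import NamedTuple

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    obj: str        # here always a plain string literal


def to_ntriples(t):
    """Serialize one triple whose object is a plain literal as N-Triples."""
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


statement = Triple(
    subject="http://purl.obolibrary.org/obo/AERO_0000244",
    predicate=RDFS_LABEL,
    # Assumption: the slide's description is used as the label text.
    obj="major dermatological criterion for anaphylaxis according to Brighton",
)
print(to_ntriples(statement))
```

In practice an RDF library would handle serialization; the hand-written version is only meant to show the three-part shape of a statement.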

Page 26: Enabling faster analysis of vaccine adverse event reports with ontology support

Linked Open Data cloud

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Page 27: Enabling faster analysis of vaccine adverse event reports with ontology support
Page 28: Enabling faster analysis of vaccine adverse event reports with ontology support

VAERS as linked data

• Transform the VAERS dataset into RDF to enable better integration with existing resources
• No need to worry about resources’ structure (CSV, databases, XML)
• Each report is an instance of a VAERS report
• System will also provide technical infrastructure to test classification
• RDF automatically generated from the database containing VAERS data
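
Generating RDF from database rows can be sketched as one pass over a table that emits a type triple plus one triple per non-empty column. The namespace, table layout, column names and property names below are all hypothetical; only the general row-to-triples idea comes from the slide.

```python
import sqlite3

# HYPOTHETICAL namespace, table layout and property names.
BASE = "http://example.org/vaers/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_triples(conn):
    """Yield (subject, predicate, object) tuples, one report per row."""
    query = "SELECT report_id, state, vaccine FROM reports ORDER BY report_id"
    for report_id, state, vaccine in conn.execute(query):
        subject = f"{BASE}report/{report_id}"
        # Each row becomes an instance of the (hypothetical) Report class.
        yield (subject, RDF_TYPE, f"{BASE}Report")
        if state is not None:
            yield (subject, f"{BASE}state", state)
        if vaccine is not None:
            yield (subject, f"{BASE}vaccine", vaccine)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (report_id INTEGER, state TEXT, vaccine TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?)",
    [(1, "CA", "FLU"), (2, None, "FLU")],
)
triples = list(rows_to_triples(conn))
for triple in triples:
    print(triple)
```

Missing values simply produce no triple, which avoids carrying placeholder strings into the RDF.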

Page 29: Enabling faster analysis of vaccine adverse event reports with ontology support

VAERS as linked data

Report 117893

Page 30: Enabling faster analysis of vaccine adverse event reports with ontology support

VAERS as linked data

Page 31: Enabling faster analysis of vaccine adverse event reports with ontology support

Querying across linked data

• URIs (or mappings between URIs) to link different resources
• Querying on the VAERS dataset
  • E.g., are there differences in the types of adverse events between a live attenuated flu vaccine and a trivalent inactivated one?
• Querying across multiple datasets
  • Identify drugs in text (e.g., Benadryl) and infer they are anti-allergic agents via DrugBank
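
A query across two datasets boils down to a join on a shared identifier. The sketch below does this over two tiny in-memory triple lists standing in for VAERS and DrugBank. All identifiers, property names and data are made up, and a real system would typically express the query in SPARQL rather than Python.

```python
# Two tiny, made-up triple sets standing in for VAERS and DrugBank.
# All identifiers and property names are HYPOTHETICAL.
VAERS = [
    ("ex:report/1", "ex:mentionsDrug", "ex:drug/benadryl"),
    ("ex:report/2", "ex:mentionsDrug", "ex:drug/aspirin"),
]
DRUG_DATA = [
    ("ex:drug/benadryl", "ex:category", "anti-allergic agent"),
    ("ex:drug/aspirin", "ex:category", "analgesic"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def reports_mentioning(category):
    """Join the two datasets on the shared drug identifier."""
    drugs = {s for s, _, _ in match(DRUG_DATA, p="ex:category", o=category)}
    return sorted(
        s for s, _, o in match(VAERS, p="ex:mentionsDrug") if o in drugs
    )


print(reports_mentioning("anti-allergic agent"))
```

The join works only because both lists use the same identifier for the drug, which is the role URIs (or mappings between URIs) play in linked data.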

Page 32: Enabling faster analysis of vaccine adverse event reports with ontology support

Example: link the state code in VAERS to state information in DBpedia, then pass the result to the Google Visualization API

Page 33: Enabling faster analysis of vaccine adverse event reports with ontology support

Acknowledgements

• Alan Ruttenberg, Ryan Brinkman
• Oliver He, Yu Lin, Lindsay Cowell, Barry Smith, Ryan Brinkman, Peter d’Eustachio, Albert Goldfain
• Julie Lafleche, Lauren McDonald, Robert Pless, Barbara Law, Jan Bonhoeffer, Jean-Paul Collet
• Brinkman lab