biocuration 2014 - effective automated classification of adverse events using ontology-based...

EFFECTIVE AUTOMATED CLASSIFICATION USING ONTOLOGY-BASED ANNOTATION: EXPERIENCE WITH ANALYSIS OF ADVERSE EVENT REPORTS

Mélanie Courtot, [email protected]
Current: PhD Candidate, Terry Fox Laboratory, BC Cancer Agency
Starting April 14th 2014: PDF, MBB Dept., Simon Fraser University (and affiliation with BC Public Health Microbiology and Research Laboratory).

Upload: melanie-courtot

Post on 21-Nov-2014


Page 1: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

EFFECTIVE AUTOMATED CLASSIFICATION USING ONTOLOGY-BASED ANNOTATION: EXPERIENCE WITH ANALYSIS OF ADVERSE EVENT REPORTS

Mélanie Courtot, [email protected]
Current: PhD Candidate, Terry Fox Laboratory, BC Cancer Agency
Starting April 14th 2014: PDF, MBB Dept., Simon Fraser University (and affiliation with BC Public Health Microbiology and Research Laboratory).

Page 2: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Background and problem statement
• Surveillance of Adverse Events Following Immunization is important
  • Detection of issues with vaccine
  • Importance of vaccine-risk communication
• Analysis of AE reports is a subjective, time-consuming and costly process
  • Manual review of the textual reports

Page 3: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Hypothesis

[Figure: overview diagram. Labeled elements: Health agencies, Data repositories, Clinician, General population, Brighton guideline, Other guideline(s), Brighton annotations, Adverse Event Reporting Ontology (AERO), Automatic case classification, Recalls, SOPs. Numbered stages: 1 Guideline representation, 2 Information, 3 Data integration & answering queries.]

Encoding the Brighton guidelines in OWL allows automated classification of adverse events at an accuracy similar to manual classification

Page 4: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Test case

• VAERS dataset
  • Vaccine Adverse Event Reporting System
  • 6032 reports: ~5800 negative, ~230 positive
  • Post H1N1 immunization 2009/2010
  • Manually classified for anaphylaxis
• MedDRA (Medical Dictionary for Regulatory Activities) is used to represent clinical findings

Page 5: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Example VAERS report

[Figure: example VAERS report, showing the free-text part of the report and the MedDRA-encoded structured data.]

Page 6: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Automated Diagnosis workflow

[Figure: workflow diagram. Labeled elements: VAERS dataset (ASCII files, MySQL), Brighton annotations (MySQL; ~800 MedDRA terms mapped to 32 Brighton terms), OWL/RDF export, Adverse Event Reporting Ontology (AERO), reasoner, manually curated dataset. Steps are labeled A to D.]

Page 7: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Results

[Figure: the automated diagnosis workflow diagram (VAERS dataset, Brighton annotations, OWL/RDF export, AERO, reasoner, manually curated dataset; steps A to D), shown again with the result below.]

At best cut-off point: Sensitivity 57%, Specificity 97%
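
For readers less familiar with the two measures, the sketch below shows how sensitivity and specificity are derived from a comparison against the manually classified reports. The function name and the counts are illustrative assumptions; the counts are not the study's actual confusion matrix.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: manually positive reports that were / were not flagged.
    tn/fp: manually negative reports that were / were not cleared.
    """
    sensitivity = tp / (tp + fn)  # share of true cases that are detected
    specificity = tn / (tn + fp)  # share of non-cases that are correctly cleared
    return sensitivity, specificity


# Hypothetical counts chosen for illustration only (NOT the VAERS results).
sens, spec = sensitivity_specificity(tp=57, fn=43, tn=970, fp=30)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A screening step favors high sensitivity (few missed cases), while a confirmation step favors high specificity (few false alarms).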

Page 8: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Standardized MedDRA Queries
• SMQs are an existing MedDRA-based screening method
• Retrieval of documents based on the Anaphylaxis SMQ alone is only fair: 54% sensitivity, 97% specificity
• Idea:
  • Identify MedDRA terms that are significantly associated with the diagnosis outcome using contingency tables
  • Augment the existing MedDRA SMQ with those terms
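
The term-selection idea can be sketched as follows. The slides do not say which test was applied to the contingency tables, so the one-sided Fisher exact test, the 0.05 threshold, the function names, and the toy reports are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: positive reports with the term     b: positive reports without it
    c: negative reports with the term     d: negative reports without it
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)
    # Probability of seeing at least `a` positives among the reports with the term.
    return sum(
        comb(row1, k) * comb(n - row1, col1 - k) / denom
        for k in range(a, min(row1, col1) + 1)
    )


def associated_terms(reports, alpha=0.05):
    """Return terms over-represented in positive reports (no multiple-testing correction)."""
    terms = set().union(*(t for t, _ in reports))
    selected = []
    for term in sorted(terms):
        a = sum(1 for t, pos in reports if pos and term in t)
        b = sum(1 for t, pos in reports if pos and term not in t)
        c = sum(1 for t, pos in reports if not pos and term in t)
        d = sum(1 for t, pos in reports if not pos and term not in t)
        if fisher_exact_greater(a, b, c, d) < alpha:
            selected.append(term)
    return selected


# Toy data (hypothetical): each report is (set of MedDRA terms, manual label).
reports = (
    [({"Urticaria", "Fatigue"}, True)] * 5 + [({"Urticaria"}, True)] * 3
    + [(set(), True)] * 2
    + [({"Fatigue"}, False)] * 5 + [({"Urticaria"}, False)] * 1
    + [(set(), False)] * 4
)
smq = {"Choking", "Cough", "Oedema", "Rash"}
expanded_smq = smq | set(associated_terms(reports))
print(sorted(expanded_smq))
```

With thousands of candidate terms, a real analysis would also need a multiple-testing correction, which this sketch leaves out.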

Page 9: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Cosine similarity method
• Represent documents (query and report) as vectors of terms
• Compare the cosine measure of the angle they form

Cosine ~ 1: Query ~ Report

Cosine ~ 0: Query != Report
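
For reference, the standard definition of the cosine measure is given below. The symbols are notation assumed here, not taken from the slides: q is the query (SMQ) term vector and r is the report term vector, indexed by term i.

```latex
\cos\theta
  = \frac{\mathbf{q}\cdot\mathbf{r}}{\lVert\mathbf{q}\rVert\,\lVert\mathbf{r}\rVert}
  = \frac{\sum_i q_i r_i}{\sqrt{\sum_i q_i^{2}}\,\sqrt{\sum_i r_i^{2}}}
```

When both vectors have only non-negative entries, as with term counts or presence flags, the value lies between 0 (no shared terms) and 1 (same direction).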

Page 10: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Example
• Vector MEDDRA SMQ: 'Choking', 'Cough', 'Oedema', 'Rash'
• Vector REPORT#72: 'Oedema', 'Rash', 'Vomiting'
• Vector REPORT#104: 'Palpitations', 'Fatigue', 'Neuropathy'
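
The example vectors can be scored with a few lines of Python. This is a minimal sketch that assumes binary (present/absent) term weights, which the slides do not specify; the function and variable names are illustrative.

```python
from math import sqrt


def cosine_similarity(query_terms, report_terms):
    """Cosine of the angle between two binary term vectors."""
    query, report = set(query_terms), set(report_terms)
    if not query or not report:
        return 0.0  # an empty vector has no direction; treat as no similarity
    # With 0/1 weights the dot product is the number of shared terms,
    # and each norm is the square root of the number of terms.
    return len(query & report) / sqrt(len(query) * len(report))


smq = ["Choking", "Cough", "Oedema", "Rash"]
report_72 = ["Oedema", "Rash", "Vomiting"]
report_104 = ["Palpitations", "Fatigue", "Neuropathy"]

print(round(cosine_similarity(smq, report_72), 3))  # 0.577 (2 shared terms)
print(cosine_similarity(smq, report_104))           # 0.0 (no shared terms)
```

Report #72 shares two terms with the query and scores well above report #104, which shares none; a cut-off on this score then decides which reports are flagged.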

Page 11: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Results - Expanded MedDRA SMQ

At best cut-off point: Sensitivity 92%, Specificity 87%
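
The slides do not state how the best cut-off point was chosen. One common convention, assumed here purely for illustration, is to pick the similarity threshold that maximizes Youden's J (sensitivity + specificity - 1). The scores and labels below are toy values, not study data.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A report is flagged when its score is >= cutoff.
    Assumes at least one positive and one negative label.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    best, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1
        if j > best_j:  # on ties, the lowest cutoff is kept
            best, best_j = (cutoff, sens, spec), j
    return best


# Hypothetical similarity scores with their manual classification.
scores = [0.9, 0.8, 0.6, 0.5, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False, False]
print(best_cutoff(scores, labels))
```

Other criteria are possible, for example fixing a minimum sensitivity for screening and then maximizing specificity.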

Page 12: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Discussion
• Using the ontology, the sensitivity is too low for efficient screening
• Brighton guidelines are not meant for screening, but for diagnosis confirmation
• We improved on the screening result and reached 92% sensitivity, 87% specificity.
• Using both approaches concurrently yields best screening results

Page 13: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Key outcomes
• Current encoding standards don't allow for complete representation of events
  • e.g., missing temporality descriptors (sudden onset, rapid progression)
  • Critical for diagnosis confirmation and causality assessment
• Information lacking in reports from surveillance systems
  • Not assessed? Not recorded? Negative?
• Logical translation of guidelines allows for better detection of inconsistencies and errors
  • We are working with the Brighton Collaboration towards adding a logical formalization to the existing case definitions

Page 14: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Use of the ontology for reporting
• In current systems:
  • Fast screening -> fast detection of potentially positive reports
  • Reporter can be sent a more detailed report, e.g. “Brighton-based anaphylaxis report form”
• In future systems:
  • Implementation of the ontology-based system at the time of data entry
  • Provides labels and textual definitions for each term
  • Enables consistency checking

Page 15: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Next steps: IRIDA project
• Integrated Rapid Infectious Disease Analysis
• http://www.irida.ca
• IRIDA is a bioinformatics platform for genomic epidemiology analysis to improve outbreak surveillance and detection
• Collaboration between academia and public health
• Ontologies will be developed to annotate clinical, lab and epidemiology data, and integrate for further analysis

Page 16: Biocuration 2014 - Effective automated classification of adverse events using ontology-based annotations

Acknowledgements
• Ryan Brinkman, BC Cancer Agency, Vancouver, Canada
• Alan Ruttenberg, University at Buffalo, New York, USA
• Julie Lafleche, Robert Pless, Barbara Law, Public Health Agency of Canada, Ottawa, Ontario
• Jan Bonhoeffer, Brighton Collaboration, Basel, Switzerland
• IRIDA project: Fiona Brinkman, William Hsiao