© IO Informatics, Inc. 2015

CHALLENGES, RISKS AND OPPORTUNITIES FOR SEMANTIC INTEROPERABILITY WITH TRANSMART: GIVE MEANING TO YOUR DATA

Authors: Robert Stanley, CEO (Presenter); Dr. Jason Eshleman, Director of Informatics

Upload: melissa-clarke

Post on 17-Jan-2016

TRANSCRIPT

Page 1:

© IO Informatics, Inc. 2015

CHALLENGES, RISKS AND OPPORTUNITIES FOR SEMANTIC INTEROPERABILITY WITH TRANSMART

GIVE MEANING TO YOUR DATA

Authors: Robert Stanley, CEO (Presenter); Dr. Jason Eshleman, Director of Informatics

Page 2:

CHALLENGES, RISKS, OPPORTUNITIES WITH TRANSMART

Page 3:

CHALLENGES FOR ETL AND INTEROPERABILITY: “GETTING DATA IN”

• Time, effort required to manually curate and load data

• Understanding, complying with, extending tranSMART schema

• Avoiding lost security, provenance, context, curation decisions

• Need for automation, alerting, scale for enterprise applications

[Figure: PD Study and Alz Study data sets, tMLoader]

Page 4:

RISKS FOR ETL AND INTEROPERABILITY: MAKING DATA USEFULLY SEARCHABLE

• “Getting the data in” is not enough

• Lexical matching is not enough

• Need to navigate across datasets to identify / harmonize to common identifiers, terms, relationships

• Barriers to machine-supported use of standards and ontologies for data harmonization

[Figure labels: PD Study, Alz Study]

Study ID     Gender   Treatment
PD Study1a   Female   Azilect

Study ID     Sex      Treatment
GSE26927     1        rasagiline

Study ID     Sex      Treatment
PD Study1a   Female   Azilect
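
The sample rows on this slide show why lexical matching alone falls short: "Gender" and "Sex" label the same column, and "Azilect" is a brand name of rasagiline. A minimal Python sketch of synonym-based harmonization follows. The two lookup tables and the function name are hypothetical and are not part of tranSMART or any IO Informatics product; in practice the synonyms would come from a curated vocabulary or ontology. The coded Sex value "1" is left untouched because the slide does not say what it encodes.

```python
# Hypothetical synonym tables (assumptions for illustration only).
HEADER_SYNONYMS = {"gender": "Sex"}
TREATMENT_SYNONYMS = {"azilect": "rasagiline"}  # brand name -> generic name


def harmonize_record(record):
    """Return a copy of record with preferred headers and treatment labels."""
    harmonized = {}
    for header, value in record.items():
        preferred_header = HEADER_SYNONYMS.get(header.lower(), header)
        if preferred_header == "Treatment":
            # Fall back to the lower-cased raw value when no synonym is known.
            value = TREATMENT_SYNONYMS.get(value.lower(), value.lower())
        harmonized[preferred_header] = value
    return harmonized


pd_row = {"Study ID": "PD Study1a", "Gender": "Female", "Treatment": "Azilect"}
geo_row = {"Study ID": "GSE26927", "Sex": "1", "Treatment": "rasagiline"}

# A purely lexical comparison of the raw values misses the match ...
lexical_match = pd_row["Treatment"].lower() == geo_row["Treatment"].lower()
# ... while a comparison after harmonization finds it.
harmonized_match = (
    harmonize_record(pd_row)["Treatment"] == harmonize_record(geo_row)["Treatment"]
)
print(lexical_match, harmonized_match)
```

Keeping the synonym tables as data rather than code is one way to let the same function be reused when a new vocabulary is introduced.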

Page 5:

RISKS FOR ETL AND INTEROPERABILITY: CONNECTING TO OTHER / NEW DATA

• Evolving, diverse science will ~always~ require diverse ontologies / vocabularies / ways of describing and organizing data

• Traditional integration models (tM*mappings.txt) and applications are not designed for efficient alignment across new standards and datasets.

[Figure labels: (new data), DrugBank]

Page 6:

SOLUTIONS

Page 7:

SOLUTIONS FOR ETL AND INTEROPERABILITY: “GETTING DATA IN”

• Provide automation with provenance, dynamic rules, alerting

• Provide algorithmic / inference support for ETL yet make curation decisions transparent and easily reported

• Automate alignment with tranSMART schema

[Figure labels: PD Study, Alz Study]
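
One way to picture "automation with provenance, dynamic rules, alerting" is a curation step that records every change it makes and flags values no rule covers. The Python sketch below is an assumption-laden illustration, not the presenters' implementation: the `CurationDecision` record, the `curate` function, the rule names, and the "unknown-code" row are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CurationDecision:
    source: str    # data set the value came from
    field: str     # column being curated
    original: str  # value as found in the source
    curated: str   # value after the rule was applied
    rule: str      # name of the rule that made the change


def curate(rows, source, field, rules):
    """Apply rules to one field; return curated rows, decisions and alerts."""
    curated_rows, decisions, alerts = [], [], []
    for row in rows:
        value = row[field]
        if value in rules:
            rule_name, new_value = rules[value]
            decisions.append(
                CurationDecision(source, field, value, new_value, rule_name)
            )
            value = new_value
        else:
            # Nothing is changed silently: unknown values raise an alert.
            alerts.append(f"{source}: no rule for {field}={value!r}")
        curated_rows.append({**row, field: value})
    return curated_rows, decisions, alerts


# Hypothetical rules: raw value -> (rule name, preferred label).
rules = {
    "Alz": ("expand-abbreviation", "Alzheimers Disease"),
    "Parkinsons": ("preferred-label", "Parkinsons Disease"),
}
rows = [{"Cond": "Alz"}, {"Cond": "Parkinsons"}, {"Cond": "unknown-code"}]

curated, decisions, alerts = curate(rows, "Lab Studies", "Cond", rules)
for decision in decisions:
    print(decision)
print(alerts)
```

Because each decision carries its source, original value and rule name, the list can be reported or reviewed later without re-running the load.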

Page 8:

SOLUTIONS FOR ETL AND INTEROPERABILITY: MAKING DATA USEFULLY SEARCHABLE

• “Data in” is not enough – semantic alignment is required

• Supported discovery and visualization across datasets to harmonize common identifiers, terms, relationships

• Automate alignment with pre-existing standards and ontologies, with preferred labels and synonyms

Page 9:

SOLUTIONS FOR ETL AND INTEROPERABILITY: CONNECTING TO OTHER / NEW DATA

• Apply resources (ontologies/vocabularies, math, inference) => rules that transparently extend to new sources

• Build on a data model and environment designed for semantic alignment with new data or collaborators*

* E.g., Re-align my data with… AERO, BAO, ChEBI, ChEMBL, DisGeNET, DrugBank, ICDn, NCIT, UMLS, VOID, (…)

[Figure labels: (new data), DrugBank]

Page 10:

TOOLS AND METHODS

Page 11:

LEVERAGING SEMANTIC INNOVATION FOR TRANSMART

• Import data into Modeler (KE)

• Agile semantic (W3C) data modelling

• Machine-aided identification of semantic inconsistencies

• Extend with additional resources

• Align with desired ontologies and nomenclature

• Re-align with collaborators’, new sources

• Import into tranSMART using ETL pipeline

• Store and execute integration rules and entailments

• Extend and apply tranSMART’s loading ETL

Page 12:

SENTIENT PLATFORM KNOWLEDGE EXPLORER, WEB

QUERY VISUALIZATION, QUERY AND

SEMANTIC BENEFITS FOR INTEGRATION MAPPING

IMPORT, EDIT AND APPLY ONTOLOGIES AND CONTROLLED VOCABULARIES

UNCOVER HIDDEN RELATIONSHIPS AND APPLY INFERENCE TO IMPROVE INTEGRATION EFFICIENCY

SEARCH AND / OR DELIVER INTEGRATED DATA TO TRANSMART AND OTHER APPLICATIONS

[Figure labels: Resource Mapping, Linked Open Data, Ontologies, SPARQL, RDF]

Enterprise platform built on open source using Angular, SPARK, Knime, NoSQL*, …

Page 13:

Page 14:

STARTING WITH (TRANSLATIONAL) DIVERSITY

[Figure labels: Study 1 Data Set, Study 2 Data Set]

Patient Name   Cond   Trtmnt
[Patient x]    Alz    Azilect

Pt ID          Disease Diag.   Rx
[Pt ID xx4x]   Parkinsons      Rasagaline

The data is in separate applications, using different standards and databases.

We want to be able to ask questions that include all of our clinical data and molecular assessments, and our partner’s data, but currently can’t do this. What if we put the data into tranSMART!?!

Page 15:

STAGING RDF FOR CURATION DISCOVERY VIA MATH, DIRTY QUERIES AND INFERENCE

Lab Studies Data Set

Patient Name   Cond   Trtmnt
[Patient x]    Alz    Azilect

(is transformed by IOI tranSMART staging ontology)

Semantic Lab Studies Network (for initial discovery, harmonization)

[Figure: network with nodes Patient [Preferred ID #], AlzheimersDisease, Rasagiline; edge labels: diagnosis, treatment]

(*align with useful terminology and relationships)

First, “dump” the data into the system and “shake it”. Creation and application of staging RDF reduces manual review requirements by over 95%.*

*Automap to tranSMART ontology standard, apply math, concatenate IDs, visualize and iterate with subject experts…
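
The staging step can be pictured as turning each table row into subject-predicate-object statements keyed by a concatenated patient ID. The sketch below uses plain Python tuples in place of real RDF, and every name in it (`COLUMN_TO_PREDICATE`, `preferred_id`, `stage_row`, the "LabStudies" prefix) is a hypothetical stand-in; the actual IOI tranSMART staging ontology is not shown on the slide.

```python
# Hypothetical mapping from source column headers to staging predicates.
COLUMN_TO_PREDICATE = {"Cond": "diagnosis", "Trtmnt": "treatment"}


def preferred_id(source, local_id):
    """Concatenate the source name and local ID into one preferred patient ID."""
    return f"{source}:{local_id}"


def stage_row(source, id_column, row):
    """Turn one table row into (subject, predicate, object) staging triples."""
    subject = preferred_id(source, row[id_column])
    triples = []
    for column, value in row.items():
        if column == id_column:
            continue
        # Unmapped columns keep their raw header so nothing is silently lost.
        predicate = COLUMN_TO_PREDICATE.get(column, column)
        triples.append((subject, predicate, value))
    return triples


lab_row = {"Patient Name": "[Patient x]", "Cond": "Alz", "Trtmnt": "Azilect"}
triples = stage_row("LabStudies", "Patient Name", lab_row)
for triple in triples:
    print(triple)
```

The raw values ("Alz", "Azilect") are deliberately left as found; aligning them to preferred terms is a separate, later step.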


Page 16:

ALIGN AND RE-ALIGN WITH STANDARDS: A SEMANTIC MODELING ENVIRONMENT

Medical Records Data Set

Pt ID          Disease Diag.   Rx
[Pt ID xx4x]   Parkinsons      Rasagaline

(is enhanced into)

Semantic Medical Records Network

[Figure: network with nodes Patient [Preferred ID #], Parkinsons Disease, Rasagiline, Azilect; edge labels: diagnosis, treatment, brand name]

(*harmonize to useful “upper” synonyms and relationships)

Bringing the next dataset or standard into the system makes cross-source lexical matching, ontological/nomenclature (relationships/labels) matching and inference available for curation and harmonization, with context and provenance.

Page 17:

USEFUL OUTCOME…

Find patients diagnosed with both Parkinson's and Alzheimer's disease who were treated with Azilect.

All data is harmonized and deeply searchable

Query directs content to desired application / schema (e.g., “put it in tranSMART!”)

[Figure: semantic network linked by common terms. Nodes: Patient [Preferred ID #], ParkinsonsDisease, AlzheimersDisease, Rasagiline, Azilect; edge labels: diagnosis, treatment, brand name]
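
The question on this slide can be sketched over a small set of harmonized statements. The Python below is an illustration only: the patient IDs are invented, the triples are plain tuples rather than RDF, and the direction of the "brand name" edge (from Rasagiline to Azilect) is an assumption read off the figure labels. A production system would more likely express this as a SPARQL query.

```python
# Illustrative harmonized triples; patient IDs are hypothetical.
TRIPLES = {
    ("patient:1", "diagnosis", "ParkinsonsDisease"),
    ("patient:1", "diagnosis", "AlzheimersDisease"),
    ("patient:1", "treatment", "Rasagiline"),
    ("patient:2", "diagnosis", "ParkinsonsDisease"),
    ("patient:2", "treatment", "Rasagiline"),
    ("Rasagiline", "brand name", "Azilect"),
}


def objects(subject, predicate):
    """All objects linked to subject by predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def treated_with(patient, drug_label):
    """True if a treatment matches the label directly or via its brand name."""
    return any(
        drug_label == treatment or drug_label in objects(treatment, "brand name")
        for treatment in objects(patient, "treatment")
    )


def patients_with(diagnoses, drug_label):
    """Patients holding every listed diagnosis and treated with the drug."""
    patients = {s for s, p, _ in TRIPLES if p == "diagnosis"}
    return sorted(
        patient
        for patient in patients
        if diagnoses <= objects(patient, "diagnosis")
        and treated_with(patient, drug_label)
    )


result = patients_with({"ParkinsonsDisease", "AlzheimersDisease"}, "Azilect")
print(result)
```

The query asks for "Azilect" even though the patient records only mention Rasagiline; the brand-name link is what lets the two terms meet.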

Page 18:

BENEFITS

Page 19:

BENEFITS OF A SEMANTIC LAYER: GET USEFUL DATA IN MORE QUICKLY

• Simplify initial analysis and navigation across prospective datasets to reduce manual review burden for curation by over 95%

• Maintain provenance on data sources, curation decisions

• Automate ETL on clean data to reduce loading time and effort (*alerts for unexpected events, pop-up decision-maker)

Page 20:

BENEFITS OF A SEMANTIC LAYER: DATA IN TRANSMART IS HARMONIZED

• Visual, algorithmic, thesauri, ontological and inferential support for data harmonization

• Create and reuse rules, inferences and classifications for semantic interoperability

Page 21:

BENEFITS OF A SEMANTIC LAYER: DATA IS FOR LONG-TERM ALIGNMENT WITH NEW DATA AND STANDARDS

• Provide a data model and platform designed for rapid extension and interoperability with new data sources

• Use and re-use private and public resources - algorithms, vocabularies, ontologies/taxonomies - for aligning and re-aligning data

Page 22:

GROWING ADOPTION AND RESOURCES

[Figure: timeline with years 2007, 2008, 2009, 2011, 2014]

Page 23:

SIDE NOTE - DOCKER CONTAINER FOR EASY INSTALLATION OF TRANSMART IS NOW AVAILABLE

https://registry.hub.docker.com/u/ioinformatics/transmart/

Page 24:

Discussion

For additional information contact IO Informatics:

Robert Stanley, CEO
Bo Purtic, Ph.D, Director Sales
Bill Hayden, Director Business Development

Phone: (510) 705-8470
Email: [email protected]