a few loose ends

Information Management for the Life Sciences M. Scott Marshall Marco Roos Adaptive Information Disclosure University of Amsterdam

Post on 27-Jan-2016

TRANSCRIPT

Page 1: A few loose ends

Information Management for the Life Sciences

M. Scott Marshall, Marco Roos

Adaptive Information Disclosure, University of Amsterdam

Page 2: A few loose ends

A few loose ends

• Practice: OWL modeling of a statement
• COI demo: Bridging CDISC and HL7 with query federation
• Terminology and SKOS
• Demonstration
• Toward Query Federation – putting it all together

Page 3: A few loose ends

Towards RDF/OWL(1)

ALL instances of PeptideHormone are an instance of Peptide that has_role SOME instance of HormoneActivity

Source: Alan Ruttenberg
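
The statement above can also be read as a description-logic axiom. The following is a sketch in standard DL notation, which does not appear on the original slides; the class and property names are the ones used in the statement:

```latex
\[
\mathit{PeptideHormone} \sqsubseteq
  \mathit{Peptide} \sqcap \exists\, \mathit{has\_role}.\mathit{HormoneActivity}
\]
```

"ALL" corresponds to the subclass symbol and "SOME" to the existential restriction.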

Page 4: A few loose ends

Towards RDF/OWL(2)

ALL instances of PeptideHormone are an instance of Peptide that has_role SOME instance of HormoneActivity

Source: Alan Ruttenberg

Page 5: A few loose ends

Towards RDF/OWL(3) - Instances

Source: Alan Ruttenberg

Page 6: A few loose ends

Towards RDF/OWL(4) URIs

chebi:25905 = <http://purl.org/obo/owl/CHEBI#CHEBI_25905>

Source: Alan Ruttenberg
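
The abbreviation on this slide can be sketched as a simple prefix expansion. This is a minimal illustration, assuming that the `chebi:` prefix stands for everything in the URI up to the numeric identifier; the `PREFIXES` table and the `expand` function are hypothetical names, not part of any library:

```python
# Hypothetical prefix table; only the chebi entry is derived from the slide.
PREFIXES = {
    "chebi": "http://purl.org/obo/owl/CHEBI#CHEBI_",
}


def expand(curie, prefixes=PREFIXES):
    """Expand a compact name such as 'chebi:25905' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand("chebi:25905"))
# prints http://purl.org/obo/owl/CHEBI#CHEBI_25905
```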

Page 7: A few loose ends

Towards OWL(5) : triples

chebi:25905 rdfs:subClassOf chebi:16670.

chebi:25905 rdfs:subClassOf _:1.

_:1 owl:onProperty ro:hasRole.

_:1 owl:someValuesFrom go:GO_00179.

…

Source: Alan Ruttenberg

Page 8: A few loose ends

SPARQLing: Put ?variables where you are looking for matches

chebi:25905 rdfs:subClassOf chebi:16670.

chebi:25905 rdfs:subClassOf _:1.

_:1 owl:onProperty ro:hasRole.

_:1 owl:someValuesFrom go:GO_00179.

select ?moleculeClass where {

?moleculeClass rdfs:subClassOf chebi:16670.

?moleculeClass rdfs:subClassOf ?res.

?res owl:onProperty ro:hasRole.

?res owl:someValuesFrom go:GO_00179.

}

?moleculeClass = chebi:25905

Source: Alan Ruttenberg
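
How the query arrives at this answer can be sketched with a toy pattern matcher. This is plain Python, not a SPARQL engine: it handles only the conjunctive triple patterns shown above, and the function names `unify` and `match` are made up for the illustration. The blank node is written `_:1` throughout.

```python
# Triples from the slide, written as plain (subject, predicate, object) tuples.
TRIPLES = [
    ("chebi:25905", "rdfs:subClassOf", "chebi:16670"),
    ("chebi:25905", "rdfs:subClassOf", "_:1"),
    ("_:1", "owl:onProperty", "ro:hasRole"),
    ("_:1", "owl:someValuesFrom", "go:GO_00179"),
]

# The query pattern: terms starting with '?' are variables.
PATTERN = [
    ("?moleculeClass", "rdfs:subClassOf", "chebi:16670"),
    ("?moleculeClass", "rdfs:subClassOf", "?res"),
    ("?res", "owl:onProperty", "ro:hasRole"),
    ("?res", "owl:someValuesFrom", "go:GO_00179"),
]


def unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in triples:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match(patterns[1:], triples, extended)


results = sorted({b["?moleculeClass"] for b in match(PATTERN, TRIPLES)})
print(results)  # prints ['chebi:25905']
```

The variable `?res` binds to the blank node `_:1`, which is how the query reaches the restriction without knowing its identifier in advance.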

Page 9: A few loose ends

Current Task Forces

• BioRDF – integrated neuroscience knowledge base
  – Kei Cheung (Yale University)
• Clinical Observations Interoperability – patient recruitment in trials
  – Vipul Kashyap (Cigna Healthcare)
• Linking Open Drug Data – aggregation of Web-based drug data
  – Chris Bizer (Free University Berlin)
• Pharma Ontology – high level patient-centric ontology
  – Christi Denney (Eli Lilly)
• Scientific Discourse – building communities through networking
  – Tim Clark (Harvard University)
• Terminology – Semantic Web representation of existing resources
  – John Madden (Duke University)

Page 10: A few loose ends

Background of the HCLS IG

• Originally chartered in 2005
  – Chairs: Eric Neumann and Tonya Hongsermeier
• Re-chartered in 2008
  – Chairs: Scott Marshall and Susie Stephens
  – Team contact: Eric Prud’hommeaux
• Broad industry participation
  – Over 100 members
  – Mailing list of over 600
• Background Information
  – http://www.w3.org/2001/sw/hcls/
  – http://esw.w3.org/topic/HCLSIG

Page 11: A few loose ends

COI Task Force

• Task Lead: Vipul Kashyap
• Participants: Eric Prud’hommeaux, Helen Chen, Jyotishman Pathak, Rachel Richesson, Holger Stenzhorn

Page 12: A few loose ends

COI: Bridging Bench to Bedside

• How can existing Electronic Health Records (EHR) formats be reused for patient recruitment?

• Quasi-standard formats for clinical data:
  – HL7/RIM/DCM – healthcare delivery systems
  – CDISC/SDTM – clinical trial systems
• How can we map across these formats?
  – Can we ask questions in one format when the data is represented in another format?

Source: Holger Stenzhorn
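
One way to picture "asking in one format when the data is in another" is to rewrite the predicates of a query before it is run. The sketch below is only an illustration of that idea. The mapping table and the `hl7:` predicate names are hypothetical and do not come from a real vocabulary or from the COI demo itself.

```python
# Hypothetical mapping from trial-side (SDTM-like) predicates to
# record-side (HL7-like) predicates. The hl7: names are invented
# purely for illustration.
PREDICATE_MAP = {
    "sdtm:subject": "hl7:patient",
    "sdtm:dosage": "hl7:doseQuantity",
}


def rewrite(patterns, predicate_map=PREDICATE_MAP):
    """Rewrite the predicate of each (s, p, o) pattern; keep unmapped ones."""
    return [(s, predicate_map.get(p, p), o) for s, p, o in patterns]


query = [
    ("?medication", "sdtm:subject", "?patient"),
    ("?medication", "spl:activeIngredient", "?ingredient"),
]
rewritten = rewrite(query)
print(rewritten)
```

Real mappings between clinical formats are rarely one-to-one at the predicate level, so this shows the direction of the approach rather than its full difficulty.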

Page 13: A few loose ends

COI: Use Case

Pharmaceutical companies pay a lot to test drugs

Pharmaceutical companies express protocol in CDISC

-- precipitous gap --

Hospitals exchange information in HL7/RIM

Hospitals have relational databases

Source: Eric Prud’hommeaux

Page 14: A few loose ends

Inclusion Criteria

• Type 2 diabetes on diet and exercise therapy or monotherapy with metformin, insulin secretagogue, or alpha-glucosidase inhibitors, or a low-dose combination of these at 50% maximal dose. Dosing is stable for 8 weeks prior to randomization.
• …

• ?patient takes metformin .

Source: Holger Stenzhorn

Page 15: A few loose ends

Exclusion Criteria

Use of warfarin (Coumadin), clopidogrel (Plavix) or other anticoagulants.

…

?patient doesNotTake anticoagulant .

Source: Holger Stenzhorn

Page 16: A few loose ends

Criteria in SPARQL

?medication1 sdtm:subject ?patient ;
             spl:activeIngredient ?ingredient1 .
?ingredient1 spl:classCode 6809 . # metformin
OPTIONAL {
  ?medication2 sdtm:subject ?patient ;
               spl:activeIngredient ?ingredient2 .
  ?ingredient2 spl:classCode 11289 . # anticoagulant
}
FILTER (!BOUND(?medication2))

Source: Holger Stenzhorn
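
The OPTIONAL block followed by FILTER (!BOUND(...)) is the usual SPARQL 1.0 idiom for negation: try to find an anticoagulant record and keep the patient only if none was found. The same logic can be sketched in plain Python as a set difference. The patient records below are invented for the illustration; only the two class codes come from the query.

```python
# Class codes as used in the query above.
METFORMIN = 6809
ANTICOAGULANT = 11289

# Hypothetical medication records: (patient, ingredient class code).
MEDICATIONS = [
    ("patient-a", METFORMIN),
    ("patient-b", METFORMIN),
    ("patient-b", ANTICOAGULANT),
    ("patient-c", ANTICOAGULANT),
]


def eligible(medications):
    """Patients with a metformin record and no anticoagulant record."""
    on_metformin = {p for p, code in medications if code == METFORMIN}
    on_anticoagulant = {p for p, code in medications if code == ANTICOAGULANT}
    # The set difference plays the role of OPTIONAL + FILTER(!BOUND).
    return sorted(on_metformin - on_anticoagulant)


print(eligible(MEDICATIONS))  # prints ['patient-a']
```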

Page 17: A few loose ends

Terminology Task Force

• Task Lead: John Madden
• Participants: Chimezie Ogbuji, M. Scott Marshall, Helen Chen, Holger Stenzhorn, Mary Kennedy, Xiashu Wang, Rob Frost, Jonathan Borden, Guoqian Jiang

Page 18: A few loose ends

Features: the “bridge” to meaning

Concepts    Features                   Data
Ontology    Keyword Vectors            Literature
Ontology    Image Features             Image(s)
Ontology    Gene Expression Profile    Microarray
Ontology    Detected Features          Sensor Array

Page 19: A few loose ends

Terminology: Overview

• Goal is to identify use cases and methods for extracting Semantic Web representations from existing, standard medical record terminologies, e.g. UMLS
• Methods should be reproducible and, to the extent possible, not lossy
• Identify and document issues along the way related to identification schemes, expressiveness of the relevant languages
• Initial effort will start with SNOMED-CT and UMLS Semantic Networks and focus on a particular sub-domain (e.g. pharmacological classification)

Source: John Madden

Page 20: A few loose ends

Medical terminologies: today

Moderate number of large, evolved terminologies

Adapted for specific business-process contexts

Each separately, centrally curated

Typically hierarchical, various expressivities

Uncommon to mix vocabularies

Outpatient billing - CPT

Inpatient billing - ICD

Laboratory results - LOINC

Clinical findings - SNOMED

Journal indexing - MEDLARS

Pharmacy - MedDRA

Process - HL7

Clinical trials - CDISC

Others...

Source: John Madden

Page 21: A few loose ends

SKOS & the 80/20 principle: map “down”

• Minimal assumptions about expressiveness of source terminology
• No assumed formal semantics (no model theory)
• Treat it as a knowledge “map”
• Extract 80% of the utility without risk of falsifying intent

Source: John Madden
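
Mapping "down" to SKOS can be sketched as follows: each term becomes a `skos:Concept` with a `skos:prefLabel`, and each parent link becomes `skos:broader`, with no class semantics asserted. The three-term terminology and the `http://example.org/term/` base URI are invented for the illustration; only the SKOS property names are real.

```python
# Hypothetical source terminology: code -> (label, parent code or None).
TERMS = {
    "T1": ("Drug", None),
    "T2": ("Antidiabetic agent", "T1"),
    "T3": ("Metformin", "T2"),
}


def to_skos(terms, base="http://example.org/term/"):
    """Emit SKOS triples without asserting any formal class semantics."""
    triples = []
    for code, (label, parent) in terms.items():
        concept = base + code
        triples.append((concept, "rdf:type", "skos:Concept"))
        triples.append((concept, "skos:prefLabel", label))
        if parent is not None:
            # skos:broader records the hierarchy link as-is; unlike
            # rdfs:subClassOf it does not claim a subclass relationship.
            triples.append((concept, "skos:broader", base + parent))
    return triples


skos_triples = to_skos(TERMS)
for triple in skos_triples:
    print(triple)
```

Using `skos:broader` rather than `rdfs:subClassOf` is what keeps the mapping from claiming more than the source terminology intended.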

Page 22: A few loose ends

The AIDA toolbox for knowledge extraction and knowledge management in a Virtual Laboratory for e-Science

Page 23: A few loose ends

SNOMED CT/SKOS under AIDA: retrieve

Page 24: A few loose ends

Putting it all together

• Choosing valid terms for use in the SPARQL query by browsing/searching the knowledge base.

• Create single SPARQL endpoint for a federation of knowledge bases (SWObjects)

• Apply bridging technique to bridge MeSH terms and terms in HCLS Knowledge Base.

• Use terms from Terminology Server in Scientific Discourse

Page 25: A few loose ends
Page 26: A few loose ends
Page 27: A few loose ends

Task Force Resources to federate

• BioRDF – knowledge base, aTags (stored in KB)
• Clinical Observations Interoperability – drug ontology
• Linking Open Drug Data – LOD data
• Pharma Ontology – ontology
• Scientific Discourse – SWAN ontology, SWAN SKOS, myExperiment ontology
• Terminology – SNOMED-CT, MeSH, UMLS

Page 28: A few loose ends

Someday, we should be able to find this as evidence for a fact in a Knowledge Base

Page 29: A few loose ends

Getting Involved

• Benefits to getting involved include:
  – Early access to use cases and best practice
  – Influence standard recommendations
  – Cost effective exploration of new technology through collaboration
  – Network with others working on the Semantic Web
• Get involved
  – Email chairs and team contact: [email protected]
  – Participate in the next F2F (last one was here):
    • http://esw.w3.org/topic/HCLSIG/Meetings/2009-04-30_F2F

Page 30: A few loose ends

A Few Announcements

• Still unofficial but almost set: Semantic Web Applications and Tools for the Life Sciences Workshop (SWAT4LS) in Amsterdam 2009 (tentative date: Nov 20)

• Possibly W3C Semantic Web Health Care and Life Sciences Interest Group (HCLSIG) F2F in Fall in Amsterdam

• Shared Names http://sharednames.org workshop likely in the Fall, location unknown

• Protégé Conference in Amsterdam June 23 - 26