Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Interoperability

Post on 07-May-2015


DESCRIPTION

Ora Lassila and Amit Sheth, "Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Interoperability", Invited Talk at ONC-HHS Invitational Workshop on Next Generation Interoperability for Health, Washington DC, January 19-20, 2011.

TRANSCRIPT

Ora Lassila • Principal Architect (Nokia Mobile Solutions); also an advisor to Nokia’s top management

•  Elected member of W3C’s Advisory Board since 1998

•  Earlier: Research Fellow (Nokia Research), W3C Fellow (MIT), Project Manager (CMU), entrepreneur, etc.

•  Ph.D from Helsinki University of Technology (CS)

•  http://www.lassila.org/

Amit Sheth • LexisNexis Ohio Eminent Scholar, Director, Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University

•  Educator, researcher, entrepreneur – 2 companies, products, deployed apps, W3C and biomedical community standards

•  Earlier: UGA, Telcordia, Unisys, Honeywell

•  http://knoesis.org/amit

• Semantic Web • some background

Ora

• Semantic Web in use • examples of applications in traditional clinical care to translational medicine

Amit

• Challenges (and promise) • what makes this difficult • why do we want to pursue it anyway

Ora (technical) Amit (health)

• Often characterized as the “next generation of the World Wide Web” • Web content amenable to automation •  (current content intended for humans…)

• In reality, the Semantic Web is a vision of the future of (personal) computing • machines working on behalf of their human users • more autonomy, handling of unanticipated situations

• Heavy reliance on knowledge representation & reasoning • also multi-agent systems, other AI-based technologies

• At the core, the Semantic Web is about • describing things (objects, concepts, services, …) • querying the descriptions •  reasoning about the descriptions

• As such, it is knowledge representation •  for the Web •  (or KR using standardized Web technologies)
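The three activities named above (describing, querying, reasoning) can be sketched in a few lines of plain Python. This is a toy illustration only, not an RDF implementation: the `ex:` identifiers, the `query` helper, and the `infer_types` helper are all hypothetical names introduced here, and the only reasoning rule applied is the RDFS-style subclass rule.

```python
# Describing: facts as (subject, predicate, object) triples.
# All ex: identifiers are hypothetical and exist only for this sketch.
triples = {
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:aspirin", "ex:treats", "ex:headache"),
    ("ex:Drug", "rdfs:subClassOf", "ex:Substance"),
}


def query(store, s=None, p=None, o=None):
    """Querying: return triples matching the pattern; None is a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def infer_types(store):
    """Reasoning: if X has type C and C is a subclass of D, X has type D.

    The rule is applied repeatedly until no new triple is produced.
    """
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(inferred):
            if p1 != "rdf:type":
                continue
            for (c2, p2, d) in list(inferred):
                if p2 == "rdfs:subClassOf" and c2 == c:
                    new = (x, "rdf:type", d)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


print(query(triples, p="ex:treats"))
print(query(infer_types(triples), s="ex:aspirin", p="rdf:type"))
```

The second call returns a type that was never stated directly, which is the sense in which reasoning adds to what the descriptions say.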

• (in comparison, the “old Web” was really about documents and finding them…)

• Motivated by the need for automation • automation requires interoperability (via standards) • heavy process, high up-front investment •  (alternative: hand-crafted but “brittle” programs…)

• Interoperability achieved by exposing meaning • accessible semantics • note: interoperability of any two systems can be achieved via engineering, but this does not scale

• Automation → autonomy • prevailing paradigm: agent-based systems • implies reasoning, planning, interoperable representations of knowledge

• Contrary to “Web 2.0”, Semantic Web aims at achieving many things “ad hoc” • e.g., ad hoc mash-ups by non-computer savvy people

• Shared (and accessible) semantics is the key to interoperability • Semantic Web introduces a fundamentally different approach to standardization • standardize how to say things and not what to say • ontological techniques allow “delayed semantic commitment”
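One way to read “delayed semantic commitment” is that two parties can publish data with their own vocabularies first and agree on how the vocabularies relate later. The sketch below assumes exactly that reading; the clinic data, the property names `a:hasDiagnosis` and `b:diagnosedWith`, the diagnosis code, and the `normalize` helper are all hypothetical.

```python
# Two independently designed data sets that share only the triple shape
# ("how to say things"), not the vocabulary ("what to say").
# Every identifier below is hypothetical.
clinic_a = [("patient:1", "a:hasDiagnosis", "icd9:428.0")]
clinic_b = [("patient:2", "b:diagnosedWith", "icd9:428.0")]

# A mapping stated after both data sets already existed.
equivalent = {"b:diagnosedWith": "a:hasDiagnosis"}


def normalize(triples, mapping):
    """Rewrite predicates through the mapping; unmapped ones pass through."""
    return [(s, mapping.get(p, p), o) for (s, p, o) in triples]


merged = normalize(clinic_a + clinic_b, equivalent)
patients = sorted(
    s for (s, p, o) in merged
    if p == "a:hasDiagnosis" and o == "icd9:428.0"
)
print(patients)
```

Neither data set had to change; the agreement lives in the mapping, which can be revised or extended without touching the sources.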

• Semantic Web is built in a layered manner • Not everybody needs all the layers

Encoding characters: Unicode

Encoding structure: XML

Uniform metamodel: RDF + URI

Simple data models & taxonomies: RDF Schema

Rich ontologies: OWL

Queries: SPARQL, Rules: RIF

Semantic Web

• Achieve for data what the Web did for documents • Relationship with the original Semantic Web vision: no AI, no agents, no autonomy • Interoperability is still very important • interoperability of formats • interoperability of semantics

• Enables interchange of large data sets •  (thus very useful in, say, collaborative research)

• Semantic Web vision is largely predicated on the availability of data • Linked Data is a movement that gets us there

[Figure: evolution of the Web, with a timeline marking 1997 and 2007 and stages labeled Web 1.0, Web 2.0, Web 3.0]
• Web of pages: text, manually created links, extensive navigation
• Web of databases: dynamically generated pages, web query interfaces
• Web of resources: data, services, mashups
• Web of people: social networks, user-created casual content
• Web of Sensors, Devices/IoT: 40 billion sensors, 5 billion mobile connections
• Axis labels: Keywords; Patterns; Objects; Situations, Events; Tech assimilated in life

Medical Informatics Bioinformatics

Etiology Pathogenesis Clinical findings Diagnosis Prognosis Treatment

Genome Transcriptome

Proteome Metabolome

Physiome ...ome

Genbank

Uniprot

Pubmed

Clinical Trials.gov

...needs a connection

Biomedical Informatics

Hypothesis Validation Experiment design Predictions Personalized medicine

More advanced capabilities for search, integration, analysis, linking to new insights and discoveries!

Health Information Services

Elsevier iConsult

Scientific Literature

PubMed 300 Documents Published Online each day

User-contributed Content (Informal) Experts: GeneRifs WikiGene

Consumer: Blogs Social Networks

NCBI Public Datasets

Genome, Protein DBs new sequences daily

Laboratory Data

Lab tests, RTPCR, Mass spec

Clinical Data

Personal health history

Search, browsing, complex query, integration, workflow, analysis, hypothesis validation, decision support.

• W3C Semantic Web Health Care & Life Sciences Interest Group: http://www.w3.org/2001/sw/hcls/ • Clinical Observations Interoperability: EMR + Clinical Trials: http://esw.w3.org/HCLS/ClinicalObservationsInteroperability • National Center for Biomedical Ontology: http://bioportal.bioontology.org/

• Status: In use continuously since 01/2006 • Where: Athens Heart Center & its partners and labs • What: Use of semantic Web technologies for clinical decision support

Examples demonstrating use of Semantic Web for Health Care and Life Sciences research projects and operational clinical or research applications

Details: http://knoesis.org/library/resource.php?id=00004

Annotate ICD9s Annotate Doctors

Lexical Annotation

Level 3 Drug Interaction

Insurance Formulary

Drug Allergy Demo at: http://knoesis.org/library/demos/

[Figure: drug ontology class hierarchy rooted at owl:Thing. Classes shown: prescription_drug, prescription_drug_brand_name, brandname_undeclared, brandname_composite, brandname_individual, prescription_drug_generic, generic_individual, generic_composite, monograph_ix_class, cpnum_group, property, prescription_drug_property, indication_property, formulary_property, interaction_property, non_drug_reactant, formulary, indication, interaction, interaction_with_prescription_drug, interaction_with_non_drug_reactant, interaction_with_monograph_ix_class]

• Status: Completed research • Where: NIH • What: queries across integrated data sources • Enriching data with ontologies for integration, querying, and automation • Ontologies beyond vocabularies: the power of relationships

gene

GO

PubMed

Gene name

OMIM

Sequence

Interactions Glycosyltransferase

Congenital muscular dystrophy Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07 http://knoesis.org/library/resource.php?id=00014

Congenital muscular dystrophy, type 1D

(GeneID: 9215)

has_associated_disease

has_molecular_function

Acetylglucosaminyl-transferase activity

[Figure: EG:9215 (LARGE) has_molecular_function GO:0008375 (acetylglucosaminyltransferase) and has_associated_phenotype MIM:608840 (Muscular dystrophy, congenital, type 1D). Gene Ontology terms linked by isa: GO:0016757 (glycosyltransferase), GO:0016758, GO:0008194, GO:0008375 (acetylglucosaminyltransferase)]

From medinfo paper. Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07

SELECT DISTINCT ?t ?g ?d {
  ?t is_a GO:0016757 .
  ?g has_molecular_function ?t .
  ?g has_associated_phenotype ?b2 .
  ?b2 has_textual_description ?d .
  FILTER regex(?d, "muscular dystrophy", "i") .
  FILTER regex(?d, "congenital", "i")
}
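To make the logic of the query above concrete, here is a rough Python equivalent over a handful of in-memory triples. The identifiers are the ones shown on the slides; the order of the is_a chain between GO:0008375 and GO:0016757 is an assumption made for this sketch, as is the choice to follow is_a transitively (the query as written would need an inferred or transitive is_a to reach deeper terms). The helper names `objects`, `subjects`, `descendants`, and `genes_for` are hypothetical.

```python
import re

# Triples transcribed from the slides. The order of the is_a chain is an
# assumption for illustration; only its end points appear in the query.
triples = [
    ("GO:0008375", "is_a", "GO:0008194"),
    ("GO:0008194", "is_a", "GO:0016758"),
    ("GO:0016758", "is_a", "GO:0016757"),
    ("EG:9215", "has_molecular_function", "GO:0008375"),
    ("EG:9215", "has_associated_phenotype", "MIM:608840"),
    ("MIM:608840", "has_textual_description",
     "Muscular dystrophy, congenital, type 1D"),
]


def objects(s, p):
    """All o such that (s, p, o) is in the store."""
    return [o for (s2, p2, o) in triples if s2 == s and p2 == p]


def subjects(p, o):
    """All s such that (s, p, o) is in the store."""
    return [s for (s, p2, o2) in triples if p2 == p and o2 == o]


def descendants(term):
    """Terms that reach `term` through one or more is_a edges."""
    found, stack = set(), [term]
    while stack:
        for child in subjects("is_a", stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def genes_for(root, *keywords):
    """Mirror the query: (term, gene, description) rows whose description
    matches every keyword, ignoring case."""
    rows = set()
    for t in descendants(root):
        for g in subjects("has_molecular_function", t):
            for ph in objects(g, "has_associated_phenotype"):
                for d in objects(ph, "has_textual_description"):
                    if all(re.search(k, d, re.IGNORECASE) for k in keywords):
                        rows.add((t, g, d))
    return sorted(rows)


print(genes_for("GO:0016757", "muscular dystrophy", "congenital"))
```

The link from glycosyltransferase to congenital muscular dystrophy is found only because the gene, the ontology terms, and the phenotype sit in one graph and the is_a relationship is followed across several levels.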

• Status: Completed research • Where: NIH • What: Understanding the genetic basis of nicotine dependence. Integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. • How: Semantic Web technologies (especially RDF, OWL, and SPARQL) support information integration and make it easy to create semantic mashups (semantically integrated resources).

• NIDA study on nicotine dependency • List of candidate genes in humans • Analysis objectives include:

o Find interactions between genes o Identification of active genes – maximum number of pathways o Identification of genes based on anatomical locations

• Requires integration of genome and biological pathway information
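Once gene and pathway records from several sources sit in one place, the second objective (genes that take part in the largest number of pathways) reduces to a grouping step. The sketch below assumes the integration has already produced simple (gene, pathway) pairs; the gene names, pathway identifiers, and function names are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict

# Hypothetical (gene, pathway) pairs as they might look after integration.
records = [
    ("GENE_A", "reactome:pathway_1"),
    ("GENE_A", "kegg:pathway_2"),
    ("GENE_B", "kegg:pathway_2"),
    ("GENE_A", "humancyc:pathway_3"),
    ("GENE_B", "kegg:pathway_2"),  # same fact reported by a second source
]


def pathways_per_gene(rows):
    """Group pathways by gene; sets absorb duplicate reports."""
    by_gene = defaultdict(set)
    for gene, pathway in rows:
        by_gene[gene].add(pathway)
    return by_gene


def most_connected(rows):
    """Return (genes tied for the most distinct pathways, that count)."""
    by_gene = pathways_per_gene(rows)
    best = max(len(paths) for paths in by_gene.values())
    winners = sorted(g for g, paths in by_gene.items() if len(paths) == best)
    return winners, best


print(most_connected(records))
```

The hard part in practice is producing consistent identifiers for the pairs, which is where the ontologies and the integration step come in; the counting itself is simple.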

[Figure: Genome and pathway information integration. Sources shown: Entrez Gene, Reactome, KEGG, HumanCyc, Gene Ontology, HomoloGene. Fields shown: pathway, protein, pmid (repeated for three sources), GO ID, HomoloGene ID]

http://knoesis.org/library/resource.php?id=00221

BioPAX ontology

Entrez Knowledge Model (EKoM)

• Status: Research prototype – in regular lab use • Where: Center for Tropical and Emerging Global Diseases (CTEGD), UGA • What: Semantics and Services Enabled Problem Solving Environment for Trypanosoma cruzi • Who: Kno.e.sis, UGA, NCBO

Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University

Tarleton Research Group, Center for Tropical and Emerging Global Diseases (CTEGD), University of Georgia

Large Scale Distributed Information Systems (LSDIS), University of Georgia

National Center for Biomedical Ontology (NCBO), Stanford University

The Wellcome Trust Sanger Institute, Cambridge, UK

The Oswaldo Cruz Institute (Fiocruz), Brazil

• T. cruzi is a protozoan parasite that causes Chagas disease, or American trypanosomiasis • Chagas disease is the leading cause of death in Latin America, where around 18 million people are infected with this parasite • Related parasites include Trypanosoma brucei and Leishmania major, which cause African trypanosomiasis and leishmaniasis, respectively.

T. brucei surrounded by red blood cells in a smear of infected blood. (Copyright: Jürgen Berger and Dr. Peter Overath, Max Planck Institute for Developmental Biology, Tübingen)

Trykipedia - a Wiki-based platform for collaboration of Parasite Research Community

• Data Resources
o Internal lab data (from Tarleton Research Group): Gene Knockout, Strain Creation, Microarray, and Proteome
o External databases (TriTrypDB, ProtozoaDB, DrugBank, etc.)

• Ontologies
o Parasite Lifecycle Ontology (PLO)
o Parasite Experiment Ontology (PEO)

• PKR supports complex biological queries related to T. cruzi drugs, vaccination, or gene knockout targets; for example,
o Find all genes with proteomic expression in mammalian lifecycle stage with GPI anchor or signal peptide predictions.
o Find genes annotated as potential vaccine candidates.
o Find all genes with proteomic expression evidence in the mammalian host lifecycle stages for T. cruzi
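The first example query combines two kinds of evidence: where a gene is expressed and what has been predicted about its protein. A minimal sketch of that combination follows. It is not how PKR is implemented; the `Gene` record, the gene identifiers, and the annotation values are hypothetical, and treating amastigote and trypomastigote as the mammalian-host stages is an assumption stated here rather than taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    gene_id: str
    # Lifecycle stages with proteomic expression evidence.
    expressed_in: set = field(default_factory=set)
    # Sequence-based predictions attached to the gene product.
    predictions: set = field(default_factory=set)


# Assumption for this sketch: stages found in the mammalian host.
MAMMALIAN_STAGES = {"amastigote", "trypomastigote"}
WANTED_PREDICTIONS = {"GPI anchor", "signal peptide"}

# Hypothetical annotations; real data would come from the lab and
# external databases via the ontologies.
genes = [
    Gene("gene_1", {"amastigote"}, {"GPI anchor"}),
    Gene("gene_2", {"epimastigote"}, {"signal peptide"}),
    Gene("gene_3", {"trypomastigote"}, set()),
    Gene("gene_4", {"trypomastigote", "epimastigote"}, {"signal peptide"}),
]


def candidates(all_genes):
    """Genes expressed in a mammalian stage AND carrying a wanted prediction."""
    return [
        g.gene_id
        for g in all_genes
        if g.expressed_in & MAMMALIAN_STAGES
        and g.predictions & WANTED_PREDICTIONS
    ]


print(candidates(genes))
```

In an ontology-backed system the set of mammalian stages would be looked up from the lifecycle ontology rather than hard-coded, so the same query keeps working when the ontology is refined.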

*T.cruzi Semantic Problem Solving Environment Project, Courtesy of D.B. Weatherly and Flora Logan, Tarleton Lab, University of Georgia

[Figure: Gene Knockout and Strain Creation* workflow. Process steps: Sequence Extraction, Plasmid Construction, Transfection, Drug Selection, Cell Cloning. Inputs and intermediate products: Gene Name, 3' & 5' Region, Knockout Construct Plasmid, Drug Resistant Plasmid, T. cruzi sample, Transfected Sample, Selected Sample, Cloned Sample. Also shown: Cloned Sample, Gene Name, and a question mark]

Related Queries from Biologists • List all groups in the lab that used a Target Region Plasmid. • Which researcher created a new strain of the parasite (with ID = 66)? • An experiment was not successful – has this experiment been conducted earlier? What were the results?

Complex queries can also include: - on-the-fly Web services execution to retrieve additional data -  inference rules to make implicit knowledge explicit
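As a small illustration of an inference rule that makes implicit knowledge explicit, consider provenance: if a strain was derived from a sample, and that sample from another, the strain is implicitly derived from everything further up the chain. The sketch below applies that one transitive rule to hypothetical records; the item names, `researcher_x`, and both helper functions are invented for this example and do not come from the project.

```python
# Hypothetical provenance facts for one knockout experiment:
# each item maps to the item it was directly derived from.
derived_from = {
    "strain_66": "cloned_sample_1",
    "cloned_sample_1": "selected_sample_1",
    "selected_sample_1": "transfected_sample_1",
    "transfected_sample_1": "knockout_plasmid_1",
}
# Who created which item (only partially recorded).
created_by = {"knockout_plasmid_1": "researcher_x"}


def lineage(item):
    """Rule: if A derived_from B and B derived_from C, then A derived_from C.

    Returns every ancestor of `item`, nearest first.
    """
    chain = []
    while item in derived_from:
        item = derived_from[item]
        chain.append(item)
    return chain


def contributors(item):
    """People recorded as creators of the item or of any ancestor."""
    return sorted(
        {created_by[x] for x in [item] + lineage(item) if x in created_by}
    )


print(lineage("strain_66"))
print(contributors("strain_66"))
```

No record links strain_66 to a researcher directly; the answer appears only after the rule has been applied along the chain.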

1.  Describe drug users’ knowledge, attitudes, and behaviors related to illicit use of OxyContin®

2.  Describe temporal patterns of non-medical use of OxyContin® tablets as discussed on Web-based forums

3.  Collaboration between Kno.e.sis and CITAR (Center for Interventions, Treatment and Addictions Research) at Wright State Univ.

• Volatile nature of execution environments • May have an impact on multiple activities/tasks in the workflow • HF Pathway • New information about diseases, drugs becomes available • Affects treatment plans, drug-drug interactions

• Need to incorporate the new knowledge into execution • capture the constraints and relationships between different tasks and activities

New knowledge about treatment found during the execution of the pathway

New knowledge about drugs, drug-drug interactions

Diabetes mellitus adversely affects the outcomes in patients with myocardial infarction (MI), due in part to the exacerbation of left ventricular (LV) remodeling. Although angiotensin II type 1 receptor blocker (ARB) has been demonstrated to be effective in the treatment of heart failure, information about the potential benefits of ARB on advanced LV failure associated with diabetes is lacking. To induce diabetes, male mice were injected intraperitoneally with streptozotocin (200 mg/kg). At 2 weeks, anterior MI was created by ligating the left coronary artery. These animals received treatment with olmesartan (0.1 mg/kg/day; n = 50) or vehicle (n = 51) for 4 weeks. Diabetes worsened the survival and exaggerated echocardiographic LV dilatation and dysfunction in MI. Treatment of diabetic MI mice with olmesartan significantly improved the survival rate (42% versus 27%, P < 0.05) without affecting blood glucose, arterial blood pressure, or infarct size. It also attenuated LV dysfunction in diabetic MI. Likewise, olmesartan attenuated myocyte hypertrophy, interstitial fibrosis, and the number of apoptotic cells in the noninfarcted LV from diabetic MI. Post-MI LV remodeling and failure in diabetes were ameliorated by ARB, providing further evidence that angiotensin II plays a pivotal role in the exacerbated heart failure after diabetic MI.

ARB possibly plays role in heart failure

Angiotensin II type 1 receptor blocker attenuates exacerbated left ventricular remodeling and failure in diabetes-associated myocardial infarction, Matsusaka H, et al.

possibly plays role in

Disease

Angiotensin Receptor Blocker (ARB)

Ontology: A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan, et al., ISWC 2006, LNCS 4273, pp. 583-596

• Matching medical requirements with availability of medical resources (Mumbai, India) • Project HERO: Helpline for Emergency Response Operations • For patients seeking immediate medical help

• Medical awareness in rural India •  mMitra, info. service during pregnancy and childhood

emergency

Information bridge

Medical Emergency

Medical Resources

• Any specific problem (typically) has a specific solution that does not require Semantic Web technologies • Q: Why then is the Semantic Web attractive? A: For future-proofing

Semantic Web can be a solution to those problems and situations that we are yet to define

• Cultural resistance (“this smacks of AI…”) • Unfamiliar technology (e.g., reasoning) • Often implies complex representational models • procedural programs vs. declarative data

• Unclear business models • Also, actual technical challenges • scalability of query processing • complexity (and thus scalability) of reasoning • scalability of access control • …

• (merely an observation of what you may encounter…)

• What makes Semantic Web attractive and worth pursuing is…

Source: Mindlab, U of Maryland

• Serendipity in interoperability • can we interoperate with systems, devices and/or services we knew nothing about at design time?

• Serendipity in information reuse • with accessible semantics, this becomes easier…

• Serendipity in information integration • can information from independent sources be combined? • even simple forms of reasoning can help

(Source: Oxford American Dictionary)

• Semantic Web was designed to • accommodate different points of view • be flexible about what it can express (not preferential towards any particular domain or application)

• Combining information in new ways • we cannot anticipate all the possible ways in which information is used, combined ⇒ there is value to merely making information (data) available • using Semantic Web technologies lowers the threshold for “serendipitous reuse”

Clinical Care Insurance, Financial Aspects

Genetic Tests… Profiles

Follow up, Lifestyle

Clinical Trials Social Media

Patients, Public

Hospitals Doctors

Payors

CDC

CROs

Pharmaceutical Companies

FDA NIH (Research)

Universities, AMCs

From FDA, CDC

Translation 1: Genomic Research and Clinical Practice Translation 2: Clinical Research and Clinical Practice

Slide by: Vipul Kashyap

• For each component in 360-degree health care, we have data, processes, knowledge and experience. Interoperability solutions need to encompass all these! • Possibly the largest growth in data will be in sensors (e.g., Body Area Networks, biosensors) and social content. Extensive use of mobile phones.

Credit: ece.virginia.edu

• Semantic Web is an “interoperability technology” • Linked Data is a step in the right direction • Many examples of viable usage of Semantic Web technologies • Words of warning about deployment • For health, Semantic Web provides the needed interoperability, and can accommodate all necessary “points of view” • Significant research challenges remain as Health presents the most complex domain

• Researchers: Satya Sahoo, Dr. Priti Parikh, Pablo Mendes, Cartic Ramakrishnan, and Kno.e.sis team • Collaborators: Athens Heart Center (Dr. Agrawal), NLM (Olivier Bodenreider), CCRC-UGA (Will York), UGA (Tarleton), Bioinformatics-WSU (Raymer) • Funding: NIH/NCRR, NIH/NHLBI (R01), NSF

http://knoesis.org

1.  A. Sheth, S. Agrawal, J. Lathem, N. Oldham, H. Wingate, P. Yadav, and K. Gallagher, Active Semantic Electronic Medical Record, Intl Semantic Web Conference, 2006.

2.  Satya Sahoo, Olivier Bodenreider, Kelly Zeng, and Amit Sheth, An Experiment in Integrating Large Biomedical Knowledge Resources with RDF: Application to Associating Genotype and Phenotype Information WWW2007 HCLS Workshop, May 2007.

3.  Satya S. Sahoo, Kelly Zeng, Olivier Bodenreider, and Amit Sheth, "From Glycosyltransferase to Congenital Muscular Dystrophy: Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology", Amsterdam: IOS, August 2007, PMID: 17911917, pp. 1260-4

4.  Satya S. Sahoo, Olivier Bodenreider, Joni L. Rutter, Karen J. Skinner , Amit P. Sheth, An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence, Journal of Biomedical Informatics, 2008.

5.  Cartic Ramakrishnan, Krzysztof J. Kochut, and Amit Sheth, "A Framework for Schema-Driven Relationship Discovery from Unstructured Text", Intl Semantic Web Conference, 2006, pp. 583-596

6.  Satya S. Sahoo, Christopher Thomas, Amit Sheth, William S. York, and Samir Tartir, "Knowledge Modeling and Its Application in Life Sciences: A Tale of Two Ontologies", 15th International World Wide Web Conference (WWW2006), Edinburgh, Scotland, May 23-26, 2006.

7.  Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler, Amit Sheth and Krishnaprasad Thirunarayan, "Provenance Context Entity (PaCE): Scalable provenance tracking for scientific RDF data", SSDBM, Heidelberg, Germany, 2010.

•  Papers: http://knoesis.org/library •  Demos at: http://knoesis.wright.edu/library/demos/
