tim clark harvard medical school & massachusetts general hospital rpi tetherless world...

Post on 19-Dec-2015


Dynamic Semantic Metadata on Web Resources in Biomedicine

Tim Clark

Harvard Medical School & Massachusetts General Hospital

RPI Tetherless World Constellation

May 3, 2011

Copyright 2010 Massachusetts General Hospital. All rights reserved.

Outline of talk

Biomedical web data integration challenges

Requirements to cure complex disorders

Catch-22 for semantic data in medicine

Web 3.0 and semantic metadata

Injecting semantics into the existing ecosystem

Integrating ontologies, documents & data

Annotation Ontology & Annotation Framework

Hypothesis management (vs. KM)

Some complex neurological disorders

Alzheimer Disease

Huntington’s Disease

Nicotine Addiction

Schizophrenia

Bipolar Disorder

Alcohol addiction

Autism

Parkinson’s Disease

ALS

Neuropathic Pain

Major Depressive Disorder

Alzheimer’s + Addiction + Depression

Yearly mortality (U.S.) = 642,00 people

Yearly costs (U.S.) = $676 B / 4.7% GDP

Prevalence = 5.3 M + 76 M + 14.4 M = 95.7 M people

create hypothesis

design experiment

run experiment collect data

interpret data

share interpretations

synthesize knowledge

“hippocampal atrophy and Aβ load predicted shorter time-to-progression”

MCI progressors non progressors

PET imaging of PIB (radiolabelled compound binds amyloid beta A4 protein)

MRI imaging of brain structure showing loss of hippocampal volume

Brain. 2010 Nov;133(Pt 11):3336-3348.

= 218 subjects +

Alzheimer Disease

Parkinson’s Disease

Schizophrenia

Autism

Bipolar Disorder

Drug Addiction

Huntington’s Disease

ALS

Depression

dopaminergic pathway

α-synuclein, β-amyloid

α-synuclein, Tau

chr 16p11.2 CNV

chr 16p11.2 CNV

CRF, glutamatergic system, dopamine, amygdala …

Alzheimer Disease

Parkinson’s Disease

Schizophrenia

Autism

Bipolar Disorder

Drug Addiction

Huntington’s Disease

ALS

Depression

SIRT2

1. We want to organize all the known facts in neurobiology so we can mash them up.

2. There are no “facts” in neurobiology, except uninteresting ones.

3. All we have are assertions, supported by evidence of varying quality.

Science is part of an information ecosystem

1667: Printing Press

2010: Web

We scientists do not attend professional meetings to present our findings ex cathedra, but in order to argue.

John Polanyi, FRS, Nobel Laureate

University of Manchester

What is Web 3.0?

Social Web (Web 2.0, read/write)

+

Shared annotation with controlled terminology systems (Sem Web)

How does Web 3.0 help?

Information sharing within communities or tasks via Social Web (Web 2.0), wikis and forums

Information “permeability” across pharma R&D projects / domains / pipeline stages via shared metadata (semantic annotation)

Web 3.0 improves cross-domain Signal to Noise, institutional memory & data “findability”

Supervised automatic annotation

Genes

Proteins

Biological Processes

Chemical Compounds

Antibodies

Cells

Brain anatomy

Annotation Ontology & SWAN Annotation Framework

Annotation Ontology (AO) is a domain-independent Web ontology.

Links document fragments to ontology terms.

Metadata separate from annotated documents.

SWAN AF manages document annotation.

Interfaces to textmining svcs & supports curation.

Collaborating with NCBO, UCSD, Elsevier, USC, Manchester, EMBL, Colorado, EBI, etc…
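The idea of linking a document fragment to an ontology term, with the metadata kept apart from the document, can be sketched in a few lines of Python. This is only an illustration: the class names, field names, and example.org URIs below are hypothetical and are not taken from the AO vocabulary or the SWAN Annotation Framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSelector:
    """Locates a fragment by its exact text plus a little surrounding context."""
    prefix: str
    exact: str
    suffix: str


@dataclass(frozen=True)
class Annotation:
    """Stand-off annotation: stored separately from the document it describes."""
    document_url: str       # the annotated web resource
    selector: TextSelector  # where in the document the fragment sits
    topic_uri: str          # ontology term the fragment is linked to
    created_by: str         # a curator or a text-mining service


def locate(annotation: Annotation, text: str) -> int:
    """Return the character offset of the annotated fragment, or -1 if absent."""
    s = annotation.selector
    start = text.find(s.prefix + s.exact + s.suffix)
    return -1 if start < 0 else start + len(s.prefix)


# Sentence taken from the talk; all URIs are placeholders (assumptions).
doc_text = "PET imaging of PIB (radiolabelled compound binds amyloid beta A4 protein)"
ann = Annotation(
    document_url="http://example.org/doc/1",
    selector=TextSelector("binds ", "amyloid beta A4 protein", ")"),
    topic_uri="http://example.org/ontology/amyloid-beta-A4-protein",
    created_by="http://example.org/textmining/service",
)
offset = locate(ann, doc_text)
fragment = doc_text[offset:offset + len(ann.selector.exact)]
```

Because the annotation only points at the document, the same text can carry annotations from several curators or services without being modified.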

Annotation Ontology (AO)

Text

Shared metadata

Supervised textmining

2) Automatic annotation

Dr. Paolo Ciccarese – Oct 8, 2010

Term localization and curation

Localization in text

Localization on image

Complements and can support

Semantics on documents (SESL)

Vocabulary standards & terminology development

Document & data management

Collaboratories & web communities

Hypothesis management (SWAN)

Nanopublications (OpenPHACTS)

What is Hypothesis Management?

Model the thinking behind your research

Database it, web-ify it, RDF-ize it, share it

Link the Models / Hypotheses to

Claims / Interpretations

Evidence (publications, experiments, data)

Supporting and contradictory claims from others

Evidence for these other claims

Web 3.0: share, compare and discuss

Manage knowledge while creating it

Can be public, private, or semi-private
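The linking described above (hypotheses to claims, claims to evidence, and claims to supporting or contradictory claims from others) can be sketched as a small data model. This is a minimal illustration under assumed names, not the SWAN schema; the contradicting claim and its citation are placeholders invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    citation: str
    kind: str  # "publication", "experiment" or "data"


@dataclass
class Claim:
    text: str
    evidence: List[Evidence] = field(default_factory=list)
    supported_by: List["Claim"] = field(default_factory=list)
    contradicted_by: List["Claim"] = field(default_factory=list)


@dataclass
class Hypothesis:
    title: str
    visibility: str  # "public", "private" or "semi-private"
    claims: List[Claim] = field(default_factory=list)


def all_evidence(hypothesis: Hypothesis) -> List[Evidence]:
    """Collect evidence for the hypothesis's own claims and for related claims."""
    seen, collected = set(), []
    stack = list(hypothesis.claims)
    while stack:
        claim = stack.pop()
        if id(claim) in seen:  # guard against cycles between claims
            continue
        seen.add(id(claim))
        collected.extend(claim.evidence)
        stack.extend(claim.supported_by + claim.contradicted_by)
    return collected


# The first claim and its citation come from the talk; the second is a placeholder.
own = Claim(
    "hippocampal atrophy and Aβ load predicted shorter time-to-progression",
    evidence=[Evidence("Brain. 2010 Nov;133(Pt 11):3336-3348.", "publication")],
)
other = Claim(
    "(placeholder) a contradictory claim from another group",
    evidence=[Evidence("(placeholder citation)", "publication")],
)
own.contradicted_by.append(other)
hyp = Hypothesis("(placeholder) MCI progression hypothesis", "semi-private", [own])
citations = [e.citation for e in all_evidence(hyp)]
```

Walking the links in this way is what lets a reader compare a hypothesis against the evidence behind competing claims, not only its own.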

An Alzheimer hypothesis in SWAN

Exploring a claim

The SWAN ontology


Cognitive Deficits (S)

Relate to (P)

BACE1 (O)

provenance

context

With thanks to Barend Mons and Paul Groth…

Mons / Groth model of a nanopublication
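As a rough sketch of the model pictured here, a nanopublication can be treated as one subject-predicate-object assertion bundled with its provenance and context. The class and the provenance and context keys and values below are assumptions made for illustration; only the assertion itself comes from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Nanopublication:
    assertion: Triple            # (subject, predicate, object)
    provenance: Dict[str, str]   # who asserted it, and how
    context: Dict[str, str]      # conditions under which it holds


def group_by_assertion(pubs: List[Nanopublication]) -> Dict[Triple, List[Dict[str, str]]]:
    """Group provenance records by assertion, so repeated claims can be compared."""
    groups: Dict[Triple, List[Dict[str, str]]] = {}
    for pub in pubs:
        groups.setdefault(pub.assertion, []).append(pub.provenance)
    return groups


assertion = ("Cognitive Deficits", "Relate to", "BACE1")
pubs = [
    # Provenance and context values are placeholders, not data from the talk.
    Nanopublication(assertion, {"assertedBy": "(placeholder) lab A"}, {"organism": "(placeholder)"}),
    Nanopublication(assertion, {"assertedBy": "(placeholder) lab B"}, {"organism": "(placeholder)"}),
]
sources = group_by_assertion(pubs)[assertion]
```

Keeping provenance beside each assertion means the same statement made by two groups stays distinguishable, which matters when assertions are only as good as their evidence.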

Claim in a graph with authorship

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

<http://example.info/person/1>

pav:authoredBy

Vincent Marchesi

foaf:name

foaf:Person

rdf:type

pav: http://purl.org/pav/provenance/2.0/

foaf: http://xmlns.com/foaf/0.1/

G2
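One way to read this diagram is as a set of quads, where each triple carries the name of the graph it belongs to. The sketch below uses plain Python tuples, not an RDF library. It assumes that G1 holds the claim, that G2 holds the authorship statements, and that pav:authoredBy is attached to the graph G1; the diagram labels do not settle these points, so treat them as assumptions.

```python
from typing import List, Tuple

Quad = Tuple[str, str, str, str]  # (subject, predicate, object, graph)

CLAIM = "http://tinyurl.com/4h2am3a"
PERSON = "http://example.info/person/1"

quads: List[Quad] = [
    # G1: the claim itself
    (CLAIM, "rdf:type", "swande:Claim", "G1"),
    (CLAIM, "dct:title",
     "Intramembranous Aβ behaves as chaperones of other membrane proteins", "G1"),
    # G2: authorship of G1 (assumed placement of these statements)
    ("G1", "pav:authoredBy", PERSON, "G2"),
    (PERSON, "rdf:type", "foaf:Person", "G2"),
    (PERSON, "foaf:name", "Vincent Marchesi", "G2"),
]


def triples_in(graph: str, data: List[Quad]) -> List[Tuple[str, str, str]]:
    """Return the triples stored in one named graph."""
    return [(s, p, o) for s, p, o, g in data if g == graph]


def authors_of(graph: str, data: List[Quad]) -> List[str]:
    """Return the names of the people recorded as authors of a named graph."""
    people = [o for s, p, o, g in data if s == graph and p == "pav:authoredBy"]
    return [o for s, p, o, g in data if s in people and p == "foaf:name"]


claim_triples = triples_in("G1", quads)
claim_authors = authors_of("G1", quads)
```

Separating the claim from the statements about it lets further provenance, such as curation, be added as new graphs without touching G1.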

Adding provenance

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

<http://example.info/person/1>

pav:authoredBy

G2

<http://example.info/person/0>

pav:curatedBy

G4

Gwen Wong

foaf:name

foaf:Person

rdf:type

Author’s evidence

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

<http://example.info/person/1>

pav:contributedBy

<http://example.info/citation/1>

swanrel:referencesAsSupportiveEvidence

G5

G6

Translating into HyQue-triples

G8

<http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1>

<http://bio2rdf.org/go:0051087>

rdf:type

Event of type GO "chaperone binding"

rdfs:label

<prefix:actor_1>

<prefix:target_1>

<prefix:location_1>

<http://bio2rdf.org/chebi:53002>

<http://bio2rdf.org/mesh:D008565>

<http://bio2rdf.org/go:0005886>

rdf:type

rdf:type

rdf:type

rdfs:label “Beta amyloid”

rdfs:label “Membrane protein”

rdfs:label “Plasma membrane”

With many thanks to Nigam Shah, Stanford University
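The translation shown in this diagram can be sketched as a function that turns an event record into triples. The node names, type URIs, and labels are copied from the slide. The role predicates (ex:hasActor, ex:hasTarget, ex:hasLocation) are hypothetical, since the slide does not name the properties that connect the event to its participants.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Participant:
    node: str
    type_uri: str
    label: str


@dataclass(frozen=True)
class Event:
    node: str
    type_uri: str
    label: str
    actor: Participant
    target: Participant
    location: Participant


def to_triples(event: Event) -> List[Triple]:
    """Flatten an event and its three participants into a list of triples."""
    triples: List[Triple] = [
        (event.node, "rdf:type", event.type_uri),
        (event.node, "rdfs:label", event.label),
    ]
    roles = (
        ("ex:hasActor", event.actor),        # hypothetical predicate
        ("ex:hasTarget", event.target),      # hypothetical predicate
        ("ex:hasLocation", event.location),  # hypothetical predicate
    )
    for predicate, part in roles:
        triples.append((event.node, predicate, part.node))
        triples.append((part.node, "rdf:type", part.type_uri))
        triples.append((part.node, "rdfs:label", part.label))
    return triples


event = Event(
    node="http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1",
    type_uri="http://bio2rdf.org/go:0051087",
    label='Event of type GO "chaperone binding"',
    actor=Participant("prefix:actor_1", "http://bio2rdf.org/chebi:53002", "Beta amyloid"),
    target=Participant("prefix:target_1", "http://bio2rdf.org/mesh:D008565", "Membrane protein"),
    location=Participant("prefix:location_1", "http://bio2rdf.org/go:0005886", "Plasma membrane"),
)
event_triples = to_triples(event)
```

Typing each participant with an ontology URI is what makes the free-text claim comparable with other structured data.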

Adding the provenance

HyQue triples

G8

<http://example.info/person/2>

pav:contributedBy

Nigam Shah

foaf:name

foaf:Person

rdf:type

G9

Connecting SWAN Claim and HyQue triples

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

HyQue triples

G8

swanrel:derivedFrom
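Once graphs are linked by derivation, credit can be traced back along the chain. The sketch below assumes the link runs from G8 (the HyQue triples) to G1 (the SWAN claim) and that the authorship and contribution statements attach to the graphs themselves; both are readings of the diagrams, not something the slides state outright.

```python
from typing import List, Tuple

Link = Tuple[str, str, str]  # (subject, predicate, object)

links: List[Link] = [
    ("G8", "swanrel:derivedFrom", "G1"),  # assumed direction
    ("G1", "pav:authoredBy", "http://example.info/person/1"),
    ("G8", "pav:contributedBy", "http://example.info/person/2"),
]

CREDIT = ("pav:authoredBy", "pav:contributedBy")


def lineage(graph: str, data: List[Link]) -> List[str]:
    """Follow derivedFrom links from a graph back to its earliest source."""
    chain, seen = [graph], {graph}
    while True:
        parents = [o for s, p, o in data
                   if s == chain[-1] and p == "swanrel:derivedFrom" and o not in seen]
        if not parents:
            return chain
        chain.append(parents[0])
        seen.add(parents[0])


def credited(graph: str, data: List[Link]) -> List[str]:
    """List everyone credited on a graph or on any graph it derives from."""
    return [o for g in lineage(graph, data)
            for s, p, o in data if s == g and p in CREDIT]


people = credited("G8", links)
```

With this reading, a query on the HyQue triples returns both the person who produced them and the author of the original claim.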

Drug Discovery Hypotheses

Target / pathway hypotheses will be linked to:

Pathway & target relation to disease

Target selection criteria

Validation assays and criteria

Experiment (assay) provenance

Experimental data and computations

Scientist remarks, findings and discussion

Start as a relatively simple model and extend

Drug Development Hypotheses

Hypotheses of therapeutic action for compounds and scaffolds will be linked to:

Hypothesis / results for individual assays

Experiment (assay) provenance

Experimental data

Group annotation

Internal databases etc.

Start as a relatively simple model and extend

Open Enterprise Semantic Model

Web 3.0 + T-R + Pharma

Information ecosystem

Some other applications

Research reproducibility

Linking data to documents at time of publication

Citation of reagents, instruments, code, protocols

Bibliographies and citation networks

Bibliographic records and citations are metadata

Personal annotations

Selective sharing and virtual communities

Database annotation

Biomedical ontology database curation projects

NASA Astrophysics Data System

What is NASA ADS?

Web database comprising over 8 million astronomy and physics papers

Full-text for over 880K articles, including all major astronomy journals

NASA ADS semantic annotation requirements

Astronomical objects by catalog ID

Specific telescope, type of telescope, wavelength

Investigators

Grant funding sources

Summary

Curing complex medical disorders goes hand in hand with next-gen biomedical communications

Web 3.0 provides the technology framework

Semantic annotation, hypothesis management, nanopubs: tools for next-gen biomed comms.

Requires / enables international collaborations of biomedical researchers and informaticians.

Open enterprise model with semantic metadata.

Acknowledgements

People

Paolo Ciccarese (Harvard)

Maryann Martone (UCSD)

Anita de Waard & Tony Scerri (Elsevier)

Karen Verspoor & Larry Hunter (Colorado)

Adam West & Ernst Dow (Eli Lilly)

Carole Goble (Manchester)

Nigam Shah (Stanford / NCBO)

Paul Groth (VU Amsterdam)

Funding: Elsevier, NIH, Eli Lilly, & EMD Serono

Whereas King Ptolemy, living forever, the Manifest God whose excellence is fine, son of King Ptolemy and Queen Arsinoe, the Father-loving Gods, is wont to do many favours for the temples of Egypt and for all those who are subject to his kingship, he being a god…

English translation by R.S. Simpson
