tim clark harvard medical school & massachusetts general hospital rpi tetherless world...

Post on 19-Dec-2015

Page 1: Tim Clark Harvard Medical School & Massachusetts General Hospital RPI Tetherless World Constellation May 3, 2011 Copyright 2010 Massachusetts General

Dynamic Semantic Metadata on Web Resources in Biomedicine

Tim Clark
Harvard Medical School & Massachusetts General Hospital

RPI Tetherless World Constellation
May 3, 2011

Copyright 2010 Massachusetts General Hospital. All rights reserved.

Outline of talk

Biomedical web data integration challenges

Requirements to cure complex disorders

Catch-22 for semantic data in medicine

Web 3.0 and semantic metadata

Injecting semantics into the existing ecosystem

Integrating ontologies, documents & data

Annotation Ontology & Annotation Framework

Hypothesis management (vs. KM)

Some complex neurological disorders

Alzheimer Disease

Huntington’s Disease

Nicotine Addiction

Schizophrenia

Bipolar Disorder

Alcohol addiction

Autism

Parkinson’s Disease

ALS

Neuropathic Pain

Major Depressive Disorder

Alzheimer’s + Addiction + Depression

Yearly mortality (U.S.) = 642,00 people

Yearly costs (U.S.) = $676 B / 4.7% GDP

Prevalence = 5.3 M + 76 M + 14.4 M = 95.7 M people
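
The arithmetic on this slide can be checked with a short sketch. The prevalence, cost and GDP-share figures are the ones quoted above; the implied GDP is derived here for illustration and is not a number given in the talk.

```python
# Prevalence figures from the slide, in millions of people (U.S.).
prevalence_millions = {
    "Alzheimer's": 5.3,
    "Addiction": 76.0,
    "Depression": 14.4,
}

# Sum the three conditions; round to one decimal to avoid float noise.
total_millions = round(sum(prevalence_millions.values()), 1)

# Yearly cost and its stated share of GDP, also from the slide.
yearly_cost_billions = 676.0
gdp_share = 0.047

# GDP implied by those two figures (derived, not stated on the slide).
implied_gdp_billions = yearly_cost_billions / gdp_share

print(f"Total prevalence: {total_millions} M people")
print(f"Implied U.S. GDP: about ${implied_gdp_billions / 1000:.1f} trillion")
```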

create hypothesis

design experiment

run experiment

collect data

interpret data

share interpretations

synthesize knowledge

“hippocampal atrophy and Aβ load predicted shorter time-to-progression”

MCI progressors non progressors

PET imaging of PIB (radiolabelled compound binds amyloid beta A4 protein)

MRI imaging of brain structure showing loss of hippocampal volume

Brain. 2010 Nov;133(Pt 11):3336-3348.

= 218 subjects +

Alzheimer Disease

Parkinson’s Disease

Schizophrenia

Autism

Bipolar Disorder

Drug Addiction

Huntington’s Disease

ALS

Depression

dopaminergic pathway

α-synuclein, β-amyloid

α-synuclein, Tau

chr 16p11.2 CNV

chr 16p11.2 CNV

CRF, glutamatergic system, dopamine, amygdala …

Alzheimer Disease

Parkinson’s Disease

Schizophrenia

Autism

Bipolar Disorder

Drug Addiction

Huntington’s Disease

ALS

Depression

SIRT2

1. We want to organize all the known facts in neurobiology so we can mash them up.

2. There are no “facts” in neurobiology, except uninteresting ones.

3. All we have are assertions supported by evidence of varying quality.

Science is part of an information ecosystem

1667: Printing Press

2010: Web

We scientists do not attend professional meetings to present our findings ex cathedra, but in order to argue.

John Polanyi, FRS, Nobel Laureate

University of Manchester

What is Web 3.0?

Social Web (Web 2.0, read/write)

+

Shared annotation with controlled terminology systems (Sem Web)

How does Web 3.0 help?

Information sharing within communities or tasks via Social Web (Web 2.0), wikis and forums

Information “permeability” across pharma R&D projects / domains / pipeline stages via shared metadata (semantic annotation)

Web 3.0 improves cross-domain Signal to Noise, institutional memory & data “findability”

Supervised automatic annotation

Genes

Proteins

Biological Processes

Chemical Compounds

Antibodies

Cells

Brain anatomy

Annotation Ontology & SWAN Annotation Framework

Annotation Ontology (AO) is a domain-independent Web ontology.

Links document fragments to ontology terms.

Metadata separate from annotated documents.

SWAN AF manages document annotation.

Interfaces to text-mining services & supports curation.

Collaborating with NCBO, UCSD, Elsevier, USC, Manchester, EMBL, Colorado, EBI, etc.
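
The idea of linking a document fragment to an ontology term, while keeping the metadata outside the document, can be sketched roughly as follows. This is an illustration only, not the AO vocabulary: the class names, field names, document URL, curator URL and term URI are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSelector:
    """Locates a fragment by its exact text plus surrounding context."""
    exact: str
    prefix: str = ""
    suffix: str = ""

    def find(self, text: str) -> int:
        """Return the offset of the fragment in text, or -1 if absent."""
        position = text.find(self.prefix + self.exact + self.suffix)
        return -1 if position < 0 else position + len(self.prefix)


@dataclass(frozen=True)
class Annotation:
    """Stand-off annotation: stored apart from the document it describes."""
    document_url: str
    selector: TextSelector
    term_uri: str
    created_by: str


# The document text itself is never modified.
document_text = "Hippocampal atrophy and amyloid beta load predicted progression."

annotation = Annotation(
    document_url="http://example.org/papers/1",            # hypothetical
    selector=TextSelector(exact="amyloid beta",
                          prefix="atrophy and ", suffix=" load"),
    term_uri="http://example.org/ontology/amyloid-beta",   # hypothetical
    created_by="http://example.org/person/curator",        # hypothetical
)

offset = annotation.selector.find(document_text)
fragment = document_text[offset:offset + len(annotation.selector.exact)]
print(offset, fragment, "->", annotation.term_uri)
```

Using prefix and suffix context rather than raw character offsets is one possible design choice; it lets the annotation survive small edits elsewhere in the document.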

Annotation Ontology (AO)

Text

Shared metadata

Supervised textmining

2) Automatic annotation

Dr. Paolo Ciccarese – Oct 8, 2010

Term localization and curation

Dr. Paolo Ciccarese – Oct 8, 2010

Localization in text

Localization on image

Complements and can support

Semantics on documents (SESL)

Vocabulary standards & terminology development

Document & data management

Collaboratories & web communities

Hypothesis management (SWAN)

Nanopublications (OpenPHACTS)

What is Hypothesis Management?

Model the thinking behind your research

Database it, web-ify it, RDF-ize it, share it

Link the Models / Hypotheses to:
  Claims / Interpretations
  Evidence (publications, experiments, data)
  Supporting and contradictory claims from others
  Evidence for these other claims

Web 3.0: share, compare and discuss

Manage knowledge while creating it

Can be public, private, or semi-private
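
The links described on this slide (hypothesis to claims, claims to evidence, and to supporting or contradictory claims from others) could be modeled along the following lines. This is a sketch under stated assumptions, not the SWAN data model: all class and field names are hypothetical, and the hypothesis title and the contradictory claim are placeholders. Only the claim text and the citation URL come from later slides in this deck.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    kind: str        # "publication", "experiment" or "data"
    reference: str


@dataclass
class Claim:
    text: str
    evidence: List[Evidence] = field(default_factory=list)
    supported_by: List["Claim"] = field(default_factory=list)
    contradicted_by: List["Claim"] = field(default_factory=list)


@dataclass
class Hypothesis:
    title: str
    visibility: str = "private"   # "public", "private" or "semi-private"
    claims: List[Claim] = field(default_factory=list)

    def all_evidence(self) -> List[Evidence]:
        """Collect evidence for own claims and for related claims by others."""
        collected: List[Evidence] = []
        for claim in self.claims:
            collected.extend(claim.evidence)
            for other in claim.supported_by + claim.contradicted_by:
                collected.extend(other.evidence)
        return collected


# Placeholder contradictory claim (hypothetical content).
other_claim = Claim(
    text="(placeholder) a contradictory claim from another group",
    evidence=[Evidence("publication", "http://example.org/citation/other")],
)

claim = Claim(
    text="Intramembranous Aβ behaves as chaperones of other membrane proteins",
    evidence=[Evidence("publication", "http://example.info/citation/1")],
    contradicted_by=[other_claim],
)

hypothesis = Hypothesis(
    title="(placeholder) example hypothesis",
    visibility="semi-private",
    claims=[claim],
)

references = [item.reference for item in hypothesis.all_evidence()]
print(references)
```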

An Alzheimer hypothesis in SWAN

Dr. Paolo Ciccarese – Oct 8, 2010

Exploring a claim

Dr. Paolo Ciccarese – Oct 8, 2010

The SWAN ontology

Dr. Paolo Ciccarese – Oct 8, 2010

Mons / Groth model of a nanopublication

Cognitive Deficits (S)

Relate to (P)

BACE1 (O)

provenance

context

With thanks to Barend Mons and Paul Groth…
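
As a rough sketch, the nanopublication on this slide can be read as one subject-predicate-object assertion carrying provenance and context. The type names and the keys and values placed in the provenance and context dictionaries are hypothetical; the slide names those two parts but does not show their content.

```python
from typing import Dict, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Nanopublication(NamedTuple):
    assertion: Triple
    provenance: Dict[str, str]   # who asserted it and how (hypothetical keys)
    context: Dict[str, str]      # conditions under which it holds (hypothetical keys)


nanopub = Nanopublication(
    # S, P and O as labelled on the slide.
    assertion=Triple("Cognitive Deficits", "relate to", "BACE1"),
    provenance={"assertedBy": "http://example.info/person/1"},   # hypothetical
    context={"domain": "Alzheimer Disease"},                     # hypothetical
)

subject, predicate, obj = nanopub.assertion
print(f"{subject} --{predicate}--> {obj}")
```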

Claim in a graph with authorship

[Diagram: <http://tinyurl.com/4h2am3a> has rdf:type swande:Claim and dct:title "Intramembranous Aβ behaves as chaperones of other membrane proteins". A pav:authoredBy edge points to <http://example.info/person/1>, which has rdf:type foaf:Person and foaf:name "Vincent Marchesi". Named graph labels shown: G1, G2.]

Prefixes: pav: http://purl.org/pav/provenance/2.0/ foaf: http://xmlns.com/foaf/0.1/
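
One way to picture the statements on this slide is as quads, that is, triples tagged with the named graph that holds them. The sketch below uses plain Python tuples rather than an RDF library. Two things are assumptions: the split of statements between G1 and G2, and the claim resource (rather than graph G1) being the subject of pav:authoredBy. Neither can be read off the diagram with certainty.

```python
from collections import defaultdict

CLAIM = "<http://tinyurl.com/4h2am3a>"
AUTHOR = "<http://example.info/person/1>"
TITLE = '"Intramembranous Aβ behaves as chaperones of other membrane proteins"'

# Quads of the form (graph, subject, predicate, object).
# Assumption: graph membership and the subject of pav:authoredBy are guesses.
quads = [
    ("G1", CLAIM, "rdf:type", "swande:Claim"),
    ("G1", CLAIM, "dct:title", TITLE),
    ("G2", CLAIM, "pav:authoredBy", AUTHOR),
    ("G2", AUTHOR, "rdf:type", "foaf:Person"),
    ("G2", AUTHOR, "foaf:name", '"Vincent Marchesi"'),
]


def group_by_graph(quads):
    """Group (subject, predicate, object) triples under their graph name."""
    grouped = defaultdict(list)
    for graph, subject, predicate, obj in quads:
        grouped[graph].append((subject, predicate, obj))
    return dict(grouped)


def render(grouped):
    """Render the graphs in a TriG-like layout (illustrative, not validated)."""
    lines = []
    for graph, triples in grouped.items():
        lines.append(f":{graph} {{")
        for subject, predicate, obj in triples:
            lines.append(f"    {subject} {predicate} {obj} .")
        lines.append("}")
    return "\n".join(lines)


graphs = group_by_graph(quads)
print(render(graphs))
```

Keeping authorship in its own graph means provenance can be added or revised without touching the claim statements themselves.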

Adding provenance

[Diagram: <http://tinyurl.com/4h2am3a> has rdf:type swande:Claim and dct:title "Intramembranous Aβ behaves as chaperones of other membrane proteins". A pav:authoredBy edge points to <http://example.info/person/1>, and a pav:curatedBy edge points to <http://example.info/person/0>, which has rdf:type foaf:Person and foaf:name "Gwen Wong". Named graph labels shown: G1, G2, G4.]

Author’s evidence

[Diagram: <http://tinyurl.com/4h2am3a> has rdf:type swande:Claim and dct:title "Intramembranous Aβ behaves as chaperones of other membrane proteins". A pav:contributedBy edge points to <http://example.info/person/1>, and a swanrel:referencesAsSupportiveEvidence edge points to <http://example.info/citation/1>. Named graph labels shown: G1, G5, G6.]

Translating into HyQue-triples

[Diagram, named graph G8: the event <http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1> has rdf:type <http://bio2rdf.org/go:0051087> and rdfs:label “Event of type GO "chaperone binding"”. It connects to three nodes:

<prefix:actor_1>, rdf:type <http://bio2rdf.org/chebi:53002>, rdfs:label “Beta amyloid”

<prefix:target_1>, rdf:type <http://bio2rdf.org/mesh:D008565>, rdfs:label “Membrane protein”

<prefix:location_1>, rdf:type <http://bio2rdf.org/go:0005886>, rdfs:label “Plasma membrane”]

With many thanks to Nigam Shah, Stanford University
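
The event structure on this slide can be sketched as a small function that expands an event and its participants into triples. Several things here are assumptions. The predicates linking the event to its actor, target and location are not legible in the transcript, so `ex:has_actor`, `ex:has_target` and `ex:has_location` are hypothetical stand-ins. The pairing of each node with its type URI and label follows the order in which they appear. The `prefix:` part of the node names is elided on the slide and is kept as-is.

```python
EVENT = "<http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1>"
EVENT_TYPE = "<http://bio2rdf.org/go:0051087>"
EVENT_LABEL = 'Event of type GO "chaperone binding"'

# role -> (node, type URI, label); pairing by slide order is an assumption.
participants = {
    "actor": ("<prefix:actor_1>", "<http://bio2rdf.org/chebi:53002>", "Beta amyloid"),
    "target": ("<prefix:target_1>", "<http://bio2rdf.org/mesh:D008565>", "Membrane protein"),
    "location": ("<prefix:location_1>", "<http://bio2rdf.org/go:0005886>", "Plasma membrane"),
}


def event_triples(event, event_type, event_label, participants):
    """Expand an event description into (subject, predicate, object) triples."""
    triples = [
        (event, "rdf:type", event_type),
        (event, "rdfs:label", event_label),
    ]
    for role, (node, node_type, label) in participants.items():
        # Hypothetical linking predicate; the real one is not shown.
        triples.append((event, f"ex:has_{role}", node))
        triples.append((node, "rdf:type", node_type))
        triples.append((node, "rdfs:label", label))
    return triples


triples = event_triples(EVENT, EVENT_TYPE, EVENT_LABEL, participants)
for triple in triples:
    print(" ".join(triple))
```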

Adding the provenance

[Diagram: the HyQue triples graph G8 has a pav:contributedBy edge to <http://example.info/person/2>, which has rdf:type foaf:Person and foaf:name "Nigam Shah". Named graph labels shown: G8, G9.]

Connecting SWAN Claim and HyQue triples

[Diagram: graph G1 holds the SWAN claim <http://tinyurl.com/4h2am3a>, with rdf:type swande:Claim and dct:title "Intramembranous Aβ behaves as chaperones of other membrane proteins". Graph G8 holds the HyQue triples. A swanrel:derivedFrom edge connects the two graphs.]
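
Linking whole graphs, rather than individual statements, makes it possible to trace a set of formal triples back to the prose claim they came from. The sketch below shows such a lookup with plain Python structures. The direction of the edge (G8 derived from G1) is an assumption, and the content of G8 is reduced to a single triple for brevity.

```python
CLAIM = "<http://tinyurl.com/4h2am3a>"
TITLE = "Intramembranous Aβ behaves as chaperones of other membrane proteins"
EVENT = "<http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1>"

# Graph name -> list of (subject, predicate, object) triples.
named_graphs = {
    "G1": [
        (CLAIM, "rdf:type", "swande:Claim"),
        (CLAIM, "dct:title", TITLE),
    ],
    "G8": [
        (EVENT, "rdf:type", "<http://bio2rdf.org/go:0051087>"),
    ],
}

# Statements about graphs themselves.
# Assumption: the edge runs from the HyQue graph (G8) to the claim graph (G1).
graph_links = [("G8", "swanrel:derivedFrom", "G1")]


def source_claim_titles(graph_name):
    """Return titles of claims in the graphs that graph_name derives from."""
    titles = []
    for source, predicate, target in graph_links:
        if source == graph_name and predicate == "swanrel:derivedFrom":
            for _subject, triple_predicate, obj in named_graphs.get(target, []):
                if triple_predicate == "dct:title":
                    titles.append(obj)
    return titles


print(source_claim_titles("G8"))
```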

Drug Discovery Hypotheses

Target / pathway hypotheses will be linked to:
  Pathway & target relation to disease
  Target selection criteria
  Validation assays and criteria
  Experiment (assay) provenance
  Experimental data and computations
  Scientist remarks, findings and discussion

Start as a relatively simple model and extend

Drug Development Hypotheses

Hypotheses of therapeutic action for compounds and scaffolds will be linked to:
  Hypothesis / results for individual assays
  Experiment (assay) provenance
  Experimental data
  Group annotation
  Internal databases etc.

Start as a relatively simple model and extend

Open Enterprise Semantic Model

Web 3.0 + T-R + Pharma Information ecosystem

Some other applications

Research reproducibility
  Linking data to documents at time of publication
  Citation of reagents, instruments, code, protocols

Bibliographies and citation networks
  Bibliographic records and citations are metadata

Personal annotations
  Selective sharing and virtual communities

Database annotation
  Biomedical ontology database curation projects

NASA Astrophysics Data System

What is NASA ADS?
  Web database comprising over 8 million astronomy and physics papers
  Full-text for over 880K articles, including all major astronomy journals

NASA ADS semantic annotation requirements
  Astronomical objects by catalog ID
  Specific telescope, type of telescope, wavelength
  Investigators
  Grant funding sources

Summary

Curing complex medical disorders goes hand in hand with next-gen biomedical communications.

Web 3.0 provides the technology framework.

Semantic annotation, hypothesis management, nanopubs: tools for next-gen biomed comms.

Requires / enables international collaborations of biomedical researchers and informaticians.

Open enterprise model with semantic metadata.

Acknowledgements

People

Paolo Ciccarese (Harvard)
Maryann Martone (UCSD)
Anita DeWaard & Tony Scerri (Elsevier)
Karen Verspoor & Larry Hunter (Colorado)
Adam West & Ernst Dow (Eli Lilly)
Carole Goble (Manchester)
Nigam Shah (Stanford / NCBO)
Paul Groth (VU Amsterdam)

Funding: Elsevier, NIH, Eli Lilly, & EMD Serono

Whereas King Ptolemy, living forever, the Manifest God whose excellence is fine, son of King Ptolemy and Queen Arsinoe, the Father-loving Gods, is wont to do many favours for the temples of Egypt and for all those who are subject to his kingship, he being a god…

English translation by R.S. Simpson