What is an ontology and why should you care? Barry Smith, http://ontology.buffalo.edu/smith


Upload: quin-elliott

Post on 30-Dec-2015


TRANSCRIPT

What is an ontology and Why should you care?

Barry Smith

http://ontology.buffalo.edu/smith

with thanks to Jane Lomax, Gene Ontology Consortium

1

You’re interested in which genes control heart muscle development

17,536 results

2

[Figure: microarray gene tree. Selected Gene Tree: pearson lw n3d ...; Branch color classification: Set_LW_n3d_5p_...; Colored by: Copy of Copy of C5_RMA (Defa...; Gene List: all genes (14010). Axis labels: attacked, time, control. Annotated gene clusters:]

• Puparial adhesion, Molting cycle, hemocyanin
• Defense response, Immune response, Response to stimulus, Toll regulated genes, JAK-STAT regulated genes
• Immune response, Toll regulated genes
• Amino acid catabolism, Lipid metabolism
• Peptidase activity, Protein catabolism, Immune response

Microarray data shows changed expression of thousands of genes.

How will you spot the patterns?

3

Ontologies provide a way to capture and represent all this knowledge in a computable form

4

Uses of ‘ontology’ in PubMed abstracts

5

6

By far the most successful: The Gene Ontology

7

Definitions

8

Gene products involved in cardiac muscle development in humans

9

Term Search Results

10

Hierarchical view representing relations between represented types

11

How GO can be used to help analyse microarray data

• Treat samples
• Collect mRNA
• Label
• Hybridize
• Scan
• Normalize
• Select differentially regulated genes
• Understand the biological phenomena involved

12

Traditional analysis operates via literature search for each successive gene

Gene 1: Apoptosis, Cell-cell signaling, Protein phosphorylation, Mitosis, …

Gene 2: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …

Gene 3: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …

Gene 4: Nervous system, Pregnancy, Oncogenesis, Mitosis, …

Gene 100: Positive control of cell proliferation, Mitosis, Oncogenesis, Glucose transport, …

13

But by using GO annotations, this work has already been done

GO:0006915 : apoptosis

14

GO allows grouping by process

Apoptosis: Gene 1, Gene 53

Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35, …

Positive control of cell proliferation: Gene 7, Gene 3, Gene 12, …

Growth: Gene 5, Gene 2, Gene 6, …

Glucose transport: Gene 7, Gene 3, Gene 6, …

This allows us to ask meaningful questions of microarray data, e.g. which genes are involved in the same process, with the same or different expression patterns?

15
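The regrouping from a per-gene view to a per-process view can be sketched in a few lines of Python. This is a minimal illustration, not the GO Consortium's tooling: the annotation table reuses the placeholder gene names and process labels from the per-gene lists earlier in the transcript, and the function names `group_by_process` and `shared_processes` are hypothetical.

```python
from collections import defaultdict

# Placeholder annotations (gene -> processes), reusing the per-gene lists
# from the transcript. Real annotations would come from a GO annotation file.
ANNOTATIONS = {
    "Gene 1": ["Apoptosis", "Cell-cell signaling", "Protein phosphorylation", "Mitosis"],
    "Gene 2": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 3": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 4": ["Nervous system", "Pregnancy", "Oncogenesis", "Mitosis"],
}


def group_by_process(annotations):
    """Invert a gene -> processes table into a process -> genes table."""
    groups = defaultdict(list)
    for gene, processes in annotations.items():
        for process in processes:
            groups[process].append(gene)
    return dict(groups)


def shared_processes(gene_a, gene_b, annotations):
    """Return the processes that two genes are both annotated to."""
    return sorted(set(annotations[gene_a]) & set(annotations[gene_b]))


if __name__ == "__main__":
    for process, genes in group_by_process(ANNOTATIONS).items():
        print(f"{process}: {', '.join(genes)}")
```

Once the table is inverted, a question such as "which genes take part in the same process?" becomes a dictionary lookup rather than a literature search per gene.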

How does the Gene Ontology work?

16

1. It provides a controlled vocabulary

contributing to the cumulativity of scientific results achieved by distinct research communities

(if we all use kilograms, meters, seconds …, our results are calibrated)
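One way to picture this cumulativity is to merge results from two groups that key their findings by the same term identifier. The sketch below is illustrative only: the two "labs" and their result sets are hypothetical, `merge_results` is an assumed helper name, and the only item taken from the slides is the identifier GO:0006915 (apoptosis).

```python
from collections import defaultdict

# Hypothetical result sets from two research groups. Both key their findings
# by the same controlled-vocabulary identifier (GO:0006915, apoptosis).
lab_a = {"GO:0006915": ["Gene 1"]}
lab_b = {"GO:0006915": ["Gene 53"]}


def merge_results(*result_sets):
    """Combine result sets that use shared term identifiers as keys."""
    merged = defaultdict(list)
    for results in result_sets:
        for term_id, genes in results.items():
            for gene in genes:
                # Skip genes already recorded under this term.
                if gene not in merged[term_id]:
                    merged[term_id].append(gene)
    return dict(merged)


combined = merge_results(lab_a, lab_b)
print(combined)
```

Because both groups use the same identifier, no mapping step is needed before merging. With free-text labels, each pair of groups would first have to reconcile its terminology.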

17

18

2. It provides a tool for algorithmic reasoning

Hierarchical view representing relations between represented types
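A simple form of reasoning over such a hierarchy is to treat a gene annotated to a specific type as also annotated to every more general type above it. The sketch below assumes a toy set of `is_a` links built from term names that appear on the slides. The links themselves are illustrative and are not copied from the GO, and the names `IS_A`, `ANNOTATIONS`, `ancestors`, and `genes_for` are hypothetical.

```python
# Toy is_a links (child -> parents). Illustrative only, not taken from the GO.
IS_A = {
    "immune response": ["defense response"],
    "defense response": ["response to stimulus"],
    "response to stimulus": ["biological process"],
    "biological process": [],
}

# Hypothetical direct annotations (gene -> most specific terms).
ANNOTATIONS = {
    "Gene A": ["immune response"],
    "Gene B": ["defense response"],
    "Gene C": ["biological process"],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def genes_for(term, annotations=ANNOTATIONS, is_a=IS_A):
    """Return genes annotated to `term` directly or via a more specific term."""
    return sorted(
        gene
        for gene, terms in annotations.items()
        if any(t == term or term in ancestors(t, is_a) for t in terms)
    )


if __name__ == "__main__":
    print(genes_for("response to stimulus"))
```

A query for a general type then returns genes that were only ever annotated to its subtypes, which is the kind of inference a flat keyword list cannot support.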

19

The massive quantities of annotations to gene products in terms of the GO allow a new kind of research

20

Uses of GO in studies of

• pathways associated with heart failure development correlated with cardiac remodeling (PMID 18780759)
• sex-specific pathways in early cardiac response to pressure overload in mice (PMID 18665344)
• molecular signature of cardiomyocyte clusters derived from human embryonic stem cells (PMID 18436862)
• contrast between cardiac left ventricle and diaphragm muscle in expression of genes involved in carbohydrate and lipid metabolism (PMID 18207466)
• immune system involvement in abdominal aortic aneurysms in humans (PMID 17634102)
• …

21

But GO covers only three sorts of biological entities

– cellular components
– molecular functions
– biological processes

and does not provide representations of disease-related phenomena

22

23

How to extend the GO to

help integrate complex representations of reality

help human beings find things in complex representations of reality

help computers reason with complex representations of reality

in other areas of biomedicine?

24

Columns: RELATION TO TIME. Rows: GRANULARITY.

| GRANULARITY | CONTINUANT: INDEPENDENT | CONTINUANT: DEPENDENT | OCCURRENT |
| --- | --- | --- | --- |
| ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO) |
| CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | |
| MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO) |

The Open Biomedical Ontologies (OBO) Foundry

25

Columns: RELATION TO TIME. Rows: GRANULARITY.

| GRANULARITY | CONTINUANT: INDEPENDENT | CONTINUANT: DEPENDENT | OCCURRENT |
| --- | --- | --- | --- |
| ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO) |
| CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO) |
| MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO) |

Initial OBO Foundry coverage

26

CRITERIA

openness

common formal language

collaborative development

evidence-based maintenance

identifiers

versioning

textual and formal definitions

CRITERIA

COMMON ARCHITECTURE: The ontology uses common formal relations

ORTHOGONALITY: One ontology for each domain

27

CRITERIA

Michael Ashburner, Suzanna Lewis, Chris Mungall (GO Consortium)

Alan Ruttenberg (Science Commons, OWL Working Group, HCLS/Semantic Web)

Richard Scheuermann (ImmPort, CTSA)

Barry Smith

28

LEADERSHIP

OBO Foundry provides

• tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort

• an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology

• automatic web-based linkage between medical terminologies and biological knowledge resources

29