What is an ontology and why should you care?
Barry Smith
http://ontology.buffalo.edu/smith
with thanks to Jane Lomax, Gene Ontology Consortium
[Figure: microarray gene tree (Pearson correlation); gene list: all genes (14010). Sample labels: attacked, control. Axis: time. Gene clusters are annotated as follows:
• Puparial adhesion, Molting cycle, hemocyanin
• Defense response, Immune response, Response to stimulus, Toll regulated genes, JAK-STAT regulated genes
• Immune response, Toll regulated genes
• Amino acid catabolism, Lipid metabolism
• Peptidase activity, Protein catabolism, Immune response]
Microarray data shows changed expression of thousands of genes.
How will you spot the patterns?
How GO can be used to help analyse microarray data
• Treat samples
• Collect mRNA
• Label
• Hybridize
• Scan
• Normalize
• Select differentially regulated genes
• Understand the biological phenomena involved
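The "select differentially regulated genes" step can be sketched as a simple filter. This is only an illustration: the gene names, intensity values and the log2 fold-change cutoff are all hypothetical, and a real analysis would normally use replicates and a statistical test rather than a bare threshold.

```python
import math

# Hypothetical normalized intensities per gene: (treated, control).
expression = {
    "gene_a": (800.0, 200.0),
    "gene_b": (210.0, 200.0),
    "gene_c": (50.0, 400.0),
}

def select_differential(expression, threshold=1.0):
    """Keep genes whose absolute log2 fold change is at least `threshold`.

    The threshold of 1.0 (a two-fold change) is an assumed cutoff,
    not a value taken from the slides.
    """
    selected = {}
    for gene, (treated, control) in expression.items():
        log_fold_change = math.log2(treated / control)
        if abs(log_fold_change) >= threshold:
            selected[gene] = log_fold_change
    return selected

selected = select_differential(expression)
```

The result maps each retained gene to its log2 fold change, so the sign shows whether the gene went up or down in the treated sample.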
Traditional analysis operates via literature search for each successive gene
Gene 1: Apoptosis, Cell-cell signaling, Protein phosphorylation, Mitosis, …
Gene 2: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 3: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 4: Nervous system, Pregnancy, Oncogenesis, Mitosis, …
Gene 100: Positive control of cell proliferation, Mitosis, Oncogenesis, Glucose transport, …
GO allows grouping by process
Apoptosis: Gene 1, Gene 53
Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35, …
Positive control of cell proliferation: Gene 7, Gene 3, Gene 12, …
Growth: Gene 5, Gene 2, Gene 6, …
Glucose transport: Gene 7, Gene 3, Gene 6, …
This allows us to ask meaningful questions of microarray data, e.g. which genes are involved in the same process, with the same or different expression patterns?
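A minimal sketch of this regrouping: per-gene annotation lists are inverted into per-process gene lists, and each process is then split by expression direction. The annotations reuse the toy terms listed for Gene 1 to Gene 4 (the elided entries are left out), and the "up"/"down" directions are invented purely for illustration.

```python
from collections import defaultdict

# Toy annotations: the terms listed for Gene 1 to Gene 4 (elided terms omitted).
annotations = {
    "Gene 1": ["Apoptosis", "Cell-cell signaling", "Protein phosphorylation", "Mitosis"],
    "Gene 2": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 3": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 4": ["Nervous system", "Pregnancy", "Oncogenesis", "Mitosis"],
}

# Hypothetical expression directions, not taken from any real experiment.
direction = {"Gene 1": "up", "Gene 2": "up", "Gene 3": "down", "Gene 4": "down"}

def group_by_process(annotations):
    """Invert gene -> terms into term -> genes."""
    groups = defaultdict(list)
    for gene, terms in annotations.items():
        for term in terms:
            groups[term].append(gene)
    return dict(groups)

def expression_patterns(groups, direction):
    """For each process, list the genes showing each expression direction."""
    patterns = {}
    for term, genes in groups.items():
        by_direction = defaultdict(list)
        for gene in genes:
            by_direction[direction[gene]].append(gene)
        patterns[term] = dict(by_direction)
    return patterns

groups = group_by_process(annotations)
patterns = expression_patterns(groups, direction)
```

With the data grouped this way, a question such as "which mitosis genes move in opposite directions?" becomes a single lookup in `patterns["Mitosis"]` instead of a literature search per gene.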
1. It provides a controlled vocabulary
contributing to the cumulativity of scientific results achieved by distinct research communities
(if we all use kilograms, meters, seconds …, our results are calibrated)
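The calibration idea can be illustrated with a small sketch in which two groups' free-text labels are mapped onto shared terms before their results are pooled. The synonym table, the lab label lists and the helper names are all hypothetical; a real controlled vocabulary such as the GO supplies its own terms and identifiers.

```python
from collections import Counter

# Hypothetical controlled vocabulary: canonical term -> accepted variants.
VOCABULARY = {
    "protein catabolism": {"protein catabolism", "protein breakdown", "protein degradation"},
    "lipid metabolism": {"lipid metabolism", "fat metabolism"},
}

# Reverse index from each variant to its canonical term.
LOOKUP = {
    variant: term
    for term, variants in VOCABULARY.items()
    for variant in variants
}

def normalize(label):
    """Return the canonical term for a label, or None if it is not covered."""
    key = " ".join(label.lower().split())  # ignore case and extra spaces
    return LOOKUP.get(key)

def pool(*label_lists):
    """Count canonical terms across several groups; collect unmapped labels."""
    counts = Counter()
    unmapped = []
    for labels in label_lists:
        for label in labels:
            term = normalize(label)
            if term is None:
                unmapped.append(label)
            else:
                counts[term] += 1
    return counts, unmapped

lab_a = ["Protein breakdown", "lipid metabolism"]
lab_b = ["protein  degradation", "Fat metabolism", "stress thing"]
counts, unmapped = pool(lab_a, lab_b)
```

Labels that fall outside the vocabulary are reported rather than silently dropped, which is one way results from distinct communities can stay cumulative.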
The massive quantities of annotations to gene products in terms of the GO allow a new kind of research
Uses of GO in studies of
• pathways associated with heart failure development correlated with cardiac remodeling (PMID 18780759)
• sex-specific pathways in early cardiac response to pressure overload in mice (PMID 18665344)
• molecular signature of cardiomyocyte clusters derived from human embryonic stem cells (PMID 18436862)
• contrast between cardiac left ventricle and diaphragm muscle in expression of genes involved in carbohydrate and lipid metabolism (PMID 18207466)
• immune system involvement in abdominal aortic aneurisms in humans (PMID 17634102)
• …
But GO covers only three sorts of biological entities
– cellular components
– molecular functions
– biological processes
and does not provide representations of disease-related phenomena
How can we extend the GO to
help integrate complex representations of reality
help human beings find things in complex representations of reality
help computers reason with complex representations of reality
in other areas of biomedicine?
The Open Biomedical Ontologies (OBO) Foundry

Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
Initial OBO Foundry coverage

Columns: RELATION TO TIME. Rows: GRANULARITY.

GRANULARITY | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
CRITERIA
openness
common formal language
collaborative development
evidence-based maintenance
identifiers
versioning
textual and formal definitions
COMMON ARCHITECTURE: The ontology uses common formal relations
ORTHOGONALITY: One ontology for each domain
LEADERSHIP
Michael Ashburner, Suzanna Lewis, Chris Mungall (GO Consortium)
Alan Ruttenberg (Science Commons, OWL Working Group, HCLS/Semantic Web)
Richard Scheuermann (ImmPort, CTSA)
Barry Smith
OBO Foundry provides
• tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort
• an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology
• automatic web-based linkage between medical terminologies and biological knowledge resources