Introduction to Ontologies for Environmental Biology
Oxford, August 2007
Barry Smith
http://ontology.buffalo.edu/smith
Finnegans Web
concept, type, class, instance, model, representation, data, process, property
Disciplines here involved
GIS
Ecology
Environmental biology
Various -omics disciplines
Bioinformatics
Medical Informatics
Database science
Semantic webists
...
Part 1: What is an Ontology?
what cellular component?
what molecular function?
what biological process?
natural language labels designed for use in annotations
to make the data cognitively accessible to human beings
and algorithmically tractable to computers
compare: legends for maps
common legends allow (cross-border) integration
ontologies are legends for data
compare: legends for diagrams
Ramirez et al., Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology, Syst. Biol. 56(2):283–294, 2007
computationally tractable legends
help integrate complex representations of reality
help human beings find things in complex representations of reality
help computers reason with complex representations of reality
ontologies are used to annotate data
but there are two kinds of annotations
names of types
names of instances
A basic distinction
type vs. instance
science text vs. diary
human being vs. Michael Ashburner
A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt
Catalog vs. inventory
Ontology types Instances
An ontology is a collection of standardized names for types
We learn about types in reality from looking at the results of scientific experiments captured in the form of scientific theories
Ontologies provide the terminological scaffolding of scientific theories
experiments relate to what is particular; science describes what is general
[Diagram: hierarchy of types (thing, organism, animal, mammal, cat, siamese, frog) shown above their instances]
types vs. their extensions
type
{a,b,c,...} class of instances = a collection of particulars
Extension =def
The extension of a type A is the class of instances of A
(the class of all entities to which the term ‘A’ applies)
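The definition can be illustrated with a minimal sketch. The data structure, the function name `extension`, and the toy list of particulars are assumptions made for illustration; they are not part of any OBO tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str       # a particular, e.g. a person or a place
    type_name: str  # the type this particular instantiates


# Hypothetical toy data: particulars and the type each one instantiates.
INSTANCES = [
    Instance("Michael Ashburner", "human being"),
    Instance("Barry Smith", "human being"),
    Instance("Toronto", "city"),
]


def extension(type_name, instances=INSTANCES):
    """Return the class (here: a set of names) of all instances of type_name."""
    return {i.name for i in instances if i.type_name == type_name}
```

The type itself is the string `"human being"`; its extension is whatever set of particulars happens to instantiate it in the data at hand.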
types vs. classes
types
{c,d,e,...} classes
types vs. classes
types
extensions ~ defined classes
Defined class =def
member of Abba aged > 50 years
pizza with > 4 different toppings
red wine to serve with fish
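One way to read these examples: a defined class is picked out by an arbitrary condition on instances rather than by a type in reality. A rough sketch follows, in which the helper `defined_class` and the pizza records are hypothetical.

```python
def defined_class(instances, condition):
    """Collect the instances that satisfy an arbitrary, user-chosen condition."""
    return [x for x in instances if condition(x)]


# Hypothetical instance data for the "pizza with > 4 different toppings" example.
pizzas = [
    {"name": "margherita", "toppings": {"tomato", "mozzarella", "basil"}},
    {"name": "everything",
     "toppings": {"tomato", "mozzarella", "ham", "olive", "mushroom"}},
]

# The defined class is whatever falls under the condition "more than 4 toppings".
many_toppings = defined_class(pizzas, lambda p: len(p["toppings"]) > 4)
```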
Part 2: The OBO Foundry
what cellular component?
what molecular function?
what biological process?
The Gene Ontology
Five bangs for your GO buck
1. based in biological science
2. cross-species data comparability (human, mouse, yeast, fly ...)
3. cross-granularity data integration (molecule, cell, organ, organism)
4. cumulation of scientific knowledge in algorithmically tractable form
5. links people to software
6. part of Open Biomedical Ontologies (OBO)
The Gene Ontology
Entry point for creation of web-accessible biomedical data
GO initially low-tech to encourage users
Simple (web-service-based) tools created to support the work of biologists in creating annotations (data entry)
OBO to OWL DL converters now making OBO Foundry annotated data immediately accessible to Semantic Web data integration projects
The OBO Foundry
A suite of high quality interoperable reference ontologies to serve the annotation of biomedical data
providing guidelines for those who need to create new ontology resources
http://obofoundry.org
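OBO Foundry coverage by GRANULARITY and RELATION TO TIME (CONTINUANT: INDEPENDENT, DEPENDENT; OCCURRENT)
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)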
The OBO Foundry building out from the original GO
Simple guidelines
• use singular nouns
• distinguish continuants from occurrents
• distinguish things from their qualities
• distinguish types from their instances
• do not use the weasel word ‘concept’
CRITERIA (http://obofoundry.org/)
OPENNESS: The ontology is open and available to be used by all.
FORMAL LANGUAGE: The ontology is in, or can be instantiated in, a common formal language.
ORTHOGONALITY: The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap.
CONVERGENCE: The developers agree to work towards a single ontology for each domain.
UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
VERSIONING: The ontology provider has procedures for identifying distinct successive versions.
DEFINITIONS: The ontology includes textual definitions for all terms.
CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
DOCUMENTATION: The ontology is well-documented.
USERS: The ontology has a plurality of independent users.
COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.
Foundry ontologies all work in the same way
all are built to represent the types existing in a pre-existing domain and the relations between these types in a way which can support reasoning
– we have data
– we need to make this data available for semantic search and algorithmic processing
– we create a consensus-based ontology for annotating the data
– and ensure that it can interoperate with Foundry ontologies for neighboring domains
Formal-Ontological Relations
is_a
part_of
located_at
depends_on
is_boundary_of
adjacent_to
To support integration of ontologies
relational expressions such as
is_a
part_of
...
should be used in the same way in all ontologies involved
to define these relations properly
we need to take account of both types and instances in reality
Kinds of relations
<instance, type>: Toronto instance_of city
<instance, instance>: Toronto part_of Ontario
<type, type>: waterfall part_of river
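The three kinds of relations can be told apart mechanically once each relatum is known to be a type or an instance. The following is only a sketch: the sets `TYPES` and `INSTANCES` and the function names are assumptions introduced for illustration.

```python
# Hypothetical registry saying which names denote types and which denote instances.
TYPES = {"city", "waterfall", "river"}
INSTANCES = {"Toronto", "Ontario"}


def kind_of(entity):
    """Return 'type' or 'instance' for a registered entity name."""
    if entity in TYPES:
        return "type"
    if entity in INSTANCES:
        return "instance"
    raise KeyError(f"unknown entity: {entity}")


def relation_kind(subject, obj):
    """Classify a relational assertion by the kinds of its two relata."""
    return (kind_of(subject), kind_of(obj))
```

For example, `relation_kind("Toronto", "city")` yields the pair for an <instance, type> assertion, while `relation_kind("waterfall", "river")` yields the pair for a <type, type> assertion.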
is_a
human is_a mammal
all instances of the type human are as a matter of necessity instances of the type mammal
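The instance-level reading of is_a can be sketched as a check over recorded instances. This is an approximation under stated assumptions: finite data can only show that no counterexample has been recorded, not that the relation holds "as a matter of necessity". The table `INSTANCE_OF`, the cat named Tibbles, and the function name are hypothetical.

```python
# Hypothetical instance data: each particular mapped to the types it instantiates.
INSTANCE_OF = {
    "Michael Ashburner": {"human", "mammal"},
    "Tibbles": {"cat", "mammal"},  # an invented cat, for illustration only
}


def is_a_holds(a, b, instance_of=INSTANCE_OF):
    """True if every recorded instance of type a is also an instance of type b.

    Returns False when no instance of a is recorded, since nothing can be checked.
    """
    instances_of_a = [i for i, types in instance_of.items() if a in types]
    return bool(instances_of_a) and all(b in instance_of[i] for i in instances_of_a)
```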
Ontology: scope; URL; custodians
Cell Ontology (CL): cell types from prokaryotes to mammals; obo.sourceforge.net/cgi-bin/detail.cgi?cell; Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI): molecular entities; ebi.ac.uk/chebi; Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO): anatomical structures in human and model organisms; (under development); Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA): structure of the human body; fma.biostr.washington.edu; JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO): design, protocol, data instrumentation, and analysis; fugo.sf.net; FuGO Working Group
Gene Ontology (GO): cellular components, molecular functions, biological processes; www.geneontology.org; Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO): qualities of biomedical entities; obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value; Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO): protein types and modifications; (under development); Protein Ontology Consortium
Relation Ontology (RO): relations; obo.sf.net/relationship; Barry Smith, Chris Mungall
RNA Ontology (RnaO): three-dimensional RNA structures; (under development); RNA Ontology Consortium
Sequence Ontology (SO): properties and features of nucleic sequences; song.sf.net; Karen Eilbeck
Foundational Model of Anatomy
[Diagram: is_a and part_of links among Anatomical Structure, Organ, Serous Sac, Pleural Sac, Pleura (Wall of Sac), Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Mesothelium of Pleura, Tissue, Organ Part, Organ Subdivision, Organ Component, Anatomical Space, Organ Cavity, Serous Sac Cavity, Pleural Cavity, Organ Cavity Subdivision, Serous Sac Cavity Subdivision, Interlobar recess]
Mature OBO Foundry ontologies now undergoing reform
Cell Ontology (CL)
Chemical Entities of Biological Interest (ChEBI)
Foundational Model of Anatomy (FMA)
Gene Ontology (GO)
Phenotypic Quality Ontology (PaTO)
Relation Ontology (RO)
Sequence Ontology (SO)
Ontologies being built to satisfy Foundry principles ab initio
Ontology for Clinical Investigations (OCI)
Common Anatomy Reference Ontology (CARO)
Ontology for Biomedical Investigations (OBI)
Protein Ontology (PRO)
RNA Ontology (RnaO)
Subcellular Anatomy Ontology (SAO)
Ontologies in planning phase
Biobank/Biorepository Ontology (BrO, part of OBI)
Environment Ontology (EnvO)
Immunology Ontology (ImmunO)
Infectious Disease Ontology (IDO)
Mouse Adult Neurogenesis Ontology (MANGO)
OBO Foundry Success Story
Model organism research seeks results valuable for the understanding of human disease.
This requires the ability to make reliable cross-species comparisons, and for this anatomy is crucial.
But different MOD communities have developed their anatomy ontologies in uncoordinated fashion.
Ontologies facilitate grouping of annotations
brain 20
hindbrain 15
rhombomere 10
Query brain without ontology: 20
Query brain with ontology: 45
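The arithmetic behind the two query results can be sketched as follows, assuming that rhombomere part_of hindbrain and hindbrain part_of brain. The dictionaries and function names are illustrative, not an existing query tool.

```python
# Assumed part_of links between the three anatomical terms (child -> parent).
PART_OF = {"hindbrain": "brain", "rhombomere": "hindbrain"}

# Number of annotations attached directly to each term, as on the slide.
ANNOTATIONS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}


def is_part_of_or_same(term, ancestor):
    """Walk up the part_of chain from term, looking for ancestor."""
    while term is not None:
        if term == ancestor:
            return True
        term = PART_OF.get(term)
    return False


def query(term, use_ontology):
    """Count annotations for term, optionally including annotations to its parts."""
    if not use_ontology:
        return ANNOTATIONS.get(term, 0)
    return sum(n for t, n in ANNOTATIONS.items() if is_part_of_or_same(t, term))
```

Without the ontology only the 20 direct annotations are found; with it, the annotations to hindbrain and rhombomere are grouped under brain, giving 20 + 15 + 10 = 45.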
CARO – Common Anatomy Reference Ontology
for the first time provides guidelines for model organism researchers who wish to achieve comparability of annotations
for the first time provides guidelines for those new to ontology work
See Haendel et al., “CARO: The Common Anatomy Reference Ontology”, in: Burger (ed.), Anatomy Ontologies for Bioinformatics: Springer, in press.
CARO-conformant ontologies already in development:
Fish Multi-Species Anatomy Ontology (NSF funding received)
Ixodidae and Argasidae (Tick) Anatomy Ontology
Mosquito Anatomy Ontology (MAO)
Spider Anatomy Ontology
Xenopus Anatomy Ontology (XAO)
undergoing reform: Drosophila and Zebrafish Anatomy Ontologies
Part 3: The Hole Story
The Ontology of Environments
Initial hypothesis: Environments are holes
environment, place, site, niche, habitat, setting, hole, spatial region, interior, location
Places are holes
[OBO Foundry coverage table repeated: granularity (organ and organism; cell and cellular component; molecule) against relation to time (continuant, occurrent)]
No place for environments
A Neglected Major Category in Ontologies thus far
Things (e.g. organisms)
Qualities / Features
Functions
Processes
Environments = that into which organisms (etc.) fit
[OBO Foundry coverage table repeated: granularity (organ and organism; cell and cellular component; molecule) against relation to time (continuant, occurrent)]
Environments are holes in which organisms, cells, molecules ... can live
environments are here
Environments are holes
Double Hole Structure of the Occupied Niche
Medium (filling the environing hole)
Tenant (occupying the central hole)
Retainer (a boundary of some surrounding structure)
Tenant, medium and retainer
the medium of the bear’s niche is a
circumscribed body of air
medium might be body of water, cytosol, nasal mucosa, epithelium, endocardium, synovial tissue ...
The Empty Niche
Fiat boundary Physical boundary
Two Types of Boundary
Fiat boundary Physical boundary
Positive and negative parts
positive part (made of matter)
negative part or hole (not made of matter)
Four Basic Niche Types (Niche as generalized hole)
1: a womb; an egg; a house (better: the interior thereof)
2: a snail’s shell
3: the niche of a pasturing cow
4: the niche around a circling buzzard (fiat boundary)
Types of relations for EnvO
in
on (surface of)
surrounds
lives_in
attaches to
realizes
occupies (spatial region)
...
Lexical Semantics
the fruit is in the bowl
the bird is in the nest
the lion is in the cage
the pencil is in the cup
the fish is in the river
the river is in the valley
the water is in the lake
the car is in the garage
the fetus is in the cavity in the uterine lining
the colony of whooping crane is in its breeding grounds
Double Hole Structure
Medium (filling the environing hole)
Tenant (occupying the central hole)
Retainer (a boundary of some surrounding structure)
when a tenant leaves its niche the gap left by the tenant is filled immediately by the surrounding medium
A hole in the ground
Solid physical boundaries at the floor and walls
but with a fiat lid:
hole
Part 4: Not every hole is an environment
An environment is a special kind of (generalized) hole
but what kind?
Elton – niche as role
the ‘niche’ of an animal means its place in the biotic environment, its relations to food and enemies. [...] When an ecologist says ‘there goes a badger’ he should include in his thoughts some definite idea of the animal’s place in the community to which it belongs, just as if he had said ‘there goes the vicar’ (Elton 1927, pp. 63f.)
G.E. Hutchinson: niche as volume in a functionally defined space
the niche = an n-dimensional hyper-volume whose dimensions correspond to resource gradients over which species are distributed
G.E. Hutchinson (1957, 1965)
Hypervolume niche = a location in an attribute space
defined by a specific constellation of environmental variables such as degree of slope, exposure to sunlight, soil fertility, foliage density, salinity...
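A location in such an attribute space can be tested for membership in a niche. The sketch below assumes, as a simplification, that the hypervolume is box-shaped, with one tolerated interval per environmental variable. The variable names and the numeric ranges are invented for illustration and are not taken from Hutchinson.

```python
# Hypothetical niche: one tolerated (low, high) range per environmental variable.
NICHE = {
    "slope_deg": (0.0, 15.0),
    "salinity_ppt": (0.0, 5.0),
    "sunlight_h": (4.0, 10.0),
}


def in_niche(site, niche=NICHE):
    """True if the site's value on every niche dimension lies within its range."""
    return all(low <= site[var] <= high for var, (low, high) in niche.items())
```

A site is then just a point in the attribute space, for example `{"slope_deg": 5.0, "salinity_ppt": 1.0, "sunlight_h": 6.0}`. Nothing in this representation says where on Earth the site is.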
Niche Construction
Lewontin: niches normally arise in symbiosis with the activities of organisms or groups of organisms (“ecosystem engineering”);
they are not already there, like vacant rooms in a gigantic evolutionary hotel, awaiting organisms who would evolve into them. (The Triple Helix: Gene, Organism, and Environment)
Part Last: Bringing Together the Spatial and Functional Approaches to Environment Ontology
The environment is not a location in an attribute space, but it must have features which have such a location
Every environment must have some spatial location
The functional niche presupposes the spatial-structural niche
Ontology of environment + ontology of associated environmental features
J. J. Gibson’s Ecological Psychology
The terrestrial environment is [best] described in terms of a medium, substances, and the surfaces that separate them. (Gibson 1979, p. 16)
Gibson’s theory of surface layout
‘a sort of applied geometry that is appropriate for the study of perception and behavior’ (1979, p. 33)
ground, open environment, enclosure, detached object, attached object, hollow object, place, sheet, fissure, stick, fiber, dihedral, etc.
Gibson’s theory of surface layout as an anatomy of environments
• systems of barriers, doors, pathways to which the behavior of organisms is specifically attuned,
• temperature gradients, patterns of movement of air or water molecules
• water holes, food sources (features)
• apertures (mouths, sphincters ...)
Two sets of issues
Environments, as spatial structures, and their parts
Environmental attributes (qualities, functions), determining multidimensional loci à la Hutchinson
Aim
To define structural properties such as: open, closed, connected, compact, spatial coincidence, integrity, aggregate, boundary
RCC (Region Connection Calculus) plus extensions
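To give a flavour of the kind of relation RCC provides, here is a sketch of the eight base relations of RCC8 restricted to closed one-dimensional intervals. The restriction to intervals is a simplifying assumption made for this example; real environments would need regions in two or three dimensions, and the function name is illustrative.

```python
def rcc8(a, b):
    """Classify two closed 1-D intervals (lo, hi), with lo < hi, into an RCC8 relation."""
    (a1, a2), (b1, b2) = a, b
    if (a1, a2) == (b1, b2):
        return "EQ"      # equal
    if a2 < b1 or b2 < a1:
        return "DC"      # disconnected
    if a2 == b1 or b2 == a1:
        return "EC"      # externally connected: only the boundaries touch
    if b1 <= a1 and a2 <= b2:
        # a is a proper part of b; tangential if it shares an endpoint with b
        return "NTPP" if (b1 < a1 and a2 < b2) else "TPP"
    if a1 <= b1 and b2 <= a2:
        # b is a proper part of a (the inverse relations)
        return "NTPPi" if (a1 < b1 and b2 < a2) else "TPPi"
    return "PO"          # partial overlap
```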
Ecological Niche Concepts
niche as particular place or subdivision of an environment that an organism or population occupies
vs.
niche as function of an organism or population within an ecological community
Next steps
Our data needs are to link niche features with geo-locations
Scale: From geographic to microbiological
From locations of organisms/samples, sources of museum artifacts ...
to organism interactions, e.g. on bacterial infection – how the interior of one organism or organism part serves as environment for another organism
Hosts for bacterial infection
(interior of) lung
blood (bacteremia)
erythrocyte: plasmodium inhabits red blood cells
hepatocyte: plasmodium infects liver cells
macrophage
gut and oral mucosa, nasal mucosa, vaginal mucosa
kidney
bladder
portion of epithelial tissue
C: bacteria (arrows) adhering to and penetrating the epithelial cells (×3,000)
D: abscess (Ab) formation in subepithelial region with a colony of bacteria (arrows) and a red blood cell (RBC) in it (×2,000)
[OBO Foundry coverage table repeated: granularity (organ and organism; cell and cellular component; molecule) against relation to time (continuant, occurrent)]
Environments, environment parts (features), environment qualities
Ontologies needed
Environment (Taxonomy): place, habitat, city, farm, building (interior), oral cavity, uterine cavity, gut ...
Environment part (Anatomy of environments: surface, conduit, entry ...): city wall, uterine wall, water source, ...
Environment function: protection, supply of food, ...
Environment quality (Phenotypes): ambient temperature, salinity, ...