What is an Ontology?
A representation of knowledge in a domain
In theory
Thomas Gruber (1993): “An ontology is a formal, explicit specification of a shared conceptualization.”
In English
An ontology makes explicit…
What things are we talking about?
How are those things related to each other?
Why?
Defining a common vocabulary in which knowledge can be formally represented supports sharing and reuse among software systems. Perhaps more importantly, an ontology supports sharing and reuse of application domain knowledge.
How?
Applied to datasets
1. Conceptualization refers to an abstract model of some phenomenon in the world that you want to represent in an ontology.
2. Specification is a description of what you want an ontology to do.
3. Explicit means that the type of concepts used, and the constraints on their use, are explicitly defined.
4. Formal refers to the fact that the ontology should be machine-readable.
5. Shared reflects the notion that an ontology captures consensual knowledge, that is, it is not private to some individual, but accepted by a group.
R. Studer, V. R. Benjamins, and D. Fensel: Knowledge engineering: Principles and methods. Data and Knowledge Engineering (DKE), 25(1-2):161-197, 1998.
“An ontology is a formal, explicit specification of a shared conceptualization.”
Debate between ‘purists’ (philosophers/AI), who want an ontology to be absolutely correct, and ‘practitioners’ (e.g. biologists), who want to get things done, has become -- over the last few years -- a fruitful collaboration.
Ontologies are often represented as hierarchies (graphs, trees)
Nodes are the entities and edges are the relationships
For example, the Ice_Cream_Ontology (ICO)
Entities are various ice cream treats; the relationship is “is_a” (subtype)
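A minimal sketch of such a hierarchy in Python, assuming each entity has exactly one parent. The term names are hypothetical stand-ins, since the ICO figure itself is not part of this transcript.

```python
# Hypothetical ICO terms: each key "is_a" subtype of its value.
IS_A = {
    "sundae": "ice_cream_treat",
    "ice_cream_cone": "ice_cream_treat",
    "hot_fudge_sundae": "sundae",
}

def ancestors(term, is_a=IS_A):
    """Follow is_a edges from a term up to the root of the tree."""
    path = []
    while term in is_a:
        term = is_a[term]
        path.append(term)
    return path

print(ancestors("hot_fudge_sundae"))  # ['sundae', 'ice_cream_treat']
```

Storing only the child-to-parent edge is enough for a tree, because every node has at most one parent.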
Ontologies can be more complicated than this taxonomy tree.
For example, the GO (Gene Ontology) is represented as a DAG (directed acyclic graph), since a term (entity) can have more than one parent
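To illustrate the difference from a tree, here is a small sketch of a DAG in which a term has two parents. The term names are placeholders, not real GO terms, and the helper functions are illustrative rather than any library's API.

```python
# Placeholder terms: each key maps to a list of its parents.
PARENTS = {
    "child_term": ["parent_a", "parent_b"],  # more than one parent
    "parent_a": ["root"],
    "parent_b": ["root"],
    "root": [],
}

def all_ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent edges upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def is_acyclic(parents=PARENTS):
    """Return True if no term can reach itself via parent edges."""
    state = {}  # term -> "visiting" while on the current path, then "done"

    def visit(term):
        if state.get(term) == "done":
            return True
        if state.get(term) == "visiting":
            return False  # reached a term already on the current path
        state[term] = "visiting"
        ok = all(visit(p) for p in parents.get(term, []))
        state[term] = "done"
        return ok

    return all(visit(t) for t in parents)

print(sorted(all_ancestors("child_term")))  # ['parent_a', 'parent_b', 'root']
print(is_acyclic())  # True
```

Because a term may appear under several parents, ancestors are collected into a set so that shared ancestors such as the root are counted once.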
Different sorts of ontologies
Top level/Upper – captures fundamental aspects of reality regardless of the domain
Continuant vs. occurrent (cat, cat chasing a mouse)
Type vs. instance (cat, “Fluffy”)
Domain
Reference/Core/Foundational – (complete) representation of a domain
Canonical – defining ‘normal’
Application ontology – integration of different types of entities for a specific purpose
For example, Biomedical domain
OBO (Open Biomedical Ontologies) – a suite of orthogonal interoperable reference ontologies
Mappings between ontologies: e.g. lung (anatomy), lung development (biological process), abnormal lung development (phenotype)
Using OBO-Relations – also an ontology
Cell cycle
Essential, highly conserved process that is important in understanding cancer.
The Cell Cycle Ontology (CCO) is an application ontology that integrates knowledge about the cell cycle from a diverse set of already existing resources.
http://www.cellcycleontology.org/
[Figure labels: organism; protein-protein interactions; biological process; protein; gene]
Contains two types of biological entities
Continuant – Object
Occurrent – Process
But…
How to represent ‘cell cycle’ in DAG (acyclic) format?
OBO Temporal relations
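One way to picture this, as a sketch only and not necessarily how CCO encodes it: keep the hierarchy edges (here part_of) acyclic, and record the ordering of phases in a separate temporal relation. The identifiers and the relation name "precedes" are illustrative assumptions.

```python
# Hierarchy edges stay acyclic: every phase is part_of the cell cycle.
PART_OF = {
    "G1_phase": "cell_cycle",
    "S_phase": "cell_cycle",
    "G2_phase": "cell_cycle",
    "M_phase": "cell_cycle",
}

# Temporal edges are kept separately, so the loop from M back to G1
# does not introduce a cycle into the part_of hierarchy.
PRECEDES = {
    "G1_phase": "S_phase",
    "S_phase": "G2_phase",
    "G2_phase": "M_phase",
    "M_phase": "G1_phase",
}

def phase_order(start, steps, precedes=PRECEDES):
    """List the phases visited by following the temporal relation."""
    order = [start]
    for _ in range(steps):
        order.append(precedes[order[-1]])
    return order

print(phase_order("G1_phase", 4))
# ['G1_phase', 'S_phase', 'G2_phase', 'M_phase', 'G1_phase']
```

Separating the two kinds of edge lets tools that expect a DAG traverse the hierarchy, while the cyclic ordering remains available through the temporal relation.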
Cell cycle ontology currently exists for
Homo sapiens (human)
Arabidopsis thaliana (thale cress)
Saccharomyces cerevisiae (brewer’s yeast)
Schizosaccharomyces pombe (fission yeast)
None for mouse… yet!