The BIRNLex: Principles and practices of community ontology development
Maryann Martone
The Ontology Task Force: Cross Test Beds
• Carol Bean (co-chair), NIH-NCRR
• Maryann Martone (co-chair), BIRN CC
• Amarnath Gupta, BIRN CC
• Bill Bug, Mouse BIRN
• Christine Fennema-Notestine, Morph BIRN
• Jessica Turner, FBIRN
• Jeff Grethe, BIRN CC
• Daniel Rubin, NCBO
• David Kennedy, Morph BIRN
• Provide a dynamic knowledge infrastructure to support integration and analysis of BIRN federated data sets, one which is conducive to accepting novel data from researchers to include in this analysis
• Identify and assess existing ontologies and terminologies for summarizing, comparing, merging, and mining datasets. Relevant subject domains include clinical assessments, assays, demographics, cognitive task descriptions, neuroanatomy, imaging parameters/data provenance in general, and derived (fMRI) data
• Identify the resources needed to achieve the ontological objectives of individual test-beds and of the BIRN overall. May include finding other funding sources, making connections with industry and other consortia facing similar issues, and planning a strategy to acquire the necessary resources
Concept Based User Interface
• Has been developed based on feedback from community at Ontology boot-camp and test bed AHMs
• Provides access to BIRN ontological sources
• Allows for the construction of queries based on familiar concepts; the architecture handles the generation of integrated views
• Currently, over 2000 tables are registered from BIRN databases and from internal and external knowledge sources
BONFIRE: BIRN Knowledge Sources
Bonfire Ontology Browser and Extension Tool
BIRNLex
• Grew out of BIRN Ontology Workshops
• UMLS difficult to work with
  – Duplicate terms
  – No definitions
  – Inconsistent and sometimes incomprehensible relationships
• Meant to cover all domains of interest to BIRN: imaging, neuroanatomy, experimental techniques, behavior
• Presented at this year’s SFN meeting; version 1.0 to be released very soon
• Draft version posted on the web (see OTF Wiki)
• Current domain areas: neuroanatomy, behavioral paradigms, mouse strain nomenclature, experimental procedures
• Developed in Protégé using OWL
BIRNLex - General Principles
OTF has adopted and refined the best practices for ontology development promoted by NCBO/OBO Foundry
• Re-use existing community ontologies covering BIRN required domains, e.g., OBI, CARO, BFO, GO Cellular Component, NCBI taxonomy
• For novel domains (behavioral paradigms, imaging protocols, etc.), submit to OBO Foundry or contribute to the relevant community effort (e.g., imaging experiments and processing going into OBI)
• All BIRNLex entities must have Aristotelian definitions (genus and differentia)
• OTF and other BIRN members are holding regular curation sessions
• Heavy use of curatorial metadata to support automated evaluation, analysis, and maintenance of the ontology
• Use OWL and other supporting technologies, enabling us to leverage a variety of mature and emerging tools to support ontology curation, ontology-centric annotation, and ontology-driven semantic querying
Core Ontologies
• Imported into Protégé
  – BFO: Basic Formal Ontology
  – SKOS (Simple Knowledge Organization System)
    • Preferred labels
    • Alternative labels
  – OBI: Ontology for Biomedical Investigations
• Manually imported: NeuroNames brain anatomy, paradigm classes from Peter Fox
  – Each term is identified by its source and its source unique identifier
• Included cross reference to UMLS identifiers
  – Utilize synonyms
  – Maps to other efforts using UMLS
• End user doesn’t have to worry about these categories
•Facilitates alignment with other ontologies across scales and modalities
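The bookkeeping described on this slide (a source, a source-specific identifier, preferred and alternative labels, and a UMLS cross reference for each term) can be pictured as a simple record plus a label index. This is only a minimal sketch in Python, not how BIRNLex is actually stored (the slides say it is developed in Protégé using OWL). The `Term` class, the `build_label_index` helper, and every identifier value below are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    source: str                    # where the term was imported from
    source_id: str                 # identifier inside that source (placeholder value)
    pref_label: str                # SKOS-style preferred label
    alt_labels: List[str] = field(default_factory=list)  # SKOS-style alternative labels
    umls_cui: Optional[str] = None  # UMLS cross reference (placeholder value)


def build_label_index(terms: List[Term]) -> Dict[str, Term]:
    """Map every preferred and alternative label (lowercased) to its term."""
    index: Dict[str, Term] = {}
    for term in terms:
        for label in [term.pref_label, *term.alt_labels]:
            index[label.lower()] = term
    return index


# Illustrative entry only; the identifiers are made up, not real NeuroNames or UMLS values.
terms = [
    Term(
        source="NeuroNames",
        source_id="SOURCE-ID-PLACEHOLDER",
        pref_label="Amygdala",
        alt_labels=["Amygdaloid nuclear complex"],
        umls_cui="CUI-PLACEHOLDER",
    ),
]

index = build_label_index(terms)
hit = index["amygdaloid nuclear complex"]
print(hit.source, hit.source_id, hit.pref_label)
```

Looking a term up by any synonym returns the same record, which is the practical benefit of keeping preferred and alternative labels side by side.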
Use of Foundational Ontologies
UBO - Upper Bio Ontology
BFO - Basic Formal Ontology
•Adopted framework proposed by Barry Smith and colleagues for biological ontologies (Rosse et al., 2005, AMIA proceedings)
•Based anatomical work on the FMA
•Don’t want to concern ourselves with the upper-level ontologies; want to focus on our domain
•Using as a rough guide for now while these ontologies are being built
BIRNLex is a Lexicon, not a terminology
• A is a B which has C
  – Defines class structure
  – Defines properties
• Electron microscope is a type of microscope which uses electrons to form an image
  – Microscope
    • Electron microscope
      – Has property
        » Image formation
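The "A is a B which has C" pattern can be made concrete with a small record that keeps the genus (B) and the differentia (C) as separate fields. The following Python sketch is illustrative only; the `TermDefinition` class and its field names are assumptions, not part of BIRNLex.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermDefinition:
    """A term defined by its genus (parent class) and its differentia."""
    label: str        # the "A" being defined
    genus: str        # the "B": the parent class, which fixes the class structure
    differentia: str  # the "C": the property that sets A apart from other Bs

    def as_sentence(self) -> str:
        return f"{self.label} is a type of {self.genus.lower()} which {self.differentia}"


electron_microscope = TermDefinition(
    label="Electron microscope",
    genus="Microscope",
    differentia="uses electrons to form an image",
)
print(electron_microscope.as_sentence())
```

Keeping the two parts separate means the genus can drive the class hierarchy while the differentia is recorded as a property of the class.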
BIRNLex Curation
• Meet on a semi-regular basis (many interruptions)
• Identify domains and strategies
  – Not mixing structure and function is a big help in moving forward
• We slip up quite a bit
• Revise, revise, revise
• Tools for biologists are inadequate; better if you’re a computer scientist (I handed off BIRNLex, reluctantly, several months ago)
• Divide up the work
• Assign curation status
  – We don’t argue too long
  – Curated, graph position temporary, uncurated, raw import from source
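The four curation statuses named above lend themselves to a small enumeration, which makes it easy to list what still needs attention. This is a hedged sketch in Python; the `CurationStatus` enum, the `needs_review` helper, and the term-to-status assignments are invented for illustration and do not reflect the actual state of any BIRNLex term.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List


class CurationStatus(Enum):
    CURATED = "curated"
    GRAPH_POSITION_TEMPORARY = "graph position temporary"
    UNCURATED = "uncurated"
    RAW_IMPORT = "raw import from source"


# Hypothetical assignments, for illustration only.
status_by_term: Dict[str, CurationStatus] = {
    "Electron microscope": CurationStatus.CURATED,
    "Oddball paradigm": CurationStatus.GRAPH_POSITION_TEMPORARY,
    "Radial maze": CurationStatus.UNCURATED,
    "Alveus": CurationStatus.RAW_IMPORT,
}


def needs_review(statuses: Dict[str, CurationStatus]) -> List[str]:
    """Return the labels of terms that are not yet fully curated, sorted."""
    return sorted(
        label for label, status in statuses.items()
        if status is not CurationStatus.CURATED
    )


summary = Counter(status.value for status in status_by_term.values())
print(needs_review(status_by_term))
print(dict(summary))
```

Recording the status as metadata on each term is one way to let curators move on quickly and return to provisional decisions later.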
Strict rules for developing taxonomies
• Behavioral Paradigm
  – Oddball paradigm
    • Auditory oddball paradigm
    • Visual oddball paradigm
• Working memory paradigm
  – Serial item recognition task
  – Radial maze
• Forebrain:
  – Has part: Amygdala
• Limbic system
  – Has part: Amygdala
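The slide does not spell out the rules themselves. One common reading, assumed here, is that the is-a taxonomy is kept as a single-parent hierarchy while part-of is recorded as a separate relation, so the amygdala can be part of both the forebrain and the limbic system without becoming a subclass of either. The Python sketch below illustrates that reading only; the dictionaries and function names are hypothetical.

```python
from typing import Dict, List, Set

# is-a: each class has exactly one parent (assumed reading of the "strict rules").
IS_A: Dict[str, str] = {
    "Oddball paradigm": "Behavioral paradigm",
    "Auditory oddball paradigm": "Oddball paradigm",
    "Visual oddball paradigm": "Oddball paradigm",
}

# has-part: kept apart from is-a; a part may belong to several wholes.
HAS_PART: Dict[str, Set[str]] = {
    "Forebrain": {"Amygdala"},
    "Limbic system": {"Amygdala"},
}


def ancestors(term: str) -> List[str]:
    """Walk the single-parent is-a chain from a term up to the root."""
    chain: List[str] = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def wholes_containing(part: str) -> List[str]:
    """List every whole that records the given part, sorted by name."""
    return sorted(whole for whole, parts in HAS_PART.items() if part in parts)


print(ancestors("Auditory oddball paradigm"))
print(wholes_containing("Amygdala"))
```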
The state of Neuroanatomy in BIRN
•Assessed the usage of anatomical terms in each atlas used by BIRN
  –Inconsistency in application of terms
  –Resolution of technique was not considered
•Create standard “atomic” definitions for core brain parts
•Create a volumetric hierarchy
•Provides a basis for accounting for resolution
•Goal: which structures give rise to signals measured by a technique
•Structure, not function
  –No arguments about whether the amygdala exists functionally
  –No arguments about whether the fornix is functionally part of the hypothalamus
•Imported NeuroNames hierarchy for volumetric relations among brain parts
  –e.g., hippocampal formation has part:
    »Mostly gray matter = dentate gyrus, hippocampus
    »Mostly white matter = alveus
•Develop consistent application rules:
  –“My hippocampus” = dentate gyrus + hippocampus
•Need descriptors for topological relationships and spatial overlap
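An application rule such as “My hippocampus” = dentate gyrus + hippocampus amounts to expanding a label into a set of atomic parts, after which two usages of the same label can be compared directly. The Python sketch below is a rough illustration under that assumption; the atlas names `atlas_a` and `atlas_b`, their contents, and the `compare_usage` helper are hypothetical.

```python
from typing import Dict, List, Set

# Atomic parts of the hippocampal formation named on the slide.
ATOMIC_PARTS: Set[str] = {"Dentate gyrus", "Hippocampus", "Alveus"}

# Hypothetical atlases, each applying the label "hippocampus" differently.
LABEL_USAGE: Dict[str, Dict[str, Set[str]]] = {
    "atlas_a": {"hippocampus": {"Dentate gyrus", "Hippocampus"}},
    "atlas_b": {"hippocampus": {"Hippocampus"}},
}


def compare_usage(label: str, first: str, second: str) -> Dict[str, List[str]]:
    """Compare what two atlases mean by a label, in terms of atomic parts."""
    a = LABEL_USAGE[first][label]
    b = LABEL_USAGE[second][label]
    return {
        "shared": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


result = compare_usage("hippocampus", "atlas_a", "atlas_b")
print(result)
```

Set comparison only captures which parts are included; the topological and spatial-overlap descriptors the slide calls for would need richer relations than this.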
[Figure labels: Putamen, Globus Pallidus, Caudate Nucleus, Thalamus, Ventral Diencephalon, Hippocampus, Cerebral Cortex, Cerebral White Matter]
Subcellular Anatomy Ontology: Extending anatomy to subcellular dimension; based on FMA
[Diagram labels: Neuroepithelial cell, Neuron, Glia, Microglia, Macroglia, Dendrite, Axon, Cell body, Spine, Dendritic Spine, Shaft, Compartment, Component, Post synaptic Component, PSD, SER, RER, Actin Filament, Ribosome, Lysosome, Microtubule, Macromolecule, Properties, Orientation, Distribution, Morphometrics, Shape, Gene Ontology, Cell type Ontology; relations: “has regional part”, “has constitutional part”]
Next Steps
• Community extension and curation
  – Import into Bonfire
  – BIRNLex “Wikipedia”
• Integration with BIRN imaging, workflow and analysis tools
• Work to evaluate and extend PATO for imaging data
  – Spatio-temporal relationships
• Better web interface
• Begin transition into fully structured ontology:
  – MIND Ontology: Multiscale Investigation of Neurological disease
Relationships in complex scenes
Vlad Mitsner, Masako Terada, Stephen Larson
•Incorporation of ontologies into segmentation tools for electron tomography
•Describe each “scene” as an instance of the ontology
•Capture not only entities but relationships among entities
•Electron microscopic data are sparse
•Discover “rules” for subcellular anatomy
Images as Instances
[Diagram labels: Data, Technique, Analysis, Annotator, Biological Entity, FUGO, OBI, PATO]
PATO: Phenotype and Trait Ontology
Genotype     | Entity                       | Attribute           | Value
npo          | gut                          | structure           | dysplastic
             | gut                          | relative size       | small
r210         | retina                       | pattern             | irregular
             | brain                        | structure           | fused
tm84         | d/v pattern formation        | qualitative         | abnormal
             | blood islands                | relative number     | number increased
Bsb[2]       | elongation of arista literal | process             | arrested
C-alpha[1D]  | adult behaviour              | behavioral activity | uncoordinated
2003 trial data: FB & ZFIN
•A way of expressing complex phenotypes that is more scientifically “sound”
•BIRN provides valuable test cases for PATO
•BIRN data immediately becomes interoperable with the zebrafish and fly communities
Suzanna Lewis, Chris Mungall et al.
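The Genotype / Entity / Attribute / Value layout of the PATO trial data can be treated as a list of uniform records, which is what makes annotations from different communities comparable. The Python sketch below is a minimal illustration, not PATO tooling; the `PhenotypeAnnotation` type and `find` helper are hypothetical names, and only rows whose genotype is given explicitly in the trial data are used.

```python
from typing import List, NamedTuple


class PhenotypeAnnotation(NamedTuple):
    genotype: str
    entity: str
    attribute: str
    value: str


# Rows taken from the 2003 trial data table (explicit-genotype rows only).
annotations: List[PhenotypeAnnotation] = [
    PhenotypeAnnotation("npo", "gut", "structure", "dysplastic"),
    PhenotypeAnnotation("r210", "retina", "pattern", "irregular"),
    PhenotypeAnnotation("tm84", "d/v pattern formation", "qualitative", "abnormal"),
    PhenotypeAnnotation("C-alpha[1D]", "adult behaviour", "behavioral activity", "uncoordinated"),
]


def find(records: List[PhenotypeAnnotation], **criteria: str) -> List[PhenotypeAnnotation]:
    """Return the records whose fields match every keyword criterion."""
    return [
        record for record in records
        if all(getattr(record, name) == wanted for name, wanted in criteria.items())
    ]


pattern_hits = find(annotations, attribute="pattern")
print(pattern_hits)
```

Because every record has the same four fields, the same query works regardless of which database contributed the annotation.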
Lessons learned
•BIRN has made a good faith effort to evaluate and employ existing ontologies; we are patient but we’ve got work to do
  –Challenge to develop tools on top of shifting infrastructure
  –Expect that we’ll have to redo annotation periodically
•Ontology building is not for people with thin skins
  –We are not attempting to build formal ontologies for everything
  –Provide a formal and consistent structure for describing data
•A man who consults one ontologist knows what to do; a man who consults two ontologists is never sure
•Don’t want to be victims in the ontology wars
•NCBO/MGI have been very helpful
•The principles suggested to us so far have been useful; they make the process easier, not harder
•Reference ontologies are useful, because they take care of the categories, e.g., dependent enduring entity, that tend to drive domain scientists a little nuts