developing an owl-dl ontology for research and care of intracranial aneurysms – challenges and...

2
Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and Limitations Holger Stenzhorn, Martin Boeker, Stefan Schulz, Susanne Hanser Institute for Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany ABCC5 gene encodes the protein MRP5 Naïve Approach : ABCC5 encodes.MRP5 Critique : there may be gene (sensu nucleotide chain) instances that happen to never encode any Gene product Y instance Solution 1 (preferred) : Use value restrictions ABCC5 encodes.GeneProductY Solution 2 : Consider ABCC5 and MRP5 not as classes but as information entity instances: ABCC5 Information Entity MRP5 Information Entity <ABCC5, MRP5> encodes Headache is associated with intracranial aneurysm Naïve Approach : Headache associated_with. Intracranial Aneurysm Intracranial Aneurysm associated_with. Headache Critique : Not every headache is a symptom of intracranial aneurysm and not every intracranial aneurysm produces headache. Solution : Introduce (anonymous) dispositions Intracranial Aneurysm has-disposition.( has-realization Headache) BBS2 is a protein with unknown function Naïve Approach : BBS2 Protein With Unknown Function Protein Critique : Whether a function is known or unknown is ontologically irrelevant Objection : Such classes are important for ontology navigation and housekeeping Solution 1 : Eliminate irrelevant class BBS2 Protein Solution 2 : Mark irrelevant class as housekeeping or navigational class BBS2 NAVProtein With Unknown Function Protein Smoking is a risk factor for aneurysm rupture Naïve Approach : Tobacco Smoking Aneurysm Rupture Risk Factor Tobacco Smoking Process Critique : What is represented is a context- dependent statement on and not a generic property of Smoking. Such classes are important for ontology navigation and housekeeping Solution : Clearly separate: Ontology proper (with the top node Factor Particular) Tobacco Smoking Process Endurant Particular from context ontology (with the top node Factor Particular in context) Tobacco Smoking Aneurysm Rupture Risk Factor Risk Factor Particular in Context Is a disease an event or state ? Naïve Approach : Brain Neoplasm Event or Brain Neoplasm Stative Critique : Ontologically, there are two entities: the disease and the course of the disease. In the context of @neurist, there is no need for this distinction. Solution : Introduce the disjunction class State or Process State or Process Event Stative Brain Neoplasm State or Process Pregnancy is a suspected risk factor for aneurysm rupture Naïve Approach : Pregnancy Suspected Aneurysm Rupture Risk Factor Critique : The context-dependent statement on pregnancy as a risk factor is modified by a modal expression. This is not an issue of ontology Objection : The distinction between suspected and proven risks is crucial for the use of the ontology (retrieval) Solution : Maintain naïve approach, but encode it within context ontology

Upload: osborn-cobb

Post on 27-Dec-2015

213 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and Limitations Holger Stenzhorn, Martin Boeker, Stefan Schulz,

Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and LimitationsHolger Stenzhorn, Martin Boeker, Stefan Schulz, Susanne Hanser Institute for Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany

ABCC5 gene encodes the protein MRP5

Naïve Approach: ABCC5⊑ encodes.MRP5

Critique: there may be gene (sensu nucleotide chain) instances that happen to never encode any Gene product Y instance

Solution 1 (preferred): Use value restrictions ABCC5 ⊑ encodes.GeneProductY

Solution 2: Consider ABCC5 and MRP5 not as classes but as information entity instances: ABCC5 Information Entity MRP5 Information Entity <ABCC5, MRP5> encodes

Headache is associated with intracranial aneurysm

Naïve Approach: Headache ⊑ associated_with. Intracranial Aneurysm Intracranial Aneurysm ⊑ associated_with. Headache

Critique: Not every headache is a symptom of intracranial aneurysm and not every intracranial aneurysm produces headache.

Solution: Introduce (anonymous) dispositionsIntracranial Aneurysm ⊑ has-disposition.( has-realization Headache)

BBS2 is a protein with unknown function

Naïve Approach: BBS2 ⊑ Protein With Unknown Function ⊑ Protein

Critique: Whether a function is known or unknown is ontologically irrelevant

Objection: Such classes are important for ontology navigation and housekeeping

Solution 1: Eliminate irrelevant class BBS2 ⊑ Protein

Solution 2: Mark irrelevant class as housekeeping or navigational class BBS2 ⊑ NAVProtein With Unknown Function ⊑ Protein

Smoking is a risk factor for aneurysm rupture

Naïve Approach: Tobacco Smoking ⊑ Aneurysm Rupture Risk Factor Tobacco Smoking ⊑ Process

Critique: What is represented is a context-dependent statement on and not a generic property of Smoking.Such classes are important for ontology navigation and housekeepingSolution: Clearly separate:Ontology proper (with the top node Factor Particular) Tobacco Smoking ⊑ Process ⊑ Endurant ⊑ Particular from context ontology (with the top node Factor Particular in context) Tobacco Smoking ⊑ Aneurysm Rupture Risk Factor ⊑ Risk Factor ⊑ Particular in Context

Is a disease an event or state ?

Naïve Approach: Brain Neoplasm ⊑ Event or Brain Neoplasm ⊑ Stative

Critique: Ontologically, there are two entities: the disease and the course of the disease. In the context of @neurist, there is no need for this distinction.

Solution: Introduce the disjunction class State or Process State or Process Event ⊔ Stative Brain Neoplasm ⊑ State or Process

Pregnancy is a suspected risk factor for aneurysm rupture

Naïve Approach: Pregnancy ⊑ Suspected Aneurysm Rupture Risk Factor Critique: The context-dependent statement on pregnancy as a risk factor is modified by a modal expression. This is not an issue of ontology

Objection: The distinction between suspected and proven risks is crucial for the use of the ontology (retrieval)

Solution: Maintain naïve approach, but encode it within context ontology

Page 2: Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and Limitations Holger Stenzhorn, Martin Boeker, Stefan Schulz,

Developing an Ontology for Research and Care of Intracranial Aneurysms – Challenges and LimitationsHolger Stenzhorn, Martin Boeker, Stefan Schulz, Susanne Hanser Institute for Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany

BioTop - Characteristics

• 257 classes• 42 relation types• 193 logical restriction on classes. • 57 sufficient criteria for full class definitions• OWL-DL as representation language • BFO as upper ontology• OBO RO relations

Example of a formal class definition (biotop:Tissue):

ChemTop

According to the modularization principles the more detailed biochemistry descriptions were separated into an optional add-on called ChemTop.

Integration of ChemTop and ChEBI planned.

Modularization

BioTop supports the following modularization principles:

• as subdomain boundaries are fuzzy a limited degree of overlap must be accounted for

• modules should be dimensioned in a way that they classify rapidly• modules follow the same principles of upper-level arrangement• bridging files link modules between themselves and with upper-level

ontologies

Upper Level Ontology

Domain Top Level Ontology

Domain Ontology

DOLCE

BioTop ChemTop

BFO-RO

GO CL ChEBI

Alignment and mapping

Completed:UMLS Semantic Network Mapping GENIA Ontology mappingGO: mapping of the three GO ontologiesCO: mapping of the cell ontologyTaxdemo: mapping of sample biological taxonomy

Ongoing:BioTop – DOLCE mapping (requires redesign of the BioTop Quality branch)

Background

The need for semantic standardization in the Life Sciences has been addressed by a dynamic evolution of domain ontologies (e.g. OBO, SNOMED CT), which tend to adhere to foundational principles of ontology design.Current biomedical ontologies are characterized by • large fragmentation and overlap• missing cross-ontology links • lack of clear and unambiguous formal definitions of basic terms• purpose-specific architecture and design decisions

Availabilityhttp://purl.org/biotop:

BioTop - Rationale

• To consolidate and integrate domain ontologies bridging the gap to a common upper level ontology

• To enforce unambiguous logic-based descriptions of basic entities of biology and medicine using a standardized description language

• To maintain neutrality with regard to granularity and observer-biased views

Bridgingmechanism

ChemTop BioTopBridge

BioTop GO Bridge

ChemTop ChEBI Bridge

BioTop CLBridge

BioTop BFO-RO Bridge

ChemTop BFO-RO Bridge

Onto1 Onto2 Bridge

Onto1

Onto2

≡ ⊑

• Sources• Publications• Discussion List