Reasoning requirements for Bioscience

Applied Semantic Web. Timely. Practical. Reliable.
http://applied-semantic-web.org

Emanuele Della Valle
[email protected]
http://emanueledellavalle.org

Post on 20-Aug-2015


Emanuele Della Valle - http://applied-semantic-web.org

Share, Remix, Reuse — Legally

This work is licensed under the Creative Commons Attribution 3.0 Unported License.

You are free:

• to Share: to copy, distribute and transmit the work

• to Remix: to adapt the work

Under the following conditions:

• Attribution: You must attribute the work by inserting
– “© applied-semantic-web.org” at the end of each reused slide
– a credits slide stating “These slides are partially based on “Querying the Semantic Web with SPARQL” by Emanuele Della Valle http://applied-semantic-web.org/2010/03/05_SPARQL.ppt”

To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/


Reasoning Requirements (for Bioscience) [KeetRM07]

Supporting the ontology development process
• Finding source of errors
• Classification
• Supporting changes

Model checking (violation)

Finding gaps in an ontology

Discovering new relations and concepts

Comparison of two ontologies

Mereologic reasoning

Finding inconsistencies in a hierarchy of relations

Reasoning across linked ontologies

Complex queries


Supporting the ontology development process

Finding source of errors

IDEA: Help in finding the source of errors
• Instead of showing all the logical consequences of a modeling error

Example [ParsiaSK05]: The Swoop UI guides the user in locating and understanding bugs in the ontology by narrowing them down to their exact source.
• gene-part is inconsistent because the class dna-part and the class expression part-of.gene are inconsistent
• Users can follow these dependencies and reach the root cause of the inconsistency, e.g., the class which is independently inconsistent in its definition
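Following dependencies back to a root cause can be sketched in a few lines of Python. This is not Swoop's algorithm; the dependency map below is a hypothetical, hand-written stand-in for what a debugger would compute, and it simply assumes that dna-part is inconsistent on its own.

```python
# Hypothetical dependency map (an assumption, not Swoop output):
# inconsistent class -> inconsistent classes its problem is derived from.
# An empty set marks a class that is inconsistent in its own definition.
DEPENDS_ON = {
    "gene-part": {"dna-part"},
    "dna-part": set(),
}


def root_causes(cls):
    """Follow the dependencies from cls down to the independent culprits."""
    roots, seen, todo = set(), set(), [cls]
    while todo:
        current = todo.pop()
        if current in seen:
            continue  # guard against cyclic dependencies
        seen.add(current)
        parents = DEPENDS_ON.get(current, set())
        if parents:
            todo.extend(parents)
        else:
            roots.add(current)
    return roots


print(root_causes("gene-part"))
```

The modeler then only needs to repair the classes returned, instead of inspecting every derived inconsistency.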


Supporting the ontology development process

Classification 1/3

IDEA: People do the basic modeling, the machine completes the work

Example [Rector06]:
• Using a classifier to make life easier


Supporting the ontology development process

Classification 2/3

Substance
• Protein
– Insulin
– ATPase
• Steroid
– Cortisol

PhysiologicRole
• HormoneRole
• CatalystRole

Hormone ≡ Substance ⊓ ∃playsRole.HormoneRole

ProteinHormone ≡ Protein ⊓ ∃playsRole.HormoneRole

SteroidHormone ≡ Steroid ⊓ ∃playsRole.HormoneRole

Catalyst ≡ Substance ⊓ ∃playsRole.CatalystRole

Enzyme ≡ Protein ⊓ ∃playsRole.CatalystRole

Insulin ⊑ ∃playsRole.HormoneRole

Cortisol ⊑ ∃playsRole.HormoneRole

ATPase ⊑ ∃playsRole.CatalystRole


Supporting the ontology development process

Classification 3/3

Substance
• Protein
– ProteinHormone
- Insulin+
– Enzyme
- ATPase+
• Steroid
– SteroidHormone+
- Cortisol+
• Hormone
– ProteinHormone+
- Insulin+
– SteroidHormone+
- Cortisol+
• Catalyst
– Enzyme+
- ATPase+
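The classification step can be imitated with a toy structural classifier. The sketch below is not a description logic reasoner: it assumes every definition has the shape "base class that plays some role" and every necessary condition is a single playsRole statement, which is all the hormone example needs. The dictionary and function names are illustrative.

```python
# Told hierarchy entered by the modeler: child -> parents.
TOLD_PARENTS = {
    "Protein": {"Substance"},
    "Steroid": {"Substance"},
    "Insulin": {"Protein"},
    "ATPase": {"Protein"},
    "Cortisol": {"Steroid"},
    "HormoneRole": {"PhysiologicRole"},
    "CatalystRole": {"PhysiologicRole"},
}

# Necessary conditions: class -> roles it is asserted to play.
PLAYS_ROLE = {
    "Insulin": {"HormoneRole"},
    "Cortisol": {"HormoneRole"},
    "ATPase": {"CatalystRole"},
}

# Definitions: defined class == base class that plays some role.
DEFINITIONS = {
    "Hormone": ("Substance", "HormoneRole"),
    "ProteinHormone": ("Protein", "HormoneRole"),
    "SteroidHormone": ("Steroid", "HormoneRole"),
    "Catalyst": ("Substance", "CatalystRole"),
    "Enzyme": ("Protein", "CatalystRole"),
}


def ancestors(name):
    """Return name plus everything reachable through the told hierarchy."""
    seen, todo = {name}, [name]
    while todo:
        for parent in TOLD_PARENTS.get(todo.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                todo.append(parent)
    return seen


def subsumers(name):
    """Defined classes whose definition is satisfied by the class name."""
    if name in DEFINITIONS:
        base, role = DEFINITIONS[name]
        bases, roles = ancestors(base), ancestors(role)
    else:
        bases, roles = ancestors(name), set()
        for cls in bases:
            for role in PLAYS_ROLE.get(cls, ()):
                roles |= ancestors(role)
    return {
        defined
        for defined, (base, role) in DEFINITIONS.items()
        if defined != name and base in bases and role in roles
    }


for cls in ("Insulin", "Cortisol", "ATPase", "ProteinHormone", "Enzyme"):
    print(cls, "is classified under", sorted(subsumers(cls)))
```

The modeler only states the definitions and the roles played; the placement of Insulin under both ProteinHormone and Hormone is computed.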


Supporting the ontology development process

Supporting changes

IDEA: Ask the modeler to make minimal changes and let the reasoner do the hard work

Example [Rector06]:
• What do we have to do to organise hormones as metabolic hormones, sex hormones and stress hormones, and to add testosterone?
• The hard way
– Know exactly where to introduce the new concepts in the polyhierarchy
• The easy way
– Declare the semantics of metabolic hormones, sex hormones, stress hormones and testosterone
– Let the reasoner classify
– Check for other required changes


Supporting the ontology development process

Supporting changes - Declare the semantics

Substance
• Protein
– Insulin
– ATPase
• Steroid
– Cortisol
– Testosterone

PhysiologicRole
• HormoneRole
– SexHR
– MetabHR
– StressHR
• CatalystRole

SexHrmn ≡ Substance ⊓ ∃playsRole.SexHR

MetabolicHrmn ≡ Substance ⊓ ∃playsRole.MetabHR

StressHrmn ≡ Substance ⊓ ∃playsRole.StressHR

Testosterone ⊑ ∃playsRole.SexHR

Cortisol ⊑ ∃playsRole.StressHR


Supporting the ontology development process

Supporting changes – classify and check

Substance
• Protein
– ProteinHormone
- Insulin+
• Steroid
– SteroidHormone+
- Cortisol+
• Hormone
– ProteinHormone+
- Insulin+
– SteroidHormone+
- Cortisol+
– SexHrmn
- Testosterone
– StressHrmn
- Cortisol+
– MetabolicHrmn
- Insulin+
• Catalyst
– Enzyme+
- ATPase+

Declare

Insulin ⊑ ∃playsRole.MetabHR

and classify once more


Model checking

IDEA: test the ontology (at the type-level) against instance data that ought to conform to the logical theory

Example [ParsiaSK05]:
• Cows are herbivores
– herbivore ⊑ ∀eats.(plant ⊔ ∃part_of.plant)
• Mad cows are cows that eat brains and other parts of sheep
– mad-cow ⊑ cow ⊓ ∃eats.(brain ⊓ ∃part_of.sheep)
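A minimal sketch of such a check, assuming the herbivore constraint is read as "eats only plants or parts of plants" and using invented instance data (daisy, grass1, leaf1, dolly and brain1 are hypothetical individuals):

```python
# Invented instance data; every individual name here is hypothetical.
TYPES = {
    "daisy": {"cow", "herbivore"},
    "grass1": {"plant"},
    "leaf1": {"leaf"},
    "dolly": {"sheep"},
    "brain1": {"brain"},
}
PART_OF = {"leaf1": "grass1", "brain1": "dolly"}
EATS = [("daisy", "grass1"), ("daisy", "leaf1"), ("daisy", "brain1")]


def is_plant_or_plant_part(thing):
    """True if thing is a plant or a direct part of a plant."""
    if "plant" in TYPES.get(thing, set()):
        return True
    whole = PART_OF.get(thing)
    return whole is not None and "plant" in TYPES.get(whole, set())


def violations():
    """Eating facts that break the type-level herbivore constraint."""
    return [
        (eater, food)
        for eater, food in EATS
        if "herbivore" in TYPES.get(eater, set())
        and not is_plant_or_plant_part(food)
    ]


print(violations())
```

The single fact reported, daisy eating the brain of a sheep, is the instance-level evidence that the data does not conform to the theory.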


Finding gaps in an ontology 1/2

IDEA: find out relations and concepts that are known, but that have not yet been added to the ontology

Example [KeetRM07]:
• Problem
– The Foundational Model of Anatomy (FMA) has about 72000 concepts and 1.9 million relations among them, and is known to be incomplete, in particular at the cellular and sub-cellular levels of granularity
– How can we know what is missing?
• Possible solutions
– Manual browsing: unrealistic
– Targeted queries by domain experts: can help
– Automated reasoning: nice to have


Finding gaps in an ontology 2/2

Targeted queries by domain experts

Context
• There are 17 types of Macrophage (cells of the immune system) in the FMA, each of which must be part of or contained in something

Query (a recursive one) [Keet06]
• Find organs with macrophages
– all x of class Macrophage (and of its subclasses)
– all y of class Organ (and of its subclasses)
– where x is a part of y or x is contained in y

Each of the 17 types of macrophage is expected to be related to at least one organ, but in the FMA only Hepatic macrophage is known to be part of the Liver. The relations of the other 16 are missing.
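The targeted query can be sketched over a tiny, hand-made fragment. The data below is illustrative only: it is not extracted from the FMA, and "Alveolar macrophage" and "Lung" are assumed entries added to show a gap.

```python
# Illustrative fragment, not real FMA content.
SUBCLASS_OF = {
    "Hepatic macrophage": "Macrophage",
    "Alveolar macrophage": "Macrophage",
    "Liver": "Organ",
    "Lung": "Organ",
}
# (part, whole) pairs for both "part of" and "contained in".
LOCATION = {("Hepatic macrophage", "Liver")}


def subclasses(root):
    """root plus all of its direct and indirect subclasses."""
    found, changed = {root}, True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def located_in(x):
    """Everything x is, directly or indirectly, part of or contained in."""
    seen, todo = set(), [x]
    while todo:
        current = todo.pop()
        for part, whole in LOCATION:
            if part == current and whole not in seen:
                seen.add(whole)
                todo.append(whole)
    return seen


def macrophages_without_organ():
    """Macrophage types with no recorded link to any organ."""
    organs = subclasses("Organ")
    kinds = subclasses("Macrophage") - {"Macrophage"}
    return sorted(m for m in kinds if not (located_in(m) & organs))


print(macrophages_without_organ())
```

Each name returned points at a relation that a domain expert would expect but that the ontology does not yet state.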


Discovering new relations and concepts 1/4

IDEA: Finding new types of relations and DL-concepts by using instances

Example [KeetRM07]:
1. Correlation between two roles
2. Examining whether the data supports some subconcept X’ or a quaternary role
3. Path query that may be of arbitrary length, with any roles and concepts that relate to X


Discovering new relations and concepts 2/4

Correlation between two roles

Formally
• for each x:X, y:Y, r:R, XRY, does there exist a z:Z, s:S, such that there exists at least one x with xsz?

Example
• Querying whether more than 50% of the patients who suffer from lactose intolerance also have the symptom of being nauseous
– X = Patient
– R = has disease
– Y = Lactose intolerance
– S = has symptom
– Z = Nausea

Challenging on large sets of instances, but feasible
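The lactose intolerance query boils down to one ratio over the instance data. A minimal sketch with invented patient records (p1 to p4 and the role names are assumptions):

```python
# Invented instance data as (subject, role, value) triples.
TRIPLES = [
    ("p1", "has_disease", "Lactose intolerance"),
    ("p2", "has_disease", "Lactose intolerance"),
    ("p3", "has_disease", "Lactose intolerance"),
    ("p4", "has_disease", "Influenza"),
    ("p1", "has_symptom", "Nausea"),
    ("p2", "has_symptom", "Nausea"),
    ("p4", "has_symptom", "Nausea"),
]


def correlation(triples, r, y, s, z):
    """Fraction of the x with (x r y) that also have (x s z)."""
    with_y = {x for x, role, value in triples if role == r and value == y}
    if not with_y:
        return 0.0
    with_both = {
        x for x, role, value in triples
        if role == s and value == z and x in with_y
    }
    return len(with_both) / len(with_y)


ratio = correlation(
    TRIPLES, "has_disease", "Lactose intolerance", "has_symptom", "Nausea"
)
print(ratio > 0.5)
```

Two scans of the triples suffice for a fixed choice of R, Y, S and Z, which is why the task stays feasible even though the instance set may be large.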


Discovering new relations and concepts 3/4

Examining whether the data supports some subconcept X’ or a quaternary role

Formally
• for each x:X, y:Y, r:R, XRY,
• does there exist an xsz and an xta
• where z:Z, s:S, a:A, t:T hold?

Brute force leads to a combinatorial explosion


Discovering new relations and concepts 4/4

Path query that may be of arbitrary length, with any roles and concepts that relate to X

Formally

for each x:X, return any r1, ..., rn, their type of role and the concepts Y1, ..., Yn they are related to

Example

establish if a protein is in some way related to anything else

Exploring the search space of sequences of conjunctive queries of arbitrary, not predefined length may not be realistic

Feasible example

discover the relationships between Histone code, DNA sequence, and Gene expression regulation
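One way to keep a path query tractable is to cap the path length. The sketch below enumerates role paths breadth-first up to a bound; the graph uses abstract placeholder names (X1, Y1, r1, ...) rather than real biological relations.

```python
from collections import deque

# Abstract placeholder edges: (source concept, role, target concept).
EDGES = [
    ("X1", "r1", "Y1"),
    ("Y1", "r2", "Y2"),
    ("Y2", "r3", "X1"),  # closes a cycle back to the start
    ("X1", "r4", "Y3"),
]


def paths_from(start, max_len):
    """All cycle-free role paths of length 1..max_len leaving start."""
    results = []
    queue = deque([(start, ())])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_len:
            continue  # length bound reached, do not extend further
        visited = {start} | {target for _, target in path}
        for source, role, target in EDGES:
            if source == node and target not in visited:
                new_path = path + ((role, target),)
                results.append(new_path)
                queue.append((target, new_path))
    return results


for path in paths_from("X1", 2):
    print(" -> ".join(f"{role} {target}" for role, target in path))
```

Without the bound and the visited check, the number of candidate paths grows with every extra step, which is the explosion the slide warns about.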


Comparison of two ontologies

IDEA: similar to the problem of ontology matching, but finding differences matters more than finding similarities

Example [KeetRM07]:
• Treating ontologies as formal renderings of a scientific theory
– any discrepancies between two competing theories can provide an impetus for experimentation to resolve the issues
• Biological pathway comparison
– across species
– canonical vs. changed by toxins or genetic defects

Not feasible in the general case


Mereologic Reasoning

IDEA: Reasoning with mereological parthood and other (part-whole) relations

Example [Rector06]:
• Complex and widely investigated field [Winston87]

OWL has no native primitives for part-whole relations, but can be used to model the classical ones

[Figure: diagram with the nodes Arm, Forearm, Hand, "Part of Arm" and "Arm OR part of arm"; "Injury to Arm (or part of arm)" is linked by has_locus some to "Arm OR part of arm", and "Injury to Hand" by has_locus some to Hand]

Mereologic Reasoning – Five families of relations

Partonomic
• Parts and wholes
– The lid is part of the box
• Constitution
– The box is made of cardboard
• Membership
– The box is part of the shipment

Nonpartonomic
• Containment
– The gift is contained in the box
• Connection/branching/adjacency
– The box is connected to the container by a strap

[Rector06]

Mereologic Reasoning – Transitivity 1/2

True kinds of part-of are transitive, and a fault in the part is a fault in the whole
• The fingernail is part of the finger, which is part of the hand, which is part of the upper extremity, which is part of the body
– Injury to the fingernail is injury to the body
• The tail-light is part of the electrical system, which is part of the car
– A fault in the tail-light is a fault in the car

Membership is not transitive
• The foot of the goose is part of the goose but not part of the flock of geese
– Damage to the foot of the goose is not damage to the flock of geese

[Rector06]

Mereologic Reasoning – Transitivity 2/2

Containment is transitive, but things contained are not necessarily parts
• A fault (e.g. souring) in the milk contained in the bottle is not damage to the bottle

Some kinds of part-whole relation are questionably transitive
• Is the cell that is part of the finger a part of the body?
– Is damage to the cell that is part of the finger damage to the body?
- Not necessarily, since the cells in my body die and regrow constantly

[Rector06]

Mereologic reasoning – 1st Implementation Patterns

IDEA: Transitive properties with non-transitive “direct” subproperties

Transitive properties should have non-transitive children
• isPartOf : transitive
– isPartOfDirectly : non-transitive

Split which is used in partial descriptions and complete definitions
• Necessary conditions use the non-transitive version
• Definitions use the transitive version

Benefits
• Allows more restrictions in domain/range constraints and cardinality
– Allows the hierarchy along that axis to be traced one step at a time
– Allows a good approximation of pure trees
- Make the non-transitive subproperty functional
- Transitive properties can (almost) never be functional (by definition, a transitive property has more than one value in any non-trivial system)
– Constraints on transitive properties easily lead to unsatisfiability

[Rector06]

Mereologic reasoning – 1st Implementation Patterns

Example:
• Finger isPartOfDirectly Hand
• Hand isPartOfDirectly Arm
• Arm isPartOfDirectly Body

[Figure: body, arm, hand and finger drawn as a chain; neighbouring nodes are linked by isPartOfDirectly, and isPartOf arrows connect the parts to their direct and indirect wholes]
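The pattern can be mimicked in plain Python: the direct relation is stored as a dictionary, which makes it functional by construction, and the transitive isPartOf is derived by walking the chain. This is a sketch under the assumption that the direct relation is acyclic; the function names are illustrative.

```python
# Functional, non-transitive relation: each part has one direct whole.
IS_PART_OF_DIRECTLY = {
    "Finger": "Hand",
    "Hand": "Arm",
    "Arm": "Body",
}


def is_part_of(part):
    """Wholes of part under the transitive isPartOf, nearest first.

    Assumes the direct relation has no cycles.
    """
    wholes = []
    while part in IS_PART_OF_DIRECTLY:
        part = IS_PART_OF_DIRECTLY[part]
        wholes.append(part)
    return wholes


def all_is_part_of_pairs():
    """Every (part, whole) pair entailed by transitivity."""
    return {
        (part, whole)
        for part in IS_PART_OF_DIRECTLY
        for whole in is_part_of(part)
    }


print(is_part_of("Finger"))
print(len(all_is_part_of_pairs()))
```

Three asserted direct links yield six isPartOf pairs, and the direct links can still be traced one step at a time.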


Mereologic reasoning – 2nd Implementation Patterns

IDEA: “Adapted SEP Triples” [SeidenbergR06]

Example [Rector06]:
• Body (‘as a whole’)
– Body
• The Body’s parts
– ∃isPartOf.Body
• The Body and its parts
– Body ⊔ ∃isPartOf.Body
• Note: the “and” in natural language is captured by “or” (unionOf)
• Repeat for all parts


Mereologic reasoning – Using part-of 1/2

Example [Rector06]
• Injury_to_Hand ≡ Injury ⊓ ∃has_locus.(Hand ⊔ ∃isPartOf.Hand)
• Injury_to_Arm ≡ Injury ⊓ ∃has_locus.(Arm ⊔ ∃isPartOf.Arm)
• Injury_to_Body ≡ Injury ⊓ ∃has_locus.(Body ⊔ ∃isPartOf.Body)
• After classification we get the expected hierarchy from the point of view of anatomy
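Why the injury classes line up with the anatomy can be seen in a small sketch. It assumes a toy partonomy (Hand directly part of Arm, Arm directly part of Body) and reads each definition as "an injury whose locus is the structure or a part of it"; it is not a reasoner, only the subsumption test that follows from that reading.

```python
# Toy partonomy (an assumption for this sketch): part -> direct whole.
IS_PART_OF_DIRECTLY = {"Hand": "Arm", "Arm": "Body"}


def self_and_wholes(structure):
    """The structure itself plus everything it is transitively part of."""
    result = {structure}
    while structure in IS_PART_OF_DIRECTLY:
        structure = IS_PART_OF_DIRECTLY[structure]
        result.add(structure)
    return result


def injury_subsumes(general, specific):
    """True if Injury_to_<general> subsumes Injury_to_<specific>.

    Holds exactly when specific is general or a part of general.
    """
    return general in self_and_wholes(specific)


for general in ("Body", "Arm", "Hand"):
    below = [s for s in ("Body", "Arm", "Hand") if injury_subsumes(general, s)]
    print(f"Injury_to_{general} subsumes injuries to:", below)
```

The injury hierarchy mirrors the part-of chain although no subclass link between the injury classes was ever asserted.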


Mereologic reasoning – Using part-of 2/2

Example [SeidenbergR06]
• Burn is found to be a subclass of LegInjury


Mereologic reasoning – Multiple “views”

Example [Rector06]
• The Brain is contained in the Cavity defined by the Cranium, which is a structural part of the skull.

Both views: The Brain is located in the skull but not part of the skull

Clinician’s view: Pericardium is part of heart & Pericarditis is a kind of HeartDisease

Anatomist’s view: Pericardium is a distinct organ that develops separately from Heart

Finding inconsistencies in a hierarchy of relations

IDEA: checking whether a hierarchy of relations is correctly modeled, as is normally done for concepts

EXAMPLE [KeetRM07]:
• asymmetry implies irreflexivity,
• therefore any subrole of an asymmetric role should not be reflexive

At present, automated reasoners assume relations to have been modeled correctly
• they never return an inconsistency on a property
• inconsistent or erroneously re-classified concepts are the results of incorrect relation modeling
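A check of this kind is easy to sketch outside the reasoner. The code relies on two facts: a subrole of an asymmetric role is itself asymmetric, and asymmetry implies irreflexivity, so a role that is declared reflexive below an asymmetric role is a modeling error (over a non-empty domain). The role names are hypothetical.

```python
# Hypothetical role hierarchy and declared characteristics.
SUBROLE_OF = {
    "directPartOf": "properPartOf",
    "overlapsSelf": "properPartOf",
}
CHARACTERISTICS = {
    "properPartOf": {"asymmetric"},
    "overlapsSelf": {"reflexive"},
}


def inherited(role):
    """Asymmetry and irreflexivity that hold for role via its superroles."""
    traits = set()
    current = role
    while current is not None:
        declared = CHARACTERISTICS.get(current, set())
        traits |= declared & {"asymmetric", "irreflexive"}
        current = SUBROLE_OF.get(current)
    if "asymmetric" in traits:
        traits.add("irreflexive")  # asymmetry implies irreflexivity
    return traits


def conflicting_roles():
    """Roles declared reflexive although they must be irreflexive."""
    roles = set(SUBROLE_OF) | set(SUBROLE_OF.values()) | set(CHARACTERISTICS)
    return sorted(
        role for role in roles
        if "reflexive" in CHARACTERISTICS.get(role, set())
        and "irreflexive" in inherited(role)
    )


print(conflicting_roles())
```

Reporting the offending role directly is more helpful than leaving the modeler to infer the mistake from wrongly classified concepts.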


Reasoning across linked ontologies

IDEA: reason across multiple interlinked ontologies without merging them into one huge (and unmanageable) ontology, while benefiting from (expressive) reasoning performed locally on each ontology

Example [KeetRM07]:
• reasoning across
– MGED Ontology for microarray experiments,
– BioPax for biological pathways,
– Cell type, and
– Mammalian Phenotype

Coordinated modularization and linking of ontologies on demand would be required


Complex queries

IDEA: queries that require use of part-whole relations or cycles among the concepts in the ontology

Examples [KeetRM07, Keet06]:
• Using part-whole relations
– retrieve all biological pathways that contain a given protein, e.g., Ubiquitin
– what are the cellular components of blood?
• Involving adjacent levels of granularity (e.g., Cell-level and SubcellularOrganelle-level)
– which cell type(s) do(es) not have a nucleus as part?
• Using cycles among the concepts
– retrieve the place of a given animal, e.g., Hedgehog, in food webs
• Requiring two different part-of relations (i.e., a structural one for the kidney and a functional one for the hormones)
– which hormones are located in the kidney, and where in the kidney?
• Requiring one part-of and two subclass-of relations
– in which organs are macrophages located?


Reasoning Requirements (for Bioscience) [KeetRM07]

Supporting the ontology development process :-|
• Finding source of errors :-|
• Classification :-D
• Supporting changes :-)

Model checking (violation) :-)

Finding gaps in an ontology :-(

Discovering new relations and concepts :-/

Comparison of two ontologies :-((

Mereologic reasoning :-)

Finding inconsistencies in a hierarchy of relations :-(

Reasoning across linked ontologies :-((

Complex queries :-)


Credits and References

[KeetRM07] C. Maria Keet, Marco Roos, M. Scott Marshall: A Survey of Requirements for Automated Reasoning Services for Bio-Ontologies in OWL. OWLED 2007
• http://www.webont.org/owled/2007/PapersPDF/submission_20.pdf

[Rector06] Alan Rector: GALEN Revisited.
• http://www.cs.man.ac.uk/~rector/presentations/Reasoning-web-rector-GALEN-2006.ppt

[ParsiaSK05] Bijan Parsia, Evren Sirin, Aditya Kalyanpur: Debugging OWL ontologies. WWW 2005: 633-640
• http://www2005.org/cdrom/docs/p633.pdf

[Keet06] C. Maria Keet: Granular information retrieval from the Gene Ontology and from the Foundational Model of Anatomy with OQAFMA. KRDB Research Centre Technical Report KRDB06-1, Free University of Bozen-Bolzano, 6 April 2006. 19p.
• http://www.inf.unibz.it/krdb/pub/TR/KRDB06-1.pdf

[Winston87] M. Winston, R. Chaffin, et al.: A taxonomy of part-whole relations. Cognitive Science 11 (1987): 417-444
• http://csjarchive.cogsci.rpi.edu/1987v11/i04/p0417p0444/MAIN.PDF

[SeidenbergR06] Julian Seidenberg, Alan L. Rector: Representing Transitive Propagation in OWL. ER 2006: 255-266
• http://www.springerlink.com/content/02kk647893q43700/
