has_part in GO


Upload: chris-mungall

Post on 11-May-2015


DESCRIPTION

preliminary description of requirements for has_part and inference rules

TRANSCRIPT

Page 1: has_part in GO

has_part: a new twist
• Original thinking
  – gene products would not propagate over has_part
  – has_part used as a navigational aid or in probabilistic inference
  – has_part could be omitted from main ontology files with no loss of information
• In fact:
  – There are situations where has_part can be used in annotation propagation

Page 2: has_part in GO

Motivation for has_part: An example of an incorrect use of part_of

[Diagram: chromosome part_of nucleus; chromosome part_of mitochondrion]

– all chromosome part_of some nucleus
– all chromosome part_of some mitochondrion

Page 3: has_part in GO

Current GO: part-specific subtypes

[Diagram: nuclear chromosome is_a chromosome and part_of nucleus; mitochondrial chromosome is_a chromosome and part_of mitochondrion; ABF1 annotated to nuclear chromosome; MGM101 annotated to mitochondrial chromosome]

Page 4: has_part in GO

propagation over part_of

[Diagram: nuclear chromosome is_a chromosome and part_of nucleus; mitochondrial chromosome is_a chromosome and part_of mitochondrion; ABF1 annotated to nuclear chromosome and propagated over part_of to nucleus; MGM101 annotated to mitochondrial chromosome and propagated over part_of to mitochondrion]

Page 5: has_part in GO

part-specific subtype pattern
• A common ‘design pattern’ in GO
  – If p is located in w1 or w2 then create part-specific subtypes:
    • p-in-w1 is_a p and part_of w1
    • p-in-w2 is_a p and part_of w2
• Cons:
  – ‘clutters up’ ontology
    • but terms can be managed automatically using reasoner
  – Annotators may not see subtypes and annotate too generally
    • Easy to fix with correct tooling?
• Pros:
  – greater discriminative power, more specific annotations
  – Logically coherent, easy to implement rules
  – Then why not implement this universally?
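
The propagation rule behind this pattern can be sketched in a few lines of Python. This is only an illustration, not GO tooling: the edge list, the annotation table, and the function names `ancestors` and `propagate` are all hypothetical, and the terms are the chromosome example from the earlier slides.

```python
from collections import defaultdict

# Hypothetical toy ontology: (child, relation, parent) edges from the slides.
EDGES = [
    ("nuclear chromosome", "is_a", "chromosome"),
    ("nuclear chromosome", "part_of", "nucleus"),
    ("mitochondrial chromosome", "is_a", "chromosome"),
    ("mitochondrial chromosome", "part_of", "mitochondrion"),
]

# Direct annotations (assumed for illustration): gene product -> term.
ANNOTATIONS = {
    "ABF1": "nuclear chromosome",
    "MGM101": "mitochondrial chromosome",
}


def ancestors(term, edges=EDGES, relations=("is_a", "part_of")):
    """Return every term reachable from `term` over the given relations."""
    parents = defaultdict(set)
    for child, rel, parent in edges:
        if rel in relations:
            parents[child].add(parent)
    seen, stack = set(), [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(annotations=ANNOTATIONS):
    """Map each gene product to its direct term plus all inferred terms."""
    return {gp: {term} | ancestors(term) for gp, term in annotations.items()}


inferred = propagate()
print(sorted(inferred["ABF1"]))
# ['chromosome', 'nuclear chromosome', 'nucleus']
```

Because each gene product is annotated to a part-specific subtype, ABF1 reaches nucleus but never mitochondrion, which is the discriminative power listed under "Pros".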

Page 6: has_part in GO

Another example: erroneous use of part_of with complexes

[Diagram: core TFIIH complex part_of holo TFIIH complex; core TFIIH complex part_of NEF3 complex; TFB1 annotated to core TFIIH complex]

we would not do this!!

Page 7: has_part in GO

core TFIIH complex (CURRENT GO)

[Diagram: core TFIIH portion of holo TFIIH complex is_a core TFIIH complex and part_of holo TFIIH complex; core TFIIH portion of NEF3 complex is_a core TFIIH complex and part_of NEF3 complex; TFB1 annotated to core TFIIH complex]

Page 8: has_part in GO

[Diagram: core TFIIH portion of holo TFIIH complex is_a core TFIIH complex and part_of holo TFIIH complex; core TFIIH portion of NEF3 complex is_a core TFIIH complex and part_of NEF3 complex; TFB1 annotated to core TFIIH complex]

Problem: annotations to more generic term

Page 9: has_part in GO

Problem: additional semi-redundant annotations required to capture necessary gene products

[Diagram: core TFIIH portion of holo TFIIH complex is_a core TFIIH complex and part_of holo TFIIH complex; core TFIIH portion of NEF3 complex is_a core TFIIH complex and part_of NEF3 complex; TFB1 annotated to core TFIIH complex, with additional TFB1 annotations on the two "core TFIIH portion of" terms; ABF1 and MGM101 labels also appear on this slide]

Page 10: has_part in GO

core TFIIH with has_part

[Diagram: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; TFB1 annotated to core TFIIH complex]

– Logically correct
– Can we propagate gene products?

Page 11: has_part in GO

We would like to propagate gene products in this case – but can we do this universally?

[Diagram: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; TFB1 annotated to core TFIIH complex and propagated over has_part to holo TFIIH complex and NEF3 complex]

Page 12: has_part in GO

thought experiment: use has_part for location-specific chromosomes

[Diagram: nucleus has_part chromosome; mitochondrion has_part chromosome; ABF1 and MGM101 annotated to chromosome]

Page 13: has_part in GO

no reliable propagation of gene products over has_part

[Diagram: nucleus has_part chromosome; mitochondrion has_part chromosome; ABF1 and MGM101 annotated to chromosome; propagating over has_part places both ABF1 and MGM101 in both nucleus and mitochondrion]

Page 14: has_part in GO

So what’s the difference in ontology structure?

[Diagram, left: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; TFB1 annotated to core TFIIH complex]

[Diagram, right: nucleus has_part chromosome; mitochondrion has_part chromosome; ABF1 and MGM101 annotated to chromosome]

Page 15: has_part in GO

So what’s the difference in ontology structure?

[Diagram, left: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; TFB1 annotated to core TFIIH complex]

[Diagram, right: nucleus has_part chromosome; mitochondrion has_part chromosome; ABF1 and MGM101 annotated to chromosome]

there is no structural difference

the computer needs some additional knowledge that is currently IMPLICIT

Page 16: has_part in GO

What’s the difference in annotations?

[Diagram: TFB1 annotated to core TFIIH complex; ABF1 and MGM101 annotated to chromosome]

SOME instances of MGM101 proteins are part of SOME chromosomes

SOME instances of TFB1 proteins are part of SOME core TFIIH complexes

Page 17: has_part in GO

What’s the difference in annotations?

[Diagram: TFB1 annotated to core TFIIH complex; ABF1 and MGM101 annotated to chromosome]

SOME instances of MGM101 proteins are part of SOME chromosomes

SOME instances of TFB1 proteins are part of SOME core TFIIH complexes

ALL instances of core TFIIH complexes has_part SOME TFB1 protein (in this species)

TFB1 is integral to core TFIIH complex

This crucial piece of knowledge is IMPLICIT

NOT ALL instances of chromosome has_part SOME MGM101
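
The contrast above can be written out as description-logic axioms. The following is a sketch only: the class names are shorthand for the slide terms, and it assumes that has_part is transitive and that ontology has_part links carry all-some semantics (neither is stated on the slides). It requires the amsmath package.

```latex
\begin{align*}
\mathit{HoloTFIIH} &\sqsubseteq \exists\, \mathit{has\_part}.\mathit{CoreTFIIH}
  && \text{(ontology link)} \\
\mathit{CoreTFIIH} &\sqsubseteq \exists\, \mathit{has\_part}.\mathit{TFB1}
  && \text{(the currently implicit knowledge)} \\
\mathit{has\_part} \circ \mathit{has\_part} &\sqsubseteq \mathit{has\_part}
  && \text{(assumed transitivity)} \\
\therefore\ \mathit{HoloTFIIH} &\sqsubseteq \exists\, \mathit{has\_part}.\mathit{TFB1} \\[1ex]
\mathit{Nucleus} &\sqsubseteq \exists\, \mathit{has\_part}.\mathit{Chromosome}
  && \text{(thought-experiment link)} \\
\mathit{Chromosome} &\not\sqsubseteq \exists\, \mathit{has\_part}.\mathit{MGM101}
  && \text{(not all chromosomes)} \\
\text{so no axiom } \mathit{Nucleus} &\sqsubseteq \exists\, \mathit{has\_part}.\mathit{MGM101}
  \text{ follows.}
\end{align*}
```

The first derivation goes through only because the second axiom is universal over core TFIIH complexes; a standard annotation supplies no such axiom, which is why the chromosome case yields nothing.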

Page 18: has_part in GO

Solution
• Add additional qualifier to GAF
  – name TBD. integral_to?
• Semantics:
  – ALL instances of this complex in this species have this gene product as part
  – This is stronger than an existing annotation:
    • some instances of this gene product in this species are found in this complex
• Also works for BP
  – ALL instances of this process in this species require this gene product
    • Example: spermatogenesis, meiosis, MSH
  – standard annotation:
    • some instances of this gene product actively participate in this process
• Works using standard DL reasoning technology
• Requires change in annotation practice
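
As a rough illustration of how such a qualifier could drive propagation, the sketch below propagates an annotation over has_part only when it carries the qualifier. Everything here is assumed for the example: the tuples are not real GAF rows, `integral_to` is just the placeholder name from the slide, and `wholes_of` and `propagate_over_has_part` are hypothetical helpers.

```python
# Hypothetical has_part edges, written as (whole, part).
HAS_PART = [
    ("holo TFIIH complex", "core TFIIH complex"),
    ("NEF3 complex", "core TFIIH complex"),
    ("nucleus", "chromosome"),
    ("mitochondrion", "chromosome"),
]

# Annotations as (gene product, term, qualifier). "integral_to" is the
# placeholder qualifier name from the slide; None is a standard annotation.
ANNOTATIONS = [
    ("TFB1", "core TFIIH complex", "integral_to"),
    ("ABF1", "chromosome", None),
    ("MGM101", "chromosome", None),
]


def wholes_of(term, has_part=HAS_PART):
    """Return all terms that transitively have `term` as a part."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        for whole, part in has_part:
            if part == current and whole not in found:
                found.add(whole)
                stack.append(whole)
    return found


def propagate_over_has_part(annotations=ANNOTATIONS):
    """Propagate only integral_to annotations up to the containing wholes."""
    inferred = {}
    for gene_product, term, qualifier in annotations:
        terms = {term}
        if qualifier == "integral_to":
            terms |= wholes_of(term)
        inferred[gene_product] = terms
    return inferred


result = propagate_over_has_part()
print(sorted(result["TFB1"]))
# ['NEF3 complex', 'core TFIIH complex', 'holo TFIIH complex']
print(sorted(result["ABF1"]))
# ['chromosome']
```

TFB1 reaches both larger complexes, while ABF1 and MGM101 stay on chromosome and are not pushed into both nucleus and mitochondrion.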