has_part in GO
DESCRIPTION
Preliminary description of requirements for has_part and inference rules
TRANSCRIPT
has_part: a new twist
• Original thinking
  – gene products would not propagate over has_part
  – has_part used as a navigational aid or in probabilistic inference
  – has_part could be omitted from main ontology files with no loss of information
• In fact:
  – There are situations where has_part can be used in annotation propagation
Motivation for has_part: an example of an incorrect use of part_of
[Diagram: chromosome part_of nucleus; chromosome part_of mitochondrion]
all chromosome part_of some nucleus
all chromosome part_of some mitochondrion
Current GO: part-specific subtypes
[Diagram: nuclear chromosome is_a chromosome and part_of nucleus; mitochondrial chromosome is_a chromosome and part_of mitochondrion; ABF1 annotated to nuclear chromosome; MGM101 annotated to mitochondrial chromosome]
Propagation over part_of
[Diagram: nuclear chromosome is_a chromosome and part_of nucleus; mitochondrial chromosome is_a chromosome and part_of mitochondrion; the ABF1 annotation propagates over part_of to nucleus, the MGM101 annotation to mitochondrion]
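The propagation shown on this slide can be sketched in a few lines of Python. This is only an illustrative toy, not GO tooling: the `EDGES` and `ANNOTATIONS` tables and the function names are hypothetical, and the direct annotations (ABF1 to nuclear chromosome, MGM101 to mitochondrial chromosome) are an assumption read off the diagram.

```python
# Hypothetical toy ontology: term -> list of (relation, parent) edges.
EDGES = {
    "nuclear chromosome": [("is_a", "chromosome"), ("part_of", "nucleus")],
    "mitochondrial chromosome": [("is_a", "chromosome"), ("part_of", "mitochondrion")],
}

# Direct annotations, assumed from the slide's diagram.
ANNOTATIONS = {
    "ABF1": {"nuclear chromosome"},
    "MGM101": {"mitochondrial chromosome"},
}


def ancestors(term, relations=("is_a", "part_of")):
    """Return every term reachable from `term` over the given relations."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in EDGES.get(current, []):
            if relation in relations and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(annotations):
    """Extend each gene product's terms with all is_a / part_of ancestors."""
    result = {}
    for gene_product, terms in annotations.items():
        closure = set(terms)
        for term in terms:
            closure |= ancestors(term)
        result[gene_product] = closure
    return result


propagated = propagate(ANNOTATIONS)
```

Because each subtype is part_of exactly one whole, ABF1 reaches nucleus but never mitochondrion, and MGM101 the reverse.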
Part-specific subtype pattern
• A common ‘design pattern’ in GO
  – If p is located in w1 or w2 then create part-specific subtypes:
    • p-in-w1 is_a p and part_of w1
    • p-in-w2 is_a p and part_of w2
• Cons:
  – ‘clutters up’ ontology
    • but terms can be managed automatically using reasoner
  – Annotators may not see subtypes and annotate too generally
    • Easy to fix with correct tooling?
• Pros:
  – greater discriminative power, more specific annotations
  – Logically coherent, easy to implement rules
  – Then why not implement this universally?
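As a rough illustration of how such subtypes could be generated automatically, here is a minimal Python sketch. The function name and the `p-in-w` naming scheme are assumptions made for this example; real GO term names (e.g. nuclear chromosome) are curated, not generated this way.

```python
def part_specific_subtypes(part, wholes):
    """For each whole w, create a subtype 'part-in-w' that is_a part and part_of w.

    Returns a dict: subtype name -> list of (relation, target) edges.
    """
    return {
        f"{part}-in-{whole}": [("is_a", part), ("part_of", whole)]
        for whole in wholes
    }


subtypes = part_specific_subtypes("chromosome", ["nucleus", "mitochondrion"])
for name, edges in subtypes.items():
    print(name, edges)
```

Each generated subtype carries exactly one part_of edge, which is what keeps propagation over part_of safe.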
Another example: erroneous use of part_of with complexes
[Diagram: core TFIIH complex part_of holo TFIIH complex; core TFIIH complex part_of NEF3 complex; gene product TFB1]
We would not do this!!
core TFIIH complex (CURRENT GO)
[Diagram: core TFIIH portion of holo TFIIH complex is_a core TFIIH complex and part_of holo TFIIH complex; core TFIIH portion of NEF3 complex is_a core TFIIH complex and part_of NEF3 complex; gene product TFB1 annotated to core TFIIH complex]
Problem: annotations to more generic term
Problem: additional semi-redundant annotations required to capture necessary gene products
[Diagram: core TFIIH portion of holo TFIIH complex is_a core TFIIH complex and part_of holo TFIIH complex; core TFIIH portion of NEF3 complex is_a core TFIIH complex and part_of NEF3 complex; TFB1 annotated to core TFIIH complex and additionally to each of the two part-specific subtypes; the labels ABF1 and MGM101 also appear on the slide]
core TFIIH with has_part
[Diagram: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; gene product TFB1 annotated to core TFIIH complex]
– Logically correct
– Can we propagate gene products?
We would like to propagate gene products in this case, but can we do this universally?
[Diagram: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; TFB1 shown on core TFIIH complex and propagated to holo TFIIH complex and NEF3 complex]
Thought experiment: use has_part for location-specific chromosomes
[Diagram: nucleus has_part chromosome; mitochondrion has_part chromosome; gene products ABF1 and MGM101 annotated to chromosome]
No reliable propagation of gene products over has_part
[Diagram: nucleus has_part chromosome; mitochondrion has_part chromosome; propagating over has_part places both ABF1 and MGM101 on both nucleus and mitochondrion]
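A small Python sketch makes the failure concrete. It assumes a deliberately naive rule (push every gene product from a part to every whole that has_part it); the table and function names are hypothetical.

```python
# whole -> parts, read as "whole has_part part" (toy data from the slide).
HAS_PART = {
    "nucleus": ["chromosome"],
    "mitochondrion": ["chromosome"],
}

# Both gene products are annotated only to the generic term.
ANNOTATIONS = {
    "ABF1": {"chromosome"},
    "MGM101": {"chromosome"},
}


def naive_propagate_over_has_part(annotations, has_part):
    """Naively copy each gene product from a part to every whole containing it."""
    result = {gp: set(terms) for gp, terms in annotations.items()}
    changed = True
    while changed:  # repeat until no new annotation is added
        changed = False
        for whole, parts in has_part.items():
            for terms in result.values():
                if whole not in terms and any(p in terms for p in parts):
                    terms.add(whole)
                    changed = True
    return result


naive = naive_propagate_over_has_part(ANNOTATIONS, HAS_PART)
```

Under this rule ABF1 ends up in mitochondrion and MGM101 in nucleus, which is the unreliable inference the slide warns about.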
So what’s the difference in ontology structure?
[Diagram, left: holo TFIIH complex has_part core TFIIH complex; NEF3 complex has_part core TFIIH complex; gene product TFB1. Right: nucleus has_part chromosome; mitochondrion has_part chromosome; gene products ABF1 and MGM101]
There is no structural difference
The computer needs some additional knowledge that is currently IMPLICIT
What’s the difference in annotations?
[Diagram: TFB1 annotated to core TFIIH complex; ABF1 and MGM101 annotated to chromosome]
SOME instances of MGM101 proteins are part of SOME chromosomes
SOME instances of TFB1 proteins are part of SOME core TFIIH complexes
ALL instances of core TFIIH complexes has_part SOME TFB1 protein (in this species)
TFB1 is integral to core TFIIH complex
This crucial piece of knowledge is IMPLICIT
NOT ALL instances of chromosome has_part SOME MGM101
Solution
• Add additional qualifier to GAF
  – name TBD. integral_to?
• Semantics:
  – ALL instances of this complex in this species have this gene product as part
  – This is stronger than an existing annotation:
    • some instances of this gene product in this species are found in this complex
• Also works for BP
  – ALL instances of this process in this species require this gene product
    • Example: spermatogenesis, meiosis, MSH
  – standard annotation:
    • some instances of this gene product actively participate in this process
• Works using standard DL reasoning technology
• Requires change in annotation practice
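To show how the proposed qualifier could gate propagation, here is a minimal Python sketch. It is an assumption-laden toy, not a GAF parser or a DL reasoner: the `Annotation` tuple, the `HAS_PART` table and the qualifier string `integral_to` (whose name the slide leaves TBD) are all hypothetical. The rule it encodes is: if ALL wholes have SOME part, and ALL such parts have SOME gene product, then ALL wholes have that gene product as part.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    gene_product: str
    term: str
    qualifier: str = ""  # "" = standard annotation; "integral_to" = proposed qualifier


# whole -> parts, read as "ALL wholes has_part SOME part" (toy data).
HAS_PART = {
    "holo TFIIH complex": ["core TFIIH complex"],
    "NEF3 complex": ["core TFIIH complex"],
    "nucleus": ["chromosome"],
    "mitochondrion": ["chromosome"],
}

ANNOTATIONS = [
    Annotation("TFB1", "core TFIIH complex", "integral_to"),
    Annotation("ABF1", "chromosome"),    # standard: SOME ABF1 part of SOME chromosome
    Annotation("MGM101", "chromosome"),  # standard annotation, not integral
]


def propagate_integral(annotations, has_part):
    """Propagate only integral_to annotations from a part up to its wholes."""
    result = set(annotations)
    changed = True
    while changed:  # iterate to a fixed point so chains of has_part are followed
        changed = False
        for whole, parts in has_part.items():
            for ann in list(result):
                if ann.qualifier == "integral_to" and ann.term in parts:
                    inferred = Annotation(ann.gene_product, whole, "integral_to")
                    if inferred not in result:
                        result.add(inferred)
                        changed = True
    return result


inferred = propagate_integral(ANNOTATIONS, HAS_PART)
```

In this sketch TFB1 is propagated to holo TFIIH complex and NEF3 complex, while ABF1 and MGM101 stay on chromosome because their annotations lack the qualifier.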