example annotation:


Upload: coty

Post on 07-Jan-2016


The evidence describes how the annotation was created and provides a way of measuring its strength or reliability. GO has developed a set of standard evidence codes that form a loose hierarchy, with ‘inferred from electronic annotation’ (IEA) being the least reliable type of evidence, followed by ‘inferred from sequence similarity’ (ISS).

Example annotation:

Field                          Annotation 1             Annotation 2
DB                             SGD                      SGD
DB_Object_ID                   S0000296                 S0000296
DB_Object_Symbol               PHO3                     PHO3
[NOT]
go_id                          GO:0015888               GO:0003993
DB:Reference (|DB:Reference)   SGD:8789|PMID:2676709    SGD:8789|PMID:2676709
Evidence                       IMP                      IMP
With
Aspect                         P                        F
DB_Object_Name (|Name)
DB_Object_Synonym (|Synonym)   YBR092C                  YBR092C
DB_Object_Type                 gene                     gene
Taxon (|taxon)                 taxon:4932               taxon:4932
Date                           20001122                 20001122

Fields highlighted in grey are mandatory
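
As a rough illustration of how one such annotation line could be read, here is a minimal Python sketch. It assumes that the fields are tab-separated and appear in the column order of the example header; the transcript does not preserve the original delimiters, so the tab separator, the `parse_annotation` helper and the placement of YBR092C in the synonym column are all assumptions made for this example.

```python
from typing import Dict

# Column names, in the order given in the example header.
COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "NOT", "go_id",
    "DB:Reference", "Evidence", "With", "Aspect", "DB_Object_Name",
    "DB_Object_Synonym", "DB_Object_Type", "Taxon", "Date",
]


def parse_annotation(line: str) -> Dict[str, str]:
    """Split one annotation line into a dict keyed by column name.

    Assumes tab-separated fields; empty optional fields are kept as "".
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


# First example row. The empty strings stand for the [NOT], With and
# DB_Object_Name fields; putting YBR092C in the synonym column is an assumption.
EXAMPLE = "\t".join([
    "SGD", "S0000296", "PHO3", "", "GO:0015888", "SGD:8789|PMID:2676709",
    "IMP", "", "P", "", "YBR092C", "gene", "taxon:4932", "20001122",
])

record = parse_annotation(EXAMPLE)
print(record["go_id"], record["Evidence"], record["Aspect"])
```

Keeping empty fields as empty strings, rather than dropping them, preserves the column positions, which matters because several fields are optional.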

The source of an annotation may be a literature reference, a database record or a type of computational analysis. Literature references are entered as an accession number from the database in question, from PubMed, or from both. Annotations based on computational analysis include a reference to the method of analysis.
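
The reference field in the example annotation, SGD:8789|PMID:2676709, carries both kinds of accession number. A small Python sketch of splitting such a field follows; the `parse_references` helper is hypothetical, and the reading of each item as a `prefix:accession` pair separated by `|` is inferred from the example rather than taken from a format specification.

```python
from typing import List, Tuple


def parse_references(field: str) -> List[Tuple[str, str]]:
    """Split a reference field into (database prefix, accession) pairs."""
    references = []
    for item in field.split("|"):
        item = item.strip()
        if not item:
            continue  # tolerate a stray or trailing separator
        prefix, separator, accession = item.partition(":")
        if not separator or not accession:
            raise ValueError(f"malformed reference: {item!r}")
        references.append((prefix, accession))
    return references


refs = parse_references("SGD:8789|PMID:2676709")
print(refs)  # [('SGD', '8789'), ('PMID', '2676709')]
```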

The annotation of gene products to GO terms is performed according to two main principles: the recording of the source of the annotation and the type of evidence on which the annotation was based.

Collaborating databases

Many important databases produce GO annotations and contribute to the development of the GO. These include:

FlyBase (database for the fruit fly Drosophila melanogaster)

Berkeley Drosophila Genome Project (Drosophila informatics; GO database & software)

Saccharomyces Genome Database (SGD) (database for the budding yeast Saccharomyces cerevisiae)

Mouse Genome Database (MGD) & Gene Expression Database (GXD) (databases for the mouse Mus musculus)

The Arabidopsis Information Resource (TAIR) (database for the brassica family plant Arabidopsis thaliana)

WormBase (database for the nematode Caenorhabditis elegans)

PomBase (database for the fission yeast Schizosaccharomyces pombe)

Rat Genome Database (RGD) (database for the rat Rattus norvegicus)

DictyBase (informatics resource for the slime mold Dictyostelium discoideum)

The Pathogen Sequencing Unit (The Wellcome Trust Sanger Institute)

Genome Knowledge Base (GKB) (Cold Spring Harbor Laboratory)

EBI: InterPro, SWISS-PROT and TrEMBL groups

The Institute for Genomic Research (TIGR)

Gramene (a comparative mapping resource for monocots)

Compugen (with its Internet Research Engine)

Abbreviations used by GO are described here:

http://www.geneontology.org/doc/GO.xrf_abbs

What is a Gene Ontology (GO) annotation?

Databases external to GO make cross-links between GO terms and objects in their databases (typically, gene products, or their surrogates, genes), and then provide tables of these links to GO. The GO itself contains no information about genes or gene products. The GO annotation (‘gene association’) files are all publicly available:

http://www.geneontology.org/#annotations

Gene products are annotated to the most specific GO term possible for the information available.

A gene product is annotated to one or more terms in each of the three ontologies: biological process, cellular component and molecular function.

Annotation of a gene product to one ontology is independent of its annotation to the other two ontologies.

When there is no information regarding one or more aspects of a gene product, the gene product is annotated to the GO term ‘unknown’ for each of those aspects.
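
The rule about missing aspects can be sketched in Python as follows. This is only an illustration: the `terms_by_aspect` helper is hypothetical, the aspect letters P, F and C are the ones used in the Aspect field, and the plain string "unknown" is a placeholder for the real ‘unknown’ term of each ontology, whose identifiers are not given here.

```python
from typing import Dict, Iterable, List, Tuple

# Aspect letters as used in the Aspect field of an annotation.
ASPECTS = {
    "P": "biological process",
    "F": "molecular function",
    "C": "cellular component",
}


def terms_by_aspect(annotations: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (go_id, aspect) pairs by aspect, filling empty aspects with 'unknown'."""
    grouped: Dict[str, List[str]] = {aspect: [] for aspect in ASPECTS}
    for go_id, aspect in annotations:
        if aspect not in grouped:
            raise ValueError(f"unexpected aspect: {aspect!r}")
        grouped[aspect].append(go_id)
    for terms in grouped.values():
        if not terms:
            terms.append("unknown")  # placeholder, not a real GO identifier
    return grouped


# Only the two rows of the example annotation are used here, so the
# cellular component aspect has no term and falls back to the placeholder.
summary = terms_by_aspect([("GO:0015888", "P"), ("GO:0003993", "F")])
print(summary)
```

Because each aspect is filled independently, the sketch also reflects the point that annotation to one ontology does not depend on annotation to the other two.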

Annotating with GO: an overview

A gene product is annotated with terms reflecting only its normal activities, locations and processes.

DB: database name abbreviation.

[NOT]: used when the source specifies that a gene product is NOT associated with a particular GO term, e.g. “we have found that protein Z is not involved in the X cascade”.

DB_Object_ID: database object identifier. A database object is usually a gene product, but can also be a gene or a transcript.

go_id: Gene Ontology term identifier.

Aspect: P = biological process, F = molecular function and C = cellular component.

Taxon: taxonomic identifier for the gene product.

DB_Object_Type: object type (gene, transcript or protein).

http://www.geneontology.org/

Evidence codes

IC inferred by curator

IMP inferred from mutant phenotype

IGI inferred from genetic interaction

IPI inferred from physical interaction

ISS inferred from sequence similarity

IDA inferred from direct assay

IEP inferred from expression pattern

IEA inferred from electronic annotation

TAS traceable author statement

NAS non-traceable author statement

ND no biological data available
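
Since the evidence codes form a loose reliability hierarchy, with IEA described as the least reliable, one common use is to filter annotations by code. The Python sketch below shows one way this might look; the `drop_evidence` helper and the sample data are hypothetical, and the second sample entry uses a made-up placeholder identifier rather than a real GO term.

```python
from typing import Iterable, List, Tuple

# Evidence codes and their meanings, as listed above.
EVIDENCE_CODES = {
    "IC": "inferred by curator",
    "IMP": "inferred from mutant phenotype",
    "IGI": "inferred from genetic interaction",
    "IPI": "inferred from physical interaction",
    "ISS": "inferred from sequence similarity",
    "IDA": "inferred from direct assay",
    "IEP": "inferred from expression pattern",
    "IEA": "inferred from electronic annotation",
    "TAS": "traceable author statement",
    "NAS": "non-traceable author statement",
    "ND": "no biological data available",
}


def drop_evidence(
    annotations: Iterable[Tuple[str, str]],
    excluded: Tuple[str, ...] = ("IEA",),
) -> List[Tuple[str, str]]:
    """Return the (go_id, evidence) pairs whose code is not in `excluded`."""
    kept = []
    for go_id, code in annotations:
        if code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {code!r}")
        if code not in excluded:
            kept.append((go_id, code))
    return kept


sample = [
    ("GO:0015888", "IMP"),  # taken from the example annotation
    ("GO:0000000", "IEA"),  # made-up placeholder entry, for illustration only
]
kept = drop_evidence(sample)
print(kept)  # [('GO:0015888', 'IMP')]
```

Passing `excluded=("IEA", "ISS")` would also drop annotations based on sequence similarity, the next least reliable type of evidence.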