1

Gene Ontology and Functional Annotation

Donghui Li

ASPB Plant Biology, June 29, 2008, Merida

2

TAIR literature statistics

                              May 2007   May 2008
References                      31,058     34,179
Research articles               22,640     25,001
Full-text papers                15,572     16,638
Average new papers/month           204        216
Loci with valid references       9,289     10,847

3

Outline

• Functional annotation
• Controlled vocabularies: GO and PO
• Functional annotation at TAIR
• Community annotation

4

Functional annotation

Defined as the process of collecting information about a gene’s biological identity:

• molecular function (protein kinase)
• biological roles (protein phosphorylation)
• subcellular localization (cytoplasm)
• aliases
• mutant phenotype
• expression domain

5

What is an annotation?

An annotation is a statement that a gene product …

• …has a particular molecular function
• …is involved in a particular biological process
• …is located within a certain cellular component
• …as determined by a particular method
• …as described in a particular reference

Adapted from Harold J Drabkin, The Jackson Laboratory

6

Smith et al. (2006) determined by an enzyme assay that Abc2 has protein kinase activity, is involved in the process of protein phosphorylation, and is located in the cytoplasm.

• Reference: Smith et al. (2006)
• Evidence code: enzyme assay
• Controlled vocabularies: protein kinase activity, protein phosphorylation, cytoplasm
• Gene product: Abc2

Adapted from Harold J Drabkin, The Jackson Laboratory

7

Controlled vocabulary (CV)

Non-controlled vocabulary
• same name, different concept
• different name, same concept

Controlled vocabulary
• A standardized, restricted set of defined terms designed to reduce ambiguity in describing a concept

8

Same name, different concept

Cell

9

Same name, different concept

germination

• seed germination
• pollen germination
• spore germination

10

Different name, same concept

• glucose biosynthesis = glucose synthesis = glucose formation = glucose anabolism = gluconeogenesis
  noncarbohydrate precursors (pyruvate, amino acids and glycerol) → glucose
• phytochromobilin synthase activity = phytochromobilin:ferredoxin oxidoreductase activity
  (3Z)-phytochromobilin + oxidized ferredoxin = biliverdin IXα + reduced ferredoxin (EC:1.3.7.4)
• translation = protein biosynthesis = protein formation

11

Cross-species cross-database comparison is problematic without CV

• translation
• protein biosynthesis

• phytochromobilin synthase activity
• phytochromobilin:ferredoxin oxidoreductase activity
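One way to picture what a controlled vocabulary buys in a cross-database comparison is a lookup that maps every free-text name to a single preferred label before comparing. The sketch below uses only the synonym pairs shown on these slides; the table `SYNONYMS`, the helpers `normalize` and `same_concept`, and the choice of which name counts as preferred are illustrative assumptions, not how GO or TAIR store synonyms.

```python
# Hypothetical synonym table built from the slide examples.
# Which label is treated as "preferred" is an assumption made for illustration.
SYNONYMS = {
    "translation": "translation",
    "protein biosynthesis": "translation",
    "protein formation": "translation",
    "phytochromobilin synthase activity":
        "phytochromobilin:ferredoxin oxidoreductase activity",
    "phytochromobilin:ferredoxin oxidoreductase activity":
        "phytochromobilin:ferredoxin oxidoreductase activity",
}


def normalize(label: str) -> str:
    """Map a free-text label to its preferred name (or to itself if unknown)."""
    key = " ".join(label.lower().split())  # lower-case and collapse whitespace
    return SYNONYMS.get(key, key)


def same_concept(a: str, b: str) -> bool:
    """Two labels denote the same concept if they normalize to the same name."""
    return normalize(a) == normalize(b)


print(same_concept("Translation", "protein biosynthesis"))
print(same_concept("translation", "mitosis"))
```

Without the table, the two databases' labels would compare as different strings even though they describe the same concept.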

12

Cross-species cross-database comparison is problematic without CV

germination
• seed germination
• pollen germination
• spore germination

13

Controlled vocabularies used by TAIR

• GO: The Gene Ontology, Gene Ontology Consortium
• PO: The Plant Ontology, Plant Ontology Consortium

14

Gene Ontology

• molecular function: catalytic / binding activities
  kinase activity, DNA binding activity, transcription factor
• biological process: biological goal or objective
  signal transduction, mitosis, purine metabolism
• cellular component: location or complex
  nucleus, ribosome, proteasome

15

Term

16

Ontology structure: directed acyclic graph (DAG)

DAG: each child may have one or more parents

parent 1

child

parent 2

17

Ontology structure: directed acyclic graph (DAG)

Example terms:
• protein complex
• organelle
• mitochondrion
• fatty acid beta-oxidation multienzyme complex

18

Ontology structure: term-term relationships

Example terms:
• protein complex
• organelle
• mitochondrion
• fatty acid beta-oxidation multienzyme complex

Relationship types: is-a, part-of
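A small sketch may help show why the ontology is a DAG rather than a tree: a term can have more than one parent, and each edge carries a relationship type. The arrows of the original figure are not preserved in this transcript, so the three edges below are an assumed reading of the slide (two is-a edges and one part-of edge); `EDGES`, `parents`, and `ancestors` are hypothetical names chosen for illustration.

```python
from collections import deque

# (child, relationship, parent) triples.
# ASSUMPTION: this assignment of is-a / part-of to term pairs is a plausible
# reading of the slide, not something the transcript states explicitly.
EDGES = [
    ("mitochondrion", "is-a", "organelle"),
    ("fatty acid beta-oxidation multienzyme complex", "is-a", "protein complex"),
    ("fatty acid beta-oxidation multienzyme complex", "part-of", "mitochondrion"),
]


def parents(term):
    """Return the direct parents of a term as (relationship, parent) pairs."""
    return [(rel, parent) for child, rel, parent in EDGES if child == term]


def ancestors(term):
    """Collect every term reachable by following parent edges (breadth-first)."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _rel, parent in parents(current):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


complex_term = "fatty acid beta-oxidation multienzyme complex"
print(parents(complex_term))
print(sorted(ancestors(complex_term)))
```

Under these assumed edges the complex has two direct parents, which is exactly the "each child may have one or more parents" property of a DAG.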

19

Gene ontology browser: AmiGO

http://www.geneontology.org

http://amigo.geneontology.org

20

Plant Ontology

Plant structure
• morphological and anatomical structures
• stamen, petal, guard cell

Growth and developmental stages
• whole plant growth stages and plant structure developmental stages
• seedling growth, rosette growth, leaf development stages, embryo development stages

21

How are annotations made?

An association links a gene to a term, with evidence, from a paper:

• gene: AT5G27620
• term: GO:0004672 protein kinase activity
• evidence: kinase assay
• reference: The Plant Journal (2006) 47:701
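The association on this slide can be sketched as a small record with one field per component. The class name `Annotation`, its field names, and the `as_statement` helper are assumptions made for illustration; they do not reflect TAIR's actual database schema or the GO annotation file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One gene-to-term association (hypothetical layout, for illustration)."""
    gene: str        # AGI locus identifier
    term_id: str     # GO term identifier
    term_name: str   # GO term name
    evidence: str    # method supporting the association
    reference: str   # paper the association was taken from

    def as_statement(self) -> str:
        """Render the record as a sentence in the style of the earlier slides."""
        return (
            f"{self.gene} is annotated to {self.term_id} ({self.term_name}), "
            f"as determined by {self.evidence}, "
            f"as described in {self.reference}."
        )


# Values taken from the slide.
example = Annotation(
    gene="AT5G27620",
    term_id="GO:0004672",
    term_name="protein kinase activity",
    evidence="kinase assay",
    reference="The Plant Journal (2006) 47:701",
)
print(example.as_statement())
```

Keeping the evidence and the reference on every record is what lets a reader trace each annotation back to the experiment and the paper behind it.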

22

Evidence codes

Experimental evidence codes
• EXP - Inferred from Experiment
• IMP - Inferred from Mutant Phenotype
• IDA - Inferred from Direct Assay
• IGI - Inferred from Genetic Interaction
• IPI - Inferred from Physical Interaction
• IEP - Inferred from Expression Pattern

Computational analysis evidence codes
• ISS - Inferred from Sequence or structural Similarity
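The codes on this slide can be held in a lookup that also records which group each code belongs to. This is a minimal sketch covering only the seven codes listed here; the names `EXPERIMENTAL`, `COMPUTATIONAL`, and `describe` are hypothetical.

```python
# Evidence codes exactly as listed on the slide, grouped by category.
EXPERIMENTAL = {
    "EXP": "Inferred from Experiment",
    "IMP": "Inferred from Mutant Phenotype",
    "IDA": "Inferred from Direct Assay",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
}
COMPUTATIONAL = {
    "ISS": "Inferred from Sequence or structural Similarity",
}


def describe(code: str):
    """Return (category, meaning) for a code, or raise if it is not listed."""
    key = code.strip().upper()
    if key in EXPERIMENTAL:
        return ("experimental", EXPERIMENTAL[key])
    if key in COMPUTATIONAL:
        return ("computational", COMPUTATIONAL[key])
    raise ValueError(f"evidence code not listed on this slide: {code!r}")


print(describe("IDA"))
print(describe("iss"))
```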

24

Functional annotation of Arabidopsis genome using GO (May 2008)

[Chart; categories: Known; Known, EXP; Unknown; Unannotated]

25

Search GO Annotations

31

Papers entered into TAIR (May 07 to May 08)

                                Total   With gene-related data   Indexed   Curated
Papers in priority 1 journals     222                      166      100%   144 (86%)
Papers in priority 2 journals     546                      385      100%   207 (54%)
Papers in priority 3 journals     517                      314      100%    31 (10%)
Papers in priority 4 journals    1291                      461      100%    11 (2%)
Total                            2576                     1326      1326   393 (30%)
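The table's arithmetic can be checked with a few lines. The sketch assumes that the curated percentage is the number of curated papers divided by the number of papers with gene-related data; the slide does not state the denominator, but that reading matches the printed figures to within one percentage point. `ROWS`, `TOTAL`, and `curated_percent` are hypothetical names.

```python
# Per row: (total papers, with gene-related data, curated, curated % as printed)
ROWS = {
    "priority 1": (222, 166, 144, 86),
    "priority 2": (546, 385, 207, 54),
    "priority 3": (517, 314, 31, 10),
    "priority 4": (1291, 461, 11, 2),
}
TOTAL = (2576, 1326, 393, 30)


def curated_percent(with_gene_data: int, curated: int) -> float:
    """ASSUMED definition: curated papers as a share of papers with gene data."""
    return 100.0 * curated / with_gene_data


# Sum the three count columns across the four priority rows.
column_sums = tuple(sum(row[i] for row in ROWS.values()) for i in range(3))
print("column sums:", column_sums, "slide totals:", TOTAL[:3])

for name, (_total, with_data, curated, printed) in ROWS.items():
    computed = curated_percent(with_data, curated)
    print(f"{name}: {curated}/{with_data} = {computed:.1f}% (slide: {printed}%)")
```

The point the table makes survives the check: every paper with gene-related data is indexed, but the curated share falls steeply from priority 1 to priority 4 journals.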

32

TAIR - Plant Physiology collaboration

• Author submits annotation after the paper is accepted

• Web-based interface

• AGI locus identifier (At1g01040)

• Gene function annotation linked to loci with method

• Will expand to include other journals (Plant Cell ...)
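Since submitted annotations are keyed on AGI locus identifiers, a submission form would plausibly validate them first. The sketch below checks the standard AGI pattern (AT, a chromosome code of 1 to 5, C, or M, then G and five digits) and normalizes case; the function name `normalize_agi` is hypothetical, and nothing here describes how the actual TAIR form works.

```python
import re

# AGI locus identifier: "AT" + chromosome (1-5, C = chloroplast,
# M = mitochondrion) + "G" + five digits, e.g. At1g01040.
AGI_PATTERN = re.compile(r"AT[1-5CM]G\d{5}", re.IGNORECASE)


def normalize_agi(identifier: str):
    """Return the identifier in upper case, or None if it is not a valid locus."""
    candidate = identifier.strip()
    if AGI_PATTERN.fullmatch(candidate) is None:
        return None
    return candidate.upper()


print(normalize_agi("At1g01040"))
print(normalize_agi("At6g01040"))
```

Normalizing to one case before lookup avoids treating At1g01040 and AT1G01040 as different loci.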

34

Functional annotation submission form

curator@arabidopsis.org

35

Add your comment on TAIR
