vbi oomycetes2011 final
DESCRIPTION
JGI tutorial
TRANSCRIPT
MycoCosm & JGI Oomycete Genomics Tools
Andrea Aerts
Eukaryotic Genomics Program
US DOE Joint Genome Institute, Walnut Creek, CA
2
US DOE Joint Genome Institute
• Opened in 1999
• 250+ people
• Budget: ~$70M
• Location: Walnut Creek, CA
www.jgi.doe.gov
bioenergy, carbon cycling & biogeochemistry
3
Are you in the right room?
genome.jgi-psf.org
IMG
MycoCosm
70+ annotated eukaryotic genomes
4
Outline
Eukaryotic Genome Annotation
MycoCosm
Manual Curation
5
Started with Human Genome Project
6
Automated Annotation Pipeline
Quality Control
Gene Prediction
Data Mapping
Repeat Masking
Annotation
Genome assembly and ESTs
Repeat Library
Training predictors
External gene sets
Optional Inputs | Automated Pipeline | Staged Portal Release
External data sets
Manual curation
7
Eukaryotic Gene Prediction
Ab initio gene prediction relies on general features of gene structure, learned by training on a set of known genes.
• FGENESH
• GENEMARK
Homology methods build exons around regions of homology and ensure gene structure features.
Homolog
Predicted model
Known genes are mapped onto genomic sequence
mRNA
Known gene
ATG TGA
GT AG
exons introns 5’UTR 3’UTR
Promoter PolyA
Gene model
• GENEWISE
• FGENESH+
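The structural features named on this slide (ATG start, a stop codon such as TGA, and GT...AG intron boundaries) can be checked mechanically. The following is a minimal sketch, not JGI pipeline code: the function name, the coordinate convention (0-based, end-exclusive, forward strand) and the assumption that exons cover only the coding region are all illustrative choices.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_gene_model(genome: str, exons: List[Tuple[int, int]]) -> List[str]:
    """Return a list of structural problems in a forward-strand gene model.

    Assumptions (hypothetical, for illustration only):
    - exons are (start, end) pairs, 0-based, end-exclusive, sorted by position
    - exons cover the coding region only (no UTRs)
    """
    problems = []
    cds = "".join(genome[start:end] for start, end in exons)

    if not cds.startswith("ATG"):
        problems.append("missing ATG start codon")
    if cds[-3:] not in STOP_CODONS:
        problems.append("missing stop codon")
    if len(cds) % 3 != 0:
        problems.append("CDS length is not a multiple of 3")

    # An intron is the sequence between two consecutive exons.
    # Canonical introns begin with GT (donor) and end with AG (acceptor).
    for (_, donor), (acceptor, _) in zip(exons, exons[1:]):
        intron = genome[donor:acceptor]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            problems.append(f"non-canonical intron at {donor}-{acceptor}")

    return problems


# Toy sequence: flank + exon 1 + intron + exon 2 + flank
toy_genome = "CC" + "ATGAAG" + "GTCCCCAG" + "GCTTGA" + "CC"
good_model = [(2, 8), (16, 22)]
bad_model = [(2, 8), (15, 22)]  # second exon starts one base too early
```

Calling `check_gene_model(toy_genome, good_model)` returns an empty list, while the shifted model is flagged for both a frame problem and a non-canonical intron.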
8
More Prediction Methods
Use ESTs/cDNAs to extend, correct or predict gene models
• ESTEXT
Predicted model
ESTs
Extended model
5’UTR 3’UTR
ATG TGA
ATG TGA
Detect orthologs with poor alignments and refine with synteny-based methods
• FGENESH2
Genome A
Genome B
FGENESH
Representative set
GENEWISE
EXTERNAL MODELS
A non-redundant gene set is built by selecting “the best” model at each locus, judged by homology and EST support, followed by manual curation
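Selecting one representative model per locus can be sketched as follows. This is only an illustration of the idea: the `Model` fields, the scoring formula and the `est_weight` parameter are assumptions made for the example, not the criteria the JGI pipeline actually uses.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Model:
    name: str
    locus: str             # hypothetical locus identifier
    homology_score: float  # e.g. alignment score against a known protein
    est_support: float     # e.g. fraction of the model confirmed by ESTs (0..1)


def combined_score(model: Model, est_weight: float = 100.0) -> float:
    """Hypothetical score: homology plus weighted EST support."""
    return model.homology_score + est_weight * model.est_support


def representative_set(models: Iterable[Model]) -> Dict[str, Model]:
    """Keep the highest-scoring model for each locus."""
    best: Dict[str, Model] = {}
    for model in models:
        current = best.get(model.locus)
        if current is None or combined_score(model) > combined_score(current):
            best[model.locus] = model
    return best


candidates = [
    Model("fgenesh_1", "locus1", homology_score=200.0, est_support=0.5),
    Model("genewise_1", "locus1", homology_score=230.0, est_support=0.1),
    Model("external_7", "locus2", homology_score=90.0, est_support=0.0),
]
chosen = representative_set(candidates)
```

Here `fgenesh_1` is kept for `locus1` because its EST support outweighs the slightly higher homology score of `genewise_1` under the assumed weighting.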
9
Outline
Eukaryotic Genome Annotation
MycoCosm
Manual Curation
10
Genome-Centric View
Comparative View
www.jgi.doe.gov/fungi
11
Genome-Centric View
Focus: functional genomics, user data deposition and curation
12
New Comparative View
13
Community Building Tools
Asilomar, Pacific Grove, CA (Mar 15-20):
• Mar 15: Dothideomycetes jamboree
• Mar 17: JGI Workshop (Fungi & CSP)
• Mar 18: Fungal Tools (MycoCosm)
Walnut Creek, CA (Mar 21-24):
• Mar 21: Basidiomycetes jamboree
• Mar 22: MycoCosm tutorial
14
Oomycetes at JGI
• Phytophthora ramorum
• Phytophthora sojae
  • improved assembly
  • released “internally” in April 2011
• Phytophthora capsici
  • Hybrid Sanger/454 assembly
  • Public release November 2010
• Phytophthora cinnamomi
  • Currently in sequencing
15
Outline
Eukaryotic Genome Annotation
MycoCosm
Manual Curation
16
Manual Curation Tools
17
Manual Curation Workflow
1. Find a gene
2. Validate gene structure
3. Choose the best model
4. Fix the structure or report the problems
5. Annotate it!
18
1. Find a Gene
• Search using keywords
• Browse KEGG/GO/KOG lists
• Blat/Blast your sequence against genome, transcripts, proteins
• Search by ID (Simple search for model name, protein id, gi number)
19
2. Validate Gene Structure
Using GenomeBrowser, check:
• ESTs
• Genome conservation
• Homology
• Other evidence (e.g., proteomics, tiling arrays)
20
2. Validate Gene Structure, cont
Using Protein Pages, check:
• Alignments with homologs
• Domains
21
2. Validate Gene Structure, cont
Using Cluster Details pages, compare:
sizes and domain composition of orthologs
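The size comparison can be approximated with a simple check against the typical ortholog length. This is a rough sketch for illustration: the function name and the 30% tolerance are assumptions, and the Cluster Details pages themselves are meant for visual inspection rather than this kind of automatic cutoff.

```python
from statistics import median
from typing import Sequence


def size_outlier(model_length: int,
                 ortholog_lengths: Sequence[int],
                 tolerance: float = 0.3) -> bool:
    """Return True if the model's protein length differs from the median
    ortholog length by more than the given fraction (hypothetical cutoff)."""
    if not ortholog_lengths:
        raise ValueError("need at least one ortholog length")
    typical = median(ortholog_lengths)
    return abs(model_length - typical) > tolerance * typical


# A 250-residue model among ~500-residue orthologs is worth a closer look,
# for example as a possible truncated or split gene model.
suspicious = size_outlier(250, [480, 500, 520])
acceptable = size_outlier(470, [480, 500, 520])
```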
22
4. Fix Gene Structure
• If the best model is not good enough, report that it requires editing (AnnotationPage), or fix it (TrackEditor) and then promote it (AnnotationPage)
• If there is no model, create it (TrackEditor) and then promote it (AnnotationPage)
23
5. Annotate Gene
Using AnnotationPage, assign:
• Gene name/symbol (optional)
• Defline (mandatory, will go to GenBank)
• Description (optional)
• Notes (‘No issues’ if the model is good)
• Disposition (‘Catalog’ or ‘Delete’)
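The fields above, with their mandatory and optional status, can be modelled as a small record with a validation step. This is a hypothetical sketch of the rules as stated on the slide; the class, field names and `validate` function are illustrative and do not represent the AnnotationPage interface.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_DISPOSITIONS = {"Catalog", "Delete"}


@dataclass
class Annotation:
    defline: str                       # mandatory, goes to GenBank
    disposition: str                   # 'Catalog' or 'Delete'
    gene_name: Optional[str] = None    # optional
    description: Optional[str] = None  # optional
    notes: str = ""                    # 'No issues' if the model is good


def validate(annotation: Annotation) -> List[str]:
    """Return a list of rule violations; an empty list means the record is valid."""
    errors = []
    if not annotation.defline.strip():
        errors.append("defline is mandatory")
    if annotation.disposition not in VALID_DISPOSITIONS:
        errors.append("disposition must be 'Catalog' or 'Delete'")
    return errors


ok = Annotation(defline="Putative glucanase", disposition="Catalog", notes="No issues")
incomplete = Annotation(defline=" ", disposition="Keep")
```

`validate(ok)` returns no errors, while `validate(incomplete)` reports both the missing defline and the unrecognized disposition.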