basics of genome annotation
DESCRIPTION
Basics of Genome Annotation. Daniel Standage Biology Department Indiana University. An-no-ta- tion \ˌa - nə -ˈ tā- shən \. A critical or explanatory note or body of notes added to a text The act of annotating. http:// dictionary.reference.com /browse/ annotation?s =t. Genome annotation. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/1.jpg)
Basics of Genome AnnotationDaniel StandageBiology DepartmentIndiana University
![Page 2: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/2.jpg)
An-no-ta-tion \ˌa-nə-ˈtā-shən\
1. A critical or explanatory note or body of notes added to a text
2. The act of annotating
http://dictionary.reference.com/browse/annotation?s=t
![Page 3: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/3.jpg)
![Page 4: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/4.jpg)
![Page 5: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/5.jpg)
Genome annotation
![Page 6: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/6.jpg)
Genome annotation
![Page 7: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/7.jpg)
Genome annotation
Information itself (e.g., this gene encodes a cytochrome P450 protein, with exons at…)
Annotation process (operational definition)
Data management formatting storage distribution representation
![Page 8: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/8.jpg)
Methods for gene finding
Ab initio gene prediction
Gene prediction by spliced alignment
![Page 9: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/9.jpg)
Ab initio gene prediction
Ab initio: “from first principles”
Requires only a genomic sequence
Uses statistical model of genome composition to identify most probable location of start/stop codons, splice sites
Popular implementations Augustus GeneMark SNAP
![Page 10: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/10.jpg)
Ab initio gene prediction
![Page 11: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/11.jpg)
Prediction by spliced alignment
Utilizes experimental (transcript) and/or homology (reference proteins) data
Spliced alignment of sequences reveals gene structure matches = exons gaps = introns
Popular implementations GeneSeqer Exonerate GenomeThreader
![Page 12: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/12.jpg)
Comparison of prediction methods
Ab initio Spliced alignment
Do not require extrinsic evidence
Requires transcript and/or protein sequences
Does not benefit from additional transcript data
Accuracy improves with additional transcript data
More likely to recover complete gene structures
More likely to recover accurate internal exon/intron structure
![Page 13: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/13.jpg)
Issues with gene prediction
Accuracy (best methods achieve ≈80% at exon level)
Parameters matter (species-specific codon usage)
Comparison and assessment
![Page 14: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/14.jpg)
Recurring theme in genomics
Once I have a result, how to I assess its reliability?
How do I compare it to alternative results?
![Page 15: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/15.jpg)
Recurring theme in genomics
"Why, when you only had one result, did you think that was the correct one?"
![Page 16: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/16.jpg)
![Page 17: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/17.jpg)
Manual annotation
Visually inspect gene predictions, spliced alignments
Determine reliable consensus gene structure
Available software Apollo: http://apollo.berkeleybop.org yrGATE: http://goblinx.soic.indiana.edu/src/yrGATE
![Page 18: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/18.jpg)
![Page 19: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/19.jpg)
“Combiner” tools
Maker: http://www.yandell-lab.org/software/maker.html
EVidenceModeler: http://evidencemodeler.sourceforge.net
![Page 20: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/20.jpg)
Evaluating annotations
Comparison ParsEval1: http://standage.github.io/AEGeAn
Quality assessment Annotation Edit Distance2 (Maker) GAEVAL (PlantGDB)
1Standage and Brendel (2012) BMC Bioinformatics, 13:187.2Eilbeck et al (2009) BMC Bioinformatics, 10:67.
![Page 21: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/21.jpg)
Recommendations / Considerations
Automated annotation
Manual refinement
Assessment and filtering for particular analyses
Be very skeptical
Remember: no “one true” assembly / annotation
![Page 22: Basics of Genome Annotation](https://reader034.vdocument.in/reader034/viewer/2022051517/56815380550346895dc18422/html5/thumbnails/22.jpg)
xGDBvm
Pre-installed on iPlant cloud (free for academics!) Search for xGDBvm image
Includes an EVM pipeline for automated annotation
Includes yrGATE for manual annotation
Visualization, search, access control
More info: http://goblinx.soic.indiana.edu