From Sequence to Knowledge

Assembly, Annotation, and Analysis of Phage Genomes from Genomic and Metagenomic Data Sets

A Helping Hand Through the Annotation Bottleneck

Ramy K. Aziz

Workshop presenters

6 Aug 2017 Phage Genomics - Evergreen 2017

Alejandro Reyes (AR)

Ramy Aziz (RA)

Jason Gill (JG)

PRELUDE

A bit of history…

• Since 2009, the Genomics Workshop has become an essential part of the Evergreen phage meeting

• The challenge is always: how to meet needs and expectations that are so many and so diverse, in ~4 hours

• The answer is: …

The 2011 workshop

The 2013 workshop

The 2015 workshop

MOTIVATION

“The analysis bottleneck”

• Observation:

– We generate more data than we can analyze.

– We generate sequence data faster than we can analyze them.

• Opinion:

– Not all bottlenecks are created equal!

– It is important to define the question(s) before working on the answer(s)!

“The analysis bottleneck”

• The Lavigne paradox (2013)

“The analysis bottleneck”

• The Lavigne paradox (2015)

AUDIENCE

Workshop audience

• Who (how many) among you have:

– annotated at least one phage genome?

– worked on a viral metagenome?

– used the command line (Unix, Linux, Mac Terminal) for sequence analysis?

• We actually ran an online survey, and here is what we found …

Quick group activity

Defining the question(s):

• Introduce yourself, your institution, and your favorite phage

• Do you have a genome sequenced? Planning to?

– Why have you sequenced your phage genome?

– Why do you want to sequence your phage genome?

• What is the single most pressing question you want to have answered from genome analysis?

DEFINING THE QUESTION(S)

What you want … is

from genome / from metagenome

- complete

- accurate

Figure labels: incomplete, frameshift, faulty assembly

Credit: Andrew Kropinski; Bas Dutilh

A process of reconstruction

• Experimentally

• Computationally

Figure labels: DNA, “Any phage one can get!”, “eDNA”

Sequence fragments shown in the figure:

TGATTGTGTGTTTGCGCAATGCG

ATGTGTATATATAGTGAGCTTGCCC

GTCTCTCTNNNTCTCTTG

TGATTGGTCTNNNTCTCTTGCGCAATGCG

From Sequence to Knowledge

From raw sequence data to genome submission/publication

• Assembly

• Gene finding/ORF calling

• tRNA calling

• Annotation (assigning functions)

• Orienting

• Validation

• Fixing frameshifts

• Introns and inteins

• Subsystem assignment

• Refinement/secondary annotation loop

• Special purpose: toxins, morons, integrases, lifestyle prediction

• Regulatory elements (promoters, terminators)

• Output: files and graphics
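To make the gene finding/ORF calling step of the workflow above more concrete, here is a minimal Python sketch of a naive six-frame ORF scan. It is an illustration only, not the method used by RAST, PhAnToMe, or any other tool named in this workshop. The function names, the ATG-only start codon, and the `min_codons` threshold are all assumptions made for the example; real gene callers also consider alternative start codons, ribosome binding sites, and coding statistics.

```python
# Naive ORF scan (illustrative assumption, not a real gene caller).
START = "ATG"                   # assumption: only ATG is treated as a start codon
STOPS = {"TAA", "TAG", "TGA"}   # standard stop codons

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement; unknown bases become N."""
    return "".join(_COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, start, end, orf_sequence) tuples. Coordinates are
    0-based, end-exclusive, and relative to the strand that was scanned.
    min_codons counts the start and stop codons (hypothetical threshold).
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i                      # open an ORF at the first ATG
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None                   # close it and keep scanning
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

Because the scan ignores any codon containing N, fragments such as those shown on the reconstruction slide can interrupt or extend a predicted ORF. This is one reason why the validation and frameshift-fixing steps follow gene calling in the workflow.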

Classification

• The phage sequence space (Lima-Mendez et al.)

• The phage proteomic tree (Edwards & Rohwer)

• New: VIP tree http://www.genome.jp/viptree

Countless tools

This workshop: outline

1. Annotation overview

2. Automated tools for genome annotation:

– PhAnToMe/RAST related tools

– Galaxy/ Apollo

3. Tools for metagenome-based analyses

– Assembly

– Functional prediction via protein families

Where to go from here?

• Part I: General introduction to genome annotation

• Part II: Two levels

– Level 1: Novices and beginners: automated annotation tools

– Level 2: Intermediate to advanced users: command-line based tools

Online resources/Slideshare

• Data & links:

– http://egybio.net/tutorial

• Slides:

– http://bit.ly/annotation2016

– http://bit.ly/phantome4

– Old tutorials (more detailed, but missing the latest updates):

• Evergreen 2011: http://slidesha.re/phantome1

• http://slidesha.re/phiRAST1 (by Karin Holmfeldt)

• Evergreen 2013: http://bit.ly/phantome2

• Evergreen 2015: http://bit.ly/phantome3
