Next Gen. Sequencing, Sept. 24, 2008. Massively Parallel High Throughput DNA Sequencing: Automation for Microbial Community, Gene Expression and de novo Deciphering of New Genomes. Bruce A. Roe, Ph.D., George Lynn Cross Research Professor of Chemistry and Biochemistry, Advanced Center for Genome Technology, Stephenson Research and Technology Center, University of Oklahoma


Page 1: Massively Parallel High Throughput DNA Sequencing: Automation for Microbial Community, Gene Expression and de novo Deciphering of New Genomes

Next Gen. Sequencing, Sept. 24, 2008

Bruce A. Roe, Ph.D.

George Lynn Cross Research Professor of Chemistry and Biochemistry, Advanced Center for Genome Technology, Stephenson Research and Technology Center, University of Oklahoma

Page 2: A Brief History of Long Read Automated DNA Sequencing Instruments: ABI and 454/Roche

[Figure: # Million Bases/Run versus Date of Introduction, 1994 to 2008]

• ABI 370/377: 40 Kb/run

• ABI 3700: 200 Kb/run

• ABI 3730: 1 Mb/run

• 454 GS20: 30 Mb/run

• 454/Roche GS-FLX: 100 Mb/run

• 454/Roche GS-FLX-XLR: 1 Gb/run
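
To put the per-run throughput labels on this slide side by side, the short sketch below expresses each instrument as a fold increase over the ABI 370/377. The numbers are the labels from the chart (taking Kb, Mb and Gb as 10^3, 10^6 and 10^9 bases); the dictionary and function names are illustrative only.

```python
# Bases per run as labeled on the chart (Kb = 1e3, Mb = 1e6, Gb = 1e9 bases).
BASES_PER_RUN = {
    "ABI 370/377": 40_000,
    "ABI 3700": 200_000,
    "ABI 3730": 1_000_000,
    "454 GS20": 30_000_000,
    "454/Roche GS-FLX": 100_000_000,
    "454/Roche GS-FLX-XLR": 1_000_000_000,
}


def fold_increase(throughput, baseline="ABI 370/377"):
    """Return each instrument's bases/run divided by the baseline's bases/run."""
    base = throughput[baseline]
    return {name: bases / base for name, bases in throughput.items()}


if __name__ == "__main__":
    for name, fold in fold_increase(BASES_PER_RUN).items():
        print(f"{name}: {fold:,.0f}x the ABI 370/377 output per run")
```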

Page 3: 454 GS-FLX Sequencer

• Pico-scale sequencing reactions

• 2 Core Techniques:
– Emulsion PCR
– Pyrosequencing

Page 4: Emulsion PCR

• Micro-reactors
– Water-in-oil emulsion generates millions of micelles.
– Each micelle contains all reagents/templates for a PCR reaction.
– ~10 million individual PCR reactions in a single tube.

Page 5: Emulsion PCR

Page 6: Load Beads into 454 PicoTiter Plate

[Figure labels: Load beads into PicoTiter Plate; Centrifugation; Load Enzyme Beads; 44 μm]

Page 7: Pyrosequencing

(1) DNA polymerase adds a nucleotide (dNTP; dTTP in the example shown) to the annealed primer.

(2) Pyrophosphate (PPi) is released.

(3) Sulfurylase creates ATP from PPi and APS.

(4) Luciferase hydrolyzes ATP to oxidize luciferin, producing light and oxyluciferin.

(5) The CCD camera detects the bursts of light.

[Figure labels: DNA Bead; Enzyme Bead; Annealed Primer; template A A T C G G C A T G C T A A A A G T C A; incorporated base T]

Page 8: Pyrosequencing Output

Page 9: Base Calling via Flowgram

[Figure: flowgram labeled with the sequence TTCTGCGAA]
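
As a rough illustration of how a flowgram maps to a base call, the sketch below rounds each flow's light signal to a whole number of incorporations and emits that many copies of the nucleotide flowed. The cyclic flow order `TACG` and the signal values are assumptions chosen so that the result reproduces the TTCTGCGAA label on this slide; they are not taken from the instrument, and the real base caller does considerably more (signal normalization, quality scoring).

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order, for illustration only


def call_bases(signals, flow_order=FLOW_ORDER):
    """Turn per-flow signal intensities into a base sequence.

    Each signal is rounded to the nearest whole number of incorporations;
    a homopolymer run (e.g. TT) shows up as one flow with a signal near 2.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, int(signal + 0.5))  # nearest integer, never negative
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical normalized signals, one per flow (T, A, C, G, T, A, ...).
SIGNALS = [2.04, 0.03, 0.98, 0.05, 1.02, 0.04, 0.06, 0.97,
           0.02, 0.05, 1.01, 0.99, 0.03, 1.96]

if __name__ == "__main__":
    print(call_bases(SIGNALS))
```

Because homopolymer length is inferred from signal height, calls become less certain as runs get longer and the gap between, say, 5 and 6 incorporations shrinks relative to the noise.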

Page 10: Types of Libraries

• 454/Roche
– Shotgun: random 250+ bp reads
– Paired-End: 25-250 bp ends of a circularized DNA molecule
– Amplicon: PCR product for SNP discovery

• Roe Lab
– Combined Paired-End and Shotgun approach: best of both worlds

Page 11: Our Combined Paired End & Shotgun DNA Preparation Protocol Overview

• Shear to 2-4 Kbp fragments on the Hydroshear

• DNA End Repair & Linker Ligation as in paired-end protocol

• Cleave the Terminal Linkers with EcoRI

• Ligate to circularize the DNA

• Shear to ~500 bp fragments in the Nebulizer but eliminate the enrichment step for fragments containing linker

• Quantitate on Caliper AMS-90 or by RealTime PCR

Page 12: Our Combined Paired End/Shotgun DNA Preparation Protocol Overview (cont)

• DNA End Repair, Adaptor Ligation, Adaptor End Repair

• Quantitate on Caliper AMS-90 or by RealTime PCR

• Amplification (emPCR)

• Pyrosequencing of the combined linker-containing (paired end) and shotgun fragments on 454/Roche GS-FLX

Page 13: Assembly of Sequence Reads from Our Combined Paired-End/Shotgun Protocol

• Separate reads based on inclusion or exclusion of the middle linker
– Those sequences containing a middle linker are further separated based on the length of the read to either end of the linker sequence
– ~15% of the total reads contain the middle linker sequence

• Assembly of the reads by Newbler

• Convert paired ends for ordering and orienting
– *.454f and *.454r
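
The read-separation step described on this slide can be sketched as follows. This is only a minimal illustration under stated assumptions: the linker sequence is a made-up placeholder (the real middle linker is not given here), matching is exact rather than error-tolerant, and the rule for short flanks (keep the longer flank as a plain shotgun read) is one plausible reading of "separated based on the length of the read to either end of the linker".

```python
# Hypothetical placeholder; the actual middle-linker sequence is not given here.
LINKER = "GATTACAGATTACA"


def split_reads(reads, linker=LINKER, min_flank=15):
    """Partition reads into shotgun reads and paired-end (left, right) flanks.

    reads: dict mapping read name -> sequence string.
    min_flank: assumed minimum usable flank length on each side of the linker.
    """
    shotgun, paired = {}, {}
    for name, seq in reads.items():
        pos = seq.find(linker)
        if pos == -1:
            shotgun[name] = seq  # no linker: ordinary shotgun read
            continue
        left, right = seq[:pos], seq[pos + len(linker):]
        if len(left) >= min_flank and len(right) >= min_flank:
            paired[name] = (left, right)  # both ends usable as a pair
        else:
            # One flank too short to place: keep the longer flank as shotgun.
            shotgun[name] = max(left, right, key=len)
    return shotgun, paired


if __name__ == "__main__":
    demo = {
        "r1": "AAAACCCCGGGGTTTT",
        "r2": "AAAACCCC" + LINKER + "GGGGTTTT",
        "r3": "AA" + LINKER + "GGGGTTTTCC",
    }
    shotgun, paired = split_reads(demo, min_flank=4)
    print("shotgun:", shotgun)
    print("paired:", paired)
```

The two flanks of each paired read would then be written out as forward and reverse mates so the assembler output can be ordered and oriented.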

Page 14: Automation of the Shotgun Library Preparation Steps

• Why automate?
– Time
– Reproducibility

• What are the obstacles?
– Reaction cleanup
• Qiagen MinElute centrifuge columns are difficult to automate, so replace those steps with Agencourt SPRI magnetic beads and add a magnetic station to the Zymark SciClone bed
– Enzyme stability and storage
• Build an enzyme cooling station on the Zymark SciClone bed

Page 15: SPRI Bead Technology

• Solid Phase Reversible Immobilization

• Carboxyl-coated magnetic particles suspended in a solution of 10% PEG and 1.25 M NaCl

• Reversibly binds DNA
– Hawkins, et al. (1994) DNA purification and isolation using a solid-phase. Nucleic Acids Research, 22(21):4543-4544

http://www.agencourt.com/products/spri_reagents/ampure/

Page 16: DNA Purification through the Qiagen MinElute Columns vs. Agencourt SPRI Magnetic Beads

[Figure: Qiagen MinElute centrifuge column compared with Agencourt SPRI magnetic beads]

The SPRI beads give at least a 30% increase in yield and are easier to automate.

Page 17: Homemade 96-well Magnetic Plate for Purification of the SPRI Beads

Inverted 96-well DNA sequencing plate with cylindrical magnets

Page 18: Enzyme Chilling Station

Plastic rack fitted with Swagelok fittings and tubing for cooling.

Page 19: Zymark SciClone Deck Arrangement

[Figure: deck layout with positions labeled Shaker (three positions), EtOH, Enzyme Mixes, Magnet, SPRI Beads, Sample, Buffers and Waste]

Page 20: Automated Library Making on the Caliper-Zymark SciClone

To view this automation, get our QuickTime movie 454ZymarkPrep.mov

Page 21: We also have increased the average read lengths from 250 to > 315 bases by increasing the number of flows and amounts of reagents

• Slightly dilute the Substrate, Inhibitor and Apyrase by transferring 2.5 ml from one of the Buffer CB bottles to each respective tube in the reagent tube-tray

• Add 174 µl (as opposed to 164 µl) from the tube of apyrase to the apyrase buffer tube in the reagent tube-tray.

• Transfer 150 ml Buffer CB from bottle 3 (at the back of the cassette) to bottle 0 (at the front of the cassette).

• Modify the run script to allow for 130 flow cycles

Page 22: Reuse the PicoTiter plate after cleaning by sonication

Page 23: Summary - Methods

• For library preparation, it is possible to:
– incorporate both shotgun and paired-end reads in the same library
– replace the Qiagen MinElute centrifuge columns with Agencourt SPRI beads in the library preparation and build (or buy) an enzyme chilling station to facilitate automating the library-making process
– eliminate the single-stranded DNA preparation steps

• It also is possible to:
– break the emulsion after emPCR using centrifugation rather than using a Swinlock filter containing a sieving fabric
– increase the volumes of the FLX reagents and the number of cycles, which results in a significantly increased read length
– reuse the PicoTiter plate after cleaning by sonication

• All our protocols are available on our lab protocol web site: http://www.genome.ou.edu/proto.html

Page 24: Applications

• Whole Genome Sequencing

• Pooled samples
– Plant viruses
– Plant fungi
– BAC-based genomic sequencing

• EST Libraries

• Bacterial Communities

Page 25: Novel cDNA pooling strategy

• Add tags to the PCR primer sequences to allow for deconvolution of viral sequences post sequencing

• cDNA samples are pooled in sets with 24 unique individual tags after a two-step PCR

Page 26: Strategy for preparing cDNA ready for 454 sequencing from dsRNA

• Anneal with Random Hexamer Primers followed by Reverse Transcriptase PCR Reaction. The primers shown carry the common sequence CCTTCGGATCCTCC ahead of the random hexamer (NNNNNN).

• Additional Rounds of RT PCR with Random Hexamer Primers

• RNAse Treatment to Remove any Excess Random Hexamer Primers followed by a second Taq Polymerase PCR with one of the 24 four-base Tagged Primers (AGAGCCTTCGGATCCTCC in the example shown)

• Amplified Product Ready for Ligating 454 A and B Primers

[Figure: strand diagrams for each step, showing the primer sequences above annealed to the 5' and 3' ends of both strands and the final product flanked by the A and B primers]

Page 27: Uniquely Tagged cDNA Sample on the 454

[Figure: read layout, in order: 454 tag (TCAG), TGP unique tag (GACA), TGP common primer (CCTTCGGATCCTCC), RT-PCR Sequence]
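
Given the read layout on this slide (454 tag, four-base unique tag, common primer, then the RT-PCR sequence), post-sequencing deconvolution can be sketched as below. The 454 tag and common primer sequences are the ones shown on the slide; the tag-to-sample table, the sample names and the exact-match rule are illustrative assumptions, and a real pipeline would also need to tolerate sequencing errors in the tag.

```python
KEY = "TCAG"               # 454 tag, as shown on the slide
COMMON = "CCTTCGGATCCTCC"  # TGP common primer, as shown on the slide
TAG_LEN = 4                # four-base unique tags


def demultiplex(reads, tag_to_sample):
    """Group reads by sample using the four-base tag that follows the 454 tag.

    Returns a dict: sample name -> list of inserts (RT-PCR sequence with the
    454 tag, unique tag and common primer trimmed off). Reads that do not fit
    the expected layout are kept untrimmed under "unassigned".
    """
    bins = {}
    for seq in reads:
        tag = seq[len(KEY):len(KEY) + TAG_LEN]
        rest = seq[len(KEY) + TAG_LEN:]
        if (seq.startswith(KEY) and tag in tag_to_sample
                and rest.startswith(COMMON)):
            insert = rest[len(COMMON):]
            bins.setdefault(tag_to_sample[tag], []).append(insert)
        else:
            bins.setdefault("unassigned", []).append(seq)
    return bins


if __name__ == "__main__":
    # Hypothetical sample names for two of the 24 tags.
    tags = {"GACA": "sample_A", "AGAG": "sample_B"}
    reads = [
        KEY + "GACA" + COMMON + "ACGTACGT",
        KEY + "AGAG" + COMMON + "TTTT",
        KEY + "CCCC" + COMMON + "GGGG",  # tag not in the table
    ]
    print(demultiplex(reads, tags))
```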

Page 28: 10 Day Contour Clamped Homogeneous Electrophoretic Field (CHEF) Gels for Chromosome Isolation

[Figure: CHEF gel; labels: 3.5 Mb, 4.6 Mb, 5.7 Mb; S. pombe; Po OkAlf-8 in all 4 lanes; Chr. # 1 to 7]

• Excise individual chromosomal bands, freeze at -20 °C and then melt by heating to 65 °C.

• Mix 500 µl aliquots of TE-saturated phenol and melted gel and re-freeze at -20 °C

• Centrifuge at 2500 RPM in a table top centrifuge at -20 °C

• Remove aqueous layer and extract any residual phenol twice with water-saturated ether

• Precipitate with 2.5 vol of 95% ethanol/acetate, wash with 70% ethanol and dry the DNA

• Dissolve the DNA in 10 µl of 10:0.1 TE

Page 29: 10 Day Contour Clamped Homogeneous Electrophoretic Field (CHEF) Gels for Chromosome Isolation

Eluted & amplified chromosomes on a 1% agarose gel

[Figure: gel lanes labeled BAC, Hind3, 1 2 3 4 5 6 7, Hind3]

Qiagen REPLI-g Mini kit was used to amplify the chromosomes

• 2.5 µl of the purified chromosomal DNA was mixed with 2.5 µl of Qiagen denaturation buffer for 3 minutes at 25 °C, followed by mixing with 5 µl of Qiagen neutralization buffer.

• A master mix containing 10 µl nuclease-free water, 29 µl reaction buffer (containing dNTPs and exonuclease-resistant primers) and 1 µl of Qiagen's DNA polymerase was added to the treated chromosomal DNA and incubated at 30 °C overnight.

• The amplified chromosomal DNA product then was verified on a 1% agarose gel by electrophoresis and subjected to the mixed shotgun/paired-end sequencing, where over 90% of the sequences matched in our CRR (cotton root rot) database

Page 30: Summary of our use of CHEF gels for chromosome isolation and subsequent amplification for sequencing

• Using our long-established freeze/thaw phenol extraction protocol, individual chromosomes can be purified from chromosome-grade agarose CHEF gels and then amplified using the Qiagen REPLI-g Mini kit

• Sequence data can be obtained after library making, emPCR and massively parallel pyrosequencing on the 454/Roche GS-FLX, with over 90% of the sequences matching our target genome/fungal database

Page 31: Strategy of adding the 454/Roche MID-based Tags prior to BAC Pooling

• BAC growth in 96 deep-well microtiter plates

• Robotic BAC isolation via the cleared lysate protocol using the Hydra robot.

• Shear each BAC individually and create the paired end libraries on the Zymark SciClone robot.

• Individually tagged A linkers are added with B linkers prior to pooling 12 tagged libraries, followed by emPCR and half-plate sequencing of each pool.

Page 32: Strategy of adding the 454/Roche MID-based Tags prior to BAC Pooling

• 12 uniquely tagged individual shotgun libraries would be pooled and sequenced on each half 454/Roche GS-FLX PicoTiter plate, i.e. 24 tagged libraries per full plate

• 24 BACs of 150 Kb each require 3.6 Mb for 1x sequence coverage

• With >75 Mb of DNA sequence obtained per full plate, >20x coverage is obtained for each of the 24 pooled BACs

• 96 BACs would therefore require 4 full plate runs on the 454/Roche GS-FLX, and no ABI 3730 runs are needed to deconvolute the individual BACs as each BAC is individually tagged

• The BACs then are easily closed and finished using PCR-based methods.
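
The coverage arithmetic on this slide can be checked with a few lines of Python. The function names are illustrative; the inputs (24 BACs per plate, 150 Kb per BAC, 75 Mb per full plate, 96 BACs in total) are the figures quoted above, and the calculation assumes reads are spread evenly across the pooled BACs.

```python
import math


def pooled_coverage(plate_yield_mb, n_bacs, bac_size_kb):
    """Average fold coverage per BAC when n_bacs share one plate's yield."""
    pooled_size_mb = n_bacs * bac_size_kb / 1000  # total DNA at 1x, in Mb
    return plate_yield_mb / pooled_size_mb


def plates_needed(total_bacs, bacs_per_plate):
    """Number of full-plate runs needed, rounding up partial plates."""
    return math.ceil(total_bacs / bacs_per_plate)


if __name__ == "__main__":
    cov = pooled_coverage(plate_yield_mb=75, n_bacs=24, bac_size_kb=150)
    print(f"1x requirement: {24 * 150 / 1000} Mb, coverage: {cov:.1f}x")
    print(f"plates for 96 BACs: {plates_needed(96, 24)}")
```

In practice coverage varies between BACs in a pool, so the even-split figure is best read as an average rather than a guarantee for every clone.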

Page 33: Analysis of ordered and oriented combined shotgun/paired-end results

Our present strategy is to use the combined shotgun/paired-end pyrosequencing approach on the 454/Roche GS-FLX followed by PCR-based closure methods.

[Figure legend:]

• vector

• 454/Roche GS-FLX only assembled sequences

• repeat sequences missing in the 454 data but present in the 3730 and/or obtained by PCR-based closure

• Phrap-assembled ABI-3730 and 454/Roche GS-FLX sequences

• Un-joined 454 data, often with no missing base, but joined by 454 paired-ends and spanned by 3730 or PCR-based sequences

Page 34: Acknowledgments

• Collaborators
– Plant Virus studies
• Oklahoma State University: Ulrich Melcher, Vijay Muthamukar
• Noble Foundation: Marilyn Roossinck, Guoan Shen, Byoung Min, Rick Nelson, Tracy Feldman
– Phymatotrichopsis omnivora aka Cotton Root Rot Fungi
• Oklahoma State University: Steve Marek
• Noble Foundation: Carolyn Young
– Medicago truncatula
• University of Minnesota: Nevin Young, Roxanne Denny, Steven Cannon (now at Iowa State), Arvind Bhari, Shelly Wang
• The JCV Institute: Chris Town, Foo Cheung
• The John Innes Institute, UK: Giles Oldroyd & Sanger Institute: Jane Rogers
• Toulouse/INRA & Genoscope, France: Fredric Debelle, Francis Quetier
• Munich Bioinformatics Center IMGAG Consortium: Claus Mayer

• Funding from the NSF Plant Genome, Microbial and EPSCoR Programs and the USDA

Page 35: OU Genome Center Personnel

Nature gives up her secrets to the prepared mind, driving innovation

[Photo labels: Automation, Graham, Fares, Doug, Simone]

www.genome.ou.edu/proto.html