Next Generation DNA Sequencing Platforms: Evolving Tools for Cancer Research Norma Neff Bioengineering / Quake Lab Sequencing Core Director Stem Cell Institute SIM1 G1115 / G0821 [email protected]


Post on 23-Dec-2015


Page 1:

Next Generation DNA Sequencing Platforms: Evolving Tools for Cancer Research

Norma Neff
Bioengineering / Quake Lab
Sequencing Core Director

Stem Cell Institute
SIM1 G1115 / G0821
[email protected]

Page 2:

Emulsion PCR-based Sequencing Technologies

Sequencing Technologies

Sequencing By Synthesis

Single Molecule Sequencing Technologies

Recommended Reviews:
Michael Metzker (2010) Nature Reviews Genetics 11:31
Quail et al. (2012) BMC Genomics 13:341

Page 3:

Outline of Today’s Presentation: Sequencing by Synthesis

Next Gen Sequencing Sample or Library Preps

Review of Seq Technologies

Comparisons of Different Platforms

Summary and Final Thoughts

Page 4:

Design of Sequencing Samples or Libraries

Adapters are Ligated to Sample DNA to be sequenced = Library
Adapters are short (30-50bp) double-stranded oligos

Sequences of the adapters are specific to each seq platform

[Diagram: sample DNA flanked by adapters A1 and A2, with bar codes BC1 and BC2]

The adapters carry:
Sites for PCR primers to bind to amplify the Library
Sites for seq primers to bind to seq the sample DNA
Bar codes (6-12bp) for multiplexing libraries in a seq run
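The multiplexing idea can be sketched in a few lines of Python. This is a minimal illustration only, assuming the bar code is the first 6 bases of each read; the bar code sequences, sample names, and the `demultiplex` function are hypothetical, and each real platform reads its bar codes in its own way.

```python
from collections import defaultdict

# Hypothetical 6 bp bar codes mapped to sample names (illustrative only).
BARCODES = {"ACGTAC": "tumor", "TGCATG": "normal"}
BARCODE_LEN = 6

def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group reads by their leading bar code and trim the bar code off."""
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose bar code is not recognized go to a catch-all bin.
        sample = barcodes.get(tag, "unassigned")
        bins[sample].append(insert)
    return dict(bins)

reads = ["ACGTACGGATTC", "TGCATGCCAAGT", "ACGTACTTTGCA", "NNNNNNACGTAA"]
result = demultiplex(reads)
for sample, inserts in result.items():
    print(sample, inserts)
```

Exact matching is used here for simplicity; longer bar codes (up to 12bp on the slide) leave room for tolerating a sequencing error in the bar code itself.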

Page 5:

Sequencing by Synthesis: Bases are added to DNA Molecules at the 3’ OH end of the Chain

Page 6:

Emulsion PCR – Library DNA is amplified in an Oil Droplet

Roche 454 (pyrosequencing):
• Beads are spun into wells on a plate
• Flows one dNTP at a time
• Detects PPi Release
• By Coupled Luciferase Rxn
• Light Intensity = Base addition

Ion Torrent:
• Beads are spun into wells of chip
• Flows one dNTP at a time
• Detects H+ Release
• pH change = Base addition
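Because only one dNTP is flowed at a time, a run of identical bases is incorporated in a single flow and the signal scales with the length of the run. The Python sketch below simulates an idealized, noise-free flowgram; the flow order `TACG`, the function names, and the example sequence are assumptions for illustration, not taken from the slides.

```python
def flowgram(new_strand, flow_order="TACG", n_flows=8):
    """Ideal signal per flow = number of bases incorporated in that flow.

    `new_strand` is the strand being synthesized, written 5'->3'.
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        count = 0
        # A homopolymer run is incorporated entirely within one flow.
        while pos < len(new_strand) and new_strand[pos] == base:
            count += 1
            pos += 1
        signals.append((base, count))
    return signals

def call_bases(signals):
    """Turn (base, count) flow signals back into a sequence."""
    return "".join(base * count for base, count in signals)

signals = flowgram("TTAGGGC")
print(signals)
print(call_bases(signals))

# If the GGG signal is misjudged as 2 instead of 3, the call loses a base:
miscalled = [(b, 2 if (b, c) == ("G", 3) else c) for b, c in signals]
print(call_bases(miscalled))
```

The last step shows why these platforms report indels in homopolymers as their characteristic error: misjudging signal intensity adds or drops a base rather than substituting one.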

Page 7:

Roche 454 Benchtop Sequencers – 400bp Readlengths / Reliable Chemistry
Requires most time from Library to Machine Loading

First Technology to Incorporate Bar Coding of Libraries

Roche 454 GS FLX+ Titanium
Output = 1 million reads; 400-700 Mb
Read Length = 400 bases (700 bases)
Run Time = 8-23 hours
Error Profile = Indels in Homopolymers

GS Junior
Output = 70k-100k reads; 30 Mb
Read Length = 400 bases
Run Time = 10 hours
Error Profile = Indels in Homopolymers

Page 8:

Ion Torrent = Desktop Sequencers for Low and High Sequence Output

PGM
Output 10-500M bases
Read Length = 200 bases
Run Time = 1-3 hours
Error Profile = Indels in Homopolymers

Ion Proton I
Output 10 G bases
Read Length = 200 bases
Run Time = 4 hours
Error Profile = Indels in Homopolymers

Coming soon: Proton II and III

300-400 base reads

Page 9:

[Figure: three adenosine triphosphate structures with the 5’, 3’ and 2’ sugar positions labeled]

dATP vs ATP (structure shown: 3’ OH, 2’ H)

ddATP vs dATP (structure shown: 3’ H, 2’ H)
Irreversible Terminator, Sanger Sequencing

Reversible Terminators & Cleavable Fluorescent Tags (structure shown with groups labeled ON3 and X)

Page 10:

Solid Phase Amplification – Library DNA binds to Oligos Immobilized on Glass Flowcell Surface

[Figure label: V3 HiSeq]

• Clusters are Linearized
• Seq primer annealed
• All four dNTPs added at each cycle
• Each dNTP has a different **Fluorescent Tag**
• Intensity of different Tags = Base call
• Error Profile = substitutions
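The "intensity of different tags = base call" step can be sketched as picking the brightest of four channels at each cycle. This is a deliberately simplified Python illustration with made-up intensity values and hypothetical function names; production base callers also correct for effects such as crosstalk between dyes, which is not modeled here.

```python
CHANNELS = ("A", "C", "G", "T")

def call_cycle(intensities):
    """Return the base whose fluorescent tag is brightest in one cycle."""
    return max(CHANNELS, key=lambda base: intensities[base])

def call_read(cycles):
    """Call one base per cycle for a single cluster."""
    return "".join(call_cycle(c) for c in cycles)

# Made-up per-cycle intensities for one cluster (arbitrary units).
cycles = [
    {"A": 0.90, "C": 0.10, "G": 0.05, "T": 0.10},
    {"A": 0.10, "C": 0.20, "G": 0.80, "T": 0.10},
    {"A": 0.45, "C": 0.50, "G": 0.10, "T": 0.10},  # close call between A and C
]
read = call_read(cycles)
print(read)
```

Exactly one base is called per cycle, so an ambiguous cycle like the third one yields a wrong base rather than a missing or extra one, which is consistent with the substitution error profile listed above.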

Page 11:

Evolution of Solexa / Illumina Sequencing Platform

GA II (2006)
Output 1 million 1x36bp reads / lane
Improved chemistry to 10 million / lane
Paired end reads to 2x150bp

HiSeq 2000 (2010), V3
Output 30 - 40 Gb / lane
Read Length = 100 bases SR or 2x100 PR
Accommodates Dual Bar codes
Run Time = 2-14 days
Error Profile = substitutions

HiSeq 2500 = 2x150 (2x250)
600 million reads / 39 hours

Page 12:

MiSeq – QC Libraries and 250bp Reads

[Figure labels: V3 HiSeq, MiSeq]

V1 Runs
1x50bp + 1 or 2 bar codes (6 hrs)
2x150bp + bar codes (28 hrs)
10M reads = 1G bases

V2 Runs – Use Top and Bottom of Lane
2x250bp + bar codes (39 hrs)
15M reads = 7G bases

Accommodates Dual Bar Codes

• Uses single reagent cassette and buffer bottle
• Same paired end libraries on all Illumina seqs
• Has additional options for BaseSpace data storage system and alignment software
• Real time run monitoring and data sharing
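The relationship between read count, read length, and total output is simple arithmetic. The Python sketch below checks the V2 figure, under the assumption that "15M reads" counts clusters that are each read from both ends; the function name is hypothetical.

```python
def run_yield_bases(n_reads, read_length, paired=False):
    """Total bases = reads x read length x (2 if paired end, else 1)."""
    ends = 2 if paired else 1
    return n_reads * read_length * ends

# MiSeq V2 run from the slide: 15M reads at 2x250bp.
v2_bases = run_yield_bases(15_000_000, 250, paired=True)
print(f"{v2_bases / 1e9:.1f} G bases")
```

Under that assumption the result is 7.5 G bases, in line with the roughly 7G bases quoted on the slide.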

Page 13:

Single Molecule Imaging: Heavy Metal Battle Royale
Short Reads & High Output vs Long Reads & Low Output

Page 14:

Helicos Genetic Analysis System

[Workflow figure labels]
Sample Preparation
dA Tailing TdT
Does not use ligation or PCR amplification
HeliScope™ Sample Loader
Oligo dT on Flowcell
HeliScope™ Single Molecule Sequencer
600 – 900 Million Aligned Bases per lane x 50 lanes
HeliScope™ Analysis Engine
20Tb

Output
>GATAGCTAGCTAGCTACACAGAGAT
>GATAGACACACACACACACAGCGCA
>GTACTACACACAGCGACACAGTCTA
>GTCGAACACACATGAACACATGAGC
>GTGTCACACACGACTACACATGCAT
>TAGTGACACACGTAGACACGACAGT
>TCTCGACACACTATCACACGACTCA
>TGCACACACACTCGTACACGAGACG

• 33bp Avg Reads; 1-10 Gb; 8 day Run
• Uses Terminal transferase to add poly dA tail
• Flows one nucleotide at a time – Error Profile = Indels
• DNA quality not an important factor – ancient DNA
• Can do Direct RNA Sequencing – 3’ ends
• Custom Seq Capture Flowcells
• Primarily a sequencing service company

Page 15:

Pacific Biosciences RS: Real Time Movies of Nucleotide-binding by DNA Polymerase

[Figure labels: Adapter, Mapped Read Length, Subreads]

Page 16:

PacBio Technology Makes Base Calls on How Long the Base Stays in the Active Site

Output = 50k Reads; 100 Mb per SMRT Cell (16 max per run)
Read Length = 2000 bases
Run Time = 90min per SMRT Cell
Error Profile = Indels

Page 17:

Update of PacBio Progress 2011 – 2012

| Year | Mean Mapped Subread | Mean Mapped Readlength | Mapped Reads per Cell | Mean Mapped SubRead Accuracy | Mapped bases per cell | Movies per SMRT cell | Max Time per Movie | Strobe Seq |
|---|---|---|---|---|---|---|---|---|
| 2011 | 470bp | 550bp | 14k | 85% | 5Mb | 1 | 30min | yes |
| 2012 | 675bp | 1914bp | 48k | 89% | 92Mb | 2 | 90min | no |

Page 18:

Cost of $equencing
Reagents
Library Construction
Quality Assessment
Accessory Equipment and Supplies
Labor to get samples on the machine
Machine maintenance / service contracts
Computational Requirements
Data Storage

| Technology | Run Type | Est Cost / MB |
|---|---|---|
| Roche 454 | Full Plate 400bases | $30 |
| Ion Torrent PGM | 316 chip 200bases | $3 |
| Illumina HiSeq | 2x100 bases | $0.01 |
| Illumina MiSeq | 2x150 bases | $2 |
| PacBio | 45,000 / 90min cell | $0.20 |
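The per-megabase estimates make it easy to compare platforms for a project of a given size. The Python sketch below multiplies the table's values by a target output; the 1,000 Mb project size and the function name are hypothetical, and the result ignores any cost item not captured in the slide's per-MB estimate.

```python
# Estimated cost per megabase, copied from the table above (US dollars).
COST_PER_MB = {
    "Roche 454": 30.00,
    "Ion Torrent PGM": 3.00,
    "Illumina HiSeq": 0.01,
    "Illumina MiSeq": 2.00,
    "PacBio": 0.20,
}

def estimated_cost(platform, megabases):
    """Rough sequencing cost = cost per megabase x megabases needed."""
    return COST_PER_MB[platform] * megabases

target_mb = 1000  # hypothetical project needing 1,000 Mb of sequence
costs = {p: estimated_cost(p, target_mb) for p in COST_PER_MB}
for platform, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{platform:16s} ${cost:,.2f}")
```

Cost per base is only one axis; the cheapest platform per megabase in this table is also the one with multi-day run times on the earlier slides.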

Page 19:

Two Strategies for Sequencing: Depth of Coverage vs Speed

Depth of Coverage (20-100 million reads with good quality scores) = Discovery

vs

Speed (1-24 hours run time) = Validation and Diagnosis

Accuracy / Seq Error Profiles / Bioinformatic Tools
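Depth of coverage follows from read count, read length, and the size of the region being sequenced. The Python sketch below uses the standard average-depth estimate (total sequenced bases divided by target size); the 50 Mb target and 100-base read length are hypothetical inputs, and the estimate assumes every read maps to the target and coverage is uniform.

```python
import math

def mean_coverage(n_reads, read_length, target_size):
    """Average depth = total sequenced bases / size of the target."""
    return n_reads * read_length / target_size

def reads_needed(depth, read_length, target_size):
    """Reads required to reach a given mean depth, rounded up."""
    return math.ceil(depth * target_size / read_length)

target = 50_000_000  # hypothetical 50 Mb target region
depth = mean_coverage(20_000_000, 100, target)
needed = reads_needed(100, 100, target)
print(f"20M reads of 100 bases give {depth:.0f}x mean coverage")
print(f"{needed:,} reads are needed for 100x")
```

For this hypothetical target, both figures land inside the 20-100 million read range the slide associates with discovery work.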

Page 20:

Summary and Final Thoughts

Sequencing Technologies Keep Evolving

Plan your sequencing experiments based on the data set you need
Consider size of data set, read accuracy, cost and speed

Choose your platform appropriately

Work smarter – be imaginative and what seems impossible today can be the standard tomorrow