

Aaron Liston, Oregon State University

Botany 2012 Intro to Next Generation Sequencing Workshop


Growth in Next-Gen Sequencing Capacity

[Figure: sequencing output (bp), y-axis from 0.0E+00 to 3.5E+11, plotted by year (2002-2010) for ABI 3730xl, 454 GS20, Solexa 1G, ABI SOLiD, Illumina GAII, Illumina HiSeq and Illumina GAIIx. Adapted from Mardis, 2011, Nature.]

Platforms

Slide courtesy of Rich Cronn

NGS Library Types

Fragment Library

Paired-end Library: separation of 200-500 bp

Mate-paired Library: original separation of 2-5 kb

http://www.appliedbiosystems.com

454 (2005)

Template Type:     Clonally amplified by emulsion PCR
Sequencing Method: Sequencing by synthesis using single nucleotide addition
Imaging Method:    Bioluminescence with charge coupled device (CCD) camera

Ansorge, W. 2009. New Biotechnology 25:195-203. http://www.rps.psu.edu/indepth/graphics/sequencing_small.jpg


454

Instrument               Run time   Millions of reads/run   Bases/read   Yield (Mb/run)
454 GS Jr. Titanium      10 hrs     0.1                     400          50
Ion Torrent – 314 chip   2.5 hrs    0.25                    200          50
454 FLX Titanium         10 hrs     1                       400          400
454 FLX+                 20 hrs     1                       650          650

2012 NGS Field Guide. www.molecularecologist.com

Illumina (Solexa) 2007

Template Type:     Clonally amplified by solid phase amplification
Sequencing Method: Sequencing by synthesis with cyclic reversible termination
Imaging Method:    Four color imaging of single events using fluorescence

http://www.illumina.com/systems/hiseq_2000.ilmn

Clonally Amplified Templates

Solid-phase Amplification

Metzker, M. 2010. Nature Reviews Genetics 11:31-46.

Cyclic reversible termination (CRT)

Metzker, M. 2010. Nature Reviews Genetics 11:31-46.

Illumina

Instrument            Run time    Millions of reads/run   Bases/read   Yield (Mb/run)
Illumina MiSeq        26 hrs      4                       150+150      1200
Illumina GAIIx        14 days     300                     150+150      96,000
Illumina HiSeq 1000   8.5 days    ≤1500                   100+100      ≤300,000
Illumina HiSeq 2000   11.5 days   ≤3000                   100+100      ≤600,000

2012 NGS Field Guide. www.molecularecologist.com
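As a rough cross-check on the yield column, yield in Mb is approximately the number of reads (in millions) multiplied by the bases sequenced per read, counting both ends of a paired read. The sketch below is illustrative only: the helper name `approx_yield_mb` is an assumption and not part of the workshop material, and the published yields come from the Field Guide, so small mismatches (for example GAIIx) are to be expected.

```python
def approx_yield_mb(reads_millions, read_lengths):
    """Approximate yield in Mb: millions of reads x total bases per read.

    read_lengths holds one entry per sequenced end, e.g. (150, 150)
    for a 150+150 paired-end run. Hypothetical helper, not from the slides.
    """
    return reads_millions * sum(read_lengths)


# Reads/run and bases/read taken from the Illumina table above.
print(approx_yield_mb(4, (150, 150)))     # MiSeq
print(approx_yield_mb(300, (150, 150)))   # GAIIx (the table lists 96,000)
print(approx_yield_mb(3000, (100, 100)))  # HiSeq 2000, upper bound
```

The same product reproduces the Ion Torrent yields (for example 0.25 million reads of 200 bases on the 314 chip).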


SOLiD (2008)

Template Type:     Clonally amplified by emulsion PCR
Sequencing Method: Sequencing by ligation
Imaging Method:    Four color imaging of single events by CCD camera

SOLiD

Instrument       Run time   Millions of reads/run   Bases/read   Yield (Mb/run)
SOLiD – 5500xl   8 days     >1,410                  75+35        155,100

2012 NGS Field Guide. www.molecularecologist.com

Ion Torrent (2010)

Semiconductor Sequencing

http://www.iontorrent.com/the-simplest-sequencing-chemistry/

Ion Torrent

Instrument               Run time   Millions of reads/run   Bases/read   Yield (Mb/run)
Ion Torrent – 314 chip   2.5 hrs    0.25                    200          50
Ion Torrent – 316 chip   3 hrs      1.6                     200          320
Ion Torrent – 318 chip   4.5 hrs    4                       200          800

2012 NGS Field Guide. www.molecularecologist.com


Ion Torrent Proton (2012)

Real Time Sequencing by Synthesis

Pacific Biosciences (2010)

circular consensus

Run time, Reads and Yield for Current NGS Instruments

Instrument               Run time    Millions of reads/run   Bases/read    Yield (Mb/run)
3730xl (capillary)       2 hrs       0.000096                650           0.06
PacBio RS                2 hrs       0.01                    860 – 1,500   5-10
454 GS Jr. Titanium      10 hrs      0.1                     400           50
Ion Torrent – 314 chip   2.5 hrs     0.25                    200           50
454 FLX Titanium         10 hrs      1                       400           400
454 FLX+                 20 hrs      1                       650           650
Ion Torrent – 316 chip   3 hrs       1.6                     200           320
Illumina MiSeq           26 hrs      4                       150+150       1200
Ion Torrent – 318 chip   4.5 hrs     4                       200           800
Illumina GAIIx           14 days     300                     150+150       96,000
SOLiD – 5500xl           8 days      >1,410                  75+35         155,100
Illumina HiSeq 1000      8.5 days    ≤1500                   100+100       ≤300,000
Illumina HiSeq 2000      11.5 days   ≤3000                   100+100       ≤600,000

2012 NGS Field Guide. www.molecularecologist.com

Oxford Nanopore (2012)

Strand Sequencing

64 triplet signals

Exonuclease Sequencing


GridION

MinION

Oxford Nanopore (2012)

NGS Error Rates

Platform             Primary Errors   Single-pass Error Rate (%)   Final Error Rate (%)
3730xl (capillary)   Substitution     0.1-1                        0.1-1
454                  Indel            1                            1
Illumina             Substitution     ~0.1 (85% of reads)          ~0.1 (85% of reads)
SOLiD                A-T bias         ~5                           ≤0.1
Ion Torrent          Indel            ~1                           ~1
PacBio RS            CG deletions     ~15                          ≤15
Oxford Nanopore      Deletions        ≥4                           4

2012 NGS Field Guide. www.molecularecologist.com
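To put the per-base error rates in terms of whole reads, the sketch below converts a rate into the expected number of errors per read and the chance that a read contains no errors at all. It assumes errors are independent and uniform along the read, which is a simplification and not a claim from the Field Guide. The function names are hypothetical, and the read lengths are borrowed from the run-time tables (454 FLX Titanium, Illumina HiSeq per end, PacBio RS upper bound).

```python
def expected_errors(error_rate_percent, read_length):
    """Expected number of erroneous bases in one read."""
    return error_rate_percent / 100.0 * read_length


def prob_error_free(error_rate_percent, read_length):
    """Probability of a read with zero errors, assuming independent errors."""
    return (1.0 - error_rate_percent / 100.0) ** read_length


# (platform, single-pass error rate in %, bases per read)
examples = [
    ("454", 1.0, 400),
    ("Illumina", 0.1, 100),
    ("PacBio RS", 15.0, 1500),
]

for platform, rate, length in examples:
    errors = expected_errors(rate, length)
    clean = prob_error_free(rate, length)
    print(f"{platform}: {errors:.1f} expected errors/read, "
          f"P(error-free) = {clean:.3g}")
```

Under these assumptions a 1% rate over 400 bases gives about four errors per read, which is why read length and error rate have to be weighed together.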

Run time, Reads & Yield for Current and Announced NGS Instruments

Instrument                             Run time          Millions of reads/run   Bases/read    Yield (Mb/run)
ABI 3730xl (capillary)                 2 hrs             0.000096                650           0.06
PacBio RS                              2 hrs             0.01                    860 – 1,500   5-10
454 GS Jr. Titanium                    10 hrs            0.1                     400           50
Oxford Nanopore MinION (2012)          6 hrs or less     [0.1]                   [9,000]       [1000]
Ion Torrent – 314 chip                 2.5 hrs           0.25                    200           50
454 FLX Titanium                       10 hrs            1                       400           400
454 FLX+                               20 hrs            1                       650           650
Ion Torrent – 316 chip                 3 hrs             1.6                     200           320
Illumina MiSeq                         26 hrs            4                       150+150       1200
Ion Torrent – 318 chip                 4.5 hrs           4                       200           800
Oxford Nanopore GridION 2000 (2012)    [6 hrs or less]   [4]                     [10,000]      [40,000]
Oxford Nanopore GridION 8000 (2013)    [6 hrs or less]   [10]                    [10,000]      [100,000]
Illumina MiSeq upgrade (2012)          [36 hrs]          15                      250+250       7000
Ion Torrent – Proton I (2012)          4 hrs             [50]                    [200]         [40,000]
Ion Torrent – Proton II (2013)         4 hrs             [250]                   [400]         [100,000]
Illumina GAIIx                         14 days           300                     150+150       96,000
Illumina HiSeq 2500 mini-cell (2012)   42 hrs            600                     150+150       180,000
SOLiD – 5500xl                         8 days            >1,410                  75+35         155,100
Illumina HiSeq 1000                    8.5 days          ≤1500                   100+100       ≤300,000
Illumina HiSeq 2000                    11.5 days         ≤3000                   100+100       ≤600,000

Grey = based on company sources. Brackets = speculation.

2012 NGS Field Guide. www.molecularecologist.com
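Because run times range from hours to nearly two weeks, yield per run alone can hide differences in throughput. The sketch below normalizes yield to Mb per day of instrument time. It is a simplified illustration: the function name `mb_per_day` is hypothetical, and the calculation ignores library preparation, instrument turnaround and failed runs.

```python
def mb_per_day(yield_mb, run_hours):
    """Yield per 24 hours of instrument run time (sample prep not counted)."""
    return yield_mb / run_hours * 24.0


# (yield in Mb/run, run time in hours), taken from the table above.
runs = {
    "Ion Torrent – 318 chip": (800, 4.5),
    "Illumina MiSeq": (1200, 26),
    "Illumina HiSeq 2000": (600000, 11.5 * 24),
}

for name, (yield_mb, hours) in runs.items():
    print(f"{name}: {mb_per_day(yield_mb, hours):,.0f} Mb/day")
```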

How much will it cost?

Instrument                                     Reagent Cost/run (a)   Reagent Cost/Mb   Minimum Unit Cost (% run)
ABI 3730xl (capillary)                         $144                   $2308             $6 (1%)
PacBio RS                                      $300-1700 (c)          $7-38             $500 (100%)
454 GS Jr. Titanium                            $1100                  $22               $1500 (100%)
454 FLX Titanium                               $6,200                 $12               $2000 (12%)
454 FLX+ (d)                                   $6,200                 $7                $2000 (12%)
Ion Torrent – 314 chip                         $350                   $7                ~$750 (100%)
Ion Torrent – 316 chip                         $550                   $2                ~$1000 (100%)
Oxford Nanopore MinION (2012)                  ≤$900                  $1                ~$1100 (10%)
Illumina MiSeq                                 $1160                  $1                ~$1400 (100%)
Ion Torrent – 318 chip                         $750                   $1                ~$1200 (100%)
Illumina GAIIx                                 $17,575                $0.19             $3000 (14%)
Illumina HiScanSQ                              $12,750                $0.09             $3000 (14%)
Ion Torrent – Proton I (2012)                  $1000                  $0.09             ? (100%)
SOLiD – 5500xl                                 $10,503                <$0.07            $2000 (12%)
Illumina HiSeq 1000                            $10,220                $0.04             $3000 (12%)
Illumina HiSeq 2000                            $23,470 (d)            ≥$0.04            $3000 (6%)
Illumina HiSeq 2500 or MiSeq upgrades (2012)   ?                      ?                 ?
Oxford Nanopore GridION 2000 (2012)            varies                 $0.03-0.04        ? (≤1%)
Oxford Nanopore GridION 8000 (2013)            varies                 $0.01-0.02        ? (≤1%)
Ion Torrent – Proton II (2013)                 [$1000]                [$0.01]           ? (100%)

Minimum unit cost includes all stages of sample prep for a single sample (i.e., library prep through sequencing; capillary = sequencing only).

2012 NGS Field Guide. www.molecularecologist.com
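The reagent cost per Mb column appears to follow from dividing the reagent cost per run by the yield per run in the run-time tables. The sketch below shows that arithmetic for three instruments. It is an illustration under that assumption rather than the Field Guide's stated method, and the function name is hypothetical.

```python
def reagent_cost_per_mb(cost_per_run_usd, yield_mb):
    """Reagent cost per Mb: cost of one run divided by its yield in Mb."""
    return cost_per_run_usd / yield_mb


# (reagent cost per run in USD, yield in Mb/run), from the tables above.
instruments = {
    "454 GS Jr. Titanium": (1100, 50),
    "Ion Torrent – 314 chip": (350, 50),
    "Illumina HiSeq 2000": (23470, 600000),
}

for name, (cost, yield_mb) in instruments.items():
    print(f"{name}: ${reagent_cost_per_mb(cost, yield_mb):.2f} per Mb")
```

The results for these three instruments agree with the listed $22, $7 and roughly $0.04 per Mb once rounded.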

NGS Technology Summary

Platform              Year   Sequencing Method   Amplification            Detection                 Features
454                   2005   Pyrosequencing      Emulsion PCR             Light                     First NGS
Illumina              2007   Synthesis           Bridge PCR               Light                     90% of Market
SOLiD                 2008   Ligation            Emulsion PCR             Light                     Lowest Error Rate
Ion Torrent           2010   Synthesis           Emulsion PCR             Hydrogen Ion              Semiconductor Chip
Pacific Biosciences   2010   Synthesis           None = Single Molecule   Light                     Anchored Polymerases
Oxford Nanopore       2012   Nanopore            None = Single Molecule   Electrical Conductivity   “Run Until” Sequencing

Modified from Travis C. Glenn. 2011. Field guide to next-generation DNA sequencers. Molecular Ecology Resources 11: 759-769.


Instrument purchase, additional instrument and service agreement costs.

Instrument                                   Purchase Cost   Additional Instruments   Service Contract
ABI 3730xl (capillary)                       $376,000        -                        $19,800
454 GS Jr. Titanium                          $108,000        $16,000                  $12,600
454 FLX to FLX+ upgrade                      $29,500         -                        -
454 FLX+                                     $450,000        $30,000                  $50,000
PacBio RS                                    $695,000        -                        $85,000
Ion Torrent – (314/316/318 chips)            $49,000         $16,000*                 $9,900*
Ion Torrent – Proton (2012)                  $149,000        $16,000*                 $32,000*
SOLiD – 5500xl                               $251,000        $54,000                  $44,400
Illumina MiSeq                               $125,000        -                        $12,500
Illumina MiSeq upgrade (2012)                $0              -                        -
Illumina HiScanSQ                            $405,000        $55,000                  $41,500
Illumina GAIIx                               $250,000        $100,000                 $44,500
Illumina HiSeq 1000                          $560,000        $55,000                  $62,000
Illumina HiSeq 1000 to 2000 upgrade          $175,000        -                        -
Illumina HiSeq 2000                          $690,000        $55,000                  $75,900
Illumina HiSeq 2000 to 2500 upgrade (2012)   $50,000         -                        -
Illumina HiSeq 2500 (2012)                   $690,000        $55,000                  $75,900
Oxford Nanopore MinION (2012)                $0              $0                       $0
Oxford Nanopore GridION 2000 (2012)          [$30,000]??     ?                        ?
Oxford Nanopore GridION 8000 (2013)          [$30,000]??     ?                        ?

*Includes optional OneTouch template preparation instrument.

2012 NGS Field Guide. www.molecularecologist.com
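For budgeting, the three cost columns can be summed into a rough first-year capital figure. The sketch below does that under two assumptions that the slide does not state: that the service contract figure is an annual price, and that a "-" entry means no cost. The function name `first_year_cost` is hypothetical, and reagents, staff and computing are not included.

```python
def first_year_cost(purchase, additional=0, service=0):
    """Purchase + additional instruments + one year of service contract (USD).

    Assumes the service contract is priced per year and that '-' entries
    in the table mean zero; neither is stated on the slide.
    """
    return purchase + additional + service


print(first_year_cost(125_000, 0, 12_500))        # Illumina MiSeq
print(first_year_cost(49_000, 16_000, 9_900))     # Ion Torrent (314/316/318 chips)
print(first_year_cost(690_000, 55_000, 75_900))   # Illumina HiSeq 2000
```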

Required Computational Resources

Instrument                            Computational Resources              Data File Sizes (GB)
3730xl (capillary)                    $2,000 desktop                       0.03
454 GS Jr. Titanium                   $5,000 desktop                       <3 images, <1 sff
454 FLX Titanium                      $5,000 desktop                       20 images, 4 sff
454 FLX+                              $5,000 desktop                       40 images, 8 sff
PacBio RS                             $65,000 cluster                      20 pulsed, 2 Fastq
Ion Torrent – 314 chip                $16,500 desktop server               0.1 Fastq
Ion Torrent – 316 chip                $16,500 desktop server               0.6 Fastq
Ion Torrent – 318 chip                $16,500 desktop server               [small]
Ion Torrent – Proton I (2012)         $75,000 cluster                      [big]
Ion Torrent – Proton II (2013)        $75,000 cluster                      [big]
SOLiD – 5500xl                        $35,000 cluster                      148
Illumina MiSeq                        $5,000 desktop or BaseSpace cloud    1
Illumina HiScanSQ                     $222,000 cluster (or DIY for less)   50
Illumina GAIIx                        $222,000 cluster (or DIY for less)   600
Illumina HiSeq 1000                   $222,000 cluster (or DIY for less)   300
Illumina HiSeq 2000                   $222,000 cluster (or DIY for less)   600
Illumina HiSeq 2500 (2012)            $222,000 cluster (or DIY for less)   [big]
Oxford Nanopore MinION (2012)         laptop                               [small]
Oxford Nanopore GridION 2000 (2012)   ?                                    [small to big]
Oxford Nanopore GridION 8000 (2013)   ?                                    [small to big]

Desktops assume higher-end models with multiple processors, ≥8 GB RAM and ≥1 TB HD.

2012 NGS Field Guide. www.molecularecologist.com

Primary Advantages and Disadvantages – Current Platforms

3730xl (capillary)
Advantages: Low cost for very small studies.
Disadvantages: Very high cost for large amounts of data.

454 GS Jr. Titanium
Advantages: Long read length. Low capital cost. Low cost per experiment.
Disadvantages: High cost per Mb.

454 FLX+
Advantages: Double the maximum read length of Titanium.
Disadvantages: High capital cost. High cost per Mb. Reagent issues. Upgrade issues.

PacBio
Advantages: Single molecule real-time sequencing. Longest available read length. Short instrument run time. Low cost per sample.
Disadvantages: High error rates. Low total number of reads per run. High cost per Mb. High capital cost. Many methods still in development. Weak company performance.

Ion Torrent – 314/316/318
Advantages: Low cost per sample for small studies. Fast runs. Semiconductor chips. Instrument with few moving parts.
Disadvantages: Higher error rate than Illumina. Higher cost per Mb. Long sample prep.

SOLiD – 5500xl
Advantages: Each lane of Flow-Chip can be run independently. High accuracy. Ability to rescue failed sequencing cycles. 96 validated barcodes per lane. High throughput.
Disadvantages: Not likely to be sold very long after the Ion Torrent Proton comes to market. Relatively short reads. More gaps in assemblies than Illumina data. Less even data distribution than Illumina. High capital cost.

Illumina MiSeq
Advantages: Low cost instrument and runs. Low cost/Mb for a small platform. Fastest Illumina run times and longest Illumina read lengths.
Disadvantages: Relatively few reads and higher cost/Mb compared to other Illumina platforms.

Illumina HiSeq
Advantages: One or two independent flow cells. Most reads, Gb per day and Gb per run. Lowest cost per Mb.
Disadvantages: High capital cost. High computation needs.

2012 NGS Field Guide. www.molecularecologist.com

Primary Advantages and Disadvantages – New Platforms

Ion Torrent – Proton
Advantages: Moderately low-cost instrument for high throughput applications. Cost/Mb approaching HiSeq.
Disadvantages: Error rate likely higher than Illumina. Higher cost/Mb than HiSeq.

Illumina MiSeq Upgrade
Advantages: Same as MiSeq, but 3X more reads and 250+250 paired ends. Free upgrade.
Disadvantages: Reagent costs not announced yet, but likely to be higher than current MiSeq.

Illumina HiSeq 2500
Advantages: Same as HiSeq 2000, but can also run two 2-lane mini flow cells to achieve much faster run times and longer read lengths.
Disadvantages: Mini flow cell will have a higher cost per read than standard flow cell. Can’t run mini and standard flow cells together.

Oxford Nanopore MinION
Advantages: No instrument. USB powered. No sample processing required. Could be used in the field.
Disadvantages: No data publicly available. High cost per Mb relative to other Nanopore sequencers.

Oxford Nanopore GridION
Advantages: Extremely long reads are feasible. Low-cost instrument (node). Error rate doesn’t increase along the length of the read. Real time analysis allows “run until” sequencing.
Disadvantages: No data publicly available. Announced 4% error rates. Single use cartridges may require serial sequencing for efficiency.

2012 NGS Field Guide. www.molecularecologist.com

Utility (according to Travis Glenn, Univ. Georgia) of currently available DNA sequencing platforms for de novo assembly

454 – GS Jr.
BACs, plastids, & microbial genomes: B – good but expensive
Transcriptome: C – need multiple runs, expensive
Plant & animal genome: D – cost prohibitive

454 – FLX+
BACs, plastids, & microbial genomes: A – good, need to multiplex to be economical
Transcriptome: A/B – good but expensive, not best for short RNAs
Plant & animal genome: B/C – good as part of a mixed platform strategy, expensive to use alone

MiSeq
BACs, plastids, & microbial genomes: B – good, assembly more challenging than 454
Transcriptome: B/A – may need multiple runs, assembly more challenging than 454, longer reads may make it the best
Plant & animal genome: C – expensive, use to validate libraries for HiSeq

HiSeq 2000
BACs, plastids, & microbial genomes: B/C – more data than needed unless highly indexed. Assembly more challenging than 454
Transcriptome: A/B – good, assembly more challenging than 454 but much more data available for analyses
Plant & animal genome: A – primary data type in many current projects. Requires mate-pair libraries

Ion Torrent – 314
BACs, plastids, & microbial genomes: C – reads are shorter than Illumina & as expensive as 454
Transcriptome: C – reads are shorter than Illumina & as expensive as 454
Plant & animal genome: D – cost prohibitive, reads shorter than alternatives

Ion Torrent – 318
BACs, plastids, & microbial genomes: B – good, data more challenging to assemble than 454 or Illumina
Transcriptome: B/C – good, data more challenging to assemble than 454 or Illumina
Plant & animal genome: C – high cost, data more challenging to assemble than 454 or Illumina

Utility grades combine data characteristics (amount, quality, length), cost of data, and ease of assembling the data into the final desired product.

2012 NGS Field Guide. www.molecularecologist.com

Utility (according to Travis Glenn, Univ. Georgia) of currently available DNA sequencing platforms for resequencing

454 – GS Jr.
Targeted loci: B – good but expensive, need to limit loci
Transcript counting: D – cost prohibitive
Genome resequencing: D – cost prohibitive for large genomes

454 – FLX+
Targeted loci: B – good but expensive, should limit loci
Transcript counting: D – cost prohibitive
Genome resequencing: C/D – cost prohibitive for large genomes

MiSeq
Targeted loci: A/B – good, fewer and higher cost reads than HiSeq
Transcript counting: B – more expensive than HiSeq or SOLiD
Genome resequencing: C – expensive for large genomes

HiSeq 2000
Targeted loci: A – primary data type in many current projects. Best for many loci
Transcript counting: A – primary data type in many current projects
Genome resequencing: A – primary data type in many current projects

Ion Torrent – 314
Targeted loci: C – OK but expensive, need to limit loci
Transcript counting: D – cost prohibitive
Genome resequencing: D – cost prohibitive

Ion Torrent – 318
Targeted loci: B – good, slightly less data per run than MiSeq
Transcript counting: B/C – more expensive than HiSeq or SOLiD. New informatics pipelines needed. New error profile
Genome resequencing: C – expensive for large genomes

Utility grades combine data characteristics (amount, quality, length), cost of data, and ease of assembling the data into the final desired product.

2012 NGS Field Guide. www.molecularecologist.com