Next Generation Sequencing (Itai Sharon, November 11th, 2009, Introduction to Bioinformatics)


Next Generation Sequencing

Itai Sharon, November 11th, 2009, Introduction to Bioinformatics


Sequencing the Human Genome

[Figure: log10(price) versus year, 2000 to 2010]

• 2001: Human Genome Project, 2.7G$, 11 years
• 2001: Celera, 100M$, 3 years
• 2007: 454, 1M$, 3 months
• 2008: ABI SOLiD, 60K$, 2 weeks
• 2009: Illumina, Helicos, 40-50K$
• 2010: 5K$, a few days?
• 2012: 100$, <24 hrs?
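The plot on this slide uses a log10 price axis, which is why a drop of several orders of magnitude fits on one chart. As a rough sketch, the snippet below converts the slide's dollar figures to that scale. The names `MILESTONES`, `log10_price` and `fold_drop` are illustrative, the 2009 value is an assumed midpoint of the 40-50K$ range, and the two projections marked "?" are left out.

```python
import math

# (year, label, price in $) copied from the slide.
MILESTONES = [
    (2001, "Human Genome Project", 2.7e9),
    (2001, "Celera", 100e6),
    (2007, "454", 1e6),
    (2008, "ABI SOLiD", 60e3),
    (2009, "Illumina, Helicos", 45e3),  # assumption: midpoint of 40-50K$
]


def log10_price(price):
    """Position of a price on the figure's log10 axis."""
    return math.log10(price)


def fold_drop(first_price, last_price):
    """How many times cheaper the later price is."""
    return first_price / last_price


for year, label, price in MILESTONES:
    print(f"{year}  {label:<22} log10(price) = {log10_price(price):.1f}")
```

Each step of 1 on the axis is a tenfold change in price, so the Human Genome Project and ABI SOLiD points sit more than four axis units apart.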


In this Talk:

• Sequencing 1.0: Sanger
• Assembly
• Next generation sequencing (NGS)
• NGS applications
• Future directions


Genome Sequencing

• Goal: determine the order of nucleotides across a genome

• Problem: current DNA sequencing methods can handle only short stretches of DNA at once (<1-2Kbp)

• Solution: sequence the short stretches, then use computers to assemble the small pieces


Genome Sequencing

[Figure: genome, short fragments of DNA, short DNA sequences, sequenced genome]

Short fragments of DNA:
AC..GC TT..TC CG..CA AC..GC TG..GT TC..CC GA..GC TG..AC CT..TG GT..GC AC..GC AC..GC AT..AT TT..CC AA..GC

Short DNA sequences:
ACGTGGTAA CGTATACAC TAGGCCATA GTAATGGCG CACCCTTAG TGGCGTATA CATA…
ACGTGGTAATGGCGTATACACCCTTAGGCCATA

Sequenced genome:
ACGTGACCGGTACTGGTAACGTACACCTACGTGACCGGTACTGGTAACGTACGCCTACGTGACCGGTACTGGTAACGTATACACGTGACCGGTACTGGTAACGTACACCTACGTGACCGGTACTGGTAACGTACGCCTACGTGACCGGTACTGGTAACGTATACCTCT...
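The short sequences on this slide overlap end to end, which is what lets a computer stitch them back together. The sketch below is a toy greedy merger: it repeatedly joins the two pieces with the longest suffix-to-prefix overlap. It is not the algorithm of any particular assembler. The function names and the `min_len` threshold are assumptions made for illustration, and the truncated read ("CATA…") is left out because its full sequence is not given.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap >= min_len remains."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join: the remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# The six complete reads shown on the slide.
READS = ["ACGTGGTAA", "CGTATACAC", "TAGGCCATA",
         "GTAATGGCG", "CACCCTTAG", "TGGCGTATA"]

print(greedy_assemble(READS))
```

On real data, repeats and sequencing errors make the greedy choice unreliable, which is one reason the mate-pair information discussed later in the talk matters.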


Sanger Sequencing


• Advantages
  – Long reads (~900bp)
  – Suitable for small projects

• Disadvantages
  – Low throughput
  – Expensive


Assembly


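• Cut DNA into larger pieces (2Kbp, 15Kbp) and sequence both ends of each piece (Fleischmann et al., 1995)

• Each piece yields a mate pair: ~500 bp sequenced at each end, with ~(length − 1,000) bp unsequenced in between

[Figure: contig 1 and contig 2 linked by 2Kbp mates and 15Kbp mates]

• Uses: resolving repeats, better assembly of contigs, estimating gap lengths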
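Gap length estimation from mates comes down to simple arithmetic: the piece length is known, so whatever part of it is not accounted for inside the two contigs must lie in the gap. The sketch below assumes a simplified coordinate convention that is not in the slides: one mate starts at `start_in_contig1` (counted from the left end of contig 1) and its partner ends at `end_in_contig2` (counted from the left end of contig 2). All names and example numbers are hypothetical.

```python
from statistics import mean


def estimate_gap(piece_len, contig1_len, start_in_contig1, end_in_contig2):
    """Gap implied by one mate pair that links contig 1 to contig 2."""
    inside_contig1 = contig1_len - start_in_contig1  # from the mate's start to the contig's right end
    inside_contig2 = end_in_contig2                  # from the contig's left end to the partner's end
    return piece_len - inside_contig1 - inside_contig2


# Hypothetical links: (piece length, contig 1 length, start in contig 1, end in contig 2).
LINKS = [
    (2_000, 10_000, 9_200, 700),
    (2_000, 10_000, 9_500, 1_050),
    (15_000, 10_000, 2_000, 6_400),
]

estimates = [estimate_gap(*link) for link in LINKS]
print(estimates, round(mean(estimates)))
```

Piece lengths are only approximately known in practice, so averaging over several links, as done here, is a natural way to steady the estimate.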


Assembly: How Much DNA?

• Low coverage
  – Input: a few pieces to assemble
  – Output: many contigs, many gaps

• High coverage
  – Input: many pieces to assemble
  – Output: a few contigs, a few gaps

(Lander and Waterman, 1988)
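The Lander and Waterman model turns this picture into numbers. With N reads of length L from a genome of length G, coverage is c = N·L/G, the expected fraction of the genome left uncovered is about e^(−c), and the expected number of contigs is about N·e^(−c·σ), where σ = 1 − T/L and T is the minimum overlap needed to join two reads. The sketch below evaluates these formulas. The function name and the example read length of 500 bp are assumptions, and the genome size is the 1.8 Mbp figure quoted later in the talk.

```python
import math


def lander_waterman(genome_len, read_len, n_reads, min_overlap=0):
    """Return (coverage, expected contigs, expected uncovered fraction)."""
    coverage = n_reads * read_len / genome_len
    sigma = 1 - min_overlap / read_len
    expected_contigs = n_reads * math.exp(-coverage * sigma)
    uncovered_fraction = math.exp(-coverage)
    return coverage, expected_contigs, uncovered_fraction


GENOME = 1_800_000   # 1.8 Mbp
READ_LEN = 500       # assumed read length

for target in (1, 2, 4, 8):
    n_reads = target * GENOME // READ_LEN
    c, contigs, uncovered = lander_waterman(GENOME, READ_LEN, n_reads)
    print(f"{c:.0f}x: ~{contigs:.0f} contigs, {uncovered:.2%} of the genome uncovered")
```

The contig count first rises with the number of reads and then collapses once coverage is high enough for neighbouring reads to overlap, which matches the low-coverage versus high-coverage contrast on the slide.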


Sanger Sequencing

[Timeline: 1980 to the 2000s]

• 1982: lambda virus, DNA stretches up to 30-40Kbp (Sanger et al.)

• 1995: H. influenzae, 1.8 Mbp (Fleischmann et al.)

• 2001: H. sapiens, D. melanogaster, 3 Gbp (Venter et al.)

• 2007: Global Ocean Sampling Expedition, ~3,000 organisms, 7Gbp (Venter et al.)


Next Generation Sequencing: Why Now?


High Parallelism is Achieved in Polony Sequencing

[Figure panels: Polony, Sanger]


Generation of Polony array: DNA Beads (454, SOLiD)

DNA Beads are generated using Emulsion PCR


Generation of Polony array: DNA Beads (454, SOLiD)

DNA Beads are placed in wells


Generation of Polony array: Bridge-PCR (Solexa)

DNA fragments are attached to array and used as PCR templates


Sequencing: Pyrosequencing (454)

Complementary strand elongation: DNA Polymerase


Sequencing: Fluorescently Labeled Nucleotides (Solexa)

Complementary strand elongation: DNA Polymerase


Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)

Complementary strand elongation: DNA Ligase


Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)

5 reading frames, each position is read twice
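The slide states only that each position is read twice. One way to see what that means is the two-base color code commonly described for SOLiD, in which every color reports on a pair of adjacent bases, so each base contributes to two consecutive colors. The sketch below assumes that code (A=0, C=1, G=2, T=3, color = XOR of the two base values) and a known first base. The function names are illustrative.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """One color (0-3) per pair of adjacent bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    """Rebuild the bases from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


colors = encode("ATGGCA")
print(colors)               # color calls for the example sequence
print(decode("A", colors))  # recovers the original bases
```

Because decoding chains from one base to the next, a single wrong color changes every base after it, whereas a real single-base difference changes two adjacent colors. That asymmetry is what the double reading buys.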


Single Molecule Sequencing: HeliScope


Technology Summary

Platform   Read length   Sequencing technology   Throughput (per run)   Cost (1 Mbp)*
Sanger     ~800bp        Sanger                  400Kbp                 500$
454        ~400bp        Polony                  500Mbp                 60$
Solexa     75bp          Polony                  20Gbp                  2$
SOLiD      75bp          Polony                  60Gbp                  2$
Helicos    30-35bp       Single molecule         25Gbp                  1$

*Source: Shendure & Ji, Nat Biotech, 2008
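The table's figures can be combined into back-of-the-envelope project estimates. The sketch below derives reads per run, runs needed, and sequencing cost for a genome at a chosen coverage. The function names are illustrative, the coverage values are assumptions rather than figures from the talk, and the Helicos read length uses the lower end of its 30-35bp range.

```python
import math

# Read length (bp), throughput per run (bp), cost per Mbp ($), copied from the table.
PLATFORMS = {
    "Sanger":  (800, 400e3, 500),
    "454":     (400, 500e6, 60),
    "Solexa":  (75, 20e9, 2),
    "SOLiD":   (75, 60e9, 2),
    "Helicos": (30, 25e9, 1),  # assumption: lower end of 30-35bp
}


def reads_per_run(platform):
    read_len, throughput, _ = PLATFORMS[platform]
    return throughput / read_len


def runs_needed(platform, genome_bp, coverage):
    _, throughput, _ = PLATFORMS[platform]
    return math.ceil(genome_bp * coverage / throughput)


def project_cost(platform, genome_bp, coverage):
    _, _, cost_per_mbp = PLATFORMS[platform]
    return genome_bp * coverage / 1e6 * cost_per_mbp


HUMAN = 3e9  # 3 Gbp
for name in PLATFORMS:
    print(f"{name:<8} runs at 30x: {runs_needed(name, HUMAN, 30):>7,}  "
          f"cost at 30x: {project_cost(name, HUMAN, 30):>12,.0f}$")
```

These numbers cover raw sequencing only. They ignore library preparation, failed runs and the computational cost that the closing slide flags as a bottleneck.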


What, When and Why

• Sanger: small projects (less than 1Mbp)

• 454: de-novo sequencing, metagenomics

• Solexa, SOLiD, HeliScope:
  – Gene expression, protein-DNA interactions
  – Resequencing


Applications


Where Do We Go from Here?

• Higher throughput, longer reads (Pacific Biosciences)

• Computational bottleneck

• Shift to sequencing-based technologies

• Will it help to cure cancer?