Next Generation Sequencing Update

Karl V. Voelkerding, MD

Professor of Pathology, University of Utah

Medical Director for Genomics and Bioinformatics, ARUP Laboratories

AACC-AMP 2012 Molecular Pathology Course

[email protected]



Disclosures

• Grant/Research Support: NIH

• Salary/Consultant Fees: None

• Committees: College of American Pathologists

• Stocks/Bonds: None

• Honorarium/Expenses: None

• Intellectual Property/Royalty Income: None

Learning Objectives

• Explain Principles of NGS

• Describe Current and Future NGS Platform Options

• Discuss Spectrum of NGS Clinical Applications

First Next Generation Sequencing Publication

454 Life Sciences

Nature 437 (7057): 376-380, 2005

Sanger Sequencing

Electrophoretic Separation of Chain Termination Products

Next Generation Sequencing

Paradigm Shift

Sequence Clonally Amplified DNA Templates in a Flow Cell

Massively Parallel Configuration

Process

Genomic DNA or Enriched Genes

Fragmentation (150 – 500 bp)

End Repair and Adapter Ligation

“Fragment Library”

Adapter - Fragment A - Adapter

Adapter - Fragment B - Adapter

Adapter - Fragment C - Adapter

Process

“Fragment Library”: Fragments A, B, C

Clonal Amplification of Each Fragment: Emulsion Bead PCR or Surface Clusters

Sequencing of Clonal Amplicons in a Flow Cell

Process

Sequencing of Clonal Amplicons in a Flow Cell

• Pyrosequencing: 454

• Reversible Dye Terminators: Illumina

• Sequencing by Ligation: SOLiD

Generation of Luminescent or Fluorescent Images

Conversion to Sequence

454/Roche: Pyrosequencing, 200 – 400 base reads, Bead Emulsion PCR

Solexa/Illumina: Reversible dye terminators, 36 – 75 base reads, Surface Bridge PCR

Solexa/Illumina Sequencing

[Figure: base channels A, T, C, G]

Qualitative and Quantitative Information

[Figure: Illumina read coverage aligned to Ref Seq, showing a G>A variant]

Luminescence (Roche)

Fluorescence (Illumina,SOLiD)

pH Detection (Ion Torrent)

Signal to Noise Processing

Cyclic Base Calls C G A T G C - - -

Base Quality Scores C30 G28 A33 T30 G28 C30 - - -
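The base quality scores shown here (C30, G28, A33, ...) are conventionally read as Phred-scaled values, where Q = -10 log10(P) and P is the estimated probability that the base call is wrong. The slide does not name the scale, so treating these as Phred scores is an assumption, and the function names below are illustrative only.

```python
import math

def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred quality score Q = -10*log10(P)."""
    return -10 * math.log10(p)

# Base calls and scores as written on the slide (assumed to be Phred-scaled).
calls = [("C", 30), ("G", 28), ("A", 33), ("T", 30), ("G", 28), ("C", 30)]

for base, q in calls:
    print(f"{base}{q}: estimated P(error) = {phred_to_error_prob(q):.5f}")
```

Under this reading, a score of 30 corresponds to an estimated 1 in 1,000 chance that the call is incorrect.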

Next Generation Sequencing

• Sequence up to billions of fragments simultaneously

• Iterative/cyclic sequencing
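Iterative/cyclic sequencing can be illustrated with an idealized pyrosequencing-style flow: one nucleotide is offered per flow, and the signal is taken to be the number of bases incorporated in that flow. The flow order TACG, the noise-free signal, and the function names are assumptions for illustration; real instruments must also handle signal decay and homopolymer ambiguity.

```python
def flowgram(read: str, flow_order: str = "TACG", n_cycles: int = 4):
    """Return a list of (nucleotide, incorporations) for an idealized read.

    Each flow offers one nucleotide; every consecutive matching base in the
    read is consumed during that flow, so homopolymers give signals > 1.
    """
    signals = []
    pos = 0
    for _ in range(n_cycles):
        for nt in flow_order:
            run = 0
            while pos < len(read) and read[pos] == nt:
                run += 1
                pos += 1
            signals.append((nt, run))
    return signals

def basecall(signals) -> str:
    """Rebuild the read by repeating each nucleotide by its signal."""
    return "".join(nt * run for nt, run in signals)

# The cyclic base calls shown on the earlier slide: C G A T G C
signals = flowgram("CGATGC")
print(signals)
print(basecall(signals))
```

Flows with a signal of zero are as informative as positive ones: they indicate that the offered nucleotide was not the next base in the template.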

Next Generation Sequencing Data

Primary Sequence Alignment: BWA

Refined Sequence Alignment: GATK/Picard

Variant Calling: SAMTools/GATK

Variant Annotation: Annovar

@HW-ST573_75:1:1:1353:4122/11
CAATCGAATGGAATTATCGAATGCAATCGAATAGAATCATCGAATGGACTCGAATGGAATCATCGAA
+
ggfggggggggggggfgggggggfgeggggfdfeefeggggggggegbgegegggdeYedgggggeg
@HW-ST573_75:1:1:1347:4151/11
ATCTGTTCTTGTCTTTAACTCTCAAGGCACCACCTTCCATGGTCAATAATGAACAACGCCAGCATGC
+
effffggggggggggggfggggggggggggdggggfgggfgdggaffffgfggffgdggggggdfg
@HW-ST573_75:1:1:1485:4153/11
GAGGAGAGATATTTTGACTTCCTCTCTTCATATTTGGATGCTTTTTACTTATCTCTCTTGACTAATT
+
dZdddbXc`_ccccbeeedbeaedeeeee^aeeedcaZca_`^c[eeeeed]eeecd[dd^eeba[d

FastQ File Format
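A minimal sketch of reading FASTQ records such as the ones above: each record is four lines (header, sequence, separator, quality string), and each quality character encodes one score. The Phred+64 offset is an assumption inferred from the character range in these records (older Illumina pipelines used it; many current files use +33), and the function names are illustrative.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality_string) from 4-line FASTQ records."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return  # end of input
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored here
        qual = handle.readline().rstrip()
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual

def decode_qualities(qual: str, offset: int = 64):
    """Convert quality characters to integer scores (offset is an assumption)."""
    return [ord(ch) - offset for ch in qual]

# First record from the slide, with its wrapped lines joined.
example = (
    "@HW-ST573_75:1:1:1353:4122/11\n"
    "CAATCGAATGGAATTATCGAATGCAATCGAATAGAATCATCGAATGGACTCGAATGGAATCATCGAA\n"
    "+\n"
    "ggfggggggggggggfgggggggfgeggggfdfeefeggggggggegbgegegggdeYedgggggeg\n"
)

for name, seq, qual in read_fastq(io.StringIO(example)):
    scores = decode_qualities(qual)
    print(name, len(seq), min(scores), max(scores))
```

If the offset were wrong, the decoded scores would fall outside a plausible range, which is a quick sanity check before any downstream analysis.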

Variant g.34142190T>C in TPM1

454/Roche 2004/5

Solexa/Illumina 2006/7

ABI/Life Tech 2007/8

GS FLX

Genome Analyzer

SOLiD

First Wave

GS Junior

SOLiD 5500 SOLiD 5500xl

Second Wave - SMS

Helicos: HeliScope

Pacific Biosciences: SMRT

GAIIx GAIIe

HiScanSQ HiSeq

MiSeq 2011

Next Generation Sequencers

Ion Torrent Life Technologies PGM

2011

Third Wave

Clinical Dissemination

Illumina HiSeq 2000

2 Independent Flow Cells

8 Lanes per Flow Cell

2 X 100 base pairs

540-600 Gb Output

8-11 Day Sequencing Run

Multiple Gene Panel Samples per Lane

2 Genomes per Flow Cell

2-3 Exome(s) per Lane

Reversible Dye Terminators
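As a rough arithmetic sketch of what these output figures imply, mean sequencing depth can be estimated as sequenced bases divided by target size. The 3.1 Gb genome size, the even split of the 540-600 Gb run output across four genomes (2 flow cells, 2 genomes each), and the function name are assumptions not stated on the slide; duplicates, unaligned reads, and uneven coverage are ignored.

```python
def mean_depth(total_bases: float, target_size: float) -> float:
    """Estimate mean depth as sequenced bases divided by target size."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return total_bases / target_size

GENOME_SIZE = 3.1e9      # assumed human genome size in bases
GENOMES_PER_RUN = 4      # 2 flow cells x 2 genomes per flow cell (from the slide)

for run_output in (540e9, 600e9):  # output range quoted on the slide
    per_genome = run_output / GENOMES_PER_RUN
    depth = mean_depth(per_genome, GENOME_SIZE)
    print(f"{run_output / 1e9:.0f} Gb run: about {depth:.0f}x per genome")
```

Under these assumptions each genome receives roughly 44x to 48x raw depth, before any losses from filtering or alignment.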

Illumina MiSeq

2 X 150 bp 2 X 250 bp

2.0 – 7.0 Gb Output

~27 Hrs Sequencing Run

Multi-Gene Panels Genetics Oncology

Microbiology

Viral and Bacterial Genomes

Transcriptomes

Illumina MiSeq

Transcriptome Sequencing

GAPDH Sequence Reads

Ion Torrent

Monitors H+ Release

Hydrogen Ion

Pyrophosphate

Ion Torrent

Monitors H+ Release

100 – 200 base pairs

10 Mb – 1.0 Gb Output

~2 Hrs Sequencing Run

Multi-Gene Panels Genetics Oncology

Microbiology

Viral and Bacterial Genomes

Transcriptomes

Ion Torrent

BRAF, c.1799T>A, p.V600E 26.5% mutant alleles
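A mutant allele percentage like the one above is typically derived from read counts at the variant position: reads supporting the alternate allele divided by all reads covering the site. The read counts below are hypothetical, chosen only to reproduce 26.5%; the slide does not report the underlying depth.

```python
def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that support the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads

# Hypothetical counts: 265 reads with A out of 1000 reads covering c.1799.
vaf = variant_allele_fraction(alt_reads=265, total_reads=1000)
print(f"BRAF c.1799T>A: {vaf:.1%} mutant alleles")
```

Because the estimate is a ratio of counts, its precision depends on depth at the site, which is one reason deep coverage matters for somatic variant detection.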

Technology Advances for 2012/13

Illumina HiSeq 2000: 2 X 100 base pairs, 540-600 Gb Output, 11 Day Sequencing Run

Upgrade Module (Late 2012): 120 Gb in 27+ Hours, Single Genome in 27+ Hours, Multiple Exomes in 27+ Hours

Ion Torrent - Proton (Late 2012): Exomes/Genome in “Several Hours”

Oxford Nanopore Technologies

Processive Enzyme

Protein Nanopore in Polymer Membrane

Current Disruption Based Electronic Signal

MinION – Late 2012

The Meeting Place

Biotechnology Bioinformatics

Biomedical Question

Sequence Analysis Interpretation

Sequence Generation

What is the Genetic Landscape of a Tumor

What Pathogen is Responsible for an Outbreak

What Genetic Contributors Account for a Phenotype

Clinical Applications

Increasing Complexity: Multi-Gene Diagnostics, Whole Exome, Whole Genome

Multiple Genes

Multi-Gene Diagnostics

Clinical Phenotype

Locus Heterogeneity Allelic Heterogeneity

Mutational Spectrum

Multi-Gene Diagnostics

“New First Tier” Genetic Testing

Scaling Increases Interpretive Complexity

Can Yield Non-Definitive Results

Gateway to Exome/Genome

Multi-Gene Diagnostics

Genomic DNA

Enrichment

Target Genes

NGS Library Preparation

Next Generation Sequencing

Interpretation

Bioinformatics

Gene Enrichment Approaches

Genomic DNA

Amplification Based: PCR or LR-PCR, RainDance ePCR, Fluidigm, HaloGenomics

Advantage: Enrichment Specificity

Array Capture Based: Solid Surface or In Solution

Advantage: Scalable to Exome

Enriched Genes

NGS

Gene Enrichment Approaches

Genomic DNA

Amplification Based: PCR or LR-PCR, RainDance ePCR, Fluidigm, HaloGenomics

Drawbacks: Not as Scalable, Instrument and Chip Costs

Array Capture Based: Solid Surface or In Solution

Drawbacks: Homologous Sequence Capture, Manually Complex

Clinical Applications

Increasing Complexity: Multi-Gene Diagnostics, Whole Exome, Whole Genome

Human Exome

“Journey to the Center of the Genome”

~ 30+ Megabases (~ 1.5% of the genome)

~ 180,000 exons (~ 20,500 genes)

Harbors “Majority” of Mendelian Mutations

Exome Sequencing History

“Genetic Diagnosis by Whole Exome Capture and Massively Parallel DNA Sequencing”

Choi et al PNAS 2009 – Congenital Chloride Diarrhea

~45 Gene Discovery Publications May 2012

Recessive Dominant De Novo

Genomic DNA

Library Preparation

Next Generation Sequencing Library

Hybridize to Exome Capture Probes

Exome Enriched Library

Next Generation Sequencing

Bioinformatics Analysis

Comparison of Exome DNA Sequencing Technologies

Clark et al Nature Biotech Vol 29(10) Oct 2011


Exome Sequencing - Coverage of Coding Regions is Variable

[Figure: MAZ Exon 1 and HLA-DOB Exon 1, showing coverage, aligned reads, reference, and capture probes; Nimblegen Exome Capture and Illumina HiSeq]

Exome Sequencing – Performance Characteristics

Define Proportion of Exome “Adequately Covered”

Dependent On: Capture Technology (Probe Design and Capture Efficiency), Sequencing Depth

Conversely: Define Proportion of Exome “Not Adequately Covered”

Exome Sequencing – Performance Characteristics

Define Proportion of Exome “Accurately Sequenced”

Co-Capture Component: Pseudogenes, Paralogs and Homologs

Difficult to Sequence Regions: Repetitive Elements
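One way to express the proportion of the exome that is "adequately covered" is the fraction of targeted bases whose depth meets a chosen threshold. The 20x threshold, the per-base depth values, and the function name below are hypothetical; the slides do not specify a cutoff.

```python
def fraction_covered(depths, min_depth: int = 20) -> float:
    """Fraction of target positions with depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)

# Hypothetical per-base depths across a small targeted region.
depths = [0, 5, 12, 20, 35, 48, 52, 40, 22, 8]

adequate = fraction_covered(depths, min_depth=20)
print(f"Adequately covered: {adequate:.0%}")
print(f"Not adequately covered: {1 - adequate:.0%}")
```

Reporting both the covered and uncovered fractions makes explicit which part of the target cannot support a negative result.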

Mendelian Disorders – Working Hypothesis

Seeking “Rare” Variants in a Single Gene(s)

Needle(s) in the Haystack(s)

Annotated Variants

Prioritization by Heuristic Filtering

• Filter Out Common Variants: dbSNP/1000 genomes, Variant frequency

• Pathogenicity Prediction Filtering: SIFT/PolyPhen, GERP

• Variant Binning: Missense, Nonsense/Frameshift/Splice Site/Indels

• Cross Reference Databases: HGMD/OMIM/Locus Specific

• Pedigree Information: Linkage/SGS/IBD

Prioritization by Likelihood Prediction

• VAAST Algorithm

Intersects

Candidate Genes/Potential Causative Variants
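The heuristic filtering idea on this slide (discard common variants, then keep those in consequence bins of interest) can be sketched as below. This is a toy illustration, not the presenter's pipeline: the `Variant` fields, the 1% frequency cutoff, the gene names, and the frequencies are all hypothetical, and no real database or prediction tool is queried.

```python
from dataclasses import dataclass

# Consequence bins named on the slide.
RETAINED_BINS = {"missense", "nonsense", "frameshift", "splice_site", "indel"}

@dataclass
class Variant:
    gene: str
    consequence: str      # e.g. one of RETAINED_BINS, or "synonymous"
    population_af: float  # allele frequency in a reference panel (made-up values)

def heuristic_filter(variants, max_af: float = 0.01):
    """Keep rare variants whose consequence falls in a retained bin."""
    return [
        v for v in variants
        if v.population_af <= max_af and v.consequence in RETAINED_BINS
    ]

variants = [
    Variant("GENE_A", "missense", 0.25),     # common, filtered out
    Variant("GENE_B", "nonsense", 0.0),      # rare and in a retained bin
    Variant("GENE_C", "synonymous", 0.0),    # rare but not in a retained bin
    Variant("GENE_D", "frameshift", 0.001),  # rare and in a retained bin
]

candidates = heuristic_filter(variants)
print([v.gene for v in candidates])
```

In practice each filter trades sensitivity for a shorter candidate list, so thresholds such as `max_af` would need to be justified for the disorder and inheritance model in question.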

Bioinformatics

Exome Sequencing vs Genome Sequencing

Exome Sequencing: Genomic DNA, Library Preparation, Next Generation Sequencing Library, Hybridize to Exome Capture Probes, Exome Enriched Library, Next Generation Sequencing, Bioinformatics Analysis

Genome Sequencing: Genomic DNA, Library Preparation, Next Generation Sequencing Library, Next Generation Sequencing, Bioinformatics Analysis

Cost – Coverage – Complexity

Whole Genome Sequencing Chr 10: g.43,615,633C>G in RET
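Variant descriptions of the form g.POSITION REF>ALT, as used on this slide and for the earlier TPM1 example, can be split into their parts with a small parser. This is a minimal sketch covering only single-base genomic substitutions; the function name is illustrative, and stripping thousands separators is an accommodation for the slide's formatting rather than part of standard nomenclature.

```python
import re

# Matches single-base genomic substitutions such as "g.34142190T>C".
SUBSTITUTION = re.compile(r"^g\.(\d+)([ACGT])>([ACGT])$")

def parse_genomic_substitution(description: str):
    """Return (position, ref_base, alt_base) for a g. substitution string."""
    cleaned = description.replace(",", "").strip()
    match = SUBSTITUTION.match(cleaned)
    if match is None:
        raise ValueError(f"not a simple genomic substitution: {description!r}")
    position, ref, alt = match.groups()
    return int(position), ref, alt

print(parse_genomic_substitution("g.43,615,633C>G"))  # RET example from the slide
print(parse_genomic_substitution("g.34142190T>C"))    # TPM1 example from the slide
```

A position is only meaningful together with its chromosome and reference genome build, neither of which is encoded in the string itself.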

Horizon

Continued Evolution of Sequencing and Bioinformatics

College of American Pathologists Checklist Requirements for Next Generation Sequencing

Professional Societies Guidelines for Clinical Next Generation Sequencing

Self Assessment Questions

• Describe Process Steps for NGS

• List NGS Platform Options and Capabilities

• Relate Spectrum of Clinical NGS Applications