Everything You Ever Wanted to Know About NGS but Were Afraid to Ask
DESCRIPTION
This presentation provides an overview of Next Generation Sequencing, terms and definitions, and forensic applications. Topics covered:
• What is Next Generation Sequencing (NGS)?
• What can NGS do that we can't do now?
• How much information?
• Single Nucleotide Polymorphisms (SNP)
• Short Tandem Repeats (STR)
• Ion PGM™ System for HID - how does it work?
• FAQs
• Torrent Server and Torrent Suite™ Software
• Data - how does it compare to CE?
TRANSCRIPT
1 The world leader in serving science
Everything You Ever Wanted to Know About Next Generation Sequencing, but were Afraid to Ask. (Well…Almost Everything)
2
What is Next Generation Sequencing (NGS)?
• Catch-all term that describes several distinct modern sequencing technologies
• The actual chemistry varies, but sequence data is collected as it is being generated
• Makes generating DNA sequence cheaper and easier
• Whole genome analysis for risk/disease detection made real
xkcd.com - THIS WORK IS LICENSED UNDER A CREATIVE COMMONS ATTRIBUTION-NON COMMERCIAL 2.5 LICENSE.
3
What can NGS do that we can’t do now?
• It’s all about capacity.
• Human Genome Project = 1990-2003 and $3 billion
• Brute force
• Now – A few days and $1000 per genome…really?
Image courtesy of genome.gov and the National Human Genome Research Institute
4
How much information?
If we define:
• BP = a single sequenced base pair
• Run = a single chip, flow cell, or 96-well plate
Platform                     Max BP per Run
CE                           96,000
Ion 314™ Chip                550,000
Ion 316™ Chip                3,000,000
Ion 318™ Chip                5,500,000
Proton™ I Chip               80,000,000
Biggest one on the market    3,000,000,000
“Massively Parallel” means lots of sequencing reads happening all at the same time
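To put the capacity figures in proportion, the short sketch below expresses each platform's maximum base pairs per run as a multiple of one CE run. The numbers are copied from the slide; the dictionary and function names are made up for this illustration.

```python
# Max base pairs per run, copied from the slide's table.
MAX_BP_PER_RUN = {
    "CE": 96_000,
    "Ion 314 Chip": 550_000,
    "Ion 316 Chip": 3_000_000,
    "Ion 318 Chip": 5_500_000,
    "Proton I Chip": 80_000_000,
    "Biggest one on the market": 3_000_000_000,
}


def fold_over_ce(platform: str) -> float:
    """How many CE runs' worth of base pairs one run of `platform` yields."""
    return MAX_BP_PER_RUN[platform] / MAX_BP_PER_RUN["CE"]


for name in MAX_BP_PER_RUN:
    print(f"{name:28s} {fold_over_ce(name):>10,.1f}x CE")
```

By this arithmetic a single Ion 318™ Chip run carries roughly as many base pairs as 57 CE runs.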
5
Sequences lots of short DNA pieces at the same time.
• Traditional Sequencing theoretical limit is 1000 BP
• Ion Platforms 200-400 BP
• Varies for other platforms, may be as short as 50 BP
• Not a problem for HID
CE
NGS
6
Great! NGS does lots of sequencing. How will all this help us catch bad guys or identify missing persons?
• Obvious use:
• Mitochondrial sequencing • More information
• What else can it do?
File reproduced in accordance with creative commons public domain license: original work by jhc, who grants anyone the right to use this work for any purpose, without any conditions, unless such conditions are required by law, derivative work by Shanel is shown. http://commons.wikimedia.org/wiki/File:Mitochondrial_DNA_en.svg
7
Single Nucleotide Polymorphisms (SNP)
What’s in a SNP? More information
• Identity – SNP amplicons are small, good for degraded DNA
• Ancestry – No match in the database? At least you can let law enforcement know they are looking for a male of ________ descent
• Phenotype – with _______ colored eyes and ________ hair.
8
Short Tandem Repeats (STR)
• A 15 is not always just a 15…
• vWA, for example
• Individual A 15 repeats: TCTA [TCTG]4 [TCTA]10
• Individual B 15 repeats: TCTA [TCTG]3 [TCTA]11
• Mixture deconvolution
• More information!
STRs for Forensic or Paternity use only
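The vWA example can be checked with a few lines of code. This is a minimal sketch that expands the bracket notation from the slide into plain sequence; the function name is hypothetical and the parser handles only the simple notation shown here.

```python
import re


def expand_bracket_notation(allele: str) -> str:
    """Expand STR bracket notation such as 'TCTA [TCTG]4 [TCTA]10' into sequence."""
    sequence = []
    for token in allele.split():
        match = re.fullmatch(r"\[([ACGT]+)\](\d+)", token)
        if match:
            motif, count = match.group(1), int(match.group(2))
            sequence.append(motif * count)
        else:
            sequence.append(token)  # a bare motif counts as one repeat
    return "".join(sequence)


individual_a = expand_bracket_notation("TCTA [TCTG]4 [TCTA]10")
individual_b = expand_bracket_notation("TCTA [TCTG]3 [TCTA]11")

# Both alleles are 15 repeats of 4 bases, so length-based sizing sees a "15"...
print(len(individual_a) // 4, len(individual_b) // 4)
# ...but the underlying sequences differ, which sequencing can detect.
print(individual_a == individual_b)
```

Both alleles come out at 60 bases, yet the strings are not equal, which is the extra information sequencing adds over length alone.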
9
Way More Information…
• How to deal with all this information?
• Bioinformatics - The field of science concerning the application of computer science and information technology to biology; using computers to handle biological information... [1]
• Don’t we do that already?
• Example: Trimming
• AKA getting rid of the unusable signal at the beginning and end
• Ion 314™ Chip – 550,000 BP
• Average read length 200 BP
• 2,750 sequence reads = too many to hand trim!
1. bioinformatics. Dictionary.com. The Free On-line Dictionary of Computing. Denis Howe. http://dictionary.reference.com/browse/bioinformatics
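As a rough illustration of why trimming is automated, the sketch below reproduces the read-count arithmetic from the slide and trims one read by quality score. Quality-threshold trimming and the `min_q` cutoff are assumptions made for this example; the slide does not say which trimming method the software uses.

```python
def trim_read(bases: str, quals: list[int], min_q: int = 20) -> str:
    """Drop low-quality bases from both ends of a read.

    min_q is a hypothetical cutoff chosen for illustration.
    """
    start, end = 0, len(bases)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return bases[start:end]


# Read count from the slide: chip capacity divided by average read length.
chip_bp = 550_000
avg_read_len = 200
n_reads = chip_bp // avg_read_len
print(n_reads)

trimmed = trim_read("NNACGTACGTNN", [2, 5, 30, 31, 32, 30, 33, 31, 30, 29, 8, 3])
print(trimmed)
```

Applying the same function in a loop over every read is trivial for software and impractical by hand.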
10
Ion PGM™ for HID – How does it work? Hint: It’s not magic.
11
PGM™ System for HID – How does it work?
It’s a Process…
1. Library Preparation
2. Quantify
3. Ion OneTouch™ 2 System*
4. Ion OneTouch™ ES System*
5. Sample on the Chip*
6. Chip on the Ion PGM™ System
7. Data!
*The Ion Chef™ System can do Steps 3-5
12
Library Preparation
1. Amplify the regions of interest
2. Carve back the primers
3. Add barcodes and adapters
All this in a single tube!
4. Clean up the reaction
13
Library Prep FAQs
• Ion AmpliSeq™ Panel PCR - How does it compare to what we do now?
  • 100+ primer sets instead of 20+
  • No fluorophores = very stable primers
• Why trim the primers?
  • Don’t want to waste sequencing real estate on primers
  • Need a blunt end for the next step (ligating the adapters)
• What do the adapters do?
  • Barcodes let you load multiple samples onto one chip
  • P1 and A/X adapters will be used as primer binding sites in the next step (OneTouch 2 emulsion PCR)
• Is the clean-up step necessary?
  • Yes, you need to remove all the primer parts, dead enzymes, and salts before you move on to the next step.
  • AMPure® XP Beads can be automated.
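The role of barcodes can be sketched in code: after sequencing, each read is assigned to a sample by the barcode at its start. The barcode sequences, their length, and the sample names below are invented for illustration and are not the sequences of any real kit; real software also tolerates sequencing errors in the barcode, which this exact-match sketch does not.

```python
from collections import defaultdict

# Hypothetical barcodes for illustration only; real kits define their own.
BARCODES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
BARCODE_LEN = 6


def demultiplex(reads):
    """Group reads by their leading barcode and strip the barcode off."""
    by_sample = defaultdict(list)
    for read in reads:
        sample = BARCODES.get(read[:BARCODE_LEN], "unassigned")
        insert = read[BARCODE_LEN:] if sample != "unassigned" else read
        by_sample[sample].append(insert)
    return dict(by_sample)


reads = ["ACGTACGGGTTT", "TGCATGCCCAAA", "ACGTACTTTT", "NNNNNNAAAA"]
result = demultiplex(reads)
for sample, inserts in result.items():
    print(sample, inserts)
```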
14
Quantify
Critical Quality Control Step!
• Ion Library Quantitation Kit
• Real-time qPCR
• 7500 friendly
• Adding the right amount of Library to Ion Sphere™ Particles is critical
• Can use Qubit® or Bioanalyzer®, but real-time is the most accurate
• Ion Library Equalizer™ Kit not recommended at this time
15
What happens to the Library next?
3. OneTouch™ 2
4. OneTouch™ ES
5. Sample on Chip
6. Chip on the PGM
[Figure labels: Clonal Amplification; Load Chip; Incorporate Nucleotide; Detect and Call; Ion Chef™]
16
Ion OneTouch™ 2
Add Library and Ion Sphere™ Particles (ISP)
PCR Part 2 – Emulsion PCR
17
Ideal Clonal Amplification using Emulsion PCR
[Figure: an emulsion droplet containing an ISP, primer, dNTPs, polymerase, and MgCl2. Templated ISPs are isolated, giving final templated ISPs ready for sequencing. Adapter labels: P1; Barcode + A = X]
18
Non-Ideal Outcomes
[Figure: non-ideal emulsion outcomes after amplification: polyclonal “mixed” reads, duplicate reads, no template, no ISP]
19
Non-Ideal Outcomes
Quantification of the library is important!
Too much library = lots of polyclonal “mixed” reads
Not enough library = lots of empty ISPs
It’s all about balance
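The balance described above can be illustrated with a simple statistical model. The sketch below assumes that the number of library molecules landing with each ISP follows a Poisson distribution; that model is an assumption brought in for illustration, not something stated in the presentation, and real emulsions will deviate from it.

```python
import math


def isp_outcomes(lam: float) -> dict[str, float]:
    """Expected fractions of ISPs under a Poisson loading assumption.

    lam is the average number of library molecules per ISP-containing droplet.
    """
    empty = math.exp(-lam)              # P(0 templates)
    clonal = lam * math.exp(-lam)       # P(exactly 1 template)
    polyclonal = 1.0 - empty - clonal   # P(2 or more templates)
    return {"empty": empty, "clonal": clonal, "polyclonal": polyclonal}


for lam in (0.1, 0.5, 1.0, 3.0):
    f = isp_outcomes(lam)
    print(
        f"lam={lam:>3}: empty={f['empty']:.2f} "
        f"clonal={f['clonal']:.2f} polyclonal={f['polyclonal']:.2f}"
    )
```

Under this assumption, a low ratio leaves most ISPs empty and a high ratio makes most of them polyclonal, which matches the qualitative message of the slide.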
20
OneTouch™ 2 FAQs
• Does the PCR happen to the whole sample at once?
  • Yes, the whole sample is injected into the amp plate and the thermal cycling happens to all of it at the same time.
• If it’s just PCR, what takes so long?
  • The rate-limiting step is the collection of the ISPs from the emulsion. The emulsion is dripped onto the bridge a little at a time, the centrifuge spins to collect the ISPs at the bottom of the collection tubes, the excess oil and buffer flow out of the top of the collection tubes, and the process is repeated until all the ISPs have been collected.
21
Ion OneTouch™ ES
• Liquid-handling robot
• Enriches for ISPs with template on them
8-well strip tube (flanked by magnets)
Pipet Tip
1) ISP sample
2) MyOne™ Beads
3) Wash Solution
4) Wash Solution
5) Wash Solution
6) Empty
7) Melt-Off Solution
8) Empty
22
Ion Sphere™ Particle (ISP) Enrichment
[Figure: Post Amplification → Add Magnetic Streptavidin Bead → Immobilize to Magnet and Wash → Denature ISP with NaOH. Labels: Biotin, ISP]
*This species can be minimized through proper dilution. Proper DNA to Ion Sphere™ Particle ratio is critical!
23
To the Chip!
Chips Chip Cross Section
One ISP per well…mostly
Pipet sample in through this port
Chip differences - look for the “football”
24
The Chip up Close
Millions of Tiny pH Meters
[Figure: cross section of a sensor well. Labels: dNTP, H+, Sensing Layer, Sensor Plate, Silicon Substrate (Drain, Source, Bulk), To column receiver; signal chain ∆pH → ∆Q → ∆V]
Rothberg J.M. et al., Nature, doi:10.1038/nature10242
25
Chip FAQs
• How many samples can I load per chip with the HID panels?
  • Ion 314™ chip – up to 8 individuals
  • Ion 316™ chip – up to 38 individuals
  • Ion 318™ chip – up to 77 individuals
  • For the HID-Ion AmpliSeq™ Identity Panel
• Explain “the chip is the machine”
  • As Ion chip technology improves, you just buy new versions of the chip
  • You can scale your throughput level by buying different chips
  • Like getting a new CE array, laser, and CCD for every run
26
Chip FAQs
• Why am I not supposed to wear gloves when handling the chip?
  • The chip is a sensitive piece of electronics. You want to electrically “ground” your body to the instrument and then handle the chip to avoid a charge difference between the instrument and the chip.
  • Gloves act as an insulator. They could preserve a charge difference between the instrument and the chip, which could be discharged as static electricity when you place the chip. This could fry the chip.
27
The Ion PGM™ System
• The Ion PGM™ System
  • Flows a single type of nucleotide at a time over the chip
  • Detects the voltage changes from the wells in the chip
  • Washes the chip
  • Repeats a defined number of times
  • Transfers data from each flow to the Torrent Server
If the chip is the machine, then what is this thing for?
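The flow cycle described on this slide can be mimicked with a toy simulation for a single well. The sketch below flows one nucleotide type at a time and records how many bases would be incorporated on each flow. The repeating `TACG` flow order, the function name, and the example template are assumptions chosen for illustration; the instrument's actual flow order and signal model are not given in the presentation.

```python
def simulate_flows(template: str, flow_order: str = "TACG", n_flows: int = 12) -> list[int]:
    """Return how many bases are incorporated on each nucleotide flow.

    template is the sequence being synthesized, in read orientation.
    flow_order is a simplified repeating cycle assumed for this sketch.
    """
    incorporations = []
    pos = 0
    for flow in range(n_flows):
        nucleotide = flow_order[flow % len(flow_order)]
        count = 0
        # A run of identical bases is incorporated within a single flow.
        while pos < len(template) and template[pos] == nucleotide:
            count += 1
            pos += 1
        incorporations.append(count)
    return incorporations


flows = simulate_flows("TTACGGGA")
print(flows)
```

A flow with no matching base records zero, and a homopolymer such as GGG shows up as a single flow with a count of three, which is why signal size matters as well as signal presence.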
28
Simple, Natural Chemistry
PGM™ observes the addition of each nucleotide in real time
H+ is the “ion” in Ion Torrent™
29
Ion PGM™ System FAQs
• Is 18 MΩ water really that important?
  • Yes. Since the system relies on minute pH changes to detect nucleotide incorporation, the pH of the water and all solutions is critical.
• Why does the PGM™ System need Argon or Nitrogen gas?
  • These inert gases exclude CO2 from the system
  • CO2 dissolved in water makes carbonic acid, so back to the whole pH thing…
• What about homopolymers (lots of the same nucleotide in a row)?
  • Every template molecule on the ISP wants many of the same nucleotide (dNTP) at once. There just aren’t enough to go around, so some template molecules “get behind”.
  • 8 thymines do not give 8x the signal of 1 thymine. Ion has improved its data analysis software to help compensate for this.
30
Data - Run Report
31
Torrent Server and Torrent Suite™ Software
Web-based data delivery with integrated alignment, variant calling, and data analysis plugins
32
Data – How does it compare to CE?
[Figure: CE sequencing electropherogram, positions 170-190]
• In CE, each peak is an aggregate of lots of molecules of the same length with the same fluorophore
• Peak heights provide some idea of the relative quantities of these molecules
173-length C 176-length T and C
33
Data – How does it compare to CE?
• Ion Torrent™ technology counts all of the times that an outcome was observed, so data is more like this:
Chrom  Position*  Genotype  Coverage  A Reads  C Reads  G Reads  T Reads  Deletions  Freq
chr1   173        CC        3485      3        3480     0        2        0          99.85
chr1   176        CT        4776      2        2280     1        2484     9          52.01
*This would be the actual mapped position on the chromosome, so it would typically be a much larger number. These have been changed to make comparison with the previous example easier.
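The relationship between the columns can be worked through in code. The sketch below sums the per-base counts into coverage, computes the share of the most common base, and makes a naive genotype call. Reading Freq as the most common base's share of coverage is an interpretation of the table, and the `het_threshold` cutoff is a hypothetical value; neither is documented in the presentation.

```python
def summarize_position(counts: dict[str, int], het_threshold: float = 0.2):
    """Return (coverage, genotype, major-allele frequency in percent).

    counts maps A, C, G, T and del to read counts. het_threshold is a
    hypothetical minimum fraction for calling a second allele.
    """
    coverage = sum(counts.values())
    bases = sorted("ACGT", key=lambda b: counts[b], reverse=True)
    major, minor = bases[0], bases[1]
    if counts[minor] / coverage >= het_threshold:
        genotype = "".join(sorted((major, minor)))
    else:
        genotype = major * 2
    freq = 100.0 * counts[major] / coverage
    return coverage, genotype, round(freq, 2)


pos_173 = summarize_position({"A": 3, "C": 3480, "G": 0, "T": 2, "del": 0})
pos_176 = summarize_position({"A": 2, "C": 2280, "G": 1, "T": 2484, "del": 9})
print(pos_173)
print(pos_176)
```

The coverage and genotype agree with the table for both positions. The frequency for position 176 matches the table; for position 173 this calculation lands about 0.01 above the 99.85 shown, which may reflect truncation rather than rounding or a slightly different formula in the software.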
34
Plugin Output
Easy to see heterozygous vs. homozygous
35
Data FAQs
• What file type should I archive?
• How many runs can I store?
36
Torrent Suite™ Software Data Analysis Flow
[Figure: DAT (incorporation for 1 flow, many flows) → Signal Processing → WELLS (raw signals per flow) → Base Calling → SFF / unmapped BAM / FASTQ → Alignment → BAM → Variant Caller → VCF]
Example raw signals per flow: 0.1 1.2 0.3 2.1 0.1 0.2 2.1 3.1 0.0 0.2 2.1 3.1 0.0 0.1 1.2 0.3 2.1 0.0 0.0 0.0 3.2 1.4 0.1 1.3 1.0 0.2 0.1
File-size labels shown in the figure: 242 GB, 12 GB, 10 GB, 4.4 GB, 4.3 GB, 2.5 GB, ~ KB
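The base calling step can be pictured as converting per-flow signal values into counts of incorporated bases. The sketch below does this by simple rounding, which is a deliberate simplification and not the algorithm used by Torrent Suite™ Software. It reuses the first eight signal values from the slide as if they were consecutive flows for one well, and assumes a hypothetical repeating `TACG` flow order.

```python
def call_bases(signals: list[float], flow_order: str = "TACG") -> str:
    """Turn normalized per-flow signals into a base sequence by rounding.

    Rounding to the nearest integer is a simplification for illustration;
    the flow order is a hypothetical repeating cycle.
    """
    called = []
    for flow, signal in enumerate(signals):
        n_bases = max(0, round(signal))
        called.append(flow_order[flow % len(flow_order)] * n_bases)
    return "".join(called)


# First eight values from the raw-signal row shown on the slide.
sequence = call_bases([0.1, 1.2, 0.3, 2.1, 0.1, 0.2, 2.1, 3.1])
print(sequence)
```

Signals near 0 produce no base, signals near 1 produce one base, and signals near 2 or 3 produce a homopolymer of that length.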
37
Torrent Pipeline – Approximate Data Sizes*
Process Description   File Types  Ion 314™ chip  Ion 316™ chip  Ion 318™ chip  Ion PI™ chip
Raw Voltage Data      DAT         28 GB          129 GB         242 GB         2 TB
Signal Processing     WELLS       1 GB           7-9 GB         12-15 GB       80-100 GB
Base Calls - Flow     SFF         1 GB/          4.5-6.0 GB/    8-10 GB/       100-120 GB
Base Calls - Base     FASTQ       0.2 GB         1-1.25 GB      1.8-2.25 GB    20-30 GB
Base Calls - Aligned  BAM         0.1 GB         1.7-2.0 GB     2.7-3.0 GB     40-60 GB
*v3.0 250 bp run (500 flows) Sept2012. The information presented here is an approximation. User experiences may vary.
38
FAQ - How many runs can be stored?
Process           Equipment        Ion 314™ chip  Ion 316™ chip  Ion 318™ chip
Data Acquisition  PGM™ Sequencer   40 runs        8 runs         5 runs
Raw Data          Torrent Server   300 runs       65 runs        40 runs
Archived Data     Torrent Server   4100 runs      880 runs       530 runs
File sizes for a typical run of 520 flows that enables 200 base pair read length. The Torrent Server has 12 TB of space with n=2 redundancy.
The information presented here is an approximation. User experiences may vary.
39
How can I make this easier/faster/better?
• Automated Sample Preparation
• Automated Template Preparation
• Enzyme Upgrade
40
• For Research, Forensic or Paternity Use Only. When used for purposes other than Human Identification the instruments and modules of the cited software are for Research Use Only. Not for use in diagnostic procedures.
• © 2014 Thermo Fisher Scientific Inc. All rights reserved. All trademarks are the property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified. AMPure is a registered trademark of Beckman Coulter, Inc. Bioanalyzer is a registered trademark of Agilent Technologies, Inc.
41
Thank You!
Questions?
42
Please note: HID-Ion AmpliSeq™ panels have not been tested on these platforms
Automated Sample Preparation
43
Ion AmpliSeq™ Automation on NGS Express: $63.5K
http://www.perkinelmer.com/pdfs/downloads/APP_Ion_AmpliSeq_Life_Technologies.pdf
44
Ion AmpliSeq™ Automation on Freedom EVO® 150
• Throughput: 96 samples in 4.5 hours (HSM panel)
• Ion AmpliSeq™ Library Kit 2.0 384LV processes 288 samples
• Demonstrated protocol available in Ion Community
• $125K + $25K PCR thermocycler = ~$150K
http://ioncommunity.lifetechnologies.com/docs/DOC-6536 (documentation)
http://ioncommunity.lifetechnologies.com/docs/DOC-6652 (poster)
45
Ion AmpliSeq™ Automation on Hamilton Starlet System
• Low or high throughput
• 96 well multiprobe heads
• Full walk away solutions to free up time for important research
• Upgradable with lab focus changes or new sequencing technologies
http://www.hamiltonrobotics.com/fileadmin/user_upload/prodb/app_notes/Genomics/BR012_NGS_US.pdf
46
Automated Template Preparation
47
Ion OneTouch™ 2 System for Template Prep
• Available now for both the Ion PGM™ and Ion Proton™ Systems
• Supports 200 bp and 400 bp on the Ion PGM™ System & Up to 200 bp on Ion Proton™ System
• Simple and Robust
48
Ion Chef™ System
Bringing Workflow Simplicity to Ion Proton™ Sequencing
• Simple to use
  • Automated template prep AND chip loading – library in, loaded chips out
  • Simple reagent and consumables loading – minutes of hands-on time
  • Minimizes potential sources of user error and sequencing variability
• High throughput
  • Processes 2 chips and multiple barcoded samples within hours
• Flexible
  • Supports all Ion systems, chips, read lengths, and “Classic” and Avalanche template prep chemistry
https://www.youtube.com/watch?v=qpsd93d2mcc
49
Ion Chef™ System
Benchtop Sequencing Prep with Integrated Workflows
[Figure labels: Tip Rack, Chip Loader, Recovery Modules, Thermal cycler, Reagent Bay, Enricher, Waste, Robotic Arm]
Ion Chef™ System Quotes Early Q2, Shipments Commence End Q2, Ship in Volume Q3