automation challenges in next generation sequencing · 2016. 10. 15. · automation challenges in...

Post on 19-Aug-2020


Automation Challenges in Next Generation Sequencing

Robin Coope

Group Leader, Instrumentation

BC Cancer Agency Genome Sciences Centre

Outline

• The BCGSC’s Scale of Operations

• Current Automation Projects and Lessons

• Challenges of Small Volume Pipetting

• Automating Fragment Analysis Calculations

• Automated Gel-Based Size Selection

BCGSC Monthly Total Libraries

14 HiSeqs, 3 MiSeqs, 8 Robots from 3 manufacturers

~ 50 Staff in high throughput lab groups.

BCGSC Monthly Pooled Library Output

12,500 Libraries per year

Library construction workflow (flow diagram), with QC checkpoints between stages. Stages shown: Biospecimens; RNA Isolation; cDNA Synthesis; Normalization & Shearing; Shearing and Amplicon Construction; End Repair, A-Tailing, Adapter Ligation; PCR (addition of indexes); Size Selection; Normalization and pooling; Sequencing.

Four Plate PCR-Free WGS

• A new protocol based on Illumina’s but with more automation-friendly labware

• Fully utilizes Biomek FX deck, grippers and plate stackers to enable library construction of four plates in six hours (w/o QC)

• One FX program with nine integrated stages

HL60 – PCR Free

HL60 – Normal (PCR) Library Construction

MultiMACS and Nimbus for mRNA

RNA isolation with MultiMACS allows Poly-A mRNA capture while retaining the flow-through, which can be used to isolate miRNA.

We are automating its operation with a Hamilton Nimbus. It has 1-1000 µl pipetting capacity and can work with the MultiMACS geometry.

Renormalization and Pooling

• Implementing a PE Janus for up-front quantification, normalization and re-arrays

• A Span-8 has been acquired from our clinical partners for post library construction pooling and re-arrays

Hard Lessons

• High throughput production protocols need to be robust and easily maintainable.

• This means formal standards in program structure and liquid handling methods must apply.

• Pairing engineers with biologists to develop automation protocols with clear distinctions in mandate but great communication is most successful.

Where Liquid Handler Software Should Be

Figure: Complexity of User Interface plotted against Power of Programming Environment, with two marked regions: "Most Commercial Liquid Handlers" and "The Real Sweet Spot".

Pipetting, or: Things we discovered with difficulty that may have been obvious to others

QC Relies on Accurate Pipetting

• Liquid transfers below 2 µl may depend on:

– Plate geometry

– Tip geometry

– Temperature

– Liquid properties

– Liquid surface angle

– Humidity

The Heartbreak of Backpressure

Consider the following situation of successive transfers:

1. Aspirate 2 µl.

2. Dispense 2 µl, but with an added dispense motion to “blow out” the tips.

3. In air, aspirate extra displacement for the “blow out” step. But any leftover droplets may cause a vacuum to form in the tips.

4. On the next aspiration, the negative pressure from the liquid seal and pre-aspirate in some tips will cause over-aspiration of the sample.

Figures: 2 µl pipetted with a 1 µl blowout; 2 µl pipetted with a 2 µl blowout; 2 µl pipetted with a 4 µl blowout; 2 µl pipetted with a 10 µl blowout.

The Heartbreak of Surface Tension

• Using an Eosin dye assay, we observed up to 20% variability in 2 µl transfers depending on the surface angle of the liquid. Successive single transfers were not consistent.

• Using the same protocol with the Artel MVS QC system, this effect was not observed.

• It was also not observed with DNA transfers.

intelligent Fluorescent Units

Illumina Clusters

• Higher molarity (number of fragments) = denser clusters

• Shorter fragments cluster more efficiently

• The optimum density is 4.6 million clusters/tile with 100 tiles per lane.

• Too high a density prevents base calling

• Runs cost ~$2500/lane

(Image scale bar: 20 microns)

Molarity Estimation

• To get the molarity you have to get the mass and average size of DNA fragments.

• The mass is obtained by a fluorescent assay or qPCR

• Average size is generally determined by visual estimation from an “Agilent trace”

• But this plot doesn’t show molarity, it shows mass.
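The mass-and-size conversion described here can be sketched as below. The sketch assumes double-stranded DNA with an average mass of about 660 g/mol per base pair, a commonly used approximation that is not stated on the slide; the function and constant names are illustrative.

```python
# Assumed average molar mass of one base pair of dsDNA (g/mol).
AVG_BP_MASS = 660.0


def molarity_nM(conc_ng_per_ul, mean_size_bp):
    """Convert a mass concentration (ng/ul) and mean fragment size (bp) to nM."""
    grams_per_mol = AVG_BP_MASS * mean_size_bp
    # ng/ul equals 1e-3 g/L; dividing by g/mol gives mol/L; 1e9 converts to nM.
    return conc_ng_per_ul * 1e-3 / grams_per_mol * 1e9


# Hypothetical library: 2.64 ng/ul at a mean size of 400 bp.
print(round(molarity_nM(2.64, 400), 3))
```

Because size sits in the denominator, an error in the visually estimated mean size carries straight through to the molarity.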

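intelligent Fluorescent Units (iFU)

Consider a capillary divided into sections (s-2, s-1, s, s+1, s+2) and DNA of a mass density ρ(s), where s is the fragment size. A section of width Δx holding mass m(s) gives a signal

$$FU(s) \propto \rho(s) = \frac{m(s)}{\Delta x} = \frac{MW_{DNA}\, s\, n(s)}{\Delta x}$$

The DNA moves with velocity v(s), so after the time t at which size s reaches the detector at distance l, adjacent sizes have separated by

$$\Delta x = \left[ v(s) - v(s+1) \right] t \approx \left| \frac{dv}{ds} \right| t = \left| \frac{dv}{ds} \right| \frac{l}{v(s)}$$

so

$$n(s) = k\, FU(s)\, \frac{1}{s\, v(s)} \left| \frac{dv}{ds} \right|$$

and

$$\mathrm{QuantMass} = MW_{DNA} \int_0^\infty s\, n(s)\, ds = MW_{DNA}\, k \int_0^\infty FU(s)\, \frac{1}{v(s)} \left| \frac{dv}{ds} \right| ds$$

n(s) is the sample molarity, and k is a constant fixed by the quantified mass.

To use this you have to determine v(s) from the DNA ladder.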
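A minimal numerical sketch of this calculation follows. It assumes the relation is read as n(s) ∝ FU(s)·|dv/ds| / (s·v(s)), that v(s) has already been obtained from the ladder at the same sizes as the sample trace, and that only relative molarity is wanted. The function name and inputs are illustrative, not the authors' software.

```python
import numpy as np


def ifu_relative_molarity(sizes_bp, fu, velocity):
    """Relative molarity n(s) from a trace FU(s) and a velocity curve v(s).

    All three inputs are sampled at the same, strictly increasing sizes.
    The result is normalised so that its integral over size equals 1.
    """
    s = np.asarray(sizes_bp, dtype=float)
    f = np.asarray(fu, dtype=float)
    v = np.asarray(velocity, dtype=float)
    # Numerical derivative of velocity with respect to size.
    dv_ds = np.abs(np.gradient(v, s))
    n = f * dv_ds / (s * v)
    # Trapezoid-rule integral of n over size, used for normalisation.
    area = np.sum(0.5 * (n[1:] + n[:-1]) * np.diff(s))
    return n / area
```

Absolute molarity would additionally need the constant k, which the quantified mass fixes.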

An Empirical Method

1. Calculate the area under each ladder band peak

2. Divide each peak’s corresponding mass concentration by its area and generate a correction factor curve

3. Divide the FU signal of the sample by the correction curve to generate a corrected mass concentration curve

4. Convert to molarity by dividing by size
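The four steps can be sketched as follows. Two assumptions are made here and are not from the slides: the correction factor between ladder bands is filled in by linear interpolation, and, because the factor in step 2 is mass per unit area, it is applied to the sample signal by multiplication, which is the dimensionally consistent reading of step 3. All names are illustrative.

```python
import numpy as np


def empirical_molarity(sizes_bp, fu, ladder_sizes_bp, ladder_areas, ladder_mass_conc):
    """Return (corrected mass curve, relative molarity curve) for a sample trace.

    ladder_sizes_bp must be increasing; ladder_areas are the peak areas from
    step 1 and ladder_mass_conc the known mass concentration of each band.
    """
    s = np.asarray(sizes_bp, dtype=float)
    f = np.asarray(fu, dtype=float)
    # Step 2: correction factor (mass per unit area) at each ladder band.
    factor = np.asarray(ladder_mass_conc, dtype=float) / np.asarray(ladder_areas, dtype=float)
    # Assumed: linear interpolation of the factor onto the sample sizes.
    correction = np.interp(s, ladder_sizes_bp, factor)
    # Step 3 (as read here): scale the signal to a mass concentration.
    mass = f * correction
    # Step 4: divide by size to get a quantity proportional to molarity.
    return mass, mass / s
```

The molarity returned is relative; an absolute value would need the molar mass per base pair as well.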

The Two Methods Agree

The two methods of calculation have a 94% correlation

Complementary Errors

1. The x-axis is linear in time not size, so larger sizes are more bunched together.

2. Linearizing the x-axis increases the weight of larger sizes.

3. Converting to molarity increases the weight of smaller sizes.

4. These visual errors cancel for well shaped peaks.

An Extreme Example

Correlation to Aligned Sequence

Top: Actual library fragment sizes found from paired end sequencing and alignment of an amplicon pool.

Middle: A plot of the same pool showing mass vs. linearized size.

Bottom: A plot of the same pool showing molarity calculated by iFU using the empirical method.

For n=8 amplicons, correlation was 60% for Agilent and 70% for iFU. For n=51 WGS samples, correlation was >90% for both.

Weighting for Cluster Efficiency

Figure: Optimal Loading Concentration (OLC) vs. Size. OLC (pM, 0 to 20) plotted against Size [bp] (0 to 1000), with fitted curve y = 4.8843 ln(x) - 15.717, R² = 0.9622.

The relationship of mean size to clustering efficiency is shown in the empirically determined curve above. This gives a function OLC(s) which is then used to find the optimal loading concentration c using the following formula. n(s) is the molarity as a function of size:

$$c = \frac{\int_0^\infty n(s)\, OLC(s)\, ds}{\int_0^\infty n(s)\, ds}$$
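A small sketch of this weighting, read as the molarity-weighted average of OLC(s) over the size distribution, using the fitted curve quoted on the slide (OLC = 4.8843 ln(size) - 15.717, in pM). The trapezoid-rule integration and the function names are assumptions of the sketch.

```python
import numpy as np


def olc_fit(size_bp):
    """Fitted optimal loading concentration (pM) as a function of size (bp)."""
    return 4.8843 * np.log(size_bp) - 15.717


def _trapezoid(y, x):
    # Trapezoid-rule integral of y over x.
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))


def weighted_olc(sizes_bp, molarity):
    """Molarity-weighted average of OLC(s) over a size distribution n(s)."""
    s = np.asarray(sizes_bp, dtype=float)
    n = np.asarray(molarity, dtype=float)
    return _trapezoid(n * olc_fit(s), s) / _trapezoid(n, s)
```

For a narrow peak the result collapses to OLC at the peak size; for a broad amplicon pool it shifts according to how molarity is spread across sizes.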

Weighting for Cluster Efficiency

Figure: Cluster Density for Amplicon Samples. Clusters per tile (3.50E+06 to 6.50E+06) for each amplicon sample, with three series: Cluster Density, iFU Prediction Values, and Target Cluster Density.

This retrospective prediction assumes a linear relationship between loading concentration and cluster density. The iFU with OLC(s) weighting reduces the cluster density error by an average of 60% for these (n=28) amplicons.
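The linearity assumption can be illustrated with a one-line rescaling: if density is proportional to loading concentration, the concentration that would have hit the target follows from the observed density. The numbers below are hypothetical and the function name is illustrative.

```python
def rescaled_loading_conc(loaded_pM, observed_density, target_density):
    """Loading concentration that would give target_density, assuming
    cluster density scales linearly with loading concentration."""
    return loaded_pM * target_density / observed_density


# Hypothetical run: 10 pM loaded gave 5.75e6 clusters per tile; target is 4.6e6.
print(rescaled_loading_conc(10.0, 5.75e6, 4.6e6))
```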

qPCR vs. Intercalating Dye

Figure: MiSeq library clustering efficiency using qPCR quant, all types. Clustering efficiency (K/mm2 per pM, 0 to 140) vs. average library size (bp, 200 to 800). Series: CTL, WGS, PCR-free WGS, WGBS, RNA-Seq, Amplicon, Direct-amplicon. Linear fit (all types): R² = 0.6982.

Figure: MiSeq library clustering efficiency using Qubit quant, all types. Same axes and series. Linear fit (all types): R² = 0.0744.

Conclusions

• For “tight” libraries visual estimation of mean insert size gives good results and cluster errors are more likely to be from mass quantification errors than insert size estimation errors.

• For wider library distributions like amplicons, iFU improves the correlation to the aligned sequence by ~13% over the mass plot.

• For cluster generation, our prediction method shows a reduction in cluster density error by 60% in this set (n=28) of amplicons.

Gel Based Size Selection

Why Gel Based Size Selection

• To purify small or large fragments, like miRNA or kilobase+ fragments

• To mitigate PCR-induced biases in sensitive libraries such as for mRNA

• To rescue libraries where shearing is off target and beads may miss the “peak of peaks”

Barracuda

The GSC has built four 96 channel size selection robots and run over 10,000 samples since 2010.

In Q3-Q4 of 2013, Coastal Genomics will release The Ranger based on this technology.

Hole-to-Hole Size Selection

60mm

miRNA Size Selection

N.B. All of the miRNA sequences in The Cancer Genome Atlas Project were size selected on Barracuda.

Mid Length Range

Two size selections of NGS libraries prior to adapter ligation, each shown before and after size selection. Panel targets: 275-325 bp and 175-225 bp.

Amplicons

Panels: a 1kb target before and after size selection; an amplicon before and after size selection.

10kb Target from HL60

A 9-13kb Target Region

This is an Agilent artifact!

10kb Target From Sheared gDNA

18kb Target from Ladder

A: 1kb Extended ladder

B: 2.5kb Molecular Ruler

C: A+B

A2,B2 Target 18-50kb

A3,B3 Target 25-50kb

Barracuda Control Software

Barracuda uses a single pipettor so each sample is loaded and runs independently.

Performance

• Recovery efficiency is 90% for 10kb bands.

• Time: Barracuda, with a single channel, can take up to 7.5 hours for a full plate

• Coastal Genomics’ Ranger will have 12 channels, so expect full plates in ~1hr depending on protocol.
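As a rough check on the ~1hr figure, the sketch below assumes plate time scales inversely with the number of channels, which ignores shared steps and protocol differences. The function name is illustrative.

```python
def ideal_plate_time_hours(single_channel_hours, channels):
    """Full-plate time if work divides evenly across channels (idealised)."""
    return single_channel_hours / channels


# 7.5 hours on one channel, spread over 12 channels.
print(ideal_plate_time_hours(7.5, 12))
```

Under that idealisation a full plate takes under 40 minutes, so ~1hr leaves some room for steps that do not parallelise.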

Acknowledgements

Instrumentation

Lok Tin Lam

Steven Pleasance

Duane Smailus

Philip Tsao

Quality Assurance

Miruna Bala

Susan Wagner

Andrew Mungall and the Library Construction Team

Cluster Density Estimation

Rod Docking

Lateef Yang

Richard Corbett

Aly Karsan

Coastal Genomics

Jared Slobodan

Matt Nesbitt

Tech D, past and present

Martin Hirst

Michelle Moksa

Yongjun Zhao

Pawan Pandoh

Richard Moore

Michael Mayo and The Sequencing Team

Quality Control in NGS

• You want to have the right amount of DNA going into the protocol and know it’s not “degraded”

• You want to know how much and what size DNA is coming out of the protocol, so you can get the right cluster density.

• In theory this is simple, but sample diversity and length-dependent protocol steps (shearing, bead cleans) make it complex

Declaration

Robin is a co-inventor of the size selection technology described here and a board member of Coastal Genomics.
