Sequencing Library Preparation Sarah Boswell http://scholar.harvard.edu/saboswell



Genomics, Epigenomics & Transcriptomics

Hasin et al. Multi-omics approaches to disease, Genome Biology 18:83 (2017).

Genomics: DNA

Epigenomics: DNA & histone modifications

Transcriptomics: RNA

C. Petit. Evodevomics, BioSciences Master Reviews (2013).

Basic Library Preparation

Small piece of DNA or cDNA

INSERT

INSERT + Sequencing Adapter

Library

Seq Methods

Sequencing Library Preparation

Starting Material: DNA & RNA

RNA-Seq library prep

Genomic library prep

Mate-pair sequencing (circularization)

Low input / single cell library prep

High Quality Starting Material

DNA Extraction

Treat with RNase

RNA Extraction

Treat with DNase

Practice your extraction before the real experiment

Genomic DNA Extraction

https://genomics.ed.ac.uk/resources/sample-requirements

Any of the available kits or protocols

Bacteria or yeast often need additional bead-beating step.

RNase treat all DNA samples.

Quantitation

Absorbance: Nano-drop (50-500 ng/ul)

In theory it can read up to 3000 ng/ul. In practice we find it is only accurate within the range above.

Dye based

SYBR Green, Qubit, Quant-IT
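As a rough illustration of an absorbance reading, the sketch below applies the commonly used A260 conversion factors (about 50 ng/ul per absorbance unit for dsDNA and 40 ng/ul for RNA at a 1 cm path length). The function names and the range check against the 50-500 ng/ul window quoted above are illustrative assumptions, not part of any instrument software.

```python
# Approximate conversion factors (ng/ul per A260 unit, 1 cm path length).
A260_FACTORS = {"dsDNA": 50.0, "RNA": 40.0}

def a260_to_conc(a260, kind="dsDNA", dilution=1.0):
    """Estimate nucleic acid concentration in ng/ul from an A260 reading."""
    return a260 * A260_FACTORS[kind] * dilution

def in_reliable_range(conc, low=50.0, high=500.0):
    """True if the concentration falls in the 50-500 ng/ul window quoted above."""
    return low <= conc <= high

conc = a260_to_conc(2.0, "dsDNA")  # 2.0 * 50 = 100 ng/ul
print(conc, in_reliable_range(conc))
```

Readings outside the window are better confirmed with a dye based assay.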

Genomic DNA QC

https://genomics.ed.ac.uk/resources/sample-requirements http://www.agilent.com/cs/library/applications/5991-5258EN.pdf

Agarose Gel

TapeStation

Genomic DNA Assay

Chromatin Immunoprecipitation (ChIP)

https://www.thermofisher.com/us/en/home/life-science/antibodies/antibodies-learning-center/antibodies-resource-library/antibody-application-notes/step-by-step-guide-successful-chip-assays.html

Crosslinking

Cell Lysis

Chromatin Preparation / Shearing

Immunoprecipitation

Crosslinking Reversal & DNA cleanup


RNA: What is Transcriptomics?

Wikipedia

“The transcriptome is the set of all messenger RNA molecules in one

cell or a population of cells.”

National Cancer Institute Definition of Terms

“The study of all RNA molecules in a cell. Transcriptomics is used to

learn more about how genes are turned on in different types of cells

and how this may help cause certain diseases, such as cancer.”

RNA enrichment

PolyA tailed messenger RNA: mRNA-Seq

Total RNA (rRNA removed): “total” RNA-Seq

Front Genet. 2015 Jan 26;6:2

mRNA (polyA) Purification

mRNA enrichment

mRNA binds beads coated with oligo dT primer

Non-polyadenylated transcripts are washed away

[Figure: polyA tails (AAAA) of mRNA hybridizing to oligo dT (TTTTTT) on beads]

Transcripts Lost in polyA Purification

Ribosomal/Transfer RNA

Histone mRNA

Long-noncoding RNA

Nascent intron containing transcripts

Micro RNA

Degraded RNA

Many viral transcripts

Prokaryote/Bacterial transcripts (in bacteria, polyA is the degradation signal)

rRNA Depletion

Illumina: TruSeq

Probes hybridize rRNA on magnetic

beads

RNA of interest remains in supernatant

KAPA: RiboErase

Probes hybridize rRNA in solution

Hybrids are digested with RNase H

Probes digested with DNase I

Modified from: Scientific Reports 6, article 37876 (2016)

[Figure: bead based and RNase H rRNA depletion; rRNA is removed, leaving purified RNA (mRNA / long noncoding RNA / nascent RNA)]

RNA Extraction

Cultured cells – Easy!

Tissue samples

Use your favorite column kit – Qiagen, Invitrogen, Zymo

For high throughput suggest bead based – MagMax

96 well format column plates are also available

Dissect in cold room if possible

Use RNAlater solution to store tissue sample

Upon extraction proceed to homogenization in cold room

RNA Extraction: Column Purification

On-column DNase digestion

Dry to remove excess ethanol

Elute with warm water to increase yields

http://www.sabiosciences.com/pathwaymagazine/pcrhandbook/10.php

RNA Extraction: Column Purification

Standard columns/protocols have 100-200bp cut off.

Will result in loss of small RNA species

Specialized columns and kits are available

microRNA

FFPE

Blood

HACK - Cut off size can be adjusted by changing the

percent ethanol used for sample binding

RNA Extraction: Bead Purification

Magnetic based purification good for high-throughput

applications

Can use oligo dT beads or total RNA beads

DNase step:

DNase I digestion step is sometimes skipped (polyA libraries only)

Best practice is to keep this step.

Binding Wash DNase Re-bind Wash Elute

http://www.beckman.com/nucleic-acid-sample-prep/rna-isolation/isolation-from-tissue

RNA Extraction: Tissues / Trizol

Keep tissues as cold as possible

Work in cold room

After homogenization suggest column based cleanup

Particularly important if used Trizol for lysis

DNase 10ug then column cleanup

https://www.thermofisher.com/order/catalog/product/AM7020

RNA Quantitation & Quality

Quantitation

Absorbance: Nano-drop (50-500 ng/ul)

In theory it can read up to 3000 ng/ul. In practice we find it is only accurate within the range above.

Dye based

RiboGreen

Qubit / Quant-IT

Quality

Visualize on gel

Agilent Bioanalyzer (RIN)

RNA quality

High quality RNA needed for mRNA libraries

Degraded samples should only be used to make

a “total” RNA-seq library – rRNA removal

FFPE & Archival Samples

mRNA Purification of Degraded Samples

Transcript 1 Transcript 2

PolyA tail no longer attached to transcript.

Results in differential loss of transcripts between samples.

[Figure: oligo dT (TTTTTT) captures only the fragments that still carry a polyA tail (AAAA)]


RNASeq Stranded Library Prep

(dUTP method)

[Figure: stranded library prep workflow, with labels for mRNA purification, strand specific amplification and index]

http://www.rna-seqblog.com/wp-content/uploads/2012/12/library-preparation.jpg

Library Strandedness

http://seqanswers.com/forums/showthread.php?t=44220

ACCATGAACCGTA

TGGTACTTGGCAT

ACCAUGAACCGUA

Read alignment depends on direction of transcription

A transcript’s “sense” strand can correspond to either strand of the reference DNA
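The three sequences on this slide can be reproduced with a few lines of code. This sketch assumes they are read as the top DNA strand, its base-by-base complement written in the same left-to-right order, and the RNA copied from the top strand; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(dna):
    """Base-by-base complement, written in the same left-to-right order."""
    return dna.translate(COMPLEMENT)

def reverse_complement(dna):
    """Opposite strand read in its own 5'->3' direction."""
    return complement(dna)[::-1]

def transcribe(coding_strand):
    """RNA matches the coding strand, with U in place of T."""
    return coding_strand.replace("T", "U")

top = "ACCATGAACCGTA"
print(complement(top))          # TGGTACTTGGCAT
print(transcribe(top))          # ACCAUGAACCGUA
print(reverse_complement(top))  # TACGGTTCATGGT
```

A read from a gene on the opposite strand would match the reverse complement of the reference, which is why a stranded library is needed to tell the two cases apart.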

Library Strandedness

https://galaxyproject.org/tutorials/rb_rnaseq/

Key steps in library preparation

Starting Material

Library amplification bias

Multiplexing

qPCR quantitation

Sequencing read order & terminology

Library Amplification Bias

Final step of library prep is

amplification

Introduces library bias

Some products preferentially

amplified

Fewer cycles = less bias

Modified from: Nature Methods 9, 72-74 (2012)

[Figure: example transcript counts of 2, 13, 4 versus 12, 20, 8, illustrating amplification bias]

Limited Cycle Library Amplification

Number of cycles needed depends on the amount of input RNA (less input needs more cycles).

Library prep kits will recommend a certain

number of cycles.

This is usually optimized for the lower input.

Test how many cycles will give you

enough product.

Fewer cycles = less bias

Perform micro qPCR reaction on small amount of pre-

amplification library (Kapa)

Amplify only the number of cycles needed to get enough product for sequencing (20ul of 4nM product)
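To relate a measured library concentration to the 4nM target, a minimal sketch follows. It assumes roughly 660 g/mol per base pair of dsDNA and ideal doubling at every PCR cycle; real reactions are less efficient, so the cycle estimate is a lower bound to confirm by qPCR. Function names and example values are hypothetical.

```python
import math

def library_nM(conc_ng_per_ul, mean_size_bp):
    """Molarity of a dsDNA library, assuming ~660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_size_bp)

def cycles_needed(current_nM, target_nM=4.0):
    """Whole cycles to reach the target, assuming ideal doubling per cycle."""
    if current_nM >= target_nM:
        return 0
    return math.ceil(math.log2(target_nM / current_nM))

print(round(library_nM(1.0, 400), 2))  # 3.79
print(cycles_needed(0.01))             # 9
```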

Limited Cycle Library Amplification

https://www.kapabiosystems.com/document/kapa-library-quantification-illumina-tds/?dl=1

Limited Cycle Library Amplification

[Figure: library after 9, 10, 11 and 12 cycles, with a PCR bubble marked. Conclusion: amplify 9 cycles.]

Library QC

Quantitation

Dye based

SYBR Green

Qubit / Quant-IT

Size & Quality

Agilent Bioanalyzer

Size determination

Do not use for

quantitation

Peak around 150 bp = primer dimer

Size selection with SPRI beads

http://core-genomics.blogspot.com/2012/04/how-do-spri-beads-work.html

Solid Phase Reversible Immobilization (SPRI) beads

Carboxyl groups on the surface bind DNA in the presence of crowding agents (PEG & NaCl)



Multiplexing (barcodes and indices)

Multiplexing allows optimal use of reads you will get

Charges for sequencing are usually per lane of the flow cell

For RNA-Seq number of reads you need will depend on your

experiment

HiSeq generates ~150 million reads per lane

NextSeq generates ~ 450 million reads (one lane instrument)

10 million standard for transcriptome

20 million standard for total RNA (rRNA depleted)

Make sure to multiplex libraries of similar size
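The read numbers above translate directly into how many libraries fit in one lane. A small sketch, using the approximate figures from this slide (the function name is illustrative):

```python
def samples_per_lane(reads_per_lane, reads_per_sample):
    """Largest whole number of samples that each get the requested reads."""
    return reads_per_lane // reads_per_sample

HISEQ_LANE = 150_000_000   # ~150 million reads per lane
NEXTSEQ_RUN = 450_000_000  # ~450 million reads per run

print(samples_per_lane(HISEQ_LANE, 10_000_000))   # 15
print(samples_per_lane(NEXTSEQ_RUN, 20_000_000))  # 22
```

In practice it is safer to plan for slightly fewer samples than this ceiling, since pooled libraries rarely receive exactly equal read counts.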

Consider Cluster Size in Multiplexing

[Figure: Illumina sequencing workflow: library preparation from DNA (0.1-5.0 μg), cluster growth on a single molecule array, sequencing, image acquisition and base calling]

www.support.illumina.com

Multiplexing

Pool samples based on dye based quantitation

Submit pool to core facility for sequencing.

Make all sequencing libraries in one batch

qPCR quantitate

before sequencing
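Pooling by quantitation amounts to taking the same number of molecules from every library. The sketch below assumes each library concentration has already been converted to nM (1 nM = 1 fmol/ul); the sample names and the 20 fmol per library target are hypothetical.

```python
def pooling_volumes(conc_nM, fmol_each=20.0):
    """Volume (ul) of each library that contributes fmol_each femtomoles."""
    return {name: fmol_each / conc for name, conc in conc_nM.items()}

libs = {"sampleA": 10.0, "sampleB": 5.0, "sampleC": 20.0}
vols = pooling_volumes(libs)

# Final pool molarity = total femtomoles / total volume.
pool_nM = sum(libs[n] * vols[n] for n in libs) / sum(vols.values())
print(vols)               # {'sampleA': 2.0, 'sampleB': 4.0, 'sampleC': 1.0}
print(round(pool_nM, 2))  # 8.57
```

The pool concentration can then be checked by qPCR before submission.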


Sequencing Read Order

1. Read 1

2. Index Read 1 (i7)

3. Index Read 2 (i5)

4. Read 2

HiSeq/MiSeq (4 color)

• A&C read on one camera

• G&T read on other

NextSeq (2 color)

[Figure: library structure showing the Rd1 Seq Primer, Index 1 primer, Index 2 primer (A), Index 2 primer (B) and Rd2 Seq Primer sites, with the INDEX and barcode and/or UMI positions]


Genomic Library Prep

Once you have sheared your DNA this is a quick process

Protocol same as for RNA-Seq once

you have sheared dsDNA

Acoustic shearing – Covaris

Sonication

Hydrodynamic shearing – nebulization

Shear Genomic DNA or begin with cDNA

End Repair (blunt ends)

Add 3’ A Tail

Ligate Adapters

Enrich/Linearize with PCR

Sequencing

http://tucf-genomics.tufts.edu/home/faq

Tagmentation (DNA fragmentation facilitated by transposon activity)

http://www.molecularecologist.com/2015/01/new-to-the-genome-sequencing-8-menu-nextera-library-preps/

Tagmentation Approach

Nextera from Illumina

Very fast and efficient for DNA library preps

Often used as last stage of low input RNASeq library protocols

Works with small amounts of DNA

Important to RNase treat your sample

Needs precise DNA quantitation (Qubit)

https://www.my46.org/intro/whole-genome-and-exome-sequencing

Whole Genome Sequencing (WGS)

vs

Whole Exome Sequencing (WES)

Genomic Assembly with Mate-Pairs

https://www.illumina.com/content/dam/illumina-marketing/images/technology/mate-pair-sequencing-figure.gif

Mate-Pair Sequencing

http://www.illumina.com/

[Figure: mate-pair library construction, showing the Read1 and Read2 positions and the note “Going backwards”]

Mate-Pair Sequencing

Insert pieces of DNA can be long (2-5kb)

Allows for better de novo genome assembly

and finishing of genome assemblies.

Helps in determining presence and location of

genomic rearrangements and amplifications.

Mate-Pair Genomic Library


Low Input & Single Cell RNA-Seq

Lower input = less chance to see mRNA of interest

Need to consider sampling error

High technical variation

Single cell methods will only capture 10-40% of

expected mRNA

Will not reliably detect low-abundance transcripts

Differential expression observed is reliable for highly

expressed genes
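The effect of capture efficiency on detection can be illustrated with a simple model. This sketch assumes each mRNA molecule is captured independently with the same probability, which is a simplification; the copy numbers are hypothetical.

```python
def p_detect(n_molecules, capture_eff):
    """Probability that at least one of n molecules is captured."""
    return 1.0 - (1.0 - capture_eff) ** n_molecules

# Compare the 10% and 40% capture efficiencies quoted above.
for n in (1, 5, 50):
    print(n, round(p_detect(n, 0.1), 3), round(p_detect(n, 0.4), 3))
```

Under this model a transcript present at 5 copies per cell is missed in more than half of cells at 10% capture, while a transcript at 50 copies is almost always seen.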

Single Cell / Low Input Methods

InDrops


Smart-Seq / Drop-Seq / SCRB-Seq

S. Picelli et al., Full-length RNA-seq from single cells using Smart-seq2., Nat Protoc 9, 171–81 (2014).

In Drop-Seq / SCRB-Seq

libraries are enriched for 3’

UMI labeled ends

InDrops Single Cell Sequencing

Lysis and reverse transcription occur in the beads

Samples are frozen after RT as RNA:DNA hybrid in gel.

A. M. Klein et al., Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells., Cell 161, 1187–201 (2015).

InDrops Library Prep / CEL-Seq2

Second-strand synthesis to make full-length dsDNA

IVT back to RNA via T7 promoter

RNA fragmentation

Random hexamer RT with adaptor

PCR to add index and Illumina adaptors

R. Zilionis et al., Single-cell barcoding and sequencing using droplet microfluidics, Nature Protocols 12, 44-73 (2017).

Single Cell / Low Input Methods

Smart-Seq gives full transcript information & detects the most genes per cell.

Droplet based and Smart-Seq give the most accurate differential expression

Drop-Seq 5-12% capture efficiency

InDrops/10x 50-80% capture efficiency

Low Input or Single Cell

Only use if needed for experiment

All commonly used methods rely on the polyA tail

If you can get more starting material then you will

get better results.

Gold standard is TruSeq with >500ng input RNA

Plan a small scale starter experiment to see if

protocol will give useful results

http://rnaseq.uoregon.edu/

Transcript Enrichment:

Capture Sequencing

Capture targeted sequence using

biotinylated RNA bait

Sequencing library applied to

beads

Retain only library covering genes

of interest

Saves money on sequencing

IDT Lockdown probes are expensive but good for a small number of genes

http://rnaseq.uoregon.edu/

Capture Probes

Tiling is the number of times a base is covered by a different

probe.

Difficult to design probes if looking at a single gene family or

pseudogenes.
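Tiling as defined above can be computed per base from the probe positions. A minimal sketch, assuming probes of equal length placed at known start coordinates; the 120 nt probe length and 60 nt spacing are hypothetical values chosen for illustration.

```python
def tiling_depth(target_len, probe_starts, probe_len):
    """Number of probes covering each base of the target (0-based positions)."""
    depth = [0] * target_len
    for start in probe_starts:
        for pos in range(max(start, 0), min(start + probe_len, target_len)):
            depth[pos] += 1
    return depth

# Four 120 nt probes spaced every 60 nt across a 300 nt target.
depth = tiling_depth(target_len=300, probe_starts=range(0, 240, 60), probe_len=120)
print(min(depth), max(depth))  # 1 2
```

Here the interior of the target is covered by two probes (2x tiling) while the first and last 60 bases are covered by only one.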

Final Thoughts

Practice your library prep on a control sample.

Be sure you understand each step in library prep.

Talk to someone who has done the protocol before

starting.

Precise quantitation (qPCR) is key to effective sequencing!

Useful Websites

support.illumina.com/

seqanswers.com/

core-genomics.blogspot.com/2012/04/how-do-spri-beads-work.html

www.broadinstitute.org/files/shared/illuminavids/SamplePrepSlides.pdf