methods, applications & analysis of scrna-seq• project success depends on recognition and good...

35
Methods, applications & analysis of scRNA-Seq: How an integrated understanding of every step makes for a better single cell experiment Michael Kelly, PhD Team Lead, NCI Single Cell Analysis Facility 2018 ABRF Meeting – Satellite Workshop 4 Bridging the Gap: Isolation to Translation (Single Cell RNA-Seq) Sunday, April 22

Upload: others

Post on 07-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Methods, applications & analysis of scRNA-Seq:How an integrated understanding of every step makes for a better single cell

experiment

Michael Kelly, PhDTeam Lead, NCI Single Cell Analysis Facility

2018 ABRF Meeting – Satellite Workshop 4Bridging the Gap: Isolation to Translation (Single Cell RNA-Seq)Sunday, April 22

Page 2: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

End-to-End Support for Single Cell RNA-Seq

• Project success depends on recognition and good integration of all aspects of the single cell genomics workflow

• Failures in one aspect can lead to problems downstream

• Lack of integration makes troubleshooting especially difficult

• Relatively “easy” to collect data; potential to get “stuck” at last step

Experimental Design

Sample Prep &

Handling

Capture & Library Prep

SequencingBioinformatic Analysis & Data Interpretation

Successful Validation &

Project Completion

Page 3: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Lessons learned from various scRNA-Seq projects• Selecting the right approach for the project goals• Importance of numbers and capture efficiency• Full-length transcript versus 3’ (or 5’) end-counting

• Sample prep matters• Single nuclei RNA-Seq for specific applications• Effects of low viability sample preps on downstream data quality

• Data wrangling• What single cell RNA-Seq looks like and why the reference is important

• A proposed model for integrated informatics support

Page 4: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Selecting the right approach for the project goals

Importance of numbers and capture efficiency

Page 5: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Single Cell RNA-Seq to Identify Cell-Type Specific Transcriptional Programs in Mammalian Cochlea

Modified from Waddington 1957

Prosensory Domain

Embr

yoni

cPo

stna

tal

Organ of Corti Goals of our single cell RNA-Seq study• Transcriptional programs• Understand changes in cellular plasticity• Novel cell type markers

Page 6: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Modified from Waddington 1957

Theoretical principle of trajectory analysis with single cell RNA-Seq data

Undifferentiated Precursors

Differentiated Cell Types

Single cell profiles at multiple profiles provide snapshots of

transcriptional expression across many cells

Page 7: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Modified from Waddington 1957

Theoretical principle of trajectory analysis with single cell RNA-Seq data

Undifferentiated Precursors

Differentiated Cell Types

Individual cell types can be identified from each time-point

Page 8: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Modified from Waddington 1957

Theoretical principle of trajectory analysis with single cell RNA-Seq data

Asynchrony in development allows for generation of a temporal model of expression across ”Pseudotime”

Undifferentiated Precursors

Differentiated Cell Types

Pseu

dotim

e

Transcriptional trajectory associated with differentiation of each unique cell type

can be analyzed

Page 9: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Single cell profiling with Fluidigm C1 was a good start, but was too low throughput

Page 10: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Original single cell dataset could only distinguish major cell type differences

From Burn & Kelly et al 2015

Page 11: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Drop-Seq scRNA-Seq provided better resolution than Fluidigm C1

Kelly et al 2012

HCs 1

HCs 2

Non−epithelial

Interdental med NSCs 1

med NSCs 2

lat SCs

med SCs

−80

−40

0

40

80

−100 −50 0 50tSNE_1

tSN

E_2

HCs 1HCs 2Non−epithelialInterdentalmed NSCs 1med NSCs 2lat SCsmed SCs

Drop-Seq data generated with Joey Mays (post-bac).

Outer Hair Cells

Inner Hair Cells

-Relatively low cost, once established and working consistently (not guaranteed…)-Considerable troubleshooting required (not plug & play)-Lower sensitivity (1-2,000 genes detected per cell) than Fluidigm C1 or FAC-Seq-Limited experimental design (captures back to back)

~1700 cells

Page 12: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

6

SingleCellPartitioning,LysisandBarcoding

10

•Standardsequencing configurations

•Easiertomultiplexwithnon-SClibraries

•HighqualityUMIandCellBarcodereads

•Highperformanceonpatternedflowcells

3’AssayScheme- GelBeadRTPrimersforInlineBarcoding• Cell partitioned with barcoded gel bead

• Cell Barcode & Unique Molecular Identifiers (UMI) on 3’-end of cDNA

• ~2000 median genes detected on average with ~50,000 mean sequencing reads per cells

• Gene-level counts data

10X Genomics droplet-based single cell RNA-Seq provided high-throughput method to profile large population of cells

in unbiased manner

• High efficiency of capture in 10X platform is amenable to smaller input numbers of cells (small tissue or rare population)

• Much better suited to limited samples like mouse cochlea…

Page 13: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Workflow of droplet-based single cell RNA-Seqprofiling of the mammalian cochlea

• Novel Cell Type Markers

• Transcriptional Programs

Modified from Cantos et al 2000

Data Analysis.And More Data Analysis.

Extract cochlear epithelium Dissociate cells

Single cell cDNA library generation & sequencing

Page 14: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

• ~5000 P1 cochlear epithelia cells• 99 Inner Hair Cells• 294 Outer Hair Cells• 88 Inner Phalangeal Cells• 54 Inner Pillar Cells• 229 Lateral Supporting Cells• 3753 Medial Non-sensory Cells• 74 Lateral Non-Sensory Cells

Single cell RNA-Seq of postnatal day 1 cochlea identifies expected cell subtypes

Clusters defined in unbiased manner.

Cluster identities determined from marker gene expression.

Differential expression analysis can reveal genes uniquely enriched in each cell subtype (or groups of cells)

Cochlear duct cross sectionSensory cells in Red and Green

Page 15: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Modified from Cantos et al 2000 (Wu Lab)

0

1 2

3

4

5

6

7

−20

0

20

−30 −20 −10 0 10 20tSNE_1

tSN

E_2

0

1

2

3

4

5

6

7

0

1

2

3

4

5

6

7

8

9

1011

12

13

14

15

−50

−25

0

25

−25 0 25tSNE_1

tSN

E_2

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

0

1

2

3

4

5

6

7

8

9

10

11

12

13

−60

−40

−20

0

20

40

−25 0 25tSNE_1

tSN

E_2

0

1

2

3

4

5

6

7

8

9

10

11

12

13

0

12

3

4

5

6

7

8

9

10

11

12

13

14

−25

0

25

−20 0 20 40tSNE_1

tSN

E_2

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

E12 Cochlea E14 Cochlea E16 Cochlea P7 CochleaP1 Cochlea

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

−25

0

25

50

−25 0 25 50tSNE_1

tSN

E_2

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

4,153 Cells 10,489 Cells 9,741 Cells 9,541 Cells 3,812 Cells

As cells move from undifferentiated precursors to differentiated cell types, they become more transcriptionally distinct

Greater number of scRNA-Seq datapoints allows better modeling of dynamic expression changes during differentiation

Page 16: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Selecting the right approach for the project goals

Full-length transcript versus 3’ (or 5’) end-counting

Page 17: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Full-length scRNA-Seq should allow isoform discrimination / quantitation – requires sufficient coverage

0.50

0.75

1.00

1.25

0 25 50 75 100Position

valu

e

variableS37_A02

S37_A03

S37_A04

S37_A05

S37_A06

S37_B01

S37_B02

S37_B03

S37_B05

S37_C01

S37_C02

S37_C03

S37_C04

S37_C05

S37_C06

S37_D01

S37_D02

S37_D03

S37_D04

S37_D05

S37_D06

S37_E01

S37_E02

S37_E03

S37_E04

S37_E05

S37_E06

S37_F01

S37_F02

S37_F03

S37_F04

S37_F05

S37_F06

S37_G01

S37_G02

S37_G03

S37_G04

S37_G05

S37_G06

1

2

3

0 25 50 75 100Position

valu

e

variableS13_A01

S13_A02

S13_A03

S13_A04

S13_B01

S13_B02

S13_B03

S13_B04

S13_C01

S13_C02

S13_C03

S13_C04

S13_D01

S13_D02

S13_D03

S13_D04

S13_E01

S13_E02

S13_E03

S13_E04

S13_F01

S13_F02

S13_F03

S13_F04

S13_G01

S13_G02

S13_G04

S13_H01

S13_H02

S13_H03

S27_A03

S27_A04

S27_A06

S27_B03

S27_B04

S27_B05

S27_B06

S27_C03

S27_C04

S27_C05

S27_C06

S27_D04

S27_D05

S27_D06

S27_E01

S27_E02

S27_E04

S27_E05

S27_E06

S27_F01

S27_F03

S27_F04

S27_F05

S27_F06

S27_G01

S27_G02

S27_G03

S27_G04

S27_G05

S27_G06

S27_H03

S27_H04

S27_H05

S27_H06

5’ to 3’ Coverage Plots

Fluidigm C1 (v1 Chemistry) Clontech SMARTer v4 and SMART-Seq2

Page 18: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Single cell gene specific target amplification & qPCR to study crucial differential isoform usage in developing cochlea

Whole transcriptome scRNA-Seq data was too sparse and had 3’ bias – not reliable enough. Moved to STA -> qPCR approach

Mutation in mice and humans in intron of gene leads to mis-splicing and deafness. Cell type specific phenotype. C1 Biomark HD

Collaboration with Banfi Lab at Univ of Iowa – Nakano et al paper in final review

Page 19: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Sample prep matters

Single nuclei RNA-Seq for specific applications

Page 20: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Single nucleus RNA-Seq provides a useful alternative to whole single cell RNA-Seq

• Allows scRNA-Seq profiling of difficult to dissociate tissues• Neuronal tissues• Biobanked samples

• Avoid dissociation-associated transcriptional changes• Nuclear transcriptome

representative of whole cell transcriptome• Lower sensitivity & more

intronic readsLake et al 2016 Science

Habib et al 2017 Nature Methods

Page 21: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Single nuclei DropSeq to study spinal cord neurons and cell type specific activity in behavior

Sathyamurthy & Johnson et al (2018) Cell Reports

Modified detergent concentration and adjusted flow rates to optimize single nuclei encapsulation efficiency

Time post-treatment of target signal upregulation

Page 22: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Habib et al 2017 Nature Methods “DroNc-seq“

0

25000

50000

75000

Fres

h

Nuc

lei

Tim

e

Actin

oD

Identity

nUMI

0

2500

5000

7500

Fres

h

Nuc

lei

Tim

e

Actin

oD

Identity

nGene

0.00

0.05

0.10

0.15

Fres

h

Nuc

lei

Tim

e

Actin

oD

Identity

percent.mito

0

1

2

3

4

5

Fres

h

Nuc

lei

Tim

e

Actin

oD

Identity

Fos

0

1

2

3

4

Fres

h

Nuc

lei

Tim

e

Actin

oD

Identity

Jun

Single nuclei RNA-Seq has limited sensitivity – still good for identifying major cell types

Hair Cells

Supp Cells

GER1GER2

GER3GER4

InterdentalLER

Oc90+

Mes

Igfbp7+−20

0

20

−20 0 20tSNE_1

tSN

E_2

Hair Cells

Supp Cells

GER1

GER2

GER3

GER4

Interdental

LER

Oc90+

Mes

Igfbp7+

IHC OHC

IPhC

IPC

OPCDC1

DC2

DC3Hensen

Border GER1

GER2GER3

GER4Interdental

LERBmp4+

Oc90+

−20

0

20

40

−30 0 30tSNE_1

tSN

E_2

OHC

IPhC

IPC

OPC

DC1

DC2

DC3

Hensen

Border

GER1

GER2

GER3

GER4

Interdental

LER

Bmp4+

Oc90+

Evaluating sensitivity and effectiveness with cochlear epithelial cells

Page 23: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Sample prep matters

Effects of low viability sample preps on downstream data quality

Page 24: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

scRNA-Seq capture input of a high viability cell preparation is important for

• 10X Genomics recommends loading 90% viable cells or higher

• Often the underlying causes of below target # of cells

• Can contribute a “background signal” –ambient RNA

Page 25: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Effects of low cell viability on downstream and options to enrich for viable cells

~10% cell viability & no dead cell removal

Dead cell removal -> 50% cell viability

No clear inflection point

Viable cells with good transcript counts; low background

Page 26: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Any selection or manipulation can have the effect of biasing cell populations

• Selection / enrichment is trade—off between ”cleaner” data and a potential to bias the data

• Alignment of cell rations with histology or flow cytometry can provide insight and confidence

CD45

Exp

ress

ion

Page 27: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Data wrangling What single cell RNA-Seq looks like and why the reference is important

Page 28: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Type of downstream data depends on scRNA-Seqmethod - data defined by cDNA library type

• ”Full-length” scRNA-Seq methods generate reads that can span entire transcript length –might not cover very 5’ or 3’ end

• 3’ or 5’ transcript end enriched scRNA-Seqfor gene-level counts (5’ for VDJ methods)

Page 29: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

What does ”full-length” single cell RNA-Seq data look like?

1

2

3

0 25 50 75 100Position

valu

e

variableS13_A01

S13_A02

S13_A03

S13_A04

S13_B01

S13_B02

S13_B03

S13_B04

S13_C01

S13_C02

S13_C03

S13_C04

S13_D01

S13_D02

S13_D03

S13_D04

S13_E01

S13_E02

S13_E03

S13_E04

S13_F01

S13_F02

S13_F03

S13_F04

S13_G01

S13_G02

S13_G04

S13_H01

S13_H02

S13_H03

S27_A03

S27_A04

S27_A06

S27_B03

S27_B04

S27_B05

S27_B06

S27_C03

S27_C04

S27_C05

S27_C06

S27_D04

S27_D05

S27_D06

S27_E01

S27_E02

S27_E04

S27_E05

S27_E06

S27_F01

S27_F03

S27_F04

S27_F05

S27_F06

S27_G01

S27_G02

S27_G03

S27_G04

S27_G05

S27_G06

S27_H03

S27_H04

S27_H05

S27_H06

0.50

0.75

1.00

1.25

0 25 50 75 100Position

valu

e

variableS37_A02

S37_A03

S37_A04

S37_A05

S37_A06

S37_B01

S37_B02

S37_B03

S37_B05

S37_C01

S37_C02

S37_C03

S37_C04

S37_C05

S37_C06

S37_D01

S37_D02

S37_D03

S37_D04

S37_D05

S37_D06

S37_E01

S37_E02

S37_E03

S37_E04

S37_E05

S37_E06

S37_F01

S37_F02

S37_F03

S37_F04

S37_F05

S37_F06

S37_G01

S37_G02

S37_G03

S37_G04

S37_G05

S37_G06

5’ to 3’ Coverage Plots

Page 30: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Public reference annotations don’t cover all the possible transcript isoforms and polyA sites

• Be aware that you might be missing some of your reads for less mature reference annotations• 3’ (or 5’) end counting could exacerbate this? • Cell-type specific rare exon usage of alternative polyA signal?

Bulk RNA-Seq Data 3’ Only scRNA-Seq

Page 31: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

A complex genome adds some additional challenges – 3’ ends of genes can overlap

• “Unstranded” RNA-Seq reads aligning to overlapping 3’ end can’t be reliably disambiguated

• Interestingly, 3’-end only (or 5’-only) libraries are inherently “stranded”

Page 32: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

A proposed model for integrated informatics support

Page 33: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Guidance, Training & Directed Informatics Support

Experimental Design & Planning

Core Facility Embedded Bioinformatics

Two-way communication at wet-bench steps

Data QC and primary data processing

Subject matter expert query of data & identify specific testing

Basic level informatics-trained end-user

Advanced informatic analysis

Bioinformatic support: Lab-embedded, third-party, or

core-embedded

Page 34: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Single Cell Analysis Facility

SequencingFacility & Genomics

Tech Laboratory

NCI Genomics

CoreProtein Analysis

Core

NCI Investigator Labs (clinical & basic science)

& NIH Intramural CommunityBroader Single Cell Community

Bioinformatic Analyst will be embedded in our Single Cell Analysis Facility and will work closely with SCAF team & NCI Investigator

NIH Bethesda Campus

Frederick National Lab Campus

• Sequencing Production

• Project Tracking • R&D Innovation• Analysis Workflows

NIH Clinical Center on Main Bethesda Campus

NCI Single Cell Analysis Facility

Collaborative Bioinformatics

& NCI Data Science

Laboratory

NCI Single Cell Analysis Facility as a collaborative team (we are hiring a bioinformatic analyst!)

NCI Flow Core

Page 35: Methods, applications & analysis of scRNA-Seq• Project success depends on recognition and good integration of all ... Nakano et al paper in final review. Sample prep matters

Acknowledgements

Cochlear scRNA-Seq work was funded by NIDCD Intramural Research Program - DC000021 (MWK) & Other Sources of

Funding for Collaborators

Laboratory of Cochlear DevelopmentMatt KelleyJoseph MaysAlejandro AnayaMahmoud KhalilBetsy DriverWeise ChangKathryn EllisHanna SherrillTessa SandersKuni Iwasa

NIDCD Genomics & Computational Biology Core

Robert MorellErich Boger

Decibel TherapeuticsJoe BurnsAdam PalermoKathy So

CollaboratorsBanfi Lab (Univ of Iowa)Levine Lab (NINDS)Friedman Lab (NIDCD)Klein Lab (NICHD)

Developers of open-source analysis packages

NIH High-performance computing staff

NCI Center for Cancer Research Office of Science Technology ResourcesNCI Genomics Core

Val Bliskovsky, PhD & Liz Conner, PhD

Frederick National Lab - Cancer Technology Research ProgramSequencing Facility & Genomic Technology Lab

NIDCR Flow Cytometry CoreMehrnoosh Abshari