Page 1

Next Generation Cell Sorting

New Tools and Methods

How do we bridge computational data analysis with single cell sorting?

How can we sort using derived parameters (cluster ID, tSNE coordinates, etc.)?

Classify: Identify Populations

Train: Create Training Sets

Sort: Classify Events and Sort

Joe Trotter [email protected]

Posted on 06-Jul-2020

Page 2

The Advantage of Algorithm Driven Methods

Obtain data driven perspectives that provide insights during exploration/discovery

Consider San Francisco Bay

Other Views and Perspectives can Provide Insights

Page 4

The Advantage of Algorithm Driven Methods

Consider an immunofluorescence panel with 27 markers (27 colors + 6 scatter measurements)

A gated population is defined in a series of 2D projections

Very Different Views of the Same Data Set

A cluster is defined by all dimensions simultaneously as a region of local density in marker space

High Dimensional Space (Force Directed Layout of 27 Marker Dataset)
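
To put the "series of 2D projections" in perspective, the number of distinct biaxial plots grows quadratically with the number of measurements. A minimal sketch, assuming the 33 measurements quoted on this slide and counting unordered pairs (the function name is illustrative only):

```python
from math import comb


def biaxial_plot_count(n_measurements: int) -> int:
    """Number of distinct unordered pairs of measurements (2D projections)."""
    return comb(n_measurements, 2)


# 27 colors + 6 scatter measurements, as stated on the slide.
n_measurements = 27 + 6
n_plots = biaxial_plot_count(n_measurements)
print(n_plots)  # 528
```

A manual gating strategy only ever looks at a handful of these projections, whereas a cluster is defined using all dimensions at once.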

Page 5

The Advantage of Algorithm Driven Methods

Obtain data driven perspectives that provide insights during exploration/discovery

[Figure panels: Opt-SNE by CD25; UMAP by CD25; Opt-SNE by Cluster ID; UMAP by Cluster ID; label: Cluster 22]

4 Stage Cleanup into All Cells… then dimensionality reduction and cluster (tSNE, UMAP, X-Shift)

Page 6

The Advantage of Algorithm Driven Methods

Obtain data driven perspectives that provide insights during exploration/discovery

[Figure panels: Opt-SNE Contour; UMAP Contour; Opt-SNE by Cluster ID; UMAP by Cluster ID; label: Cluster 22]

4 Stage Cleanup into All Cells… then dimensionality reduction and cluster (tSNE, UMAP, X-Shift)

Page 7

The Advantage of Algorithm Driven Methods

Obtain data driven perspectives that provide insights during exploration/discovery

[Figure panels: Opt-SNE by CD45RA; UMAP by CD45RA; Opt-SNE Contour; UMAP Contour; labels: CD4, CD25, CD45RA]

4 Stage Cleanup into All Cells… then dimensionality reduction and cluster (tSNE, UMAP, X-Shift)

Page 8

Find Gates for Clustered Populations?

Published New Alternate Strategies

Figure from Becht et al.

Page 9

GateFinder: Aghaeepour et al., Bioinformatics 2018

• GateFinder uses clustered/gated populations as training sets

• Computes a gating scheme using Polygons with convex hulls

• Uses each measurement only once

GateFinder: Projection-based Gating Strategy Optimization for Flow and Mass Cytometry

Nima Aghaeepour*, Erin F. Simonds*, David JHF Knapp, Robert V. Bruggner, Karen Sachs, Pier Federico Gherardini, Nikolay Samusik, Sean C. Bendall, Gabriela K. Fragiadakis, Brice Gaudilliere, Martin S. Angst, Connie J. Eaves, William A. Weiss, Wendy J. Fantl, Garry P. Nolan, Bioinformatics, 2018.

+ Can output complex (convex) polygon gates

- Does not guarantee a globally optimal strategy

* Requires R Environment and library
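
The idea of a convex polygon gate can be illustrated with a short sketch. This is not GateFinder's implementation (which is an R library); it only assumes that a gate on one biaxial plot is the convex hull of the training-set events, and all names and data below are hypothetical:

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_convex_gate(hull: List[Point], p: Point) -> bool:
    """True if p lies inside or on the boundary of a counter-clockwise convex polygon."""
    if len(hull) < 3:
        raise ValueError("a polygon gate needs at least 3 vertices")
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Made-up training events on one biaxial plot (two measurements per event).
target_events = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
gate = convex_hull(target_events)
print(gate)                          # [(0, 0), (4, 0), (4, 3), (0, 3)]
print(in_convex_gate(gate, (1, 1)))  # True
print(in_convex_gate(gate, (5, 1)))  # False
```

A hull fitted this way always contains every training event, so any improvement in purity has to come from combining gates on several plots.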

Page 10

HyperGate: Becht et al., Bioinformatics 2019

• HyperGate uses clustered/gated populations as training sets

• Computes a gating scheme using Rectangles

• Achieves a higher yield and purity than human experts, SVMs, and Random Forests on public data sets

+ Provides a globally optimal strategy

- Produces only rectangular gates

* Requires R Environment and library
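
Yield and purity of a rectangular gate can be sketched as follows. This is an illustration rather than HyperGate's code: it assumes yield means the fraction of target events captured and purity means the fraction of gated events that are targets, and the events and bounds are made up:

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: dimension index -> (lower bound, upper bound).
Box = Dict[int, Tuple[float, float]]


def in_box(event: Sequence[float], box: Box) -> bool:
    """True if the event falls inside the bounds on every gated dimension."""
    return all(lo <= event[d] <= hi for d, (lo, hi) in box.items())


def yield_and_purity(
    events: List[Sequence[float]], is_target: List[bool], box: Box
) -> Tuple[float, float]:
    """Return (yield, purity) of the rectangular gate on labelled events."""
    gated = [in_box(e, box) for e in events]
    true_pos = sum(g and t for g, t in zip(gated, is_target))
    n_gated = sum(gated)
    n_target = sum(is_target)
    gate_yield = true_pos / n_target if n_target else 0.0
    purity = true_pos / n_gated if n_gated else 0.0
    return gate_yield, purity


# Made-up two-dimensional events; the first three are the target population.
events = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (8.0, 8.0), (2.0, 9.0)]
is_target = [True, True, True, False, False]

one_dim = yield_and_purity(events, is_target, {0: (0.0, 2.5)})
two_dim = yield_and_purity(events, is_target, {0: (0.0, 2.5), 1: (0.0, 4.0)})
print(one_dim)  # yield 2/3, purity 2/3
print(two_dim)  # yield 2/3, purity 1.0
```

Adding a bound on the second dimension removes the contaminating event without losing any further targets, which is the trade-off a rectangle optimizer searches over.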

Page 11

HyperFinder

Computes optimized gating strategy using polygon gates

• Split the data into training and validation sets (other algorithms seem to skip this step!)

1. Build a biaxial gate sequence

  • Either as convex polygon gates (GateFinder-like)

    a. Try gates in all pairs of dimensions

    b. Find the best sequence of gates by extending the sequence and/or replacing gates in the sequence and/or removing useless gates from the sequence

  • Or as hyper-rectangular gates (HyperGate-like)

    a. Optimize hyperbox boundaries around the target population

    b. Iteratively reduce the list of dimensions by dropping the least useful dimension one at a time

2. Stochastically optimize boundaries to maximize the F1-score

  • Allow each vertex to move independently

• Report precision, recall, and F1-score using the validation set
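
The split-then-report steps can be sketched in a few lines. This is a generic illustration, not the plugin's code; the function names are hypothetical, and the 0.7 training fraction is an assumption made for the example only:

```python
import random
from typing import List, Tuple


def split_indices(n: int, train_fraction: float, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split event indices into training and validation sets."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def precision_recall_f1(predicted: List[bool], actual: List[bool]) -> Tuple[float, float, float]:
    """Precision, recall and F1 of gate membership against the true labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


train_idx, valid_idx = split_indices(10, train_fraction=0.7)  # assumed fraction
print(len(train_idx), len(valid_idx))  # 7 3

# Made-up gate calls and labels for four validation events.
scores = precision_recall_f1([True, True, False, True], [True, False, True, True])
print(scores)  # all three values equal 2/3 for this example
```

Scoring on events the optimizer never saw gives a more honest estimate of how the gates will behave on the sorter.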

Page 12

Explaining the HyperFinder algorithm

[Flowchart; legend: Algorithmic step, Object]

Parameters: MIN_FMEAS_IMPROVEMENT = 0.005; MAX_NUM_GATES = 8 (default); Fβ: weighted F-measure; Λi: lower bound; Vi: upper bound

Input Data

• Identify target and background populations

• Random Split (30% / 70%) into Training Set and Validation Set

GateFinder-like initialization (default)

• Fit convex polygons to each possible biaxial plot, giving a set of (dim-1)*dim/2 polygon gates

• Try adding every unused gate to the gating strategy. Pick the one that maximizes the overall Fβ score

• Try replacing every gate in the gating strategy with every unused gate. Accept the replacement if the Fβ score is improved

• Stop when either MAX_NUM_GATES is reached, or adding any gate improves the Fβ by less than MIN_FMEAS_IMPROVEMENT

OR

HyperGate-like initialization

• Create a bounding hyperbox around the target cell set, by defining Λi and Vi on each dimension di ∈ D

• Adjust Λi, Vi on every dimension di ∈ D, such that they maximize the Fβ score of the hyperbox

• Remove the dimension di whose removal has the smallest impact on the Fβ score of the hyperbox

• Stop when either count(D) = MAX_NUM_GATES*2, or no further dimension can be removed without impacting Fβ by more than MIN_FMEAS_IMPROVEMENT

• Create rectangular biaxial gates based on Λi, Vi for each remaining dimension

• Turn rectangles into polygons

Tentative gating strategy

Stochastic optimization of polygon boundaries

• Try moving every polygon point by a random amount either toward or away from the polygon center

• Accept the change if it improves the overall Fβ

• Stop when Fβ cannot be improved any further

Redundant gate removal

Optimized Gating Strategy (non-convex polygons)

Compute Fβ score using validation set

Export gates and Fβ score to FlowJo

END
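
The forward-selection loop of the GateFinder-like branch can be sketched as below. This is a simplified illustration, not the plugin's implementation: it assumes each candidate gate is given as a per-event membership mask, it uses the standard Fβ definition, and it omits the gate-replacement step. The two constants are the values shown on the slide; everything else is hypothetical:

```python
from typing import List, Tuple

MIN_FMEAS_IMPROVEMENT = 0.005  # value shown on the slide
MAX_NUM_GATES = 8              # default shown on the slide


def f_beta(predicted: List[bool], actual: List[bool], beta: float = 1.0) -> float:
    """Standard F-beta: (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def apply_gates(masks: List[List[bool]], n_events: int) -> List[bool]:
    """An event passes the strategy only if it passes every gate (logical AND)."""
    passed = [True] * n_events
    for mask in masks:
        passed = [p and m for p, m in zip(passed, mask)]
    return passed


def greedy_gate_selection(
    candidates: List[List[bool]], actual: List[bool], beta: float = 1.0
) -> Tuple[List[int], float]:
    """Repeatedly add the unused gate that gives the highest F-beta."""
    n = len(actual)
    chosen: List[int] = []
    best = f_beta(apply_gates([], n), actual, beta)
    while len(chosen) < MAX_NUM_GATES:
        scored = [
            (f_beta(apply_gates([candidates[j] for j in chosen + [i]], n), actual, beta), i)
            for i in range(len(candidates))
            if i not in chosen
        ]
        if not scored:
            break
        score, idx = max(scored)
        if score - best < MIN_FMEAS_IMPROVEMENT:
            break  # no remaining gate improves the score enough
        chosen.append(idx)
        best = score
    return chosen, best


# Made-up example: 3 target events followed by 5 background events.
actual = [True, True, True, False, False, False, False, False]
candidates = [
    [True, True, True, True, True, False, False, False],  # removes events 5 to 7
    [True, True, True, False, False, True, True, False],  # removes events 3, 4, 7
    [True, True, True, True, True, True, True, False],    # removes event 7 only
]
chosen, best = greedy_gate_selection(candidates, actual)
print(sorted(chosen), best)  # [0, 1] 1.0
```

The third candidate is never selected because adding it changes the score by less than MIN_FMEAS_IMPROVEMENT, which is how the stopping rule keeps the strategy short.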

Page 13

Workflow: using a new FlowJo plugin toolkit

Data Collection -> Analyze Data -> Training Set -> Gating Strategy -> Validate -> Sort

Plugins: t-SNE and X-Shift Clustering, ClusterExplorer, HyperFinder

Page 14

Explaining the workflow

Clustering

BD FACSDiva™ Export function available in FlowJo v10.6

BD FACSDiva™

Finding the cluster of interest

Removing the outliers by t-SNE gating on the clustered population

ClusterExplorer FlowJo plug-in

t-SNE Cleanup

HyperFinder

FlowJo plugins: X-shift, Phenograph, FlowMeans, FlowSOM, …

Conventional gating

Use cases: 1) Shorten the gating strategy 2) Try to capture the same population while leaving some markers out

Use case: Objectively define a gating strategy for a multidimensional clustered population

Sorting

10-15 min on 100K dataset

Page 15

Example Sorting Experiment: A Typical Monocyte Sort

BD FACSAria™ Cell Sorter

1. Forward Scatter

• FSC-H

• FSC-W

2. Side Scatter

• SSC-H

• SSC-W

3. Live/Dead (Zombie Green)

4. HLA-DR (BV785)

5. CD14 (PerCP-Cy5.5)

6. CD16 (APC)

10 Measurements (10 Dimensional Data Set)

Goal: Sort Classical Monocytes (CD14 high, CD16 low)

No compensation required

Page 16

Workflow of a simple example: A Typical Monocyte Sort

1. Clean up sample to remove debris, clumps, and doublets/outliers using manual gating in FlowJo

• Gates 1, 2, 3 -> clean up most debris and doublets, and include all remaining Leucocytes

• This allows us to use FSC-A, SSC-A, and all relevant fluorescence measurements for tSNE and Clustering

[Figure: gates 1, 2, 3]

Page 17

Cluster and display in a familiar dimensionality reduction plot

2. Cluster All Leucocytes and create a tSNE map of the data set

• Unsupervised Clustering of All (singlet) Leucocytes (X-Shift)

• tSNE map of All Single Cell Leucocytes, colored by X-Shift Cluster ID and CD14 Expression

Page 18

Bring up ClusterExplorer plugin

Page 19

Use ClusterExplorer

• Select CD14 Positive from Expression Profile and verify Cluster/tSNE ID(s) and map distribution (CD14+ is cluster 2)

• In this case, we can simply draw a cleanup gate on the tSNE map of Cluster 2

Page 20

Display target cluster in Opt-SNE

• Plot Cluster 2 in FlowJo tSNE and draw a cleanup gate

Page 21

Apply cleanup gate in Opt-SNE

• Plot Cluster 2 in FlowJo tSNE and draw a cleanup gate

Page 22

Bring up HyperFinder plugin

• Select Sorting Parameters, then choose the Training Set for HyperFinder

Page 23

Verify HyperFinder results in FlowJo

Ensure the computed gating strategy produces the desired population to be sorted

• HyperFinder creates a two-gate strategy: 98.9% F1 Measure

• Verify in FlowJo

Page 24

Verify HyperFinder results in FlowJo

[Figure panels: Training Set; HyperFinder Set]

98.9% F1 Measure

5171 Events

Page 25

• Export the FlowJo workspace as a new FACSDiva™ Experiment

Export the gates to FACSDiva™

• Export the gates as a new Experiment file and load into FACSDiva™ Software on the cell sorter

• Compensation may be optionally exported

• The Biexponential scaling (if used for clustering) will be exported as well.

• Select either Dot plot or Density plot style for new worksheet plots

Contains Original Experiment + New Worksheet

Page 26

Example Sorting Experiment: The Original Worksheet

FACSAria™ (FACSDiva™ SW)

1. Forward Scatter

• FSC-H

• FSC-W

2. Side Scatter

• SSC-H

• SSC-W

3. Live/Dead (Zombie Green)

4. HLA-DR (BV785)

5. CD14 (PerCP-Cy5.5)

6. CD16 (APC)

10 Measurements (10 Dimensional Data Set)

Goal: Sort Classical Monocytes (CD14 high, CD16 low)

No compensation required

4888 Events, 5.7% of Total

Page 27

Example Sorting Experiment: Computational Worksheet

FACSAria™ (FACSDiva™ SW)

1. Forward Scatter

• FSC-H

• FSC-W

2. Side Scatter

• SSC-H

• SSC-W

3. Live/Dead (Zombie Green)

4. HLA-DR (BV785)

5. CD14 (PerCP-Cy5.5)

6. CD16 (APC)

10 Measurements (10 Dimensional Data Set)

Goal: Sort Classical Monocytes (CD14 high, CD16 low)

No compensation required

5237 Events, 6.2% of Total
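
For a quick side-by-side of the two worksheets, the event counts quoted on the slides can be compared directly. A small arithmetic sketch, assuming the two counts come from comparable recordings (the slides do not state this):

```python
manual_events = 4888    # original worksheet, from the slide
computed_events = 5237  # computational worksheet, from the slide

extra_events = computed_events - manual_events
relative_increase = extra_events / manual_events

print(extra_events)                # 349
print(f"{relative_increase:.1%}")  # 7.1%
```

The two gating strategies therefore select populations of similar size, with the computed gates capturing somewhat more events.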

Page 28

Another Use Case: a gate challenge

12 Gates required to identify a rare population, but the sorter has a maximum gate depth of 8?

[Figure: manual gating panels 1 to 12]

The Training Set

Single Lymph Scatter -> CD45+ CD19- CD3+ CD4+ CD8- (CD25lo CD127hi) CD45RA- CCR-6- CCR-4- CXCR-3+ CCR-5- PD1+ CD38- CD27 single (+)

A CD4 T non Treg memory/effector, naïve, PD1+ exhausted and chronically stimulated, antigen specific subset

Page 29

A gate challenge

HyperFinder returns a gating strategy for the target CD27+ population with > 93% F1 score using 8 gates

A CD4 T non Treg memory/effector, naïve, PD1+ exhausted and chronically stimulated, antigen specific subset

[Figure: Cell Number at each Gate Step. x-axis: Gate Step (0 to 12); y-axis: Events in Gate (ticks from 100 to 1000000); series: Manual Gates, HyperFinder]
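
A "cell number at each gate step" curve is obtained by applying the gates in sequence and counting what remains. A minimal sketch with synthetic events and thresholds (all values are made up; only the gate depth limit of 8 comes from the slides):

```python
from typing import Callable, Dict, List

Event = Dict[str, float]
Gate = Callable[[Event], bool]

MAX_GATE_DEPTH = 8  # sorter limit quoted on the slide


def events_per_gate_step(events: List[Event], gates: List[Gate]) -> List[int]:
    """Event counts after 0, 1, ..., len(gates) sequential gates."""
    remaining = list(events)
    counts = [len(remaining)]
    for gate in gates:
        remaining = [e for e in remaining if gate(e)]
        counts.append(len(remaining))
    return counts


# Synthetic events and thresholds, for illustration only.
events = [
    {"CD3": 900.0, "CD4": 800.0, "CD8": 10.0},
    {"CD3": 850.0, "CD4": 20.0, "CD8": 700.0},
    {"CD3": 15.0, "CD4": 30.0, "CD8": 5.0},
    {"CD3": 950.0, "CD4": 750.0, "CD8": 600.0},
]
gates = [
    lambda e: e["CD3"] > 100.0,  # CD3+
    lambda e: e["CD4"] > 100.0,  # CD4+
    lambda e: e["CD8"] < 100.0,  # CD8-
]

counts = events_per_gate_step(events, gates)
fits_on_sorter = len(gates) <= MAX_GATE_DEPTH
print(counts)          # [4, 3, 2, 1]
print(fits_on_sorter)  # True
```

Plotting such counts for a manual strategy and a computed one on the same axes gives the kind of comparison shown in the figure.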

Page 30

The Advantage of Algorithm Driven Methods

Obtain data driven perspectives that provide insights during exploration/discovery

[Figure panels: Opt-SNE by CD25; UMAP by CD25; Opt-SNE by Cluster ID; UMAP by Cluster ID; label: Cluster 22]

4 Stage Cleanup into All Cells… then dimensionality reduction and cluster (tSNE, UMAP, X-Shift)

Page 31

The Advantage of Algorithm Driven Methods

Consider an immunofluorescence panel with 27 markers (27 colors + 6 scatter measurements)

A cluster is defined by all dimensions simultaneously as a region of local density in marker space

High Dimensional Space (Force Directed Layout of 27 Marker Dataset)

t-SNE, UMAP, FDL

ClusterExplorer

Page 32

Look for Naïve Tregs using ClusterExplorer

Obtaining different perspectives and views of the data can be very helpful

Select CD25+ Clusters

Cluster 22 is the desired CD45RA+ Treg

Page 33

Look for Naïve Tregs using HyperFinder

From the t-SNE cleanup, HyperFinder constructs a 7-gate sorting strategy

Cluster 22

Page 34

Verify Naïve Tregs

[Figure panels: All T Cells + Training Set; All T Cells + HyperFinder Gated]

The HyperFinder Gated Population

Page 35

Verify Naïve Tregs

[Figure panels: All T Cells + Training Set; All T Cells + HyperFinder Gated]

The HyperFinder Gated Population

Note that gating on scatter, CD3 and CD4 first could bias against Tregs

Page 36

Thank you!

✓ BD Advanced Technology Group

• Nikolay Samusik

• Allison Irvine

• Eric Diebold

• Keegan Owsley

✓ BD R&D

• Aaron Tyznik

• Aaron Middlebrook

• Nihan Kara

✓ BD Europe

• Jens Fleischer

✓ FlowJo

• Josef Spidlen

• Ian Taylor

✓ BD SORP Group

• Geoffrey Osborne

• Vladimir Azersky

✓ Stanford

• Nima Aghaeepour

✓ Queensland Brain Institute

• Virginia Nink

And to countless others…