
Transfer Learning for Auto-gating of Flow Cytometry Data

Gyemin Lee, Lloyd Stoolman, Clayton Scott

University of Michigan

ICML 2011 Workshop on Unsupervised and Transfer Learning, July 2, 2011

Lee, Stoolman, Scott (University of Michigan) TL for Auto-gating of Flow Cytometry Data ICML 2011 workshop (July 2, 2011) 1 / 13

Flow Cytometry

A technique for rapidly quantifying physical and chemical properties of large numbers of cells, e.g., size, shape, and fluorescent antigen attributes

Applications: diagnosis of blood-related diseases such as acute leukemia, chronic lymphoproliferative disorders, and malignant lymphomas

FS    SS    CD45  CD4   CD8   CD3
790   626   592   177   252   303
496   477   675   485   306   383
684   553   548   180   325   322
681   588   563   221   258   272
632   565   531   0     134   41
...   ...   ...   ...   ...   ...

Each column corresponds to a measured feature

Each row corresponds to a cell

10,000 to 100,000 cells (rows) per experiment


Gating

Typical flow cytometry data analysis involves visualizing multiple 2-dimensional scatter plots and manually selecting a subset of cells from the scatter plots.

⇓ gating

⇒ assigning binary labels $y_i \in \{-1, 1\}$ to every cell $x_i$
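To make the labeling concrete, here is a minimal sketch of a gate that maps each cell to a label in {-1, +1}. The rectangular shape of the gate, its bounds, and the helper name `rectangular_gate` are assumptions made for illustration only; in practice the region is drawn by an expert on the scatter plots. The cell values are the FS and SS entries of the example table on the Flow Cytometry slide.

```python
def rectangular_gate(cell, bounds):
    """Return +1 if every gated feature lies inside its [low, high] range, else -1."""
    inside = all(low <= cell[name] <= high for name, (low, high) in bounds.items())
    return 1 if inside else -1

# FS and SS values of the first five cells in the example table.
cells = [
    {"FS": 790, "SS": 626},
    {"FS": 496, "SS": 477},
    {"FS": 684, "SS": 553},
    {"FS": 681, "SS": 588},
    {"FS": 632, "SS": 565},
]

# Hypothetical gate bounds, chosen only for this illustration.
bounds = {"FS": (600, 800), "SS": (500, 650)}

# One binary label y_i per cell x_i.
labels = [rectangular_gate(cell, bounds) for cell in cells]
```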


Gating

The distribution of cell populations differs from patient to patient.


Automated Gating

Problems of manual gating

labor-intensive and time-consuming

highly subjective and not standardized

modern clinical laboratories see dozens of cases per day

⇒ highly desirable to automate “gating”

Automated gating

In flow cytometry data analysis, more than 70% of studies focused on automated gating techniques¹.

In automatic gating, the majority of approaches rely on unsupervised clustering or mixture modeling.

¹ Bashashati & Brinkman, 2009

Auto-gating as a Transfer Learning Problem

Given

M labeled source datasets $D_m = \{(x_{m,i}, y_{m,i})\}_{i=1}^{N_m} \sim P_m$ for $m = 1, \dots, M$

an unlabeled target dataset $T = \{x_{t,i}\}_{i=1}^{N_t} \sim P_t$

Goal: assign labels $\{y_{t,i}\}_{i=1}^{N_t}$ to $T$ with low misclassification error

$D_1, D_2, \dots, D_M, T \;\Rightarrow\; \{y_{t,i}\}_{i=1}^{N_t}$


Our Approach (1/2)

Consider linear decision functions

ftest(x) = ⟨w , x⟩ + b ≷ 0

1. Summarize the expert knowledge $f_m$ from each of the M source datasets $D_m$ to build a baseline classifier $f_0$.

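$\left.\begin{array}{c} D_1 \Rightarrow f_1 \\ D_2 \Rightarrow f_2 \\ \vdots \\ D_M \Rightarrow f_M \end{array}\right\} \Rightarrow f_0 = \langle w_0, x \rangle + b_0 \gtrless 0$ (baseline)

where $f_m : (w_m, b_m) \leftarrow \mathrm{SVM}(D_m)$, $m = 1, \dots, M$

$f_0 : (w_0, b_0) \leftarrow \text{robust mean}(\{(w_m, b_m)\}_m)$

[Figure: plot labeled f0]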
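The aggregation step can be sketched as follows. The slide does not say which robust mean is used, so this sketch swaps in a coordinate-wise median as a stand-in robust estimator; that choice, the helper names, and the toy parameter values are all assumptions. The per-dataset SVM parameters (w_m, b_m) are taken as given rather than trained here.

```python
from statistics import median

def robust_center(vectors):
    """Coordinate-wise median: a stand-in for the unspecified robust mean."""
    dims = len(vectors[0])
    return [median(v[d] for v in vectors) for d in range(dims)]

def baseline_classifier(source_params):
    """Combine per-dataset (w_m, b_m) pairs into baseline parameters (w_0, b_0)."""
    stacked = [list(w) + [b] for w, b in source_params]
    center = robust_center(stacked)
    return center[:-1], center[-1]

def decide(w, b, x):
    """Label in {-1, +1} from the sign of <w, x> + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1

# Hypothetical SVM parameters from three source datasets; the third is an outlier.
source_params = [([1.0, 0.1], -0.5), ([0.9, 0.2], -0.4), ([-5.0, 8.0], 3.0)]
w0, b0 = baseline_classifier(source_params)
```

With these toy values the outlying third classifier does not drag the baseline away from the other two, which is the property a robust estimator is meant to provide.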


Our Approach (2/2)

2. Transfer the knowledge by adapting $f_0$ to the target task $T$ based on the low-density separation principle.

$\left. f_0,\; T \right\} \Rightarrow f_t = \langle w_t, x \rangle + b_t \gtrless 0$

Adjust the hyperplane parameters $(w, b)$ so that the decision boundary passes through a region where the marginal density of $T$ is low.

Find $(w_t, b_t)$ near $(w_0, b_0)$ that minimizes the number of data points inside the margin:

$\sum_{i=1}^{N_t} I\left\{ \frac{|\langle w_t, x_{t,i} \rangle + b_t|}{\|w_t\|} < \Delta \right\}$

[Figure: plot labeled ft]
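The margin-count objective above can be evaluated directly. The sketch below is a minimal illustration, assuming points stored as plain Python lists; the function name `margin_count` and the toy data are hypothetical.

```python
import math

def margin_count(w, b, points, delta):
    """Number of points whose distance to the hyperplane <w, x> + b = 0 is below delta."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    count = 0
    for x in points:
        distance = abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        if distance < delta:
            count += 1
    return count

# Two clusters separated along the first axis (toy data).
points = [[0.0, 0.0], [0.2, 1.0], [0.1, -1.0], [4.0, 0.0], [3.9, 1.0], [4.1, -1.0]]
w = [1.0, 0.0]

# Boundary at x1 = 0.1 cuts through the first cluster.
through_cluster = margin_count(w, -0.1, points, delta=0.5)
# Boundary at x1 = 2.0 passes through the empty region between the clusters.
through_gap = margin_count(w, -2.0, points, delta=0.5)
```

A boundary through the gap leaves fewer points inside the margin than one through a cluster, which is what the low-density separation principle rewards.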


Auto-gating Example

Comparison of the gating from the baseline (f0) and the proposed transfer learning (ft) to the gating by the expert (true).

[Figure: three panels labeled true, f0, and ft]


Experiments - setup

[Figure: number of cells (y-axis, 0 to 10 x 10^4) for each case (x-axis, 0 to 35); legend: total cells, (+) labeled cells]

35 peripheral blood datasets were provided by the Department of Pathology, University of Michigan

Leave-One-Out Setting

choose a dataset as the target task T

hide the labels of T

treat the other datasets as source tasks Dm, m = 1, . . . ,34
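The leave-one-out protocol can be sketched as a generator over datasets. This is an illustrative outline only: the function names, the (points, labels) tuple layout, and the toy data are assumptions, and the classifier that would be trained on the sources is left out.

```python
def leave_one_out_splits(datasets):
    """Yield (target_index, unlabeled_target_points, labeled_source_datasets)."""
    for t, (target_points, _hidden_labels) in enumerate(datasets):
        sources = [d for m, d in enumerate(datasets) if m != t]
        yield t, target_points, sources

def error_rate(predicted, true):
    """Fraction of cells whose predicted label differs from the expert label."""
    wrong = sum(1 for p, y in zip(predicted, true) if p != y)
    return wrong / len(true)

# Toy stand-in for the 35 datasets: each entry is (points, labels).
datasets = [
    ([[0.0], [1.0]], [-1, 1]),
    ([[0.1], [0.9]], [-1, 1]),
    ([[0.2], [1.1]], [-1, 1]),
]
splits = list(leave_one_out_splits(datasets))
```

The hidden labels of each target are used only afterwards, to score the predictions with `error_rate`.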


Experiments - results

Our Transfer Learning Approach

f0 : baseline classifier with no adaptation

ft : classifier adapted to T by varying both the direction and the bias

Reference Approaches

Pooling : merge all the source data and learn a classifier on the merged dataset

Oracle : standard SVM trained with the true labels of the target task data


Experiments - results

          Pool   f0     ft     Oracle
avg       9.81   3.70   2.49   2.12
std err   1.68   0.54   0.30   0.27

⇒ Our strategy can successfully replicate what experts do in the field without a labeled training set for the target task.
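For reference, summary rows such as avg and std err can be computed from per-case error values as sketched below. The slides do not state the formula, so this assumes the standard error is the sample standard deviation divided by the square root of the number of cases; the input values are made up and are not the experimental results.

```python
import math
from statistics import mean, stdev

def summarize(errors):
    """Return (average, standard error) of a list of per-case error values."""
    avg = mean(errors)
    std_err = stdev(errors) / math.sqrt(len(errors))
    return avg, std_err

# Made-up per-case errors, for illustration only.
errors = [2.0, 4.0, 6.0]
avg, std_err = summarize(errors)
```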


Conclusion and Forthcoming work

Conclusion

We cast flow cytometry auto-gating as a transfer learning problem.

By combining transfer learning with the low-density separation criterion for class separation, our strategy can leverage expert-gated datasets for the automatic gating of a new unlabeled dataset.

Forthcoming work

General kernel-based framework

Generalization error analysis

Joint with Gilles Blanchard


Our Approach - detail

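2-1. Varying bias
For a grid of biases $\{s_j\}$, count points inside the margin:

$c_j \leftarrow \sum_{i=1}^{N_t} I\left\{ \frac{|\langle w, x_{t,i} \rangle + b - s_j|}{\|w\|} < \Delta \right\}, \ \forall j$ : count

$p(z) \leftarrow \sum_j c_j \, \delta(z - s_j) * \frac{1}{\sqrt{2\pi}\,h} \exp\left(-\frac{z^2}{2h^2}\right)$ : smooth

$z^* \leftarrow \text{gradient descent}(p(z), 0)$ : find minimizing bias

$b_{new} \leftarrow b - z^*$ : update bias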
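The bias update in step 2-1 can be sketched end to end as follows. This is a minimal illustration, not the authors' implementation: the grid, the bandwidth h, the step size, the finite-difference gradient, and the toy 1-D data are all assumptions, and the second argument of gradient descent is read as the starting point z = 0.

```python
import math

def margin_counts(w, b, points, offsets, delta):
    """c_j: number of points within delta of the hyperplane shifted by s_j."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in points]
    return [sum(1 for s in scores if abs(s - off) / norm < delta) for off in offsets]

def smoothed(z, offsets, counts, h):
    """p(z): the counts convolved with a Gaussian kernel of bandwidth h."""
    coef = 1.0 / (math.sqrt(2.0 * math.pi) * h)
    return sum(c * coef * math.exp(-((z - s) ** 2) / (2.0 * h * h))
               for s, c in zip(offsets, counts))

def gradient_descent(f, z0, step=0.05, eps=1e-4, iters=500):
    """Plain gradient descent using a central-difference gradient estimate."""
    z = z0
    for _ in range(iters):
        grad = (f(z + eps) - f(z - eps)) / (2.0 * eps)
        z -= step * grad
    return z

# Toy 1-D target data: the initial boundary (x = 0) cuts into the left cluster.
points = [[-0.6], [-0.4], [-0.2], [0.0], [2.0], [2.2], [2.4]]
w, b, delta, h = [1.0], 0.0, 0.3, 0.2
offsets = [-1.0 + 0.2 * j for j in range(16)]  # grid of biases s_j, -1.0 to 2.0

counts = margin_counts(w, b, points, offsets, delta)
z_star = gradient_descent(lambda z: smoothed(z, offsets, counts, h), 0.0)
b_new = b - z_star

before = margin_counts(w, b, points, [0.0], delta)[0]
after = margin_counts(w, b_new, points, [0.0], delta)[0]
```

On this toy data the descent moves the boundary from the edge of the left cluster into the gap between the two clusters, so fewer points remain inside the margin after the update. Gradient descent finds a local minimum near the starting point, which matches the requirement that the adapted classifier stay near the baseline.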

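2-2. Varying normal vector
Let $w_t = w_0 + a_t v_t$ where $v_t = \mathrm{eig}(\mathrm{cov}([w_1, \dots, w_M]))$.

For a grid of the amount of changes $\{a_k\}$, count points inside the margin:

$c_k \leftarrow \sum_{i=1}^{N_t} I\left\{ \frac{|\langle w_0 + a_k v_t, x_{t,i} \rangle + b|}{\|w_0 + a_k v_t\|} < 1 \right\}$ : count

$g(a) \leftarrow \sum_k c_k \, \delta(a - a_k) * \frac{1}{\sqrt{2\pi}\,h} \exp\left(-\frac{a^2}{2h^2}\right)$ : smooth

$a_t \leftarrow \text{gradient descent}(g(a), 0)$ : find minimizing $a_t$

$w_{new} \leftarrow w_0 + a_t v_t$ : update direction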
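The first half of step 2-2, finding the direction of variation and counting margin points along it, can be sketched as below. The slide writes only eig(cov(...)); reading it as the leading eigenvector of the covariance of the source normal vectors is an assumption, as are the power-iteration routine, the helper names, and the toy data. Smoothing and minimizing the counts would then proceed as in step 2-1 and are omitted here.

```python
import math

def covariance(vectors):
    """Sample covariance matrix of a list of equal-length vectors."""
    m, n = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / m for d in range(n)]
    return [[sum((v[r] - mean[r]) * (v[c] - mean[c]) for v in vectors) / (m - 1)
             for c in range(n)] for r in range(n)]

def leading_eigenvector(mat, iters=100):
    """Power iteration; assumes a nonzero matrix with a dominant eigenvalue."""
    n = len(mat)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        u = [sum(mat[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        v = [ui / norm for ui in u]
    return v

def direction_counts(w0, b, v, points, amounts, margin=1.0):
    """c_k: points inside the margin of the hyperplane with normal w0 + a_k * v."""
    counts = []
    for a in amounts:
        w = [w0i + a * vi for w0i, vi in zip(w0, v)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        inside = sum(1 for x in points
                     if abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm < margin)
        counts.append(inside)
    return counts

# Hypothetical source normals that differ only in their second coordinate.
source_normals = [[1.0, 0.0], [1.0, 0.2], [1.0, -0.2], [1.0, 0.4], [1.0, -0.4]]
v_t = leading_eigenvector(covariance(source_normals))

# Toy target data: two tilted clusters; the baseline normal is w0 = (1, 0).
points = [[1.0, 0.0], [2.5, -2.0], [-0.5, 2.0], [5.0, 0.0], [6.5, -2.0], [3.5, 2.0]]
amounts = [-0.5, 0.0, 0.5]  # grid of changes a_k
counts = direction_counts([1.0, 0.0], -3.0, v_t, points, amounts)
```

Restricting the search to a single direction keeps the adapted normal vector within the range of variation seen across the source classifiers.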

