paired sampling in density-sensitive active learning

Paired Sampling in Density-Sensitive Active Learning Pinar Donmez joint work with Jaime G. Carbonell Language Technologies Institute School of Computer Science Carnegie Mellon University

Post on 01-Feb-2016

Page 1: Paired Sampling in  Density-Sensitive Active Learning

Paired Sampling in Density-Sensitive Active Learning

Pinar Donmez joint work with Jaime G. Carbonell

Language Technologies Institute School of Computer Science Carnegie Mellon University

Page 2: Paired Sampling in  Density-Sensitive Active Learning

Outline

Problem setting
Motivation
Our approach
Experiments
Conclusion

Page 3: Paired Sampling in  Density-Sensitive Active Learning

Setting

X: feature space, label set Y = {-1, +1}
Data D ~ X × Y
D = T ∪ U
T: training set
U: unlabeled set
T is small initially, U is large

Active Learning: choose the most informative samples to label
Goal: high performance with the fewest labeling requests

Page 4: Paired Sampling in  Density-Sensitive Active Learning

Motivation

Optimize the decision boundary placement
Sampling disproportionately on one side may not be optimal
Maximize the likelihood of straddling the boundary with paired samples

Three factors affect sampling
Local density
Conditional entropy maximization
Utility score

Page 5: Paired Sampling in  Density-Sensitive Active Learning

Illustrative Example

Left figure (paired sampling)
significant shift in the current hypothesis
large reduction in version space

Right figure (single point sampling)
small shift in the current hypothesis
small reduction in version space

Page 6: Paired Sampling in  Density-Sensitive Active Learning

Density-Sensitive Distance

Cluster Hypothesis: decision boundary should NOT cut clusters
squeeze distances in high density regions
increase distances in low density regions

Solution: Density-Sensitive Distance
find the weakest link along each path in a graph G
a better way to avoid outliers (i.e. a very short edge in a long path)

Chapelle & Zien (2005)
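
The "weakest link" idea can be sketched as a minimax path distance: a path costs as much as its longest edge, and the distance between two points is the cost of the cheapest path connecting them. This is a minimal sketch, assuming a fully connected graph with Euclidean edge lengths. The function name `minimax_distances` is hypothetical, and the outlier-robust softening from Chapelle & Zien (2005) is not reproduced here.

```python
import math


def minimax_distances(points):
    """All-pairs minimax path distances over a complete Euclidean graph.

    Assumption: this is the plain (unsoftened) weakest-link distance,
    not the exact formula from the slides.
    """
    n = len(points)
    # Start from the direct Euclidean edge lengths.
    d = [[math.dist(p, q) for q in points] for p in points]
    # Floyd-Warshall variant: a path through k costs its longest hop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d
```

Points joined by a chain of short hops end up close together (distances inside a dense region are squeezed), while crossing a sparse gap costs the full width of that gap.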

Page 7: Paired Sampling in  Density-Sensitive Active Learning

Density-Sensitive Distance

Apply MDS (multidimensional scaling) to the density-sensitive distances to obtain a Euclidean embedding

Find eigenvalues and eigenvectors of
Pick the first p eigenvectors s.t.
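
The embedding step can be illustrated with classical MDS. This is a sketch under stated assumptions: it uses the standard double-centering construction with NumPy, and it takes `p` as a plain parameter because the selection rule for `p` is not preserved in the transcript. The name `classical_mds` is hypothetical.

```python
import numpy as np


def classical_mds(dist, p):
    """Embed n points in p dimensions from an (n, n) distance matrix.

    Assumption: standard classical MDS; p is supplied by the caller.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to get a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order for symmetric input.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:p]
    # Clip small negative eigenvalues that come from non-Euclidean input.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)
```

When the input distances are exactly Euclidean, the pairwise distances in the embedding match the input; for density-sensitive distances the match is only approximate.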

Page 8: Paired Sampling in  Density-Sensitive Active Learning

Active Sampling Procedure

Given a training set T in MDS space
1. Train logistic regression classifier on T
2. For all pairs of points in U, compute the pairwise score S
3. Choose the pair with the maximum score
4. Repeat 1-3
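
The procedure can be sketched as a generic loop. This is a skeleton under stated assumptions, not the authors' implementation: the classifier, the scoring function, and the labeling oracle are passed in as the hypothetical callables `train`, `score`, and `oracle`, and all unlabeled pairs are searched exhaustively.

```python
from itertools import combinations


def paired_active_learning(labeled, unlabeled, train, score, oracle, iterations):
    """Train, score every unlabeled pair, label the best pair, repeat.

    labeled: list of (x, y) tuples; unlabeled: list of x.
    train(labeled) -> model; score(model, xi, xj) -> float; oracle(x) -> label.
    All three callables are assumptions supplied by the caller.
    """
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    model = train(labeled)
    for _ in range(iterations):
        if len(unlabeled) < 2:
            break
        # Steps 2-3: pick the index pair (i < j) with the maximum score.
        i, j = max(
            combinations(range(len(unlabeled)), 2),
            key=lambda ij: score(model, unlabeled[ij[0]], unlabeled[ij[1]]),
        )
        # Pop the larger index first so the smaller one stays valid.
        for idx in (j, i):
            x = unlabeled.pop(idx)
            labeled.append((x, oracle(x)))
        # Step 1 of the next round: retrain on the enlarged training set.
        model = train(labeled)
    return model, labeled
```

The exhaustive pair search is quadratic in the size of U, so a practical version would likely restrict the candidate pairs.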

Page 9: Paired Sampling in  Density-Sensitive Active Learning

Details of the Scoring Function S

Two components of S
1. Likelihood of a pair having opposite labels (straddling the decision boundary)
2. Utility of the pair

By the cluster assumption, the decision boundary should not cut clusters => points in different clusters are likely to have different labels

In the transformed space, points in different clusters have low similarity (large distance)

Thus, we can estimate the likelihood of a pair having opposite labels from its pairwise distance in the transformed space

Page 10: Paired Sampling in  Density-Sensitive Active Learning

An Analysis Justifying our Claim

Pairwise distances are divided into bins
Pairs are assigned to bins according to their distances
For each bin, the relative frequency of pairs with opposite class labels is computed
This graph (empirically) shows that the likelihood of two points having opposite labels increases monotonically with the pairwise distance between them.

* This graph is plotted on the g50c dataset.
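
The binning analysis can be reproduced along these lines. This is a minimal sketch, assuming equal-width bins over Euclidean distances, since the slide does not say how the bins were chosen. The function name is hypothetical.

```python
import math
from itertools import combinations


def opposite_label_rate_by_distance(points, labels, num_bins):
    """Fraction of opposite-label pairs in each equal-width distance bin.

    Assumption: equal-width bins from 0 to the largest pairwise distance.
    Empty bins are reported as None.
    """
    pairs = [
        (math.dist(points[i], points[j]), labels[i] != labels[j])
        for i, j in combinations(range(len(points)), 2)
    ]
    max_d = max(d for d, _ in pairs)
    counts = [0] * num_bins
    opposite = [0] * num_bins
    for d, differs in pairs:
        # The largest distance falls into the last bin.
        b = 0 if max_d == 0 else min(int(d / max_d * num_bins), num_bins - 1)
        counts[b] += 1
        opposite[b] += differs
    return [opp / cnt if cnt else None for opp, cnt in zip(opposite, counts)]
```

On data that follows the cluster assumption, the returned rates should rise from the first bin to the last.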

Page 11: Paired Sampling in  Density-Sensitive Active Learning

Utility Function

Two components
Local density: depends on the number of close neighbors and their proximity
Conditional entropy

For binary problems
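
For the binary case, the conditional entropy reduces to the Shannon entropy of the classifier's posterior at a point. The slide's own formula is not preserved in the transcript, so the sketch below uses the standard definition, with the natural logarithm as an assumption. The function name is hypothetical.

```python
import math


def binary_conditional_entropy(p_positive):
    """H(Y | x) in nats for Y in {-1, +1}, given P(y = +1 | x).

    Assumption: standard Shannon entropy with natural logarithm.
    """
    total = 0.0
    for p in (p_positive, 1.0 - p_positive):
        if p > 0.0:  # 0 * log 0 is taken to be 0
            total -= p * math.log(p)
    return total
```

The value peaks at a posterior of 0.5, that is, on the current decision boundary, and drops to zero when the classifier is certain.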

Page 12: Paired Sampling in  Density-Sensitive Active Learning

Uncertainty-Weighted Density

captures
the density of a given point
the information content of its neighbors

novelty: each neighbor’s contribution is weighted by its uncertainty
reduces the effect of highly certain neighbors
dense points with highly uncertain neighbors become important
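
One way to picture this quantity is as a sum over a point's neighbors, where each term is a proximity weight multiplied by the neighbor's uncertainty. The slide's formula is not preserved, so the sketch below rests on assumptions: the proximity weight is a Gaussian `exp(-distance^2)`, uncertainty is the binary entropy of the posterior, and the neighbor set is chosen by the caller. All names are hypothetical.

```python
import math


def uncertainty_weighted_density(x, neighbors, posterior):
    """Sum over neighbors of proximity weight times neighbor entropy.

    Assumptions (not from the slides): Gaussian proximity weight and
    binary entropy as the uncertainty measure.
    posterior(x_k) -> P(y = +1 | x_k).
    """

    def entropy(p_positive):
        total = 0.0
        for p in (p_positive, 1.0 - p_positive):
            if p > 0.0:
                total -= p * math.log(p)
        return total

    total = 0.0
    for x_k in neighbors:
        proximity = math.exp(-math.dist(x, x_k) ** 2)
        total += proximity * entropy(posterior(x_k))
    return total
```

A neighbor the classifier is already sure about contributes nothing, however close it is, which matches the stated goal of reducing the effect of highly certain neighbors.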

Page 13: Paired Sampling in  Density-Sensitive Active Learning

Utility Function

utility of a pair is

regularized information content (entropy) of the pair
proximity-weighted information content of neighbors

Page 14: Paired Sampling in  Density-Sensitive Active Learning

Experimental Data

pair with maximum score selected

Six binary datasets

Page 15: Paired Sampling in  Density-Sensitive Active Learning

Experiment Setting

For each data set
start with 2 labeled data points (1 +, 1 -)
run each method for 20 iterations
results averaged over 10 runs

Baselines
Uncertainty Sampling
Density-only Sampling
Representative Sampling (Xu et al. 2003)
Random Sampling
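
The protocol above (one positive and one negative seed point, then averaging over runs) could be set up as follows. This is a sketch only: the helper names `initial_split` and `average_curves` are hypothetical, and choosing the seed points uniformly at random is an assumption.

```python
import random


def initial_split(labels, rng):
    """Pick one positive and one negative index as the labeled seed set.

    Assumption: seeds are drawn uniformly at random within each class.
    Returns (seed_indices, pool_indices).
    """
    positives = [i for i, y in enumerate(labels) if y == +1]
    negatives = [i for i, y in enumerate(labels) if y == -1]
    seed = [rng.choice(positives), rng.choice(negatives)]
    pool = [i for i in range(len(labels)) if i not in seed]
    return seed, pool


def average_curves(curves):
    """Average per-iteration error across runs (one curve per run)."""
    return [sum(vals) / len(vals) for vals in zip(*curves)]
```

Passing a separately seeded `random.Random` instance for each of the 10 runs keeps the runs reproducible and independent.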

Page 16: Paired Sampling in  Density-Sensitive Active Learning

Results

Page 17: Paired Sampling in  Density-Sensitive Active Learning

Results

Page 18: Paired Sampling in  Density-Sensitive Active Learning

Conclusion

Our contributions:
combine uncertainty, density, and dissimilarity across the decision boundary
proximity-weighted conditional entropy selection is effective for active learning

Results show our method significantly outperforms baselines in
error reduction
fewer labeling requests than others to achieve the same performance

Page 19: Paired Sampling in  Density-Sensitive Active Learning

Thank You!