Post on 19-Jan-2022

Towards Anytime Active Learning: Interrupting Experts to Reduce Annotation Costs

Maria E. Ramirez Loaiza

Aron Culotta

Mustafa Bilgic

Machine Learning Lab

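Active Learning

[Diagram: the active learning loop. The Learner draws instances x from the Data X and sends them to the Expert, who returns labels y; the learner fits 𝑓(𝑥) = 𝑦. Labeling is constrained by a Budget.]

How to spend the budget?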

2 www.cs.iit.edu/~ml Anytime Active Learning

Example: Text Classification

Which document and how much to read?

Example: Diagnostic Images

Which image and how much time to spend?

The Problem

• Anytime active learning

– The learner interrupts the expert at any time and asks for a label

• The problem:

– Which instances to pick

– How much time to spend on each

Outline

• Anytime active learning in detail

• Experimental Results

• Future Work

• Conclusion

The Problem: Formally

• $x_i^k$ represents an interruption

– A subinstance of instance $x_i$

– The expert is interrupted at time $k$

• Value of information (VOI)

– $\mathrm{VOI}(x_i^k) = \text{current error} - \text{expected future error}$

Example subinstance $x_i^k$ with k = 50 words:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur et viverra sem. Cras metus felis, vulputate vel leo nec, vestibulum suscipit lorem. Integer dolor nisi, porttitor in auctor a, malesuada id massa. Proin dictum nibh at convallis ornare. Morbi lectus justo, auctor eu tincidunt in, semper eget leo. Mauris massa.
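
The subinstance and VOI definitions on this slide can be illustrated with a minimal sketch. It assumes that a subinstance is the first k words of a document and that VOI is the current error minus the expected future error, averaged over the learner's predicted labels. The helper names `subinstance` and `value_of_information`, and all numbers, are hypothetical and not taken from the slides.

```python
def subinstance(document: str, k: int) -> str:
    """Return the first k words of a document (hypothetical helper)."""
    return " ".join(document.split()[:k])


def value_of_information(current_error, label_probs, future_errors):
    """Assumed VOI: current error minus expected future error.

    label_probs[y]   -- learner's estimate of P(y | subinstance)
    future_errors[y] -- estimated error after adding (subinstance, y)
    """
    expected_future = sum(p * future_errors[y] for y, p in label_probs.items())
    return current_error - expected_future


# Illustrative values only.
doc = "the plot was thin but the acting carried the whole film"
first_words = subinstance(doc, 5)
voi = value_of_information(
    0.40,
    {"pos": 0.5, "neg": 0.5},
    {"pos": 0.30, "neg": 0.36},
)
print(first_words)
print(voi)
```

In practice the error terms would come from retraining or updating the classifier on each hypothetical label, which is the expensive part this sketch leaves out.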

Our Approach

• Search over all possible subinstances

• Model the tradeoff between VOI and cost

[Equation: the selection objective, with terms labeled "VOI of Subinstance" and "Labeling Cost".]
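
One way to read the VOI-versus-cost tradeoff is as a penalized score. The sketch below assumes a linear form, VOI minus γ times the labeling cost, with the cost taken to be the number of words read. The exact objective is not recoverable from the slide, so the form, the `Candidate` type, and the numbers are all illustrative assumptions.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    instance: int   # index i of the instance
    k: int          # words shown before interrupting
    voi: float      # estimated value of information of x_i^k
    cost: float     # assumed labeling cost: number of words read


def select(candidates, gamma):
    """Pick the candidate maximizing voi - gamma * cost (assumed objective)."""
    return max(candidates, key=lambda c: c.voi - gamma * c.cost)


# Illustrative pool of subinstances.
pool = [
    Candidate(instance=1, k=10, voi=0.010, cost=10),
    Candidate(instance=1, k=100, voi=0.030, cost=100),
    Candidate(instance=2, k=30, voi=0.020, cost=30),
]
ignore_cost = select(pool, gamma=0.0)    # cost ignored: highest VOI wins
penalized = select(pool, gamma=0.001)    # cost penalized: short read wins
print(ignore_cost, penalized)
```

With γ = 0 the learner always prefers the most informative subinstance regardless of length; raising γ pushes it toward earlier interruptions.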

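Search Space

$$
\begin{array}{ccccccc}
x_1^{10} & x_1^{20} & \dots & x_1^{k} & \dots & x_1^{100} & x_1 \\
x_2^{10} & x_2^{20} & \dots & x_2^{k} & \dots & x_2^{100} & x_2 \\
x_i^{10} & x_i^{20} & \dots & x_i^{k} & \dots & x_i^{100} & x_i \\
x_n^{10} & x_n^{20} & \dots & x_n^{k} & \dots & x_n^{100} & x_n
\end{array}
$$

Labels on the grid: "Fixed" marks a single column of fixed-size subinstances, "Full" marks the last column of complete instances $x_i$, and "Our framework" spans the whole grid.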
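
The grid above can be enumerated with a short sketch. It assumes subinstance sizes run from 10 to 100 words in steps of 10, as the superscripts suggest, followed by the full instance. The function names and the example document lengths are hypothetical.

```python
def candidate_sizes(n_words, step=10, max_k=100):
    """Sizes step, 2*step, ..., max_k shorter than the document,
    followed by the full length (the "Full" column of the grid)."""
    sizes = [k for k in range(step, max_k + 1, step) if k < n_words]
    return sizes + [n_words]


def search_space(doc_lengths, step=10, max_k=100):
    """Enumerate every (instance index, size) pair in the grid."""
    return [
        (i, k)
        for i, n in enumerate(doc_lengths, start=1)
        for k in candidate_sizes(n, step, max_k)
    ]


# Two illustrative documents of 35 and 250 words.
grid = search_space([35, 250])
print(len(grid))
```

A fixed-size strategy would score only the pairs with one chosen k, and a full-instance strategy only the last pair per document; searching the whole list corresponds to the full grid.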

Results – Fixed Size Interruption

[Figure: Accuracy - Movie - MNB with Fixed Subinstance Size. x-axis: Number of Words (0 to 16000); y-axis: Accuracy (0.50 to 0.70). Curves: RND-FULL, RND-FIX-10, RND-FIX-30, VOI-FULL, VOI-FIX-10, VOI-FIX-30. Annotations: "Interrupted", "Full".]

Results – Anytime Active Learning

[Figure: Accuracy - Movie - MNB with Adaptive Subinstance Size. x-axis: Number of Words (0 to 16000); y-axis: Accuracy (0.50 to 0.70). Curves: ANYTIME-γ=0.0001, ANYTIME-γ=0.0005, ANYTIME-γ=0.01, VOI-FULL, VOI-FIX-10, VOI-FIX-30. Annotations: "Interrupted", "Full".]

Future Work

• Incorporate label error

• Other ways to interrupt

– Structured reading

– Automatic summary

• Other domains

– e.g., vision

• User study

[Figure labels: Sections, Summary]

Conclusion

• Introduce a new framework

– Anytime active learning

• Give active learner control over expert time

• Many future directions
