Zhipeng Luo and Milos Hauskrecht Department of Computer Science University of Pittsburgh, USA Hierarchical Active Learning with Group Proportion Feedback Paper Id: 4050


Post on 22-Jul-2020


Page 1: Hierarchical Active Learning with Group Proportion Feedback (people.cs.pitt.edu/~zpluo/papers/ijcai_ppt_2018.pdf, 2018-10-09)

Zhipeng Luo and Milos Hauskrecht

Department of Computer Science

University of Pittsburgh, USA

Hierarchical Active Learning with Group Proportion Feedback

Paper Id: 4050

Page 2

Motivation: Data Annotation Cost

Learning of classification models relies on labeled data instances.

Problems:

• A large number of labeled instances may be needed for learning

• The actual cost or hardness of labeling data instances may differ:

Images (“Dog?”) vs. Electronic Health Records (EHRs) (“Heart disease?”)

Page 3

Objective

• How to reduce the annotation cost? How to learn an instance-level classification model, 𝑓: 𝑥 → 𝑦, more efficiently with a smaller number of queries?

• Solution studied in this work:

Group Annotation + Active Learning with Groups

• Why groups and queries on groups:

• Many similar instances can be ‘labeled’ together as groups

• Groups often have a more compact description that is easier to understand for a user, further reducing the annotation cost

Pages 4-6

The Group Concept

• Group: represents a (sub)set of instances 𝑥 in the data

• Region of the input space 𝒙: defines a subpopulation of 𝑥

• A group may be covered by a region

• A region/group can be assessed with respect to the target 𝒚 by using a proportion label

• The proportion of instances in the group that belong to one of the classes

Example: 60% red (positives)
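As a minimal sketch of what a proportion label is, the snippet below computes it for a group whose instance labels are known. The function name `proportion_label` and the toy labels are illustrative assumptions, not part of the slides; in the actual setting the annotator supplies this number directly and the instance labels stay hidden.

```python
from typing import Sequence


def proportion_label(labels: Sequence[int], positive: int = 1) -> float:
    """Fraction of instances in a group that belong to the positive class."""
    if not labels:
        raise ValueError("a group must contain at least one instance")
    return sum(1 for y in labels if y == positive) / len(labels)


# Hypothetical group of 10 instances: 6 red (positive = 1), 4 blue (negative = 0).
group_labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
print(proportion_label(group_labels))  # 0.6
```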

Page 7

Active learning with groups algorithm: idea

Hierarchical Active Learning with Groups (HALG)

• Makes queries that assess proportion labels for the groups

• Learns an instance-level classifier from the groups and their labels

• Actively chooses queries to gradually improve the model

Basic questions:

• How to form the groups?

• How to describe the groups to the user/annotator?

• How to learn an instance-level classifier from group-proportion feedback?

• How to pick the groups to be labeled next?

Pages 8-10

How to form the groups?

• Hierarchical Group Formation: based on hierarchical clustering

[Figure: hierarchy of groups built over 𝑈 by hierarchical clustering]
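The idea of a group hierarchy can be sketched as a tree whose root holds all instances and whose children hold nested subsets. The slides do not say which hierarchical clustering algorithm is used, so the code below swaps in a simple divisive scheme (bisect each group at the median of its widest feature) purely as a stand-in; `Group`, `build_hierarchy`, `leaves`, and `min_size` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Sequence

Point = Sequence[float]


@dataclass
class Group:
    indices: List[int]  # positions of the member instances in the data
    children: List["Group"] = field(default_factory=list)


def build_hierarchy(data: Sequence[Point], indices: List[int], min_size: int = 2) -> Group:
    """Recursively bisect a group at the median of its widest feature.

    This is an assumed stand-in for hierarchical clustering, not the
    procedure used in the talk.
    """
    node = Group(indices=list(indices))
    if len(indices) < 2 * min_size:
        return node  # too small to split into two groups of min_size

    def spread(d: int) -> float:
        values = [data[i][d] for i in indices]
        return max(values) - min(values)

    widest = max(range(len(data[indices[0]])), key=spread)
    if spread(widest) == 0:
        return node  # all instances identical; nothing to split on
    ordered = sorted(indices, key=lambda i: data[i][widest])
    mid = len(ordered) // 2
    node.children = [
        build_hierarchy(data, ordered[:mid], min_size),
        build_hierarchy(data, ordered[mid:], min_size),
    ]
    return node


def leaves(node: Group) -> List[Group]:
    """Collect the most specific groups (leaves) of the hierarchy."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


# Toy data: two well-separated blobs of four points each.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
        (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.3, 5.3)]
root = build_hierarchy(data, list(range(len(data))))
print([g.indices for g in root.children])
```

On this toy input the first split separates the two blobs, and each blob is split once more into pairs.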

Page 11

How to describe the groups to the user?

• Fit groups of instances obtained via hierarchical clustering to ‘rectangular’ input space regions formed by conjunctive patterns

[Figure: a group in the (𝑥1, 𝑥2) plane fitted to a rectangular region; annotated steps: decision tree fitting, omit instances]

Conjunctive pattern: 1 ≤ 𝑥1 ≤ 2 and 6 ≤ 𝑥2 ≤ 7

“This group of instances is 80% likely to be positive.”
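A conjunctive pattern can be represented as one interval per feature. The sketch below derives such a pattern from a group's bounding box, which is a deliberate simplification: the slide mentions decision tree fitting and omitting instances, neither of which is reproduced here. The names `bounding_pattern`, `covers`, and `describe`, and the toy cluster, are assumptions for illustration.

```python
from typing import Dict, Sequence, Tuple

Pattern = Dict[str, Tuple[float, float]]  # feature name -> (lower, upper) bound


def bounding_pattern(points: Sequence[Sequence[float]], names: Sequence[str]) -> Pattern:
    """Smallest axis-aligned rectangle containing all points, one interval per feature."""
    return {
        name: (min(p[j] for p in points), max(p[j] for p in points))
        for j, name in enumerate(names)
    }


def covers(pattern: Pattern, point: Sequence[float], names: Sequence[str]) -> bool:
    """True if the point satisfies every interval of the conjunctive pattern."""
    return all(pattern[n][0] <= point[j] <= pattern[n][1] for j, n in enumerate(names))


def describe(pattern: Pattern) -> str:
    """Render the pattern as a human-readable conjunction."""
    return " and ".join(f"{lo:g} <= {name} <= {hi:g}" for name, (lo, hi) in pattern.items())


# Hypothetical cluster whose bounding box matches the pattern on the slide.
names = ["x1", "x2"]
cluster = [(1.0, 6.0), (1.5, 6.8), (2.0, 7.0), (1.2, 6.3)]
pattern = bounding_pattern(cluster, names)
print(describe(pattern))  # 1 <= x1 <= 2 and 6 <= x2 <= 7
```

A bounding box always covers every instance of the group; a fitted decision tree may instead leave some instances outside the region in exchange for a shorter description.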

Page 12

Group Query Example

“What proportion of patients with:

• sex = female,

• 40 < age < 50,

• chest pain type = 3, and

• fasting blood sugar within [130, 150] mg/dL

suffer from heart disease?”

“60% of patients in this group suffer from heart disease.”
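A query like the one above could be generated mechanically from a group's conditions. The helper below is a hypothetical sketch (the name `render_group_query` and its parameters are not from the slides) that turns a list of condition strings into the question shown to the annotator.

```python
from typing import Sequence


def render_group_query(conditions: Sequence[str], population: str, outcome: str) -> str:
    """Format a group's conjunctive conditions as a proportion question."""
    lines = [f"What proportion of {population} with:"]
    for i, cond in enumerate(conditions):
        if i == len(conditions) - 1:
            suffix = ""          # last condition: no trailing punctuation
        elif i == len(conditions) - 2:
            suffix = ", and"     # second to last: join with "and"
        else:
            suffix = ","
        lines.append(f"  - {cond}{suffix}")
    lines.append(f"{outcome}?")
    return "\n".join(lines)


query = render_group_query(
    ["sex = female", "40 < age < 50", "chest pain type = 3",
     "fasting blood sugar within [130, 150] mg/dL"],
    population="patients",
    outcome="suffer from heart disease",
)
print(query)
```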

P. Rashidi and D. Cook. Ask me better questions: active learning queries based on rule induction. In Proceedings of the 17th ACM SIGKDD, 2011.

Page 13

Learning a Model from Labeled Groups

• Existing algorithms for learning classifiers from proportion labels: [Quadrianto 2009, Kuck 2012, Patrini 2014, Yu 2013]

• Assume the groups and proportions are known a priori

• Algorithms: either (1) define a loss based on the fit of the groups to the discriminative projection or (2) rely on sample-based algorithms

• We use a sample-based algorithm:

• Sample instance labels according to their group proportion labels;

• Feed them to any instance-based learning algorithm.

• E.g. MLE estimates of the parameters 𝜽 defining 𝑃(𝑦|𝒙, 𝜽).
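The sample-based approach above can be sketched in two steps: draw instance labels that match each group's proportion, then fit an ordinary instance-level model by maximum likelihood. The code below is a toy illustration under stated assumptions: labels are assigned so that each group contains exactly the rounded number of positives (independent Bernoulli draws would be another reading), the model is logistic regression fitted by batch gradient ascent, and all names (`sample_instance_labels`, `fit_logistic`, `predict_proba`) and data are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Instance = Sequence[float]
Group = Tuple[List[Instance], float]  # (instances, proportion of positives)


def sample_instance_labels(groups: Sequence[Group], rng: random.Random):
    """Assign 0/1 labels so that each group matches its proportion label."""
    X, y = [], []
    for instances, proportion in groups:
        n_pos = round(proportion * len(instances))
        labels = [1] * n_pos + [0] * (len(instances) - n_pos)
        rng.shuffle(labels)  # which instances get the positives is random
        X.extend(instances)
        y.extend(labels)
    return X, y


def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(theta: Sequence[float], x: Instance) -> float:
    """P(y = 1 | x, theta) for logistic regression; theta[0] is the bias."""
    return sigmoid(theta[0] + sum(w * v for w, v in zip(theta[1:], x)))


def fit_logistic(X, y, lr: float = 0.5, epochs: int = 300) -> List[float]:
    """Approximate the MLE of theta by batch gradient ascent on the log-likelihood."""
    theta = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, t in zip(X, y):
            err = t - predict_proba(theta, x)
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        theta = [w + lr * g / len(X) for w, g in zip(theta, grad)]
    return theta


# Two hypothetical 1-D groups: a mostly negative one and a mostly positive one.
low = [(i / 50,) for i in range(50)]
high = [(2 + i / 50,) for i in range(50)]
X, y = sample_instance_labels([(low, 0.1), (high, 0.9)], random.Random(0))
theta = fit_logistic(X, y)
print(predict_proba(theta, (0.5,)), predict_proba(theta, (2.5,)))
```

Because the labels are sampled, repeated runs with different seeds give slightly different models; averaging over several samples is one way to reduce that variance.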

N. Quadrianto, et al. Estimating labels from label proportions. Journal of Machine Learning Research, 2009.

H. Kuck and N. de Freitas. Learning about individuals from group statistics. arXiv:1207.1393, 2012.

G. Patrini, et al. (Almost) no label no cry. Advances in Neural Information Processing Systems, 2014.

F. X. Yu, et al. ∝SVM for learning with label proportions. arXiv:1306.0886, 2013.

Page 14

How to select the group to be labeled next?

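[Figure: group hierarchy annotated with proportion labels. Root: “65%+” (prior); next level: “85%+”, “40%+”; level below: “15%+”, “60%+”, “95%+”, “50%+”; deeper levels elided (……)]

• Top-down group labeling process: proportion labels are assigned to the rectangular regions fitted to the clusters induced by the hierarchical clustering algorithm, proceeding from more general to more specific groups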

Page 15

Active Group Selection

• How to select the group to query next?

• Query A’s or B’s children next?

• Intuition: split the parent group that is the most influential for our model:

• Impure: with an uncertain label;

• Large: affects many instances.

• Solution:

• Split the group that can potentially lead to the maximum change in the current classifier.

[Figure: root 65%+; children A: 40%+ and B: 85%+; the children of A and B are not yet labeled (?)]

Page 16

Active Group Selection: Maximum Model Change

Key idea: estimate the model change after the group split and annotation

• Assume 𝜽_{𝐴,𝐵} are the parameters of the current model, trained with groups A and B.

• For each group, A or B (say A here):

• Infer the labels of A’s children {A1, A2};

• Re-train the model with A’s children instead, to get 𝜽_{𝐴1,𝐴2,𝐵};

• See how much the model has changed by splitting A: 𝑑(𝜽_{𝐴,𝐵}, 𝜽_{𝐴1,𝐴2,𝐵})

• Finally, split the group with the largest estimated change and query its children.

[Figure: root 65%+; children A: 40%+ and B: 85%+; grandchildren A1: μ̂_{A1}, A2: μ̂_{A2}, B1: μ̂_{B1}, B2: μ̂_{B2}, with proportions inferred by 𝜽_{𝐴,𝐵}]
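The selection loop above can be sketched on a toy 1-D problem. Several choices here are assumptions rather than details from the slides: the model is a two-parameter logistic regression, each group's proportion is used directly as a soft target instead of sampling labels (to keep the example deterministic), and 𝑑 is taken to be the Euclidean distance between parameter vectors. The names `train`, `inferred_proportion`, `model_change`, and `pick_group_to_split` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Group = Tuple[List[float], float]  # (1-D instances, proportion label)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(groups: List[Group], lr: float = 0.5, epochs: int = 300) -> Tuple[float, float]:
    """Fit P(y=1|x) = sigmoid(b + w*x), using each group's proportion as a soft label."""
    b = w = 0.0
    data = [(x, p) for xs, p in groups for x in xs]
    for _ in range(epochs):
        grad_b = grad_w = 0.0
        for x, p in data:
            err = p - sigmoid(b + w * x)
            grad_b += err
            grad_w += err * x
        b += lr * grad_b / len(data)
        w += lr * grad_w / len(data)
    return (b, w)


def inferred_proportion(theta: Tuple[float, float], xs: List[float]) -> float:
    """Estimate a child's proportion as the mean predicted probability of its instances."""
    b, w = theta
    return sum(sigmoid(b + w * x) for x in xs) / len(xs)


def model_change(old: Tuple[float, float], new: Tuple[float, float]) -> float:
    """Assumed distance d: Euclidean distance between parameter vectors."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(old, new)))


def pick_group_to_split(labeled: Dict[str, Group], children: Dict[str, List[List[float]]]):
    """Return the group whose split is estimated to change the model the most."""
    theta = train(list(labeled.values()))
    changes = {}
    for name, kids in children.items():
        others = [g for n, g in labeled.items() if n != name]
        guessed = [(xs, inferred_proportion(theta, xs)) for xs in kids]
        changes[name] = model_change(theta, train(others + guessed))
    return max(changes, key=changes.get), changes


# Hypothetical groups mirroring the slide: A is labeled 40%+, B is labeled 85%+.
A = [i / 10 for i in range(10)]
B = [2 + i / 10 for i in range(10)]
labeled = {"A": (A, 0.40), "B": (B, 0.85)}
children = {"A": [A[:5], A[5:]], "B": [B[:5], B[5:]]}
best, changes = pick_group_to_split(labeled, children)
print(best, changes)
```

Each candidate split costs one re-training of the model, so the loop scales with the number of groups currently on the frontier of the hierarchy.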

Page 17

Experiments and Observations

• Our method HALG outperforms other active learning methods.

• Initially: learning with groups is superior to learning with instances

• In the long run: maximum model change principle leads to faster model convergence

A. Asuncion and D. Newman. Data set “Wine” from UCI machine learning repository, 2007

Page 18

Conclusions and Future Work

• We have developed a cost-effective annotation framework for learning instance-level classifiers, suited to situations where instance labeling is expensive but group labeling is more feasible and efficient.

• Future work

• Dynamic and supervised clustering: check out our new paper @ ECML 2018

• Handling of high dimensional inputs

• Theoretic analysis of the sample complexity of learning from groups

Z. Luo and M. Hauskrecht. Hierarchical Active Learning with Proportion Feedback on Regions. To appear in European Conference on Machine Learning, 2018.

Page 19

Thank you!

Hierarchical Active Learning with Group Proportion Feedback

Paper Id: 4050