Toward Mixed-Initiative Clustering Yifen Huang Tom M. Mitchell Carnegie Mellon University Agents that Learn from Human Teachers March 23, 2009


Page 1

Toward Mixed-Initiative Clustering

Yifen Huang, Tom M. Mitchell

Carnegie Mellon University

Agents that Learn from Human Teachers, March 23, 2009

Page 2

Semi-supervised clustering: A user performs an oracle role.

Unsupervised clustering: A machine builds the model alone.

Page 3

Semi-supervised clustering: A user performs an oracle role.

Unsupervised clustering: A machine builds the model alone.

Mixed-Initiative Clustering

Page 4

Key Question

• How can autonomous clustering algorithms be extended to enable mixed-initiative clustering approaches involving an iterative sequence of computer-suggested and user-suggested revisions to converge to a useful hierarchical clustering?

– From autonomous clustering to mixed-initiative clustering

– From flat feedback to hierarchical feedback

Page 5

Activity X contains this list of emails: ## ## ## ## ##

An email from Andrea Thomaz belongs to your AAAI symposium activity.

Adam Cheyer is a key person to your CALO activity.

Page 6

Too lazy to comment.

Activity X contains this list of emails: ## ## ## ## ##

An email from Andrea Thomaz belongs to your AAAI symposium activity.

What the hell is this?? DELETE!

This is correct.

Adam Cheyer is a key person to your CALO activity.

Page 7

Framework for Mixed-Initiative Clustering

Computer-to-user language: hypotheses

User-to-computer language: modified hypotheses

Model adaptation algorithm
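
The three framework components can be read as one interaction loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `mixed_initiative_loop`, `get_user_revision` and `adapt_model`, the dictionary representation of hypotheses, and the stopping rule (stop when the user changes nothing) are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical representation: cluster name -> list of document ids.
Hypotheses = Dict[str, List[str]]


def mixed_initiative_loop(
    initial: Hypotheses,
    get_user_revision: Callable[[Hypotheses], Hypotheses],
    adapt_model: Callable[[Hypotheses], Hypotheses],
    max_rounds: int = 10,
) -> Hypotheses:
    """Alternate computer-suggested and user-suggested revisions."""
    hypotheses = initial
    for _ in range(max_rounds):
        # Computer-to-user language: show the current hypotheses.
        # User-to-computer language: receive the modified hypotheses.
        revised = get_user_revision(hypotheses)
        if revised == hypotheses:
            break  # assumed stopping rule: the user left everything unchanged
        # Model adaptation: produce new hypotheses from the user's revision.
        hypotheses = adapt_model(revised)
    return hypotheses


def demo_user(h: Hypotheses) -> Hypotheses:
    """Toy user who insists that document d3 belongs in cluster B."""
    fixed = {name: list(docs) for name, docs in h.items()}
    if "d3" in fixed["A"]:
        fixed["A"].remove("d3")
        fixed["B"].append("d3")
    return fixed


# The identity function stands in for a real model adaptation algorithm.
result = mixed_initiative_loop({"A": ["d1", "d3"], "B": ["d2"]}, demo_user, lambda h: h)
```

In a real system `adapt_model` would retrain the clustering model on the user's feedback and could propose further changes of its own; the identity stand-in only keeps the example self-contained.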

Page 8

User Interface

Page 9

Communicative Languages in Semi-Supervised Clustering

[Diagram nodes: Cluster, Document.]

Page 10

Communicative Languages in Semi-Supervised Clustering

[Diagram nodes: Cluster, Document. Labels: Confirm, Remove.]

Page 11

Enriching Languages in Flat Clustering

[Diagram nodes: Cluster, Document, Word, Person. Labels: Confirm, Remove.]

Page 12

Enriching Languages in Flat Clustering

[Diagram nodes: Cluster, Document, Word, Person. Labels: Confirm, Remove (shown three times).]

Page 13

Enriching Languages in Hierarchical Clustering

[Diagram nodes: Cluster, Document, Word, Person, Cluster. Labels: Confirm, Remove (shown three times).]

Page 14

Enriching Languages in Hierarchical Clustering

[Diagram nodes: Cluster, Document, Word, Person, Cluster. Labels: Move (shown three times), Merge, Add, Split, and Confirm, Remove (shown three times).]
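
One way to picture the enriched user-to-computer language is as a small vocabulary of feedback records. The sketch below is illustrative only: the slide names the object types (cluster, document, word, person) and the feedback labels (confirm, remove, move, merge, add, split), but which label applies to which object type is not recoverable from this transcript, so the `Feedback` record, its fields, and `apply_move` are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    """Feedback labels named on the slide."""
    CONFIRM = "confirm"
    REMOVE = "remove"
    MOVE = "move"
    MERGE = "merge"
    ADD = "add"
    SPLIT = "split"


@dataclass(frozen=True)
class Feedback:
    """One piece of user feedback (hypothetical record layout)."""
    action: Action
    target_type: str                   # "document", "word", "person" or "cluster"
    target: str                        # identifier of the object being revised
    destination: Optional[str] = None  # new parent cluster, used only by MOVE


def apply_move(parent_of: Dict[str, str], fb: Feedback) -> Dict[str, str]:
    """Return a copy of the child -> parent map with one MOVE applied."""
    if fb.action is not Action.MOVE or fb.destination is None:
        raise ValueError("apply_move expects a MOVE with a destination")
    updated = dict(parent_of)
    updated[fb.target] = fb.destination
    return updated


# Toy hierarchy: one email currently filed under the wrong activity.
hierarchy = {"email-17": "CALO", "CALO": "root", "AAAI symposium": "root"}
moved = apply_move(hierarchy, Feedback(Action.MOVE, "document", "email-17", "AAAI symposium"))
```

Storing the hierarchy as a child-to-parent map makes a move a single edge change, which matches the later slide's view that user feedback is equivalent to edge modification.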

Page 15

Experiment Design

• Can mixed-initiative clustering help a user achieve the result faster?

• Can mixed-initiative clustering help a machine build a better model?

Page 16

Dataset

• An email dataset of one of the authors

– 623 emails

– 6684 unique words and 135 individual people

– Manually sorted into a hierarchy of 15 cluster nodes, including a root, 3 intermediate nodes and 11 leaf nodes

Page 17

Feedback Sessions

• Five initial hierarchical clustering results

• Two feedback sessions on each result

– Diligent session

– Lazy session

Page 18

Diligent User

[Diagram nodes: Cluster, Document, Word, Person, Cluster. Labels: Move (shown three times), Merge, Add, Split, and Confirm, Remove (shown three times).]

Page 19

Lazy User

[Diagram nodes: Cluster, Document, Word, Person, Cluster. Labels: Move (shown three times), Merge, Add, Split, and Confirm, Remove (shown three times).]

Page 20

Lazy User vs. Diligent User

Page 21

Measurement

• User feedback is equivalent to edge modification.

• Edge Modification Ratio (EMR) is the fraction of edges that must be modified to reach the reference hierarchy.

[Figure: example hierarchy with nodes 1 to 29 and edges e1 to e28; edges e3, e8, e12, e18 and e21 are listed separately, apparently the five edges to be modified.]

Considering hierarchical accuracy with user feedback

EMR = 5/28 ≈ 0.18
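
The EMR arithmetic can be checked with a short sketch. It assumes each hierarchy is stored as a child-to-parent map and that an edge counts as "to be modified" when a node's parent differs from its parent in the reference hierarchy; the slide does not spell out the exact procedure, so `edge_modification_ratio` and the toy hierarchy below are illustrative, and the toy tree is not the one drawn in the figure.

```python
from typing import Dict


def edge_modification_ratio(current: Dict[int, int], reference: Dict[int, int]) -> float:
    """Fraction of child -> parent edges in `current` that differ from `reference`."""
    if set(current) != set(reference):
        raise ValueError("both hierarchies must contain the same non-root nodes")
    if not current:
        return 0.0
    changed = sum(1 for child, parent in current.items() if reference[child] != parent)
    return changed / len(current)


# Toy reference tree on nodes 1..29 (node 1 is the root), giving 28 edges.
reference = {node: node // 2 for node in range(2, 30)}

# Misplace five nodes, mirroring the 5-out-of-28 case on the slide.
current = dict(reference)
for node in (10, 11, 20, 21, 29):
    current[node] = 3

emr = edge_modification_ratio(current, reference)
print(f"EMR = {emr:.2f}")
```

With five of the 28 edges pointing at the wrong parent, the ratio is 5/28, which rounds to the 0.18 shown on the slide.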

Page 22

Good Results (4/5)

Page 23

Bad Result (1/5)

Page 24

One More Step Toward Mixed-Initiative Clustering

Yifen Huang, Tom M. Mitchell

Carnegie Mellon University

Agents that Learn from Human Teachers, March 23, 2009

Page 26

Elaborated Framework for Mixed-Initiative Clustering

Page 27

Future Work

• Feasibility study of the low-latency mixed-initiative interface