Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu, NIPS 2009


Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora

Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu

NIPS2009

Some content is taken from the authors’ paper and poster

Presented by Haojun Chen


Outline

• Introduction

• Dirichlet-Bernoulli Alignment (DBA) Model

• Model Inference and Prediction

• Experiments

• Conclusion


Introduction

• This paper considers the multi-class, multi-label, multi-instance classification (M3C) problem.

• Goal: infer class labels for both a pattern and its instances.

[Figure, adapted from the authors’ poster: the levels of an M3C corpus, namely pattern (e.g. document), instance (e.g. paragraph), feature (e.g. word), and class (e.g. topic).]


Problem Formalization

• For a multi-class, multi-label, multi-instance corpus, we define:

– the set of input patterns

– the corresponding labels

– the set of instances in pattern n

– the dictionary of features

– a bag of discrete features

– the class label


Basic Assumptions

• Assumption 1 [Exchangeability]: A corpus is a bag of patterns, and each pattern is a bag of instances.

• Assumption 2 [Distinguishability]: Each pattern can belong to several classes, but each instance belongs to a single class.

Tree Structure Assumption
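As a rough illustration of the two assumptions, the corpus can be held in a small nested data structure. This is only a sketch: the class names `Instance` and `Pattern`, the field names, and the toy document are hypothetical and do not come from the paper or the slides.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Instance:
    features: Counter  # bag of discrete features, e.g. word counts
    label: int         # exactly one class per instance (Assumption 2)


@dataclass
class Pattern:
    instances: list    # unordered bag of Instance objects (Assumption 1)
    labels: set = field(default_factory=set)  # a pattern may carry several classes


# Toy two-paragraph "document"; the words and class ids are made up.
doc = Pattern(
    instances=[
        Instance(Counter("oil price rises oil".split()), label=0),
        Instance(Counter("wheat harvest".split()), label=1),
    ],
    labels={0, 1},
)
corpus = [doc]  # a corpus is a bag of patterns
```

Storing each instance as a `Counter` reflects the bag view: the order of features inside an instance carries no information.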


Dirichlet-Bernoulli Alignment (DBA) Model

DBA generative process:

1. Sample the pattern-level class mixture.

2. For each of the M instances in X:

– Choose an instance-level class label.

– Generate the instance according to the observation model.

3. Generate the pattern-level label.
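The three generative steps can be sketched in a few lines of Python. The distributions below are assumptions, since the formulas did not survive in this transcript: a Dirichlet prior with parameters `alpha` for the class mixture, and a multinomial observation model `beta` over the feature dictionary. Step 3 in particular is a hypothetical stand-in, not the paper's label model: it simply marks a pattern with every class used by at least one of its instances.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dir(alpha) by normalising independent Gamma(alpha_c, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_pattern(alpha, beta, n_instances, n_features, rng):
    """Generate one pattern.

    alpha: assumed Dirichlet parameters, one per class.
    beta:  beta[c][v] is the assumed probability of feature v under class c.
    """
    n_classes = len(alpha)
    # Step 1: pattern-level class mixture.
    omega = sample_dirichlet(alpha, rng)
    z, instances = [], []
    for _ in range(n_instances):
        # Step 2a: instance-level class label (a single class per instance).
        c = rng.choices(range(n_classes), weights=omega)[0]
        # Step 2b: the instance, drawn as a bag of discrete features.
        bag = rng.choices(range(len(beta[c])), weights=beta[c], k=n_features)
        z.append(c)
        instances.append(bag)
    # Step 3 (HYPOTHETICAL stand-in, not the paper's label model):
    # the pattern carries every class used by at least one instance.
    y = sorted(set(z))
    return omega, z, instances, y


rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
beta = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
omega, z, instances, y = sample_pattern(alpha, beta, n_instances=5, n_features=8, rng=rng)
print(z, y)
```

Small values in `alpha` tend to concentrate the mixture on a few classes, so most sampled patterns end up with only one or two labels.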


Model Inference and Prediction

• Parameter Estimation (MLE)

• Variational Approximation

• Prediction

– Pattern classification

– Instance disambiguation


Why The Name?

• Lower bound

Fourth term


Experiments 1

• Text classification

– ModApte split of the Reuters-21578 text collection: 10788 documents, 10 classes

– Each paragraph of a document is represented with the vector space model

– Documents with empty label sets or length < 20 are eliminated, leaving 1879 documents, of which 721 (38.4%) have multiple labels

– Compared with the multinomial-event-model-based Naive Bayes (MNB) and two state-of-the-art multi-instance multi-label classifiers (MIMLSVM and MIMLBOOST)
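The preprocessing described above might look roughly like the following. This is a sketch under stated assumptions, not the authors' pipeline: paragraphs are assumed to be separated by blank lines, the vector space model is reduced to raw term counts, and length is assumed to be counted in whitespace-separated tokens (the slides do not give the unit). The helper names `to_pattern` and `keep_document` and the sample text are hypothetical.

```python
from collections import Counter


def to_pattern(document):
    """Split a document into paragraphs and count the terms in each one."""
    paragraphs = [p for p in document.split("\n\n") if p.strip()]
    return [Counter(p.lower().split()) for p in paragraphs]


def keep_document(document, labels, min_length=20):
    """Keep documents that have at least one label and are long enough.

    ASSUMPTION: length is measured in whitespace-separated tokens.
    """
    return len(labels) > 0 and len(document.split()) >= min_length


doc = "Oil prices rose.\n\nOil exports fell as prices rose."
pattern = to_pattern(doc)
print(len(pattern), pattern[1]["prices"])

# Share of multi-label documents reported on the slide: 721 of 1879.
multi_label_share = round(100 * 721 / 1879, 1)
print(multi_label_share)
```

The last two lines only re-derive the 38.4% figure from the two counts given on the slide.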


Experiments 2

• Named entity disambiguation

– Yahoo! Answers query log crawled in 2008: 101 classes, 216563 questions

– 300 entities for training and 100 for testing

– Compared with multinomial Naive Bayes with TF (MNBTF) or TFIDF (MNBTFIDF) attributes, and with a linear SVM classifier with TF (SVMTF) or TFIDF (SVMTFIDF) attributes


Conclusion

• A Dirichlet-Bernoulli Alignment model is proposed and shown experimentally to be useful for both pattern classification and instance disambiguation.