Dirichlet-Bernoulli Alignment: A Generative Model for Multi-Class Multi-Label Multi-Instance Corpora
Shuang-Hong Yang, Hongyuan Zha, Bao-Gang Hu
NIPS 2009
Some content is taken from the authors' paper and poster
Presented by Haojun Chen
Outline
• Introduction
• Dirichlet-Bernoulli Alignment (DBA) Model
• Model Inference and Prediction
• Experiments
• Conclusion
Introduction
• This paper considers the multi-class, multi-label, multi-instance classification (M3C) problem.
• Goal: infer class labels for both a pattern and its instances
[Figure, adapted from the authors' poster: pattern (e.g. document), instance (e.g. paragraph), feature (e.g. word), class (e.g. topic)]
Problem Formalization
• For a multi-class, multi-label, multi-instance corpus, we define:
– the set of input patterns
– the corresponding labels
– the set of instances in pattern n
– the dictionary of features
– each instance: a bag of discrete features
– the class label
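The definitions above can be pictured as a small data structure. The following is only a sketch: the class and field names (`Instance`, `Pattern`, `features`, `labels`) are hypothetical and do not come from the paper, and features and classes are assumed to be stored as integer indices.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class Instance:
    # Bag of discrete features, stored as indices into the feature dictionary.
    features: List[int]


@dataclass
class Pattern:
    # A pattern is a bag of instances and may carry several class labels.
    instances: List[Instance]
    labels: FrozenSet[int]


def corpus_summary(corpus: List[Pattern]) -> dict:
    """Count patterns, instances, and multi-label patterns in a corpus."""
    return {
        "patterns": len(corpus),
        "instances": sum(len(p.instances) for p in corpus),
        "multi_label": sum(1 for p in corpus if len(p.labels) > 1),
    }
```

For example, a document with two paragraphs and two topics would be one `Pattern` holding two `Instance` objects and a two-element label set.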
Basic Assumption
• Assumption 1 [Exchangeability]: A corpus is a bag of patterns, and each pattern is a bag of instances.
• Assumption 2 [Distinguishability]: Each pattern can belong to several classes, but each instance belongs to a single class.
Tree Structure Assumption
Dirichlet-Bernoulli Alignment (DBA) Model (1)
DBA generative process:
1. Sample the pattern-level class mixture
2. For each of the M instances in X:
– Choose the instance-level class label
– Generate the instance according to the observation model
Dirichlet-Bernoulli Alignment (DBA) Model (2)
3. Generate the pattern-level label
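The three generative steps can be sketched in code. The distributions here are assumptions made for illustration, since the formulas are not reproduced in this transcript: a Dirichlet prior with parameter `alpha` for the class mixture, a per-class multinomial over features (`beta`) as the observation model, and, for step 3, one independent Bernoulli draw per class whose success probability is the fraction of instances assigned to that class. The paper's actual label model may differ, and all names below are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_pattern(alpha, beta, num_instances, features_per_instance, rng):
    """Generate one pattern; beta[c][d] is P(feature d | class c) (assumed)."""
    num_classes = len(alpha)
    num_features = len(beta[0])

    # Step 1: pattern-level class mixture.
    theta = sample_dirichlet(alpha, rng)

    instances, instance_labels = [], []
    for _ in range(num_instances):
        # Step 2a: instance-level class label drawn from the mixture.
        z = rng.choices(range(num_classes), weights=theta)[0]
        # Step 2b: the instance is a bag of features drawn from class z.
        x = rng.choices(range(num_features), weights=beta[z],
                        k=features_per_instance)
        instance_labels.append(z)
        instances.append(x)

    # Step 3 (assumed form): one Bernoulli per class, with success
    # probability equal to the fraction of instances assigned to it.
    pattern_labels = set()
    for c in range(num_classes):
        p_c = instance_labels.count(c) / num_instances
        if rng.random() < p_c:
            pattern_labels.add(c)

    return theta, instances, instance_labels, pattern_labels
```

Under this assumed label model, a class can only appear among the pattern-level labels if at least one instance was assigned to it, which ties the pattern labels to the instance labels.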
Dirichlet-Bernoulli Alignment (DBA) Model (3)
where
Model Inference and Prediction
• Parameter Estimation (MLE)
• Variational Approximation
• Prediction
– Pattern classification
– Instance disambiguation
Why The Name?
• Lower bound
Fourth term
Experiments 1
• Text classification
– ModApte split of the Reuters-21578 text collection: 10788 documents, 10 classes
– Each paragraph of a document is represented with the vector space model
– Documents with empty label sets or length < 20 are eliminated; of the remaining 1879 documents, 721 (38.4%) have multiple labels
– Compared with the multinomial-event-model-based Naive Bayes (MNB) and two state-of-the-art multi-instance multi-label classifiers (MIMLSVM and MIMLBOOST)
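The filtering step described above can be sketched as follows. This is an illustration only: the function names are hypothetical, documents are assumed to be `(tokens, labels)` pairs, and "length" is assumed to mean the number of tokens, which the slides do not specify.

```python
def filter_corpus(docs, min_length=20):
    """Keep documents that have at least one label and enough tokens.

    docs: list of (tokens, labels) pairs (assumed representation).
    """
    return [
        (tokens, labels)
        for tokens, labels in docs
        if labels and len(tokens) >= min_length
    ]


def multi_label_fraction(docs):
    """Fraction of documents that carry more than one label."""
    multi = sum(1 for _, labels in docs if len(labels) > 1)
    return multi / len(docs)
```

The reported proportion is consistent with the counts on the slide: 721 / 1879 ≈ 0.384.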
Experiments 2
• Named entity disambiguation
– Yahoo! Answers query log crawled in 2008: 101 classes, 216563 questions
– 300 entities for training and 100 for test
– Compared with Multinomial Naive Bayes with TF (MNBTF) or TFIDF (MNBTFIDF) attributes, as well as linear SVM classifier with TF (SVMTF) or TFIDF (SVMTFIDF) attributes.
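To make the TF and TFIDF attributes of the baselines concrete, here is a minimal sketch. The slides do not say which weighting variant was used; this version assumes raw term counts for TF and idf = log(N / df) for the inverse document frequency, and the function names are hypothetical.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Term-frequency attributes: raw token counts per document."""
    return [Counter(tokens) for tokens in docs]


def tfidf_vectors(docs):
    """TF-IDF attributes with idf = log(N / df) (one common variant)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    return [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in docs
    ]
```

With this variant, a term that occurs in every question receives weight zero, so only terms that distinguish questions contribute to the TFIDF representation.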
Conclusion
• A Dirichlet-Bernoulli Alignment model is proposed and shown to be useful for both pattern classification and instance disambiguation