Methods for Learning Classifier Combinations: No Clear Winner


Page 1: Methods for Learning Classifier Combinations: No Clear Winner


Methods for Learning Classifier Combinations: No Clear Winner

Dmitriy Fradkin, Paul KantorDIMACS, Rutgers University

Page 2: Methods for Learning Classifier Combinations: No Clear Winner


[Diagram: Topic 1, Topic 2, …, New Topics; each topic has scores from System 1 and System 2; per-topic combination is labeled "Local Fusion", combination across topics "Federated or Global Fusion", and new topics are marked with "?"]

Page 3: Methods for Learning Classifier Combinations: No Clear Winner


Overview

• Discuss local fusion methods
• Describe a new fusion approach for multi-topic problems that we call "federated"
• Compare it empirically to the global approach previously described in [Bartell et al. 1994]
• Interpret the results

Page 4: Methods for Learning Classifier Combinations: No Clear Winner


Related Work in IR

• [Bartell et al., 1994] - global fusion of systems
• [Hull et al., 1996] - local fusion methods for document filtering (averaging, linear and logistic regression, grid search)
• [Lam and Lai, 2001] used category-specific features to model the error rate, and then picked the single best system for a category
• [Bennett et al., 2002] use "reliability indicators" together with scores as input to a metaclassifier

Page 5: Methods for Learning Classifier Combinations: No Clear Winner


Combination of Classifiers

Relevance Judgment: $y(d,q) \in \{0,1\}$

Decision Rule: $C(d,q) = \mathrm{sign}\big(r(d,q) - \theta_q\big) \in \{0,1\}$

The problem of fusion can be formulated as the problem of finding a way to combine several decision rules.

Page 6: Methods for Learning Classifier Combinations: No Clear Winner


Linear Combinations

$C_F(d,q) = \mathrm{sign}\Big(\sum_{j=1}^{l} \lambda_j x_j(d,q) - \theta_q\Big)$

where $\lambda$ is an $l$-dimensional vector of weights, $\theta$ is a threshold, and $x(d,q)$ is an $l$-dimensional vector of the normalized scores given by the systems to document $d$ on topic $q$.

Normalization of the score of system $s$:

$x_s(d,q) = \dfrac{r_s(d,q) - \min_{d'} r_s(d',q)}{\max_{d'} r_s(d',q) - \min_{d'} r_s(d',q)}$
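As an illustration, here is a minimal Python/NumPy sketch of this rule; the score matrix, weights, and threshold below are made-up placeholders, not values from the paper.

```python
import numpy as np

def minmax_normalize(scores):
    """Per-system min-max normalization of raw scores r_s(d, q) over the
    documents of one topic: rows are documents, columns are systems."""
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    return (scores - lo) / np.where(hi > lo, hi - lo, 1.0)

def linear_fusion_decision(x, weights, theta):
    """C_F(d, q) = sign(sum_j lambda_j * x_j(d, q) - theta_q), mapped to {0, 1}."""
    return (x @ weights - theta > 0).astype(int)

# Hypothetical raw scores from L = 2 systems on 4 documents of one topic.
raw = np.array([[0.2, 5.0],
                [0.9, 1.0],
                [0.5, 3.0],
                [0.1, 4.0]])
x = minmax_normalize(raw)
print(linear_fusion_decision(x, weights=np.array([0.6, 0.4]), theta=0.5))
```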

Page 7: Methods for Learning Classifier Combinations: No Clear Winner


Input to Local Fusion

Documents $j = 1, \dots, n$; $y_j$ is the relevance judgment for the $j$-th document; $x_j$ is the vector of scores for that document.

              System 1   System 2   …   System L   Relevance
doc 1         x_11       x_12       …   x_1l       y_1
doc 2         x_21       x_22       …   x_2l       y_2
…             …          …          …   …          …
doc n         x_n1       x_n2       …   x_nl       y_n

Page 8: Methods for Learning Classifier Combinations: No Clear Winner


Local Fusion Methods

A new fusion method:

Centroid: $\lambda = \bar{x}_{+} - \bar{x}_{-}$, the difference between the centroids of the score vectors of the relevant and the non-relevant training documents.

Other methods:

Linear and Linear 2: least-squares fits $\min_{\lambda} \sum_{j=1,\dots,n} (y_j - \lambda \cdot x_j)^2$ (two variants of the fit)
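A minimal sketch of how the Centroid and least-squares (Linear) weights could be computed from a score matrix and relevance judgments like those on the previous slide; variable names and the toy data are my own, not the paper's.

```python
import numpy as np

def centroid_weights(X, y):
    """Centroid method: lambda = mean score vector of relevant documents
    minus mean score vector of non-relevant documents."""
    return X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)

def linear_weights(X, y):
    """Least-squares fit: argmin_lambda sum_j (y_j - lambda . x_j)^2."""
    lam, *_ = np.linalg.lstsq(X, y.astype(float), rcond=None)
    return lam

# Hypothetical normalized scores of 6 documents from 2 systems, with judgments.
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.8, 0.6],
              [0.2, 0.3], [0.1, 0.4], [0.3, 0.2]])
y = np.array([1, 1, 1, 0, 0, 0])
print(centroid_weights(X, y), linear_weights(X, y))
```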

Page 9: Methods for Learning Classifier Combinations: No Clear Winner


Local Fusion Methods (cont.)

Logistic: $\min_{\lambda} \sum_{j=1,\dots,n} -\big( y_j \log p_j + (1 - y_j) \log(1 - p_j) \big)$

where $p_j = p(y_j = 1 \mid x_j, \lambda)$ and $\log\dfrac{p_j}{1 - p_j} = \lambda \cdot x_j$

Since log is a monotone function, the underlying decision rule is linear
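A sketch of the logistic variant using scikit-learn; the regularization setting and the toy data are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-topic training data: system scores and relevance judgments.
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.8, 0.6],
              [0.2, 0.3], [0.1, 0.4], [0.3, 0.2]])
y = np.array([1, 1, 1, 0, 0, 0])

# log(p / (1 - p)) = lambda . x + bias; fitting minimizes the negative log-likelihood.
model = LogisticRegression(C=1e6)  # large C ~ little regularization (an assumption)
model.fit(X, y)
p = model.predict_proba(X)[:, 1]   # p_j = p(y_j = 1 | x_j, lambda)
print(model.coef_, model.intercept_, p)
```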

Page 10: Methods for Learning Classifier Combinations: No Clear Winner


Threshold Tuning

• Once a vector of parameters $\lambda$ is found for a local rule, we compute the fusion scores on the training set and find a threshold maximizing a particular utility measure: $\theta_i = \theta(\lambda_i, q_i)$

Different combinations $(\lambda, \theta)$ lead to different scores and decisions.
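One simple way to realize this step, sketched below: sweep candidate thresholds over the training fusion scores and keep the one maximizing a utility function. The utility used here is a made-up stand-in; the paper's T11SU measure appears a few slides later.

```python
import numpy as np

def tune_threshold(scores, y, utility):
    """Try each training fusion score as a threshold and keep the best one."""
    best_theta, best_u = None, -np.inf
    for theta in np.unique(scores):
        decisions = (scores >= theta).astype(int)
        u = utility(decisions, y)
        if u > best_u:
            best_theta, best_u = theta, u
    return best_theta, best_u

# Illustrative utility: +2 per relevant document submitted, -1 per non-relevant submitted.
def toy_utility(decisions, y):
    return 2 * np.sum(decisions * y) - np.sum(decisions * (1 - y))

scores = np.array([0.9, 0.7, 0.65, 0.4, 0.3, 0.1])   # fusion scores lambda . x_j
y      = np.array([1,   1,   0,    1,   0,   0  ])
print(tune_threshold(scores, y, toy_utility))
```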

Page 11: Methods for Learning Classifier Combinations: No Clear Winner


Global Fusion

When there are many topics:
• Combine all document-query relevance judgments and the corresponding scores together (as if for a single query)
• Compute a local fusion rule on the pooled data

When data for a new training topic becomes available, we can either:
• solve the problem from scratch, or
• continue using the same rule.

Page 12: Methods for Learning Classifier Combinations: No Clear Winner


Input to Global Fusion

              System 1   System 2   …   System L   Relevance
doc 1/query 1  x_111      x_112      …   x_11l      y_11
doc 1/query 2  x_121      x_122      …   x_12l      y_12
…              …          …          …   …          …
doc 1/query m  x_1m1      x_1m2      …   x_1ml      y_1m
doc 2/query 1  x_211      x_212      …   x_21l      y_21
…              …          …          …   …          …
doc n/query m  x_nm1      x_nm2      …   x_nml      y_nm
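A sketch of this pooling step, assuming per-topic score matrices and judgment vectors are available: all document/query rows are stacked and one fusion rule is fit to the pooled data (here the centroid rule is inlined as the fitting function; the data are hypothetical).

```python
import numpy as np

def global_fusion(per_topic_X, per_topic_y, fit_rule):
    """Pool all (document, query) rows across topics and fit one fusion rule."""
    X = np.vstack(per_topic_X)        # rows: doc/query pairs, columns: systems
    y = np.concatenate(per_topic_y)
    return fit_rule(X, y)

# Hypothetical data for two topics and two systems.
topic_X = [np.array([[0.9, 0.7], [0.2, 0.1]]), np.array([[0.6, 0.8], [0.3, 0.4]])]
topic_y = [np.array([1, 0]), np.array([1, 0])]
lam = global_fusion(topic_X, topic_y,
                    lambda X, y: X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
print(lam)
```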

Page 13: Methods for Learning Classifier Combinations: No Clear Winner


Question:

Suppose we know local fusion rules on a set of queries.• Can we exploit this knowledge on other queries? • Can we come up with a scheme that can easily incorporate new training queries?

Page 14: Methods for Learning Classifier Combinations: No Clear Winner


Federated Fusion

Given training queries $q_1, \dots, q_m$ with their local fusion rules $(\lambda_j, \theta_j)$:

$\lambda^{*} = \frac{1}{m}\sum_{j=1}^{m} \lambda_j \qquad\qquad \theta^{*} = \frac{1}{m}\sum_{j=1}^{m} \theta(\lambda^{*}, q_j)$

New training topics are easy to incorporate!
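A sketch of the federated rule under the same assumptions as the earlier sketches: average the local weight vectors, then average the thresholds tuned for $\lambda^{*}$ on each training query. The threshold tuner uses a simple stand-in utility, and the data are hypothetical.

```python
import numpy as np

def tune_theta(scores, y):
    """Pick the training score that maximizes a simple utility (stand-in for T11SU)."""
    cands = np.unique(scores)
    utils = [2 * np.sum((scores >= t) * y) - np.sum((scores >= t) * (1 - y)) for t in cands]
    return cands[int(np.argmax(utils))]

def federated_fusion(local_lambdas, per_topic_data):
    """lambda* = average of the local weight vectors;
    theta*  = average over training queries of the threshold tuned for lambda*."""
    lam_star = np.mean(local_lambdas, axis=0)
    thetas = [tune_theta(X @ lam_star, y) for X, y in per_topic_data]
    return lam_star, float(np.mean(thetas))

# Hypothetical local weight vectors from two training topics plus their score data.
local_lambdas = [np.array([0.7, 0.3]), np.array([0.5, 0.5])]
per_topic_data = [(np.array([[0.9, 0.8], [0.2, 0.1]]), np.array([1, 0])),
                  (np.array([[0.7, 0.6], [0.3, 0.4]]), np.array([1, 0]))]
print(federated_fusion(local_lambdas, per_topic_data))
```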

Page 15: Methods for Learning Classifier Combinations: No Clear Winner


Experimental Evaluation
• Reuters Corpus v1, version 2 (RCV1-v2)
• 99 topics
• Completely judged
• ~23K documents (as in Lewis et al. 2004) to train the individual systems
• Selected 4060 documents (from ~800K) to construct fusion rules
• 9-fold cross-validation over topics
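A sketch of how I read the evaluation protocol: the 99 topics (not documents) are split into 9 folds, fusion parameters are learned on the training folds and scored on the held-out topics. The random fold assignment is an assumption, and the fitting/evaluation step is left as a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
topics = np.arange(99)                 # the 99 RCV1-v2 topics
folds = np.array_split(rng.permutation(topics), 9)

for i, test_topics in enumerate(folds):
    train_topics = np.setdiff1d(topics, test_topics)
    # Learn global/federated fusion parameters on train_topics,
    # then measure average T11SU on test_topics (placeholder).
    print(f"fold {i}: {len(train_topics)} train topics, {len(test_topics)} test topics")
```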

Page 16: Methods for Learning Classifier Combinations: No Clear Winner


Utility Measures

$T^+$ - all positive documents; $D^+$ - submitted positive documents; $D^-$ - submitted negative documents

$\mathrm{T11NU} = \dfrac{2\,|D^+| - |D^-|}{2\,|T^+|}$

$\mathrm{T11SU} = \dfrac{\max(\mathrm{T11NU},\, -0.5) + 0.5}{1.5}$
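A direct transcription of these two measures in Python; the decision and judgment vectors below are illustrative only.

```python
import numpy as np

def t11su(decisions, y):
    """T11NU = (2|D+| - |D-|) / (2|T+|);  T11SU = (max(T11NU, -0.5) + 0.5) / 1.5,
    where D+/D- are the submitted relevant/non-relevant docs and T+ all relevant docs."""
    d_pos = np.sum((decisions == 1) & (y == 1))   # submitted and relevant
    d_neg = np.sum((decisions == 1) & (y == 0))   # submitted but not relevant
    t_pos = np.sum(y == 1)                        # all relevant documents
    t11nu = (2 * d_pos - d_neg) / (2 * t_pos)
    return (max(t11nu, -0.5) + 0.5) / 1.5

decisions = np.array([1, 1, 1, 0, 0])
y         = np.array([1, 1, 0, 1, 0])
print(t11su(decisions, y))
```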

Page 17: Methods for Learning Classifier Combinations: No Clear Winner


Term Representation

$f(t,d) = \begin{cases} 1 + \log f'(t,d), & \text{if } f'(t,d) > 0 \\ 0, & \text{otherwise} \end{cases}$

where $f'(t,d)$ is the number of times term $t$ occurs in document $d$.

IDF weighting: let $i'(t)$ be the number of documents in the training set $T$ containing term $t$. Then:

$i_D(t) = \log\dfrac{|T| + 1}{i'(t) + 1}$
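A small sketch of these two weights; the toy counts are assumptions.

```python
import math

def tf_weight(raw_count):
    """f(t, d) = 1 + log f'(t, d) if f'(t, d) > 0, else 0."""
    return 1.0 + math.log(raw_count) if raw_count > 0 else 0.0

def idf_weight(doc_freq, num_train_docs):
    """i_D(t) = log((|T| + 1) / (i'(t) + 1))."""
    return math.log((num_train_docs + 1) / (doc_freq + 1))

# Toy example: a term occurring 3 times in a document and in 10 of 1000 training docs.
print(tf_weight(3) * idf_weight(10, 1000))
```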

Page 18: Methods for Learning Classifier Combinations: No Clear Winner


Individual Classifiers

• Bayesian Binary Regression (BBR) [Genkin et al. 2004]

• kNN, k=384 (k was chosen on the basis of prior experiments)

• Rocchio Classifier
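Of the three systems, Rocchio is the simplest to sketch: a prototype-based scorer over TF-IDF vectors. The parameters and data below are generic illustrations, not the paper's settings; BBR and the k=384 kNN are not reproduced here.

```python
import numpy as np

def rocchio_score(doc_vectors, y, test_doc, beta=1.0, gamma=0.5):
    """Score a document by cosine similarity to a Rocchio prototype:
    prototype = beta * centroid(relevant) - gamma * centroid(non-relevant)."""
    proto = beta * doc_vectors[y == 1].mean(axis=0) - gamma * doc_vectors[y == 0].mean(axis=0)
    denom = np.linalg.norm(proto) * np.linalg.norm(test_doc)
    return float(proto @ test_doc / denom) if denom > 0 else 0.0

# Hypothetical 3-term TF-IDF vectors for 4 training docs and one test doc.
X = np.array([[1.0, 0.2, 0.0], [0.8, 0.1, 0.1], [0.0, 0.9, 0.7], [0.1, 0.8, 0.9]])
y = np.array([1, 1, 0, 0])
print(rocchio_score(X, y, np.array([0.9, 0.2, 0.1])))
```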

Page 19: Methods for Learning Classifier Combinations: No Clear Winner


Single Classifiers and BBR-kNN fusion

[Figure: per-topic T11SU (y-axis, 0 to 1.2) for kNN, Rocchio, BBR, BBR-kNN global, and BBR-kNN federated; x-axis: topics labeled by their number of relevant documents]

Page 20: Methods for Learning Classifier Combinations: No Clear Winner


Global vs. Federated

[Figure: T11SU (y-axis, 0 to 1) of BBR-kNN global vs. BBR-kNN federated on individual topics (CCAT, M14, C18, C181, GCRIM, E12, C21, G15, C172, E512, E11, E13, C182, GENT, C173, E31, E311, G152, C331, E141), in decreasing order of the number of relevant documents]

Page 21: Methods for Learning Classifier Combinations: No Clear Winner


Global vs. Federated

[Figure: scatter plot of per-topic T11SU for BBR-kNN fusion, Global (x-axis, 0 to 1) vs. Federated (y-axis, 0 to 1)]

Page 22: Methods for Learning Classifier Combinations: No Clear Winner


Results

Local Fusion   kNN     Rocchio   BBR     BBR-kNN global   BBR-kNN federated
none           0.583   0.54      0.578   …                …
Centroid       …       …         …       0.569            0.587
Linear         …       …         …       0.569            0.574
Linear 2       …       …         …       0.569            0.575
Logistic       …       …         …       0.556            0.549

Average T11SU measure across 99 topics of RCV1

Page 23: Methods for Learning Classifier Combinations: No Clear Winner


Conclusions
• The Centroid method performs best with federated fusion
• Federated fusion gives higher average utility,
• but global fusion performs better on a greater number of topics.
• This seems to be related to the number of relevant documents for individual topics (federated is better for topics with few relevant documents).
• No clear winner: the choice of method depends on the user's objectives.
• However, federated fusion is computationally more efficient.
• Topic properties have to be considered when choosing a combination method.

Page 24: Methods for Learning Classifier Combinations: No Clear Winner


Acknowledgments

• KD-D group via NSF grant EIA-0087022
• Members of the DIMACS MMS project: Fred Roberts (PI), Andrei Anghelescu, Alex Genkin, Dave Lewis, David Madigan, Vladimir Menkov
• Kwong Bor Ng
• Anonymous reviewers