
Page 1: Fine-tuning Ranking Models:

Fine-tuning Ranking Models: a two-step optimization approach

Vitor

Jan 29, 2008

Text Learning Meeting - CMU

With invaluable ideas from ….

Page 2: Fine-tuning Ranking Models:

Motivation

• Rank, Rank, Rank…
– Web retrieval, movie recommendation, NFL draft, etc.
– Einat’s contextual search
– Richard’s set expansion (SEAL)
– Andy’s context-sensitive spelling correction algorithm
– Selecting seeds in Frank’s political blog classification algorithm
– Ramnath’s Thunderbird extension for:
  • Email Leak prediction
  • Email Recipient suggestion

Page 3: Fine-tuning Ranking Models:

Help your brothers!

• Try Cut Once!, our Thunderbird extension
– Works well with Gmail accounts

• It’s working reasonably well
• We need feedback.

Page 4: Fine-tuning Ranking Models:

[Screenshot of the Cut Once Thunderbird plug-in, with callouts:]
• Leak warnings: hit x to remove a recipient
• Suggestions: hit + to add a recipient
• Pause or cancel sending of the message
• Timer: message is sent after 10 sec by default
• Classifiers/rankers written in JavaScript

Page 5: Fine-tuning Ranking Models:

Email Recipient Recommendation

[Bar chart: MAP on the TO+CC+BCC and CC+BCC tasks for the Frequency, Recency, M1uc, M2uc, TFIDF, and KNN rankers; y-axis 0.15-0.50; 36 Enron users]

Page 6: Fine-tuning Ranking Models:

Email Recipient Recommendation

[Bar chart: same MAP comparison (Frequency, Recency, M1uc, M2uc, TFIDF, KNN) with threaded message information added; y-axis 0.15-0.50]

[Carvalho & Cohen, ECIR-08]

Page 7: Fine-tuning Ranking Models:

Aggregating Rankings

• Many “Data Fusion” methods, of 2 types:
  • Normalized scores: CombSUM, CombMNZ, etc.
  • Unnormalized scores: BordaCount, Reciprocal Rank Sum, etc.

• Reciprocal Rank: the sum of the inverse of the rank of the document in each ranking (see the formula below).

$$RR(d_i) = \sum_{q \in \text{Rankings}} \frac{1}{\text{rank}_q(d_i)}$$

[Aslam & Montague, 2001]; [Ogilvie & Callan, 2003]; [Macdonald & Ounis, 2006]
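As a minimal sketch (not the talk's code), reciprocal-rank fusion can be written in a few lines of Python; the function name and the example rankings are made up for illustration:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings):
    """Aggregate several ranked lists by summing 1/rank per document.

    rankings: list of lists, each an ordering of document ids
    (best first). Returns documents sorted by the fused score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for position, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / position
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example: three base rankers over four candidate recipients.
fused = reciprocal_rank_fusion([
    ["ann", "bob", "carl", "dee"],
    ["bob", "ann", "dee", "carl"],
    ["ann", "carl", "bob", "dee"],
])
print(fused)  # "ann" wins: ranked first by two of the three rankers
```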

Page 8: Fine-tuning Ranking Models:

Aggregated Ranking Results

[Carvalho & Cohen, ECIR-08]

Page 9: Fine-tuning Ranking Models:

Intelligent Email Auto-completion

[Screenshots: auto-completion for the TO+CC+BCC and CC+BCC tasks]

Page 10: Fine-tuning Ranking Models:

Can we do better?

• Not by using other features, but by using better ranking methods

• Machine learning to improve ranking: Learning to Rank
– Many (recent) methods:
  • ListNet, Perceptrons, RankSVM, RankBoost, AdaRank, Genetic Programming, Ordinal Regression, etc.
– Mostly supervised
– Generally small training sets
– Workshop at SIGIR-07 (Einat was on the PC)

Page 11: Fine-tuning Ranking Models:

Pairwise-based Ranking

Goal: induce a ranking function f(d) such that, for a query q and documents d_1, d_2, ..., d_T,

$$d_i \succ d_j \iff f(d_i) > f(d_j)$$

We assume a linear function f over the document's feature vector $d_i = (x_{i1}, x_{i2}, \dots, x_{im})$:

$$f(d_i) = \langle w, d_i \rangle = w_1 x_{i1} + w_2 x_{i2} + \dots + w_m x_{im}$$

Therefore, the constraints are:

$$d_i \succ d_j \implies \langle w, d_i - d_j \rangle > 0$$
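A minimal sketch of how these pairwise constraints can be materialized from graded judgments; `pairwise_differences` and the toy data are illustrative, not from the talk:

```python
import numpy as np

def pairwise_differences(X, relevance):
    """Build difference vectors d_i - d_j for every pair where
    document i is more relevant than document j; a linear ranker
    then only needs <w, d_i - d_j> > 0 on each of them."""
    diffs = []
    n = len(relevance)
    for i in range(n):
        for j in range(n):
            if relevance[i] > relevance[j]:
                diffs.append(X[i] - X[j])
    return np.array(diffs)

# Hypothetical query with 3 documents, 2 features each.
X = np.array([[0.9, 0.1], [0.4, 0.5], [0.2, 0.2]])
relevance = [2, 1, 0]  # graded labels: higher = more relevant
print(pairwise_differences(X, relevance))  # 3 constraint vectors
```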

Page 12: Fine-tuning Ranking Models:

Ranking with Perceptrons

• Nice convergence properties and mistake bounds
– bound on the number of mistakes/misranks

• Fast and scalable

• Many variants [Collins, 2002; Gao et al., 2005; Elsas et al., 2008]
– Voting, averaging, committee, pocket, etc.
– General update rule:

$$W^{(t+1)} = W^{(t)} + \left[f(d_R) - f(d_{NR})\right]$$

– Here: averaged version of the perceptron
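A minimal sketch of an averaged pairwise ranking perceptron, assuming the standard update w ← w + (d_R - d_NR) on each misranked pair (the helper name and toy data are illustrative):

```python
import numpy as np

def averaged_ranking_perceptron(pairs, dim, epochs=10):
    """pairs: list of (d_rel, d_nonrel) feature-vector tuples.
    On each misranked pair, move w toward the relevant document;
    return the average of w over all steps, which is typically
    more stable than the final w."""
    w = np.zeros(dim)
    w_sum = np.zeros(dim)
    steps = 0
    for _ in range(epochs):
        for d_r, d_nr in pairs:
            if np.dot(w, d_r - d_nr) <= 0:  # misrank: update
                w = w + (d_r - d_nr)
            w_sum += w
            steps += 1
    return w_sum / steps

# Hypothetical pairs: relevant doc should score above the non-relevant one.
pairs = [(np.array([1.0, 0.2]), np.array([0.3, 0.9])),
         (np.array([0.8, 0.1]), np.array([0.2, 0.4]))]
print(averaged_ranking_perceptron(pairs, dim=2))
```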

Page 13: Fine-tuning Ranking Models:

Rank SVM

• Equivalent to maximizing AUC

$$\min_w L_{ranksvm} = \frac{1}{2}\|w\|^2 + C \sum_{i \in P} \xi_i$$

subject to

$$\langle w, d_R - d_{NR} \rangle \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad \forall (d_R, d_{NR}) \in P$$

Equivalent to:

$$\min_w L_{ranksvm} = \lambda \|w\|^2 + \sum_{P} \left[1 - \langle w, d_R - d_{NR} \rangle\right]_{+}, \quad \text{where } \lambda = \frac{1}{2C}$$

[Joachims, KDD-02]; [Herbrich et al., 2000]
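A minimal sketch of minimizing the unconstrained hinge form above by batch subgradient descent (λ, the learning rate, and the toy pairs are illustrative):

```python
import numpy as np

def ranksvm_subgradient(diffs, lam=0.01, lr=0.1, epochs=100):
    """diffs: array of difference vectors d_R - d_NR, one per pair.
    Minimizes lam*||w||^2 + sum_P [1 - <w, diff>]_+ by full-batch
    subgradient descent."""
    w = np.zeros(diffs.shape[1])
    for _ in range(epochs):
        margins = diffs @ w
        active = diffs[margins < 1.0]      # pairs with nonzero hinge loss
        grad = 2 * lam * w - active.sum(axis=0)
        w -= lr * grad
    return w

diffs = np.array([[0.7, -0.7], [0.6, -0.3]])  # hypothetical pair differences
print(ranksvm_subgradient(diffs))
```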

Page 14: Fine-tuning Ranking Models:

Loss Function

[Plot: loss as a function of the pairwise margin $\langle w, d_R - d_{NR} \rangle$; x-axis -3 to 3, y-axis 0 to 2]


Page 16: Fine-tuning Ranking Models:

Loss Function

[Plot: a sigmoid-shaped loss over the pairwise margin $x = \langle w, d_R - d_{NR} \rangle$; x-axis -3 to 3, y-axis 0 to 2]

$$\text{sigmoid}(x) = \frac{1}{1 + e^{-x}} = \frac{e^{x}}{1 + e^{x}} = 1 - \frac{1}{1 + e^{x}}$$

Page 17: Fine-tuning Ranking Models:

Loss Functions

• SigmoidRank (not convex):

$$\min_w L_{SigmoidRank} = \lambda \|w\|^2 + \sum_{P} \left[1 - \text{sigmoid}(\langle w, d_R - d_{NR} \rangle)\right]$$

• SVMrank:

$$\min_w L_{ranksvm} = \lambda \|w\|^2 + \sum_{P} \left[1 - \langle w, d_R - d_{NR} \rangle\right]_{+}$$

where $\text{sigmoid}(x) = \frac{1}{1 + e^{-x}}$.
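A minimal sketch comparing the two pair losses on the same margins (toy values, illustrative only):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hinge_loss(margins):
    """SVMrank-style pair loss: [1 - margin]_+ (convex)."""
    return np.maximum(0.0, 1.0 - margins)

def sigmoid_rank_loss(margins):
    """SigmoidRank pair loss: 1 - sigmoid(margin), a smooth but
    non-convex approximation of the 0/1 misrank count."""
    return 1.0 - sigmoid(margins)

margins = np.array([-2.0, 0.0, 2.0])  # <w, d_R - d_NR> per pair
print(hinge_loss(margins))         # [3. 1. 0.]
print(sigmoid_rank_loss(margins))  # ~[0.88 0.5  0.12]
```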

Page 18: Fine-tuning Ranking Models:

Fine-tuning Ranking Models

[Diagram: Base Ranker → SigmoidRank → Final model]

• Step 1: Base ranking model, e.g., RankSVM, Perceptron, etc.
• Step 2: SigmoidRank (non-convex), minimizing a very close approximation of the number of misranks

Page 19: Fine-tuning Ranking Models:

Gradient Descent

$$w^{(k+1)} = w^{(k)} - \eta_k \, \nabla L_{SigmoidRank}(w^{(k)})$$

Since $\frac{\partial}{\partial x} \text{sigmoid}(x) = \text{sigmoid}(x)\left[1 - \text{sigmoid}(x)\right]$:

$$\nabla L_{SigmoidRank}(w^{(k)}) = 2\lambda w - \sum_{P} \text{sigmoid}(\langle w, d_R - d_{NR} \rangle)\left[1 - \text{sigmoid}(\langle w, d_R - d_{NR} \rangle)\right](d_R - d_{NR})$$
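A minimal sketch of the fine-tuning step itself: start from the base model's weights and descend the SigmoidRank gradient above (learning rate, λ, and toy data are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def finetune_sigmoid_rank(w_base, diffs, lam=0.01, lr=0.1, epochs=200):
    """Second optimization step: refine the base weights w_base by
    gradient descent on the (non-convex) SigmoidRank loss over the
    pair difference vectors diffs = d_R - d_NR."""
    w = w_base.copy()
    for _ in range(epochs):
        s = sigmoid(diffs @ w)                      # per-pair sigmoid
        grad = 2 * lam * w - (s * (1 - s)) @ diffs  # chain rule, as above
        w -= lr * grad
    return w

w_base = np.array([0.5, -0.2])               # e.g., from RankSVM/perceptron
diffs = np.array([[0.7, -0.7], [0.6, -0.3]])  # hypothetical pair differences
print(finetune_sigmoid_rank(w_base, diffs))
```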

Page 20: Fine-tuning Ranking Models:

Results in CC prediction

[Bar chart: MAP on the TO+CC+BCC and CC+BCC tasks for Frequency, Recency, TFIDF, KNN, Percep, Percep+Sigmoid, RankSVM, and RankSVM+Sigmoid; labeled values include 0.472, 0.516, 0.479, 0.524, 0.521, and 0.480; y-axis 0.25-0.55; 36 Enron users]

Page 21: Fine-tuning Ranking Models:

Set Expansion (SEAL) Results

[Bar chart: MAP on SEAL-1, SEAL-2, and SEAL-3 for Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet, and ListNet+Sigmoid; y-axis 0.80-0.94]

[ListNet: Cao et al., ICML-07]

[Wang & Cohen, ICDM-2007]

Page 22: Fine-tuning Ranking Models:

Results on LETOR

[Bar chart: MAP on Ohsumed, TREC3, and TREC4 for Percep, Percep+Sigmoid, RankSVM, RankSVM+Sigmoid, ListNet, and ListNet+Sigmoid; y-axis 0-0.50]

Page 23: Fine-tuning Ranking Models:

Learning Curve

[Line plot: AUC vs. training epoch (0-30) for Perceptron and RankSVM; TO+CC+BCC task, Enron user lokay-m; y-axis 0.900-0.920]

Page 24: Fine-tuning Ranking Models:

Learning Curve

[Line plot: AUC vs. training epoch (0-40) for Perceptron and RankSVM; CC+BCC task, Enron user campbel-m; y-axis 0.94-0.98]

Page 25: Fine-tuning Ranking Models:

Regularization Parameter

[Bar chart: MAP for RankSVM vs. RankSVM+Sigmoid on TREC3, TREC4, and Ohsumed, for regularization settings C = 10, 1, 0.1, 0.01, and 0.001; y-axis 0.10-0.50]

Page 26: Fine-tuning Ranking Models:

Some Ideas

• Instead of the number of misranks, optimize other loss functions:
– Mean Average Precision, MRR, etc.
– Rank term:

$$\text{Rank}(d_i) = 1 + \sum_{j \neq i} \left[1 - \text{sigmoid}(\langle w, d_i - d_j \rangle)\right]$$

– Some preliminary results with Sigmoid-MAP

• Does it work for classification?
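A minimal sketch of this smooth rank term, assuming the sum runs over all other candidate documents (function name and toy scores are illustrative); for a linear ranker, $\langle w, d_i - d_j \rangle = f(d_i) - f(d_j)$, so it can be computed from the scores alone:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_rank(scores):
    """Differentiable approximation of each document's rank:
    1 plus, for every other document j, a sigmoid estimate of
    whether j outscores i."""
    diffs = scores[:, None] - scores[None, :]  # f(d_i) - f(d_j) matrix
    losing = 1.0 - sigmoid(diffs)              # ~1 when j beats i
    np.fill_diagonal(losing, 0.0)              # skip j == i
    return 1.0 + losing.sum(axis=1)

scores = np.array([2.0, 0.5, -1.0])  # hypothetical f(d) values
print(soft_rank(scores))  # ~[1.2, 2.0, 2.8]: best doc is near rank 1
```

Because this rank estimate is differentiable, it could be plugged into a smoothed MAP or MRR objective and optimized the same way as SigmoidRank.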

Page 27: Fine-tuning Ranking Models:

Thanks