
Query Anchoring Using Discriminative Query Models

Saar Kuzi, Anna Shtok, Oren Kurland

Technion – Israel Institute of Technology

We thank SIGIR for the conference travel grant.

Pseudo-Feedback-Based Query Expansion

Highly ranked documents are used to induce a query model.

Relying on pseudo feedback may result in query drift:

Documents in the result list could be non-relevant

Relevant documents can contain non-query-pertaining information (Harman '92, He&Ounis '09, Lv&Zhai '09)

[Diagram: the query retrieves an initial result list $D_{init}$, which serves as pseudo feedback]

Query Anchoring

Techniques for mitigating the risk in relying on pseudo feedback

Direct:

Interpolation with a model of the original query (e.g., Zhai&Lafferty ’01, Abdul-Jaleel et al. ’04, Lv&Zhai ’09)

Using the original query model as a prior (Tao&Zhai ’04, Tao&Zhai ’06)

Indirect:

Term clipping (e.g., Zhai&Lafferty ’01, Abdul-Jaleel et al. ’04, Ye et al. ’10)

Differential impact of documents on the query model (Lavrenko&Croft '01, Abdul-Jaleel et al. '04, Lv&Zhai '14)


Our Approach

A novel indirect query anchoring approach using a new discriminative term-based model

An accurate term-based representation of the initial ranking

[Diagram: the initial result list $D_{init}$ is used, via learning-to-rank, to induce a discriminative query model; combining it with a query model yields an anchored query model]

Language Model Notation

Unigram language models are used.

Given text $x$:

$p_{MLE}(t|x) \triangleq \frac{tf(t \in x)}{|x|}$

$p_{Dir}(t|x) \triangleq \frac{tf(t \in x) + \mu \, p_{MLE}(t|C)}{|x| + \mu}$

Two language models, $\theta_1$ and $\theta_2$, are compared using cross entropy:

$CE\big(p(\cdot|\theta_1) \,||\, p(\cdot|\theta_2)\big) = -\sum_{t} p(t|\theta_1) \log p(t|\theta_2)$

$t$ is a term, $|x|$ is the length of $x$, and $C$ is the collection of documents.

Generative Query Models

Mixture Model (Zhai&Lafferty '01): $\theta_T$ is estimated by maximizing the log-likelihood of documents in $D_{init}$:

$\sum_{d \in D_{init}} \sum_{t \in d} tf(t \in d) \log\big((1-\gamma) \, p(t|\theta_T) + \gamma \, p(t|C)\big)$

$p(t|MM) \triangleq \lambda \, p_{MLE}(t|q) + (1-\lambda) \, p(t|\theta_T^{clipped})$

Relevance Model (Lavrenko&Croft '01; Abdul-Jaleel et al. '04):

$p(t|RM1) \triangleq \sum_{d \in D_{init}} p_{Dir}(t|d) \, p(d|q)$

$p(t|RM3) \triangleq \lambda \, p_{MLE}(t|q) + (1-\lambda) \, p(t|RM1^{clipped})$

[Diagrams of the two generative processes over documents $d_1, \ldots, d_n$ omitted]
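A rough sketch of how RM1 and RM3 could be computed under the definitions above, assuming each feedback document is already represented by a Dirichlet-smoothed language model (a term -> probability dict) and that `doc_query_weights` plays the role of $p(d|q)$; all names are illustrative.

```python
def rm1(doc_models, doc_query_weights):
    """p(t|RM1) = sum_d p_Dir(t|d) * p(d|q), over the documents in D_init."""
    model = {}
    for d_model, weight in zip(doc_models, doc_query_weights):
        for term, prob in d_model.items():
            model[term] = model.get(term, 0.0) + prob * weight
    return model


def rm3(rm1_model, query_model, lam=0.5, clip=50):
    """p(t|RM3) = lam * p_MLE(t|q) + (1 - lam) * p(t|RM1_clipped)."""
    top = dict(sorted(rm1_model.items(), key=lambda kv: kv[1], reverse=True)[:clip])
    z = sum(top.values()) or 1.0
    clipped = {t: v / z for t, v in top.items()}      # sum-normalize after clipping
    terms = set(clipped) | set(query_model)
    return {t: lam * query_model.get(t, 0.0) + (1 - lam) * clipped.get(t, 0.0) for t in terms}
```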

Discriminative Model

The pseudo-feedback assumption: the higher a document is ranked in the initial result list, the higher its relevance likelihood.

$\forall d_i, d_j \in D_{init}$: if $r(d_i) > r(d_j)$, then $d_i$ is more likely to be relevant than $d_j$

SVM-rank (Joachims ’02)

$\min \; \frac{1}{2} w \cdot w + C \sum_{i,j} \xi_{i,j}$

subject to:

$\forall i, j \text{ s.t. } r(d_i) > r(d_j): \;\; w \cdot \big(\phi(d_i) - \phi(d_j)\big) \geq 1 - \xi_{i,j}$

$\forall i, j \text{ s.t. } r(d_i) > r(d_j): \;\; \xi_{i,j} \geq 0$

$\phi(d) = \big(\log p_{Dir}(t_1|d), \ldots, \log p_{Dir}(t_{|V|}|d)\big)$

$V$ is the vocabulary used in the initial result list.
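Below is a minimal sketch of this pairwise formulation, using scikit-learn's LinearSVC on feature differences as a generic stand-in for Joachims' SVM-rank (the slides use SVM-rank itself). `doc_models` are assumed to be Dirichlet-smoothed language models of the documents in the initial list, ordered from highest to lowest rank, each covering every vocabulary term; all names are illustrative.

```python
import math
import numpy as np
from sklearn.svm import LinearSVC


def phi(doc_model, vocabulary):
    """phi(d) = (log p_Dir(t_1|d), ..., log p_Dir(t_|V||d))."""
    return np.array([math.log(doc_model[t]) for t in vocabulary])


def learn_weight_vector(doc_models, vocabulary, C=1.0):
    """Pairwise reduction: each pair (d_i ranked above d_j) yields phi(d_i) - phi(d_j) as a positive example."""
    feats = [phi(d, vocabulary) for d in doc_models]
    X, y = [], []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):        # document i precedes document j in the ranking
            X.append(feats[i] - feats[j]); y.append(1)
            X.append(feats[j] - feats[i]); y.append(-1)
    svm = LinearSVC(C=C, fit_intercept=False)
    svm.fit(np.array(X), np.array(y))
    return svm.coef_.ravel()                       # the learned w, one weight per vocabulary term
```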

Model Derivation

The learned weight vector $w$ is split into its positive components $w^+$ and its negative components $w^-$.

$L_1$-normalizing each yields the positive anchor model $\theta_{w^+}$ and the negative anchor model $\theta_{w^-}$.
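A small sketch of this derivation, assuming `w` is the learned weight vector aligned with `vocabulary`; names are illustrative.

```python
import numpy as np


def anchor_models(w, vocabulary):
    """Split w into positive/negative components and L1-normalize each into a term distribution."""
    w = np.asarray(w, dtype=float)
    pos = np.where(w > 0, w, 0.0)
    neg = np.where(w < 0, -w, 0.0)                 # magnitudes of the negative components
    theta_pos = {t: v / pos.sum() for t, v in zip(vocabulary, pos) if v > 0}
    theta_neg = {t: v / neg.sum() for t, v in zip(vocabulary, neg) if v > 0}
    return theta_pos, theta_neg
```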

AnchorPos Method

βˆ€π‘‘. 𝑠(𝑑) = πœ†2𝑝(𝑑|πœƒ) + πœ†3𝑝(𝑑|πœƒπ‘€+)

𝑝(𝑑|πœ—+)

𝑝(𝑑|πœƒπ΄π‘›π‘β„Žπ‘œπ‘Ÿπ‘ƒπ‘œπ‘ ) = πœ†1𝑝𝑀𝐿𝐸 (𝑑|π‘ž) + (1 βˆ’ πœ†1)𝑝(𝑑|πœ—+)

Anchoring a generative model, πœƒ, using the positive anchor model, πœƒπ‘€+

1. Term clipping

2. Sum normalization

Interpolation with the original query

model

πœ†1 + πœ†2 + πœ†3 = 1, πœ†π‘– β‰₯ 0

10
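A hedged sketch of the AnchorPos steps, assuming `theta_gen`, `theta_pos`, and `query_model` are term -> probability dicts (the latter being $p_{MLE}(\cdot|q)$); parameter names and defaults are illustrative.

```python
def anchor_pos(theta_gen, theta_pos, query_model, lam1=0.4, lam2=0.3, lam3=0.3, clip=50):
    """lam1 + lam2 + lam3 = 1; lam_i >= 0."""
    terms = set(theta_gen) | set(theta_pos)
    s = {t: lam2 * theta_gen.get(t, 0.0) + lam3 * theta_pos.get(t, 0.0) for t in terms}
    top = dict(sorted(s.items(), key=lambda kv: kv[1], reverse=True)[:clip])   # 1. term clipping
    z = sum(top.values()) or 1.0
    vartheta = {t: v / z for t, v in top.items()}                              # 2. sum normalization
    out_terms = set(vartheta) | set(query_model)
    # Interpolation with the original query model.
    return {t: lam1 * query_model.get(t, 0.0) + (1 - lam1) * vartheta.get(t, 0.0)
            for t in out_terms}
```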

ClipNeg Method

Clipping negative anchor terms from a generative model $p(t|\theta)$:

1. Setting to 0 the probabilities of negative anchor terms

2. Term clipping

3. Sum normalization

yielding $p(t|\vartheta^-)$

Interpolation with the original query model:

$p(t|\theta_{ClipNeg}) = \lambda \, p_{MLE}(t|q) + (1-\lambda) \, p(t|\vartheta^-)$
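A corresponding sketch of ClipNeg under the same assumptions (`theta_gen`, `theta_neg`, and `query_model` are term -> probability dicts); names and defaults are illustrative.

```python
def clip_neg(theta_gen, theta_neg, query_model, lam=0.5, clip=50):
    kept = {t: p for t, p in theta_gen.items() if t not in theta_neg}            # 1. zero negative anchor terms
    top = dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True)[:clip])  # 2. term clipping
    z = sum(top.values()) or 1.0
    vartheta = {t: v / z for t, v in top.items()}                                # 3. sum normalization
    terms = set(vartheta) | set(query_model)
    # Interpolation with the original query model.
    return {t: lam * query_model.get(t, 0.0) + (1 - lam) * vartheta.get(t, 0.0) for t in terms}
```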

Related Work

Existing query anchoring techniques (direct query anchoring, term clipping, and differential weighting)

Applying our approach on top of these yields further improvements

Methods for improving the quality of the pseudo-feedback result list (e.g., Mitra et al. '98, Lee et al. '08)

Our model can be induced from any ranked list


Related Work

A supervised term classification approach (Cao et al. '08)

Our approach is unsupervised and focuses on unigram query models

Clustering of terms in a query model (Udupa et al. ’09)

A method for cluster selection was not proposed

Fusing the initial and the expansion-based result lists (Zighelnic&Kurland '08)

Our methods operate at the model level and yield better performance


Evaluation

TREC datasets: TREC123, ROBUST, and WT10G

The initial result list is retrieved using a standard language model method (Lafferty&Zhai '01): $-CE\big(p_{MLE}(\cdot|q) \,||\, p_{Dir}(\cdot|d)\big)$

Baselines:

◦ Generative Models (RM3, MM)

◦ Fusion (Zighelnic&Kurland '08)

Values of free parameters are set using leave-one-out cross validation

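For concreteness, a minimal sketch of that query-likelihood criterion, assuming `query_model` is $p_{MLE}(\cdot|q)$ and `doc_model` is the Dirichlet-smoothed document model covering at least the query terms; names are illustrative.

```python
import math


def ql_score(query_model, doc_model):
    """Rank documents by -CE(p_MLE(.|q) || p_Dir(.|d)); higher is better."""
    return sum(p * math.log(doc_model[t]) for t, p in query_model.items() if p > 0)
```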

The Discriminative Model

• Re-ranking an initial result list of 100 documents according to:

$-\alpha \, CE\big(p(\cdot|\theta_{w^+}) \,||\, p_{Dir}(\cdot|d)\big) + (1-\alpha) \, CE\big(p(\cdot|\theta_{w^-}) \,||\, p_{Dir}(\cdot|d)\big)$

• The positive and negative anchor models are clipped to use $\nu$ terms

[Plot: Kendall's tau as a function of α on ROBUST, for ν = 100, ν = 500, and ν = ALL]
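A sketch of this re-ranking criterion, assuming `doc_models` is the list of Dirichlet-smoothed language models of the initially retrieved documents (each covering the anchor-model terms) and `theta_pos` / `theta_neg` are the clipped anchor models; names are illustrative.

```python
import math


def cross_entropy(p, q):
    """CE(p || q) = -sum_t p(t) log q(t)."""
    return -sum(pt * math.log(q[t]) for t, pt in p.items() if pt > 0)


def rerank(doc_models, theta_pos, theta_neg, alpha=0.5):
    """Score: -alpha * CE(theta_pos || d) + (1 - alpha) * CE(theta_neg || d); sort descending."""
    def score(d_model):
        return (-alpha * cross_entropy(theta_pos, d_model)
                + (1 - alpha) * cross_entropy(theta_neg, d_model))
    return sorted(doc_models, key=score, reverse=True)
```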

π‘πŒπŸ (𝐀𝐏 = πŸπŸ’. πŸ–)

𝛉𝐰+ (𝐀𝐏 = πŸ’πŸ—. πŸ‘)

Query: Airport Security, ROBUST, QL(AP=24.8) Discriminative vs.

Generative

16

• The discriminative model assigns high probabilities to terms with high IDF values

• The generative models are much more similar to each other, with respect to the terms they promote, than they are to the discriminative model

AnchorPos

Anchoring a generative model using the positive anchor model

[Bar charts: MAP of the RM3 and MM baselines and of their Fusion and AnchorPos variants on TREC123, ROBUST, and WT10G; markers denote statistically significant differences with the generative model and with Fusion]

ClipNeg

Clipping negative anchor terms

[Bar charts: MAP of the RM3 and MM baselines and of their ClipNeg variants on TREC123, ROBUST, and WT10G; markers denote statistically significant differences with the generative model]

Summary

We presented a novel unsupervised pseudo-feedback-based discriminative query model that is based on a learning-to-rank approach

We devised a few methods that use the discriminative model to perform (indirect) query anchoring of existing query models

Empirical evaluation showed that using our methods can improve the performance of highly effective generative query models