Transcript
Page 1: Discovering Key Concepts in Verbose Queries

Discovering Key Concepts in Verbose Queries

Michael Bendersky and W. Bruce Croft

University of Massachusetts

SIGIR 2008

Page 2: Discovering Key Concepts in Verbose Queries

Objective

• “Discovering Key Concepts in Verbose Queries”

Page 3: Discovering Key Concepts in Verbose Queries

Objective

• “Discovering Key Concepts in Verbose Queries”

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

Page 5: Discovering Key Concepts in Verbose Queries

Objective

• “Discovering Key Concepts in Verbose Queries”

• Use of key concepts?

Page 6: Discovering Key Concepts in Verbose Queries

Objective

• “Discovering Key Concepts in Verbose Queries”

• Use of key concepts?

• Combine with current IR model

Page 7: Discovering Key Concepts in Verbose Queries

Retrieval Model

• Conventional Language Model:

$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)}$

Page 8: Discovering Key Concepts in Verbose Queries

Retrieval Model

• Conventional Language Model:

$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)}$

• New Model:

$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)} = \sum_{c_i} \frac{p(q,d,c_i)}{p(d)}$
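
The last equality in the new model is a marginalization over the query's concepts. A minimal sketch of that step, assuming the concepts $c_i$ are treated as the mutually exclusive, exhaustive values of a latent variable (an assumption the slide does not state):

```latex
\begin{aligned}
p(q \mid d) &= \frac{p(q,d)}{p(d)} \\
            &= \frac{\sum_{c_i} p(q,d,c_i)}{p(d)} \\
            &= \sum_{c_i} p(q, c_i \mid d)
\end{aligned}
```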

Page 9: Discovering Key Concepts in Verbose Queries

Final Retrieval Function

$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$

Page 10: Discovering Key Concepts in Verbose Queries

Final Retrieval Function

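$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$

Language Model: the first term, $\lambda\, p(q \mid d)$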

Page 11: Discovering Key Concepts in Verbose Queries

Final Retrieval Function

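$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$

Key Concepts: the second term, $(1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$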
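
The final retrieval function interpolates a language-model score with a weighted sum over concepts. A minimal Python sketch of that combination follows. It assumes the three probabilities have already been estimated elsewhere; the function name, argument names, and the numbers in the example are hypothetical, and the default weight of 0.8 is the value reported later in the talk for Exp 2.

```python
def key_concept_score(p_q_given_d, concept_weights, concept_doc_probs, lam=0.8):
    """Return lam * p(q|d) + (1 - lam) * sum_i p(c_i|q) * p(c_i|d).

    p_q_given_d       -- language-model probability p(q|d)
    concept_weights   -- dict: concept -> p(c_i|q)
    concept_doc_probs -- dict: concept -> p(c_i|d); missing concepts count as 0
    """
    concept_term = sum(
        weight * concept_doc_probs.get(concept, 0.0)
        for concept, weight in concept_weights.items()
    )
    return lam * p_q_given_d + (1.0 - lam) * concept_term


# Illustrative (made-up) probabilities for one document and TREC topic 829.
score = key_concept_score(
    p_q_given_d=0.02,
    concept_weights={
        "spanish civil war": 0.6,
        "material international support": 0.4,
    },
    concept_doc_probs={
        "spanish civil war": 0.05,
        "material international support": 0.01,
    },
)
print(score)
```

Setting `lam=1.0` recovers the conventional language-model score, which makes the contribution of the concept term easy to isolate.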

Page 12: Discovering Key Concepts in Verbose Queries

What is a Concept?

• Noun phrase in a query

Page 13: Discovering Key Concepts in Verbose Queries

What is a Concept?

• Noun phrase in a query

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

Page 15: Discovering Key Concepts in Verbose Queries

Finding ‘Key’ Concepts

• Rank concepts by p(ci|q)

Page 16: Discovering Key Concepts in Verbose Queries

Finding ‘Key’ Concepts

• Rank concepts by p(ci|q)

• Compute p(ci|q) by frequency?

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

Page 17: Discovering Key Concepts in Verbose Queries

Finding ‘Key’ Concepts

• Approximate p(ci|q) by machine learning

• h(ci) is ci’s query-independent importance score

• $p(c_i \mid q) = \frac{h(c_i)}{\sum_{c_i \in q} h(c_i)}$

$c_i$ → AdaBoost.M1 → $h(c_i)$
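
The normalization above turns per-concept importance scores into a distribution over the concepts of one query. A minimal Python sketch, assuming the scores h(ci) have already been produced by a trained classifier; the function name and the score values are hypothetical and are not taken from the talk.

```python
def concept_probabilities(importance):
    """Normalize importance scores h(c_i) into p(c_i|q) for one query."""
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {concept: h / total for concept, h in importance.items()}


# Made-up importance scores for three noun phrases of TREC topic 829.
h = {
    "spanish civil war": 0.9,
    "material international support": 0.5,
    "information": 0.1,
}
p = concept_probabilities(h)

# Rank concepts by p(c_i|q), highest first.
ranked = sorted(p, key=p.get, reverse=True)
print(ranked)
```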

Page 18: Discovering Key Concepts in Verbose Queries

Features of a Concept

• is_cap : is capitalized

• tf : term frequency in corpus

• idf : inverse document frequency in corpus

• ridf : idf modified by Poisson model

• wig : weighted information gain; change in entropy from corpus to retrieved data

• g_tf : Google term frequency

• qp : number of times the concept appears as a part of a query in MSN Live

• qe : number of times the concept appears as an exact query in MSN Live
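
Three of these features can be computed from plain corpus counts. The Python sketch below is illustrative only. It assumes is_cap means every word of the concept starts with an uppercase letter, and it assumes ridf is the standard residual IDF (observed idf minus the idf a Poisson model would predict from the collection frequency). The slide only says "idf modified by Poisson model", so the exact formula, the function names, and the counts are assumptions.

```python
import math


def is_cap(concept):
    """True if every word in the concept starts with an uppercase letter."""
    words = concept.split()
    return bool(words) and all(word[0].isupper() for word in words)


def idf(doc_freq, num_docs):
    """Inverse document frequency, log base 2."""
    return math.log2(num_docs / doc_freq)


def ridf(doc_freq, coll_freq, num_docs):
    """Residual IDF: observed idf minus the idf expected under a Poisson model."""
    expected_idf = -math.log2(1.0 - math.exp(-coll_freq / num_docs))
    return idf(doc_freq, num_docs) - expected_idf


# Hypothetical counts: 1000 documents; the concept occurs 50 times in 10 of them.
print(is_cap("Spanish Civil War"), idf(10, 1000), ridf(10, 50, 1000))
```

A concept whose occurrences cluster in a few documents gets a clearly positive ridf, while one spread evenly across documents stays near zero.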

Page 19: Discovering Key Concepts in Verbose Queries

TREC Corpus

Page 20: Discovering Key Concepts in Verbose Queries

Exp 1: Identifying Key Concept

• Cross-validation on corpus

• Each fold has 50 queries

• Check whether the top concept is a key concept

• Assume 1 key concept per query during annotation
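
The evaluation protocol above can be sketched in a few lines of Python. This is a minimal illustration, assuming each query has exactly one annotated key concept and that folds are built by slicing the query list into groups of 50; the function names and the toy data are hypothetical.

```python
def make_folds(query_ids, fold_size=50):
    """Split query ids into consecutive folds of at most fold_size queries."""
    return [
        query_ids[i:i + fold_size]
        for i in range(0, len(query_ids), fold_size)
    ]


def top1_accuracy(ranked_concepts, annotated_key):
    """Fraction of queries whose top-ranked concept is the annotated key concept."""
    hits = sum(
        1
        for query_id, ranking in ranked_concepts.items()
        if ranking and ranking[0] == annotated_key[query_id]
    )
    return hits / len(ranked_concepts)


# Toy example with two queries; the rankings and annotations are made up.
rankings = {
    829: ["spanish civil war", "material international support"],
    830: ["information", "model railroads"],
}
gold = {829: "spanish civil war", 830: "model railroads"}
print(top1_accuracy(rankings, gold))

folds = make_folds(list(range(120)))
print([len(fold) for fold in folds])
```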

Page 21: Discovering Key Concepts in Verbose Queries

Exp 1: Identifying Key Concept

Page 22: Discovering Key Concepts in Verbose Queries

Exp 1: Identifying Key Concept

• Better than idf ranking

Page 23: Discovering Key Concepts in Verbose Queries

Exp 2: Information Retrieval

$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$

• Use only the top 2 concepts for each query

• q is the entire <desc> section

• $\lambda = 0.8$
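
Restricting the sum to the top 2 concepts can be sketched as a simple truncation of the ranked concept list. The Python below is a minimal illustration with made-up probabilities. The slide does not say whether the remaining weights are renormalized after truncation, so this sketch leaves them unchanged; the function name is hypothetical.

```python
def top_k_concepts(concept_probs, k=2):
    """Keep the k concepts with the highest p(c_i|q); weights are not renormalized."""
    ranked = sorted(concept_probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Made-up concept probabilities for the <desc> field of one query.
probs = {
    "spanish civil war": 0.55,
    "material international support": 0.30,
    "information": 0.10,
    "side": 0.05,
}
kept = top_k_concepts(probs, k=2)
print(kept)
```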

Page 24: Discovering Key Concepts in Verbose Queries

Exp 2: Information Retrieval

• KeyConcept[2]<desc> : authors’ method

• SeqDep<desc> : include all bigrams in query

Page 25: Discovering Key Concepts in Verbose Queries

Exp 2: Information Retrieval

Page 26: Discovering Key Concepts in Verbose Queries

What to take home?

• Singling out key concepts improves retrieval

