discovering key concepts in verbose queries

Post on 21-Jan-2016

Discovering Key Concepts in Verbose Queries

Michael Bendersky and W. Bruce Croft

University of Massachusetts

SIGIR 2008

Objective

• “Discovering Key Concepts in Verbose Queries”

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

• Use of key concepts?

• Combine with current IR model

Retrieval Model

• Conventional Language Model:

$$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)}$$

• New Model:

$$\mathrm{score}(q,d) = p(q \mid d) = \frac{p(q,d)}{p(d)} = \frac{\sum_{c_i} p(q,d,c_i)}{p(d)}$$

Final Retrieval Function

$$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$$

• Language Model: the first term, $\lambda\, p(q \mid d)$

• Key Concepts: the second term, $(1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$
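As a rough illustration of the final retrieval function, the sketch below interpolates a document's language-model score p(q|d) with a weighted sum over concepts of p(ci|q) · p(ci|d). The function name `interpolated_score`, the mixing weight `lam`, and all probabilities in the example are hypothetical and chosen only to make the arithmetic visible; they are not taken from the slides.

```python
from typing import Mapping


def interpolated_score(
    p_q_given_d: float,
    concept_weights: Mapping[str, float],
    concept_doc_probs: Mapping[str, float],
    lam: float,
) -> float:
    """Return lam * p(q|d) + (1 - lam) * sum_i p(c_i|q) * p(c_i|d)."""
    # Concepts missing from the document contribute zero in this sketch.
    concept_term = sum(
        weight * concept_doc_probs.get(concept, 0.0)
        for concept, weight in concept_weights.items()
    )
    return lam * p_q_given_d + (1.0 - lam) * concept_term


# Hypothetical numbers for one query/document pair.
weights = {"spanish civil war": 0.6, "material international support": 0.4}
doc_probs = {"spanish civil war": 0.02, "material international support": 0.005}
score = interpolated_score(0.001, weights, doc_probs, lam=0.8)
```

With `lam` close to 1 the ranking falls back to the plain language model; lowering it gives the key concepts more influence.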

What is a Concept?

• Noun phrase in a query

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

Finding ‘Key’ Concepts

• Rank concepts by p(ci|q)

• Compute p(ci|q) by frequency?

• <num> Number 829

<title> Spanish Civil War support

<desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

• Approximate p(ci|q) by machine learning

• h(ci) is ci’s query-independent importance score

• $p(c_i \mid q) = \frac{h(c_i)}{\sum_{c_j \in q} h(c_j)}$

• ci → AdaBoost.M1 → h(ci)
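The normalization step can be sketched as follows: each concept's importance score h(ci) is divided by the sum of the scores of all concepts in the same query, and the concepts are then ranked by the result. The scores in the example are made up for illustration; in the talk they would come from the trained AdaBoost.M1 model, which is not reproduced here.

```python
from typing import Dict, List


def concept_probabilities(h_scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize per-concept scores h(c_i) so they sum to 1 within one query."""
    total = sum(h_scores.values())
    if total <= 0:
        raise ValueError("at least one concept needs a positive score")
    return {concept: h / total for concept, h in h_scores.items()}


def top_k_concepts(probs: Dict[str, float], k: int = 2) -> List[str]:
    """Return the k concepts with the highest p(c_i|q)."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


# Hypothetical h(c_i) values for the noun phrases of TREC topic 829.
h = {
    "spanish civil war": 0.9,
    "material international support": 0.6,
    "information": 0.1,
    "all kinds": 0.2,
    "either side": 0.2,
}
probs = concept_probabilities(h)
best = top_k_concepts(probs, k=2)
```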

Features of a Concept

• is_cap : is capitalized

• tf : term frequency in corpus

• idf : inverse document frequency in corpus

• ridf : residual idf; idf modified by Poisson model

• wig : weighted information gain; change in entropy from corpus to retrieved data

• g_tf : Google term frequency

• qp : number of times the concept appears as a part of a query in MSN Live

• qe : number of times the concept appears as exact query in MSN Live
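A minimal sketch of how the corpus-based features (is_cap, tf, idf, ridf) might be computed over a toy collection. Several choices here are assumptions rather than details from the slides: is_cap is taken to mean "any word starts with an uppercase letter", phrases are matched as exact token sequences, and ridf uses the common residual-idf definition (observed idf minus the idf expected under a Poisson model). The features that need external resources (wig, g_tf, qp, qe) are left out.

```python
import math
from typing import Dict, List


def count_phrase(tokens: List[str], phrase: List[str]) -> int:
    """Count occurrences of `phrase` as a contiguous token sequence."""
    n = len(phrase)
    if n == 0:
        return 0
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase)


def concept_features(concept: str, docs: List[str]) -> Dict[str, float]:
    phrase = concept.lower().split()
    counts = [count_phrase(doc.lower().split(), phrase) for doc in docs]
    n_docs = len(docs)
    cf = sum(counts)                      # total occurrences in the corpus
    df = sum(1 for c in counts if c > 0)  # documents containing the concept
    if df == 0:
        # idf is undefined for unseen concepts; reported as 0.0 in this sketch.
        idf = ridf = 0.0
    else:
        idf = math.log2(n_docs / df)
        # idf expected if occurrences were Poisson-distributed over documents.
        expected_idf = -math.log2(1.0 - math.exp(-cf / n_docs))
        ridf = idf - expected_idf
    is_cap = 1.0 if any(w[0].isupper() for w in concept.split()) else 0.0
    return {"is_cap": is_cap, "tf": float(cf), "idf": idf, "ridf": ridf}


# Toy corpus, invented for illustration.
docs = [
    "the spanish civil war began in 1936",
    "foreign support in the spanish civil war",
    "support for local schools",
    "weather report",
]
features = concept_features("Spanish Civil War", docs)
```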

TREC Corpus

Exp 1: Identifying Key Concept

• Cross-validation on corpus

• Each fold has 50 queries

• Check whether the top concept is a key concept

• Assume 1 key concept per query during annotation

• Better than idf ranking
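The evaluation described for Exp 1 can be sketched as below: queries are split into folds of 50, and accuracy is the fraction of queries whose top-ranked concept matches the single annotated key concept. The helper names, query ids, and rankings are hypothetical; no results from the talk are reproduced.

```python
from typing import Dict, List, Sequence


def make_folds(query_ids: Sequence[str], fold_size: int = 50) -> List[List[str]]:
    """Split query ids into consecutive folds of at most `fold_size` queries."""
    return [
        list(query_ids[i:i + fold_size])
        for i in range(0, len(query_ids), fold_size)
    ]


def top1_accuracy(ranked: Dict[str, List[str]], annotated: Dict[str, str]) -> float:
    """Fraction of queries whose top-ranked concept is the annotated key concept."""
    if not annotated:
        return 0.0
    hits = sum(
        1
        for qid, key in annotated.items()
        if ranked.get(qid) and ranked[qid][0] == key
    )
    return hits / len(annotated)


# Invented rankings and annotations for three queries.
ranked = {
    "q1": ["spanish civil war", "material international support"],
    "q2": ["information", "oil spills"],
    "q3": ["solar energy"],
}
annotated = {"q1": "spanish civil war", "q2": "oil spills", "q3": "solar energy"}
accuracy = top1_accuracy(ranked, annotated)
folds = make_folds([f"q{i}" for i in range(120)])
```

The same accuracy function can score an idf-only ranking, which is the baseline the slide compares against.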

Exp 2: Information Retrieval

$$\mathrm{score}(q,d) = \lambda\, p(q \mid d) + (1-\lambda) \sum_{c_i} p(c_i \mid q)\, p(c_i \mid d)$$

• Use only the top 2 concepts for each query

• q is the entire <desc> section

• $\lambda = 0.8$

• KeyConcept[2]<desc> : the authors’ method, using the top 2 key concepts

• SeqDep<desc> : sequential dependence model; includes all bigrams in the query

What to take home?

• Singling out key concepts improves retrieval
