Page 1: Intent Mining from  Search Results

Intent Mining from Search Results

Jan Pedersen

Page 2: Intent Mining from  Search Results

Outline

• Intro to Web Search
  – Free text queries
  – Architecture
  – Why it works

• Result Set Mining
  – Disambiguation
  – Correction
  – Amplification

Page 3: Intent Mining from  Search Results

The Worst Interface (ca 1990)

The Search Interface (ca 2010)

Page 4: Intent Mining from  Search Results

Search wasn’t always like this

ttl/(tennis and (racquet or racket))
isd/1/8/2002 and motorcycle
in/newmar-julie

Source: USPTO

Page 5: Intent Mining from  Search Results

Salton’s Contribution

Source: cs.cornell.edu

• Free text queries
• Approximate matching
• Relevance ranking

• Exploit redundancy
• Meta data
• Scored-OR

Page 6: Intent Mining from  Search Results

Life of a query

Gerry Salton

(Scored-OR 10, ([(“Gerry” or “Gerald”),0.3], [“Salton”,0.7]))

Index

• Separation between user query and backend query
• Relevance scoring and ranking
• Query-in-context summaries
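A minimal sketch of how a backend query like the Scored-OR above might be evaluated, assuming a toy in-memory inverted index. Reading the leading 10 as the number of results to return, the weighted-sum scoring, and the names `INDEX` and `scored_or` are all illustrative assumptions, not the system described in the talk.

```python
from collections import defaultdict

# Hypothetical toy inverted index: term -> set of document ids.
INDEX = {
    "gerry": {1, 2},
    "gerald": {3},
    "salton": {1, 3, 4},
}

def scored_or(limit, clauses, index=INDEX):
    """Score documents by summing the weight of every clause they match.

    Each clause is (alternatives, weight). A document matches a clause if it
    contains any of the alternatives. Documents need only match one clause,
    which is what makes this an OR rather than an AND.
    """
    scores = defaultdict(float)
    for alternatives, weight in clauses:
        matched = set()
        for term in alternatives:
            matched |= index.get(term.lower(), set())
        for doc_id in matched:
            scores[doc_id] += weight
    # Highest score first; ties broken by document id for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]

# User query "Gerry Salton" rewritten into the backend form from the slide.
results = scored_or(10, [(("Gerry", "Gerald"), 0.3), (("Salton",), 0.7)])
```

In this toy index, documents that match both the name variants and "Salton" rank ahead of documents that match only one clause, while partial matches are still returned.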

Page 7: Intent Mining from  Search Results

Why Does it Work?

Page 8: Intent Mining from  Search Results

Semantic Meta-Data

Segment                      Tail   Overall
All Queries                  100%   100%
Word Count > 4               41%    20%
Misspelled                   21%    11%
Perfect Matches Popularity   28%    54%
Partial Matches Popularity   45%    28%
No Matches Popularity        9%     7%

Page 9: Intent Mining from  Search Results

RESULT SET MINING

Page 10: Intent Mining from  Search Results

Query Expansion

• [Gerry Salton] → [Gerry Salton Cornell]
• Disambiguation via Expansion
• Pseudo Relevance Feedback (Evans)
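One way pseudo relevance feedback could produce an expansion such as "Cornell" is to pick the term that appears in the most top-ranked results. The sketch below assumes that simple document-frequency heuristic; the snippets in `TOP_RESULTS`, the stopword list, and the function `expand_query` are hypothetical and stand in for a real result set.

```python
from collections import Counter

# Hypothetical snippets standing in for the top results of [Gerry Salton].
TOP_RESULTS = [
    "Gerry Salton was a professor of computer science at Cornell",
    "Salton led the SMART information retrieval project at Cornell",
    "Gerard Salton Cornell University information retrieval pioneer",
]

STOPWORDS = {"a", "at", "of", "the", "was"}

def expand_query(query, top_results, n_terms=1):
    """Append the terms that occur in the most top results to the query."""
    query_terms = {t.lower() for t in query.split()}
    doc_freq = Counter()
    for text in top_results:
        # Count each term once per result so one long page cannot dominate.
        terms = {t.lower() for t in text.split()}
        doc_freq.update(terms - query_terms - STOPWORDS)
    # Most frequent first; alphabetical order breaks ties deterministically.
    best = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return query.split() + [term for term, _ in best[:n_terms]]

expanded = expand_query("Gerry Salton", TOP_RESULTS)
```

The expansion terms come out lowercased because matching is case-insensitive here; a production system would more likely weight expansion terms than append them verbatim.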

Page 11: Intent Mining from  Search Results

Life of a query (2)

Gerry Salton

(Scored-OR 10, ([(“Gerry” or “Gerald”),0.3], [“Salton”,0.7]))

Index

Gerry Salton → Gerry Salton Cornell

• Result Set Analysis
• Automated Query expansion
• Reranking

Page 12: Intent Mining from  Search Results

Spelling Correction

• Session Log Mining
• Multiple queries with Blending
• Behavioral feedback loop

Blend(Scored-AND(200, “britinay”, “spares”), Scored-AND(200, “britney”, “spears”))

Scored-AND(200, OR(“britinay”, “britney”), OR(“spares”, “spears”))
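The two expressions above differ in what they can match: blending runs the original and the corrected query separately and merges the results, while the OR form accepts any mix of spellings in a single query. The sketch below illustrates that difference on a toy index. The index contents, the sum-of-term-scores ranking, and the max-score merge in `blend` are assumptions made for illustration only.

```python
# Hypothetical toy index: term -> {doc_id: term score}.
INDEX = {
    "britinay": {1: 0.2},
    "spares": {1: 0.3, 2: 0.3},
    "britney": {3: 0.9, 4: 0.5, 2: 0.1},
    "spears": {3: 0.8, 4: 0.6},
}

def scored_and(limit, *groups, index=INDEX):
    """Rank documents that match every group.

    A group is either a single term or a tuple of alternative spellings.
    """
    per_group = []
    for group in groups:
        alternatives = (group,) if isinstance(group, str) else group
        best = {}
        for term in alternatives:
            for doc_id, score in index.get(term, {}).items():
                best[doc_id] = max(best.get(doc_id, 0.0), score)
        per_group.append(best)
    # AND semantics: keep only documents present in every group.
    common = set.intersection(*(set(g) for g in per_group))
    scored = {d: sum(g[d] for g in per_group) for d in common}
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))[:limit]

def blend(*result_lists):
    """Merge ranked lists, keeping the best score seen for each document."""
    merged = {}
    for results in result_lists:
        for doc_id, score in results:
            merged[doc_id] = max(merged.get(doc_id, 0.0), score)
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))

blended = blend(
    scored_and(200, "britinay", "spares"),
    scored_and(200, "britney", "spears"),
)
combined = scored_and(200, ("britinay", "britney"), ("spares", "spears"))
```

In this toy data, document 2 contains "britney" and "spares", so it matches only the OR form; the blended form returns just the documents that match one spelling pair in full.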

Page 13: Intent Mining from  Search Results

Web Search

Gerry Salton

• Speller
• Synonyms

Index

First Stage reRanking: 100K

(Scored-AND 200, “Gerry”, “Salton”)

Index × 5: 100B

Local, News

Second Stage reRanking: 5K

Third Stage reRanking: 50

• Query Understanding
• Federation
• ReRanking and Blending
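The staged reranking on this slide can be sketched as a cascade in which each stage applies a more expensive scorer to fewer candidates. Reading 100K, 5K, and 50 as the number of candidates each stage scores is an assumption, and the synthetic documents and the three scorers below are hypothetical placeholders.

```python
def cascade(candidates, stages):
    """Apply (scorer, keep) stages in order, pruning to `keep` after each."""
    survivors = list(candidates)
    for scorer, keep in stages:
        # sorted() is stable, so equally scored documents keep their order.
        survivors = sorted(survivors, key=scorer, reverse=True)[:keep]
    return survivors

# Synthetic candidate set standing in for the 100K documents from the index.
candidates = [
    {"id": i, "text_match": i % 7, "link_score": i % 11, "click_score": i % 13}
    for i in range(100_000)
]

# Hypothetical scorers, ordered from cheapest to most expensive.
def first_stage(doc):
    return doc["text_match"]

def second_stage(doc):
    return doc["text_match"] + doc["link_score"]

def third_stage(doc):
    return doc["text_match"] + doc["link_score"] + doc["click_score"]

stages = [
    (first_stage, 5_000),   # score 100K candidates, keep 5K
    (second_stage, 50),     # score 5K candidates, keep 50
    (third_stage, 50),      # reorder the final 50
]
top = cascade(candidates, stages)
```

The design point is cost control: the cheap scorer sees every candidate, and the expensive scorer only sees the few that survive the earlier stages.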

Page 14: Intent Mining from  Search Results

• Entity Detection
• Grouping
• Summarization

Page 15: Intent Mining from  Search Results

Post Result Triggering

• Alternative to Answer Blending
• Structured Data integration
• Off-page data joins

Page 16: Intent Mining from  Search Results

Grouping

• Reranked Results
• Compressed Presentation
• Coherently grouped
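As a rough illustration of grouped presentation, the sketch below collapses a ranked result list into groups while keeping each group at the position of its best-ranked member. The slide does not say what the grouping key is; grouping by host name, the function `group_by_host`, and the example URLs are assumptions.

```python
from urllib.parse import urlparse

def group_by_host(ranked_urls):
    """Group ranked URLs by host, ordering groups by their best-ranked member."""
    groups = {}
    for url in ranked_urls:
        host = urlparse(url).netloc
        # dicts preserve insertion order, so a group's position is set by
        # the first (highest-ranked) URL seen for that host.
        groups.setdefault(host, []).append(url)
    return list(groups.items())

# Hypothetical ranked results, best first.
ranked = [
    "https://a.example/1",
    "https://b.example/x",
    "https://a.example/2",
    "https://c.example/",
    "https://b.example/y",
]
grouped = group_by_host(ranked)
```

Five results compress to three groups here, and results within each group stay in their original rank order.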

Page 17: Intent Mining from  Search Results

Summary

• Web Queries are not User Intent
  – Suffer from ambiguity and errors

• Intent can be mined from results
  – Query Correction
  – Disambiguation
  – Grouping and Organization