Personalized Query Expansion for the Web
Paul-Alexandru Chirita, Claudiu S. Firan, Wolfgang Nejdl
Gabriel Barata
Motivation
by Tojosan @ Flickr
What is query expansion?
Add meaningful search terms to the query…
What is PIR (Personal Information Repository) based query expansion?
Add meaningful search terms to the query…
… related to the user’s interests.
Why PIR based query expansion?
More personalization quality!
More privacy!
Example
Google search: “canon book”
Example
Top 3 results:
• The Canon: A Whirligig Tour of the Beautiful Basics of Science (Hardcover) @ Amazon
• Western Canon @ Wikipedia
• Biblical Canon @ Wikipedia
Example
Expanded query: “canon book bible”
Example
Top 3 results:
• Biblical Canon @ Wikipedia
• Books of the Bible @ Wikipedia
• The Canon of the Bible @ catholicapologetics.org
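The expansion step in this example can be sketched as follows. This is only an illustration: the function name `expand_query`, the parameter `k`, and the term scores are hypothetical and not taken from the slides. It appends the best-scoring candidate terms that the query does not already contain.

```python
from typing import Dict


def expand_query(query: str, scored_terms: Dict[str, float], k: int = 1) -> str:
    """Append the k highest-scoring candidate terms not already in the query.

    scored_terms maps candidate expansion terms to illustrative scores;
    how those scores are computed is a separate question.
    """
    present = set(query.lower().split())
    ranked = sorted(scored_terms.items(), key=lambda kv: kv[1], reverse=True)
    extra = [term for term, _ in ranked if term.lower() not in present][:k]
    return " ".join([query] + extra)


# Hypothetical scores for a user whose documents are mostly about religion.
print(expand_query("canon book", {"bible": 0.9, "book": 0.8, "camera": 0.2}))
```

With these made-up scores the call reproduces the expanded query from the slide, "canon book bible".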
Query Expansion using Desktop data
by Old Shoe Woman @ Flickr
Algorithms
• Expanding with Local Desktop Analysis
• Expanding with Global Desktop Analysis
Expanding with Local Desktop Analysis
• Term and Document Frequency
• Lexical Compounds
• Sentence Selection
Term and Document Frequency
$$TermScore = \left[ \frac{1}{2} + \frac{1}{2} \cdot \frac{nrWords - pos}{nrWords} \right] \cdot \log(1 + TF)$$
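A minimal sketch of the TermScore formula, under some assumptions the slide does not state: `pos` is taken to be the position of the term's first occurrence in the document, `nr_words` the document length in words, `tf` the term frequency, and the logarithm is assumed to be natural.

```python
import math


def term_score(pos: int, nr_words: int, tf: int) -> float:
    """TermScore = [1/2 + 1/2 * (nrWords - pos) / nrWords] * log(1 + TF).

    pos: assumed position of the term's first occurrence (0 = document start).
    nr_words: total number of words in the document (must be > 0).
    tf: number of occurrences of the term in the document.
    """
    position_weight = 0.5 + 0.5 * (nr_words - pos) / nr_words
    return position_weight * math.log(1 + tf)


# A term that first appears early outranks an equally frequent later one.
early = term_score(pos=0, nr_words=100, tf=3)
late = term_score(pos=90, nr_words=100, tf=3)
print(early > late)
```

The position weight ranges from 1 at the start of the document down to 1/2 at its end, so position can at most halve the frequency component.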
Lexical Compounds
{ adjective? Noun+ }
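The pattern above (an optional adjective followed by one or more nouns) can be matched over part-of-speech tagged text. The sketch below assumes the tokens are already tagged with coarse labels `"ADJ"` and `"NOUN"`; the tag names and the function `lexical_compounds` are illustrative choices, and any tagger could supply the input.

```python
from typing import List, Tuple


def lexical_compounds(tagged: List[Tuple[str, str]]) -> List[str]:
    """Return spans matching `adjective? noun+` in a tagged token list.

    tagged: (word, tag) pairs; tags other than "ADJ" and "NOUN" break a span.
    """
    compounds = []
    i, n = 0, len(tagged)
    while i < n:
        start = i
        if tagged[i][1] == "ADJ":  # optional single leading adjective
            i += 1
        j = i
        while j < n and tagged[j][1] == "NOUN":  # one or more nouns
            j += 1
        if j > i:
            compounds.append(" ".join(word for word, _ in tagged[start:j]))
            i = j
        else:
            i = start + 1  # no noun followed; move on by one token
    return compounds


sentence = [("the", "DET"), ("digital", "ADJ"), ("camera", "NOUN"),
            ("lens", "NOUN"), ("is", "VERB"), ("good", "ADJ")]
print(lexical_compounds(sentence))  # ['digital camera lens']
```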
Sentence Selection
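$$SentenceScore = \frac{SW^2}{TW} + PS + \frac{TQ^2}{NQ}$$

$$PS = \begin{cases} \dfrac{Avg(NS) - SentenceIndex}{Avg^2(NS)}, & \text{if } SentenceIndex \le 10 \\ 0, & \text{if } SentenceIndex > 10 \end{cases}$$

$$TF > ms = \begin{cases} 7 - 0.1 \times (25 - NS), & \text{if } NS < 25 \\ 7, & \text{if } NS \in [25, 40] \\ 7 + 0.1 \times (NS - 40), & \text{if } NS > 40 \end{cases}$$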
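The three formulas can be combined in a short sketch. The slide does not define its symbols, so the reading used here is an assumption: SW is the number of significant words in the sentence (words whose TF exceeds the threshold ms), TW the total number of words in the sentence, TQ the number of query terms present in the sentence, NQ the number of query terms, NS the number of sentences in the document, and Avg(NS) the average number of sentences per document. Function and parameter names are illustrative.

```python
def significance_threshold(ns: int) -> float:
    """Threshold ms: a word is significant if its TF exceeds this value."""
    if ns < 25:
        return 7 - 0.1 * (25 - ns)
    if ns <= 40:
        return 7.0
    return 7 + 0.1 * (ns - 40)


def position_score(sentence_index: int, avg_ns: float) -> float:
    """PS: non-zero only for the first ten sentences of a document."""
    if sentence_index <= 10:
        return (avg_ns - sentence_index) / avg_ns ** 2
    return 0.0


def sentence_score(sw: int, tw: int, tq: int, nq: int,
                   sentence_index: int, avg_ns: float) -> float:
    """SentenceScore = SW^2 / TW + PS + TQ^2 / NQ."""
    return sw ** 2 / tw + position_score(sentence_index, avg_ns) + tq ** 2 / nq


# Sentence 11 of a document: 2 significant words out of 8, 1 of 2 query terms.
print(sentence_score(sw=2, tw=8, tq=1, nq=2, sentence_index=11, avg_ns=20))  # 1.0
```

In this example the position term is zero, so the score is 4/8 + 1/2 = 1.0.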
Expanding with Global Desktop Analysis
• Term Co-occurrence Statistics
• Thesaurus based Expansion
Term Co-occurrence Statistics
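This slide gives no formula, so the following is only an illustrative sketch of one common way to measure term co-occurrence: a document-frequency based cosine similarity, CS(x, y) = DF(x, y) / sqrt(DF(x) · DF(y)). Both the formula choice and the function name `cooccurrence_cosine` are assumptions, not the presenters' definition.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def cooccurrence_cosine(docs: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Cosine similarity of term pairs based on document co-occurrence.

    docs: each document as a list of terms. Pairs are keyed in sorted order.
    """
    df = Counter()        # number of documents containing each term
    pair_df = Counter()   # number of documents containing both terms
    for doc in docs:
        terms = sorted(set(doc))
        df.update(terms)
        pair_df.update(combinations(terms, 2))
    return {
        (x, y): count / math.sqrt(df[x] * df[y])
        for (x, y), count in pair_df.items()
    }


docs = [["canon", "bible"], ["canon", "camera"], ["canon", "bible", "book"]]
scores = cooccurrence_cosine(docs)
print(scores[("bible", "canon")] > scores[("camera", "canon")])
```

Terms that score highly against the query keywords would then be candidates for expansion.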
Thesaurus based Expansion
Experiments & Evaluation
by Canadian Museum of Nature @ Flickr
Experiments
• 18 users
• Files indexed within user-selected paths, emails and Web cache
Experiments
• Each user chose 4 queries:
– 1 from the top 2% log queries (avg. length = 2.0)
– 1 random log query (avg. length = 2.3)
– 1 self-selected specific query (avg. length = 2.9)
– 1 self-selected ambiguous query (avg. length = 1.8)
Evaluation
$$DCG(i) = \begin{cases} G(1), & \text{if } i = 1 \\ DCG(i - 1) + \dfrac{G(i)}{\log_2(i)}, & \text{otherwise} \end{cases}$$
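The recursive DCG definition can be computed iteratively. In this sketch `gains` is assumed to hold the relevance gain G(i) of the result at each rank, in rank order; the function name and the sample gain values are illustrative.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted Cumulative Gain over a ranked list of relevance gains.

    DCG(1) = G(1); DCG(i) = DCG(i - 1) + G(i) / log2(i) for i > 1.
    """
    total = 0.0
    for rank, gain in enumerate(gains, start=1):
        total += gain if rank == 1 else gain / math.log2(rank)
    return total


# The same gains score higher when the most relevant result is ranked first.
print(dcg([2, 1, 0]) > dcg([0, 1, 2]))
```

Because log2(2) = 1, the first two ranks are not discounted; the penalty starts at rank 3.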
Evaluation
• Evaluated algorithms:
– Google: Google query output
– TF, DF: Term and Document Frequency
– LC, LC[O]: Regular and Optimized Lexical Compounds
– TC[CS], TC[MI], TC[LR]: Term Co-occurrence Statistics using Cosine Similarity, Mutual Information and Likelihood Ratio
– WN[SYN], WN[SUB], WN[SUP]: WordNet based expansion with synonyms, sub-concepts and super-concepts
Results
Log queries:
Results
Self-selected queries:
Introducing Adaptivity
by RavenCore17 @ Flickr
Query Clarity
Adaptive Expansion
Experiments
• Same experimental setup as for the previous analysis.
Results
Log queries:
Results
Self-selected queries:
Results
Conclusions
by ThisIsIt2 @ Flickr
Conclusions
• Five techniques for determining expansion terms from personal documents.
• Empirical analysis showed that these approaches perform very well.
• The expansion process was then made adaptive to query features, such as query clarity.
• The adaptive expansion process yielded significant improvements over the static one.
End
Any questions?