OAIR 2013
DESCRIPTION
These slides refer to the talk I gave at the last International Conference on Open Research Areas in Information Retrieval (OAIR 2013), where I presented a research paper entitled "Modeling and Predicting the Task-by-Task Behavior of Search Engine Users". Web search engines answer user needs on a query-by-query basis: they retrieve the set of the most relevant results for each issued query, independently. However, users often submit queries to perform multiple, related tasks. In this work, we first discuss a methodology for discovering, from query logs, the latent tasks performed by users. Furthermore, we introduce the Task Relation Graph (TRG) as a representation of users’ search behaviors from a task-by-task perspective. The task-by-task behavior is captured by weighting the edges of the TRG with a relatedness score computed between pairs of tasks, as mined from the query log. We validate our approach on a concrete application, namely a task recommender system, which suggests related tasks to users on the basis of the task predictions derived from the TRG. Finally, we show that the task recommendations generated by our solution are beyond the reach of existing query suggestion schemes, and that our method recommends tasks that users will likely perform in the near future.
TRANSCRIPT
Modeling and Predicting the Task-by-Task Behavior of Search Engine Users
Gabriele Tolomei, Università Ca’ Foscari Venezia, Italy
Claudio Lucchese, ISTI-CNR, Pisa, Italy
Salvatore Orlando, Università Ca’ Foscari Venezia, Italy
Fabrizio Silvestri, ISTI-CNR, Pisa, Italy
Raffaele Perego, ISTI-CNR, Pisa, Italy
May, 23 2013 - Lisbon, Portugal
10th International Conference in the RIAO series
Outline
• Motivation
• Research Challenges
• Experiments and Results
• Conclusion and Future Work
A New Way of Search
[Figure: Alice and Bob performing the same task, “Reserving a hotel room in New York”]
… and Search Engines?
• Roughly, they are still Web document retrieval tools
– answering on a per-query basis
– “ten blue links” to relevant Web pages
Information Need Hierarchy
• Web Task: any (atomic) activity that a user performs through Web search
– “find a recipe”, “book a flight”, “read news”, etc.
– distinct users may use different queries to accomplish the same Web task
• Web Mission: composition of Web tasks to achieve complex goals
– distinct users may use different Web tasks to accomplish the same Web mission
[Jones and Klinkner, CIKM ‘08]
Goals
• Mine Search Engine logs to detect Web tasks
• Provide a user model for task-oriented search
– from query-by-query to task-by-task
• Show how such a model can be used to design a real-world application
– from query to task recommendation
The Big Picture
• Bottom-up, 2-stage clustering solution:
– User Task Discovery from “raw” queries issued by the same user and stored in query logs
– Collective Task Discovery from distinct User Tasks
• Graph-based representation of Collective Tasks and their relatedness (TRG)
User Task Discovery
• User Task
– set of possibly non-contiguous queries (multi-tasking), issued by a single user, whose aim is to carry out a specific Web task
• QC-HTC
– Graph-based query clustering solution proposed in our previous work [Lucchese et al., WSDM’11]
– outperforms other techniques for session boundary detection in query logs (e.g., QFG [Boldi et al., CIKM’08])
User Task Discovery: QC-HTC
• Splits each long-term user session into shorter time-based sessions
• Builds a weighted undirected graph for each time-based session
– nodes in each graph are the queries of a time-based session
• Links consecutive pairs of queries, weighting each link with their content-based similarity:
– lexical (query character n-grams)
– semantic (query “wikification”)
• Merges any two sequential clusters if their first (head) and last (tail) queries are similar enough
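The steps above can be sketched roughly as follows. This is a simplified illustration, not the QC-HTC implementation from the cited work: it uses only a lexical signal (Jaccard similarity over character trigrams) and leaves out the semantic "wikification" part. The function names, the time gap and the similarity threshold are all assumptions chosen for the example, and comparing both end queries of each cluster is one possible reading of the head/tail merge rule.

```python
from typing import List, Tuple

Query = Tuple[float, str]  # (timestamp in seconds, query text)


def char_ngrams(text: str, n: int = 3) -> set:
    """Set of character n-grams of a lower-cased query."""
    s = text.lower()
    if len(s) < n:
        return {s}
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def lexical_sim(a: str, b: str) -> float:
    """Jaccard similarity between character n-gram sets (lexical signal only)."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    return len(ga & gb) / len(ga | gb)


def split_by_time(queries: List[Query], max_gap: float) -> List[List[str]]:
    """Start a new time-based session whenever two consecutive queries
    are more than max_gap seconds apart (max_gap is a hypothetical setting)."""
    sessions: List[List[str]] = []
    prev_ts = None
    for ts, text in sorted(queries):
        if prev_ts is None or ts - prev_ts > max_gap:
            sessions.append([])
        sessions[-1].append(text)
        prev_ts = ts
    return sessions


def head_tail_clusters(session: List[str], threshold: float) -> List[List[str]]:
    # Step 1: chain consecutive queries that are similar enough.
    clusters: List[List[str]] = []
    for q in session:
        if clusters and lexical_sim(clusters[-1][-1], q) >= threshold:
            clusters[-1].append(q)
        else:
            clusters.append([q])
    # Step 2: merge two clusters when their head/tail queries are similar enough.
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                ends_a, ends_b = (a[0], a[-1]), (b[0], b[-1])
                if max(lexical_sim(x, y) for x in ends_a for y in ends_b) >= threshold:
                    clusters[i] = a + b
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


log = [
    (0, "new york hotel"),
    (60, "new york hotels cheap"),
    (120, "pizza recipe"),
    (180, "hotel new york booking"),
    (10000, "pizza dough recipe"),
]
for session in split_by_time(log, max_gap=1800):
    print(head_tail_clusters(session, threshold=0.3))
```

In this toy log the three hotel queries end up in one cluster even though a pizza query interrupts them, which is the multi-tasking case the head/tail merge is meant to handle.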
Task-oriented User Sessions
Collective Task Discovery
• Collective Task
– group of distinct user tasks (i.e., distinct sets of queries performed by several users) that represent the same Web task
• Identify similar user tasks by clustering their “bag of words” representations
– Each user query is a sentence
– Each user task is a concatenation of possibly many sentences (i.e., a text document)
• T = {T1, …, TK} is the final set of Collective Tasks
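A minimal sketch of the "bag of words" idea, assuming whitespace tokenization and raw term counts (neither is specified on the slide). The single-pass "leader" clustering below is only a stand-in to make the example runnable; it is not the clustering algorithm used in the experiments, and the names `task_document`, `cosine` and `leader_clustering` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def task_document(queries: List[str]) -> Counter:
    """Concatenate the queries of a user task and count its words."""
    return Counter(word for q in queries for word in q.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def leader_clustering(docs: List[Counter], threshold: float) -> List[int]:
    """Assign each user task to the most similar cluster leader,
    or open a new cluster if no leader is similar enough (stand-in method)."""
    leaders: List[Counter] = []
    labels: List[int] = []
    for doc in docs:
        sims = [cosine(doc, leader) for leader in leaders]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            labels.append(best)
        else:
            leaders.append(doc)
            labels.append(len(leaders) - 1)
    return labels


user_tasks = [
    ["new york hotel", "cheap hotel new york"],  # user A
    ["hotel booking new york"],                  # user B
    ["pizza recipe", "pizza dough"],             # user C
]
docs = [task_document(t) for t in user_tasks]
print(leader_clustering(docs, threshold=0.5))
```

Here the two hotel-related user tasks, issued by different users with different queries, fall into the same collective task, while the pizza task gets its own.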
Mapping User to Collective Tasks
Task Relation Graph (TRG)
• Task-oriented model of user search behavior
• TRG(T, E, w, η) is a weighted directed graph
– nodes are the set of collective tasks T = {T1, …, TK}
– edges E represent task relatedness
– w: T × T → [0, 1] is the edge-weighting function
– η is a weight threshold
• Ti and Tj are linked together iff w(Ti, Tj) > η
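The definition translates almost directly into code. The sketch below assumes the graph is stored as a dictionary from ordered task pairs to weights and that self-loops are skipped; both choices, and the name `build_trg`, are illustrative assumptions rather than part of the definition.

```python
from typing import Callable, Dict, Iterable, Tuple


def build_trg(
    tasks: Iterable[str],
    w: Callable[[str, str], float],
    eta: float,
) -> Dict[Tuple[str, str], float]:
    """Keep the directed edge (Ti, Tj) only if w(Ti, Tj) > eta."""
    tasks = list(tasks)
    edges: Dict[Tuple[str, str], float] = {}
    for ti in tasks:
        for tj in tasks:
            if ti == tj:
                continue  # assumption: no self-loops
            weight = w(ti, tj)
            if weight > eta:
                edges[(ti, tj)] = weight
    return edges


# Hypothetical relatedness scores in [0, 1]; missing pairs count as 0.
scores = {("T1", "T2"): 0.4, ("T2", "T1"): 0.1, ("T1", "T3"): 0.05}
trg = build_trg(["T1", "T2", "T3"], lambda a, b: scores.get((a, b), 0.0), eta=0.1)
print(trg)
```

Because the comparison is strict, a pair whose weight equals η is not linked, and because the graph is directed, (T1, T2) and (T2, T1) are decided independently.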
User Task Discovery
Data Set: AOL 2006 Query Log
Results
Results were evaluated on a manually built ground truth of user tasks [Lucchese et al., TOIS 2013]
Collective Task Discovery
Data Set: AOL 2006 Query Log
Training Set vs. Test Set
Clustering User Tasks
• Algorithm: Repeated Bisections vs. Agglomerative
• Similarity Measure: Cosine similarity vs. Pearson’s correlation
• Objective Function: maximize intra-cluster similarity
• Stop Criterion: choose heuristically the final number K of clusters through the “elbow method”
• We select K = 1,024
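The "elbow method" has several variants, and the slide does not say which one was applied. As one possible illustration, the sketch below picks the K whose (K, objective) point lies farthest from the straight line joining the first and last points of the curve. The function name and the objective values are made up for the example.

```python
from typing import Sequence


def elbow(ks: Sequence[float], scores: Sequence[float]) -> float:
    """Return the K whose point is farthest from the chord between the
    first and last (K, score) points of the curve."""
    x0, y0, x1, y1 = ks[0], scores[0], ks[-1], scores[-1]

    def distance(x: float, y: float) -> float:
        # Unnormalised point-to-line distance; the constant denominator
        # is the same for every point, so it is omitted.
        return abs((y1 - y0) * (x - x0) - (x1 - x0) * (y - y0))

    return max(zip(ks, scores), key=lambda p: distance(*p))[0]


# Hypothetical intra-cluster similarity for increasing numbers of clusters.
candidate_ks = [1, 2, 3, 4, 5, 6]
objective = [0.10, 0.40, 0.60, 0.65, 0.68, 0.70]
print(elbow(candidate_ks, objective))
```

On these made-up values the gain flattens after K = 3, which is the point the heuristic selects.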
Results and Example
Results were evaluated on a manually built ground truth of collective tasks [Lucchese et al., TOIS 2013]
Task Relation Graph
Building TRG: Task Relatedness
• Use the training set to compute w(Ti, Tj)
• Frequent Sequential Patterns
– η = support (i.e., probability) of Ti and Tj co-occurring in a specified sequence: P(<Ti, Tj>)
– task order matters!
• Association Rules Ti → Tj
– η = support: P({Ti, Tj})
– η = confidence: P(Tj|Ti)
– task order doesn’t matter!
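To make the three measures concrete, here is a small sketch that estimates them from training sessions, each given as the ordered list of collective tasks a user performed. It reads them as candidate edge weights w(Ti, Tj), and it assumes that "in a specified sequence" means Ti occurs somewhere before Tj in the same session, not necessarily adjacent. Both readings, and the function names, are assumptions.

```python
from typing import List, Sequence


def seq_support(sessions: Sequence[List[str]], ti: str, tj: str) -> float:
    """P(<Ti, Tj>): fraction of sessions in which Ti occurs before Tj."""
    hits = sum(1 for s in sessions if ti in s and tj in s[s.index(ti) + 1:])
    return hits / len(sessions)


def ar_support(sessions: Sequence[List[str]], ti: str, tj: str) -> float:
    """P({Ti, Tj}): fraction of sessions containing both tasks, in any order."""
    return sum(1 for s in sessions if ti in s and tj in s) / len(sessions)


def ar_confidence(sessions: Sequence[List[str]], ti: str, tj: str) -> float:
    """P(Tj | Ti): among sessions containing Ti, the fraction also containing Tj."""
    with_ti = sum(1 for s in sessions if ti in s)
    both = sum(1 for s in sessions if ti in s and tj in s)
    return both / with_ti if with_ti else 0.0


# Hypothetical training sessions (task labels are made up).
training = [
    ["hotel", "flight"],
    ["flight", "hotel", "car"],
    ["hotel", "car"],
    ["news"],
]
print(seq_support(training, "hotel", "flight"))
print(ar_support(training, "hotel", "flight"))
print(ar_confidence(training, "hotel", "flight"))
```

On this toy data "hotel" and "flight" co-occur in two of the four sessions, but "hotel" precedes "flight" in only one of them, so the sequential support is lower than the association-rule support.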
Task Recommendation
• One out of many possible applications of TRG
• A user is performing (or has just performed) a task Ti
– more precisely, a user task that is similar to a known Ti
• Retrieve from TRG the set Rm(Ti) including the top-m related nodes/tasks to Ti
– tasks in Rm(Ti) are those having the m highest edge weights among all the adjacent nodes to Ti
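A minimal sketch of the Rm(Ti) lookup, assuming the TRG is stored as a dictionary from directed task pairs to edge weights and that "adjacent nodes" means the targets of Ti's outgoing edges. The storage format, the alphabetical tie-break and the name `recommend` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def recommend(trg: Dict[Tuple[str, str], float], ti: str, m: int) -> List[str]:
    """Return the m neighbours of ti with the highest edge weights."""
    neighbours = [(weight, tj) for (src, tj), weight in trg.items() if src == ti]
    # Highest weight first; ties broken alphabetically for determinism.
    neighbours.sort(key=lambda p: (-p[0], p[1]))
    return [tj for _, tj in neighbours[:m]]


# Hypothetical TRG edges (task labels and weights are made up).
trg = {
    ("hotel", "flight"): 0.4,
    ("hotel", "car"): 0.3,
    ("hotel", "museum"): 0.2,
    ("flight", "hotel"): 0.5,
}
print(recommend(trg, "hotel", m=2))
```

A task with no outgoing edges in the TRG simply yields an empty recommendation list.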
Task Recommendation: Experiments
• Use TRGs built from training set to generate task recommendations for the test set
• Original user sessions in the test set are split into a 1/3 prefix and a 2/3 suffix of user tasks
• Each user task is mapped to a candidate collective task Tc (cosine similarity)
• From all the Tc in the prefix, retrieve the union set of recommendations ⋃ Rm(Tc) from TRG
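The protocol can be sketched as below, assuming each test session has already been mapped to a sequence of collective tasks Tc. How the 1/3 cut is rounded is not stated on the slide, so the sketch uses integer division with at least one task in the prefix. The helper names and the dictionary-based TRG are likewise assumptions.

```python
from typing import Dict, List, Set, Tuple


def split_session(tasks: List[str]) -> Tuple[List[str], List[str]]:
    """Split a session into a 1/3 prefix and a 2/3 suffix (rounding is an assumption)."""
    cut = max(1, len(tasks) // 3)
    return tasks[:cut], tasks[cut:]


def top_m(trg: Dict[Tuple[str, str], float], tc: str, m: int) -> List[str]:
    """The m out-neighbours of tc with the highest edge weights."""
    ranked = sorted(
        ((weight, tj) for (src, tj), weight in trg.items() if src == tc),
        key=lambda p: (-p[0], p[1]),
    )
    return [tj for _, tj in ranked[:m]]


def union_recommendations(trg: Dict[Tuple[str, str], float], prefix: List[str], m: int) -> Set[str]:
    """Union of Rm(Tc) over all collective tasks Tc in the prefix."""
    recs: Set[str] = set()
    for tc in prefix:
        recs.update(top_m(trg, tc, m))
    return recs


# Hypothetical test session and TRG.
session = ["hotel", "flight", "car", "museum", "restaurant", "news"]
trg = {
    ("hotel", "car"): 0.4,
    ("hotel", "museum"): 0.3,
    ("flight", "car"): 0.5,
    ("flight", "taxi"): 0.2,
}
prefix, suffix = split_session(session)
print(prefix, suffix)
print(union_recommendations(trg, prefix, m=2))
```

The recommendations drawn from the prefix can then be compared against the tasks that actually appear in the suffix.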
Task Recommendation: Evaluation
Coverage is affected by the edge weighting function and by the threshold η
Task Recommendation: Results (top-1)
Task Recommendation: Results (top-3)
Task Recommendation: Examples
Task Recommendation: Examples
Task vs. Query Recommendation
• Goal: show that task recommendation is different from well-known query recommendation
• TRG vs. QFG
– 83.8% of top-3 query suggestions generated by QFG live in the same (collective) task
– Only 15.1% of top-3 query suggestions generated by QFG lead to 2 separate (collective) tasks
• QFG is great if the user wants to stay in the same task
• TRG allows the user to switch and jump to other tasks
The “Take-Away” Message
• Web Search Engines should move from handling user requests “query-by-query” to handling them “task-by-task”
• New models for user search behavior are needed: from Query Flow Graph to Task Relation Graph
• Task Relation Graph may be exploited for several applications (e.g., Task Recommendation)
Future Work
• Advanced Task Representation
– E.g., linked data, as opposed to simple bag-of-queries
• Automatic Task Labeling (taxonomy of Web tasks):
– Linking queries of collective tasks with referent entities in a knowledge base
– Exploit entity categories to label the whole task
• Use TRG for other applications
– Task-based advertising, Mission discovery, etc.
• New SERP to render task-oriented results
Thank You!
Questions?