Page 1: OAIR 2013

Modeling and Predicting the Task-by-Task Behavior of Search Engine Users

Gabriele Tolomei, Università Ca’ Foscari Venezia, Italy

Claudio Lucchese, ISTI-CNR, Pisa, Italy

Salvatore Orlando, Università Ca’ Foscari Venezia, Italy

Fabrizio Silvestri, ISTI-CNR, Pisa, Italy

Raffaele Perego, ISTI-CNR, Pisa, Italy

May 23, 2013 - Lisbon, Portugal

10th International Conference in the RIAO series

Page 2: OAIR 2013

Outline

• Motivation
• Research Challenges
• Experiments and Results
• Conclusion and Future Work

Page 4: OAIR 2013

A New Way of Search

[Figure: two users, Alice and Bob, carry out the same task: “Reserving a hotel room in New York”]

Page 5: OAIR 2013

… and Search Engines?

• Roughly, they are still Web document retrieval tools
  – answering on a per-query basis
  – returning “ten blue links” to relevant Web pages

Page 6: OAIR 2013

Information Need Hierarchy

• Web Task: any (atomic) activity that a user performs through Web search
  – “find a recipe”, “book a flight”, “read news”, etc.
  – distinct users may use different queries to accomplish the same Web task

• Web Mission: composition of Web tasks to achieve complex goals
  – distinct users may use different Web tasks to accomplish the same Web mission

[Jones and Klinkner, CIKM ’08]

Page 7: OAIR 2013

Goals

• Mine search engine logs to detect Web tasks

• Provide a user model for task-oriented search
  – from query-by-query to task-by-task

• Show how such a model can be used to design a real-world application
  – from query to task recommendation

Page 8: OAIR 2013

Outline

• Motivation
• Research Challenges
• Experiments and Results
• Conclusion and Future Work

Page 9: OAIR 2013

The Big Picture

• Bottom-up, 2-stage clustering solution:
  – User Task Discovery from “raw” queries issued by the same user and stored in query logs
  – Collective Task Discovery from distinct User Tasks

• Graph-based representation of Collective Tasks and their relatedness: the Task Relation Graph (TRG)

Page 10: OAIR 2013

User Task Discovery

• User Task
  – set of possibly non-contiguous queries (multi-tasking), issued by a single user, whose aim is to carry out a specific Web task

• QC-HTC
  – graph-based query clustering solution proposed in our previous work [Lucchese et al., WSDM’11]
  – outperforms other techniques for session boundary detection in query logs (e.g., QFG [Boldi et al., CIKM’08])

Page 11: OAIR 2013

User Task Discovery: QC-HTC

• Splits each long-term user session into shorter time-based sessions

• Builds a weighted undirected graph for each time-based session
  – nodes in each graph are the queries of a time-based session

• Links consecutive pairs of queries, weighting each link with their content-based similarity:
  – lexical (query character n-grams)
  – semantic (query “wikification”)

• Merges any two sequential clusters if their first (head) and last (tail) queries are similar enough
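
The QC-HTC steps listed on this slide can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: it uses only a lexical similarity (Jaccard overlap of character 3-grams) and leaves out the semantic “wikification” component. The 30-minute gap, the 0.3 threshold, and all function names are assumptions made for the example.

```python
from typing import List, Set, Tuple


def char_ngrams(query: str, n: int = 3) -> Set[str]:
    """Set of character n-grams of a lower-cased query."""
    text = query.lower()
    if len(text) < n:
        return {text}
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def lexical_similarity(q1: str, q2: str) -> float:
    """Jaccard overlap of character 3-grams (assumed lexical measure)."""
    a, b = char_ngrams(q1), char_ngrams(q2)
    return len(a & b) / len(a | b)


def split_by_time(records: List[Tuple[float, str]],
                  max_gap: float = 1800.0) -> List[List[Tuple[float, str]]]:
    """Split time-ordered (timestamp_seconds, query) pairs at long gaps."""
    sessions: List[List[Tuple[float, str]]] = []
    for ts, query in records:
        if sessions and ts - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append((ts, query))
        else:
            sessions.append([(ts, query)])
    return sessions


def head_tail_similarity(c1: List[str], c2: List[str]) -> float:
    """Best similarity between the head/tail queries of two clusters."""
    return max(lexical_similarity(a, b)
               for a in (c1[0], c1[-1]) for b in (c2[0], c2[-1]))


def cluster_session(queries: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Chain similar consecutive queries, then merge clusters by head/tail."""
    if not queries:
        return []
    # Step 1: consecutive queries stay together while they are similar enough.
    clusters = [[queries[0]]]
    for prev, cur in zip(queries, queries[1:]):
        if lexical_similarity(prev, cur) >= threshold:
            clusters[-1].append(cur)
        else:
            clusters.append([cur])
    # Step 2: repeatedly merge two clusters whose head/tail queries match.
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if head_tail_similarity(clusters[i], clusters[j]) >= threshold:
                    clusters[i] = clusters[i] + clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


session = ["hotel new york", "cheap hotel new york",
           "pizza recipe", "new york hotel deals"]
print(cluster_session(session))
```

In this toy session the three hotel queries end up in one cluster even though a recipe query interrupts them, which is the multi-tasking case that time-based splitting alone would miss.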

Page 12: OAIR 2013

Task-oriented User Sessions

Page 13: OAIR 2013

Collective Task Discovery

• Collective Task
  – group of distinct user tasks (i.e., distinct sets of queries performed by several users) that represent the same Web task

• Identify similar user tasks by clustering their “bag of words” representations
  – each user query is a sentence
  – each user task is a concatenation of possibly many sentences (i.e., a text document)

• T = {T1, …, TK} is the final set of Collective Tasks
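
As a rough illustration of the “bag of words” view of a user task, the sketch below concatenates the queries of a task into one document and compares two tasks with cosine similarity, one of the measures named later in the deck. Whitespace tokenization, raw term counts (no stop-word removal or term weighting), and the function names are assumptions of this example.

```python
import math
from collections import Counter
from typing import List


def task_vector(queries: List[str]) -> Counter:
    """Treat a user task as one document: join its queries, count the words."""
    return Counter(" ".join(queries).lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user tasks from three different users.
alice = task_vector(["hotel new york", "cheap hotel manhattan"])
bob = task_vector(["new york hotel booking"])
carol = task_vector(["lasagna recipe"])

print(cosine_similarity(alice, bob))    # high: likely the same Web task
print(cosine_similarity(alice, carol))  # no shared words
```

A clustering algorithm would then group user tasks whose vectors are close; the choice of algorithm is discussed in the experiments part of the deck.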

Page 14: OAIR 2013

Mapping User to Collective Tasks

Page 15: OAIR 2013

Task Relation Graph (TRG)

• Task-oriented model of user search behavior

• TRG(T, E, w, η) is a weighted directed graph
  – nodes are the set of collective tasks T = {T1, …, TK}
  – edges E represent task relatedness
  – w: T × T → [0,1] is the edge-weighting function
  – η is a weight threshold

• Ti and Tj are linked together iff w(Ti, Tj) > η
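
The definition on this slide translates fairly directly into an adjacency-map construction. The sketch below assumes the weighting function is supplied as a Python callable and that self-loops are not wanted (the slide does not say either way); `build_trg`, `toy_weight`, and the toy weights are hypothetical.

```python
from typing import Callable, Dict, List


def build_trg(tasks: List[str],
              weight: Callable[[str, str], float],
              eta: float) -> Dict[str, Dict[str, float]]:
    """Directed graph as {Ti: {Tj: w(Ti, Tj)}}, keeping edges with w > eta."""
    graph: Dict[str, Dict[str, float]] = {task: {} for task in tasks}
    for ti in tasks:
        for tj in tasks:
            if ti == tj:
                continue  # assumption: a task is not linked to itself
            w = weight(ti, tj)
            if w > eta:
                graph[ti][tj] = w
    return graph


# Toy, hand-picked weights standing in for a weighting function
# estimated from a query log.
toy_weights = {
    ("book flight", "book hotel"): 0.6,
    ("book hotel", "book flight"): 0.2,
    ("book flight", "rent car"): 0.05,
}


def toy_weight(ti: str, tj: str) -> float:
    return toy_weights.get((ti, tj), 0.0)


trg = build_trg(["book flight", "book hotel", "rent car"], toy_weight, eta=0.1)
print(trg)
```

Because the graph is directed, the edge from “book flight” to “book hotel” and its reverse carry different weights, and the 0.05 edge is dropped by the threshold.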

Page 16: OAIR 2013

Outline

• Motivation
• Research Challenges
• Experiments and Results
• Conclusion and Future Work

Page 17: OAIR 2013

User Task Discovery

Page 18: OAIR 2013

Data Set: AOL 2006 Query Log

Page 19: OAIR 2013

Results

Results were evaluated on a manually built ground truth of user tasks [Lucchese et al., TOIS 2013]

Page 20: OAIR 2013

Collective Task Discovery

Page 21: OAIR 2013

Data Set: AOL 2006 Query Log

Page 22: OAIR 2013

Training Set vs. Test Set

Page 23: OAIR 2013

Clustering User Tasks

• Algorithm: Repeated Bisections vs. Agglomerative

• Similarity Measure: Cosine similarity vs. Pearson’s correlation

• Objective Function: maximize intra-cluster similarity

• Stop Criterion: choose heuristically the final number K of clusters through the “elbow method”

• We select K = 1,024

Page 24: OAIR 2013

Results and Example

Results were evaluated on a manually built ground truth of collective tasks [Lucchese et al., TOIS 2013]

Page 25: OAIR 2013

Task Relation Graph

Page 26: OAIR 2013

Building TRG: Task Relatedness

• Use the training set to compute w(Ti, Tj)

• Frequent Sequential Patterns
  – w = support (i.e., probability) of Ti and Tj co-occurring in a specified sequence: P(<Ti, Tj>)
  – task order matters!

• Association Rules Ti → Tj
  – w = support: P({Ti, Tj})
  – w = confidence: P(Tj|Ti)
  – task order doesn’t matter!
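
The three quantities on this slide (sequence support, rule support, rule confidence) can be estimated as relative frequencies over training sessions. The sketch below assumes each session is already an ordered list of collective-task identifiers; the function names and the toy sessions are hypothetical, and a real pattern-mining tool would be far more efficient.

```python
from typing import List

Session = List[str]  # ordered collective-task IDs of one user session


def sequence_support(sessions: List[Session], ti: str, tj: str) -> float:
    """P(<Ti, Tj>): share of sessions where Ti occurs and Tj occurs later."""
    if not sessions:
        return 0.0
    hits = sum(1 for s in sessions
               if ti in s and tj in s[s.index(ti) + 1:])
    return hits / len(sessions)


def rule_support(sessions: List[Session], ti: str, tj: str) -> float:
    """P({Ti, Tj}): share of sessions containing both tasks, in any order."""
    if not sessions:
        return 0.0
    hits = sum(1 for s in sessions if ti in s and tj in s)
    return hits / len(sessions)


def rule_confidence(sessions: List[Session], ti: str, tj: str) -> float:
    """P(Tj | Ti): among sessions containing Ti, the share also containing Tj."""
    with_ti = [s for s in sessions if ti in s]
    if not with_ti:
        return 0.0
    return sum(1 for s in with_ti if tj in s) / len(with_ti)


training = [
    ["flight", "hotel", "car"],
    ["hotel", "flight"],
    ["flight", "hotel"],
    ["news"],
]
print(sequence_support(training, "flight", "hotel"))  # order-sensitive
print(sequence_support(training, "hotel", "flight"))
print(rule_support(training, "flight", "hotel"))      # order-insensitive
print(rule_confidence(training, "flight", "hotel"))
```

Swapping the two tasks changes the sequence support but not the rule support, which is the distinction the slide draws between the two families of measures.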

Page 27: OAIR 2013

Task Recommendation

• One out of many possible applications of TRG

• A user is performing (or has just performed) a task Ti
  – more precisely, a user task that is similar to a known collective task Ti

• Retrieve from TRG the set Rm(Ti) of the top-m tasks related to Ti
  – tasks in Rm(Ti) are those with the m highest edge weights among all nodes adjacent to Ti
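
Retrieving Rm(Ti) amounts to a top-m selection over the outgoing edges of Ti. A minimal sketch, assuming the TRG is stored as a nested dictionary `{Ti: {Tj: weight}}` (a representation chosen for this example, with a hypothetical `recommend` function and toy weights):

```python
import heapq
from typing import Dict, List


def recommend(graph: Dict[str, Dict[str, float]], task: str, m: int) -> List[str]:
    """Return the m neighbors of `task` with the highest edge weights."""
    neighbors = graph.get(task, {})
    top = heapq.nlargest(m, neighbors.items(), key=lambda item: item[1])
    return [neighbor for neighbor, _ in top]


toy_trg = {
    "book flight": {"book hotel": 0.6, "rent car": 0.4, "read news": 0.1},
    "book hotel": {"rent car": 0.3},
}
print(recommend(toy_trg, "book flight", m=2))
```

A task with fewer than m outgoing edges simply yields a shorter list, and an unknown task yields an empty one.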

Page 28: OAIR 2013

Task Recommendation: Experiments

• Use TRGs built from the training set to generate task recommendations for the test set

• Original user sessions in the test set are split into a 1/3 prefix and a 2/3 suffix of user tasks

• Each user task is mapped to a candidate collective task Tc (cosine similarity)

• From all the Tc in the prefix, retrieve the union set of recommendations ∪ Rm(Tc) from the TRG
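
The experimental protocol on this slide can be sketched as below, assuming each test session has already been mapped to a list of collective-task identifiers. How the 1/3 cut is rounded is not stated on the slide, so integer division with a minimum prefix of one task is an assumption, as are the function names and the toy graph.

```python
from typing import Dict, List, Set, Tuple


def split_session(tasks: List[str]) -> Tuple[List[str], List[str]]:
    """Split a session into a ~1/3 prefix and the remaining suffix."""
    cut = max(1, len(tasks) // 3)  # assumed rounding rule
    return tasks[:cut], tasks[cut:]


def union_recommendations(graph: Dict[str, Dict[str, float]],
                          prefix: List[str], m: int) -> Set[str]:
    """Union of the top-m neighbors of every collective task in the prefix."""
    recommended: Set[str] = set()
    for task in prefix:
        neighbors = graph.get(task, {})
        top = sorted(neighbors, key=neighbors.get, reverse=True)[:m]
        recommended.update(top)
    return recommended


toy_trg = {
    "T1": {"T3": 0.5, "T9": 0.2},
    "T2": {"T4": 0.7, "T3": 0.3},
}
prefix, suffix = split_session(["T1", "T2", "T3", "T4", "T5", "T6"])
recommended = union_recommendations(toy_trg, prefix, m=1)
print(prefix, suffix, recommended)
```

The recommended set can then be compared with the tasks that actually occur in the suffix; the specific evaluation measures are not reproduced in this transcript.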

Page 29: OAIR 2013

Task Recommendation: Evaluation

Coverage is affected by the edge weighting function and by the threshold η

Page 30: OAIR 2013

Task Recommendation: Results (top-1)

Page 31: OAIR 2013

Task Recommendation: Results (top-3)

Page 32: OAIR 2013

Task Recommendation: Examples

Page 33: OAIR 2013

Task Recommendation: Examples

Page 34: OAIR 2013

Task vs. Query Recommendation

• Aim: to show that task recommendation is different from well-known query recommendation

• TRG vs. QFG
  – 83.8% of top-3 query suggestions generated by QFG live in the same (collective) task
  – Only 15.1% of top-3 query suggestions generated by QFG lead to 2 separate (collective) tasks

• QFG is great if the user wants to stay in the same task

• TRG allows the user to switch and jump to other tasks

Page 35: OAIR 2013

Outline

• Motivation
• Research Challenges
• Experiments and Results
• Conclusion and Future Work

Page 36: OAIR 2013

The “Take-Away” Message

• Web search engines should move from handling user requests “query-by-query” to handling them “task-by-task”

• New models for user search behavior are needed: from Query Flow Graph to Task Relation Graph

• Task Relation Graph may be exploited for several applications (e.g., Task Recommendation)

Page 37: OAIR 2013

Future Work

• Advanced Task Representation
  – e.g., linked data, as opposed to simple bag-of-queries

• Automatic Task Labeling (taxonomy of Web tasks):
  – linking queries of collective tasks with referent entities in a knowledge base
  – exploiting entity categories to label the whole task

• Use TRG for other applications
  – task-based advertising, mission discovery, etc.

• New SERP to render task-oriented results

Page 38: OAIR 2013

Thank You! Questions?

