A Probabilistic Optimization Framework for the Empty-Answer Problem

Davide Mottin, Alice Marascu, Senjuti Basu Roy, Gautam Das, Themis Palpanas, Yannis Velegrakis


Post on 14-Dec-2015



Empty-Answer Problem

Example: the query (Alarm, DSL, Manual) on the database CARDB returns the empty set {}, i.e. no answer.
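A minimal sketch of the situation on this slide, assuming a toy car table with Boolean attributes. The rows and the `answer` helper are hypothetical and only illustrate how a conjunctive query can come back empty while a relaxed version of it does not.

```python
# Hypothetical toy car table; only the attribute names come from the slide.
cars = [
    {"id": 1, "Alarm": True,  "DSL": True,  "Manual": False},
    {"id": 2, "Alarm": True,  "DSL": False, "Manual": True},
    {"id": 3, "Alarm": False, "DSL": True,  "Manual": True},
]


def answer(query, table):
    """Return the rows that satisfy every condition of a conjunctive query."""
    return [row for row in table if all(row[attr] for attr in query)]


empty = answer(["Alarm", "DSL", "Manual"], cars)    # no car has all three features
relaxed = answer(["Alarm", "DSL"], cars)            # dropping "Manual" finds car 1
print(empty, [row["id"] for row in relaxed])
```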


Dealing with the Empty Answer Problem

Ranking results based on user preferences
• IR [Baeza11] and database solutions [Chaudhuri04]

Query relaxation
• Modify some of the query conditions [Mishra09]
• (-) Suggests all the modifications together
• (-) Does not take user feedback into account


Our Solution: Interactive Query Relaxation

• Suggests one relaxation at a time
• Takes user feedback into account
• Models user preferences
• Optimization-centric relaxation suggestions
  • User-centric (effort, relevance)
  • System-centric (profit)


Challenges

• Exponential number of relaxations
• Modeling user preferences
• System encoding of different objective functions
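The first challenge can be made concrete with a short sketch. It assumes, for illustration only, that a relaxation drops a non-empty subset of the query conditions; the `relaxations` helper is hypothetical. Under that assumption a query with n conditions has 2^n - 1 relaxations.

```python
from itertools import combinations


def relaxations(query):
    """Yield the query with every non-empty subset of its conditions dropped."""
    for k in range(1, len(query) + 1):
        for dropped in combinations(query, k):
            yield tuple(c for c in query if c not in dropped)


all_relaxed = list(relaxations(("Alarm", "DSL", "Manual")))
print(len(all_relaxed))  # 2**3 - 1 candidates for a 3-condition query
```

The count doubles with every additional condition, which is why enumerating all candidates quickly becomes impractical.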


Our Approach

A probabilistic optimization framework
• Based on the probability that the user says yes to relaxation Q’ of query Q


The Probabilistic Framework

Probability of accepting relaxation Q’ of Q depends on:
• Prior: belief of the user that an answer will be found in the database
• Pref: likelihood that the user will like the answers of the relaxed query


The Probabilistic Framework

Probability to reject a relaxation

Cost for a relaxation


Different objective functions

Maximize profit
• Pref: favors solutions with the highest values of individual tuples
• Static function

Maximize answer relevance
• Pref: favors solutions with the tuples most relevant to the original query
• Semi-dynamic function (computed only once, with the user query)

Minimize user effort
• Pref: favors solutions with the least number of user interactions
• Fully dynamic function (changes at every relaxation)


Minimum Effort Objective


Min-Effort Relaxation Tree

[Figure: min-effort relaxation tree for the query (Alarm, DSL, Manual), showing relaxation nodes and choice nodes.]
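One way to read such a tree is sketched below. This is not the authors' implementation: the class names, the unit cost per interaction, and the example probabilities are assumptions made for illustration. A relaxation node takes the minimum over the relaxations the system could propose, and a choice node adds one interaction plus the probability-weighted cost of the user's possible responses.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ChoiceNode:
    # Each outcome is (probability of this user response, resulting node).
    outcomes: List[Tuple[float, "RelaxationNode"]]


@dataclass
class RelaxationNode:
    # Candidate relaxations the system may propose; empty list means a leaf.
    choices: List[ChoiceNode] = field(default_factory=list)


def min_effort(node: RelaxationNode) -> float:
    """Expected number of interactions when the best proposal is always made."""
    if not node.choices:
        return 0.0
    return min(choice_cost(c) for c in node.choices)


def choice_cost(choice: ChoiceNode) -> float:
    # Assumed cost model: one interaction, then the expected remaining effort.
    return 1.0 + sum(p * min_effort(child) for p, child in choice.outcomes)


leaf = RelaxationNode()
one_more = RelaxationNode([ChoiceNode([(1.0, leaf)])])  # exactly one more question
option_a = ChoiceNode([(0.3, leaf), (0.7, one_more)])   # expected cost about 1.7
option_b = ChoiceNode([(0.7, leaf), (0.3, one_more)])   # expected cost about 1.3
root = RelaxationNode([option_a, option_b])
print(min_effort(root))  # the cheaper option_b determines the root cost
```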


Algorithmic Contributions

Exact algorithm (FastOpt)
• Upper and lower bound for each node
• Pruning can be enabled for this algorithm

Approximate algorithm (CDR)
• Node costs approximated by a probability distribution
• Relaxation nodes: min/max distribution of Cost
• Choice nodes: sum distribution of Cost
• Approximated by computing the convolution cost
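The two distribution operations named for CDR can be illustrated on small histograms. The sketch below assumes independent costs represented as probability mass functions over integer bins 0, 1, 2, ...; the function names and the independence assumption are illustrative and are not taken from the slides.

```python
def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q (sum distribution)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def min_dist(p, q):
    """Distribution of min(X, Y) for independent X ~ p and Y ~ q."""
    n = max(len(p), len(q))
    p = list(p) + [0.0] * (n - len(p))
    q = list(q) + [0.0] * (n - len(q))

    def survival(d):
        # survival(d)[k] = P(value >= k)
        remaining, out = 1.0, []
        for mass in d:
            out.append(remaining)
            remaining -= mass
        return out

    # P(min >= k) = P(X >= k) * P(Y >= k); differences give P(min == k).
    joint = [a * b for a, b in zip(survival(p), survival(q))] + [0.0]
    return [joint[k] - joint[k + 1] for k in range(n)]


coin = [0.5, 0.5]             # cost 0 or 1 with equal probability
print(convolve(coin, coin))   # [0.25, 0.5, 0.25]
print(min_dist(coin, coin))   # [0.75, 0.25]
```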


Fast Solution (FastOpt)

Idea: prune non-optimal relaxations in advance
• Upper and lower bounds of the cost function
• Prune branches using upper/lower bound reasoning
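The bound reasoning can be sketched as follows for a minimization objective such as minimum effort. The bound values and the `prune` helper are hypothetical; the slides do not say how FastOpt derives its bounds. A branch whose lower bound already exceeds the smallest upper bound among its siblings cannot be optimal and can be discarded.

```python
def prune(bounds):
    """Keep only branches whose lower bound does not exceed the best upper bound.

    `bounds` maps a branch name to a (lower, upper) pair on its cost.
    """
    best_upper = min(upper for _, upper in bounds.values())
    return {name: b for name, b in bounds.items() if b[0] <= best_upper}


# Hypothetical cost bounds for three candidate relaxations.
candidates = {
    "relax Alarm": (1.0, 2.0),
    "relax DSL": (2.5, 4.0),     # lower bound 2.5 > best upper bound 2.0
    "relax Manual": (1.5, 3.0),
}
kept = prune(candidates)
print(sorted(kept))
```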


FastOpt Algorithm (Min-Effort)

Prune!!!


Experimental Setup

Datasets:
• US Home dataset: 38k tuples, 18 attributes
• Car dataset: 100k tuples, 31 attributes
• Synthetic datasets: 20k to 500k tuples

Baseline algorithms:
• Previous works: top-k, query refinement, ranking
• Random relaxation
• Greedy: choose the first non-empty, otherwise random


User Study Set up

1. Interactive vs non-interactive
• Measure user satisfaction with our interactive approach vs relax-at-once approaches
• 100 Amazon Turk users, 5 queries each

2. Objective functions effectiveness
• Compare proposed relaxations with objective function goals (max profit, min effort, max user relevance)
• Three tasks
• 100 users, 5 queries


Experimental Results Highlights

Scalability results:
• FastOpt (exact): timely exact answers for small queries
• CDR (approximate): real-time answers for queries of size 10, results close to optimal

User study results:
• Interactive methods preferred over non-interactive
• Objective functions correctly achieve their optimization goals


User Effort Comparison

• CDR close to optimal
• Random and Greedy produce 1.5 more relaxations


Query Time

• Exponential behaviour
• Efficient for small queries
• 1.4 sec for query size 10


User Study

• Users prefer interactive systems to relaxations all at once
• Better quality answers


Conclusions

• Introduce a novel, principled, user-centric and interactive approach for the empty-answer problem
• Propose exact and approximate algorithms
• Demonstrate scalability of the proposed techniques with database and query size
• Show effectiveness of the different objective functions
• Verify quality of the answers and superior usability of our interactive approach


Goal comparison

• Objective functions achieve their goals
• Dynamic and Semi-Dynamic very similar in performance


Approximate Solution (CDR)

Idea: use the cost distribution instead of the actual cost.
1. b-size histogram in each node
2. Construct the first L levels of the tree
3. Expand the branch with the biggest probability of being the optimal


Choose the Branch to Expand

1. Compute the probability that the cost is smaller than that of the siblings
2. Choose the child with the highest probability

Example: for siblings n1 and n2 with Pr(n1<n2) = 0.6 and Pr(n2<n1) = 0.4, expand n1.
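The comparison in step 1 can be sketched for two sibling nodes whose costs are given as histograms. The histogram values below are made up so that the result matches the 0.6 / 0.4 example, and `prob_less` is a hypothetical helper that assumes the two costs are independent.

```python
def prob_less(p, q):
    """Pr(X < Y) for independent X ~ p and Y ~ q over integer cost bins 0, 1, 2, ..."""
    total = 0.0
    below = 0.0  # running P(X < j)
    for j, qj in enumerate(q):
        total += below * qj
        if j < len(p):
            below += p[j]
    return total


# Hypothetical cost histograms for two sibling nodes.
n1 = [0.6, 0.0, 0.4]   # cost 0 with probability 0.6, cost 2 with probability 0.4
n2 = [0.0, 1.0]        # cost 1 with certainty

p12 = prob_less(n1, n2)   # about 0.6
p21 = prob_less(n2, n1)   # about 0.4
expand = "n1" if p12 >= p21 else "n2"
print(p12, p21, expand)
```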


Bibliography

[Mishra09] C. Mishra and N. Koudas, “Interactive query refinement,” in EDBT, 2009.

[Roy08] S. Basu Roy, H. Wang, G. Das, U. Nambiar, and M. Mohania, “Minimum-effort driven dynamic faceted search in structured databases,” in CIKM, 2008.

[Chaudhuri04] S. Chaudhuri, G. Das, V. Hristidis, and G. Weikum, “Probabilistic ranking of database query results,” in VLDB, 2004.

[Baeza11] R. A. Baeza-Yates and B. A. Ribeiro-Neto, Modern Information Retrieval, 2011.