top-k query processing and optimization


Upload: miracle

Post on 30-Jan-2016


Page 1: Top-k Query Processing and Optimization

Top-k Query Processing and Optimization

198:541

(slides courtesy of Ihab F. Ilyas and Walid G. Aref)

Page 2: Top-k Query Processing and Optimization

Outline

Motivation with Examples
Query Model
  Top-k Selection
  Top-k Join
Ranking in Relational Query Engines
Summary
Open Research Topics

Page 4: Top-k Query Processing and Optimization

Motivation

Information Retrieval / Database systems integration
  IR: uncertainty and ranking for effective retrieval
  Database Systems: advanced data management
Supporting new data types is certainly not enough
True integration includes (among other things):
  Indexing
  Query processing and optimization
  Query Language

Page 5: Top-k Query Processing and Optimization

Motivation

Applications
  Multimedia search by contents (multi-features/examples)
  Middleware
  Search engines
  Data mining
New requirements
  Multi-criteria ranking
  Rank aggregation from external sources
  Joining ranked infinite streams
Most applications are interested in the top-k results

Page 6: Top-k Query Processing and Optimization

Example 1: Ranking in Multimedia Retrieval

[Figure: a query is matched against a video database on three features: Color Histogram, Edge Histogram, and Texture.]

Page 7: Top-k Query Processing and Optimization

Example 2

  SELECT h.id, s.name
  FROM houses h, schools s
  WHERE h.location = s.location
  ORDER BY h.price + 10 * s.tuition
  STOP AFTER 10

RANK( ) OVER in SQL 99

Page 8: Top-k Query Processing and Optimization

Example 2 (Cont'd)

Houses
  ID  Location      Price
  1   Lafayette     90,000
  2   W.Lafayette   110,000
  3   Indianapolis  111,000
  4   Kokomo        118,000
  5   Lafayette     125,000
  6   Kokomo        154,000
  …   …             …

Schools
  ID  Location      Tuition
  1   Indianapolis  3000
  2   W.Lafayette   3500
  3   Lafayette     6000
  4   Lafayette     6200
  5   Indianapolis  7000
  6   Indianapolis  7900
  7   Kokomo        8200
  8   Kokomo        8200

Join results (house ID, school ID, h.price + 10 * s.tuition)
  1   3   150000
  1   4   152000
  2   2   145000
  3   1   141000
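The query of Example 2 can be checked against the two sample tables with a few lines of Python. This is only a minimal sketch of the naive plan (materialize the whole join, then keep the k best), not how a rank-aware engine would evaluate it. The names `houses`, `schools`, and `top_k_join` are illustrative, and the data is limited to the rows shown on the slide.

```python
import heapq

# Sample rows from the Houses and Schools tables: (id, location, value).
houses = [(1, "Lafayette", 90_000), (2, "W.Lafayette", 110_000),
          (3, "Indianapolis", 111_000), (4, "Kokomo", 118_000),
          (5, "Lafayette", 125_000), (6, "Kokomo", 154_000)]
schools = [(1, "Indianapolis", 3000), (2, "W.Lafayette", 3500),
           (3, "Lafayette", 6000), (4, "Lafayette", 6200),
           (5, "Indianapolis", 7000), (6, "Indianapolis", 7900),
           (7, "Kokomo", 8200), (8, "Kokomo", 8200)]

def top_k_join(k):
    """Naive plan: join on location, score every pair, keep the k smallest scores."""
    joined = [(h_id, s_id, price + 10 * tuition)
              for h_id, h_loc, price in houses
              for s_id, s_loc, tuition in schools
              if h_loc == s_loc]
    # ORDER BY is ascending by default, so the cheapest combinations come first.
    return heapq.nsmallest(k, joined, key=lambda row: row[2])

for row in top_k_join(4):
    print(row)
```

With k = 4 the sketch returns the four join results listed on the slide, in ascending order of score. The cost of this plan grows with the size of the full join, which is the inefficiency the rest of the slides set out to avoid.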

Page 9: Top-k Query Processing and Optimization

Outline

Motivation with Examples
Theoretical Foundation of Rank Aggregation
Query Model
  Top-k Selection
  Top-k Join
Ranking in Relational Query Engines
Summary
Open Research Topics

Page 10: Top-k Query Processing and Optimization

Query Model: Top-k Selection

One relation R with:
  n attributes A_1, …, A_n
  m scores s_1, …, s_m
Scores are expressions over the attributes
  Example: s_1 = A_1 and s_2 = A_2 + A_3
A combining function F to compute the total score
An example template:

  SELECT some_attributes
  FROM R
  WHERE selection_condition
  ORDER BY F(s_1, …, s_m)
  STOP AFTER k
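As a point of reference, the selection template can be evaluated directly by scoring every qualifying tuple and keeping the k best. The sketch below assumes that a higher total score is better and that F is a plain sum; the relation `R`, the helper `top_k_select`, and the sample rows are hypothetical.

```python
import heapq

def top_k_select(relation, scores, combine, k, condition=lambda t: True):
    """Return the k qualifying tuples with the highest combine(s_1(t), ..., s_m(t))."""
    qualifying = (t for t in relation if condition(t))
    return heapq.nlargest(
        k, qualifying, key=lambda t: combine(*(s(t) for s in scores)))

# Hypothetical relation R(A1, A2, A3) with the scores s1 = A1 and s2 = A2 + A3.
R = [(10, 17, 20), (20, 20, 11), (17, 18, 14), (15, 10, 8), (5, 10, 12)]
s1 = lambda t: t[0]
s2 = lambda t: t[1] + t[2]
F = lambda a, b: a + b  # assumed combining function

print(top_k_select(R, [s1, s2], F, k=2))
```

This baseline touches every tuple of R; the approaches on the following slides try to reach the same answer while reading only part of the input.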

Page 11: Top-k Query Processing and Optimization

Query Model: Top-k Join

m relations R_1, …, R_m; each R_i has:
  n attributes
  a score attribute s_i (can be an expression over other attributes)
A global score for a join result is computed as F(s_1, …, s_m)
An example template:

  SELECT some_attributes
  FROM R_1, …, R_m
  WHERE join_condition
  ORDER BY F(s_1, …, s_m)
  STOP AFTER k

Page 12: Top-k Query Processing and Optimization

Top-k Selection Queries

Approaches:
  Filter/Restart method
  Rank aggregation from multiple lists
  Using indexes and materialized views

Page 13: Top-k Query Processing and Optimization

Top-k Selection: Filter/Restart Method

[Carey and Kossmann SIGMOD'97] [Donjerkovic and Ramakrishnan VLDB'99] [Bruno et al. TODS'02] [Chaudhuri et al. TKDE'04]

Map the top-k query to a selection predicate x > t
For multi-criteria ranking (multiple scoring attributes): a range query t_11 < x_1 < t_12 and t_21 < x_2 < t_22
Estimate the cut-off t based on k and the data distribution
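The cut-off idea can be sketched for a single scoring attribute. The estimator below is an assumption made for illustration, not the one used in the cited papers: it treats the scores as uniformly distributed on [lo, hi], picks t so that about k values are expected above it, and on a restart doubles the expected fraction. The function name `top_k_filter_restart` is hypothetical.

```python
def top_k_filter_restart(values, k, lo, hi):
    """Top-k by a selection predicate x > t, restarting with a lower t when needed."""
    fraction = k / len(values)   # expected share of tuples above the cut-off
    restarts = 0
    while True:
        cutoff = hi - fraction * (hi - lo)        # uniform-distribution estimate of t
        hits = [x for x in values if x > cutoff]  # the selection predicate x > t
        if len(hits) >= k:
            return sorted(hits, reverse=True)[:k], restarts
        if fraction >= 1.0:
            # The cut-off already covers the whole domain: sort everything.
            return sorted(values, reverse=True)[:k], restarts
        fraction = min(1.0, 2 * fraction)         # RESTART with a looser predicate
        restarts += 1

# Skewed data defeats the uniform assumption and forces restarts.
print(top_k_filter_restart([1, 2, 3, 4, 5, 6, 7, 8, 50, 100], k=3, lo=0, hi=100))
# Uniform data is answered by the first predicate.
print(top_k_filter_restart(list(range(1, 101)), k=10, lo=0, hi=100))
```

The two calls show the trade-off: an optimistic cut-off filters aggressively but may have to restart, while a conservative one avoids restarts at the price of sorting more tuples.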

Page 14: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

Input relation:
  ID  S1  S2
  1   40  30
  2   20  40
  3   30  50
  4   10  20
  5   50  10

Ranked lists, each sorted on one score:
  L1: ID  S1        L2: ID  S2
      5   50            3   50
      1   40            2   40
      3   30            1   30
      2   20            4   20
      4   10            5   10

The lists are matched on L1.ID = L2.ID

Page 15: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

Assumptions:
  Sorted inputs on the individual scores
  The combining function is monotone
Differ in the access capabilities on the lists
  Sorted Access Only
  Sorted + Random Access
  Random Access Only!
Differ in pipeline support
  The output can serve as input to another instance
Most algorithms can be cast as a specialization of the A* algorithm

Page 16: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

Algorithms by access capability:
  Both (sorted + random access): FA, TA, Quick-combine, Multi-Step
  Sorted access only: NRA, Stream-combine
  Random access/probes only: MPro, Upper, Pick

Page 17: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

Sorted + Random Access Available
  Go in the lists in parallel
  Keep track of all "seen" objects
  Update scores of objects at each step
  Maintain a threshold T: an upper bound for all the unknown scores
  An object is qualified as a top-k if
    The object's combined score is known and is greater than or equal to T

Page 18: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

TA [Fagin et al. PODS'01], Quick-combine [Güntzer et al. VLDB'00], Multi-step [Nepal and Ramakrishna ICDE'99]

Algorithm Sketch: F = S1 + S2

  L1: ID  S1        L2: ID  S2
      5   50            3   50
      1   40            2   40
      3   30            1   30
      2   20            4   20
      4   10            5   10

Buffer (missing scores are fetched by Random Access):
  after step 1: 3: (80), 5: (60); T = 100
  after step 2: 3: (80), 1: (70), 5: (60), 2: (60); T = 80
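The sketch can be reproduced in a few lines of Python. This is a minimal illustration of the threshold idea with F = S1 + S2, written under two assumptions: random access is modeled as a dictionary lookup, and the stopping test uses >= (ties count), as in the published TA stopping rule. The function name `threshold_algorithm` and the `scores` layout are illustrative.

```python
import heapq

def threshold_algorithm(scores, k):
    """scores maps object ID -> tuple of per-list scores; F is their sum."""
    m = len(next(iter(scores.values())))
    # One ranked list of IDs per score, best first (sorted access).
    lists = [sorted(scores, key=lambda o, i=i: scores[o][i], reverse=True)
             for i in range(m)]
    seen = {}  # buffer: object ID -> complete combined score
    for depth in range(len(scores)):
        last = []
        for i in range(m):
            obj = lists[i][depth]                 # sorted access on list i
            last.append(scores[obj][i])
            if obj not in seen:
                seen[obj] = sum(scores[obj])      # random access for the other scores
        threshold = sum(last)                     # bound on every unseen object
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, depth + 1
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1]), len(scores)

# The five objects of the running example: ID -> (S1, S2).
scores = {1: (40, 30), 2: (20, 40), 3: (30, 50), 4: (10, 20), 5: (50, 10)}
print(threshold_algorithm(scores, k=1))
```

For k = 1 the run stops after two rounds of sorted access: the threshold drops from 100 to 80, at which point object 3 with combined score 80 can no longer be beaten.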

Page 19: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

No Random Access available
  The combined score of an object is two parts:
    "Seen": from lists where we encountered the object
    "Unseen": upper bound of all missing scores
  Sounds familiar? A* search
  An object is qualified as a top-k if
    The object's lower-bound score is greater than or equal to the upper-bound score of all other objects

Page 20: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

NRA [Fagin et al. PODS'01], Stream-combine [Güntzer et al. ITCC'01]

Algorithm Sketch: F = S1 + S2

  L1: ID  S1        L2: ID  S2
      5   50            3   50
      1   40            2   40
      3   30            1   30
      2   20            4   20
      4   10            5   10

Buffer, as (lower bound – upper bound):
  after step 1: 5: (50 – 100), 3: (50 – 100)
  after step 2: 5: (50 – 90), 3: (50 – 90), 1: (40 – 80), 2: (40 – 80)
  after step 3: 3: (80 – 80), 1: (70 – 70), 5: (50 – 80), 2: (40 – 70)
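The bound bookkeeping can be sketched as follows. This is a simplified illustration of the no-random-access idea, assuming non-negative scores, F = S1 + S2, lists of equal length, and a stopping test that accepts ties (>=). The function name `nra` and the list layout are illustrative; the bounds are recomputed from scratch at every step for clarity rather than efficiency.

```python
def nra(lists, k):
    """lists: one list of (ID, score) pairs per attribute, sorted best first."""
    m = len(lists)
    known = {}  # object ID -> {list index: score seen under sorted access}
    for depth in range(len(lists[0])):
        last = []
        for i, lst in enumerate(lists):
            obj, score = lst[depth]
            known.setdefault(obj, {})[i] = score
            last.append(score)
        bounds = {}
        for obj, seen in known.items():
            lower = sum(seen.values())            # missing scores count as 0
            upper = lower + sum(last[i] for i in range(m) if i not in seen)
            bounds[obj] = (lower, upper)
        ranked = sorted(bounds, key=lambda o: bounds[o][0], reverse=True)
        top, rest = ranked[:k], ranked[k:]
        if len(top) == k:
            kth_lower = bounds[top[-1]][0]
            # Competitors: every other seen object, plus any still-unseen object.
            best_other = max([bounds[o][1] for o in rest] + [sum(last)])
            if kth_lower >= best_other:
                return [(o, bounds[o]) for o in top], depth + 1
    raise ValueError("fewer than k objects in the input")

L1 = [(5, 50), (1, 40), (3, 30), (2, 20), (4, 10)]
L2 = [(3, 50), (2, 40), (1, 30), (4, 20), (5, 10)]
print(nra([L1, L2], k=1))
```

On the running example the sketch stops after three rounds of sorted access, when object 3 has bounds (80, 80) and no other object can exceed 80.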

Page 21: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

At least one attribute with sorted access
Some with no sorted access (probe attributes)
Schedule the probe attributes using statistics, in ascending order of their "probing" cost

Page 22: Top-k Query Processing and Optimization

Top-k Selection: Rank Aggregation from Multiple Lists

Upper [Bruno et al. ICDE'02], MPro [Chang and Hwang SIGMOD'02]

Algorithm Sketch:

  ID  s     p1   p2   Min(s, p1, p2)
  a   0.90  .85  .75  .75
  b   0.80  .78  .90  .78
  c   0.70  .75  .20  .20
  d   0.60  .90  .90  .60
  e   0.50  .70  .80  .50

Candidates Queue (successive states, highest upper bound first):
  a: 0.9
  a: 0.85
  a: 0.85, b: 0.8
  b: 0.8, a: 0.75
  b: 0.78, a: 0.75
  b: 0.78, a: 0.75, c: 0.7

Upper bound of the unseen scores (U_unseen): 0.9, then 0.8, then 0.7
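A probe-scheduling loop in the spirit of this sketch can be written with a priority queue ordered by upper bound. The code below is an illustration under stated assumptions, not the published MPro or Upper algorithm: F is `min`, probes are applied in the order given (standing in for the cost-based schedule), and the names `probe_topk`, `sorted_list`, and `probes` are hypothetical.

```python
import heapq

def probe_topk(sorted_list, probes, k):
    """sorted_list: [(ID, s)] best first; probes: list of dicts ID -> probe score."""
    heap = []                # entries: (-upper_bound, ID, probes_done)
    pos = 0                  # next position for sorted access
    unseen = float("inf")    # upper bound on the score of any unread object
    probe_count = 0
    result = []
    while len(result) < k:
        # Read the next object whenever an unread one could beat the best candidate.
        if pos < len(sorted_list) and (not heap or unseen > -heap[0][0]):
            obj, s = sorted_list[pos]
            pos += 1
            unseen = s if pos < len(sorted_list) else float("-inf")
            heapq.heappush(heap, (-s, obj, 0))
            continue
        if not heap:
            break            # fewer than k objects in total
        neg_ub, obj, done = heapq.heappop(heap)
        if done == len(probes):
            result.append((obj, -neg_ub))         # bound is exact: emit the object
        else:
            probe_count += 1                      # probe only the current best candidate
            ub = min(-neg_ub, probes[done][obj])
            heapq.heappush(heap, (-ub, obj, done + 1))
    return result, probe_count

s = [("a", 0.90), ("b", 0.80), ("c", 0.70), ("d", 0.60), ("e", 0.50)]
p1 = {"a": 0.85, "b": 0.78, "c": 0.75, "d": 0.90, "e": 0.70}
p2 = {"a": 0.75, "b": 0.90, "c": 0.20, "d": 0.90, "e": 0.80}
print(probe_topk(s, [p1, p2], k=1))
```

For k = 1 the loop returns object b with score 0.78 after four probes (two on a, two on b), and its queue passes through the same states as the trace: objects c, d, and e are never probed.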

Page 23: Top-k Query Processing and Optimization

Top-k Selection: Materialized Views (PREFER)

[Hristidis et al. SIGMOD'01]

Ranking function: $f_v = \sum_{i=1}^{m} v_i \cdot \mathrm{Score}_i$
Materialize a view that ranks the entire input relation on f_v, where v = (v_1, v_2, ..., v_m) (predefined weights)
For an input query with weights q = (q_1, q_2, …, q_m):
  Get a prefix of the view based on v and q
  Sort the prefix on f_q
  The top results are guaranteed to be in the prefix

Page 24: Top-k Query Processing and Optimization

Top-k Selection: Materialized Views (PREFER)

The prefix: determine a threshold T_{v,q} such that

  $\forall t \in R:\; f_v(t) < T_{v,q} \Rightarrow f_q(t) < f_q(t_v^1)$

where t_v^1 is the first tuple of the view.

  ID  A1  A2  A3  f_v(t)  f_q(t)
  1   10  17  20  16.8    17.2
  2   20  20  11  16.4    17.3
  3   17  18  12  15.4    16.1
  4   15  10  8   10.2    9.9
  5   5   10  12  9.8     10.1
  6   15  10  5   9       9

v = (0.2, 0.4, 0.4)
q = (0.1, 0.6, 0.3)
T_{v,q} = 14.26

Maximize f_v(t) while maintaining the inequality
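The way a query is answered from the view can be illustrated on the table above. The sketch takes the threshold value 14.26 from the slide as given and does not derive it; it only shows the prefix scan and the re-sort on f_q. The names `relation`, `view`, and `answer_from_prefix` are illustrative.

```python
from itertools import takewhile

# Tuples of the example relation: ID -> (A1, A2, A3).
relation = {1: (10, 17, 20), 2: (20, 20, 11), 3: (17, 18, 12),
            4: (15, 10, 8), 5: (5, 10, 12), 6: (15, 10, 5)}
v = (0.2, 0.4, 0.4)   # weights of the materialized view
q = (0.1, 0.6, 0.3)   # weights of the incoming query

def f(weights, t):
    """Linear ranking function: weighted sum of the attribute values."""
    return sum(w * a for w, a in zip(weights, t))

# Materialized view: tuple IDs ranked by f_v, best first.
view = sorted(relation, key=lambda i: f(v, relation[i]), reverse=True)

def answer_from_prefix(threshold):
    """Scan the view while f_v(t) >= threshold, then sort that prefix on f_q."""
    prefix = list(takewhile(lambda i: f(v, relation[i]) >= threshold, view))
    ranked = sorted(prefix, key=lambda i: f(q, relation[i]), reverse=True)
    return prefix, ranked

prefix, ranked = answer_from_prefix(14.26)   # threshold taken from the slide
print(prefix, ranked)
```

Only tuples 1, 2, and 3 are read from the view, and after re-sorting on f_q the first answer is tuple 2, which is also the best tuple under q over the whole relation.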

Page 25: Top-k Query Processing and Optimization

Top-k Selection: Materialized Views (PREFER)

Multiple views
  For each query q, a view v that covers q with some prefix constraint
  10 to 100 views is a typical number to cover the space of queries
  Heuristic to cover the maximum number of queries using n views
View selection for an input query
  Store view coverage in an R-tree
  A query is a point in the space
  Get the view that contains the query

Page 27: Top-k Query Processing and Optimization

Top-k Join

m relations R_1, …, R_m; each R_i has:
  n attributes
  a score attribute s_i (can be an expression over other attributes)
A global score for a join result is computed as F(s_1, …, s_m)
An example template:

  SELECT some_attributes
  FROM R_1, …, R_m
  WHERE join_condition
  ORDER BY F(s_1, …, s_m)
  STOP AFTER k

Page 29: Top-k Query Processing and Optimization

Supporting Ranking in Relational Databases

Approaches:
  Map to a multi-dimensional range query (Filter/Restart)
  User-defined function
  Core implementation as a query operator
    New algebra
    The notion of ranked relations
    Algorithms implement the ranking process

Page 30: Top-k Query Processing and Optimization

Ranking in Relational Databases

Range queries [Bruno et al. TODS'02]

Filter/Restart Method
  Given a top-k selection query q over a relation R
  Use a multidimensional histogram H to estimate a search distance d_q
  The region (q, d_q) is expected to include the top-k answers
  Perform a range query on (q, d_q)
  Return the top-k answers of the results
  If #results < k, RESTART

Page 31: Top-k Query Processing and Optimization

Ranking in Relational Databases

Range queries

Using histograms
  Create a small synthetic relation S consistent with the histogram on R
  Compute Dist(q, t) for every tuple t in S
  d_q is the maximum distance between q and the top-k tuples in S
Building S
  No restarts: d_q is large enough, so more results need to be filtered
  Restarts: restarts are possible, but there is less filtering
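The histogram-based estimate can be sketched as below. Several things here are assumptions made for illustration: the synthetic relation S is supplied by hand instead of being generated from a histogram, Dist is Euclidean distance, the top-k tuples are the k nearest to q, and a restart simply doubles the search distance. The names `estimate_dq` and `filter_restart` are hypothetical.

```python
import math

def estimate_dq(q, synthetic, k):
    """d_q: distance from q to the k-th nearest tuple of the synthetic relation S."""
    return sorted(math.dist(q, t) for t in synthetic)[k - 1]

def filter_restart(q, relation, synthetic, k):
    """Range query around q with radius d_q; enlarge the radius and restart if needed."""
    if len(relation) <= k:
        return sorted(relation, key=lambda t: math.dist(q, t)), 0
    d = estimate_dq(q, synthetic, k)
    restarts = 0
    while True:
        hits = [t for t in relation if math.dist(q, t) <= d]   # range query (q, d)
        if len(hits) >= k:
            return sorted(hits, key=lambda t: math.dist(q, t))[:k], restarts
        d = 2 * d if d > 0 else 1.0                             # RESTART
        restarts += 1

R = [(1, 1), (2, 2), (3, 3), (8, 8), (9, 9)]
S = [(1, 0), (0, 1), (5, 5)]   # hand-made stand-in for the synthetic relation
print(filter_restart((0, 0), R, S, k=2))
```

In this run S underestimates the distance (d_q = 1), so the range query restarts twice before it returns the two nearest tuples; a larger d_q would have avoided the restarts at the cost of filtering more candidates.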

Page 32: Top-k Query Processing and Optimization

Supporting Ranking in Relational Databases

As a Query Operator
  (+) Under the optimizer's control
  (+) Can be shuffled with other operators in a query evaluation plan for better performance
  (+) General enough and highly applicable
  (-) Changes to the query engine

User-defined Function
  (+) Easy to implement and ready-to-go solution
  (-) Implementation outside the SQL engine loses the efforts of the query optimizer

Page 34: Top-k Query Processing and Optimization

Summary

Wide applicability of ranking queries as essential functionality in many applications warrants efficient handling by database systems
One step towards integrating IR effective retrieval and DB effective handling of data
Defined two flavors:
  Top-k Selection Queries
  Top-k Join Queries
Basic Techniques:
  Filter / Restart
  Rank Aggregation
  Using Indexes and Materialized Views

Page 35: Top-k Query Processing and Optimization

Summary

Rank aggregation has a solid theoretical background from voting and social selection
Many rank-aggregation algorithms are available, with a similar core idea
Challenges
  Unified framework for rank-aware query processing
  Integrating rank aggregation as a basic operation in practical relational database systems

Page 36: Top-k Query Processing and Optimization

References (1)

[ACE+03] Walid G. Aref, Ann C. Catlin, Ahmed K. Elmagarmid, J. Fan, Moustafa A. Hammad, Ihab F. Ilyas, Mirette Marzouk, Sunil Prabhakar, and X. Zhu. VDBMS: A testbed facility for research in video database benchmarking. ACM Multimedia Systems Journal, Special Issue on Multimedia Document Management Systems, 2003.

[ASC02] Sihem Amer-Yahia, SungRan Cho, and Divesh Srivastava. Tree Pattern Relaxation. In EDBT, 2002.

[BCG02] Nicolas Bruno, Surajit Chaudhuri, and Luis Gravano. Top-k selection queries over relational databases: Mapping strategies and performance evaluation. TODS, 27(2), 2002.

[BGM02] Nicolas Bruno, Luis Gravano, and Amélie Marian. Evaluating Top-k Queries over Web-Accessible Databases. In ICDE, 2002.

[CBC+00] Yuan-Chi Chang, Lawrence Bergman, Vittorio Castelli, Chung-Sheng Li, Ming-Ling Lo, and John R. Smith. The onion technique: indexing for linear optimization queries. In SIGMOD, 2000.

[CK97] Michael J. Carey and Donald Kossmann. On saying "Enough already!" in SQL. In SIGMOD, Tucson, Arizona, 1997.

Page 37: Top-k Query Processing and Optimization

References (2)

[CH02] Kevin Chen-Chuan Chang and Seung-won Hwang. Minimal probing: supporting expensive predicates for top-k queries. In SIGMOD, 2002.

[Con85] M.J. Condorcet. Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix, 1785.

[DKN+01] Cynthia Dwork, S. Ravi Kumar, Moni Naor, and D. Sivakumar. Rank aggregation methods for the web. In World Wide Web, 2001.

[DR99] Donko Donjerkovic and Raghu Ramakrishnan. Probabilistic Optimization of Top N Queries. In VLDB, 1999.

[Fag99] Ronald Fagin. Combining fuzzy information from multiple systems. Journal of Computer and System Sciences (JCSS), 58(1), Feb 1999.

[FLN01] Ronald Fagin, Amnon Lotem, and Moni Naor. Optimal aggregation algorithms for middleware. In PODS, Santa Barbara, California, May 2001.

[GBK00] Ulrich Güntzer, Wolf-Tilo Balke, and Werner Kießling. Optimizing multi-feature queries for image databases. In VLDB, September 10-14, Cairo, Egypt, 2000.

Page 38: Top-k Query Processing and Optimization

References (3)

[GBK01] Ulrich Güntzer, Wolf-Tilo Balke, and Werner Kießling. Towards efficient multi-feature queries in heterogeneous environments. In ITCC, 2001.

[HGP03] Vagelis Hristidis, Luis Gravano, and Yannis Papakonstantinou. Efficient IR-style keyword search over relational databases. In VLDB, Berlin, Germany, 2003.

[HKP01] Vagelis Hristidis, Nick Koudas, and Yannis Papakonstantinou. PREFER: A system for the efficient execution of multiparametric ranked queries. In SIGMOD, Santa Barbara, California, 2001.

[IAE02] Ihab F. Ilyas, Walid G. Aref, and Ahmed K. Elmagarmid. Joining Ranked Inputs in Practice. In VLDB, Hong Kong, China, 2002.

[IAE03] Ihab F. Ilyas, Walid G. Aref, and Ahmed K. Elmagarmid. Supporting top-k join queries in relational databases. In VLDB, Berlin, Germany, 2003.

[ISA+04] Ihab F. Ilyas, Rahul Shah, Walid G. Aref, Jeff Vitter, and Ahmed K. Elmagarmid. Rank-aware Query Optimization. In SIGMOD, Paris, France, 2004.

Page 39: Top-k Query Processing and Optimization

References (4)

[LCI+05] Chengkai Li, Kevin C.-C. Chang, Ihab F. Ilyas, and Sumin Song. RankSQL: Query Algebra and Optimization for Relational Top-k Queries. In Proceedings of the 2005 ACM SIGMOD Conference on Management of Data, Baltimore, Maryland (To Appear).

[NCS+01] Apostol Natsev, Yuan-Chi Chang, John R. Smith, Chung-Sheng Li, and Jeffrey Scott Vitter. Supporting incremental join queries on ranked inputs. In VLDB, Rome, Italy, 2001.

[NR99] Surya Nepal and M. V. Ramakrishna. Query processing issues in image (multimedia) databases. In ICDE, Sydney, Australia, 1999.

[RGM03] Sriram Raghavan and Hector Garcia-Molina. Complex queries over web repositories. In VLDB, Berlin, Germany, 2003.

[TPK+03] Panayiotis Tsaparas, Themistoklis Palpanas, Yannis Kotidis, Nick Koudas, and Divesh Srivastava. Ranked join indices. In ICDE, 2003.

[UF01] Tolga Urhan and Michael J. Franklin. Dynamic pipeline scheduling for improving interactive query performance. In VLDB, Rome, Italy, 2001.