adaptive query processing with eddies

87
Adaptive Query Processing with Eddies Amol Deshpande University of Maryland

Upload: herman

Post on 25-Feb-2016

59 views

Category:

Documents


4 download

DESCRIPTION

Adaptive Query Processing with Eddies. Amol Deshpande University of Maryland. Roadmap. Adaptive Query Processing: Motivation Eddies [AH’00] STAIRs [DH’04] and SteMs [RDH’03] Experimental Study Implementation in PostgreSQL [Des’03] Continuous queries [MSHR’02] (very briefly) Open problems. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Adaptive Query Processing with Eddies

Adaptive Query Processing with Eddies

Amol DeshpandeUniversity of Maryland

Page 2: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 3: Adaptive Query Processing with Eddies

Query Processing in Database Systems

Database System

Declarative Query

Results

Page 4: Adaptive Query Processing with Eddies

Query Processing: Example

Database System

Students Enrolled Courses

Name LevelJoe Junior

Jen Senior

Name CourseJoe CS1

Jen CS2

Course InstructorCS2 Smith

select *from students, enrolled, courseswhere students.name = enrolled.name and enrolled.course = courses.course

Page 5: Adaptive Query Processing with Eddies

Query Processing: Example

Name LevelJoe Junior

Jen Senior

Name CourseJoe CS1

Jen CS2

Course InstructorCS2 Smith

select *from students, enrolled, courseswhere students.name = enrolled.name and enrolled.course = courses.course

Students Enrolled

Name Level CourseJoe Junior CS1

Jen Senior CS2

Enrolled Courses

Students Enrolled

Courses

Name Level Course InstructorJen Senior CS2 Smith

Page 6: Adaptive Query Processing with Eddies

Example Query: Execution Plans

Students Enrolled

Courses

E C

S E

A Query Execution Plan

S E

CSE

SEC

Courses Enrolled

Students

E S

C E

An alternate Execution Plan

C E

SCE

SEC

Page 7: Adaptive Query Processing with Eddies

Cost-based Query Optimization

Students Enrolled

Courses

E C

S E

A Query Execution Plan

S E

CSE

SEC

Estimate cost of each plan and choose the best

Cost = f(|S|, |E|, R)

Runtime Parameters

Input sizesCost = g(|SE|, |C|, R)

Cost (Plan)

=

+

Page 8: Adaptive Query Processing with Eddies

Cost-based Query Optimization

DeclarativeQuery

Results

Query Optimizer

Query Executor

CompiledQuery Plan

Disk(s)

Page 9: Adaptive Query Processing with Eddies

Cost-based Query Optimization

DeclarativeQuery

ResultsN

etwork

Query Optimizer

Query Executor

CompiledQuery Plan

Disk(s)

Wide area data sources: e.g. remote tables, web data sources

Page 10: Adaptive Query Processing with Eddies

Cost-based Query Optimization

DeclarativeQuery

ResultsN

etwork

Query Optimizer

Query Executor

CompiledQuery Plan

Disk(s)

Streaming data e.g. Stock tickers Network logs Sensor networks

Page 11: Adaptive Query Processing with Eddies

Estimation Errors

Students Enrolled

Courses

E C

S E

A Query Execution Plan

S E

CSE

SECCost = g(|SE|, |C|, R)

Input sizes may not be availableErroneous estimation of intermediateresult sizes

Page 12: Adaptive Query Processing with Eddies

Effect on the cost function maybe unpredictable

Estimation Errors

Students Enrolled

Courses

E C

S E

A Query Execution Plan

S E

C

Cost = g(|SE|, |C|, R)

Unknown runtime parametersSE

SEC

Page 13: Adaptive Query Processing with Eddies

How to solve this problem ? More sophisticated estimation techniques

Sophisticated summary structures e.g. MHists [PI’97], Wavelets [VWI’98]

Feedback loop in the optimization process e.g. [SLMK’01, BC’02]

Adaptive query processing Can’t always build and maintain synopses Runtime environments can be very unpredictable So…adapt query plans mid-way during execution

Page 14: Adaptive Query Processing with Eddies

Eddies: Extreme Adaptivity

Telegraph & TelegraphCQ (at UC Berkeley) Eddies [AH’00] SteMs [RDH’03] Continuous queries [MSHR’02, CF’02, C+’03, K+’03] Implementation in PostgreSQL [Des04] Fault-tolerance and load balancing [SHB’04] STAIRs [DH’03]

Other work Distributed eddies, Content-based Routing [BB’05]

Dynamic QEP,Parametric,Competitive

staticplans

latebinding

inter-operator

per tuple

TraditionalDBMS

Query Scrambling,MidQuery

Re-opt

EddiesXJoin, DPHJConvergent

QP

intra-operator

Page 15: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 16: Adaptive Query Processing with Eddies

Eddies [AH’00]

Plans considered by the optimizer

pred2(S)S Output

select * from Swhere pred1(S) and pred2(S)

pred1(S)

pred1(S)S Outputpred2(S)

Decision made apriori based on statistics Sort by (1-s)/c, where s = selectivity, c = cost

Page 17: Adaptive Query Processing with Eddies

Eddies [AH’00]

Executing the query using an Eddy

select * from Swhere pred1(S) and pred2(S)

pred2(S)

pred1(S)

EddyS Output

An eddy operator• Intercepts tuples from source(s) and output tuples from operators• Query executed by routing tuples between the operators• Uses feedback from the operators to route

Change routing ==> Change query execution plan used

Page 18: Adaptive Query Processing with Eddies

Per-tuple State

Executing the query using an Eddy

select * from Swhere pred1(S) and pred2(S)

pred2(S)

pred1(S)

EddyS Output

Two Bitmaps1) Ready bits - which operators can

a tuple be routed to next2) Done bits - which operators has a

tuple already been throughExample:

Ready(t1) = [1, 1] - can be routed to eitherDone(t1) = [0, 0] - not done either

Example:

Ready(t2) = [1, 0] - can be routed to pred1Done(t2) = [0, 1] - done pred2

For selection queries, ready is a bit-complement of done

Page 19: Adaptive Query Processing with Eddies

Eddies: Routing Policy Choosing which operator to route a given tuple to

The brain of the eddy

Lottery Scheduling [Avnur 00] Simplified Description 1. Maintain for each operator: tuples sent tuples returned cost per tuple2. Choose (roughly) based on the above3. Explore by randomly sending tuples in the wrong orders

sent = 100received = 2

sent = 10received = 20

Send here 99% of the timeSend to the other operator 1% of the time

pred2(S)

pred1(S)

EddyS Output

Page 20: Adaptive Query Processing with Eddies

A Join Query

Name LevelJoe Junior

Jen Senior

Name CourseJoe CS1

Jen CS2

Course InstructorCS2 Smith

select *from students, enrolled, courseswhere students.name = enrolled.name and enrolled.course = courses.course

Students Enrolled

Name Level CourseJoe Junior CS1

Jen Senior CS2

Enrolled Courses

Students Enrolled

Courses

Name Level Course InstructorJen Senior CS2 Smith

Page 21: Adaptive Query Processing with Eddies

Eddies [AH’00]A traditional query plan Query execution using an eddy

S E

E C

S E

Output

CEddy

S E

E C

SEC

Output

A key difference: Tuples can’t be arbitrarily routed to any

operator E.g. S tuples can’t be routed to E Join C Use ready bits to identify this

Page 22: Adaptive Query Processing with Eddies

Query Execution using Eddies

EddySEC

Insert with key hash(joe)Probe

to find matches

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Junior

Joe Junior

Joe Jr

No matches; Eddy processesthe next tuple

Output

Page 23: Adaptive Query Processing with Eddies

Query Execution using Eddies

EddySEC

InsertProbe

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Jr

Jen Sr

Joe CS1

Joe CS1Joe CS1

Joe Jr CS1

Joe Jr CS1Joe Jr CS1

Output

CS2 Smith

Page 24: Adaptive Query Processing with Eddies

Query Execution using Eddies

EddySEC

Output

Probe

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Jr

Jen Sr

CS2 Smith

Jen CS2

Joe CS1

Joe Jr CS1Jen CS2

Jen CS2

Jen CS2 Smith

Probe

Jen CS2 SmithJen CS2 SmithJen Sr. CS2 Smith

Jen Sr. CS2 Smith

Page 25: Adaptive Query Processing with Eddies

Per-tuple State

EddySEC

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Junior

Output

S Join E E Join CReady 1 0

Done 0 0

Page 26: Adaptive Query Processing with Eddies

Per-tuple State

EddySEC

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Jr

Jen Sr

Joe CS1Output

CS2 Smith

S Join E E Join CReady 1 1

Done 0 0

Page 27: Adaptive Query Processing with Eddies

Per-tuple State

EddySEC

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Jr

Jen Sr

Joe CS1

Joe Jr CS1

Output

CS2 Smith

S Join E E Join CReady 0 1

Done 1 0

Page 28: Adaptive Query Processing with Eddies

Eddies: Postmortem

Students Enrolled

Output

Courses

E C

S E

Courses Enrolled

Output

Students

E S

C E

Eddy executes different query execution plans for different parts of data

Course Instructor

CS2 Smith

Course Instructor

CS2 Smith

Name Course

Joe CS1

Name Level

Joe Junior

Jen Senior

Name Level

Joe Junior

Jen Senior

Name Course

Jen CS2

Page 29: Adaptive Query Processing with Eddies

Joins and Lottery Scheduling Lottery scheduling doesn’t work well with joins

Page 30: Adaptive Query Processing with Eddies

Example: Delayed Data SourcesSETUP:

|S E|

|E C|

>>

E C

S E

S E

C

E S

C E

C E

S

Execution plan 1 Execution plan 2

Cost (Plan 1) > Cost (Plan 2)

SE

SEC

CE

SEC

Page 31: Adaptive Query Processing with Eddies

SETUP:

E and C arrive early; S is delayed

Example: Delayed Data Sources

time

|S E|

|E C|

>>

SEC

Page 32: Adaptive Query Processing with Eddies

S0

SETUP:

E and C arrive early; S is delayed

time

|S E|

|E C|

>>

SEC

E C

EddySEC

Output

S EHashTable

S.NameHashTable

E.Name

HashTableE.Course

HashTableC.Course

S0 E

CS0E

Eddy decides to route E to E CEddy learns the correct sizes

Too Late !!

S

SE

S –S0

(S –S0)E

sent and received suggestS Join E is better option

Page 33: Adaptive Query Processing with Eddies

E C

EddySEC

Output

S EHashTable

S.NameHashTable

E.Name

HashTableE.Course

HashTableC.Course

S

SETUP:

E and C arrive early; S is delayed

|S E|

|E C|

>>

Query is executed using the worse plan.

E

C

Too Late !!

SE

E C

S E

S E

C

Execution Plan Used

State got embedded as aresult of earlier routing decisions

Page 34: Adaptive Query Processing with Eddies

Joins and Lottery Scheduling Lottery scheduling doesn’t work well with joins

Not clear how any routing policy can work without reasonable knowledge of future Whatever the current state in the join operators, an

adversary can send tuples to make it look very bad

Two possible solutions: Allow manipulation of state (STAIRs) [DH’04] Don’t embed state in the operators (SteMs) [RDH’03]

Page 35: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 36: Adaptive Query Processing with Eddies

STAIRs [DH’04] Expose join state to the eddy

Provide state management primitives That guarantee correctness of execution That can be used to manipulate embedded

state in the operators Also allow support for cyclic queries etc

Page 37: Adaptive Query Processing with Eddies

New Operator: STAIR

E C

EddySEC

Output

S EHashTable

S.NameHashTable

E.Name

HashTableE.Course

HashTableC.Course

Page 38: Adaptive Query Processing with Eddies

New Operator: STAIRStorage, Transformation and Access for Intermediate Results

HashTable

E.Name STAIR

HashTable

S.Name STAIR

HashTable

E.Course STAIR

HashTable

C.Course STAIR

EddySEC

Output

Page 39: Adaptive Query Processing with Eddies

HashTable

E.Name STAIR

HashTable

S.Name STAIR

HashTable

E.Course STAIR

HashTable

C.Course STAIR

EddySEC

Output

Query execution using STAIRSSimilar to using Join Operators

s1s1

s1

Probe into E.Name STAIRBuild into S.Name

STAIR

s1

Page 40: Adaptive Query Processing with Eddies

STAIR: Operations Build (insert):

Insert the given tuple into the STAIR Probe (lookup):

Find matching tuples for the given tuple State Management Operations:

Demotion Promotion

Page 41: Adaptive Query Processing with Eddies

State Management Primitive: DemotionReplace a tuple in a STAIR with a projection of that tuple

HashTable

E.Name STAIR

EddySEC

Output

HashTable

S.Name STAIR

HashTable

E.Course STAIR

HashTable

C.Course STAIR

s1

s1e1e2

e2c1

c1

e1e1e1

e2e2

Demoting e2c1 to e2

e2

Can be thought of as undoing work

e2c1e2c1

s1e1s1e1

Page 42: Adaptive Query Processing with Eddies

e1c1e1c1

State Management Primitive: PromotionReplace a tuple in a STAIR with the result of joining it with other tuples

HashTable

E.Name STAIR

EddySEC

Output

HashTable

S.Name STAIR

HashTable

E.Course STAIR

HashTable

C.Course STAIR

s1

s1e1e2

e2c1

c1

e1

Two arguments:• A tuple• A join to be used to promote this tuple

Can be thought of as precomputation of work

Promoting e1 using E C

e1

e1e1

e1e1c1

Page 43: Adaptive Query Processing with Eddies

STAIRs: Correctness Theorem: For any sequence of applications of

the state management operations, STAIRs will produce the correct query output. STAIRs will produce every result tuple There will be no spurious duplicates

Page 44: Adaptive Query Processing with Eddies

Lifting Burden of History: Delayed Data Sources

Page 45: Adaptive Query Processing with Eddies

SETUP:

E and C arrive early; S is delayed

time

|S E|

|E C|

>>

SEC

E C

EddySEC

Output

S EHashTable

S.NameHashTable

E.Name

HashTableE.Course

HashTableC.Course

S0

S0 E

CS0E

Eddy decides to route E to E CEddy learns the correct selectivities

Page 46: Adaptive Query Processing with Eddies

SETUP:

E and C arrive early; S is delayed

time

|S E|

|E C|

>>

SEC Eddy

SEC

Output

S0

HashTable

S0

S.Name STAIR

HashTable

E

E.Name STAIR

HashTable

S0E

E.Course STAIR

HashTable

C

C.Course STAIR

Eddy decides to route E to E CEddy learns the correct selectivitiesEddy decides to migrate E

EC

ECE E

EEC

E

By promoting E using E C

Page 47: Adaptive Query Processing with Eddies

SETUP:

E and C arrive early; S is delayed

time

|S E|

|E C|

>>

SEC Eddy

SEC

Output

HashTable

S0

S.Name STAIR

HashTable

E.Name STAIR

HashTable

S0E

E.Course STAIR

HashTable

C

C.Course STAIR

EC

E

S

S –S0S –S0

(S –S0) E C

Page 48: Adaptive Query Processing with Eddies

EddySEC

Output

HashTable

S.Name STAIR

HashTable

E.Name STAIR

HashTable

SE

E.Course STAIR

HashTable

C

C.Course STAIR

EC

E

S

S0 E

C

E C

S E

UNION

E C

S – S0

S E

E C

Most of the data isprocessed using thecorrect plan

Page 49: Adaptive Query Processing with Eddies

Further Motivating Adaptive State Management Eager pre-computation for faster response

times Query scrambling [UFA’98] Partial results [RH’02]

Selective caching of intermediate results Continuous queries over streams

Cyclic queries Adapting the join spanning tree used

Page 50: Adaptive Query Processing with Eddies

Making State Migration Decisions Another policy question

Optimal migration decisions Requires knowledge of future selectivities and the

sizes of relations

Page 51: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 52: Adaptive Query Processing with Eddies

Alternative: SteMs [RDH’03] Don’t embed the state in the operators at all

Note: Not the original motivation for SteMs Focus was on increasing opportunities for

adaptivity by breaking up the join operators

We will focus on a very simplistic version of the operator

Page 53: Adaptive Query Processing with Eddies

Query Execution using SteMs

EddySEC

S SteM

E SteM

C SteM

Store S tuplesAllow probes using E tuples ie. If an E tuple is routed to it, find matching S tuplesCould use any indexing technique to find matches

Store E tuplesAllow probes using S and C tuplesNeed to build two internal indexes

Page 54: Adaptive Query Processing with Eddies

Query Execution using SteMs

EddySEC

Insert

Probe

S SteM

Joe Jr

Jen Sr

Joe CS1

CS2 Smith

E SteM

C SteM

Jen CS2

Jen CS2 Smith

Jen Sr. CS2 SmithJen CS2Jen CS2

Jen CS2

Jen CS2

Jen CS2 Smith

Jen Sr. CS2 Smith

Probe

Page 55: Adaptive Query Processing with Eddies

Query Execution using SteMs

State inside the operators is independent of previous routing decisions Because no intermediate tuples are ever stored

Doesn’t have the same problem as the join or STAIR operators

Optimal routing policy easy to write down Similarities to queries with only selections

But not storing intermediate results increases the computation cost significantly

Page 56: Adaptive Query Processing with Eddies

SteMs: Drawbacks Recomputation of intermediate result tuples

Constrained plan choices Available plans depend highly on the arrival

order

Page 57: Adaptive Query Processing with Eddies

EddySEC

S SteM

E SteM

C SteM

S0

SETUP:

E and C arrive early; S is delayed

time

|S E|

|E C|

>>

SEC

S0

E

C

S –S0can only be routedto E SteM for probingand is forced to be executedas (S Join E) Join C

Under the mechanism, there is no way to execute the other plan for this setup

Page 58: Adaptive Query Processing with Eddies

SteMs: Drawbacks Recomputation of intermediate result tuples

Constrained plan choices Available plans depend highly on the arrival

order

Though more subtle, the second drawback might be the more important one

Page 59: Adaptive Query Processing with Eddies

Recap An eddy operator

Can affect the query execution plan(s) used by routing different tuples differently

Eddy w/ Selections: Well understood Even if selections are correlated

Babu, Munagala et al [SIGMOD 2004, ICDT 2005]

Page 60: Adaptive Query Processing with Eddies

Recap Eddies for multi-way joins

Opportunities for adaptivity depend on the join operators used Higher adaptivity tends to push logic into the eddy ==>

Routing policies very important

Similarities toselections

Sort-mergeHybrid-Hash

Index-nestedloop joins

Nested-loopJoins SteMs/

STAIRs

Blocking opeatorsLittle adaptivity

See [AH’00] Suffers from state accumulation

problems

Pipelined/SymmetricHash Join

Policy issues not well-understood

Page 61: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 62: Adaptive Query Processing with Eddies

Implementation Details

In PostgreSQL Database System code base In the context of TelegraphCQ project

Highly efficient implementation [SIGREC’04] Eddy, SteMs, STAIRs export get_next() functions Routing decisions are made per batch

Can control batch size Routing decisions made for all possible ready bitmaps

Decisions are encoded in arrays that are indexed with ready bits

Efficiently find the operator to route to

Page 63: Adaptive Query Processing with Eddies

Results - Overheads (1)

All plans have identical costs, so adaptivity plays no role

Page 64: Adaptive Query Processing with Eddies

Results - Overheads (2)

Page 65: Adaptive Query Processing with Eddies

Policies used for experiments Routing policy:

Observe: Selectivities of predicates on base tables Domain sizes of join attributes

Compute join selectivities and use them to route tuples Migration policy:

Tie state migration decisions to routing decisions Follow the routing policy decisions to make sure that

most tuples are routed correctly Caveats :

May end doing migrations late in the query execution May thrash

Page 66: Adaptive Query Processing with Eddies

State Migration: Illustrative Example

select * from customer c, orders o, lineitem l where c.custkey = o.custkey and

o.orderkey = l.orderkey and c.nationkey = 1 and c.acctbal > 9000 and l.shipdate > date ’1996-01-01’

Setup:lineitem arrives sorted on shipdate==> selectivity(l.shipdate > …) very low initially==> orders routed to join with lineitem (bad)

No explicit delays introduced

Page 67: Adaptive Query Processing with Eddies

Illustrative Example (1)

Page 68: Adaptive Query Processing with Eddies

Illustrative Example (2)

Page 69: Adaptive Query Processing with Eddies

Experiments: Synthetic Workload Modeled after the Wisconsin Benchmark 20 Tables for varying sizes Randomly generated queries Environment

Rates proportional to table sizes; no delays or Random initial delays introduced or Random data rates

Page 70: Adaptive Query Processing with Eddies

Traditional vs STAIRs

Page 71: Adaptive Query Processing with Eddies

SteMs vs STAIRs

Page 72: Adaptive Query Processing with Eddies

Joins vs STAIRs

Page 73: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 74: Adaptive Query Processing with Eddies

Continous Query Processing Eddies ideal for executing continuous queries over data

streams Dynamic runtime conditions make a static plan unsuitable Queries typically executed over sliding windows

Find average over last one week Note: Continuous vs Multi-query processing

Not identical Data streams literature does not make this difference

explicit Application environments tend to have a large number of

simultaneous queries

Page 75: Adaptive Query Processing with Eddies

Continous Query Processing CACQ [Madden et al 2002]

Focus on sharing work as much as adaptivity Uses SteMs augmented with a deletion

operator To handle sliding windows

Also uses predicate indexes For handling a large number of queries on the

same set of streams but with different predicates

E.g. millions of stock alerts over a few streams

Page 76: Adaptive Query Processing with Eddies

Roadmap

Adaptive Query Processing: Motivation

Eddies [AH’00]

STAIRs [DH’04] and SteMs [RDH’03]

Experimental Study Implementation in PostgreSQL [Des’03]

Continuous queries [MSHR’02] (very briefly)

Open problems

Page 77: Adaptive Query Processing with Eddies

Some open problems (1) Eddies for continuous query processing

Much work since CACQ, but not a solved problem E.g. computational inefficiency of SteMs Many other proposed CQ architectures face the same

problem MJoins (NiagaraCQ) Stanford STREAM processor (earlier version)

Later added intermediate result caches Note: These two don’t use eddies explicitly

Routing policies for CQ still an open question Different from routing policies for non-CQ queries

Page 78: Adaptive Query Processing with Eddies

Some open problems (2) Routing policies

Whether eddies will succeed depends on the routing policies

Little work so far...

SteMs, STAIRs Theoretical analysis of optimization space,

and practical viability analysis needed Especially in the context of continuous query

processing

Page 79: Adaptive Query Processing with Eddies

Some open problems (3) Eddies for multi-query processing (non-CQ)

SteMs may be sufficient for CQ processing, but not for normal multi-query processing

Parallel, distributed environments, P2P, Grid..

Disk: Flexibility demanded by adaptive techniques at

odds against the careful scheduling typically done by DBMSs

XJoins Very little work on understanding this

Page 80: Adaptive Query Processing with Eddies

Some open problems (4) Optimization with expanded plan space

Eddies can explore a plan space much larger than traditional plan space

They allow relations to be broken into pieces, with each piece executed separately

Can we explore this plan space in a non-adaptive setting ?

Recent work on: Conditional Planning [Deshpande et al, ICDE

2005] Content-based Routing [Babu et al, VLDB 2005]

Page 81: Adaptive Query Processing with Eddies

Summary Increasing need for adaptivity Eddy: A highly adaptive query processor

Executes queries by routing tuples through operators

SteMs, STAIRs New operators proposed to handle problems

with traditional join operators Very promising especially for continuous and

wide-area query processing Exciting research lies ahead…

Page 82: Adaptive Query Processing with Eddies

The End Questions ?

Page 83: Adaptive Query Processing with Eddies

Fatal Flaw: Burden of Routing History

EddySEC

Output

S EHashTable

S.NameHashTable

E.Name

E C

HashTableE.Course

HashTableC.Course

Joe Jr

Jen Sr

CS2 Smith

Joe CS1

Joe Jr CS1

Jen CS2

Jen CS2 Smith

Routing decisions get embedded in the state

Future adaptibility is severly constrained

Page 84: Adaptive Query Processing with Eddies

Example: Delayed Data SourcesSETUP:

|S E|

|E C|

>>

E C

S E

S E

C

E S

C E

C E

S

Execution plan 1 Execution plan 2

Cost (Plan 1) > Cost (Plan 2)

SE

SEC

CE

SEC

Page 85: Adaptive Query Processing with Eddies

SETUP:

E and C arrive early; S is delayed

Example: Delayed Data Sources

time

A plan may have to be chosen without any statistical information about the data

Earliest time sufficient information may be available to choose optimal plan

|S E|

|E C|

>>

SEC

Page 86: Adaptive Query Processing with Eddies

Tricky State Configurations: 1

Want to undo the decision to route E1 to S E

E C

EddySEC

Output

S EHashTable

S.NameHashTable

E.Name

HashTableE.Course

HashTableC.Course

S0 E1

CS0E1 E2

E2CResult S0ECalready produced

Page 87: Adaptive Query Processing with Eddies

EddySE

I

E C

HashTableE.Course

HashTableC.Course

C1SE1 E2

S EHashTable

S.NameHashTable

E.Name

S E1E2C1

C I

HashTableC.Intstructor

HashTableI.Instructor

I

C

E2C2I

C2I

C2SE1C1SE2C1

Tricky State Configurations: 2