TRANSCRIPT
An Adaptive Query Execution Engine for Data Integration
Zachary Ives, Daniela Florescu, Marc Friedman, Alon Levy, Daniel S. Weld
University of Washington
Slides by Peng Li, Modified by Rachel Pottinger
Modified by Ben Zhu, Yidan Liu
Presenter: Ben Zhu Discussion Leader: Yidan Liu
Outline
Motivations for Tukwila
Tukwila Architecture
Interleaving of planning and execution
Adaptive Query Operators
Dynamic Collector
Double Pipelined Join
Summary
Why data integration?
The goal is to provide a uniform query interface to a multitude of data sources.
The key advantage of data integration is that it frees users from having to do the following:
locate the sources relevant to their query
interact with each source independently
manually combine the data from the different sources
The main challenges in the design of data integration systems (DISs):
Query reformulation
The construction of wrapper programs
Query optimizers and efficient query execution engines
The need for adaptivity
Absence of statistics
Unpredictable data arrival characteristics
Overlap and redundancy among sources
Optimizing the time to the initial answers to the query
Network bandwidth generally constrains the data sources to be “small”
Tukwila Architecture
Interleaving of planning and execution
Novel features of Tukwila:
The optimizer can create a partial plan if essential statistics are missing or uncertain
The optimizer generates both annotated operator trees and appropriate event-condition-action rules
The optimizer conserves the state of its search space when it calls the execution engine, so it is able to resume optimization in an incremental fashion
Interleaving of planning and execution – Query plans
Operators in Tukwila are organized into pipelined units called fragments
A plan includes a partially-ordered set of fragments and a set of global rules
A fragment consists of a fully pipelined tree of physical operators and a set of local rules
At the end of a fragment, results are materialized, and the rest of the plan can be re-optimized or rescheduled
Interleaving of planning and execution - Rules
Rules are the key mechanism for implementing several kinds of adaptive behavior in Tukwila:
Re-optimization: if the optimizer’s cardinality estimate for the fragment’s result is significantly different from the actual size, re-invoke the optimizer
Contingent planning: check properties of the result to select the next fragment
Adaptive operators: set the policy for memory overflow resolution and for collectors
Rescheduling: reschedule if a source times out
Interleaving of planning and execution - Rule format
when event if condition then actions
Example:
when closed(frag1)
if card(join1) > 2 * est_card(join1)
then replan
An event triggers a rule, causing it to check its condition. If the condition is true, the rule fires, executing the actions.
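This trigger-check-fire cycle for the example rule can be sketched as data plus a tiny interpreter. This is an illustrative sketch only, not Tukwila's implementation; the statistics dictionary keys and the replan action are hypothetical stand-ins.

```python
# Hedged sketch of event-condition-action rule evaluation; the
# statistics keys and the replan action are illustrative, not Tukwila's API.
def make_rule(event, condition, actions):
    return {"event": event, "condition": condition, "actions": actions}

def handle(rule, event, stats, log):
    # An event triggers the rule, causing it to check its condition;
    # if the condition is true, the rule fires and executes its actions.
    if event == rule["event"] and rule["condition"](stats):
        for action in rule["actions"]:
            action(stats, log)

def replan(stats, log):
    log.append("replan")  # stand-in for re-invoking the optimizer

# The slide's example rule: replan when frag1 closes with a result
# more than twice its estimated cardinality.
rule = make_rule(
    event="closed(frag1)",
    condition=lambda s: s["card(join1)"] > 2 * s["est_card(join1)"],
    actions=[replan],
)
```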
Discussion 1
Pick one of the following motivating situations for Tukwila:
Absence of statistics
Unpredictable data arrival characteristics
Overlap and redundancy among sources
Optimizing the time to initial answers
Q1: Can you give some examples where the chosen topic matters?
Q2: If you were a member of the Tukwila team, what rules or policies would you design to deal with the problem?
Form 4 groups (3-4 people per group, one group per topic); discuss Q1 and Q2 for one topic (5-7 minutes)
Interleaving of planning and execution – Query Execution
Executes a query plan (basic function)
Gathers statistics about each operation
Handles exception conditions or re-invokes the optimizer
Interleaving of planning and execution – Event Handling
Interprets rules attached to the query execution plan
The execution system may generate events at any time, which are fed into an event queue
For each event, the event handler uses a hash table to find all matching rules in the active set
For each matching rule, it evaluates the conditions, and if they are satisfied, all of the rule’s actions are executed
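The dispatch loop described above can be sketched as follows; class and method names here are hypothetical, chosen only to mirror the slide's description of the queue and hash-table lookup.

```python
from collections import defaultdict, deque

class EventHandler:
    """Illustrative sketch of Tukwila-style event handling; names
    are assumptions, not the system's actual interfaces."""

    def __init__(self):
        self.active = defaultdict(list)  # hash table: event type -> active rules
        self.queue = deque()             # events may be generated at any time

    def register(self, event_type, condition, action):
        self.active[event_type].append((condition, action))

    def post(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        while self.queue:
            event_type, payload = self.queue.popleft()
            # hash-table lookup finds all matching rules in the active set
            for condition, action in self.active[event_type]:
                if condition(payload):   # evaluate the rule's condition
                    action(payload)      # satisfied: execute the rule's action
```

For example, a timeout rule could be registered for a particular source and would fire only when that source's event arrives.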
Adaptive Query Operators - Dynamic Collectors
Performs a union over a large number of overlapping sources
Standard union operators can’t handle errors or decide to ignore slow mirror data sources
Tukwila treats the task as a primitive operator so that it can provide guidance about the order in which data sources should be accessed
A key distinguishing aspect: the flexibility to contact only some of the sources
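A minimal sketch of this idea, under stated assumptions: sources are modeled as zero-argument callables ordered by the optimizer's access preference, and the stopping policy is a simple minimum-source count rather than Tukwila's actual rule-driven policies.

```python
def collect(sources, min_sources=1):
    """Illustrative dynamic collector: a union over overlapping
    sources that tolerates per-source errors and may contact only
    some of the sources. Names and policy are assumptions, not
    Tukwila's actual operator."""
    seen = set()
    answered = 0
    for fetch in sources:
        try:
            for tup in fetch():
                if tup not in seen:      # eliminate overlap between mirrors
                    seen.add(tup)
                    yield tup
        except OSError:
            continue                      # a failed source is skipped, not fatal
        answered += 1
        if answered >= min_sources:
            break                         # contact only as many sources as needed
```

With a failing first source and two overlapping mirrors, the collector skips the error and stops as soon as enough sources have answered.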
Adaptive Query Operators - Double Pipelined Join
Issues with conventional joins:
Sort-merge joins and indexed joins cannot be pipelined; they require an initial sorting or indexing step
Nested loops joins and hash joins follow an asymmetric execution model:
For nested loops joins, we must wait for the entire inner table to be transmitted before pipelining begins
For hash joins, we must load the entire inner relation into a hash table before we can pipeline
Adaptive Query Operators - Double Pipelined Join
Challenges in the context of data integration:
The optimizer may not know the relative size of each relation
The time to first tuple is important, so it may be better to use the larger data source as the inner relation if it sends data faster
The time to first tuple is extended by the hash join’s non-pipelined behavior while it is reading the inner relation
Adaptive Query Operators - Double Pipelined Hash Join
Symmetric and incremental
Produces tuples almost immediately and masks slow data source transmission rates
Data-driven in behavior: each join relation sends tuples through the join operator as quickly as possible
At any time, all of the data encountered so far has been joined and the resulting tuples have already been output
The trade-off is that it MUST hold hash tables for BOTH relations in memory
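The symmetric build-and-probe behavior can be sketched in a single generator. This is a simplified sketch: it alternates between the inputs to simulate data-driven arrival, whereas the real operator uses separate threads feeding a tuple transfer queue.

```python
from collections import defaultdict

def double_pipelined_join(left, right, key_left, key_right):
    """Sketch of a symmetric (double pipelined) hash join.
    Each arriving tuple is inserted into its own side's hash table
    and immediately probes the other side's table, so matches are
    emitted as soon as both partners have arrived."""
    left_ht, right_ht = defaultdict(list), defaultdict(list)
    left_it, right_it = iter(left), iter(right)
    done_left = done_right = False
    while not (done_left and done_right):
        if not done_left:
            try:
                t = next(left_it)
                left_ht[key_left(t)].append(t)          # build left
                for r in right_ht.get(key_left(t), []): # probe right
                    yield (t, r)
            except StopIteration:
                done_left = True
        if not done_right:
            try:
                t = next(right_it)
                right_ht[key_right(t)].append(t)        # build right
                for l in left_ht.get(key_right(t), []): # probe left
                    yield (l, t)
            except StopIteration:
                done_right = True
```

Note that both hash tables live in memory for the whole join, which is exactly the trade-off the slide points out.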
Adaptive Query Operators - Double Pipelined Hash Join
Two problems occur:
It follows a data-driven, bottom-up execution model, while Tukwila is top-down and iterator-based
Solved by multithreading: the join consists of separate threads for the output, the left child, and the right child; as each child reads tuples, it places them into a small tuple transfer queue
It requires enough memory to hold both join relations
Adaptive Query Operators - Double Pipelined Hash Join
Handling memory overflow: take some portion of the hash table and swap it to disk
Incremental left flush: when the system runs out of memory, switch to reading only tuples from the right-side relation, flushing a bucket from the left-side relation’s hash table as necessary; this gradually degrades into hybrid hash join, flushing buckets lazily
Incremental symmetric flush: pick a bucket and flush it to disk from both sources
Incremental left flush will perform fewer disk I/Os
Incremental symmetric flush may have lower latency, since both relations continue to be processed in parallel
Discussion 2
When this paper was written, perhaps it was okay to claim that “the sizes of most data integration queries are expected to be only moderately large”. But does this hold today, especially with the coming era of “cloud computing”? Specifically:
1. Do double pipelined hash joins seem efficient enough for today’s data?
2. Would you use double pipelined hash joins in non-data-integration applications?
Summary
Challenges for data integration and motivations for Tukwila
General Tukwila architecture
Interleaving of planning and execution
Dynamic collector operator
Double pipelined hash join
Eddies: Continuously Adaptive Query Processing
Ran Avnur, Joseph M. Hellerstein
University of California, Berkeley
Previously presented by Hongrae Lee
Modified by Ben Zhu, Yidan Liu
Presenter: Ben Zhu Discussion Leader: Yidan Liu
Outline
Introduction
Reorderability of plans
Rivers and Eddies
Routing tuples in Eddies
Summary
Static Query Processing
Traditional query processing scheme:
1. Optimize a query
2. Execute a static query plan
This traditional scheme is not appropriate for large-scale, widely-distributed information resources or massively parallel database systems!
New Requirements
Increased complexity in large-scale systems:
Hardware and workload
Data
User interface
We want query execution plans:
To be reoptimized regularly during query processing
To allow the system to adapt dynamically to fluctuations in computing resources, data characteristics, and user preferences
Run-Time Fluctuations
Three properties vary during query processing:
The costs of operators
Their selectivities
The rates at which tuples arrive from the inputs
The first and third issues commonly occur in wide-area environments; the second commonly arises due to correlations between predicates and the order of tuple delivery
These issues may become more common in cluster (shared-nothing) systems
Discussion 1
“We favor adaptability over best-case performance”
1. Does this seem reasonable? In this case? In general?
2. Is adaptivity needed only when best-case statistics are missing, or could it also be a general strategy in regular query processing? Do you think it is good or bad to apply it in traditional query processing? Please give reasons or examples to support your opinion.
Eddy
Two Challenges for This Scheme
How can we reorder operators? (Reorderability of plans)
How should we route tuples? (Routing tuples in Eddies)
Reorderability of Plans
Synchronization Barriers
One task waits for other tasks to finish
Barriers limit concurrency, and hence performance
It is desirable to minimize the overhead of synchronization barriers in a dynamic performance environment
Issues affecting the overhead: the frequency of barriers and the gap between arrival times of the two inputs at the barrier
Reorderability of Plans
Moments of Symmetry
Points during join execution where the order of the inputs to the join can often be changed without modifying any state in the join
They allow reordering of the inputs to a single binary operator
The combination of commutativity and moments of symmetry allows for very aggressive reordering of a plan tree
Join Algorithms and Reordering
Constraints on reordering:
An unindexed join input must be ordered before the indexed input
Preserving ordered inputs
Some join algorithms work only for equijoins
Join algorithms in an Eddy: we favor join algorithms with
Frequent moments of symmetry
Adaptive or nonexistent barriers
Minimal ordering constraints
This rules out hybrid hash joins, merge joins, and nested loops joins
Choice: Ripple Join
Frequently-symmetric versions of traditional iteration, hashing, and indexing schemes
Favors adaptivity over best-case performance
Ripple Joins
Ripple joins:
Have moments of symmetry at each “corner” of a rectangular ripple
Are designed to allow changing rates for each input
Offer attractive adaptivity features at a modest overhead in performance and memory footprint
Variants: block, index, and hash ripple joins
Rivers and Eddies
River
A shared-nothing parallel query processing framework
Pre-optimization: a heuristic pre-optimizer must choose how to initially pair off relations into joins
An eddy in the River:
Is implemented via a module in a river
Encapsulates the scheduling of its participating operators
Explicitly merges multiple unary and binary operators into a single n-ary operator within a query plan
Each tuple is associated with a tuple descriptor containing a vector of Ready and Done bits
Routing Tuples in Eddies
An eddy module:
Directs the flow of tuples from the inputs through the various operators to the output
Provides the flexibility to route each tuple individually through the operators
The routing policy determines the efficiency
Naïve eddy
Tuples enter the eddy with low priority; when returned to the eddy from an operator, they are given high priority
Tuples flow completely through the eddy before new tuples are admitted, which prevents the eddy from being “clogged” with new tuples
Edges in a River dataflow graph are fixed-size queues, which produce back-pressure as in a fluid flow
Production along the input to any edge is limited by the rate of consumption at the output, so tuples are routed to the low-cost operator first
This makes the naïve eddy a cost-aware but selectivity-unaware policy
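The two-priority scheme can be sketched with a heap. This is a simplification under stated assumptions: operators are applied in list order here, whereas the real naïve eddy routes each tuple to whichever operator's input queue has room, relying on back-pressure.

```python
import heapq

def naive_eddy(new_tuples, operators):
    """Sketch of the naïve eddy's two-priority scheme: new tuples
    enter with low priority, and tuples returned by an operator get
    high priority, so in-flight tuples drain before new ones enter.
    `operators` is a list of predicates, applied in order here."""
    LOW, HIGH = 1, 0
    heap, seq = [], 0
    for t in new_tuples:
        heapq.heappush(heap, (LOW, seq, t, 0))  # i = next operator index
        seq += 1
    out = []
    while heap:
        _, _, t, i = heapq.heappop(heap)
        if i == len(operators):
            out.append(t)                        # passed every operator
            continue
        if operators[i](t):                      # operator returns the tuple
            heapq.heappush(heap, (HIGH, seq, t, i + 1))
            seq += 1
        # else: the operator drops the tuple
    return out
```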
Learning Selectivity: Lottery Scheduling
We want to track both:
Consumption (determined by cost)
Production (determined by cost and selectivity)
Lottery scheduling: maintain “tickets” for each operator
An operator’s chance of receiving the next tuple is proportional to its ticket count
The eddy can thereby track (learn) an ordering of the operators that gives good overall efficiency
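The credit/debit mechanism can be sketched as follows; the class and method names are illustrative assumptions, not the paper's implementation. An operator is credited a ticket when it consumes a tuple and debited when it produces (returns) one, so selective operators accumulate tickets and win more lotteries.

```python
import random

class LotteryEddy:
    """Sketch of lottery-based eddy routing (hypothetical API)."""

    def __init__(self, operators):
        self.tickets = {op: 1 for op in operators}  # start everyone at 1

    def route(self, tup, operators):
        remaining = list(operators)
        while remaining:
            # hold a lottery among operators this tuple has not yet visited
            weights = [self.tickets[op] for op in remaining]
            op = random.choices(remaining, weights=weights, k=1)[0]
            self.tickets[op] += 1      # credit: the operator consumed a tuple
            if not op(tup):
                return None            # tuple dropped; the credit is kept
            self.tickets[op] -= 1      # debit: the operator produced the tuple
            remaining.remove(op)
        return tup                     # tuple satisfied every predicate
```

A highly selective predicate keeps most of its credits, so over time it collects far more tickets than a predicate that passes everything, and the eddy routes new tuples to it first.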
(Figure: ticket flow between the eddy and an operator: a credit when the eddy routes a tuple to the operator, a debit when the operator returns it)
Responding to Dynamic Fluctuations
With cumulative tickets, eddies cannot adaptively react over time to changes in performance and data characteristics
Solution: use a window scheme instead of a point scheme
Banked tickets are used for running the lottery; escrow tickets measure efficiency during the current window
At the beginning of each window, the value of the escrow account replaces the banked account, and the escrow account is reset
This ensures that operators “re-prove themselves” each window
Some Experimental Results
Discussion 2
If you were to design an adaptive query processor, what are the possible tradeoffs you would need to balance?
Which would you rather use: Tukwila or Eddies? Why?
Summary
Eddies are:
A query processing mechanism that allows fine-grained, adaptive, online optimization
Beneficial in unpredictable query processing environments
Challenges:
To develop eddy “ticket” policies that can be formally proved to converge quickly
To attack the remaining static aspects
To harness the parallelism and adaptivity available in rivers
To explore the application of eddies and rivers to the generic space of dataflow programming
Thank you