TRANSCRIPT
An Adaptive Query Execution Engine for Data Integration
Zachary Ives, Daniela Florescu, Marc Friedman, Alon Levy, Daniel S. Weld
University of Washington
Slides by Peng Li, Modified by Rachel Pottinger
Modified by Ben Zhu, Yidan Liu
Presenter: Ben Zhu Discussion Leader: Yidan Liu
Outline
Motivations for Tukwila
Tukwila Architecture
Interleaving of planning and execution
Adaptive Query Operators
Dynamic Collector
Double Pipelined Join
Summary
Why data integration?
The goal is to provide a uniform query interface to a multitude of data sources.
The key advantage of data integration is that it frees users from having to do the following:
locate the sources relevant to their query
interact with each source independently
manually combine the data from the different sources
The main challenges in the design of data integration systems (DISs):
Query reformulation
The construction of wrapper programs
Query optimizers and efficient query execution engines
The need for adaptivity
Absence of statistics
Unpredictable data arrival characteristics
Overlap and redundancy among sources
Optimizing the time to the initial answers to the query
Network bandwidth generally constrains the data sources to be “small”
Tukwila Architecture
Interleaving of planning and execution
Novel features of Tukwila:
The optimizer can create a partial plan if essential statistics are missing or uncertain
The optimizer generates both annotated operator trees and appropriate event-condition-action rules
The optimizer conserves the state of its search space when it calls the execution engine, so it is able to resume optimization in an incremental fashion
Interleaving of planning and execution – Query plans
Operators in Tukwila are organized into pipelined units called fragments
A plan includes a partially-ordered set of fragments and a set of global rules
A fragment consists of a fully pipelined tree of physical operators and a set of local rules
At the end of a fragment, results are materialized, and the rest of the plan can be re-optimized or rescheduled
Interleaving of planning and execution - Rules
Rules are the key mechanism for implementing several kinds of adaptive behavior in Tukwila:
Re-optimization: if the optimizer’s cardinality estimate for the fragment’s result is significantly different from the actual size, re-invoke the optimizer
Contingent planning: check properties of the result to select the next fragment
Adaptive operators: set the policy for memory overflow resolution and for collectors
Rescheduling: reschedule if a source times out
Interleaving of planning and execution - Rule format
when event if condition then actions
Example:
when closed(frag1)
if card(join1) > 2 * est_card(join1)
then replan
An event triggers a rule, causing it to check its condition. If the condition is true, the rule fires, executing the actions.
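This trigger-check-fire cycle for the example rule can be sketched as data plus a tiny interpreter. This is an illustrative sketch only, not Tukwila's implementation; the statistics dictionary keys and the replan action are hypothetical stand-ins.

```python
# Hedged sketch of event-condition-action rule evaluation; the
# statistics keys and the replan action are illustrative, not Tukwila's API.
def make_rule(event, condition, actions):
    return {"event": event, "condition": condition, "actions": actions}

def handle(rule, event, stats, log):
    # An event triggers the rule, causing it to check its condition;
    # if the condition is true, the rule fires and executes its actions.
    if event == rule["event"] and rule["condition"](stats):
        for action in rule["actions"]:
            action(stats, log)

def replan(stats, log):
    log.append("replan")  # stand-in for re-invoking the optimizer

# The slide's example rule: replan when frag1 closes with a result
# more than twice its estimated cardinality.
rule = make_rule(
    event="closed(frag1)",
    condition=lambda s: s["card(join1)"] > 2 * s["est_card(join1)"],
    actions=[replan],
)
```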
Discussion 1
Pick one of the following motivating situations for Tukwila:
Absence of statistics
Unpredictable data arrival characteristics
Overlap and redundancy among sources
Optimizing the time to initial answers
Q1: Can you give some examples where the chosen topic matters?
Q2: If you were a member of the Tukwila team, what rules or policies would you design to deal with the problem?
Form 4 groups (3-4 people per group, one group per topic); discuss Q1 and Q2 for one topic (5-7 minutes)
Interleaving of planning and execution – Query Execution
Executes a query plan (basic function)
Gathers statistics about each operation
Handles exception conditions or re-invokes the optimizer
Interleaving of planning and execution – Event Handling
Interprets rules attached to the query execution plan
The execution system may generate events at any time, which are fed into an event queue
For each event, the event handler uses a hash table to find all matching rules in the active set
For each matching rule, it evaluates the conditions, and if they are satisfied, all of the rule’s actions are executed
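The dispatch loop described above can be sketched as follows; class and method names here are hypothetical, chosen only to mirror the slide's description of the queue and hash-table lookup.

```python
from collections import defaultdict, deque

class EventHandler:
    """Illustrative sketch of Tukwila-style event handling; names
    are assumptions, not the system's actual interfaces."""

    def __init__(self):
        self.active = defaultdict(list)  # hash table: event type -> active rules
        self.queue = deque()             # events may be generated at any time

    def register(self, event_type, condition, action):
        self.active[event_type].append((condition, action))

    def post(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        while self.queue:
            event_type, payload = self.queue.popleft()
            # hash-table lookup finds all matching rules in the active set
            for condition, action in self.active[event_type]:
                if condition(payload):   # evaluate the rule's condition
                    action(payload)      # satisfied: execute the rule's action
```

For example, a timeout rule could be registered for a particular source and would fire only when that source's event arrives.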
Adaptive Query Operators - Dynamic Collectors
Performs a union over a large number of overlapping sources
Standard union operators can’t handle errors or decide to ignore slow mirror data sources
Tukwila treats the task as a primitive operator so that it can provide guidance about the order in which data sources should be accessed
A key distinguishing aspect: the flexibility to contact only some of the sources
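A minimal sketch of this idea, under stated assumptions: sources are modeled as zero-argument callables ordered by the optimizer's access preference, and the stopping policy is a simple minimum-source count rather than Tukwila's actual rule-driven policies.

```python
def collect(sources, min_sources=1):
    """Illustrative dynamic collector: a union over overlapping
    sources that tolerates per-source errors and may contact only
    some of the sources. Names and policy are assumptions, not
    Tukwila's actual operator."""
    seen = set()
    answered = 0
    for fetch in sources:
        try:
            for tup in fetch():
                if tup not in seen:      # eliminate overlap between mirrors
                    seen.add(tup)
                    yield tup
        except OSError:
            continue                      # a failed source is skipped, not fatal
        answered += 1
        if answered >= min_sources:
            break                         # contact only as many sources as needed
```

With a failing first source and two overlapping mirrors, the collector skips the error and stops as soon as enough sources have answered.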
Adaptive Query Operators - Double Pipelined Join
Issues with conventional joins:
Sort-merge joins and indexed joins cannot be pipelined; they require an initial sorting or indexing step
Nested loops joins and hash joins follow an asymmetric execution model:
For nested loops joins, we must wait for the entire inner table to be transmitted before pipelining begins
For hash joins, we must load the entire inner relation into a hash table before we can pipeline
Adaptive Query Operators - Double Pipelined Join
Challenges in the context of data integration:
The optimizer may not know the relative size of each relation
The time to first tuple is important, so it may be better to use the larger data source as the inner relation if it sends data faster
The time to first tuple is extended by the hash join’s non-pipelined behavior while it is reading the inner relation
Adaptive Query Operators - Double Pipelined Hash Join
Symmetric and incremental
Produces tuples almost immediately and masks slow data source transmission rates
Data-driven in behavior: each join relation sends tuples through the join operator as quickly as possible
At any time, all of the data encountered so far has been joined and the resulting tuples have already been output
The trade-off is that it MUST hold hash tables for BOTH relations in memory
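The symmetric build-and-probe behavior can be sketched in a single generator. This is a simplified sketch: it alternates between the inputs to simulate data-driven arrival, whereas the real operator uses separate threads feeding a tuple transfer queue.

```python
from collections import defaultdict

def double_pipelined_join(left, right, key_left, key_right):
    """Sketch of a symmetric (double pipelined) hash join.
    Each arriving tuple is inserted into its own side's hash table
    and immediately probes the other side's table, so matches are
    emitted as soon as both partners have arrived."""
    left_ht, right_ht = defaultdict(list), defaultdict(list)
    left_it, right_it = iter(left), iter(right)
    done_left = done_right = False
    while not (done_left and done_right):
        if not done_left:
            try:
                t = next(left_it)
                left_ht[key_left(t)].append(t)          # build left
                for r in right_ht.get(key_left(t), []): # probe right
                    yield (t, r)
            except StopIteration:
                done_left = True
        if not done_right:
            try:
                t = next(right_it)
                right_ht[key_right(t)].append(t)        # build right
                for l in left_ht.get(key_right(t), []): # probe left
                    yield (l, t)
            except StopIteration:
                done_right = True
```

Note that both hash tables live in memory for the whole join, which is exactly the trade-off the slide points out.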
Adaptive Query Operators - Double Pipelined Hash Join
Two problems occur:
It follows a data-driven, bottom-up execution model, while Tukwila is top-down and iterator-based
Solved by multithreading: the join consists of separate threads for the output, the left child, and the right child; as each child reads tuples, it places them into a small tuple transfer queue
It requires enough memory to hold both join relations
Adaptive Query Operators - Double Pipelined Hash Join
Handling memory overflow: take some portion of the hash table and swap it to disk
Incremental left flush: when the system runs out of memory, switch to reading only tuples from the right-side relation, flushing a bucket from the left-side relation’s hash table as necessary; this gradually degrades into hybrid hash join, flushing buckets lazily
Incremental symmetric flush: pick a bucket and flush it to disk from both sources
Incremental left flush will perform fewer disk I/Os
Incremental symmetric flush may have lower latency, since both relations continue to be processed in parallel
Discussion 2
When this paper was written, perhaps it was okay to claim that “the sizes of most data integration queries are expected to be only moderately large”. But does this hold today, especially with the coming era of “cloud computing”? Specifically:
1. Do double pipelined hash joins seem efficient enough for today’s data?
2. Would you use double pipelined hash joins in non-data-integration applications?
Summary
Challenges for data integration and motivations for Tukwila
General Tukwila architecture
Interleaving of planning and execution
Dynamic collector operator
Double pipelined hash join
Eddies: Continuously Adaptive Query Processing
Ran Avnur, Joseph M. Hellerstein
University of California, Berkeley
Previously presented by Hongrae Lee
Modified by Ben Zhu, Yidan Liu
Presenter: Ben Zhu Discussion Leader: Yidan Liu
Outline
Introduction
Reorderability of plans
Rivers and Eddies
Routing tuples in Eddies
Summary
Static Query Processing
Traditional query processing scheme:
1. Optimize a query
2. Execute a static query plan
This traditional scheme is not appropriate for large-scale, widely-distributed information resources or massively parallel database systems!
New Requirements
Increased complexity in large-scale systems:
Hardware and workload
Data
User interface
We want query execution plans:
To be reoptimized regularly during query processing
To allow the system to adapt dynamically to fluctuations in computing resources, data characteristics, and user preferences
Run-Time Fluctuations
Three properties vary during query processing:
The costs of operators
Their selectivities
The rates at which tuples arrive from the inputs
The first and third issues commonly occur in wide-area environments; the second commonly arises due to correlations between predicates and the order of tuple delivery
These issues may become more common in cluster (shared-nothing) systems
Discussion 1
“We favor adaptability over best-case performance”
1. Does this seem reasonable? In this case? In general?
2. Is adaptivity needed only when best-case statistics are missing, or could it also be a general strategy in regular query processing? Do you think it is good or bad to apply it in traditional query processing? Please give reasons or examples to support your opinion.
Eddy
Two Challenges for This Scheme
How can we reorder operators? (Reorderability of plans)
How should we route tuples? (Routing tuples in Eddies)
Reorderability of Plans
Synchronization Barriers
One task waits for other tasks to finish
Barriers limit concurrency, and hence performance
It is desirable to minimize the overhead of synchronization barriers in a dynamic performance environment
Issues affecting the overhead: the frequency of barriers and the gap between arrival times of the two inputs at the barrier
Reorderability of Plans
Moments of Symmetry
Points during join execution where the order of the inputs to the join can often be changed without modifying any state in the join
They allow reordering of the inputs to a single binary operator
The combination of commutativity and moments of symmetry allows for very aggressive reordering of a plan tree
Join Algorithms and Reordering
Constraints on reordering:
An unindexed join input must be ordered before the indexed input
Preserving ordered inputs
Some join algorithms work only for equijoins
Join algorithms in an Eddy: we favor join algorithms with
Frequent moments of symmetry
Adaptive or nonexistent barriers
Minimal ordering constraints
This rules out hybrid hash joins, merge joins, and nested loops joins
Choice: Ripple Join
Frequently-symmetric versions of traditional iteration, hashing, and indexing schemes
Favors adaptivity over best-case performance
Ripple Joins
Ripple joins:
Have moments of symmetry at each “corner” of a rectangular ripple
Are designed to allow changing rates for each input
Offer attractive adaptivity features at a modest overhead in performance and memory footprint
Variants: block, index, and hash ripple joins
Rivers and Eddies
River
A shared-nothing parallel query processing framework
Pre-optimization: a heuristic pre-optimizer must choose how to initially pair off relations into joins
An eddy in the River:
Is implemented via a module in a river
Encapsulates the scheduling of its participating operators
Explicitly merges multiple unary and binary operators into a single n-ary operator within a query plan
Each tuple is associated with a tuple descriptor containing a vector of Ready and Done bits
Routing Tuples in Eddies
An eddy module:
Directs the flow of tuples from the inputs through the various operators to the output
Provides the flexibility to route each tuple individually through the operators
The routing policy determines the efficiency
Naïve eddy
Tuples enter the eddy with low priority; when returned to the eddy from an operator, they are given high priority
Tuples flow completely through the eddy before new tuples are admitted, which prevents the eddy from being “clogged” with new tuples
Edges in a River dataflow graph are fixed-size queues, which produce back-pressure as in a fluid flow
Production along the input to any edge is limited by the rate of consumption at the output, so tuples are routed to the low-cost operator first
This makes the naïve eddy a cost-aware but selectivity-unaware policy
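The two-priority scheme can be sketched with a heap. This is a simplification under stated assumptions: operators are applied in list order here, whereas the real naïve eddy routes each tuple to whichever operator's input queue has room, relying on back-pressure.

```python
import heapq

def naive_eddy(new_tuples, operators):
    """Sketch of the naïve eddy's two-priority scheme: new tuples
    enter with low priority, and tuples returned by an operator get
    high priority, so in-flight tuples drain before new ones enter.
    `operators` is a list of predicates, applied in order here."""
    LOW, HIGH = 1, 0
    heap, seq = [], 0
    for t in new_tuples:
        heapq.heappush(heap, (LOW, seq, t, 0))  # i = next operator index
        seq += 1
    out = []
    while heap:
        _, _, t, i = heapq.heappop(heap)
        if i == len(operators):
            out.append(t)                        # passed every operator
            continue
        if operators[i](t):                      # operator returns the tuple
            heapq.heappush(heap, (HIGH, seq, t, i + 1))
            seq += 1
        # else: the operator drops the tuple
    return out
```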
Learning Selectivity: Lottery Scheduling
We want to track both:
Consumption (determined by cost)
Production (determined by cost and selectivity)
Lottery scheduling: maintain “tickets” for each operator
An operator’s chance of receiving the next tuple is proportional to its ticket count
The eddy can thereby track (learn) an ordering of the operators that gives good overall efficiency
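The credit/debit mechanism can be sketched as follows; the class and method names are illustrative assumptions, not the paper's implementation. An operator is credited a ticket when it consumes a tuple and debited when it produces (returns) one, so selective operators accumulate tickets and win more lotteries.

```python
import random

class LotteryEddy:
    """Sketch of lottery-based eddy routing (hypothetical API)."""

    def __init__(self, operators):
        self.tickets = {op: 1 for op in operators}  # start everyone at 1

    def route(self, tup, operators):
        remaining = list(operators)
        while remaining:
            # hold a lottery among operators this tuple has not yet visited
            weights = [self.tickets[op] for op in remaining]
            op = random.choices(remaining, weights=weights, k=1)[0]
            self.tickets[op] += 1      # credit: the operator consumed a tuple
            if not op(tup):
                return None            # tuple dropped; the credit is kept
            self.tickets[op] -= 1      # debit: the operator produced the tuple
            remaining.remove(op)
        return tup                     # tuple satisfied every predicate
```

A highly selective predicate keeps most of its credits, so over time it collects far more tickets than a predicate that passes everything, and the eddy routes new tuples to it first.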
(Figure: ticket flow between the eddy and an operator: a credit when the eddy routes a tuple to the operator, a debit when the operator returns it)
Responding to Dynamic Fluctuations
With cumulative tickets, eddies cannot adaptively react over time to changes in performance and data characteristics
Solution: use a window scheme instead of a point scheme
Banked tickets are used for running the lottery; escrow tickets measure efficiency during the current window
At the beginning of each window, the value of the escrow account replaces the banked account, and the escrow account is reset
This ensures that operators “re-prove themselves” each window
Some Experimental Results
Discussion 2
If you were to design an adaptive query processor, what are the possible tradeoffs you would need to balance?
Which would you rather use: Tukwila or Eddies? Why?
Summary
Eddies are:
A query processing mechanism that allows fine-grained, adaptive, online optimization
Beneficial in unpredictable query processing environments
Challenges:
To develop eddy “ticket” policies that can be formally proved to converge quickly
To attack the remaining static aspects
To harness the parallelism and adaptivity available in rivers
To explore the application of eddies and rivers to the generic space of dataflow programming
Thank you