distributed database systemsmaterials.dagstuhl.de/files/17/17262/17262.katjahose1.slides.pdf ·...

Distributed Database Systems

Distributed Database SystemsA classic perspective

Katja Hose

Department of Computer ScienceAalborg [email protected]

DagstuhlJune 27, 2017

Katja Hose Distributed Database Systems Dagstuhl, June 27, 2017 1 / 24

Introduction

What is distributed data management?

Example: distributed database system

Scenario

Company with offices in Aalborg, Berlin, New York, and Paris

Database of products and customers

Introduction

What is distributed data management?

Example: distributed database system

Definition: distributed database

A collection of multiple, logically interrelated databases distributed over acomputer network

Introduction

Challenges

Distributed database designFragmentation, replication, and distribution

Distributed query processingExecuting a query over the network in the most cost-effective way

Distributed concurrency controlSynchronizing access such that integrity is maintained

Reliability of distributed DBMSEnsure consistency, detect failures, and recover from failures

Heterogeneous databasesTranslation between database systems – data model, data language

Introduction

Challenges

Introduction

Challenges

Introduction

Challenges

Introduction

Challenges

Introduction

Challenges

Fragmentation and Allocation

1 IntroductionWhat is distributed data management?Challenges

2 Fragmentation and Allocation

3 Distributed query processingData localizationGlobal Query OptimizationJoin Order OptimizationQuery Execution

Fragmentation

Alternatives

Tuples vs. columns, i.e., horizontal vs. vertical

Horizontal fragmentationVertical fragmentation

Many algorithms proposed in the literature

Allocation

Golden rules of allocation

Place data as close as possible to where it will be used

Use load balancing to find a global optimization of systemperformance

Optimal allocation

Objective:Minimize storage (

∑S) and transfer costs (

∑T ) for P fragments and

K nodes

Optimization problem

Find minimum∑

S +∑

Often heuristics are applied

Allocation

Optimal allocation

K nodes

Find minimum∑

S +∑

Allocation

Optimal allocation

K nodes

Find minimum∑

S +∑

Distributed query processing

1 IntroductionWhat is distributed data management?Challenges

2 Fragmentation and Allocation

3 Distributed query processingData localizationGlobal Query OptimizationJoin Order OptimizationQuery Execution

Phases of distributed query processing

Data localization

Example – horizontal reduction

Schema

Projects1 = σBudget≤150.000(Projects)

Projects2 = σ150.000<Budget≤200.000(Projects)

Projects3 = σBudget>200.000(Projects)

Reconstruction expression (horizontal fragmentation)

Projects = Projects1 ∪Projects2 ∪Projects3

Example query

σLocation=′Aalborg′∧Budget≤100.000(Projects)

After replacing references to global relations

σLocation=′Aalborg′∧Budget≤100.000(Projects1 ∪Projects2 ∪Projects3)

Further optimization is possible!

Data localization

Schema

Example query

Data localization

Schema

Example query

Data localization

Schema

Example query

Data localization

Schema

Example query

Data localization

Objective

Eliminate non-necessary subqueries (contradiction between query andfragmentation predicate)

Query with fragmentation expressionσLocation=′Aalborg′∧Budget≤100.000(Projects1∪Projects2∪Projects3)

Fragment definitionsProjects1 = σBudget≤150.000(Projects)Projects2 = σ150.000<Budget≤200.000(Projects)Projects3 = σBudget>200.000(Projects)

Because ofσBudget≤100.000(Projects2) = ∅, σBudget≤100.000(Projects3) = ∅

We obtain the reduced queryσLocation=′Aalborg′(σBudget≤100.000(Projects1))

Data localization

Objective

Eliminate non-necessary subqueries (contradiction between query andfragmentation predicate)

Query with fragmentation expressionσLocation=′Aalborg′∧Budget≤100.000(Projects1∪Projects2∪Projects3)

Fragment definitionsProjects1 = σBudget≤150.000(Projects)Projects2 = σ150.000<Budget≤200.000(Projects)Projects3 = σBudget>200.000(Projects)

Because ofσBudget≤100.000(Projects2) = ∅, σBudget≤100.000(Projects3) = ∅

We obtain the reduced queryσLocation=′Aalborg′(σBudget≤100.000(Projects1))

Data localization

Example – join reduction

Query Projects on Assignment

After replacing global relations with reconstruction expressions

(Projects1 ∪Projects2 ∪Projects3) on (Assignment1 ∪Assignment2)

After applying the distributive law

(Projects1 on Assignment1) ∪ (Projects1 on Assignment2) ∪(Projects2 on Assignment1) ∪ (Projects2 on Assignment2) ∪

(Projects3 on Assignment1) ∪ (Projects3 on Assignment2)

Data localization

Query with fragmentation expression

Some of these partial joins are empty, e.g.:

Projects1 on Assignment2 = ∅

Because their fragmentation expressions contradict:

Projects1 = σPNo=′P1′∨PNo=′P2′(Projects) andAssignment2 = σPNo=′P3′∨PNo=′P4′(Assignment)

Reduced query

(Projects1 on Assignment1) ∪ (Projects2 on Assignment2) ∪(Projects3 on Assignment2)

Data localization

Query with fragmentation expression

Some of these partial joins are empty, e.g.:

Projects1 on Assignment2 = ∅

Because their fragmentation expressions contradict:

Projects1 = σPNo=′P1′∨PNo=′P2′(Projects) andAssignment2 = σPNo=′P3′∨PNo=′P4′(Assignment)

Reduced query

(Projects1 on Assignment1) ∪ (Projects2 on Assignment2) ∪(Projects3 on Assignment2)

Global Query Optimization

Global query optimization

Phases

1 Spanning the search space usingtransformation rules→ equivalent search plans

2 Applying a search strategy and acost model→ choose an efficient plan

Main focus: join trees and joinordering

Search space

Tree variants for join order optimization

Linear join trees

All inner nodes have at least one leaf node (base relation) as childReduces search space

Bushy trees

May have inner nodes with no base relation as childHigh potential for parallelization

⊲⊳

⊲⊳ R1

⊲⊳ R2

linear join tree

⊲⊳

⊲⊳ ⊲⊳

R1 R2 R3 R4

bushy join tree

Search strategy

Deterministic search strategy

Systematic generation of query plans

Starting with plans accessing the base relations

Constructing complex plans by combining easier plans, e.g., joiningone more relation at each step

Implementations

Exhaustive search guarantees finding the best plan

Dynamic programming

Greedy algorithm

Cost model

Detailed statistics

Calibrated parameters

Fomulas (cardinality estimation, processing costs, communicationcosts, etc.)

Query Execution

Site selection and data transfer

Query shipping

Query initiator (node at whichthe query is issued/optimized)sends the query to othernodes

Receiver nodes compute thequery result and ship the resultback to the initiator

Query Execution

Data shipping

Query remains at the initiator

Initiator sends data requestmessages to other nodes

Receiver nodes ship allrequired data to the initiator

Initiator computes result

Query Execution

Hybrid shipping

Initiator sends partial queriesto other nodes

Other nodes execute someparts of the query and sendintermediate results to theinitiator

Initiator executes remainingquery operations(post-processing)

Query Execution

Ship whole

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

nodeR: send data request message (relation S) to nodeS

nodeS : send requested data (relation S) to nodeR

Total costs: 2 messages, 18 attribute values

Query Execution

Ship whole

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Ship whole

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Ship whole

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Ship whole

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR: 2 messages, 18 attribute values

Execution at nodeS : 2 messages, 14 attribute values

Execution at nodeX : 4 messages, 18 + 14 = 32 attribute values

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

nodeR: send data request message (tuples of relation S with B = ‘7′) to nodeS

nodeS : send requested data (0 tuples of relation S with B = ‘7′) to nodeR

nodeS : send requested data (1 tuple of relation S with B = ‘1′) to nodeR

Total costs: 7 · 2 = 14 messages, 7 + 2 · 3 = 13 attribute values

Execution at nodeS : 6 · 2 = 12 messages, 6 + 2 · 2 = 10 attribute values

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Query Execution

Fetch as needed

R A B3 71 14 67 74 56 25 7

S B C D9 8 81 5 19 4 24 3 34 2 65 7 8

R on S A B C D1 1 5 14 5 7 8

Execution at nodeR

Execution at nodeS : 6 · 2 = 12 messages, 6 + 2 · 2 = 10 attribute valuesKatja Hose Distributed Database Systems Dagstuhl, June 27, 2017 20 / 24

Query Execution

Ship whole vs. fetch as needed

Conclusion

Fetch as needed results in a high number of messages

Ship whole results in high amounts of transferred data

More advanced join execution algorithms

Semijoin

Bitvector join

Query Execution

Ship whole vs. fetch as needed

Conclusion

Fetch as needed results in a high number of messages

Ship whole results in high amounts of transferred data

More advanced join execution algorithms

Semijoin

Bitvector join

Query Execution

Semijoin

Query Execution

Bitvector join

Summary

Query optimization in distributed database systems benefits from globalknowledge and control.

We can decide on fragmentation and allocation

Create detailed statistics

Finetune a cost model

Optimize join processing (Semijoin, Bitvector join,. . . )

Further interesting topics

Distributed concurrency control

Reliability and failure

Heterogeneity and data integration

Summary

Query optimization in distributed database systems benefits from globalknowledge and control.

We can decide on fragmentation and allocation

Create detailed statistics

Finetune a cost model

Optimize join processing (Semijoin, Bitvector join,. . . )

Further interesting topics

Distributed concurrency control

Reliability and failure

Heterogeneity and data integration

. . .Katja Hose Distributed Database Systems Dagstuhl, June 27, 2017 24 / 24

References

This talk involves material based on the following sources:

M. Tamer Ozsu, P. Valduriez. Principles of Distributed Database Systems.Third Edition, Springer, 2011.

P. Dadam. Verteilte Datenbanken und Client/Server-Systeme.Springer-Verlag, Berlin, Heidelberg 1996.

Kai-Uwe Sattler Vorlesung: Verteilte Datenbanksysteme.

distributed database systemsmaterials.dagstuhl.de/files/17/17262/17262.katjahose1.slides.pdf ·...

Documents