algorithmic aspects of parallel query processingsuciu/tutorial-sigmod... · 2018. 6. 16. ·...
TRANSCRIPT
![Page 1: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/1.jpg)
Algorithmic Aspects of Parallel Query Processing
Paris Koutris – U. of WisconsinSemih Salihoglu – U. of Waterloo
Dan Suciu – U. of Washington
![Page 2: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/2.jpg)
Motivation• Most modern data analytics tools process data on a
cluster:– Spark, Dremel, Redshift, Myria, Hive, Impala, Scope,
Flink, etc• Reason: use sufficiently many nodes to avoid having to
spill intermediate results to disk• Consequence: data is processed on clusters with x10-
x1000 nodes
2
This tutorial: basic data processing algorithms on a large cluster
![Page 3: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/3.jpg)
3
Tutorial based on this survey
https://tinyurl.com/y99w99b4
Free until June 18(create account)
![Page 4: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/4.jpg)
Outline• Models of parallel computation (Dan)
• Two-way joins (Paris)
• Multi-way joins (Paris+Semih)
• Sorting & Matrix multiplication (Paris+Semih)
• Conclusion (Dan)
4
![Page 5: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/5.jpg)
Models
• Abstract model to analyze algorithms
• Massively Parallel Communication (MPC)(simplified BSP model [VALIANT ’90])
– Cluster of nodes (=servers, =processors)
– Computation = several rounds
– Each round = processing + communication
• Shared-nothing architecture
5
![Page 6: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/6.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Input (size=IN)
Input data = size IN
Number of servers = p
O(IN/p) O(IN/p)
![Page 7: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/7.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Input (size=IN)
Input data = size IN
One round = Compute & communicate
Number of servers = p
O(IN/p) O(IN/p)
![Page 8: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/8.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
Round 1
Input data = size IN
One round = Compute & communicate
Number of servers = p ≤L ≤L
O(IN/p) O(IN/p)
![Page 9: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/9.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
Round 1
Round 2
Input data = size IN
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
![Page 10: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/10.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
![Page 11: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/11.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
![Page 12: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/12.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
![Page 13: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/13.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
Cost: Ideal Practical ε∈(0,1) Naïve 1 Naïve 2Load L L = IN/p L = IN/p1-ε L = IN L = IN/pRounds r 1 O(1) 1 p
![Page 14: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/14.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
Cost: Ideal Practical ε∈(0,1) Naïve 1 Naïve 2Load L L = IN/p L = IN/p1-ε L = IN L = IN/pRounds r 1 O(1) 1 p
![Page 15: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/15.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
Cost: Ideal Practical ε∈(0,1) Naïve 1 Naïve 2Load L L = IN/p L = IN/p1-ε L = IN L = IN/pRounds r 1 O(1) 1 p
![Page 16: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/16.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
Cost: Ideal Practical ε∈(0,1) Naïve 1 Naïve 2Load L L = IN/p L = IN/p1-ε L = IN L = IN/pRounds r 1 O(1) 1 p
![Page 17: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/17.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p ≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
Cost: Ideal Practical ε∈(0,1) Naïve 1 Naïve 2Load L L = IN/p L = IN/p1-ε L = IN L = IN/pRounds r 1 O(1) 1 p
![Page 18: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/18.jpg)
Massively Parallel Communication Model (MPC)
Server 1 Server p. . . .
Server 1 Server p. . . .
Server 1 Server p. . . .
Input (size=IN)
. . . .
Round 1
Round 2
Round 3. . . .
Input data = size IN
Max communication load / round / server = L
Algorithm = Several rounds
One round = Compute & communicate
Number of servers = p≤L
≤L
≤L
≤L
≤L
≤L
O(IN/p) O(IN/p)
Cost: Ideal Practical ! ∈ (0,1) Naïve 1 Naïve 2Load L L = IN/p L = IN/p1-ε L = IN L = IN/pRounds r 1 O(1) 1 p
![Page 19: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/19.jpg)
Discussion: Traditional Models• Circuits ≈ oblivious MPC
– Circuit-size = p×r, Depth = r, Fan-in = L
• PRAM: shared-memory, p processors– Brent’s theorem: Tp = O(Circuit-size/p + Depth)
• BSP [VALIANT ’90]: shared-nothing– detailed communication cost– MPC removes those details
19
![Page 20: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/20.jpg)
Summary of the Model
• MPC– shared nothing– All-to-all communication
• Two cost parameters:– L (load) = max communication at each server– r (number of rounds)
20
![Page 21: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/21.jpg)
Outline• Models of parallel computation (Dan)
• Two-way joins (Paris)
• Multi-way joins (Paris+Semih)
• Sorting & Matrix multiplication (Paris+Semih)
• Conclusion (Dan)
21
![Page 22: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/22.jpg)
2-way Joins
22
Join(x,y,z) = R(x,y) ⋈ S(y,z)SELECT * FROM R , S WHERE R.y = S.y ;
We will see 2 types of techniques for a parallel join:• hash-based join• sort-based join
![Page 23: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/23.jpg)
Parallel Hash Join
Round #1 computation: each server• computes the join R(x,y) ⋈ S(y,z) of the local instances
Choose a hash function h that maps values from the domain to one of the p servers
Round #1 communication: each server• sends record R(x,y) to server h(y)• sends record S(y,z) to server h(y)
23
x ya b
a c
b c
y zb d
b e
c e
R SJoin(x,y,z) = R(x,y) ⋈ S(y,z)• IN = |R| + |S| • p servers
![Page 24: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/24.jpg)
A Simple Analysis w/o Skew
Suppose that every value of y appears at most once in the database (no skew). Then:
In other words, for large enough input, with high probability we have load:
24
! = #(%&/()
Pr[L (1 + )IN
p] pe
2IN3p
How far away is the maximum load L from the expected load IN/p ?
![Page 25: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/25.jpg)
A Simple Analysis with SkewSuppose now that every value of y appears exactly d times in the database. Then:
25
Pr[L (1 + )IN
p] pe
2IN3pd
The exponent has now an additional factor d
! = Θ(%&/()! << %&/( ! >> %&/(
, = -(%&/() , = -(%& log(() / () 12 31456 12 , = %&
![Page 26: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/26.jpg)
The Effect of Skew
26
0
1
2
3
4
5
6
7
8
9
10
50 100 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 950 1000
degree
threshold(in
millions)
numberofprocessors (p)
• IN = 100 billion tuples• at most 30% over the expected load IN/p with probability 95%
As the number p ofservers grows, it ismore likely that weobserve the effectsof skew!
d(in million)
pp = 100d = 4,000,000
p = 1,000d = 10,000
![Page 27: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/27.jpg)
Skew in the Extreme
27
Product(x,z) = R(x) ⋈ S(z) SELECT * FROM R , S ;
• In the extreme, all tuples in R and all tuples in S have the same value for attribute y
• Parallel hash-join will incur a load " = $% in this case• We can do better by observing that the join then
degenerates to a Cartesian product:
![Page 28: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/28.jpg)
• Choose shares p1, p2 s.t. ! = !1 × !2• Arrange servers in a !1 × !2 rectangle
Round #1 communication: each server• sends a tuple from R to a random row• sends a tuple from S to a random columnRound #1 computation: each server• computes the Cartesian product locally
Optimal choice for p1, p2: |'| / !1 = |)| / !2
Cartesian Product1 2 p2
1
2
p1
R(x) à
S(z) à
Product(x,z) = R(x) ⋈ S(z)
L = 2
s|R||S|
p
• OUT = |R| * |S|• The above 1-round algorithm is optimal for the load• When |R| << |S|, the algorithm broadcasts R and partitions only S
28
![Page 29: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/29.jpg)
Parallel Join for Arbitrary Skew
• For inputs with arbitrary skew, we need to combine the 2 techniques: parallel hash join + Cartesian product
• Key concept: heavy hitter
Heavy hitter: any value of the join attribute y that occurs at least !"/$ times in R or S
A value that is not heavy hitter is called light hitter
29
![Page 30: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/30.jpg)
Parallel Join for Arbitrary SkewAlgorithm1. Run the parallel hash join for the light hitter values
load ! = #(%&/()2. For each heavy hitter bi compute the Cartesian product
of the subquery R(x, bi) ⋈ S(bi, z) using pi exclusive servers
By choosing the pi appropriately such that their sum is p, we can get:
30
L = O
sOUT
p+
IN
p
!
![Page 31: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/31.jpg)
Parallel Sort Join
31
Algorithm [HU ET AL. ‘17]1. Union the two relations R,S2. Parallel sort the result using the value of the join
attribute as sort key3. We distinguish two cases for a value of y:
– if all tuples with value b are in the same server, the join can be computed locally
– for values that cross multiple servers, we apply the Cartesian product algorithm
L = O
sOUT
p+
IN
p
!
![Page 32: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/32.jpg)
2-way Joins in Practice
32
• Parallel Hash Join [SparkSQL, Hive, Myria, Impala, …]– most commonly used join algorithm
• Broadcast Join [Hive, Impala, SparkSQL]– if one relation is much smaller than the other relation,
broadcast it to every server• Parallel Sort Join [SparkSQL]
Note: the choice of the local join algorithm is independentof the parallel algorithm!
![Page 33: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/33.jpg)
Outline• Models of parallel computation (Dan)
• Two-way joins (Paris)
• Multi-way joins (Paris+Semih)
• Sorting & Matrix multiplication (Paris+Semih)
• Conclusion (Dan)
33
![Page 34: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/34.jpg)
From 2-way to Multiway Joins
The triangle query • Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)• |R| = |S| = |T| = N tuples• IN = 3 N
• Algorithm introduced by [AFRATI AND ULLMAN ’10]– Named later Shares Algorithm– For MapReduce
• Analyzed/optimized [BEAME ET AL. ’13,’14]– HyperCube Algorithm – For the MPC model
34
The triangle query can be computed in one round!
![Page 35: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/35.jpg)
Triangles in One Round
• Place servers in a ! ⁄# $ × ! ⁄# $ × ! ⁄# $ cube• Each server is identified by a coordinate (', ), *)• Choose 3 random, independent hash functions hx, hy, hz
35
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
Δ(a,b,c)@(hx(a),hy(b),hz(c))
R(a,b)->(hx(a),hy(b),*)
S(b,c)->(*,hy(b),hz(c))
T(c,a)->(hx(a),*,hz(c))
x
y
z
![Page 36: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/36.jpg)
Analysis for Triangles
36
Can we compute triangles with ! = #/%?• No! [BEAME ET AL. ‘13]• In fact, any 1-round algorithm has load
! = Ω(#/%2/3), even on inputs without skew
Theorem The HyperCube algorithm computes triangles with load ! = ,(#/% ⁄. /) w.h.p. on any input database without skew
• Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)• |R| = |S| = |T| = N tuples
![Page 37: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/37.jpg)
Multiway Joins
HyperCube Algorithm• organize the p servers in a p1 × p2 × … × pk hypercube• choose k independent hash functions
Round #1 communication: send Sj(xj1, xj2, …) to allservers whose coordinates agree with hj1(xj1), hj2(xj2), …Round #1 computation: compute Q locally on every server
Q(x1, x2, …, xk ) = S1(x1) ⋈ S2(x2) ⋈ … ⋈ Sl (xl)
How do we choose the shares so that we minimize L?
37
shares
![Page 38: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/38.jpg)
Choosing the SharesQ(x1, x2, …, xk ) = S1(x1) ⋈ S2(x2) ⋈ … ⋈ Sl (xl)
• the shares must satisfy
• #tuples a server receives from Sj =
• To optimize load, we minimize
Y
i
pi p
|Sj |Qi:xi2Sj
pi
maxj
|Sj |Qi:xi2Sj
pi
38
![Page 39: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/39.jpg)
Refresher: Covers & Packings
!"#$ Σ" &" = !()* Σ+ ,+ = - ∗
Fractional vertex coverweights &1, &2, … &3 ≥ 0 s.t.for all Sj: ∑7:9: ∈ <= &" ≥ 1Fractional edge packingweights ,1, ,2, … ,> ≥ 0 s.t.for all xi: ∑?:9: ∈ <= ,? ≤ 1
Q(x1, x2, …, xk ) = S1(x1) ⋈ S2(x2) ⋈ … ⋈ Sl (xl)
39
½
½
½
½½
½
![Page 40: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/40.jpg)
Optimal Load
[Beame’14]
Q(x1, x2, …, xk ) = S1(x1) ⋈ S2(x2) ⋈ … ⋈ Sl (xl)
Theorem The HyperCube algorithm computes a join query in one round with load
The load achieved is optimal
* The load analysis works only for data without skew
L = maxedge packing u
Qtj=1 |Sj |uj
p
!1/P
j uj
40
When |S1| = |S2| = ... = N then " = $ / & ⁄( )∗
![Page 41: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/41.jpg)
Example - Equal Size
τ* = 3/2½
½
½01τ* = 1
R(x,y) ⋈ S(y,z) R(x,y) ⋈ S(y,z) ⋈ T(z,x)
L = N / p L = N / p3/2
When all relations have equal size (N), then the optimal load formula becomes " = $ / & ⁄( )∗
41
![Page 42: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/42.jpg)
Example - Unequal Size
Edge packinguR, uS, uT
Maximum load L Max when HyperCube Shares
1/2, 1/2, 1/2 (|R| |S| |T|)1/3 / p2/3
1, 0, 0 |R| / p
0, 1, 0 |S| / p
0, 0, 1 |T| / p
0, 0, 0 0
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
L = max of these values42
L = maxu
|R|uR |S|uS |T |uT
p
1/(uR+uS+uT )
![Page 43: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/43.jpg)
Example - Unequal Size
Edge packinguR, uS, uT
Maximum load L Max when HyperCube Shares
1/2, 1/2, 1/2 (|R| |S| |T|)1/3 / p2/3 |R| ≈ |S| ≈ |T| px, py, pz > 1
1, 0, 0 |R| / p
0, 1, 0 |S| / p
0, 0, 1 |T| / p
0, 0, 0 0
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
L = max of these values43
L = maxu
|R|uR |S|uS |T |uT
p
1/(uR+uS+uT )
![Page 44: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/44.jpg)
Example - Unequal Size
Edge packinguR, uS, uT
Maximum load L Max when HyperCube Shares
1/2, 1/2, 1/2 (|R| |S| |T|)1/3 / p2/3 |R| ≈ |S| ≈ |T| px, py, pz > 1
1, 0, 0 |R| / p pz = 1
0, 1, 0 |S| / p
0, 0, 1 |T| / p
0, 0, 0 0
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
L = max of these values44
L = maxu
|R|uR |S|uS |T |uT
p
1/(uR+uS+uT )
|R|p
s|S||T |
p
![Page 45: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/45.jpg)
HyperCube Speedup
• Given by 1/#∑ %&, better than 1/# ⁄( )∗
• As p increases, speedup degrades to 1/# ⁄( )∗
45
speedup
p
![Page 46: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/46.jpg)
Skew Matters
• Skewed data significantly degrades the performance in distributed query processing
• Skewed values must be treated specially!
• State of the art in large scale distributed systems: DIY L
46
![Page 47: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/47.jpg)
The SkewHC algorithm
Def. Fix ! ⊆ $1, … , $(. The residual query Qx is obtained from Q by removing the variables x and the empty atoms
Def. A value is a heavy hitter if it occurs > N/p times
|S1| = |S2| = … = NQ(x1, x2, …, xk ) = S1(x1) ⋈ S2(x2) ⋈ … ⋈ Sl (xl)
47
Algorithm: in parallel, for every combination of heavy/light,compute the residual query for that combination
Theorem Let +∗ (.) = 12$! 3∗ (.!)• load of the algorithm is 4 = 5(6/8 ⁄: ;∗)• any algorithm needs load 4 = Ω(6/8 ⁄: ;∗)
![Page 48: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/48.jpg)
Example
x y z Residual query τ* L p1 × p2 × p3
light light light R(x,y) ⋈ S(y,z) ⋈ T(z,x) 3/2 N/p2/3 p1/3 × p1/3 × p1/3
. . . . . . .
Heavy hitter = a value that occurs at least N/p timesEach attribute has at most p heavy hitters
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
48
![Page 49: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/49.jpg)
Example
x y z Residual query τ* L p1 × p2 × p3
light light light R(x,y) ⋈ S(y,z) ⋈ T(z,x) 3/2 N/p2/3 p1/3 × p1/3 × p1/3
light light heavy R(x,y) ⋈ S(y) ⋈ T(x) 2 N/p1/2 p1/2 × p1/2 × 1
. . . . . . .
Heavy hitter = a value that occurs at least N/p timesEach attribute has at most p heavy hitters
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
49
![Page 50: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/50.jpg)
Example
x y z Residual query τ* L p1 × p2 × p3
light light light R(x,y) ⋈ S(y,z) ⋈ T(z,x) 3/2 N/p2/3 p1/3 × p1/3 × p1/3
light light heavy R(x,y) ⋈ S(y) ⋈ T(x) 2 N/p1/2 p1/2 × p1/2 × 1
light heavy heavy R(x) ⋈ T(x) 1 N/p p × 1 × 1
. . . . . . .
Heavy hitter = a value that occurs at least N/p timesEach attribute has at most p heavy hitters
Δ(x,y,z) = R(x,y) ⋈ S(y,z) ⋈ T(z,x)
50
![Page 51: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/51.jpg)
Summary
51
What about multiple rounds?
Query No Skew1 Round (!*)
Skew1 Round (ψ*)
!* = 3/2IN/p2/3
ψ* = 2IN/p1/2
!* = 1IN/p
ψ* = 2IN/p1/2
general IN/p1/τ* IN/p1/ψ*
x
y z
x y z
≤
≤
≤
![Page 52: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/52.jpg)
Multiple Rounds• Queries are typically executed in multiple rounds
SELECT cKey, month, sum(price)FROM Orders, CustomersGROUP BY cKey, month
• High-level Question:
§ What is the computational power of rounds?
• Upshot: Few results & many open questions for CQs
52
![Page 53: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/53.jpg)
1-round vs Multi-roundQuery No Skew
1 Round (!*)Skew
1 Round (ψ*)
!* = 3/2IN/p2/3
ψ* = 2IN/p1/2
!* = 1IN/p
ψ* = 2IN/p1/2
!* = 2IN/p1/2
ψ* = 2IN/p1/2
x
y z
x y z
yx
R(x),S(x, y),T(y)
≤
≤
≤
53
No Skew, Multi Rounds => Easy
![Page 54: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/54.jpg)
1-round vs Multi-round
Skew, Multi Round
Query No SkewMulti-Round
No Skew1 Round (!*)
Skew1 Round (ψ*)
IN/p!* = 3/2IN/p2/3
ψ* = 2IN/p1/2
IN/p!* = 1IN/p
ψ* = 2IN/p1/2
IN/p!* = 2IN/p1/2
ψ* = 2IN/p1/2
x
y z
x y z
yx
R(x),S(x, y),T(y)
≤
≤
≤
≤
≤
≤
"* =
"* =
"* = 1
No Skew, Multi Rounds => Easy
Tight for only some queriesGeneral queries open
AGM bound "*1 ≤ "*, !* ≤ ψ* 54
![Page 55: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/55.jpg)
Background: AGM BoundGiven Q: R1 ⋈ …⋈ Rn, what’s the max size of |OUT|?
Assume |Ri| are equal. Let be a fractional edge cover:
Then: dddddddddd
!e = (e1...en )
yx
R(x) S(x, y) T(y)
"* = 1
55
|OUT |≤ IN |!e|
|OUT |≤ IN 2
"* : weight of minimum fractional edge cover
|OUT |≤ IN ρ*
1 0 1
![Page 56: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/56.jpg)
Multi-round Communication LBSimple counting argument!
Round 1 Round 2 Round r…
…
L tuples L tuples L tuples…
Net
wor
k C
ard
Processor k
with !" tuples, we can output (!")%∗ tuples:So: '(!")%∗≥ )*%∗ => " ≥ -()*/(! ' ⁄0 %∗))
If ! = -(1), then " ≥ -()*/' ⁄0 %∗)56
![Page 57: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/57.jpg)
Extreme No Skew Multi-Round: Easy Case• Iterative binary joins at each round
• Ex:
• Any Q, with r = n-1 and optimal L
R(x,y) S(y, z)
T(z, x)TMP(x, y, z)
OUT(x, y, z)
Round 1:
Round 2:
S(y, z)1 n2 7… …n 3
L =O( INp)
L =O( INp)
x
y z
R(x, y)1 22 4… …n 9
T(z, x)1 52 6… …n 1
Intermediate relation sizes do not grow with binary joinsin the case of extreme no-skew 57
![Page 58: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/58.jpg)
Skew Multi-Round: Hard Case (1)• Iterative binary joins if Q decomposes into semijoins
• Ex 1: yx
L =O( INp+OUTp) =O( IN
p+INp) =O( IN
p)
L =O( INp)
Semijoins remove potential outputs each round
without growing intermediate relations
!* = 1
ψ* = 2Round 1:
Round 2:
R(x) S(x, y)
T(y)TMP(x, y)
OUT(x, y)R(x) T(y)S(x, y)
58
![Page 59: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/59.jpg)
Skew Multi-Round: Hard Case (2)• Ex 2:
• Decompose into “semijoin residual queries”
x
y z
SL(y, z)RL(x, y) TL(z, x)
Light values
S(y, 3)R(x, y) T(3, x)
2 round semijoin on p2/3 machinesL=O(IN/p2/3)
O(p1/3) many Heavy values(e.g. z = 3)
deg(h) < IN/p1/3?
yxq(z=3):
r: 2, L=O(IN/p2/3) => worst-case optimal
1 round HC on p machinesL=O(IN/p2/3)
59
Heavy-Light + Semijoins Alg
!* = 3/2
![Page 60: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/60.jpg)
Limitation of HL+Semijoins
Heavy Light+Semijoins: decomposes Q into``residual semijoin queries’’ (recursively) based on degrees
But only when arities ≤ 2 or several special casesOpen: L=O(IN/p1/!*) for general queries?
60
![Page 61: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/61.jpg)
Example Difficult Query
!* = 2ψ* = 3
x1 x2 x3
y1 y2 y3
R1
S1 S2 S3
R2
Open: L=O(IN/p1/2) in O(1) rounds?
61
![Page 62: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/62.jpg)
• !* can be very high for even small-size queries
Scalability Limitation of L=IN/p1/!* (1)
2x speedup requires 1024x more processors
62
A1 A20…A0R1 R2 R20
!* = 10
• HC or Heavy-Light+Semijoin give poor scalability
![Page 63: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/63.jpg)
• Iterative BJ might generate a lot of intermediate data
Scalability Limitation of L=IN/p1/!* (2)
Problem: |Ti| > p|IN|
better run 1 round & replicate IN
Upshot: Semijoins can help if OUT is small
Round 1R1(x1, x2 ) R2(x2, x3 )
R3(x3, x4 )T1(x1, x2, x3)
T2(x1, x2, x3, x4)
…
Round 2
Round 3
63
![Page 64: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/64.jpg)
64
A0A1λ=R1
A1A3λ=R3
A0A2λ=R2
A2A4λ=R4
A2A5λ=R5
width = 1
Yannakakis Algorithm For Acyclic QueriesInput: Acyclic query Q: R1 ⋈ R2 ⋈ … ⋈ Rn &
a width-1 Generalized Hypertree Decomposition of Q
Q
A0
A1 A2
A3 A4 A5
R1 R2
R3R4 R5
A2subtree
width-1 GHD
![Page 65: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/65.jpg)
65
R1A0 A1
a01 a11a02 a12a03 a13
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21a02 a22a03 a23
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋊
Yannakakis Algorithm For Acyclic QueriesUpward Semijoin
Phase
width-1 GHD
![Page 66: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/66.jpg)
66
R1A0 A1
a01 a11a02 a12a03 a13
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21a02 a22
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋊
Upward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 67: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/67.jpg)
67
R1A0 A1
a01 a11a02 a12a03 a13
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋊
Upward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 68: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/68.jpg)
68
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋊
Upward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 69: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/69.jpg)
69
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋉
Downward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 70: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/70.jpg)
70
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋉
Downward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 71: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/71.jpg)
71
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41a22 a42
R5A2 A5
a21 a51a25 a55
⋉
Downward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 72: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/72.jpg)
72
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41
R5A2 A5
a21 a51a25 a55
⋉
Downward SemijoinPhase
Yannakakis Algorithm For Acyclic Queries
![Page 73: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/73.jpg)
73
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
R2A0 A2
a01 a21
R4A2 A4
a21 a41
R5A2 A5
a21 a51
⋈
Join Phase
Yannakakis Algorithm For Acyclic Queries
![Page 74: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/74.jpg)
74
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
T1A0 A2 A5
a01 a21 a51
R4A2 A4
a21 a41
⋈
Join Phase
Yannakakis Algorithm For Acyclic Queries
![Page 75: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/75.jpg)
75
R1A0 A1
a01 a11
R3A1 A3
a11 a31a11 a32
T2A0 A2 A4 A5
a01 a21 a41 a51
⋈
Join Phase
Yannakakis Algorithm For Acyclic Queries
![Page 76: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/76.jpg)
76
T3A0 A1 A2 A4 A5
a01 a11 a21 a41 a51
R3A1 A3
a11 a31a11 a32
⋈
Join Phase
Yannakakis Algorithm For Acyclic Queries
![Page 77: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/77.jpg)
77
OUTA0 A1 A2 A3 A4 A5
a01 a11 a21 a31 a41 a51a01 a11 a21 a32 a41 a51
O(n) semijoins + O(n) joins
Serial Run-Time: O(IN+OUT)
because |Ti| ≤ OUT
Yannakakis Algorithm For Acyclic Queries
![Page 78: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/78.jpg)
78
GYM: Distributed Yannakakis on Acyclic Q
r =O(n)
• Naively distribute semijoins&joins separate rounds:
• Linear scalability: 2x processors, 2x speed up
GYMr=O(n)
HL + Semijoinsr = O(1), arity 2
OUT< p1-1/!*IN L = (IN +OUT)/p L = IN/p1/!*≤
L =O( IN+OUTp
)
Larger p allows for larger OUT
![Page 79: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/79.jpg)
GYM in O(d) Rounds
Lowest depth w-1 GHD:d = Ɵ(n)
• Acyclic queries have w-1 GHDs w/ different depths
d=1
Can optimize GYM to run in r = O(d) and same Lw/ parallel joins and semijoins
Path-n
Star-n A0A1λ=R1
A0A2λ=R2
A0Anλ=Rn
A0A3λ=R3
…
A1 An-1 An…A0R1 R2 RnRn-
1
…
R1R2
R3R4
R5
Rn
A4
A0
A5
An
A1A2
A3
79
![Page 80: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/80.jpg)
Ex: Vanilla GYM
R1A0 A1
⋊R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 1
Upward SemijoinPhase
80
![Page 81: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/81.jpg)
Ex: Vanilla GYM
R1A0 A1
⋊R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 2
Upward SemijoinPhase
81
![Page 82: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/82.jpg)
Ex: Vanilla GYM
R1A0 A1
⋊R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 3
Upward SemijoinPhase
82
![Page 83: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/83.jpg)
Ex: Vanilla GYM
R1A0 A1⋊
R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 4
Downward SemijoinPhase
83
![Page 84: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/84.jpg)
Ex: Vanilla GYM
R1A0 A1
⋊
R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 5
Downward SemijoinPhase
84
![Page 85: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/85.jpg)
Ex: Vanilla GYM
R1A0 A1 ⋊
R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 6
Downward SemijoinPhase
85
![Page 86: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/86.jpg)
Ex: Vanilla GYM
R1A0 A1
⋈
R2
A0 A2
R4
A0 A4
R3
A0 A3
Round 7
Join Phase
86
![Page 87: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/87.jpg)
Ex: Vanilla GYM
T1A0 A1 A2
⋈R4
A0 A4
R3
A0 A3
Round 8
Join Phase
87
![Page 88: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/88.jpg)
Ex: Vanilla GYM
T2A0 A1 A2 A3
⋈
R4
A0 A4
Round 9
Join Phase
88
![Page 89: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/89.jpg)
Ex: Vanilla GYM
OUTA0 A1 A2 A3 A4
r = 9 L =O( IN+OUTp
)
89
![Page 90: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/90.jpg)
R1A0 A1
⋊
Round 1
Ex: Optimized GYM
⋊ ⋊R2
A0 A2
R4
A0 A4
R3
A0 A3
Upward SemijoinPhase
90
![Page 91: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/91.jpg)
R`1A0 A1
Round 2
Ex: Optimized GYM
R``1A0 A1
R```1A0 A1
∩R1
A0 A1
Upward SemijoinPhase
91
![Page 92: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/92.jpg)
R1A0 A1
⋊
Round 3
Ex: Optimized GYM
⋊ ⋊
R2
A0 A2
R4
A0 A4
R3
A0 A3
Downward SemijoinPhase
92
![Page 93: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/93.jpg)
R1A0 A1⋈
Round 4
Ex: Optimized GYM
R2
A0 A2
R4
A0 A4
R3
A0 A3
Skew-HC
Join Phase
93
![Page 94: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/94.jpg)
Ex: Optimized GYM
r = 4 L =O( IN+OUTp
)
OUTA0 A1 A2 A3 A4
94
![Page 95: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/95.jpg)
Generalizing To Any Query and GHD
A1 An-1 An…A0R1 R2
RnRn-1Rn
Rn-1
R1
R2
…Tk=R1Rn/2Rn
R1 R3 R5 Rn
T1=R1R2R3
Ti=R1R4R7
Rn-2
Tk=Rn-2Rn-1Rn
…
…
… Tj=Rn-6Rn-3Rn
…
T1=R1R2R3…Rn
Any Q with width-w, depth-d GHD can run in:r=O(d), L=O((INw + OUT)/p)
w=1,d=n w=n/2, d=1 w=3, d=log(n)
Can tradeoff r and L by constructing GHDs with different w and d
95
![Page 96: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/96.jpg)
1. High-degree residual query decompositions + semijoins:
L=O(IN/p1/!*) (for some Qs)
2. Yannakakis style processing improves L if OUT is small
3. Depth & width of decompositions allow r & L tradeoffs
Main Takeaways
96
![Page 97: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/97.jpg)
• Most Systems: Iterative Binary Join Plans
• Tributary Join [CHU ET AL. SIGMOD ‘15]
• Subgraph Queries:
• BiGJoin [AMMAR ET AL., VLDB ‘18]
• SEED [LAI ET AL., VLDB Journal, ‘17]
• TwinTwigJoin [LAI ET AL., VLDB ‘16]
• PSgL [SHAO ET AL., SIGMOD ’14]
Multi-round Multiway Joins In Practice
97
![Page 98: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/98.jpg)
Outline• Models of parallel computation (Dan)
• Two-way joins (Paris)
• Multi-way joins (Paris+Semih)
• Sorting & Matrix multiplication (Paris+Semih)
• Conclusion (Dan)
98
![Page 99: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/99.jpg)
Sorting
• Sorting is a fundamental operation in data processing– join computation (parallel merge join)– similarity joins– aggregation/grouping
• input size: N items
![Page 100: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/100.jpg)
The PSRS algorithmParallel Sort by Regular Sample (PSRS)
• find p−1 values, called splitters −∞ = y0 < y1 <···< yp-1 < yp = +∞
• each server gets assigned one of the p intervals• partition the data such that all items in the same interval
are sent to the same server• each server sorts the items locally
100
![Page 101: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/101.jpg)
The PSRS algorithmHow does PSRS find the splitters?
• each server sorts its data locally, and computes the p-1local splitters (called the regular sample) that partition uniformly the local data
• each server broadcasts its regular sample• the final p splitters are computed by sorting the union of
the local regular samples, and choosing every p-th item.
101
![Page 102: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/102.jpg)
The PSRS algorithm: analysis
• PSRS achieves a load of ! = #(%/'), assuming that the number of servers p << N1/3
• Modern implementations of PSRS replace the local sorting by sampling to improve performance
102
What happens for larger values of p?
![Page 103: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/103.jpg)
Cole’s Algorithm
103
• designed for the PRAM model• works for arbitrary number of processors p
sorts in time O(N / p log(N) )
• does not naturally extend to BSP or MPC, since the processors access the shared memory in patterns that are costly to convert into messages
![Page 104: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/104.jpg)
Goodrich’s Algorithm
104
• designed for BSP (hence can be adapted to MPC)• works for arbitrary number of processors p
with load L = N/p, it runs in O(logL(N)) rounds
• the algorithm is very complex!
![Page 105: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/105.jpg)
Lower Bounds on Sorting
105
Theorem The minimum number of rounds needed by any MPC algorithm to sort N items is Ω(logLN)
• The lower bounds are independent of the number of servers p
• Having more processors doesn’t improve communication or synchronization!
Theorem The minimum communication needed by any MPC algorithm to sort N items is Ω(N logLN)
![Page 106: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/106.jpg)
Sorting in Practice
106
• None of the optimal parallel algorithms are used in practice
• In real-world instances of sorting, parallelism is coarse-grained (p << N)
• Typical method: find the splitters, partition, and then sort locally
Year Winner Time p and Memory/Processor
2016 Tencent Sort 134s 512 (512GB)
2015 FuxiSort 377s 3134 (96GB) + 243 (128GB)
2014 TritonSort 1378s 186 (244GB)
2014 Apache Spark 1406s 207 (244GB)
2013 Hadoop 4328s 2100 (64GB)
2011 TritonSort 8274s 52 (24GB)
![Page 107: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/107.jpg)
Conventional Square Matrix Multiplication
107
3.2 6.7 … 7.4
4.3 2.7 … 8.7
… … … …
-1.1 0.3 … 1.4
5.8 0.1 … 2.2
6.1 3.8 … 1.6
… … … …
0.8 2.5 … 0.4
8.8 0.5 … 1.4
1.5 1.0 … 2.5
… … … …
4.4 3.0 … 5.6
=n
n
n
n
A B C
• Focus on Conventional Algs that do all n3 products
Ø i.e.: Strassen-like algs are not allowed
• Reflects practice & o.w. LBs are trivial
• Stateless MPC: Procs keep O(L) elements (memory)
• Analyze other parameters, p, r, C=prL
![Page 108: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/108.jpg)
SQL Query of Matrix Multiplication
108
i j v1 1 3.2... … …n n 1.4
j k v1 1 5.8... … …n n 2.6
A Bi k sum1 1 8.8... … …n n 5.6
C
3.2 6.7 … 7.44.3 2.7 … 8.7… … … …-1.1 0.3 … 1.4
5.8 0.1 … 2.26.1 3.8 … 1.6… … … …0.8 2.5 … 0.4
8.8 0.5 … 1.41.5 1.0 … 2.5… … … …4.4 3.0 … 5.6
× =i
j
j
k k
i
select A.i,B.k,sum(A.v*B.v)from A, Bwhere A.j = B.jgroup by A.i, B.k
Join Part
Aggregation Part
![Page 109: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/109.jpg)
1-round Algorithm (1)
109
• Need entire rows & cols: 2n ≤ L ≤ n2
A B• Suppose L = 2tn (so each proc. can store 2t rows and cols)
k1
Q: # rows k1 & # cols k2 should a processor get?
A: k1=k2= t can do k1k2n = t2n products
k2
Ex:t=3, n=75 => 9*75 = 675 prods/proc[McKellar & Coffman, CACM, 1969], [Afrati et. al. VLDB, 2013]
![Page 110: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/110.jpg)
110
A
B
C1 … CK
L = 2tn, C = K2L => C= O(n4/L)
p1
p2
p3
pK^2
…
R1
RK
…
L = 2tn: Divide into K=n/t rectangular groups
1 Ri & 1 Cj
per processor
1-round Algorithm (2)
[McKellar & Coffman, CACM, 1969], [Afrati et. al. VLDB, 2013]
![Page 111: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/111.jpg)
2-round Algorithm (1)• Can do partial products & aggregate in separate rounds
AL / 2
L / 2
B
p1
p2
p3
pH^3
r = 1 r = 2
p1
pi
pj
… …
A11 A1H
H =nL / 2
=ntn
AH1
prods/proc: (tn)3/2
(3*75)3/2=3375 vs 675
AHH
…
…
…
… …
B1H…
…
…
… …
B1H
BH1 BHH
![Page 112: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/112.jpg)
Generalizing to > 2 Rounds
• Group into H groups of H2 s.t.Gz: Ai,j × Bj,k s.t. j = (i + k + z) mod H
• Each r: mult. as many block products as possible• w/ 1 proc doing 1 block product (+ partial aggregation)
H3
block
products
×A22 B22
×AHH BHH
×AH1 B2H×A12 B21
×A37 B74
×A11 B11
×A59 B9H …
…
r = 1r = 1
r = 2
…
[McColl & Tiskin, Algoritmica, 1999], [Pietracaprina et. al., ICS, 2012]
![Page 113: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/113.jpg)
Example Groups
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G0
C0
0
C01 C02 C03
C1
0
C11 C12 C13
C2
0
C21 C22 C23
C3
0
C31 C32 C33
Each group has exactly 1 block-mult for one Cij block
![Page 114: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/114.jpg)
Example Groups
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G0A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G1A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G2
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G3A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
Each group has exactly 1 block-mult for one Cij block
C0
0
C01 C02 C03
C1
0
C11 C12 C13
C2
0
C21 C22 C23
C3
0
C31 C32 C33
C0
0
C01 C02 C03
C1
0
C11 C12 C13
C2
0
C21 C22 C23
C3
0
C31 C32 C33
C0
0
C01 C02 C03
C1
0
C11 C12 C13
C2
0
C21 C22 C23
C3
0
C31 C32 C33
C0
0
C01 C02 C03
C1
0
C11 C12 C13
C2
0
C21 C22 C23
C3
0
C31 C32 C33
![Page 115: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/115.jpg)
Example 1: p=H2 (H=4, p=16)
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G0
p1
p2
p3
p16
…
Round 1
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
Partial Sums
OUT
![Page 116: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/116.jpg)
Example 1: p=H2 (H=4, p=16)G1
p1
p2
p3
p16
Round 2
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
Partial Sums
OUT
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
…
![Page 117: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/117.jpg)
Example 1: p=H2 (H=4, p=16)G2
p1
p2
p3
p16
Round 3
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
Partial Sums
OUTP00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33…
![Page 118: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/118.jpg)
Example 1: p=H2 (H=4, p=16)G3
p1
p2
p3
p16
Round 4
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
Final Output
OUTC00 C01 C02 C03
C10 C11 C12 C13
C20 C21 C22 C23
C30 C31 C32 C33
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33…
![Page 119: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/119.jpg)
Example 2: p=2H2 (H=4, p=32)
G0
p1
p16
…
Round 1
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
Partial Sums 1
p17
p32
…P’00 P’01 P’02 P’03
P’10 P’11 P’12 P’13
P’20 P’21 P’22 P’23
P’30 P’31 P’32 P’33
Partial Sums 2
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
G1
OUT
OUT
![Page 120: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/120.jpg)
Example 2: p=2H2 (H=4, p=32)p1
p16
…
Round 2
OUT
p17
p32
… OUT
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
G2
G3
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
A00 A01 A02 A03
A10 A11 A12 A13
A20 A21 A22 A23
A30 A31 A32 A33
B00 B01 B02 B03
B10 B11 B12 B13
B20 B21 B22 B23
B30 B31 B32 B33
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
P’00 P’01 P’02 P’03
P’10 P’11 P’12 P’13
P’20 P’21 P’22 P’23
P’30 P’31 P’32 P’33
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
Partial Sums 1
P’00 P’01 P’02 P’03
P’10 P’11 P’12 P’13
P’20 P’21 P’22 P’23
P’30 P’31 P’32 P’33
Partial Sums 2
![Page 121: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/121.jpg)
Example 2: p=2H2 (H=4, p=32)p1
p16
…
Round 3
P00 P01 P02 P03
P10 P11 P12 P13
P20 P21 P22 P23
P30 P31 P32 P33
p17
p32
…
P’00 P’01 P’02 P’03
P’10 P’11 P’12 P’13
P’20 P’21 P’22 P’23
P’30 P’31 P’32 P’33
Final OutputC00 C01 C02 C03
C10 C11 C12 C13
C20 C21 C22 C23
C30 C31 C32 C33
OUT
![Page 122: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/122.jpg)
Cost Analysis
122
Generalizes 2D and 3D algorithms.
Tight both for C and r for a given L!
• Sub-linear Scalability: pL > n3/L1/2 => L > n2/p2/3 => L > IN/p2/3
**Next: B/c of connection to query and "* of is 3/2**
Communication Rounds
Rectangle-Block O(n4/L) 1
Square-Block rpL=O(n3/L1/2) O(n3/(pL3/2) + logLn)
![Page 123: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/123.jpg)
Round-Independent LB on C (and L) (1)
i
j k
A
B
C
Suppose a proc. has N aij, N bjk and contributes to N cik elements
Q: How many elementary multiplications can it perform?
Ai j
N
Vi j k
?
v111
c111
b111
a111
……
… Bj k
N
Ci k
N
A: AGM => !*= 3/2, so O(N3/2)
[Irony, Toledo, & Tiskin, JPDC, 2004], [Hong & Kung, STOC, 1981]
![Page 124: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/124.jpg)
Phase 1 Phase 2 Phase k …
…
L elements L elements L elements…
Net
wor
k C
ard
Processor k ≤ L input elements
Round-Independent LB on C (and L) (2)
With L communication, a proc can do O(L3/2) products.
n3 products must be performed.
=> C = Ω(n3/L1/2) (irrespective of r)
![Page 125: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/125.jpg)
Lower Bounds on Rounds
• max(LB for Join, LB for GroupBy-and-Aggr)
• LB for Join:
C = r × p × L > n3/L1/2 => r =Ω(n3/(pL3/2))
• LB for GroupBy-and-Aggr: r = Ω(logLn)
r = Ω(max(n3/(pL3/2), logLn)
[Irony, Toledo, & Tiskin, JPDC, 2004], [Goodrich, SICOMP, 1999]
![Page 126: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/126.jpg)
n2
Com
mun
icat
ion
(C)
n3
Load (L)2nn1/2n1/3n1/4 n2
n2.5
requires≥ 4 rounds
1-round LB on C = n4/Lmulti-round LB on C = n3/L1/2
requires≥ 3 rounds
requires≥ 2 rounds
Summary
LBs met by:
Square-block multi-r &
Rectangle-block 1 round algs
![Page 127: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/127.jpg)
Other Results• Non-Square MM
• Sparse square and non-square MM
• Strassen-like MM
• Cholesky, LU, QR decompositions, eigenvalue and
singular value decompositions
127
![Page 128: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/128.jpg)
Outline• Models of parallel computation (Dan)
• Two-way joins (Paris)
• Multi-way joins (Paris+Semih)
• Sorting & Matrix multiplication (Paris+Semih)
• Conclusion (Dan)
128
![Page 129: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/129.jpg)
SummaryAlgorithms for:• Joins: 2-way, multi-way• Sorting• Matrix multiplication
Goal: minimize total runtime• Minimize communication cost• Minimize number of rounds
129
![Page 130: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/130.jpg)
Main Takeaways
Joins:• Skew-free data
– Optimal communication related to the fractional edge packing number !*
• Skewed data– Optimal communication related to the
fractional edge covering number "*Total communication >> input data
130
![Page 131: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/131.jpg)
Main TakeawaysSorting and matrix multiplication
• No skew
• Total communication = O(IN)
• When p << IN, algorithms are simple
• When p ≈ IN, algorithms are complex
131
![Page 132: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/132.jpg)
Open Questions
• Sorting-based optimal multi-join algorithm
• Optimal communication cost f(IN,OUT)
• Lower bound for multi-rounds (w/o skew)
132
![Page 133: Algorithmic Aspects of Parallel Query Processingsuciu/tutorial-sigmod... · 2018. 6. 16. · Motivation •Most modern data analytics tools process data on a cluster: –Spark, Dremel,](https://reader035.vdocument.in/reader035/viewer/2022071105/5fdf6c7f24beed3c513b71b3/html5/thumbnails/133.jpg)
133
Questions?
https://tinyurl.com/y99w99b4
Free until June 18(create account)