links between convex geometry and join processing
DESCRIPTION
Links between Convex Geometry and Join Processing. Christopher Ré Stanford University. “Query processing is not rocket science… When you flunk out of query processing, we make you go build rockets.” – Anonymous (J. Hamilton or D. DeWitt). Warning : This is (mostly) a theory talk… . - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/1.jpg)
Links between Convex Geometry and Join Processing
Christopher RéStanford University
![Page 2: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/2.jpg)
“Query processing is not rocket science… When you flunk out of query processing, we make you go build rockets.” – Anonymous (J. Hamilton or D. DeWitt)
![Page 3: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/3.jpg)
Warning: This is (mostly) a theory talk…
… but we (and others) are building database engines with these ideas.
![Page 4: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/4.jpg)
Motivation: Joins!Databases are about three things:
Efficiency,
Efficiency, and
Efficiency
Worst-caseParallel
Beyond Worst-case
Geometry
![Page 5: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/5.jpg)
Joins Since System R
Join(R,S,T) = { (a,b,c) : (a,b) in R, (a,c) in S, (b,c) in T}
R(A,B), S(A,C), T(B,C)
Join(R,S,T) = Join(Join(R,S),T)
System R searches through pairwise joinsFor 40+ years, major commercial database
use System-R style optimizer.
![Page 6: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/6.jpg)
Nugget: “DBs have been asymptotically suboptimal for the last 4 decades…”
6
![Page 7: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/7.jpg)
Example QueriesData: R(A,B), S(A,C),
and T(B,C)
Q1 = Join(R,S)
Q2 = Join(R,S,T)
Today: graph where edges colored R,S,T.
v SR
T
Nodes are data values.
“triples of nodes on a path of length 2 that goes via R then S”
“triples of nodes that form R-S-T triangles”
![Page 8: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/8.jpg)
Background: Triangles [Alon 80, Loomis-Whitney 49]
8
If R,S,T contain ≤ N tuples, how big can |Q| be?
Let Q be Join(R,S,T) = “R-S-T triangles”Data: R(A,B), S(A,C), and T(B,C))
R(A,B), S(A,C) T(B,C))|Join(R,S)| ≤ N2
R covers B, S covers C
Correct asymptotic:|Q| in Q( N3/2 )Can we compute Q in time O(N3/2)?
|Join(R,S,T)| ≤ N2
![Page 9: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/9.jpg)
1
Data inR and S
N N
Pairwise Joins are Suboptimal
9
Panic! A simple modification:any pairwise join plan takes W(N2)
JOIN(R,S) = [N]x {1} x [N]
|Join(R,S)| = N2
DB is toast!
|R|=|S|=|T|=N[N]={1,…,N}
R=[N] x {1} S ={1} x [N]T = {1} x [N]
R(A,B), S(B,C),T(A,C)Data
![Page 10: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/10.jpg)
10
Relax! [Itai & Rodeh 78, Alon et. al 97]
It’s cute, let’s see it.“The heavy-light technique”
Heavy versus Light
![Page 11: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/11.jpg)
Sketch: Heavy v. Light Nodes
Call a node heavy if it has more than N1/2 neighbors. Let H be set of heavy nodes.
Goal: Time O(N3/2) – ignoring log factors.
(case 1) If v in H, check whether each edge e in E forms a triangle with v.
Case I: In total most 2 N|H| probes Since |H| ≤ 2N1/2 then total time O(N3/2)
11
v ex yv1 v2.. …
N Edges
2 probes: each O(1) time.
![Page 12: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/12.jpg)
Case 2.
12
(case 2) If v not in H, for each pair of edges check.
Union is linear, so we’re done.
v
x yv1 v2.. …
N Edges
Case II: Each light node explores d(v)2 where d(v) is the degree of node v.
![Page 13: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/13.jpg)
How do we generalize to joins?
13
![Page 14: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/14.jpg)
Fractional Hypergraph Covers
14
Given a hypergraph H=(V,E) a fractional edge cover is x : E R such that x ≥ 0 and for each v in V we have Se : v in e x(e) ≥ 1
Ex: R(A,B),S(B,C),T(A,C). x(R) + x(T) ≥ 1 // cover for Ax(R) + x(S) ≥ 1 // cover for Bx(S) + x(T) ≥ 1 // cover for C
x(R,S,T)=(1,0,1) … or… x(R,S,T)=(0.5, 0.5, 0.5)
We think of a query as hypergraph to cover.
A C
BR S
T
![Page 15: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/15.jpg)
Size bounds [GM05, AGM08]
15
Triangle: |R|=|S|=|T| ≤ N, x(R)=x(S)=x(T)=0.5 N1.5
Thm [Atserias, Grohe, Marx FOCS08]: Given any hypergraph cover x
for (V,E) then S(Q,N) ≤ Pe in E |Re|x(e)
Fix a query Q=(V,E).Let N be a tuple of |E| positive integers.
Define S(Q, N) be the maximum size of Q subject to|Re|≤ Ne
![Page 16: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/16.jpg)
One more example.
16
R(A,B,C),S(A,B,D),T(A,C,D),U(B,C,D)
x(R) = x(S) = x(T) = x(U) = 1/3
Output size is O(N4/3)
Known since Loomis-Whitney (1940s Geometers!)
R(A,B,C),S(A,B,D),T(A,C,D),U(B,C,D)
![Page 17: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/17.jpg)
AGM’s result.
17
Atserias, Grohe, and Marx (AGM) allow one to write a linear program that tightly
bounds the output size of any join query.
Open: Compute the output in upper bound time?
We would call this worst-case optimal
Proof using Han/Shearer’s lemma (non constructive)
![Page 18: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/18.jpg)
Ngo, Porat, Ré, and Rudra (PODS 2012)
18
We show AGM’s fractional cover inequality is equivalent to the Bollabás-Thomason
inequality from geometry.
1st algorithm for joins with optimal worst-case runtime
(experts: optimal data complexity)
Algorithmic Idea: LP is a guide to decide “heavy” v. “light” of previous example.
![Page 19: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/19.jpg)
Implemented & described at ICDT14!
19
Todd Veidhulzen. “Leapfrog Triejoin: A Simple, Worst-Case Optimal Join Algorithm” [ICDT14, Best Newcomer Award!]
Tidbit: Faster on cyclic queries than other commercial DB optimizers… without
resorting to specialized graph processing!
![Page 20: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/20.jpg)
Much simpler proofs!
20
Skew strikes back: New Developments in the Theory of Join Algorithms. (SIGMOD Record 13 )
Atri Rudra
HungNgo
Even simpler than heavy
vs. light!
![Page 21: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/21.jpg)
21
3 Vignettes of Related Ideas
![Page 22: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/22.jpg)
Faster Detection: Alon-Yuster-Zwick, one can check if a graph contains a 2k-length cycle in O(N2-1/k) using heavy vs. light argument.
Systems: Heavy vs. light used in parallel database systems (e.g., Teradata).
22
(1) Heavy versus Light
![Page 23: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/23.jpg)
(2) Map Reduce Joins: Afrati & Ullman EDBT 2010
23
A
C
B
Q1 = R(A,B),S(B,C),T(A,C)Q2 = R(A,B),S(B,C)
Mappers send data to reducer via hash function.
Optimal: Recast as a (fractional) cover problem! [Koutris, Suciu, and Beame PODS14]
Goal: Given p reducers, minimize communication by picking “how large” each attribute’s share is.
Afrati & Ullman. Solve Constrained Mathematical Program.
Lower bound portion uses Covers!
![Page 24: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/24.jpg)
(3) Tighter Runtime Guarantees
24
Yannakakis’s seminal algorithm has a stronger guarantee for a-acyclic queries.
O( N + OUT )General: O(N + Nw* + OUT) where w* is the fractional hypertreewidth (NPRR+Y’s+Treewidth)
Databases with N tuples.
Databases with N+1 tuples.
Measure: longest runtime in each strata
(Traditional Worst-case)
Databases with N tuples.
Databases with N+1 tuples.
Output Size 0.5N 0.2N
![Page 25: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/25.jpg)
25
Begs a question: What is the tightest guarantee thatone can hope for?
![Page 26: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/26.jpg)
Pathology of Worst-case Analysis for Joins
26
My not-so-secret goal: Join theory even closer to practice.
Data often stored in an index or sorted, and this changes landscape
Worst case: insensitive how data are stored.
Worst case: one reads the entire input.Rarely, if ever, does this happen on large databases…
Worst case: answer is hugeUsually, the answer is smaller than the database—not larger!
![Page 27: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/27.jpg)
We all want to go “beyond worst-case”
27
One of the1st beyond worst-case analysis was by a DB
theoretician: Ron Fagin.Instance Optimality
Tim Roughgarden (Stanford) has great notes on this.
![Page 28: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/28.jpg)
Measuring complexity
28
Databases with N tuples.
Databases with N+1 tuples.
Worst-case analysis (Traditional CS)
Let W(A,N) = supD T(A,D) s.t. D has N tuples
Let T(A,D) be # of steps that algorithm A takes on database D
Measure W(A,N) growth with N, asymptotically.
![Page 29: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/29.jpg)
Notions of complexity
29
Databases with N tuples.
Databases with N+1 tuples.
Instance Optimality
Algorithm Opt is instance optimal if there exists constant c such that T(Opt,D) ≤ c T(A,D)
for A in a class of algorithms and any D.
Essentially singleton boxes, much stronger…
Let T(A,D) be # of steps that algorithm A takes on database D
![Page 30: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/30.jpg)
So how do we pick a class of algorithms?
30
![Page 31: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/31.jpg)
What do join algorithms do?
31
Famous algorithms: Hash, Sort-merge, index-nested, block-nested loop, Grace, PRISM, double pipelined.…
Observation: Algorithms are generic. Do not depend on data values
but may use data order & equality.E.g., use an index to skip many consecutive values.
Call these comparison-based algorithms.
![Page 32: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/32.jpg)
A Nugget: “A little sorting changes the complexity
landscape a lot.”
32
![Page 33: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/33.jpg)
Suppose R[j] = 2j and S[j] = 2j+1 for j=1…N
Warm up: Intersection [Huang & Lin 1972]
33
Given two sorted lists R and S of length NR[1] < … < R[N] and S[1] < … < S[N].
At position i, no idea what comes next.
Ping-pong back and forth—W(N) time
![Page 34: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/34.jpg)
Warm up: Intersection [Huang & Lin 1972]
34
Given two sorted lists R and S of length NR[1] < … < R[N] and S[1] < … < S[N].
Suppose R[i] = i and S[i] = N/2 + i for i=1…N/2
Skip to R[N] in O(log N) time!
![Page 35: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/35.jpg)
Message: Difference in the certification
35
Same: Input and output size are the same!
running time is Log(N) v. N
R[N] < S[1] is enough
Certify each alternation
Different: Work to certify output is empty
![Page 36: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/36.jpg)
36
Goal: Algorithms that run in time proportional to the size of a smallest certificate.
![Page 37: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/37.jpg)
Generalizing Certificates to Joins
37
To define a certificate, need to describe:1. How are data stored? (search trees), and2. How are certificates encoded?
(arguments)
Assume: a global attribute order A1…An.All relations are stored consistently with this order
(… we can remove this …)
![Page 38: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/38.jpg)
Think of the data in a trie
38
A2 A4 A5
1 2 41 2 71 3 57 4 210 4 1
R
1 7
4
4432
57
10
12
A4
A5
R[3]
R[1,2]
R[1,1,2]
A2
R[2,1]
R[3,1,1]
R[1,2,1]
Index indicates the tuples order.
A relation R(A2,A4,A5)
![Page 39: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/39.jpg)
39
NB: search trees capture hash tables, B+trees, tries,
up to a log N factor.R
1 7
4
4432
57
10
12
Compare elements in this dictionary
order…
![Page 40: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/40.jpg)
Any algorithm must certify its output
An argument is a set of propositional statements of the following forms.1. R[i] < S[j]2. R[i] = S[j]3. R[i] = R[j]
40
Here i,j are tuples of indexes as illustrated in previous slide.
Certificate is an argument, cert, such that any instance that satisfies cert has the same output (up to isomorphism).
![Page 41: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/41.jpg)
Certificate Complexity
41
Goal: run in time O( (|cert| + Z) log N) where cert denotes a smallest certificate,Z is the size of the output, andN is the size of the data.
O hides constants depending on Q.
NB: Input Size under log
Runtimes of the above are essentially instance optimal for comparison based.Ron Fagin says “log-instance optimal”
![Page 42: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/42.jpg)
Comments about Certificates
42
1. A comparison-based algorithm (all available join algorithms) takes at least |cert| steps.
2. N ≥ |cert| where N is the input size (strictly finer notion of complexity)
3. Certificates provide an instance-dependent measure of complexity. (conditioning)
![Page 43: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/43.jpg)
Minesweeper Algorithm (MS)
43
“Removing the haystack to find the needles”
![Page 44: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/44.jpg)
Minesweeper: Key operation
44
A B2 43 23 5
Deduce: no output tuple can be in an interval
Index for R in (A,B) order
Big Conceptual Change: Find best way to rule out all tuples… not to find tuples.
(=3, [3,4] )
([-∞, 1],*)
Consider: Q = R(A,B),S(A)
(=2,[-∞,3])No output tuple has A = 3 and B in [3,4]
A4
Index for S
![Page 45: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/45.jpg)
Picture the Output Space as a Grid
45
R(A,B),S(A)
A values
B values
(=3, [3,4] )
([-∞, 1],*) (=2,[-∞,4])
([-∞,4], *)
A B2 43 23 5
A4
R(A,B) S(A)
Goal: Run proportional to smallest cover.
![Page 46: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/46.jpg)
The algorithm
1. Pick an uncovered point, t.
2. Find all possible ways to cover t with “gaps”.
3. Insert gaps in to a data structure.
Repeat until all points covered.
46
t
Hard part: Data structure to find
t, efficiently.
Idea: Reuse information as much as possible.
![Page 47: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/47.jpg)
Nugget: “The boundary for efficiency has changed from the worst case.”
47
![Page 48: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/48.jpg)
The Boundary
48
View a query as a hypergraph.
R(A,B),S(A,B,C),T(B,C,D)
A B
C D
Want: Acyclic-like properties but closed under edge removal. a-acyclic does not have this property.
Turns out, b-acyclic [Fagin83] is the right notion
![Page 49: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/49.jpg)
Certificate Dichotomy [PODS14]
49
Theorem [NNRR14, Certificate Dichotomy] Given query Q
(1)if Q is b-acyclic, then there is some order of attributes such that MS takes O(|cert| + Z) on all instances.
(2)Assuming the 3SUM conjecture, for any b-cyclic query there is some family of instances where any algorithm runs in time W( |cert|4/3 + Z)
O hides log N factor
![Page 50: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/50.jpg)
Further Results1. No polynomial time bound in |cert| for a-acyclic queries (Exponential time hypothesis)
a-acyclic is the worst-case boundary, this changes complexity landscape.
2. Q, treewidth w, MS runs in O(|cert|w+1 + Z) time
3. Fractional results for triangle, O(|cert|3/2 + Z). 50
![Page 51: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/51.jpg)
Some PRELIMINARY Empirical Results
51
Dung Nguyen, Hung Q. Ngo & LogicBlox. Single threaded & multicore version inside LB.
7 graph datasets (100M+ edges)
(2) On small # of b-acyclic queries in LB, up to 230x faster and at most 5% slower than LFTJ
Take away: minesweeper might be useful.
Real evaluation underway with LB’s help!
(1) Real certificates can be 1000x smaller than N
![Page 52: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/52.jpg)
52
Future Directions in JoinsCompressed Representations.
Column-type storage [Abadi VLDB05] Morphing and Bit Weaving [Patel et. al. VLDB06 VLDB2012]. Factorised DBs using covers [Olteanu et al. VLDB2012-2013]. Compression & bit-level operations for modern hardware?
Deeper understanding of data properties. Bounded degree data [A. Durand & E. Grandjean ToCL97] Bounded expansion [Segoufin ICDT13, Kazana&Segoufin
PODS13] Bounded treewidth data [Gottlob et al. AAAI06, Aarnborg 85,
Grohe ICDT99]Personally obsessed!
One join algorithm to rule them all? Stay tuned…
![Page 53: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/53.jpg)
Convex Geometry and Joins
1. worst-case optimal Join algorithm via Fractional Covers
2. Other Uses of Fractional Covers
3. Beyond Worst-case for Joins using Covers and Geometry.
Tool: Fractional covers to bound volumes
![Page 54: Links between Convex Geometry and Join Processing](https://reader035.vdocument.in/reader035/viewer/2022062305/5681666d550346895dda0ae6/html5/thumbnails/54.jpg)
54
Conclusion
Joins are awesome: Theory and Practice can argue!
Geometry is key to these new algorithms.
Beyond worst-case analysis may be an opportunity for the DB
community.