CMU SCS
Carnegie Mellon Univ.
Dept. of Computer Science
15-415 - Database Applications
Query Optimization
(R&G ch. 15; Sys. R q-opt paper)
Overview - detailed
• Why q-opt?
• Equivalence of expressions
• Cost estimation
• Plan generation
• Plan evaluation
Cost-based Query Sub-System
• Queries (e.g., Select * From Blah B Where B.blah = blah) -> Query Parser -> Query Optimizer -> Query Plan Evaluator
• Query Optimizer: Plan Generator, Plan Cost Estimator
• Catalog Manager: Schema, Statistics
• Usually there is a heuristics-based rewriting step before the cost-based steps.
Why Q-opt?
• SQL: ~declarative
• good q-opt -> big difference
– e.g., seq. scan vs. B-tree index, on P=1,000 pages
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
• estimate cost; pick best
Q-opt - example
select name
from STUDENT, TAKES
where c-id=‘415’ and
STUDENT.ssn=TAKES.ssn
[Figure: query tree with π (projection) and σ (selection) over STUDENT and TAKES]
Q-opt - example
[Figure: the query tree (π, σ over STUDENT and TAKES) next to its canonical form]
Q-opt - example
[Figure: query tree (π, σ over STUDENT and TAKES), annotated with alternatives]
• selection: index; seq. scan
• join: hash join; merge join; nested loops
Overview - detailed
• Why q-opt?
• Equivalence of expressions
• Cost estimation
• ...
Equivalence of expressions
• A.k.a.: syntactic q-opt
• in short: perform selections and projections
early
• More details: see transf. rules in text
Equivalence of expressions
• Q: How to prove a transf. rule?
• A: use RTC, to show that LHS = RHS, eg:
$\sigma_P(R1 \cup R2) \stackrel{?}{=} \sigma_P(R1) \cup \sigma_P(R2)$
Equivalence of expressions
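$\sigma_P(R1 \cup R2) \stackrel{?}{=} \sigma_P(R1) \cup \sigma_P(R2)$
$t \in LHS \Leftrightarrow$
$t \in (R1 \cup R2) \wedge P(t) \Leftrightarrow$
$(t \in R1 \vee t \in R2) \wedge P(t) \Leftrightarrow$
$(t \in R1 \wedge P(t)) \vee (t \in R2 \wedge P(t))$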
Equivalence of expressions
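$\sigma_P(R1 \cup R2) \stackrel{?}{=} \sigma_P(R1) \cup \sigma_P(R2)$
...
$(t \in R1 \wedge P(t)) \vee (t \in R2 \wedge P(t)) \Leftrightarrow$
$(t \in \sigma_P(R1)) \vee (t \in \sigma_P(R2)) \Leftrightarrow$
$t \in \sigma_P(R1) \cup \sigma_P(R2) \Leftrightarrow$
$t \in RHS$
QED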
Equivalence of expressions
• Q: how to disprove a rule??
$\pi_A(R1 - R2) \stackrel{?}{=} \pi_A(R1) - \pi_A(R2)$
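A single counterexample is enough to disprove a candidate rule. The sketch below assumes the rule in question is projection over set difference; the relations R1, R2 and the helper `project` are made up for illustration. The same two relations also break the analogous rule for intersection.

```python
def project(relation, attrs):
    """Projection with duplicate elimination: keep only the listed attribute positions."""
    return {tuple(t[i] for i in attrs) for t in relation}

# Hypothetical relations over schema (A, B); position 0 is attribute A.
R1 = {(1, "x")}
R2 = {(1, "y")}
A = [0]

lhs = project(R1 - R2, A)               # pi_A(R1 - R2)
rhs = project(R1, A) - project(R2, A)   # pi_A(R1) - pi_A(R2)
print("LHS:", lhs, "RHS:", rhs, "equal:", lhs == rhs)
```

The two tuples agree on A but differ on B, so the difference keeps the tuple while the projected difference loses it.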
Equivalence of expressions
• Selections
– perform them early
– break a complex predicate, and push
– simplify a complex predicate
• ('X=Y and Y=3') -> 'X=3 and Y=3'
$\sigma_{p1 \wedge p2 \wedge \ldots \wedge pn}(R) = \sigma_{p1}(\sigma_{p2}(\ldots \sigma_{pn}(R) \ldots))$
Equivalence of expressions
• Projections
– perform them early (but carefully…)
• Smaller tuples
• Fewer tuples (if duplicates are eliminated)
– project out all attributes except the ones
requested or required (e.g., joining attr.)
Equivalence of expressions
• Joins
– Commutative, associative:
$R \bowtie S = S \bowtie R$
$(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
– Q: n-way join - how many diff. orderings?
Equivalence of expressions
• Joins - Q: n-way join - how many diff.
orderings?
• A: Catalan number ~ 4^n
– Exhaustive enumeration: too slow.
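To get a feel for the growth, here is a minimal sketch that computes Catalan numbers and compares them with 4^n. It assumes the count of interest is the number of ways to parenthesize an n-way join (Catalan(n-1) for n relations, with the left-to-right order held fixed); the function names are illustrative.

```python
from math import comb

def catalan(n: int) -> int:
    """n-th Catalan number: C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def parenthesizations(n_relations: int) -> int:
    """Ways to fully parenthesize a join of n relations in a fixed order."""
    return catalan(n_relations - 1)

for n in range(2, 13):
    print(n, parenthesizations(n), catalan(n), 4 ** n)
```

For three relations this gives 2, matching the two groupings in the associativity rule. Permuting the relations multiplies the count further.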
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
• estimate cost; pick best
Cost-based Query Sub-System
• Queries (e.g., Select * From Blah B Where B.blah = blah) -> Query Parser -> Query Optimizer -> Query Plan Evaluator
• Query Optimizer: Plan Generator, Plan Cost Estimator
• Catalog Manager: Schema, Statistics
• Usually there is a heuristics-based rewriting step before the cost-based steps.
Cost estimation
• E.g., find ssn's of students with an 'A' in 415 (using seq. scanning)
• How long will a query take?
– CPU (but: small cost; decreasing; tough to estimate)
– Disk (mainly, # block transfers)
• How many tuples will qualify?
• (what statistics do we need to keep?)
Cost estimation
• Statistics: for each relation 'r' we keep
– nr : # tuples;
– Sr : size of tuple in bytes
[Figure: relation 'r' as a file of records #1, #2, #3, …, #nr, each Sr bytes]
Cost estimation
• Statistics: for each relation 'r' we keep
– …
– V(A,r): number of distinct values of attr. 'A'
– (recently, histograms, too)
Derivable statistics
• blocking factor = max# records/block (=?? )
• br: # blocks (=?? )
• SC(A,r) = selection cardinality = avg# of records with A=given (=?? )
[Figure: relation 'r' stored in blocks #1, #2, …, #br; fr records per block, each record Sr bytes]
Derivable statistics
• blocking factor fr = max# records/block (= B/Sr ; B: block size in bytes)
• br: # blocks (= nr / fr )
Derivable statistics
• SC(A,r) = selection cardinality = avg# of records with A=given (= nr / V(A,r) ) (assumes uniformity...)
– e.g.: 10,000 students, 10 colleges – how many students in SCS?
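The derived statistics are simple arithmetic on the stored ones. A minimal sketch follows; the block size (4096 bytes) and tuple size (100 bytes) are assumed values, and rounding down for fr and up for br is an assumption, since only whole records and whole blocks make sense.

```python
import math

def blocking_factor(block_bytes: int, tuple_bytes: int) -> int:
    """fr: whole records that fit in one block (B / Sr, rounded down)."""
    return block_bytes // tuple_bytes

def num_blocks(n_tuples: int, fr: int) -> int:
    """br: blocks needed for nr tuples (nr / fr, rounded up)."""
    return math.ceil(n_tuples / fr)

def selection_cardinality(n_tuples: int, distinct_values: int) -> float:
    """SC(A,r) = nr / V(A,r), assuming values are uniformly distributed."""
    return n_tuples / distinct_values

fr = blocking_factor(4096, 100)          # assumed B and Sr
br = num_blocks(10_000, fr)
sc = selection_cardinality(10_000, 10)   # 10,000 students, 10 colleges
print(fr, br, sc)
```

Under the uniformity assumption the SCS estimate is 1,000 students, whatever the real enrollment is.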
Additional quantities we need:
• For index 'i':
– fi: average fanout (~50-100)
– HTi: # levels of index 'i' (~2-3)
• ~ log(#entries)/log(fi)
– LBi: # blocks at leaf level
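The height estimate can be checked with a few lines. This sketch counts levels with integer arithmetic instead of calling `log`, to avoid floating-point rounding at exact powers of the fanout; the entry counts are made-up examples.

```python
def index_height(n_entries: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= n_entries (at least 1)."""
    levels, capacity = 1, fanout
    while capacity < n_entries:
        capacity *= fanout
        levels += 1
    return levels

for fanout in (50, 100):
    print(fanout, index_height(1_000_000, fanout))
```

With a fanout of 100, a million entries need 3 levels, in line with the "~2-3" rule of thumb.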
Statistics
• Where do we store them?
• How often do we update them?
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
• estimate cost; pick best
Selections
• we saw simple predicates (A=constant; e.g., 'name=Smith')
• how about more complex predicates, like
– 'salary > 10K'
– 'age = 30 and job-code="analyst"'
• what is their selectivity?
Selections – complex predicates
• selectivity sel(P) of predicate P :
== fraction of tuples that qualify
sel(P) = SC(P) / nr
Selections – complex predicates
• eg., assume that V(grade, TAKES)=5
distinct values
• simple predicate P: A=constant
– sel(A=constant) = 1/V(A,r)
– e.g., sel(grade='B') = 1/5
• (what if V(A,r) is unknown??)
[Figure: count vs. grade, A … F]
Selections – complex predicates
• range query: sel( grade >= 'C')
– sel(A>a) = (Amax – a) / (Amax – Amin)
[Figure: count vs. grade, A … F]
Selections - complex predicates
• negation: sel( grade != 'C')
– sel( not P) = 1 – sel(P)
– (Observation: selectivity =~ probability)
[Figure: count vs. grade, A … F, with region 'P' marked]
Selections – complex predicates
conjunction:
– sel( grade = 'C' and course = '415')
– sel(P1 and P2) = sel(P1) * sel(P2)
– INDEPENDENCE ASSUMPTION
Selections – complex predicates
disjunction:
– sel( grade = 'C' or course = '415')
– sel(P1 or P2) = sel(P1) + sel(P2) – sel(P1 and P2)
– = sel(P1) + sel(P2) – sel(P1)*sel(P2)
– INDEPENDENCE ASSUMPTION, again
Selections – complex predicates
disjunction: in general
sel(P1 or P2 or … Pn) =
1 - (1- sel(P1) ) * (1 - sel(P2) ) * … (1 - sel(Pn))
Selections – summary
– sel(A=constant) = 1/V(A,r)
– sel( A>a) = (Amax – a) / (Amax – Amin)
– sel(not P) = 1 – sel(P)
– sel(P1 and P2) = sel(P1) * sel(P2)
– sel(P1 or P2) = sel(P1) + sel(P2) – sel(P1)*sel(P2)
– sel(P1 or ... or Pn) = 1 - (1-sel(P1))*...*(1-sel(Pn))
– UNIFORMITY and INDEPENDENCE ASSUMPTIONS
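The summary rules translate directly into small functions. A minimal sketch, with made-up statistics (V(grade)=5 as in the earlier example, V(course)=20 assumed); all function names are illustrative.

```python
import math

def sel_eq(distinct_values: int) -> float:
    """sel(A = constant) = 1 / V(A,r), uniformity assumed."""
    return 1.0 / distinct_values

def sel_gt(a: float, a_min: float, a_max: float) -> float:
    """sel(A > a) = (Amax - a) / (Amax - Amin), uniformity assumed."""
    return (a_max - a) / (a_max - a_min)

def sel_not(p: float) -> float:
    return 1.0 - p

def sel_and(*sels: float) -> float:
    """Conjunction under the independence assumption."""
    return math.prod(sels)

def sel_or(*sels: float) -> float:
    """Disjunction under independence: 1 - prod(1 - sel(Pi))."""
    return 1.0 - math.prod(1.0 - s for s in sels)

grade_c = sel_eq(5)       # V(grade, TAKES) = 5
course_415 = sel_eq(20)   # assumed V(course, TAKES) = 20
print(sel_and(grade_c, course_415), sel_or(grade_c, course_415))
```

For two predicates `sel_or` agrees with sel(P1) + sel(P2) – sel(P1)*sel(P2), so one function covers both disjunction rules.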
Result Size Estimation for Joins
• Q: Given a join of R and S, what is the range of
possible result sizes (in #of tuples)?
– Hint: what if $R\_cols \cap S\_cols = \emptyset$?
– $R\_cols \cap S\_cols$ is a key for R (and a Foreign Key in S)?
Result Size Estimation for Joins
• General case: $R\_cols \cap S\_cols = \{A\}$ (and A is key for neither)
– match each R-tuple with S-tuples
est_size <~ NTuples(R) * NTuples(S)/NKeys(A,S)
<~ nr * ns / V(A,S)
– symmetrically, for S:
est_size <~ NTuples(R) * NTuples(S)/NKeys(A,R)
<~ nr * ns / V(A,R)
– Overall:
est_size = NTuples(R)*NTuples(S)/MAX{NKeys(A,S), NKeys(A,R)}
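The overall estimate can be sketched as follows; the relation sizes and distinct-value counts are made-up numbers for illustration.

```python
def est_join_size(n_r: int, n_s: int, v_a_r: int, v_a_s: int) -> float:
    """nr * ns / max(V(A,R), V(A,S)): the tighter of the two one-sided estimates."""
    return n_r * n_s / max(v_a_r, v_a_s)

# Hypothetical statistics: R has 1,000 tuples and 50 distinct A values,
# S has 2,000 tuples and 100 distinct A values.
one_sided_s = 1_000 * 2_000 / 100   # match each R-tuple with S-tuples
one_sided_r = 1_000 * 2_000 / 50    # symmetrically, for S
print(one_sided_s, one_sided_r, est_join_size(1_000, 2_000, 50, 100))
```

Dividing by the larger distinct-value count picks the smaller of the two upper estimates.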
On the Uniform Distribution Assumption
• Assuming uniform distribution is rather crude
[Figure: two bar charts over values 0-14 (counts 0-10): "Distribution D" and "Uniform distribution approximating D"]
Histograms
• For better estimation, use a histogram
[Figure: two histograms over values 0-14 (counts 0-10), five buckets each]
– Equiwidth histogram: bucket counts 8, 4, 15, 3, 15
– Equidepth histogram (~ quantiles): bucket counts 9, 10, 10, 7, 9
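The two histogram types differ only in how bucket boundaries are chosen. A minimal sketch on made-up, skewed data (not the data behind the figure): equiwidth fixes the value range per bucket, equidepth fixes the tuple count per bucket.

```python
def equiwidth(values, lo, hi, buckets):
    """Counts per bucket when [lo, hi) is cut into equal-width ranges."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return counts

def equidepth(values, buckets):
    """(low value, high value, count) per bucket, with near-equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    result = []
    for i in range(buckets):
        chunk = ordered[i * n // buckets:(i + 1) * n // buckets]
        if chunk:
            result.append((chunk[0], chunk[-1], len(chunk)))
    return result

# Hypothetical skewed column: value 1 is very frequent.
values = [1] * 8 + [2] * 2 + [5] + [9]
print(equiwidth(values, 0, 10, 5))
print(equidepth(values, 3))
```

On skewed data the equiwidth counts vary widely, while the equidepth buckets stay balanced and spend more resolution where the data is dense.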
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
– single relation
– multiple relations
• estimate cost; pick best
plan generation
• Selections – e.g.,
select *
from TAKES
where grade = 'A'
• Plans?
plan generation
• Plans?
– seq. scan
– binary search
• (if sorted & consecutive)
– index search
• if an index exists
plan generation
seq. scan – cost?
• br (worst case)
• br/2 (average, if we search for primary key)
plan generation
binary search – cost?
if sorted and consecutive:
• ~log(br) +
• SC(A,r)/fr (=blocks spanned by qual. tuples)
plan generation
estimation of selection cardinalities SC(A,r): non-trivial – we saw it earlier
plan generation
method#3: index – cost?
– levels of index +
– blocks w/ qual. tuples
case#1: primary key
case#2: sec. key – clustering index
case#3: sec. key – non-clust. index
plan generation
method#3: index – cost?
– levels of index +
– blocks w/ qual. tuples
case#1: primary key – cost: HTi + 1
plan generation
method#3: index – cost?
– levels of index +
– blocks w/ qual. tuples
case#2: sec. key – clustering index – cost: HTi + SC(A,r)/fr
plan generation
method#3: index – cost?
– levels of index +
– blocks w/ qual. tuples
case#3: sec. key – non-clust. index – cost: HTi + SC(A,r)
(actually, pessimistic...)
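The single-relation cost formulas so far can be put side by side. A minimal sketch in block transfers; the statistics (br, fr, SC, HTi) are assumed values, and base-2 logarithm for binary search is an assumption.

```python
import math

def cost_seq_scan(br: int, key_lookup: bool = False) -> float:
    """br in the worst case; br/2 on average when searching for a primary key."""
    return br / 2 if key_lookup else br

def cost_binary_search(br: int, sc: float, fr: int) -> float:
    """~log(br) to locate the first match + blocks spanned by qualifying tuples."""
    return math.log2(br) + sc / fr

def cost_index_primary(ht: int) -> int:
    return ht + 1

def cost_index_clustering(ht: int, sc: float, fr: int) -> float:
    return ht + sc / fr

def cost_index_nonclustering(ht: int, sc: float) -> float:
    """Pessimistic: one block access per qualifying tuple."""
    return ht + sc

br, fr, sc, ht = 256, 40, 1000, 3   # hypothetical statistics
print(cost_seq_scan(br), cost_binary_search(br, sc, fr),
      cost_index_clustering(ht, sc, fr), cost_index_nonclustering(ht, sc))
```

With these numbers the non-clustering index (1003) costs more than a sequential scan (256), which is why the optimizer cannot simply prefer any available index.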
plan generation
method#3: index – cost?
– levels of index +
– blocks w/ qual. tuples
(actually, pessimistic... better estimates: Cardenas' formula)
Cardenas' formula
• q: # qual records
• Q: # qual. blocks
• N: # records total
• B: # blocks total
• Q=??
(Alfonso Cardenas, UCLA)
Cardenas' formula
• Pessimistic:
– Q = q
Cardenas' formula
• Pessimistic:
– Q = q
• More realistic
– Q = q if q<=B
– Q = B otherwise
Cardenas' formula
• Cardenas' formula:
$Q = B \, [1 - (1 - 1/B)^q]$
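A quick comparison of the three estimates for Q, as a sketch; the values of q and B are made up.

```python
def q_pessimistic(q: int) -> int:
    """Every qualifying record sits in its own block."""
    return q

def q_capped(q: int, B: int) -> int:
    """Same, but never more than the B blocks that exist."""
    return min(q, B)

def q_cardenas(q: int, B: int) -> float:
    """Expected # of distinct blocks hit: B * (1 - (1 - 1/B)**q)."""
    return B * (1.0 - (1.0 - 1.0 / B) ** q)

B = 250   # hypothetical number of blocks
for q in (1, 10, 100, 1000):
    print(q, q_pessimistic(q), q_capped(q, B), round(q_cardenas(q, B), 1))
```

For small q all three agree closely; as q grows, Cardenas' estimate bends toward B but stays below both q and B.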
Plans for single relation - summary
• no index: scan (dup-elim; sort)
• with index:
– single index access path
– multiple index access path
– sorted index access path
– index-only access path
Citation
• P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In SIGMOD Conference, pages 23-34, 1979.
Frequently cited database publications (http://www.informatik.uni-trier.de/~ley/db/about/top.html)
# Publication
608 Peter P. Chen: The Entity-Relationship Model - Toward a Unified View of Data. ACM Trans. Database Syst. 1(1): 9-36 (1976)
580 E. F. Codd: A Relational Model of Data for Large Shared Data Banks. Commun. ACM 13(6): 377-387 (1970)
371 Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34
366 Jeffrey D. Ullman: Principles of Database and Knowledge Base Systems, Volume I. Computer Science Press 1988, ISBN 0-7167-8158-1
… …
Statistics for Optimization
• NCARD (T) - cardinality of relation T in tuples
• TCARD (T) - number of pages containing tuples
from T
• P(T) = TCARD(T)/(# of non-empty pages in the
segment)
– If segments only held tuples from one relation there
would be no need for P(T)
• ICARD(I) - number of distinct keys in index I
• NINDX(I) - number of pages in index I
Predicate Selectivity Estimation
• attr = value:
– F = 1/ICARD(attr index) – if index exists
– F = 1/10 otherwise
• attr1 = attr2:
– F = 1/max(ICARD(I1),ICARD(I2)) or
– F = 1/ICARD(Ii) – if only index i exists, or
– F = 1/10
• val1 < attr < val2:
– F = (value2-value1)/(high key-low key)
– F = 1/4 otherwise
• expr1 or expr2: F = F(expr1)+F(expr2)–F(expr1)*F(expr2)
• expr1 and expr2: F = F(expr1) * F(expr2)
• NOT expr: F = 1 – F(expr)
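The table can be read as a set of rules with fallback defaults when no statistics are available. A minimal sketch, where `None` stands for "no index / no statistics" (an encoding chosen here for illustration, along with all function names):

```python
def f_eq_value(icard=None):
    """attr = value: 1/ICARD if an index on attr exists, else 1/10."""
    return 1 / icard if icard else 1 / 10

def f_eq_attrs(icard1=None, icard2=None):
    """attr1 = attr2: 1/max(ICARDs) over the indexes that exist, else 1/10."""
    cards = [c for c in (icard1, icard2) if c]
    return 1 / max(cards) if cards else 1 / 10

def f_range(value1=None, value2=None, low_key=None, high_key=None):
    """val1 < attr < val2: interpolate over the key range if known, else 1/4."""
    if None in (value1, value2, low_key, high_key) or high_key == low_key:
        return 1 / 4
    return (value2 - value1) / (high_key - low_key)

def f_not(f):
    return 1 - f

def f_and(f1, f2):
    return f1 * f2

def f_or(f1, f2):
    return f1 + f2 - f1 * f2

print(f_eq_value(), f_eq_value(50), f_eq_attrs(20, 50), f_range(10, 20, 0, 100))
```

The constants 1/10 and 1/4 are the defaults from the table; they apply only when no index statistics exist.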
Costs per Access Path Case
TCARD/P + W*RSICARD Segment scan
F(preds)*(NINDX(I)+TCARD)+W*RSICARD Clustered index I
matching >=1
preds
F(preds)*(NINDX(I)+NCARD)+W*RSICARD
Non-clustered
index I matching
>=1 preds
1+1+W Unique index
matching equal
predicate
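A minimal sketch of how these formulas could be compared for one table. The slide does not define W and RSICARD; as far as I recall from the System R paper, W weights CPU work against page fetches and RSICARD is the expected number of tuples returned by the storage interface, so treat those readings as assumptions. All function names and the statistics below are made up for illustration.

```python
def segment_scan_cost(tcard, p, w, rsicard):
    return tcard / p + w * rsicard

def clustered_index_cost(f, nindx, tcard, w, rsicard):
    return f * (nindx + tcard) + w * rsicard

def nonclustered_index_cost(f, nindx, ncard, w, rsicard):
    return f * (nindx + ncard) + w * rsicard

def unique_index_cost(w):
    return 1 + 1 + w

# Hypothetical statistics for one table T and one index I.
stats = dict(ncard=10_000, tcard=500, p=0.5, nindx=50)
w, rsicard, f = 0.1, 100, 0.01

costs = {
    "segment scan": segment_scan_cost(stats["tcard"], stats["p"], w, rsicard),
    "clustered index": clustered_index_cost(f, stats["nindx"], stats["tcard"], w, rsicard),
    "non-clustered index": nonclustered_index_cost(f, stats["nindx"], stats["ncard"], w, rsicard),
}
cheapest = min(costs, key=costs.get)
print(cheapest, costs[cheapest])
```

With these made-up numbers the clustered index wins, because it touches only the matching fraction of the data pages.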
![Page 64: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/64.jpg)
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
– single relation
– multiple relations
• Main idea
• Dynamic programming – reminder
• Example
• estimate cost; pick best
![Page 65: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/65.jpg)
n-way joins
• r1 JOIN r2 JOIN ... JOIN rn
• typically, break problem into 2-way joins
– choose between NL, sort merge, hash join, ...
![Page 66: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/66.jpg)
Queries Over Multiple Relations
• As the number of joins increases, the number of alternative plans grows rapidly, so we need to restrict the search space
• Fundamental decision in System R: only left-deep join trees are considered. Advantages?
– fully pipelined plans.
• Intermediate results not written to temporary files.
• Not all left-deep trees are fully pipelined (e.g., SM join).
[Figure: three example join trees over relations A, B, C, D]
![Page 68: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/68.jpg)
Queries over Multiple Relations
• Enumerate the orderings (= left deep tree)
• enumerate the plans for each operator
• enumerate the access paths for each table
Dynamic programming, to save cost estimations
![Page 70: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/70.jpg)
(Reminder: Dynamic Programming)
Cheapest flight PIT -> SG ?
[Figure: flight graph from PIT to SG through the intermediate airports BOS, JFK, ATL, CDG and FRA; each edge is labeled with its fare (e.g., $200, $150, $500, $800)]
Assumption: NO package deals: the cost CDG->SG is always $800, no matter how we reached CDG
Solution: compute partial optimal, left-to-right:
[Figure: the same flight graph, built up step by step; airports are annotated with the cheapest fare found so far from PIT ($200, $150, $50, then $700, $650), and SG ends up annotated with $1500]
So, best price is $1,500 – which legs?
A: follow the winning edges, backwards
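The left-to-right computation and the backward walk can be sketched in Python. The extracted slide does not say which fare belongs to which leg, so the fare table below is a hypothetical assignment, chosen only so that the optimum is again $1,500; the grouping of airports into stages is likewise an assumption.

```python
# Hypothetical fares (assumption, not the slide's actual edge labels).
FARES = {
    ("PIT", "BOS"): 200, ("PIT", "JFK"): 150, ("PIT", "ATL"): 50,
    ("BOS", "CDG"): 500, ("JFK", "CDG"): 650, ("ATL", "CDG"): 1050,
    ("BOS", "FRA"): 450, ("JFK", "FRA"): 600, ("ATL", "FRA"): 950,
    ("CDG", "SG"): 800, ("FRA", "SG"): 950,
}
STAGES = [["PIT"], ["BOS", "JFK", "ATL"], ["CDG", "FRA"], ["SG"]]

def cheapest_route(fares, stages):
    source, target = stages[0][0], stages[-1][0]
    # best[city] = (cheapest cost from source, predecessor on that route)
    best = {source: (0, None)}
    for prev_stage, stage in zip(stages, stages[1:]):
        for city in stage:
            options = [
                (best[p][0] + fares[(p, city)], p)
                for p in prev_stage
                if p in best and (p, city) in fares
            ]
            if options:
                best[city] = min(options)  # keep only the partial optimum
    # Follow the winning edges backwards from the target.
    route, city = [], target
    while city is not None:
        route.append(city)
        city = best[city][1]
    return best[target][0], route[::-1]

cost, route = cheapest_route(FARES, STAGES)
print(cost, route)
```

Only one (cost, predecessor) pair is kept per airport, which is valid precisely because of the "no package deals" assumption.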
Q: what are the states (and costs and arrows), in q-opt?
A: set of intermediate result tables
![Page 82: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/82.jpg)
Q-opt and Dyn. Programming
• E.g., compute R join S join T
[Figure: state graph from {R, S, T} through {R join S, T} or {R, S join T} to {R join S join T}; each arrow is labeled with a cost and a join method, e.g., 150 (SM) or 2,500 (NL)]
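A minimal sketch of dynamic programming over join states, restricted to left-deep orders without cross products. The cost model (sum of estimated intermediate result sizes) and the cardinalities and selectivities are assumptions made for illustration; they are not System R's cost formulas or the numbers in the figure.

```python
from itertools import combinations

def best_left_deep_plan(cards, selectivity):
    """cards: relation -> tuple count; selectivity: frozenset({a, b}) -> join factor."""
    rels = sorted(cards)

    def connected(rel, others):
        return any(frozenset({rel, o}) in selectivity for o in others)

    def size(subset):
        n = 1.0
        for r in subset:
            n *= cards[r]
        for pair, f in selectivity.items():
            if pair <= subset:
                n *= f
        return n

    # State = set of relations joined so far; value = (cost, join order).
    best = {frozenset({r}): (0.0, (r,)) for r in rels}
    for k in range(2, len(rels) + 1):
        for combo in combinations(rels, k):
            subset = frozenset(combo)
            for r in combo:                      # r is joined last
                rest = subset - {r}
                if rest not in best or not connected(r, rest):
                    continue                     # would need a cross product
                cost = best[rest][0] + size(subset)
                if subset not in best or cost < best[subset][0]:
                    best[subset] = (cost, best[rest][1] + (r,))
    return best.get(frozenset(rels))

cards = {"R": 1000, "S": 100, "T": 10}
selectivity = {frozenset({"R", "S"}): 0.01, frozenset({"S", "T"}): 0.1}
cost, order = best_left_deep_plan(cards, selectivity)
print(cost, order)
```

Here joining S with T first is cheaper, since that intermediate result is much smaller than R join S.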
![Page 84: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/84.jpg)
Q-opt and Dyn. Programming
• Details: how to record the fact that, say, R is sorted on R.a? Or that the user requires sorted output?
• A: record orderings, in the state
• E.g., consider the query
select *
from R, S, T
where R.a = S.a and S.b = T.b
order by R.a
![Page 85: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/85.jpg)
Q-opt and Dyn. Programming
• E.g., compute R join S join T order by R.a
[Figure: state graph from {R, S, T} through {R join S, T} or {R, S join T} to {R join S join T}, with arrows labeled e.g. 150 (SM) and 2,500 (NL); a sort step leads to a final state "R join S join T, sorted R.a"]
Any other changes?
[Figure: the graph gains an extra state "R join S (R.a), T", i.e., R join S sorted on R.a, with arrows labeled 2000 (NL) and 50 (HJ)]
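One way to picture "record orderings, in the state" is to prune plans per (relations, ordering) pair instead of per set of relations. The sketch below uses hypothetical plan records and costs, not the numbers in the figure, and it assumes the last join preserves its input order.

```python
def prune(plans):
    """Keep the cheapest plan for every (relations, order) state."""
    best = {}
    for plan in plans:
        key = (plan["rels"], plan["order"])
        if key not in best or plan["cost"] < best[key]["cost"]:
            best[key] = plan
    return list(best.values())

def final_cost(plan, required_order, sort_cost):
    """Add a sort only if the plan does not already deliver the required order."""
    return plan["cost"] + (0 if plan["order"] == required_order else sort_cost)

# Hypothetical candidates for R join S.
candidates = [
    {"rels": frozenset("RS"), "order": None, "cost": 100, "how": "HJ"},
    {"rels": frozenset("RS"), "order": None, "cost": 400, "how": "NL"},
    {"rels": frozenset("RS"), "order": "R.a", "cost": 150, "how": "SM"},
]
kept = prune(candidates)

# Assumption: joining T costs 50 more and preserves the input order.
extended = [dict(p, rels=p["rels"] | {"T"}, cost=p["cost"] + 50) for p in kept]
winner = min(extended, key=lambda p: final_cost(p, "R.a", sort_cost=200))
print(winner["how"], final_cost(winner, "R.a", sort_cost=200))
```

The sort-merge plan is not the cheapest way to compute R join S, but keeping it pays off because it avoids the final sort.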
![Page 89: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/89.jpg)
Candidate Plans
SELECT S.sname, B.bname, R.day
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
1. Enumerate relation orderings:
[Figure: the six left-deep orderings of R, S and B; the two that join S with B first are marked with an x, since S and B share no join predicate]
Prune plans with cross-products immediately!
![Page 90: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/90.jpg)
Candidate Plans
SELECT S.sname, B.bname, R.day
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
2. Enumerate join algorithm choices:
[Figure: one ordering of R, S and B with each of its two joins implemented as HJ or NLJ: (HJ, HJ), (HJ, NLJ), (NLJ, HJ), (NLJ, NLJ)]
+ do same for the 3 other orderings
4*4 = 16 plans so far..
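The count of 16 can be reproduced with a short enumeration. This is a sketch under the slide's setup: three relations, join predicates only between S and R and between R and B, and two join algorithms; the variable names are illustrative.

```python
from itertools import permutations, product

relations = ["S", "R", "B"]
join_predicates = {frozenset({"S", "R"}), frozenset({"R", "B"})}
join_algorithms = ["HJ", "NLJ"]

def needs_cross_product(order):
    """True if some relation joins a prefix it shares no predicate with."""
    joined = {order[0]}
    for rel in order[1:]:
        if not any(frozenset({rel, j}) in join_predicates for j in joined):
            return True
        joined.add(rel)
    return False

# Step 1: left-deep orderings, pruning cross products immediately.
orderings = [o for o in permutations(relations) if not needs_cross_product(o)]

# Step 2: one algorithm choice per join (two joins for three relations).
plans = [
    (order, algos)
    for order in orderings
    for algos in product(join_algorithms, repeat=len(relations) - 1)
]
print(len(orderings), len(plans))
```

Step 3 (access methods) would multiply the count again by the number of access-path choices per table.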
![Page 91: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/91.jpg)
Candidate Plans
SELECT S.sname, B.bname, R.day
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid
3. Enumerate access method choices:
[Figure: the NLJ/NLJ plan over R, S and B expanded into access-method variants, e.g., heap scans on all three tables, or an INDEX scan on R.bid with heap scans on the other two]
+ do same for other plans
![Page 92: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/92.jpg)
Now estimate the cost of each plan
Example:
[Figure: NLJ/NLJ plan over R, S and B, using an INDEX scan on R.sid and heap scans on the other two tables]
![Page 93: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/93.jpg)
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
– single relation
– multiple relations
– nested subqueries
• estimate cost; pick best
![Page 94: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/94.jpg)
Q-opt steps
• Everything so far: about a single query
block
![Page 95: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/95.jpg)
Query Rewriting
• Re-write nested queries
• to: de-correlate and/or flatten them
![Page 96: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/96.jpg)
Example: Decorrelating a Query
SELECT S.sid
FROM Sailors S
WHERE EXISTS
  (SELECT *
   FROM Reserves R
   WHERE R.bid=103 AND R.sid=S.sid)
Equivalent uncorrelated query:
SELECT S.sid
FROM Sailors S
WHERE S.sid IN
  (SELECT R.sid
   FROM Reserves R
   WHERE R.bid=103)
• Advantage: nested block only needs to be executed once (rather than once per S tuple)
![Page 97: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/97.jpg)
Example: “Flattening” a Query
SELECT S.sid
FROM Sailors S
WHERE S.sid IN
  (SELECT R.sid
   FROM Reserves R
   WHERE R.bid=103)
Equivalent non-nested query:
SELECT S.sid
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
• Advantage: can use a join algorithm + optimizer can select among join algorithms & reorder freely
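The three formulations can be compared on a toy database with Python's built-in sqlite3 module. The table contents are made up. Note one caveat the sketch makes visible: the flattened join returns a sid once per matching reservation, so it matches the nested forms as a set of sids (or with DISTINCT) when a sailor reserved boat 103 more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Reserves VALUES
        (1, 103, 'Mon'), (1, 103, 'Tue'), (2, 101, 'Mon'), (3, 103, 'Wed');
""")

correlated = """
    SELECT S.sid FROM Sailors S WHERE EXISTS
      (SELECT * FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid)"""
uncorrelated = """
    SELECT S.sid FROM Sailors S WHERE S.sid IN
      (SELECT R.sid FROM Reserves R WHERE R.bid=103)"""
flattened = """
    SELECT S.sid FROM Sailors S, Reserves R
    WHERE S.sid=R.sid AND R.bid=103"""

def sids(sql):
    """Run a query and return its sid column, sorted."""
    return sorted(row[0] for row in conn.execute(sql))

print(sids(correlated), sids(uncorrelated), sids(flattened))
```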
![Page 98: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/98.jpg)
Structure of query optimizers:
System R:
– break query in query blocks
– simple queries (i.e., no joins): look at stats
– n-way joins: left-deep join trees; i.e., only one intermediate result at a time
• pros: smaller search space; pipelining
• cons: may miss optimal
– 2-way joins: NL and sort-merge
[Figure: left-deep join tree over r1, r2, r3, r4]
![Page 99: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/99.jpg)
Structure of query optimizers:
More heuristics by Oracle, Sybase and Starburst (-> DB2)
In general: q-opt is very important for large databases.
('explain select <sql-statement>' gives plan)
![Page 100: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/100.jpg)
Q-opt steps
• bring query into internal form (e.g., parse tree)
• … into 'canonical form' (syntactic q-opt)
• generate alt. plans
• estimate cost; pick best
![Page 101: Carnegie Mellon Univ. Dept. of Computer Science 15-415 ... › course › 15-415-f11 › ... · CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications](https://reader033.vdocument.in/reader033/viewer/2022060404/5f0eef857e708231d441aa9b/html5/thumbnails/101.jpg)
Conclusions
• Ideas to remember:
– syntactic q-opt – do selections early
– selectivity estimations (uniformity, indep.;
histograms; join selectivity)
– hash join (nested loops; sort-merge)
– left-deep joins
– dynamic programming