
Page 1: Robust query processing Goetz Graefe, Christian König, Harumi Kuno, Volker Markl, Kai-Uwe Sattler Dagstuhl – September 2010

Robust query processing

Goetz Graefe, Christian König, Harumi Kuno, Volker Markl, Kai-Uwe Sattler

Dagstuhl – September 2010

Page 2:

April 18, 2023 Dagstuhl - Robust Query Processing 2

Max-diff histograms

[Figure: a true distribution and its average value, approximated by four histogram types: equal width, equal area, max-diff, and "equal height?".]
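The slide names max-diff histograms without defining them. A minimal sketch, assuming the common definition in which bucket boundaries are placed between the adjacent values whose frequencies differ most; the function name and the (start, end, average) bucket layout are illustrative and not taken from the slides:

```python
from typing import List, Tuple


def maxdiff_buckets(freqs: List[float], num_buckets: int) -> List[Tuple[int, int, float]]:
    """Split freqs into buckets at the largest adjacent differences.

    Returns (start, end, average) per bucket, with end exclusive.
    """
    if not freqs or num_buckets < 1:
        raise ValueError("need at least one frequency and one bucket")
    num_buckets = min(num_buckets, len(freqs))
    # Difference i+1 sits between positions i and i+1.
    diffs = [(abs(freqs[i + 1] - freqs[i]), i + 1) for i in range(len(freqs) - 1)]
    # The num_buckets - 1 largest differences become bucket boundaries.
    diffs.sort(key=lambda d: (-d[0], d[1]))
    cuts = sorted(pos for _, pos in diffs[:num_buckets - 1])
    edges = [0] + cuts + [len(freqs)]
    return [(lo, hi, sum(freqs[lo:hi]) / (hi - lo)) for lo, hi in zip(edges, edges[1:])]


buckets = maxdiff_buckets([10, 11, 10, 50, 52, 51, 5, 6], 3)
```

With this input, the two largest jumps (10 to 50 and 51 to 5) become the boundaries, so each bucket covers values of similar frequency and its average stays close to the true distribution.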

Page 3:

Histograms with slope

[Figure: a true distribution and its average value, approximated by linear regression, max-diff with slope, and max-diff.]

Page 4:

Slope, patterns, extrapolation

Page 5:

Detecting query slowdown

[Figure: elapsed time (y-axis, 0 to 45) per query execution (x-axis, executions 1 to 59). Series: measured values, slow average, fast average.]
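The chart plots measured values against a slow and a fast average. One plausible reading, sketched below, is a pair of exponentially weighted moving averages, where a slowdown is flagged when the fast average rises well above the slow one. The smoothing factors, the threshold, and the function name are assumptions for illustration and do not come from the slides:

```python
from typing import Iterable, List


def detect_slowdown(elapsed: Iterable[float],
                    slow_alpha: float = 0.05,
                    fast_alpha: float = 0.5,
                    threshold: float = 1.5) -> List[int]:
    """Return the execution indexes at which the fast average exceeds
    threshold times the slow average."""
    flagged: List[int] = []
    slow = fast = None
    for i, t in enumerate(elapsed):
        if slow is None:
            # Seed both averages with the first measurement.
            slow = fast = t
            continue
        # The fast average follows recent executions closely ...
        fast = fast_alpha * t + (1 - fast_alpha) * fast
        # ... while the slow average reflects long-term behavior.
        slow = slow_alpha * t + (1 - slow_alpha) * slow
        if fast > threshold * slow:
            flagged.append(i)
    return flagged


steady = [10.0] * 30
degraded = steady + [40.0] * 5
first_alarm = detect_slowdown(degraded)[0]
```

Comparing two averages rather than a single measurement against a fixed limit keeps one noisy execution from raising an alarm, while a sustained shift still shows up within a few executions.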

Page 6:

External merge sort

• Initial runs: size M, count N/M

• Merge fan-in F = M − read-ahead buffers

• Merge depth = merge levels = log_F(N/M)

[Figure: merge tree with fan-in F; initial runs of size M are merged into runs of size F×M.]
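The formulas on this slide can be checked with a small calculation. The sketch below assumes N and M are measured in the same unit (for example pages) and rounds up to whole runs and whole merge levels; the function names are illustrative:

```python
def initial_runs(n: int, m: int) -> int:
    """Number of initial runs of size M for an input of size N (rounded up)."""
    return -(-n // m)


def merge_levels(n: int, m: int, fan_in: int) -> int:
    """Merge depth, i.e. log_F(N/M) rounded up to whole merge levels."""
    if fan_in < 2:
        raise ValueError("fan-in must be at least 2")
    runs = initial_runs(n, m)
    levels = 0
    while runs > 1:
        # Each level merges up to fan_in runs into one larger run.
        runs = -(-runs // fan_in)
        levels += 1
    return levels


# 1,000,000 pages of input, 1,000 pages of memory, fan-in 10:
# 1,000 initial runs -> 100 -> 10 -> 1, i.e. three merge levels.
depth = merge_levels(1_000_000, 1_000, 10)
```

The loop uses integer arithmetic rather than a floating-point logarithm so that exact powers of F do not suffer from rounding error.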

Page 7:

Hybrid hash join

• Applies if M < N1 ≤ F×M
– 1 < N1/M ≤ F
– 0 < log_F(N1/M) ≤ 1

• Actual fan-out K: 1 < K ≤ F
– Hash table + K output buffers
– (M−K) + (K×M) ≥ N1
– K ≥ (N1−M) / (M−1)

• Fairly smooth cost function
– Eases query optimization
– Eases memory management

[Figure: partitioning into overflow partitions 1 … K.]
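The bound on K follows from the memory inequality: (M−K) + (K×M) = M + K×(M−1), so M + K×(M−1) ≥ N1 gives K ≥ (N1−M) / (M−1). A minimal sketch that computes the smallest integer K satisfying this inequality, assuming N1 and M are in the same unit (for example pages); the function names are illustrative:

```python
def fits(n1: int, m: int, k: int) -> bool:
    """Check the slide's condition (M - K) + (K * M) >= N1."""
    return (m - k) + (k * m) >= n1


def min_fan_out(n1: int, m: int) -> int:
    """Smallest integer K with K >= (N1 - M) / (M - 1)."""
    if m < 2:
        raise ValueError("memory must hold at least two pages")
    if n1 <= m:
        return 0  # the build input fits in memory; no partition spills
    return -(-(n1 - m) // (m - 1))  # ceiling division


# Build input of 500 pages with 100 pages of memory.
k = min_fan_out(500, 100)
```

For 500 pages of build input and 100 pages of memory, five output buffers suffice: 95 pages remain for the in-memory hash table and five spilled partitions of up to 100 pages each cover the rest.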

Page 8:

Merging vs. partitioning

Duality of sorting & hashing

Issue                        Sorting                             Hashing
In-memory algorithm          Quicksort etc.                      “Classic” hashing
Large (& very large) inputs  (Multi-level) merging               (Recursive) partitioning
Tradeoffs                    Fan-in vs. large I/O & read-ahead   Fan-out vs. large I/O & write-behind
Partial levels               On-demand spilling (SIGMOD 98)      Hybrid hashing
Multiple inputs              “Interesting orders”                Hash teams

Page 9:

Multiple optimization techniques are needed to find this plan

• Join clause inferred between line item & part supply
• Group-by list reduced by functional dependencies
• Grouping (on alternative column) pushed down through join
• “Interesting orderings” between scans, joins, grouping

Page 10:

Multiple optimization techniques in a hash-based plan

Same as previous example, plus
• Integrated hash operation …
• … within a hash team
• Disk-order scans

Page 11:

Star joins: semi-join reduction

First, join each dimension table with an index of the fact table; then, (hash-) intersect bookmark lists; finally, fetch fact table rows

Also considered: Cartesian products of dimension tables
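The three steps of semi-join reduction can be sketched as follows. This is an illustration only: the in-memory dictionaries stand in for the fact table and its secondary indexes, bookmarks are plain integer row ids, and all names are hypothetical:

```python
from typing import Dict, Hashable, Iterable, List, Set


def build_index(fact_rows: Dict[int, dict], column: str) -> Dict[Hashable, Set[int]]:
    """Secondary index on one fact-table column: key -> set of bookmarks."""
    index: Dict[Hashable, Set[int]] = {}
    for bookmark, row in fact_rows.items():
        index.setdefault(row[column], set()).add(bookmark)
    return index


def star_join(fact_rows: Dict[int, dict],
              fact_indexes: Dict[str, Dict[Hashable, Set[int]]],
              dimension_keys: Dict[str, Iterable[Hashable]]) -> List[dict]:
    """dimension_keys maps a fact column to the keys that survive the
    predicates on the corresponding dimension table."""
    bookmark_lists: List[Set[int]] = []
    # Step 1: join each dimension's qualifying keys with the fact index.
    for column, keys in dimension_keys.items():
        bookmarks: Set[int] = set()
        for key in keys:
            bookmarks |= fact_indexes[column].get(key, set())
        bookmark_lists.append(bookmarks)
    # Step 2: intersect the bookmark lists (set intersection is hash-based).
    survivors = set.intersection(*bookmark_lists) if bookmark_lists else set(fact_rows)
    # Step 3: fetch only the fact rows that satisfy every dimension.
    return [fact_rows[b] for b in sorted(survivors)]


facts = {
    1: {"store": "s1", "product": "p1"},
    2: {"store": "s1", "product": "p2"},
    3: {"store": "s2", "product": "p1"},
}
indexes = {col: build_index(facts, col) for col in ("store", "product")}
result = star_join(facts, indexes, {"store": ["s1"], "product": ["p1", "p3"]})
```

Only bookmarks travel through the first two steps, so the fact table itself is touched once, and only for rows that pass every dimension's filter.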

Page 12:

Symmetric semi-join reduction

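Select … from T1 join T2 on T1.a = T2.a where …

[Figure: query plan. Scans of Index T1 (a, s) and Index T2 (a, s) feed the join “T1.a = T2.a”; the T2 index scan yields fields T2.a, T2.s and the join yields fields T1.s, T2.s. Two fetch steps follow, both labeled “Fetch using T1.s” on the slide, yielding fields T1.*, T2.s and then fields T1.*, T2.*.]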

Page 13:

Index-to-index navigation performance: single-table execution times

[Figure: time in seconds (y-axis, 0 to 1,000) versus row count (x-axis). Series: scan plan, fetch plan, join plan, Fetch 9115, hash join, merge join, join + fetch, trad. fetch.]

Page 14:

2-dimensional parameter space

Page 15:

Fast loads and fast queries

[Figure: query performance (y-axis) versus load bandwidth (x-axis). Points: multiple indexes; no indexes or statistics; zone maps; zone filters; zone indexes; partitioned B-trees; "?".]

Page 16:

Traditional index choices

• Don’t index. Scan for each query – no cost for index creation

• Index creation before query processing
– Useful for predictable workloads

• “Monitoring and tuning” wizard
– Extra effort, hard to predict

[Figure labels: Scan; Index creation, Index searches; Adaptive indexing; Index tuning.]

Page 17:

Adaptive merging in partitioned B-trees

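[Figure: partitioned B-tree with partitions #0 to #4. Run generation fills partitions that each span keys a to z; merging then moves queried key ranges into partition #0. After merging a-j, partition #0 holds keys a to j and the remaining partitions hold keys k to z.]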
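The behavior shown in the figure can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: sorted Python lists stand in for B-tree partitions, and the class and method names are hypothetical:

```python
import bisect
from typing import List


class AdaptiveMergeIndex:
    def __init__(self, keys: List[str], run_size: int) -> None:
        # Run generation: sort memory-sized chunks into partitions #1..#n.
        self.runs = [sorted(keys[i:i + run_size])
                     for i in range(0, len(keys), run_size)]
        # Partition #0 collects the key ranges merged so far.
        self.final: List[str] = []

    def query(self, lo: str, hi: str) -> List[str]:
        """Return all keys in [lo, hi]; as a side effect, merge that key
        range out of the runs and into the final partition."""
        moved: List[str] = []
        for run in self.runs:
            start = bisect.bisect_left(run, lo)
            end = bisect.bisect_right(run, hi)
            moved.extend(run[start:end])
            del run[start:end]
        if moved:
            self.final = sorted(self.final + moved)
        start = bisect.bisect_left(self.final, lo)
        end = bisect.bisect_right(self.final, hi)
        return self.final[start:end]


index = AdaptiveMergeIndex(list("qwertyuiopasdfghjklzxcvbnm"), run_size=7)
first = index.query("a", "j")
```

After the query for a through j, the runs hold only keys from k upward, matching the "after merging a-j" state. A real system would also record which key ranges are already merged so that later queries on those ranges can skip the runs entirely.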

Page 18:

Adaptive merging vs database cracking

[Figure: comparison of database cracking, improved cracking, and adaptive merging.]

Page 19:

Tree of losers

• Traditional priority queue
– Enter and exit at root
– 2 log_2 M comparisons

• Tree of winners
– Enter at leaf, exit at root
– log_2 M comparisons
– Specific entry points
– Duplicate entries
– M/2 entries

• Tree of losers
– Enter at leaf, exit at root
– No duplicates, M entries

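[Figure: tree of losers for 8 runs, stored in array slots 0 to 7. Input keys by run: 0: F, 1: G, 2: E, 3: D, 4: A, 5: D, 6: C, 7: B. Slot 0 holds the overall winner (run 4, key A); slot 1 holds run 3, key D; slots 2 and 3 hold 0: F and 7: B; slots 4 to 7 hold 1: G, 2: E, 5: D, 6: C.]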
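The figure's array layout can be reproduced with a short sketch of a tree of losers. It assumes the number of runs is a power of two, that keys are comparable and never None (None marks an exhausted run here), and that ties are broken by run number; the class and method names are illustrative:

```python
from typing import Iterable, List, Optional, Tuple

Entry = Tuple[Optional[str], int]  # (key, run); key None = run exhausted


class LoserTree:
    def __init__(self, runs: List[Iterable[str]]) -> None:
        m = len(runs)
        if m < 2 or m & (m - 1):
            raise ValueError("this sketch assumes a power-of-two number of runs")
        self.m = m
        self.iters = [iter(r) for r in runs]
        # Slot 0 holds the overall winner, slots 1..m-1 hold losers.
        self.slots: List[Optional[Entry]] = [None] * m
        winners: List[Optional[Entry]] = [None] * (2 * m)
        for run in range(m):
            winners[m + run] = self._next(run)
        for node in range(m - 1, 0, -1):
            left, right = winners[2 * node], winners[2 * node + 1]
            if self._beats(left, right):
                winners[node], self.slots[node] = left, right
            else:
                winners[node], self.slots[node] = right, left
        self.slots[0] = winners[1]

    def _next(self, run: int) -> Entry:
        return (next(self.iters[run], None), run)

    @staticmethod
    def _beats(a: Entry, b: Entry) -> bool:
        """True if a wins, i.e. sorts first; exhausted runs always lose."""
        if a[0] is None:
            return False
        return b[0] is None or a <= b

    def pop(self) -> Optional[Entry]:
        """Remove and return the smallest entry, or None when all runs are empty."""
        key, run = self.slots[0]
        if key is None:
            return None
        # The replacement enters at its run's leaf and climbs to the root,
        # with one comparison per level (log_2 M in total).
        candidate = self._next(run)
        node = (self.m + run) // 2
        while node >= 1:
            if self._beats(self.slots[node], candidate):
                self.slots[node], candidate = candidate, self.slots[node]
            node //= 2
        self.slots[0] = candidate
        return key, run


tree = LoserTree([["F"], ["G"], ["E"], ["D"], ["A"], ["D"], ["C"], ["B"]])
initial_slots = list(tree.slots)
merged = []
while (entry := tree.pop()) is not None:
    merged.append(entry[0])
```

Built from the keys in the figure, the array starts with run 4 (key A) in slot 0 and run 3 (key D) in slot 1, as on the slide. Each node stores only the loser of its match, so every entry appears exactly once.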

Page 20:

Graceful degradation

• Exploit large memory
– Even during small merge
– Merge from memory

• Smooth transition
– Run generation to merging

• Continuous cost function
– Effect of hybrid hash join
– 2 × 6 GB ÷ 100 MB/s = 120 sec = 2 min
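The arithmetic in the last bullet can be reproduced directly. The sketch assumes that the factor 2 stands for writing the data once and reading it back once, and that 1 GB = 1,000 MB; neither assumption is stated on the slide, and the function name is illustrative:

```python
def spill_seconds(data_gb: float, bandwidth_mb_per_s: float,
                  passes: int = 2, mb_per_gb: int = 1000) -> float:
    """I/O time for moving data_gb through the device `passes` times."""
    return passes * data_gb * mb_per_gb / bandwidth_mb_per_s


seconds = spill_seconds(6, 100)  # 2 x 6,000 MB / 100 MB/s
minutes = seconds / 60
```

Because the cost is linear in the amount of data spilled, spilling slightly more data costs only slightly more time, which is what makes the cost function continuous.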


Page 21:

Graceful degradation in memory hierarchy

[Figure: memory hierarchy of main memory, flash memory, and rotating disk drive. Labels: run in memory, a few runs on flash, a few runs on disk, buffer for large disk pages, high fan-in merge, output.]

Page 22:

SQL Server lock modes

Page 23:

Optimal B-tree node sizes in 1997

Page 24:

Hilbert space-filling curve

Page 25:

Nicolas Bruno and Surajit Chaudhuri, Automatic Physical Database Tuning: A Relaxation-based Approach, in Proceedings of the ACM International Conference on Management of Data (SIGMOD), Association for Computing Machinery, Inc., 2005

Automatic Tuning: Relaxation-based

Page 26:

Sanjay Agrawal, Nicolas Bruno, Surajit Chaudhuri, and Vivek Narasayya, AutoAdmin: Self-Tuning Database Systems Technology, in Data Engineering Bulletin, IEEE Computer Society, 2006

Self-Tuning DB: AutoAdmin

Page 27:

Surajit Chaudhuri, Arnd Christian König, and Vivek Narasayya, SQLCM: A Continuous Monitoring Framework for Relational Database Engines, in ICDE 2004.

Continuous Monitoring: SQLCM