sd202: evaluation algorithms

105
SD202: Evaluation algorithms Louis Jachiet Louis JACHIET 1 / 53

Upload: others

Post on 02-Jan-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SD202: Evaluation algorithms

SD202: Evaluation algorithms

Louis Jachiet

Louis JACHIET 1 / 53

Page 2: SD202: Evaluation algorithms

Introduction

Query⇒How to compute

the result of a query?

Louis JACHIET 2 / 53

Page 3: SD202: Evaluation algorithms

Introduction

Query⇒

How to compute

the result of a query?

Louis JACHIET 2 / 53

Page 4: SD202: Evaluation algorithms

Introduction

Query⇒

How to compute

the result of a query?

Louis JACHIET 2 / 53

Page 5: SD202: Evaluation algorithms

Introduction

Query⇒How to compute

the result of a query?

Louis JACHIET 2 / 53

Page 6: SD202: Evaluation algorithms

Introduction

Query⇒How to compute efficiently

the result of a query?

Louis JACHIET 2 / 53

Page 7: SD202: Evaluation algorithms

Remember the relational algebra?

A dialect comprising:

• Relation R• RENAME(t,a,b)

• DROP(t,a)

• FILTER(t,cond)

• PRODUCT(t,t’)

• UNION

And also:

• JOIN(t,t’,cond,type)

• AGGREGATE(t,cols, exprs)

• and other operators

Louis JACHIET 3 / 53

Page 8: SD202: Evaluation algorithms

Query evaluation in a nutshell

SQL query

Relational Algebra term

Output

Data

Translation

Execution

Is it that simple?

Of course, no :)

Louis JACHIET 4 / 53

Page 9: SD202: Evaluation algorithms

Query evaluation in a nutshell

SQL query

Relational Algebra term

Output

Data

Translation

Execution

Is it that simple?

Of course, no :)

Louis JACHIET 4 / 53

Page 10: SD202: Evaluation algorithms

Query evaluation in a nutshell

SQL query Relational Algebra term

Query Optimizer

Best query plan Compiled query Output

Data

Compilation Execution

Louis JACHIET 5 / 53

Page 11: SD202: Evaluation algorithms

Today’s lesson

We will cover several topics

• Logical optimization

• A model of computation time

• Efficient retrieval using indexes

• Several algorithms to joins

• A general method for query optimization

• Some advanced topics (e.g. transactions)

Louis JACHIET 6 / 53

Page 12: SD202: Evaluation algorithms

Logical optimization of queries

Page 13: SD202: Evaluation algorithms

Comparing relational algebra terms

Consider the two following terms, are they computing the

same thing? Is one more efficient than the other?

FILTER(

JOIN(Room, Movie),

title="Le Bonheur"

)

JOIN(Room,

FILTER(Movie,

title="Le Bonheur")

)

Here JOIN is the natural join (i.e. on shared attributes)

Louis JACHIET 7 / 53

Page 14: SD202: Evaluation algorithms

Comparing relational algebra terms

What about those two terms?

JOIN(

JOIN(Room, Movie),

OscarMovies

)

JOIN(

Room,

JOIN(OscarMovies,Movie)

)

Louis JACHIET 8 / 53

Page 15: SD202: Evaluation algorithms

Equivalence rule

We can devise a set of rewriting rules

• JOIN(a,JOIN(b,c)) → JOIN(JOIN(a,b),c)

Associativity

• JOIN(a,b) → JOIN(b,a)

Commutativity

• FILTER(JOIN(a,b),f) → JOIN(FILTER(a,f),b)

Distributivity when f operates on attributes in a

• And many other rules. . .

Louis JACHIET 9 / 53

Page 16: SD202: Evaluation algorithms

A RA term optimizer

Algorithm

• Start with an RA term ϕ

• Set oldplans = ∅, plans = {ϕ}• While oldplans 6= plans

• oldplans = plans

• plans = plans ∪{ϕ′ | where ϕ′ is a rewriting of ϕ ∈ plans}• Return the most efficient ϕ ∈ plans

In practice, exploring the full plan space

is prohibitively expensive for large queries.

Also, how can we determine the most efficient ϕ?

Louis JACHIET 10 / 53

Page 17: SD202: Evaluation algorithms

A RA term optimizer

Algorithm

• Start with an RA term ϕ

• Set oldplans = ∅, plans = {ϕ}• While oldplans 6= plans

• oldplans = plans

• plans = plans ∪{ϕ′ | where ϕ′ is a rewriting of ϕ ∈ plans}• Return the most efficient ϕ ∈ plans

In practice, exploring the full plan space

is prohibitively expensive for large queries.

Also, how can we determine the most efficient ϕ?

Louis JACHIET 10 / 53

Page 18: SD202: Evaluation algorithms

A RA term optimizer

Algorithm

• Start with an RA term ϕ

• Set oldplans = ∅, plans = {ϕ}• While oldplans 6= plans

• oldplans = plans

• plans = plans ∪{ϕ′ | where ϕ′ is a rewriting of ϕ ∈ plans}• Return the most efficient ϕ ∈ plans

In practice, exploring the full plan space

is prohibitively expensive for large queries.

Also, how can we determine the most efficient ϕ?

Louis JACHIET 10 / 53

Page 19: SD202: Evaluation algorithms

Cost model on RA terms

Determining the efficiency of an RA term ϕ depends on:

• a cost-model for each RA operator

• a cardinality estimation for each subterm appearing in ϕ

This simple model ignores the fact one RA operator (e.g. JOIN)

can be performed with different algorithms... more on that later!

Louis JACHIET 11 / 53

Page 20: SD202: Evaluation algorithms

Cost model on RA terms

Determining the efficiency of an RA term ϕ depends on:

• a cost-model for each RA operator

• a cardinality estimation for each subterm appearing in ϕ

This simple model ignores the fact one RA operator (e.g. JOIN)

can be performed with different algorithms... more on that later!

Louis JACHIET 11 / 53

Page 21: SD202: Evaluation algorithms

A cost model to estimate

computation time

Page 22: SD202: Evaluation algorithms

Modeling computation time

Basic model

Count the number of operations.

We know it is correct (up to a constant)

Data access time

CPU L1 cache ≤1ns

CPU L2 cache 5ns

RAM access 0.1µs

SSD seek 0.1ms

HDD seek 10ms

I/O speed (vague numbers!)

CPU caches 50GB/s

RAM 10GB/s

SSD 500MB/s

HDD sequential 100MB/s

HDD random 30MB/s

Looking only up to a constant factor does not give the full picture!

Louis JACHIET 12 / 53

Page 23: SD202: Evaluation algorithms

Modeling computation time

Basic model

Count the number of operations.

We know it is correct (up to a constant)

Data access time

CPU L1 cache ≤1ns

CPU L2 cache 5ns

RAM access 0.1µs

SSD seek 0.1ms

HDD seek 10ms

I/O speed (vague numbers!)

CPU caches 50GB/s

RAM 10GB/s

SSD 500MB/s

HDD sequential 100MB/s

HDD random 30MB/s

Looking only up to a constant factor does not give the full picture!

Louis JACHIET 12 / 53

Page 24: SD202: Evaluation algorithms

Modeling computation time

Basic model

Count the number of operations.

We know it is correct (up to a constant)

Data access time

CPU L1 cache ≤1ns

CPU L2 cache 5ns

RAM access 0.1µs

SSD seek 0.1ms

HDD seek 10ms

I/O speed (vague numbers!)

CPU caches 50GB/s

RAM 10GB/s

SSD 500MB/s

HDD sequential 100MB/s

HDD random 30MB/s

Looking only up to a constant factor does not give the full picture!

Louis JACHIET 12 / 53

Page 25: SD202: Evaluation algorithms

Modeling computation time

Basic model

Count the number of operations.

We know it is correct (up to a constant)

Data access time

CPU L1 cache ≤1ns

CPU L2 cache 5ns

RAM access 0.1µs

SSD seek 0.1ms

HDD seek 10ms

I/O speed (vague numbers!)

CPU caches 50GB/s

RAM 10GB/s

SSD 500MB/s

HDD sequential 100MB/s

HDD random 30MB/s

Looking only up to a constant factor does not give the full picture!

Louis JACHIET 12 / 53

Page 26: SD202: Evaluation algorithms

A more precise model of time

Count the different kinds of operations separately:

• number of seeks × tseek

• number of blocks read sequentially × tread

• number of blocks written × twrite

• number of CPU operations × tcpu

• size of block sblock (typically a few kB).

Louis JACHIET 13 / 53

Page 27: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable

Not much can be done to optimize this FULL SCAN (N sequential

reads).

SELECT * FROM myTable

WHERE field=”value”

Can we do better than a full scan? It depends!

Louis JACHIET 14 / 53

Page 28: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable

Not much can be done to optimize this FULL SCAN (N sequential

reads).

SELECT * FROM myTable

WHERE field=”value”

Can we do better than a full scan?

It depends!

Louis JACHIET 14 / 53

Page 29: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable

Not much can be done to optimize this FULL SCAN (N sequential

reads).

SELECT * FROM myTable

WHERE field=”value”

Can we do better than a full scan? It depends!

Louis JACHIET 14 / 53

Page 30: SD202: Evaluation algorithms

SELECT * FROM myTable WHERE field=”value”

Algorithm 1

Read the table sequentially.

⇒ N sequential reads.

Algorithm 2

Precompute list L with the posi-

tion of relevant records. For each

element of L, retrieve it.

⇒ L non sequential reads.

Algorithm 2 is better only when the condition is very selective!

Louis JACHIET 15 / 53

Page 31: SD202: Evaluation algorithms

SELECT * FROM myTable WHERE field=”value”

Algorithm 1

Read the table sequentially.

⇒ N sequential reads.

Algorithm 2

Precompute list L with the posi-

tion of relevant records. For each

element of L, retrieve it.

⇒ L non sequential reads.

Algorithm 2 is better only when the condition is very selective!

Louis JACHIET 15 / 53

Page 32: SD202: Evaluation algorithms

SELECT * FROM myTable WHERE field=”value”

Algorithm 1

Read the table sequentially. ⇒ N sequential reads.

Algorithm 2

Precompute list L with the posi-

tion of relevant records. For each

element of L, retrieve it.

⇒ L non sequential reads.

Algorithm 2 is better only when the condition is very selective!

Louis JACHIET 15 / 53

Page 33: SD202: Evaluation algorithms

SELECT * FROM myTable WHERE field=”value”

Algorithm 1

Read the table sequentially. ⇒ N sequential reads.

Algorithm 2

Precompute list L with the posi-

tion of relevant records. For each

element of L, retrieve it.

⇒ L non sequential reads.

Algorithm 2 is better only when the condition is very selective!

Louis JACHIET 15 / 53

Page 34: SD202: Evaluation algorithms

SELECT * FROM myTable WHERE field=”value”

Algorithm 1

Read the table sequentially. ⇒ N sequential reads.

Algorithm 2

Precompute list L with the posi-

tion of relevant records. For each

element of L, retrieve it.

⇒ L non sequential reads.

Algorithm 2 is better only when the condition is very selective!

Louis JACHIET 15 / 53

Page 35: SD202: Evaluation algorithms

How to get the estimation for tseek, twrite, tread, tcpu, etc.

• use default values

• use default values based on hardware

• detect values at boot

• learn values

• ...

Default values are generally OK but can often be tuned for optimal

performance.

Louis JACHIET 16 / 53

Page 36: SD202: Evaluation algorithms

How to get the estimation for tseek, twrite, tread, tcpu, etc.

• use default values

• use default values based on hardware

• detect values at boot

• learn values

• ...

Default values are generally OK but can often be tuned for optimal

performance.

Louis JACHIET 16 / 53

Page 37: SD202: Evaluation algorithms

Indexes

Page 38: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable WHERE field=”value”

How to do better than a full scan?

With indexes!

Different types of indexes:

• Hash

for equality tests only

• B-tree

for arbitrary comparisons

• GIN

for case-insensitive search, n-grams, prefixes, JSON field

lookup, etc.

• GiST

for non ordered data (e.g. geographical data, date inter-

vals, etc.)

• . . .

Louis JACHIET 17 / 53

Page 39: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable WHERE field=”value”

How to do better than a full scan?

With indexes!

Different types of indexes:

• Hash

for equality tests only

• B-tree

for arbitrary comparisons

• GIN

for case-insensitive search, n-grams, prefixes, JSON field

lookup, etc.

• GiST

for non ordered data (e.g. geographical data, date inter-

vals, etc.)

• . . .

Louis JACHIET 17 / 53

Page 40: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable WHERE field=”value”

How to do better than a full scan?

With indexes!

Different types of indexes:

• Hash

for equality tests only

• B-tree

for arbitrary comparisons

• GIN

for case-insensitive search, n-grams, prefixes, JSON field

lookup, etc.

• GiST

for non ordered data (e.g. geographical data, date inter-

vals, etc.)

• . . .

Louis JACHIET 17 / 53

Page 41: SD202: Evaluation algorithms

Simple query evaluation

SELECT * FROM myTable WHERE field=”value”

How to do better than a full scan?

With indexes!

Different types of indexes:

• Hash for equality tests only

• B-tree for arbitrary comparisons

• GIN for case-insensitive search, n-grams, prefixes, JSON field

lookup, etc.

• GiST for non ordered data (e.g. geographical data, date inter-

vals, etc.)

• . . .

Louis JACHIET 17 / 53

Page 42: SD202: Evaluation algorithms

Hash indexes

Data

An array A of size K containing lists of pair (key,pointer).

Check key x

Compute hash(x) and iterate over the elements of A(hash(x)).

Add key x to pointer p

Append (x , p) to the list A(hash(x)).

Louis JACHIET 18 / 53

Page 43: SD202: Evaluation algorithms

Hash indexes

A B A B A B A B

Cab 11 Alice 42 Bob 3 Dacy 34

H P

0 Bob

1

2 Cab

3

4 Alice

Dacy

What is the time to retrieve or add data?

(use c the average number of collisions)

Louis JACHIET 19 / 53

Page 44: SD202: Evaluation algorithms

Hash indexes

A B A B A B A B

Cab 11 Alice 42 Bob 3 Dacy 34

H P

0 Bob

1

2 Cab

3

4 Alice

Dacy

What is the time to retrieve or add data?

(use c the average number of collisions)

Louis JACHIET 19 / 53

Page 45: SD202: Evaluation algorithms

Dense hash indexes

A B A B A B A B

Cab 11 Alice 42 Bob 3 Dacy 34

H P1 P2 P3 P4

0 Bob

1

2 Cab Dacy Ewen Fiona

3

4 Alice

Gael

Louis JACHIET 20 / 53

Page 46: SD202: Evaluation algorithms

Denser hash indexes

A B A B A B A B A B

Cab 11 Alice 42 Bob 3 Dacy 34 Ewen 102

H P1 P2 P3 P4

0

1

2

3

4

Louis JACHIET 21 / 53

Page 47: SD202: Evaluation algorithms

Pros & Cons of the denser hash indexes

Pros

• Generally the fastest index (but with a small margin)

• Especially good on large datasets with complex types like strings

• Small space overhead

Cons

• At least two seeks required

• Cannot retrieve data in sorted order

• Can only check equality

Louis JACHIET 22 / 53

Page 48: SD202: Evaluation algorithms

B-trees

B-trees

Just like binary search trees but B-ary trees.

Bob Ewen

Alice Ann

Cab Dacy

Fiona Gael

Louis JACHIET 23 / 53

Page 49: SD202: Evaluation algorithms

Binary Search Trees (BST)

def search(key, node):

if node is None or key == node.key:

return node

if key < node.key:

return search(key, node.left)

return search(key, node.right)

Insertion

Insertion code is similar to search but where we

make sure that the tree stays balanced.

Deletion

Deletion is a bit tricky...

8

3 10

1 6 14

4 7 13

Louis JACHIET 24 / 53

Page 50: SD202: Evaluation algorithms

B-trees

Data

A search tree where all nodes (except the root) have an arity

between B/2 and B.

Check existence of key x or retrieve tuples with x ∈ [a; b]

Recursively go down the tree. If we neglect CPU time this is

logB(N) seeks. For large B we can expect 3 or 4 seeks at most.

Insert key x

Same as looking up except when a node contains strictly more

than B elements, it is split into two. The splitter is added to the

parent which can recursively trigger a splitting up to the root.

Delete key x

Same as insertion except we merge with neighbors when the node

is too empty.Louis JACHIET 25 / 53

Page 51: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

12

Louis JACHIET 26 / 53

Page 52: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

12 24

Louis JACHIET 26 / 53

Page 53: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

12 24 52

Louis JACHIET 26 / 53

Page 54: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

12 24 52 67

Louis JACHIET 26 / 53

Page 55: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

12 24 52 67 89

Louis JACHIET 26 / 53

Page 56: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

52

12 24

67 89

Louis JACHIET 26 / 53

Page 57: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

52

12 24

55 67 89

Louis JACHIET 26 / 53

Page 58: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

52

12 24 30

55 67 89

Louis JACHIET 26 / 53

Page 59: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

52

12 20 24 30

55 67 89

Louis JACHIET 26 / 53

Page 60: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

52

5 12 20 24 30

55 67 89

Louis JACHIET 26 / 53

Page 61: SD202: Evaluation algorithms

Exercise on B-trees

Describe the state of an initially empty B-tree with B = 4

after the following insertions:

• 12

• 52

• 24

• 67

• 89

• 55

• 30

• 20

• 5

20 52

5 12

24 30

55 67 89

Louis JACHIET 26 / 53

Page 62: SD202: Evaluation algorithms

Pros & Cons of the B-tree indexes

Pros

• Quite fast (especially on simple types)

• Retrieve data in order with very few seeks

• Can be used for prefixes / subkeys; a B-tree on (x , y) can be

used to test whether a given x exists or retrieve the correspond-

ing y .

Cons

• Larger number of seeks required

• Larger space overhead (especially for complex data types)

Generally the default index structure used!

Louis JACHIET 27 / 53

Page 63: SD202: Evaluation algorithms

Creating indexes

CREATE INDEX title_idx ON movies (name);

CREATE INDEX title_idx ON movies (lower(name)) USING hash;

CREATE UNIQUE INDEX ON movies (year,name);

Louis JACHIET 28 / 53

Page 64: SD202: Evaluation algorithms

Inverted Index

Inverted Index query

Given a set of sets X1, . . . ,Xn and an item i report the integers e

such that i ∈ Xe .

You typically find them in cookbooks (recipes by ingredient)

Implementation

Typically a B-tree or a hash index giving for each i the set of e

such that i ∈ Xe .

Usage

They can be used for indexing JSON fields or getting records

where some word appears in a string.

Louis JACHIET 29 / 53

Page 65: SD202: Evaluation algorithms

Inverted Index

Inverted Index query

Given a set of sets X1, . . . ,Xn and an item i report the integers e

such that i ∈ Xe .

You typically find them in cookbooks (recipes by ingredient)

Implementation

Typically a B-tree or a hash index giving for each i the set of e

such that i ∈ Xe .

Usage

They can be used for indexing JSON fields or getting records

where some word appears in a string.

Louis JACHIET 29 / 53

Page 66: SD202: Evaluation algorithms

Inverted Index

Inverted Index query

Given a set of sets X1, . . . ,Xn and an item i report the integers e

such that i ∈ Xe .

You typically find them in cookbooks (recipes by ingredient)

Implementation

Typically a B-tree or a hash index giving for each i the set of e

such that i ∈ Xe .

Usage

They can be used for indexing JSON fields or getting records

where some word appears in a string.

Louis JACHIET 29 / 53

Page 67: SD202: Evaluation algorithms

Usage of Generalized Inverted Indexes in PostgreSQL

CREATE TABLE movies (

name VARCHAR(255), tags VARCHAR(16)[] );

INSERT INTO movies (name, tags)

values ('Star Wars',

'{"action", "adventure", "fantasy"}');

CREATE INDEX movie_tags ON movies USING gin (tags);

SELECT * FROM movies WHERE 'action' = ANY (tags);

SELECT * FROM movies

WHERE '{"action", "fantasy"}' && tags;

Louis JACHIET 30 / 53

Page 68: SD202: Evaluation algorithms

Usage of Generalized Inverted Indexes in PostgreSQL

CREATE INDEX lyrics_idx ON songs

USING GIN (to_tsvector('english',lyrics));

SELECT name FROM songs

WHERE to_tsvector('english',lyrics) @@ to_tsquery('make');

-- name

-- -------------------------

-- Never Gonna Give You Up

-- It's Friday!

Louis JACHIET 31 / 53

Page 69: SD202: Evaluation algorithms

Bitmap indexes

Principle

For each value v store a bit array of length n (the number of

records) where the i-th bit is set when the record has value v

Typical use case

• (Rarely) as an index built for values of low cardinality (gender,

binary values, enum)

• (Very often) built for intermediate results of complex conditions

(e.g. x < 10 or y > 42 and z < 1).

Can be built for pages instead of individual tuples.

Louis JACHIET 32 / 53

Page 70: SD202: Evaluation algorithms

Bitmap indexes

Principle

For each value v store a bit array of length n (the number of

records) where the i-th bit is set when the record has value v

Typical use case

• (Rarely) as an index built for values of low cardinality (gender,

binary values, enum)

• (Very often) built for intermediate results of complex conditions

(e.g. x < 10 or y > 42 and z < 1).

Can be built for pages instead of individual tuples.

Louis JACHIET 32 / 53

Page 71: SD202: Evaluation algorithms

Generalized Search Tree based indexes (GiST)

Principle

A GiST is a tree shaped index but where you cannot “split”

efficiently into two non overlapping subsets.

Examples

Shapes in 2D, date intervals, text, etc.

Key idea

Split into two (or more) subsets with an overlapping.

Consequence

Any search will go down on several parts of the tree, hence it is not

as efficient as a B-tree!

Louis JACHIET 33 / 53

Page 72: SD202: Evaluation algorithms

Generalized Search Tree based indexes (GiST)

Principle

A GiST is a tree shaped index but where you cannot “split”

efficiently into two non overlapping subsets.

Examples

Shapes in 2D, date intervals, text, etc.

Key idea

Split into two (or more) subsets with an overlapping.

Consequence

Any search will go down on several parts of the tree, hence it is not

as efficient as a B-tree!

Louis JACHIET 33 / 53

Page 73: SD202: Evaluation algorithms

Generalized Search Tree based indexes (GiST)

Principle

A GiST is a tree shaped index but where you cannot “split”

efficiently into two non overlapping subsets.

Examples

Shapes in 2D, date intervals, text, etc.

Key idea

Split into two (or more) subsets with an overlapping.

Consequence

Any search will go down on several parts of the tree, hence it is not

as efficient as a B-tree!

Louis JACHIET 33 / 53

Page 74: SD202: Evaluation algorithms

Generalized Search Tree based indexes (GiST)

Principle

A GiST is a tree shaped index but where you cannot “split”

efficiently into two non overlapping subsets.

Examples

Shapes in 2D, date intervals, text, etc.

Key idea

Split into two (or more) subsets with an overlapping.

Consequence

Any search will go down on several parts of the tree, hence it is not

as efficient as a B-tree!

Louis JACHIET 33 / 53

Page 75: SD202: Evaluation algorithms

Examples of GiST indexing

From postgis website

Louis JACHIET 34 / 53

Page 76: SD202: Evaluation algorithms

In conclusion

Queries containing patterns like FILTER(R,x = v)

can often be drastically improved.

It is also used to speed up key constraints.

Louis JACHIET 35 / 53

Page 77: SD202: Evaluation algorithms

Computing joins

Page 78: SD202: Evaluation algorithms

Joins

SELECT * FROM tableA, tableB WHERE idA=idB ;

SELECT * FROM tableA

WHERE EXISTS (

SELECT 1 FROM tableB

WHERE idA=idB) ;

SELECT * FROM tableA LEFT JOIN tableB ON idA=idB ;

Louis JACHIET 36 / 53

Page 79: SD202: Evaluation algorithms

Computing joins

Joins can be computed with many different algorithms

• Nested loop join

• Block nested loop joins

• Index-join

• Sort-merge join

• Hash join

• Indexed-join

• and many others...

Louis JACHIET 37 / 53

Page 80: SD202: Evaluation algorithms

Nested loop join

The simplest join

for t1 in tableA:

for t2 in tableB:

if condition(t1,t2):

output(t1,t2)

tableB will be read |tableA| times...

Louis JACHIET 38 / 53

Page 81: SD202: Evaluation algorithms

Nested loop join

The simplest join

for t1 in tableA:

for t2 in tableB:

if condition(t1,t2):

output(t1,t2)

tableB will be read |tableA| times...

Louis JACHIET 38 / 53

Page 82: SD202: Evaluation algorithms

Block nested loop join

A twist on the nested loop join:

for b1 in tableA.blocks

for b2 in tableB.blocks

for t1 in b1:

for t2 in b2:

if condition(t1,t2):

output(t1,t2)

Same number of operations but each block is loaded once.

Louis JACHIET 39 / 53

Page 83: SD202: Evaluation algorithms

Index-join

Key idea

Imagine we have an index on tB(id) and we have the following

query:

SELECT * FROM tA, tB WHERE tA.id=tB.id

We can use the index on tB to retrieve the matching tuples from

tA.

for t1 in tA

for t2 in tB.idIndex(t1.id)

output(t1,t2)

Louis JACHIET 40 / 53

Page 84: SD202: Evaluation algorithms

Index-join

Key idea

Imagine we have an index on tB(id) and we have the following

query:

SELECT * FROM tA, tB WHERE tA.id=tB.id

We can use the index on tB to retrieve the matching tuples from

tA.

for t1 in tA

for t2 in tB.idIndex(t1.id)

output(t1,t2)

Louis JACHIET 40 / 53

Page 85: SD202: Evaluation algorithms

Index-join

What if the join condition is tA.id = tB.id AND tA.x>tB.x?

We can use the index on a weakened condition:

for t1 in tA

for t2 in tB.idIndex(t1.id)

if t1.x > t2.x:

output(t1,t2)

Or more generally:

for t1 in tableA

for t2 in tableB.index[t1.value]:

if condition(t1,t2):

output(t1,t2)

Louis JACHIET 41 / 53

Page 86: SD202: Evaluation algorithms

Index-join

What if the join condition is tA.id = tB.id AND tA.x>tB.x?

We can use the index on a weakened condition:

for t1 in tA

for t2 in tB.idIndex(t1.id)

if t1.x > t2.x:

output(t1,t2)

Or more generally:

for t1 in tableA

for t2 in tableB.index[t1.value]:

if condition(t1,t2):

output(t1,t2)

Louis JACHIET 41 / 53

Page 87: SD202: Evaluation algorithms

Index-join

What if the join condition is tA.id = tB.id AND tA.x>tB.x?

We can use the index on a weakened condition:

for t1 in tA

for t2 in tB.idIndex(t1.id)

if t1.x > t2.x:

output(t1,t2)

Or more generally:

for t1 in tableA

for t2 in tableB.index[t1.value]:

if condition(t1,t2):

output(t1,t2)

Louis JACHIET 41 / 53

Page 88: SD202: Evaluation algorithms

Sort-merge-join

Another algorithm for the intersection of two lists

lA.sort()

lB.sort()

while len(lA)>0 and len(lB)>0:

if lA[-1] = lB[-1]:

output(lA[-1],lB[-1])

if lA[-1] >= lB[-1]:

lA.pop()

else:

lB.pop()

Louis JACHIET 42 / 53

Page 89: SD202: Evaluation algorithms

Sort-merge-join

More generally, for a sort-merge-join

• Get both inputs in sorted order

• Advance in both inputs by increasing order

• Output matching couples

Handles more than equality

Comparisons, max, min, . . . .

Louis JACHIET 43 / 53

Page 90: SD202: Evaluation algorithms

Hash-join

How to find the intersection of two sets A and B?

d = hashset()

res = []

for a in A:

d.add(a)

for b in B:

if b in d:

res.append(b)

Louis JACHIET 44 / 53

Page 91: SD202: Evaluation algorithms

Hash-join

More generally, for a join with a condition C containing a set

of equalities x1 = y1 ∧ · · · ∧ xk = yk :

• For each tuple t1 in tableA store it in a hash table with key

(t1.x1, . . . , t1.xk),

• For each tuple t2 in tableB look up the tuples t ′ in the hash

table with key (t2.y1, . . . , t2.yk),

• If C (t1, t2) output (t1, t2).

Louis JACHIET 45 / 53

Page 92: SD202: Evaluation algorithms

Indexed-join

Simple idea

Pre-compute and index the join result.

Allows for very fast join but not without drawbacks

• Not available in most databases

• Slow to update

• Can eat up a lot of space

Only useful for very frequent joins with very few updates on

the joined tables.

Louis JACHIET 46 / 53

Page 93: SD202: Evaluation algorithms

Indexed-join

Simple idea

Pre-compute and index the join result.

Allows for very fast join but not without drawbacks

• Not available in most databases

• Slow to update

• Can eat up a lot of space

Only useful for very frequent joins with very few updates on

the joined tables.

Louis JACHIET 46 / 53

Page 94: SD202: Evaluation algorithms

Query evaluation in a nutshell

SQL query Relational Algebra term

Query Optimizer

Best plan Compiled query

Data

Execution

Louis JACHIET 47 / 53

Page 95: SD202: Evaluation algorithms

Query evaluation in a nutshell

SQL query Relational Algebra term

Best plan Compiled query

Data

Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans

Cost

Execution

Louis JACHIET 47 / 53

Page 96: SD202: Evaluation algorithms

Query evaluation in a nutshell

SQL query Relational Algebra term

Best plan Compiled query

Data

Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans Logical plansPhysical plansGraded plans

Cost

Execution

Louis JACHIET 47 / 53

Page 97: SD202: Evaluation algorithms

Plans in Postgres

Page 98: SD202: Evaluation algorithms

EXPLAIN

Most databases allow the user to see the query plan used

Generally with the EXPLAIN COMMAND.

jachiet=> EXPLAIN SELECT name FROM songs WHERE to_tsvector('english',lyrics) @@ to_tsquery('make');

QUERY PLAN

---------------------------------------------------------------------------------------------

Bitmap Heap Scan on songs (cost=8.25..12.76 rows=1 width=32)

Recheck Cond: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

-> Bitmap Index Scan on lyrics_idx (cost=0.00..8.25 rows=1 width=0)

Index Cond: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

Interesting to understand why some queries are slow:

• maybe the query is more complex than you thought?

• maybe your are missing some indexes?

• maybe Postgres selects a dumb query plan?

Louis JACHIET 48 / 53

Page 99: SD202: Evaluation algorithms

EXPLAIN

Most databases allow the user to see the query plan used

Generally with the EXPLAIN COMMAND.

jachiet=> EXPLAIN SELECT name FROM songs WHERE to_tsvector('english',lyrics) @@ to_tsquery('make');

QUERY PLAN

---------------------------------------------------------------------------------------------

Bitmap Heap Scan on songs (cost=8.25..12.76 rows=1 width=32)

Recheck Cond: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

-> Bitmap Index Scan on lyrics_idx (cost=0.00..8.25 rows=1 width=0)

Index Cond: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

Interesting to understand why some queries are slow:

• maybe the query is more complex than you thought?

• maybe your are missing some indexes?

• maybe Postgres selects a dumb query plan?

Louis JACHIET 48 / 53

Page 100: SD202: Evaluation algorithms

EXPLAIN

Most databases allow the user to see the query plan used

Generally with the EXPLAIN COMMAND.

jachiet=> EXPLAIN SELECT name FROM songs WHERE to_tsvector('english',lyrics) @@ to_tsquery('make');

QUERY PLAN

---------------------------------------------------------------------------------------------

Bitmap Heap Scan on songs (cost=8.25..12.76 rows=1 width=32)

Recheck Cond: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

-> Bitmap Index Scan on lyrics_idx (cost=0.00..8.25 rows=1 width=0)

Index Cond: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

Interesting to understand why some queries are slow:

• maybe the query is more complex than you thought?

• maybe your are missing some indexes?

• maybe Postgres selects a dumb query plan?

Louis JACHIET 48 / 53

Page 101: SD202: Evaluation algorithms

EXPLAIN ANALYZE

Adding ANALYZE makes the engine run the query and gather

statistics

=> EXPLAIN ANALYZE SELECT name FROM songs WHERE to_tsvector('english',lyrics) @@ to_tsquery('make');

QUERY PLAN

------------------------------------------------------------------------------------------------

Seq Scan on songs (cost=0.00..2.02 rows=1 width=32) (actual time=1.412..3.126 rows=2 loops=1)

Filter: (to_tsvector('english'::regconfig, lyrics) @@ to_tsquery('make'::text))

Planning Time: 0.175 ms

Execution Time: 3.154 ms

Louis JACHIET 49 / 53

Page 102: SD202: Evaluation algorithms

Effect of the cardinality estimation

jachiet=> EXPLAIN SELECT * FROM students WHERE age = 20 ;

QUERY PLAN

-----------------------------------------------------------------------------------

Bitmap Heap Scan on students (cost=39.89..107.62 rows=1498 width=13)

Recheck Cond: (age = 20)

-> Bitmap Index Scan on students_age_idx (cost=0.00..39.52 rows=1498 width=0)

Index Cond: (age = 20)

(4 rows)

jachiet=> EXPLAIN SELECT * FROM students WHERE age = 33 ;

QUERY PLAN

----------------------------------------------------------------------------------

Index Scan using students_age_idx on students (cost=0.29..8.22 rows=1 width=13)

Index Cond: (age = 33)

(2 rows)

Louis JACHIET 50 / 53

Page 103: SD202: Evaluation algorithms

Understanding Postgres plans

The various Postgres operators

• SeqScan explore the full table

• Index Scan explore the relevant part of a table using an index

• Index Only Scan explore the index

• Bitmap Heap Scan like a SeqScan but might skip some of the

disk pages using a bitmap index

• Bitmap Index Scan builds a bitmap with an index

• Bitmap Or / Bitmap And builds a bitmap as the Or/And of

two bitmaps

• Filter, Nested loop, Hash join, Merge join, self explana-

tory.

• Materialize we need to talk about the execution...

Louis JACHIET 51 / 53

Page 104: SD202: Evaluation algorithms

Execution of query plans

Database engines prefer to “stream” tuples rather than computing

everything. For instance, for a filter:

Dofor t in subExpression:

if cond(t):

yield t

Don’tl = compute(subExpression)

return [e for e in l if cond(e)]

It avoids the materialization of large intermediate datasets

But it is sometimes unavoidable...

Louis JACHIET 52 / 53

Page 105: SD202: Evaluation algorithms

Conclusion

• Queries in databases are automatically optimized

• It works well up

Louis JACHIET 53 / 53