Explaining the Postgres Query Optimizer

BRUCE MOMJIAN

January, 2012

The optimizer is the "brain" of the database, interpreting SQL queries and determining the fastest method of execution. This talk uses the EXPLAIN command to show how the optimizer interprets queries and determines optimal execution.

Creative Commons Attribution License
http://momjian.us/presentations

Postgres Query Execution

[Figure: a User Terminal and Application Code send Queries through Libpq to the PostgreSQL Database Server, which sends back Results.]

Postgres Query Execution

[Figure: flowchart of the Postgres backend. Main, Libpq, and Postmaster hand each query to a Postgres backend, where it passes through Parse Statement and the Traffic Cop. Utility commands (e.g. CREATE TABLE, COPY) go to Utility Command; SELECT, INSERT, UPDATE, and DELETE go through Rewrite Query, Generate Paths (Optimal Path), Generate Plan (Plan), and Execute Plan. Supporting modules: Storage Managers, Catalog, Utilities, Access Methods, Nodes / Lists.]

Postgres Query Execution

[Figure: detail of the same flowchart. Parse Statement, Traffic Cop, Utility Command (e.g. CREATE TABLE, COPY), and for SELECT, INSERT, UPDATE, DELETE: Rewrite Query, Generate Paths, Optimal Path, Generate Plan, Plan, Execute Plan.]

The Optimizer Is the Brain

 http://www.wsmanaging.com/ 


What Decisions Does the Optimizer Have to Make?

◮ Scan Method
◮ Join Method
◮ Join Order

Which Scan Method?

◮ Sequential Scan
◮ Bitmap Index Scan
◮ Index Scan

A Simple Example Using pg_class.relname

SELECT relname
FROM pg_class
ORDER BY 1
LIMIT 8;

              relname
-----------------------------------
 _pg_foreign_data_wrappers
 _pg_foreign_servers
 _pg_user_mappings
 administrable_role_authorizations
 applicable_roles
 attributes
 check_constraint_routine_usage
 check_constraints
(8 rows)

Let’s Use Just the First Letter of pg_class.relname

SELECT substring(relname, 1, 1)
FROM pg_class
ORDER BY 1
LIMIT 8;

 substring
-----------
 _
 _
 _
 a
 a
 a
 c
 c
(8 rows)

Create a Temporary Table with an Index 

CREATE TEMPORARY TABLE sample (letter, junk) AS
    SELECT substring(relname, 1, 1), repeat('x', 250)
    FROM pg_class
    ORDER BY random();  -- add rows in random order
SELECT 253

CREATE INDEX i_sample on sample (letter);
CREATE INDEX

All the queries used in this presentation are available at http://momjian.us/main/writings/pgsql/optimizer.sql.

Create an EXPLAIN Function

CREATE OR REPLACE FUNCTION lookup_letter(text) RETURNS SETOF text AS $$
BEGIN
    RETURN QUERY EXECUTE '
        EXPLAIN SELECT letter
        FROM sample
        WHERE letter = ''' || $1 || '''';
END
$$ LANGUAGE plpgsql;
CREATE FUNCTION

What is the Distribution of the sample Table?

WITH letters (letter, count) AS (
    SELECT letter, COUNT(*)
    FROM sample
    GROUP BY 1
)
SELECT letter, count, (count * 100.0 / (SUM(count) OVER ()))::numeric(4,1) AS "%"
FROM letters
ORDER BY 2 DESC;

 letter | count |  %
--------+-------+------
 p      |   199 | 78.7
 s      |     9 |  3.6
 c      |     8 |  3.2
 r      |     7 |  2.8
 t      |     5 |  2.0
 v      |     4 |  1.6
 f      |     4 |  1.6
 d      |     4 |  1.6
 u      |     3 |  1.2
 a      |     3 |  1.2
 _      |     3 |  1.2
 e      |     2 |  0.8
 i      |     1 |  0.4
 k      |     1 |  0.4
(14 rows)

Is the Distribution Important?

EXPLAIN SELECT letter
FROM sample
WHERE letter = 'p';

                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=32)
   Index Cond: (letter = 'p'::text)
(2 rows)

Is the Distribution Important?

EXPLAIN SELECT letter
FROM sample
WHERE letter = 'd';

                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=32)
   Index Cond: (letter = 'd'::text)
(2 rows)

Is the Distribution Important?

EXPLAIN SELECT letter
FROM sample
WHERE letter = 'k';

                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=32)
   Index Cond: (letter = 'k'::text)
(2 rows)

Running ANALYZE Causes a Sequential Scan for a Common Value

ANALYZE sample;
ANALYZE

EXPLAIN SELECT letter
FROM sample
WHERE letter = 'p';

                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on sample  (cost=0.00..13.16 rows=199 width=2)
   Filter: (letter = 'p'::text)
(2 rows)

Autovacuum cannot ANALYZE (or VACUUM) temporary tables because these tables are only visible to the creating session.

Sequential Scan

[Figure: the Heap drawn as a series of 8K pages filled with DATA, scanned from the first page to the last.]

A Less Common Value Causes a Bitmap Heap Scan

EXPLAIN SELECT letter
FROM sample
WHERE letter = 'd';

                              QUERY PLAN
-----------------------------------------------------------------------
 Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
   Recheck Cond: (letter = 'd'::text)
   ->  Bitmap Index Scan on i_sample  (cost=0.00..4.28 rows=4 width=0)
         Index Cond: (letter = 'd'::text)
(4 rows)

Bitmap Index Scan

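[Figure: the bitmap from Index 1 (col1 = 'A') is combined with the bitmap from Index 2 (col2 = 'NS') using a bitwise AND (&); the Combined bitmap ('A' AND 'NS') marks the matching rows in the Table.]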
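The bitmap combination pictured on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes each index yields one bit per heap row, whereas real Postgres bitmaps track heap pages and tuples. The function names and the sample bit vectors are made up for the example and are not taken from the slide.

```python
def bitmap_and(a, b):
    """AND two equal-length bitmaps (one bit per heap row)."""
    if len(a) != len(b):
        raise ValueError("bitmaps must cover the same rows")
    return [x & y for x, y in zip(a, b)]


def matching_rows(bitmap):
    """Return the row positions whose bit is set."""
    return [pos for pos, bit in enumerate(bitmap) if bit]


# Hypothetical bitmaps: which rows satisfy each condition.
col1_is_a = [0, 1, 0, 1]    # rows where col1 = 'A'
col2_is_ns = [1, 1, 0, 0]   # rows where col2 = 'NS'

combined = bitmap_and(col1_is_a, col2_is_ns)
print(combined, matching_rows(combined))
```

Only the rows that survive the AND need to be fetched from the heap, which is why the plan above pairs a Bitmap Index Scan with a Bitmap Heap Scan.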

An Even Rarer Value Causes an Index Scan

EXPLAIN SELECT letter
FROM sample
WHERE letter = 'k';

                              QUERY PLAN
-----------------------------------------------------------------------
 Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
   Index Cond: (letter = 'k'::text)
(2 rows)

Index Scan

[Figure: an Index of Key nodes with < and >= branches, whose entries point to individual DATA rows in the Heap.]

Let’s Look at All Values and their Effects

WITH letter (letter, count) AS (
    SELECT letter, COUNT(*)
    FROM sample
    GROUP BY 1
)
SELECT letter AS l, count, lookup_letter(letter)
FROM letter
ORDER BY 2 DESC;

 l | count |                      lookup_letter
---+-------+-----------------------------------------------------------------------
 p |   199 | Seq Scan on sample  (cost=0.00..13.16 rows=199 width=2)
 p |   199 |   Filter: (letter = 'p'::text)
 s |     9 | Seq Scan on sample  (cost=0.00..13.16 rows=9 width=2)
 s |     9 |   Filter: (letter = 's'::text)
 c |     8 | Seq Scan on sample  (cost=0.00..13.16 rows=8 width=2)
 c |     8 |   Filter: (letter = 'c'::text)
 r |     7 | Seq Scan on sample  (cost=0.00..13.16 rows=7 width=2)
 r |     7 |   Filter: (letter = 'r'::text)

OK, Just the First Lines

WITH letter (letter, count) AS (
    SELECT letter, COUNT(*)
    FROM sample
    GROUP BY 1
)
SELECT letter AS l, count,
    (SELECT *
     FROM lookup_letter(letter) AS l2
     LIMIT 1) AS lookup_letter
FROM letter
ORDER BY 2 DESC;

Just the First EXPLAIN Lines

 l | count |                      lookup_letter
---+-------+-----------------------------------------------------------------------
 p |   199 | Seq Scan on sample  (cost=0.00..13.16 rows=199 width=2)
 s |     9 | Seq Scan on sample  (cost=0.00..13.16 rows=9 width=2)
 c |     8 | Seq Scan on sample  (cost=0.00..13.16 rows=8 width=2)
 r |     7 | Seq Scan on sample  (cost=0.00..13.16 rows=7 width=2)
 t |     5 | Bitmap Heap Scan on sample  (cost=4.29..12.76 rows=5 width=2)
 f |     4 | Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
 v |     4 | Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
 d |     4 | Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
 a |     3 | Bitmap Heap Scan on sample  (cost=4.27..11.38 rows=3 width=2)
 _ |     3 | Bitmap Heap Scan on sample  (cost=4.27..11.38 rows=3 width=2)
 u |     3 | Bitmap Heap Scan on sample  (cost=4.27..11.38 rows=3 width=2)
 e |     2 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
 i |     1 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
 k |     1 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
(14 rows)

We Can Force an Index Scan

SET enable_seqscan = false;
SET enable_bitmapscan = false;

WITH letter (letter, count) AS (
    SELECT letter, COUNT(*)
    FROM sample
    GROUP BY 1
)
SELECT letter AS l, count,
    (SELECT *
     FROM lookup_letter(letter) AS l2
     LIMIT 1) AS lookup_letter
FROM letter
ORDER BY 2 DESC;

Notice the High Cost for Common Values

 l | count |                      lookup_letter
---+-------+-----------------------------------------------------------------------
 p |   199 | Index Scan using i_sample on sample  (cost=0.00..39.33 rows=199 width
 s |     9 | Index Scan using i_sample on sample  (cost=0.00..22.14 rows=9 width=2)
 c |     8 | Index Scan using i_sample on sample  (cost=0.00..19.84 rows=8 width=2)
 r |     7 | Index Scan using i_sample on sample  (cost=0.00..19.82 rows=7 width=2)
 t |     5 | Index Scan using i_sample on sample  (cost=0.00..15.21 rows=5 width=2)
 d |     4 | Index Scan using i_sample on sample  (cost=0.00..15.19 rows=4 width=2)
 v |     4 | Index Scan using i_sample on sample  (cost=0.00..15.19 rows=4 width=2)
 f |     4 | Index Scan using i_sample on sample  (cost=0.00..15.19 rows=4 width=2)
 _ |     3 | Index Scan using i_sample on sample  (cost=0.00..12.88 rows=3 width=2)
 a |     3 | Index Scan using i_sample on sample  (cost=0.00..12.88 rows=3 width=2)
 u |     3 | Index Scan using i_sample on sample  (cost=0.00..12.88 rows=3 width=2)
 e |     2 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
 i |     1 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
 k |     1 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
(14 rows)

RESET ALL;
RESET

This Was the Optimizer’s Preference

 l | count |                      lookup_letter
---+-------+-----------------------------------------------------------------------
 p |   199 | Seq Scan on sample  (cost=0.00..13.16 rows=199 width=2)
 s |     9 | Seq Scan on sample  (cost=0.00..13.16 rows=9 width=2)
 c |     8 | Seq Scan on sample  (cost=0.00..13.16 rows=8 width=2)
 r |     7 | Seq Scan on sample  (cost=0.00..13.16 rows=7 width=2)
 t |     5 | Bitmap Heap Scan on sample  (cost=4.29..12.76 rows=5 width=2)
 f |     4 | Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
 v |     4 | Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
 d |     4 | Bitmap Heap Scan on sample  (cost=4.28..12.74 rows=4 width=2)
 a |     3 | Bitmap Heap Scan on sample  (cost=4.27..11.38 rows=3 width=2)
 _ |     3 | Bitmap Heap Scan on sample  (cost=4.27..11.38 rows=3 width=2)
 u |     3 | Bitmap Heap Scan on sample  (cost=4.27..11.38 rows=3 width=2)
 e |     2 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
 i |     1 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
 k |     1 | Index Scan using i_sample on sample  (cost=0.00..8.27 rows=1 width=2)
(14 rows)
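The optimizer's preference can be read as "pick the plan with the lowest estimated total cost". The Python sketch below replays that choice for three letters using total costs copied from the EXPLAIN output in this talk. It makes one assumption: the sequential scan cost of 13.16 is shown only for p, s, c, and r, and is taken here to be the same for d and k. Bitmap costs for p and k never appear on the slides, so they are left out. The names `COSTS` and `cheapest` are hypothetical.

```python
# Estimated total costs per scan method, taken from the slides.
# Assumption: Seq Scan costs 13.16 for every letter, including 'd' and 'k'.
COSTS = {
    "p": {"Seq Scan": 13.16, "Index Scan": 39.33},
    "d": {"Seq Scan": 13.16, "Bitmap Heap Scan": 12.74, "Index Scan": 15.19},
    "k": {"Seq Scan": 13.16, "Index Scan": 8.27},
}


def cheapest(plans):
    """Return the name of the plan with the lowest estimated total cost."""
    return min(plans, key=plans.get)


for letter, plans in COSTS.items():
    print(letter, cheapest(plans))
```

Under these numbers the common value p gets a sequential scan, the less common d a bitmap heap scan, and the rare k an index scan, matching the table above.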


Which Join Method?

◮ Nested Loop
    ◮ With Inner Sequential Scan
    ◮ With Inner Index Scan
◮ Hash Join
◮ Merge Join

What Is in pg_proc.oid?

SELECT oid
FROM pg_proc
ORDER BY 1
LIMIT 8;

 oid
-----
  31
  33
  34
  35
  38
  39
  40
  41
(8 rows)

Create Temporary Tables from pg_proc and pg_class

CREATE TEMPORARY TABLE sample1 (id, junk) AS
    SELECT oid, repeat('x', 250)
    FROM pg_proc
    ORDER BY random();  -- add rows in random order
SELECT 2256

CREATE TEMPORARY TABLE sample2 (id, junk) AS
    SELECT oid, repeat('x', 250)
    FROM pg_class
    ORDER BY random();  -- add rows in random order
SELECT 260

These tables have no indexes and no optimizer statistics.

Join the Two Tables with a Tight Restriction

EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample1.id = 33;

                             QUERY PLAN
---------------------------------------------------------------------
 Nested Loop  (cost=0.00..234.68 rows=300 width=32)
   ->  Seq Scan on sample1  (cost=0.00..205.54 rows=50 width=4)
         Filter: (id = 33::oid)
   ->  Materialize  (cost=0.00..25.41 rows=6 width=36)
         ->  Seq Scan on sample2  (cost=0.00..25.38 rows=6 width=36)
               Filter: (id = 33::oid)
(6 rows)

Nested Loop Join with Inner Sequential Scan

[Figure: each row of the Outer relation is compared against every row of the Inner relation. Annotations: No Setup Required; Used For Small Tables.]

Pseudocode for Nested Loop Join with Inner Sequential Scan

for (i = 0; i < length(outer); i++)
    for (j = 0; j < length(inner); j++)
        if (outer[i] == inner[j])
            output(outer[i], inner[j]);
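A runnable Python version of this pseudocode, offered as a minimal sketch: the relations are plain lists of join keys, and the sample values are hypothetical rather than read off the slide.

```python
def nested_loop_join(outer, inner):
    """Compare every outer row with every inner row; no setup is needed."""
    results = []
    for o in outer:
        for i in inner:          # the inner relation is rescanned for each outer row
            if o == i:
                results.append((o, i))
    return results


outer = ["aag", "aay", "aar", "aai"]
inner = ["aay", "aag", "aas", "aar"]
print(nested_loop_join(outer, inner))
```

The work grows with len(outer) * len(inner), which is why this method suits small tables.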

Join the Two Tables with a Looser Restriction

EXPLAIN SELECT sample1.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample2.id > 33;

                              QUERY PLAN
----------------------------------------------------------------------
 Hash Join  (cost=30.50..950.88 rows=20424 width=32)
   Hash Cond: (sample1.id = sample2.id)
   ->  Seq Scan on sample1  (cost=0.00..180.63 rows=9963 width=36)
   ->  Hash  (cost=25.38..25.38 rows=410 width=4)
         ->  Seq Scan on sample2  (cost=0.00..25.38 rows=410 width=4)
               Filter: (id > 33::oid)
(6 rows)

Hash Join

[Figure: the Inner relation is Hashed, and each Outer row probes the hash table for matches. Annotation: Must fit in Main Memory.]

Pseudocode for Hash Join

for (j = 0; j < length(inner); j++)
    hash_key = hash(inner[j]);
    append(hash_store[hash_key], inner[j]);
for (i = 0; i < length(outer); i++)
    hash_key = hash(outer[i]);
    for (j = 0; j < length(hash_store[hash_key]); j++)
        if (outer[i] == hash_store[hash_key][j])
            output(outer[i], hash_store[hash_key][j]);
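The same two phases, build then probe, can be written as runnable Python. This is a sketch under simplifying assumptions: rows are bare join keys, Python's built-in `hash` stands in for the database's hash function, and the sample values are hypothetical.

```python
from collections import defaultdict


def hash_join(outer, inner):
    """Build a hash table on the inner relation, then probe it with each outer row."""
    hash_store = defaultdict(list)
    for row in inner:                       # build phase
        hash_store[hash(row)].append(row)

    results = []
    for o in outer:                         # probe phase
        for candidate in hash_store.get(hash(o), []):
            if o == candidate:              # recheck: different keys can share a bucket
                results.append((o, candidate))
    return results


outer = ["aay", "aag", "aak", "aar"]
inner = ["aak", "aam", "aar", "aao", "aay"]
print(hash_join(outer, inner))
```

Each relation is read once, at the price of holding the hashed side in memory.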

Join the Two Tables with No Restriction

EXPLAIN SELECT sample1.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id);

                               QUERY PLAN
-------------------------------------------------------------------------
 Merge Join  (cost=927.72..1852.95 rows=61272 width=32)
   Merge Cond: (sample2.id = sample1.id)
   ->  Sort  (cost=85.43..88.50 rows=1230 width=4)
         Sort Key: sample2.id
         ->  Seq Scan on sample2  (cost=0.00..22.30 rows=1230 width=4)
   ->  Sort  (cost=842.29..867.20 rows=9963 width=36)
         Sort Key: sample1.id
         ->  Seq Scan on sample1  (cost=0.00..180.63 rows=9963 width=36)
(8 rows)

Merge Join

[Figure: the Outer and Inner relations are both Sorted and then scanned in step. Annotations: Ideal for Large Tables; An Index Can Be Used to Eliminate the Sort.]

Pseudocode for Merge Join

sort(outer);
sort(inner);
i = 0;
j = 0;
save_j = 0;
while (i < length(outer))
    if (outer[i] == inner[j])
        output(outer[i], inner[j]);
    if (outer[i] <= inner[j] && j < length(inner))
        j++;
        if (outer[i] < inner[j])
            save_j = j;
    else
        i++;
        j = save_j;
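A runnable Python sketch of a merge join follows. It is not a line-by-line port of the pseudocode: instead of the `save_j` rewind it pairs each run of equal outer keys with the matching run of inner keys, which also keeps every index in bounds. Rows are bare join keys and the sample values are hypothetical.

```python
def merge_join(outer, inner):
    """Sort both inputs, then advance through them in step."""
    outer = sorted(outer)
    inner = sorted(inner)
    results = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        if outer[i] < inner[j]:
            i += 1
        elif outer[i] > inner[j]:
            j += 1
        else:
            key = outer[i]
            # Find the end of the run of inner rows that share this key.
            run_end = j
            while run_end < len(inner) and inner[run_end] == key:
                run_end += 1
            # Pair every outer row holding this key with the whole inner run.
            while i < len(outer) and outer[i] == key:
                for k in range(j, run_end):
                    results.append((outer[i], inner[k]))
                i += 1
            j = run_end
    return results


outer = ["aaa", "aab", "aac", "aad"]
inner = ["aaa", "aab", "aab", "aac", "aae", "aaf", "aaf"]
print(merge_join(outer, inner))
```

If an index already delivers the rows in key order, the two `sorted` calls are the part that can be skipped.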

Order of Joined Relations Is Insignificant

EXPLAIN SELECT sample2.junk
FROM sample2 JOIN sample1 ON (sample2.id = sample1.id);

                               QUERY PLAN
------------------------------------------------------------------------
 Merge Join  (cost=927.72..1852.95 rows=61272 width=32)
   Merge Cond: (sample2.id = sample1.id)
   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
         Sort Key: sample2.id
         ->  Seq Scan on sample2  (cost=0.00..22.30 rows=1230 width=36)
   ->  Sort  (cost=842.29..867.20 rows=9963 width=4)
         Sort Key: sample1.id
         ->  Seq Scan on sample1  (cost=0.00..180.63 rows=9963 width=4)
(8 rows)

The most restrictive relation, e.g. sample2, is always on the outer side of merge joins. All previous merge joins also had sample2 in outer position.

Add Optimizer Statistics

ANALYZE sample1;

ANALYZE sample2;

This Was a Merge Join without Optimizer Statistics

EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id);

                               QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=15.85..130.47 rows=260 width=254)
   Hash Cond: (sample1.id = sample2.id)
   ->  Seq Scan on sample1  (cost=0.00..103.56 rows=2256 width=4)
   ->  Hash  (cost=12.60..12.60 rows=260 width=258)
         ->  Seq Scan on sample2  (cost=0.00..12.60 rows=260 width=258)
(5 rows)

Outer Joins Can Affect Optimizer Join Usage

EXPLAIN SELECT sample1.junk
FROM sample1 RIGHT OUTER JOIN sample2 ON (sample1.id = sample2.id);

                               QUERY PLAN
-------------------------------------------------------------------------
 Hash Left Join  (cost=131.76..148.26 rows=260 width=254)
   Hash Cond: (sample2.id = sample1.id)
   ->  Seq Scan on sample2  (cost=0.00..12.60 rows=260 width=4)
   ->  Hash  (cost=103.56..103.56 rows=2256 width=258)
         ->  Seq Scan on sample1  (cost=0.00..103.56 rows=2256 width=258)
(5 rows)

Use of hashes for outer joins was added in Postgres 9.1.

Cross Joins Are Nested Loop Joins without Join Restriction

EXPLAIN SELECT sample1.junk
FROM sample1 CROSS JOIN sample2;

                              QUERY PLAN
----------------------------------------------------------------------
 Nested Loop  (cost=0.00..7448.81 rows=586560 width=254)
   ->  Seq Scan on sample1  (cost=0.00..103.56 rows=2256 width=254)
   ->  Materialize  (cost=0.00..13.90 rows=260 width=0)
         ->  Seq Scan on sample2  (cost=0.00..12.60 rows=260 width=0)
(4 rows)

Create Indexes

CREATE INDEX i_sample1 on sample1 (id);

CREATE INDEX i_sample2 on sample2 (id);

Nested Loop with Inner Index Scan Now Possible

EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample1.id = 33;

                                   QUERY PLAN
---------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..16.55 rows=1 width=254)
   ->  Index Scan using i_sample1 on sample1  (cost=0.00..8.27 rows=1 width=4)
         Index Cond: (id = 33::oid)
   ->  Index Scan using i_sample2 on sample2  (cost=0.00..8.27 rows=1 width=258)
         Index Cond: (sample2.id = 33::oid)
(5 rows)

Nested Loop Join with Inner Index Scan

[Figure: each Outer row performs an Index Lookup into the Inner relation. Annotations: No Setup Required; Index Must Already Exist.]

Pseudocode for Nested Loop Join with Inner Index Scan

for (i = 0; i < length(outer); i++)
    index_entry = get_first_match(outer[i])
    while (index_entry)
        output(outer[i], inner[index_entry]);
        index_entry = get_next_match(index_entry);
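In runnable Python the index lookup might look like the sketch below. It assumes a dictionary from key to row positions as a stand-in for the pre-existing index (Postgres would normally use a B-tree); `build_index` and `index_nested_loop_join` are hypothetical names and the sample values are made up.

```python
from collections import defaultdict


def build_index(inner):
    """Map each key to the positions of the inner rows that hold it."""
    index = defaultdict(list)
    for pos, key in enumerate(inner):
        index[key].append(pos)
    return index


def index_nested_loop_join(outer, inner, index):
    """For each outer row, fetch only the inner rows the index points to."""
    results = []
    for o in outer:
        for pos in index.get(o, []):      # no scan of the whole inner relation
            results.append((o, inner[pos]))
    return results


inner = ["aay", "aag", "aas", "aar", "aaa", "aay"]
index = build_index(inner)                # in the talk, the index must already exist
print(index_nested_loop_join(["aag", "aay"], inner, index))
```

Compared with the inner sequential scan version, the inner loop touches only matching rows, so the cost depends on the number of outer rows and matches rather than on the full size of the inner relation.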

Query Restrictions Affect Join Usage

EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample2.junk ~ '^aaa';

                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..21.53 rows=1 width=254)
   ->  Seq Scan on sample2  (cost=0.00..13.25 rows=1 width=258)
         Filter: (junk ~ '^aaa'::text)
   ->  Index Scan using i_sample1 on sample1  (cost=0.00..8.27 rows=1 width=4)
         Index Cond: (sample1.id = sample2.id)
(5 rows)

No junk rows begin with 'aaa'.

All 'junk' Columns Begin with 'xxx'

EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample2.junk ~ '^xxx';

                               QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=16.50..131.12 rows=260 width=254)
   Hash Cond: (sample1.id = sample2.id)
   ->  Seq Scan on sample1  (cost=0.00..103.56 rows=2256 width=4)
   ->  Hash  (cost=13.25..13.25 rows=260 width=258)
         ->  Seq Scan on sample2  (cost=0.00..13.25 rows=260 width=258)
               Filter: (junk ~ '^xxx'::text)
(6 rows)

Hash join was chosen because many more rows are expected. The smaller table, e.g. sample2, is always hashed.

Without LIMIT, Hash Is Used for this Unrestricted Join

EXPLAIN SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id);

                               QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=15.85..130.47 rows=260 width=254)
   Hash Cond: (sample1.id = sample2.id)
   ->  Seq Scan on sample1  (cost=0.00..103.56 rows=2256 width=4)
   ->  Hash  (cost=12.60..12.60 rows=260 width=258)
         ->  Seq Scan on sample2  (cost=0.00..12.60 rows=260 width=258)
(5 rows)

LIMIT Can Affect Join Usage

EXPLAIN SELECT sample2.id, sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
ORDER BY 1
LIMIT 1;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.83 rows=1 width=258)
   ->  Nested Loop  (cost=0.00..477.02 rows=260 width=258)
         ->  Index Scan using i_sample2 on sample2  (cost=0.00..52.15 rows=260 width=258)
         ->  Index Scan using i_sample1 on sample1  (cost=0.00..1.62 rows=1 width=4)
               Index Cond: (sample1.id = sample2.id)
(5 rows)

LIMIT 10

EXPLAIN SELECT sample2.id, sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
ORDER BY 1
LIMIT 10;

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=0.00..18.35 rows=10 width=258)
   ->  Nested Loop  (cost=0.00..477.02 rows=260 width=258)
         ->  Index Scan using i_sample2 on sample2  (cost=0.00..52.15 rows=260 width=258)
         ->  Index Scan using i_sample1 on sample1  (cost=0.00..1.62 rows=1 width=4)
               Index Cond: (sample1.id = sample2.id)
(5 rows)

LIMIT 100 Switches to Hash Join

EXPLAIN SELECT sample2.id, sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
ORDER BY 1
LIMIT 100;

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Limit  (cost=140.41..140.66 rows=100 width=258)
   ->  Sort  (cost=140.41..141.06 rows=260 width=258)
         Sort Key: sample2.id
         ->  Hash Join  (cost=15.85..130.47 rows=260 width=258)
               Hash Cond: (sample1.id = sample2.id)
               ->  Seq Scan on sample1  (cost=0.00..103.56 rows=2256 width=4)
               ->  Hash  (cost=12.60..12.60 rows=260 width=258)
                     ->  Seq Scan on sample2  (cost=0.00..12.60 rows=260 width=258)
(8 rows)
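The Limit costs in the last three plans are consistent with a simple rule: charge the startup cost plus a share of the remaining cost proportional to the fraction of rows requested. The Python sketch below applies that rule to the nested loop figures from the slides (startup 0.00, total 477.02, 260 rows). The function name `limit_cost` is hypothetical, and the rule is inferred from the numbers shown here rather than quoted from the talk.

```python
def limit_cost(startup, total, rows, limit):
    """Estimated cost of fetching only `limit` of `rows` rows from a plan."""
    fraction = min(limit, rows) / rows
    return startup + (total - startup) * fraction


# Nested loop with index scans, as shown in the LIMIT 1 and LIMIT 10 plans.
NESTED_LOOP = dict(startup=0.00, total=477.02, rows=260)
# Total cost of the Limit over Sort over Hash Join plan chosen for LIMIT 100.
SORTED_HASH_JOIN_LIMIT_100 = 140.66

for n in (1, 10, 100):
    print(n, round(limit_cost(limit=n, **NESTED_LOOP), 2))
```

For LIMIT 1 and LIMIT 10 this reproduces the 1.83 and 18.35 shown above. For LIMIT 100 the nested loop would cost about 183.47, more than the 140.66 of the sorted hash join, which fits the switch in plans.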

Conclusion

http://momjian.us/presentations
http://www.vivapixel.com/photo/14252