2016 may-countdown-to-postgres-v96-parallel-query

Welcome Parallelism to PostgreSQL Thursday, 19 May 2016

Upload: ashnikbiz

Post on 17-Feb-2017


TRANSCRIPT

Page 1: 2016 may-countdown-to-postgres-v96-parallel-query

Welcome Parallelism to PostgreSQL

Thursday, 19 May 2016

Page 2: 2016 may-countdown-to-postgres-v96-parallel-query

Agenda

• Current state of parallelism in PostgreSQL

• What was needed to bring in server-side parallelism: work done in v9.4 and v9.5

• Parallel query in v9.6

• A review of some parallel plans

• Parallelism may not always be used

• Parallelism may not always be useful

• Parameters

• Benefits

• Questions

Page 3: 2016 may-countdown-to-postgres-v96-parallel-query

Current State (v9.5) of Parallelism in PostgreSQL

• Client-side parallelism: an application can open multiple sessions

• A batch can be run with multiple application threads

• Server-side languages can potentially perform parallel operations

• Some I/O activity is taken off the main query execution process by the walwriter and bgwriter

• effective_io_concurrency allows page prefetch requests to be issued to the kernel, for bitmap heap scans

• But there is no server-side parallelism for dividing a single task among multiple workers
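As a minimal sketch of the prefetch setting mentioned on this slide, the session below inspects and raises effective_io_concurrency. The value 4 is an arbitrary illustration rather than a recommendation from the talk, and non-zero values are only accepted on platforms that support posix_fadvise.

```sql
-- Inspect the current prefetch setting for this session.
SHOW effective_io_concurrency;

-- Tell PostgreSQL to expect about 4 concurrent disk I/O operations,
-- which drives page prefetching during bitmap heap scans.
-- (4 is an arbitrary example value, chosen here for illustration.)
SET effective_io_concurrency = 4;
```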

Page 4: 2016 may-countdown-to-postgres-v96-parallel-query

A lot of work was needed and was done!

v9.4

• Dynamic background workers

• Dynamic shared memory

• Implementation of shared memory message queues

v9.5

• Message propagation, i.e. error messages from a background worker can be sent to and received by the master

• Synchronization of state (GUC values, XID, CID mapping, current user, current database, etc.)

• Parallel contexts can be used by backend code to launch worker processes

Page 5: 2016 may-countdown-to-postgres-v96-parallel-query

v9.6: We have something that users can use!

• Parallel Sequential Scan

• Parallel Joins

• Parallel Aggregates

• These are not yet in their best form and have certain exceptions and limitations, but they work and are quite useful!

Page 6: 2016 may-countdown-to-postgres-v96-parallel-query

Basically how parallelism is supposed to work


Page 7: 2016 may-countdown-to-postgres-v96-parallel-query

Let’s look at some plans

Page 8: 2016 may-countdown-to-postgres-v96-parallel-query

Sequential Scan without Parallelism


Page 9: 2016 may-countdown-to-postgres-v96-parallel-query

Parallel Sequential Scans

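The plan shown on this slide did not survive in the transcript. A minimal sketch of how such a plan can be requested, assuming a pgbench-initialised database (pgbench_accounts is the table used later in this talk; the filter on abalance is an illustrative choice, since that column has no index):

```sql
-- Beta 1 parameter name; the final 9.6 release renamed it to
-- max_parallel_workers_per_gather.
SET max_parallel_degree = 2;

-- With enough rows in the table, the plan is expected to contain a
-- Gather node above a Parallel Seq Scan on pgbench_accounts.
EXPLAIN
SELECT *
FROM pgbench_accounts
WHERE abalance > 1000;
```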

Page 10: 2016 may-countdown-to-postgres-v96-parallel-query

You may not get as many workers as you desire

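One way to observe this, sketched under the assumption of a pgbench-initialised database: ask for more workers than are likely to be granted and compare what the planner wanted with what actually started. The labels "Workers Planned" and "Workers Launched" are those printed by the released 9.6; earlier builds may word them differently.

```sql
-- Ask for up to 4 workers (Beta 1 parameter name; the final 9.6 release
-- calls it max_parallel_workers_per_gather). The planner may still plan
-- fewer, depending on the size of the table.
SET max_parallel_degree = 4;

-- EXPLAIN ANALYZE actually runs the query. Under the Gather node,
-- "Workers Launched" can be lower than "Workers Planned" when the pool
-- sized by max_worker_processes has no free slots.
EXPLAIN ANALYZE
SELECT count(*)
FROM pgbench_accounts
WHERE abalance > 1000;

-- Size of the shared background worker pool (changing it needs a restart).
SHOW max_worker_processes;
```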

Page 11: 2016 may-countdown-to-postgres-v96-parallel-query

Parallel Aggregate

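The plan for this slide is missing from the transcript. As a hedged sketch, a grouped aggregate over pgbench_accounts (the grouping column bid is an illustrative choice) is the kind of query where a parallel aggregate can be considered:

```sql
-- Beta 1 parameter name; the final 9.6 release calls it
-- max_parallel_workers_per_gather.
SET max_parallel_degree = 2;

-- Each worker computes a partial sum over its share of the rows and the
-- leader combines the partial results. In a parallel plan, look for a
-- Partial aggregate node below Gather and a Finalize aggregate node above it.
EXPLAIN
SELECT bid, sum(abalance)
FROM pgbench_accounts
GROUP BY bid;
```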

Page 12: 2016 may-countdown-to-postgres-v96-parallel-query

Parallel Joins

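The join plan for this slide is also missing. A minimal sketch, assuming the standard pgbench tables, of a join where the large table can be scanned in parallel while each worker joins its rows against the small one:

```sql
-- Beta 1 parameter name; the final 9.6 release calls it
-- max_parallel_workers_per_gather.
SET max_parallel_degree = 2;

-- If a parallel plan is chosen, the join node sits below Gather and is
-- fed by a Parallel Seq Scan on pgbench_accounts.
EXPLAIN
SELECT a.aid, b.bbalance
FROM pgbench_accounts a
JOIN pgbench_branches b ON b.bid = a.bid
WHERE a.abalance > 1000;
```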

Page 13: 2016 may-countdown-to-postgres-v96-parallel-query

Wow! So using ‘Parallel Workers’ should be preferred!

No, not really!

Page 14: 2016 may-countdown-to-postgres-v96-parallel-query

Parallel Query May not be used all the time

• The cost of dividing work among multiple worker processes and coordinating them outweighs the advantage of parallelism

• The cost of setting up the parallelism infrastructure is too high

• No worker process is available
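The first two reasons can be sketched as follows, assuming a pgbench-initialised database. The extreme parallel_setup_cost value is purely illustrative; it only serves to make worker startup look prohibitively expensive to the planner.

```sql
-- Beta 1 parameter name; the final 9.6 release calls it
-- max_parallel_workers_per_gather.
SET max_parallel_degree = 2;

-- pgbench_branches is tiny, so a plain Seq Scan is expected: starting
-- workers would cost more than the scan itself.
EXPLAIN SELECT * FROM pgbench_branches WHERE bbalance > 1000;

-- With a deliberately extreme setup cost, even a scan of the large table
-- should fall back to a serial plan.
SET parallel_setup_cost = 1000000000;
EXPLAIN SELECT * FROM pgbench_accounts WHERE abalance > 1000;

-- Restore the default so later statements are not affected.
RESET parallel_setup_cost;
```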

Page 15: 2016 may-countdown-to-postgres-v96-parallel-query

Example


Page 16: 2016 may-countdown-to-postgres-v96-parallel-query

Parallel Query may not be good all the time


Page 17: 2016 may-countdown-to-postgres-v96-parallel-query

Parallel Query may not be good all the time

• It depends a lot on your hardware resources and on process scheduling by your OS

• I tried various degrees of parallelism on a test machine:

• 3 CPUs, 3 GB RAM

• VM running CentOS

• Single I/O disk

• A simple ‘count’ on a table with 100 million rows and 8-byte width:

• explain analyze select count(*) from pgbench_accounts;

• It performed faster with the parallel degree set to 0, because an index scan was performed

• Make sure you have tuned your parameters well to help the optimizer decide
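A comparison of this kind can be reproduced roughly as below, using the query from the slide. The worker count of 2 is an illustrative choice; the timings on the original test machine are not in the transcript, so none are shown here.

```sql
-- Serial baseline: no workers. (Beta 1 parameter name; the final 9.6
-- release calls it max_parallel_workers_per_gather.)
SET max_parallel_degree = 0;
EXPLAIN ANALYZE SELECT count(*) FROM pgbench_accounts;

-- Same query with up to 2 workers. Repeat with other values and compare
-- both the reported execution time and the shape of the plan.
SET max_parallel_degree = 2;
EXPLAIN ANALYZE SELECT count(*) FROM pgbench_accounts;
```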

Page 18: 2016 may-countdown-to-postgres-v96-parallel-query

Parameters Involved

Page 19: 2016 may-countdown-to-postgres-v96-parallel-query

Parameters which govern parallel query execution

• parallel_setup_cost

• parallel_tuple_cost

• max_worker_processes

• max_parallel_degree

• force_parallel_mode

• ALTER TABLE … SET (parallel_degree=n)

• ALTER FUNCTION … PARALLEL SAFE

• ALTER FUNCTION … COST

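A hedged sketch of how these settings look in a session. The names follow the slide, which reflects 9.6 Beta 1; in the final 9.6 release max_parallel_degree became max_parallel_workers_per_gather and the table option parallel_degree became parallel_workers. The function my_score(integer) is hypothetical and only stands in for a user-defined function, and the worker counts and function cost are example values.

```sql
-- Planner cost estimates for parallel plans (values shown are the defaults).
SET parallel_setup_cost = 1000;  -- cost charged for starting workers
SET parallel_tuple_cost = 0.1;   -- cost per tuple passed from a worker to the leader

-- Upper limit on workers for one parallel operation (Beta 1 name).
SET max_parallel_degree = 2;

-- Testing aid: request a parallel plan even when it is not expected to be faster.
SET force_parallel_mode = on;

-- Per-table override of the worker count (Beta 1 spelling).
ALTER TABLE pgbench_accounts SET (parallel_degree = 4);

-- Hypothetical function: mark it as safe to run inside a worker and give
-- the planner an estimate of how expensive each call is.
ALTER FUNCTION my_score(integer) PARALLEL SAFE;
ALTER FUNCTION my_score(integer) COST 500;
```

max_worker_processes is not set here because it caps the server-wide pool of background workers and can only be changed in postgresql.conf followed by a restart.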

Page 20: 2016 may-countdown-to-postgres-v96-parallel-query

Benefits to the users

• Sequential scan on large tables would be faster

• Analytics workloads involving aggregates would be faster

• Faster JOINs between large tables

• PostgreSQL v9.6 can be a good candidate for the backend database of a data warehouse

• More parallel operations to come in future releases

Page 21: 2016 may-countdown-to-postgres-v96-parallel-query

What can you do?

• PostgreSQL 9.6 Beta 1 is out

• Try it out…

• Test it…

• Break it…

• Report it

• Help the PostgreSQL community make it better

Page 22: 2016 may-countdown-to-postgres-v96-parallel-query

Further Reading

• PGCon 2014: Implementing Parallelism in PostgreSQL, Robert Haas

• PGConf US 2016: PostgreSQL 9.6, Magnus Hagander

• PGCon 2015, Ottawa: Parallel Sequential Scan, Robert Haas and Amit Kapila

• EnterpriseDB Blog: Parallelism Progress, Robert Haas

• Parallel Sequential Scan is Committed, Robert Haas

• EnterpriseDB Blog: Parallelism Becomes a Reality in Postgres, Amit Kapila


Page 23: 2016 may-countdown-to-postgres-v96-parallel-query

Send us your suggestions and questions

[email protected]

Stay Tuned!

Website: www.ashnik.com