FOSDEM 2012

MariaDB 5.3 query optimizer

Taking the dolphin to where he's never been before

Sergey Petrunya, MariaDB

What is MySQL currently

MySQL is
● “World's most popular database”
● Used for
  ● Websites
  ● OLTP applications

at the same time
● “a toy database”
● “cannot handle complex queries”

When does one use MySQL

Web apps
● Mostly reads
● Point or narrow-range select queries:
  ● SELECT * FROM web_pages WHERE key=...
  ● SELECT * FROM email_archive WHERE date BETWEEN ...
  ● SELECT * FROM forum_posts WHERE thread_id=34213 ORDER BY post_date DESC LIMIT 10

OLTP (Online Transaction Processing) applications
● Same as above but more writes and ACID requirements
  ● SELECT balance FROM users WHERE user_id=...
  ● UPDATE user_id

When does one not use MySQL

Decision support / analytics / reporting
● Database is larger
  ● think “current state” data → “full history” data
● Queries shuffle through more data
  ● “get data for order X” → “get biggest orders in the last month”
● Queries are frequently complex
  ● “get last N posts” → “get distribution of posts by time of the day”
  ● “get orders for item X made today” → “which fraction of those who ordered item X also ordered item Y?”

What exactly is not working

Reasons MySQL is poor at decision support/analytics

● Large datasets
  ● Reading lots of records from disk requires special disk access strategies
● Complex queries
  ● Insufficient subquery optimizations
    ● On many levels
  ● Insufficient support for big joins

What exactly isn't working?

Let's try running an analytics-type query
● Take DBT-3 (ad-hoc, decision-support benchmark)
● Load the data for scale=30 (75 GB)
● And try some query:

“average price of item ordered in a certain month”

select avg(lineitem.l_extendedprice)
from orders, lineitem
where lineitem.l_orderkey=orders.o_orderkey and
      orders.o_orderdate between date '1992-07-01' and date '1992-07-31';

What exactly isn't working? (2)

● Query time: 45 min
  ● Why?
● Let's explore

select avg(lineitem.l_extendedprice)
from orders, lineitem
where lineitem.l_orderkey=orders.o_orderkey and
      orders.o_orderdate between date '1992-07-01' and date '1992-07-31';

id  select_type  table     type   possible_keys                               key            key_len  ref                rows     Extra
1   SIMPLE       orders    range  PRIMARY,i_o_orderdate                       i_o_orderdate  4        NULL               1165090  Using where; Using index
1   SIMPLE       lineitem  ref    PRIMARY,i_l_orderkey,i_l_orderkey_quantity  PRIMARY        4        orders.o_orderkey


What is the problem?

● Check “iostat -x”

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.01    0.00    2.26   23.62    0.00   72.11

Device:  rrqm/s  wrqm/s     r/s   w/s    rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.00    0.00  229.00  0.00  3952.00   0.00     34.52      0.95   4.15     4.15     0.00   4.15  95.00

● IO-bound load
● 229 requests/sec
● 4.15 msec average
● It's a 7200 RPM HDD, which gives an 8 msec disk seek
● Not quite random disk seeks, but close
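The iostat figures above can be cross-checked with a little arithmetic. This is a rough sketch, not part of the original slides: the variable names are illustrative, and it assumes avgrq-sz is reported in 512-byte sectors.

```python
# Figures copied from the "iostat -x" output above.
reads_per_sec = 229.00        # r/s
read_kb_per_sec = 3952.00     # rkB/s
avg_request_sectors = 34.52   # avgrq-sz (assumed unit: 512-byte sectors)
util_pct = 95.00              # %util

# Average request size, computed two independent ways.
kb_per_request = read_kb_per_sec / reads_per_sec
kb_per_request_from_sectors = avg_request_sectors * 512 / 1024

# With the disk busy ~95% of the time, busy time per request
# should come out close to the reported svctm.
ms_per_request = (util_pct / 100.0) * 1000.0 / reads_per_sec

print(f"{kb_per_request:.2f} kB per request (from rkB/s and r/s)")
print(f"{kb_per_request_from_sectors:.2f} kB per request (from avgrq-sz)")
print(f"{ms_per_request:.2f} ms of disk time per request")
```

Each request moves only about 17 kB and costs about 4 ms, so the disk spends nearly all of its time positioning rather than transferring data.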

What is the problem (2)

● Check “SHOW ENGINE INNODB STATUS”

...
--------
FILE I/O
--------
...
1 pending preads, 0 pending pwrites
206.36 reads/s, 16384 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s

● Possible solutions:
  ● Get more RAM
  ● Get an SSD
● These are OK for speeding up OLTP workloads
● Speeding up analytics this way is going to be costly!

MySQL/MariaDB solution

Improved disk access strategies
● Multi-Range Read
● Batched Key Access

Multi-Range Read
● Access table records in disk order (MySQL, MariaDB)
● Enumerate index entries in index order (MariaDB)

Batched Key Access
● Group ref/eq_ref accesses together
● Submit them to the storage engine (e.g. InnoDB) as a batch
  ● So that Multi-Range Read can do its job
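The idea behind the two strategies can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not MariaDB's implementation: rowids stand in for disk positions, the index is a plain dict, and `total_seek_distance` and `batched_lookup` are hypothetical helpers made up for the illustration.

```python
import random

def total_seek_distance(rowids):
    """Sum of head movements when rows are fetched in the given order."""
    distance, position = 0, 0
    for rowid in rowids:
        distance += abs(rowid - position)
        position = rowid
    return distance

def batched_lookup(keys, index, batch_size):
    """Collect lookups into batches, then fetch each batch in rowid (disk) order."""
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        rowids = [rowid for key in batch for rowid in index.get(key, [])]
        yield from sorted(rowids)

random.seed(42)
# Toy index: key -> list of rowids, with rows scattered randomly over the "disk".
positions = list(range(10_000))
random.shuffle(positions)
index = {key: [positions[key]] for key in range(10_000)}

# Keys arriving from the first table of the join, in key order.
keys = list(range(0, 10_000, 10))

# Regular ref access: one lookup per key, in arrival order.
regular = [rowid for key in keys for rowid in index[key]]
# Batched access: same rows, but each batch is read in disk order.
batched = list(batched_lookup(keys, index, batch_size=250))

print("regular seek distance:", total_seek_distance(regular))
print("batched seek distance:", total_seek_distance(batched))
```

Both orders fetch exactly the same rows; only the order of disk accesses differs. A larger batch (a larger join buffer) means fewer, longer sweeps over the disk.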

Let's try the query with MRR/BKA

● Enable MRR/BKA

set optimizer_switch='mrr=on,mrr_sort_keys=on';
set join_cache_level=6;
set join_buffer_size=1024*1024*32;
set join_buffer_space_limit=1024*1024*32;

● Re-run the query

select avg(lineitem.l_extendedprice)
from orders, lineitem
where lineitem.l_orderkey=orders.o_orderkey and
      orders.o_orderdate between date '1992-07-01' and date '1992-07-31';

● Query time: 3 min 48 sec
  ● Was: 45 min, 11.8x speedup

Let's try the query with MRR/BKA (2)

● Explain is almost as before

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: orders
         type: range
possible_keys: PRIMARY,i_o_orderdate
          key: i_o_orderdate
      key_len: 4
          ref: NULL
         rows: 1165090
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: lineitem
         type: ref
possible_keys: PRIMARY,i_l_orderkey,i_l_orderkey_quantity
          key: i_l_orderkey
      key_len: 4
          ref: dbt3sf30.orders.o_orderkey
         rows: 1
        Extra: Using join buffer (flat, BKA join); Key-ordered Rowid-ordered scan
2 rows in set (0.00 sec)

Check what it is doing

● Check “iostat -x”

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          15.13    0.00    2.82    9.74    0.00   72.31

Device:  rrqm/s  wrqm/s      r/s   w/s     rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.00    0.00  1936.00  0.00  88112.00   0.00     91.02      0.45   0.23     0.23     0.00   0.23  44.20

● Got some CPU load
● svctm down to 0.23 ms (random seeks are 8 ms)
● SHOW ENGINE INNODB STATUS

...
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 10450.55 reads/s
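Putting the two iostat runs side by side makes the change easier to see. The sketch below only divides numbers already shown in the slides; the dictionary layout and variable names are illustrative.

```python
# Figures copied from the two "iostat -x" runs and the reported query times.
before = {"r/s": 229.00, "rkB/s": 3952.00, "svctm_ms": 4.15}
after = {"r/s": 1936.00, "rkB/s": 88112.00, "svctm_ms": 0.23}

requests_ratio = after["r/s"] / before["r/s"]            # more requests served per second
throughput_ratio = after["rkB/s"] / before["rkB/s"]      # more data read per second
service_ratio = before["svctm_ms"] / after["svctm_ms"]   # cheaper individual requests

query_before_sec = 45 * 60      # 45 min
query_after_sec = 3 * 60 + 48   # 3 min 48 sec
speedup = query_before_sec / query_after_sec

print(f"requests/sec:  {requests_ratio:.1f}x")
print(f"read kB/sec:   {throughput_ratio:.1f}x")
print(f"service time:  {service_ratio:.1f}x shorter")
print(f"query time:    {speedup:.1f}x faster")
```

The disk delivers roughly 22x more data per second while being only 44% busy, which fits the observation that the query now also spends time on CPU.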

Use systemtap to look at io patterns

bash# stap deviceseeks.stp -c "sleep 60"

[Figure: disk seek patterns, two panels: “Regular” and “Batched Key Access”]

Data from a bigger benchmark

● 4 GB RAM box● DBT-3 scale=10 (30GB dataset)

          Before        After
avg       3.24 hrs      5.91 min
median    0.32 hrs      4.48 min
max       25.97 hrs*    22.92 min

● Can do joins we couldn't before!
● NOTE: not with default settings
  ● Will publish special “big” configuration on
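For a sense of scale, the Before/After figures from the benchmark table can be converted to the same unit and divided. A small sketch; the names are illustrative, and the asterisk on the max "Before" value is left uninterpreted because its footnote is not in the slides.

```python
# Before is in hours, After is in minutes (as in the table above).
before_hours = {"avg": 3.24, "median": 0.32, "max": 25.97}
after_minutes = {"avg": 5.91, "median": 4.48, "max": 22.92}

speedups = {
    name: before_hours[name] * 60 / after_minutes[name]
    for name in before_hours
}

for name, factor in speedups.items():
    print(f"{name:>6}: {factor:.1f}x faster")
```

The median improves far less than the average and the max, so the gain is concentrated in the queries that were slowest before.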

What exactly is not working

Reasons MySQL is poor at decision support/analytics

● Large datasets
  ● Reading lots of records from disk requires special disk access strategies
● Complex queries
  ● Insufficient subquery optimizations
    ● On many levels
  ● Insufficient support for big joins

Subquery optimizations

Status in MySQL 5.x, MariaDB 5.2

● One execution strategy for every kind of subquery

● If it is appropriate for your case – OK

● If it is not – query is too slow to be usable
  ● 10x, 100x, 1000x slower

● There are cases when even EXPLAIN is very slow
  ● FROM subqueries
  ● ... and other less-obvious cases

● General public reaction

● “Don't use subqueries”
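Why a single execution strategy can cost orders of magnitude: a toy comparison of two ways to evaluate an IN subquery. This is an illustrative sketch, not the server's code. It assumes the one strategy is "re-run the subquery for every outer row" and uses "materialize the subquery result once" as the alternative; the slides do not name the strategies, and the function names and data are made up.

```python
def in_subquery_per_row(outer_rows, inner_rows):
    """Re-scan the inner rows for every outer row; count rows examined."""
    matches, examined = [], 0
    for value in outer_rows:
        for inner in inner_rows:
            examined += 1
            if inner == value:
                matches.append(value)
                break  # stop at the first match
    return matches, examined

def in_subquery_materialized(outer_rows, inner_rows):
    """Scan the inner rows once into a set, then probe it per outer row."""
    lookup = set(inner_rows)
    examined = len(inner_rows)
    matches = [value for value in outer_rows if value in lookup]
    return matches, examined

outer_rows = list(range(0, 2000, 2))   # hypothetical outer table values
inner_rows = list(range(0, 3000, 3))   # hypothetical subquery result

slow_matches, slow_examined = in_subquery_per_row(outer_rows, inner_rows)
fast_matches, fast_examined = in_subquery_materialized(outer_rows, inner_rows)

print("same result:", slow_matches == fast_matches)
print("inner rows examined, per-row strategy:", slow_examined)
print("inner rows examined, materialized:    ", fast_examined)
```

Neither strategy is always the winner: with very few outer rows, or an index on the inner table, re-running the subquery can be cheaper, which is why the choice belongs to a cost-based optimizer.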

Subquery handling in MySQL 5.x

● One strategy for every kind of subquery

Subquery handling in MySQL 5.6

● Optimizations for a couple of cases

Subquery handling in MySQL 6.0

● But look, this is what was in MySQL 6.0 alpha

Subquery handling in MariaDB 5.3

● And we still had to add this to provide enough coverage

Subqueries in MariaDB 5.3

Outside view
● Execution can be 10x, 100x, 1000x faster than before
● EXPLAIN is always instant

A bit of detail
● Competitive number of strategies
● Optimizer quality: can expect what you expect from the join optimizer
● There is an @@optimizer_switch flag for every new optimization
● Batched Key Access supported in important cases

Where to look for more detail

Q & A

