mysql best practices at trovit

58
MySQL best practices Design, use and optimization Iván López [email protected] 2014 1

Upload: ivan-lopez

Post on 08-Aug-2015

183 views

Category:

Software


0 download

TRANSCRIPT

MySQL best practicesDesign, use and optimization

Iván López [email protected] 20141

Table designDatabase engine (table format)

2

InnoDB table format• Use InnoDB by default

• InnoDB vs MyISAM pros:

• Row locking, allows concurrent writes

• ACID

• Non-blocking backups

• Better data recovery after a crash

• InnoDB cons:

• Lack of instant row count

3

What about MyISAM?

• Never use MyISAM for concurrent applications in real time

• Possible use cases for MyISAM:

• No writes

• Batch writes only

• Mostly full scan selects

4

Table designSegmentation and specialisation

5

Table segmentation• For very big tables or tables that will grow forever,

find some criteria to segment data into smaller tables

• For instance:

• one table per month for log recording

• one table per country for users, so you can shard them and put them on different servers closer to the users

6

Table specialisation• Don’t use the same table for heterogeneous

records that don’t share the same fields, it will increase table size and affect performance when using indexes

• For instance:

• In a database for classified ads for homes, cars and jobs, use one table for each type, because they don’t share fields like number of rooms, engine power or salary.

7

Table designPrimary Key in InnoDB

8

Clustered index

1 1 2 A B … … …

Primary key1 1 1 A A … … …

1 1 1A A

Row data

Primary keySecondary key 1

1 2 2A A

1 1 2A B

1 2 1 A B … … …

9

1 2 2 A A … … …

1 2 1A B

Clustered index• You must always define a primary key, if there’s no

natural PK for the table, define an auto incremental PK

• Records are physically ordered in table by the PK

• Accessing a row using the PK is the fastest way, because the row data is on the same page where the index search leads

• Don’t include fields in the PK that could be modified after insertion, it will delete and insert the record again at the right position if you update the PK, affecting performance

10

Secondary indexes• All indexes other than the clustered index are known as secondary

indexes

• Each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index

• InnoDB uses this primary key value to search for the row in the clustered index

• You could take profit from this design for paging by PK in selects that use a secondary index

• If the primary key is long, the secondary indexes use more space, so it is better to have a small auto increment PK and define a unique key with the fields that would be the natural PK

11

Long Primary KeyCREATE  TABLE  `jobs_tbl_trovit_stats`  (      `s_query`  varchar(255)  NOT  NULL,      `d_date`  date  NOT  NULL,      `fk_i_id_tbl_type_dates`  int(10)  unsigned  NOT  NULL,      `i_num_ads`  int(10)  DEFAULT  NULL,      `i_num_ads_salary`  int(10)  DEFAULT  NULL,      `i_num_ads_region`  int(10)  DEFAULT  NULL,      `i_num_ads_city`  int(10)  DEFAULT  NULL,      `i_num_ads_company`  int(10)  DEFAULT  NULL,      `i_num_ads_experience`  int(10)  DEFAULT  NULL,      `i_num_ads_contract`  int(10)  DEFAULT  NULL,      `f_avg_salary`  int(10)  DEFAULT  NULL,      `f_avg_region_salary`  int(10)  DEFAULT  NULL,      `f_avg_city_salary`  int(10)  DEFAULT  NULL,      `f_avg_company_salary`  int(10)  DEFAULT  NULL,      `f_avg_experience_salary`  int(10)  DEFAULT  NULL,      `f_avg_contract_salary`  int(10)  DEFAULT  NULL,      `s_similar_hash`  varchar(255)  NOT  NULL,      PRIMARY  KEY  (`s_query`,`fk_i_id_tbl_type_dates`,`d_date`),      KEY  `s_query_i_num_ads_salary_d_date`  (`s_query`,`i_num_ads_salary`,`d_date`),      KEY  `s_similar_hash_d_date`  (`s_similar_hash`,`d_date`),      KEY  `d_date`  (`d_date`)  )  ENGINE=InnoDB  DEFAULT  CHARSET=utf8  

5MM  records  =  2600MB  disk  space

12

Auto increment Primary KeyCREATE  TABLE  `jobs_tbl_trovit_stats_NEW`  (      `i_id`  int(10)  unsigned  NOT  NULL  AUTO_INCREMENT,      `s_query`  varchar(255)  NOT  NULL,      `d_date`  date  NOT  NULL,      `fk_i_id_tbl_type_dates`  int(10)  unsigned  NOT  NULL,      `i_num_ads`  int(10)  DEFAULT  NULL,      `i_num_ads_salary`  int(10)  DEFAULT  NULL,      `i_num_ads_region`  int(10)  DEFAULT  NULL,      `i_num_ads_city`  int(10)  DEFAULT  NULL,      `i_num_ads_company`  int(10)  DEFAULT  NULL,      `i_num_ads_experience`  int(10)  DEFAULT  NULL,      `i_num_ads_contract`  int(10)  DEFAULT  NULL,      `f_avg_salary`  int(10)  DEFAULT  NULL,      `f_avg_region_salary`  int(10)  DEFAULT  NULL,      `f_avg_city_salary`  int(10)  DEFAULT  NULL,      `f_avg_company_salary`  int(10)  DEFAULT  NULL,      `f_avg_experience_salary`  int(10)  DEFAULT  NULL,      `f_avg_contract_salary`  int(10)  DEFAULT  NULL,      `s_similar_hash`  varchar(255)  NOT  NULL,      PRIMARY  KEY  (`i_id`),      UNIQUE  KEY  `unique_key`  (`s_query`,`fk_i_id_tbl_type_dates`,`d_date`),      KEY  `s_query_i_num_ads_salary_d_date`  (`s_query`,`i_num_ads_salary`,`d_date`),      KEY  `s_similar_hash_d_date`  (`s_similar_hash`,`d_date`),      KEY  `d_date`  (`d_date`)  )  ENGINE=InnoDB  AUTO_INCREMENT=6462849  DEFAULT  CHARSET=utf8  

5MM  records  =  2100MB  -­‐20%  disk  space

13

Using indexes

14

Pros of indexes

1. Filter: access only the records you need, without considering more records than necessary. Applies to SELECT, UPDATE, REPLACE and DELETE

2. Sor/group: avoid using temporary tables

3. Cover: if all fields in a SELECT are included in the index used (despite its order), data is retrieved directly from this index, saving extra reads

15

Cons of indexes

1. Writes: the more indexes a table has, the slower writes are going to be

2. Size: each index is going to increase total table size

16

Rules for using indexes• Only one index is used for each table in a query

• In case of using “OR” in a WHERE condition, it works as many different queries, and each one could use its own index

• Fields use order in index is from left to right, always beginning with the first one

• Its not mandatory to use all fields in an index, but every field used must be consecutive

• Fields in WHERE condition go first, next GROUP BY and ORDER BY last

• Order of fields in WHERE condition doesn’t matter

• A range condition in WHERE or using GROUP BY or ORDER BY will prevent using the next fields in the index

17

Rules for using indexes fields use order

18

You can’t skip previous fields if you want to filter using the index for rightmost fields

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  *  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  d_date  =  ‘2014-­‐10-­‐10’  AND  fk_i_id_tbl_vertical  =  1;  

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  *  FROM  table  WHERE  d_date  =  ‘2014-­‐10-­‐10’  AND  fk_i_id_tbl_vertical  =  1;  

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  *  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  fk_i_id_tbl_vertical  =  1;  

Rules for using indexes fields order in ranges, groups and sorts

19

None of this examples could filter using the index for the field fk_i_id_tbl_vertical

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  *  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  d_date  <  ‘2014-­‐10-­‐10’  AND  fk_i_id_tbl_vertical  =  1;  

SELECT  *  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  fk_i_id_tbl_vertical  =  1,  GROUP  BY  d_date;  

SELECT  *  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  fk_i_id_tbl_vertical  =  1  ORDER  BY  d_date;

Rules for using indexes covering indexes and fields order

20

This query will use the covering index for all the requested fields, also uses the index for filtering the first field, but can’t use it for filtering the next fields because the WHERE condition is skipping the second field in the index

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  fk_i_id_tbl_vertical  =  1;  

This query will use the covering index for all the requested fields, but can’t use it for filtering because the WHERE condition is skipping the first field in the index

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical  FROM  table  WHERE  d_date  <  ‘2014-­‐10-­‐10’  AND  fk_i_id_tbl_vertical  =  1;  

Rules for using indexes covering indexes and not indexed fields

21

You can’t use covering index optimization if any of the requested fields is not included in the index

KEY  `country_date_vertical`  (`fk_c_id_tbl_countries`,`d_date`,`fk_i_id_tbl_vertical`)  

SELECT  fk_c_id_tbl_countries,  d_date,  fk_i_id_tbl_vertical,  s_query  FROM  table  WHERE  fk_c_id_tbl_countries  =  ‘es’  AND  d_date  =  ‘2014-­‐10-­‐10’  AND  fk_i_id_tbl_vertical  =  1;  

Good index design

22

Duplicated indexes

• Avoid duplicating fields in different indexes, it will affect write performance and increase table size

• Think about the most frequent uses of the table, so you can design the table itself and order the fields in indexes smartly to avoid duplicate fields

23

Promote covering indexes

• Consider adding frequently requested fields at the end of an index, even if they aren’t used in WHERE, GROUP BY or ORDER BY

• If the major part of queries on a table are optimised to use use covering indexes, is the most important performance boost you can get

24

Index cardinality• Cardinality is how many unique values an index has

• The more cardinality, the more efficient an index is filtering records

• MySQL maintains approximate statistics about cardinality

• Each MySQL version gives different cardinality values, and they can become out of date under high write load

25

Index cardinality

26

mysql>  analyze  table  users;  

mysql>  show  index  from  users;  

+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+  

|  Table  |  Non_unique  |  Key_name  |  Seq  |  Column_name  |  Cardinality  |  

+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+  

|  users  |                    0  |  PRIMARY    |      1  |  id                    |          9728224  |  

|  users  |                    1  |  age_sex    |      1  |  age                  |                  192  |  

|  users  |                    1  |  age_sex    |      2  |  sex                  |                  406  |  

|  users  |                    1  |  sex_age    |      1  |  sex                  |                      2  |  

|  users  |                    1  |  sex_age    |      2  |  age                  |                  406  |  

|  users  |                    1  |  name          |      1  |  name                |              38149  |  

|  users  |                    1  |  active      |      1  |  active            |                      2  |  

+-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐-­‐+

Cardinality and distribution

27

Cardinalityversion 5.5

Cardinalityversion 5.6 count(distinct) Distribution Filter

efficiency

PRIMARY 10.000.267 9.728.224 10.000.000 única optimum

name 18,868 38,149 10,001 ±0,01%  c/u very good

age 18 192 101 ±1%  c/u good

sex 18 2 2 50%  c/u fair

active 18 2 2 ‘0’:    0,1%  ‘1’:  99,9%

‘0’ very good‘1’ very bad

Optimising queries

28

Using EXPLAIN

29

EXPLAIN fieldsid Id of the SELECT, if there are more than one

select_type Query type (SIMPLE, UNION, SUBQUERY…)table Table nametype Record search strategy

possible_keys Indexes to considerkey Index used

key_len Used length of the indexref Fields matched with the index

rows Approximate record number to considerfiltered Percentage of filtered records by WHEREExtra Additional info

30

Search strategies from most to less optimal

• const: just one record, by primary key or unique index

• eq_ref: just one record for each other record in a JOIN

• ref: multiple records filtering by index

• index_merge: multiple records filtering by more than one index

• range: multiple records filtering by range using index

• index: read all records in the index file (index scan)

• ALL: read all records in the data file (full scan)

Information in the “Extra” field

• Using where: records are filtered after reading them, using the WHERE condition (an index wasn’t able to filter all of them)

• Using index: all field data is read directly from the index, without accessing the data file (covering index)

• Using where, Using index: as with “Using where”, records are filtered after reading them, but data comes from an index

• Using filesort: extra step to sort after filtering records, when records were not read in order from an index, using a temporary file

• Using temporary: temporary tables are needed to complete some steps and satisfy the query (in memory or disk)

Guide to understand EXPLAIN• “Using index” will boost query performance (covering index),

specially with lots of results

• Avoid “Using filesort” and “Using temporary”, specially with lots of results

• “Using where” in queries of the types “ALL” o “index” means that is not possible to discard any record directly from the index, and they will be filtered while reading all the records

• Watch how may records are going to be read approximately according to the field “rows”

• Watch the effective used length of the index according to the field “key_len”

33

Just one record by Primary Keymysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  

       -­‐>  WHERE  s_query  =  'account  executive'  

       -­‐>  AND  fk_i_id_tbl_type_dates  =  1  

       -­‐>  AND  d_date  =  '2009-­‐09-­‐28'\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats  

                 type:  const  

possible_keys:  PRIMARY,s_query_i_num_ads_salary_d_date,d_date  

                   key:  PRIMARY  

           key_len:  774  

                   ref:  const,const,const  

                 rows:  1

34

Just one record by UNIQUE KEYmysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW  

       -­‐>  WHERE  s_query  =  'account  executive'  

       -­‐>  AND  fk_i_id_tbl_type_dates  =  1  

       -­‐>  AND  d_date  =  '2009-­‐09-­‐28'\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats_NEW  

                 type:  const  

possible_keys:  unique_key,s_query_i_num_ads_salary_d_date,d_date  

                   key:  unique_key  

           key_len:  774  

                   ref:  const,const,const  

                 rows:  1

35

JOIN of just one recordmysql>  EXPLAIN  SELECT  *  FROM  tbl_users  LEFT  JOIN  tbl_countries          -­‐>  ON  tbl_users.fk_c_id_tbl_countries  =  tbl_countries.c_id\G  ***************************  1.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  tbl_users                    type:  ALL  possible_keys:  NULL                      key:  NULL              key_len:  NULL                      ref:  NULL                    rows:  20986861  ***************************  2.  row  ***************************                        id:  1      select_type:  SIMPLE                  table:  tbl_countries                    type:  eq_ref  possible_keys:  PRIMARY                      key:  PRIMARY              key_len:  6                      ref:  trovit_global.tbl_users.fk_c_id_tbl_countries                    rows:  1

36

Many records using a partial Unique Key

mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW  

       -­‐>  WHERE  s_query  =  'account  executive'\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats_NEW  

                 type:  ref  

possible_keys:  unique_key,s_query_i_num_ads_salary_d_date  

                   key:  unique_key  

           key_len:  767  

                   ref:  const  

                 rows:  1073

37

Many records using indexmysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  

       -­‐>  WHERE  d_date  =  '2012-­‐10-­‐15'\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats  

                 type:  ref  

possible_keys:  d_date  

                   key:  d_date  

           key_len:  3  

                   ref:  const  

                 rows:  10908

38

OR condition and indexesmysql>  EXPLAIN  SELECT  a  FROM  t  

       -­‐>  WHERE  b  =  1  OR  c  =  1\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  t  

                 type:  index_merge  

possible_keys:  b,c,b_a  

                   key:  b,c  

           key_len:  4,4  

                   ref:  NULL  

                 rows:  4863128  

               Extra:  Using  union(b,c);  Using  where

39

Range query using indexmysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats_NEW  

       -­‐>  WHERE  s_query  =  'account  executive’  

       -­‐>  AND  fk_i_id_tbl_type_dates  =  1  

       -­‐>  AND  d_date  <  ‘2011-­‐04-­‐25'\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_bl_trovit_stats  

                 type:  range  

possible_keys:  unique_key,s_query_i_num_ads_salary_d_date,d_date  

                   key:  unique_key  

           key_len:  774  

                   ref:  NULL  

                 rows:  539

40

Condition is not using index, but using covering index

mysql>  EXPLAIN  SELECT  s_query  FROM  jobs_tbl_trovit_stats  

       -­‐>  WHERE  i_num_ads_salary  =  10\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats  

                 type:  index  

possible_keys:  NULL  

                   key:  s_query_i_num_ads_salary_d_date  

           key_len:  775  

                   ref:  NULL  

                 rows:  5543853  

               Extra:  Using  where;  Using  index

41

Full scan query, filter condition doesn’t use index

mysql>  EXPLAIN  SELECT  i_num_ads  FROM  jobs_tbl_trovit_stats  

       -­‐>  WHERE  i_num_ads_salary  =  10\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats  

                 type:  ALL  

possible_keys:  NULL  

                   key:  NULL  

           key_len:  NULL  

                   ref:  NULL  

                 rows:  5543853  

               Extra:  Using  where

42

Full scan query, retrieve all records

mysql>  EXPLAIN  SELECT  i_num_ads  FROM  jobs_tbl_trovit_stats\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats  

                 type:  ALL  

possible_keys:  NULL  

                   key:  NULL  

           key_len:  NULL  

                   ref:  NULL  

                 rows:  5543853  

               Extra:  NULL

43

EXPLAIN and the MySQL query optimizer

44

45

KEY  `a_b_c_d`  (`a`,`b`,`c`,`d`)  +  id  

KEY  `a_b_c_d`  (`a`,`b`,`c`,`d`)  +  id  

mysql>  EXPLAIN  SELECT  a,  b,  d  FROM  t  

       -­‐>  WHERE  a  =  1  ORDER  BY  id\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  t  

                 type:  ref  

possible_keys:  a,a_b,a_b_c_d  

                   key:  a_b_c_d  

           key_len:  4  

                   ref:  const  

                 rows:  199704  

               Extra:  Using  where;  Using  index;  Using  filesort  

200000  rows  in  set  (0.28  sec)

The importance of “using index” query optimizer choses wisely and favours the performance boost of

“using index” even if it’s forced to “using filesort”

46

KEY  `a`  (`a`)  +  id  

mysql>  EXPLAIN  SELECT  a,  b,  d  FROM  t  

       -­‐>  FORCE  KEY  (a)  

       -­‐>  WHERE  a  =  1  ORDER  BY  id\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  t  

                 type:  ref  

possible_keys:  a  

                   key:  a  

           key_len:  4  

                   ref:  const  

                 rows:  199704  

               Extra:  Using  where  

200000  rows  in  set  (0.53  sec)

The importance of “using index” if we choose to force an index we think it would be better for performance (in this case the

bad extra “using filesort” is not used), we might be mistaken and the query could be slower

47

mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  

       -­‐>  ORDER  BY  d_date\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  jobs_tbl_trovit_stats  

                 type:  ALL  

possible_keys:  NULL  

                   key:  NULL  

           key_len:  NULL  

                   ref:  NULL  

                 rows:  5543853  

               Extra:  Using  filesort  

5066959  rows  in  set  (3  min  27.96  sec)

But sometimes query optimizer is wrong full scan query, it prefers to read only the data file and do a filesort, instead of using an

index to first: read keys directly in order and second: access the data file to retrieve fields

48

mysql>  EXPLAIN  SELECT  *  FROM  jobs_tbl_trovit_stats  

       -­‐>  FORCE  KEY  (d_date)  

       -­‐>  ORDER  BY  d_date\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  homes_tbl_my_searches  

                 type:  index  

possible_keys:  NULL  

                   key:  d_date  

           key_len:  3  

                   ref:  NULL  

                 rows:  5543853  

5066959  rows  in  set  (18.11  sec)

But sometimes query optimizer is wrong same query, but this time is an index scan, with millions of records, is much better to force the use of an index that satisfies the sorting of records, even if we must do a second read

from the data file to retrieve the fields for every record

Optimising queriesBest practices

49

Denormalize dates don’t use DATE_FORMAT in WHERE, GROUP BY or ORDER BYmysql>  EXPLAIN  SELECT  SUM(f_revenue_in_euros*f_revenue_share/100)  AS  revenue,  

       -­‐>  DATE_FORMAT(d_date,  “%Y-­‐%m”)  FROM  tbl_publishers_stats  

       -­‐>  WHERE  fk_i_id_tbl_publishers  =  ‘1658'  

       -­‐>  GROUP  BY  YEAR(d_date),  MONTH(d_date)\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  tbl_publishers_stats  

                 type:  ALL  

possible_keys:  NULL  

                   key:  NULL  

           key_len:  NULL  

                   ref:  NULL  

                 rows:  323826  

               Extra:  Using  where;  Using  temporary;  Using  filesort

50

Denormalize dates use specific indexed fields for year, month, day, etc

mysql>  EXPLAIN  SELECT  SUM(f_revenue_in_euros*f_revenue_share/100)  AS  revenue,  

       -­‐>  i_year,  i_month  FROM  tbl_publishers_stats  

       -­‐>  WHERE  fk_i_id_tbl_publishers  =  ‘1658'  

       -­‐>  GROUP  BY  i_year,  i_month\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  tbl_publishers_stats_NEW  

                 type:  ref  

possible_keys:  publisher_year_month  

                   key:  publisher_year_month  

           key_len:  5  

                   ref:  const  

                 rows:  236  

               Extra:  Using  where

51

Avoid using “*”• Avoid using “*” in the field list of a SELECT

• Put only the fields you need in order to save unneeded disk reads

• Big extra optimisation if all the fields belong to the index used in the query (covering index)

• Exception for using “*”: count(*) must always be used instead of count(field_name):

• To avoid causing confusion to the query optimiser

• Avoids future errors if the query is modified and the field inside the count() function is no longer indexed

52

Page queries• Avoid running any long executing query

• Long queries block the execution of MySQL internal maintenance processes, like the purge history growing several gigabytes that would never shrink again

• Long INSERT, UPDATE and DELETE, in addition to the above, will delay replication to slaves

• Use LIMIT in queries of any type to page in smaller and faster blocks

Using LIMIT the right way• LIMIT filters final results of the query, only after

WHERE, GROUP BY and ORDER BY are processed

• Beware of GROUP BY and ORDER BY “using filesort”, all records are going to be file sorted before LIMIT could take effect

• Even when using an index, the LIMIT <offset>,<row_count> syntax will run slower and slower as the offset increments

54

Using LIMIT the right way always avoid “using filesort”

55

mysql>  EXPLAIN  SELECT  *  FROM  homes_tbl_my_searches  

       -­‐>  ORDER  BY  s_where  

       -­‐>  LIMIT  10\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  homes_tbl_my_searches  

                 type:  ALL  

possible_keys:  NULL  

                   key:  NULL  

           key_len:  NULL  

                   ref:  NULL  

                 rows:  746475  

               Extra:  Using  filesort  

10  rows  in  set  (4.00  sec)

Using LIMIT the right way always avoid “using filesort”

56

mysql>  EXPLAIN  SELECT  *  FROM  homes_tbl_my_searches  

       -­‐>  ORDER  BY  s_what  

       -­‐>  LIMIT  10\G  

***************************  1.  row  ***************************  

                     id:  1  

   select_type:  SIMPLE  

               table:  homes_tbl_my_searches  

                 type:  index  

possible_keys:  NULL  

                   key:  what_where  

           key_len:  NULL  

                   ref:  NULL  

                 rows:  10  

10  rows  in  set  (0.00  sec)

#  Slower  as  offset  increments  mysql>  SELECT  i_id  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  LIMIT  5000000,100;  100  rows  in  set  (0.92  sec)  

     #  Always  fast  mysql>  SELECT  i_id  FROM  jobs_tbl_trovit_stats_NEW          -­‐>  WHERE  i_id  >  $last_id          -­‐>  LIMIT  100;  100  rows  in  set  (0.00  sec)

Using LIMIT the right way• When paging, avoid LIMIT <offset>,<row_count>

it’s better to filter using a primary or unique key

57

LIMIT for long UPDATE and DELETE• Divide long UPDATE and DELETE queries in several

shorter executions using LIMIT

• If possible, use indexed fields to find records

58

#  Crontab  to  purge  old  records  Do  mysql>  DELETE  FROM  homes_tbl_my_searches          -­‐>  WHERE  dt_date  <  ‘2013-­‐12-­‐31’          -­‐>  LIMIT  1000;  While(rows  affected  >  0)  

#  One-­‐time  UPDATE,  not  worth  to  create  an  index  just  for  this  time  Do  mysql>  UPDATE  homes_tbl_my_searches  SET  i_active  =  0          -­‐>  WHERE  i_active  !=  0  AND  s_what  =  ‘offensive  stopword’          -­‐>  LIMIT  1000;  While(rows  changed  >  0)