MySQL break/fix lab

Matthias Crauwels / Pep Pla
Mon Apr 23rd 2018 - Percona Live, Santa Clara, CA, USA
© The Pythian Group Inc., 2018


Who are we?


Matthias Crauwels
● Living in Ghent, Belgium
● Bachelor Computer Science
● ~20 years Linux user / admin
● 10+ years PHP developer
● 5+ years MySQL DBA
● ~1 year at Pythian as MySQL Database Consultant
● Father of Leander


Pep Pla
Born in Vinaròs, a small village near the Mediterranean and currently living in Barcelona.

Most of the time I’m busy with my three kids, my partner and our two cats.

And in my spare time I’m a DBC at Pythian, surrounded by some of the most brilliant DBAs in the world.


ABOUT PYTHIAN

Pythian’s 400+ IT professionals help companies adopt and manage disruptive technologies to better compete


Years in Business: 20
Pythian Experts in 35 Countries: 400+
Current Clients Globally: 350+


AI / ML / BLOCKCHAIN

Intelligent analytics and decision making

Software autonomy

Disruptive data technologies

CLOUD MIGRATION & OPERATIONS

Plan, Migrate, Manage, Optimize, Innovate

Multi-cloud, Hybrid-Cloud, Cloud Native

ANALYTIC DATA SYSTEMS

Kick AaaS cloud-native, pre-packaged analytics platform

Custom analytics platform design, implementation and support services–for on-premises and cloud

Data science consulting and implementation services

OPERATIONAL DATA SYSTEMS

Database services–architecture to ongoing management

On prem and in the cloud

Oracle, MS SQL, MySQL, Cassandra, MongoDB, Hadoop, AWS/Azure/Google DBaaS


Other Pythian talks this conference

Running Database Services on DC/OS
Gabriel Ciciliani - Pythian
Room M1 - Tuesday 11:30AM - 12:20PM

Securing Your Data: All Steps for Encrypting Your MongoDB Database
Igor Donchovski - Pythian
Room F - Wednesday 11:00AM - 11:50AM

Hands on ProxySQL
René Cannaò - ProxySQL, Derek Downey - Pythian
Room 4 - Monday 09:30AM - 12:30PM

How to Scale MongoDB
Igor Donchovski - Pythian
Room F - Tuesday 11:30AM - 12:20PM


AGENDA

● Introductions
● Getting connected
● Basic MySQL troubleshooting
● Replication troubleshooting
● Advanced topics


We break it (already done)

You fix it… (but we’ll help you!)


Getting connected

We provide an EC2 instance for each of you

● Use an ssh client to connect to the instance
● Username: demo-user
● Password: plscdemo
● Don’t fix other things and follow the sequence of the slides
● One standalone MySQL instance
● Several MySQL instances using dbdeployer (https://github.com/datacharmer/dbdeployer)

IP address list: http://bit.ly/2HKJcsq
Command reference (for copy/paste): http://bit.ly/2HKNuQt


Basic MySQL troubleshooting


AGENDA

● MySQL instance not starting
  ○ misconfiguration
  ○ file permissions
  ○ corrupted files
● Connectivity issues
  ○ misconfiguration
  ○ recover lost password
  ○ server gone away


Starting mysqld

[root@mysql ~]# service mysqld start
Initializing MySQL database
2018-04-07T10:45:53.364642Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
mysqld: Can't create/write to file '/var/tmp/ibYTHEZv' (Errcode: 13 - Permission denied)
2018-04-07T10:45:53.367682Z 0 [ERROR] InnoDB: Unable to create temporary file; errno: 13
2018-04-07T10:45:53.367700Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2018-04-07T10:45:53.367704Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2018-04-07T10:45:53.367707Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2018-04-07T10:45:53.367710Z 0 [ERROR] Failed to initialize builtin plugins.
2018-04-07T10:45:53.367712Z 0 [ERROR] Aborting
Initialization of MySQL database failed.
Perhaps /etc/my.cnf is misconfigured.

[root@mysql ~]# ps -ef | grep mysqld
root 2711 2599 0 17:13 pts/0 00:00:00 grep --color=auto mysqld


Fixing tmp dir permissions

[root@mysql ~]# ls -ld /var/tmp/
drwxrwx--- 2 root root 4096 Apr 7 10:49 /var/tmp/

[root@mysql ~]# chmod a+rwx /var/tmp/

[root@mysql ~]# ls -ld /var/tmp/
drwxrwxrwx 2 root root 4096 Apr 7 10:49 /var/tmp/
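The chmod a+rwx on this slide makes /var/tmp world-writable, but it does not set the sticky bit that shared temporary directories conventionally carry (mode 1777, shown by ls as drwxrwxrwt). A minimal sketch of a stricter fix, assuming GNU coreutils stat; the helper name ensure_sticky_tmp is hypothetical and not part of the lab:

```shell
#!/usr/bin/env bash
# Hypothetical helper: make a shared temp directory world-writable with the
# sticky bit set (mode 1777), so users cannot delete each other's files.
ensure_sticky_tmp() {
    local dir="$1"
    local mode
    mode="$(stat -c '%a' "$dir")" || return 1   # GNU stat: octal permissions
    if [ "$mode" != "1777" ]; then
        chmod 1777 "$dir" || return 1
    fi
    stat -c '%a %A %n' "$dir"                   # prints mode, ls-style mode, path
}

# Usage (as root): ensure_sticky_tmp /var/tmp
```

Either mode lets mysqld create its temporary files; the sticky bit only matters for what other local users can do in the directory.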


Starting mysqld

[root@mysql ~]# service mysqld start
Initializing MySQL database
2018-04-07T10:51:50.441437Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2018-04-07T10:51:51.154035Z 0 [ERROR] InnoDB: mmap(137428992 bytes) failed; errno 12
2018-04-07T10:51:51.353940Z 0 [ERROR] InnoDB: Cannot allocate memory for the buffer pool
2018-04-07T10:51:51.354044Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2018-04-07T10:51:51.354105Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2018-04-07T10:51:51.354138Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2018-04-07T10:51:51.354168Z 0 [ERROR] Failed to initialize builtin plugins.
2018-04-07T10:51:51.354189Z 0 [ERROR] Aborting
Initialization of MySQL database failed.
Perhaps /etc/my.cnf is misconfigured.

[root@mysql ~]# ps -ef | grep mysqld
root 20954 19560 0 10:55 pts/0 00:00:00 grep --color=auto mysqld
[root@mysql ~]#


Fixing innodb_buffer_pool_size

[root@mysql ~]# perror 12
OS error code 12: Cannot allocate memory

[root@mysql ~]# free -m
                    total  used  free  shared  buffers  cached
Mem:                  993   188   804       0       24     102
-/+ buffers/cache:           61   931
Swap:                   0     0     0

[root@mysql ~]# grep innodb_buffer_pool_size /etc/my.cnf
innodb_buffer_pool_size = 100G

[root@mysql ~]# sed -i -e 's/100G/128M/' /etc/my.cnf
[root@mysql ~]# grep innodb_buffer_pool_size /etc/my.cnf
innodb_buffer_pool_size = 128M
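The diagnosis on this slide comes down to comparing the configured buffer pool size with the memory reported by free. A minimal sketch of that comparison, assuming a Linux host with /proc/meminfo; the function names are hypothetical helpers, not MySQL tooling:

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a my.cnf-style size (e.g. 128M, 100G) to bytes.
size_to_bytes() {
    local value="$1"
    local number="${value%[KkMmGg]}"     # strip one trailing unit letter, if any
    case "$value" in
        *[Kk]) echo $((number * 1024)) ;;
        *[Mm]) echo $((number * 1024 * 1024)) ;;
        *[Gg]) echo $((number * 1024 * 1024 * 1024)) ;;
        *)     echo "$number" ;;
    esac
}

# Succeeds when the configured size fits into the given amount of RAM (bytes).
pool_fits_in_ram() {
    local pool_bytes
    pool_bytes="$(size_to_bytes "$1")"
    [ "$pool_bytes" -le "$2" ]
}

# Physical memory in bytes, read from /proc/meminfo (Linux only).
mem_total_bytes() {
    awk '/^MemTotal:/ { printf "%.0f\n", $2 * 1024 }' /proc/meminfo
}

# Usage: pool_fits_in_ram 100G "$(mem_total_bytes)" || echo "buffer pool too large"
```

With the 993 MB shown by free, 100G fails the check and 128M passes. The check is only an upper bound: the server needs memory beyond the buffer pool.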


Starting mysqld

[root@mysql ~]# service mysqld start
Initializing MySQL database
2018-04-07T11:01:34.590140Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
 100 100
2018-04-07T11:01:38.031833Z 0 [Warning] InnoDB: New log files created, LSN=45790
2018-04-07T11:01:38.274845Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2018-04-07T11:01:38.338383Z 0 [ERROR] unknown variable 'tmpd1r=/var/tmp'
2018-04-07T11:01:38.338402Z 0 [ERROR] Aborting
Initialization of MySQL database failed.
Perhaps /etc/my.cnf is misconfigured.

[root@example ~]# ps -ef | grep mysqld
root 2711 2599 0 17:13 pts/0 00:00:00 grep --color=auto mysqld


Where is the config file?

[root@mysql ~]# grep tmpd /etc/my.cnf
[root@mysql ~]# grep tmpd /etc/mysql/my.cnf
grep: /etc/mysql/my.cnf: No such file or directory

Check the reference manual...
https://dev.mysql.com/doc/refman/5.7/en/option-files.html

… or find it yourself
strace


Strace

● The option “-e trace=open,stat” filters the long strace output down to the open and stat calls.

[root@mysql ~]# strace -e trace=open,stat /usr/sbin/mysqld
...
stat("/etc/my.cnf", {st_mode=S_IFREG|0644, st_size=569, ...}) = 0
open("/etc/my.cnf", O_RDONLY) = 3
stat("/etc/mysql/.my.cnf", {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
open("/etc/mysql/.my.cnf", O_RDONLY) = 4
stat("/etc/mysql/my.cnf", 0x7ffc96d78020) = -1 ENOENT (No such file or directory)
stat("/root/.my.cnf", 0x7ffc96d78020) = -1 ENOENT (No such file or directory)
...
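Once strace has revealed which option files the server reads, all of them can be searched in one pass instead of one grep per file. A minimal sketch; find_option is a hypothetical helper name, and the file list in the usage comment is taken from the strace output on this slide. On newer distributions the C library typically issues openat instead of open, so adding openat to the trace filter may be needed there.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search a list of candidate option files for a setting
# and print "file:line-number:line" for every match. Missing files are skipped.
find_option() {
    local pattern="$1"
    shift
    local file
    for file in "$@"; do
        [ -r "$file" ] || continue
        grep -Hn -- "$pattern" "$file"
    done
    return 0
}

# Usage, with the files seen in the strace output:
#   find_option tmpd /etc/my.cnf /etc/mysql/.my.cnf /etc/mysql/my.cnf /root/.my.cnf
```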


Strace: mysqld --print-defaults

[root@mysql ~]# strace -e stat /usr/sbin/mysqld --print-defaults
stat("/etc/my.cnf", {st_mode=S_IFREG|0644, st_size=569, ...}) = 0
stat("/etc/mysql/.my.cnf", {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
stat("/etc/mysql/my.cnf", 0x7ffecb574ca0) = -1 ENOENT (No such file or directory)
stat("/root/.my.cnf", 0x7ffecb574ca0) = -1 ENOENT (No such file or directory)
/usr/sbin/mysqld would have been started with the following arguments:
--datadir=/var/lib/msql --socket=/var/lib/mysql/mysql.sock --query_cache_type=0 --query_cache_size=0 --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_open_files=4096 --innodb_purge_threads=1 --innodb_log_file_size=128M --innodb_log_files_in_group=2 --innodb_buffer_pool_size=128M --symbolic-links=0 --tmpd1r=/var/tmp
+++ exited with 0 +++


Fixing tmpdir variable

[root@mysql ~]# cat /etc/mysql/.my.cnf
[mysqld]
tmpd1r=/var/tmp

[root@mysql ~]# sed -i -e 's/tmpd1r/tmpdir/' /etc/mysql/.my.cnf

[root@mysql ~]# cat /etc/mysql/.my.cnf
[mysqld]
tmpdir=/var/tmp


Strace: mysqld --print-defaults, again

[root@mysql ~]# strace -e stat /usr/sbin/mysqld --print-defaults
/usr/sbin/mysqld would have been started with the following arguments:
--datadir=/var/lib/msql --socket=/var/lib/mysql/mysql.sock --query_cache_type=0 --query_cache_size=0 --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_open_files=4096 --innodb_purge_threads=1 --innodb_log_file_size=128M --innodb_log_files_in_group=2 --innodb_buffer_pool_size=128M --symbolic-links=0 --tmpdir=/var/tmp
+++ exited with 0 +++


Fix DATADIR path

[root@mysql ~]# grep datadir /etc/my.cnf
datadir=/var/lib/msql

[root@mysql ~]# sed -i -e 's/datadir=\/var\/lib\/msql/datadir=\/var\/lib\/mysql/' /etc/my.cnf

[root@mysql ~]# grep datadir /etc/my.cnf
datadir=/var/lib/mysql
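The escaped slashes in the sed command on this slide are easy to mistype. A minimal sketch of the same edit with a different sed delimiter, assuming GNU sed (for -i) and a datadir= line without spaces around the equals sign, as in the grep output; set_datadir is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: set datadir in an option file. Using '|' as the sed
# delimiter avoids escaping every '/' in the path.
# Assumption: the new path contains no '|', '&' or backslash characters.
set_datadir() {
    local file="$1" new_dir="$2"
    sed -i -e "s|^datadir=.*|datadir=${new_dir}|" "$file"
}

# Usage (as root): set_datadir /etc/my.cnf /var/lib/mysql
```

Anchoring the pattern at the start of the line keeps the substitution away from other settings that happen to contain the same path.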


Starting mysqld

[root@mysql ~]# service mysqld start
Starting mysqld: [FAILED]

Wait, where is my error output?

[root@mysql ~]# grep log-error /etc/my.cnf
log-error=/var/log/mysqld.log


Examining the error log

[root@mysql ~]# tail -25 /var/log/mysqld.log
2018-04-07T11:18:21.283999Z 0 [Note] /usr/libexec/mysql57/mysqld (mysqld 5.7.21) starting as process 22111 ...
...
2018-04-07T11:18:21.289409Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2018-04-07T11:18:21.304322Z 0 [Note] InnoDB: Completed initialization of buffer pool
...
2018-04-07T11:18:21.317576Z 0 [ERROR] InnoDB: The innodb_system data file 'ibdata1' must be writable
2018-04-07T11:18:21.317590Z 0 [ERROR] InnoDB: The innodb_system data file 'ibdata1' must be writable
2018-04-07T11:18:21.317596Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
...
2018-04-07T11:18:21.918587Z 0 [ERROR] Aborting
...
2018-04-07T11:18:21.918874Z 0 [Note] /usr/libexec/mysql57/mysqld: Shutdown complete


Checking permissions

[root@mysql ~]# ls -hal /var/lib/mysql/
total 109M
drwxr-xr-x  5 mysql mysql 4.0K Apr  7 11:20 .
drwxr-xr-x 19 root  root  4.0K Apr  7 11:16 ..
-rw-r-----  1 mysql mysql   56 Apr  7 10:49 auto.cnf
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 ca-key.pem
-rw-r--r--  1 mysql mysql 1.1K Apr  7 10:49 ca.pem
-rw-r--r--  1 mysql mysql 1.1K Apr  7 10:49 client-cert.pem
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 client-key.pem
-rw-r-----  1 mysql mysql  350 Apr  7 10:49 ib_buffer_pool
-rw-r-----  1 42    42     12M Apr  7 10:49 ibdata1
-rw-r-----  1 42    42     48M Apr  7 10:49 ib_logfile0
-rw-r-----  1 42    42     48M Apr  7 10:49 ib_logfile1
drwxr-x---  2 mysql mysql 4.0K Apr  7 10:49 mysql
-rw-r--r--  1 mysql mysql    7 Apr  7 10:49 mysql_upgrade_info
drwxr-x---  2 mysql mysql 4.0K Nov 22 13:49 performance_schema
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 private_key.pem
-rw-r--r--  1 mysql mysql  452 Apr  7 10:49 public_key.pem
-rw-r--r--  1 mysql mysql 1.1K Apr  7 10:49 server-cert.pem
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 server-key.pem
drwxr-x---  2 mysql mysql  12K Apr  7 10:49 sys


Fixing file permissions

[root@mysql ~]# chown mysql:mysql /var/lib/mysql/ibdata1
[root@mysql ~]# chown mysql:mysql /var/lib/mysql/ib_logfile*

[root@mysql ~]# ls -hal /var/lib/mysql/ib*
-rw-r----- 1 mysql mysql  350 Apr 7 10:49 /var/lib/mysql/ib_buffer_pool
-rw-r----- 1 mysql mysql  12M Apr 7 10:49 /var/lib/mysql/ibdata1
-rw-r----- 1 mysql mysql  48M Apr 7 10:49 /var/lib/mysql/ib_logfile0
-rw-r----- 1 mysql mysql  48M Apr 7 10:49 /var/lib/mysql/ib_logfile1
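Spotting wrong owners by eye in an ls listing only covers one directory level. A minimal sketch that lists everything under the data directory with an unexpected owner or group in one pass; list_foreign_files is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list everything under a directory that is NOT owned by
# the expected user or group, so stray ownership shows up in one pass.
list_foreign_files() {
    local dir="$1" owner="$2" group="$3"
    find "$dir" \( ! -user "$owner" -o ! -group "$group" \) -print
}

# Usage (as root): list_foreign_files /var/lib/mysql mysql mysql
# If the list looks right, ownership could then be fixed with:
#   chown -R mysql:mysql /var/lib/mysql
```

Because find descends into subdirectories, a check like this would also report wrongly owned files inside schema directories such as /var/lib/mysql/mysql.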


Starting mysqld

[root@mysql ~]# service mysqld start
Starting mysqld: [FAILED]

[root@mysql ~]# tail -100 /var/log/mysqld.log
...
2018-04-07T11:55:57.123099Z 0 [Note] /usr/libexec/mysql57/mysqld (mysqld 5.7.21) starting as process 23322 ...
2018-04-07T11:55:57.397874Z 0 [ERROR] /usr/libexec/mysql57/mysqld: Can't find file: './mysql/user.frm' (errno: 13 - Permission denied)
2018-04-07T11:55:57.397903Z 0 [ERROR] Fatal error: Can't open and lock privilege tables: Can't find file: './mysql/user.frm' (errno: 13 - Permission denied)
2018-04-07T11:55:57.398000Z 0 [ERROR] Aborting


ERROR 13

[root@mysql ~]# perror 13
OS error code 13: Permission denied

[root@mysql ~]# ls -hl /var/lib/mysql/mysql/user.*
-rw-r----- 1 root root  11K Apr 7 10:49 /var/lib/mysql/mysql/user.frm
-rw-r----- 1 root root  340 Apr 7 10:49 /var/lib/mysql/mysql/user.MYD
-rw-r----- 1 root root 4.0K Apr 7 10:49 /var/lib/mysql/mysql/user.MYI

[root@mysql ~]# chown mysql:mysql /var/lib/mysql/mysql/user.*

[root@mysql ~]# ls -hl /var/lib/mysql/mysql/user.*
-rw-r----- 1 mysql mysql  11K Apr 7 10:49 /var/lib/mysql/mysql/user.frm
-rw-r----- 1 mysql mysql  340 Apr 7 10:49 /var/lib/mysql/mysql/user.MYD
-rw-r----- 1 mysql mysql 4.0K Apr 7 10:49 /var/lib/mysql/mysql/user.MYI


Starting mysqld

[root@mysql ~]# service mysqld start
Starting mysqld: [ OK ]

[root@mysql ~]# ps -ef | grep mysqld
root 22297 1 0 11:32 pts/0 00:00:00 /bin/sh /usr/libexec/mysql57/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
mysql 22645 22297 0 11:32 pts/0 00:00:00 /usr/libexec/mysql57/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql57/plugin --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
root 22719 19560 0 11:36 pts/0 00:00:00 grep --color=auto mysqld

YAY!


Let’s try connecting

[root@mysql ~]# mysql
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)

[root@mysql ~]# perror 2
OS error code 2: No such file or directory

[root@mysql ~]# ls -hl /tmp/mysql.sock
ls: cannot access /tmp/mysql.sock: No such file or directory
[root@mysql ~]#


Let’s try connecting

[root@mysql ~]# grep socket /var/log/mysqld.log | tail -n 1
Version: '5.7.21' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)

[root@mysql ~]# lsof -n | grep mysql | grep unix
mysqld 22645 mysql 19u unix 0xffff88003cf82400 0t0 97542 /var/lib/mysql/mysql.sock

[root@mysql ~]# grep -B 1 socket /etc/my.cnf
[client]
socket=/tmp/mysql.sock

[root@mysql ~]# sed -i -e 's/\/tmp\/mysql.sock/\/var\/lib\/mysql\/mysql.sock/' /etc/my.cnf


Let’s try connecting

[root@mysql ~]# mysql
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

[root@mysql ~]# strace -e trace=open mysql
...
open("/etc/my.cnf", O_RDONLY) = 3
open("/etc/mysql/.my.cnf", O_RDONLY) = 4
open("/root/.my.cnf", O_RDONLY) = 3
...
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
+++ exited with 1 +++

[root@mysql ~]# cat ~/.my.cnf
[client]
password=dummypass


Now what?

[root@mysql ~]# mysql --no-defaults
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

[root@mysql ~]# mysql -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)


Recovering root password

[root@mysql ~]# sed -i 's/\[mysqld\]/&\nskip-grant-tables/' /etc/my.cnf

[root@mysql ~]# cat /etc/my.cnf
[mysqld]
skip-grant-tables
...

[root@mysql ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
[root@mysql ~]#


Recovering root password

[root@mysql ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.21 MySQL Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER root@localhost IDENTIFIED WITH 'mysql_native_password' BY 'newpass';
ERROR 1290 (HY000): The MySQL server is running with the --skip-grant-tables option so it cannot execute this statement

mysql> UPDATE mysql.user SET plugin = 'mysql_native_password', authentication_string = PASSWORD('newpass') WHERE user='root';
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 1
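The ALTER USER statement on this slide is rejected only because the grant tables are not loaded. The MySQL 5.7 manual describes an alternative to the direct UPDATE: run FLUSH PRIVILEGES first, which reloads the grant tables, after which account-management statements work again. A minimal sketch that prints those two statements; reset_password_sql is a hypothetical helper name, and the password is assumed to contain no single quote or backslash:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SQL for resetting a password while the server
# runs with --skip-grant-tables. FLUSH PRIVILEGES reloads the grant tables, so
# ALTER USER is accepted afterwards.
# Assumption: the password contains no single quote or backslash.
reset_password_sql() {
    local user="$1" host="$2" new_password="$3"
    printf "FLUSH PRIVILEGES;\n"
    printf "ALTER USER '%s'@'%s' IDENTIFIED WITH mysql_native_password BY '%s';\n" \
        "$user" "$host" "$new_password"
}

# Usage: reset_password_sql root localhost newpass | mysql
```

This route also avoids the PASSWORD() function, which is deprecated in MySQL 5.7 and removed in 8.0.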


Recovering root password

Very insecure right now! With skip-grant-tables, any password is accepted:

[root@mysql ~]# mysql -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
...
mysql>

Remove skip-grant-tables again from /etc/my.cnf

[root@mysql ~]# sed -i 's/skip-grant-tables//' /etc/my.cnf
[root@mysql ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
[root@mysql ~]# sed -i 's/password=dummypass/password=newpass/' ~/.my.cnf
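The substitution on this slide empties the skip-grant-tables line but leaves a blank line behind, and it would also alter any comment that mentions the option. A minimal sketch that deletes only the exact line, assuming GNU sed; remove_skip_grant_tables is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove the skip-grant-tables line from an option file.
# The address matches the whole line (optional surrounding whitespace), so
# other settings and comments that merely mention the option are left alone.
remove_skip_grant_tables() {
    local file="$1"
    sed -i -e '/^[[:space:]]*skip-grant-tables[[:space:]]*$/d' "$file"
}

# Usage (as root): remove_skip_grant_tables /etc/my.cnf && service mysqld restart
```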


Great success!

[root@mysql ~]# cat ~/.my.cnf
[client]
password=newpass

[root@mysql ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.21 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>


Let’s check error log to make sure all is well

[root@mysql ~]# tail -50 /var/log/mysqld.log
...
2018-04-07T11:32:17.977681Z 0 [ERROR] Column count of performance_schema.events_waits_current is wrong. Expected 19, found 16. Created with MySQL 50551, now running 50721. Please use mysql_upgrade to fix this error.
2018-04-07T11:32:17.977791Z 0 [ERROR] Column count of performance_schema.events_waits_history is wrong. Expected 19, found 16. Created with MySQL 50551, now running 50721. Please use mysql_upgrade to fix this error.
2018-04-07T11:32:17.977851Z 0 [ERROR] Column count of performance_schema.events_waits_history_long is wrong. Expected 19, found 16. Created with MySQL 50551, now running 50721. Please use mysql_upgrade to fix this error.
...
2018-04-07T11:32:17.982462Z 0 [Note] /usr/libexec/mysql57/mysqld: ready for connections.
Version: '5.7.21' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)
2018-04-07T11:32:17.986730Z 0 [Note] InnoDB: Buffer pool(s) load completed at 180407 11:32:17

It’s running… but… not really in good shape...


mysql_upgrade

[root@mysql ~]# mysql_upgrade
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
...
mysql.user OK
The sys schema is already up to date (version 1.5.1).
Checking databases.
sys.sys_config OK
Upgrade process completed successfully.
Checking if update is needed.
[root@mysql ~]#


No more errors!

[root@mysql ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]

[root@mysql ~]# tail -50 /var/log/mysqld.log | grep -i error
[root@mysql ~]#


MySQL server has gone away

[root@mysql ~]# echo "SELECT '1234567890'" | mysql -N
1234567890

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 2` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N
12345678901234567890

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 400000` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N | wc
1 1 4000001

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 450000` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N | wc
ERROR 2006 (HY000) at line 1: MySQL server has gone away


max_allowed_packet

[root@mysql ~]# mysql -e "SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet'"
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 4194304 |
+--------------------+---------+

[root@mysql ~]# mysql -e "SET GLOBAL max_allowed_packet=5242880;"

[root@mysql ~]# mysql -e "SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet'"
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 5242880 |
+--------------------+---------+

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 450000` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N | wc
1 1 4500001
[root@mysql ~]#
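The failing and succeeding runs differ only in statement size. A minimal sketch of the arithmetic, with hypothetical helper names; it ignores the few bytes of protocol overhead, so it is an approximation rather than the server's exact check:

```shell
#!/usr/bin/env bash
# Length in bytes of the statement built by the echo loop:
# "SELECT '" (8 bytes) + repeats x "1234567890" (10 bytes each) + "'" (1 byte).
statement_bytes() {
    local repeats="$1"
    echo $((8 + repeats * 10 + 1))
}

# Succeeds when the statement is smaller than max_allowed_packet.
# Simplification: the few bytes of protocol overhead are ignored.
fits_in_packet() {
    local repeats="$1" max_allowed_packet="$2"
    [ "$(statement_bytes "$repeats")" -lt "$max_allowed_packet" ]
}

# Usage: fits_in_packet 450000 4194304 || echo "would exceed max_allowed_packet"
```

With 400000 repeats the statement is 4000009 bytes and fits under 4194304; with 450000 repeats it is 4500009 bytes and only fits after the limit is raised to 5242880. SET GLOBAL applies to new connections only and does not survive a restart in MySQL 5.7, so a permanent change also belongs in the [mysqld] section of the option file.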


Let’s simulate a mysqld crash

[root@mysql ~]# mysql -e "SELECT SLEEP(1000);" &
[1] 17979

[root@mysql ~]# kill -6 `pidof mysqld`
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query

[root@mysql ~]# tail -n 100 /var/log/mysqld.log
...
08:21:39 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
...

● Too big packet (max_allowed_packet)
● Server crashed (or killed)
  ○ OOM killer
  ○ Bugs
  ○ ...
● Session got terminated / killed
● Session timing out (wait_timeout)

Always check the logs:

● mysql error log is your best friend (log-error variable)
● dmesg, syslog or any core dumps may also contain info

Other reasons why “MySQL server has gone away” occurs
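
A minimal sketch of a first pass over those logs: it flags mysqld crash signals and kernel OOM kills in whatever text is piped in. The function name gone_away_hints is hypothetical, and the OOM pattern is an assumption about the kernel message wording, which varies between kernel versions.

```shell
#!/bin/sh
# Scan log text on stdin for hints about why mysqld went away.
# Assumption: the kernel OOM message contains "Out of memory: Kill process"
# or "Out of memory: Killed process" (wording differs between kernel versions).
gone_away_hints() {
    awk '
        /mysqld got signal/                { print "crash: " $0 }
        /Out of memory: Kill(ed)? process/ { print "oom: " $0 }
    '
}
```

Typical use would be something like `dmesg | gone_away_hints` or `tail -n 1000 /var/log/mysqld.log | gone_away_hints`, with the log path adjusted to the log-error setting.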

“I was low on disk space so I deleted some log files”

Not really a problem as long as you don’t restart MySQL

[root@mysql ~]# lsof | grep mysqld | grep ib_log
mysqld 30174 mysql 3uW REG 202,1 134217728 394960 /var/lib/mysql/ib_logfile0 (deleted)
mysqld 30174 mysql 8uW REG 202,1 134217728 394963 /var/lib/mysql/ib_logfile1 (deleted)

Recreated on restart

2018-04-14T01:21:34.032635Z 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 128 MB
2018-04-14T01:21:35.171245Z 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 128 MB
2018-04-14T01:21:37.273595Z 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2018-04-14T01:21:37.273646Z 0 [Warning] InnoDB: New log files created, LSN=125042225

… if MySQL was shut down cleanly and innodb_fast_shutdown is not set to 2!

https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_fast_shutdown

Accidental deletes - ib_logfile

“I was low on disk space so I deleted some log files”

When MySQL was not shut down cleanly (or crashed)

2018-04-14T01:24:45.872244Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery.
…
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.

Be prepared to recover from backup...

Accidental deletes - ib_logfile

“I was low on disk space so I deleted some log files”

Not really a problem as long as you don’t restart MySQL

[root@mysql ~]# lsof | grep mysqld | grep ib_log
mysqld 30174 mysql 3uW REG 202,1 134217728 394960 /var/lib/mysql/ib_logfile0 (deleted)
mysqld 30174 mysql 8uW REG 202,1 134217728 394963 /var/lib/mysql/ib_logfile1 (deleted)

It’s recoverable

[root@mysql ~]# ls -hal /proc/`pidof mysqld`/fd | grep ib_log
lrwx------ 1 root root 64 Apr 14 01:34 3 -> /var/lib/mysql/ib_logfile0 (deleted)
lrwx------ 1 root root 64 Apr 14 01:34 8 -> /var/lib/mysql/ib_logfile1 (deleted)

mysql> FLUSH TABLES WITH READ LOCK;
< leave session open and monitor SHOW ENGINE INNODB STATUS >
< wait until all logs are flushed >

[root@mysql ~]# cp /proc/`pidof mysqld`/fd/3 /var/lib/mysql/ib_logfile0
[root@mysql ~]# cp /proc/`pidof mysqld`/fd/8 /var/lib/mysql/ib_logfile1
[root@mysql ~]# chown mysql:mysql /var/lib/mysql/ib_logfile*

< Cleanly restart MySQL >

[root@mysql ~]# ls -hl /var/lib/mysql/ib_logfile*
-rw-r----- 1 mysql mysql 128M Apr 14 01:35 /var/lib/mysql/ib_logfile0
-rw-r----- 1 mysql mysql 128M Apr 14 01:35 /var/lib/mysql/ib_logfile1

Accidental deletes - ib_logfile
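
The copy step above can be generalised so the file descriptor numbers do not have to be read off by hand. This is only a sketch: it is Linux-only (it relies on /proc), the function name recover_deleted and its arguments are invented for illustration, and it does not replace the FLUSH TABLES WITH READ LOCK wait or the chown that the slide shows.

```shell
#!/bin/sh
# Copy every deleted-but-still-open file of a process whose path contains
# a pattern into a destination directory. Linux only (relies on /proc).
# The function name and arguments are illustrative, not part of any MySQL tool.
recover_deleted() {
    pid=$1
    pattern=$2
    dest=$3
    for fd in /proc/"$pid"/fd/*; do
        # Skip descriptors that are not symlinks or vanished in the meantime.
        target=$(readlink "$fd") || continue
        case $target in
            *"$pattern"*" (deleted)")
                name=$(basename "${target% (deleted)}")
                cp "$fd" "$dest/$name" && echo "recovered $name from fd ${fd##*/}"
                ;;
        esac
    done
}
```

For the case on the slide this would be called as `recover_deleted "$(pidof mysqld)" ib_logfile /var/lib/mysql`, followed by the chown and a clean restart.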

● ibdata files (ibdata1 or *.ibd in the data dir) contain your data
● They can be recovered in the same way as the ib_logfile, but...
● ...DON’T delete them! Really!

Accidental deletes - ibdata files

Replication troubleshooting

AGENDA

● Replication concepts

○ SBR/RBR

○ GTID

○ Replication threads

● Basic troubleshooting

○ Broken replication

○ Validate replication environment

● Advanced replication

○ Replay events

● Binlog
  ○ Sequential
  ○ Committed transactions
  ○ Statement format
  ○ Row format

● Replication
  ○ Slave retrieves transactions from the master binlog into the relay log
  ○ Slave applies transactions from the relay log

● GTID
  ○ Unique identifier for each transaction

● Replication threads
  ○ io_thread and sql_thread

Replication concepts

[demo-user@mysql ~]$ ./start_environment.sh

[demo-user@mysql ~]$ ./replication_step_1.sh

Please execute the following commands

Execution results

Please execute the following commands

[demo-user@mysql ~]$ ./start_environment.sh
# executing 'start' on /home/demo-user/sandboxes/gtid-repl
executing 'start' on master
. sandbox server started
executing 'start' on slave 1
. sandbox server started
# executing 'start' on /home/demo-user/sandboxes/normal-repl
executing 'start' on master
. sandbox server started
executing 'start' on slave 1
. sandbox server started
[demo-user@mysql ~]$ ./replication_step_1.sh
sleep(1)
0

● dbdeployer
  ○ https://github.com/datacharmer/dbdeployer
  ○ created by Giuseppe Maxia (The Data Charmer)
  ○ rewrite of MySQL Sandbox in Go
  ○ not for production instances
  ○ allows you to quickly set up test instances:
    ■ no root access required
    ■ supports replication topologies
    ■ supports GTID

Replication tools

[root@mysql ~]# dbdeployer
dbdeployer makes MySQL server installation an easy task.
Runs single, multiple, and replicated sandboxes.

Usage: dbdeployer [command]

Available Commands:
  admin       sandbox management tasks
  defaults    tasks related to dbdeployer defaults
  delete      delete an installed sandbox
  deploy      deploy sandboxes
  global      Runs a given command in every sandbox
  help        Help about any command
  sandboxes   List installed sandboxes
  unpack      unpack a tarball into the binary directory
  usage       Shows usage of installed sandboxes
  versions    List available versions

Flags:
      --config string           configuration file (default "/root/.dbdeployer/config.json")
  -h, --help                    help for dbdeployer
      --sandbox-binary string   Binary repository (default "/root/opt/mysql")
      --sandbox-home string     Sandbox deployment direcory (default "/root/sandboxes")
      --version                 version for dbdeployer

Use "dbdeployer [command] --help" for more information about a command.

dbdeployer

[demo-user@mysql ~]$ dbdeployer versions
Basedir: /home/demo-user/opt/mysql
5.7.21

[demo-user@mysql ~]$ dbdeployer sandboxes
gtid-repl : master-slave 5.7.21 [16747 16748]
normal-repl : master-slave 5.7.21 [16743 16744]

[demo-user@mysql ~]$ dbdeployer global status
# Running "status_all" on gtid-repl
REPLICATION /home/demo-user/sandboxes/gtid-repl
master : master off - (16747)
node1 : node1 off - (16748)

# Running "status_all" on normal-repl
REPLICATION /home/demo-user/sandboxes/normal-repl
master : master off - (16743)
node1 : node1 off - (16744)

[demo-user@mysql ~]$ dbdeployer global start

dbdeployer

[demo-user@mysql ~]$ cd sandboxes/normal-repl/
[demo-user@mysql normal-repl]$ ls -hl
total 76K
-rwxr--r-- 1 demo-user demo-user 1.4K Apr 8 06:53 check_slaves
-rwxr--r-- 1 demo-user demo-user  993 Apr 8 06:53 clear_all
-rwxr--r-- 1 demo-user demo-user 1.3K Apr 8 06:53 initialize_slaves
-rwxr--r-- 1 demo-user demo-user  807 Apr 8 06:53 m
drwxr-xr-x 4 demo-user demo-user 4.0K Apr 8 06:53 master
-rwxr--r-- 1 demo-user demo-user  807 Apr 8 06:53 n1
-rwxr--r-- 1 demo-user demo-user  805 Apr 8 06:53 n2
drwxr-xr-x 4 demo-user demo-user 4.0K Apr 8 06:53 node1
-rwxr--r-- 1 demo-user demo-user  839 Apr 8 06:53 restart_all
-rwxr--r-- 1 demo-user demo-user  805 Apr 8 06:53 s1
-rw-rw-r-- 1 demo-user demo-user  169 Apr 8 06:53 sbdescription.json
-rwxr--r-- 1 demo-user demo-user  982 Apr 8 06:53 send_kill_all
-rwxr--r-- 1 demo-user demo-user 1.1K Apr 8 06:53 start_all
-rwxr--r-- 1 demo-user demo-user 1.2K Apr 8 06:53 status_all
-rwxr--r-- 1 demo-user demo-user  956 Apr 8 06:53 stop_all
-rwxr--r-- 1 demo-user demo-user 4.5K Apr 8 06:53 test_replication
-rwxr--r-- 1 demo-user demo-user 1.1K Apr 8 06:53 test_sb_all
-rwxr--r-- 1 demo-user demo-user  978 Apr 8 06:53 use_all

dbdeployer

[demo-user@mysql normal-repl]$ ./m
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.21-log MySQL Community Server (GPL)

...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost] {msandbox} ((none)) >

[demo-user@mysql normal-repl]$ ./s1
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.21-log MySQL Community Server (GPL)

...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

slave1 [localhost] {msandbox} ((none)) >

dbdeployer

- Show slave status

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 16743
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 5141
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 476
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 5141
Relay_Log_Space: 679
Seconds_Behind_Master: 0
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Master_Server_Id: 100
Master_UUID: 00016743-1111-1111-1111-111111111111
Master_Info_File: /home/vagrant/sandboxes/pl18/node1/data/master.info
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
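
Most of that output only matters once something is wrong. A minimal sketch of a filter that reduces SHOW SLAVE STATUS\G output to the two thread states and the last errors; the function name slave_health is hypothetical and the field names are the ones shown above.

```shell
#!/bin/sh
# Summarise SHOW SLAVE STATUS\G output read from stdin.
# Prints "OK" when both threads report Yes, otherwise the thread states
# and any non-empty Last_IO_Error / Last_SQL_Error values.
slave_health() {
    awk '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            sep = index(line, ":")
            if (sep == 0) next
            key = substr(line, 1, sep - 1)
            val = substr(line, sep + 1)
            sub(/^[ \t]+/, "", val)
            sub(/[ \t\r]+$/, "", val)
            if (key == "Slave_IO_Running")  io = val
            if (key == "Slave_SQL_Running") sql = val
            if (key == "Last_IO_Error")     io_err = val
            if (key == "Last_SQL_Error")    sql_err = val
        }
        END {
            if (io == "Yes" && sql == "Yes") { print "OK"; exit }
            print "IO thread: " io ", SQL thread: " sql
            if (io_err != "")  print "Last_IO_Error: " io_err
            if (sql_err != "") print "Last_SQL_Error: " sql_err
        }
    '
}
```

In the sandbox this could be fed with something like `./s1 -e 'show slave status\G' | slave_health`, assuming the s1 wrapper passes its arguments through to the mysql client.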

- Show master status

- Show slave hosts

Basic Replication Troubleshooting

master [localhost] {root} ((none)) > show master status\G
*************************** 1. row ***************************
File: mysql-bin.000001
Position: 5141
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)

master [localhost] {root} ((none)) > show slave hosts\G
*************************** 1. row ***************************
Server_id: 300
Host:
Port: 16745
Master_id: 100
Slave_UUID: 00016745-3333-3333-3333-333333333333
*************************** 2. row ***************************
Server_id: 200
Host:
Port: 16744
Master_id: 100
Slave_UUID: 00016744-2222-2222-2222-222222222222
2 rows in set (0.00 sec)

- Error log

- Replication catalog tables

Basic Replication Troubleshooting

2018-04-17T11:41:11.568378Z 19 [Note] 'CHANGE MASTER TO FOR CHANNEL '' executed'. Previous state master_host='127.0.0.1', master_port= 16743, master_log_file='mysql-bin.000001', master_log_pos= 5141, master_bind=''. New state master_host='127.0.0.1', master_port= 16743, master_log_file='mysql-bin.000001', master_log_pos= 4985, master_bind=''.
2018-04-17T11:41:17.104913Z 31 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.

slave1 [localhost] {root} (mysql) > select * from slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 25
Master_log_name: mysql-bin.000003
Master_log_pos: 1507
Host: 127.0.0.1
User_name: rsandbox
User_password: rsandbox
Port: 16743
Connect_retry: 60
...

- Connectivity
  - Verify access
  - Check master configuration

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Connecting to master

Error log:

2018-04-19T19:50:44.486470Z 8 [ERROR] Slave I/O for channel '': error connecting to master 'rsandbox@127.0.0.1:3306' - retry-time: 60 retries: 1, Error_code: 2003

Slave:

[demo-user@mysql node1]$ nc -vw 10 127.0.0.1 3306
nc: connect to 127.0.0.1 port 3306 (tcp) failed: Connection refused

Master:

show global variables like 'port';

- Connectivity (cont)
  - Fix the port

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > stop slave io_thread;

Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {root} ((none)) > change master to master_port=16743;

Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {root} ((none)) > start slave io_thread;

Query OK, 0 rows affected (0.00 sec)

- Wrong privileges/credentials

Basic Replication Troubleshooting

Error log

2018-04-19T20:48:35.674488Z 26 [ERROR] Slave I/O for channel '': error connecting to master 'rsandbox@127.0.0.1:16743' - retry-time: 60 retries: 1, Error_code: 1045

- Wrong privileges/credentials (cont)

Basic Replication Troubleshooting

[demo-user@mysql normal-repl]$ ./master/use -u rsandbox -prsandbox
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'rsandbox'@'localhost' (using password: YES)

master [localhost] {root} ((none)) > select user,host,authentication_string from mysql.user where user='rsandbox';
+----------+-------+-------------------------------------------+
| user     | host  | authentication_string                     |
+----------+-------+-------------------------------------------+
| rsandbox | 127.% | *B07EB15A2E7BD9620DAE47B194D5B9DBA14377AD |
+----------+-------+-------------------------------------------+
1 row in set (0.00 sec)

master [localhost] {root} ((none)) > select password('rsandbox');
+-------------------------------------------+
| password('rsandbox')                      |
+-------------------------------------------+
| *B07EB15A2E7BD9620DAE47B194D5B9DBA14377AD |
+-------------------------------------------+
1 row in set, 1 warning (0.00 sec)

- Wrong privileges/credentials (cont)

Basic Replication Troubleshooting

master [localhost] {root} ((none)) > show grants for rsandbox@'127.%';
+------------------------------------------------------+
| Grants for rsandbox@127.%                            |
+------------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'rsandbox'@'127.%' |
+------------------------------------------------------+
1 row in set (0.00 sec)

master [localhost] {root} ((none)) > select user,host from mysql.user;
+---------------+-----------+
| user          | host      |
+---------------+-----------+
| msandbox      | 127.%     |
| msandbox_ro   | 127.%     |
| msandbox_rw   | 127.%     |
| rsandbox      | 127.%     |
|               | 127.0.0.1 |
| msandbox      | localhost |
...

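- Wrong privileges/credentials (cont)
  - The anonymous account ''@'127.0.0.1' has a more specific host than 'rsandbox'@'127.%', so MySQL matches it first and rejects the rsandbox password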

Basic Replication Troubleshooting

master [localhost] {root} ((none)) > drop user ''@'127.0.0.1';
Query OK, 0 rows affected (0.00 sec)

- Server id

Basic Replication Troubleshooting

2018-04-19T21:02:22.375910Z 30 [ERROR] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Misconfigured master - master server_id is 0', Error_code: 1236
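
Each of the error-log lines in this lab ends with an Error_code, and that number is usually the quickest pointer to where to look. A small sketch that extracts the code and maps the ones seen in this lab to a hint; both function names and the hint wording are made up for illustration.

```shell
#!/bin/sh
# Map the replication error codes that appear in this lab to a short hint.
explain_slave_error() {
    case $1 in
        1045) echo "access denied: check the replication user, host and password" ;;
        1062) echo "duplicate key: master and slave data have diverged" ;;
        1236) echo "fatal error reading the master binary log" ;;
        2003) echo "cannot connect: check master host, port and firewall" ;;
        *)    echo "unknown code $1: look it up in the MySQL error reference" ;;
    esac
}

# Pull the numeric code out of a log line such as "... Error_code: 2003".
error_code_of() {
    printf '%s\n' "$1" | sed -n 's/.*Error_code: \([0-9][0-9]*\).*/\1/p'
}
```

For example, `explain_slave_error "$(error_code_of "$line")"` turns a raw log line held in the variable line into one of the hints above.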

- Server id (cont)

Basic Replication Troubleshooting

master [localhost] {root} ((none)) > set global server_id=100;
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {root} ((none)) > start slave io_thread;
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {root} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 16743
...

- Skipping events/Filtering replication

Basic Replication Troubleshooting

Last_SQL_Error: Could not execute Write_rows event on table pl18.test_table; Duplicate entry '5' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000002, end_log_pos 2748

- Skipping events/Filtering replication (cont)
  - Diagnose issue
  - Decide strategy
    - Data from master is good (recommended)
    - Data from slave is good (not recommended)

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > select * from pl18.test_table where ident=5;
+-------+--------------------------+---------------------+
| ident | text                     | timestamp           |
+-------+--------------------------+---------------------+
|     5 | Good data, do not remove | 2018-04-21 08:17:03 |
+-------+--------------------------+---------------------+
1 row in set (0.00 sec)

- Skipping events/Filtering replication (cont)

Basic Replication Troubleshooting

[demo-user@mysql node1]$ mysqlbinlog --base64-output=decode-rows --verbose --start-position 2704 --stop-position 2992 /home/demo-user/sandboxes/normal-repl/node1/data/mysql-relay.000002
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 2704
#180421 8:21:09 server id 0 end_log_pos 2556 CRC32 0x2601842e Anonymous_GTID last_committed=9 sequence_number=10 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 2769
#180421 8:21:09 server id 0 end_log_pos 2628 CRC32 0x4b9b9f85 Query thread_id=14 exec_time=0 error_code=0
SET TIMESTAMP=1524298869/*!*/;
SET @@session.pseudo_thread_id=14/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;

- Skipping events/Filtering replication (cont)

Basic Replication Troubleshooting

BEGIN
/*!*/;
# at 2841
#180421 8:21:09 server id 0 end_log_pos 2686 CRC32 0x7ccbff02 Table_map: `pl18`.`test_table` mapped to number 109
# at 2899
#180421 8:21:09 server id 0 end_log_pos 2748 CRC32 0x36e491e1 Write_rows: table id 109 flags: STMT_END_F
### INSERT INTO `pl18`.`test_table`
### SET
###   @1=5
###   @2='3VHQ8D28CH1X'
###   @3=1524298869
# at 2961
#180421 8:21:09 server id 0 end_log_pos 2779 CRC32 0x73cb18aa Xid = 82
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

- Skipping events/Filtering replication (cont)

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > set global sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)

slave1 [localhost] {root} ((none)) > start slave;
Query OK, 0 rows affected (0.01 sec)

- Skipping events/Filtering replication (cont)

Basic Replication Troubleshooting

Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:

In my.cnf

slave-skip-errors = <error_code>,<error_code>

--slave-skip-errors=1062,1053
--slave-skip-errors=all
--slave-skip-errors=ddl_exist_errors

- And now for something completely different:

Basic Replication Troubleshooting

Please execute the following script:

./replication_step_2.sh

- Relay log corruption

Basic Replication Troubleshooting

Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.

Relay_Log_File: mysql-relay.000002

- Relay log corruption (cont)

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > show relaylog events;
+--------------------+-----+----------------+-----------+-------------+---------------------------------------+
| Log_name           | Pos | Event_type     | Server_id | End_log_pos | Info                                  |
+--------------------+-----+----------------+-----------+-------------+---------------------------------------+
| mysql-relay.000001 |   4 | Format_desc    |       200 |         123 | Server ver: 5.7.21-log, Binlog ver: 4 |
| mysql-relay.000001 | 123 | Previous_gtids |       200 |         154 |                                       |
| mysql-relay.000001 | 154 | Rotate         |       200 |         203 | mysql-relay.000002;pos=4              |
+--------------------+-----+----------------+-----------+-------------+---------------------------------------+
3 rows in set (0.00 sec)

- Relay log corruption (cont)

Basic Replication Troubleshooting

slave1 [localhost] {root} ((none)) > change master to master_log_file='mysql-bin.000002', master_log_pos=595133;

Query OK, 0 rows affected (0.02 sec)

The value for master_log_file is Relay_Master_Log_File from show slave status

The value for master_log_pos is Exec_Master_Log_Pos from show slave status
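
Copying those two values by hand is error-prone, so the statement can be generated from the status output instead. A sketch under stated assumptions: the function name change_master_from_status is hypothetical, it only prints the statement (it does not run it), and replication still has to be stopped before the statement is executed.

```shell
#!/bin/sh
# Build a CHANGE MASTER TO statement from SHOW SLAVE STATUS\G output on stdin,
# using Relay_Master_Log_File and Exec_Master_Log_Pos as described above.
# Prints nothing and returns non-zero if either field is missing or malformed.
change_master_from_status() {
    awk -v q="'" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            sub(/[ \t\r]+$/, "", line)
            sep = index(line, ": ")
            if (sep == 0) next
            key = substr(line, 1, sep - 1)
            val = substr(line, sep + 2)
            if (key == "Relay_Master_Log_File") file = val
            if (key == "Exec_Master_Log_Pos")   pos = val
        }
        END {
            if (file == "" || pos !~ /^[0-9]+$/) exit 1
            printf "CHANGE MASTER TO master_log_file=%s%s%s, master_log_pos=%s;\n", q, file, q, pos
        }
    '
}
```

Reviewing the printed statement before pasting it into the slave session keeps a human in the loop, which seems the safer design for a repair step like this one.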

- Change position/master

Basic Replication Troubleshooting

[Diagram: before, A replicates to B, C and D; after, B replicates to C, D and A]

- Change position/master (cont)

Basic Replication Troubleshooting

[Diagram: A replicates to B; B replicates to C and D]

Make sure log_slave_updates is enabled on B.

- Change position/master (cont)

Basic Replication Troubleshooting

For each slave to reposition:

1. Stop replication in that slave.
2. Stop sql_thread in the (future) new master.
3. Retrieve current binlog position in the future new master.

slave1 [localhost] {root} ((none)) > show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      454 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

4. On the future new master, retrieve its position relative to the current master using show slave status. The relevant fields are (again) Relay_Master_Log_File and Exec_Master_Log_Pos. Now you can restart replication in the future new master.

5. Restart replication in the slave to move, but up to the position retrieved in the previous step: start slave sql_thread until master_log_file=<log_file>, master_log_pos=<position>

6. Once the slave has reached that position, issue a change master to master_host=<new_master>, master_log_file=<file_from_step3>, master_log_pos=<position_from_step3>;

- Change position/master (cont)

Basic Replication Troubleshooting

Once we have all the slaves pointing to the future new master, we need to promote it to the master role:

1. Make the current master read_only to make sure no more changes are written. Retrieve its current position with show master status.
2. Once the future master has reached that position, retrieve its position with show master status.
3. Force the new master to stop replicating by issuing stop slave and reset slave all. Disable read_only.
4. Execute a change master in the former master with all the required parameters.
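Deciding whether the future master "has reached that position" means comparing two binlog coordinates. A minimal sketch, assuming binlog file names end in a numeric sequence (as in mysql-bin.000004); the function names are hypothetical:

```python
def binlog_coord(file_name, position):
    """Turn ('mysql-bin.000004', 3410) into a comparable (4, 3410) tuple."""
    return int(file_name.rsplit(".", 1)[1]), int(position)


def has_caught_up(master_file, master_pos, relay_master_log_file, exec_master_log_pos):
    """True when the replica has executed everything up to the master's position.

    master_file/master_pos come from show master status on the read_only master;
    the other two values come from show slave status on the future master.
    """
    executed = binlog_coord(relay_master_log_file, exec_master_log_pos)
    target = binlog_coord(master_file, master_pos)
    return executed >= target


print(has_caught_up("mysql-bin.000004", 3410, "mysql-bin.000004", 3410))
```

Tuples compare element by element, so a later file always wins regardless of the position inside it.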


- GTID replication
  - No explicit position
  - Each transaction has a unique identifier (universally unique)
    - Identifies the origin server across the whole replication chain
    - Identifies the transaction sequence (only committed transactions have a transaction sequence). No gaps in the sequence.
  - Slaves have a record of transactions executed and transactions missing.
  - No master file or position is required

- PROS
  - No more position/file needed

- CONS
  - GTIDs are not purged
  - Any transaction executed will have a GTID associated, and it is possible that it will be replicated at any time in the future.
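To make the "record of transactions executed" concrete, here is a small sketch that parses a GTID set as it appears in Executed_Gtid_Set (uuid:first-last, with optional extra intervals) and checks whether a single GTID is in it. The helper names are hypothetical and the sample UUID is only sample data.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-28:30,uuid2:5' into {uuid: [(first, last), ...]}."""
    intervals = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        for item in ranges:
            first, _, last = item.partition("-")
            # A single number such as '30' is the interval 30-30.
            intervals.setdefault(uuid, []).append((int(first), int(last or first)))
    return intervals


def is_executed(gtid_set, gtid):
    """True when gtid ('uuid:number') falls inside one of the parsed intervals."""
    uuid, number = gtid.rsplit(":", 1)
    return any(low <= int(number) <= high for low, high in gtid_set.get(uuid, []))


UUID = "00016747-1111-1111-1111-111111111111"
executed = parse_gtid_set(UUID + ":1-28")
print(is_executed(executed, UUID + ":28"), is_executed(executed, UUID + ":29"))
```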

Basic Replication Troubleshooting


- GTID replication (cont)

Basic Replication Troubleshooting

./sandboxes/gtid-repl/node1/use -u root

Could not execute Write_rows event on table pl18.test_table; Duplicate entry '5' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000004, end_log_pos 3410


- GTID replication (cont)

Basic Replication Troubleshooting

Skipping a replication event in GTID replication requires injecting an empty event into the replication flow.

But fixing the slave can also have side effects. The fix issues a transaction; this creates a new GTID on the slave that could be replicated to subslaves, or to slaves if the server is promoted to master.


- GTID replication (cont)

Basic Replication Troubleshooting

Skipping a replication event in GTID replication requires injecting an empty event into the replication flow.

Show slave status:

Executed_Gtid_Set: 00016747-1111-1111-1111-111111111111:1-28


- GTID replication (cont)

Basic Replication Troubleshooting

mysqlbinlog --base64-output=decode-rows --verbose --start-position 2495 /home/demo-user/sandboxes/gtid-repl/node1/data/mysql-relay.000012 | less

Get the start position from show slave status:

Relay_Log_File: mysql-relay.000012

Relay_Log_Pos: 2495


- GTID replication (cont)

Basic Replication Troubleshooting

[demo-user@mysql ~]$ mysqlbinlog --base64-output=decode-rows --verbose --start-position 2495 /home/demo-user/sandboxes/gtid-repl/node1/data/mysql-relay.000012 | head -100
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 2495
#180423 15:17:33 server id 100 end_log_pos 3218 CRC32 0x1c670834 GTID last_committed=12 sequence_number=13 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00016747-1111-1111-1111-111111111111:29'/*!*/;
# at 2560
#180423 15:17:33 server id 100 end_log_pos 3290 CRC32 0xfcd836ef Query thread_id=10 exec_time=0 error_code=0
SET TIMESTAMP=1524496653/*!*/;
SET @@session.pseudo_thread_id=10/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 2632
#180423 15:17:33 server id 100 end_log_pos 3348 CRC32 0x8245d434 Table_map: `pl18`.`test_table` mapped to number 109
# at 2690
#180423 15:17:33 server id 100 end_log_pos 3410 CRC32 0x1a04c47d Write_rows: table id 109 flags: STMT_END_F
### INSERT INTO `pl18`.`test_table`
### SET
###   @1=5
###   @2='2SLM045U1S5K'
###   @3=1524496653
# at 2752


- GTID replication (cont)

Basic Replication Troubleshooting

STOP SLAVE;
SET GTID_NEXT= '00016747-1111-1111-1111-111111111111:29';
BEGIN;
COMMIT;
SET GTID_NEXT="AUTOMATIC";
START SLAVE;
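The empty-transaction sequence above is the same for any GTID, so it can be generated. A minimal sketch with a hypothetical helper name; it only validates the GTID shape and builds the statements, it does not talk to MySQL.

```python
import re

# uuid (8-4-4-4-12 hex digits), a colon, then a positive transaction number.
GTID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*"
)


def skip_gtid_statements(gtid):
    """Return the statements that inject an empty transaction for one GTID."""
    if not GTID_PATTERN.fullmatch(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET GTID_NEXT='{gtid}';",
        "BEGIN;",
        "COMMIT;",
        "SET GTID_NEXT='AUTOMATIC';",
        "START SLAVE;",
    ]


print("\n".join(skip_gtid_statements("00016747-1111-1111-1111-111111111111:29")))
```

Rejecting anything that is not a single GTID guards against pasting a whole Executed_Gtid_Set value by mistake.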


- GTID replication (cont)

Basic Replication Troubleshooting

When fixing replication on the slave, run set sql_log_bin=FALSE before executing any statement whose changes you don’t want to replicate.
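One way to avoid forgetting this is to always wrap manual fixes in the same envelope. A sketch with a hypothetical helper; the DELETE in the usage line is an illustrative fix, not one prescribed by the lab.

```python
def without_binlog(statements):
    """Wrap fix statements so the session does not write them to the binary log."""
    return [
        "SET sql_log_bin = 0;",  # session scope: only this connection stops logging
        *statements,
        "SET sql_log_bin = 1;",  # restore logging before doing anything else
    ]


# Illustrative fix for a duplicate-key error on the slave.
for statement in without_binlog(["DELETE FROM pl18.test_table WHERE ident = 5;"]):
    print(statement)
```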


- Diagnose replication inconsistencies.

Advanced Replication Troubleshooting

pt-table-checksum --replicate=percona.checksums --ignore-databases mysql,sys,percona h=127.0.0.1,u=msandbox,p=msandbox,P=16743 --recursion-method dsn=h=127.0.0.1,P=16743,D=percona,t=dsns --nocheck-binlog-format


- Diagnose replication inconsistencies (cont.)

Advanced Replication Troubleshooting

[demo-user@mysql node1]$ pt-table-checksum --replicate=percona.checksums --ignore-databases mysql,sys,percona h=127.0.0.1,u=msandbox,p=msandbox,P=16743 --recursion-method dsn=h=127.0.0.1,P=16743,D=percona,t=dsns --nocheck-binlog-format
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
04-23T16:06:58      0      1     4429       4       0   0.060 pl18.test_table
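With many tables it helps to pull out only the rows where DIFFS is non-zero. A minimal sketch that reads the report text and looks columns up by the header names, so an extra column in other tool versions would not shift the indexes; the function name is hypothetical.

```python
def tables_with_diffs(report):
    """Return the TABLE values of pt-table-checksum rows whose DIFFS is > 0."""
    header = None
    result = []
    for line in report.splitlines():
        fields = line.split()
        if fields[:1] == ["TS"]:
            header = fields  # e.g. TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
        elif header and len(fields) == len(header):
            row = dict(zip(header, fields))
            if int(row["DIFFS"]) > 0:
                result.append(row["TABLE"])
    return result


sample = """Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
04-23T16:06:58      0      1     4429       4       0   0.060 pl18.test_table
"""
print(tables_with_diffs(sample))
```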


- Diagnose replication inconsistencies (cont.)

Advanced Replication Troubleshooting

SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE (
  master_cnt <> this_cnt
  OR master_crc <> this_crc
  OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;

+------+------------+------------+--------+
| db   | tbl        | total_rows | chunks |
+------+------------+------------+--------+
| pl18 | test_table |       1000 |      1 |
+------+------------+------------+--------+


- Fix replication inconsistencies.

Advanced Replication Troubleshooting

[demo-user@mysql ~]$ pt-table-sync --dry-run --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox

[demo-user@mysql ~]$ pt-table-sync --print --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox


- Fix replication inconsistencies.

Advanced Replication Troubleshooting

[demo-user@mysql ~]$ pt-table-sync --dry-run --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
# NOTE: --dry-run does not show if data needs to be synced because it
#       does not access, compare or sync data.  --dry-run only shows
#       the work that would be done.
# Syncing via replication P=16744,h=127.0.0.1,p=...,u=msandbox in dry-run mode, without accessing or comparing data
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 Chunk     16:23:58 16:23:58 0    pl18.test_table

[demo-user@mysql ~]$ pt-table-sync --print --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
REPLACE INTO `pl18`.`test_table`(`ident`, `text`, `timestamp`) VALUES ('5', '3VHQ8D28CH1X', '2018-04-21 08:21:09') /*percona-toolkit src_db:pl18 src_tbl:test_table src_dsn:P=16743,h=127.0.0.1,p=...,u=msandbox dst_db:pl18 dst_tbl:test_table dst_dsn:P=16744,h=127.0.0.1,p=...,u=msandbox lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15150 user:demo-user host:mysql*/;


- Fix replication inconsistencies.

Advanced Replication Troubleshooting

pt-table-sync --execute --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox

[demo-user@mysql ~]$ pt-table-checksum --replicate=percona.checksums --ignore-databases mysql,sys,percona h=127.0.0.1,u=msandbox,p=msandbox,P=16743 --recursion-method dsn=h=127.0.0.1,P=16743,D=percona,t=dsns --nocheck-binlog-format
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
04-23T16:30:18      0      0     4429       4       0   0.063 pl18.test_table


- Make your life easier: orchestrator
  - https://github.com/github/orchestrator
  - Web interface, command line and Web API.

Advanced Replication Troubleshooting


More advanced topics and tools


AGENDA


● System bottlenecks

○ Verify OS metrics

○ Run diagnostics

● MySQL bottlenecks

○ MySQL Tools

○ External tools

● Configuration

○ Dynamic variables

○ Static variables


● Current system status
  ● Load
  ● Swapping
  ● NUMA
  ● I/O wait

● Trends
  ● Memory usage
  ● CPU usage
  ● Disk usage
  ● Network usage

Bottlenecks Explained


● Has anything changed recently?
  ● Application updates?
  ● Database updates?
    ■ Schema updates
    ■ Configuration updates
  ● Hardware failures or updates?
    ■ Disk failures
    ■ Temperature warnings
    ■ Memory errors
  ● OS changes
    ■ Patches or updates installed
    ■ New packages installed

Where to look first?


Graphs? Graphs! Graphs!!!

● Has traffic increased?
● Has disk activity increased?
● Check table growth
● Check memory consumption
  ● Swap usage
  ● Memory leaks
  ● Buffer Pool Size and overhead

Try to correlate events using graphs!

Finding issues over time


● ps (processlist)
● top / htop
● vmstat
● iostat
● lsof
● dmesg
● ifstat
● sar
● strace
● numactl

Diagnose OS


[root@mysql ~]# ps
  PID TTY          TIME CMD
28460 pts/0    00:00:00 sudo
28461 pts/0    00:00:00 su
28462 pts/0    00:00:00 bash
28556 pts/0    00:00:00 ps

[root@mysql ~]# ps -ef | grep mysql
mysql    17999 24982  0 Apr08 ?        00:01:49 /usr/libexec/mysql57/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql57/plugin --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
root     24982     1  0 Apr07 ?        00:00:00 /bin/sh /usr/libexec/mysql57/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
root     28558 28462  0 22:53 pts/0    00:00:00 grep --color=auto mysql

Diagnose OS - ps (processlist)


[root@mysql ~]# top
top - 22:57:43 up 6 days, 13:10,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  81 total,   1 running,  80 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1017060k total,   855112k used,   161948k free,   139948k buffers
Swap:        0k total,        0k used,        0k free,   456684k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      20   0 19648 2492 2164 S  0.0  0.2   0:04.31 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd

● Press 1 to expand the CPU summary into individual cores
● Press u to filter by a specific user (example: mysql)
● Press H to show individual threads
● Press < or > to change the sort column

Diagnose OS - top


Same functionality as top, but a little “fancier”

Diagnose OS - htop


[root@mysql ~]# vmstat 1 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff    cache   si   so    bi     bo   in   cs us sy id wa st
 2  0 181528 165412 163508 15768200    0    0   115    304    0    1  0  0 97  2  0
 0  3 181528 167072 163216 15768148    0    0 19072 363636 1013  544  3  1 75 21  0
 1  3 181528 173528 162864 15761096    0    0 24576 153132 1005 1533  4  1 70 25  0
 0  4 181528 178924 162592 15754912    0    0 21760 206336  974  719  4  1 67 28  0
 0  4 181528 168020 162476 15767500    0    0 24192 106496  740  420  4  0 75 21  0
 0  4 181528 154252 162476 15780900    0    0 12928 206848  711  478  2  0 75 23  0
 0  4 181528 173656 162204 15761588    0    0  5888 116844  518  384  1  0 71 27  0
 0  4 181528 165524 162204 15770332    0    0  8576 176128  542  382  2  0 75 23  0
 0  4 181528 157152 162204 15778872    0    0  8576  96256  415  269  2  0 75 23  0
 0  3 181528 173116 161952 15761288    0    0 10112  12032  466  355  2  0 75 23  0

Diagnose OS - vmstat
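In vmstat output, the b column (processes blocked on I/O) and the wa column (I/O wait) are the ones to watch for an I/O bottleneck. A small sketch that parses captured vmstat text and returns the samples that look I/O bound; the function names and the 20% threshold are assumptions chosen for illustration, not recommended values.

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat(text):
    """Return one dict per numeric sample line, keyed by the vmstat column names."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(VMSTAT_COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return rows


def io_bound_samples(rows, wa_threshold=20):
    """Samples with blocked processes and I/O wait at or above the threshold."""
    return [row for row in rows if row["b"] > 0 and row["wa"] >= wa_threshold]


capture = """ r  b   swpd   free   buff    cache   si   so    bi     bo   in   cs us sy id wa st
 2  0 181528 165412 163508 15768200    0    0   115    304    0    1  0  0 97  2  0
 0  3 181528 167072 163216 15768148    0    0 19072 363636 1013  544  3  1 75 21  0
 1  3 181528 173528 162864 15761096    0    0 24576 153132 1005 1533  4  1 70 25  0
"""
rows = parse_vmstat(capture)
print(len(rows), len(io_bound_samples(rows)))
```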


[root@mysql ~]# sar -P ALL 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql)  04/13/2018  _x86_64_  (1 CPU)

11:56:36 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:56:37 PM     all     80.00      0.00     20.00      0.00      0.00      0.00
11:56:37 PM       0     80.00      0.00     20.00      0.00      0.00      0.00

11:56:37 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:56:38 PM     all     82.00      0.00     18.00      0.00      0.00      0.00
11:56:38 PM       0     82.00      0.00     18.00      0.00      0.00      0.00

11:56:38 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:56:39 PM     all     82.00      0.00     18.00      0.00      0.00      0.00
11:56:39 PM       0     82.00      0.00     18.00      0.00      0.00      0.00

Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     81.33      0.00     18.67      0.00      0.00      0.00
Average:          0     81.33      0.00     18.67      0.00      0.00      0.00

Diagnose OS - sar (CPU)


[root@mysql ~]# sar -r 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql)  04/13/2018  _x86_64_  (1 CPU)

11:57:36 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
11:57:37 PM     64232    952828     93.68    101408    483896    859436     84.50
11:57:38 PM     64232    952828     93.68    101408    483896    859436     84.50
11:57:39 PM     64232    952828     93.68    101408    483896    859436     84.50
Average:        64232    952828     93.68    101408    483896    859436     84.50

[root@mysql ~]# sar -S 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql)  04/13/2018  _x86_64_  (1 CPU)

11:57:46 PM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
11:57:47 PM         0         0      0.00         0      0.00
11:57:48 PM         0         0      0.00         0      0.00
11:57:49 PM         0         0      0.00         0      0.00
Average:            0         0      0.00         0      0.00

Diagnose OS - sar (memory)


[root@mysql ~]# iostat -y -x 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql)  04/13/2018  _x86_64_  (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          79.67    0.00   20.33    0.00    0.00    0.00

Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
xvda       0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00    0.00    0.00    0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          81.67    0.00   18.33    0.00    0.00    0.00

Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
xvda       0.00    3.00    0.00    1.33    0.00   42.67    32.00     0.00    0.00    0.00    0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          81.33    0.00   18.67    0.00    0.00    0.00

Device:  rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
xvda       0.00    2.00    0.00    3.33    0.00   53.33    16.00     0.00    0.00    0.00    0.00

● -y omit the first report (stats since boot)
● -x extended stats
● 3 interval in seconds

Diagnose OS - iostat


[root@mysql ~]# sar -d 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql)  04/13/2018  _x86_64_  (1 CPU)

11:59:53 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
11:59:54 PM  dev202-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

11:59:54 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
11:59:55 PM  dev202-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

11:59:55 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
11:59:56 PM  dev202-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:     dev202-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Diagnose OS - sar (I/O)


[root@mysql ~]# lsof | grep mysqld
mysqld 17999 mysql cwd  DIR  202,1               4096   393649 /var/lib/mysql
...
mysqld 17999 mysql mem  REG  202,1              40640     2694 /lib64/libcrypt-2.17.so
...
mysqld 17999 mysql 2w   REG  202,1              85019   394788 /var/log/mysqld.log
mysqld 17999 mysql 3uW  REG  202,1          134217728   394960 /var/lib/mysql/ib_logfile0
mysqld 17999 mysql 8uW  REG  202,1          134217728   394963 /var/lib/mysql/ib_logfile1
mysqld 17999 mysql 9uW  REG  202,1           12582912   394959 /var/lib/mysql/ibdata1
mysqld 17999 mysql 10uW REG  202,1           12582912   395351 /var/lib/mysql/ibtmp1
mysqld 17999 mysql 11u  REG  202,1                  0   395350 /var/tmp/ibIZu8ES (deleted)
...
mysqld 17999 mysql 15u  IPv6 643056               0t0      TCP *:mysql (LISTEN)
mysqld 17999 mysql 16u  unix 0xffff88003ce0b000   0t0   643057 /var/lib/mysql/mysql.sock
...
mysqld 17999 mysql 23u  REG  202,1               5120   395018 /var/lib/mysql/mysql/db.MYI
mysqld 17999 mysql 24u  REG  202,1               1464   395019 /var/lib/mysql/mysql/db.MYD
...
mysqld 17999 mysql 45uW REG  202,1           10485760   395474 /var/lib/mysql/sbtest/sbtest1.ibd
mysqld 17999 mysql 48uW REG  202,1           10485760   395476 /var/lib/mysql/sbtest/sbtest2.ibd
mysqld 17999 mysql 49uW REG  202,1           10485760   395478 /var/lib/mysql/sbtest/sbtest3.ibd
mysqld 17999 mysql 50uW REG  202,1           10485760   395480 /var/lib/mysql/sbtest/sbtest4.ibd

Diagnose OS - lsof


[root@mysql ~]# dmesg -T
[Sat Apr 7 09:46:48 2018] Linux version 4.9.81-35.56.amzn1.x86_64 (mockbuild@gobi-build-64010) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC) ) #1 SMP Fri Feb 16 00:18:48 UTC 2018
[Sat Apr 7 09:46:48 2018] Command line: root=LABEL=/ console=tty1 console=ttyS0 selinux=0 nvme_core.io_timeout=4294967295
...
[Sat Apr 14 09:46:49 2018] Out of memory: Killed process 21000, UID 48, (httpd).
[Sat Apr 14 09:46:49 2018] mysqld invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0

Call Trace:
 [<ffffffff802c1b64>] out_of_memory+0x8b/0x203
 [<ffffffff8020fa5d>] __alloc_pages+0x27f/0x308

Option -T shows timestamps in human-readable format (not supported on all OSes)

Diagnose OS - dmesg


[root@mysql ~]# ifstat eth0
#kernel
Interface        RX Pkts/Rate    TX Pkts/Rate    RX Data/Rate    TX Data/Rate
                 RX Errs/Drop    TX Errs/Drop    RX Over/Rate    TX Coll/Rate
eth0               14283 0           592 0        21408K 0         41372 0
                       0 0             0 0             0 0             0 0

[root@mysql ~]# watch -n 1 'ifstat eth0'
Every 1.0s: ifstat eth0                              Fri Apr 13 23:46:46 2018

#kernel
Interface        RX Pkts/Rate    TX Pkts/Rate    RX Data/Rate    TX Data/Rate
                 RX Errs/Drop    TX Errs/Drop    RX Over/Rate    TX Coll/Rate
eth0                6645 0           243 0         9960K 0         17018 0
                       0 0             0 0             0 0             0 0

Diagnose OS - ifstat


Trace execution of an executable

Options:

● -e open      trace only a specific system call (like open())
● -e trace=open,read      trace only the listed system calls
● -o file.txt      save the execution trace to a file
● -p pid      attach strace to a running process id

Example:
[root@mysql ~]# strace -p 30231
Process 30231 attached
clock_gettime(CLOCK_REALTIME, {1523664842, 698096903}) = 0
gettimeofday({1523664842, 698196}, NULL) = 0
recvfrom(53, "\n\0\0\0", 4, MSG_DONTWAIT, NULL, NULL) = 4
gettimeofday({1523664842, 698368}, NULL) = 0

Diagnose OS - strace


Non-Uniform Memory Access

● Hardware architecture with multiple physical CPUs
● Memory speed depends on location relative to CPU

NUMA - what?


● MySQL allocates a lot of memory at startup (e.g. buffer pool initialization)
● Node 0 free memory is exhausted while there is still free memory on other nodes
● Often leads to “swapping insanity”
● Solution: load memory “interleaved”

NUMA - why relevant?


The “NUMA-bible” by Jeremy Cole

● https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

● https://blog.jcole.us/2012/04/16/a-brief-update-on-numa-and-mysql/

Since MySQL 5.7.9: innodb_numa_interleave setting

● https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_numa_interleave

NUMA - interesting (must) reads


[root@mysql ~]# numactl --hardware
available: 2 nodes (0-1)
node 0 size: 32276 MB
node 0 free: 26856 MB
node 1 size: 32320 MB
node 1 free: 26897 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

[root@mysql ~]# cat /proc/`pidof mysqld`/numa_maps | head -5
558b3dc82000 default file=/usr/libexec/mysql57/mysqld mapped=3976 active=3953 N0=3976 kernelpagesize_kB=4
558b3f767000 default file=/usr/libexec/mysql57/mysqld anon=235 dirty=235 N0=235 kernelpagesize_kB=4
558b3f852000 default file=/usr/libexec/mysql57/mysqld anon=87 dirty=87 N0=87 kernelpagesize_kB=4
558b3f8fc000 default anon=154 dirty=154 N0=154 kernelpagesize_kB=4
558b3ff47000 default heap anon=6304 dirty=6304 N0=6304 kernelpagesize_kB=4

Diagnose OS - numactl
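The N<node>=<pages> fields in numa_maps show where each mapping's pages live, so summing them per node gives a quick view of how evenly mysqld's memory is spread. A minimal sketch, assuming the text of numa_maps has already been read into a string; the function name is hypothetical, and the totals are page counts, which are only comparable when the mappings share the same kernelpagesize_kB.

```python
import re
from collections import Counter


def pages_per_node(numa_maps_text):
    """Sum the N<node>=<pages> fields of a numa_maps dump, per NUMA node."""
    totals = Counter()
    for node, pages in re.findall(r"\bN(\d+)=(\d+)", numa_maps_text):
        totals[int(node)] += int(pages)
    return dict(totals)


numa_sample = """558b3dc82000 default file=/usr/libexec/mysql57/mysqld mapped=3976 active=3953 N0=3976 kernelpagesize_kB=4
558b3f767000 default file=/usr/libexec/mysql57/mysqld anon=235 dirty=235 N0=235 kernelpagesize_kB=4
558b3f852000 default file=/usr/libexec/mysql57/mysqld anon=87 dirty=87 N0=87 kernelpagesize_kB=4
558b3f8fc000 default anon=154 dirty=154 N0=154 kernelpagesize_kB=4
558b3ff47000 default heap anon=6304 dirty=6304 N0=6304 kernelpagesize_kB=4
"""
print(pages_per_node(numa_sample))
```

A result with nearly all pages on one node while the other nodes still have free memory is the pattern that the interleave setting is meant to avoid.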


● MySQL client (mysql)
● mysqladmin
● mysqlbinlog
● Log files
  ● Error log (log_error)
  ● Slow query log (slow_query_log_file)
  ● General log (general_log_file)

MySQL bottlenecks - MySQL tools


● percona-toolkit
  ● pt-query-digest
  ● pt-upgrade
  ● pt-config-diff
  ● pt-stalk
  ● pt-pmp
  ● ...
● tcpdump
● innotop

MySQL bottlenecks - external tools


mysql> SHOW FULL PROCESSLIST;
+----+--------+-----------+--------+---------+------+-------------------+---------------------------------------------------------------------+
| Id | User   | Host      | db     | Command | Time | State             | Info                                                                |
+----+--------+-----------+--------+---------+------+-------------------+---------------------------------------------------------------------+
| 53 | sbtest | localhost | sbtest | Execute |    0 | statistics        | SELECT DISTINCT c FROM sbtest15 WHERE id BETWEEN ? AND ? ORDER BY c |
| 54 | sbtest | localhost | sbtest | Sleep   |    0 |                   | NULL                                                                |
| 55 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest12 WHERE id=?                                   |
| 56 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest21 WHERE id=?                                   |
| 58 | root   | localhost | NULL   | Query   |    0 | starting          | SHOW FULL PROCESSLIST                                               |
+----+--------+-----------+--------+---------+------+-------------------+---------------------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> pager grep -v Sleep
PAGER set to 'grep -v Sleep'

mysql> SHOW FULL PROCESSLIST;
+----+--------+-----------+--------+---------+------+-------------------+-------------------------------------------------+
| Id | User   | Host      | db     | Command | Time | State             | Info                                            |
+----+--------+-----------+--------+---------+------+-------------------+-------------------------------------------------+
| 54 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | COMMIT                                          |
| 55 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest17 WHERE id BETWEEN ? AND ? |
| 56 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest14 WHERE id=?               |
| 58 | root   | localhost | NULL   | Query   |    0 | starting          | SHOW FULL PROCESSLIST                           |
+----+--------+-----------+--------+---------+------+-------------------+-------------------------------------------------+
5 rows in set (0.00 sec)

MySQL bottlenecks - processlist


mysql> SHOW ENGINE INNODB STATUS\G
=====================================
2018-04-14 01:04:25 0x7f707b41f700 INNODB MONITOR OUTPUT
=====================================
BACKGROUND THREAD
-----------------
SEMAPHORES
------------
TRANSACTIONS
------------
FILE I/O
--------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
LOG
---
BUFFER POOL AND MEMORY
----------------------
ROW OPERATIONS
--------------
END OF INNODB MONITOR OUTPUT
============================

1 row in set (0.00 sec)

mysql>

MySQL bottlenecks - innodb status


Data dictionary - contains metadata on tables
Example: get table sizes

mysql> SELECT table_schema, table_name, engine,
    ->        data_length / 1024 / 1024 AS data_in_MB,
    ->        index_length / 1024 / 1024 AS index_in_MB
    ->   FROM information_schema.TABLES
    ->  WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');
+--------------+------------+--------+------------+-------------+
| table_schema | table_name | engine | data_in_MB | index_in_MB |
+--------------+------------+--------+------------+-------------+
| sbtest       | sbtest1    | InnoDB | 0.01562500 |  0.00000000 |
| sbtest       | sbtest2    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest3    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest4    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest5    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest6    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest7    | InnoDB | 0.01562500 |  0.00000000 |
| sbtest       | sbtest8    | InnoDB | 0.01562500 |  0.00000000 |
+--------------+------------+--------+------------+-------------+
8 rows in set (0.00 sec)

MySQL bottlenecks - information_schema
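
As a worked example of the division in the query above, this small Python sketch converts byte counts to the MB figures in the result. The byte values are back-computed from the slide output, not read from the lab server, and bytes_to_mb is a hypothetical helper name:

```python
def bytes_to_mb(n_bytes: int) -> float:
    """Same arithmetic as `data_length / 1024 / 1024` in the query."""
    return n_bytes / 1024 / 1024


# 16384 bytes is a single InnoDB page at the default 16 KiB page size,
# which is what the (nearly) empty tables sbtest1, sbtest7 and sbtest8 report.
print(bytes_to_mb(16384))      # data_in_MB of an empty table

# Byte counts that correspond to the populated tables in the output.
print(bytes_to_mb(2637824))    # data_in_MB
print(bytes_to_mb(163840))     # index_in_MB
```

Read this way, a data_in_MB of 0.015625 means the table holds only its first allocated page, so sbtest1, sbtest7 and sbtest8 are effectively empty.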


performance_schema provides insight into MySQL / InnoDB internals.

The sys schema is a set of views defined to make searching in P_S a little more convenient.

P_S:
select if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) AS `user`,
  sum(`stmt`.`total`) AS `statements`,
  `sys`.`format_time`(sum(`stmt`.`total_latency`)) AS `statement_latency`,
  `sys`.`format_time`(ifnull((sum(`stmt`.`total_latency`) / nullif(sum(`stmt`.`total`),0)),0)) AS `statement_avg_latency`,
  sum(`stmt`.`full_scans`) AS `table_scans`,
  sum(`io`.`ios`) AS `file_ios`,
  `sys`.`format_time`(sum(`io`.`io_latency`)) AS `file_io_latency`,
  sum(`performance_schema`.`accounts`.`CURRENT_CONNECTIONS`) AS `current_connections`,
  sum(`performance_schema`.`accounts`.`TOTAL_CONNECTIONS`) AS `total_connections`,
  count(distinct `performance_schema`.`accounts`.`HOST`) AS `unique_hosts`,
  `sys`.`format_bytes`(sum(`mem`.`current_allocated`)) AS `current_memory`,
  `sys`.`format_bytes`(sum(`mem`.`total_allocated`)) AS `total_memory_allocated`
from (((`performance_schema`.`accounts`
  left join `sys`.`x$user_summary_by_statement_latency` `stmt` on((if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) = `stmt`.`user`)))
  left join `sys`.`x$user_summary_by_file_io` `io` on((if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) = `io`.`user`)))
  left join `sys`.`x$memory_by_user_by_current_bytes` `mem` on((if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) = `mem`.`user`)))
group by if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`)
order by sum(`stmt`.`total_latency`) desc

Sys: select * from sys.user_summary;

MySQL bottlenecks - performance_schema / sys


Tool to view and search binary log files

$ mysqlbinlog --base64-output=DECODE-ROWS -vvv mysql-bin.000003
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
...
CREATE TABLE sbtest1( ... ) /*! ENGINE = innodb */
/*!*/;
# at 566
#180414 18:41:33 server id 100 end_log_pos 638 CRC32 0x8a350686 Query thread_id=8 exec_time=0 error_code=0
SET TIMESTAMP=1523731293/*!*/;
BEGIN
/*!*/;
# at 638
#180414 18:41:33 server id 100 end_log_pos 695 CRC32 0x0b24f37f Table_map: `test`.`sbtest1` mapped to number 108
...
### INSERT INTO `test`.`sbtest1`
### SET
###   @1=2716 /* INT meta=0 nullable=0 is_null=0 */
###   @2=5007 /* INT meta=0 nullable=0 is_null=0 */
###   @3='55695626677-52169758534-77347375130-44672760375-20882749287-44162878068-93868043135-83242682565-21261977354-27900241166' /* STRING(120) meta=65144 nullable=0 is_null=0 */
###   @4='36579967600-35242135535-40368674184-39875850855-96100412304' /* STRING(60) meta=65084 nullable=0 is_null=0 */
# at 516259
#180414 18:41:33 server id 100 end_log_pos 516290 CRC32 0xdb1b3667 Xid = 27
COMMIT/*!*/;
# at 516290

MySQL bottlenecks - mysqlbinlog
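
The SET TIMESTAMP value and the event header in the output above carry the same moment in two notations. A short Python sketch of the conversion, assuming the machine that ran mysqlbinlog used UTC (the numbers only line up under that assumption); header_time is a hypothetical helper name:

```python
from datetime import datetime, timezone


def header_time(unix_ts: int, tz=timezone.utc) -> str:
    """Render a Unix timestamp in the YYMMDD HH:MM:SS style of the event header."""
    return datetime.fromtimestamp(unix_ts, tz=tz).strftime("%y%m%d %H:%M:%S")


# Value taken from `SET TIMESTAMP=1523731293` in the output above.
print(header_time(1523731293))
```

This helps when searching a binary log for a known point in time: either notation can be used to locate the event.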


● Designed to capture “slow” queries
● A query counts as “slow” when it runs longer than long_query_time and examines at least min_examined_row_limit rows
● Also: log_slow_admin_statements and log_queries_not_using_indexes
● Slow query log options:
    ● slow_query_log: turn it ON or OFF
    ● slow_query_log_file: target file
● Extra verbosity options on Percona Server and MariaDB
● For profiling purposes we often set long_query_time to 0 to log all queries that were executed.

MySQL bottlenecks - Slow query log
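
How the variables above combine can be sketched in Python. This is a simplified model of the decision order described in the MySQL manual, not server code; the class and function names are hypothetical, and the throttling option for queries without indexes is left out:

```python
from dataclasses import dataclass


@dataclass
class SlowLogSettings:
    long_query_time: float = 10.0            # seconds (MySQL default)
    min_examined_row_limit: int = 0          # rows (MySQL default)
    log_slow_admin_statements: bool = False
    log_queries_not_using_indexes: bool = False


def would_be_logged(settings, query_time, rows_examined,
                    is_admin=False, used_index=True):
    """Return True if this simplified model would write the query to the slow log."""
    # 1. Administrative statements are skipped unless explicitly enabled.
    if is_admin and not settings.log_slow_admin_statements:
        return False
    # 2. The query must be slow, or index-less with that option enabled.
    slow_enough = query_time > settings.long_query_time
    no_index = settings.log_queries_not_using_indexes and not used_index
    if not (slow_enough or no_index):
        return False
    # 3. The query must have examined enough rows.
    return rows_examined >= settings.min_examined_row_limit


# Profiling setup from the slide: long_query_time = 0 logs every executed query.
profiling = SlowLogSettings(long_query_time=0.0)
print(would_be_logged(profiling, query_time=0.0002, rows_examined=1))
```

With the defaults, a 0.5 second query is not logged. Lowering long_query_time is what turns the slow log into a full query profile.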


● Generic network analysis tool
● Captures network traffic at the TCP level
● Can be used to capture all MySQL traffic:

tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000000 port 3306 > mysql.tcp.txt

● Warning: the output of this command is not usable with SSL-encrypted connections; the traffic has to be decrypted first.

● ssldump is an alternative to overcome the SSL issue

MySQL bottlenecks - tcpdump


● Part of the percona-toolkit
● Can be used to analyse slow query logs, tcpdump logs, …
● Examples
    ● Using a slow query log
      pt-query-digest slow.log
    ● Using a tcpdump log
      pt-query-digest --type tcpdump mysql.tcp.txt

    ● Convert a tcpdump log into a slow query log (--output takes a format name, so the result is redirected to a file)
      pt-query-digest --type tcpdump --output slowlog --no-report mysql.tcp.txt > tcpdump.slow.log

MySQL bottlenecks - pt-query-digest


When   Load  Cxns  QPS     Slow  Se/In/Up/De%  QCacheHit  KCacheHit  BpsIn    BpsOut
Now    0.00     9  9.68k      0  70/ 4/10/ 4   0.00%      100.00%    385.24k  19.01M
Total  0.00   151  427.48     0  70/ 4/ 9/ 4   0.00%      46.67%     22.85k   839.30k

Cmd      ID  State              User    Host       DB      Time   Query
Execute  13  Sending data       sbtest  localhost  sbtest  00:00  SELECT c FROM sbtest5 WHERE id BETWEEN ? AND ?
Execute  14  Sending to client  sbtest  localhost  sbtest  00:00  SELECT c FROM sbtest7 WHERE id BETWEEN ? AND ? ORDER BY c
Execute  15  Sending to client  sbtest  localhost  sbtest  00:00  UPDATE sbtest6 SET k=k+1 WHERE id=?
Execute  17  updating           sbtest  localhost  sbtest  00:00  UPDATE sbtest5 SET k=k+1 WHERE id=?
Execute  18  starting           sbtest  localhost  sbtest  00:00  COMMIT
Execute  19  updating           sbtest  localhost  sbtest  00:00  UPDATE sbtest5 SET k=k+1 WHERE id=?
Execute  20  Sending to client  sbtest  localhost  sbtest  00:00  SELECT c FROM sbtest2 WHERE id=?

MySQL bottlenecks - innotop


● innodb_buffer_pool_size
● innodb_io_capacity
● innodb_lock_wait_timeout
● query_cache_size
● table_open_cache
● …

These variables can be changed at runtime with SET GLOBAL <variable> = <value>; Make sure you also persist the change in the my.cnf file so the value survives a restart. (MySQL 8.0 can do both in one step with SET PERSIST <variable> = <value>;)

MySQL configuration - dynamic variables


● innodb_buffer_pool_instances
● open_files_limit
● skip_name_resolve
● tmpdir
● …

MySQL configuration - static variables


THANK YOU We’re hiring!

https://www.pythian.com/careers/
