Multi-Source Replication With MySQL 5.7 @ Verisure And How We Got There

Upload: kenny-gryp

Post on 14-Apr-2017


Category: Data & Analytics



TRANSCRIPT

Page 1: Multi Source Replication With MySQL 5.7 @ Verisure

Multi-Source Replication With MySQL 5.7 @ Verisure
And How We Got There

 


Page 3: Multi Source Replication With MySQL 5.7 @ Verisure

Table of Contents

Verisure
Data Warehouse
Tungsten Replicator
MySQL 5.7
Multi-Source Replication
Issues Encountered
Compatibility

 


Page 4: Multi Source Replication With MySQL 5.7 @ Verisure

Europe's most popular home alarm
Verisure

 


Page 5: Multi Source Replication With MySQL 5.7 @ Verisure

Verisure

Verisure is Europe's leading provider of professionally monitored home alarms and services for the connected and protected home and business.

We believe it's a human right to feel safe and secure.

We connect and protect what really matters, our service brings peace of mind to families and small business owners.

Thanks to our strong focus on quality and service, our customers are among the most satisfied in the industry.

https://www.verisure.com/our-offer.html

 


Page 6: Multi Source Replication With MySQL 5.7 @ Verisure

Verisure
Data Warehouse

 


Page 7: Multi Source Replication With MySQL 5.7 @ Verisure

Data Warehouse
Why the Data Warehouse setup?

Troubleshooting tool for 3-line.
Not possible to have BI-optimized DDL in Prod.
BI teams have their own deploy structure/schedule
Heavy data mining to follow up on:
  Product quality
  GSM usage/costs
Stage for Upgrade

 


Page 8: Multi Source Replication With MySQL 5.7 @ Verisure

Data Warehouse
Getting started

First iteration was easy
  Old prod hardware was kept as a Data Warehouse.

Then you add sharding
  And things got a bit harder
  Maybe we could use Tungsten?

 


Page 9: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator

Legacy
Operational Overhead
Direct Mode
Hardware Required
Replication Capacity
Bugs

 


Page 10: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Legacy

Initially: replicate a config database to shards
  needed "temporarily" during migration to sharding

Extra Tungsten instances added to replicate to DW

 


Page 11: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Legacy ... grew into...

 


Page 12: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
... grew into...

 


Page 13: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Shard migration done

Down to one Tungsten per shard

 


Page 14: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Direct Mode

For legacy reasons, Tungsten's direct mode is used.
A separate host was configured to serve as the Tungsten host:
  ~0.15 ms of extra round-trip time to the database
THL requires disk space:
  Replication lag = a lot of disk space.
Ended up with several shard clusters with Tungsten instances

 


Page 15: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Replication Capacity

Parallel (per schema) replication was used:
  the heavier shards are the limiting factor

Global Warming (Tungsten is very CPU intensive)

 


Page 16: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Bugs

Issue 960 (fixed in Tungsten Replicator 3.0):

When using statement based replication with temporary tables where a ROLLBACK of a commit is applied, the replicator would fail to execute the rollback statement.

... and just commit the transaction that should have been rolled back.

Before the fix, replication broke a lot and shards had to be rebuilt regularly.

 


Page 17: Multi Source Replication With MySQL 5.7 @ Verisure

Tungsten Replicator
Operational Overhead

Hard for non-DBAs such as on-call staff
Hard ... even for DBAs
Custom Percona Toolkit plugin for Tungsten Replicator:
https://github.com/grypyrg/percona-toolkit-plugin-tungsten-replicator

$ pt-table-checksum -u checksum --no-check-binlog-format \
    --recursion-method=dsn=D=p,t=dsns --plugin=pt-plugin-tungsten_replicator.pl
Created plugin from /vagrant/pt-plugin-tungsten_replicator.pl.
PLUGIN get_slave_lag: Using Tungsten Replicator to check replication lag
Tungsten Replicator status of host node3 is OFFLINE:NORMAL, waiting
Replica node3 is stopped. Waiting.
*Tungsten Replicator status of host node3 is OFFLINE:NORMAL, waiting
Replica lag is 119 seconds on node3. Waiting.
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
07-03T10:49:54      0      0  2097152       7       0 213.238 app.large_table

 


Page 18: Multi Source Replication With MySQL 5.7 @ Verisure

Move to MySQL 5.7

 


Page 19: Multi Source Replication With MySQL 5.7 @ Verisure

Move to MySQL 5.7
Why?

MSR (multi-source replication) to replace Tungsten Replicator:
  built-in solution, easy operationally
  replication capacity: parallel replication
  less infrastructure required
  easier to train on-call staff

A first step to validate and gain experience with MySQL/Percona Server 5.7

 


Page 20: Multi Source Replication With MySQL 5.7 @ Verisure

Move to MySQL 5.7
Native replication replaces Tungsten

 


Page 21: Multi Source Replication With MySQL 5.7 @ Verisure

Compare: Before - After

 


Page 22: Multi Source Replication With MySQL 5.7 @ Verisure

MySQL 5.7
Data Warehouse Queries

Collect queries (slow log)
Replay with pt-upgrade on 2 DW hosts

 


Page 23: Multi Source Replication With MySQL 5.7 @ Verisure

MySQL 5.7
Data Warehouse Queries

A few queries were reported slower:
  the optimizer sometimes prefers a worse index
  to be further investigated

       table: alarms
  partitions: p201401,p201603,p201604
        type: range
         key: alarm_insid_sid_time_ix
     key_len: 13
        rows: 165
       Extra: Using index condition; Using where; Using temporary; Using filesort

       table: alarms
  partitions: p201401,p201603,p201604
        type: range
         key: alarm_insid_time_ix
     key_len: 9
        rows: 8089
       Extra: Using index condition; Using where

 


Page 24: Multi Source Replication With MySQL 5.7 @ Verisure

MySQL 5.7
Multi Source Replication

 


Page 25: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Syntax

Create user

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repluser05'@'192.168.204.10' IDENTIFIED BY 'rFAQKARW8rLZ9b2Z';

Figure out where to start

$ cat xtrabackup_binlog_info
mysql-bin.203534    53973866

 


Page 26: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Syntax

Requirements

SET GLOBAL master_info_repository = 'TABLE';
SET GLOBAL relay_log_info_repository = 'TABLE';

 


Page 27: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Syntax

CHANGE MASTER TO MASTER_HOST='192.168.204.50',
    MASTER_USER='repluser05',
    MASTER_PASSWORD='rFAQKARW8rLZ9b2Z',
    MASTER_LOG_FILE='mysql-bin.203534',
    MASTER_LOG_POS=53973866
    FOR CHANNEL 'host05';

SHOW SLAVE STATUS FOR CHANNEL 'host05'\G

STOP SLAVE IO_THREAD FOR CHANNEL 'host05';
RESET SLAVE FOR CHANNEL 'host05';
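
The binlog coordinates read from xtrabackup_binlog_info can be turned into the per-channel CHANGE MASTER TO statement mechanically. A minimal shell sketch, assuming the file holds the binlog file name and position on its first line, separated by whitespace. The helper name change_master_sql and its argument order are hypothetical, and MASTER_PASSWORD is left out on purpose, so it would still have to be added as on the slide:

```shell
#!/bin/sh
# Sketch: print a CHANGE MASTER TO statement for one replication channel.
# change_master_sql is a hypothetical helper, not part of MySQL or xtrabackup.
# Usage: change_master_sql <xtrabackup_binlog_info> <master_host> <user> <channel>
change_master_sql() {
    info_file=$1
    master_host=$2
    repl_user=$3
    channel=$4
    # The first line holds "<binlog file> <position>", separated by whitespace;
    # anything after the position (for example a GTID set) is ignored here.
    read -r log_file log_pos _ignored < "$info_file"
    cat <<EOF
CHANGE MASTER TO
  MASTER_HOST='$master_host',
  MASTER_USER='$repl_user',
  MASTER_LOG_FILE='$log_file',
  MASTER_LOG_POS=$log_pos
  FOR CHANNEL '$channel';
EOF
}
```

It could be called as: change_master_sql xtrabackup_binlog_info 192.168.204.50 repluser05 host05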

 


Page 28: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Loading the data

At first, you set up replication before the shards are in use
But sooner or later a reload is needed.

Challenges
  Physical backups can't be used to merge several instances
  TB-sized databases and mysqldump: not efficient
  The data load must be fast, or replication will never catch up (based on past experience with Tungsten)
  Production is 5.6 and DW is 5.7.
  Partitioned tables are not supported for IMPORT TABLESPACE.

 


Page 29: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Loading the data

Dump the data using xtrabackup
  --export --prepare

Dump the schema using mysqldump
  --no-data --triggers --routines

Restore the DDL
  mysql < ddl.sql

Load the data
  discard tablespace
  cp
  import tablespace
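
The discard / cp / import sequence is repeated for every table, so it lends itself to scripting. The sketch below only prints the commands for one table instead of running them. It assumes the prepared backup contains a .cfg metadata file next to each .ibd file, that the DDL has already been restored, and that the data files should be owned by mysql:mysql. The helper name import_table_steps and the directory layout are hypothetical:

```shell
#!/bin/sh
# Sketch: print (not run) the per-table steps of a transportable-tablespace load.
# import_table_steps is a hypothetical helper; the paths and the mysql:mysql
# owner are assumptions about the target host.
# Usage: import_table_steps <db> <table> <prepared_backup_dir> <datadir>
import_table_steps() {
    db=$1
    table=$2
    src=$3/$db
    dst=$4/$db
    # 1. Detach the empty tablespace that restoring the DDL created.
    printf 'mysql -e "ALTER TABLE %s.%s DISCARD TABLESPACE"\n' "$db" "$table"
    # 2. Copy the data file and the metadata file written by the export.
    printf 'cp %s/%s.ibd %s/%s.cfg %s/\n' "$src" "$table" "$src" "$table" "$dst"
    printf 'chown mysql:mysql %s/%s.ibd %s/%s.cfg\n' "$dst" "$table" "$dst" "$table"
    # 3. Attach the copied tablespace to the table definition.
    printf 'mysql -e "ALTER TABLE %s.%s IMPORT TABLESPACE"\n' "$db" "$table"
}
```

Printing the steps first makes it possible to review them, or to pipe them to sh once they look right.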

 


Page 30: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Loading the data: Tips and Tricks

5.5 -> 5.6
  Tables with timestamps must be rebuilt to the new format
  Requires an extra machine to use for the rebuild:
    Load
    ALTER TABLE FORCE
    Dump and start the load

5.6 -> 5.7
  Tables must be created with row_format=COMPACT (5.7 defaults to DYNAMIC, 5.6 to COMPACT)
  ALTER TABLE ROW_FORMAT=COMPACT

 


Page 31: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Loading the data: Tips and Tricks

5.6: Partitioned tables
  Not supported, but
    Import each partition as a separate table
    Add to table using EXCHANGE PARTITION

Supported in 5.7, but no time to test yet...
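
The EXCHANGE PARTITION step can be generated for a whole list of partitions. A minimal sketch, assuming each partition was imported as a standalone table named <table>_<partition>. That naming scheme and the helper name exchange_partition_sql are hypothetical:

```shell
#!/bin/sh
# Sketch: print one ALTER TABLE ... EXCHANGE PARTITION statement per partition.
# Assumes every partition was first imported as a standalone table called
# <table>_<partition>; this naming convention is made up for illustration.
# Usage: exchange_partition_sql <table> <partition> [<partition> ...]
exchange_partition_sql() {
    table=$1
    shift
    for part in "$@"; do
        printf 'ALTER TABLE %s EXCHANGE PARTITION %s WITH TABLE %s_%s;\n' \
            "$table" "$part" "$table" "$part"
    done
}
```

For example: exchange_partition_sql alarms p201401 p201603 p201604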

 


Page 32: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Skipping a Trx, non-GTID:

mysql> SET GLOBAL sql_slave_skip_counter=1;
mysql> START SLAVE;
ERROR 3086 (HY000): When sql_slave_skip_counter > 0, it is not allowed to start more than one SQL thread by using 'START SLAVE [SQL_THREAD]'. Value of sql_slave_skip_counter can only be used by one SQL thread at a time. Please use 'START SLAVE [SQL_THREAD] FOR CHANNEL' to start the SQL thread which will use the value of sql_slave_skip_counter.

mysql> START SLAVE FOR CHANNEL 'one';
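
For on-call use, the skip sequence can be wrapped so that the channel name is always given explicitly. A minimal sketch for non-GTID replication only. The helper name skip_trx_sql is hypothetical, and the leading STOP SLAVE is added here as a precaution rather than taken from the slide:

```shell
#!/bin/sh
# Sketch: print the statements that skip one event group on a single channel.
# skip_trx_sql is a hypothetical helper; it applies to non-GTID replication.
# Usage: skip_trx_sql <channel>
skip_trx_sql() {
    channel=$1
    cat <<EOF
STOP SLAVE SQL_THREAD FOR CHANNEL '$channel';
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE SQL_THREAD FOR CHANNEL '$channel';
EOF
}
```

For example: skip_trx_sql one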

 


Page 33: Multi Source Replication With MySQL 5.7 @ Verisure

Multi Source Replication
Replication Filters

Replication filters cannot be configured per channel:
http://bugs.mysql.com/bug.php?id=80843

 


Page 34: Multi Source Replication With MySQL 5.7 @ Verisure

Replication Capacity Improvements

Tungsten:

channels=5
parallel-queue.maxSize=75000

# cat shard.list
shard01=0
shard02=1
shard03=2
shard04=3
shard05=4

MySQL 5.7 Parallel Replication (per source):

slave_parallel_type=DATABASE
slave_parallel_workers=5
slave_pending_jobs_size_max=32M

 


Page 35: Multi Source Replication With MySQL 5.7 @ Verisure

Replication Capacity Improvements

 


Page 36: Multi Source Replication With MySQL 5.7 @ Verisure

Replication Capacity Improvements

 


Page 37: Multi Source Replication With MySQL 5.7 @ Verisure

Replication Capacity Improvements

New environment has lower replication capacity with the largest shards.
Waiting for slave-parallel-type=LOGICAL_CLOCK
Waiting on the app to become ready for binlog_format=ROW
Need more in-depth analysis of the collected statistics

 


Page 38: Multi Source Replication With MySQL 5.7 @ Verisure

MySQL 5.7
Issues Encountered

 


Page 39: Multi Source Replication With MySQL 5.7 @ Verisure

seconds_behind_master bug

https://bugs.mysql.com/bug.php?id=66921
https://bugs.mysql.com/bug.php?id=80084 (still open)

 


Page 40: Multi Source Replication With MySQL 5.7 @ Verisure

Crash: innodb_open_files > open_files_limit

http://bugs.mysql.com/bug.php?id=78981
Fixed in 5.6.30, 5.7.12, 5.8.0

+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| innodb_open_files | 16384 |
| open_files_limit  | 8510  |
+-------------------+-------+

2015-10-27 10:20:33 5535 [ERROR] InnoDB: Trying to do i/o to a tablespace which
2015-10-27 10:20:33 7fa725a05700 InnoDB: Error: trying to access tablespace 11015335
InnoDB: but the tablespace does not exist or is just being dropped.
2015-10-27 10:20:33 7fa725a05700 InnoDB: Operating system error number 24 in a file operation.
InnoDB: Error number 24 means 'Too many open files'.
InnoDB: Some operating system error numbers are described at...
2015-10-27 10:20:33 7fa725a05700 InnoDB: Assertion failure in thread 140355867531008 in file buf0buf.cc line 2740
InnoDB: We intentionally generate a memory trap.
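
On versions without the fix, the risky configuration shown above can be detected before it bites. A minimal sketch of such a pre-flight check. The helper name open_files_ok is hypothetical, and the two values are passed in as arguments so the check itself needs no database connection:

```shell
#!/bin/sh
# Sketch: succeed only if innodb_open_files fits within open_files_limit.
# open_files_ok is a hypothetical helper name.
# Usage: open_files_ok <innodb_open_files> <open_files_limit>
open_files_ok() {
    if [ "$1" -gt "$2" ]; then
        echo "WARNING: innodb_open_files ($1) > open_files_limit ($2)" >&2
        return 1
    fi
    return 0
}
```

The two values could be fetched with: mysql -NBe "SELECT @@innodb_open_files, @@open_files_limit"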

 


Page 41: Multi Source Replication With MySQL 5.7 @ Verisure

Crash: Upgrade from 5.6 to 5.7 MSR

Replication channels get the same name in MSR after upgrade, which can also crash MySQL
https://bugs.mysql.com/bug.php?id=80302 -- Open :(

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_Port: 11204
[..]
                 Channel_Name: master1
           Master_TLS_Version:
*************************** 2. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_Port: 13358
[..]
                 Channel_Name: master1
           Master_TLS_Version:
2 rows in set (0.00 sec)
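
Duplicate channel names like the ones above can be spotted by scanning the vertical output of SHOW SLAVE STATUS. A minimal sketch, assuming the \G output format shown on the slide arrives on standard input. The helper name duplicate_channels is hypothetical:

```shell
#!/bin/sh
# Sketch: read "SHOW SLAVE STATUS\G" output on stdin and print every channel
# name that occurs more than once. duplicate_channels is a hypothetical helper.
duplicate_channels() {
    # awk ignores the leading padding, so the field label is always $1.
    awk '$1 == "Channel_Name:" { print $2 }' | sort | uniq -d
}
```

For example: mysql -e 'SHOW SLAVE STATUS\G' | duplicate_channels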

 


Page 42: Multi Source Replication With MySQL 5.7 @ Verisure

MySQL 5.7 Multi Source
Compatibility

 


Page 43: Multi Source Replication With MySQL 5.7 @ Verisure

Percona Toolkit

Percona Toolkit is missing MSR support.

Slave Lag: pt-heartbeat:
https://github.com/grypyrg/percona-toolkit-plugin-heartbeat

 


Page 44: Multi Source Replication With MySQL 5.7 @ Verisure

InnoTop Multi Source Support

Written by Johan Nilsson (Verisure)
Soon to be merged:
https://github.com/innotop/innotop/pull/129

[RO] Replication Status (? for help) 127.0.0.1, 3m, 1.93 QPS, 5/1/0 con/run/cac thds,

________________________________ Slave SQL Status ______________________________
Channel Master Master UUID On? TimeLag Catchup RPos Last
one localhost d7e93be0-0452-08002774c31b Yes 00:00 0.00 327
two localhost 5b9d58e4-0452-08002774c31b Yes 00:00 0.00 4

________________________________ Slave I/O Status _______________________________
Channel Master Master UUID On? File RSize Pos
two localhost 5b9d58e4-0472-08002774c31b No 57-co.bin.000003 154
one localhost d7e93be0-04b2-08002774c31b Yes 57-co.bin.000003 545

____________________________________________ Master Status ________________________
File Position Binlog Cache Executed GTID Set Server UUID
57-community-bin.000003 154 0.00% N/A b40426f3-045

 


Page 45: Multi Source Replication With MySQL 5.7 @ Verisure

Monitoring Tools
Our favorite things

Mytop
innotop
  Patch for channels
Icinga/Nagios
MRTG
  Some MySQL metrics that are important for us.
Grafana/Graphite/Collectd

 


Page 46: Multi Source Replication With MySQL 5.7 @ Verisure

Kristofer [email protected]

Kenny [email protected]

Multi-Source Replication With MySQL 5.7 @ Verisure

And How We Got There (Almost :-)
Questions?

 
