MySQL High Availability: Managing Farms of Distributed Servers (MySQL Fabric)

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

Upload: alfranio-junior

Post on 27-Jan-2015


Page 1: MySQL High Availability: Managing Farms of Distributed Servers (MySQL Fabric)


MySQL High Availability: Managing Farms of Distributed Servers (MySQL Fabric)

Mats Kindahl, Alfranio Correia, Narayanan Venkateswaran


3 | 21/09/2013 | Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Agenda

MySQL High Availability Options

MySQL Fabric – New kid on the block

MySQL Fabric – Failure detection and Failover

MySQL Fabric-aware connectors

MySQL Fabric – Playing with the new kid


MySQL High Availability Options


What Causes Downtime?

System Failures

– Server faults

– Software bugs or crashes

Physical Disasters

Scheduled Maintenance

User Errors


Effect and Impact

Effect:

– Service unavailability

– Bad response time

Impact:

– Revenue loss

– Negative impact on customer relationships

– Reduced employee productivity

– Regulatory issues


Another Amazon Outage Exposes the Cloud's Dark Lining

By Brad Stone - Bloomberg Businessweek

“The entire incident lasted all of 49 minutes...”


Causes of Downtime in Production MySQL Servers

By Baron Schwartz – Percona

“It is ironic but true that high-availability tools can cause downtime.”


Failures are inevitable, so design your systems with failure in mind.


High Availability Solutions

Primary-Secondary

Shared Nothing Clusters

Tightly-coupled Clusters

Primary-Secondary

MySQL Replication in 5.6

Characteristics:

– Simple to configure

– Different platforms

– Configured over LAN or WAN

– No shared storage or virtual IP required

[Diagram: one master replicating to four slaves]

Primary-Secondary

MySQL Replication in 5.6

Characteristics:

– Asynchronous replication: risk of data loss (unless using semi-sync)

– Performance overhead on the master

– No automatic failover or switchover (unless using MySQL Utilities)

[Diagram: one master replicating to four slaves]

Shared Nothing Clusters

MySQL Cluster

Characteristics:

– Multi-master architecture

– No single point of failure

– Support for SQL and NoSQL interfaces

– Synchronous replication

[Diagram: MySQL Cluster data nodes and MySQL Servers]


Tightly Coupled Clusters

Provide Active/Passive Solution

Examples:

– DRBD

– WSFC

– Solaris Clustering

– Oracle Virtual Machines

Distributed Replicated Block Device

DRBD (Regular Operation)

Characteristics:

– Linux kernel module integrated into Oracle Linux

– Synchronous replication

– Only one MySQL operational

[Diagram labels: Pacemaker, Corosync, MySQL and DRBD on each of two nodes; layers: Services, Cluster]

Distributed Replicated Block Device

DRBD (Failover)

Characteristics:

– Cluster management system required

– Virtual IP migration

[Diagram labels: Pacemaker, Corosync, MySQL and DRBD on each of two nodes; layers: Services, Cluster]


Shared Storage

Windows Server Failover Clustering (Regular Operation)

Characteristics:

– Required: Windows Clustering and shared storage

– Only one MySQL operational

– Virtual IP migration

– Shared storage used to vote

[Diagram labels: two servers, each running MySQL on Windows Clustering (Services); shared storage holding Vote, Data and Binary Log]


MySQL Fabric – New kid on the block

MySQL Fabric

– Distributed framework

– Extensions are first-class citizens

– Supported by a variety of connectors

– Fault-tolerant solution

– You can suggest features, report bugs and contribute patches

Still early alpha, long journey ahead

Farms of MySQL 5.6 Servers

Birds-eye View

Key Components

Characteristics:

– Support for Primary-Secondary

– Focus on MySQL 5.6 and later

– Written in Python

[Diagram labels: Application, MySQL Fabric, High Availability Groups; connections: XML-RPC, SQL]

Birds-eye View

Fabric-aware Connectors

Characteristics:

– Route transactions

– Cache information

– Currently Python, Java, PHP

[Diagram labels: Application, MySQL Fabric, High Availability Groups; connections: XML-RPC, SQL]

Architecture

Characteristics:

– XML-RPC is widely available

– Extensible framework

– Failures taken into account

[Diagram labels: MySQL Fabric Framework with Executor and State Store (Persister); Extensions: Sh, HA, ?; Protocols: MySQL, AMQP, XML-RPC, ?; Backing Store: MySQL]
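To illustrate why XML-RPC being "widely available" matters, here is a minimal sketch using only Python's standard library. It starts a toy XML-RPC server in the same process and queries it as an application would. The method name `group.lookup_servers`, the group contents and the ports are hypothetical stand-ins, not the documented Fabric API.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory stand-in for the state store: group id -> addresses.
GROUPS = {"YYZ": ["localhost:13001", "localhost:13002"]}


def lookup_servers(group_id):
    """Return the servers registered in a group (empty list if unknown)."""
    return GROUPS.get(group_id, [])


# Port 0 lets the operating system pick a free port for this demo.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
# The dotted name is an assumption made for illustration only.
server.register_function(lookup_servers, "group.lookup_servers")
threading.Thread(target=server.serve_forever, daemon=True).start()

host, port = server.server_address
proxy = xmlrpc.client.ServerProxy("http://%s:%d/" % (host, port))

# Any language with an XML-RPC client could issue the same call.
servers = proxy.group.lookup_servers("YYZ")
print(servers)

server.shutdown()
server.server_close()
```

The point of the sketch is only that the client side needs nothing beyond a stock XML-RPC library.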


MySQL Fabric: Prerequisites

MySQL Servers 5.6.10 (or later):

– Backing Store

– Managed Servers

Python 2.6 or 2.7

MySQL Utilities 1.4.0

– Available at labs (http://labs.mysql.com)


MySQL Fabric – Failure Detection and Failover

HA Overview

Characteristics:

– Fabric keeps information on groups

– Application defines the group that it will use

– Connection failures regularly propagated

[Diagram labels: Application, Operator, MySQL Fabric, High Availability Group]


Failure Detection and Failover

Current Status:

– Simple failure detector/recovery per group

Considering:

– Make connectors report failures

– Support external/custom failure detectors

– Improve failover/switchover algorithm

– Extend servers/system to avoid the split-brain problem

Failure Detection

Enabled per group

    group = Group.fetch(self.__group_id)
    for server in group.servers():
        if server.is_alive():
            continue
        if group.master == server.uuid:
            # Failover if the master has gone
            trigger("FAIL_OVER", [], self.__group_id)
        else:
            # Notification if the lost server is not the master
            trigger("SERVER_LOST", [], self.__group_id, server.uuid)
        # Server marked as faulty
        server.status = MySQLServer.FAULTY

Failover

[Diagram, repeated on three slides: a master and four slaves, each labeled with the transactions (T1, T2, T3) it holds]

– Master fails

– Choosing a candidate

– Pointing to the new master
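The three failover steps (the master fails, a candidate is chosen, the remaining slaves are pointed to the new master) can be sketched as follows. The slides do not say how the candidate is chosen; picking the slave that has applied the most transactions is an assumption made here, and the server names and transaction sets are hypothetical. A real deployment would compare GTID sets rather than Python sets.

```python
def choose_candidate(slaves):
    """Pick the slave that has applied the most transactions (assumed rule)."""
    if not slaves:
        raise ValueError("no slave available for promotion")
    return max(slaves, key=lambda name: len(slaves[name]))


def fail_over(slaves):
    """Return (new_master, topology) once the old master is considered lost."""
    new_master = choose_candidate(slaves)
    # Every remaining slave is pointed to the new master.
    topology = {name: new_master for name in slaves if name != new_master}
    return new_master, topology


# Hypothetical group state after the master failed.
slaves = {
    "slave1": {"T1", "T2", "T3"},
    "slave2": {"T1"},
    "slave3": {"T1", "T2"},
    "slave4": {"T1"},
}

new_master, topology = fail_over(slaves)
print(new_master, topology)
```

Choosing the most up-to-date slave keeps the number of transactions lost with the old master as small as possible under asynchronous replication.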


Making Fabric Itself HA

Current Status:

– Fabric can automatically resume on-going activities

– Backing store is not left in an inconsistent state

– Information is cached in the connector

Considering:

– Replicated State Machine among Fabric nodes

– Use MySQL Cluster as backing store

Crash-safe Procedures

[Diagram, repeated on three slides: inside the MySQL Fabric Framework, the Executor runs a procedure as Step 1, Step 2, Step 3, with the State Store (Persister) backed by MySQL]

– Regular Execution

– Failover/Recovery Execution

– Resuming Execution
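One way to picture regular execution, a crash, and resumed execution is a procedure whose completed steps are recorded in a store that survives the crash. The sketch below is an illustration under that assumption, not Fabric's executor: `StateStore`, `run_procedure` and the `crash_before` parameter are hypothetical names, and the store is an in-memory dictionary standing in for the MySQL backing store.

```python
import json


class StateStore:
    """Hypothetical stand-in for the backing store: remembers finished steps."""

    def __init__(self):
        self._rows = {}

    def mark_done(self, proc_id, step_name):
        done = json.loads(self._rows.get(proc_id, "[]"))
        done.append(step_name)
        self._rows[proc_id] = json.dumps(done)

    def finished(self, proc_id):
        return json.loads(self._rows.get(proc_id, "[]"))


def run_procedure(store, proc_id, steps, crash_before=None):
    """Run the steps in order, skipping those already recorded as done."""
    done = set(store.finished(proc_id))
    for name, action in steps:
        if name in done:
            continue  # completed before the crash, do not repeat it
        if name == crash_before:
            raise RuntimeError("simulated crash before " + name)
        action()
        store.mark_done(proc_id, name)


log = []
steps = [
    ("step1", lambda: log.append("step1")),
    ("step2", lambda: log.append("step2")),
    ("step3", lambda: log.append("step3")),
]
store = StateStore()

try:
    run_procedure(store, "p1", steps, crash_before="step2")  # regular run, crashes
except RuntimeError:
    pass
run_procedure(store, "p1", steps)  # resumed run picks up at step2
print(log)
```

A crash between running a step and recording it would make the step run again on resume, so in this sketch steps need to be idempotent or paired with a compensating action.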

Writing a procedure

    # Each step runs in a transactional context
    @_events.on_event(STEP_1)
    def do_something(group_id):
        _do_it(group_id)
        # Trigger the next step
        _events.trigger_within_procedure(STEP_2, group_id)

    # Compensate operation
    @do_something.undo
    def undo_something(group_id):
        _undo_it(group_id)
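The decorator pattern on this slide (a step handler plus an attached compensating function) can be mimicked in a few lines. This is a sketch of the pattern only: `EventRegistry` and its `trigger` method are hypothetical, and it runs the undo function when the step raises, which is an assumption about when compensation happens.

```python
class EventRegistry:
    """Hypothetical registry that maps events to handlers with optional undo."""

    def __init__(self):
        self._handlers = {}

    def on_event(self, event):
        def decorator(func):
            func.undo_handler = None

            def undo(undo_func):
                # Attach the compensating function to the step handler.
                func.undo_handler = undo_func
                return undo_func

            func.undo = undo
            self._handlers[event] = func
            return func

        return decorator

    def trigger(self, event, *args):
        handler = self._handlers[event]
        try:
            return handler(*args)
        except Exception:
            # Compensate for the failed step, then report the failure.
            if handler.undo_handler is not None:
                handler.undo_handler(*args)
            raise


events = EventRegistry()
journal = []


@events.on_event("STEP_1")
def do_something(group_id):
    journal.append(("do", group_id))
    raise RuntimeError("step failed")


@do_something.undo
def undo_something(group_id):
    journal.append(("undo", group_id))


try:
    events.trigger("STEP_1", "YYZ")
except RuntimeError:
    pass
print(journal)
```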

MySQL Fabric: Using MySQL Cluster

[Diagram: two MySQL Fabric Framework instances, each with its own Executor, State Store (Persister), extensions (Sh, HA) and protocols (MySQL, AMQP, XML-RPC), sharing MySQL Cluster as the backing store]

MySQL Fabric: Using Replicated State Machine

[Diagram: several MySQL Fabric Framework instances, each with an Executor, State Store (Persister), extensions (Sh, HA), protocols (MySQL, AMQP, XML-RPC) and its own MySQL backing store, linked through RSM]


MySQL Fabric-aware Connectors

Writing an application

Connecting to a Group

    import mysql.connector.fabric as connector

    # Use MySQLFabricConnection
    conn = connector.MySQLFabricConnection(
        fabric={"host": "fabric.example.com", "port": 8080},
        user='mats', passwd='passwd', database="employees")

    # Define a group
    conn.set_property(group='YYZ')

    # Get a cursor to the master in YYZ
    cur = conn.cursor()

Connectors cannot hide failures

– Multi-statement transaction

– Single-statement transaction

Writing an application

Handling Connection Failures

    from mysql.connector.errors import InterfaceError

    try:
        conn.start_transaction()
        cur.execute('INSERT...')
        cur.execute('UPDATE...')
        conn.commit()
    except InterfaceError as error:
        # The connection failed: get a new cursor; the application decides
        # whether and how to retry the transaction
        cur = conn.cursor()

Connectors cannot safely retry or reconnect


Plan your application to retry after a failure.


Good practices

Handle session information in the retry logic:

– Temporary tables

– Session variables

– Prepared statements

Check the server's wait_timeout property

Do not set connection_timeout
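The advice to retry in the application and to rebuild session information on each attempt can be sketched as below. Everything here is a hypothetical stand-in: `ConnectionLost` plays the role of the connector's connection error, and `connect`, `prepare_session` and `transaction` are callbacks the application would supply. The demo uses fake connections so that it runs without a server.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for the connector's connection error."""


def run_with_retry(connect, prepare_session, transaction, attempts=3, delay=0.0):
    """Run a whole transaction, reconnecting and redoing session setup on failure."""
    last_error = None
    for _ in range(attempts):
        try:
            conn = connect()
            # Session state (temporary tables, session variables, prepared
            # statements) is lost with the connection, so rebuild it each time.
            prepare_session(conn)
            return transaction(conn)
        except ConnectionLost as error:
            last_error = error
            time.sleep(delay)
    raise last_error


# Fake connections: the first one fails, the second one succeeds.
calls = {"connect": 0, "prepare": 0}


def connect():
    calls["connect"] += 1
    return {"id": calls["connect"]}


def prepare_session(conn):
    calls["prepare"] += 1
    conn["session_vars"] = {"sql_mode": "STRICT_ALL_TABLES"}


def transaction(conn):
    if conn["id"] == 1:
        raise ConnectionLost("master went away")
    return "committed on connection %d" % conn["id"]


result = run_with_retry(connect, prepare_session, transaction)
print(result, calls)
```

Retrying a transaction whose commit outcome is unknown can apply it twice, so the application has to make such transactions idempotent or check whether the first attempt took effect.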

References

Blogs

– http://alfranio-distributed.blogspot.com/2013/09/writing-fault-tolerant-database.html

– http://alfranio-distributed.blogspot.com/2013/09/tips-to-build-fault-tolerant-database.html

Documents

– http://miscalculation/why-mysql/white-papers/mysql-guide-to-high-availability-solutions/

– http://dev.mysql.com/doc/workbench/en/mysql-utilities.html

Code

– MySQL Fabric available at http://labs.mysql.com/


MySQL Fabric – Playing with the new kid

Quick Setup

Starting MySQL Servers

– Use MTR

– Do it manually, use sandbox, whatever you like

rpl_fabric_gtid.cnf:

    !include ../my.cnf

    [mysqld.n]
    report-host=localhost
    log-slave-updates
    innodb
    gtid-mode=on
    enforce-gtid-consistency
    master-info-repository=TABLE

rpl_fabric_gtid.test:

    source include/have_innodb.inc

Quick Setup

MySQL Fabric Installation

– Python 2.6 or 2.7

– MySQL Utilities 1.4.0

– Check configuration file

fabric.cfg:

    [storage]
    address = localhost:3306
    user = fabric
    password =
    database = fabric
    connection_timeout = 6

    [protocol.xmlrpc]
    address = localhost:8080
    threads = 5
    url = file:///var/log/fabric.log

Quick Setup

Run MySQL Fabric

– Configure the state store

– Start fabric

– Manage your groups

Terminal 1:

    mysqlfabric manage setup
    mysqlfabric manage start

Terminal 2:

    mysqlfabric list-commands
    mysqlfabric group create YYZ
    mysqlfabric group add localhost:1300 root ''

Thoughts for the Future

● Connector multi-cast
– Scatter-gather

● Internal interfaces
– Improve extension support
– Improve procedures support

● Command-line interface
– Improving usability
– Focus on ease-of-use

● More protocols
– MySQL-RPC Protocol?
– AMQP?

● More frameworks?

● More HA group types
– DRBD
– MySQL Cluster

● Fabric-unaware connectors?

Thoughts for the Future

● “More transparent” sharding
– Single-query transactions
– Cross-shard joins are a problem

● Multiple shard mappings
– Independent tables

● Multi-way shard split
– Efficient initial sharding
– Better use of resources

● High-availability executor
– Node failure stops execution
– Replicated State Machine
– Fail over to other Fabric node

● Distributed failure detector
– Connectors report failures
– Custom failure detectors


Thank you!


Your Feedback is Highly Appreciated!

http://forums.mysql.com/list.php?144