msmm done right - percona do… · © continuent 2016 multi-site, multi-master done right matthew...
TRANSCRIPT
© Continuent 2016
Multi-Site, Multi-Master Done Right
Matthew Lang, Director of Professional Services, Americas of ContinuentMatthew Churcher, Lead Performance Engineer of New Voice Media
© Continuent 2016
Hi J• DBA/4GL Programmer for Progress RDMS• Web developers wanted better database support – LAMP• Wrote batch daily/2x daily/hourly batch loader from Progress
to MySQL for product search• Wrote real time replication from MySQL to Progress, then vice
versa, no more batch!• Became MySQL DBA/Sysadmin• Healthcare company for doctors needed high availability, we
decided on Continuent Tungsten.• Here I am
© Continuent 2016
Continuent is Back
• Continuent is again an independent entity, no ties to VMware
• Run by Team of original Continuent members• Continuent Support remains the same• We now have a freedom to update products and
functionality • Tungsten Replicator still Open Source
© Continuent 2016
Continuent Facts
BusinessCriticalDeploymentExamplesHighAvailabilityforMySQL
Largest clusterdeploymentperforms800M+transactions/day on275TBofrelationaldata
Business Continuity Cross-siteclustertopologieswidelydeployedincludingprimary/DR andmulti-master
High PerformanceReplication
Largestinstallationstransfer billionsoftransactionsdailyusing highspeed,parallelreplication
HeterogeneousIntegration
CustomersreplicatefromMySQL toOracle,Hadoop,Redshift,Vertica,andothers
Real-timeAnalyticsOptimized dataloadingfordatawarehouseswithdeploymentsofupto200MySQLmastersfeedingtoHadoop
© Continuent 2016
The Dream: multiple, active DBMS servers with identical data over distance
HighAvailability
Updatespropagatedimmediatelytoallservers
Transparentread/writetoanyserver
HighPerformance
© Continuent 2016
Synchronous multi-master clusters claim to deliver on the dream
Table fooid=1, data=6
Ordering Table fooid=1, data=5
Table fooid=7, data=25
[1] id=1, data=6[2] id=1, data=5[3] id=7, data=25
© Continuent 2016
Synchronous multi-master introduces new problems
Table fooid=1, data=6
Ordering Table fooid=1, data=5REJECTED!
Table fooid=7, data=25
[1] id=1, data=6[2] id=1, data=5[3] id=7, data=25
© Continuent 2016
…That grow as data scale in volume and distance
• Transaction failures due to conflicts• Operations like SELECT FOR UPDATE not
supported• Slow writes due to synchronous messaging• Large transactions lock system or cause failures• Cross-site replication is unstable
© Continuent 2016
Continuent Tungsten Clustering
© Continuent 2016
Continuent Clustering: HA, DR and Performance Scaling
db2 db1 db3
Slave Master Slave
Application Stack
Continuent Connector
Application Stack
Continuent Connector
•Tungstenclustersmakeoff-the-shelfopen-sourcedatabaseslooklikeasingledatabasetoapps
•Installseamlesslywithoutmigratingdataoralteringapplications•Automaticfailoverwithsimplerecoveryforhighavailability•Performmaintenancewithouthaltingordisruptingapplicationstoenhanceoveralluptime•AddadditionalnodesordisasterrecoveryclusterstoscalecapacityandboostHAprotection
•Cloud-readywithmajordeploymentsinAmazonaswellason-premise datacenters•Replicationcanoperateasstandaloneproduct
© Continuent 2016
Continuent Connector operates as an intelligent proxy to the DBMS
• Any MySQL client can connect• Connector initiates connections on behalf of client to
the DBMS
mySQL
Master
mySQL
Slave
mySQL
Slave
Connector
MySQL Protocol
COM_QUERYCOM_INIT_DB
COM_DROP_DB…
© Continuent 2016
Continuent SmartScale provides session load balancing
• Initial write goes to master• Reads go to replicas if it is safe to do so.
mySQL
Master
mySQL
Slave
mySQL
Slave
Application Connector(Session “X” Binlog Position)
Select Data
Write committed
Not received
Received but not applied
Read from master
NO read from slave
© Continuent 2016
Continuent SmartScale provides session load balancing
• Reads go to slave when it has caught up with master
mySQL
Master
mySQL
Slave
mySQL
Slave
Application Connector(Session “X” Binlog Position)
Select Data
Write committed
Received but not applied
Received and
appliedRead from slave
© Continuent 2016
Load Balancing (Routing) Methods
RoutingMethod HostSelection AutoR/WSplitting SlaveLatency MaximumAppliedLatency
Smartscale BySession Yes(bySQLstatement) Lazy Yes
DirectReads ByContent Yes(bySQLstatement) Lazy Yes
Host-based ByHostname No Yes Yes
Port-based ByNetworkPort No No Yes
SQL-based BySQLcomment No No Yes
Manager
Replicator
Manager
Replicator
Manager
Replicator
Continuent clusters add HA and scaling without taking features away
15
Slave Master Slave
Continuent Connector Continuent Connector
© Continuent 2016
Continuent clusters automatically monitor all cluster nodes for failure
Continuent Connector
Master
SlaveSlave
© Continuent 2016
Cluster rules failover master if DBMS no longer accepts network connections
Continuent Connector
Master
SlaveSlave
X1. Detect non-responsive node
2. Halt in-coming connections
3. Find and promote most up-to-date slave
© Continuent 2016
Cluster rules failover master if DBMS no longer accepts network connections
Continuent Connector
New Master
Shunned nodeSlave4. Administrator inspects and recovers old master X
• Failednodescanbereprovisioned fromabackupwithasinglemanagementcommand
~OR~• Administratorcanrecoverthenodeandadditbackintothecluster
© Continuent 2016
Zero Downtime Maintenance
Rolling maintenance proceeds node-by-node starting with slaves and proceeding to master
Slave upgrade
Slave upgrade Switch Master
upgrade
• Shun slave• Resize
journal, restart mysqld
• Return node to cluster
• Discard and reprovision on failure
• Repeat for remaining slave(s)
• Switch master to promote an upgraded slave
• Upgrade old master
• Maintenance is now done!
© Continuent 2016
Tungsten Replicator
© Continuent 2016
Tungsten Replicator Overview
Replication Replication
Replication
• HighPerformanceReplicationto/fromMySQLandOracleandtoDatawarehouses
• MultiSite,MultiMasterbetweenindividualMySQLnodesorContinuent Clusters
• OpenSource• GlobalTransactionID’s• ParallelApplyofTransactions• AdditionaltopologiessuchasFan-in/Fan-
out,starreplication• AdvancedFiltering
© Continuent 2016
Asynchronous replication decouples transaction processing on master and slave DBMS nodes
Replicator
mySQL
DBMS Logs
mySQL
Replicator
THL
THL
Read transactions
from binary logs
Apply using JDBC(Transactions + metadata)
(Transactions + metadata)
Master
Slave
© Continuent 2016
Replicator Pipeline with Parallel Apply
THL
Parallel queue(Transactions + metadata)
Slave
Extract Filter Apply Extract Filter Apply
Extract Filter Apply
Extract Filter Apply
Extract Filter Apply
StageStageStage
Slave Replicator Pipeline
Masterreplicator
© Continuent 2016
Putting it all Together
© Continuent 2016
Multi Site/Multi-master clusters operate as independent, active clusters on 2 or more sites
SJC Service NYC Service
Slave Slave
Master
Slave Slave
MasterCross-Region Replication
(AsyncMulti-master)
Continuent Connector Continuent Connector
© Continuent 2016
Robust Multi Site Replication
Continuent Connector
Master
SlaveSlave
NYCService
• ReadfromMastertoavoidlatency
• ReadfromSlavestoreduceloadonMaster
• RecoverautomaticallyfromWANdisruptions
• SSLencryption– notjustreplicationtraffic,butallclustertraffic.
© Continuent 2016
Cross Site Recovery
• Tungsten Replicators are fully aware of remote cluster status
• If node in remote cluster fails, Tungsten Replicators reconfigures itself to read from a working node.
• Completely transparent and automatic, no intervention required.
Continuent Connector
Master
SlaveSlaveX
© Continuent 2016
Requirements (very few!)
• MySQL• Binary Logging on – ROW• Primary Keys on tables• auto-increment-offset = 1 (set this to cluster number)• auto-increment-increment = 2 (set this to total number of
clusters)
Site Pkey Value(Insert) Pkey Values(afterReplication)
SJC 1,3,4 1,2,3,4,5,6
NYC 2,4,6 1,2,3,4,5,6
© Continuent 2016
Multi Site/Multi-master clusters operate as independent, active clusters on 2 or more sites
SJC Service NYC Service
Slave Slave
Master
Slave Slave
MasterCross-Region Replication
(AsyncMulti-master)
Continuent Connector Continuent Connector
Continuent Disaster Recovery creates composite clusters that span sites and are ready for immediate failover
SJC Master Service NYC Slave Service
Slave Slave
Master
Slave Slave
RelayCross-Region Replication
(Asyncmaster/slave)
Continuent Connector Continuent Connector
© Continuent 2016
Contact Us535 Mission St. 14th Floor San Francisco, CA 94105 Tel +1 (408) 431 3305 Fax +1 (408) 668-1009e-mail: [email protected]
www.continuent.comfacebook.com/continuentTwitter @continuent
Tungsten Replicator:http://tungsten-replicator.org