mysql ha presentation
Post on 13-Jul-2015
MySQL HA Using Different Solutions
Robert Krzykawski, DB Team Coordinator, bwin games
Anders Karlsson, Principal Sales Engineer, MySQL
Agenda
Who are we?
HA Basics – Anders
How we did it; Success or failure – Robert
Summary
Questions?
Anders Karlsson
Sales Engineer with Sun / MySQL for 5+ years
I have been in the RDBMS business for 20+ years
I have worked for many of the major vendors and with most of the vendor products
I’ve been in roles as
> Sales Engineer
> Consultant
> Porting engineer
> Support engineer
> Etc.
Outside MySQL I build websites (www.papablues.com), develop Open Source software (MyQuery, ndbtop etc.), am a keen photographer and drive sub-standard cars, among other things.
Also: www.makezfsgpl.com ! Right now!
Robert Krzykawski
DB Team Coordinator @ bwin Games AB
I have worked with MySQL in every role, from system admin to DBA and DBD, and am now taking a more system-architectural role.
I have been involved in building both small and large web-based solutions with MySQL since 1998.
My roles throughout my professional life have varied: System administrator, Technical Sales support, DBA, DBD, Programmer, Application architect and System architect.
Off work I try to automate things with scripts and programs to offload myself when “at work”.
I also try to find time to snowboard and play some paintball, and a recently introduced hobby is our Maine Coon kittens.
Why do you need HA
Something can break. It usually will, eventually
You will need to maintain your database eventually, without shutting the whole system down
Adding HA to an existing running system is difficult, much more so than providing HA from the start
You want a good night’s sleep!
You want failover to be automatic!
HA Concepts
Fault tolerant architectures
> These are hardware architectures with supporting software that protect against even individual component failures
Single Point of Failure (SPOF)
> In any fault tolerant setup, you want to avoid a SPOF, as a chain is no stronger than its weakest link
Fail over and Fail back
> Fail over is the process of switching from a failed component to another component, either dormant or already active. Fail back is the process of switching back from the backup component to the original one.
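The fail over and fail back definitions above can be sketched as a tiny state holder. This is only an illustration under assumed names (`Pair`, `db1`, `db2` are hypothetical and not from the talk); a real HA framework does far more than flip a label.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """Two nodes; `active` names the one currently serving."""
    primary: str
    standby: str
    active: str = ""

    def __post_init__(self):
        # The original (primary) node serves until something fails.
        self.active = self.active or self.primary

    def fail_over(self) -> str:
        """Switch service from the failed primary to the standby."""
        self.active = self.standby
        return self.active

    def fail_back(self, primary_healthy: bool) -> str:
        """Return service to the original node, but only once it is healthy."""
        if primary_healthy:
            self.active = self.primary
        return self.active
```

Usage: `Pair("db1", "db2").fail_over()` moves service to the standby; `fail_back(True)` returns it to the original node.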
Some HA Components
Heartbeat
> Heartbeat is an HA component that checks that the services being failed over are alive. Heartbeat can check individual servers, software services, networking etc.
HA Monitor
> The HA Monitor has different names in different frameworks. This is the component that allows configuration of the services, ensures proper shutdown and startup, and allows manual control
Replication
> Replication is a common component that ensures that the data in managed, data-rich components is kept in sync
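As a rough illustration of the heartbeat idea, the sketch below declares a peer dead once a number of beats are overdue. The function names, the one-second interval and the three-beat threshold are assumptions made for the example; they are not settings of any particular heartbeat product.

```python
def is_alive(last_beat: float, now: float,
             interval: float = 1.0, max_missed: int = 3) -> bool:
    """A peer counts as alive while fewer than `max_missed` beats are overdue."""
    return (now - last_beat) < interval * max_missed


def dead_peers(last_beats: dict, now: float,
               interval: float = 1.0, max_missed: int = 3) -> list:
    """Return the names of peers whose last heartbeat is too old, sorted."""
    return sorted(name for name, seen in last_beats.items()
                  if not is_alive(seen, now, interval, max_missed))
```

Timestamps are passed in rather than read from the clock, which keeps the check easy to test.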
What should I require?
Don’t aim too high, aim for what is reasonable for your needs
Aim to ensure that no important data is lost
> What is “important data”? You decide! Different data means different “needs”!
Aim to ensure that the solution can be automated. You will want this eventually anyway
Aim to ensure a solution that can easily be tested and administered
Aim to ensure that the solution is performant and scalable
HA with MySQL – In short
MySQL Replication
> Easy to use and set up. Low performance impact
> Asynchronous only. Failback can be difficult. Need additional components
MySQL with DRBD / ZFS / AVS
> Easy to use. Low cost software only. Synchronous. Good HA software integration.
> Certain performance impact. Limited data size and transaction rates.
MySQL with Shared storage
> Good performance. Eases hardware management. Good integration with HA software.
> Costly. SAN itself is a SPOF.
MySQL Cluster
> Very good performance. Self contained. Very short fail-over times. Software only solution.
> Needs several physical servers. Not optimized for all MySQL applications.
bwin games ab
Our goal at bwin
We were faced with a requirement: establish a highly available database platform.
We had some rules to follow from management.
> Interruptions due to hardware failure should not require hands-on work.
> Downtime should be minimized during interruptions.
> Performance of the DB platform should not decrease when operating as usual.
> Performance can decrease if a failure has occurred, but should not render the service unusable.
> Implementation should be done by the operations department. Developers should not be involved.
What solutions did we consider?
Master/Master
Linux HA
HP Service Guard
Sun Cluster
Combination of the above
MySQL Cluster
We will walk through all of the above
Master/Master
Master/Master with two active nodes would give us a seamless switch if we have a good load balancer.
> Will give us the ability to do schema changes “online”
> Not only higher availability when both nodes are up, but better performance.
> Can eliminate the use of production slaves.
> One entry point for the application when using a load balancer (“LB”)
Linux HA/ServiceGuard/SunCluster
Service IP switch will cause a glitch in service.
Since we are running 4.0 we can’t really do a master/master setup with service IP switching.
Slave integrity is important and we are running 4.0 with one master for the data. We can’t switch to a slave and hope that everything was replicated.
We are using a SAN, so shared storage is possible.
One instance, two machines: one active, one standby.
InnoDB log size will be a problem.
Timeouts during recovery can cause problems during the switch.
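The concern about switching to a slave and hoping everything was replicated can be made concrete by comparing binary log coordinates. The sketch below assumes the master's written position and the slave's executed position are already known, for example from `SHOW MASTER STATUS` and `SHOW SLAVE STATUS`; the dictionary keys and function names are hypothetical. It only helps for a planned switch or when the master's last position can still be read, which is exactly what is not guaranteed after a crash.

```python
def binlog_coord(log_file: str, pos: int) -> tuple:
    """Turn ('binlog.000012', 4711) into a comparable (12, 4711)."""
    return (int(log_file.rsplit(".", 1)[1]), pos)


def slave_caught_up(master: dict, slave: dict) -> bool:
    """True only if the slave has executed everything the master has written."""
    written = binlog_coord(master["file"], master["position"])
    executed = binlog_coord(slave["exec_file"], slave["exec_position"])
    return executed >= written
```

Parsing the numeric suffix avoids relying on string comparison of log file names.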
MySQL Cluster
High availability built in if implemented correctly
Requires more hardware.
More complex solution
Requires application to support NDB
Not full feature set.
Obstacles
We are using MySQL 4.0 in our biggest database
Master/Master scenario on 4.0 requires a higher level of application awareness.
LinuxHA/ServiceGuard/Sun Cluster will cause a small glitch when we move resources.
MySQL Cluster will require even more application changes in our case.
Our Choice
LinuxHA because it is GPL/LGPL. Free and not owned by an organization.
Fastest way to implement; did not require any support from the dev department.
All other ways required changes in the application.
Layout
Two versions
We do..
Use Linux HA 2.0. Needed for setup of “cluster”
Use SAN. Shared storage is easier and faster, but expensive.
> DRBD can be used, but it stores the same data twice and also comes with a performance decrease.
Heartbeat on two bonds: primary on the database interconnect network, secondary on the database service network
We have LUNs presented to multiple hosts
Services have rules to be run on specific hosts only.
We fence using RiLOE
> Have plans to fence on port level in FC switches.
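Because the LUNs are presented to multiple hosts, the order of takeover steps matters: the failed peer has to be fenced before the standby touches the shared storage. The sketch below shows that ordering only. The three callables are hypothetical placeholders for whatever actually powers off the peer, mounts the LUN and starts mysqld; this is not how Linux HA itself is configured.

```python
def take_over(fence_peer, mount_storage, start_mysqld) -> bool:
    """Run the takeover steps in order and stop at the first failure."""
    # Fence first: the peer must be confirmed off before the shared LUN
    # is mounted, otherwise two hosts could write to the same storage.
    for step in (fence_peer, mount_storage, start_mysqld):
        if not step():
            return False
    return True
```

If fencing fails, the storage is never mounted, which trades availability for data safety.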
What’s good and what’s bad..
Easy and fast implementation
Our config does not increase/decrease performance.
InnoDB log size causes long recovery times. Testing to decrease it has caused performance penalties.
Our solution is not foolproof because of long recovery times.
It causes interruption of service.
We can say it’s HA, but a true HA solution would give us 100% uptime.
2nd setup is complicated. We should aim for having simple setups. More common
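The gap between this setup and 100% uptime can be put into numbers with simple arithmetic. The sketch below counts only failover outages; the function name and any inputs you feed it are illustrative assumptions, not measurements from bwin.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def availability(failovers_per_year: int, seconds_per_failover: float) -> float:
    """Fraction of the year the service is up, counting only failover outages."""
    downtime = failovers_per_year * seconds_per_failover
    return 1.0 - downtime / SECONDS_PER_YEAR
```

Longer InnoDB recovery raises `seconds_per_failover` directly, which is why recovery time dominates the result for an active/standby pair.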
What can we do better
Fine tune config for faster recovery/startup
Add better fencing
Monitor failover in case recovery takes long
Master/Master or Multi master.
> If the application can reconnect, or if we have a smart load balancer, we have no outages.
> Upgrades or schema changes can be made “online”
> No separation between writes and reads. Less complicated for developers. One entry point.
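The "application can reconnect" point can be sketched as a small retry wrapper. Everything here is an assumption for illustration: `connect` stands for whatever function opens the database connection, and it is assumed to raise `ConnectionError` while the service is moving. Real drivers raise their own exception types.

```python
import time


def connect_with_retry(connect, attempts: int = 5, delay: float = 0.5,
                       sleep=time.sleep):
    """Call `connect()` until it succeeds or `attempts` is exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(delay)  # give the failover time to complete
    raise last_error
```

The `sleep` parameter is injectable so the wait can be skipped in tests.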
Summary
Concepts
Components
Requirements
Technologies
Your goal
Considerations
Obstacles
How we did it @ bwin games AB
HA recommendations
Questions
The question is not, ‘What is the answer?’ The question is, ‘What is the question?’
Henri Poincaré
Thank you for your time! And thank you for listening so kindly.
We can be found on:
Robert Krzykawski – http://krzykawski.com
Anders Karlsson – http://papablues.com, http://karlssonondatabases.blogspot.com/