linux ha users guide

26
The Linux-HA User’s Guide Florian Haas <[email protected]> The Linux-HA Project<[email protected]>

Upload: par2cola

Post on 22-Oct-2014

142 views

Category:

Documents


5 download

TRANSCRIPT

Page 1: Linux Ha Users Guide

The�Linux-HA�User’s�Guide

Florian�Haas�<[email protected]>The�Linux-HA�Project<[email protected]>

Page 2: Linux Ha Users Guide

The�Linux-HA�User’s�Guideby Florian HaasCopyright © 2009, 2010 LINBIT HA-Solutions GmbH, The Linux-HA Project

License information

The text of and illustrations in this document are licensed under a Creative Commons Attribution–Share Alike 3.0 Unported license("CC-BY-SA").

• A summary of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/.

• The full license text is available at http://creativecommons.org/licenses/by-sa/3.0/legalcode.

• In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.

Page 3: Linux Ha Users Guide

iii

Preface ....................................................................................................................... ivI. Introduction to Heartbeat .......................................................................................... 1

1. Heartbeat as a Cluster Messaging Layer ............................................................. 22. Components .................................................................................................... 3

2.1. Communication module .......................................................................... 32.2. Cluster Consensus Membership ............................................................... 32.3. Cluster Plumbing Library ........................................................................ 32.4. IPC Library ........................................................................................... 42.5. Non-blocking logging daemon ................................................................ 4

II. Installing Heartbeat .................................................................................................. 53. Building and installing from source ..................................................................... 6

3.1. Building and installing Cluster Glue from source ......................................... 63.1.1. Cluster Glue build prerequisites .................................................... 63.1.2. Downloading Cluster Glue sources ................................................ 63.1.3. Building Cluster Glue ................................................................... 73.1.4. Building Packages ....................................................................... 8

3.2. Building and installing Heartbeat from source ............................................ 83.2.1. Heartbeat build prerequisites ....................................................... 83.2.2. Downloading Heartbeat sources ................................................... 83.2.3. Building Heartbeat ...................................................................... 93.2.4. Building Packages ..................................................................... 10

4. Installing pre-built packages ............................................................................ 114.1. Debian and Ubuntu .............................................................................. 114.2. Fedora, RHEL and CentOS .................................................................... 114.3. OpenSUSE and SLES ............................................................................ 11

III. Administrative Tasks .............................................................................................. 125. Creating an initial Heartbeat configuration ........................................................ 13

5.1. The ha.cf file ................................................................................... 135.2. The authkeys file ............................................................................. 135.3. Propagating the cluster configuration to cluster nodes ............................. 145.4. Starting Heartbeat services .................................................................. 145.5. Where to go from here ........................................................................ 15

6. Upgrading from previous Heartbeat versions ..................................................... 166.1. Upgrading from Heartbeat 2.1 clusters not using the CRM ........................ 16

6.1.1. Stopping Heartbeat services ...................................................... 166.1.2. Upgrade software ..................................................................... 166.1.3. Enabling the Heartbeat cluster to use Pacemaker .......................... 166.1.4. Restarting Heartbeat ................................................................. 17

6.2. Upgrading from CRM-enabled Heartbeat 2.1 clusters .............................. 176.2.1. Placing the cluster in unmanaged mode ....................................... 176.2.2. Backing up the CIB ................................................................... 186.2.3. Stopping Heartbeat services ...................................................... 186.2.4. Wiping files related to the CRM .................................................. 186.2.5. Restoring the CIB ..................................................................... 186.2.6. Upgrading software .................................................................. 196.2.7. Restarting Heartbeat services .................................................... 196.2.8. Returning the cluster to managed mode ...................................... 196.2.9. Upgrading the CIB schema ......................................................... 19

IV. Getting Help and Helping Out ................................................................................. 207. Reporting problems ........................................................................................ 21

7.1. Mailing Lists ....................................................................................... 217.1.1. Linux-HA ................................................................................. 217.1.2. Linux-HA-dev .......................................................................... 21

7.2. Bug Tracking System ........................................................................... 217.3. IRC .................................................................................................... 21

8. Submitting patches ........................................................................................ 22

Page 4: Linux Ha Users Guide

iv

PrefaceThis guide aims to be the definitive reference guide for users of the Heartbeat cluster messaginglayer, the Linux-HA cluster resource agents, and other clustering building blocks maintained bythe Linux-HA project.

We welcome and actively encourage any and all feedback about this document. Please usethe [email protected] [mailto:[email protected]] mailing list for comments,corrections, and suggestions for improvement.

Page 5: Linux Ha Users Guide

Part I. Introduction�to�Heartbeat

Page 6: Linux Ha Users Guide

2

Chapter 1. Heartbeat�as�a�ClusterMessaging�Layer

Heartbeat is a daemon that provides cluster infrastructure (communication and membership)services to its clients. This allows clients to know about the presence (or disappearance!) of peerprocesses on other machines and to easily exchange messages with them.

In order to be useful to users, the Heartbeat daemon needs to be combined with a clusterresource manager (CRM) which has the task of starting and stopping the services (IP addresses,web servers, etc) that cluster will make highly available. The canonical cluster resource managertypically associated with Heartbeat is Pacemaker [http://www.clusterlabs.org], a highly scalableand feature-rich implementation that supports both Heartbeat and, alternatively, the Corosync[http://www.corosync.org] cluster messaging layer.

Note

Up until Heartbeat release 2.1.3, Pacemaker was developed jointly with Heartbeat aspart of the Linux-HA umbrella project. After this release, the Pacemaker project wasspun off as a separate project and continues as such, while maintaining full supportfor Heartbeat cluster messaging.

Page 7: Linux Ha Users Guide

3

Chapter 2. ComponentsThe Heartbeat messaging layer comprises several interlocking but distinct components.

2.1. Communication�moduleThe Heartbeat communication module provides strongly authenticated, locally-ordered multicastmessaging over basically any media, IP-based or not. Heartbeat supports cluster communicationsover the following network link types:

• unicast UDP over IPv4;

• broadcast UDP over IPv4;

• multicast UDP over IPv4;

• serial link communications.

Warning

Please consider some important caveats with regard to serial link connectivity (seethe ha.cf man page, man ha.cf). As a general rule: when in doubt, serial linksshould be avoided.

Heartbeat can detect node failure reliably in less than a half-second. It will register with the systemwatchdog timer if configured to do so.

The heartbeat layer has an API which provides the following classes of services:

• intra-cluster communication - sending and receiving packets to cluster nodes

• configuration queries

• connectivity information (who can the current node hear packets from) - both for queries andstate change notifications

• basic group membership services

2.2. Cluster�Consensus�MembershipCCM provides strongly connected consensus cluster membership services. It ensures that everynode in a computed membership can talk to every other node in this same membership. CCMImplements both an OCF draft membership API, and the SAF AIS membership API. Typically itcomputes membership in sub-second time.

2.3. Cluster�Plumbing�LibraryThe Cluster plumbing library is a collection of very useful functions which provide a variety ofservices used by many of our main components. A few of the major objects provided by thislibrary include:

• compression API (with underlying compression plugins)

• Non-blocking logging API

• memory management oriented to continuously running services

Page 8: Linux Ha Users Guide

Components

4

• Hierarchical name-value pair messaging facility promoting portability and version upgradecompatibility (also provides optional message compression facilities)

• Signal unification - allowing signals to appear as mainloop events

• Core dump management utilities - promoting capture of core dumps in a uniform way, andunder all circumstances

• timers (like glib mainloop timers - but they work even when the time of day clock jumps)

• child process management - death of children causes invocation of process object, withconfigurable death-of-child messages

• Triggers (arbitrary events triggered by software)

• Realtime management - setting and unsetting high priorities, and locked into memoryattributes of processes.

• 64-bit HZ-granularity time manipulation (longclock_t)

• User id management for security purposes, for processes which need some root privileges.

• Mainloop integration for IPC, plain file descriptors, signals, etc. This means that all thesedifferent event sources are managed and dispatched consistently.

2.4. IPC�LibraryAll interprocess communication is performed using a very general IPC library which provides non-blocking access to IPC using a flexible queueing strategy, and includes integrated flow control. ThisIPC API does not require sockets, but the currently available implementations use UNIX (Local)Domain sockets.

This API also includes built-in authentication and authorization of peer processes, and is portableto most POSIX-like OSes. Although use of Glib mainloop with these APIs is not required, Heartbeatprovides simple and convenient integration with mainloop.

2.5. Non-blocking�logging�daemonlogd is Heartbeat’s logging daemon, capable of logging to a syslog daemon, to files, or both.logd never blocks, instead, it discards messages lagging too far behind.

Once it is capable of outputting messages again, logd prints a lost message count. Queue sizesare controllable both overall, and on a per-application basis.

Page 9: Linux Ha Users Guide

Part II. Installing�Heartbeat

Page 10: Linux Ha Users Guide

6

Chapter 3. Building�and�installing�fromsource

Building and installing the Heartbeat cluster messaging layer from source amounts to building thefollowing packages:

• heartbeat itself;

• the cluster-glue package containing the Heartbeat local resource manager (LRM) andSTONITH plugins.

Since heartbeat has a build dependency on cluster-glue, it is necessary to build and installcluster-glue first, before one proceeds with building and installing heartbeat.

3.1. Building�and�installing�Cluster�Glue�fromsource

3.1.1. Cluster�Glue�build�prerequisites

Building Cluster Glue requires the presence of the following tools and libraries on the build system:

• A C compiler (typically gcc) and associated C development libraries;

• the flex scanner generator and the bison parser compiler;

• net-snmp development headers, to enable SNMP related functionality;

• OpenIPMI development headers, to enable IPMI related functionality;

• Python (just the language interpreter, not library headers).

Note

This list applies to the default software configuration. If you configure the sourcewith non-standard options, other dependencies may apply.

3.1.2. Downloading�Cluster�Glue�sources

Several options are available for retrieving the Cluster Glue source code, for building locally ona target system.

Downloading�a�release�tarball

Downloading a released version of Heartbeat as a compressed tarball is equivalent to fetchinga tagged snapshot from the Mercurial source code repository. Release tags follow the formatglue-x.y.z, where x.y.z is the released version of Cluster Glue you wish to download.

If, for example, one wants to download the 1.0.1 release, the correct sequence of commandswould be:

# wget http://hg.linux-ha.org/glue/archive/glue-1.0.1.tar.bz2# tar -vxjf glue-1.0.1.tar.bz2

Page 11: Linux Ha Users Guide

Building and installing from source

7

Downloading�the�latest�Mercurial�snapshot

The latest development code is always available in the Mercurial repository as the tip revision.

To download a tarball auto-generated from the tip, use this sequence of commands:

# wget http://hg.linux-ha.org/glue/archive/tip.tar.bz2# tar -vxjf tip.tar.bz2

Checking�out�sources�from�Mercurial

This is the method you would apply if you have the Mercurial utilities locally installed. Checkingout the sources amounts to cloning the repository:

$ hg clone http://hg.linux-ha.org/glue cluster-gluerequesting all changesadding changesetsadding manifestsadding file changesadded 12491 changesets with 34830 changes to 2632 filesupdating working directory356 files updated, 0 files merged, 0 files removed, 0 files unresolved

3.1.3. Building�Cluster�Glue

Building Cluster Glue is an automated process making extensive use of GNU Autotools. Whenbuilding and installing on the same machine, it usually amounts to just the following sequence ofcommands:

$ ./autogen.sh$ ./configure$ make$ sudo make install

Note

The autogen.sh script is a convenience wrapper around automake,autoheader, autoconf, and libtool.

A number of configuration options are supported, and you may tweak some of them tooptimize Heartbeat for your system. To retrieve a list of configuration options, you may invokeconfigure with the --help option. A customized build may thus comprise these steps:

$ ./autogen.sh$ ./configure --help$ ./configure configuration-options$ make$ sudo make install

Some typical configuration options that you may wish to set are --prefix, --sysconfdir,and --localstatedir, as shown in this example:

$ ./autogen.sh$ ./configure --help$ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var$ make$ sudo make install

Page 12: Linux Ha Users Guide

Building and installing from source

8

3.1.4. Building�Packages

RPM�packages

RPM spec files are provided in the Cluster Glue source tree both for SuSE and Red Hat baseddistributions:

• cluster-glue-suse.spec should be used for OpenSUSE and SLES installations.

• cluster-glue-fedora.spec is for Fedora, Red Hat Enterprise Linux, and CentOS.

Debian�packages

Debian packaging for Cluster glue is maintained in a Mercurial repository onalioth.debian.org [http://hg.debian.org/hg/debian-ha/cluster-glue/]. Thus, rather thancloning or downloading from the upstream Mercurial repository, use the one hosted on alioth.

Once you have checked out or unpacked the source tree from the alioth repository, simplyinvoke dpkg-buildpackage from the top of the source tree — like you would with any otherDebian package.

3.2. Building�and�installing�Heartbeat�fromsource

3.2.1. Heartbeat�build�prerequisites

Building Heartbeat requires the presence of the following tools and libraries on the build system:

• A C compiler (typically gcc) and associated C development libraries;

• the flex scanner generator and the bison parser compiler;

• net-snmp development headers, to enable SNMP related functionality;

• OpenIPMI development headers, to enable IPMI related functionality;

• Python (just the language interpreter, not library headers)

• the cluster-glue development headers. See Section 3.1, “Building and installing ClusterGlue from source” [6] for details on how to build these from source.

Note

This list applies to the default software configuration. If you configure the sourcewith non-standard options, other dependencies may apply.

3.2.2. Downloading�Heartbeat�sources

Downloading�a�release�tarball

Downloading a released version of Heartbeat as a compressed tarball is equivalent to fetchinga tagged snapshot from the Mercurial source code repository. Release tags follow the formatSTABLE-x.y.z, where x.y.z is the released version of Heartbeat you wish to download.

If, for example, one wants to download the 3.0.4 release, the correct sequence of commandswould be:

Page 13: Linux Ha Users Guide

Building and installing from source

9

# wget http://hg.linux-ha.org/dev/archive/STABLE-3.0.4.tar.bz2# tar -vxjf STABLE-3.0.4.tar.bz2

Downloading�the�latest�Mercurial�snapshot

The latest development code is always available in the Mercurial repository as the tip revision.

To download a tarball auto-generated from the tip, use this sequence of commands:

# wget http://hg.linux-ha.org/dev/archive/tip.tar.bz2# tar -vxjf tip.tar.bz2

Checking�out�sources�from�Mercurial

This is the method you would apply if you have the Mercurial utilities locally installed. Checkingout the sources amounts to cloning the repository:

$ hg clone http://hg.linux-ha.org/dev heartbeat-devrequesting all changesadding changesetsadding manifestsadding file changesadded 12491 changesets with 34830 changes to 2632 filesupdating working directory356 files updated, 0 files merged, 0 files removed, 0 files unresolved

3.2.3. Building�Heartbeat

Building Heartbeat is an automated process making extensive use of GNU Autotools. Whenbuilding and installing on the same machine, it usually amounts to just the following sequence ofcommands:

$ ./bootstrap$ ./configure$ make$ sudo make install

Note

The bootstrap script is a convenience wrapper around automake, autoheader,autoconf, and libtool. ConfigureMe is a convenience wrapper for theautoconf-generated configure script.

A number of configuration options are supported, and you may tweak some of them tooptimize Heartbeat for your system. To retrieve a list of configuration options, you may invokeconfigure with the --help option. A customized build may thus comprise these steps:

$ ./bootstrap$ ./configure --help$ ./configure <configuration-options>$ make$ sudo make install

Some typical configuration options that you may wish to set are --prefix, --sysconfdir,and --localstatedir, as shown in this example:

$ ./bootstrap$ ./configure --help$ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var

Page 14: Linux Ha Users Guide

Building and installing from source

10

$ make$ sudo make install

3.2.4. Building�Packages

RPM�packages

RPM spec files are provided in the Heartbeat source tree both for SuSE and Red Hat baseddistributions:

heartbeat-suse.spec should be used for OpenSUSE and SLES installations. heartbeat-fedora.spec is for Fedora, Red Hat Enterprise Linux, and CentOS.

Debian�packages

Debian packaging for Heartbeat is maintained in a Mercurial repository onalioth.debian.org [http://hg.debian.org/hg/debian-ha/heartbeat/]. Thus, rather thancloning or downloading from the upstream Mercurial repository, use the one hosted on alioth.

Once you have checked out or unpacked the source tree from the alioth repository, simplyinvoke dpkg-buildpackage from the top of the source tree — like you would with any otherDebian package.

Page 15: Linux Ha Users Guide

11

Chapter 4. Installing�pre-built�packagesCluster Glue and Heartbeat are available as pre-built binary packages on a number of platforms,including

• Debian (fully included in squeeze and up, backports packages are available for lenny);

• Ubuntu (since lucid);

• Fedora (since release 12);

• OpenSUSE (since release 11).

Commercially supported enterprise packages for Red Hat Enterprise Linux and SUSE LinuxEnterprise server are available from LINBIT [http://www.linbit.com].

This section describes the steps necessary to install binary packages on these platforms.

4.1. Debian�and�UbuntuInstalling the cluster-glue and heartbeat packages on Debian and Ubuntu is astraightforward process. Assuming you have the correct package repositories configured for APT,install the two packages with the following commands:

aptitude install heartbeat cluster-glue

Since you most likely will also want to install Pacemaker (beyond the scope of this manual), doso by issuing the following commands as well:

aptitude install cluster-agents pacemaker

4.2. Fedora,�RHEL�and�CentOSOn Red Hat platforms, you install the cluster-glue and heartbeat packages with the YUMpackage manager. Assuming you have the correct package repositories configured in +/etc/yum.repos.d/, install the two packages with the following commands:

yum install heartbeat cluster-glue

Since you most likely will also want to install Pacemaker (beyond the scope of this manual), doso by issuing the following commands as well:

yum install resource-agents pacemaker

4.3. OpenSUSE�and�SLESOn SUSE platforms, you install the cluster-glue and heartbeat packages with the Zypperpackage manager. Assuming you have the correct package repositories configured, install the twopackages with the following commands:

zypper install heartbeat cluster-glue

Since you most likely will also want to install Pacemaker (beyond the scope of this manual), doso by issuing the following commands as well:

zypper install resource-agents pacemaker

Page 16: Linux Ha Users Guide

Part III. Administrative�Tasks

Page 17: Linux Ha Users Guide

13

Chapter 5. Creating�an�initialHeartbeat�configuration

For any Heartbeat cluster, the following configuration files must be available:

• /etc/ha.d/ha.cf— the global cluster configuration file.

• /etc/ha.d/authkeys— a file containing keys for mutual node authentication.

5.1. The�ha.cf�fileThe following example is a small and simple ha.cf file:

autojoin nonemcast bond0 239.0.0.43 694 1 0bcast eth2warntime 5deadtime 15initdead 60keepalive 2node alicenode bobpacemaker respawn

Setting autojoin to none disables cluster node auto-discovery and requires that cluster nodesbe listed explicitly, using the node options. This speeds up cluster start-up in clusters with a fixedsmall number of nodes.

This example assumes that bond0 is the cluster’s interface to the shared network, and that eth2is the interface dedicated for DRBD replication between both nodes. Thus, bond0 can be used forMulticast heartbeat, whereas on eth2 broadcast is acceptable as eth2 is not a shared network.

The next options configure node failure detection. They set the time after which Heartbeat issuesa warning that a no longer available peer node may be dead (warntime), the time after whichHeartbeat considers a node confirmed dead (deadtime), and the maximum time it waits forother nodes to check in at cluster startup (initdead). keepalive sets the interval at whichHeartbeat keep-alive packets are sent. All these options are given in seconds.

The node option identifies cluster members. The option values listed here must match the exacthost names of cluster nodes as given by uname -n.

pacemaker respawn enables the Pacemaker cluster manager, and ensures that Pacemaker isautomatically restarted in case of a failure.

Note

Prior to Heartbeat release 3.0.4, the pacemaker keyword was named crm. Newerversions still retain the old name as a compatibility alias, but pacemaker is thepreferred syntax.

5.2. The�authkeys�file/etc/ha.d/authkeys contains pre-shared secrets used for mutual cluster nodeauthentication. It should only be readable by root and follows this format:

auth <num>

Page 18: Linux Ha Users Guide

Creating an initialHeartbeat configuration

14

<num> <algorithm> <secret>

num is a simple key index, starting with 1. Usually, you will only have one key in your authkeysfile.

algorithm is the signature algorithm being used. You may use either md5 or sha1; the use ofcrc (a simple cyclic redundancy check, not secure) is not recommended.

secret is the actual authentication key.

You may create an authkeys file, using a generated secret, with the following shell hack:

( echo -ne "auth 1\n1 sha1 "; \ dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \ > /etc/ha.d/authkeyschmod 0600 /etc/ha.d/authkeys

5.3. Propagating�the�cluster�configuration�tocluster�nodes

In order to propagate the contents of the ha.cf and authkeys configuration files, you may use theha_propagate command, which you would invoke using either

/usr/lib/heartbeat/ha_propagate

or

/usr/lib64/heartbeat/ha_propagate

This utility will copy the configuration files over to any node listed in /etc/ha.d/ha.cf usingscp. It will afterwards also connect to the nodes using ssh and issue chkconfig heartbeaton in order to enable Heartbeat services on system startup.

5.4. Starting�Heartbeat�servicesThe Heartbeat services are started just as you would any other system service on your machine.Depending on your system platform, you might be using one of the following commands:

/etc/init.d/heartbeat start

service heartbeat start

rcheartbeat start

Through the pacemaker entry in your ha.cf [13], Heartbeat will now start the Pacemakerdaemon (named crmd for historical reasons) along with the rest of its services. After a fewseconds, you should be able to detect Heartbeat processes in your process table:

# ps -AHfww | grep heartbeatroot 2772 1639 0 14:27 pts/0 00:00:00 grep heartbeatroot 4175 1 0 Nov08 ? 00:37:57 heartbeat: master control processroot 4224 4175 0 Nov08 ? 00:01:13 heartbeat: FIFO readerroot 4227 4175 0 Nov08 ? 00:01:28 heartbeat: write: bcast eth2root 4228 4175 0 Nov08 ? 00:01:29 heartbeat: read: bcast eth2root 4229 4175 0 Nov08 ? 00:01:35 heartbeat: write: mcast bond0root 4230 4175 0 Nov08 ? 00:01:32 heartbeat: read: mcast bond0102 4233 4175 0 Nov08 ? 00:03:37 /usr/lib/heartbeat/ccm102 4234 4175 0 Nov08 ? 00:15:02 /usr/lib/heartbeat/cib

Page 19: Linux Ha Users Guide

Creating an initialHeartbeat configuration

15

root 4235 4175 0 Nov08 ? 00:17:14 /usr/lib/heartbeat/lrmd -rroot 4236 4175 0 Nov08 ? 00:02:48 /usr/lib/heartbeat/stonithd102 4237 4175 0 Nov08 ? 00:00:54 /usr/lib/heartbeat/attrd102 4238 4175 0 Nov08 ? 00:08:32 /usr/lib/heartbeat/crmd102 5724 4238 0 Nov08 ? 00:04:47 /usr/lib/heartbeat/pengine

Finally, you should also be able to confirm that the cluster is working, via the Pacemaker crm_moncommand:

# crm_mon -1============Last updated: Mon Dec 13 14:29:36 2010Stack: HeartbeatCurrent DC: alice (083146b9-6e26-4ac8-a705-317095d0ba57) - partition with quorumVersion: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b2 Nodes configured, unknown expected votes24 Resources configured.============

Online: [ alice bob ]

5.5. Where�to�go�from�hereNow that you have a working Heartbeat configuration, you will want to continue with configuringPacemaker and adding cluster resources.

The following documents are available for your perusal:

• Clusters From Scratch [http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/] is an excellent guide for configuring Pacemaker clusters. Thedocument primarily covers Pacemaker on Corosync, but starting with its chapter "UsingPacemaker Tools" applies identically to Heartbeat clusters.

• The DRBD User’s Guide [http://www.drbd.org/users-guide/] has a chapter dedicated tointegrating DRBD with Pacemaker clusters.

• Several Technical Guides covering Heartbeat/Pacemaker clusters for a variety of applicationsare available from the LINBIT web site [http://www.linbit.com/en/education/tech-guides].

Page 20: Linux Ha Users Guide

16

Chapter 6. Upgrading�from�previousHeartbeat�versions

6.1. Upgrading�from�Heartbeat�2.1�clusters�notusing�the�CRM

For Heartbeat 2.1 clusters not using the CRM (i.e., clusters configured with the haresourcesfile, an upgrade to 3.0 involves converting the current configuration to one that is suitable forPacemaker.

Note

This upgrade procedure does incur application down time. However, when theupgrade is properly planned, tested, and executed, this down time amounts tominutes, possibly even seconds (depending on configuration).

6.1.1. Stopping�Heartbeat�services

You should commence the upgrade procedure on your current standby node, that is, the clusternode currently not running any resources. If your cluster is using an active-active configuration(both nodes running resources), select one and issue the following command to transfer allresources to the peer node:

# hb_standby

Then, and on that node only, stop Heartbeat services:

# /etc/init.d/heartbeat stop

6.1.2. Upgrade�software

While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been splitup into modular parts. Thus you will replace Heartbeat with three individual pieces of software:Cluster Glue, Pacemaker, and Heartbeat 3 which comprises just the cluster messaging layer.

• Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run makeuninstall. Then, install Cluster Glue and Heartbeat.

• Upgrading using locally built packages: When installing packages manually, uninstall theheartbeat package first. Then install cluster-glue, the version 3 heartbeat package,resource-agents, and pacemaker.

• Upgrading using a package repository: When upgrading using an APT, YUM, or Zypperrepository, you should just be able to run the install command for heartbeat version 3 andpacemaker, and the dependencies will be resolved automatically.

Do not restart Heartbeat services at this point.

6.1.3. Enabling�the�Heartbeat�cluster�to�use�Pacemaker

The cluster messaging layer must now be instructed to start Pacemaker on cluster start-up. Todo so, add

Page 21: Linux Ha Users Guide

Upgrading from previousHeartbeat versions

17

crm respawn

to your ha.cf configuration file.

Important

At this point, you should also check your ha.cf file against the ha.cf(5) manualpage, and remove any deprecated options.

When your ha.cf modifications are complete, copy the file to the peer node.

6.1.4. Restarting�Heartbeat

Your cluster is now ready to be restarted in Pacemaker-enabled mode. To do so:

1. Run /etc/init.d/heartbeat stop on your still-active node. This will shutdown yourcluster resources.

2. Run /etc/init.d/heartbeat start on your standby node (the one where you createdyour CIB). This will start the local Heartbeat instance and Pacemaker, and wait for other clusternodes to check in.

3. Run /etc/init.d/heartbeat start on your the other node. This will start the localHeartbeat instance and Pacemaker, fetch the CIB automatically, and start applications.

6.2. Upgrading�from�CRM-enabled�Heartbeat2.1�clusters

This section outlines the steps necessary to upgrade a Heartbeat 2.1 cluster with the built-inCRM enabled, to Heartbeat 3.0 with Pacemaker.

Note

When properly planned and executed, the upgrade procedure can be completed ina manner of minutes, with no application down time. It is strongly recommendedto read and understand the steps outlined in this section before attempting aproduction cluster uqpgrade. All commands stated in this section must be run asroot. Do not perform the individual steps in parallel on all of your cluster nodes.Instead, complete the procedure on each node before continuing with the next.

6.2.1. Placing�the�cluster�in�unmanaged�mode

With this step, the cluster temporarily relinquishes control of its resources. This means that thecluster no longer monitors its nodes or resources for the duration of the upgrade, and will notrectify and application or node failures during this time. Currently running resources, however,will continue to run.

# crm_attribute -t crm_config -n is_managed_default -v false

In most configurations, individual resources do not set the is_managed attribute individually,and hence the cluster-wide attribute is_managed_default applies to all of them.

If in your specific configuration you do have resources that have this attribute set, you shouldremove it to make sure the default applies:

# crm_resource -t primitive -r <resource-id> -p is_managed -D

Page 22: Linux Ha Users Guide

Upgrading from previousHeartbeat versions

18

6.2.2. Backing�up�the�CIB

At this point, is is important to save a copy of the Cluster Information Base (CIB). The CIB to saveis stored in a file named cib.xml, normally located in /var/lib/heartbeat/crm.

# cp /var/lib/heartbeat/crm/cib.xml ~

You need to perform this step on only one node currently connected to the cluster. Do not deletethis file, it will be restored later.

6.2.3. Stopping�Heartbeat�services

You may now stop Heartbeat with /etc/init.d/heartbeat stop or the preferredcommand to stop a system service on your distribution (service heartbeat stop,rcheartbeat stop, etc.).

In case you are running a legacy version of Heartbeat affected by a shutdown bug, then gracefulcrmd shutdown will not work properly in unmanaged mode.

In this case, after you have initiated a graceful service shutdown with the above command, killthe crmd process manually:

• use ps -AHfww to retrieve the process ID of crmd;

• kill crmd with a TERM signal.

6.2.4. Wiping�files�related�to�the�CRM

Warning

Before proceeding with this section, verify that you have created a backup copy ofthe CIB on one of your cluster nodes, as outlined in Section 6.2.2, “Backing up theCIB” [18].

You should now wipe local CRM-related files from your node. To do so, remove all files from thedirectory where the CRM stores CIB information, commonly /var/lib/heartbeat/crm.

# rm /var/lib/heartbeat/crm/*

6.2.5. Restoring�the�CIB

Note

You should only proceed with this step if Heartbeat is still stopped on all clusternodes, and all cluster nodes have had their CIB contents wiped. If you still haveremaining nodes that have a residual CIB configuration, proceed as outlined inSection 6.2.4, “Wiping files related to the CRM” [18].

Restoring the CIB means copying the CIB backup described in Section 6.2.2, “Backing up theCIB” [18] to the /var/lib/heartbeat/crm directory.

# cp ~/cib.xml /var/lib/heartbeat/crm/cib.xml# chown hacluster:haclient /var/lib/heartbeat/crm/cib.xml# chmod 0600 /var/lib/heartbeat/crm/cib.xml

You must perform this step on one node only, namely the first node on which you are about toupgrade the cluster software. On all other nodes, the /var/lib/heartbeat/crm directorymust remain empty — Pacemaker distributes the CIB automatically.

Page 23: Linux Ha Users Guide

Upgrading from previousHeartbeat versions

19

6.2.6. Upgrading�software

While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been splitup into modular parts. Thus you will replace Heartbeat with three individual pieces of software:Cluster Glue, Pacemaker, and Heartbeat 3 which comprises just the cluster messaging layer.

• Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run makeuninstall. Then, install Cluster Glue and Heartbeat.

• Upgrading using locally built packages: When installing packages manually, uninstall theheartbeat package first. Then install cluster-glue, the version 3 heartbeat package,resource-agents, and pacemaker.

• Upgrading using a package repository: When upgrading using an APT, YUM, or Zypperrepository, you should just be able to run the install command for heartbeat version 3 andpacemaker, and the dependencies will be resolved automatically.

If this is the last node to be upgraded in your cluster, and your package management systemdid not restart Heartbeat services after the software upgrade, you should now proceed toSection  6.2.7, “Restarting Heartbeat services” [19]. Otherwise, you should move to thenext node and proceed as outlined in Section  6.1.1, “Stopping Heartbeat services” [16]Section 6.2.6, “Upgrading software” [19].

6.2.7. Restarting�Heartbeat�services

Note

This step may be omitted in case your package management system automaticallyrestarts Heartbeat services during post-install.

First, restart Heartbeat on the node where you restored the CIB (see Section 6.2.5, “Restoringthe CIB” [18]) with /etc/init.d/heartbeat start. Then, repeat this command on theremaining cluster nodes. At this time,

• the cluster is still in unmanaged mode (meaning it does not start, stop, or monitor anyresources),

• the cluster redistributes the old CIB among its nodes, and

• the cluster is still using the pre-upgrade CIB schema.

6.2.8. Returning�the�cluster�to�managed�mode

Once the cluster software has been upgraded, it is recommended to return the cluster intomanaged mode:

# crm_attribute -t crm_config -n is_managed_default -v true

6.2.9. Upgrading�the�CIB�schema

Although an upgraded cluster can theoretically operate on the pre-upgrade CIB schemaindefinitely, it is strongly recommended to upgrade the CIB to the current schema. To do so,run the following command after cluster communications between all nodes have been re-established:

# cibadmin --upgrade --force

Page 24: Linux Ha Users Guide

Part IV. Getting�Help�and�Helping�Out

Page 25: Linux Ha Users Guide

21

Chapter 7. Reporting�problemsThis chapter describes the appropriate channels for dealing with bugs and issues releated to thesoftware.

7.1. Mailing�ListsThe Linux-HA project maintains two mailing lists. Both are subscriber-only lists, hosted atlists.linux-ha.org.

7.1.1. Linux-HA

This is the general-purpose community mailing list. Post here if you have a general question aboutHeartbeat or the resource agents. The post address for this list is [email protected][mailto:[email protected]].

7.1.2. Linux-HA-dev

This is the developer mailing list. Post here if you have enhancement suggestions or think youhave found a bug. The post address for this list is [email protected] [mailto:[email protected]].

7.2. Bug�Tracking�SystemThe Linux-HA project uses a public Bugzilla installation run by the Linux Foundation at http://developerbugs.linuxfoundation.org. To report bugs or comment on existing entries, you mustcreate an account, which requires a valid email address.

7.3. IRCLinux-HA developers can usually be found on the irc.freenode.net server, in the #linux-ha channel.

Page 26: Linux Ha Users Guide

22

Chapter 8. Submitting�patchesIf you have found a problem in Cluster Glue or Heartbeat, and you are able to fix it, the developerswould love to see a patch from you. To do so, please follow the steps outlined in this section.

Create a working copy (a Mercurial clone) of the upstream repository with the followingcommand:

hg clone http://hg.linux-ha.org/dev heartbeat

Create a new Mercurial queue, and a new patchset:

cd heartbeathg qinithg qnew --edit --force fix-superfrobnication

Note

The --force option rolls your local modications into the patch set you are justcreating.

In your patch message, be sure to include a meaningful description, for example:

High: IPC: fix superfrobnication on 63-bit platforms

In the IPC layer, superfrobnication breaks on 63-bit middle-endianLinux platforms when ACPI is disabled. Fencepost alignfoobar_create_reqqueue() and guard against memory spinlock starvationto fix this.

Now the patch set is good for review on the mailing list:

hg email [email protected] fix-superfrobnication

Once your patch has been accepted for merging, one of the upstream developers will push it intothe repository. At that point, you can update your checkout from upstream, and remove yourown patch set.

hg qpop -ahg pull --updatehg qdelete fix-superfrobnication