Introduction to PowerHA
Power High Availability

Skill Level: Intermediate

Uma Chandolu ([email protected]), Senior Staff Software Engineer, IBM

Tejaswini Kaujalgi ([email protected]), Software Engineer, IBM India Software Labs

17 Aug 2010

PowerHA for AIX® is the new name for HACMP (High Availability Cluster Multiprocessing). HACMP is an application that makes systems fault resilient and reduces application downtime. This article introduces PowerHA and explains in detail how to configure a two-node cluster. Because many customers ask for this configuration, the article should be useful for understanding PowerHA and setting up a two-node cluster.

Introduction

With today's growing business demands, critical applications need to be available all the time, and the systems that run them need to be fault tolerant. Fault-tolerant systems, however, come at a heavy cost. Hence, there is a need for a solution that provides these facilities and is also cost effective.

A high availability solution ensures that the failure of any component of the solution does not make the application and its data unavailable to the user community. This is achieved by eliminating single points of failure, which eliminates or masks both planned and unplanned downtime. No specialized hardware is required to make an application highly available. PowerHA does not, however, perform administrative tasks such as backups, time synchronization, or any application-specific configuration.

© Copyright IBM Corporation 2010. All rights reserved.

Figure 1 illustrates failover: when one server goes down, the other takes over.

Figure 1. Failover capacity

Overview of PowerHA

The terms PowerHA and HACMP are used interchangeably in this article. As mentioned earlier, PowerHA eliminates single points of failure (SPOFs). The following table lists possible SPOFs and how each is eliminated:

| Cluster object | Eliminated as SPOF by |
| --- | --- |
| Node | Using multiple nodes |
| Power source | Using multiple circuits or uninterruptible power supplies |
| Network adapter | Using redundant network adapters |
| Network | Using multiple networks to connect nodes |
| TCP/IP subsystem | Using non-IP networks to connect adjoining nodes and clients |
| Disk adapter | Using redundant disk adapters or multipath hardware |
| Disk | Using multiple disks with mirroring or RAID |
| Application | Adding nodes for takeover; configuring an application monitor |
| VIO server | Implementing dual VIO servers |
| Site | Adding an additional site |

The main goal is to have two servers so that if one fails, the other takes over. PowerHA is a clustering technology that provides both failover protection through redundancy and horizontal scalability through concurrent/parallel access.

PowerHA terminology

PowerHA terms fall into two groups: topology components and resource components.

The topology components are essentially the physical components. They include:

• Nodes: System p servers, which can be standalone partitions or VIOS clients

• Networks: IP networks and non-IP networks

• Communication interfaces: Token Ring or Ethernet adapters

• Communication devices: RS232 or heartbeat over disk

The resource components are the logical entities that are made highly available. They include:

• Application server: The start and stop scripts of the application

• Service IP address: End users are generally given an IP address for connecting to the application. This IP address is mapped to the node where the application is actually running. Because the IP address needs to remain highly available, it is part of the resource group.

• File system: Many applications require file systems to be mounted.

• Volume group: Many applications require volume groups to be highly available.

All the resources together form an entity called a resource group. PowerHA handles a resource group as a single unit and keeps it highly available. Each resource group has policies associated with it:

1. Startup policy: Determines on which node the resource group is activated

2. Fallover policy: Determines the fallover target node when a failure happens

3. Fallback policy: Determines whether or not the resource group falls back

Whenever a failure happens, PowerHA consults these policies and acts accordingly.
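As a rough illustration of how a fallover policy might select a target, the sketch below walks an ordered node list and picks the first node that is not the failed one. It only models a "next node in the priority list" style of policy; the function name pick_fallover_node and the node names are hypothetical and are not part of PowerHA.

```shell
# Hypothetical model of a "next priority node" fallover choice.
# Usage: pick_fallover_node FAILED_NODE NODE1 NODE2 ...
# The node list is given in priority order. Prints the first node
# that is not the failed one, or returns 1 if no node is left.
pick_fallover_node() {
    failed=$1
    shift
    for node in "$@"; do
        if [ "$node" != "$failed" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}
```

For example, pick_fallover_node node1 node1 node2 prints node2. In a real cluster this decision is made by PowerHA from the policies configured on the resource group.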

The subsystems of PowerHA

Figure 2. Subsystems of PowerHA

As the illustration above shows, PowerHA comprises a number of software components:

• The cluster manager, clstrmgr, is the core process that monitors cluster membership. The cluster manager includes a topology manager to manage the topology components, a resource manager to manage resource groups, and an event manager with event scripts that works through the RMC facility and RSCT to react to failures.

• The clinfo process provides an API for communication between the cluster manager and your application. Clinfo also provides remote monitoring capabilities and can run a script in response to a status change in the cluster.

• In PowerHA 5, clcomdES allows the cluster managers to communicate in a secure manner without using rsh and the /.rhosts files.

Configuration of a 2 node cluster

Before starting with the configuration, let's look at the networking and storage considerations for PowerHA.

Networking

PowerHA uses networks to detect and diagnose failures, and to provide clients with highly available access to applications.

Internode communication also happens through networks. PowerHA directly detects three kinds of failure: network, NIC, and node failure. It detects and diagnoses them through the RSCT daemons. RSCT detects the loss of heartbeat packets, which are sent across all the networks, and determines the exact failure (node, NIC, or network).

Figure 3 shows heartbeat packets being sent and received by all NICs, which helps in determining the failures.

Figure 3. Cluster representing heartbeat packets





If the heartbeat packets stop, both nodes assume that the other is down, and each tries to bring the resource group online. This could result in massive data corruption.

To avoid this, PowerHA uses two kinds of networks:

1. IP networks: Examples are Ethernet and EtherChannel

2. Non-IP networks: An example is RS232. A non-IP network ensures that even if the IP network goes down, PowerHA can differentiate between a network failure and a node failure.
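The reasoning behind the second network type can be sketched as a small decision function. This is only a model of the logic described above, not an RSCT interface; the function name diagnose_failure and the "up"/"down" arguments are assumptions made for illustration.

```shell
# Hypothetical model of the diagnosis logic; not an RSCT interface.
# Usage: diagnose_failure IP_HEARTBEAT NONIP_HEARTBEAT
# Each argument is "up" if heartbeats from the peer still arrive
# over that kind of network, or "down" if they have stopped.
diagnose_failure() {
    ip_hb=$1
    nonip_hb=$2
    if [ "$ip_hb" = "up" ] && [ "$nonip_hb" = "up" ]; then
        echo "healthy"
    elif [ "$ip_hb" = "down" ] && [ "$nonip_hb" = "up" ]; then
        # The peer still answers on the serial or disk path,
        # so only the IP network has failed.
        echo "network failure"
    elif [ "$ip_hb" = "down" ] && [ "$nonip_hb" = "down" ]; then
        # No path reaches the peer any more.
        echo "node failure"
    else
        # Remaining case: IP heartbeats arrive, non-IP ones do not.
        echo "non-IP link failure"
    fi
}
```

Without the non-IP path, the second and third cases would look identical to the surviving node, which is exactly the situation that leads both nodes to acquire the resource group.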

IP address takeover (IPAT)

Most applications require that their IP address be highly available. To ensure this, the service IP is included in the resource group. The movement of this service IP from one NIC to another is called IP address takeover. There are two ways to use IPAT:

1. IPAT via aliasing: PowerHA adds the service IP address to the NIC, using the AIX IP aliasing feature

2. IPAT via replacement: PowerHA replaces the interface IP address with the service IP
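To make the difference between the two styles concrete, the sketch below prints the kind of manual ifconfig command that roughly corresponds to each one. It deliberately only prints the command instead of running it. PowerHA performs the takeover through its own event scripts, so this is an illustration rather than what the product executes; the function name ipat_command and the netmask used in the example are assumptions.

```shell
# Prints (does not run) a manual ifconfig command for each IPAT style.
# Usage: ipat_command alias|replace INTERFACE SERVICE_IP NETMASK
ipat_command() {
    mode=$1
    ifname=$2
    ip=$3
    mask=$4
    case "$mode" in
        # Aliasing: the service IP is added next to the existing address.
        alias)   echo "ifconfig $ifname alias $ip netmask $mask" ;;
        # Replacement: the service IP becomes the interface address.
        replace) echo "ifconfig $ifname $ip netmask $mask" ;;
        *)       echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

For example, ipat_command alias en0 192.168.22.39 255.255.255.0 prints the aliasing form for the db2live service address used later in this article (the netmask here is assumed).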





Storage

Storage can be broadly classified into two types:

1. Private storage: Owned by only one node

2. Shared storage: Owned by more than one node in the cluster

All application data resides in the shared storage. To avoid data inconsistency, shared storage can be protected in the following ways:

1. Reserve/release-based shared storage protection: Used with standard volume groups

2. RSCT-based shared storage protection: Used with enhanced concurrent volume groups

HACMP 5.x supports the RSCT-based style of shared storage protection, which relies on AIX's RSCT component to coordinate the ownership of shared storage when using enhanced concurrent volume groups in non-concurrent mode.

Configuration

Before starting the configuration, the cluster must be properly planned. The online planning worksheets (OLPW) can be used for this purpose. This section explains the configuration of a two-node cluster. In the example, both nodes have three Ethernet adapters and two shared disks.

Step 1: Fileset installation

After installing AIX, the first step is to install the required filesets, listed below. The RSCT and BOS filesets can be found on the AIX base installation CDs. A PowerHA license must be purchased to install the HACMP filesets.

RSCT filesets:

• rsct.compat.basic.hacmp
• rsct.compat.clients.hacmp
• rsct.basic.hacmp
• rsct.basic.rte
• rsct.opt.storagerm
• rsct.crypt.des
• rsct.crypt.3des
• rsct.crypt.aes256

BOS filesets:

• bos.data
• bos.adt.libm
• bos.adt.syscalls
• bos.clvm.enh
• bos.net.nfs.server

HACMP 5.5 filesets:

• cluster.adt.es
• cluster.es.assist
• cluster.es.cspoc
• cluster.es.plugins
• cluster.assist.license
• cluster.doc.en_US.assist
• cluster.doc.en_US.es
• cluster.es.worksheets
• cluster.license
• cluster.man.en_US.assist
• cluster.man.en_US.es
• cluster.es.client
• cluster.es.server
• cluster.es.nfs
• cluster.es.cfs

After installing the filesets, reboot the partition.
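One way to confirm that nothing was missed is to ask lslpp about each required fileset. The following is a minimal sketch, assuming that lslpp -L returns a non-zero exit status for a fileset that is not installed; the function name missing_filesets is hypothetical.

```shell
# Usage: missing_filesets FILESET...
# Prints every fileset that lslpp does not report as installed.
# Example (on an AIX node):
#   missing_filesets rsct.basic.rte bos.clvm.enh cluster.es.server
missing_filesets() {
    for fs in "$@"; do
        if ! lslpp -L "$fs" >/dev/null 2>&1; then
            echo "$fs"
        fi
    done
}
```

If the function prints nothing, every fileset passed to it was found. Running it on both nodes with the full list above helps catch differences between the nodes before the cluster is configured.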

Step 2: Setting the path

Next, the path needs to be set. To do that, add the following to the /.profile file:

export PATH=$PATH:/usr/es/sbin/cluster:/usr/es/sbin/cluster/utilities

Step 3: Network configuration

To configure an IP address on the Ethernet adapters, do the following:

#smitty tcpip -> Minimum Configuration and Startup -> Choose Ethernet network interface

You will have three Ethernet adapters: two with private IP addresses and one with a public IP address.

Here, fill in the relevant fields for en0, which you will configure with a public IP address.

Image 1. Configuration of a public IP address





This will configure the IP address and start the TCP/IP services on it.

Similarly, you configure the private IP addresses on en1 and en2.

Image 2. Configuration of a private IP address





Similarly, configure en2 with private IP 10.10.210.21 and start the TCP/IP services.

Next, add the IP addresses (of node1, node2, and the service IP, which is db2live here) and their labels to the /etc/hosts file. It should look like this:

# Internet Address    Hostname    # Comments
127.0.0.1 loopback localhost # loopback (lo0) name/address
192.168.20.72 node1.in.ibm.com node1
192.168.20.201 node2.in.ibm.com node2
10.10.35.5 node2ha1
10.10.210.11 node2ha2
10.10.35.4 node1ha1
10.10.210.21 node1ha2
192.168.22.39 db2live

The idea is to include each of the three ports of each machine, with relevant labels, for name resolution.

Perform similar operations on node2: configure en0 with the public IP and en1 and en2 with private IPs, and edit the /etc/hosts file.

To test that all is well, ping the various IP addresses from each machine.
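The ping test can be scripted so that it always covers every entry in the hosts file. The sketch below is one possible way to do it; the function names hosts_addresses and ping_all are hypothetical, and the loopback entry is skipped on the assumption that it is not of interest here.

```shell
# Usage: hosts_addresses HOSTS_FILE
# Prints the address column of every non-comment entry except loopback.
hosts_addresses() {
    awk '$1 !~ /^#/ && NF >= 2 && $1 != "127.0.0.1" { print $1 }' "$1"
}

# Usage: ping_all HOSTS_FILE
# Sends one ping to each address and reports whether it answered.
ping_all() {
    for ip in $(hosts_addresses "$1"); do
        if ping -c 1 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip NOT reachable"
        fi
    done
}
```

Run ping_all /etc/hosts on each node. The service address (db2live) will probably not answer at this stage, since it is only brought online later as part of the resource group.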

Step 4: Storage configuration

Shared storage is needed to create the heartbeat over FC disk. The disks need to be allocated from the SAN. Once both nodes can see the same disks (this can be verified using the LUN number), heartbeat over disk can be configured.

This method does not use Ethernet, which avoids a single point of failure in the Ethernet network, switches, or protocols.

The first step is to identify an available major number on all the nodes.

Image 3. Identifying available major number

Pick a number that is free on all nodes. In this case, we picked 100.

On node1





1. Create a volume group "hbvg" on the shared disk "hdisk1" with enhanced concurrent capability.

#smitty mkvg

Image 4. Volume group creation

Once hbvg is created, the autovaryon flag needs to be disabled. To do that, run the following command:

#chvg -an hbvg

2. Next, create logical volumes in the volume group "hbvg". Enter an LV name such as hbloglv, select 1 for the number of logical partitions, select jfslog as the type, and set scheduling to sequential. Leave the remaining options at their default values and press Enter.

#smitty mklv

Image 5. Logical Volume creation





Once the LV is created, format the log with logform:

#logform /dev/hbloglv

Repeat this process to create another LV, of type jfs and named hblv, but otherwise identical.

3. Next, create a file system. To do that, enter the following:

#smitty crfs -> Add a Journaled File System -> Add a Journaled File System on a Previously Defined Logical Volume -> Add a Standard Journaled File System

Here, enter the LV name "hblv", the log LV "hbloglv", and the mount point /hb_fs.

Image 6. Filesystem creation in a Logical Volume

Once the file system is created, try mounting it. Before moving to node2, unmount /hb_fs and vary off the volume group (varyoffvg hbvg).





On Node 2

1. Identify the shared disk using its PVID. Import the volume group from the shared disk (hdisk1) with the same major number (here, 100):

#importvg -V 100 -y hbvg hdisk1

2. Vary on the volume group and disable automatic varyon at system startup.

#varyonvg hbvg

#chvg -an hbvg

Now you should be able to mount the file system. Once done, unmount the file system and vary off the volume group (varyoffvg hbvg).

3. Verification of heartbeat over FC: Open a separate session on each of the two nodes. On node1, run the following command, where hdisk1 is the shared disk:

#/usr/sbin/rsct/bin/dhb_read -p hdisk1 -r

On node2:

#/usr/sbin/rsct/bin/dhb_read -p hdisk1 -t

One node heartbeats to the disk and the other detects it. Both nodes should return to the command line after reporting "Link operating normally".
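The import and mount check from steps 1 and 2 on node2 can be collected into one function so that it stops as soon as a command fails. This is a sketch built only from the commands shown above; the function name check_shared_vg is hypothetical.

```shell
# Usage: check_shared_vg VG DISK MAJOR MOUNTPOINT
# Imports the volume group on the second node, confirms that the
# file system mounts, then releases everything again.
# Stops at the first command that fails; in that case the volume
# group may still be varied on and must be released by hand.
check_shared_vg() {
    vg=$1
    disk=$2
    major=$3
    mnt=$4
    importvg -V "$major" -y "$vg" "$disk" || return 1
    varyonvg "$vg"                        || return 1
    chvg -an "$vg"                        || return 1   # no auto varyon at boot
    mount "$mnt"                          || return 1
    umount "$mnt"                         || return 1
    varyoffvg "$vg"                       || return 1
}
```

With the values used in this article, the call would be check_shared_vg hbvg hdisk1 100 /hb_fs. A return status of 0 means node2 can take over the volume group and its file system.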

Application specific configuration

If you are making an application highly available, for example a DB2 server, application-specific configuration needs to be done. That is beyond the scope of this article.

HACMP related configuration

1. Network takeover: On both nodes:

1. Run grep -i community /etc/snmpdv3.conf | grep public and ensure that there is an uncommented line similar to COMMUNITY public public noAuthNoPriv 0.0.0.0 0.0.0.0.

2. Next, add all the IP addresses of the nodes and NICs to the /usr/es/sbin/cluster/etc/rhosts file.

# cat /usr/es/sbin/cluster/etc/rhosts
192.168.20.72
192.168.20.201
10.10.35.5
10.10.210.11
10.10.35.4
10.10.210.21
192.168.22.39
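Because a missing address in this file is easy to overlook, a small check can compare it against the list of addresses that should be present. The sketch below is one way to do that; the function name missing_from_rhosts is hypothetical.

```shell
# Usage: missing_from_rhosts RHOSTS_FILE ADDRESS...
# Prints each address that has no exactly matching line in the file.
missing_from_rhosts() {
    rhosts=$1
    shift
    for ip in "$@"; do
        # -x matches whole lines only, -F treats the dots literally.
        if ! grep -qxF "$ip" "$rhosts"; then
            echo "$ip"
        fi
    done
}
```

For example, missing_from_rhosts /usr/es/sbin/cluster/etc/rhosts 192.168.20.72 10.10.35.4 prints nothing when both addresses are listed. The file should be checked on both nodes.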

Configuring PowerHA cluster

On Node 1:

1. First define a cluster

#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure an HACMP Cluster --> Add/Change/Show an HACMP Cluster

Image 7. Defining a cluster

Press enter. Now, the cluster is defined.

2. Add nodes to the defined cluster.

#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Nodes --> Add a Node to the HACMP Cluster

Image 8. Adding nodes to a cluster





Similarly, add another node to the cluster.

So far, we have defined a cluster and added nodes to it. Next, we will make the two nodes communicate with each other.

3. Adding networks: We will add two kinds of networks, IP (Ethernet) and non-IP (diskhb).

#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Networks --> Add a Network to the HACMP Cluster

Select "ether" from the list.

Image 9. Adding networks to the cluster

After this is added, return to "Add a Network to the HACMP Cluster" and also add the diskhb network.

4. The next step establishes which physical devices on each node are connected to each network.

#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Communication Interfaces/Devices --> Add Communication Interfaces/Devices --> Add Pre-defined Communication Interfaces and Devices --> Communication Interfaces





Pick the network that was added in the last step (IP_network) and enter a configuration similar to this:

Image 10. Adding communication devices to the cluster

There should be a warning about an insufficient number of communication ports on particular networks.

These last steps need to be repeated for each adapter that is to be assigned to a network for HACMP purposes. The warnings can be ignored for now; by the time all adapters are assigned to networks, the warnings should be gone. In any case, repeat for all interfaces.

Note that for disk communication (the disk heartbeat), the steps are slightly different.

#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Communication Interfaces/Devices --> Add Communication Devices

Select shared_diskhb, or the relevant name as appropriate, and fill in the details as below:

Image 11. Adding communication interfaces to the cluster

Each node in the cluster also needs to have a persistent node IP address. We associate each node with its persistent IP as follows:





#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Persistent Node IP Label/Addresses

Add all the details as below:

Image 12. Adding persistent IP address to the cluster

Checkpoint:

After adding everything, check that everything was added correctly.

#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Show HACMP Topology --> Show Cluster Topology

This lists all the networks, interfaces, and devices. Verify that they were added correctly.

5. Adding a resource group: So far, we have defined a cluster, added nodes to it, and configured both the IP and the non-IP networks. The next step is to configure a resource group. As defined earlier, a resource group is a collection of resources. An application server, for example a DB2 server, is one such resource that needs to be kept highly available.

Adding an application server to the resource group:

#smitty hacmp --> Extended Configuration --> Extended Resource Configuration --> HACMP Extended Resources Configuration --> Configure HACMP Application Servers --> Add an Application Server

Image 13. Adding resources - Application server





This specifies the server name and the start and stop scripts needed to start and stop the application server. For applications such as DB2, WebSphere, SAP, Oracle, TSM, ECM, LDAP, and IBM HTTP, the start/stop scripts come with the product. For other applications, administrators should write their own scripts to start and stop the application.
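For a custom application, the start and stop logic can be very small. The sketch below shows one possible shape, assuming the application is controlled through a single command that accepts start and stop arguments. The path /opt/myapp/bin/appctl and the names APP_CTL, app_start, and app_stop are hypothetical placeholders.

```shell
# Hypothetical application control command; replace with the real one.
APP_CTL=${APP_CTL:-/opt/myapp/bin/appctl}

# Start routine: the exit status reports whether the start succeeded.
app_start() {
    "$APP_CTL" start
}

# Stop routine: a failed stop is logged but still reported as success,
# on the assumption that the application was already down.
app_stop() {
    "$APP_CTL" stop || echo "stop reported an error; continuing" >&2
    return 0
}
```

In practice each routine would live in its own script, and the two script paths would be entered in the application server panel shown above. Whether a failed stop should be masked as it is here is a design choice that depends on the application.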

The next resource to add to the resource group is a service IP. End users connect to the application only through this IP, so the service IP should be kept highly available.

#smitty hacmp --> Extended Configuration --> Extended Resource Configuration --> HACMP Extended Resources Configuration --> Configure HACMP Service IP Labels/Addresses --> Add a Service IP Label/Address

Choose "Configurable on Multiple Nodes" and then "IP_network". Here, db2live is the service IP.

Image 14. Adding resources - Service IP

Now that the resources are added, we will create a resource group (RG), define the RG policies, and add all these resources to it.

#smitty hacmp --> Extended Configuration --> HACMP Extended Resource Group Configuration --> Add a Resource Group

Image 15. Resource Group creation





Once the RG is created, its attributes can be changed using:

#smitty hacmp --> Extended Configuration --> HACMP Extended Resource Group Configuration --> Change/Show Resources and Attributes for a Resource Group

Select db2_rg and configure as desired:

Image 16. Defining various attributes of the Resource Group

6. Verification and synchronization: Once everything is configured on the primary node (node1 here), the configuration needs to be synchronized with all other nodes in the cluster. To do that:

#smitty hacmp --> Extended Configuration --> Extended Verification and Synchronization

Image 17. Verification and synchronization of the cluster





This checks the status and configuration of the local node first, and then propagates the configuration to the other nodes in the cluster, if they are reachable. The output should give plenty of detail on any errors (and on passes too).

Once this is done, your cluster is ready. You can test it by moving the RG manually. To do that:

#smitty hacmp --> System Management (C-SPOC) --> HACMP Resource Group and Application Management --> Move a Resource Group to Another Node / Site --> Move Resource Groups to Another Node

Choose "node2" and press Enter. You should see the stop scripts running on node1 and the start scripts running on node2. After a few seconds, the RG will be online on node2.






About the authors

Uma Chandolu






Uma M. Chandolu works as a Development Support Specialist on AIX. He has five years of extensive hands-on experience in AIX environments and demonstrated expertise in AIX system administration and other subsystems. He has experience interfacing with customers and handling customer-critical situations. He has been recognized as an IBM developerWorks contributing author. He can be contacted at [email protected].

Tejaswini Kaujalgi

Tejaswini Kaujalgi works as a Systems Software Engineer in the IBM AIX UPT Release team in Bangalore. She has been working on AIX, PowerHA, Security, and VIOS components on pSeries for more than three years at IBM India Software Labs. She has also worked on various customer configurations using LDAP, Kerberos, RBAC, PowerHA, and AIX on pSeries. She is an IBM certified System p Administrator. You can reach her at [email protected].


