
    Copyright IBM Corporation 2010 Trademarks


    Introduction to PowerHA

    Power high availability

    Tejaswini Kaujalgi
    Software Engineer
    IBM India Software Labs

    Uma Chandolu
    Senior Staff Software Engineer
    IBM

    17 August 2010

    PowerHA for AIX is the new name for HACMP (High Availability Cluster Multiprocessing).
    HACMP is an application that makes systems fault resilient and reduces application
    downtime. This article introduces PowerHA and explains in detail how to configure a
    two-node cluster, a configuration that many customers ask for.

    Introduction

    As business demands grow, critical applications need to be available all the time, and the
    systems that run them need to be fault tolerant. Fault-tolerant systems, however, are
    expensive. Hence, there is a need for an application that provides these facilities and is
    also cost effective.

    A high availability solution can ensure that the failure of any component of the solution does

    not cause the application and its data to be unavailable to the user community. This is achieved

    through the elimination, or masking, of both planned and unplanned downtime by eliminating

    single points of failure. No specialized hardware is required to make an application
    highly available. PowerHA does not, however, perform administrative tasks such as backups,
    time synchronization, or application-specific configuration.

    Figure 1 is an illustration of the failover capacity. When one server goes down, the other takes

    over.


    developerWorks ibm.com/developerWorks/


    Figure 1. Failover capacity

    Overview of PowerHA

    The terms PowerHA and HACMP are used interchangeably. As mentioned earlier, it eliminates

    single points of failure (SPOF). The following table shows possible SPOFs:

    Cluster object      Eliminated as SPOF by
    --------------      ---------------------
    Node                Using multiple nodes
    Power source        Using multiple circuits or uninterruptible power supplies
    Network adapter     Using redundant network adapters
    Network             Using multiple networks to connect nodes
    TCP/IP subsystem    Using non-IP networks to connect adjoining nodes and clients
    Disk adapter        Using redundant disk adapters or multi-path hardware
    Disk                Using multiple disks with mirroring or RAID
    Application         Adding nodes for takeover; configuring an application monitor
    VIO server          Implementing dual VIO servers
    Site                Adding an additional site

    The main goal is to have 2 servers so that if one fails, the other takes over. PowerHA is a

    clustering technology that provides both failover protection by having redundancy and horizontal

    scalability through concurrent/parallel access.

    PowerHA terminology

    PowerHA terminology falls into two classes: topology components and resource components.

    The topology components are essentially the physical components. They include:

    Nodes: System p servers, which can be standalone partitions or VIOS clients
    Networks: IP networks and non-IP networks
    Communication interfaces: Token ring or Ethernet adapters
    Communication devices: RS232 or heartbeat over disk

    The resource components are the logical entities that will be made highly available. They include:

    Application server: The start and stop scripts of the application.
    Service IP address: End users are generally given an IP address to connect to the
    application. This IP address is mapped to the node where the application is actually running.
    Because the IP address needs to remain highly available, it is part of the resource group.
    File system: Many applications require file systems to be mounted.
    Volume group: Many applications require volume groups to be highly available.

    All the resources together form an entity called a resource group. PowerHA handles it as
    a single unit and keeps it highly available. Each resource group has policies
    associated with it:

    1. Startup policy: Determines the node on which the resource group is activated.
    2. Fallover policy: Determines the fallover target node when a failure happens.
    3. Fallback policy: Determines whether or not the resource group falls back.

    Whenever a failure happens, PowerHA consults these policies and acts accordingly.
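The interplay of the three policies can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PowerHA's implementation: the class, its field names, and the simple "first available node in priority order" rule are invented for illustration. Only the resource group name db2_rg and the node names come from the example in this article.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class ResourceGroup:
    """Toy resource group; field names are illustrative, not PowerHA's."""
    name: str
    nodes: List[str]            # participating nodes, highest priority first
    fallback: bool = True       # fallback policy: return to a higher-priority node?
    current: Optional[str] = None

    def _first_available(self, up: Set[str]) -> Optional[str]:
        # Walk the priority list and pick the first node that is up.
        return next((n for n in self.nodes if n in up), None)

    def startup(self, up: Set[str]) -> Optional[str]:
        """Startup policy: activate on the highest-priority node that is up."""
        self.current = self._first_available(up)
        return self.current

    def node_down(self, failed: str, up: Set[str]) -> Optional[str]:
        """Fallover policy: if the hosting node failed, pick a new target."""
        if self.current == failed:
            self.current = self._first_available(up - {failed})
        return self.current

    def node_up(self, joined: str, up: Set[str]) -> Optional[str]:
        """Fallback policy: move back only if fallback is enabled and the
        rejoining node outranks the node currently hosting the group."""
        if self.fallback and self.current is not None:
            best = self._first_available(up | {joined})
            if best is not None and self.nodes.index(best) < self.nodes.index(self.current):
                self.current = best
        elif self.current is None:
            self.current = self._first_available(up | {joined})
        return self.current
```

With nodes ["node1", "node2"], the group starts on node1, moves to node2 when node1 fails, and returns to node1 when it rejoins, unless fallback is set to False, in which case it stays on node2.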

    The subsystems of PowerHA

    Figure 2. Subsystems of PowerHA

    The illustration above shows the software components that make up PowerHA:

    The cluster manager (clstrmgr) is the core process that monitors cluster membership. The
    cluster manager includes a topology manager to manage the topology components, a
    resource manager to manage resource groups, an event manager with event scripts that
    works through the RMC facility, and RSCT to react to failures.

    The clinfo process provides an API for communicating between the cluster manager and your
    application. Clinfo also provides remote monitoring capabilities and can run a script in
    response to a status change in the cluster.

    In PowerHA V5, clcomdES allows the cluster managers to communicate in a secure manner
    without using rsh and the /.rhosts files.

    Configuration of a 2 node cluster

    Before starting off with the configurations, let's look at the networking and the storage

    considerations of PowerHA.

    Networking

    PowerHA uses networks to detect and diagnose failures and to provide clients with highly
    available access to applications.

    Internode communication also happens through networks. PowerHA directly detects three kinds
    of failure: network, NIC, and node failure. It detects and diagnoses them through the
    RSCT daemon. RSCT detects the loss of heartbeat packets, which are sent across all the networks,
    and determines the exact failure (network, NIC, or node).
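The way missing heartbeats narrow down to one of the three failure types can be illustrated with a small Python sketch. It is a simplification written for this explanation, not RSCT's actual algorithm: the function name, the input layout, and the decision rules are assumptions, and the interface names in the usage note are examples.

```python
from typing import Dict, List


def diagnose(heartbeats: Dict[str, Dict[str, bool]]) -> List[str]:
    """Classify missing heartbeats from one remote node.

    heartbeats maps network name -> {interface name -> heartbeat heard?}.
    """
    all_heard = [ok for nics in heartbeats.values() for ok in nics.values()]
    if not any(all_heard):
        # Silence on every interface of every network: the node itself is gone.
        return ["node failure"]

    findings = []
    for network, nics in heartbeats.items():
        silent = sorted(nic for nic, ok in nics.items() if not ok)
        if silent and len(silent) == len(nics):
            # Every interface on this network is silent, but others still answer.
            findings.append(f"network failure: {network}")
        else:
            # Only some interfaces on this network are silent.
            findings.extend(f"NIC failure: {nic}" for nic in silent)
    return findings
```

For example, if en1 and en2 on an Ethernet network are both silent while a heartbeat-over-disk path still answers, the sketch reports a network failure rather than a node failure. If every path is silent, it reports a node failure.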

    Figure 3 shows that the heartbeat packets are sent and received by all NICs, which helps in
    determining the failures.

    Figure 3. Cluster representing heartbeat packets

    If the heartbeat packets stop, both nodes assume that the other is down, and each
    tries to bring the resource group online. This could result in massive data corruption.

    To avoid this, PowerHA uses two kinds of networks:

    1. IP networks: Examples are Ethernet, EtherChannel, etc.
    2. Non-IP networks: An example is RS232 (this is needed to make sure that even if the IP network
    goes down, PowerHA can differentiate between a network failure and a node failure)

    IP address take over (IPAT)

    Most applications require that the IP address be highly available. To ensure this, we include the
    service IP in the resource group. The movement of this service IP from one NIC to another is
    called IP address take over (IPAT). There are two ways to use IPAT:

    1. IPAT via aliasing: PowerHA adds the service IP address to the NIC (accomplished using AIX

    IP aliasing feature).

    2. IPAT via replacement: PowerHA replaces the Interface IP address with the Service IP.
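The difference between the two IPAT styles comes down to what happens to the address already on the interface. The Python sketch below models an interface as a list of addresses. It is an illustration only: the Interface class and both function names are invented here and do not correspond to any AIX or PowerHA API. The addresses are taken from the example cluster in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interface:
    """Toy NIC holding an ordered list of IP addresses (first = base address)."""
    name: str
    addresses: List[str] = field(default_factory=list)


def ipat_via_aliasing(nic: Interface, service_ip: str) -> None:
    # The service IP is added alongside the existing interface address.
    if service_ip not in nic.addresses:
        nic.addresses.append(service_ip)


def ipat_via_replacement(nic: Interface, service_ip: str) -> None:
    # The interface address is swapped out; only the service IP remains.
    nic.addresses = [service_ip]
```

Starting from an interface en2 that holds 10.10.210.21, aliasing leaves it with both 10.10.210.21 and the service IP 192.168.22.39, whereas replacement leaves it with 192.168.22.39 only.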

    Storage

    Storage can be broadly classified into two types:

    1. Private storage: Owned by only one node

    2. Shared storage: Owned by one or more nodes in the cluster

    All applications' data resides in the shared storage. To avoid data inconsistency, shared storage
    protection can be done in the following ways:

    1. Reserve/release-based shared storage protection: Used with standard volume groups
    2. RSCT-based shared storage protection: Used with enhanced concurrent volume groups

    HACMP 5.x supports the RSCT-based style of shared storage protection, which relies on AIX's
    RSCT component to coordinate the ownership of shared storage when using enhanced concurrent
    volume groups in non-concurrent mode.

    Configuration

    Before starting with the configuration, the cluster must be properly planned. The online planning
    worksheets (OLPW) can be used for this purpose. This section explains the configuration of
    a two-node cluster. In the example provided, both nodes have three Ethernet adapters and two shared
    disks.

    Step 1: Fileset installation

    After installing AIX, the first step is to install the required filesets. The RSCT and BOS filesets can

    be found in the AIX base version CDs. The license for PowerHA needs to be purchased to install

    the HACMP filesets. Install the following filesets:

    RSCT filesets               BOS filesets          HACMP 5.5 filesets
    -------------               ------------          ------------------
    rsct.compat.basic.hacmp     bos.data              cluster.adt.es
    rsct.compat.clients.hacmp   bos.adt.libm          cluster.es.assist
    rsct.basic.hacmp            bos.adt.syscalls      cluster.es.cspoc
    rsct.basic.rte              bos.clvm.enh          cluster.es.plugins
    rsct.opt.storagerm          bos.net.nfs.server    cluster.assist.license
    rsct.crypt.des                                    cluster.doc.en_US.assist
    rsct.crypt.3des                                   cluster.doc.en_US.es
    rsct.crypt.aes256                                 cluster.es.worksheets
                                                      cluster.license
                                                      cluster.man.en_US.assist
                                                      cluster.man.en_US.es
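Before moving on, it can help to confirm that nothing from the list above was missed. The Python sketch below compares a list of required names against a list of names reported as installed. How the installed list is collected on the node is left out. The matching rule (exact name, or an installed name that extends the required one after a dot) is an assumption made for this example, and the suffixed name in the usage note is made up.

```python
from typing import Iterable, List


def missing_filesets(required: Iterable[str], installed: Iterable[str]) -> List[str]:
    """Return the required names that no installed fileset satisfies.

    Assumption: a required name is satisfied by an installed fileset with the
    same name, or by one whose name extends it with a further ".part".
    """
    installed = list(installed)
    missing = []
    for name in required:
        satisfied = any(
            item == name or item.startswith(name + ".") for item in installed
        )
        if not satisfied:
            missing.append(name)
    return missing
```

Given the required names rsct.basic.rte, bos.clvm.enh, cluster.es.cspoc, and cluster.license, and an installed list of rsct.basic.rte, bos.clvm.enh, and cluster.es.cspoc.example, the function returns only cluster.license.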


    Image 2. Configuration of a private IP address

    Similarly, configure en2 with private IP 10.10.210.21 and start the TCP/IP services. Next, you need
    to add the IP addresses (of node1, node2, and the service IP, which is db2live here) and their
    labels to the /etc/hosts file. It should look like the following:

    # Internet Address    Hostname                  # Comments
    127.0.0.1             loopback localhost        # loopback (lo0) name/address
    192.168.20.72         node1.in.ibm.com node1
    192.168.20.201        node2.in.ibm.com node2
    10.10.35.5            node2ha1
    10.10.210.11          node2ha2
    10.10.35.4            node1ha1
    10.10.210.21          node1ha2
    192.168.22.39         db2live

    The idea is that you should include each of the three ports for each machine with relevant labels

    for name resolution.

    Perform similar operations on node2. Configure en0 with the public IP and en1 and en2 with

    private IPs and edit the /etc/hosts file. To test that all is well, you can issue pings to the various IP

    addresses from each machine.

    Step 4: Storage configuration

    Shared storage is needed to create the heartbeat over FC disk. The disks need to be
    allocated from the SAN. Once both nodes can see the same disks (this can be verified
    using the LUN number), heartbeat over disk can be configured.

    This method does not use Ethernet, which avoids a single point of failure in the Ethernet network,
    switches, or protocols. The first step is to identify a major number that is available on all the nodes (as
    shown in Image 3 below).
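Choosing the major number is a set intersection: the number must be free on every node that will import the volume group. The Python sketch below assumes the free major numbers of each node have already been collected (for example, with the AIX lvlstmajor command) and turned into sets of integers. The function name and the input layout are illustrative assumptions.

```python
from typing import Dict, Optional, Set


def lowest_common_major(free_by_node: Dict[str, Set[int]]) -> Optional[int]:
    """Return the smallest major number that is free on every node, if any."""
    if not free_by_node:
        return None
    # Keep only the numbers that appear in every node's free set.
    common = set.intersection(*free_by_node.values())
    return min(common) if common else None
```

If node1 reports {55, 100, 101} as free and node2 reports {60, 100, 101}, the function returns 100, which is the major number used in the rest of this example.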


    Image 5. Logical Volume creation

    Once the LV is created, format the log logical volume with the following:
    # logform /dev/hbloglv

    Repeat this process to create another LV of type jfs and named hblv (but otherwise identical).

    3. Next, we create a filesystem. To do that, enter the following:
    # smitty crfs -> Add a Journaled File System -> Add a Journaled File System on a
    Previously Defined Logical Volume -> Add a Standard Journaled File System

    Here, enter the LV name "hblv", the log LV "hbloglv", and the mount point /hb_fs.

    Image 6. Filesystem creation in a Logical Volume

    Once the filesystem is created, try mounting it. Before moving to node2,
    unmount /hb_fs and vary off the volume group (varyoffvg hbvg).

    On Node 2

    1. Identify the shared disk using its PVID. Import the volume group with the same major number (we
    used 100) from the shared disk (hdisk1):
    # importvg -V 100 -y hbvg hdisk1

    2. Vary on the volume group and disable automatic varyon at system startup:
    # varyonvg hbvg
    # chvg -an hbvg

    Now, you should be able to mount the filesystem. Once done, unmount the filesystem and
    run varyoffvg hbvg.

    3. Verify the heartbeat over FC. Open a session on each of the two nodes. On node1, run the
    following command, where hdisk1 is the shared disk:
    # /usr/sbin/rsct/bin/dhb_read -p hdisk1 -r

    On node2:
    # /usr/sbin/rsct/bin/dhb_read -p hdisk1 -t

    One node sends heartbeats to the disk and the other detects them. Both nodes should
    return to the command line after reporting that the link operates normally.

    Application specific configuration

    If you are making an application (for example, a DB2 server) highly available, application-specific
    configuration needs to be done. That is beyond the scope of this article.

    HACMP related configuration

    Network takeover on both nodes:

    1. Run grep -i community /etc/snmpdv3.conf | grep public and ensure that there is an
    uncommented line similar to COMMUNITY public public noAuthNoPriv 0.0.0.0 0.0.0.0.

    2. Next, add the IP addresses of all the nodes and NICs to the /usr/es/sbin/cluster/etc/rhosts file:

    # cat /usr/es/sbin/cluster/etc/rhosts
    192.168.20.72
    192.168.20.201
    10.10.35.5
    10.10.210.11
    10.10.35.4
    10.10.210.21
    192.168.22.39
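Because the rhosts file must list every address that was added to /etc/hosts earlier, a quick cross-check can catch an omission. The Python sketch below works on the text of the two files. The function names are invented for this example, and the parsing is deliberately minimal: it handles only the simple IPv4 layout shown in this article.

```python
from typing import List, Set


def hosts_addresses(hosts_text: str) -> Set[str]:
    """Collect the addresses from /etc/hosts-style text, skipping loopback."""
    addresses = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split columns
        if len(fields) >= 2 and not fields[0].startswith("127."):
            addresses.add(fields[0])
    return addresses


def missing_from_rhosts(hosts_text: str, rhosts_text: str) -> List[str]:
    """Addresses present in the hosts text but absent from the rhosts text."""
    listed = {line.strip() for line in rhosts_text.splitlines() if line.strip()}
    return sorted(hosts_addresses(hosts_text) - listed)
```

An empty result means every non-loopback address in the hosts text also appears in the rhosts text. Any address returned still needs to be added on that node.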

    Configuring PowerHA cluster

    On Node 1:

    1. First, define a cluster:
    # smitty hacmp --> Extended Configuration --> Extended Topology Configuration -->
    Configure an HACMP Cluster --> Add/Change/Show an HACMP Cluster

    Image 7. Defining a cluster

    Press Enter; now, the cluster is defined.

    2. Add nodes to the defined cluster:
    # smitty hacmp --> Extended Configuration --> Extended Topology Configuration -->
    Configure HACMP Nodes --> Add a Node to the HACMP Cluster

    Image 8. Adding nodes to a cluster

    Similarly, add another node to the cluster. Now, we have defined a cluster and added nodes to
    it. Next, we will make the two nodes communicate with each other.

    3. Add two kinds of networks, IP (Ethernet) and non-IP (diskhb):
    # smitty hacmp --> Extended Configuration --> Extended Topology Configuration -->
    Configure HACMP Networks --> Add a Network to the HACMP Cluster

    Select "ether" from the list.

    Image 9. Adding networks to the cluster

    After this is added, return to "Add a network to the HACMP cluster" and also add the diskhb

    network.

    4. The next step establishes which physical devices on each node are connected to each
    network:
    # smitty hacmp --> Extended Configuration --> Extended Topology Configuration -->
    Configure HACMP Communication Interfaces/Devices --> Add Communication
    Interfaces/Devices --> Add Pre-defined Communication Interfaces and Devices -->
    Communication Interfaces

    Pick the network that we added in the last step (IP_network) and enter configuration similar to

    this:

    Image 10. Adding communication devices to the cluster

    There may be a warning about an insufficient number of communication ports on particular
    networks. Repeat these last steps for each adapter until all adapters are assigned to
    their networks for HACMP purposes. The warnings can be ignored for now; by the time all
    adapters are assigned to networks, the warnings should be gone.

    Note that for the disk communication (the disk heartbeat), the steps are slightly different:
    # smitty hacmp --> Extended Configuration --> Extended Topology Configuration -->
    Configure HACMP Communication Interfaces/Devices --> Add Communication Devices

    Select shared_diskhb or the relevant name as appropriate and fill in the details as below:


    Image 13. Adding resources - Application server

    This specifies the server name, the start and the stop scripts needed to start/stop the

    application server. For applications such as DB2, WebSphere, SAP, Oracle, TSM, ECM,

    LDAP, IBM HTTP, the start/stop scripts come with the product. For other applications,

    administrators should write their own scripts to start/stop the application.

    The next resource that we will add to the resource group is a service IP. End users
    connect to the application only through this IP; hence, the service IP should be kept highly
    available:
    # smitty hacmp --> Extended Configuration --> Extended Resource Configuration -->
    HACMP Extended Resources Configuration --> Configure HACMP Service IP Labels/Addresses -->
    Add a Service IP Label/Address

    Choose "Configurable on Multiple Nodes" and then "IP_network". Here we have db2live as
    the service IP.

    Image 14. Adding resources - Service IP

    Now that the resources are added, we will create a resource group (RG), define RG policies, and
    add all these resources to it:
    # smitty hacmp --> Extended Configuration --> HACMP Extended Resource Group
    Configuration --> Add a Resource Group

    Image 15. Resource group creation

    Once the RG is created, we can change its attributes using:
    # smitty hacmp --> Extended Configuration --> HACMP Extended Resource Group
    Configuration --> Change/Show Resources and Attributes for a Resource Group

    Select db2_rg and configure as desired:

    Image 16. Defining various attributes of the resource group

    6. Verification and synchronization

    Once everything is configured on the primary node (node1), we need to synchronize the configuration with
    all other nodes in the cluster. To do that, do the following:
    # smitty hacmp --> Extended Configuration --> Extended Verification and Synchronization

    Image 17. Verification and synchronization of the cluster

    This checks the status and configuration of the local node first, and then propagates
    the configuration to the other nodes in the cluster, if they are reachable. The output
    reports details of every check, passed or failed. Once this is done, your cluster is ready. You
    can test it by moving the RG manually. To do that, do the following:
    # smitty hacmp --> System Management (C-SPOC) --> HACMP Resource Group and
    Application Management --> Move a Resource Group to Another Node / Site -->
    Move Resource Groups to Another Node

    Choose "node2" and press Enter. You should see the stop scripts running on node1 and the start
    scripts running on node2. After a few seconds, the RG will be online on node2.



    About the authors

    Tejaswini Kaujalgi

    Tejaswini Kaujalgi works as Systems Software Engineer in the IBM AIX UPT Release

    team in Bangalore. She has been working on AIX, PowerHA, Security, and VIOS

    components on pSeries for more than 3 years at IBM India Software Labs. She

    has also worked on various customer configurations using LDAP, Kerberos, RBAC,

    PowerHA and AIX on pSeries. She is an IBM certified System p Administrator.

    Uma Chandolu

    Uma M. Chandolu works as a Development Support Specialist on AIX. He has five

    years of extensive hands-on experience in AIX environments and demonstrated

    expertise in AIX system administration and other subsystems. He has experience

    interfacing with customers and handling customer-critical situations. He has been

    recognized as an IBM developerWorks contributing author.

    Copyright IBM Corporation 2010

    (www.ibm.com/legal/copytrade.shtml)

    Trademarks

    (www.ibm.com/developerworks/ibm/trademarks/)
