
EMC Corporation and IBM Corporation

Creating No Dataloss Standby Databases with

IBM DB2 Universal Database V7.2, EMC TimeFinder, and EMC SRDF

August 2002

John Macdonald

EMC Corporation

Enzo Cialini

IBM Canada Ltd.

IBM Toronto Lab


Creating No Dataloss Standby Databases with DB2 UDB and SRDF i

Disclaimers and Trademarks

Copyright © 2002 EMC Corporation and IBM Corporation. All rights reserved.

Printed 8/22/2002

EMC and IBM believe the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” - NEITHER EMC CORPORATION NOR IBM CORPORATION MAKE ANY REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND BOTH SPECIFICALLY DISCLAIM IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

The furnishing of this document does not imply giving license to any IBM or EMC patents.

References in this document to IBM products, Programs, or Services do not imply that IBM intends to make these available in all countries in which IBM operates.

Use, copying, and distribution of any EMC or IBM software described in this publication requires an applicable software license.

Trademark Information

EMC, EMC2, and Symmetrix are registered trademarks and TimeFinder, SRDF, and where information lives are trademarks of EMC Corporation.

IBM, AIX, DB2, DB2 Universal Database, and RS/6000 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Windows is a registered trademark of Microsoft Corporation in the United States, other countries, or both.

All other trademarks used herein are the property of their respective owners.



Table of Contents

Disclaimers and Trademarks
Introduction
  SRDF Business Continuance Sequence
  Overview of DB2 UDB Suspended I/O
  DB2 UDB Command Description
Test Configuration
  Hardware and Volume Configuration
  Full Data Configuration
  Log Copy Configuration
Trade-offs Between Full Data and Log Copy Scenarios
Scenario 1 – Full Data Copy
  Configuration Setup Tasks and Management Procedures
  Configuring the Volumes on the AIX Systems
  IBM DB2 UDB Storage Configuration
  Finishing Setup on the Secondary System
  Reestablish the SRDF Connections for Standby Capability
  Planned Switchback to Primary System
  Disaster Switchover to Secondary System
  Disaster Switchback to Primary System
Scenario 2 – Log Copy
  Configuration Setup Tasks and Management Procedures
  Configuring the Volumes on the AIX Systems
  IBM DB2 UDB Storage Configuration
  Finishing Setup on the Secondary System
  Log Copy
  Switch Over to Secondary System
  Switch Back to the Primary System



Introduction

Mission-critical database systems must operate 24x7 with the highest degree of availability possible. As databases increase in size and as ad hoc queries place more demands on the continued availability of the system, the time and hardware resources that are required to back up and recover databases grow substantially even as maintenance windows are either drastically reduced or disappear completely. The instant split feature of EMC Symmetrix Remote Data Facility (SRDF) software and the database suspended I/O features available in IBM DB2 Universal Database (UDB) combine to provide highly available, advanced database technical architectures to meet these demands. We demonstrate the use of these features to create a coherent copy of a continuously available DB2 UDB database to meet the increasing availability requirements demanded in today’s business environment.

This paper takes you through a sample configuration setup and usage of IBM DB2 UDB Enterprise Edition (EE) V7.2 with EMC SRDF. The configurations shown here are appropriate for maintaining a disaster recovery site at a different location from the primary site. These configurations will work especially well when the two sites are reasonably close – within a metropolitan area for example. They will cause degraded response time if the two sites are further apart. These scenarios are designed to recover from disasters that occur outside of the database application – power failure, communication blackout, earthquake, operating system crash. They can also be used for planned shutdowns, such as system or hardware changes to the primary system. They do not allow for recovery from failures within the application. The "no data loss" scenario preserves all of the changes to the database, even ones that were applied in error. (Dealing with such failures can be done by augmenting these scenarios to make additional BCV copies of the database that can be used to manually recover from errors. Such extensions are not discussed here.)

SRDF Business Continuance Sequence

With EMC Symmetrix Remote Data Facility (SRDF), a Symmetrix device can be paired with a device on a separate, possibly remote, Symmetrix. This can be used to provide a business continuance solution through remotely mirrored devices. The source device is called an R1, the copy is called an R2. An R2 copy is an independently host-addressable device that is a point-in-time copy of its paired R1 device.

The R1 and R2 devices can be in a number of states. Initially, the SRDF pair is in the never established state. Establishing the SRDF pair causes the R1 to be fully copied over the R2. While that initial copy is occurring, the pair is in the synchronizing state, and the R2 is not usable. (It is blocked from access by any host because its content is an inconsistent mix of its original value and the data from the R1.) When the synchronization completes, the pair is in the synchronized state, sometimes called “established.” Once established, the R2 copy can be split from its R1 device, which makes it available again to its host.

Incremental establish resynchronizes a previously split SRDF pair, but rather than copying the entire volume, it need only copy those tracks that have been updated on either volume since the split. During this process, the pair is again in the synchronizing state and the R2 is not available to its host.

Restore and incremental restore are equivalent to establish and incremental establish, except that data is copied from the R2 volume and overwrites the content of the R1. During restore processing, neither volume is available for access from its host.

In addition to the previous states, which correspond to the states provided by EMC TimeFinder for local business continuance volume (BCV) copies, SRDF pairs have a number of additional states that are useful for recovery procedures:

• The pair can be put into failover state, which makes the R2 volume available to its host while the R1 volume is not accessible – this allows the R2 to be used by its host to replace the function of the R1 system after some sort of outage has occurred on the R1 system preventing it from fulfilling its normal role.

• The swap state extends this by completely reversing the R1 and R2 roles, which is useful for extended outages.
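As a reading aid, the host-access rules described in this section can be summarized in a small shell lookup. This is only a sketch: the state names are informal labels taken from the text above, not strings reported by any SYMCLI command, and the function name srdf_access is hypothetical.

```shell
#!/bin/sh
# Summarize which side of an SRDF pair is host-accessible in each state,
# following the description in this section. The state names are informal
# labels (an assumption), not output from SYMCLI.
srdf_access() {
    case "$1" in
        synchronizing|synchronized) echo "R1=yes R2=no" ;;
        split)                      echo "R1=yes R2=yes" ;;
        restoring)                  echo "R1=no R2=no" ;;
        failover)                   echo "R1=no R2=yes" ;;
        *)                          echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# Print the whole table.
for state in synchronizing synchronized split restoring failover; do
    printf '%-14s %s\n' "$state" "$(srdf_access "$state")"
done
```

The swap state is left out on purpose, because after a swap the R1 and R2 labels themselves change sides.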



Overview of DB2 UDB Suspended I/O

DB2 UDB V7.2 has implemented a command that allows users to suspend database writes, which provides for the capability to use split mirroring technology while DB2 is online. This functionality is available on all distributed DB2 UDB platforms (e.g., AIX, Solaris, HP/UX, Linux, Windows). Suspended I/O supports continuous system availability by providing a full implementation for splitting a BCV without shutting down the database. The split copy (i.e., BCV) of the database can be used to do such tasks as the following:

1. Provide a transactionally consistent snapshot of the database at the current point in time. This database can be used to offload user queries that don’t need the most current version of the database.

2. Provide a standby database that can be accessed as a disaster recovery strategy if the primary database is not available. All logs from the primary database will be applied to the secondary database so that it will represent the most current transactionally consistent version of the primary database.

3. Provide the ability to offload the database backup process from the primary database system. A DB2 backup can be performed on the secondary system. The DB2 backup can then be restored on either the primary system or on another system. Rollforward can then be issued to bring the database to a particular point in time, or until the end of the logs is reached.

4. Provide the ability for a quick primary database restore. The BCV can be reestablished so the primary copy is restored to the initial data at the time of the split. The primary database can then be rolled forward to bring the database to a particular point in time, or until the end of the logs is reached.

DB2 UDB Command Description

We use the following commands in our scenarios.

set write suspend

The suspend command (db2 set write suspend for database) suspends all write operations to the DB2 UDB database (i.e., to tablespaces and log files). Read operations are not suspended and are thus allowed to continue. Applications can continue to process insert, update, and delete operations utilizing the DB2 bufferpools. A database connection is required for issuing the suspend command. It is recommended that you maintain the current session for executing the subsequent resume command.

set write resume

The resume command (db2 set write resume for database) resumes all write operations to the suspended DB2 UDB database. A database connection is required for issuing the resume command.

db2inidb

The db2inidb command (db2inidb <db_alias> as <snapshot | standby | mirror>) is required to initialize the copy of the suspended database. You do not require a database connection prior to executing this command. It can be used in the following three cases:



1. snapshot can be applied to the secondary copy, putting it into a transactionally consistent state.

2. standby can be applied to the secondary copy, putting it into a rollforward pending state. DB2 logs from the primary database can then be applied to the secondary database.

3. mirror can be applied to the primary copy after it has been restored from the secondary copy. The primary database will be placed into a rollforward pending state and then DB2 logs can be applied to the primary database.
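The three commands above fit together roughly as sketched below. This is a dry-run illustration, not the procedure used in the scenarios: the database alias sample is hypothetical, the ordering (suspend, split, resume, then db2inidb on the copy) is an assumption about typical split-mirror usage, and the function names are invented for the example. Each command is printed rather than executed unless RUN is set to an empty string.

```shell
#!/bin/sh
# Dry-run sketch: RUN defaults to "echo", so commands are printed, not executed.
RUN=${RUN-echo}

# On the primary system: suspend writes, split the mirror, resume writes.
# $1 = database alias, $2 = Symmetrix device group (both placeholders).
split_while_online() {
    $RUN db2 connect to "$1"                 # suspend requires a connection
    $RUN db2 set write suspend for database
    $RUN symrdf -g "$2" -noprompt split      # issued as root in this paper's setup
    $RUN db2 set write resume for database   # same session as the suspend
}

# On the system holding the split copy: initialize that copy.
# $2 = snapshot | standby (mirror applies to a restored primary instead).
init_copy() {
    case "$2" in
        snapshot|standby) $RUN db2inidb "$1" as "$2" ;;
        *) echo "usage: init_copy ALIAS snapshot|standby" >&2; return 1 ;;
    esac
}

split_while_online sample data
init_copy sample standby
```

Keeping the suspend and resume in one function mirrors the recommendation above to issue both from the same session.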

For further information regarding DB2 UDB’s Suspended I/O functionality, see the DB2 UDB documentation and Release Notes provided with V7.2 or V7.1 Fixpack 3 at http://www.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main. In addition, see the DB2 Developer Domain article “Split Mirror Using Suspended I/O in DB2 Universal Database V7” at http://www7b.software.ibm.com/dmdd/library/techarticle/0204quazi/0204quazi.html.

Command Conventions Used in This Document

Commands that must be run by root are shown with a shell prompt of root> .

Commands that must be run by the database user (the instance owner or database admin) are shown with a shell prompt of dba>.

Commands in bold face are shown as the actual commands issued on one (or both) of the systems.



Test Configuration

This paper documents two test configurations.

• The first configuration, full data copy, uses the SRDF connection to continuously keep copies of the entire database synchronized between the two systems at all times, ready for immediate takeover in the event of a disaster.

• The second configuration, log copy, uses the SRDF connection to copy the original data content of the database, but after initialization only the logs are maintained with a fully synchronized copy. This configuration also provides for no data loss in the event of a disaster, but it requires a much longer transition period.

After the more detailed description of these configurations that follows, there is a section that discusses the advantages and disadvantages of these two alternatives, allowing you to choose which is more appropriate for your needs.

Hardware and Volume Configuration

These procedures were tested using an EMC Symmetrix 8430 GreyWolf and an 8530 LoneWolf running Symmetrix Enginuity version 5567. Two IBM RS/6000 H70 systems were connected via fiber HBA, one to each Symmetrix system. One RS/6000 system was used as a primary database system. The second RS/6000 system was used as a secondary database system (either as a standby system, or as a testing/analysis system). Figure 1 illustrates the test configuration.

Figure 1. DB2 UDB and EMC Test Configuration



The two RS/6000 systems were running AIX 4.3.3 and IBM DB2 UDB Enterprise Edition, V7.2.

The interaction between the RS/6000 systems and the Symmetrix was done with AIX BCV support kit 2.0, and the EMC AIX kit 4.3.3.2, both from EMC. The set of program temporary fixes (PTF) listed in the EMC Open Source Matrix as of March 2001 was applied. Additionally, EMC Solutions Enabler SYMCLI Base Component software version 4.2 was used on both systems.

The connection between the two Symmetrix systems was made with 2 fiber links.

The database application in both cases made direct use of 24 Symmetrix hyper volumes of 4GB each. The data was kept on 20 of them: 4 contained a journalled file system, while the other 16 were presented as a single raw volume. Logs were kept in another journalled file system established on the remaining 4 volumes. The two scenarios differed in the ways in which these volumes were kept synchronized between the two systems, and in whether additional volumes (not normally visible to the host systems) were used on the Symms to help manage the process.

Full Data Configuration

The full data configuration scenario used only the 24 volumes described above. When the R1 system was operating normally, they were always kept in the synchronized (established) state. When a disaster occurred, they were split and the application was started on the R2 system. For recovery, the R1/R2 systems were logically swapped so that the R1 became the R2 (and vice versa). Thus, the volumes on the primary system (now the R2) were kept synchronized with the changes applied by the backup (now the R1) system; and they were ready to be switched back once the disaster had been repaired.

Log Copy Configuration

The log copy configuration was more complicated. The data volumes were initially copied from the primary system to the secondary, but then that SRDF set was split. Subsequent changes to the data on the primary system were not copied using SRDF. Instead they were re-created by the secondary system, which would be given copies of the logs and would apply them to the split data volumes.

The log volumes were copied continuously. Since they were always in the synchronized state, they were not visible to the secondary system. It was not appropriate to split the log volumes directly to allow the secondary system to have access to them when there were logs to be processed on the secondary system. To split the volumes would mean that they were not being kept synchronized. While they could be reestablished and thus resynchronized later, there would be a period of time when some forms of disaster would cause data loss at the secondary site. A disaster that caused the primary system to crash or fail would not be a problem – the Symmetrix could still resynchronize. However, some disasters could prevent the Symmetrix from resynchronizing – an earthquake or other destructive event would cause the data that was only on the primary Symm to be lost.

Since splitting the R2 was not acceptable, a BCV copy of the R2 was also made. Using the userexit function of DB2 UDB, whenever a log was archived on the primary system, the BCV on the secondary system would be split. A daemon on the secondary system would notice the split and pick up all newly archived logs, and then cause DB2 to roll forward through those new logs. The BCV would be reestablished. If one or more logs were archived while the BCV was split, the BCV would be split again by the primary system as soon as it was reestablished. The secondary system would apply all new logs each time.

Since the R2 was always synchronized, a disaster, even one that caused the primary Symm to be lost, would not cause data loss on the secondary system – it could use the R2 to bring the BCV fully up to date with the state of the logs at the moment of the disaster.

A second trio of volumes (R1, R2, and BCV) was used to send logs from the secondary system to the primary when the roles had been exchanged, with the secondary running the database while the primary system was being recovered.



Recovery required that the original primary system be brought back into operational state. The data volumes have to be copied from the secondary to the primary. (Although they logically have the same data, it has been created by the two systems applying transactions independently, so there can be different timestamps, different sector allocation, etc. that cause real differences. Furthermore, even when the two systems happened to write identical data, the Symms have no way of assuring that that was the case, so every track that was changed on either system since the initial copy must be copied back to the primary system.) This reverse copy takes a long time. (If it didn't take a long time, there would be no reason to use log forwarding instead of the full data copying scenario.) That means that after the initial disaster, it can be a long time before the primary system is ready to either be reinstated as the master system, or to operate as the secondary and be ready for a subsequent disaster.



Trade-offs Between Full Data and Log Copy Scenarios

The full data scenario is much simpler to manage and is much faster to recover from a disaster, but it requires higher communication bandwidth between the systems.

With the full data scenario, starting the secondary system after a disaster at the primary site takes about a minute (after you have determined that the disaster has happened and that switching to the secondary is appropriate). With the log copy scenario, it can take many minutes to get the secondary system ready to take over.

In planning for an SRDF configuration, you must compute your bandwidth requirements. You should measure separately the amount of disk I/O required for your data storage and for log storage. Unless you are not concerned about per-transaction response time, be sure to determine the peak I/O requirements rather than just the total or average requirements. Also, be sure to allow for future growth.
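A worked example may help with this sizing step. The sketch below converts measured peak write rates into a link rate for each scenario. Every number in it is a hypothetical placeholder, the function name link_mbps is invented, and it ignores protocol overhead and latency, so treat the result as a lower bound rather than a sizing recommendation.

```shell
#!/bin/sh
# All figures are hypothetical placeholders; substitute your own measurements.
# link_mbps PEAK_DATA_MBS PEAK_LOG_MBS GROWTH_PCT
#   PEAK_DATA_MBS  peak write rate to the data volumes, in megabytes/s
#   PEAK_LOG_MBS   peak write rate to the log volumes, in megabytes/s
#   GROWTH_PCT     allowance for future growth, in percent
# Prints the link rate in megabits/s for the full data scenario (data plus
# logs) and for the log copy scenario (logs only).
link_mbps() {
    data=$1; logs=$2; growth=$3
    full=$(( (data + logs) * 8 * (100 + growth) / 100 ))
    logonly=$(( logs * 8 * (100 + growth) / 100 ))
    echo "full_data=${full} log_copy=${logonly}"
}

# Example: 12 MB/s peak data writes, 3 MB/s peak log writes, 50% growth.
link_mbps 12 3 50   # prints: full_data=180 log_copy=36
```

Using peak rather than average rates matters because, while the volumes are kept synchronized, a burst that exceeds the link capacity shows up directly as transaction response time.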

Next you determine the cost of the bandwidth alternatives.

If the two sites are in the same building or on the same campus, the cost difference between a lower capacity connection and a higher capacity one is probably insignificant (it costs almost as much to run one fiber between two points as it costs to run dozens – the cost of pulling the bundle and digging trenches is much greater than the cost of the fiber itself). Using local direct fiber connections will generally mean that you have more than enough bandwidth to meet your needs for either scenario.

If the sites are too far apart though, you need to deal with a communication company to get the bandwidth and you must use additional expensive equipment to connect them through the communication lines. These costs are generally related to the bandwidth required, but there can be significant variations. Often, you will want to use far less bandwidth than the capacity of a fiber cable, so there can be severe limitations on the data that can be sent.

You should choose the full data scenario if you can. The one advantage of the log copy scenario, though, is that it requires lower communication bandwidth for normal operation. Choose it if you can't afford the communication costs of the full data scenario and if you are able to live with the longer time to recover from a disaster.

If planned outages are an issue, your preferred mechanism would be to have a second system on the primary site capable of taking over. That mechanism is outside of the scope of this paper. However, if the SRDF remote site is intended to be also used to provide standby service for planned outages, the full data copy scenario is much preferred because the log copy scenario requires a great deal of time and communication bandwidth to recover after a switchover before the primary system is again ready to resume operations.



Scenario 1 – Full Data Copy

In this scenario, the data and logs are always kept fully synchronized during normal operation. After a disaster, the primary system is stopped if necessary, the data and log volumes are split, and then the secondary system takes over operation. The SRDF links are switched, so that the primary volumes are kept synchronized with the updates that are applied by the secondary system. (If the primary Symmetrix cannot communicate, either because it or the communication medium is down, then the volumes will have to be resynchronized once the problem is fixed.) After the disaster problem has been fixed (and the volumes are fully synchronized if that was interrupted by the disaster), an orderly shutdown of the secondary system allows the volumes to be switched back and the primary system to resume its normal role.

Table 1 shows the Symmetrix device group name, AIX volume group name, AIX logical volume name, file system mount points (device name in the case of raw logical volumes), the Symmetrix device numbers used on each system, and the AIX hdisk names assigned to the Symmetrix hyper volumes on both systems. (The Symmetrix device numbers and the AIX hdisk names need not be the same on both systems. It simply happened that our test systems had the same configuration of other disks and the same choices made for the Symmetrix configuration.)

Table 1. Names and device numbers used in our configuration

Symmetrix Group  Volume Group  Logical Volume  Mount Point  Symmetrix Device  Device Type  hdisk
data             drvg          dlv             /data        000:003           R1(R2)       7:10
data             drvg          rlv             /dev/rrlv    004:013           R1(R2)       11:26
logs             lvg           llv             /logs        014:017           R1(R2)       27:30
gatekeeper                                                  200:203           N/A

Note that the device type R1(R2) is normally an R1 device on the primary system and an R2 device on the secondary, but during disaster recovery stages, they are reversed.



Configuration Setup Tasks and Management Procedures

A fair amount of planning and configuration goes into setting up the storage for a large database implementation on an EMC Symmetrix. Before you can use Symmetrix volumes, the physical drives in the Symmetrix must be logically subdivided into hyper volumes, which can appear to your system as physical disks. Then the Symmetrix hyper volumes are configured as R1 or R2 (or other) Symmetrix device types. Once this is complete, the devices must be made known to your operating system. This step is operating system (OS) specific; in our case, the OS is AIX 4.3.3. The same procedures could be done on other OSs, such as HP/UX or Solaris, but the commands to manage the devices would need to be changed into the appropriate OS-specific equivalents.

SYMCLI commands must be installed and available to the root user doing the initial device configuration and subsequent operational steps. Finally, the SYMCLI database is initialized to include the new devices so that later SYMCLI commands can operate efficiently on those devices.

AIX and Symmetrix Volume Discovery

Added to .profile configuration on both systems:

Add the path to the SYMCLI software into root’s default profile:

root> export PATH=$PATH:/usr/symcli/bin

Additional commands are required when initializing devices:

root> export PATH=$PATH:/usr/lpp/Symmetrix/bin
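If these lines go into .profile, re-sourcing the profile appends the same directories again. A minimal sketch of a guard is shown below; the helper name add_to_path is hypothetical, and the two directories are the ones given above.

```shell
#!/bin/sh
# Append a directory to PATH only if it is not already present, so that
# sourcing the profile repeatedly does not grow PATH.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path /usr/symcli/bin          # SYMCLI commands
add_to_path /usr/lpp/Symmetrix/bin   # device initialization commands
export PATH
```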

Performed on both systems:

Detect new Symmetrix devices on AIX:

root> emc_cfgmgr

Initialize communication with the Symmetrix, and discover device paths and other device information:

root> symcfg discover

List the available devices so that you can determine how AIX device names are mapped to the Symmetrix volumes available to this host:

root> symdev list

Set Up Symmetrix Device Groups for SRDF Operations

This section describes how to define the members of a group containing pairs of SRDF volumes. The volumes must have been associated when the Symmetrixes were configured by your EMC SE. The volumes in each pair must be the same size, and all of the pairs in the group must consist of an R1 on one Symmetrix, each associated with an R2 on the other. Use the same procedure on both systems, simply exchanging the direction, so that both will use the same group names.



General Procedure

Values required to run the procedure:

GROUP

Name for the Symmetrix Device Group consisting of R1 Symmetrix volumes to be mirrored with R2 volumes.

RDFDIR

Direction of the SRDF link. This will be either rdf1 on the R1 machine, or rdf2 on the R2 machine.

SSTART

First volume (hex Symmetrix device number) on local Symmetrix. The volume numbers on the remote Symmetrix will have been defined when your Symmetrix was configured.

SEND

Last volume (hex Symmetrix device number) on local Symmetrix.

Create the group for the volumes:

root> symdg -type RDFDIR create GROUP

Add a range of local devices to the group (they must be consistent with the RDFDIR specified in the previous command):

root> symld -g GROUP -RANGE SSTART:SEND addall dev

Repeat this step if you want to combine nonconsecutive blocks of volumes into a single STD group.

Performed on the primary system:

root> symdg -type rdf1 create data
root> symdg -type rdf1 create logs
root> symld -g data -RANGE 000:013 addall dev
root> symld -g logs -RANGE 014:017 addall dev

Performed on the secondary system:

root> symdg -type rdf2 create data
root> symdg -type rdf2 create logs
root> symld -g data -RANGE 000:013 addall dev
root> symld -g logs -RANGE 014:017 addall dev

How you group Symmetrix devices is a function of the operations you plan to perform. We created a separate group for each of the data and the logs, but we could have used only a single group, because for this scenario we are always operating the same way on both of them at once.
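The general procedure can be wrapped in a small helper so that the same script runs on both systems with only the direction changed. The following is a dry-run sketch: the function name make_rdf_group is hypothetical, and the commands are printed rather than executed unless RUN is set to an empty string.

```shell
#!/bin/sh
RUN=${RUN-echo}   # dry run: print each command instead of executing it

# make_rdf_group GROUP RDFDIR SSTART:SEND [SSTART:SEND ...]
# Creates the device group, then adds one or more ranges of local devices.
make_rdf_group() {
    group=$1; rdfdir=$2; shift 2
    $RUN symdg -type "$rdfdir" create "$group"
    for range in "$@"; do
        $RUN symld -g "$group" -RANGE "$range" addall dev
    done
}

# Primary system (use rdf2 instead of rdf1 on the secondary system).
make_rdf_group data rdf1 000:013
make_rdf_group logs rdf1 014:017
```

Passing several ranges to one call covers the case of combining nonconsecutive blocks of volumes into a single group.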



Establish the SRDF Mirrors

To simplify configuration on the secondary system, the rest of the Logical Volume Manager (LVM) and database setup steps on the primary system were done with SRDF volumes established. This makes the setup of the secondary system much simpler, as we will see later in the section “Import the Disk Configuration on the Secondary System”.

General Procedure

Value required to run the procedure:

GROUP

Name for the Symmetrix Device Group consisting of SRDF volumes to be mirrored.

Initial establish of a group of SRDF device pairs:

root> symrdf -g GROUP -noprompt -full -exact establish

Repeat this as needed for each group.

(Optional) Wait until the process completes:

root> symrdf -g GROUP -i 30 verify

In our configuration:

root> symrdf -g data -noprompt -full -exact establish
root> symrdf -g logs -noprompt -full -exact establish

We didn't need to wait at this time for the establish to complete – it could continue in the background while we proceeded with the subsequent setup tasks.
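One way to organize this is to start the establish for every group first and check on them only when the later setup steps are done. The sketch below is a dry run under stated assumptions: the function names are hypothetical, and it assumes that symrdf with the verify action and the 30-second interval shown above is the appropriate check for an SRDF group.

```shell
#!/bin/sh
RUN=${RUN-echo}   # dry run: print each command instead of executing it

# Start a full establish for each group named on the command line.
establish_groups() {
    for group in "$@"; do
        $RUN symrdf -g "$group" -noprompt -full -exact establish
    done
}

# Check each group in turn, polling at a 30-second interval.
wait_for_groups() {
    for group in "$@"; do
        $RUN symrdf -g "$group" -i 30 verify
    done
}

establish_groups data logs
# ... LVM and database setup on the primary system would go here ...
wait_for_groups data logs
```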

Configuring the Volumes on the AIX Systems

The AIX operating system provides a Logical Volume Manager (LVM) to manage large groups of devices as volume groups. Volume groups are further subdivided into logical volumes to be used for file systems or raw logical volumes. If you are using a different Volume Manager, you may use different commands to accomplish the same purposes.

To perform the LVM configuration, we used the AIX devices we discovered earlier (with symdev list) to create volume groups, logical volumes within the volume groups, and file systems on logical volumes to match our needs. For our demonstration, we configured a volume group with a file system and a raw logical volume to contain the data, and, on each system, another volume group with a file system to contain the logs. These were created on the primary system, and later imported onto the secondary system from the R2 copy.



Configure AIX Device Variables

A number of commands need to be given a list of physical devices. If you are only using a small number of physical devices for each purpose, it is no problem to type the entire list by hand each time. If you are using groups of devices, as in our demonstration case, it is more convenient to set up variables to contain the lists of devices. Note that our demonstration systems both assign the same volume names to the corresponding devices, so we used the same variable settings on both systems. You will likely discover different device names on your system, and your two systems might be different from each other and require different assignment lists.


root> datadk='hdisk7 hdisk8 hdisk9 hdisk10'
root> rawdk='hdisk11 hdisk12 hdisk13 hdisk14'
root> rawdk="$rawdk hdisk15 hdisk16 hdisk17 hdisk18"
root> rawdk="$rawdk hdisk19 hdisk20 hdisk21 hdisk22"
root> rawdk="$rawdk hdisk23 hdisk24 hdisk25 hdisk26"
root> logsdk='hdisk27 hdisk28 hdisk29 hdisk30'
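When the hdisk numbers are consecutive, as they were on our systems, the same lists can be generated instead of typed. This is a minimal sketch; the helper name hdisk_range is hypothetical, and the ranges are the ones from our demonstration configuration, which will probably differ on your systems.

```shell
#!/bin/sh
# hdisk_range FIRST LAST: print "hdiskFIRST ... hdiskLAST" on one line.
hdisk_range() {
    i=$1
    list=
    while [ "$i" -le "$2" ]; do
        list="${list:+$list }hdisk$i"   # add a separating space after the first
        i=$(( i + 1 ))
    done
    echo "$list"
}

datadk=$(hdisk_range 7 10)
rawdk=$(hdisk_range 11 26)
logsdk=$(hdisk_range 27 30)
echo "$datadk"   # prints: hdisk7 hdisk8 hdisk9 hdisk10
```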

Create Volume Groups

General Procedure


VGNAME

Name for the volume group.

PPSIZE

Size of partitions to be used.

DKLIST

List of physical volumes to be included in the group.

The partition size you choose depends on the size of the hyper volumes. There is a limit of 1016 partitions per physical volume, so your partition size must be at least 0.1% of the total size you may wish to use on the volume. The partition is the unit of allocation for space on the volume, so a large size will waste more space if a request has to be rounded up to a multiple of the partition size. Our hyper volumes were 4GB, so the default partition size of 4MB was just barely too small. However, we were allocating entire volumes to dedicated purposes, so a larger size (64MB) was convenient to work with and caused no wasted space (beyond what we were wasting by choosing to allocate entire volumes).

root> mkvg -f -yVGNAME -sPPSIZE DKLIST


root> mkvg -f -ydrvg -s64 $datadk $rawdk
root> mkvg -f -ylvg -s64 $logsdk
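The partition-size reasoning can be checked with a little shell arithmetic. This is a sketch under stated assumptions: the function name min_ppsize_mb is hypothetical, sizes are given in MB, a 4GB hyper volume is taken to be exactly 4096MB, and partition sizes are assumed to be powers of two.

```shell
# Hypothetical helper: smallest power-of-two partition size (MB) that keeps
# a physical volume of the given size (MB) within 1016 partitions.
min_ppsize_mb() {
    pv_mb=$1
    pp=1
    # Round pv_mb / pp up and compare with the 1016-partition limit.
    while [ $(( (pv_mb + pp - 1) / pp )) -gt 1016 ]; do
        pp=$((pp * 2))
    done
    echo "$pp"
}

min_ppsize_mb 4096    # a 4GB hyper volume; prints 8
```

A 4MB partition would need 1024 partitions for such a volume, which is why the default was just too small. Any larger power of two, including the 64MB we chose, satisfies the limit.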

Create Logical Volumes to Contain Journalled File Systems

General Procedure




LVNAME

Name for the logical volume.

MAXPP

Maximum number of partitions to be used.

VGNAME

Name of the volume group to contain the logical volume.

DKLIST

List of physical volumes to be included in the logical volume.

Specify the number of partitions to be available to the logical volume (which we will eventually set to almost all of the available space). We set the space initially to 1, so that when a journalled file system is built, the logical volume to be used for the journal will be placed in the middle of the available space. This is not especially critical for volumes that are on a Symmetrix, but it doesn't hurt and can have some beneficial effect.

Create a Logical Volume with the File System Volumes:

root> mklv -yLVNAME -tjfs -xMAXPP VGNAME 1 DKLIST


root> mklv -ydlv -tjfs -x250 drvg 1 $datadk
root> mklv -yllv -tjfs -x250 lvg 1 $logsdk

Create Logical Volumes to Contain Raw Disks

General Procedure


LVNAME


MAXPP


VGNAME


ACTPP

Initial number of partitions to be used.

DKLIST


Specify the number of partitions available to the logical volume (which we set to almost all of the available space). We allocate the space immediately here. No other logical volumes are involved for a raw volume, so no special arrangements to locate them within the volume group are required.

root> mklv -yLVNAME -xMAXPP VGNAME ACTPP DKLIST


root> mklv -yrlv -x1000 drvg 1000 $rawdk



Create Journalled File Systems

General Procedure


LVNAME

Name of the logical volume.

MNTPT

Mount point where the file system will be used.

AUTO

Specifies whether the file system is to be mounted on reboot, either yes or no; usually yes for the primary system and no for the secondary system.

ISIZE

Size for inode blocks.

FSSIZE

Final size for the file system (in 512 byte blocks).

Logical volumes to contain file systems now get those file systems created on them. Each file system needs a mount point where it will appear in the computer's directory structure. The size of an inode block controls the number of files (and directories and devices and so on) that can be created on that file system. The default of 4096 is a bit too small for the size of file systems we were creating. We made the final size of each of our file systems 32,768,000 blocks (of 512 bytes), which is all of the space (slightly under 16GB) on the logical volume. An alternate approach is to use smit fs to create file systems.

root> crfs -vjfs -dLVNAME -mMNTPT -AAUTO -prw -tno -a nbpi=ISIZE
root> chfs -a size=FSSIZE MNTPT
root> mount MNTPT


root> crfs -vjfs -ddlv -m/data -Ayes -prw -tno -a nbpi=8192
root> crfs -vjfs -dllv -m/logs -Ayes -prw -tno -a nbpi=8192
root> chfs -a size=32768000 /data
root> chfs -a size=32768000 /logs
root> mount /data
root> mount /logs

root> mkdir /logs/logs /logs/archive /logs/retrieve /logs/applied
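The file system size used above follows directly from the logical volume limits: 250 partitions of 64MB each, expressed in 512-byte blocks. The sketch below shows the conversion; the function name pp_to_blocks is hypothetical.

```shell
# Hypothetical helper: convert a number of physical partitions of a given
# size (MB) into 512-byte blocks, the unit used by "chfs -a size=".
pp_to_blocks() {
    partitions=$1
    ppsize_mb=$2
    # 1 MB = 2048 blocks of 512 bytes.
    echo $(( partitions * ppsize_mb * 2048 ))
}

pp_to_blocks 250 64    # prints 32768000
```

That is 16,000MB, which is the "slightly under 16GB" figure quoted for each file system.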

Make File Systems and Devices Available to the DB2 Instance

To permit the file systems and devices to be used by the database instance, the permissions must be set up properly. We show how to provide access to the mount point of the file system, but this is not a requirement. You could create subdirectories from the mount point and have the database created within these subdirectories. However, any other files or directories placed under the mount point will be copied to the SRDF volumes along with the database.


root> chown dba /data /dev/rrlv

root> chown -R dba /logs



IBM DB2 UDB Storage Configuration

IBM DB2 UDB needs storage for creating database configuration and control information. It also needs storage for tablespaces, which will in turn hold tables and other database objects. DB2 UDB provides two types of storage for databases: SMS (system-managed storage) and DMS (database-managed storage). For more information on DB2 UDB storage, refer to the IBM DB2 UDB manuals. Our configuration demonstrates the use of both SMS (for the catalog, user, and temp tablespaces) and DMS (for the test raw tablespace).

Creating a DB2 Database in the /data File System

We create the database on the journalled file system using default SMS containers for the catalog, user, and temporary tablespaces.

General Procedure


DBNAME

Name of the database to be created.

PATH

Location of the database.

Start IBM DB2 UDB (if it is not already running):

dba> db2start

Create the database on the file system:

dba> db2 create db DBNAME on PATH


dba> db2start
dba> db2 create db testdb on /data

Configure a Raw Device as DMS in a Database

A raw device can be used to hold data within a database. (This is not necessary; we do it here for demonstration purposes.)

General Procedure


DBNAME

Name of the database.

TSNAME

Name to give the tablespace that will contain the raw device.

TBNAME

Name of the table in the tablespace.

TBDEF

DB2 table definition.



DVPATH

Pathname of the raw device container.

DVSIZE

Amount of space available on the device.

Establish a connection to the database:

dba> db2 connect to DBNAME

Enclose the next command in double quotes so that the single quotes and parentheses do not need to be escaped to protect them from interpretation by the shell:

dba> db2 "create tablespace TSNAME managed by database using \
( device 'DVPATH' DVSIZE )"

Create table(s) in the tablespace.


dba> db2 connect to testdb
dba> db2 "create tablespace testraw managed by database \
using ( device '/dev/rrlv' 60G )"

Any tables created in the tablespace, testraw, will be stored in the DMS tablespace.

Set the Location of Logs

The default location for the DB2 log files is in the SQLOGDIR directory, which is a relative path under the database path used during the create database command. You would omit setting the location of logs if you were not using a separate Symmetrix device group for the logs and the logpath was contained under the database path.

General Procedure


DBNAME


LPATH

Location for the logs.

Modify the default location of the DB2 log files:

dba> db2 update db cfg for DBNAME using NEWLOGPATH LPATH


dba> db2 update db cfg for testdb using NEWLOGPATH /logs

Enable Log Retain

Additional configuration parameters may need to be set. To use a standby copy of the database, you need to enable LOGRETAIN.



Performed on the Primary System:

Enable LOGRETAIN:

dba> db2 update db cfg for testdb using LOGRETAIN on

Finishing Setup on the Secondary System

The configuration of devices, file systems, and the database must now be made known to the OS on the secondary system. This is largely a matter of telling that system to use the configuration already copied onto the R2 volumes. The R2 volumes are made available to the secondary system by splitting them. We do not have to stop the database on the primary system to do this; we simply use the DB2 UDB suspend write function to ensure that the split copy has restartable integrity.

Split the R2 Volumes from the R1 Volumes

General Procedure


GROUP

Name of a device group to split

DBNAME

Name of the database that is being protected.

Make sure that the establish operation started earlier has completed successfully:

root> symrdf -g GROUP -synchronized verify

Suspend writes to the database:

dba> db2 connect to DBNAME
dba> db2 set write suspend for database

Split the SRDF volume groups:

root> symrdf -g GROUP -noprompt split

(repeat additional splits for each group used by the database)

Resume writes to the database:

dba> db2 set write resume for database

Performed on the primary system:

root> symrdf -g data -synchronized verify
root> symrdf -g logs -synchronized verify
dba> db2 connect to testdb
dba> db2 set write suspend for database
root> symrdf -g data -noprompt split
root> symrdf -g logs -noprompt split
dba> db2 set write resume for database
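The ordering matters: every group is verified first, writes are suspended, all groups are split, and only then are writes resumed. The sketch below prints that sequence as a checklist for any set of groups. It does not execute anything, and the function name print_split_plan is hypothetical; the printed commands are the ones shown in the procedure above.

```shell
# Hypothetical helper: print (not run) the command sequence for splitting
# every device group of a database while writes are suspended.
# Usage: print_split_plan DBNAME GROUP [GROUP ...]
print_split_plan() {
    dbname=$1
    shift
    # Verify every group before touching the database.
    for group in "$@"; do
        echo "root> symrdf -g $group -synchronized verify"
    done
    echo "dba> db2 connect to $dbname"
    echo "dba> db2 set write suspend for database"
    # Split all groups while writes are suspended.
    for group in "$@"; do
        echo "root> symrdf -g $group -noprompt split"
    done
    echo "dba> db2 set write resume for database"
}

print_split_plan testdb data logs
```

Keeping the suspend window limited to the split commands themselves minimizes the time during which the database cannot write.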



Import the Disk Configuration on the Secondary System

The volume manager setup, file system setup, and database setup have already been done on the R1 volumes on the primary system, and mirrored to the R2 volumes; the secondary system merely has to learn about them. Using the AIX volume manager, this step is simple.

General Procedure


VGNAME

Name of the volume group being imported.

DISK

Device name of the first disk in the group.

RAWDEV

Name of a raw device (if any) in the volume.

Import the volume group definition from the physical volumes:

root> importvg -yVGNAME DISK

If a raw device exists, its ownership must be given to the DB2 instance owner (DBOWNER in the command below; dba in our example). The contents of the file systems will already have the correct permissions, carried over by the mirroring of STD devices on the primary system.

root> chown DBOWNER RAWDEV


root> importvg -ydrvg hdisk7
root> importvg -ylvg hdisk27
root> chown dba /dev/rrlv

Catalog the Database on the Secondary System

The database is already present on the imported disk volumes, but IBM DB2 UDB on the secondary system does not know about the imported volumes yet. If your database application requires any auxiliary data or files that are stored outside of the database (such as control scripts), they should be set up at this time if this hasn’t been done yet.

General Procedure


DBNAME


PATH


Make the database known to the DB2 Instance:

dba> db2 catalog database DBNAME on PATH

Install database control scripts, auxiliary files, etc. as required to use the database properly on the secondary system.




dba> db2 catalog database testdb on /data

Reestablish the SRDF Connections for Standby Capability

Now that the database configuration has been imported into the secondary system, it is ready to be used for its purpose – standing by in case of a disaster on the primary system. To recover from a disaster, the secondary system needs to have fully synchronized copies of the data and logs from the primary. In this scenario, that is done by keeping all of the volumes synchronized during normal processing and not mounted on the secondary system. (If the secondary system left the volumes mounted, it would have an inconsistent idea of the contents that could cause the system to crash when the volumes became available again with their contents changed with new data from the primary system.)

Unmounting the Database Volumes from a System

Before we can reestablish the volumes, we have to prepare the secondary system for the volumes being unavailable for the duration of normal processing. They will not be available until a disaster or a planned switchover occurs, which might be a long time in the future, and by then the volumes will have had a great number of changes made without the knowledge of the secondary system.

General Procedure


FILESYS

Name of a file system to unmount.

Find any running DB2 applications

dba> db2 list applications

Stop them. This may require application-specific termination procedures.

To stop the database if all applications have already been stopped:

dba> db2stop

To stop the database, forcibly terminating all connected applications (use this only if no application-specific termination procedure is required):

dba> db2stop force

Unmount the file systems:

root> umount FILESYS

Repeat this step for each file system.

Performed on the secondary system:

dba> db2stop force
root> umount /data
root> umount /logs



Reestablish SRDF Connections

General Procedure


GROUP

Name of a device group that will be established.

Establish each group of devices:

root> symrdf -g GROUP -noprompt establish


This could be done on the primary system; we do it on the secondary system because the previous steps were done there.

root> symrdf -g data -noprompt establish
root> symrdf -g logs -noprompt establish

Planned Switchover to Secondary System

Sometimes you will have adequate warning and reason that the primary system will need to be made unavailable. With such warning, you can provide an orderly switchover.

Unmount the Database Volumes from the Primary System


You might precede this with a procedure that blocks new transactions from starting for some time.

dba> db2stop force
root> umount /data
root> umount /logs

Make the Database Volumes Available to the Secondary System

The database volumes are no longer being changed by the primary, and all indications of their being busy (database state, file system journal) have been stopped. So, the secondary system can simply mount the volumes and start the database. The SRDF commands failover and switch are used to make the volumes unavailable to the primary host, and to exchange the R1/R2 roles of the two hosts. After this, the secondary system acts as the R1 and the primary system acts as the R2 in all commands.


root> symrdf -g data -noprompt failover
root> symrdf -g logs -noprompt failover
root> symrdf -g data -noprompt switch
root> symrdf -g logs -noprompt switch

root> mount /data
root> mount /logs



dba> db2start

Now you must start your application database and ensure that processes using it will be directed to the secondary system instead of the primary. (That will depend upon your circumstances; perhaps a nameserver update will be used to redirect the network address of the application from the primary system to the secondary.)

Preparing to Switch Back

It is important to keep the volumes on the primary Symmetrix up to date if possible, or to get them back up to date as soon as possible. Depending upon the reason for the planned switchover, the primary Symmetrix might be operational throughout the switchover, or it might be unavailable for a while but become available later. As soon as it is available, its volumes should be brought up to date with the ongoing activity on the secondary system.

Performed on either system:


Because of the switch, these establish commands cause data to be copied from the secondary system (which is now the R1) to the primary (which is now the R2).

Planned Switchback to Primary System

After the primary system is ready to resume running the application (which requires that the SRDF volumes are synchronized between the two systems), the procedure to switch back is essentially the same as the planned switchover procedure – you just exchange which system runs the commands. (Even though the original secondary system is currently running the application, we still refer to it as the secondary system.) The sole difference is that the failover command is replaced by the failback command, to indicate that you know that it is the original R2 system that is currently acting as the R1.


You might precede this with a procedure that blocks new transactions from starting for some time.

root> symrdf -g data -synchronized verify
root> symrdf -g logs -synchronized verify
root> # if necessary, wait for them to get synchronized
root> symrdf -g data -i 10 verify
root> symrdf -g logs -i 10 verify

dba> db2stop force



root> symrdf -g data -noprompt failback
root> symrdf -g logs -noprompt failback
root> symrdf -g data -noprompt switch
root> symrdf -g logs -noprompt switch




root> mount /data
root> mount /logs
dba> db2start

Preparing to Resume Standby Capability

We now ensure that the volumes on the secondary Symmetrix will be kept up to date so that the secondary Symmetrix can resume its role of standing by in case of a disaster on the primary system.



Because of the second switch, these establish commands cause data to be again mirrored from the primary system (which is again the R1) to the secondary (which is again the R2).

Disaster Switchover to Secondary System

Sometimes you will have no warning before the primary system becomes unavailable. This can happen because of a physical disaster (e.g. an earthquake or a power failure) or a system failure (e.g. an OS crash, or the primary system loses all network connections to the database clients). When there is no such warning, you can still switch over to using the secondary system to run the application, but you have lost any opportunity for an orderly shutdown of the primary system. This means that there may be more visible effects such as aborted transactions.

One special case should be mentioned here. If the primary system and the SRDF lines are still operational, use the planned switchover procedure described in “Planned Switchover to Secondary System” above. This might happen if the primary system lost its network connection and was unable to communicate with its clients but still had an SRDF connection to the secondary Symm. That would be a disaster to the application, but would still permit an unplanned opportunity to use the planned shutdown procedure.

Make the Database Volumes Available to the Secondary System

The database volumes are no longer being changed by the primary, but indications of their being busy (database state, file system journal) may still be present. So, when the secondary system mounts the file systems, there may be a slight delay as the journal is replayed. More significantly, the database will indicate that it is active and so disaster recovery must be done for it before it can be used. The SRDF commands failover and switch are used to make the volumes unavailable to the primary host, and to exchange the R1/R2 roles of the two hosts. After this, the secondary system acts as the R1 and the primary system acts as the R2 in subsequent commands.


root> symrdf -g data -synchronized verify
root> symrdf -g logs -synchronized verify
root> # if necessary, wait for them to get synchronized
root> symrdf -g data -i 10 verify
root> symrdf -g logs -i 10 verify



root> symrdf -g data -noprompt failover
root> symrdf -g logs -noprompt failover
root> symrdf -g data -noprompt switch
root> symrdf -g logs -noprompt switch

root> mount /data
root> mount /logs
dba> db2start
dba> db2 restart db testdb

Preparing to Switch Back

It is important to keep the volumes on the primary Symmetrix up to date if possible, or to get them back up to date as soon as possible. Depending upon the form of the disaster, the primary Symmetrix might be operational throughout the switchover, or it might be unavailable for a while but become available later. As soon as it is available, its volumes should be brought up to date with the ongoing activity on the secondary system.



Because of the switch, these establish commands cause data to be copied from the secondary system (which is now the R1) to the primary (which is now the R2).

Disaster Switchback to Primary System

After the primary system is ready to resume running the application, the normal procedure to switch back is described above.

However, if a disaster were to occur at the secondary site after the primary was ready to take over, but before it actually had taken over, you could use the disaster takeover procedure in reverse.

This is essentially the same as the disaster switchover procedure – you just exchange which system runs the commands. (Even if the original secondary system is currently running the application, we still refer to it as the secondary system.) The sole difference is that the failover command is replaced by the failback command, to indicate that you know that it is the original R2 system that is currently acting as the R1.

Performed on the secondary system (if still operating):

dba> db2stop force



root> symrdf -g data -noprompt failback
root> symrdf -g logs -noprompt failback
root> symrdf -g data -noprompt switch
root> symrdf -g logs -noprompt switch

root> mount /data
root> mount /logs



dba> db2start

Now you must start your application database and ensure that processes using it will be again directed to the primary system instead of the secondary. (That will depend upon your circumstances; perhaps a nameserver update will be used to redirect the network address of the application, or perhaps the application IP address must be retaken by the primary.)

Preparing to Resume Standby Capability

It is important to keep the volumes on the secondary Symmetrix up to date if possible, or to get them back up to date as soon as possible. Depending upon the form of the disaster, the secondary Symmetrix might be operational and able to communicate with the primary Symmetrix throughout the disaster, or it might be unavailable for a while but become available later. As soon as it is available, its volumes should be brought up to date with the ongoing activity on the primary system. Then it can resume its role of standing by in case of a disaster on the primary system.



Because of the second switch, these establish commands cause data to be again copied from the primary system (which is again the R1) to the secondary (which is again the R2).



Scenario 2 – Log Copy

In this scenario, only the logs are kept fully synchronized during normal operations. The data volumes are synchronized once with SRDF to initially copy the database to the secondary system, but they are then split. They are only updated with SRDF after that time as a step in disaster recovery. Normally, they are instead updated by the secondary system, which applies the logs that have been forwarded from the primary system.

Each Symmetrix has three sets of volumes for the logs. An R1 volume set is used as the local log directory. It is connected to an R2 volume set on the other Symmetrix. The R2 volume set has a BCV volume set associated with it. These three volume sets are contained in one Symmetrix volume group.

Figure 2. Test configuration used for log copy scenario

The Symmetrix volume group loclogs on each Symm includes the volumes that are locally mounted as /logs as well as two volume sets on the other Symmetrix: the R2 volume set (which the remote host doesn't see) and the BCV volume set which the other host mounts as /rmt/logs. The same collection of volumes has the name rmtlogs on the other Symmetrix. So, each system uses the group loclogs to refer to its own logs and the copies of them that are sent to the other Symmetrix, and each system uses the group rmtlogs to refer to the other system's logs and the copies of them that are copied back to its own Symmetrix. Since the data volumes are treated equally (they are usually split, making them available to both systems independently, but occasionally copied from one system to the other) they both use the same group name for them.

During normal operation, the primary system will have /logs mounted, and the R2 on the other Symm will remain synchronized with it. The BCV copy on the remote system will generally be established, too. The /logs volumes for the secondary system will also be mounted. Its R2 and BCV (on the primary Symm) will always remain synchronized.

When DB2 UDB stops using a log, it archives it. We set the USEREXIT facility so that this archive operation will place the log in a special directory. We run a daemon script to watch that directory, to ensure that the archived log is applied by the secondary system. When the daemon notices a newly archived log, it splits the remote BCV copy. On the secondary system, another daemon is watching for this split. When the split happens, the secondary daemon mounts /rmt/logs and copies any newly archived logs over to its own /logs area. It can now reestablish the BCV (so that it is ready to accept additional archived logs from the primary system) and cause its own DB2 UDB system to apply the received logs to its own copy of the database. There are some special concerns: if the secondary system is too slow to reestablish the BCV, the primary system might have already archived an additional log. The archive script deals with that by ensuring that the BCV is synchronized before it tries to split it, waiting and trying the split again later if it is not yet synchronized. The secondary system processes as many logs as have been archived, so it will eventually catch up (as long as the secondary system is fast enough to apply the logs as quickly as the primary system creates them).
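The wait-and-retry behavior of the archive script can be sketched as a small loop. This is an illustration only, not the script we ran: the function name retry_until, the attempt limit, and the delay are assumptions, and the real synchronization check is replaced here by a stand-in function so that the example is self-contained.

```shell
# Hypothetical helper: run CHECK until it succeeds, sleeping DELAY seconds
# between attempts, and give up after MAX attempts.
# Usage: retry_until MAX DELAY CHECK [ARG ...]
retry_until() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Stand-in for the real synchronization check; succeeds on the third call.
calls=0
fake_bcv_synchronized() {
    calls=$((calls + 1))
    [ "$calls" -ge 3 ]
}

if retry_until 5 0 fake_bcv_synchronized; then
    echo "synchronized after $calls checks; safe to split the BCV"
fi
```

In a real daemon, the check would be whatever command reports that the BCV is synchronized, and the split would be attempted only after the loop returns success.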

Table 2 and Table 3 show the Symmetrix device group name, AIX volume group name, AIX logical volume name, file system mount points (device name in the case of raw logical volumes), the Symmetrix device numbers used on each system, and the AIX hdisk names assigned to the Symmetrix hyper volumes on both systems. (The Symmetrix device numbers and the AIX hdisk names need not be the same on both systems. It simply happened that our test systems had the same configuration of other disks on the system and the same choices made for the Symmetrix configuration. However, because of the alternating remote/local viewpoint issues, not all of the volumes in this scenario have the same volume numbers, so be careful.)

Table 2. Names and device numbers on the primary system

Symmetrix Group  Volume Group  Logical Volume  Mount Point  Symmetrix dev  Device type  hdisk
loclogs          lvg           llv             /logs        014:017        R1           27:30
rmtlogs                                        N/A          018:01B        R2           31:34
rmtlogs          XX            XX              /rmt/logs    080:083        BCV          35:38


Table 3. Names and device numbers on the secondary system

Symmetrix Group  Volume Group  Logical Volume  Mount Point  Symmetrix dev  Device type  hdisk
rmtlogs                                        /logs        014:017        R2           27:30
loclogs          lvg           llv             N/A          018:01B        R1           31:34
rmtlogs          XX            XX              /rmt/logs    080:083        BCV          35:38


Note that the device type R1(R2) is an R1 device on the primary system during initial setup, but is switched to an R2 device during disaster recovery. The device type R2(R1) denotes the corresponding volumes on the secondary system, which are always in the opposite state. The volume group and logical volume names listed as XX are chosen by the OS, but we never need to use those names for these procedures, so you need not find out what names were assigned.



Configuration Setup Tasks and Management Procedures

A fair amount of planning and configuration goes into setting up the storage for a large database implementation on an EMC Symmetrix. Before you can use Symmetrix R1, R2 or TimeFinder volumes, the physical drives in the Symmetrix must be logically subdivided into hyper volumes, which can appear to your system as physical disks. Symmetrix hyper volumes are further configured as R1, R2, or BCV (or other) Symmetrix device types. Once this is complete, the devices must be made known to your operating system. This step is operating system (OS) specific; in our case, the OS is AIX 4.3.3. The same procedures could be done on other OSs, such as HP-UX or Solaris, but the commands to manage the devices would need to be changed into the appropriate OS-specific equivalents.

SYMCLI commands must be installed and available to the root user doing the initial device configuration and subsequent operational steps. Finally, the SYMCLI database is initialized to include the new devices so that later SYMCLI commands can operate efficiently on those devices.

AIX and Symmetrix Volume Discovery

Added to .profile configuration on both systems:

Add the path to the SYMCLI software into root’s default profile:

root> export PATH=$PATH:/usr/symcli/bin

Additional commands are required when initializing devices:

root> export PATH=$PATH:/usr/lpp/Symmetrix/bin


Detect new Symmetrix devices on AIX:

root> emc_cfgmgr

Initialize communication with the Symmetrix, and discover device paths and other device information:

root> symcfg discover

List the available devices so that you can determine how AIX device names are mapped to the Symmetrix volumes available to this host:

root> symdev list

Set Up Symmetrix Device Groups for SRDF Operations

This section describes how to define the members of a group containing pairs of SRDF volumes, possibly associating a BCV volume with the R2. The SRDF volumes must have been associated when your Symmetrixes were configured by your EMC SE. The volumes in each pair must be the same size (and the BCV too if there is one). All of the pairs in the group must consist of an R1 on one Symmetrix, each associated with an R2 on the other. Use the same procedure on both systems so that both will use the same group names. (We use the same names with different sets in this scenario so that it is the devices with the same function, rather than the identical devices, that have the same name.) SYMCLI commands take these group names as parameters.

General Procedure




GROUP

Name for the Symmetrix Device Group consisting of R1 Symmetrix volumes to be mirrored with R2 (and possibly BCV) volumes.

RDFDIR

Direction of the SRDF link. This will be either rdf1 on the R1 machine, or rdf2 on the R2 machine.

SSTART

First volume (hex Symmetrix device number) on local Symmetrix. The volume numbers on the remote Symmetrix will have been defined when your Symmetrix was configured.

SEND

Last volume (hex Symmetrix device number) on local Symmetrix.

BSTART

First BCV volume (Symmetrix device number), if used.

BEND

Last BCV Volume (Symmetrix device number), if used.

Create the group for the volumes:

root> symdg -type RDFDIR create GROUP

Add a range of local devices to the group (they must be consistent with the RDFDIR specified in the previous line):

root> symld -g GROUP -RANGE SSTART:SEND addall dev

Repeat this step if you want to combine nonconsecutive blocks of volumes into a single STD group.

(If required) Associate a range of BCV devices with the group.

To associate local BCV devices to the local R2 use:

root> symbcv -g GROUP -RANGE BSTART:BEND associateall dev

Repeat this step if you want to combine nonconsecutive blocks of volumes into a single BCV group.

To associate remote BCV devices to the remote R2 use:

root> symbcv -g GROUP -RANGE BSTART:BEND -rdf -rdfg 1 associateall dev

Repeat this step if you want to combine nonconsecutive blocks of volumes into a single BCV group.


root> symdg -type rdf1 create data
root> symdg -type rdf1 create loclogs
root> symdg -type rdf2 create rmtlogs

root> symld -g data -RANGE 000:013 addall dev
root> symld -g loclogs -RANGE 014:017 addall dev
root> symld -g rmtlogs -RANGE 018:01B addall dev
root> symbcv -g loclogs -RANGE 080:083 -rdf -rdfg 1 associateall dev
root> symbcv -g rmtlogs -RANGE 080:083 associateall dev

Performed on the secondary system:

root> symdg -type rdf2 create data
root> symdg -type rdf1 create loclogs
root> symdg -type rdf2 create rmtlogs

root> symld -g data -RANGE 000:013 addall dev
root> symld -g rmtlogs -RANGE 014:017 addall dev
root> symld -g loclogs -RANGE 018:01B addall dev
root> symbcv -g loclogs -RANGE 080:083 -rdf -rdfg 1 associateall dev
root> symbcv -g rmtlogs -RANGE 080:083 associateall dev

How you group Symmetrix devices is a function of the operations you plan to perform. We created a separate group for each of the data, the local logs, and the remote logs because we need to operate differently on them.
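Because the ranges are hexadecimal Symmetrix device numbers, it is easy to miscount how many devices a range covers. The sketch below counts the devices in a START:END range so that the result can be compared with the number of hdisks expected for the group. The function name hex_range_count is hypothetical, and the range format is assumed to be exactly the START:END form used in the commands above.

```shell
# Hypothetical helper: count the devices in a hex range such as 018:01B.
hex_range_count() {
    # Split the range at the colon and convert each side from hex.
    start=$(printf '%d' "0x${1%%:*}")
    end=$(printf '%d' "0x${1##*:}")
    echo $(( end - start + 1 ))
}

hex_range_count 000:013    # data group; prints 20
hex_range_count 018:01B    # a logs group; prints 4
```

In our configuration, the 20 devices of the data group correspond to hdisk7 through hdisk26, and each logs range corresponds to four hdisks.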

Establish the SRDF Mirrors

To simplify configuration on the secondary system, the rest of the LVM and database setup steps on the primary system were done with SRDF volumes established. This makes the setup of the secondary system much simpler as we will see later.

General Procedure


GROUP

Name for the Symmetrix Device Group consisting of SRDF volumes to be mirrored.

Initial establish of a group of SRDF device pairs:

root> symrdf -g GROUP -noprompt -full -exact establish




Performed on the primary system:

root> symrdf -g data -noprompt -full -exact establish
root> symrdf -g loclogs -noprompt -full -exact establish




root> symrdf –g loclogs –noprompt –full –exact establish

We didn't need to wait at this time for the establish to complete – it could continue in the background while we proceeded with the subsequent setup tasks.

Establish the BCV Mirrors

To simplify configuration on the secondary system, the rest of the LVM and database setup steps on the primary system were done with BCV volumes established, too. This makes the setup of the secondary system much simpler, as we will see later.

General Procedure


GROUP

Name for the Symmetrix Device Group consisting of STD Symmetrix volumes to be mirrored with BCV volumes

Initial establish of a group of STD-BCV device pairs:

root> symmir -g GROUP -noprompt -full -exact establish





root> symmir -g rmtlogs -noprompt -full -exact establish

We didn't wait for the establish to complete; instead, we continued with other setup tasks. We wanted the results of those tasks to be included before we split the BCV copies, but until then we did not care whether the copies were complete.

Configuring the Volumes on the AIX Systems

The AIX operating system provides a Logical Volume Manager (LVM) to manage large groups of devices as volume groups. Volume groups are further subdivided into logical volumes to be used for file systems or raw logical volumes. If you are using a different Volume Manager, you may use different commands to accomplish the same purposes.



To perform the LVM configuration, we used the AIX devices we discovered earlier (with symdev list) to create volume groups, logical volumes within the volume groups, and file systems on logical volumes to match our needs. For our demonstration, we configured a volume group with a file system and a raw logical volume to contain the data, and, on each system, another volume group with a file system to contain the logs. The data was created on the primary system and later imported onto the secondary system from the R2 copy. The logs were created separately on each system, and later imported onto the other system (as remote logs) from the BCV copy of the R2 copy.

Configure AIX Device Variables

A number of the commands that we will be using require a list of physical devices as arguments. If you are only using a small number of physical devices for each purpose, it is no problem to type the entire list by hand each time. If you are using larger groups of devices, as in our demonstration case, it is more convenient to set up variables to contain the lists of devices. Note that our demonstration systems both assign the same volume names to the corresponding devices, but as shown in Tables 2 and 3 above, we arrange them somewhat differently on the two systems. You will likely discover different device names on your system, and your two systems might differ from each other and require different assignment lists.

Performed on both systems (but only actually used on primary):

root> datadk='hdisk7 hdisk8 hdisk9 hdisk10'
root> rawdk='hdisk11 hdisk12 hdisk13 hdisk14'
root> rawdk="$rawdk hdisk15 hdisk16 hdisk17 hdisk18"
root> rawdk="$rawdk hdisk19 hdisk20 hdisk21 hdisk22"
root> rawdk="$rawdk hdisk23 hdisk24 hdisk25 hdisk26"

Performed on the primary system:

root> logsdk='hdisk27 hdisk28 hdisk29 hdisk30'
root> rmtlogsdk='hdisk35 hdisk36 hdisk37 hdisk38'

Performed on the secondary system:

root> logsdk='hdisk31 hdisk32 hdisk33 hdisk34'
root> rmtlogsdk='hdisk35 hdisk36 hdisk37 hdisk38'
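When the hdisk numbers for a purpose are consecutive, as they are in our demonstration, the lists can be generated rather than typed. The following is a minimal sketch: hdisk_range is a hypothetical helper name, not an AIX command, and it assumes that the device numbers really are contiguous on your system.

```sh
# hdisk_range FIRST LAST  (hypothetical helper, not part of AIX)
# Print "hdiskFIRST ... hdiskLAST" as one space-separated list.
hdisk_range() {
    hr_i=$1
    hr_list=
    while [ "$hr_i" -le "$2" ]; do
        hr_list="$hr_list hdisk$hr_i"
        hr_i=$((hr_i + 1))
    done
    # Strip the leading space before printing.
    echo "${hr_list# }"
}

datadk=$(hdisk_range 7 10)     # hdisk7 through hdisk10
rawdk=$(hdisk_range 11 26)     # the 16 raw devices in one assignment
```

If the devices on your system are not numbered consecutively, typing the lists by hand as shown above remains the safer choice.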

Create Volume Groups

General Procedure


VGNAME

Name for the volume group.

PPSIZE

Size of partitions to be used.

DKLIST

List of physical volumes to be included in the group.

The partition size you choose depends on the size of the hyper volumes. There is a limit of 1016 partitions per physical volume, so your partition size must be at least 0.1% of the total size you may wish to use on the volume. The partition is the unit of allocation for space on the volume, so a large size will waste more space if a request has to be rounded up to a multiple of the partition size. Our hyper volumes were 4GB; the default partition size of 4MB was just barely too small. However, we were allocating entire volumes to dedicated purposes, so a larger size (64MB) was convenient to work with and caused no wasted space (beyond what we were wasting by choosing to allocate entire volumes).

root> mkvg -f -yVGNAME -sPPSIZE DKLIST


root> mkvg -f -ydrvg -s64 $datadk $rawdk


root> mkvg -f -ylvg -s64 $logsdk
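The 1016-partition limit described above can be checked with a little arithmetic before running mkvg. The following is a minimal sketch: min_ppsize is a hypothetical helper name, and it assumes that the volume size is known in whole megabytes and that partition sizes are chosen from powers of two, starting at 1MB.

```sh
# min_ppsize VOLUME_MB  (hypothetical helper)
# Print the smallest power-of-two partition size, in MB, that keeps a
# physical volume of VOLUME_MB megabytes within 1016 partitions.
min_ppsize() {
    mp_size=1
    # Round the partition count up so a partial partition still counts.
    while [ $(( ($1 + mp_size - 1) / mp_size )) -gt 1016 ]; do
        mp_size=$((mp_size * 2))
    done
    echo "$mp_size"
}

# A 4096MB volume needs 1024 partitions of 4MB, which exceeds 1016,
# so the smallest usable size is the next power of two.
min_ppsize 4096    # prints 8
```

Any larger size, such as the 64MB used in our example, also satisfies the limit; the trade-off is only the coarser unit of allocation.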

Create Logical Volumes to Contain Journalled File Systems

General Procedure


LVNAME


MAXPP


VGNAME


DKLIST


Specify the number of partitions to be available to the logical volume (which we will eventually set to almost all of the available space). We set the space initially to 1, so that when a journalled file system is built the logical volume to be used for the journal will be placed in the middle of the available space. This is not especially critical for volumes that are on a Symmetrix, but it doesn't hurt and can have some beneficial effect.

Create a Logical Volume with the File System Volumes:

root> mklv -yLVNAME -tjfs -xMAXPP VGNAME 1 DKLIST


root> mklv -ydlv -tjfs -x250 drvg 1 $datadk


root> mklv -yllv -tjfs -x250 lvg 1 $logsdk



Create Logical Volumes to Contain Raw Disks

General Procedure


LVNAME


MAXPP


VGNAME


ACTPP

Initial number of partitions to be used.

DKLIST


Specify the number of partitions available to the logical volume (which we set to almost all of the available space). We allocate the space immediately here. No other logical volumes are involved for a raw volume, so no special arrangements to locate them within the volume group are required.

root> mklv -yLVNAME -xMAXPP VGNAME ACTPP DKLIST


root> mklv -yrlv -x1000 drvg 1000 $rawdk

Create Journalled File Systems

General Procedure


LVNAME

Name of the logical volume.

MNTPT

Mount point where the file system will be used.

AUTO

Specifies whether the file system is to be mounted on reboot, either yes or no; usually yes for the primary system and no for the secondary system.

ISIZE

Size for inode blocks.

FSSIZE

Final size for the file system (in 512 byte blocks).

Logical volumes to contain file systems now get those file systems created on them. Each file system needs a mount point where it will appear in the computer's directory structure. The size of an inode block controls the number of files (and directories and devices and so on) that can be created on that file system. The default of 4096 is too small for the size of file systems we were creating. We made the final size of each of our file systems 32,768,000 blocks (of 512 bytes), which is all of the space (slightly under 16GB) on the logical volume. An alternate approach is to use smit fs to create file systems.

root> crfs -vjfs -dLVNAME -mMNTPT -AAUTO -prw -tno -a nbpi=ISIZE
root> chfs -a size=FSSIZE MNTPT
root> mount MNTPT


root> crfs -vjfs -ddlv -m/data -Ayes -prw -tno -a nbpi=8192
root> chfs -a size=32768000 /data
root> mount /data


root> crfs -vjfs -dllv -m/logs -Ayes -prw -tno -a nbpi=8192
root> chfs -a size=32768000 /logs
root> mount /logs
root> mkdir /logs/logs /logs/archive /logs/retrieve /logs/applied
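The figure of 32,768,000 blocks is consistent with the 250-partition maximum given to mklv and the 64MB partition size given to mkvg. The following is a minimal sketch of that arithmetic; pp_to_blocks is a hypothetical helper name introduced only for illustration.

```sh
# pp_to_blocks PARTITIONS PPSIZE_MB  (hypothetical helper)
# Convert a logical volume size, given as a partition count and a
# partition size in MB, to the 512-byte blocks that chfs expects.
pp_to_blocks() {
    # 1MB is 2048 blocks of 512 bytes.
    echo $(( $1 * $2 * 2048 ))
}

pp_to_blocks 250 64    # prints 32768000
```

250 partitions of 64MB come to 16,000MB, which is the "slightly under 16GB" mentioned above.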

Make File Systems and Devices Available to DB2 Instance

To permit the file systems and devices to be used by the database instance, the permissions must be set up properly. We show how to provide access to the mount point of the file system, but this is not a requirement. You could create subdirectories from the mount point and have the database created within these subdirectories. However, any other files or directories placed under the mount point will be copied to the SRDF volumes along with the database.


root> chown dba /data /dev/rrlv

Performed on both systems:

root> chown -R dba /logs



IBM DB2 UDB Storage Configuration

IBM DB2 UDB needs storage for creating database configuration and control information. It also needs storage for tablespaces, which will in turn hold tables and other database objects. DB2 UDB provides two types of storage for databases: SMS (system-managed storage) and DMS (database-managed storage). For more information on DB2 UDB storage, refer to the IBM DB2 UDB manuals.

Creating a DB2 Database in the /data File System

We create the database on the journalled file system using default SMS containers for the catalog, user, and temporary tablespaces.

General Procedure


DBNAME

Name of the database to be created.

PATH


Start DB2 UDB (if it is not already running):

dba> db2start

Create the database on the file system:

dba> db2 create db DBNAME on PATH


dba> db2start
dba> db2 create db testdb on /data

Configure a Raw Device as DMS in a Database

A raw device can be used to hold data within a database. (This is not necessary; we do it here for demonstration purposes.)

General Procedure


DBNAME


TSNAME

Name to give the tablespace that will contain the raw device.

TBNAME

Name of the table in the tablespace.

TBDEF

DB2 table definition.



DVPATH

Pathname of the raw device container.

DVSIZE

Amount of space available on the device.

Establish a connection to the database:

dba> db2 connect to DBNAME

Use " " on the next command so that the quotes and parentheses won't need to be escaped to protect them from interpretation by the shell.

dba> db2 "create tablespace TSNAME managed by database using \
( device 'DVPATH' DVSIZE )"

Create table(s) in the tablespace.


dba> db2 connect to testdb
dba> db2 "create tablespace testraw managed by database \
using ( device '/dev/rrlv' 60G )"

Any tables created in the tablespace, testraw, will be stored in the DMS tablespace.

Set the Location of Logs

The default location for the DB2 log files is the SQLOGDIR directory, which is a relative path under the database path used in the create database command. You can omit this step if you are not using a separate Symmetrix device group for the logs and the log path is contained under the database path.

General Procedure


DBNAME


LPATH

Location for the logs.

Modify the default location of the DB2 log files:

dba> db2 update db cfg for DBNAME using NEWLOGPATH LPATH


dba> db2 update db cfg for testdb using NEWLOGPATH /logs

Compile and Install the Userexit program

We will be setting a DB2 UDB configuration parameter that causes a program to be invoked whenever a log is archived. The program is called db2uext2. A number of sample versions of this program are provided with DB2 UDB. We use the one intended to copy logs to disk. First, we make a copy of the program:




dba> cp sqllib/samples/c/db2uext2.cdisk ./db2uext2.c
dba> vi db2uext2.c

We make some changes to this program by modifying some of the definitions from their default values. First, we set the directory layout that we are using in the configuration area of the program:

#define ARCHIVE_PATH "/logs/archive"
#define RETRIEVE_PATH "/logs/retrieve"
#define AUDIT_ERROR_PATH "/logs/logs"

Second, we change the definition used when copying files so that the program will instead link the files. This should not be done for other uses of the userexit capability. We can do it here because we carefully arrange things so that the archive, retrieve, and log directories are on the same file system, which makes linking instead of copying possible. We choose to do this because the whole point of this scenario is to have a reduced load on the SRDF link (compared to scenario 1), and if a log file were to be copied, its entire content would have to be sent through the SRDF connection again because of the copy. Using a link instead means that only the changes to the directories that allow both directories to refer to the same file have to be sent – that is only a few sectors rather than the entire log file.

#define COPY "ln" /* was "cp" */

After finishing the editing of the program, we compile it and copy it to where DB2 UDB will expect to find it:

dba> cc -o db2uext2 db2uext2.c
dba> cp db2uext2 sqllib/adm
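The effect of replacing cp with ln can be seen in a scratch directory without involving DB2 at all. The following is a minimal sketch: the directory and file names are placeholders for illustration and are not the paths used by the userexit program.

```sh
# Show that ln adds a second directory entry for the same file
# instead of writing a second copy of its data.
# All names below are placeholders in a scratch directory.
demo=${TMPDIR:-/tmp}/lndemo.$$
mkdir -p "$demo/logs" "$demo/archive"
echo "log records" > "$demo/logs/S0000001.LOG"

ln "$demo/logs/S0000001.LOG" "$demo/archive/S0000001.LOG"

# Both names report the same inode number, so they share one set of
# data blocks; only the two directories were modified.
ls -i "$demo/logs/S0000001.LOG" "$demo/archive/S0000001.LOG"

# ln cannot create this kind of link across file systems, which is why
# the archive, retrieve, and log directories must share one.
```

Because no data blocks are rewritten, only the directory updates have to travel over the SRDF link.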

Enable Log Retain and USEREXIT

An additional configuration parameter must be set to cause DB2 UDB to actually use the installed program:

Performed on the primary system:

dba> db2 update db cfg for testdb using USEREXIT on



Finishing Setup on the Secondary System

The configuration of devices, file systems, and the database must now be made known to the OS on the secondary system. This is largely a matter of telling that system to use the configuration already copied onto the R2 volumes. The R2 volumes are made available to the secondary system by splitting them. We do not have to stop the database on the primary system to do this; we simply use the DB2 UDB suspend write function to ensure that the split copy has restartable integrity. Additionally, the primary system needs to be informed about its copy of the secondary system’s logs (but since the secondary system does not yet have a running database, there is no need to suspend access to it).

Split the R2 Volumes from the R1 Volumes

General Procedure


GROUP

Name of a device group to split

DBNAME

Name of the database that is being protected.

Make sure that the establish operation started earlier has completed successfully:

root> symrdf -g GROUP -synchronized verify

(Repeat the verify checks for every group that must be split)

Suspend writes to the database:

dba> db2 connect to DBNAME
dba> db2 set write suspend for database

Split the SRDF volume groups:

root> symrdf -g GROUP -noprompt split

(repeat additional splits for each group used by the database)

Resume writes to the database:

dba> db2 set write resume for database


root> symrdf -g data -synchronized verify
root> symrdf -g loclogs -synchronized verify
root> symmir -g loclogs -rdf -synched verify
root> # if any of the mirrors was not yet synchronized:
root> symrdf -g data -i 10 verify (if needed)
root> symrdf -g loclogs -i 10 verify (if needed)
root> symmir -g loclogs -rdf -i 10 verify (if needed)
dba> db2 connect to testdb
dba> db2 set write suspend for database
root> symrdf -g data -noprompt split
root> symmir -g loclogs -noprompt -rdf split
dba> db2 set write resume for database

This additional split, making the secondary system's BCV copy of the primary system's logs, does not need to be done while the database's writes are suspended.

The copy comes from the R2, which is split during the suspend, and it will not be affected by changes to the R1 that occur after the database's writes are resumed.

root> symrdf -g rmtlogs -synchronized verify
root> symmir -g rmtlogs -synched verify

Wait, if necessary:

root> symrdf -g rmtlogs -i 10 verify (if needed)
root> symmir -g rmtlogs -i 10 verify (if needed)
root> symmir -g rmtlogs -noprompt split
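The verify, suspend, split, and resume sequence from the general procedure lends itself to a small wrapper that keeps the suspend window short and always resumes writes. The following is a minimal sketch, not the procedure as we ran it: split_with_suspend is a hypothetical function name, the DB2 and SYMRDF variables are an assumption added to allow a dry run, it handles only groups split with symrdf (not the symmir -rdf split shown in our example), and it assumes one user is permitted to run both the DB2 and the SYMCLI commands.

```sh
# DB2 and SYMRDF name the commands to run. Setting them to "echo db2"
# and "echo symrdf" gives a dry run that only prints the sequence.
DB2=${DB2:-db2}
SYMRDF=${SYMRDF:-symrdf}

# split_with_suspend DBNAME GROUP...  (hypothetical helper)
# Suspend database writes, split every listed SRDF group, then resume.
split_with_suspend() {
    sw_db=$1
    shift
    # Refuse to start unless every group is already synchronized.
    for sw_g in "$@"; do
        $SYMRDF -g "$sw_g" -synchronized verify || return 1
    done
    $DB2 connect to "$sw_db" || return 1
    $DB2 set write suspend for database || return 1
    sw_rc=0
    for sw_g in "$@"; do
        $SYMRDF -g "$sw_g" -noprompt split || sw_rc=1
    done
    # Always resume, even if a split failed, so writers are not left blocked.
    $DB2 set write resume for database || sw_rc=1
    return $sw_rc
}

# Dry run:  DB2="echo db2" SYMRDF="echo symrdf"; split_with_suspend testdb data
```

Doing all verification before the suspend means the database is held only for the duration of the splits themselves.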

Import Volumes to the Same Location on a Different System

The volume manager setup, file system setup, and database setup have already been done on the R1 volumes on the primary system, and copied to the R2 volumes; the secondary system merely has to learn about it. Using the AIX volume manager, this step is simple.

General Procedure


VGNAME

Name of the volume group being imported.

DISK


RAWDEV

Name of a raw device (if any) in the volume.

Import volume group definition from the physical volumes:

root> importvg -yVGNAME DISK

If a raw device exists, it needs to have permissions set correctly (the contents of the file systems will have the permissions already set from the mirroring of STD devices on the primary system).

root> chown DBOWNER RAWDEV


root> importvg -ydrvg hdisk7
root> chown dba /dev/rrlv

Import Volumes to a Different Location on a Different System

The volume manager setup, file system setup, and database setup have already been done on the R1 volumes of the logs directory on each system, and copied through the R2 volumes to the BCV volumes. Each system merely has to learn about this configuration done by the other. However, each system already has its own logs volumes, so the copy of the other system’s logs must be accessed at a different mount point.

General Procedure


DEVPRE

Prefix to use for device names.

MPPRE

Prefix to use for the mountpoint.

DISK


Import the volume group definition from the physical volumes:

root> recreatevg -yDEVPRE -LMPPRE DISK


root> recreatevg -L/rmt $rmtlogsdk

Use the Primary Database Copy to Catalog and Initialize the Secondary

The primary system has set up the database. An SRDF copy of the data is available on the secondary system. We use it to import that database configuration and data onto the secondary system.


The database is already present on the imported disk volumes, but DB2 UDB on the secondary system does not know about the imported volumes yet. If your database application requires any auxiliary data or files that are stored outside of the database (such as control scripts), they should be set up at this time if this hasn’t been done yet.

General Procedure


DBNAME


PATH


Make the database known to the DB2 Instance:

root> mount PATH

dba> db2start
dba> db2 catalog database DBNAME on PATH
dba> db2inidb DBNAME as standby

Install database control scripts, auxiliary files, etc. as required to use the database properly on the secondary system.




root> mount /data

dba> db2start
dba> db2 catalog database testdb on /data
dba> db2inidb testdb as standby

Log Copy

A log copy standby system maintains its own copy of the database by receiving completed logs forwarded from the primary system and applying them (rolling them forward) on its copy of the database.

As described earlier, we configured the sample DB2 userexit C program that comes with IBM DB2 UDB to provide the initial stage of the log forwarding. The same program is used on each system. On the primary system, it is invoked when each log is switched out to start forwarding that log to the secondary system. On the secondary system, it is invoked to retrieve logs that have been forwarded so that they can be applied.

We also use a perl script on each system to manage the process of forwarding the logs, using the BCV copy of the logs on the secondary system to contain the forwarded logs. The script on the primary system splits the BCV whenever there are new logs to be processed. The script on the secondary system notices that the BCV has been split, copies any new logs from it, and reestablishes the BCV.

Note that you will need to install both scripts on both systems to allow for log transmission in both directions (from primary to secondary under normal circumstances, from secondary to primary during disaster recovery).

Automating Log Forwarding – Sending System

The perl script symm_send_logs is available by anonymous ftp from ftp://ftp.emc.com/pub/elab/DB2/symm_send_logs_latest. It is used (normally on the primary system) to transmit logs that have been archived. It normally runs continuously, processing logs whenever they are archived. Whenever a new log has appeared and the remote BCV of the logs file system is synchronized, it splits the BCV. If a new log appears before the BCV is again synchronized, it waits and does the split later. After splitting the BCV, the logs are moved to keep track of which ones have been sent.


dba> export USEREXIT_DIRECTORY=/logs
dba> export USEREXIT_ARCHIVE=/logs/archive
dba> export USEREXIT_GROUP=loclogs
dba> symm_send_logs testdb &

If you use the default settings shown above (which are the settings we have demonstrated in this scenario), or if you edit the script to set the default directory settings to your actual usage, you can skip setting the environment variables.

Automating Log Forwarding – Receiving System

The perl script symm_recv_logs is available by anonymous ftp from ftp://ftp.emc.com/pub/elab/DB2/symm_recv_logs_latest. It is used (normally on the secondary system) to receive logs that have been archived on the other (usually primary) system and process them. It normally runs continuously, processing logs whenever they are archived. Whenever the local BCV copy of the other system's logs has been split (by the symm_send_logs script on the other system) this script collects all the new logs, reestablishes the BCV, and has DB2 rollforward through the new logs.

Performed on the Secondary System:

dba> export USEREXIT_DIRECTORY=/logs
dba> export USEREXIT_REMOTE_DIRECTORY=/rmt/logs
dba> export USEREXIT_REMOTE_MOUNT=/rmt/logs
dba> export USEREXIT_ARCHIVE=/rmt/logs/archive
dba> export USEREXIT_RETRIEVE=/logs/retrieve
dba> export USEREXIT_APPLIED=/logs/applied
dba> export USEREXIT_GROUP=rmtlogs
dba> symm_recv_logs testdb &

If you use the default settings shown above (which are the settings we have demonstrated in this scenario), or if you edit the script to set the default directory settings to your actual usage, you can skip setting the environment variables.
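If your layout differs from ours in only one or two places, the variables can be given defaults in a small wrapper so that only the differences need to be set. The following is a minimal sketch: recv_defaults is a hypothetical function name, and deriving the archive, retrieve, and applied paths from the two base directories is our assumption for illustration, not documented behavior of symm_recv_logs.

```sh
# recv_defaults  (hypothetical helper)
# Give each variable the value used in this scenario unless the caller
# has already set it, then export the whole set.
recv_defaults() {
    : "${USEREXIT_DIRECTORY:=/logs}"
    : "${USEREXIT_REMOTE_MOUNT:=/rmt/logs}"
    : "${USEREXIT_REMOTE_DIRECTORY:=$USEREXIT_REMOTE_MOUNT}"
    # Derived paths follow the base directories (an assumption of this sketch).
    : "${USEREXIT_ARCHIVE:=$USEREXIT_REMOTE_DIRECTORY/archive}"
    : "${USEREXIT_RETRIEVE:=$USEREXIT_DIRECTORY/retrieve}"
    : "${USEREXIT_APPLIED:=$USEREXIT_DIRECTORY/applied}"
    : "${USEREXIT_GROUP:=rmtlogs}"
    export USEREXIT_DIRECTORY USEREXIT_REMOTE_MOUNT USEREXIT_REMOTE_DIRECTORY
    export USEREXIT_ARCHIVE USEREXIT_RETRIEVE USEREXIT_APPLIED USEREXIT_GROUP
}

# Usage:  USEREXIT_GROUP=mygroup; recv_defaults; symm_recv_logs testdb &
```

A variable that is already set is left alone, so an explicit export made before calling the function always wins.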

Reestablishing the BCV Mirrors

During the initialization stages, the SRDF and BCV mirrors have been split. The data SRDF volumes are left split – both systems update their own copy of the data. However, the BCV copy of each system's logs does need to be established.


root> symmir -g rmtlogs -noprompt establish
root> symmir -g loclogs -noprompt -rdf establish

And now the two systems are fully set up. Any database updates carried out on the primary system will be logged. Later the log will be sent to the secondary system and applied. Meanwhile, the active logs will be continuously kept synchronized, so that the secondary system has all of the information it needs to bring itself completely up to date in the event of a disaster.



Switch Over to Secondary System

This scenario does not work well for planned outages. There is significant overhead in preparing the primary system to be ready to switch back after an outage and only a tiny portion of the switching process is different for a planned outage, so we do not separate the two forms of switchover for this scenario.

Shut Down the Primary Database (if necessary)

Depending upon the reason for switching over, it may be necessary to shut down the primary database application. This is the case for a planned switchover, or if the disaster has not shut down the entire system. The secondary system is about to be set up as the owner of the database; if the primary system is able to continue acting on its copy of the database its changes will be lost even though the database is updated. Naturally, if the primary system has been crashed or otherwise disabled by the disaster, this step cannot and need not be done.

Performed on the primary system (to the extent possible):

Stop the database operations (precede this with any application specific shutdown procedures that might apply under the circumstances):

dba> db2stop force

Terminate the symm_send_logs daemon to prevent any unplanned log forwarding actions, and unmount the data volumes – they will be unavailable during the recovery stage:

root> touch /logs/archive/TESTDB/NODE0000
root> umount /data
root> sleep 15

Stop SRDF Updates from the Primary to the Secondary

This step is done for a similar reason to the previous one – it prevents any operations on the database by the primary system, even if, for example, it were rebooted and tried to continue "normal" operations.


root> symrdf -g data -noprompt split

Finish Any Outstanding Archive Log Processing

Before the secondary system can start running the database, it must be fully up to date. First, it must process archived logs, if any, that have not yet been handled. Conceivably, there could have been some logs queued to be sent at the time the disaster occurred. In addition, the symm_recv_logs daemon might still be busy processing the previous log (or logs).


Wait until any previous log processing has completed. When such processing is complete, the daemon will reestablish the BCV:

root> symmir -g rmtlogs -synched -i 10 verify



Now force one additional round of log processing in case there were any completed logs that had not yet been seen. Normally the primary system would split the BCV, but it has been disconnected either by the disaster or the previous step, so it is done manually:

root> symmir -g rmtlogs -noprompt split

The symm_recv_logs daemon will notice the split BCV and process it. (It will terminate quickly if there were no new logs provided.) It will finish by establishing the BCV again. We wait for that to finish, then we can terminate the daemon:

root> symmir -g rmtlogs -synched -i 10 verify
root> touch /logs/retrieve/TESTDB/NODE0000/stop_recv
root> sleep 15

Finally, any active logs must be copied over and rolled forward to catch the secondary database up with the transactions that were recently completed and still in progress, so that they can be rolled forward and backed out as appropriate:

root> symmir -g rmtlogs -noprompt split
root> mount /rmt/logs
dba> db2stop
root> cp /rmt/logs/logs/S*.LOG /logs/logs
root> chown dba /logs/logs/S*.LOG
root> umount /rmt/logs
root> symmir -g rmtlogs -noprompt establish
root> symrdf -g rmtlogs -noprompt establish
dba> db2start
dba> db2 rollforward db testdb to end of logs
dba> db2 rollforward db testdb complete
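The cp step above silently depends on the remote logs actually being mounted and containing log files. The following is a minimal sketch of a guarded copy: copy_active_logs is a hypothetical function name, and the check it adds is our suggestion rather than part of the procedure as performed.

```sh
# copy_active_logs SRC_DIR DST_DIR  (hypothetical helper)
# Copy every S*.LOG file from SRC_DIR into DST_DIR and print how many
# were copied. Fails without copying if SRC_DIR holds no log files.
copy_active_logs() {
    # Replace the arguments with: SRC_DIR, DST_DIR, then the matching files.
    set -- "$1" "$2" "$1"/S*.LOG
    # An unmatched pattern is left as literal text, so test the first match.
    [ -e "$3" ] || { echo "no S*.LOG files in $1" >&2; return 1; }
    ca_dst=$2
    shift 2
    cp "$@" "$ca_dst" || return 1
    echo "$#"
}

# Usage in this scenario:  copy_active_logs /rmt/logs/logs /logs/logs
```

A failure here usually means the BCV was not split or /rmt/logs was not mounted, which is better discovered before the rollforward is attempted.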

The secondary database is now fully caught up; the secondary system can begin to operate the database application. You may need to carry out additional tasks at this point, such as updating a DNS entry to refer to the secondary system, so that client systems will know to use it in place of the disabled primary system.

Reverse R1 and R2 Roles

The secondary system was able to accept and process logs from the primary system because it was initialized in standby mode. After the disaster, the primary system is not immediately able to go into standby mode and receive and process logs from the secondary (regardless of whether the primary system is operational). Before it can be used again, it must have its data copied from the secondary to get it synchronized with any changes that have been applied by the secondary system.

It will be possible to start this next step immediately as long as the disaster has not disabled the primary system's Symmetrix. If the disaster has disabled the Symmetrix, this step will have to wait until it has been made ready.


root> symrdf -g data suspend
root> symrdf -g data write_suspend r1
root> symrdf -g data swap -refresh r1

The secondary system is now considered to be the R1 of the SRDF connection, so it can access the volumes (and update them) even while the primary volumes are being updated.



This update may take a while: all tracks that have been changed on either system since the secondary was split (back in the initialization stage, or during the previous failover if there was one) must be copied back to the primary. Many of these tracks will actually have the same data, but unfortunately, since they were written independently by the two systems (during the original transaction and during forwarded log rollforward), the Symmetrix is not able to distinguish this. (There certainly could be differences; it depends upon whether the two systems' DB2 processes include any write-time-specific data in the database, or whether they are capable of choosing different disk positions to write changes to.)

Switch Back to the Primary System

To switch back to the primary system, it is necessary to stop the database on the secondary, fail back the data volumes so that it is again the primary that is able to change them, ensure that the logs on the primary are fully up to date, and then start the system up essentially the same way it was done during the initialization.


Make sure the data is fully copied to the primary system:

root> symrdf -g data -noprompt establish
root> symrdf -g data -synchronized -i 10 verify

Stop the database:

dba> # any application-specific shutdown procedures
dba> db2stop force

Any mechanisms (such as DNS entries) used to cause users of the database to use the secondary system should now be reverted to point them at the primary system.

Release the data and provide it back to the primary:

root> umount /data
root> symrdf -g data -noprompt split
root> symrdf -g data -noprompt failback
root> symrdf -g data -noprompt switch
root> symrdf -g data -noprompt establish


Copy the data and logs that have been changed while the secondary was operating the database:

root> mount /data
root> symmir -g rmtlogs -noprompt split
root> mount /rmt/logs
root> cp /rmt/logs/logs/S* /logs/logs
root> umount /rmt/logs
root> symmir -g rmtlogs -noprompt establish

The database can now be started up:

dba> db2start

Finish the setup for the secondary by providing it with its starting copy of the database:

root> symrdf -g data -synchronized verify
root> # if needed
root> symrdf -g data -i 10 verify
dba> db2 connect to testdb
dba> db2 set write suspend for database
root> symrdf -g data -noprompt split
dba> db2 set write resume for database
root> symm_send_logs testdb &


root> mount /data
dba> db2start
dba> db2inidb testdb as standby
root> symm_recv_logs &
