HIGH AVAILABILITY SETUP USING VERITAS CLUSTER SERVER AND NETAPP SYNCHRONOUS SNAPMIRROR

Jorge Costa, NetApp

June 2008

TABLE OF CONTENTS

1 PROBLEM DESCRIPTION

2 SOLUTION OVERVIEW
2.1 STORAGE
2.2 VCS SERVICE GROUPS
2.3 VCS MAIN.CF

3 HOW DOES IT WORK?
3.1 ON NODE EBSIPS2
3.2 ON NODE EBSIPS1
3.3 SG_TEST_PROPERTIES.FILE
3.4 SCRIPTS SUMMARY
3.5 VCS SSH FILER CREDENTIALS

4 INSTALLATION GUIDE
4.1 TASK 1: CREATE ACCOUNTS ON BOTH FILERS
4.2 TASK 2: SET SSH PUBLIC-KEY AUTHENTICATION
4.3 TASK 3: INSTALL FCP HUK ON BOTH NODES
4.4 TASK 4: EDIT SNAPMIRROR.CONF ON BOTH FILERS
4.5 TASK 5: SNAPMIRROR INITIALIZE FROM CDC-FAS-P01
4.6 TASK 6: INSTALL SCRIPTS
4.7 TASK 7: EDIT VCS PREONLINE TRIGGER ON NODE EBSIPS1
4.8 TASK 8: EDIT VCS PREONLINE TRIGGER ON NODE EBSIPS2
4.9 TASK 9: EDIT VCS POSTOFFLINE TRIGGER ON NODE EBSIPS2
4.10 TASK 10: EDIT PROPERTIES.FILE ON BOTH NODES
4.11 TASK 11: EDIT SG_TEST_PRE-ONLINE_FAILOVER_TO_EBSIPS2.SH ON EBSIPS2
4.12 TASK 12: EDIT SG_TEST_POST-OFFLINE_FAILBACK_TO_EBSIPS1.SH ON NODE EBSIPS2
4.13 TASK 13: EDIT PRE-ONLINE_FAILBACK_TO_EBSIPS1.SH ON EBSIPS1
4.14 TASK 14: EDIT SG_TEST_RESYNC_FROM_EBSIPS2_TO_EBSIPS1.SH ON NODE EBSIPS2

5 OPERATION MANUAL

6 LOGS AND TROUBLESHOOTING

7 APPENDIX 1
7.1 PROPERTIES.FILE
7.2 EBSIPS1 - PREONLINE
7.3 EBSIPS1 – PRE-ONLINE_FAILBACK_TO_EBSIPS1.SH
7.4 EBSIPS2 – RESYNC_FROM_EBSIPS2_TO_EBSIPS1.SH
7.5 EBSIPS2 – POSTOFFLINE
7.6 EBSIPS2 – PREONLINE
7.7 EBSIPS2 – POST-OFFLINE_FAILBACK_TO_EBSIPS1.SH
7.8 EBSIPS2 – PRE-ONLINE_FAILOVER_TO_EBSIPS2.SH


1 PROBLEM DESCRIPTION

ACME.CORPORATION has acquired a number of NetApp filers and is looking to replace part of its current XP storage infrastructure with the NetApp solution. ACME has, however, some specific requirements, and these required NetApp Professional Services and SDT-EMEA to become involved in order to provide a solution.

ACME.CORPORATION is primarily a Solaris Fibre Channel shop in which the NetApp filers are the Tier 2 storage platform. The filesystem of choice is VxFS on top of VxVM 5.0, with DMP as the multipath provider. For host and application resilience, ACME uses Symantec Veritas Cluster Server 5.0.

This document refers to two sites, production (DDC) and DR (CDC); however, data can be replicated in either direction and must be made easily available after a disaster or a controlled failover.

The requirements of replicating data in either direction and promoting the DR site online are met by SnapMirror; however, ACME wanted to go a step further, using VCS to control failover/failback and to automate as much as possible.

One of their primary concerns was to have fully synchronous replication between the production and DR sites. NetApp SnapMirror Sync addresses this first concern, especially now with the release of Data ONTAP 7.3, which brings substantial performance improvements to both SnapMirror Sync and Fibre Channel. Their other concern, which required the help of NetApp PS, was to integrate SnapMirror Sync failover/failback with Veritas Cluster Server, giving ACME one-button capabilities for controlled site failover, storage failover, and node failover.

This paper documents the NetApp SnapMirror integration with Veritas Cluster Server.


2 SOLUTION OVERVIEW

2.1 STORAGE

The diagram above shows the two sites where ACME operates: DDC (production) and CDC (DR). Each site contains a Solaris node that is a member of a two-node VCS cluster.

Each node has access only to its local SAN storage. The LUNs made available to one host are replicated across the WAN to the other storage array using SnapMirror Sync.

While this is not a real VxVM shared disk group across two nodes, SnapMirror Sync propagates every write done at the production site to DR, so that from a VxVM perspective the LUNs on the DR node ebsips2 appear to be exactly the same disks presented to the PRD node ebsips1, and part of the same VxVM cluster disk group.

The replicated LUNs on the DR site are offline and unavailable to the second host during normal operation. The SnapMirror replication has to be stopped in order to bring the DR LUNs online and provide write access to node ebsips2.
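For reference, the failover that the scripts in section 3 automate boils down to a handful of filer commands. The sketch below is illustrative only; it uses the volume, LUN, and igroup names defined later in the properties.file, and the real sequence, ordering, and error handling are implemented by the scripts in Appendix 1.

# Illustrative only - the preonline script on ebsips2 automates this sequence.
# The names (vl_EBSIPS1_2, the LUN path, igroup EBSIPS2) are the ones used in the
# example configuration later in this document.
ssh vcs_ebsips2@cdc-fas-p01 snapmirror quiesce CDC-FAS-P01:vl_EBSIPS1_2      # stop the synchronous replication
ssh vcs_ebsips2@cdc-fas-p01 snapmirror break CDC-FAS-P01:vl_EBSIPS1_2        # make the DR FlexVol writable
ssh vcs_ebsips2@cdc-fas-p01 lun online /vol/vl_EBSIPS1_2/qt_LUN001/LUN001    # bring the DR LUN online
ssh vcs_ebsips2@cdc-fas-p01 lun map /vol/vl_EBSIPS1_2/qt_LUN001/LUN001 EBSIPS2 1   # present it to ebsips2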


2.2 VCS SERVICE GROUPS

Each application configured in the cluster requires at least one VCS Service Group with one dedicated NetApp FlexVol and one dedicated LUN. This Service Group is available on a single node at a single moment in time. It cannot be online on two nodes simultaneously.

For each FlexVol configured in a VCS Service Group there is a SnapMirror relationship to another FlexVol on the DR storage array.

The diagram above shows how VCS is integrated with the NetApp scripts.


2.3 VCS MAIN.CF

In the ACME.CORPORATION setup there is a VCS ServiceGroup named sg_test containing two resources: the VxVM DiskGroup resource dgapp_ebsp1 and the VxFS mount point resource mt_opt_VRTSnbu.

The LUNs that form DiskGroup dgapp_ebsp1 are the LUNs presented by the NetApp filer.

The VCS main.cf is listed below:

include "types.cf" cluster ebsipc ( UserNames = { admin = hKKmKFkHKeKIjHIqHJ, unix_adm = aLLnLGlILfLJkIJrIK } Administrators = { admin, unix_adm } ) system ebsips1 ( ) system ebsips2 ( ) group sg_test ( SystemList = { ebsips1 = 0, ebsips2 = 1 } AutoStart = 0 AutoFailOver = 0 AutoStartList = { ebsips1, ebsips2 } PreOnline = 1 OnlineRetryLimit = 3 OnlineRetryInterval = 120 ) DiskGroup dgapp_ebsp1 ( Critical = 1 DiskGroup = dgapp_ebsp1 ) Mount mt_opt_VRTSnbu ( Critical = 1 MountPoint = "/opt/VRTSnbu" BlockDevice = "/dev/vx/dsk/dgapp_eb sp1/v_opt_VRTSnbu" FSType = vxfs FsckOpt = "-y" ) mt_opt_VRTSnbu requires dgapp_ebsp1 // resource dependency tree // // group sg_test // { // Mount mt_opt_VRTSnbu // { // DiskGroup dgapp_ebsp1 // } // }


3 HOW DOES IT WORK?

The integration with Veritas Cluster Server is done through a set of customized scripts (see section 2.2) which are run by VCS while onlining or offlining a Service Group. These scripts are called by the VCS triggers preonline and postoffline during normal operation. Their function is to control LUN access and to invert the SnapMirror replication flow.

3.1 ON NODE EBSIPS2:

The preonline trigger executes sg_test_PRE-ONLINE_failover_to_ebsips2.sh.
The postoffline trigger executes sg_test_POST-OFFLINE_failback_to_ebsips1.sh.

These scripts are executed by VCS once we modify the standard VCS triggers and enable the Service Group attribute PreOnline:

hagrp -modify sg_test PreOnline 1

The change above forces the preonline trigger to be run every time the Service Group sg_test is brought online. The standard VCS preonline trigger script is modified so that it executes our sg_test_PRE-ONLINE_failover_to_ebsips2.sh every time the service group sg_test is brought online on ebsips2. This is done by modifying the trigger /opt/VRTSvcs/bin/triggers/preonline as listed below:

# put your code here...
if ( $ARGV[0] eq "ebsips2" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_PRE-ONLINE_failover_to_ebsips2.sh");
}

The script sg_test_POST-OFFLINE_failback_to_ebsips1.sh is executed by modifying the standard VCS postoffline trigger, adding the following code:

# put your code here...
if ( $ARGV[0] eq "ebsips2" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_POST-OFFLINE_failback_to_ebsips1.sh");
}

The examples above are only relevant to node ebsips2.


3.2 ON NODE EBSIPS1:

The preonline trigger executes sg_test_PRE-ONLINE_failback_to_ebsips1.sh. This script expects sg_test_Resync_from_ebsips2_to_ebsips1.sh (run on node ebsips2, see section 3.4) to have completed beforehand, so that the DR -> PRD resync is already in place.

The script sg_test_PRE-ONLINE_failback_to_ebsips1.sh is run by VCS when the ServiceGroup sg_test is brought online on node ebsips1. The only required actions are to have the PreOnline attribute set to 1 and to modify the standard preonline trigger to call our sg_test_PRE-ONLINE_failback_to_ebsips1.sh. That change, made in /opt/VRTSvcs/bin/triggers/preonline, is shown below:

# put your code here...
if ( $ARGV[0] eq "ebsips1" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_PRE-ONLINE_failback_to_ebsips1.sh");
}

3.3 SG_TEST_PROPERTIES.FILE:

The sg_test_properties.file file, located in /home/netapp/scripts, contains variables used by all the other scripts. These variables include the node names, filer hostnames, production and DR volumes, LUN paths, and igroup names. All the scripts mentioned in section 3 must be customized during installation or reconfiguration of VCS ServiceGroups.


3.4 SCRIPTS SUMMARY

The table below contains a short summary of all the scripts and their locations on each node:

node        script                                                             run by
ebsips2     /home/netapp/scripts/sg_test_PRE-ONLINE_failover_to_ebsips2.sh     VCS PreOnline trigger
ebsips2     /home/netapp/scripts/sg_test_POST-OFFLINE_failback_to_ebsips1.sh   VCS PostOffline trigger
ebsips1     /home/netapp/scripts/sg_test_PRE-ONLINE_failback_to_ebsips1.sh     VCS PreOnline trigger
ebsips2     /home/netapp/scripts/sg_test_Resync_from_ebsips2_to_ebsips1.sh     VCS PreOnline trigger
both nodes  /home/netapp/scripts/sg_test_properties.file                       sourced by every script
both nodes  /opt/VRTSvcs/bin/triggers/preonline                                VCS
both nodes  /opt/VRTSvcs/bin/triggers/postoffline                              VCS

3.5 VCS SSH FILER CREDENTIALS

In order to execute filer commands from nodes ebsips1 and ebsips2, two users, vcs_ebsips1 and vcs_ebsips2, have been defined on filers ddc-fas-p03 and cdc-fas-p01. These users enable remote execution of commands using ssh public-key authentication. While these users have limited access, as they are part of a restricted role, their level of access still allows the root user of these two hosts to perform a large set of operations, such as creating, destroying, offlining, and onlining LUNs and volumes on each filer. This may be a security concern for ACME.CORPORATION; however, ONTAP does not provide a mechanism to limit the actions of a specific user to a set of volumes or LUNs. Currently ONTAP only allows us to limit user actions to a set of commands.


4 INSTALLATION GUIDE

This section contains the required steps to deploy or customize this solution on a new pair of nodes, or to add additional ServiceGroups to an existing VCS cluster.

There are several scripts that must be modified during the initial setup; those changes are related to FlexVol, LUN, and ServiceGroup names and are documented in the following pages.

The scripts as they have been deployed on ebsips1 and ebsips2 are prepared to work with a single FlexVol, two LUNs, and the ServiceGroup sg_test. In order to use different LUNs, or just a different ServiceGroup name, these scripts must be modified.

4.1 TASK 1: CREATE ACCOUNTS ON BOTH FILERS

Login into both NetApp Controllers and create the following objects:

Create a role for VCS, named r_vcs, with the following privileges:

Name: r_vcs
Info: VCS role
Allowed Capabilities: login-ssh,cli-lun,cli-vol,cli-snapmirror

Create a group for VCS, named g_vcs, containing role r_vcs:

Name: g_vcs
Info:
Rid: 131078
Roles: r_vcs

Create users vcs_ebsips1 and vcs_ebsips2 on both controllers and add these users to group g_vcs:

Name: vcs_ebsips1
Info: VCS user ebsips1
Rid: 131079
Groups: g_vcs

Name: vcs_ebsips2
Info: VCS user ebsips2
Rid: 131080
Groups: g_vcs
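On Data ONTAP 7.x these objects can be created with the useradmin command. The lines below are a suggested sketch only; the exact option syntax may differ slightly between ONTAP releases, and each useradmin user add command prompts for a password.

ddc-fas-p03> useradmin role add r_vcs -c "VCS role" -a login-ssh,cli-lun,cli-vol,cli-snapmirror
ddc-fas-p03> useradmin group add g_vcs -r r_vcs
ddc-fas-p03> useradmin user add vcs_ebsips1 -c "VCS user ebsips1" -g g_vcs
ddc-fas-p03> useradmin user add vcs_ebsips2 -c "VCS user ebsips2" -g g_vcs

Repeat the same commands on cdc-fas-p01 so that both controllers carry the same role, group, and users.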


4.2 TASK 2: SET SSH PUBLIC-KEY AUTHENTICATION

From host ebsips1, set up ssh public keys for user vcs_ebsips1 on both filers. From host ebsips2, set up ssh public keys for user vcs_ebsips2 on both filers.

The procedure for setting up ssh public-key authentication is available in the ONTAP 7.2 System Administration Guide -> Chapter 9: Using secureadmin -> Managing SSH for secureadmin -> Setting up public-key based authentication. Refer to the following link (page 191) for more information:

http://now.netapp.com/NOW/knowledge/docs/ontap/rel724_vs/pdfs/ontap/sysadmin.pdf

From each host, try to connect to the filers using ssh:

[root@ebsips1] # ssh vcs_ebsips1@ddc-fas-p03 snapmirror status
[root@ebsips1] # ssh vcs_ebsips1@cdc-fas-p01 snapmirror status
[root@ebsips2] # ssh vcs_ebsips2@ddc-fas-p03 snapmirror status
[root@ebsips2] # ssh vcs_ebsips2@cdc-fas-p01 snapmirror status
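If the commands above prompt for a password, public-key authentication is not yet in place. A minimal sketch of the setup from ebsips1 follows; it assumes an RSA key pair for root and assumes the filer root volume is mounted over NFS so the public key can be appended to the authorized_keys path described in the System Administration Guide above. The mount points and key type are assumptions, not requirements.

# Generate a key pair for root on ebsips1 (accept the defaults).
ssh-keygen -t rsa

# Append the public key to the vcs_ebsips1 account on each filer.
# /mnt/ddc-fas-p03_vol0 and /mnt/cdc-fas-p01_vol0 are assumed NFS mounts of the filer root volumes.
mkdir -p /mnt/ddc-fas-p03_vol0/etc/sshd/vcs_ebsips1/.ssh /mnt/cdc-fas-p01_vol0/etc/sshd/vcs_ebsips1/.ssh
cat /.ssh/id_rsa.pub >> /mnt/ddc-fas-p03_vol0/etc/sshd/vcs_ebsips1/.ssh/authorized_keys
cat /.ssh/id_rsa.pub >> /mnt/cdc-fas-p01_vol0/etc/sshd/vcs_ebsips1/.ssh/authorized_keys

Repeat the equivalent steps from ebsips2 for user vcs_ebsips2.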

4.3 TASK 3: INSTALL FCP HUK ON BOTH NODES

Login to the NOW website and download the FCP HUK for Solaris:

http://now.netapp.com/NOW/download/software/sanhost_sol/4.2/

Install the software using pkgadd on both nodes:

[root@ebsips1] # pkgadd -d NTAPSANTool.pkg
[root@ebsips2] # pkgadd -d NTAPSANTool.pkg

Complete instructions are available on the FCP HUK installation guide:

http://now.netapp.com/NOW/knowledge/docs/hba/sun/relhu42/pdfs/vinstall.pdf
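A quick way to confirm the kit is in place on each node (a suggested check; it assumes the installed package instance is named NTAPSANTool, matching the .pkg file):

# Verify the package is registered and the sanlun utility responds.
pkginfo -l NTAPSANTool
/opt/NTAP/SANToolkit/bin/sanlun lun show -p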


4.4 TASK 4: EDIT SNAPMIRROR.CONF ON BOTH FILERS

Both filers should have an entry in the /etc/snapmirror.conf for every volume to be replicated across sites.

The example below, taken from CDC-FAS-P01, shows that volume vl_EBSIPS1_2 on filer DDC-FAS-P03 is replicated to a volume with the same name on filer CDC-FAS-P01.

CDC-FAS-P01> rdfile /etc/snapmirror.conf
DDC-FAS-P03:vl_EBSIPS1_2 CDC-FAS-P01:vl_EBSIPS1_2 - sync

As we need to be able to reverse replication when the ServiceGroup is active on ebsips2, an equivalent entry is required on filer DDC-FAS-P03.

This enables us to replicate volume vl_EBSIPS1_2 from CDC-FAS-P01 to DDC-FAS-P03.

ddc-fas-p03> rdfile /etc/snapmirror.conf
CDC-FAS-P01:vl_EBSIPS1_2 DDC-FAS-P03:vl_EBSIPS1_2 - sync

If additional FlexVols are to be configured on the VCS ServiceGroup, then additional entries for those FlexVols must be added to the /etc/snapmirror.conf files.
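The entries can be added either by editing /etc/snapmirror.conf over an NFS/CIFS mount of the filer root volume or directly from the filer console with wrfile -a, as sketched below. This is illustrative only; double-check each line before appending, since wrfile -a does not validate what it writes.

CDC-FAS-P01> wrfile -a /etc/snapmirror.conf DDC-FAS-P03:vl_EBSIPS1_2 CDC-FAS-P01:vl_EBSIPS1_2 - sync
ddc-fas-p03> wrfile -a /etc/snapmirror.conf CDC-FAS-P01:vl_EBSIPS1_2 DDC-FAS-P03:vl_EBSIPS1_2 - sync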

4.5 TASK 5: SNAPMIRROR INITIALIZE FROM CDC-FAS-P01

Connect to filer CDC-FAS-P01 and initialize the SnapMirror relationship from DDC-FAS-P03 to CDC-FAS-P01.

CDC-FAS-P01> snapmirror initialize -S DDC-FAS-P03:vl_EBSIPS1_2 -w CDC-FAS-P01:vl_EBSIPS1_2

Repeat this task for every FlexVol to be SnapMirrored across sites.
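A suggested check after each initialize, run from either host over the ssh accounts created in Task 1: the baseline transfer must reach State "Snapmirrored" and, for synchronous mode, Status "In-sync" before continuing.

# Watch the baseline transfer from the DR controller; repeat until it reports In-sync.
ssh vcs_ebsips1@cdc-fas-p01 snapmirror status vl_EBSIPS1_2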


4.6 TASK 6: INSTALL SCRIPTS

Create the directories /home/netapp/scripts and /home/netapp/logs on both nodes. Then copy the scripts listed in Appendix 1 to the correct directories.
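A minimal sketch of the deployment steps on one node is shown below; the scp/cp source path is an assumption, so adjust it to wherever the Appendix 1 scripts are staged.

# Create the script and log directories and deploy the scripts.
mkdir -p /home/netapp/scripts /home/netapp/logs
cp /var/tmp/netapp-scripts/sg_test_* /home/netapp/scripts/      # assumed staging area for the Appendix 1 files
chmod 755 /home/netapp/scripts/*.sh                             # the scripts must be executable by root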

The table shows where each script should be deployed.

node        script                                                             run by
ebsips2     /home/netapp/scripts/sg_test_PRE-ONLINE_failover_to_ebsips2.sh     VCS PreOnline trigger
ebsips2     /home/netapp/scripts/sg_test_POST-OFFLINE_failback_to_ebsips1.sh   VCS PostOffline trigger
ebsips1     /home/netapp/scripts/sg_test_PRE-ONLINE_failback_to_ebsips1.sh     VCS PreOnline trigger
ebsips2     /home/netapp/scripts/sg_test_Resync_from_ebsips2_to_ebsips1.sh     VCS PreOnline trigger
both nodes  /home/netapp/scripts/sg_test_properties.file                       sourced by every script
both nodes  /opt/VRTSvcs/bin/triggers/preonline                                VCS
both nodes  /opt/VRTSvcs/bin/triggers/postoffline                              VCS

4.7 TASK 7: EDIT VCS PREONLINE TRIGGER ON NODE EBSIPS1

Each ServiceGroup running on hosts ebsips1, ebsips2 requires a set of PRE-ONLINE, POST-OFFLINE and properties.file scripts.

The VCS preonline trigger contains an entry such that it executes a script when a specific ServiceGroup is being brought online. It is therefore possible to manage several ServiceGroups by creating a new copy of the PRE-ONLINE_failback_to_ebsips1.sh script, prefixed with the name of that VCS ServiceGroup, and adding an entry to the VCS preonline trigger.

Example for different ServiceGroups:

sg_test_PRE-ONLINE_failback_to_ebsips1.sh

sg_billing_PRE-ONLINE_failback_to_ebsips1.sh

sg_apps_PRE-ONLINE_failback_to_ebsips1.sh


The required changes on the VCS /opt/VRTSvcs/bin/triggers/preonline trigger:

# put your code here...
if ( $ARGV[0] eq "ebsips1" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_PRE-ONLINE_failback_to_ebsips1.sh");
}

4.8 TASK 8: EDIT VCS PREONLINE TRIGGER ON NODE EBSIPS2

Customize this trigger in the same way as explained for Task 7. Note that on node ebsips2 the preonline trigger calls the failover script sg_test_PRE-ONLINE_failover_to_ebsips2.sh (see section 3.1 and Appendix 7.6).

/opt/VRTSvcs/bin/triggers/preonline:

# put your code here...
if ( $ARGV[0] eq "ebsips2" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_PRE-ONLINE_failover_to_ebsips2.sh");
}

4.9 TASK 9: EDIT VCS POSTOFFLINE TRIGGER ON NODE EBSIPS2

Customize this trigger in the same way as explained for Task 7.

/opt/VRTSvcs/bin/triggers/postoffline:

# put your code here...
if ( $ARGV[0] eq "ebsips2" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_POST-OFFLINE_failback_to_ebsips1.sh");
}


4.10 TASK 10: EDIT PROPERTIES.FILE ON BOTH NODES

When running multiple ServiceGroups, each SG must have a dedicated properties.file on both nodes.

The best method is to rename the properties.file to <SG_NAME>_properties.file for each ServiceGroup and to source this file from the PRE-ONLINE/POST-OFFLINE scripts.

Open the <SG_NAME>_properties.file and customize as needed.

In this file you should specify every object relevant to the VCS cluster, for example:

nodePRD is the node at the production site: ebsips1
nodeDR is the node at the DR site: ebsips2
PRD_VOLUME_0 is the FlexVol at the PRD site: vl_EBSIPS1_2
DR_VOLUME_0 is the FlexVol at the DR site: vl_EBSIPS1_2
PRD_LUN_0 is the first LUN available to the PRD node: /vol/vl_EBSIPS1_2/qt_LUN001/LUN001
DR_LUN_0 is the first LUN available to the DR node: /vol/vl_EBSIPS1_2/qt_LUN001/LUN001
PRD_LUN_0_ID is the LUN ID of the first LUN on the PRD site: 1
DR_LUN_0_ID is the LUN ID of the first LUN on the DR site: 1
PRD_IGROUP is the igroup name of the PRD host
DR_IGROUP is the igroup name of the DR host

#!/usr/bin/bash
export nodePRD=ebsips1
export nodeDR=ebsips2
export filerPRD=ddc-fas-p03
export filerDR=CDC-FAS-P01

# define the volume names on the PRD and DR filers
export PRD_VOLUME_0="vl_EBSIPS1_2"
#export PRD_VOLUME_1=""
#export PRD_VOLUME_2=""
#export PRD_VOLUME_3=""
#export PRD_VOLUME_4=""
export DR_VOLUME_0="vl_EBSIPS1_2"
#export DR_VOLUME_1=""
#export DR_VOLUME_2=""
#export DR_VOLUME_3=""
#export DR_VOLUME_4=""

# define the LUNs available on the PRD host from the PRD filer
export PRD_LUN_0="/vol/vl_EBSIPS1_2/qt_LUN001/LUN001"
export PRD_LUN_0_ID=1
export PRD_LUN_1="/vol/vl_EBSIPS1_2/qt_LUN002/LUN002"
export PRD_LUN_1_ID=2
#export PRD_LUN_2=""
#export PRD_LUN_2_ID=
#export PRD_LUN_3=""
#export PRD_LUN_3_ID=

# define the LUNs to be made available on the DR host from the DR filer
export DR_LUN_0="/vol/vl_EBSIPS1_2/qt_LUN001/LUN001"
export DR_LUN_0_ID=1
export DR_LUN_1="/vol/vl_EBSIPS1_2/qt_LUN002/LUN002"
export DR_LUN_1_ID=2
#export DR_LUN_2=""
#export DR_LUN_2_ID=

# What are the igroups on the DR and the PRD filers?
export PRD_IGROUP=EBSIPS1
export DR_IGROUP=EBSIPS2
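A quick sanity check after editing (a suggestion only): source the file in a shell and echo the variables to confirm there are no typos before VCS ever calls the scripts.

# Source the properties file and print the key settings.
. /home/netapp/scripts/sg_test_properties.file
echo "PRD: $filerPRD:$PRD_VOLUME_0 LUN $PRD_LUN_0 (id $PRD_LUN_0_ID) igroup $PRD_IGROUP"
echo "DR:  $filerDR:$DR_VOLUME_0 LUN $DR_LUN_0 (id $DR_LUN_0_ID) igroup $DR_IGROUP"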


4.11 TASK 11: EDIT SG_TEST_PRE-ONLINE_FAILOVER_TO_EBSIPS2.SH ON EBSIPS2

If you have multiple ServiceGroups, each needs a dedicated PRE-ONLINE_failover_to_ebsips2.sh; we recommend renaming the script to <SG_NAME>_PRE-ONLINE_failover_to_ebsips2.sh.

Open /home/netapp/scripts/<SG_NAME>_PRE-ONLINE_failover_to_ebsips2.sh and uncomment the required entries for each LUN and FlexVol in use by the Service Group.

Then modify the first entry for the properties.file so that it reflects the name of the ServiceGroup, <SG_NAME>_properties.file.

.
.
.
. /home/netapp/scripts/sg_test_properties.file
.
.
.
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
offlineLUN $filerPRD $PRD_LUN_0 0 | tee -a $LOGFILE
offlineLUN $filerPRD $PRD_LUN_1 0 | tee -a $LOGFILE
#offlineLUN $filerPRD $PRD_LUN_2 0 | tee -a $LOGFILE
#offlineLUN $filerPRD $PRD_LUN_3 0 | tee -a $LOGFILE
.
.
.
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE flexVols IN USE BY $nodeDR
smBreak $filerDR $DR_VOLUME_0
#smBreak $filerDR $DR_VOLUME_1
#smBreak $filerDR $DR_VOLUME_2
#smBreak $filerDR $DR_VOLUME_3
.
.
.
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
unmapLUN $filerDR $DR_LUN_0 $DR_IGROUP
unmapLUN $filerDR $DR_LUN_1 $DR_IGROUP
#unmapLUN $filerDR $DR_LUN_2 $DR_IGROUP
#unmapLUN $filerDR $DR_LUN_3 $DR_IGROUP

# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
# online luns
onlineLUN $filerDR $DR_LUN_0 1 | tee -a $LOGFILE
onlineLUN $filerDR $DR_LUN_1 1 | tee -a $LOGFILE
#onlineLUN $filerDR $DR_LUN_2 1 | tee -a $LOGFILE
#onlineLUN $filerDR $DR_LUN_3 1 | tee -a $LOGFILE

# map luns
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
mapLUN $filerDR $DR_LUN_0 $DR_IGROUP $DR_LUN_0_ID
mapLUN $filerDR $DR_LUN_1 $DR_IGROUP $DR_LUN_1_ID
#mapLUN $filerDR $DR_LUN_2 $DR_IGROUP $DR_LUN_1_ID
#mapLUN $filerDR $DR_LUN_3 $DR_IGROUP $DR_LUN_2_ID
.
.
.


4.12 TASK 12: EDIT SG_TEST_POST-OFFLINE_FAILBACK_TO_EBSIPS1.SH ON NODE EBSIPS2

If you have multiple ServiceGroups, each needs a dedicated POST-OFFLINE_failback_to_ebsips1.sh; we recommend renaming this script to <SG_NAME>_POST-OFFLINE_failback_to_ebsips1.sh.

Open /home/netapp/scripts/<SG_NAME>_POST-OFFLINE_failback_to_ebsips1.sh and uncomment the required entries for each LUN and FlexVol in use by the Service Group.

Then modify the entry for the properties.file so that it reflects the name of the ServiceGroup, <SG_NAME>_properties.file.

.
.
.
. /home/netapp/scripts/sg_test_properties.file
.
.
.
# offline luns on node nodeDR,
# we don't want this node to modify any data while we are failing back to PRD
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
offlineLUN $filerDR $DR_LUN_0 0 | tee -a $LOGFILE
offlineLUN $filerDR $DR_LUN_1 0 | tee -a $LOGFILE
#offlineLUN $filerDR $DR_LUN_2 0 | tee -a $LOGFILE
#offlineLUN $filerDR $DR_LUN_3 0 | tee -a $LOGFILE
.
.
.


4.13 TASK 13: EDIT PRE-ONLINE_FAILBACK_TO_EBSIPS1.SH ON EBSIPS1

If you have multiple ServiceGroups, each needs a dedicated PRE-ONLINE_failback_to_ebsips1.sh; we recommend renaming the script to <SG_NAME>_PRE-ONLINE_failback_to_ebsips1.sh.

Open /home/netapp/scripts/<SG_NAME>_PRE-ONLINE_failback_to_ebsips1.sh and uncomment the required entries for each LUN and FlexVol in use by the Service Group.

Then modify the entry for the properties.file so that it reflects the name of the ServiceGroup, <SG_NAME>_properties.file.

.
.
.
. /home/netapp/scripts/sg_test_properties.file
.
.
.
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE VOLUMES IN USE BY $nodePRD
smBreak $filerPRD $PRD_VOLUME_0
#smBreak $filerPRD $PRD_VOLUME_1
#smBreak $filerPRD $PRD_VOLUME_2
#smBreak $filerPRD $PRD_VOLUME_3
.
.
.
# Invert the replication, we were doing DR->PRD, and now we want PRD->DR
invertSync $filerPRD $PRD_VOLUME_0 $filerDR $DR_VOLUME_0
#invertSync $filerPRD $PRD_VOLUME_1 $filerDR $DR_VOLUME_1
#invertSync $filerPRD $PRD_VOLUME_2 $filerDR $DR_VOLUME_2
.
.
.
# unmap the luns to avoid errors during onlining
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
unmapLUN $filerPRD $DR_LUN_0 $PRD_IGROUP
unmapLUN $filerPRD $DR_LUN_1 $PRD_IGROUP
#unmapLUN $filerPRD $DR_LUN_2 $PRD_IGROUP
#unmapLUN $filerPRD $DR_LUN_3 $PRD_IGROUP

# online and map the luns at the PRD site
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
onlineLUN $filerPRD $PRD_LUN_0 1 | tee -a $LOGFILE
onlineLUN $filerPRD $PRD_LUN_1 1 | tee -a $LOGFILE
#onlineLUN $filerPRD $PRD_LUN_2 1 | tee -a $LOGFILE
#onlineLUN $filerPRD $PRD_LUN_3 1 | tee -a $LOGFILE

# map luns
# -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
mapLUN $filerPRD $PRD_LUN_0 $PRD_IGROUP $PRD_LUN_0_ID
mapLUN $filerPRD $PRD_LUN_1 $PRD_IGROUP $PRD_LUN_1_ID
#mapLUN $filerPRD $PRD_LUN_2 $PRD_IGROUP $PRD_LUN_1_ID
#mapLUN $filerPRD $PRD_LUN_3 $PRD_IGROUP $PRD_LUN_2_ID
.
.
.


4.14 TASK 14: EDIT SG_TEST_RESYNC_FROM_EBSIPS2_TO_EBSIPS1.SH ON NODE EBSIPS2

If you have multiple ServiceGroups, each needs a dedicated Resync_from_ebsips2_to_ebsips1.sh; we recommend renaming the script to <SG_NAME>_Resync_from_ebsips2_to_ebsips1.sh.

Open /home/netapp/scripts/<SG_NAME>_Resync_from_ebsips2_to_ebsips1.sh and include entries for each FlexVol in use by the Service Group.

.
.
.
ssh vcs_$nodeDR@$filerPRD snapmirror resync -f -S $filerDR:$DR_VOLUME_0 -w $filerPRD:$PRD_VOLUME_0 | tee -a $LOGFILE
.
.
.
while [ $( ssh vcs_$nodeDR@$filerPRD snapmirror status | grep -i ^$filerDR:$DR_VOLUME_0 | grep -i $filerPRD:$PRD_VOLUME_0 | grep -c "In-sync" | tee -a $LOGFILE ) -lt 1 ]
.
.
.
ssh vcs_$nodeDR@$filerPRD snapmirror status | grep -i ^$filerDR:$DR_VOLUME_0 | grep -i $filerPRD:$PRD_VOLUME_0 | grep "In-sync" | tee -a $LOGFILE
.
.
.


Task 15: Run scripts and check results

1) LUNs must be available on ebsips1

2) Confirm that SnapMirror has been initialized and is replicating towards the DR filer CDC-FAS-P01

3) Run sg_test_PRE-ONLINE_failover_to_ebsips2.sh from host ebsips2

4) Run sg_test_POST_OFFLINE_failback_to_ebsips1.sh from host ebsips2

5) Run sg_test_Resync_from_ebsips2_to_ebsips1.sh from host ebsips2

6) Run sg_test_PRE-ONLINE_failback_to_ebsips1.sh from host ebsips1

Task 16: Create the VCS ServiceGroup and resources

group sg_test (
    SystemList = { ebsips1 = 0, ebsips2 = 1 }
    AutoStart = 0
    AutoFailOver = 0
    AutoStartList = { ebsips1, ebsips2 }
    PreOnline = 1
    OnlineRetryLimit = 3
    OnlineRetryInterval = 120
    )

    DiskGroup dgapp_ebsp1 (
        Critical = 1
        DiskGroup = dgapp_ebsp1
        )

    Mount mt_opt_VRTSnbu (
        Critical = 1
        MountPoint = "/opt/VRTSnbu"
        BlockDevice = "/dev/vx/dsk/dgapp_ebsp1/v_opt_VRTSnbu"
        FSType = vxfs
        FsckOpt = "-y"
        )

    mt_opt_VRTSnbu requires dgapp_ebsp1
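The group and its resources can be added either by editing main.cf as above or online with the ha commands. The sequence below is a sketch of the command-line route; the attribute values mirror the snippet above, and values that start with a dash (such as FsckOpt) need the % escape on the VCS command line.

haconf -makerw                                              # open the configuration for writes
hagrp -add sg_test
hagrp -modify sg_test SystemList ebsips1 0 ebsips2 1
hagrp -modify sg_test AutoStartList ebsips1 ebsips2
hagrp -modify sg_test AutoStart 0
hagrp -modify sg_test AutoFailOver 0
hagrp -modify sg_test PreOnline 1
hagrp -modify sg_test OnlineRetryLimit 3
hagrp -modify sg_test OnlineRetryInterval 120
hares -add dgapp_ebsp1 DiskGroup sg_test
hares -modify dgapp_ebsp1 DiskGroup dgapp_ebsp1
hares -modify dgapp_ebsp1 Critical 1
hares -modify dgapp_ebsp1 Enabled 1
hares -add mt_opt_VRTSnbu Mount sg_test
hares -modify mt_opt_VRTSnbu MountPoint "/opt/VRTSnbu"
hares -modify mt_opt_VRTSnbu BlockDevice "/dev/vx/dsk/dgapp_ebsp1/v_opt_VRTSnbu"
hares -modify mt_opt_VRTSnbu FSType vxfs
hares -modify mt_opt_VRTSnbu FsckOpt %-y
hares -modify mt_opt_VRTSnbu Critical 1
hares -modify mt_opt_VRTSnbu Enabled 1
hares -link mt_opt_VRTSnbu dgapp_ebsp1                      # mt_opt_VRTSnbu requires dgapp_ebsp1
haconf -dump -makero                                        # save and close the configuration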

Task 17: Test VCS integration (a command sketch of this sequence follows the list)

1) Bring the SG online on ebsips1

2) Offline the SG on ebsips1

3) Confirm that SnapMirror is In-sync

4) Online the SG on ebsips2

5) Offline the SG on ebsips2

6) Confirm that SnapMirror is In-sync

7) Online the SG on ebsips1
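A command-level sketch of the test sequence above, assuming the sg_test group and the ssh accounts from the earlier tasks; the snapmirror status checks are the same ones described in section 5.

# 1-3) online on ebsips1, then offline and wait for SnapMirror to return to In-sync
hagrp -online sg_test -sys ebsips1
hagrp -offline sg_test -sys ebsips1
ssh vcs_ebsips1@cdc-fas-p01 snapmirror status           # expect In-sync before continuing

# 4-6) online on ebsips2, then offline and check the reversed relationship
hagrp -online sg_test -sys ebsips2
hagrp -offline sg_test -sys ebsips2
ssh vcs_ebsips2@ddc-fas-p03 snapmirror status           # expect In-sync before continuing

# 7) finally bring the group back online on ebsips1
hagrp -online sg_test -sys ebsips1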


5 OPERATION MANUAL

Before bringing the ServiceGroup online on any node, we must check that the SnapMirror relationships are set and In-sync.

SnapMirror can take a few minutes to propagate changes after a resync; these changes must be fully propagated before we bring the ServiceGroup online on any node. Failure to do so may result in the loss of data not yet propagated to the destination FlexVol.

We can verify the SnapMirror status by running the following commands:

From ebsips1:

ebsips1> ssh vcs_ebsips1@cdc-fas-p01 snapmirror status
SnapMirror is on.
Source                    Destination               State         Lag       Status
ddc-fas-p03:vl_EBSIPS1_2  CDC-FAS-P01:vl_EBSIPS1_2  Snapmirrored  -         In-sync
CDC-FAS-P01:vl_EBSIPS1_2  ddc-fas-p03:vl_EBSIPS1_2  Source        01:06:56  Idle

From ebsips2:

ebsips2> ssh vcs_ebsips2@ddc-fas-p03 snapmirror status
SnapMirror is on.
Source                    Destination               State         Lag       Status
ddc-fas-p03:vl_EBSIPS1_2  CDC-FAS-P01:vl_EBSIPS1_2  Snapmirrored  -         In-sync
CDC-FAS-P01:vl_EBSIPS1_2  ddc-fas-p03:vl_EBSIPS1_2  Source        01:06:56  Pending

The output above shows that FlexVol vl_EBSIPS1_2 is being replicated from ddc-fas-p03 (PRD) to cdc-fas-p01 (DR) and its current replication status is "In-sync". Occasionally you may find that the State is listed as Snapmirrored but there is a number in the Lag column; this indicates that SnapMirror is not yet running in synchronous mode but is still propagating changes after a loss of connectivity or a resync operation. If this happens, wait a few minutes and check the SnapMirror output again.

To bring the ServiceGroup online on ebsips1, follow these steps:

ebsips1> hastatus -sum
Confirm that the ServiceGroup is not running on any other node.

ebsips1> ssh vcs_ebsips1@cdc-fas-p01 snapmirror status
Confirm that the SnapMirror status is 'In-sync'.

ebsips1> hagrp -online sg_test -sys ebsips1

To bring the ServiceGroup offline, run:

ebsips1> hagrp -offline sg_test -any

To bring the ServiceGroup online on ebsips2, follow these steps:

ebsips2> hastatus -sum
Confirm that the ServiceGroup is not running on any other node.

ebsips2> ssh vcs_ebsips2@ddc-fas-p03 snapmirror status
Confirm that the SnapMirror status is 'In-sync'.

ebsips2> hagrp -online sg_test -sys ebsips2


6 LOGS AND TROUBLESHOOTING

There is a set of logs that can be reviewed for troubleshooting.

The directory /home/netapp/logs contains the logs of the different preonline/postoffline scripts.

ebsips2> cd /home/netapp/logs
ebsips2> ls -ltr | grep sg_test | tail -3

These commands will return the last 3 logs:

190608_121329_sg_test_PRE-ONLINE_failover_to_ebsips2.log

190608_121423_sg_test_resync_from_ebsips2_to_ebsips1.log

190608_122352_sg_test_POST-OFFLINE_failback_to_ebsips1.log

And on ebsips1:

ebsips1> cd /home/netapp/logs
ebsips1> ls -ltr | grep sg_test | tail -1

190608_121329_sg_test_PRE-ONLINE_failback_to_ebsips1.log

Additionally, the VCS logs can be found under /var/VRTSvcs/log.
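When a failover hangs in the preonline phase, it is usually quickest to follow both log sources at once, for example as sketched below (engine_A.log is the default VCS engine log name in VCS 5.0).

# Follow the VCS engine log and the most recent NetApp script log in parallel.
tail -f /var/VRTSvcs/log/engine_A.log &
tail -f /home/netapp/logs/$(ls -tr /home/netapp/logs | grep sg_test | tail -1)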


7 APPENDIX 1

7.1 PROPERTIES.FILE

#!/usr/bin/bash
export nodePRD=ebsips1
export nodeDR=ebsips2
export filerPRD=ddc-fas-p03
export filerDR=CDC-FAS-P01

# define the volume names on the PRD and DR filers
export PRD_VOLUME_0="vl_EBSIPS1_2"
#export PRD_VOLUME_1=""
#export PRD_VOLUME_2=""
#export PRD_VOLUME_3=""
#export PRD_VOLUME_4=""
export DR_VOLUME_0="vl_EBSIPS1_2"
#export DR_VOLUME_1=""
#export DR_VOLUME_2=""
#export DR_VOLUME_3=""
#export DR_VOLUME_4=""

# define the LUNs available on the PRD host from the PRD filer
export PRD_LUN_0="/vol/vl_EBSIPS1_2/qt_LUN001/LUN001"
export PRD_LUN_0_ID=1
export PRD_LUN_1="/vol/vl_EBSIPS1_2/qt_LUN002/LUN002"
export PRD_LUN_1_ID=2
#export PRD_LUN_2=""
#export PRD_LUN_2_ID=
#export PRD_LUN_3=""
#export PRD_LUN_3_ID=

# define the LUNs to be made available on the DR host from the DR filer
export DR_LUN_0="/vol/vl_EBSIPS1_2/qt_LUN001/LUN001"
export DR_LUN_0_ID=1
export DR_LUN_1="/vol/vl_EBSIPS1_2/qt_LUN002/LUN002"
export DR_LUN_1_ID=2
#export DR_LUN_2=""
#export DR_LUN_2_ID=

# What are the igroups on the DR and the PRD filers?
export PRD_IGROUP=EBSIPS1
export DR_IGROUP=EBSIPS2


7.2 EBSIPS1 - PREONLINE

# $Id: preonline,v 2.22.2.1 2006/06/23 08:00:14 vikasg Exp $
#
# $Copyrights: Copyright (c) 2006 Symantec Corporation.
# All rights reserved.
#
# THIS SOFTWARE CONTAINS CONFIDENTIAL INFORMATION AND TRADE SECRETS OF
# SYMANTEC CORPORATION. USE, DISCLOSURE OR REPRODUCTION IS PROHIBITED
# WITHOUT THE PRIOR EXPRESS WRITTEN PERMISSION OF SYMANTEC CORPORATION.
#
# The Licensed Software and Documentation are deemed to be "commercial
# computer software" and "commercial computer software documentation"
# as defined in FAR Sections 12.212 and DFARS Section 227.7202. $
#
# preonline
# preonline is invoked just before onlining the group.
# preonline is invoked on the node where group is to be onlined.
#
# A group level configurable parameter PreOnline controls whether this
# trigger should be invoked or not. By default, PreOnline is not set.
# PreOnline can be set in one of two ways:
# a) In configuration file, define
#    PreOnline=1
#    in the group description to set PreOnline to 1 for the group.
# b) While cluster is running, and in writable state, do
#    hagrp -modify <group> PreOnline 1
#    to set PreOnline to 1 for group <group>.
#
# preonline gives user the control to take appropriate action depending on
# whether group is being manually onlined, or group is in the process of failover.
# In failover case, for example, preonline can be used to determine whether the
# group can be brought online on this node in the cluster.
#
# In any case, user can give control back to engine by doing:
#    hagrp -online -nopre <group> -sys <system>.
# This will let engine continue with online process.
#
# Usage:
# preonline <system> <group> <whyonlining> <systemwheregroupfaulted>
#
# <system>: is the name of the system where group is to be onlined.
# <group>: is the name of the group that is to be onlined.
# <whyonlining>: is either "FAULT" or "MANUAL". "MANUAL" corresponds to
#    manual online whereas "FAULT" corresponds to both
#    failover as well as manual switch.
# <systemwheregroupfaulted>: When preonline is invoked due to failover
#    this argument is the name of the system where group
#    was online before.
#    When preonline is invoked due to group online
#    command issued with -checkpartial option,
#    this argument is the name of system specified
#    for this option.
#
eval 'exec ${VCS_HOME:-/opt/VRTSvcs}/bin/perl5 -I ${VCS_HOME:-/opt/VRTSvcs}/lib -S $0 ${1+"$@"}'
    if 0;

$vcs_home = $ENV{"VCS_HOME"};
if (!defined ($vcs_home)) {
    $vcs_home="/opt/VRTSvcs";
}

use ag_i18n_inc;
VCSAG_SET_ENVS();

if (!defined $ARGV[0]) {
    VCSAG_LOG_MSG ("W", "Failed to continue; undefined system name", 15028);
    exit;
} elsif (!defined $ARGV[1]) {
    VCSAG_LOG_MSG ("W", "Failed to continue; undefined group name", 15031);
    exit;
}
#


# It is a good idea to log what you're doing.
# if (defined $ARGV[3]) {
#    VCSAG_LOG_MSG("I", "(preonline) Invoked with arg0=$ARGV[0], arg1=$ARGV[1], arg2=$ARGV[2], arg3=$ARGV[3]", msgid);
# } else {
#    VCSAG_LOG_MSG("I", "(preonline) Invoked with arg0=$ARGV[0], arg1=$ARGV[1]", msgid);
# }
#
# put your code here...
if ( $ARGV[0] eq "ebsips1" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_PRE-ONLINE_failback_to_ebsips1.sh");
}
#
# Here is a sample code that takes into account multiple groups.
#
# $group = $ARGV[1];
#
# if ($group eq "G1") {
#    # Say, G1 can not be onlined on this system.
#    # We add heuristics to determine next best system to online G1.
#    # Say, sysb is such a system.
#    `$vcs_home/bin/hagrp -online G1 -sys sysb`;
#    # exit now, without sending online -nopre.
#    exit;
# } elsif ($group eq "G2") {
#    # We add heurisitics to determine if G2 can be onlined on this system.
#    # Say, we determine that G2 can not be onlined anywhere.
#    # Exit now without sending online -nopre.
#    exit;
# } elsif ($group eq "G3") {
#    # Say, to online G3 we want to make sure if another group P1 is online.
#    # Query engine for P1's online state, using hagrp -state P1 -sys $ARGV[0]
#    # Say, that P1 is indeed online on this system.
#    # Don't call exit, since online nopre is called down below.
# }
#
# Here is a sample code to notify a bunch of users.
# @recipients=("username\@servername.com");
#
# $msgfile="/tmp/preonline";
# `echo system = $ARGV[0], group = $ARGV[1], whyonlining = $ARGV[2] > $msgfile`;
# foreach $recipient (@recipients) {
#    # Must have elm setup to run this.
#    `elm -s preonline $recipient < $msgfile`;
# }
# `rm $msgfile`;
#
# give control back to HAD.
if (defined $ARGV[3]) {
    system("$vcs_home/bin/hagrp -online -nopre $ARGV[1] -sys $ARGV[0] -checkpartial $ARGV[3]");
    exit;
}
system("$vcs_home/bin/hagrp -online -nopre $ARGV[1] -sys $ARGV[0]");
exit;

7.3 EBSIPS1 – PRE-ONLINE_FAILBACK_TO_EBSIPS1.SH

#!/usr/bin/bash
# PRE-ONLINE script: PRE-ONLINE_failback_to_ebsips1
# Version 1.1
# NetApp - 6/7/2008
# This script is used by VCS preonline trigger to make the PRD luns available on node ebsips1


# In order to run this script, resync_from_ebsips2_to_ebsips1.sh must have been executed already
# resync_from_ebsips2_to_ebsips1.sh performs the sync from DR->PRD
# This script wiil bring PRD luns online and reverse the sync from DR->PRD to PRD->DR

LOGFILE="/home/netapp/logs/`date +%d%m%y_%H%M%S`_sg_test_PRE-ONLINE_failback_to_ebsips1.log"

. /home/netapp/scripts/sg_test_properties.file

#------------------------------------------------------------------------
# functions:

# function pingFilerUP
# usage: pingFilerUP <filer> <0|1>
# where <filer> is the filer we want to check if is online
# and <0|1> defines if the script should abort if the filer
# is not online. <1> = critical
pingFilerUP () {
  FILER=$1
  CRITICAL=$2
  echo "checking if filer $FILER is up and running" | tee -a $LOGFILE
  if [ "$( ping $FILER 1 | grep -c answer )" -gt 0 ]
  then
    echo "filer $FILER is down" | tee -a $LOGFILE
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  else
    echo "filer $FILER is up" | tee -a $LOGFILE
  fi
}

# function offlineLUN
# usage: offlineLUN <filer> <lun> <0|1>
# where <filer> is the filer where we want to offline the lun
# <lun> is the full pathname of that lun
# and <0|1> defines if the script should abort if the operation failed
# <1> = critical
offlineLUN () {
  FILER=$1
  LUN=$2
  CRITICAL=$3
  echo "offlining luns on $filerPRD " | tee -a $LOGFILE
  ssh vcs_$nodePRD@$FILER lun offline $LUN | tee -a $LOGFILE
  # -> confirm that luns are offline
  if [ "$( ssh vcs_$nodePRD@$FILER lun show $LUN | grep -c online )" -eq "1" ]
  then
    echo "WARNING: can't confirm that LUN $FILER:$LUN is offline"
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  fi
}

# function onlineLUN
# usage: onlineLUN <filer> <lun> <0|1>
# where <filer> is the filer where we want to online the lun
# <lun> is the full pathname of that lun
# and <0|1> defines if the script should abort if the operation failed
# <1> = critical
onlineLUN () {
  FILER=$1
  LUN=$2
  CRITICAL=$3


echo "onlining luns $LUN on $FILER " | tee -a $LOG FILE ssh vcs_$nodePRD@$FILER lun online $LUN | tee -a $ LOGFILE # -> confirm that luns are online if [ "$( ssh vcs_$nodePRD@$FILER lun show $LUN | g rep -c offline )" -eq "1" ] then echo "WARNING: can't confirm that LUN $FILER:$LUN is online" if [ "$CRITICAL" -eq "1" ] then echo "FATAL: aborting PRE-ONLINE" | tee -a $LOG FILE exit 255 else return 1 fi fi } # function smBreak # usage: smBreak <filer> <volume> # where <filer> is the destination filer where we w ant to break # the SnapMirror relationship # and <volume> is the volume name smBreak () { FILER=$1 VOLUME=$2 # snapmirror quiesce and release on the destinatio n site echo "running snapmirror quiesce on volume $FILER: $VOLUME" | tee -a $LOGFILE ssh vcs_$nodePRD@$FILER snapmirror quiesce $FILER: $VOLUME| tee -a $LOGFILE echo "running snapmirror break on volume $FILER:$V OLUME" | tee -a $LOGFILE ssh vcs_$nodePRD@$FILER snapmirror break $FILER:$ VOLUME | tee -a $LOGFILE # -> confirm that snapmirror actions are done if [ "$( ssh vcs_$nodePRD@$FILER snapmirror status $FILER:$VOLUME | tee -a $LOGFILE | grep -c Broken )" -lt 1 ] then # FATA:failed to break the SnapMirror relationshi p echo "FATAL: failed to break the SnapMirror relat ionship" | tee -a $LOGFILE echo "snapmirror break $FILER:$VOLUME failed" | tee -a $LOGFILE exit 255 fi } # function unmapLUN # usage: unmapLUN <filer> <lun> <igroup> # where <filer> is the destination filer where we w ant to unmap # <lun> from the <IGROUP> unmapLUN () { FILER=$1 LUN=$2 IGROUP=$3 # unmap luns to avoid problems during onlining echo "unmap luns $LUN from igroup $IGROUP on filer $FILER " | tee -a $LOGFILE ssh vcs_$nodePRD@$FILER lun unmap $LUN $IGROUP | t ee -a $LOGFILE } # function mapLUN # usage: mapLUN <filer> <lun> <igroup> <lun_id> <0| 1> # where <filer> is the destination filer where we w ant to map # <lun> to the <IGROUP> using <lun_id> mapLUN () { FILER=$1 LUN=$2 IGROUP=$3 LUNID=$4 CRITICAL=$5 # map luns echo "map lun $LUN to igroup $IGROUP on filer $FIL ER using lun id $LUNID " | tee -a $LOGFILE ssh vcs_$nodePRD@$FILER lun map $LUN $IGROUP $LUNI D | tee -a $LOGFILE # check if the luns have been mapped correctly


if [ "$( ssh vcs_$nodePRD@$FILER lun show $LUN | tee -a $LOGFILE | grep -c mapped )" -lt 1 ] then if [ "$CRITICAL" -eq "1" ] then echo "FATAL: failed to map lun $FILER:$LUN to $I GROUP" | tee -a $LOGFILE echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGF ILE exit 255 else return 1 fi fi } # function invertSync # usage: inverSync <PRD filer> <PRD volume> <DR fil er> <DR volume> # where <PRD filer> and <PRD VOLUME> is source of t he snapmirror we want to establish # and <DR filer> and <DR volume> the destination invertSync () { PRDFILER=$1 PRDVOLUME=$2 DRFILER=$3 DRVOLUME=$4 # Invert the replication, we were doing DR->PRD, a nd now we want PRD->DR echo "Invert the replication: from DR->PRD to PRD- >DR" | tee -a $LOGFILE ssh vcs_$nodePRD@$DRFILER snapmirror resync - f -S $PRDFILER:$PRDVOLUME -w $DRFILER:$DRVOLUME | tee -a $LOGFILE # check that the Snapmirror is replication the cor rect way # we can't abort the PRE-ONLINE script it the repl ication fails to be # activacted from PRD->DR. as DR may not be availa ble echo "Verify the replication: it should be from PR D->DR" | tee -a $LOGFILE ssh vcs_$nodePRD@$filerDR snapmirror status | grep -i ^$PRDFILER:$PRDVOLUME | grep -i $DRFILER:$DRVOLUME | grep Snapmirrored | te e -a $LOGFILE } #-------------------------------------------------- ---- # MAIN echo "starting $0 at $(date)" | tee -a $LOGFILE echo | tee -a $LOGFILE echo "dumping enviroment: " | tee -a $LOGFILE env | tee -a $LOGFILE echo | tee -a $LOGFILE # check if filer $filerPRD is up and running pingFilerUP $filerPRD 1 # Break the current SnapMirror # We may be returning from failure scenario where D R is replicating to PRD # we need to stop this DR -> PRD replication and ac tivate the volumes on PRD # -> EDIT properties.file AND THIS SECTION TO INCLU DE THE VOLUMES IN USE BY $nodePRD smBreak $filerPRD $PRD_VOLUME_0 #smBreak $filerPRD $PRD_VOLUME_1 #smBreak $filerPRD $PRD_VOLUME_2 #smBreak $filerPRD $PRD_VOLUME_3 # check if filer $filerDR is up and running pingFilerUP $filerDR 0 if [ $? -lt 1 ] then # Invert the replication, we were doing DR->PRD, a nd now we want PRD->DR invertSync $filerPRD $PRD_VOLUME_0 $filerDR $DR_V OLUME_0 #invertSync $filerPRD $PRD_VOLUME_1 $filerDR $DR_ VOLUME_1 #invertSync $filerPRD $PRD_VOLUME_2 $filerDR $DR_ VOLUME_2 fi # check if filer $filerPRD is up and running pingFilerUP $filerPRD 1


if [ $? -lt 1 ]
then
  # unmap the luns to avoid errors during onlining
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
  unmapLUN $filerPRD $DR_LUN_0 $PRD_IGROUP
  unmapLUN $filerPRD $DR_LUN_1 $PRD_IGROUP
  #unmapLUN $filerPRD $DR_LUN_2 $PRD_IGROUP
  #unmapLUN $filerPRD $DR_LUN_3 $PRD_IGROUP

  # online and map the luns at the PRD site
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
  onlineLUN $filerPRD $PRD_LUN_0 1 | tee -a $LOGFILE
  onlineLUN $filerPRD $PRD_LUN_1 1 | tee -a $LOGFILE
  #onlineLUN $filerPRD $PRD_LUN_2 1 | tee -a $LOGFILE
  #onlineLUN $filerPRD $PRD_LUN_3 1 | tee -a $LOGFILE

  # map luns
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
  mapLUN $filerPRD $PRD_LUN_0 $PRD_IGROUP $PRD_LUN_0_ID
  mapLUN $filerPRD $PRD_LUN_1 $PRD_IGROUP $PRD_LUN_1_ID
  #mapLUN $filerPRD $PRD_LUN_2 $PRD_IGROUP $PRD_LUN_1_ID
  #mapLUN $filerPRD $PRD_LUN_3 $PRD_IGROUP $PRD_LUN_2_ID
fi

# rescan devices
echo "rescanning devices..." | tee -a $LOGFILE
cfgadm -al | tee -a $LOGFILE
devfsadm -C | tee -a $LOGFILE
/opt/NTAP/SANToolkit/bin/sanlun lun show -p >/dev/null 2>&1
/opt/NTAP/SANToolkit/bin/sanlun lun show -p | tee -a $LOGFILE

7.4 EBSIPS2 – RESYNC_FROM_EBSIPS2_TO_EBSIPS1.SH

#!/usr/bin/bash
# Resync from ebsips2 to ebsips1

LOGFILE="/home/netapp/logs/`date +%d%m%y_%H%M%S`_sg_test_resync_from_ebsips2_to_ebsips1.log"

. /home/netapp/scripts/sg_test_properties.file

echo "starting $0 at $(date)" | tee -a $LOGFILE
echo | tee -a $LOGFILE
echo "dumping enviroment: " | tee -a $LOGFILE
env | tee -a $LOGFILE
echo | tee -a $LOGFILE

# check if filer $filerPRD is up and running
echo "checking if filer $filerPRD is up and running" | tee -a $LOGFILE
if [ "$( ping $filerPRD 1 | grep -c answer )" -gt 0 ]
then
  echo "filer $filerPRD is down" | tee -a $LOGFILE
else
  echo "filer $filerPRD is up" | tee -a $LOGFILE
  # Invert the replication and resync DR -> PRD
  # resync DR -> PRD
  # this command may fail as the PRD site may not be available
  echo "Inverting the replication, resynching DR -> PRD" | tee -a $LOGFILE
  ssh vcs_$nodeDR@$filerPRD snapmirror resync -f -S $filerDR:$DR_VOLUME_0 -w $filerPRD:$PRD_VOLUME_0 | tee -a $LOGFILE
  # -> verify that snapmirror is configured from DR -> PRD


  while [ $( ssh vcs_$nodeDR@$filerPRD snapmirror status | grep -i ^$filerDR:$DR_VOLUME_0 | grep -i $filerPRD:$PRD_VOLUME_0 | grep -c "In-sync" | tee -a $LOGFILE ) -lt 1 ]
  do
    echo "waiting for volume to be fully synched..."
    sleep 30
  done
  ssh vcs_$nodeDR@$filerPRD snapmirror status | grep -i ^$filerDR:$DR_VOLUME_0 | grep -i $filerPRD:$PRD_VOLUME_0 | grep "In-sync" | tee -a $LOGFILE
fi

7.5 EBSIPS2 – POSTOFFLINE

# $Id: postoffline,v 2.21 2005/10/17 12:36:59 vikasg Exp $
#
# $Copyrights: Copyright (c) 2006 Symantec Corporation.
# All rights reserved.
#
# THIS SOFTWARE CONTAINS CONFIDENTIAL INFORMATION AND TRADE SECRETS OF
# SYMANTEC CORPORATION. USE, DISCLOSURE OR REPRODUCTION IS PROHIBITED
# WITHOUT THE PRIOR EXPRESS WRITTEN PERMISSION OF SYMANTEC CORPORATION.
#
# The Licensed Software and Documentation are deemed to be "commercial
# computer software" and "commercial computer software documentation"
# as defined in FAR Sections 12.212 and DFARS Section 227.7202. $
#
# postoffline
# postoffline is invoked after a group transitions to an OFFLINE state from
# a non-OFFLINE state. postoffline is invoked on the node where group
# went OFFLINE.
#
# There are no configurable settings that will turn ON/OFF invoking this
# trigger. If you don't want this trigger to be invoked, remove postoffline.*
# files from $VCS_HOME/bin/triggers directory.
#
# Usage:
# postoffline <system> <group>
#
# <system>: is the name of the system where group is offlined.
# <group>: is the name of the group that is offlined.
#
eval 'exec ${VCS_HOME:-/opt/VRTSvcs}/bin/perl5 -I ${VCS_HOME:-/opt/VRTSvcs}/lib -S $0 ${1+"$@"}'
    if 0;

$vcs_home = $ENV{"VCS_HOME"};
if (!defined ($vcs_home)) {
    $vcs_home="/opt/VRTSvcs";
}

use ag_i18n_inc;
VCSAG_SET_ENVS();

if (!defined $ARGV[0]) {
    VCSAG_LOG_MSG ("W", "Failed to continue; undefined system name", 15028);
    exit;
} elsif (!defined $ARGV[1]) {
    VCSAG_LOG_MSG ("W", "Failed to continue; undefined group name", 15031);
    exit;
}
#
# It is a good idea to log what you're doing.
# VCSAG_LOG_MSG("I", "(postoffline) Invoked with arg0=$ARGV[0], arg1=$ARGV[1]", msgid);
#
# put your code here...
if ( $ARGV[0] eq "ebsips2" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_POST-OFFLINE_failback_to_ebsips1.sh");
}


#
# Here is a sample code to notify a bunch of users.
# @recipients=("username\@servername.com");
#
# $msgfile="/tmp/postoffline";
# `echo system = $ARGV[0], group = $ARGV[1] > $msgfile`;
# foreach $recipient (@recipients) {
#    # Must have elm setup to run this.
#    `elm -s postoffline $recipient < $msgfile`;
# }
# `rm $msgfile`;

exit;

7.6 EBSIPS2 – PREONLINE

# $Id: preonline,v 2.22.2.1 2006/06/23 08:00:14 vikasg Exp $
#
# $Copyrights: Copyright (c) 2006 Symantec Corporation.
# All rights reserved.
#
# THIS SOFTWARE CONTAINS CONFIDENTIAL INFORMATION AND TRADE SECRETS OF
# SYMANTEC CORPORATION. USE, DISCLOSURE OR REPRODUCTION IS PROHIBITED
# WITHOUT THE PRIOR EXPRESS WRITTEN PERMISSION OF SYMANTEC CORPORATION.
#
# The Licensed Software and Documentation are deemed to be "commercial
# computer software" and "commercial computer software documentation"
# as defined in FAR Sections 12.212 and DFARS Section 227.7202. $
#
# preonline
# preonline is invoked just before onlining the group.
# preonline is invoked on the node where group is to be onlined.
#
# A group level configurable parameter PreOnline controls whether this
# trigger should be invoked or not. By default, PreOnline is not set.
# PreOnline can be set in one of two ways:
# a) In configuration file, define
#    PreOnline=1
#    in the group description to set PreOnline to 1 for the group.
# b) While cluster is running, and in writable state, do
#    hagrp -modify <group> PreOnline 1
#    to set PreOnline to 1 for group <group>.
#
# preonline gives user the control to take appropriate action depending on
# whether group is being manually onlined, or group is in the process of failover.
# In failover case, for example, preonline can be used to determine whether the
# group can be brought online on this node in the cluster.
#
# In any case, user can give control back to engine by doing:
#    hagrp -online -nopre <group> -sys <system>.
# This will let engine continue with online process.
#
# Usage:
# preonline <system> <group> <whyonlining> <systemwheregroupfaulted>
#
# <system>: is the name of the system where group is to be onlined.
# <group>: is the name of the group that is to be onlined.
# <whyonlining>: is either "FAULT" or "MANUAL". "MANUAL" corresponds to
#    manual online whereas "FAULT" corresponds to both
#    failover as well as manual switch.
# <systemwheregroupfaulted>: When preonline is invoked due to failover
#    this argument is the name of the system where group
#    was online before.
#    When preonline is invoked due to group online
#    command issued with -checkpartial option,
#    this argument is the name of system specified
#    for this option.
#
eval 'exec ${VCS_HOME:-/opt/VRTSvcs}/bin/perl5 -I ${VCS_HOME:-/opt/VRTSvcs}/lib -S $0 ${1+"$@"}'
    if 0;

$vcs_home = $ENV{"VCS_HOME"};
if (!defined ($vcs_home)) {
    $vcs_home="/opt/VRTSvcs";
}


use ag_i18n_inc;
VCSAG_SET_ENVS();

if (!defined $ARGV[0]) {
    VCSAG_LOG_MSG ("W", "Failed to continue; undefined system name", 15028);
    exit;
} elsif (!defined $ARGV[1]) {
    VCSAG_LOG_MSG ("W", "Failed to continue; undefined group name", 15031);
    exit;
}
#
# It is a good idea to log what you're doing.
# if (defined $ARGV[3]) {
#    VCSAG_LOG_MSG("I", "(preonline) Invoked with arg0=$ARGV[0], arg1=$ARGV[1], arg2=$ARGV[2], arg3=$ARGV[3]", msgid);
# } else {
#    VCSAG_LOG_MSG("I", "(preonline) Invoked with arg0=$ARGV[0], arg1=$ARGV[1]", msgid);
# }
#
# put your code here...
if ( $ARGV[0] eq "ebsips2" && $ARGV[1] eq "sg_test" ) {
    system("/home/netapp/scripts/sg_test_PRE-ONLINE_failover_to_ebsips2.sh");
}
#
# Here is a sample code that takes into account multiple groups.
#
# $group = $ARGV[1];
#
# if ($group eq "G1") {
#    # Say, G1 can not be onlined on this system.
#    # We add heuristics to determine next best system to online G1.
#    # Say, sysb is such a system.
#    `$vcs_home/bin/hagrp -online G1 -sys sysb`;
#    # exit now, without sending online -nopre.
#    exit;
# } elsif ($group eq "G2") {
#    # We add heurisitics to determine if G2 can be onlined on this system.
#    # Say, we determine that G2 can not be onlined anywhere.
#    # Exit now without sending online -nopre.
#    exit;
# } elsif ($group eq "G3") {
#    # Say, to online G3 we want to make sure if another group P1 is online.
#    # Query engine for P1's online state, using hagrp -state P1 -sys $ARGV[0]
#    # Say, that P1 is indeed online on this system.
#    # Don't call exit, since online nopre is called down below.
# }
#
# Here is a sample code to notify a bunch of users.
# @recipients=("username\@servername.com");
#
# $msgfile="/tmp/preonline";
# `echo system = $ARGV[0], group = $ARGV[1], whyonlining = $ARGV[2] > $msgfile`;
# foreach $recipient (@recipients) {
#    # Must have elm setup to run this.
#    `elm -s preonline $recipient < $msgfile`;
# }
# `rm $msgfile`;
#
# give control back to HAD.
if (defined $ARGV[3]) {
    system("$vcs_home/bin/hagrp -online -nopre $ARGV[1] -sys $ARGV[0] -checkpartial $ARGV[3]");
    exit;
}
system("$vcs_home/bin/hagrp -online -nopre $ARGV[1] -sys $ARGV[0]");
exit;

7.7 EBSIPS2 – POST-OFFLINE_FAILBACK_TO_EBSIPS1.SH

#!/usr/bin/bash
# POST-OFFLINE script: POST-OFFLINE_failback_to_ebsips1.sh
# Version 1.1
# NetApp - 6/7/2008
# This script is used by the VCS postoffline trigger to take the DR luns offline on node ebsips2

LOGFILE="/home/netapp/logs/`date +%d%m%y_%H%M%S`_sg_test_POST-OFFLINE_failback_to_ebsips1.log"

. /home/netapp/scripts/sg_test_properties.file

#------------------------------------------------------------------------
# functions:

# function pingFilerUP
# usage: pingFilerUP <filer> <0|1>
# where <filer> is the filer we want to check if it is online
# and <0|1> defines if the script should abort if the filer
# is not online. <1> = critical
pingFilerUP () {
  FILER=$1
  CRITICAL=$2
  echo "checking if filer $FILER is up and running" | tee -a $LOGFILE
  if [ "$( ping $FILER 1 | grep -c answer )" -gt 0 ]
  then
    echo "filer $FILER is down" | tee -a $LOGFILE
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting POST-OFFLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  else
    echo "filer $FILER is up" | tee -a $LOGFILE
  fi
}

# function offlineLUN
# usage: offlineLUN <filer> <lun> <0|1>
# where <filer> is the filer where we want to offline the lun
# <lun> is the full pathname of that lun
# and <0|1> defines if the script should abort if the operation failed
# <1> = critical
offlineLUN () {
  FILER=$1
  LUN=$2
  CRITICAL=$3
  echo "offlining luns on $FILER " | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER lun offline $LUN | tee -a $LOGFILE
  # -> confirm that luns are offline
  if [ "$( ssh vcs_$nodeDR@$FILER lun show $LUN | grep -c online )" -eq "1" ]
  then
    echo "WARNING: can't confirm that LUN $FILER:$LUN is offline"
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting POST-OFFLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  fi
}

#------------------------------------------------------
# MAIN
echo "starting $0 at $(date)" | tee -a $LOGFILE
echo | tee -a $LOGFILE
echo "dumping environment: " | tee -a $LOGFILE
env | tee -a $LOGFILE

echo | tee -a $LOGFILE

# check if filer $filerDR is up and running
pingFilerUP $filerDR 0
if [ $? -lt 1 ]
then
  # offline luns on node nodeDR,
  # we don't want this node to modify any data while we are failing back to PRD
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
  offlineLUN $filerDR $DR_LUN_0 0 | tee -a $LOGFILE
  offlineLUN $filerDR $DR_LUN_1 0 | tee -a $LOGFILE
  #offlineLUN $filerDR $DR_LUN_2 0 | tee -a $LOGFILE
  #offlineLUN $filerDR $DR_LUN_3 0 | tee -a $LOGFILE
fi

# rescan devices
echo "rescanning devices..." | tee -a $LOGFILE
cfgadm -al | tee -a $LOGFILE
devfsadm -C | tee -a $LOGFILE
/opt/NTAP/SANToolkit/bin/sanlun lun show -p >/dev/null 2>&1
/opt/NTAP/SANToolkit/bin/sanlun lun show -p | tee -a $LOGFILE
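The post-offline script can also be exercised by hand before it is wired into the VCS postoffline trigger. A hypothetical manual run is shown below; it assumes the script is installed under /home/netapp/scripts/ with the same base name as the log file above, and since it offlines the DR LUNs and rescans devices it should only be run on ebsips2 during a maintenance window:

    /home/netapp/scripts/sg_test_POST-OFFLINE_failback_to_ebsips1.sh
    # review the newest log written by the run
    ls -lt /home/netapp/logs | head
    # confirm the DR LUNs are no longer presented to the host
    /opt/NTAP/SANToolkit/bin/sanlun lun show -p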

7.8 EBSIPS2 – PRE-ONLINE_FAILOVER_TO_EBSIPS2.SH

#!/usr/bin/bash
# PRE-ONLINE script: PRE-ONLINE_failover_to_ebsips2.sh
# Version 1.1
# NetApp - 6/7/2008
# This script is used by the VCS preonline trigger to make the DR luns available on node ebsips2

LOGFILE="/home/netapp/logs/`date +%d%m%y_%H%M%S`_sg_test_PRE-ONLINE_failover_to_ebsips2.log"

. /home/netapp/scripts/sg_test_properties.file

#------------------------------------------------------------------------
# functions:

# function pingFilerUP
# usage: pingFilerUP <filer> <0|1>
# where <filer> is the filer we want to check if it is online
# and <0|1> defines if the script should abort if the filer
# is not online. <1> = critical
pingFilerUP () {
  FILER=$1
  CRITICAL=$2
  echo "checking if filer $FILER is up and running" | tee -a $LOGFILE
  if [ "$( ping $FILER 1 | grep -c answer )" -gt 0 ]
  then
    echo "filer $FILER is down" | tee -a $LOGFILE
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  else
    echo "filer $FILER is up" | tee -a $LOGFILE
  fi
}

# function offlineLUN
# usage: offlineLUN <filer> <lun> <0|1>
# where <filer> is the filer where we want to offline the lun
# <lun> is the full pathname of that lun
# and <0|1> defines if the script should abort if the operation failed
# <1> = critical
offlineLUN () {

  FILER=$1
  LUN=$2
  CRITICAL=$3
  echo "offlining luns on $FILER " | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER lun offline $LUN | tee -a $LOGFILE
  # -> confirm that luns are offline
  if [ "$( ssh vcs_$nodeDR@$FILER lun show $LUN | grep -c online )" -eq "1" ]
  then
    echo "WARNING: can't confirm that LUN $FILER:$LUN is offline"
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  fi
}

# function onlineLUN
# usage: onlineLUN <filer> <lun> <0|1>
# where <filer> is the filer where we want to online the lun
# <lun> is the full pathname of that lun
# and <0|1> defines if the script should abort if the operation failed
# <1> = critical
onlineLUN () {
  FILER=$1
  LUN=$2
  CRITICAL=$3
  echo "onlining luns $LUN on $FILER " | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER lun online $LUN | tee -a $LOGFILE
  # -> confirm that luns are online
  if [ "$( ssh vcs_$nodeDR@$FILER lun show $LUN | grep -c offline )" -eq "1" ]
  then
    echo "WARNING: can't confirm that LUN $FILER:$LUN is online"
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  fi
}

# function smBreak
# usage: smBreak <filer> <volume>
# where <filer> is the destination filer where we want to break
# the SnapMirror relationship
# and <volume> is the volume name
smBreak () {
  FILER=$1
  VOLUME=$2
  # snapmirror quiesce and break on the destination site
  echo "running snapmirror quiesce on volume $FILER:$VOLUME" | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER snapmirror quiesce $FILER:$VOLUME | tee -a $LOGFILE
  echo "running snapmirror break on volume $FILER:$VOLUME" | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER snapmirror break $FILER:$VOLUME | tee -a $LOGFILE
  # -> confirm that the snapmirror actions are done
  if [ "$( ssh vcs_$nodeDR@$FILER snapmirror status $FILER:$VOLUME | tee -a $LOGFILE | grep -c Broken )" -lt 1 ]
  then
    # FATAL: failed to break the SnapMirror relationship
    echo "FATAL: failed to break the SnapMirror relationship" | tee -a $LOGFILE
    echo "snapmirror break $FILER:$VOLUME failed" | tee -a $LOGFILE
    exit 255
  fi
}

# function unmapLUN

# usage: unmapLUN <filer> <lun> <igroup>
# where <filer> is the destination filer where we want to unmap
# <lun> from the <IGROUP>
unmapLUN () {
  FILER=$1
  LUN=$2
  IGROUP=$3
  # unmap luns to avoid problems during onlining
  echo "unmap luns $LUN from igroup $IGROUP on filer $FILER " | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER lun unmap $LUN $IGROUP | tee -a $LOGFILE
}

# function mapLUN
# usage: mapLUN <filer> <lun> <igroup> <lun_id> <0|1>
# where <filer> is the destination filer where we want to map
# <lun> to the <IGROUP> using <lun_id>
mapLUN () {
  FILER=$1
  LUN=$2
  IGROUP=$3
  LUNID=$4
  CRITICAL=$5
  # map luns
  echo "map lun $LUN to igroup $IGROUP on filer $FILER using lun id $LUNID " | tee -a $LOGFILE
  ssh vcs_$nodeDR@$FILER lun map $LUN $IGROUP $LUNID | tee -a $LOGFILE
  # check if the luns have been mapped correctly
  if [ "$( ssh vcs_$nodeDR@$FILER lun show $LUN | tee -a $LOGFILE | grep -c mapped )" -lt 1 ]
  then
    if [ "$CRITICAL" -eq "1" ]
    then
      echo "FATAL: failed to map lun $FILER:$LUN to $IGROUP" | tee -a $LOGFILE
      echo "FATAL: aborting PRE-ONLINE" | tee -a $LOGFILE
      exit 255
    else
      return 1
    fi
  fi
}

#------------------------------------------------------
# MAIN
echo "starting $0 at $(date)" | tee -a $LOGFILE
echo | tee -a $LOGFILE
echo "dumping environment: " | tee -a $LOGFILE
env | tee -a $LOGFILE
echo | tee -a $LOGFILE

# check if filer $filerPRD is up and running
pingFilerUP $filerPRD 0
if [ $? -lt 1 ]
then
  # offline luns on node nodePRD,
  # we don't want this node to modify any data while we are running from DR
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodePRD
  offlineLUN $filerPRD $PRD_LUN_0 0 | tee -a $LOGFILE
  offlineLUN $filerPRD $PRD_LUN_1 0 | tee -a $LOGFILE
  #offlineLUN $filerPRD $PRD_LUN_2 0 | tee -a $LOGFILE
  #offlineLUN $filerPRD $PRD_LUN_3 0 | tee -a $LOGFILE
fi

# check if filer $filerDR is up and running
pingFilerUP $filerDR 1
if [ $? -lt 1 ]
then
  # snapmirror break on the destination site

  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE flexVols IN USE BY $nodeDR
  smBreak $filerDR $DR_VOLUME_0
  #smBreak $filerDR $DR_VOLUME_1
  #smBreak $filerDR $DR_VOLUME_2
  #smBreak $filerDR $DR_VOLUME_3

  # unmap luns to avoid problems during onlining
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
  unmapLUN $filerDR $DR_LUN_0 $DR_IGROUP
  unmapLUN $filerDR $DR_LUN_1 $DR_IGROUP
  #unmapLUN $filerDR $DR_LUN_2 $DR_IGROUP
  #unmapLUN $filerDR $DR_LUN_3 $DR_IGROUP

  # online luns
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
  onlineLUN $filerDR $DR_LUN_0 1 | tee -a $LOGFILE
  onlineLUN $filerDR $DR_LUN_1 1 | tee -a $LOGFILE
  #onlineLUN $filerDR $DR_LUN_2 1 | tee -a $LOGFILE
  #onlineLUN $filerDR $DR_LUN_3 1 | tee -a $LOGFILE

  # map luns
  # -> EDIT properties.file AND THIS SECTION TO INCLUDE THE LUNS IN USE BY $nodeDR
  mapLUN $filerDR $DR_LUN_0 $DR_IGROUP $DR_LUN_0_ID
  mapLUN $filerDR $DR_LUN_1 $DR_IGROUP $DR_LUN_1_ID
  #mapLUN $filerDR $DR_LUN_2 $DR_IGROUP $DR_LUN_2_ID
  #mapLUN $filerDR $DR_LUN_3 $DR_IGROUP $DR_LUN_3_ID
fi

# rescan luns on the host
echo "rescan devices ..." | tee -a $LOGFILE
cfgadm -al | tee -a $LOGFILE
devfsadm -C | tee -a $LOGFILE
/opt/NTAP/SANToolkit/bin/sanlun lun show -p >/dev/null 2>&1
/opt/NTAP/SANToolkit/bin/sanlun lun show -p | tee -a $LOGFILE

# last-minute change requested by ACME: they would like the failback replication to be set up
# and initiated automatically.
# This hasn't been tested!
/home/netapp/scripts/sg_test_resync_from_ebsips2_to_ebsips1.sh
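The resync helper called on the last line is not listed in this section. Purely to illustrate the intent behind the ACME request, the sketch below shows one way such a script could re-establish the failback replication with a Data ONTAP 7-mode snapmirror resync issued against the production filer; $PRD_VOLUME_0 is an assumed variable name and this layout has not been validated against the actual sg_test_resync_from_ebsips2_to_ebsips1.sh:

    #!/usr/bin/bash
    # hypothetical sketch only; the shipped script may differ
    . /home/netapp/scripts/sg_test_properties.file
    # reverse the relationship: the production volume becomes the SnapMirror destination
    ssh vcs_$nodeDR@$filerPRD snapmirror resync -f -S $filerDR:$DR_VOLUME_0 $filerPRD:$PRD_VOLUME_0
    # check the transfer
    ssh vcs_$nodeDR@$filerPRD snapmirror status $filerPRD:$PRD_VOLUME_0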