

Business Continuity and Disaster Recovery for Oracle 11g Enabled by EMC Symmetrix VMAXe, EMC RecoverPoint, and VMware vCenter Site Recovery Manager

An Architectural Overview

White Paper

EMC SOLUTIONS GROUP

Abstract

This white paper describes a data protection and disaster recovery solution for virtualized Oracle Database 11g OLTP environments, enabled by EMC® Symmetrix® VMAXe™ with Enginuity™ for VMAXe, EMC RecoverPoint, and VMware® vCenter™ Site Recovery Manager. It covers both local data protection and automated failover and failback between remote sites.

July 2011


Copyright © 2011 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

VMware, ESX, VMware vCenter, and VMware vSphere are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number H8207


Table of contents

Executive summary
  Introduction to EMC Symmetrix VMAXe
  Business case
  Solution overview
  Key results
Introduction
  Purpose
  Scope
  Audience
  Terminology
Key technology components
  Solution components
  Oracle application
  EMC Symmetrix VMAXe with Enginuity 5875
    Overview
    Virtual Provisioning
    Symmetrix Management Console
  EMC VNX5700
  VMware vSphere
  EMC RecoverPoint
    Overview
    RecoverPoint data protection options
    RecoverPoint appliance
    RecoverPoint splitter
    RecoverPoint journals
    RecoverPoint consistency groups
  VMware vCenter Site Recovery Manager
  RecoverPoint and SRM integration
  RecoverPoint CDP and CRR – how they work
    Continuous data protection
    Continuous remote replication
Solution architecture and design
  Solution architecture
  Environment profile
  Hardware environment
  Software environment
  Symmetrix VMAXe storage allocation
  Virtual machine resources
Oracle database configuration
  Oracle database deployment on Symmetrix VMAXe
    Virtual Provisioning for Oracle ASM disk groups
  Database schema
Configuring EMC RecoverPoint
  Preparing the VMAXe for RecoverPoint connectivity
    RPA repository and gatekeeper provisioning
    Enabling write protect bypass for RPA initiators
  RecoverPoint configuration for this solution
    Journal size
  Installing RecoverPoint
  Configuring CLR
    Overview
    Step 1: Present the storage to be replicated to the RPA clusters
    Step 2: Tag the protected devices for RecoverPoint use
    Step 3: Configure RecoverPoint access to the RecoverPoint splitters
    Step 4: Create the consistency group
  Configuring the consistency group for management by SRM
Automating site recovery with VMware vCenter Site Recovery Manager
  Overview
  Prerequisites
  Step 1: Install and configure SRM
  Step 2: Connect the protected and recovery sites
  Step 3: Configure RecoverPoint array managers
  Step 4: Configure protection groups
  Step 5: Configure inventory mappings
  Step 6: Customize virtual machine recovery options
  Step 7: Customize recovery site IP addresses
  Step 8: Create the recovery plan
Testing the Oracle Failover recovery plan
  Overview
  Testing the recovery plan
  Verifying the success of the recovery test
  The recovery test report
  Benefits of testing recovery plans
RecoverPoint CRR site failover process
  Overview
  Running the Oracle Failover recovery plan
  Restarting the Oracle database
  Verifying data integrity at the recovery site
  Benefits of RecoverPoint site failover with SRM
RecoverPoint CRR site failback process
  Overview
  Housekeeping
  Configuring SRM for failback
  Testing the failback recovery plan
  Failing back to the production site
  Verifying failback to the production site
  Benefits of RecoverPoint site failback with SRM
RecoverPoint CDP database recovery
  Overview
  Preparing the test scenario
  Recovering the database from the CDP image
  Benefits of database protection with RecoverPoint
RecoverPoint operation when VMAXe is operating in a degraded mode
  Overview
  Setting up VMAXe degraded mode
  Testing degraded mode
Conclusion
  Summary
  Findings
References
  White papers
  Product documentation
  Other documentation


Executive summary

Introduction to EMC Symmetrix VMAXe

EMC® Symmetrix® is the world’s most trusted storage platform, and enterprise customers have been deploying large mission-critical applications with Symmetrix for over 20 years. Symmetrix VMAX™ is the world’s most scalable storage array, with the richest feature set, best availability, and best performance in the industry.

The Symmetrix VMAXe™ series is a new class of enterprise storage that combines an efficient and cost-effective hardware design with built-in software and simplified installation, configuration, and management. VMAXe brings high-end storage array capabilities to service providers, healthcare organizations, and others with demanding virtual computing environments but limited storage expertise and IT resources.

• The VMAXe employs an implementation of the Symmetrix Virtual Matrix Architecture™ that is optimized for rapid deployment and easier management.

The system can scale from a single-bay, single-engine configuration to a six-bay, four-engine system with 960 drives and up to 1.3 PB of usable capacity. This enables customers to cost-effectively grow and upgrade their system to accommodate application and data growth.

• The Enginuity™ operating environment, the intelligence that controls all VMAXe array components, is delivered preconfigured with EMC Virtual Provisioning™ and EMC Fully Automated Storage Tiering for Virtual Pools (FAST VP).

The 100 percent virtually provisioned environment improves storage utilization and reduces costs and administration time.

FAST VP provides automatic and highly granular movement of sub-LUN data between storage tiers. This optimizes performance and reduces cost, while radically simplifying management and increasing storage efficiency.

• The VMAXe uses 100 percent internal redundancy to deliver enterprise-class reliability, availability, and serviceability (RAS).

• EMC RecoverPoint is replication technology that uses sophisticated journaling techniques and write splitting to provide local and remote replication with DVR-like recovery to any point in time. The VMAXe has an integrated write splitter to support RecoverPoint replication.

The VMAXe series is also designed for fast and efficient deployment and includes features that are particularly useful in small or crowded data centers:

• VMAXe systems are delivered preconfigured, 100 percent virtually provisioned, and ready for same-day installation and startup.

• With the VMAXe array dispersion capability, bays can be separated by up to 10 meters, enabling very flexible deployments.

• High storage densities can also be achieved, with a single-rack, single-engine VMAXe capable of supporting 120 drives.


Business case

Enterprise customers of all sizes and across different vertical segments experience similar challenges: downtime, application and data growth, demanding SLAs, and budget constraints. Symmetrix VMAXe with Enginuity delivers multi-controller scale-out architecture, availability, and efficiency to enterprise customers to address these challenges.

Data protection and disaster recovery are critical to all enterprises and are among the most important aspects of administration in Oracle database environments. EMC RecoverPoint is proven technology for high-availability Oracle environments, providing local and remote replication with no application impact in asynchronous environments and very limited application degradation in synchronous environments:

• RecoverPoint’s journaled replication architecture enables recovery to any point in time, which reduces the recovery point objectives (RPOs) for Oracle data.

• RecoverPoint maintains transaction-consistent images at the recovery site, which reduces the recovery time objectives (RTOs).

The integrated RecoverPoint splitter for VMAXe simplifies database replication and supports operational and disaster recovery of virtualized environments. VMware® vCenter™ Site Recovery Manager works with RecoverPoint to coordinate and automate the recovery process, and enables administrators to test their disaster recovery plans without impacting the production environment or ongoing replication.

Solution overview

This solution focuses on disaster recovery between heterogeneous arrays (a Symmetrix VMAXe and an EMC VNX5700™), enabled by RecoverPoint continuous remote replication (CRR) and the RecoverPoint splitter. Local point-in-time database recovery using RecoverPoint continuous data protection (CDP) is also documented.

The solution demonstrates the data protection and disaster recovery capabilities of these technologies in the context of a virtualized Oracle Database 11g OLTP environment.

The virtualization platform for the solution is enabled by VMware vSphere™ and VMware vCenter Site Recovery Manager (SRM). Integration of RecoverPoint CRR and VMware SRM enables automated failover of the Oracle database from the production site to the recovery site and ensures that data replicated to the recovery site is available to the recovery site servers.

Key results

The solution offers the following key benefits:

• Simplified replication of Oracle OLTP database environments between heterogeneous arrays by using the integrated RecoverPoint splitter for VMAXe and the integrated RecoverPoint splitter for VNX

• Rapid recovery from unplanned disasters by using RecoverPoint in conjunction with Symmetrix VMAXe and EMC VNX arrays

• Automated failover and failback for planned and unplanned disasters, enabled by integration of RecoverPoint and VMware vCenter SRM

• Nondisruptive disaster recovery rehearsal using RecoverPoint and VMware vCenter SRM

• Point-in-time database recovery from bookmarked RecoverPoint images


Introduction

Purpose

This white paper describes a data protection and disaster recovery solution for virtualized Oracle Database 11g R2 OLTP environments, enabled by RecoverPoint, Symmetrix VMAXe and VNX RecoverPoint splitters, and VMware vCenter Site Recovery Manager (SRM).

Scope

The scope of this paper is to describe:

• The storage and virtualization infrastructure for the solution

• RecoverPoint and vCenter SRM configuration for automated failover and failback

• Nondisruptive disaster recovery rehearsal

• Failover to the recovery site

• Failback to the production site

• Local point-in-time recovery of the database

• RecoverPoint operation when VMAXe is operating in a degraded mode

Audience

This white paper is intended for Oracle database administrators, storage administrators, VMware administrators, EMC customers, and field personnel who want to understand the data protection and disaster recovery capabilities of Symmetrix VMAXe, RecoverPoint, and SRM in the context of virtualized Oracle OLTP databases.

Terminology

Table 1 defines terms used in this white paper.

Table 1. Terminology

Term | Definition
ASM | Automatic Storage Management. Oracle logical volume manager.
CDP | Continuous data protection (see RecoverPoint data protection options).
CLR | Concurrent local and remote replication (see RecoverPoint data protection options).
CRR | Continuous remote replication (see RecoverPoint data protection options).
Data device | Virtual Provisioning term for devices (not mapped to the host) that provide physical storage for thin devices. Data devices must be contained in a virtual pool before they can be used.
DR | Disaster recovery.
Enginuity | The operating environment that provides the intelligence that controls all components in a VMAXe array.
FAST VP | Fully Automated Storage Tiering for Virtual Pools. A feature of EMC Enginuity 5875 that provides automatic storage tiering at the sub-LUN level. VP denotes virtual pools, which are Virtual Provisioning thin pools.
RDM | Raw device mapping. A method of presenting a SAN/data device directly to a virtual machine. An alternative to using VMware VMFS.
RPA | RecoverPoint appliance (see RecoverPoint appliance).
RPO | Recovery point objective. The maximum acceptable time period between the last available consistent image and a disaster or failure.
RTO | Recovery time objective. The maximum acceptable time to bring a system or application back to operational state after a failure or disaster.
SMC | Symmetrix Management Console. A browser-based interface for managing EMC Symmetrix storage.
SRM | VMware vCenter Site Recovery Manager. An extension to VMware vCenter that enables integration with array-based replication.
SYMCLI | Symmetrix Solutions Enabler command line interface.
Thin device (TDev) | A cache-only device that is presented to a host. TDevs are pointers to units of physical storage contained in Virtual Provisioning thin pools.
vCPU | Virtual CPU. A processor within a virtual machine. VMware ESX® 4.1 currently supports up to eight vCPUs per virtual machine.
Virtual pool | A Virtual Provisioning thin pool, consisting of a collection of data devices that provides storage capacity for the thin devices that are bound to the pool.
VMFS | VMware Virtual Machine File System.
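The RPO and RTO definitions in Table 1 are both time intervals measured from the moment of failure, so they can be checked with simple date arithmetic. The following is a minimal Python sketch of that check. The function names, the incident timeline, and the objective values are all hypothetical and chosen only for illustration; they are not taken from the tested solution.

```python
from datetime import datetime, timedelta


def achieved_recovery_point(last_consistent_image: datetime, failure: datetime) -> timedelta:
    """Return the data-loss window: time from the last consistent image to the failure."""
    if last_consistent_image > failure:
        raise ValueError("the last consistent image cannot postdate the failure")
    return failure - last_consistent_image


def achieved_recovery_time(failure: datetime, service_restored: datetime) -> timedelta:
    """Return the outage length: time from the failure until the application is operational."""
    if service_restored < failure:
        raise ValueError("service cannot be restored before the failure")
    return service_restored - failure


def meets_objective(achieved: timedelta, objective: timedelta) -> bool:
    """An objective is met when the achieved value does not exceed the maximum acceptable one."""
    return achieved <= objective


# Hypothetical incident timeline (illustrative values only).
failure = datetime(2011, 7, 1, 12, 0, 0)
last_image = datetime(2011, 7, 1, 11, 59, 30)
restored = datetime(2011, 7, 1, 12, 20, 0)

# Hypothetical objectives: RPO of 5 minutes, RTO of 15 minutes.
rpo_ok = meets_objective(achieved_recovery_point(last_image, failure), timedelta(minutes=5))
rto_ok = meets_objective(achieved_recovery_time(failure, restored), timedelta(minutes=15))
```

In this made-up timeline the last consistent image is 30 seconds older than the failure, which is inside the assumed 5-minute RPO, while the 20-minute outage exceeds the assumed 15-minute RTO.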


Key technology components

Solution components

This section provides an overview of the key components of the solution, as listed in Table 2.

Table 2. Solution components

Function | Components
Business application | Oracle Database 11g R2 Enterprise Edition, with Oracle Grid Infrastructure and Oracle ASM
Storage platforms | Production site: EMC Symmetrix VMAXe with Enginuity for VMAXe; EMC Symmetrix Management Console. Recovery site: EMC VNX5700; EMC Unisphere™
Virtualization platform | VMware vSphere; VMware vCenter
Replication and recovery | EMC RecoverPoint; VMware vCenter Site Recovery Manager; EMC RecoverPoint Storage Replication Adapter for VMware vCenter Site Recovery Manager

Oracle application

The solution is designed to provide local protection and disaster recovery for consolidated Oracle Database 11g OLTP environments. It demonstrates these capabilities for a single-instance OLTP database deployed on a VMware ESX virtual machine.

EMC Symmetrix VMAXe with Enginuity 5875

Overview

The solution demonstrates the local protection and disaster recovery capabilities of Symmetrix VMAXe with Enginuity 5875, which provides the storage platform at the production site.

The Symmetrix VMAXe system is built on the highly scalable EMC Virtual Matrix Architecture, which enables it to grow seamlessly and cost-effectively from an entry-level, single-bay and single-engine configuration to a six-bay, four-engine system with 384 GB of cache memory, 960 drives, and up to 1.3 PB of usable capacity.

Built with simplicity and ease-of-use in mind, the VMAXe is 100 percent virtually provisioned, with storage tiering managed automatically by EMC Fully Automated Storage Tiering for Virtual Pools (FAST VP). Flash, FC, and SATA drives are all supported, as well as RAID 1, 5, and 6 protection options.

VMAXe systems are delivered preconfigured, ready for same-day installation and startup. They support all EMC Symmetrix monitoring and management tools, including the latest enhanced Symmetrix Management Console, which provides simpler installation and management with smart wizards.


The integrated RecoverPoint splitter for VMAXe enables local and/or remote replication for flexible and efficient RPOs and RTOs.
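To illustrate how write splitting and journaling can combine to allow recovery to an earlier point in time, the following Python sketch models the idea in a deliberately simplified way. It is a toy model under stated assumptions, not a description of RecoverPoint internals: the class names, the in-memory dictionary standing in for a production volume, and the sequence-numbered journal are all hypothetical.

```python
class ReplicaJournal:
    """Toy journal: keeps every replicated write in arrival order."""

    def __init__(self):
        self.entries = []      # list of (sequence, block, data)
        self.bookmarks = {}    # bookmark name -> sequence number

    def record(self, block, data):
        """Append one write and return its sequence number."""
        sequence = len(self.entries) + 1
        self.entries.append((sequence, block, data))
        return sequence

    def bookmark(self, name):
        """Label the current point in the journal so it can be found later."""
        self.bookmarks[name] = len(self.entries)

    def image_at(self, sequence):
        """Rebuild the volume contents as of a given sequence number."""
        image = {}
        for seq, block, data in self.entries:
            if seq > sequence:
                break
            image[block] = data
        return image

    def image_at_bookmark(self, name):
        return self.image_at(self.bookmarks[name])


class WriteSplitter:
    """Toy splitter: sends each write to production and to the journal."""

    def __init__(self, production, journal):
        self.production = production   # dict standing in for the production volume
        self.journal = journal

    def write(self, block, data):
        self.production[block] = data
        self.journal.record(block, data)


production = {}
journal = ReplicaJournal()
splitter = WriteSplitter(production, journal)

splitter.write(0, "A1")
splitter.write(1, "B1")
journal.bookmark("before_batch")   # hypothetical bookmark name
splitter.write(0, "A2")            # a later write that we may want to undo

recovered = journal.image_at_bookmark("before_batch")
latest = journal.image_at(len(journal.entries))
```

Because every write is retained in order, the sketch can rebuild the image as of the bookmark (block 0 still holds "A1") as well as the latest image, which matches the production copy.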

Virtual Provisioning

EMC Virtual Provisioning simplifies storage configuration and management, improves capacity utilization, and enhances performance by enabling the creation of thin devices that present applications with more capacity than is physically allocated in the storage array.

The physical storage used to supply disk space to the thin devices comes from a thin pool, which is composed of data devices that provide the actual physical storage. Thin pools can be grown or shrunk nondisruptively by adding or removing data devices. Virtual Provisioning supports local and remote replication with RecoverPoint on VMAXe.

Virtual Provisioning is described in detail in the EMC Solutions Enabler Symmetrix Array Controls CLI Version 7.3 Product Guide.
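
The relationship between thin devices and a thin pool can be illustrated with a small model. This is a minimal sketch under stated assumptions, not EMC code: the class names `ThinPool` and `ThinDevice`, the gigabyte granularity, and all sizes are hypothetical and chosen only to show that presented capacity can exceed physical capacity while allocation happens on write.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThinPool:
    """Hypothetical thin pool: physical capacity comes from data devices."""
    data_devices_gb: List[int] = field(default_factory=list)
    allocated_gb: int = 0

    @property
    def physical_gb(self) -> int:
        return sum(self.data_devices_gb)

    def add_data_device(self, size_gb: int) -> None:
        # Growing the pool adds physical capacity without touching thin devices.
        self.data_devices_gb.append(size_gb)

    def allocate(self, size_gb: int) -> None:
        if self.allocated_gb + size_gb > self.physical_gb:
            raise RuntimeError("thin pool exhausted; add data devices")
        self.allocated_gb += size_gb


@dataclass
class ThinDevice:
    """Hypothetical thin device: presents its full size, consumes pool space on write."""
    presented_gb: int
    pool: ThinPool
    written_gb: int = 0

    def write(self, size_gb: int) -> None:
        if self.written_gb + size_gb > self.presented_gb:
            raise ValueError("write exceeds presented capacity")
        self.pool.allocate(size_gb)
        self.written_gb += size_gb


# Two 150 GB thin devices bound to a pool with only 200 GB of physical storage.
pool = ThinPool(data_devices_gb=[100, 100])
devices = [ThinDevice(presented_gb=150, pool=pool) for _ in range(2)]
devices[0].write(60)
devices[1].write(40)

presented = sum(d.presented_gb for d in devices)
print(f"presented {presented} GB, physical {pool.physical_gb} GB, "
      f"allocated {pool.allocated_gb} GB")
```

In this model the hosts see 300 GB while the pool holds 200 GB, of which only the written 100 GB is consumed.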

Symmetrix Management Console

The Symmetrix Management Console (SMC) is a powerful, browser-based interface that simplifies management of EMC Symmetrix storage, from device creation to advanced Symmetrix features such as FAST VP, Virtual Provisioning, Auto-provisioning Groups, replication configuration, and monitoring.

For the solution, storage at the recovery site is provided by an EMC VNX5700 storage array, demonstrating RecoverPoint support for heterogeneous storage platforms.

The VNX5700 is a member of the VNX series next-generation storage platform, powered by Intel quad-core Xeon 5600 series processors. It is designed to deliver maximum performance and scalability for midtier enterprises, enabling them to grow, share, and cost-effectively manage multiprotocol file and block systems. VNX arrays incorporate the RecoverPoint splitter, which supports unified file and block replication for local data protection and disaster recovery.

EMC Unisphere is the central management platform for the EMC VNX series, providing a single combined view of file and block systems, with all features and functions available through a common interface. Unisphere is optimized for virtual applications and provides industry-leading VMware integration, automatically discovering virtual machines and ESX servers and providing end-to-end, virtual-to-physical mapping.

VMware vSphere provides the virtualization platform for the solution, with VMware ESX virtual machines hosting the database and management systems at both the production and recovery sites.

VMware vSphere abstracts applications and information from the complexity of underlying infrastructure, through comprehensive virtualization of server, storage, and networking hardware. It is the industry’s most complete and robust virtualization platform, virtualizing business-critical applications with dynamic resource pools for unprecedented flexibility and reliability.

VMware vCenter provides the centralized management platform for vSphere environments, enabling control and visibility at every level of the virtual infrastructure, including integration with RecoverPoint to enable discovery of the protection status for virtual machines in the environment.

EMC VNX5700

VMware vSphere

Overview

RecoverPoint is an advanced enterprise-class disaster recovery solution designed with the performance, reliability, and flexibility required for enterprise applications in heterogeneous storage and server environments. It provides bi-directional local and remote data replication, without distance limits and with minimal performance degradation.

EMC RecoverPoint/EX is a product variant that is optimized for the EMC VMAXe and VNX series and the CLARiiON® CX3 and CX4 series of storage arrays. EMC RecoverPoint/EX is the product used in this solution.

RecoverPoint data protection options

RecoverPoint provides the following replication options for both physical and VMware virtualized environments:

Figure 1. RecoverPoint replication options

• Continuous data protection (CDP): CDP continuously captures and stores data modifications locally, enabling local recovery from any point in time, with no data loss. Both synchronous and asynchronous replication are supported.

• Continuous remote replication (CRR): CRR supports synchronous and asynchronous replication between remote sites over FC and a WAN. Synchronous replication is supported when the remote sites are connected through FC and provides a zero RPO. Asynchronous replication provides crash-consistent protection and recovery to specific points in time, with a small RPO.

EMC RecoverPoint


• Concurrent local and remote replication (CLR): CLR is a combination of CRR and CDP and provides concurrent local and remote data protection.

In RecoverPoint, CDP is normally used for operational recovery, while CRR is normally used for disaster recovery—the solution demonstrates a RecoverPoint CLR configuration.

RecoverPoint appliance

RecoverPoint is appliance-based, which enables it to better support large amounts of information stored across heterogeneous environments. This out-of-band approach enables RecoverPoint to deliver continuous replication with minimal impact to an application’s I/O operations.

RecoverPoint appliances (RPAs) run the RecoverPoint software and manage all aspects of data replication. For local replication, a cluster configuration of two or more active RPAs is deployed—this supports immediate switchover to another appliance if one of the RPA nodes in a cluster goes down. For remote replication, a RecoverPoint cluster is deployed at both sites.

RPAs use powerful deduplication, compression, and bandwidth reduction technologies to minimize the use of bandwidth and dramatically reduce the time lag between writing data to storage at the source and target sites.

RecoverPoint splitter

RecoverPoint uses lightweight write splitting technology, on the application server, in the fabric, or in the array, to mirror application writes to the RecoverPoint cluster. VMAXe and VNX arrays have integrated RecoverPoint splitters that operate in each front-end adapter (FA), which ensures that the RPA receives a copy of each write.

The array-based splitter is the most effective write splitter for VMware replication, enabling replication of VMFS and RDM volumes without the cost or complexity of additional hardware. The splitter supports both FC and iSCSI volumes presented by the VMAXe or VNX arrays to any host, including to an ESX server.

RecoverPoint journals

RecoverPoint journals store timestamped application writes for later recovery to selected points in time. Three journals are provisioned for local and remote replication: one production journal at the production site, and a history journal at both the production and recovery sites.

For synchronous replication, every write is retained in the history journal for recovery to any point in time. For asynchronous replication, several writes are grouped before delivery to the history journal—this supports recovery to significant points in time. Bookmarked points-in-time can be created automatically or manually to enable recovery to specific application or system events.

RecoverPoint consistency groups

The consistency and write-order fidelity of point-in-time images are assured by RecoverPoint’s use of replication sets and consistency groups. A replication set defines an association between a production volume and the local and/or remote volumes to which it is replicating. A consistency group logically groups replication sets that must be consistent with one another. The consistency group ensures that updates to the production volumes are written to the replicas in consistent write order and that the replicas can always be used to continue working from or to restore the production source.

RecoverPoint replication is policy-driven. A replication policy, based on a particular business need, can be uniquely specified for each consistency group. This policy governs the replication parameters for the consistency group—for example, the RPO and RTO for the consistency group, and its deduplication, data compression, and bandwidth reduction settings.
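
The relationship between replication sets, consistency groups, and per-group policies can be pictured as a small data model. The sketch below is illustrative only and is not the RecoverPoint API: the class names, the policy fields, the group name `example_cg`, and the replica volume names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ReplicationSet:
    """One production volume plus the replicas it replicates to."""
    production_volume: str
    local_replica: Optional[str] = None
    remote_replica: Optional[str] = None


@dataclass
class ReplicationPolicy:
    """Hypothetical policy fields; real settings live in RecoverPoint itself."""
    rpo_seconds: int
    compression: bool = True
    deduplication: bool = True


@dataclass
class ConsistencyGroup:
    name: str
    policy: ReplicationPolicy
    replication_sets: List[ReplicationSet] = field(default_factory=list)

    def add(self, production: str, local: Optional[str] = None,
            remote: Optional[str] = None) -> None:
        if local is None and remote is None:
            raise ValueError("a replication set needs at least one replica")
        self.replication_sets.append(ReplicationSet(production, local, remote))

    def mode(self) -> str:
        # Derive the replication option from the replicas that are present.
        has_local = any(rs.local_replica for rs in self.replication_sets)
        has_remote = any(rs.remote_replica for rs in self.replication_sets)
        if has_local and has_remote:
            return "CLR"
        if has_local:
            return "CDP"
        if has_remote:
            return "CRR"
        return "unprotected"


group = ConsistencyGroup("example_cg", ReplicationPolicy(rpo_seconds=30))
group.add("DATA", local="DATA_local", remote="DATA_remote")
group.add("REDO", local="REDO_local", remote="REDO_remote")
print(group.name, group.mode())
```

Because the policy hangs off the group rather than the individual volumes, every replication set in the group shares the same RPO and bandwidth-reduction settings, which mirrors the per-group policy described above.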

VMware vCenter Site Recovery Manager (SRM) is a disaster recovery framework that integrates with EMC RecoverPoint to automate recovery of VMware datastores so that it becomes as simple as pressing a single button.

SRM is an extension to VMware vCenter that enables integration with array-based replication, discovery and management of replicated datastores, and automated migration of inventory from one vCenter to another. SRM does not replicate any data. For this, it leverages an external replication solution such as RecoverPoint. SRM servers coordinate the operations of the replicated storage arrays and vCenter servers at the production and recovery sites so that, as virtual machines at the production site are shut down, virtual machines at the recovery site start up and assume responsibility for providing the same services, using the data replicated from the production site.

Migration of protected inventory and services from one site to the other is controlled by a recovery plan that specifies the order in which virtual machines are shut down and started up, the compute resources that are allocated, and the networks they can access. SRM integrated with RecoverPoint also enables recovery plans to be tested, using a temporary copy of the replicated data, in a way that does not disrupt ongoing operations at either site.
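
The role of a recovery plan can be sketched as an ordered list of actions. This is a simplified illustration, not the SRM object model: the `PlanStep` type, the priority convention, the virtual machine names, and the choice to shut down in reverse startup order are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlanStep:
    vm: str
    startup_priority: int  # assumed convention: lower number starts earlier


def run_plan(steps: List[PlanStep], test: bool = False) -> List[str]:
    """Return the ordered actions for a failover, or for a non-disruptive test."""
    ordered = sorted(steps, key=lambda s: s.startup_priority)
    actions = []
    if not test:
        # Assumption: production VMs are shut down in reverse startup order.
        for step in reversed(ordered):
            actions.append(f"shutdown {step.vm} @ production")
    # A test run starts VMs from a temporary copy and leaves production alone.
    source = "temporary copy" if test else "replica"
    for step in ordered:
        actions.append(f"start {step.vm} @ recovery ({source})")
    return actions


plan = [PlanStep("app-server", 2), PlanStep("oracle-db", 1)]
for action in run_plan(plan):
    print(action)
```

Calling `run_plan(plan, test=True)` yields only the recovery-site startup actions, reflecting that a plan test does not disrupt operations at the production site.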

Integration of SRM with RecoverPoint is provided by the EMC RecoverPoint Storage Replication Adapter (SRA) for VMware vCenter Site Recovery Manager. The RecoverPoint SRA supports discovery of arrays attached to RecoverPoint and of consistency groups that are enabled for management by SRM. It also supports SRM functions such as failover and failover testing by mapping SRM recovery plans to the appropriate RecoverPoint actions.

VMware vCenter Site Recovery Manager

RecoverPoint and SRM integration


Continuous data protection

Local protection of the production environment is performed by RecoverPoint continuous data protection (CDP), using the integrated RecoverPoint splitter. Figure 2 illustrates the flow of a write in CDP.

Figure 2. RecoverPoint CDP data flow

1. The application server issues a write to a LUN that is being protected by RecoverPoint. The write is intercepted by the RecoverPoint splitter.

2. The splitter “splits” the write and sends it to the production volume and to the RPA simultaneously.

3. When the write is received by the RPA, it is acknowledged back to the splitter.

4. In parallel, the RPA moves the data into the local journal volume, along with a timestamp and any application, event, or user-generated bookmarks for the write.

5. After the data is safely stored in the journal, it is distributed to the target local volumes through the RPA FC connections—write order is preserved during distribution.
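
The five steps above can be sketched as a toy simulation. This is a minimal model under stated assumptions, not RecoverPoint code: the `Splitter`, `Rpa`, and `JournalEntry` names are hypothetical, volumes are plain dictionaries keyed by block address, and acknowledgment and journaling run sequentially here although the text describes them as parallel.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class JournalEntry:
    seq: int
    timestamp: float
    lba: int
    data: bytes
    bookmark: Optional[str] = None


class Rpa:
    """Toy appliance: journals writes, then distributes them in order."""

    def __init__(self) -> None:
        self._seq = itertools.count(1)
        self._distributed = 0
        self.journal: List[JournalEntry] = []
        self.replica: Dict[int, bytes] = {}

    def receive(self, lba: int, data: bytes, bookmark: Optional[str] = None) -> bool:
        # Steps 3 and 4: journal the write with a timestamp, then acknowledge.
        entry = JournalEntry(next(self._seq), time.time(), lba, data, bookmark)
        self.journal.append(entry)
        return True

    def distribute(self) -> None:
        # Step 5: apply journaled writes to the replica, preserving write order.
        for entry in self.journal[self._distributed:]:
            self.replica[entry.lba] = entry.data
        self._distributed = len(self.journal)

    def image_at(self, seq: int) -> Dict[int, bytes]:
        # Rebuild the blocks written up to a chosen point in the journal.
        image: Dict[int, bytes] = {}
        for entry in self.journal:
            if entry.seq > seq:
                break
            image[entry.lba] = entry.data
        return image


class Splitter:
    """Toy splitter: sends each write to production and to the RPA."""

    def __init__(self, rpa: Rpa) -> None:
        self.rpa = rpa
        self.production: Dict[int, bytes] = {}

    def write(self, lba: int, data: bytes, bookmark: Optional[str] = None) -> bool:
        # Steps 1 and 2: intercept the write and split it.
        self.production[lba] = data
        return self.rpa.receive(lba, data, bookmark)


rpa = Rpa()
splitter = Splitter(rpa)
splitter.write(0, b"v1")
splitter.write(0, b"v2", bookmark="before batch job")
splitter.write(8, b"x")
rpa.distribute()
print(rpa.replica == splitter.production, rpa.image_at(1))
```

Keeping every write in the journal is what makes the `image_at` lookup possible; the replica alone would only hold the latest state.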

RecoverPoint CDP and CRR – how they work


Continuous remote replication

Replication of the production environment to the recovery environment is performed by RecoverPoint continuous remote replication (CRR) using the RecoverPoint splitter. Figure 3 illustrates the flow of a write in CRR.

Figure 3. RecoverPoint CRR data flow

1. The application server issues a write to a LUN that is being protected by RecoverPoint. The write is intercepted by the RecoverPoint splitter.

2. The splitter “splits” the write and sends it to the production volume and to the local RPA simultaneously, the same as in a CDP deployment.

3. When the RPA receives the write, it immediately acknowledges it back to the splitter, unless synchronous remote replication is in effect. With synchronous replication, the ACK is delayed until the write has been received by the RPA at the remote site.

4. When the write is received by the local RPA, it is bundled with other writes, deduplicated to remove redundant blocks, sequenced, and timestamped. The package is then compressed and transmitted with a checksum for delivery to the remote RPA cluster.

5. When the package is received at the remote site, the remote RPA verifies the checksum, to ensure the package was not corrupted in transmission, and uncompresses the data.

6. The RPA then writes the data to the journal volume at the remote site.

7. After the data has been written to the journal volume, it is distributed to the remote replica volumes through the RPA FC connections—write order is preserved during this distribution.
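
Steps 4 and 5 above (bundle, deduplicate, sequence, timestamp, compress, checksum, then verify and uncompress) can be illustrated with standard-library tools. The sketch below is not the RecoverPoint wire format, which the source does not describe: the JSON layout, SHA-256 block digests for deduplication, zlib compression, and a CRC-32 checksum are all stand-ins chosen for illustration.

```python
import hashlib
import json
import time
import zlib
from typing import List, Tuple

Write = Tuple[int, bytes]  # (logical block address, payload)


def build_package(writes: List[Write], first_seq: int = 1) -> bytes:
    """Local RPA side: bundle, deduplicate, sequence, timestamp, compress, checksum."""
    blocks = {}   # digest -> payload (hex), so each unique block is stored once
    records = []  # one small record per write, in write order
    for offset, (lba, data) in enumerate(writes):
        digest = hashlib.sha256(data).hexdigest()
        blocks.setdefault(digest, data.hex())
        records.append({"seq": first_seq + offset, "lba": lba, "block": digest})
    body = json.dumps(
        {"timestamp": time.time(), "records": records, "blocks": blocks}
    ).encode("utf-8")
    compressed = zlib.compress(body)
    checksum = zlib.crc32(compressed).to_bytes(4, "big")
    return checksum + compressed


def open_package(package: bytes) -> List[Tuple[int, int, bytes]]:
    """Remote RPA side: verify the checksum, uncompress, return writes in order."""
    checksum, compressed = package[:4], package[4:]
    if zlib.crc32(compressed).to_bytes(4, "big") != checksum:
        raise ValueError("package corrupted in transmission")
    body = json.loads(zlib.decompress(compressed))
    records = sorted(body["records"], key=lambda r: r["seq"])
    return [
        (r["seq"], r["lba"], bytes.fromhex(body["blocks"][r["block"]]))
        for r in records
    ]


writes = [(10, b"A" * 512), (11, b"A" * 512), (10, b"B" * 512)]
package = build_package(writes)
received = open_package(package)
print(len(package), [(seq, lba) for seq, lba, _ in received])
```

The two identical 512-byte blocks are carried once in the package, while the sequence numbers let the remote side replay all three writes in their original order.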


Solution architecture and design

EMC solutions are designed to reflect and validate real-world deployments. Figure 4 depicts the physical architecture of the solution described in this white paper.

Figure 4. Solution architecture

The solution demonstrates the local protection and disaster recovery capabilities of the EMC Symmetrix VMAXe with Enginuity, which provides the storage platform at the production site. Storage at the recovery site is provided by an EMC VNX5700 array, demonstrating RecoverPoint support for heterogeneous storage platforms. VMware vSphere provides the virtualization platform for the solution.

Solution architecture


Table 3 details the environment profile for the solution.

Table 3. Solution profile

| Profile characteristic | Details |
| --- | --- |
| Database type | OLTP |
| Database size | 500 GB |
| Number of databases | 1 |
| Workload profile | Swingbench Order Entry (TPC-C-like) workload |
| Database read/write ratio | 60/40 |
| Network connectivity | FC 8 Gb and 10 GbE |

Table 4 details the hardware environment for the solution.

Table 4. Solution hardware environment

| Purpose | Quantity | Configuration |
| --- | --- | --- |
| Storage (production site) | 1 | EMC Symmetrix VMAXe with: single engine; 96 GB cache memory; 88 x 450 GB, 15k FC drives, plus spare; 16 x 2000 GB SATA drives, plus spare; 4 x 200 GB Flash drives, plus spare; Enginuity for VMAXe 5875 2011Q2SR |
| Storage (recovery site) | 1 | EMC VNX5700 with: 8 Gb FC connectivity; 1 Gb network connectivity; 15 x 200 GB SATA Flash; 15 x 300 GB SAS drives; 45 x 2 TB NL-SAS drives; 2 x Data Movers; 8 x GbE NICs; 1 x Control Station; VNX Operating Environment |
| VMware ESX servers | 2 | 2 x quad-core Xeon 5560 CPUs, 2.80 GHz, 98 GB RAM; 2 x 10 Gb CNA adapters |
| Network switches | 2 | Gigabit Ethernet switches |
| FC switches | 4 | 8 Gb/s FC switches, 2 per site |
| RecoverPoint appliances | 4 | GEN 4, 2 RPAs per site |
| Host adapters | 4 | 10 Gb CNA adapter (2 per physical server) |

Environment profile

Hardware environment


Table 5 details the software environment for the solution.

Table 5. Solution software environment

| Software | Version | Purpose |
| --- | --- | --- |
| EMC Symmetrix Management Console | 7.3 | Symmetrix VMAXe configuration and management tool |
| EMC Unisphere | 1.1 | VNX management software |
| EMC Solutions Enabler | 7.3 | Symmetrix VMAXe management software |
| EMC RecoverPoint | 3.4.1 | EMC replication software, installed on each of the 4 RPAs |
| EMC RecoverPoint Adapter for VMware vCenter Site Recovery Manager | 4.1.1 | EMC software for integrating RecoverPoint and SRM |
| VMware vSphere | 4.1 GA B260247 | Hypervisor hosting all virtual machines |
| VMware vCenter | 4.1 GA B259021 | Management of VMware vSphere |
| VMware vCenter Site Recovery Manager | 4.1.1 | Managing failover and failback of virtual machines |
| Oracle Database 11g R2 Enterprise Edition | 11.2.0.2 | Oracle database software for grid computing |
| Red Hat Enterprise Linux | 5.5 | Server OS for Oracle database server |
| Microsoft Windows Server | 2008 R2 x64 | Server OS for Swingbench load generator |
| Swingbench | 2.4.0.723 | Database workload generation tool |

Software environment


The Symmetrix VMAXe array is factory-configured, based on information provided by the customer at the time of ordering. Storage pools are preconfigured based on the protection schemes requested by the customer.

To provision storage to the ESX hosts, simply run the Add New Host And Provision Virtual Storage wizard from the SMC dashboard. This guides you through the steps involved, as shown in Figure 5.

Figure 5. Add New Host and Provision Virtual Storage wizard

Table 6 details the VMFS volumes provisioned for the solution from the Symmetrix VMAXe array. All these volumes were replicated with RecoverPoint.

Table 6. Symmetrix VMAXe storage allocation

| VMFS volume | Capacity | Meta members | Thin pools |
| --- | --- | --- | --- |
| OracleVM_DataStore | 100 GB | 1 | VM_ FC_3R5 |
| DATA | 800 GB | 16 | FC_3R5 |
| REDO | 100 GB | 2 | FC_3R5 |
| TMP | 200 GB | 4 | FC_3R5 |
| FRA | 1,440 GB | 8 | SATA_6R6 |
| Oracle_Binaries | 20 GB | 1 | FC_3R5 |

Note: Volumes replicated with RecoverPoint require a target volume of the same size on the local array and remote array.
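
As a rough planning aid, the presented capacity implied by Table 6 and the note above can be totalled in a few lines. The volume sizes come from Table 6; the assumption that every volume has one local and one remote replica follows from the CLR configuration used in the solution. Journal and repository volumes are deliberately excluded, and because the devices are thin, these figures are presented capacity rather than physically allocated storage.

```python
# VMFS volume capacities in GB, taken from Table 6.
volumes_gb = {
    "OracleVM_DataStore": 100,
    "DATA": 800,
    "REDO": 100,
    "TMP": 200,
    "FRA": 1440,
    "Oracle_Binaries": 20,
}

production_gb = sum(volumes_gb.values())

# Each replicated volume needs a same-size target locally and remotely.
local_replica_gb = production_gb
remote_replica_gb = production_gb

# Production volumes and local replicas both reside on the production array.
production_array_gb = production_gb + local_replica_gb

print(f"production volumes:  {production_gb} GB")
print(f"production array:    {production_array_gb} GB (volumes + local replicas)")
print(f"recovery array:      {remote_replica_gb} GB (remote replicas)")
```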

Symmetrix VMAXe storage allocation


Table 7 details the virtual allocation of hardware resources for the virtual machines at the production site.

Table 7. Virtual allocation of hardware resources for virtual machines

| Virtual machine role | Quantity | Configuration |
| --- | --- | --- |
| Oracle database node | 1 | 6 vCPUs, 24 GB RAM, RHEL 5.5 |
| VMware vCenter/SRM server* | 1 | 2 vCPUs, 8 GB RAM, Windows Server 2008 R2 x64 |
| EMC SMC/SPA server* | 1 | 2 vCPUs, 4 GB RAM, Windows Server 2008 R2 x64 |
| Swingbench server* | 1 | 4 vCPUs, 8 GB RAM, Windows Server 2008 R2 x64 |
| Application server | 1 | 1 vCPU, 8 GB RAM, Windows Server 2008 R2 x64 |

* These virtual machines reside on an existing ESX server used only for management purposes.

Virtual machine resources


Oracle database configuration

Virtual Provisioning for Oracle ASM disk groups

The Oracle database for the solution has separate ASM disk groups for data files, redo logs, fast recovery area (FRA), and temp files. Separate VMFS volumes are presented to the virtual machines on dedicated datastores for each ASM disk group.

The +DATA, +TMP, and +REDO ASM disk groups are deployed on thin-provisioned devices bound to a virtual pool composed of FC devices with RAID 5 (3+1) protection. This pool provides the high performance required by these devices.

The thin device provisioned for the +REDO logs was fully provisioned at the time of deployment; that is, all physical storage was assigned up front. The VMFS datastore for the +REDO log file was also provisioned to request all storage from the outset.

The +FRA device is bound to a virtual pool composed of high-capacity SATA devices with RAID 6 (6+2) protection. FRA devices typically do not have the same high-performance demands as data files and redo logs, so SATA devices provide an efficient deployment for this data.

Figure 6 shows the relationships from the ASM disk groups through to the thin pools.

Figure 6. Relationship between ASM disk groups and storage pools

[Figure labels: Oracle_DATA, Oracle_REDO, Oracle_TMP, Oracle_FRA; +DATA, +REDO, +TMP, +FRA; FC Pool RAID 5 (3+1), SATA Pool RAID 6 (6+2). Legend: ASM disk group, VMFS datastore, thin device (TDEV), thin pool.]

A single instance of the Swingbench Order Entry PL/SQL (SOE) schema was used to deliver the OLTP workloads required by the solution. The Swingbench SOE schema models a traditional OLTP database, with tables and indexes residing in separate tablespaces.

Oracle database deployment on Symmetrix VMAXe

Database schema


The schema for the solution contains the tables and indexes listed in Table 8.

Table 8. Schema tables and indexes

| Table name | Index |
| --- | --- |
| CUSTOMERS | CUSTOMERS_PK (UNIQUE), CUST_ACCOUNT_MANAGER_IX, CUST_EMAIL_IX, CUST_LNAME_IX, CUST_UPPER_NAME_IX |
| INVENTORIES | INVENTORY_PK (UNIQUE), INV_PRODUCT_IX, INV_WAREHOUSE_IX |
| ORDERS | ORDER_PK (UNIQUE), ORD_CUSTOMER_IX, ORD_ORDER_DATE_IX, ORD_SALES_REP_IX, ORD_STATUS_IX |
| ORDER_ITEMS | ORDER_ITEMS_PK (UNIQUE), ITEM_ORDER_IX, ITEM_PRODUCT_IX |
| PRODUCT_DESCRIPTIONS | PRD_DESC_PK (UNIQUE), PROD_NAME_IX |
| PRODUCT_INFORMATION | PRODUCT_INFORMATION_PK (UNIQUE), PROD_SUPPLIER_IX |
| WAREHOUSES | WAREHOUSES_PK (UNIQUE) |
| LOGON | |

To verify the virtual database environment, the Swingbench Order Entry workload was run against the database, with a user count of 100, as shown in Figure 7.

Figure 7. Swingbench workload running against the database


Configuring EMC RecoverPoint

RPA repository and gatekeeper provisioning

Configuring the RecoverPoint splitter on the Symmetrix VMAXe array requires provisioning of the following volumes:

• A repository volume (3 GB minimum) for the RPA cluster.

This stores configuration information about the RPAs and RecoverPoint consistency groups, which enables a properly functioning RPA to seamlessly assume the replication activities of a failing RPA from the same cluster.

A repository volume of the same size was also provisioned on the VNX5700 for the RPA cluster at the recovery site.

• Eight unique gatekeeper volumes each for RPA1 and RPA2.

These volumes are provisioned by creating Auto-provisioning masking views that present the volumes to the RPAs. For the solution, three masking views were created for the RecoverPoint cluster, as it consists of two RPAs (RPA1 and RPA2):

• RecoverpointConfig – this presents the repository volume, the journal volumes, and the replica volumes to all nodes in the RPA cluster

• RPA1_GK – this presents the relevant gatekeepers to RPA1

• RPA2_GK – this presents the relevant gatekeepers to RPA2

Figure 8 shows creation of the RecoverpointConfig masking view using the Masking View Management – Create dialog box in Symmetrix Management Console.

For further information on Auto-provisioning masking views, consult Deploying RecoverPoint with Symmetrix—Technical Notes.
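
The three masking views described above can be pictured as a mapping from initiators to devices. This is a simplified sketch, not Solutions Enabler or SMC output: the `MaskingView` type, the device names, and the gatekeeper naming scheme are hypothetical, and the port group that a real Auto-provisioning masking view also contains is omitted.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class MaskingView:
    name: str
    initiators: FrozenSet[str]  # RPAs that can see the devices
    devices: FrozenSet[str]     # devices presented by this view


def gatekeepers(rpa: str, count: int = 8) -> FrozenSet[str]:
    # Hypothetical naming; each RPA gets its own unique gatekeeper set.
    return frozenset(f"{rpa}_GK{i:02d}" for i in range(1, count + 1))


views: List[MaskingView] = [
    MaskingView(
        "RecoverpointConfig",
        frozenset({"RPA1", "RPA2"}),
        frozenset({"repository", "journal_01", "replica_01"}),
    ),
    MaskingView("RPA1_GK", frozenset({"RPA1"}), gatekeepers("RPA1")),
    MaskingView("RPA2_GK", frozenset({"RPA2"}), gatekeepers("RPA2")),
]


def visible_devices(rpa: str) -> Set[str]:
    """Union of all devices presented to one RPA across the masking views."""
    seen: Set[str] = set()
    for view in views:
        if rpa in view.initiators:
            seen |= view.devices
    return seen


shared = visible_devices("RPA1") & visible_devices("RPA2")
print(sorted(shared))
```

In this model the only devices both RPAs can see are the ones in the shared view, while each gatekeeper set stays private to its own RPA.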

Preparing the VMAXe for RecoverPoint connectivity


Figure 8. Configuring masking views for a RecoverPoint cluster


Enabling write protect bypass for RPA initiators

The RecoverPoint splitter for VMAXe requires the RPA initiators to have special access that enables them to write to write-protected devices. To grant this access, the write protect bypass initiator flag, WP_Bypass (WPBP), must be set for all RPA initiators. Figure 9 shows the SMC procedure for doing this.

Figure 9. Enabling write protect bypass for RPAs

1. Choose Enable Recover Point command for initiator group.

2. SMC confirms initiator group enabled for RecoverPoint.

3. Initiator properties show WPBP flag enabled.


RecoverPoint was configured as follows for the solution:

• CLR was the replication method used for the solution. This combines both CDP and CRR.

• Two consistency groups were created:

• Oracle_SRM – this encompasses all the replication sets for the Oracle database, Oracle binaries, and the datastore containing the running operating system for the virtual machine that hosts the Oracle software.

• App_SRM – this contains the replication sets for the datastore containing the Swingbench and application server virtual machines (see Table 7).

By separating replicated virtual machines into multiple consistency groups, it was possible to plan for different disaster recovery scenarios. For example, the Oracle environment could be failed over separately from the other virtual machines, or the entire production environment could be failed over as a single process.

• For each consistency group, three journals were set up, two at the production site to support the production volumes and their local replicas, and one at the recovery site to support the remote replica. All journals on the VMAXe were bound to the FC_3R5 pool.

For further details on configuring RecoverPoint, consult the EMC RecoverPoint Release 3.4.1 Administrator’s Guide.

Journal size

The size of the journal volumes should reflect your RPO. Determining the journal size requires administrators to calculate the expected peak change rate in their environment.

The following is the Journal Volume Sizing formula:

$$\text{Journal size} = \frac{(\text{new data writes per second}) \times (\text{required rollback time in seconds})}{1 - \text{target side log size}} \times 1.05$$

Twenty percent of the journal must be reserved for the target side log and 5 percent for internal system needs.

For example, to support a 24-hour rollback requirement (86,400 seconds), with 5 Mb/s of new data writes to the replication volumes in a consistency group, the calculation would be as follows:

$$\frac{5 \times 86400}{1 - 0.2} \times 1.05 = 567000\ \text{Mb} = 69.213\ \text{GB}\ (\approx 70\ \text{GB})$$

For the solution, all journals were sized to 500 GB. With a change rate of 5 Mb/s, this enables RecoverPoint to roll back at least seven days. It may be possible to extend this rollback window further through bookmark consolidation and journal compression.
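
The sizing arithmetic can be checked with a short script. This is a sketch of the formula as stated in this paper, not an EMC sizing tool: the function names are hypothetical, the 20 percent target side log and 5 percent system overhead are the values given above, and the conversion of 8 x 1,024 megabits per gigabyte is the one implied by the worked example (567,000 Mb = 69.213 GB).

```python
MBIT_PER_GB = 8 * 1024  # conversion implied by the worked example in the text


def journal_size_mbit(write_rate_mbps: float, rollback_seconds: float,
                      target_log_fraction: float = 0.20,
                      system_overhead: float = 1.05) -> float:
    """Journal size in megabits for a given change rate and rollback window."""
    return (write_rate_mbps * rollback_seconds
            / (1 - target_log_fraction) * system_overhead)


def rollback_window_seconds(journal_gb: float, write_rate_mbps: float,
                            target_log_fraction: float = 0.20,
                            system_overhead: float = 1.05) -> float:
    """Inverse of the sizing formula: rollback time a journal can hold."""
    usable_mbit = journal_gb * MBIT_PER_GB * (1 - target_log_fraction)
    return usable_mbit / (system_overhead * write_rate_mbps)


# Worked example from the text: 24-hour rollback at 5 Mb/s.
size_mbit = journal_size_mbit(5, 86_400)
size_gb = size_mbit / MBIT_PER_GB
print(f"{size_mbit:.0f} Mb = {size_gb:.3f} GB")

# Journals in the solution: 500 GB at the same 5 Mb/s change rate.
days = rollback_window_seconds(500, 5) / 86_400
print(f"500 GB journal holds about {days:.1f} days")
```

Under these assumptions a 500 GB journal covers roughly 7.2 days at 5 Mb/s, which is consistent with the statement that the solution can roll back at least seven days.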

RecoverPoint configuration for this solution


The RecoverPoint Installer Wizard, which is part of the RecoverPoint Distribution Manager, guides you through the procedures for installing RecoverPoint and setting up the RPA clusters at the production and recovery sites. Before running the wizard, it is essential that the steps described in Preparing the VMAXe for RecoverPoint connectivity have been completed, as the wizard prompts you to identify the repository volumes created then.

Figure 10 shows the Prerequisites screen from the wizard, listing the conditions that must be met before using the wizard. It also shows the Summary screen, which lists the tasks completed during installation.

Figure 10. RecoverPoint Installer Wizard

Overview

When RecoverPoint installation has successfully completed, you can configure the RecoverPoint consistency group(s) required for local and remote replication of your application data. To do this, perform the following steps for each consistency group:

1. Present the storage to be replicated to the RPA clusters at both sites.

2. Tag all VMAXe devices that are to be protected by RecoverPoint.

3. Configure RecoverPoint access to the RecoverPoint splitters on the VMAXe and VNX arrays.

4. Create the RecoverPoint consistency group.

This section describes the configuration process for the solution’s Oracle_SRM consistency group.

Installing RecoverPoint

Configuring CLR


Step 1: Present the storage to be replicated to the RPA clusters

Local array

The application production devices that are to be protected by RecoverPoint must be presented to the RPA cluster. To do this:

1. Launch Symmetrix Management Console for the local array.

2. Open the initiator group for the existing auto-provisioning view that contains the devices to be protected.

3. Add the RPA initiator groups for all RPAs, as shown in Figure 11.

Figure 11. Adding RPA initiator groups

Remote array

The remote journal and the target devices for the CRR copy must be presented to the remote RPA cluster. How this is accomplished depends on the target array type. The target array for the solution is a VNX5700. In this case, a storage group containing the journal and target devices was set up and presented to the remote RPAs. A second storage group was created and presented to the ESX host at the remote site to provide access to the replica storage.

For more detailed information on configuring VNX storage for replication with RecoverPoint, consult EMC RecoverPoint Deploying with VNX/CLARiiON Arrays and Splitter—Technical Notes.

Step 2: Tag the protected devices for RecoverPoint use

In RecoverPoint splitter for VMAXe configurations, it is necessary to tag devices for RecoverPoint use. This makes the devices accessible to the splitter and also makes it easy to identify which VMAXe volumes have been set up for RecoverPoint use.

Figure 12 shows the Symmetrix Management Console procedure for tagging the devices in the storage group for the ESX server. Use the same procedure to tag the replica volumes in the RecoverpointConfig storage group.


Figure 12. Tagging devices for RecoverPoint use

Note: You must repeat this step whenever you add devices to the configuration, if they are to be protected by RecoverPoint.

Step 3: Configure RecoverPoint access to the RecoverPoint splitters

To configure RecoverPoint access to the RecoverPoint splitters on the VMAXe and VNX arrays, use the New Splitter Wizard, as shown in Figure 13.

Figure 13. Adding new splitters

Step 4: Create the consistency group

Creating a consistency group involves:

1. Defining the replication policy settings for the consistency group, the production copy (that is, the data to be protected), and the local and remote replica copies.

2. Adding replication sets to the consistency group by selecting the production volumes to be replicated and assigning the corresponding replica volumes at the local and/or remote copies.

3. Selecting the journal volumes for the production and replica journals.
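The three parts of a consistency group listed above (copies, replication sets, and journals) can be pictured as a small data model. The following is only an illustrative sketch, not a RecoverPoint API: the class names, the `problems` check, the production copy name, and all volume names are hypothetical. Only the group name Oracle_SRM and the copy names Oracle_CDP and Oracle_CRR come from the solution. The check assumes that every replication set needs a replica volume at every replica copy defined for the group.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Copy:
    name: str                      # for example "Oracle_CDP"
    role: str                      # "production", "local", or "remote"
    journal_volumes: List[str] = field(default_factory=list)


@dataclass
class ReplicationSet:
    production_volume: str
    replicas: Dict[str, str]       # replica copy name -> replica volume


@dataclass
class ConsistencyGroup:
    name: str
    copies: List[Copy]
    replication_sets: List[ReplicationSet]

    def problems(self) -> List[str]:
        """Return configuration gaps; an empty list means nothing is missing."""
        found = []
        # Every copy (production and replicas) needs at least one journal volume.
        for copy in self.copies:
            if not copy.journal_volumes:
                found.append(f"copy {copy.name} has no journal volume")
        # Every replication set needs a replica volume at each replica copy.
        replica_names = [c.name for c in self.copies if c.role != "production"]
        for rset in self.replication_sets:
            for name in replica_names:
                if name not in rset.replicas:
                    found.append(f"{rset.production_volume} has no replica at {name}")
        return found


# Hypothetical volume names; only the group and replica copy names are from the paper.
group = ConsistencyGroup(
    name="Oracle_SRM",
    copies=[
        Copy("Production", "production", ["prod_journal"]),
        Copy("Oracle_CDP", "local", ["cdp_journal"]),
        Copy("Oracle_CRR", "remote", ["crr_journal"]),
    ],
    replication_sets=[
        ReplicationSet("data_vol", {"Oracle_CDP": "data_cdp", "Oracle_CRR": "data_crr"}),
        ReplicationSet("redo_vol", {"Oracle_CDP": "redo_cdp"}),  # remote replica missing
    ],
)
print(group.problems())
```

In this example the second replication set deliberately lacks a remote replica, so the check reports one gap. In practice the New Consistency Group Wizard guides you through the same assignments.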


The New Consistency Group Wizard, in the RecoverPoint Management Application, guides you through the required steps, as shown in Figure 14. The replication policy settings are all optional as the default settings provide a practical configuration.

Figure 14. New Consistency Group Wizard steps

The consistency groups for the solution were created with the wizard and the default values were accepted for all optional settings.

Figure 15 shows the final summary screen for the Oracle_SRM consistency group, which was configured with a local copy (Oracle_CDP) and a remote copy (Oracle_CRR) to support both local data protection and remote replication. The summary screen includes the replication sets defined for the consistency group.

Figure 15. Summary of consistency group settings


Configuring the consistency group for management by SRM

After the consistency group has been created and SRM has been installed, you need to configure the consistency group for management by SRM. You do this by using the Policy settings in the RecoverPoint Management Application, as shown in Figure 16.

Figure 16. Configuring the consistency group for management by SRM


Automating site recovery with VMware vCenter Site Recovery Manager

Overview

Installing and configuring VMware vCenter Site Recovery Manager (SRM) for automated site recovery using RecoverPoint involves these steps:

1. Install and configure SRM.

2. Configure the connection between the protected and recovery sites.

3. Configure RecoverPoint array managers.

4. Configure protection groups.

5. Configure inventory mappings.

6. Customize virtual machine recovery options.

7. Customize recovery site IP addresses.

8. Create the recovery plan.

Most of these steps are executed from the SRM interface in vCenter, where intelligent wizards support quick and easy configuration. For full details of these steps, consult the VMware vCenter Site Recovery Manager Administration Guide 4.1.

It is possible to create multiple recovery plans to cater for many different recovery scenarios. In the example shown in Figure 17, three recovery plans have been created: one to fail over the entire environment, and the other two to fail over the application and database layers independently.

Figure 17. Multiple recovery plans

All these recovery plans were successfully executed as part of the solution testing. For the purpose of demonstrating the failover process using SRM and RecoverPoint, this white paper documents failover and failback of the entire production environment.

This section describes how SRM was configured for failover from the production site to the recovery site. The RecoverPoint CRR site failback process section describes how SRM was configured for failback from the recovery site to the production site.


Prerequisites

SRM has several requirements for the vSphere configuration at each site:

• Each site must include a vCenter server containing at least one vSphere data center.

• The recovery site must support array-based replication with the protected (production) site, and must have hardware and network resources that can support the same virtual machines and workloads as the protected site.

• At least one virtual machine must be located on a datastore that is replicated by RecoverPoint at the protected site.

• The protected and recovery sites must be connected by a reliable IP network. Storage arrays may have additional network requirements.

• The recovery site must have access to the same public and private networks as the protected site, though not necessarily access to the same range of network addresses.

• VMware Tools must be installed on all virtual machines.

Step 1: Install and configure SRM

Installing and configuring SRM includes the following tasks:

1. Configure SRM databases at both sites. These store the recovery plans, inventory information, and so on.

2. Install the SRM server at both sites.

3. Install the RecoverPoint Storage Replication Adapter on the SRM server at both sites. This enables SRM and RecoverPoint integration.

4. Install the SRM client plug-in into one or more vSphere Clients at either or both sites.


Step 2: Connect the protected and recovery sites

With the SRM database and SRM server installed at each site, you then need to configure the connection from the protected site to the recovery site. You do this by specifying the remote SRM server IP details in the Connect to Remote Site wizard at the protected site, as shown in Figure 18.

Figure 18. Connecting the SRM and vCenter servers at the protected and recovery sites

Step 3: Configure RecoverPoint array managers

For SRM to integrate with RecoverPoint, RecoverPoint array managers must be configured at both the protected and recovery sites. You do this by using the SRM Configure Array Managers wizard.

The wizard discovers the replicated storage devices at the protected and recovery sites, and identifies the VMFS datastores that they support. When finished, it presents a list of replicated datastore groups.

Figure 19 shows the wizard in progress, with RecoverPoint specified as the manager type and Production as the protected site. The connection information for RecoverPoint at the protected site is also specified.


Figure 19. Configuring the RecoverPoint array managers

Step 4: Configure protection groups

A protection group is a collection of virtual machines that all use the same datastore group (the same set of replicated LUNs) and that all fail over together.

To create a protection group, you select the datastore group(s) to protect, and specify a datastore group at the recovery site where SRM can create placeholders for members of the protection group. You use the Create Protection Group wizard to do this. The wizard automatically detects the datastores currently protected by RecoverPoint and allows you to select the one(s) to include in the protection group.

Two protection groups were created for the use case: one for the Oracle_SRM consistency group and the other for the App_SRM consistency group. Figure 20 shows the Oracle protection group being created.

Figure 20. Creating a protection group

Figure 20 callouts: datastore group selected for inclusion in protection group; virtual machines supported by selected datastore group.


When a protection group has been created, shadow virtual machines are automatically created in the recovery site vCenter inventory. These act as placeholders for the virtual machines that can be failed over with SRM. The shadow virtual machines cannot be started independently of SRM and are removed if their protection group is deleted.
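The relationship between datastore groups and protection groups can be sketched as a simple grouping: virtual machines that share a datastore group end up in the same protection group and fail over together. This is an illustrative sketch only, not SRM code. The function name is hypothetical, and the placement of the three virtual machines on the two datastore groups is an assumption made for the example, since the paper does not list it.

```python
from collections import defaultdict
from typing import Dict, List


def protection_groups(vm_datastore_group: Dict[str, str]) -> Dict[str, List[str]]:
    """Group virtual machines by the datastore group that holds them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for vm, datastore_group in vm_datastore_group.items():
        groups[datastore_group].append(vm)
    return dict(groups)


# Hypothetical placement of the solution's virtual machines (an assumption).
placement = {
    "Alicanto-PV1": "Oracle_SRM",
    "Swingbench": "App_SRM",
    "AppServer": "App_SRM",
}
groups = protection_groups(placement)
for datastore_group, vms in groups.items():
    print(datastore_group, vms)
```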

Step 5: Configure inventory mappings

Network, compute, and virtual machine folder resources must be configured at the recovery site in order for SRM to know which resources to use in the event of failover.

The Inventory Mappings screen lists the resources at the protected site and allows you to select the corresponding resources to use at the recovery site. Figure 21 shows the inventory mappings configured for the solution.

Figure 21. Creating inventory mappings

Note: For ease of management, it is useful to group protected virtual machines into a folder. For the solution, the protected virtual machines were grouped in the SRM Protected VMs folder.

Step 6: Customize virtual machine recovery options

After the protection groups have been created, the recovery options for individual virtual machines can be customized to suit specific requirements. The customization options available provide a powerful tool for ensuring that complex recovery plans are implemented with ease.

For example, if there are multiple virtual machines in a protection group, their startup sequence can be customized by assigning them different recovery priorities. In an Oracle database environment, this means that a recovery plan can be configured to boot the database virtual machines before the application server virtual machines.

For the solution, the Alicanto-PV1 virtual machine was assigned a recovery priority of High, as shown in Figure 22. This machine would be recovered first, before the Swingbench virtual machine (Normal priority), and the AppServer virtual machine (Low priority).


Figure 22. Modifying the startup priority of a virtual machine
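The effect of the recovery priorities assigned to the solution's virtual machines can be sketched as an ordering. This is a minimal illustration of the High, Normal, Low sequence described for the solution, not SRM's implementation; the function and dictionary names are hypothetical.

```python
from typing import Dict, List

# Lower rank means earlier startup.
PRIORITY_RANK = {"High": 0, "Normal": 1, "Low": 2}


def startup_order(vm_priorities: Dict[str, str]) -> List[str]:
    """Return virtual machine names sorted by recovery priority."""
    ranked = sorted(vm_priorities.items(), key=lambda item: PRIORITY_RANK[item[1]])
    return [name for name, _ in ranked]


# Priorities as assigned for the solution.
vms = {"AppServer": "Low", "Swingbench": "Normal", "Alicanto-PV1": "High"}
print(startup_order(vms))  # ['Alicanto-PV1', 'Swingbench', 'AppServer']
```

Because Python's sort is stable, virtual machines that share a priority keep their original relative order in this sketch.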

Step 7: Customize recovery site IP addresses

When failing over to a different data center, typically some adjustments are required to host IP settings due to infrastructure differences. When failing over an entire configuration, this can involve updating settings for multiple virtual machines.

SRM provides a bulk IP customization utility (dr-ip-customizer.exe) for automatically updating IP settings for recovered virtual machines. The utility generates a CSV file containing the IP settings for all the virtual machines that are configured for SRM failover. You can edit this file to specify the recovery site IP settings and then run the utility again to upload the new settings to the recovery site vCenter server.

For the solution, the utility was used as follows to update the recovery site IP settings:

1. Log on to the vCenter server at the recovery site.

2. Run the dr-ip-customizer.exe utility, specifying the name and location for the CSV file, as shown in Figure 23.

Figure 23. Running the IP utility to generate a CSV file

3. Edit the CSV file to provide the IP settings for the virtual machines at the recovery site. Figure 24 shows the edited file for the solution.

Figure 24. CSV file edited for recovery site IP settings


4. Run the utility to upload the new settings to the recovery site vCenter server, as shown in Figure 25.

Figure 25. Uploading the IP settings for the recovery site

Note: If you delete or re-create a protection group, you must repeat this process to reapply the IP customizations.
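The manual CSV edit in step 3 could also be scripted. The sketch below shows the idea with a deliberately simplified file layout: the column names (VM Name, IP Address, Subnet Mask, Gateway) are assumptions for illustration and are not the actual layout produced by dr-ip-customizer.exe, so check the generated file before adapting it. The IP addresses are placeholders from a documentation range.

```python
import csv
import io
from typing import Dict


def apply_recovery_ips(csv_text: str, recovery_settings: Dict[str, Dict[str, str]]) -> str:
    """Fill in recovery-site IP settings for each virtual machine row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Rows for virtual machines without new settings are written unchanged.
        row.update(recovery_settings.get(row["VM Name"], {}))
        writer.writerow(row)
    return out.getvalue()


# Hypothetical, simplified stand-in for the generated CSV file.
generated = (
    "VM Name,IP Address,Subnet Mask,Gateway\n"
    "Alicanto-PV1,,,\n"
    "Swingbench,,,\n"
)
settings = {
    "Alicanto-PV1": {
        "IP Address": "192.0.2.10",
        "Subnet Mask": "255.255.255.0",
        "Gateway": "192.0.2.1",
    },
}
edited = apply_recovery_ips(generated, settings)
print(edited)
```

The edited text would then be saved and uploaded with the utility, as in step 4.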

Step 8: Create the recovery plan

A recovery plan determines the automated steps that SRM executes when starting up a protected environment at the remote site. All recovery plans include a set of prescribed steps that are executed in a prescribed order. Any customizations you have configured prior to creating the recovery plan, such as recovery priority and IP settings, are also implemented.

To create a plan, you run the Create Recovery Plan wizard at the recovery site and simply specify the recovery plan name and the protection group to be recovered. Figure 26 shows the Oracle Failover recovery plan being created for the solution. This recovery plan fails over the entire production environment to the recovery site.

Figure 26. Creating a recovery plan


Testing the Oracle Failover recovery plan

Overview

An SRM recovery plan can be tested at any time without disrupting replication or ongoing operations at the protected site.

The test is carried out in an isolated environment at the recovery site, using a temporary copy of the replicated data. It runs all the steps in the recovery plan except powering down of virtual machines at the protected site and assumption of control of replicated data by devices at the recovery site. If the plan requests suspension of local virtual machines at the recovery site, this happens during test recovery as well.

This facility enables you to rehearse your recovery plans thoroughly and easily in order to verify their timings and reliability.

This section describes a test recovery of the Oracle Failover recovery plan. The Automating site recovery with VMware vCenter Site Recovery Manager section of the white paper describes how this recovery plan was created.

Testing the recovery plan

First of all, 4,000 records were added to the production database. A script was then run to output the record count and the timestamp of the last entry, as shown in Figure 27. This information was later used to verify the success of the test recovery.

Figure 27. Production site record count and timestamp

The test was then initiated simply by selecting the Oracle Failover recovery plan in SRM and clicking the Test button, as shown in Figure 28.

Figure 28. Starting the recovery test


When running a test plan, SRM creates an isolated TEST network on the recovery site and enables image access on the RecoverPoint consistency group. It also starts up the virtual machines replicated by RecoverPoint and performs any reconfiguration needed to access the disks on the remote array.

Figure 29 shows the recovery steps performed during testing of the Oracle Failover recovery plan. SRM first created an SRM bookmark and enabled image access on the CRR copy. It then recovered the high-priority Alicanto-PV1 virtual machine, followed by the normal-priority Swingbench virtual machine, and then the low-priority AppServer virtual machine. The virtual machines at the production site were not shut down as part of the test process.

Figure 29. Test recovery steps

At this stage, the environment is available on the remote site and administrators can verify application functionality in the secure environment created as part of the test.
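The difference between a test run and a real failover can be sketched as a filter over the plan's steps: a test skips powering down the protected-site virtual machines and the takeover of the replicated data by the recovery site, and runs everything else. The step list below is a simplified paraphrase for illustration, not the actual step list that SRM generates, and the names are hypothetical.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    description: str
    runs_in_test: bool


# Simplified, paraphrased plan; a real SRM plan contains more steps.
PLAN = [
    Step("Power down virtual machines at the protected site", False),
    Step("Recovery-site devices assume control of replicated data", False),
    Step("Recover High priority virtual machines", True),
    Step("Recover Normal priority virtual machines", True),
    Step("Recover Low priority virtual machines", True),
]


def steps_to_run(plan: List[Step], test_mode: bool) -> List[str]:
    """Return the step descriptions executed in test mode or in a real failover."""
    return [s.description for s in plan if s.runs_in_test or not test_mode]


for description in steps_to_run(PLAN, test_mode=True):
    print(description)
```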



Verifying the success of the recovery test

For the Oracle Failover recovery test, the status view in the RecoverPoint Management Application verified that RecoverPoint image access was enabled, as shown in Figure 30.

Figure 30. RecoverPoint image access enabled

To verify that IP address customizations had been applied correctly, the IP settings for the failed-over virtual machines were checked in the recovery site vCenter instance, as shown in Figure 31.

Figure 31. Verifying IP address customizations

To verify data integrity at the recovery site, a script was run on the failed-over machine to output the record count and the timestamp of the last entry. As shown in Figure 32, these matched the record count and timestamp at the protected site (see Figure 27).


Figure 32. Recovery site record count and timestamp
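The comparison of record count and last-entry timestamp between the two sites can be sketched as follows. This is not the script used in the solution, which is not reproduced in the paper. The class and function names are hypothetical, and the counts and timestamps are made-up sample values.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class TableCheck:
    record_count: int
    last_entry: datetime


def differences(production: TableCheck, recovery: TableCheck) -> List[str]:
    """List the ways the recovery-site image differs from production."""
    found = []
    if production.record_count != recovery.record_count:
        found.append(
            f"record count: {production.record_count} != {recovery.record_count}"
        )
    if production.last_entry != recovery.last_entry:
        found.append("timestamp of last entry differs")
    return found


# Made-up sample values, for illustration only.
production = TableCheck(4000, datetime(2011, 5, 31, 9, 15))
recovery = TableCheck(4000, datetime(2011, 5, 31, 9, 15))
stale = TableCheck(3990, datetime(2011, 5, 31, 9, 14))

print(differences(production, recovery))  # []
print(differences(production, stale))
```

An empty list corresponds to the outcome reported for the test recovery, where both values matched.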

To verify that the remote environment was able to process transactions, a Swingbench load, with 100 users, was run against the database at the recovery site.

The recovery test report

After verifying the success of the recovery test, the test was ended. At this point, SRM disabled image access on the CRR copy and the summary test report shown in Figure 33 was generated. The time taken for the test recovery was just over 24 minutes. This included the verification steps performed to ensure that the remote image was accessible and that the Oracle environment could process transactions.

Figure 33. Recovery test report

This test report can be used to demonstrate accurate timing of recovery plans and recovery reliability for auditing, compliance, system administration, and other business purposes.


Benefits of testing recovery plans

Having a disaster recovery plan in place is the first step to ensuring you are prepared for the worst-case scenario. However, the only way to truly know if your disaster recovery plan works is to test it. Site Recovery Manager’s ability to execute recovery plans in test mode allows you to do just that.

• Recovery plans can be tested at any time without interrupting replication or the business’s RPO.

• Accurate timing of recovery plan tests enables administrators to predict how long recovery of the virtual environment will take.

• Administrators can quickly access a replica of their entire production environment for testing purposes.

• Automatically generated summary reports can be used to demonstrate compliance with regulatory requirements.


RecoverPoint CRR site failover process

Overview

For the use case, remote site recovery was enabled by RecoverPoint continuous remote replication (CRR), integrated with VMware vCenter SRM. SRM automates the recovery process so that executing failover becomes as simple as pressing a single button. There is no need for the user to interact with the RecoverPoint console. You simply execute the recovery plan from the SRM plug-in within the recovery site vCenter instance.

This section describes how the Oracle environment was failed over to the recovery site by executing the Oracle Failover recovery plan. The Automating site recovery with VMware vCenter Site Recovery Manager section of the white paper describes how this recovery plan was created. The Testing the Oracle Failover recovery plan section describes a test run of the recovery plan.

Running the Oracle Failover recovery plan

Prior to performing the failover, some records were added to the production database. A script was then run to output the record count and the timestamp of the last entry. This information was later used to verify the success of the failover. Figure 34 shows a total of 15,000 records in the database at this time.

Figure 34. Adding records to the production database

Failing over the production environment to the recovery site involved simply running the Oracle Failover recovery plan on the recovery site vCenter server, as shown in Figure 35.

Figure 35. Running the recovery plan


When the failover process finished, SRM displayed a summary report of the recovery steps, as shown in Figure 36.

Figure 36. Oracle Failover summary report

SRM automatically executed the failover steps in the recovery plan, including any administration tasks needed to import the virtual machines to the remote vCenter instance. The recovery priority and IP setting customizations that were configured prior to creating the recovery plan were also implemented. This meant that the high-priority Alicanto-PV1 virtual machine was started before the normal-priority Swingbench virtual machine and the low-priority AppServer virtual machine.

The status view in the RecoverPoint Management Application (see Figure 37) shows the failed-over state of the Oracle_SRM consistency group after the recovery plan had been executed. Note that the replication direction has been reversed. This means that any changes at the DR site are replicated to the production site using the policies set up for consistency group Oracle_SRM.

Figure 37. Oracle_SRM consistency group in failover
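The reversal of replication direction after failover can be pictured as swapping the source and target roles of the two sites. This is an illustrative model only, not RecoverPoint behavior expressed as code. The class and method names are hypothetical, and "Production" and "DR" are used simply as site labels.

```python
from dataclasses import dataclass


@dataclass
class ReplicationLink:
    group: str
    source_site: str
    target_site: str

    def fail_over(self) -> None:
        """Swap roles: the former target becomes the replication source."""
        self.source_site, self.target_site = self.target_site, self.source_site


link = ReplicationLink("Oracle_SRM", source_site="Production", target_site="DR")
link.fail_over()
print(link.source_site, "->", link.target_site)  # DR -> Production
```

Calling `fail_over()` a second time models failback in this sketch: the roles return to their original assignment.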



Restarting the Oracle database

When an Oracle database virtual machine is failed over to a disaster recovery site, Oracle automatically recovers the crash-consistent CRR image that is presented to the recovery site by RecoverPoint. Figure 38 shows Oracle recovery of the crash-consistent image of the Alicanto-PV1 database host.

Figure 38. Oracle crash-consistent image recovery

Verifying data integrity at the recovery site

To verify data integrity at the recovery site, a script was run on the failed-over Alicanto-PV1 virtual machine to output the record count and the timestamp of the last entry. As shown in Figure 39, these matched the record count and timestamp at the production site (see Figure 34).

Figure 39. Verifying data integrity after failover


Benefits of RecoverPoint site failover with SRM

The benefits of automating failover and recovery of Oracle environments with RecoverPoint and SRM include:

• By virtualizing the operating system and applications, and replicating these with RecoverPoint, deployment time for the disaster recovery environment is eliminated. The entire operating environment is deployed once at the production site and is replicated to the remote site. The only installation required at the remote site is the VMware ESX servers, with vCenter, and the SRM servers required to manage failover.

• The tests performed showed that SRM and RecoverPoint can replicate entire Oracle environments between heterogeneous storage arrays (Symmetrix VMAXe and VNX5700).

• By implementing VMware ESX to virtualize server operating environments, the server environments at the production and recovery sites do not need to be identical.

• Automating site recovery with SRM takes the fear factor out of executing a disaster recovery plan. SRM recovery plans can be tested in advance, which means that there are no surprises when a real disaster recovery occurs.


RecoverPoint CRR site failback process

Overview

When the issues that caused an outage at the production site have been resolved, production can be returned to the production site for normal processing. The simplest way to fail back the environment is to repeat the failover process, but in the opposite direction.

This section outlines how the Oracle environment was failed back to the production site by creating and executing the Oracle Failback recovery plan. The individual steps are described in detail in the Automating site recovery with VMware vCenter Site Recovery Manager section of the white paper.

Housekeeping

Before you can configure SRM for failback, the following housekeeping steps must be carried out:

1. Save a summary report of the failover recovery plan. This can then serve as a guide for configuring failback. (This is recommended but not mandatory.)

2. Delete the failed-over virtual machines from the vCenter inventory at the production site using the command shown in Figure 40.

Figure 40. Deleting failed over virtual machines

3. Delete the existing protection groups and recovery plans at both the production and recovery sites. Figure 41 shows a recovery plan being deleted.

Figure 41. Deleting a recovery plan

4. Rescan the storage at the production site from the vCenter instance for each ESX server in the configuration. (This is recommended but not mandatory.)


Configuring SRM for failback

For the use case, configuring SRM for failback involved these steps:

1. Use the SRM Configure Array Managers wizard to configure array managers at both sites. This time, supply the IP address of the DR RPA as the protected site.

2. Create a protection group for the datastore group to be failed back.

3. Configure inventory mappings between the protected and recovery sites. The procedure is the same as that used to configure the inventory mappings for failover. However, this time the original recovery site served as the protected site for SRM.

4. Customize the recovery priority settings for the virtual machines.

5. On the SRM server at the production site, use the dr-ip-customizer.exe utility to customize the IP addresses for the production virtual machines.

6. Use the Create Recovery Plan wizard to create the Oracle Failback recovery plan at the production site. This plan is exactly the same as the plan created for failover.

Testing the failback recovery plan

After SRM had been configured, the recovery plan was tested to ensure that failback worked as expected. The procedure was the same as that described in the Testing the Oracle Failover recovery plan section of the white paper.

Failing back to the production site

After verifying that the failback test was successful, the Oracle Failback recovery plan was executed to fail back the environment to the production site and resume normal processing. The procedure was the same as that described in the RecoverPoint CRR site failover process section of the white paper.

Figure 42 shows the SRM summary report that was generated when the recovery plan had finished. This demonstrates that the entire process for failing back three virtual machines to the production array took less than 15 minutes.

Figure 42. Recovery plan summary


Prior to failing back to the production site, some processing was performed at the recovery site. A script was then run to output the record count and the timestamp of the last entry. There were 22,000 records in the database and the timestamp was 31-May-11 11.32 am, as shown in Figure 43.

Figure 43. Record count and timestamp at recovery site

After executing the Oracle Failback recovery plan, the script was run at the original production site, as shown in Figure 44. There were 22,000 records in the database and the timestamp was exactly the same as on the recovery site.

Figure 44. Record count and timestamp at original production site, after failback
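The verification above amounts to comparing two values, the record count and the timestamp of the last entry, before and after failback. The script used in the solution testing is not reproduced in this paper. The following is a minimal Python sketch of that comparison, using an in-memory SQLite database as a stand-in for Oracle. The table name test_records, the column created_at, and both helper functions are hypothetical.

```python
import sqlite3


def snapshot(conn, table="test_records", ts_column="created_at"):
    """Return (row count, latest timestamp) for the given table."""
    # The identifiers are trusted, hypothetical names, not user input.
    cur = conn.execute(f"SELECT COUNT(*), MAX({ts_column}) FROM {table}")
    count, last_ts = cur.fetchone()
    return count, last_ts


def make_site(rows):
    """Build an in-memory stand-in for the database at one site."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_records (id INTEGER PRIMARY KEY, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO test_records (created_at) VALUES (?)",
        [(r,) for r in rows],
    )
    return conn


# Illustrative data only: three records, the last written at 11:32.
rows = ["2011-05-31T11:30:00", "2011-05-31T11:31:00", "2011-05-31T11:32:00"]
recovery_site = make_site(rows)      # state before failback
production_site = make_site(rows)    # state after failback

before = snapshot(recovery_site)
after = snapshot(production_site)
failback_verified = before == after  # True when count and timestamp match
```

Comparing both values together guards against the case where the row count matches but the most recent transactions differ.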

The benefits of automating failback of Oracle environments with RecoverPoint and SRM are similar to the benefits of failing over Oracle environments with these technologies:

• Administrators can create and customize failback recovery plans to automate the failback process.

• Administrators can test their failback recovery plans in advance to verify that they work correctly.

• After a recovery plan has been set up, administrators can test or run the plan with a single click.

Verifying failback to the production site

Benefits of RecoverPoint site failback with SRM

RecoverPoint CDP database recovery

In RecoverPoint CLR configurations, you can recover your Oracle database from an image at either the local site or the remote site. You can also manually bookmark particular points in time at either site, such as an event in an application, or a point in time to which you wish to fail over. You can then recover the database from any of these bookmarks.

For the purpose of this solution, manual bookmarks were created at both the CDP and CRR sites, and recovery from both images was tested. This section describes recovery from the CDP copy.

After a point-in-time image had been manually bookmarked in the local journal, the database was intentionally corrupted by deleting the data files. Oracle was unable to start because of the missing files, so the database was recovered from the pre-corruption bookmarked image and then started.
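The idea behind bookmarked recovery can be illustrated with a toy model. The following Python sketch is not RecoverPoint's implementation or API. It assumes a simplified journal that logs every write in order, so that the image at any bookmark can be rebuilt by replaying the log up to that point. The Journal class, its methods, and the file names are hypothetical.

```python
class Journal:
    """Toy write journal: every write is logged so earlier images can be rebuilt."""

    def __init__(self):
        self._writes = []      # ordered (name, data); data=None marks a deletion
        self._bookmarks = {}   # bookmark name -> number of writes logged so far

    def write(self, name, data):
        self._writes.append((name, data))

    def bookmark(self, label):
        self._bookmarks[label] = len(self._writes)

    def image(self, upto=None):
        """Replay the first `upto` writes (all writes if None)."""
        image = {}
        for name, data in self._writes[:upto]:
            if data is None:
                image.pop(name, None)
            else:
                image[name] = data
        return image

    def image_at(self, label):
        """Rebuild the image as it was when the bookmark was taken."""
        return self.image(self._bookmarks[label])


journal = Journal()
journal.write("users01.dbf", "v1")
journal.write("system01.dbf", "v1")
journal.bookmark("pre-corruption")
journal.write("users01.dbf", None)    # simulate deleting the data files
journal.write("system01.dbf", None)

current = journal.image()                      # corrupted state: no data files
restored = journal.image_at("pre-corruption")  # state at the bookmark
```

In this model the bookmark is only a named position in the log, which is why creating one is cheap and why any number of them can coexist.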

The following steps were performed to set up the test scenario, before recovering the database:

1. Configure the database consistency group (Oracle_SRM) for management by RecoverPoint. You do this using the Policy settings in the RecoverPoint Management Application, as shown in Figure 45. This is necessary because the consistency group is being used by both RecoverPoint CDP and SRM.

Figure 45. Configuring the consistency group for management by RecoverPoint

2. Create a pre-corruption manual bookmark, as shown in Figure 46.

Figure 46. Creating a bookmark

Overview

Preparing the test scenario

3. Delete the data files from the production database. Figure 47 shows Oracle unable to start the database because of the missing files.

Figure 47. Oracle message indicating database corruption

The following steps were performed to recover the database from the pre-corruption bookmarked image:

1. Enable image access for the bookmarked image. You do this by using the Enable Image Access option for the CDP replica, as shown in Figure 48.

Figure 48. Enabling image access to the CDP replica

2. Prior to recovering the production image, shut down the production virtual machine and any other protected virtual machines in the same consistency group.

Note: Instead of recovering immediately to production, you can mount the CDP image to another ESX server in order to verify that it contains the correct data. You do not need to shut down the production virtual machine to do this, but you do need to undo any writes before recovering to production.

Recovering the database from the CDP image

3. Select the Recover Production option at the CDP replica, as shown in Figure 49, and initiate the production restore process.

Figure 49. Initiate the restore from the bookmarked image

4. Before resuming processing at the production site, enable image access for the image immediately following the Synchronization Completed image, as shown in Figure 50.

Figure 50. Enabling image access

5. Resume processing on the production site.

After recovery has been completed on the production image, the Resume Production option is enabled, as shown in Figure 51. When you click this, write splitting resumes on the production volumes.

Figure 51. Resuming production

6. Rescan the storage at the production site from the vCenter instance for each ESX server in the configuration.

7. Restart the production virtual machine and any other protected virtual machines in the same consistency group.

Oracle now performs its own recovery to the crash-consistent image that has been restored to the production volumes. If the selected image does not contain the data that was missing, you can repeat steps 1 to 6 for an earlier image.
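Repeating the procedure for an earlier image is, in effect, a search backwards through the journal for the newest image that passes verification. The following Python sketch shows that selection logic only. The image records, the datafiles_present flag, and the function name are hypothetical, and in practice verification would mean mounting the image and checking the database.

```python
def pick_recovery_image(images, is_valid):
    """Return the newest image that passes verification, or None if none does."""
    for image in sorted(images, key=lambda img: img["time"], reverse=True):
        if is_valid(image):
            return image
    return None


# Hypothetical journal contents: the data files were deleted after 11:20.
images = [
    {"time": "11:10", "datafiles_present": True},
    {"time": "11:20", "datafiles_present": True},
    {"time": "11:30", "datafiles_present": False},
]

chosen = pick_recovery_image(images, lambda img: img["datafiles_present"])
```

Searching newest-first keeps data loss to a minimum, because the first image that verifies is also the most recent usable one.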

Figure 52 shows the recovered Oracle instance following the Oracle restart.

Figure 52. Recovered Oracle instance

Figure 53 shows the record count and timestamp for the recovered database, which matches the record count in the bookmarked image.

Figure 53. Record count and timestamp for the recovered database

The benefits of using RecoverPoint for database protection include the following:

• RecoverPoint continuously records application-consistent snapshot images that are accessible through its intuitive GUI. In the event of operating system or database corruption, administrators can roll back through these images or restore the database from an image that has been manually bookmarked for a particular purpose.

• RecoverPoint provides a zero RPO, which means that administrators can quickly roll back to the point in time just before corruption occurred.

• RecoverPoint images can also be used to mount a copy of the database on a different host in order to verify it, or simply to restore part of a database file for surgical repairs.

The testing documented in this section outlines the steps required to restore an Oracle database to a consistent known good state, enabling faster recovery from a situation that could potentially involve media recovery, lead to lengthy downtime, and require manual intervention to recover to a consistent state.

Benefits of database protection with RecoverPoint

RecoverPoint operation when VMAXe is operating in a degraded mode

With Symmetrix VMAXe arrays, all directors operate in an active state at all times. This means that there is no delay in processing I/O in the event of a director failure—the failed I/O is simply transmitted down an alternate path. When the failed path comes back online, it resumes processing I/O as normal. The RecoverPoint splitter deals with the loss of a director in the same way, by redirecting split writes down alternate channels. This provides a highly available disaster recovery solution.

To test the operation of RecoverPoint when VMAXe is operating in a degraded state, one of the VMAXe front-end director ports was forced offline. SMC indicated that the director status was Dead, as shown in Figure 54.

Figure 54. SMC view of offline director status

The dead director was also detected and logged as a problem by RecoverPoint, as shown in Figure 55.

Figure 55. RecoverPoint indicating a problem communicating with the storage array

The Swingbench load generator was kept running while the director was offline. At this stage, the Oracle Failover recovery plan was run in test mode. Figure 56 shows the summary report from this test.

Overview

Setting up VMAXe degraded mode

Testing degraded mode

Figure 56. Oracle Failover recovery plan summary report

When the test recovery was complete, the database was started on the remote site to verify consistency. Figure 57 shows an extract from the Oracle alert log. This demonstrates that the database successfully restarted from the crash-consistent image at the recovery site.

Figure 57. Oracle alert log showing the database successfully restarted

While the director was offline, RecoverPoint replication continued by using the remaining paths. When the director came back online, RecoverPoint automatically returned to normal state, with no need for manual intervention.
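The alternate-path behavior described in this section can be sketched as a small model. The following Python code is an illustration only, not the Enginuity or RecoverPoint splitter implementation. It assumes a simple round-robin policy across directors, and the Director and MultipathDevice classes are hypothetical.

```python
class Director:
    """Stand-in for a front-end director that can be forced offline."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.handled = []


class MultipathDevice:
    """Spreads writes across all online directors and skips offline ones."""

    def __init__(self, directors):
        self.directors = directors
        self._next = 0  # index of the director to try first

    def write(self, io):
        n = len(self.directors)
        for attempt in range(n):
            director = self.directors[(self._next + attempt) % n]
            if director.online:
                director.handled.append(io)
                self._next = (self._next + attempt + 1) % n
                return director.name
        raise RuntimeError("no online path to the array")


a, b = Director("A"), Director("B")
device = MultipathDevice([a, b])

results = [device.write("w1"), device.write("w2")]  # both paths active
a.online = False                                    # force one director offline
results += [device.write("w3"), device.write("w4")] # redirected to the other path
a.online = True                                     # director comes back online
results.append(device.write("w5"))                  # resumes use automatically
```

No write fails while one director is offline, and the returning director is picked up again without any reconfiguration, which mirrors the behavior observed in the test.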

Conclusion

Summary

This white paper describes a data protection and disaster recovery solution for virtualized Oracle Database 11g OLTP environments, using EMC RecoverPoint and VMware vCenter Site Recovery Manager. Write splitting was implemented by RecoverPoint splitters at the production site (Symmetrix VMAXe array) and recovery site (VNX5700 array).

EMC RecoverPoint is a mature replication technology that provides granular local and remote point-in-time failover and recovery protection for data centers. In an Oracle environment, RecoverPoint can reliably protect the database by providing multiple, consistent point-in-time recovery points that are maintained by sophisticated journaling technology. The automated bookmarking and journaling enable selective rollback to a crash-consistent image of the Oracle database and very granular recovery of the environment. This provides a great deal of flexibility when recovering from a disaster, with DVR-like recovery capability.

With VMware vSphere virtualization of the database hosts, and VMware vCenter management of the virtual infrastructure, administrators can automate and test their disaster recovery plans using VMware vCenter Site Recovery Manager (SRM). SRM, integrated with RecoverPoint, enables creation and automation of sophisticated recovery plans that ensure managed and documented site recovery in the event of a disaster. SRM also enables rehearsal and modification of these recovery plans, with no impact to production or replication. This ensures reliable and predictable recovery in the event of a true disaster.

Findings

The key findings of the solution testing include:

• Integration of RecoverPoint splitter technology into the Symmetrix VMAXe and VNX arrays simplifies RecoverPoint configuration and also reduces costs, as no additional write-splitting hardware is required.

• The RecoverPoint splitter supports replication across heterogeneous storage platforms.

• RecoverPoint enables replication of entire virtualized Oracle environments between data centers for disaster recovery.

• RecoverPoint is a robust technology, with an intuitive GUI, that enables faster recovery from a DR scenario, with a granularity not possible with other replication technologies.

• RecoverPoint enables disaster recovery scenarios to be tested without affecting production or interrupting replication, and while maintaining consistency across protected consistency groups.

• Integration of RecoverPoint with vCenter Site Recovery Manager enables DR testing to be carried out in isolated environments on the recovery site so that production can remain active and replication can continue uninterrupted. SRM also documents the recovery process.

References

White papers

For additional information, see the white papers and technical notes listed below.

• Maximize Operational Efficiency for Oracle RAC with EMC Symmetrix FAST VP (Automated Tiering) and VMware vSphere: An Architectural Overview

• Improving VMware Disaster Recovery With EMC RecoverPoint: Applied Technology

• Storage Provisioning with Symmetrix Auto-provisioning Groups: Technical Notes

• EMC RecoverPoint/EX for the VMAXe Series: Applied Technology

• Rapid Deployment and Scale Out for Oracle E-Business Suite Enabled by EMC RecoverPoint, EMC Replication Manager, and VMware vSphere: A Detailed Review

• EMC RecoverPoint Deploying with VNX/CLARiiON Arrays and Splitter: Technical Notes

• Deploying RecoverPoint with Symmetrix: Technical Notes

Product documentation

For additional information, see the product documents listed below.

• EMC RecoverPoint Site Recovery Manager Failback Plug-in: Technical Notes

• EMC Solutions Enabler Symmetrix Array Controls CLI Version 7.3 Product Guide

• EMC RecoverPoint Release 3.4.1 Administrator’s Guide

Other documentation

For additional information, see the VMware documents listed below.

• Getting Started with VMware vCenter Site Recovery Manager: Site Recovery Manager 4.0 and later

• VMware vCenter Site Recovery Manager Administration Guide 4.1 and later

• Adding a DNS Update Step to a Recovery Plan: Technical Note