
Technical Report

MetroCluster in Clustered Data ONTAP 8.3 Verification Tests Using Oracle Workloads Business Workloads Group, PSE, NetApp

April 2015 | TR-4396

Abstract

This document describes the results of functional testing of NetApp® MetroCluster™ software on the NetApp clustered Data ONTAP® 8.3 operating system in an Oracle Database 11g R2

environment. Proper operation is verified as well as expected behavior during each of the test

cases. Specific equipment, software, and functional failover tests are included along with

results.


TABLE OF CONTENTS

1 Introduction

1.1 Best Practices

1.2 Assumptions

2 Executive Summary

3 Product Overview

3.1 NetApp Storage Technology

3.2 Oracle Database and Oracle Real Application Clusters

4 Challenges for Disaster Recovery Planning

4.1 Logical Disasters

4.2 Physical Disasters

5 Value Proposition

6 High-Availability Options

6.1 ASM Mirroring

6.2 Two-Site Storage Mirroring

7 High-Level Topology

8 Test Case Overview and Methodology

9 Test Results

9.1 Loss of Single Oracle Node (TC-01)

9.2 Loss of Oracle Host HBA (TC-02)

9.3 Loss of Individual Disk (TC-03)

9.4 Loss of Disk Shelf (TC-04)

9.5 Loss of NetApp Storage Controller (TC-05)

9.6 Loss of Back-End Fibre Channel Switch (TC-06)

9.7 Loss of Interswitch Link (TC-07)

9.8 Maintenance Requiring Planned Switchover from Site A to Site B (TC-08)

9.9 Disaster Forcing Unplanned Manual Switchover from Site A to Site B (TC-09)

10 Conclusion

Appendix

Detailed Test Cases

Deployment Details

Network

Data Layout


Materials List

LIST OF TABLES

Table 1) Test case summary.

Table 2) Oracle host specifications.

Table 3) Oracle specifications.

Table 4) Kernel parameters.

Table 5) Oracle initiation file parameters.

Table 6) NetApp storage specifications.

Table 7) Server network specifications.

Table 8) Storage network specifications.

Table 9) FC back-end switches.

Table 10) Materials list for testing.

LIST OF FIGURES

Figure 1) MetroCluster overview.

Figure 2) MetroCluster mirroring.

Figure 3) Test environment.

Figure 4) Data layout.

Figure 5) Test phases.

Figure 6) Loss of Oracle node.

Figure 7) Loss of an Oracle server host HBA.

Figure 8) Loss of an individual disk.

Figure 9) Loss of disk shelf.

Figure 10) Loss of NetApp storage controller.

Figure 11) Loss of an FC switch.

Figure 12) Loss of ISL.

Figure 13) Loss of primary site for planned maintenance.

Figure 14) Loss of primary site.

Figure 15) Aggregate and volume layouts and sizes.

Figure 16) Volume and LUN layouts for site A.


1 Introduction

This document describes the results of a series of tests demonstrating that an Oracle Database 11g R2

Real Application Cluster (RAC) database configured in a NetApp MetroCluster solution in clustered Data

ONTAP 8.3 operates without problems while under load in a variety of possible failure scenarios.

The tests simulate several different failure scenarios. This technical report documents their effects on the

Oracle Database 11g database environment. The tests were conducted while both the database servers

and the NetApp storage controllers were subjected to a heavy transactional workload meant to increase

the amount of stress on the system as well as to better represent more of a real-world environment.

In order to pass a test, the MetroCluster cluster had to remain online and accessible while the database

continued to serve I/O in the environment without errors.

1.1 Best Practices

This document should not be interpreted as a best practice guide for using solutions with Oracle

databases on NetApp MetroCluster software in clustered Data ONTAP 8.3. Customer requirements vary,

and therefore configurations vary as well. The configuration described in this document reflects the most

common two-site customer need encountered by NetApp, but many others exist as well. In addition,

NetApp MetroCluster technology is not required for Oracle RAC. Most Oracle RAC clusters on NetApp

storage do not require synchronous remote replication and therefore are used with standard Data ONTAP

clusters, not with MetroCluster clusters. Although Oracle Database 11g R2 was used for these tests, the

principles are equally applicable to Oracle Database 12c and later. The 11g R2 version was chosen as

the most mature, stable, and commonly used version of Oracle RAC.

Although the Fibre Channel (FC) protocol is used in this document, the same overall design and

procedures can be used for NFS and iSCSI.

For more information about all other configuration details, including Oracle database and kernel

parameters, see the appendix of this document. For general Oracle best practices, including those for

Oracle RAC, see TR-3633: Best Practices for Oracle Databases on NetApp Storage.

1.2 Assumptions

Throughout this document, the examples assume two physical sites, site A and site B. Site A represents

the main data center on campus. Site B is the campus disaster recovery (DR) location that provides

protection during a complete data center outage. All components are named to show clearly where they

are physically located.

It is also assumed that the reader has a basic familiarity with both NetApp and Oracle products.

2 Executive Summary

MetroCluster in clustered Data ONTAP 8.3 provides native continuous availability for business-critical

applications, including Oracle. The testing demonstrated that our Oracle Database 11g R2 RAC cluster

operated as expected in the MetroCluster environment under a moderate to heavy transactional workload

when subjected to a variety of failure scenarios that resulted in limited, moderate, and complete disruption

to the systems in our primary production site.

These tests show that NetApp MetroCluster technology and the Oracle RAC database together provide a

winning combination for continuous application availability.

3 Product Overview

This section describes the NetApp and Oracle products used in the solution.


3.1 NetApp Storage Technology

This section describes the NetApp hardware and software used in the solution.

FAS8000 Series Storage Systems

NetApp FAS8000 series storage systems combine a unified scale-out architecture with leading data-

management capabilities. They are designed to adapt quickly to changing business needs while

delivering core IT requirements for uptime, scalability, and cost efficiency. These systems offer the

following advantages:

Speed the completion of business operations. Leveraging a new high-performance, multicore architecture and self-managing flash acceleration, FAS8000 unified scale-out systems boost throughput and decrease latency to deliver consistent application performance across a broad range of SAN and NAS workloads.

Streamline IT operations. Simplified management and proven integration with cloud providers let you deploy the FAS8000 in your data center and in a hybrid cloud with confidence. Nondisruptive operations simplify long-term scaling and improve uptime by facilitating hardware repair, tech refreshes, and other updates without planned downtime.

Deliver superior total cost of ownership. Proven storage efficiency and a two-fold improvement in price/performance ratio over the previous generation reduce capacity utilization and improve long-term return on investment. NetApp FlexArray™ storage virtualization software lets you integrate existing arrays with the FAS8000, increasing consolidation and providing even greater value to your business.

Clustered Data ONTAP Operating System

NetApp clustered Data ONTAP 8.3 software delivers a unified storage platform that enables unrestricted,

secure data movement across multiple cloud environments and paves the way for software-defined data

centers, offering advanced performance, availability, and efficiency. Data ONTAP clustering capabilities

help you keep your business running nonstop.

Clustered Data ONTAP is an industry-leading storage operating system. Its single feature-rich platform

allows you to scale infrastructure without increasing IT staff. Clustered Data ONTAP provides the

following benefits:

Nondisruptive operations:

Perform storage maintenance, hardware lifecycle operations, and software upgrades without interrupting your business.

Eliminate planned and unplanned downtime.

Proven efficiency:

Reduce storage costs by using one of the most comprehensive storage efficiency offerings in the industry.

Consolidate and share the same infrastructure for workloads or tenants with different performance, capacity, and security requirements.

Seamless scalability:

Scale capacity, performance, and operations without compromise, regardless of application.

Scale SAN and NAS from terabytes to tens of petabytes without reconfiguring running applications.

MetroCluster Solution

A self-contained solution, NetApp MetroCluster high-availability (HA) and DR software lets you achieve

continuous data availability for mission-critical applications at half the cost and complexity.


MetroCluster software combines array-based clustering with synchronous mirroring to deliver continuous

availability and zero data loss. It provides transparent recovery from most failure scenarios so that critical

applications continue running uninterrupted. It also eliminates repetitive change-management activities to

reduce the risk of human error and administrative overhead.

New MetroCluster enhancements deliver the following improvements:

Local node failover in addition to site switchover

End-to-end continuous availability in a virtualized environment with VMware HA and fault tolerance

Whether you have a single data center, a campus, or a metropolis-wide environment, use the cost-

effective NetApp MetroCluster solution to achieve continuous data availability for your critical business

environment.

Figure 1 shows a high-level view of a MetroCluster environment that spans two data centers separated by

a distance of up to 200km. MetroCluster software in clustered Data ONTAP 8.3 provides the following

features:

MetroCluster consists of an independent two-node cluster at each site, with the sites up to 200km apart.

Each site serves data to local clients or hosts and acts as secondary to the other site.

The client/host network spans both sites, just as with fabric and stretch MetroCluster.

Interswitch links (ISLs) and redundant fabrics connect the two clusters and their storage.

All storage is fabric attached and visible to all nodes.

Local HA handles almost all planned and unplanned operations.

Switchover and switchback transfer the entire cluster's workload between sites.

Figure 1) MetroCluster overview.

SyncMirror Mirroring

NetApp SyncMirror®, an integral part of MetroCluster, combines the disk-mirroring protection of RAID 1

with industry-leading NetApp RAID technology. During an outage—whether from a disk problem, a cable

break, or a host bus adapter (HBA) failure—SyncMirror can instantly access the mirrored data without

operator intervention or disruption to client applications. SyncMirror maintains strict physical separation

between two copies of your mirrored data. Each copy is called a plex. As Figure 2 shows, each

controller’s data has its “mirror” at the other location.


Figure 2) MetroCluster mirroring.

With MetroCluster, all mirroring is performed at an aggregate level so that all volumes are automatically

protected with one simple replication relationship. Other protection solutions operate at an individual

volume level. This means that to protect all of the volumes (which could be hundreds), some type of

replication relationship must be created after each source and destination volume is created.
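
To illustrate this aggregate-level approach, the sketch below shows how a mirrored (SyncMirror) aggregate might be created on a MetroCluster node from the clustered Data ONTAP CLI. The aggregate name, node name, and disk count are hypothetical placeholders, not values from the tested configuration.

cluster_A::> storage aggregate create -aggregate aggr_data_siteA -node cluster_A-01 -diskcount 20 -mirror true

Every volume subsequently created in this aggregate is protected by the same mirror, with no per-volume replication relationship to configure.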

3.2 Oracle Database and Oracle Real Application Clusters

The Oracle Database 11g R2 Enterprise Edition provides industry-leading performance, scalability,

security, and reliability on clustered or single servers with a wide range of options to meet the business

needs of critical enterprise applications.

Oracle Database with Real Application Clusters (RAC) brings an innovative approach to the challenges of

rapidly increasing amounts of data and demand for high performance. In the scale-out model of Oracle

RAC, active-active clusters use multiple servers to deliver high performance, scalability, and availability,

making Oracle Database 11g the ideal platform for private and public cloud deployments.

Oracle RAC clusters running on extended host clusters provide the highest level of Oracle capability for availability, scalability, and low-cost computing. Oracle RAC also supports popular packaged products such as SAP, PeopleSoft, Siebel, and Oracle E-Business Suite, as well as custom applications.

4 Challenges for Disaster Recovery Planning

Disaster recovery (DR) is defined as the processes, policies, and procedures related to preparing for

recovery or continuation of technical infrastructure critical to an organization after a natural disaster (such

as flood, tornado, volcano eruption, earthquake, or landslide) or a human-induced disaster (such as a

threat having an element of human intent, negligence, or error or involving a failure of a human-made

system).

DR planning is a subset of a larger process known as business continuity planning, and it should include

planning for the resumption of applications, data, hardware, communications (such as networking), and

other IT infrastructure. A business continuity plan (BCP) includes planning for non-IT related aspects such


as key personnel, facilities, crisis communication, and reputation protection, and it should refer to the

disaster recovery plan (DRP) for IT-related infrastructure recovery or continuity.

Generically, a disaster can be classified as either logical or physical. Both categories are addressed with

HA, recovery processing, and/or DR processes.

4.1 Logical Disasters

Logical disasters include, but are not limited to, data corruption by users or technical infrastructure.

Technical infrastructure disasters can result from file system corruption, kernel panics, or even system

viruses introduced by end users or system administrators.

4.2 Physical Disasters

Physical disasters include the failure of any storage component at site A or site B that exceeds the resiliency features of a NetApp HA controller pair without MetroCluster and that would normally result in downtime or data loss.

In certain cases, mission-critical applications should not be stopped even in a disaster. By leveraging

Oracle RAC extended-distance clusters and NetApp storage technology, it is possible to address those

failure scenarios and provide a robust deployment for critical database environments and applications.

5 Value Proposition

Typically, mission-critical applications must be implemented with two requirements:

RPO = 0 (recovery point objective equal to zero), meaning that data loss from any type of failure is unacceptable

RTO ~= 0 (recovery time objective as close to zero as possible), meaning that the time to recovery from a disaster scenario should be as close to 0 minutes as possible

The combination of Oracle RAC on extended-distance clusters with NetApp MetroCluster technology

meets these RPO and RTO requirements by addressing the following common failures:

Any kind of Oracle Database instance crash

Switch failure

Multipathing failure

Storage controller failure

Storage or rack failure

Network failure

Local data center failure

Complete site failure

6 High-Availability Options

Multiple options exist for spanning sites with an Oracle RAC cluster. The best option depends on the

available network connectivity, the number of sites, and customer business needs. NetApp Professional

Services can offer assistance with configuration planning and, when necessary, can offer Oracle

consulting services as well.

6.1 ASM Mirroring

Automatic storage management (ASM) mirroring, also called ASM normal redundancy, is a frequent

choice when only a very small number of databases must be replicated. In this configuration, the Oracle


RAC nodes span sites and leverage ASM to replicate data. Storage mirroring is not required, but

scalability is limited because, as the number of databases increases, the administrative burden to maintain

many mirrored ASM disk groups becomes excessive. In these cases, customers generally prefer to mirror

data at the storage layer.

This approach can be configured with and without a tiebreaker to control the Oracle RAC cluster quorum.
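
For comparison with the storage-mirroring approach used in these tests, the illustrative sketch below shows what ASM normal-redundancy mirroring across two sites typically looks like; the disk group name, failure group names, and device paths are hypothetical examples, not part of the tested configuration.

CREATE DISKGROUP data_mirr NORMAL REDUNDANCY
  FAILGROUP site_a DISK '/dev/mapper/asm_siteA_01', '/dev/mapper/asm_siteA_02'
  FAILGROUP site_b DISK '/dev/mapper/asm_siteB_01', '/dev/mapper/asm_siteB_02';
-- Each extent is mirrored across the two failure groups, one per site.

Every additional database replicated this way requires its own mirrored disk groups, which is the administrative burden noted above.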

6.2 Two-Site Storage Mirroring

The configuration chosen for these tests was two-site storage mirroring because it reflects the most

common use of site-spanning Oracle RAC with MetroCluster.

As described in detail in the following section, this option establishes one of the sites as a designated

primary site and the other as the designated secondary site. This is done by first selecting one site as the active storage site and then placing two of the Oracle Cluster Ready Services (CRS) and voting resources on it. The other site holds a synchronous but passive replica; it does not directly serve data, and it contains only one CRS and voting resource.

7 High-Level Topology

Figure 3 shows the architecture of the configuration used for our validation testing. These tests used a

two-node Oracle RAC database environment with a RAC node deployed at both site A and site B with the

following specifics:

The sites were separated by a 20km distance, and fiber spools were used for both the MetroCluster and the RAC nodes.

The RAC configuration used the FC protocol and ASM to provide access to the database.

A WAN emulator was used to simulate a 20km distance between the RAC nodes for the private interconnect and to introduce approximately 10ms of latency into the configuration.

Figure 3) Test environment.


The Oracle binaries were installed locally on each server. The configuration included the following

specific arrangements:

The data and logs were mirrored to site B with the storage controllers at site B acting strictly in a passive DR capacity.

Using a single front-end fabric spanning both sites, the FC LUNs were presented to both Oracle RAC nodes by the storage controllers at site A.

There were three OCR disks: two at site A and one at site B.

There were three voting disks: two at site A and one at site B.
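
The placement of these Clusterware files can be confirmed from either RAC node with the standard Oracle Clusterware utilities shown below; this is an illustrative check rather than output captured during the tests.

crsctl query css votedisk    # lists the voting disks and their locations
ocrcheck                     # reports the OCR locations and their integrity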

Figure 4 shows the distribution of the Oracle logs and data files across both controllers on site A. For

more information about the hardware and software used in this test configuration, see the appendix of this

document.

Figure 4) Data layout.

8 Test Case Overview and Methodology

All of the test cases listed in Table 1 were executed by injecting a specific fault into an otherwise

nominally performing system under a predefined load driven to the Oracle RAC cluster. The load was

generated by a utility called the Silly Little Oracle Benchmark (SLOB), and it used a combination of 90%

reads and 10% writes with a 100% random access pattern.

This load delivered more than 70K IOPS evenly across the FAS8060 controllers at site A, resulting in

storage CPU and disk utilization of 30% to 40% during the tests. The goal was not to measure

performance of the overall environment specifically but to subject the environment to a substantial load

during testing.
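
For readers who want to reproduce a similar load profile, the sketch below shows the kind of SLOB parameter-file (slob.conf) settings that yield a 90/10 read/write random mix over a 60-minute run. Parameter names vary between SLOB releases, so treat these values as assumptions rather than the exact settings used in these tests.

UPDATE_PCT=10     # 10% of operations are writes; the remaining 90% are random reads
RUN_TIME=3600     # 60-minute run, matching the per-test duration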


To increase the load on the test environment, we made sure that the Oracle RAC node that was installed

in site B participated in the load generation by driving IOPS to the FAS8060 storage controllers on site A

across the network.

Table 1) Test case summary.

Test Case Description

TC01 Loss of a single Oracle node

TC02 Loss of an Oracle host HBA

TC03 Loss of an individual disk in an active data aggregate

TC04 Loss of an entire disk shelf

TC05 Loss of a NetApp storage controller

TC06 Loss of a back-end FC switch on the MetroCluster cluster

TC07 Loss of an ISL

TC08 Sitewide maintenance requiring a planned switchover from site A to site B

TC09 Sitewide disaster requiring an unplanned manual switchover from site A to site B

For more information about how we conducted each of these tests, see the appendix of this document.

Each test was broken into the following three phases:

1. A baseline stage, indicative of normal operations. A typical duration for this stage was 15 minutes.

2. A fault stage, during which the specific fault under test was injected and allowed to continue in this stage for 15 minutes to provide sufficient time to verify correct database behavior.

3. A recovery stage, in which the fault was corrected and database behavior was verified. When applicable, this stage generally included 30 additional minutes of run time after the fault was corrected.

Figure 5 shows the process. Before each stage of a specific test, we used the automatic workload

repository (AWR) functionality of the Oracle database to create a snapshot of the current condition of the

database. After the test was complete, we captured the data between the snapshots to understand the

impact of the specific fault on database performance and behavior. Finally, we monitored the CPU, IOPS,

and disk utilization on the storage controllers throughout the tests.

Figure 5) Test phases.
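
The snapshots that bound each phase can be taken manually with the standard Oracle AWR interfaces sketched below; the package call and report script are stock Oracle components, shown here only to illustrate the procedure.

-- Mark the start or end of a test phase:
SQL> EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
-- Generate the AWR report for the interval between two snapshots:
SQL> @?/rdbms/admin/awrrpt.sql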


9 Test Results

The following sections summarize the tests that were performed and report the results of each.

9.1 Loss of Single Oracle Node (TC-01)

This test case resulted in the loss of the Oracle RAC node on site B. As Figure 6 shows, this loss was

accomplished by powering off the Oracle RAC database node while it was under load. For this test, we

ran the workload for a total of 60 minutes and allowed the RAC node to be disabled for 15 minutes before

restarting it to correct the fault.

Figure 6) Loss of Oracle node.

As expected, we observed no impact to the Oracle RAC functionality during this test. Also as expected,

we observed a larger impact to the overall performance driven from the database because the loss of one

of the database nodes reduced the amount of I/O data driven to the FAS8060 controllers on site A. The

database remained operational during the 15 minutes we allowed the test to continue in the failed state.

To correct the failure, we powered on the RAC node located at site B and observed that it was correctly

added back into the RAC environment. We then started the workload again on both RAC nodes to verify

that they were both operating correctly.

9.2 Loss of Oracle Host HBA (TC-02)

This test resulted in the loss of an HBA on one of the Oracle RAC nodes. As Figure 7 shows, this loss

was accomplished by removing the cables from an HBA on the Oracle node at site A. For this test, we ran

the workload for a total of 60 minutes and allowed the HBA to be disconnected for 15 minutes before

reconnecting it to correct the fault.


Figure 7) Loss of an Oracle server host HBA.

As expected, during this test we observed no impact to the Oracle RAC functionality while the database

servers continued to drive load to the FAS8060 controllers on site A. The database remained operational

during the 15 minutes we allowed the test to continue in the failed state.

To correct the failure, we reconnected the HBA on the RAC node located at site A and verified that it

ultimately started participating in the workload again after a brief time. We observed no database errors

during this test.
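
This behavior depends on the host multipathing layer rather than on the database. Assuming the RAC nodes use Linux DM-Multipath (the multipathing stack is not detailed in this report, so this is an assumption), path health before and after the cable pull could be checked as sketched below.

multipath -ll        # lists each multipath device and the state of its paths
sanlun lun show -p   # NetApp Host Utilities view of LUN paths, if the utilities are installed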

9.3 Loss of Individual Disk (TC-03)

This test resulted in the loss of a disk on one of the storage controllers. As Figure 8 shows, this loss was

accomplished by removing one of the disks on an active data aggregate at site A. For this test, we ran the

workload for a total of 60 minutes and allowed the disk to be removed for 15 minutes before reinserting it

to correct the fault.


Figure 8) Loss of an individual disk.

As expected, during this test we observed no impact to the Oracle RAC functionality and minimal impact

to the overall performance driven from the database while both RAC nodes continued to drive load to the

FAS8060 controllers on site A. The database remained operational during the 15 minutes we allowed the

test to continue in the failed state.

Note: NetApp RAID DP® technology can survive the failure of two disks per RAID group, and it automatically reconstructs the data onto a spare disk.

9.4 Loss of Disk Shelf (TC-04)

This test resulted in the loss of an entire shelf of disks on one of the FAS8060 storage controllers. As

Figure 9 shows, this loss was accomplished by powering off one of the disk shelves at site A. For this

test, we ran the workload for a total of 60 minutes and allowed the disk shelf to be powered off for 15

minutes before reapplying power to correct the fault.


Figure 9) Loss of disk shelf.

As expected, during this test we observed no impact to the Oracle RAC functionality and minimal impact

to the overall performance driven from the database while both RAC nodes continued to drive load to the

FAS8060 controllers on site A. The database remained operational during the 15 minutes we allowed the

test to continue in the failed state.

With the use of SyncMirror in MetroCluster, shelf failure at either site is transparent. There are two plexes,

one at each site. In normal operation, all reads are fulfilled from the local plex, and all writes are

synchronously updated on both plexes. If one plex fails, reads continue seamlessly on the remaining plex,

and writes are directed to the remaining plex. If the hardware can be powered on for recovery, the

resynchronization of the recovered plex is automatic. If the failed shelf must be replaced, the new disks

are added to the mirrored plex. Afterward, resynchronization again becomes automatic.
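
An illustrative example of checking the plex state from the clustered Data ONTAP CLI is shown below; the aggregate name is a placeholder. In normal operation both plexes report as online and active; after the shelf loss, the affected plex is marked failed and resynchronizes automatically once power is restored.

cluster_A::> storage aggregate plex show -aggregate aggr_data_siteA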

9.5 Loss of NetApp Storage Controller (TC-05)

This test resulted in the unplanned loss of an entire storage controller. As Figure 10 shows, this was

accomplished by powering off one of the FAS8060 storage controllers at site A. The surviving storage

controller automatically took over the workload that was initially shared evenly across both storage

controllers.

Note: The storage controller takeover and giveback process used for this test differs from the MetroCluster switchover and switchback process used in test cases TC-08 and TC-09.

For this test we ran the workload for a total of 60 minutes and allowed the controller to be powered off for

15 minutes before reapplying power to correct the fault and performing a storage controller giveback to

bring both FAS8060 controllers back on line at site A.


Figure 10) Loss of NetApp storage controller.

As expected, during this test we observed no impact to the Oracle RAC functionality, with a larger impact

to the overall performance driven from the database while both RAC nodes continued to drive load to the

surviving FAS8060 storage controller, albeit at a lower rate because of the nature of the failure.

After performing a storage controller giveback to rectify the failure, we allowed the test to continue for an

additional 30 minutes and observed that overall performance returned to prefailure levels. We continued

to observe no problems with the operation of the database.
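
The takeover and giveback used in this test are standard ONTAP HA operations. An illustrative sketch of the sequence, with a placeholder node name, is shown below.

cluster_A::> storage failover takeover -ofnode cluster_A-01
cluster_A::> storage failover show
cluster_A::> storage failover giveback -ofnode cluster_A-01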

9.6 Loss of Back-End Fibre Channel Switch (TC-06)

This test resulted in the loss of one of the MetroCluster FC switches. As Figure 11 shows, this loss was

accomplished by powering off one of the switches at site A. For this test, we ran the workload for a total of

60 minutes and allowed the switch to be powered off for 15 minutes before reapplying power to correct

the fault.


Figure 11) Loss of an FC switch.

As expected, during this test we observed no impact to the Oracle RAC functionality and minimal impact

to the overall performance driven from the database. In this case, continuous operation was maintained

by automatically moving all of the I/O across the surviving path to the LUNs on the surviving switch.

After rectifying the failure by reapplying power to the switch, we allowed the test to continue for an

additional 30 minutes and observed that overall performance was maintained at prefailure levels. We

continued to observe no problems with the operation of the database.

9.7 Loss of Interswitch Link (TC-07)

This test resulted in the loss of one of the ISLs on the MetroCluster FC switches. As Figure 12 shows, this

loss was accomplished by unplugging the ISL on one of the switches at site A. For this test, we ran the

workload for a total of 60 minutes and allowed the ISL to be disconnected for 15 minutes before

reconnecting it to correct the fault.


Figure 12) Loss of ISL.

As expected, during this test we observed no impact to the Oracle RAC functionality and minimal impact

to the overall performance driven from the database. In this case, continuous operation was maintained

by automatically moving all of the I/O across the surviving paths.

After rectifying the failure by reconnecting the ISL to the switch, we allowed the test to continue for an

additional 30 minutes and observed that the overall performance was maintained at prefailure levels. We

continued to observe no problems with the operation of the database.

9.8 Maintenance Requiring Planned Switchover from Site A to Site B (TC-08)

This test resulted in the planned switchover of the FAS8060 storage controllers on site A in order to

conduct a maintenance operation. As Figure 13 shows, this was accomplished by executing a

MetroCluster switchover that changed the LUNs serving the RAC database from site A to those that were

mirrored at site B.


Figure 13) Loss of primary site for planned maintenance.

For this test, we ran the workload for a total of 60 minutes. After 15 minutes, we initiated the MetroCluster

switchover command from the FAS8060 controllers on site B. After the switchover was successfully

completed, we observed that the workload was picked up by the FAS8060 controllers at site B and that

both Oracle RAC nodes continued to operate normally without interruption.

Note: The switchover was accomplished by using a single command to switch over the entire storage resource from site A to site B while preserving the configuration and identity of the LUNs. The result was that no action, rediscovery, remapping, or reconfiguration was required from the perspective of the Oracle RAC database.

We allowed the test to continue in the switched-over state for another 15 minutes and then initiated the

MetroCluster switchback process to restore site A as the primary site for the Oracle RAC database. After

successfully completing the MetroCluster switchback process, we observed the FAS8060 in site A

resuming the processing of the workload from both RAC nodes, and the database operation continued

without interruption.

During this test, we observed no problems with the operation of the Oracle RAC database.
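
For reference, a planned MetroCluster switchover and switchback in clustered Data ONTAP 8.3 follows the command pattern sketched below, issued from the site B cluster. This is a simplified sketch, not a complete maintenance runbook.

cluster_B::> metrocluster switchover
cluster_B::> metrocluster operation show
(perform the planned maintenance on site A, then boot the site A controllers)
cluster_B::> metrocluster heal -phase aggregates
cluster_B::> metrocluster heal -phase root-aggregates
cluster_B::> metrocluster switchback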

9.9 Disaster Forcing Unplanned Manual Switchover from Site A to Site B (TC-09)

This test resulted in the unexpected complete loss of site A because of an unspecified disaster. As Figure

14 shows, this loss was accomplished by powering off both of the FAS8060 storage controllers and the

Oracle RAC node located at site A.

Our expectation was that the second Oracle RAC node running on site B would lose access to the

database LUNs hosted on the FAS8060 controllers on site A and shut down. After officially declaring the

loss of site A, we manually initiated a MetroCluster switchover from site A to site B and restarted the

Oracle RAC database instance on site B.


Figure 14) Loss of primary site.

For this test, we ran the workload for a total of 60 minutes. After 15 minutes, we powered off both of the

FAS8060 controllers and the Oracle RAC node on site A. We continued in this state for a total of 15

minutes. As expected, the Oracle RAC database node that was running on site B lost access to the voting

LUNs on site A and stopped working after exceeding the defined timeout period. As discussed previously,

this interruption occurred because of the lack of a third-site tiebreaker service, which is the most common

configuration chosen by customers. If completely seamless DR capability is desired, this can be

accomplished through the use of a third site.

We then initiated the MetroCluster switchover command from the FAS8060 controllers on site B. After the

switchover was completed, we restarted the Oracle RAC node on site B and observed that it started

normally without additional manual intervention after the database LUNs were redirected to the copies

that had been mirrored to the FAS8060 controllers on site B.

To verify that the database was working, we initiated the workload from the surviving RAC node and

observed that it successfully drove IOPS to the FAS8060 controllers on site B.

We allowed the test to continue for an additional 15 minutes and then initiated the MetroCluster

switchback process to restore site A as the primary site for the Oracle RAC database. After successfully

completing the MetroCluster switchback process, we restarted the Oracle RAC server on site A and

verified that it was added back into the cluster. We then restarted the workload and verified that the

Oracle RAC nodes on site A and site B were again driving load to the FAS8060 controllers on site A.
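
In the disaster case, the switchover must be forced because the site A cluster cannot participate. An illustrative sketch of the forced variant is shown below; the healing and switchback steps that follow site A recovery use the same commands as in the planned case.

cluster_B::> metrocluster switchover -forced-on-disaster true
cluster_B::> metrocluster operation show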


10 Conclusion

NetApp MetroCluster software in clustered Data ONTAP 8.3 provides native continuous availability for

business-critical applications, including Oracle. Our tests demonstrated that even under heavy

transactional workloads Oracle databases continue to function normally during a wide variety of failure

scenarios that could potentially cause downtime and data loss.

In addition, clustered Data ONTAP provides the following benefits:

Nondisruptive operations leading to zero data loss

Set-it-once simplicity

Zero change management

Lower cost and complexity than competitive solutions

Seamless integration with storage efficiency, SnapMirror®, nondisruptive operations, and virtualized

storage

Unified support for both SAN and NAS

Together, these products create a winning combination for continuous data availability.

Appendix

This appendix provides detailed information about the test cases described in this document as well as

about deployment, the network, the data layout, and the list of materials used.

Detailed Test Cases

TC-01: Loss of Single Oracle Node

Test Case Details

Test case number TC-01

Test case description No single point of failure should exist in the solution. Therefore, the loss of one of the Oracle servers in the cluster was tested. This test was accomplished by halting a host in the cluster while running a test workload.

Test assumptions A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results The loss of an Oracle RAC node causes no interruption of Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at a lower rate because of the loss of one of the RAC nodes. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.


2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Halt one of the Oracle RAC servers and allow the test to continue for 15 minutes.

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Bring the halted server back online and verify that it is placed back into the RAC environment.

7. Allow the test to continue for the remainder of the 60-minute duration.

8. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault.

TC-02: Loss of Oracle Host HBA

Test Case Details

Test number TC-02

Test case description No single point of failure should exist in the solution. Therefore, the loss of an HBA on one of the Oracle servers in the cluster was tested. This test was accomplished by disconnecting the cable from an FC HBA on one of the RAC nodes while running a test workload.

Test assumptions A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results Removal of the HBA connection from the Oracle RAC node causes no interruption of Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at prefailure levels. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Remove the cable from the FC HBA on the Oracle RAC server on site B and allow the test to continue for 15 minutes.

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Reinstall the cable.

7. Allow the test to continue for the remainder of the 60-minute duration.

8. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.


TC-03: Loss of Individual Disk

Test Case Details

Test number TC-03

Test case description No single point of failure should exist in the solution. Therefore, the loss of a single disk was tested. This test was accomplished by removing a disk drive from the shelf hosting the database data files on the FAS8060 running on site A while running an active workload.

Test assumptions A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results The removal of the disk drive causes no interruption of Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at prefailure levels. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Remove one of the disks in an active aggregate and allow the test to continue for 15 minutes.

5. Reinstall the disk drive.

6. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

7. Allow the test to continue for the remainder of the 60-minute duration.

8. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.


TC-04: Loss of Disk Shelf

Test Case Details

Test number TC-04

Test case description No single point of failure should exist in the solution. Therefore, the loss of an entire shelf of disks was tested. This test was accomplished by turning off both power supplies on one of the disk shelves hosting the database data files on the FAS8060 running on site A while running an active workload.

Test assumptions A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results The loss of a disk shelf causes no interruption of the Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at prefailure levels. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Turn off the power supplies on the designated disk shelf and let the test continue for 15 minutes (a storage-side verification example is shown after this procedure).

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Turn on the power supplies on the affected disk shelf.

7. Allow the test to continue for the remainder of the 60-minute duration.

8. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.
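The documented methodology captures only database- and controller-level metrics, but the effect of the shelf loss on the mirrored aggregates can also be observed from the clustershell. A minimal sketch using standard clustered Data ONTAP commands (the siteA prompt is illustrative, and these checks are not part of the recorded procedure):

siteA::> storage aggregate plex show
# One plex of each affected mirrored aggregate is expected to report a failed state while the shelf is down.
siteA::> storage disk show -broken
# Lists the disks that became unavailable with the shelf.
siteA::> storage aggregate show
# The data aggregates should remain online, served from the surviving plex.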


TC-05: Loss of NetApp Storage Controller

Test Case Details

Test number: TC-05

Test case description: No single point of failure should exist in the solution. Therefore, the loss of one of the FAS8060 controllers serving the database on site A was tested while an active workload was running.

Test assumptions: A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture:

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results: The loss of a controller in an HA pair has no impact on Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at a lower rate while one storage controller is halted and the surviving storage controller handles the entire workload.

After the storage giveback process is completed, performance returns to prefailure levels because both storage controllers are again servicing the workload. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Without warning, halt one of the controllers of the FAS8060 HA pair on site A.

5. Initiate a storage takeover by the surviving node and let the test continue for 15 minutes (example takeover and giveback commands are shown after this procedure).

6. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

7. Reboot the halted storage controller.

8. Initiate a storage giveback operation to bring the failed node back into the storage cluster.

9. Allow the test to continue for the remainder of the 60-minute duration.

10. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.
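The takeover and giveback in steps 5 and 8 use the standard storage failover commands. A minimal sketch, assuming the halted node is stl-mcc-01-01 (a node name taken from Table 8; whether that node was the one halted in the test is an assumption):

siteA::> storage failover takeover -ofnode stl-mcc-01-01
siteA::> storage failover show
# After the halted controller has been rebooted (step 7):
siteA::> storage failover giveback -ofnode stl-mcc-01-01
siteA::> storage failover show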


TC-06: Loss of Back-End Fibre Channel Switch

Test Case Details

Test number: TC-06

Test case description: No single point of failure should exist in the solution. Therefore, the loss of an entire FC switch supporting the MetroCluster cluster was tested. This test was accomplished by simply removing the power cord from one of the Brocade 6510 switches in site A while running an active workload.

Test assumptions: A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture:

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results: The loss of a single MetroCluster FC switch causes no interruption of the Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at prefailure levels. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Power off one of the MetroCluster Brocade 6510 switches in site A and allow the test to run for 15 minutes (a MetroCluster health-check example is shown after this procedure).

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Power on the Brocade 6510 switch.

7. Allow the test to continue for the remainder of the 60-minute duration.

8. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.
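MetroCluster health during and after the switch outage can be confirmed from either cluster with the built-in check commands. This verification is not part of the recorded procedure; it is shown only as a hedged example:

siteA::> metrocluster check run
# Run the MetroCluster component checks
siteA::> metrocluster check show
# Display the summary of the check results
siteA::> metrocluster show
# Display the overall MetroCluster mode and state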


TC-07: Loss of Interswitch Link

Test Case Details

Test number: TC-07

Test case description: No single point of failure should exist in the solution. Therefore, the loss of one of the ISLs was tested. This test was accomplished by removing the FC cable between two Brocade 6510 switches on site A and site B while running an active workload.

Test assumptions: A completely operational NetApp MetroCluster cluster has been properly installed and configured.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture:

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers

Expected results: The loss of one of the ISLs between site A and site B causes no interruption of the Oracle RAC operation. During the failure period, IOPS continue to the FAS8060 at prefailure levels. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Disconnect one of the MetroCluster ISLs.

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Reconnect the affected ISL.

7. Allow the test to continue for the remainder of the 60-minute duration.

8. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.


TC-08: Maintenance Requiring Planned Switchover from Site A to Site B

Test Case Details

Test number: TC-08

Test case description: If there is a required maintenance window for the FAS8060 storage controllers at site A, the MetroCluster switchover feature should be capable of moving the production workload to site B and presenting the Oracle RAC database LUNs from the FAS8060 storage controllers at site B, allowing the database to continue operations. To test this premise, we initiated a MetroCluster switchover from site A to site B and then a switchback to site A after the maintenance was complete.

Test assumptions: A completely operational NetApp MetroCluster cluster has been installed and configured properly.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture:

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers on site A and site B

Expected results: Moving the production operations from site A to site B by using the MetroCluster switchover operations causes no interruption of the Oracle RAC operation. After the MetroCluster switchover, IOPS are directed to the FAS8060 storage controllers at site B from both RAC nodes. After the MetroCluster switchback, IOPS are again directed at the FAS8060 storage controllers on site A. No database errors are detected.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. On site B, initiate a MetroCluster switchover of production operations and let the test continue to run in switchover mode for 15 minutes (an example command sequence is shown after this procedure).

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Heal the aggregates on site A.

7. Perform a MetroCluster switchback to return to normal operation.

8. Verify successful switchback.

9. Allow the test to continue for the remainder of the 60-minute duration.

10. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.
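The switchover, healing, and switchback in steps 4 through 8 correspond to the standard MetroCluster commands in clustered Data ONTAP 8.3. A minimal sketch, run from the site B cluster (the siteB prompt is illustrative, confirmation prompts are omitted, and the exact options used in the tests were not recorded):

siteB::> metrocluster switchover
# Negotiated (planned) switchover of site A operations to site B
siteB::> metrocluster operation show
# Confirm that the switchover operation completed successfully
siteB::> metrocluster heal -phase aggregates
siteB::> metrocluster heal -phase root-aggregates
# Heal the site A aggregates after the maintenance is finished
siteB::> metrocluster switchback
siteB::> metrocluster operation show
# Verify the successful switchback (step 8)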


TC-09: Disaster Forcing Unplanned Manual Switchover from Site A to Site B

Test Case Details

Test number: TC-09

Test case description: If an unplanned disaster at site A takes out the FAS8060 storage controllers and the Oracle RAC node at site A, the MetroCluster switchover feature should be capable of moving the production workload to site B and presenting the Oracle RAC database LUNs from the FAS8060 storage controllers at site B, allowing the database to continue operations.

To test this premise, we powered off the FAS8060 storage controllers and the Oracle RAC node located at site A to simulate a site failure. We then manually initiated a MetroCluster switchover and switchback from site A to site B and then back to site A after mitigating the disaster at site A.

Test assumptions: A completely operational NetApp MetroCluster cluster has been properly installed and configured.

A completely operational Oracle RAC environment has been installed and configured.

The SLOB utility has been installed and configured to generate a workload consisting of 90% reads and 10% writes with a 100% random access pattern.

Test data or metrics to capture:

AWR data as described in section 8, “Test Case Overview and Methodology”

IOPS, CPU, and disk utilization data from both NetApp FAS8060 controllers on site A and site B

Expected results: As a result of the disaster, the FAS8060 at site A is lost, which ultimately causes the RAC node at site B to lose access to the database LUNs and stop running. Manually moving the production operations from site A to site B through the MetroCluster switchover operations allows the Oracle RAC database to be restarted by using the surviving database node.

After the MetroCluster switchover and restart of the database, IOPS are directed to the FAS8060 storage controllers at site B from the surviving RAC node on site B. After the disaster is repaired and the MetroCluster switchback is completed, the repaired Oracle RAC node on site A is restarted and added back into the database. IOPS are again directed at the FAS8060 storage controllers on site A from both Oracle RAC nodes.

Test Methodology

1. Initiate the defined workload by using the SLOB tool for a total of 60 minutes. SLOB generates an initial AWR snapshot.

2. Allow the workload to run for 15 minutes to establish consistent performance.

3. Initiate an AWR snapshot to capture database-level IOPS and latency information before the fault is injected.

4. Power off the FAS8060 storage controllers and the Oracle RAC node at site A to simulate the site disaster. Then, on site B, initiate a forced MetroCluster switchover of production operations, restart the database on the surviving RAC node, and let the test continue to run in switchover mode for 15 minutes (an example switchover command is shown after this procedure).

5. Initiate an AWR snapshot to capture database-level IOPS and latency during the fault.

6. Heal the aggregates on site A.

7. Perform a MetroCluster switchback to return to normal operation.

8. Verify successful switchback.

9. Allow the test to continue for the remainder of the 60-minute duration.


10. SLOB creates a final AWR snapshot at the end of the test to capture database-level IOPS and latency for the period after the fault is corrected.
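Because site A is down in this scenario, the switchover is forced rather than negotiated. A minimal sketch of the commands involved, run from the surviving site B cluster (the exact invocation used in the tests is an assumption):

siteB::> metrocluster switchover -forced-on-disaster true
# Forced switchover when the partner site is unreachable
siteB::> metrocluster operation show
# Confirm that the switchover completed
# After site A is repaired, healing and switchback follow the same
# heal -phase aggregates, heal -phase root-aggregates, and switchback
# sequence shown for TC-08.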

Deployment Details

In this section, the deployment details of the architecture are listed in Table 2 through Table 6.

Table 2) Oracle host specifications.

Oracle Hosts

Server: Two Fujitsu Primergy RX300 S7 servers

Operating system: Red Hat Enterprise Linux 6.5

Memory: 132GB

Network interfaces: Eth0: 10000Mb/sec, MTU=9,000

Eth1: 10000Mb/sec, MTU=9,000

Eth2: 1000Mb/sec, MTU=1,500

Eth3: 1000Mb/sec, MTU=1,500

HBA: QLogic QLE2562 PCI-Express dual-channel 8Gb FC HBA

Host attach kit and version: NetApp Linux Host Utilities version 6.2

Multipathing: Yes

SAN switches, models, and firmware: Brocade 6510, Fabric OS v7.0.2c

Local storage used: RHEL 6.5 only

Table 3) Oracle specifications.

Oracle

Version: 11.2.0.4.0

ASM (SAN only): 11.2.0.4.0

Oracle CRS (SAN only): 11.2.0.4.0

For these tests, we set the Oracle RAC parameters misscount and disktimeout to 120 and 300 seconds, respectively. These parameters control the amount of time the RAC nodes wait after losing access to storage and/or network heartbeats before taking themselves out of the cluster to prevent a potential split-brain situation. These values should be changed from the defaults only with a careful understanding of the storage, network, and cluster layout.
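These Oracle Clusterware settings can be inspected and changed with crsctl. A minimal sketch, run as a privileged user on one of the RAC nodes (the exact commands used during setup were not recorded):

# Display the current CSS values
crsctl get css misscount
crsctl get css disktimeout

# Set the values used in these tests (120-second network heartbeat, 300-second disk timeout)
crsctl set css misscount 120
crsctl set css disktimeout 300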


Table 4) Kernel parameters.

Kernel Parameters: /etc/sysctl.conf File

kernel.sem 250 32000 100 128

kernel.shmmni 4096

net.ipv4.ip_local_port_range 6815744

net.core.rmem_default 4194304

net.core.rmem_max 16777216

net.core.wmem_default 262144

net.core.wmem_max 16777216

net.ipv4.ipfrag_high_thresh 524288

net.ipv4.ipfrag_low_thresh 393216

net.ipv4.tcp_rmem 4096 524288 16777216

net.ipv4.tcp_wmem 4096 524288 16777216

net.ipv4.tcp_timestamps 0

net.ipv4.tcp_sack 0

net.ipv4.tcp_window_scaling 1

net.core.optmem_max 524287

net.core.netdev_max_backlog 2500

sunrpc.tcp_slot_table_entries 128

sunrpc.udp_slot_table_entries 128

net.ipv4.tcp_mem 16384 16384 16384

fs.file-max 6815744

fs.aio-max-nr 1048576

net.ipv4.tcp_no_metrics_save 1

net.ipv4.tcp_moderate_rcvbuf 0

vm.swappiness 0
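These settings are typically placed in /etc/sysctl.conf and loaded with sysctl. A brief sketch of applying and spot-checking them (standard Linux administration, not specific to this configuration):

# Load the parameters from /etc/sysctl.conf into the running kernel
sysctl -p

# Spot-check a few of the values afterward
sysctl kernel.sem fs.file-max vm.swappiness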


Table 5) Oracle initialization file parameters.

Oracle init.ora Parameters

MCCDB2.__db_cache_size 3G

MCCDB1.__db_cache_size 3G

MCCDB1.__java_pool_size 67108864

MCCDB1.__large_pool_size 83886080

MCCDB2.__oracle_base '/u01/app/oracle' # ORACLE_BASE set from environment

MCCDB1.__oracle_base '/u01/app/oracle' # ORACLE_BASE set from environment

MCCDB2.__pga_aggregate_target 300M

MCCDB1.__pga_aggregate_target 419430400

MCCDB2.__sga_target 4G

MCCDB1.__sga_target 4294967296

MCCDB2.__shared_io_pool_size 0

MCCDB1.__shared_io_pool_size 0

MCCDB2.__shared_pool_size 300M

MCCDB1.__shared_pool_size 922746880

MCCDB2.__streams_pool_size 0

MCCDB1.__streams_pool_size 0

*.audit_file_dest '/u01/app/oracle/admin/MCCDB/adump'

*.audit_trail 'db'

*.cluster_database TRUE

*.compatible '11.2.0.4.0'

*.control_files '+FRA/MCCDB/control01.ctl','+FRA/MCCDB/control02.ctl'

*.db_block_size 8192

*.db_domain ''

*.db_name 'MCCDB'

*.db_writer_processes 20

*.diagnostic_dest '/u01/app/oracle'

*.dispatchers '(PROTOCOL=TCP) (SERVICE=MCCDBXDB)'

MCCDB1.instance_number 1

MCCDB2.instance_number 2

*.log_buffer 102400000


*.open_cursors 300

*.pga_aggregate_target 400M

*.processes 1500

*.remote_listener 'rac-mcc:1521'

*.remote_login_passwordfile 'exclusive'

*.sessions 1655

*.sga_target 4294967296

MCCDB2.thread 2

MCCDB1.thread 1

MCCDB2.undo_tablespace 'UNDOTBS2'

MCCDB1.undo_tablespace 'UNDOTBS1'
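A listing like Table 5 can be produced from a running instance by dumping the server parameter file to a readable pfile. A minimal sketch (the output path is illustrative):

sqlplus / as sysdba <<'EOF'
CREATE PFILE='/tmp/initMCCDB.ora' FROM SPFILE;
EXIT;
EOF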

Table 6) NetApp storage specifications.

NetApp Storage

Model: Four FAS8060 storage systems (2 two-node clusters)

Number of disks: 192

Size of disks: 838.36GB

Drive type: SAS

Shelf type: DS2246

Number of shelves: 8

Operating system: Data ONTAP 8.3RC1

Flash Cache™: 1TB

Network interface card (NIC): Dual 10GbE controller IX1-SFP+

Target HBA: QLogic 8324 (2a, 2b)

Back-end switches: Four Brocade 6510 (kernel 2.6.14.2; Fabric OS v7.0.2c; firmware made on Fri Feb 22 21:29:23 2013; flash Mon Nov 4 18:39:15 2013; BootProm 1.0.9)

Software: NFS, CIFS, FCP, FlexClone®, OnCommand® Balance


Network

Table 7, Table 8, and Table 9 list the network details.

Table 7) Server network specifications.

Hostname Interface IP Address Speed MTU Purpose

stlrx300s7-85 eth0 172.20.160.100 10Gb/s 9,000 RAC interconnect

eth0:1 169.254.76.209 10Gb/s 9,000

eth2 10.61.164.204 1Gb/s 1,500 Public

eth2:1 10.61.164.138 1Gb/s 1,500 Public VIP

eth2:2 10.61.164.140 1Gb/s 1,500 Mgmt

eth2:3 10.61.164.142 1Gb/s 1,500 Mgmt

stlrx300s7-87 eth0 172.20.160.102 10Gb/s 9,000 RAC interconnect

eth0:1 169.254.180.210 10Gb/s 9,000

eth2 10.61.164.206 1Gb/s 1,500 Public

eth2:1 10.61.164.141 1Gb/s 1,500 Public VIP

eth2:5 10.61.164.139 1Gb/s 1,500 Public VIP

Table 8) Storage network specifications.

SVM LIF Node Port IP Address Speed MTU Role

Cluster stl-mcc-01-01_clus1 stl-mcc-01-01 e0a 169.254.228.130 10Gb 9,000 Cluster

Cluster stl-mcc-01-01_clus2 stl-mcc-01-01 e0c 169.254.183.28 10Gb 9,000 Cluster

Cluster stl-mcc-01-02_clus1 stl-mcc-01-02 e0a 169.254.32.214 10Gb 9,000 Cluster

Cluster stl-mcc-01-02_clus2 stl-mcc-01-02 e0c 169.254.235.240 10Gb 9,000 Cluster

Stl-mcc-01 cluster_mgmt stl-mcc-01-01 e0i 10.61.164.172 1Gb 1,500 Cluster mgmt

Stl-mcc-01 stl-mcc-01-01_icl1 stl-mcc-01-01:e0b 10.61.164.176 10Gb 1,500 Intercluster

Stl-mcc-01 stl-mcc-01-01_mgmt1 stl-mcc-01-01 e0i 10.61.164.170 1Gb 1,500 Node mgmt

Stl-mcc-01 stl-mcc-01-02_icl1 stl-mcc-01-02 e0b 10.61.164.177 10Gb 1,500 Intercluster

Stl-mcc-01 stl-mcc-01-02_mgmt1 stl-mcc-01-02 e0i 10.61.164.171 1Gb 1,500 Node mgmt

Table 9) FC back-end switches.

Hostname IP Address

FC_switch_A1 10.61.164.166

FC_switch_A2 10.61.164.167

FC_switch_B1 10.61.164.168


FC_switch_B2 10.61.164.169

Data Layout

Figure 15 and Figure 16 show the layout of the data.

Figure 15) Aggregate and volume layouts and sizes.


Figure 16) Volume and LUN layouts for site A.


Materials List

Table 10 lists the materials used in the testing.

Table 10) Materials list for testing.

Quantity Description

2 HA pairs of FAS8060 (total 4 nodes)

4 Brocade 6510 switches for back-end MC SAN

4 FC/SAS bridges

8 DS2246 disk shelves with 900GB SAS drives


Refer to the Interoperability Matrix Tool (IMT) on the NetApp Support site to validate that the exact product and feature versions described in this document are supported for your specific environment. The NetApp IMT defines the product components and versions that can be used to construct configurations that are supported by NetApp. Specific results depend on each customer's installation in accordance with published specifications.

Trademark Information

NetApp, the NetApp logo, Go Further, Faster, ASUP, AutoSupport, Campaign Express, Cloud ONTAP, Customer Fitness, Data ONTAP, DataMotion, Fitness, Flash Accel, Flash Cache, Flash Pool, FlashRay, FlexArray, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexVol, FPolicy, GetSuccessful, LockVault, Manage ONTAP, Mars, MetroCluster, MultiStore, NetApp Insight, OnCommand, ONTAP, ONTAPI, RAID DP, SANtricity, SecureShare, Simplicity, Simulate ONTAP, Snap Creator, SnapCopy, SnapDrive, SnapIntegrator, SnapLock, SnapManager, SnapMirror, SnapMover, SnapProtect, SnapRestore, Snapshot, SnapValidator, SnapVault, StorageGRID, Tech OnTap, Unbound Cloud, and WAFL are trademarks or registered trademarks of NetApp, Inc., in the United States and/or other countries. A current list of NetApp trademarks is available on the Web at http://www.netapp.com/us/legal/netapptmlist.aspx.

Cisco and the Cisco logo are trademarks of Cisco in the U.S. and other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such. TR-4396-0415

Copyright Information

Copyright © 1994–2015 NetApp, Inc. All rights reserved. Printed in the U.S. No part of this document covered by copyright may be reproduced in any form or by any means—graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system—without prior written permission of the copyright owner.

Software derived from copyrighted NetApp material is subject to the following license and disclaimer:

THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.

The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).