Maximizing Data Center Investments For Disaster Recovery And Business Resiliency

by Stephanie Balaouras and Galen Schreck

October 5, 2007

For IT Infrastructure & Operations Professionals


© 2007, Forrester Research, Inc. All rights reserved. Forrester, Forrester Wave, RoleView, Technographics, and Total Economic Impact are trademarks of Forrester Research, Inc. All other trademarks are the property of their respective companies. Forrester clients may make one attributed copy or slide of each figure contained herein. Additional reproduction is strictly prohibited. For additional reproduction rights and usage information, go to www.forrester.com. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change. To purchase reprints of this document, please email [email protected].

Includes data from Business Data Services. Client Choice topic.

EXECUTIVE SUMMARY

Building a data center is a massive investment. It requires investment in real estate, reinforced facilities, raised floors, state-of-the-art power and cooling, and IT infrastructure such as networks, servers, and storage, not to mention the experienced data center staff to manage it all. Firms build new data centers for a variety of reasons: capacity limitations, modernization, consolidation, and many others. But firms also need an alternate data center that’s an appropriate distance away so they can failover critical business operations in the event of a primary site failure. Given the necessary investment, an alternate data center simply can’t remain idle waiting for some disaster to occur. Firms must determine ways to maximize this investment to improve business operations, accelerate growth, or elevate availability.

TABLE OF CONTENTS

DR Is A Fundamental Consideration In Data Center Design And Build-Out

Site Integrity Mitigates The Most Common Causes Of Downtime

Site Selection Reduces The Risk Of Natural And Manmade Disasters

Enterprises Demand Rapid Site Recovery

Multi-Data-Center Companies Are Using Their Other Sites For Recovery

Evolving To Active-Active Data Center Site Configurations

What Are The Typical Configurations For Active-Active Sites?

What Are The Key Considerations In Active-Active Data Center Design?

Enabling Disaster Recovery Technologies

RECOMMENDATIONS

Don’t Plan Your Next Data Center Move Without A DR Plan

Supplemental Material

NOTES & RESOURCES

Forrester interviewed eight vendor and user companies, including Bank of America, GoldenGate Software, IBM, Hitachi Data Systems, Network Appliance, Oracle, and VMware.

Related Research Documents

“Six Years After 9/11, Most Firms Are Not Ready For Another Disaster” September 11, 2007

“What Can Enterprise IT Learn From The Web Giants?” August 15, 2007

“Market Overview: Business Continuity Planning Software” May 30, 2007

“Planning Your Next Disaster” April 18, 2007

This is the first document in the “Next-Generation Data Center Design Strategies” series.

by Stephanie Balaouras and Galen Schreck with Simon Yates and Walid Saleh


DR IS A FUNDAMENTAL CONSIDERATION IN DATA CENTER DESIGN AND BUILD-OUT

Disaster recovery (DR) preparedness continues to be a major IT theme, particularly for North American enterprises. According to Forrester’s Business Technographics® November 2006 North American And European Enterprise IT Budgets And Spending Survey, 21% of North American enterprises and 14% of European enterprises identified “significantly upgrading disaster recovery capabilities” as a critical IT priority. Disaster recovery presumes that you have an alternate site to which you can recover your primary data center operations. According to Forrester’s Business Data Services Enterprise And SMB Hardware Survey, North America and Europe, Q3 2007, 65% of respondents have at least one alternate data center (see Figure 1-1).

Due to capacity limitations, modernization, and data center consolidation initiatives, many firms are re-evaluating the location of existing data centers and re-examining their data center facility profile. This can often lead to a new data center build-out or at least leasing data center floor space from a collocation provider that has already made the investment in a data center facility. This re-evaluation and re-examination provides an opportunity to address some of the most fundamental disaster recovery considerations in data center design and build-out — site integrity, site selection, and site recovery.

· Current data centers do not have adequate site integrity. More often than not, data centers are located near or in company facilities such as headquarters. Often data centers start in one or two dedicated computer rooms and then expand into other makeshift rooms as the company grows. They don’t have sufficient building integrity, backup power, or cooling. They can often be located in parts of the building very susceptible to floods.

· Current data centers may be located in high-risk areas. Data centers are rarely carefully planned or carefully located at a site that has the lowest possible risk from natural or manmade disasters. They’re typically located in or near company headquarters or other significant business operations, which are often in major urban centers that can be the target of manmade disaster and disruptions.

· Companies do not take advantage of existing data centers to act as internal recovery sites. Companies often have multiple data centers regionally and internationally, but, because of federated or distributed IT governance, one IT group may decide to outsource disaster recovery to a service provider when it could have used another company data center as a recovery site and kept control of disaster recovery preparedness in-house.

Site Integrity Mitigates The Most Common Causes Of Downtime

Today, when companies build out new data centers, they are building these data centers to at least a tier three data center specification as outlined by the Uptime Institute.1 A tier three data center has some of the following characteristics: uninterruptible power supplies, backup power generation, redundant and diverse connections to the power grid, a standalone building, concurrently maintainable infrastructure (redundant systems and distribution paths), and on-site staff to monitor and correct any facility issues.

A tier three data center will mitigate many of the preventable events such as power outages, network failures, and hardware failures. Companies often prepare for the most extreme natural or manmade disasters, such as hurricanes or terrorist events, but in reality much more mundane events, such as power outages or a flood in the building, are the most common causes of downtime and declared disasters. Disaster declaration and site recovery are expensive and risky, and companies should avoid declaring a disaster unless it’s absolutely necessary.

Meeting a tier three data center specification doesn’t always require a new data center build-out. IT can retrofit and upgrade existing data centers. The toughest specification to meet is the requirement to have a standalone building. This is the specification that often forces companies to relocate their data center away from company headquarters and to either build out a new data center or lease data center floor space from a collocation provider.

Site Selection Reduces The Risk Of Natural And Manmade Disasters

Conducting a risk assessment is critical to the location of your production and recovery sites. It is also critical to the determination of the appropriate geographic distance between sites. Before selecting a data center site, it’s important that the business and IT conduct a risk assessment of the potential location options.

A local risk assessment identifies the likeliest set of threat events such as natural and manmade disasters. Companies should strive to locate their data centers in regions with the lowest possible threat profile. There are consultants who specialize in conducting local threat assessments, but often this information can be obtained from government agencies, such as FEMA in the United States, or even from your insurance provider. Threat assessments are typically led by operational risk management in conjunction with IT operations and IT security.

Another important aspect of site selection is geographic distance. Unfortunately, there is no “rule of thumb” for distance between sites, and companies should not attempt to compare themselves to their peers on this count. Geographic distance between production and recovery sites is a tradeoff between achieving enough distance to escape local threats and managing technology limitations and cost. Excessive site separation can affect recovery time, raise costs for network equipment and bandwidth, increase application latency, and impact staff. The greater the distance, the more difficult it might be for staff to travel from the affected data center to the recovery site. If staff can’t travel, companies will need either to fully staff the recovery site with qualified data center employees or to invest in enough management software to manage the data center remotely (often termed a “lights-out data center”). However, if the sites are too close, your data centers could be subject to the same risks.
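To make the latency side of this tradeoff concrete, the following minimal sketch estimates the best-case round-trip delay that site separation adds to every exchange between data centers. It is an illustration rather than a sizing tool. It assumes that signals travel through optical fiber at roughly 200,000 km per second, and the function names and example distances are hypothetical. Real fiber routes are longer than the straight-line distance, and network equipment adds further delay, so actual figures will be higher.

```python
# Assumption: light travels through optical fiber at roughly 200,000 km/s
# (about two-thirds of its speed in a vacuum), or about 5 microseconds per km.
FIBER_KM_PER_SECOND = 200_000.0


def round_trip_ms(route_km: float) -> float:
    """Best-case round-trip propagation delay in milliseconds.

    Ignores switching, queuing, and protocol overhead, so real delays are higher.
    """
    if route_km < 0:
        raise ValueError("route_km must be non-negative")
    return 2.0 * route_km / FIBER_KM_PER_SECOND * 1000.0


def max_serial_sync_writes_per_second(route_km: float) -> float:
    """Upper bound on back-to-back writes when each write must wait for one
    round trip to the remote site before the next one can start."""
    rtt_seconds = round_trip_ms(route_km) / 1000.0
    return float("inf") if rtt_seconds == 0 else 1.0 / rtt_seconds


if __name__ == "__main__":
    for km in (25, 100, 1000):  # illustrative route lengths only
        print(f"{km:>5} km: {round_trip_ms(km):.2f} ms round trip, "
              f"{max_serial_sync_writes_per_second(km):,.0f} serial writes/s at most")
```

The second function shows why distance matters most when a remote site must acknowledge each write before the next can proceed: the round trip caps how many dependent writes can complete per second, regardless of bandwidth.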


Enterprises Demand Rapid Site Recovery

Given the total cost of downtime — which includes lost revenue, lost worker productivity, and lost market share — older approaches to disaster recovery, such as cold site recovery or even recovery from a shared IT infrastructure, are no longer adequate. Typically, a shared site infrastructure model can’t support recovery time objectives of less than 24 hours because data and system restores must be done from tape — and first the tapes must be found and then shipped to the site. Increasingly, companies prefer a standby, dedicated IT infrastructure (servers, storage, and network) that mirrors their production IT configuration and is ready to take over production processing at any time, regardless of whether this infrastructure is provisioned internally or the company uses a third-party service provider.

The question of whether or not you need to engage a third-party disaster recovery service provider such as IBM, HP, or SunGard depends on whether or not you have an alternate data center for recovery and the internal expertise to manage a complex disaster recovery solution. Most companies that turn to a service provider usually do so because they either don’t have a second, hardened data center; don’t have the budget to invest in redundant IT infrastructure and therefore would like to take advantage of a shared site infrastructure or the investments in IT infrastructure already made by the service provider; or they are seeking a second data center much further out of region than their current data centers. In addition, the service provider can provide a range of disaster recovery management services.

Multi-Data-Center Companies Are Using Their Other Sites For Recovery

Increasingly, very large companies that have multiple internal data centers will use these data centers as recovery sites for each other. They are deploying technologies such as server virtualization and storage area networks at multiple data centers and creating their own DR configurations using array-, appliance-, database-, and host-based replication.

Server virtualization increases flexibility in resource “repurposing” between DR and other deferrable workloads such as development, test, and QA at the alternate site. In addition, these companies want to keep DR in-house. Because they want to maintain control of scheduling and executing disaster recovery tests, they’re reluctant to sign long contracts (often three to five years) without guarantees and strict service-level agreements. They also need the ability to remain flexible, especially if the company has plans for mergers and acquisitions or even divestitures of existing business entities. As our data shows, of companies that have an alternate site, 46% own the site and 18% use collocation (see Figure 1-2).

EVOLVING TO ACTIVE-ACTIVE DATA CENTER SITE CONFIGURATIONS

It’s rare for a firm to build out a new data center that would act as an idle standby for a production data center in case something bad happens at the primary site. According to the Uptime Institute, a new 20,000-square-foot (1,800-square-meter) tier three data center costs more than $44 million.2 The only realistic way a company can take DR back in-house is if the company can also take advantage of the alternate site to run deferrable workloads or even production workloads.


Companies must take steps to move to more “active-active” data center configurations, and enterprise architects and data center planners must take disaster recovery into consideration at the start of any data center consolidation initiative. It’s much more difficult to build resiliency into applications, IT environments, and data center sites after the fact.

Cost is always a consideration, but it also turns out that in certain industries — particularly financial services — 24x7 availability demands and competition are also drivers of the evolution to active-active data center configurations. Customers expect their banking services, such as ATM transactions and online services, to be available at all times. In fact, according to Forrester’s Enterprise And SMB Hardware Survey, North America And Europe, Q3 2007, more companies — across all industries — identified the requirement to stay online and competitive 24x7 as well as the requirement to improve the availability of a mission-critical application as the drivers to improve disaster recovery capabilities. So there’s some evidence to suggest that, at least among enterprises that have an alternate site, they’ve graduated — from the basic requirement for disaster recovery preparedness as dictated by the cost of downtime and regulatory compliance — to business resiliency (see Figure 2).

Figure 1 How Enterprises Provision Their Recovery Site

1-1 65% of respondents said they have at least one alternate backup data center . . .

“Do you have a backup data center or other site that acts as a failover location in the event of a data center failure? If yes, how many?”

Yes, one: 49%
Yes, more than one: 16%
No, but one is planned in the next 12 months: 10%
No, and one is not planned in the next 12 months: 17%
Don’t know: 7%

Base: 189 data center decision-makers at North American and European enterprises

1-2 . . . of which 46% said they own their recovery site

“How do you provision your recovery site?”

We own the site: 46%
Shared IT infrastructure at a service provider: 19%
Collocation site: 18%
Dedicated IT infrastructure at a service provider: 12%
Other: 6%

Base: 124 data center decision-makers at North American and European enterprises with a backup data center

Source: Enterprise And SMB Hardware Survey, North America And Europe, Q3 2007

Source: Forrester Research, Inc.


Figure 2 What’s Driving The Need To Upgrade DR Capabilities?

“What is driving the need to improve your disaster recovery capabilities?”

Cost of downtime: 56%
Improving availability of mission-critical application: 52%
Requirement to stay online and competitive 24x7: 48%
Increased risk: 44%
Fiduciary responsibility to stakeholders: 44%
Regulatory or legal drivers: 35%
Other: 3%

Base: 124 data center decision-makers at North American and European enterprises with a backup data center (multiple responses accepted)

Source: Enterprise And SMB Hardware Survey, North America And Europe, Q3 2007

Source: Forrester Research, Inc.

What Are The Typical Configurations For Active-Active Sites?

There are several ways you can approach the development of truly “active-active” data center configurations. It’s not an all-or-nothing approach and it’s more of an evolution than a revolution. Many of the customers interviewed for this report, or who have engaged with Forrester in the past, take the evolution to active-active data center configurations one application at a time, or they evolve from a more passive configuration to a more active configuration as they gain knowledge, expertise, and confidence. Example configurations include:

· Active-passive with secondary workload offload. In this configuration, the recovery site is used for non-production applications and workloads. This configuration requires unidirectional data replication from the primary site to the recovery site. The IT infrastructure at the recovery site is used for read-only and nonproduction workloads such as business intelligence and reporting. In addition, copies of production data are used for application development and testing, as well as to offload backups. End users report that a major benefit of this configuration is the ability to system-test new functionality or upgrades in an identical environment before introducing it to production. It’s also much easier to schedule and conduct large-scale disaster recovery tests without affecting production applications.


· Active-passive with planned workload rotation. This is a twist on the active-passive with secondary workload offload configuration. Rather than use the recovery site as a passive site to offload nonproduction applications, there is a planned rotation of the production workloads from the production site to the recovery site, and the recovery site then becomes the production site for an extended period of time; it doesn’t immediately fail back. It requires unidirectional replication between the sites. The benefit of this approach is that you know you have the ability to declare a disaster and failover to the alternate site relatively smoothly because you’ve already done it on multiple occasions. It also gives you the opportunity to conduct any required maintenance at the alternate site without disrupting production applications. This is a more popular configuration than you might think. IBM is a major proponent of this configuration, and it’s the foundation for IBM’s “Integrated Testing Facility” data center strategy.3

· Active-active with application separation. In this configuration, a firm must separate production applications between two sites. So, you run some of your production applications at one site and some at the other, and you have standby servers at each site. The data for each application is replicated to its standby servers, and each data center acts as the recovery site for the other in the event of disaster or disruption. This is an increasingly common configuration because companies often have a natural separation of applications by business use. For example, financial accounting and HR applications are run at one site, and engineering- and manufacturing-related applications are run at the other.

· Active-active with multiple application instances. In this configuration, an instance of an application runs at each site. An external resource broker routes transactions to a particular data center based on a policy such as geography or utilization. Bidirectional data replication is used to keep both instances of the application in sync. If one of the applications or data centers were to fail, transactions would automatically continue to process at the other site, so end users would essentially never notice any downtime. One application instance can be taken down for maintenance while the other remains active. The underlying replication solution should have collision handling, to prevent two different users from making conflicting changes to the same application data, and loop detection, to keep a replicated change from being sent back to the database where it originated as if it were a new change. This type of replication capability is available from companies like GoldenGate Software or ZeroNines.

· Active-active with a single-instance application stretch cluster. In this configuration, a single instance of an application runs across two sites using database and operating system clustering technologies and synchronous data replication. An example of this is Oracle’s Extended Distance Oracle RAC Cluster. As with the previous data center configuration, if one of the applications or data centers were to fail, transactions would automatically be processed at the other site, and there would never be any downtime. One site can be taken down for maintenance while the other remains active. However, this solution is sensitive to latency, limiting distance to about 16 miles (25 kilometers) or less. In addition, Oracle recommends host-based mirroring over array-based replication, as it enables both instances to write to both copies at the same time, so failover is more seamless. With array-based replication, you need to switch over to the replicated array if the primary becomes unavailable, which is not integrated with the database stretch cluster.
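The multiple-application-instance configuration described above depends on three mechanisms: policy-based routing, loop detection, and collision handling. The sketch below illustrates each one in a simplified form. It is not how GoldenGate, ZeroNines, or any other product works internally. Every name in it (`Site`, `Change`, `replicate`, `route`) is hypothetical, and the "last writer wins, ties broken by site name" rule is just one assumed collision policy; commercial products typically let you configure others.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    key: str
    value: str
    version: int   # logical clock value assigned by the originating site
    origin: str    # name of the site where the change was first made


@dataclass
class Site:
    name: str
    data: dict = field(default_factory=dict)    # key -> (value, version, origin)
    outbox: list = field(default_factory=list)  # local changes awaiting replication
    clock: int = 0                              # simple logical clock

    def local_write(self, key: str, value: str) -> None:
        self.clock += 1
        change = Change(key, value, self.clock, self.name)
        self._store(change)
        self.outbox.append(change)  # only locally originated changes are queued

    def apply_remote(self, change: Change) -> bool:
        # Loop detection: ignore a change that originated at this site.
        if change.origin == self.name:
            return False
        self.clock = max(self.clock, change.version)
        current = self.data.get(change.key)
        # Collision handling (assumed policy): last writer wins, and equal
        # versions are resolved by comparing site names so both sites agree.
        if current is not None and (current[1], current[2]) >= (change.version, change.origin):
            return False
        self._store(change)  # applied remote changes are NOT re-queued
        return True

    def _store(self, change: Change) -> None:
        self.data[change.key] = (change.value, change.version, change.origin)


def replicate(source: Site, target: Site) -> int:
    """Ship queued changes from source to target; return how many were applied."""
    applied = sum(target.apply_remote(change) for change in source.outbox)
    source.outbox.clear()
    return applied


def route(region: str, policy: dict, default_site: str) -> str:
    """Hypothetical resource broker: choose a site name by the caller's region."""
    return policy.get(region, default_site)


if __name__ == "__main__":
    east, west = Site("east"), Site("west")
    east.local_write("account-1", "balance=100")  # two users update the same
    west.local_write("account-1", "balance=250")  # record at different sites
    replicate(east, west)
    replicate(west, east)
    print(east.data == west.data, east.outbox, west.outbox)
```

Because applied remote changes are never placed in the outbox, a change cannot travel back to its origin, and because both sites apply the same deterministic tie-break, they converge on the same value after one exchange in each direction.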

What Are The Key Considerations In Active-Active Data Center Design?

Each of these configurations has advantages and disadvantages. To understand these, you will need to consider:

· Required infrastructure capacity. In all the configurations, you need to be certain that you have enough additional capacity in terms of compute power and storage capacity to handle the failover of the applications from the other data center. Otherwise, you have to be prepared to run in a degraded state. Not only must there be enough capacity for actual failover, but it’s important for testing as well. Firms that have brought DR “in-house” often complain that it’s just as difficult, if not more difficult, to schedule testing with line-of-business owners as it is with an external DR service provider. Although you may need additional physical servers at each site to accommodate a full failover and to facilitate testing, we think there’s still a benefit to utilizing the investment you’ve made in the data center itself. Plus, as server virtualization and booting servers from networked storage become more commonplace, quickly reprovisioning server workloads will become much easier and require less gear.

· Infrastructure capacity utilization. If your goal is to reduce costs while taking on an acceptable amount of risk, then go with the configuration that maximizes your use of physical assets or rapidly enables you to repurpose physical assets. For some firms, performance degradation in a disaster recovery scenario is an acceptable risk given the required investment in additional physical servers. It’s not a requirement to have the same number of physical servers at each site if you’re using server virtualization.4 You can choose to run the same number of virtual machines on fewer physical servers at the alternate site, or you can quickly repurpose physical servers at the alternate site that are currently running virtual machines in support of secondary workloads.

· Maintaining identical IT configurations at each site. In DR, one of the biggest challenges is maintaining identical server configurations across both data centers. Without good change management and configuration management processes in place, server configurations between the two sites can significantly drift. One of the customers we spoke with reported that all of the company’s data centers use exactly the same system configuration, and the company has extremely tight and restricted change management policies. In addition, the customer reported that the company standardized on a very short list of operating systems and platforms. The tight restrictions and the limited number of configurations help to minimize configuration drift.


· Application failover and application restart. In the event of a system failure or the failure of an entire site, you must have procedures in place to restart applications at the alternate site and ensure failover transactions. All of the data center configurations described above are an improvement over cold or warm site recovery, but some of the configurations provide for a far more seamless, potentially nondisruptive, continuation of application processing.5

· Legacy applications. Ideally, enterprise architects and application developers are intimately involved in disaster recovery planning, and resiliency and availability should be major considerations in application development, testing, and final deployment. It’s much easier to deploy a new application into an active-active data center configuration than it is to redesign existing applications.

· Data integrity. When running multiple instances of the same application across multiple sites or running a single instance of an application across multiple sites, it’s critical to keep the data synchronized. This requires either the synchronous or asynchronous replication of data between data centers. If you don’t have experience deploying and managing a replication solution, it’s strongly recommended that you take advantage of the vendor’s consultative and professional services to determine the appropriate bandwidth between sites to avoid any potential latency issues. In addition, with array-based replication, any data corruption that occurs at one site is almost immediately copied to the alternate site. That’s why storage vendors will recommend that you have enough additional capacity at the alternate site for at least one or more point-in-time copies of your data. Hitachi Data Systems notes that it has the ability to split off this point-in-time copy nondisruptively.

· Data center site separation. Many of the configurations that rely on either array-based synchronous replication or an extended cluster will likely have distance limitations, probably 30 miles (50 kilometers) or less. The appropriate distance between data centers depends on your local threat assessment — the two data centers should not be subject to the same local disaster or business disruption. If they are subject to the same threats, the distance is insufficient for disaster recovery preparedness. In reality, what you have is a very expensive high-availability configuration. It may turn out that you potentially need a third data center an extended distance away to provide the sufficient separation.

· Disaster recovery testing. According to Forrester’s Enterprise And SMB Hardware Survey, North America And Europe, Q3 2007, 23% of enterprises never conduct a full test of their disaster recovery plan. If you don’t test your plan, you’re not mitigating risks. As you move to more active-active data center configurations, you must have the ability to conduct a full disaster recovery test: not just the ability to bring up a given application at the alternate site, but the ability to failover all applications that support a given business process to the alternate site. This can be very challenging because there is no longer a one-to-one relationship between the business process and an application. A company’s financial accounting and reporting capabilities might rely on as many as 40 different applications. All of these applications need to be tested together to test the ability to truly recover this business process. What does this mean as you consider active-active data center configurations? It means that you must have enough additional capacity at the alternate site to support adequate testing.
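The capacity considerations above reduce to a simple question: if one site fails, how much of its production load can the surviving site absorb, with or without suspending deferrable work such as development, test, and QA? The following minimal sketch shows that arithmetic. The `SiteLoad` fields, the abstract "compute units," and the example numbers are all hypothetical; a real assessment would also have to cover storage, network, and licensing.

```python
from dataclasses import dataclass


@dataclass
class SiteLoad:
    capacity: float    # total compute units available at the site
    production: float  # units consumed by the site's own production workloads
    deferrable: float  # units consumed by dev, test, QA, reporting, and so on


def failover_coverage(survivor: SiteLoad, failed: SiteLoad,
                      suspend_deferrable: bool = True) -> float:
    """Fraction (0.0 to 1.0) of the failed site's production load that the
    surviving site can host. A value below 1.0 means running degraded."""
    if failed.production <= 0:
        return 1.0
    free = survivor.capacity - survivor.production
    if not suspend_deferrable:
        free -= survivor.deferrable  # deferrable work keeps its resources
    return max(0.0, min(1.0, free / failed.production))


if __name__ == "__main__":
    site_a = SiteLoad(capacity=100, production=40, deferrable=30)
    site_b = SiteLoad(capacity=100, production=80, deferrable=10)
    print("Site B fails, A suspends deferrable work:",
          failover_coverage(site_a, site_b))
    print("Site B fails, A keeps deferrable work:",
          failover_coverage(site_a, site_b, suspend_deferrable=False))
```

Running the same calculation for a full disaster recovery test, where production at the surviving site keeps running while the test workloads come up, indicates whether testing can proceed without affecting production.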

ENABLING DISASTER RECOVERY TECHNOLOGIES

There are a number of key enabling technologies that all IT infrastructure and operations professionals should familiarize themselves with as they begin the planning for more active-active data center configurations. Specifically, you need to understand:

· Data replication. Many companies still rely on off-site tape vaulting for disaster recovery, which does not support active-active data center configurations. Active-active processing requires an upgrade to some kind of electronic data replication between sites. There are several choices for data replication including storage array-, storage appliance-, database-, and server-based replication. Each of these approaches to replication has advantages and disadvantages, and it’s likely that you will need more than one technology to support a range of recovery time and recovery point objectives. When evaluating a replication technology, you should take the following into account: the ability to support synchronous, asynchronous, and batch replication or remote point-in-time copies; multisite replication such as 1-to-1, 1-to-many, and many-to-1; host performance impact; bandwidth requirements; bandwidth efficiency; the ability to survive bandwidth disruptions; host operating system and database support; ease of deployment and management; the ability to use the secondary site productively; and of course cost.

Array-based replication is one of the most widely deployed technologies because it offloads processing from the server, supports synchronous replication, and supports complex multisite replication. It’s also host-agnostic. Database replication technologies are also widely deployed and sometimes preferred by database administrators because they control the configuration and management of the technology. For example, Oracle Data Guard is an integrated feature of the database. Configuration, administration, failover, and failback are all handled through Oracle Enterprise Manager. In addition, because it is “database-aware,” it can detect when Oracle data blocks are corrupted, thus reducing the risk of transmitting a corruption to the recovery site. There are also independent software vendors such as GoldenGate that specialize in the ability to replicate data between any databases, even heterogeneous databases, because the solution is “database-aware” and handles all of the transformation and the conversion of the data. Database replication technologies also transmit significantly less data than array-based replication, and it’s possible to extend replication over much greater distances.

· Server virtualization. Many companies are successfully using server virtualization to improve the availability of applications at the primary site as well as facilitate a rapid restart of applications at the recovery site. When servers running virtual machines are connected to networked storage, you can use array-based replication technologies to replicate virtual machine configuration files from the production data center to the alternate site automatically. This reduces the need to fastidiously monitor system configurations between data centers. The system configurations are always in sync because any changes are continuously replicated to the alternate site. When it comes to recovery at the alternate site, it’s less “recovery” and more “restart” because the data and system images are already there. In addition to storage companies like Network Appliance, software companies like PlateSpin and Vizioncore have partnered closely with VMware to capitalize on the growing demand for this type of DR configuration.

· Geographically dispersed clustering. For automatic failover of applications between data centers with zero downtime, you can also employ geographically dispersed clustering technologies. Like local clustering between physical servers or virtual machines, there is a cluster interconnect or heartbeat between nodes. If one of the nodes is unavailable, processing fails over to the remaining node and processing continues. This technology is necessary when you want to maintain a single instance of the application across two sites. It’s also very useful when you want to implement the planned workload rotation configuration. Clustering is operating system specific and must integrate with data replication. For this reason, system vendors like HP and IBM will offer end-to-end solutions that combine their clustering software with their array replication solutions. Storage vendors such as EMC, HDS, and Network Appliance must work with clustering vendors to validate the interoperability of their replication technology.

IBM has a significant number of customers that use dispersed clustering technologies for planned workload rotation. In the IBM System z mainframe environment, a growing percentage of its more than 300 Geographically Dispersed Parallel Sysplex (GDPS) licenses operate in a planned workload rotation configuration using IBM Metro Mirror synchronous disk replication.
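The heartbeat mechanism described under geographically dispersed clustering can be illustrated with a short sketch. This is a simplified model under stated assumptions, not how any vendor's clustering product works: the `Node` class, the `choose_active` function, and the 5-second timeout are hypothetical, and the sketch leaves out quorum and split-brain handling as well as the coordination with data replication that the text says clustering requires.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    site: str
    last_heartbeat: float  # time of the last heartbeat received, in seconds


def choose_active(nodes: List[Node],
                  active_site: str,
                  now: float,
                  timeout: float = 5.0) -> Optional[str]:
    """Return the site that should run the workload.

    A node counts as available if its last heartbeat is no older than
    `timeout` seconds. Processing stays on the current active site while
    that site is available; otherwise it fails over to the first
    remaining node. Returns None when no node is available.
    """
    alive = [n for n in nodes if now - n.last_heartbeat <= timeout]
    if any(n.site == active_site for n in alive):
        return active_site
    return alive[0].site if alive else None
```

With both sites reporting heartbeats, `choose_active` keeps the workload where it is. Once the primary's heartbeat is older than the timeout, the function returns the alternate site, which mirrors the "processing fails over to the remaining node" behavior described above.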

RECOMMENDATIONS

DON’T PLAN YOUR NEXT DATA CENTER MOVE WITHOUT A DR PLAN

Planning a data center consolidation, new construction, or relocation without taking disaster recovery into consideration is a missed opportunity to provide your company with not only a more effective DR plan but also a competitive advantage. You should:

· Include application development in data center strategy. The most seamless form of DR is baked into the application — not tacked on as an afterthought. For example, by building active-active processing into the application, you can plan for a graceful failure where transactions are picked up by remaining data centers. In this scenario, there is no recovery or failover phase — making resiliency, not DR, the best term to describe it. Where DR must be added on to applications after the fact, software from ZeroNines or Marathon Technologies provides innovative ways of retrofitting active-active capabilities.

· Use server virtualization to expand DR to more applications. Virtualization products such as VMware, Microsoft Windows Server Virtualization, and XenEnterprise make applications more portable and servers easier to reassign in an emergency. The hypervisor at the core of each of these products essentially hides the differences between servers’ hardware configurations. That means that a virtual machine (VM) can be easily copied to a remote site on a nightly basis, and in a DR event can be started up on foreign servers. Furthermore, multiple VMs can run on the same server at once in the event that there are not enough physical machines to go around. Third-party tools from companies like Racemi, PlateSpin, and Vizioncore can help automate the recovery process as well as the VM snapshots and replication.

· Define a catalog of DR capabilities and selection criteria. No one DR technology will be able to address all of your platforms or applications. Your organization should have a range of DR capabilities on tap to address applications ranging from low to very high criticality. Furthermore, the DR service catalog should specify the criteria that determine which DR technology is selected for a given application. This is an important part of your IT governance that ensures that the right technology is selected for every job. Without specific selection criteria, application owners will frequently overspecify the level of protection their application requires.
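A DR service catalog with explicit selection criteria could be sketched as follows. This is one possible shape, assuming the criteria are a required recovery time objective (RTO) and recovery point objective (RPO). The technology names come from this report, but every number, the `DrOption` class, and the `select_option` function are invented for illustration and are not Forrester guidance.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DrOption:
    name: str
    rto_hours: float    # recovery time the option can deliver (illustrative)
    rpo_hours: float    # data-loss window the option can deliver (illustrative)
    relative_cost: int  # 1 = cheapest (illustrative ranking)


# Hypothetical catalog: all figures are placeholders, not measured values.
CATALOG = [
    DrOption("off-site tape vaulting", 72, 24, 1),
    DrOption("nightly VM copy to alternate site", 24, 24, 2),
    DrOption("asynchronous replication", 4, 0.25, 3),
    DrOption("synchronous replication with geographic clustering", 0.1, 0, 4),
]


def select_option(required_rto: float,
                  required_rpo: float,
                  catalog: List[DrOption] = CATALOG) -> Optional[DrOption]:
    """Return the cheapest option that meets both objectives, or None."""
    candidates = [o for o in catalog
                  if o.rto_hours <= required_rto
                  and o.rpo_hours <= required_rpo]
    return min(candidates, key=lambda o: o.relative_cost, default=None)
```

Because the function always returns the cheapest option that satisfies the stated objectives, an application owner has to justify a tighter RTO or RPO to obtain a more expensive technology, which is one way to discourage overspecified protection.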

SUPPLEMENTAL MATERIAL

Methodology

Forrester conducted interviews with vendors of data replication and clustering technologies, as well as systems integrators delivering disaster-recovery-related services. We asked each vendor to provide customer references for its technology or a customer willing to take part in a primary research interview.

Companies Interviewed For This Document

Bank of America

GoldenGate Software

Hitachi Data Systems

IBM

Network Appliance

Oracle

VMware

ENDNOTES

1 Uptime Institute defines a four-tier data center system and provides discussions and illustrations of each tier. Source: W. Pitt Turner IV, John H. Seader, and Kenneth G. Brill, “Tier Classifications Define Site Infrastructure Performance,” The Uptime Institute, December 4, 2006.

2 According to Uptime Institute’s estimates, a 20,000-square-foot (1,800-square-meter) data center supporting an equipment density of 100 watts/square foot at a tier three level of resiliency would cost approximately $44.4 million. Source: W. Pitt Turner IV and John H. Seader, “Dollars per kW plus Dollars per Square Foot Are a Better Data Center Cost Model than Dollars per Square Foot Alone,” The Uptime Institute, April 18, 2006.

3 IBM’s “Integrated Testing Facility” (ITF) is a data center concept developed by former IBM senior IT Architect John B. Koberlein, based on his 40-plus years of experience in the IT industry. The objective is to reduce time-to-market, reduce total cost of ownership, and improve availability for customers while developing business and technical skills to sustain these elements.

4 Adoption of server virtualization has accelerated dramatically in North America since 2005, with 51% of enterprises now using or piloting the technology. See the February 7, 2007, “Server Virtualization Accelerates In North America” report.

5 Cold site recovery would first require the manual restore of data from backups followed by a restart of applications.
