
Site Audit & SAN Analysis
MTI Technology Corporation

DoubleClick Site Audit & SAN Analysis Report
SO#: PS310018

Laurence E. Wilson
12 February 2003
REV-2

PREFACE

1. The purpose of these SAN analyses is to document hardware, software, control systems, and common practices.

2. Customers have stated in the past that what they seek in a SAN analysis is vendor neutrality, insofar as possible.

3. With respect to 1 and 2 above, the goal is to recommend improvements in hardware, software, procedures, or infrastructure. Where possible, several vendor solutions are listed where they are found to be compatible and functional.

Much of the information in this document is simply documentation of the customer data center. It may seem that you already know much of this information, but it is collected for 2 primary reasons:

1. So that if the customer decided to use another vendor (other than or in addition to MTI), he could provide copies of this Analysis to those vendors. Since you already paid for a SAN analysis, why pay for a second if you use more than one vendor?

2. So that MTI Sales Engineers (and the author) can gain a sufficiently high-level understanding of your site that they can implement whatever solutions you choose.

Each chapter closes with a summary reviewing that section and its recommendations.

If you wish, please read Chapter 9 (Summary Road Map) and Chapter 10 (Recommendations / Summary) first. The rest of the chapters and appendixes provide supporting documentation for the suggestions and conclusions in Chapters 9 and 10.

Table of Contents

PREFACE
1.0 Overview
    1.1 Customer Summary
    1.2 Current Environment Summary
    1.3 NYC Data Center
    1.4 Primary Data Center
2.0 Projections & Growth
    2.1 Current / Future Applications
    2.2 Current / Future Users
    2.3 Future Hardware Growth
        2.3.1 Tape
        2.3.2 Disk
    2.4 Issues
    2.5 Summary
3.0 Proposed SAN
    3.1 SAN Topology
    3.2 Hardware Base - Directors
    3.3 Multiple Operating Systems
    3.4 FCP/SCSI 3 SAN (Core SAN #1)
    3.5 IP/Fibre Backup SAN (Core SAN #2)
        3.5.1 Combined FCP/IP SAN (Core SAN #3)
    3.6 Standardization
        3.6.2 Switches/Directors
        3.6.3 HBA
        3.6.4 Server Hardware
        3.6.5 Backup Hardware
        3.6.6 Backup Software
    3.7 NAS
    3.8 Summary
4.0 Backup
    4.1 Current
    4.2 Proposed
        4.2.1 LAN Free
        4.2.2 Server Free
        4.2.3 Veritas
        4.2.4 ATL / StorageTek - DLT (Tape Libraries)
    4.3 Near Line Data
    4.4 Summary
5.0 Performance & Applications
    5.1 Current Applications - Performance Impact
    5.2 Proposed SAN - Impact on Applications Performance
    5.3 Proposed SAN - Future Growth Impact
    5.4 Recommended Enhancements
        5.4.1 V-Cache
    5.5 Tuning Oracle Database with V-Cache
        5.5.1 Using V-Cache with Oracle
        5.5.2 Determining the Need for V-Cache
        5.5.3 V-Cache Oracle Summary
    5.6 Summary
6.0 Disaster Recovery
    6.1 Current
    6.2 Proposed
        6.2.1 Off Site Disaster Recovery
        6.2.2 Create a DoubleClick-Owned Data Center
        6.2.3 Off Site Storage
    6.3 Summary
7.0 Control / Virtualization / Data Migration
    7.1 MSM / Fusion
    7.2 Veritas NBU
    7.3 Veritas SANPoint / Disk Management
    7.4 Fujitsu (Softek) SANView
        7.4.1 Fujitsu (Softek) Storage Manager
        7.4.2 Fujitsu (Softek) TDMF
    7.5 MTI DataSentry2
        7.5.1 QuickSnap (Point-in-Time Images)
        7.5.2 QuickDR (Data Replication & Migration)
        7.5.3 QuickCopy (Full Point-in-Time Image Volume)
        7.5.4 QuickPath (Server-Free & Zero-Impact Backup)
    7.6 Software Review
    7.7 Summary
8.0 Special Applications (Considerations)
    8.1 Database Considerations
    8.2 Microsoft Exchange 2000 - Considerations
9.0 Summary - Roadmap
    9.1 Proposal Overview
    9.2 Roadmap
    9.3 Data Migration (DAS to SAN)
    9.4 Transition Points (DAS - SAN - NAS)
    9.5 Summary
10.0 Recommendations / Analysis Summary

Appendix
    Appendix 1 - Contact Information
    Appendix 2 - MTI Hardware, Primary Data Center
    Appendix 3 - MTI Hardware, Secondary Data Center
    Appendix 4 - Hardware Environment, Wintel
    Appendix 5 - Hardware, UNIX / Sun
    Appendix 6 - SAN Implementation Best Practices
    Appendix 7 - HBA Count (MTI)
    Appendix 8 - JBOD Inventory (Primary Data Center)
    Appendix 9 - Wintel Storage Utilization (SAN)


1.0 Overview

1.1 CUSTOMER SUMMARY

DoubleClick contracted with MTI Professional Services group to perform an enterprise storage and SAN implementation analysis. The purpose of this analysis was to suggest to DoubleClick a method for implementing a centralized SAN storage architecture, while maximizing their current investment.

An analysis of DoubleClick's data center in New York City and its equipment is presented herein. This document outlines current practices and hardware at DoubleClick, recommended best practices, and a plan for migrating to a SAN topology.

DoubleClick develops the tools that advertisers, direct marketers, and web publishers use to plan, execute, and analyze marketing programs. These powerful technologies have become leading tools for campaign management, online advertising and email delivery, offline database marketing, and marketing analytics. Designed to improve the performance and simplify the complexities of marketing, DoubleClick tools are a complete suite of integrated applications.

DoubleClick and MTI have been partners since December of 1998. In that time DoubleClick has gone through a complete business cycle. Rapid, runaway growth at the end of the 1990s caused the data center to expand quickly without much consideration for best practices.

Since the economy began to contract, DoubleClick has gone from having too little storage to a temporary glut. Much of this storage is now three to four years old and is a mixture of systems that, in some cases, are approaching block obsolescence. DoubleClick has taken some steps to correct this: upgrading older V20s to the S240 series, and moving some storage to other company sites where it can be better utilized.

DoubleClick is currently focused on consolidating their data center operations in Colorado. They are in the process of moving their New York City Data Center equipment and plan to complete this move on or about 1 June 2003.


This document focuses on helping DoubleClick identify methods to consolidate their storage into a central SAN. Areas where DoubleClick can focus, such as hardware standardization and upgrading older arrays instead of moving them, will be addressed.

1.2 Current Environment Summary

The current environment at DoubleClick consists of a mixture of direct attached storage and SAN attached storage. These environments are distributed, and are both SCSI-attached and fibre-attached.

The hardware in the SAN consists mostly of Gladiator 3600 series and V20/30 arrays undergoing conversion to the S240 series. Also included are older Ancor switches, Gadzooks hubs, and a Hitachi 9000 series SAN. These devices, far from forming one complete SAN, are actually "pools" of SAN storage connected to 2 or 3 hosts each (on average). Thus the economies and redundancies of a core-switch-based SAN are not realized here.

Locally attached storage consists of MTI 2500 series JBODs, Compaq direct attached storage, and internal server drives.


A matrix of MTI and customer systems and storage can be found in the Appendix section of this document.

A topical overview of the New York City Data Center follows.

1.3 New York City Data Center

The New York City Data Center resides on the 12th floor of the customer facility. It has OC3 connectivity to the customer's Colorado Data Center (in Thornton, Colorado), and the customer is looking into a second OC3 pipe to aid in migrating data for the upcoming relocation from NYC to Colorado.

The center may be broken into 3 facilities for reference purposes. Those facilities would be:

1. The Primary Data Center (PDC)
2. The Secondary Data Center (SDC)
3. The Cage (a small secure area within the PDC)

The customer subdivides operations on these systems into 3 primary groups (or operations); they are:

1. Production
2. QA
3. DEV

Interviews with the customer on site indicate that the Primary Data Center (less the Cage) and the systems dedicated to production are the primary focus of the data migration and SAN analysis. The systems that provide QA and DEV functions will remain in NYC when the production systems are moved to Colorado.


1.4 Primary Data Center

The primary data center holds most of the production equipment and comprises the following:

a. 8 SUN 6500
b. 2 EMC Symmetrix (offline)
c. 1 Hitachi 9960 SAN array (15 TB)
d. 52 19" x 70" cabinets of MTI storage (112 TB on-floor)
   - Mainly 3600, with some S240 and V20/30 series
   - 112 TB on-floor / raw (includes offline cabinets)
   - 51 TB online raw (26 cabinets of 3600)
   - 34 TB online raw (9 cabinets of V20/30/S24X)
e. 3 ADIC Scalar 1000 DLT tape libraries (offline)
f. 2 StorageTek L700 robot tape libraries
g. 15 SUN 4500
h. 43 SUN SPARC / UltraSPARC
i. 285 Compaq Proliant (models mainly 1850/550/6400R, plus some newer)
j. 28 HP Netserver 600R
k. 5 NetApp F760 (7.5 TB)
l. 4 SUN 220R
m. Compaq JBODs (various, approx. 8.1 TB)

Note: These are systems that appear to be live. Given the size of the site, some may look live but actually be retired. All attempts were made not to count dead or retired systems. Also, due to day-to-day operations, the count may fluctuate. The above represents a good snapshot of the types and numbers of equipment in production.


2.0 Projections & Growth

The MTI statement of work (SOW) for this project stated a goal of planned 30-40% annual growth in storage requirements. This figure is the baseline for all assumptions made in this document unless otherwise noted, and the basis for a scalable SAN storage proposal.

The customer is currently involved in a move to consolidate their facilities in Colorado. At the New York Data Center, the customer has a Primary Data Center and a Secondary Data Center on the 12th floor of their building.

The customer has stated that the focus is on the systems identified in the Primary Data Center as moving to Colorado. These are the parameters and assumptions for the recommendations and notes found within.

2.1 Current / Future Applications

Current applications we are considering for the purposes of SAN design and implementation are:

1. ORACLE - database application
2. ABINITIO - database application / works with Oracle
3. Microsoft Exchange
4. DART (customer applications)
5. STANDARD (standard data center processes, i.e. file serving, print serving, web serving, etc.)
   a. FS Archive
   b. FSSIM Archive
   c. Custom/Report DB
   d. Corporate Systems
   e. NetBackup
   f. NXP
   g. Auto PT
   h. Arch Sure
   i. DF Paycheck
   j. EPS
   k. UBR
   l. OLTP
   m. Rep Svr
   n. FSADS
   o. SPOD DB
   p. DART W
   q. DT

Future applications include but are not limited to:

1. All standard SUN & WINTEL based applications that will execute on current systems using current Direct Attached or SAN storage.

These applications and their performance impact will be used to assess the required performance parameters of the SAN.

2.2 Current / Future Users

The Statement of Work (SOW) states that the solution must be able to handle the storage needs represented by a 30% - 40% growth rate. DoubleClick is now suffering from the same contraction in customer base as other computer-information-based service companies. While 30 to 40% growth will be supported by the solution, the customer does not foresee growth at that rate in the near term.


The customer has stated a concern about the block obsolescence of his current storage and the inherent costs of maintaining this type of older architecture. Interviews on site have also revealed that the customer and his data center employees understand that the current storage (pools of SAN and direct attach) lacks flexibility and is rapidly approaching EOL (end of life). While the storage solution will be inherently capable of supporting growth up to and beyond the stated 30-40%, modernization and economies of scale will also be used to maximize the cost benefit to DoubleClick.

2.3 Future Hardware Growth

As above, the plan is for 30-40% growth per the SOW unless stated otherwise. The proposed SAN must be capable of expanding at this rate into the near future. A rough projection of what that rate implies appears below.
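As a rough, illustrative sketch only (assuming simple annual compounding from the 112 TB on-floor raw figure in Section 1.4), the stated growth rate implies roughly a quadrupling to quintupling of raw capacity over five years:

```python
# Illustrative projection of raw capacity at the SOW growth rates.
# Assumes simple annual compounding from the 112 TB on-floor figure (Section 1.4).

BASE_TB = 112.0

def project(base_tb: float, rate: float, years: int) -> list[float]:
    """Projected raw capacity for each of the next `years` years."""
    return [base_tb * (1 + rate) ** y for y in range(1, years + 1)]

for rate in (0.30, 0.40):
    caps = project(BASE_TB, rate, years=5)
    print(f"{rate:.0%}/yr:", ", ".join(f"{c:.0f} TB" for c in caps))
# 30%/yr: 146, 189, 246, 320, 416 TB
# 40%/yr: 157, 220, 307, 430, 602 TB
```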

2.3.1 TAPE

Currently the customer has 2 StorageTek L700 robot tape libraries with DLT7000 drives. Backups are controlled via Veritas NetBackup software.

All SUN and Wintel systems back up to these central vaults over a dedicated backup SAN, with smaller systems going across the 100BaseT network to central media servers. This is a good setup and adequate for their needs. We suggest their next tape vault be fibre channel SAN capable, and that they upgrade to newer tape technology (DLT8000 / SDLT) as they outgrow their current system.

2.3.2 DISK

(Note: the Secondary Data Center has 18.3 TB of 3600 online.)

Current disk capacity (Primary Data Center) is broken down as follows:

ONSITE MTI ONLINE 3600 = 51 TB
ONSITE MTI ONLINE Vxx/S2xx (-V30 2/1/3) = 30 TB
HITACHI 9960 ONLINE ONSITE = 15 TB
DIRECT ATTACH JBOD (COMPAQ / ETC.) = 8.1 TB
NETAPP F760 (5 ARRAYS) = 7.5 TB

TOTAL (PDC) = 111.6 TB
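A trivial arithmetic check of the line items above (figures exactly as listed):

```python
# Sanity check: the PDC line items sum to the stated total.
capacity_tb = {
    "MTI 3600 (online, onsite)": 51.0,
    "MTI Vxx/S2xx (online, onsite)": 30.0,
    "Hitachi 9960 (online, onsite)": 15.0,
    "Direct attach JBOD (Compaq etc.)": 8.1,
    "NetApp F760 (5 arrays)": 7.5,
}
print(f"PDC total: {sum(capacity_tb.values()):.1f} TB")  # -> 111.6 TB
```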


NOTE 1. Compaq JBODs are mainly older 9 GB and 18 GB drive technology.

NOTE 2. The customer has plans to install a Hitachi 9980 with 50 TB in the near future.

NOTE 3. This does not include Proliants that have from 2 to 4 local drives attached (mostly 9 and 18 GB drives) and either operate as independent systems or use the drives for the operating system and a mirror. Operating systems are recommended to remain on local disks with a local mirror copy. Putting operating systems on the SAN is never a good idea, as it eats up valuable throughput and creates expensive, unnecessary bottlenecks.

2.4 ISSUES – DOUBLECLICK

1. Lots of pools of storage, both SAN and direct attach.
2. Need to support 1 and 2 GIG speeds in the same SAN.
3. Need to migrate backups off 100BaseT and onto the SAN where possible/necessary.
4. A considerable amount of storage (Compaq JBOD / MTI 3600 / NetApp F760) is obsolete and near end of life. The cost to move it across the country may not be justifiable or prudent.
5. Symmetrix systems are being phased out (currently offline).

2.5 SUMMARY

Applications are stable, with no major surprises planned. The customer has a large investment in StorageTek backup hardware (L700) and Veritas software; since both are adequate for the time being, they should be retained.

The SOW said to plan on a growth rate of 30 to 40%. The customer has stated (and I agree) that while we may not be seeing that rate during this analysis (or in the near future), the SAN needs to be able to grow quickly while avoiding all the problems DoubleClick had during their last period of explosive growth.


3.0 Proposed SAN

3.1 SAN Topology

DoubleClick's proposed SAN backbone is to be based on 2 Brocade 12000 core switches running in Fabric mode.

3.2 Hardware Base - Directors

The proposed SAN for DoubleClick will be based on Brocade 12000 director class switches. They can provide a high-speed backbone for both FCP/SCSI (DISK & TAPE) and IP traffic.


Brocade Silkworm 12000

Silkworm 12000 Highlights:

- Simplifies enterprise SAN deployment by combining high port density with exceptional scalability, performance, reliability, and availability.
- Delivers industry-leading port density with up to 128 ports in a single 14U enclosure and up to 384 ports in a single rack, facilitating SAN fabrics composed of thousands of ports.
- Meets high-availability requirements with redundant, hot-pluggable components, no single points of failure within the switch, and non-disruptive software upgrades.
- Supports emerging storage networking technologies with a unique multiprotocol architecture.
- Provides 1 Gbit/sec and 2 Gbit/sec operation today, with seamless extension to 10 Gbit/sec in the future.
- Employs Brocade Inter-Switch Link (ISL) Trunking™ to provide a high-speed data path between switches.
- Leverages Brocade Secure Fabric OS™ to ensure a comprehensive security platform for the entire SAN fabric.

3.3 Multiple Operating Systems

Current: DoubleClick has multiple operating systems and platforms. The systems on site currently comprise SUN and Windows 2000 (Wintel) platforms.

Proposed: The proposed SAN needs to have the capability to support Solaris, Windows, and in the future Linux.

3.4 FCP/SCSI 3 SAN – Core SAN #1

We are proposing 3 core SAN solutions. The first would be to split the SAN in 2 (Core SAN #1, disk, and Core SAN #2, tape). One SAN, built around 1 Silkworm 12000, would run only the FCP/SCSI 3 protocol; this SAN would handle all disk operations.

3.5 IP/Fibre – Backup SAN – Core SAN #2

A second SAN, built around 1 Silkworm 12000-class director, would handle FCP/IP traffic. It would in effect handle all I/O that pertains to direct attached tape libraries, and IP traffic pertaining to backups. This would provide what is known as LAN-free backup, offloading some of the current Ethernet traffic and relocating it to the fibre network.

3.5.1 – Combined FCP/IP SAN – Core SAN #3 (Alternate)

If a cost/performance trade-off is desired, the customer can go with a combined SAN. 2 x 128-port Brocade 12000 director-class switches can be used to handle both disk and tape I/O and IP fibre traffic, utilizing Emulex multiprotocol drivers.

These drivers allow an Emulex Lightpulse 8000 / 9000 / 92XX series Host Bus Adapter to handle both protocols simultaneously in the same HBA. This would have a small impact on performance, but would save money in acquisition costs for switches and HBAs while still being fully functional (it still provides LAN-free backups). It could also be expanded or modified at a later date into the split SAN proposed in 3.4 and 3.5 above.

This alternative could also use a dedicated Ethernet IP network to route the Veritas index and housekeeping files. The fibre network would still carry the bulk of the data I/O transfer traffic.

3.5.2 – Combined SAN – (Preferred recommendation)

1. Use the 2 Brocade 12000 core switches to build a mesh fabric.
2. Install 2 GIG HBAs in high-I/O servers (i.e. SUN 4500/6500) and connect them to the core switches.
3. Install V400 2 GIG arrays and dedicate them to the high-I/O servers.
4. Decommission the Gladiator 3600 and NetApp boxes.
5. Leave the 1 GIG HBAs used in the current backup SAN and the Silkworm 2800 switches connected as is (add ISL links to the core switches).
6. Use Brocade zoning: build a "1 GIG ZONE", "2 GIG ZONE", "Backup Zone", etc. (see the zoning sketch after this list).
7. Use S240/S270 (V20/V30 conversions) to replace older SCSI JBODs on Compaq / Wintel systems (or leave them in NYC to support the QA and DEV systems that are not moving).
8. Implement a central SAN control/viewing solution such as Veritas or Fujitsu (SANView-type) software.
9. Install V-Cache SAN-ready solid-state disk. Connect V-Cache to the core switches and use it to accelerate database hot files and indexes. (Do this last; you may not require V-Cache once you see the results of the V400's performance on your SAN.)
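As a minimal illustration of item 6, the sketch below models the proposed zone layout as plain data and checks that no device lands in both speed zones. The alias names are hypothetical placeholders, not actual DoubleClick devices, and real zoning would be done in the switch's own management interface:

```python
# Illustrative model of the proposed zone layout (item 6 above).
# Alias names are hypothetical placeholders, not real site devices.

zones: dict[str, set[str]] = {
    "ZONE_1GIG":   {"sun4500_a", "s240_array_1", "silkworm2800_isl"},
    "ZONE_2GIG":   {"sun6500_a", "sun6500_b", "v400_array_1"},
    "ZONE_BACKUP": {"media_server_1", "l700_library_1", "silkworm2800_isl"},
}

def check_speed_zones(zones: dict[str, set[str]]) -> set[str]:
    """Return aliases that appear in both the 1 GIG and 2 GIG zones."""
    return zones["ZONE_1GIG"] & zones["ZONE_2GIG"]

overlap = check_speed_zones(zones)
if overlap:
    print("WARNING: aliases in both speed zones:", sorted(overlap))
else:
    print("Zone layout OK: 1 GIG and 2 GIG zones are disjoint.")
```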

3.6 Standardization

Given DoubleClick's current environment, all proposals will standardize on the following.

3.6.2 Switches/Directors (Standardization)

Switches will be based on Brocade's family of Fabric-enabled switches. These can include Silkworm models 2040/50 (8 port), 2400 (8 port), 2800 (16 port), 6400 (64 port), and the 12000 (128 port).

Customer needs and budget constraints will dictate the ultimate selection. DoubleClick already has a number of 2800s on site in their current backup solution, and they have purchased 2 Brocade 12000s, located in Colorado.
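For planning purposes, the port counts above lend themselves to a quick budgeting exercise. The sketch below is illustrative only; the device counts are hypothetical placeholders, not the site's actual HBA inventory (see Appendix 7 for that):

```python
# Rough fabric port budgeting using the Silkworm port counts listed above.
# All device counts are hypothetical, for illustration only.

PORTS_PER_MODEL = {"Silkworm 2800": 16, "Silkworm 12000": 128}

proposed_switches = {"Silkworm 12000": 2,   # core directors
                     "Silkworm 2800": 4}    # existing edge/backup switches

available = sum(PORTS_PER_MODEL[m] * n for m, n in proposed_switches.items())

# Hypothetical consumers: host HBAs + array ports + tape ports + ISLs.
needed = 120 + 24 + 8 + 8

print(f"available: {available} ports, needed: {needed}, "
      f"headroom: {available - needed}")
```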

3.6.3 HBA (Standardization)

Most customer systems can be handled by the Emulex Lightpulse 8000 / 9000 / 92XX (PCI/Sbus) family of dual-protocol adapters.

3.6.4 Server Hardware (Standardization)

We strongly suggest that DoubleClick consider standardizing hardware and operating systems. While we can support all the hardware in their current environment, a mixed topology carries its own overhead in management inefficiencies and personnel required.

It seems from a review of the New York City facility that Windows 2000 and Solaris (2.8 or later on SUN and Compaq platforms) would be a logical choice for standardization, given the preponderance of those systems already on site. We realize there may be orphan systems remaining for business and other reasons, and our proposal will allow for their support, and the support of the current systems mix.

3.6.5 Backup Hardware (Standardization)

Current backup hardware is centered around 2 StorageTek L700 robot tape libraries, running on a combination of a dedicated backup SAN (using Brocade 2800 switches) and IP backup (Ethernet) for non-SAN systems and Veritas housekeeping traffic.

3.6.6 Backup Software (Standardization)


DoubleClick has already standardized on Veritas NetBackup. The current system runs well, and we would only suggest some additional software packages and a future plan to transition to 2 GIG fibre.

A faster 2 GIG backup SAN with IP and SCSI/FCP over fibre, or with SCSI/FCP over fibre and a dedicated Ethernet backup network for index and Veritas IP housekeeping traffic, might also be an alternative.

3.7 NAS

Currently, NAS / file sharing is handled by 5 NetApp F760 NAS boxes providing approximately 7.5 TB of storage, using a total of 210 36 GB drives.

We recommend that the NetApps be decommissioned. This site is primarily a SAN environment, and the NetApp boxes are slow, old, and a maintenance problem in terms of both cost and reliability.


3.8 Summary

The current site already has a large investment in Brocade switches and 1 GIG Host Bus Adapters. We can leverage this investment and achieve the benefits of standardization by moving all the slower, less intensive applications to the older 1 GIG HBAs and technology (S240), and migrating the high-speed, high-I/O applications to new 2 GIG HBAs and storage (i.e. V400). There is no reason for the customer to totally scrap their current investment and operation. Customer on-site discipline is good, and we suggest that the customer commence moving to hardware and software standardization (using the best practices noted in Appendix 6).

We feel the current move to Colorado is the perfect opportunity for the customer to upgrade his older equipment during the data migration. Savings in maintenance contracts on older equipment (NetApps & 3600s) as well as dramatic fibre performance improvements can more than justify upgrading selected parts of the infrastructure now.


4.0 Backup

4.1 Current

The current environment (NYC) comprises 2 StorageTek L700 tape libraries running DLT7000 tape drives. The libraries are connected to the primary SUN servers via a dedicated 1 GIG fabric using Brocade Silkworm 2800 switches (see Figure 4-1). This backs up their core ORACLE, ABINITIO (EPS1-4), and DART systems, which form the core of their online systems. They are currently using Veritas NetBackup to control these backups. They are backing up the Oracle database using the offline method. Additionally, all NT/WIN2K systems are backed up over the 100BaseT internal IP network. All backup operations in NYC are under Veritas control.

4.2 Proposed

The proposal is to maintain the Veritas / StorageTek system on site. It is an excellent solution in design and implementation. The customer could save some money here by using the system as is.

Connect the Brocade 2800 Silkworms to the Brocade 12000s as edge switches, and designate a backup zone. Then, as time and money permit, add 2 GIG HBAs to replace the 1 GIG backup HBAs on a one-for-one basis. Also upgrade the libraries to later DLT technology (DLT 8000 and SDLT) as time and money permit.

The following proposals are modest improvements to fine-tune the current customer backup infrastructure, and to provide information on methods and techniques that may not have been previously considered. They purposely avoid any major changes.


Figure 4-1 - Current dedicated backup SAN (diagram not reproduced).


4.2.1 LAN Free (To speed up backups and reduce IP network congestion)

This is somewhat of a misnomer. When you set up IP over fibre, you are actually building a Fibre Channel LAN. All backup traffic then moves over fibre IP rather than over the regular 10/100Base-T LAN.

VERITAS has stated that when you use a Fibre Channel attached tape library (i.e. an L700), you still need a LAN to pass the backup index files over. A fibre channel tape library shows up as a locally attached fibre device to a host using FCP/SCSI3. However, the indexes can be passed over the Fibre Channel SAN using IP/Fibre.

There are basically 2 ways to do a so-called LAN-free backup.

The first method is to use a fibre channel attached tape library connected to the SAN via a switch. Backup index files (Veritas) are then passed over the IP/Fibre Channel network. This requires all systems in the backup network to also have a fibre channel HBA and to be running IP/Fibre. In this implementation, the backup files are sent over fibre via the SCSI3/FCP protocol. The backup indexes and Veritas/Legato control traffic also go over the same fibre, but using the IP protocol. Much the same can be accomplished by using a fibre channel tape library and building a separate backend 10/100 to GIG-E network that handles nothing but backup traffic.

The second method is to have a SCSI attached tape library hanging off a backup server. The backup server and clients are all members of a Fibre Channel SAN running IP over fibre. In this setup, all backup and index files are transferred over the fibre channel LAN using IP protocols. (This method is nowhere in evidence at the NYC data center, except in a few pieces of offline backup hardware.)

4.2.2 Server Free (This is only applicable if you keep your NetApp Boxes)


NetBackup accomplishes backup and recovery of Network Appliance filers using NDMP (Network Data Management Protocol), an open protocol co-developed by Network Appliance. The NDMP protocol allows NDMP client software, such as VERITAS NetBackup, to send messages over the network to an NDMP server application to initiate and control backup and restore operations. The NDMP server application resides on an NDMP host, such as a NetApp F760. The NetBackup for NDMP Server is where the NetBackup for NDMP software is installed; it may be a master server, a media server or both. If it is only a media server, however, the assumption is that there is a separate NetBackup master server to control backup and restore operations.

Benefits:

High-speed local backup
Perform backups from NetApp filers directly to locally attached tape devices for performance without increasing network traffic (refer to the red arrow in the figure below).

Alternate Client Restore
Data backed up on one NetApp filer may be restored to a different NetApp filer. This helps simplify both recovery simulation and general restoration of data, especially when the NetApp filer that performed the initial backup is unavailable (refer to the blue arrows in the figure below).

Figure - NDMP backup and restore paths (diagram not reproduced).

System Requirements: NetBackup for NDMP supports Sun Solaris and Windows NT/Windows 2000 platforms, both of which DoubleClick maintains in their environment. One condition required by the latest release of NetBackup is that the qualified Network Appliance Data ONTAP™ release be either 5.3.6R1 or 5.3.4. DoubleClick currently maintains two versions of ONTAP™: 5.3.4R3 and 5.3.7R3. It is possible that their current versions may be supported. [Detailed and/or updated information would have to be verified through MTI's Veritas representative.] The other condition regards supported tape libraries and tape drives. As of this writing, supported tape devices include any general DLT tape library or drive, DLT tape stacker, 8 mm tape library or drive, and StorageTek ACSLS-controlled tape drives or libraries. [Detailed and/or updated information would have to be verified through MTI's Veritas representative.]

4.2.3 Veritas

MTI has inventoried DoubleClick's environment; it is well thought out and efficient. I suggest DoubleClick look at adding the following Veritas products at some point in the future.


Features & Benefits of VERITAS NetBackup™ DataCenter for DoubleClick.

VERITAS Global Data Manager™

VERITAS Global Data Manager™ is a graphical management layer for centrally managing a single VERITAS NetBackup™ storage domain, or multiple storage domains anywhere in the world. In addition, Global Data Manager offers a monitoring view into Backup Exec™ for Windows NT and Backup Exec for NetWare environments, providing customers with a single view into mixed Backup Exec and VERITAS NetBackup environments.

VERITAS NetBackup Advanced Reporter™

VERITAS NetBackup Advanced Reporter™ delivers a comprehensive suite of easy-to-follow graphical reports that help simplify monitoring, verifying, and troubleshooting VERITAS NetBackup™ environments. Advanced Reporter provides intuitive charts, graphs, and reports that allow administrators to easily uncover problem areas or patterns in backup and restore operations. Reports are delivered via web browser to virtually anywhere in the enterprise (even over dial-up phone lines). Many reports embed hyperlinks that allow users to search for more granular information without having to open or move to an additional screen. Over 30 new graphical reports are available.

VERITAS NetBackup™ for Microsoft Exchange Server

VERITAS NetBackup™ for Microsoft Exchange Server simplifies database backup and recovery without taking the Exchange server offline or disrupting local or remote systems. A multi-level backup and recovery approach ensures continued availability of Exchange services and data during backups. Central administration, automation options, and support for all popular storage devices create the flexibility administrators need to maximize performance.

VERITAS NetBackup™ for NDMP option (if you decommission your NetApp boxes, ignore this)

Using the open Network Data Management Protocol, VERITAS NetBackup™ NDMP delivers high-performance backup and recovery of NAS systems and Internet messaging servers, such as Network Appliance, EMC, Auspex, ProComm, and Mirapoint. VERITAS NetBackup sends NDMP messages to these systems to initiate and control backup and recovery operations.


VERITAS NetBackup™ for Oracle

VERITAS NetBackup™ for Oracle simplifies the backup and recovery of all parts of the Oracle database, including archived redo logs and control files. Users can perform hot, non-disruptive backups (full or incremental) of Oracle databases. VERITAS NetBackup's job scheduler and tape robotic support allow for fully automated database backups. In addition, VERITAS NetBackup for Oracle utilizes a high-performance parallel process engine that provides the fastest mechanism possible for the backup and recovery of data, resulting in increased database availability.

VERITAS NetBackup™ for Oracle Advanced BLI Agent

VERITAS NetBackup™ for Oracle Advanced BLI Agent delivers high-performance protection for Oracle databases by greatly reducing the amount of data transferred during backup and recovery. Block Level Incremental (BLI) backup technology backs up changed blocks (not entire files), dramatically reducing CPU and network usage and enabling more frequent backups. The Advanced BLI Agent is fully integrated with the Oracle 8i Recovery Manager (RMAN) interface and augments the increased manageability and simplified recovery that RMAN provides.

- Leveraging your existing hardware results in lower-cost implementation compared to split-mirror solutions.
- Online, non-disruptive, full, differential, and cumulative block-level incremental backup options provide flexible implementation and free up bandwidth.
- Backup and restore at the database, tablespace, and data file levels, including control files.

VERITAS NetBackup™ Shared Storage Option™ VERITAS NetBackup™ Shared Storage Option™ (SSO) is the first heterogeneous SAN-ready storage solution working on both UNIX and Windows. It enables dynamic sharing of individual tape drives—standalone or in an automated tape library—by "virtualizing" tape resources. VERITAS NetBackup™ SSO reduces overall costs by providing better hardware utilization and optimization of resources during backup and recovery operations.

4.2.4 ATL / StorageTek - DLT (Tape Libraries)

Existing tape libraries do not need any upgrades as of this writing. In the future, when considering tape libraries from the above manufacturers, consider the following:

- Backward read compatibility with DLT7000.
- DLT format (DLT 8000 or SDLT); it is really too late at this point (and too expensive) to change media formats.
- 2 GIG fibre speeds. (Anything slower will just mire you in the past.)

4.3 Near Line Data

The customer currently has no near-line data (CD-RW / DVD-RW libraries) on this site or under consideration.


4.4 Summary

1. Add the NetBackup Oracle agent to your current backup suite.
2. My primary suggestion on the NetApps is to get rid of them (due to age and performance), but if you retain them, consider Veritas NDMP for server-free backup of these systems.
3. Leave your backups on a dedicated fibre fabric of Silkworm 2800s (as it is now). Use ISL links to connect to your core switches. Migrate to 2 GIG technology in an orderly transition as the L700s reach end of life. (You can also upgrade to newer, denser DLT drive technology at the same time.)
4. Build a backend IP network (IP over fibre or Ethernet) and dedicate it to passing index files (for fibre-attached clients) and to passing index files and backup traffic for your IP-only clients (mostly the Wintel and SPARC equipment).


5.0 Performance & Applications

5.1 Current Applications - Performance Impact

DoubleClick currently has a fairly standard mix of applications. The primary ones are Oracle, Ab Initio, Microsoft Exchange, DART (in-house software), and web server applications. The systems also provide the traditional mix of file services, print services, and backups (Veritas NBU).

The current direct attached storage is marginally adequate from a performance standpoint. The systems are also slowly becoming obsolescent, and in some cases reaching capacity. Currently the customer’s problems are mostly volumes going over 90% capacity, and block obsolescence of storage hardware.

Moving to a core-switch SAN based on new 2 GIG bandwidth technology will improve the performance of the customer's current applications and ameliorate some of their overall performance issues. However, it is our opinion that a controlled migration to a core-switch SAN is prudent. The current customer configuration is based not only on pools of obsolete storage, but on small SANs (pools) that are also fast reaching the end of their service lives. One major exception is the Hitachi 9000 series SAN installed in the Primary Data Center.


5.2 Proposed SAN – Impact on Applications Performance

The proposed SAN will have a marked impact on current application performance. DoubleClick has stated that they have two Brocade 12000 core switches available for the Colorado Data Center.

As part of this review, I recommend DoubleClick look at removing all 1 GIG technology (LP8000s, Ancor switches, etc.) and eliminating edge switches where practical (except for current backup operations). The 2 Brocade 12000s could be made into a redundant fabric (for path-failover capability), with all current SAN equipment (Hitachi, V20/30/S240) as well as all new equipment (i.e. V400) connected via the core switches.


5.3 Proposed SAN – Future Growth Impact

The CORE SWITCH SAN will allow for a more controlled growth pattern, and flexibility in storage assignment. Currently the admins have to throw disks (DAS/NAS/JBOD) at many storage problems on a case-by-case basis.

Basing a SAN on director class switches will allow us to start with a Fabric SAN backbone, and expand from there. Hosts and Storage may be added to the backbone and allocated as needed. Basing the SAN on Directors and High Density switches allows for almost limitless growth and flexibility.

5.4 Recommended Enhancements

Recommended enhancements to hardware and proposed topologies are covered in Chapter 9.

Proposed improvements and considerations for such things as Database performance are covered in Chapter 8.0.

The only hardware performance enhancement (other than the SAN itself and LAN-free backup) would be to implement V-Cache.

5.4.1 V-Cache - Hardware


MTI V-Cache is a SAN-enabled solid-state disk. Like direct attached solid-state disks, it is an I/O device that can be used to store and safeguard data. Solid-state disk technologies like V-Cache use random access memory (RAM) to store data, thus eliminating the access latency of mechanical disk drives. The near-instantaneous response, and the ability to place a RAM disk within a SAN, allow for quantum improvements in database applications by moving small but I/O-intensive segments of the database off traditional mechanical disks.

DoubleClick could realize significant performance gains by moving small but highly I/O-intensive files to V-Cache. V-Cache would fit into the SAN proposal since it is made to be plugged directly into a fibre switch or director. Think of it as a RAM disk that is SAN-enabled. Its other advantage is that it can be cut into RAM LUNs and served to specific hosts (LUN mapping to WWNN), just like MTI S200 storage does with its disk LUNs.

V-Cache (General Characteristics)

- Physical: 5U high, in a dual-unit enclosure.
- A base unit goes to 25.8 GB, so 2 units in a 5U enclosure = 51.6 GB.
- V-Cache SAN is for a unit that will not be attached to one of our products (it comes with its own switches in a mini SAN).
- V-Cache Expansion is a box to integrate into one of our SANs (S2xx or Vxx).
- A V-Cache unit can hold up to 3 memory modules of mixed types, 4.3 or 8.6 GB.
- Systems: Solaris 2.7/2.8 and Windows NT4/W2K.
- LUNs: V-Cache supports 16 LUNs per controller (8 per RCU). You can add as many controllers to V-Cache as the customer wants.

The nice thing about V-Cache (on the SAN) is that it can be shared among systems on the SAN, or dedicated completely to any one system.

5.5 Tuning Oracle Database with V-Cache Accelerator (Overview)

Disk access times are generally a key issue in most commercial data processing environments. It is disk access that is commonly the longest part of the data acquisition process. Information processing technology has continually improved by leaps and bounds in the case of CPUs, memories, and other components. The hard disk arena has improved primarily in the direction of higher areal densities (capacity) and a lower cost per megabyte. Progress has not been nearly as great in improving the speed of data access from the disk drive. Speed of access in a drive is mainly limited by the rotational speed of the disk or the linear speed of the access arm. A modern-day CPU flies along at 1.5 GHz or faster, but it must wait for data from a disk drive that is chugging along at 15,000 rpm or less. The ever-widening speed differential is causing I/O bottlenecks to become more troublesome than ever.


V-Cache application accelerators use high-speed memory organized as a drive. To the server, it appears as a conventional disk drive when in reality the server is dealing with a mass of external memory. In a V-Cache accelerator, there are no performance-robbing mechanical limitations due to moving heads or spinning platters. Data can be written to V-Cache at memory speeds. There is no seeking of tracks and data can be found and transferred to the server in a few microseconds, which is 200-350 times faster than the 7-10 milliseconds a regular hard drive takes to seek to a track and transfer data. Data files on a Unix file system can be created on the V-Cache accelerator and the operating system will not know the difference.
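The claimed speedup is simple arithmetic; a tiny sketch using the access times quoted above illustrates it (the 30-microsecond figure is an assumed stand-in for "a few microseconds"):

```python
# Speedup implied by the access times quoted above.
disk_access_s = (7e-3, 10e-3)   # 7-10 ms for a mechanical drive
vcache_access_s = 30e-6         # "a few microseconds"; 30 us assumed here

for d in disk_access_s:
    print(f"{d*1e3:.0f} ms disk vs {vcache_access_s*1e6:.0f} us V-Cache: "
          f"{d / vcache_access_s:.0f}x faster")
# -> roughly 233x and 333x, consistent with the 200-350x range in the text
```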

However, data residing in memory is volatile and must be protected against loss of power to the memory. To prevent potential data loss due to power disruptions, V-Cache incorporates both battery backup and an internal hard disk drive, and automatically detects power abnormalities. In the event of a power fluctuation, circuits inside the V-Cache accelerator sense the loss and automatically switch over to power from the internal batteries. If the power disruption persists longer than a user-prescribed period of time (i.e. 10 seconds, 30 seconds, 1 minute, etc.), all data from the V-Cache memory is then saved to the internal disk drive. Once data is safely saved to the internal disk, the device will switch off and wait for the power to return, as the data is now protected. When stable power is re-established, data is copied from the internal disk drive back to the V-Cache memory area and the files may again be accessed by the applications. There is virtually no risk of ever losing data with these safeguards in place. A file created on a device like this is as safe as, if not safer than, a file created on a regular disk drive.

5.5.1 Using V-Cache With Oracle

In a flat file based application, it is easy to identify the HOT (heavily used) files and place them on a V-Cache accelerator to derive maximum and immediate performance benefit. When a database application is involved, it is a more complex scenario. A thorough understanding of the database and the read and write pattern of the application is required before anything should be attempted. Many times the application itself may not provide room for improvement due to internal locking or waits for other processing to complete.

For an application to show benefit, it must be performing disk access at a very fast pace and must be writing or reading fresh data. Oracle performs its own smart optimization by buffering all writes and reads in the System Global Area (SGA). Data is written to the disks only on a need-to basis. Over and above this database functionality, the operating system performs its own buffering. This means there are two sets of buffers to be scanned before a physical disk access is required, and this double-buffering strategy works well in many applications, adding a second level of caching. Oracle always does a synchronized write, meaning the operating system buffers are bypassed for all writes to disk. These buffers are used for reads only if they are not marked as old or dirty. This is largely true except in some implementations of Oracle where the operating system buffers are bypassed completely; for example, Parallel Server bypasses operating system buffers to ensure read consistency across instances.

Under Oracle, the greatest I/O activity exists with the rollback segments, redo logs and temporary segments. These files are excellent candidates for a V-Cache implementation depending on the application. Oracle data files can be placed on a V-Cache accelerator when there are a lot of random reads and writes to specific tables.

Ensuing performance benefits can be extremely varied and generally depend on how much V-Cache memory is configured for the application. The following paragraphs explain how to use V-Cache accelerators with various Oracle files and the types of applications that can enjoy improved performance benefits.


Redo Logs

When an application commits data to the database, Oracle writes the commit information to the redo log files. An Oracle process called the log writer performs the task of writing the data to disk from the SGA (System Global Area). The user data could still be in Oracle's SGA, but the transaction is deemed complete. This scheme allows Oracle to support high transaction rates by saving only the bare minimum of data to disk. The user data is saved to disk later by the DBWR process, which wakes up at predefined intervals.

The following performance improvements can be expected from placing Redo Logs on a V-Cache accelerator:

• High Transaction Rates -- An application with high transaction rates will write large volumes of data to the redo logs in a very short time. Redo logs, when placed on a V-Cache accelerator, can provide dramatic performance benefits in such a scenario. Performance gains of up to 2,000% have been seen, though it is much more common to see improvements ranging up to 80%.


• Transaction Processing -- In a transaction-processing environment, data is written to both redo logs and rollback segments. Performance benefits will be greatly enhanced when both redo logs and rollback segments are placed on V-Cache, particularly for applications with a high volume of inserts, updates, and deletes.

• Decision Support -- In a decision support environment, which typically only reads data, V-Cache accelerators will generally not show a performance benefit.

Rollback Segments

Oracle stores previous images of data in the rollback segments. When an application makes changes to data, it is stored in the rollback segments until the user commits the transaction. All processes read the previous image of the data from the rollback segments until the transaction is committed. Oracle does its own buffering of the rollback segments in the SGA.

However, when large transactions occur, they are written to the disk, and if other processes need this previous image of the data, it will be read from the rollback segments. Rollback segments on a V-Cache accelerator will speed up both the reads and the writes.

Generally, a transaction-processing environment will substantially benefit from putting rollback segments on a V-Cache accelerator. On the other hand, a decision support system, which primarily analyzes data, will typically not show performance benefits.

Temporary Segments

Oracle uses temporary segments to store intermediate files. These files could be the result of sub-queries or temporary sort files. Data is both written to and read from the temporary segments by the various Oracle processes. Small sorts are entirely performed in the SGA, while large sorts are performed by using disk space in the temporary segments.

The following performance improvements can be expected from placing Temporary Segments on a V-Cache accelerator:

• Large Sorts / Parallel Queries -- An application performing large sorts or making heavy use of the parallel query option can show immediate results with a V-Cache accelerator.


• Transaction Processing -- Transaction processing environments with complex queries could also benefit from using a V-Cache accelerator. However, transaction-processing environments with low volume sorts will generally not show similar benefits.

• Decision Support -- Decision support systems that retrieve or sort large volumes of data can show immediate results with a V-Cache accelerator.

Data Files

Oracle stores data from tables and indexes in tablespaces residing on Oracle's data files. All data is buffered and stored in the SGA and, as indicated earlier, will go through the operating system's buffers as well. Putting data files on a V-Cache accelerator will provide benefit only in unique situations. The writes to the data files are done by the DBWR at predefined intervals and are therefore not an immediate priority. Excessive reads can be avoided by increasing the number of buffers and creating the heavily used tables with the CACHE option. This improves the likelihood of the table data blocks being available in the SGA on most occasions. This technique can be used for smaller tables by increasing the size of main memory and the SGA.


The following performance improvements can be expected from placing Data Files on a V-Cache accelerator:

• Large Tables -- Large tables that are randomly read or written at high frequencies should be placed on a V-Cache accelerator. Such tables can be created on separate tablespaces set up on the V-Cache accelerator.

5.5.2 Determining The Need For V-Cache

Oracle applications that have consistently high I/O rates that saturate both the SGA and the operating system buffers will benefit the most from a V-Cache accelerator. Saturation of the SGA is indicated by the hit ratio, which can be determined by using a monitoring tool like the Oracle Monitor or by looking directly at the V$ tables. High transfer rates to a particular data file or rollback segment could indicate possible performance benefits.
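As one hedged illustration, the classic buffer cache hit ratio can be computed from V$SYSSTAT-style counters. The formula below is the widely used approximation, and the counter values are hypothetical placeholders:

```python
# Classic Oracle buffer cache hit ratio from V$SYSSTAT-style counters.
# A low ratio suggests the SGA is saturated and hot files may benefit
# from faster placement (e.g. V-Cache). Counter values are hypothetical.

def buffer_cache_hit_ratio(db_block_gets: int,
                           consistent_gets: int,
                           physical_reads: int) -> float:
    """Hit ratio = 1 - physical reads / logical reads."""
    logical_reads = db_block_gets + consistent_gets
    return 1.0 - physical_reads / logical_reads

ratio = buffer_cache_hit_ratio(db_block_gets=120_000,
                               consistent_gets=880_000,
                               physical_reads=210_000)
print(f"hit ratio = {ratio:.2%}")   # -> 79.00% for these sample counters
```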

The operating system should be monitored for I/O activity, and utilities such as sar and iostat can be used to analyze the operating system buffer activity. If the operating system buffers are large enough, there will be more logical than physical accesses. If the database (Oracle's monitor) shows a high number of physical reads but the operating system does not, then the chances of obtaining any meaningful benefit from a V-Cache accelerator are generally low. On the other hand, an application that causes physical reads both at the database level and at the operating system level can substantially benefit from V-Cache's performance attributes. The exception to this is Parallel Server, which always bypasses the operating system buffers.

An application with intermittently high I/O rates for short periods of time will not generally show much performance improvement with V-Cache, because the database could work on flushing the buffers during the lean periods. I/O transfer rate, rather than how much I/O is performed, determines the need for a V-Cache accelerator. The reads and writes must also slip through the buffering schemes before any benefits can be seen.

Raw Devices

A V-Cache accelerator can be configured as a raw device under Unix. Data written to or read from any disk configured as a raw device always bypasses the operating system buffers. This eliminates the overhead of buffer management. Raw devices are good for bulk-volume writes: a process shows good I/O bandwidth when large volumes of data are written at one time, because it is bypassing the buffering scheme.

The same is not true of processes that write data at random in small bits and pieces. Such processes typically perform poorly on raw devices because they have lost the benefit of the buffers and must wait for the disk to respond. For these environments, using a V-Cache accelerator as a raw device under Oracle will increase throughput, because the data will move directly from Oracle's SGA to V-Cache.

RAM Disks

RAM disks have been available with every operating system for a long time. A RAM disk is a part of main server memory that can be designated as a drive; the system will use that area to store files or any other user data. A RAM disk costs the same as main memory and can be good for storing small volumes of data or doing some high-speed sorts in a flat-file environment. However, the main disadvantage of using a RAM disk is that the CPU now performs the I/O rather than delegating the work to the I/O controller. A process performing I/O with RAM disks will run very fast, but it will start loading the CPU with read and write requests, which can significantly slow down other processes. The whole system can appear sluggish, especially when large data volumes are written to the RAM disk, because the workload on the CPU has increased.

Placing Oracle's redo logs, rollbacks, or temporary segments on RAM disks could potentially slow the entire system down in a heavy transaction-processing environment. Also, the effect of RAM disks on SMP (Symmetric Multi-Processing) machines with multiple CPUs must be evaluated. An SMP environment with multiple CPUs and RAM disks could potentially saturate the system bus and cause a rapid slowdown of the whole system in some hardware architectures.

Herein lies another advantage of the V-Cache accelerator: the ability to share contents across multiple servers in clustered scenarios, which is not possible with internal server memory.

Also recall that anything written to RAM disks is volatile, and servers do not usually come with the automatic battery backup and disk backup capabilities of the V-Cache accelerator. If power is lost or the machine is rebooted, data will be lost unless there are other planned alternatives.

From a hardware point of view, a RAM disk will use up more slots in the backplane of the computer, because more memory boards will have to be installed, thereby limiting the number of slots available for other purposes.

5.5.3 V-Cache Oracle Summary

When intelligently applied, V-Cache accelerators can provide outstanding performance improvements in Oracle environments. DBAs can obtain much faster data access times by placing an application’s “HOT” files on a V-Cache accelerator as opposed to traditional disk drives. V-Cache does not replace disk drives or cached arrays but is used in key areas to resolve I/O bottlenecks.

5.6 Summary

From an MTI hardware perspective, I see the New York City Data Center as having potentially major application bottlenecks going forward if the current method of separate pools of storage hardware is retained after the move to Colorado. We can’t speak for other vendors, but I doubt that there are any applications that could not be moved to, or would not benefit from moving to, a core-switch-based SAN. Moving to a core switch 2 GIG SAN will increase overall performance in and of itself.

We do suggest that DoubleClick start the transition to newer SAN hardware in Colorado at their earliest convenience. This would permit them to move to newer, more reliable, and faster technology as they transfer their data and applications to Colorado.

DoubleClick should also consider installing new WINTEL platforms in Colorado to replace some of the obsolete COMPAQ PROLIANT systems that they would otherwise move. I believe the benefits in speed, flexibility, storage capacity, reliability and lower maintenance costs would more than justify these actions.

6.0 Disaster Recovery

6.1 Current

The customer currently backs up to DLT7000 tape in the 2 tape libraries. In case of disaster in the New York Data Center, their plan is to recover NYC operations at the Colorado Data center.

Data is currently stored offsite by Iron Mountain Incorporated. http://www.ironmountain.com/Index.asp

Once the move to Colorado is complete, DoubleClick may want to look at a new disaster recovery plan.

6.2 Proposed

The current move to Colorado leaves 3 possible alternatives.

1. Off Site Disaster Recovery firm (i.e. Comdisco or IBM Business Recovery Services)

2. Create another DoubleClick-owned data center, use it in the event the Colorado center fails, and do replication using software (Fujitsu TDMF) or a SAN appliance/software solution (DataSentry2)

3. Or simply use an offsite storage facility for tapes, and buy computer time from one of the many disaster recovery centers run by various vendors.

6.2.1 Off Site Disaster Recovery

This is my first recommendation. Purchasing services from any reputable Business Recovery Service is an expensive proposition; however, building a DoubleClick-owned duplicate data center is even more expensive. Doing without some kind of disaster recovery is a non-starter. If DoubleClick loses its data center, a significant portion of its revenue will be lost, making recovery even more difficult.

This solution becomes feasible if DoubleClick does not wish to invest in hardware to duplicate the Colorado Data Center.

6.2.2 Create a DoubleClick-Owned Disaster Recovery Data Center

The first step would be to decide whether a fully duplicate center is necessary or whether certain key application capabilities would suffice.

If DoubleClick goes this route, they can customize their solution, and perhaps come up with a more viable and cheaper solution than an offsite Business Recovery Firm. There are also some other advantages. DoubleClick could have empty tape libraries and a duplicate SAN in their recovery location. The benefit is that if the disk drives in the primary Colorado data center survive a disaster, these drives could be plugged into empty MTI SAN arrays at the Disaster Recovery Data Center. This way the data could be recovered without a tape restore.

Depending on the scale of the duplicate data center, full-time data replication over the WAN could be implemented. That of course carries its own cost in WAN infrastructure.

If DoubleClick decides they can live with a little extra time involved in getting back online, then a scaled-down (critical systems only) solution using tape restore might be the best option.

6.2.3 Offsite Storage Facility

Many reputable companies sell fire- and disaster-tolerant (notice no one ever says “proof” anymore) facilities that are environmentally controlled for tape backup storage. This is not technically disaster recovery, but falls under data preservation.

This is probably the least expensive solution, and is the minimum that DoubleClick should employ. The offsite storage would provide some immediate protection from unhappy employees, computer viruses, and less devastating forms of data loss.

In the event of a complete data loss, DoubleClick would still be left with finding a data center and systems to restore to. But this solution, when combined with a contract with a disaster recovery computer systems provider, has been found to be cost effective.

DoubleClick has stated an intention to at least retain this level of data protection. They are happy with Iron Mountain’s services and may seek to continue with them after the expansion/relocation to Colorado.

6.3 SUMMARY

It is our recommendation to take a best-of-both-worlds approach. Buy enough hardware to replicate all the critical functions of the Colorado Data Center (once the New York Center is closed). Break these critical functions into 2 categories: mission critical and mission essential.

Mission Critical. Have those must-have functions (revenue generating, business continuance) and databases up and replicated from the Colorado data center to the Disaster Recovery Data Center. This would allow an immediate continuance of corporate business functions in the case of a disaster.

Mission Essential. Things like Microsoft Outlook. E-mail is essential to business nowadays, but losing use of your current inbox and files can be worked around and is not a showstopper. Identify these non-critical applications and allow them to remain offsite on tape. Plan either to acquire hardware to run them when the need arises (natural disaster) or to re-allocate the hardware (on the disaster recovery site) to accommodate the must-have functions.

In this manner DoubleClick could stay in business, with critical services intact, and restore those non-critical capabilities as time/resources permit.

7.0 Control / Virtualization / Data Migration

Currently there exists no perfect product to handle all the management of a SAN. For the future, software vendors promise packages that can perform hardware configuration, backups, hardware monitoring, performance monitoring, and system administration.

For now there are a lot of vendors trying to pull this off. Some claim to have this functionality, but when you read between the lines, or ask hard questions, it becomes obvious that they do not.

The following recommendations reflect our best opinion on software that currently exists that will allow reliable control, monitoring, system administration and data migration.

7.1 MSM / FUSION

MSM & FUSION are MTI products for controlling MTI’s family of hardware SANs. Every vendor has their own version of hardware control software.

1. MSM – MTI SAN Manager (MSM) is a Web-GUI specifically for administering the MTI S200 & 400 series of Arrays. It allows for LUN Masking, LUN Initialization, array performance monitoring, and other housekeeping/administrative functions.

2. FUSION – Fusion is MTI’s dedicated graphical user interface for the Vivant Vxx series Enterprise RAID systems. It performs all the same functions as DAM, but is able to do such things as array initialization and array re-configuration in the background. Some operations (LUN creation) in DAM require a controller reset, which would affect other hosts using that array. FUSION & the Vivant Vxx have no such limitation: the array can be re-configured on the fly with no impact to any other hosts using it. (As DoubleClick upgrades their V20/V30 series to the S2xx series, Fusion will no longer apply.)

7.2 Veritas NBU

Currently DoubleClick is using Veritas NetBackup to manage their backups and tape libraries. There is currently no SAN-wide control tool that addresses backups of the size and complexity of those run at DoubleClick. Veritas is one of the best software backup automation and control suites available. We suggest no change here; continue using Veritas to control backups.

[Figure: NetBackup console overview]

7.3 Veritas SANPOINT / Disk Management

The Veritas SANPoint / Foundation Suite allows a more flexible control of the customer's storage and allows for retention of some of the legacy storage.

It will allow storage to be grown, shrunk, and modified dynamically. Any storage on site may be used (JBOD, RAID, internal disks), regardless of vendor.

[Figure: SAN Virtualization with VxVM, non-disruptive on-line storage management. The diagram shows server applications connected through a fabric of SAN switches, with VxVM (NT beta, Solaris, HP/UX) providing: disk group ownership; DMP path failover & load balancing; remote mirror & software RAID; performance optimization; dynamic grow/shrink of logical volumes.]

7.4 Fujitsu (Softek) SANView

Simplify the management of multi-vendor SAN devices

Softek SANView is a proactive Storage Area Network (SAN) fabric manager that simplifies SAN troubleshooting. It monitors the health and performance of your SAN-connected devices so you can diagnose and avoid outages. Through automatic discovery, the software travels into the SAN, identifies each of the various devices and then draws the topology of your SAN.

See your SAN connections

Equipped with this graphical representation, you can quickly view interoperability and connectivity of the different SAN devices. Softek SANView then monitors all vital functions and reports on the health of your SAN, proactively alerting you to issues before they impact application availability.

Without the clear picture of SAN connectivity that Softek SANView provides, system administrators are operating in the dark. Problems can't be anticipated and when problems do occur, they have no idea where to begin troubleshooting.

Virtualize Your SAN Infrastructure

Unique in the industry, Softek SANView's synergy with Softek Virtualization enables customers to virtualize and visualize their SAN infrastructure.

7.4.1 Fujitsu (Softek) Storage Manager

Fujitsu SANView is a SAN monitoring and viewing tool. To add control functionality, you must add the Storage Manager option.

• Centralize storage management for mainframe and open systems—Centrally monitors, reports and manages storage across the enterprise from a single product in a single console. Simplifies and manages more storage with fewer resources.

• Improve productivity through automation—Monitors and automates actions (archive, delete, provision) to allow management by exception. Lets you set storage policies across servers or groups of servers. Schedules actions to run based on pre-defined criteria. Eliminates the risk of human error or application downtime.

• Manage multi-vendor, heterogeneous storage with one solution—Hardware and software independence eliminates vendor lock-in or proprietary solutions. Supports all storage types from all vendors including DAS, SAN and NAS.

• Define business-process views of storage—Views data from an application perspective so decisions can be based on business requirements, not technology alone. Group servers by your logical business model. Manages your business, not your servers.

• Automate storage provisioning—Maximizes uptime by automatically provisioning storage to applications that need more space. Users never experience application downtime due to “out-of-space” conditions.

• Know what you have, so it can be effectively managed
• Increase utilization rates through threshold monitoring
• Analyze storage across all media and easily maintain your desired storage state
• Increase productivity through automation

7.4.2 Fujitsu (Softek) TDMF

Softek TDMF, Mainframe Edition simplifies storage migration for mainframe data. It ensures that an organization's revenue-generating applications suffer zero interruption due to the addition of new storage. Without it, the addition of new storage might require a five- to six-hour interruption before being able to access mission-critical business applications. DoubleClick was evaluating this product during the week of 13 – 17 January 2003. Initial results moving data from NYC to Colorado over the OC3 pipe were promising.

Vendor independent
Softek TDMF is completely transparent to end-users, operating on any mainframe-compatible equipment regardless of manufacturer. Such vendor independence allows you to reduce your maintenance costs by replacing expensive hardware/software migration and/or replication solutions.

Ensure business continuance with Point-In-Time Copies

Softek TDMF, Mainframe Edition features a base component with add-on modules that extend product functionality. The first module, Offline Volume Access (OVA), enables the processing of Point-In-Time copies using standard in-house utilities such as DFSMSdss or FDR to maximize the availability of critical production data.

Immediate Return on Investment

• Maintain continuous data availability during data movement and backup
• Replace scheduled outages with business continuance
• Satisfy service-level agreements

7.5 MTI DataSentry2

MTI’s DataSentry2 product is an appliance-based data management solution. Typical applications include remote disaster recovery; creation of offline volumes for development, business operations, and data center maintenance usage; and remote ServerFree and zero-impact backup. DataSentry2 solutions include:

- QuickSnap: Point In Time Images
- QuickDR: Short & Long Haul Replication
- QuickCopy: Full Point In Time Image Volumes
- QuickPath: Server Free Backup

DataSentry2 Console:

The DataSentry2 console provides single-point control of Quick functions for DataSentry2-enabled systems throughout the enterprise. From a central SAN console the administrator can use the drag-and-drop GUI to: create point-in-time images, create a mirror, extend volumes, define replication relationships, kick off QuickSnap, quiesce databases, start QuickPath backup, and restore data after disasters.

7.5.1 QuickSnap (Point In Time Images)

QuickSnap is used to create a DAV (Data Availability Volume). This virtual volume represents the source volume at a specific point in time. Once the QuickSnap is taken, the resulting DAV can be mounted to a host, and subsequent writes to the source volume will cause a physical copy of the older data from the source to be kept aside for use during access of the QuickSnap DAV. This scheme uses storage space efficiently, since only the writes since the last QuickSnap have to be kept rather than an entire physical copy of the volume. The facility is often used for periodic recording of an image for later rollback/restore in case of a user-induced write error or accidental delete. Since creating a QuickSnap DAV happens instantaneously, the technology eliminates the need for incremental resynchronization: it will always be faster to take a new QuickSnap than to perform a traditional incremental resynchronization operation.
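
A toy Python model of the copy-on-write bookkeeping described above (the blocks and structures are invented for illustration; the real appliance does this at the block level internally):

    class CowSnapshot:
        def __init__(self, source):
            self.source = source      # live volume: block number -> data
            self.preserved = {}       # old data kept aside after the snap

        def write(self, block, data):
            # First write to a block since the snap: keep the old data aside.
            if block not in self.preserved:
                self.preserved[block] = self.source.get(block)
            self.source[block] = data

        def read_snapshot(self, block):
            # The DAV view: preserved old data if the block has changed,
            # otherwise the unchanged live block.
            if block in self.preserved:
                return self.preserved[block]
            return self.source.get(block)

    volume = {0: b"A", 1: b"B"}
    snap = CowSnapshot(volume)
    snap.write(0, b"A2")
    assert snap.read_snapshot(0) == b"A"   # point-in-time view
    assert volume[0] == b"A2"              # live volume has moved on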

7.5.2 QuickDR (Data Replication & Migration)

MTI’s DataSentry2 supports creation of remote-site hot-standby volumes for use during maintenance cycles and local disasters, or to migrate data over short or long distances.

Using DataSentry2, a replication set is established within the local storage system. Next, a replica is created inside the remote storage system using either the internal or external (in this case OC3) network. This one-time copy may take a while, as it is a complete physical copy.

During replication set-up, you choose a time of day or watermark (number of new writes) to establish a trigger for automatic initiation of updates. When the trigger occurs, a QuickSnap is automatically taken and the updates are sent over the long-distance communication channel to update the replica with the last set of changes. In this way the replica (remote) volume is kept synchronized.
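
The trigger logic can be pictured with a short sketch. The watermark value, polling interval and stub functions are assumptions for illustration, not the DataSentry2 interface:

    import time

    WATERMARK = 10_000      # new writes that trigger an update cycle
    SYNC_HOUR = 2           # or trigger during the 02:00 window

    def replication_loop(new_write_count, take_quicksnap, ship_updates):
        last_synced = 0
        while True:
            writes = new_write_count()
            in_window = time.localtime().tm_hour == SYNC_HOUR
            if writes - last_synced >= WATERMARK or in_window:
                snap = take_quicksnap()   # instantaneous point-in-time image
                ship_updates(snap)        # send only the changed blocks
                last_synced = writes
            # A real implementation would fire once per scheduled window;
            # this sketch simply polls once a minute.
            time.sleep(60)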

When replication is complete, the customer can choose when to break the volume and use the replicated volume in Colorado to take over operations. QuickDR is used in this fashion either to migrate data over long distances, or to keep a replica copy of critical data, either locally or at a remote data center, for disaster recovery.

7.5.3 QuickCopy (Full Point In Time Image Volumes)

When a QuickCopy is issued to create a DAV, access to the DAV is available to a host only after the physical copy process completes. (In this way it differs from a QuickSnap, where data is available immediately.) Multiple QuickCopies can be taken from the same source volume and can be mounted to any host in the data center SAN, or replicated to another site for use. The advantage here is that, because it is a complete copy, access time is the same as for the original volume. This is useful for making copies of databases or applications to test changes without corrupting or changing live data.

7.5.4 QuickPath (Server Free & Zero Impact Backup)

For high-end storage architectures, one can create off-site rotating storage (on-line) disaster images, keeping them constantly refreshed in case of a need to quickly switch to the standby site. If the remote backup image is kept on-line, then it can be updated frequently, and the time to bring up the remote site can be minimized, since lengthy tape restorations can be avoided. This technology uses QuickSnap features to allow the customer to complete zero-impact backups.

7.5.5 DataSentry2 Migration Considerations

Existing storage systems can be field-upgraded to support DataSentry2. Systems that have been previously configured by the host operating systems can be imported into DataSentry2, which allows for a seamless migration. The imported volumes act just like newly created virtual volumes; all “Quick” functions can be run against them.

7.6 Software Review

Currently DoubleClick has the typical problems you would expect to see in a data center that has faced their challenges. The solutions listed below represent the best of what is available, vice “vaporware”.

1. Hardware Control: (of Hitachi / MTI or other SAN) MSM (MTI), or the native hardware GUI if they go with another vendor's SAN (Hitachi).

2. BACKUP – Stick with Veritas NBU (NetBackup).

3. Storage Virtualization – For systems hooked up to an MTI Vxx class array, DataSentry2 or Fujitsu Storage Manager provides much of this functionality, which currently does not exist in the data center.

4. Fujitsu SANVIEW – when combined with Fujitsu Storage Manager, allows for not only control but also viewing of your SAN, down to the individual components and firmware that make it up.

5. VERITAS SANPOINT / VxVM (Virtualization & Viewing) – to allow integration of legacy storage and dynamic re-allocation of storage. It is broadly analogous in functionality to Fujitsu SANVIEW and Fujitsu Storage Manager combined.

6. BROCADE WEBTOOLS (SAN ADMINISTRATION) – The Brocade web-browser Java GUI comes with Brocade switches and is Brocade's native switch control / monitoring / configuration software. It is a Java web-based GUI and is quite flexible and robust. DoubleClick already has this.

All of the above software allows for running in a consolidated mode, a distributed mode, or in some cases both. In consolidated mode, all the above software could run within Windows on 1 or 2 NOC consoles to allow for system monitoring and control. In distributed mode, the software could be set up to be accessed remotely or from personnel's desktops located around the Colorado Data Center.

7.7 SUMMARY - ENTERPRISE STORAGE

My recommendation for data migration is to go with either Fujitsu TDMF (pending the outcome of onsite testing) or DataSentry2 to migrate the data to Colorado. For now you could use TDMF or DS2 to migrate your data, and proceed with SANVIEW and control software improvements after the move is complete and you have your systems running in Colorado. If you implement V400's and Brocade 12000 core switches in Colorado, you can take a phased approach to improving your SAN control and consolidation software.

Once the data is migrated to Colorado, the customer may want to look at a centralized control and monitoring system for all their storage. Fujitsu SANVIEW with STORAGE MANAGER is one option and may be the best. If DoubleClick goes with TDMF to migrate their data to Colorado, then standardizing on Fujitsu for the SAN software makes sense.

The alternative would be SANVIEW to view and monitor the SAN, and DataSentry2 to allow control and virtualization of “ALL” the onsite storage. That could include any direct-attach (JBOD) storage that remains.

The nice thing about DataSentry2 is that it not only virtualizes all the storage you wish to put under its control, but also brings SnapCopy, Zero Impact Backup and Disaster Recovery (via replication) to the table.

8.0 Special Applications Considerations

8.1 Database - Considerations

Customer database considerations for a SAN implementation are fairly straightforward. Oracle, Sybase, and SAP all have common areas that can benefit from performance tuning, including the underlying database system's transaction logs, journal files and temp space. To optimize the performance of transaction logs, redo logs, journal files and temp space or rollback segments, place these files or objects on fast SAN-RAID arrays configured for RAID 0+1 for the best performance. Also important is the need to isolate database storage from other activities, placing it on arrays and controllers not being used for storing other active data tables.

SAN-RAID (i.e. MTI V400 / S240 etc.) is designed for storing data in a cost effective centralized manner, with performance being a by-product. Since redo logs, transaction logs, temporary tables, journal files, and rollback segments tend to be I/O bound, any steps that can be taken to improve I/O performance will be of great benefit to DoubleClick.

MTI V-Cache is a SAN-enabled solid-state disk. Like direct-attached solid state disks, it is an I/O device that can be used to store and safeguard data. Solid State Disk technologies like V-Cache use random access memory (RAM) to store data, thus eliminating the access latency of mechanical disk drives. The near-instantaneous response and the ability to place a RAM disk within a SAN allow for quantum improvements in database applications by moving those small but I/O-intensive segments of the database off of traditional mechanical disks.

Traditional storage is measured in dollars per megabyte, while Solid State Disk and V-Cache should be measured in dollars per I/O. In other words, you could throw a lot of disks at an I/O problem and still not fix the underlying problem. On the other hand, targeting those I/O-bound files with a V-Cache or Solid State Disk gains a substantial performance increase.

Solid State Disks/V-Cache and SAN-RAID complement each other, particularly in database and SAP environments. SAN-RAID is used for tables, indices, and some journal files with the appropriate RAID level, while Solid State Disk is used for redo logs and key tables requiring very low latency and high performance. Performance involving temporary and system table segments can also benefit from placement on an SSD or a SAN-RAID array configured as RAID 0+1.

DoubleClick's current Oracle database could benefit from some of these tuning considerations (i.e. RAID 0+1) when a centralized SAN is implemented. However, there may be a cost-benefit consideration in doing hardware performance improvements like a Solid State Disk/V-Cache.

8.2 Microsoft Exchange 2000 - Considerations

There are no special considerations for employing Exchange 2000 on a SAN. However, DoubleClick may want to consider some kind of software snapshot technology for capturing point-in-time backups of Exchange.

Software snapshot technology is implemented through a copy-on-write technique that takes a picture of a file system or data set. The process of taking a snapshot is almost instantaneous, and data is moved to a backup medium at the block level. Any block that is changing while this snapshot image is being backed up is moved to cache. This important process allows the file system to remain in production while also enabling the snapshot to preserve its exact moment in time by flushing the cache to the backup medium. In the event of a restore, data can be restored more efficiently at the file level instead of the volume level. This is made possible by the data mapping that is preserved along with the backup image.
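
The value of that preserved data mapping can be shown with a toy example: given a block-level backup image plus a file-to-block map, a single file can be restored without touching the rest of the volume. All names and structures here are invented for illustration:

    backup_blocks = {0: b"hdr", 1: b"body1", 2: b"body2", 3: b"other"}
    file_map = {                       # file name -> ordered block list
        "mailbox.edb": [0, 1, 2],
        "misc.dat": [3],
    }

    def restore_file(name):
        # Reassemble just this file from the block-level backup image.
        return b"".join(backup_blocks[b] for b in file_map[name])

    assert restore_file("mailbox.edb") == b"hdrbody1body2"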

Veritas NetBackup supports this functionality in Microsoft Exchange (API-level integration). MTI's DataSentry2 can also provide this capability in a robust hardware appliance.

9.0 Summary - Roadmap

We are submitting two basic topologies as possible candidates for DoubleClick’s transition to an integrated SAN.

9.1 Proposal Overview

The proposal is based on the following assumptions / recommendations.

- Decommission all 3600 arrays

- Leave S240 (Vivant Conversions) in NYC to support QA & DEV and replace 3600’s.

- Transition all high throughput Disk I/O to 2 GIG operations.

- Leave backups on 1 GIG SAN and migrate as time and $$ permit to 2 GIG architecture as necessary after the relocation

- All Brocade switches in Colorado to be installed in core switch fabric.

- Recycle Ancor switches into meshed fabric to support remaining systems in NYC.

- Upgrade Vivant V20/30 to 73G drive technology (as necessary) to gain space to eliminate older JBODs and/or NetApps.

Topology #1

Topology #1 starts with the installation of 2 Director class Brocade switches (Brocade 12000s). Unix systems (Software Clustered or Stand-Alone) and Windows (Clustered or Stand Alone Systems) are connected directly to the Brocade 12000s.

Tape libraries are connected to the core switches and are built into a backup zone on the switches, along with the 1 GIG HBA's that support backup operations. In this topology, migration to a 2 GIG topology for backup is easier to accomplish in phases, but it mixes disk and tape I/O on the core switches.

3 70” cabinets of V400’s (12 V400’s with 32 drives of 146G per V400) provide 56 TB of raw storage in the new 2 GIG technology (this effectively replaces 26 cabinets of 3600s). Each V400 cabinet would have 1 management processor for phone-home and equipment monitoring. Money can be saved by not buying the Brocade switches that normally come with the V400’s (2 switches per V400 is typical) and instead connecting them directly to the Brocade core switches. Add 2 GIG HBA’s to systems requiring storage and build a 2 GIG zone in Brocade.

Install the V400’s in Colorado and migrate data with DataSentry2 or Fujitsu TDMF.

Add the V-Cache solid-state database accelerator only if you require more speed than the V400 upgrade will produce.

A management console in the NOC can monitor the SAN via MSM (MTI supplied) or another product (i.e. Fujitsu SANVIEW). Backups are via a direct fibre-attached tape library. If desired, multi-protocol (IP over Fibre) operation can be run to move some of the backup traffic for non-SAN-capable systems off the network. Virtualization tools (beyond what comes with the V400), like MTI DataSentry2 or Fujitsu SANVIEW with the Storage Manager option, can be added after the move as money permits.

Topology #2

Topology #2 is basically the same as Topology #1. The main difference is that edge switches (Brocade 2800's) are used for the backups. In effect the backup topology remains almost unchanged, except that the Brocade 2800s are connected to the Brocade 12000 core switches via inter-switch links (ISLs).

This has the advantage of totally segregating Disk and Tape I/O operations, while allowing for the staged movement of tape operations over to 2 GIG in a controlled fashion, as time and money permit.

All other advantages and functionality are as per the description for Topology #1. I prefer this one, as it keeps the tape system I/O unchanged. There will be enough changes made during the move; this approach allows the tape backup architecture to be modified later, after the dust settles from the Colorado move.

9.2 Roadmap

• Start with a Director switch (SANs are a lot like networks, so you need a core to start from)

• Add 2 GIG HBA’s only to systems where 2 GIG throughput is necessary (i.e. high-I/O servers, SUN 4500 / 6500 etc.). Use 2 GIG HBA’s for disk I/O.

• Bring new enterprise storage into Director SAN (This would allow for a more centralized storage plan going forward vice the DAS and small SANs currently installed)

• Integrate tape/backup fibre environment into director (via ISL Inter Switch Links)

• Install multiprotocol drivers for IP/SCSI (this can be done after relocation). This can allow you to move IP backup traffic off the Ethernet network and onto the fibre backbone. (It can run concurrently with FCP/SCSI.)

• Determine host qualification for the SAN and its requirements, i.e. backup, replication, etc. (See 9.4 Transition Determination Points for guidance)

• Connect new hosts (SAN Candidates) to director switch (12000) or NYC Ancor fabric as necessary.

• Document each host's location, label fibre connections, and record RAID type, application, WWN, volume-mapping table and replicated volume. (See Appendix – 6, SAN Best Practices; a sketch of such an inventory record follows this list.)

• Use volume mapping / hard zoning to ensure data security

• If a host is due to be replaced, replication can be used to maintain uptime by replicating data to the new host and its storage. (See 9.3 Data Migration.) Storage and hosts can be added on the fly in a non-disruptive manner in this environment.

• You can scale to more storage by adding another core switch or edge switches, and continue growing your storage using tools like the Veritas SANPoint Foundation Suite, Fujitsu SANVIEW, or MTI DataSentry2.
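
The inventory record mentioned in the documentation bullet above could be as simple as a per-host structure like the following (all field values are made-up placeholders):

    host_record = {
        "host": "eps1",
        "location": "Colorado DC, row 4, rack 12",
        "fibre_labels": ["eps1-hba0-12k-port3", "eps1-hba1-12k-port19"],
        "raid_type": "0+1",
        "application": "DART OLTP",
        "hba_wwns": ["10:00:00:00:c9:aa:bb:01", "10:00:00:00:c9:aa:bb:02"],
        "volume_mapping": {"LUN0": "v400-cab1-vol07"},
        "replicated_volume": "dr-site-vol07",
    }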

9.3 Data Migration (DAS-SAN)

In moving data from one storage system to another (as you eliminate JBODs or migrate to Colorado), there are going to be several issues: the time window available, the impact to online resources during the move, and the reliability of the data after the move. For example, some data may need to remain online 24x7 during the move, while other data can be offline during the migration process.

The migration process is fairly straightforward for systems that do not need 24x7 availability. Each of these systems can be connected to the SAN via a host bus adapter. Storage can be assigned to that host while maintaining its direct-attached storage. The process at that point is a simple disk-to-disk copy/backup or mirror operation; the storage LUN allocated from the SAN merely looks like another local disk. The applications can be stopped, the data copied to the SAN, and the application re-started and tested. The advantage to this is that your data is still on your old DAS should something go wrong with the transition. There have been instances during migrations where time constraints or unforeseen problems have caused customer sites to fail back to their DAS, and this is a nice option to have. Of course, backing up the data prior to the migration operation is a nice added bit of insurance.
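
A sketch of that offline disk-to-disk copy, with a checksum verification pass, is shown below. Device paths are placeholders, and a real migration would also quiesce and unmount the volumes first:

    import hashlib

    SRC = "/dev/dsk/old_das_lun"   # placeholder source (DAS)
    DST = "/dev/dsk/new_san_lun"   # placeholder target (SAN LUN)
    CHUNK = 1 << 20                # copy a megabyte at a time

    def copy_and_verify(src, dst):
        h_src, h_dst = hashlib.md5(), hashlib.md5()
        copied = 0
        with open(src, "rb") as s, open(dst, "r+b") as d:
            while chunk := s.read(CHUNK):
                d.write(chunk)
                h_src.update(chunk)
                copied += len(chunk)
        # Re-read only the bytes we wrote (the SAN LUN may be larger).
        with open(dst, "rb") as d:
            remaining = copied
            while remaining > 0:
                chunk = d.read(min(CHUNK, remaining))
                if not chunk:
                    break
                h_dst.update(chunk)
                remaining -= len(chunk)
        return h_src.digest() == h_dst.digest()  # data identical after copy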

Where data needs to be 24x7, the procedure is similar. There would still be a need for a small maintenance window to install host bus adapters and drivers into the host being moved to the SAN. After that the host could be brought back online. Once the host is online, a mirroring application (i.e. Veritas Replication, Fujitsu TDMF, or MTI DataSentry2) could be used to replicate the data from the DAS storage to the SAN storage while allowing the application/host to remain online.

DoubleClick will have to weigh the benefits and impact of the downtime required for migration using a direct copy vs. replication, and decide which is appropriate on a case-by-case basis.

9.4 Transition Determination Points (DAS-SAN)

A basic rule set should be utilized by the storage administrator / system administrator to determine a server's / application's qualification for SAN placement in your environment. These should include, but are not limited to, the following basic rules (a sketch of such a check follows the list):

1. Does the application require high availability? (yes/no)

2. Does the application have high growth expectations, like dynamic storage needs? (yes/no)

3. The existing server / application has reached its end of life and is scheduled to be replaced by newer equipment.

4. The application has high performance requirements (video streaming, imaging).

5. The server / application has a storage need of 100GB or more.

6. The server is able to benefit from the Fibre Channel SAN architecture (no workstations: SPARC 5, 10, 20, Ultra 1, etc.).
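
Expressed as a simple check in Python (the thresholds and the pass criterion are assumptions for illustration, not a formal policy):

    def qualifies_for_san(app):
        rules = [
            app.get("high_availability", False),         # rule 1
            app.get("high_growth", False),               # rule 2
            app.get("server_end_of_life", False),        # rule 3
            app.get("high_performance", False),          # rule 4
            app.get("storage_gb", 0) >= 100,             # rule 5
            not app.get("is_small_workstation", False),  # rule 6
        ]
        # Rule 6 is treated as a hard gate; the rest count toward
        # qualification (any one being true is enough in this sketch).
        return rules[5] and any(rules[:5])

    print(qualifies_for_san({"storage_gb": 250}))  # True: big storage need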

9.5 Summary

DoubleClick has to expend a huge effort to migrate data and move production to a new facility in Colorado. It makes no sense to move high-I/O, top-end servers to older storage technologies.

Our suggestion is to build Colorado around the 2 core SAN switches purchased.

Purchase V400 (2 GIG) technology and move high-I/O core systems (6500/4500 etc.) to 2 GIG technologies. Use converted V2x/V3x (S24X systems) to replace JBODs on any surviving older COMPAQ systems. Upgrade the S24X series to 73G as necessary to gain space to eliminate older 3600 and NetApps technology.

Chapter 10.0 – Recommendations / Analysis Summary

RECOMMENDATIONS

Overview - The customer site is fairly well organized given the turmoil of explosive growth followed by an economic contraction. Our base recommendation is to install director-class switches (both Brocade 12000s) into the Colorado Data Center. These switches can serve as the basis for growing and reorganizing customer equipment into a fault-tolerant fabric.

Use the move to Colorado to do a concurrent “TECH REFRESH”, upgrading obsolete hardware as part of the relocation. You have to move data anyway; there is no use putting it on older or slower equipment. A good argument can be made for spending the money that would go to shipping heavy, obsolete cabinets across the country on installing new systems, with a warranty, in Colorado. To implement a true SAN you will have to get rid of the MTI 3600 disk arrays, as they do not support fabric (they are older arbitrated-loop technology) and are incompatible with the Brocade 12000’s DoubleClick purchased.

1. Complete V20/V30 upgrades to S240 – Currently the customer is upgrading his 10 remaining V20s and his one V30 to S240s. This is cost effective and represents the best that can be gotten out of the 1GIG fibre technology. (Consider leaving these in NYC to support DEV & QA systems staying in NYC.)

1a. To support the removal of the 3600’s, upgrade the following Vivants to 73G drive technology as necessary to add extra fabric-capable space. (These systems already have been, or are pending, upgrade to S24X technology.)

a. MTI CAB 17 V20 (48 x 36G) to (48 x 73G = 3.5T)
b. MTI CAB 16 V20 (60 x 18G) to (60 x 73G = 4.3T)
c. MTI CAB 72 S240 (60 x 36G) to (60 x 73G = 4.3T)
d. MTI #65 V30 (96 x 36G) to (96 x 73G = 7T) (*NOT CUST OWNED)
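
A quick arithmetic check of the figures above (raw capacity, with 1T taken as 1000G; the list's figures appear to truncate to one decimal rather than round):

    upgrades = [
        ("CAB 17", 48, 73),
        ("CAB 16", 60, 73),
        ("CAB 72", 60, 73),
        ("#65",    96, 73),
    ]
    for cab, drives, gb in upgrades:
        tb = (drives * gb) // 100 / 10   # truncate to one decimal place
        print(f"{cab}: {drives} x {gb}G = {tb}T")
    # CAB 17: 3.5T, CAB 16: 4.3T, CAB 72: 4.3T, #65: 7.0T -- matching the list.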

1b. Segregate out high-I/O systems to 2GIG V400 systems - Purchase V400’s and connect them and your high-value/high-I/O systems (i.e. SUN 4500 / 6500) to the 2 GIG SAN.

Suggested Systems / Applications 2GIG: Primarily the SUN 4500 & 6500 systems, but at least these to start: EPS1 / EPS2 / EPS3 / EPS4 / Strider / DartRpt Server / NYC DART OLTP / COLT Staging DB. UBR / FSADS / SPOD DB / DART W (Jetson) / DT (boomerang)

Suggested Systems / Applications 1GIG: FS Archive / FSSIM Archive / Custom/Report DB / Corporate Systems / Netbackup (systems) / NXP / Auto PT / Archsure / DF Paycheck

Most of these system choices were gleaned from customer interviews; I added some systems to the list that may also be helpful.

1c. Move converted S240’s to lower-I/O systems and/or leave behind for QA & DEV systems remaining in NYC: The V20/30-converted S240s primarily support the EPS systems and other applications running on the SUN 6500’s in the Primary data center. Move these to 2GIG/V400 technology, and use the S240s to replace other storage (i.e. SCSI JBODs on Compaq systems) in the facility.

1d. Decommission the NetApps boxes: There is more than enough space in the S240 conversions to allow them to also replace these boxes if you upgrade to 73G drive technology. The age of the NetApps boxes and your move towards an integrated SAN make this advisable.

2. END OF LIFE THE 3600 (CONCURRENT with UPGRADE TO V400) - This should be done for several reasons.

a. 3600 technology is near end of life. Maintenance contracts on it are far more expensive than on the newer V400.

b. It makes no economic sense to move 27 heavy cabinets cross country, when the money could be better applied to a technology upgrade in Colorado.

c. 3600 technology uses Gadzooks hubs; they are slow compared to 1 GIG switches and crawl compared to 2GIG technology. The hubs are also incompatible with a core-fabric-based Brocade solution (or any switch vendor's, for that matter).

d. Moving is not included in the maintenance contracts; any problems found in the 3600’s when they arrive in Colorado will be on a time & parts billable basis.

e. After figuring in maintenance cost savings (a V400 can be bought with a 3-year warranty) and moving cost savings, the V400 looks like practically a free upgrade (vice maintaining / moving the 3600’s).

3. UPGRADE LP8000 & JNI 1GIG HBAs to Emulex LP92XX Series 2GIG HBAs – Currently installed HBA’s are getting old and all are based on older 1GIG technology. The newer 2GIG technology is already supported in the Brocade 12000’s that DoubleClick has purchased. Staying with the old 1GIG HBA’s just adds unnecessary performance bottlenecks and maintenance costs. Maintain the 1 GIG HBA’s that support backup and add 2GIG HBA’s for disk I/O. Some 1 GIG HBA’s can be saved or re-distributed to support S240’s and JBOD conversions to SAN.

4. Leave the dedicated Backup SAN as is. DoubleClick currently has a dedicated backup SAN built around 2800 Silkworm switches, 1GIG HBAs and the in-house Ethernet network. These 2800s can be connected to the core switches (12000) and be used as edge switches. A backup zone can be built (using Brocade zoning) to segregate the backup operations. Then, as time and money permit, you can add 2 GIG HBA’s and migrate backup operations on the largest servers to 2 GIG operation via the 12000 switches. (This will require a tape library upgrade/replacement later to gain the full benefit of 2GIG fibre operations.)

5. INSTALL V400 w/o the switch (Connect to Brocade 12K Directly) – We sell our V400 product in 2 flavors: one with included Brocade switches and one without. Since DoubleClick already has the Brocade 12000 switches, purchase our V400’s without their built-in switches and connect directly to your 12000s. Use the money saved on switches to purchase a 3-year warranty (it is cheaper to buy extra warranty time up front) and 2 GIG HBAs (Host Bus Adapters).

6. UPGRADE OLDER PROLIANTS – A lot of the Proliants in the facility are tagged for moving. It may be a good time to consider upgrading them to newer/faster models. (Cost benefit vice moving / repairing etc.)

7. UPGRADE SMALLER SPARCS – Consolidate your older SPARC 2/5/10 & SPARC clones into fewer, larger models (i.e. SUN 450 / 3500 etc.) where practical. This saves on time, support, and maintenance contracts. (It is understood that in some applications this may not be practical.)

8. VCACHE – You are already using some 3rd-party direct-attach solid state disks, so the benefit of this technology in database and other applications is not new to you. Look at moving your highest-I/O database files to an MTI V-Cache device. The main advantage here is economies of scale: since the V-Cache can be connected to the core SAN switches, you don’t have to purchase more switches, and several systems can share 1 V-Cache unit. You could purchase 1 to 3 fully populated V-Cache units and share them across all your database applications, vice having a whole bunch of expensive direct-attach dedicated solid-state disks. If you move the primary applications to V400s, the performance boost might be such that you may consider postponing V-Cache till later.

9. STANDARDIZE – It seems most of your production hardware is now standardized around Compaq Proliant / SUN systems / MTI & Hitachi. You have some small pools of Dell & HP Servers. Try to stay away from adding any other vendors to the mix. Each new system has its own support, training, and spare parts issues that add more cost to your overall support. Most large datacenters choose a few vendors and standardize.

10. MAINTAIN LOCAL SYSTEM DISKS WITH MIRRORS – As looks to be your current practice, no matter whose SAN storage you use, maintain operating systems (SUN / Windows / LINUX etc.) on local disks and mirror locally. Today’s modern operating systems generate a lot of disk caching/thrashing activity that, when multiplied by the number of systems you have, generates a needless I/O drain on your SAN. Also, when troubleshooting SAN problems, should a system lose access to the SAN, local system error logs would aid in troubleshooting. If a system lost connection to the SAN and its system files were on that SAN, the error logs would be gone also.

11. GET RID OF SCSI JBODS & ATTACH JBOD SYSTEMS TO SAN – DoubleClick has a lot of legacy JBOD (mostly Compaq) SCSI-attached disks. Based on 4.3 / 9 / 18 GIG technology, these systems are slow and will show an increased failure rate in the near future. As above, transfer these to S240’s if you migrate off the 3600, or to V400s as appropriate.

12. CONNECT ALL SAN SYSTEMS TO BROCADE FABRIC - Assuming you upgrade to V400’s, connect the following into the Brocade fabric:

a. StorageTek tape libraries (using Brocade 2800’s as backup edge switches)

b. Hitachi 9xxx series SAN
c. MTI V400
d. MTI S240’s (and unconverted V20/30 series)
e. Other converted JBOD systems (i.e. COMPAQ)

If you stay with the 3600 technology you will have to leave these out of the SAN, as they are not fabric capable. Once complete, you can utilize a centralized SAN management or virtualization tool (i.e. SANVIEW).

13. USE BROCADE ZONING – If you follow the above recommendations, you can use zoning to segregate your systems into 2GIG, 1 GIG, V-Cache, and backup zones. Systems and HBAs named in a zone can only see other members of that zone. It’s a good management tool for organizing your SAN.
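
The zone-membership rule can be illustrated with a small sketch; the zone names and device aliases below are invented for illustration:

    zones = {
        "2GIG_DISK": {"eps1_hba0", "v400_cab1_ctlr0"},
        "BACKUP":    {"eps1_hba1", "stk_library_port0"},
        "VCACHE":    {"eps1_hba0", "vcache_port0"},
    }

    def can_see(a, b):
        # Two devices are visible to each other only if they share a zone.
        return any(a in members and b in members for members in zones.values())

    assert can_see("eps1_hba0", "v400_cab1_ctlr0")        # same disk zone
    assert not can_see("eps1_hba0", "stk_library_port0")  # different zones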

14. ADOPT A CENTRALIZED SAN MANAGEMENT TOOL – Fujitsu’s SANView would be an excellent choice. Once all systems are consolidated on the SAN Fabric, you will be able to fully utilize the power and versatility of this type of software tool. I believe you are already running tests with Fujitsu with an eye towards implementing this vendor’s software (i.e. TDMF).

15. ADOPT VIRTUALIZATION – Some of virtualization's benefits are already available in the S240 conversions and in any V400's you buy. But to take total advantage of virtualization, implement DataSentry2 or Fujitsu SANView with Storage Manager. Either will allow you to monitor and control your storage dynamically (making volumes larger, assigning free storage, controlling all storage to include JBODs).
