MSG389
Achieving High Availability with Windows Server and Exchange Server
Anthony Quigney, Application Solution Centre Manager, Dell EMEA
Brian Hayden, Senior Systems Consultant, Application Solution Centre, Dell EMEA
Agenda
Availability – Why is it important?
Availability – Defined
Availability – Business / IT Challenges
Availability Solutions
Windows Server 2003
Exchange Server 2003
Clustering
Dell High Availability Solutions
???
Today, computing resources are the axis on which business revolves. When these resources are unavailable to an organization, it is at risk of losing its competitive edge.
Effects Of Downtime
Lost systems lead to lost:
Revenue
Customers
Decision Capability
Data
Productivity
The Cost of Downtime
Source: IT Performance Engineering & Measurement Strategies: Quantifying Performance Loss, Meta Group, October 2000
Industry Sector                 Revenue Per Hour   Revenue Per Employee-Hour
Energy                          $2,817,846         $569.20
Telecommunications              $2,066,245         $186.98
Manufacturing                   $1,610,654         $134.24
Financial institutions          $1,495,134         $1,079.89
Insurance                       $1,202,444         $370.92
Retail                          $1,107,274         $244.37
Pharmaceuticals                 $1,082,252         $167.53
Banking                         $996,802           $130.52
Food/beverage processing        $804,192           $153.10
Consumer products               $785,719           $127.98
Transportation                  $668,586           $107.78
Utilities                       $643,250           $380.94
Health care                     $636,030           $142.58
Professional services           $532,510           $99.59
Construction and engineering    $389,601           $216.18
Media                           $340,432           $119.74
Hospitality and travel          $330,654           $38.62
Average                         $1,010,536         $205.55
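As a rough illustration of how the hourly figures above translate into the cost of a specific outage, the sketch below multiplies revenue per hour by outage length. It assumes revenue accrues evenly across each hour, which is a simplification; the function name `downtime_cost` is hypothetical and only a few sectors from the table are included.

```python
# Hourly revenue figures (USD) copied from the Meta Group table above.
REVENUE_PER_HOUR = {
    "Energy": 2_817_846,
    "Retail": 1_107_274,
    "Banking": 996_802,
    "Average": 1_010_536,
}


def downtime_cost(sector: str, outage_minutes: float) -> float:
    """Estimate lost revenue for an outage.

    Assumption: revenue accrues evenly over each hour, so a 30-minute
    outage costs half of the hourly figure.
    """
    return REVENUE_PER_HOUR[sector] * outage_minutes / 60.0


if __name__ == "__main__":
    for sector in REVENUE_PER_HOUR:
        cost = downtime_cost(sector, 30)
        print(f"{sector}: 30-minute outage is roughly ${cost:,.0f}")
```

Real losses depend on when the outage happens and on how much revenue is deferred rather than lost, so treat the result as an order-of-magnitude estimate.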
Four Levels of Continuity
High Availability: Maintaining the availability of systems critical to ongoing operations during a failure or service outage
Disaster Recovery: Recovering from unplanned, catastrophic events or disasters in a predetermined manner based on the importance of the system
Levels, in order of increasing cost, functionality and complexity:
Platform (In the Box)
High Availability System Features: Hot-swappable, redundant components with mission-critical support
Rapid Equipment Replacement: Vendor services and financing programs
Data (Beyond the Box)
SAN, NAS & DAS: Continuous data access
Backup and Restore: Real-time tape backup, off-site storage
Application (System Interaction)
Redundant Systems: Continuous server, storage, network access
Application Failover/Load Balancing: Continuous application access via clustering
Site (Beyond the Building)
Site/Datacenter Failover: Re-route users and data to replicated sites
Site Recovery: Remote or commercial recovery facilities
When a failure occurs, it makes an impact. Avoiding downtime results from properly planning, designing and implementing multiple levels of protection.
The Causes of Downtime
Causes of Failure | Examples | Impacts
Software defects/failures | Driver hangs, OS hangs/reboots, virus, file corruption | Platform, data, applications
Planned administrative downtime | Upgrade components, firmware, drivers, O/S, software | Platform, data, applications
Operator error and malicious users | Accidental or intentional file deletion, unskilled operation, experimentation | Platform, data, applications
System outage/maintenance | Software/systems requiring reboot, system board failure | Applications
Building/site disaster | Fire, storms, collapse, explosion, and other localized disasters | Site
Metropolitan disaster | Earthquake, hurricanes, floods, other regional natural catastrophes | Site
Component failure | Bad memory chip, fan, power, HDD, data path, controller | Platform, data
Availability
To measure availability, we need to know:
How often a failure is expected, or the Mean Time to Failure (MTTF)
How long it takes to recover from a failure, or the Mean Time to Recover (MTTR)
The calculation for availability is:
Availability = MTTF / (MTTF + MTTR)
To achieve high availability:
MTTF must be as high as possible
MTTR must be as low as possible
In addition, you must consider business impact when calculating availability
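The availability formula above can be worked through with a few lines of Python. This is a minimal sketch: the function names and the sample MTTF/MTTR values are illustrative assumptions, not figures from the presentation.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR), as a fraction between 0 and 1."""
    return mttf_hours / (mttf_hours + mttr_hours)


def annual_downtime_hours(avail: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical system: fails on average every 999 hours, takes 1 hour to recover.
    a = availability(mttf_hours=999, mttr_hours=1)
    print(f"availability = {a:.3%}")
    print(f"expected downtime = {annual_downtime_hours(a):.2f} hours/year")
```

Halving MTTR improves availability about as much as doubling MTTF, which is why fast recovery (failover, snapshots) gets as much attention as component reliability.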
Levels of Availability
What businesses are saying
“Having my email work is more important to me than having a dial tone” – Fortune 50 CIO
“In the next 24 hours, 8 million e-mail messages will be exchanged among employees in the Boeing network” [1]
[1] http://www.boeing.com/companyoffices/aboutus/quickfacts.html
Email is business critical
What analysts are saying
Email is mission critical and must be efficient: in 2003 businesses will send 3.5 trillion emails, over 13 billion emails/day
Gartner predicts that the volume of daily emails sent worldwide will reach 36 billion by 2005, more than three times the number of emails sent in 2001 [1]
[1] Gartner Dataquest Perspective, Market Analysis, “From Content to Knowledge: The Growing Gap”, March 4, 2003
Top Concerns of Today’s Messaging Environment
Reliability
Quick Recovery
Security
Privacy
Business Integrity
Windows Server 2003 Exchange Server 2003
Advanced Features
New Features of Windows Server 2003
8 node failover clusters
Shutdown tracker (log reasons for shutdown, restart)
Diskpart – grow basic volumes
Volume Shadow Copy
Mount points (in Cluster)
/USERVA=3030 (boot.ini switch, used together with /3GB)
Improved AD performance
Better Memory Management
New Features of Exchange Server 2003
Improved OWA (more like Outlook)
Improved Virus Scanning API (VSAPI)
Exchange Management Pack for MOM included
New Migration tools
Increased Network Performance
Decreased network & processing costs
Replication
IPSec support between front-end and back-end clusters
With Exchange Server 2003 on Windows Server 2003…
Enable Server and Site Consolidation
Improve Management and Administration
Enhance User Experience and Information Management
Improve Client and Server Communications (sync)
Increase User Productivity
AD & OS Compatibility Matrix
Exchange version | Compatible OS: Windows 2000 Server SP3+ | Compatible OS: Windows Server 2003 | Supported AD: Windows 2000 Server SP3+ | Supported AD: Windows Server 2003
Exchange 2003 | Yes | Yes | Yes | Yes
Exchange 2000 + SP3 | Yes | No | Yes | Yes
Exchange 2000 + SP2 | Yes | No | Yes | Yes
Exchange 5.5 + SP3 | Yes | No | Not required | Not required
Exchange Server 2003
Clustering
High Availability Cluster: Goals
Availability: data, application, service
Scalability: CPU, storage, # nodes
Application Recovery: failover, restart
Manageability: Single Point of Administration
Eliminate Single Point of Failure: redundancy throughout
MSCS: Virtual Servers
Virtual Server #1: Name: CLUSTERIP, IP: 192.168.1.11, App: Quorum
Virtual Server #2: Name: EXG1, IP: 192.168.1.12, App: Exchange
Virtual Server #X: Name: EXG2, IP: 192.168.1.13, App: Exchange
Clients connect to Virtual Servers (VS). If a cluster node running a VS fails, the other server will run the VS
Cluster Node A: Name: CLUSTER_A, IP: 192.168.1.1, App: MSCS
Cluster Node B: Name: CLUSTER_B, IP: 192.168.1.2, App: MSCS
Clients do not connect to physical nodes. Admins connect for administration
Virtual Servers typically include the following resources: a disk, IP address, network name, and application service(s)
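The relationship between physical nodes and virtual servers can be sketched as a small data model. This is an illustration only, not how MSCS is implemented: the class and function names are hypothetical, and the initial placement of virtual servers on nodes is an assumption (the slide does not say which node hosts which virtual server). Names and IP addresses are taken from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VirtualServer:
    name: str
    ip: str
    app: str


@dataclass
class Node:
    name: str
    ip: str
    alive: bool = True
    hosted: List[VirtualServer] = field(default_factory=list)


def build_cluster() -> List[Node]:
    """Two-node cluster from the slide; the VS placement is assumed."""
    node_a = Node("CLUSTER_A", "192.168.1.1")
    node_b = Node("CLUSTER_B", "192.168.1.2")
    node_a.hosted.append(VirtualServer("CLUSTERIP", "192.168.1.11", "Quorum"))
    node_a.hosted.append(VirtualServer("EXG1", "192.168.1.12", "Exchange"))
    node_b.hosted.append(VirtualServer("EXG2", "192.168.1.13", "Exchange"))
    return [node_a, node_b]


def fail_over(failed: Node, survivor: Node) -> None:
    """Move every virtual server from a failed node to the surviving node."""
    failed.alive = False
    survivor.hosted.extend(failed.hosted)
    failed.hosted.clear()


def locate(nodes: List[Node], vs_name: str) -> Optional[str]:
    """Return the name of the live node currently hosting a virtual server."""
    for node in nodes:
        if node.alive and any(vs.name == vs_name for vs in node.hosted):
            return node.name
    return None


if __name__ == "__main__":
    nodes = build_cluster()
    print("EXG1 runs on", locate(nodes, "EXG1"))
    fail_over(nodes[0], nodes[1])
    print("after failover EXG1 runs on", locate(nodes, "EXG1"))
```

The virtual server keeps its own name and IP address when it moves, which is why clients that connect to EXG1 rather than to CLUSTER_A only need to retry.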
Cluster Services – Active N+I
Active (N) + Passive (I) combinations
Clusters of smaller servers will continue to overtake larger proprietary systems
Less $$ for hardware
Scale better
Faster Failover
Exchange Server 2003 Clusters
Server Version    Active (N+I)   Active (N)
Windows 2K AS     2 node         2 node
Windows 2K DC     3+1 node       3 nodes
Windows 2K3 EE    7+1 node       7 nodes
Windows 2K3 DC    7+1 node       7 nodes
Exchange 2003 Installation on MSCS
Easier to create cluster or add nodes using Cluster Administrator in Windows Server 2003
Microsoft Exchange Server 2003 automatically detects presence of MSCS cluster and installs necessary components.
Microsoft® Exchange Failover
Cluster Node Fails
Failure Detected by Cluster Heartbeat
Surviving Node acquires Disk Reservations
Check and mount the file systems
Restart Exchange Resources / Virtual Server
Restore Communications (client-side retry)
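The failover sequence above can be sketched as a heartbeat timeout followed by an ordered list of recovery steps. This is a simplified illustration, not the MSCS algorithm: the 5-second timeout and the function names are assumptions chosen for the example.

```python
from typing import List

# Recovery steps in the order listed on the slide.
FAILOVER_STEPS = [
    "acquire disk reservations",
    "check and mount the file systems",
    "restart Exchange resources / virtual server",
    "restore communications (client-side retry)",
]


def node_failed(last_heartbeat: float, now: float, timeout: float = 5.0) -> bool:
    """Treat a node as failed when no heartbeat arrived within `timeout` seconds.

    The timeout value is a placeholder, not a documented cluster setting.
    """
    return now - last_heartbeat > timeout


def run_failover(last_heartbeat: float, now: float, timeout: float = 5.0) -> List[str]:
    """Return the steps the surviving node would perform, or an empty list."""
    if not node_failed(last_heartbeat, now, timeout):
        return []
    performed = []
    for step in FAILOVER_STEPS:
        performed.append(step)  # a real cluster would execute the step here
    return performed


if __name__ == "__main__":
    print(run_failover(last_heartbeat=0.0, now=2.0))   # heartbeat still fresh
    print(run_failover(last_heartbeat=0.0, now=10.0))  # heartbeat missed
```

Each step must finish before the next begins, so total failover time is roughly the detection delay plus the sum of the step durations.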
Exchange 2000 Dependency Tree
Resources shown in the tree:
System Attendant
Exchange Store
SMTP, HTTP, IMAP4, POP3, MSSearch
Message Transfer Agent, Routing
Network Name
Physical Disk
IP Address
Exchange 2003 Dependency Tree
Flattened dependency hierarchy of Exchange services
Faster recovery times after failover
Resources shown in the tree:
System Attendant
SMTP, HTTP, IMAP4, Exchange Store, MSSearch
Message Transfer Agent, Routing
Network Name
Physical Disk
IP Address
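To see why a flatter dependency hierarchy recovers faster, the sketch below counts the number of sequential start-up stages in each tree. The resource names come from the two slides, but the dependency edges are an assumption, because the extracted diagrams do not preserve them: here the Exchange 2000 protocol resources are taken to depend on the Exchange Store, while in Exchange 2003 everything depends directly on the System Attendant.

```python
from typing import Dict, List

# Assumed edges: resource -> resources it depends on.
EXCHANGE_2000: Dict[str, List[str]] = {
    "IP Address": [],
    "Physical Disk": [],
    "Network Name": ["IP Address"],
    "System Attendant": ["Network Name", "Physical Disk"],
    "Exchange Store": ["System Attendant"],
    "Routing": ["System Attendant"],
    "Message Transfer Agent": ["System Attendant"],
    "SMTP": ["Exchange Store"],
    "HTTP": ["Exchange Store"],
    "IMAP4": ["Exchange Store"],
    "POP3": ["Exchange Store"],
    "MSSearch": ["Exchange Store"],
}

EXCHANGE_2003: Dict[str, List[str]] = {
    "IP Address": [],
    "Physical Disk": [],
    "Network Name": ["IP Address"],
    "System Attendant": ["Network Name", "Physical Disk"],
    "Exchange Store": ["System Attendant"],
    "Routing": ["System Attendant"],
    "Message Transfer Agent": ["System Attendant"],
    "SMTP": ["System Attendant"],
    "HTTP": ["System Attendant"],
    "IMAP4": ["System Attendant"],
    "MSSearch": ["System Attendant"],
}


def startup_stages(deps: Dict[str, List[str]]) -> int:
    """Length of the longest dependency chain, i.e. sequential start-up stages."""
    memo: Dict[str, int] = {}

    def stage(resource: str) -> int:
        if resource not in memo:
            memo[resource] = 1 + max((stage(d) for d in deps[resource]), default=0)
        return memo[resource]

    return max(stage(r) for r in deps)


if __name__ == "__main__":
    print("Exchange 2000 stages:", startup_stages(EXCHANGE_2000))
    print("Exchange 2003 stages:", startup_stages(EXCHANGE_2003))
```

Under these assumed edges the flattened tree has one fewer sequential stage, so more resources can come online in parallel after a failover.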
Dell | EMC Storage
Advanced Features
Typical Storage Environment
LAN
Exchange 2000, SQL Server 2000, File & Print, Other
What are the IT challenges with this environment?
DLT7000 Tape Library
40GB 40GB 15GB 15GB
80GB 45GB 80GB 60GB
DDS-4
Consolidated Storage Environment
LAN
Consolidating Storage
Tape Library
Exchange 2000, SQL Server 2000, File & Print, Other
High Availability Level
Application
Operating System
HOST BUS ADAPTER
STORAGE CONTROLLER
RAID LEVEL
DISK PORT
Server
Storage = Achilles’ Heel
Consolidated Storage Environment
LAN
Consolidating Storage
Tape Library
Exchange 2000, SQL Server 2000, File & Print, Other
Redundant Storage Area Network (SAN)
Redundant Storage System
Multi-Path IO with failover (PowerPath)
Redundant Storage Processors (RAID controllers)
Protected write cache: Mirroring, SPS, Vaulting
RAID 1, 3, 5, 1+0
Dual Fibre Channel loops on storage system back-end
PowerPath
Load balance I/O across multiple paths to the same RAID controller
I/O Path failover for redundant paths
I/Os are divided across both paths to SPB
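The two behaviours described above, spreading I/O across paths and failing over when a path is lost, can be sketched with a simple round-robin selector. This is a generic illustration and not PowerPath's actual policy; the class name and the path names `hba0` and `hba1` are hypothetical.

```python
from typing import Dict, List


class MultiPath:
    """Round-robin path selection that skips paths marked as failed."""

    def __init__(self, paths: List[str]) -> None:
        self._order = list(paths)
        self._healthy: Dict[str, bool] = {p: True for p in paths}
        self._next = 0

    def mark_failed(self, path: str) -> None:
        self._healthy[path] = False

    def mark_restored(self, path: str) -> None:
        self._healthy[path] = True

    def pick(self) -> str:
        """Return the next healthy path; raise if every path has failed."""
        for _ in range(len(self._order)):
            path = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self._healthy[path]:
                return path
        raise RuntimeError("no healthy path to storage")


if __name__ == "__main__":
    mp = MultiPath(["hba0", "hba1"])
    print([mp.pick() for _ in range(4)])  # I/O alternates across both paths
    mp.mark_failed("hba0")
    print([mp.pick() for _ in range(4)])  # all I/O now uses the surviving path
```

With both paths healthy the load is split evenly; after a path failure the host keeps running at reduced bandwidth instead of losing access to the array.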
SnapView - Snapshot
SnapView creates logical point-in-time views of production information
Takes only seconds to create a complete snapshot
Copy on first write
Snapshot allows access for test, backup, etc., without compromising the production data
[Diagram: Production Host with Production Data (100 GB); Snapshot (10 GB)]
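Copy on first write is what lets a snapshot be created in seconds and occupy far less space than the source: nothing is copied at creation time, and a block is saved only the first time it is overwritten. The sketch below shows the general technique on an in-memory list of blocks; it is not SnapView's implementation, and the class names are hypothetical.

```python
from typing import Dict, List


class Snapshot:
    """Point-in-time view that stores only blocks changed since creation."""

    def __init__(self, source: "Volume") -> None:
        self.source = source
        self.saved: Dict[int, str] = {}  # block index -> original contents

    def preserve(self, index: int) -> None:
        # Copy the original block only on the FIRST write to that index.
        if index not in self.saved:
            self.saved[index] = self.source.blocks[index]

    def read(self, index: int) -> str:
        # Changed blocks come from the save area, unchanged ones from the source.
        if index in self.saved:
            return self.saved[index]
        return self.source.blocks[index]


class Volume:
    def __init__(self, blocks: List[str]) -> None:
        self.blocks = list(blocks)
        self.snapshots: List[Snapshot] = []

    def snapshot(self) -> Snapshot:
        snap = Snapshot(self)  # instant: no data is copied here
        self.snapshots.append(snap)
        return snap

    def write(self, index: int, data: str) -> None:
        for snap in self.snapshots:
            snap.preserve(index)
        self.blocks[index] = data


if __name__ == "__main__":
    vol = Volume(["a", "b", "c"])
    snap = vol.snapshot()
    vol.write(1, "B")
    print("production:", vol.blocks)
    print("snapshot:  ", [snap.read(i) for i in range(3)])
```

The save area grows only with the number of distinct blocks changed, which is consistent with the slide's 10 GB snapshot of a 100 GB volume.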
SnapView - Clone
SnapClone creates a full point-in-time copy of another volume
[Diagram: Production Host with Production Data (100 GB); Snap Clone (100 GB) presented to a Backup Server or Testing Host]
Snapshots & SnapClone
Array-based product: no burden on host
Read/write mountable by a secondary host for increased productivity
Minimizes time that production data is unavailable to users
Can eliminate scheduled downtime for backup
Requires less disk space than a full mirror
MirrorView
Maintains synchronous remote mirroring between two Dell | EMC arrays
Transparent to server, operating system, and applications
Protects from unavailability and data loss
Primary and secondary site can be remote storage for each other
Failover production environment to remote site
Dell | Quantum
Storage
Backup
Data Protection: Value Tradeoffs with Different Solutions
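Solutions, ordered from primary disk through secondary disk to tape: Mirroring, Replication, Snapshots, Backup, Archiving
Attribute        Mirroring end (primary disk)   Archiving end (tape)
Restore Time     Short                          Long
Availability     High                           Low
Safety           Low                            High
Time Retained    Short                          Long
Multi-Vendor     One                            Many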
Prioritizing Data Based on Value
Value: Low to High (axes: Availability, Impact of Data Loss)
Lifeblood: Business stops if data is unavailable or lost
Essential: Business slowed if data is unavailable; stopped if data is lost
Important: Business can operate with limited data availability; significant disruption if data is lost
Lower priority: Business can operate with minimal data availability; some disruption if data is lost
Aligning Data Protection Needs with Technologies
Value: Low to High (axes: Availability, Impact of Data Loss)
Data tiers: Non-essential, Important, Essential, Lifeblood
Technologies shown:
Tape Autoloader
General Purpose NAS
Synchronous Mirrored RAID
Snapshots
Asynchronous Mirrored or Replicated RAID
Disk-Based Backup
Local Tape Backup and Remote Tape Archive
Local/Remote Tape Archive
Disk and Tape: Both Have a Role to Play
Backup Server with Backup Software
PV 136T Tape Library
Disk-based hardware optimized for data protection
and Dell Solutions: Meeting the Challenge Together
Lifeblood
Lower priority
Essential
Important
DLTtape & Super DLTtape Media
Dell DLT/SDLT Drives
SDLT in Large Automation
DLT/SDLT Midrange Automation Libraries
PowerVault DLT autoloaders
With Compatibility
Exchange DR Demo On View at Dell Stand
MirrorView
OS Boot Disk
Exchange Logs
Exchange Store
CX600 FC4700
Production Site Disaster Recovery Site
Exchange Logs Mirror
Exchange Store Mirror
OS Boot Disk mirror
Domain Controller
Exchange 2000
Domain Controller
Exchange 2000
Storage Groups
Fibre Switch
SITE FAILURE!
Fibre Switch
Promote Remote Mirrors
Update Storage Groups
Boot Remote DR Server
Dell EMC Nortel BT Business Continuity Solution
Dell Application Solution Centre
EMC Solutions Operation Centre
Fibre Connectivity
Exchange High Availability Solution
[Diagram: DELL/EMC/Nortel/ESAT BT DWDM Installation, Dell Limerick and EMC Cork]
Arrays: CX600-A and CX600-B, each attached to an existing SAN, with MirrorView on Port 3
Link: Nortel Optera at each end of an ESAT BT DWDM Managed Service, 100 Miles, Extended VLAN
Volumes: Exchange Data Volumes, Host boot volumes
Hosts: Host A, Host B, Host C and Host D (clustered), plus a Mgt Host and Domain Controller at each end
More Information
Dell HA Clustering website: www.dell.com/clusters
Dell Solutions website: www.dell.com/solutions
Dell Power Solutions Magazine (online): www.dell.com/powersolutions
Dell ROI Online Calculators: www.dell.com/roi
Ask The Experts
Get Your Questions Answered
Ask the Experts area Wednesday 9-11
Dell Stand (All Week)
Thank You
Community Resources
http://www.microsoft.com/communities/default.mspx
Most Valuable Professional (MVP)
http://www.mvp.support.microsoft.com/
Newsgroups: Converse online with Microsoft Newsgroups, including Worldwide
http://www.microsoft.com/communities/newsgroups/default.mspx
User Groups: Meet and learn with your peers
http://www.microsoft.com/communities/usergroups/default.mspx
© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Backup Slides
SAN Copy
No host CPU cycles involved
Can copy to/from: CLARiiON (Dell | EMC), Symmetrix
Usage
Upgrade: one-time migration to another storage system
Test: routine copy to secondary storage for test
Content Distribution: copy to multiple targets
Source can be snapshot, clone, fractured mirror
Copy data from LUN to LUN
Target LUN must be > or = source LUN
Primary Causes of Data Loss
Source: Quantum analysis
Human Error: 38%
Theft/Sabotage: 7%
Software Failure: 5%
Hardware Failure: 20%
Power Failure/Surges: 12%
Viruses: 10%
Natural Disasters: 4%
Other: 3%
A data protection solution should protect you against all causes of data loss
Purely disk-based backup systems do not offer adequate protection against human error, viruses, hackers or natural disasters
Removable media such as tape provides full protection
Chart legend: each cause of data loss is marked either "Protected by mirrored disk" or "Not fully protected without removable tape media"
Protection from Mirrored Disk