
Siebel High Availability Considerations on the Microsoft Platform

Allan Hirt
Avanade

Agenda

Availability Basics
Failover Clustering
  Clustering Basics
  Failover Cluster Configuration
  Failover Cluster Administration
Log Shipping
Summary


Breaking Down The Nines

Percentage      Downtime (per year)
100%            None
99.999%         Less than 5.26 minutes
99.99%          5.26 to 52 minutes
99.9%           52 min to 8 h 45 min
99%             8 h 45 min to 87 h 36 min
98.9 to 90.0%   87 h 36 min to 875 h 54 min
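The boundaries in the table follow from simple arithmetic on the availability percentage. The following is a minimal sketch, assuming a 365-day year (525,600 minutes); the function names are illustrative and do not come from the deck.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years (assumption)


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


def format_duration(minutes: float) -> str:
    """Render a number of minutes as hours and minutes."""
    hours, rem = divmod(minutes, 60)
    return f"{int(hours)} h {rem:.2f} min"


if __name__ == "__main__":
    for target in (100.0, 99.999, 99.99, 99.9, 99.0):
        budget = downtime_minutes_per_year(target)
        print(f"{target}% -> {format_duration(budget)}")
```

Under that assumption, 99.999% works out to roughly 5.26 minutes per year and 99% to 87 h 36 min, which is where the table's boundaries come from.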


Designing for Availability

Everything counts: planned and unplanned
Make sure you have your guiding principles
Know your tradeoffs (availability vs. scalability vs. growth vs. cost)
Identify your risks, exposures
End user is king (or queen)
Only design for the availability you need; this is negotiated
HA is soup to nuts and everything in between
Redundancy is key, but not everything; contingency/disaster recovery plans are the name of the game
Application is often the weak link
HA is all about people and process; technology is just the end enabler


Database Availability for Siebel

Technologies:
  Failover Clustering: automatic, great for closer distances (technology, not OS/SQL limitations)
  Log Shipping: manual; good for HA, but also great for disaster recovery and spanning distances
  Backup & restore: always have to do it
    Always test your backups
    Remember that Siebel considers the Siebel File System part of the database; from a MS perspective, this is outside of SQL Server, so it is not in the actual data store
      How will you coordinate backups? Same time as SQL DB? What happens if out of sync?
      How will you restore?
Native SQL Server-based replication is not an option since Siebel does not allow schema updates other than through Siebel Tools


Clustering Basics
Key Terminology - Part 1

Windows Clustering
  Server Cluster: not for scale out (sometimes known as MSCS)
    New to Windows Server 2003 is a Majority Node Set, which is another type of server cluster
    In W2K3, no IIS clustered resource; would have to make it a generic cluster resource (better to use Network Load Balancing anyway)
  Network Load Balancing: availability and scalability for IP-based services (such as IIS); great for all web servers for Siebel
Failover Clustering: SQL Server 2000’s implementation of availability clustering built on top of a server cluster
Federated server/cluster: SQL Server 2000 scale out

NOTE: The terms listed above and elsewhere in this deck are the proper terms to use for all Windows and SQL Server forms of clustering. Keep them in mind when others talk about clusters.


Clustering Basics
Working Basics

A server cluster is made up of nodes which run resources that are contained in cluster groups
An IP address and network name resource combined in a group is known as a virtual server
All nodes connect to a shared disk array using a “shared nothing” model
All nodes are connected via a private network, sometimes known as the heartbeat network
The server cluster uses some form of mechanism for storing state/configuration of the cluster (quorum disk for standard server clusters, share for Majority Node Set; also cluster database)
Applications built to run/interact properly in a server cluster, which are then termed cluster-aware, must be coded specifically using the Clustering API of the Platform SDK
  All SQL Server 2000 components are cluster-aware
  While the Siebel Gateway/Server can be clustered via a server cluster, they are done as generic resources and are not truly cluster-aware applications
The server cluster uses two checks, LooksAlive and IsAlive, to check the status of the application


Clustering Basics
Shared - Not How A MS Server Cluster Works

[Diagram labels: Client Requests, Locking Mechanism]


Clustering Basics
Shared Nothing - How A MS Server Cluster Works

[Diagram labels: Client PCs, Public Network, Node A, Node B, SQL Server (on each node), Private Network, Shared Disk Array]


Clustering Basics
The Failover Process - High Level

Simple to understand:
  Failure is detected
  SQL Server stops on one node
  Resource ownership is changed
  SQL Server starts on another node
From a client perspective, SQL Server effectively goes through a stop and a start
  As long as application knows, it can be transparent; application is the key here
After failover, do not need to worry about name (stays the same)
Always transactionally current to point of failover
Great HA story, however consider all points of failure


Clustering Basics
Failover Clustering Concepts

Single Instance Cluster (was Active/Passive)
  Only one SQL Server virtual server running
Multiple Instance Cluster (was Active/Active)
  Up to 16 SQL Server 2000 virtual servers are supported per Server Cluster (as long as you have the resources)
  More reflective of what it is, and much easier than Active/Active/Active/Active/Active/Active/Active/Active/Active/Active/Active/Active/Active/Active/Active/Active
You can mix local and clustered instances, but not recommended

IMPORTANT: YOU CANNOT MIX SQL SERVER 6.5 or 7.0 CLUSTERED INSTALLATIONS WITH A SQL SERVER 2000 FAILOVER CLUSTER ON THE SAME HARDWARE


Failover Cluster Configuration
Before You Run … Crawl

Get the right hardware
  Entire cluster solution MUST be in the Windows Catalog or on the old Hardware Compatibility List (HCL) (including driver versions)
  http://support.microsoft.com/default.aspx?scid=kb;en-us;814607
  Cannot make a “Frankenstein” server cluster
  Read all KBs (most linked from above)
  Make sure any updates are cluster certified
  Especially worry about disk (SAN/DAS/HBA) drivers
Any geographically dispersed cluster solution must not only be in WC/HCL, but be on the specific geographic lists
Navigating can sometimes be confusing, but ultimate goal is to have you on a supported, known, good configuration
Check best practices (see WPs for networking, configuration, etc.)


Failover Cluster Configuration
Antivirus Software and Clusters

In general, on dedicated SQL Servers, not recommended or needed
If needed, and especially on a cluster, make sure the following are set up as filters (exclusions):
  \MSCS on quorum
  \DtcLog (on quorum or dedicated disk)
  All SQL Server data and log files/directories
KBs 309422, 250355


Failover Cluster Configuration
Number of Nodes

OS Supports
  Windows 2000 Advanced Server: 2
  Windows 2000 Datacenter Server: 4
  Windows Server 2003 Enterprise Edition (32- or 64-bit): 8
  Windows Server 2003 Datacenter Edition (32- or 64-bit): 8
SQL Server 2000 Supports
  Enterprise Edition 32-bit: up to 4 nodes, no matter what 32-bit OS (Windows 2000 or Windows Server 2003)
  64-bit: up to 8 nodes
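The practical node limit for a failover cluster is whichever of the two limits is lower. The sketch below encodes the figures from this slide and takes the minimum; the dictionary and function names are illustrative, and the code does not check whether a given OS and SQL Server bitness combination is actually installable.

```python
# Node limits as listed on the slide.
OS_MAX_NODES = {
    "Windows 2000 Advanced Server": 2,
    "Windows 2000 Datacenter Server": 4,
    "Windows Server 2003 Enterprise Edition": 8,
    "Windows Server 2003 Datacenter Edition": 8,
}

# SQL Server 2000 Enterprise Edition limits, keyed by bitness.
SQL_2000_MAX_NODES = {32: 4, 64: 8}


def max_failover_cluster_nodes(os_name: str, sql_bits: int) -> int:
    """Return the smaller of the OS limit and the SQL Server 2000 limit."""
    return min(OS_MAX_NODES[os_name], SQL_2000_MAX_NODES[sql_bits])


if __name__ == "__main__":
    # 32-bit SQL Server 2000 stays at 4 nodes even on an 8-node-capable OS.
    print(max_failover_cluster_nodes("Windows Server 2003 Enterprise Edition", 32))
```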


Failover Cluster Configuration
Network Configuration 1

General (more than what is listed here; see book or the WP in the Windows Clustering Resource Center)
  Minimum of 2 NICs
  All networks must fail independent of one another and each network must be on a distinct network and subnet
  Domain connectivity required
    Need domain accounts for both SQL Server services and the server cluster
    Do not need to be domain administrators, but have proper rights on each node
    Security: KBs 263712, 291255
  Dedicated IP addresses needed: server cluster, SQL Server virtual server, IP addresses for each node, IP addresses for each Private NIC, and possibly MS DTC; NO DHCP IPs
  All cluster nodes must be in the same domain (with redundant domain controllers, etc.)
  Nodes should not be domain controllers: KBs 281662, 298570


Failover Cluster Configuration
Network Configuration 2

Public Network
  In general OS, priority must be above Private Network
  In Cluster Administrator, must be below the Private Network
  Set speed of network to the actual speed; no autosense (KB 174812)
  Cannot enable Network Load Balancing on a server cluster or its public NICs
  Primary and secondary DNS required
  Public network should be configured for all communications, not just public duties; serves as a backup for the private
  NIC teaming OK on public network (KB 254101)


Failover Cluster Configuration
Network Configuration 3

Private Network (KB 258750)
  500 ms roundtrip
  Dedicated; no other traffic but heartbeats
  In general OS, priority must be below the Public Network
  In Cluster Administrator, priority must be above all Public Networks
  Set speed of network to the actual speed; no autosense
  Disable NetBIOS
  Only enable TCP/IP
  Teaming not supported on the Private Network
  Use a valid private IP range: 10.0.0.0, 172.16.0.0, 192.168.0.0
  Crossover can work; network recommended
  Under W2K only, must disable media sense
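The addressing rules for the private network can be checked before the cluster is built. This is a minimal sketch using Python's standard `ipaddress` module; it assumes the three base addresses on the slide refer to the RFC 1918 private blocks (/8, /12 and /16), and the function names are illustrative.

```python
import ipaddress

# RFC 1918 private blocks; the slide lists only their base addresses (assumption).
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_valid_heartbeat_ip(address: str) -> bool:
    """True if the address falls inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


def on_distinct_subnets(private_cidr: str, public_cidr: str) -> bool:
    """True if the private and public NICs sit on non-overlapping subnets."""
    private_net = ipaddress.ip_network(private_cidr, strict=False)
    public_net = ipaddress.ip_network(public_cidr, strict=False)
    return not private_net.overlaps(public_net)


if __name__ == "__main__":
    print(is_valid_heartbeat_ip("10.10.10.1"))
    print(on_distinct_subnets("10.10.10.1/24", "192.168.1.10/24"))
```

The second helper reflects the earlier requirement that each network be on a distinct network and subnet.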


Failover Cluster Configuration
Disk Configuration 1

Arguably the most important aspect for SQL Server
Tradeoff triangle: availability vs. cost vs. performance
Dedicated space in the book, not just Chapter 4
Basic disks only; dynamic not supported natively by any version of Windows (KB 237853)
SAN or DAS only; NAS not supported for failover clustering (KBs 304415, 304261)
Mount points technically supported for disk expansion now, but recommended not to use for the time being (Windows Server 2003 only)
Driver crucial! Do not implement without certified drivers
Go fibre; SCSI is no longer the common cluster configuration. Fibre required for more than 2 nodes


Failover Cluster Configuration
Disk Configuration 2

Drive letters required, which means a maximum of 26 logical drives (really less when you think about it …)
Recommend a 1:1 ratio from logical drive to physical entity
  SQL Server only sees what Windows sees
  Lots of LUNs may be bad, and large LUNs not great (failover time, chkdsk): KB 310072
  2 TB limit under 32-bit
Multiple instances cannot share the same drive in a cluster
  Reinforce the 1:1 point: if you have two logical drives on one physical drive/LUN, the LUN will be seen as one physical drive to the OS, and that is how it is presented to SQL Server
Remember capacity planning … plan for now AND later
64 KB block size when formatting data
NTFS only
Disk signatures


Failover Cluster Configuration
Disk Configuration 3


Failover Cluster Configuration
Server Cluster Installation

Make sure everything OK (no errors in Error Log, etc.)
Ghosting not supported for cluster nodes under W2K (can do base OS pre-clustering, though)
Windows 2000
  No real automation; GUI the best and virtually only way (command line with caveats)
  IIS Common Files required
Windows Server 2003
  Command line
  GUI
  Unattended install: http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/windowsserver2003/deploy/confeat/MSCSclus.asp
Post-install: network priorities, resize log, MS DTC


Failover Cluster Configuration
MS DTC 1

Make sure configured prior to installing SQL Server 2000, and after server cluster is configured
Windows 2000, 2 ways:
  Use comclust (must be run on ALL nodes; not just one)
    Most do this, however realize that it puts the \DtcLog directory on the quorum drive
    Quorum is VERY important to cluster health, so there is the potential risk of possibly filling up the quorum disk or other disk problems
    Means that the quorum must be sized properly for both cluster use as well as MS DTC use
  Create it manually (this is like Windows Server 2003)
    Requires own disk, IP resources, so it should remove any contention


Failover Cluster Configuration
MS DTC 2

Windows Server 2003 server clusters
  Configured manually
  Comclust no longer an option
  Means you need to plan for the IP address and disk resource used by MS DTC in addition to all other IP/disk resources
  Create in its own cluster group
  For disk, do not use the quorum, and especially do not use any of the SQL Server data/log disks


Failover Cluster Configuration
Server Cluster Validation

Ping all IP addresses
Ping all network names
  Ping from both within the cluster and from external to the server cluster
Fail all resources back/forth to/from all nodes


Failover Cluster Configuration
Advanced Security

Kerberos: KB 235529
IPSec: KBs 306607, 248694
SSL: KBs 276553, 316898


Failover Cluster Configuration
Naming SQL Virtual Servers 1

Name is important; installing a virtual server is a permanent option
  You cannot rename a SQL Server virtual server; only way is to uninstall and reinstall
Cannot be the name of the underlying nodes or the server cluster itself
Longest name: 15 char for VS name, 16 char for instance name
  e.g. SUPERLONGVSNAME\LONGNAMEDINSTANC
Names must be unique within a server cluster and a domain
  However, heed KB 289828 for your server names (non-SQL)


Failover Cluster Configuration
Naming SQL Virtual Servers 2

Name Examples
  SQL1: Valid
  SQL1\INS1: Invalid; already a VS named SQL1
  SQL1a\INS1: Valid
  SQL1a\SQL1a: Valid, but not recommended (can be confusing); would be invalid if SQL1a\INS1 already configured
  SQL1a\INS2: Invalid; already a VS named SQL1a
  SQL1b\INS1: Invalid; named instance of INS1 associated with SQL1a
  SQL1b\SQL1b: Valid, but not recommended
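The naming rules from these two slides can be expressed as a small checker. The sketch below is only an illustration: the `ClusterNames` class is hypothetical, it treats names as case-insensitive (an assumption), and it covers uniqueness within one server cluster rather than across a domain.

```python
from typing import Optional, Set, Tuple

MAX_VS_NAME = 15        # virtual server name limit from the slide
MAX_INSTANCE_NAME = 16  # instance name limit from the slide


class ClusterNames:
    """Tracks names already taken in one server cluster (hypothetical helper)."""

    def __init__(self, reserved: Optional[Set[str]] = None) -> None:
        # Node names and the server cluster name cannot be reused as a VS name.
        self.reserved = {name.upper() for name in (reserved or set())}
        self.virtual_servers: Set[str] = set()
        self.instances: Set[str] = set()

    def check(self, full_name: str) -> Tuple[bool, str]:
        """Validate 'VSNAME' or 'VSNAME\\INSTANCE' against the names in use."""
        vs, _, instance = full_name.partition("\\")
        if not vs or len(vs) > MAX_VS_NAME:
            return False, "virtual server name must be 1 to 15 characters"
        if len(instance) > MAX_INSTANCE_NAME:
            return False, "instance name must be at most 16 characters"
        if vs.upper() in self.reserved:
            return False, "name matches a node or the server cluster"
        if vs.upper() in self.virtual_servers:
            return False, f"already a VS named {vs}"
        if instance and instance.upper() in self.instances:
            return False, f"instance name {instance} already in use"
        return True, "ok"

    def add(self, full_name: str) -> Tuple[bool, str]:
        """Register the name if it is valid; return the check result either way."""
        ok, reason = self.check(full_name)
        if ok:
            vs, _, instance = full_name.partition("\\")
            self.virtual_servers.add(vs.upper())
            if instance:
                self.instances.add(instance.upper())
        return ok, reason
```

Adding the slide's examples in order (SQL1, SQL1\INS1, SQL1a\INS1, and so on) reproduces the valid and invalid verdicts listed above, with SQL1a\SQL1a rejected because SQL1a\INS1 was configured first.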


Failover Cluster Configuration
SQL Server 2000 Resource Dependencies

SQL Server resources in a cluster are dependent upon others to run
As you saw in the failover, they start in a specific order; this is due to dependencies
Do not add resources as dependencies (such as a file share) to the SQL Server resources other than disks unless absolutely necessary. You can cause an availability outage that has nothing to do with SQL Server

Resource Name         Dependency
SQL IP Address        None
SQL Network Name      SQL IP Address
SQL Server            Disk resource(s), SQL Network Name
SQL Server Agent      SQL Server
SQL Server Fulltext   SQL Server
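The dependency table determines both the order in which resources come online and which resources go down when one of them fails. This is a minimal sketch of both ideas using the standard-library `graphlib` module (Python 3.9+). The resource name "Disk" and the two function names are illustrative; the cluster service itself is not involved.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Resource -> resources it depends on, taken from the table on the slide.
# "Disk" stands in for one or more disk resources (illustrative name).
DEPENDENCIES = {
    "SQL IP Address": set(),
    "SQL Network Name": {"SQL IP Address"},
    "SQL Server": {"Disk", "SQL Network Name"},
    "SQL Server Agent": {"SQL Server"},
    "SQL Server Fulltext": {"SQL Server"},
}


def online_order(dependencies):
    """Return one valid bring-online order: each resource after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def affected_by_failure(dependencies, failed):
    """Return every resource that directly or indirectly depends on `failed`."""
    hit = set()
    changed = True
    while changed:
        changed = False
        for resource, needs in dependencies.items():
            if resource not in hit and (failed in needs or needs & hit):
                hit.add(resource)
                changed = True
    return hit


if __name__ == "__main__":
    print(online_order(DEPENDENCIES))
    print(affected_by_failure(DEPENDENCIES, "SQL Network Name"))
```

Adding a file share to the SQL Server entry and calling `affected_by_failure` with that share shows why extra dependencies are risky: SQL Server, SQL Server Agent and Fulltext all end up in the affected set.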


Failover Cluster Configuration
Single vs. Multiple Instances

Single Instance
  Less administrative work
  Avoidance of fixed overhead of multiple instances
    Fixed server memory structures
    DLLs, .EXEs, etc.
  Automatic server settings will work better in a single virtual server
    For instance, grab all available memory
    Ease in using AWE
  Some components are always shared anyway
    MDAC, DTC, Microsoft Search


Failover Cluster Configuration
Single vs. Multiple Instances

Multiple Instances
  Good example: consolidation/dev environments
  Flexibility to separate databases/applications based on different Service Level Agreement (SLA) requirements
    performance
    backup / recovery
    security
    change control
    Operational
    upgrade
    maintenance
  More cache for procedures (dedicated)


Failover Cluster Configuration
Processor & Memory 1

OS Version                               Max Processors      Max Amount of Memory
Windows 2000 Advanced Server             8                   8 GB
Windows 2000 Datacenter Server           32                  32 GB
Windows Server 2003 Enterprise Edition   8                   32 GB (32-bit); 64 GB (64-bit)
Windows Server 2003 Datacenter Edition   64 (minimum of 8)   64 GB (32-bit); 512 GB (64-bit); 1 GB minimum


Failover Cluster Configuration
Processor & Memory 2

Configure enough processing power to handle the load for any instance that may run on a server
  Test your application before putting it into production
  Monitor processor usage
Memory
  Single-instance: No issues unless other services or applications are running
  Multiple-instance: Be sure that one instance will not diminish the resources of other processes or instances


Failover Cluster Configuration
Processor & Memory 3

Memory under 32-bit
  If SQL instances do not need more than 2 GB, do not do anything
  System has 4 GB: use /3GB if need more than 2 GB
  System has more than 4 GB:
    Use /3GB and/or AWE if need more than 2 GB per instance up to about 16 GB (give or take); can be combined, but must test
    If need more than 3 GB of memory and/or have more than 16 GB, use AWE/PAE only (/3GB does not work past 16 GB)
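These rules of thumb can be written as a small decision function. The sketch below is one possible reading of the slide, not a definitive rule: where the slide allows "/3GB and/or AWE" it returns the simpler /3GB choice, and the function name and return format are illustrative. Any real configuration still has to be tested.

```python
def memory_switches_32bit(system_gb: float, needed_per_instance_gb: float) -> set:
    """Suggest memory switches for a 32-bit node (rule-of-thumb sketch only)."""
    if needed_per_instance_gb <= 2:
        # The default 2 GB of user address space is enough: change nothing.
        return set()
    if system_gb <= 4:
        # No memory above 4 GB for AWE to map, so /3GB is the only lever.
        return {"/3GB"}
    if system_gb > 16 or needed_per_instance_gb > 3:
        # /3GB does not work past 16 GB and cannot give an instance more than 3 GB.
        return {"/PAE", "AWE"}
    # 4 to 16 GB of RAM and a need between 2 and 3 GB: /3GB alone suffices.
    return {"/3GB"}


if __name__ == "__main__":
    for system, needed in [(4, 1.5), (4, 3), (8, 2.5), (8, 6), (32, 2.5)]:
        print(system, needed, sorted(memory_switches_32bit(system, needed)))
```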


Failover Cluster Configuration
Processor & Memory 4

32-bit vs. 64-bit
  /3GB: reserves 1 GB for OS (32-bit); set in boot.ini
  AWE (needed for large memory under 32-bit)
    SQL Server does not manage AWE dynamically
    max server memory option must be configured (set a fixed amount)
    Once configured, AWE holds all the memory acquired until the server is stopped or reconfigured
    Not a dynamic setting; requires a stop/start of SQL
    Although can be configured without it, AWE is basically useless unless you configure /PAE in boot.ini
  All memory is dynamic in 64-bit, so if you need large amounts of memory, it can replace the need for page fixed AWE memory; better resource utilization
    Theoretically, you do not need to set max memory, just minimum. On failover, target instances will yield memory to the new, failed over instance.
KBs: 268363, 280793, 283037, 326333, 291988
Book has a ton of information in Chapter 14


Failover Cluster Configuration
Processor & Memory 5

Configuring more memory under 32-bit
  /3GB enabled in boot.ini
    multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 Advanced Server" /3GB /basevideo /sos
  PAE enabled in boot.ini (Q280793)
    multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 Advanced Server" /PAE /basevideo /sos
  AWE enabled in SQL
    sp_configure 'awe enabled', 1
Can mix /3GB and /PAE up to 16 GB, but probably better to pick ONE model
TEST the configuration, especially 32-bit


Example

Failover is a crucial consideration; exceeding capacity is a BAD thing
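The slide's own example did not survive in this transcript. As a separate illustration of the point, the sketch below checks whether one node could hold the configured memory of every instance that might land on it after a failover. The function name, the instance names and the 1 GB operating system reserve are all assumptions made for the example.

```python
from typing import Dict


def failover_fits(node_memory_gb: float,
                  instance_memory_gb: Dict[str, float],
                  os_reserve_gb: float = 1.0) -> bool:
    """True if every listed instance could run on one node at the same time.

    os_reserve_gb is an assumed allowance for the operating system.
    """
    required = sum(instance_memory_gb.values()) + os_reserve_gb
    return required <= node_memory_gb


if __name__ == "__main__":
    # Two instances of 3 GB each fit on an 8 GB node after failover.
    print(failover_fits(8, {"VS1": 3, "VS2": 3}))
    # Two instances of 4 GB each do not: the surviving node is overcommitted.
    print(failover_fits(8, {"VS1": 4, "VS2": 4}))
```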


Failover Cluster Configuration
N+1 Configuration

[Diagram labels: Client PCs, Public Network, Cluster Nodes (SQL VS 1, SQL VS 2, SQL VS 3), Private network, Fibre-Channel Switch(es), RAID disk sets]


Failover Cluster Configuration
Failover Cluster Installation

Done via SQL Server Setup, which detects that you are installing on a server cluster (“Virtual Server” option)
Cannot be scripted; must use GUI
Ghosting not supported
Can only select one drive during installation; must add others post-installation


Failover Cluster Configuration
Failover Cluster - Post Install

Add other disks
Validation
  Ping all IP addresses
  Ping all network names
    Ping from both within the cluster and from external to the server cluster
  Fail all resources back/forth to/from all nodes
  Execute select * from ::fn_virtualservernodes()
  Execute select * from ::fn_servershareddrives()
Set a static port number
Configure resources and groups


Failover Cluster Configuration
Resource Dependencies

If you’re not using Fulltext (or if using a third party tool that is dependent upon a SQL resource), make sure you deselect “Affect the group” or select “Do not restart” in Cluster Administrator


Failover Cluster Configuration
Preferred Nodes

Preferred nodes: only set order (more than 2 nodes); add nodes via SQL Setup (CluAdmin)


Failover Cluster Administration
Where Do I …

Change server cluster account/password
  W2K: updatepwd.exe, Services
  W2K3: cluster.exe command line
Change SQL Server service accounts/password
  Enterprise Manager ONLY; do not use Services (breaks failover cluster)
Change IP address, Node Membership, Uninstall
  SQL Server setup (need original CD-ROM or installation point)
Use SQL Server tools unless specified


Failover Cluster Administration
Expanding Disk Capacity

Growing existing volumes or new LUN?
  Both will affect availability, so plan ahead
Grow existing volume:
  If SAN supports, use diskpart (from W2K Resource Kit, built-in for W2K3)
  Use mount points under W2K3, however create only from space on the shared disk array and attach to an existing drive letter
New LUN
  Must take SQL Server offline, and may need to power down depending on how SAN/DAS is configured
  Do you have the drive letters?


Failover Cluster Administration
SQL Server Service Packs

Very different from SQL Server 7.0
Since it is a permanent option, SP applied to all nodes defined as part of the SQL Server virtual server definition at the same time
Done per instance
Currently, requires a reboot


Failover Cluster Administration
Backup and Restore

System: must get system state; Ghosting/normal backups not good enough
DBs: same as any SQL Server 2000 instance, however do not back up to local disks (e.g. C:\)
  Back up to a share/disk that is seen by all nodes of the cluster
If using third-party software, make sure it works and is configured properly; do not want to make SQL Server fail if backup software is not working properly
Snapshot good, but expensive
  Volume Shadow Copy (VSS) support under Windows Server 2003

Page 49: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Utilities

Analysis Services
  Not cluster-aware; can be made a generic cluster resource – KB 308023
  Can also use NLB to make available
SQL Mail
  Not fully supported – KB 298723
  Problem: MAPI is not cluster-aware
  Also see KBs 263556, 308604, 315886, 303287
Process Control
  Do not use with clustered SQL Server instances
  Use SQL Server to manage everything (processor, memory)
Windows System Resource Manager (WSRM)
  Can use when configuring processor % for SQL Server
  Use SQL Server for processor affinity, memory settings

Page 50: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Troubleshooting

Diagnose in this order every time:
  1. Hardware issues
  2. Operating-system issues
  3. Networking issues
  4. Security issues
  5. Windows server cluster issues
  6. SQL Server issues
Don’t assume SQL Server first – 70%+ of PSS failover cluster cases are not SQL Server issues

Page 51: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Quick Troubleshooting Tips 1 - OS

Check Logs
  Event Viewer – start with System
    Check KBs first to see if problem is known
    If coordinating with server cluster log, server cluster log is GMT and system event log is local time
  Server Cluster Log – %windir%\cluster
    Troubleshooting WP in Cluster Resource Center
  Server Cluster Setup Log – %windir%\system32\Logfiles\Cluster
Device Manager (device level access state)
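Because the server cluster log is written in GMT while the system event log uses local time, timestamps have to be brought onto one clock before entries can be lined up. The following is a minimal Python sketch of that conversion. The timestamp layout in `CLUSTER_LOG_FORMAT`, the function name `cluster_log_to_local`, and the five-hour offset are illustrative assumptions, not details taken from the slides; adjust the format string to whatever your cluster log actually contains.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout for a cluster log entry; adjust if your log differs.
CLUSTER_LOG_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def cluster_log_to_local(stamp: str, utc_offset_hours: float) -> datetime:
    """Convert a GMT cluster-log timestamp to local time at the given UTC offset."""
    gmt = datetime.strptime(stamp, CLUSTER_LOG_FORMAT).replace(tzinfo=timezone.utc)
    return gmt.astimezone(timezone(timedelta(hours=utc_offset_hours)))


if __name__ == "__main__":
    # Hypothetical node running five hours behind GMT.
    print(cluster_log_to_local("2003/05/01-14:22:33.123", -5))
```

With both logs expressed in local time, an event-log entry can be matched against the cluster-log lines written in the same few seconds.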

Page 52: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Quick Troubleshooting Tips 2 - SQL

SQL Server Installation Logs (placed in %windir%); exist on each node
  Setup.log – log for local binaries portion of the install
  Sqlstpn.log – log for a SQL Server instance install, where n is the number of the setup attempt
  Sqlspn.log – log for a SQL Server service pack install, where n is the number of the setup attempt
  Sqlclstr.log – log for clustered instances of SQL Server
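When collecting these logs from several nodes, it can help to sort them by type and setup attempt. The Python sketch below classifies a file name using only the naming patterns listed above. The function name `classify_setup_log` and the descriptive labels are illustrative assumptions.

```python
import re
from typing import Optional, Tuple

# Patterns follow the file names listed on the slide; matching is case-insensitive.
_PATTERNS = [
    (re.compile(r"^setup\.log$", re.IGNORECASE), "local binaries install"),
    (re.compile(r"^sqlstp(\d+)\.log$", re.IGNORECASE), "instance install"),
    (re.compile(r"^sqlsp(\d+)\.log$", re.IGNORECASE), "service pack install"),
    (re.compile(r"^sqlclstr\.log$", re.IGNORECASE), "clustered instance"),
]


def classify_setup_log(filename: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (kind, attempt_number) for a known setup log name, else None."""
    for pattern, kind in _PATTERNS:
        match = pattern.match(filename)
        if match:
            # Only the numbered logs carry a setup-attempt number.
            attempt = int(match.group(1)) if match.groups() else None
            return kind, attempt
    return None
```

For example, `classify_setup_log("Sqlstp2.log")` identifies the log of the second instance-install attempt.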

Page 53: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Quick Troubleshooting Tips 3 - Tools

Clusdiag, ClusterRecovery – ships with the Resource Kit (http://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&displaylang=en)
Syscompare
MPS reporting tool
  http://www.microsoft.com/downloads/details.aspx?FamilyId=CEBF3C7C-7CA5-408F-88B7-F9C79B7306C0&displaylang=en

Page 54: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Barriers According to PSS

1. Lack of planning
2. Failure to comply with HCL requirements
3. Not understanding the technology
   Why clustering is used
   What clustering provides and does not provide
4. Internal politics
5. Need to troubleshoot clusters the same way they were installed
6. Lack of cluster-aware diagnostics
7. Need to build in cluster-awareness
8. Securing SQL access through use of certificates

Page 55: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Failover Cluster Administration: Disaster Recovery

Great story with SQL Server 2000 – can effectively run on fewer nodes and not interrupt service to repair nodes
Evict from SQL Server definition first (SQL Setup)
Then evict node from Cluster Administrator
Book covers all scenarios (Chapter 6)

Page 56: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Agenda

Availability Basics
Failover Clustering
  Clustering Basics
  Failover Cluster Configuration
  Failover Cluster Administration
Log Shipping
Summary

Page 57: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: How Log Shipping Works

[Diagram labels: Primary Server; Secondary Server(s); Monitoring Server; Transfer Logins; 1. Transaction log backed up; 2. Transaction log copied; 3. Transaction log restored]
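The three numbered steps in the diagram (back up, copy, restore) can be sketched as one cycle of a hand-rolled job. The Python below only builds the statements and does not execute anything. The database name, file name, and paths are hypothetical, the helper `one_shipping_cycle` is an illustration, and this is not a description of how the built-in log shipping feature is implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShippingStep:
    server: str  # where the step runs: "primary", "copy job", or "secondary"
    action: str  # T-SQL statement, or a description of the file copy


def one_shipping_cycle(db: str, backup_file: str, primary_share: str,
                       secondary_dir: str) -> List[ShippingStep]:
    """Return the three steps of one log shipping cycle, in order."""
    src = f"{primary_share}\\{backup_file}"
    dst = f"{secondary_dir}\\{backup_file}"
    return [
        # 1. Transaction log backed up on the primary.
        ShippingStep("primary", f"BACKUP LOG [{db}] TO DISK = N'{src}'"),
        # 2. Transaction log copied to the secondary.
        ShippingStep("copy job", f"copy {src} -> {dst}"),
        # 3. Transaction log restored; NORECOVERY leaves the database able to
        #    accept the next log backup.
        ShippingStep("secondary",
                     f"RESTORE LOG [{db}] FROM DISK = N'{dst}' WITH NORECOVERY"),
    ]


if __name__ == "__main__":
    for step in one_shipping_cycle("Sales", "Sales_0001.trn",
                                   r"\\PRIMARY\logs", r"D:\shipped"):
        print(f"{step.server}: {step.action}")
```

Restoring with NORECOVERY on every cycle is what keeps the secondary ready for the next log; the database is only recovered at role-change time.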

Page 58: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Role Changes

Unlike failover clustering, the switch to another server is most likely not going to be transparent to the end users.
  Since you are going to another server, the client or application will need to worry about how to access the new server.
  Handle the interruption in service gracefully.
Transactionally, you are only as good as:
  Last transaction completed on primary.
  Last transaction log backed up on primary.
  Last transaction log copied from primary.
  Last transaction log applied to secondary.
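The "only as good as" list can be read as a chain of lags: each stage trails the one before it, and the last stage determines how much work is missing on the secondary. A small Python sketch of that arithmetic follows. The function name and the idea of tracking one timestamp per stage are assumptions for illustration; the slides do not prescribe how to measure these points.

```python
from datetime import datetime, timedelta


def data_loss_exposure(last_committed: datetime, last_backed_up: datetime,
                       last_copied: datetime, last_restored: datetime) -> dict:
    """Return how far each stage trails the last transaction on the primary."""
    return {
        "not yet backed up": last_committed - last_backed_up,
        "not yet copied": last_committed - last_copied,
        "not yet restored": last_committed - last_restored,
    }


if __name__ == "__main__":
    now = datetime(2004, 1, 1, 12, 0)  # hypothetical moment of failure
    gaps = data_loss_exposure(
        last_committed=now,
        last_backed_up=now - timedelta(minutes=5),
        last_copied=now - timedelta(minutes=10),
        last_restored=now - timedelta(minutes=20),
    )
    for stage, gap in gaps.items():
        print(f"{stage}: {gap}")
```

If the primary is lost outright, the "not yet copied" gap is work that cannot be recovered, while the "not yet restored" gap is work that still has to be applied before the secondary can come online.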

Page 59: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Questions to Ask

Business Questions
  How many transactions per hour are you generating?
  How much downtime can your environment tolerate?
  How much data can you afford to lose?
  How much money is allocated to the project?
Technical Questions
  What is your network connectivity?
  What is the average size of transaction log backup files?
  How long does it take to copy and apply transaction logs?
  What is the capacity of the secondary?
  Do you go back to primary?
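Several of the technical questions combine into one back-of-the-envelope check: can a log backup be copied and applied before the next one is taken? The Python sketch below does that arithmetic. It assumes 1 MB = 10^6 bytes and 1 Mbps = 10^6 bits per second, ignores protocol overhead and competing traffic, and uses hypothetical function names and figures.

```python
def copy_seconds(backup_size_mb: float, bandwidth_mbps: float) -> float:
    """Seconds to copy one log backup (MB = 10**6 bytes, Mbps = 10**6 bits/s)."""
    return backup_size_mb * 8 / bandwidth_mbps


def keeps_up(backup_size_mb: float, bandwidth_mbps: float,
             restore_seconds: float, interval_seconds: float) -> bool:
    """True if copying plus restoring one backup fits within the backup interval."""
    total = copy_seconds(backup_size_mb, bandwidth_mbps) + restore_seconds
    return total <= interval_seconds


if __name__ == "__main__":
    # Hypothetical figures: 100 MB log backups every 5 minutes over a 10 Mbps link.
    print(copy_seconds(100, 10))
    print(keeps_up(100, 10, restore_seconds=30, interval_seconds=300))
```

If `keeps_up` returns False for realistic numbers, the secondary falls further behind with every cycle, which directly increases the amount of data at risk.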

Page 60: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Hardware Considerations

Server location
  Primary/secondary should hopefully be in the same domain
Memory/Processor
  Secondary should be equal to the primary
Networking
  Bandwidth
  Network card setup
Disk Considerations
  Transaction-log backup location
  Disk space

Page 61: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: HA Uses of Log Shipping

Primary or secondary solution for high availability
Planned downtime
  Perform maintenance on the primary server
  Application upgrades
  Server moves/upgrades
Check the health of the production database
Upgrade from SQL Server 7.0 to SQL Server 2000

Page 62: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: SQL Server Considerations

Recovery Model
  Needs to be Full or Bulk-Logged
  Full file sizes a little smaller than Bulk
Security
  Windows Authentication recommended
  Primary/secondary need access to Monitor to write events
Full-text
Shipping multiple databases to a single secondary
  Capacity of secondary
  Application considerations

Page 63: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Types of Role Changes – 1

Planned
  Known downtime, such as performing maintenance on the primary
  Steps:
    1. Get the tail of log on primary.
    2. Copy tail to secondary.
    3. Make sure all logins, transaction logs applied.
    4. Bring database online.
    5. Have clients reconnect.

Page 64: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Types of Role Changes – 2

Unplanned
  Catastrophic event
  Steps:
    1. Tail may not be available.
    2. Make sure all available logs applied to secondary.
    3. Bring database online.
    4. Have clients reconnect.
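The difference between a planned and an unplanned role change comes down to whether the tail of the log can still be captured on the primary. The Python sketch below builds the corresponding T-SQL for both cases without executing it. It assumes the tail backup is written to a location both servers can read (so the copy step is implicit), it does not cover transferring logins or redirecting clients, and the function name, database name, and path are hypothetical.

```python
from typing import List, Optional, Tuple


def role_change_plan(db: str, tail_path: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return (server, statement) pairs for bringing the secondary online.

    Pass tail_path for a planned change; leave it as None when the tail of
    the log is not available (unplanned change).
    """
    if tail_path is not None:
        return [
            # Planned: back up the tail and leave the old primary in a restoring state.
            ("primary", f"BACKUP LOG [{db}] TO DISK = N'{tail_path}' WITH NORECOVERY"),
            # Apply the tail and recover, which brings the database online.
            ("secondary", f"RESTORE LOG [{db}] FROM DISK = N'{tail_path}' WITH RECOVERY"),
        ]
    # Unplanned: all available logs are assumed to be applied already, so just recover.
    return [("secondary", f"RESTORE DATABASE [{db}] WITH RECOVERY")]


if __name__ == "__main__":
    for server, statement in role_change_plan("Sales", r"\\share\tail.trn"):
        print(f"{server}: {statement}")
    for server, statement in role_change_plan("Sales"):
        print(f"{server}: {statement}")
```

In the unplanned case, whatever was in the tail of the log is lost, which is why the backup, copy, and restore frequency matters so much.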

Page 65: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Client Redirection

After a role change, applications and clients need to access the new primary.
Methods to consider:
  ODBC DSN
  Network Load Balancing
  Rename the SQL Server (non-clustered only)
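Whichever redirection method is chosen, the application still has to cope with a failed connection and try again. The Python sketch below shows one application-side way to do that: try a list of server names in order and use the first that answers. This is not one of the three methods listed above; it is an illustrative supplement, and the `connect` callable stands in for whatever database driver the application really uses.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_fallback(servers: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each server in order and return the first successful connection."""
    last_error = None
    for server in servers:
        try:
            return connect(server)
        except Exception as exc:  # a real client would catch its driver's error type
            last_error = exc
    raise ConnectionError(f"no server reachable: {list(servers)}") from last_error


if __name__ == "__main__":
    def fake_connect(server: str) -> str:
        # Stand-in for a real driver call; the old primary is treated as down.
        if server == "OLDPRIMARY":
            raise OSError("server not reachable")
        return f"connected to {server}"

    print(connect_with_fallback(["OLDPRIMARY", "NEWPRIMARY"], fake_connect))
```

The trade-off is that every client must know the list of candidate servers, whereas a DSN change or NLB keeps that knowledge in one place.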

Page 66: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Switching Back to the Old Primary

Do you switch back to the original primary when it is available?
  As long as secondary has the capacity, do not switch back because it will create another interruption in availability.
  If needed, schedule at an off-hours time.
  Must reinitialize the old primary first.

Page 67: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Log Shipping: Log Shipping and Network Load Balancing

[Diagram labels: Client; Virtual IP Address; NLB Cluster; Primary SQL Server (IP1, IP2); Secondary SQL Server (IP1, IP2); Log Shipping]

Page 68: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Agenda

Availability Basics
Failover Clustering
  Clustering Basics
  Failover Cluster Configuration
  Failover Cluster Administration
Log Shipping
Summary

Page 69: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Summary

SQL Server can be made highly available
Whatever technology or technologies used must be part of an overall high availability/disaster recovery plan/strategy that encompasses the technology as well as the entire solution you are making available (e.g. shared disk is a potential single point of failure)
Planning is everything; installation is the easy part
Disk, network, memory/processor, capacity planning ALL crucial
Ensure that things are properly configured prior to moving on to the next step
Test, test ... and test some more
Ensure staff is properly trained on administrative tasks – clusters are very similar to standalone, but not in every way. What you do not know may hurt you!

Page 70: Allan Hirt Avanade Siebel High Availability Considerations on the Microsoft Platform

Helpful Links/Other Info

754 pages of SQL HA information – SQL Server 2000 High Availability (MS Press)
  http://www.microsoft.com/mspress/books/6515.asp
SQL Server 2000 Failover Clustering Whitepaper (being updated now)
  http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/sql/deploy/confeat/failclus.asp
SQL Server 2000 Planning for Server Consolidation whitepaper (coming soon)
Previous TechNet webcast – SQL Server High Availability: The Good, The Bad, and The Challenging
  http://www.microsoft.com/usa/webcasts/ondemand/1751.asp
Windows Clustering Whitepapers
  http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/windowsserver2003/technologies/clustering/default.asp
Clustering Resource Center
  http://www.microsoft.com/windowsserver2003/technologies/clustering/default.mspx
SQL Server 2000 SP3 Security Whitepaper
  http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/sql/maintain/security/sp3sec/Default.asp