10 things you need to know about longhorn failover clustering inf309 elden christensen program...


Upload: junior-george

Post on 17-Jan-2018



TRANSCRIPT

10 Things You Need to Know About Longhorn Failover Clustering
INF309
Elden Christensen, Program Manager, Microsoft Corporation

Agenda
10 key changes coming for Failover Clustering in Windows Server codenamed Longhorn that you need to know about:
1. Cluster Validation
2. Revamped Setup
3. New Cluster Experience
4. Networking Enhancements
5. Getting to Longhorn
6. New Security Model
7. New Quorum Model
8. Geographically Dispersed Cluster Enhancements
9. Shared Storage Topologies
10. Storage Compatibility

Terminology Changes
- Beta: Wolfpack
- Windows NT 4.0: Microsoft Cluster Service (MSCS)
- Windows 2000 Server / Windows Server 2003: Server Clustering
- Windows codenamed Longhorn Server: Failover Clustering (WSFC)

Where is Clustering Going
What's Clustering in Longhorn all about?
- Simplicity, Security, Stability
- Clusters for people without PhDs
- Easy to create, use, and manage
- Enabling the IT Generalist
- Reduce Clustering Total Cost of Ownership
- Making Clusters a smart business choice for the enterprise
- Improvements in Security, Networking, Eventing, and Storage

Motivation for Validate
- Configuration Issues
- Cabling mistakes
- SP and Hotfix binaries
- Driver mismatches
- Inconsistent Settings
- Complexity
- Best Practices
- Supportability Requirements
- Hardware Compatibility
- If we can eliminate the configuration issues up front, we can ensure a better cluster experience (installation and operation)
- 48% of Cluster support calls are due to configuration problems (Microsoft PSS)
- 80% of failures are due to human error (Gartner)

What Is Cluster Validate?
- Runs a focused set of tests on a collection of servers that are intended to be a cluster
- Catch hardware or configuration problems before the cluster goes in production
- Ensures that the solution you are about to deploy is rock solid
- Run validate each and every time you install a new cluster
- It's the very first thing you do!
- Validate can also be run on configured clusters as a diagnostic tool
- Disk Resources need to be in an Offline state to be validated

Cluster Validation

Easy To Create Clusters
- Simple: create an entire cluster in one step; setup is streamlined and simplified; intuitive
- Validation: all the power of a full cluster test suite in your hands to ensure that the actual cluster you are setting up will provide rock solid stability; catch configuration issues
- Deployable: fully scriptable for automated deployments; new Create Cluster API allows fully customizable experience

Cluster Setup

New User Experience
- All New Cluster Management Tool!!
- Designed to be task-based and easy to use
- Fewer dials-n-knobs to worry about
- "What's all this IsAlive/LooksAlive stuff I don't care about?"
- "Just make my cluster work!"
- Tell us what you want to do and we'll take care of the rest
- "I would like to make this File Share Highly Available"
- Cluster Administrator Tool Today
- New Cluster MMC Snap-in

Manageability
- New Cluster Management Snap-in
- Command line (cluster.exe)
- Cluster Management Console
- Fully Scriptable with WMI
- Richer Tool Experience
- Exposes Advanced Options
- Task Oriented
- Phasing out MSClus
- Cluster MOM Management Pack

Networking Enhancements
- Integrated with new Longhorn TCP/IP Stack
- Full IPv6 Support
- Native IPv6 support for client access, native and tunnels
- Inter-node communication with IPv6
- DHCP Support for IPv4 Resources
- Obtain cluster IP address from a DHCP server
- Relieves management pain of static IPs

Networking Enhancements (continued)
- No more legacy dependencies on NetBIOS
- Ready for NetBIOS-less environments
- Simplifying the transport of SMB traffic
- Removing WINS and NetBIOS name resolution broadcasts
- Standardizing name resolution on DNS
- Moved from datagram RPC protocols to more secure TCP session oriented protocols
- Improvements in IPSec to allow almost instantaneous failover for clients

Cluster Migrations
- Cluster Migration Tool
- Will assist migration of a cluster configuration from one cluster to another
- Copies resources and cluster configurations from one cluster to another
- No Mixed Version Compatibility
- A Longhorn node and a Win2003 node cannot be in the same cluster at the same time
- No rolling upgrades

Migrating to Longhorn: Step 1
- All nodes running Windows Server 2003
- Group owned by Node 1 hosting IP Address, Network Name, Physical Disk, and File Share resources
- (Diagram: a single Windows Server 2003 cluster; resources: IP, Name, Disk, File Share)

Migrating to Longhorn: Step 2
- Evict Node 2 from the Windows 2003 cluster
- Perform clean install of Longhorn on Node 2
- Create an independent single node cluster with a new Cluster Name
- (Diagram: two separate clusters, Longhorn and 2003; resources: IP, Name, Disk, File Share)

Migrating to Longhorn: Step 3
- Run the Longhorn Migration Wizard on Node 2
- Designate Node 2 as target and Node 1 as source
- Perform a group by group copy of resources
- Groups are created in an Offline state
- (Diagram: two copies of the IP, Name, Disk, File Share group, one with resources Online and one with resources Offline)

Migrating to Longhorn: Step 4
- Longhorn Cluster is pre-staged and ready for migration
- Bring group Offline on Node 1
- Bring group Online on Node 2
- (Diagram: two copies of the IP, Name, Disk, File Share group, one with resources Offline and one with resources Online)

Migrating to Longhorn: Step 5
- Install Longhorn on Node 1
- Join Node 1 to the existing Longhorn cluster with Node 2
- Resources can now be failed back and forth
- Migration is now complete!
- (Diagram: a single cluster; resources IP, Name, Disk, File Share Online)

Service Manageability: Improved Security Model
- Cluster Service now runs in the context of the LocalSystem built-in account
- No more Cluster Service Account (CSA)
- No more account password management
- No need to pre-stage defined user accounts
- More resilient to configuration issues
- Addresses supportability issues where privileges are accidentally stripped by group policies
- Increased security

New Security Context
How does this impact you?
- Cluster Service starts with set privileges
- Resource Hosting Subsystem (RHS) launched in the same context with the same privileges
- Resource DLLs and Applications are launched in the same context as RHS with the same set of privileges
- No common identity
- In short, any custom resource DLL or applications leveraging the Generic Application or Generic Script resource types will have reduced privileges and no remote-ability
- You are responsible for handling the credentials your applications require
- Test your apps and resources with Windows Server Longhorn!

New Quorum Model
- Majority based cluster membership
- Who and what gets a vote is fully configurable
- Eliminating failure points
- Original design assumed that storage would be always available
- New best-of-both-worlds quorum model
- Hybrid of legacy Majority Node Set (MNS) logic and Shared Disk Quorum model
- This model will replace both existing models
- No single point of failure!
- Can survive loss of the Quorum disk

Majority Quorum Model
- New majority based quorum model
- Majority of Nodes based quorum
- Disk is optional witness to have a vote in deciding majority
- 3 total votes, with 2 needed for majority
- So the Cluster can survive the loss of any 1 vote
- Shared Storage Device gets 1 vote
- Each node counts as 1 vote

Majority of Nodes
- Replicated Storage Devices
- Only Nodes get votes
- 3+ Node votes without Shared Storage vote
- Majority of votes needed to operate cluster
- No shared disk vote

Witness Disk
- Shared Storage Device is master
- Only Disk gets a vote
- Nodes have no votes
- Quorum disk is the master
- Cluster stays up even if only 1 node can talk to the disk
- Achieves same behavior as legacy quorum model

File Share Witness
- File Share Witness allows a 2-node cluster with no shared disk
- Majority of Nodes + Witness based quorum
- Excellent solution for GeoClusters
- Could reside at a 3rd Site
- Single file server could serve as the Witness for multiple clusters
- File Share on an independent server
- Each node counts as 1 vote

Stretching Clusters
- Stretching Nodes across the river used to be good enough
- But businesses are now demanding more!

Geographically Dispersed Clusters

No More Single-Subnet Limitation
- Allow cluster nodes to communicate across network routers
- No more having to connect nodes with VLANs!
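
The vote counting behind the quorum models described earlier in this section can be illustrated with a small sketch. This is only an illustration of the arithmetic on the slides (each node counts as 1 vote, an optional disk or file share witness adds 1 vote, and a strict majority is needed); the `QuorumConfig` class and its method names are hypothetical and are not part of any Windows clustering API. The legacy-style Witness Disk mode, where only the disk gets a vote, is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical model of majority-based quorum voting."""
    nodes: int             # each node counts as 1 vote
    witness: bool = False  # optional disk or file share witness adds 1 vote

    @property
    def total_votes(self) -> int:
        return self.nodes + (1 if self.witness else 0)

    @property
    def votes_needed(self) -> int:
        # A strict majority: more than half of all configured votes.
        return self.total_votes // 2 + 1

    @property
    def tolerable_losses(self) -> int:
        # How many votes can be lost while the cluster keeps quorum.
        return self.total_votes - self.votes_needed

    def has_quorum(self, nodes_up: int, witness_up: bool = False) -> bool:
        votes = nodes_up + (1 if self.witness and witness_up else 0)
        return votes >= self.votes_needed


# Two nodes plus a witness: 3 total votes, 2 needed, any 1 vote can be lost.
two_node = QuorumConfig(nodes=2, witness=True)
print(two_node.total_votes, two_node.votes_needed, two_node.tolerable_losses)

# Without a witness, a 2-node cluster cannot lose any vote.
print(QuorumConfig(nodes=2).tolerable_losses)
```

Under these assumptions the sketch also shows why a File Share Witness makes a 2-node cluster without a shared disk practical: the third vote is what allows one failure to be survived.
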

Configurable Heartbeat Timeouts
- Increase to extend Geographically Dispersed Clusters over greater distances
- Decrease to detect failures faster and take recovery actions for quicker failover

Enhanced Dependencies
- New Dependency Filter Objects
- Network Name resource stays up if either IP Address resource A or B is up
- Today both resource A and B have to be online for the Network Name to be available to users
- Allows redundant resources and scoping impact to dependent services and applications
- (Diagram: Network Name Resource depends on IP Address Resource A OR IP Address Resource B)

Shared Storage Topology Requirements
- Only storage that supports Persistent Reservations will be supported in Longhorn Failover Clustering
- Deprecating parallel-SCSI support
- Serial Attached SCSI (SAS) based clusters will replace parallel-SCSI
- Supported Shared Bus Types: Fibre Channel, iSCSI, SAS

Storage Enhancements
- Improved disk fencing for shared disks
- Enhanced mechanism to use Persistent Reservations
- New algorithm for managing shared disks
- No more device resets with PRs!
- No longer uses SCSI Bus Resets which can be disruptive on a SAN
- Disks are never left in an unprotected state
- Tight integration into core OS disk management
- Support for GPT disks

Windows Server Longhorn Will Be A Clean Slate
Compatibility
- Some hardware may not be upgradeable
- Cannot assume solutions that previously worked with clustering will continue to work in Longhorn Clustering
Supportability
- There will be no grandfathering of support for currently qualified solutions listed on the Windows Server Catalog
- A solution proven to work with Win2003 Clustering means nothing in the context of Longhorn Clustering compatibility

SCSI Command Requirements
Storage must support the following SCSI-3 SPC-3 compliant SCSI commands:
- Unique IDs: Vital product data (VPD), device identification page (page code 83h) with Identifier Type 2 (EUI-64 based), 3 (NAA), or 8
- PERSISTENT RESERVE IN Read Keys (00h)
- PERSISTENT RESERVE IN Read Reservation (01h)
- PERSISTENT RESERVE OUT Reserve (01h); Scope: LU_SCOPE (0h); Type: Write Exclusive Registrants Only (5h)
- PERSISTENT RESERVE OUT Release (02h)
- PERSISTENT RESERVE OUT Clear (03h)
- PERSISTENT RESERVE OUT Preempt (04h)
- PERSISTENT RESERVE OUT Register AND Ignore Existing Key (06h)

New Cluster Architecture
- Major change is that ClusDisk is no longer in the disk fencing business
- (Diagram components, user mode: ClusSvc.exe, ClusRes.dll, Disk Resource, RHS.exe, CluAdmin.msc, ClusAPI, CPrepSrv, Validate, WMI)
- (Diagram components, kernel mode: Volume C:\, Volume F:\, PartMgr.sys, Disk.sys, ClusDisk.sys, MS MPIO Filter, Storport, Miniport, NetFT, control path)
- (Diagram components, hardware: HBA, Storage enclosure)

Persistent Reservation Table
1. Every interface has an entry in the registration table
2. You must be registered to place a reservation
3. Challenging nodes attempt to register
4. Registrations with unknown keys are periodically scrubbed
- Key is known and unique
- Anyone who knows the key has access to the disk
- Persistent Reservation Table in the external storage (Registration Table and Reservation Table)

Registration Table
Interface     Key
Node1_HBA1    Key1
Node1_HBA2    Key1
Node2_HBA1    Key2
Node2_HBA2    Key2

Registration Defense Protocol: Successful defense
(Timeline diagram, in seconds)
- Defender Node: Register and Reserve; Read; Read and Purge
- Challenger Node: Read; Register and Reserve (fails); Read; Preempt attempt fails
- Outcome: challenge fails, successful defense

Registration Defense Protocol: Successful challenge
(Timeline diagram, in seconds)
- Defender Node: Existing Reserve
- Challenger Node: Read; Register and Reserve (fails); Read; Preempt and Reserve
- Outcome: successful challenge

HBA Requirements
- All Host Bus Adapters (HBA) must use a Storport mini-port driver
- All multi-path software must be based on MS MPIO
- If using a custom DSM it must have a logo
- All components in a cluster must have a Designed for Windows logo

Summary/Call to Action
- Try out the new cluster experience and send us feedback
- Test Storage to ensure compatibility with Persistent Reservations
- Test custom resource DLLs and cluster aware applications for compatibility with new security model

Resources
- Webcast: Overview of all the new Longhorn Cluster Features:Culture=en-USCulture=en-US
- Chalk Talk: INFCT02 Got Questions about Windows Clustering? Wednesday 13:30-14:45
- Hands-on-Lab: INFHOL01 Creating Highly Available Services with Failover Clustering
- Breakout Session: UCM310 High Availability and Clustering in Exchange Server 2007, Wednesday 17:00-18:15

Come Give Us Your Feedback!
- Come to the Cluster Focus group and talk to us about how you use clustering and give us your feedback on how we can improve it to meet your business needs
- Thursday 11/16: Session 1: 10:00-12:00; Session 2: 13:00-15:00

Win an Xbox 360!
- Cluster Focus Group attendees have a chance to win an Xbox 360

Ask The Experts: Get Your Questions Answered
- You can find Elden at the Microsoft Ask the Experts area, located in the Exhibition Hall:
- Wednesday 15 November, 16:30-17:00
- Thursday 16 November, 14:45-15:45

Ask The Experts: Get Your Questions Answered
- You can find Manish at the Microsoft Ask the Experts area, located in the Exhibition Hall:
- Thursday 16 November, Lunch
- Friday 17 November, 10:15-10:45

Appendix

Virtual Server Host Clustering
- Virtual machine Guests failover from one node to another
- VS is a clustered application running on a cluster. VHDs reside on shared disk
- Hosts are clustered
- Guests are not clustered
- Guest VMs can run any OS

Virtual Server Guest Clustering
- Applications failover from one Guest to another
- Guests are effectively nodes in a cluster that access shared storage with a NIC and the iSCSI Software Initiator. VHDs reside on host disk
- Guests are clustered
- Hosts are not clustered
- Guests run Win2003
- User data resides on shared disk

2006 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
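
Illustrative sketch: the registration rules and the defense/challenge timelines from the Persistent Reservation slides can be mimicked with a toy in-memory model. This is a simplified sketch and not real SCSI command handling or the actual cluster disk driver logic; `SharedDisk`, `challenge`, and all method names are hypothetical. It assumes a challenger's registration is scrubbed only while the defender is alive, which is how the slides describe a defense succeeding.

```python
class SharedDisk:
    """Toy stand-in for the persistent reservation tables in external storage."""

    def __init__(self):
        self.registrations = {}  # interface name -> registered key
        self.holder_key = None   # key that currently holds the reservation

    def register(self, interface, key):
        self.registrations[interface] = key

    def reserve(self, interface):
        key = self.registrations.get(interface)
        if key is None:
            return False  # you must be registered to place a reservation
        if self.holder_key not in (None, key):
            return False  # someone else already holds the reservation
        self.holder_key = key
        return True

    def scrub(self, known_keys):
        # Defender periodically purges registrations with unknown keys.
        self.registrations = {
            i: k for i, k in self.registrations.items() if k in known_keys
        }

    def preempt(self, interface, victim_key):
        key = self.registrations.get(interface)
        if key is None:
            return False  # registration was purged, so the preempt fails
        # Remove the victim's registrations and take over the reservation.
        self.registrations = {
            i: k for i, k in self.registrations.items() if k != victim_key
        }
        self.holder_key = key
        return True


def challenge(disk, interface, key, defender_alive):
    """Run one challenge; return True if the challenger ends up owning the disk."""
    defender_key = disk.holder_key   # Read the current reservation
    disk.register(interface, key)    # Register the challenger's key
    if disk.reserve(interface):      # Succeeds only if the disk is unreserved
        return True
    if defender_alive:
        disk.scrub({defender_key})   # Defender's "Read and Purge" step
    return disk.preempt(interface, defender_key)


disk = SharedDisk()
disk.register("Node1_HBA1", "Key1")
disk.reserve("Node1_HBA1")
print(challenge(disk, "Node2_HBA1", "Key2", defender_alive=True))   # defense holds
print(challenge(disk, "Node2_HBA1", "Key2", defender_alive=False))  # challenger wins
```

In this model a live defender always wins because the challenger's key is purged before the preempt; a real implementation depends on timing, which the slides show only as a timeline in seconds.
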