High Availability Deep Dive: What’s New in vSphere 5
David Lane, Virtualization Engineer, High Point Solutions
Agenda
• What is High Availability
• What’s New in vSphere 5
• Core Components of High Availability in vSphere 5
• How High Availability Works in vSphere 5
• Scenarios for High Availability in vSphere 5
• Exploiting High Availability with vSphere 5
• Q&A
What is High Availability?
• The Answer to Hardware Density Concerns
• Resilient Architecture
• Automated Recovery
• Simple Setup / Familiar Interface
High Availability Prerequisites
• Minimum of 2 Hosts
• Minimum of 3GB of Host Memory
• VMware vCenter Server
• Shared Storage
• Pingable Constant Address (Gateway)
• HA Communication Firewall Ports (TCP/UDP 8182)
• Essentials Plus and Up
Configuring High Availability
• 10 Steps - 10 Minutes
• Create a Cluster
• Drag and Drop Hosts
What’s New for vSphere 5
• FDM (Fault Domain Manager): New HA Agent
• Master / Slave Nodes
• Datastore Heartbeating
• Enhanced Isolation Validation
• No DNS Dependency
• Supports Management Network Partitions
• Enhanced Admission Control Policies
Core HA Components of vSphere 5
• FDM (Fault Domain Manager)
• VMware vCenter
• hostd
FDM
• Replaces Legato AAM (Automated Availability Manager)
• Single Process Agent with Watchdog Failsafe
• No DNS Dependency, No DNS Limitations
• Consolidated Logging with Syslog Compatibility
• Talks Directly to hostd and vCenter, Not Dependent on VPXA
VMware vCenter
• Deploys FDM Agents in Parallel (AAM Was Serial)
• Communicates Configuration Changes in Cluster to Master Node
• Retrieves Virtual Machine Status
• Displays Protection Status of VMs
hostd
• Required for FDM
• Runs on Host
• Relays Information about VMs on Host
• Responsible for Powering On VMs
How Does High Availability in vSphere 5 Work? The Tools
• Master / Slave Nodes
• Heartbeating
• Isolated vs. Network Partitioned
• Virtual Machine Protection
Master / Slave Nodes
• One Master Node per Cluster (Exception: Network Partitioned)
• Master Node Monitors VM Health and Directs Slaves
• Master Node Takes Ownership of Datastores where VM Configuration Files are Located
• Master Node Reports VM Status to vCenter Server
• Master Node Assigned by Election
• Slaves Monitor Their Running VMs, Send Status to Master, and Perform Restarts on Master Node Request
• Slaves Also Monitor Master Node Health
Master Node Election
• Election Held When HA is Enabled or Reconfigured, and When the Master Node Fails, Becomes Isolated or Partitioned, Disconnects from vCenter, Enters Maintenance Mode, or Enters Standby
• Utilizes UDP
• Takes 15 Seconds
• Host with Most Connected Datastores Wins
• If Multiple Hosts Share the Highest Number of Datastores, the Host with the Highest Managed Object ID (MOID) Wins
• New Master Node will Attempt to Acquire Ownership of All Datastores by Locking the “protectedlist” File (Protected VM List Inventory File, on Datastores in Cluster)
• In the Case of Master Node Isolation, File Locks will be Released
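The election rule on this slide (most connected datastores wins, ties go to the highest MOID) can be sketched in a few lines of Python. This is an illustrative model only, not FDM code: the `Host` class, its field names, and the sample MOIDs are hypothetical, and comparing MOIDs as plain strings is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    moid: str                  # hypothetical managed object ID, e.g. "host-42"
    connected_datastores: int  # number of datastores this host can reach


def elect_master(hosts):
    """Pick the host with the most connected datastores; break ties by highest MOID."""
    if not hosts:
        raise ValueError("no hosts available for election")
    # Tuples compare element by element: datastore count first, then MOID.
    # The MOID is compared as a plain string (an assumption of this sketch).
    return max(hosts, key=lambda h: (h.connected_datastores, h.moid))


hosts = [
    Host("host-21", 4),
    Host("host-35", 6),
    Host("host-42", 6),
]
master = elect_master(hosts)
print(master.moid)
```

Here `host-35` and `host-42` tie on datastore count, so the MOID decides the outcome.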
Heartbeating
• Network Heartbeating
• Datastore Heartbeating
Network Heartbeating
• Heartbeats Sent from Slaves to Master and from Master to Slaves
• Heartbeats Sent Every Second
• Determines the State of the Hosts
Datastore Heartbeating
• Prevents Unnecessary Restarts
• Extra Heartbeat Added to Determine State if Management Network is Lost
• Validates Failure or Just Isolation
• Uses PowerOn File to Determine Isolation
Isolated vs. Network Partitioned
• Isolated (Host Separated from Master; VMs May be Restarted)
– Not Receiving Heartbeat from Master
– Not Receiving Election Traffic
– Cannot Ping Isolation Address
• Partitioned (Multiple Hosts Isolated but Can Communicate with Each Other over Management Network)
– Not Receiving Heartbeats from Master
– Does Receive Election Traffic
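The criteria on this slide amount to a small decision table, which can be sketched as follows. The function name, parameter names, and return strings are hypothetical. The combination the slide does not describe (no master heartbeat, no election traffic, but the isolation address still answers) is returned as "undetermined" rather than guessed at.

```python
def classify_host(master_heartbeat, election_traffic, isolation_address_pingable):
    """Classify a host from its own point of view, per the slide's criteria."""
    if master_heartbeat:
        return "connected"
    if election_traffic:
        # No master heartbeat, but other hosts are still reachable.
        return "partitioned"
    if not isolation_address_pingable:
        # No heartbeat, no election traffic, and no ping reply.
        return "isolated"
    # Not covered by the slide; left open in this sketch.
    return "undetermined"


print(classify_host(False, True, True))
print(classify_host(False, False, False))
```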
Virtual Machine Protection
• vCenter Server Performs Protection on State Change
• Protection Guaranteed when the Master has Committed the Change of State to Disk
• Protectedlist File Contains VM State and Protection
Scenarios for High Availability in vSphere 5: Using the Tools
• Failed Host
• Isolated Host
• Application Monitoring - Failed VM OS
Failed Host
• Failed Master Host
– Master Election Initiated
– New Master Elected
– New Master Restarts All VMs on the Protectedlist with Not Running State
• Failed Slave Host
– Master Checks Network Heartbeat
– Master Checks Datastore Heartbeat
– Master Restarts VMs Affected
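The failed-slave sequence (network heartbeat check, then datastore heartbeat check, then restart) can be sketched from the master's side as below. All names, the dictionary layout of the protected VM list, and the sample hosts are hypothetical. The sketch restarts VMs only for a host judged failed; what happens to an isolated host depends on its isolation response, which is not modeled here.

```python
def assess_slave(network_heartbeat_ok, datastore_heartbeat_ok):
    """Master-side view of a slave, following the two checks on the slide."""
    if network_heartbeat_ok:
        return "alive"
    if datastore_heartbeat_ok:
        # The host still updates shared storage, so it is not dead.
        return "isolated_or_partitioned"
    return "failed"


def vms_to_restart(protected_vms, failed_host):
    """Return the names of protected VMs that were running on the failed host."""
    return [vm["name"] for vm in protected_vms if vm["host"] == failed_host]


# Hypothetical stand-in for the protected VM inventory.
protected_vms = [
    {"name": "web01", "host": "esx-a"},
    {"name": "db01", "host": "esx-b"},
    {"name": "app01", "host": "esx-b"},
]

state = assess_slave(network_heartbeat_ok=False, datastore_heartbeat_ok=False)
if state == "failed":
    print(vms_to_restart(protected_vms, "esx-b"))
```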
Isolated Host
• Isolation Responses
– Power Off
– Leave Powered On
– Shut Down
• Isolation Detection
– Slaves will Hold Single Server Election and Check Ping Address
– Master will Check Ping Address
– Master Restarts VMs Affected
Application Monitoring - Failed VM OS
• Restarts Individual VM When Needed
• Configurable VM Tools Heartbeat
• Monitors Network and Storage I/O Activity as Fail-Safe
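The monitoring logic on this slide, a missed tools heartbeat confirmed by an absence of I/O, can be sketched as a single predicate. The function and parameter names are hypothetical, and the 30-second interval in the usage lines is an arbitrary example value rather than a documented default.

```python
def should_restart_vm(seconds_since_tools_heartbeat, failure_interval, io_seen_recently):
    """Restart only if the heartbeat is overdue AND no recent I/O was observed."""
    if seconds_since_tools_heartbeat < failure_interval:
        return False  # heartbeat is still within the configured interval
    # Fail-safe: network or storage I/O suggests the guest is still working.
    return not io_seen_recently


# Example values only; the interval is configurable per the slide.
print(should_restart_vm(45, failure_interval=30, io_seen_recently=True))
print(should_restart_vm(45, failure_interval=30, io_seen_recently=False))
```

The first call leaves the VM alone because I/O is still being observed; the second call returns `True`.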
Exploiting HA with vSphere 5
• Stretched Clusters
– Storage DRS
• Blade Chassis Failure
• Larger Clusters, Tenant-Based Cloud
Q&A
THANK YOU