TRANSCRIPT
CMG: VMware vSphere Performance Boot Camp
John Paul, Managed Services, R&D
February 29, 2012
Copyright © 2012 Siemens Medical Solutions USA, Inc. All rights reserved.
Acknowledgments and Presentation Goal
The material in this presentation was pulled from a variety of sources, some of which was graciously provided by VMware and Intel staff members. I acknowledge and thank the VMware and Intel staff for their permission to use their material in this presentation.
This presentation is intended to review the basics of performance analysis for the virtual infrastructure, with detailed information on the tools and counters used. It presents a series of examples of how the performance counters report different types of resource consumption, with a focus on key counters to observe. The performance counters do change, and we are not going to go over all of them.
ESXTOP and RESXTOP will be used interchangeably since both tools effectively provide the same counters. The screen shots in this presentation have their colors inverted for readability. The presentation shows screen shots from vSphere 4 and 5 since both are actively in use.
Introductory Comments
Introductory Comments
Trends to Consider – Hardware
Intel Strategies – Intel necessarily had to move to a horizontal, multi-core strategy due to the physical restrictions of what could be done on the current technology base. This resulted in:
 Increasing number of cores per socket
 Stabilization of processor speed (i.e., processor speeds no longer are increasing according to Moore's Law and in fact are slower for newer models)
 Focus on new architectures that allow for more efficient movement of data between memory and the processors, between external sources and the processors, and larger and faster caches associated with different components
OEM Strategies – As Intel moved down this path, system OEMs have assembled the Intel (and AMD) components in different ways (such as multi-socket) to differentiate their offerings. It is important to understand the timing of the Intel architectural releases, the OEM implementation of those releases, and the operating system vendors' use of those features in their code. They aren't always aligned.
Introductory Comments
Trends to Consider – Hardware Assisted Virtualization
Intel VT-x and AMD-V – This provides two forms of CPU operation (root and non-root), allowing the virtualization hypervisor to be less intrusive during workloads. Hardware virtualization with a Virtual Machine Monitor (VMM) versus binary translation resulted in a substantial performance improvement.
Intel EPT and AMD RVI – Memory management unit (MMU) virtualization supports extended page tables (EPT), which eliminated the need for ESX to maintain shadow page tables.
Intel VT-d and AMD-Vi – I/O virtualization assist allows the virtual machines to have direct access to hardware I/O devices, such as network cards and storage controllers (HBAs).
Introductory Comments
Trends to Consider – Software
VMware/Hypervisor Strategies – Horizontal scalability at the hardware layer requires comparable scalability for the hypervisor:
 NUMA support required hypervisor scheduler changes (Wide NUMA)
 Larger CPU and RAM virtual machines
 Efficiency while running larger, more complex workloads
 Abstraction model versus consolidation model for some workloads
 Performance guarantees for the Core Four
 Federation of management tools and resource pools
Introductory Comments
Scheduler
vCPU – The vCPU is an aggregation of the time the scheduler allocates to the workload. It time-slices each core based upon the type of configuration, and it constantly changes which core the VM is on, unless affinity is used.
SMP Lazy Scheduling – The scheduler continues to evolve, using lazy scheduling to launch individual vCPUs and then having others "catch up" if CPU skewing occurs. Note that this has improved across releases.
SMP – Note that SMP effectiveness is NOT linear, depending upon workloads. It is very important to load test your workloads on your hardware to measure the efficiency of SMP. We have found that the higher the number of SMP vCPUs, the lower the efficiency.
Introductory Comments
Resource Pools – These are really a way to group the amount of resources allocated to a specific workload, or grouping of workloads, across VMs and hosts.
Single Unit of Work – It is easy to miss the fact that resource "sharing" really does not affect the single unit of work. The Distributed Resource Scheduler (DRS) moves VMs across ESX hosts where there may be more resources.
Perfmon Counters – The inclusion of ESX counters (via VMware Tools) into the Windows Perfmon counters is helpful for overall analysis and higher-level performance analysis. Many counters are not exposed yet in Perfmon.
Introductory Comments
Hyper-threading – The pre-Nehalem Intel architecture had some problems with hyper-threading efficiencies, causing many people to turn off hyper-threading. The Nehalem architecture seems to have corrected those problems, and vSphere is now hyper-threading aware. You need to understand how hyper-threading works so you know how to interpret the VMware tools (such as ESXTOP, ESXPlot). The different cores will be shown as equals while they don't have equal capacity.
Microsoft Hyper-V – While we are not going to be diving into Hyper-V (or Xen), the basic principles of the hypervisors are the same, though the implementation is quite different. There are good reasons why VMware leads the market in enterprise virtualization implementations.
Introductory Comments – Decision Points
WHAT should we change? – VMware continues to expose more performance-changing settings. One of the key questions that needs to be answered is whether you should take the default settings for operational simplicity or fine-tune for the best possible performance.
NUMA Awareness and Control – Should the NUMA control be turned over to the guest operating system?
ESXTOP versus RESXTOP – Though both work, the use of ESXTOP on the actual host requires SSH to be enabled, which may violate security guidelines.
vSphere Architecture
VMware ESX Architecture
[Architecture diagram] Key points from the diagram:
 CPU is controlled by the scheduler and virtualized by the monitor
 The monitor supports BT (Binary Translation), HW (Hardware assist), and PV (Paravirtualization)
 Memory is allocated by the VMkernel and virtualized by the monitor
 Network and I/O devices are emulated and proxied through native device drivers
 Guest-side components: TCP/IP stack, file system, virtual NIC, virtual SCSI
 VMkernel components: memory allocator, scheduler, virtual switch, file system, NIC drivers, I/O drivers, all running on the physical hardware
Performance Analysis Basics
Key Reference Documents
 vSphere Resource Management (EN-000591-01)
 Performance Best Practices for VMware vSphere 5.0 (EN-000005-04)
Performance Analysis Basics
Types of Resources – The Core Four (Plus One)
Though the Core Four resources exist at both the ESX host and virtual machine levels, they are not the same in how they are instantiated and reported against.
 CPU – processor cycles (vertical), multi-processing (horizontal)
 Memory – allocation and sharing
 Disk (a.k.a. storage) – throughput, size, latencies, queuing
 Network – throughput, latencies, queuing
Though all resources are limited, ESX handles the resources differently. CPU is more strictly scheduled; memory is adjusted and reclaimed (more fluid) if based on shares; disk and network are fixed-bandwidth (except for queue depths) resources.
The fifth Core Four resource is virtualization overhead!
Performance Analysis Basics
vSphere Components in a Context You Are Used To
 World – The smallest schedulable component for vSphere; similar to a process in Windows or a thread in other operating systems
 Groups – A collection of ESX worlds, often associated with a virtual server or a common set of functions, such as:
• Idle
• System
• Helper
• Drivers
• Vmotion
• Console
Performance Analysis Basics
The Five Contexts of Virtualization and Which Tools to Use for Each
[Diagram] The five contexts, from smallest to largest scope:
 Physical Machine – application, operating system, Intel hardware (PCPU, PMemory, PNIC, PDisk)
 Virtual Machine – application, operating system, virtual hardware (VCPU, VMemory, VNIC, VDisk)
 ESX Host Machine – the virtual machines plus the host's Intel hardware
 ESX Host Farm/Cluster – a group of ESX hosts
 ESX Host Complex – the full set of farms/clusters
The diagram pairs each context with its monitoring tools: PerfMon inside the physical and virtual machines, ESXTOP at the ESX host level, and Virtual Center at the farm/cluster and complex levels.
Remember the virtual context
Performance Analysis Basics
Types of Performance Counters (v4)
 Static – Counters that don't change during runtime, for example MEMSZ (memsize), adapter queue depth, VM name. The static counters are informational and may not be essential during performance problem analysis.
 Dynamic – Counters that are computed dynamically, for example CPU load average and memory over-commitment load average.
 Calculated – Some are calculated from the delta between two successive snapshots; the refresh interval (-d) determines the time between snapshots. For example:
%CPU used = (CPU used time at snapshot 2 - CPU used time at snapshot 1) / time elapsed between snapshots
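The calculated-counter formula above can be sanity-checked with a short sketch. The snapshot values are hypothetical; the function name is illustrative, not an esxtop API:

```python
def pct_cpu_used(used_ms_t1, used_ms_t2, elapsed_ms):
    """Calculated counter: %CPU used between two successive snapshots,
    i.e. (delta of used time) / (elapsed wall time) expressed as a percent."""
    return 100.0 * (used_ms_t2 - used_ms_t1) / elapsed_ms

# With a 5-second refresh interval (-d 5), 2,500 ms of used time -> 50%.
print(pct_cpu_used(10_000, 12_500, 5_000))  # 50.0
```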
Performance Analysis Basics
A Review of the Basic Performance Analysis Approach
 Identify the virtual context of the reported performance problem
• Where is the problem being seen? ("When I do this here, I get that")
• How is the problem being quantified? ("My function is 25% slower")
• Apply a reasonability check ("Has something changed from the status quo?")
 Monitor the performance from within that virtual context
• View the performance counters in the same context as the problem
• Look at the ESX cluster level performance counters
• Look for atypical behavior ("Is the amount of resources consumed characteristic of this particular application or task for the server processing tier?")
• Look for repeat offenders! This happens often.
 Expand the performance monitoring to each virtual context as needed
• Are other workloads influencing the virtual context of this particular application and causing a shortage of a particular resource?
• Consider how a shortage is instantiated for each of the Core Four resources
Performance Analysis Basics
Resource Control Revisited – CPU Example
 Reservation (Guarantees)
• Minimum service level guarantee (in MHz)
• When the system is overcommitted it is still the target
• Needs to pass admission control for start-up
 Shares (Share the Resources)
• CPU entitlement is directly proportional to the VM's shares and depends on the total number of shares issued
• Abstract number, only the ratio matters
 Limit
• Absolute upper bound on CPU entitlement (in MHz)
• Applies even when the system is not overcommitted
[Diagram: a vertical scale from 0 MHz to total MHz, with the reservation as the floor and the limit as the ceiling]
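A minimal sketch of how these three controls interact. The numbers and function names are hypothetical illustrations of the proportional-share idea, not VMware's actual scheduling algorithm:

```python
def share_entitlement(vm_shares, total_shares, available_mhz):
    """Under contention, CPU entitlement is proportional to the VM's
    fraction of the total shares issued; only the ratio matters."""
    return available_mhz * vm_shares / total_shares

def clamp_entitlement(entitled_mhz, reservation_mhz, limit_mhz):
    """The reservation is a guaranteed floor; the limit is an absolute
    ceiling that applies even when the host is not overcommitted."""
    return max(reservation_mhz, min(entitled_mhz, limit_mhz))

# Two VMs with 2000 shares each and one with 1000 shares contend for 10,000 MHz:
e = share_entitlement(2000, 5000, 10_000)  # 4000.0 MHz by share ratio
print(clamp_entitlement(e, 1000, 3000))    # a 3000 MHz limit caps it
```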
Tools
Key Reference Documents
 vSphere Monitoring and Performance (EN-000620-01)
• Chapter 7 – Performance Monitoring Utilities: resxtop and esxtop
 Esxtop for Advanced Users (VSP1999, VMworld 2011)
Tools
Load Generators, Data Gatherers, Data Analyzers
 Load Generators
• IOMeter – www.iometer.org
• Consume – Windows SDK
• SQLIOSIM – http://support.microsoft.com/?id=231619
 Data Gatherers
• ESXTOP
• Virtual Center
• Vscsistats
 Data Analyzers
• ESXTOP (interactive or batch mode)
• Windows Perfmon/System Monitor
• ESXPLOT
Tools
A Comparison of ESXTOP and the vSphere Client
 vC gives a graphical view of both real-time and trend consumption
 vC combines real-time reporting with short-term (1 hour) trending
 vC can report on the virtual machine, ESX host, or ESX cluster
 vC has performance overview charts in vSphere 4 and 5
 vC is limited to 2 unit types at a time for certain views
 ESXTOP allows more concurrent performance counters to be shown
 ESXTOP has a higher system overhead to run
 ESXTOP can sample down to a 2 second sampling period
 ESXTOP gives a detailed view of each of the Core Four
 Recommendation – Use vC to get a general view of the system performance, but use ESXTOP for detailed problem analysis.
Tools
An Introduction to ESXTOP/RESXTOP
 Launched through the vSphere Management Assistant (vMA) or CLI, or via an SSH session (ESXTOP) with the ESX host
 Screens (version 5):
• c: cpu (default)
• d: disk adapter
• h: help
• i: interrupts
• m: memory
• n: network
• p: power management
• u: disk device
• v: disk VM
 Can be piped to a file and then imported into System Monitor/ESXPLOT
 Horizontal and vertical screen resolution limits the number of fields and entities that can be viewed, so choose your fields wisely
 Some of the rollups and counters may be confusing to the casual user
Tools
ESXTOP: New Counters in vSphere 5.0
 World, VM count, vCPU count (CPU screen)
 %VMWait (%Wait - %Idle, CPU screen)
 CPU clock frequency in different P-states (power management screen)
 Failed Disk IOs (disk adapter screen)
• FCMDs/s – failed commands per second
• FReads/s – failed reads per second
• FMBRD/s – failed megabyte reads per second
• FMBWR/s – failed megabyte writes per second
• FRESV/s – failed reservations per second
 VAAI: Block Deletion Operations (disk adapter screen)
• Same counters as Failed Disk IOs above
 Low-Latency Swap (Host Cache – disk screen)
• LLSWR/s – swap-in rate from host cache
• LLSWW/s – swap-out rate to host cache
Tools
ESXTOP: Help Screen (v5)
[Screen shot]
Tools
ESXTOP: CPU screen (v5)
[Screen shot annotations]
 Time and uptime shown; new counter: Worlds = worlds, VMs, vCPU totals
 Fields hidden from the view: ID = ID; GID = world group identifier; NWLD = number of worlds
Tools
ESXTOP: CPU screen (v4) – expanding groups (press the 'e' key)
[Screen shot annotations]
• In the rolled-up view some stats are cumulative of all the worlds in the group
• The expanded view gives a breakdown per world
• A VM group consists of mks (mouse, keyboard, screen), vcpu, and vmx worlds; SMP VMs have additional vcpu and vmm worlds
• vmm0, vmm1 = virtual machine monitors for vCPU0 and vCPU1, respectively
Tools
ESXTOP CPU Screen (v5): Many New Worlds
[Screen shot annotation: new processes using little/no CPU resource]
Tools
ESXTOP CPU Screen (v5): Virtual Machines Only (using the V command)
[Screen shot annotation: a value >= 1 means overload]
Tools
ESXTOP CPU Screen (v5): Virtual Machines Only, Expanded
[Screen shot]
Tools
ESXTOP: CPU screen (v4)
[Screen shot annotations]
 PCPU = physical CPU/core
 CCPU = console CPU (CPU 0)
 Press the 'f' key to choose fields
Tools
ESXTOP: CPU screen (v5)
[Screen shot annotations]
 Core usage now shown (new field)
 PCPU = physical CPU; CORE = core CPU (changed field)
Tools
ESXTOP: Idle State on Test Bed (CPU view, v4)
[Screen shot]
Tools
Idle State on Test Bed – GID 32 Expanded (v4), Virtual Machine View
[Screen shot annotations: wait includes idle; rolled-up GID versus expanded GID; cumulative wait % across the five worlds; total idle %]
Tools
ESXTOP: memory screen (v4)
[Screen shot annotations]
 Possible states: high, soft, hard, and low
 Physical memory (PMEM); PCI hole
 VMKMEM – memory managed by the VMkernel
 COSMEM – memory used by the Service Console
Tools
ESXTOP: memory screen (v5)
[Screen shot annotations: NUMA stats (new fields); changed field]
Tools
ESXTOP: memory screen (v4.0)
[Screen shot annotations]
 Swapping activity in the Service Console and in the VMkernel
 SZTGT = size target, determined by reservation, limit, and memory shares
 SWTGT = swap target; SWTGT = 0 means no swapping pressure
 SWCUR = currently swapped; SWCUR = 0 means no swapping in the past
 MEMCTL = balloon driver
 SWR/s = swap reads/sec; SWW/s = swap writes/sec; SWR/s, SWW/s = 0 means no swapping activity currently
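The interpretation rules in those annotations can be written down as a small helper. This is a sketch: the counter names follow the screen, but the function and its wording are illustrative:

```python
def swap_status(swtgt, swcur, swr_s, sww_s):
    """Interpret the ESXTOP swap counters using the slide's rules of thumb."""
    notes = []
    if swcur == 0:
        notes.append("no swapping in the past")        # SWCUR = 0
    if swtgt == 0:
        notes.append("no swapping pressure")           # SWTGT = 0
    if swr_s == 0 and sww_s == 0:
        notes.append("no swapping activity currently") # SWR/s, SWW/s = 0
    return notes or ["swapping observed - investigate memory pressure"]

print(swap_status(swtgt=0, swcur=0, swr_s=0.0, sww_s=0.0))
```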
Tools
ESXTOP: disk adapter screen (v4)
[Screen shot annotations]
 Host bus adapters (HBAs) – includes SCSI, iSCSI, RAID, and FC-HBA adapters
 Latency stats from the device, the kernel, and the guest:
• DAVG/cmd – average latency (ms) from the device (LUN)
• KAVG/cmd – average latency (ms) in the VMkernel
• GAVG/cmd – average latency (ms) in the guest
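Since the guest-observed latency is roughly the device latency plus the VMkernel latency, the kernel's share can be estimated by subtraction. A sketch with hypothetical sample values:

```python
def kernel_latency(gavg_ms, davg_ms):
    """GAVG (guest-observed) is approximately DAVG (device) + KAVG (VMkernel),
    so the VMkernel component can be estimated as GAVG - DAVG."""
    return gavg_ms - davg_ms

# Hypothetical sample: the guest sees 12 ms while the LUN reports 10 ms,
# leaving roughly 2 ms spent in the VMkernel (queuing included).
print(kernel_latency(12.0, 10.0))  # 2.0
```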
Tools
ESXTOP: disk device screen (v4)
[Screen shot] LUNs in C:T:L format (Controller: Target: LUN)
Tools
ESXTOP: disk VM screen (v4)
[Screen shot] Shows the running VMs
Tools
ESXTOP: network screen (v4)
[Screen shot annotations]
 PKTTX/s – packets transmitted/sec
 PKTRX/s – packets received/sec
 MbTx/s – transmit throughput in Mbits/sec
 MbRx/s – receive throughput in Mbits/sec
 Labels call out the physical NICs, the Service Console NIC, and the virtual NICs
 Port ID – every entity is attached to a port on the virtual switch
 DNAME – the switch where the port belongs
Tools
A Brief Introduction to the vSphere Client
 Screens – CPU, Disk, Management Agent, Memory, Network, System
 vCenter collects performance metrics from the hosts that it manages and aggregates the data using a consolidation algorithm. The algorithm is optimized to keep the database size constant over time.
 vCenter does not display many counters for trend/history screens
 ESXTOP defaults to a 5 second sampling rate while vCenter defaults to a 20 second rate.
 Default statistics collection periods, samples, and how long they are stored:

Interval              | Interval Period | Number of Samples | Interval Length
Per hour (real-time)  | 20 seconds      | 180               | 1 hour
Per day               | 5 minutes       | 288               | 1 day
Per week              | 30 minutes      | 336               | 1 week
Per month             | 2 hours         | 360               | 1 month
Per year              | 1 day           | 365               | 1 year
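The roll-up table is internally consistent: for each interval, the sample period multiplied by the number of samples equals the interval length (treating a month as 30 days). A quick check:

```python
# (period in seconds, number of samples, interval length in seconds)
intervals = {
    "hour":  (20,     180, 3_600),       # 20 s  * 180 = 1 hour
    "day":   (300,    288, 86_400),      # 5 min * 288 = 1 day
    "week":  (1_800,  336, 604_800),     # 30 min * 336 = 1 week
    "month": (7_200,  360, 2_592_000),   # 2 h   * 360 = 30 days
    "year":  (86_400, 365, 31_536_000),  # 1 day * 365 = 1 year
}
for name, (period_s, samples, length_s) in intervals.items():
    assert period_s * samples == length_s, name
print("all roll-ups consistent")
```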
Tools
vSphere Client – CPU Screen (v4)
[Screen shot annotations: to change settings; to change screens]
Tools
vSphere Client – Disk Screen (v4)
[Screen shot]
Tools
vSphere Client – Performance Overview Chart (v4)
[Screen shot] Performance overview charts help to quickly identify bottlenecks and isolate root causes of issues.
Tools
Analyzing Performance from Inside a VM – VM Performance Counters Integration into Perfmon
 Access key host statistics from inside the guest OS
 View "accurate" CPU utilization alongside observed CPU utilization
 Third parties can instrument their agents to access these counters using WMI
 Integrated with VMware Tools
Tools
Summarized Performance Charts (v4)
 Quickly identify bottlenecks and isolate root causes
 Side-by-side performance charts in a single view
 Correlation and drill-down capabilities
 Richer set of performance metrics
[Screen shot annotations: key metrics displayed; aggregated usage]
Tools
A Brief Introduction to ESXPlot
 Launched on a Windows workstation
 Imports data from a .csv file
 Allows an in-depth analysis of an ESXTOP batch file session
 Capture data using ESXTOP batch mode from root using an SSH utility:
• esxtop -a -b > exampleout.csv (for verbose capture)
 Transfer the file to a Windows workstation using WinSCP
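Once transferred, a batch capture is an ordinary CSV and can be inspected without ESXPLOT. A sketch using a synthetic stand-in for the capture; the host name and counter path here are illustrative, though real esxtop batch output does follow the Windows perfmon \\host\group\counter header convention with a timestamp in the first column:

```python
import csv
import io

# Synthetic stand-in for an "esxtop -a -b" capture (hypothetical host/counter).
capture = io.StringIO(
    '"(PDH-CSV 4.0)","\\\\esx01\\Physical Cpu(_Total)\\% Util Time"\n'
    '"02/29/2012 10:00:00","42.5"\n'
    '"02/29/2012 10:00:05","55.0"\n'
)

reader = csv.reader(capture)
header = next(reader)
# Find the column for the counter of interest by its full perfmon-style path.
cpu_col = header.index("\\\\esx01\\Physical Cpu(_Total)\\% Util Time")
samples = [float(row[cpu_col]) for row in reader]
print(sum(samples) / len(samples))  # average utilization across samples
```

Real captures have hundreds of columns, which is exactly why filtering to the fields you need (or using ESXPLOT) matters.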
Tools
ESXPlot
[Screen shot]
Tools
ESXPlot Field Expansion: CPU
[Screen shot]
Tools
ESXPlot Field Expansion: Physical Disk
[Screen shot]
Tools
Top Performance Counters to Use for Initial Problem Determination

Physical/Virtual Machine
 CPU (queuing)
• Average physical CPU utilization
• Peak physical CPU utilization
• CPU Time
• Processor Queue Length
 Memory (swapping)
• Average Memory Usage
• Peak Memory Usage
• Page Faults
• Page Fault Delta*
 Disk (latency)
• Split IO/Sec
• Disk Read Queue Length
• Disk Write Queue Length
• Average Disk Sector Transfer Time
 Network (queuing/errors)
• Total Packets/second
• Bytes Received/second
• Bytes Sent/second
• Output queue length

ESX Host
 CPU (queuing)
• PCPU%
• %SYS
• %RDY
• Average physical CPU utilization
• Peak physical CPU utilization
• Physical CPU load average
 Memory (swapping)
• State (memory state)
• SWTGT (swap target)
• SWCUR (swap current)
• SWR/s (swap read/sec)
• SWW/s (swap write/sec)
• Consumed
• Active (working set)
• Swapused (instantaneous swap)
• Swapin (cumulative swap in)
• Swapout (cumulative swap out)
• VMmemctl (balloon memory)
 Disk (latency, queuing)
• DiskReadLatency
• DiskWriteLatency
• CMDS/s (commands/sec)
• Bytes transferred/received/sec
• Disk bus resets
• ABRTS/s (aborts/sec)
• SPLTCMD/s (I/O split cmds/sec)
 Network (queuing/errors)
• %DRPTX (packets dropped – TX)
• %DRPRX (packets dropped – RX)
• MbTX/s (mb transferred/sec – TX)
• MbRX/s (mb transferred/sec – RX)
Performance Counters in Action
CPU – Understanding PCPU versus VCPU
 It is important to separate the physical CPU (PCPU) resources of the ESX host from the virtual CPU (VCPU) resources that are presented by ESX to the virtual machine.
 PCPU – The ESX host's processor resources are exposed only to ESX. The virtual machines are not aware of and cannot report on those physical resources.
 VCPU – ESX effectively assembles a virtual CPU(s) for each virtual machine from the physical machine's processors/cores, based upon the type of resource allocation (e.g., shares, guarantees, minimums).
 Scheduling – The virtual machine is scheduled to run inside the VCPU(s), with the virtual machine's reporting mechanism (such as W2K's System Monitor) reporting on the virtual machine's allocated VCPU(s) and remaining Core Four resources.
Performance Counters in Action
CPU – Key Question and Considerations
 Is there a lack of CPU resources for the VCPU(s) of the virtual machine or for the PCPU(s) of the ESX host?
 Allocation – The CPU allocation for a specific workload can be constrained by the resource settings: the number of CPUs, the amount of shares, or limits. The key field at the virtual machine level is CPU queuing, and at the ESX level it is Ready to Run (%RDY in ESXTOP).
 Capacity – The virtual machine's CPU can be constrained by a lack of sufficient capacity at the ESX host level, as evidenced by the PCPU/LCPU utilization.
 Contention – The specific workload may be constrained by the consumption of workloads operating outside of their typical patterns.
 SMP CPU Skewing – The movement towards lazy scheduling of SMP CPUs can cause delays if one CPU gets too far "ahead" of the other. Look for higher %CSTP (co-schedule pending).
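Because ESXTOP reports %RDY summed across a group's worlds, it helps to normalize it per vCPU before judging contention. A sketch, where the 10% per-vCPU threshold is a commonly used rule of thumb rather than a value from these slides:

```python
def per_vcpu_ready(group_rdy_pct, n_vcpus):
    """ESXTOP's group %RDY is a sum over worlds, so divide by the vCPU
    count to get a comparable per-vCPU ready time."""
    return group_rdy_pct / n_vcpus

def cpu_contention_flag(group_rdy_pct, n_vcpus, threshold_pct=10.0):
    # threshold_pct is an assumed rule of thumb, not from the presentation
    return per_vcpu_ready(group_rdy_pct, n_vcpus) >= threshold_pct

print(cpu_contention_flag(24.0, 2))  # 12% per vCPU -> True
```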
Performance Counters in Action
CPU State Times and Accounting
[Diagram] Accounting: USED = RUN + SYS - OVRLP
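The accounting identity on this slide as a one-line helper; the percentages below are hypothetical sample values:

```python
def used_pct(run_pct, sys_pct, ovrlp_pct):
    """The slide's accounting identity: USED = RUN + SYS - OVRLP.
    System time is charged to the world; overlap time is backed out."""
    return run_pct + sys_pct - ovrlp_pct

print(used_pct(run_pct=70.0, sys_pct=5.0, ovrlp_pct=2.0))  # 73.0
```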
Performance Counters in Action
High CPU within one virtual machine caused by affinity (ESXTOP v4)
[Screen shot annotations: one physical CPU is fully used; one virtual CPU is fully used]
Performance Counters in Action
High CPU within one virtual machine (affinity) (vCenter v4)
[Screen shots: view of the ESX host; view of the VM]
Performance Counters in Action
SMP Implementation WITHOUT CPU Constraints – ESXTOP v4
[Screen shot annotations: 4 physical CPUs fully used; 4 virtual CPUs fully used; Ready to Run acceptable; one 2-vCPU SMP VM]
Performance Counters in Action
SMP Implementation WITHOUT CPU Constraints – vC v4
[Screen shot]
Performance Counters in Action
SMP Implementation with Mild CPU Constraints – v4
[Screen shot annotations: 4 physical CPUs heavily used; 4 virtual CPUs heavily used; Ready to Run indicates problems; one 2-vCPU SMP VM (7 NWLD)]
Performance Counters in Action
SMP Implementation with Severe CPU Constraints – v4
[Screen shot annotations: 4 physical CPUs fully used; 4 virtual CPUs fully used; Ready to Run indicates severe problems; two 2-vCPU SMP VMs (7 NWLD)]
Performance Counters in Action
SMP Implementation with Severe CPU Constraints – v4
[Screen shot]
Performance Counters in Action
CPU Usage – Without Core Sharing
 The ESX scheduler tries to avoid sharing the same core
[Screen shot]
Performance Counters in Action
CPU Usage – With Core Sharing
[Screen shot]
Introduction to the Intel® QuickPath Interconnect
 The Intel® QuickPath Interconnect is a cache-coherent, high-speed, packet-based, point-to-point interconnect used in Intel's next-generation microprocessors (starting in 2H'08)
 A narrow physical link contains 20 lanes; two uni-directional links complete a QuickPath Interconnect port
 Provides high-bandwidth, low-latency connections between processors, and between processors and the chipset
 Maximum data rate of 6.4 GT/s; 2 bytes/T in 2 directions yields 25.6 GB/s per port
[Diagram: four-socket platform]
Interconnect performance for Intel’s next generation microarchitectures
Intel Topologies
[Diagram: QPI topologies for the Intel® Itanium® processor (Tukwila), Nehalem-EX, Nehalem-EP / Intel® Core™ i7, and Lynnfield – platforms range from 4 full-width links (plus 2 half-width links) down to 2 full-width links, 1 full-width link, or no links]
[Diagram: Nehalem-EP 2S example and Nehalem-EX 4S example, CPUs connected to IOHs]
Different Number of Links for Different Platforms
Intel: QPI Performance Considerations
There is not always a direct correlation between processor performance and interconnect latency/bandwidth. What is important is that the interconnect should perform sufficiently to not limit processor performance.
Max theoretical bandwidth:
Max of 16 bits (2 bytes) of "real" data sent across a full-width link during one clock edge
Double-pumped bus with a max initial frequency of 3.2 GHz
2 bytes/transfer * 2 transfers/cycle * 3.2 GHz = 12.8 GB/s
With the Intel® QuickPath Interconnect at 6.4 GT/s, this translates to 25.6 GB/s across two simultaneous unidirectional links
Max bandwidth with packet overhead:
A typical data transaction is a 64-byte cache line
A typical packet has a header Flit, which requires 4 Phits to transmit across the link
The data payload takes 32 Phits to transfer (64 bytes at 2 bytes/Phit)
With CRC sent inline with the data, a data packet requires 4 Phits for the header + 32 Phits of payload
With the Intel® QuickPath Interconnect at 6.4 GT/s, a 64B cache line transfers in 5.6 ns, which translates to 22.8 GB/s across two simultaneous unidirectional links
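The bandwidth figures on this slide can be reproduced with a few lines of arithmetic. This is a sketch using only the numbers stated above (6.4 GT/s, 2-byte transfers, 4 header + 32 payload phits per 64-byte cache line):

```python
# QPI bandwidth arithmetic, using the values from the slide
GT_PER_S = 6.4e9        # transfers per second on a full-width link
BYTES_PER_TRANSFER = 2  # 16 bits of "real" data per clock edge

raw_per_dir = GT_PER_S * BYTES_PER_TRANSFER   # 12.8 GB/s per direction
raw_per_port = 2 * raw_per_dir                # 25.6 GB/s across both links

# Packet overhead: 4 header phits + 32 payload phits per 64-byte cache line
payload_phits, header_phits = 32, 4
efficiency = payload_phits / (payload_phits + header_phits)
effective = raw_per_port * efficiency         # ~22.8 GB/s effective

transfer_ns = (payload_phits + header_phits) / GT_PER_S * 1e9  # 5.625 ns
```

The 22.8 GB/s figure is simply the 25.6 GB/s raw rate scaled by the 32/36 payload-to-packet ratio.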
Memory
Memory – Separating machine and guest memory
It is important to note that some statistics refer to guest physical memory while others refer to machine memory. "Guest physical memory" is the virtual-hardware physical memory presented to the VM. "Machine memory" is the actual physical RAM in the ESX host. In the figure below, two VMs are running on an ESX host, where each block represents 4 KB of memory and each color represents a different set of data on a block.
Inside each VM, the guest OS maps virtual memory to its physical memory. The ESX kernel maps guest physical memory to machine memory. Due to ESX Page Sharing technology, guest physical pages with the same content can be mapped to the same machine page.
Memory
A Brief Look at Ballooning
The W2K balloon driver is located in VMware Tools
ESX sets a balloon target for each workload at start-up and as workloads are introduced/removed
The balloon driver expands memory consumption, requiring the virtual machine operating system to reclaim memory based on its own algorithms
Ballooning routinely takes 10-20 minutes to reach the target
The returned memory is then available for ESX to use
Key ballooning fields:
SZTGT: determined by reservation, limit, and memory shares
SWCUR = 0: no swapping in the past
SWTGT = 0: no swapping pressure
SWR/s, SWW/s = 0: no swapping activity currently
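As a quick illustration, the swap fields above can be read mechanically. This is a hypothetical sketch: the counter names follow the slide, but the sample values and the helper function are invented for illustration:

```python
# Interpreting esxtop-style swap fields for one VM (sample values are invented)
sample = {"SZTGT": 1024.0, "SWCUR": 0.0, "SWTGT": 0.0, "SWR/s": 0.0, "SWW/s": 0.0}

def memory_state(vm):
    # SWCUR = 0 and SWTGT = 0: no past swapping and no current swap pressure
    if vm["SWCUR"] == 0 and vm["SWTGT"] == 0:
        return "no swapping"
    # Non-zero swap read/write rates mean swapping is happening right now
    if vm["SWR/s"] > 0 or vm["SWW/s"] > 0:
        return "actively swapping"
    return "swapped in the past"

memory_state(sample)  # → 'no swapping'
```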
Memory Interleaving Basics
What is it? A process where memory is stored in a non-contiguous form to optimize access performance and efficiency. Interleaving is usually done at cache-line granularity.
Why do it?
Increase bandwidth by allowing multiple memory accesses at once
Reduce hot spots, since memory is spread out over a wider area
To support NUMA (Non-Uniform Memory Access) based OS/applications – a memory organization where there are different access times for different sections of memory, due to memory being located in different locations; concentrate the data for an application in the memory of the same socket
[Diagram: Tylersburg-DP IOH with memory channels Ch 0–Ch 2 and the system memory map]
Non-NUMA (UMA)
Uniform Memory Access (UMA): addresses are interleaved across memory nodes by cache line. Accesses may or may not have to cross the QPI link.
[Diagram: Socket 0 and Socket 1 memory, DDR3 channels, Tylersburg-DP, system memory map]
Uniform Memory Access lacks tuning for optimal performance
NUMA
Non-Uniform Memory Access (NUMA): addresses are not interleaved across memory nodes by cache line. Each CPU has direct access to a contiguous block of memory.
[Diagram: Socket 0 and Socket 1 memory, DDR3 channels, Tylersburg-EP, system memory map]
Thread affinity benefits from memory attached locally
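The difference between the two layouts can be shown with a toy address-mapping model. Everything below is an illustrative assumption, not ESX or hardware behavior: a two-socket system with cache-line interleaving under UMA and a contiguous 4 GiB block per socket under NUMA:

```python
# Toy model: which socket's memory serves a physical address (assumed layout)
CACHE_LINE = 64
NODE_SIZE = 4 * 2**30  # hypothetical 4 GiB of memory per socket

def uma_socket(addr):
    # UMA: addresses interleaved across the two memory nodes by cache line
    return (addr // CACHE_LINE) % 2

def numa_socket(addr):
    # NUMA: each socket owns one contiguous block of the memory map
    return addr // NODE_SIZE

uma_socket(0), uma_socket(64)    # → (0, 1): adjacent lines alternate sockets
numa_socket(0), numa_socket(64)  # → (0, 0): both lines stay on socket 0
```

This is why thread affinity pays off under NUMA: keeping a thread on the socket that owns its data avoids the QPI hop.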
Memory
ESX Memory Sharing - The "Water Bed Effect"
ESX handles memory shares on an ESX host and across an ESX cluster with a result similar to a single water bed, or a room full of water beds, depending upon the action and the memory allocation type:
Initial ESX boot (i.e., "lying down on the water bed") – ESX sets a target working size for each virtual machine, based upon the memory allocations or shares, and uses ballooning to pare back the initial allocations until those targets are reached (if possible).
Steady State (i.e., "minor position changes") – The host gets into a steady state with small adjustments made to memory allocation targets. Memory "ripples" occur during steady state, with the amplitude dependent upon the workload characteristics and consumption by the virtual machines.
New Event (i.e., "second person on the bed") – The host receives additional workload via a newly started virtual machine, or VMotion moves a virtual machine to the host through a manual step, maintenance mode, or DRS. ESX pares back the target working size of that virtual machine while the other virtual machines lose CPU cycles that are directed to the new workload.
Large Event (i.e., "jumping across water beds") – The cluster has a major event that causes a substantial movement of workloads to or between multiple hosts. Each of the hosts has to reach a steady state, or DRS determines that the workload is not a current candidate for the existing host, moving it to another host that has reached a steady state with available capacity. Maintenance mode is another major event.
Memory
Memory – Key Question and Considerations
Is the memory allocation for each workload optimum to prevent swapping at the virtual machine level, yet low enough not to constrain other workloads or the ESX host?
HA/DRS/Maintenance Mode Regularity – How often do the workloads in the cluster get moved between hosts? Each movement causes an impact on the receiving (negative) and sending (positive) hosts, with maintenance mode causing a rolling wave of impact across the cluster, depending upon the timing.
Allocation Type – Each of the allocation types has its drawbacks, so tread carefully when choosing the allocation type. One size seldom is right for all needs.
Capacity/Swapping – The virtual machine's CPU can be constrained due to a lack of sufficient capacity at the ESX host level. Look for regular swapping at the ESX host level as an indicator of a memory capacity issue, but be sure to notice memory leaks that artificially force a memory shortage situation.
Memory
Idle State on Test Bed – Memory View V4
Memory
Memory View at Steady State of 3 Virtual Machines – Memory Shares V4
[Screenshot callouts: most memory is not reserved; one virtual machine just powered on; these VMs are at memory steady state; no VM swapping or swap targets]
Memory
Ballooning and Swapping in Progress – Memory View V4
[Screenshot callouts: possible states are high, soft, hard, and low; ballooning in effect; mild swapping in effect; different size targets due to different amounts of up time]
Memory
Memory Reservations – Effect on New Loads V4
What size virtual machine with reserved memory can be started?
[Screenshot callouts: 6 GB of "free" physical memory due to memory sharing over 20 minutes; 666 MB of unreserved memory; three VMs, each with 2 GB of reserved memory; a fourth virtual machine with >512 MB of reserved memory can't start; a fourth virtual machine with 512 MB of reserved memory starts]
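The arithmetic behind this screenshot can be sketched as a simple admission check: a VM's reservation plus its overhead must fit in the host's unreserved memory. The per-VM overhead value below is a hypothetical placeholder, since the slide does not state it:

```python
# Admission check implied by the slide: reservation + overhead must fit
unreserved_mb = 666   # unreserved host memory, from the screenshot
overhead_mb = 100     # per-VM memory overhead (hypothetical value)

def can_power_on(reservation_mb):
    # ESX refuses to power on a VM whose reservation cannot be guaranteed
    return reservation_mb + overhead_mb <= unreserved_mb

can_power_on(512)  # True: the 512 MB-reservation VM starts
can_power_on(600)  # False: a larger reservation is refused
```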
Memory
Memory Shares – Effect on New Loads V4
[Screenshot callouts: 5.9 GB of "free" physical memory; 6 GB of unreserved memory; three VMs, each with a 2 GB allocation; a fourth virtual machine with a 2 GB memory allocation starts successfully]
Memory
Virtual Machine with Memory Greater Than a Single NUMA Node V5
[Screenshot callouts: remote NUMA access; local NUMA access; % local NUMA access]
Memory
Wide-NUMA Support in V5 – 1 vCPU, 1 NUMA Node
Memory
Wide-NUMA Support in V5 – 8 vCPUs, 2 NUMA Nodes
Power Management
Power Management Screen V5
Power Management
Impact of Power States on CPU
Power Management
Power Management Impact on CPU V5
Storage Considerations
Storage – Key Question and Considerations
Is the bandwidth and configuration of the storage subsystem sufficient to meet the desired latency (a.k.a. response time) for the target workloads? If the latency target is not being met, then further analysis may be very time consuming.
Storage frame specifications refer to the aggregate bandwidth of the frame or its components, not the single-path capacity of those components.
Queuing – Queuing can happen at any point along the storage path, but it is not necessarily a bad thing if the latency meets requirements.
Storage Path Configuration and Capacity – It is critical to know the configuration of the storage path and the capacity of each component along that path. The number of active vmkernel commands must be less than or equal to the queue-depth maximum of any of the storage path components while processing the target storage workload.
Storage Considerations
Storage – Aggregate versus Single Paths
Storage frame specifications refer to the aggregate bandwidth of the frame or its components, not the single-path capacity of those components*:
DMX Message Bandwidth: 4-6.4 GB/s
DMX Data Bandwidth: 32-128 GB/s
Global Memory: 32-512 GB
Concurrent Memory Transfers: 16-32 (4 per Global Memory Director)
Performance measurement for storage is all about individual paths and the performance of the components contained in those paths.
(* Source – EMC Symmetrix DMX-4 Specification Sheet, c1166-dmx4-ss.pdf)
Storage Considerations
Storage – More Questions
Virtual Machines per LUN – The number of outstanding active vmkernel commands per virtual machine, times the number of virtual machines on a specific LUN, must be less than the queue depth of that adapter.
How fast can the individual disk drive process a request?
Based upon the block size and type of I/O (sequential read, sequential write, random read, random write), what type of configuration (RAID, number of physical spindles, cache) is required to match the I/O characteristics and workload demands for average and peak throughput?
Does the network storage (SAN frame) handle the I/O rate down each path and aggregated across the internal bus, frame adaptors, and front-end processors?
In order to answer these questions we need to better understand the underlying design, considerations, and basics of the storage subsystem.
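The first question above is simple multiplication. A minimal sketch, using the WQLEN default from later in this section and an assumed LUN queue depth:

```python
# VMs-per-LUN rule: per-VM outstanding commands * VM count <= LUN queue depth
outstanding_per_vm = 32  # active vmkernel commands per VM (the WQLEN default)
lun_queue_depth = 64     # assumed LUN queue depth for this example

def max_vms_per_lun(per_vm, depth):
    # Largest VM count whose combined outstanding commands fit in the queue
    return depth // per_vm

max_vms_per_lun(outstanding_per_vm, lun_queue_depth)  # → 2
```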
Storage Considerations
Back-end Storage Design Considerations
Capacity – What is the storage capacity needed for this workload/cluster?
Disk drive size (e.g., 144 GB, 300 GB)
Number of disk drives needed within a single logical unit (e.g., LUN)
IOPS Rate – How many I/Os per second are required, with the needed latency?
Number of physical spindles per LUN
Impact of sharing physical disk drives between LUNs
Configuration (e.g., cache) and speed of the disk drive
Availability – How many disk drives or storage components can fail at one time?
Type of RAID chosen, number of parity drives per grouping
Amount of redundancy built into the storage solution
Cost – Delivered cost per byte at the required speed and availability
Many options are available for each design consideration. Final decisions rest on the choice for each component. The cumulative amount of capacity, IOPS rate, and availability often dictate the overall solution.
Storage Considerations
Storage from the Ground Up – Basic Definitions: Mechanical Drives
Disk Latency – The average time it takes for the requested sector to rotate under the read/write head after a completed seek
5400 RPM (5.5 ms), 7200 RPM (4.2 ms), 10,000 RPM (3 ms), 15,000 RPM (2 ms)
Average disk latency = 1/2 * rotation time
Throughput (MB/sec) = (Outstanding I/Os / latency (msec)) * Block size (KB)
Seek Time – The time it takes for the read/write head to find the physical location of the requested data on the disk
Average seek time: 8-10 ms
Access Time – The total time it takes to locate the data on the drive(s). This includes seek time, latency, settle time, and command processing overhead time.
Host Transfer Rate – The speed at which the host can transfer data across the disk interface.
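The per-RPM latency numbers above follow directly from the definition: average rotational latency is half of one full rotation. A quick sketch:

```python
# Average rotational latency is half of one full rotation
def avg_latency_ms(rpm):
    # 60/rpm seconds per rotation, halved, converted to milliseconds
    return 60.0 / rpm / 2 * 1000

[round(avg_latency_ms(r), 1) for r in (5400, 7200, 10000, 15000)]
# → [5.6, 4.2, 3.0, 2.0]  (the slide rounds 5.56 ms to 5.5 ms)
```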
Storage Considerations
Network Storage Components That Can Affect Performance/Availability
Size and use of cache (i.e., % dedicated to reads versus writes)
Number of independent internal data paths and buses
Number of front-end interfaces and processors
Types of interfaces supported (e.g., Fibre Channel and iSCSI)
Number and type of physical disk drives available
MetaLUN Expansion
MetaLUNs allow for the aggregation of LUNs
The system typically re-stripes data when a MetaLUN is changed
Some performance degradation occurs during re-striping
Storage Virtualization
Aggregation of storage arrays behind a presented mount point/LUN
Movements between disk drives and tiers are controlled by storage management
Changes of physical drives and configuration may be transient and severe
Case Studies - Storage
Test Bed Idle State – Device Adapter View V4
[Screenshot callouts: average device latency per command; storage adapter maximum queue length; world maximum queue length; LUN maximum queue length]
Case Studies - Storage
Moderate Load on Two Virtual Machines V4
[Screenshot callouts: acceptable latency from the disk subsystem; commands are queued, BUT....]
Case Studies - Storage
Heavier Load on Two Virtual Machines
[Screenshot callouts: virtual machine latency is consistently above 20 ms, so performance could start to be an issue; commands are queued and are exceeding maximum queue lengths, BUT....]
Case Studies - Storage
Heavy Load on Four Virtual Machines
[Screenshot callouts: virtual machine latency is consistently above 60 ms for some VMs, so performance will be an issue; commands are queued and are exceeding maximum queue lengths, AND....]
Case Studies - Storage
Artificial Constraints on Storage
[Screenshot callouts: good throughput with low device latency; then a problem with the disk subsystem – bad throughput, and device latency is high because the cache was disabled]
Understanding the Disk Counters and Latencies
Understanding Disk I/O Queuing
Storage Considerations
SAN Storage Infrastructure – Areas to Watch/Consider
[Diagram: host HBAs through FC switch/director (ISL) to the SAN frame – areas to watch include HBA speed, fiber bandwidth, FA CPU speed, disk response, RAID configuration, block size, number of spindles in the LUN/array, disk speeds, storage adapter queue length, world queue length, LUN queue length, and cache size/type]
Storage Considerations
Storage Queuing – The Key Throttle Points
[Diagram: VMs 1-4 on the ESX host, each with a World Queue Length (WQLEN), feeding through HBAs (Execution Throttle) into the Storage Area Network, with a LUN Queue Length (LQLEN) at the LUN]
Storage Considerations
Storage I/O – The Key Throttle Point Definitions
Storage Adapter Queue Length (AQLEN) – The number of outstanding vmkernel active commands that the adapter is configured to support. This is not settable; it is a parameter passed from the adapter to the kernel.
LUN Queue Length (LQLEN) – The maximum number of permitted outstanding vmkernel active commands to a LUN. (This would be the HBA queue-depth setting for an HBA.) This is set in the storage adapter configuration via the command line.
World Queue Length (WQLEN) – VMware recommends not to change this! The maximum number of permitted outstanding vmkernel active requests to a LUN from any single virtual machine (min: 1, max: 256, default: 32). Configuration -> Advanced Settings -> Disk -> Disk.SchedNumReqOutstanding
Execution Throttle (this is not a displayed counter) – The maximum number of permitted outstanding vmkernel active commands that can be executed on any one HBA port (min: 1, max: 256, default: ~16, depending on vendor). This is set in the HBA driver configuration.
Storage Considerations
Queue Length Rules of Thumb
For a lightly-loaded system, average queue length should be less than 1 per spindle, with occasional spikes up to 10. If the workload is write-heavy, the average queue length above a mirrored controller should be less than 0.6 per spindle, and less than 0.3 per spindle above a RAID-5 controller.
For a heavily-loaded system that isn't saturated, average queue length should be less than 2.5 per spindle, with infrequent spikes up to 20. If the workload is write-heavy, the average queue length above a mirrored controller should be less than 1.5 per spindle, and less than 1 above a RAID-5 controller.
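These rules of thumb collapse naturally into a single threshold check. A sketch, with the per-spindle limits taken from the slide and the function shape my own:

```python
# Queue-length rules of thumb as a single check (per-spindle limits from slide)
def queue_length_ok(avg_queue, spindles, heavy_load=False,
                    write_heavy=False, raid5=False):
    if write_heavy:
        # Mirrored: 0.6 (light) / 1.5 (heavy); RAID-5: 0.3 (light) / 1.0 (heavy)
        limit = (1.0 if raid5 else 1.5) if heavy_load else (0.3 if raid5 else 0.6)
    else:
        limit = 2.5 if heavy_load else 1.0
    return avg_queue <= limit * spindles

queue_length_ok(8, 10)                                # True: 8 <= 1.0 * 10
queue_length_ok(8, 10, write_heavy=True, raid5=True)  # False: 8 > 0.3 * 10
```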
Closing Thoughts
Know the key counters to look at for each type of resource
Be careful about what type of resource allocation technique you use for CPU and RAM. One size may NOT fit all.
Consider the impact of events such as maintenance on the performance of a cluster
Set up a simple test bed where you can create simple loads to become familiar with the various performance counters and tools
Compare your test-bed analysis and performance counters with the development and production clusters
Know your storage subsystem components and configuration, due to the large impact they can have on overall performance
Take the time to learn how the various components of the virtual infrastructure work together
John Paul – [email protected]