vSphere Performance Pitfalls
TRANSCRIPT
8/12/2019

Understanding Major Performance Pitfalls in vSphere
Vernak.com
7/29/2013
Technical White Paper by Dan McGee
This document is designed to help IT Professionals avoid
painful design mistakes or oversights in a vSphere
environment.
Table of Contents
1. Introduction
2. CPU
   2.1 Understanding CPU % Ready and the CPU Scheduler
   2.2 Processor Caching
   2.3 Failure to Complete P2V Migrations
3. Memory
   3.1 Overcommitted Memory
   3.2 Memory Limits on Gold Images
   3.3 Failure to Maintain Like VMs
4. Network
   4.1 Understanding Default NIC Configuration
   4.2 STP, STA, BPDU Guard & BPDU Filter
   4.3 Broadcasts, Flat VLANs and Interrupts
5. Storage
   5.1 Dangers of Thin Provisioning
   5.2 Array Misconfiguration
   5.3 Alignment Issues
6. Further Reading
1. Introduction
A colleague of mine recently asked me, "What are the things that just grind our virtualized servers to a halt? What are the big performance nightmares in vSphere?" They did not just want to know some troubleshooting methodology and how it changes going from physical to virtual servers. They wanted to know the juicy stuff. They wanted the dark, disturbing stories from organizations that have had it all go to hell in a handbasket: the types of companies that have suffered performance issues they can't explain, lower ROI, resume-generating events, and parted with enormous piles of cash on hardware that wasn't appropriate for them. This white paper is my answer to the heart of the inquiry, and it is my hope that these examples provide a basis for better decision making in a vSphere environment.
The examples to follow are by no means mutually exclusive, nor are they collectively exhaustive. Also, I will not be discussing certain features such as vMotion, HA, and DRS in any significant detail. This is not to discredit their incredible value to a healthy vSphere environment by any means. For more information on these and other technologies, please reference the Further Reading section at the end of this document.
Virtual environments are inherently dynamic and utilization levels can change drastically
and quickly. Virtual machines can be created or cloned in minutes by administrator actions or
sometimes by automation tools. HA restarts or DRS events can also add to the less predictable
virtualized world. Unlike physical environments, the underlying resource categories are shared
and this requires IT professionals to learn and adapt to new rules and guidelines in the data
center. The four primary resource categories are CPU, Memory, Network (I/O), and Storage
(Capacity & I/O). The following sections will briefly explain how they are shared and
demonstrate soul-crushing scenarios in each category and how they could be resolved or
avoided.
2. CPU
2.1 Understanding CPU % Ready and the CPU Scheduler
In Larry Louks' Critical VMware Mistakes You Should Avoid, the following foundational rules for server processor management are well laid out:
1. The ESXi host schedules VMs onto and off of processors as needed.
2. Whenever a VM is scheduled to a processor, all of its cores must be available for the VM to be scheduled, or the VM cannot be scheduled at all.
3. If a VM cannot be scheduled to a processor when it needs access to one, VM performance can suffer tremendously.
4. When VMs are ready for a processor but unable to be scheduled, this creates what VMware calls CPU % Ready values.
5. CPU % Ready manifests as a utilization issue but is actually a scheduling issue.
Due to the rules mentioned above, mixing VMs with greatly varied vCPU counts on the same ESXi host can create major scheduling problems. This issue is amplified if the host has a low density of CPU cores. Whenever possible, reduce VMs to a single vCPU. Often this is not recommended or possible, but never do the opposite: arbitrarily throwing vCPUs at a VM will not actually make it perform faster.
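To make CPU % Ready concrete, here is a small sketch (in Python, purely illustrative) of how the "CPU Ready" summation counter that vCenter reports in milliseconds can be converted into the %RDY figure esxtop shows. The 20-second realtime sampling interval and the ~5% per-vCPU warning threshold are common rules of thumb, not hard limits:

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0,
                      num_vcpus: int = 1) -> float:
    """Convert a vCenter 'CPU Ready' summation value (milliseconds of ready
    time accumulated over one sampling interval, across all vCPUs) into a
    per-vCPU percentage comparable to esxtop's %RDY."""
    return ready_ms / (interval_s * 1000.0) / num_vcpus * 100.0

# A 4-vCPU VM reporting 4000 ms of ready time in a 20 s realtime sample
# works out to 5.0% per vCPU -- right at the common warning threshold.
per_vcpu_ready = cpu_ready_percent(4000, 20.0, 4)
```

This also illustrates why wide VMs hurt: the more vCPUs a VM has, the harder it is for the scheduler to find free cores, and the more ready time accumulates.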
2.2 Processor Caching
Processor caches contain information that allows the guest operating system and applications to perform better. To this end, vSphere attempts to schedule a VM on the same core over and over again. If the virtual machine is moved to a new core on a new socket where the cache isn't shared (which it usually isn't between sockets), the cache for the new core must be loaded over time with the information formerly cached on the previous core. The reason I discuss processor caching is that VMware administrators need to understand that there is a temporary switching cost associated with things such as a hot migration with vMotion.
2.3 Failure to Complete P2V Migrations
Physical-to-virtual migrations involve imaging an entire physical server, including OS, applications, and data, into a virtual disk. The often neglected second phase of a proper migration is stripping the new virtual machine of physical device drivers and related hardware-vendor-specific tools. Plain and simple, physical drivers need to be replaced with virtual ones (VMware Tools). Here is a list of steps for a proper P2V migration:
1. Plan capacity
2. Pre-migration cleanup
3. Full shutdown migration of critical apps and databases, ensuring stable data
4. Exclude unneeded partitions
5. Split array partitions into separate virtual disks
6. Post-migration cleanup
7. Adjust VM and guest OS to a single vCPU where possible
8. Remove unnecessary applications
9. Review virtual hardware assignments
3. Memory
3.1 Overcommitted Memory
The total physical RAM in an ESXi host is available to VMs, aside from a small amount reserved for the VMkernel. If I have an ESXi host with 64GB of physical RAM, I could build VMs on it and assign more than 64GB total to those VMs. When this is done, I am overcommitting memory. By default, a virtual machine is not handed all of its assigned RAM as physical memory at power-on; physical pages are allocated as the guest actually touches them, unless a reservation is set for the VM.
This memory overcommitment does not cause major issues if the VMs don't actually need all of the RAM assigned to them. However, if the ESXi host is overcommitted and VMs actually need RAM in excess of the RAM available, there are a few ESXi mechanisms that attempt to keep it all up and running. These mechanisms include the balloon driver and the VMkernel swap file.
The memory control driver (also known as the balloon driver) is installed as part of the VMware Tools package that should be installed on all of the VMs in the environment. The purpose of the balloon driver is to force VMs to give up some of their RAM and begin using their swap files. In other words, if there is more demand for RAM than the ESXi host can provide, the balloon driver inflates and VMs are forced to move some contents of their own RAM into the guest OS swap file within their own virtual disks. As is the case with a physical server, this causes performance degradation, since the VM must read parts of its page file back into memory for processing.
The VMkernel swap file is created when a VM is first powered on, in the VMFS volume that contains the VM. This file is equal in size to the amount of RAM assigned to the virtual machine. The file is only used in emergency situations and otherwise sits idle in a healthy vSphere environment. If there is a crisis with memory overcommitment, the ESXi host will first use the balloon driver to reclaim RAM. If the crisis continues, the ESXi host will grab parts of VM memory and move them out to VMkernel swap. In this scenario, VMs could be swapping both with a guest page file and again with a VMkernel swap file. This takes awful performance and drags it right down to apocalyptic.
Just because you can overcommit memory does not mean that you should. ESXi is very
resilient in keeping virtual machines up and running even in these scenarios but at an obvious cost of
performance.
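A quick way to reason about this risk is to compute the overcommit ratio for a host. The sketch below is illustrative only; the 2GB VMkernel overhead figure is an assumption for the example, not a VMware-published constant (actual overhead varies by host configuration):

```python
def memory_overcommit_ratio(host_ram_gb: float,
                            vm_ram_assignments_gb: list,
                            vmkernel_overhead_gb: float = 2.0) -> float:
    """Ratio of total RAM assigned to VMs versus RAM usable for VMs.
    A value above 1.0 means the host is overcommitted and is relying on
    ballooning / VMkernel swap if the VMs all demand their assignments."""
    usable = host_ram_gb - vmkernel_overhead_gb
    assigned = sum(vm_ram_assignments_gb)
    return assigned / usable

# Five 16GB VMs on a 64GB host: 80GB promised against ~62GB usable.
ratio = memory_overcommit_ratio(64, [16, 16, 16, 16, 16])
```

Anything meaningfully above 1.0 deserves scrutiny: it is only safe to the extent that the VMs' real working sets stay below the physical RAM available.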
3.2 Memory Limits on Gold Images
The base images that we use as templates for our various guest operating systems must be configured properly, lest they yield cloned virtual machines that share their flaws. A relatively common mistake found in a lot of vSphere implementations is a master image with memory limits set. Because the limit always wins, a VM assigned 8GB of RAM will not use 8GB if a limit was accidentally set at 2GB. The VM will only ever be allocated up to 2GB, but the VM guest thinks it has the full 8GB.
If the VM actually needs the RAM you have assigned to it beyond the limit, severe performance issues
occur. Multiply this by the number of times a misconfigured template has been cloned in the
environment, and you can see how the nightmare scenario might play out.
3.3 Failure to Maintain Like VMs
ESXi is able to scan the RAM pages of its virtual machines and identify duplicate pages. After this detection, the ESXi host will use a single RAM location and point multiple virtual machines to that single page of shared RAM. The shared page location is flagged read-only to prevent corruption. The guest operating systems are unaware of this underlying sharing; they simply have what they need available in RAM. When we scale this concept up to, let's say, one thousand Windows 2008 R2 servers, you can imagine how much RAM we could save by having OS builds that are similarly patched and maintained. This is not to say that you should only run one type of guest operating system in your environment, because that simply isn't practical in many, if not most, environments. However, if attention is paid to proper operating system patching, application patching, and reducing disparate VMs and sprawl, it is possible to see high percentages of shared RAM on your ESXi hosts.
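The deduplication idea can be sketched in a few lines. This is a toy model of transparent page sharing, not ESXi's implementation (which hashes 4KB pages and then bit-compares candidates before sharing them):

```python
from collections import Counter

def shared_page_savings(page_hashes: list) -> int:
    """Given a hash per 4KB memory page across all VMs on a host, estimate
    how many pages could be collapsed into single read-only shared copies:
    every duplicate beyond the first copy of a page is a page saved."""
    counts = Counter(page_hashes)
    total_pages = sum(counts.values())
    unique_pages = len(counts)
    return total_pages - unique_pages

# Three identical OS pages and one unique page: two pages reclaimed.
saved = shared_page_savings(["os_pg_1", "os_pg_1", "os_pg_1", "app_pg_9"])
```

The more alike the VMs are (same OS build, same patch level), the more hashes collide and the bigger the savings, which is exactly why maintaining like VMs pays off.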
4. Network
4.1 Understanding Default NIC Configuration
The virtual machines hosted on ESXi are commonly assigned virtual network interface cards.
These virtual NICs connect to virtual switches which provide a path from the virtual NICs to the physical
adaptor(s). The physical NICs on the ESXi hosts do not have IP addresses themselves, but rather act as a
passthrough for the virtual machine traffic to the physical network. Proportional shares are utilized in
order to prevent virtual machines from monopolizing a network path. Shares are enforced only in
situations where there is contention for network resources. Otherwise, virtual machines are simply
given what they need in terms of network resources.
When two or more physical NICs are assigned to a virtual switch, the default mechanism used by the ESXi host pins each virtual switch port to one uplink, distributing ports across the NICs in a round-robin fashion. This default methodology works fine for many environments but can leave individual network cards carrying uneven loads. In contrast, the virtual switch can be set up to perform link aggregation (IP-hash load balancing, with a matching configuration on the physical switch) instead. In this configuration, the VMs gain the combined throughput of the NICs while maintaining failover ability.
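As a rough mental model of the two policies (a simplification, not ESXi's actual algorithm; current vSphere documents the default as "route based on originating virtual port ID" and the aggregation mode as "route based on IP hash"):

```python
import ipaddress

def uplink_by_port_id(port_id: int, num_uplinks: int) -> int:
    # Default-style policy: each virtual switch port is pinned to a single
    # uplink, effectively port number modulo the active uplink count.
    # One busy VM stays on one NIC, so per-NIC load can be uneven.
    return port_id % num_uplinks

def uplink_by_ip_hash(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
    # IP-hash-style policy: the uplink is chosen per source/destination
    # pair, so a single VM talking to many peers spreads across NICs.
    h = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return h % num_uplinks

chosen = uplink_by_ip_hash("10.0.0.1", "10.0.0.2", 2)
```

The key behavioral difference the model captures: port-ID pinning balances VMs, while IP hash balances conversations.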
4.2 STP, STA, BPDU Guard & BPDU Filter
Spanning Tree Protocol (STP) enabled switches in a redundant Local Area Network (LAN) need to exchange information with each other for STP to work properly. Bridge Protocol Data Units (BPDUs) are messages exchanged between switches in a redundant LAN. The BPDU frames contain information such as switch ID, originating port, MAC address, switch port cost, switch port priority, etc. The BPDUs are sent out as multicast messages, and when they are received, the switch uses the Spanning Tree Algorithm (STA) to detect Layer 2 switching loops in the network and determine which redundant ports need to be shut down. BPDUs and STA are designed to avoid Layer 2 switching loops as well as broadcast storms.
As VMware so eloquently puts it in KB 2047822, "The STP process of identifying the root bridge and finding if the switch ports are in a forwarding or blocking state takes somewhere around 30 to 50 seconds. During that time no data can be passed from those switch ports." If a server connected to the port cannot communicate for that long, the applications running on it will time out. To avoid this timeout issue on the servers, the best practice is to enable the Port Fast configuration on the switch ports where the servers' NICs are connected. The Port Fast configuration puts the physical switch port immediately into the STP forwarding state.
So now that we have the important STP best practice out of the way, let's discuss how Standard and Distributed vSwitches use STP and BPDUs: they don't! vSwitches do not support STP, nor do they exchange BPDU frames. By default, they just forward them and think nothing of it. That said, an STP boundary is often created by using BPDU Guard on the physical switch ports. In this BPDU Guard
example, any BPDU frames received on the physical switch port cause the port to become blocked. Since the vSwitch will simply forward BPDU frames through to the physical switch ports, this can cause an uplink traffic path failure if a compromised virtual machine generates BPDU frames. Then the vSphere host moves the traffic to another uplink, which then disables another switch port. This process chains until the entire cluster is compromised by the denial-of-service attack. This attack is especially cruel and damaging in a multi-tenant environment, where a nasty neighbor can attack the neighborhood.
From KB 2047822: "To prevent such Denial of Service attack scenarios, the BPDU filter feature is supported as part of the vSphere 5.1 release. After configuring this feature at the ESXi host level, BPDU frames from any virtual machine will be dropped by the vSwitch. This feature is available on both Standard and Distributed vSwitches." I believe it is critically important for VMware administrators and their networking colleagues to understand how to prevent this cluster-wide catastrophe. As an aside, there are those in the VMware community who believe that VMware should implement a true BPDU Guard on the vSwitches and not just a BPDU Filter, but that is a discussion that will not be picked up here. Please refer to the Further Reading section of this paper for appropriate links to related materials.
4.3 Broadcasts, Flat VLANs and Interrupts
Broadcasts tend to be very small packets and very bursty in nature. Although broadcast packets
are sent to every device on the same VLAN as the sending device, we tend to not worry too much
about broadcasts anymore because available bandwidth is usually so high that the impact of small
broadcast packets is usually negligible. Also, contemporary switches have many features to limit or
eliminate networking loops and broadcast storms such as STP.
However, there are two major reasons why we still have to care at some level: bad (read: flat) design, and interrupts. If your organization has one flat VLAN with all of your management traffic, network storage, and virtual machine traffic playing together, then they are not going to play nicely. When you do this (don't be this guy/girl), you create a scenario where broadcasts of everything affect everything else. Do not cut corners in your network design by using VLANs that far exceed industry-recommended sizes. If you do, you will suffer the wrath of broadcasts and their best friend: interrupts.
In the years before Plug-and-Play, devices had interrupts set manually by the use of jumpers or DIP switches. Plug-and-Play effectively makes the jumper/DIP switch decisions for us now. However, devices like NICs still need their interrupts, and every time a packet hits a network card in an Intel-architecture device, it triggers an interrupt. Devices use interrupt requests (IRQs) to get an operating system or application to stop and make sure the communication with them is correct and complete before resuming. So when we talk about interrupts, we are talking about literally stopping the processor. Usually, these interrupt stoppages are brief and infrequent enough to not be noticeable or well understood by many administrators. Sometimes, though, organizations for whatever reasons (or lack thereof) implement flat VLANs where an enormous number of broadcast packets are hitting Intel-architecture devices such as ESXi hosts. This then causes incredible slowdowns due to production workload processing getting interrupted constantly. Proper network segmentation from the start is the
most effective way to mitigate this risk. Remember, interrupts are like bee stings. One or two on
occasion is tolerable but you do not want to ever suffer the swarm.
5. Storage
5.1 Dangers of Thin Provisioning
Thin provisioned virtual disks are disks for which the assigned amount of disk space is not
allocated at creation. Only enough disk space is allocated to accommodate existing data with a small
buffer space added. The VMDK file for the VM will then grow incrementally as more data is saved.
The guest operating system sees the full amount that has been assigned to the virtual disk but the
underlying storage is conserved until needed. At first glance, this looks great to new VMware admins.
Reducing wasted disk space can be quite tempting but it comes with a price.
First and foremost, since the virtual disk is dynamically growing as needed in small increments, there is a performance hit as the LUN is actually assigning the storage that the guest OS already thinks it has. While we're on the subject of the guest OS, that brings me to my other point: you are lying to it. As stated before, the guest operating system sees the full provisioned amount, so you are effectively lying to the OS and expecting everything to work out fine. If you want to see a train wreck up close and personal, fill a LUN unexpectedly when there are active, expanding VMDK files. Hint: it is going to come to a screeching halt.
The error checking mechanisms in the guest operating system cannot detect that the VMFS
volume is full or almost full because the administrator wrote the server a bad check. The guest believes
there is enough room left to write but there is no space left physically to write to. This nightmare
scenario can definitely lead to data corruption as VMs crash with partially saved data. In a production
environment, it is highly recommended to keep thin provisioning to an absolute minimum or avoid it
completely. It is suitable for a test lab or for VMware Workstation, but do not rely on it for a production
environment.
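The over-promise is easy to quantify. This sketch (illustrative, with made-up numbers) compares what thin disks have physically consumed against what their guests have been promised, which is the gap that bites you when the datastore fills:

```python
def thin_provisioning_report(datastore_gb: float, disks: list) -> tuple:
    """disks: list of (provisioned_gb, used_gb) pairs for thin VMDKs.
    Returns (physically_used_gb, promised_gb, oversubscribed) where
    oversubscribed is True when the guests collectively believe they
    have more space than the datastore can ever physically deliver."""
    used = sum(u for _, u in disks)
    promised = sum(p for p, _ in disks)
    return used, promised, promised > datastore_gb

# Two 80GB thin disks on a 100GB datastore: only 50GB consumed so far,
# but 160GB promised -- the guests can crash the datastore at any time.
used, promised, oversub = thin_provisioning_report(100, [(80, 20), (80, 30)])
```

Monitoring that third value (and alerting well before `used` approaches the datastore size) is the minimum discipline required if thin provisioning is used at all.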
5.2 Array Misconfiguration
Virtualized environments take advantage of shared storage types such as NAS, iSCSI, and Fibre Channel, ranking roughly from low to high disk I/O. While Fibre Channel storage allows administrators to decide on array configurations, many NAS and iSCSI options do not. This complicates storage purchase decisions for an organization, and several do not successfully avoid the following scenario. In a Fibre Channel setup, the administrator can decide which disks to carve up into multiple arrays with different RAID configurations.
The impact of VM storage I/O is then isolated to the arrays upon which the VM is installed. The ability to break the storage down into separate arrays and RAID groups helps prevent a storage I/O bottleneck. However, in the NAS and iSCSI example with a single array, every disk affects every other disk in the storage device. If an iSCSI device, for instance, supports only a single array that the admin formats in RAID 5, any write to any disk is going to be striped across every single disk. The lesson to be learned here when purchasing storage is that it is necessary to properly understand the I/OPS and configuration capabilities of the device, not just the capacity and expandability. When organizations
do not heed this warning, they hit their I/OPS limits long before the storage fills, and they potentially spend heaps of money on temporary storage while reconfiguring the poorly provisioned array.
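The sizing arithmetic behind that warning is the classic RAID write-penalty formula. This sketch uses the commonly cited penalty factors (RAID 5 turns one frontend write into four backend operations, RAID 6 into six, and so on); real arrays with write caches and wide stripes will deviate from these idealized numbers:

```python
# Backend operations generated per frontend write, by RAID level.
RAID_WRITE_PENALTY = {0: 1, 1: 2, 5: 4, 6: 6, 10: 2}

def effective_frontend_iops(raw_backend_iops: float, read_fraction: float,
                            raid_level: int) -> float:
    """Frontend I/OPS a RAID group can deliver: reads cost one backend op,
    writes cost `penalty` backend ops, so a write-heavy workload consumes
    the array's raw I/OPS budget several times faster."""
    penalty = RAID_WRITE_PENALTY[raid_level]
    return raw_backend_iops / (read_fraction + (1 - read_fraction) * penalty)

# Eight 150-I/OPS disks (1200 raw) in RAID 5 with a 50/50 read/write mix
# deliver only 1200 / (0.5 + 0.5 * 4) = 480 frontend I/OPS.
usable = effective_frontend_iops(1200, 0.5, 5)
```

This is exactly how an array that looks generous on capacity hits its I/OPS ceiling "long before the storage fills."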
5.3 Alignment Issues
If we think of storage in a virtual environment as three layers, we have the guest operating system partition (such as NTFS), VMFS/VMDK, and the physical disk. Misalignment can occur between the virtual disk blocks (.vmdk) and the underlying physical disk blocks. It is also possible to find misalignment between the guest operating system partition and the underlying virtual disk blocks. Misalignment in the environment causes unnecessary I/OPS to be used to compensate for the misalignment. Performance degradation ranges from moderate to severe in these instances, depending on the type and scope of the alignment problem. Most newer guest operating systems install with the guest partition properly aligned, but it is still worth checking periodically within the environment. Several organizations, such as Dell, provide free or paid tools to scan vSphere and report this type of storage issue.
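The check itself is simple modular arithmetic. This sketch tests whether a partition's starting byte offset lands on a block boundary; the 63-sector legacy offset used by older Windows installers is the classic misaligned case, while modern installers start partitions at 1 MiB, which aligns with every common block size:

```python
def is_aligned(partition_offset_bytes: int, block_size_bytes: int = 4096) -> bool:
    """A partition is aligned when its starting offset is an exact multiple
    of the underlying block size; otherwise every guest block straddles two
    underlying blocks and each I/O can cost double."""
    return partition_offset_bytes % block_size_bytes == 0

legacy_offset = 63 * 512      # 32256 bytes: old Windows default, misaligned
modern_offset = 1024 * 1024   # 1 MiB: aligned for 4K and larger block sizes
```

A misaligned guest partition is why an otherwise healthy VM can burn noticeably more backend I/OPS than its workload should require.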
6. Further Reading
http://blog.ipspace.net/2012/09/dear-vmware-bpdu-filter-bpdu-guard.html
http://blog.ipspace.net/2011/11/virtual-switches-need-bpdu-guard.html
http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Network-Technical-Whitepaper.pdf
http://rickardnobel.se/esxi-5-1-bdpu-guard/
http://pubs.vmware.com/vsphere-51/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-51-resource-management-guide.pdf