USING VMWARE vSTORAGE APIs FOR ARRAY INTEGRATION WITH EMC SYMMETRIX VMAX
Increasing operational efficiency with VMware and EMC Symmetrix VMAX

White Paper

Abstract
This white paper discusses VMware's new vStorage APIs for Array Integration (VAAI), in conjunction with EMC Symmetrix VMAX using Enginuity 5875, and how this tight integration can significantly reduce the time to perform various virtual machine operations.

June 2011


Copyright 2011 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided "as is." EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

VMware, ESX, ESXi, vMotion, and vSphere are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number h8115.1


Table of Contents

Figures
Executive summary
    Audience
Symmetrix VMAX using Enginuity 5875
    FAST VP
    Virtual LUN VP Mobility
    Virtual Provisioning enhancements
    Concurrent SRDF/A
    Thick-to-thin migration with SRDF
    SRDF general enhancements
    Federated Live Migration
    Duplicate TimeFinder/Snaps
    10 Gb/s support
    Symmetrix Data at Rest Encryption
VMware vSphere 4.1
    Storage enhancements
    Network enhancements
    Availability enhancements
    Management enhancements
    Platform enhancements
VAAI primitives
    Hardware-accelerated Full Copy
    Hardware-accelerated Block Zero
    Hardware-assisted locking
    Viewing VAAI support
    Enabling the vStorage APIs for Array Integration
Use cases
    Hardware-accelerated Full Copy
        Use case configuration
        Use Case 1: Deploying a virtual machine from a template
        Use Case 2: Cloning hot and cold virtual machines
        Use Case 3: Creating simultaneous multiple clones
        Use Case 4: Storage vMotion
        Server resources impact to CPU and memory
        Caveats for using hardware-accelerated Full Copy
    Hardware-accelerated Block Zero
        Use Case 1: Deploying fully allocated virtual machines
        Use Case 2: Benefits of Block Zero when using zeroedthick virtual disks
        Server resources impact to CPU and memory
Conclusion
References


Figures

Figure 1. VMware vSphere 4.1 architecture
Figure 2. Hardware-accelerated Full Copy
Figure 3. Hardware-accelerated Block Zero
Figure 4. Traditional VMFS locking before vSphere 4.1 and hardware-assisted locking
Figure 5. VMFS locking with vSphere 4.1 and hardware-assisted locking
Figure 6. Viewing Hardware Acceleration in the vSphere Client
Figure 7. EMC Virtual Storage Integrator
Figure 8. Enabling hardware-accelerated Full Copy or Block Zero on ESX 4.1
Figure 9. Enabling hardware-assisted locking on ESX 4.1
Figure 10. Deploying virtual machines from the template
Figure 11. Performance of cold clones using Full Copy
Figure 12. Performance of hot clones using Full Copy
Figure 13. Cloning running VMs under load using Full Copy
Figure 14. Cloning multiple virtual machines simultaneously with Full Copy
Figure 15. Storage vMotion using Full Copy
Figure 16. Fully allocated virtual disk deployment times using hardware-accelerated Block Zero
Figure 17. Throughput difference between deploying an eagerzeroedthick virtual disk with Block Zero off and on
Figure 18. Throughput increase when writing data to new regions in a virtual disk with Block Zero disabled and enabled


Executive summary

With the introduction of the Symmetrix VMAX Enginuity 5875 code, EMC offers three new storage integrations with VMware vSphere 4.1, enabling customers to dramatically improve the efficiency of their VMware environments. VMware vSphere 4.1 includes three new VMware vStorage APIs for Array Integration (VAAI) that allow specific storage operations to be offloaded to the EMC Symmetrix VMAX, increasing both overall system performance and efficiency. Symmetrix VMAX with 5875 Enginuity supports all three of VMware's new vStorage APIs, which are:

Full Copy. This feature delivers hardware-accelerated copying of data by performing all duplication and migration operations on the array. Customers can achieve considerably faster data movement via VMware Storage vMotion, virtual machine creation and deployment from templates, and virtual machine cloning.

Block Zero. This feature delivers hardware-accelerated zero initialization, greatly reducing common input/output tasks such as creating new virtual machines. This feature is especially beneficial when creating fault-tolerant (FT)-enabled virtual machines or when performing routine application-level Block Zeroing.

Hardware-assisted locking. This feature delivers improved locking controls on Virtual Machine File System (VMFS), subsequently allowing far more virtual machines per datastore and shortened simultaneous block virtual machine boot times. This improves performance of common tasks such as virtual machine migration, powering many virtual machines on or off, and creating a virtual machine from a template. This feature will be discussed only briefly in this paper but is fully supported by Symmetrix VMAX running 5875 code. A separate paper specifically devoted to this topic and the associated use cases will be released in the future.

To enable the VAAI functionality, Enginuity release 5875 is required (VMAX only). Enginuity 5875 is the latest intelligent, multitasking, preemptive storage operating environment released for the Symmetrix VMAX. As with previous Enginuity versions, this release is devoted to storage operations and optimized for the service levels required in high-end environments. As expected, this Enginuity version on Symmetrix VMAX further advances the ability of EMC self-optimizing intelligence to deliver performance, array-tiering, availability, and data integrity that now define advanced storage functionality. A prerequisite for complex, demanding, risk-intolerant IT infrastructures, Enginuity, coupled with Symmetrix VMAX, is the essential foundation technology for delivering cost-effective high-end storage services.

This paper will discuss the operation of and best use cases for utilizing VAAI features with the Symmetrix VMAX. In doing so, it will demonstrate how to maximize efficiency and increase the effectiveness of a VMware vSphere 4.1 environment deployed on an EMC Symmetrix VMAX.


Audience

This technical white paper is intended for VMware administrators and storage administrators responsible for deploying VMware vSphere 4.1, and the included VAAI features, on Symmetrix VMAX using Enginuity 5875 and later.

Symmetrix VMAX using Enginuity 5875

Enginuity 5875 carries the extended and systematic feature development forward from previous Symmetrix generations. This means all of the reliability, availability, and serviceability features, all of the interoperability and host operating systems coverage, and all of the application software capabilities developed by EMC and its partners continue to perform productively and seamlessly even as underlying technology is refreshed.

The following section describes the major feature enhancements that are made available with the Enginuity 5875 operating environment on Symmetrix VMAX storage arrays.

FAST VP

Fully Automated Storage Tiering with Virtual Pools (FAST VP) maximizes the benefits of in-the-box tiered storage by optimizing cost versus performance requirements, placing the right thin data extents, on the right tier, at the right time. The FAST VP system allows a storage administrator to decide how much SATA/Fibre Channel/Flash capacity is given to a particular application; the busiest thin data extents are then automatically placed on the desired performance tier and the least busy thin data extents on a capacity tier. In order to provide further controls and modifications, an administrator's input criteria are assembled into FAST policies. The FAST VP system uses policy information to perform extent data movement operations within two or three disk tiers in the VMAX array. Because the unit of analysis and movement is measured in thin extents, this sub-LUN optimization is extremely powerful and efficient. FAST VP is an evolution of the existing FAST and EMC Optimizer technology.

Virtual LUN VP Mobility

Virtual LUN technology offers manual control of data mobility between storage tiers within a VMAX array. Virtual LUN data movement can nondisruptively change the drive type (capacity, rotational speed) and the protection method (RAID scheme) of Symmetrix logical volumes. Enginuity 5875 brings a further enhancement to this feature by allowing thin device movement from one thin pool to another thin pool.

Virtual Provisioning enhancements

Symmetrix Virtual Provisioning continues evolving toward the goal of providing the same feature set for thin devices as traditional Symmetrix devices, and extending it further. The Enginuity 5875 Virtual Provisioning enhancements are:


• The ability to rename thin pools.
• Virtual Provisioning rebalancing was initially released with a static pool variance measure of 10 percent. This variation value is now user-definable from 1 percent to 50 percent, with a default value of 1 percent.
• The maximum number of concurrent devices participating in the rebalance can now be set from two devices to the entire pool.
• The T10 SBC-3 committee has finalized standards for two new SCSI commands for thin devices. The UNMAP command advises a target device that a range of blocks is no longer needed. If the range covers a full Virtual Provisioning extent, Enginuity 5875 returns that extent to the pool. If the UNMAP command range covers only some tracks in an extent, those tracks are marked Never Written by Host (NWBH). The extent is not returned to the pool, but those tracks do not have to be read from disk to return all zeros, do not have to be copied for Snaps, Clones, or rebuilds, and will not take bandwidth with SRDF. The WRITE SAME command instructs a VMAX to write the same block of data¹ to a specified number of sequential logical blocks (a conceptual sketch of the command follows this list). Operating systems like VMware will use this command to write zeros to a Symmetrix LUN. Without this command the VMware host would need to issue multiple write requests to the LUN, but with WRITE SAME support in Enginuity 5875, a host can do the same format using just one WRITE SAME instruction.
• Virtually provisioned thin striped metadevices can now be nondisruptively expanded. A thick BCV is established to preserve data during the expansion. The reconfigured thin striped metadevice will remain thin and not become fully allocated.
• Thin cascaded clones are supported.

¹ Although the WRITE SAME command accepts any pattern as an argument, zero is the only valid pattern that is accepted by the Enginuity operating system.
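As an illustration only, the following Python sketch shows the 16-byte command descriptor block (CDB) layout of a SCSI WRITE SAME(16) request as defined by the SBC-3 standard. It is not something an administrator would normally build by hand, since ESX issues the command itself; the field layout comes from the SCSI specification rather than from this paper.

```python
# Conceptual sketch: building a SCSI WRITE SAME(16) CDB (opcode 0x93, per SBC-3).
# One such command, accompanied by a single block of zeros, asks the array to
# replicate that block across a whole range of logical blocks instead of the
# host sending every zeroed block over the wire.
import struct

def write_same_16_cdb(lba, num_blocks, unmap=False):
    """Return the 16-byte CDB for WRITE SAME(16)."""
    opcode = 0x93                      # WRITE SAME(16)
    flags = 0x08 if unmap else 0x00    # UNMAP bit: deallocate instead of writing
    group = 0x00
    control = 0x00
    # Byte layout: opcode, flags, 8-byte LBA, 4-byte block count, group, control
    return struct.pack(">BBQIBB", opcode, flags, lba, num_blocks, group, control)

# A single command covering 1 GB of 512-byte blocks (2,097,152 blocks),
# versus the millions of ordinary writes a host would otherwise issue.
print(write_same_16_cdb(lba=0, num_blocks=2 * 1024 * 1024).hex())
```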

Concurrent SRDF/A

Concurrent SRDF/A expands the SRDF multisite topology offering by allowing two separate asynchronous links from a Symmetrix VMAX to Symmetrix systems located at remote data centers. This configuration exploits the core benefits of SRDF/A for improved application response times while replicating at extended distances. Enginuity 5875 offers the flexibility to change Concurrent SRDF/S and SRDF/A disaster restart topologies to Concurrent SRDF/A and SRDF/A.

Thick-to-thin migration with SRDF

Thick-to-thin migration with SRDF supports the migration of data from thick provisioned volumes on earlier generation Symmetrix systems to thin provisioned volumes on new VMAX systems with Enginuity 5875, via SRDF replication. During replication, SRDF will detect tracks or blocks containing all zeros and remove them from the replication stream, thereby allowing the reclamation of space during migration and the desired result of a thin migration target.

SRDF general enhancements

Symmetrix Remote Data Facility (SRDF) continues to evolve to meet user requirements for ease-of-use improvements and respond to the changing needs of high-end business environments. The Enginuity 5875 general SRDF enhancements are:

• The ability to configure multiple static SRDF groups simultaneously
• The ability to throttle host write I/O response time up to a user-defined limit
• The ability to allow TimeFinder/Snap off an SRDF/A R2 device

Federated Live Migration

Enginuity 5875 introduces a feature known as Federated Live Migration (FLM). This product allows data movement from an older Symmetrix DMX to a Symmetrix VMAX running Enginuity 5875 without downtime to applications and without loading additional software on any connected hosts. This feature makes use of Open Replicator to move the data between the Symmetrix arrays, and EMC PowerPath, or other multipathing solutions, to manage host access to the arrays while the migration is taking place. Federated Live Migration enables a device in the VMAX array to impersonate a device in the old Symmetrix array, making it assume the complete identity and geometry of the old device, and then performing an ORS hot pull, donor update operation.

Duplicate TimeFinder/Snaps

Duplicate TimeFinder/Snap offers the capability to capture TimeFinder/Snap replicas from another TimeFinder/Snap point-in-time copy. Work can proceed against an existing TimeFinder/Snap while duplicate copies are created for additional downstream processes or checkpoint backups. With Enginuity 5875, snap copies can be taken from another snap source, adding even more disk space savings and flexibility through this track sharing technology.

10 Gb/s support

New front-end I/O modules supported in Enginuity 5875 bring the latest 10 Gigabit technology to the Symmetrix VMAX platform. 10 Gb/s offers improved single-stream performance for SRDF.

Symmetrix Data at Rest Encryption

Symmetrix Data at Rest Encryption with Enginuity 5875 utilizes Data Encryption keys to encrypt and protect data on drives within a VMAX storage array. Data at Rest Encryption eliminates security risks when drives are removed from an array (because of normal drive replacement or possibly media theft) and when arrays are repurposed. Compliance to industry encryption requirements can also be satisfied with this solution. The data protection mechanism is a "set-and-forget" feature for the entire VMAX array that is enabled at array installation.

VMware vSphere 4.1

The VMware vSphere virtualization suite consists of various components including ESX/ESXi hosts, vCenter Server, vSphere Client, vSphere web access, and vSphere SDK. VMware ESX and VMware ESXi are bare-metal hypervisor architectures, meaning they install directly on top of the physical server and partition it into multiple virtual machines that can run simultaneously, sharing the physical resources of the underlying server. Each virtual machine represents a complete system, with processors, memory, networking, storage, and BIOS, and can run an unmodified operating system and applications. In addition to this, VMware vSphere offers a set of distributed services including distributed resource scheduling, high availability, and consolidated backup. VMware vSphere virtualizes the entire IT infrastructure including servers, storage, and networks. VMware vSphere aggregates these resources and presents a uniform set of elements in the virtual environment (Figure 1). With VMware vSphere, you can manage IT resources like a shared utility and dynamically provision resources to different business units and projects.

Figure 1. VMware vSphere 4.1 architecture


    Major enhancements in VMware vSphere 4.1 are listed below.

Storage enhancements

• Boot from SAN. vSphere 4.1 enables ESXi boot from SAN. iSCSI, FCoE, and Fibre Channel boot are supported.
• Hardware acceleration with VAAI. ESX can offload specific storage operations to compliant storage hardware. With storage hardware assistance, ESX performs these operations faster and consumes less CPU, memory, and storage fabric bandwidth.
• Storage performance statistics. vSphere 4.1 offers enhanced visibility into storage throughput and latency of hosts and virtual machines, and aids in troubleshooting storage performance issues. NFS statistics are now available in vCenter Server performance charts, as well as esxtop. New VMDK and datastore statistics are included. All statistics are available through the vSphere SDK.
• Storage I/O control. This feature provides quality-of-service capabilities for storage I/O in the form of I/O shares and limits that are enforced across all virtual machines accessing a datastore, regardless of which host they are running on. Using storage I/O control, vSphere administrators can ensure that the most important virtual machines get adequate I/O resources even in times of congestion.

Network enhancements

• Network I/O control. Traffic-management controls allow flexible partitioning of physical NIC bandwidth between different traffic types, including virtual machine, vMotion, FT, and IP storage traffic (vNetwork Distributed Switch only).
• IPv6 enhancements. IPv6 in ESX supports Internet Protocol Security (IPsec) with manual keying.
• Load-based teaming. vSphere 4.1 allows dynamic adjustment of the teaming algorithm so that the load is always balanced across a team of physical adapters on a vNetwork Distributed Switch.

Availability enhancements

• Windows Failover Clustering with VMware HA. Clustered virtual machines that utilize Windows Failover Clustering/Microsoft Cluster Service are now fully supported in conjunction with VMware HA.
• VMware HA scalability improvements. VMware HA has the same limits for virtual machines per host, hosts per cluster, and virtual machines per cluster as vSphere.
• VMware Fault Tolerance (FT) enhancements. vSphere 4.1 introduces an FT-specific versioning-control mechanism that allows the primary and secondary VMs to run on FT-compatible hosts at different but compatible patch levels. vSphere 4.1 differentiates between events that are logged for a primary VM and those that are logged for its secondary VM, and reports why a host might not support FT. In addition, you can disable VMware HA when FT-enabled virtual machines are deployed in a cluster, allowing for cluster maintenance operations without turning off FT.
• DRS interoperability for VMware HA and FT. FT-enabled virtual machines can take advantage of DRS functionality for load balancing and initial placement. In addition, VMware HA and DRS are tightly integrated, which allows VMware HA to restart virtual machines in more situations.
• vStorage APIs for Data Protection (VADP) enhancements. VADP now offers VSS quiescing support for Windows Server 2008 and Windows Server 2008 R2 servers. This enables application-consistent backup and restore operations for Windows Server 2008 and Windows Server 2008 R2 applications.

Management enhancements

• vCLI enhancements. vCLI adds options for SCSI, VAAI, network, and virtual machine control, including the ability to terminate an unresponsive virtual machine. In addition, vSphere 4.1 provides controls that allow you to log vCLI activity.
• Enhancements to host profiles. You can use host profiles to roll out administrator password changes in vSphere 4.1. Enhancements also include improved Cisco Nexus 1000V support and PCI device ordering configuration.
• Power management improvements. ESX 4.1 takes advantage of deep sleep states to further reduce power consumption during idle periods. The vSphere Client has a simple user interface that allows you to choose one of four host power management policies. In addition, you can view the history of host power consumption and power cap information on the vSphere Client Performance tab on newer platforms with integrated power meters.

Platform enhancements

• Scalability increases. vCenter Server 4.1 can support three times more virtual machines and hosts per system, as well as more concurrent instances of the vSphere Client and a larger number of virtual machines per cluster than vCenter Server 4.0. The scalability limits of Linked Mode, vMotion, and vNetwork Distributed Switch have also increased.
• DRS virtual machine Host Affinity Rules. DRS provides the ability to set constraints that restrict placement of a virtual machine to a subset of hosts in a cluster. This feature is useful for enforcing host-based ISV licensing models, as well as for keeping sets of virtual machines on different racks or blade systems for availability reasons.
• Memory compression. Compressed memory is a new level of the memory hierarchy, between RAM and disk. Slower than memory, but much faster than disk, compressed memory improves the performance of virtual machines when memory is under contention, because less virtual memory is swapped to disk.
• vMotion enhancements. In vSphere 4.1, vMotion enhancements significantly reduce the overall time for host evacuations, with support for more simultaneous virtual machine migrations and faster individual virtual machine migrations. The result is a performance improvement of up to 8x for an individual virtual machine migration, and support for four to eight simultaneous vMotion migrations per host, depending on the vMotion network adapter (1 GbE or 10 GbE, respectively).
• ESX/ESXi Active Directory integration. Integration with Microsoft Active Directory allows seamless user authentication for ESX/ESXi. You can maintain users and groups in Active Directory for centralized user management and assign privileges to users or groups on ESX/ESXi hosts. In vSphere 4.1, integration with Active Directory allows you to roll out permission rules to hosts by using Host Profiles.

VAAI primitives

The vStorage API for Array Integration (VAAI) is a new API for storage partners to leverage that permits certain functions to be delegated to the storage array, thus greatly enhancing the performance of those functions. This API is fully supported by EMC Symmetrix VMAX running Enginuity 5875 or later². In the vSphere 4.1 release, this array offload capability supports three primitives: hardware-accelerated Full Copy, hardware-accelerated Block Zero, and hardware-assisted locking.

Hardware-accelerated Full Copy

The time it takes to deploy or migrate a virtual machine is greatly reduced by use of the Full Copy primitive, as the process is entirely executed on the storage array and not on the ESX server. The host simply initiates the process and reports on the progress of the operation on the array. This greatly reduces overall traffic on the ESX server. In addition to deploying new virtual machines from a template or through cloning, Full Copy is also utilized when doing a Storage vMotion. When a virtual machine is migrated between datastores on the same array, the live copy is performed entirely on the array.

Not only does Full Copy save time, but it also saves significant server CPU cycles, memory, IP and SAN network bandwidth, and storage front-end controller I/O.

² Customers using or intending to use VAAI with Enginuity 5875 are advised to refer to the following EMC Technical Advisory for additional patching information: ETA emc263675: Symmetrix VMAX: VMware vStorage API for Array Integration (VAAI).
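Because the offload is transparent to the initiating host, nothing changes in how a clone or migration is requested; a datastore backed by a VAAI-capable VMAX device simply executes it faster. Purely as an illustration (the paper itself drives these operations through the vSphere Client), the sketch below issues an ordinary clone through the vSphere API using the open-source pyVmomi bindings; the hostnames, credentials, and inventory path are placeholders.

```python
# Sketch: an ordinary virtual machine clone issued through the vSphere API.
# If the source and target datastores sit on a VAAI-capable Symmetrix VMAX
# device, ESX offloads the data movement via Full Copy; the API call itself
# is unchanged. Connection details and names below are placeholders.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator", pwd="password")

# Look up the source VM (or template) by its inventory path.
vm = si.content.searchIndex.FindByInventoryPath("MyDatacenter/vm/gold-image")

relocate = vim.vm.RelocateSpec()            # same host/datastore unless overridden
clone_spec = vim.vm.CloneSpec(location=relocate, powerOn=False, template=False)

# The clone runs as a vCenter task; ESX decides whether Full Copy can be used.
task = vm.CloneVM_Task(folder=vm.parent, name="clone01", spec=clone_spec)

Disconnect(si)
```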


Figure 2. Hardware-accelerated Full Copy

Hardware-accelerated Block Zero

Having the array perform the Block Zeroing of a disk is far more efficient and much faster than traditional software Block Zeroing. A typical use case for Block Zeroing is when creating virtual disks that are eagerzeroedthick in format. Without the Block Zeroing primitive, the ESX server must complete all the zero writes of the entire disk before it reports back that the disk zeroing is complete. For a very large disk this is time-consuming. When employing the Block Zeroing primitive, however, also referred to as "write same," the disk array returns the cursor to the requesting service as though the process of writing the zeros has been completed. It then finishes the job of zeroing out those blocks without the need to hold the cursor until the job is done, as is the case with software zeroing.


Figure 3. Hardware-accelerated Block Zero

Hardware-assisted locking

As a clustered shared-storage manager, VMware's Virtual Machine File System (VMFS) needs to coordinate the access of multiple ESX server hosts to portions of the space within the logical units that they share. VMFS allocates portions of the storage available to it for the data describing virtual machines and their configurations, as well as the virtual disks that they access. Within a cluster of ESX servers, the virtual machines contained in the VMFS datastore can be loaded and run on any of the ESX instances. They can also be moved between instances for load balancing and high availability.

VMware has implemented locking structures within the VMFS datastores that are used to prevent any virtual machine from being run on, or modified by, more than one ESX at a time. The initial implementation of mutual exclusion for updates to these locking structures was built on the use of SCSI RESERVE and RELEASE commands. This protocol claims sole access to an entire logical unit for the reserving host until it issues a subsequent release. Under the protection of a SCSI RESERVE, a server node could update metadata records on the device to reflect its usage of portions of the device without the risk of interference from any other host that might also wish to claim the same portion of the device. This approach, which is shown in Figure 4, has a significant impact on overall cluster performance, since all other access to any portion of the device is prevented while SCSI RESERVE is in effect. As ESX clusters have grown in size, as well as in the frequency of modifications to the virtual machines they are running, the performance degradation from the use of SCSI RESERVE and RELEASE commands has become unacceptable.


Figure 4. Traditional VMFS locking before vSphere 4.1 and hardware-assisted locking

This led to the development of the third primitive for VAAI in the vSphere 4.1 release, hardware-assisted locking. This primitive provides a more granular means of protecting the VMFS metadata than SCSI reservations. Hardware-assisted locking leverages a storage array atomic test and set capability to enable a fine-grained, block-level locking mechanism, as shown in Figure 5. First, hardware-assisted locking replaces the sequence of RESERVE, READ, WRITE, and RELEASE SCSI commands with a single SCSI request for an atomic read-modify-write operation conditional on the presumed availability of the target lock. Second, this new request only requires exclusion of other accesses to the targeted locked block, rather than to the entire VMFS volume containing the lock. This locking metadata update operation is used by VMware whenever a virtual machine's state is being changed. This may be a result of the virtual machine being powered on or off, having its configuration modified, or being migrated from one ESX server host to another with vMotion or Dynamic Resource Scheduling.
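The following short Python sketch is a conceptual model of that atomic read-modify-write, written purely to illustrate the semantics described above; it is not VMware's implementation, and the class and variable names are invented for the example. On the wire, the operation is carried in a single SCSI request against just the block holding the lock record.

```python
# Conceptual model of hardware-assisted locking (atomic test-and-set).
# The array performs the compare and the conditional write as one indivisible
# operation on the single block that holds the VMFS lock record, so no
# LUN-wide RESERVE/RELEASE is needed. Illustration only; names are invented.

class LockSector:
    """In-memory stand-in for the on-disk block holding a VMFS lock record."""
    def __init__(self):
        self.data = b"free"

    def compare_and_write(self, expected, new):
        # In the real primitive this compare-and-write is atomic on the array.
        if self.data == expected:
            self.data = new
            return True      # this host now owns the lock
        return False         # another host changed the record first; re-read and retry

sector = LockSector()
print(sector.compare_and_write(b"free", b"owned-by-esx01"))  # True: lock acquired
print(sector.compare_and_write(b"free", b"owned-by-esx02"))  # False: contention, only this block was touched
```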


Figure 5. VMFS locking with vSphere 4.1 and hardware-assisted locking

Although the non-hardware-assisted SCSI reservation locking mechanism does not often result in performance degradation, the use of hardware-assisted locking provides a much more efficient means to avoid retries for getting a lock when many ESX servers are sharing the same datastore. Hardware-assisted locking enables the offloading of the lock mechanism to the array, and the array then does the locking at a very granular level. This permits significant scalability in a VMware cluster sharing a datastore without compromising the integrity of the VMFS shared storage-pool metadata.

Viewing VAAI support

Within the vSphere Client, one can see whether or not a particular device or datastore supports hardware acceleration. From the Configuration tab, under Hardware/Storage, there is a new column present in the datastore and device views labeled Hardware Acceleration. This column indicates the support status of the primitives for the device or datastore. There are three possible values that can populate this column: "Supported," "Not supported," or "Unknown."³ This new column is shown in Figure 6.

³ A particular device or datastore may be labeled as "Unknown" until it is successfully queried.


Figure 6. Viewing Hardware Acceleration in the vSphere Client

A similar view is also available using EMC's Virtual Storage Integrator (VSI) plug-in. This free tool, available to download on Powerlink, adds capabilities to the vSphere Client so that users may view detailed storage-specific information, as seen in Figure 7.

Figure 7. EMC Virtual Storage Integrator

The Hardware Acceleration column does not indicate whether the primitives are enabled or disabled, only whether they are supported. For enabling and disabling the primitives, see the section Enabling the vStorage APIs for Array Integration.


Enabling the vStorage APIs for Array Integration

The VAAI primitives are enabled by default on both the Symmetrix VMAX running 5875 Enginuity or later and on the 4.1 ESX server (properly licensed⁴), and should not require any user intervention. All three primitives, however, can be disabled through the ESX server if desired, either through the CLI or GUI. Using the vSphere Client, Full Copy and Block Zero can be disabled or enabled by altering the respective settings, DataMover.HardwareAcceleratedMove and DataMover.HardwareAcceleratedInit, in the ESX server advanced settings under DataMover, as shown in Figure 8.

Figure 8. Enabling hardware-accelerated Full Copy or Block Zero on ESX 4.1

Hardware-assisted locking can be disabled or enabled by changing the setting VMFS3.HardwareAcceleratedLocking in the ESX server advanced settings under VMFS3, as shown in Figure 9.

⁴ Refer to VMware documentation for required licensing to enable VAAI features.


Figure 9. Enabling hardware-assisted locking on ESX 4.1

Disabling or enabling the primitives is a dynamic process on the ESX server and does not require the ESX server to be in maintenance mode, nor does it necessitate a reboot of the server.

It is important to note that the primitives will not cause failures in the vSphere environment if conditions prevent their use. In such cases, VMware simply reverts to the default software behavior. For example, if the Full Copy primitive is enabled and a user attempts to clone a VM to a local datastore that does not support hardware acceleration, VMware will revert to the default software copy. The primitives may also be enabled or disabled even in the middle of operations that are utilizing them, without any concern of disruption.
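For scripted environments, the same advanced settings can also be read and changed programmatically through the vSphere SDK rather than the vSphere Client. The sketch below uses the open-source pyVmomi bindings, which this paper does not cover; the hostnames, credentials, and host lookup are placeholders, and depending on the bindings version the option values may need an explicit integer type.

```python
# Sketch: reading and toggling the VAAI-related advanced settings on an
# ESX 4.1 host through the vSphere API (pyVmomi bindings). Hostnames and
# credentials are placeholders; error handling is omitted.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator", pwd="password")
host = si.content.searchIndex.FindByDnsName(dnsName="esx01.example.com",
                                            vmSearch=False)
adv = host.configManager.advancedOption

# Report the current values (1 = enabled, 0 = disabled).
for key in ("DataMover.HardwareAcceleratedMove",    # Full Copy
            "DataMover.HardwareAcceleratedInit",    # Block Zero
            "VMFS3.HardwareAcceleratedLocking"):     # Hardware-assisted locking
    print(key, adv.QueryOptions(name=key)[0].value)

# Example: disable Full Copy (set the value back to 1 to re-enable it).
adv.UpdateOptions(changedValue=[vim.option.OptionValue(
    key="DataMover.HardwareAcceleratedMove", value=0)])

Disconnect(si)
```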

Use cases

Hardware-accelerated Full Copy⁵

The primary use cases that show the benefit of hardware-accelerated Full Copy are related to deploying virtual machines, whether from a template or by executing a hot or cold clone. With Full Copy, the creation of any of these virtual machines is offloaded to the array. In addition to deploying new virtual machines, Full Copy is also utilized by Storage vMotion operations, significantly reducing the time to move a virtual machine to a new datastore.

⁵ The SCSI command for hardware-accelerated Full Copy is XCOPY.

Following are a number of use cases for Full Copy concerned with the deployment of virtual machines, some from a template and others from cloning and Storage vMotion. At a minimum, each use case includes the test run on both thick (standard) metavolumes and thin metavolumes with hardware-accelerated Full Copy enabled and disabled.

Use case configuration

In general, the Full Copy use cases were conducted using the same virtual machine, or a clone of that virtual machine converted to a template. This virtual machine was created using the vSphere Client, taking all standard defaults and with the following characteristics:

• 20 GB virtual disk
• Windows Server 2003 operating system

The virtual machine was then completely filled with data so only about 500 MB of free space remained. This configuration might be thought of as a worst case scenario for customers, since most virtual machines that are used for cloning, even with the OS and all applications deployed on them, do not approach 20 GB of actual data. Remember that the default configuration of a virtual machine uses a zeroedthick virtual disk. This type of disk, though reserving the 20 GB of space on the VMFS, will not actually write to disk until necessary. This is distinct from a fault-tolerant virtual machine, which must have an eagerzeroedthick virtual disk, a disk that is zeroed out entirely upon creation. Since only a small portion of a 20 GB virtual machine will actually contain data, the Full Copy process will move far more quickly than in the use cases presented in this study. In order to fully test the functionality and performance, however, a full 20 GB virtual machine was used.

When creating non-fault-tolerant virtual machines, the cloning times are directly related to how much data is in the virtual machine, not the size of the virtual disk. This is true whether performing a software copy (Full Copy disabled) or a hardware-accelerated Full Copy. From the standpoint of cloning, a 20 GB VM that has 5 GB of data will take approximately the same amount of time to clone as a 100 GB VM with 5 GB of data.

The use cases included the use of both thick (standard) and thin devices in striped metavolume and non-metavolume configurations.⁶

⁶ In the use case graphs, standard devices are referred to as "thick" and metavolumes are abbreviated to "meta."


All results presented in graphs of Full Copy functionality are relative. There are many factors that can impact how fast both software clones and Full Copy clones are created. For software clones, the amount of CPU and memory on the ESX server as well as the network will play a role. For Full Copy, the number of engines and directors and other hardware components will impact the cloning time. In both cases, the existing load on the environment will most certainly make a difference.

Use Case 1: Deploying a virtual machine from a template

The first use case for Full Copy is the deployment of a virtual machine from a template. This is probably the most common type of virtual machine duplication for customers. A customer begins by creating a virtual machine, then installs the operating system, and following this, loads the applications. The virtual machine is then customized so that when users are presented with the cloned virtual machine, they are able to enter in their personalized information. When an administrator completes the virtual machine base configuration, it is powered down and then converted to a template.

These clones were created on one of four datastores, each backed by a particular disk type:

• Thick non-metadevice
• Thick striped metadevice
• Thin non-metadevice
• Thin striped metadevice

The graph in Figure 10 shows the significant time difference between a clone created using Full Copy and one using software cloning.


    Figure 10. Deploying virtual machines from the template

Use Case 2: Cloning hot and cold virtual machines

The second use case involves creating clones from existing virtual machines, both running (hot) and not running (cold). In addition, a third type of test was run in which a hot clone was sustaining heavy read I/O while it was being cloned. These clones were also created on one of four datastores, each backed by a particular disk type:

• Thick non-metadevice
• Thick striped metadevice
• Thin non-metadevice
• Thin striped metadevice

In each instance, utilizing the Full Copy functionality resulted in a significant performance benefit, as visible in Figure 11, Figure 12, and Figure 13.


Figure 11. Performance of cold clones using Full Copy


Figure 12. Performance of hot clones using Full Copy


Figure 13. Cloning running VMs under load using Full Copy

Use Case 3: Creating simultaneous multiple clones

The third use case tested was to deploy four clones simultaneously from the same source virtual machine. This particular test was run against a cold virtual machine and on the larger datastore disk types: thick striped metavolume and thin striped metavolume.

As seen in Figure 14, the results for the two different disk types are very similar. With Full Copy enabled, the time to create all four virtual machines is only a fraction of the time it took to create them with acceleration off.


Figure 14. Cloning multiple virtual machines simultaneously with Full Copy

Use Case 4: Storage vMotion

The final use case demonstrates a great benefit for customers who require datastore relocation of their virtual machine, but at the same time desire to reduce the impact to the live applications running on that virtual machine. This use case is Storage vMotion. With Full Copy enabled, the process of moving a virtual machine from one datastore to another datastore is offloaded. As mentioned previously, software cloning requires CPU, memory, and network bandwidth. The resources it takes to software clone, therefore, might negatively impact the applications running on the virtual machine that is being moved. By utilizing Full Copy this is avoided, and additionally, as seen in Figure 15, Full Copy is far quicker in moving that virtual machine.


    Figure 15. Storage vMotion using Full Copy

Server resources impact to CPU and memory

While the relative time benefit of using Full Copy is apparent in the presented use cases, what of the CPU and memory utilization on the ESX host? In this particular test environment and in these particular use cases, the CPU and memory utilization of the ESX host did not differ significantly between performing a software clone or a Full Copy clone. That is not to say, however, that this will hold true of all environments. The ESX host used in this testing was not constrained by CPU or memory resources in any way, as they were both plentiful, and the ESX host was not under any workload other than the testing. In a customer environment where resources may be limited and workloads on the ESX host many and varied, the impact of running a software clone instead of a Full Copy clone may be more significant.

Caveats for using hardware-accelerated Full Copy

The following are some general caveats that EMC provides when using this feature:


• Limit the number of simultaneous clones using Full Copy to three or four. This is not a strict limitation, but EMC believes this will ensure the best performance when offloading the copy process.
• A Symmetrix metavolume that has SAN Copy, TimeFinder/Clone, TimeFinder/Snap, or ChangeTracker sessions, and certain RecoverPoint sessions, will not support hardware-accelerated Full Copy. Any cloning or Storage vMotion operation run on datastores backed by these volumes will automatically be diverted to the default VMware software copy. Note that the vSphere Client has no knowledge of such sessions, and as such the Hardware Acceleration column in the vSphere Client will still indicate "Supported" for these devices or datastores.
• If SRDF protection of a cloned or migrated virtual machine is desired, the device hosting the target datastore should have an existing SRDF relationship with one or more remote device(s).

Hardware-accelerated Block Zero

Hardware-accelerated Block Zero provides benefits and performance increases in a variety of use cases. This paper will cover the two most commonly encountered scenarios. The first scenario is concerned with deploying fully allocated virtual machines (virtual machines using the eagerzeroedthick virtual disk allocation mechanism). The second use case discusses the performance gains achieved through the use of Block Zero when performing I/O in a virtual machine with virtual disks using the zeroedthick allocation format.

Use Case 1: Deploying fully allocated virtual machines

In certain situations, virtual disks are deployed using the eagerzeroedthick allocation mechanism to fully zero out the disk at creation. An example of a scenario that would demand this is a feature such as VMware Fault Tolerance, which requires all virtual disks in use by protected virtual machines to use this allocation mechanism. In other cases, virtual disks are created using the thin or zeroedthick allocation mechanisms, where space is not fully allocated on creation of the virtual disk. Neither of these mechanisms zeroes out the space until the guest OS writes data to a previously unallocated block. This behavior inherently causes a performance drop when allocating new space, as the kernel must first write zeros to initialize the new space before the virtual machine can write actual data. Therefore, the eagerzeroedthick allocation mechanism is used to avoid performance degradations caused by the on-the-fly zeroing of thin or zeroedthick virtual disks.

Since eagerzeroedthick virtual machines are entirely zeroed at initial creation, it can take a great deal of time to write these zeros, especially for virtual machines with large virtual disks. Furthermore, this creation operation consumes SAN bandwidth during execution. With the introduction of Block Zero, the zeroing is offloaded to the array, which not only saves SAN bandwidth but also takes advantage of the increased processing power inherent in the Symmetrix VMAX. It therefore writes these zeros faster and more efficiently. If Symmetrix thin devices are used in conjunction with Block Zero, the zeros are discarded by the Symmetrix and thin pool capacity is reserved for non-zero data.

In this use case, new virtual disks of varying sizes (10 GB, 40 GB, 100 GB, and 200 GB) are deployed using the eagerzeroedthick allocation mechanism with hardware-accelerated Block Zero enabled and disabled. As shown by the graph in Figure 16, the improvements are dramatic. The improvement in deployment time ranges from 7 to 18 times faster when Block Zero is enabled⁷. Typically, the larger the virtual disk, the greater the difference in deployment times, because there is a greater differential in I/O being sent down over the link for larger virtual disks.

⁷ It is important to note that the performance gains depicted in Figure 16 depend on a number of factors and may not necessarily be achieved in all environments.

Figure 16. Fully allocated virtual disk deployment times using hardware-accelerated Block Zero

In addition to the time savings, there is a substantial decrease in throughput during the creation of the virtual disks when Block Zero is on compared to off. Figure 17 shows the ESX performance graph of the write rate in KB per second (throughput) of the target Symmetrix device from the ESX server creating the virtual disk. A dramatic difference can be noted between when Block Zero is on and off. A virtual disk created with hardware acceleration off takes 13 minutes, starting at 12:14 and ending at 12:27, and has an almost constant write rate of around 280,000 KB/s. After the virtual disk creation completes, hardware-accelerated Block Zero is enabled and a second identical virtual disk is created on the same target Symmetrix device. The throughput required to deploy this is so low that it cannot be discerned on the same scale as the first virtual disk creation. Inserted into Figure 17, therefore, is a magnified portion of the graph so the throughput can be visualized. The throughput spikes at around 1,000 KB/s for around a minute, a 280x decrease in the throughput required to create the virtual disk.

Figure 17. Throughput difference between deploying an eagerzeroedthick virtual disk with Block Zero off and on

Use Case 2: Benefits of Block Zero when using zeroedthick virtual disks

By default in ESX 4.x, new virtual disks are created using the zeroedthick allocation mechanism. For instance, this is the recommended disk format for use with Symmetrix Virtual Provisioning to improve storage efficiencies by not writing zeros on creation. In this allocation scheme, the storage required for the virtual disks is reserved in the datastore, but the VMware kernel does not initialize all the blocks with zeros. The blocks are initialized with zeros by the ESX kernel in response to I/O activities to previously uninitialized blocks by the guest operating system. Consequently, as discussed in the previous use case, there is overhead when writing to unallocated regions of the virtual disk, as the ESX kernel first must write zeros to the previously unallocated space before the guest can complete any new write operations. Previously, to avoid this overhead, eagerzeroedthick virtual disks were used to ensure optimal performance. Unfortunately, especially in the case of Virtual Provisioning, eagerzeroedthick virtual disks wasted inordinate amounts of space in the thin pool due to the zeros written at creation. A decision therefore needed to be made between either consuming more capacity with eagerzeroedthick virtual disks or accepting the inherent performance overhead associated with zeroedthick virtual disks.

In this use case, the benefits of having hardware-accelerated Block Zero enabled are explored when writing to unallocated regions of a zeroedthick virtual disk. With Block Zero enabled, the zeros are no longer written to initialize blocks before the guest OS writes to them, removing the aforementioned performance overhead caused by the zeroedthick allocation mechanism. For this use case, an entirely unallocated 50 GB zeroedthick virtual disk is written to with Block Zero on and off using 1 MB sequential writes. A performance improvement of up to 18 percent in average throughput was consistently experienced by enabling Block Zero. Displacing the zero initialization from the ESX kernel to the Symmetrix produces a marked improvement in writing to unallocated virtual disk regions. This performance improvement is depicted graphically in Figure 18.
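Purely as an illustration of the I/O pattern just described, the following sketch writes 1 MB blocks sequentially to a file on a freshly created zeroedthick virtual disk and reports the average throughput; the paper does not name the tool actually used for its measurements, so the file path and the exact mechanics here are assumptions.

```python
# Illustrative approximation of the in-guest workload described above:
# sequential 1 MB writes to previously unallocated space on a zeroedthick
# virtual disk, with average throughput reported at the end. The target path
# is a placeholder; run once with Block Zero enabled and once disabled to
# compare results.
import os
import time

PATH = "/mnt/newdisk/testfile"      # file on the freshly created virtual disk (placeholder)
BLOCK_SIZE = 1024 * 1024            # 1 MB per write
BLOCK_COUNT = 4096                  # 4 GB written in this example run

buf = os.urandom(BLOCK_SIZE)        # non-zero data, so the guest really touches new regions

start = time.time()
with open(PATH, "wb") as f:
    for _ in range(BLOCK_COUNT):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())            # make sure the writes reach the virtual disk
elapsed = time.time() - start

mb_written = BLOCK_COUNT * BLOCK_SIZE / (1024 * 1024)
print("Wrote %d MB in %.1f s (%.1f MB/s average)" % (mb_written, elapsed, mb_written / elapsed))
```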


Figure 18. Throughput increase when writing data to new regions in a virtual disk with Block Zero disabled and enabled

Server resources impact to CPU and memory

While the relative time benefit and the drop in overall throughput from using Block Zero are readily apparent in the presented use cases, what of the CPU and memory utilization on the ESX host? In this particular test environment and in these particular use cases, the CPU and memory utilization of the ESX host did not differ significantly between when Block Zero was enabled or disabled. That is not to say, however, that this will hold true of all environments. The ESX host used in this testing was not constrained by CPU or memory resources in any way, as they were both plentiful, and the ESX host was not under any workload other than the testing. In a customer environment where resources may be limited and workloads on the ESX host many and varied, the impact of utilizing Block Zero may be more significant.


Conclusion

With the introduction of the Symmetrix VMAX Enginuity 5875 code, EMC offers three new storage integrations with VMware vSphere 4.1, enabling customers to dramatically improve the efficiency of their VMware environments. VMware vSphere 4.1 includes three new VMware vStorage APIs for Array Integration: Full Copy, Block Zero, and hardware-assisted locking. These three new primitives allow specific storage operations to be offloaded to EMC's Symmetrix VMAX to increase both overall system performance and efficiency.

References

EMC
• Using EMC Symmetrix Storage in VMware vSphere Environments TechBook
  http://www.emc.com/collateral/hardware/solution-overview/h2529-vmware-esx-svr-w-symmetrix-wp-ldv.pdf
• Using VMware vSphere with EMC Symmetrix Storage Applied Technology white paper
  http://www.emc.com/collateral/hardware/white-papers/h6531-using-vmware-vsphere-with-emc-symmetrix-wp.pdf
• Virtual Storage Integrator 4.1 Product Guide (Powerlink only)
• New Features in EMC Enginuity 5875 for Open Systems Environments white paper

VMware
• What's New in VMware vSphere 4.1 Storage
  http://www.vmware.com/files/pdf/techpaper/VMW-Whats-New-vSphere41-Storage.pdf
• vSphere Datacenter Administration Guide
  http://www.vmware.com/pdf/vsphere4/r41/vsp_41_dc_admin_guide.pdf
• Virtual Machine Administration Guide
  http://www.vmware.com/pdf/vsphere4/r41/vsp_41_vm_admin_guide.pdf
• ESX Configuration Guide
  http://www.vmware.com/pdf/vsphere4/r41/vsp_41_esx_server_config.pdf
• Fibre Channel SAN Configuration Guide
  http://www.vmware.com/pdf/vsphere4/r41/vsp_41_san_cfg.pdf