
A guide to storage and virtualisation

a special report from ComputerWeekly

Key challenges in virtualisation and storage – and how to handle them

Virtualisation brings benefits to the datacentre, but it also brings big changes to storage provisioning and management. We look at the key challenges and how to deal with them

There is no denying that server virtualisation has been a highly successful technology that has transformed the way applications are deployed in the datacentre. What previously took weeks and months can now be done in hours or even minutes, reducing application development and deployment cycles.

But one of the consequences of the move towards virtualised workloads has been a ballooning of demand for storage capacity and increased complexity in storage management for virtualisation.

So what is at the heart of this increase in storage demand? Why do so many organisations struggle to cope with storage in virtual environments? First, here is some background to the problem.

Shared storage for virtual servers

Shared storage can be presented to virtual servers using either block-based protocols – such as Fibre Channel and internet small computer system interface (iSCSI) – or file-based protocols, including server message block (SMB) and network file system (NFS).

In the case of block-based arrays, the storage presented to the hypervisor – or host – is in the form of a logical unit number (LUN), which on VMware vSphere platforms represents a datastore, formatted with VMware’s virtual-machine file system (VMFS).

Microsoft Hyper-V block storage is formatted as a system drive. In both implementations, many virtual machines are stored in the same datastore or drive, with datastores scaling to multiple terabytes of capacity.

For file-based systems, VMware vSphere only supports shares using the NFS protocol, while SMB is the preferred format for Hyper-V virtual machines.

Many issues in managing storage on virtual infrastructure stem from the consequences of storage design. These include the following:

Fragmentation

Virtual machines stored in a single datastore all receive the same level of performance and resiliency from the storage array. This is because the datastore is based on one or more LUNs – or volumes – on block-based systems, or a file share on file-based systems. As a result, many customers create multiple datastores, each with different performance characteristics.

This can include, for example, dedicated datastores for production workloads, some for development and some that have array-based replication. There are some exceptions here – systems that have block-based tiering, for example.

LUN-level array support

As already discussed, block-based systems allocate datastores based on one or more LUNs. A LUN is the smallest unit of granularity for features such as replication and failover. As a result, if customers use array-based data protection, guest virtual machines (VMs) have to be grouped on LUNs based on their data protection requirements. This can lead to the creation of additional LUNs in a system, increasing the problems of fragmentation.

Fully allocated VM storage

Virtual machine storage can be either fully allocated (reserved) at VM-creation time or allocated dynamically after the VM has been created. Allocating all storage at creation potentially delivers better performance, because the hypervisor doesn’t have to go through the process of requesting and reserving storage as each block of data is written by the guest.

However, in many instances, VMs are built from templates that need to be created with boot volumes of sufficient size for all installed applications. Inevitably, this leads to wastage with every virtual machine deployed.

Lack of thin provisioning

Thin provisioning is a feature of storage arrays and hypervisors that reserves capacity on physical disk as it is required, rather than pre-allocating it way beyond current needs. A whole article could be dedicated purely to a discussion of thin provisioning and whether best practice dictates it should occur at the hypervisor, the storage array, or both.

In practice, implementing thin provisioning at both layers of the infrastructure is perfectly acceptable, but many organisations fail to understand that, over time, thin-provisioned datastores effectively become thick-provisioned unless processes are in place to reclaim freed space.
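To put a number on that drift, the gap between what the array has allocated to a thin-provisioned datastore and what the guests actually hold can be estimated from figures most arrays and hypervisors already report. The short Python sketch below illustrates the arithmetic only; the datastore names and capacity figures are invented for the example.

```python
# Minimal sketch: estimate how much thin-provisioned capacity has drifted
# towards "thick" because freed guest space was never reclaimed.
# The datastore names and capacity figures are invented for illustration.

datastores = [
    # name, GB allocated on the array, GB actually used inside the guests
    ("prod-datastore-01", 4096, 2300),
    ("dev-datastore-01", 2048, 600),
    ("repl-datastore-01", 1024, 950),
]

for name, allocated_gb, guest_used_gb in datastores:
    reclaimable_gb = allocated_gb - guest_used_gb
    drift_pct = 100.0 * reclaimable_gb / allocated_gb
    print(f"{name}: {reclaimable_gb}GB potentially reclaimable "
          f"({drift_pct:.0f}% of allocated capacity)")
```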

VM sprawl

As obvious as it may seem, one issue with storage growth in virtual environments is VM sprawl – the proliferation of virtual machines that are rarely or never used. Server virtualisation allows rapid deployment of applications for production and development, but that same ease of use means that, without adequate management, VM creation can get out of hand.

This can be further compounded by the creation of orphan VMs that are not connected to the hypervisor inventory and exist purely on disk.



Need for tools

VM sprawl points to another requirement – tools that match up the data on disk to the VMs registered with the hypervisor. It is easy – especially in larger environments – for virtual machines to be removed from the inventory without being deleted from disk, often for acceptable reasons.

If these VMs are never cleaned up or re-associated with a host, then, over time, there is less likelihood of clean-up work being done.

Efficient virtualisation storage management

So how can these challenges be addressed to ensure efficient virtual-server storage? Here are some ideas to consider:

Implement a good thin-provisioning policy

Where possible, all virtual machines should be allocated with thin-provisioned storage. However, an effective thin-provisioning strategy means more than just configuring the feature at the VM guest and storage layer.

Thin provisioning clean-up also needs to be performed at the guest and hypervisor to return released storage to the array.

This is done in the VM guest by tools such as SDelete, which writes binary zeros onto the free space of the guest file system. At the hypervisor, VMware has a feature called VAAI Thin Provisioning Block Space Reclamation (known as Unmap), which allows the administrator to release storage by using the vmkfstools command. There is a trade-off in terms of how frequently these commands are run versus storage reclaimed, which varies by environment.
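As an illustration of what a guest-side tool such as SDelete is doing, the Python sketch below fills the free space of a file system with a zero-filled temporary file and then deletes it, leaving zeroed blocks for the array or hypervisor to reclaim. It is a conceptual stand-in, not a replacement for the vendor tools mentioned above, and the scratch file name and example path are assumptions.

```python
# Conceptual sketch of what a free-space zeroing tool such as SDelete does inside
# the guest: write zeros into a temporary file until the file system is full,
# then delete the file, leaving zeroed blocks behind for the array to reclaim.
# For illustration only - the target directory and scratch file name are assumptions.
import errno
import os

def zero_free_space(target_dir: str, chunk_mb: int = 64) -> None:
    path = os.path.join(target_dir, "zerofill.tmp")  # hypothetical scratch file
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    written = 0
    try:
        with open(path, "wb", buffering=0) as fh:    # unbuffered, so ENOSPC surfaces here
            while True:
                try:
                    written += fh.write(chunk)
                except OSError as exc:
                    if exc.errno == errno.ENOSPC:    # file system full: free space is zeroed
                        break
                    raise
    finally:
        if os.path.exists(path):
            os.remove(path)                          # give the space back, now zero-filled
    print(f"Zeroed roughly {written // (1024 * 1024)}MB of free space")

# Example call (assumed drive letter): zero_free_space("D:\\")
```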

File-based datastores are typically more efficient at returning free space to the array. As virtual machines are deleted or files change in size, the space on the array file system is made available for use elsewhere. This is one of the advantages file-based systems have over traditional block storage.

Use native array features

Many storage arrays support zero block reclaim – also called zero page reclaim – which allows the array to natively recover zeroed-out blocks of data. HP 3Par StoreServ systems, for example, do this inline, while others perform the function as a background task that needs to be scheduled by the storage administrator.

Virtual disks on VMware systems can be created in the eager-zeroed thick format, which writes zeros across the whole disk at the time of creation. The array immediately ignores the zeroed blocks, maintaining efficient space utilisation while enabling features such as fault tolerance and allowing VMs to have storage resources fully allocated at the time of creation.
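The idea behind zero block reclaim can be shown with a short scan: read a file in fixed-size pages and count those that contain only zeros, since these are the pages an array with the feature could return to the free pool. The page size and example file path below are assumptions for illustration, not any particular array’s internals.

```python
# Illustration of the zero page reclaim idea: count fixed-size pages that are
# entirely zero and could therefore be released by an array supporting the feature.
# The 16KB page size and the example path are assumptions.

def count_zero_pages(path: str, page_size: int = 16 * 1024):
    zero, total = 0, 0
    with open(path, "rb") as fh:
        while True:
            page = fh.read(page_size)
            if not page:
                break
            total += 1
            if not page.strip(b"\0"):   # page contains nothing but zero bytes
                zero += 1
    return zero, total

# Example (assumed path to a flat virtual disk file exported for inspection):
# zero, total = count_zero_pages("/tmp/example-flat.vmdk")
# print(f"{zero} of {total} pages are zero and reclaimable")
```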

Hyper-V platforms have fewer issues with thin provisioning, but the same principles of running commands such as SDelete apply.

For arrays that offer sub-LUN tiering, performance can be managed at a VM level, using EMC’s fully automated storage tiering (Fast), for example. However, using these tools may conflict with configuration settings in the hypervisor for moving data between tiers of storage. The use of this technology should be discussed with the virtual-server administrator to agree a policy that works at both the hypervisor and storage level.

Use the latest file system formats

On VMware vSphere platforms, the VMFS disk format used to store data on block-based systems has been improved over time as new versions of vSphere have been released. However, many IT organisations may have upgraded vSphere without reformatting their VMFS datastores. This practice can lead to inefficiency compared with using the latest VMFS formats. With the availability of live migration features such as Storage vMotion, migrating virtual machines to datastores in the new VMFS format – and taking the opportunity to thin provision them at the same time – should be a priority with each upgrade.

Manage VM sprawl

This seems like an obvious point, but it is not obvious to everyone. It is good practice to look at virtual-machine usage and have policies in place to archive virtual machines to cheaper storage after a period of inactivity – for example, when a VM has not been powered on for three months or more.

In development environments especially, it may be that a VM is not needed until the next software release cycle and so can be archived off until that point.

Backup software, such as CommVault’s Simpana suite, allows virtual machines to be moved to cheaper storage while retaining a stub file in place, so the VM can still be tracked in the hypervisor inventory.
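A policy like this can be driven by a very small report. The sketch below takes a list of virtual machines and their last power-on dates – supplied inline here, though in practice they would come from the hypervisor inventory or a backup tool – and flags anything idle for 90 days or more as an archive candidate. The VM names and dates are invented sample data.

```python
# Minimal sketch of an archiving policy check: flag VMs not powered on for
# three months (90 days) or more as candidates to move to cheaper storage.
# The VM names and dates are invented sample data.
from datetime import date, timedelta

IDLE_THRESHOLD = timedelta(days=90)

last_powered_on = {
    "dev-webserver-07": date(2014, 11, 2),
    "prod-sql-01": date(2015, 5, 20),
    "test-build-agent-3": date(2015, 1, 14),
}

today = date(2015, 5, 28)  # fixed report date for the example; use date.today() in practice
for vm, last_on in sorted(last_powered_on.items()):
    idle = today - last_on
    if idle >= IDLE_THRESHOLD:
        print(f"ARCHIVE CANDIDATE: {vm} (idle {idle.days} days)")
    else:
        print(f"keep online:       {vm} (idle {idle.days} days)")
```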

Deploy effective tools

There are tools that can identify and locate orphaned virtual machines. Alternatively, this process can be achieved using scripts that match the hypervisor inventory against a list of on-disk VM files. Running this process regularly will help catch orphaned VMs early and enable them to be reused or deleted.
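A simple version of such a script, sketched below, compares the names registered in the hypervisor inventory with the VM configuration files found on disk and reports anything present on disk but missing from the inventory. The datastore mount point, the .vmx convention and the inventory contents are assumptions based on a vSphere-style layout.

```python
# Sketch of orphaned-VM detection: compare the hypervisor inventory against the
# VM configuration files actually present on the datastores.
# The mount point, .vmx convention and inventory contents are assumptions
# illustrating a vSphere-style layout.
from pathlib import Path

def on_disk_vms(datastore_root: str) -> set:
    """Return VM names derived from .vmx files found under the datastore."""
    return {p.stem for p in Path(datastore_root).rglob("*.vmx")}

def orphaned_vms(inventory: set, datastore_root: str) -> set:
    """VMs that exist on disk but are not registered in the inventory."""
    return on_disk_vms(datastore_root) - inventory

if __name__ == "__main__":
    registered = {"prod-sql-01", "dev-webserver-07"}   # names from the hypervisor inventory
    datastore = "/vmfs/volumes/prod-datastore-01"      # assumed datastore mount point
    for vm in sorted(orphaned_vms(registered, datastore)):
        print(f"orphaned on disk, not in inventory: {vm}")
```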

There is one final observation that applies equally to storage and virtual-server administrators: spend time understanding a little more of each other’s technologies. With that in mind – and the right processes – managing storage on virtual-server installations can be made just a little easier.


Virtualisation and the LUN: Storage configuration for VMs

Storage administrators used to match LUNs to physical servers, but that has all changed. Find out how in this guide to the basics of virtual machine storage

Providing storage in a physical environment used to require storage administrators to match LUN (logical unit number) storage partitions to the performance and availability characteristics needed for each server. But with the advent of server virtualisation, that has all changed.

Instead, in the virtual environment, storage resources are abstracted not by the carving out of LUNs, but by the virtualisation hypervisor. The LUN still exists, but usually as a single large pool from which virtual storage is assigned to individual guests.

This pooling process means the storage and virtualisation administrators must do some additional planning and design to ensure storage resources operate in a timely and efficient fashion.

Hypervisors emulate storage devices

In virtual server environments, the storage presented to the guest is abstracted from the physical storage and is represented by generic SCSI devices.

In VMware environments, these were initially BusLogic and LSI Logic emulations of parallel SCSI devices, and have since been expanded to include faster SAS versions. Hyper-V presents storage to a guest using an emulated IDE device and can use SCSI devices for non-boot disks. Essentially, though, the key factor is that, regardless of the underlying storage used by the hypervisor, the guest still sees an emulated IDE, SCSI or SAS device connected through a single controller.

Device emulation means the physical data of a virtual machine can be moved around the storage in a virtual infrastructure without any impact to the guest. However, it does pose some limitations. Firstly, there are limits to the size of individual disk volumes, and secondly, only standard SCSI commands are passed through to the emulated device.

This can be an issue for servers that need to access array control devices. In this instance, disks can be connected directly without emulation. In VMware environments, these devices are known as raw device mappings (RDMs). The latest release of Hyper-V provides a feature that allows Fibre Channel devices to be connected directly to the guest machine without device emulation.

Virtual disks – VMDKs and VHDs

Virtual disk drives are stored by the hypervisor as files, with effectively one file per guest volume. In vSphere, the files are known as VMDKs (virtual machine disks), whereas Hyper-V stores them as a VHD (virtual hard disk).

Within vSphere, a VMDK can be stored on an NFS share or on a block device (Fibre Channel or iSCSI) that has been formatted with VMFS. A single VMDK is limited to 2TB minus 512 bytes in size, which effectively imposes a 2TB limit on each guest volume. Where a guest needs more than 2TB, the storage has to be presented through multiple logical volumes.
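As a quick check of that limit, the snippet below works out how many maximum-size VMDKs are needed to present a given amount of guest capacity; the requested sizes are arbitrary examples.

```python
# How many maximum-size VMDKs (2TB minus 512 bytes each) are needed to present
# a given guest capacity? The requested sizes below are example figures.
import math

MAX_VMDK_BYTES = 2 * 1024**4 - 512  # 2TB minus 512 bytes

for requested_tb in (1.5, 2, 5):
    requested_bytes = int(requested_tb * 1024**4)
    vmdks = math.ceil(requested_bytes / MAX_VMDK_BYTES)
    # Note: a full 2TB request already needs two disks because of the 512-byte shortfall.
    print(f"{requested_tb}TB of guest storage needs {vmdks} virtual disk(s)")
```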



On Hyper-V, the VHD format is also limited to 2TB in size. Microsoft recently released the new VHDX format as part of Windows Server 2012, which allows individual virtual disks to scale to 64TB in size.

The physical storage used to hold virtual disks can be either block or NAS devices. VMware supports iSCSI, Fibre Channel, FCoE and NFS. Hyper-V supports Fibre Channel, iSCSI and SMB, the latter sometimes being referred to historically as CIFS.

The type of storage used is transparent to the virtual guest, as is the level of multi-pathing in place. This is all achieved at the hypervisor layer and should follow standard good practices of multiple redundant paths to physical storage.

Matching VMs to storage

Hyper-V and vSphere store virtual machines in larger “containers”. Hyper-V uses local NTFS volumes or SMB/CIFS file shares. vSphere uses NFS shares or LUNs formatted with VMFS, known as datastores.

Before vSphere version 5, the block size of a VMFS datastore could range from 1MB to 8MB and limited the maximum size of the virtual disks it could hold. The largest 2TB virtual disks needed an 8MB block size, effectively resulting in a minimum of 8MB increments assigned to virtual guests.

VMFS version 5 (released with vSphere 5) provides for a uniform 1MB increment, regardless of the datastore size. For Hyper-V, the block increment is 2MB, irrespective of the formatting of the underlying NTFS file system.
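The effect of the allocation increment can be seen with a little rounding arithmetic: each guest’s usage is effectively rounded up to whole increments, so a smaller increment wastes less space. The guest usage figure below is an arbitrary example.

```python
# Rounding overhead of different allocation increments (1MB for VMFS-5,
# up to 8MB for older VMFS, 2MB for Hyper-V). The guest usage figure is an example.
import math

def allocated_mb(guest_used_mb: float, increment_mb: int) -> int:
    """Space consumed on the datastore once usage is rounded up to whole increments."""
    return math.ceil(guest_used_mb / increment_mb) * increment_mb

guest_used_mb = 40961.3  # example: a guest that has written just over 40GB
for increment in (1, 2, 8):
    alloc = allocated_mb(guest_used_mb, increment)
    print(f"{increment}MB increment: {alloc}MB allocated "
          f"({alloc - guest_used_mb:.1f}MB overhead)")
```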

In both hypervisor platforms, the container used to store virtual machines represents the way physical storage is presented to the hypervisor and so means all virtual guests in that container will receive the same level of performance and availability.

Therefore, vSphere datastores and Hyper-V volumes should be used to group similar types of virtual machine together. This grouping may, for example, separate production from test/development guests, or be used to provide higher-performance storage, such as tier 1 or SSD.

Consideration should also be given to the connectivity of physical storage and how it can affect performance. For example, in Fibre Channel environments, there may be a benefit in having separate Fibre Channel HBAs (host bus adapters) dedicated to high-performance storage, to reduce the impact of contention from lower-performance virtual machines in a mixed environment.


Virtual storage tips

- Create datastores and physical disk volumes as large as possible.
- Apply existing standards on multi-pathing and RAID protection to the physical datastores and volumes.
- Group similar workloads together within a single datastore/volume.
- Use thin provisioning to pre-allocate storage for virtual guests to the expected maximums required, remembering individual vSphere disks have a maximum 2TB limit (64TB for Hyper-V VHDX).
- Implement monitoring and alerting to manage the growth of thin-provisioned virtual disks.
- Where supported in the guest operating system, use the latest version of virtual SCSI and SAS drivers to get the best levels of performance.
- Use physical device mapping to present physical disks to guests that need direct SCSI support.
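For the monitoring point in the list above, a basic over-commitment report needs only each datastore’s capacity and the total capacity provisioned to its thin virtual disks; anything above an agreed ratio raises an alert. The 1.5x threshold and the sample figures in the sketch below are assumptions.

```python
# Basic thin-provisioning monitoring sketch: alert when the capacity provisioned
# to virtual disks exceeds an agreed over-commit ratio for the datastore.
# The 1.5x threshold and the sample figures are assumptions for the example.

OVERCOMMIT_ALERT_RATIO = 1.5

datastores = [
    # name, datastore capacity in GB, total provisioned virtual disk GB
    ("prod-datastore-01", 4096, 5200),
    ("dev-datastore-01", 2048, 3900),
]

for name, capacity_gb, provisioned_gb in datastores:
    ratio = provisioned_gb / capacity_gb
    status = "ALERT" if ratio > OVERCOMMIT_ALERT_RATIO else "ok"
    print(f"{name}: provisioned {provisioned_gb}GB on {capacity_gb}GB "
          f"({ratio:.2f}x) -> {status}")
```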


Thin provisioning

Both vSphere and Hyper-V provide for thin provisioned virtual machines. By this, we mean on-demand growth of virtual machines rather than physically reserving the entire size of the virtual machine at creation time.

vSphere provides for “thick” guest volumes in two formats: zeroedthick – in which storage is reserved at creation time and is zeroed out, or erased just-in-time as the host writes to that block of physical storage; and eagerzeroedthick – where the storage reserved is zeroed out at guest creation time.

Both of these formats represent a trade-off in performance versus security as zeroedthick can result in stale data existing on the VMFS. Hyper-V provides for “thick” allocated VHDs or dynamically expanding VHDs.

As with thin provisioning in traditional environments, there are positives and negatives in using the technology in virtual environments. Thin provisioning within the hypervisor means many more virtual machines can be accommodated on disk and this is especially useful where virtual machines are deliberately over-allocated in size to cater for future growth. Of course, the downside to on-demand expansion is the dispersed nature of the storage for a single virtual guest.

As each guest on a datastore or volume expands, it allocates space in 1MB or 2MB chunks with no predictability on when the next chunk will be requested for any specific virtual machine. This leads to a random and fragmented layout for the storage of an individual guest. This is particularly true of virtual desktop environments, which have a high degree of random I/O, producing performance problems as many virtual desktops are started at the same time.

One obvious question arising from thin provisioning is whether thin technologies should be implemented in both the hypervisor and the storage. There is no reason not to have thin provisioning in both places; the only recommendation is to ensure reporting and monitoring is in place to manage growth.
