
Page 1: VMworld 2017 Core Storage

Cormac Hogan and Cody Hosterman

SER1143BU

#VMworld #SER1143BU

A Deep Dive into vSphere 6.5 Core Storage Features and Functionality

Page 2: VMworld 2017 Core Storage

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Disclaimer


Page 3: VMworld 2017 Core Storage

Introduction

Page 4: VMworld 2017 Core Storage

Welcome from Cormac and Cody

• Cormac

• Director and Chief Technologist

• VMware

• @CormacJHogan

• http://cormachogan.com


• Cody

• Technical Director for VMware Solutions

• Pure Storage

• @CodyHosterman

• https://codyhosterman.com


Page 5: VMworld 2017 Core Storage

Agenda Slide

1 Limits

2 VMFS-6

3 VAAI (ATS Miscompare and UNMAP)

4 SPBM (SIOCv2 and vSphere VM Encryption)

5 NFS v4.1

6 iSCSI

7 NVMe


Page 6: VMworld 2017 Core Storage

vSphere 6.5 Storage Limits

Page 7: VMworld 2017 Core Storage

vSphere 6.5 Scaling and Limits

Paths

• ESXi hosts now support up to 2,000 paths

– An increase from the 1,024 paths per host supported previously

Devices

• ESXi hosts now support up to 512 devices

– Increase from the 256 devices supported per host previously

– Multiple targets are required to address more than 256 devices

– This does not impact Virtual Volumes (aka VVols), which can address 16,383 VVols per PE (Protocol Endpoint)
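A rough way to check how close a host is to these device and path limits is to count entries in the esxcli listings. This is a sketch, assuming the "Display Name" and "Runtime Name" fields that esxcli prints exactly once per device and per path respectively:

# esxcli storage core device list | grep -c "Display Name"

# esxcli storage core path list | grep -c "Runtime Name"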


Page 8: VMworld 2017 Core Storage

vSphere 6.5 Scaling and Limits

• 512e Advanced Format Device Support

• Capacity limits are becoming an issue with the 512n (native) sector size currently used in disk drives

• New Advanced Format (AF) drives use a 4K native sector size for higher capacity

• These 4Kn devices are not yet supported on vSphere

• For legacy applications and operating systems that cannot support 4Kn drives, new 4K sector size drives that run in 512 emulation (512e) mode are now available

– These drives have a physical sector size of 4K but a logical sector size of 512 bytes

• These drives are now supported on vSphere 6.5 for VMFS and RDM (Raw Device Mappings)


Page 9: VMworld 2017 Core Storage

512n/512e

• # esxcli storage core device capacity list


[root@esxi-dell-e:~] esxcli storage core device capacity list

Device                                Physical Blocksize  Logical Blocksize  Logical Block Count  Size        Format Type

------------------------------------  ------------------  -----------------  -------------------  ----------  -----------

naa.624a9370d4d78052ea564a7e00011014 512 512 20971520 10240 MiB 512n

naa.624a9370d4d78052ea564a7e00011015 512 512 20971520 10240 MiB 512n

naa.624a9370d4d78052ea564a7e00011138 512 512 1048576000 512000 MiB 512n

naa.624a9370d4d78052ea564a7e00011139 512 512 1048576000 512000 MiB 512n

naa.55cd2e404c31fa00 4096 512 390721968 190782 MiB 512e

naa.500a07510f86d6bb 4096 512 1562824368 763097 MiB 512e

naa.500a07510f86d685 4096 512 1562824368 763097 MiB 512e

naa.5001e820026415f0 512 512 390721968 190782 MiB 512n

Note: the Format Type column distinguishes 512n (native) from 512e (emulated) devices


Page 10: VMworld 2017 Core Storage

DSNRO

The behavior of the setting “Disk.SchedNumReqOutstanding” (aka “No of outstanding IOs with competing worlds”, or DSNRO) has changed

• DSNRO can be set to a maximum of:

– 6.0 and earlier: 256

– 6.5 and on: Whatever the HBA Device Queue Depth Limit is

• Allows for extreme levels of performance
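As a sketch of how this is inspected and tuned per device (the naa. identifier below is a placeholder): the device listing prints both the device max queue depth and the current "No of outstanding IOs with competing worlds" value, and the set command raises DSNRO, which in 6.5 can go up to the device queue depth rather than the old cap of 256.

# esxcli storage core device list -d naa.xxxxxxxxxxxxxxxx | grep -iE "outstanding|queue depth"

# esxcli storage core device set -d naa.xxxxxxxxxxxxxxxx -O 128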


Page 11: VMworld 2017 Core Storage

VMFS-6

Page 12: VMworld 2017 Core Storage

VMFS-6: On-disk Format Changes

• File System Resource Management - File Block Format

• VMFS-6 has two new “internal” block sizes, small file block (SFB) and large file block (LFB)

– The SFB size is set to 1MB; the LFB size is set to 512MB

– These are internal concepts for “files” only; the VMFS block size is still 1MB

• Thin disks created on VMFS-6 are initially backed with SFBs

• Thick disks created on VMFS-6 are allocated LFBs as much as possible

– For the portion of the thick disk which does not fit into an LFB, SFBs are allocated

• These enhancements should result in much faster file creation times

– Especially true with swap file creation so long as the swap file can be created with all LFBs

– Swap files are always thickly provisioned


Page 13: VMworld 2017 Core Storage

VMFS-6: On-disk Format Changes

• Dynamic System Resource Files

• System resource files (.fdc.sf, .pbc.sf, .sbc.sf, .jbc.sf) are now extended dynamically for VMFS-6

– Previously these were static in size

– These may show a much smaller size initially, when compared to previous versions of VMFS, but they will grow over time

• If the filesystem exhausts any resources, the respective system resource file is extended to create additional resources

• VMFS-6 can now support millions of files / pointer blocks / sub blocks (as long as volume has free space)


Page 14: VMworld 2017 Core Storage

vmkfstools – 500GB VMFS-5 Volume


# vmkfstools -P -v 10 /vmfs/volumes/<datastore>

Page 15: VMworld 2017 Core Storage

vmkfstools – 500GB VMFS-6 volume


Note the large file blocks. In VMFS-6, sub blocks are used for pointer blocks, which is why “Ptr Blocks (max)” is shown as 0 here.


Page 16: VMworld 2017 Core Storage

VMFS-6: On-disk Format Changes

• File System Resource Management - Journaling

• VMFS is a distributed journaling filesystem

• Journals are used on VMFS when performing metadata updates on the filesystem

• Previous versions of VMFS used regular file blocks as journal resource blocks

• In VMFS-6, journal blocks are tracked in a separate system resource file called .jbc.sf

• This was introduced to address VMFS journal-related issues on previous versions of VMFS, caused by the use of regular file blocks as journal blocks and vice-versa

– E.g. full file system, see VMware KB article 1010931


Page 17: VMworld 2017 Core Storage

New Journal System File Resource

(Screenshots: system resource files on a VMFS-5 volume vs. a VMFS-6 volume)

Page 18: VMworld 2017 Core Storage

VMFS-6: VM-based Block Allocation Affinity

• Resources for VMs (blocks, file descriptors, etc.) on earlier VMFS versions were allocated on a per host basis (host-based block allocation affinity)

• Host contention issues arose when a VM/VMDK was created on one host, and then vMotion was used to migrate the VM to another host

• If additional blocks were allocated to the VM/VMDK by the new host at the same time as the original host tried to allocate blocks for a different VM in the same resource group, the different hosts could contend for resource locks on the same resource

• This change introduces VM-based block allocation affinity, which will decrease resource lock contention


Page 19: VMworld 2017 Core Storage

VMFS-6: Parallelism/Concurrency Improvements

• Some of the biggest delays on VMFS were in device scanning and filesystem probing

• vSphere 6.5 has new, highly parallel, device discovery and filesystem probing mechanisms

– Previous versions of VMFS only allowed one transaction at a time per host on a given filesystem; VMFS-6 supports multiple, concurrent transactions at a time per host

• These improvements are significant for fail-over events, and Site Recovery Manager (SRM) should especially benefit

• They were also required to support the higher limits on the number of devices and paths in vSphere 6.5


Page 20: VMworld 2017 Core Storage

Hot Extend Support

• Prior to ESXi 6.5, VMDKs on a powered-on VM could only be grown if their size was less than 2TB

• If the size of a VMDK was 2TB or larger, or the expand operation caused it to exceed 2TB, the hot extend operation would fail

• This typically required administrators to shut down the virtual machine to expand it beyond 2TB

• This behavior has been changed in vSphere 6.5, and hot extend no longer has this limitation


This is a vSphere 6.5 improvement, not specific to VMFS-6. This will also work on VMFS-5 volumes.


Page 21: VMworld 2017 Core Storage

“Upgrading” to VMFS-6

• There is no direct ‘in-place’ upgrade of the filesystem to VMFS-6; new datastores only

• Customers upgrading to the vSphere 6.5 release should continue to use their VMFS-5 (or older) datastores until they can create new VMFS-6 datastores

• Use migration techniques such as Storage vMotion to move VMs from the old datastore to the new VMFS-6 datastore
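To see which VMFS version each mounted datastore is using before and during the migration, and to format an existing partition as VMFS-6 from the command line rather than the UI, something like the following works; the datastore label and device/partition are placeholders, and the vmkfstools line assumes the target partition already exists (created via the UI or partedUtil):

# esxcli storage filesystem list

# vmkfstools -C vmfs6 -S New-VMFS6-DS /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx:1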


Page 22: VMworld 2017 Core Storage


VMFS-6 Performance Improvements with resignature

(discovery and filesystem probing)


Page 23: VMworld 2017 Core Storage


Page 24: VMworld 2017 Core Storage

VAAI: vSphere APIs for Array Integration

Page 25: VMworld 2017 Core Storage

ATS Miscompare Handling (1 of 3)

• The heartbeat region of VMFS is used for on-disk locking

• Every host that uses the VMFS volume has its own heartbeat region

• This region is updated by the host on every heartbeat

• The region that is updated is the time stamp, which tells others that this host is alive

• When the host is down, this region is used to communicate lock state to other hosts


Page 26: VMworld 2017 Core Storage

ATS Miscompare Handling (2 of 3)

• In vSphere 5.5 U2, we started using ATS for maintaining the heartbeat

• ATS is the Atomic Test and Set primitive which is one of the VAAI primitives

• Prior to this release, we only used ATS when the heartbeat state changed

• For example, we would use ATS in the following cases:

– Acquire a heartbeat

– Clear a heartbeat

– Replay a heartbeat

– Reclaim a heartbeat

• We did not use ATS for maintaining the ‘liveness’ of a heartbeat

• This change to using ATS to maintain the ‘liveness’ of a heartbeat appears to have led to issues with certain storage arrays
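For arrays that hit these heartbeat-related problems, VMware has documented an advanced VMFS3 option that reverts the heartbeat to non-ATS I/O on VMFS-5 volumes. This is only a hedged sketch, not part of this deck: confirm the exact option name on your build with the list command first, and only change it under guidance from your array vendor or VMware support.

# esxcli system settings advanced list | grep -i ATSForHB

# esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5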


Page 27: VMworld 2017 Core Storage

ATS Miscompare Handling (3 of 3)

• When an ATS Miscompare is received, all outstanding IO is aborted

• This led to additional stress and load being placed on the storage arrays

– In some cases, this led to the controllers crashing on the array

• In vSphere 6.5, there are new heuristics added so that when we get a miscompare event, we retry the read and verify that there is a miscompare

• If the miscompare is real, then we do the same as before, i.e. abort outstanding I/O

• If the on-disk HB data has not changed, then this is a false miscompare

• In the event of a false miscompare:

– VMFS will not immediately abort IOs

– VMFS will re-attempt ATS HB after a short interval (usually less than 100ms)


Page 28: VMworld 2017 Core Storage

An Introduction to UNMAP

UNMAP via datastore

• VAAI UNMAP was introduced in vSphere 5.0

• Enables an ESXi host to inform the backing storage that files or VMs have been moved or deleted from a thin-provisioned VMFS datastore

• Allows the backing storage to reclaim the freed blocks

• No way of doing this previously, resulting in stranded space on Thin Provisioned VMFS datastores


Page 29: VMworld 2017 Core Storage

Automated UNMAP in vSphere 6.5

Introducing Automated UNMAP Space Reclamation

• In vSphere 6.5, there is now an automated UNMAP crawler mechanism for reclaiming dead or stranded space on VMFS datastores

• Now UNMAP will run continuously in the background

• UNMAP granularity on the storage array

– The granularity of the reclaim is set to 1MB chunk

– Automatic UNMAP is not supported on arrays with UNMAP granularity greater than 1MB

– Auto UNMAP feature support is footnoted in the VMware Hardware Compatibility Guide (HCL)
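Whether a given device advertises UNMAP support at all can be checked from the host with the VAAI status command ("Delete Status" is the UNMAP primitive); the device ID below is a placeholder:

# esxcli storage core device vaai status get -d naa.xxxxxxxxxxxxxxxx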


Page 30: VMworld 2017 Core Storage

Some Considerations with Automated UNMAP

• UNMAP is only issued automatically to datastores that are VMFS-6 and have powered-on VMs

• It can take 12-24 hours to fully reclaim dead space

• The default behavior is on, but it can be turned off on the host (that host won’t participate)…

– via the VMFS3.EnableVMFS6Unmap advanced setting

• …or on the datastore (no hosts will reclaim it) – see the example commands below
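A sketch of the corresponding commands (the datastore name is a placeholder; the advanced-option path is the host-level switch referred to above):

# esxcli storage vmfs reclaim config get -l VMFS6-DS-01

# esxcli storage vmfs reclaim config set -l VMFS6-DS-01 --reclaim-priority=none

# esxcli system settings advanced set -o /VMFS3/EnableVMFS6Unmap -i 0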


Page 31: VMworld 2017 Core Storage

An Introduction to Guest OS UNMAP

UNMAP via Guest OS

• In vSphere 6.0, additional improvements to UNMAP facilitate the reclaiming of stranded space from within a Guest OS

• Effectively, the ability for a Guest OS in a thinly provisioned VM to tell the backing storage that blocks are no longer in use

• This allows the backing storage to reclaim that capacity and shrink the size of the VMDK


Page 32: VMworld 2017 Core Storage

Some Considerations with Guest OS UNMAP

TRIM Handling

• UNMAP works at certain block boundaries on VMFS, whereas TRIM does not have such restrictions

• While this should be fine on VMFS-6, which is now 4K aligned, certain TRIMs converted into UNMAPs may fail due to block alignment issues on previous versions of VMFS

Linux Guest OS SPC-4 support

• Initially in-guest UNMAP support to reclaim in-guest dead space natively was limited to Windows 2012 R2

• Linux distributions check the SCSI version, and unless it is version 5 or greater, they do not send UNMAPs

• With SPC-4 support introduced in vSphere 6.5, Linux Guest OSes will now also be able to issue UNMAPs
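As a sketch from inside a Linux guest (assuming a thin VMDK and a virtual hardware version new enough for the disk to be presented as SPC-4, which reportedly means hardware version 13 on vSphere 6.5): sg_inq from sg3_utils shows the SCSI version the virtual disk advertises, and fstrim issues TRIM/UNMAP for the free space on a mounted filesystem. The device name and mount point are examples.

# sg_inq /dev/sda

# fstrim -v /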


Page 33: VMworld 2017 Core Storage

Automated UNMAP Limits and Considerations

Guest OS filesystem alignment

• VMDKs are aligned on 1 MB block boundaries

• However, misalignment may still occur within the guest OS filesystem

• This may also prevent UNMAP from working correctly

• A best practice is to align guest OS partitions to the 1MB granularity boundary
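A quick alignment check from inside a Linux guest (device and partition number are examples): with 512-byte logical sectors, a partition start sector that is a multiple of 2048 sits on a 1MB boundary, and parted can also report whether a partition meets the device's optimal alignment.

# fdisk -l /dev/sdb

# parted /dev/sdb align-check opt 1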


Page 34: VMworld 2017 Core Storage

Known Automated UNMAP Issues

• vSphere 6.5

– Tools in the guest operating system might send unmap requests that are not aligned to the VMFS unmap granularity.

– Such requests are not passed to the storage array for space reclamation.

– Further info in KB article 2148987

– This issue is addressed in vSphere 6.5 P01

• vSphere 6.5 P01

– Certain versions of Windows Guest OS running in a VM may appear unresponsive if UNMAP is used.

– Further info in KB article 2150591.

– This issue is addressed in vSphere 6.5 U1.


Page 35: VMworld 2017 Core Storage

UNMAP in action

Page 36: VMworld 2017 Core Storage


Page 37: VMworld 2017 Core Storage

SPBM: Storage Policy Based Management

Page 38: VMworld 2017 Core Storage

The Storage Policy Based Management (SPBM) Paradigm

• SPBM is the foundation of VMware's Software Defined Storage vision

• Common framework to allow storage and host related capabilities to be consumed via policies.

• Applies data services (e.g. protection, encryption, performance) on a per VM, or even per VMDK level


Page 39: VMworld 2017 Core Storage

Creating Policies via Rules and Rule Sets

• Rule

– A Rule references a combination of a metadata tag and a related value, indicating the quality or quantity of the capability that is desired

– These two items act as a key and a value that, when referenced together through a Rule, become a condition that must be met for compliance

• Rule Sets

– A Rule Set consists of one or more Rules

– A storage policy includes one or more Rule Sets that describe requirements for virtual machine storage resources

– Multiple “Rule Sets” can be leveraged to allow a single storage policy to define alternative selection parameters, even from several storage providers


Page 40: VMworld 2017 Core Storage

(Diagram: VAIO data services alongside vSAN, VVols, and VMFS datastores)

Page 41: VMworld 2017 Core Storage

SPBM and Common Rules for Data Services provided by hosts: VM Encryption and Storage I/O Control v2

Page 42: VMworld 2017 Core Storage

Two new features introduced with vSphere 6.5: VM Encryption and Storage I/O Control v2. Implementation is done via I/O Filters.

Page 43: VMworld 2017 Core Storage


Storage I/O Control v2

• VM Storage Policies in vSphere 6.5 have a new option called “Common Rules”

• These are used for configuring data services provided by hosts, such as Storage I/O Control and Encryption. It is the same mechanism used for VAIO/IO Filters


Page 44: VMworld 2017 Core Storage

vSphere VM Encryption

• vSphere 6.5 introduces a new VM encryption mechanism

• It requires an external Key Management Server (KMS). Check the HCL for supported vendors

• This encryption mechanism is implemented in the hypervisor, making vSphere VM encryption agnostic to the Guest OS

• This not only encrypts the VMDK, but it also encrypts some of the VM Home directory contents, e.g. VMX file, metadata files, etc.

• Like SIOCv2, vSphere VM Encryption in vSphere 6.5 is policy driven


Page 45: VMworld 2017 Core Storage


vSphere VM Encryption I/O Filter

• Common rules must be enabled to add vSphere VM Encryption to a policy.

• The only setting in the custom encryption policy is whether to allow I/O filters before encryption.


Page 46: VMworld 2017 Core Storage

VM Encryption and SIOC policy

Page 47: VMworld 2017 Core Storage


Page 48: VMworld 2017 Core Storage

NFS v4.1 Improvements

Page 49: VMworld 2017 Core Storage

NFS v4.1 Improvements

• Hardware Acceleration/VAAI-NAS Improvements

– NFS 4.1 client in vSphere 6.5 supports hardware acceleration by offloading certain operations to the storage array.

– This comes in the form of a plugin to the ESXi host that is developed/provided by the storage array partner.

– Refer to your NAS storage array vendor for further information.

• Kerberos IPv6 Support

– NFS v4.1 Kerberos adds IPv6 support in vSphere 6.5.

• Kerberos AES Encryption Support

– NFS v4.1 Kerberos adds Advanced Encryption Standards (AES) encryption support in vSphere 6.5
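For reference, an NFS v4.1 datastore is mounted from the command line as below (server IPs, export path, and datastore name are examples; multiple server addresses enable session trunking, and the Kerberos/AES security type is selected at mount time once the host has been joined to Active Directory):

# esxcli storage nfs41 add -H 192.168.1.11,192.168.1.12 -s /exports/ds01 -v nfs41-ds01

# esxcli storage nfs41 list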


Page 50: VMworld 2017 Core Storage

iSCSI Improvements

Page 51: VMworld 2017 Core Storage

iSCSI Enhancements

• iSCSI Routing and Port Binding

– ESXi 6.5 now supports having the iSCSI initiator and the iSCSI target residing in different network subnets with port binding

• UEFI iSCSI Boot

– VMware now supports UEFI (Unified Extensible Firmware Interface) iSCSI Boot on Dell 13th generation servers with Intel x540 dual port Network Interface Card (NIC).
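A sketch of the port-binding side from the command line (adapter and VMkernel port names, addresses, and gateway are examples); the per-vmknic gateway set by the second command is what makes routed, port-bound iSCSI practical in 6.5:

# esxcli iscsi networkportal add -A vmhba64 -n vmk2

# esxcli network ip interface ipv4 set -i vmk2 -t static -I 192.168.10.21 -N 255.255.255.0 -g 192.168.10.1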


Page 52: VMworld 2017 Core Storage

NVMe Support

Page 53: VMworld 2017 Core Storage

NVMe (1 of 2)

• Virtual NVMe Device

– New virtual storage HBA for all-flash SAN/vSAN storage

– New Operating Systems now leverage multiple queues with NVMe devices

• Virtual NVMe device allows VMs to take advantage of such in-guest IO stack improvements

– Improved performance compared to Virtual SATA device on local PCIe SSD devices

• Virtual NVMe device provides 30-50% lower CPU cost per I/O

• Virtual NVMe device achieves 30-80% higher IOPS


Page 54: VMworld 2017 Core Storage

NVMe (2 of 2)

• Supported configuration information of the virtual NVMe device:

– Number of controllers per VM: 4 (enumerated as nvme0, …, nvme3)

– Number of namespaces per controller: 15 (each namespace is mapped to a virtual disk, enumerated as nvme0:0, …, nvme0:15)

– Maximum queues and interrupts: 16 (1 admin + 15 I/O queues)

– Maximum queue depth: 256 (4K in-flight commands per controller)

• Supports NVMe Specification v1.0e mandatory admin and I/O commands

• Interoperability with all existing vSphere features, except SMP-FT


Page 55: VMworld 2017 Core Storage
Page 56: VMworld 2017 Core Storage

Cormac Hogan

[email protected]

@cormacjhogan

Cody Hosterman

[email protected]

@codyhosterman