Pete Koehler and Jeff Hunter STO1178BU #VMworld #STO1178BU vSAN Operations and Management Recommendations, Considerations, Best Practices @vmpete @jhuntervmware VMworld 2017 Content: Not for publication or distribution


Page 1: STO1178BU vSAN Operations and Management

Pete Koehler and Jeff Hunter

STO1178BU

#VMworld #STO1178BU

vSAN Operations and Management – Recommendations, Considerations, Best Practices

@vmpete

@jhuntervmware

Page 2:

Disclaimer

• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Page 3:

Assumptions

You already know the basics of vSAN. If not: vSAN v6.6 - Getting Started Workshop, ELW180801U, Thursday at 9:30 AM

You are ok with presentations and demos that move at a rapid pace

Time for Q&A is not guaranteed

Page 4:

Consideration #1:

Maintain some “slack” space for resyncs when changing storage policies. 25-30% recommended.
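As a rough illustration of the 25-30% guideline, the Python sketch below checks whether a cluster still has enough free capacity. The function names and the 100 TB / 72 TB figures are hypothetical, and treating slack as a simple fraction of raw capacity is an assumption, not a description of how vSAN itself accounts for space.

```python
def slack_fraction(capacity_tb: float, used_tb: float) -> float:
    """Return the fraction of raw capacity that is still free."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return (capacity_tb - used_tb) / capacity_tb


def has_enough_slack(capacity_tb: float, used_tb: float, target: float = 0.25) -> bool:
    """True when free space meets the target slack (0.25 to 0.30 per the slide)."""
    return slack_fraction(capacity_tb, used_tb) >= target


# Hypothetical cluster: 100 TB raw capacity, 72 TB consumed (28% free).
print(has_enough_slack(100.0, 72.0))        # meets the 25% floor
print(has_enough_slack(100.0, 72.0, 0.30))  # misses the stricter 30% target
```

Running a check like this before a large storage policy change gives an early warning that a resync may not have room to complete.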

Page 5:

Recommendation #2:

Review requirements for optimal vSAN cluster size

Page 6:

Need to grow? When to scale up versus when to scale out

• Small cluster
– Fewer data service options
– Larger impact with failure
– Less hardware and fewer licenses
– No additional fault domains for rebuild

• Larger cluster
– More data service options
– Reduced impact with failure
– More hardware and licenses
– Additional fault domains for resiliency

Page 7:

Cases where it makes sense to scale up

• Hosts in cluster originally spec’d for scale up

• Limited rack space, power, cooling

• No budget for additional servers, licensing

• Cluster has sufficient compute power

• Cluster size already provides necessary data services (e.g. RAID-6)

Good news: you can do both with vSAN

[Figure: four rows, each labeled Disk Group 1, Disk Group 2, Future DG]

Page 8:

Recommendation #3:

Avoid Maintenance Mode with the "No Data Evacuation" option where objects are assigned policies with PFTT=0

Page 9:

Caution: Maintenance Mode – No Data Evacuation

Page 10:

Object B has only one component on Host 1

• Object A has PFTT = 1

• Object B has PFTT = 0

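[Figure: Host 1 holds components of Object A and Object B; Host 2 holds a component of Object A; a further component of Object A is shown under Host 3 / Host 4]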

Page 11:

If data is not moved, Object B becomes inaccessible

• Object A has PFTT = 1

• Object B has PFTT = 0

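[Figure: Host 1 holds components of Object A and Object B; Host 2 holds a component of Object A; a further component of Object A is shown under Host 3 / Host 4]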
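The scenario on these two slides can be sketched in a few lines of Python. This is a simplified model, not vSAN's actual placement logic: it assumes an object survives the loss of one host whenever its PFTT is at least 1, and the host names and the placement of Object A's third component on `host4` are hypothetical, since the extracted diagram does not make that placement clear.

```python
from typing import Dict, Set


def inaccessible_objects(
    placement: Dict[str, Set[str]],
    pftt: Dict[str, int],
    host_in_maintenance: str,
) -> Set[str]:
    """Objects that lose access when one host goes offline without data evacuation."""
    lost = set()
    for obj, hosts in placement.items():
        # An object with PFTT=0 has no redundancy, so losing its only host is fatal.
        if host_in_maintenance in hosts and pftt[obj] < 1:
            lost.add(obj)
    return lost


# Hypothetical layout loosely following the slide.
placement = {"A": {"host1", "host2", "host4"}, "B": {"host1"}}
pftt = {"A": 1, "B": 0}

print(inaccessible_objects(placement, pftt, "host1"))  # {'B'}
print(inaccessible_objects(placement, pftt, "host2"))  # set()
```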

Page 12:

Consideration #4:

You must have a sufficient number of fault domains to facilitate Maintenance Mode with the "Evacuate All Data to Other Hosts" option

Page 13:

vSAN maintenance mode considers capacity, not fault domains

[Figure: diagram with four cache devices]

Page 15:

Evacuate all data fails if there are not enough fault domains

Page 16:

Use the "Ensure Data Accessibility" option instead

Page 17:

Design recommendation

Add an additional host/fault domain for rebuilds, data evacuation, etc.

[Figure: diagram with five cache devices]
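The reasoning behind the extra host can be sketched as a small Python check. It assumes the commonly documented vSAN minimums (2n+1 fault domains for mirroring with PFTT=n, 4 for RAID-5, 6 for RAID-6) and treats a full evacuation as possible only when the remaining fault domains still meet that minimum. The function names are hypothetical, and capacity is deliberately ignored here.

```python
def min_fault_domains(pftt: int, erasure_coding: bool = False) -> int:
    """Minimum fault domains a policy needs (assumed vSAN rules, see lead-in)."""
    if erasure_coding:
        if pftt not in (1, 2):
            raise ValueError("erasure coding supports PFTT of 1 (RAID-5) or 2 (RAID-6)")
        return 2 * pftt + 2  # RAID-5 -> 4, RAID-6 -> 6
    return 2 * pftt + 1      # RAID-1 mirroring


def can_evacuate_all_data(fault_domains: int, pftt: int, erasure_coding: bool = False) -> bool:
    """True when the cluster minus the host in maintenance still satisfies the policy."""
    return fault_domains - 1 >= min_fault_domains(pftt, erasure_coding)


print(can_evacuate_all_data(4, 1, erasure_coding=True))  # False: RAID-5 needs all 4 hosts
print(can_evacuate_all_data(5, 1, erasure_coding=True))  # True: the fifth host absorbs the data
```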

Page 18:

Considerations #5:

Enabling deduplication and compression (DD+C) is easy, but requires a rolling upgrade. Drives can be added, but not removed.

Page 19:

Recommendation #6:

Check object status and health when there is a failure in the cluster

Page 23:

Recommendation #7:

View vSAN Network Group Partitions in vSAN Health and Disk Management

Page 26:

Recommendation #8:

Use vSphere Update Manager (VUM) to update drivers

Page 27:

Recommendation #9:

Easily monitor performance in the vSphere Web Client

Page 28:

Monitor environment with vSAN Performance Service

• Purpose-built collector of vSAN performance metrics
• Native to ESXi
• Accessible via GUI (vCenter), CLI, and API
• All common I/O metrics
• Distinguishes between front-end and back-end storage traffic
• Integration with performance diagnostics

[Figure labels: VM/App, Host level, Disk / disk group, Cluster. Badge: NEW in vSAN 6.6.1]

Page 29:

Recommendation #10:

Combine vRealize Operations and vRealize Log Insight for a holistic view and thorough log inspection

Page 30:

Augment vCenter intelligence with analytics from vRealize

[Figure labels: vSphere, vSAN, vSAN Datastore, VCSA]

Analytics with vR Ops

• Monitoring

• Event history

• Alerting

• Insight/Trending

Log Analytics with Log Insight

• Log aggregation

• Log data mining, intelligence

• Root cause analysis

Page 31:

vSAN dashboards in vR Ops provide unique intelligence

• Prebuilt vSAN dashboards with multi-site visibility and analytics

• Fully integrated natively into vRealize Operations 6.6

• Prebuilt dashboards

• Easily customize to expose data you want to see

• Display vSAN and non-vSAN metrics together

[Figure captions: Aggregate cluster statistics; Cluster specific statistics]

Page 32:

Recommendation #11:

Use Remote Access to View ESXi DCUI During Host Restarts

Page 33:

Observe Host Restarts with Remote DCUI

• Restarts of vSAN hosts can take longer than restarts of non-vSAN hosts
– The host processes all log entries to generate metadata tables
– Time taken depends on load and the amount of data in the write buffer
• Use any type of out-of-band management for DCUI access
• Avoid further reboots of a host in this state
• Be patient

Page 34:

Recommendation #12:

Monitor disk usage and remediate an unbalanced cluster in the vSAN Health UI

Page 35:

Monitor disk usage and balance with the Health checks

Proactive rebalancing
• When bringing new or fully evacuated hosts or disks online
• Health alert when greater than 30% delta in usage
• Manual action

Reactive rebalancing
• When single device is greater than 80% full
• Automated action
• May break components into smaller chunks
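The two thresholds above can be expressed as a short Python sketch. The function name and the sample usage figures are hypothetical, and the logic only mirrors the slide's 30% delta and 80% full rules; it is not the algorithm vSAN itself runs.

```python
from typing import List

REACTIVE_FULL_PCT = 80.0    # single device more than 80% full
PROACTIVE_DELTA_PCT = 30.0  # more than 30% difference in usage between devices


def rebalance_status(device_usage_pct: List[float]) -> str:
    """Classify a set of per-device usage percentages against the slide's thresholds."""
    if not device_usage_pct:
        raise ValueError("at least one device is required")
    if max(device_usage_pct) > REACTIVE_FULL_PCT:
        return "reactive rebalance (automated)"
    if max(device_usage_pct) - min(device_usage_pct) > PROACTIVE_DELTA_PCT:
        return "health alert: proactive rebalance (manual)"
    return "balanced"


print(rebalance_status([85.0, 40.0, 42.0]))
print(rebalance_status([20.0, 55.0, 40.0]))
print(rebalance_status([40.0, 45.0, 50.0]))
```

The reactive check comes first because a device above 80% triggers automated action regardless of how evenly the rest of the cluster is used.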

Page 36:

Recommendation #13:

Use vSphere Update Manager to verify the best software versions for your environment and automate the upgrade process

Page 37:

Updating your vSAN cluster with VUM

• New VUM integration

• Simplifies upgrades and patching by verifying hardware compatibility

• Creates HCL aware baseline recommendations, scanning for critical patches or required drivers

• Will only upgrade to the highest level of HCL compatibility

• Automates VUM baseline creation, scanning and delivery of update packages and ISOs.

[Figure labels: VUM, HCL Database, Release Catalog, vmware.com, Custom Baseline, Update to 6.6.1, Update drivers, Remain at 6.2. Badge: NEW in vSAN 6.6.1]

Page 38:

Recommendation #14:

Upgrade vCenter Server before upgrading hosts

Page 39:

Recommendation #15:

When removing a host from a cluster, use maintenance mode with "evacuate all data". Then, remove the disk groups.

Page 40:

Removing a host from a vSAN cluster

• Must be in maintenance mode
• Choose “Evacuate all data to other hosts” when performing Enter Maintenance Mode (EMM)
• Workflow will indicate how much data must be evacuated
• Does NOT remove disk group from host
• Host count impacts capacity and functionality

[Figure labels: vSphere, vSAN; components vmdk, witness, vmdk]

Page 41:

Recommendation #16:

Follow a specific sequence when powering down an entire vSAN cluster

Page 42:

Powering down a vSAN cluster

• Applicable for moving physical cluster, and sustained power outages
• Prevents issues with data in flight
• Slight variation if vCenter server lives in vSAN
– Use embedded host client for power down, EMM
• Process can be reversed for power up

1. VM/App: Shut down all VMs in cluster
2. Host: Place all hosts into Maintenance mode
• Select “No data migration”
• Deselect “move powered off VMs”
3. Cluster: Power off all hosts in cluster
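The ordering on this slide, and the note that the process can be reversed for power up, can be captured in a small Python sketch. The step wording and function names are illustrative only, and the power-up actions are an assumed inverse of each power-down step rather than something the slide spells out.

```python
from typing import List, Tuple

# Each entry pairs a power-down action with its assumed power-up counterpart.
STEPS: List[Tuple[str, str]] = [
    ("shut down all VMs in the cluster", "power on VMs"),
    ("place all hosts into maintenance mode (no data migration)",
     "take all hosts out of maintenance mode"),
    ("power off all hosts in the cluster", "power on all hosts"),
]


def power_down_plan() -> List[str]:
    """Steps in the order shown on the slide."""
    return [down for down, _ in STEPS]


def power_up_plan() -> List[str]:
    """The same sequence reversed, using each step's counterpart."""
    return [up for _, up in reversed(STEPS)]


for number, step in enumerate(power_down_plan(), start=1):
    print(f"down {number}: {step}")
for number, step in enumerate(power_up_plan(), start=1):
    print(f"up {number}: {step}")
```

Keeping both directions in one table makes it harder for a runbook to drift out of order between shutdown and startup.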

Page 43:

Recommendation #17:

Last, but not least…

Configure every host, virtual appliance, and vCenter Server to use the same reliable DNS and NTP sources
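One way to audit this recommendation is to compare the DNS and NTP settings recorded for each node. The Python sketch below assumes the settings have already been collected into a dictionary; the node names, addresses, and the `config_mismatches` helper are all hypothetical, and the first node is arbitrarily used as the reference.

```python
from typing import Dict, List, Sequence


def config_mismatches(nodes: Dict[str, Dict[str, Sequence[str]]]) -> Dict[str, List[str]]:
    """Return, per node, which of 'dns' and 'ntp' differ from the first node's settings."""
    names = list(nodes)
    if not names:
        return {}
    reference = nodes[names[0]]
    mismatches: Dict[str, List[str]] = {}
    for name in names[1:]:
        # Server order is ignored; only the set of configured sources is compared.
        diffs = [key for key in ("dns", "ntp")
                 if sorted(nodes[name][key]) != sorted(reference[key])]
        if diffs:
            mismatches[name] = diffs
    return mismatches


# Hypothetical inventory of hosts and appliances.
inventory = {
    "vcenter": {"dns": ["10.0.0.10"], "ntp": ["10.0.0.20"]},
    "esxi-01": {"dns": ["10.0.0.10"], "ntp": ["10.0.0.20"]},
    "esxi-02": {"dns": ["10.0.0.10"], "ntp": ["10.0.0.99"]},
}

print(config_mismatches(inventory))  # {'esxi-02': ['ntp']}
```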

Page 44:

@vmpete

@jhuntervmware