TRANSCRIPT
Pete Koehler and Jeff Hunter
STO1178BU
#VMworld #STO1178BU
vSAN Operations and Management –Recommendations, Considerations, Best Practices
@vmpete
@jhuntervmware
VMworld 2017 Content: Not for publication or distribution
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
Disclaimer
Assumptions
You already know the basics of vSAN. If not… vSAN v6.6 - Getting Started Workshop ELW180801U, Thursday at 9:30 AM
You are ok with presentations and demos that move at a rapid pace
Time for Q&A is not guaranteed
Consideration:
Maintain some “slack” space for
resyncs when changing storage
policies. 25-30% recommended.
#1
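The slack-space guidance can be turned into a quick capacity check. This is a minimal sketch, assuming the 25-30% figure is measured against raw datastore capacity; the function names and the sample numbers are hypothetical and not part of any vSAN tool.

```python
def usable_capacity_tb(raw_tb, slack_fraction=0.30):
    """Capacity left for data after reserving slack space for resyncs."""
    if not 0.0 <= slack_fraction < 1.0:
        raise ValueError("slack_fraction must be in [0, 1)")
    return raw_tb * (1.0 - slack_fraction)


def has_enough_slack(raw_tb, used_tb, slack_fraction=0.25):
    """True if free space is at least the requested fraction of raw capacity."""
    free_tb = raw_tb - used_tb
    return free_tb >= raw_tb * slack_fraction


# Hypothetical 100 TB datastore with 72 TB consumed.
print(usable_capacity_tb(100))      # capacity to plan for at 30% slack
print(has_enough_slack(100, 72))    # checks against the 25% lower bound
```

Checking against the lower bound (25%) and planning against the upper bound (30%) leaves some margin before a policy change triggers a large resync.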
Recommendation:
Review requirements for optimal
vSAN cluster size
#2
Need to grow? When to scale up versus when to scale out
• Small cluster
– Fewer data service options
– Larger impact with failure
– Less hardware and licenses
• Larger cluster
– More data service options
– Reduced impact with failure
– More hardware and licenses
No additional fault domains for rebuild
Additional fault domains for resiliency
Cases where it makes sense to scale up
• Hosts in cluster originally spec’d for scale up
• Limited rack space, power, cooling
• No budget for additional servers, licensing
• Cluster has sufficient compute power
• Cluster size already provides necessary data services (e.g. RAID-6)
Good news: you can do both with vSAN
[Figure: each host has Disk Group 1, Disk Group 2, and room for a future disk group]
Recommendation:
Avoid Maintenance Mode – No
Data Evacuation – where objects
are assigned policies with
PFTT=0
#3
Caution: Maintenance Mode – No Data Evacuation
Object B has only one component on Host 1
• Object A has PFTT = 1
• Object B has PFTT = 0
[Figure: Hosts 1 to 4. Host 1 holds a component of Object A and the only component of Object B; the other components of Object A are on other hosts.]
If data is not moved, Object B becomes inaccessible
• Object A has PFTT = 1
• Object B has PFTT = 0
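The effect of taking Host 1 offline without moving data can be modeled in a few lines. This is a simplified sketch, not how vSAN itself evaluates accessibility: it assumes one vote per component and treats an object as accessible when more than half of its components are online. The host names and the exact placement of Object A are hypothetical.

```python
def accessible(component_hosts, offline_hosts):
    """Simplified quorum rule: more than half of the components must be online."""
    online = sum(1 for host in component_hosts if host not in offline_hosts)
    return online * 2 > len(component_hosts)


# Hypothetical placement modeled on the slide.
placement = {
    "A": ["host1", "host2", "host4"],  # PFTT=1: two replicas plus a witness
    "B": ["host1"],                    # PFTT=0: a single component
}


def inaccessible_after_maintenance(host, placement):
    """Objects that lose accessibility if `host` goes offline with no data moved."""
    return sorted(
        name for name, hosts in placement.items()
        if not accessible(hosts, {host})
    )


print(inaccessible_after_maintenance("host1", placement))
```

Object A keeps two of its three components when Host 1 is offline, while Object B loses its only component.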
Consideration:
You must have a sufficient
number of fault domains to
facilitate Maintenance Mode –
Evacuate All Data to Other Hosts
#4
vSAN maintenance mode considers capacity, not fault domains
Evacuate all data fails if there are not enough fault domains
Use Ensure Data Accessibility option instead
Design recommendation
Add an additional host/fault domain for rebuilds, data evacuation, etc.
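The "one extra fault domain" rule can be expressed as a small check. This sketch assumes the commonly documented minimums (RAID-1 needs 2n+1 fault domains for PFTT=n, RAID-5 needs 4, RAID-6 needs 6); the function names are hypothetical, and capacity is deliberately left out because it is a separate check.

```python
def min_fault_domains(pftt, erasure_coding=False):
    """Minimum fault domains needed to place an object with the given policy."""
    if pftt < 0:
        raise ValueError("pftt must be >= 0")
    if erasure_coding:
        # RAID-5 (PFTT=1) needs 4 fault domains, RAID-6 (PFTT=2) needs 6.
        if pftt not in (1, 2):
            raise ValueError("erasure coding supports PFTT=1 or PFTT=2 only")
        return 2 * pftt + 2
    # RAID-1 mirroring: n+1 replicas plus n witnesses.
    return 2 * pftt + 1


def can_evacuate_all_data(fault_domains, pftt, erasure_coding=False):
    """With one fault domain in maintenance mode, the rest must still satisfy the policy."""
    return fault_domains - 1 >= min_fault_domains(pftt, erasure_coding)


print(can_evacuate_all_data(3, pftt=1))   # 3-host cluster, RAID-1
print(can_evacuate_all_data(4, pftt=1))   # one additional host
```

A cluster sized at exactly the policy minimum has nowhere to rebuild components, which is why full evacuation fails there even when free capacity is available.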
Considerations:
Enabling deduplication and compression (DD+C) is easy, but requires a
rolling upgrade. Drives can be
added, but not removed.
#5
Recommendation:
Check object status and health
when there is a failure in the
cluster
#6
Recommendation:
View vSAN Network Group
Partitions in vSAN Health and
Disk Management
#7
Recommendation:
Use vSphere Update Manager
(VUM) to update drivers
#8
Recommendation:
Easily monitor performance in
vSphere Web Client
#9
Monitor environment with vSAN Performance Service
• Purpose built collector of vSAN performance metrics
• Native to ESXi
• Accessible via GUI (vCenter), CLI, and API
• All common I/O metrics
• Distinguishes between front end and back end storage traffic
• Integration with performance diagnostics
[Figure labels: VM/App, Host level, Disk / disk group, Cluster. Callout: NEW in vSAN 6.6.1]
Recommendation:
Combine vRealize Operations
and vRealize Log Insight for
holistic view and thorough log
inspection
#10
Augment vCenter intelligence with analytics from vRealize
[Figure labels: vSphere, vSAN, vSAN Datastore, VCSA]
Analytics with vR Ops
• Monitoring
• Event history
• Alerting
• Insight/Trending
Log Analytics with Log Insight
• Log aggregation
• Log data mining, intelligence
• Root cause analysis
vSAN dashboards in vR Ops provide unique intelligence
• Prebuilt vSAN dashboards with multi-site visibility and analytics
• Fully integrated natively into vRealize Operations 6.6
• Prebuilt dashboards
• Easily customize to expose data you want to see
• Display vSAN and non-vSAN metrics together
Aggregate cluster statistics
Cluster specific statistics
Recommendation:
Use Remote Access to View ESXi DCUI During Host Restarts
#11
Observe Host Restarts with Remote DCUI
• vSAN host restarts can take longer than non-vSAN hosts
– Processes all log entries to generate metadata tables
– Time taken depends on load and amount of data in write buffer
• Use any type of out of band management for DCUI access
• Further reboots of host in this state should be avoided
• Be patient
Recommendation:
Monitor disk usage and
remediate an unbalanced cluster
in the vSAN Health UI
#12
Monitor disk usage and balance with the Health checks
Proactive rebalancing
• When bringing new or fully evacuated hosts or disks online
• Health alert when greater than 30% delta in usage
• Manual action
Reactive rebalancing
• When single device is greater than 80% full
• Automated action
• May break components into smaller chunks
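The two thresholds above can be sketched as a simple classifier. This is an illustration only: it assumes "delta in usage" means the gap between the most-used and least-used device, and the function name, labels, and device names are hypothetical.

```python
def rebalance_triggers(disk_usage_pct):
    """Return which rebalancing conditions apply, given per-device usage in percent."""
    triggers = []
    if not disk_usage_pct:
        return triggers
    usages = list(disk_usage_pct.values())
    # Reactive: any single device above 80% full (automated by vSAN).
    if max(usages) > 80:
        triggers.append("reactive")
    # Proactive: health alert when usage differs by more than 30% (manual action).
    if max(usages) - min(usages) > 30:
        triggers.append("proactive")
    return triggers


print(rebalance_triggers({"disk1": 70, "disk2": 30}))
print(rebalance_triggers({"disk1": 85, "disk2": 60}))
```

The first call models a cluster after a new, empty disk comes online; the second models a single device filling up while the rest stay within 30% of it.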
Recommendation:
Use vSphere Update Manager to
verify the best software versions
for your environment and
automate the upgrade process
#13
Updating your vSAN cluster with VUM
• New VUM integration
• Simplifies upgrades and patching by verifying hardware compatibility
• Creates HCL-aware baseline recommendations, scanning for critical patches or required drivers
• Will only upgrade to the highest level of HCL compatibility
• Automates VUM baseline creation, scanning and delivery of update packages and ISOs.
[Figure labels: VUM, HCL Database, Release Catalog, vmware.com. Custom baselines: "Update to 6.6.1", "Update drivers", "Remain at 6.2". Callout: NEW in vSAN 6.6.1]
Recommendation:
Upgrade vCenter Server before
upgrading hosts
#14
Recommendation:
When removing a host from a
cluster, use maintenance mode –
evacuate all data. Then, remove
the disk groups.
#15
Removing a host from a vSAN cluster
• Must be in maintenance mode
• Choose “Evacuate all data to other hosts” when performing EMM
• Workflow will indicate how much data must be evacuated
• Does NOT remove disk group from host
• Host count impacts capacity and functionality
Recommendation:
Follow a specific sequence when
powering down an entire vSAN
cluster
#16
Powering down a vSAN cluster
• Applicable for moving physical cluster, and sustained power outages
• Prevents issues with data in flight
• Slight variation if vCenter server lives in vSAN
– Use embedded host client for power down, EMM
• Process can be reversed for power up
Levels: VM/App, Host, Cluster
1. Shut down all VMs in cluster
2. Place all hosts into maintenance mode
   • Select “No data migration”
   • Deselect “move powered off VMs”
3. Power off all hosts in cluster
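The shutdown order, and its reversal for power up, can be captured as a small runbook. This is a sketch for documentation or checklist tooling, not an automation script: the power-up wording is an assumption (the slide only says the process can be reversed), and the names are hypothetical.

```python
# Each entry pairs a shutdown step with its assumed power-up counterpart.
RUNBOOK = [
    ("Shut down all VMs in cluster",
     "Power on VMs"),
    ('Place all hosts into maintenance mode ("No data migration", powered-off VMs not moved)',
     "Take all hosts out of maintenance mode"),
    ("Power off all hosts in cluster",
     "Power on all hosts in cluster"),
]


def plan(direction):
    """Return the ordered steps for powering the cluster 'down' or 'up'."""
    if direction == "down":
        return [down for down, _ in RUNBOOK]
    if direction == "up":
        # Power up is the shutdown sequence walked in reverse.
        return [up for _, up in reversed(RUNBOOK)]
    raise ValueError("direction must be 'down' or 'up'")


for number, step in enumerate(plan("down"), start=1):
    print(number, step)
```

Keeping both directions in one table makes it harder for the power-up checklist to drift out of sync with the shutdown checklist.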
Recommendation:
Last, but not least…
Configure every host, virtual
appliance, and vCenter Server to
use the same reliable DNS and
NTP sources
#17
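A consistency check for DNS and NTP settings might look like the following. This is a minimal sketch that works on an inventory you have already collected; the data layout, node names, and addresses are hypothetical, and it treats the most common setting as the intended one.

```python
from collections import Counter


def find_outliers(configs):
    """Return {node: [settings]} for nodes whose DNS or NTP servers differ from the majority."""
    outliers = {}
    for key in ("dns", "ntp"):
        # Compare server lists order-independently.
        values = {node: tuple(sorted(cfg[key])) for node, cfg in configs.items()}
        if not values:
            continue
        majority, _ = Counter(values.values()).most_common(1)[0]
        for node, value in values.items():
            if value != majority:
                outliers.setdefault(node, []).append(key)
    return outliers


# Hypothetical inventory: two hosts and a vCenter Server appliance.
inventory = {
    "esx01": {"dns": ["10.0.0.2"], "ntp": ["10.0.0.3"]},
    "esx02": {"dns": ["10.0.0.2"], "ntp": ["10.0.0.3"]},
    "vcsa":  {"dns": ["10.0.0.2"], "ntp": ["192.0.2.50"]},
}
print(find_outliers(inventory))
```

Running a check like this after adding a host or appliance catches drift before it shows up as time skew or name resolution failures.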
@vmpete
@jhuntervmware