vmworld 2015: troubleshooting for vsphere 6

Troubleshooting for vSphere 6 Jamie Rawson, VMware, Inc VAPP6257 #VAPP6257


Post on 23-Jan-2018


Page 1: VMworld 2015: Troubleshooting for vSphere 6

Troubleshooting for vSphere 6

Jamie Rawson, VMware, Inc

VAPP6257

#VAPP6257

Page 2: VMworld 2015: Troubleshooting for vSphere 6

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Page 3: VMworld 2015: Troubleshooting for vSphere 6

The Percentage of Applications in Virtualized Infrastructure Has Increased Dramatically over the Last Few Years (VMware Core Metrics Survey July 2015)

Microsoft SQL is the most common application running in on-premises virtual infrastructure

Applications in Virtualized Infrastructure (Select all that apply), by Region (NA, EU, AP, BRIC) and Company Size (SMB, COMM, ENT)

| Application | Total | NA | EU | AP | BRIC | SMB | COMM | ENT |
|---|---|---|---|---|---|---|---|---|
| Microsoft SQL | 67% | 57% | 73% | 70% | 74% | 68% | 71% | 64% |
| Microsoft SharePoint | 49% | 47% | 51% | 39% | 56% | 43% | 51% | 54% |
| SAP | 46% | 41% | 43% | 46% | 61% | 36% | 46% | 57% |
| Microsoft Exchange | 45% | 45% | 54% | 37% | 41% | 43% | 49% | 46% |
| Oracle Databases | 42% | 34% | 38% | 59% | 51% | 37% | 39% | 48% |
| Oracle Applications | 29% | 26% | 27% | 32% | 37% | 24% | 34% | 33% |
| High Performance Computing | 28% | 25% | 30% | 23% | 35% | 16% | 30% | 39% |
| Custom BCA/industry-specific | 25% | 29% | 16% | 31% | 27% | 22% | 22% | 30% |
| Oracle Middleware | 22% | 15% | 23% | 30% | 28% | 19% | 24% | 25% |
| IBM Middleware | 21% | 15% | 22% | 22% | 30% | 17% | 21% | 25% |

Level of Criticality of Applications in Virtualized Infrastructure (Select all that apply)

| Criticality | Total | NA | EU | AP | BRIC | SMB | COMM | ENT |
|---|---|---|---|---|---|---|---|---|
| Business critical | 66% | 71% | 62% | 62% | 64% | 65% | 64% | 68% |
| Important | 51% | 48% | 54% | 49% | 55% | 50% | 51% | 53% |
| Development | 49% | 51% | 45% | 49% | 49% | 44% | 49% | 53% |
| Test | 38% | 36% | 35% | 39% | 46% | 37% | 40% | 37% |
| Staging | 19% | 20% | 15% | 20% | 26% | 15% | 17% | 25% |
| N | 1603 | 600 | 450 | 230 | 323 | 653 | 346 | 604 |

Legend in the original chart: > Total, < Total

Page 4: VMworld 2015: Troubleshooting for vSphere 6

Virtualizing Applications Sessions and Offerings

• 30 Breakout Sessions with 5 Panels & 4 Quick Talks

• 10 Group Discussions

• One-on-One Meet the Experts Sessions

• Check out the Hands-on Labs

Sign up for the Independent Oracle User Group (IOUG) VMware Special Interest Group (SIG): www.ioug.org/vmware

Page 5: VMworld 2015: Troubleshooting for vSphere 6

RDBMS Books from VMware Press

Book signing @ 1PM Tuesday Sept 1

vmwarepress.com

http://www.pearsonitcertification.com/store/virtualizing-oracle-databases-on-vsphere-9780133570182

http://www.pearsonitcertification.com/store/virtualizing-sql-server-with-vmware-doing-it-right-9780321927750


Page 6: VMworld 2015: Troubleshooting for vSphere 6

This Presentation Will Discuss

1 Use of the VMware vSphere® Web Client, the vSphere command-line interface (vCLI), the ESXi Shell, and log files to diagnose and correct problems in vSphere

2 Troubleshooting networking issues

3 Troubleshooting storage issues

4 Troubleshooting vSphere HA cluster issues

5 Troubleshooting VMware vSphere® vMotion® issues


Page 7: VMworld 2015: Troubleshooting for vSphere 6

Scope

This presentation covers portions of the Troubleshooting Workshop.

Please note: The VMware vSphere: Troubleshooting Workshop [6.0] is currently under development, and does not appear on our schedule at this time. Certain troubleshooting topics are addressed in the VMware vSphere: Fast Track [6.0] course.

For more information, go to http://vmware.com/education

Page 8: VMworld 2015: Troubleshooting for vSphere 6

Troubleshooting Overview

Page 9: VMworld 2015: Troubleshooting for vSphere 6

Definition of a System Problem

A system problem is a fault in a system, or one of its components, that negatively affects the services needed for normal production.

Problems can arise from numerous sources, which include:

• Configuration Issues

• Resource Contention

• Network Attacks

• Software Bugs

• Hardware Failures

This presentation addresses problems caused by configuration and operational issues.

Page 10: VMworld 2015: Troubleshooting for vSphere 6

Effect of a System Problem

These problems can affect certain aspects of a system:

• Usability

• Accuracy

• Reliability

• Performance

• These perceived effects, or symptoms, are what is generally exposed and reported

• Although performance is a predominant symptom in reported problems, this presentation does not focus on problems that cause performance issues

– Performance troubleshooting is covered in the VMware vSphere: Optimize and Scale [6.0] course

Page 11: VMworld 2015: Troubleshooting for vSphere 6

Troubleshooting Process

Troubleshooting is a systematic approach to identifying the root cause of a problem from the reported symptom

The troubleshooting process consists of the following tasks:

Defining the problem

• Identifying symptoms

• Gathering information

Identifying the cause of the problem

• Identifying possible causes

• Determining the root cause

Resolving the problem

• Identifying possible solutions

• Implementing the best solution

Page 12: VMworld 2015: Troubleshooting for vSphere 6

Gathering Information about the Problem

1 Reproduce the problem. This provides a repeatable means to verify the problem as well as a way to validate that the problem was resolved.

2 Identify the scope of the problem: Does the problem affect only one object, or multiple objects?

3 Gather additional information: Was the task working before?

• If so, what changed in your environment or configuration?

4 Consult references, such as product release notes, to determine whether the problem is a known problem.

5 Use your existing knowledge of your system’s configuration to help you determine the cause of the problem.

Page 13: VMworld 2015: Troubleshooting for vSphere 6

Viewing and Interpreting Diagnostic Information

View diagnostic messages displayed in the GUI or written to log files

Example: Error powering on a virtual machine

Interpret the diagnostic messages to focus your troubleshooting efforts

Page 14: VMworld 2015: Troubleshooting for vSphere 6

Identifying Possible Causes

• A structured approach to troubleshooting enables you to determine the root cause quickly and effectively

• Based on the problem’s characteristics, take one of the following troubleshooting approaches:

– Top-down: Investigate cause top-down

– Bottom-up: Investigate cause bottom-up

– By halves: Approach the cause by halves

Layers, from top to bottom:

• Application/Guest OS

• Virtual Machine

• ESXi Host

• Hardware (CPU, memory, network, storage)
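
The three approaches differ mainly in the order in which the layers are examined. The following is a minimal Python sketch, not part of the presentation: the function names and the `is_healthy` callback are hypothetical, and the by-halves search assumes that a fault in one layer makes every layer above it look unhealthy as well.

```python
# Layers from the slide, listed top (closest to the user) to bottom.
LAYERS = ["Application/Guest OS", "Virtual Machine", "ESXi Host", "Hardware"]


def inspection_order(approach):
    """Return the order in which layers are examined for a linear approach."""
    if approach == "top-down":
        return list(LAYERS)
    if approach == "bottom-up":
        return list(reversed(LAYERS))
    raise ValueError(f"unknown approach: {approach}")


def lowest_faulty_layer(is_healthy):
    """Approach by halves: binary search for the lowest unhealthy layer.

    is_healthy(layer) -> bool is supplied by the caller (hypothetical check).
    Assumes a fault in one layer makes all layers above it appear unhealthy.
    Returns None when every layer is healthy.
    """
    bottom_up = list(reversed(LAYERS))
    lo, hi = 0, len(bottom_up)          # search window [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_healthy(bottom_up[mid]):
            lo = mid + 1                # fault must be above this layer
        else:
            hi = mid                    # fault is at this layer or below
    return bottom_up[lo] if lo < len(bottom_up) else None
```

With four layers the by-halves search needs at most three checks, whereas a linear walk can need four; the saving grows as more components are added to the stack.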

Page 15: VMworld 2015: Troubleshooting for vSphere 6

Resolving the Problem

• After identifying the root cause, assess the impact of the problem on operations:

– High impact – Resolve immediately

– Medium impact – Resolve when possible

– Low impact – Resolve during next maintenance window

• Identify possible solutions to resolve the problem

– Short-term solution – Workaround

– Long-term solution – Reconfiguration

– Impact analysis – Assess the impact of the solution on operations

• Implement the solution


Page 16: VMworld 2015: Troubleshooting for vSphere 6

Example: Defining the Problem

Scenario:

You attempt to migrate the virtual machine named VM01 from the host named esxi01 to the host named esxi02. After waiting a couple of minutes, the VMware vSphere® vMotion® migration fails with an error.

Is this a vSphere vMotion problem or a symptom of an underlying problem?

• The error message is the starting point

Page 17: VMworld 2015: Troubleshooting for vSphere 6

Example: Gathering Information

Error messages can help determine the problem


Page 18: VMworld 2015: Troubleshooting for vSphere 6

Example: Identifying Possible Causes

Use the information you gathered to identify possible causes:

• Based on error messages, the vSphere vMotion migration failed because esxi01 and esxi02 failed to connect over the network named “vMotion.”

• This error points you to a possible misconfiguration on the ESXi host

Possible Causes

• vMotion is misconfigured

• Network connectivity is down on one of the ESXi hosts

• vMotion VMkernel interface connectivity is down on one of the ESXi hosts

Layers, from top to bottom:

• Application/Guest OS

• Virtual Machine

• ESXi Host

• Hardware (CPU, memory, network, storage)

Page 19: VMworld 2015: Troubleshooting for vSphere 6

Example: Determining the Root Cause

Flowchart:

Start here: ping esxi02

• Success? No: Fix network configuration to get a successful ping, then perform vMotion migration

– Success? Yes: Root cause identified

– Success? No: Further investigation necessary

• Success? Yes: Test next possible cause: ping 172.20.13.52

– Success? No: Fix VMkernel configuration to get a successful ping, then perform vMotion migration (Success? Yes: Root cause identified. No: Further investigation necessary)

– Success? Yes: Further investigation necessary
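
The flowchart can be read as a loop over possible causes: test each one, fix the first that fails, and retry the migration. The sketch below is one possible reading in Python, not the presenter's procedure. The `ping`, `fix`, and `migrate` callables are hypothetical stand-ins supplied by the caller, and treating 172.20.13.52 as the vMotion VMkernel address is an assumption based on the flowchart labels.

```python
def diagnose_vmotion(ping, fix, migrate):
    """Walk the two connectivity checks from the flowchart.

    ping(target) -> bool, fix(what) -> None, and migrate() -> bool are
    caller-supplied callables; nothing here talks to a real host.
    """
    checks = [
        ("esxi02", "network configuration"),         # destination host name
        ("172.20.13.52", "VMkernel configuration"),  # assumed vMotion VMkernel IP
    ]
    for target, what_to_fix in checks:
        if ping(target):
            continue  # this cause is ruled out, test the next possible cause
        fix(what_to_fix)
        if migrate():
            return f"root cause identified: {what_to_fix}"
        return "further investigation necessary"
    # Every ping succeeded, so connectivity is not the cause.
    return "further investigation necessary"
```

Passing the checks in as callables keeps the decision logic separate from how a ping or a migration is actually performed, so the same flow can be exercised against recorded results.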

Page 20: VMworld 2015: Troubleshooting for vSphere 6

Example: Resolving the Problem

In this example, suppose that the root cause was an incorrect IP address for the vSphere vMotion VMkernel interface on esxi02

1 Assess the impact of the problem on operations

• Probably high impact

− The problem affects any virtual machine that is migrated to esxi02.

− The problem also affects the proper operation of VMware vSphere® Distributed Resource Scheduler™

2 Identify possible solutions to resolve the problem

• Short-term accommodation – Do not migrate virtual machines to esxi02

• Long-term solution – Fix the IP address of esxi02’s vMotion VMkernel interface

3 Implementing the solution should not require downtime

Page 21: VMworld 2015: Troubleshooting for vSphere 6

Command-line Troubleshooting Tools

Page 22: VMworld 2015: Troubleshooting for vSphere 6

Command-line Troubleshooting Tools

Ways to obtain command-line access on a VMware® ESXi™ host:

VMware vSphere® ESXi™ Shell, which includes:

• esxcli commands

• A set of esxcfg-* commands

• A set of commands for troubleshooting

vSphere Management Assistant

• Includes the VMware vSphere® Command-Line Interface (vCLI) package

Page 23: VMworld 2015: Troubleshooting for vSphere 6

vSphere ESXi Shell

vSphere ESXi Shell can be accessed:

Locally, from the direct console user interface (DCUI)

To access the local vSphere ESXi Shell:

• Enable the local vSphere ESXi Shell from the DCUI or from the VMware vSphere® Web Client

• Access vSphere ESXi Shell from the main DCUI screen by pressing Alt+F1 to open a console window to the host

Remotely, from an SSH session

To access the remote vSphere ESXi Shell:

• Enable the vSphere ESXi Shell and SSH services

• Use an SSH client, such as PuTTY, to access vSphere ESXi Shell

• Disable the SSH service when you are no longer using it

Page 24: VMworld 2015: Troubleshooting for vSphere 6

vSphere Management Assistant

vSphere Management Assistant is a virtual appliance that includes the following:

vCLI command set

• Enables you to run system administration commands to manage ESXi hosts

• Requires credential connection options to a server

vi-fastpass authentication component

• Automates authentication to VMware® vCenter Server™ system or ESXi host targets

• Prevents the user from having to continually add login credentials to every command being executed

Reference: https://www.vmware.com/support/developer/vcli/vcli60/vsp6_60_vcli_relnotes.html
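
To illustrate what "credential connection options" look like in practice, here is a small Python sketch that assembles the argument list for a vCLI command. The helper name `vcli_argv` and the host and user names are hypothetical. `--server`, `--username`, and `--vihost` are standard vCLI connection options, but check the vCLI documentation for your release before relying on them.

```python
def vcli_argv(command, args, server, username, vihost=None):
    """Build the argument list for a vCLI command with connection options.

    No password is placed on the command line; vCLI prompts for one when it
    is not supplied, and vi-fastpass can remove the need for these options.
    """
    argv = [command, "--server", server, "--username", username]
    if vihost is not None:
        # --vihost names the ESXi host when --server is a vCenter Server system
        argv += ["--vihost", vihost]
    return argv + list(args)


# Hypothetical target: list VMkernel interfaces on one host.
example = vcli_argv("vicfg-vmknic", ["-l"], server="esxi01.example.com", username="root")
```

The resulting list can be passed to `subprocess.run` on a system where vCLI is installed.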

Page 25: VMworld 2015: Troubleshooting for vSphere 6

Esxcli Command

The esxcli command offers the following namespaces, as well as many new esxcli functions added in the vSphere 6.0 release

Page 26: VMworld 2015: Troubleshooting for vSphere 6

Example: Viewing vSphere Network Information


esxcli network

• Physical and virtual network information can be displayed by using esxcli network commands

Page 27: VMworld 2015: Troubleshooting for vSphere 6

Example: Viewing Standard Switch Information

esxcli network vswitch standard

• Standard switches can be created by using the esxcli network vswitch standard command structure

Page 28: VMworld 2015: Troubleshooting for vSphere 6

vicfg-* Commands

vicfg-*

• Commands with the vicfg- prefix enable you to manage your storage, network, and host configuration

• For example, to display IP information of your VMkernel interfaces:

− vicfg-vmknic -l

Page 29: VMworld 2015: Troubleshooting for vSphere 6

vmware-cmd Command


vmware-cmd

• The vmware-cmd command is exclusively used for virtual machines

Page 30: VMworld 2015: Troubleshooting for vSphere 6

Example: Viewing Virtual Machine Information


vmware-cmd -l

• Lists the virtual machines that are located on the target host. Lists virtual machines by path to the .vmx file

Page 31: VMworld 2015: Troubleshooting for vSphere 6

Useful ESXi Host Logs for Troubleshooting

ESXi hosts write to multiple log files, depending on which action is being performed

| Log file | Purpose |
|---|---|
| hostd.log | Host management service logs |
| syslog.log | Management service initialization, watchdogs, scheduled tasks, and DCUI use |
| vmkernel.log | Core VMkernel logs, including device discovery, storage and networking device and driver events, and virtual machine startups |
| vmkwarning.log | A summary of warning and alert log messages excerpted from the VMkernel logs |
| vmksummary.log | A summary of ESXi host startup and shutdown, and an hourly heartbeat with uptime, number of virtual machines running, and service resource consumption |

Page 32: VMworld 2015: Troubleshooting for vSphere 6

Viewing Log Files by Using the vSphere Web Client

The vSphere Web Client can be used to view and search log files on vCenter Server systems and ESXi hosts

Page 33: VMworld 2015: Troubleshooting for vSphere 6

Improved Audit Trail of ESXi Administrative Tasks

In vSphere 5, ESXi hosts log actions taken by named VMware vCenter Server™ users under the vpxuser account, so the host logs do not show which vCenter Server user performed an action.

Page 34: VMworld 2015: Troubleshooting for vSphere 6

Viewing Log Files by Using the DCUI

The Direct Console User Interface (DCUI) can be used if vCenter Server is not available.

With the DCUI, only the log files for a single ESXi host can be viewed.

Page 35: VMworld 2015: Troubleshooting for vSphere 6

Tip: Location of VMware vCenter Server 6.0 log files

Tip

• The VMware vCenter Server 6.0 logs are located in the %ALLUSERSPROFILE%\VMWare\vCenterServer\logs folder

• The VMware vCenter Server Appliance 6.0 logs are located in the /var/log/vmware/ folder

Recommendation

• See http://kb.vmware.com/kb/2110014 for full details

Page 36: VMworld 2015: Troubleshooting for vSphere 6

Tip: Collecting Diagnostic Information for VMware vCenter Server 4.x, 5.x and 6.0

Tip

• VMware Technical Support routinely requests diagnostic information from you when a support request is handled. This diagnostic information contains product-specific logs and configuration files from the host on which the product is run. The information is gathered using a specific script or tool for each product

Recommendation

• See http://kb.vmware.com/kb/1011641 for full details

Page 37: VMworld 2015: Troubleshooting for vSphere 6

Collecting Diagnostic Data for VMware Technical Support

Methods for collecting diagnostic information to send to VMware Technical Support include the following:

• Use the GUI to export files to a log bundle.

− vSphere Client or vSphere Web Client

• Use the vm-support command to collect information from an individual ESXi host

Page 38: VMworld 2015: Troubleshooting for vSphere 6

Network Troubleshooting

Page 39: VMworld 2015: Troubleshooting for vSphere 6

Networking Overview

In vSphere, networking problems can occur with various types of connectivity:

• Virtual machine network connectivity

• VMware® ESXi™ host management network connectivity

• Virtual switch connectivity

− Standard switches

− Distributed switches

Page 40: VMworld 2015: Troubleshooting for vSphere 6

Example: Network Issue

The ESXi host has intermittent or no network connectivity to other systems

Initial check:

From the ESXi local console (command prompt at the DCUI), ping a system that is known to be up and accessible by the ESXi host

Page 41: VMworld 2015: Troubleshooting for vSphere 6

Host Networking Rollback

Rollback enables you to roll back to a previous valid configuration

If an invalid configuration occurs, one or more hosts might be out of synchronization with the distributed switch

One type of rollback is the host networking rollback

• Triggered when a network configuration change is made that disconnects the host

Examples of events that might trigger a host networking rollback:

• Updating DNS and routing settings

• Updating the speed or duplex of a physical NIC

• Changing the IP settings of a management VMkernel network adapter

• Updating teaming and failover policies to a port group that contains the management VMkernel network adapter

Page 42: VMworld 2015: Troubleshooting for vSphere 6

Distributed Switch Rollback

If an invalid configuration occurs, one or more hosts might be out of synchronization with the distributed switch

The other type of rollback is the distributed switch rollback

• Triggered when invalid updates are made to distributed switch-related objects

Examples of events that might trigger a distributed switch rollback:

• Changing the MTU of a distributed switch

• Changing the following settings in the distributed port group of the management VMkernel network adapter:

− NIC teaming and failover

− VLAN

− Traffic shaping

Page 43: VMworld 2015: Troubleshooting for vSphere 6

Recovering from a Distributed Switch Misconfiguration

1 Always back up your distributed switch after you make a change to its configuration

2 If your distributed switch loses network connectivity because of a misconfiguration, you can restore from your latest backup

3 The export, restore, and import functions are available with the vSphere Web Client

The vSphere Web Client provides you with the following features:

• Export − Back up your distributed switch configuration

• Restore − Reset the configuration of an existing distributed switch from an exported configuration file

• Import − Create a new distributed switch from an exported configuration file

Page 44: VMworld 2015: Troubleshooting for vSphere 6

Storage Troubleshooting

Page 45: VMworld 2015: Troubleshooting for vSphere 6

Example: Storage Issue

IP storage is not reachable by an ESXi host

Initial checks:

• Verify that the ESXi host can see the LUN.

− esxcli storage core path list

• Check whether a rescan restores visibility to the LUNs.

− esxcli storage core adapter rescan -A <vmhba##>

Page 46: VMworld 2015: Troubleshooting for vSphere 6

Example: Storage Issue

One or more paths to a LUN are lost

Initial checks:

• Find detailed information regarding LUN paths:

− esxcli storage core path list

• List LUN multipathing information:

− esxcli storage nmp device list

• Check whether a rescan restores visibility to the LUNs.

− esxcli storage core adapter rescan -A <vmhba##>
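
The initial storage checks can be collected into a repeatable sequence. The Python sketch below only assembles the esxcli commands named on the slides and hands them to a runner; the function names are hypothetical and the adapter name `vmhba33` is a placeholder, so substitute the adapter reported on your host.

```python
import subprocess


def storage_check_commands(adapter):
    """Return the initial storage checks from the slides as argument lists."""
    return [
        ["esxcli", "storage", "core", "path", "list"],    # detailed LUN path information
        ["esxcli", "storage", "nmp", "device", "list"],   # multipathing information
        ["esxcli", "storage", "core", "adapter", "rescan", "-A", adapter],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order; the runner is injectable for dry runs."""
    for argv in commands:
        runner(argv)


# Dry run: print the commands instead of executing them.
run_all(storage_check_commands("vmhba33"), runner=lambda argv: print(" ".join(argv)))
```

The rescan is placed last so that the path and multipathing listings capture the state before anything changes.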

Page 47: VMworld 2015: Troubleshooting for vSphere 6

Identifying Possible Causes

If you see errors in /var/log/vmkernel.log that refer to a permanent device loss (PDL) or all paths down (APD) condition, then take a bottom-up approach to troubleshooting

Possible Causes

• For iSCSI storage, NIC teaming is misconfigured

• The path selection policy for a storage device is misconfigured

• A PDL condition has occurred

• An APD condition has occurred

Layers involved:

• ESXi Host

• Hardware: Storage Network, Storage Array

Page 48: VMworld 2015: Troubleshooting for vSphere 6

Issue: Virtual Machines on an NFS 4.1 Datastore Fail after the NFS 4.1 Share Recovers from an APD State

Symptom and Cause

• This issue occurs because NFS v3 and NFS v4.1 are different protocols with different behaviors. After the grace period (which differs depending on vendor), the NFS server flushes the client state

• The following error is displayed:

error: The lock protecting VM.vmdk has been lost

Solution

• See http://kb.vmware.com/kb/2089321 for full details

Page 49: VMworld 2015: Troubleshooting for vSphere 6

Issue: LUNs Attached to VMware vSphere 6.0 Hosts May Remain in APD Timeout State after Paths Have Recovered

Symptom and Cause

• When an APD event occurs, LUNs connected to ESXi may remain inaccessible after paths to the LUNs recover. The 140-second APD timeout expires even though paths to storage have recovered

• This issue has been confirmed in all releases of ESXi 6.0 and is due to a fault in APD handling. When this issue occurs, a LUN has paths available and is online following an APD event, but the APD timer continues counting up until ultimately the LUN enters APD Timeout state. After the initial APD event, the datastore is inaccessible as long as active workloads are associated with the datastore in question

Solution

• See http://kb.vmware.com/kb/2126021 for full details

Page 50: VMworld 2015: Troubleshooting for vSphere 6

Issue: Unable to Create a VMDK Larger than 1 TB on NFS 4.1 on EMC VNX Array

Symptom and Cause

• Unable to create a VMDK larger than 1 TB on NFS 4.1 on EMC VNX array

• Cannot create a VMDK larger than 1 TB

• Creating a VMDK larger than 1 TB fails on NFS 4.1 on EMC VNX array

This issue occurs because NFS version 4.1 storage from EMC VNX supports only 32-bit file formats, which prevents you from creating virtual machine files that are larger than 1 TB on the NFS 4.1 datastore. This is a limitation of the EMC VNX array

Solution

• See http://kb.vmware.com/kb/2089311 for full details

Page 51: VMworld 2015: Troubleshooting for vSphere 6

Issue: Storage Views Tab Missing from the vSphere Client

Symptom and Cause

• Storage Views tab is missing after logging in successfully to VMware vCenter Server 6.0.x using the vSphere Client

• This issue occurs because the storage views front-end is removed from the vSphere Web Client and the back-end is disabled in vCenter Server 6.0.x. This also means that the vSphere Client no longer supports Storage Views

Solution

• This is expected behavior for the vSphere Client when connected to VMware vCenter Server 6.0.x. For more information, see VMware vSphere 6.0 Release Notes.

• To work around this issue, use VMware vSphere PowerCLI to get the details

• See http://kb.vmware.com/kb/2112085 for full details

Page 52: VMworld 2015: Troubleshooting for vSphere 6

Retrieving SMART Data

vSphere 6 includes the esxcli storage core device smart get -d device_name command to retrieve data about a specified SSD device

Page 53: VMworld 2015: Troubleshooting for vSphere 6

Possible Cause: NFS Misconfiguration

If your virtual machines reside on NFS datastores, verify that your NFS configuration is correct

• NFS server name or IP address

• Directory to share with the ESXi host over the network

• Mount permission (read/write or read only) and ACLs

• ESXi host with NIC mapped to virtual switch

• VMkernel port configured with IP address

Page 54: VMworld 2015: Troubleshooting for vSphere 6

NFS Version Compatibility with Other vSphere Technologies

Compatibility with other vSphere technologies

| vSphere Technologies | NFS v3 | NFS v4.1 |
|---|---|---|
| vSphere vMotion/vSphere Storage vMotion | Yes | Yes |
| vSphere HA | Yes | Yes |
| vSphere Fault Tolerance | Yes | Yes |
| vSphere DRS/vSphere DPM | Yes | Yes |
| Stateless ESXi/Host Profiles | Yes | Yes |
| vSphere Storage DRS/vSphere Storage I/O Control | Yes | No |
| Site Recovery Manager | Yes | No |
| Virtual Volumes | Yes | No |
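
When planning a datastore, the compatibility table can be checked programmatically. The Python sketch below simply transcribes the table above into a dictionary; the helper name `unsupported_on` is hypothetical, and the data reflects this vSphere 6.0 presentation only.

```python
# (NFS v3, NFS v4.1) support flags, transcribed from the table above.
NFS_COMPAT = {
    "vSphere vMotion/vSphere Storage vMotion": (True, True),
    "vSphere HA": (True, True),
    "vSphere Fault Tolerance": (True, True),
    "vSphere DRS/vSphere DPM": (True, True),
    "Stateless ESXi/Host Profiles": (True, True),
    "vSphere Storage DRS/vSphere Storage I/O Control": (True, False),
    "Site Recovery Manager": (True, False),
    "Virtual Volumes": (True, False),
}


def unsupported_on(version):
    """List the technologies the table marks 'No' for NFS version '3' or '4.1'."""
    column = {"3": 0, "4.1": 1}[version]
    return [name for name, flags in NFS_COMPAT.items() if not flags[column]]
```

Calling `unsupported_on("4.1")` returns the three technologies that would be lost by moving a datastore from NFS v3 to NFS v4.1.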

Page 55: VMworld 2015: Troubleshooting for vSphere 6

NFS Dual Stack Not Supported

NFS v3 and v4.1 Use Different Locking Semantics

NFS v3 uses proprietary client-side cooperative locking. NFS v4.1 uses server-side locking

Best Practice

• Configure an NFS array to allow only one NFS protocol

• Use either NFS v3 or NFS v4.1 to mount the same NFS share across all VMware ESXi™ hosts

- Data corruption can occur if hosts attempt to access the same NFS share using different NFS client versions
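
A simple consistency check can flag shares that are mounted with more than one NFS version. This Python sketch assumes you have already gathered (host, share, version) tuples by some other means; the function name, host names, and share paths are hypothetical.

```python
from collections import defaultdict


def shares_with_mixed_versions(mounts):
    """Return shares mounted with more than one NFS version.

    mounts: iterable of (host, share, nfs_version) tuples gathered elsewhere.
    """
    versions = defaultdict(set)
    for _host, share, version in mounts:
        versions[share].add(version)
    return sorted(share for share, seen in versions.items() if len(seen) > 1)


# Hypothetical inventory: esxi02 mounts the same export with a different version.
inventory = [
    ("esxi01", "nfs01:/vol/datastore1", "3"),
    ("esxi02", "nfs01:/vol/datastore1", "4.1"),
    ("esxi01", "nfs01:/vol/datastore2", "4.1"),
    ("esxi02", "nfs01:/vol/datastore2", "4.1"),
]
mixed = shares_with_mixed_versions(inventory)
```

Any share returned by the check is at risk of the data corruption described above and should be remounted with a single version on all hosts.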

Page 56: VMworld 2015: Troubleshooting for vSphere 6

Reviewing Session Information

The esxcli storage nfs41 list command is used to list and view the volume name, IP address, and other information for the export

Page 57: VMworld 2015: Troubleshooting for vSphere 6

Cluster Troubleshooting

Page 58: VMworld 2015: Troubleshooting for vSphere 6

vSphere HA

Architecture diagram:

• vCenter Server: vpxd

• ESXi host (master): FDM, hostd, vpxa

• ESXi host (slave): FDM, hostd, vpxa

• ESXi host (slave): FDM, hostd, vpxa

• Management Network

• Heartbeat Datastores

A reliable network connection between the hosts and VMware® vCenter Server™ is essential for enabling vSphere HA

Page 59: VMworld 2015: Troubleshooting for vSphere 6

Example: vSphere HA Issue

The vCenter Server displays the following error: Insufficient failover capacity in a vSphere HA cluster

The issue might also occur if you attempt to power on a virtual machine that is part of a vSphere HA cluster with insufficient failover resources.

Page 60: VMworld 2015: Troubleshooting for vSphere 6

Identifying Possible Causes

Excessive virtual machine reservations or insufficient resources in the cluster can cause insufficient failover capacity for vSphere HA

Possible Causes

• vSphere HA: vSphere HA admission control policy is not configured correctly

• Virtual Machine: One or more of the virtual machines have excessive reservations

• ESXi Host: The cluster has insufficient physical resources

Page 61: VMworld 2015: Troubleshooting for vSphere 6

vSphere HA

When a host failure or isolation occurs, vSphere HA powers on a virtual machine on another host in the cluster in accordance with the bandwidth reservation and teaming policy:

Failover:

vSphere HA respects the virtual machine’s reservation

vSphere HA Failure:

If a virtual machine cannot start because the bandwidth reservation cannot be met, information about the failure is available in the UI and log files

NIOCES.etherswitch: NIOCES_UpdateNIOCVnicInfo: Fail to reserve bandwidth for the port

Page 62: VMworld 2015: Troubleshooting for vSphere 6

vSphere HA and VMCP

vSphere 5.x is unable to detect APD conditions and remediate PDL conditions

In vSphere 6, vSphere HA includes Virtual Machine Component Protection (VMCP):

• VMCP provides enhanced protection from APD and PDL conditions

• Can automatically restart impacted virtual machines on non-impacted hosts

Page 63: VMworld 2015: Troubleshooting for vSphere 6

vSphere vMotion TCP/IP Stacks

In vSphere 6, each host has a second TCP/IP stack dedicated to vSphere vMotion

Diagram:

• User world: hostd, PING, DHCP

• VMkernel: vSphere FT, Virtual SAN, NFS, vSphere vMotion

• VMKTCP-API

• Each TCP/IP stack has its own:

− Separate Memory Heap

− ARP Tables

− Routing Table

− Default Gateway

Page 64: VMworld 2015: Troubleshooting for vSphere 6

Long Distance vMotion Requirements in VMware vSphere 6.0

Tip

• VMware vSphere 6.0 adds functionality to migrate virtual machines over long distances. You can now perform reliable migrations between hosts and sites that are separated by high network round-trip latency times. vMotion across long distances is enabled when correctly configured and the appropriate license is installed. No further user configuration is required

Recommendation

• See http://kb.vmware.com/kb/2106949 for full details

Page 65: VMworld 2015: Troubleshooting for vSphere 6

Cross vCenter vMotion Requirements in VMware vSphere 6.0

Tip

• VMware vSphere 6.0 and later adds new functionality that lets you migrate virtual machines between vCenter Server instances

Recommendation

• See http://kb.vmware.com/kb/2106952 for full details

Page 66: VMworld 2015: Troubleshooting for vSphere 6

Example: vSphere vMotion Issue

vSphere vMotion fails at 15% or less, or times out completely

Initial check:

If vSphere vMotion was previously working, perform a check:

• Restart the management agents on the ESXi host at the command prompt

/etc/init.d/hostd restart

/etc/init.d/vpxa restart

• Or restart the management agents on the ESXi host by using the DCUI

Page 67: VMworld 2015: Troubleshooting for vSphere 6

Resetting Migrate.Enabled

If vSphere vMotion fails at 10% with the error, “A general system error occurred: Migration failed while copying data, Broken Pipe,” take the following action:

Try resetting the advanced setting, Migrate.Enabled.

- Change the value to 0 and save the setting.

- Change the value back to 1 and save the setting.

Page 68: VMworld 2015: Troubleshooting for vSphere 6

Possible Cause: DRS Configuration

DRS might have valid reasons for not performing vSphere vMotion migrations

DRS Never Migrates

• The automation level is set to manual mode

• The automation level is fully automated mode and the migration threshold is set to apply priority 1 recommendations

DRS Seldom Migrates

• Virtual machine loads are fairly consistent

• The automation level is fully automated mode and the migration threshold is set to apply priority 1, 2, and 3 recommendations

DRS Often Migrates

• Virtual machine loads are very erratic in their resource requirements

• The automation level is fully automated mode and the migration threshold is set to apply all recommendations
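
The configuration side of these three cases can be expressed as a small lookup. The Python sketch below covers only the combinations named on the slide and returns None for anything else. The function name is hypothetical, workload behavior (consistent or erratic loads) is deliberately left out, and treating "all recommendations" as priorities 1 through 5 is an assumption.

```python
def drs_migration_tendency(automation_level, applied_priorities):
    """Map a DRS configuration to the tendency described on the slide.

    automation_level: "manual" or "fully automated"
    applied_priorities: set of recommendation priorities applied automatically
    Returns "never", "seldom", "often", or None if the slide does not cover it.
    """
    if automation_level == "manual":
        return "never"  # DRS only recommends; it does not migrate by itself
    if automation_level == "fully automated":
        if applied_priorities == {1}:
            return "never"
        if applied_priorities == {1, 2, 3}:
            return "seldom"
        if applied_priorities == {1, 2, 3, 4, 5}:  # assumed meaning of "all"
            return "often"
    return None  # combination not described on the slide
```

A result of "never" for a cluster that is expected to balance itself points to the configuration, not to a vSphere vMotion fault.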

Page 69: VMworld 2015: Troubleshooting for vSphere 6

REFERENCE: VMware vSphere 6 Upgrades- Known Issues and Workarounds

Page 70: VMworld 2015: Troubleshooting for vSphere 6

Issue: Internal Error Occurs During VMware vCenter Server Database Pre-upgrade Checks

Symptom and Cause

• This issue occurs when the VMware vCenter Server database was previously migrated from the default embedded Microsoft SQL Server Express instance to another instance, but the Microsoft Windows registry entries on the VMware vCenter Server were not updated

Solution

• To resolve this issue, update the VMware vCenter Server database registry entries to reflect the correct Microsoft SQL Server instance in which the VMware vCenter Server database resides

• See http://kb.vmware.com/kb/2115567 for full resolution details

Page 71: VMworld 2015: Troubleshooting for vSphere 6

Issue: Installing vCenter Server 6.0 with a Microsoft SQL Database Fails

CONFIDENTIAL 71

Symptom and Cause

• When installing vCenter Server 6.0 with a Microsoft SQL database, the install fails with the error:

An error occurred while starting service 'invsvc'

• The issue occurs when the SQL Server Browser Service is stopped

Solution

• To resolve the issue, start the SQL Server Browser Service and perform the install again

Note: The SQL Server Browser Service is set to disabled by default

• See http://kb.vmware.com/kb/2119169 for full resolution details
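
Before rerunning the installer, the service state can be checked on the SQL Server host. The following is a rough Python sketch, assuming a Windows host with English-language `sc query` output and assuming the service is registered under the name `SQLBrowser`. Verify both against your own environment; the helper names are illustrative.

```python
import subprocess

# Assumption: Windows service name of the SQL Server Browser Service.
SERVICE_NAME = "SQLBrowser"


def parse_sc_state(sc_output: str) -> str:
    """Extract the state word (for example RUNNING or STOPPED) from `sc query` output."""
    for line in sc_output.splitlines():
        parts = line.split()
        # The state line looks like: "STATE              : 1  STOPPED"
        if parts and parts[0] == "STATE":
            return parts[-1]
    return "UNKNOWN"


def browser_service_state() -> str:
    """Run `sc query` for the service (Windows only) and return its state."""
    result = subprocess.run(
        ["sc", "query", SERVICE_NAME],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_sc_state(result.stdout)
```

If the state is anything other than RUNNING, start the service as described on the slide and then retry the install.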

Page 72: VMworld 2015: Troubleshooting for vSphere 6

Issue: Upgrading to vCenter Server 6.0 Fails while Validating the Database

Symptom and Cause

• The following error is displayed:

Error : The user associated with the DSN has insufficient privileges.

• This issue occurs because vCenter Server 6.0 requires additional privileges to be assigned to the vCenter Server database user

Solution

• To resolve the issue, grant the additional privileges to the vCenter Server database user

• See http://kb.vmware.com/kb/2114754 for full resolution details

Page 73: VMworld 2015: Troubleshooting for vSphere 6

Issue: Installing or Upgrading to vCenter Server 6.0 Fails During the Import Phase of the Inventory Service Data

Symptom and Cause

• Upgrading vCenter Server 5.1 or 5.5 to vCenter Server 6.0 fails during the import phase of the Inventory Service data

• This issue occurs when the Inventory Service was not functional before the upgrade

Solution

• To resolve this issue, roll back to the vCenter Server 5.x system and investigate the status of the Inventory Service

• See http://kb.vmware.com/kb/2119117 for full resolution details

Page 74: VMworld 2015: Troubleshooting for vSphere 6

Issue: Installing or Upgrading to vCenter Server 6.0 Using an External Platform Services Controller (PSC) Fails

Symptom and Cause

• Installing or upgrading vCenter Server 5.x to vCenter Server 6.0 using an external Platform Services Controller (PSC) fails with the error:

install.vmafd.join_vmdir_failed

ERROR: 1, join vmdir failed

• This issue occurs because of stale data in the Platform Services Controller (PSC). An old vCenter Single Sign-On installation once existed with the same fully qualified domain name as the failing vCenter Server 6.0 installation or upgrade. That vCenter Single Sign-On installation no longer exists, but references to it remain in the running Platform Services Controller

Solution

• To resolve this issue, clean the stale data from the Platform Services Controller

• See http://kb.vmware.com/kb/2117378 for full resolution details

Page 75: VMworld 2015: Troubleshooting for vSphere 6

Issue: Installing or Upgrading to vCenter Server 6.0 Fails During the Import of the VMware License Service

Symptom and Cause

• Upgrading from vCenter Server 5.5 to vCenter Server 6.0 fails with the error:

Internal error occurs during Import of VMware License Service

• This issue occurs if the vCenter Single Sign-On user administrator@vsphere.local has a password that contains unsupported characters. For more information on unsupported password characters for the administrator@vsphere.local user, see vSphere 5.5 Single Sign-On administrator@vsphere.local password issues (2060637)

Solution

• To work around this issue, change the administrator@vsphere.local password to a supported password

• See http://kb.vmware.com/kb/2111863 for full resolution details
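
A password can be screened for problem characters before the upgrade is attempted. The following is a minimal Python sketch. The character set shown is a placeholder for illustration only, since the authoritative list is the one in KB 2060637, and the function name is hypothetical.

```python
# Placeholder set for illustration only; replace it with the characters
# listed in KB 2060637 before relying on the result.
EXAMPLE_UNSUPPORTED = {";", '"', "'"}


def find_unsupported(password: str, unsupported: set) -> list:
    """Return the distinct unsupported characters found, in order of first appearance."""
    found = []
    for ch in password:
        if ch in unsupported and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    # An empty list means none of the supplied characters are present.
    print(find_unsupported("Example;Pass'word", EXAMPLE_UNSUPPORTED))
```

An empty result only says that none of the supplied characters are present; it does not confirm that the password meets every other Single Sign-On policy.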

Page 76: VMworld 2015: Troubleshooting for vSphere 6

Issue: Upgrading to vCenter Server 6.0 Reports that the SSL Certificates are not Compatible

Symptom and Cause

• When attempting to upgrade, the following warning is displayed:

The system name in the vCenter Server 5.5 SSL certificate and the vCenter Single Sign-On 5.5 SSL certificates are not compatible. Please replace either the vCenter Server SSL certificates or the vCenter Single Sign-On SSL certificates so both vCenter Server and vCenter Single Sign-On SSL certificates use the same system name

• This issue occurs when vCenter Server 5.x was installed using only the IP address of the host OS and is then upgraded to vCenter Server 6.0 using the fully qualified domain name (FQDN)

Solution

• This is a known issue affecting vCenter Server 6.0. Currently, there is no resolution.

• See http://kb.vmware.com/kb/2110943 for the workaround
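
The IP-versus-FQDN condition described above can be spotted ahead of time by comparing the system name used for the 5.x installation with the name planned for the 6.0 upgrade. The following is a minimal Python sketch using only the standard library. How the two names are obtained (for example, by reading them from the certificates) is left out, and the function names are illustrative.

```python
import ipaddress
from typing import Optional


def is_ip_address(system_name: str) -> bool:
    """True if the name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(system_name)
        return True
    except ValueError:
        return False


def describe_mismatch(name_5x: str, name_60: str) -> Optional[str]:
    """Return a short description of the mismatch, or None if the names agree."""
    if name_5x.lower() == name_60.lower():
        return None
    if is_ip_address(name_5x) and not is_ip_address(name_60):
        return "5.x used an IP address; the 6.0 upgrade uses an FQDN"
    return "system names differ"


if __name__ == "__main__":
    # Example values are placeholders from documentation address ranges.
    print(describe_mismatch("192.0.2.10", "vcenter.example.com"))
```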

Page 77: VMworld 2015: Troubleshooting for vSphere 6

Issue: Database Compatibility Mode

Symptom and Cause

• Installing or upgrading to vCenter Server 6.0 fails with the error:

Incompatible MSSQL version with vCenter Server 6.0

• This issue occurs if the vCenter Server database resides on a Microsoft SQL instance that does not meet the requirements for vCenter Server 6.0. For more information on supported databases, see VMware Product Interoperability Matrixes

Solution

• To work around this issue if you are using SQL Server 2008 R2 SP1 or SP2 Datacenter Edition, move the database to a non-Datacenter Edition (such as Enterprise)

• See http://kb.vmware.com/kb/2111541 for full resolution details

Page 78: VMworld 2015: Troubleshooting for vSphere 6

Issue: DSN Compatibility

Symptom and Cause

• When upgrading to vSphere 6.0, you may see an error similar to:

Error: Unsupported database driver: <file name> Resolution: Verify you're using vCenter Server with supported driver

• vCenter Server requires ODBC drivers and clients when connecting to various databases. Using an incorrect driver produces this error

Solution

• See http://kb.vmware.com/kb/1015804 for full resolution and driver details

Page 79: VMworld 2015: Troubleshooting for vSphere 6

Issue: The vSphere Web Client 6.0 Displays an Internal Error after Upgrading to vSphere 6.0 in a pre-5.8.5 vRealize Operations Manager Environment

Symptom and Cause

• The vSphere Web Client 6.0 displays an internal error.

• When logging in to the vSphere Web Client 6.0, you see the error:

An Internal Error has occurred - Unable to load resource module from /monitoring-ui/locales/monitoring-ui-en_US.swf

Solution

• This issue occurs because vRealize Operations Manager versions prior to 5.8.5 are not supported in the vSphere 6.0 environment. For more information, see the VMware Product Interoperability Matrixes.

• See http://kb.vmware.com/kb/2111224 for full resolution details
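
The 5.8.5 cutoff named above is easy to check before upgrading. The following is a minimal Python sketch that compares dotted version strings as integer tuples. It assumes purely numeric version strings, and the function names are illustrative.

```python
# Cutoff taken from the slide: versions prior to 5.8.5 are not supported.
MINIMUM_VERSION = (5, 8, 5)


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '5.8.4' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def is_prior_to_minimum(vrops_version: str) -> bool:
    """True if the vRealize Operations Manager version is earlier than 5.8.5."""
    return parse_version(vrops_version) < MINIMUM_VERSION


if __name__ == "__main__":
    for version in ("5.8.4", "5.8.5", "6.0.1"):
        print(version, is_prior_to_minimum(version))
```

A version at or above the cutoff still needs to be confirmed against the VMware Product Interoperability Matrixes; this check only flags the versions the slide rules out.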

Page 80: VMworld 2015: Troubleshooting for vSphere 6

Issue: Missing Inventory Items from the vSphere Web Client 6.0

Symptom and Cause

• After upgrading to vSphere 6.0, some of the inventory items are missing or unavailable in the vSphere Web Client

• The inventory items are visible in the vSphere Client

Solution

• See http://kb.vmware.com/kb/2121185 for full resolution details

Page 81: VMworld 2015: Troubleshooting for vSphere 6

Issue: Unable to Login Using the Use Windows Session Credentials Feature in the vSphere Web Client

Symptom and Cause

• You are unable to log in to the vSphere Web Client

Solution

• See http://kb.vmware.com/kb/2121717 for full resolution details
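
The error strings quoted on the preceding slides can double as a quick lookup from installer or upgrade log text to the matching KB article. The following is a minimal Python sketch. It assumes the messages appear in the log as they are quoted in this deck, which may not hold for every log format, and the function name is illustrative.

```python
# Error signatures and KB links as quoted on the slides in this section.
KNOWN_ISSUES = [
    ("An error occurred while starting service 'invsvc'",
     "http://kb.vmware.com/kb/2119169"),
    ("The user associated with the DSN has insufficient privileges",
     "http://kb.vmware.com/kb/2114754"),
    ("join vmdir failed",
     "http://kb.vmware.com/kb/2117378"),
    ("Internal error occurs during Import of VMware License Service",
     "http://kb.vmware.com/kb/2111863"),
    ("Incompatible MSSQL version with vCenter Server 6.0",
     "http://kb.vmware.com/kb/2111541"),
    ("Unsupported database driver",
     "http://kb.vmware.com/kb/1015804"),
    ("Unable to load resource module from /monitoring-ui/locales/monitoring-ui-en_US.swf",
     "http://kb.vmware.com/kb/2111224"),
]


def match_known_issues(log_text: str) -> list:
    """Return the KB links whose error signature occurs in the log text."""
    lowered = log_text.lower()
    return [url for signature, url in KNOWN_ISSUES if signature.lower() in lowered]


if __name__ == "__main__":
    print(match_known_issues("ERROR: 1, join vmdir failed"))
```

Matching is a case-insensitive substring test, so a hit is a pointer to the right article and not a confirmed diagnosis.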

Page 82: VMworld 2015: Troubleshooting for vSphere 6

VMware vSphere 6 Training & Certification

Page 83: VMworld 2015: Troubleshooting for vSphere 6

vSphere 6 Training & Certification

Course | Delivery Type | Certification

vSphere What’s New [V5.5 to V6] | Instructor Led (5-day), Live Online, Onsite, On Demand, vFlex-ILT | VMware Certified Professional 6 - Data Center Virtualization (VCP6-DCV)

vSphere: Install, Configure, Manage | Instructor Led (5-day), Live Online, Onsite, On Demand, vFlex-ILT

vSphere: Optimize and Scale | Instructor Led (5-day), Live Online, Onsite, vFlex-ILT

VMware vSphere: Fast Track [V6] | Instructor Led (5-day), Live Online, Onsite

VMware vSphere: Boot Camp [V6] | Instructor-Led (5-day intensive)

www.vmware.com/go/vsphere6training

Page 84: VMworld 2015: Troubleshooting for vSphere 6

Learn More

Visit us at VMworld

• Education & Certification Lounge: Moscone West, 3rd Floor

• Testing Center: Moscone South, West Mezzanine

Visit our website

• vSphere 6 Training: www.vmware.com/go/vsphere6training

• VMware training and certification: www.vmware.com/education
