
VMware vRealize Operations Management Pack for Kubernetes

Management Packs for vRealize Operations Manager 1.5.2


You can find the most up-to-date technical documentation on the VMware website at:

https://docs.vmware.com/

VMware, Inc.
3401 Hillview Ave.
Palo Alto, CA 94304
www.vmware.com

Copyright © 2021 VMware, Inc. All rights reserved. Copyright and trademark information.


Contents

vRealize Operations Management Pack for Kubernetes

1 Introduction

2 About Kubernetes

3 Key Day Two Operations Use Cases for Kubernetes

4 Installation of vRealize Operations Kubernetes Solution
    Installation of vRealize Operations Management Pack for Kubernetes On-Premises
    Installation of vRealize Operations Kubernetes Solution on vRealize Operations Cloud

5 Monitoring Kubernetes
    Prerequisites
    Monitoring Using Container Advisor Daemonset
    cAdvisor YAML Definition
    Preparing Kubernetes Cluster for Monitoring
    Monitoring Using Prometheus
    Prometheus Integration
    Sample Deployment Setups
    Supported Integrations
    Configuring vRealize Operations Management Pack for Kubernetes
    Tanzu Kubernetes Grid Integrated Overview
    Tanzu Mission Control Overview
    Collection Strategy
    Collection Strategy for Prometheus
    Collection Strategy for cAdvisor

6 Alerts in vRealize Operations Management Pack for Kubernetes
    Reports in vRealize Operations Management Pack for Kubernetes

7 Dashboards in vRealize Operations Management Pack for Kubernetes
    Kubernetes Overview
    Kubernetes Overview - Environment
    Kubernetes Overview - Nodes
    Kubernetes Overview - Pods and Container
    Kubernetes POD and Container Availability - Overview


vRealize Operations Management Pack for Kubernetes

The vRealize Operations Management Pack for Kubernetes provides information for automating deployment and scaling operations of application containers across clusters of hosts that provide container-centric infrastructure. The current version of the management pack supports monitoring Kubernetes clusters and the containers deployed through them.

Intended Audience

This information is intended for anyone who wants to install and use vRealize Operations Management Pack for Kubernetes to monitor their Kubernetes clusters and the containers deployed through Kubernetes.


1 Introduction

With Kubernetes becoming the platform of choice for running enterprise applications, an organization needs the tools that let its IT teams operationalize the Kubernetes platform. Whether Kubernetes is hosted on top of VMware’s software-defined data center or in a native public cloud such as Amazon Web Services or Microsoft Azure, the central IT teams need full visibility into this new world to assure the performance and availability of business applications. These business applications can continue to use both virtual machines and containers for the foreseeable future. vRealize Operations provides turnkey integrations into your cloud environments, and the Kubernetes integration ensures that you have full line of sight from your applications to infrastructure. This empowers IT teams and helps them effectively support application owners and lines of business as they adopt this new platform.

VMware vRealize Operations delivers self-driving IT operations management for private, hybrid, and multi-cloud environments in a unified, AI-powered platform. vRealize Operations can monitor multiple Kubernetes solutions, whether it is VMware TKG, RedHat OpenShift, or Kubernetes on Amazon Web Services EC2, Azure, or Google Virtual Machines.

Using vRealize Operations Management Pack for Kubernetes, you can monitor, troubleshoot, and optimize the capacity management for Kubernetes clusters. Some of the additional capabilities of this management pack are listed here.

- Autodiscovery is supported for Tanzu Kubernetes Grid Integrated and VMware Tanzu Mission Control (TMC) clusters provisioned on Amazon Web Services only.
- Complete visualization of Kubernetes cluster topology, including namespaces, clusters, replica sets, nodes, pods, and containers.
- Performance monitoring for Kubernetes clusters.
- Out-of-the-box dashboards for Kubernetes constructs, which include inventory and configuration.
- Multiple alerts to monitor the Kubernetes clusters.
- Mapping Kubernetes nodes with virtual machine objects.
- Report generation for capacity, configuration, and inventory metrics for clusters or pods.


2 About Kubernetes

Kubernetes is an open-source container orchestration platform that enables the operation of an elastic web server framework for cloud applications. Kubernetes can support data center outsourcing to public cloud service providers or can be used for web hosting at scale.

VMware Tanzu Kubernetes Grid, or TKG, is a Kubernetes-based container solution that consists of advanced networking capabilities, a private container registry, and life-cycle management. TKG simplifies the deployment and operation of Kubernetes clusters, so you can run and manage containers at scale on private and public clouds.

This document outlines all the Kubernetes offerings within the VMware Tanzu family that can be discovered and monitored by central IT teams. It also includes instructions on how IT teams can use this integration for supported versions of Kubernetes clusters deployed using third-party provisioning systems such as RedHat OpenShift and others.


3 Key Day Two Operations Use Cases for Kubernetes

As enterprises start using Kubernetes, they need all the capabilities they rely on today for virtual machines: discovering the Kubernetes clusters, creating an inventory, defining relationships, and finally collecting all the key metrics and events to provide full visibility.

Some of the key use cases that the vRealize Operations and Kubernetes integration supports are listed here.

Automatic Discovery and Monitoring of Kubernetes Clusters

For administrators, the biggest benefit of this integration is that monitoring is configured automatically for Kubernetes clusters as they are deployed by a provisioning system such as Tanzu Kubernetes Grid Integrated Service or Tanzu Mission Control. The integration also takes care of authentication, so there is no additional overhead in managing these environments. Clusters are added as they are provisioned and removed as they are decommissioned.

Kubernetes Inventory & Relationships

The integration allows the administrator to quickly get a full inventory of all the key Kubernetes constructs. This includes a list of Kubernetes clusters deployed across all the environments, a list of Namespaces deployed by the developers, a list of Kubernetes nodes, and finally a list of replica sets, services, pods, and containers. Because this exhaustive inventory is collected every five minutes, administrators can easily report on it using the reporting capability of vRealize Operations. The inventory is also related automatically to provide a full-stack topology from container to disk. With such a topology, administrators can correlate applications to infrastructure and find the root cause of problems brewing in the environment.

Kubernetes Monitoring

After the inventory and relationships are available, all the key metrics are automatically collected and published for the administrators to consume. Based on industry best practices, out-of-the-box content such as Alerts, Dashboards, and Reports is supplied with this integration to get started with monitoring all the KPIs associated with a Kubernetes environment. The metrics collected include both container and pod infrastructure metrics and application metrics from the containers, which can be ingested into vRealize Operations using the Prometheus integration.

Kubernetes Troubleshooting and Root Cause Analysis


While Kubernetes works based on the desired state configuration and always tries to ensure application performance and availability, there are scenarios where the underlying infrastructure and applications misbehave and cause performance or availability issues. In such a situation, administrators need a way to troubleshoot a scope of related objects over a given period of time. The vRealize Operations Troubleshooting Workbench allows administrators to define a scope based on relationships and a time period for troubleshooting that scope. With these inputs, the Workbench automatically finds the potential evidence, that is, the signals one must observe to arrive at the root cause. Scope-based Metric Correlation is another feature within the Workbench that finds the positive and negative correlations to an anomalous metric across a full scope, to find the needle in the haystack.


4 Installation of vRealize Operations Kubernetes Solution

The vRealize Operations Management Pack for Kubernetes is an add-on integration that must be installed on top of the core vRealize Operations platform. This integration is available through the VMware Marketplace.

This chapter includes the following topics:

- Installation of vRealize Operations Management Pack for Kubernetes On-Premises
- Installation of vRealize Operations Kubernetes Solution on vRealize Operations Cloud

Installation of vRealize Operations Management Pack for Kubernetes On-Premises

For the on-premises deployment of vRealize Operations managed by end users, the vRealize Operations Management Pack for Kubernetes must be downloaded and installed on this instance. This can be done in a few simple steps.

Prerequisites

- Verify that you have the latest version of the management pack and that the management pack is compatible with your version of vRealize Operations.
- Ensure that you have the privileges to install a management pack.

Procedure

1 Download the PAK file for the management pack from VMware Marketplace.

2 Log in to vRealize Operations with administrator privileges.

3 In the menu, select Administration and in the left pane select Solutions > Repository.

4 On the Repository tab, click Add/Upgrade.

5 Browse to locate the temporary folder and select the PAK file.

6 Click Upload.

The upload might take several minutes.

7 Read and accept the EULA, and click Next.


8 When vRealize Operations Management Pack for Kubernetes is installed, click Finish.

You can now configure this integration from the Administration > Other Accounts section.

Installation of vRealize Operations Kubernetes Solution on vRealize Operations Cloud

You can manage and monitor the private and public cloud accounts using vRealize Operations Kubernetes solution. To install the vRealize Operations Kubernetes solution, you must have login credentials to the vRealize Operations Cloud.

Prerequisites

Verify that you have privileges to install a management pack.

Procedure

1 Log in to vRealize Operations Cloud with the relevant credentials.

2 In the menu, select Administration and in the left pane select Marketplace.

3 In the Search box, search for Management Pack for Kubernetes.

4 Click Install.

5 Read and accept the EULA and click Next.

6 When vRealize Operations Management Pack for Kubernetes is installed, click Finish.

You can now configure this integration from the Administration > Other Accounts section.

5 Monitoring Kubernetes

VMware vRealize Operations Manager delivers self-driving IT operations management for private, hybrid, and multi-cloud environments in a unified, AI-powered platform. vRealize Operations Manager can monitor multiple Kubernetes solutions, whether it is VMware TKG, RedHat OpenShift, or Kubernetes on AWS EC2, Azure, or Google Virtual Machines.

Using vRealize Operations Management Pack for Kubernetes, you can monitor, troubleshoot, and optimize the capacity management for Kubernetes clusters. Some of the additional capabilities of this management pack are listed here.

- Auto-discovery of Kubernetes clusters.
- Complete visualization of Kubernetes cluster topology, including namespaces, clusters, replica sets, nodes, pods, and containers.
- Performance monitoring for Kubernetes clusters.
- Out-of-the-box dashboards for Workload Management, which include inventory and configuration.
- Multiple alerts to monitor the Kubernetes clusters.
- Mapping Kubernetes nodes with virtual machine objects.
- Capacity management for Kubernetes clusters.
- Report generation for capacity, configuration, and inventory metrics for clusters or pods.

This chapter includes the following topics:

- Prerequisites
- Monitoring Using Container Advisor Daemonset
- Monitoring Using Prometheus
- Supported Integrations
- Collection Strategy

Prerequisites

Before you can use the vRealize Operations Management Pack for Kubernetes to monitor the Kubernetes clusters, you must prepare your vRealize Operations environment. You must make sure that vRealize Operations Manager meets the following general requirements.

- Ensure that you have installed vRealize Operations for Cloud or vRealize Operations Manager 8.1 or later.
- Verify that you have installed vRealize Operations Management Pack for Kubernetes.
- Ensure that you have a Kubernetes cluster deployed by Tanzu Kubernetes Grid.
- Ensure that you have an OpenShift Kubernetes cluster deployed.
- Ensure that you have a Kubernetes cluster deployed by upstream Kubernetes.

Monitoring Using Container Advisor Daemonset

Container Advisor (cAdvisor) helps you understand the resource usage and performance characteristics of the running containers. The Container Advisor daemon collects, aggregates, processes, and exports information about running containers. For each container, cAdvisor keeps resource isolation parameters, historical resource usage, histograms of complete historical resource usage, and network statistics.

cAdvisor YAML Definition

Before you install vRealize Operations Management Pack for Kubernetes, you must deploy the cAdvisor DaemonSet on the cluster. Based on the Kubernetes settings, you must create a cAdvisor YAML definition.

Here are a few points to consider when you create a cAdvisor YAML definition:

- Containers running on hostPort must be accessible on your cluster. For example, the sample YAML definition on hostPort given below has port 31194 as the hostPort. So, the cluster must allow a connection on port 31194.

  If the containers running on hostPort are not accessible, verify with hostNetwork. A sample YAML definition on hostNetwork specific to Tanzu Kubernetes Grid Integrated (TKGI) is provided in Sample cAdvisor YAML Definition on HostNetwork.

- The docker path configured in the volume must be correct.

  Note The docker path can be different based on your settings.

- All the nodes must have sufficient CPU and memory to run DaemonSets.
- You must use the hostPort defined in the YAML definition as the cAdvisor port when you create an adapter instance.
- Verify that the Kubernetes clusters available in the TKGI environment have the cAdvisor DaemonSet configured on port 31194.
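The accessibility requirement in the points above can be checked before an adapter instance is created. The following is a minimal Python sketch, not part of the management pack: the helper name is_port_open is hypothetical, and it only confirms that a TCP connection to the cAdvisor hostPort (31194 in the sample definitions) succeeds from the machine where it runs.

```python
import socket

def is_port_open(host, port=31194, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling is_port_open with the IP address of a Kubernetes node and getting False suggests that the hostPort is blocked or that the DaemonSet pod is not running on that node. Run the check from a host on the same network as the vRealize Operations collector, because reachability can differ between networks.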

Sample cAdvisor YAML Definition on HostPort

apiVersion: apps/v1 # apps/v1beta2 in Kube 1.8, extensions/v1beta1 in Kube < 1.8
kind: DaemonSet
metadata:
  name: vrops-cadvisor
  namespace: kube-system
  labels:
    app: vrops-cadvisor
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
  selector:
    matchLabels:
      app: vrops-cadvisor
  template:
    metadata:
      labels:
        app: vrops-cadvisor
        version: v0.33.0
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: vrops-cadvisor
        image: google/cadvisor:v0.33.0
        resources:
          requests:
            memory: 250Mi
            cpu: 250m
          limits:
            cpu: 400m
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
          readOnly: true
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker #Mounting Docker volume
          readOnly: true
        - name: disk
          mountPath: /dev/disk
          readOnly: true
        ports:
        - name: http
          containerPort: 8080 #Port exposed
          hostPort: 31194 #Host's port - Port to expose your cAdvisor DaemonSet on each node
          protocol: TCP
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/docker #Docker path in Host System
      - name: disk
        hostPath:
          path: /dev/disk
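The sample definition above requests 250m of CPU and 250Mi of memory for the cAdvisor container, and a DaemonSet schedules one such pod on every node. The following Python sketch shows one way to compare those requests with the free capacity of a node. The function names are hypothetical, the free-capacity values are supplied by you, and only the m CPU suffix and the Ki, Mi, and Gi memory suffixes of the Kubernetes quantity notation are handled.

```python
# Binary suffixes used by Kubernetes memory quantities (subset only).
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}

def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as '250m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))

def memory_to_bytes(quantity):
    """Convert a memory quantity such as '250Mi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes

def node_fits(free_cpu, free_memory, cpu_request="250m", memory_request="250Mi"):
    """Return True if the node's free CPU and memory cover the cAdvisor requests."""
    return (cpu_to_millicores(free_cpu) >= cpu_to_millicores(cpu_request)
            and memory_to_bytes(free_memory) >= memory_to_bytes(memory_request))

print(node_fits("1", "1Gi"))     # a node with one free core and 1Gi free memory
print(node_fits("200m", "1Gi"))  # a node with too little free CPU
```

A node that fails this check would leave its cAdvisor pod unscheduled, so no container metrics would be collected from that node.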

Sample cAdvisor YAML Definition on HostNetwork

apiVersion: apps/v1 # apps/v1beta2 in Kube 1.8, extensions/v1beta1 in Kube < 1.8
kind: DaemonSet
metadata:
  name: vrops-cadvisor
  namespace: kube-system
  labels:
    app: vrops-cadvisor
spec:
  selector:
    matchLabels:
      name: vrops-cadvisor
  template:
    metadata:
      labels:
        name: vrops-cadvisor
        version: v0.33.0
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      hostNetwork: true
      containers:
      - name: vrops-cadvisor
        image: google/cadvisor:v0.33.0
        imagePullPolicy: Always
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
          readOnly: false
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker #Mounting Docker volume
          readOnly: true
        - name: docker-sock
          mountPath: /var/run/docker.sock
          readOnly: true
        - name: containerd-sock
          mountPath: /var/run/containerd.sock
          readOnly: true
        - name: disk
          mountPath: /dev/disk
          readOnly: true
        ports:
        - name: http
          containerPort: 31194 #Port exposed
          hostPort: 31194 #Host's port - Port to expose your cAdvisor DaemonSet on each node
          protocol: TCP
        securityContext:
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        args:
        - --port=31194
        - --profiling
        - --housekeeping_interval=1s
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/vcap/store/docker/docker #Docker path in Host System
      - name: docker-sock
        hostPath:
          path: /var/vcap/sys/run/docker/docker.sock
      - name: containerd-sock
        hostPath:
          path: /var/run/docker/containerd/docker-containerd.sock
      - name: disk
        hostPath:
          path: /dev/disk

Preparing Kubernetes Cluster for Monitoring

In this example, you are preauthenticated to a VMware Cloud on Amazon Web Services software-defined data center. To monitor the Kubernetes cluster, perform the following steps.

1 Deploy a cluster using the command tkg create cluster --plan=dev tkg-cluster-03

2 Switch to the context of the new cluster, where you create the vrops-cAdvisor.yaml file and run it as a DaemonSet, using the command kubectl config use-context tkg-cluster-03-admin@tkg-cluster-03

3 Switch to the temp directory using the command cd /tmp

4 Create a vrops-cAdvisor.yaml file using the vi command vi vrops-cAdvisor.yaml

5 Run kubectl apply -f vrops-cAdvisor.yaml to run cAdvisor as a DaemonSet.

6 Run less ~/.kube/config to read the configuration file, and note down the IP address and credentials to add to vRealize Operations Manager.

7 Use one of the following authentication types to authenticate the guest cluster.

- Basic Authentication - Uses HTTP basic authentication to authenticate API requests through authentication plugins.
- Client Certification Authentication - Uses client certificates to authenticate API requests through authentication plugins.
- Token Authentication - Uses bearer tokens to authenticate API requests through authentication plugins.

Refer to Authentication Strategies for more details.

Note Management Packs can work with read-only permission.
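Step 6 of the procedure above asks you to note down the IP address from the kubeconfig file. As an illustration only, the following Python sketch extracts the API server host and port from kubeconfig text. The sample content and the function name cluster_endpoints are hypothetical, the address comes from a documentation-only range, and the sketch scans for server: lines instead of parsing the YAML fully.

```python
import re
from urllib.parse import urlsplit

# Hypothetical kubeconfig excerpt; a real file also contains users and contexts.
SAMPLE_KUBECONFIG = """\
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: REDACTED
    server: https://192.0.2.20:6443
  name: tkg-cluster-03
"""

def cluster_endpoints(kubeconfig_text):
    """Return a list of (host, port) tuples, one per 'server:' entry."""
    servers = re.findall(r"^\s*server:\s*(\S+)\s*$", kubeconfig_text, flags=re.MULTILINE)
    endpoints = []
    for url in servers:
        parts = urlsplit(url)
        endpoints.append((parts.hostname, parts.port))
    return endpoints

print(cluster_endpoints(SAMPLE_KUBECONFIG))
```

The host and port found this way correspond to the cluster endpoint that you enter when you add the cluster to vRealize Operations Manager. The credentials in the same file depend on which of the authentication types listed above the cluster uses.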

Monitoring Using Prometheus

In Prometheus integration, vRealize Operations Manager retrieves the metrics directly from Prometheus with the help of exporters running on the Kubernetes cluster.

Prometheus Integration

In Prometheus integration, vRealize Operations Manager retrieves the metrics directly from Prometheus with the help of exporters running on the Kubernetes cluster. With Prometheus integration, vRealize Operations Manager supports metrics collection for the following Kubernetes objects: Namespaces, Nodes, Pods, and Containers.
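To see what a Prometheus server would return for a given exporter, you can query it yourself through the Prometheus HTTP API. The following Python sketch only builds the URL for an instant query against the /api/v1/query endpoint. The server address and the PromQL expression are placeholders, and the requests that the management pack itself sends are not described in this guide.

```python
from urllib.parse import urlencode

def instant_query_url(base_url, promql):
    """Build the URL of a Prometheus instant query for the given PromQL expression."""
    # urlencode percent-encodes braces, quotes, and equals signs in the expression.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"

# Placeholder server address and job name.
print(instant_query_url("http://prometheus.example:9090/", 'up{job="cadvisor"}'))
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, returns a JSON document that lists the matching time series, which helps confirm that an exporter is being scraped before you configure the adapter instance.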


Prerequisites

Ensure that you have deployed one Prometheus or In-cluster Prometheus per cluster.

Set Up Prometheus Server

To set up the Prometheus Server, you have to perform the following actions.

Procedure

- Follow the instructions in this link to set up the Prometheus Server.

Set Up Prometheus with Nginx

To set up Prometheus behind an Nginx server, you have to perform the following actions.

Procedure

1 To set up Nginx, follow the instructions in this link.

2 Set up basic authentication using this link.

Supported Exporters

The Prometheus integration with vRealize Operations Manager supports metrics collection from the following exporters running on the Kubernetes cluster.

Exporter Name                      Support for Linux   Support for Windows   Supported Metrics
cAdvisor                           YES                 NO                    cAdvisor Metrics
cStatsExporter                     NO                  YES                   cStatsExporter Metrics
Telegraf Kubernetes Input plugin   YES                 YES                   Telegraf Metrics
kube-state-metrics                 YES                 YES                   kube-state-metrics
Windows-node-exporter              NO                  YES                   Windows Node Exporter Metrics
Node Exporter                      YES                 NO                    Node Exporter Metrics

Exporter Details

- Deploy the following exporters in your Kubernetes cluster to expose metrics at their respective endpoints, and then add them as targets in prometheus.yml.
- Install Node Exporter and Windows-Exporter on each node as a service.

The details of the exporters, such as name, official documentation, sample deployment, and important notes, are given in the following table.

cAdvisor
Official Documentation: https://github.com/google/cadvisor
Sample Deployment: cAdvisor Setup
Important Notes:
- Ensure that you add the nodename label and the nodename value in labels in prometheus.yml for vRealize Operations Manager to recognize that node.
- Deploy cAdvisor as a DaemonSet kind in the Kubernetes cluster.

cStatsExporter
Official Documentation: https://github.com/alexvaut/cStatsExporter
Image: docker pull projects.registry.vmware.com/vrops_metric_exporters/cstatsexporter@sha256:1547a44857612c616ab14cde945326ffb9497abd5541737ff7fd4f3e2af83226
Sample Deployment: cStatsExporter Setup
Important Notes:
- To get metrics for all the nodes consistently, cStatsExporter can be deployed as a Docker container on each Windows node.

Telegraf Kubernetes Input plugin
Official Documentation: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/kubernetes
Image for Telegraf for Windows: docker pull projects.registry.vmware.com/vrops_metric_exporters/telegraf-win@sha256:be76abe7efb53d4af999302af08e014a80651ad06491fabd38f5ed927124bf4a
Sample Deployment: Telegraf Kubernetes Plugin Setup for Windows And Linux
Important Notes:
- Set omit_hostname=true under [agent] in the telegraf.conf file or in the configmap.
- Deploy the Telegraf Kubernetes Plugin as a DaemonSet kind in the Kubernetes cluster.

kube-state-metrics
Official Documentation: https://github.com/kubernetes/kube-state-metrics
Sample Deployment: Deploy the yaml files from the given link.
Important Notes:
- This exporter can be deployed as a Deployment kind in the Kubernetes cluster.
- Expose port 8080 of kube-state-metrics using a NodePort service and add the job in prometheus.yml.

Windows-node-exporter
Official Documentation: https://github.com/prometheus-community/windows_exporter
Sample Deployment: Windows Exporter Setup on Windows Node
Important Notes:
- Install the Windows node exporter as a service on each Windows node.
- Ensure that you add the nodename label and the nodename value in labels in prometheus.yml for vRealize Operations Manager to recognize the node.

Node Exporter
Official Documentation: https://github.com/prometheus/node_exporter
Sample Deployment: Node Exporter Setup on Linux Nodes
Important Notes:
- Install Node Exporter as a service on each Linux node.
- Ensure that you add the nodename label and the nodename value in labels in prometheus.yml for vRealize Operations Manager to recognize the node.

Experimental Support for Application Metrics

Using the sidecar pattern, a Telegraf agent with the appropriate application input plugins (https://github.com/influxdata/telegraf#input-plugins) is deployed in the same pod that hosts the application. Alternatively, you can use or modify the Telegraf agent that is installed on each node for collecting node metrics, and add the appropriate input plugin configuration to monitor the applications deployed in Kubernetes. All these metrics must carry the proper pod name or pod ID so that vRealize Operations Manager can parse the metrics and display them in the vRealize Operations Manager user interface.

A sample sidecar deployment configuration of the Telegraf agent to monitor Redis is provided in Redis Telegraf Input Plugin.

Set up vROps configuration for third-party Prometheus exporters

You can follow these steps to modify the configuration file in vROps to support third-party Prometheus exporters.

The following is a sample metric received using the Prometheus collector:

kubernetes_system_container_memory_major_page_faults{container_name="kubelet",host="vrops-telegraf-k8s-fvcvr", job="k8s_telegraf_job",namespace="kube-system",node_name="k8s-01-node-01"}

This metric has the following components:

1 Metric Name: kubernetes_system_container_memory_major_page_faults

2 Label Name: container_name, host, job, namespace, node_name

3 Label Value: kubelet, vrops-telegraf-k8s-fvcvr, k8s_telegraf_job, kube-system, k8s-01-node-01

Given a metric name and a set of labels, time series are frequently identified using this notation:

<metric name>{<label name>=<label value>, ...}

For example, from the above sample metric, the service-related information can be extracted from the labels:

1 container_name – contains the name of Container

2 namespace – contains the name of Namespace

3 node_name – contains the name of Node

For more details, refer to https://prometheus.io/docs/concepts/data_model/#notation.
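The notation described above can be split into its components with a few lines of Python. This is a minimal sketch for illustration, not the parser that vRealize Operations Manager uses. The function name parse_series is hypothetical, and escaped characters inside label values are matched but not unescaped.

```python
import re

# <metric name>{<label name>="<label value>", ...}
_SERIES_RE = re.compile(r'^\s*([a-zA-Z_:][a-zA-Z0-9_:]*)\s*(?:\{(.*)\})?\s*$')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*"((?:[^"\\]|\\.)*)"')

def parse_series(text):
    """Split a series identifier into (metric_name, labels_dict)."""
    match = _SERIES_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid series identifier: {text!r}")
    name, label_body = match.group(1), match.group(2) or ""
    labels = dict(_LABEL_RE.findall(label_body))
    return name, labels

sample = ('kubernetes_system_container_memory_major_page_faults{'
          'container_name="kubelet",host="vrops-telegraf-k8s-fvcvr", '
          'job="k8s_telegraf_job",namespace="kube-system",node_name="k8s-01-node-01"}')
name, labels = parse_series(sample)
print(name)
print(labels["container_name"], labels["namespace"], labels["node_name"])
```

Applied to the sample metric, the labels dictionary holds the container, namespace, and node names under the container_name, namespace, and node_name keys, which are the values that the mapping configuration refers to.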

To map any Prometheus Metric to vROps Objects such as Namespace, Node, Pod, and Container, the metric Label Value should possess the identifier or name of these service instances which will later be mapped to an appropriate object in vROps.

Path to the YAML configuration file in vROps Appliance: /usr/lib/vmware-vcops/user/plugins/inbound/KubernetesAdapter3/conf/prometheusConfig.yml

Create a new mapping in the configuration file in the below format:

config_name: <mapping configuration name>
k8s_namespace:
  label_contains_name: ''
k8s_node:
  label_contains_name: ''
k8s_pod:
  label_contains_name: ''
k8s_container:
  label_contains_id: ''
  label_contains_name: ''

Use any one of the cases below:

Case 1: To map the sample metric to the Container object in vROps, fill only the k8s_container section.

k8s_container:
  label_contains_id: ''
  label_contains_name: 'container_name'

In this case, only the container name is available and present under the label container_name.

Case 2: To map the sample metric to the Namespace object in vROps, fill only the k8s_namespace section.

k8s_namespace:
  label_contains_name: 'namespace'

In this case, the Namespace name is present under the label namespace.

Case 3: To map the sample metric to the Node object in vROps, fill only the k8s_node section.

k8s_node:
  label_contains_name: 'node_name'

In this case, the Node name is present under the label node_name.

Note

- To avoid multiple label groups being displayed in the vROps metric chart, add the names of the labels to be dropped in the "Prometheus metric labels to exclude" section. For this purpose, go to Configuring Kubernetes Adapter Instance > Advanced Settings (Step 10 > Step g).
- After the configuration file is updated, you must restart the Kubernetes Adapter instance.
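The three cases above can be pictured as a lookup from a configured label name to the value that the metric carries for that label. The Python sketch below is an illustration under stated assumptions, not the adapter's implementation: the dictionaries mirror the YAML sections, the function name resolve is hypothetical, and trying the ID label before the name label is an assumption made for this example.

```python
# Labels of the sample metric shown earlier in this section.
SAMPLE_LABELS = {
    "container_name": "kubelet",
    "host": "vrops-telegraf-k8s-fvcvr",
    "job": "k8s_telegraf_job",
    "namespace": "kube-system",
    "node_name": "k8s-01-node-01",
}

# One mapping per case; only a single section is filled in each.
CASE_1 = {"k8s_container": {"label_contains_id": "", "label_contains_name": "container_name"}}
CASE_2 = {"k8s_namespace": {"label_contains_name": "namespace"}}
CASE_3 = {"k8s_node": {"label_contains_name": "node_name"}}

def resolve(mapping, labels):
    """Return (section, identifier) for the first filled section, or None."""
    for section, fields in mapping.items():
        # Assumption: an ID label, when configured, is tried before the name label.
        for field in ("label_contains_id", "label_contains_name"):
            label = fields.get(field, "")
            if label and label in labels:
                return section, labels[label]
    return None

for case in (CASE_1, CASE_2, CASE_3):
    print(resolve(case, SAMPLE_LABELS))
```

With the sample metric, Case 1 resolves to the container kubelet, Case 2 to the namespace kube-system, and Case 3 to the node k8s-01-node-01. A mapping whose label is missing from the metric resolves to nothing, which is why the label value must carry the identifier or name of the object.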

Sample Deployment Setups

In Prometheus integration, vRealize Operations Manager retrieves the metrics directly from Prometheus with the help of exporters running on the Kubernetes cluster.

Sample Deployment Setup in vRealize Operations Manager

The following sample deployment setups are available in vRealize Operations Manager.

- cAdvisor
- cStatsExporter
- Telegraf Kubernetes Input Plugin
- Kube-State-Metrics
- Windows Node Exporter
- Node Exporter

cAdvisor Setup

To deploy cAdvisor for monitoring Linux containers using the YAML file, you have to perform the following actions.

Procedure

1 Create cAdvisor.yaml (DaemonSet YAML).

apiVersion: apps/v1 # apps/v1beta2 in Kube 1.8, extensions/v1beta1 in Kube < 1.8
kind: DaemonSet
metadata:
  name: vrops-cadvisor
  namespace: kube-system
  labels:
    app: vrops-cadvisor
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
  selector:
    matchLabels:
      app: vrops-cadvisor
  template:
    metadata:
      labels:
        app: vrops-cadvisor
        version: v0.33.0
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: vrops-cadvisor
        image: google/cadvisor:v0.33.0
        resources:
          requests:
            memory: 250Mi
            cpu: 250m
          limits:
            cpu: 400m
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
          readOnly: true
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker #Mounting Docker volume
          readOnly: true
        - name: disk
          mountPath: /dev/disk
          readOnly: true
        ports:
        - name: http
          containerPort: 8080 #Port exposed
          hostPort: 31194 #Host's port - Port to expose your cAdvisor DaemonSet on each node
          protocol: TCP
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/docker #Docker path in Host System
      - name: disk
        hostPath:
          path: /dev/disk

2 Run kubectl apply -f cadvisor.yaml.

The DaemonSet and service are created on your Kubernetes cluster.

3 Run kubectl get all -o wide.

This command shows where the cAdvisor container is created.

4 Copy the IP address of the node and the port which the cadvisor service has exposed.

5 Add this as a job in prometheus.yml file in the Prometheus server.
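Because the DaemonSet above is created in the kube-system namespace, a plain kubectl get all -o wide run against the default namespace may not list it. The following is a minimal sketch, assuming kubectl is already configured for the cluster and curl is installed, of how steps 3 and 4 could be checked from a shell. The function names are hypothetical and are not part of the management pack.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Show the vrops-cadvisor pods and the node each one runs on.
cadvisor_pods() {
  kubectl -n kube-system get pods -l app=vrops-cadvisor -o wide
}

# Print the InternalIP address of every node, one per line.
node_ips() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}

# Succeed if cAdvisor answers on the host port of the given node.
# $1 = node IP, $2 = port (defaults to 31194, the hostPort in the YAML).
cadvisor_reachable() {
  curl -fsS --max-time 5 "http://$1:${2:-31194}/metrics" >/dev/null
}
```

For example, running cadvisor_reachable with each address printed by node_ips indicates which node IP and port pairs can be used as Prometheus targets.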

Add cAdvisor Exporter

To add the cAdvisor exporter to the prometheus.yml file, perform the following actions.

Procedure

1 Add the following job configuration to prometheus.yml.

- job_name: 'cadvisor'
  static_configs:
    - targets: ['Node IP where Cadvisor is running:31194']
      labels:
        nodename: 'nodename'
  proxy_url: 'http://ProxyIP'

Note Add the nodename label and the nodename value in labels in prometheus.yml for vRealize Operations to recognize the node.

2 Restart the service: service prometheus restart.

3 Add a job in the prometheus.yml file for each Linux node, or add multiple NodeIP:Port entries for cAdvisor to the targets array.
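When several nodes are scraped, each target still needs its own nodename label. As a sketch only, the hypothetical helper below prints a cAdvisor job with one static_configs entry per node. The node names and IP addresses in the sample call are placeholders, and proxy_url is left out on the assumption that no proxy is required.

```shell
#!/bin/sh
# Hypothetical helper: print a cAdvisor scrape job with one static_configs
# entry per node, so that every target carries its own nodename label.
# Arguments are nodename=ip pairs; port 31194 matches the DaemonSet hostPort.
cadvisor_job() {
  printf '%s\n' "- job_name: 'cadvisor'" "  static_configs:"
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    ip=${pair#*=}      # text after the first "="
    printf "    - targets: ['%s:31194']\n" "$ip"
    printf "      labels:\n"
    printf "        nodename: '%s'\n" "$name"
  done
}

# Sample call with placeholder values.
cadvisor_job worker-1=10.0.0.11 worker-2=10.0.0.12
```

The printed fragment can be pasted under scrape_configs in prometheus.yml and reviewed before the service is restarted.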

cStatsExporter Setup

You can use cStatsExporter to retrieve the Windows container metrics in two ways. Perform the following actions to retrieve the metrics information.

Procedure

u Deploy cStatsExporter directly as a Docker container on each Windows node, using the Docker image in Harbor.

a Use Secure Shell (SSH) to access the Windows node and run the following command:

Run >> docker run --rm -p 9030:9030 -v \\.\pipe\docker_engine:\\.\pipe\docker_engine projects.registry.vmware.com/vrops_metric_exporters/cstatsexporter@sha256:1547a44857612c616ab14cde945326ffb9497abd5541737ff7fd4f3e2af83226

b If you deploy the exporter as a Docker container on port 9030, add port 9030 to the job in prometheus.yml.

Add cStatsExporter

To add the cStatsExporter to the prometheus.yml file, perform the following actions.

Procedure

1 Add the following job configuration to prometheus.yml.

- job_name: 'cstatsExporter'
  static_configs:
    - targets: ['Node IP where cstats is running:9030']
  proxy_url: 'http://ProxyIP'

2 Restart the service: service prometheus restart.

3 Add a job in the prometheus.yml file for each Windows node, or add multiple NodeIP:Port entries for cStatsExporter to the targets array.

Telegraf Kubernetes Plugin Setup for Windows And Linux

To deploy the Telegraf Kubernetes plugin for Windows and Linux containers using the YAML files, perform the following actions.

Procedure

u Deploy the following YAML files on your Kubernetes cluster:

Telegraf for Linux

apiVersion: v1
kind: ServiceAccount
metadata:
  name: vrops-mp-user
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vrops-mp-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: vrops-mp-user
  namespace: kube-system
---
apiVersion: apps/v1 # apps/v1beta2 in Kube 1.8, extensions/v1beta1 in Kube < 1.8
kind: DaemonSet
metadata:
  name: vrops-telegraf-k8s
  namespace: kube-system
  labels:
    app: vrops-telegraf-k8s
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
  selector:
    matchLabels:
      app: vrops-telegraf-k8s
  template:
    metadata:
      labels:
        app: vrops-telegraf-k8s
        version: v1.0
    spec:
      serviceAccountName: vrops-mp-user
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: vrops-telegraf-k8s-container
        image: telegraf:1.16.0
        resources:
          requests:
            memory: 250Mi
            cpu: 250m
          limits:
            cpu: 400m
        volumeMounts:
        - name: telegraf-d
          mountPath: /etc/telegraf
        ports:
        - name: http
          containerPort: 9273 # Port exposed
          hostPort: 31196 # Host's port - Port to expose your telegraf DaemonSet on each node
          protocol: TCP
        env:
        - name: METRIC_SOURCE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
      automountServiceAccountToken: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: telegraf-d
        projected:
          sources:
          - configMap:
              name: vrops-telegraf-k8s-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: vrops-telegraf-k8s-config
  namespace: kube-system
data:
  telegraf.conf: |
    # Configuration for telegraf agent
    [global_tags]
      namespace = "$NAMESPACE"
    [agent]
      interval = "10s"
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      collection_jitter = "0s"
      flush_interval = "10s"
      flush_jitter = "0s"
      precision = ""
      quiet = false
      hostname = ""
      omit_hostname = true
    ##################################################################
    # OUTPUT PLUGINS #
    ##################################################################
    # Configuration for the Prometheus client to spawn
    [[outputs.prometheus_client]]
      ## Address to listen on
      listen = ":9273"
    ##################################################################
    # INPUT PLUGINS #
    ##################################################################
    [[inputs.kubernetes]]
      ## URL for the kubelet
      url = "https://$NODE_IP:10250"
      insecure_skip_verify = true

Telegraf for Windows

apiVersion: v1
kind: ServiceAccount
metadata:
  name: vrops-mp-user
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vrops-mp-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: vrops-mp-user
  namespace: kube-system
---
apiVersion: apps/v1 # apps/v1beta2 in Kube 1.8, extensions/v1beta1 in Kube < 1.8
kind: DaemonSet
metadata:
  name: vrops-telegraf-k8s-win
  namespace: kube-system
  labels:
    app: vrops-telegraf-k8s-win
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
  selector:
    matchLabels:
      app: vrops-telegraf-k8s-win
  template:
    metadata:
      labels:
        app: vrops-telegraf-k8s-win
        version: v1.0
    spec:
      serviceAccountName: vrops-mp-user
      containers:
      - name: vrops-telegraf-k8s-container-win
        image: projects.registry.vmware.com/vrops_metric_exporters/telegraf-win@sha256:be76abe7efb53d4af999302af08e014a80651ad06491fabd38f5ed927124bf4a
        resources:
          requests:
            memory: 250Mi
            cpu: 250m
          limits:
            cpu: 400m
        volumeMounts:
        - name: telegraf-d
          mountPath: C:\Program Files\Telegraf\
        ports:
        - name: http
          containerPort: 9273 # Port exposed
          hostPort: 31197 # Host's port - Port to expose your telegraf DaemonSet on each node
          protocol: TCP
        env:
        - name: METRIC_SOURCE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: NODE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
      automountServiceAccountToken: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: telegraf-d
        projected:
          sources:
          - configMap:
              name: vrops-telegraf-k8s-config-win
      nodeSelector:
        kubernetes.io/os: windows
      tolerations:
      - key: "windows"
        operator: "Equal"
        value: "2019"
        effect: "NoSchedule"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: vrops-telegraf-k8s-config-win
  namespace: kube-system
data:
  telegraf.conf: |
    # Configuration for telegraf agent
    [global_tags]
      namespace = "$NAMESPACE"
    [agent]
      interval = "60s"
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      collection_jitter = "0s"
      flush_interval = "10s"
      flush_jitter = "0s"
      precision = ""
      quiet = false
      hostname = ""
      omit_hostname = true
    ##################################################################
    # OUTPUT PLUGINS #
    ##################################################################
    # Configuration for the Prometheus client to spawn
    [[outputs.prometheus_client]]
      ## Address to listen on
      listen = ":9273"
    ##################################################################
    # INPUT PLUGINS #
    ##################################################################
    [[inputs.kubernetes]]
      ## URL for the kubelet
      url = "https://$NODE_IP:10250"
      bearer_token = "C:/var/run/secrets/kubernetes.io/serviceaccount/token"
      insecure_skip_verify = true
---
apiVersion: v1
kind: Service
metadata:
  name: vrops-telegraf-k8s-win
  namespace: kube-system
  labels:
    app: vrops-telegraf-k8s-win
spec:
  externalTrafficPolicy: Local
  ports:
  - port: 9273
    targetPort: 9273
    nodePort: 31197
  selector:
    app: vrops-telegraf-k8s-win
  type: NodePort
---

Add Telegraf Exporter

To add the Telegraf exporter to the prometheus.yml file, perform the following actions.

Procedure

1 Add the following job configuration to prometheus.yml.

- job_name: 'telegraf-exporter'
  static_configs:
    - targets: ['Node IP:31196']
  proxy_url: 'http://ProxyIP'
- job_name: 'telegraf-win-exporter'
  static_configs:
    - targets: ['Node IP:31197']
  proxy_url: 'http://ProxyIP'

2 Restart the service: service prometheus restart.
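A syntax error in prometheus.yml can keep Prometheus from starting after the restart. The sketch below, which assumes that the promtool utility shipped with Prometheus is on the PATH and that the configuration lives at /etc/prometheus/prometheus.yml (both are assumptions about your installation), validates the file first and restarts the service only if validation passes. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: restart Prometheus only if the edited file validates.
# $1 = path to prometheus.yml (the default path is an assumption).
restart_prometheus() {
  config=${1:-/etc/prometheus/prometheus.yml}
  if ! promtool check config "$config"; then
    echo "prometheus.yml failed validation; service not restarted" >&2
    return 1
  fi
  service prometheus restart
}
```

The same wrapper can be used after each of the exporter jobs in this chapter is added.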

Windows Exporter Setup on Windows Node

To deploy Windows Exporter for monitoring Windows containers using the YAML file, you have to perform the following actions.

Procedure

u Run the following commands on each Windows node to install windows-exporter as a service. You can find the latest windows-exporter MSI file on the windows_exporter releases page on GitHub.

$url = "https://github.com/prometheus-community/windows_exporter/releases/download/v0.14.0/windows_exporter-0.14.0-amd64.msi"
$output = "C:\Users\windows.msi"
$start_time = Get-Date
$wc = New-Object System.Net.WebClient
$wc.DownloadFile($url, $output)
msiexec /i C:\Users\windows.msi ENABLED_COLLECTORS=cpu,cs,logical_disk,net,os,service,system,textfile,container,memory

Add Windows Exporter

To add the Windows exporter to the prometheus.yml file, perform the following actions.

Procedure

1 Add the following job configuration to prometheus.yml.

- job_name: 'windows-exporter'
  static_configs:
    - targets: ['NodeIP:9182']
      labels:
        nodename: 'nodename'
  proxy_url: 'http://ProxyIP'

Note Add the nodename label and the nodename value in labels in prometheus.yml for vRealize Operations Manager to recognize the node.

2 Restart the service: service prometheus restart.

3 Add jobs in the prometheus.yml file for each Windows node.

Node Exporter Setup on Linux Nodes

Run the following commands on each Linux node to install Node Exporter as a service.

Procedure

1 Download the node exporter binary and move it to /usr/local/bin. You can find the latest binaries on the official GitHub releases page for node_exporter.

wget https://github.com/prometheus/node_exporter/releases/download/v*/node_exporter-*.*-amd64.tar.gz
tar xvfz node_exporter-*.*-amd64.tar.gz
sudo mv node_exporter-*.*-amd64/node_exporter /usr/local/bin/

2 Create a node_exporter user to run the node exporter service.

sudo useradd -rs /bin/false node_exporter

3 Create a node_exporter service file under systemd.

sudo tee /etc/systemd/system/node_exporter.service <<EOF
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
EOF

4 Reload the system daemon and start the node exporter service.

sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter

5 Verify that the node exporter service is in the active (running) state.

sudo systemctl status node_exporter

You can view all the server or node metrics at the following URL:

http://<Node-IP>:9100/metrics
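The metrics URL can also be checked from a shell before the job is added to Prometheus. As a minimal sketch, assuming curl and grep are available on the machine that runs the check, the hypothetical function below counts the sample lines whose names start with node_, the prefix that Node Exporter uses for its metrics.

```shell
#!/bin/sh
# Hypothetical check: count the node_* samples served by Node Exporter.
# $1 = IP address of the Linux node; port 9100 is the exporter default.
node_exporter_sample_count() {
  curl -fsS --max-time 5 "http://$1:9100/metrics" | grep -c '^node_'
}
```

A count of zero, or a curl error, suggests that the service is not running or that port 9100 is blocked between the two machines.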

Add Node Exporter

To add the node exporter to the prometheus.yml file, perform the following actions.

Procedure

1 Add the following job configuration to prometheus.yml.

- job_name: 'node-exporter'
  static_configs:
    - targets: ['NodeIP:9100']
      labels:
        nodename: 'nodename'
  proxy_url: 'http://ProxyIP'

Note Add the nodename label and the nodename value in labels in prometheus.yml for vRealize Operations Manager to recognize the node.

2 Restart the service: service prometheus restart.

3 Add jobs in the prometheus.yml file for each Linux node.

Redis Telegraf Input Plugin

To deploy Redis Telegraf Input Plugin using the YAML file, you have to perform the following actions.

Procedure

u Deploy the following YAML in the Kubernetes cluster.

Note Ensure that omit_hostname is set to true under agent and that the podname label is set under global_tags.

apiVersion: v1
kind: ConfigMap
metadata:
  name: telegraf-config
data:
  telegraf.conf: |+
    [global_tags]
      podname = "$POD_NAME"
    [agent]
      interval = "10s"
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      collection_jitter = "0s"
      flush_interval = "10s"
      flush_jitter = "0s"
      precision = ""
      quiet = false
      hostname = ""
      omit_hostname = true
    [[outputs.prometheus_client]]
      listen = ":9273"
    [[inputs.redis]]
      servers = ["tcp://localhost:6379"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
      annotations:
        telegraf.influxdata.com/inputs: |+
          [[inputs.redis]]
            servers = ["tcp://localhost:6379"]
        telegraf.influxdata.com/class: app
    spec:
      containers:
      - name: redis
        image: redis:4
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 6379
      - name: telegraf
        image: telegraf:1.16.0
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        volumeMounts:
        - name: telegraf-config-volume
          mountPath: /etc/telegraf/telegraf.conf
          subPath: telegraf.conf
          readOnly: true
      volumes:
      - name: telegraf-config-volume
        configMap:
          name: telegraf-config
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  type: NodePort
  selector:
    app: redis
  ports:
  - name: prom
    protocol: TCP
    port: 9273
    targetPort: 9273
    nodePort: 30008

Add Redis Telegraf Exporter

To add the Redis Telegraf exporter to the prometheus.yml file, perform the following actions.

Procedure

1 Add the following job configuration to prometheus.yml.

- job_name: 'redisExporter'
  static_configs:
    - targets: ['Node IP:30008']
  proxy_url: 'http://ProxyIP'

2 Restart the service: service prometheus restart.

Supported Integrations

The vRealize Operations Management Pack for Kubernetes 1.5 or later integrates with VMware Tanzu Kubernetes Grid Integrated Service and VMware Tanzu Mission Control to auto-discover provisioned Kubernetes clusters and bring them inside vRealize Operations for monitoring. This integration also supports third-party container platforms such as RedHat OpenShift, Rancher Labs, or any supported upstream Kubernetes running on top of VMware vSphere or native public cloud virtual machine services such as AWS EC2 or Microsoft Azure virtual machine service.

Supported Integrations in vRealize Operations Management Pack for Kubernetes

The vRealize Operations Management Pack for Kubernetes 1.5.2 supports the following integrations:

n Upstream Kubernetes 1.16–1.19

n Tanzu Kubernetes Grid Integrated or Tanzu Mission Control 1.16–1.19

n OpenShift 4.3–4.5

Configuring vRealize Operations Management Pack for Kubernetes

You can configure vRealize Operations Management Pack for Kubernetes on vRealize Operations after you install the solution.

Preparing Your Environment

Configure the Kubernetes adapter only when TKGI is not used to deploy the Kubernetes Clusters.

Prerequisites

n Ensure you have installed vRealize Operations 8.1 or later or have access to vRealize Operations Cloud.

n Verify that you have installed vRealize Operations Management Pack for Kubernetes.

n Verify that the Kubernetes Clusters available in the TKGI Environment have the cAdvisor DaemonSet configured on port 31194.

n You must have the cluster.admin role assigned to you.

Note You can create user-defined custom roles, but they must have read-only access to all the resources in the cluster and to the Kubernetes API.
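One possible shape for such a custom role is sketched below. It is an illustration rather than a role defined by the management pack: the role name vrops-read-only and the function name are assumptions, and the rules grant get, list, and watch on every resource (which includes Secrets), so review them against your own security policy before use.

```shell
#!/bin/sh
# Hypothetical example: create a read-only ClusterRole with kubectl.
# The role name vrops-read-only is an assumption, not a product default.
create_readonly_role() {
  kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vrops-read-only
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["*"]
    verbs: ["get"]
EOF
}
```

After the role exists, it can be bound to the account that the adapter uses, for example with kubectl create clusterrolebinding and its --clusterrole and --serviceaccount options.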

Configuring Kubernetes Adapter Instance

Configure the Kubernetes adapter only when TKGI is not used to deploy the Kubernetes Clusters.

Prerequisites

n Ensure you have installed vRealize Operations 8.1 or later or have access to vRealize Operations Cloud.

n Verify that you have installed vRealize Operations Management Pack for Kubernetes.

n Verify that the Kubernetes Clusters available in the TKGI Environment have the cAdvisor DaemonSet configured on port 31194.

Procedure

1 From the main menu of vRealize Operations Manager, click Administration, and then in the left pane, click Solutions.

2 From the Solutions list, select vRealize Operations Management Pack for Kubernetes.

3 Click the Configure icon to edit an object.

4 Enter the display name of the adapter.

5 Enter a description for the adapter.

6 Enter the http URL of the Kubernetes primary node in the Master URL text box.

7 Select the Collector Service as per your requirement.

The available options are Prometheus Server, Kubelet, and DaemonSet.

a If you select Kubelet or DaemonSet as the cAdvisor Service, you can select a cAdvisor service running inside the Kubelet or the one deployed externally as a DaemonSet.

b If you select Prometheus as the collection strategy service, you have to provide the server URL details in the credentials section.

Note By default, some Kubernetes deployments might have the cAdvisor service disabled on Kubelet. In such a situation, the cAdvisor service must be enabled on Kubelet or a standalone cAdvisor service must be deployed as a DaemonSet.

8 Enter the port number if cAdvisor is running as a DaemonSet.

Note If you select Prometheus as the collector service, you do not have to enter the cAdvisor Port (DaemonSet) details.

9 Enter the Credential details of the Master URL.

a Click the Add New icon.

b Select the authentication to connect to the Kubernetes API Server. vRealize Operations Management Pack for Kubernetes supports basic, client certificate, and token authentication.

Table 5-1. Authentication Types

Authentication Description

Basic Auth Uses HTTP basic authentication to authenticate API requests through authentication plugins.

Client Certification Auth Uses client certificates to authenticate API requests through authentication plugins.

Token Auth Uses bearer tokens to authenticate API requests through authentication plugins.

Note If you select Prometheus Server as the collector service, in the Manage Credential section you have to provide details for Prometheus Server, Prometheus endpoint username, and Prometheus endpoint password.

For more information, see Kubernetes Authentication.

10 Under Advanced Settings

a Select the collector that is used to manage the adapter processes.

b If the Kubernetes cluster is running on vCenter Server and the same server is monitored by the vCenter Adapter instance, you can view a link from the Kubernetes node to the vSphere Virtual Machine. To view the link, enter the IP address of the vCenter Server instance.

c If you want to monitor Java Process, then enable this option.

d If you want to delete the non-existent objects for a defined period, then select the time frame from the drop-down menu.

Note The object deletion schedule is applicable to the Kubernetes Monitoring management pack only, and is over and above the global setting object deletion policy.

e If you want to do cAdvisor Install Check, then enable this option.

f Enable this option to include all the Prometheus metric labels (label name & value) under the Prometheus metric group.

Note (Optional) After you enable or disable this option, you can delete and reconfigure the adapter instance to remove the historical metric data from vRealize Operations Manager.

g Enter the Prometheus labels that must be excluded from the Prometheus metric group.

The label names must be entered as a list of comma-separated, case-sensitive values. For example, if ten labels are displayed and only four of them are required, add the remaining six label names in this field.

h Click Save Settings.

11 Click Close.
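For the excluded-labels field in Advanced Settings, the value is simply the set of all displayed labels minus the ones you want to keep. The following sketch shows one way to compute that comma-separated list in a shell. The function name and the label names in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the labels to exclude, as a comma-separated list.
# $1 = all labels currently displayed, $2 = labels to keep (both comma-separated).
# Matching is case-sensitive, as the adapter setting requires.
# The body runs in a subshell so the IFS change does not leak to the caller.
excluded_labels() (
  all=$1
  keep=$2
  out=""
  IFS=','
  for label in $all; do
    case ",$keep," in
      *",$label,"*) ;;                      # label is kept: skip it
      *) out="${out:+$out,}$label" ;;       # label is not kept: exclude it
    esac
  done
  printf '%s\n' "$out"
)
```

For example, excluded_labels "job,instance,pod,container,namespace,node" "job,pod" prints instance,container,namespace,node, which is the value to enter in the field.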

Tanzu Kubernetes Grid Integrated Overview

VMware Tanzu Kubernetes Grid provides organizations with a consistent, upstream-compatible, regional Kubernetes substrate across software-defined data centers (SDDC) and public cloud environments that is ready for end-user workloads and ecosystem integrations. Tanzu Kubernetes Grid Integrated is a Kubernetes cluster provisioning engine that follows VMware's opinionated way of deploying Kubernetes. You can deploy Kubernetes clusters to vSphere, Amazon Web Services, and Google Cloud Platform.

Preparing Your Environment for TKGI

To monitor Kubernetes clusters created using Tanzu Kubernetes Grid Integrated (TKGI), you must first configure the TKGI adapter.

Prerequisites

n Verify that you have installed vRealize Operations Management Pack for Kubernetes.

n Verify that the TKGI API Hostname (FQDN) is accessible and resolvable.

n Verify that the Kubernetes Clusters available in the TKGI Environment have the cAdvisor DaemonSet configured on port 31194.

n You must have the pks.cluster.admin role assigned to you to use the LDAP credentials.
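The name-resolution and reachability prerequisites can be checked from a shell on the collector before the adapter is configured. This is only a sketch: it assumes a Linux host with getent and curl, the function names are hypothetical, and the port of the TKGI API depends on your deployment, so it is passed as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical pre-checks; the function names are illustrative only.

# Succeed if the host name resolves ($1 = FQDN of the TKGI API).
fqdn_resolves() {
  getent hosts "$1" >/dev/null
}

# Succeed if an HTTPS endpoint answers on the given host and port.
# -k skips certificate verification, which suits a quick reachability test only.
# $1 = FQDN, $2 = port used by your TKGI API.
https_reachable() {
  curl -ksS --max-time 5 -o /dev/null "https://$1:$2/"
}
```

If fqdn_resolves fails, fix DNS first; if it succeeds but https_reachable fails, check firewalls and proxy settings between the collector and the TKGI API.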

Configuring Tanzu Kubernetes Grid Integrated

Configure the Tanzu Kubernetes Grid Integrated (TKGI) adapter to monitor the Kubernetes clusters created using TKGI. The Kubernetes adapter instance is automatically created after you configure the TKGI adapter. If you are deploying the Kubernetes cluster through TKGI, do not configure the Kubernetes adapter instance. Using the PKS/TKGI environment details that you provide, the TKGI adapter instance queries the TKGI API every 5 minutes (the default collection interval) for newly deployed Kubernetes clusters and creates a Kubernetes adapter instance for each discovered cluster.

Procedure

1 From the main menu of vRealize Operations, click Administration, and then in the left pane, click Solutions > Repository.

2 From the Solutions list, select VMware vRealize® Operations™ Management Pack for Kubernetes.

3 Click the Configure icon to edit an object.

4 Select VMware Tanzu Kubernetes Grid Integrated (TKGI) Adapter from the Adapter list and configure the adapter instance.

Field Name Action

Display Name Enter the display name of the TKGI adapter.

Description (Optional) Enter a description for the adapter instance.

TKGI API Hostname (FQDN) Enter the API URL for the TKGI instance.

TKGI Instance Alias Enter the alias name for the adapter instance.

5 To add the credentials, click the Add icon.

Field Name Action

Credential Name The name by which you are identifying the configured credentials.

User name The user name to access the TKGI API.

PKS UAA Management Admin Client's secret The PKS UAA Management Admin client secret to access the api.

Proxy Hostname IP Address of the HTTP Proxy Server.

Proxy Port Proxy Port (80/8080).

Proxy Username Username to access proxy.

Proxy Password Password to authenticate proxy.

(Optional) UAA/LDAP Username Provide UAA credentials if internal UAA is enabled, or LDAP credentials if an LDAP server is enabled.

The User Account and Authentication (UAA) or Lightweight Directory Access Protocol (LDAP) credentials are used to communicate with the TKGI UAA server to obtain an authentication token and to configure the Kubernetes adapter instance to authenticate with the bearer token.

Note The LDAP credentials are required only if the OpenID Connect authentication service is enabled in TKGI.

UAA/LDAP Password UAA/LDAP password.

6 Click Validate Connection to validate the connection.

7 (Optional) From the Collector/Groups drop-down box in the Advanced Settings area, select the collector or collector group upon which you want to run the adapter instance. This option is set to the optimal collector by default.

8 Auto Configure Kubernetes Adapter Instance: Select the Enabled option to discover the Kubernetes cluster in a TKGI instance and create Kubernetes adapter instances automatically. Select the Disabled option to manually create the Kubernetes adapter instance.

9 Auto-accept Kubernetes Cluster SSL Certificate: Select the Enabled option to accept the untrusted certificates presented by the K8s adapter instances by default. Select the Disabled option to manually accept the untrusted certificates for the auto-configured K8s adapter instances.

10 Auto-delete Kubernetes Adapter Instance: Select the Enabled option to delete the K8s adapter instances for deleted Kubernetes clusters. Select the Disabled option to retain the K8s adapter instances.

11 Click Save.

Note By default, the TKGI Adapter instance auto-discovers the Kubernetes clusters available in the TKGI Environment. It creates an appropriate Kubernetes Cluster Resource and a Kubernetes Adapter instance against each cluster.

Tanzu Mission Control Overview

VMware Tanzu Mission Control™ is a centralized management platform for consistently operating and securing your Kubernetes infrastructure and modern applications across multiple teams and clouds. You can use Tanzu Mission Control to manage your entire Kubernetes footprint, regardless of where your clusters reside.

Note Only Amazon Web Services clusters are currently supported for TMC integration.

Preparing Your Environment for Tanzu Mission Control

To monitor the Kubernetes cluster using the vRealize Operations Management Pack for Kubernetes, you have to prepare the environment.

Prerequisites

n Verify that you have installed vRealize Operations Management Pack for Kubernetes.

n Verify that the Tanzu Mission Control (TMC) URL is accessible and resolvable.

n Verify that the Kubernetes Clusters available in the TMC Environment have the cAdvisor DaemonSet configured on port 31194.

To know more about VMware Tanzu Mission Control™, see the Tanzu Mission Control documentation.

Configuring Tanzu Mission Control

Configure the Tanzu Mission Control console to start managing clusters. Follow these steps to configure the Tanzu Mission Control.

Procedure

1 From the main menu of vRealize Operations, click Administration, and then in the left pane, click Solutions > Repository.

2 From the Solutions list, select VMware vRealize® Operations™ Management Pack for Kubernetes.

3 Click the Configure icon to edit an object.

4 Select Tanzu Mission Control Adapter from the Adapter list and configure the adapter instance.

Field Name Action

Display Name Enter the display name of the TMC adapter.

Description (Optional) Enter a description for the adapter instance.

Connect Information

TMC URL Enter the TMC Cloud URL for the TMC instance.

Credential Select the credential you want to use to sign on to the environment from the drop-down menu. To add new credentials, click the plus sign.

n Credential Name. The name by which you are identifying the configured credentials.

n CSP Refresh Token. Enter the CSP token details used for discovering the Kubernetes cluster from TMC.

Provide proxy details in the following fields if accessing the TKGI API requires proxy authentication.

n Proxy HostName

n Proxy Port

n Proxy Username

n Proxy Password

Validate Connection Click Validate Connection to validate the connection.

Advanced Settings Use Advanced Settings to define the following:

n Collectors/Groups. Select the collector or collector group on which you want to run the adapter instance.

This option is set to the optimal collector by default.

n Auto-accept Kubernetes Cluster SSL Certificate: Select the Enabled option to accept the untrusted certificates presented by the K8s adapter instances by default. Select the Disabled option to manually accept the untrusted certificates for the auto-configured K8s adapter instances.

n Enable cAdvisor Install Check: Select the Enabled option to enable install check on cAdvisor. Select the Disabled option to disable install check on cAdvisor.

n Auto-delete Kubernetes Adapter Instance: Select the Enabled option to delete the K8s adapter instances for deleted Kubernetes clusters. Select the Disabled option to retain the K8s adapter instances.

5 Click Save.

Collection Strategy

You can collect data for Kubernetes cluster using the Prometheus collector. You can also use Prometheus collector to collect custom metrics from cAdvisor Exporter, which maps to the original metrics collected by the adapter using cAdvisor.

Collection Strategy for Prometheus

Adapter Instance Configuration for Prometheus

1 Select Prometheus as the collector service.

2 Enter the Prometheus server endpoint and login credentials.

Note If Prometheus uses basic authentication, you must enter the user name and password.

3 Click Validate Connection to validate the connection.
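Before you enter the endpoint in the adapter, the same endpoint and credentials can be exercised from a shell. The sketch below, which assumes curl is installed, runs the built-in up query against the Prometheus HTTP API with basic authentication. The function name is hypothetical, and the URL, user name, and password are placeholders that you supply.

```shell
#!/bin/sh
# Hypothetical pre-check: run the "up" query against the Prometheus HTTP API.
# $1 = Prometheus base URL, $2 = user name, $3 = password (all placeholders).
prometheus_up() {
  curl -fsS --max-time 10 -u "$2:$3" "$1/api/v1/query?query=up"
}
```

A JSON response whose status field is success indicates that the endpoint and credentials are usable; an HTTP 401 error from curl points to wrong credentials.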

Collection Strategy for cAdvisor

The collection strategy for cAdvisor is the same as the one defined for cAdvisor in Configuring Kubernetes Adapter Instance. See that section to learn how to collect data using cAdvisor.

Alerts in vRealize Operations Management Pack for Kubernetes

The Kubernetes objects raise alerts and alert symptoms for instances that are available in the cluster.

Table 6-1. Alerts in vRealize Operations Management Pack for Kubernetes

| Alert Definition | Symptoms | Severity | Recommendation |
|---|---|---|---|
| Container CPU limit is set to unlimited | Container CPU limit is not defined | Info | A container running without a CPU limit may claim all of the Node's resources. Modify your Pod configuration to set a CPU limit on the affected container. A quick look at the CPU usage trend can help you set the limit. |
| Container CPU usage is high | Container CPU usage is higher than 90%; Container CPU usage is higher than 80%; Container CPU usage is higher than 70% | Critical; Immediate; Warning | Consider increasing the CPU limit on the affected container if the Node's resources permit. Otherwise, you may have to add a new Node to the cluster to relieve the CPU shortage. |
| Container Memory limit is set to unlimited | Container Memory limit is not defined | Info | A container running without a Memory limit may claim all of the Node's resources. Modify your Pod configuration to set a Memory limit on the affected container. A quick look at the Memory usage trend can help you set the limit. |
| Container is not available | Container is not available | Immediate | Redeploy the Pod and make sure it goes to the Ready state. |
| Container Memory usage is high | Container Process CPU usage is higher than 90%; Container Process CPU usage is higher than 80%; Container Process CPU usage is higher than 70% | Critical; Immediate; Warning | Consider increasing the Memory limit on the affected container if the Node's resources permit. Otherwise, you may have to add a new Node to the cluster to relieve the Memory shortage. |
| Container Process has high Memory Usage | Container Process Memory usage is higher than 90%; Container Process Memory usage is higher than 80%; Container Process Memory usage is higher than 70% | Critical; Immediate; Warning | Consider increasing CPU limit of the container. |
| Container Process has high Memory Usage | Node Process Memory usage is higher than 90%; Node Process Memory usage is higher than 80%; Node Process Memory usage is higher than 70% | Critical; Immediate; Warning | Consider increasing Memory limit of the container. |
| Master Node is not available | Master Node is not available | Immediate | Ensure that the Master Node is reachable and the API server is up and running. |
| Namespace is not Available | Namespace is not Available | Immediate | Check if the namespace has been deleted. |
| Node has high CPU Usage | Node CPU usage is higher than 90%; Node CPU Memory usage is higher than 80%; Node CPU Memory usage is higher than 70%; Node CPU Memory usage is higher than 60% | Critical; Immediate; Warning; Info | Consider increasing the CPU resources of the Node or adding a new Node to the cluster. |
| Node is not available | Node is not available | Immediate | Verify that the Node is reachable and in the Ready state. |
| Node has high Memory Usage | Node Memory usage is higher than 90%; Node Memory usage is higher than 80%; Node Memory usage is higher than 70%; Node Memory usage is higher than 60% | Critical; Immediate; Warning; Info | Consider increasing the Memory resources of the Node or adding a new Node to the cluster. |
| One of the Pods has highest CPU usage on Namespace | Pod with highest utilization on namespace has CPU usage higher than 90%; Pod with highest utilization on namespace has CPU usage higher than 80%; Pod with highest utilization on namespace has CPU usage higher than 70%; Descendant pod object (OR Operation): Pod memory usage is higher than 70%; Pod memory usage is higher than 80%; Pod memory usage is higher than 90% | Critical; Immediate; Warning; Info | Consider modifying the affected Pod configurations to increase CPU limits. |
| One of the pods has high Memory usage on Namespace | Pod with highest utilization on namespace has memory usage higher than 90%; Pod with highest utilization on namespace has memory usage higher than 80%; Pod with highest utilization on namespace has memory usage higher than 70%; Descendant pod object (OR Operation): Pod CPU Usage is higher than 70%; Pod CPU Usage is higher than 80%; Pod CPU Usage is higher than 90% | Critical; Immediate; Warning; Info | Consider modifying the affected Pod configurations to increase Memory limits. |
| One of the pods has high CPU usage on Service | Pod with highest utilization on Service has CPU usage higher than 70%; Pod with highest utilization on Service has CPU usage higher than 80%; Pod with highest utilization on Service has CPU usage higher than 90%; Descendant pod object (OR Operation): Pod CPU Usage is higher than 70%; Pod CPU Usage is higher than 80%; Pod CPU Usage is higher than 90% | Critical; Immediate; Warning | Consider modifying the affected Pod configurations to increase CPU limits. |
| One of the pods has high Memory usage on Service | Pod with highest utilization on Service has memory usage higher than 70%; Pod with highest utilization on Service has memory usage higher than 80%; Pod with highest utilization on Service has memory usage higher than 90%; Descendant pod object (OR Operation): Pod memory usage is higher than 70%; Pod memory usage is higher than 80%; Pod memory usage is higher than 90% | Critical; Immediate; Warning | Consider modifying the affected Pod configurations to increase Memory limits. |
| Pod has high CPU Usage | Pod CPU Usage is higher than 90%; Pod CPU Usage is higher than 80%; Pod CPU Usage is higher than 70%; Pod CPU Usage is higher than 60% | Critical; Immediate; Warning; Info | Review the individual usage of the affected Pod's containers and balance their CPU limits. |
| Pod has high Memory Usage | Pod memory usage is higher than 90%; Pod memory usage is higher than 80%; Pod memory usage is higher than 70%; Pod memory usage is higher than 60% | Critical; Immediate; Warning; Info | Review the individual usage of the affected Pod's containers and balance their Memory limits. |
| Pod is not available | Pod is not available | Critical | Redeploy the Pod and make sure it goes to the Ready state. |
| ReplicaSet is not available | ReplicaSet is not available | Immediate | Make sure that the Replica Set is present. |
| Service is not available | Service is not available | Immediate | Make sure that the Service is present. |
| Sum of Resource Requests of Pods exceed Node Capacity | CPU Requests greater than node capacity; Memory Requests greater than node capacity | Critical | The minimum CPU/Memory resources required to run the Pods on the affected node have exceeded the Node capacity. Consider increasing the Node resources or adding more Nodes to distribute the workload. |
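
To illustrate how the tiered usage symptoms relate to severities, the following Python sketch classifies a usage percentage with the thresholds listed for the Pod has high CPU Usage alert. It assumes that the symptoms and severities pair up in the order in which they are listed (90% with Critical, down to 60% with Info); the table does not state this pairing explicitly, and the names POD_CPU_THRESHOLDS and classify_usage are hypothetical. It is an illustration only, not how vRealize Operations evaluates symptoms.

```python
# Thresholds taken from the "Pod has high CPU Usage" alert.
# ASSUMPTION: each symptom pairs with the severity in the same list position.
POD_CPU_THRESHOLDS = [
    (90.0, "Critical"),
    (80.0, "Immediate"),
    (70.0, "Warning"),
    (60.0, "Info"),
]


def classify_usage(usage_percent, thresholds=POD_CPU_THRESHOLDS):
    """Return the severity of the highest threshold that usage exceeds, or None."""
    # Check the highest threshold first so the most severe match wins.
    for limit, severity in sorted(thresholds, reverse=True):
        # The symptoms read "higher than", so the comparison is strict.
        if usage_percent > limit:
            return severity
    return None
```

With these assumptions, a Pod at 85% CPU usage is classified as Immediate, and a Pod at exactly 60% raises no symptom because the comparison is strict.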

This chapter includes the following topics:

- Reports in vRealize Operations Management Pack for Kubernetes

Reports in vRealize Operations Management Pack for Kubernetes

A report is a snapshot of views. Reports provide a view of Kubernetes adapter instance objects in CSV and PDF format. The report in vRealize Operations Management Pack for Kubernetes is called the Kubernetes Adapter Instance Summary.

This report is based on the following views:

- Kubernetes adapter instance objects

- Containers with no memory limit

- Container with no CPU limit
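
The same kind of check that the last two views report can be approximated directly against a Pod manifest. The following Python sketch lists the containers in a Pod definition that set no limit for a given resource. It assumes the standard Kubernetes manifest layout (spec.containers[].resources.limits); the function name and the example Pod are hypothetical, and the sketch does not reflect how the report views are implemented.

```python
def containers_missing_limit(pod, resource):
    """Return the names of containers in a Pod manifest that set no limit for `resource`."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        # "resources" or "limits" may be absent or null in a manifest.
        limits = (container.get("resources") or {}).get("limits") or {}
        if resource not in limits:
            missing.append(container["name"])
    return missing


# Hypothetical Pod manifest, expressed as a Python dictionary.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example/app:1.0",
                "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
            },
            {
                "name": "sidecar",
                "image": "example/sidecar:1.0",
                "resources": {"limits": {"memory": "64Mi"}},
            },
            {"name": "logger", "image": "example/logger:1.0"},
        ]
    },
}
```

For the example Pod, containers_missing_limit(example_pod, "cpu") returns the sidecar and logger containers, and containers_missing_limit(example_pod, "memory") returns only the logger container.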

7 Dashboards in vRealize Operations Management Pack for Kubernetes

You can use the dashboards to view and troubleshoot objects in your Kubernetes cluster ecosystem that are monitored by vRealize Operations Management Pack for Kubernetes.

Access Dashboards

Procedure

1 To access the dashboards, from the main menu of VMware vRealize Operations, click Dashboards.

2 From the dashboard list, select Kubernetes Overview.

Kubernetes Overview

The overview dashboard provides an overall representation of the Kubernetes environment, nodes, pods, and containers. It shows the overall health status of clusters, nodes, and pods with their respective historical trends and metric charts.

Kubernetes Overview - Environment

The Kubernetes overview environment widget provides an overall view of Kubernetes adapter instances, their associated object information, alerts, and the health status of objects.

Figure 7-1. Kubernetes Overview - Environment

| Widget Name | Description |
|---|---|
| Search for a Kubernetes Cluster | This widget displays only Kubernetes instances, not all object types. You can retrieve the total metrics from the instances that are listed under this widget. |
| Summary of the Selected Cluster | This widget displays the total number of nodes, namespaces, pods, containers, and services within the Kubernetes cluster. |
| Any Alerts on the Nodes, Namespaces, Pods or Containers | This widget displays all the immediate and critical alerts for the nodes, namespaces, pods, or containers within a cluster. When you select an object type from the Search for a Kubernetes Cluster widget, only the corresponding immediate and critical alerts are populated. |
| Are the cluster members healthy? | This widget provides a hierarchical view of the object relationships of a Kubernetes cluster. |

Note: The Total Objects column in the Search for a Kubernetes Cluster widget does not match the sum of the objects in the Summary of the Selected Cluster widget. This is because the value of the total objects also includes the total count of replica sets and Java container processes.

Kubernetes Overview - Nodes

The Kubernetes overview nodes widget provides detailed information about nodes, node properties, health status, and metrics, along with a hierarchical representation of pod relationships.

Figure 7-2. Kubernetes Overview - Nodes

| Widget Name | Description |
|---|---|
| Top 5 Least Healthy Nodes in the Selected Cluster | This widget displays the top 5 least healthy nodes in the selected cluster. |
| Node Properties | This widget displays the properties of the node that is selected in the Top 5 Least Healthy Nodes in the Selected Cluster widget. |
| Pods running on this Node | This widget provides a hierarchical view of the pods and their relationships on a selected cluster. |
| Pick another Metric or Property if needed | This widget lists all the metrics and properties for a selected node. It is populated when a node is selected in the Top 5 Least Healthy Nodes in the Selected Cluster widget. |
| Node Metric Chart | This widget provides a chart of the metric or property that is selected in the previous widget. |

Kubernetes Overview - Pods and Containers

The Kubernetes overview pods and containers widget provides detailed information about pod health status, a hierarchical representation of pod relationships, metrics, and so on.

Figure 7-3. Kubernetes Overview - Pods and Containers

| Widget Name | Description |
|---|---|
| Top 25 Least Healthy Pods in the Selected Cluster | This widget displays the least healthy pods in the selected cluster. |
| How is the Pod associated with other components | This widget provides a hierarchical view of the pods and their relationships with other components on a selected cluster. |
| Pick any Metric from the selected component | This widget lists all the metrics and properties for a selected pod. It is populated when a pod is selected in the How is the Pod associated with other components widget. |
| Metric Chart | This widget provides a chart of the metric or property that is selected in the previous widget. |

Kubernetes POD and Container Availability - Overview

The Kubernetes POD and Container Availability - Overview widget provides heat map details for the pods and containers of a cluster. You can view the details of all the pods and containers associated with the entire cluster.

Figure 7-4. Kubernetes POD and Container Availability

| Widget Name | Description |
|---|---|
| POD Availability | This widget displays the heat map and pod availability details for the entire cluster. |
| Container Availability | This widget displays the heat map and container availability details for the entire cluster. |
