Panel Discussion: Performance Measuring Tools for the Cloud
April 2016, Austin OpenStack Summit
Panelists & Presenters
Yuting Wu, AWCloud
Nicholas Wakou, Dell
Das Kamhout, Intel
XiaoFei Wang, AWCloud
Douglas Shakshober, Red Hat
https://www.openstack.org/summit/austin-2016/summit-schedule/events/7989
Agenda
- Measuring the Cloud Using Rally & CloudBench
- Measuring the Cloud Using Rally & Dichotomy
- Performance Analysis by HAProxy
- Industry Standard Benchmarks
Measuring the Cloud Using Rally & CloudBench
Douglas Shakshober, Red Hat Inc.
Red Hat – OpenStack Cloud Benchmark Efforts
Rally
- CPU loads: Linpack script, https://review.openstack.org/#/c/115439/
- Pros: tests cloud provisioning and auto-execution of workloads within the cloud

Cloud Bench
- CBtool: https://github.com/jtaleric/cbtool (added Netperf client/server)
- Pros: helps sync the start of execution and completion

SPECcloud
- Agreed-upon industry standard; add support for Fedora/CentOS/RHEL

Cinder
- Ceph Benchmark Tool (CBT)

Perfkit
- Google's general cloud benchmark harness
Rally – OSP Provisioning / Ping

Rally pros:
- Automates VM provisioning
- Monitors Nova successes/failures
- Contributed Linpack CPU load

Rally cons:
- Difficult to sync benchmarks
- VM cleanup scripts needed

Rally used for Linpack CPU performance (example: 256 VMs over 8 hosts, 32 per host)

OSP / Network Performance (example: Netperf VxLAN OVS using Mellanox 40 Gb networks)
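As an illustration of how Rally drives a provisioning run like the one above, here is a minimal task-file sketch using Rally's stock boot-and-delete scenario; the flavor and image names are placeholders for whatever exists in your cloud, and the exact counts would be tuned to the deployment:

```json
{
  "NovaServers.boot_and_delete_server": [
    {
      "args": {
        "flavor": {"name": "m1.small"},
        "image": {"name": "cirros"}
      },
      "runner": {"type": "constant", "times": 256, "concurrency": 32},
      "context": {"users": {"tenants": 2, "users_per_tenant": 2}}
    }
  ]
}
```

A file like this is launched with `rally task start <file>`, after which Rally reports per-iteration success/failure and timing.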
Cloud Bench harness
- Benchmark harness
- Syncs up benchmark start times
- Runs client/server
- Avoids VM-to-VM traffic on the same host

Shaker
- Distributed data-plane testing tool
- Multi-stream iperf benchmark
- https://github.com/openstack/shaker
Red Hat FIO scaling for Ceph (FIO 100 IOPS, RHEL 7.1, random 4k; 8 RHEL 7.1 compute hosts, 8 RHEL 7.1 KVM guests/host up to 64 VMs; 4 RHEL 6.6 Ceph 1.2.2 hosts, 12 cores/24 CPUs, 48 GB memory, 12 disks/host, 48 OSDs total)
Cloud Bench harness
- Benchmark harness
- Syncs up benchmark start times
- Adds a warm-up period to reach steady state
- Would like an fio parallel option to aggregate perf data

CBT – Ceph Benchmark Tool
- https://github.com/ceph/cbt
- Tests sequential/random I/O with FIO
- Range of transfer sizes: 4k-1m
- Queue depth: 1-64
- Runs on bare metal or KVM VMs
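A single point in the CBT-style sweep above can be approximated with a small standalone fio job file. This is a sketch, not CBT's actual configuration; the target device path and size are placeholders:

```ini
; One random-read point from the 4k-1m / QD 1-64 sweep described above
[global]
ioengine=libaio       ; asynchronous I/O engine
direct=1              ; bypass the page cache
runtime=60
time_based=1

[randread-4k-qd16]
rw=randread
bs=4k                 ; block size (sweep range: 4k-1m)
iodepth=16            ; queue depth (sweep range: 1-64)
filename=/dev/vdb     ; placeholder target device
size=1g
```

CBT generates and aggregates many such jobs across hosts; running one by hand (`fio job.fio`) is useful for sanity-checking a single guest.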
[Chart: average IOPs per guest, Ephemeral vs. Ceph random read/write, at 8, 16, 32, and 64 guests]
Measuring the Cloud Using Rally & Dichotomy
XiaoFei Wang, AWCloud
What the Rally tool can do
- Benchmarking tool for OpenStack
- A basic tool, alongside Tempest, for the OpenStack CI/CD system
- Cloud verification and profiling tool
- Deployment tool for DevStack
Rally and Dichotomy
When profiling cloud performance bottlenecks, it is hard to quickly judge orders of magnitude. For example: how many concurrent instance-creation requests can a single Nova API withstand?
How to use the Dichotomy
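The dichotomy here is essentially a binary search over load levels. A minimal sketch in Python, where `passes` is a hypothetical stand-in for checking that a Rally run at a given concurrency met its SLA:

```python
def max_passing_concurrency(passes, lo=1, hi=1024):
    """Binary-search the largest concurrency for which passes(c) is True.

    Assumes passes() is monotonic: once the service fails at some
    concurrency, it also fails at every higher concurrency.
    """
    best = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if passes(mid):      # e.g. run a Rally scenario at this concurrency
            best = mid
            lo = mid + 1     # service held up; try a heavier load
        else:
            hi = mid - 1     # service failed; back off
    return best

# Toy stand-in: pretend the API starts failing above 130 concurrent requests.
print(max_passing_concurrency(lambda c: c <= 130))  # prints 130
```

Each probe is one Rally run, so the limit is found in O(log n) runs instead of stepping through every concurrency level.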
How to apply Rally in an actual project

The Rally tool is used in a variety of ways, for example in CI with Jenkins and in production clouds.
Keystone, Nova, Cinder, Neutron, Glance, etc. all run behind HAProxy. The following are the actual results.
Performance Analysis by HAProxy
Yuting Wu, AWCloud
HAProxy is often used to load balance each OpenStack API service. HAProxy records many statistics about the traffic passing through it in its log. By analyzing the HAProxy log, we can determine the performance bottleneck of an OpenStack API service.
HAProxy Load-Balancing OpenStack API Services
Performance Analysis by HAProxy Log
Field 8 - Tq/Tw/Tc/Tr/Tt (log tags %Tq/%Tw/%Tc/%Tr/%Tt), example value 0/0/0/67/67:
- Tq: total time in milliseconds spent waiting for the client to send a full HTTP request, not counting data
- Tw: total time in milliseconds spent waiting in the various queues
- Tc: total time in milliseconds spent waiting for the connection to establish to the final server, including retries
- Tr: total time in milliseconds spent waiting for the server to send a full HTTP response, not counting data
- Tt: total time in milliseconds elapsed between the accept and the last close; it covers all possible processing

Field 9 - status_code (log tag %ST), example value 200:
- HTTP status code returned to the client

Field 10 - bytes_read (log tag %B), example value 2591:
- Total number of bytes transmitted to the client when the log is emitted
Performance Analysis by HAProxy Log (Continued)
Field 11 - captured_request_cookie / captured_response_cookie (log tags %cc / %cs), example value "- -":
- captured_request_cookie: optional "name=value" entry indicating that the client had this cookie in the request
- captured_response_cookie: optional "name=value" entry indicating that the server has returned a cookie with its response

Field 12 - termination_state / cookie_status (log tag %tsc), example value "----":
- termination_state: condition the session was in when the session ended
- cookie_status: status of cookie persistence

Field 13 - actconn/feconn/beconn/srv_conn/retries (log tags %ac/%fc/%bc/%sc/%rc), example value 15/2/0/0/0:
- actconn: total number of concurrent connections on the process when the session was logged
- feconn: total number of concurrent connections on the frontend when the session was logged
- beconn: total number of concurrent connections handled by the backend when the session was logged
- srv_conn: total number of concurrent connections still active on the server when the session was logged
- retries: number of connection retries experienced by this session when trying to connect to the server

Field 14 - srv_queue/backend_queue (log tags %sq/%bq), example value 0/0:
- srv_queue: total number of requests which were processed before this one in the server queue
- backend_queue: total number of requests which were processed before this one in the backend's global queue
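As an illustration, the timer and status fields above can be pulled out of an HAProxy HTTP log line with a small script. This is a sketch that assumes the default HTTP log layout (a customized log-format would need a different pattern), and the sample line is synthetic:

```python
import re

# Matches the Tq/Tw/Tc/Tr/Tt timer block followed by status_code and
# bytes_read, as they appear in HAProxy's default HTTP log format.
FIELDS = re.compile(
    r'(?P<Tq>-?\d+)/(?P<Tw>-?\d+)/(?P<Tc>-?\d+)/(?P<Tr>-?\d+)/(?P<Tt>-?\d+)'
    r' (?P<status>\d{3}) (?P<bytes>\d+)'
)

def parse_line(line):
    """Return the timing/status fields of one log line, or None."""
    m = FIELDS.search(line)
    return {k: int(v) for k, v in m.groupdict().items()} if m else None

# Synthetic sample line; backend name "nova-api" is illustrative.
line = ('10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in nova-api/srv1 '
        '10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 "GET /servers HTTP/1.1"')
fields = parse_line(line)
# A consistently large Tr relative to Tt points at the backend API service
# itself, while large Tw or Tc implicates queuing or connection setup.
print(fields["Tr"], fields["status"])  # prints: 69 200
```

Aggregating these per-field timings across a log (e.g. percentiles of Tr per backend) is how the bottlenecked OpenStack service is identified.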
Industry Standard Benchmarks
Nicholas Wakou, Dell
Performance Consortia
Standard Performance Evaluation Corporation (www.spec.org)
Transaction Processing Performance Council (www.tpc.org)
SPEC Cloud IaaS 2016 Benchmark
- Measures performance of Infrastructure-as-a-Service (IaaS) clouds
- Measures both control and data plane
  - Control: management operations, e.g., instance provisioning time
  - Data: virtualization, network performance, runtime performance
- Uses workloads that resemble "real" customer applications; benchmarks the cloud, not the application
- Produces metrics ("elasticity", "scalability", "provisioning time") which allow comparison
Benchmark and Workload Control
Cloud SUT: each group of boxes represents an application instance.
Benchmark Harness
The benchmark harness comprises Cloud Bench (CBTOOL), baseline/elasticity drivers, and report generators.
For white-box clouds the benchmark harness is outside the SUT. For black-box clouds, it can be in the same location or campus.
YCSB: a framework with a common set of workloads for evaluating the performance of different key-value and cloud-serving stores.
KMeans:
- Hadoop-based CPU-intensive workload
- Chose the Intel HiBench implementation
SPEC Cloud Workloads
What is Measured
Measures the number of AIs (application instances) that can be loaded onto a cluster before SLA violations occur.
Measures the scalability and elasticity of the Cloud under Test (CuT).
Not a measure of Instance density
SPEC Cloud workloads can individually be used to stress the CuT:
- KMeans: CPU/memory
- YCSB: I/O
SPEC Cloud IaaS 2016 High Level Report Summary
Fair usage
A tester cannot selectively report benchmark metrics. Their reporting in product or marketing descriptions is governed by fair usage.