z/VM Performance Analysis
DESCRIPTION
Presentation delivered at CMG Brasil 2013
TRANSCRIPT
1
z/VM Performance Analysis
Lívio Sousa - [email protected] zEnterprise Client Technical Specialist
2
Overview
• Guidelines
• Commands
• *MONITOR
• Performance Toolkit
• Omegamon XE
3
Definition of Performance
• Performance definitions:
– Response time
– Batch elapsed time
– Throughput
– Resource consumed per unit of work done
– Utilization
– Users supported
– Phone ringing
– Consistency
• All of the above
4
Performance Guidelines
• Processor
• Storage
• Paging
• Minidisk cache
• Server machines
5
Processor Guidelines
• Dedicated processors - mostly political
– Absolute share can be almost as effective
– Gets wait state assist and a 500 ms minor time slice
– Perhaps not a good idea if you are CPU-constrained
– A virtual machine should have all dedicated or all shared processors
• Share settings
– Use absolute if you can judge the percent of resources required
– Use relative if it is difficult to judge and if a lower share as system load increases is acceptable
– Be aware that the share value is split across virtual CPUs
– Do not use LIMITHARD settings unnecessarily
• Masks looping users
• More scheduler overhead
• Use the right number of virtual processors for the guest's workload
• Don't share all available IFLs with all LPARs
– Suspend Time can be high
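The point above that a share value is split across virtual CPUs can be illustrated with a small sketch; the function name and the RELATIVE 100 figure are illustrative, not from the presentation:

```python
# Sketch: with a relative share, CP divides the virtual machine's share
# across its virtual CPUs, so adding vCPUs does not increase entitlement.
def per_vcpu_share(relative_share: float, vcpus: int) -> float:
    """Effective relative share backing each virtual CPU."""
    return relative_share / vcpus

# A guest at RELATIVE 100 with 4 vCPUs competes with roughly
# RELATIVE 25 of weight behind each virtual processor.
print(per_vcpu_share(100, 4))
```

This is one reason to give a guest only as many virtual processors as its workload can actually use.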
6
Storage Guidelines
• Virtual-to-real ratio should be <= 3:1, or make sure the paging system is robust
– To avoid any performance impact for production workloads, you may need to keep the ratio at 1:1
– See also http://www.vm.ibm.com/perf/tips/memory.html
– VIR2REAL EXEC (Bruce Hayden): http://www.vm.ibm.com/download/packages/descript.cgi?VIR2REAL
• Define some processor storage as expanded storage to provide a paging hierarchy
– For more background, see http://www.vm.ibm.com/perf/tips/storconf.html
• Size guests appropriately
– Avoid over-provisioning
– Do not put them in a high guest-paging position
– Right-sized usually means "just barely swapping"
• Exploit shared memory where possible
– IPL your Linux guests from a segment
– Use the Linux XIP (execute-in-place) file system
Total Virtual storage (all logged on userids):  388308 MB (379.2 GB)
Usable real storage (pageable) for this system: 202927 MB (198.2 GB)
Total LPAR Real storage:                        204800 MB (200.0 GB)
Expanded storage usable for paging:              25600 MB ( 25.0 GB)
Total Virtual disk (VDISK) space defined:        50176 MB ( 49.0 GB)
Average Virtual disk size:                         512 MB
Virtual + VDISK to Real storage ratio:  2.2 : 1
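As a cross-check, the ratio on the last line of the VIR2REAL output above can be recomputed from the reported figures (a minimal sketch; the variable names are ours):

```python
# Recompute the virtual-to-real ratio from the VIR2REAL figures above (MB).
total_virtual_mb = 388308  # total virtual storage, all logged-on userids
total_vdisk_mb = 50176     # total virtual disk (VDISK) space defined
usable_real_mb = 202927    # usable (pageable) real storage

ratio = (total_virtual_mb + total_vdisk_mb) / usable_real_mb
print(f"Virtual + VDISK to Real storage ratio: {ratio:.1f} : 1")
```

This reproduces the 2.2 : 1 ratio on the report's last line, comfortably under the 3:1 guideline.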
7
Paging Guidelines
• Keep DASD paging allocations less than or equal to 50%
– QUERY ALLOC PAGE
• Watch blocks read per paging request (keep >10)
– Long block runs make paging I/O efficient
• Multiple volumes and multiple paths
– Remember, one I/O per real device at a time
– Use lots of little volumes rather than a few big volumes
– Pay attention to response time and wait queues
• Do not mix sizes of paging DASD
– Use all -3s, or all -9s, or whatever
• Paging to FCP SCSI (EDEVICES) may offer higher paging bandwidth at the cost of higher processor requirements
– See also http://www.vm.ibm.com/perf/tips/prgpage.html
8
Reorder Processing - Background
• Page reorder is the process in z/VM of managing user frame owned lists as input to demand scan processing.
– It includes resetting the hardware reference bit.
– It serializes the virtual machine (all virtual processors).
– It exists in all releases of z/VM.
• It is done periodically on a per-virtual-machine basis.
• The cost of reorder is proportional to the number of resident frames for the virtual machine.
– Roughly 130 ms per GB resident
– Delays of ~1 second for a guest with 8 GB resident
– This can vary for different reasons by +/- 40%
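The ~130 ms/GB rule of thumb above translates directly into an expected serialization window. A sketch, using the +/- 40% band quoted on the slide:

```python
# Estimate reorder serialization time from resident storage size,
# using the slide's rule of thumb of ~130 ms per resident GB, +/- 40%.
def reorder_delay_ms(resident_gb: float,
                     ms_per_gb: float = 130.0,
                     variation: float = 0.40) -> tuple[float, float, float]:
    nominal = resident_gb * ms_per_gb
    return nominal * (1 - variation), nominal, nominal * (1 + variation)

low, mid, high = reorder_delay_ms(8)  # 8 GB resident guest
print(f"~{mid:.0f} ms (range {low:.0f}-{high:.0f} ms)")
```

For the 8 GB guest this gives a nominal estimate of about 1040 ms, the ~1 second delay mentioned above.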
9
Reorder Processing - Diagnosing
• Performance Toolkit
– Check the resident page fields ("R<2GB" & "R>2GB") on the FCX113 UPAGE report
• Remember, Reorder works against the resident pages, not total virtual machine size.
– Check Console Function Mode Wait ("%CFW") on the FCX114 USTAT report
• A virtual machine may be brought through console function mode to serialize Reorder. There are other ways to serialize for Reorder, and there are other reasons for CFW, so this is not conclusive.
• REORDMON
– Available from the VM Download Page: http://www.vm.ibm.com/download/packages/
– Works against raw MONWRITE data for all monitored virtual machines
– Works in real time for a specific virtual machine
– Reports how often Reorder processing occurs in each monitor interval
10
REORDMON Example
          Num. of  Average   Average
Userid   Reorders Rsdnt(MB) Ref'd(MB) Reorder Times
-------- -------- --------- --------- -------------------
LINUX002        2     18352     13356 13:29:05 14:15:05
LINUX001        1     22444      6966 13:44:05
LINUX005        1     14275      5374 13:56:05
LINUX003        2     21408     13660 13:43:05 14:10:05
LINUX007        1     12238      5961 13:51:05
LINUX006        1      9686      4359 13:31:05
LINUX004        1     21410     11886 14:18:05
11
Reorder Processing - Mitigations
• Try to keep the virtual machine as small as possible.
• Virtual machines with multiple applications may need to be split into multiple virtual machines with fewer applications.
• See http://www.vm.ibm.com/perf/tips/reorder.html for more details.
• Apply APAR VM64774 if necessary:
– SET and QUERY commands, system-wide settings
– Corrects a problem in the earlier "patch" solution that inhibited paging of PGMBKs (page tables) for virtual machines where Reorder is set off
– z/VM 5.4.0 PTF UM33167 RSU 1003
– z/VM 6.1.0 PTF UM33169 RSU 1003
12
Minidisk Cache Guidelines
• In general, enable MDC for everything
• Configure some real storage for MDC
• Set maximum MDC limits
– SET MDC STOR 0M 256M and SET MDC XSTOR 0M 0M
• Disable MDC for
– Write-mostly or read-once disks (logs, accounting, Linux swap)
– Target volumes in backup scenarios
• Performs better than Virtual Disk in Storage (VDISK) for read I/Os
13
Server Machine Guidelines
• Server Virtual Machine (SVM)
– TCP/IP, RACFVM, etc.
• QUICKDSP ON to avoid the eligible list
• Higher SHARE setting
• Ensure performance data includes these virtual machines
14
CP INDICATE Command
• LOAD: shows total system load
– Processors, XSTORE, paging, MDC, queue lengths, storage load
– The STORAGE value is not very meaningful
• USER EXP: more useful than plain USER
• QUEUES EXP: great for scheduler problems and quick state sampling
– Mostly useful for eligible list assessments
• PAGING: lists users in page wait
• I/O: lists users in I/O wait
• ACTIVE: displays number of active users over given interval
• Consider using MONITOR DATA instead for "serious" examinations
15
CP INDICATE LOAD Example
INDICATE LOAD
AVGPROC-088% 03
XSTORE-000000/SEC MIGRATE-0000/SEC
MDC READS-000035/SEC WRITES-000001/SEC HIT RATIO-099%
PAGING-0023/SEC STEAL-000%
Q0-00007(00000)                           DORMANT-00410
Q1-00000(00000)           E1-00000(00000)
Q2-00001(00000) EXPAN-002 E2-00000(00000)
Q3-00013(00000) EXPAN-002 E3-00000(00000)
PROC 0000-087% PROC 0001-088% PROC 0002-089%
LIMITED-00000
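A few of the headline numbers in a response like the one above can be extracted with a short parser. This is a sketch that handles only the fields shown in this example, not every variant of the INDICATE LOAD response:

```python
# Pull average processor busy and paging rate out of an INDICATE LOAD
# response. The regular expressions match only the sample fields above.
import re

RESPONSE = ("AVGPROC-088% 03 XSTORE-000000/SEC MIGRATE-0000/SEC "
            "MDC READS-000035/SEC WRITES-000001/SEC HIT RATIO-099% "
            "PAGING-0023/SEC STEAL-000%")

avgproc = int(re.search(r"AVGPROC-(\d+)%", RESPONSE).group(1))
paging = int(re.search(r"PAGING-(\d+)/SEC", RESPONSE).group(1))
print(f"Average processor busy: {avgproc}%, paging rate: {paging}/sec")
```

For anything beyond a quick check, the monitor data stream (next slides) is the better source.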
16
Selected CP QUERY Commands
USERS: number and type of users on system
SRM: scheduler/dispatcher settings (LDUBUF, etc.)
SHARE: type and intensity of system share
FRAMES: real storage allocation
PATHS: physical paths to device and status
ALLOC MAP: DASD allocation
ALLOC PAGE: how full your paging space is
XSTORE: assignment of expanded storage
MONITOR: current monitor settings
MDC: MDC usage
VDISK: virtual disk in storage usage
SXSPAGES: System Execution Space
17
5,000 Foot View
[Diagram: CP control blocks, application data, and VM events flow into the *MONITOR System Service, which writes monitor records to the MONDCSS segment. From there, the MONWRITE utility captures raw MONWRITE data, and the Performance Toolkit (with its history files) serves 3270 and browser users over the TCP/IP network. VMRM also consumes the monitor data.]
18
19
Processor
REPORT NAME REPORT CODE COMMAND
CPU Load and Transactions FCX100 CPU
LPAR Load FCX126 LPAR
Processor Log FCX144 PROCLOG
LPAR Load Log FCX202 LPARLOG
User Wait States FCX114 USTAT
System Summary Log FCX225 SYSSUMLG
20
FCX126 Run 2011/09/20 18:00:56   LPAR        Logical Partition Activity
From 2011/09/13 09:19:15
To   2011/09/13 10:09:15
For  3000 Secs 00:50:00          Result of 13092011 Run
__________________________________________________________________________________________
Processor type and model     : 2817-401
Nr. of configured partitions : 6
Nr. of physical processors   : 25

Partition Nr. Upid #Proc Weight Wait-C Cap %Load CPU %Busy %Ovhd %Susp %VMld %Logld Type
LPAR1      1   00    24     100     NO  NO  89.0    0  94.3   2.1   6.5  92.0   98.4  IFL
                            100     NO              1  93.4   2.4   7.7  90.8   98.3  IFL
                            100     NO              2  93.6   2.3   7.4  91.1   98.3  IFL
                            100     NO              3  93.6   2.4   7.5  91.1   98.4  IFL
                            100     NO              4  93.6   2.3   7.4  91.1   98.4  IFL
                            100     NO              5  93.5   2.3   7.5  91.0   98.3  IFL
                            100     NO              6  93.4   2.4   7.6  90.9   98.3  IFL
                            100     NO              7  93.2   2.4   7.7  90.6   98.1  IFL
                            100     NO              8  93.4   2.4   7.5  90.8   98.2  IFL
                            100     NO              9  93.2   2.4   7.7  90.7   98.2  IFL
                            100     NO             10  93.1   2.5   7.8  90.4   98.0  IFL
                            100     NO             11  93.2   2.4   7.7  90.6   98.0  IFL
                            100     NO             12  93.4   2.4   7.5  90.8   98.1  IFL
                            100     NO             13  93.3   2.3   7.5  90.8   98.1  IFL
                            100     NO             14  93.3   2.4   7.5  90.7   98.1  IFL
                            100     NO             15  93.2   2.5   7.6  90.5   97.9  IFL
                            100     NO             16  91.1   2.9   9.0  88.0   96.6  IFL
                            100     NO             17  91.3   2.8   8.8  88.2   96.7  IFL
                            100     NO             18  91.4   2.9   8.9  88.3   96.8  IFL
                            100     NO             19  91.5   2.7   8.8  88.5   97.0  IFL
                            100     NO             20  91.7   2.8   8.7  88.6   97.1  IFL
                            100     NO             21  91.5   2.8   8.9  88.5   97.1  IFL
21
FCX225 Run 2011/09/20 18:00:56   SYSSUMLG    System Performance Summary by Time
From 2011/09/13 09:19:15
To   2011/09/13 10:09:15
For  3000 Secs 00:50:00          Result of 13092011 Run
_________________________________________________________________________________
         <------- CPU --------> <Vec> <--Users--> <---I/O---> <Stg> <-Paging--> <--Ratio-->
                                                  SSCH  DASD  Users <-Rate/s-->
Interval Pct       Cap- On-  Pct  Log-            +RSCH Resp  in    PGIN+ Read+
End Time Busy T/V  ture line Busy ged   Activ     /s    msec  Elist PGOUT Write
>>Mean>> 90.0 1.10 .9293 24.0 ....  117   108     571.2    .4    .0  2610  1051
09:20:15 92.4 1.13 .9059 24.0 ....  117   108     523.0    .5    .0  1992 527.8
09:21:15 92.9 1.07 .9523 24.0 ....  117   108     399.2    .5    .0  1669 301.4
09:22:15 93.2 1.08 .9458 24.0 ....  117   108     557.4    .3    .0  2817 633.9
09:23:15 94.5 1.07 .9535 24.0 ....  117   108     590.3    .3    .0  1410 482.7
09:24:15 93.4 1.07 .9537 24.0 ....  117   108     649.5    .4    .0  2363 488.5
09:25:15 90.4 1.09 .9347 24.0 ....  117   108     684.7    .4    .0  2485 768.9
09:26:15 92.4 1.08 .9436 24.0 ....  117   108     666.8    .4    .0  2940  1215
09:27:15 90.9 1.09 .9344 24.0 ....  117   108     607.2    .4    .0  3179 726.7
09:28:15 92.2 1.08 .9469 24.0 ....  117   108     664.2    .5    .0  2179 896.0
09:29:17 90.8 1.10 .9318 24.0 ....  117   108     645.9    .6    .0  3404 804.5
09:30:16 89.5 1.19 .8579 24.0 ....  117   108     670.8    .7    .0  5402  3487
09:31:15 92.7 1.08 .9412 24.0 ....  117   108     588.7    .4    .0  3091  1807
09:32:15 91.2 1.09 .9421 24.0 ....  117   108     602.8    .3    .0  2635  1076
09:33:16 89.3 1.14 .9047 24.0 ....  117   108     255.2    .5    .0  3140 710.5
09:34:15 88.5 1.10 .9374 24.0 ....  117   108     205.2    .6    .0  2513 897.4
09:35:15 85.9 1.12 .9257 24.0 ....  117   108     320.4    .5    .0  3117 953.5
09:36:16 86.1 1.13 .9144 24.0 ....  117   108     213.5    .5    .0  3642  1144
09:37:16 83.0 1.14 .9090 24.0 ....  117   108     245.6    .5    .0  3414  2133
22
REPORT NAME REPORT CODE COMMAND
Auxiliary Storage Log FCX146 AUXLOG
CP Owned Device FCX109 DEVICE CPOWNED
User Page Data FCX113 UPAGE
Shared Data Spaces FCX134 DSPACESH
SXS Available Page Queues Mgnt FCX261 SXSAVAIL
Minidisk Cache Storage FCX178 MDCSTOR
Storage Utilization FCX103 STORAGE
Available List Log FCX254 AVAILLOG
Storage
23
FCX109 Run 2011/05/31 17:44:26   DEVICE CPOWNED   Load and Performance of CP Owned Disks
From 2011/05/12 16:48:41
To   2011/05/12 17:31:41
For  2580 Secs 00:43:00          Result of 20110512 Run
_______________________________________________________________________________
Page / SPOOL Allocation Summary
PAGE slots available      25165k    SPOOL slots available    3605928
PAGE slot utilization        25%    SPOOL slot utilization       65%
T-Disk cylinders avail.  .......    DUMP slots available           0
T-Disk space utilization    ...%    DUMP slot utilization        ..%

<-- Device Descr. -->           <------------- Rate/s ------------->       User  Serv MLOAD
Volume Area   Area   Used <--Page---> <--Spool--> SSCH  Inter Queue Time  Resp
Addr Devtyp Serial Type  Extent   %  P-Rds P-Wrt S-Rds S-Wrt Total +RSCH feres Lngth /Page Time
EDF1   9336 ZDPAG1 PAGE  12583k  25  196.5 199.9   ...   ... 396.4    .0     0  8.18   5.5 88.0
EDF2   9336 ZDPAG2 PAGE  12583k  24  194.2 206.1   ...   ... 400.4    .0     0  7.23   6.0 58.4
4374   3390 610SP1 SPOOL 802880  61     .0    .0    .0    .0    .0    .1     0     0    .4   .4
4672   3390 610SP2 SPOOL 803060  68     .0    .0    .0    .0    .0    .0     0     0   1.0  1.0
24
I/O
REPORT NAME REPORT CODE COMMAND
General I/O Device FCX108 DEVICE
SCSI Device FCX249 SCSI
DASD Performance Log FCX131 DEVCONF
FICON Channel Load FCX215 INTERIM FCHANNEL
General I/O Device Data Log FCX168 DEVLOG
I/O Processor Log FCX232 IOPROCLG
25
Studying MONWRITE Data
• z/VM Performance Toolkit
• Interactively – possible, but not so useful
• PERFKIT BATCH command – pretty useful
– Control files tell Perfkit which reports to produce
– You can then inspect the reports by hand or programmatically
• See z/VM Performance Toolkit Reference for information on how to use PERFKIT BATCH
• PRFIT (Brian Wade): http://www.vm.ibm.com/download/packages/descript.cgi?PRFIT
26
Some Notes on z/VM Limits
• Sheer hardware:
– z/VM 5.2: 24 engines, 128 GB real
– z/VM 5.3: 32 engines, 256 GB real
– zSeries: 65,000 I/O devices
• Workloads we've run in test have included:
– 54 engines
– 440 GB real storage
– 128 GB XSTORE
– 240 1-GB Linux guests
– 8 1-TB guests
• Utilizations we routinely see in customer environments:
– 85% to 95% CPU utilization without worry
– Tens of thousands of pages per second without worry
• Our limits tend to have two distinct shapes:
– Performance drops off slowly with utilization (CPUs)
– Performance drops off rapidly when a wall is hit (storage)
[Chart: Performance vs. Utilization – a gradual drop-off curve (e.g., CPUs) and a precipitous drop-off curve (e.g., storage)]
27
Some Final Thoughts
• Define what performance means for your case
• Collect data for a baseline of good performance
• Implement a change management process
• Make as few changes as possible at a time
• Relieving one bottleneck will reveal another