TRANSCRIPT
Linux on z Systems Disk I/O Performance
Martin Kammerer, Linux on z Systems Performance Evaluation
2017-10-09 / I016653
IBM Systems Technical Events – ibm.com/training/events
Technical University
Munich 2017
Motivation and Agenda
● In many Linux systems the disk attachment is not set up to achieve the best throughput
● This material can be used as a checklist to review existing or future server configurations
● Describes tuning knobs on all layers
– Application program
– Linux kernel / device drivers
– Interconnects
– Storage server
● Considerations for
– Linux guests on z/VM and KVM
– KVM host
● z14 benefits
Linux Kernel Components Involved In Disk I/O
● Application program – issues reads and writes
● Virtual File System (VFS) – dispatches requests to the different devices and translates to sector addressing
● Logical Volume Manager (LVM) – defines the physical to logical device relation
● dm-multipath – sets the multipath policies
● Device mapper (dm) – holds the generic mapping of all block devices and performs the 1:n mapping for logical volumes and/or multipath
● Page cache – contains all file I/O data; direct I/O bypasses the page cache
● I/O scheduler – merges, orders and queues requests and starts the device drivers via (un)plug device
● Block device layer – data transfer handling
● Device drivers – Direct Access Storage Device (dasd) driver, Small Computer System Interface (SCSI) driver, z Fibre Channel Protocol (zFCP) driver, Queued Direct I/O (qdio) driver
Considerations For Application Programs (1)
● direct I/O:
– is enabled for read and write if a user program opens the file with the O_DIRECT flag
– bypasses the page cache
– provides reliable constant throughput values
– is recommended if high I/O rates are expected for a file
– is a good choice in all cases where the application does not want Linux to economize I/O and/or where the application itself buffers larger amounts of file content
● async I/O:
– is exploited if a user program issues aio_read or aio_write system calls
– prevents the application from being blocked in the I/O system call until the I/O completes
– allows Linux to merge read requests when the page cache is used
– is recommended if high I/O rates are required in a user program
– can be combined with direct I/O (performance hint of database products)
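As an illustration only (not part of the original slides), a minimal fio job shows how direct I/O and asynchronous I/O can be combined; the file name, sizes and queue depth are assumptions:

    # Hypothetical example: random reads that bypass the page cache (direct=1)
    # and are submitted asynchronously via libaio (ioengine, iodepth)
    fio --name=dbsim --filename=/data/testfile --size=4g \
        --rw=randread --bs=8k --direct=1 \
        --ioengine=libaio --iodepth=16 --runtime=60 --time_based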
Considerations For Application Programs (2)
● Temporary files:
– should not reside on real disks
– should be in a ram disk or tmpfs to allow fastest access to these files
– don't need to survive a crash and therefore don't require a journaling file system
● File system:
– use the default file system of the Linux distribution
– select a well suited journaling mode
● the “heaviest” mode could be the worst from a performance point of view
– to reduce contention in the file system directory
● if possible, omit recording of the access time for heavily used files (noatime or relatime), if not already set by default (as with xfs); see the sketch below
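A short sketch of both points (mount points, sizes and device names are assumptions, not from the slides):

    # Temporary files on tmpfs instead of a real disk
    mount -t tmpfs -o size=2g tmpfs /scratch

    # Mount a file system without access-time updates for heavily used files
    mount -o noatime /dev/dasdb1 /data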
Logical Volumes
● Linear logical volumes
– allow an easy extension of the file system
– don't increase performance
● Striped logical volumes
– provide the capability to perform simultaneous I/O on different stripes
– allow load balancing
– are extendable
– may lead to smaller I/O request sizes than linear logical volumes or normal physical volumes (when an I/O request contains more data than the stripe size); see the sketch after this list
● Don't use logical volumes for “/” or “/usr”
– If the logical volume gets corrupted your system is gone
● Logical volumes require more CPU cycles than physical disks
– Consumption increases with the number of physical disks used for the logical volume
● LVM / Logical Volume Manager is the tool to manage logical volumes
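A minimal sketch of creating a striped logical volume with LVM; the device names, stripe count and stripe size are assumptions and need to be matched to the workload:

    # Prepare four DASD partitions as physical volumes and group them
    pvcreate /dev/dasdb1 /dev/dasdc1 /dev/dasdd1 /dev/dasde1
    vgcreate vgdata /dev/dasdb1 /dev/dasdc1 /dev/dasdd1 /dev/dasde1

    # Striped logical volume: 4 stripes, 64 KiB stripe size
    lvcreate --name lvdata --size 200G --stripes 4 --stripesize 64k vgdata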
Linux Multipath (High Availability Feature For SCSI Volumes)
● Connects a volume over multiple independent paths and / or ports (path_grouping_policy)
● Multibus mode
– Uses all paths in a priority group alternately
– Determines the path by estimating the shortest service time from current data
● The default policy in multipath.conf is service-time
● Check the value actually used by the device mapper with “dmsetup table” (see the excerpt below)
– To overcome bandwidth limitations of a single path
– Increases throughput for SCSI volumes in all Linux distributions
– For load balancing
● Failover mode
– Keeps the volume accessible in case of a single failure in the connecting hardware
– Should one path fail, the operating system will route I/O through one of the remaining paths, with no changes visible to the applications
– Will not improve performance
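An illustrative /etc/multipath.conf excerpt plus the check mentioned above; the values shown are examples, not a recommendation from the slides:

    # /etc/multipath.conf (excerpt) – multibus spreads I/O over all paths;
    # use path_grouping_policy failover for the failover mode instead
    defaults {
        path_grouping_policy multibus
        path_selector        "service-time 0"
    }

    # Verify which policy the device mapper actually uses
    dmsetup table
    multipath -ll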
Page Cache Considerations
● Pros
– Page cache helps Linux to economize I/O
– Read requests can be made faster by adding a read ahead quantity, depending on the historical behavior of file accesses by applications (see the sketch after this list)
– Write requests are delayed and data in the page cache can have multiple updates before being written to disk
– Write requests in the page cache can be merged into larger I/O requests
● Cons
– Requires Linux memory pages
– Not useful when cached data is not exploited like
● Data that is needed only once
● Application buffers data itself
– Linux does not know which data the application really needs next, it only makes a guess
● No alternative if the application cannot handle direct I/O
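The read ahead quantity mentioned under Pros can be inspected or tuned per block device; the device name and value below are assumptions:

    # Show and set the read-ahead window (in 512-byte sectors)
    blockdev --getra /dev/dasdb
    blockdev --setra 512 /dev/dasdb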
Page Cache Administration
● /proc/sys/vm/swappiness
– Is one of several parameters to influence the swap tendency
– Can be set individually with percentage values from 0 to 100
– The higher the swappiness value, the more the system will swap
– A tuning knob to let the system prefer to shrink the Linux page cache instead of swapping processes
– Different kernel levels may react differently on the same swappiness value
– Setting swappiness to 0 can have unexpected effects
● [flush-<major>:<minor>]
– These threads write data out of the page cache per device
– /proc/sys/vm/dirty_background_ratio
● Controls whether writes are done more steadily to disk or more in batches
– /proc/sys/vm/dirty_ratio
● If this limit is reached, writing to disk has to happen immediately
– Both can be set individually with percentage values from 0 to 100
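For illustration, the parameters above can be inspected and set with sysctl; the values are examples only and need to be tuned for the workload:

    # Read the current values
    sysctl vm.swappiness vm.dirty_background_ratio vm.dirty_ratio

    # Example settings (percentages; values are assumptions)
    sysctl -w vm.swappiness=30
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=20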
I/O Scheduler
● Three different I/O schedulers are available
– noop scheduler
● Only last request merging
– deadline scheduler
● Plus full search merge; avoids read request starvation
– complete fair queuing (cfq) scheduler
● Plus all users of a particular drive can execute about the same number of I/O requests over a given time
● The default scheduler in current distributions is deadline
● Do not change the default scheduler / default settings without a reason
● It could reduce throughput or increase processor consumption
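If a change is nevertheless needed, the scheduler can be inspected and switched per device at run time; the device name is an assumption:

    # Show the available schedulers; the active one is shown in brackets
    cat /sys/block/dasdb/queue/scheduler

    # Switch this device to the noop scheduler (example only)
    echo noop > /sys/block/dasdb/queue/scheduler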
I/O Scheduler Performance
● deadline scheduler
– throughput and processor consumption moderate
● noop scheduler
– showed throughput similar to the deadline scheduler
– may save processor cycles
● cfq scheduler
– showed no advantage in any of our measurements
– tends to higher processor consumption
[Chart: the schedulers ordered from trivial with low processor consumption (noop) via deadline to complex with high processor consumption (cfq)]
General Layout For FICON/ECKD
● The dasd driver starts the I/O on a subchannel
● Each subchannel connects to all channel paths in the path group
● Each channel connects via a switch to a host bus adapter
● A host bus adapter connects to both servers
● Each server connects to its extent pools / ranks
[Diagram: the Linux I/O stack (application program, page cache, VFS, I/O scheduler, LVM, dm/dm-multipath, block device layer, dasd driver) drives device a through the channel subsystem (chpids 1-4), a switch and HBAs 1-4 into server 0 and server 1 of the storage server with their device adapters and ranks 1, 3, 5, 7]
FICON/ECKD dasd I/O To A Single Volume With Alias Devices
● VFS sees one device
● The dasd driver sees the real device and the alias devices
● Load balancing with alias devices is done by the dasd driver
[Diagram: the dasd driver drives base device a and alias devices b, c, d through the channel subsystem (chpids 1-4), the switch and HBAs 1-4 into storage servers 0 and 1 and their ranks]
FICON/ECKD dasd I/O To A Logical Volume
● VFS sees one device (logical volume)
● The device mapper sees the logical volume and the physical volumes
● Load balancing in the logical volume is done by the device mapper
[Diagram: LVM/dm map the logical volume onto physical volumes a, b, c, d; the dasd driver drives each of them through the channel subsystem (chpids 1-4), the switch and HBAs 1-4 into storage servers 0 and 1 and their ranks]
FICON/ECKD dasd I/O To A Logical Volume Combined With PAV Devices
● VFS sees one device (logical volume)
● The device mapper sees the logical volume and the physical volumes
● Load balancing in the logical volume is done by the device mapper
● The dasd driver sees the real device and all alias devices
● Load balancing with alias devices is done by the dasd driver
[Diagram: LVM/dm map the logical volume onto physical volumes a, b, c, d, each with its own alias devices (e-h, i-l, m-p); the dasd driver drives base and alias devices through the channel subsystem (chpids 1-4), the switch and HBAs 1-4 into storage servers 0 and 1]
I/O Processing Characteristics For FICON/ECKD
● 1:1 mapping host subchannel:dasd
● Serialization of I/Os per subchannel
● I/O request queue in Linux
● Disk blocks are 4 KiB
● High availability by FICON path groups
● Load balancing by FICON path groups and Parallel Access Volumes (HyperPAV, virtual PAV as z/VM guest)
● Linux dasd driver optimizes throughput and controls when to use
– Parallel Access Volumes (HyperPAV, boosts I/O)
– High Performance FICON (zHPF)
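A brief sketch of bringing HyperPAV alias devices online with s390-tools; the device numbers are assumptions:

    # Bring the base device and two alias devices online
    chccwdev -e 0.0.7200
    chccwdev -e 0.0.72fe
    chccwdev -e 0.0.72ff

    # List the DASD devices (alias devices appear in the list as well)
    lsdasd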
General Layout For FCP/SCSI
● The SCSI driver finalizes the I/O requests
● The zFCP driver adds the FCP protocol to the requests
● The qdio driver transfers the I/O to the channel
● A host bus adapter connects to both servers
● Each server connects to its extent pools / ranks
[Diagram: the Linux stack with SCSI, zFCP and qdio drivers reaches the storage server over chpids 5-8, the switch and HBAs 5-8 into server 0 and server 1 with their device adapters and ranks 2, 4, 6, 8]
FCP/SCSI LUN I/O To A Multipath Volume
● VFS sees one device
● The device mapper sees the multipath alternatives to the same volume
● Load balancing for the multipath volume is done by the device mapper
[Diagram: dm-multipath presents one volume; the SCSI/zFCP/qdio drivers reach it over several paths through chpids 5-8, the switch and HBAs 5-8 into storage servers 0 and 1]
FCP/SCSI LUN I/O To A Logical Volume Combined With Multipath Volumes
● VFS sees one device
● The device mapper sees the logical volume and the physical volumes and the multipath alternatives to the same volume
● Load balancing in the logical volume and for the multipath volumes is done by the device mapper
[Diagram: LVM maps the logical volume onto multipath volumes, each reachable over several paths through chpids 5-8, the switch and HBAs 5-8 into storage servers 0 and 1]
I/O Processing Characteristics For FCP/SCSI
● Several I/Os can be issued against a LUN immediately
● Queuing in the FICON Express card and/or in the storage server
● Additional I/O request queue in Linux
● Disk blocks are 512 Bytes for compatibility reasons
● High availability by Linux multipathing, type failover or multibus
● Load balancing and performance by Linux multipathing, type multibus
● The FCP/SCSI datarouter technique improves throughput
– Kernel parmline zfcp.datarouter=1 (default in recent Linux distributions)
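If the data router is not already enabled by default, it can be switched on via the kernel command line; the zipl.conf fragment below is an illustrative sketch and the root device name is an assumption:

    # /etc/zipl.conf (excerpt): append zfcp.datarouter=1 to the kernel parameters
    parameters = "root=/dev/mapper/rootvg-rootlv zfcp.datarouter=1"

    # Rewrite the boot loader and reboot for the change to take effect
    zipl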
Channel Paths
● High availability requires more than 1 channel path between host and storage server
● Throughput scales with the number of channel paths
● Switches should support the maximum channel speed
Channel Paths – Don’t Do This!!
● A small number of switch-to-switch interconnects, even at the same link speed, will degrade your carefully configured hosts and storage servers
Storage Servers
● Provide a big cache to avoid real disk I/O
● Provide non-volatile storage (NVS) to ensure that cached data which has not yet been destaged to the disk devices is not lost in case of a power outage
● Provide algorithms to optimize disk access and cache usage, optimizing for sequential I/O or for random I/O
● Have the biggest challenge with heavy random writes, because these are limited by the destage speed once the NVS is full
DS8000 Storage Server - Ranks
● The smallest unit is an array of disk drives
● A RAID 5 or RAID 10 technique is used to increase the reliability of the disk array, defining a rank
● The volumes for use by the host systems are configured on the rank
● The volume has stripes on each disk drive
● All configured volumes will share the same disk drives
[Diagram: disk drives combined with RAID technology into a rank, on which volumes are configured]
DS8000 Storage Server – Components
● A rank
– represents a raid array of disk drives
– is connected to the servers by device adapters
– belongs to one server for primary access
● A device adapter connects ranks to servers
● Each server controls half of the storage server's disk drives, read cache and NVS
[Diagram: device adapters connect ranks 1-8 to server 0 and server 1; each server has its own read cache and NVS]
DS8000 Storage Server – Extent Pool
● Extent pools
– consist of one or several ranks
– can be configured for one type of volumes only: either ECKD dasds or SCSI LUNs
– are assigned to one server
● Some possible extent pool definitions for ECKD and SCSI are shown in the example
[Diagram: example extent pool definitions, mixing ECKD and SCSI pools across ranks 1-8 on server 0 and server 1]
DS8000 Storage Pool Striped Volume
● A storage pool striped volume (rotate extents) is defined on an extent pool consisting of several ranks
● It is striped over the ranks in stripes of 1 GiB
● As shown in the example
– The 1 GiB stripes are rotated over all ranks
– The first storage pool striped volume starts in rank 1
[Diagram: a volume striped in 1 GiB extents that rotate over the ranks, starting in rank 1]
DS8000 Storage Pool Striped Volume
● A storage pool striped volume
– uses disk drives of multiple ranks
– uses several device adapters
– is bound to one server side
[Diagram: a storage pool striped volume spans disk drives in several ranks behind several device adapters, bound to one server side]
Storage Server Performance Considerations To Speed Up I/O
● Volume configuration in the storage server
– Configure the volumes as storage pool striped volumes in extent pools larger than 1 rank, to use simultaneously
● more physical disks
● more cache and NVS
● more device adapters
– Provide alias devices (HyperPAV) for FICON/ECKD volumes
● Striping in Storage Server or in Linux?
– Storage server striping
● Is simple and requires only some administration effort when the storage server is configured
● The striping effort is free of charge and not accounted as host CPU cycles
– Linux striping
● Can spread over all storage server extent pools
● Allows arbitrary stripe sizes to match workload characteristics
Best Performance With …
● The use of the newest FICON card technology and appropriate switches is key for best throughput
● Throughput scales with the number of channel paths
● If the volumes are configured with storage pool striping a Linux striped logical volume may not be necessary
● For FICON/ECKD the use of HyperPAV devices is most important for high throughput
● If I/O rate is low
– Using the default (Linux page cache I/O) is the best choice
● If I/O rate is high
– Choose direct I/O or even better direct I/O and async I/O (both need to be supported by the application program)
● While both FCP/SCSI and FICON/ECKD provide sufficient I/O throughput, FCP/SCSI is recommended if the absolute maximum performance is needed
– Especially for random I/O (e.g. databases) FCP/SCSI is the best choice
Linux on z Systems Disk I/O Performance as z/VM Guest
Martin Kammerer, Linux on z Systems Performance Evaluation
2017-10-09 / I016653
IBM Systems Technical Events – ibm.com/training/events
Technical University
Munich 2017
Linux Guests With z/VM dasd Or edev Minidisks
● VFS sees one device
● Request serialization at the single device to z/VM
● Load balancing with alias devices is done by the dasd driver
[Diagram: the Linux dasd stack issues I/O to a single minidisk device; z/VM passes it through the channel subsystem (chpids 1-4), the switch and HBAs 1-4 into storage servers 0 and 1]
Linux Guests With z/VM dasd Fullpack Minidisks With Virtual PAV
● VFS sees one device
● The dasd driver sees the real device and the alias devices
● Load balancing with alias devices is done by the dasd driver
[Diagram: the Linux dasd stack with base and alias devices; z/VM passes the I/O through the channel subsystem (chpids 1-4), the switch and HBAs 1-4 into storage servers 0 and 1]
Linux Guests With z/VM Attached SCSI Ports
● VFS sees one device
● The device mapper sees the multipath alternatives to the same volume
● Load balancing for the multipath volume is done by the device mapper
[Diagram: dm-multipath presents one volume; the SCSI/zFCP/qdio drivers reach it over z/VM attached FCP subchannels through chpids 5-8, the switch and HBAs 5-8 into storage servers 0 and 1]
Linux Guests With z/VM Disk I/O Performance Considerations
● All applies as already described for Linux
– Logical volumes
– Multipathing
– HyperPAV
– I/O paths to storage server
– Storage server configuration
– Storage pool striped volumes
● Use z/VM dasd or edev minidisks for performance uncritical purposes
KVM for z Systems Disk I/O Performance
Martin Kammerer, Linux on z Systems Performance Evaluation
2017-10-09 / I016653
IBM Systems Technical Events – ibm.com/training/events
Technical University
Munich 2017
Linux Kernel Components Involved In Disk I/O
Guest:
● Application program – issues reads and writes
● Virtual File System (VFS) – dispatches requests to the different devices and translates to sector addressing
● Page cache – contains all file I/O data; direct I/O bypasses the page cache
● Block device layer
● virtio_blk – transfer to the host
Host:
● virtio_blk / iothreads – transfer to the guest
● Page cache – contains all file I/O data; direct I/O bypasses the page cache
● Logical Volume Manager (LVM) – defines the physical to logical device relation
● dm-multipath – sets the multipath policies
● Device mapper (dm) – holds the generic mapping of all block devices and performs the 1:n mapping for logical volumes and/or multipath
● I/O scheduler – merges, orders and queues requests and starts the device drivers via (un)plug device
● Block device layer – data transfer handling
● Device drivers – Direct Access Storage Device (dasd) driver, Small Computer System Interface (SCSI) driver, z Fibre Channel Protocol (zFCP) driver, Queued Direct I/O (qdio) driver
KVM Host Disk I/O Performance Considerations
● All applies as already described for Linux
– Logical volumes
– Multipathing
– HyperPAV
– I/O paths to storage server
– Storage server configuration
– Storage pool striped volumes
Disk Setup - Native Block Device
● The disk devices are 'owned' by the host as DASD or FCP device, but not mounted!
● The disk devices are propagated to the guest as independent virtio-blk data-plane devices.
● In the guest, each resulting virtio-blk device is partitioned and formatted with a file system and mounted.
[Diagram: native devices disk 1 ... disk n are owned by the host and passed straight through to the guest, where each is mounted as its own guest file system /mnt1 ... /mntn]
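Purely as an illustration (the device path and target name are assumptions), a libvirt domain XML fragment of this kind passes a host block device to the guest as a virtio-blk data-plane device:

    <disk type='block' device='disk'>
      <!-- cache='none' and io='native' keep the host page cache out of the data path;
           iothread='1' makes this a data-plane device served by I/O thread 1 -->
      <driver name='qemu' type='raw' cache='none' io='native' iothread='1'/>
      <source dev='/dev/mapper/mpatha'/>
      <target dev='vda' bus='virtio'/>
    </disk>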
Disk Setup - Image File
● The disk devices are 'owned' by the host as DASD or FCP devices; partitions are formatted and mounted on the host
[Diagram: each host disk carries a partition with an XFS file system mounted at /mnt/diskN; the image file /mnt/diskN/file.img on it is presented to the guest and mounted there as /mntN]
● The image file resides in the host file system
● The image files are propagated to the guest as independent virtio-blk data-plane devices
● In the guest, each resulting virtio-blk device is partitioned and formatted with a file system and mounted
KVM Host Disk I/O, Virtual Devices (1)
● Para-virtualized devices hide the real storage devices for the guests
● Devices presented to the guest could be
– Block devices of DASD or SCSI volumes
● For performance critical purposes
– Disk image files
● For performance uncritical purposes
● Sparse files occupy only the written space (over commit disk space)
– Tradeoff: space consumption versus performance
Change in guest domain XML; affects this guest; increases throughput of this guest.
KVM Host Disk I/O, Virtual Devices (2)
● KVM host and guest operating system maintain a page cache
– Set cache=none in the device section of the guest configuration (recommended)
● Disables the host's page cache for this device
● Specify a quantity of iothreads in the guest configuration to enable simultaneous I/O
– Performance uncritical I/O devices can share the same iothread
– Performance critical I/O devices should have an individual iothread each
– The total number of iothreads should not be much bigger than the number of virtual processors of the guest
Change in guest domain XML; affects this guest and the KVM host; increases throughput of this guest.
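A sketch of the corresponding domain XML pieces (counts and device names are assumptions): the guest gets two I/O threads, the host page cache is disabled for the device, and the device is bound to one of the threads:

    <!-- domain XML excerpt: define two I/O threads for this guest -->
    <iothreads>2</iothreads>

    <!-- device section: host page cache disabled, device served by iothread 1 -->
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' iothread='1'/>
      <source dev='/dev/mapper/mpathb'/>
      <target dev='vdb' bus='virtio'/>
    </disk>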
Linux on z Systems Disk I/O on z14
Martin Kammerer, Linux on z Systems Performance Evaluation
2017-10-09 / I016653
IBM Systems Technical Events – ibm.com/training/events
Technical University
Munich 2017
Disk I/O on z14
● z14 FICON Express 16S+ versus z13 FICON Express 16S
– FICON Express 16S+ is supposed to allow much more IOPS than its predecessor
– With both FCP/SCSI and FICON/ECKD, the expected tripling of IOPS on a single host channel could be achieved
– Higher IOPS rates can improve throughput in situations where many I/Os (most likely short requests) are issued
● Database workloads could benefit from this
– I/O benchmarks showed throughput increase of up to 30% in all disciplines (sequential read and write, random read and write)
– No changes required to get improvements
Thank You!
Martin Kammerer, Manager Linux on z Systems Performance Evaluation
Research & Development, Schönaicher Strasse 220, 71032 Böblingen, Germany
Linux on System z – Tuning hints and tips http://www.ibm.com/developerworks/linux/linux390/perf/index.html
Live Virtual Classes for z/VM and Linux http://www.vm.ibm.com/education/lvc/
Mainframe Linux blog http://linuxmain.blogspot.com
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Other product and service names might be trademarks of IBM or other companies.
© Copyright IBM Corporation 2017. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
IBM Systems Technical Events – ibm.com/training/events