Life in the fast lane: zHyperLink
John Baker
IntelliMagic
November 2020
Session 4AO
IntelliMagic zAcademy Availability Intelligence
Agenda
• Overview
• Traditional vs Synchronous I/O
• “Sync” in context
• Connectivity and DRP implications
• zHyperLink eligibility and real-world results
• Q&A
Existing I/O components
CPU
IOP
Channels
FCD ports
Cache and NVS
HAs and ports
DAs and drives
The life of an I/O (host side)
Application Program: Opens dataset via JCL and reads or writes a logical record
Access Method: Builds channel program describing data location and type of access; may return immediately if the data is in a buffer
I/O Supervisor: Handles priorities and queuing (IOSQ/PAV) and issues SSCH to engage CSS hardware
Channel Subsystem: SAP selects path to disk subsystem; channel executes channel program and handles data flow (Pend: CSS/CMR)
Disk Subsystem
The life of an I/O (DSS side)
Channel Subsystem
Disk Subsystem
- Decodes channel program (CKD/FB)
- Manages cache and NVS
- Manages extent integrity (Pend DB)
- Initiates staging/destaging (Disc)
- Manages RAID structures
- Replication (Disc)
Hard Disk/Flash Drives
- Organized in Ranks/Pools/Arrays
- Read/write data
- Error checking
How long does all that take?
Synchronous I/O
Application Program
Access Method
I/O Supervisor
PCIe zHyperLink
• Channel subsystem is bypassed
• CPU waits for PCI/DMA operation to complete; no task switching
• Processor cache content is preserved
Response vs Response
[Chart: response times in microseconds; callout: zHyperLink response]
Sync Operation – Determined by z/OS
• Eligible I/Os are tried synchronously
• CP spins waiting for sync I/O to complete
• “Heritage” (traditional) I/O if cache miss – zHyperLink does not replace FICON
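The decision described on this slide can be sketched as a small model. This is only an illustration of the flow as stated (try eligible I/Os synchronously, fall back to heritage I/O on a cache miss); the class, field, and function names are hypothetical and do not correspond to any z/OS interface.

```python
from dataclasses import dataclass


@dataclass
class IoRequest:
    eligible: bool   # hypothetical flag: z/OS treats this I/O as zHyperLink-eligible
    cache_hit: bool  # hypothetical flag: data is already in storage system cache


def drive_io(req: IoRequest) -> str:
    """Return which path completes the I/O in this simplified model."""
    if not req.eligible:
        # Non-eligible I/O never attempts the synchronous path.
        return "FICON"
    # Eligible I/Os are tried synchronously; the CP spins while waiting.
    if req.cache_hit:
        return "zHyperLink sync"
    # Cache miss: the request is re-driven as heritage (traditional) I/O.
    return "FICON after sync attempt"
```

For example, `drive_io(IoRequest(eligible=True, cache_hit=False))` lands on the FICON path, which is why zHyperLink complements FICON rather than replacing it.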
“Sync I/O” – Db2 vs zHyperLink
Making the best of a bad situation: turning a (bad) Db2 sync I/O into a (good) zHyperLink sync I/O
zHyperLink sync I/O – Good
• Occurs when zHL eligible I/O experiences cache hit
• Reduces time required to complete I/O
• One 4K “block” (currently)
• Seek to maximize
Db2 read sync I/O – Bad
• Occurs for data not loaded in advance by prefetch I/O
• Causes “costly” application wait times
• One 4K “page”
• Seek to minimize
Context: CF vs zHyperLink
Familiarity with Coupling Facility concepts provides a great framework for understanding zHyperLink processing
zHyperLink
• Very fast: 20-30 mics
• CP spins during sync I/Os
• Safeguard from high CPU – cache misses re-driven as (async) FICON I/Os
Coupling Facility
• Ultra-fast: 5-8 mics
• CP spins during sync reqs
• Safeguard from high CPU – heuristic algorithm “converts” subsequent requests to async
Connectivity – CEC to Storage
FICON links
At least 2 zHyperLinks
Maximum distance: 150 meters
8 GB/sec optical links
DS8880
z14/z15
zHyperLink is always point-to-point – no directors
zHyperLink is a PCIe link, not a channel
No Metro Distance Mirrors
Common replication architecture for financial institutions
These synchronous replication distances are not supported with zHyperLink
zHyperLink for Writes
To exploit zHyperLink for writes, synchronous copy must be local using zHyperWrite (prerequisite)
• Storage frame driven synchronous replication is not supported
• HyperSwap is supported (within 150m)
• Asynchronous copy is supported in combination with zHyperLink
Replication approach may need to be reevaluated
Potential Read-Only Approach
• IF zHyperLink exploitation is limited to reads, AND
• IF zHL-eligible reads all occur on DS8880 direct connected to CECs within 150m
• THEN current replication design may be viable; writes still occur through FICON
[Diagram: two storage systems, each labeled “Primary: 150 m from CECs”]
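The IF / AND / THEN rule on this slide can be written as a small check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the only figure taken from the slides is the 150 m distance limit.

```python
from typing import Iterable

MAX_ZHL_DISTANCE_M = 150  # maximum zHyperLink distance quoted on the slides


def current_replication_may_stay(exploit_writes: bool,
                                 read_target_distances_m: Iterable[float]) -> bool:
    """Apply the slide's rule for the read-only approach.

    exploit_writes: True if zHyperLink is also to be used for writes.
    read_target_distances_m: hypothetical input, the CEC-to-storage distance
    for every direct-connected DS8880 that serves zHL-eligible reads.
    """
    if exploit_writes:
        # Writes bring the zHyperWrite prerequisite into play,
        # so the read-only shortcut does not apply.
        return False
    # All eligible reads must be served within the distance limit.
    return all(d <= MAX_ZHL_DISTANCE_M for d in read_target_distances_m)
```

With two primaries at 40 m and 120 m and reads only, the check passes; moving either primary beyond 150 m, or adding write exploitation, makes it fail and the replication design needs another look.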
© 2019 IBM Corporation
IBM Z – WSC Performance Team
zHyperLink™ Eligibility and Enablement
– Eligibility designed to show potential benefit, even before zHyperLink™ Enablement
– Maintenance stream captures Eligibility, even if actual function disabled because of
  • Caller, Lack of Infrastructure, DFSMS Storage Class Granularity; by default everything disabled (OA54822)
– Once zHyperLink enabled, Eligibility may still be utilized
  • DFSMS Storage Class Granularity – by default everything “disable”
  • Blocksize >4096
– Or, once zHyperLink enabled, zHPF may still occur
  • Even “intra data set” with zHyperLink Enabled – e.g. Db2 Synchronous Reads and Prefetch Reads
  • zHyperLink™ Read Misses (not a “Success”)
  • Today, Writes
Eligibility Callers
• Db2 v12 Sync Reads
• Db2 v11 Sync Reads
• VSAM Reads ++APARs
• Db2 Log Writes
Media Manager
Yes to all?
• zHyperLink Infrastructure
• Caller Enabled
• DFSMS Enabled
• <=4096 Blocksize
If yes: zHyperLink Enabled – Attempts/Successes, and other SMF 42-6 zHyperLink fields
If not: zHyperLink Eligible – Eligible / Read Hits
Non-eligible callers: zHPF – normal SMF 42-6 fields
This slide courtesy of John Burg, IBM
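One reading of the flow on this slide is a three-way classification of each request. The sketch below encodes that reading only; the class, field names, and return labels are hypothetical, and the mapping of "all checks pass" to "enabled" versus "otherwise eligible" is an interpretation of the diagram, not a documented algorithm.

```python
from dataclasses import dataclass

# Callers listed on the slide as eligibility callers.
ELIGIBLE_CALLERS = {
    "Db2 v12 Sync Reads",
    "Db2 v11 Sync Reads",
    "VSAM Reads",
    "Db2 Log Writes",
}
MAX_BLOCKSIZE = 4096  # "<=4096 Blocksize" on the slide


@dataclass
class Request:
    caller: str
    infrastructure_present: bool  # zHyperLink infrastructure installed
    caller_enabled: bool
    dfsms_enabled: bool           # storage class enabled for zHyperLink
    blocksize: int


def classify(req: Request) -> str:
    """Return which group of SMF 42-6 counters the request would fall under."""
    if req.caller not in ELIGIBLE_CALLERS:
        return "zHPF"  # normal SMF 42-6 fields only
    all_yes = (req.infrastructure_present and req.caller_enabled
               and req.dfsms_enabled and req.blocksize <= MAX_BLOCKSIZE)
    if all_yes:
        return "zHyperLink enabled"   # attempts/successes are counted
    return "zHyperLink eligible"      # eligibility/read hits are counted
```

Under this reading, a Db2 v12 sync read with an 8192-byte block still shows up as eligible even when everything else is enabled, which matches the "Blocksize >4096" note on the previous slide.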
Tracking eligibility and usage
VSAM: FICON and Synchronous Requests
[Chart callout: zHyperLink response]
Synchronous I/O Content (%)
zHL Enabled Storage Class: “VSAM work”
Before and After: Linear Datasets (Db2)
Synchronous I/O Content: Percentage of all I/O
Detailed Stats from SMF 42
Eligible vs Active zHyperLink
LR vs SSCH and zHyperLink %
Weighted Response
DB2 Database: Activity and Response
DB2 Transaction Response Time
Database Sync I/O Wait (ms/commit)
Summary: I/O is all relative!
By way of comparison … if local buffer access took 1 second …

Access type                          Actual time         Scaled time
Local Buffer                         1 µs                1 second
Coupling Facility Group Buffer Pool  8 µs                8 seconds
zHyperLink Sync I/O                  20 µs               20 seconds
Storage System Enterprise HDD
FICON Cache Hit                      150 µs / 0.15 ms    2.5 minutes
Flash Back-end Miss                  500 µs / 0.5 ms     8.3 minutes
Average Tiered Back-end Miss         2000 µs / 2 ms      33.3 minutes
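The comparison is simple proportional scaling: every latency is divided by the local buffer latency, and the result is read as seconds. A minimal sketch follows; the pairing of labels with values follows the order in which they appear on the slide, and the function names are illustrative.

```python
# Latencies in microseconds, as listed on the summary slide.
LATENCIES_US = {
    "Local Buffer": 1,
    "Coupling Facility Group Buffer Pool": 8,
    "zHyperLink Sync I/O": 20,
    "FICON Cache Hit": 150,
    "Flash Back-end Miss": 500,
    "Average Tiered Back-end Miss": 2000,
}


def scaled_seconds(latency_us: float, baseline_us: float = 1.0) -> float:
    """Scale a latency so that the baseline latency corresponds to 1 second."""
    return latency_us / baseline_us


def humanize(seconds: float) -> str:
    """Format seconds the way the slide does: seconds below a minute, else minutes."""
    if seconds < 60:
        unit = "second" if seconds == 1 else "seconds"
        return f"{seconds:g} {unit}"
    return f"{seconds / 60:.1f} minutes"


if __name__ == "__main__":
    for name, latency_us in LATENCIES_US.items():
        print(f"{name}: {humanize(scaled_seconds(latency_us))}")
```

Changing `baseline_us` rescales the whole table, for example to express everything relative to a zHyperLink sync I/O instead of a local buffer hit.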
Sources
• D. Craddock et al. “zHyperLink: Low-latency I/O for Db2 on IBM Z and DS8880 storage”, IBM Journal of Research and Development. Vol 62, No. 2-3, Paper 13, March/May 2018. https://dl.acm.org/doi/10.1147/JRD.2018.2802070
• IntelliMagic White Paper: “zHyperLink: The Holy Grail of Mainframe I/O?”
• Frank Kyne, “Meet the Future of I/O – zHyperLinks”, Cheryl Watson’s Tuning Letter (CWTL) 2018 #2, pp. 86-116
• John Burg (IBM), “Assessing and Analyzing zHyperLink”. SHARE Pittsburgh, August 7, 2019, Session #25708.
GSE UK Conference 2020 Charity
• The GSE UK Region team hope that you find this presentation and others that follow useful and help to expand your knowledge of z Systems.
• Please consider showing your appreciation by kindly donating a small sum to our charity this year, NHS Charities Together. Follow the link below or scan the QR Code:
http://uk.virginmoneygiving.com/GuideShareEuropeUKRegion