BP Performance and Troubleshooting - edge v1.9
TRANSCRIPT
© Copyright IBM Corporation 2015
Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
sPE0330
Best Practices for Performance,
Design and Troubleshooting IBM
Storage Connected to IBM Power
Systems
Chuck Laing
Senior Technical Staff Member
IBM GTS Service Organization
© Copyright IBM Corporation 2015
Storage Overview - what's inside?
1. Know the Physical makeup
2. Know the Virtual makeup (good throughput design tips)
3. What is a Storage Pool - where do I place data?
4. What should I be aware of/what should I avoid? (Tips & Pitfalls-
Tuning)
• To Stripe or not Stripe, that is the question!
5. Zoning configuration and dual connectivity
• Checking that multipathing is working on host
6. Documentation - why it matters
7. Topology Diagrams
8. Disk Mapping (view at a glance)
9. Easy Storage Inquiry Tools
10. How to Improve Performance
• Bottlenecks
2
Agenda
The Top Ten Things SAs Should Know About Storage
© Copyright IBM Corporation 2015
Throughput and Performance Key Optimization Factors
3
• Throughput
– Spreading and balancing IO
across hardware resources
• Controllers
• Ports & zoning connections
• PCI cards
• CPUs, RAM
• Disk spindles
– Compression
– Thin Provisioning
– Easy Tier - SSD
• Etc….
• IO Performance Tuning
– Using utilities, functions and
features to tweak (backend, frontend)
• Qdepths
• HBA transfer rates
– FC adapters
• LVM striping vs spreading
• Data Placement
– Random versus sequential
– Spreading versus Isolation
– Application characteristics
Configuring throughput optimally increases potential performance scalability
© Copyright IBM Corporation 2015
IBM Spectrum Storage Solutions – what you may remember…
A family of storage management and optimization software (Control, Virtualize, Accelerate, Scale, Protect, Archive) for FlashSystem, any storage, and private, public or hybrid cloud. Based on technology from:
• IBM Spectrum Control – Tivoli Storage Productivity Center (TPC) and the management layer of Virtual Storage Center (VSC)
• IBM Spectrum Protect – Tivoli Storage Manager (TSM)
• IBM Spectrum Archive – Linear Tape File System (LTFS)
• IBM Spectrum Virtualize – SAN Volume Controller (SVC)
• IBM Spectrum Accelerate – Software from XIV System
• IBM Spectrum Scale – Elastic Storage (GPFS)
4
© Copyright IBM Corporation 2015
5
Grid Building Block
-Data Module (1-15)
-CPU
-Memory (360GB/720GB)
-12 disk drives (1, 2, 3, 4, 6TB)
-Optional SSD 360/720 cache
External Connect
-Interface/Data Module (4-9)
-24 FC ports – 8Gb
-iSCSI ports – 22 x 1 GbE or 12 x 10 GbE (Model 214)
Internal Interconnect
-2 Infiniband Switches
-3 UPSs Gen 3 Spectrum Accelerate
Build with a Strong Foundation
• Model 961 base and model 96E expansion
• 961 with up to 3 x 96E expansion frames
• 2.5” small-form-factor drives; 3.5” nearline; 1U High Performance Flash Enclosures
• 6 Gb/s SAS (SAS-2)
• Maximum of four frames
• Maximum of 1,536 drives plus 240 Flash cards
• Top of rack exit option for power and cabling
• Each frame has its own redundant set of power cords
• Two POWER7+ servers with 4.228 GHz processors
• 2, 4, 8, and 16 core processor options per storage controller
• DS8870 has dual active/active controllers
• Up to 1TB of processor memory
• Host adapters
• Up to 64 16 Gb/sec or 128 8 Gb/sec ports or combination of 8/16 Gb/sec ports
• Each port supports FCP and FICON at the port level
• Base frame and first expansion frame allows 16 adapters. For 8 Gb/sec, both 4 and 8 port host adapter cards available. For 16 Gb/sec, 4 port host adapter card available
• All Flash configuration has all host adapters in the base frame
• Efficient front-to-back cooling (cold aisle/hot aisle)
V7000 / V9000
• Great for multiple mixed workloads that drive huge I/O
• Scale out for more all-flash capacity, IOPS and bandwidth
• Up to 2.5M IOPS, 200µs (0.2 ms) latency
• Up to 228TB usable, 1.1PB effective
© Copyright IBM Corporation 2015
Traditional Applications New Generation Applications
Storage Management
Policy Automation
Analytics & Optimization
Snapshot & Replication
Management
Integration & API Services
Data Protection
Spectrum Virtualize
Virtualized SAN Block
Spectrum Scale
Global File & Object
Flexibility to use IBM and non-IBM Servers & Storage or Cloud Services
Spectrum Accelerate
Hyperscale Block
IBM Storwize, XIV, DS8000, FlashSystem and Tape Systems
Non-IBM storage, including commodity servers and media
Data Access
Storage and Data Control
Spectrum Control Spectrum Protect
Self Service Storage
Spectrum Archive
Data Retention
and non-IBM clouds
IBM Comprehensive Software Defined Storage
Capabilities - IBM Spectrum Storage Solutions – BP to Virtualize data
6
sPE0330 © Copyright IBM Corporation 2015
Foundation - Build a SAN Environment
Seems simple enough, right?
• Build the pools at the storage device
• Choose your storage type by disk characteristics, speeds and feeds
• Create volumes from those pools
• Use Easy Tier, compression and other technology to render the best performance
• Connect hosts to the storage through the SAN fabric
• Zone for redundancy and resiliency
• Configure settings to Best Practices
• Configure hosts to take advantage of the storage foundation
• Configure VIOS, LPARs, VMs, etc.
• Distribute virtual aspects appropriately
• Map the volumes to the hosts (see the CLI sketch below)
• Create the file systems, LVs, LPs, PPs, PVs, VGs, etc.
• Place the applications on the configured hosts
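As a concrete illustration of that recipe on Spectrum Virtualize / SVC, here is a minimal CLI sketch; the pool, MDisk, volume and host names plus the WWPNs are illustrative assumptions, and exact parameters vary by code level and environment.
# mkmdiskgrp -name Pool0 -ext 256
(create a storage pool / managed disk group with a 256 MB extent size)
# addmdisk -mdisk mdisk0:mdisk1:mdisk2:mdisk3 Pool0
(add the MDisks presented by the back-end controller to the pool)
# mkvdisk -name appvol01 -mdiskgrp Pool0 -iogrp 0 -size 100 -unit gb
(create a 100 GB volume in the pool, owned by I/O group 0)
# mkhost -name aixhost01 -hbawwpn 10000000C9123456:10000000C9123457
# mkvdiskhostmap -host aixhost01 appvol01
(define the host by its HBA WWPNs, then map the volume to it)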
sPE0330 © Copyright IBM Corporation 2015
Foundation - Slow Performance or Outage Occurs
Now What?
• You followed the recipe
• You took advantage of all the technology, features and functions by:
 • Minimizing and automatically migrating volume IO hotspots – using Easy Tier in pools
 • Dual connecting all ports from the storage to the hosts
 • Using good, redundant, performing storage foundation building blocks
• What happened?
• The cookies came out of the oven with clumps of salt, baking soda and brown sugar in spots.
sPE0330 © Copyright IBM Corporation 2015
Foundation -
What causes performance degradation and Outages?
• The 3 most common root causes are:
 • Configuration changes
 • Hardware component failure
 • IO load shift or increase
• You should consider designing the environment to withstand load shift in the event of half the environment failing, such as:
 • Controller outages
 • Fabric outages
 • Server outages
• You should design configurations to known Best Practices
 • …Just because you can do something, should you?
© Copyright IBM Corporation 2015
Foundation - Top 10 Most Common Logical Issues
Found Globally / Trending
1. Incorrect Zoning Practices and Connections
• Oversubscribed SVC to Host Systems ports (Too many logical paths per vdisk)
• Oversubscribed Storage Controller ports to SVC Nodes (Too many connections)
• Single Points of Failure (SPoF) (Undersubscribed - not enough logical paths per vdisk)
2. Unsupported or down-level host multipathing drivers (SDD or non SDD drivers)
3. Incorrect load balancing
• Improper front-end load balancing (SVC preferred Node vdisk to host)
• Misconfigured Back-end Storage Controller balancing
4. Incorrect Volume Pool configurations
• Ineffective SVC Cache utilization (too many MDGs with too few disk spindles)
• Data placement - Improper application sharing versus isolation
• Improper Tiering decisions
5. Lack of documentation for proper management and troubleshooting
6. Down level or problematic microcode
7. Fabric port topology bottlenecks (incorrect physical topology)
8. Incorrect physical Fibre cabling practices (cracked glass)
9. Insufficient cable labeling practices
10. Suboptimal cooling
10
© Copyright IBM Corporation 2015
BP - Attaching Servers
Foundation – Zoning/Mapping Volumes from Pools
11
1. Incorrect Zoning Practices and Connections – can cause fabric congestion
• Oversubscribed SVC to Host Systems ports (too many logical paths per vdisk)
• Oversubscribed Storage Controller ports to SVC Nodes (Too many connections)
• Single Points of Failure (SPoF) (Not enough logical paths per vdisk)
2. Unsupported or down-level host multipathing drivers (SDD or non-SDD drivers) – can cause "sick but not dead" environments
A Deeper Dive
© Copyright IBM Corporation 2015
New Storage Zoning Schema per I/O Group, 12-Port Node
Evolution and Types of Zones – non-cluster type
Making 1 zone per node per fabric, using the same 8 ports from a single back-end storage unit, ensures the maximum login count of 16 is not exceeded (see the example zone below).
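For illustration, a hedged Brocade FOS sketch of one such zone on one fabric – one node's ports plus the same 8 back-end storage ports; the alias names, zone and config names, and WWPNs are assumptions, not values from this environment.
# alicreate "SVC_N1_FABA", "50:05:07:68:01:40:00:01; 50:05:07:68:01:40:00:02"
# alicreate "BACKEND_STG_FABA", "50:05:07:63:0a:00:01:41; 50:05:07:63:0a:00:02:41"
(remaining six back-end storage WWPNs omitted for brevity)
# zonecreate "STG_Zone-1", "SVC_N1_FABA; BACKEND_STG_FABA"
# cfgadd "PROD_CFG", "STG_Zone-1"
# cfgenable "PROD_CFG"
Repeat with a separate zone per node on each fabric (STG Zone-2 through STG Zone-4 in the diagram below).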
[Diagram: Production SAN Fabric D and Production SAN Fabric C, each containing storage zones STG Zone-1 through STG Zone-4. Spectrum Virtualize DH8 – 12 FC ports per node: I/O Group 0 with Node 1 and Node 2, physical ports 1-12 spread across slots 1, 2 and 5, each logical port with its WWPN number embedded; ports are designated for Host/Storage, Replication, or Node-to-Node traffic.]
12
© Copyright IBM Corporation 2015
Back-end Storage to Spectrum Virtualize Zones
Storage Zone Type – How many Storage zones?
[Diagram: back-end storage with HBA1 (ports P1, P2) and HBA2 (ports P1, P2) connected to SAN Fabric 1 and SAN Fabric 2; storage zones STG Zone-1 through STG Zone-8 are split across the two fabrics.]
13
© Copyright IBM Corporation 2015
Looking at Power System Zoning between
Spectrum Virtualize and Standalone Power Systems
• Is this BP - redundant pathing for 2 HBA ports?
• What could be wrong?
14
[Diagram: a Power host with HBA ports B1 and A1 zoned through Fabric1 Core1 and Fabric2 Core1 to the Spectrum Virtualize node ports.]
© Copyright IBM Corporation 2015 15
[Diagram: the corrected zoning of host ports B1 and A1 through Fabric1 Core1 and Fabric2 Core1.]
Correct
Looking at Power System Zoning between
Spectrum Virtualize and Standalone Power Systems
4 paths per vdisk
© Copyright IBM Corporation 2015
Foundation -
Zoning Multi HBA hosts for Resiliency
• Sys Admins – provide PCI slot to Port WWPN identity to Storage Admins
• Storage Admins – define the SVC host definitions to match – Avoid single points of hardware failure at the Host HBA, Fabric and SVC
– Make four zones, one for each pseudo host per fabric (Red, Blue, Orange and Green)
© Copyright IBM Corporation
2014 16
[Diagram: a physical host with HBA1 (ports P1, P2) and HBA2 (ports P1, P2) connected to SAN Fabric 1 and SAN Fabric 2; the host's ports are split across two SVC-defined pseudo host definitions, Pseudo Host1 and Pseudo Host2.]
© Copyright IBM Corporation 2015
LPM could go to Frame2 or Frame3. Both active and inactive ports will be active during the LPM. Upon LPM completion the previously active ports will show inactive and the previously inactive ports will show active.
Map the same vdisks to the inactive LPAR in the same fashion as the active LPAR.
[Diagram: SVC volumes presented through the SAN to VIO Server1 and VIO Server2 (pseudo host pairs) on the Frame1 hypervisor; active client LPAR1 and LPAR2 use virtual FC adapters (VFCAs) with active/inactive vWWPN pairs; an inactive pseudo LPAR1b is pre-defined on the Frame2 and Frame3 hypervisors as LPM targets.]
17
LPM could go to Frame2 or Frame3.
During LPM the number of paths doubles from 4 to 8. Starting with 8 paths per vdisk would render an unsupported 16 paths during the migration, which could lead to IO interruption.
[Diagram: the same configuration during LPM – the client partition is briefly active on both the Frame1 hypervisor and the target hypervisor (Frame2 or Frame3), so both the active and inactive vWWPN pairs are logged in and the path count doubles.]
18
© Copyright IBM Corporation 2015
Dual VIOS to Multiple LPARs
Is it resilient? – One VIOS failure
[Diagram: the dual-VIOS configuration with one VIO Server failed (marked with an X); the client LPARs keep their paths through the surviving VIO Server.]
19
© Copyright IBM Corporation 2015
Dual VIOS to Multiple LPARs
Is it resilient? – One SAN fabric failure
[Diagram: the dual-VIOS configuration with one SAN fabric failed (marked with an X); the client LPARs keep their paths through the surviving fabric via both VIO Servers.]
20
© Copyright IBM Corporation 2015
Types of Zones
Host ESX to SVC Zones
2 + 2 = 4 paths per LUN
© Copyright IBM Corporation 2015
Connectivity - Host to SVC Zoning Best Practices
DEV#: 5 DEVICE NAME: hdisk5 TYPE: 2145 ALGORITHM: Load Balance
SERIAL: 600507680181059C4000000000000007
==============================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi0/path0 OPEN NORMAL 1996022 0
1* fscsi0/path1 OPEN NORMAL 29 0
2 fscsi2/path2 OPEN NORMAL 1902495 0
3* fscsi2/path3 OPEN NORMAL 29 0
22
• Good communication between the SA and the Storage Admin can uncover issues quickly
 – Correct datapathing has 3 factors:
  • Proper zoning
  • Proper SVC host definitions (SVC logical config of the host definition)
  • Proper redundancy for the SVC preferred / non-preferred pathing
DEV#: 3 DEVICE NAME: hdisk3 TYPE: 2145 ALGORITHM: Load Balance
SERIAL: 600507680181059BA000000000000005
============================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi0/path0 OPEN NORMAL 558254 0
1* fscsi0/path1 OPEN NORMAL 197 0
2* fscsi0/path2 OPEN NORMAL 197 0
3 fscsi0/path3 OPEN NORMAL 493559 0
4 fscsi2/path4 OPEN NORMAL 493330 0
5* fscsi2/path5 OPEN NORMAL 197 0
6* fscsi2/path6 OPEN NORMAL 197 0
7 fscsi2/path7 OPEN NORMAL 493451 0
8 fscsi5/path8 OPEN NORMAL 492225 0
9* fscsi5/path9 OPEN NORMAL 197 0
10* fscsi5/path10 OPEN NORMAL 197 0
11 fscsi5/path11 OPEN NORMAL 492660 0
12 fscsi7/path12 OPEN NORMAL 491988 0
13* fscsi7/path13 OPEN NORMAL 197 0
14* fscsi7/path14 OPEN NORMAL 197 0
15 fscsi7/path15 OPEN NORMAL 492943 0
Correct (first example above – 4 paths per vdisk) versus Incorrect (second example – 16 paths per vdisk); the commands below show how to collect this on the host.
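On an AIX host running SDDPCM, a hedged way to produce and sanity-check the output above (hdisk names are illustrative):
# pcmpath query device
(lists every 2145 hdisk with its paths, as in the examples above)
# lspath -l hdisk5
(shows the state of each path and its parent fscsi adapter)
# lspath -l hdisk5 | wc -l
(quick path count per hdisk – 4 paths per vdisk is the target here)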
© Copyright IBM Corporation 2015 23
Host Multipath Configuration Best Practices examples
-Running a script can show current status
• For AIX VIO - driver used - sddpcm – https://w3-connections.ibm.com/wikis/home?lang=en-
us#!/wiki/Global%20Server%20Management%20Distributed%20SL/page/Script%20to%20Capture%20SAN%20Path%20Info
– Multipath installed = Yes, set to 4 paths per hdisk, fscsi settings (2145): fast_fail,
– Multipath Policy =load_balance
• For Linux/ESX/VMWare - – https://w3-connections.ibm.com/files/app#/file/1ba027ec-5d20-4c60-a281-f18f16192f7a
– Device-mapper multipath, HBA elements = 4,
–For Windows – driver used = MPIO=SDDDSM – https://w3-connections.ibm.com/files/app#/file/3e52f54c-a445-4b17-aa5d-a5da43d4bedb
– Multipath installed = Yes, HBA elements = 4, MPIO Policy = Optimized
• For Solaris – driver MPxIO • Https://w3-connections.ibm.com/files/app#/file/66ea3228-4b26-48bd-a8fd-55751a02fc42
• Multipath installed = MPxIO, Path Subscription= 4, MPIO Policy = round-robin
Content input from : Bill Marshall, Jason Moras, Brad Worthen, Ramesh Palakodeti
© Copyright IBM Corporation 2015
Load Balancing
24
3. Incorrect load balancing
• Improper Front-end load balancing (SVC preferred Node vdisk to host)
• Misconfigured Back-end Storage Controller balancing
A Deeper Dive
© Copyright IBM Corporation 2015
Examples of correct Host to SVC Volume Balancing
25
[Diagram: vdisk1 through vdisk4 alternately assigned across the two nodes of the I/O group.]
Preferred paths for vdisk1 are SVC N1P2 & N1P3; non-preferred paths for vdisk1 are SVC N2P2 & N2P3.
Preferred paths for vdisk2 are SVC N2P2 & N2P3; non-preferred paths for vdisk2 are SVC N1P2 & N1P3.
© Copyright IBM Corporation 2015
Spectrum Virtualize (# Volumes to Host ports
from Nodes / IOgrps)
26
Imbalanced Data I/O loads
© Copyright IBM Corporation 2015
Back-end Load Balancing
Which has better throughput?
27
[Diagram: two back-end cabling/zoning options, labeled 1 and 2, each connected through redundant SAN fabrics.]
© Copyright IBM Corporation 2015
Storage Pool Configurations
28
4. Incorrect Volume Pool configurations
• Ineffective SVC Cache utilization (too many MDGs with too few disk spindles)
• Improper Tiering decisions
• Data placement - Improper application sharing versus isolation
Looking Deeper
© Copyright IBM Corporation 2015
Storage Pools
Easy Tier v3: Support for up to 3 Tiers
• Support any combination of 1-3 tiers
Tier 0      Tier 1   Tier 2
Flash/SSD   ENT      NL
Flash/SSD   ENT      NONE
Flash/SSD   NL       NONE
NONE        ENT      NL
Flash/SSD   NONE     NONE
NONE        ENT      NONE
NONE        NONE     NL
Storage Pools - “Example Only”
Drive Selection in an Easy Tier Environment
Tiered Storage Strategy Overview
Right Tiering – is there more on Tier 1 than there should be?
31
Storage Pyramid (performance and cost per gigabyte increase toward the top):
Tier 0 – Ultra High Performance: ultra high performance applications (1-3%)
Tier 1 – High Performance, Mission Critical: mission critical, revenue generating applications (15-20%)
Tier 2 – Medium Performance, Non-Mission Critical: backup, recovery and vital data (20-25%)
Tier 3 – Low Performance, Archival/Tape: archives and long term retention (50-60%)
You should consider: Cost versus Performance
sPE0330
Storage Pools
Tiered Storage Classification
COST / PERFORMANCE / AVAILABILITY
Tier rating is based on performance AND reliability
RAID6 recommended on drives greater than 900GB
Columns: TIER / Description / Technical Examples (high-level guidance – local variations on technology exist; IBM block, SVC recommended for Open) / Performance range capability

TIER0 – FlashSystems preferred, solid state drives alternate
 Description: Ultra High Performance. Meet QoS for high end.
 Technical examples: DS8870/SVC 400GB recommended *** RAID5 – small block recommended; 840/900 FlashSystem RAID5 – excellent 'Turbo TIER1' when coupled with XIV GEN3 (SSD cache)
 Performance range: DS8870 -> greater than 250,000 IOPs / 5500+ MBs; 840/900 FlashSystem -> greater than 500,000 IOPs mixed workload (70/30)

TIER1(a) / TIER1(b)
 Description: High Performance. Drive up utilization of high-end storage subsystems and still maintain performance QoS objectives. For low capacity requirements, smaller less powerful devices may meet the tier definition.
 Technical examples: DS8870 w/SAS 600GB 15K disk drive RAID5/RAID6 arrays *** (300GB 15K to be used only if Disk Magic shows need; RAID6 should be seriously considered); XIV* (GEN3) model 214 with 2TB SAS drives (11-module or greater unit); XIV* (GEN3) model 214 with 3TB SAS drives (11-module or greater unit) – 11.2 code version required, SSDs (solid state drives) required; XIV GEN2 removed from strategy – lower cost, serious compare to DS8, recommended for non-mainframe
 Performance range: DS8870 -> 200,000+ IOPs / 5000 MBs; V9000 -> TBD; XIV 2TB (GEN3 15 mod) less than 130,000 IOPs / 3,400 MBs; XIV 2TB (GEN3 11 mod) less than 95,000 IOPs / 3,200 MBs; XIV 3TB (GEN3 15 mod) less than 120,000 IOPs / 3,400 MBs; XIV 3TB (GEN3 11 mod) less than 85,000 IOPs / 3,200 MBs
sPE0330
Columns: TIER / Description / Technical Examples (IBM block, SVC recommended for Open) / Performance range capability

TIER2
 Description: Medium Performance. Meet QoS for applications/data that reside here. For low capacity requirements, smaller less powerful devices may meet the tier definition.
 Technical examples: DS8870 w/SAS 600GB 10K disk drive RAID5/RAID10 arrays; V7000 w/SAS 600GB using RAID5 (GEN2); V7000 In-Rack w/450GB SAS for PureSystems (note that the V7000 Flex version is NOT RECOMMENDED)
 Performance range: DS8870 -> less than 80,000 IOPs or less than 3,000 MBs; V7000 Gen2 (block) -> less than 75,000 IOPs or less than 2,500 MBs

TIER2b
 Description: Medium performance / cloud and commodity storage
 Technical examples: SDS block offering on-prem – Quantastor w/4TB 72-drive chassis running RAID10 or RAID5 as specified, with compression only; on-prem – 4TB R5 4+1 x 2 w/software RAID0 on top; V7000 with 1.2TB 10K SAS drives using RAID6 (GEN2); XIV* (GEN3) model 214 with 4TB SAS
 Performance range: XIV -> less than 100,000 IOPs or less than 2,500 MBs; SDS block -> less than 50,000 IOPs or less than 1,000 MBs
COST / PERFORMANCE / AVAILABILITY
Tier rating is based on performance AND reliability
RAID5 not recommended on drives greater than 900GB
Storage Pools
Tiered Storage Classification
sPE0330
Columns: TIER / Description / Technical Examples (IBM block, SVC recommended for Open) / Performance range capability

TIER3
 Description: Low Performance. Meet QoS for applications/data that reside here.
 Technical examples: DS8870 with NL-SAS technology using RAID6 when the customer already has a DS8K with room for this storage; V7000 with NL-SAS using RAID6 (GEN2)
 Performance range: DS8870 -> less than 25,000 IOPs or less than 1,000 MBs; V7000 (block) -> less than 30,000 IOPs or less than 300 MBs

TIER3b
 Description: Low Performance. Meet QoS for applications/data that reside here.
 Technical examples: SDS block offering on-prem – 4TB 72-drive chassis running RAID5, with compression
 Performance range: SDS & File block -> up to 35K IOPs and 280 MB/sec when configured as specified; SDS Object -> performance is highly configuration dependent and measured in GETs and PUTs, less than 25,000 IOPs or less than 300 MB/sec

TIER4
 Description: Archival, long term retention, backup
 Technical examples: Virtual engines, tape ATLs, ProtecTIER
 Performance range: N/A – tier based on features
COST / PERFORMANCE / AVAILABILITY
Tier rating is based on performance AND reliability
RAID5 not recommended on drives greater than 900GB
Storage Pools
Tiered Storage Classification
sPE0330
Storage Pools
‘Block’ Decision Tree
Application Highly
Sensitive to IO
Response Time?
See slide 2 for SAN
strategy (DS8K, XIV) Virtualization Needed?
TIER2 needed? See TIER2b for V7K
Options or alternates
See Tier3 for V7K
Options or alternates
Yes
SVC 4 or 8 node
Cluster Recommended
No
No
See previous pages for Guidance on Tier Performance Levels
Virtualization Needed?
Yes
Yes
Yes
No
SVC 4 or 8 node
Cluster Recommended
See Previous slides for
SAN strategy (XIV,
V7K)
If a new small solution with iSCSI, Quantastor (Tier 2B/Tier 3) is an option at 99.9% and lower availability.
If advanced capabilities like site-to-site replication, local instant copy, etc. are needed, then Quantastor is not recommended.
© Copyright IBM Corporation 2015
How many pools are too many?
SVC Cache Partitioning
36
Potential Risks
1. Too many MDGs results in ineffective utilization of SVC destage / write cache partitioning.
2. Potential performance issues result during high IO workloads.
3. Results in limited and slowed data IO performance.
4. Too few MDisks in the MDG could result in degraded disk performance, impacting throughput from the back-end controller to the SVC, ultimately slowing down the application read/write access IOs per second (IOPS).
Actions to correct the error
1. Reduce the number of MDGs through SVC MDisk consolidation to 5 or fewer per cluster where possible.
2. This is not an architectural limitation but is the global standard; however, it may make sense to create more MDGs if attempting to isolate workloads to different disk spindles.
3. Make larger MDGs, with a minimum of 8 MDisks in the MDG. Some considerations are:
 i. More MDisks in the MDG is better for transactional workloads
 ii. The number of MDisks in the MDG is the most important attribute influencing performance and SVC destage write cache dedication. See the following table:
Note: Increasing performance "potential" adversely increases the impact boundary, but this cannot be avoided up to minimum performance requirements.
37
Documentation
5. Lack of documentation for proper management and troubleshooting
6. Down Level Microcode
• Looking Deeper at Documentation
© Copyright IBM Corporation 2015
Are there any automated storage inquiry tools out there that will help me understand my setup?
• Storage tools – gather information such as, but not limited to:
• LUN layout
• LUN to Host mapping
• Storage Pool maps
• Fabric connectivity
• Firmware/ code level
– DS8QTOOL
• Go to the following Website to download the tool:
– http://congsa.ibm.com/~dlutz/public/ds8qtool/index.htm
– SVCQTOOL
• Go to the following Website to download the tool:
– http://congsa.ibm.com/~dlutz/public/svcqtool/index.htm
38
Documentation
How do I get the information?
© Copyright IBM Corporation 2015
Documentation – helps Data placement in Pools
Mapping Virtual LUNS to Physical Disks
• On the host server using SDD
• Ask the Storage Admin to find the disk/device UID or RAID group in the Storage Pool
• The Storage Admin cross-references the Storage Pool UID with the controller's arrays in the pools
© Copyright IBM Corporation 2014
DEV#: 81 DEVICE NAME: hdisk81 TYPE: 2145 ALGORITHM: Load Balance
SERIAL: 60050768019002F4A8000000000005C7
======================================================================
Path# Adapter/Path Name State Mode Select Errors
0 fscsi0/path0 FAILED NORMAL 89154 2
1* fscsi0/path1 FAILED NORMAL 63 0
2 fscsi1/path2 OPEN NORMAL 34014 3
3* fscsi1/path3 OPEN NORMAL 77 0
LUN to Pool to Array
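A hedged sketch of that cross-reference, starting from the hdisk SERIAL shown above (it is the SVC vdisk UID); the vdisk and MDisk names are illustrative and filter support varies by code level.
On the AIX host:
# pcmpath query device 81
(the SERIAL field is the SVC vdisk UID)
On the SVC / Spectrum Virtualize cluster:
# lsvdisk -filtervalue vdisk_UID=60050768019002F4A8000000000005C7
(find the vdisk that owns that UID and the pool it lives in)
# lsvdiskmember <vdisk_name>
(list the MDisks backing the vdisk)
# lsmdisk <mdisk_name>
(show the back-end controller and array behind each MDisk)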
40
Troubleshooting and Tuning
• Best Practices for IBM Storage Performance, connected to Power
Systems
© Copyright IBM Corporation 2015
• After verifying that the disk subsystem is causing a system bottleneck, a
number of solutions are possible. These solutions include the following:
1. Consider using faster disks; Flash and SSD will outperform HDD, etc.
2. Consider changing the RAID implementation if relevant to the server's I/O workload characteristics.
• For example, going to RAID-10 if the activity is heavy random writes may show
observable gains.
3. Add more arrays/ranks to the Storage pool.
• This will allow you to spread the data across more physical disks and thus improve
performance for both reads and writes.
4. Add more RAM
• Adding memory will increase system memory disk cache, which in effect improves disk
response times.
5. Finally, if the previous actions do not provide the desired application
performance:
• Off-load/migrate - processing to another host system in the network (either users,
applications, or services).
Troubleshooting - What are some Storage Bottlenecks?
41
© Copyright IBM Corporation 2015
Troubleshooting -Systems Administrator
How do I improve disk performance on the Host?
42
1. Reduce the number of IOs
• Bigger caches
• Application, file system, disk subsystem
• Use caches more efficiently
• No file system logging
• No access time updates
2. Improve average IO service times
• Consider changing the data layout
• Reduce locking for IOs
• Adjust Host Qdepth - Buffer/queue tuning
• Adjust HBA transfer Rates
• Use SSDs or RAM disk
• Consider changing the random versus sequential striping or spreading
• Faster disks/interfaces, more disks
• Short stroke the disks and use the outer edge
• Smooth the IOs out over time
3. Reduce the overhead to handle IOs
© Copyright IBM Corporation 2015
What are the most common/Important OS I/O tuning parameters?
• Device Queue Depth
– Queue Depth can help or hurt performance per LUN
• Be aware of Queue Depth when planning system layout, adjust only if necessary
• To calculate - best thing to do is go to each device “Information Center” URLs listed in link slide
• What are the default Queue Depths? ___
• HBA transfer rates
– FC adapters
• LVM striping vs spreading
• Data Placement
– Random versus sequential
– Spreading versus Isolation
43
• Queue depth is central to the following fundamental performance formula:
• IO Rate = Number of Outstanding Commands (queue depth) / Response Time per Command
• For example:
 – IO Rate = 32 commands / 0.01 seconds (10 milliseconds) per command = 3,200 IOPs
Some real-world examples (OS = default queue depth = expected IO rate, at 10 ms per command):
• AIX standalone = 16 per LUN = 1,600 IOPs per LUN
• AIX VIOS = 20 per LUN = 2,000 IOPs per LUN
• AIX VIOC = 3 per LUN = 300 IOPs per LUN
• Windows = 32 per disk = 3,200 IOPs per LUN
• Content provided by Mark Chitti
Troubleshooting -
Tips – Most Common OS IO Tuning Parameters
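A hedged AIX sketch for checking and staging a queue depth change (device names and the value 20 are illustrative; the change takes effect after the device is reconfigured or the host is rebooted):
# lsattr -El hdisk2 -a queue_depth
(show the current queue depth for the LUN)
# chdev -l hdisk2 -a queue_depth=20 -P
(stage a new value; -P defers it until the device is next reconfigured)
# lsattr -El fcs0 -a num_cmd_elems
(the HBA command elements should be sized to cover the sum of the LUN queue depths behind the adapter)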
© Copyright IBM Corporation 2015
Troubleshooting -
#1 -Data Placement
44
[Diagram: an extent pool of 8 ranks, with each rank carved into strips 1 through 5 from the outer edge of the DDMs inward.]
LUN1 is made of strips on the outer edge of the DDMs (1s) and could also hold App A, RAID-5 7+P
LUN3 is made of strips in the middle of the DDMs (3s) and could also hold App B, RAID-5 7+P
Placing applications on the same LUNs/Pools results in IO contention
For existing applications, use storage and server performance monitoring tools to
understand current application workload characteristics such as:
• Read/Write ratio
• Random/sequential ratio
• Average transfer size (blocksize)
• Peak workload (I/Os per second for random access, and MB per second for sequential access)
• Peak workload periods (time of day, time of month)
• Copy services requirements (Point-in-Time Copy, Remote Mirroring)
• Host connection utilization and throughput (HBA Host connections)
• Remote mirroring link utilization and throughput
What causes THRASHING?
Most commonly when workloads peak at the same time or log files and data files share physical spindles
© Copyright IBM Corporation 2015
Data Placement on Power Systems
#1 - Random IO Data layout
45
[Diagram: five RAID arrays, each presented as a LUN/logical disk and seen by AIX as PVs hdisk1 through hdisk5 in the datavg volume group.]
# mklv -y lv1 -e x datavg <number_of_LPs> hdisk1 hdisk2 ... hdisk5
# mklv -y lv2 -e x datavg <number_of_LPs> hdisk3 hdisk1 ... hdisk4
.....
Use a random order for the hdisks for each LV
Slide Provided by Dan Braden
What does random LV creation order help prevent?
© Copyright IBM Corporation 2015
Data Placement on Power Systems
#1 - Data Layout - OS Spreading versus Striping
46
Is there a difference? What's the difference?
– Do you know what your volumes are made of?
File system spread
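A hedged AIX sketch for answering that question on an existing volume group (LV and hdisk names are illustrative):
# lslv -m lv1
(shows each logical partition, its physical partition and the hdisk it lives on)
# lspv -l hdisk1
(shows which LVs have extents on a given physical volume)
# lslv lv1
(the INTER-POLICY field shows whether the LV was created with maximum spread, e.g. mklv -e x)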
sPE0330 © Copyright IBM Corporation 2015
#1 - Data Placement Performance issues!
Look for differences in the infrastructures ...Please....!!
47
• Leading causes for performance degradation after migration to new pools could be, and should be checked for:
 • Different disk foundation technology or down-level firmware
 • Different pool sizes, with different number, size, speed and RAID type of disk (SAS, NL, FC, SSD, Flash, SATA)
 • LUN sizes are not as important as physical disk capacity, speed and type, but a LUN that is too big can cause an I/O bottleneck at the SVC ports
 • Cache utilization – could be the number of SVC MDisk groups
 • I/O congestion in the target SAN
 • Different SAN switches and firmware levels
 • Perhaps a slow-draining device – usually caused by a hung device HBA or SFP
 • Could be a lack of buffer credits
 • Could be a dual-core switch architecture with incorrect zoning or not ISL'ed correctly
 • Could be lack of trunking
 • Could be fixed versus auto-negotiate port speed settings
 • Could be port fill words not set to 3 for the IBM gear
• Server side:
 • CPU, cores, HBA transfer rates, Qdepth settings, BIOS settings
 • Multipath settings – can determine LUN behavior in handling the I/O
 • Is striping or spreading for random IO configured on the server hosting the application?
• Application side – type of application:
 • Are data and log files sharing the same physical spindles? OK on Flash/SSD, XIV, or when Easy Tier is turned on, but not on other traditional technology
 • Application IO stacking – is it the same? Look for differences
© Copyright IBM Corporation 2015
Data Placement on Power Systems
#1 - Data Layout Summary
48
Does data layout affect IO performance more than any tunable IO parameter?
Good data layout avoids dealing with disk hot spots
– An ongoing management issue and cost
Data layout must be planned in advance
– Changes are generally painful
iostat and filemon can show unbalanced IO
Best practice: evenly balance IOs across all physical disks unless TIERING
Random IO best practice:
– Spread IOs evenly across all physical disks unless dedicated resources are needed to
isolate specific performance sensitive data
• For disk subsystems
• Create RAID arrays of equal size and type
• Create VGs with one LUN from every array
• Spread all LVs across all PVs in the VG
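A hedged AIX sketch for spotting unbalanced IO with the tools just mentioned (intervals and output paths are illustrative):
# iostat -D 5 3
(extended per-disk statistics – compare tps and read/write service times across hdisks)
# filemon -o /tmp/filemon.out -O lv,pv
# sleep 60; trcstop
(trace logical and physical volume activity for a minute, then review the most-active LVs and PVs in /tmp/filemon.out)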
sPE0330 © Copyright IBM Corporation 2015
• Collect suspect device logs, send them to engaged support team
• Ask the device vendor - if a component has failed
• If yes, block the component (HBA) until the component is replaced
• If no – continue to troubleshoot
• Determine what changed in last day/12 hours/6 hours
• If configuration changed – reverse the change
• If no configuration change continue to troubleshoot
• Is the zoning correct?
• For missing SAN paths, determine if issue is narrow or widespread:
• Are paths missing on only one server? If so it is likely to be a Server HBA issue
• Are the paths missing on other servers? If so it is likely to be a SAN issue
• Are the paths missing on all Server HBA ports or only one of the ports?
• If paths are missing on multiple servers, are the HBA ports common to one Fabric?
• Are the missing paths common to one Fabric or to both Fabrics?
• Are the missing paths common to one Fabric blade?
• Are the missing paths common to one Storage port?
Troubleshooting Questions – if it's not tiering, data layout or the storage foundation, what else could it be?
49
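From the host side, a hedged AIX sketch for answering the path questions above (adapter and hdisk names are illustrative):
# lspath
(shows the state of every path – Enabled, Failed or Missing – per hdisk and fscsi adapter)
# lsdev -Cc adapter | grep fcs
(lists the physical FC adapters on the server)
# fcstat fcs0
(link statistics for one port – look for link failure, loss-of-sync and invalid CRC counts)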
50
Incorrect Physical Topology
Configurations
7. Fabric port topology bottlenecks (incorrect physical topology)
8. Incorrect physical Fibre cabling practices (cracked glass)
9. Insufficient cable labeling practices
10. Suboptimal cooling
Physical Topology
© Copyright IBM Corporation 2015
Racking and Stacking: SVC Best Practice:
A Right Way Example
51
Rear View Front View
Clean and Neat
© Copyright IBM Corporation 2015
What's wrong? An unserviceable fabric rack:
1. Bend radius exceeded
2. Insufficient strain relief – cable weight pulls on other cables
3. Cables loose on the floor – susceptible to pinching, getting caught in the door, being stepped on, etc.
52
© Copyright IBM Corporation 2015
SVC rack and stack – how?
• What is the impact of this?
53
Can’t service
Blocked air exhaust
© Copyright IBM Corporation 2015
What’s this?
• What’s Wong!!
54
© Copyright IBM Corporation 2015
SVC rack and stack – how?
• What's wrong here?
55
It was off in a corner with no surrounding air flow.
Impact: shortened component life span from overheating.
Every time the cabinet door was opened to service it:
 – A power supply blew
 – Fans were always failing
 – Application outages resulted
© Copyright IBM Corporation 2015
• Top 5 most common things that are or go wrong:
1. Growth without checking a very simple test – IO Read / Write response time in milliseconds
• Lack of data Spread - Additional Applications placed in Storage Pools not designed to share the load
• Lack of data Isolation - Timing of applications peak times is not isolated properly - causing congestion
• Improper configuration changes or even worse “No” configuration changes
• In the data lifecycle things do change and shift over time – causing IO load imbalance – impacting the overall SAN
2. Lack of Automatic Monitoring and Alerting and Daily Health Checks
• No Call Home or not configured or becomes out of date – wrong contact info
• Clocks not synchronized between equipment – makes it hard to pinpoint events
3. Down Level - Device Microcode slips beyond supported life or not updated regularly
• Servers, Storage or Fabric firmware get out of sync and become incompatible with changes
• Incorrect Host device Multipathing Driver
4. Single Points of Failure - Hardware components fail but not reconfigured properly:
• Server HBA WWPNs change but zoning is not updated in Fabric – Incorrect WWPN
• Raid Disks fail but not replaced in a timely manner because IO continues on redundant hardware
• Fabric ports fail but are not replaced in a timely manner because IO continues on redundant paths
• Improper IO pathing – too many or not enough paths by suboptimal design
5. Suboptimal physical design or changes over time
56
Troubleshooting - Summary
After initial build and configuration what next?
57 57
• Knowing what's inside will help you make informed decisions
• You should make a list of the things you don't know
 – Talk to the Storage Administrator or those who do know
• A better Admin:
 1. Understands the back-end physical makeup
 2. Understands the back-end virtual makeup
 3. Knows what's in a Storage Pool, for better data placement
 4. Avoids the pitfalls associated with IO tuning
 5. Knows where to go to get the right multipathing device drivers
 6. Knows why documentation matters
 7. Keeps topology diagrams
 8. Keeps disk mapping documentation
 9. Is able to use storage inquiry tools to find answers
 10. Understands how to troubleshoot storage performance bottlenecks
Summary
© Copyright IBM Corporation 2015
Please fill out an evaluation for sPE0330
@ibmtechu.com
© Copyright IBM Corporation 2014
Some great prizes to be won!
© Copyright IBM Corporation 2015
Questions-
59
© Copyright IBM Corporation 2015
Extras for traditional storage
60
Best Practices for Performance, Design
and Troubleshooting IBM Storage
connected to Power Systems
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Brocade Best Practice: Trunk on all Fibre Channel ISLs
61
Severity: Major
Why the error has occurred: One or more ISLs do not have Trunking enabled.
Potential Risks:
1. Running an ISL without Trunking, i.e. a single Fibre connection between 2 SAN switches, carries some risks:
 • Single point of failure, causing fabric segmentation and loss of connectivity if the only (or last) connection between 2 switches is lost
 • Performance bottleneck – ISL Trunking is designed to significantly reduce traffic congestion in storage networks
2. If there are 2 ISLs between the switches, there are multiple scenarios why a trunk is not formed:
 • There is no Trunking license
 • The links are cabled to different ASICs at either end
 • The difference in cable lengths is too great
 • There is "noise" on one cable – could be a bad connector or patch panel, a cable bent too much, etc.
Actions to correct the error: Add more connections between this switch and the neighbor switch running with a single connection, and/or purchase a Trunking license (see the check commands below).
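A hedged Brocade FOS sketch for confirming trunking and the license before adding links (output formats vary by FOS level):
# licenseshow
(confirm the Trunking license is installed)
# islshow
(lists the ISLs and their negotiated bandwidth)
# trunkshow
(lists trunk groups – each ISL should appear as a member of a trunk)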
© Copyright IBM Corporation 2015
Cisco Best Practice to enable Call Home
• #2 Top Reason - Lack of Automatic Monitoring and Alerting
62
Severity: Minor
Why the error has occurred: No SMTP server is entered for callhome transport.
A good example looks like either:
smtp server:smtp.svl.ibm.com
smtp server port:25
smtp server priority:0
or:
smtp server:9.30.121.1
smtp server port:25
smtp server priority:0
Potential Risks: Not setting up an SMTP server impacts the ability to send notifications, risking critical errors going undetected.
Actions to correct the error: Configure an SMTP server for callhome.
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best Practice – Brocade buffer credits on E-Ports should be greater than 20
63
Severity: Warning
Why the error has occurred: The buffer credits on an E-Port (ISL) are less than 20.
Potential Risks: During peak times, having fewer buffer credits on E-Ports (ISLs) will lead to loss of frames, resulting in performance issues. By default, 8 BB (buffer) credits are allocated per port. Considering the SAN topology, it is highly recommended to increase the default number of buffers to 20 or more.
Actions to correct the error: Use the command "portcfglongdistance" to allocate the additional buffer credits (see the sketch below).
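A hedged FOS sketch for checking the current allocation before changing it (the port number is illustrative, and portcfglongdistance arguments differ between FOS releases, so confirm against your level's command reference):
# portbuffershow 2/15
(shows the buffer credits allocated and in use on the port)
# portcfglongdistance 2/15 LS 1
(example of enabling a long-distance mode so additional credits are reserved; the exact distance arguments here are an assumption)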
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best practice – do not have Brocade L-Ports present
64
Severity: Warning
Why the error has occurred: One or more ports logged in as a "Loop" port.
Potential Risks: Having a Loop port in the environment can lead to performance issues. A Loop port may appear in the SAN due to improper login / registering of a host into the SAN.
Actions to correct the error: Make sure the host properly logs into the SAN, and make sure the topology on the host is set to point-to-point.
© Copyright IBM Corporation 2015
Best practice – To sync time between all devices
65
Severity: Info
Why the error has occurred: Network Time Protocol (NTP) is not configured on the fabric device.
Potential Risks: Without clock synchronization it is much more difficult to correlate logs of events across multiple devices, and unsynchronized clocks may cause problems with some protocols.
Actions to correct the error: Configure the fabric device to use an NTP server that is consistent with the other devices on the same fabric.
• #2 Top Reason - Improper configuration changes or even worse “No” configuration
changes
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best practice – Incorrect port speed of IFL/ISL Ports
66
Severity Minor
Why the error has occurred: Port speeds on ISL / IFL ports are not set to a fixed speed. A good example should look like this:
128 1 16 658000 id 4G Online E-Port 10:00:00:05:1e:36:05:b0 "<SWITCH_NAME>" (downstream)(Trunk
master)
129 1 17 658100 id 4G Online E-Port (Trunk port, master is Slot 1 Port 16 )
130 1 18 658200 id 4G Online E-Port (Trunk port, master is Slot 1 Port 16 )
131 1 19 658300 id 4G Online E-Port (Trunk port, master is Slot 1 Port 16 )
A bad example could be any of these:
48 4 0 653000 id N4 Online E-Port 10:00:00:05:1e:36:05:b0 "<SWITCH_NAME>" (Trunk master)
49 4 1 653100 id N4 Online E-Port (Trunk port, master is Slot 4 Port 0 )
50 4 2 653200 id N4 Online E-Port (Trunk port, master is Slot 4 Port 0 )
51 4 3 653300 id N4 Online E-Port (Trunk port, master is Slot 4 Port 0 )
60 4 12 653c00 id N4 Online E-Port (Trunk port, master is Slot 4 Port 13 )
Note: Whenever you use the "portcfgspeed" command, the port will go offline and come back online, so it is disruptive for that particular link. Implement the change at an appropriate time.
Potential Risks: With ISL / IFL ports in "Auto Negotiate" mode, the switches keep re-checking connectivity and exchanging capabilities, which may lead to principal switch polling.
Actions to correct the error: Make sure you set the port speeds of ISL / IFL ports on all switches to a fixed value (see the sketch below).
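A hedged FOS sketch for pinning an ISL port to a fixed speed (the port number and speed are illustrative):
# switchshow
(note which E-Ports still show a negotiated speed such as N4)
# portcfgspeed 16 4
(lock port 16 to 4 Gb/s; a value of 0 returns the port to auto-negotiate)
# switchshow
(verify the port now reports the fixed speed, e.g. 4G instead of N4)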
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best Practice – Do not mix Hard and Soft Zoning
67
Severity: Critical
Why the error has occurred:
1. There are two types of zoning identification:
 • Port World Wide Name (pWWN)
 • Domain, Port (D,P)
2. For easier management it is possible to assign aliases to both pWWN and D,P identifiers.
3. To ensure that all zoning implements frame-based hardware enforcement, use pWWN or D,P identification exclusively.
4. pWWN is more secure than D,P because of physical security issues, and it enables the use of FCR, FC FastWrite, Access Gateway, and other features.
5. BEST PRACTICE: All zones should use frame-based hardware enforcement; the best way to do this is to use pWWN identification exclusively for all zoning configurations.
Potential Risks: Potential security and performance issues when not following Best Practice.
Actions to correct the error: Change any zones or aliases using Domain, Port to use WWPNs. The change must be planned with all responsible parties to ensure a nondisruptive change.
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best Practice – Balance Connections between Fabrics
68
Severity: Critical
Why the error has occurred:
1. Connections from the SVC to the fabrics are not balanced. The SVC passes this check if every node in the SVC cluster satisfies all of the following conditions: it is connected to exactly 2 independent fabrics.
2. It is connected to the same number of switches in each fabric.
3. The fiber connections to a switch (SW1) in one fabric must correspond to the connections to a switch (SW2) in the other fabric – i.e. if the SVC is connected to switch SW1 with 2 fibers, then it must also be connected to SW2 with exactly 2 fibers. It is recommended that if, for example, the SVC is connected to ports 1 and 2 on SW1, then it should also be connected to ports 1 and 2 on SW2, but this "strict mirroring" is not required in order to pass the check.
4. An independent fabric in this context can be one of two things: a "simple" fabric – just a group of interconnected switches.
5. Two or more fabrics connected via Fibre Channel routing (FCR) – the switches will in effect make up a single fabric.
TPCHC is able to distinguish between fabrics with and without FCR. If a storage device is connected to 2 switches in the same fabric, then the switches are either in the same simple fabric or they are in separate fabrics connected via FCR. In either case the switches are NOT in independent fabrics, and the storage device fails this check.
Multiple independent fabrics can have the same ID (fid), but TPCHC is able to distinguish between different fabrics using the same fid.
Potential Risks: If storage devices are not connected redundantly to two fabrics there is a single point of failure. Furthermore, the workload cannot be spread evenly between the two fabrics.
Actions to correct the error: Verify the connections to the fabrics and take corrective actions to ensure balance is in place.
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best Practice – MM/GM SVC Vdisks 500GB Max size
69
Name: Vdisk sizing – max 500GB if utilized in a Metro or Global Mirror relationship
Severity: Info
Why the error has occurred: Flash Copy (FC) enabled volumes larger than 500GB were discovered on this SVC cluster.
Potential Risks: Suboptimal performance.
Actions to correct the error: Reduce the size of the volumes.
© Copyright IBM Corporation 2015
Best Practice – No Disk Controller Ports Off-line
70
Name: Disk controllers degraded
Severity: Minor
Why the error has occurred: One or more controllers have been reported as degraded. This may be an unexpected condition, resulting from back-end storage ports that were once configured but are no longer logged in to the SVC.
Potential Risks:
1. IO loss and corrupted data
2. Data integrity is at risk
3. This condition risks possible saturation of bandwidth on the existing configured ports, potentially restricting the available data I/O bandwidth
Actions to correct the error:
1. Check the back-end storage controller for failed arrays, volumes or ports. Take appropriate action to correct the condition.
2. Check fabric conditions for port failures.
3. It may be appropriate to open a hardware PMR with support to diagnose the cause of this condition.
• #4 Top Reason - Single Points of Failure - Hardware components fail but not
reconfigured properly
© Copyright IBM Corporation 2015
• #4 Top Reason - Single Points of Failure - Hardware components fail but not
reconfigured properly
Best Practice – Correct Degraded Host Port Config’s
71
Severity: Major
Why the error has occurred: One or more hosts have reported a host-to-SVC path status that is degraded. This condition may be unexpected; check with the host owners for hosts that have failed HBAs.
Potential Risks:
1. Undetected loss of redundancy
2. If this condition is unexpected, then the host and the applications residing on the host could be impacted by a loss of IO load balancing and imminent IO bottlenecks
3. The data residing on these hosts is more susceptible to failure in that the remaining port is a single point of failure. In the event of the remaining port failing, the following could occur:
 1. I/O loss
 2. Database corruption
 3. Lengthy restore from tape backups
 4. Extended problem determination
Actions to correct the error:
1. Check fabric ports for unexpected failures or offline ports
2. Check the SVC logical host definitions for wrong WWPN information, in the event that a host HBA has been replaced but not updated in the SVC logical definition
3. Check with the host owners (Sys Admins) for decommissioned hosts that can be removed from the SVC logical definitions
4. Dual HBAs should be architected on the clustered servers and the remaining non-clustered servers in order to add resiliency and protect the critical data. The client should be advised of the current risks associated with the current SPoFs.
5. Best practice is to dual-port every host connection to the fabric. Further "proper" testing should be done during a maintenance window
5. Best practice is to dual port every host connection to the Fabric. Further "proper" testing should be done during a maintenance window
6. Phase1 Testing the redundancy between the Fabric and the host
1. Open a change record to reflect the change (Make sure all necessary approvers are notified)
2. Identify and verify which host HBA's are active for I/O activity by performing a test read and write to the SAN disk from the host
3. Stop I/O between the host and the Disk Storage
4. On the San Fabric, block the Switch port on the "even" fabric zoned between the host and the storage device.
5. Perform another read/write test to the same LUN
6. Identify and verify which host HBA is active for I/O activity
7. On the even SAN fabric unblock the Switch port
8. On the San Fabric, block the Switch port on the "odd" fabric zoned between the host and the storage device.
9. Perform another read/write test to the same LUN
10. Identify and verify which host HBA is active for I/O activity
11. On the odd SAN fabric unblock the Switch port
12. If the I/O activity toggles between the two HBA's then phase 1 of the test is successful
7.Phase 2 Testing the redundancy between the Storage device and the host
1. Repeat the process as defined in phase one, except block test and unblock the ports connected to the Storage ports instead of the
host.
2. When a new Host server or Storage device is added to the environment testing is strongly recommended.
8. Note: Ideally this type of test is best during the initial implementation of new equipment, before it is turned over to the customer or placed in
production.
© Copyright IBM Corporation 2015
• #1 Top Reason - Improper configuration changes or even worse “No” configuration
changes
Best practice is to -
Utilize all FC Ports but only ports 1 & 3 to SVC
72
Severity: Warning
Why the error has occurred: Best practice is to utilize all hardware bought, so there are:
 • No idle components
 • Optimized usage for maximum performance and resiliency
 • For XIV, FC port number 4 is pre-configured for mirroring; this can be changed if mirroring is not used and the port then utilized for host connection, to improve performance and the distribution of ports between fabrics
 • For connection to the SVC – use only ports 1 & 3
Potential Risks: Wasted capacity, and lack of connectivity for hosts, where best practice is that all hosts should have connectivity to all modules.
Actions to correct the error: Cable the remaining ports or modules.
© Copyright IBM Corporation 2015
Best practice –XIV
Each host must have minimum 2 connections
73
Severity: Major
Why the error has occurred: Fewer than 2 connections were discovered in the XIV logical host definition. A logical host definition should contain at least 2 WWPNs.
A good example should look like this:
Name Type FC Ports
dssapsrvu002 default 21000000C95D3A6A,10000000C95FFA75
dssapsrvu003 default 21000000C95AAA6A,21000000C95BAA75,21000000C9510A6A,21000000C9512A75
A bad example could look like this:
Name Type FC Ports
dssapsrvu002 default 10000000C95GAA6A
Potential Risks: For single connections the associated risk is a single point of failure, with no IO failover in the event of losing a port on the XIV, fabric or host HBA.
The data residing on these hosts is more susceptible to failure, which could result in the following:
1. I/O loss
2. Database corruption
3. Lengthy restore from tape backups
4. Extended problem determination
Actions to correct the error:
1. Check the XIV logical host definitions for only one HBA defined to a host
2. Action must be taken to add a second HBA definition and/or host HBA for any definitions showing fewer than 2 HBAs
Note: Take caution in reducing HBA definitions as this is a disruptive action. Host system administrators will need to rescan or rediscover corrected paths after action is taken.
© Copyright IBM Corporation 2015
Best practice – V7000u
Best practice is to utilize all node ports
74
Severity: Warning
Why the error has occurred: A node, or a port on the node, is offline or failed.
Potential Risks: Best practice is to utilize all hardware bought, so there are no idle components and usage is optimized for maximum performance and resiliency. For V7000 Unified there might be ports reserved for services like Global and Metro Mirroring, but it is important to make sure there are no errors in the configuration, which might result in:
1. Data integrity at risk
2. A single point of failure for any attached host to a node pair in an I/O group
3. The attached hosts and applications residing on the hosts potentially impacted by a loss of IO load balancing and imminent IO bottlenecks
4. The data residing on these hosts being more susceptible to failure should the remaining port fail. In the event of the remaining port failing, the following could occur:
5. I/O loss
6. Database corruption
7. Lengthy restore from tape backups
8. Extended problem determination
Actions to correct the error:
1. It may be appropriate to open a hardware PMR with support to diagnose the cause of this condition
2. Open a hardware PMR to dispatch / consult with a support Product Field Engineer (PFE) to check the condition
© Copyright IBM Corporation 2015
Best practice – V7000u
SMTP server must be filled in
75
Severity Minor Why the error has
occurred
: This error has occurred because the SMPT was reported as not being enabled. SMTP must be turned on as a minimum
requirement for email alerting to work. and a valid entry such as callhome*@*.ibm.com should be configured
Potential Risks Not setting up SMTP will result in not having the ability to manage, and activate email event and inventory notifications risking
critical errors going undetected
Actions to correct the error
1. Enable SMTP:
• Issue the mkemailserver CLI command. Up to six SMTP email servers can be configured to provide redundant access to the external email network.
• The following example creates an email server object by specifying the IP address and port number of the SMTP email server. After you issue the command, a message indicates that the email server was successfully created.
mkemailserver -ip ip_address -port port_number
where ip_address specifies the IP address of a remote email server and port_number specifies the port number for the email server.
2. Add recipients of email event and inventory notifications to the email event notification facility by issuing the mkemailuser CLI command. You can add up to twelve recipients, one recipient at a time. The following example adds email recipient manager2008 and designates that manager2008 receives email error-type event notifications.
mkemailuser -address [email protected] -error on -usertype local
3. Set the contact information that is used by the email event notification facility by issuing the chemail CLI command. If you are starting the email event notification facility, the reply, contact, primary, and location parameters are required. If you are modifying contact information, at least one of the parameters must be specified. The following example sets the contact information for the email recipient manager2008.
chemail -reply [email protected] -contact manager2008 -primary 0441234567 -location 'room 256 floor 1 IBM'
4. Activate the email and inventory notification function by issuing the startemail CLI command. There are no parameters for this command. Note: inventory information is automatically reported to IBM when you activate error reporting.
5. Optionally, test the email notification function to ensure that it is operating correctly, and send an inventory email notification:
• To send a test email notification to one or more recipients, issue the testemail CLI command. Specify either all, or the user ID or user name of an email recipient that you want to send a test email to.
• To send an inventory email notification to all recipients that are enabled to receive inventory email notifications, issue the sendinventoryemail CLI command. There are no parameters for this command.
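Putting those steps together, a minimal end-to-end sketch of the sequence might look like the following. The IP address, port, recipient address, and contact details are placeholders only, and the exact testemail flag should be verified against the information center for your code level:
mkemailserver -ip 9.10.11.12 -port 25
mkemailuser -address storageops@example.com -error on -usertype local
chemail -reply storageops@example.com -contact 'Storage Ops' -primary 0441234567 -location 'room 256 floor 1 IBM'
startemail
testemail -all        (optional test notification; confirm whether your code level expects -all or a recipient name)
sendinventoryemail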
© Copyright IBM Corporation 2015
Best Practice - Brocade
76
Name: Er_bad_os per port must be below 5 per minute
Severity: info
Why the error has occurred: Er_bad_os counts invalid ordered sets (platform- and port-specific). Brocade recommends alerting if the value exceeds 5 per minute. Because the counter is only read once a day, 5 per minute translates to a daily threshold of 5x60x24 = 7200; assuming roughly 8 hours of operation per day, that value is divided by 3 and rounded to 2500.
Potential Risks: Loss of synchronization on an 8Gb link, causing interruption of the data stream.
Actions to correct the error: Change the Fill Word settings; see Brocade check 2.32. (A quick way to read the counter is sketched below.)
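For reference, a minimal way to read these counters is with the standard Brocade FOS commands; the port number below is only an example:
porterrshow          (summary of error counters for every port on the switch)
portstatsshow 12     (detailed counters for port 12, including er_bad_os)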
© Copyright IBM Corporation 2015
Best Practice - Brocade
77
Name: Er_rx_c3_timeout per port must be below 5 per minute
Severity: info
Why the error has occurred: Er_rx_c3_timeout counts class 3 frames received at this port and discarded at the transmission port due to timeout (platform- and port-specific). For further explanation see the IBM SANswers Wiki. Brocade recommends alerting if the value exceeds 5 per minute. Because the counter is only read once a day, 5 per minute translates to a daily threshold of 5x60x24 = 7200; assuming roughly 8 hours of operation per day, that value is divided by 3 and rounded to 2500.
Potential Risks: Discarded frames result in I/O timeouts and retransmission of frames, causing interruption of the data stream.
Actions to correct the error: Fix the reason for the timeout, which can be:
• Exhausted links and ISLs
• Too few buffer credits assigned to ISLs and storage ports
• Slow-draining devices, meaning devices that are not able to receive and process data at the speed it is sent
© Copyright IBM Corporation 2015
Best Practice – Brocade / Regularly Check Status
78
Issue: Er_tx_c3_timeout per port is greater than 5 per minute
Severity: info
Why the error has occurred: Er_tx_c3_timeout counts transmit class 3 frames discarded at the transmission port due to timeout (platform- and port-specific). For further explanation see the IBM SANswers Wiki. Brocade recommends alerting if the value exceeds 5 per minute. Because the counter is only read once a day, 5 per minute translates to a daily threshold of 5x60x24 = 7200; assuming roughly 8 hours of operation per day, that value is divided by 3 and rounded to 2500.
Potential Risks: Discarded frames result in I/O timeouts and retransmission of frames, causing interruption of the data stream.
Actions to correct the error: Fix the reason for the timeout, which can be:
1. Exhausted links and ISLs
2. Too few buffer credits assigned to ISLs and storage ports
3. Slow-draining devices, meaning devices that are not able to receive and process data at the speed it is sent (a quick buffer-credit check is sketched below)
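As a hedged illustration of the buffer-credit and slow-drain checks mentioned above (availability depends on the FOS level):
portbuffershow          (buffer credit allocation and usage per port)
bottleneckmon --show    (on FOS levels that include bottleneck monitoring: reports latency and congestion bottlenecks)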
© Copyright IBM Corporation 2015
• Disk mapping at a glance
– Mapping becomes important
• Spreading versus isolation
79
[Diagram: side-by-side panels labelled "Isolation" and "Spreading"]
Track data placement and Host Vdisk mapping
Documentation – Does it matter? Why?
© Copyright IBM Corporation 2015
• Spreading versus Isolation
– Spreading the I/O across MDGs exploits the aggregate throughput offered by more physical resources working together
– Spreading I/O across the hardware resources will also yield more throughput than isolating the I/O to only a subset of those resources
– You may reason that the more hardware resources you can spread across, the better the throughput
• Don’t spread file systems across multiple frames – Makes it more difficult to manage code upgrades, etc.
• Should you ever isolate data to specific hardware resources?
• Name a circumstance!
80
• Isolation
– In some cases more isolation on dedicated resources may produce better I/O throughput by eliminating I/O contention
– Separate FlashCopy – Source and Target LUNs – on isolated spindles
Data Placement and Host Vdisk
mapping
© Copyright IBM Corporation 2015
Data layout affects IO performance more than any tunable IO parameter
• If a bottleneck is discovered, then some of the things you need to do are:
– Identify the hardware resources the heavy hitting volumes are on
• Identify which D/A pair the rank resides on
• Identify which I/O enclosure the D/A pair resides on
• Identify which host adapters the heavy hitting volumes are using
• Identify which host server the problem volumes reside on
• Identify empty, unused volumes on other ranks / storage pools
– Move data off the saturated I/O enclosures to empty volumes residing on less used ranks/storage pools
– Move data off the heavy-hitting volumes to empty volumes residing on less used hardware resources, and perhaps to another storage device
– Balance LUN mapping across
• Backend and host HBAs
• SVC iogrps
• SVC preferred nodes
– Change the RAID type (a quick way to find the heavy-hitting volumes on an AIX host is sketched below)
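A hedged sketch of that first step, locating the busy volumes from an AIX host; the interval, count, hdisk names, and volume group name are examples only:
iostat -D 30 4                   # extended per-disk statistics: service times, queue usage, throughput
iostat -D hdisk4 hdisk5 30 4     # narrow the sample to the suspect disks
lspv | grep -i appvg             # map the busy hdisks back to their volume group / application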
Traditional Data Placement –
StorAdmin – How do I improve disk performance?
81
© Copyright IBM Corporation 2015
Back-end Load Balancing
Which has better throughput?
82
[Diagram: two DS8000 layouts mapping arrays to device adapter (DA) pairs DA_0 through DA_7, compared side by side. In the left layout the arrays are assigned to the DA pairs out of sequence (for example DA_0 carries Array_0, 5, 10, 14, 44, 48, 52, 56), while in the right layout arrays 0-59 are assigned sequentially and evenly (DA_0 carries Array_0-7, DA_1 carries Array_8-13, and so on).]
Unbalanced I/O to DA Cards Balanced I/O to DA Cards
© Copyright IBM Corporation 2015
Sequential IO Data layout
• Does understanding the backend enable good front-end configuration?
• Sequential IO (with no random IOs) best practice:
– Create RAID arrays with data stripes a power of 2
• RAID 5 arrays of 5 or 9 disks
• RAID 10 arrays of 2, 4, 8, or 16 disks
– Create VGs with one LUN per array
– Create LVs that are spread across all PVs in the VG using a PP or LV strip size >= a full stripe on the RAID array
– Do application IOs equal to, or a multiple of, a full stripe on the RAID array
– Avoid LV Striping
• Reason: Can’t dynamically change the stripe width for LV striping
– Use PP Striping
• Reason: Can dynamically change the stripe width for PP striping
83
Slide Provided by Dan Braden
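To make the last two bullets concrete, here is a hedged AIX sketch (not taken from Dan Braden's deck); the VG name, hdisk list, LV names, sizes, and strip size are examples only:
# PP spreading: maximum inter-disk allocation policy; the spread can be adjusted later
# (for example with reorgvg or migratepv after disks are added to the VG)
mklv -y datalv -e x -t jfs2 datavg 256
# LV striping: fixed strip size across the listed disks; the stripe width cannot be changed dynamically
mklv -y striplv -t jfs2 -S 64K datavg 256 hdisk2 hdisk3 hdisk4 hdisk5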
© Copyright IBM Corporation 2015
Data Placement – Traditional Storage Pools and Striping
• Should you ever stripe with pre-virtualized volumes?
• We recommend not striping or spreading in SVC, V7000 and XIV Storage Pools
• Avoid LVM spreading with any striped storage pool
• You can use file system striping with DS8000 storage pools
– Across storage pools with a finer granularity stripe
– Within DS8000 storage pools but on separate spindles when volumes are created sequentially
© Copyright IBM Corporation 2014
84
[Diagram: host striping options for sequential pools versus striped pools; the labels read "Host Stripe - Raid-0 only", "Host Stripe", "No Host Stripe", "Host Stripe", with "Stripe" as a vertical axis label.]
• Please refer to the following PPTs
provided by Dan Braden
• Disk IO Tuning
• SANBoot
85
More on Host Disk IO Tuning
© Copyright IBM Corporation 2015
Tip - Queue Depth Tuning
• Take some measurements
• Make some calculations
– (Storage port queue depth / total LUNs per host = queue depth)
• If a single host with 10 assigned LUNs is accessing a storage port that supports 4096, calculate 4096/10 ≈ 409, which in this case is limited to the 256 maximum
– Are there different calculations for the different storage devices?
• Examples for volumes on homogeneous hosts:
– SVC: q = ((n × 7000) / (v × p × c))
– DS8000: 2048
– XIV: 1400
– V7000: q = ((n × 4000) / (v × p × c))
• Best thing to do is go to each device "Information Center"; URLs are listed in the link slide
– Don't increase queue depths beyond what the disk can handle!
• IOs will be lost and will have to be retried, which reduces performance
• Note:
– For more information on the info needed to make the calculations, please refer to the deck by “Dan Braden” in the Extra slides at the end of this deck
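As a hedged illustration only (the variable meanings below follow our reading of the SVC information center, not this slide): with n = nodes in the cluster, v = volumes, p = paths per volume per host, and c = hosts, a 2-node cluster presenting 100 volumes over 4 paths to 2 hosts gives q = (2 x 7000) / (100 x 4 x 2) = 17.5, so a per-LUN queue depth of about 17. On an AIX host the attribute is then checked and set per hdisk; hdisk4 and the value 16 are examples only:
lsattr -El hdisk4 -a queue_depth        # show the current queue depth
chdev -l hdisk4 -a queue_depth=16 -P    # set it; -P defers the change until the next reboot if the device is busy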
86
© Copyright IBM Corporation 2015 87
SDD Driver Testing for Proper HBA Failover
• On an AIX VIO server, check the AIX system to verify that IO activity still continues on the alternate ports by testing with SDD and/or SDDPCM commands
• The Server Admin - Create a mount point for a logical volume that can be manipulated to generate IO traffic for the purpose of this test
• The Server Admin - Verify and record selected (targets yet to be determined) datapaths for preferred and alternate status (active and inactive) by using the SDD "pcmpath query device" or "datapath query device" command on the AIX VIO server
• Note the path selection counts on the multiple paths. There should be only two paths with a "Select" count above zero (0); these are the two open paths on the preferred node. If paths 0 and 2 show numbers other than zero under the "Select" column, then do the following:
1. Take one path offline by issuing the command (pcmpath set device 0 path 0 offline) or (datapath set device 0 path 0 offline) - Path 0 should now be in a dead state.
2. Go to the mount point of the LV and edit a file to create traffic. After creating the traffic, reissue "pcmpath query device" or "datapath query device" and look at the path selection numbers. Notice that only the path selection count for Path 2, the other preferred path, increased.
3. Close Path 2 by issuing the command "pcmpath set device 0 path 2 offline" or "datapath set device 0 path 2 offline"
4. Return to the mount point and add or edit files to create IO.
5. Execute the "pcmpath query device" or "datapath query device" command to look at the path selection counts. Disk access should now be via the other paths. (This is now load balancing to the non-preferred SVC node for this Vdisk)
6. Reestablish both preferred paths by executing the following commands: "pcmpath set device 0 path 0 online" and "pcmpath set device 0 path 2 online", or "datapath set device 0 path 0 online" and "datapath set device 0 path 2 online"
Slide provided by Chuck Laing
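A hedged sketch of capturing the before/after counts during the test above (device 0, the /testlv mount point, and the dd parameters are examples only):
pcmpath query device 0 > /tmp/paths.before
dd if=/dev/zero of=/testlv/io_test bs=1024k count=100    # generate write traffic on the test file system
pcmpath query device 0 > /tmp/paths.after
diff /tmp/paths.before /tmp/paths.after                  # the Select counters that changed show which paths carried the IO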
© Copyright IBM Corporation 2015 88
Non SDD Driver Testing for Proper HBA Failover
• Further "proper" testing should be done during a maintenance window
• Testing the redundancy between the Fabric and the host
1. Open a change record to reflect the change (make sure all necessary approvers are notified)
2. The Server Admin - Identify and verify which host HBAs are active for I/O activity by performing a test read and write to the SAN disk from the host
3. The Server Admin - Stop I/O between the host and the disk storage
4. The SAN Admin - On the SAN fabric, disable the switch port on the "even" fabric zoned between the host and the storage device (see the port disable/enable sketch below)
5. The Server Admin - Perform another read/write test to the same LUN
6. The Server Admin - Identify and verify which host HBA is active for I/O activity
7. The SAN Admin - On the even SAN fabric, enable the switch port
8. The SAN Admin - On the SAN fabric, disable the switch port on the "odd" fabric zoned between the host and the storage device
9. The Server Admin - Perform another read/write test to the same LUN
10.The Server Admin - Identify and verify which host HBA is active for I/O activity
11.The SAN Admin - On the odd SAN fabric, enable the switch port
• If the I/O activity toggles between the two HBAs then the test is successful
• When a new host server or storage device is added to the environment, testing is strongly recommended
• Note: Ideally this type of test is best done during the initial implementation of new equipment, before it is turned over to the customer or placed in production
Slide provided by Chuck Laing
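For the SAN Admin steps, a hedged Brocade FOS sketch (port 17 is an example; on director-class switches use slot/port notation):
portshow 17        (confirm the port state and attached device before changing anything)
portdisable 17     (steps 4 and 8: take the zoned switch port offline)
portenable 17      (steps 7 and 11: bring the port back online)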
sPE0330 © Copyright IBM Corporation 2015
• For performance degradation issues (sick but not dead scenarios)
• Are you or any team member aware of any failed hardware?
• Is this a performance degradation issue (sick but not dead) severe enough to stop an application, or showing high-millisecond read/write response times at the application disk level?
• The following could cause this scenario:
• A failed switch optic
• A physical cable not securely attached
• A patch panel that is not clean or not securely attached
• If no failed hardware is involved but congestion is possible:
• Are there buffer credit errors in the switch, SVC, or storage error or message logs? (a few quick checks are sketched below)
• Did a large DB query just run?
• Is there a server HBA with errors? It may cause slow drain - consider blocking the HBA until replacement
• Is there a blade server with errors?
• Are any storage adapters fencing?
Troubleshooting Questions, What else could it be?
89
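A hedged set of starting points for those checks; the commands are standard for each platform, but the ordering option and grep filter are examples:
porterrshow                     (Brocade switch: per-port error counter summary, as noted earlier)
lseventlog -order date          (SVC/V7000 CLI: recent events in date order)
errpt | grep -Ei 'fcs|fscsi'    (AIX host: errors logged against FC adapters and protocol devices)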
© Copyright IBM Corporation 2015
Document VIO to LPAR mapping
• Sample script output used to produce the documentation
90
Content provided by Aydin Y. Tasdeler
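The script itself is not reproduced in this transcript; as a hedged sketch, the core of such documentation can be gathered from each VIO server with the standard lsmap command (the padmin user and the VIO server names are examples only):
# run from an admin workstation; vios1/vios2 are example VIO server names
for vios in vios1 vios2; do
  echo "==== $vios vSCSI mappings (vhost -> client LPAR -> backing device) ===="
  ssh padmin@$vios "lsmap -all"
  echo "==== $vios NPIV mappings ===="
  ssh padmin@$vios "lsmap -all -npiv"
done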