EMC Proven Professional Knowledge Sharing 2010
Failure Points in Storage Configurations: Common Problems and a Solution
Charles Macdonald
Senior Technology
[email protected]
Table of Contents
Introduction
Components of the Storage Environment
    Roles and Responsibilities
        Table 1 – Typical Roles in a Storage Environment
I/O System Components
    Physical Components of the I/O System: Hosts, Connectivity, and Storage
        Figure 1 – Hosts, Connectivity, and Storage
        Hosts: Physical Components
        Figure 2 – Host Physical Components in the I/O Path
        Connectivity: Physical Components
        Figure 3 – FC Network Physical Components in the I/O Path
        Storage: Physical Components
        Figure 4 – Storage Physical Components in the I/O Path
        Figure 5 – Physical Components in the I/O Path and Roles
    Logical Components of the I/O System
        Hosts: Logical Components
        Figure 6 – Host Logical Components in the I/O Path
        Connectivity: Logical Components
        Storage: Logical Components
        Figure 7 – Storage Logical Components in the I/O Path
        Figure 8 – Logical Components in the I/O Path
    Configurable Items in the Storage Environment
        Figure 9 – I/O Path for Solaris 10 with Leadville and DMX
Defining the Problem
Solving the Problem
    Figure 11 – Configuration Management Goal
Conclusion

Disclaimer: The views, processes or methodologies published in this compilation are those of the authors. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.
Introduction
In large organizations, responsibility for the configurable items between the application and the physical disks that affect storage availability and performance often crosses several functional groups, e.g., storage operations, systems administration, database administration, application support, and design/architecture. Configuration errors or omissions at any level can contribute to performance degradation or decreased availability. However, the requirements are often not well understood by all of the groups involved. This article offers a generalized overview of the path an I/O operation takes between the application and the storage to illustrate how the various configurable items must interact to provide optimal performance and availability.
This article will provide Storage Administrators with a common language for discussing potential configuration issues with:
• Other Systems Administrators
• Database Administrators
• Application Support
• Design/Architecture
• Managers responsible for technical groups
A shared understanding between management and the technical groups ensures that they all work collaboratively to deploy and maintain systems with appropriate configurations that meet performance and availability requirements.
Components of the Storage Environment
A storage environment includes three categories of components:
1. Physical components of the I/O system
2. Logical components of the I/O system
3. Human actors
Physical components include servers, storage arrays, and connectivity devices. Logical
components include databases, server operating systems, and storage system
microcode. Human Actors, i.e., the people associated with the storage environment,
include business users who create, view, and manipulate data within the storage
environment, as well as anyone involved in the architecture, design, deployment,
maintenance, and ongoing operation of the storage environment’s physical and logical
components.
The purpose of a storage environment is to support business functions and objectives at
the least possible cost. Business objectives that drive costs include not just the amount
of primary data that must be stored, but also multipliers to the amount of data, such as
the number of backups required, backup retention periods, archive retention, and
disaster recovery requirements. Associated cost drivers include performance
requirements for speed of access to the data, availability requirements that demand redundancy, and sufficiency of systems to meet recovery time and recovery point objectives. The three components of a storage environment must work in concert to fulfill this purpose.
Roles and Responsibilities
The Business User is the most important person within the storage environment. He/she
is the reason for the existence of the storage environment. Others in the storage
environment perform the tasks required to design, deploy, and operate the storage
environment. These tasks are generally consistent regardless of the business setting, but how they are allocated can vary considerably between organizations. For
example, the configuration of disk and file systems on a server may be a Systems
Administrator function in one organization, but the responsibility of the Storage
Administrator in another. Similarly, higher level functions, such as planning the
particulars of storage allocations, may be determined by a Design group in one
organization, but reside within a Storage Operations group in another. Different
structures may co-exist within silos in the same organization for a variety of reasons
such as legacy structures inherited with the acquisition of other corporations, or
independent management structures in different functional business groups or branches
of business.
Organizational roles and responsibilities should be well defined and well understood by
all of the groups and individuals involved, as should methods of engagement and
communication between the functional groups. Insufficient collaboration, coordination,
and communication are likely to increase the frequency of preventable business function
disruptions by introducing failure points in the storage environment.
In this article, the roles of the typical actors in a storage environment are summarized in
Table 1. These definitions are for illustration only; they are not intended to define a
standard that would be suitable for all organizations. The role descriptions use the term Service Level Agreement (SLA), which implies that some level of service has been well defined; in many organizations, that may not be the case.
Table 1 – Typical Roles in a Storage Environment
Business Users: store, retrieve, and manipulate data within the storage environment to carry out business functions. Business User requirements* provide the justification for the cost of deploying and maintaining a storage environment, and drive the creation of SLAs for items such as performance, availability, recovery point objectives, recovery time objectives, data retention periods, and disaster recovery strategies.
*Note: Business Users might not be very good at defining requirements; e.g., the desire “we want 100% uptime and never delete anything” usually does not become a requirement when the business is presented with a price tag to achieve it.

Architects: set general and specific technology directions at a high level, having broad impact on strategic technology decisions. Architects create standards and blueprints for the technology that will be available within an organization, including vendor selection.

Designers: create specific technology solutions for defined business requirements. Designers produce designs within the constraints set by Architects on technology and standards.

Application Support Analysts: deploy, configure, and manage applications. Application Support Analysts maintain applications to meet defined SLAs, including performance and availability requirements.

Database Administrators: deploy, configure, and manage databases and database-related software, such as Oracle ASM, in accordance with designs provided by Designers. Database Administrators maintain the database environment to meet SLAs, including performance and availability requirements.

Systems Administrators: deploy, configure, and manage server hardware and operating systems, in accordance with designs provided by Designers. Systems Administrator responsibilities include configuring server components relating to storage, such as HBAs, path management software, volume managers, and file systems. Systems Administrators maintain the server environment to meet SLAs, including performance and availability requirements.

Storage Administrators: deploy, configure, and manage storage arrays and dedicated storage networks, in accordance with designs provided by Designers. Storage Administrators maintain the storage arrays and networks to meet SLAs, including performance and availability requirements.

Backup Administrators: deploy, configure, and manage backup applications, and the dedicated hardware that supports them, such as tape libraries. Backup Administrators maintain the backup environment to meet SLAs, including performance and availability requirements.
I/O System Components
The following sections provide a generic description of the I/O System in terms of
physical components, logical components, and their relationships. Most of the examples
are for hosts attached to block storage over a fibre channel network, but the general
discussion applies to other storage networks as well.
Physical Components of the I/O System: Hosts, Connectivity,
and Storage
The physical components of a storage environment are segregated into host,
connectivity, and storage devices. Hosts are the computers that run the applications that
users interact with to store, retrieve, and manipulate data. Hosts can be anything from a notebook PC to a large mid-range system, and may also be virtual hardware, such as a VMware server.
Storage refers to any storage medium that is external to the host, such as magnetic
tape, optical disk drives, and solid state disk drives. This article is only concerned with
intelligent storage arrays that do not merely provide access to disk, but also provide a feature-rich environment that mediates access to disk.
Connectivity refers to the physical components that provide the medium for
communication between hosts and storage, such as optical cables, twisted pair cables,
or copper cables. In the case of networks, it also includes hubs, routers, and switches.
The physical components of the storage environment are discussed in more detail in the
following sections. Below, Figure 1 illustrates the high level relationship between hosts,
connectivity, and storage, for both block based storage on an FC SAN, and network
attached storage.
Figure 1 – Hosts, Connectivity, and Storage
Hosts: Physical Components
To illustrate data flow within a computer, the physical components can be generalized to
the Central Processing Unit (CPU), Internal Storage, I/O Devices, and Internal
Connectivity. The CPU consists of the physical components that perform the instructions
contained in the programs. The CPU also contains some amount of high speed storage
(registers and cache) to facilitate CPU operations. Internal Storage consists of memory,
e.g., RAM and ROM, and larger storage devices, such as disk drives, tape drives, and
CD/DVD drives. I/O Devices include devices that handle user to host communications,
such as the keyboard, monitor, and mouse, and devices such as Network Interface
Cards (NICs) and Host Bus Adapters (HBAs) that enable host to host or host to external
storage communications. The CPU, internal storage, and I/O devices communicate
within a computer over buses and bridges that form the internal connectivity.
Figure 2 is a high level diagram of the physical components within a host, and their I/O
connections, for a host with connectivity to block based storage via an FC SAN. Note
that the host is depicted with two HBAs to provide physical redundancy for the I/O path
to the storage.
Figure 2 – Host Physical Components in the I/O Path
The physical components of the host must be sized appropriately for the expected workload, and may require redundancy to meet availability requirements. Exception scenarios must also be considered when determining the necessary level of redundancy, particularly if business functions are sensitive to performance degradation and not just downtime. For example, a host with two separate HBAs attached to a pair of redundant SAN fabrics will still have access to disk after a failure that interrupts the I/O path for one HBA (e.g., HBA optic failure, switch port failure), but it will also suffer a 50% reduction in theoretical bandwidth to the storage. Additional HBAs may be required in the host to provide redundancy for performance if the degradation in this routine failure scenario is not acceptable to the business user.
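To make the sizing arithmetic concrete, here is a minimal sketch (the two-path and four-path figures are purely illustrative, not a sizing recommendation) of the theoretical bandwidth that survives a path failure:

```python
def surviving_bandwidth_fraction(total_paths: int, failed_paths: int = 1) -> float:
    """Fraction of theoretical host-to-storage bandwidth remaining after path
    failures, assuming I/O is spread evenly across equal-speed paths."""
    if failed_paths >= total_paths:
        return 0.0
    return (total_paths - failed_paths) / total_paths

# With two HBAs, a single path failure leaves 50% of theoretical bandwidth.
print(surviving_bandwidth_fraction(2))  # 0.5
# With four HBAs, the same single failure leaves 75%.
print(surviving_bandwidth_fraction(4))  # 0.75
```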
Connectivity: Physical Components
There are a variety of connectivity options to support various storage network protocols,
such as Infiniband, Fibre Channel (FC), iSCSI, Fibre Channel over Ethernet (FCoE),
NFS, and CIFS. This article will only consider two types of storage networks: IP
networks and FC networks.
IP networks provide connectivity between hosts and storage for either block based or file based I/O. Block based I/O over IP networks uses protocols such as iSCSI or FCoE. More widely used are the file I/O protocols, Network File System (NFS) and Common Internet File System (CIFS), which are used to access Network Attached Storage (NAS) devices. IP networks are generally well understood, so they are not described in this article. However, troubleshooting problems with NAS is often complicated by the involvement of an additional organizational unit (the IP Network Administrators) and by a more complex topology, particularly where the IP connectivity is not a dedicated storage network.
FC networks, also called fabrics, provide connectivity between hosts and storage for
block based I/O. The connectivity devices in FC networks can consist of routers, hubs,
and switches. FC switches are the prevalent interconnect device, so they are the only devices considered here; a generalized description of FC switches is given below.
FC switches consist of three key components: ports, internal connectivity, and control
processor (CP) units. Ports contain a transceiver (either a gigabit interface converter (GBIC) or a small form-factor pluggable (SFP)) that transmits and receives signals to and from the node (host, storage, or switch) attached to the port. Cables used to connect nodes to the FC switch are generally fibre optic cables, but copper may be used in some circumstances. Port to port communication in the switch takes place via a bus or backplane, which also provides the communication path for the control processor units.
FC switches generally include a number of hot swappable components, such as GBICs,
fans, and power supplies. Director class switches may also have hot swappable
controller cards.
FC networks are typically deployed in pairs of redundant fabrics, so that a failure in one
fabric degrades connectivity, rather than interrupting it. Figure 3 illustrates the
components of an FC network.
Figure 3 – FC Network Physical Components in the I/O Path
Storage: Physical Components
The physical makeup of a storage array is described by grouping components into four
broad categories: front end, cache, back end, and physical disks.
The front end consists of the storage front adapters and their controllers. Front adapters
provide the interface between the hosts and the storage array, either through direct
connections, or a storage network.
Cache consists of a number of cards of volatile memory that mediates the transfer of
data between hosts and physical disks. I/O operations to cache are much faster than I/O
operations to disk. Cache increases the speed of write operations from the host's perspective, as the storage array acknowledges the write I/O as committed once it is written to cache, and destages it to disk later.
Read ahead algorithms attempt to prefetch data into cache before hosts request it, speeding up read I/O operations by allowing some portion of read I/O to be serviced at cache rather than disk speeds. Cache mirroring protects uncommitted writes from cache board failures by duplicating writes on two separate cache boards. During a power failure, battery power is used to dump the contents of cache to vault disks.
The back end consists of storage back adapters and their controllers. The back adapters
connect to physical disks using SCSI or FC. Back adapters are configured to provide
multiple paths to the disks, so the back adapters are not a single point of failure in the
storage array. The back end controllers also contain some small amount of memory to
help facilitate and optimize data transfer between the cache and the physical disks.
The physical disks in the storage array are persistent data stores, i.e. data that is written
to disk is not dependent on power to maintain its state. One storage array may contain
different types of disks, such as FC, Serial ATA (SATA), or Solid State Drives (SSD).
Different disk sizes may also reside within the same storage array.
The internal connectivity in the storage array varies between storage vendors and
models of arrays, but all contain some type of redundancy. Figure 4 illustrates the
physical components of a storage array.
Figure 4 – Storage Physical Components in the I/O Path
Figure 5 illustrates an end-to-end look at the physical components in the I/O path, and
identifies the actors that are responsible for tasks associated with the physical
components. The roles and responsibilities for architecture, design, and operation of the
physical infrastructure are generally easy to define, and obvious failure points in the
infrastructure are relatively easy to avoid if you take care to introduce architectural
standards and solution designs that incorporate hardware redundancy, such as dual
HBAs and FC fabrics. The greater danger lies in the configuration of the storage
environment’s logical components.
Figure 5 – Physical Components in the I/O Path and Roles
Logical Components of the I/O System
The logical components of the I/O system control user interactions with the physical
components and interactions between the physical components and include:
• host applications
• databases
• host operating systems
• storage operating systems
• FC fabric operating systems
• logical volume managers
• file systems
• device drivers
The logical components contain a large number of configurable items that multiply to
create a vast number of configuration combinations. The availability and performance of
the storage environment is highly dependent on correctly configuring the logical
components to work together. Logical components may reach beyond the physical
divides between hosts, connectivity, and storage, but the physical divides are still a
useful model for organizing the discussion of the logical components.
Hosts: Logical Components
Operating Systems (OS) provide the logical container that the rest of the logical
components on a host work within. The OS manages the interactions and interfaces
between the users, the logical components, and the physical components on the host.
Examples of operating systems include Microsoft Windows, and the various flavours of
UNIX, such as Solaris, AIX, Linux, and HPUX.
Device drivers enable the OS to communicate with specific hardware devices. Device
drivers may be embedded with the OS, or may be installed separately, depending on the
OS and the specific hardware. Device drivers interact with the programmed instructions
called firmware or microcode that reside on the hardware devices. Some hardware
devices, such as HBAs, support user microcode upgrades. The OS and drivers provide
all of the basic services used for I/O on the host.
Multipath Managers provide a layer of virtualization above what the OS sees as physical
disks. Devices are presented over multiple physical and logical paths to provide
redundant access to storage, and to increase throughput. Path managers recognize that
the multiple paths point to a single storage device, and mediate access to the disk.
Multipath managers may also provide performance enhancement functionality by
incorporating algorithms to spread I/O over multiple paths. Multipath managers may be integrated in the OS, like Solaris MPxIO, or may be provided by a third party, e.g., EMC PowerPath™, HDS HDLM, or Veritas DMP.
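As a rough illustration of how a multipath manager can spread I/O across paths, the following sketch implements a simple round-robin path selector. It is a conceptual toy only; the path names are invented, and real products such as PowerPath or MPxIO use more sophisticated, adaptive load-balancing policies.

```python
from itertools import cycle

class RoundRobinPathSelector:
    """Toy round-robin selector over the healthy paths to a single LUN."""

    def __init__(self, paths):
        self.paths = list(paths)            # e.g., ["HBA0->array_portA", "HBA1->array_portB"]
        self._cycle = cycle(self.paths)

    def next_path(self):
        # Each I/O is dispatched on the next path in rotation.
        return next(self._cycle)

    def fail_path(self, path):
        # Remove a failed path; I/O continues on the remaining paths.
        self.paths.remove(path)
        self._cycle = cycle(self.paths)

selector = RoundRobinPathSelector(["HBA0->array_portA", "HBA1->array_portB"])
print([selector.next_path() for _ in range(4)])  # alternates between the two paths
selector.fail_path("HBA0->array_portA")
print(selector.next_path())                      # all I/O now uses the surviving path
```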
Logical Volume Managers (LVM) provide a virtualization layer above what the OS sees
as physical disks. LVMs group a disk or disks into a volume group, and then create
logical volumes that are again presented to the OS as physical disks. Logical volumes
can be a partition, i.e., a portion of a physical disk, or may span multiple physical disks.
Other LVM functionality may include snapshots, software RAID implementations, and
the ability to move data around on physical disks without any disruption to data access.
LVMs may be integrated with the OS, such as AIX LVM and Windows Logical Disk
Manager, or may be a third party product, such as Veritas Volume Manager.
File Systems provide a method to store and organize data in collections of files. The file
system maps user data to logical units of storage called file system blocks. These are
then mapped to LVM extents if an LVM is present, and then to the OS disk physical extents, which in the case of external storage are also virtual entities: the OS disk physical extents are in turn mapped to actual physical disks by the storage subsystem.
Applications directly provide services to users. They contain the functionality and logic to
allow users to perform groups of related tasks. Examples of application classes include
word processors, web browsers, and video games. Applications reside in a logical layer
above the file system. Database applications are specialized for organizing logically
related data to facilitate data analysis, storage, and retrieval. Oracle and Microsoft SQL Server are examples of database applications. Applications and databases both rely on the file
system to manage data storage and retrieval. Oracle Automatic Storage Management
(ASM) provides a file system and LVM-like functionality specifically for Oracle database
applications. From an OS perspective, ASM uses raw disk that does not pass through
any other LVM or file system present on the host.
Since the logical components in the I/O path are layered, the logical path of an I/O
passes through several layers.
• I/O begins in the application/database layer
• is passed to the file system
• then the LVM
• then to the SCSI target drivers
• followed by the multipath driver
• then the SCSI command is encapsulated by the driver or drivers that handle the
fibre channel protocol (FCP)
• then passes on to the HBA driver
• and out to the SAN
This layering is depicted in Figure 6 below.
Figure 6 - Host Logical Components in the I/O Path
Connectivity: Logical Components
Logical components in the connectivity environment direct traffic between the source
and destination devices on the network. For FC fabrics, the principal logical component
of interest from a configuration perspective is FC zoning that restricts communications
between the nodes that are logged into the FC fabric. A zone set is a collection of zones
that identifies which nodes are visible to each other on the fabric. Zoning decreases
interference between nodes that do not need to communicate with each other, provides
security by restricting communications between nodes, and simplifies management.
FC switches run their own operating systems that are also referred to as microcode. The
microcode provides all of the FC services and management components such as
command line interfaces (CLI), application interfaces (API), web interfaces, and alerting
via Simple Mail Transfer Protocol (SMTP).
This article does not provide a detailed discussion of the I/O path within the FC fabric, as
the logical configuration of FC switches is generally not the source of many systemic
problems in the storage environment.
Storage: Logical Components
The storage array operating system, often referred to as microcode, may contain a wide
range of features including support for multiple RAID levels, logical device cloning,
logical device snapshots, remote replication capabilities, and external storage
virtualization. The storage operating systems also manage performance optimization
algorithms. These include command queuing that optimizes I/O by re-ordering
commands to reduce seek time and rotational latency on the physical disks, and read
ahead algorithms that recognize sequential read I/O, then pre-fetch data from disk to
cache in anticipation of a host request to increase read hits. Read hits decrease I/O
response time by serving reads at cache rather than disk speeds.
Storage arrays partition physical disks into logical devices (ldevs) that are then
presented to hosts as SCSI target devices, commonly referred to as LUNs. The storage
array must adapt its SCSI emulation to accommodate variations in host operating
system requirements. Active/passive arrays also need to accommodate different
behaviours for the various host multipath manager solutions.
LUN security, also known as LUN masking, allows multiple hosts to access LUNs
through shared front end ports on the storage array without seeing LUNs that belong to
other hosts.
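The sketch below models, in simplified form, the two logical gates a host initiator must pass before it can address a LUN: fabric zoning (described in the connectivity section) must make the storage port visible to the initiator, and LUN masking must expose the LUN to that initiator. All WWPNs and LUN numbers are invented for illustration.

```python
# Hypothetical zone set: each zone is a set of WWPNs that may communicate.
zones = {
    "z_hostA_arrayport": {"10:00:00:00:c9:aa:bb:01", "50:06:04:8a:cc:dd:00:07"},
}

# Hypothetical LUN masking table: initiator WWPN -> LUNs it may see on the array.
lun_masking = {
    "10:00:00:00:c9:aa:bb:01": {0, 1, 2},
}

def can_access(initiator_wwpn: str, target_wwpn: str, lun: int) -> bool:
    """True only if some zone contains both ports AND masking exposes the LUN."""
    zoned = any({initiator_wwpn, target_wwpn} <= members for members in zones.values())
    masked_in = lun in lun_masking.get(initiator_wwpn, set())
    return zoned and masked_in

print(can_access("10:00:00:00:c9:aa:bb:01", "50:06:04:8a:cc:dd:00:07", 1))  # True
print(can_access("10:00:00:00:c9:aa:bb:01", "50:06:04:8a:cc:dd:00:07", 5))  # False: LUN not masked in
```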
The precise path an I/O takes through the storage array varies somewhat depending
whether the I/O is a write or a read, and on its interaction with the cache.
As mentioned earlier, storage arrays increase write performance by acknowledging write
I/O to the host once data has been written to mirrored cache, and then committing the
data to disk later. This is called a write hit, and describes the majority of write I/O. Some
storage arrays, such as the EMC CLARiiON®, allow the storage administrator to set write
aside limits that direct large I/Os directly to disk. This slows down the write response
time for the host, but prevents large I/Os from consuming too much cache. If a write I/O
is forced to wait for a cache slot to become free because the cache is globally stressed,
or a cache limit has been reached for the logical device, a delayed fast write has
occurred. The term write miss may also be applied to delayed fast writes, or to scenarios
where data is being written directly to disk for reasons other than write aside, such as
cache failure.
Read hits occur when a read I/O is serviced from data already resident in cache. As
discussed above, read-ahead algorithms increase the number of read hits by pre-
fetching data to cache in anticipation of host requests. A read miss occurs when a read
request has to wait for data to be read from the physical disks before it can be returned
to the host. Read hits service I/O at cache speeds, so have a significantly lower
response time to the host than read misses, which service I/O at disk speeds.
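A small worked example of why the read hit rate matters: the sketch below computes an average read response time from an assumed cache service time, an assumed disk service time, and a hit ratio. The millisecond figures are illustrative only, not measurements from any particular array.

```python
def avg_read_response(hit_ratio: float, cache_ms: float = 0.5, disk_ms: float = 8.0) -> float:
    """Weighted average read response time; service times are illustrative."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms

for hit_ratio in (0.3, 0.6, 0.9):
    print(f"hit ratio {hit_ratio:.0%}: {avg_read_response(hit_ratio):.2f} ms")
# Raising the hit ratio from 30% to 90% drops the average from ~5.75 ms to ~1.25 ms.
```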
The sequence of events for a write hit on the storage array begins with FC frames being
received on the front end ports, after which the SCSI payload is extracted by the front
end controllers. The I/O is written to mirrored cache, and then the front end controller
creates a SCSI write acknowledgement, encapsulates it in an FC frame, and sends it to
the host via the front end port. Later, the data is destaged from cache to the back end
controllers, which then commit the data to disk. Note that not all writes are necessarily
committed to disk; if a host overwrites the data while it still resides in cache (a write
cache rehit), then the original write is never committed to disk.
From a logical perspective, the I/O passes through layers that deal with FCP, then SCSI,
then map logical devices to physical devices, then service the I/O to/from the physical
devices, as illustrated in Figure 7.
Figure 7 – Storage Logical Components in the I/O Path
Figure 8 – Logical Components in the I/O Path
Configurable Items in the Storage Environment
The logical layers in the storage environment have configurable components that affect
performance and availability. Incorrect configurations at any level can defeat physical
and logical designs for high availability. Correct configuration requires end-to-end compatibility among the logical components. However, that end-to-end view can be difficult to achieve because the various actors often do not understand the full set of configuration requirements. Most default configurations will work when
deployed, concealing issues that will only become apparent when an exception scenario
occurs with the storage environment. Storage environments should be able to withstand
many types of physical failures and brief connectivity interruptions without causing
significant disruption to applications and databases.
Common failures and events that should not cause disruption to service include:
• Single disk failure within a RAID group
• HBA failure in a multipath environment
• Switch or switch port failure in a multipath environment
• Storage port failure in a multipath environment
• Redundant power supply failure on a storage array
• LUN trespassing in active/passive storage arrays
Some of the above may cause performance degradation in busy systems. For example,
the loss of one of two paths as the result of a switch port failure decreases theoretical
throughput by 50%, and the loss of a disk within a RAID group causes an increase in
disk activity when the RAID group rebuilds to a new disk or a hot spare.
Common maintenance and deployment activities that should not cause disruption to
service include:
• Activating zone sets on the SAN fabric
• Creating LUNs on storage arrays
• Allocating LUNs to hosts
• Microcode upgrades on switches
• Microcode upgrades on storage arrays
Some of these activities may cause some performance degradation in busy systems.
Additional care needs to be taken with active/passive storage arrays such as the EMC
CLARiiON if the activity requires a controller failover, e.g., a CLARiiON microcode upgrade requires the service processors (SPs) to reboot, resulting in many multipath events on connected hosts as LUNs trespass back and forth between the two SPs.
From an availability perspective, storage administrators’ tasks are fairly straightforward,
with only minor variations in configuration to suit the OS of the attached host. For
example, the following Storage Administrator activities are OS agnostic:
• LUN masking (unless LUN masking also includes SCSI emulation)
• LUN creation
• MetaLUN configuration (e.g., striped vs. concatenated)
• Fabric Zoning
Storage Administrator activities that are not OS agnostic are driven by OS variations in SCSI implementation and failover requirements. These variations affect activities such as:
• SCSI LUN numbering (e.g., Volume Set Addressing for HPUX)
• SCSI emulation and failover configurations (e.g., port flag settings on DMX, HBA
registration on CX)
Storage Administrators face greater challenges when configuring for performance. Performance considerations affect decisions about items such as:
• RAID type (e.g., RAID 5 vs. RAID 10)
• Disk tier (e.g., SSD vs. FC vs. SATA, 15000 RPM vs. 10000 RPM)
• MetaLUN configuration (striped vs. concatenated)
• Fan out ratios
Most storage availability issues related to configuration reside in the logical components
at the host level, due to a number of factors:
• the large number of configurable items
• the large number of product choices available at the various logical levels (e.g.,
ZFS vs. VxFS, native multipath vs. EMC PowerPath® vs. VxDMP)
• the responsibility for configurable items affecting availability and performance
resides in different functional groups (e.g., Systems Administrators, Database
Administrators, and Application Support Analysts).
Storage vendors provide guidance on host configurations that have been tested for availability and are suitable for general performance requirements. This guidance typically comes in the form of an interoperability matrix that contains information on a wide range of host-connectivity-storage combinations, for both logical and physical components. These documents are updated regularly to cover new products, and include required OS, microcode, and driver patches and upgrades. Detailed connectivity guides for a particular OS and storage vendor combination are usually available from the storage vendor. These detailed guides provide an in-depth look at configuration options, and may include parameter tuning guidelines beyond those published in the interoperability matrix.
EMC also provides the web-based Host Environment Analysis Tool (HEAT) to assist Systems Administrators in validating the configuration of hosts that are attached to EMC Symmetrix® or CLARiiON arrays via FC. The Systems Administrator runs a data collection tool that captures the configuration of a host (EMCGrab for UNIX, EMCReports for Windows), then uploads the compressed output file to the HEAT website. HEAT compares the host configuration against the current interoperability matrix and provides a report highlighting any discovered deficiencies.
Each logical layer of the I/O path has some mechanism for dealing with imperfect I/O
operations. As a result, there are multiple mechanisms for retrying failed I/O operations,
and timeout settings for reporting I/O operation failures up to the next logical layer.
Incompatible configurations between the layers can lead to service disruptions if any
layer is not able to appropriately handle I/O operation exceptions. Several examples of
failures due to configuration or compatibility issues are discussed below.
There must be some mechanism at the application level to deal with I/O interruptions that are considered normal operations in the storage environment. Whether there is a configurable item at the application layer depends on the application, so in some cases problems are introduced at the design stage, when applications are selected that cannot tolerate these normal operations. For example, WebLogic 8.1 JMS queues have internal timers that may cause JMS to shut down if I/O hangs during the ‘non-disruptive’ failover of a NAS head during a microcode upgrade, even though NFS on the host recovers gracefully.
With Oracle ASM, Database Administrators have taken on some additional configuration
responsibilities, as they now have more control over the configuration of logical devices
on the host. DBAs can make many configuration changes that affect performance, and some that can also impact availability. For example, adding new devices into an ASM disk
group triggers a rebalancing operation that can have a significant performance impact.
DBAs also choose how to allocate raw devices into the ASM disk groups; as a result,
poor communication between the DBA, Systems Administrator, and Storage
Administrator could lead to devices with dissimilar performance characteristics (e.g., FC
vs. SATAII) being added to the same ASM disk group.
Path management software may require SCSI target driver timeout settings that are different from the OS default settings. For example, failing to set the ssd:ssd_io_time parameter on a Solaris 10 host running the Leadville stack can cause the host to panic or offline all disks during routine SAN events such as LUN trespasses, switch microcode upgrades, or storage array microcode upgrades.
Host SCSI queue depth settings control the maximum number of outstanding SCSI commands that can be queued against a device. They are often thought of as relating only to performance, but queue depth can also be implicated in availability. For example, inappropriately high queue depth settings can lead to unusually high I/O response times on a busy storage array, which can trigger write timeouts within Oracle ASM and cause ASM to take disk groups offline.
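As a small example of how such host settings can be verified mechanically, the following sketch scans a Solaris /etc/system file for the ssd driver parameters discussed above and compares them against expected values. The expected values shown are placeholders only; the correct settings depend on the array, the multipath software, and the vendor's interoperability matrix.

```python
import re

# Placeholder expectations; take the real values from the vendor support matrix.
EXPECTED = {"ssd:ssd_io_time": 0x3c, "ssd:ssd_max_throttle": 20}

def read_etc_system(path="/etc/system"):
    """Return {parameter: value} for 'set <param>=<value>' lines."""
    settings = {}
    pattern = re.compile(r"^\s*set\s+([\w:]+)\s*=\s*(\S+)", re.IGNORECASE)
    with open(path) as fh:
        for line in fh:
            match = pattern.match(line)
            if match:
                settings[match.group(1)] = int(match.group(2), 0)  # accepts 0x.. or decimal
    return settings

def audit(settings):
    for param, expected in EXPECTED.items():
        actual = settings.get(param)
        if actual is None:
            print(f"MISSING  {param} (expected {expected:#x})")
        elif actual != expected:
            print(f"MISMATCH {param}: found {actual:#x}, expected {expected:#x}")
        else:
            print(f"OK       {param} = {actual:#x}")

# audit(read_etc_system())   # run on a Solaris host
```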
FC protocol parameter tuning may also be recommended on some systems, affecting
items such as the number of I/O retries at the FC frame level, or the length of time the
HBA waits before it takes a port offline due to loss of light.
Storage vendor qualified HBA drivers and firmware may contain default settings that are
different from the OEM installation, again leading to potential availability issues for
common SAN events.
Figure 9 relates the generic logical layers presented in this article to a specific
configuration example: a Solaris 10 host using the Leadville stack, attached via dual FC
fabrics to an EMC DMX class array. Also included are some comments about
configurable items at each logical layer, mapped to the actor in the organization most
likely to perform the configuration.
Figure 9 – I/O Path for Solaris 10 with Leadville and DMX
[Figure 9 content: the diagram maps the generic logical layers to a specific stack. Host (Solaris 10 with the Leadville stack): Oracle 10g application and database, ZFS file system or Oracle ASM (if present), zpools as the logical volume layer, SCSI target driver (ssd), multipath driver (scsi_vhci/MPxIO), FC protocol drivers (fp/fcp), and HBA driver (emlxs) over dual HBAs. Host configurable items include /etc/system entries such as ssd_io_time (I/O timeout), ssd_max_throttle (queue depth), and ssd retry counts; /kernel/drv/fp.conf, /kernel/drv/mpt.conf, and /kernel/drv/emlxs.conf entries covering automatic LUN discovery, port offline timers, and FC frame retries; and ZFS/ASM items such as block size, allocation unit size, striping and stripe size, compression, quotas, snapshots, and clones. Connectivity: dual fabrics with single initiator zoning and port speed settings. Storage (DMX): front end ports, FC protocol and SCSI emulation (Fibre and SCSI port flag settings, e.g., Common Serial Number, Disable Queue Reset on Unit Attention), LUN security (LUN masking, LUN mapping, SCSI LUN numbers), logical devices and MetaDevices (striping vs. concatenation), mirrored cache, disk controllers, and RAID 5 (7+1) disk groups; other configurable items include fan out ratios, disk tier (disk type, speed, size), RAID level, and hot spares.]
Defining the Problem
On occasion, storage goes awry. It is easy to imagine nightmarish storage scenarios; all you have to do is flip through the “Fixed Problems” section of the microcode patches you haven’t applied yet to get some ideas. However, most storage related problems occur because people associated with the storage environment deploy flawed configurations that appear to be healthy, but may not withstand storage environment exception scenarios, such as microcode upgrades and redundant hardware failures. In other cases, incorrect configurations may manifest as problems on a future reboot or during disk discovery. As a general statement, it is safe to say:
Most storage related failures are directly attributable to improper configurations
either as the result of initial deployment errors, or failure to maintain the
environment by pro-actively applying patches and upgrades.
As a corollary to the above, it follows that:
Most storage related failures are preventable.
Solving the Problem
Configuration of a storage environment is an iterative process. Configuration changes result from new deployments into the environment and from sustainment activities arising from problem, performance, and capacity management processes. Each of these
processes needs to be tied to a rigorous configuration management discipline to
maintain a storage environment that approaches optimal performance and availability
within the constraints of the environment’s physical capabilities.
The purpose of configuration management is to identify and track the characteristics of
the physical and logical components, with the amount of detail necessary to support
decision making regarding the potential impact configuration changes may have within
the storage environment. Configuration management begins with identifying the
components in the storage environment. Often, this information will be recorded in a
database referred to as the configuration management database (CMDB). The
configuration management discipline should also contain processes that control the
configuration (i.e., place controls on changes to the configuration), provide the ability to
report on the configuration, and require configuration audits. Configuration management
processes should be deeply integrated with problem management and change
management processes, and will assist in providing answers to questions such as:
• What business services, applications, databases, servers, etc. may be impacted
by a planned configuration change?
• What business services, applications, databases, servers, etc. may be impacted
by a problem that has been identified? (e.g., what servers are connected to a
failed switch, or what servers require proactive maintenance to apply an OS
patch?)
• What configuration changes were introduced in the storage environment that
may be linked to the introduction of a current problem?
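As an illustration of the kind of impact query a CMDB supports, the sketch below uses an invented set of relationships (server to switch, server to application) to list the servers and applications potentially affected by a failed switch. All names are hypothetical.

```python
# Hypothetical relationship data, as might be extracted from a CMDB.
server_to_switches = {
    "dbhost01": {"fabA_sw1", "fabB_sw1"},
    "apphost02": {"fabA_sw1", "fabB_sw2"},
    "apphost03": {"fabA_sw2", "fabB_sw2"},
}
server_to_apps = {
    "dbhost01": {"Orders DB"},
    "apphost02": {"Orders Web"},
    "apphost03": {"Reporting"},
}

def impacted_by_switch(failed_switch: str):
    """Servers and applications whose I/O path includes the failed switch."""
    servers = {s for s, switches in server_to_switches.items() if failed_switch in switches}
    apps = set().union(*(server_to_apps[s] for s in servers)) if servers else set()
    return servers, apps

servers, apps = impacted_by_switch("fabA_sw1")
print(servers)  # {'dbhost01', 'apphost02'}
print(apps)     # {'Orders DB', 'Orders Web'}
```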
Configuration management in the storage environment is complicated by the number of
organizational teams that may be involved in the design, deployment, and maintenance
of the configurable items in the I/O path. A successful configuration management
implementation must define, record, and standardize the configuration requirements for
all of the logical components in the I/O path. This will be an ongoing process, as it must
capture new technologies as they are introduced, and fixes and releases for existing
components (e.g., new driver versions, OS and microcode patches). Each of the
affected organizational groups should have a highly skilled resource involved in the
configuration management process since the establishment of deployment standards
requires specialized knowledge within each logical layer of the storage environment.
Some organizations may choose to create a separate organizational unit for
configuration management, while others may choose to create a committee containing
representation from several organizational groups. In any case, configuration standards
must be effectively communicated to the operational groups that apply the
configurations, and must be effectively integrated into their deployment and maintenance
procedures. Documentation of the standards should include a description of the
functionality of each configuration item, as well as the reason for the particular value it
should be set to, and, if appropriate, a reference to the vendor documentation that
specifies the setting. This higher level information will increase compliance by providing operational staff with a context for configuration decisions, and will generally increase their level of awareness of and comfort with the storage environment.
Audit configurations to ensure that they comply with the defined configuration standards.
Audits should be conducted for each deployment, and also periodically to measure the
degree of compliance with configuration standards and uncover any latent threats. In
large environments, periodic audits can be limited to some representative subset of the
environment to allow the overall compliance of the storage environment to be
extrapolated. Audits also allow compliance with configuration standards to be measured
and rated, which can be useful as an organizational performance metric, or as part of a
continuous improvement process.
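The compliance rating mentioned above can be derived very simply from audit results; the following sketch computes a compliance percentage from a sampled subset of hosts (the data is invented for illustration).

```python
# Invented audit results for a sampled subset of hosts: True = compliant.
audit_sample = {"dbhost01": True, "apphost02": False, "apphost03": True, "apphost04": True}

compliant = sum(1 for ok in audit_sample.values() if ok)
rate = compliant / len(audit_sample)
print(f"Sampled compliance: {compliant}/{len(audit_sample)} hosts ({rate:.0%})")
# Extrapolate to the wider environment only if the sample is representative.
```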
Figure 11 illustrates the desired outcome of implementing a configuration management
process for the storage environment.
Figure 11 – Configuration Management Goal
Conclusion
The purpose of a storage environment is to support business functions and objectives at
the least possible cost. To fulfill this purpose, the logical, physical, and human
components of a storage environment must be working in concert. However, storage
configurations often fail, with causes that are directly attributable to improper
configurations or poor maintenance practices. Thus, most storage related failures are
preventable.
Most storage configuration issues occur in the logical components at the host level, due
to a number of factors, such as:
• The large number of configurable items at the host level
• The large number of product choices available
• The responsibility for configurable items resides in several different functional
groups
When an organization first deploys a storage environment, the Storage Administrator
and the Systems Administrator roles are likely to reside within the same functional
group. As the storage environment grows larger, these roles are often separated into
different functional groups. As a result, storage related skills within the Systems
Administrator groups are likely to decline over time, as new Systems Administrators
have generally had very limited exposure to external storage. Since Systems
Administrators are responsible for a large number of configurable items related to
storage, this declining skill set presents challenges in maintaining the health of the
storage environment. Configuration and maintenance issues at the Systems
Administrator level can compound very quickly, such as when configuration errors are
embedded in deployment tools, or if server configurations and patch levels delay or
prevent microcode upgrades from being deployed on storage arrays and switches.
We can reduce storage related failures by implementing a rigorous configuration
management discipline. To be successful, the configuration management processes
must include representation from all of the functional groups involved in configuring the
I/O path, and be integrated into their documentation and procedures. As well as tracking
the status of all of the configurations within the storage environment, the configuration
management process must include regular audit procedures to provide a measurable
verification of the level of compliance to standards.
Configuration management faces fewer challenges if the number of unique
configurations can be reduced by:
• Limiting the number of manual tasks in server deployments by using standard
images during deployment and tools such as Jumpstart for Solaris, Kickstart for
Red Hat Linux, or NIM for AIX.
• Decreasing SAN complexity by limiting the number of storage vendors within the
same class of storage, i.e., having different vendors for monolithic arrays,
modular arrays, and NAS is manageable, but having two vendors for modular
storage increases complexity unnecessarily.
Storage vendors provide guidance on host configurations that have been tested for
availability and are suitable for general performance requirements. EMC also provides tools
that allow Systems Administrators to quickly check a host configuration against the
current EMC support matrix. In general, storage vendors are strongly motivated to help
keep your storage environment healthy. Their expertise and experience in other
customer sites should be leveraged when establishing a configuration management
process and in its ongoing maintenance.