
Upload: dangkhue

Post on 18-Mar-2018

Page 1: Click the tab to view text that corresponds to the audio ... · PDF file... Configuration Intelligence, Configuresoft, Connectrix, CopyCross ... Voyence, VPLEX, VSAM-Assist, WebXtender,

Copyright © 2014 EMC Corporation. Do not copy - All Rights Reserved.

Welcome to Storage Performance Fundamentals.

Click the Notes tab to view text that corresponds to the audio recording.

Click the Supporting Materials tab to download a PDF version of this eLearning. Copyright © 1996, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 , 2014 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC2, EMC, Data Domain, RSA, EMC Centera, EMC ControlCenter, EMC LifeLine, EMC OnCourse, EMC Proven, EMC Snap, EMC SourceOne, EMC Storage Administrator, Acartus, Access Logix, AdvantEdge, AlphaStor, ApplicationXtender, ArchiveXtender, Atmos, Authentica, Authentic Problems, Automated Resource Manager, AutoStart, AutoSwap, AVALONidm, Avamar, Captiva, Catalog Solution, C-Clip, Celerra, Celerra Replicator, Centera, CenterStage, CentraStar, ClaimPack, ClaimsEditor, CLARiiON, ClientPak, Codebook Correlation Technology, Common Information Model, Configuration Intelligence, Configuresoft, Connectrix, CopyCross, CopyPoint, Dantz, DatabaseXtender, Direct Matrix Architecture, DiskXtender, DiskXtender 2000, Document Sciences, Documentum, elnput, E-Lab, EmailXaminer, EmailXtender, Enginuity, eRoom, Event Explorer, FarPoint, FirstPass, FLARE, FormWare, Geosynchrony, Global File Virtualization, Graphic Visualization, Greenplum, HighRoad, HomeBase, InfoMover, Infoscape, Infra, InputAccel, InputAccel Express, Invista, Ionix, ISIS, Max Retriever, MediaStor, MirrorView, Navisphere, NetWorker, nLayers, OnAlert, OpenScale, PixTools, Powerlink, PowerPath, PowerSnap, QuickScan, Rainfinity, RepliCare, RepliStor, ResourcePak, Retrospect, RSA, the RSA logo, SafeLine, SAN Advisor, SAN Copy, SAN Manager, Smarts, SnapImage, SnapSure, SnapView, SRDF, StorageScope, SupportMate, SymmAPI, SymmEnabler, Symmetrix, Symmetrix DMX, Symmetrix VMAX, TimeFinder, UltraFlex, UltraPoint, UltraScale, Unisphere, VMAX, Vblock, Viewlets, Virtual Matrix, Virtual Matrix Architecture, Virtual Provisioning, VisualSAN, VisualSRM, Voyence, VPLEX, VSAM-Assist, WebXtender, xPression, xPresso, YottaYotta, the EMC logo, and where information lives, are registered trademarks or trademarks of EMC Corporation in the United States and other countries.

All other trademarks used herein are the property of their respective owners.

© Copyright 2014 EMC Corporation. All rights reserved. Published in the USA.

Revision Date: 02-07-2014 Revision Number: MR-1WP-STORPERFD.1.0.0

1 Storage Performance Fundamentals

This course covers the basics of analyzing storage system performance and introduces storage performance benchmarking.

This module describes the components that influence the performance of a storage system. It will begin by defining many terms used in analyzing storage system performance.

Performance in this context is defined as how well a system or a group of systems works. It is the manner or efficiency with which a system performs, or the total effectiveness of a system. Storage solutions must meet a customer’s capacity and performance requirements. The need to measure performance is usually driven by IT customers, sales teams, or end users. For example, a customer wants to run certain applications with certain response times, or a sales team and technical architects (TAs) are asked to design a solution.

Though performance can be defined by raw numbers, it is usually the user experience in real-world environments that prompts the investigation of performance issues. Typical complaints are that the response time is too long (the application is slow), or that certain operations take too long to finish (backups taking 4 hours when they should take only 2 hours). Some performance issues may be corrected by the end user. Others, like those associated with application design, may not be that simple.

Storage system performance can be measured in several ways including response time, throughput, and availability.

Many variables influence performance. Performance issues may be found in the application, the client (host) computer, the IP network (LAN) or storage area network (SAN), the file server(s), or the storage system itself.

Bandwidth measures the capacity of a link, bus, channel, interface, or device to transfer data. This is the amount of data that can be transmitted in a fixed amount of time and is usually measured in bits per second (bps) or bytes per second (Bps). Network Bandwidth usually refers to the theoretical data transfer rate of a device under ideal conditions; it should therefore be treated as an upper bound on performance. Storage System Bandwidth is the amount of data that is transferred along a channel per second. It is measured in megabytes per second (MB/s) or gigabytes per second (GB/s).

Throughput is the amount of data per second that a drive can deliver to the controller. The data transfer rate of a drive covers both the internal rate (moving data between the disk surface and the controller on the drive) and the external rate (moving data between the controller on the drive and the host system).

IOPS is measured by the number of I/O operations that are processed per second over a period of time. It is also known as the storage array transaction rate or total throughput.
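For a workload of fixed-size I/Os, these measures are linked: the bandwidth consumed is the IOPS multiplied by the I/O size. The short Python sketch below illustrates the arithmetic. The function names and the 1 MB = 1024 KB convention are assumptions made for illustration and are not part of the course material.

```python
def bandwidth_mb_s(iops: float, io_size_kb: float) -> float:
    """Bandwidth in MB/s for fixed-size I/Os (assumes 1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024


def bits_to_bytes_per_s(bits_per_s: float) -> float:
    """Convert a rate in bits per second (bps) to bytes per second (Bps)."""
    return bits_per_s / 8


# Example: 2,000 IOPS of 8 KB each is a little under 16 MB/s.
example_mb_s = bandwidth_mb_s(2000, 8)
```

The same array can therefore report a high IOPS figure with small I/Os or a high bandwidth figure with large I/Os, which is why both numbers are normally quoted together with the I/O size.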

Utilization measures the fraction of time that a device is busy servicing requests, usually reported as a percentage busy.

Service time measures how long it takes to process a specific request.

Queue time represents the amount of time a request waits to be processed. In general, at low levels of utilization, there is minimal queuing, which allows device service time to dictate response time. However, as utilization increases, queue time increases nonlinearly and begins to dominate response time.

Response time is the interval of time between submitting a request and receiving a response. This is normally measured in milliseconds (ms). Response time encompasses both the service time at the device processing the request and any other delays encountered waiting for processing. Response time can be defined as service time + queue time. Response time is often referred to as latency.
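To make the four terms concrete, the following Python sketch replays a list of requests through a single device that serves them one at a time in arrival order. It is a simplified model under stated assumptions (one device, first-come first-served, arrival times already sorted); the function name and the sample numbers are hypothetical.

```python
def fifo_response_times(arrivals_ms, service_ms):
    """Replay requests through one device that serves them in arrival order.

    Returns (rows, utilization), where each row is
    (queue time, service time, response time) in milliseconds.
    """
    if not arrivals_ms or len(arrivals_ms) != len(service_ms):
        raise ValueError("need matching, non-empty arrival and service lists")
    rows = []
    device_free_at = arrivals_ms[0]
    busy = 0.0
    for arrived, service in zip(arrivals_ms, service_ms):
        start = max(arrived, device_free_at)  # wait while the device is busy
        queue = start - arrived
        rows.append((queue, service, queue + service))  # response = queue + service
        device_free_at = start + service
        busy += service
    elapsed = device_free_at - arrivals_ms[0]
    utilization = busy / elapsed if elapsed > 0 else 0.0
    return rows, utilization
```

With requests arriving every 1 ms and a service time of 2 ms each, the device is always busy and each request waits longer than the one before it. With the same requests arriving every 10 ms, queue time is zero and response time equals service time.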

A Bottleneck is the slowest component in a system or process at any given time. There is always a bottleneck; however, a bottleneck does not necessarily cause a noticeable problem. When the current bottleneck is modified so that it is no longer “the bottleneck”, a different component becomes the bottleneck. The goal is to address bottlenecks until the required level of performance is achieved.

Bottleneck refers to the specific component that causes the delay in a system. In a storage system the bottleneck could be the disks, the bus, memory bandwidth, etc.

A benchmark is a standardized test that serves as a basis for evaluation or comparison. It can be a test used to compare the performance of hardware and/or software.

A Test Harness is a collection of software and test data configured to test a system by running it under varying conditions and monitoring its behavior and outputs. The Test Harness is the main script or software that actually executes the tests, using test libraries, and generates reports. It requires test scripts that are designed to handle different test data and different test scenarios.

Burstiness describes an I/O pattern that is characterized by brief periods of intense activity followed by periods that are less busy. The greater the range of variation, the more “bursty” the pattern is considered. For example, one might see a burst of I/Os during the first 15 minutes of a work day due to employee logins.

Saturation is the state in which a system or device has reached the limits of normal operation. At this point, response time will increase dramatically without a corresponding increase in bandwidth or IOPS. Saturation is the result of workload rates that exceed a system’s or device’s operational capabilities. Industry-standard benchmark tests, such as SPEC SFS, test storage devices by increasing the workload to the point of saturation.

Using the terminology covered in the previous slides, we can now apply several performance metrics as part of a benchmark measurement. The graph displayed here shows two devices upon which a workload with a steadily increasing IOPS is being run. As the IOPS increase, we see a corresponding increase in response time. That increase eventually spikes severely upward, demonstrating that the device has reached saturation. Therefore this graph is measuring a system’s saturation.

Workload Skew expresses how unevenly I/O is spread over the stored data. It is the percentage of the total I/O load that falls on a given percentage of the total data capacity, read at the point where those two percentages sum to 100%. In other words, skew shows how much of the storage system is handling the bulk of the workload.

The diagonal (blue) dashed line shows a perfectly linear distribution of workload over the available data area (storage capacity). The total load would be 50% at the 50% capacity mark, so the skew would be 50%. Skew values lower than 50% are meaningless (and are the same as 100% minus that value). Workload with skew values around the 50% mark is regarded as data with no skew.

In the case of the solid (blue) line, 90% of the workload is distributed over 10% of the capacity, for a skew of 90%. The calculation of skew, at the LUN or slice level (part of a LUN), uses tools available to EMC employees only.

Explained another way: The x-axis is cumulative data, expressed as a percentage, and the y-axis is cumulative I/O, expressed as a percentage. Note that we’re dealing with percentages – an important point when we need to specify the skew. Skew is the ratio of total workload to the total data capacity, at the point where the workload and capacity percentages add up to 100%. In the example shown above, we see that 90% of the I/O activity (total workload) is performed on 10% of the data (total data capacity). The sum of those values is 100%: 90% + 10%. This environment will then be described as having a skew of 90%.

Skew may be calculated at the LUN level, and the results will then show which LUNs are the most active. Those LUNs would typically be relocated to a higher tier of storage.
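The definition above can be sketched in Python. This is not the EMC tool mentioned earlier; it is a minimal illustration that assumes the capacity is divided into equal-sized extents (LUNs or slices), that an I/O count is known for each extent, and that linear interpolation between extents is acceptable. The function name is hypothetical.

```python
def workload_skew(io_per_extent):
    """Return skew as a fraction (0.5 means no skew) for equal-sized extents.

    Extents are ranked busiest first. Skew is the cumulative I/O fraction at
    the point where cumulative I/O and cumulative capacity fractions sum to 1.
    """
    total_io = sum(io_per_extent)
    if total_io <= 0:
        raise ValueError("workload must contain some I/O")
    n = len(io_per_extent)
    prev_cap = prev_io = 0.0
    running = 0
    for k, io in enumerate(sorted(io_per_extent, reverse=True), start=1):
        running += io
        cap, cum_io = k / n, running / total_io
        if cap + cum_io >= 1.0:
            # Interpolate along this segment to where the two fractions sum to 1.
            prev_sum = prev_cap + prev_io
            t = (1.0 - prev_sum) / ((cap + cum_io) - prev_sum)
            return prev_io + t * (cum_io - prev_io)
        prev_cap, prev_io = cap, cum_io
    raise AssertionError("unreachable: the last extent always sums to 2.0")
```

Ten extents with identical I/O counts give a skew of 0.5 (no skew). Ten extents where one extent receives 90% of the I/O give a skew of 0.9, matching the 90% example in the text.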

Here are some representations of skew values, ranging from heavy skew to no skew. Note that “no skew” is a skew of 50%, meaning that I/O is distributed evenly across the data surface. 50% of the I/O is therefore performed on 50% of the data capacity, and the line showing this is a straight line from point (0,0) to point (100,100). Workloads that exhibit heavy skew are ideal candidates for the use of flash drives.

Seek Time is the time required for read/write heads in a disk drive to move between tracks of the disk.

Rotational Latency is the time taken by the platter to rotate and position the data location under the read/write head.

Transfer Rate, also called Data Transfer Rate or Throughput, is the amount of data per second that a drive can deliver to the controller. The data transfer rate of a drive covers both the internal rate (moving data between the disk surface and the controller on the drive) and the external rate (moving data between the controller on the drive and the host system). Be aware that these parameters are only relevant for rotating, magnetic disk drives.
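These three parameters can be combined into a rough estimate of the service time for one random I/O on a rotating drive. The Python sketch below assumes that the average rotational latency is half of one rotation and that seek, rotation, and transfer do not overlap. The function names are hypothetical, and any drive figures passed in (seek time, rotation speed, transfer rate) are illustrative values, not vendor specifications.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one full rotation, in milliseconds."""
    return 0.5 * 60_000 / rpm


def disk_service_time_ms(seek_ms, rpm, io_kb, transfer_mb_s):
    """Estimated time for one random I/O: seek + rotational latency + transfer."""
    transfer_ms = (io_kb / 1024) / transfer_mb_s * 1000
    return seek_ms + avg_rotational_latency_ms(rpm) + transfer_ms


def max_random_iops(seek_ms, rpm, io_kb, transfer_mb_s):
    """Upper bound on random IOPS for one drive serving one I/O at a time."""
    return 1000 / disk_service_time_ms(seek_ms, rpm, io_kb, transfer_mb_s)
```

For small random I/Os the mechanical terms dominate: at 15,000 rotations per minute the average rotational latency alone is 2 ms, while transferring 8 KB at an assumed 100 MB/s takes well under a tenth of a millisecond.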

Read I/O is an operation executed on a file by a user or application that does not entail any data modification.

Write I/O is an operation executed on a file by a user or application that will modify the data one way or another. Read and write I/Os depend greatly on cache for the best performance. We will cover cache and cache performance soon.

Random access indicates that I/Os are distributed throughout the relevant address space.

Sequential access indicates that I/Os are contiguous within the relevant address space.

Note that the relevant address space can be that of a file, file system, LUN, RAID group, etc. When dealing with rotating disks, the system gets much better performance from sequential I/Os because the next location in the data is the next location on the disk, and the disk head does not have to move to access the next block on disk.

RPM (Rotations Per Minute) is the number of disk rotations in one minute. RPM is the same as Rotation Speed, which is the speed at which the hard drive platter rotates.

Disk Queue is measured as an average. The Average Disk Queue Length is the average number of both read and write requests that were queued for the selected disk during the sample interval.

Average Response Time is the relationship between service time and controller utilization. It is given as:

Average Response Time (TR) = Service Time (Ts) / (1 – Utilization)

Note: Disk Transfers/sec = IOPS

Average Disk sec/Transfer = Average Response Time
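The formula can be evaluated for a range of utilization values to see how quickly response time grows as a device approaches 100% busy. The Python sketch below is a direct transcription of the formula; the function name and the 5 ms service time used in the comments are assumptions for illustration.

```python
def average_response_time(service_time_ms: float, utilization: float) -> float:
    """Average response time = service time / (1 - utilization).

    utilization is a fraction in the range [0, 1); at 1.0 the formula diverges.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be at least 0 and less than 1")
    return service_time_ms / (1.0 - utilization)


# With an assumed 5 ms service time, print the curve at a few utilization levels.
for pct in (10, 50, 80, 90, 95):
    rt = average_response_time(5.0, pct / 100)
    print(f"{pct:>3}% busy -> {rt:6.1f} ms")
```

At 50% utilization the response time is twice the service time, and at 90% it is roughly ten times the service time. This is the nonlinear growth in queue time described earlier.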

Cache is a semiconductor memory where data is placed temporarily to reduce the time required to service I/O requests from hosts.

Cache Page - Cache is organized into pages. A cache page is the smallest unit of cache allocation. The size of a cache page is configured according to the application I/O size.

Cache Hit - There are two types of cache hits, read and write. A Read hit occurs when the requested data is found in cache and sent directly to the host, without any disk operation. A Write hit is when data sent by a host is written to cache and the write acknowledged back to the host, before a disk write is performed.

A Cache Miss occurs when the requested data is not found in cache. It is an I/O request that does not result in a cache hit.

Flushing is the process that commits data from cache to the disk. On a VNX system, there are three types of flushing: Idle flushing, High watermark flushing, and Forced flushing. Other storage systems may use different mechanisms and different terms for the de-staging of cache.

• Idle Flushing occurs continuously, at a modest rate, when the cache utilization level is between the high and low watermark.

• High Watermark Flushing is activated when cache utilization hits a certain level called the high watermark. The storage system dedicates some additional resources for flushing. This type of flushing has some impact on I/O processing.

• Forced Flushing occurs in the event of a large I/O burst when cache reaches 100 percent of its capacity, which significantly affects the I/O response time. In forced flushing, the system flushes the cache as a priority by allocating more resources to it.
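The three flushing types can be summarized as a simple decision on the cache utilization level. The Python sketch below only restates the descriptions above and is not how a VNX is implemented. The function name, the default watermark values of 60% and 80%, and the label returned below the low watermark (which the text does not describe) are all assumptions.

```python
def flush_mode(cache_used_pct, low_watermark_pct=60.0, high_watermark_pct=80.0):
    """Classify the flushing type from the cache utilization percentage.

    The watermark defaults are illustrative placeholders, not documented values.
    """
    if not 0.0 <= cache_used_pct <= 100.0:
        raise ValueError("cache utilization must be between 0 and 100")
    if not low_watermark_pct < high_watermark_pct:
        raise ValueError("low watermark must be below high watermark")
    if cache_used_pct >= 100.0:
        return "forced"            # cache is full: flushing takes priority
    if cache_used_pct >= high_watermark_pct:
        return "high watermark"    # additional resources dedicated to flushing
    if cache_used_pct >= low_watermark_pct:
        return "idle"              # continuous flushing at a modest rate
    return "below low watermark"   # assumption: not covered by the text
```

The ordering of the checks mirrors the increasing impact on host I/O: idle flushing is the least intrusive and forced flushing the most.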

Some storage systems do not have cache; however, if cache is available, cache reads are generally much faster than disk reads. This is because cached data is retrieved at electronic (not mechanical) speeds.

The amount of cache memory and how it works varies by the storage system and model. For example, on an EMC VNX, as soon as a write I/O is absorbed by cache and mirrored to the peer Storage Processor, an acknowledgement is sent back to the host. Cache writes also can be faster than writes directly to disk. If writes are reasonably sequential, a VNX cache coalesces the incoming I/Os and sends data in fewer I/Os down to the disks. The number of I/Os going down to disk will then be less than the number of host I/Os received by the LUNs.

The VNX Operating Environment (OE) for Block stripe element size is 64 KB. A stripe element size means that a VNX will write 64 KB to a disk before moving to the next disk in the RAID group. In addition, when a write targets a block that is already in cache, the cached data is superseded by the latest write.
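The effect of a 64 KB stripe element can be illustrated by mapping a logical offset to a position within the stripe. The Python sketch below is a simplification: it numbers only the data elements of a stripe and ignores parity placement and rotation, so the index is a position in the stripe rather than a physical drive. The function names are hypothetical.

```python
STRIPE_ELEMENT_KB = 64  # stripe element size stated in the text


def data_position_for_offset(offset_kb: int, data_disks: int) -> int:
    """Position (0-based) of the data element that holds this logical offset."""
    return (offset_kb // STRIPE_ELEMENT_KB) % data_disks


def split_io(offset_kb: int, length_kb: int, data_disks: int):
    """Split one I/O into (data position, KB) pieces at element boundaries."""
    pieces = []
    end = offset_kb + length_kb
    while offset_kb < end:
        element_end = (offset_kb // STRIPE_ELEMENT_KB + 1) * STRIPE_ELEMENT_KB
        chunk = min(end, element_end) - offset_kb
        pieces.append((data_position_for_offset(offset_kb, data_disks), chunk))
        offset_kb += chunk
    return pieces
```

For a group with four data disks, a 256 KB I/O that starts at offset 0 touches each data position once with 64 KB, whereas a 64 KB I/O that starts at offset 32 KB is split across two positions.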

This module defined storage performance as how well a system works. One reason for measuring performance is so that a sales team can locate or design a system that will meet a customer’s performance requirements. A customer may want to run particular applications with required response times. Another reason for measuring performance is so that a baseline can be set and used later as a reference. If end users report that certain tasks or application response times have increased, this can be compared with the baseline reference. Terms such as bandwidth, throughput, and utilization were defined.

This module introduces and describes the I/O workloads and how they can be characterized. The module also differentiates what should be considered when measuring performance in NAS versus SAN environments.

This lesson defines different types of workloads and explains the importance of workload classification to storage system performance. Little’s law is introduced. RAID levels are reviewed, and the effect of the different RAID levels on storage system performance is explained.

Workload Characterization, also called Workload Type, Workload Parameters, Workload Attributes, or Workload Key Performance Metrics, is the single most important performance consideration and can dramatically affect system performance. Workload Characterization describes the different “types of workloads” that may be presented to a storage system.

Workloads can be characterized by:

• I/O Size

• Read vs. Write I/Os

• Random vs. Sequential access

• Single vs. Multi-threaded access

• Working Set Size

• I/O Demand Rate

Before you can start to analyze and tune for performance you must first know what type of workload the application or client is generating.

The I/O size is the amount of data for each I/O transaction requested by the host. Some typical examples of I/O sizes are 8 KB (this is a typical file system or an Oracle application transfer amount), 64 KB (this is a typical transfer I/O size for a backup/restore process), and 256 KB (this is a typical streaming video application transfer amount).

The I/O request size has a significant effect on throughput. Generally, the larger the I/O size, the higher the storage bandwidth. Most production workloads have a mix of I/O sizes.

Larger I/Os take longer to transmit and process. However, some of the overhead used to execute an I/O is fixed, so if data exists in larger chunks, it is more efficient to transmit larger blocks. The same is true of an IP network. It takes longer to transmit larger packets; however, larger packets reduce the per-packet overhead that IP switches incur in processing them.

A host can move more data, faster, by using larger I/Os than smaller ones. The response time of each large transfer will be longer than the response time for a single smaller transfer, but the combined service times of many smaller transactions will be greater than a single transaction that contains the same amount of data.
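The trade-off can be illustrated with a simple cost model in which every I/O pays a fixed overhead plus a transfer time proportional to its size. The Python sketch below is such a model for a single stream of back-to-back I/Os. The overhead of 0.5 ms and the transfer rate of 200 MB/s are made-up values chosen only to show the shape of the curve, and the function names are hypothetical.

```python
def io_time_ms(io_kb, fixed_overhead_ms=0.5, transfer_mb_s=200.0):
    """Time for one I/O: fixed per-I/O overhead plus size-dependent transfer."""
    return fixed_overhead_ms + (io_kb / 1024) / transfer_mb_s * 1000


def single_stream_rates(io_kb, fixed_overhead_ms=0.5, transfer_mb_s=200.0):
    """Return (IOPS, MB/s) for one stream issuing I/Os back to back."""
    iops = 1000 / io_time_ms(io_kb, fixed_overhead_ms, transfer_mb_s)
    return iops, iops * io_kb / 1024


# Moving 256 KB as one large I/O versus thirty-two 8 KB I/Os.
one_large_ms = io_time_ms(256)
many_small_ms = 32 * io_time_ms(8)
```

Under these assumptions, the 256 KB transfer takes longer than any single 8 KB transfer, but far less time than the thirty-two small transfers combined, because the fixed overhead is paid once instead of thirty-two times. As I/O size grows, the modeled IOPS fall while the modeled bandwidth rises.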

As the I/O size increases, bandwidth also increases. With larger I/Os we are able to transmit more data. Notice that the IOPS decrease as the I/O size increases; with larger I/Os we are able to move more data while sending fewer I/Os. This test was performed on a single RAID 5 LUN with 16 threads.

Another workload characteristic is whether an I/O access is a read or a write; the mix is typically called the read/write ratio. Very few workloads are all reads or all writes. It is important to know which access is the majority, because the two access types use different amounts of storage system resources. Reads consume fewer resources than writes. Sequential reads that find their data in the array’s cache consume the least amount of resources and have the highest throughput. However, reads not found in cache, which are normal with random access, have a much lower throughput and higher response time because data will need to be retrieved from disk.

Writes use more resources and are generally slower than reads due to the fact that protection is usually added to new data. Typically, all writes must be cached, mirrored, and acknowledged. This calls for a larger cache size for buffering writes.

Most applications fall under one of two access patterns: Random and Sequential. Random access is exemplified by Online Transaction Processing (OLTP) (databases) where data reads and modifications are made in a scattered manner across the entire dataset. A random workload refers to reads or writes that are distributed throughout the relevant address space. Random I/O at the drive level requires the drive to seek data across the rotating platters, which involves a relatively slow, mechanical head movement.

Sequential access refers to successive reads or writes that are logically contiguous within the relevant address space. Sequential access is typical during back-up and restore operations as well as event logging. To enhance performance, intelligent storage systems will detect sequential access patterns and begin to pre-fetch data into cache. This allows the host to satisfy its I/O request from cache rather than disk. This is called Pre-fetching.

Sequential writes can also be coalesced: many smaller writes are combined into fewer, larger transfers to the disk or array.
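Coalescing can be sketched as merging writes whose block ranges are contiguous. The Python example below is a minimal illustration, not an array's actual algorithm: it assumes writes are described by a starting block and a block count, merges a write only with the one immediately before it, and places no limit on the size of a merged transfer. The function name is hypothetical.

```python
def coalesce_writes(writes):
    """Merge contiguous writes, given as (start_block, block_count) in arrival order."""
    merged = []
    for start, count in writes:
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This write begins exactly where the previous one ended: extend it.
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged
```

Three consecutive 8-block writes starting at block 0 followed by one write at block 100 become two transfers, one of 24 blocks and one of 8 blocks. A random write pattern leaves nothing to merge, which is one reason random writes cost more than sequential writes.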

Per Wikipedia: “A thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler.”

This workload characteristic represents the degree of I/O demand parallelism presented to the storage by the application. It is a measure of how much demand is placed on the storage system, also called the Demand Intensity. This parameter is best gleaned from the application vendor or the system administrator.

Displayed here are the results of a test that was performed on a NAS file system created from RAID 5 (4+1) RAID groups. As more threads are added to this random 8 KB write workload, more IOPS are performed and more bandwidth is consumed.

On a per-thread basis, sequential I/O is significantly faster than random I/O. Increasing the number of concurrent sequential threads to the same disk begins to generate random behavior on the back-end because the sequential streams are multiplexed together.

If the production environment will be dealing with sequential I/O, avoid multi-threaded sequential access to the same set of disks. Such access appears random to the storage system: because the data is in different locations, the drives have to seek all over the disk rather than reading each sequential file serially.

For example, three users are trying to sequentially read different files that are located on the same disk. Even though these users are sending sequential I/Os, they are accessing different locations on the disk at the same time, which, to the storage array, looks like random access.
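
The three-user example can be sketched in Python by counting how often the next block requested does not directly follow the previous one. The file locations, the file length of 100 blocks, and the strict round-robin interleaving are assumptions for illustration only.

```python
from itertools import chain


def count_non_contiguous(blocks):
    """Count accesses that do not immediately follow the previous block."""
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur != prev + 1)


# Three hypothetical files, each 100 blocks long, at different disk locations.
streams = [list(range(start, start + 100)) for start in (0, 10_000, 20_000)]

# One reader at a time: only the jump from one file to the next is non-contiguous.
serial = list(chain.from_iterable(streams))

# Round-robin interleaving, as three concurrent readers might produce.
interleaved = [block for group in zip(*streams) for block in group]

print(count_non_contiguous(serial))
print(count_non_contiguous(interleaved))
```

Each user's stream is perfectly sequential, yet in the interleaved order every request lands somewhere other than the block after the previous one.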

The results shown here are from a test using 32 KB client I/Os. Each thread accesses a single file on the same file system. As noted, per-thread performance decreases as more threads are used, while aggregate bandwidth increases. More data is accessed as the number of threads increases, but each client sees a performance decrease because response time increases.

Working Set Size is the portion of the total data space that is being used by an application at a certain period of time, also known as active data. The working set of an application is defined as the total address space that is traversed (either written and/or read) in some finite, short period of time during its operation.
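
One way to estimate a working set from an I/O trace is to count the distinct blocks touched inside a time window. The sketch below assumes a hypothetical trace format of (timestamp, block number) pairs and an 8 KB block size; neither comes from the course material.

```python
def working_set_size(trace, start, end, block_size):
    """Return the bytes covered by distinct blocks touched in [start, end).

    `trace` is an iterable of (timestamp_seconds, block_number) pairs.
    Repeated accesses to the same block count once.
    """
    touched = {block for timestamp, block in trace if start <= timestamp < end}
    return len(touched) * block_size


# Hypothetical trace: blocks 10 and 11 are re-read, block 9000 falls outside the window.
trace = [(0.0, 10), (0.5, 11), (1.0, 10), (1.5, 500), (2.5, 11), (9.0, 9000)]
size = working_set_size(trace, 0.0, 5.0, 8192)
print(size)  # three distinct blocks of 8 KB each
```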

The I/O demand rate parameter can take on one of two units of measure depending on the transfer block size. One can be derived from the other.

Typically, for I/O transfer sizes less than 64 KB, I/O operations per second (IOPS) is the unit of choice. This is because smaller transfers are limited by the transaction processing rate of the storage rather than the raw bandwidth. The demand of OLTP workloads, due to their relatively small block sizes (8-16 KB), is expressed in these units. One can think of this parameter as analogous to a toll booth on a freeway. Regardless of the speed limit (bandwidth), unless enough vehicles can be processed by the toll booths, there will be performance problems.

For I/O transfer sizes greater than 64 KB, megabytes per second (MB/s) is used because larger transfers are limited by the bandwidth of the storage rather than the I/O processing capability. Backup and restore workloads are expressed in MB/s.
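
Because one unit can be derived from the other, the conversion is simple arithmetic once the I/O size is known. The helper names below are hypothetical, and the sketch assumes 1 MB = 1024 KB.

```python
def iops_to_mbps(iops, io_size_kb):
    """Bandwidth in MB/s implied by an I/O rate at a fixed transfer size."""
    return iops * io_size_kb / 1024


def mbps_to_iops(mbps, io_size_kb):
    """I/O rate implied by a bandwidth at a fixed transfer size."""
    return mbps * 1024 / io_size_kb


# A small-block OLTP-style workload: high IOPS, modest bandwidth.
print(iops_to_mbps(5000, 8))    # 5000 IOPS at 8 KB

# A large-block backup-style workload: high bandwidth, modest IOPS.
print(mbps_to_iops(200, 256))   # 200 MB/s at 256 KB
```

The two examples show why the unit follows the transfer size: 5000 small I/Os move under 40 MB/s, while 200 MB/s of large transfers needs only 800 I/Os per second.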

There is another relationship that ties together response time and queue depth: Little’s Law. Little’s Law governs the concurrency a system needs in order to achieve a desired amount of throughput. For storage, Little’s Law is: (Outstanding I/Os) ÷ (response time) = IOPS

Another way of expressing outstanding I/Os is the queue depth of a system.

If you boil this down, ultimately the limitation of IOPS performance is the ability of a system to handle outstanding I/Os concurrently. Once that limit is reached, the I/Os get backed up and the response time increases rapidly. This is the reason a common tactic to increase storage performance has been to simply add disks: each additional disk increases the concurrent I/O capability.
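
Little’s Law as stated above can be applied in either direction. The following sketch uses hypothetical function names and expresses response time in milliseconds; the sample values are illustrative, not measurements.

```python
def iops_from_queue_depth(outstanding_ios, response_time_ms):
    """Little's Law: IOPS = outstanding I/Os / response time (in seconds)."""
    return outstanding_ios * 1000 / response_time_ms


def queue_depth_for_iops(target_iops, response_time_ms):
    """Rearranged: outstanding I/Os needed to sustain a target I/O rate."""
    return target_iops * response_time_ms / 1000


# 32 outstanding I/Os at a 5 ms response time.
print(iops_from_queue_depth(32, 5))

# Concurrency needed to reach 6400 IOPS if each I/O takes 5 ms.
print(queue_depth_for_iops(6400, 5))
```

If the system cannot hold that many I/Os in flight, the only way the equation balances at the same IOPS is a lower response time, which is why adding disks (more concurrency) is a common remedy.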

RAID level selection should be determined by application performance, data availability requirements, and cost. RAID levels are defined on the basis of striping, mirroring, and parity techniques. Some RAID levels use a single technique, whereas others use a combination of techniques. The commonly used RAID levels are listed on the slide.

When choosing a RAID type, it is imperative to consider its impact on disk performance and application IOPS. In both mirrored and parity RAID configurations, every write operation translates into more I/O overhead for the disks, which is referred to as a write penalty. In a RAID 1 implementation, every write operation must be performed on two disks configured as a mirrored pair, whereas in a RAID 5 implementation, a write operation may manifest as four I/O operations. When performing I/Os to a disk configured with RAID 5, the controller has to read, recalculate, and write a parity segment for every data write operation.

This slide illustrates a single write operation on RAID 5 that contains a group of five disks. The parity (P) at the controller is calculated as follows:

Cp = C1 + C2 + C3 + C4 (XOR operations)

This formula will be different for a RAID group consisting of a different number of physical disks.

Whenever the controller performs a write I/O, parity must be computed by reading the old parity (Cp old) and the old data (C4 old) from the disk, which means two read I/Os. Then, the new parity (Cp new) is computed as follows:

Cp new = Cp old – C4 old + C4 new (XOR operations)

After computing the new parity, the controller completes the write I/O by writing the new data and the new parity onto the disks, amounting to two write I/Os. Therefore, the controller performs two disk reads and two disk writes for every write operation, and the write penalty is 4.
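
The parity update can be checked with a short Python sketch. Because both the "+" and the "–" in the formulas above are XOR operations, removing the old data and adding the new data are the same operation. The two-byte strip contents are arbitrary values chosen for illustration.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data strips (two bytes each) and their parity: Cp = C1 xor C2 xor C3 xor C4.
c1, c2, c3, c4_old = b"\x11\x11", b"\x22\x22", b"\x44\x44", b"\x88\x88"
cp_old = xor_blocks(c1, c2, c3, c4_old)

# Small write to C4: read old data and old parity (2 reads), compute the new
# parity, then write new data and new parity (2 writes).
c4_new = b"\xf0\x0f"
cp_new = xor_blocks(cp_old, c4_old, c4_new)

# The shortcut gives the same parity as recomputing over the whole stripe.
full_recompute = xor_blocks(c1, c2, c3, c4_new)
print(cp_new == full_recompute)
```

The shortcut touches only two disks for reads, regardless of how many data disks are in the group, which is where the write penalty of 4 comes from.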

In RAID 6, which maintains dual parity, a disk write requires three read operations: two parity and one data. After calculating both new parities, the controller performs three write operations: two parity and one data. Therefore, in a RAID 6 implementation, the controller performs six I/O operations for each write I/O, and the write penalty is 6.

In cases where the write fully occupies a stripe, the parity can be calculated without having to perform reads. This makes the process more efficient.

Consider an application that generates 1200 IOPS at peak workload, with a read/write ratio of 2:1. Calculate disk load at peak activity for RAID 1/0 and RAID 5 configurations.

The read/write ratio for the application is given as 2:1, and the total I/O generated by the application is 1200 IOPS. Therefore, the application issues 800 reads and 400 writes per second. For RAID 1/0, the write penalty is 2, so the disk load for writes is 800 IOPS and the total disk load is 1600 IOPS.

Similarly, the disk load for RAID 5 can be calculated. The write penalty for RAID 5 is 4, therefore the total IOPS will be 800 reads and 1600 writes, and the total disk load will be 2400 IOPS.
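
The same calculation can be written as a small Python function. The function and dictionary names are hypothetical; the write penalties are the ones given in this lesson, and the RAID 6 figure is included only to extend the comparison.

```python
# Back-end I/Os generated per host write, as described in this lesson.
WRITE_PENALTY = {"RAID 1/0": 2, "RAID 5": 4, "RAID 6": 6}


def disk_load(host_iops, read_parts, write_parts, raid_level):
    """Back-end disk IOPS for a host workload with a given read:write ratio."""
    total_parts = read_parts + write_parts
    reads = host_iops * read_parts / total_parts
    writes = host_iops * write_parts / total_parts
    return reads + writes * WRITE_PENALTY[raid_level]


# 1200 host IOPS at a 2:1 read/write ratio.
for level in WRITE_PENALTY:
    print(level, disk_load(1200, 2, 1, level))
```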

Compare the different RAID levels. Not all RAID levels are supported by all storage systems.

The following are common applications that benefit from different RAID levels:

• RAID 1+0 performs well for workloads that use small, random, write-intensive I/Os. Some applications that benefit from RAID 1+0 are high transaction rate online transaction processing (OLTP), relational database management system (RDBMS) temp space and so on.

• RAID 3 provides good performance for applications that involve large sequential data access, such as data backup or video streaming.

• RAID 5 is good for random, read intensive I/O applications and preferred for messaging, medium-performance media serving, and RDBMS implementations, in which database administrators (DBAs) optimize data access.

This lesson defined workload characterization. Single and multi-threaded access were defined. Little’s Law was introduced and RAID levels were reviewed. This lesson showed how these workload characteristics affect storage performance.

This lesson lists the components of both NAS and SAN environments that may influence storage system performance. The lesson shows the differences in each environment that must be considered when analyzing performance.

The NAS environment consists of the following components:

• Client and Applications (IP Host)

• IP Node Ports (NICs), Connectors and Cables

• Client IP network(s) (Ethernet Switches)

• File Servers (File systems) (NAS heads)

• Storage Controllers (Arrays) (Storage Processors)

• Storage back-end connectivity (SAS)

• Disk Drives (Flash Drives)

A complex system will run only as fast as its slowest component. Bottleneck analysis is the process of identifying the slowest device in a configuration. The difficulty lies in finding that slowest component. If you replace a component other than the bottleneck device, performance will remain the same.
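
The point about replacing the wrong component can be shown with a minimal sketch. The component throughput figures (in MB/s) are invented for illustration and do not describe any real configuration.

```python
def bottleneck(capacities):
    """Return the (name, throughput) of the slowest component in the path."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


# Hypothetical sustainable throughput of each component, in MB/s.
path = {
    "client NIC": 110,
    "Ethernet switch": 1000,
    "NAS head": 600,
    "storage processor": 800,
    "disk drives": 450,
}
print(bottleneck(path))

# Upgrading a component that is not the bottleneck changes nothing.
faster_switch = dict(path, **{"Ethernet switch": 4000})
print(bottleneck(faster_switch))

# Upgrading the bottleneck moves the limit to the next slowest component.
faster_nic = dict(path, **{"client NIC": 1100})
print(bottleneck(faster_nic))
```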

A balanced system is one in which the performance potential of all of its components is maximized. Every component in a balanced system is at least capable of handling the flow of work passed to it from the neighboring components, and the workload is evenly distributed across the processing resources. If there are delays, the work that is waiting to be processed is also evenly distributed in the system.

During this lesson, we look at both NAS and SAN environments and explore how their components influence how performance should be analyzed.

Here is an example of a SAN environment. Again, the entire system is only as fast as its slowest component.

The SAN environment consists of the following components:

• Application/File Servers (File systems) (SAN Host)

• Interconnecting Devices (FC Node Ports (HBAs), FC Switches or Hubs)

• Storage Controllers (Arrays) (Storage Processors)

• Storage back-end connectivity (SAS)

• Disk Drives (Flash Drives)

Be aware that SANs usually attach to LANs. If so, the following components will also influence overall performance:

• Client and Applications (IP Host)

• IP Node Ports (NICs), Connectors and Cables

• Client IP network (Ethernet Switches)

When analyzing NAS and SAN performance data, one must take into consideration the differences between the two environments.

In a SAN, storage arrays accept a range of I/O sizes from a host. I/O size depends on operating system parameters and application workload. The response time, throughput, and bandwidth achievable from the array depend on the I/O size, which is why SAN storage array performance is presented for varying I/O sizes. NAS also accepts all I/O sizes from its clients; however, the I/O size received by the NAS head will not necessarily be the I/O size transferred to the storage array. Most NAS head I/O is 8 KB by default. For example, on VNX OE for File, groups of these 8 KB blocks are coalesced at the NAS head by IOMerge. IOMerge is an operation on the File side that may break a large I/O into smaller I/Os to fit the 8 KB file system block size, or coalesce multiple smaller I/Os into larger I/Os before sending them to the array for writes.

Another difference is the shared nature of data in a NAS environment. In a NAS environment, at any given time, there will probably be more than one client accessing the same file system, and thus accessing the same LUNs. In a SAN, each host typically has its own set of LUNs.

In a NAS environment, data must go through the IP network, whose buffers and protocols add latency. In most SANs, the host has a direct path to the storage array via Fibre Channel. The exceptions are FCoE (Fibre Channel over Ethernet) and iSCSI, in which case an Ethernet network (and, for iSCSI, IP) is also used.

VNX Data Movers contain their own cache, used primarily as read cache. Operating Systems differ on how they cache data stored on SAN/DAS devices (Microsoft tends to cache much less than UNIX). Since network connections are inherently slower, most operating systems aggressively cache network requests. This complicates the translation of SAN performance data into NAS since you must now understand how much data is actually being moved to and from the storage device.

To demonstrate how different NAS and SAN environments are, a test was performed by having a client execute a “save as” operation on a Notepad file containing 70 lines of code. The client accessed this file via a SAN, and then via a NAS.

During the SAN experiment, 83 OS/file system events were observed with Procmon and 2 I/Os were measured at the disk level. Procmon (Process Monitor) is an advanced monitoring tool for Windows that shows real-time file system, registry and process/thread activity.

When the client performed a “save as” to a remote file system via CIFS, 112 OS/file system events involving the file share were observed with Procmon. Because the client was using CIFS, the SMB traffic was also captured: a total of 498 SMB requests were observed using Ethereal. When a VPN was used to access the file share, a total of 867 SMB requests were recorded.

The goal of this test was not to discourage the use of NAS, but just to show how different they are and how each environment has its pros and cons. If you’re looking at a solution to share data among many users and applications, then NAS is a great fit. If you’re looking for speed and high throughput for production servers, then SAN would be the way to go.

Some variables exist in both NAS and SAN environments and affect performance. These shared variables are: client host machines, the applications running on the client(s), the NICs or HBAs, connectors and cables, the client IP network including Ethernet switches and routers, IP parameters, and type of file system(s).

In some storage system installations, overall performance may be improved with the use of jumbo frames. Jumbo frames are a feature that, when teamed with Gigabit Ethernet (GigE) technology, allows more data to be encapsulated into each Ethernet frame. Standard Ethernet frames carry a maximum payload of 1500 bytes. Jumbo frames can have a Maximum Transmission Unit (MTU) of up to 9000 bytes, depending on the NIC vendor. Many GigE switches and NICs support jumbo frames.

Here’s how it works: Every Ethernet frame on a network has to be assembled by the sender, and its headers have to be read by the network components (NICs, switches and routers) between the sender and the receiver. The receiver then reads the frame and TCP/IP headers before processing the data. This activity, plus the headers added to frames and packets to get them from sender to receiver, consumes CPU cycles and bandwidth. Sending data in jumbo frames means fewer frames are sent across the network. This reduces CPU overhead and leaves more of the link bandwidth for data.
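
A rough calculation shows the effect. The sketch below assumes TCP over IPv4 with no header options (40 bytes per packet) and counts 38 bytes of per-frame Ethernet overhead on the wire (preamble, header, frame check sequence, and inter-frame gap); VLAN tags and other encapsulations are ignored, and the function names are hypothetical.

```python
import math

ETHERNET_OVERHEAD = 38  # bytes per frame: preamble 8 + header 14 + FCS 4 + gap 12 (assumed, untagged)
TCP_IP_HEADERS = 40     # bytes per packet: IPv4 20 + TCP 20, no options (assumed)


def frames_needed(payload_bytes, mtu):
    """Frames required to carry a payload when each frame holds mtu - headers."""
    return math.ceil(payload_bytes / (mtu - TCP_IP_HEADERS))


def wire_efficiency(mtu):
    """Fraction of on-the-wire bytes that are application data."""
    return (mtu - TCP_IP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


one_mib = 1024 * 1024
for mtu in (1500, 9000):
    print(mtu, frames_needed(one_mib, mtu), round(wire_efficiency(mtu), 3))
```

Under these assumptions, moving 1 MiB takes roughly six times fewer frames with a 9000-byte MTU, so far fewer headers must be built and inspected, while the gain in usable bandwidth is a few percent.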

In a NAS environment, implementing jumbo frames between a NAS head and file servers is a good fit. However, implementing jumbo frames at the clients (in a NAS) or at the hosts (in a SAN) is risky. The administrator needs full control of all clients or hosts in order to ensure that there is end-to-end jumbo frame support.

Also, larger frames are not always better. Jumbo frame sizes must be matched to the computing power of the devices involved. Large frames can occupy a slow link for some time, causing greater delays to following frames and increasing lag and minimum latency. Therefore, using jumbo frames for low-latency applications (gaming, VoIP) can be counterproductive.

A buffer overflow occurs when a receiving device cannot accommodate the speed with which data is being transmitted to its memory buffer. Buffer overflows are more likely to occur at the Ethernet switch transmit/receive buffer for the client port or the Ethernet switch transmit/receive buffer for the front-end ports of the storage system.

The purpose of flow control is to match the sending and receiving device throughput. If the receiving device becomes congested, it can send a frame called a “pause” frame to the source at the opposite end of the connection, instructing that sender to stop sending packets for a specific period of time. The sender waits the requested time before sending more data. The receiving device can also send a frame back to the sender with a time-to-wait of zero, instructing the sender to begin sending data again. Therefore, if a client is unable to accept packets at the rate of the sender, it can send out a pause frame and request that the server delay transmission for a certain period of time.

This slide shows a typical cause of buffer overflows. In this case, a storage system is connected to the switch via a Gigabit (1000 Mb/s) Ethernet link. The client is connected to the same switch via a 100 Mb/s link. As the storage system transmits data to the client, the switch output buffer (for the client) can be overrun, causing packet loss and retransmissions. To prevent this, flow control must be enabled on the storage system.
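
How quickly such a buffer fills can be estimated from the difference between the two link speeds. The 512 KB buffer size below is a hypothetical figure, not a property of any particular switch, and the sketch assumes the sender transmits continuously at line rate.

```python
def seconds_until_overflow(buffer_bytes, in_mbps, out_mbps):
    """Time for a buffer to fill when data arrives faster than it drains.

    Returns None when the outgoing link keeps up with the incoming link.
    """
    if in_mbps <= out_mbps:
        return None
    fill_rate_bits = (in_mbps - out_mbps) * 1_000_000
    return buffer_bytes * 8 / fill_rate_bits


# 1000 Mb/s from the storage system, 100 Mb/s to the client,
# and an assumed 512 KB output buffer on the client's switch port.
fill_time = seconds_until_overflow(512 * 1024, 1000, 100)
print(f"{fill_time * 1000:.2f} ms")
```

With these assumed numbers the buffer is overrun in a few milliseconds, which is why a sustained speed mismatch leads to dropped packets unless the sender is paused.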

Duplex is the ability of a communications system to communicate in both directions. With half-duplex, only one party can communicate at a time. With full-duplex, both parties can communicate with each other simultaneously.

Duplex and speed mismatches are the most common performance issues we encounter, and yet they are resolved easily. Unlike devices with a speed mismatch, two devices with a duplex mismatch will still communicate. However, devices with a duplex mismatch will suffer from extremely poor performance. This can be confusing to many network administrators who see the two devices communicating. They would assume that because the devices can ping each other, there must not be an Ethernet issue.

Duplex mismatch is a condition where two connected devices operate in different duplex modes, that is, one operates at half duplex while the other operates at full duplex. The effect of a duplex mismatch is a network that works but is often much slower than its nominal speed. Duplex mismatch can also develop from connecting a device that performs auto-negotiation to one that is manually set to full duplex. This happens because the auto-negotiating device receives no negotiation information from the manually configured device and falls back to half duplex. The end result of a duplex mismatch is dropped packets, which in turn leads to a high retransmission rate.

Speed and duplex settings can either be hard set on both devices, or auto-negotiation can be configured on both devices. To avoid duplex mismatches, auto-negotiation is now recommended for GigE connections. 10GigE connections only support full duplex.

A retransmission is the resending of packets which have been either damaged or dropped. Whenever one party sends something to the other party, it retains a copy of the data that was sent until the recipient has acknowledged that it received it. Below you will find a list of factors that cause retransmissions:

• An acknowledgment from the receiver has not reached the sender within a reasonable time

• The sender discovers that the transmission was unsuccessful

• The receiver does not obtain the expected data in time and notifies the sender

• The receiver knows that the data has arrived, but in a damaged condition and indicates that to the sender

If errors are present on the storage system, check for a duplex mismatch or buffer overflows.

• Check In Errors and Out Errors

If errors are present on the client, check for a duplex mismatch or a hardware problem (cable, client’s switch port, client’s interface).

• On Windows clients, run netstat -e and check for Discards and Errors

• On UNIX clients, run netstat -i and check for Ierrors and Oerrors

Here is a list of the parameters that can be used with the netstat command on Windows:

netstat [-a] [-b] [-e] [-f] [-n] [-o] [-p protocol] [-r] [-s] [-t] [-x] [-y] [time_interval] [/?]

Performance Monitor can also detect retransmissions for Windows clients. Use the TCP object with the Segments Retransmitted/sec counter.

This lesson covered how NAS and SAN environments are different and how this difference will influence system performance. The lesson also described potential network issues that can affect overall performance such as Ethernet collisions, duplex mismatches, retransmissions, link reliability and bandwidth, utilization of switches and routers, buffer overflows on Ethernet switches and end devices, network saturation, and end-to-end support for jumbo frames.

This module defined workloads and how different workload attributes affect storage system performance. The module also showed that the different environments of NAS and SAN will dictate how performance is measured.

This module expands on the definition of benchmark covered in module 1. This module explains the benefits of benchmarking and what tools can be used to conduct a benchmark.

This lesson explains what is involved in executing a performance benchmark. The lesson will show how benchmark results can be used.

Benchmarking is the evaluation of something by comparison with a standard, or with something that serves as a standard by which others may be measured or judged. Once a standard or baseline has been established, ways to improve the system(s) can be identified.

Another goal of benchmarking is to determine the suitability of a storage system solution to solve a need. Sales teams and system architects must correctly design a storage solution to meet a customer’s performance and capacity requirements.

Storage system benchmarking consists of the following steps:

• Select the storage system to benchmark

• Identify the type of workload(s)

• Choose individual components and tests to run

• Collect data on performance

• Analyze the data and identify opportunities for improvement

• Locate, adapt and implement best practices

As mentioned earlier, workload characterization has a significant influence on storage system performance; therefore, selecting the correct workload parameters is very important when benchmarking a system. The workload attributes used in a benchmark must truly represent the I/O transactions presented to the storage by the host and real applications. Again, the workload characteristics are: I/O Size, Read/Write Ratio, Access Patterns, Single vs. Multi-Threaded, Working Set Size, and I/O Demand Rate. Incorrect selection of any of these could introduce errors into the benchmark.
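
One way to keep these six characteristics together when planning a benchmark is a small record type that rejects obviously invalid values. The class and field names below are hypothetical and do not correspond to any benchmarking tool. The course does not say which unit applies at exactly 64 KB, so treating that size as MB/s is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    io_size_kb: int
    read_percent: int       # 0-100; writes are the remainder
    access_pattern: str     # "random" or "sequential"
    threads: int
    working_set_gb: float
    demand_rate: float      # IOPS for small I/O, MB/s for large I/O

    def __post_init__(self):
        if not 0 <= self.read_percent <= 100:
            raise ValueError("read_percent must be between 0 and 100")
        if self.access_pattern not in ("random", "sequential"):
            raise ValueError("access_pattern must be 'random' or 'sequential'")
        if self.io_size_kb <= 0 or self.threads <= 0:
            raise ValueError("io_size_kb and threads must be positive")

    @property
    def demand_unit(self):
        # Below 64 KB the demand is expressed in IOPS, otherwise in MB/s.
        return "IOPS" if self.io_size_kb < 64 else "MB/s"


# Hypothetical profiles for an OLTP-style and a backup-style workload.
oltp = WorkloadProfile(8, 67, "random", 16, 200.0, 1200)
backup = WorkloadProfile(256, 0, "sequential", 2, 4000.0, 300)
print(oltp.demand_unit, backup.demand_unit)
```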

To successfully complete a benchmark, access to all storage system components must be available. Workload characteristics must be obtained from the customer. Next, a workload that truly represents the actual, real-life workload must be available to apply to the test storage system. The correct benchmarking tools must be located and obtained. Then either one or more individual tests or a test harness must be executed. Once the tests are complete, a benchmark can be developed and validated.


Since benchmarking provides a method by which system performance can be measured, the collected data can be used to locate components of a system that require improvement. The benchmark results serve as an objective measure of how well one or more systems are performing. Armed with this information, administrators can make informed decisions on how to improve those systems. Benchmarking can help identify gaps or bottlenecks that require attention.

Finally, benchmarking provides a quantitative baseline analysis of existing systems. This data provides a reference point for implementing and managing improvements.
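One way to use a baseline as a reference point is to compare each new set of results against it and flag the metrics that have fallen behind. The sketch below assumes every metric is "higher is better" (for example IOPS or MB/s); the function name, the 10% tolerance, and the sample numbers are made up for illustration.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return {metric: fractional change} for metrics that dropped by more
    than `tolerance` relative to the baseline.

    Assumes higher values are better for every metric.
    """
    regressions = {}
    for name, base_value in baseline.items():
        # Skip metrics that were not re-measured or have no usable baseline.
        if name not in current or base_value <= 0:
            continue
        change = (current[name] - base_value) / base_value
        if change < -tolerance:
            regressions[name] = change
    return regressions


# Illustrative numbers only, not measurements from a real system.
baseline = {"read_iops": 10000.0, "write_iops": 4000.0, "read_mb_s": 400.0}
current = {"read_iops": 9800.0, "write_iops": 3000.0, "read_mb_s": 410.0}

regressions = find_regressions(baseline, current)
print(regressions)
```

With these sample values only the write IOPS metric is flagged, since it dropped by 25% while the others stayed within the tolerance.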


This lesson defined performance benchmarking and what is involved in executing a performance benchmark of a storage system. The lesson also showed how benchmark results can be used.


This lesson introduces some software applications that can be used in benchmarking a storage system.


Iometer is both a workload generator and a measurement tool. It creates a file and performs I/O operations on that file in order to stress the file system. It then evaluates and records the performance of its I/O operations. Iometer can be configured to emulate the disk or network I/O load of any program or benchmark, or can be used to generate entirely synthetic I/O loads. It can generate and measure loads on single or multiple systems.

Iometer can be run on most Windows, Linux and Unix platforms.


Displayed here is the Iometer graphical user interface. This is what you will see every time you access the tool. In the top left-hand corner you will find the open and save icons. Next to these you will find more icons to add managers and workers for multi-threaded workloads. Here is also where you’ll find the icon to remove any workers or managers, and the green flag to start Dynamo and generate the workload.

In the middle pane, you find the disk targets. When you map a drive to a NAS CIFS server, for example, it will show up here.

Drives are accessed by writing to a file called \iobw.tst. If this file does not exist, the drive’s icon will have a red diagonal line through it. At the start of the test the file will be created and grown until the file system or disk is full.

This initialization can take several minutes to hours, depending on the size of the file system/disk. The iobw.tst file will fill the entire disk target, unless a maximum disk size is used.
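A rough back-of-the-envelope estimate shows why the initialization time varies so widely. The sketch below assumes the test file is filled at a steady sequential write rate, and that a size limit would be entered as a count of 512-byte sectors; both are assumptions to verify against the Iometer documentation and the actual system. The function names and sample values are hypothetical.

```python
def estimate_fill_seconds(target_gib, write_mib_per_s):
    """Estimate how long it takes to fill a target of `target_gib` GiB
    at a steady write rate of `write_mib_per_s` MiB/s."""
    if write_mib_per_s <= 0:
        raise ValueError("write rate must be positive")
    return target_gib * 1024 / write_mib_per_s


def gib_to_sectors(size_gib, sector_bytes=512):
    """Convert a size in GiB to a sector count (512-byte sectors assumed)."""
    return size_gib * 1024 ** 3 // sector_bytes


# Illustrative values: a 2 TiB target versus a test file limited to 10 GiB,
# both written at an assumed 200 MiB/s.
full_disk_seconds = estimate_fill_seconds(2048, 200)
limited_seconds = estimate_fill_seconds(10, 200)
print(round(full_disk_seconds / 3600, 1), "hours")
print(round(limited_seconds, 1), "seconds")
print(gib_to_sectors(10), "sectors")
```

Under these assumptions the unrestricted target takes hours to prepare, while the size-limited file is ready in under a minute. A limit smaller than the array cache, however, may no longer represent the real working set.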


IOzone is a filesystem benchmark tool. The benchmark generates and measures a variety of file operations, such as reading from and writing to a file. IOzone can perform both sequential and random operations on the file. IOzone is useful for performing a broad filesystem analysis of a vendor’s computer platform. The benchmark tests file I/O performance for the following operations:

• Read, write, re-read, re-write, read backwards, read strided, fread, fwrite, random read, pread, mmap, aio_read, and aio_write

IOzone is available for most Windows, Linux, FreeBSD and Unix platforms. It can be downloaded from http://www.iozone.org.
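To make the difference between these access patterns concrete, the following sketch generates the byte offsets at which records would be read for sequential, backwards, strided, and random access. It is not IOzone code and does not reproduce IOzone's exact behavior; the function name, parameters, and sizes are illustrative assumptions.

```python
import random


def record_offsets(pattern, file_size, record_size, stride=4, seed=0):
    """Return the byte offsets visited for a given access pattern.

    `stride` is counted in records and only applies to the strided pattern;
    `seed` makes the random pattern repeatable.
    """
    count = file_size // record_size
    sequential = [i * record_size for i in range(count)]
    if pattern == "sequential":
        return sequential
    if pattern == "backwards":
        return sequential[::-1]
    if pattern == "strided":
        return sequential[::stride]
    if pattern == "random":
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError("unknown pattern: " + pattern)


# A tiny 4 KiB file read in 1 KiB records, for illustration.
print(record_offsets("sequential", 4096, 1024))
print(record_offsets("backwards", 4096, 1024))
print(record_offsets("strided", 4096, 1024, stride=2))
print(record_offsets("random", 4096, 1024))
```

Sequential and backwards access touch the same records in opposite order, strided access skips records at a fixed interval, and random access visits every record in a shuffled order. Storage systems often respond very differently to each pattern, which is why a filesystem benchmark measures them separately.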


DBENCH is a tool that generates I/O workloads against either a filesystem or a networked CIFS or NFS server. It can also communicate with an iSCSI target. DBENCH can be used to stress a filesystem or a server to see at which workload it becomes saturated. It can also be used for prediction analysis, to determine how many concurrent clients and/or applications performing a workload a server can handle before response starts to lag.

DBENCH is free software. It is now considered a de-facto standard for generating load on the Linux VFS.
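The prediction analysis described above can be sketched as a simple post-processing step: given the average latency measured at several client counts, find the largest count that stays within an acceptable response time. This is not DBENCH code; the function name, the 5 ms limit, and the sample measurements are made up for illustration.

```python
def max_clients_within_limit(samples, latency_limit_ms):
    """Return the largest client count whose latency stays within the limit.

    `samples` is a list of (clients, average_latency_ms) pairs. Scanning
    stops at the first client count that exceeds the limit. Returns None
    if even the smallest client count is over the limit.
    """
    best = None
    for clients, latency_ms in sorted(samples):
        if latency_ms > latency_limit_ms:
            break
        best = clients
    return best


# Illustrative measurements only, one run per client count.
samples = [(1, 2.0), (8, 2.4), (16, 3.1), (32, 6.5), (64, 19.0)]
supported = max_clients_within_limit(samples, latency_limit_ms=5.0)
print(supported)
```

With these sample values the server handles 16 clients within the limit and starts to lag somewhere between 16 and 32, which suggests running additional tests in that range to narrow down the saturation point.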


VMware View Planner is the first comprehensive standard methodology for comparing virtual desktop deployment platforms. View Planner generates a realistic measure of client-side desktop performance for all desktops being measured on the virtual desktop platform. View Planner uses a rich set of commonly used applications as the desktop workload.

It can be downloaded from http://www.vmware.com/products/view-planner.html.


This lesson introduced popular applications that can be used to benchmark storage systems.


This module illustrated the benefits of performance benchmarking. It provided an overview of how to accurately perform a benchmark on a storage system. Some benchmarking tools were also introduced.


The documentation listed here was used in the development of this course.


This course introduced the basics of performance analysis on a storage system. It covered which workload parameters must be considered to capture realistic data while benchmarking a storage system.
