HUAWEI OceanStor 18000 Enterprise Storage
RAID2.0+ Technical White Paper
Issue 01
Date 2013-09-06
HUAWEI TECHNOLOGIES CO., LTD.
Issue 01 (2013-08-12) Huawei Proprietary and Confidential
Copyright © Huawei Technologies Co., Ltd.
Copyright © Huawei Technologies Co., Ltd. 2013. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China
Website: http://www.huawei.com
Email: [email protected]
Tel: 4008302118
Change History

Date        Issue  Description  Prepared By
2013-08-02  V1.0                Qin Xuan/00204091
2013-09-06  V1.1                Qin Xuan/00204091
2013-10-14  V1.2   Rename       Qin Xuan/00204091
Contents
Change History
1 Overview of RAID2.0+
1.1 RAID Technology Evolution
1.2 Introduction to Huawei RAID2.0+
2 Working Principles of RAID2.0+
2.1 Basic Principle of RAID2.0+
2.2 RAID2.0+ Implementation Framework
2.3 Logical Objects Involved in RAID2.0+
3 Technical Highlights of RAID2.0+
3.1 Secure and Trusted
3.1.1 Automatic Load Balancing to Decrease the Overall Failure Rate
3.1.2 Fast Thin Reconstruction to Reduce Dual-Disk Failure Probability
3.1.3 Failure Self-Check and Self-Healing to Ensure System Reliability
3.2 Flexible and Efficient
3.2.1 Pool Virtualization Design to Simplify Storage Planning and Management
3.2.2 One LUN Across More Disks to Improve Performance of a Single LUN
3.2.3 Dynamic Space Distribution to Flexibly Adapt to Service Changes
4 Appendix A: FAQs
5 Appendix B: Peripheral Resources About RAID2.0+
6 Acronyms and Abbreviations
1 Overview of RAID2.0+
1.1 RAID Technology Evolution
The term Redundant Array of Independent Disks (RAID) was first defined by the University
of California, Berkeley in 1987. The basic idea of RAID is to combine multiple independent
physical disks based on a certain algorithm to form a virtual logical disk that provides a larger
capacity, higher performance, or better data error tolerance.
As a mature and reliable data protection standard for storage systems, RAID has always been
used as a basic technology by storage systems since its existence. However, with rapid growth
of data storage needs and emergence of high-performance applications in recent years,
traditional RAID gradually exposes its defects.
IDC predicts that over the next five years, the storage market will maintain an average annual growth rate of at least 10% and that global storage capacity may reach 16,840 PB. To meet data growth needs, disk manufacturers keep adopting more advanced technologies to increase per-disk storage density. Nowadays, 4 TB large-capacity disks and 900 GB high-performance SAS disks are common in enterprise and consumer markets.
However, data reconstruction implemented upon the failure of a large-capacity disk reveals
the disadvantages of traditional RAID.
For example, traditional RAID 5 (8D+1P) needs about 40 hours to reconstruct the data on a 7,200 rpm 4 TB disk. The reconstruction process consumes system resources, decreasing the overall performance of the application system. If a user lowers the reconstruction priority to preserve timely application response, the reconstruction takes even longer. In addition, during the time-consuming reconstruction, heavy access load may cause other disks in the RAID group to fail, greatly increasing the probability of a second disk failure and the risk of data loss.
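The 40-hour figure can be sanity-checked with simple arithmetic. The short sketch below (plain Python, with the disk size and duration taken from the text) derives the implied effective rebuild rate; the assumption that "4 TB" means 4 * 10**12 bytes follows the usual vendor convention.

```python
# Back-of-envelope check of the reconstruction figure quoted above.
# Assumption: "4 TB" uses the vendor convention of 4 * 10**12 bytes.
disk_bytes = 4 * 10**12
rebuild_hours = 40

rate_mb_s = disk_bytes / (rebuild_hours * 3600) / 10**6
print(f"Implied effective rebuild rate: {rate_mb_s:.1f} MB/s")  # about 27.8 MB/s
```

At roughly 28 MB/s, the rebuild proceeds far below the sequential bandwidth of a single disk, because the single hot spare disk must absorb every reconstructed write while the group continues to serve host I/O.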
Traditional RAID is also constrained by the number of disks in a RAID group. In an era of soaring data growth, it fails to meet enterprises' needs for unified and flexible resource scheduling. Moreover, as disk capacity increases, managing data at whole-disk granularity becomes increasingly inefficient.
To resolve the preceding issues of traditional RAID and follow the virtualization trend, many
storage vendors adopt substitutes for traditional RAID as follows:
LUN virtualization
Based on traditional RAID, some storage vendors such as EMC and HDS divide RAID
groups into more fine-grained units and combine these units to form storage space accessible to hosts.
Block virtualization
Some storage vendors such as Huawei and HP 3PAR divide the space of disks that
belong to a storage pool into small-granularity data blocks and create RAID groups
based on these data blocks. This approach allows data to be evenly distributed onto all disks in the storage pool and enables resources to be managed in the form of data blocks.
Figure 1-1 RAID technology evolution
1.2 Introduction to Huawei RAID2.0+
OceanStor Enterprise Storage is a family of brand-new enterprise storage systems designed around the current state of storage products and storage technology trends. OceanStor Enterprise Storage is intended for medium- and large-sized data centers and focuses on the core services of medium- and large-sized enterprises. It features virtualization, hybrid cloud, thin IT, and a low-carbon footprint.
The OceanStor Enterprise Storage employs an innovative Smart Matrix all-switching
hardware architecture. With the use of the dedicated storage operating system called eXtreme
Virtual Engine (XVE), the OceanStor Enterprise Storage meets various storage needs of
large-sized data centers.
RAID2.0+ is a brand-new RAID technology developed by Huawei to overcome the
disadvantages of traditional RAID and keep in line with the storage architecture virtualization
trend. RAID2.0+ implements two-layer virtualized management instead of the traditional
fixed management. Based on the underlying disk management that employs block
virtualization (Virtual for Disk), RAID2.0+ uses Smart-series efficiency improvement
software to implement efficient resource management that features upper-layer virtualization
(Virtual for Pool).
2 Working Principles of RAID2.0+
2.1 Basic Principle of RAID2.0+
Huawei RAID2.0+ employs two-layer virtualized management, namely, underlying disk
management plus upper-layer resource management. In an OceanStor Enterprise Storage
system, the space of each disk is divided into data blocks with a small granularity and RAID
groups are created based on data blocks so that data is evenly distributed onto all disks in a
storage pool. Besides, using data block as the smallest unit greatly improves the efficiency of
resource management.
The OceanStor Enterprise Storage supports SSDs, SAS disks, and NL-SAS disks, which compose disk domains. In a disk domain, disks of the same type are allocated to disk groups (DGs) based on certain rules.

The space of each disk in a DG is divided into chunks (CKs) of a fixed size. The OceanStor Enterprise Storage selects chunks from different disks at random to compose chunk groups (CKGs) based on a certain RAID algorithm.

Each CKG is divided into logical storage spaces of a fixed size called extents. Extents are the basic units that compose thick LUNs (also called fat LUNs). For thin LUNs, extents are further divided into smaller-granularity grains, and grains are mapped to thin LUNs.
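As an illustration of the CK-to-CKG relationship described above, the following sketch models a toy allocator that builds a CKG from chunks on distinct disks. The disk count, selection policy, and data structures are invented for the example and are not Huawei's actual algorithm; only the constraint shown (one CK per disk within a CKG) comes from the text.

```python
import random

CK_MB = 64  # chunk size stated in the text (64 MB)

def build_ckg(disks, raid_width, free_cks):
    """Pick one free CK from each of `raid_width` distinct disks.

    `free_cks` maps disk id -> list of free chunk indices. This is a
    toy allocator for illustration, not the real selection algorithm.
    """
    eligible = [d for d in disks if free_cks[d]]
    chosen_disks = random.sample(eligible, raid_width)
    return [(d, free_cks[d].pop()) for d in chosen_disks]

# Hypothetical pool: 12 disks of 900 GB each
disks = list(range(12))
cks_per_disk = (900 * 1000) // CK_MB          # ~14062 chunks per 900 GB disk
free = {d: list(range(cks_per_disk)) for d in disks}

ckg = build_ckg(disks, raid_width=5, free_cks=free)   # RAID 5 (4D+1P)
assert len({disk for disk, _ in ckg}) == 5            # all CKs on distinct disks
```

Because every CKG draws its CKs from a random subset of disks, data written to many CKGs ends up spread across the whole pool, which is the basis of the load balancing and fast reconstruction discussed later.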
2.2 RAID2.0+ Implementation Framework

The following figure shows the implementation framework of RAID2.0+ employed by the
OceanStor Enterprise Storage.
A disk domain in the OceanStor Enterprise Storage consists of disks from one or
multiple storage tiers. Each storage tier supports disks of a specific type: SSDs compose
the high-performance tier, SAS disks compose the performance tier, and NL-SAS disks compose the capacity tier.
The disks in each storage tier are divided into CKs with a fixed size of 64 MB.
CKs in each storage tier compose CKGs based on a user-defined RAID policy. Users are
allowed to define a specific RAID policy for each storage tier of a storage pool.
The OceanStor Enterprise Storage divides CKGs into small-sized extents. Extent is the
smallest granularity for data migration and is the basic unit of a thick LUN. When
creating a storage pool, users can set the extent size on the Advanced page. The default
extent size is 4 MB.
Multiple extents compose a volume that is externally presented as a LUN (which is a
thick LUN) accessible to hosts. A LUN implements space application, space release, and data migration based on extents. For example, when creating a LUN, a user can specify a
storage tier from which the capacity of the LUN comes. In this case, the LUN consists of
the extents in the specified storage tier. After services start running, the OceanStor
Enterprise Storage migrates data among the storage tiers based on data activity levels and
data migration polices. (This function requires a SmartTier license.) In this scenario, data
on the LUN is distributed to the storage tiers of the storage pool based on extents.
When a user creates a thin LUN, the OceanStor Enterprise Storage divides extents into
grains and maps grains to the thin LUN. In this way, fine-grained management of storage capacity is implemented.
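The space hierarchy just described (64 MB CKs, 4 MB default extents, 64 KB grains) implies fixed fan-out ratios. The arithmetic below works them out for a hypothetical RAID 5 (8D+1P) tier; the RAID width is an assumption chosen for the example.

```python
# Illustrative space-hierarchy arithmetic using the sizes stated in the
# text: 64 MB CKs, default 4 MB extents, 64 KB grains.
CK_MB, EXTENT_MB, GRAIN_KB = 64, 4, 64

data_cks_per_ckg = 8                 # assumed RAID 5 (8D+1P): 8 data CKs per CKG
ckg_data_mb = data_cks_per_ckg * CK_MB
extents_per_ckg = ckg_data_mb // EXTENT_MB
grains_per_extent = EXTENT_MB * 1024 // GRAIN_KB

print(extents_per_ckg)    # usable extents per CKG under this policy
print(grains_per_extent)  # grains per extent for a thin LUN
```

So each CKG of this shape yields 128 default-size extents, and a thin LUN subdivides each extent into 64 grains, which is why thin LUNs can account for space at a much finer granularity than thick LUNs.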
2.3 Logical Objects Involved in RAID2.0+

This section describes the major logical objects and key concepts related to RAID2.0+.
Disk Domain
A disk domain is a combination of multiple disks (or all the disks of a storage system). After
disks are consolidated and a certain amount of hot spare space is reserved, a disk domain
provides storage resources for storage pools in a unified manner.
One or more disk domains can be created on an OceanStor Enterprise Storage system.
Multiple storage pools can be created in a disk domain.
A disk domain can consist of SSDs, SAS disks, and/or NL-SAS disks.
Disk domains are isolated from each other, in terms of performance, storage resources, and faults.
Storage Pool and Storage Tier
A storage pool is a storage resource container. The storage resources used by application
servers are all from storage pools. Created in a specific disk domain, a storage pool
dynamically allocates CKs of the disk domain and enables these CKs to compose CKGs
based on the RAID policy defined for each storage tier. Then, the CKGs provide applications
with storage resources that have RAID protection.
A storage tier is a collection of storage media providing the same performance level in a
storage pool. Different storage tiers manage storage media at different performance levels and
provide storage space for applications that have different performance requirements. Based on
disk types, a storage pool can be divided into multiple tiers. The following table describes the
storage tiers and disk types supported by the OceanStor Enterprise Storage.
Storage Tier  Tier Name              Supported Disk Type  Application Scenario
Tier 0        High-performance tier  SSD                  SSDs provide high performance but are expensive. Therefore, SSDs are suitable for storing the most frequently accessed data.
Tier 1        Performance tier       SAS                  SAS disks provide moderate performance and are not expensive. Therefore, SAS disks are suitable for storing less frequently accessed data.
Tier 2        Capacity tier          NL-SAS               NL-SAS disks provide low performance but a large capacity per disk at a low cost. Therefore, NL-SAS disks are suitable for storing large amounts of seldom-accessed data.
When creating a storage pool, a user is allowed to specify storage tiers allocated from the
corresponding disk domain and define a RAID policy and a capacity for each tier.
The OceanStor Enterprise Storage supports RAID 5, RAID 6, and RAID 10. The
following table lists the supported RAID policies.
RAID Level  RAID Policy
RAID 5      4D+1P or 8D+1P
RAID 6      4D+2P or 8D+2P
RAID 10     2D+2D or 4D+4D (automatically selected by the storage system)
The capacity tier consists of large-capacity NL-SAS disks. It is recommended that RAID
6 (a double-parity RAID level) be used as the RAID policy for this tier.
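The RAID policies in the table trade usable capacity against protection. The quick calculation below shows the usable fraction of each policy; note that it counts parity overhead only, ignoring hot spare space and metadata.

```python
# Usable-capacity fraction implied by each RAID policy listed above.
# Parity overhead only; hot spare space and metadata are not counted.
policies = {
    "RAID 5 (4D+1P)": (4, 1),
    "RAID 5 (8D+1P)": (8, 1),
    "RAID 6 (4D+2P)": (4, 2),
    "RAID 6 (8D+2P)": (8, 2),
    "RAID 10 (2D+2D)": (2, 2),
}
for name, (data, parity) in policies.items():
    print(f"{name}: {data / (data + parity):.0%} usable")
```

This is why RAID 6 (8D+2P) is an attractive default for the capacity tier: it tolerates two concurrent failures while keeping the same 80% usable fraction as RAID 5 (4D+1P).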
Disk Group (DG)
A Disk group (DG) is a collection of disks of the same type and from the same disk domain.
The OceanStor Enterprise Storage supports three disk types: SSD, SAS, and NL-SAS. Based
on the number of disks of each type in each disk domain, the OceanStor Enterprise Storage
automatically allocates disks to one or more DGs.
One DG consists of only one type of disks.
CKs in a CKG come from different disks that belong to the same DG.
DGs are internal objects automatically configured by the OceanStor Enterprise Storage
and typically used for fault isolation. DGs are not presented externally.
Logical Drive (LD)
Logical drives (LDs) are logical disks managed by the OceanStor Enterprise Storage. Each
LD corresponds to a physical disk.
Chunk (CK)
Each CK is a 64 MB physical space divided from disk space in a storage pool. CK is the basic
unit of a RAID group.
Chunk Group (CKG)
A CKG is a logical storage unit that consists of CKs from different disks in the same DG
based on a RAID algorithm. A storage pool allocates resources from a disk domain by taking
CKG as the smallest unit.
All CKs in a CKG come from the disks in the same DG.
A CKG has RAID attributes, which are actually configured for the corresponding storage tier.
CKs and CKGs are internal objects automatically configured by the OceanStor Enterprise Storage. They are not presented externally.
Extent
An extent is a logical storage space with a fixed size divided from a CKG. The size ranges
from 512 KB to 64 MB. The default size is 4 MB. Extent is the smallest unit (granularity) for
data migration and hotspot data statistics collection. It is also the smallest unit for space
application and release in a storage pool.
One extent belongs to one volume or LUN.
A user can set the extent size while creating a storage pool. After that, the extent size cannot be changed.
The extents in one storage pool may have a different size from those in another storage pool. However, the extents in the same storage pool have the same size.
Grain
In thin LUN mode, extents are divided into 64 KB grains. A thin LUN allocates storage space
in units of grains. Logical block addresses in a grain are consecutive.
Grains are mapped to thin LUNs. A thick LUN does not involve grains.
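A thin LUN's grain-level behavior can be pictured with a minimal allocate-on-write model. The class below is purely illustrative: the `ThinLUN` name, the mapping structure, and the grain IDs are invented for the sketch, not the real OceanStor mapper.

```python
class ThinLUN:
    """Toy allocate-on-write model: a thin LUN maps a 64 KB grain only
    when that grain is first written (illustration only)."""
    GRAIN_KB = 64

    def __init__(self):
        self.mapping = {}     # logical grain index -> allocated grain id
        self.next_grain = 0

    def write(self, offset_kb):
        g = offset_kb // self.GRAIN_KB
        if g not in self.mapping:          # allocate on first write only
            self.mapping[g] = self.next_grain
            self.next_grain += 1
        return self.mapping[g]

lun = ThinLUN()
lun.write(0); lun.write(65); lun.write(10)   # third write hits grain 0 again
assert len(lun.mapping) == 2                 # only two grains allocated
```

The point of the model: physical space is consumed in proportion to grains actually written, not to the LUN's advertised size, which is what makes thin provisioning and thin reconstruction possible.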
Volume and LUN
A volume is an internal management object in the OceanStor Enterprise Storage. A volume
organizes all extents and grains of a LUN and applies for and releases extents to increase and
decrease the actual space occupied by the volume.
A LUN is a storage unit that can be directly mapped to a host for data reads and writes. A LUN is the external presentation of a volume.
3 Technical Highlights of RAID2.0+
Based on two-layer virtualized management, RAID2.0+ overcomes the inherent defects of
traditional RAID and greatly improves storage system reliability and resource management
efficiency. By leveraging RAID2.0+, the OceanStor Enterprise Storage provides truly secure,
trusted, flexible, and efficient enterprise storage. This chapter describes the technical
highlights of RAID2.0+ from the preceding four perspectives.
3.1 Secure and Trusted
3.1.1 Automatic Load Balancing to Decrease the Overall Failure Rate
A traditional RAID-based storage system typically contains multiple RAID groups, each consisting of up to a dozen or so disks. Different RAID groups work under different loads. As a result, stress across disks is unbalanced and hotspot disks emerge. According to statistics collected by SNIA, hotspot disks are more vulnerable to failure. In the following figure, Duty Cycle indicates the percentage of total disk power-on time spent working, and AFR indicates the annualized failure rate. At a high duty cycle, the AFR is almost 1.5 to 2 times higher than at a low duty cycle.
RAID2.0+ implements block virtualization to enable data to be automatically and evenly
distributed onto all disks in a storage pool, preventing unbalanced load distribution. This
approach decreases the overall failure rate of a storage system.
3.1.2 Fast Thin Reconstruction to Reduce Dual-Disk Failure Probability
Over the past 10 years of disk development, capacity growth has outpaced performance improvement. Nowadays, 4 TB disks are commonly seen in enterprise and consumer markets, and 5 TB disks are expected in the second quarter of 2014. Even high-performance SAS disks dedicated to the enterprise market now provide up to 1.2 TB per disk.
Rapid capacity growth confronts traditional RAID with a serious issue: reconstructing a single disk, which required only dozens of minutes 10 years ago, now takes more than 10 hours or even dozens of hours. This ever-longer reconstruction time means that a storage system that encounters a disk failure must stay in a degraded state, without error tolerance, for a long time, exposed to a serious data loss risk. It is not uncommon for data loss to occur in a storage system under the dual stress of service load and data reconstruction.
Based on underlying block virtualization, RAID2.0+ overcomes the performance bottleneck
seen in target disks (hot spare disks) that are used by traditional RAID for data reconstruction.
As a result, the write bandwidth provided for reconstructed data flows is no longer a
reconstruction speed bottleneck, greatly accelerating reconstruction, decreasing dual-disk
failure probability, and improving storage system reliability.
The following figure compares the reconstruction principle of traditional RAID with that of
RAID2.0+.
In the schematic diagram of traditional RAID, HDDs 0 to 4 compose a RAID 5 group,
and HDD 5 serves as a hot spare disk. If HDD 1 fails, an XOR algorithm is used to
reconstruct data based on HDDs 0, 2, 3, and 4, and the reconstructed data is written onto HDD 5.
In the schematic diagram of RAID2.0+, if HDD 1 fails, its data is reconstructed at CK granularity, and only the allocated CKs (highlighted in the figure) are reconstructed. All disks in the storage pool participate in the reconstruction, and the reconstructed data is distributed across multiple disks (HDDs 4 and 9 in the figure).
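The XOR relation used in the RAID 5 example above can be shown in a few lines. This toy stripe uses 4-byte blocks for readability; real CKs are 64 MB, but the parity math is identical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings (the RAID 5 parity relation)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Toy RAID 5 stripe: 4 data blocks plus one parity block
data = [b"\x11" * 4, b"\x22" * 4, b"\x33" * 4, b"\x44" * 4]
parity = xor_blocks(data)

# "HDD 1 fails": rebuild data[1] from the surviving members
survivors = [data[0], data[2], data[3], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Under traditional RAID, every such rebuilt block is written to the one hot spare disk; under RAID2.0+, the rebuilt CKs are written to free hot spare space on many disks in parallel, which is where the speedup comes from.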
RAID2.0+'s fine-grained and efficient fault handling also accelerates reconstruction. In addition to conventional bad sector repair and disk failure reconstruction, RAID2.0+ provides bad block repair that reconstructs only used space at CK granularity. By efficiently identifying used space, RAID2.0+ implements thin reconstruction upon a disk failure, further shortening the reconstruction time and mitigating data loss risks.
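Thin reconstruction's saving is easy to model: only the CKs that metadata marks as allocated need rebuilding, and free CKs are skipped. The sparse allocation pattern below is hypothetical, chosen only to make the saving visible.

```python
def thin_rebuild_work(cks_on_failed_disk, allocated):
    """Thin reconstruction sketch: rebuild only CKs recorded as
    allocated in metadata; unallocated CKs are skipped entirely."""
    return [ck for ck in cks_on_failed_disk if ck in allocated]

# Treat a 4 TB disk as 4 * 1024**2 MB for round numbers -> 65536 CKs of 64 MB.
total_cks = 4 * 1024 * 1024 // 64
allocated = set(range(0, total_cks, 10))   # hypothetical: ~10% of CKs in use
work = thin_rebuild_work(range(total_cks), allocated)
print(len(work), "of", total_cks, "CKs need rebuilding")
```

With only a tenth of the chunks allocated, nine tenths of the rebuild work disappears, which is why thin reconstruction shortens the window of degraded operation so dramatically on lightly used pools.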
With great advantages in reconstruction, RAID2.0+ enables the OceanStor Enterprise Storage
to outperform traditional storage systems in terms of reconstruction. The following figure
compares the time that a traditional storage system and the OceanStor Enterprise Storage
spend in reconstructing 1 TB data in an NL-SAS large-capacity disk environment.
3.1.3 Failure Self-Check and Self-Healing to Ensure System Reliability
The OceanStor Enterprise Storage employs a multi-level error tolerance design for disks and
provides various measures to ensure reliability, including online disk diagnosis, Disk Health
Analyzer (DHA), bad sector background scanning, and bad sector repair. Based on a hot spare
policy, RAID2.0+ automatically reserves a certain amount of hot spare space in a disk domain.
If the OceanStor Enterprise Storage detects an uncorrectable media error in an area of a disk
or finds that an entire disk fails, the OceanStor Enterprise Storage automatically reconstructs
the affected data blocks and writes the reconstructed data to the hot spare space of other disks,
implementing quick self-healing.
Traditional RAID:
Independent global or local hot spare disks must be manually configured.
Many-to-one reconstruction is implemented. Reconstructed data flows are written serially to a single hot spare disk.
Hotspot disks exist. The reconstruction time is long.

RAID2.0+:
Distributed hot spare space is provided automatically.
Many-to-many reconstruction is implemented. Reconstructed data flows are written to multiple disks in parallel.
Load balancing is implemented. The reconstruction time is short.
3.2 Flexible and Efficient
3.2.1 Pool Virtualization Design to Simplify Storage Planning and Management
Nowadays, mainstream high-end storage systems typically contain hundreds of or even
thousands of disks of different types. If such a storage system employs traditional RAID,
administrators need to manage a lot of RAID groups and must carefully plan performance and
capacity for each application and RAID group. In the era of constant changes, it is almost
impossible to accurately predict the service development trends in an IT system lifecycle and
the corresponding data growth amount. As a result, administrators often face management
issues such as uneven allocation of storage resources. These issues greatly increase
management complexity.
The OceanStor Enterprise Storage employs advanced virtualization technologies to manage
storage resources in the form of storage pools. Administrators only need to maintain a few storage pools. All RAID configurations are automatically completed during the creation of
storage pools. In addition, the OceanStor Enterprise Storage automatically manages and
schedules system resources in a smart way based on user-defined policies, significantly
simplifying storage planning and management.
3.2.2 One LUN Across More Disks to Improve Performance of a Single LUN
Since the beginning of the 21st century, server computing capabilities have improved greatly and the number of host applications (such as databases and virtual machines) has increased sharply, driving the need for higher storage performance, capacity, and flexibility. Restricted by the number of its disks, a traditional RAID group provides only a small capacity, moderate performance, and poor scalability, and so cannot meet service requirements. When a host accesses a LUN intensively, only a limited number of disks are actually accessed, easily causing disk access bottlenecks and turning those disks into hotspots.
RAID2.0+ supports storage pools consisting of dozens or even hundreds of disks. LUNs are created from a storage pool and are therefore no longer subject to the limited number of disks supported by a RAID group. Wide striping distributes the data of a single LUN across many disks, preventing disks from becoming hotspots and significantly improving the performance and capacity of a single LUN. If the capacity of an existing storage system becomes insufficient, a user can dynamically expand the capacity of a storage pool, and of a LUN, simply by adding disks to the disk domain. This approach also improves disk capacity utilization.
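Wide striping's effect on per-disk load can be sketched with round-robin placement. This is a simplification: the real allocator also balances free space and keeps the CKs of one CKG on distinct disks, but the per-disk arithmetic is the same.

```python
from collections import Counter

def place_extents(num_extents, num_disks):
    """Round-robin placement of a LUN's extents across a pool's disks
    (a simplification of wide striping, for illustration only)."""
    return [e % num_disks for e in range(num_extents)]

# A 100 GB LUN with 4 MB extents, on a 5-disk group vs. a 50-disk pool
extents = (100 * 1024) // 4
narrow = Counter(place_extents(extents, 5))
wide = Counter(place_extents(extents, 50))
print(max(narrow.values()))  # extents per disk in the 5-disk group
print(max(wide.values()))    # extents per disk in the 50-disk pool
```

The same LUN that concentrates 5,120 extents on each of 5 disks spreads only 512 extents onto each of 50 disks, so a burst of I/O to that LUN is served by ten times as many spindles.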
3.2.3 Dynamic Space Distribution to Flexibly Adapt to Service Changes
RAID2.0+ is implemented based on industry-leading block virtualization. Data and service
load in a volume are automatically and evenly distributed onto all physical disks in a storage
pool. By leveraging the Smart series efficiency improvement software, the OceanStor
Enterprise Storage automatically schedules resources in a smart way based on factors such as
the amount of hot and cold data and the performance and capacity required by a service. In
this way, the OceanStor Enterprise Storage adapts to rapid changes in enterprise services.
4 Appendix A: FAQs
Q1. Do all disks of an OceanStor Enterprise Storage system reside in the same storage pool?
Answer: Not always. An OceanStor Enterprise Storage system manages disks based on disk
domains and storage pools. A user can create one or more disk domains that are isolated from
each other in terms of resources, performance, and faults. One or more storage pools can be
created in a disk domain. Each storage pool uses various types of disks to provide storage
space.
Q2. How does an OceanStor Enterprise Storage system treat data in the event of capacity
expansion and disk failure?
Answer: After a disk is added to a storage pool, the OceanStor Enterprise Storage system
automatically moves a certain amount of data to the newly added disk space based on the disk
usage for capacity balancing, ensuring that all disks in the same storage pool have similar
space utilization.
If a disk fails, CKGs related to the failed disk automatically perform data reconstruction, and
the reconstructed data is evenly written to the hot spare space of other functional disks. Users
do not need to specify hot spare space. The OceanStor Enterprise Storage system
automatically selects hot spare space based on the disk usage. For details about the
reconstruction process, see section 3.1.2 "Fast Thin Reconstruction to Reduce Dual-Disk
Failure Probability."
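The capacity-balancing behavior in the answer above can be modeled as moving data from the fullest disks to a newly added disk until usage converges on the pool average. The function below is a toy model with invented disk names; it is not the real migration policy.

```python
def rebalance(used_mb, new_disk):
    """Even out per-disk usage after adding an empty disk.

    Toy model of the capacity balancing described above: migrate data
    from the fullest disks to the new one until every disk sits near
    the pool average. Illustration only, not the real policy.
    """
    usage = dict(used_mb)
    usage[new_disk] = 0
    target = sum(usage.values()) / len(usage)
    for disk in sorted(usage, key=usage.get, reverse=True):
        if disk == new_disk:
            continue
        move = min(usage[disk] - target, target - usage[new_disk])
        if move > 0:
            usage[disk] -= move
            usage[new_disk] += move
    return usage

after = rebalance({"d0": 900, "d1": 900, "d2": 900}, "d3")
# each disk ends near the 675 MB pool average
```

The same mechanism runs in reverse on failure: instead of one disk receiving migrated data, the surviving disks collectively absorb the reconstructed data into their hot spare space.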
Q3. Compared with traditional RAID, in what aspects does RAID2.0+ demonstrate its high
reliability?
Answer: RAID2.0+ demonstrates its high reliability in the following aspects:
- Load balancing: RAID2.0+ enables disks to work in a balanced manner, preventing the overload of individual disks that may occur in traditional RAID. For details, see section 3.1.1 "Automatic Load Balancing to Decrease the Overall Failure Rate."
- Robust reconstruction: RAID2.0+ spreads the reconstruction load across more disks, reducing the stress on each disk and minimizing the risk of a further disk failure during reconstruction.
- Fast reconstruction: RAID2.0+ significantly shortens the reconstruction window, helping an OceanStor Enterprise Storage system regain its error tolerance as soon as possible and thereby improving system reliability. For details, see section 3.1.2 "Fast Thin Reconstruction to Reduce Dual-Disk Failure Probability."
- Thin reconstruction: Based on metadata, RAID2.0+ knows which allocated space is in use. During reconstruction, RAID2.0+ rebuilds only the used space, reducing the workload and shortening the reconstruction time, which lowers the risk of a reconstruction failure.
- Self-healing: RAID2.0+ uses distributed hot spare space. If an OceanStor Enterprise Storage system detects a fault, reconstruction starts automatically as long as free CKs are available on the remaining disks, improving reliability and cutting management costs. For details, see section 3.1.1 "Automatic Load Balancing to Decrease the Overall Failure Rate."
- Minimized data loss: If a traditional RAID group fails, all data in the group is affected. With RAID2.0+, if multiple disks fail, only the data related to those failed disks is affected; all other data remains accessible, so the amount of lost data is much smaller.
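The thin-reconstruction idea, rebuilding only the space that metadata marks as in use, can be sketched as follows (the metadata shape and the names are assumptions for illustration):

```python
# Sketch of thin reconstruction: only CKs that the allocation
# metadata marks as in use are rebuilt; free CKs are skipped.
def cks_to_rebuild(failed_disk_cks, allocation_bitmap):
    return [ck for ck in failed_disk_cks if allocation_bitmap.get(ck, False)]

bitmap = {"ck71": True, "ck72": True, "ck77": False, "ck78": False}
print(cks_to_rebuild(["ck71", "ck72", "ck77", "ck78"], bitmap))
# -> ['ck71', 'ck72']
```

On a lightly used pool this skips most of the failed disk's capacity, which is why the reconstruction window shrinks with the amount of data actually stored.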
Based on the Markov model, the following table lists the data loss risks of traditional
RAID and RAID2.0+, with data loss probability and data loss amount taken into consideration.
System Configuration | RAID2.0+ Configuration | Traditional RAID Configuration | Data Loss Risk (Traditional RAID/RAID2.0+)
40 x 600 GB SAS disks (disk failure rate of 1%) | RAID 5 (4+1 disks), 40 disks per DG | Eight RAID 5 (4+1 disks) groups | 16.09
40 x 2 TB SATA disks (disk failure rate of 2%) | RAID 6 (8+2 disks), 40 disks per DG | Four RAID 5 (4+1 disks) groups | 69.29
40 x 600 GB SAS disks (disk failure rate of 1%) | RAID 10 (10 disks), 40 disks per DG | Four RAID 10 (10 disks) groups | 39.15
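The table's figures come from a full Markov model that also factors in rebuild speed, so they cannot be reproduced from first principles here. A back-of-the-envelope combinatorial check, however, shows why concentrating stripes in fixed groups is risky: with two simultaneous failures among 40 disks split into eight RAID 5 (4+1) groups, the chance that both failures hit the same group, destroying that entire group, is roughly 10%.

```python
from math import comb

# Probability that two simultaneous failures among 40 disks land in
# the same traditional RAID 5 (4+1) group (8 groups of 5 disks):
# favourable pairs per group = C(5, 2); total pairs = C(40, 2).
same_group = 8 * comb(5, 2) / comb(40, 2)
print(round(same_group, 4))  # -> 0.1026
```

Under RAID2.0+ the comparable event is two failed CKs falling into the same CKG, which affects only that CKG's small slice of data rather than a whole group; the Markov model additionally credits RAID2.0+ for its much shorter rebuild window.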
Q4. Will data be lost if there is a concurrent failure of two disks in an OceanStor Enterprise
Storage system?
Answer: This question is essentially about the error tolerance of RAID, the basis of data
storage protection. RAID 5 tolerates the concurrent failure of only one disk (in traditional
RAID) or one CK (in RAID2.0+); RAID 6 tolerates the concurrent failure of two disks or
CKs. Therefore, if you employ a double-parity RAID level such as RAID 6, whether with
traditional RAID or RAID2.0+, a concurrent failure of two disks will not cause data loss.
If RAID 5 is employed, a concurrent failure of two disks will cause data loss in traditional
RAID scenarios. For an OceanStor Enterprise Storage system that employs RAID2.0+,
however, as long as each CKG does not contain two failed CKs, data will not be lost.
The following figure shows a storage pool that consists of 20 disks, where storage space is
provided for hosts in the form of LUNs and the RAID policy is RAID 5 (4D+1P).
RAID2.0+: Concurrent Failure of Two Disks
If there is a concurrent failure of HDDs 7 and 9, only CKs related to the two disks are affected.
In the preceding figure, CKs 71 to 76 and CKs 101 to 106 are affected, whereas CKs 77 to 79
and CKs 107 to 109 are not affected because no data is stored in these free CKs. Accordingly,
CKGs 0, 1, 2, 4, 8, 11, 12, 13, 17, 19, 21, and 23 (the red CKGs with an underscore) are
affected. The CKGs adopt the RAID 5 (4D+1P) policy, and each affected CKG contains only
one failed CK, so the data in each of these CKGs remains available. From the perspective of
hosts, the corresponding LUNs are still accessible and services are not interrupted.
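The per-CKG availability check walked through above can be sketched as follows. The CKG-to-disk layout here is a made-up example, not the figure's actual mapping:

```python
# Check which CKGs remain readable after a set of disk failures.
# Under RAID 5 (4D+1P) a CKG survives the loss of at most one of
# its five CKs; RAID 6 would tolerate two (parity_tolerance=2).
def surviving_ckgs(ckg_disks, failed_disks, parity_tolerance=1):
    ok = []
    for ckg_id, disks in ckg_disks.items():
        lost = len(set(disks) & failed_disks)
        if lost <= parity_tolerance:
            ok.append(ckg_id)
    return ok

# Hypothetical CKG-to-disk mapping; disks 7 and 9 fail together.
layout = {0: [1, 5, 7, 12, 18], 1: [2, 7, 9, 11, 15], 2: [0, 3, 9, 14, 19]}
print(surviving_ckgs(layout, {7, 9}))  # -> [0, 2]: CKG 1 lost two CKs
```

Only CKGs unlucky enough to hold CKs on both failed disks lose data; the wider and more random the CK distribution, the smaller each such CKG's share of the total capacity.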
5 Appendix B: Peripheral Resources About RAID2.0+
RAID2.0+ Technology Video, helping you easily learn about the advantages and principles
of RAID2.0+:
http://3ms.huawei.com/mm/docMaintain/mmMaintain.do?method=showMMDetail&f_id=STR13083022230014
http://3ms.huawei.com/mm/video/videoMaintain.do?method=showVideoDetail&f_id=1421965
RAID2.0+ Panorama:
http://3ms.huawei.com/mm/docMaintain/mmMaintain.do?method=showMMDetail&f_id=STR13072905120079
6 Acronyms and Abbreviations
Table 6-1 Acronyms and Abbreviations
Acronyms and Abbreviations Full Spelling
AFR annual failure rate
CK chunk
CKG chunk group
DG disk group
DHA Disk Health Analyzer
LD logical drive
LUN logical unit number
RAID Redundant Array of Independent Disks
RPM revolutions per minute
SNIA Storage Networking Industry Association
XVE eXtreme Virtual Engine