
HUAWEI OceanStor 18000 Enterprise Storage

RAID2.0+ Technical White Paper

Issue 01

Date 2013-09-06

HUAWEI TECHNOLOGIES CO., LTD.


Issue 01 (2013-08-12) Huawei Proprietary and Confidential


Copyright © Huawei Technologies Co., Ltd. 2013. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.

All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and

the customer. All or part of the products, services and features described in this document may not be

within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,

information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the

preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.

Address: Huawei Industrial Base

Bantian, Longgang

Shenzhen 518129

People's Republic of China

Website: http://www.huawei.com

Email: [email protected]

Tel: 4008302118


Change History

Date | Issue | Description | Prepared By
20130802 | V1.0 | | Qin Xuan/00204091
20130906 | V1.1 | | Qin Xuan/00204091
20131014 | V1.2 | rename | Qin Xuan/00204091


Contents

Change History
1 Overview of RAID2.0+
1.1 RAID Technology Evolution
1.2 Introduction to Huawei RAID2.0+
2 Working Principles of RAID2.0+
2.1 Basic Principle of RAID2.0+
2.2 RAID2.0+ Implementation Framework
2.3 Logical Objects Involved in RAID2.0+
3 Technical Highlights of RAID2.0+
3.1 Secure and Trusted
3.1.1 Automatic Load Balancing to Decrease the Overall Failure Rate
3.1.2 Fast Thin Reconstruction to Reduce Dual-Disk Failure Probability
3.1.3 Failure Self-Check and Self-Healing to Ensure System Reliability
3.2 Flexible and Efficient
3.2.1 Pool Virtualization Design to Simplify Storage Planning and Management
3.2.2 One LUN Across More Disks to Improve Performance of a Single LUN
3.2.3 Dynamic Space Distribution to Flexibly Adapt to Service Changes
4 Appendix A: FAQs
5 Appendix B: Peripheral Resources About RAID2.0+
6 Acronyms and Abbreviations


1 Overview of RAID2.0+

1.1 RAID Technology Evolution

The term Redundant Array of Independent Disks (RAID) was first defined by the University

of California, Berkeley in 1987. The basic idea of RAID is to combine multiple independent

physical disks based on a certain algorithm to form a virtual logical disk that provides a larger

capacity, higher performance, or better data error tolerance.

As a mature and reliable data protection standard for storage systems, RAID has been a basic technology in storage systems since its inception. However, with the rapid growth of data storage needs and the emergence of high-performance applications in recent years, traditional RAID has gradually exposed its limitations.

IDC predicts that over the next five years, the storage market will maintain an average annual growth rate of at least 10% and the global storage capacity may reach 16,840 PB. To keep up with data growth, disk manufacturers keep adopting more advanced technologies to increase the storage density of disks. Today, 4 TB large-capacity disks and 900 GB high-performance SAS disks are common in enterprise and consumer markets. However, the data reconstruction that follows the failure of a large-capacity disk reveals the disadvantages of traditional RAID.

For example, traditional RAID 5 (8D+1P) needs 40 hours to reconstruct the data on a 7.2k rpm 4 TB disk. The reconstruction process consumes system resources and decreases the overall performance of the application system. If a user lowers the reconstruction priority to preserve application response times, the reconstruction takes even longer. In addition, the heavy access load during a long reconstruction may cause other disks in the RAID group to fail, greatly increasing the disk failure probability and the risk of data loss.
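To put the 40-hour figure in perspective, the short calculation below derives the average rate at which the replacement disk would have to be written to finish in that time. It is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 10^12 bytes) and that the whole disk is rewritten, and the function name is hypothetical.

```python
# Average rebuild rate implied by "4 TB in 40 hours".
# Assumptions (not from the white paper): decimal units, full-disk rewrite.
DISK_CAPACITY_BYTES = 4 * 10**12
REBUILD_HOURS = 40


def implied_rebuild_rate_mb_s(capacity_bytes: int, hours: float) -> float:
    """Return the average write rate in MB/s needed to finish in `hours`."""
    seconds = hours * 3600
    return capacity_bytes / seconds / 10**6


rate = implied_rebuild_rate_mb_s(DISK_CAPACITY_BYTES, REBUILD_HOURS)
print(f"Implied average rebuild rate: {rate:.1f} MB/s")
```

Under these assumptions the implied rate is well below the sequential bandwidth a 7.2k rpm disk can normally sustain, which is consistent with the text's point that reconstruction competes with application I/O for system resources.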

On the other hand, traditional RAID is constrained by the number of disks in a RAID group. In an era of soaring data growth, traditional RAID cannot meet enterprises' needs for unified and flexible resource scheduling. Moreover, as disk capacity increases, disk-based data management becomes increasingly inefficient.

To resolve the preceding issues of traditional RAID and follow the virtualization trend, many storage vendors have adopted alternatives to traditional RAID, as follows:

LUN virtualization

Based on traditional RAID, some storage vendors such as EMC and HDS divide RAID

groups into more fine-grained units and combine these units to form storage space accessible to hosts.


Block virtualization

Some storage vendors such as Huawei and HP 3PAR divide the space of disks that

belong to a storage pool into small-granularity data blocks and create RAID groups

based on these data blocks. This approach allows data to be evenly distributed onto all disks in the storage pool and enables resources to be managed in the form of data blocks.

Figure 1-1 RAID technology evolution

1.2 Introduction to Huawei RAID2.0+

The OceanStor Enterprise Storage is a family of brand-new enterprise storage systems designed based on the current application status of storage products and storage technology trends. It is intended for medium- and large-sized data centers and focuses on the core services of medium- and large-sized enterprises. The OceanStor Enterprise Storage features virtualization, hybrid cloud, thin IT, and a low carbon footprint.

The OceanStor Enterprise Storage employs an innovative Smart Matrix all-switching

hardware architecture. With the use of the dedicated storage operating system called eXtreme

Virtual Engine (XVE), the OceanStor Enterprise Storage meets various storage needs of

large-sized data centers.

RAID2.0+ is a brand-new RAID technology developed by Huawei to overcome the

disadvantages of traditional RAID and keep in line with the storage architecture virtualization

trend. RAID2.0+ implements two-layer virtualized management instead of the traditional

fixed management. Based on the underlying disk management that employs block

virtualization (Virtual for Disk), RAID2.0+ uses Smart-series efficiency improvement

software to implement efficient resource management that features upper-layer virtualization

(Virtual for Pool).


2 Working Principles of RAID2.0+

2.1 Basic Principle of RAID2.0+

Huawei RAID2.0+ employs two-layer virtualized management, namely, underlying disk

management plus upper-layer resource management. In an OceanStor Enterprise Storage

system, the space of each disk is divided into data blocks with a small granularity and RAID

groups are created based on data blocks so that data is evenly distributed onto all disks in a

storage pool. In addition, using data blocks as the smallest unit greatly improves the efficiency of

resource management.

The OceanStor Enterprise Storage supports SSDs, SAS disks, and NL-SAS disks, which

compose disk domains. In a disk domain, disks of the same type are allocated to disk groups (DGs) based on certain rules.

The space of each disk in a DG is divided into chunks (CKs) of a fixed size. The

OceanStor Enterprise Storage selects chunks from different disks at random to compose chunk groups (CKGs) based on a certain RAID algorithm.


Each CKG is divided into logical storage spaces of a fixed size called extents. Extents

are the basic units that compose thick LUNs (also called fat LUNs). For thin LUNs,

extents are further divided into grains that have a smaller granularity, and grains are mapped to thin LUNs.
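The chunk and chunk group steps described above can be sketched as follows. This is a minimal illustration, not Huawei's implementation: the names `Chunk`, `split_disk`, and `build_ckg`, the eight 1 GiB disks, and the selection strategy (pick random distinct disks, then take one free chunk from each) are all assumptions made for the example. The 64 MB chunk size is the value the document gives in section 2.2.

```python
import random
from dataclasses import dataclass

CK_SIZE_MB = 64  # chunk size stated in section 2.2


@dataclass(frozen=True)
class Chunk:
    disk_id: int
    index: int  # position of the chunk on its disk


def split_disk(disk_id: int, capacity_mb: int) -> list:
    """Divide one disk into fixed-size chunks (any remainder is ignored)."""
    return [Chunk(disk_id, i) for i in range(capacity_mb // CK_SIZE_MB)]


def build_ckg(free_chunks: dict, width: int, rng: random.Random) -> list:
    """Form one chunk group from `width` chunks on `width` different disks."""
    candidates = [disk for disk, chunks in free_chunks.items() if chunks]
    if len(candidates) < width:
        raise ValueError("not enough disks with free chunks")
    chosen_disks = rng.sample(candidates, width)  # distinct disks, at random
    return [free_chunks[disk].pop() for disk in chosen_disks]


rng = random.Random(0)
# Hypothetical disk group: 8 disks of 1024 MB each, i.e. 16 chunks per disk.
free = {disk: split_disk(disk, 1024) for disk in range(8)}
ckg = build_ckg(free, width=5, rng=rng)  # 5 chunks, e.g. for RAID 5 4D+1P
print(sorted(chunk.disk_id for chunk in ckg))
```

Because every chunk in a group sits on a different disk, the loss of one disk removes at most one chunk from any group, which is the property a RAID algorithm needs in order to rebuild the missing chunk.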


2.2 RAID2.0+ Implementation Framework

The following figure shows the implementation framework of RAID2.0+ employed by the OceanStor Enterprise Storage.

A disk domain in the OceanStor Enterprise Storage consists of disks from one or

multiple storage tiers. Each storage tier supports disks of a specific type: SSDs compose

the high-performance tier, SAS disks compose the performance tier, and NL-SAS disks compose the capacity tier.

The disks in each storage tier are divided into CKs with a fixed size of 64 MB.

CKs in each storage tier compose CKGs based on a user-defined RAID policy. Users are

allowed to define a specific RAID policy for each storage tier of a storage pool.

The OceanStor Enterprise Storage divides CKGs into small-sized extents. An extent is the smallest unit for data migration and the basic unit of a thick LUN. When creating a storage pool, users can set the extent size on the Advanced page. The default extent size is 4 MB.

Multiple extents compose a volume that is externally presented as a LUN (a thick LUN) accessible to hosts. A LUN allocates space, releases space, and migrates data in units of extents. For example, when creating a LUN, a user can specify the storage tier from which the capacity of the LUN comes. In this case, the LUN consists of extents in the specified storage tier. After services start running, the OceanStor Enterprise Storage migrates data among the storage tiers based on data activity levels and data migration policies. (This function requires a SmartTier license.) In this scenario, data on the LUN is distributed across the storage tiers of the storage pool in units of extents.

When a user creates a thin LUN, the OceanStor Enterprise Storage divides extents into

grains and maps grains to the thin LUN. In this way, fine-grained management of storage capacity is implemented.
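The size relationships in this framework can be checked with a few lines of arithmetic. The sketch below uses the 64 MB CK size and the default 4 MB extent size given above, plus the 64 KB grain size stated in the Grain definition in section 2.3. It assumes binary units and that only the data CKs of a CKG contribute usable extent space; the white paper does not spell out either point, and the function name is hypothetical.

```python
KB = 1024
MB = 1024 * KB

CK_SIZE = 64 * MB      # fixed chunk size
EXTENT_SIZE = 4 * MB   # default extent size
GRAIN_SIZE = 64 * KB   # grain size used by thin LUNs


def extents_per_ckg(data_chunks: int, extent_size: int = EXTENT_SIZE) -> int:
    """Extents that fit in the data capacity of one CKG.

    Assumption: parity chunks hold no extent data, so only `data_chunks`
    (the D in xD+yP) count toward usable space.
    """
    return data_chunks * CK_SIZE // extent_size


grains_per_extent = EXTENT_SIZE // GRAIN_SIZE

print("extents per 4D+1P CKG:", extents_per_ckg(4))
print("extents per 8D+2P CKG:", extents_per_ckg(8))
print("grains per default extent:", grains_per_extent)
```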

2.3 Logical Objects Involved in RAID2.0+

This section describes the major logical objects and key concepts related to RAID2.0+.

Disk Domain

A disk domain is a combination of multiple disks (or all the disks of a storage system). After

disks are consolidated and a certain amount of hot spare space is reserved, a disk domain

provides storage resources for storage pools in a unified manner.

One or more disk domains can be created on an OceanStor Enterprise Storage system.

Multiple storage pools can be created in a disk domain.

A disk domain can consist of SSDs, SAS disks, and/or NL-SAS disks.

Disk domains are isolated from each other in terms of performance, storage resources, and faults.

Storage Pool and Storage Tier

A storage pool is a storage resource container. The storage resources used by application

servers are all from storage pools. Created in a specific disk domain, a storage pool

dynamically allocates CKs of the disk domain and enables these CKs to compose CKGs

based on the RAID policy defined for each storage tier. Then, the CKGs provide applications

with storage resources that have RAID protection.

A storage tier is a collection of storage media providing the same performance level in a

storage pool. Different storage tiers manage storage media at different performance levels and

provide storage space for applications that have different performance requirements. Based on

disk types, a storage pool can be divided into multiple tiers. The following table describes the

storage tiers and disk types supported by the OceanStor Enterprise Storage.

Storage Tier | Tier Name | Supported Disk Type | Application Scenario
Tier 0 | High-performance tier | SSD | SSDs provide high performance but are expensive. Therefore, SSDs are suitable for storing most-frequently accessed data.
Tier 1 | Performance tier | SAS | SAS disks provide moderate performance and are not expensive. Therefore, SAS disks are suitable for storing data less frequently accessed.
Tier 2 | Capacity tier | NL-SAS | NL-SAS disks provide low performance but a large capacity per disk at a low cost. Therefore, NL-SAS disks are suitable for storing a large amount of data and the data that is seldom accessed.

When creating a storage pool, a user is allowed to specify storage tiers allocated from the

corresponding disk domain and define a RAID policy and a capacity for each tier.

The OceanStor Enterprise Storage supports RAID 5, RAID 6, and RAID 10. The

following table lists the supported RAID policies.

RAID Level | RAID Policy
RAID 5 | 4D+1P or 8D+1P
RAID 6 | 4D+2P or 8D+2P
RAID 10 | 2D+2D or 4D+4D, which is automatically selected by the storage system

The capacity tier consists of large-capacity NL-SAS disks. It is recommended that RAID

6 (a double-parity RAID level) be used as the RAID policy for this tier.
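The trade-off between these policies is easiest to see as the fraction of raw capacity left for user data. The sketch below computes that fraction for each policy in the table using the standard definition, data chunks divided by total chunks. The dictionary and function names are hypothetical, and hot spare space and metadata overhead are deliberately ignored.

```python
from fractions import Fraction

# (data chunks, redundancy chunks) for each policy listed in the table.
POLICIES = {
    "RAID 5 4D+1P": (4, 1),
    "RAID 5 8D+1P": (8, 1),
    "RAID 6 4D+2P": (4, 2),
    "RAID 6 8D+2P": (8, 2),
    "RAID 10 2D+2D": (2, 2),
    "RAID 10 4D+4D": (4, 4),
}


def usable_fraction(data: int, redundancy: int) -> Fraction:
    """Share of a chunk group's raw capacity that stores user data."""
    return Fraction(data, data + redundancy)


for name, (data, redundancy) in POLICIES.items():
    share = usable_fraction(data, redundancy)
    print(f"{name}: {float(share):.1%} usable")
```

RAID 6 8D+2P yields the same usable share as RAID 5 4D+1P while tolerating two simultaneous chunk losses per group, which fits the recommendation to use RAID 6 on the capacity tier.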

Disk Group (DG)

A disk group (DG) is a collection of disks of the same type and from the same disk domain.

The OceanStor Enterprise Storage supports three disk types: SSD, SAS, and NL-SAS. Based

on the number of disks of each type in each disk domain, the OceanStor Enterprise Storage

automatically allocates disks to one or more DGs.

One DG consists of only one type of disks.

CKs in a CKG come from different disks that belong to the same DG.

DGs are internal objects automatically configured by the OceanStor Enterprise Storage

and typically used for fault isolation. DGs are not presented externally.

Logical Drive (LD)

Logical drives (LDs) are logical disks managed by the OceanStor Enterprise Storage. Each LD corresponds to a physical disk.

Chunk (CK)

Each CK is a 64 MB physical space divided from disk space in a storage pool. A CK is the basic unit of a RAID group.


Chunk Group (CKG)

A CKG is a logical storage unit that consists of CKs from different disks in the same DG, combined based on a RAID algorithm. A storage pool allocates resources from a disk domain with the CKG as the smallest unit.

All CKs in a CKG come from the disks in the same DG.

A CKG has RAID attributes, which are actually configured for the corresponding storage tier.

CKs and CKGs are internal objects automatically configured by the OceanStor Enterprise Storage. They are not presented externally.

Extent

An extent is a fixed-size logical storage space divided from a CKG. The size ranges from 512 KB to 64 MB, and the default size is 4 MB. An extent is the smallest unit (granularity) for data migration and for hotspot data statistics collection. It is also the smallest unit for allocating and releasing space in a storage pool.

One extent belongs to one volume or LUN.

A user can set the extent size while creating a storage pool. After that, the extent size cannot be changed.

The extents in one storage pool may have a different size from those in another storage pool. However, the extents in the same storage pool have the same size.

Grain

In thin LUN mode, extents are divided into 64 KB grains. A thin LUN allocates storage space

in units of grains. Logical block addresses in a grain are consecutive.


Grains are mapped to thin LUNs. A thick LUN does not involve grains.
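Grain-based allocation for a thin LUN can be illustrated with a small allocate-on-first-write model. This is a sketch under stated assumptions, not the product's data structure: the `ThinLun` class, its dictionary-based grain map, and the 512-byte sector size are hypothetical. Only the 64 KB grain size comes from the text above.

```python
GRAIN_SIZE = 64 * 1024  # grain size stated above
SECTOR_SIZE = 512       # assumed logical block size


class ThinLun:
    """Toy thin LUN that allocates a grain the first time it is written."""

    def __init__(self) -> None:
        self.grain_map = {}    # logical grain index -> physical grain id
        self._next_grain = 0   # next unused physical grain id

    def write(self, lba: int, sectors: int) -> None:
        """Record a write of `sectors` blocks starting at logical block `lba`."""
        first = lba * SECTOR_SIZE // GRAIN_SIZE
        last = ((lba + sectors) * SECTOR_SIZE - 1) // GRAIN_SIZE
        for grain in range(first, last + 1):
            if grain not in self.grain_map:
                self.grain_map[grain] = self._next_grain
                self._next_grain += 1

    @property
    def allocated_bytes(self) -> int:
        return len(self.grain_map) * GRAIN_SIZE


lun = ThinLun()
lun.write(lba=0, sectors=1)    # touches grain 0 only
lun.write(lba=127, sectors=2)  # straddles the boundary of grains 0 and 1
lun.write(lba=0, sectors=8)    # grain 0 again, so nothing new is allocated
print(lun.allocated_bytes // 1024, "KB allocated")
```

The LUN consumes space only for grains that have actually been written, which is the fine-grained capacity management that distinguishes a thin LUN from a thick LUN built from whole extents.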

Volume and LUN

A volume is an internal management object in the OceanStor Enterprise Storage. A volume organizes all extents and grains of a LUN, and it requests and releases extents to increase or decrease the actual space that it occupies.

A LUN is a storage unit that can be directly mapped to a host for data reads and writes. A LUN is the external presentation of a volume.


3 Technical Highlights of RAID2.0+

Based on two-layer virtualized management, RAID2.0+ overcomes the inherent defects of

traditional RAID and greatly improves storage system reliability and resource management

efficiency. By leveraging RAID2.0+, the OceanStor Enterprise Storage provides truly secure,

trusted, flexible, and efficient enterprise storage. This chapter describes the technical

highlights of RAID2.0+ from the preceding four perspectives.

3.1 Secure and Trusted

3.1.1 Automatic Load Balancing to Decrease the Overall Failure Rate

A traditional RAID-based storage system typically contains multiple RAID groups, each consisting of up to a dozen or so disks. RAID groups work under different loads. As a result, the stress is unbalanced and hotspot disks emerge. According to statistics collected by SNIA, hotspot disks are more vulnerable to failures. In the following figure, Duty Cycle indicates the percentage of disk working time to total disk power-on time, and AFR indicates the annual failure rate. It can be inferred that when the duty cycle is high, the AFR is almost 1.5 to 2 times higher than the AFR in the low duty cycle scenario.


RAID2.0+ implements block virtualization to enable data to be automatically and evenly

distributed onto all disks in a storage pool, preventing unbalanced load distribution. This

approach decreases the overall failure rate of a storage system.

3.1.2 Fast Thin Reconstruction to Reduce Dual-Disk Failure Probability

Over the past 10 years of disk development, capacity growth has outpaced performance improvement. Today, 4 TB disks are common in enterprise and consumer markets, and 5 TB disks are expected in the second quarter of 2014. Even high-performance SAS disks that are dedicated to the enterprise market can provide up to 1.2 TB per disk.

Rapid capacity growth confronts traditional RAID with a serious issue: reconstruction of a single disk, which required only dozens of minutes 10 years ago, now requires more than 10 hours or even dozens of hours. The increasingly long reconstruction time means that a storage system that encounters a disk failure must stay in the degraded state without error tolerance for a long time, exposed to a serious risk of data loss. Data loss commonly occurs in storage systems under the dual stress posed by services and data reconstruction.

Based on underlying block virtualization, RAID2.0+ overcomes the performance bottleneck

seen in target disks (hot spare disks) that are used by traditional RAID for data reconstruction.

As a result, the write bandwidth provided for reconstructed data flows is no longer a

reconstruction speed bottleneck, greatly accelerating reconstruction, decreasing dual-disk

failure probability, and improving storage system reliability.

The following figure compares the reconstruction principle of traditional RAID with that of

RAID2.0+.

In the schematic diagram of traditional RAID, HDDs 0 to 4 compose a RAID 5 group,

and HDD 5 serves as a hot spare disk. If HDD 1 fails, an XOR algorithm is used to

reconstruct data based on HDDs 0, 2, 3, and 4, and the reconstructed data is written onto HDD 5.

In the schematic diagram of RAID2.0+, if HDD 1 fails, its data is reconstructed at CK granularity, and only the allocated CKs (marked in the figure) are reconstructed. All disks in the storage pool participate in the reconstruction. The reconstructed data is distributed on multiple disks (HDDs 4 and 9 in the figure).


The fine-grained and efficient fault handling of RAID2.0+ also contributes to faster reconstruction. In addition to the original bad sector repair and disk failure reconstruction, RAID2.0+ provides bad block repair that reconstructs only used space at CK granularity. By efficiently identifying used space, RAID2.0+ implements thin reconstruction upon a disk failure to further shorten the reconstruction time, mitigating data loss risks.
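The combined effect of many-to-many writes and thin reconstruction can be seen in an idealized model in which rebuild time is the amount of data to rewrite divided by the aggregate write bandwidth of the target disks. Everything in the sketch is an assumption made for illustration: the function name, the 30 MB/s per-disk rebuild rate, the 50% space usage, and the 20 target disks are hypothetical inputs, and the model ignores read bandwidth, controller limits, and competing host I/O. It does not reproduce any measured result.

```python
def rebuild_hours(capacity_gb: float, used_fraction: float,
                  per_disk_write_mb_s: float, target_disks: int) -> float:
    """Idealized rebuild time: data to rewrite / aggregate write bandwidth."""
    data_mb = capacity_gb * 1000 * used_fraction
    aggregate_mb_s = per_disk_write_mb_s * target_disks
    return data_mb / aggregate_mb_s / 3600


# Traditional RAID: the whole 4 TB disk is rewritten onto one hot spare disk.
traditional = rebuild_hours(4000, used_fraction=1.0,
                            per_disk_write_mb_s=30, target_disks=1)

# Block virtualization: only allocated chunks are rebuilt (50% assumed used)
# and the writes are spread over 20 disks' hot spare space.
distributed = rebuild_hours(4000, used_fraction=0.5,
                            per_disk_write_mb_s=30, target_disks=20)

print(f"single hot spare: {traditional:.1f} h")
print(f"distributed and thin: {distributed:.2f} h")
```

In this model the speedup is simply the number of target disks divided by the used fraction, so both mechanisms described above shorten the window during which the system runs degraded.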

With great advantages in reconstruction, RAID2.0+ enables the OceanStor Enterprise Storage

to outperform traditional storage systems in terms of reconstruction. The following figure

compares the time that a traditional storage system and the OceanStor Enterprise Storage

spend in reconstructing 1 TB data in an NL-SAS large-capacity disk environment.


3.1.3 Failure Self-Check and Self-Healing to Ensure System Reliability

The OceanStor Enterprise Storage employs a multi-level error tolerance design for disks and

provides various measures to ensure reliability, including online disk diagnosis, Disk Health

Analyzer (DHA), bad sector background scanning, and bad sector repair. Based on a hot spare

policy, RAID2.0+ automatically reserves a certain amount of hot spare space in a disk domain.

If the OceanStor Enterprise Storage detects an uncorrectable media error in an area of a disk

or finds that an entire disk fails, the OceanStor Enterprise Storage automatically reconstructs

the affected data blocks and writes the reconstructed data to the hot spare space of other disks,

implementing quick self-healing.

Traditional RAID | RAID2.0+
Independent global or local hot spare disks must be manually configured. | Distributed hot spare space is provided automatically.
Many-to-one reconstruction is implemented. Reconstructed data flows are written to a single hot spare disk in serial. | Many-to-many reconstruction is implemented. Reconstructed data flows are written to multiple disks in parallel.
Hotspot disks exist. The reconstruction time is long. | Load balancing is implemented. The reconstruction time is short.

3.2 Flexible and Efficient

3.2.1 Pool Virtualization Design to Simplify Storage Planning and Management

Nowadays, mainstream high-end storage systems typically contain hundreds or even thousands of disks of different types. If such a storage system employs traditional RAID, administrators need to manage a lot of RAID groups and must carefully plan performance and capacity for each application and RAID group. In an era of constant change, it is almost impossible to accurately predict the service development trends in an IT system lifecycle and the corresponding data growth. As a result, administrators often face management issues such as uneven allocation of storage resources. These issues greatly increase management complexity.

The OceanStor Enterprise Storage employs advanced virtualization technologies to manage storage resources in the form of storage pools. Administrators only need to maintain a few storage pools. All RAID configurations are automatically completed during the creation of storage pools. In addition, the OceanStor Enterprise Storage automatically manages and schedules system resources based on user-defined policies, significantly simplifying storage planning and management.

3.2.2 One LUN Across More Disks to Improve Performance of a Single LUN

Since the start of the 21st century, server computing capabilities have improved greatly and the number of host applications (such as databases and virtual machines) has increased sharply, giving rise to the need for higher storage performance, capacity, and flexibility. Restricted by the number of disks, a traditional RAID group provides only a small capacity, moderate performance, and poor scalability. These disadvantages prevent traditional RAID groups from meeting service requirements. When a host accesses a LUN intensively, only a limited number of disks are actually accessed, which easily causes disk access bottlenecks and turns those disks into hotspots.

RAID2.0+ supports a storage pool that consists of dozens or even hundreds of disks. LUNs

are created from a storage pool and are therefore no longer subject to the limited number of disks

supported by a RAID group. Wide striping technology distributes data of a single LUN onto

multiple disks, preventing disks from becoming hotspots and improving the performance and

capacity of a single LUN significantly. If the capacity of an existing storage system does not

meet the needs, a user can dynamically expand the capacity of a storage pool and that of a

LUN by simply adding disks to the disk domain. This approach improves disk capacity

utilization.
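
The effect of wide striping on hotspots can be illustrated with a minimal sketch. It assumes a simple round-robin placement, a 400-chunk LUN, a 5-disk RAID group, and a 40-disk pool; these sizes, the placement rule, and the function names are illustrative assumptions, not the placement algorithm actually used by RAID2.0+.

```python
from collections import Counter


def place_chunks(num_chunks, disks):
    """Round-robin placement (an assumption): chunk i goes to disks[i % len(disks)]."""
    return [disks[i % len(disks)] for i in range(num_chunks)]


def busiest_disk_share(placement):
    """Fraction of the LUN's chunks held by the most heavily loaded disk."""
    counts = Counter(placement)
    return max(counts.values()) / len(placement)


# Hypothetical sizes: one LUN of 400 chunks, placed two different ways.
raid_group = [f"disk{i}" for i in range(5)]   # traditional RAID group
pool = [f"disk{i}" for i in range(40)]        # storage pool with wide striping

narrow = place_chunks(400, raid_group)
wide = place_chunks(400, pool)

print(busiest_disk_share(narrow))  # 0.2
print(busiest_disk_share(wide))    # 0.025
```

Under these assumptions, each disk in the 5-disk group serves 20% of the LUN, whereas each disk in the 40-disk pool serves 2.5%, so intensive access to the LUN is spread over eight times as many spindles.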

3.2.3 Dynamic Space Distribution to Flexibly Adapt to Service Changes

RAID2.0+ is implemented based on industry-leading block virtualization. Data and service

load in a volume are automatically and evenly distributed onto all physical disks in a storage

pool. By leveraging the Smart series efficiency improvement software, the OceanStor

Enterprise Storage automatically schedules resources in a smart way based on factors such as

the amount of hot and cold data and the performance and capacity required by a service. In

this way, the OceanStor Enterprise Storage adapts to rapid changes in enterprise services.

4 Appendix A: FAQs

Q1. Do all disks of an OceanStor Enterprise Storage system reside in the same storage pool?

Answer: Not always. An OceanStor Enterprise Storage system manages disks based on disk

domains and storage pools. A user can create one or more disk domains that are isolated from

each other in terms of resources, performance, and faults. One or more storage pools can be

created in a disk domain. Each storage pool uses various types of disks to provide storage

space.

Q2. How does an OceanStor Enterprise Storage system treat data in the event of capacity

expansion and disk failure?

Answer: After a disk is added to a storage pool, the OceanStor Enterprise Storage system

automatically moves a certain amount of data to the newly added disk space based on the disk

usage for capacity balancing, ensuring that all disks in the same storage pool have similar

space utilization.
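
The balancing behavior can be sketched as follows. This white paper does not describe the actual migration algorithm, so the greedy rule below (move one chunk at a time from the fullest disk to the emptiest disk), the equal disk sizes, and all names are assumptions made only for illustration.

```python
def rebalance(used, new_disk, tolerance=1):
    """Greedy capacity balancing after a disk is added (illustrative only).

    used:      dict mapping disk name -> number of used chunks
               (all disks are assumed to have the same capacity)
    new_disk:  name of the newly added, empty disk
    tolerance: largest allowed difference in used chunks between any two disks
    Returns (balanced usage dict, list of (source, destination) moves).
    """
    used = dict(used)
    used[new_disk] = 0
    moves = []
    while True:
        fullest = max(used, key=used.get)
        emptiest = min(used, key=used.get)
        if used[fullest] - used[emptiest] <= tolerance:
            break
        # Migrate one chunk from the fullest disk to the emptiest disk.
        used[fullest] -= 1
        used[emptiest] += 1
        moves.append((fullest, emptiest))
    return used, moves


# Hypothetical pool: three disks with 60 used chunks each, then one disk is added.
balanced, moves = rebalance({"d0": 60, "d1": 60, "d2": 60}, "d3")
print(balanced)    # {'d0': 45, 'd1': 45, 'd2': 45, 'd3': 45}
print(len(moves))  # 45
```

In this example only 45 of the 180 used chunks are moved, which matches the statement that a certain amount of data, rather than all data, is migrated to the new disk.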

If a disk fails, CKGs related to the failed disk automatically perform data reconstruction, and

the reconstructed data is evenly written to the hot spare space of other functional disks. Users

do not need to specify hot spare space. The OceanStor Enterprise Storage system

automatically selects hot spare space based on the disk usage. For details about the

reconstruction process, see section 3.1.2 "Fast Thin Reconstruction to Reduce Dual-Disk

Failure Probability."
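
A minimal sketch of how distributed hot spare space might be selected is given below. It assumes that each CKG holds at most one CK per disk and that the target for a rebuilt CK is the healthy disk with the most free CKs that is not already a member of the CKG. The selection rule, the data structures, and the names are illustrative assumptions, not the documented OceanStor implementation.

```python
def plan_rebuild(ckgs, failed, free_cks):
    """Choose a target disk for every CKG that has a CK on the failed disk.

    ckgs:     dict mapping CKG id -> list of member disks (one CK per disk)
    failed:   name of the failed disk
    free_cks: dict mapping disk -> number of free CKs usable as hot spare space
    Returns a dict mapping CKG id -> disk that receives the rebuilt CK.
    """
    free = {d: n for d, n in free_cks.items() if d != failed}
    plan = {}
    for ckg_id in sorted(ckgs):
        members = ckgs[ckg_id]
        if failed not in members:
            continue  # this CKG is not affected by the failure
        # Keep the CKG spread over distinct disks: skip current members.
        candidates = [d for d, n in free.items() if n > 0 and d not in members]
        if not candidates:
            raise RuntimeError(f"no free CK available for CKG {ckg_id}")
        # Prefer the disk with the most free CKs; break ties by name.
        target = min(candidates, key=lambda d: (-free[d], d))
        free[target] -= 1
        plan[ckg_id] = target
    return plan


# Hypothetical layout: six disks, four CKGs of three CKs each, disk d0 fails.
ckgs = {
    0: ["d0", "d1", "d2"],
    1: ["d0", "d3", "d4"],
    2: ["d1", "d2", "d3"],
    3: ["d0", "d2", "d5"],
}
free_cks = {d: 2 for d in ["d0", "d1", "d2", "d3", "d4", "d5"]}
plan = plan_rebuild(ckgs, "d0", free_cks)
print(plan)  # {0: 'd3', 1: 'd1', 3: 'd4'}
```

The three affected CKGs are rebuilt onto three different disks, so the reconstruction writes are shared among several disks instead of being directed at one dedicated hot spare disk.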

Q3. Compared with traditional RAID, in what aspects does RAID2.0+ demonstrate its high

reliability?

Answer: RAID2.0+ demonstrates its high reliability in the following aspects:

Load balancing: RAID2.0+ enables disks to work in a balanced manner, preventing

some disks from being overloaded, which may occur in traditional RAID. For details, see section 3.1.1 "Automatic Load Balancing to Decrease the Overall Failure Rate."

Robust reconstruction: RAID2.0+ enables more disks to share reconstruction load,

reducing the work stress on each disk. In this way, the disk failure risk is minimized

during the reconstruction.

Fast reconstruction: RAID2.0+ significantly shortens the reconstruction window to

help an OceanStor Enterprise Storage system regain the error tolerance capability as

soon as possible, thereby improving system reliability. For details, see section 3.1.2 "Fast Thin Reconstruction to Reduce Dual-Disk Failure Probability".

Thin reconstruction: Based on metadata, RAID2.0+ detects which allocated space is in

use. In the event of reconstruction, RAID2.0+ reconstructs only the used space to reduce

the workload and shorten the reconstruction time, lowering the reconstruction failure

risks.

Self-healing: RAID2.0+ uses distributed hot spare space. If an OceanStor Enterprise

Storage system detects a fault, reconstruction automatically starts as long as there are

free CKs in disks, thereby improving reliability and cutting management costs. For

details, see section 3.1.1 "Automatic Load Balancing to Decrease the Overall Failure Rate."

Minimized data loss amount: If a traditional RAID group fails, all data in the RAID

group is affected. With RAID2.0+, if multiple disks fail, only the data related to

these failed disks is affected, and other data is still accessible. Therefore, the amount of

lost data is much smaller.

Based on the Markov model, the following table lists the data loss risks of traditional

RAID and RAID2.0+, with data loss probability and data loss amount taken into consideration.

| System Configuration | RAID2.0+ Configuration | Traditional RAID Configuration | Data Loss Risk (Traditional RAID/RAID2.0+) |
|---|---|---|---|
| 40 x 600 GB SAS disks (disk failure rate of 1%) | RAID 5 (4+1 disks), 40 disks per DG | Eight RAID 5 (4+1 disks) groups | 16.09 |
| 40 x 2 TB SATA disks (disk failure rate of 2%) | RAID 6 (8+2 disks), 40 disks per DG | Four RAID 5 (4+1 disks) groups | 69.29 |
| 40 x 600 GB SAS disks (disk failure rate of 1%) | RAID 10 (10 disks), 40 disks per DG | Four RAID 10 (10 disks) groups | 39.15 |

Q4. Will data be lost if there is a concurrent failure of two disks in an OceanStor Enterprise

Storage system?

Answer: The essence of this question is about the error tolerance capability of RAID. RAID is

the basis of data storage protection. RAID 5 tolerates a concurrent failure of only one disk (for

traditional RAID) or CK (for RAID2.0+). RAID 6 tolerates a concurrent failure of two disks

or CKs. Therefore, if you employ a double-parity RAID level (such as RAID 6), whether with

traditional RAID or RAID2.0+, a concurrent failure of two disks will not cause data loss.

If RAID 5 is employed, a concurrent failure of two disks will cause data loss in traditional

RAID scenarios. For an OceanStor Enterprise Storage system that employs RAID2.0+,

however, as long as each CKG does not contain two failed CKs, data will not be lost.

The following figure shows a storage pool that consists of 20 disks, where storage space is

provided for hosts in the form of LUNs and the RAID policy is RAID 5 (4D+1P).

RAID2.0+: Concurrent Failure of Two Disks

If there is a concurrent failure of HDDs 7 and 9, only CKs related to the two disks are affected.

In the preceding figure, CKs 71 to 76 and CKs 101 to 106 are affected, whereas CKs 77 to 79

and CKs 107 to 109 are not affected because no data is stored in these free CKs. Accordingly,

CKGs 0, 1, 2, 4, 8, 11, 12, 13, 17, 19, 21, and 23 (the red CKGs with an underscore) are

affected. CKGs adopt the RAID 5 (4D+1P) policy, and each affected CKG contains only one

failed CK. Therefore, the data provided by each of these CKGs is still available. From the

perspective of hosts, the corresponding LUNs are still accessible and services are not

interrupted.
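
The rule applied in this answer (data is lost only if a CKG contains more failed CKs than its RAID level tolerates) can be expressed as a short check. The CKG layout below is hypothetical and does not reproduce the layout in the figure; the function and variable names are likewise assumptions for illustration.

```python
def lost_ckgs(ckgs, failed_disks, tolerance=1):
    """Return the ids of CKGs that lose data after the given disk failures.

    ckgs:         dict mapping CKG id -> list of disks holding its CKs
    failed_disks: iterable of failed disk numbers
    tolerance:    failed CKs a CKG can survive (1 for RAID 5, 2 for RAID 6)
    """
    failed = set(failed_disks)
    return sorted(
        ckg_id
        for ckg_id, members in ckgs.items()
        if sum(disk in failed for disk in members) > tolerance
    )


# Hypothetical RAID 5 (4D+1P) CKGs in a 20-disk pool (disks 0 to 19).
ckgs = {
    0: [0, 1, 2, 3, 7],
    1: [4, 5, 6, 9, 10],
    2: [7, 11, 12, 13, 14],
    3: [9, 15, 16, 17, 18],
}

# Disks 7 and 9 fail: every affected CKG has only one failed CK.
print(lost_ckgs(ckgs, [7, 9]))  # []

# A CKG that has CKs on both failed disks loses data under RAID 5 ...
ckgs_with_overlap = dict(ckgs)
ckgs_with_overlap[4] = [0, 1, 7, 9, 19]
print(lost_ckgs(ckgs_with_overlap, [7, 9]))  # [4]

# ... but not under a double-parity policy such as RAID 6.
print(lost_ckgs(ckgs_with_overlap, [7, 9], tolerance=2))  # []
```

The second call also illustrates the point made in Q3: even when data is lost, only the CKGs that span both failed disks are affected, and all other CKGs remain accessible.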

5 Appendix B: Peripheral Resources About RAID2.0+

RAID2.0+ Technology Video, helping you easily learn about the advantages and principles

of RAID2.0+:

http://3ms.huawei.com/mm/docMaintain/mmMaintain.do?method=showMMDetail&f_id=STR13083022230014

http://3ms.huawei.com/mm/video/videoMaintain.do?method=showVideoDetail&f_id=1421965

RAID2.0+ Panorama:

http://3ms.huawei.com/mm/docMaintain/mmMaintain.do?method=showMMDetail&f_id=STR13072905120079

6 Acronyms and Abbreviations

Table 6-1 Acronyms and Abbreviations

| Acronyms and Abbreviations | Full Spelling |
|---|---|
| AFR | annual failure rate |
| CK | chunk |
| CKG | chunk group |
| DG | disk group |
| DHA | Disk Health Analyzer |
| LD | logical drive |
| LUN | logical unit number |
| RAID | Redundant Array of Independent Disks |
| RPM | revolutions per minute |
| SNIA | Storage Networking Industry Association |
| XVE | eXtreme Virtual Engine |