Parity Logging Overcoming the Small Write Problem in Redundant Disk Arrays Daniel Stodolsky Garth Gibson Mark Holland

Upload: lynne-wilson

Post on 27-Dec-2015


Page 1

Parity Logging: Overcoming the Small Write Problem in Redundant Disk Arrays

Daniel Stodolsky

Garth Gibson

Mark Holland

Page 2

Contents

• Overview of some RAID systems
• Small write problem
• Parity logging
• Floating data and parity
• Comparison between different models
• Concluding remarks
• Questions

Page 3

RAID systems considered in this paper

Page 4

Small Write Problem

In RAID 5, a small write may require four disk accesses: prereading the old data, writing the new data, prereading the corresponding old parity value, and writing the new parity value.

RAID level 5 is therefore penalized by a factor of four relative to nonredundant arrays for workloads of mostly small writes.

Mirrored disks are penalized only by a factor of two, since the data only needs to be written to two separate disks.
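The four accesses behind the factor-of-four penalty can be sketched as a toy model. This is a minimal illustration, not the presenters' code: the class `Raid5Stripe`, the helper `xor_blocks`, and the sample bytes are hypothetical, and the parity rule used is the standard XOR identity (new parity = old parity XOR old data XOR new data).

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class Raid5Stripe:
    """Toy model of one stripe: data blocks plus a single parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = reduce(xor_blocks, self.data)
        self.io_count = 0  # disk accesses issued by small writes

    def small_write(self, index: int, new_data: bytes) -> None:
        old_data = self.data[index]      # 1. preread old data
        self.data[index] = new_data      # 2. write new data
        old_parity = self.parity         # 3. preread old parity
        # 4. write new parity = old parity XOR old data XOR new data
        self.parity = xor_blocks(old_parity, xor_blocks(old_data, new_data))
        self.io_count += 4


stripe = Raid5Stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
stripe.small_write(1, b"\xaa\xbb")
```

A nonredundant array would issue one access for the same write and a mirrored pair two, which is where the factors of four and two come from.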

Page 5

OLTP and Small write

OLTP (on-line transaction processing) systems represent a substantial segment of the secondary storage market. A banking system is an example.

OLTP systems require update-intensive database services

Performance of OLTP is largely determined by small write performance.

Page 6

Disk Bandwidth

The three components of disk access are: seek time, rotational positioning time, and data transfer time.

Small disk writes make inefficient use of disk bandwidth

Random cylinder accesses move data twice as fast as random track accesses which, in turn, move data ten times faster than random block accesses.
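A rough way to see why larger accesses use bandwidth better is to add up the three components for accesses of different sizes. The sketch below assumes a hypothetical disk (12 ms seek, 14 ms rotation, 24 blocks per track, 14 tracks per cylinder) and ignores head-switch time; none of these values or function names come from the slides, so the exact ratios it produces are illustrative only.

```python
def access_time_ms(blocks, seek_ms, rotation_ms, blocks_per_track):
    """Seek + average rotational positioning + transfer time for one access."""
    positioning = rotation_ms / 2                        # half a rotation on average
    transfer = rotation_ms * blocks / blocks_per_track   # one rotation per track of data
    return seek_ms + positioning + transfer


def blocks_per_ms(blocks, **disk):
    """Effective bandwidth of a random access that moves `blocks` blocks."""
    return blocks / access_time_ms(blocks, **disk)


# Hypothetical disk, chosen only for illustration.
DISK = dict(seek_ms=12.0, rotation_ms=14.0, blocks_per_track=24)
TRACKS_PER_CYLINDER = 14

block_bw = blocks_per_ms(1, **DISK)
track_bw = blocks_per_ms(DISK["blocks_per_track"], **DISK)
cylinder_bw = blocks_per_ms(DISK["blocks_per_track"] * TRACKS_PER_CYLINDER, **DISK)
```

For a single block the fixed seek and positioning costs dominate the access, while for a track or a cylinder they are amortized over many blocks.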

Page 7

Parity Logging

A powerful mechanism for eliminating the small write penalty.

Based on the much higher disk bandwidth of large accesses compared with small ones.

A technique for logging (journaling) events that transforms small random accesses into large sequential accesses to the log and parity disks.

Page 8

Basic Parity Logging Model

• A RAID level 4 disk array with one additional disk, a log disk.
• For each small write, a parity update image is held in a fault-tolerant buffer.
• When enough parity update images are buffered, they are written to the end of the log on the log disk.
• When the log disk fills up, the out-of-date parity and the log of parity update information are read into memory.
• The out-of-date parity is updated (in memory) and rewritten with large sequential writes.
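The steps of the basic model can be sketched as a small in-memory simulation. This is a sketch under stated assumptions rather than the authors' design: the class `ParityLoggingArray`, its capacities, and the sample bytes are hypothetical, and the parity update image is taken to be the XOR of the old and new data, which is the quantity the parity needs to absorb.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class ParityLoggingArray:
    """Toy RAID 4 array with a fault-tolerant buffer and a log disk."""

    def __init__(self, data_disks, buffer_capacity, log_capacity):
        self.data = [list(disk) for disk in data_disks]   # data[disk][block]
        self.parity = self.expected_parity()              # parity disk contents
        self.buffer = []                                  # (block, update image) pairs
        self.log = []                                     # append-only log disk
        self.buffer_capacity = buffer_capacity
        self.log_capacity = log_capacity

    def expected_parity(self):
        """Parity recomputed from the current data, used only for checking."""
        n_blocks = len(self.data[0])
        return [reduce(xor_blocks, (d[i] for d in self.data)) for i in range(n_blocks)]

    def small_write(self, disk, block, new_data):
        image = xor_blocks(self.data[disk][block], new_data)  # assumed: old XOR new
        self.data[disk][block] = new_data
        self.buffer.append((block, image))
        if len(self.buffer) >= self.buffer_capacity:
            self.flush_buffer()

    def flush_buffer(self):
        self.log.extend(self.buffer)      # one large sequential write to the log
        self.buffer.clear()
        if len(self.log) >= self.log_capacity:
            self.reintegrate()

    def reintegrate(self):
        parity = list(self.parity)        # read out-of-date parity into memory
        for block, image in self.log:     # read the log and apply every update
            parity[block] = xor_blocks(parity[block], image)
        self.parity = parity              # rewrite parity with large sequential writes
        self.log.clear()


array = ParityLoggingArray(
    [[b"\x01", b"\x02", b"\x03", b"\x04"], [b"\x10", b"\x20", b"\x30", b"\x40"]],
    buffer_capacity=2,
    log_capacity=4,
)
for disk, block, value in [(0, 0, b"\xaa"), (1, 2, b"\xbb"), (0, 3, b"\xcc"), (1, 0, b"\xdd")]:
    array.small_write(disk, block, value)
```

With these capacities the fourth write fills the log and triggers reintegration, after which the stored parity matches the data again. Between reintegrations the parity on disk is stale, and the buffer and log hold what is needed to bring it up to date.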

Page 9

Basic Parity Logging Model

Page 10

Reliability of Basic Logging Model

Data disk failure:
• Update the parity disk from the logged parity updates
• Reconstruct the lost data

Log or parity disk failure:
• Install a new, empty log disk (or parity disk)
• Reconstruct the parity
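Both recovery paths reduce to XOR arithmetic. The following is a minimal sketch, assuming that logged parity update images are the XOR of old and new data; the function names and sample bytes are hypothetical.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_data_block(surviving_data, stale_parity, logged_images):
    """Data disk failure: bring parity up to date, then reconstruct the lost block."""
    up_to_date_parity = reduce(xor_blocks, logged_images, stale_parity)
    return reduce(xor_blocks, surviving_data, up_to_date_parity)


def rebuild_parity(data_blocks):
    """Log or parity disk failure: recompute parity from the data alone."""
    return reduce(xor_blocks, data_blocks)


d0, d1, d2 = b"\x0f", b"\x33", b"\x55"
stale_parity = rebuild_parity([d0, d1, d2])   # parity before any update
new_d1 = b"\xc3"                              # a small write to block 1 ...
log = [xor_blocks(d1, new_d1)]                # ... leaves only an image in the log
```

Because the data disks always hold current data, losing the log or the parity disk loses no information: the parity can be recomputed from the data disks.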

Page 11

Tracks, Cylinders, and Sectors

Page 12

Parity Maintenance Time Analysis (basic model vs RAID 4)

Notation, as implied by the counts below: D blocks per track, T tracks per cylinder, V cylinders per disk, so a disk holds TVD blocks.

Every D small writes issued cause one track write to the log.

Every TVD small writes issued cause the log disk to fill up, which triggers 3 full-disk accesses at the cylinder data rate.

=> Parity maintenance for TVD small writes consumes as much disk time as TV(D/10) + 3V(T/2)(D/10) = TVD/4 random block accesses.

Result: the disk time consumed by the parity update I/Os is reduced by about a factor of eight.
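The arithmetic can be checked with exact fractions. In this sketch the unit of cost is one random block access; a track access is charged D/10 units and a cylinder access (T/2)(D/10) units, following the 10x and 2x bandwidth ratios quoted earlier. The RAID 4 baseline of two parity accesses per small write (preread and write) is an assumption made here to recover the factor of eight, and the values of T, V and D are arbitrary.

```python
from fractions import Fraction


def parity_logging_cost(T, V, D):
    """Parity-maintenance time for T*V*D small writes, in random-block-access units."""
    track_access = Fraction(D, 10)                    # D blocks moved 10x faster than block by block
    cylinder_access = Fraction(T, 2) * track_access   # T tracks moved 2x faster than track by track
    log_writes = T * V * track_access                 # one track write per D small writes
    reintegration = 3 * V * cylinder_access           # three full-disk passes at cylinder rate
    return log_writes + reintegration


def raid4_parity_cost(T, V, D):
    """Assumed baseline: one parity preread and one parity write per small write."""
    return 2 * T * V * D


T, V, D = 14, 949, 48   # arbitrary illustrative geometry
logging_cost = parity_logging_cost(T, V, D)
speedup = raid4_parity_cost(T, V, D) / logging_cost
```

The ratio does not depend on the geometry chosen, since both costs are proportional to TVD.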

Page 13

Enhancing the Basic Parity Logging Model

Limitation:
• The basic parity logging model is completely impractical, since an entire disk's capacity of random access memory is required to hold the parity while the parity updates are applied.

Enhancement (parity logging regions):
• Divide the array into regions.
• Every region is treated the same way as an entire disk in the basic model.
• Each region has its own fault-tolerant buffer.
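The effect of regions on memory use can be sketched as follows. This is an illustrative model only: the class `RegionedParityLog`, the fixed-size region mapping, and the use of integers as stand-ins for parity blocks and update images are all assumptions.

```python
class RegionedParityLog:
    """Toy model: per-region buffers and logs, parity blocks represented as ints."""

    def __init__(self, parity, blocks_per_region, buffer_capacity, log_capacity):
        self.parity = list(parity)
        self.blocks_per_region = blocks_per_region
        n_regions = -(-len(self.parity) // blocks_per_region)   # ceiling division
        self.buffers = [[] for _ in range(n_regions)]           # one buffer per region
        self.logs = [[] for _ in range(n_regions)]              # one log area per region
        self.buffer_capacity = buffer_capacity
        self.log_capacity = log_capacity
        self.peak_memory_blocks = 0   # most parity blocks ever held in memory at once

    def region_of(self, block):
        return block // self.blocks_per_region

    def log_update(self, block, image):
        r = self.region_of(block)
        self.buffers[r].append((block, image))
        if len(self.buffers[r]) >= self.buffer_capacity:
            self.logs[r].extend(self.buffers[r])    # sequential write to region r's log
            self.buffers[r].clear()
            if len(self.logs[r]) >= self.log_capacity:
                self.reintegrate(r)

    def reintegrate(self, r):
        start = r * self.blocks_per_region
        end = min(start + self.blocks_per_region, len(self.parity))
        in_memory = self.parity[start:end]          # only this region's parity is loaded
        for block, image in self.logs[r]:
            in_memory[block - start] ^= image
        self.parity[start:end] = in_memory
        self.logs[r].clear()
        self.peak_memory_blocks = max(self.peak_memory_blocks, len(in_memory))


regions = RegionedParityLog([0] * 8, blocks_per_region=4, buffer_capacity=1, log_capacity=2)
regions.log_update(0, 5)
regions.log_update(1, 3)   # fills region 0's log and triggers its reintegration
regions.log_update(6, 7)   # region 1 is unaffected by region 0's reintegration
```

Reintegration now needs memory for one region's parity rather than a whole disk's, which is the point of the enhancement.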

Page 14

Parity Logging Regions

Page 15

Enhancing Parity Logging Regions

Limitation:
• The log and parity disks may become performance bottlenecks if there are many disks in the array.

Enhancement (log and parity rotation):
• Distribute the parity and the logs across all the disks in the array.
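One simple way to spread the load is a round-robin assignment of each region's parity and log to disks. The sketch below is a hypothetical placement rule, not the layout used in the paper; in particular, keeping a region's parity and log on different disks is an assumption made here.

```python
from collections import Counter


def rotated_placement(region, n_disks):
    """Hypothetical round-robin rule: returns (parity disk, log disk) for a region."""
    parity_disk = region % n_disks
    log_disk = (region + 1) % n_disks   # assumed: log kept on a different disk than parity
    return parity_disk, log_disk


N_DISKS = 5
placements = [rotated_placement(r, N_DISKS) for r in range(10)]
parity_load = Counter(p for p, _ in placements)   # regions' parity held per disk
log_load = Counter(l for _, l in placements)      # regions' logs held per disk
```

With ten regions on five disks, every disk ends up holding the parity of two regions and the log of two regions, so no single disk absorbs all log or parity traffic.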

Page 16

Log and Parity Rotation

Page 17

Enhancing Log and Parity Rotation

Limitation:
• The log and parity bandwidth for a particular region is still that of a single disk.

Enhancement (block parity striping):
• Distribute the parity log for each region over multiple disks.

Page 18

Block Parity Striping

Page 19

Analytical Model

A single small write access in parity logging takes, on average, a time that simplifies to S + (3 + 2/D)R.

Without the preread: S + (1 + 2/D)R.

Further analysis covers:
• Writing fault-tolerant buffers to parity log regions
• Log and parity integration
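The two expressions can be evaluated directly. The slide does not define its symbols, so this sketch reads S as the average seek time, R as the average rotational latency (half a rotation, so a block transfer takes 2R/D), and D as the number of blocks per track; that reading and the sample values are assumptions.

```python
import math


def small_write_time(S, R, D, preread=True):
    """Closed form from the slide: S + (3 + 2/D)R with preread, S + (1 + 2/D)R without."""
    rotations = 3 if preread else 1
    return S + (rotations + 2 / D) * R


def small_write_time_expanded(S, R, D):
    """Term-by-term form under the assumed reading of S, R and D."""
    block = 2 * R / D               # transfer time of one block
    return (S                       # seek
            + R                     # rotational latency
            + block                 # preread the old data
            + (2 * R - block)       # wait for the block to come around again
            + block)                # write the new data


S, R, D = 12.0, 7.0, 24             # hypothetical values in milliseconds and blocks
with_preread = small_write_time(S, R, D)
without_preread = small_write_time(S, R, D, preread=False)
```

Under this reading, skipping the preread saves exactly one full rotation (2R), which is why a cached copy of the old data helps.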

Page 20

Simulation Parameters

Page 21

Parity Logging Overheads vs RAID 5 Overhead (per small write)

Contributions to disk busy time for the example disk array (previous slide).

The extra I/O done by RAID 5 costs nearly 35 milliseconds.

Page 22

Alternative Schemes: Floating Data and Parity

• Organize data and parity into cylinders that contain either data only or parity only.
• Maintain a single track of empty space per cylinder.

Page 23

Floating Data and Parity

Page 24

Floating Data and Parity (analysis)

For RAID 5, the busy time for each data and parity update is S + R + 2R/D + (2R - 2R/D) + 2R/D.

With the new technique, the (2R - 2R/D) term is replaced with a head switch and a short rotational delay (0.76 data units using the sample array mentioned before).

A small random write in floating data and parity takes 2S + (2 + 11.04/D)R + 2H.

This is close to mirroring performance if D is large and H is small.
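The closed form can be checked against the term-by-term expression. This sketch reads "0.76 data units" as 0.76 block transfer times (each 2R/D) and H as the head-switch time; both readings, the mirrored-write formula (two plain writes of S + R + 2R/D each), and the sample values are assumptions.

```python
import math


def raid5_update_time(S, R, D):
    """Busy time of one RAID 5 data or parity update, as given on the slide."""
    return S + R + 2 * R / D + (2 * R - 2 * R / D) + 2 * R / D


def floating_update_time(S, R, D, H, delay_units=0.76):
    """Same update with the rotation wait replaced by a head switch and a short delay."""
    block = 2 * R / D
    return S + R + block + (H + delay_units * block) + block


def floating_small_write_time(S, R, D, H):
    """Closed form from the slide, covering the data update and the parity update."""
    return 2 * S + (2 + 11.04 / D) * R + 2 * H


def mirrored_small_write_time(S, R, D):
    """Assumed: two plain writes, each costing S + R + 2R/D."""
    return 2 * (S + R + 2 * R / D)


S, R, D, H = 12.0, 7.0, 24, 1.0     # hypothetical values in milliseconds and blocks
floating = floating_small_write_time(S, R, D, H)
gap_to_mirroring = floating - mirrored_small_write_time(S, R, D)
```

The gap to mirroring works out to 7.04R/D + 2H, which shrinks as D grows and H falls, matching the closing remark on the slide.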

Page 25

Model Estimates (as predicted by analysis)

[Figure; axis label: I/O per second per disk]

Page 26

Response Times and Utilization.

Page 27

Response Time Standard Deviation

Page 28

Concluding Remarks

Parity logging achieves better performance than RAID level 5 arrays.

When data must be preread before being overwritten, parity logging is comparable to floating data and parity.

Its performance is superior to mirroring and to floating data and parity when the data to be overwritten is cached.

Page 29

Questions

• What is parity logging? Describe the general technique of parity logging.
• What is the small write problem, and why is it so important?
• What are the advantages and disadvantages of floating data and parity?