236601 - Coding and Algorithms for Memories, Lecture 12

Upload: verne

Post on 06-Jan-2016


Page 1: 236601 - Coding and Algorithms  for  Memories Lecture 12

236601 - Coding and Algorithms for Memories

Lecture 12


Array Codes and Distributed Storage

Large Scale Storage Systems

• Big Data Players: Facebook, Amazon, Google, Yahoo,…

Cluster of machines running Hadoop at Yahoo! (Source: Yahoo!)

• Failures are the norm

Node failures at Facebook

[Figure: node failures at Facebook, plotted by date]

Source: M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, "XORing Elephants: Novel Erasure Codes for Big Data," VLDB 2013

Problem Setup

• Disks are stored together in a group (rack)
• Disk failures should be supported
• Requirements:
– Support as many disk failures as possible
– And yet…
  • Optimal and fast recovery
  • Low complexity

Problem Setup

• Question 1: How many extra disks are required to support a single disk failure?
• Question 2: How many extra disks are required to support two disk failures?
• Question 3: How many extra disks are required to support d disk failures?

One extra disk:

A B C A+B+C

{(x1,x2,x3,x4): x1+x2+x3+x4=0}
= {(x1,x2,x3,x4): H1∙(x1,x2,x3,x4)^T=0}, H1 = (1,1,1,1)

Two extra disks:

A B C A+B+C αA+βB+γC

{(x1,x2,x3,x4,x5): x1+x2+x3+x4=0, αx1+βx2+γx3+x5=0}
= {(x1,x2,x3,x4,x5): H2∙(x1,x2,x3,x4,x5)^T=0}, H2 = (1,1,1,1,0; α,β,γ,0,1)

Three extra disks:

A B C A+B+C αA+βB+γC α'A+β'B+γ'C

{(x1,x2,x3,x4,x5,x6): x1+x2+x3+x4=0, αx1+βx2+γx3+x5=0, α'x1+β'x2+γ'x3+x6=0}
= {(x1,x2,x3,x4,x5,x6): H3∙(x1,x2,x3,x4,x5,x6)^T=0}, H3 = (1,1,1,1,0,0; α,β,γ,0,1,0; α',β',γ',0,0,1)
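The single-parity case (disks A, B, C plus one disk holding A+B+C) can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that "+" is bytewise XOR (addition over GF(2)), and the helper name `xor_blocks` and the sample block contents are hypothetical.

```python
def xor_blocks(*blocks):
    """Bytewise XOR of equal-length blocks (addition over GF(2))."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Three data disks and one parity disk P = A + B + C.
A, B, C = b"disk", b"data", b"code"
P = xor_blocks(A, B, C)

# Suppose disk B is lost: the parity equation x1+x2+x3+x4 = 0
# gives B = A + C + P.
recovered_B = xor_blocks(A, C, P)
```

Any single lost disk can be rebuilt the same way, by XORing the three surviving ones. A second simultaneous failure cannot be repaired with this one equation, which is why the additional parity rows of H2 and H3 are needed.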


Reed Solomon Codes

• A code with parity check matrix of the form

where α is a primitive element of some extension field and its order satisfies O(α) > n-1

Claim: Every sub-matrix of size d×d has full rank
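The claim can be checked numerically on a small instance. The parity check matrix itself is not reproduced in this transcript, so the sketch below assumes a Vandermonde form H[i][j] = α^(i·j) with d rows and n columns, and, purely for simplicity, works over the prime field GF(7) with α = 3 instead of an extension field. All function names are hypothetical.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gauss-Jordan elimination."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(x - f * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def vandermonde_parity_check(alpha, d, n, p):
    """Assumed form: entry (i, j) is alpha^(i*j) mod p."""
    return [[pow(alpha, i * j, p) for j in range(n)] for i in range(d)]

def all_square_submatrices_full_rank(H, p):
    """True if every choice of d columns of the d x n matrix H has rank d."""
    d, n = len(H), len(H[0])
    return all(
        rank_mod_p([[row[j] for j in cols] for row in H], p) == d
        for cols in combinations(range(n), d)
    )

# alpha = 3 has order 6 in GF(7), so O(alpha) > n-1 holds for n = 6.
H_good = vandermonde_parity_check(3, 3, 6, 7)
# alpha = 2 has order 3 in GF(7), so the condition fails for n = 6.
H_bad = vandermonde_parity_check(2, 3, 6, 7)
```

With α = 3 the powers α^0, ..., α^5 are distinct, so every 3×3 sub-matrix is an invertible Vandermonde matrix. With α = 2 columns 0 and 3 coincide, which illustrates why the order condition O(α) > n-1 is required.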


Reed Solomon Codes

• Advantages:
– Support the maximum number of disk failures
– Are very common in practice and have relatively efficient encoding/decoding schemes
• Disadvantages:
– Require working over large fields
  Solution: EVENODD Codes
– Need to read all the disks in order to recover even a single disk failure, so the rebuild is not efficient
  Solution: ZigZag Codes

EVENODD Codes

• Designed by Mario Blaum, Jim Brady, Jehoshua Bruck, and Jai Menon
• Goal: Construct array codes correcting 2 disk failures using only binary XOR operations
– No need for calculations over extension fields
• Code construction:
– Every disk is a column
– The array size is (m-1)×(m+2), where m is prime
– The last two columns are used for parity

EVENODD Codes

Information array (m = 5: 4 rows, 5 information columns):

0 1 1 0 1
0 0 1 1 0
0 0 0 1 1
1 1 0 1 0

The array extended to 7 columns, with an all-zero fifth row appended:

0 1 0 1 1 0 1
0 0 0 0 1 1 0
1 0 0 0 0 1 1
0 1 1 1 0 1 0
0 0 0 0 0 0 0
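An EVENODD encoder can be sketched with XOR operations only. The sketch below follows the standard published definition of the code (row parity, plus diagonal parity adjusted by the parity S of one special diagonal, using an imaginary all-zero row). It appends the two parity columns on the right, and its diagonal indexing convention is an assumption that may differ from the layout drawn on the slide. The function name is hypothetical.

```python
def evenodd_encode(data, m):
    """Encode an (m-1) x m bit array into an (m-1) x (m+2) EVENODD array.

    Column m holds the row parity, column m+1 the diagonal parity.
    m is assumed to be prime.
    """
    assert len(data) == m - 1 and all(len(row) == m for row in data)
    # Append the imaginary all-zero row m-1 to simplify diagonal indexing.
    a = [list(row) for row in data] + [[0] * m]

    # S: XOR of the cells (m-1-t, t), t = 1..m-1, i.e. the diagonal that
    # passes through the imaginary row.
    s = 0
    for t in range(1, m):
        s ^= a[m - 1 - t][t]

    encoded = []
    for l in range(m - 1):
        row_parity = 0
        diag_parity = s
        for t in range(m):
            row_parity ^= a[l][t]
            diag_parity ^= a[(l - t) % m][t]   # cells with (row + col) = l mod m
        encoded.append(list(data[l]) + [row_parity, diag_parity])
    return encoded

info = [
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 1, 0],
]
codeword = evenodd_encode(info, 5)
```

For the 4×5 information array of the example, this sketch produces the row parities 1, 0, 0, 1. Only XORs of bits are used, so no arithmetic over an extension field is needed.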

The Repair Problem

[Figure: data blocks 1–10 and parity blocks P1–P4]

• A disk is lost – Repair job starts

• Access, read, and transmit data of disks!

• Overuse of system resources during single repair

• Goal: Reduce repair cost in a single disk repair

• Facebook's storage scheme (RS code):
– 10 data blocks
– 4 parity blocks
– Can tolerate any four disk failures


ZigZag Codes

• Designed by Itzhak Tamo, Zhiying Wang, and Jehoshua Bruck

• The goal: construct codes correcting the max number of erasures and yet allow efficient reconstruction if only a single drive fails

ZigZag Codes

• Example

a   b   a+b   a+2d
c   d   c+d   c+b
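The example can be turned into a small runnable sketch. It reads the array as two rows and four columns (two information columns, a row-parity column, and a zigzag-parity column) and assumes arithmetic modulo 3, since the coefficient 2 needs a field with at least three elements. The field choice and all function names are assumptions made for illustration.

```python
Q = 3  # assumed field size: arithmetic is done modulo 3

def encode(a, b, c, d):
    """Return the four columns (drives) of the 2 x 4 array."""
    info1 = (a, c)
    info2 = (b, d)
    row_parity = ((a + b) % Q, (c + d) % Q)
    zigzag_parity = ((a + 2 * d) % Q, (c + b) % Q)
    return [info1, info2, row_parity, zigzag_parity]

def repair_info1(b, row_top, zig_bottom):
    """Rebuild (a, c) from b, a+b and c+b: 3 of the 6 surviving symbols."""
    return ((row_top - b) % Q, (zig_bottom - b) % Q)

def repair_info2(a, row_top, zig_top):
    """Rebuild (b, d) from a, a+b and a+2d: 3 of the 6 surviving symbols."""
    b = (row_top - a) % Q
    d = (2 * (zig_top - a)) % Q   # 2 is its own inverse modulo 3
    return (b, d)

cols = encode(1, 2, 0, 2)
rebuilt1 = repair_info1(cols[1][0], cols[2][0], cols[3][1])
rebuilt2 = repair_info2(cols[0][0], cols[2][0], cols[3][0])
```

In both repairs only one symbol is read from each of the three surviving drives, that is, half of the surviving data, because the symbol read from the information drive is reused by both parity equations.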

ZigZag Codes

• Lower bound: the minimum amount of data that must be read to recover a single drive failure
– (n,k) code: n drives, k information drives, and n-k redundancy drives
– M: size of a single drive in bits
• For an (n,n-2) code it is required to read at least 1/2 of the data on the remaining drives, that is, at least (1/2)(n-1)M bits
– The last example is optimal
• In general, for an (n,n-r) code it is required to read at least 1/r of the data on the remaining drives, that is, at least (1/r)(n-1)M bits
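The bound is easy to evaluate for concrete parameters. The sketch below compares it with a baseline that reads k whole drives to rebuild one drive; that baseline is an assumption made here for comparison, and the function names are hypothetical.

```python
from fractions import Fraction

def baseline_repair_read(n, k, M):
    """Bits read if a lost drive is rebuilt by reading k entire drives."""
    return k * M

def lower_bound_repair_read(n, k, M):
    """Lower bound (1/r)(n-1)M with r = n-k redundancy drives."""
    r = n - k
    return Fraction(n - 1, r) * M

# (4,2) code with M = 2 symbols per drive: the bound is 3 symbols,
# which is what the 2 x 4 example reads during a repair.
small = lower_bound_repair_read(4, 2, 2)

# 10 data blocks and 4 parity blocks, measured in units of M = 1 drive.
baseline = baseline_repair_read(14, 10, 1)
bound = lower_bound_repair_read(14, 10, 1)
```

For 10 data and 4 parity blocks the baseline reads 10 drives' worth of data, while the bound is 13/4 = 3.25 drives' worth, so an optimal-rebuild code could in principle cut the repair traffic by roughly a factor of three.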


ZigZag Codes

• Example

info 1   info 2   info 3   Row parity   ZigZag parity

0 2 1 0
1 3 0 1
2 0 3 2
3 1 2 3