![Page 1: The Pros and Cons of Erasure Coding & Replication …...The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms Abhijith Shenoy – Engineer, Hedvig](https://reader031.vdocument.in/reader031/viewer/2022011822/5ecb72020746fe0230439c86/html5/thumbnails/1.jpg)
The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms
Abhijith Shenoy – Engineer, Hedvig Inc. @hedviginc
The need for new architectures
• Business executives: business innovation
• Developers: time-to-market
• IT infrastructure / DevOps: flexible infrastructure
Modern apps need...
• Scale
• Flexibility
• Self-service
• Automation
To achieve this, the world is moving to a software-defined, distributed systems approach
Software-defined Storage
Software-defined storage
Commodity servers + software = software-defined storage
Common software-defined storage elements
• Proxy/client: provides storage access to the application compute environment via common protocols
• Storage software: forms an elastic storage cluster with commodity servers or cloud infrastructure
Modern software-defined storage architectures
Hyperconverged | Hyperscale
Spanning multiple DCs and clouds
Hyperconverged | Hyperscale
Data protection with SDS: RAID, Erasure Coding & Replication
Protecting stored data: RAID
• Redundant Array of Independent Disks
  – Divides or replicates data across multiple drives to deliver performance and fault tolerance
  – Commonly used: RAID 0, RAID 1, RAID 5, RAID 10
• Pros
  – Trusted protection solution in the traditional array world
  – Known performance delivery
• Cons
  – High-capacity drive (8TB+) rebuilds can take days or even weeks
  – RAID controllers add complexity for requisite performance
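To see why large-drive rebuilds stretch into days, a rough back-of-the-envelope sketch helps. The rebuild rates below are illustrative assumptions (the slides give no figures), and the helper name `rebuild_hours` is hypothetical; real rebuild speed depends on the controller, the RAID level, and how much production I/O competes with the rebuild.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite a whole drive at a sustained rebuild rate.

    Uses decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
    """
    if capacity_tb <= 0 or rebuild_mb_per_s <= 0:
        raise ValueError("capacity and rate must be positive")
    seconds = (capacity_tb * 1e12) / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


# Assumed rates: an idle array versus one throttled by production load.
for rate in (100, 30, 10):
    hours = rebuild_hours(8, rate)
    print(f"8 TB at {rate} MB/s: {hours:.1f} h ({hours / 24:.1f} days)")
```

Even at a steady 100 MB/s an 8 TB drive takes close to a day; once the rebuild is throttled to protect application traffic, multi-day rebuilds follow directly from the arithmetic.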
Protecting stored data: Erasure coding
• A parity-based protection technique
  – Data broken into fragments and encoded
  – Stored across different locations with a configurable number of redundant pieces
• Pros
  – Consumes less storage than replication – good for cheap/deep
  – Allows for the failure of two or more elements of a storage system
• Cons
  – Parity calculation is CPU-intensive
  – Increased latency can slow production writes and rebuilds
How erasure coding works
• Split a file into n chunks and code into m parity blocks
[Figure: file A is split into chunks X1 and X2, then encoded into blocks A1, A2, A3, and A4]
How erasure coding works
• Tolerate m erasures (failures)
A1 = X1
A2 = X2
A3 = X1 + X2
A4 = X1 + (2)X2
How erasure coding works
• In a distributed system, chunks are spread across nodes
• In this example, 2 nodes can fail and data can still be rebuilt
Node 1: A1 = X1
Node 2: A2 = X2
Node 3: A3 = X1 + X2
Node 4: A4 = X1 + (2)X2
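The four-node example can be sketched in a few lines. This is a toy illustration using plain integer arithmetic and the coefficients from the slide; the function names `encode` and `decode` are hypothetical, and production erasure codes (Reed-Solomon, for instance) do the same linear algebra over a finite field on byte blocks rather than on integers.

```python
from fractions import Fraction
from itertools import combinations

# Coefficient rows from the slide: block = a * X1 + b * X2
COEFFS = {
    "A1": (1, 0),  # A1 = X1
    "A2": (0, 1),  # A2 = X2
    "A3": (1, 1),  # A3 = X1 + X2
    "A4": (1, 2),  # A4 = X1 + (2)X2
}


def encode(x1: int, x2: int) -> dict[str, int]:
    """Turn two data chunks into four stored blocks, one per node."""
    return {name: a * x1 + b * x2 for name, (a, b) in COEFFS.items()}


def decode(survivors: dict[str, int]) -> tuple[int, int]:
    """Recover (X1, X2) from any two surviving blocks via Cramer's rule."""
    if len(survivors) < 2:
        raise ValueError("need at least two surviving blocks")
    (n1, v1), (n2, v2) = list(survivors.items())[:2]
    a1, b1 = COEFFS[n1]
    a2, b2 = COEFFS[n2]
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("surviving blocks are not independent")
    x1 = Fraction(v1 * b2 - v2 * b1, det)
    x2 = Fraction(a1 * v2 - a2 * v1, det)
    return int(x1), int(x2)


blocks = encode(7, 5)
# Fail every possible pair of nodes and rebuild from the two that remain.
for keep in combinations(blocks, 2):
    assert decode({name: blocks[name] for name in keep}) == (7, 5)
```

Every pair of rows in `COEFFS` is linearly independent, which is why any two of the four nodes are enough to rebuild the data.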
Erasure coding use case: Archival storage
• Goal
  – Need long-term storage of PBs of files
  – Minimizing storage costs critical to business profitability
• Solution
  – Software-defined storage + erasure coding
• Results
  – Store and protect archival data in 1.5x disk space
  – Performance adequate for workload
  – Rebuilds slower than desired, but capacity savings outweigh latency
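The 1.5x figure follows from the ratio of stored blocks to data blocks. The slide does not state which layout was used; a 4+2 layout is one assumption that produces 1.5x, and the function name `ec_overhead` is hypothetical.

```python
def ec_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Raw space consumed per unit of user data for an n+m erasure code."""
    if data_chunks <= 0 or parity_chunks < 0:
        raise ValueError("need at least one data chunk and no negative parity")
    return (data_chunks + parity_chunks) / data_chunks


# Assumed layouts for illustration; (2, 2) is the four-node example above.
for n, m in ((4, 2), (8, 4), (2, 2)):
    print(f"{n}+{m}: {ec_overhead(n, m):.2f}x raw space, tolerates {m} failures")
```

For comparison, keeping three full copies of the data costs 3x raw space while tolerating two failures, which is the capacity saving the use case trades rebuild speed for.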
Protecting stored data: Replication
• The creation of data copies across different locations of the storage system
  – Typically 2 or 3 copies, configurable based on accepted risk level
  – If a drive fails, data is recreated on another drive from replica(s)
• Pros
  – Less CPU intensive = faster write performance
  – Simple restores = faster rebuild performance
• Cons
  – Requires 2x or more the original storage space
How replication works with software-defined storage
• Data broken into chunks and n copies made across server nodes in a cluster
[Figure: app host; Node 1, Node 2, Node 3; Data Center 1, Data Center 2, Data Center 3]
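A minimal sketch of the chunk-and-copy idea follows. The round-robin placement and the names `split_chunks` and `place_replicas` are assumptions made for illustration; the slides do not describe the actual placement algorithm, and real systems also weigh rack and datacenter boundaries.

```python
def split_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Break data into fixed-size chunks; the last one may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(chunk_id: int, nodes: list[str], copies: int) -> list[str]:
    """Pick `copies` distinct nodes for a chunk, rotating the starting node."""
    if not 1 <= copies <= len(nodes):
        raise ValueError("copies must be between 1 and the number of nodes")
    start = chunk_id % len(nodes)
    return [nodes[(start + k) % len(nodes)] for k in range(copies)]


nodes = ["node1", "node2", "node3"]
chunks = split_chunks(b"abcdefgh", 3)
placement = {i: place_replicas(i, nodes, copies=2) for i in range(len(chunks))}

# With 2 copies, losing any single node still leaves one replica per chunk.
for failed in nodes:
    assert all(any(n != failed for n in holders) for holders in placement.values())
```

Because each replica is a full copy, a rebuild after a node failure is a plain read from a surviving holder and a write to a new node, with no decoding step.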
Offsetting replication overhead
• Compression
  – ~2:1 reduction
• Deduplication
  – ~5:1 or higher reduction
• Low disk cost
  – HDD and flash costs are declining
  – Overhead of replication more tolerable
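The offset can be put in numbers with a small sketch. The function name `raw_capacity_tb` and the 100 TB workload are hypothetical, and the reduction ratios are the approximate figures from the slide; actual ratios depend heavily on the data.

```python
def raw_capacity_tb(logical_tb: float, copies: int, reduction_ratio: float = 1.0) -> float:
    """Raw disk needed when data is reduced first and then replicated."""
    if copies < 1 or reduction_ratio <= 0:
        raise ValueError("copies must be >= 1 and reduction_ratio positive")
    return logical_tb * copies / reduction_ratio


logical = 100  # TB of application data (assumed)
print(raw_capacity_tb(logical, copies=3))                       # no data reduction
print(raw_capacity_tb(logical, copies=3, reduction_ratio=2.0))  # ~2:1 compression
print(raw_capacity_tb(logical, copies=3, reduction_ratio=5.0))  # ~5:1 deduplication
```

Under these assumptions, 3-way replication with 2:1 compression lands at the same 1.5x footprint as the erasure-coded archive, and higher deduplication ratios can bring the replicated footprint below the logical size.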
Doesn’t have to be one-size-fits-all
• Modern solutions provide per-volume choice
• Choose protection type based on workload
[Figure: per-volume settings: 2-6 copies; 512 bytes – 64k; Agnostic | Rack-aware | Datacenter-aware; Block (iSCSI) | NFS]
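Per-volume choice can be pictured as a small policy record validated against the ranges shown on the slide. Everything here is a hypothetical sketch: the class `VolumePolicy`, its field names, and the reading of "512 bytes – 64k" as a block-size range up to 64 KiB are assumptions, not the product's actual API.

```python
from dataclasses import dataclass

PLACEMENTS = {"agnostic", "rack-aware", "datacenter-aware"}
PROTOCOLS = {"block", "nfs"}  # "block" stands for Block (iSCSI)


@dataclass(frozen=True)
class VolumePolicy:
    """Hypothetical per-volume protection settings."""
    name: str
    copies: int        # replication factor, 2-6 per the slide
    block_size: int    # bytes; assumed meaning of "512 bytes – 64k"
    placement: str
    protocol: str

    def __post_init__(self) -> None:
        if not 2 <= self.copies <= 6:
            raise ValueError("copies must be between 2 and 6")
        if not 512 <= self.block_size <= 64 * 1024:
            raise ValueError("block_size must be between 512 and 65536 bytes")
        if self.placement not in PLACEMENTS:
            raise ValueError(f"unknown placement: {self.placement}")
        if self.protocol not in PROTOCOLS:
            raise ValueError(f"unknown protocol: {self.protocol}")


# Two volumes on the same cluster with different protection choices.
database = VolumePolicy("db01", copies=3, block_size=4096,
                        placement="datacenter-aware", protocol="block")
scratch = VolumePolicy("scratch", copies=2, block_size=65536,
                       placement="agnostic", protocol="nfs")
```

Keeping the policy on the volume rather than on the cluster is what lets a latency-sensitive database and a scratch share sit on the same hardware with different protection levels.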
Replication use case: Primary data storage
• Company: Large financial organization
• Situation
  – Hosting 500TB of data across four datacenters in two countries
  – Want maximum availability and recoverability
• Solution
  – Deployed software-defined storage with 4-way replication
• Results
  – Achieved high performance, high availability, and quick rebuilds
[Figure: Data Center A, Data Center B, Data Center C, Data Center D; sites labeled Active]
Summary
• Protection technologies are evolving along with architectures
• RAID has reached its limits with large-capacity drives
• Erasure coding is a good option for latency-tolerant, large-capacity stores
• Replication provides protection in environments with demanding performance and availability requirements
• Software-defined storage offers choice and flexibility to deploy each protection technology where it makes sense
Thank You!