PWL Denver: Copysets
![Page 1: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/1.jpg)
Copysets:
Reducing the Frequency of Data Loss in Cloud Storage
Aysylu Greenberg
Papers We Love Denver
April 27, 2017
![Page 2: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/2.jpg)
Welcome Papers We Love Denver!
![Page 3: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/3.jpg)
Aysylu Greenberg
@aysylu22
paperswelove.org
![Page 4: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/4.jpg)
![Page 5: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/5.jpg)
Today
• Random replication
• Copyset Replication
• Copyset Replication with Scatter Width
• Pragmatic aspects
![Page 6: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/6.jpg)
RANDOM REPLICATION
Overview & Tradeoffs
![Page 7: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/7.jpg)
Random Replication
R = 3, N = 9
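A minimal sketch of the placement pictured here, assuming each chunk's R replicas go to R distinct nodes picked uniformly at random (real systems usually constrain the choice, for example by rack). The name `random_replication`, the chunk count, and the seed are illustrative and not from the talk.

```python
import random
from math import comb

def random_replication(num_chunks, nodes, r, seed=0):
    """Place each chunk on r distinct nodes chosen uniformly at random."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(nodes, r)) for _ in range(num_chunks)]

N, R = 9, 3
nodes = list(range(1, N + 1))
placements = random_replication(num_chunks=1000, nodes=nodes, r=R)

# Every distinct group of nodes that holds all replicas of some chunk.
distinct_replica_sets = set(placements)
print(len(distinct_replica_sets), "distinct replica sets out of", comb(N, R), "possible")
```

With many chunks, the number of distinct replica sets tends toward the number of possible R-node combinations, which is why almost any simultaneous failure of R nodes loses some chunk.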
![Page 8: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/8.jpg)
Random Replication: Correlated Failures
![Page 9: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/9.jpg)
Recovery from Data Loss
Fixed cost of restoring lost data is high
Lose more data but less often
Increase in R is expensive
![Page 10: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/10.jpg)
Random Replication: Tradeoff
{small amount & high frequency} data loss
{large amount & low frequency} data loss
![Page 11: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/11.jpg)
COPYSET REPLICATION
Intuition
![Page 12: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/12.jpg)
Copyset Replication
R = 3, N = 9
![Page 13: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/13.jpg)
Copyset Replication
R = 3, N = 9
S = 2
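With R = 3 and S = 2, each node shares data with exactly R - 1 other nodes, so a single ordering of the nodes cut into groups of R is enough. The sketch below assumes the identity ordering 1..9 for readability (a random permutation would serve equally well); `split_into_copysets` and `scatter_width` are hypothetical helper names.

```python
def split_into_copysets(permutation, r):
    """Cut a node ordering into consecutive, non-overlapping groups of r nodes."""
    usable = len(permutation) - len(permutation) % r  # ignore leftover nodes
    return [frozenset(permutation[i:i + r]) for i in range(0, usable, r)]

def scatter_width(node, copysets):
    """Number of other nodes that share at least one copyset with `node`."""
    peers = set()
    for copyset in copysets:
        if node in copyset:
            peers |= copyset
    peers.discard(node)
    return len(peers)

N, R = 9, 3
nodes = list(range(1, N + 1))
copysets = split_into_copysets(nodes, R)
widths = {node: scatter_width(node, copysets) for node in nodes}
print(sorted(sorted(cs) for cs in copysets))
print(widths)
```

Every chunk is then stored entirely within one of these N / R groups, so data is lost only if all nodes of one group fail together.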
![Page 14: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/14.jpg)
Recovery from Node Failure
Simpler recovery than random replication:
R – 1 nodes with data
Higher load on small number of nodes
![Page 15: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/15.jpg)
Copyset Replication with S = 2: Tradeoff
{small amount & high frequency} data loss
{large amount & low frequency} data loss
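One way to put numbers on this tradeoff: if exactly R nodes fail at once and every copyset holds at least one chunk, data is lost only when the failed nodes form a copyset. Assuming every R-node combination is equally likely to fail, that chance is the number of copysets divided by C(N, R). The function name `loss_probability` is illustrative.

```python
from math import comb

def loss_probability(num_copysets, n, r):
    """Chance that r simultaneously failed nodes (picked uniformly) form a copyset."""
    return num_copysets / comb(n, r)

N, R = 9, 3
# Random replication with many chunks tends to use every possible R-node set.
p_random = loss_probability(comb(N, R), N, R)
# Copyset Replication with S = 2 uses N / R disjoint copysets.
p_copyset = loss_probability(N // R, N, R)
print(f"random: {p_random:.3f}, copysets (S = 2): {p_copyset:.3f}")
```

Fewer copysets make a loss event rarer, but each copyset then holds a larger share of the data, so each loss event is bigger.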
![Page 16: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/16.jpg)
SCATTER WIDTH
Tuning choices
![Page 17: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/17.jpg)
Copyset Replication with S = 2
R = 3, N = 9
![Page 18: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/18.jpg)
Copyset Replication with S = 4
R = 3, N = 9
![Page 19: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/19.jpg)
Copyset Replication with S = 4
1 2 3
6 5 4
7 8 9
R = 3, N = 9
![Page 20: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/20.jpg)
1 2 3
6 5 4
7 8 9
1 2 3
6 5 4
7 8 9
Copyset Replication with S = 4: Permutation Phase
![Page 21: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/21.jpg)
1 2 3
6 5 4
7 8 9
Copyset Replication with S = 4: Permutation Phase
![Page 22: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/22.jpg)
1 2 3
6 5 4
7 8 9
Copyset Replication with S = 4: Permutation Phase
![Page 23: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/23.jpg)
1 2 3
6 5 4
7 8 9
Copyset Replication with S = 4: Permutation Phase
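The permutation phase can be sketched as follows, assuming the scheme from the Copysets paper: draw ceil(S / (R - 1)) random permutations of the nodes and cut each into consecutive groups of R. The helper names and the seed are illustrative. The rows-and-columns pair at the end is a hand-picked example of two permutations of a 3x3 grid, not necessarily the one drawn on the slides.

```python
import random
from math import ceil

def permutation_phase(nodes, r, s, seed=0):
    """Build copysets from ceil(s / (r - 1)) random permutations of the nodes."""
    rng = random.Random(seed)
    num_permutations = ceil(s / (r - 1))
    usable = len(nodes) - len(nodes) % r  # ignore leftover nodes
    copysets = []
    for _ in range(num_permutations):
        perm = nodes[:]
        rng.shuffle(perm)
        copysets += [frozenset(perm[i:i + r]) for i in range(0, usable, r)]
    return copysets

def scatter_width(node, copysets):
    """Number of other nodes that share at least one copyset with `node`."""
    peers = set().union(*(cs for cs in copysets if node in cs))
    peers.discard(node)
    return len(peers)

N, R, S = 9, 3, 4
nodes = list(range(1, N + 1))
copysets = permutation_phase(nodes, R, S)
widths = {n: scatter_width(n, copysets) for n in nodes}

# Hand-picked pair of permutations: rows and columns of a 3x3 grid.
rows = [frozenset(g) for g in ([1, 2, 3], [4, 5, 6], [7, 8, 9])]
cols = [frozenset(g) for g in ([1, 4, 7], [2, 5, 8], [3, 6, 9])]
grid_widths = {n: scatter_width(n, rows + cols) for n in nodes}
```

Random permutations may overlap, so a node's scatter width can come out below S; the rows-and-columns choice reaches exactly S = 4 for every node because any row and any column share only one node.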
![Page 24: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/24.jpg)
Tuning Scatter Width
Set by system designer to control parallelism of data recovery
Control load on each individual node during recovery
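A rough sketch of the arithmetic behind this choice, assuming a failed node's data is spread evenly across its S scatter-width peers and each peer serves its share in parallel. The function name `recovery_plan` and the 1000 GB figure are hypothetical.

```python
def recovery_plan(data_on_failed_node_gb, scatter_width):
    """Split recovery of a failed node's data evenly across its scatter-width peers."""
    per_peer_gb = data_on_failed_node_gb / scatter_width
    return {"parallel_sources": scatter_width, "gb_read_per_source": per_peer_gb}

narrow = recovery_plan(1000, scatter_width=2)  # few sources, heavy load on each
wide = recovery_plan(1000, scatter_width=4)    # more sources, lighter load on each
print(narrow)
print(wide)
```

A larger S spreads recovery over more nodes, at the cost of more copysets and therefore more frequent data loss.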
![Page 25: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/25.jpg)
Copyset Replication Scatter Width: Tradeoffs
{small amount & high frequency} data loss
{large amount & low frequency} data loss
![Page 26: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/26.jpg)
Scatter Width: Tuning Choices
Random replication: scatter width of N-1, lots of replica sets
![Page 27: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/27.jpg)
Scatter Width: Tuning Choices
Random replication: scatter width of N-1, lots of replica sets
S << N
![Page 28: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/28.jpg)
Scatter Width: Tuning Choices
Random replication: scatter width of N-1, lots of replica sets
S << N
To reduce frequency of data loss, minimize:
![Page 29: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/29.jpg)
FROM IDEAS TO PRACTICE
Pragmatic aspects
![Page 30: PWL Denver: Copysets](https://reader031.vdocument.in/reader031/viewer/2022022415/5a666b3a7f8b9a44398b4bd9/html5/thumbnails/30.jpg)
Pragmatic Aspects
• Move randomization to permutation stage
• Low overhead on operations
• Near optimal and fast
• Support for dynamic systems while maintaining guarantees is tricky -> chainsets (http://hackingdistributed.com/2014/02/14/chainsets/)
• Tiered replication: https://www.usenix.org/conference/atc15/technical-session/presentation/cidon