![Page 1: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/1.jpg)
Depot: Cloud Storage with minimal Trust
COSC 7388 – Advanced Distributed Computing
Presentation By Sushil Joshi
![Page 2: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/2.jpg)
Agenda
• Introduction
• Typical Key Value Store
• Fork-Join-Causality Consistency
• Architecture of Depot
• Basic Protocol
• Properties Provided by Depot
• Experimental Evaluation
![Page 3: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/3.jpg)
Introduction
• Cloud storage system that minimizes trust
• Cloud storage service providers (SSPs) are fault-prone (software bugs, malicious insiders, operator errors, natural disasters)
• Depot eliminates trust for safety
• Depot minimizes trust for liveness and availability
![Page 4: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/4.jpg)
Typical Key-Value Storage
• GET and PUT API available to customers.
• Most services store and retrieve data by primary key only
• Not implemented on an RDBMS, since typical use cases do not require the complex querying and management facilities an RDBMS provides
• Excess functionality requires extra hardware and extra manpower
• An RDBMS chooses consistency over availability
• Partitioning schemes cannot be used for load balancing in an RDBMS
![Page 5: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/5.jpg)
Consistency Vs Availability
• Strong consistency and high data availability cannot be obtained together.
• Availability can be achieved by replicating data and allowing concurrent write operations.
• This leads to conflicting changes that need to be resolved.
• The problem that arises: when to resolve those conflicts, and who resolves them.
• Eventually consistent: all replicas eventually receive all updates.
![Page 6: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/6.jpg)
Version Evolution of an Object
Vector clocks for version reconciliation. Sx, Sy, Sz are replicas of data storage.
• D1 [Sx, 1]: write handled by Sx
• D2 [Sx, 2]: write handled by Sx
• D3 [Sx, 2][Sy, 1]: write handled by Sy
• D4 [Sx, 2][Sz, 1]: write handled by Sz
• D5 [Sx, 3][Sy, 1][Sz, 1]: write handled by Sx
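The version history on this slide can be replayed with a few lines of Python. This is only a sketch: the function names (`increment`, `descends`, `concurrent`, `merge`) are made up for illustration and are not taken from Dynamo or Depot. It assumes D5 is produced by Sx after merging the clocks of D3 and D4, which is what the D5 clock on the slide suggests.

```python
from typing import Dict

VectorClock = Dict[str, int]  # replica name -> counter


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen everything `b` has (a >= b for every entry of b)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """Two versions conflict when neither descends from the other."""
    return not descends(a, b) and not descends(b, a)


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Entry-wise maximum, used when reconciling two versions."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}


# Replay of the slide: D1 and D2 are written at Sx, D3 at Sy, D4 at Sz.
d1 = increment({}, "Sx")
d2 = increment(d1, "Sx")
d3 = increment(d2, "Sy")
d4 = increment(d2, "Sz")
# D3 and D4 are concurrent, so Sx merges their clocks and then writes D5.
d5 = increment(merge(d3, d4), "Sx")
```

Here `concurrent(d3, d4)` is true, which is the conflict that has to be reconciled, while `d5` descends from both D3 and D4.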
![Page 7: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/7.jpg)
Gossip-based Protocol
• Each peer chooses a random peer every second for a gossip exchange.
• Used to propagate membership changes
• Mappings stored at different nodes are reconciled during the same gossip exchange.
• Partitioning and placement information also propagates via the gossip-based protocol.
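A toy simulation of such a gossip exchange is sketched below. Everything in it is an assumption for illustration: the view layout (node name mapped to a version number and a status), the "higher version wins" reconciliation rule, and the names `reconcile` and `gossip_round`. The one-second timer is replaced by a simple loop over rounds.

```python
import random
from typing import Dict, Tuple

# Hypothetical membership view: node name -> (version, status)
View = Dict[str, Tuple[int, str]]


def reconcile(a: View, b: View) -> View:
    """Merge two views, keeping the entry with the higher version per node."""
    merged = dict(a)
    for node, entry in b.items():
        if node not in merged or entry[0] > merged[node][0]:
            merged[node] = entry
    return merged


def gossip_round(views: Dict[str, View], rng: random.Random) -> None:
    """Each peer picks one random partner; both end up with the merged view."""
    for peer in list(views):
        partner = rng.choice([p for p in views if p != peer])
        merged = reconcile(views[peer], views[partner])
        views[peer] = dict(merged)
        views[partner] = dict(merged)


def converged(views: Dict[str, View]) -> bool:
    first = next(iter(views.values()))
    return all(view == first for view in views.values())


# Four peers share a base view; only A has learned that node E joined.
base: View = {name: (1, "up") for name in "ABCD"}
views = {name: dict(base) for name in "ABCD"}
views["A"]["E"] = (1, "joined")

rng = random.Random(0)
for _ in range(100):  # generous cap; a handful of rounds is normally enough
    if converged(views):
        break
    gossip_round(views, rng)
```

After the loop every peer's view contains the entry for E, even though only A knew about it initially.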
![Page 8: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/8.jpg)
Fork-Join-Causality (FJC) Consistency
• Definition 1 An observer graph is an execution and an edge assignment
• Definition 2 An execution is a set of read and write vertices
• Read Vertex = (n, s, oID, val) tuple
• Write Vertex = (n, s, oID, wl) tuple
• Definition 3 An edge assignment for an execution is a set of directed edges connecting vertices of the execution.
• Definition 4 A consistency check for consistency semantics C is a set of conditions that an observer graph must satisfy to be called consistent with respect to C
• Definition 5 An execution alpha is C-consistent iff there exists an edge assignment for alpha such that the resulting observer graph satisfies C's consistency check
• Definition 6 Vertex u precedes vertex v in observer graph G if there is a directed path from u to v in G. If u does not precede v and v does not precede u, then u and v are concurrent.
• Definition 7 An operation u is said to be observed by a correct node p in G if either p executes u or p executes an operation v such that u precedes v.
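Definitions 6 and 7 translate fairly directly into graph reachability. The sketch below is illustrative only: the class `ObserverGraph` and its method names are invented here, and it assumes that the first field `n` of a vertex tuple identifies the node that executed the operation and that `s` is a per-node sequence number.

```python
from typing import Dict, Set, Tuple

# Vertex as on the slide: (n, s, oID, val). Reading n as the executing node
# and s as a sequence number is an assumption of this sketch.
Vertex = Tuple[str, int, str, str]


class ObserverGraph:
    def __init__(self) -> None:
        self.vertices: Set[Vertex] = set()
        self.edges: Dict[Vertex, Set[Vertex]] = {}

    def add_edge(self, u: Vertex, v: Vertex) -> None:
        self.vertices.update((u, v))
        self.edges.setdefault(u, set()).add(v)

    def precedes(self, u: Vertex, v: Vertex) -> bool:
        """Definition 6: there is a directed path from u to v."""
        stack, seen = list(self.edges.get(u, ())), set()
        while stack:
            w = stack.pop()
            if w == v:
                return True
            if w not in seen:
                seen.add(w)
                stack.extend(self.edges.get(w, ()))
        return False

    def concurrent(self, u: Vertex, v: Vertex) -> bool:
        """Definition 6: neither vertex precedes the other."""
        return u != v and not self.precedes(u, v) and not self.precedes(v, u)

    def observed_by(self, node: str, u: Vertex) -> bool:
        """Definition 7: node executed u, or executed some v that u precedes."""
        return u[0] == node or any(
            v[0] == node and self.precedes(u, v) for v in self.vertices
        )


# Tiny example: p1 writes A and later reads it; p2 writes A independently.
w1 = ("p1", 1, "A", "x")
r1 = ("p1", 2, "A", "x")
w2 = ("p2", 1, "A", "y")
g = ObserverGraph()
g.add_edge(w1, r1)
g.vertices.add(w2)
```

In this example `w1` precedes `r1`, while `w1` and `w2` are concurrent, and `w2` is not observed by `p1`.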
![Page 9: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/9.jpg)
Fork-Join-Causality (FJC) Consistency
(a) An execution with a faulty node p2 and (b) an observer graph that is FJC and FCC.
![Page 10: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/10.jpg)
Fork-Join-Causality (FJC) Consistency
• An execution is FJC consistent if the following hold in an observer graph G
– Serial ordering at each correct node
– Reads by correct nodes return the latest preceding concurrent writes.
• The observer graph in (b) is both FJC and FCC consistent because FJC and FCC do not require a total ordering of p2's operations.
![Page 11: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/11.jpg)
Architecture of Depot
Arrows between servers indicate replication and exchange.
![Page 12: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/12.jpg)
Basic Protocol
• Exchange an “update” with other servers whenever a key's value is updated
• Format: dVV, {key, H(value), logicalClock@nodeId, H(history)}, signed by the node
• logicalClock is advanced on every update at nodeId and on every successful update received from a peer (advanced to exceed the peer's value).
• H(value): collision-resistant hash of the value, sent instead of the whole value
• H(history): collision-resistant hash of the most recent update by each node known to the writer at the instant it issues the update.
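The update format above can be approximated in a few lines of Python. This is a simplified sketch, not Depot's implementation: the `Node` class, its field names, and the JSON encoding are assumptions; the dVV field is omitted; SHA-256 stands in for the collision-resistant hash; and an HMAC with a per-node secret stands in for the node's signature, whereas a real system would use public-key signatures so that receivers do not need the sender's secret.

```python
import hashlib
import hmac
import json
from typing import Dict


def H(data: bytes) -> str:
    """Collision-resistant hash (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def encode(body: Dict[str, str]) -> bytes:
    return json.dumps(body, sort_keys=True).encode()


class Node:
    def __init__(self, node_id: str, secret: bytes) -> None:
        self.node_id = node_id
        self.secret = secret              # HMAC key standing in for a signing key
        self.clock = 0                    # logical clock
        self.latest: Dict[str, str] = {}  # node id -> hash of its latest known update

    def put(self, key: str, value: bytes) -> dict:
        """Build a signed update for a local write."""
        self.clock += 1                   # advance on every local update
        body = {
            "key": key,
            "value_hash": H(value),       # the hash travels, not the value itself
            "clock": f"{self.clock}@{self.node_id}",
            "history_hash": H(encode(self.latest)),
        }
        sig = hmac.new(self.secret, encode(body), hashlib.sha256).hexdigest()
        self.latest[self.node_id] = H(encode(body))
        return {"body": body, "sig": sig}

    def receive(self, update: dict, sender_secret: bytes) -> bool:
        """Verify a peer's update; on success, move the clock past the peer's."""
        payload = encode(update["body"])
        expected = hmac.new(sender_secret, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, update["sig"]):
            return False                  # reject tampered or forged updates
        peer_clock, peer_id = update["body"]["clock"].rsplit("@", 1)
        self.clock = max(self.clock, int(peer_clock)) + 1
        self.latest[peer_id] = H(payload)
        return True


n = Node("N", b"secret-N")
m = Node("M", b"secret-M")
u = n.put("A", b"one")                    # clock field is "1@N"
accepted = m.receive(u, b"secret-N")      # M's clock moves past N's
u2 = m.put("A", b"four")
```

Because the history hash covers the latest update known from every node, an update issued by M after receiving N's update commits M to having seen it.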
![Page 13: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/13.jpg)
Example of a Series of Writes
N M P Q R
W0 (N, 1, A, “one”) dVV=NA, {A, Hv, 1@N, Hh}
Put A, “one”, 1@N, VV[N] = 1 -- -- --
W1 (N, 2, A, “two”) dVV[N]=1, {A, Hv, 2@N, Hh}
Put A, “two”, 2@N, VV[N] = 2 -- -- --
W2 (N, 3, A, “three”) dVV[N]=2, {A, Hv, 3@N, Hh}
Put A, “three”, 3@N, VV[N] = 3 -- -- --
W3(M, 4, A, “four”), VV[N]=3, VV[M]=4 dVV[M]=4, {A, Hv, 4@M, Hh}
-- -- --
Not accepted N N N
W5(N, 5, A, “five”) dVV[N]=4,{A, Hv, 5@N, Hh}
Put A, “five”, 5@N, VV[N]=5, VV[M]=4 -- -- --
![Page 14: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/14.jpg)
At the End of W5
(Diagram of updates W0, W1, W2, W3, and W5.)
![Page 15: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/15.jpg)
Properties Provided by Depot
• Fork-Join-Causal Consistency
• Eventual Consistency
• Availability and Durability
• Bounded Staleness
• Integrity and Authorization
• Data Recovery
• Evicting Faulty Nodes
![Page 16: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/16.jpg)
Baseline Variants for Experimental Evaluation
Baseline variants used for comparison with Depot
![Page 17: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/17.jpg)
Experimental Evaluation
Mean and standard deviation for GETs and PUTs of various object sizes in Depot and four baseline variants
![Page 18: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/18.jpg)
Experimental Evaluation
99th Percentile for GETs and PUTs of various object sizes in Depot and four baseline variants
![Page 19: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/19.jpg)
Per Request Average Resource Use
Baseline (B), B+Hash (H), B+H+Sig (S), B+H+S+Store (St), and Depot (D) in 100/0 (GET) and 0/100 (PUT) workloads with 10KB objects.
![Page 20: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/20.jpg)
Per Request Average Resource Use
The labels indicate the absolute per-request averages. (C) and (S) indicate resource use at clients and servers, respectively.
![Page 21: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/21.jpg)
Per Request Average Resource Use
(C-S) and (S-S) are client-server and server-server network use, respectively. For storage costs, we report the cost of storing a version of an object.
![Page 22: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/22.jpg)
Evaluated Dollar Cost
Dollar cost to GET 1TB of data, PUT 1TB of data, or store 1TB of data for 1 month. Each object has a small key and a 10KB value. 1TB of PUTs or GETs corresponds to 10^8 operations, and 1TB of storage corresponds to 10^8 objects.
![Page 23: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/23.jpg)
Effect of Total Server Failure
The effect of total server failure (t=300s) on staleness
![Page 24: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/24.jpg)
Effect of Total Server Failure
The effect of total server failure (t=300s) on GET Latency
![Page 25: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/25.jpg)
Questions
?
![Page 26: Depot: Cloud Storage with minimal Trust](https://reader034.vdocument.in/reader034/viewer/2022051402/5681584b550346895dc5a377/html5/thumbnails/26.jpg)
References
[1] Prince Mahajan, Srinath Setty, Sangmin Lee, Allen Clement, Lorenzo Alvisi, Mike Dahlin, and Michael Walfish. Depot: Cloud Storage with Minimal Trust (extended version).
[2] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s Highly Available Key-value Store.