SQUARE: Scalable Quorum-based Atomic Memory with Local Reconfiguration
Vincent Gramoli, Emmanuelle Anceaume, Antonino Virgillito
ACM SAC’07, March 14th
V. Gramoli, E. Anceaume, A. Virgillito
Context and Motivations
Distributed systems become large-scale, dynamic, and unpredictable.
Challenges in Distributed Shared Memory: atomic consistency and load support.
Distributed Shared Memory (DSM)
Atomic Consistency
  Object composition: we focus on a single object
  Read operations return the last value written
Replicated Object
  A replica is a node maintaining the value of the object
  The memory is the set of replicas
Read/Write Operations [ABD95]
  Any client can read and modify (write) the object
  To do so, it contacts quorums (a.k.a. mutually intersecting sets) of replicas
Existing DSM solutions
Lack of Independence
  A non-terminating operation may block others indefinitely
Lack of Scalability
  Clients keep track of all memory replicas (replacing a replica is complex)
Lack of Adaptiveness
  Underloaded memory (unused resources)
  Overloaded memory (bursts of load)
System Model
Object replicated on failure-prone nodes
  The replicas r1, …, rk share a 2-dimensional coordinate space
[Figure: replicas r1, …, rk laid out in the coordinate space]
System Model
Unreliable communication through neighborhood
  Each replica ri can communicate only with its nearest neighbors
[Figure: replica ri and its nearest neighbors]
System Model
Topology takeover mechanism (CAN [RFH+01])
  Upon node failure/departure, the space sharing is modified accordingly
  If a node ri fails, a takeover node rj replaces it
[Figure: takeover node rj replacing failed node ri]
Introducing our Dynamic Quorums
Dual-type Dynamic Quorums
  Vertical quorum: all replicas responsible for an abscissa x
  Horizontal quorum: all replicas responsible for an ordinate y
Intersection for the atomicity requirement
  Values are propagated at a vertical quorum and consulted at a horizontal quorum
  Thus, every consultation obtains the most recently propagated value
For any horizontal quorum H and any vertical quorum V:
  $H \cap V \neq \emptyset$
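The intersection property can be checked on a toy model. The following is a minimal sketch, assuming a regular grid with exactly one replica per cell (a simplification: in the actual overlay the space is shared unevenly among replicas). The function names and the grid dimensions are hypothetical and chosen only for illustration.

```python
from itertools import product


def vertical_quorum(x, height):
    """All replicas responsible for abscissa x (one grid column)."""
    return {(x, y) for y in range(height)}


def horizontal_quorum(y, width):
    """All replicas responsible for ordinate y (one grid row)."""
    return {(x, y) for x in range(width)}


# Assumed toy dimensions: 4 columns, 3 rows.
width, height = 4, 3

# Every row meets every column in exactly one cell, so H and V always intersect.
for x, y in product(range(width), range(height)):
    common = horizontal_quorum(y, width) & vertical_quorum(x, height)
    assert common == {(x, y)}
```

In this simplified grid each intersection is a single replica; that replica is the one that carries a propagated value from a vertical quorum to any later consultation of a horizontal quorum.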
SQUARE features
Atomicity and Independence
  Atomic operations are independent from each other
Local Knowledge
  Reactive quorum access: wrapping around the torus
Fast Adaptive Read Operations
  Single-phase operations: accessing a single horizontal quorum is sufficient
Memory Adaptiveness
  If overloaded (global approximation), then expand
  If underloaded (local observation), then shrink
Operation Execution
Basic Read Operation:
  1) Get the up-to-date value,
  2) Propagate this value on a vertical quorum.
Basic Write Operation:
  1) Get the up-to-date value,
  2) Propagate the value to write (and a higher version number) twice on the same vertical quorum.
Fast Adaptive Read Operation:
  1) Get the up-to-date value once on a single horizontal quorum.
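The three operations above can be sketched on a toy in-memory grid. This is an illustrative sketch only, not the SQUARE implementation: it assumes one replica per grid cell, sequential execution without failures or message passing, and a single propagation pass where the slide describes two. The class `GridMemory` and its method names are hypothetical.

```python
class GridMemory:
    """Toy single-object memory: one replica per cell of a width x height grid."""

    def __init__(self, width, height, initial=None):
        self.width, self.height = width, height
        # Each replica stores a (version, value) pair.
        self.cells = {(x, y): (0, initial)
                      for x in range(width) for y in range(height)}

    def consult(self, y):
        """Return the highest-versioned pair found on horizontal quorum (row) y."""
        return max((self.cells[(x, y)] for x in range(self.width)),
                   key=lambda pair: pair[0])

    def propagate(self, x, version, value):
        """Install the pair on vertical quorum (column) x where it is newer."""
        for y in range(self.height):
            if self.cells[(x, y)][0] < version:
                self.cells[(x, y)] = (version, value)

    def write(self, x, y, value):
        version, _ = self.consult(y)            # 1) get the up-to-date version
        self.propagate(x, version + 1, value)   # 2) propagate with a higher version

    def read(self, x, y):
        version, value = self.consult(y)        # 1) get the up-to-date value
        self.propagate(x, version, value)       # 2) propagate it on a column
        return value

    def fast_read(self, y):
        """Single phase: consult one horizontal quorum only."""
        return self.consult(y)[1]


memory = GridMemory(width=4, height=3)
memory.write(x=0, y=0, value="a")
print(memory.read(x=2, y=1))
```

Because every row crosses the column that received the last write, any consultation sees that write regardless of which row it uses. A real deployment would also need a tie-breaker between concurrent writers (for example a writer identifier alongside the version), which this sketch omits.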
Adjustment of the overlay size
SQUARE thwarts if the requested replica is overloaded: other replicas on its diagonal are contacted in turn until a non-overloaded one is found.
SQUARE expands if all contacted replicas are overloaded: a node outside the memory is added, and the object value is replicated at this node.
SQUARE shrinks if a replica becomes underloaded: the replica simply leaves the memory after notifying its neighbors.
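The thwart-then-expand decision can be sketched as a walk along a diagonal. This is a rough sketch under several assumptions that the slides do not specify: loads are plain numbers held in a grid, "overloaded" means exceeding a fixed `threshold`, and the diagonal is followed by stepping (+1, +1) with wrap-around. The function name `thwart` and all parameters are hypothetical.

```python
def thwart(loads, start, threshold):
    """Walk the diagonal from `start`, wrapping around the torus.

    Returns the coordinates of the first replica whose load is at most
    `threshold`, or None if every contacted replica is overloaded.
    `loads[y][x]` is the (assumed) load of the replica at column x, row y.
    """
    width, height = len(loads[0]), len(loads)
    x, y = start
    seen = set()
    while (x, y) not in seen:
        if loads[y][x] <= threshold:
            return (x, y)
        seen.add((x, y))
        # Step to the next replica on the diagonal (assumed direction).
        x, y = (x + 1) % width, (y + 1) % height
    return None  # all contacted replicas are overloaded: the memory should expand


loads = [[9, 1],
         [9, 2]]
print(thwart(loads, start=(0, 0), threshold=5))
```

A `None` result corresponds to the expansion case: a caller would then add a node from outside the memory and replicate the object value at it.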
Simulation Results
Self-Adaptiveness
Simulation Results
Load-Balancing
Conclusion
Atomic consistency is guaranteed
  Using dynamic quorum intersection,
  Each failed/leaving participant is replaced to ensure quorum availability.
Adaptiveness makes the algorithm tunable
  Minimizing operation latency as much as possible,
  Maximizing the capability to support bursts of load.
Perspective on operation speed-up
  Kleinberg’s model to route in polylog(q) hops
Some References
[CGG+05] Reconfigurable Distributed Storage for Dynamic Networks. G. Chockler, S. Gilbert, V. Gramoli, P. M. Musial, and A. A. Shvartsman. In Proc. of the 9th Int’l Conf. on Principles of Distributed Systems (OPODIS’05), 2005.
[AGGV05] P2P Architecture for Self-* Atomic Memory. E. Anceaume, M. Gradinariu, V. Gramoli, and A. Virgillito. In Proc. of the 8th Int’l Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN’05), 214–219, 2005.
[RFH+01] A Scalable Content-Addressable Network. S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. In Proc. of ACM SIGCOMM, 161–172, 2001.
[ABD95] Sharing Memory Robustly in Message-Passing Systems. H. Attiya, A. Bar-Noy, and D. Dolev. Journal of the ACM, 42(1):124–142, 1995.
Simulation Results
Operation Latency
Request rate   Read latency   Write latency   Max. memory size   Max. hor. quorum size   Max. vert. quorum size
1/250          478.6          733.3           10                 5                       6
1/200          621.8          812.5           14                 4                       8
1/100          1131.8         1395.8          24                 3                       14
1/50           1500.7         2173.5          46                 8                       23
1/25           2407.9         3500.9          98                 11                      51
Simulation Results
Fault-tolerance
Simulation Results
Scalability