S-Paxos: Eliminating the Leader Bottleneck
Martin Biely, Zarko Milosevic, Nuno Santos, André Schiper. Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland. October 9, 2012.
S-Paxos: Eliminating the Leader Bottleneck
Martin Biely, Zarko Milosevic, Nuno Santos, André Schiper
Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
October 9, 2012
Context: State Machine Replication
• Consistency among replicas ensured by
  • Deterministic service
  • Same initial state
  • Same sequence of requests
• System model
  • Partially synchronous
  • Crash stop (max crashes)
Nuno Santos Context and Motivation
[Figure: Clients; Replicated Service (Service, Service, Service); Ordering protocol (Paxos)]
The Paxos Protocol
• Paxos is a leader-based protocol
  • A distinguished process (leader) coordinates the others (followers)
• Observation: the leader receives and sends more messages than the followers
  • Potential system bottleneck…
Paxos Performance
Experimental settings
• JPaxos: implementation of Paxos in Java (protocol shown previously)
• n=3, request size=20 bytes, CPU 2x2 cores @ 2.2 GHz
The bottleneck in Paxos is typically the leader
Paxos is Leader-centric
• Leader-centric protocol
  • The leader does considerably more work than the followers
  • Therefore, the leader is prone to being the system bottleneck
• Paxos and most leader-based protocols are also leader-centric
Leader-based vs Leader-centric
• Note that leader-based ≠ leader-centric
  • Leader-based: algorithmic concept, the leader is a distinguished process
  • Leader-centric: resource usage, the leader is a bottleneck
Question: must leader-based protocols like Paxos also be leader-centric?
S-PAXOS OVERVIEW
Leader-based but not leader-centric
Why Paxos is Leader-centric
• Leader does the following
  • Receives requests from clients
  • Coordinates protocol to order requests
  • Replies to clients
• Followers do much less
  • Receive client requests from leader
  • Acknowledge order proposed by leader
• Underlying problem: unbalanced resource utilization
  • Leader runs out of resources (CPU, network bandwidth)
  • While followers are lightly loaded
Nuno Santos S-Paxos Overview
S-Paxos: A Balanced Paxos Variant
• S-Paxos balances workload across replicas
  • Leader and followers have similar resource usage
  • The full resources of all replicas become available to the ordering protocol
• S-Paxos is leader-based but not leader-centric
• Combines several well-known ideas in a novel way
  • All replicas handle client communication
  • All replicas disseminate requests
  • Ordering done on IDs
S-Paxos key ideas
Distribute client communication
• Commonly used in practice
  • For instance, ZooKeeper
• But by itself, still leader-centric
  • Leader runs the ordering protocol on requests (Phase 2a messages of Paxos)
  • Followers have to forward requests to the leader
  • And hence, the leader sends the requests to the other followers
All replicas handle client communication
S-Paxos key ideas
Distribute request dissemination
• Note that Phase 2a messages have a dual purpose
  • Dissemination of requests
  • Establishing order
• All replicas disseminate requests
• Ordering performed on IDs
S-Paxos separates dissemination from ordering
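The effect of this separation on the leader can be illustrated with a rough back-of-the-envelope model. This is a sketch that is not taken from the slides: it counts only outgoing payload bytes per client request, assumes clients are spread evenly over the replicas, and ignores acknowledgements, Phase 2b messages, headers, and batching. The function names and the 8-byte ID size are assumptions for illustration.

```python
def leader_centric_leader_bytes(n: int, request_size: int) -> float:
    """Leader-centric case: the leader sends every request to the n-1 followers."""
    return (n - 1) * request_size


def balanced_follower_bytes(n: int, request_size: int) -> float:
    """Balanced case: each replica receives 1/n of the requests (assumed even
    client spread) and forwards each of them to the n-1 other replicas."""
    return (n - 1) * request_size / n


def balanced_leader_bytes(n: int, request_size: int, id_size: int) -> float:
    """Balanced case: the leader disseminates its own share like any replica
    and additionally sends the ID of every request to the n-1 followers."""
    return balanced_follower_bytes(n, request_size) + (n - 1) * id_size


if __name__ == "__main__":
    n, request_size, id_size = 3, 1024, 8  # id_size is an assumed value
    print(leader_centric_leader_bytes(n, request_size))
    print(balanced_leader_bytes(n, request_size, id_size))
    print(balanced_follower_bytes(n, request_size))
```

Under these assumptions the leader's extra cost over a follower is only the ID traffic, which is the intuition behind ordering IDs instead of requests.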
S-Paxos Architecture and Data Flow
S-Paxos balances work among replicas
• Client communication and request dissemination are usually the bulk of the load
  • In S-Paxos this task is performed by all replicas
• Leader still has to coordinate the ordering protocol
  • But IDs are small messages
  • So the leader has minimal additional overhead
• Two levels of batching to further reduce load on the leader
  • Dissemination layer: batch client requests and use the ordering layer to order IDs of batches
  • Ordering layer: usual Paxos batching, in this case batches of batch IDs
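The two batching levels can be sketched as follows. This is a minimal illustration, not the JPaxos or S-Paxos code: the greedy `pack` helper, the `(replica id, sequence number)` ID format, and the 8-byte ID size are assumptions; the 1450-byte and 50-byte limits are the batch sizes reported in the evaluation settings of this deck.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
BatchId = Tuple[int, int]  # (replica id, local sequence number): assumed format


def pack(items: Sequence[T], size_of: Callable[[T], int], max_bytes: int) -> List[List[T]]:
    """Greedy batching: close the current batch when the next item would overflow it."""
    batches: List[List[T]] = []
    current: List[T] = []
    used = 0
    for item in items:
        size = size_of(item)
        if current and used + size > max_bytes:
            batches.append(current)
            current, used = [], 0
        current.append(item)
        used += size
    if current:
        batches.append(current)
    return batches


def dissemination_batches(requests: Sequence[bytes], max_bytes: int = 1450) -> List[List[bytes]]:
    """Level 1: group client requests into batches that replicas disseminate."""
    return pack(requests, len, max_bytes)


def ordering_batches(batch_ids: Sequence[BatchId], id_size: int = 8,
                     max_bytes: int = 50) -> List[List[BatchId]]:
    """Level 2: group batch IDs into the values that the ordering layer proposes."""
    return pack(batch_ids, lambda _id: id_size, max_bytes)


if __name__ == "__main__":
    requests = [bytes(20)] * 200                      # 200 requests of 20 bytes
    level1 = dissemination_batches(requests)
    ids = [(0, seq) for seq in range(len(level1))]    # IDs issued by replica 0
    level2 = ordering_batches(ids)
    print([len(b) for b in level1], [len(b) for b in level2])
```

In this sketch the ordering layer never sees request payloads, only the small IDs of the level-1 batches.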
Benefits in the presence of faults
• Faster view change
  • Since IDs are small, Phase 1 of Paxos completes quickly
• Failures affecting the leader have less impact on throughput
  • Ordering protocol is interrupted, but dissemination protocol continues among working replicas
  • When a correct leader emerges, it can quickly order the IDs of the requests that were disseminated while there was no leader
DISSEMINATION LAYER PROTOCOL
Dissemination Layer Overview
• Dissemination layer tasks
  1) Receive requests from clients
  2) Disseminate requests and IDs to all replicas
  3) Initiate ordering of IDs
  4) Execute requests in the order established for IDs
• Challenges
  • Once an ID is decided, the corresponding request must remain available in the system
  • Coordinate view change between ordering and dissemination layers to ensure that IDs are ordered once and only once
Nuno Santos Dissemination Layer Protocol
Overview of the Protocol
Disseminating requests
• Optimistic implementation of reliable broadcast
• When a replica receives a request from a client, it broadcasts <request,ID>
• Replicas acknowledge reception of forwarded requests by broadcasting <Ack,ID>
Proposing IDs
• Leader proposes an ID once the corresponding request is stable
  • That is, when it receives acknowledgements for the ID
Executing requests
• Replica must have: request and decision for corresponding ID
• If ID decided before request received, poll other replicas for request after a small delay
  • Request stable, so at least one correct replica has the request
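The bookkeeping behind these three steps can be sketched as a small in-memory state object. This is an illustration under assumptions, not the S-Paxos implementation: the class and method names are hypothetical, networking is left out, and `ack_threshold` is a parameter because the transcript does not preserve how many acknowledgements make a request stable (a value of f+1 would be consistent with the remark that a stable request is held by at least one correct replica).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ReplicaState:
    ack_threshold: int  # assumed parameter: acks needed before an ID is stable
    requests: Dict[str, bytes] = field(default_factory=dict)
    acks: Dict[str, Set[int]] = field(default_factory=dict)
    proposed: Set[str] = field(default_factory=set)
    decided: List[str] = field(default_factory=list)   # IDs in decided order
    executed: List[str] = field(default_factory=list)

    def on_request(self, request_id: str, payload: bytes) -> None:
        """Store a request received from a client or via <request, ID>."""
        self.requests[request_id] = payload

    def on_ack(self, request_id: str, sender: int) -> None:
        """Record an <Ack, ID>; the set ignores duplicate acks from one sender."""
        self.acks.setdefault(request_id, set()).add(sender)

    def is_stable(self, request_id: str) -> bool:
        return len(self.acks.get(request_id, ())) >= self.ack_threshold

    def ids_to_propose(self) -> List[str]:
        """Leader only: stable IDs not yet handed to the ordering layer.
        Sorting is only for deterministic output in this sketch."""
        ready = sorted(i for i in self.acks
                       if self.is_stable(i) and i not in self.proposed)
        self.proposed.update(ready)
        return ready

    def on_decision(self, request_id: str) -> None:
        """Record that the ordering layer decided this ID next."""
        self.decided.append(request_id)

    def execute_ready(self) -> List[str]:
        """Execute decided IDs in order; stop at the first missing request."""
        done: List[str] = []
        while len(self.executed) < len(self.decided):
            next_id = self.decided[len(self.executed)]
            if next_id not in self.requests:
                break  # here a replica would poll other replicas for the request
            self.executed.append(next_id)
            done.append(next_id)
        return done
```

Execution stops at the first decided ID whose request is missing, so later requests are never executed out of order while the replica waits for the missing one.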
PERFORMANCE EVALUATION
Performance Evaluation
• S-Paxos implemented on top of JPaxos, a Java implementation of Paxos
• Experiments compare
  • JPaxos (leader-centric)
  • S-Paxos (non leader-centric)
• Testbed: Grid 5000 (helios cluster)
  • CPU: 2x2 cores @ 2.2 GHz
  • Network: 1 Gbit Ethernet
• Experimental parameters
  • Request size: 20 bytes
  • Batch size
    • S-Paxos: dissemination layer: 1450 bytes, ordering layer: 50 bytes
    • JPaxos: 1450 bytes
  • Null service
Nuno Santos Experimental Evaluation
Load Distribution: Average CPU utilization
[Figure panels: JPaxos, S-Paxos]
Performance with Increasing Number of Clients (n=3)
[Figure panels: Throughput, Response time]
Scalability
[Figure: Throughput]
Throughput with crashes
Crash of the leader
• Request size: 1 KB, batch size: 8 KB
False suspicions
• Leader is (wrongly) suspected every 10 seconds
Conclusion
A leader-based protocol does not need to be leader-centric
S-Paxos: balances the workload across replicas
Benefits
• Better performance for the same number of replicas
• Better scalability with the number of replicas
• Better performance in the presence of faults
ADDITIONAL SLIDES
Discussion
• Broadcast of <request,ID>: best effort, no retransmission
  • Avoids cost of reliable broadcast on requests
  • Recovering from partial delivery (message loss/crashes):
    • Request does not become stable: the client times out and retransmits
    • Request becomes stable: after the ID is decided, replicas poll other replicas for the request
• Broadcast of <Ack,ID>: retransmission
  • Ensures that once a request is stable, it will be proposed
  • Almost free in practice: acks are small and can be piggybacked on other messages
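The polling path for a decided ID whose request never arrived can be sketched as below. This is a single-process illustration under assumptions: `Peer` stands in for a hypothetical "do you have this request?" call to another replica, and the delay, the number of rounds, and the function name are illustrative choices rather than values from the slides.

```python
import time
from typing import Callable, Dict, Iterable, Optional

# Hypothetical stand-in for asking another replica for a request by ID.
Peer = Callable[[str], Optional[bytes]]


def recover_request(request_id: str,
                    local_store: Dict[str, bytes],
                    peers: Iterable[Peer],
                    delay_s: float = 0.05,
                    max_rounds: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[bytes]:
    """Return the request for a decided ID, polling peers if it is missing locally."""
    peer_list = list(peers)
    for _ in range(max_rounds):
        if request_id in local_store:
            return local_store[request_id]
        # Wait a small delay first: the best-effort broadcast may just be late.
        sleep(delay_s)
        for peer in peer_list:
            payload = peer(request_id)
            if payload is not None:
                local_store[request_id] = payload  # keep it for execution
                return payload
    return None  # still missing; a real replica would keep retrying


if __name__ == "__main__":
    replica_b = {"r1": b"payload"}
    store: Dict[str, bytes] = {}
    print(recover_request("r1", store, [{}.get, replica_b.get], delay_s=0.0))
```

Because an ID is proposed only after its request is stable, the slides argue that at least one correct replica holds the request, so polling eventually finds it.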