A Closer Look at Fault Tolerance
Gadi Taubenfeld, SRDC 2013
Example: Perfect Renaming
Old names: 17 39 11 99 27
New names: 5 3 1 2 4
Example: Perfect Renaming
[Figure: the process with old name 39 has not received a new name; the other four processes hold new names 5, 1, 2, 4.]
1-resilient
Example: Perfect Renaming
[Figure: the processes with old names 39 and 27 have not received new names; the other three processes hold new names 5, 1, 2.]
Not 1-resilient
Example: Perfect Renaming
[Figure, two panels. Left: old names 39 and 27 remain without new names; new names 5, 1, 2 are assigned. Right: old names 39, 27, 17, 11, 99 all remain without new names.]
Not 1-resilient (both panels)
A General Definition (see paper for details)
For a given function f: N → N, an algorithm is (t,f)-resilient if, in the presence of t' faults, at most f(t') participating correct processes may not terminate their operations, for 0 ≤ t' ≤ t.
Not covered in this talk
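The definition can be read as a predicate over observed runs. The sketch below is only an illustration: the names `is_tf_resilient`, `no_one_stuck`, and `as_many_as_faults` are made up here, and a run is summarized (by assumption) as a pair of the number of faults and the number of correct participating processes that never terminate. Choosing f(t') = 0 presumably recovers ordinary t-resilience, as in the renaming example.

```python
from typing import Callable, Iterable, Tuple

# A run is summarized as (faults, stuck): the number of faulty processes and
# the number of correct participating processes that never terminate.
Run = Tuple[int, int]

def is_tf_resilient(t: int, f: Callable[[int], int], runs: Iterable[Run]) -> bool:
    """True if every run with at most t faults leaves at most f(faults) correct processes stuck."""
    return all(stuck <= f(faults) for faults, stuck in runs if faults <= t)

def no_one_stuck(faults: int) -> int:
    # f(t') = 0: every correct participant must terminate
    return 0

def as_many_as_faults(faults: int) -> int:
    # f(t') = t': one correct participant may be stuck per fault
    return faults

# Hypothetical observations: a fault-free run, a run with one fault where
# everyone else finishes, and a run with one fault and one stuck correct process.
runs = [(0, 0), (1, 0), (1, 1)]
strict = is_tf_resilient(1, no_one_stuck, runs)
relaxed = is_tf_resilient(1, as_many_as_faults, runs)
print(strict, relaxed)
```

The third run is what separates the two choices of f: it violates f(t') = 0 but is allowed by f(t') = t'.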
Notation
Correct active process
Correct process that has terminated
Faulty process
Wait-freedom [Herlihy 1991]
In the presence of any number of faults, all the correct participating processes must terminate.
[Figure: processes P1 to P6, one row each for 0, 1, 2, 3, 4, and 5 faults.]
Almost-wait-freedom
In the presence of any number of faults, all the correct participating processes, except maybe one, must terminate.
[Figure: processes P1 to P6, one row each for 0, 1, 2, 3, 4, and 5 faults.]
Partially-wait-freedom
In the presence of any number t ≤ n-1 of faults, all the correct participating processes, except maybe t of them, must terminate.
[Figure: processes P1 to P6, one row each for 0, 1, 2, 3, 4, and 5 faults.]
Weakly-wait-freedom
In the presence of any number of faults, if there are two or more correct participating processes, then at least one correct participating process must terminate.
[Figure: processes P1 to P6, one row each for 0, 1, 2, 3, 4, and 5 faults.]
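To compare the four progress conditions side by side, the sketch below computes, for n = 6 processes as in the figures, how many correct participating processes each condition allows to never terminate. It reads the slide wording literally and assumes all n processes participate; the function name `max_stuck` and the condition labels are introduced here for illustration only.

```python
def max_stuck(condition: str, n: int, t: int) -> int:
    """Upper bound on correct participating processes that may never terminate,
    assuming all n processes participate and t of them are faulty."""
    correct = n - t
    if condition == "wait-free":
        bound = 0                      # everyone correct must terminate
    elif condition == "almost":
        bound = 1                      # all except maybe one
    elif condition == "partially":
        bound = t                      # all except maybe t of them
    elif condition == "weakly":
        # with two or more correct processes at least one must terminate;
        # with fewer, the definition demands nothing
        bound = correct - 1 if correct >= 2 else correct
    else:
        raise ValueError(condition)
    return min(bound, correct)         # cannot exceed the number of correct processes

N = 6
table = {
    c: [max_stuck(c, N, t) for t in range(N)]   # t = 0 .. 5 faults
    for c in ("wait-free", "almost", "partially", "weakly")
}
for name, row in table.items():
    print(f"{name:10s}", row)
```

Each printed row lists the bound for 0 through 5 faults, so the conditions can be compared column by column.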
Technical Results

| Problem | Model | Weakly WF | Partially WF | Almost WF | Complexity |
|---|---|---|---|---|---|
| Election | SM/MP | | | | |
| Test&set | SM | | | | |
| Perfect renaming | SM/MP | | | | |
| Stack | SM | | | | |
| Swap | SM | | | | |
| Fetch&add | SM | | | | |
| Consensus, Set-consensus | SM/MP | | | | |

SM -- Shared Memory using atomic registers
MP -- Message Passing (send/receive)
Thm: There is no 1-resilient implementation using atomic registers or messages
Technical Results

| Problem | Model | Weakly WF | Partially WF | Almost WF | Complexity |
|---|---|---|---|---|---|
| Election | SM/MP | | | | |
| Test&set | SM | | | | |
| Perfect renaming | SM/MP | | | | |
| Stack | SM | | | | |
| Swap | SM | | | | |
| Fetch&add | SM | | | | |
| Consensus, Set-consensus | SM/MP | x | x | x | |

SM -- Shared Memory using atomic registers
MP -- Message Passing (send/receive)
Technical Results

| Problem | Model | Weakly WF | Partially WF | Almost WF | Complexity: Upper | Complexity: Lower |
|---|---|---|---|---|---|---|
| Election | SM | | | | log n + 2 | log n + 1 |
| Election | MP | | | | O(n^2) | |
| Test&set | SM | | | | n + 1 | n |
| Perfect renaming, one-shot | SM | | | | O(n log n) | |
| Perfect renaming, one-shot | MP | | | | O(n^3) | |
| Perfect renaming, long-lived | SM | | | | O(n^2) | |

SM -- # of atomic registers
MP -- # of messages
An almost-wait-free symmetric election
Shared registers, all initially 0: turn, done, V[1..log n]
Program of process p:
    turn = p
    for level = 1 to log n do
        repeat
            if done = 1 then return(0) fi
            if turn ≠ p then
                for j = 1 to level - 1 do if V[j] = p then V[j] = 0 fi od
                return(0)
            fi
        until V[level] = 0
        V[level] = p
        if turn ≠ p then
            for j = 1 to level do if V[j] = p then V[j] = 0 fi od
            return(0)
        fi
    od
    done = 1; return(1)
Inspired by Styer & Peterson PODC 1989
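A minimal Python sketch of the election program above, written as a generator so that a scheduler can interleave processes one shared-register access at a time. Assumptions that are not on the slide: the lost comparison symbol is read as `turn ≠ p`, log n is rounded up to an integer, process identifiers are nonzero, and the helper names (`new_shared`, `run`) are invented for the example. Only a sequential run and one crash scenario are exercised; the sketch makes no claim about arbitrary interleavings.

```python
import math

def new_shared(n):
    """Shared registers turn, done, V[1..levels], all initially 0 (V[0] is unused)."""
    levels = max(1, math.ceil(math.log2(n)))
    return {"turn": 0, "done": 0, "V": [0] * (levels + 1)}

def election(p, shared, n):
    """Program of process p. Each yield precedes one shared-register access.
    The generator's return value is 1 (elected) or 0 (not elected)."""
    V = shared["V"]
    levels = len(V) - 1

    def clear(upto):
        # release every level in 1..upto that still carries p's mark
        for j in range(1, upto + 1):
            yield
            if V[j] == p:
                yield
                V[j] = 0

    yield
    shared["turn"] = p
    for level in range(1, levels + 1):
        while True:                       # repeat ... until V[level] = 0
            yield
            if shared["done"] == 1:
                return 0
            yield
            if shared["turn"] != p:
                yield from clear(level - 1)
                return 0
            yield
            if V[level] == 0:
                break
        yield
        V[level] = p
        yield
        if shared["turn"] != p:
            yield from clear(level)
            return 0
    yield
    shared["done"] = 1
    return 1

def run(gen, max_steps=10_000):
    """Step a process until it returns; None means it was still running after max_steps."""
    for _ in range(max_steps):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None

# Four processes, one after another: the first is elected, the rest see done = 1.
shared = new_shared(4)
sequential = [run(election(p, shared, 4)) for p in (1, 2, 3, 4)]

# Process 1 claims level 1 and then crashes before writing done;
# process 2 is correct but keeps waiting.
crash_shared = new_shared(2)
crasher = election(1, crash_shared, 2)
while crash_shared["V"][1] != 1:
    next(crasher)
survivor = run(election(2, crash_shared, 2), max_steps=1000)
print(sequential, survivor)
```

The crash scenario illustrates the "except maybe one" clause: after a single fault, one correct process may wait forever.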
An almost-wait-free symmetric test&set bit
Shared registers, all initially 0: turn, winner, lock[1..n-1]
Local variable: locked
Program of process p:
test&set:
    if turn ≠ 0 then return(0) fi
    turn = p
    repeat
        for j = 1 to n-1 do if lock[j] = 0 then lock[j] = p fi od
        locked = 1
        for j = 1 to n-1 do if lock[j] ≠ p then locked = 0 fi od
    until turn ≠ p or locked = 1 or winner = 1
    if turn ≠ p or winner = 1 then
        for j = 1 to n-1 do if lock[j] = p then lock[j] = 0 fi od
        return(0)
    fi
    winner = 1; return(1)
reset:
    winner = 0; turn = 0
    for j = 1 to n-1 do if lock[j] = p then lock[j] = 0 fi od
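The test&set and reset programs can be transcribed into Python almost line by line. The sketch below is sequential only: calls run one at a time, so it shows the control flow and the register layout (turn, winner, and n-1 lock registers, which matches the n+1 upper bound in the results table) but says nothing about concurrent or faulty executions. The class name `TestAndSetBit` is made up, and the lost comparison symbol is assumed to be ≠.

```python
class TestAndSetBit:
    """Sequential transcription of the slide's test&set / reset programs."""

    def __init__(self, n):
        self.n = n
        self.turn = 0
        self.winner = 0
        self.lock = [0] * n            # lock[1..n-1]; lock[0] is unused

    def _unlock(self, p):
        # release every lock register that still carries p's mark
        for j in range(1, self.n):
            if self.lock[j] == p:
                self.lock[j] = 0

    def test_and_set(self, p):
        if self.turn != 0:
            return 0
        self.turn = p
        while True:                    # repeat ... until
            for j in range(1, self.n):
                if self.lock[j] == 0:
                    self.lock[j] = p
            locked = 1                 # local variable
            for j in range(1, self.n):
                if self.lock[j] != p:
                    locked = 0
            if self.turn != p or locked == 1 or self.winner == 1:
                break
        if self.turn != p or self.winner == 1:
            self._unlock(p)
            return 0
        self.winner = 1
        return 1

    def reset(self, p):
        self.winner = 0
        self.turn = 0
        self._unlock(p)

bit = TestAndSetBit(n=3)
first = bit.test_and_set(1)    # process 1 finds the bit free
second = bit.test_and_set(2)   # process 2 sees turn != 0
bit.reset(1)                   # process 1 releases the bit
third = bit.test_and_set(2)    # process 2 can now win it
print(first, second, third)
```

Note that the winner returns 1 here, following the slide, rather than the old value 0 that a hardware test&set would return.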
A trivial almost-wait-free symmetric election
Program for a process with identifier my.id:
    counter := 0
    Send my.id to all the other processes;
    Each time a message is received do
        if my.id < message.val then return(0) else counter := counter + 1 fi
        if counter = n-1 then return(1) fi
    od
Is there a better algorithm?
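A small simulation of the trivial message-passing election, assuming reliable delivery and modeling a crash as a process that stops before sending anything. The function name `run_election`, the mailbox representation, and the sample identifiers (borrowed from the renaming example) are illustrative choices, not part of the talk.

```python
def run_election(ids, crashed=frozenset()):
    """Simulate the trivial election.

    Returns (result, sent): result maps each correct process to 1 (elected),
    0 (not elected), or None (still waiting for messages); sent counts messages.
    """
    n = len(ids)
    inbox = {p: [] for p in ids}
    sent = 0
    for p in ids:
        if p in crashed:
            continue                   # crashes before sending anything
        for q in ids:
            if q != p:
                inbox[q].append(p)     # send my.id to all the other processes
                sent += 1

    result = {}
    for p in ids:
        if p in crashed:
            continue
        counter = 0
        result[p] = None               # None: has not returned yet
        for val in inbox[p]:
            if p < val:
                result[p] = 0          # someone has a larger identifier
                break
            counter += 1
            if counter == n - 1:
                result[p] = 1          # heard from everyone, all smaller
                break
    return result, sent

ids = [17, 39, 11, 99, 27]
no_faults, messages = run_election(ids)
one_fault, _ = run_election(ids, crashed={99})
print(no_faults, messages)
print(one_fault)
```

Without faults the process with the largest identifier is elected after n(n-1) messages. When that process crashes, the second largest waits forever for a message that never arrives while everyone else returns 0, which is the single non-terminating process that almost-wait-freedom permits.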
Perfect Renaming: Partially-wait-free, Long-lived
[Figure: a row of five almost-wait-free test&set bits, numbered 1 to 5, each initially 0.]
What about almost-wait-free renaming?
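The row of test&set bits suggests the classic construction in which a process scans the bits in order and takes as its new name the index of the first bit it wins, then resets that bit to release the name. The sketch below assumes that reading of the figure. It uses a non-blocking `threading.Lock` as a stand-in for a test&set bit, so it illustrates the renaming layer only, not the almost-wait-free bit itself; `NamePool` and its method names are hypothetical.

```python
import threading

class NamePool:
    """Long-lived renaming over n test&set bits; bit i guards the name i + 1."""

    def __init__(self, n):
        self.bits = [threading.Lock() for _ in range(n)]

    def acquire_name(self):
        while True:                               # rescan if every bit was taken
            for i, bit in enumerate(self.bits):
                if bit.acquire(blocking=False):   # test&set: True only for the winner
                    return i + 1

    def release_name(self, name):
        self.bits[name - 1].release()             # reset the bit

pool = NamePool(5)
names = []

def worker():
    names.append(pool.acquire_name())

threads = [threading.Thread(target=worker) for _ in range(5)]
for th in threads:
    th.start()
for th in threads:
    th.join()

pool.release_name(3)            # the holder of name 3 gives it up
reused = pool.acquire_name()    # the next request finds bit 3 free
print(sorted(names), reused)
```

With n bits and n processes each holding at most one name, every scan eventually wins some bit, so the names handed out are exactly 1..n.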
Fetch&add, Swap, Stack: Partially-wait-free
[Figure: Fetch&add, Swap, and Stack built on Test&set + atomic registers, with edges labeled WF [Afek, Weisberger, Weisman PODC 93]. From atomic registers, Test&set is labeled Almost-WF; Fetch&add, Swap, and Stack are labeled Partially-WF.]
What about almost-wait-free?
Open Problems
Improve our results:
– Computability: Is there an almost-wait-free perfect renaming, stack, swap, f&a, …
– Complexity: improve the space/message/time …
Type of faults: crash, omission, Byzantine, …
Time: asynchronous, synchronous, …
Other objects: queue, …
Failure models: uniform, non-uniform
Other models: unbounded concurrency, failure detectors, …
Fault tolerance: What can go wrong?
Processes
Communication links
Messages
Shared memory
Timing failures
Memory reordering
Example: Using Flags
Gadi Taubenfeld, ICDCN 2013
x and y: atomic bits, initially 0
Q: Is it possible that both processes read the value 0?
Process A
write.x(1)
read.y
Process B
write.y(1)
read.x
Example: Using Flags
Fact: Many hardware architectures do not support sequential consistency because their designers consider it too strong.
Example: Using Flags
Solution: Memory barriers
Example: Using Flags
Assumption: At most one memory reordering is possible.
Example: Using Flags
Assumption: At most one memory reordering is possible.
Process A
write.x(1)
write.x(1)
read.y
Process B
write.y(1)
write.y(1)
read.x
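The flag example can be checked by brute force. The sketch below enumerates every interleaving of the two processes and collects the pairs of values read. It models a memory reordering as a swap of two adjacent operations within one process, which is an assumption made here for illustration (the slides do not define the reordering model), and allows at most one such swap in total. All names (`interleavings`, `one_swap`, `outcomes`) are invented for the example.

```python
def interleavings(a, b):
    """All merges of a and b that keep each sequence's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def one_swap(prog):
    """prog itself plus every variant with one adjacent pair of operations swapped."""
    variants = [list(prog)]
    for i in range(len(prog) - 1):
        v = list(prog)
        v[i], v[i + 1] = v[i + 1], v[i]
        variants.append(v)
    return variants

def outcomes(prog_a, prog_b, reorderings):
    """Set of (value A reads, value B reads); reorderings is 0 or 1 in total."""
    pairs = [(list(prog_a), list(prog_b))]
    if reorderings == 1:
        pairs += [(va, list(prog_b)) for va in one_swap(prog_a)]
        pairs += [(list(prog_a), vb) for vb in one_swap(prog_b)]
    seen = set()
    for pa, pb in pairs:
        for run in interleavings(pa, pb):
            mem = {"x": 0, "y": 0}     # atomic bits, initially 0
            read = {}
            for proc, op, var in run:
                if op == "w":
                    mem[var] = 1
                else:
                    read[proc] = mem[var]
            seen.add((read["A"], read["B"]))
    return seen

# One write per process, as in the first version of the example.
A1 = [("A", "w", "x"), ("A", "r", "y")]
B1 = [("B", "w", "y"), ("B", "r", "x")]
# Each write issued twice, as in the last version.
A2 = [("A", "w", "x")] + A1
B2 = [("B", "w", "y")] + B1

in_order = outcomes(A1, B1, 0)
single_write = outcomes(A1, B1, 1)
double_write = outcomes(A2, B2, 1)
print((0, 0) in in_order, (0, 0) in single_write, (0, 0) in double_write)
```

Under this model, both processes can read 0 only when a single write is exposed to one reordering; issuing the write twice rules it out without a barrier.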
Question
How can we provide some level of resiliency against memory reordering and reduce the number of memory barriers required?
Memory reordering resiliency: design strategy
X: atomic register
write.x(1)
write.x(2)
write.x(3)
read.x
No reordering: x ∈ {3}
One reordering: x ∈ {2,3}
Two reorderings: x ∈ {1,2,3}
X: 2-atomic register
write.x(1)
write.x(2)
write.x(3)
read.x
No reordering: x ∈ {2,3}
One reordering: x ∈ {1,2,3}
1. Design your algorithm to be correct assuming weak objects (2-atomic registers)
2. Replace the weak objects with strong objects (1-atomic registers)
Get "some" resiliency against memory reordering (i.e., need fewer barriers)
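The read sets for the atomic and 2-atomic register can be reproduced by enumeration. The sketch below assumes two things the slide does not spell out: a k-atomic register may return any of the last k values written, and one memory reordering is a swap of two adjacent operations in program order. The function names `reachable` and `read_values` are made up for the example.

```python
def reachable(prog, swaps):
    """All orders reachable from prog by at most `swaps` adjacent swaps."""
    seen = {tuple(prog)}
    frontier = {tuple(prog)}
    for _ in range(swaps):
        nxt = set()
        for order in frontier:
            for i in range(len(order) - 1):
                v = list(order)
                v[i], v[i + 1] = v[i + 1], v[i]
                nxt.add(tuple(v))
        frontier = nxt - seen
        seen |= nxt
    return seen

def read_values(k, swaps):
    """Values read.x may return on a k-atomic register after at most `swaps` reorderings."""
    prog = [("w", 1), ("w", 2), ("w", 3), ("r", None)]
    result = set()
    for order in reachable(prog, swaps):
        history = []
        for op, val in order:
            if op == "w":
                history.append(val)
            else:
                result |= set(history[-k:])   # any of the last k written values
    return result

for k in (1, 2):
    for swaps in range(0, 4 - k):
        print(f"{k}-atomic, {swaps} reordering(s):", sorted(read_values(k, swaps)))
```

Under these assumptions a 2-atomic register with r reorderings admits the same reads as an atomic register with r + 1, which is the slack the design strategy relies on.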
Memory reordering resiliency: design strategy
What weak objects?
How much?
The End