CSC2/458 Parallel and Distributed Systems
Mutual Exclusion and Leader Elections
Sreepathi Pai
March 29, 2018
URCS
Outline
Mutual Exclusion Using Voting
Misra’s Token Recovery Algorithm
Election Algorithms
From the previous lecture
• Does a process need to wait for all replicas to reply before checking majority?
• No [it would NOT (thanks, Mohsen!) solve the problem raised
by Andrew, but would lead to lower utilization]
• How many processes need to fail?
• f >= m − N/2, where
• m = N/2 + 1
• Does this mean mutual exclusion can be violated?
• Yes (with very low probability, see Lin et al. 2014)
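A minimal Python sketch of the first point above: the requester can stop counting as soon as the grants reach m, without waiting for the remaining replies. The helper names `majority_threshold` and `first_majority` are hypothetical, and m = N/2 + 1 is read here with integer division, which is an assumption.

```python
def majority_threshold(n: int) -> int:
    """Smallest vote count that is a strict majority of n replicas (m = N/2 + 1)."""
    return n // 2 + 1


def first_majority(replies, n):
    """Scan replies in arrival order; return how many replies had arrived
    when the grants first reached a majority, or None if they never do."""
    needed = majority_threshold(n)
    grants = 0
    for count, granted in enumerate(replies, start=1):
        if granted:
            grants += 1
        if grants >= needed:
            return count  # no need to wait for the remaining replies
    return None
```

With five replicas, `first_majority([True, False, True, True, True], 5)` stops after the fourth reply; the fifth reply is never inspected.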
Different Types of Failures (Thomas)
• How does fail recovery compare with fail stop?
• Fail stop: Process operates correctly, fails in a detectable way
and remains failed
• Fail recovery: Process fails and “restarts”
Outline
Mutual Exclusion Using Voting
Misra’s Token Recovery Algorithm
Election Algorithms
Recall Token-based Mutual Exclusion
• A token circulates in a (unidirectional) ring
• Process i sends token to Process i + 1 (modulo N)
• A process holding the token can perform actions on shared resources
• i.e. it is in the critical section
• A token can be lost
• released by process i but not received by process j
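The circulation rule above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: the function name `circulate`, the `wants` set, and the hop count are hypothetical, and token loss is not modelled here.

```python
def circulate(n, wants, hops):
    """Pass one token around a ring of n processes for `hops` hops.

    wants: ids of processes that want the critical section once.
    Returns the order in which processes entered the critical section.
    """
    pending = set(wants)
    entered = []
    holder = 0  # assumption: process 0 holds the token initially
    for _ in range(hops):
        if holder in pending:
            entered.append(holder)   # holder is in the critical section
            pending.discard(holder)
        holder = (holder + 1) % n    # send token to process i + 1 (modulo N)
    return entered
```

Because only the current holder may enter, the entry order follows ring order rather than request order.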
Loss of token
• Two problems
• Detecting loss
• Regenerating a single token
One possible solution
• Detect loss of token using timeouts
• Perform leader election
• Leader generates new token
• This solution in a few slides
Misra’s algorithm for detecting token loss and regeneration
• Use two tokens X and Y
• X is also the mutual exclusion token (Y is not)
• X and Y detect the loss of each other
• Assume in-order receipt
Key Insight
“A token at a process pi can guarantee the other token is lost if
since this token’s last visit to pi, neither this token nor pi have
seen the other token.”
- Misra, 1983, Detecting Termination of Distributed Computations
Using Markers, PODC
• What does it mean for:
• a process to have seen a token?
• a token to have seen the other token?
The Algorithm: Setup
• Associate two integers, nX and nY, with X and Y
• Initialize nX and nY to +1 and -1 respectively
• Each token carries its value with it (i.e., nX or nY)
• Each process pi keeps a variable mi, initialized to zero
• mi records the value carried by the last token that visited pi; its sign tells which token that was
The Algorithm: Working
When tokens encounter each other:
    nX = nX + 1
    nY = nY - 1
When pi encounters Y (the analogous code for encountering X is not shown):
    if m_i == nY:
        /* token X is lost */
        /* regenerate token X */
        nY -= 1
        nX = -nY
    else:
        m_i = nY
    end if
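A runnable Python sketch of the rules above. The class name `MisraRing` and its method names are hypothetical. Two things are assumptions rather than slide content: `receive_x` is written as the mirror image of the Y handler (the slide says it is analogous but does not show it), and the two token values are kept in one shared object instead of travelling inside messages.

```python
class MisraRing:
    """State for the two-token loss detection scheme (simplified, no messaging)."""

    def __init__(self, n):
        self.n_x, self.n_y = 1, -1   # values carried by tokens X and Y
        self.m = [0] * n             # m[i]: last token value seen by process i

    def meet(self):
        """Tokens X and Y encounter each other."""
        self.n_x += 1
        self.n_y -= 1

    def receive_y(self, i):
        """Process i encounters Y; returns True if X was found lost and regenerated."""
        if self.m[i] == self.n_y:
            self.n_y -= 1
            self.n_x = -self.n_y     # regenerate token X
            return True
        self.m[i] = self.n_y
        return False

    def receive_x(self, i):
        """Mirror of receive_y (assumed): detects loss of Y and regenerates it."""
        if self.m[i] == self.n_x:
            self.n_x += 1
            self.n_y = -self.n_x     # regenerate token Y
            return True
        self.m[i] = self.n_x
        return False
```

If X makes a full round of a 3-process ring and returns to p0 without Y having visited p0 and without the tokens having met, `receive_x(0)` reports the loss on the second visit. A call to `meet()` in between changes nX, so the comparison with m_i fails and no loss is reported.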
Do we need infinite precision?
• nX can become arbitrarily large
• nY can become arbitrarily small
• Can we avoid this?
• What is the invariant we need to maintain?
• When are counters updated?
• How many such events can happen between two visits to pi?
Other notes
Misra proposed this algorithm for termination detection. We will
revisit it.
But can you see how it may apply?
• Every process is either IDLE or ACTIVE
• Receiving a message marks a process as ACTIVE
• Processes can only quit when all of them are IDLE and there
are no messages in flight
Outline
Mutual Exclusion Using Voting
Misra’s Token Recovery Algorithm
Election Algorithms
Electing Leaders
• Initiating an election
• Anytime
• Detecting a winner and making sure everybody agrees on the same winner
• Using process IDs to break ties for example
Ring-based Elections: Selective Extension
• (Logical) Unidirectional ring topology
• Two message types, both contain a process ID:
• ELECTION
• ELECTED
Algorithm: Part I
A process can initiate an election at any time. Process pi does this by
sending an ELECTION(pi) message to its neighbour and “marking itself” as
participating in an election.
On receiving message ELECTION(X), a process pj:
    if X > p_j:
        participating = T
        send(ELECTION(X))
    elif X < p_j:
        participating = T
        send(ELECTION(p_j))
    elif X == p_j:
        send(ELECTED(p_j))
Algorithm: Part II
When receiving ELECTED(Y):
    participating = F
    coordinator = Y
    if Y != p_j:
        send(ELECTED(Y))
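Parts I and II can be put together in a small single-threaded simulation. This is a sketch under assumptions, not the lecture's code: the function name `ring_election`, the message tuples, and the FIFO queue standing in for the network are all hypothetical, and only one initiator is modelled.

```python
from collections import deque


def ring_election(ids, initiator):
    """Simulate the ELECTION/ELECTED rules on a unidirectional ring.

    ids[k] is the process ID at ring position k; `initiator` is a position.
    Returns the coordinator recorded at every position.
    """
    n = len(ids)
    participating = [False] * n
    coordinator = [None] * n
    participating[initiator] = True
    # Each message is (kind, value, receiving position).
    queue = deque([("ELECTION", ids[initiator], (initiator + 1) % n)])
    while queue:
        kind, value, j = queue.popleft()
        nxt = (j + 1) % n
        if kind == "ELECTION":
            if value > ids[j]:
                participating[j] = True
                queue.append(("ELECTION", value, nxt))
            elif value < ids[j]:
                participating[j] = True
                queue.append(("ELECTION", ids[j], nxt))
            else:  # own ID came back: this process has won
                queue.append(("ELECTED", ids[j], nxt))
        else:  # ELECTED
            participating[j] = False
            coordinator[j] = value
            if value != ids[j]:
                queue.append(("ELECTED", value, nxt))
    return coordinator
```

For example, with IDs `[3, 7, 2, 5]` and the election started at the position holding ID 2, every position ends up recording 7 as coordinator.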
Textbook has slight modifications
• Sends lists instead of one number
• Skips dead nodes
[Figure: ring of processes 0 to 7. An ELECTION message started by process 3 grows from [3] to [3,4,5,6,0,1,2]; one started by process 6 grows from [6] to [6,0,1,2,3,4,5]. Process 7 appears in neither list.]
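The list-based variant can be sketched as follows, reproducing the lists in the figure. The function name `list_election` and the `alive` set are hypothetical. Picking the highest ID in the completed list as coordinator is how I recall the textbook's rule and should be treated as an assumption.

```python
def list_election(n, alive, initiator):
    """Circulate an ELECTION list once around a ring of processes 0..n-1.

    Dead processes (those not in `alive`) are skipped.
    Returns (completed list, chosen coordinator).
    """
    members = [initiator]
    j = (initiator + 1) % n
    while j != initiator:
        if j in alive:          # live process appends its own ID
            members.append(j)
        j = (j + 1) % n         # dead processes are simply skipped
    return members, max(members)
```

With processes 0 to 6 alive, elections started by 3 and by 6 build different lists but agree on the same coordinator.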
The Bully Algorithm
The live process with the highest process ID always wins and becomes the coordinator.
• Three types of messages:
• ELECTION (initiation)
• OK (resolution)
• COORDINATOR (verdict)
Bully Algorithm in Action: Initiation
[Figure: processes 0 to 7, with three ELECTION messages shown.]
Bully Algorithm in Action: Resolution
[Figure: processes 0 to 7, with two OK replies shown.]
Bully Algorithm in Action: Further Elections
[Figure: processes 0 to 7, with three further ELECTION messages shown.]
Bully Algorithm in Action: Resolution
[Figure: processes 0 to 7, with one OK reply shown.]
Bully Algorithm in Action: Final Verdict
[Figure: processes 0 to 7, with the COORDINATOR announcement shown.]
Algorithm
Any process pi can initiate an election at any time:
• Send ELECTION message to all processes pk such that k > i
• Wait for OK replies
• If no reply arrives within a timeout, process pi has won and
announces its win using COORDINATOR
On receiving an ELECTION message:
• Send OK to sender
• The sender cannot become coordinator
• Initiate an election if any higher processes are known to exist
• If not, this process is the new coordinator; send COORDINATOR
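The steps above can be sketched as a synchronous simulation that records every message. This is an illustration under assumptions: the function name `bully` and the message tuples are hypothetical, a timeout is modelled as "the target is not in `alive`", and the initiator is assumed to be alive.

```python
from collections import deque


def bully(n, alive, initiator):
    """Simulate a bully election among processes 0..n-1.

    Returns (coordinator, log); log entries are (kind, sender, receiver).
    """
    log = []
    started = set()
    pending = deque([initiator])
    coordinator = None
    while pending:
        i = pending.popleft()
        if i in started:
            continue                    # this process already ran its election
        started.add(i)
        got_ok = False
        for k in range(i + 1, n):       # ELECTION goes to all higher processes
            log.append(("ELECTION", i, k))
            if k in alive:              # dead processes never answer (timeout)
                log.append(("OK", k, i))
                got_ok = True
                pending.append(k)       # k takes over with its own election
        if not got_ok:                  # nobody higher answered: i has won
            coordinator = i
            for k in sorted(alive - {i}):
                log.append(("COORDINATOR", i, k))
    return coordinator, log
```

Running `bully(8, set(range(7)), 4)` produces three ELECTION messages from 4, OK replies from 5 and 6, further elections by 5 and 6, and finally a COORDINATOR announcement from 6.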
What happens when 7 comes back online?
[Figure: processes 0 to 7, with a COORDINATOR message shown.]
Interesting Extensions
• Wireless networks
• Small, dynamic, no fixed topology
• P2P networks
• Large, dynamic, may need multiple coordinators
• See textbook for details
• Will revisit some of these topics in a P2P lecture
Acknowledgements
All figures from van Steen and Tanenbaum, 3rd Edition.