d mutex (1)

Upload: pulkit-budhiraja

Post on 07-Apr-2018


  • 8/3/2019 D MUTEX (1)

    1/34

    Mutual Exclusion in Distributed Systems

    Single Processor Systems

    use semaphore, monitor, etc.

    Distributed Systems

    centralized algorithm: a central server coordinates the order in which processes enter the critical section (CS)

    overloads the central site

    introduces a single point of failure in the system


    Mutual Exclusion in Distributed Systems

    decentralized algorithms

    non-token based algorithms

    Lamport's algorithm

    Ricart-Agrawala's algorithm

    Maekawa's algorithm

    token based algorithms

    token-ring algorithm

    broadcast algorithm

    tree-based algorithm

    self-stabilizing algorithm


    Lamport's Algorithm

    Request the CS:

    1. Pi broadcasts request (ti, i) to all processors and puts the request in its local queue (ordered by request timestamp t)

    2. Pj, upon receiving request (ti, i), puts the request in its local queue (ordered by request timestamp t) and sends reply (tj, j) to Pi

    Enter the CS:

    1. if Pi has received reply messages with timestamps larger than ti from all other sites, and its own request is at the top of its queue, then it enters the CS

    Release the CS:

    1. Pi, upon exiting the CS, removes its request from its queue and sends release (ti) to all processors

    2. Pj, upon receiving the release message, removes Pi's request from the top of its queue
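The local bookkeeping in the steps above can be sketched as follows. This is a minimal illustration, not the slides' own code: the class name `LamportSite`, its method names, and the use of a heap for the request queue are assumptions, and message delivery is left to the caller.

```python
import heapq

class LamportSite:
    """Local state of one site (hypothetical helper; no networking)."""

    def __init__(self, site_id, all_ids):
        self.id = site_id
        self.others = [s for s in all_ids if s != site_id]
        self.queue = []       # min-heap of (timestamp, site_id) requests
        self.last_seen = {}   # largest reply timestamp seen from each other site

    def add_request(self, ts, sender):
        # Used both for the site's own request and for requests it receives.
        heapq.heappush(self.queue, (ts, sender))

    def on_reply(self, ts, sender):
        self.last_seen[sender] = max(ts, self.last_seen.get(sender, ts))

    def on_release(self, sender):
        # Drop the released request and restore the heap order.
        self.queue = [r for r in self.queue if r[1] != sender]
        heapq.heapify(self.queue)

    def can_enter(self, my_ts):
        at_top = bool(self.queue) and self.queue[0] == (my_ts, self.id)
        all_later = all(self.last_seen.get(s, -1) > my_ts for s in self.others)
        return at_top and all_later
```

Ordering requests as `(timestamp, site_id)` tuples breaks timestamp ties by site id, which is one common way to obtain the total ordering the algorithm relies on.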


    Lamport's Algorithm -- Properties

    this algorithm requires

    a total ordering of events

    all sites to be alive

    3(N-1) messages per request: N-1 request, N-1 reply, and N-1 release messages

    response time under very low load: 2T

    T: per-message communication latency

    assume no site is in the CS

    N-1 request messages are sent in parallel (T)

    N-1 reply messages are sent in parallel (T), so the requester enters the CS after 2T


    Ricart-Agrawala's Algorithm

    Request the CS:

    1. Pi broadcasts request (ti, i) to all processors

    2. Pj, upon receiving the request,

    a) sends reply (tj, j) to Pi if Pj is neither requesting nor executing in the CS

    b) sends reply (tj, j) to Pi if Pj is requesting the CS but the timestamp of Pj's request is larger than ti

    c) defers the request otherwise

    Enter the CS:

    1. if Pi has received reply messages from all other sites, then it enters the CS

    Release the CS:

    1. Pi, upon exiting the CS, sends reply (i) to all deferred requests
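The reply-or-defer decision in step 2 can be written as a single function. This is a sketch under assumptions: the state labels `"idle"`, `"requesting"` and `"in_cs"` and the function name are hypothetical, and timestamp ties are broken by site id, which the slides do not spell out.

```python
def should_reply_now(state, my_request, incoming):
    """Return True to reply immediately, False to defer the request.

    state:      "idle", "requesting" or "in_cs" (hypothetical labels)
    my_request: (timestamp, site_id) of this site's pending request, or None
    incoming:   (timestamp, site_id) of the request just received
    """
    if state == "idle":
        return True                    # case a: neither requesting nor in the CS
    if state == "requesting":
        return incoming < my_request   # case b: the earlier request wins
    return False                       # case c: executing in the CS, so defer
```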


    Ricart-Agrawala's Algorithm

    this algorithm requires

    a total ordering of events

    all sites to be alive

    2(N-1) messages per request

    response time under very low load: 2T

    N-1 request messages are sent in parallel (T)

    N-1 reply messages are sent in parallel (T)


    Maekawa's Algorithm

    Request set: each node has a request set

    when a node wants to enter the critical section, it sends its request to all nodes in its request set

    the request set of each node does not include all nodes in the system

    the intersection of any two request sets is non-empty

    Example

    consider three nodes, X, Y, and Z

    X's request set includes nodes X and Y

    Y's request set includes nodes Y and Z

    Z's request set includes nodes Z and X
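The non-empty intersection property for the three-node example can be checked mechanically. The helper name `pairwise_intersecting` below is an assumption introduced for illustration.

```python
from itertools import combinations

# Request sets from the X, Y, Z example.
request_sets = {"X": {"X", "Y"}, "Y": {"Y", "Z"}, "Z": {"Z", "X"}}

def pairwise_intersecting(sets):
    """True if every pair of request sets shares at least one node."""
    return all(sets[a] & sets[b] for a, b in combinations(sets, 2))

print(pairwise_intersecting(request_sets))  # True
```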


    Maekawa's Algorithm

    Request the CS:

    1. Pi multicasts request (ti, i) to its request set, including itself

    2. Pj, upon receiving the request,

    a) if it is not currently locked, locks itself and sends reply (j) to Pi

    b) otherwise, puts the request in a waiting queue (ordered by timestamp)

    Enter the CS:

    1. if Pi has received reply messages from all sites in its request set, then it enters the CS

    Release the CS:

    1. Pi, upon exiting the CS, sends release (ti) to all processors in its request set

    2. Pj, upon receiving the release message,

    a) if its waiting queue is not empty, removes the first entry from the queue and sends reply (j) to that node

    b) otherwise, unlocks itself
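The behaviour of a node Pj in its role as a member of someone's request set can be sketched as a small lock-plus-queue object. The class name `Arbiter` and its return-value convention are assumptions for illustration; sending the actual reply is left to the caller.

```python
import heapq

class Arbiter:
    """Lock state of one node in a request set (hypothetical helper)."""

    def __init__(self):
        self.locked_for = None
        self.waiting = []   # min-heap of (timestamp, requester)

    def on_request(self, ts, requester):
        """Return the requester to reply to, or None if the request was queued."""
        if self.locked_for is None:
            self.locked_for = requester
            return requester
        heapq.heappush(self.waiting, (ts, requester))
        return None

    def on_release(self):
        """Return the next requester to reply to, or None if now unlocked."""
        if self.waiting:
            _, nxt = heapq.heappop(self.waiting)
            self.locked_for = nxt
            return nxt
        self.locked_for = None
        return None
```

Note that the lock goes to whichever request arrives first, not to the one with the smallest timestamp; that is the source of the deadlock discussed on the following slides.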


    Maekawa's Algorithm -- Properties

    requires a total ordering of events

    requires 3√N messages per request

    response time under very low load: 2T

    K-1 request messages are sent in parallel (T)

    K-1 reply messages are sent in parallel (T)

    has a potential deadlock problem


    Potential Deadlock Problem in Maekawa's Algorithm

    requests reach different sites in different order

    consider nodes X, Y, and Z, which all issue requests to enter the critical section

    X's request has the lowest timestamp, Z's request has the highest

    A is the mediator of requests from X and Y

    B is the mediator of requests from Y and Z

    C is the mediator of requests from X and Z

    A received X's request first and locked itself for X

    B received Y's request first and locked itself for Y

    C received Z's request first and locked itself for Z

    X will not get a reply from C

    Y will not get a reply from A

    Z will not get a reply from B

    deadlock
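The circular wait in this scenario can be made explicit by building a wait-for relation from the mediator locks. The variable and function names below are assumptions introduced for illustration.

```python
# Which two requesters each mediator serves (from the example above).
mediates = {"A": ("X", "Y"), "B": ("Y", "Z"), "C": ("X", "Z")}
# The request that reached each mediator first; the mediator is locked for it.
locked_for = {"A": "X", "B": "Y", "C": "Z"}

# waits_for[p] = the requester holding the lock that p still needs.
waits_for = {}
for mediator, (p, q) in mediates.items():
    holder = locked_for[mediator]
    blocked = q if holder == p else p
    waits_for[blocked] = holder

def in_wait_cycle(start, waits_for):
    """Follow wait-for edges from `start`; True if the walk returns to `start`."""
    seen = set()
    node = waits_for.get(start)
    while node is not None and node not in seen:
        if node == start:
            return True
        seen.add(node)
        node = waits_for.get(node)
    return False

print(waits_for)                      # {'Y': 'X', 'Z': 'Y', 'X': 'Z'}
print(in_wait_cycle("X", waits_for))  # True
```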


    Solution to the Potential Deadlock Problem

    detect the potential deadlock

    when a request with a smaller timestamp is received while the node is locked for a request with a larger timestamp

    resolution

    ask the requester with the larger timestamp to give up its granted privilege if it has not already received all replies

    for the previous example, C asks Z to give up the granted privilege


    Resolve the Potential Deadlock Problem

    Request the CS:

    1. Pi multicasts request (ti, i) to its request set, including itself

    2. Pz, upon receiving the request,

    a) if it is not currently locked, locks itself and sends reply (z) to Pi

    b) if it is currently locked for Pk, then

    if the request from Pk has a smaller timestamp, puts the new request in a waiting queue (ordered by timestamp) and sends failed (z) to Pi

    otherwise (Pi's request has a smaller timestamp), sends inquire (z) to Pk


    Resolve the Potential Deadlock Problem

    Request the CS (continued):

    3. Pk, upon receiving inquire (z),

    a) if it has received a failed message, sends relinquish (k) to all sites in its request set

    b) if it has received all reply messages, ignores the inquire message

    c) otherwise, simply waits

    4. Pz, upon receiving relinquish (k),

    a) changes its lock to be locked for Pi and sends reply (z) to Pi

    Property

    requires at most 5√N messages per request

    response time under very low load: 2T


    Request Set Generation

    Assume

    a total of N nodes

    Let Si denote the request set for Pi; the request sets have to satisfy

    Si ∩ Sj ≠ ∅, for all i, j

    Si always contains Pi, for all i

    additional desirable properties

    |Si| = |Sj| = K, for all i, j, and for some K

    i.e., the request sets are of equal size, and each is of size K

    O(Pi) = O(Pj) = D, for all i and j

    O(Pi) denotes the number of occurrences of Pi in all request sets

    i.e., each node is involved in D request sets


    Request Set Generation

    relationship between K and D

    N nodes, each with a request set of size K

    a total of N·K set entries are required (nodes can be duplicated)

    since there are N nodes, each node has to appear D times, giving N·D entries

    N·K = N·D, so K = D

    request set size K

    consider the first request set: it has K nodes, and each of them can be in (K-1) other request sets

    each other request set should contain at least one of the nodes in the first request set

    a total of K(K-1) extra request sets other than the first one

    N = K(K-1)+1, so K ≈ √N
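The counting argument on this slide can be restated compactly, with the quadratic solved for K to make the square-root growth explicit. The closed form for K is an added step, not taken from the slide; it follows directly from the slide's relation between N and K.

```latex
\begin{align*}
N \cdot K &= N \cdot D \;\Rightarrow\; K = D \\
N &= K(K-1) + 1 \\
K &= \frac{1 + \sqrt{4N - 3}}{2} \approx \sqrt{N}
\end{align*}
```

For instance, N = 31 gives K = (1 + 11)/2 = 6, and N = 7 gives K = (1 + 5)/2 = 3.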


    Request Set Generation

    assume N = K(K-1)+1, for some K, and K-1 is a prime number

    consider a matrix of size (K-1) by (K-1)

    it can generate K groups of K-1 nonintersecting sets

    K-1 nonintersecting rows

    K-1 nonintersecting columns

    (K-2) groups of (K-1) nonintersecting diagonals

    different diagonals: jump 1 on each row (the real diagonal), jump 2, ..., jump (K-1)-1

    each number (out of the first K numbers) can be combined with each of the K-1 nonintersecting sets in one group to produce K-1 sets that intersect in exactly one element


    Request Set Generation Example -- K=6

    N = 6 * 5 + 1 = 31, K = 6, matrix is 5 by 5

    the first K numbers, 123456, form one set

    1 combined with each row forms one set per row

    2 combined with each column forms one set per column

    3 combined with each jump-1 diagonal

    jump-1 diagonals: 7djpv, 8ekqr, 9flms, ...

    4 combined with each jump-2 diagonal

    jump-2 diagonals: 7elnu, 8fhov, 9gipr, ...

    5 combined with each jump-3 diagonal

    jump-3 diagonals: 7fiqt, 8gjmu, ...

    6 combined with each jump-4 diagonal

    jump-4 diagonals: 7gkos, 8clpt, ..., bfjnr

    total K(K-1)+1 = 31 sets

    1 2 3 4 5 6

    7 8 9 a b

    c d e f g

    h i j k l

    m n o p q

    r s t u v
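The row, column, and jump-diagonal construction can be sketched in a few lines and checked for the pairwise-intersection property. This is an illustration under assumptions: the function name is hypothetical, K-1 is taken to be prime, and nodes are numbered 1..31 instead of using the slide's labels (so a = 10, b = 11, ..., v = 31).

```python
from itertools import combinations

def maekawa_request_sets(K):
    """Build K(K-1)+1 request sets of size K; assumes K-1 is prime."""
    p = K - 1

    def cell(row, col):
        # Matrix entry; the matrix holds nodes K+1 .. K(K-1)+1, row by row.
        return K + 1 + row * p + col

    sets = [set(range(1, K + 1))]                 # the first K numbers
    for row in range(p):                          # node 1 with each row
        sets.append({1} | {cell(row, c) for c in range(p)})
    for col in range(p):                          # node 2 with each column
        sets.append({2} | {cell(r, col) for r in range(p)})
    for jump in range(1, p):                      # node jump+2 with each diagonal
        for start in range(p):
            diagonal = {cell(r, (start + jump * r) % p) for r in range(p)}
            sets.append({jump + 2} | diagonal)
    return sets

sets = maekawa_request_sets(6)
print(len(sets))                                                # 31
print(all(len(a & b) == 1 for a, b in combinations(sets, 2)))   # True
```

Any two sets share exactly one node because two distinct "lines" of the matrix with different slopes meet in exactly one cell when K-1 is prime, while lines with the same slope are joined only by their common leading number.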


    Request Set Assignment Example -- K=6

    How to assign the 31 sets to the 31 nodes

    node 1 gets the first set: 123456

    the request set constructed from each row is assigned to the 2nd node in the set

    e.g., request set 1789ab is assigned to node 7

    now, all nodes in the first column have their request sets

    node 2 gets the set of 2 and the first column

    the request set constructed from each column is assigned to the 2nd node in the set

    e.g., node 8 has request set 28dins

    note that set 27chmr is assigned to node 2, not node 7

    now, the first node of each column and each row has its request set

    the jump-X diagonals will be assigned to the rest of the nodes

    1 2 3 4 5 6

    7 8 9 a b

    c d e f g

    h i j k l

    m n o p q

    r s t u v

    3 4 5 6

    d e f g

    i j k l

    n o p q

    s t u v


    Request Set Assignment Example -- K=6

    the request set constructed from each jump-1 diagonal is assigned to the 3rd node in the request set

    e.g., request set 37djpv is assigned to node d

    but set 3bciou is assigned to node 3, not node c

    the request set constructed from each jump-2 diagonal is assigned to the 4th node in the request set

    e.g., request set 47elnu is assigned to node l

    but set 48fhov is assigned to node 4, not node h

    the request set constructed from each jump-3 diagonal is assigned to the 5th node in the request set

    e.g., request set 57fiqt is assigned to node q

    but set 58gjmu is assigned to node 5, not node m

    the request set constructed from each jump-4 diagonal is assigned to the last node in the request set

    e.g., request set 67gkos is assigned to node s

    but set 6bfjnr is assigned to node 6, not node r

    1 2 3 4 5 6

    7 8 9 a b

    c d e f g

    h i j k l

    m n o p q

    r s t u v


    Request Sets Generation Algorithm (Cont.)

    if K-1 is a power of a prime number

    it is possible to generate optimal request sets

    if K-1 is not a power of a prime number, or N cannot be expressed as K(K-1)+1

    find a number M, where M is the smallest integer that is greater than N and can be expressed as K(K-1)+1 for some K, where K-1 is a power of a prime number

    generate the required sets for M processors

    replace numbers N+1..M by 1..M-N

    remove M-N sets

    the same thing can be done for site failures


    Request Set Generation Example -- N=5

    consider the closest number M that can be expressed as K(K-1)+1

    N = 5, M = 7

    derive the sets from M = 7 and remove the duplicated nodes

    1 2 3

    4 5

    1 2 -- replace nodes 6 and 7 by 1 and 2

    S1 = {1, 2, 3}

    S4 = {1, 4, 5}

    S6 = {1, 1, 2} -- remove

    S2 = {2, 4, 1}

    S5 = {2, 5, 2} = {2, 5}

    S7 = {3, 4, 2} -- remove

    S3 = {3, 5, 1}
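The N = 5 reduction can be replayed in code to confirm that the surviving sets still intersect pairwise. The M = 7 sets below are reconstructed from the slide by undoing the renaming (6 and 7 in place of the substituted 1 and 2); the function name `shrink` is an assumption.

```python
from itertools import combinations

# Request sets for M = 7, indexed by the node that owns each set.
sets_for_7 = {
    1: {1, 2, 3}, 4: {1, 4, 5}, 6: {1, 6, 7},
    2: {2, 4, 6}, 5: {2, 5, 7}, 7: {3, 4, 7}, 3: {3, 5, 6},
}

def shrink(sets, n):
    """Rename nodes n+1.. to 1.. and drop the sets owned by removed nodes."""
    def rename(x):
        return x if x <= n else x - n
    return {i: {rename(x) for x in s} for i, s in sets.items() if i <= n}

sets_for_5 = shrink(sets_for_7, 5)
print(sets_for_5[5])  # {2, 5}
print(all(a & b for a, b in combinations(sets_for_5.values(), 2)))  # True
```

The sets are no longer of equal size after the reduction, so the equal-size property is traded away while the intersection property is kept.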


    Token Ring Algorithm

    a unique token is associated with the CS

    Pi enters CS only if it owns the token

    Request to enter CS:

    1. if Pj owns the token and does not need to enter the CS, then it passes the token to P(j+1) mod N

    2. Pi will sooner or later get the token

    Enter the CS:

    1. when Pi owns the token, it enters CS

    Release the CS:

    1. pass the token to the next processor
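A short sketch of the token's trip around the ring, assuming sites are numbered 0..N-1 and the token always moves to (j+1) mod N. Both function names are hypothetical.

```python
def token_ring_hops(n, holder, requester):
    """Token passes needed before `requester` gets the token from `holder`,
    assuming no other site wants the CS in between."""
    return (requester - holder) % n

def circulate(n, holder, wants):
    """One full trip of the token starting at `holder`; returns the order
    in which the sites in `wants` enter the CS."""
    order = []
    for step in range(n):
        site = (holder + step) % n
        if site in wants:
            order.append(site)
    return order

print(token_ring_hops(5, 3, 1))    # 3
print(circulate(5, 3, {0, 2, 4}))  # [4, 0, 2]
```

Averaging `token_ring_hops` over all possible requesters gives (N-1)/2 passes, and the maximum is N-1.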


    Token Ring Algorithm -- Properties

    simple, with no deadlock or starvation

    number of messages and response time

    if only one node needs the token, the token will traverse N/2 nodes on average

    best case: 0 messages (the node has the token), 0 delay

    worst case: N-1 messages (sent sequentially), (N-1)T delay

    tolerable overhead with small N

    cannot scale up for large N

    it is difficult to design a fault-tolerant algorithm for this scheme

    The concept of a token is similar to centralized control; however, the central site is moving


    Suzuki-Kasami's Broadcast Algorithm

    data structures:

    vector X: associated with the token

    X[i]: the timestamp of the last request from Pi that has been served

    vector RTj: associated with node Pj

    RTj[i]: the timestamp of the most recent request from Pi known by Pj

    node j determines whether a node k has an outstanding request by checking whether RTj[k] > X[k]
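The outstanding-request test is a direct element-wise comparison of the two vectors. In this sketch both vectors are plain Python lists indexed by site number, and the function name is an assumption.

```python
def outstanding(rt_j, x):
    """Sites k whose latest known request is newer than the last one served,
    i.e. RTj[k] > X[k]."""
    return [k for k in range(len(x)) if rt_j[k] > x[k]]

# Site 2 has issued request 3, but the token records only request 1 as served.
print(outstanding([2, 0, 3, 1], [2, 0, 1, 1]))  # [2]
```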


    Suzuki-Kasami's Broadcast Algorithm

    Enter the CS:

    if Pi has received the token, then it enters the CS

    Release the CS:

    Pi, upon exiting the CS, sets X[i] = RTi[i]

    executes (A)


    Suzuki-Kasami's Broadcast Algorithm -- Properties

    this algorithm gives better fault tolerance in the sense of handling requests

    as long as the request is received by some processor that will possess the token, the request will be processed

    however, the problem of a missing token is still there

    e.g., the token is held by a dead processor or is sent to a dead processor

    requires N messages per request

    N-1 messages for broadcasting the request

    1 message for sending the token

    if the node that wants to enter the critical section happens to have the token, then no message is needed

    response time

    in general, there is a delay of 2T

    in the best case, there is no delay


    Raymond's Tree-Based Algorithm

    the processors are structured as a tree and the token is placed at the root node

    the tree restructures when the token moves

    Request the CS (going up the tree):

    1. Pi, if it does not hold the token, sends request (i) to its parent and puts the request in its queue

    2. Pj, upon receiving the request,

    a) puts the request in its queue

    b) if it has not sent a request to its parent, sends request (j) to its parent

    c) otherwise (a request has already been sent to its parent for another child node), does nothing


    Raymond's Tree-Based Algorithm

    Request the CS (going down the tree):

    3. the root site, upon receiving the request,

    a) puts the request in its queue

    b) executes (DTPR)

    4. Pj, upon receiving the token,

    a) if it was not requesting to enter the CS, or its request was not on the top of its queue, executes (DTPR)

    D. delete the top entry from its request queue

    T. send the token to the requesting child

    P. update the parent pointer to point to the requesting child

    R. if its request queue is non-empty, send a request to the new parent


    Raymond's Tree-Based Algorithm

    Enter the CS:

    1. if Pi has received the token and its request is on the top of its queue, then it enters the CS

    Release the CS:

    1. Pi, upon exiting the CS,

    a) if its queue is not empty, executes (DTPR)
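The request, token-passing, and release rules can be combined into a small single-process simulation. This is a sketch under simplifying assumptions: messages are modelled as direct method calls that are delivered instantly, the class and attribute names are hypothetical, and the parent pointer is updated just before the token is handed over (P before T) because the calls are synchronous.

```python
from collections import deque

class RaymondNode:
    """One node of the tree; `parent is None` means it holds the token."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.queue = deque()   # pending requesters: this node or a child
        self.asked = False     # a request has already been sent to the parent
        self.in_cs = False

    def request(self, requester=None):
        """Queue a request (own or from a child) and forward one upward."""
        self.queue.append(self if requester is None else requester)
        if self.parent is None:
            if not self.in_cs:
                self._dtpr()
        elif not self.asked:
            self.asked = True
            self.parent.request(self)

    def receive_token(self):
        self.parent = None
        self.asked = False
        self._dtpr()

    def release(self):
        self.in_cs = False
        if self.queue:
            self._dtpr()

    def _dtpr(self):
        nxt = self.queue.popleft()      # D: delete the top entry
        if nxt is self:
            self.in_cs = True           # own request on top: enter the CS
            return
        self.parent = nxt               # P: point to the requesting child
        nxt.receive_token()             # T: send the token
        if self.queue:                  # R: ask the new parent again
            self.asked = True
            nxt.request(self)
```

For the seven-node tree used in the example on the following slides, calling `request()` on node 5 while node 1 holds the token moves the token down through node 2 and leaves node 5 as the new root.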


    Raymond's Tree-Based Algorithm-- Example

    1

    2 3

    4 5 6 7

    1. token is at node 1; node 5 makes a request

    2. node 2 receives the request and sends the request to node 1

    3. node 4 also sends a request; node 2 receives it

    4. token is at node 2 now; node 2 becomes the root

    5. node 5 gets the token and enters the CS

    6. node 2 sends a request to node 5


    Raymond's Tree-Based Algorithm-- Example

    1

    2 3

    4 5 6 7

    7. node 5 sends the token to node 2

    8. node 4 gets the token and enters the CS

    9. node 3 sends a request

    10. the request from node 2 comes to node 4

    11. node 3 gets the token and becomes the root


    Raymond's Tree-Based Algorithm -- Properties

    the node with the token is always the root node

    requires the nodes on the entire path, from requester to root, to be alive in order to process a request

    still has the lost token problem

    requires 2 log N messages per request on average

    longest path: 2 log N (when the root is at a leaf of the original tree)

    best case: 0 messages

    worst case: 4 log N messages (2 log N to the root, 2 log N back with the token)

    response time

    the message passing has to be done sequentially

    the average response time: T log N

    the best case response time: 0

    the worst case response time: 4T log N


    Performance Comparisons

    T: per message transmission time

    E: computation time

    response time: consider low load

    algorithm          response time          # messages

    Lamport            2T+E                   3(N-1)

    Ricart-Agrawala    2T+E                   2(N-1)

    Maekawa            2T+E                   3√N to 5√N

    token-ring         [0 to N·T]+E           0 to N

    broadcast          [0 or 2T]+E            0 or N

    tree-based         [0 to 4T log N]+E      0 to 4 log N
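To get a feel for the message counts in the table, they can be evaluated for a concrete N. This is a sketch with assumptions: the function name is hypothetical, each entry is returned as a (min, max) pair, and log is taken as base 2, which the slides do not specify.

```python
import math

def message_counts(n):
    """(min, max) messages per CS entry for each algorithm, for n sites."""
    root = math.sqrt(n)
    log = math.log2(n)   # base 2 is an assumption; the slides only write log N
    return {
        "Lamport": (3 * (n - 1), 3 * (n - 1)),
        "Ricart-Agrawala": (2 * (n - 1), 2 * (n - 1)),
        "Maekawa": (3 * root, 5 * root),
        "token-ring": (0, n),
        "broadcast": (0, n),
        "tree-based": (0, 4 * log),
    }

for name, (low, high) in message_counts(16).items():
    print(f"{name:16s} {low:g} to {high:g}")
```

With N = 16, Lamport needs 45 messages and Ricart-Agrawala 30, while Maekawa stays between 12 and 20 and the tree-based algorithm at no more than 16.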