Distributed Process Coordination Presentation 1 - Sept. 14th 2002 CSE8343 - Spring 02 Group A4: Chris Sun, Min Fang, Bryan Maden


Page 1:

Distributed Process Coordination
Presentation 1 - Sept. 14th 2002

CSE8343 - Spring 02

Group A4: Chris Sun, Min Fang, Bryan Maden

Page 2:

Introduction

• Coordination requires either a global time or a global ordering of events

• Can be accomplished through hardware, software, or a combination of both

• Coordination can be centralized or distributed

• Used to solve critical section problems

Page 3:

Introduction (cont.)

• Solutions:
• Mutual Exclusion
– Centralized
– Distributed
– Token Ring
• Clocks
– Lamport's Timestamp
– Cristian's Algorithm
– Berkeley Algorithm
• Coordinators
– Election Algorithm
– Bully Algorithm

Page 4:

Mutual Exclusion: Centralized

• One process is chosen to be the coordinator.
• Any process wanting to enter the critical section (CS) sends a message to the coordinator.
• The coordinator chooses who enters the CS and sends a reply message allowing that process to enter.
• When a process is finished in the CS, it sends a 'finished' message to the coordinator.
• The coordinator chooses the next CS recipient.
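The centralized scheme above can be simulated in a few lines. This is a minimal Python sketch; the `Coordinator` class and its method names are illustrative, not from the slides — in a real system the calls would be messages over the network.

```python
from collections import deque

class Coordinator:
    """Sketch of centralized mutual exclusion: the coordinator grants
    the CS to one process at a time and queues later requesters."""
    def __init__(self):
        self.holder = None        # process currently in the CS, if any
        self.waiting = deque()    # requesters whose reply is deferred

    def request(self, pid):
        """A process asks to enter the CS; True means 'enter now'."""
        if self.holder is None:
            self.holder = pid     # reply immediately: grant the CS
            return True
        self.waiting.append(pid)  # defer the reply
        return False

    def finished(self, pid):
        """The holder leaves the CS; the next queued process is granted."""
        assert self.holder == pid
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder        # next CS recipient (or None)

c = Coordinator()
c.request(1)         # granted immediately
c.request(2)         # queued: reply deferred
c.request(3)         # queued behind 2
nxt = c.finished(1)  # process 2 becomes the next recipient
```

Because the queue is FIFO, no waiting process is starved as long as every holder eventually sends 'finished'.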

Page 5:

Mutual Exclusion: Distributed

• Process Pi wants to enter the CS.

• It generates a timestamp TS and sends the message request(Pi, TS) to all processes in the system (including itself).

• When process Pj receives a request message, it may reply immediately with a reply message, or it may defer sending the reply message.

• When process Pi has received a reply message from all other processes, it enters the CS.

Page 6:

Mutual Exclusion: Distributed (cont.)

Process Pj decides when to send a reply message to process Pi based on three factors:

1) If process Pj is in its critical section, it defers its reply to Pi.

2) If process Pj does not want to enter its critical section, it sends a reply immediately to Pi.

3) If process Pj is waiting to enter its critical section, Pj compares its own request message timestamp with the timestamp TS of the incoming request message. If its own request timestamp is greater than TS, process Pj sends a reply immediately; otherwise the reply is deferred.
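The three-way decision is the reply rule of the Ricart–Agrawala algorithm (the slides describe it without naming it). A minimal Python sketch; the function name, the state strings, and the use of (timestamp, process id) pairs to break ties are illustrative assumptions:

```python
def should_reply_now(state_j, ts_j, pid_j, ts_i, pid_i):
    """Does Pj reply immediately to Pi's request(Pi, ts_i)?
    state_j is Pj's state: 'in_cs', 'idle', or 'wanting';
    (ts_j, pid_j) is Pj's own pending request, if any."""
    if state_j == 'in_cs':
        return False                    # case 1: defer until Pj leaves the CS
    if state_j == 'idle':
        return True                     # case 2: Pj does not want the CS
    # case 3: both want the CS; the earlier (smaller) timestamp wins,
    # with the process id breaking ties so the order is total
    return (ts_i, pid_i) < (ts_j, pid_j)

should_reply_now('wanting', ts_j=10, pid_j=2, ts_i=8, pid_i=1)  # True: Pi asked first
should_reply_now('wanting', ts_j=5,  pid_j=2, ts_i=8, pid_i=1)  # False: Pj asked first
```

Breaking ties on process id is what makes the "greater than TS" comparison unambiguous when two requests carry the same timestamp.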

Page 7:

Mutual Exclusion: Distributed (cont.)

This algorithm provides:
• Mutual exclusion
• Deadlock prevention
• Starvation prevention

All processes in the system must:
• Know all the other processes
• Receive new process information

Page 8:

Token Passing

A token is a special type of message that is passed around the system. Possession of the token allows a process to enter the CS.

• When a process receives the token, it can hold the token and enter the CS, or it may pass the token on, giving up the right to enter the CS.
• If the token is lost, an election is held to create a new one.
• If a process dies, the ring is broken and a new one must be created.
• Starvation and deadlock are avoided with a unidirectional ring and a single token.
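The ring itself is simple enough to sketch. A minimal Python sketch assuming n healthy processes and a single token (the `TokenRing` class is illustrative; loss detection and ring repair are omitted):

```python
class TokenRing:
    """Sketch of token passing on a unidirectional ring of n processes:
    the token circulates 0 -> 1 -> ... -> n-1 -> 0, and only its holder
    may enter the critical section."""
    def __init__(self, n):
        self.n = n
        self.token_at = 0            # process 0 holds the token initially

    def can_enter_cs(self, pid):
        return pid == self.token_at  # possession of the token == right to enter

    def pass_token(self):
        """Holder gives up its right to the CS; token moves to the successor."""
        self.token_at = (self.token_at + 1) % self.n
        return self.token_at

ring = TokenRing(4)
ring.can_enter_cs(0)   # True: process 0 holds the token
ring.pass_token()      # token moves to process 1
ring.can_enter_cs(0)   # False: 0 gave up its turn
```

With one token on a unidirectional ring, at most one process can be in the CS (no deadlock among requesters), and every process sees the token within one full revolution (no starvation), which is exactly the claim on this slide.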

Page 9:

Logical Clocks

Page 10:

Logical Clock and Lamport Timestamp

• Logical clock
– The order of events matters more than absolute time
– E.g., UNIX make: recompile input.o only if input.c is newer

• Lamport timestamp
– Synchronizes logical clocks

• Happens-before relation
– A -> B : A happens before B
– Two cases determine "happens-before":
– A and B are in the same process, and A occurs before B: A -> B
– A is the send event of message M, and B is the receive event of the same message M

• Transitive relation
– If A -> B and B -> C, then A -> C

• Concurrent events
– Neither A -> B nor B -> A is true

Page 11:

Lamport Algorithm

• Assign time value C(A) such that
– If a happens before b in the same process, C(a) < C(b)
– If a and b represent the sending and receiving of a message, C(a) < C(b)

• Lamport Algorithm
– Each process increments its local clock between any two successive events
– Each message carries a timestamp
– Upon receiving a message, if the received timestamp is ahead, the receiver fast-forwards its clock to one more than the sending time

• Extension for total ordering
– Requirement: for all distinct events a and b, C(a) ≠ C(b)
– Solution: break ties between concurrent events using the process number
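The rules above fit in one small class. A minimal Python sketch (the `LamportClock` class and its method names are illustrative; real processes would exchange the timestamps in messages):

```python
class LamportClock:
    """Sketch of Lamport's algorithm: tick between successive local
    events, and on receive fast-forward to one past the message time."""
    def __init__(self, pid):
        self.time = 0
        self.pid = pid                   # used only to break ties

    def local_event(self):
        self.time += 1                   # increment between successive events
        return self.time

    def send(self):
        self.time += 1
        return self.time                 # timestamp carried by the message

    def receive(self, msg_time):
        # fast-forward if the sender's clock is ahead of ours
        self.time = max(self.time, msg_time) + 1
        return self.time

    def total_order_key(self):
        # (C(a), process number): distinct for distinct events,
        # giving the total-ordering extension on the slide
        return (self.time, self.pid)

p1, p2 = LamportClock(1), LamportClock(2)
ts = p1.send()         # p1's clock -> 1
p2.time = 5            # pretend p2's clock is already ahead
p2.receive(ts)         # max(5, 1) + 1 = 6: no jump needed
p1.receive(p2.send())  # p2 sends at 7; p1 fast-forwards from 1 to 8
```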

Page 12:

Lamport Timestamp Example

• Clocks run at different rates
• Correct clocks using the Lamport Algorithm

[Figure: three processes whose clocks tick 6, 8, and 10 time units per interval exchange messages A, B, C, and D. Messages sent "into the past" force corrections on receipt (corrected values include 61, 69, 70, 76, 77, and 85), so every receive timestamp exceeds its send timestamp.]

Page 13:

Totally-Ordered Multicast

• Definition: sending messages to a set of processes in such a way that all messages are delivered at every destination in the same order.

• Scenario
– Replicated accounts in New York (NY) and San Francisco (SF)
– Two transactions occur at the same time and are multicast
• Current balance: $1,000
• Add $100 at SF
• Add interest of 1% at NY
– Possible results, depending on delivery order
• $1,111 (deposit applied first)
• $1,110 (interest applied first)
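The two balances come purely from the order in which the replicas apply the same two updates. A quick Python check (the helper names are illustrative):

```python
def apply(balance, ops):
    """Apply account operations to a balance in the given order."""
    for op in ops:
        balance = op(balance)
    return balance

deposit  = lambda b: b + 100     # add $100 at SF
interest = lambda b: b * 1.01    # add 1% interest at NY

# NY and SF each receive both updates, but possibly in different orders:
apply(1000, [deposit, interest])   # (1000 + 100) * 1.01 = 1111.0
apply(1000, [interest, deposit])   # 1000 * 1.01 + 100   = 1110.0
```

If NY applies one order and SF the other, the replicas diverge by $1 — which is why the multicast must be *totally* ordered, not just reliable.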

Page 14:

Totally Ordered Multicast

• Use Lamport timestamps

• Algorithm (communication history)
– Each message is time-stamped with the sender's logical time
– Each message is multicast (including to the sender itself)
– When a message is received:
• It is put into a local queue
• The queue is ordered by timestamp
• An acknowledgement is multicast
– A message is delivered to the application only when:
• It is at the head of the queue
• It has been acknowledged by all involved processes

• Other algorithms: sequencer and destination agreement.
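The hold-back queue and delivery rule from the algorithm above can be sketched directly. A minimal Python sketch (the `TOMulticast` class is illustrative; it models one process's queue, with timestamps supplied by the caller and ties broken by sender name):

```python
import heapq

class TOMulticast:
    """Sketch of one process's view of totally-ordered multicast:
    received messages wait in a timestamp-ordered queue and are
    delivered only when at the head and acknowledged by everyone."""
    def __init__(self, n_procs):
        self.n = n_procs
        self.queue = []        # heap of (timestamp, sender)
        self.acks = {}         # (timestamp, sender) -> set of ackers
        self.delivered = []    # messages handed to the application, in order

    def receive(self, ts, sender):
        heapq.heappush(self.queue, (ts, sender))
        self.acks.setdefault((ts, sender), set())

    def ack(self, ts, sender, from_pid):
        self.acks[(ts, sender)].add(from_pid)
        self._try_deliver()

    def _try_deliver(self):
        # deliver while the head of the queue is fully acknowledged
        while self.queue and len(self.acks[self.queue[0]]) == self.n:
            self.delivered.append(heapq.heappop(self.queue))

box = TOMulticast(n_procs=2)
box.receive(3, 'SF'); box.receive(1, 'NY')
box.ack(3, 'SF', 0); box.ack(3, 'SF', 1)  # fully acked, but not at the head
box.ack(1, 'NY', 0); box.ack(1, 'NY', 1)  # head acked -> both deliver, in order
box.delivered                             # [(1, 'NY'), (3, 'SF')]
```

Note how the SF message, though acknowledged first, waits behind the NY message with the smaller timestamp — every process drains its queue in the same timestamp order, which is what makes the delivery order identical everywhere.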

Page 15:

Vector Timestamps

• Problem with Lamport timestamps
– C(a) < C(b) does not imply that a happened before b
– They do not capture "causality"

• Vector timestamp
– VT(a) < VT(b) when event a causally precedes event b
– Vi[i]: the number of events that have occurred so far at Pi
– If Vi[j] = k, then Pi knows that k events have occurred at Pj
– Increment Vi[i] at each new event at Pi
– When Pi sends message m, it piggybacks its current vector vt
– When Pj receives m:
• It adjusts its vector: Vj[k] = max{Vj[k], vt[k]} for each k
• Vj[j] is incremented by 1
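These update rules, and the causality test VT(a) < VT(b), can be sketched as follows (a minimal Python sketch; the class and function names are illustrative):

```python
class VectorClock:
    """Sketch of vector timestamps for a system of n processes,
    following the update rules on the slide."""
    def __init__(self, pid, n):
        self.pid, self.v = pid, [0] * n

    def event(self):
        self.v[self.pid] += 1              # increment Vi[i] at each new event

    def send(self):
        self.event()
        return list(self.v)                # piggyback the current vector

    def receive(self, vt):
        # merge componentwise, then count the receive as a local event
        self.v = [max(a, b) for a, b in zip(self.v, vt)]
        self.v[self.pid] += 1

def causally_precedes(a, b):
    """VT(a) < VT(b): a <= b in every component and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b

p1, p2 = VectorClock(0, 3), VectorClock(1, 3)
m = p1.send()                              # p1: [1, 0, 0]
p2.receive(m)                              # p2: [1, 1, 0]
causally_precedes([1, 0, 0], [1, 1, 0])    # True: the send precedes the receive
causally_precedes([2, 0, 0], [1, 1, 0])    # False: concurrent events
```

Unlike Lamport timestamps, comparing two vectors tells you *whether* one event could have influenced the other: incomparable vectors mean the events are concurrent.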

Page 16:

Vector Timestamps Example

• Vi[i]: the number of events that have occurred so far at Pi
• Vi[j]: the number of events that have occurred at Pj that Pi has potentially been affected by, where j ≠ i

[Figure: processes p1, p2, p3 plotted against physical time. p1 has events a (1,0,0) and b (2,0,0); message m1 from b carries (2,0,0) to p2, whose events c and d become (2,1,0) and (2,2,0); message m2 from d carries (2,2,0) to p3, where event e is (0,0,1) and the receive event f becomes (2,2,2).]

Page 17:

Physical Clocks

Page 18:

Clock Synchronization Algorithms

• Distributed Systems, pp. 245–250

• Overview

• Cristian's Algorithm

• Berkeley UNIX

• Averaging

Page 19:

Overview

• TAI: International Atomic Time
– BIH, Paris
– Averaged over about 50 cesium-133 clocks

• UTC: Coordinated Universal Time
– Based on TAI
– Basis of all modern timekeeping
– Replaces Greenwich Mean Time, an astronomical time standard

Page 20:

Overview (cont.)

• WWV
– Shortwave radio station
– Run by NIST, the National Institute of Standards and Technology
– Broadcasts UTC

Page 21:

Algorithm Types

• Centralized
– Clients ask the server (e.g., Cristian's)
– Server polls the clients (e.g., Berkeley)

• Decentralized
– Each host collects times from the others

Page 22:

Cristian's Algorithm

• The client asks the server for the current time

• The time server has a WWV receiver

• A client re-synchronizes no more often than every Δt = δ/2ρ seconds
– ρ: maximum clock drift rate (e.g., 10⁻⁵)
– δ: maximum allowed time deviation between hosts; two clocks drifting in opposite directions can be up to 2ρΔt apart

Page 23:

Cristian's Algorithm (cont.)

• Time must never run backwards
– Setting a fast clock back would be inconsistent
– Instead, the clock is gradually slowed down

• Compensate for propagation time

• Compensate for interrupt-handling time

• Apply a threshold

• Ask multiple times and average
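The propagation-time compensation and the multiple-asking idea fit in two small functions. A minimal Python sketch (function names are illustrative; it assumes symmetric propagation delay and ignores interrupt-handling time):

```python
def cristian_adjust(t_request_sent, t_reply_received, server_time):
    """Cristian's round-trip estimate: the client sets its clock to the
    server's time plus half the measured round-trip delay, on the
    assumption that the reply took half the round trip to arrive."""
    round_trip = t_reply_received - t_request_sent
    return server_time + round_trip / 2

def best_sample(samples):
    """Multiple askings: keep the sample with the smallest round trip,
    since it bounds the propagation-time error most tightly."""
    return min(samples, key=lambda s: s[1] - s[0])

# client sent at local time 100.0 s, reply at 100.8 s, carrying server time 205.0 s
cristian_adjust(100.0, 100.8, 205.0)   # ≈ 205.4: server time + 0.4 s of transit

samples = [(100.0, 100.8, 205.0), (130.0, 130.2, 235.1)]
t0, t1, ts = best_sample(samples)      # picks the 0.2 s round trip
cristian_adjust(t0, t1, ts)            # ≈ 235.2
```

If the adjusted time is behind the local clock, the slide's rule applies: don't jump backwards, slow the local clock until it is caught up.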

Page 24:

Berkeley UNIX

• The time daemon polls the clients

• It computes a standard time

• It broadcasts the standard time

Page 25:

Averaging Algorithm

• Every interval, each host broadcasts its time to all the others

• Each host computes its own time based on the information from the others

• Extreme values are discarded

• Corrections: propagation delay, topology
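The "discard extreme data" step is essentially a trimmed mean. A minimal Python sketch of one host's computation (the function and its `discard` parameter are illustrative; propagation corrections are omitted):

```python
def averaged_time(samples, discard=1):
    """Sketch of the decentralized averaging step: sort the times
    collected from all hosts, drop the `discard` most extreme values
    at each end, and average the rest."""
    s = sorted(samples)
    kept = s[discard:len(s) - discard] if discard else s
    return sum(kept) / len(kept)

# one wildly faulty clock (999.0) is discarded, along with the lowest sample
averaged_time([100.2, 100.0, 100.4, 999.0, 100.1])
```

Dropping the extremes keeps a single faulty or malicious clock from dragging the whole system's time off; each host would then add its estimate of propagation delay per sender before averaging.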

Page 26:

Process Coordination

Page 27:

Election Algorithm

• A leader is often needed in distributed systems
– As a controller or coordinator

• We need to elect a leader on startup and when the current leader fails
– E.g., to take over the role of a failed process, or to pick a master in the Berkeley clock synchronization algorithm

• Assumption
– Every process knows the IDs of all the other processes

• Conditions
– The operational process with the largest ID wins
– All operational processes should be informed of the new leader
– A recovering process can find the current leader

• Types of election algorithms: Bully and Ring

Page 28:

Bully Algorithm

• An election is initiated by any process P that notices the coordinator is no longer responding
– Concurrent multiple elections are possible

• Algorithm
– P sends ELECTION messages to all processes with higher IDs
– If no one responds, P wins and becomes coordinator
– It then sends COORDINATOR messages to all other processes
– If one of the higher-ups answers, it takes over; P is done
– Three message types: ELECTION, OK, COORDINATOR
– Several processes can initiate an election simultaneously
– O(n²) messages required with n processes
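The outcome of the algorithm can be sketched in a few lines. A minimal Python sketch assuming instantaneous, reliable messages among the live processes (the function name is illustrative; real implementations need timeouts to decide that "no one responds"):

```python
def bully_election(initiator, alive):
    """Sketch of the bully algorithm's outcome: the initiator challenges
    every higher ID; any live higher-up answers OK and takes over with
    its own election, so the highest live ID always wins."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator          # no one outranks us: we send COORDINATOR
    # a live higher process answers OK and runs its own election;
    # recursing from any of them converges on the maximum live ID
    return bully_election(min(higher), alive)

bully_election(4, alive={1, 2, 3, 4, 5, 6})   # 6 wins
bully_election(7, alive={1, 4, 7})            # 7 wins: highest ID still alive
```

The recursion mirrors the cascade on the next slide: 4's election is bullied aside by 5 and 6, whose own elections end with 6, the largest live ID, as coordinator.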

Page 29:

Bully Algorithm Example

• Process 4 holds an election
• Processes 5 and 6 respond, telling 4 to stop
• Now 5 and 6 each hold an election

Page 30:

References

• Tanenbaum and van Steen, Distributed Systems: Principles and Paradigms, pp. 245–250

• http://www.cse.fau.edu/~fdai/projects/ds/tom.pdf
• http://lsewww.epfl.ch/Documents/acrobat/DSU00.pdf
• http://data.uta.edu/~ramesh/cse5306/DC.html

• Silberschatz, Galvin, and Gagne, Applied Operating System Concepts, First Edition, Wiley & Sons, 2000, pp. 521–535