synchronization chapter 5. - 1 - contents qclock synchronization qlogical clocks qglobal state...
TRANSCRIPT
Synchronization
Chapter 5
- 2 -
Contents
Clock SynchronizationLogical ClocksGlobal StateElection AlgorithmsMutual ExclusionDistributed TransactionsConclusionCritical Idea
- 3 -
Clock synchronization
A simple question: Is it possible to synchronize all the clocks in a distributed system ?
- 4 -
Physical Clocks (1)
Some concept: Timer; counter, holding register; clock tick; clock skew.
Prob: How do we synchronize them with real-world clocksHow do we synchronize the clocks with each other
Mean solar second: measuring a large numbers of day taking average dividing by 86400
TAI: the mean number of sticks of the cesium 133 clocks (since 1/1/1958) divided by 9,192,631,770
- 5 -
Physical Clocks (2)
TAI: highly stable but late leap secondBy the way: raising their frequency from 60Hz or
50Hz 61Hz or 51Hz
- 6 -
Cristian’s Algorithm No more than δ/2ρ each machine sends a message to the
time server (which has a WWV receiver) asking for the current time
Probs: time must never run backward and it’s take a nonzero amount of time for the time server’s reply to get back to the sender
- 7 -
The Berkeley Algorithm
a)The time daemon asks all the other machines for their clock values
b)The machines answer
c)The time daemon tells everyone how to adjust their clock
- 8 -
Averaging Algorithms
At the beginning of each interval, every machine broadcasts the current time according to its clock.
Then it starts a local timer to collect all other broadcasts that arrive during some interval S.
The simplest algorithm is just to average the values from all other machines.
One of the most widely used algorithms in the Internet is the Network Time Protocol (NTP).
- 9 -
Assign time to dist. Sys.
If a happens before b in the same process, C(a) < C(b).
If a and b represent the sending and receiving of a message, respectively, C(a) < C(b).
For all distinctive events a and b, C(a) ≠ C(b).
- 10 -
Totally order Multicasting
Timestamp can be used to implement totally ordered multicast
- 11 -
Vector timestamps
VT(a) < VT(b) a: causally precede event bProperties of vector timestamps
Vi[i] is the number of events that have occurred so far at P i
If Vi[j] = k then Pj knows that k events have occurred at P j
Message r (from PJ): reaction of message a (PI)
PK process message r if:vt(r)[j] = Vk[j] + 1
vt(r)[i] ≤ Vk[i] for all i≠j
- 12 -
Global State (1)
a) A consistent cutb) An inconsistent cut
- 13 -
Global State (2)
a) Organization of a process and channels for a distributed snapshot
- 14 -
Global State (3)
b) Process Q receives a marker for the first time and records its local state
c) Q records all incoming messaged) Q receives a marker for its incoming channel and finishes
recording the state of the incoming channel
- 15 -
Election AlgorithmsElection algorithms: algorithms for electing a
coordinator (using this as a generic name for the special process).
Election algorithms attempt to locate the process with the highest process number and designate it as coordinator.
Goal: to ensure that when an election starts, it concludes with all processes agreeing on who the new coordinator is to be.
- 16 -
The Bully Algorithm (1)
The bully election algorithm Process 4 holds an election Process 5 and 6 respond, telling 4 to stop Now 5 and 6 each hold an election
- 17 -
The Bully Algorithm (2)
d) Process 6 tells 5 to stope) Process 6 wins and tells everyone
Ring Algorithm
We assume that the processes are physically or logically
ordered, so that each process knows who is successor is. When any process notices that the coordinator is not functioning,
it builds ELECTION message containing its own process num
and sends message to its successor. If successor is down, the sender skips over the successor and
goes to the next number along the ring, or the one after that,
until a running process is located. At each step , the sender adds its own process num to the list in
the message effectively making itself a candidate to be elected
as a coordinator.
- 18 -
- 19 -
Ring Algorithm
- 20 -
a) Process 1 asks the coordinator for permission to enter a critical region. Permission is grantedb) Process 2 then asks permission to enter the same critical region. The coordinator does not reply.c) When process 1 exits the critical region, it tells the coordinator, when then replies to 2
Mutual ExclusionCentralized Algorithm
Distributed Algorithm
When a process wants to enter a critical region, it builds a
message containing the name of the critical region it wants to
enter , its process number, and the current time. Sends the message to all other processes, conceptually including
itself. The sending of message is assumed to be reliable. When a process receives a request message from other process,
the action it takes depends on its state with respect to the
critical region named in the message. Three cases have to be distinguished.
- 21 -
Distributed Algorithm contd
1. If the receiver is not in the critical region and does not want
to enter it, it sends back an OK message to the sender. 2. If the receiver is already in the critical region, it does not
reply. Instead, it queues the request. 3. If the receiver wants to enter the critical region but has not
yet done so, it compares the timestamp in the incoming
message, the lowest one wins. If the incoming message is
lower, the receiver sends back an OK message. If its own
message has a lowest timestamp, the receiver queues the
incoming request and sends nothing.
- 22 -
- 23 -
Distributed Algorithm example
a) Two processes want to enter the same critical region at the same moment.b) Process 0 has the lowest timestamp, so it wins.c) When process 0 is done, it sends an OK also, so 2 can now enter the
critical region.
- 24 -
Token Ring Algorithm
a) An unordered group of processes on a network.
b) A logical ring constructed in software.
- 25 -
Comparison
A comparison of three mutual exclusion algorithms.
AlgorithmMessages per entry/exit
Delay before entry (in message times)
Problems
Centralized 3 2 Coordinator crash
Distributed 2 ( n – 1 ) 2 ( n – 1 )
Crash of any process
Group communication
Token ring 1 to 0 to n – 1Lost token, process crash
- 26 -
The Transaction Model (1)
Updating a master tape is fault tolerant.
- 27 -
The Transaction Model (2)
Examples of primitives for transactions.
Primitive Description
BEGIN_TRANSACTION Make the start of a transaction
END_TRANSACTION Terminate the transaction and try to commit
ABORT_TRANSACTION Kill the transaction and restore the old values
READ Read data from a file, a table, or otherwise
WRITE Write data to a file, a table, or otherwise
- 28 -
Four Characteristics
Atomic:to the outside world, the transaction happens indivisibly
Consistent: the transaction does not violate system invariants
Isolated: concurrent transactions do not interfere with each other
Durable: once a transaction commits, the changes are permanent
- 29 -
Limitations of Flat Transactions
Main limitation: do not allow partial results to be committed or aborted
In the case of updating all of the hyperlinks to a webpage W, which moved to a new location
BEGIN_TRANSACTION reserve WP -> JFK; reserve JFK -> Nairobi; reserve Nairobi -> Malindi;END_TRANSACTION
(a)
BEGIN_TRANSACTION reserve WP -> JFK; reserve JFK -> Nairobi; reserve Nairobi -> Malindi full =>ABORT_TRANSACTION (b)
- 30 -
Classification of Transactions
a) A nested transactionb) A distributed transaction
- 31 -
ImplementationPrivate Workspace
A file is only for read not modify there is no need for a private copy
a) The file index and disk blocks for a three-block fileb) The situation after a transaction has modified block 0 and appended block 3b) After committing
- 32 -
Writeahead Log
a) A transactionb) – d) The log before each statement is executed
x = 0;
y = 0;
BEGIN_TRANSACTION;
x = x + 1;
y = y + 2
x = y * y;
END_TRANSACTION;
(a)
Log
[x = 0 / 1]
(b)
Log
[x = 0 / 1]
[y = 0/2]
(c)
Log
[x = 0 / 1]
[y = 0/2]
[x = 1/4]
(d)
- 33 -
Concurrency Control (1)
General organization of managers for handling transactions
- 34 -
Concurrency Control (2) General organization of
managers for handling distributed transactions
- 35 -
Serializability
The whole idea behind concurrency control is to properly schedule conflicting operations (two read operations never conflict )
Synchronization can take place either through mutual exclusion mechanisms on shared data (i.e locking)
Or explicitly ordering operations using timestamps
- 36 -
Two-phase locking
A transaction T is granted a lock if there is no conflict The scheduler will never release a lock for data item x, until the data
manager acknowledges it has performed the operation for which the lock was set
Once the scheduler has released a lock on behalf of a transaction T, it will never grant another lock on behalf of T
- 37 -
Strict two-phase locking
In centralized 2PL: a single site is responsible for granting and releasing locks
In primary 2PL: each data item is assigned a primary copy
In distributed 2PL: the schedulers on each machine not only take care that locks are granted and released, but also that the operation is forwarded to the (local) data manager
- 38 -
Pessimistic Timestamp Ordering
Concurrency control using timestamps.
- 39 -
Conclusion
Lamport timestamps: if a happen before b C(a) < C(b).
Determining the global state can be done by synchronizing all processes so that each collects its own local state, along with the messages that are currently in transit.
Synchronization between processes choose a coordinator election algorithms
Mutual Exclusion algorithms: can be centralized or distributed
- 40 -
Conclusion & Critical Idea
A transaction consists of a series of operationsA transaction is durable, meaning that if it
completes, its effects are permanent
Two-phase locking can lead to dead lockAcquiring all locks in some canonical order to
prevent hold-and-wait cyclesUsing deadlock detection by maintaining an
explicit graph for cyclesInheritance Priority Protocol
- 41 -
42
Consistency and Replication
Chapter 6
43
Content1. Definition of Consistency and Replication
2. Understand Replication
Reason for Replication & Problem of Replication
3. The only solution for Replication Problem
4. Consistency Model
Data-centric models & Client-centric models
5. Distribution Protocols
Distributing updates to replicas
6. Consistency Protocols
Implementation of consistency models
44
Consistency and ReplicationIntroduction:
Replication:
Replication of data
Reason for Replication:
Enhance reliability or improve performance
Consistency:
Consistency of replicated data
Reason for Consistency:
Keep replicas to be the same
45
Understanding ReplicationIntroduction:
Object Replication
Purpose: managing data in distributed system
Consider objects instead of data alone
Benefit of encapsulating and operating data
Consistency Problem:
Whenever a replica is updated, that replica becomes different from the others. Synchronization Problem
Two Approaches focus on who will deal with it:
Object-specific Replication
Middleware Replication
46
Understanding Replication
a) A distributed system for replication-aware distributed objects.b) A distributed system responsible for replica management
Two Approaches
47
Only solution for Consistency ProblemIntroduction:
Synchronous replication consistency
Key idea: a single atomic operation, or transaction
Difficulties:
need to synchronize all replicas
a lot of communication time
expensive in terms of performance
Only solution:
Loosen the consistency constraints
need not to be executed as atomic operations
copies may not always same
48
Consistency ModelsIntroduction:
Consistency Model:
A contract between processes and the data store.
Two kind of Models:
Data-centric Consistency Models
Guarantee for a number of processes
Simultaneously update
Sequential consistency
Client-centric Consistency Models
Guarantee for a single processes
Lack simultaneous updates
49
Data-centric Consistency Models
The general organization of a logical data store, physically distributed and replicated across multiple processes.
50
Data-centric Consistency Models Data-centric Consistency Models (7 kinds)
Strict Consistency
Linearizability and Sequential Consistency
Causal Consistency
FIFO Consistency
Weak Consistency
Release Consistency
Entry Consistency
Loosen constraints
51
Data-centric Consistency ModelsConsistency Description
Strict Absolute time ordering of all shared accesses matters.
LinearizabilityAll processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp
SequentialAll processes see all shared accesses in the same order. Accesses are not ordered in time
Causal All processes see causally-related shared accesses in the same order.
FIFOAll processes see writes from each other in the order they were used. Writes from different processes may not always be seen in that order
(a)
Consistency Description
Weak Shared data can be counted on to be consistent only after a synchronization is done
Release Shared data are made consistent when a critical region is exited
Entry Shared data pertaining to a critical region are made consistent when a critical region is entered.
(b)
a) Consistency models not using synchronization operations.b) Models with synchronization operations.
52
FIFO Consistency (1)Necessary Condition:
“Any read on a data item x returns a value corresponding to the result of the most recent write x”
Writes done by a single process are seen by all other processes in the order in which they were issued
but writes from different processes may be seen
in a different order by different processes.
53
FIFO Consistency (2)
A valid sequence of events of FIFO consistency
54
FIFO Consistency (3)
Statement execution as seen by the three processes from the previous slide. The statements in bold are the ones that generate the output shown.
x = 1;
print (y, z);
y = 1;
print(x, z);
z = 1;
print (x, y);
Prints: 00
(a)
x = 1;
y = 1;
print(x, z);
print ( y, z);
z = 1;
print (x, y);
Prints: 10
(b)
y = 1;
print (x, z);
z = 1;
print (x, y);
x = 1;
print (y, z);
Prints: 01
(c)
55
FIFO Consistency (4)
Two concurrent processes.
Process P1 Process P2
x = 1;
if (y == 0) kill (P2);
y = 1;
if (x == 0) kill (P1);
56
Weak Consistency (1)Properties: Accesses to synchronization variables
associated with a data store are sequentially consistent
No operation on a synchronization variable is allowed to be performed until all previous writes have been completed everywhere
No read or write operation on data items are allowed to be performed until all previous operations to synchronization variables have been performed.
57
Weak Consistency (2)
A program fragment in which some variables may be kept in registers.
int a, b, c, d, e, x, y; /* variables */int *p, *q; /* pointers */int f( int *p, int *q); /* function prototype */
a = x * x; /* a stored in register */b = y * y; /* b as well */c = a*a*a + b*b + a * b; /* used later */d = a * a * c; /* used later */p = &a; /* p gets address of a */q = &b /* q gets address of b */e = f(p, q) /* function call */
58
Weak Consistency (3)
a) A valid sequence of events for weak consistency.b) An invalid sequence for weak consistency.
59
Client-centric Consistency Models Different types of Client-centric Consistency Models:
Eventual Consistency
Monotonic-Read Consistency
Monotonic-Write Consistency
Read-Your-Writes Consistency
Writes-Follow-Reads Consistency
60
Eventual Consistency
The principle of a mobile user accessing different replicas of a distributed database.
61
Monotonic Reads
The read operations performed by a single process P at two different local copies of the same data store.
a) A monotonic-read consistent data storeb) A data store that does not provide monotonic reads.
62
Monotonic Writes
The write operations performed by a single process P at two different local copies of the same data store
a) A monotonic-write consistent data store.b) A data store that does not provide monotonic-write consistency.
63
Read Your Writes
a) A data store that provides read-your-writes consistency.b) A data store that does not.
64
Writes Follow Reads
a) A writes-follow-reads consistent data storeb) A data store that does not provide writes-follow-reads consistency
65
Distribution Protocols Purpose: Solve the following problem
What is exactly propagated?
Where updates are propagated?
By whom propagation is initiated?
Three Distribution Protocols:
Replica Placement
Update Propagation
Epidemic Protocols
66
Replica Placement
The logical organization of different kinds of copies of a data store into three concentric rings.
67
Replica Placement Replica Placement
Permanent Replicas
The initial set of replicas
Constitute a distributed data store
Server-Initiated Replicas
Copies of a data store
Exist to enhance performance
Created at the initiative of the data store
Client-Initiated Replicas
Created at the initiative of clients
Commonly known as client caches
68
Update Propagation Introduction:
Update Propagation generally initiated at a client
Subsequently forwarded to one of the copies
Three design issues:
State Vs. Operations
What is actually to be propagated
Pull Vs. Push Protocols
Whether updates are pulled or pushed
Unicasting Vs. Multicasting
Whether unicasting or multicasting should be used
69
Epidemic ProtocolsIntroduction:
Update propagation in eventual-consistent data stores is often implemented by a class of algorithms known as Epidemic Protocols.
Don’t solve any update conflicts
Only concern is propagating updates to all replicas in as few messages as possible.
Update Propagation Models (Example: anti-entropy)
1. P only pushes its own updates to Q
2. P only pulls in new updates from Q
3. P and Q send updates to each other
70
Consistency ProtocolsIntroduction:
Consistency protocol
Describe implementations of specific consistency models
Models: sequential consistency, weak consistency with synchronization variables, and atomic consistency
Three Protocols:
Primary-Based Protocols
Remote-Write Protocols & Local-Write Protocols
Replicated-Write Protocols
Active Replication & Quorum-Based Protocols
Cache-coherence Protocols
71
Primary-Based Protocols
Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.
Remote-Write Protocols (1)
72
Primary-Based Protocols
The principle of primary-backup protocol.
Remote-Write Protocols (2)
73
Primary-Based Protocols
Primary-based local-write protocol in which a single copy is migrated between processes.
Local-Write Protocols (1)
74
Primary-Based Protocols
Primary-backup protocol in which the primary migrates to the process wanting to perform an update.
Local-Write Protocols (2)
75
Replicated-Write Protocols
The problem of replicated invocations.
Active Replication (1)
76
Replicated-Write Protocols
a) Forwarding an invocation request from a replicated object.b) Returning a reply to a replicated object.
Active Replication (2)
77
Replicated-Write ProtocolsQuorum-Based Protocols
Three examples of the voting algorithm:a) A correct choice of read and write setb) A choice that may lead to write-write conflictsc) A correct choice, known as ROWA (read one, write all)
78
End of Chapter 6Thank you!