consistency & replication chapter no. 6

70
(1) Consistency & Replication Chapter No. 6 Muhammad Rafi Meng Min Guan Donghai

Upload: maddox

Post on 11-Jan-2016

41 views

Category:

Documents


5 download

DESCRIPTION

Consistency & Replication Chapter No. 6. Muhammad Rafi Meng Min Guan Donghai. Motivation. Large Scale distributed systems required scalability in every respect of their function. Data are generally replicated to enhance reliability , improve performance and increase availability - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Consistency & Replication Chapter No. 6

(1)

Consistency & ReplicationChapter No. 6

Muhammad RafiMeng Min

Guan Donghai

Page 2: Consistency & Replication Chapter No. 6

(2)

Motivation• Large Scale distributed systems required scalability

in every respect of their function.• Data are generally replicated to enhance reliability,

improve performance and increase availability • One major problem of replication is to maintain

consistency• To make consistency we need to have

– Distribution of updates (amount, frequency, means)– Keeping the replica consistent (immediate updates is

required)

• We will discuss a variety of protocols for data and client centric consistency

• We also discuss the distribution protocols and consistency protocols

• Example• Simulation

Page 3: Consistency & Replication Chapter No. 6

(3)

Outline

• Motivation Rafi• Introduction Meng Min• Data-Centric Consistency Models Meng

Min• Client-Centric Consistency Models

Donghai• Distribution Protocols Donghai• Consistency Protocols Rafi• Examples Rafi• Simulation of Consistency technique

Rafi

Page 4: Consistency & Replication Chapter No. 6

(4)

Content

Definition of Consistency and Replication

6.1 Introduction

Reasons for Replication & Problems of Replication

Object Replication

6.2 Data-centric consistency models

Page 5: Consistency & Replication Chapter No. 6

(5)

Consistency and Replication

Definition :

Replication:

Replication of data

Consistency:

Consistency of replicated data

Reason for Consistency:

Keep replicas to be the same

Page 6: Consistency & Replication Chapter No. 6

(6)

Consistency and Replication

6.1 Introduction:Reasons for Replication:

Reliability Performance

Problem of Replication:Consistency Problem

Whenever a replica is updated, that replica becomes

different from the others. Synchronization Problem

Performance Replication Consistency Synchronization

Page 7: Consistency & Replication Chapter No. 6

(7)

Object Replication (1)

Purpose:managing data in distributed systems

Object Replication:Consider objects instead of data aloneBenefit of encapsulating and operating

data

Page 8: Consistency & Replication Chapter No. 6

(8)

Object Replication (2)

Organization of a distributed remote object shared by two different clients.

Page 9: Consistency & Replication Chapter No. 6

(9)

Object Replication (3)

• A remote object capable of handling concurrent invocations on its own.• A remote object for which an object adapter is required to handle

concurrent invocations

Page 10: Consistency & Replication Chapter No. 6

(10)

Data-centric Consistency Models

The general organization of a logical data store, physically distributed and replicated across multiple processes.

Consistency Model:A contract between processes and the data store.

Page 11: Consistency & Replication Chapter No. 6

(11)

Data-centric Consistency Models

Data-centric Consistency Models

Strict Consistency Strict

Linearizability and Sequential Consistency

Causal Consistency

FIFO Consistency

Weak Consistency

Release Consistency

Entry Consistency Weak

Page 12: Consistency & Replication Chapter No. 6

(12)

Strict ConsistencyCondition:Any read on a data item x returns a value corresponding to the

result of the most recent write on x.Disadvantage: Assume the existence of absolute global time.

Behavior of two processes, operating on the same data item.• A strictly consistent store.• A store that is not strictly consistent.

Page 13: Consistency & Replication Chapter No. 6

(13)

Linearizability and Sequential Consistency

• Sequential Consistency Condition: The result of execution is the same as if the operations by all processes were executed in some sequential order and the operations appear in the order specified by its program. Time does not play a role.

• A sequentially consistent data store.• A data store that is not sequentially consistent.Linearizable consistency condition: Sequential + if ts1(x)<ts2(y), OP1(x) should

precede OP2(y) in this sequence.

Page 14: Consistency & Replication Chapter No. 6

(14)

Causal Consistency (1)

• Necessary condition:Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.

This sequence is allowed with a causally-consistent store, but not with sequentially or strictly consistent store.

Page 15: Consistency & Replication Chapter No. 6

(15)

Causal Consistency (2)

a) A violation of a casually-consistent store.b) A correct sequence of events in a casually-consistent

store.

Page 16: Consistency & Replication Chapter No. 6

(16)

FIFO Consistency

• Necessary Condition:Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes.

A valid sequence of events for FIFO consistency

Page 17: Consistency & Replication Chapter No. 6

(17)

Weak Consistency (1)

Properties:• Accesses to synchronization variables associated

with a data store are sequentially consistent

• No operation on a synchronization variable is allowed to be performed until all previous writes have been completed everywhere

• No read or write operation on data items are allowed to be performed until all previous operations to synchronization variables have been performed.

Page 18: Consistency & Replication Chapter No. 6

(18)

Weak Consistency (2)

a) A valid sequence of events for weak consistency.b) An invalid sequence for weak consistency.

Page 19: Consistency & Replication Chapter No. 6

(19)

Release Consistency (1)

Rules:

• Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.

• Before a release is allowed to be performed, all previous reads and writes by the process must have completed

• Accesses to synchronization variables are FIFO consistent (sequential consistency is not required).

Page 20: Consistency & Replication Chapter No. 6

(20)

Release Consistency (2)

A valid event sequence for release consistency.

Page 21: Consistency & Replication Chapter No. 6

(21)

Entry Consistency (1)Conditions:

• An acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process.

• Before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode.

• After an exclusive mode access to a synchronization variable has been performed, any other process's next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable's owner.

Page 22: Consistency & Replication Chapter No. 6

(22)

Entry Consistency (2)

A valid event sequence for entry consistency

Page 23: Consistency & Replication Chapter No. 6

(23)

Summary of Consistency Models

Consistency

Description

Strict Absolute time ordering of all shared accesses matters.

Linearizability

All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp

SequentialAll processes see all shared accesses in the same order. Accesses are not ordered in time

CausalAll processes see causally-related shared accesses in the same order.

FIFOAll processes see writes from each other in the order they were used. Writes from different processes may not always be seen in that order

(a) Consistency models not using synchronization operations.

Consistency

Description

Weak Shared data can be counted on to be consistent only after a synchronization is done

Release Shared data are made consistent when a critical region is exited

Entry Shared data pertaining to a critical region are made consistent when a critical region is entered.

(b) Models with synchronization operations.

Page 24: Consistency & Replication Chapter No. 6

(24)

6. Consistency and Replication

6.3 Client-Centric Consistency

models

6.4 Distribution ProtocolsGuan Donghai2005-11-15

Page 25: Consistency & Replication Chapter No. 6

(25)

Client-Centric Consistency Models

• The previously studied consistency models concern themselves with maintaining a consistent (globally accessible) data-store in the presence of concurrent read/write operations.

• Another class of distributed datastore is that which is characterized by the lack of simultaneous updates. Here, the emphasis is more on maintaining a consistent view of things for the individual client process that is currently operating on the data-store.

Page 26: Consistency & Replication Chapter No. 6

(26)

More Client-Centric Consistency

How fast should updates (writes) be made available to read-only

processes?• Think of most database systems: mainly read.• Think of the DNS: write-write conflicts do no

occur.• Think of WWW: as with DNS, except that heavy

use of client-side caching is present: even the return of stale pages is acceptable to most users.

• These systems all exhibit a high degree of acceptable inconsistency … with the replicas gradually become consistent over time.

Page 27: Consistency & Replication Chapter No. 6

(27)

Toward Eventual Consistency

• The only requirement is that all replicas will eventually be the same.

• All updates must be guaranteed to propogate to all replicas … eventually!

• This works well if every client always updates the same replica.

• Things are a little difficult if the clients are mobile.

Page 28: Consistency & Replication Chapter No. 6

(28)

Eventual Consistency

A mobile user accessing different replicas of a distributed database has problems with eventual consistency

Page 29: Consistency & Replication Chapter No. 6

(29)

Client-Centric Consistency

• Four models in client-centric consistency:

– Monotonic Read Consistency

– Monotonic Write Consistency

– Read-your-writes Consistency

– Writes-follows-reads Consistency

Page 30: Consistency & Replication Chapter No. 6

(30)

Client-Centric Consistency

• xi[t] -version of data x at copy Li at time t

• xi[t] is result of a set of write operations applied to x since initialization

• this set is notated as WS(xi[t])

• When these operations later (t2) are performed to x at copy Lj ,it is written as WS(xi[t1]; xj[t2])

Page 31: Consistency & Replication Chapter No. 6

(31)

Monotonic ReadsIf a process reads the value of a data item x, any

successive read operations on x by that process will always return that same value or a more recent value.

a) A monotonic-read consistent data storeb) A data store that does not provide monotonic reads.

Example: distributed email database

Page 32: Consistency & Replication Chapter No. 6

(32)

Monotonic Writes

A write operation by a process on a data item x is completed before any successive write operation on x by the same process

a) A monotonic-write consistent data store.b) A data store that does not provide monotonic-write consistency.

Example: The MW-guarantee could be used by a text editor when editing replicated files

Page 33: Consistency & Replication Chapter No. 6

(33)

Read Your Writes

The effect of a write operation by a process on data item x will always be seen a successive read operation on x by the same process.

a) A data store that provides read-your-writes consistency.b) A data store that does not.

Example: updating passwords

Page 34: Consistency & Replication Chapter No. 6

(34)

Writes Follow ReadsA write operation by a process on a data item x following a previous readoperation on x by the same process, it is guaranteed to take place on the

same or a more recent value of x that was read.

a) A writes-follow-reads consistent data storeb) A data store that does not provide writes-follow-reads consistency

Example: replicated bulletin board database

Page 35: Consistency & Replication Chapter No. 6

(35)

Client centric model-Summary

• Monotonic read If a process reads the value of a data item x, any

successive read operation on x by that process will always return that same value or a more recent value

• Monotonic write A write operation by a process on a data item x is

completed before any successive write operation on x by the same process

• Read your writes The effect of a write operation by a process on a data

item x will always be seen by a successive read operation on x by the same process

• Writes follow reads A write operation by a process on a data item x following

a previous read operation on x by the same process, is garanteed to take place on the same or more recent values of x that was read

Page 36: Consistency & Replication Chapter No. 6

(36)

Distribution Protocols

• Purpose: Solve the following problem – What is exactly propagated?

– Where updates are propagated?– By whom propagation is initiated?

• Three Distribution Protocols– Replica Placement– Update Propagation– Epidemic Protocols

Page 37: Consistency & Replication Chapter No. 6

(37)

Replica Placement

The logical organization of different kinds of copies of a data store into three concentric rings.

Page 38: Consistency & Replication Chapter No. 6

(38)

Permanent Replicas

• Permanent Replicas– Initial set of replicas that constitutes

a distributed data store

– Typically, the number of it is small

• Example: Web site

Page 39: Consistency & Replication Chapter No. 6

(39)

Server-Initiated Replicas

• Server-Initiated Replicas– Created at the initiative of the owner of the data

store– Exist to enhance performance

• Work Scheme

How to decide where and when replicas should be created or deleted?– Each server keeps track of access counts per file,

and where access requests come from. – When a server Q decides to reevaluate the

placement of the files it stores, it checks the access count for each file.

– If the total number of access requests for F at Q drops below the deletion threshold del (Q,F), it will delete F unless it is the last copy.

Page 40: Consistency & Replication Chapter No. 6

(40)

Server-Initiated Replicas

Counting access requests from different clients.

Page 41: Consistency & Replication Chapter No. 6

(41)

Client-Initiated Replicas

• Work Scheme– When a client wants access to some data, it

connects to the nearest copy of the data store from where it fetches the data it wants to read,

– When most operations involve only reading data, performance can be improved by letting the client store requested data in a nearby cache.

– The next time that same data needs to be read, the client can simply fetch it from this local cache.

• Client-Initiated Replicas– Created at the initiative of clients– Commonly known as client caches

Page 42: Consistency & Replication Chapter No. 6

(42)

Update Propagation

• Introduction

– Generally initiated at a client – Subsequently forwarded to one of the copies

• Three design issues– State Vs. Operations

What is actually to be propagated

– Pull Vs. Push Protocols

Whether updates are pulled or pushed– Unicasting Vs. Multicasting

Whether unicasting or multicasting should be used

Page 43: Consistency & Replication Chapter No. 6

(43)

State Vs. Operations

What is actually to be propagated?• Propagate only a notification of an update

– Use little network bandwidth– Work best when read-to-write ratio is relatively

small

• Transfer data from one copy to another

– Useful when the read-to-write ratio is relatively high

• Propagate the update operation to other copies

– Tell each replica swhich update operation it should perform

– Updates can often be propagated at minimal bandwidth costs

Page 44: Consistency & Replication Chapter No. 6

(44)

Pull versus Push Protocols

Whether updates are pulled or pushed?• Push-based approach (server-based)

– Updates are propagated to other replicas without those replicas even asking for the updates

• Pull-based approach (client-based)– a server or client requests another server to send it any

updates it has at that moment.

A comparison between push-based and pull-based Protocolsin the case of multiple client, single server systems.

Page 45: Consistency & Replication Chapter No. 6

(45)

Unicasting Vs. Multicasting

Whether unicasting or multicasting should be used?

• Introduction – With multicasting, the underlying network takes

care of sending a message efficiently to multiple receivers.

– In unicast communication, when a server that is part of the data store sends its update to N other servers, it does so by sending N separate messages, one to each server.

• Compare– In many cases, it is cheaper to use available

multicasting facilities.– Multicasting can often be efficiently combined

with a push-based approach to propagating updates.

Page 46: Consistency & Replication Chapter No. 6

(46)

Epidemic Protocols

• Ensures eventual consistency • Epidemic spreading of updates, propagates updates to

all replicas efficiently • Does not solve any update conflicts directly • Concern: update to all replicas in as few messages as

possible

• Terminology:– Infective – a server that holds an update and it is willing to spread– Susceptible – a server that has not been updated– Removed – a server that is not willing or able to update

Page 47: Consistency & Replication Chapter No. 6

(47)

Epidemic Protocols

• Anti-entropy Propagation Approach:– A server P picks another server Q at random, three approaches in exchanging

updates: 1. P only pushes its updates to Q

2. P only pulls in new updates from Q 3. P and Q exchange updates (push-pull)

• Gossip Protocol:– If server P has just been updated for data item x,

• Contact an arbitrary server Q and push update to Q • If Q was already updated, P may lose interest with some probability

– Good way of rapidly spreading updates – Note: cannot guarantee that all servers will actually be updated

Page 48: Consistency & Replication Chapter No. 6

(48)

Consistency & ReplicationChapter No. 6

Muhammad Rafi6.5 Consistency Protocol6.6 Examples Orca

Page 49: Consistency & Replication Chapter No. 6

(49)

Consistency Protocol• A consistency protocol describes an implementation of a

specific consistency model• The consistency model in which operations are globally

serialized are the most important and widely applied models– Sequential Consistency– Weak consistency with synchronization variables – Atomic transaction

• There are two ways to classify the consistency protocols, one with the primary or one with write on any replica

• We discussed the following protocols

Primary-Based Protocols Remote-Write Protocols & Local-Write Protocols

Replicated-Write Protocols Active Replication & Quorum-Based Protocols

Cache-coherence Protocols

Page 50: Consistency & Replication Chapter No. 6

(50)

Primary-Based Protocols

• Primary based protocols use a primary server to responsible for every write operation in distributed data sources.

• There are several strategies for Primary based approach– Remote-write protocol with no backup replica – Remote-write protocol: primary-backup

approach (passive replication) – Local-write protocol with single migrating

primary copy (no backup) – Local-write protocol with migrating primary

copy and non-migrating backups (Primary-backup approach)

Page 51: Consistency & Replication Chapter No. 6

(51)

Primary-Based Protocols

• Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.

Remote-Write Protocols (1) (No Backup Replica)

Page 52: Consistency & Replication Chapter No. 6

(52)

Primary-Based Protocols

• The principle of primary-backup protocol.

Remote-Write Protocols (2) Backup Replica

Page 53: Consistency & Replication Chapter No. 6

(53)

Primary-Based Protocols

• Problems– Update process implemented as blocking – Update process may wait longer to continue

p

A, B, C

q

A, B, C

s

A, B, C

t

A, B, C

Only p can write to A.

All writes to A mustbe forwarded to p

Others read A locally

Page 54: Consistency & Replication Chapter No. 6

(54)

Primary-Based Protocols

• Primary-based local-write protocol in which a single copy is migrated between processes.

• Non-Replicated, but distributed – move resource to client.• Problem- keeping track about where the single copy is located. (LAN

Broadcast)

Local-Write Protocols (1) (No Backup Replica)

Page 55: Consistency & Replication Chapter No. 6

(55)

Primary-Based Protocols

• Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

Local-Write Protocols (2) Backup Replica

Page 56: Consistency & Replication Chapter No. 6

(56)

Replicated-Write Protocols

• In replicated-write protocols, write operations can be carried out at multiple replicas

• A distinction can be made between Active Replication ( updated goes to all replicas) and consistency protocols based on majority voting.

• Active Replication– In active replica operation is sent to each replica and

an associated process carries out that operation there.– Operation order is very important

• Totally-ordered multicast Lamport Timestamps• Central coordinator (Sequencer) – can be a bottleneck

• Another problem is Replicated Invocation (AB C)– No general purpose solution available

Page 57: Consistency & Replication Chapter No. 6

(57)

Replicated-Write Protocols

Active Replication (1)

Page 58: Consistency & Replication Chapter No. 6

(58)

Replicated-Write Protocols

a) Forwarding an invocation request from a replicated object.b) Returning a reply to a replicated object.

Active Replication (2)

Page 59: Consistency & Replication Chapter No. 6

(59)

Replicated-Write Protocols

• Basic Idea: require clients to request and acquire the permission(Quorum) of multiple servers before either reading or writing a replicated data item

• Scheme:

- Suppose:

N replicas exist, NR:read quorum, NW:write quorum

- Constraints :

1. NR+NW>N 2. NW>N/2

The first constraint is used to prevent read-write conflicts, whereas the second prevents write-write conflicts. Only after the appropriate number of servers has agreed to participate can a file be read or written.

Quorum-Based Protocols

Page 60: Consistency & Replication Chapter No. 6

(60)

Replicated-Write Protocols

Quorum-Based Protocols

Three examples of the voting algorithm:a) A correct choice of read and write setb) A choice that may lead to write-write conflictsc) A correct choice, known as ROWA (read one, write all)

Page 61: Consistency & Replication Chapter No. 6

(61)

Cache-Coherence Protocols• Caches are special case of replication• Cache solutions differ in their coherence detection

strategy• Distributed Database Example

1. In a Transaction a cache data is verified at the server for consistency and if OK transaction is executed locally based on cache.

2. Let the transaction execute while checking for cache consistency, if not Ok abort the transaction

3. Verify cache data only when a transaction is committed.

• Coherence enforcement strategy 1. Server send an invalidation whenever server data is changes.2. Propagate the update

• Hardware Cache– Write through – Write back

Page 62: Consistency & Replication Chapter No. 6

(62)

Orca Parallel Programming Language

• Orca is a language for parallel programming on distributed systems, based on the shared data-object model.

• This model is a simple and portable form of object-based distributed shared memory.

Page 63: Consistency & Replication Chapter No. 6

(63)

Orca

• A simplified stack object in Orca, with internal data and two operations.

OBJECT IMPLEMENTATION stack; top: integer; # variable indicating the top stack: ARRAY[integer 0..N-1] OF integer # storage for the stack

OPERATION push (item: integer) # function returning nothing BEGIN GUARD top < N DO stack [top] := item; # push item onto the stack top := top + 1; # increment the stack pointer OD; END;

OPERATION pop():integer; # function returning an integer BEGIN GUARD top > 0 DO # suspend if the stack is empty top := top – 1; # decrement the stack pointer RETURN stack [top]; # return the top item OD; END;BEGIN top := 0; # initializationEND;

Page 64: Consistency & Replication Chapter No. 6

(64)

Management of Shared Objects in Orca

• Four cases of a process P performing an operation on an object O in Orca.

Page 65: Consistency & Replication Chapter No. 6

(65)

Casually-Consistent Lazy Replication

• The general organization of a distributed data store. • Clients are assumed to also handle consistency-related communication.

Page 66: Consistency & Replication Chapter No. 6

(66)

Processing Read Operations

• Performing a read operation at a local copy.

Page 67: Consistency & Replication Chapter No. 6

(67)

Processing Write Operations

• Performing a write operation at a local copy.

Page 68: Consistency & Replication Chapter No. 6

(68)

Summary (1/2)

• Two reasons to replicate data 1. Reliability 2. Performance

• Replication introduced Consistency problems• Data centric consistency models are

– Strict consistency {hard to implement in distributed systems}

– Sequential (all operation seen by everyone in same order)and Linearizability (order operation according to a global clock) are good choice in concurrent programming

• Client centric consistency model do not perceive that the data is shared. These model focuses on individual client aspect of consistency.

• In mobile/ distributed database client may connect to different replicas in the course of time but all the differences must be transparent from the client.

Page 69: Consistency & Replication Chapter No. 6

(69)

Summary (2/2)

• To propagate updates among different replicas different techniques can be applied. – What is exactly propagated ?– Where updates are propagated?– By whom propagation is initiated ?– Frequency of updates?

• Consistency Protocols describe a specific implementation of a consistency model.– Primary based protocols– Replicated-write protocols

• Orca example

Page 70: Consistency & Replication Chapter No. 6

(70)

Question & Answer

Thank you!!!