replication and consistency. references r the case for non-transparent replication: examples from...

34
Replication and Consistency

Upload: grant-robertson

Post on 27-Dec-2015

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Replication and Consistency

Page 2: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

References

The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer, and Marvin M. Theimer. IEEE Data Engineering, December 1998

Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System Douglas B. Terry, Marvin M. Theimer, Karin Petersen, Alan J. Demers, Mike J. Spreitzer and Carl H. Hauser. In ACM Symposium on Operating Systems Principles (SOSP ’95)

Page 3: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

ACID Transaction

Transaction has to show up everywhere at the same time or not at all. e.g., When you withdraw cash from your

ATM machine, the balance should reflect the actual money left. If it doesn’t, then you could go back to a store, use your ATM card and withdraw cash that you do not have

e.g., When you make your airline reservation and the system assigns you a seat, you expect the seat to be available to you (of course, Airlines overbook)

Page 4: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Replication and Availability

Replication is a powerful tool that allows us to tradeoff availability for consistency

Applications need different levels of consistency

Applications know best on how to deal with inconsistency

Page 5: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Application 1: Meeting room scheduler

Suppose we have two conference rooms of the same capacity. I want to schedule my meeting in one of the conference rooms. I don’t care which exact room it is.

If two people reserve the same room at the same time, there is a conflict, but if they reserve the same room at different times or reserve different rooms at the same time, there is no conflict.

Rm2

Rm1

time

No conflict

Page 6: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Application 1: Meeting room scheduler

Rm2

Rm1

time

No conflict

Rm2

Rm1

time

conflict

Page 7: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Application 1: Meeting room scheduler

We can lock the entire database Not needed when there is no conflict In the case of conflicts, there is an

application specific way to deal with the conflict – we move on reservation to the other room

If the other room is reserved, we ask the user, they can easily move the reservation to another acceptable time

Page 8: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Application 2: Shared mailbox

Shared mailbox folders- shared between, me and my 2TAs.

We all replicate the mailbox. OP1: I see a mail from the class, respond to it and

delete it. OP2: The TA sees the same mail and files it in CS402. OP3: I see an email from a friend and file it as

important.

mailbox

CS402 Important Recruiting Chocolate

Page 9: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Application 2: Shared Mailbox

All of us operate on the same mailbox You can lock the entire mailbox before

someone operates on it. Can’t work when disconnected Clearly not necessary for doing only one

operation For operation OP1 and OP2, it is not clear

who should win, should the mail be deleted or should it be filed in Assign1?

Page 10: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Two Approaches to Building Replicated Services

Transparent replication system: Allow systems that were developed assuming a

central file system or database to run unchanged on top of a strongly-consistent replicated storage system (as seen in Oceanstore)

Non-transparent replication system: Relaxed consistency model – access-update-

anywhere Applications involved in conflict detection and

resolution. Hence applications need to be modified (e.g. Bayou, Coda file system etc)

Page 11: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Hypothesis Applications know best on how to resolve

conflicts The challenge is providing the right interface

to support cooperation between applications and their data managers Programmers do not want to deal with propagating

updates, ensuring eventual consistency• Anyone who has synchronized the project files in

school, work, and home can feel the pain. Programmers want to set replication schedules and

control how conflicts or detected and resolved• Record level conflict detection rather than file level

Page 12: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Bayou

Update-anywhere replication model Bayou manages databases that can be fully

replicated at any number of sites Applications can read and write to any

single replica of the database (lazy group update)

Once a replica accepts a write operation, this write is performed locally and propagated to all other replicas via pair-wise reconciliation protocol

Page 13: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Conflict Detection :Dependency Checks

Each Write operation includes a dependency check consisting of an application-supplied query and its expected result.

If the check fails, then the requested update is not performed and the server invokes a procedure to resolve the detected conflict.

Page 14: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,
Page 15: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Example of Bayou Write3-tuple: <update><dependency check><mergeproc>

For example, Update: <insert, HL, 11/22/2003, 3:00pm, 1 hr, “Space Committee Meeting”>Dependency check: <Is there an entry in the database on 11/2/2003 at 3:0am for 1 hr? expected answer is empty>Mergeproc: <Try 11/22/2003 at 4:15pm for 45min; if successful write that tuple>

• sometimes users like conflicts

A different merge procedure altogether could search for the next available time slot to schedule the meeting, which is an option a user might choose if any time would be satisfactory.

Page 16: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Conflict Resolution :Merge Procedure

In practice, Bayou merge procedures are written by application programmers in the form of templates that are instantiated with the appropriate details filled in for each Write.

In the case where automatic resolution is not possible, the merge procedure will still run to completion, but is expected to produce a revised update that logs the detected conflict in some fashion that will enable a person to resolve the conflict later.

Page 17: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Replica Management Replicas held by two servers at any time may

vary in their contents because they have received and processed different Writes. However, this fundamental property is satisfied: Bayou system guarantees that all servers eventually

receive all Writes via the pair-wise anti-entropy process and that two servers holding the same set of Writes will have the same data contents.

It cannot enforce strict bounds on Write propagation delays since these depend on network connectivity factors that are outside of Bayou’s control

Page 18: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Replica Consistency

Bayou has two features that allows servers to achieve eventual consistency. Writes performed in the same, well-defined

order at all servers (global-ordering) Conflict detection and merge procedures

are deterministic

Page 19: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Replica Consistency

When a Write is accepted by a Bayou server from a client, it is deemed tentative.

Tentative writes are ordered according to timestamps assigned to them by their accepting servers.

Eventually, each Write is committed, by the anti-entropy process that will be described shortly.

Timestamps for tentative Writes must monotonically increase at each server.

Servers do not have to have synchronized clocks

Page 20: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Replica Consistency Consistency is potentially an issue since servers

may receive Writes from clients and from other servers in an order that differs from the required execution order and because servers immediately apply all known Writes to their replicas.

This implies that there must be support of undoing writes (use of write logs) and reapplying them

Each server maintains a log of all Write operations that it has received, sorted by their committed or tentative timestamps, with committed Writes at the head of the log.

Page 21: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Anti-Entropy

Entropy - a process of degradation or running down or a trend to disorder.

Bring 2 replicas up-to-date Three Major Design Decisions

Pairwise communication between replicas Exchange of update operations Ordered propagation of operations

Page 22: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Pair Reconciliation

Replica

Replica

Replica

Replica

Eventual consistencyGlobal commit order assigned by Primary server

Page 23: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Example

Suppose a user keeps the primary copy of his calendar with him on his laptop and allows others, such as a spouse or secretary, to keep secondary (mostly read copies).

The user updates to his own calendar; This is committed immediately.

Updates by the spouse/secretary are tentative until anti-entropy takes place with the user. At this point, the user can commit and propagate the order to the spouse/secretary during anti-entropy.

Page 24: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Basic Anti-Entropy

Protocol: Between pairs of servers The propagation of writes is constrained by

the accept order. Prefix property: A server R that holds a

write stamped write, Wi , that was initially accepted by another server X will also hold all writes accepted by X prior to Wi

Page 25: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Basic Anti-Entropy

Protocol R.V: This denotes R’s version vector; This is used to determine

which writes are unknown to the receiving server R anti-entropy(S,R) { Get R.V from receiving server R #now send all the writes unknown to R w = first write in S.write-log while (w) do if R.V(w.server-id) < w.accept-stamp then # w is new for R SendWrite(R,w) w = next write in S.write-log end}

Page 26: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Basic Anti-Entropy

Anti-entropy is incremental When a new write arrives at the receiver it can

be immediately included in the receiver's write-log because the sending replica ensures that the receiving server will hold all writes necessary to satisfy the prefix property.

Reconciliation between two replicas can make progress independently of where the protocol may get interrupted due to network failures or voluntary disconnections.

The protocol does not address the issue of the growing size of write logs.

Page 27: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Effective Write-Log Management

Storage is of concern We want to be able to prune the prefix of the

write logs A protocol is needed to stabilize writes (we

look at a primary commit) protocol. Primary replica commits write and assigns a

monotonically increasing commit sequence number called CSN.

Committed writes are totally ordered Propagation:

First send the committed writes Second send the tentative writes

Page 28: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Anti-Entropy with Support for Committed Writes

anti-entropy(S,R) { Get R.V from receiving server R #First send all the committed writes that R does #not know about if R.CSN < S.CSN then w = first committed write that R does not know about. while (w) do if w.accept-stamp < R.V(w.server-id) then # R has the write, but does not know it is committed. SendCommitNotification(R, w.accept-stamp,w.server-id, w.CSN) else SendWrite(R,w) end w = next committed write in S.write-log. #now send all the tentative writes while (w) do if R.V(w.server-id) < w.accept-stamp then SendWrite(R,W) w = next write in S.write-log endend}

Page 29: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Effective Write-Log Management

It is necessary to allow replicas to truncate any prefix of the committed (stable) part of the write log when there is a need.

Implication: A write-log may not hold enough writes to allow incremental reconciliation with another replica.

A commit sequence number is maintained for the omitted part of the log.

A vector characterizing the omitted prefix of the server’s write-log is also maintained.

If the commit sequence number of the receiver is less than the omitted sequence number of the server then a (perhaps full) database transfer occurs.

Page 30: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Access Control Certificates – Grant, delegate and revoke No assumptions about trust Mutual authentication and access control

is based on public-key cryptography. Every user possesses a public/private key

pair and a set of digitally signed access control certificates granting user access to various data collections.

Page 31: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Performance

Size is acceptable Write performance is acceptable

Page 32: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Future

Partial Databases Carry part of the database instead of the

entire database (mobile clients do not have enough storage space)The problem is that, if a client did not have a particular record, was it because it didn’t replicate that part of because it didn’t know about it?

Page 33: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Technology Impact

TrueSync - end-to-end synchronization software and infrastructure solutions for the wireless Internet http://www.starfish.com/

SyncML - SyncML is the common language for synchronizing all devices and applications over any network. Ericsson, IBM, Lotus, Motorola, Nokia, Palm Inc.,

Psion, Starfish Software etc. (614 companies) http://www.syncml.org/

Page 34: Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,

Conclusions

Difference from other replicated systems Non-transparency Application-specific conflict detection Per-write conflict resolvers Partial and multi-object updates Tentative and stable resolutions Security

Future goal Partial replication, policies for choosing servers for

anti-entropy, building servers with conventional database managers, alternate data models, and finer grain access control.