Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks

Jiaqing Du, EPFL
Sameh Elnikety, Microsoft Research
Amitabha Roy, EPFL
Willy Zwaenepoel, EPFL

Post on 23-Feb-2016

Key-Value Data Store API

• Read operation
  – value = get(key)
• Write operation
  – put(key, value)
• Read transaction
  – <value1, value2, …> = mget(key1, key2, …)
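The three operations can be pictured with a toy, single-node store. This is only a sketch of the API shape: the class name `KeyValueStore` and the example keys are made up for illustration, and the `mget` here returns plain current values without the causal snapshot that the deck's read transaction provides.

```python
class KeyValueStore:
    """Toy in-memory store exposing get, put, and mget (illustrative only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Read operation: returns None when the key is absent.
        return self._data.get(key)

    def put(self, key, value):
        # Write operation: overwrites any previous value.
        self._data[key] = value

    def mget(self, *keys):
        # Read transaction over several keys (no snapshot logic in this toy).
        return [self._data.get(k) for k in keys]


store = KeyValueStore()
store.put("photo", "beach.jpg")
store.put("comment", "Great weather!")
print(store.get("photo"))
print(store.mget("photo", "comment"))
```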

Partitioning

• Divide data set into several partitions.
• A server manages each partition.

[Figure: Partition 1, Partition 2, …, Partition N]
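One common way to divide a key space is hash partitioning. The slides do not say which scheme Orbe uses, so the sketch below is an assumption, and `partition_for` is a hypothetical helper name.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in [0, num_partitions).

    Uses SHA-1 rather than the built-in hash() so the mapping is stable
    across processes. Hash partitioning is an assumption, not from the deck.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


for key in ["photo:1", "photo:2", "comment:1"]:
    print(key, "->", partition_for(key, 3))
```

Every client that uses the same function and partition count routes a given key to the same server.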

Inside a Data Center

• Data set is partitioned

[Figure: an application tier of Application/client pairs above a data tier of Partition 1, Partition 2, …, Partition N]

Geo-Replication

• Data close to end users
• Tolerates disasters

[Figure: Data Centers A, B, C, E, and F]

Scalable Causal Consistency in Orbe

• Partitioned and replicated data store
• Parallel asynchronous update propagation
• Efficient implementation of causal consistency

[Figure: Replica A and Replica B, each holding Partition 1, Partition 2, …, Partition N]

Consistency Models

• Strong consistency
  – Total order on propagated updates
  – High update latency, no partition tolerance
• Causal consistency
  – Propagated updates are partially ordered
  – Low update latency, partition tolerance
• Eventual consistency
  – No order among propagated updates
  – Low update latency, partition tolerance

Causal Consistency (1/3)

• If A depends on B, then A appears after B.

[Figure: Alice's photo and the comment "Great weather!", shown at both ends of update propagation]

Causal Consistency (2/3)

• If A depends on B, then A appears after B.

[Figure: a photo and the comment "Nice photo!" (Alice, Bob), shown at both ends of update propagation]

Causal Consistency (3/3)

• Partitioned and replicated data stores

[Figure: Replica A and Replica B, each holding Partition 1, Partition 2, …, Partition N; a client issues Read(A), Read(B), Write(C, A+B), followed by Propagate(C)]

How to guarantee A and B appear first?

Existing Solutions

• Version vectors
  – Only work for purely replicated systems
• COPS [Lloyd’11]
  – Explicit dependency tracking at client side
  – Overhead is high under many workloads
• Our work
  – Extends version vectors to dependency matrices
  – Employs physical clocks for read-only transactions
  – Keeps dependency metadata small and bounded

Outline

• DM protocol
• DM-Clock protocol
• Evaluation
• Conclusions

Dependency Matrix (DM)

• Represents dependencies of a state or a client session
• One integer per server
• An integer represents all dependencies from a partition

[Figure: Replica A and Replica B, each with Partitions 1, 2, and 3, next to an example DM with entries 9, 5, 0, 0, 0, 7; the entries 9 and 5 are labeled "first 9 updates" and "first 5 updates"]

DM Protocol: Data Structures

[Figure: a client and Partition 1 of Replica A]

• Client
  – Dependency matrix (DM): DM = [0 0; 0 0; 0 0]

DM Protocol: Data Structures

[Figure: a client and Partition 1 of Replica A]

• Client
  – Dependency matrix (DM): DM = [0 0; 0 0; 0 0]
• Partition 1 of Replica A
  – Version vector (VV): VV = [3 8]

DM Protocol: Data Structures

[Figure: a client and Partition 1 of Replica A]

• Client
  – Dependency matrix (DM): DM = [0 0; 0 0; 0 0]
• Partition 1 of Replica A
  – Version vector (VV): VV = [3 8]
  – Item A, rid = A, ut = 2, dm = [1 4; 0 0; 0 0]
  – Item B, rid = B, ut = 5, dm = [0 5; 0 0; 1 0]
• Per-item fields
  – Source replica id (RID)
  – Update timestamp (UT)
  – Dependency matrix (DM)

DM Protocol: Read and Write

• Read item
  – Client <-> server
  – Includes read item in client DM
• Write item
  – Client <-> server
  – Associates client DM to updated item
  – Resets client DM (transitivity of causality)
  – Includes updated item in client DM
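The read and write rules can be sketched in a few lines of Python. The class names, the 0-based partition and replica indices, and the use of `max` when folding a read item into the client DM are assumptions made for this illustration; the slides only state that the item is "included" in the DM.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: object
    rid: int   # source replica id (index)
    ut: int    # update timestamp
    dm: list   # dependency matrix stored with the item


class PartitionServer:
    def __init__(self, partition, replica, n_replicas):
        self.partition = partition
        self.replica = replica
        self.vv = [0] * n_replicas   # version vector, one entry per replica
        self.items = {}

    def get(self, key):
        return self.items[key]

    def put(self, key, value, client_dm):
        # Assign the next local update timestamp and attach the client's DM.
        self.vv[self.replica] += 1
        ut = self.vv[self.replica]
        dm_copy = [row[:] for row in client_dm]
        self.items[key] = Version(value, self.replica, ut, dm_copy)
        return ut


class Client:
    def __init__(self, n_partitions, n_replicas):
        self.dm = [[0] * n_replicas for _ in range(n_partitions)]

    def read(self, server, key):
        v = server.get(key)
        row = self.dm[server.partition]
        row[v.rid] = max(row[v.rid], v.ut)   # include read item in client DM
        return v.value

    def write(self, server, key, value):
        ut = server.put(key, value, self.dm)
        # Reset the DM (transitivity), then include the item just written.
        self.dm = [[0] * len(row) for row in self.dm]
        self.dm[server.partition][server.replica] = ut
        return ut


# Three partitions, two replicas; replica A is index 0.
p1 = PartitionServer(partition=0, replica=0, n_replicas=2)
p1.vv = [7, 0]
p1.items["photo"] = Version("v", rid=0, ut=4, dm=[[0, 0], [0, 0], [0, 0]])
p2 = PartitionServer(partition=1, replica=0, n_replicas=2)

client = Client(n_partitions=3, n_replicas=2)
client.read(p1, "photo")
print(client.dm)
client.write(p2, "comment", "Nice photo!")
print(client.dm, p2.vv)
```

The sample state (VV = [7, 0] at the first partition, a photo with ut = 4) uses the numbers that appear in the deck's read-and-write example.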

Example: Read and Write

[Figure: a client and Replica A with Partitions 1, 2, and 3]

1. Initially, client DM = [0 0; 0 0; 0 0]. Partition 1 has VV = [7, 0]; Partitions 2 and 3 have VV = [0, 0].
2. read(photo) at Partition 1 returns (v, rid = A, ut = 4). Client DM = [4 0; 0 0; 0 0].
3. write(comment, [4 0; 0 0; 0 0]) at Partition 2 returns (ut = 1), and Partition 2's VV changes from [0, 0] to [1, 0]. Client DM = [0 0; 1 0; 0 0].

DM Protocol: Update Propagation

• Propagate an update
  – Server <-> server
  – Asynchronous propagation
  – Compares DM with VVs of local partitions

Example: Update Propagation

[Figure: Replica A and Replica B, each with Partitions 1, 2, and 3]

1. Partition 2 of Replica A (VV = [1, 0]) sends replicate(comment, ut = 1, [4 0; 0 0; 0 0]) to Partition 2 of Replica B (VV = [0, 0]).
2. Partition 2 of Replica B checks the dependency against Partition 1 of Replica B, whose VV advances from [3, 0] to [4, 0].
3. Partition 2 of Replica B applies the update, and its VV becomes [1, 0].

Partition 1 of Replica A has VV = [7, 0]. Partition 3 has VV = [0, 0] at both replicas.
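The dependency check performed during propagation can be sketched as a comparison between the update's DM and the version vectors of the receiving replica's partitions. The function names and the dictionary layout of an update are hypothetical. The in-order condition (`ut - 1`) is an assumption added to keep updates from one source in sequence, and a real server would wait for the dependency rather than return `False`.

```python
def dependencies_satisfied(dm, local_vvs):
    """True if every local partition has applied at least what dm requires."""
    return all(
        have >= need
        for row, vv in zip(dm, local_vvs)
        for need, have in zip(row, vv)
    )


def try_apply(update, local_vvs):
    """Apply a replicated update if its dependencies are visible locally.

    local_vvs[p] is the version vector of partition p at the receiving replica.
    """
    p, rid, ut = update["partition"], update["rid"], update["ut"]
    in_order = local_vvs[p][rid] == ut - 1   # assumed ordering rule
    if in_order and dependencies_satisfied(update["dm"], local_vvs):
        local_vvs[p][rid] = ut
        return True
    return False


# Replica B before the photo update has arrived (indices are 0-based).
replica_b = [[3, 0], [0, 0], [0, 0]]
comment = {"partition": 1, "rid": 0, "ut": 1, "dm": [[4, 0], [0, 0], [0, 0]]}

print(try_apply(comment, replica_b))   # False: first partition is at 3, needs 4
replica_b[0][0] = 4                    # the photo update is applied
print(try_apply(comment, replica_b))   # True
print(replica_b[1])                    # [1, 0]
```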

Complete and Nearest Dependencies

• Transitivity of causality
  – If B depends on A, C depends on B, then C depends on A.
• Tracking nearest dependencies
  – Reduces dependency metadata size
  – Does not affect correctness

[Figure: A: write Photo; B: write Comment 1; C: write Comment 2, drawn once with complete dependencies and once with nearest dependencies]
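The difference between the two sets can be made concrete with a small sketch. It assumes the complete dependencies are given as a transitively closed mapping; the function name `nearest_dependencies` is made up for illustration.

```python
def nearest_dependencies(complete):
    """Reduce complete dependency sets to nearest dependencies.

    complete maps each operation to the set of all operations it depends on
    (assumed transitively closed). A dependency is dropped when another
    dependency of the same operation already depends on it.
    """
    nearest = {}
    for op, deps in complete.items():
        covered = set()
        for d in deps:
            covered |= complete.get(d, set())
        nearest[op] = deps - covered
    return nearest


complete = {
    "A: write Photo": set(),
    "B: write Comment 1": {"A: write Photo"},
    "C: write Comment 2": {"A: write Photo", "B: write Comment 1"},
}
for op, deps in nearest_dependencies(complete).items():
    print(op, "->", sorted(deps))
```

For C, only B remains: once B is visible, A must already be visible, which is why dropping A does not affect correctness.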

DM Protocol: Benefits

• Keeps dependency metadata small and bounded
  – Only tracks nearest dependencies by resetting the client DM after each update
  – Number of elements in a DM is fixed
  – Utilizes sparse matrix encoding
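Because the client DM is reset after each update, most of its entries are zero, which is what makes a sparse encoding pay off. The slides do not give the wire format, so the (partition, replica, value) triples below are an assumed encoding and the function names are hypothetical.

```python
def encode_sparse(dm):
    """Keep only the non-zero entries as (partition, replica, value) triples."""
    return [
        (p, r, v)
        for p, row in enumerate(dm)
        for r, v in enumerate(row)
        if v != 0
    ]


def decode_sparse(triples, num_partitions, num_replicas):
    """Rebuild the full matrix from the triples."""
    dm = [[0] * num_replicas for _ in range(num_partitions)]
    for p, r, v in triples:
        dm[p][r] = v
    return dm


dm = [[4, 0], [0, 0], [0, 0]]
encoded = encode_sparse(dm)
print(encoded)
print(decode_sparse(encoded, 3, 2) == dm)
```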

Outline

• DM protocol
• DM-Clock protocol
• Evaluation
• Conclusions

Read Transaction on Causal Snapshot

[Figure: at Replica A, Bob (steps 1 to 4) works on an album marked "Public", changes it to "Only close friends!", and updates a photo; at Replica B, Mom (steps 1 and 2) reads the album and the photo]

DM-Clock Protocol (1/2)

• Provides causally consistent read-only transactions
• Requires loosely synchronized clocks (NTP)
• Data structures
  – Client: DM = [0 0; 0 0; 0 0], PDT = 0
  – Partition 0: VV = [3 8]
  – Item A, rid = A, ut = 2, dm = [1 4; 0 0; 0 0], put = 27
  – Item B, rid = B, ut = 5, dm = [0 5; 0 0; 1 0], put = 35
  – put and PDT are timestamps from physical clocks

DM-Clock Protocol (2/2)

• Still tracks nearest dependencies
• Read-only transaction
  – Obtains snapshot timestamp from local physical clock
  – Reads latest versions created “before” snapshot time
• A cut of the causal relationship graph

[Figure: causal relationship graph with nodes A0, B0, B2, C0, C1, D3, and E0, cut by the snapshot timestamp]
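The snapshot read can be sketched against a multi-version store in which each version carries its physical update timestamp. The store layout, the function names, the example timestamps, and the choice of "put <= snapshot" as the meaning of "before" are assumptions for illustration; the sketch also leaves out any waiting a real server might need to account for clock skew or updates still in flight.

```python
def read_at(versions, snapshot_ts):
    """Return the latest value whose physical timestamp is <= snapshot_ts.

    versions is a list of (put, value) pairs for one key.
    """
    visible = [(put, value) for put, value in versions if put <= snapshot_ts]
    if not visible:
        return None
    return max(visible, key=lambda pv: pv[0])[1]


def read_only_transaction(store, keys, clock):
    """Read all keys at a single snapshot timestamp taken from a local clock."""
    snapshot_ts = clock()
    return [read_at(store.get(k, []), snapshot_ts) for k in keys]


# Hypothetical version histories: (physical update timestamp, value).
store = {
    "album": [(10, "Public"), (30, "Only close friends!")],
    "photo": [(20, "old photo"), (40, "new photo")],
}

print(read_only_transaction(store, ["album", "photo"], clock=lambda: 25))
print(read_only_transaction(store, ["album", "photo"], clock=lambda: 35))
```

With the snapshot at 35, the reader sees the restricted album together with the old photo, never the new photo paired with the stale "Public" setting.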

Outline

• DM protocol
• DM-Clock protocol
• Evaluation
• Conclusions

Evaluation

• Orbe
  – A partitioned and replicated key-value store
  – Implements the DM and DM-Clock protocols
• Experiment Setup
  – A local cluster of 16 servers
  – 120 ms update latency

Evaluation Questions

1. Does Orbe scale out?
2. How does Orbe compare to eventual consistency?
3. How does Orbe compare to COPS?

Throughput over Num. of Partitions

Workload: Each client accesses two partitions.

Orbe scales out as the number of partitions increases.

Throughput over Varied Workloads

Workload: Each client accesses three partitions.

Orbe incurs relatively small overhead for tracking dependencies under many workloads.

Orbe Metadata Percentage

Dependency metadata is relatively small and bounded.

Orbe Dependency Check Messages

The number of dependency check messages is relatively small and bounded.

Orbe & COPS: Throughput over Client Inter-Operation Delays

Workload: Each client accesses three partitions.

Orbe & COPS: Number of Dependencies per Update

Orbe only tracks nearest dependencies when supporting read-only transactions.

In the Paper

• Protocols
  – Conflict detection
  – Causal snapshot for read transaction
  – Garbage collection
• Fault-tolerance and recovery
• Dependency cleaning optimization
• More experimental results
  – Micro-benchmarks & latency distribution
  – Benefits of dependency cleaning

Conclusions

• Orbe provides scalable causal consistency
  – Partitioned and replicated data store
• DM protocol
  – Dependency matrices
• DM-Clock protocol
  – Dependency matrices + physical clocks
  – Read-only transactions (causal consistency)
• Performance
  – Scale out, low overhead, comparison to EC & COPS