
Post on 14-Dec-2015


Page 1: Transaction chains: achieving serializability with low-latency in geo-distributed storage systems Yang Zhang Russell Power Siyuan Zhou Yair Sovran *Marcos

Transaction chains: achieving serializability with low-latency in geo-distributed storage systems

Yang Zhang Russell Power Siyuan Zhou Yair Sovran *Marcos K. Aguilera Jinyang Li

New York University *Microsoft Research Silicon Valley

Page 2:

Large-scale Web applications

Why geo-distributed storage?

Geo-distributed storage

Replication

Page 3:

Geo-distribution is hard

• Low latency: O(intra-datacenter RTT)

• Strong semantics: relational tables w/ transactions

Page 4:

Prior work

[Figure: design space of geo-distributed storage. Consistency levels: strict serializable, serializable, various non-serializable, eventual. Transaction support: key/value only, limited forms of transaction, general transaction.]

• Systems shown: Spanner [OSDI’12], Dynamo [SOSP’07], COPS [SOSP’11], Walter [SOSP’11], Eiger [NSDI’13]

• Annotations: high latency; provably high latency according to CAP

• Our work: low latency (marked "?" in the figure)

Page 5:

Our contributions

1. A new primitive: transaction chain
– Allows for low-latency, serializable transactions

2. Lynx geo-storage system: built with chains
– Relational tables
– Secondary indices, materialized join views

Page 6:

Talk Outline
• Motivation
• Transaction chains
• Lynx
• Evaluation

Page 7:

Why transaction chains?

Bids table columns: Bidder | Item | Price

Items table columns: Seller | Item | Highest bid

Alice Book $100

Bob Book $20

Alice iPhone $20

Bob

Datacenter-1 Datacenter-2

Alice

Bob Camera $100

Auction service

Page 8:

Why transaction chains?

Operation: Alice bids on Bob’s camera

1. Insert bid to Alice’s Bids

2. Update highest bid on Bob’s Items

Alice’s Bids: Alice Book $100

Bob’s Items: Bob Camera $100

Datacenter-1 Datacenter-2

Users: Alice, Bob

Page 10:

Low latency with first-hop return

Alice’s BidsAlice Book $100

Bob

Datacenter-1 Datacenter-2

Alice

Bob’s Items

Bob Camera $100

bid on Bob’s camera

Alice Camera $500

$500

Page 11:

Problem: what if chains fail?

1. What if servers fail after executing first-hop?

2. What if a chain is aborted in the middle?

Page 12:

Solution: provide all-or-nothing atomicity

1. Chains are durably logged at first hop
– Logs are replicated to the closest other data center
– Chains are re-executed upon recovery

2. Chains allow user-aborts only at first hop

• Guarantee: if the first hop commits, all hops eventually commit
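
The guarantee above can be illustrated with a minimal Python sketch. It is not Lynx code: `ChainLog`, `start_chain`, and `resume_chains` are hypothetical names, hops are modeled as plain callables over a shared dictionary, and the sketch assumes that committing the first hop and writing the log entry happen atomically and that hops are safe to re-run.

```python
class ChainLog:
    """Hypothetical stand-in for the durable log kept at the first-hop server."""

    def __init__(self):
        # chain_id -> {"hops": list of callables, "next": index of next hop to run}
        self.entries = {}

    def record(self, chain_id, hops):
        self.entries[chain_id] = {"hops": hops, "next": 1}

    def advance(self, chain_id):
        self.entries[chain_id]["next"] += 1


def start_chain(log, chain_id, hops, state):
    """Run the first hop; this is the only place a user abort is allowed."""
    if not hops[0](state):
        return False            # aborted at first hop: nothing is logged
    log.record(chain_id, hops)  # assumed atomic with the first-hop commit
    return True                 # first-hop return: the caller sees success now


def resume_chains(log, state):
    """Re-execute every logged chain from its next unfinished hop."""
    for chain_id, entry in log.entries.items():
        while entry["next"] < len(entry["hops"]):
            entry["hops"][entry["next"]](state)
            log.advance(chain_id)


# Usage: Alice bids $500; the second hop only runs when chains are resumed.
state = {"bids": [], "highest": 100}
hops = [
    lambda s: s["bids"].append(500) or True,             # hop 1: insert bid
    lambda s: s.update(highest=max(s["highest"], 500)),  # hop 2: update item
]
log = ChainLog()
committed = start_chain(log, "c1", hops, state)
highest_after_first_hop = state["highest"]
resume_chains(log, state)
```

Because the chain is logged before the client hears back, a crash after the first hop leaves enough information to finish the remaining hops later.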

Page 13:

Problem: non-serializable interleaving

• Concurrent chains ordered inconsistently at different hops
– T1: X=1, Y=1
– T2: X=2, Y=2
– Server-X: T1 < T2; Server-Y: T2 < T1
– Not serializable!

• Traditional 2PL+2PC prevents non-serializable interleaving at the cost of high latency
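
A small Python sketch of the interleaving on this slide, assuming (as the figure suggests) that T1 writes X=1 then Y=1 and T2 writes X=2 then Y=2. The helper `run` is hypothetical; it simply applies writes in the given order so the interleaved result can be compared with both serial orders.

```python
T1 = [("X", 1), ("Y", 1)]  # assumed: T1 writes X=1 at Server-X, then Y=1 at Server-Y
T2 = [("X", 2), ("Y", 2)]  # assumed: T2 writes X=2 at Server-X, then Y=2 at Server-Y


def run(schedule):
    """Apply writes in order and return the final key/value state."""
    state = {}
    for key, value in schedule:
        state[key] = value
    return state


# The only two serial executions: T1 then T2, or T2 then T1.
serial_outcomes = [run(T1 + T2), run(T2 + T1)]

# Server-X orders T1 before T2, but Server-Y orders T2 before T1.
interleaved = run([T1[0], T2[0], T2[1], T1[1]])

is_serializable = interleaved in serial_outcomes
```

The interleaved run ends with X from T2 and Y from T1, a state that neither serial order can produce.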

Page 14:

Solution: detect non-serializable interleaving via static analysis

• Statically analyze all chains to be executed
– Web applications invoke a fixed set of operations

• Serializable if no SC-cycle [Shasha et al. TODS’95]
– An SC-cycle has both red and blue edges

[Figure: T1 (X=1, Y=1) and T2 (X=2, Y=2), with an edge labeled "Conflict?".]
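
A brute-force Python sketch of the SC-cycle check, assuming the usual reading of the SC-graph: vertices are hops, S-edges join hops of the same chain, and C-edges join conflicting hops of different chains. The function name `has_sc_cycle` and the edge encoding are illustrative choices, not Lynx's analyzer, and the exhaustive search is only meant for small graphs.

```python
def has_sc_cycle(edges):
    """edges: list of (u, v, kind), undirected, with kind "S" or "C".

    Returns True if some simple cycle uses at least one S-edge and one C-edge.
    """
    adj = {}
    for idx, (u, v, kind) in enumerate(edges):
        adj.setdefault(u, []).append((v, kind, idx))
        adj.setdefault(v, []).append((u, kind, idx))

    def dfs(start, node, visited, used_edges, kinds):
        for nxt, kind, idx in adj[node]:
            if idx in used_edges:
                continue                      # never reuse an edge
            if nxt == start and len(used_edges) >= 1:
                if kinds | {kind} == {"S", "C"}:
                    return True               # closed a cycle with both kinds
                continue
            if nxt in visited:
                continue                      # keep the cycle simple
            if dfs(start, nxt, visited | {nxt}, used_edges | {idx}, kinds | {kind}):
                return True
        return False

    return any(dfs(s, s, {s}, frozenset(), frozenset()) for s in adj)


# Two chains that each touch X and then Y, conflicting at both hops.
two_hop_conflict = [
    ("T1.X", "T1.Y", "S"),
    ("T2.X", "T2.Y", "S"),
    ("T1.X", "T2.X", "C"),
    ("T1.Y", "T2.Y", "C"),
]

# A two-hop writer and a hypothetical single-hop reader: one conflict, no cycle.
single_conflict = [
    ("W.bids", "W.items", "S"),
    ("W.bids", "R.bids", "C"),
]
```

Here `has_sc_cycle(two_hop_conflict)` is true and `has_sc_cycle(single_conflict)` is false, matching the intuition that chains conflicting at only one hop cannot be ordered inconsistently.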

Page 15:

Outline
• Motivation
• Transaction chains
• Lynx’s design
• Evaluation

Page 16:

How Lynx uses chains

• User chains: used by programmers to implement application logic

• System chains: used internally to maintain
– Secondary indexes
– Materialized join views
– Geo-replicas

Page 17:

Example: secondary index

Bob Car $20

Alice Book $20

Bob Camera $100

Alice iPhone $100

Bids (base table) columns: Bidder | Item | Price

Alice Camera $100

Bob iPhone $20

Bids (secondary index) columns: Bidder | Item | Price

Alice Camera $100

Bob Car $20

Page 18:

Example user and system chain

Alice Book $100

Bob

Datacenter-1 Datacenter-2

Alice

Bob Camera $100

bid on Bob’s camera

Alice Camera $100

Page 19:

Insert to Bids table

Update Items table

Lynx statically analyzes all chains beforehand

Put-bid

Read-bids

Put-bid

Insert to Bids table

Update Items table

Read-bids

SC-cycle

One solution: execute chain as a distributed transaction

Read Bids table

Read Bids table

Page 20:

SC-cycle source #1: false conflicts in user chains

Put-bid: Insert to Bids table, Update Items table

Put-bid: Insert to Bids table, Update Items table

False conflict because max(bid, current_price) commutes
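
A short Python check of why the conflict is false, using a hypothetical `update_highest` helper that mirrors the slide's `max(bid, current_price)`. A plain overwrite is included only for contrast; it is not part of the slide.

```python
def update_highest(current_price, bid):
    """The Items-table update from the slide: keep the larger price."""
    return max(bid, current_price)


def overwrite(current_price, bid):
    """Contrast case (not from the slide): the last writer wins."""
    return bid


# Apply bids of $500 and $300 to a $100 item in both orders.
max_a_then_b = update_highest(update_highest(100, 500), 300)
max_b_then_a = update_highest(update_highest(100, 300), 500)

set_a_then_b = overwrite(overwrite(100, 500), 300)
set_b_then_a = overwrite(overwrite(100, 300), 500)
```

Both orders of the max update leave the same highest bid, so the order chosen at the Items server does not matter, whereas the overwrite ends with a different value depending on order.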

Page 21:

Solution: users annotate commutativity

Put-bid: Insert to Bids table, Update Items table

Put-bid: Insert to Bids table, Update Items table

Edge label: commutes

Page 22:

SC-cycle source #2: system chains

Put-bid (two instances): Insert to Bids table, …

Insert to Bids-secondary (one per Put-bid)

SC-cycle

Page 23:

Solution: chains provide origin-ordering

• Observation: conflicting system chains originate at the same first-hop server.
– Both write the same row of Bids table

• Origin-ordering: if chains T1 < T2 at same first hop, then T1 < T2 at all subsequent overlapping hops.
– Can be implemented cheaply with sequence number vectors

T1: Insert to Bids table, Insert to Bids-secondary

T2: Insert to Bids table, Insert to Bids-secondary
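
One way origin-ordering could be enforced with sequence numbers is sketched below in Python. This is a simplification under stated assumptions, not the Lynx protocol: `OriginServer` and `DownstreamServer` are hypothetical classes, the origin keeps one counter per destination (together these counters play the role of a sequence number vector), and the downstream server buffers any hop that arrives ahead of its turn.

```python
class OriginServer:
    """First-hop server: stamps each forwarded hop with a per-destination number."""

    def __init__(self, name):
        self.name = name
        self.next_seq = {}  # destination -> next sequence number to hand out

    def stamp(self, destination, payload):
        seq = self.next_seq.get(destination, 0)
        self.next_seq[destination] = seq + 1
        return (self.name, seq, payload)


class DownstreamServer:
    """Later hop: applies hops from each origin strictly in stamped order."""

    def __init__(self):
        self.expected = {}  # origin -> next sequence number to apply
        self.pending = {}   # (origin, seq) -> payload waiting for its turn
        self.applied = []

    def receive(self, message):
        origin, seq, payload = message
        self.pending[(origin, seq)] = payload
        nxt = self.expected.get(origin, 0)
        while (origin, nxt) in self.pending:
            self.applied.append(self.pending.pop((origin, nxt)))
            nxt += 1
        self.expected[origin] = nxt


# T1 precedes T2 at the first hop, but T2's message arrives first downstream.
origin = OriginServer("bids-server")
m1 = origin.stamp("bids-secondary", "T1: insert to Bids-secondary")
m2 = origin.stamp("bids-secondary", "T2: insert to Bids-secondary")

secondary = DownstreamServer()
secondary.receive(m2)
applied_after_t2_only = list(secondary.applied)
secondary.receive(m1)
```

The out-of-order arrival is held back until T1's hop shows up, so the secondary index sees T1 before T2, the same order as the base table.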

Page 24:

Limitations of Lynx/chains

1. Chains are not strictly serializable, only serializable.

2. Programmers can abort only at first hop

• Our application experience: limitations are manageable

Page 25:

Outline
• Motivation
• Transaction chains
• Lynx’s design
• Evaluation

Page 26:

Simple Twitter Clone on Lynx

Tweets
Author | Tweet
Alice | New York rocks
Bob | Time to sleep
Eve | Hi there

Follow-Graph
From | To
Alice | Bob
Alice | Eve

Follow-Graph (secondary)
To | From
Bob | Alice
Bob | Clark

Tweets JOIN Follow-Graph (Timeline)
Author (=To) | From | Tweet
Bob | Alice | Time to sleep
Eve | Alice | Hi there

The figure labels two of the tables "Geo-replicated".
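
The Timeline view above is the join of Tweets with Follow-Graph on Author = To. The Python sketch below recomputes it from scratch using the rows on the slide; `timeline_view` is a hypothetical helper, and Lynx itself maintains the view through system chains rather than by recomputation.

```python
# Rows taken from the slide.
tweets = [
    ("Alice", "New York rocks"),
    ("Bob", "Time to sleep"),
    ("Eve", "Hi there"),
]  # (Author, Tweet)
follow_graph = [("Alice", "Bob"), ("Alice", "Eve")]  # (From, To)


def timeline_view(tweets, follow_graph):
    """Join Tweets with Follow-Graph on Author = To."""
    rows = []
    for follower, followee in follow_graph:
        for author, text in tweets:
            if author == followee:
                rows.append((author, follower, text))  # (Author, From, Tweet)
    return rows


timeline = timeline_view(tweets, follow_graph)
```

The result contains one row per tweet written by someone Alice follows, matching the two Timeline rows shown on the slide.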

Page 27:

Experimental setup

• Datacenters: us-west, us-east, europe

• Inter-datacenter RTTs shown in the figure: 82ms, 102ms, 153ms

Lynx prototype:
• In-memory database
• Local disk logging only

Page 28:

Returning on first-hop allows low latency

[Figure: bar chart of latency (ms). Chain completion: Follow-user 174, Post-tweet 252. First hop return: Follow-user 3.2, Post-tweet 3.1, Read-timeline 3.1.]

Page 29:

Applications achieve good throughput

[Figure: bar chart of throughput (ops/sec). Follow-User 184,000; Post-Tweet 173,000; Read-Timeline 1,350,000.]

Page 30:

Related work

• Transaction decomposition
– SAGAS [SIGMOD’87], step-decomposed transactions

• Incremental view maintenance
– Views for PNUTS [SIGMOD’09]

• Various geo-distributed/replicated storage
– Spanner [OSDI’12], MDCC [Eurosys’13], Megastore [CIDR’11], COPS [SOSP’11], Eiger [NSDI’13], RedBlue [OSDI’12]

Page 31:

Conclusion

• Chains support serializability at low latency
– With static analysis of SC-cycles

• Key techniques to reduce SC-cycles
– Origin ordering
– Commutative annotation

• Chains are useful
– Performing application logic
– Maintaining indices/join views/geo-replicas

Page 33:

Limitations of Lynx/chains

1. Chains are not strictly serializable
– Remedies: programmers can wait for chain completion; Lynx provides read-your-own-writes

2. Programmers can only abort at first hop

• Our application experience shows the limitations are manageable

[Figure: timeline contrasting Serializable and Strict serializable.]

Page 35:

2PC and chains: the easy way

[Figure: timelines of T1 and T2 built from the operations W(A), W(B), R(A), and 2PC-W(AB).]

Page 36:

2PC and chains: the hard way

[Figure: timelines of T1 and T2 built from the operations W(A), W(B), R(A) R(B), and 2PC-W(AB).]

Page 37:

2PC and chains: the hard way

[Figure: a chain over items A, B, C, D and datacenters DC1, DC2, DC3, DC4, annotated "2PC retry" and "Parallel unlock".]

Page 39:

Lynx is scalable

QPS (K/s) by number of servers per DC:

#Servers per DC | 1 | 2 | 4 | 8
Follow | 48 | 93 | 184 | 374
Tweet | 42 | 86 | 173 | 356
Timeline | 265 | 586 | 1350 | 2770

Page 41:

Challenge of static analysis: false conflict

T1: 1. Insert bid into bid history 2. Update max price on item

T2: 1. Insert bid into bid history 2. Update max price on item

• Conflict on bid history
• Conflict on item
• SC-cycle: not serializable

Page 42:

Solution: commutativity annotations

T1: 1. Insert bid into bid history 2. Update max price on item

T2: 1. Insert bid into bid history 2. Update max price on item

• Conflict on bid history: commutative operation
– No real conflict because bid ids are unique

• Conflict on item: commutative operation
– Updating max commutes

• No SC-cycle: serializable

Page 43:

ACID: all-or-nothing atomicity

• Chain’s failure guarantee:
– If the first hop of a chain commits, then all hops eventually commit
– Users are only allowed to abort a chain in the first hop

• Achievable with low latency:
– Log chains durably at the first hop (logs replicated to a nearby datacenter)
– Re-execute stalled chains upon failure recovery

Page 44:

ACID: serializability

• Serializability
– Execution result appears as if all transactions obeyed a serial order
– No restrictions on the serial order

[Figure: a set of transactions with two possible serial orders, Ordering 1 and Ordering 2.]

Page 45:

Problem #2: unsafe interleaving

• Serializability
– Execution result appears as if all transactions obeyed a serial order
– No restrictions on the serial order

[Figure: a set of transactions with two possible serial orders, Ordering 1 and Ordering 2.]

Page 46:

Chains are not linearizable

• Serializability: a total ordering of chains

• Linearizability: a total ordering of chains & total order obeys the issue order

[Figure: transactions issued over time with two candidate orderings, Ordering 1 and Ordering 2; the label "Linearizable" appears once.]

Page 47:

Transaction chains: recap

• Chains provide all-or-nothing atomicity
• Chains ensure serializability via static analysis
• Practical challenges:
– How to use chains?
– How to avoid SC-cycles?

Page 48:

Example user chain

Users: Alice, Bob

1. Insert bid into Alice’s bid history

Bids columns: Bidder | Item | Price
Alice | Camera | 100

2. Update max price on Bob’s camera

Items columns: Seller | Item | Highest
Before: Bob | Camera
After: Bob | Camera | 100
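
The two steps above can be written as a two-hop chain. The Python sketch below is illustrative only: Lynx user chains are written in JavaScript, and `put_bid`, the datacenter labels, and the dictionary-based tables are assumptions made for this example.

```python
def put_bid(bidder, seller, item, price):
    """Return the chain as an ordered list of (location label, hop function)."""

    def insert_bid(tables):
        # Hop 1: runs where the bidder's Bids rows live.
        tables["Bids"].append((bidder, item, price))

    def update_highest(tables):
        # Hop 2: runs where the seller's Items rows live.
        key = (seller, item)
        tables["Items"][key] = max(tables["Items"].get(key, 0), price)

    return [("bidder-datacenter", insert_bid), ("seller-datacenter", update_highest)]


# Run both hops in order against in-memory tables.
tables = {"Bids": [], "Items": {}}
for _location, hop in put_bid("Alice", "Bob", "Camera", 100):
    hop(tables)
```

After both hops run, the Bids table holds Alice's bid and Bob's camera shows 100 as its highest price, as in the tables on the slide.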

Page 49:

Lynx implementation

• 5000 lines C++ and 3500 lines RPC library
• Uses an in-memory key/value store
• Supports user chains in JavaScript (via V8)

Page 50:

Geo-distributed storage is hard

• Applications demand simplicity & performance
– Friendly programming model: relational tables, transactions
– Fast response: ideally, operation latency = O(intra-datacenter RTT)

• Geo-distribution leads to high latency
– Coordinate data access across datacenters
– Operation latency = O(inter-datacenter RTT) = O(100ms)