Transaction chains: achieving serializability with low-latency in geo-distributed storage systems

Post on 24-Feb-2016


Yang Zhang, Russell Power, Siyuan Zhou, Yair Sovran, *Marcos K. Aguilera, Jinyang Li

New York University *Microsoft Research Silicon Valley

Large-scale Web applications

Why geo-distributed storage?

Geo-distributed storage

Replication

Geo-distribution is hard

Low latency: O(intra-datacenter RTT)

Strong semantics: relational tables with transactions

Prior work

[Figure: prior systems compared by consistency level (strict serializable, serializable, various non-serializable, eventual) and by transaction support (key/value only, limited forms of transaction, general transaction). Systems shown: Spanner [OSDI’12], Dynamo [SOSP’07], COPS [SOSP’11], Walter [SOSP’11], Eiger [NSDI’13], and our work. Annotations: "High latency", "Provably high latency according to CAP", and "? Low latency".]

Our contributions
1. A new primitive: transaction chain
– Allows for low-latency, serializable transactions
2. Lynx geo-storage system: built with chains
– Relational tables
– Secondary indices, materialized join views

Talk Outline
• Motivation
• Transaction chains
• Lynx
• Evaluation

Why transaction chains?

Bids table: Bidder, Item, Price. Items table: Seller, Item, Highest bid.

Alice Book $100

Bob Book $20

Alice iPhone $20

Bob

Datacenter-1 Datacenter-2

Alice

Bob Camera $100

Auction service

Why transaction chains?

Operation: Alice bids on Bob’s camera

1. Insert bid to Alice’s Bids

2. Update highest bid on Bob’s Items

Datacenter-1 (Alice): Alice’s Bids holds (Alice, Book, $100)

Datacenter-2 (Bob): Bob’s Items holds (Bob, Camera, $100)


Low latency with first-hop return

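Alice bids $500 on Bob’s camera:

Datacenter-1 (Alice): the first hop inserts (Alice, Camera, $500) into Alice’s Bids, next to (Alice, Book, $100), and the chain returns to the client.

Datacenter-2 (Bob): the second hop raises the highest bid on (Bob, Camera) in Bob’s Items from $100 to $500.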
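The first-hop return can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the dictionaries `bids` and `items`, the `pending` queue, and the functions `run_chain` and `drain` are hypothetical stand-ins for the two datacenters and for asynchronous execution. They are not Lynx's actual API.

```python
from collections import deque

# Hypothetical in-memory stand-ins for the two datacenters' tables.
bids = {"Alice": [("Book", 100)]}     # Datacenter-1: Alice's Bids
items = {("Bob", "Camera"): 100}      # Datacenter-2: Bob's Items

pending = deque()  # hops that still have to run after the first-hop return


def insert_bid(bidder, item, price):
    bids.setdefault(bidder, []).append((item, price))


def update_highest_bid(seller, item, price):
    key = (seller, item)
    items[key] = max(items[key], price)


def run_chain(hops):
    """Run the first hop now, queue the remaining hops, and return to the caller."""
    first, *rest = hops
    first()                # local hop: only intra-datacenter latency
    pending.extend(rest)   # later hops run asynchronously in other datacenters
    return "first hop committed"


def drain():
    """Simulate the background execution of the queued hops."""
    while pending:
        pending.popleft()()


status = run_chain([
    lambda: insert_bid("Alice", "Camera", 500),
    lambda: update_highest_bid("Bob", "Camera", 500),
])
after_first_hop = items[("Bob", "Camera")]    # still the old highest bid
drain()
after_completion = items[("Bob", "Camera")]   # updated once the chain completes
```

The caller sees the result as soon as the local insert is done; the remote update becomes visible only after the queued hop runs.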

Problem: what if chains fail?

1. What if servers fail after executing first-hop?

2. What if a chain is aborted in the middle?

Solution: provide all-or-nothing atomicity

1. Chains are durably logged at the first hop
– Logs are replicated to a nearby datacenter
– Chains are re-executed upon recovery

2. Chains allow user-aborts only at the first hop

• Guarantee: if the first hop commits, all hops eventually commit
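The logging and re-execution idea above can be sketched as follows. The names `log`, `store`, `start_chain`, and `recover` are hypothetical, and a plain dictionary stands in for the replicated durable log. The sketch also assumes that running a hop and recording its progress happen atomically, which a real system would have to ensure (or make hops idempotent).

```python
log = {}     # chain id -> {"hops": [...], "done": number of committed hops}
store = {}   # hypothetical key/value state spread over the datacenters


def put(key, value):
    store[key] = value


def _run_next(chain_id):
    """Execute the next unfinished hop of a chain and record the progress."""
    entry = log[chain_id]
    op, args = entry["hops"][entry["done"]]
    op(*args)
    entry["done"] += 1


def start_chain(chain_id, hops):
    """Log the whole chain first, then commit only the first hop."""
    log[chain_id] = {"hops": hops, "done": 0}
    _run_next(chain_id)


def recover():
    """After a failure, re-execute every logged chain that has not finished."""
    for chain_id, entry in log.items():
        while entry["done"] < len(entry["hops"]):
            _run_next(chain_id)


start_chain("c1", [
    (put, ("bid:Alice:Camera", 500)),
    (put, ("item:Bob:Camera", 500)),
])
committed_before_recovery = dict(store)  # simulated crash right after the first hop
recover()
```

Because the chain was logged before its first hop ran, recovery can always finish it, which is what turns "the first hop committed" into "all hops eventually commit".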

Problem: non-serializable interleaving
• Concurrent chains can be ordered inconsistently at different hops

[Figure: over time, T1 writes X=1 and Y=1 while T2 writes X=2 and Y=2. Server-X orders T1 < T2; Server-Y orders T2 < T1. Not serializable!]

• Traditional 2PL+2PC prevents non-serializable interleaving at the cost of high latency
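To see why the Server-X and Server-Y orders above cannot be reconciled, one can look for a single serial order that agrees with every server. The sketch below is illustrative only: `serial_order` is a hypothetical helper, and it assumes every pair of transactions on a server conflicts (as the writes to X and Y do here). It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(per_server_orders):
    """Return one serial order consistent with every server, or None if none exists."""
    sorter = TopologicalSorter()
    for order in per_server_orders.values():
        for txn in order:
            sorter.add(txn)                # make sure every transaction is a node
        for earlier, later in zip(order, order[1:]):
            sorter.add(later, earlier)     # 'later' must come after 'earlier'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None                        # the servers disagree: not serializable


inconsistent = serial_order({"Server-X": ["T1", "T2"], "Server-Y": ["T2", "T1"]})
consistent = serial_order({"Server-X": ["T1", "T2"], "Server-Y": ["T1", "T2"]})
```

With the orders from the slide, the precedence graph contains the cycle T1 < T2 < T1, so no serial order exists.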

Conflict?

Solution: detect non-serializable interleaving via static analysis

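• Statically analyze all chains to be executed
– Web applications invoke a fixed set of operations

• Serializable if there is no SC-cycle [Shasha et al., TODS’95]
– An SC-cycle has both red and blue edges in the figure, that is, both S-edges (between hops of the same chain) and C-edges (between conflicting hops of different chains)

[Figure: chains T1 (X=1, Y=1) and T2 (X=2, Y=2) drawn as a graph of hops.]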
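A brute-force version of the SC-cycle check can be sketched in a few lines. This is an illustration, not Lynx's analyzer. The input format (`chains` mapping a chain name to a list of `(item, mode)` hops) and the functions `build_edges` and `has_sc_cycle` are assumptions made here. The search enumerates simple cycles by depth-first search, which is exponential in general but fine for a handful of chains.

```python
from itertools import combinations


def build_edges(chains):
    """chains: {name: [(item, mode), ...]} with mode 'r' or 'w', one tuple per hop."""
    nodes = [(name, i) for name, hops in chains.items() for i in range(len(hops))]
    edges = []
    for a, b in combinations(nodes, 2):
        if a[0] == b[0]:
            if abs(a[1] - b[1]) == 1:
                edges.append((a, b, "S"))   # consecutive hops of the same chain
        else:
            (item_a, mode_a) = chains[a[0]][a[1]]
            (item_b, mode_b) = chains[b[0]][b[1]]
            if item_a == item_b and "w" in (mode_a, mode_b):
                edges.append((a, b, "C"))   # conflicting hops of different chains
    return edges


def has_sc_cycle(edges):
    """True if some simple cycle contains at least one S-edge and one C-edge."""
    adj = {}
    for a, b, kind in edges:
        adj.setdefault(a, []).append((b, kind))
        adj.setdefault(b, []).append((a, kind))

    def dfs(start, node, visited, kinds, length):
        for nxt, kind in adj[node]:
            # Closing the cycle needs at least three edges in total.
            if nxt == start and length >= 2 and kinds | {kind} == {"S", "C"}:
                return True
            if nxt not in visited:
                if dfs(start, nxt, visited | {nxt}, kinds | {kind}, length + 1):
                    return True
        return False

    return any(dfs(n, n, {n}, frozenset(), 0) for n in adj)


two_writers = {"T1": [("X", "w"), ("Y", "w")], "T2": [("X", "w"), ("Y", "w")]}
writer_and_reader = {"T1": [("X", "w"), ("Y", "w")], "T3": [("X", "r")]}
cycle_found = has_sc_cycle(build_edges(two_writers))
no_cycle = has_sc_cycle(build_edges(writer_and_reader))
```

For the T1/T2 example the cycle is T1.hop0, T1.hop1, T2.hop1, T2.hop0 and back, alternating S-edges and C-edges. A single-hop reader cannot close such a cycle.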

Outline
• Motivation
• Transaction chains
• Lynx’s design
• Evaluation

How Lynx uses chains

• User chains: used by programmers to implement application logic

• System chains: used internally to maintain
– Secondary indexes
– Materialized join views
– Geo-replicas

Example: secondary index

[Figure: the Bids base table and its secondary index, both with columns Bidder, Item, Price. Rows shown across the two tables: (Alice, Book, $20), (Alice, iPhone, $100), (Alice, Camera, $100), (Bob, Car, $20), (Bob, Camera, $100), (Bob, iPhone, $20).]
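A system chain that keeps a secondary index in step with its base table might look like the sketch below. The layout is an assumption made for illustration: the base table is taken to be partitioned by bidder and the index keyed by item, and `insert_bid_chain` is a hypothetical name rather than a Lynx function.

```python
from collections import defaultdict

base_by_bidder = defaultdict(list)   # Bids (base table), assumed partitioned by bidder
index_by_item = defaultdict(list)    # Bids (secondary index), assumed keyed by item


def insert_bid_chain(bidder, item, price):
    """Return the hops of one chain: the base-table insert, then the index insert."""
    row = (bidder, item, price)
    return [
        lambda: base_by_bidder[bidder].append(row),   # hop on the base table
        lambda: index_by_item[item].append(row),      # system hop that updates the index
    ]


for hop in insert_bid_chain("Alice", "Camera", 100):
    hop()
for hop in insert_bid_chain("Bob", "Car", 20):
    hop()

camera_bids = index_by_item["Camera"]   # lookup by item goes through the index
bob_bids = base_by_bidder["Bob"]        # lookup by bidder goes through the base table
```

Since the index insert is just another hop, it inherits the chain guarantee: once the base-table insert commits, the index entry eventually appears.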

Example user and system chain

[Figure: Alice (Datacenter-1) bids on Bob’s camera (Datacenter-2). The row (Alice, Camera, $100) is inserted into the Bids table, and the Items table is then updated. Other rows shown: (Alice, Book, $100) and (Bob, Camera, $100).]

Lynx statically analyzes all chains beforehand

[Figure: the chains Put-bid (Insert to Bids table, then Update Items table) and Read-bids (Read Bids table), with an SC-cycle marked.]

One solution: execute the chain as a distributed transaction

SC-cycle source #1: false conflicts in user chains

[Figure: two instances of Put-bid, each with the hops Insert to Bids table and Update Items table.]

False conflict because max(bid, current_price) commutes

Solution: users annotate commutativity

[Figure: two instances of Put-bid, each with the hops Insert to Bids table and Update Items table; the conflict edge between them is annotated "commutes".]
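The claim that max(bid, current_price) commutes can be checked directly by applying the same bids in every possible order. The helper `final_states` and the sample prices are made up for this illustration; a plain overwrite is included as a contrast because it does not commute.

```python
from itertools import permutations


def update_max(current_price, bid):
    return max(bid, current_price)


def overwrite(current_price, bid):
    return bid


def final_states(op, start, bids):
    """Set of final values reached over every ordering of the bids."""
    results = set()
    for order in permutations(bids):
        value = start
        for bid in order:
            value = op(value, bid)
        results.add(value)
    return results


max_results = final_states(update_max, 100, [500, 300, 120])
overwrite_results = final_states(overwrite, 100, [500, 300, 120])
```

Every ordering of the max updates ends in the same state, so the conflict edge between two such hops can be dropped from the analysis. The overwrite version ends in a different state depending on which bid runs last.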

SC-cycle source #2: system chains

[Figure: two instances of Put-bid, each with the hops Insert to Bids table, …, and the system hop Insert to Bids-secondary; together they form an SC-cycle.]

Solution: chains provide origin-ordering
• Observation: conflicting system chains originate at the same first-hop server.
– Both write the same row of the Bids table
• Origin-ordering: if chains T1 < T2 at the same first hop, then T1 < T2 at all subsequent overlapping hops.
– Can be implemented cheaply with sequence number vectors

[Figure: T1 and T2, each with the hops Insert to Bids table and Insert to Bids-secondary.]
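One way to picture origin-ordering with sequence numbers is a downstream server that applies hops from each origin strictly in the order the origin assigned. The class below is a simplified sketch, not the Lynx protocol: `DownstreamServer` and its fields are hypothetical, and it assumes every chain from an origin visits this server, so sequence numbers arrive without gaps.

```python
class DownstreamServer:
    """Applies hops from each origin server in that origin's sequence order."""

    def __init__(self):
        self.next_seq = {}   # origin server -> next sequence number to apply
        self.buffered = {}   # (origin, seq) -> hop waiting for its turn
        self.applied = []    # hops in the order they were actually applied

    def receive(self, origin, seq, hop):
        self.buffered[(origin, seq)] = hop
        expected = self.next_seq.get(origin, 0)
        # Apply as many consecutive hops from this origin as are available.
        while (origin, expected) in self.buffered:
            self.applied.append(self.buffered.pop((origin, expected)))
            expected += 1
        self.next_seq[origin] = expected


server = DownstreamServer()
# T2's hop overtakes T1's hop on the network, although T1 < T2 at the first hop.
server.receive("bids-server", 1, "T2: insert to Bids-secondary")
applied_after_t2 = list(server.applied)
server.receive("bids-server", 0, "T1: insert to Bids-secondary")
```

The early hop of T2 is held back until T1's hop arrives, so the downstream order matches the first-hop order.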

Limitations of Lynx/chains
1. Chains are not strictly serializable, only serializable.
2. Programmers can abort only at the first hop

• Our application experience: limitations are manageable

Outline
• Motivation
• Transaction chains
• Lynx’s design
• Evaluation

Simple Twitter Clone on Lynx

Tweets
Author  Tweet
Alice   New York rocks
Bob     Time to sleep
Eve     Hi there

Follow-Graph
From   To
Alice  Bob
Alice  Eve

Follow-Graph (secondary)
To   From
Bob  Alice
Bob  Clark

Tweets JOIN Follow-Graph (Timeline)
Author (= To)  From   Tweet
Bob            Alice  Time to sleep
Eve            Alice  Hi there

Two of these tables are marked "Geo-replicated" in the figure.
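The Timeline view is the join of Tweets with Follow-Graph on Author = To. The sketch below recomputes the whole view from the sample rows above, which is an assumption made for brevity: Lynx maintains such views through system chains, whereas `timeline_view` and `read_timeline` are hypothetical helpers that only show what the view contains.

```python
tweets = [
    ("Alice", "New York rocks"),
    ("Bob", "Time to sleep"),
    ("Eve", "Hi there"),
]
follow_graph = [("Alice", "Bob"), ("Alice", "Eve")]   # (From, To)


def timeline_view(tweets, follow_graph):
    """Join Tweets with Follow-Graph on Author = To, giving (Author, From, Tweet) rows."""
    return [
        (author, follower, text)
        for follower, followee in follow_graph
        for author, text in tweets
        if author == followee
    ]


def read_timeline(view, user):
    """Tweets that a given user sees: rows of the view whose From column is the user."""
    return [(author, text) for author, follower, text in view if follower == user]


timeline = timeline_view(tweets, follow_graph)
alice_timeline = read_timeline(timeline, "Alice")
```

With the view materialized, reading a timeline is a lookup on the From column instead of a join at read time.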

Experimental setup

[Figure: three datacenters (us-west, europe, us-east); the inter-datacenter round-trip times shown are 82 ms, 153 ms, and 102 ms.]

Lynx prototype:
• In-memory database
• Local disk logging only

Returning on first-hop allows low latency

[Figure: latency (ms), y-axis from 0 to 300. Chain completion: Follow-user 174, Post-tweet 252. First-hop return: Follow-user 3.2, Post-tweet 3.1, Read-timeline 3.1.]

Applications achieve good throughput

[Figure: throughput in ops/sec (y-axis labeled "Million ops/sec", from 0 to 1,600,000). Follow-User: 184,000; Post-Tweet: 173,000; Read-Timeline: 1,350,000.]

Related work
• Transaction decomposition
– SAGAS [SIGMOD’87], step-decomposed transactions
• Incremental view maintenance
– Views for PNUTS [SIGMOD’09]
• Various geo-distributed/replicated storage
– Spanner [OSDI’12], MDCC [Eurosys’13], Megastore [CIDR’11], COPS [SOSP’11], Eiger [NSDI’13], RedBlue [OSDI’12]

Conclusion
• Chains support serializability at low latency
– With static analysis of SC-cycles
• Key techniques to reduce SC-cycles
– Origin ordering
– Commutative annotation
• Chains are useful
– Performing application logic
– Maintaining indices/join views/geo-replicas

Limitations of Lynx/chains
1. Chains are not strictly serializable

[Figure: timelines contrasting serializable and strictly serializable orderings.]

Remedies:
– Programmers can wait for chain completion
– Lynx provides read-your-own-writes

2. Programmers can only abort at the first hop
• Our application experience shows the limitations are manageable

2PC and chains: the easy way

[Figure: schedules of T1 and T2 built from the operations W(A), W(B), R(A), and 2PC-W(AB).]

2PC and chains: the hard way

[Figure: schedules of T1 and T2 built from the operations W(A), W(B), R(A) R(B), and 2PC-W(AB).]

2PC and chains: the hard way

[Figure: a chain across DC1, DC2, DC3, and DC4, with items A, B, C, and D; labels: "2PC retry" and "Parallel unlock".]

Lynx is scalable

[Figure: throughput (K queries/s) versus number of servers per datacenter; y-axis from 0 to 3000.]

#Servers per DC   1    2    4     8
Follow            48   93   184   374
Tweet             42   86   173   356
Timeline          265  586  1350  2770

T1: 1. Insert bid into bid history 2. Update max price on item

T2: 1. Insert bid into bid history 2. Update max price on item

Conflict on bid history; conflict on item

SC-cycle: not serializable

Challenge of static analysis: false conflict

Solution: commutativity annotations

T1: 1. Insert bid into bid history 2. Update max price on item

T2: 1. Insert bid into bid history 2. Update max price on item

Conflict on bid history: no real conflict because bid ids are unique (commutative operation)

Conflict on item: updating max commutes (commutative operation)

No SC-cycle: serializable

ACID: all-or-nothing atomicity
• Chain’s failure guarantee:
– If the first hop of a chain commits, then all hops eventually commit
• Users are only allowed to abort a chain in the first hop
• Achievable with low latency:
– Log chains durably at the first hop
• Logs replicated to a nearby datacenter
– Re-execute stalled chains upon failure recovery

ACID: serializability
• Serializability
– The execution result appears as if all transactions obeyed some serial order
– No restrictions on the serial order

[Figure: a set of transactions with two possible serial orderings, Ordering 1 and Ordering 2.]

Problem #2: unsafe interleaving
• Serializability
– The execution result appears as if all transactions obeyed some serial order
– No restrictions on the serial order

[Figure: a set of transactions with two possible serial orderings, Ordering 1 and Ordering 2.]

Chains are not linearizable
• Serializability: a total ordering of chains
• Linearizability: a total ordering of chains, and the total order obeys the issue order

[Figure: transactions over time with two orderings, Ordering 1 and Ordering 2, and the label "Linearizable".]

Transaction chains: recap
• Chains provide all-or-nothing atomicity
• Chains ensure serializability via static analysis
• Practical challenges:
– How to use chains?
– How to avoid SC-cycles?

Example user chain

1. Insert bid into Alice’s bid history

Bids (Bidder, Item, Price), on Alice’s side: (Alice, Camera, 100)

2. Update max price on Bob’s camera

Items (Seller, Item, Highest), on Bob’s side: (Bob, Camera) becomes (Bob, Camera, 100)

Lynx implementation

• 5000 lines of C++ and 3500 lines of RPC library
• Uses an in-memory key/value store
• Supports user chains in JavaScript (via V8)

Geo-distributed storage is hard
• Applications demand simplicity & performance
– Friendly programming model
• Relational tables
• Transactions
– Fast response
• Ideally, operation latency = O(intra-datacenter RTT)
• Geo-distribution leads to high latency
– Coordinate data access across datacenters
• Operation latency = O(inter-datacenter RTT) = O(100ms)