Date post: 06-Jan-2018
Blazes: coordination analysis for distributed programs. Peter Alvaro, Neil Conway, Joseph M. Hellerstein (UC Berkeley); David Maier (Portland State)
Transcript
Page 1:

Blazes: coordination analysis for distributed programs

Peter Alvaro, Neil Conway, Joseph M. Hellerstein (UC Berkeley)

David Maier (Portland State)

Page 2:

Distributed systems are hard

Asynchrony

Partial failure

Page 3:

Asynchrony isn’t that hard

Amelioration:

Logical timestamps

Deterministic interleaving

Page 4:

Partial failure isn’t that hard

Amelioration:

Replication

Replay

Page 5:

Asynchrony * partial failure is hard²

Logical timestamps

Deterministic interleaving

Replication

Replay

Page 6:

Asynchrony * partial failure is hard²

Replication

Replay

Today:

Consistency criteria for fault-tolerant distributed systems

Blazes: analysis and enforcement

Page 7:

This talk is all setup

Frame of mind:

1. Dataflow: a model of distributed computation
2. Anomalies: what can go wrong?
3. Remediation strategies
   1. Component properties
   2. Delivery mechanisms

Framework:

Blazes – coordination analysis and synthesis

Page 8:

Little boxes: the dataflow model

Generalization of distributed services

Components interact via asynchronous calls (streams)

Page 9:

Components

Input interfaces Output interface

Page 10:

Streams

Nondeterministic order

Page 11:

Example: a join operator

R

S

T

Page 12:

Example: a key/value store

put

get

response

Page 13:

Example: a pub/sub service

publish

subscribe

deliver

Page 14:

Logical dataflow

“Software architecture”

[Figure: Data source, client, Service X, filter, cache; streams a, b, c]

Page 15:

Dataflow is compositional

Components are recursively defined

[Figure: Data source, client, Service X, filter, aggregator]

Page 16:

Dataflow exhibits self-similarity

Page 17:

Dataflow exhibits self-similarity

[Figure: DB, HDFS, Hadoop, Index, Combine, Static HTTP, App1, App2, Buy, Content; streams: user requests, App1 answers, App2 answers]

Page 18:

Physical dataflow

Page 19:

Physical dataflow

[Figure: Data source, client, Service X, filter, aggregator; streams a, b, c]

Page 20:

Physical dataflow

“System architecture”

[Figure: Data source, Service X, filter, aggregator, client]

Page 21:

What could go wrong?

Page 22:

Cross-run nondeterminism

Nondeterministic replays

[Figure: Run 1 through Data source, client, Service X, filter, aggregator; streams a, b, c]

Page 23:

Cross-run nondeterminism

Nondeterministic replays

[Figure: Run 2 through Data source, client, Service X, filter, aggregator; streams a, b, c]
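A minimal Ruby sketch of cross-run nondeterminism, under stated assumptions: `last_value` is a hypothetical order-sensitive component (not from the talk) that keeps only the most recent message it saw. Replaying the same messages in a different delivery order changes its output.

```ruby
# Hypothetical order-sensitive component: remembers only the latest message.
def last_value(stream)
  state = nil
  stream.each { |msg| state = msg }
  state
end

messages = [:a, :b, :c]

run1 = last_value(messages)          # delivery order a, b, c
run2 = last_value(messages.reverse)  # delivery order c, b, a

puts "run 1 output: #{run1}"
puts "run 2 output: #{run2}"
puts "replay is deterministic: #{run1 == run2}"
```

The input contents are identical in both runs; only the interleaving differs, which is enough to make a replay disagree with the original run.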

Page 24:

Cross-instance nondeterminism

Transient replica disagreement

[Figure: Data source, Service X, client]

Page 25:

Divergence

Permanent replica disagreement

[Figure: Data source, Service X, client]

Page 26:

Hazards

[Figure: Data source, client, Service X, filter, aggregator; streams a, b, c]

Order Contents?

Page 27:

Preventing the anomalies

1. Understand component semantics (and disallow certain compositions)

Page 28:

Component properties

• Convergence
  – Component replicas receiving the same messages reach the same state
  – Rules out divergence

Page 29:

Convergence

[Figure: Insert and Read operations on a convergent data structure (e.g., Set CRDT)]

Commutativity: tolerant to reordering
Associativity: tolerant to batching
Idempotence: tolerant to retry/duplication
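As an illustration of these three properties, here is a small Ruby sketch of a grow-only set. The class name `GSet` and its methods are hypothetical and chosen for this example; it is one simple convergent structure, not necessarily the Set CRDT shown on the slide.

```ruby
require 'set'

# Hypothetical grow-only set replica (a simple convergent data structure).
class GSet
  def initialize
    @items = Set.new
  end

  # Insert is commutative and idempotent: set union ignores order and repeats.
  def insert(x)
    @items.add(x)
    self
  end

  # Merging batches is associative: the grouping of unions does not matter.
  def merge(batch)
    @items.merge(batch)
    self
  end

  def read
    @items.to_a.sort
  end
end

r1 = GSet.new
[1, 2, 3].each { |x| r1.insert(x) }        # in-order delivery

r2 = GSet.new
[3, 1, 2, 3, 1].each { |x| r2.insert(x) }  # reordered, with retries

r3 = GSet.new
r3.merge([2, 3]).merge([1])                # batched delivery

puts r1.read == r2.read && r2.read == r3.read
```

All three replicas receive the same message contents and end in the same state, even though delivery was reordered, duplicated, or batched.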

Page 30:

Convergence isn’t compositional

[Figure: Data source, client]

Convergent (identical input contents → identical state)

Page 31:

Component properties

• Convergence
  – Component replicas receiving the same messages reach the same state
  – Rules out divergence

• Confluence
  – Output streams have deterministic contents
  – Rules out all stream anomalies

Confluent ⇒ convergent

Page 32:

Confluence

output set = f(input set)

[Figure: set equality, { } = { }]

Page 33:

Confluence is compositional

output set = (f ∘ g)(input set)
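A brief Ruby sketch of why composing confluent components stays confluent. The functions `f` and `g` here are hypothetical stand-ins (a per-message transform and a filter), each producing an output set that depends only on the input set.

```ruby
require 'set'

# Two hypothetical confluent components: each maps an input set to an output
# set, independent of the order in which the inputs arrive.
def g(stream)
  stream.select(&:even?).to_set     # filter
end

def f(stream)
  stream.map { |x| x * 10 }.to_set  # per-message transform
end

inputs = [4, 1, 2, 3]

# Feed every possible delivery order through the composition f(g(...)).
outputs = inputs.permutation.map { |order| f(g(order)) }.uniq

puts outputs.length                 # number of distinct output sets
puts outputs.first.to_a.sort.inspect
```

Every permutation of the input yields the same output set, so a downstream consumer never observes an order-dependent result.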

Page 34:

Preventing the anomalies

1. Understand component semantics (and disallow certain compositions)
2. Constrain message delivery orders
   1. Ordering

Page 35:

Ordering – global coordination

Deterministic outputs

Order-sensitive
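One way to picture ordering as a delivery mechanism is a sequencer that stamps messages with a total order. The Ruby sketch below rests on assumptions: `sequence` and `apply_in_order` are hypothetical helpers, and the whole stream is available before it is applied.

```ruby
# Hypothetical sequencer: stamps each message with a global sequence number.
def sequence(messages)
  messages.each_with_index.map { |msg, seq| [seq, msg] }
end

# Order-sensitive component: concatenates messages in the order it applies
# them. Sorting by sequence number first makes the result independent of
# arrival order.
def apply_in_order(stamped)
  stamped.sort_by { |seq, _msg| seq }.map { |_seq, msg| msg }.join
end

stamped = sequence(%w[a b c])

replica1 = apply_in_order(stamped)          # arrives in order
replica2 = apply_in_order(stamped.shuffle)  # arrives reordered

puts replica1
puts replica1 == replica2
```

The sketch leaves out the expensive part: a real replica must wait until it knows no earlier sequence number is still in flight, and every producer must go through the sequencer.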

Page 36:

Ordering – global coordination

[Figure: Data source, client]

The first principle of successful scalability is to batter the consistency mechanisms down to a minimum. – James Hamilton

Page 37:

Preventing the anomalies

1. Understand component semantics (and disallow certain compositions)
2. Constrain message delivery orders
   1. Ordering
   2. Barriers and sealing

Page 38:

Barriers – local coordination

Deterministic outputs

[Figure: Data source, client]

Order-sensitive

Page 39:

Barriers – local coordination

[Figure: Data source, client]

Page 40:

Sealing – continuous barriers

Do partitions of (infinite) input streams “end”?

Can components produce deterministic results given “complete” input partitions?

Sealing: partition barriers for infinite streams

Page 41:

Sealing – continuous barriers

Finite partitions of infinite inputs are common…

…in distributed systems
– Sessions
– Transactions
– Epochs / views

…and applications
– Auctions
– Chats
– Shopping carts
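A minimal Ruby sketch of sealing, under stated assumptions: `SealedCounter` is a hypothetical component, messages are tagged `[:data, partition, value]` or `[:seal, partition]`, and a seal is assumed to arrive only after all data for its partition (guaranteeing that is the job of the delivery protocol and is not shown).

```ruby
# Hypothetical sealed aggregator: buffers messages per partition and emits a
# partition's result only once a seal for that partition has arrived.
class SealedCounter
  attr_reader :results

  def initialize
    @buffers = Hash.new { |h, k| h[k] = [] }
    @results = {}
  end

  # msg is [:data, partition, value] or [:seal, partition].
  def deliver(msg)
    kind, partition, value = msg
    if kind == :seal
      # The partition is complete, so its sum no longer depends on the
      # order in which the buffered messages arrived.
      @results[partition] = @buffers.delete(partition).to_a.sum
    else
      @buffers[partition] << value
    end
    self
  end
end

msgs  = [[:data, "cart1", 5], [:data, "cart2", 7], [:data, "cart1", 3]]
seals = [[:seal, "cart1"], [:seal, "cart2"]]

a = SealedCounter.new
(msgs + seals).each { |m| a.deliver(m) }

b = SealedCounter.new
(msgs.reverse + seals.reverse).each { |m| b.deliver(m) }

puts a.results == b.results
```

Each partition (here, a shopping cart) coordinates only with its own producers, so no global order over the whole stream is needed.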

Page 42:

Blazes:

consistency analysis

+ coordination selection

Page 43:

Blazes:

Mode 1: Grey boxes

Page 44:

Grey boxes

Example: pub/sub

x = publish; y = subscribe; z = deliver

[Figure: component with inputs x, y and output z]

Deterministic but unordered

Severity  Label    Confluent  Stateless
1         CR       X          X
2         CW       X
3         OR_gate             X
4         OW_gate

x->z : CW

y->z : CWT

Page 45:

Grey boxes

Example: key/value store

x = put; y = get; z = response

[Figure: component with inputs x, y and output z]

Deterministic but unordered

Severity  Label    Confluent  Stateless
1         CR       X          X
2         CW       X
3         OR_gate             X
4         OW_gate

x->z : OW_key

y->z : ORT

Page 46:

Label propagation – confluent composition

[Figure: dataflow with paths labeled CW and CR]

Deterministic outputs

Page 47:

Label propagation – unsafe composition

[Figure: dataflow with paths labeled OW and CR; an interposition point is marked]

Tainted outputs

Page 48:

Label propagation – sealing

[Figure: dataflow with paths labeled OW_key and CR; streams annotated Seal(key=x)]

Deterministic outputs
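The three label-propagation cases can be mimicked with a heavily simplified Ruby sketch. It does not reproduce the actual Blazes analysis: `Label` and `propagate` are hypothetical, only a single linear path is considered, and a seal is modeled as nothing more than a key name.

```ruby
# Simplified, hypothetical label propagation along one dataflow path.
# CR and CW are confluent; OR and OW are order-sensitive and may carry a
# partition key (the "gate").
Label = Struct.new(:name, :gate) do
  def confluent?
    %w[CR CW].include?(name)
  end
end

# Returns :deterministic or :tainted for a path, given the keys on which the
# input stream is sealed.
def propagate(path, sealed_keys = [])
  path.each do |label|
    next if label.confluent?
    # An order-sensitive component is safe only if its gate is sealed.
    return :tainted unless label.gate && sealed_keys.include?(label.gate)
  end
  :deterministic
end

cw     = Label.new("CW", nil)
cr     = Label.new("CR", nil)
ow     = Label.new("OW", nil)
ow_key = Label.new("OW", :key)

puts propagate([cw, cr])              # confluent composition
puts propagate([ow, cr])              # unsafe composition
puts propagate([ow_key, cr], [:key])  # sealed on the gate key
```

In this toy model, a `:tainted` result marks the path where a coordination mechanism (ordering or sealing) would have to be interposed.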

Page 49:

Blazes:

Mode 2: White boxes

Page 50:

White boxes

    module KVS
      state do
        interface input, :put, [:key, :val]
        interface input, :get, [:ident, :key]
        interface output, :response, [:response_id, :key, :val]
        table :log, [:key, :val]
      end
      bloom do
        # Deferred insert: each put is added to the log at the next timestep.
        log <+ put
        # Deferred delete: drop log entries whose key matches an incoming put.
        log <- (put * log).rights(:key => :key)
        # Join the log with get requests on :key (s = log tuple, l = get tuple).
        response <= (log * get).pairs(:key=>:key) do |s,l|
          [l.ident, s.key, s.val]
        end
      end
    end

put → response: OW_key

get → response: OR_key

Negation (→ order sensitive)

Partitioned by :key
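To see why the put path is order-sensitive but partitioned by key, here is a plain-Ruby approximation (not Bloom). `apply_puts` is a hypothetical helper that models one put per timestep, each overwriting any earlier value under the same key.

```ruby
# Plain-Ruby approximation (not Bloom) of the overwrite behavior: a put
# replaces any earlier value stored under the same key.
def apply_puts(puts_in_order)
  puts_in_order.each_with_object({}) { |(key, val), log| log[key] = val }
end

# Reordering puts to different keys does not change the final state...
a = apply_puts([[:x, 1], [:y, 2]])
b = apply_puts([[:y, 2], [:x, 1]])
puts a == b

# ...but reordering puts to the same key does.
c = apply_puts([[:x, 1], [:x, 2]])
d = apply_puts([[:x, 2], [:x, 1]])
puts c == d
```

Order matters only among puts that share a key, which is why coordination can be scoped to a single key rather than the whole stream.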

Page 51:

White boxes

    module PubSub
      state do
        interface input, :publish, [:key, :val]
        interface input, :subscribe, [:ident, :key]
        interface output, :response, [:response_id, :key, :val]
        table :log, [:key, :val]
        table :sub_log, [:ident, :key]
      end
      bloom do
        # Both tables only grow: nothing is ever deleted or overwritten.
        log <= publish
        sub_log <= subscribe
        # Join publications with subscriptions on :key (s = log, l = sub_log).
        response <= (log * sub_log).pairs(:key=>:key) do |s,l|
          [l.ident, s.key, s.val]
        end
      end
    end

publish → response: CW

subscribe → response: CR
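A plain-Ruby approximation (not Bloom) of the pub/sub join can show why its output is confluent. `responses` is a hypothetical helper that works on complete input collections rather than incremental streams.

```ruby
require 'set'

# Plain-Ruby approximation (not Bloom) of the join: every (publish,
# subscribe) pair with matching keys yields one response.
def responses(publishes, subscribes)
  publishes.flat_map do |key, val|
    subscribes.select { |_ident, sub_key| sub_key == key }
              .map { |ident, _sub_key| [ident, key, val] }
  end.to_set
end

pubs = [[:news, "hello"], [:sport, "goal"]]
subs = [[:alice, :news], [:bob, :news], [:carol, :sport]]

in_order  = responses(pubs, subs)
reordered = responses(pubs.reverse, subs.shuffle)

puts in_order == reordered
puts in_order.size
```

Because both inputs only accumulate and the join has no negation, the set of responses depends on what was published and subscribed, not on when each message arrived.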

Page 52:

The Blazes frame of mind:

• Asynchronous dataflow model
• Focus on consistency of data in motion
  – Component semantics
  – Delivery mechanisms and costs
• Automatic, minimal coordination

Page 53:

Queries?
