Distributed Systems + NodeJS
Bruno Bossola
MILAN 25-26 NOVEMBER 2016
@bbossola
Whoami
● Developer since 1988
● XP Coach 2000+
● Co-founder of JUG Torino
● Java Champion since 2005
● CTO @ EF (Education First)
I live in London, love the weather...
Agenda
● Distributed programming
● How does it work, what does it mean
● The CAP theorem
● CAP explained with code
– CA system using two phase commit
– AP system using sloppy quorums
– CP system using majority quorums
● What next?
● Q&A
Distributed programming
● Do we need it?
Distributed programming
● Any system should deal with two tasks:
– Storage
– Computation
● How do we deal with scale?
● How do we use multiple computers to do what we used to do on one?
What do we want to achieve?
● Scalability
● Availability
● Consistency
Scalability
● The ability of a system/network/process to:
– handle a growing amount of work
– be enlarged to accommodate new growth
A scalable system continues to meet the needs of its users as the scale increases
Scalability flavours
● size:
– more nodes, more speed
– more nodes, more space
– more data, same latency
● geographic:
– more data centers, quicker response
● administrative:
– more machines, no additional work
How do we scale? partitioning
● Slice the dataset into smaller independent sets
● reduces the impact of dataset growth
– improves performance by limiting the amount of data to be examined
– improves availability by allowing partitions to fail independently
How do we scale? partitioning
● But can also be a source of problems
– what happens if a partition becomes unavailable?
– what if it becomes slower?
– what if it becomes unresponsive?
How do we scale? replication
● Copies of the same data on multiple machines
● Benefits:
– allows more servers to take part in the computation
– improves performance by adding computing power and bandwidth
– improves availability by creating copies of the data
How do we scale? replication
● But it's also a source of problems
– there are now independent copies of the data
– they need to be kept in sync on multiple machines
● Your system must follow a consistency model
Availability
● The proportion of time a system is in functioning conditions
● The system is fault-tolerant
– the ability of your system to behave in a well-defined manner once a fault occurs
● All clients can always read and write
– In distributed systems this is achieved by redundancy
Introducing: performance
● The amount of useful work accomplished compared to the time and resources used
● Basically:
– short response time for a unit of work
– high rate of processing
– low utilization of resources
Introducing: latency
● The period between the initiation of something and its occurrence
● The time between when something happens and when it has an impact or becomes visible
● Some higher-level examples:
– how long until you become a zombie after a bite?
– how long until my post is visible to others?
Consistency
● Any read on a data item X returns a value corresponding to the result of the most recent write on X
● Each client always has the same view of the data
● Also known as “Strong Consistency”
Consistency flavours
● Strong consistency
– every replica sees every update in the same order.
– no two replicas may have different values at the same time.
● Weak consistency
– every replica will see every update, but possibly in different orders.
● Eventual consistency
– every replica will eventually see every update and will eventually agree on all values.
The CAP theorem
CONSISTENCY – AVAILABILITY – PARTITION TOLERANCE
The CAP theorem
● You cannot have all three :(
● You can only select two properties at once
Sorry, this has been mathematically proven and no, it has not been debunked.
The CAP theorem
CA systems!
● You selected consistency and availability!
● Strict quorum protocols (two/multi-phase commit)
● Most RDBMS
Hey! A network partition will f**k you up good!
The CAP theorem
AP systems!
● You selected availability and partition tolerance!
● Sloppy quorums and conflict-resolution protocols
● Amazon Dynamo, Riak, Cassandra
The CAP theorem
CP systems!
● You selected consistency and partition tolerance!
● Majority quorum protocols (Paxos, Raft, ZAB)
● Apache ZooKeeper, Google Spanner
NodeJS time!
● Let's write our brand new key value store
● We will code all three different flavours
● We will have many nodes, fully replicated
● No sharding
● We will kill servers!
● We will trigger network partitions!
– (no worries, it's a simulation!)
General design
● Each node app is layered:
– a <proto> API exposing GET(k) and SET(k,v)
– a <proto> core implementing the protocol logic (fX, fY, fZ, fK)
– a Storage API backed by the database
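The GET(k) / SET(k,v) storage layer every protocol wraps can be sketched as a minimal in-memory store (the names here are illustrative; the talk's real code lives in the sysdist repo linked at the end):

```javascript
// Minimal in-memory key-value storage: the building block that each
// protocol layer (2PC, quorum, raft) wraps. Names are illustrative.
class Storage {
  constructor() {
    this.db = new Map();
  }
  // GET(k): return the stored value, or undefined if absent
  get(key) {
    return this.db.get(key);
  }
  // SET(k, v): store the value, return the previous one
  set(key, value) {
    const previous = this.db.get(key);
    this.db.set(key, value);
    return previous;
  }
}
```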
CA key-value store
● Uses classic two-phase commit
● Works like a local system
● Not partition tolerant
CA: two-phase commit, simplified
● Node app layers:
– a 2PC API exposing GET(k) and SET(k,v)
– a 2PC core implementing propose(tx), commit(tx), rollback(tx)
– a Storage API backed by the database
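A CA write through two-phase commit can be sketched like this (an illustrative in-memory simulation, not the talk's actual code): a write succeeds only if every replica accepts the proposal, otherwise everything is rolled back.

```javascript
// Illustrative 2PC: phase 1 proposes the transaction to all replicas,
// phase 2 commits only if every single replica acknowledged.
class Replica {
  constructor() {
    this.alive = true;        // flip to false to simulate a partition
    this.db = new Map();
    this.pending = null;
  }
  propose(tx) {
    if (!this.alive) return false;
    this.pending = tx;
    return true;
  }
  commit() {
    this.db.set(this.pending.key, this.pending.value);
    this.pending = null;
  }
  rollback() {
    this.pending = null;
  }
}

function set(replicas, key, value) {
  const tx = { key, value };
  // phase 1: every single replica must accept the proposal
  const acks = replicas.map(r => r.propose(tx));
  if (acks.every(ok => ok)) {
    // phase 2: commit everywhere
    replicas.forEach(r => r.commit());
    return true;
  }
  // abort: roll back only the replicas that had accepted
  replicas.filter((r, i) => acks[i]).forEach(r => r.rollback());
  return false;
}
```

One unreachable replica is enough to make every write fail: the system stays consistent but gives up availability, which is exactly the CA trade-off.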
AP key-value store
● Eventually consistent design
● Prioritizes availability over consistency
AP: sloppy quorums, simplified
● Node app layers:
– a QUORUM API exposing GET(k) and SET(k,v)
– a QUORUM core implementing propose(tx), commit(tx), rollback(tx) plus read repair
– a Storage API backed by the database
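A sloppy-quorum write/read pair with read repair can be sketched like this (illustrative; real Dynamo-style stores track causality with vector clocks, here a plain version counter stands in for them):

```javascript
// Illustrative sloppy quorum: a write is accepted once any W replicas
// took it; a read asks some replicas, keeps the record with the
// highest version, and repairs any stale copy it contacted.
class Replica {
  constructor() {
    this.db = new Map();
  }
  get(key) { return this.db.get(key); }
  put(key, record) { this.db.set(key, record); }
}

function write(replicas, key, value, version, W = 2) {
  let acks = 0;
  for (const r of replicas) {
    r.put(key, { value, version });
    if (++acks >= W) break;   // sloppy: stop as soon as W replicas have it
  }
  return acks >= W;
}

function read(replicas, key) {
  // assumes at least one contacted replica holds the key
  const records = replicas.map(r => r.get(key)).filter(Boolean);
  const newest = records.reduce((a, b) => (a.version >= b.version ? a : b));
  // read repair: push the newest record to the stale replicas we saw
  for (const r of replicas) {
    const rec = r.get(key);
    if (!rec || rec.version < newest.version) r.put(key, newest);
  }
  return newest.value;
}
```

Writes stay available even when some replicas are unreachable; the price is that two reads can briefly disagree until repair catches up.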
CP key-value store
● Uses majority quorum (raft)
● Guarantees strong consistency, giving up availability during partitions
CP: majority quorums (raft, simplified)
● Node app layers:
– a RAFT API exposing GET(k) and SET(k,v)
– a RAFT core exchanging beat, voteme and history messages
– a Storage API backed by the database
● Urgently needs refactoring!!!!
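The reason a CP system survives partitions is the majority rule: no leader is elected (no "voteme" round succeeds) and nothing commits without a strict majority of the cluster. A minimal sketch of that rule (illustrative, not the talk's raft code):

```javascript
// Majority rule behind raft-style protocols: two sides of a partition
// can never both reach a strict majority, so at most one side makes
// progress and consistency is preserved.
function majority(clusterSize) {
  return Math.floor(clusterSize / 2) + 1;
}

// A candidate wins an election with its own vote plus the votes
// granted by the live peers it can reach.
function winsElection(livePeerVotes, clusterSize) {
  return 1 + livePeerVotes >= majority(clusterSize);
}
```

With 5 nodes, the side of a partition holding 3 nodes keeps working while the 2-node side refuses writes: consistent, but not fully available.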
What about BASE?
● It's just a way to qualify eventually consistent systems
● Basically Available
– The database appears to work most of the time.
● Soft state
– Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time.
● Eventual consistency
– Stores exhibit consistency at some later point (e.g., lazily at read time).
What about Lamport clocks?
● It's a mechanism to maintain a distributed notion of time
● Each process maintains a counter
– Whenever a process does work, increment the counter
– Whenever a process sends a message, include the counter
– When a message is received, set the counter to max(local_counter, received_counter) + 1
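The three rules above fit in a tiny class (an illustrative sketch):

```javascript
// Lamport clock: a per-process counter that orders events causally.
class LamportClock {
  constructor() {
    this.counter = 0;
  }
  // rule 1: whenever the process does work, increment the counter
  tick() { this.counter++; }
  // rule 2: whenever the process sends a message, include the counter
  send() { return this.counter; }
  // rule 3: on receive, set the counter to max(local, received) + 1
  receive(received) {
    this.counter = Math.max(this.counter, received) + 1;
  }
}
```

The +1 on receive guarantees that receiving a message is always ordered after sending it, even if the receiver's local counter was behind.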
What about Vector clocks?
● Maintains an array of N Lamport clocks, one per node
● Whenever a process does work, increment the logical clock value of the node in the vector
● Whenever a process sends a message, include the full vector
● When a message is received:
– update each element to max(local, received)
– increment the logical clock of the current node in the vector
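Those rules, sketched for an N-node cluster (illustrative):

```javascript
// Vector clock: each node owns one slot in a vector of N counters,
// so concurrent updates can be detected, not just ordered.
class VectorClock {
  constructor(n, id) {
    this.id = id;                   // this node's slot in the vector
    this.v = new Array(n).fill(0);
  }
  // local work: increment our own slot
  tick() { this.v[this.id]++; }
  // sending: include the full vector in the message
  send() { return this.v.slice(); }
  // receiving: element-wise max, then increment our own slot
  receive(received) {
    this.v = this.v.map((local, i) => Math.max(local, received[i]));
    this.v[this.id]++;
  }
}
```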
What next?
● Learn the lingo and the basics
● Do your homework
● Start playing with these concepts
● It's complicated, but not rocket science
● Be inspired!
Q&A
Amazon Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
The RAFT consensus algorithm: https://raft.github.io/ and http://thesecretlivesofdata.com/raft/
The code used in this presentation: https://github.com/bbossola/sysdist