Replication and Consistency in Cassandra... What Does it All Mean? (Christopher Bradford, DataStax)

Post on 16-Apr-2017

Christopher Bradford

Replication and consistency in Cassandra... What does it all mean?

Who is this guy?

Christopher Bradford, Solutions Architect with DataStax

Built the world’s smallest C* cluster

Twitter: @bradfordcp

GitHub: bradfordcp

© DataStax, All Rights Reserved. 3

Introduction

CAP Theorem

Pick 2 of the 3

• Consistency
• Availability
• Partition Tolerance

CAP Theorem

Consistency

Every read receives the most recent write or an error

CAP Theorem

Availability

Every request receives a response

CAP Theorem

Partition Tolerance

The system continues to operate despite arbitrary network partitions, that is, messages between nodes being dropped or delayed

CAP Theorem Evolved

The modern CAP goal should be to maximize combinations of consistency and availability that make sense for the specific application. Such an approach incorporates plans for operation during a partition and for recovery afterward, thus helping designers think about CAP beyond its historically perceived limitations.

- Eric Brewer

CAP Theorem

Cassandra’s View

AP – Availability & Partition tolerance above all else.

Replication

Availability & Partition Tolerance

Replication

[Diagram: the client sends a write to the coordinator, which forwards it to three replicas.]

Replication

[Diagram: the client sends a write to the coordinator, which forwards it to three replicas and records a hint ("+1 Hint").]
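The "+1 Hint" on the write slide refers to hinted handoff: when a replica cannot be reached, the coordinator keeps a hint and delivers the write later. The sketch below is a toy model of that idea only. The `Replica` and `Coordinator` classes and their methods are hypothetical names made up for illustration, not Cassandra internals.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


@dataclass
class Coordinator:
    replicas: list
    hints: list = field(default_factory=list)  # entries: (replica name, key, value)

    def write(self, key, value):
        """Send the write to every replica; store a hint for each one that is down."""
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
                acks += 1
            else:
                self.hints.append((replica.name, key, value))
        return acks

    def replay_hints(self):
        """Deliver stored hints to replicas that are back up; keep the rest."""
        by_name = {r.name: r for r in self.replicas}
        remaining = []
        for name, key, value in self.hints:
            target = by_name[name]
            if target.up:
                target.data[key] = value
            else:
                remaining.append((name, key, value))
        self.hints = remaining


replicas = [Replica("r1"), Replica("r2"), Replica("r3", up=False)]
coordinator = Coordinator(replicas)
acks = coordinator.write("talk", "replication")  # r3 is down, so one hint is stored
hint_count = len(coordinator.hints)
replicas[2].up = True
coordinator.replay_hints()  # r3 now receives the missed write
```

In this model the write is acknowledged by the two live replicas, and the third catches up once the hint is replayed.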

Replication

[Diagram: the client sends a read to the coordinator, which requests the data from three replicas.]

Configuring Replication

Replication is defined at the keyspace level.

Strategy: the 'class' entry

Parameters: the remaining entries (here, 'replication_factor')

CREATE KEYSPACE cassandra_summit WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 3 };

Replication Strategies

1. Simple Strategy

2. Network Topology Strategy

Simple Strategy

[Diagram: the client sends a request to the coordinator, which routes it to three replicas.]

Class: SimpleStrategy

Parameters:
• replication_factor
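As a rough illustration of how SimpleStrategy chooses replicas, the sketch below walks a token ring clockwise from the node that owns the partition's token and takes the next distinct nodes until `replication_factor` is reached. The ring layout, token values, and node names are invented for the example, and the function is a simplified model rather than Cassandra's own code.

```python
from bisect import bisect_left


def simple_strategy_replicas(ring, token, replication_factor):
    """Return replica nodes for a partition token.

    ring: list of (token, node) pairs sorted by token. A node owns the
    range ending at its token, so the first replica is the first node
    whose token is >= the partition token (wrapping around the ring).
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    replicas = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


# Hypothetical four-node ring with evenly spaced tokens
ring = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
placement = simple_strategy_replicas(ring, 30, 3)
wrapped = simple_strategy_replicas(ring, 80, 3)
```

Note that this placement ignores racks and data centers entirely, which is the main difference from the Network Topology Strategy.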

Network Topology Strategy

[Diagram: the client sends a request to the coordinator, which routes it to three replicas.]

Class: NetworkTopologyStrategy

Parameters:
• dc_name: replication_factor
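With NetworkTopologyStrategy the replication map carries one entry per data center instead of a single `replication_factor`. The helper below is a small sketch that renders such a statement as a string. The function name and the data center names `dc1` and `dc2` are assumptions for illustration; real data center names must match what the cluster's snitch reports.

```python
def create_keyspace_cql(name, dc_replication):
    """Render a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    dc_replication maps a data center name to its replication factor.
    """
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in dc_replication.items()]
    return f"CREATE KEYSPACE {name} WITH REPLICATION = {{ {', '.join(parts)} }};"


# Hypothetical data center names
statement = create_keyspace_cql("cassandra_summit", {"dc1": 3, "dc2": 2})
print(statement)
```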

Network Topology Strategy

[Diagram: the client sends a request to the coordinator, which routes it to replicas placed in Rack 1 and Rack 2.]

Network Topology Strategy

[Diagram: the client sends a request to the coordinator, which routes it to replicas spread across Rack 1, Rack 2 and Rack 3.]

Tools:
• nodetool status ks
• nodetool getendpoints ks table val
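The rack diagrams suggest that replicas within a data center are spread across racks where possible. The sketch below models that preference in a simplified way: walk the ring, take the first node seen on each new rack, and fall back to skipped nodes only when there are fewer racks than replicas. The node and rack names are made up, and this is an approximation of the idea, not the exact NetworkTopologyStrategy algorithm.

```python
def rack_aware_replicas(ring, start, replication_factor):
    """Pick replicas in ring order, preferring nodes on racks not used yet.

    ring: list of (node, rack) pairs in ring order for one data center.
    start: index of the first candidate node.
    """
    chosen, seen_racks, skipped = [], set(), []
    for i in range(len(ring)):
        node, rack = ring[(start + i) % len(ring)]
        if rack in seen_racks:
            skipped.append(node)  # rack already holds a replica; keep as fallback
        else:
            chosen.append(node)
            seen_racks.add(rack)
        if len(chosen) == replication_factor:
            return chosen
    # Fewer racks than replicas: fill the remainder from skipped nodes
    return chosen + skipped[: replication_factor - len(chosen)]


three_racks = [("n1", "rack1"), ("n2", "rack1"), ("n3", "rack2"), ("n4", "rack3")]
two_racks = [("n1", "rack1"), ("n2", "rack1"), ("n3", "rack2")]
spread = rack_aware_replicas(three_racks, 0, 3)
fallback = rack_aware_replicas(two_racks, 0, 3)
```

On a real cluster, `nodetool getendpoints` reports the actual replicas for a given key, which is the way to confirm placement.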

Consistency

Balancing performance and correctness

Tunable Consistency

Consistency Levels

Consistent Reads

• ALL
• QUORUM
• LOCAL_QUORUM
• ONE
• LOCAL_ONE
• SERIAL

Consistent Writes

• ALL
• QUORUM
• LOCAL_QUORUM
• ONE
• LOCAL_ONE
• ANY
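To make the read and write levels concrete, the sketch below computes how many replicas must respond at each level and checks the common rule of thumb that a read is guaranteed to overlap the latest write when R + W > RF. It assumes a single data center, so LOCAL_QUORUM and LOCAL_ONE behave like QUORUM and ONE, and it leaves out SERIAL and ANY, which have special semantics. The function names are illustrative.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_replicas(level, replication_factor):
    """Replicas that must respond, assuming a single data center."""
    table = {
        "ALL": replication_factor,
        "QUORUM": quorum(replication_factor),
        "LOCAL_QUORUM": quorum(replication_factor),  # same as QUORUM with one DC
        "ONE": 1,
        "LOCAL_ONE": 1,
    }
    return table[level]


def reads_see_latest_write(read_level, write_level, replication_factor):
    """True when read and write replica sets must overlap (R + W > RF)."""
    r = required_replicas(read_level, replication_factor)
    w = required_replicas(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, QUORUM reads plus QUORUM writes overlap (2 + 2 > 3), while ONE plus ONE does not (1 + 1 is not greater than 3), which is the performance versus correctness trade-off the section title describes.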

Consistency Failures

What happens when the desired consistency level cannot be achieved?

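One way to picture the question above: if the coordinator knows that fewer replicas are alive than the consistency level requires, the request fails instead of returning a weaker result. The sketch below models that check. `UnavailableError`, `check_availability`, and `try_request` are hypothetical names for this example and do not correspond to a specific driver API.

```python
class UnavailableError(Exception):
    """Raised when too few replicas are alive for the requested level."""

    def __init__(self, required, alive):
        super().__init__(f"need {required} replicas, only {alive} alive")
        self.required = required
        self.alive = alive


def check_availability(alive_replicas, required):
    """Fail fast when the consistency level cannot be met."""
    if alive_replicas < required:
        raise UnavailableError(required, alive_replicas)
    return True


def try_request(alive_replicas, required):
    """Return a short status string; the caller decides how to react."""
    try:
        check_availability(alive_replicas, required)
        return "ok"
    except UnavailableError as exc:
        return f"unavailable: {exc}"


healthy = try_request(2, 2)   # QUORUM with RF 3, two replicas alive
degraded = try_request(1, 2)  # QUORUM with RF 3, one replica alive
```

What the application does next (retry, lower the consistency level, or surface the error) is a design decision for the application, not something this sketch prescribes.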

Failure Recovery

Staying Consistent

In the event of a failure, how do replicas get the latest data?

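The slide only poses the question. Besides the hints shown on the write slide, Cassandra can also reconcile replicas at read time (read repair) and through explicit repair runs. The sketch below illustrates the read repair idea under simplifying assumptions: each replica is a plain dict mapping a key to a `(timestamp, value)` pair, the newest timestamp wins, and stale or missing copies are overwritten. The function name and data layout are invented for the example.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and bring every replica up to date.

    replicas: list of dicts mapping key -> (timestamp, value).
    """
    responses = [r[key] for r in replicas if key in r]
    if not responses:
        return None
    newest = max(responses, key=lambda cell: cell[0])  # last write wins
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[1]


# One replica is current, one is stale, one missed the write entirely
replicas = [{"k": (2, "new")}, {"k": (1, "old")}, {}]
value = read_with_repair(replicas, "k")
```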

Conclusion

Questions?