NoSQL like there is No Tomorrow

Post on 12-Jan-2015

description

The AWS NoSQL team shares the design philosophy behind DynamoDB and lessons learned in building for massive scale.

transcript

@ksshams@swami_79

NoSQL like there is No Tomorrow

Khawaja
Engineering Lead for NoSQL, AWS

@ksshams

NASA JPL has visited every planet in the solar system ... except Pluto

25 Gbps!

let’s start with a trilogy …

once upon a time...

(in 2000)

episode 1

a half mile away... (Seattle)

amazon.com, a rapidly growing Internet-based retail business, relied on relational databases

we had 1000s of independent services

each service managed its own state in Relational Databases

Relational Databases are pretty seductive

first of all... SQL!!

so it is easier to query..

easier to learn

They are as versatile as a Swiss Army knife

• complex queries
• key-value access
• transactions
• analytics

Relational Databases are very similar to Swiss Army Knives

sometimes.. Swiss Army knives.. can be more than what you bargained for

partitioning: easy

re-partitioning: HARD..

so we bought bigger boxes...

Q4 was hard work at Amazon

• benchmark new hardware
• migrate to new hardware
• repartition databases
• pray ...

Relational Databases have availability challenges..

then.. (in 2005)

episode 2

amazon dynamo
predecessor to dynamoDB

specialist tool:
• limited querying capabilities
• simpler consistency

• replicated DHT with consistent hashing
• optimistic replication
• “sloppy quorum”
• anti-entropy mechanism
• object versioning
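The consistent-hashing idea named on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Dynamo's actual implementation: the `HashRing` class, the `box-*` node names, the `cart-*` keys, the MD5-based placement, and the choice of 64 virtual nodes are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # assumed number of ring points per node
        self._ring = []       # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _position(label):
        # Map any string to a point on the ring.
        return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each node owns several points so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._position(f"{node}#{i}"), node))

    def preference_list(self, key, n=3):
        """Walk clockwise from the key and collect up to n distinct nodes."""
        distinct = {node for _, node in self._ring}
        n = min(n, len(distinct))
        i = bisect.bisect_right(self._ring, (self._position(key), ""))
        owners = []
        while len(owners) < n:
            node = self._ring[i % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            i += 1
        return owners


# Hypothetical usage: add one box and see which keys change owner.
ring = HashRing(["box-a", "box-b", "box-c"])
keys = [f"cart-{i}" for i in range(1000)]
before = {k: ring.preference_list(k, n=1)[0] for k in keys}
ring.add_node("box-d")
after = {k: ring.preference_list(k, n=1)[0] for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

Only keys that now fall on the new box change owner; every other key stays where it was, which is why adding capacity does not force a full repartition.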

dynamo had many benefits

• higher availability
  • we traded consistency for it
• incremental scalability
  • no more repartitioning
  • no need to architect apps for peak
  • just add boxes
• simpler querying model ==>> predictable performance

but dynamo was not perfect...

lacked strong consistency

but dynamo was not perfect...

scaling was easier, but...

but dynamo was not perfect...

steep learning curve

but dynamo was not perfect...

dynamo was a library ==>> not a service...

then.. (in 2012)

episode 3

ADMIN

DynamoDB

Managed NoSQL Database

Fast & Predictable Performance

Built for Scale

“Even though we have years of experience with large, complex NoSQL architectures, we are happy to be finally out of the business of managing it ourselves.” - Don MacAskill, CEO

DynamoDB

DynamoDB Goals and Philosophies

durability and availability

scale is our problem

easy to use

scale in rps

consistent and low latencies

durability is key…

availability is key…

scale is our problem, not yours..

Fault Tolerant Design

Infrastructure Fails - deal with it!

Planning for failures is not easy

How do you ensure your recovery strategies work correctly?

Byzantine Generals Problem

A simple 2-way replication system of a traditional database…

Primary Standby

Writes

P: “S is dead, need to trigger new replica”

S: “P is dead, need to promote myself”

Improved Replication: Quorum

[Diagram: Writes, three Replicas]

Quorum: Successful write on a majority
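The quorum rule on this slide (a write succeeds once a majority of replicas accept it) can be sketched as follows. This is a minimal in-memory illustration, not how DynamoDB is implemented; the `Replica` class, the `quorum_write` function, and the use of `ConnectionError` to model an unreachable replica are assumptions made for the example.

```python
class Replica:
    """Hypothetical in-memory replica used only for illustration."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.store = {}

    def write(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        self.store[key] = value


def quorum_write(replicas, key, value):
    """Return True if a strict majority of replicas acknowledged the write."""
    needed = len(replicas) // 2 + 1
    acks = 0
    for replica in replicas:
        try:
            replica.write(key, value)
            acks += 1
        except ConnectionError:
            pass  # an unreachable replica does not count toward the quorum
    return acks >= needed


# One of three replicas down: the write still reaches a majority.
one_down = [Replica("r1"), Replica("r2"), Replica("r3", alive=False)]
print(quorum_write(one_down, "order-42", "shipped"))

# Two of three replicas down: no majority, so the write is not acknowledged.
two_down = [Replica("r1"), Replica("r2", alive=False), Replica("r3", alive=False)]
print(quorum_write(two_down, "order-42", "shipped"))
```

Note that in the second case the value still landed on one replica even though the write as a whole failed. Cleaning up that kind of partial result is part of what makes the real problem harder than this sketch suggests.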

Easy?

[Diagram: Replicas A through F; writes from client X; reads and writes from client Y; a new member in the group]

Should I continue to serve reads? Should I start a new quorum?

Classic Split Brain Issue in Replicated systems leading to lost writes!

Building correct distributed systems is not straightforward..

Handle partial failures of replicas

Handle replica failures

Ensure there isn’t a parallel quorum

Handle concurrent failures
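The "parallel quorum" item in the list above can be made concrete with a small brute-force check. This is only a sketch under assumptions: the function names and the replica membership sets are hypothetical and are not taken from the talk's diagram. It shows that two majorities drawn from the same membership view always share a replica, while two groups working from different views can each form a majority with no replica in common.

```python
from itertools import combinations


def majorities(members):
    """Enumerate every subset of members that forms a strict majority."""
    ordered = sorted(members)
    need = len(ordered) // 2 + 1
    return [
        set(combo)
        for size in range(need, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]


def parallel_quorum_possible(view_a, view_b):
    """True if the two views allow majorities that share no replica."""
    return any(
        qa.isdisjoint(qb)
        for qa in majorities(view_a)
        for qb in majorities(view_b)
    )


# Same membership view on both sides: any two majorities overlap.
same_view = {"A", "B", "C"}
print(parallel_quorum_possible(same_view, same_view))

# Divergent views (hypothetical): each side can find its own majority,
# e.g. {A, B} and {D, E}, so writes accepted on one side may be lost.
print(parallel_quorum_possible({"A", "B", "C"}, {"C", "D", "E"}))
```

The overlap guarantee depends on every participant agreeing on who the members are, which is why membership changes are where split-brain bugs tend to appear.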

Trends in the World Of Databases

A decade ago, it was all about the DBAs

The last 5 years have been about self-service.

Today is about managed services.

Plan for Success … Plan for Scale
