Understanding, Choosing & Instrumenting NOSQL

Post on 20-Dec-2014

Understanding, Choosing & Instrumenting NOSQL

A presentation by @timanglade, from your friends at @cloudant

@timanglade

/taɪn/ A database engine so tiny, its name had to be shortened

Tin

NOSQLTAPES.com
@NOSQLTAPES

You have a scaling problem

So you decide to give NOSQL a try

Now you have TWO problems

That’s the reality of NOSQL today

So why bother? Oh, don’t worry, I’m going to tell you…

Why & how do I NOSQL?

Understanding, Choosing & Instrumenting NOSQL

UNDERSTANDING your problem & NOSQL

NOSQL is about two things

1. Performance
2. Distribution

N.B. To varying degrees in each NOSQL project…

Performance?

Performance, or rather Performance/$

Look at your RDBMS first…

A-B-C

Tip #1. Cost: close to $0…

A: Always
B: Be
C: CACHING

Always Be CACHING

If you don’t, I’ve no sympathy for you, pal. Also, always be rewatching Glengarry Glen Ross. Great movie. Fantastic cast.
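A minimal sketch of what “always be caching” can look like in front of an RDBMS, assuming a simple cache-aside pattern with a time-to-live. `TTLCache` and `slow_user_lookup` are hypothetical names made up for this illustration; in practice a shared cache server would usually replace the in-process dictionary.

```python
import time


class TTLCache:
    """Tiny cache-aside helper: serve from memory, fall back to the database."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock      # injectable so tests can fake time
        self._entries = {}      # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                 # cache hit: no database call
        value = loader(key)                 # miss or expired: ask the database
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


db_calls = []


def slow_user_lookup(user_id):
    # Hypothetical stand-in for an expensive RDBMS query.
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load(42, slow_user_lookup)
second = cache.get_or_load(42, slow_user_lookup)   # served from the cache
print(len(db_calls), "database call(s) for 2 reads")
```

The TTL is the knob that trades freshness for load: a longer TTL means fewer database hits and staler reads.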

INDEX. For God’s sake, INDEX

Tip #2. Still for close to $0…
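To see what an index buys you, here is a small sketch using SQLite from the Python standard library. The `users` table, the `idx_users_email` index and the query are made up for the illustration, and the exact wording of the plan varies between SQLite versions, so the code only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

QUERY = "SELECT name FROM users WHERE email = ?"


def query_plan(connection):
    """Return SQLite's plan for QUERY as one string."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("user500@example.com",)
    ).fetchall()
    return "; ".join(row[-1] for row in rows)   # last column is the description


plan_before = query_plan(conn)   # no index yet: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn)    # the planner can now search the index

print("before:", plan_before)
print("after: ", plan_after)
```

Most RDBMSs offer an equivalent of `EXPLAIN`; checking the plan of your slowest queries costs next to nothing.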

If you know SQL, you can NOSQL

Tip #3. Did you go to school? Then yes, still close to zero

Hire a database consultant

Tip #4. For a Fistful of Dollars…

Buy a bigger box

Tip #5. For a Few Dollars More…

5.1 TB, 1.2 MIOPS, 100 K dollars

A word about “the Cloud”

Tip #6. Say, that’s a really nice shirt you’re wearing there. I’m sure we could come to an arrangement…

You have a scaling problem, so you decide to give the Cloud a try

Now you have X problems, where X is arbitrarily large

The greatest sandbox ever made

The BEST bootcamp ever designed

BTW, these tips apply to NOSQL too…

Did all that? Don’t have the $$$?

Welcome to NOSQL

NOSQL lets you use skills instead of $$$. But do you have the skills? Can you get them?

Distribution

What if you can’t buy a box any bigger?

Network partitions HAPPEN

Brownouts HAPPEN

Distribution is a (mostly) efficient way to add more capacity & availability to your DB

A word about master-slave replication

A word about the CAP theorem

CHOOSING a NOSQL database

How do I choose?

Same 2 parameters: Distribution + “Performance”

What does “Performance” mean?

Performance: Data / Query Model + Disk Structure

the Moon Methodology™

1. Distribution
2. Data / Query Model
3. Disk Structure

Distribution: Dynamo-style, master-slave, master-master

Data / Query Model: Map/Reduce everywhere?
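For anyone who has not met Map/Reduce yet, a toy single-process sketch of the idea follows. The sample documents and the function names (`map_doc`, `reduce_values`, `map_reduce`) are invented for this illustration and do not mirror the API of any particular database.

```python
from collections import defaultdict

docs = [
    {"type": "post", "words": 120},
    {"type": "comment", "words": 15},
    {"type": "post", "words": 80},
    {"type": "comment", "words": 5},
]


def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    yield doc["type"], doc["words"]


def reduce_values(key, values):
    """Reduce step: fold all values emitted for one key."""
    return sum(values)


def map_reduce(documents, mapper, reducer):
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            grouped[key].append(value)   # the "shuffle": group values by key
    return {key: reducer(key, values) for key, values in grouped.items()}


words_by_type = map_reduce(docs, map_doc, reduce_values)
print(words_by_type)
```

Because the map step looks at one document at a time, the work can be spread across nodes, which is a large part of why the model shows up in so many distributed databases.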

Disk Structures: the devil is in the (implementation) details

Now let’s look at major NOSQL DBs through this lens…

CouchDB
1. Distribution: Master-master
2. Data / Query Model: Doc + Persistent M/R
3. Disk Structure: Append-only B+ Tree

Ideal scenario: small scale, queries don’t change, HTTP is a must

Couchbase 2.0
1. Distribution: Master-slave
2. Data / Query Model: K/V + Persistent M/R
3. Disk Structure: Append-only B+ Tree

Ideal scenario: ?

BigCouch
1. Distribution: Dynamo
2. Data / Query Model: Doc + Persistent M/R
3. Disk Structure: Append-only B+ Tree

Ideal scenario: same as CouchDB, bigger scale

Cassandra
1. Distribution: Dynamo
2. Data / Query Model: Column Families
3. Disk Structure: Log + SSTable

Ideal scenario: fast writes + don’t mind hacking

Riak
1. Distribution: Multi-DC Dynamo
2. Data / Query Model: K/V + M/R + 2ary
3. Disk Structure: Log-Struct. Hash Table (& others)

Ideal scenario: knowledgeable team, really large scale

MongoDB
1. Distribution: Master-slave
2. Data / Query Model: Docs + M/R + 2ary
3. Disk Structure: Log + B-Tree

Ideal scenario: prototyping

Redis
1. Distribution: Master-slave (?)
2. Data / Query Model: Many
3. Disk Structure: Log + Many (?)

Ideal scenario: application “glue”

Please don’t sniff your Redis

A plea for mercy

Neo4j
1. Distribution: Master-slave (~)
2. Data / Query Model: OO + REST
3. Disk Structure: Custom Graph Struct.

Ideal scenario: lots of self-joins

INSTRUMENTING your stack

You WILL FAIL

Oh yes, yes you will…

Any advanced distributed system will behave like a BLACK BOX

Testing is fine, but measuring is more useful in this context

Percentile response times

Some stuff to keep an eye on
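Why percentiles rather than averages? A short sketch, assuming the nearest-rank definition of a percentile (other definitions interpolate between samples). The latency numbers are invented for the illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)   # 1-based rank
    return ordered[rank - 1]


# Invented response times in milliseconds; one request hit a slow path.
latencies_ms = [12, 15, 11, 14, 13, 16, 12, 950, 14, 13]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"mean={mean_ms} ms, p50={p50} ms, p99={p99} ms")
```

The mean describes a request that never happened, while the median shows the typical case and the high percentile shows what the unluckiest users actually experienced.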

ERROR RATES

Some stuff to keep an eye on
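One simple way to track an error rate is over a sliding window of the most recent requests. The sketch below assumes that HTTP status codes of 500 and above count as errors; `ErrorRateWindow` is a hypothetical helper written for this illustration.

```python
from collections import deque


class ErrorRateWindow:
    """Error rate over the last `size` requests."""

    def __init__(self, size):
        self._outcomes = deque(maxlen=size)   # True means the request failed

    def record(self, failed):
        self._outcomes.append(bool(failed))   # oldest outcome falls off when full

    def rate(self):
        if not self._outcomes:
            return 0.0
        return sum(self._outcomes) / len(self._outcomes)


window = ErrorRateWindow(size=100)
for status in [200, 200, 500, 200, 503, 200, 200, 200, 200, 200]:
    window.record(status >= 500)   # assumption: 5xx responses are errors

current_rate = window.rate()
print(f"error rate over the window: {current_rate:.0%}")
```

A windowed rate reacts to a fresh burst of failures, where a lifetime average would bury it under months of healthy traffic.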

Memory usage & stack depth

Some stuff to keep an eye on

CPU usage & number of processes

Some stuff to keep an eye on

Disk usage & IOPS

Some stuff to keep an eye on
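As a starting point for the disk side, here is a sketch that takes one timestamped disk-usage sample with the Python standard library. It covers disk usage only (IOPS needs OS-specific tooling), and the `disk_sample` helper and its field names are assumptions made for this illustration.

```python
import shutil
import time


def disk_sample(path="."):
    """Take one timestamped disk-usage sample for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (in bytes)
    return {
        "timestamp": time.time(),
        "path": path,
        "used_percent": 100 * usage.used / usage.total,
        "free_bytes": usage.free,
    }


sample = disk_sample()
print(f"{sample['used_percent']:.1f}% used, {sample['free_bytes']} bytes free")
```

Collected on a schedule and shipped to whatever monitoring system you use, samples like this become the long-running graphs worth watching.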

Hawk the graphs over LONG periods

Instrumentation & metrology are still dark arts

Next Steps

Coda Hale’s Metrics Everywhere
pivotallabs.com/talks/139-metrics-metrics-everywhere

Find a monitoring system that works for you

The NOSQL Handbook
nosqlhandbook.com

Recap
1. NOSQL is hard
2. Know your RDBMS, know your problem
3. Pick a DB by distribution, query & disk models
4. Instrument the heck out of it
5. Rinse
6. Repeat

Go. Experiment. Deploy. Measure. Improve. Have fun.

cloudant.com
tim@cloudant.com

@timanglade

?