Scale Your Database And Be Happy

Sergio Bossa

@sbtourist

Spring Framework Italian Meeting 2009

Sergio Bossa - http://www.linkedin.com/in/sergiob

About Me

➔ Software architect and engineer
  ➔ Gioco Digitale (online gambling and casinos)
➔ Open Source enthusiast
  ➔ Terracotta Messaging (http://forge.terracotta.org)
  ➔ Actorom (http://code.google.com/p/actorom/)
  ➔ Terrastore (coming soon…)
➔ (Micro-)Blogger
  ➔ http://twitter.com/sbtourist
  ➔ http://sbtourist.blogspot.com


Premise #1

Database ≠ Relational Database


Premise #2

Relational Databases Are Not Dead


Premise #3

You'll never hear the word NoSQL here


Scaling Your Database … what?

● Scaling used as a loose term here.
● Scale to handle heterogeneous data.
● Scale to handle more data.
● Scale to handle more load.
● Scale to handle topology changes due to:
  ● Unplanned growth.
  ● Unpredictable failures.


Scaling Your Database … why?

● Scaling the way you handle your data is going to be more and more important.
  ● Business is moving toward data-centric applications.
  ● Let's call them “social”.
● Interest is toward efficient ways of:
  ● Storing …
  ● Serving …
  ● Analyzing …
  ● Data!


Scaling Your Relational Database


Replication

● Master - Slave replication.
  ● One (and only one) master database.
  ● One or more slaves.
  ● All writes go to the master.
    ● Replicated to slaves.
  ● Reads are balanced among master and slaves.
  ● Major issues:
    ● Single point of failure.
    ● Single point of bottleneck.
    ● Static topology.


Replication

● Master - Master replication.
  ● One or more masters.
  ● Writes and reads can go to any master node.
  ● Writes are replicated among masters.
  ● Major issues:
    ● Limited performance and scalability (due to quorum).
    ● Complexity.
    ● Static topology.


Partitioning

● Vertical partitioning.
  ● Put tables belonging to different functional areas on different database nodes.
  ● Scale your data and load by function.
  ● Move joins to the application level.
  ● Major issues:
    ● No more truly relational.
    ● Limited scalability (what if a functional area grows too much?).


Partitioning

● Horizontal partitioning.
  ● Split tables by key and put partitions (shards) on different nodes.
  ● Scale your data and load by key.
  ● Move joins to the application level.
  ● Needs some kind of routing.
  ● Major issues:
    ● No more truly relational.
    ● Limited scalability (what if you need to rebalance?).
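
The routing and rebalancing points above can be sketched in a few lines. This is a minimal illustration, assuming a plain hash-modulo routing scheme; the function names and the `user:N` key format are hypothetical and not taken from any specific product.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Route a key to a shard index with a stable hash (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_count) != shard_for(key, new_count)
    )
    return moved / len(keys)


# Hypothetical keys: growing from 4 to 5 shards relocates most of them.
keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_count=4, new_count=5)
```

With modulo routing, adding a single shard forces most keys to move, which is the rebalancing pain the slide refers to.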


Caching

● Put a cache in front of your database.
  ● Distribute.
  ● Write-through for scaling reads.
  ● Write-behind for scaling reads and writes.
● Saves you a lot of pain, but ...
  ● “Only” scales read/write load.
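
The difference between write-through and write-behind can be sketched as follows. This is a single-process illustration under simplifying assumptions: the database is a plain dict, and the class and method names are hypothetical.

```python
from collections import deque


class WriteThroughCache:
    """Every write hits the database first, then the cache; reads hit the cache."""

    def __init__(self, database: dict):
        self.database = database
        self.cache = {}

    def put(self, key, value):
        self.database[key] = value   # synchronous database write
        self.cache[key] = value

    def get(self, key):
        # On a cache miss, load the value from the database (if present).
        if key not in self.cache and key in self.database:
            self.cache[key] = self.database[key]
        return self.cache.get(key)


class WriteBehindCache(WriteThroughCache):
    """Writes are acknowledged from the cache and flushed to the database later."""

    def __init__(self, database: dict):
        super().__init__(database)
        self.pending = deque()       # writes not yet stored in the database

    def put(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self) -> int:
        """Apply queued writes to the database; returns how many were written."""
        flushed = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.database[key] = value
            flushed += 1
        return flushed
```

In a real deployment `flush` would run in the background, and the cache would need to survive failures before the queue is drained.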


Still left out ...

● We didn't scale our data model.
  ● Still bound to the relational data model.
● We didn't scale our topology.
  ● Still static.
  ● Hard to add nodes for handling growth.
  ● Hard to tolerate nodes leaving due to failures.


Non Relational Databases, coming...


Friends or Foes?

We come in peace. To help our old friend: the relational database.


Requirements

● Flexible data model.
● Extreme reliability.
● Scale as you need.
  ● Scale with unplanned changes in the data model.
  ● Scale with unplanned growth in data size.
  ● Scale with unplanned growth in load.


Data Model

● Column oriented (hybrid).
  ● Group by columns.
  ● Hybrid: group by keys and column families.
  ● Dynamically add columns.
    ● Different key-identified values may have a different number of columns.
  ● Efficiently access the same group of columns (column family).


Data Model

● Document oriented.
  ● Group by named collections.
  ● Identify by key.
  ● Store a schema-less document.
    ● JSON.
    ● XML.
    ● Whatever ...
  ● Dynamically update your data model by simply changing your documents.
  ● Efficiently access whole documents.


Data Model

● Key/Value oriented.
  ● Group by named collections.
  ● Identify by key.
  ● Store an opaque value (whatever).
  ● Maybe the ancestor of modern non-relational databases.


Data Partitioning

● Consistent Hashing.
  ● Nodes mapped on a ring space of integers.
    ● Each node mapped on multiple locations.
    ● Each node owns a range of integers.
  ● Keys assigned to integers in the ring space.
    ● Stored on the owner node.
  ● Joining/Leaving nodes only affect the partition they're mapped to.
    ● Hence, key re-balancing is limited to that specific range (efficient).
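
The ring described above can be sketched as follows. This is a minimal illustration, assuming MD5 as the ring hash and 64 virtual locations per node; both choices, and the `HashRing` name, are assumptions for this example rather than anything prescribed by the talk.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring where each node is mapped on multiple locations."""

    def __init__(self, virtual_nodes: int = 64):
        self.virtual_nodes = virtual_nodes
        self._points = []    # sorted ring positions
        self._owners = {}    # ring position -> node name

    @staticmethod
    def _position(label: str) -> int:
        return int(hashlib.md5(label.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self.virtual_nodes):
            point = self._position(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def owner_of(self, key: str) -> str:
        """A key belongs to the first node location clockwise from its position."""
        if not self._points:
            raise LookupError("the ring has no nodes")
        index = bisect.bisect_right(self._points, self._position(key))
        return self._owners[self._points[index % len(self._points)]]
```

When a node joins, the only keys that change owner are the ones the new node takes over; every other key stays where it was.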




Data Consistency

● Strict (ACID) Consistency.
  ● All nodes ...
    ● At every point in time ...
    ● Hold a consistent view of the stored data.
  ● Reads and writes can be executed on every node.
    ● Results will always be consistent and up-to-date.
  ● Due to the CAP Theorem you will sacrifice one of:
    ● Availability.
    ● Partition tolerance.


Data Consistency

● Eventual (BASE) Consistency.
  ● N: number of nodes you want to replicate to.
  ● W: number of node writes required for a write to succeed.
  ● R: number of node reads required for a read to succeed.
  ● W < N
    ● Nodes not receiving the write may get that value later.
  ● R < N
    ● Nodes not holding the read value are ignored.
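
The N/W/R parameters can be sketched as follows. This is a toy in-memory model, assuming integer versions supplied by the caller and a simple `up` flag to simulate unreachable nodes; `Replica`, `quorum_write` and `quorum_read` are hypothetical names.

```python
class Replica:
    """One of the N nodes a value is replicated to."""

    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}   # key -> (version, value)


def quorum_write(replicas, key, version, value, w: int) -> bool:
    """The write succeeds if at least W replicas stored it."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = (version, value)
            acks += 1
    # Replicas that were down missed the write and must catch up later.
    return acks >= w


def quorum_read(replicas, key, r: int):
    """The read succeeds if at least R replicas answered; newest version wins."""
    answers = [replica.data.get(key) for replica in replicas if replica.up]
    if len(answers) < r:
        raise RuntimeError("fewer than R replicas answered")
    # Replicas that answered without holding the value are ignored.
    found = [answer for answer in answers if answer is not None]
    return max(found, key=lambda answer: answer[0])[1] if found else None
```

With N=3, W=2 and R=2, a write still succeeds with one node down, and a later read returns the value even if it reaches the node that missed the write.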


Data Consistency

● Eventual (BASE) Consistency.
  ● High read/write availability.
    ● Work even when some nodes fail to read and write values.
  ● Partition tolerance.
    ● Work even when some nodes cannot be reached anymore.
  ● Due to the CAP Theorem you are sacrificing consistency.


Data Versioning

● Vector Clocks.
  ● List of (node, counter) values associated with each object version.
  ● Every time a given object is read by a node, all its vector clocks are transferred.
  ● Every time a given object is written back by a node, the counter for that node is incremented.
  ● A vector clock can express causal ordering.
  ● A vector clock can express branching.
  ● Read-time reconciliation (read repair).
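
Causal ordering and branching can be sketched as follows. This is a minimal illustration, assuming a vector clock is represented as a dict from node name to counter; the function names are hypothetical.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a new clock with the counter for `node` incremented (a write)."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify two versions: causal ordering, equality, or a branch."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "after"
    if descends(b, a):
        return "before"
    return "concurrent"   # neither descends from the other: versions branched


def merge(a: dict, b: dict) -> dict:
    """Clock for a reconciled version: the per-node maximum of both clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}
```

Two writers starting from the same version produce "concurrent" clocks; a reader that reconciles them writes back under the merged clock, which descends from both branches.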


Data Versioning

● Other...
  ● Multi-Version Concurrency Control.
    ● Each read/write operation works on a consistent snapshot.
  ● Optimistic concurrency.
    ● Write operations succeed only if their version is the current one.
  ● Last Wins (optionally with timestamps).
    ● Last write operation wins.
    ● Optionally, with the highest timestamp.
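
The optimistic concurrency rule can be sketched as follows. This is a single-process illustration, assuming integer versions starting at 0 for missing keys; the `VersionedStore` class is hypothetical.

```python
class VersionedStore:
    """Writes succeed only if the writer's version is the current one."""

    def __init__(self):
        self._records = {}   # key -> (version, value)

    def read(self, key):
        """Return (version, value); a missing key reads as version 0."""
        return self._records.get(key, (0, None))

    def write(self, key, expected_version: int, value) -> bool:
        current_version, _ = self.read(key)
        if expected_version != current_version:
            return False     # someone else wrote first: the caller must re-read
        self._records[key] = (current_version + 1, value)
        return True
```

Two clients that read the same version cannot both win: the second write is rejected and has to be retried against the new version.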


Data Recovery

● Hinted Handoff.
  ● Writes to unavailable nodes get directed to “secondary” nodes.
  ● Secondary nodes get a hint about the original destination node.
  ● When the node is available again, the secondary node sends back the value.
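
The handoff steps above can be sketched as follows. This is a toy in-memory model, assuming an `up` flag to simulate an unavailable node; the `Node` class and both functions are hypothetical.

```python
class Node:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []   # (original destination name, key, value)


def write(primary: Node, secondary: Node, key, value) -> None:
    """Store on the primary, or leave a hinted write on the secondary."""
    if primary.up:
        primary.data[key] = value
    else:
        secondary.hints.append((primary.name, key, value))


def hand_off(secondary: Node, nodes_by_name: dict) -> int:
    """Send hinted values back to destinations that are available again."""
    delivered = 0
    remaining = []
    for destination, key, value in secondary.hints:
        target = nodes_by_name[destination]
        if target.up:
            target.data[key] = value
            delivered += 1
        else:
            remaining.append((destination, key, value))
    secondary.hints = remaining
    return delivered
```

Hints for a destination that is still down stay queued on the secondary until a later handoff attempt.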


Data Recovery

● Merkle Trees.
  ● For nodes missing a large number of values (e.g. after disaster recovery).
  ● Nodes exchange a tree composed of:
    ● Leaves, each containing the hash of a value hosted by the node.
    ● Parents, each containing the hash of its children.
  ● Updated values are recovered by comparing hashes and reading back from healthy nodes.
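
The hash comparison can be sketched as follows. This is a simplified illustration, assuming both nodes hold the same non-empty number of values in the same key order and that SHA-256 is the hash; the function names are hypothetical.

```python
import hashlib


def _hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_tree(values):
    """Return the tree as a list of levels: leaves first, root last."""
    level = [_hash(value) for value in values]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes its two children (or its single child at the edge).
        level = [_hash("".join(level[i:i + 2])) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, following only branches whose hashes differ."""
    top = len(tree_a) - 1
    suspects = [0] if tree_a[top][0] != tree_b[top][0] else []
    for depth in range(top - 1, -1, -1):
        next_suspects = []
        for parent in suspects:
            for child in (2 * parent, 2 * parent + 1):
                if child < len(tree_a[depth]) and tree_a[depth][child] != tree_b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects   # indexes of the values to read back from a healthy node
```

When the roots match, nothing else is compared; otherwise only the branches with differing hashes are followed, so the cost depends on how many values differ rather than on how many are stored.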


Membership

● Master-based.
  ● Registry-like.
  ● Membership information maintained and broadcast by one or more master nodes.
  ● Consistent.
  ● No SPOF with active/passive master.
  ● Prone to partitioning failures.


Membership

● Gossip-based.
  ● Peer-to-Peer.
  ● Membership information is randomly spread among nodes.
    ● Each node picks one or more nodes and sends them its own topology view.
  ● All nodes will eventually reach a consistent view of the cluster topology.
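
How gossip converges can be sketched with a small simulation. This is a toy model, assuming each node initially knows only itself and one neighbour, and that two gossiping nodes exchange views in both directions (push-pull); the node names and function names are hypothetical.

```python
import random


def gossip_round(views: dict, rng: random.Random) -> None:
    """One round: every node exchanges views with one randomly picked known peer."""
    for node in sorted(views):
        peers = sorted(views[node] - {node})
        if not peers:
            continue
        peer = rng.choice(peers)
        merged = views[node] | views[peer]
        views[node] = set(merged)
        views[peer] = set(merged)


def rounds_until_consistent(views: dict, rng: random.Random, max_rounds: int = 1000):
    """Gossip until every node knows the whole cluster; None if the cap is hit."""
    everyone = set(views)
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(view == everyone for view in views.values()):
            return round_number
    return None


# Eight hypothetical nodes, each starting with knowledge of one neighbour only.
names = [f"node-{i}" for i in range(8)]
views = {name: {name, names[(i + 1) % len(names)]} for i, name in enumerate(names)}
rounds = rounds_until_consistent(views, random.Random(42))
```

No node ever holds the authoritative membership list, yet every view ends up identical after a small number of rounds.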


Data Analysis

● The importance of data locality.
  ● A distributed system is built by:
    ● Moving data toward its behavior.
    ● ... or ...
    ● Moving behavior toward its data.
  ● An efficient distributed system is built by:
    ● Moving behavior toward its data.


Data Analysis

● Map-Reduce.
  ● Map data analysis and computation tasks toward the data itself.
  ● Reduce results.
  ● No need to move data around.
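
The idea of moving behavior toward data can be sketched with a word count. This is a single-process illustration, assuming a dict that stands in for the documents stored on each node; the node names and documents are made up for the example.

```python
from collections import Counter
from functools import reduce


def map_on_node(local_documents) -> Counter:
    """Map phase: runs where the data lives and returns a small partial result."""
    counts = Counter()
    for document in local_documents:
        counts.update(document.lower().split())
    return counts


def reduce_results(partials) -> Counter:
    """Reduce phase: combine the partial counts coming from every node."""
    return reduce(lambda left, right: left + right, partials, Counter())


# Hypothetical cluster: each node only ever scans its own documents.
cluster = {
    "node-1": ["scale your database", "be happy"],
    "node-2": ["scale your data model"],
}
partials = [map_on_node(documents) for documents in cluster.values()]
totals = reduce_results(partials)
```

Only the per-node counters travel across the network; the documents themselves never leave the node that stores them.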


Use Cases (1)

● Runtime data.
  ● “Runtime” VS “Transactional”.
    ● Not all data need complex relations.
    ● Not all data need to be persisted forever.
  ● That is, everything regarding the current “runtime” state.
    ● User session and everything related.
  ● Put the “runtime” state into your N-RDBMS.
  ● When the “runtime” state turns into “transactional”, put it into your RDBMS.


Use Cases (2)

● Hot spots.
  ● For read-intensive data:
    ● Use your N-RDBMS as a primary database for reads.
    ● Use your RDBMS as a primary database for writes and load data into the N-RDBMS from a background thread.
  ● For read/write-intensive data:
    ● Use your N-RDBMS as a primary database for writes and reads.
    ● Put your data in your RDBMS from a background thread (if needed).


Use Cases (3)

● Intense data computations.
  ● When the relational model doesn't efficiently represent your data ...
  ● And join operations are just too expensive ...
  ● N-RDBMS come to the rescue!
    ● Providing more efficient data representation/storage.
    ● Providing grid-style computations (e.g. Map-Reduce).


Products (1)

● MongoDB
  ● http://www.mongodb.org
  ● Document-based.
    ● (Binary) JSON.
  ● Support for indexes and object queries.
  ● Full support for master-slave replication.
  ● Alpha support for sharding.
  ● ACID (except in failure scenarios during replication).


Products (2)

● Cassandra
  ● http://incubator.apache.org/cassandra/
  ● Column-based (hybrid).
    ● Keys.
    ● Column Families.
      ● Columns.
      ● Super-Columns.
  ● Support for ordered range queries.
  ● Fully distributed.
    ● Peer-to-Peer.
    ● Eventually consistent.


Products (3)

● Voldemort
  ● http://project-voldemort.com
  ● Key/Value.
    ● Pluggable data serialization.
  ● No support for queries.
  ● Fully distributed.



Products (4)

● Riak
  ● http://riak.basho.com/
  ● Document-based.
    ● JSON.
    ● Links.
  ● Support for Map-Reduce.
  ● Fully distributed.
    ● With runtime dynamic tuning.


Final words

● Know how to scale your relational database.
  ● Don't dismiss it just to follow the hype.
● Know how non-relational databases scale.
  ● There are many choices around.
● Know your use cases.
  ● Make sensible decisions.
● Enjoy!
  ● And be happy!


Thank you!

Q&A
