July 11th, 2010
Apples, Oranges and NOSQL
Roi Aldaag Architect & ConsultantNadav Wiener Architect & Consultant
3
» What is NoSQL?
» What’s “wrong” with RDBMS?
» Why now?
Introduction
Agenda
4
» Scaling
» CAP Theorem
» ACID vs. BASE
RDBMS vs. NoSQL
Agenda
5
» Key / Value
» Column
» Document
» Graph
NoSQL Taxonomy
Agenda
6
» Comparing Apples to Oranges
» Polyglot Persistence
How to choose?
Agenda
Introduction
8
Introduction
Question: What do they all have in common?
9
Before we answer – some facts:
Introduction
10
Before we answer – some facts:
Introduction
Daily Page Views
Daily Visitors
Data size
7.8x109
620x106
Petabytes
7.1x109
500x106
Petabytes
550x106
56x106
Petabytes
350x106
37x106
Terabytes
82x106
12x106
Terabytes
July, 2010: http://www.alexa.com
11
Introduction
Answer: They use NoSQL data stores
12
Why!?
Introduction
13
» ACID doesn’t scale well horizontally
Sharding breaks relations
Joins are inefficient
» Transactions overhead
» Schema is not flexible
Predfined
Hard to evolve
Relational DBs Have Scaling Limitations
Introduction
14
» NO SQL / Not Only SQL
» A collective description of Open Source, Non-relational, data stores
Highly distributed
Highly scalable
Not ACID and... doesn’t use SQL
» Term coined in a convention in 2009 called “NoSQL” (Eric Evans)
» Started a movement that is gaining momentum
What is NoSQL?
Introduction
15
Introduction
16
» NoSQL data stores predate RDBMS (1970)
But remained a niche
» RDBMS – most popular and generic option
» Web 2.0 introduced new requirements:
Exponential increase in data
Information connectivity
Semi-structured data
» NoSQL data stores had answers
When time was right
When RDBMSs didn’t
Why now?
Introduction
17
It’s theory time:
Introduction
18
Scaling
19
» Adding resources to a single node in a system
» Add more CPUs or memory
» Move system to a larger machine
» Pros:
Quick and Simple
» Cons:
Outgrowing the capacity of largestsystem available (More’s law)
Expensive
Creates vendor lock-in
Scaling Up
Scaling
20
» Add more nodes to a system
» Functional Scaling (vertical)
Grouping data by function and spreading functional groups across databases
» Sharding (horizontal)
Splitting same functional data across multiple databases
» Pros: More flexible
» Cons: More complex
Scaling Out
Scaling
Distributed
Databases
22
Distributed Databases
» Many nodes
» Same databaseNode 1 Node 2
Node 3
23
» Consistency
All clients can see the same data
» Availability
All clients can always access data
» Partition tolerance
The ability to continue working when the network topology is broken
The ability to recover once the network is healed
What are the requirements from distributed databases?
Distributed Databases
24
» You can fully satisfy at most 2 out of 3
Compromise on 3rd
» Not “all or nothing”
Choose various levels of consistency, availability or partition tolerance
» Recognize which of the CAP rules your business needs for the task
CAP Theorem (E. Brewer, N. Lynch)
Distributed Databases
25
» Partition Tolerance is compromised
» Single site clusters (easier to ensure all nodes are always in contact)
» When a network partition occurs, the system blocks
» e.g. Two Phase Commit (2PC)
CA: Consistency & Availability
Distributed Databases
Partition
Tolerance
26
» Availability is compromised
» Access to some data may be temporarily limited
» The rest is still consistent/accurate
» e.g. Sharded database
» TBD sample
CP: Consistency & Partitioning
Distributed Databases
Partition
Tolerance
27
» Consistency is compromised
» System is still available under partitioning
» Some data returned may be temporarily not up-to-date
» Requires conflict resolution strategy
» e.g. DNS, caches, Master/Slave replication
» TBD sample
AP: Availability & Partitioning
Distributed Databases
Partition
Tolerance
ACID vs. BASE
29
» Atomicity
When a part of the transaction fails -> the entire transaction fails; Database state is left unchanged
» Consistency
A transaction takes database from one consistent state to another
» Isolation
A transaction can't see dirty state from other transactions
» Durability
Commit means commit.
ACID – a quick recap
ACID vs. BASE
30
» The CAP compliment of ACID
Just had to be called BASE
Backronym:
» Basically Available
» Soft State
» Eventually Consistent
BASE
ACID vs. BASE
31
» RDBMSs strive to provide ACID guarantees
ACID forces consistency
» NoSQL solutions often scale through BASE
BASE accepts that conflicts will happen
RDBMS & ACID / NoSQL & BASE
ACID vs. BASE
Taxonomy
33
Taxonomy
Key / Value Column
Graph
TXT
XML
BIN
Document
34
Taxonomy
Key / Value Databases
35
» Simple Key / Value lookups (DHT)
» Value is opaque
» Focus on scaling to huge amounts of data
» Designed to handle massive load
» E.g.
Riak
Project Voldemort
Redis
Key/Value Stores
Taxonomy
Based on Amazon’s
Dynamo paper
36
Key/Value e.g.: Riak
Taxonomy
» No single point of failure
» No machines are special or central
» MapReduce queries (Erlang / Javascript)
» HTTP/JSON API
» Ring cluster with automatic replication
» Elastic / partition rebalancing» Written in: Erlang, C, Javascript
» Developed by: Basho Technologies
» Java client: (jonjlee / riak-java-client)
37
Data Model
Key/Value e.g.: Riak
» Key / Value pairs are stored in a Bucket
» A Bucket ~ a namespace
» Each update is tracked by a Vector Clock
An algorithm for determining ordering and detecting conflicts
» When in conflict
Last wins / manual resolution
Versioning
38
» Read an object
» Store a new object
» Store an object with existing key (update)
Key/Value e.g.: Riak
GET /riak/bucket/key
POST /riak/bucket
PUT /riak/bucket/key
Example: REST API
39
» A framework supporting distributed computing on large data sets on clusters of machines
» Leverage parallel processing power
» Introduced by Google
» Inspired by map / reduce functions in functional programming
» Map step
» Reduce step
Key/Value e.g.: Riak
MapReduce
40
» Map
» Parse each document
» Emit a sequence of <word, doc_id> pairs
Key/Value e.g.: Riak
MapReduce example: Inverted Index
<doc_id, doc_text>
<100, >,
<200, >,
<300, >
Node
1
Node
2
Node
3
TXT1
TXT2
TXT3
<word ,doc_id>
< word1 ,100>,< word2 ,100>,
< word2 ,200>,
< word2 ,300>
41
» Reduce
» Accept all pairs for a given word
» Sort the corresponding document IDs
» Emit a <word, list(document ID)> pair
Key/Value e.g.: Riak
MapReduce example: Inverted Index
<word, list(document_id)>
< word1 ,(100) >,
< word2 ,(100,200)>, < word3 ,(300) >
42
Taxonomy
BigTable and
Column Oriented Databases
43
» Conceptually a single, infinitely large table
» Each rows can have different number of columns
» Table is sparse: |rows|*|columns| > |values |
» Based on Google’s BigTable paper
» E.g.
Cassandra
Hbase
Hypertable
Column Stores – BigTable derivatives
Taxonomy
44
» RDBMS:
Create a central table with common attributes
Create a table per product with unique attributes
Use a join query
Alternatively create a table that holds meta data on products
» NoSQL:
Column oriented database
Use arbitrarily columns
Use Case: Manage products with diverse attributes
Taxonomy
45
» Data model: Google’s BigTable
» Infrastructure: Amazon Dynamo
» Incremental scalability
» Flexible schema
» No single point of failure (Distributed P2P)
» Optimistic replication (Gossip protocol)
» Written in: Java
» Developed by: Facebook
» Java client: e.g. Hector / Thrift
Column Store e.g.: Cassandra
Taxonomy
46
» Column
Smallest increment of data: tuple of name, value, timestamp
Data Model
Column e.g.: Cassandra
{
name: "emailAddress",
value: “[email protected]",
timestamp: 123456789
}
47
» SuperColumn
A sorted, associative, unbounded array of columns
Column e.g.: Cassandra
{ // this is a SuperColumn
name: "homeAddress",
// with an unbounded array of Columns
value: {
// the keys is the name of the Column
street: {name: "street", value: "s", timestamp:...},
city: {name: "city", value: "c", timestamp:...},
zip: {name: "zip", value: "z", timestamp:...}
}
}
48
» ColumnFamily
A container (~Table) for columns sorted by their names
Column Families are referenced and sorted by row keys
Column e.g.: Cassandra
Users = { // ColumnFamily
john: { // key to row in CF
"role" : "admin",
"status" : "offline",
"nick" : "dude1934"
}, // end row
fred: { // another row
"nick" : “freddy",
"email" :"[email protected]",
"age" : "25",
"gender" : "male",…
},… // more rows
} Column Family
49
» Keyspace
The outer most grouping of data (~DB Schema)
Contains ColumnFamily’s
There is no imposed relationship between ColumsFamily’s
Column e.g.: Cassandra
50
» Example
Column e.g.: Cassandra
Tweets CF
Timeline CFKeyspace
51
Taxonomy
Document Oriented DatabasesTXT
XML
BIN
52
» Store semi-structured documents (think JSON)
» Document versioning
» Map/Reduce based queries, sorting, aggregation, etc.
» DB is aware of internal structure
» E.g. MongoDB
CouchDB
JackRabbit (JCR JSR 170)
Document Store
Taxonomy
53
» RDBMS:
Table for each: posts, comments, tags
Foreign relations
» NoSQL:
Document storage
Store post + tags + comments as a document
Use Case: Blog with tagged posts and comments
Taxonomy
54
» MongoDB (from "humongous")
» Manages collections of JSON-like documents (BSON)
» Queries can return specific fields of documents
» Supports secondary indexes
» Atomic operations on single documents
» Developed by: 10gen
» Written in: C++
» Clients: Java, Scala and more
Document Store e.g: MongoDB
Taxonomy
55
» Suppose you host a blog, where each post is tagged:
» Notice how posts have an array of tags
Example: Blog posts
Docment e.g.: MongoDB
db.posts.save({
_id : 3,
author:"john",
title : “Apples, Oranges and NOSQL",
text : “This article will…",
tags : [ “database", “nosql" ]
});
56
» MongoDB supports secondary indexes and a query optimizer
Compound indexes are also supported
Docment e.g.: MongoDB
db.posts.ensureIndex({ tags: 1 });
db.posts.ensureIndex({ author: 1});
db.posts.find({ author: "john", tags: "nosql" });
// Result:
{
"_id" : 3,
"author" : "john",
"title" : "Apples, Oranges and NOSQL",
"text" : "This article will…",
"tags" : ["database", "nosql", "mongodb" ]
}
57
» Let's update our posts to include some comments:
Docment e.g.: MongoDB
db.posts.update({ _id: 3 }, {
$inc: { comments_count: 4},
$pushAll : {
comments: [
{ text: “Comment 1" },
{ text: “Comment 2", author: "Mr. T" },
{ text: “Comment 3" },
{ text: “Comment 4" }
]
}
});
58
Taxonomy
Graph Databases
59
» Inspired by mathematical graph theory G=(E,V)
» Models the structure of data
» Navigational data model
» Scalability / data complexity
» Data model: Key-Value pairs on Edges / Nodes
» Relationships: Edges between Nodes
» E.g. Neo4j
Pregel (Google’s PageRank)
AllegroGraph
Graph databases
Taxonomy
60
» RDBMS
Complex recursive algorithm
Multiple Self joins
Round trips to DB / bulk read and resolve in RAM
» NoSQL:
Graph Storage
Network traversal
Use Case: Connected data - deep relationship links between users in a social network
Taxonomy
61
» High-performance graph engine
» Embedded / disk based
» Work with OO model: nodes, relationships, properties
» ACID Transactions
JTA support – participate in 2PC with your RDBMS
» Developed by: Neo Technologies
» Written in: Java
» Clients: Java, client libraries in other platforms
Graph e.g.: Neo4J
Taxonomy
62
Graph e.g.: Neo4j
http://neo4j.org/
Comparing Apples to Oranges
64
» RDBMS
Databases contains tables, columns and rows
All rows the same structure
Inherent ORM mismatch
» NoSQL
Choose your data structure
Data is stored in natural structure (e.g. Documents, Graphs, Objects)
Comparing Data Structures
Comparing Apples to Oranges
65
» RDBMS
Strict schema, difficult to evolve
Maintains relations and forces data integrity
» NoSQL
Structure of data can be changed dynamically
• e.g. Column stores – Cassandra
Data can sometimes be completely opaque
• e.g Key/Value – Project Voldemort
Comparing Schema Flexibility
Comparing Apples to Oranges
66
» RDBMS
The data model is normalized to remove data duplication
Normalization establishes table relations
» NoSQL
Denormalization is not a dirty word
Relations are not explicitly defined
Related data is usually grouped and stored as one unit
• E.g. document, column
Comparing Normalization & Relations
Comparing Apples to Oranges
67
» RDBMS
CRUD operations using SQL
Access data from multiple tables using SQL joins
Generic API such as JDBC
» NoSQL
Proprietary API and DSLs (e.g. Pig / Hive / Gremlin)
MapReduce, graph traversals
REST APIs, portable serialization formats• BSON, JSON, Apache Thrift, Memcached
Comparing Data Acces
Comparing Apples to Oranges
68
» RDBMS
Slice and Dice data, then reassemble any way you like
» NoSQL
Hard to repurpose data for ad-hoc usage
• Plan ahead
Think in advance
• How and what you store
• Data access patterns
Comparing Reporting Capabilities
Comparing Apples to Oranges
Summary
70
» ACID ruled exclusively in the last 40 years
doesn’t compromise on consistency
» Database industry neglected distributed DBs w/ availability
» Vacuum was filled with “NoSQL” BASE architectures
Strict A and P, minimize C compromise
» Relational databases are now trying to catch up
Why NOSQL / BASE
Summary
71
» Missing some query capabilities
joins / composite transaction
» Eventual consistency -- not for every problem
» Not a drop in replacement for RDBMS “on ACID”
» No standardization -> product lock-in
» Relatively immature (support, bugs, community)
NoSQL Limitations
Summary
72
» Relational databases and NoSQL databases are designed to meet different needs
» RDBMS-only should not be a default
» NOSQL databases outperform RDBMS’s in their particular niche
» No one size fits all / Silver bullet
...but you don’t have to choose one
Choose the right tool for the job
Summary
73
» Poly: many Glot: language
» Meshing up persistence mechanisms to best meet requirements
» Good integration stories:
E.g. Neo4j + JDBC using JTA
Polyglot Persistence
Summary