MongoDB Basic Concepts
Norberto Leite
Senior Solutions Architect, [email protected]
@nleite
Thursday, 25 October 12
Agenda
• Overview
• Replication
• Scalability
• Consistency & Durability
• Flexibility, Developer Experience
http://bit.ly/OT71M4
Your data needs started here...
http://bit.ly/Oxcsis
...but soon you had to be here
Basic Concepts
Horizontally Scalable
{ author : "steve", date : new Date(), text : "About MongoDB...", tags : ["tech", "database"] }
Document Oriented
Application
Fully Consistent
High Performance
Tradeoff: Scale vs Functionality
[Figure: scalability & performance (vertical axis) against depth of functionality (horizontal axis); plotted: memcached, key/value, RDBMS]
Replication
Why do we need replication?
• Failover
• Backups
• Secondary batch jobs
• High availability
Replica Sets
Data Availability across nodes
• Data Protection
  • Multiple copies of the data
  • Spread across Data Centers, AZs
• High Availability
  • Automated Failover
  • Automated Recovery
Replica Sets
[Diagram: App, one Primary (Read, Write), two Secondaries (Read). Asynchronous Replication]
Replica Sets
[Diagram: App, one Primary (Read, Write), two Secondaries (Read)]
Replica Sets
[Diagram: App, former Primary, newly elected Primary (Read, Write), Secondary (Read). Automatic Election of new Primary]
Replica Sets
[Diagram: App, Recovering node, Primary (Read, Write), Secondary (Read). New primary serves data]
Replica Sets
[Diagram: App, Secondary (Read), Primary (Read, Write), Secondary (Read)]
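The failover sequence in these diagrams can be sketched in plain JavaScript. This is a toy model, not MongoDB's election protocol: the member names, the `up` flag, and the `failover` function are all hypothetical, and it assumes the first reachable member wins, whereas a real election also weighs member priority and how current each copy of the data is.

```javascript
// Hypothetical three-member replica set matching the diagrams.
const members = [
  { name: "node1", state: "PRIMARY", up: true },
  { name: "node2", state: "SECONDARY", up: true },
  { name: "node3", state: "SECONDARY", up: true },
];

// Simplified election: returns the name of the primary, or null if
// no primary can exist because a majority of members is unreachable.
function failover(set) {
  for (const m of set) {
    if (!m.up) m.state = "DOWN";
    else if (m.state === "DOWN") m.state = "SECONDARY"; // recovered node rejoins
  }
  const reachable = set.filter((m) => m.up);
  if (reachable.length * 2 <= set.length) {
    // No majority: any remaining primary steps down, writes are refused.
    for (const m of reachable) m.state = "SECONDARY";
    return null;
  }
  const current = reachable.find((m) => m.state === "PRIMARY");
  if (current) return current.name;
  reachable[0].state = "PRIMARY"; // assumption: first reachable member wins
  return reachable[0].name;
}

members[0].up = false;          // the primary fails
console.log(failover(members)); // node2
```

When node1 comes back (`members[0].up = true`), calling `failover(members)` again leaves node2 as primary and returns node1 to the set as a secondary, as in the last diagram.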
Scalability
Horizontal Scalability
Sharding
Data Distribution across nodes
• Data location transparent to your code
• Data distribution is automatic
• Data re-distribution is automatic
• Aggregate system resources horizontally
• No code changes
Sharding - Range distribution
shard01 shard02 shard03
sh.shardCollection("test.tweets", {_id: 1} , false)
Sharding - Range distribution
shard01 shard02 shard03
a-i j-r s-z
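As a rough illustration of range distribution, the sketch below maps a shard key value to the shard whose range contains it. The `chunks` table and `shardFor` function are hypothetical and compare only the first letter, as the slide does; real chunk boundaries are full key values.

```javascript
// Hypothetical chunk table mirroring the slide: one key range per shard.
// Bounds are inclusive first letters, a simplification of real chunk bounds.
const chunks = [
  { shard: "shard01", min: "a", max: "i" },
  { shard: "shard02", min: "j", max: "r" },
  { shard: "shard03", min: "s", max: "z" },
];

// Return the shard whose range contains the first letter of the key,
// or null when no range matches (for example a key starting with a digit).
function shardFor(id) {
  if (typeof id !== "string" || id.length === 0) return null;
  const first = id[0].toLowerCase();
  const chunk = chunks.find((c) => first >= c.min && first <= c.max);
  return chunk ? chunk.shard : null;
}

console.log(shardFor("norberto")); // shard02
console.log(shardFor("steve"));    // shard03
```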
Sharding - Splits
shard01 shard02 shard03
a-i ja-jz s-z
k-r
Sharding - Splits
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
Sharding - Auto Balancing
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
js-jw
jz-r
Sharding - Auto Balancing
shard01 shard02 shard03
a-i ja-ji n-z
ji-js
js-jw
jz-r
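The idea behind auto balancing can be sketched as moving chunks from the fullest shard to the emptiest one until chunk counts are roughly even. This is a minimal model under stated assumptions: the `shards` table, the `balance` function, and its `threshold` parameter are hypothetical, and which shard receives which chunk here is a choice of the sketch, not something the slides specify.

```javascript
// Chunk placement after the splits shown earlier: shard02 holds four chunks.
const shards = {
  shard01: ["a-i"],
  shard02: ["ja-ji", "ji-js", "js-jw", "jz-r"],
  shard03: ["s-z"],
};

// Move one chunk at a time from the fullest shard to the emptiest shard
// until their chunk counts differ by at most `threshold` (minimum 1).
function balance(placement, threshold = 1) {
  const limit = Math.max(1, threshold); // a limit of 0 could ping-pong forever
  const moves = [];
  for (;;) {
    const names = Object.keys(placement).sort(
      (a, b) => placement[a].length - placement[b].length
    );
    const least = names[0];
    const most = names[names.length - 1];
    if (placement[most].length - placement[least].length <= limit) return moves;
    const chunk = placement[most].pop();
    placement[least].push(chunk);
    moves.push({ chunk, from: most, to: least });
  }
}

const moves = balance(shards);
console.log(moves.length); // 2
```

With the placement above, two chunks leave shard02 and every shard ends up holding two chunks.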
Sharding - Routed Query
shard01 shard02 shard03
a-i ja-ji n-z
ji-js
js-jw
jz-r
find({_id: "norberto"})
Sharding - Scatter Gather
shard01 shard02 shard03
a-i ja-ji n-z
ji-js
js-jw
jz-r
find({email: "[email protected]"})
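The difference between a routed query and a scatter-gather query comes down to whether the query contains the shard key. The sketch below is a simplified model of that routing decision, not the actual router code: `shardKey`, `chunks`, and `targetShards` are hypothetical names, the ranges are the simple a-i / j-r / s-z layout from the range distribution slide, and the email address is a placeholder.

```javascript
// Assumption: the collection is sharded on _id, as in the shardCollection call.
const shardKey = "_id";

// Hypothetical chunk table (first-letter ranges, inclusive).
const chunks = [
  { shard: "shard01", min: "a", max: "i" },
  { shard: "shard02", min: "j", max: "r" },
  { shard: "shard03", min: "s", max: "z" },
];

// Return the list of shards a query must be sent to.
function targetShards(query) {
  const value = query[shardKey];
  if (typeof value === "string" && value.length > 0) {
    // Routed query: the shard key pins the query to the matching range.
    const first = value[0].toLowerCase();
    return chunks
      .filter((c) => first >= c.min && first <= c.max)
      .map((c) => c.shard);
  }
  // Scatter gather: without the shard key, every shard has to be asked.
  return chunks.map((c) => c.shard);
}

console.log(targetShards({ _id: "norberto" }));             // [ 'shard02' ]
console.log(targetShards({ email: "someone@example.com" })); // all three shards
```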
Sharding - Caching
shard01
a-i
j-r
n-z
300 GB Data
96 GB Mem
3:1 Data/Mem
Aggregate Horizontal Resources
shard01 shard02 shard03
a-i j-r n-z
100 GB 100 GB 100 GB
96 GB Mem and 1:1 Data/Mem on each shard
300 GB Data in total
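The ratios on the two slides follow from simple arithmetic. The helper below is a hypothetical illustration that assumes data is spread evenly across shards and that each shard has the same amount of memory.

```javascript
// Data-to-memory ratio per shard, assuming an even spread of data.
function dataToMemRatio(dataGB, shardCount, memPerShardGB) {
  const dataPerShardGB = dataGB / shardCount;
  return dataPerShardGB / memPerShardGB;
}

// One shard: 300 GB of data against 96 GB of memory, roughly 3:1.
console.log(dataToMemRatio(300, 1, 96).toFixed(2)); // 3.13

// Three shards: 100 GB of data each against 96 GB of memory, roughly 1:1.
console.log(dataToMemRatio(300, 3, 96).toFixed(2)); // 1.04
```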
Consistency & Durability
Two choices for consistency
• Eventual consistency
  • Allow updates when a system has been partitioned
  • Resolve conflicts later
  • Example: CouchDB, Cassandra
• Immediate consistency
  • Limit the application of updates to a single master node for a given slice of data
  • Another node can take over after a failure is detected
  • Avoids the possibility of conflicts
  • Example: MongoDB
Durability
• For how long is my data available?
• When do I know that my data is safe?
• Where?
• MongoDB style
  • Fire and Forget
  • Get Last Error
  • Journal Sync
  • Replica Safe
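One way to read the four MongoDB-style levels is as increasingly strict `getLastError` requests issued after a write. The mapping below is a sketch under that assumption: the `durabilityLevels` table and the `describe` helper are hypothetical, and `w: 2` is just one possible replica-safe setting (acknowledged by the primary plus one secondary).

```javascript
// getLastError command documents for each level named on the slide.
// "Fire and Forget" sends the write and never asks for an acknowledgement.
const durabilityLevels = {
  "Fire and Forget": null,
  "Get Last Error": { getLastError: 1 },        // server acknowledged the write
  "Journal Sync": { getLastError: 1, j: true }, // write reached the journal
  "Replica Safe": { getLastError: 1, w: 2 },    // write reached two members
};

// In a mongo shell of that era, a non-null document would be passed to
// db.runCommand(...) right after the write, on the same connection.
function describe(level) {
  if (!(level in durabilityLevels)) throw new Error("unknown level: " + level);
  const cmd = durabilityLevels[level];
  return cmd === null ? "no acknowledgement requested" : JSON.stringify(cmd);
}

console.log(describe("Journal Sync")); // {"getLastError":1,"j":true}
```

Each step down the list trades some write latency for a stronger guarantee about where the data is when the call returns.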
Data Durability
Flexibility
Data Model
• Why JSON?
  • Provides a simple, well understood encapsulation of data
  • Maps simply to the object in your OO language
  • Linking & Embedding to describe relationships
JSON
place1 = { name : "10gen HQ", address : "578 Broadway 7th Floor", city : "New York", zip : "10011", tags : [ "business", "tech" ]}
Schema Design
Relational Database
Schema Design
MongoDB
embedding
linking
Schemas in MongoDB
Design documents that simply map to your application
post = {author: "Hergé", date: new Date(), text: "Destination Moon", tags: ["comic", "adventure"]}
> db.posts.save(post)
> db.posts.find( { author: "Hergé" } )
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Hergé",
  date : ISODate("2011-09-18T09:56:06.298Z"),
  text : "Destination Moon",
  tags : [ "comic", "adventure" ],
  comments : [
    { author : "Kyle",
      date : ISODate("2011-09-19T09:56:06.298Z"),
      text : "great book" }
  ] }
Embedding
JSON & Scaleout
• Embedding removes need for
  • Distributed Joins
  • Two Phase commit
• Enables data to be distributed across many nodes without penalty
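To make the join argument concrete, the sketch below contrasts a linked model with an embedded one using plain in-memory arrays in place of collections. All names here (`posts`, `comments`, `embeddedPosts`, `readLinked`, `readEmbedded`) are hypothetical; the documents are shortened versions of the post shown earlier.

```javascript
// Linked model: comments sit in a separate collection and point at the post.
const posts = [{ _id: 1, author: "Hergé", text: "Destination Moon" }];
const comments = [{ post_id: 1, author: "Kyle", text: "great book" }];

// Reading a post with its comments takes two lookups (an application-side
// join). Once sharded, the two collections may live on different nodes.
function readLinked(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  return { ...post, comments: comments.filter((c) => c.post_id === postId) };
}

// Embedded model: the comments travel inside the post document.
const embeddedPosts = [
  {
    _id: 1,
    author: "Hergé",
    text: "Destination Moon",
    comments: [{ author: "Kyle", text: "great book" }],
  },
];

// One lookup returns everything, so the read touches a single document.
function readEmbedded(postId) {
  return embeddedPosts.find((p) => p._id === postId) || null;
}

console.log(readLinked(1).comments.length);      // 1
console.log(readEmbedded(1).comments[0].author); // Kyle
```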
http://bit.ly/UmUnsU
http://bit.ly/cnP77L
http://bit.ly/ODoMhh
http://bit.ly/uW2nk
download at mongodb.org
Facebook: http://bit.ly/mongofb
Twitter: http://twitter.com/mongodb
LinkedIn: http://linkd.in/joinmongo
Support, Training, Consulting, Events, Meetups: http://www.10gen.com