MongoDB: Awesomely Dangerous
Twin Cities Code Camp
October 2010
Sunday, October 10, 2010
Ethan Gunderson
http://ethangunderson.com
Twitter & Github: ethangunderson
Our agenda
1) Intro to MongoDB
2) Advanced Features
3) Write Path
4) Durability
5) Scaling
6) Takeaways
7) Qs and As
30,000 foot overview
• Schema-less
• Scalable
• High-Performance
• Open Source
• Document Database
Driving Principles
1) Performance
2) Performance
3) Scalability
Documents
• Similar to traditional rows
• First attribute is _id, which is an ObjectId
ObjectId
1) Timestamp
2) Machine Id
3) Process Id
4) Counter
4b6857a07613c367094426b2
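As a rough illustration, the example id above can be split into the four fields listed. This sketch assumes the 12-byte layout used by MongoDB at the time of the talk (4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter); the function name `parse_object_id` is made up for this example.

```python
from datetime import datetime, timezone


def parse_object_id(hex_id):
    """Split a 24-character hex ObjectId into its four fields.

    Assumes the legacy layout: 4-byte timestamp, 3-byte machine id,
    2-byte process id, 3-byte counter.
    """
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    return {
        "timestamp": int.from_bytes(raw[0:4], "big"),  # seconds since the epoch
        "machine_id": raw[4:7].hex(),
        "process_id": int.from_bytes(raw[7:9], "big"),
        "counter": int.from_bytes(raw[9:12], "big"),
    }


parts = parse_object_id("4b6857a07613c367094426b2")
created = datetime.fromtimestamp(parts["timestamp"], tz=timezone.utc)
print(created.isoformat(), parts)
```

Because the timestamp comes first, ids sort roughly by creation time.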
Embedded Documents
• Documents in Documents
• Indexable
• Queryable
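MongoDB reaches into embedded documents with dot notation such as `address.city`. The following is only a sketch of that idea over plain Python dicts; `get_path` and `find` are hypothetical helpers, not driver calls, and the sample data is invented.

```python
def get_path(doc, path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # a missing key behaves like null
        current = current[key]
    return current


def find(collection, path, value):
    """Return every document whose value at the dotted path equals value."""
    return [doc for doc in collection if get_path(doc, path) == value]


people = [
    {"_id": 1, "name": "Ann", "address": {"city": "Minneapolis", "state": "MN"}},
    {"_id": 2, "name": "Bob", "address": {"city": "St. Paul", "state": "MN"}},
]
matches = find(people, "address.city", "St. Paul")
print(matches)
```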
Max Document Size
• 4MB limit on individual documents
• Realistically ~250kb
• Used to force better data modeling
Collections
• Similar to traditional tables
• Collections of like documents
• Schema-less
Capped Collections
• Fixed size collections
• Must be explicitly created
• Limited functionality (no deletes, limited updates)
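A capped collection behaves like a ring buffer: once it is full, each insert pushes out the oldest document. The sketch below is a simplification that caps by document count, whereas a real capped collection is sized in bytes; the class `CappedCollection` is invented for illustration.

```python
from collections import deque


class CappedCollection:
    """Toy fixed-size collection: inserts only, oldest entries age out."""

    def __init__(self, max_docs):
        # Simplification: cap by document count rather than by bytes.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        # When the deque is full, append() silently drops the oldest entry.
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order.
        return list(self._docs)


log = CappedCollection(max_docs=3)
for n in range(5):
    log.insert({"n": n})
print(log.find())
```

There is deliberately no delete method, mirroring the limited functionality noted above.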
BSON
A language-independent data interchange format
• The language of Mongo
• Similar to JSON, but BETTER
• Fast
• 10gen driver support for: C, C++, Java, JavaScript, Perl, PHP, Python, Ruby
• Community support for: REST, C#, Clojure, ColdFusion, Scala, and a lot more
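To give a feel for the wire format, here is a minimal sketch of a BSON encoder that handles only UTF-8 strings and 32-bit integers, following the layout described on bsonspec.org. Real drivers cover many more types; the function names here are made up for this example.

```python
import struct


def encode_cstring(text):
    """BSON element names are NUL-terminated UTF-8 strings."""
    return text.encode("utf-8") + b"\x00"


def encode_element(name, value):
    """Encode one key/value pair (strings and int32 only in this sketch)."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # type 0x02, name, int32 byte length (including the NUL), bytes
        return b"\x02" + encode_cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # type 0x10, name, little-endian int32
        return b"\x10" + encode_cstring(name) + struct.pack("<i", value)
    raise TypeError("this sketch only handles str and int32 values")


def encode_document(doc):
    """A document is: int32 total size, the elements, then a trailing NUL."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


encoded = encode_document({"hello": "world"})
print(encoded)
```

The length prefixes are what let a reader skip over fields without parsing them, which is part of why BSON is fast to traverse.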
B-Tree Indexes
• Similar to a traditional RDBMS
• Can index any field, including arrays
• In a unique index, documents missing the key are indexed with a value of null
• Blocking by default
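One consequence of the null rule for unique indexes is that only one document may omit the indexed key. The sketch below imitates that behavior with a Python set rather than a B-tree; `UniqueIndex` is a hypothetical class for illustration only.

```python
class UniqueIndex:
    """Toy unique index: a missing field is indexed as None (null)."""

    def __init__(self, field):
        self.field = field
        self._seen = set()

    def add(self, doc):
        key = doc.get(self.field)  # missing field -> None
        if key in self._seen:
            raise ValueError(f"duplicate key: {key!r}")
        self._seen.add(key)


index = UniqueIndex("email")
index.add({"_id": 1, "email": "a@example.com"})
index.add({"_id": 2})  # first document without an email: stored under None

try:
    index.add({"_id": 3})  # second document without an email
    second_missing_accepted = True
except ValueError:
    second_missing_accepted = False

print(second_missing_accepted)
```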
Inserting
Updating
Modifier Operators
$inc $set $push $pushAll $pop $pull $pullAll $addToSet
Conventional updates do work; they’re just not as fast.
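The modifier operators describe a change to apply in place instead of sending a whole replacement document. As a sketch of their semantics only, the function below applies them to a Python dict; `apply_update` is a hypothetical helper that handles top-level fields and none of the server's edge cases.

```python
import copy


def apply_update(doc, update):
    """Return a copy of doc with the modifier operators in update applied."""
    result = copy.deepcopy(doc)
    for op, fields in update.items():
        for field, arg in fields.items():
            if op == "$set":
                result[field] = arg
            elif op == "$inc":
                result[field] = result.get(field, 0) + arg
            elif op == "$push":
                result.setdefault(field, []).append(arg)
            elif op == "$pushAll":
                result.setdefault(field, []).extend(arg)
            elif op == "$addToSet":
                items = result.setdefault(field, [])
                if arg not in items:
                    items.append(arg)
            elif op == "$pull":
                if field in result:
                    result[field] = [x for x in result[field] if x != arg]
            elif op == "$pullAll":
                if field in result:
                    result[field] = [x for x in result[field] if x not in arg]
            elif op == "$pop":
                # 1 removes the last element, -1 removes the first
                if result.get(field):
                    result[field].pop(-1 if arg >= 0 else 0)
            else:
                raise ValueError(f"unsupported operator: {op}")
    return result


post = {"_id": 1, "views": 10, "tags": ["nosql"]}
updated = apply_update(
    post,
    {"$inc": {"views": 1}, "$push": {"tags": "mongodb"}, "$set": {"title": "Hi"}},
)
print(updated)
```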
Querying
Query Operators
$in $nin $all $ne $gt $gte $lt $lte $size $where $limit $offset $sort $slice
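Queries are themselves documents, with operators nested under the field they constrain. The sketch below evaluates a handful of the operators against Python dicts to show the shape of such queries. `matches` is a hypothetical helper, it compares scalar values only, and the sample data is invented.

```python
OPERATORS = {
    "$gt": lambda v, a: v is not None and v > a,
    "$gte": lambda v, a: v is not None and v >= a,
    "$lt": lambda v, a: v is not None and v < a,
    "$lte": lambda v, a: v is not None and v <= a,
    "$ne": lambda v, a: v != a,
    "$in": lambda v, a: v in a,
    "$nin": lambda v, a: v not in a,
    "$all": lambda v, a: isinstance(v, list) and all(x in v for x in a),
    "$size": lambda v, a: isinstance(v, list) and len(v) == a,
}


def matches(doc, query):
    """True when doc satisfies every condition in the query document."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            # {"age": {"$gt": 30}}: every operator must hold
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # {"name": "Ann"}: plain equality
            return False
    return True


users = [
    {"name": "Ann", "age": 31, "tags": ["ruby", "mongo"]},
    {"name": "Bob", "age": 24, "tags": ["java"]},
    {"name": "Cy", "age": 45, "tags": ["ruby"]},
]
over_30 = [u["name"] for u in users if matches(u, {"age": {"$gt": 30}})]
young_rubyists = [
    u["name"]
    for u in users
    if matches(u, {"tags": {"$all": ["ruby"]}, "age": {"$lte": 40}})
]
print(over_30, young_rubyists)
```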
Map / Reduce
• Replaces GROUP BY in SQL
• Similar in spirit to Hadoop: all input comes from a collection and all output goes to a collection
• Runs in parallel on all shards, but only one thread per node
• map and reduce functions are written in JavaScript
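As a sketch of the GROUP BY comparison, the following counts posts per tag with a map step that emits key/value pairs and a reduce step that folds the values for each key. In MongoDB these two functions would be JavaScript; the Python functions and the `map_reduce` driver here are stand-ins invented for illustration.

```python
from collections import defaultdict


def map_reduce(collection, map_fn, reduce_fn):
    """Run map over every document, group emitted values by key, then reduce."""
    grouped = defaultdict(list)
    for doc in collection:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Output documents use the key as _id, one per group.
    return [
        {"_id": key, "value": reduce_fn(key, values)}
        for key, values in sorted(grouped.items())
    ]


def map_tags(doc):
    # Emit (tag, 1) for every tag on the post.
    for tag in doc["tags"]:
        yield tag, 1


def reduce_counts(key, values):
    return sum(values)


posts = [
    {"tags": ["mongodb", "nosql"]},
    {"tags": ["mongodb"]},
    {"tags": ["sql"]},
]
tag_counts = map_reduce(posts, map_tags, reduce_counts)
print(tag_counts)
```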
GIS
• Built with location based queries in mind
• Assumes a flat map model of the Earth(!)
Near Queries
Bounded Queries
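Under a flat model of the Earth, a near query is a sort by straight-line distance and a bounded query is a box test. The sketch below shows both over plain dicts. `near` and `within_box` are hypothetical helpers, and the coordinates are approximate [longitude, latitude] pairs chosen for the example.

```python
import math


def near(places, point, limit=None):
    """Sort places by flat (Euclidean) distance from point."""
    def distance(place):
        return math.hypot(place["loc"][0] - point[0], place["loc"][1] - point[1])

    ordered = sorted(places, key=distance)
    return ordered if limit is None else ordered[:limit]


def within_box(places, lower_left, upper_right):
    """Keep places whose location falls inside the bounding box."""
    (x1, y1), (x2, y2) = lower_left, upper_right
    return [
        p for p in places
        if x1 <= p["loc"][0] <= x2 and y1 <= p["loc"][1] <= y2
    ]


places = [
    {"name": "Chicago", "loc": [-87.63, 41.88]},
    {"name": "St. Paul", "loc": [-93.09, 44.95]},
    {"name": "Minneapolis", "loc": [-93.27, 44.98]},
]
closest = [p["name"] for p in near(places, (-93.2, 45.0))]
in_metro = [p["name"] for p in within_box(places, (-94.0, 44.0), (-93.0, 46.0))]
print(closest, in_metro)
```

Treating degrees as flat coordinates distorts distances more the farther the points are from the equator, which is the caveat the flat-map bullet is flagging.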
GridFS
• How you store large files in Mongo
• Spreads data across multiple 256 KB documents in a ‘chunks’ collection
• Metadata about the file is stored in the files collection
• Permits range operations (x bytes from file)
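The chunking scheme can be sketched in a few lines: one metadata document per file, plus numbered chunk documents that reference it. Range reads then only need the chunks that overlap the requested bytes. The field names follow the GridFS convention (`files_id`, `n`, `data`, `length`, `chunkSize`), but `store_file` and `read_range` are hypothetical helpers and using the filename as `_id` is an assumption of this example.

```python
CHUNK_SIZE = 256 * 1024  # 256 KB, the chunk size named on the slide


def store_file(filename, data, chunk_size=CHUNK_SIZE):
    """Return the files-collection document and the list of chunk documents."""
    file_doc = {
        "_id": filename,  # assumption: the filename doubles as the id
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunks = [
        {"files_id": file_doc["_id"], "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return file_doc, chunks


def read_range(file_doc, chunks, offset, length):
    """Read length bytes starting at offset, touching only the needed chunks."""
    size = file_doc["chunkSize"]
    end = min(offset + length, file_doc["length"])
    if offset >= end:
        return b""
    first, last = offset // size, (end - 1) // size
    needed = sorted(
        (c for c in chunks if first <= c["n"] <= last), key=lambda c: c["n"]
    )
    blob = b"".join(c["data"] for c in needed)
    start = offset - first * size
    return blob[start:start + (end - offset)]


data = bytes(range(256)) * 2500  # 640,000 bytes of sample data
file_doc, chunks = store_file("sample.bin", data)
piece = read_range(file_doc, chunks, 262140, 10)  # spans the first chunk boundary
print(len(chunks), piece)
```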
The journey of a write
Memory Mapped File
Save!
Success!
Safe Mode
• Lets you choose the durability of each write individually
• Sacrifices performance for safety
• Options for each stage of the write
Safe mode query
Memory Mapped File
Save!
Success!
Fsync
Every 60 seconds, or when the kernel forces it
Safe mode with fsync
Memory Mapped File
Save!
Success!
Safe mode with replication flag
Memory Mapped File
Save!
Success!
Safe mode with fsync and replication
Memory Mapped File
Save!
Success!
Low and High Value
• Decide whether data is low or high value when building queries
• Allows you to be more careful (and slower) with data that is more important
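One way to put the low/high value split into practice is to pick write options per class of data. The sketch below is purely illustrative: the helper names are invented, and the option names `safe`, `fsync`, and `w` are assumptions modeled on the safe mode stages from the earlier slides, not a specific driver's API.

```python
def write_options(value):
    """Pick durability options for a write based on how valuable the data is."""
    if value == "low":
        # Fire and forget: fastest, no acknowledgement requested.
        return {"safe": False}
    if value == "high":
        # Wait for the write, an fsync, and a second server to have it.
        return {"safe": True, "fsync": True, "w": 2}
    raise ValueError("value must be 'low' or 'high'")


def stages_before_ack(options):
    """List the stages a write passes before the client sees success."""
    if not options.get("safe"):
        return []
    stages = ["memory-mapped file"]
    if options.get("fsync"):
        stages.append("fsync to disk")
    if options.get("w", 1) > 1:
        stages.append(f"replicated to {options['w']} servers")
    return stages


print(stages_before_ack(write_options("low")))
print(stages_before_ack(write_options("high")))
```

Page-view counters might go through the low path, while orders or payments would take the high one.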
Single Server Durability
It’s not
What 10gen has to say...
http://blog.mongodb.org/post/381927266/what-about-durability
True single server durability is almost never done correctly. First, there are many scenarios in which that server loses all its data no matter what. If there is water damage, fire, some hardware problems, etc… no matter how durable the software is, data can be lost.
The path to true durability is replication.
Scaling
Since we’re forced to think about it
Determining how to scale
Reads
Writes
Replica Sets
• Distribute reads across the cluster
• Replaces the traditional Master/Slave setup
• Replication is done via an operations log (the oplog)
• Auto failover
• Rack and Datacenter aware
• Smart, very smart
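Auto failover relies on the surviving members electing a new master, and an election needs a strict majority of the voting members. That is why a data-less arbiter is useful: it adds a vote without adding a copy of the data. A minimal sketch of the majority rule, with a hypothetical function name:

```python
def can_elect_primary(total_voters, reachable_voters):
    """A new master needs votes from a strict majority of all voting members."""
    return reachable_voters > total_voters // 2


# Two data nodes, one fails: the survivor is only 1 of 2 voters.
print(can_elect_primary(total_voters=2, reachable_voters=1))

# Same two data nodes plus an arbiter: survivor + arbiter are 2 of 3 voters.
print(can_elect_primary(total_voters=3, reachable_voters=2))
```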
[Diagrams: a Master with three Slaves. The Master fails (X), one of the Slaves becomes the New Master, and the old Master rejoins as a Slave.]
[Diagrams: after another failure (X), two Slaves both volunteer as Master ("Me! Me!"). An Arbiter running on a web slice picks one ("You").]
Determining how to scale
Reads
Writes
Being write heavy
Currently, Mongo can only process one write at a time.
Usually not a problem, as writes are wicked fast.
Auto-Sharding
• Partitions data across the cluster in an order-preserving manner
• No support for load-based partitioning
• Automatic failover and balancing of nodes
• Distributes writes across the cluster
• Based heavily on Yahoo!’s PNUTS and Google’s BigTable
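Order-preserving partitioning means each shard owns a contiguous range of the shard key, so neighbouring keys usually land on the same shard. The sketch below routes keys with a sorted list of split points. `RangeSharder`, the split points, and the shard names are all invented for illustration.

```python
import bisect


class RangeSharder:
    """Route keys to shards by contiguous key ranges."""

    def __init__(self, split_points, shards):
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        # Ranges are [min, s0), [s0, s1), ..., [s_last, max).
        return self.shards[bisect.bisect_right(self.split_points, key)]


router = RangeSharder(["h", "q"], ["shard0", "shard1", "shard2"])
for name in ["alice", "mongo", "zed"]:
    print(name, "->", router.shard_for(name))
```

Because placement depends only on the key, a hot key range stays on one shard, which is the limitation the load-based partitioning bullet points at.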
Takeaways
• Mongo is fast, but it does interesting things to be that fast
• Mongo is not SQL. You will need to learn new things
Qs and As
http://spkr8.com/t/4756
Resources
• Official MongoDB site: http://mongodb.org
• BSON site: http://bsonspec.org
• Comprehensive writeup of Mongo features: http://www.markus-gattol.name/ws/mongodb.html