Post on 16-Apr-2017
An Introduction to MongoDB
Anuj Jain Equal Experts India
MongoDB
NoSQL
Key-value
Graph database
Document-oriented
Column family
The Great Divide
Not an RDBMS
• Mongo is not a relational database like MySQL or Oracle.
• No multi-document transactions (writes to a single document are atomic).
• No referential integrity.
• No joins.
• No fixed schema, so no columns or rows.
• NoSQL.
What is MongoDB?
• Scalable, high-performance, open-source, document-oriented database.
• Built for speed.
• Rich document-based queries for easy readability.
• Full index support for high performance.
• Replication and failover for high availability.
• Auto-sharding for easy scalability.
• Map/Reduce for aggregation.
Quiz ?
Which of the following statements are true about MongoDB?
1. MongoDB is document oriented.
2. MongoDB supports Joins.
3. MongoDB has dynamic schema.
4. MongoDB supports SQL.
What is MongoDB great for?
• Semi-structured content management.
• Real-time analytics and high-speed logging.
• Caching and high availability.
• Mobile and social infrastructure, big data, etc.
Some considerations while designing a schema in MongoDB
• Combine objects into one document if you will use them together. Otherwise separate them (but make sure there is no need for joins).
• Duplicate data (within limits), because disk space is cheap compared to compute time.
• Do joins on write, not on read.
• Optimize your schema for the most frequent use cases.
• Do complex aggregation in the schema.
Example – Blog Post
• Every post has a unique title, description and url.
• Every post can have one or more tags.
• Every post has the name of its publisher and the total number of likes.
• Every post has comments given by users, along with their name, message, date-time and likes.
• On each post there can be zero or more comments.
RDBMS
Mongo Schema
{
  "_id" : ObjectId("55b1f50899708bec87f96edc"),
  "title" : "MongoDB Tutorial for beginner",
  "description" : "How to start using mongodb",
  "by" : "Anuj Jain",
  "url" : "http://mongodbtutorial.com/blog/mongodb",
  "tags" : [ "mongodb", "nosql" ],
  "likes" : 200,
  "comments" : [
    {
      "user" : "MongoUser",
      "message" : "Very Nice Tutorial",
      "dateCreated" : NumberLong(1437725960469),
      "like" : true
    }
  ]
}
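The "do joins on write, not on read" advice can be sketched against a document shaped like the one above. This is a minimal illustration in plain JavaScript, with no MongoDB server involved; the commentCount field and the addComment helper are assumptions added for the example and are not part of the original schema.

```javascript
// In-memory stand-in for the blog post document.
// commentCount is a hypothetical extra field, precomputed at write time.
const post = {
  title: "MongoDB Tutorial for beginner",
  tags: ["mongodb", "nosql"],
  likes: 200,
  comments: [],
  commentCount: 0,
};

// Hypothetical helper: embed the comment in the post and refresh the
// precomputed count in the same step, so a reader never has to join
// or count at query time.
function addComment(target, user, message, dateCreated) {
  target.comments.push({ user, message, dateCreated, like: false });
  target.commentCount = target.comments.length;
  return target;
}

addComment(post, "MongoUser", "Very Nice Tutorial", 1437725960469);
console.log(post.commentCount, post.comments[0].user);
```

The trade-off is the one the considerations list names: a little duplicated data and extra work on write, in exchange for a read that needs only one document.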
Quiz ?
How many different data types are there in JSON?
1. 4
2. 5
3. 6
4. 7
Answer
Ans: 6
1. String
2. Number
3. Boolean
4. null
5. Array
6. Object/document
CRUD
Create
db.collection.insert( <document> )
db.collection.save( <document> )
db.collection.update( <query>, <update>, { upsert: true } )
Read
db.collection.find( <query>, <projection> )
db.collection.findOne( <query>, <projection> )
Update
db.collection.update( <query>, <update>, <options> )
Delete
db.collection.remove( <query>, <justOne> )
Some Other Operators
1. $and
2. $or
3. $in
4. $nin
5. $exists
6. $push
7. $pop
8. $addToSet
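To make the meaning of the query operators in this list concrete, here is a rough sketch of a matcher in plain JavaScript. It is not MongoDB's implementation: the matches function and the sample people are assumptions for illustration, it handles only $and, $or, $in, $nin and $exists, and it compares scalar fields only. The update operators ($push, $pop, $addToSet) are left out.

```javascript
// Simplified matcher for a few query operators; scalar fields only.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$and") return cond.every((q) => matches(doc, q));
    if (key === "$or") return cond.some((q) => matches(doc, q));
    const value = doc[key];
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // Operator document such as { $in: [...] } or { $exists: true }.
      return Object.entries(cond).every(([op, arg]) => {
        switch (op) {
          case "$in": return arg.includes(value);
          case "$nin": return !arg.includes(value);
          case "$exists": return (key in doc) === arg;
          default: throw new Error(`unsupported operator: ${op}`);
        }
      });
    }
    return value === cond; // plain equality
  });
}

// Hypothetical sample data.
const people = [
  { name: "Anuj", age: 28, skills: ["Java", "Mongo"] },
  { name: "Ravi", age: 35 },
];

// People whose age is not 35 AND who have a skills field.
const query = { $and: [{ age: { $nin: [35] } }, { skills: { $exists: true } }] };
const result = people.filter((p) => matches(p, query)).map((p) => p.name);
console.log(result);
```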
Indexes
1. Indexes are special data structures that store a small portion of the data set in an easy-to-traverse form.
2. An index stores the value of a specific field or set of fields.
3. Entries are ordered by the value of the field, as specified in the index.
4. Indexes can speed up read operations but slow down write operations.
5. MongoDB uses a B-tree data structure to store indexes.
6. Building an index blocks the mongod process unless you specify the background option.
7. null is assumed if the field is missing.
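Why an ordered index helps reads can be sketched with a sorted array and a binary search. This is only a flat stand-in for a B-tree, written in plain JavaScript; buildIndex, findByIndex and the sample users are assumptions for the example, and the indexed field is assumed to be numeric and present on every document.

```javascript
// Build index entries { key, position }, ordered by the field value.
function buildIndex(docs, field) {
  return docs
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => a.key - b.key);
}

// Find all documents whose indexed field equals value,
// without scanning the whole collection.
function findByIndex(docs, index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) { // binary search for the first entry with key >= value
    const mid = (lo + hi) >> 1;
    if (index[mid].key < value) lo = mid + 1;
    else hi = mid;
  }
  const found = [];
  for (let i = lo; i < index.length && index[i].key === value; i++) {
    found.push(docs[index[i].position]);
  }
  return found;
}

// Hypothetical sample collection.
const users = [
  { name: "A", age: 31 },
  { name: "B", age: 28 },
  { name: "C", age: 28 },
  { name: "D", age: 45 },
];
const ageIndex = buildIndex(users, "age");
const hits = findByIndex(users, ageIndex, 28).map((u) => u.name);
console.log(hits);
```

The write cost mentioned in point 4 shows up here too: every insert would have to place a new entry at the right position in the index as well as store the document.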
When To Index ?
1. Frequently queried fields
2. Low response time
3. Sorting
4. Avoid full collection scans.
Index Types
1. Default (_id)
2. Single Field
3. Compound Index
4. Multikey Index
5. Geospatial Index
6. Sparse Index
7. TTL Index
Quiz ?
A MongoDB index can have keys of different types (ints, dates, strings, for example) in it?
1. True
2. False
Covered Indexes
Queries that can be resolved with only the index (no need to fetch the original document).
Example:
{
  "name" : "Anuj",
  "age" : 28,
  "gender" : "Male",
  "skills" : [ "Java", "Mongo" ]
}
db.people.createIndex({ "name" : 1, "age" : 1 })
db.people.find({ "name" : "Anuj" }, { "_id" : 0, "age" : 1 })
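The rule behind the example above can be sketched as a small check in plain JavaScript: a query is covered when every field it filters on and every field it returns is part of the index. The isCovered function is an assumption written for illustration, not a MongoDB API, and it ignores real-world conditions such as multikey (array) fields.

```javascript
// Simplified covered-query check.
function isCovered(indexKeys, query, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const queried = Object.keys(query);
  // Fields explicitly included by the projection.
  const returned = Object.keys(projection).filter((f) => projection[f]);
  // _id is returned by default unless the projection mentions it.
  if (!("_id" in projection)) returned.push("_id");
  // No included fields means the whole document is returned.
  if (returned.length === 0) return false;
  return [...queried, ...returned].every((f) => indexed.has(f));
}

const index = { name: 1, age: 1 };
console.log(isCovered(index, { name: "Anuj" }, { _id: 0, age: 1 }));    // query from the slide
console.log(isCovered(index, { name: "Anuj" }, { age: 1 }));            // _id is not in the index
console.log(isCovered(index, { name: "Anuj" }, { _id: 0, gender: 1 })); // gender is not in the index
```

This is why the slide's query sets "_id" to 0: without that, _id has to be fetched from the document.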
TTL Indexes ("time to live")
1. mongod automatically removes documents from the collection after a specified number of seconds.
2. The indexed field must be a BSON date or an array of BSON dates.
E.g. db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
where createdAt is a date field.
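The expiry rule can be sketched in plain JavaScript: a document becomes eligible for removal once the value of the indexed date field plus expireAfterSeconds has passed. The isExpired function and the sample events are assumptions for illustration; the array-of-dates case is not handled, and in a real deployment the removal is done by a background task, so documents may remain for a short while after they expire.

```javascript
// A document is expired once (date field + expireAfterSeconds) <= now.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // only date values expire
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

// Hypothetical log events, checked against a fixed "now".
const now = new Date("2017-04-16T12:00:00Z");
const fresh = { msg: "login", createdAt: new Date("2017-04-16T11:30:00Z") };
const stale = { msg: "logout", createdAt: new Date("2017-04-16T10:30:00Z") };
const noDate = { msg: "ping", createdAt: "2017-04-16" };

console.log(isExpired(fresh, "createdAt", 3600, now));
console.log(isExpired(stale, "createdAt", 3600, now));
console.log(isExpired(noDate, "createdAt", 3600, now));
```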
Quiz ?
Suppose we run:
db.foo.createIndex({ a : 1, b : 1, c : 1 })
db.foo.find({ a : "sports", b : { $gt : 100 } })
Then:
1. Only the index needs to be touched to fully execute the query.
2. The index and some documents need to be touched.
Why Replication?
• To keep your data safe.
• High (24*7) availability of data.
• Disaster Recovery.
• Read scaling (extra copies to read from).
• No downtime for maintenance (like backups, index rebuilds, compaction).
Replica set features
• A cluster of N nodes.
• Any one node can be primary.
• All write operations go to the primary.
• Automatic fail-overs.
• Automatic Recovery.
• Consensus election of primary.
Capped Collection
• Fixed-size circular queues that follow insertion order.
• The fixed size is preallocated; when it is exhausted, the oldest documents are automatically removed to make room.
• We cannot delete individual documents from a capped collection.
• Since MongoDB 2.2, a capped collection has an index on the _id field by default; no other indexes are created automatically.
Capped Collection
Commands:
1. db.createCollection("cappedcollection", { capped : true, size : 10000 })
2. db.createCollection("cappedcollection", { capped : true, size : 10000, max : 1000 })
3. db.cappedLogCollection.isCapped()
4. db.runCommand({ "convertToCapped" : "posts", size : 10000 })
5. db.cappedLogCollection.find().sort({ $natural : -1 })
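The circular-queue behaviour of a capped collection can be sketched with a small in-memory class in plain JavaScript. CappedBuffer is an assumption written for illustration, not a MongoDB API, and it caps by document count only (like the max option), not by size in bytes.

```javascript
// In-memory stand-in for a capped collection limited by document count.
class CappedBuffer {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) this.docs.shift(); // drop the oldest
  }

  // Insertion (natural) order.
  find() {
    return [...this.docs];
  }

  // Newest first, similar in spirit to sort({ $natural : -1 }).
  findReverse() {
    return [...this.docs].reverse();
  }
}

const log = new CappedBuffer(3);
for (let i = 1; i <= 5; i++) log.insert({ n: i });

console.log(log.find().map((d) => d.n));
console.log(log.findReverse().map((d) => d.n));
```

After five inserts into a buffer of three, only the three newest documents remain, which is the behaviour the slide describes for logging use cases.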
Query Limitations:
Indexes cannot be used efficiently by queries that use:
1. Regular expressions or negation operators like $nin, $not, etc.
2. Arithmetic operators like $mod, etc.
3. $where clause
Hence, it is always advisable to check the index usage for your queries.
Index Limitation
Maximum Ranges:
• A collection cannot have more than 64 indexes.
• The length of the index name cannot be longer than 125 characters.
• A compound index can have a maximum of 31 fields indexed.
$explain
The $explain operator provides information and statistics on the query, for example:
1. Indexes used by the query.
2. Number of documents scanned in serving the query.
3. Whether the index alone is enough to serve the query, i.e. a covered index.
Usage:
db.users.find({gender:"M"}, {user_name:1,_id:0} ).explain()
$hint
The $hint operator forces the query optimizer to use the specified index to run a query.
db.users.find({gender:"M"},{user_name:1,_id:0}).hint({gender:1,user_name:1})
Backup & Restore
Backup Utilities
1. mongodump (used to dump the complete data directory or a database)
2. mongoexport (used to export a collection to an output file such as JSON or CSV)
Restore Utilities
1. mongorestore
2. mongoimport
ObjectId
An ObjectId is a 12-byte BSON type with the following structure:
The first 4 bytes represent the seconds since the Unix epoch.
The next 3 bytes are the machine identifier.
The next 2 bytes are the process id.
The last 3 bytes are a counter that starts at a random value.
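The layout above can be checked by splitting the hex string of the ObjectId from the blog post example. This is a minimal sketch in plain JavaScript; parseObjectId is an assumption written for illustration, not a driver API, and it follows the 4-3-2-3 byte layout described on this slide.

```javascript
// Split a 24-character hex ObjectId into the four parts described above.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  return {
    timestampSeconds: parseInt(hex.slice(0, 8), 16), // first 4 bytes
    machineId: hex.slice(8, 14),                     // next 3 bytes
    processId: parseInt(hex.slice(14, 18), 16),      // next 2 bytes
    counter: parseInt(hex.slice(18, 24), 16),        // last 3 bytes
  };
}

const parts = parseObjectId("55b1f50899708bec87f96edc");
console.log(parts.timestampSeconds); // 1437725960
console.log(new Date(parts.timestampSeconds * 1000).toISOString());
```

The timestamp comes out as 1437725960 seconds, which lines up with the dateCreated value 1437725960469 (milliseconds) in the blog post document, so the creation time of a document can be read from its _id.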
Thank you
References: https://www.mongodb.org/