MongoDB Schema Design: Four Real-World Examples

Date post: 15-Apr-2017
Mike Friedman, Perl Engineer & Evangelist, 10gen. #MongoDBdays. Schema Design: Four Real-World Use Cases
Transcript
Page 1: MongoDB Schema Design: Four Real-World Examples

Perl Engineer & Evangelist, 10gen

Mike Friedman

#MongoDBdays

Schema Design
Four Real-World Use Cases

Page 2: MongoDB Schema Design: Four Real-World Examples

Agenda
• Why is schema design important
• 4 Real World Schemas
  – Inbox
  – History
  – Indexed Attributes
  – Multiple Identities
• Conclusions

Page 3: MongoDB Schema Design: Four Real-World Examples

Why is Schema Design important?

• Largest factor for a performant system
• Schema design with MongoDB is different
• RDBMS – "What answers do I have?"
• MongoDB – "What question will I have?"

Page 4: MongoDB Schema Design: Four Real-World Examples

#1 - Message Inbox

Page 5: MongoDB Schema Design: Four Real-World Examples

Let’s get Social

Page 6: MongoDB Schema Design: Four Real-World Examples

Sending Messages

?

Page 7: MongoDB Schema Design: Four Real-World Examples

Design Goals
• Efficiently send new messages to recipients
• Efficiently read inbox

Page 8: MongoDB Schema Design: Four Real-World Examples

Reading my Inbox

?

Page 9: MongoDB Schema Design: Four Real-World Examples

3 Approaches (there are more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing

Page 10: MongoDB Schema Design: Four Real-World Examples

Fan out on read

// Shard on "from"
sh.shardCollection( "mongodbdays.inbox", { from: 1 } )

// Make sure we have an index to handle inbox reads
db.inbox.ensureIndex( { to: 1, sent: 1 } )

msg = {
    from: "Joe",
    to: [ "Bob", "Jane" ],
    sent: new Date(),
    message: "Hi!"
}

// Send a message
db.inbox.save( msg )

// Read my inbox
db.inbox.find( { to: "Joe" } ).sort( { sent: -1 } )

Page 11: MongoDB Schema Design: Four Real-World Examples

Fan out on read – Send Message

Shard 1 Shard 2 Shard 3

Send Message

Page 12: MongoDB Schema Design: Four Real-World Examples

Fan out on read – Inbox Read

Shard 1 Shard 2 Shard 3

Read Inbox

Page 13: MongoDB Schema Design: Four Real-World Examples

Considerations
• One document per message sent
• Reading an inbox means finding all messages with my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find everything
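
The scatter-gather cost can be illustrated without a cluster. The sketch below is plain JavaScript, not MongoDB's actual router: `shardFor` is a hypothetical stand-in for chunk placement, `SHARD_COUNT` assumes the three shards drawn in the diagrams, and `shardsToVisit` reports which shards a query has to contact depending on whether it filters on the shard key.

```javascript
// Toy model of query routing; NOT MongoDB's real chunk or hash logic.
const SHARD_COUNT = 3; // assumption: three shards, as in the diagrams

// Hypothetical placement function: sum of char codes modulo shard count.
function shardFor(keyValue) {
  let sum = 0;
  for (const ch of String(keyValue)) sum += ch.charCodeAt(0);
  return sum % SHARD_COUNT;
}

// A query that includes the shard key field can be routed to one shard;
// any other query has to be sent to every shard (scatter-gather).
function shardsToVisit(query, shardKeyField) {
  if (Object.prototype.hasOwnProperty.call(query, shardKeyField)) {
    return [shardFor(query[shardKeyField])];
  }
  return Array.from({ length: SHARD_COUNT }, (_, i) => i);
}

// Collection sharded on "from": sending is targeted, inbox reads are not.
const sendTargets = shardsToVisit({ from: "Joe" }, "from");
const readTargets = shardsToVisit({ to: "Joe" }, "from");
console.log(sendTargets.length, readTargets.length); // prints: 1 3
```

With the shard key on `from`, the write goes to one shard, while the inbox read on `to` fans out to all three.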

Page 14: MongoDB Schema Design: Four Real-World Examples

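Fan out on write

// Shard on "recipient" and "sent"
sh.shardCollection( "mongodbdays.inbox", { "recipient": 1, "sent": 1 } )

msg = {
    from: "Joe",
    to: [ "Bob", "Jane" ],
    sent: new Date(),
    message: "Hi!"
}

// Send a message: one copy per recipient. Each copy is a new object so that
// save() gives it its own _id instead of overwriting the previous copy.
for ( var i = 0; i < msg.to.length; i++ ) {
    db.inbox.save( { from: msg.from, to: msg.to, recipient: msg.to[i],
                     sent: msg.sent, message: msg.message } );
}

// Read my inbox
db.inbox.find( { recipient: "Joe" } ).sort( { sent: -1 } )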

Page 15: MongoDB Schema Design: Four Real-World Examples

Fan out on write – Send Message

Shard 1 Shard 2 Shard 3

Send Message

Page 16: MongoDB Schema Design: Four Real-World Examples

Fan out on write – Read Inbox

Shard 1 Shard 2 Shard 3

Read Inbox

Page 17: MongoDB Schema Design: Four Real-World Examples

Considerations
• One document per recipient
• Reading my inbox is just finding all of the messages with me as the recipient
• Can shard on recipient, so inbox reads hit one shard
• But still lots of random IO on the shard
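
The write-side cost of this approach is easy to see in a small in-memory sketch. It is plain JavaScript with an array standing in for the collection; `fanOut` is a hypothetical helper name, not part of any MongoDB API.

```javascript
// Build one inbox document per recipient (fan out on write).
// Each copy is a fresh object, so each would get its own _id on insert.
function fanOut(msg) {
  return msg.to.map((recipient) => ({ ...msg, recipient }));
}

const msg = { from: "Joe", to: ["Bob", "Jane"], sent: new Date(0), message: "Hi!" };
const copies = fanOut(msg);

// Reading an inbox becomes an equality match on the "recipient" field,
// which is also the leading field of the shard key.
const bobsInbox = copies.filter((doc) => doc.recipient === "Bob");
console.log(copies.length, bobsInbox.length); // prints: 2 1
```

A message to N recipients costs N writes, and in exchange each inbox read can be answered from the documents addressed to that one recipient.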

Page 18: MongoDB Schema Design: Four Real-World Examples

Fan out on write with buckets

// Shard on "owner / sequence"
sh.shardCollection( "mongodbdays.inbox", { owner: 1, sequence: 1 } )
sh.shardCollection( "mongodbdays.users", { user_name: 1 } )

msg = {
    from: "Joe",
    to: [ "Bob", "Jane" ],
    sent: new Date(),
    message: "Hi!"
}

Page 19: MongoDB Schema Design: Four Real-World Examples

Fan out on write with buckets

// Send a message
for ( recipient in msg.to ) {
    // Increment the recipient's message counter and read the new value
    count = db.users.findAndModify( {
        query: { user_name: msg.to[recipient] },
        update: { "$inc": { "msg_count": 1 } },
        upsert: true,
        new: true
    } ).msg_count;

    // 50 messages per bucket
    sequence = Math.floor( count / 50 );

    // Append to the recipient's current bucket, creating it if needed
    db.inbox.update(
        { owner: msg.to[recipient], sequence: sequence },
        { $push: { "messages": msg } },
        { upsert: true }
    );
}

// Read my inbox
db.inbox.find( { owner: "Joe" } ).sort( { sequence: -1 } ).limit( 2 )

Page 20: MongoDB Schema Design: Four Real-World Examples

Fan out on write with buckets
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inboxes so there aren’t too many messages per document
• Can shard on recipient, so inbox reads hit one shard
• 1 or 2 documents to read the whole inbox
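
The bucket arithmetic can be checked in isolation. The following is a minimal in-memory sketch: two `Map`s stand in for the `users` and `inbox` collections, and `deliver` is a hypothetical helper that mirrors the counter increment and the upsert-with-push from the slide code.

```javascript
const BUCKET_SIZE = 50; // messages per bucket, as in the slide code

const counters = new Map(); // user_name -> msg_count (stands in for db.users)
const inbox = new Map();    // "owner/sequence" -> bucket (stands in for db.inbox)

function deliver(owner, msg) {
  // Analogue of findAndModify with $inc, upsert and new: true
  const count = (counters.get(owner) || 0) + 1;
  counters.set(owner, count);

  const sequence = Math.floor(count / BUCKET_SIZE);
  const key = owner + "/" + sequence;

  // Analogue of update with $push and upsert
  if (!inbox.has(key)) inbox.set(key, { owner, sequence, messages: [] });
  inbox.get(key).messages.push(msg);
  return sequence;
}

// Deliver 120 messages to one user and see how they spread over buckets.
for (let i = 1; i <= 120; i++) deliver("Bob", { from: "Joe", message: "msg " + i });

const sizes = [...inbox.values()].map((b) => b.messages.length);
console.log(sizes); // prints: [ 49, 50, 21 ]
```

With this formula the first bucket holds 49 messages, because the counter is already 1 when the first message arrives; later buckets hold 50 each.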

Page 21: MongoDB Schema Design: Four Real-World Examples

Fan out on write with buckets - Send

Shard 1 Shard 2 Shard 3

Send Message

Page 22: MongoDB Schema Design: Four Real-World Examples

Fan out on write with buckets - Read

Shard 1 Shard 2 Shard 3

Read Inbox

Page 23: MongoDB Schema Design: Four Real-World Examples

#2 – History

Page 24: MongoDB Schema Design: Four Real-World Examples
Page 25: MongoDB Schema Design: Four Real-World Examples

Design Goals
• Need to retain a limited amount of history, e.g.
  – Hours, Days, Weeks
  – May be a legislative requirement (e.g. HIPAA, SOX, DPA)
• Need to query efficiently by
  – match
  – ranges

Page 26: MongoDB Schema Design: Four Real-World Examples

3 Approaches (there are more)
• Bucket by Number of messages
• Fixed size Array
• Bucket by Date + TTL Collections

Page 27: MongoDB Schema Design: Four Real-World Examples

Inbox – Bucket by # messages

db.inbox.find()
{ owner: "Joe", sequence: 25,
  messages: [
    { from: "Joe",
      to: [ "Bob", "Jane" ],
      sent: ISODate("2013-03-01T09:59:42.689Z"),
      message: "Hi!" },
    …
  ] }

// Query with a date range
db.inbox.find( { owner: "friend1", messages: { $elemMatch: { sent: { $gte: ISODate("…") } } } } )

// Remove elements based on a date
db.inbox.update( { owner: "friend1" }, { $pull: { messages: { sent: { $gte: ISODate("…") } } } } )

Page 28: MongoDB Schema Design: Four Real-World Examples

Considerations
• Removing elements shrinks documents; the freed space can be reclaimed with
  – db.runCommand ( { compact: '<collection>' } )
• Remove the document once the last element in its array has been removed
  – { "_id" : …, "messages" : [ ], "owner" : "friend1", "sequence" : 0 }
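
The two-step cleanup (pull expired elements, then drop buckets left empty) can be sketched in memory. This is a plain JavaScript illustration with objects standing in for inbox documents; `prune` is a hypothetical helper, and it assumes retention means dropping messages sent before a cutoff.

```javascript
// In-memory sketch of bucket cleanup; not a MongoDB API.
// Assumption: retention means dropping messages sent before `cutoff`.
function prune(buckets, cutoff) {
  return buckets
    // analogue of $pull on the messages array
    .map((b) => ({ ...b, messages: b.messages.filter((m) => m.sent >= cutoff) }))
    // analogue of removing documents whose array is now empty,
    // e.g. a remove() with { messages: { $size: 0 } } in the shell
    .filter((b) => b.messages.length > 0);
}

const buckets = [
  { owner: "friend1", sequence: 0, messages: [{ sent: new Date("2013-01-01T00:00:00Z") }] },
  { owner: "friend1", sequence: 1, messages: [
    { sent: new Date("2013-02-01T00:00:00Z") },
    { sent: new Date("2013-03-01T00:00:00Z") },
  ] },
];

const kept = prune(buckets, new Date("2013-02-15T00:00:00Z"));
console.log(kept.map((b) => [b.sequence, b.messages.length])); // prints: [ [ 1, 1 ] ]
```

Bucket 0 disappears entirely because its only message is older than the cutoff; bucket 1 survives with one message.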

Page 29: MongoDB Schema Design: Four Real-World Examples

Maintain the latest – Fixed Size Array

msg = {
    from: "Your Boss",
    to: [ "Bob" ],
    sent: new Date(),
    message: "CALL ME NOW!"
}

// 2.4 introduces $each, $sort and $slice for $push
db.messages.update(
    { _id: 1 },
    { $push: { messages: { $each: [ msg ],
                           $sort: { sent: 1 },
                           $slice: -50 } } }
)

Page 30: MongoDB Schema Design: Four Real-World Examples

Considerations
• Need to compute the size of the array based on retention period
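
One way to size the array is to multiply an expected message rate by the retention period. The sketch below is an assumption about how that calculation might look, not something the slides prescribe: `sliceSizeFor` and `pushCapped` are hypothetical helpers, and `pushCapped` is an in-memory analogue of `$push` with `$each`, `$sort: { sent: 1 }` and a negative `$slice`.

```javascript
// Estimate the $slice size needed to retain `retentionDays` of history,
// given an assumed message rate. Both inputs are application estimates.
function sliceSizeFor(messagesPerDay, retentionDays) {
  return Math.ceil(messagesPerDay * retentionDays);
}

// In-memory analogue of $push with $each, $sort: { sent: 1 }, $slice: -maxSize:
// append, order by sent ascending, keep only the newest maxSize elements.
function pushCapped(messages, msg, maxSize) {
  const next = messages.concat([msg]).sort((a, b) => a.sent - b.sent);
  return next.slice(-maxSize);
}

const maxSize = sliceSizeFor(10, 5); // 10 messages/day kept for 5 days
let history = [];
for (let i = 1; i <= 60; i++) {
  history = pushCapped(history, { sent: new Date(i), message: "msg " + i }, maxSize);
}
console.log(maxSize, history.length, history[0].message); // prints: 50 50 msg 11
```

If the real rate exceeds the estimate, the array covers less time than the retention period, so the estimate should err on the high side.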

Page 31: MongoDB Schema Design: Four Real-World Examples

TTL Collections

// messages: one doc per user per day
db.inbox.findOne()
{ _id: 1, to: "Joe", sequence: ISODate("2013-02-04T00:00:00.392Z"), messages: [ ] }

// Auto expires data after 31536000 seconds = 1 year
db.inbox.ensureIndex( { sequence: 1 }, { expireAfterSeconds: 31536000 } )
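
The two values this approach depends on, the per-day bucket date and the expiry in seconds, are simple to derive. A minimal sketch, assuming buckets are aligned to midnight UTC (the slides do not say which time zone is used); `dayBucket` and `expireAfterSeconds` are hypothetical helper names.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Truncate a timestamp to midnight UTC so all messages from the same day
// share one bucket value (the "sequence" field on the slide).
function dayBucket(date) {
  return new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
}

// Convert a retention period in days to a value for expireAfterSeconds.
function expireAfterSeconds(days) {
  return days * SECONDS_PER_DAY;
}

const bucket = dayBucket(new Date("2013-02-04T15:30:00Z"));
console.log(bucket.toISOString(), expireAfterSeconds(365));
// prints: 2013-02-04T00:00:00.000Z 31536000
```

Because expiry is measured from the indexed date, a whole day's bucket expires together, about one retention period after the start of that day.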

Page 32: MongoDB Schema Design: Four Real-World Examples

#3 – Indexed Attributes

Page 33: MongoDB Schema Design: Four Real-World Examples

Design Goal
• Application needs to store a variable number of attributes, e.g.
  – User defined Form
  – Meta Data tags
• Queries needed
  – Equality
  – Range based
• Need to be efficient, regardless of the number of attributes

Page 34: MongoDB Schema Design: Four Real-World Examples

2 Approaches (there are more)
• Attributes as Embedded Document
• Attributes as Objects in an Array

Page 35: MongoDB Schema Design: Four Real-World Examples

Attributes as a Sub-Document

db.files.insert( { _id: "local.0", attr: { type: "text", size: 64, created: ISODate("...") } } )
db.files.insert( { _id: "local.1", attr: { type: "text", size: 128 } } )
db.files.insert( { _id: "mongod", attr: { type: "binary", size: 256, created: ISODate("...") } } )

// Need to create an index for each item in the sub-document
db.files.ensureIndex( { "attr.type": 1 } )
db.files.find( { "attr.type": "text" } )

// Can perform range queries
db.files.ensureIndex( { "attr.size": 1 } )
db.files.find( { "attr.size": { $gt: 64, $lte: 16384 } } )

Page 36: MongoDB Schema Design: Four Real-World Examples

Considerations
• Each attribute needs an Index
• Each time you extend, you add an index
• Lots and lots of indexes

Page 37: MongoDB Schema Design: Four Real-World Examples

Attributes as Objects in Array

db.files.insert( { _id: "local.0", attr: [ { type: "text" },
                                           { size: 64 },
                                           { created: ISODate("...") } ] } )

db.files.insert( { _id: "local.1", attr: [ { type: "text" },
                                           { size: 128 } ] } )

db.files.insert( { _id: "mongod", attr: [ { type: "binary" },
                                          { size: 256 },
                                          { created: ISODate("...") } ] } )

db.files.ensureIndex( { attr: 1 } )

Page 38: MongoDB Schema Design: Four Real-World Examples

Considerations
• Only one index needed on attr
• Can support range queries, etc.
• Index can be used only once per query
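
Moving from the sub-document layout to the array layout is a mechanical transformation. The sketch below shows it in plain JavaScript; `toAttrArray` and `hasAttr` are hypothetical helpers, and `hasAttr` only imitates an equality match in memory rather than using an index.

```javascript
// Convert { type: "text", size: 64 } into [ { type: "text" }, { size: 64 } ].
function toAttrArray(attr) {
  return Object.keys(attr).map((name) => ({ [name]: attr[name] }));
}

// In-memory analogue of an equality match on one attribute.
function hasAttr(doc, name, value) {
  return doc.attr.some((a) => a[name] === value);
}

const files = [
  { _id: "local.0", attr: toAttrArray({ type: "text", size: 64 }) },
  { _id: "local.1", attr: toAttrArray({ type: "text", size: 128 }) },
  { _id: "mongod", attr: toAttrArray({ type: "binary", size: 256 }) },
];

const textFiles = files.filter((f) => hasAttr(f, "type", "text")).map((f) => f._id);
console.log(textFiles); // prints: [ 'local.0', 'local.1' ]
```

Because every attribute becomes an element of the same array, adding a new attribute name changes the data but not the set of indexes.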

Page 39: MongoDB Schema Design: Four Real-World Examples

#4 – Multiple Identities

Page 40: MongoDB Schema Design: Four Real-World Examples

Design Goal
• Ability to look up by a number of different identities, e.g.
  – Username
  – Email address
  – FB Handle
  – LinkedIn URL

Page 41: MongoDB Schema Design: Four Real-World Examples

2 Approaches (there are more)
• Identifiers in a single document
• Separate Identifiers from Content

Page 42: MongoDB Schema Design: Four Real-World Examples

Single Document by User

db.users.findOne()
{ _id: "joe",
  email: "[email protected]",
  fb: "joe.smith",   // facebook
  li: "joe.e.smith", // linkedin
  other: {…} }

// Shard collection by _id
sh.shardCollection( "mongodbdays.users", { _id: 1 } )

// Create indexes on each key
db.users.ensureIndex( { email: 1 } )
db.users.ensureIndex( { fb: 1 } )
db.users.ensureIndex( { li: 1 } )

Page 43: MongoDB Schema Design: Four Real-World Examples

Read by _id (shard key)

Shard 1 Shard 2 Shard 3

find( { _id: "joe"} )

Page 44: MongoDB Schema Design: Four Real-World Examples

Read by email (non-shard key)

Shard 1 Shard 2 Shard 3

find( { email: "[email protected]" } )

Page 45: MongoDB Schema Design: Four Real-World Examples

Considerations
• Lookup by shard key is routed to 1 shard
• Lookup by other identifier is scatter-gathered across all shards
• Secondary keys cannot have a unique index

Page 46: MongoDB Schema Design: Four Real-World Examples

Document per Identity

// Create unique index
db.identities.ensureIndex( { identifier: 1 }, { unique: true } )

// Create a document for each of the user's identities
db.identities.save( { identifier: { hndl: "joe" }, user: "1200-42" } )
db.identities.save( { identifier: { email: "[email protected]" }, user: "1200-42" } )
db.identities.save( { identifier: { li: "joe.e.smith" }, user: "1200-42" } )

// Shard collection by identifier
sh.shardCollection( "mydb.identities", { identifier: 1 } )

// Create unique index
db.users.ensureIndex( { _id: 1 }, { unique: true } )

// Shard collection by _id
sh.shardCollection( "mydb.users", { _id: 1 } )

Page 47: MongoDB Schema Design: Four Real-World Examples

Read requires 2 reads

Shard 1 Shard 2 Shard 3

db.identities.find({"identifier" : { "hndl" : "joe" }})

db.users.find( { _id: "1200-42"} )

Page 48: MongoDB Schema Design: Four Real-World Examples

Considerations
• Lookup to Identities is a routed query
• Lookup to Users is a routed query
• Unique indexes available
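
The two-read lookup and the uniqueness guarantee can be modelled with two maps. This is an in-memory sketch only: the `Map`s stand in for the `identities` and `users` collections, `addIdentity` and `findUser` are hypothetical helpers, and the `name` field on the user document is made up for illustration.

```javascript
const identities = new Map(); // "kind:value" -> user _id (stands in for db.identities)
const users = new Map();      // _id -> user document (stands in for db.users)

// Mimics the unique index: a second document with the same identifier is rejected.
function addIdentity(kind, value, userId) {
  const key = kind + ":" + value;
  if (identities.has(key)) throw new Error("duplicate identifier: " + key);
  identities.set(key, userId);
}

function findUser(kind, value) {
  const userId = identities.get(kind + ":" + value); // read 1, keyed by identifier
  if (userId === undefined) return null;
  return users.get(userId) || null;                  // read 2, keyed by _id
}

users.set("1200-42", { _id: "1200-42", name: "Joe" }); // "name" is a made-up field
addIdentity("hndl", "joe", "1200-42");
addIdentity("li", "joe.e.smith", "1200-42");

console.log(findUser("li", "joe.e.smith")._id); // prints: 1200-42
```

Each read is keyed by the field its collection is sharded on, which is why both lookups can be routed to a single shard.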

Page 49: MongoDB Schema Design: Four Real-World Examples

Conclusion

Page 50: MongoDB Schema Design: Four Real-World Examples

Summary
• Multiple ways to model a domain problem
• Understand the key use cases of your app
• Balance between ease of query vs. ease of write
• Random IO should be avoided

Page 51: MongoDB Schema Design: Four Real-World Examples

Perl Engineer & Evangelist, 10gen

Mike Friedman

#MongoDBdays

Thank You

