
How to Achieve Scale with MongoDB

Date post: 01-Nov-2014
Upload: mongodb
Description:
Learn how to achieve scale with MongoDB. In this presentation, we cover three different ways to scale MongoDB, including optimization, vertical scaling, and horizontal scaling.
Transcript
Page 1: How to Achieve Scale with MongoDB

Sr. Solutions Architect, MongoDB

Jake Angerman

How to Achieve Scale with MongoDB

Page 2: How to Achieve Scale with MongoDB

Today’s Webinar Agenda

Achieve Scale

1. Optimization Tips
   – Schema Design
   – Indexes
   – Monitoring your Workload
2. Scale Vertically
3. Horizontal Scaling

Page 3: How to Achieve Scale with MongoDB

Optimization Tips to Scale Your App

Page 4: How to Achieve Scale with MongoDB

Premature Optimization

• "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

- Donald Knuth, 1974

Page 7: How to Achieve Scale with MongoDB

Schema Design

• Document Model

• Dynamic Schema

• Collections

{
  "customer_id": 123,
  "first_name": "John",
  "last_name": "Smith",
  "address": {
    "street": "123 Main Street",
    "city": "Houston",
    "state": "TX",
    "zip_code": "77027"
  },
  policies: [
    { policy_number: 13, description: "short term", deductible: 500 },
    { policy_number: 14, description: "dental", visits: […] }
  ]
}

Page 8: How to Achieve Scale with MongoDB

The Importance of Schema Design

• MongoDB schemas are built the opposite way from relational schemas!

• Relational Schema:
  – normalize data
  – write complex queries to join the data
  – let the query planner figure out how to make queries efficient

• MongoDB Schema:
  – denormalize the data
  – create a (potentially complex) schema with prior knowledge of your actual (not just predicted) query patterns
  – write simple queries

Page 9: How to Achieve Scale with MongoDB

Real World Example: Optimizing Schema for Scale

Product catalog schema for retailer selling in 20 countries

{
  _id: 375,
  en_US: { name: …, description: …, <etc…> },
  en_GB: { name: …, description: …, <etc…> },
  fr_FR: { name: …, description: …, <etc…> },
  fr_CA: { name: …, description: …, <etc…> },
  de_DE: …,
  de_CH: …,
  <… and so on for other locales …>
}

Page 10: How to Achieve Scale with MongoDB

What's good about this schema?

• Each document contains all the data about the product across all possible locales.

• It is the most efficient way to retrieve all translations of a product in a single query (English, French, German, etc).

Page 11: How to Achieve Scale with MongoDB

But that's not how the data was accessed

db.catalog.find( { _id: 375 }, { en_US: true } );

db.catalog.find( { _id: 375 }, { fr_FR: true } );

db.catalog.find( { _id: 375 }, { de_DE: true } );

… and so forth for other locales

The data model did not fit the access pattern.

Page 12: How to Achieve Scale with MongoDB

Why is this inefficient?

Data in RED are being used. Data in BLUE take up memory but are not in demand.

{
  _id: 375,
  en_US: { name: …, description: …, <etc…> },
  en_GB: { name: …, description: …, <etc…> },
  fr_FR: { name: …, description: …, <etc…> },
  fr_CA: { name: …, description: …, <etc…> },
  de_DE: …,
  de_CH: …,
  <… and so on for other locales …>
}

{
  _id: 42,
  en_US: { name: …, description: …, <etc…> },
  en_GB: { name: …, description: …, <etc…> },
  fr_FR: { name: …, description: …, <etc…> },
  fr_CA: { name: …, description: …, <etc…> },
  de_DE: …,
  de_CH: …,
  <… and so on for other locales …>
}

Page 13: How to Achieve Scale with MongoDB

Consequences of the schema

• Each document contained 20x more data than the common use case requires

• Disk IO was too high for the relatively modest query load on the dataset

• MongoDB lets you request a subset of a document's contents via projection…

• … but the entire document must be loaded into RAM to service the request

Page 14: How to Achieve Scale with MongoDB

Consequences of the schema redesign

{
  _id: "375-en_GB",
  name: …,
  description: …,
  <… the rest of the document …>
}

• Queries induced minimal memory overhead

• 20x as many distinct products fit in RAM at once

• Disk IO utilization reduced

• Application latency reduced
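
The redesign above can be sketched as a small reshaping step. This is a minimal sketch, assuming a plain JavaScript object shaped like the original catalog document; the function name splitByLocale and the sample field values are hypothetical, and only the "375-en_GB" style of _id comes from the slide.

```javascript
// Hypothetical helper: turn one multi-locale catalog document into
// one document per locale, keyed by "<productId>-<locale>".
function splitByLocale(doc) {
  const out = [];
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") continue; // every other top-level key is assumed to be a locale
    out.push({ _id: `${doc._id}-${key}`, ...value });
  }
  return out;
}

// Example input with made-up field values.
const product = {
  _id: 375,
  en_US: { name: "Color", description: "US English text" },
  en_GB: { name: "Colour", description: "British English text" },
};

const perLocale = splitByLocale(product);
console.log(perLocale.map((d) => d._id)); // [ '375-en_US', '375-en_GB' ]
```

With documents shaped this way, a lookup for one locale can match on _id alone, for example db.catalog.find( { _id: "375-en_GB" } ), so only that locale's document needs to be in RAM.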

Page 15: How to Achieve Scale with MongoDB

Schema Design Patterns

• Pattern: pre-computing interesting quantities, ideally with each write operation

• Pattern: putting unrelated items in different collections to take advantage of indexing

• Anti-pattern: appending to arrays ad infinitum

• Anti-pattern: importing relational schemas directly into MongoDB
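
The first pattern above (pre-computing quantities with each write) might look like the following. This is a sketch under assumptions: the product-review scenario, the field names reviewCount and ratingSum, and both helper functions are hypothetical. The only MongoDB feature relied on is the $inc update operator.

```javascript
// Hypothetical helper: build the update document for recording one new
// review, keeping running totals on the product document itself.
function buildReviewUpdate(rating) {
  if (!Number.isInteger(rating) || rating < 1 || rating > 5) {
    throw new RangeError("rating must be an integer from 1 to 5");
  }
  return {
    // $inc is MongoDB's atomic increment operator; both counters are
    // updated in the same write that records the review.
    $inc: { reviewCount: 1, ratingSum: rating },
  };
}

// The average is derived from two stored numbers at read time
// instead of scanning every review.
function averageRating(product) {
  return product.reviewCount === 0 ? null : product.ratingSum / product.reviewCount;
}

const update = buildReviewUpdate(4);
console.log(JSON.stringify(update)); // {"$inc":{"reviewCount":1,"ratingSum":4}}
console.log(averageRating({ reviewCount: 4, ratingSum: 18 })); // 4.5
```

The update document would be passed to an update call on the (assumed) products collection. The reviews themselves are assumed to live in their own collection, which also avoids the unbounded-array anti-pattern.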

Page 16: How to Achieve Scale with MongoDB

Schema Design Tips

• Avoid inherently slow operations
  – Updates of unindexed arrays of several thousand elements
  – Updates of indexed arrays of several hundred elements
  – Document moves

• Arrays are great, but know how to use them

Page 17: How to Achieve Scale with MongoDB

Schema Design resources

• Blog series, "6 rules of thumb"
  – Part 1: http://goo.gl/TFJ3dr
  – Part 2: http://goo.gl/qTdGhP
  – Part 3: http://goo.gl/JFO1pI

Page 18: How to Achieve Scale with MongoDB

Indexing

• Indexes are tree-structured sets of references to your documents

• Indexes are the single biggest tunable performance factor in the database

• Indexing and schema design go hand in hand

Page 19: How to Achieve Scale with MongoDB

Indexing Mistakes

• Failing to build necessary indexes

• Building unnecessary indexes

• Running ad-hoc queries in production

Page 20: How to Achieve Scale with MongoDB

Indexing Fixes

• Failing to build necessary indexes
  – Run .explain(), examine slow query log, mtools, system.profile collection

• Building unnecessary indexes
  – Talk to your application developers about usage

• Running ad-hoc queries in production
  – Use a staging environment, use secondaries

Page 21: How to Achieve Scale with MongoDB

mongod log files

Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms

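Fields in the log line:
– date and time: Sun Jun 29 06:35:37.646
– thread: [conn2]
– operation: query
– namespace: test.docs
– n… counters: ntoreturn, ntoskip, nscanned, nreturned
– lock times: locks(micros) r:2145254
– duration: 1156ms
– number of yields: numYields: 5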
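
The fields called out above can also be pulled apart programmatically. The following is a minimal sketch, not a replacement for a real log tool: parseQueryLogLine is a hypothetical helper, its regular expressions only cover the fields visible in this one log line, and the threshold used to flag a suspicious scan is an arbitrary assumption.

```javascript
// Sketch: extract a few fields from a mongod query log line of the
// format shown on the slide.
function parseQueryLogLine(line) {
  const num = (name) => {
    const m = line.match(new RegExp(`\\b${name}:\\s*(\\d+)`));
    return m ? Number(m[1]) : null;
  };
  const duration = line.match(/(\d+)ms\s*$/);
  const op = line.match(/\]\s+(\w+)\s+(\S+)/); // operation, then namespace
  return {
    operation: op ? op[1] : null,
    namespace: op ? op[2] : null,
    nscanned: num("nscanned"),
    nreturned: num("nreturned"),
    numYields: num("numYields"),
    durationMs: duration ? Number(duration[1]) : null,
  };
}

const line =
  'Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", ' +
  'parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 ' +
  "numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms";

const parsed = parseQueryLogLine(line);

// Assumed heuristic: many scanned entries for few returned documents
// suggests a missing or unused index.
const suspicious =
  parsed.nscanned > 1000 && parsed.nscanned > 100 * Math.max(parsed.nreturned, 1);

console.log(parsed, suspicious);
```

For this line the sketch reports 806381 scanned entries, 0 returned documents, and a duration of 1156 ms, which is the pattern that motivates the indexing example later in the deck.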

Page 23: How to Achieve Scale with MongoDB

You need a tool when doing log file analysis

Page 24: How to Achieve Scale with MongoDB

mtools

• http://github.com/rueckstiess/mtools

• log file analysis for poorly performing queries
  – Show me queries that took more than 1000 ms from 6 am to 6 pm:
  – mlogfilter mongodb.log --from 06:00 --to 18:00 --slow 1000 > mongodb-filtered.log

Page 25: How to Achieve Scale with MongoDB

Graphing with mtools

% mplotqueries --type histogram --group namespace --bucketSize 3600

Page 26: How to Achieve Scale with MongoDB

Real World Example: Indexing for Scale

Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms

Page 27: How to Achieve Scale with MongoDB

Document schema

{
  _id: ObjectId("53b9ab7e939f1e229b4f574c"),
  firstName: "Alice",
  lastName: "Smith",
  parent: {
    company: 22794,
    employeeId: 83881
  }
}

Page 28: How to Achieve Scale with MongoDB

But there's an index!?!

db.system.indexes.find().toArray()

[{
  "v" : 1,
  "key" : {
    "company" : 1,
    "employeeId" : 1
  },
  "ns" : "test.docs",
  "name" : "company_1_employeeId_1"
}]

This isn't the index you're looking for.

Page 30: How to Achieve Scale with MongoDB

Did you see the problem?

{
  _id: ObjectId("53b9ab7e939f1e229b4f574c"),
  firstName: "Alice",
  lastName: "Smith",
  parent: {
    company: 22794,
    employeeId: 83881
  }
}

Page 31: How to Achieve Scale with MongoDB

The index was created incorrectly

db.system.indexes.find().toArray()

[{
  "v" : 1,
  "key" : {
    "parent.company" : 1,
    "parent.employeeId" : 1
  },
  "ns" : "test.docs",
  "name" : "parent.company_1_parent.employeeId_1"
}]

Subdocument needed
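
Why the first index never matched can be illustrated with a dotted-path lookup. This is a sketch in plain JavaScript: resolvePath is a hypothetical helper that mimics the dotted-path idea used for embedded fields, not MongoDB's own implementation.

```javascript
// Sketch: resolve an index key such as "parent.company" against a document.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (value, part) =>
      value != null && typeof value === "object" ? value[part] : undefined,
    doc
  );
}

const employee = {
  firstName: "Alice",
  lastName: "Smith",
  parent: { company: 22794, employeeId: 83881 },
};

// The original index keyed on top-level fields that do not exist.
console.log(resolvePath(employee, "company")); // undefined
// The corrected index keys reach into the parent subdocument.
console.log(resolvePath(employee, "parent.company")); // 22794
```

With the shell helper used elsewhere in this deck, the corrected index would be built as db.docs.ensureIndex( { "parent.company": 1, "parent.employeeId": 1 } ).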

Page 32: How to Achieve Scale with MongoDB

Indexing Strategies

• Create indexes that support your queries!

• Create highly selective indexes

• Eliminate duplicate indexes with a compound index, if possible
  – db.collection.ensureIndex({A:1, B:1, C:1})
  – allows queries using leftmost prefix

• Order compound index fields as follows: equality, sort, then range
  – see http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/

• Create indexes that support covered queries

• Prevent collection scans in pre-production environments
  – mongod --notablescan
  – db.getSiblingDB("admin").runCommand( { setParameter: 1, notablescan: 1 } )
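
The equality, sort, range ordering and the leftmost-prefix rule can be sketched in a few lines. Both helpers below are hypothetical, and the field names status, createdAt, and price are made up. The prefix check is deliberately simplified: it only tests whether the query's fields are exactly a leftmost prefix of the index keys, whereas the real query planner considers more cases.

```javascript
// Sketch: order compound index fields as equality, then sort, then range.
// `sort` holds [field, direction] pairs; direction is 1 or -1.
function buildCompoundIndex({ equality = [], sort = [], range = [] }) {
  const spec = {};
  for (const field of equality) spec[field] = 1;
  for (const [field, direction] of sort) {
    if (!(field in spec)) spec[field] = direction;
  }
  for (const field of range) {
    if (!(field in spec)) spec[field] = 1;
  }
  return spec;
}

// Simplified check: do the query fields form a leftmost prefix of the index keys?
function usesLeftmostPrefix(indexSpec, queryFields) {
  const fields = [...new Set(queryFields)];
  const prefix = Object.keys(indexSpec).slice(0, fields.length);
  return (
    fields.length > 0 &&
    fields.length === prefix.length &&
    fields.every((f) => prefix.includes(f))
  );
}

const spec = buildCompoundIndex({
  equality: ["status"],
  sort: [["createdAt", -1]],
  range: ["price"],
});
console.log(JSON.stringify(spec)); // {"status":1,"createdAt":-1,"price":1}

const abc = { A: 1, B: 1, C: 1 };
console.log(usesLeftmostPrefix(abc, ["A", "B"])); // true
console.log(usesLeftmostPrefix(abc, ["B"])); // false
```

This is why a single index on {A:1, B:1, C:1} can stand in for separate indexes on {A:1} and {A:1, B:1}, but not for one on {B:1}.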

Page 33: How to Achieve Scale with MongoDB

Monitoring Your Workload

• Log files, iostat, mtools, mongotop are for debugging

• MongoDB Management Service (MMS) can do metrics collection and reporting

Page 34: How to Achieve Scale with MongoDB

What can MMS do?

Page 35: How to Achieve Scale with MongoDB

Database Metrics

Page 36: How to Achieve Scale with MongoDB

Hardware statistics (CPU, disk)

Page 37: How to Achieve Scale with MongoDB

MMS Monitoring Setup

Page 38: How to Achieve Scale with MongoDB

Cloud Version of MMS

1. Go to http://mms.mongodb.com

2. Create an account

3. Install one agent in your datacenter

4. Add hosts from the web interface

5. Enjoy!

Page 39: How to Achieve Scale with MongoDB

Today’s Webinar Agenda

Achieve Scale

1. Optimization Tips
2. Scale Vertically
   – Hardware Considerations
3. Horizontal Scaling

Page 40: How to Achieve Scale with MongoDB

Vertical Scaling

Factors:
  – RAM
  – Disk
  – CPU
  – Network

Horizontal Scaling

[Diagram: two replica sets, each with one Primary and two Secondaries]

Page 41: How to Achieve Scale with MongoDB

Working Set Exceeds Physical Memory

Page 42: How to Achieve Scale with MongoDB

RAM - Measure your working set and index sizes

• db.serverStatus({workingSet:1}).workingSet
  { "computationTimeMicros": 2751, "note": "thisIsAnEstimate", "overSeconds": 1084, "pagesInMemory": 2041 }

• db.stats().indexSize
  2032880640

• In this example, (2041 * 4096) + 2032880640 = 2041240576 bytes = 1.9 GB

• Note: this is a subset of the virtual memory used by mongod
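
The arithmetic on this slide can be reproduced as follows. This is a minimal sketch: estimateRamBytes is a hypothetical helper, and the 4096-byte page size is taken from the slide's own calculation rather than queried from the system.

```javascript
const PAGE_SIZE_BYTES = 4096; // page size used in the slide's arithmetic

// Working-set pages converted to bytes, plus the total index size.
function estimateRamBytes(pagesInMemory, indexSizeBytes) {
  return pagesInMemory * PAGE_SIZE_BYTES + indexSizeBytes;
}

// Values copied from the serverStatus and db.stats() output above.
const bytes = estimateRamBytes(2041, 2032880640);
console.log(bytes); // 2041240576
console.log((bytes / 2 ** 30).toFixed(1) + " GB"); // 1.9 GB
```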

Page 43: How to Achieve Scale with MongoDB

Real World Example: Vertical Scaling

• System that tracked status information for entities in the business

• State changes happen in batches; sometimes 10% of entities get updated, sometimes 100% get updated

Page 44: How to Achieve Scale with MongoDB

Initial Architecture

Sharded cluster with 4 shards using spinning disks

[Diagram: Application / mongos, mongod]

Page 45: How to Achieve Scale with MongoDB

Adding shards to scale horizontally

• Application was a success! Business entities grew by a factor of 5

• Cluster capacity multiplied by 5, but so did the TCO

[Diagram: Application / mongos, …16 more shards…, mongod]

Page 46: How to Achieve Scale with MongoDB

More success means more shards

• 10x growth means … 200 shards

• Horizontal scaling with sharding scales linearly, but an order-of-magnitude improvement was needed

• Bulk updates of random documents approach the speed of the disks
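
The shard counts in this example follow from simple proportional growth. The sketch below assumes that capacity scales linearly with shard count, as the slide states; shardsNeeded is a hypothetical helper.

```javascript
// Sketch: with linear scaling, shard count grows in proportion to the load.
function shardsNeeded(currentShards, growthFactor) {
  return Math.ceil(currentShards * growthFactor);
}

const afterFirstGrowth = shardsNeeded(4, 5); // initial 4 shards, 5x growth
const afterNextGrowth = shardsNeeded(afterFirstGrowth, 10); // a further 10x
console.log(afterFirstGrowth, afterNextGrowth); // 20 200
```

Since hardware cost grows with shard count, each further multiple of load multiplies the TCO by the same factor, which is what made a per-node (vertical) improvement attractive here.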

Page 47: How to Achieve Scale with MongoDB

Final architecture

• Scaling the random IOPS with SSDs was a vertical scaling approach

[Diagram: Application / mongos, mongod on SSD]

Page 48: How to Achieve Scale with MongoDB

Before you add hardware…

• Make sure you are solving the right scaling problem

• Remedy schema and index problems first
  – schema and index problems can look like hardware problems

• Tune the Operating System
  – ulimits, swap, NUMA, NOOP scheduler with hypervisors

• Tune the IO subsystem
  – ext4 or XFS vs SAN, RAID10, readahead, noatime

• See MongoDB "production notes" page

• Heed logfile startup warnings

Page 49: How to Achieve Scale with MongoDB

Today’s Webinar Agenda

Achieve Scale

1. Optimization Tips
2. Scale Vertically
3. Horizontal Scaling
   – The Basics of Sharding

Page 50: How to Achieve Scale with MongoDB

The basics of Horizontal Scaling

Page 51: How to Achieve Scale with MongoDB

The basics of Horizontal Scaling (aka Sharding)

Page 52: How to Achieve Scale with MongoDB

The Basics of Sharding

Page 53: How to Achieve Scale with MongoDB

Rule of Thumb

To make good decisions about MongoDB implementations, you must understand MongoDB, your applications, the workload your applications generate, and your business requirements.

Page 54: How to Achieve Scale with MongoDB

Summary

• Don't throw hardware at the problem until you examine all other possibilities (schema, indexes, OS, IO subsystem)

• Know what is considered "normal" performance by monitoring

• Horizontal scaling in MongoDB is implemented with sharding, but you must understand schema design and indexing before you shard

Sharding a sub-optimally designed database will not make it performant

Page 55: How to Achieve Scale with MongoDB

Today’s Webinar Agenda

Achieve Scale

1. Optimization Tips
   – Schema Design
   – Indexes
   – Monitoring your Workload
2. Scale Vertically
3. Horizontal Scaling
   – The Basics of Sharding

Page 56: How to Achieve Scale with MongoDB

Limited Time: Get Expert Advice for Free

If you’re thinking about scaling, why reinvent the wheel?

Our experts can collaborate with you to provide detailed guidance.

Sign Up For a Free One Hour Consult:

http://bit.ly/1rkXcfN

Page 57: How to Achieve Scale with MongoDB

Questions?

Stay tuned after the webinar and take our survey for your chance to win MongoDB schwag.

Page 58: How to Achieve Scale with MongoDB

Sr. Solutions Architect, MongoDB

Jake Angerman

Thank You

