MongoDB Indexing: The Details

Date post: 08-Jul-2015
Aaron Staple's presentation from MongoSV
MongoDB Indexing and Query Optimizer Details Aaron Staple MongoSV December 3, 2010
Transcript
Page 1: MongoDB Indexing: The Details

MongoDB Indexing and Query Optimizer Details

Aaron Staple

MongoSV

December 3, 2010

Page 2: MongoDB Indexing: The Details

What will we cover?

• Many details of how indexing and the query optimizer work

• A full understanding of these details is not required to use mongo, but this knowledge can be helpful when making optimizations.

• We’ll discuss functionality of Mongo 1.8 (for our purposes pretty similar to 1.6 and almost identical to 1.7 edge).

• Much of the material will be presented through examples.

• Diagrams are to aid understanding – some details will be left out.

Page 3: MongoDB Indexing: The Details

What will we cover?

• Basic index bounds

• Compound key index bounds

• Or queries

• Automatic index selection

Page 4: MongoDB Indexing: The Details

How will we cover it?

• We’re going to try to cover this material interactively - please volunteer your thoughts on what mongo should do in given scenarios when I ask.

• Pertinent questions are welcome, but please hold off-topic or specialized questions until the end so we don’t lose momentum.

Page 5: MongoDB Indexing: The Details

Btree (just a conceptual diagram)

1

2

3 4

5

6

7

8 9

{_id:4,x:6}

Page 6: MongoDB Indexing: The Details

Basic Index Bounds

Page 7: MongoDB Indexing: The Details

Find One Document

• db.c.find( {x:6} ).limit( 1 )

• Index {x:1}

Page 8: MongoDB Indexing: The Details

Find One Document

1 2 3 4 5 6 7 8 9

6 ?

{_id:4,x:6}

Page 9: MongoDB Indexing: The Details

Find One Document

>db.c.find( {x:6} ).limit( 1 ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

}

Page 10: MongoDB Indexing: The Details

Find One Document

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

Page 11: MongoDB Indexing: The Details

Find One Document

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

Page 12: MongoDB Indexing: The Details

Find One Document

1

2

3 4

5

6

7

8 9

6 ?

{_id:4,x:6}

Page 13: MongoDB Indexing: The Details

Find One Document

1

2

3 4

5

6

7

8 9

6 ?

{_id:4,x:6}

Page 14: MongoDB Indexing: The Details

Find One Document

1 2 3 4 5 6 6 6 9

6 ?

{_id:4,x:6}

Now we have duplicate x values.

Page 15: MongoDB Indexing: The Details

Find One Document

1

2

3 4

5

6

6

6 9

6 ?

{_id:4,x:6}

Page 16: MongoDB Indexing: The Details

Equality Match

• db.c.find( {x:6} )

• Index {x:1}

Page 17: MongoDB Indexing: The Details

Equality Match

1 2 3 4 5 6 6 6 9

6 ?

{_id:4,x:6} {_id:5,x:6}

{_id:1,x:6}

Page 18: MongoDB Indexing: The Details

Equality Match

>db.c.find( {x:6} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

}

Page 19: MongoDB Indexing: The Details

Equality Match

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

Page 20: MongoDB Indexing: The Details

Equality Match

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 3,

Page 21: MongoDB Indexing: The Details

Equality Match

1

2

3 4

5

6

6

6 9

6 ?

Page 22: MongoDB Indexing: The Details

Full Document Matcher

• db.c.find( {x:6,y:1} )

• Index {x:1}

Page 23: MongoDB Indexing: The Details

Full Document Matcher

1 2 3 4 5 6 6 6 9

6 ?

{y:4,x:6} {y:5,x:6}

{y:1,x:6}

Page 24: MongoDB Indexing: The Details

Full Document Matcher

>db.c.find( {x:6,y:1} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

}

Page 25: MongoDB Indexing: The Details

Full Document Matcher

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

Page 26: MongoDB Indexing: The Details

Full Document Matcher

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 1,

Documents for all matching keys are scanned, but only one document matched on non-index keys.
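
The three counters on this slide can be reproduced with a small simulation. This is only a sketch, assuming the scan visits every index key inside the bounds, loads the matching document, and then applies the full query to it. The function name `findWithIndex` and the sample documents are made up for illustration and are not MongoDB internals.

```javascript
// Hypothetical collection: three documents share x:6, as on the slide.
const docs = [
  { x: 3, y: 1 },
  { x: 6, y: 4 },
  { x: 6, y: 5 },
  { x: 6, y: 1 },
  { x: 9, y: 1 },
];

// Simulates an equality scan on an index over `indexedField`,
// followed by the full document matcher.
function findWithIndex(docs, query, indexedField) {
  const stats = { nscanned: 0, nscannedObjects: 0, n: 0 };
  for (const doc of docs) {
    // Keys outside the index bounds are never visited.
    if (doc[indexedField] !== query[indexedField]) continue;
    stats.nscanned++;        // index key inside the bounds
    stats.nscannedObjects++; // document loaded to check non-indexed fields
    const matches = Object.keys(query).every((f) => doc[f] === query[f]);
    if (matches) stats.n++;
  }
  return stats;
}

const stats = findWithIndex(docs, { x: 6, y: 1 }, "x");
console.log(stats); // { nscanned: 3, nscannedObjects: 3, n: 1 }
```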

Page 27: MongoDB Indexing: The Details

Range Match

• db.c.find( {x:{$gte:4,$lte:7}} )

• Index {x:1}

Page 28: MongoDB Indexing: The Details

Range Match

1 2 3 4 5 6 7 8 9

4 <= ? <= 7

Page 29: MongoDB Indexing: The Details

Range Match

>db.c.find( {x:{$gte:4,$lte:7}} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 4,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

4,

7

]

]

}

}

Page 30: MongoDB Indexing: The Details

Range Match

"indexBounds" : {

"x" : [

[

4,

7

]

]

Page 31: MongoDB Indexing: The Details

Range Match

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 4,

Page 32: MongoDB Indexing: The Details

Range Match

1

2

3 4

5

6

7

8 9

Page 33: MongoDB Indexing: The Details

Exclusive Range Match

• db.c.find( {x:{$gt:4,$lt:7}} )

• Index {x:1}

Page 34: MongoDB Indexing: The Details

Exclusive Range Match

1 2 3 4 5 6 7 8 9

4 < ? < 7

Page 35: MongoDB Indexing: The Details

Exclusive Range Match

>db.c.find( {x:{$gt:4,$lt:7}} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

4,

7

]

]

}

}

Page 36: MongoDB Indexing: The Details

Exclusive Range Match

"indexBounds" : {

"x" : [

[

4,

7

]

]

}

Explain doesn’t indicate that the range is exclusive.

Page 37: MongoDB Indexing: The Details

Exclusive Range Match

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 2,

But index keys matching the range bounds are not scanned because the bounds are exclusive.
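
To make the contrast with the inclusive range concrete, here is a minimal sketch over the keys 1 to 9. It assumes a simplified model in which `nscanned` is just the number of keys inside the range; `rangeScan` is a hypothetical helper, not a server function. Both calls report the same bounds, as explain does, but scan a different number of keys.

```javascript
const keys = [1, 2, 3, 4, 5, 6, 7, 8, 9];

// Returns what explain() would report in this simplified model.
function rangeScan(keys, lo, hi, inclusive) {
  const scanned = keys.filter((k) =>
    inclusive ? k >= lo && k <= hi : k > lo && k < hi
  );
  // The reported bounds do not record whether the ends are inclusive.
  return { indexBounds: [[lo, hi]], nscanned: scanned.length };
}

const inclusive = rangeScan(keys, 4, 7, true);  // {$gte:4,$lte:7}
const exclusive = rangeScan(keys, 4, 7, false); // {$gt:4,$lt:7}
console.log(inclusive.nscanned, exclusive.nscanned); // 4 2
```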

Page 38: MongoDB Indexing: The Details

Exclusive Range Match

1

2

3 4

5

6

7

8 9

Page 39: MongoDB Indexing: The Details

Multikeys

• db.c.find( {x:{$gt:7}} )

• Index {x:1}

Page 40: MongoDB Indexing: The Details

Multikeys

1 2 3 4 5 6 7 8 9

? > 7

{_id:4,x:[8,9]}

Page 41: MongoDB Indexing: The Details

Multikeys

>db.c.find( {x:{$gt:7}} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

7,

1.7976931348623157e+308

]

]

}

}

Page 42: MongoDB Indexing: The Details

Multikeys

"indexBounds" : {

"x" : [

[

7,

1.7976931348623157e+308

]

]

}

Page 43: MongoDB Indexing: The Details

Multikeys

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 1,

All keys in valid range are scanned, but the matcher rejects duplicate documents making n == 1.
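
The de-duplication can be pictured with index entries stored as (key, document id) pairs. The sketch below assumes that model; only document 4 with x:[8,9] comes from the slide, and the other ids and the helper `findGreaterThan` are hypothetical.

```javascript
// Keys 1..7 belong to hypothetical single-valued documents.
const index = [1, 2, 3, 4, 5, 6, 7].map((k) => ({ key: k, id: 100 + k }));
// Document 4 has x:[8,9], so it contributes two index keys.
index.push({ key: 8, id: 4 }, { key: 9, id: 4 });

function findGreaterThan(index, bound) {
  const seen = new Set();
  let nscanned = 0;
  for (const entry of index) {
    if (entry.key <= bound) continue; // outside the index bounds
    nscanned++;
    seen.add(entry.id); // a Set keeps each document id only once
  }
  return { nscanned, n: seen.size };
}

const result = findGreaterThan(index, 7);
console.log(result); // { nscanned: 2, n: 1 }
```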

Page 44: MongoDB Indexing: The Details

Multikeys

1

2

3 4

5

6

7

8 9

Page 45: MongoDB Indexing: The Details

Range Types

• Explicit inequality

• db.c.find( {x:{$gt:4,$lt:7}} )

• db.c.find( {x:{$gt:4}} )

• db.c.find( {x:{$ne:4}} )

• Regular expression prefix

• db.c.find( {x:/^a/} )

• Data type

• db.c.find( {x:/a/} )

Page 46: MongoDB Indexing: The Details

Range Types

db.c.find( {x:{$gt:4,$lt:7}} )

"indexBounds" : {

"x" : [

[

4,

7

]

]

}

Page 47: MongoDB Indexing: The Details

Range Types

db.c.find( {x:{$gt:4}} )

"indexBounds" : {

"x" : [

[

4,

1.7976931348623157e+308

]

]

}

Page 48: MongoDB Indexing: The Details

Range Types

db.c.find( {x:{$ne:4}} )

"indexBounds" : {

"x" : [

[

{

"$minElement" : 1

},

4

],

[

4,

{

"$maxElement" : 1

}

]

]

}

Page 49: MongoDB Indexing: The Details

Range Types

db.c.find( {x:/^a/} )

"indexBounds" : {"x" : [

["a","b"

],[

/^a/,/^a/

]]

}

Page 50: MongoDB Indexing: The Details

Range Types

db.c.find( {x:/a/} )

"indexBounds" : {

"x" : [

[

"",

{

}

],

[

/a/,

/a/

]

]

}

Page 51: MongoDB Indexing: The Details

Set Match

• db.c.find( {x:{$in:[3,6]}} )

• Index {x:1}

Page 52: MongoDB Indexing: The Details

Set Match

1 2 3 4 5 6 7 8 9

3 , 6

Page 53: MongoDB Indexing: The Details

Set Match

>db.c.find( {x:{$in:[3,6]}} ).explain()

{

"cursor" : "BtreeCursor x_1 multi",

"nscanned" : 3,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 8,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

3,

3

],

[

6,

6

]

]

}

}

Page 54: MongoDB Indexing: The Details

Set Match

"indexBounds" : {

"x" : [

[

3,

3

],

[

6,

6

]

]

}

Page 55: MongoDB Indexing: The Details

Set Match

"nscanned" : 3,

"nscannedObjects" : 2,

"n" : 2,

Why is nscanned 3? This is an algorithmic detail we’ll discuss more later, but when there are disjoint ranges for a key nscanned may be higher than the number of matching keys.

Page 56: MongoDB Indexing: The Details

Set Match

1

2

3 4

5

6

7

8 9

Page 57: MongoDB Indexing: The Details

All Match

• db.c.find( {x:{$all:[3,6]}} )

• Index {x:1}

Page 58: MongoDB Indexing: The Details

All Match

1 2 3 4 5 6 7 8 9

3 ?

{_id:4,x:[3,6]}

Page 59: MongoDB Indexing: The Details

All Match

>db.c.find( {x:{$all:[3,6]}} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

3,

3

]

]

}

}

Page 60: MongoDB Indexing: The Details

All Match

"indexBounds" : {

"x" : [

[

3,

3

]

]

}

The first entry in the $all match array is always used for index bounds. Note this may not be the least numerous indexed value in the $all array.
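
A practical consequence is that the order of the $all array can change how many keys are scanned. The sketch below assumes exactly the behaviour described on this slide (bounds come from the first entry only); the documents and the helper `findAll` are hypothetical.

```javascript
// Hypothetical multikey documents: 3 is common, 6 is rare.
const docs = [
  { _id: 1, x: [3, 6] },
  { _id: 2, x: [3, 7] },
  { _id: 3, x: [3, 8] },
];

function findAll(docs, values) {
  const bound = values[0]; // only the first $all entry sets the index bounds
  let nscanned = 0;
  let n = 0;
  for (const doc of docs) {
    if (!doc.x.includes(bound)) continue; // no index key equal to the bound
    nscanned++;
    // The matcher checks the remaining $all entries on the document.
    if (values.every((v) => doc.x.includes(v))) n++;
  }
  return { nscanned, n };
}

const commonFirst = findAll(docs, [3, 6]);
const rareFirst = findAll(docs, [6, 3]);
console.log(commonFirst, rareFirst);
// { nscanned: 3, n: 1 } { nscanned: 1, n: 1 }
```

Under this assumption, listing the rarest value first keeps the scan short.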

Page 61: MongoDB Indexing: The Details

All Match

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

Page 62: MongoDB Indexing: The Details

All Match

1

2

3 4

5

6

7

8 9

Page 63: MongoDB Indexing: The Details

Limit

• db.c.find( {x:{$lt:6},y:3} ).limit( 3 )

• Index {x:1}

Page 64: MongoDB Indexing: The Details

Limit

1 2 3 4 5 6 7 8 9

? < 6

y:3 y:1 y:3 y:3 y:3

Page 65: MongoDB Indexing: The Details

Limit

>db.c.find( {x:{$lt:6},y:3} ).limit( 3 ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 3,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

-1.7976931348623157e+308,

6

]

]

}

}

Page 66: MongoDB Indexing: The Details

Limit

"indexBounds" : {

"x" : [

[

-1.7976931348623157e+308,

6

]

]

}

Page 67: MongoDB Indexing: The Details

Limit

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 3,

Scan until three matches are found, then stop.

Page 68: MongoDB Indexing: The Details

Skip

• db.c.find( {x:{$lt:6},y:3} ).skip( 3 )

• Index {x:1}

Page 69: MongoDB Indexing: The Details

Skip

1 2 3 4 5 6 7 8 9

? < 6

y:3 y:1 y:3 y:3 y:3

Page 70: MongoDB Indexing: The Details

Skip

>db.c.find( {x:{$lt:6},y:3} ).skip( 3 ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 5,

"nscannedObjects" : 5,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

-1.7976931348623157e+308,

6

]

]

}

}

Page 71: MongoDB Indexing: The Details

Skip

"indexBounds" : {

"x" : [

[

-1.7976931348623157e+308,

6

]

]

}

Page 72: MongoDB Indexing: The Details

Skip

"nscanned" : 5,

"nscannedObjects" : 5,

"n" : 1,

All skipped documents are scanned.
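
The limit and skip counters from the last two examples can be reproduced with one loop. This is a sketch under the assumption that the cursor walks the five keys with x < 6 in order and applies the y:3 matcher to each; `scan` is a hypothetical helper, and the y values follow the slide diagram.

```javascript
// Documents with x < 6, in index order.
const inRange = [
  { x: 1, y: 3 },
  { x: 2, y: 1 },
  { x: 3, y: 3 },
  { x: 4, y: 3 },
  { x: 5, y: 3 },
];

function scan(entries, { skip = 0, limit = Infinity } = {}) {
  let nscanned = 0;
  let n = 0;
  let toSkip = skip;
  for (const doc of entries) {
    if (n >= limit) break;     // limit reached: stop scanning
    nscanned++;
    if (doc.y !== 3) continue; // rejected by the matcher
    if (toSkip > 0) {          // a skipped document still had to be scanned
      toSkip--;
      continue;
    }
    n++;
  }
  return { nscanned, n };
}

const limited = scan(inRange, { limit: 3 });
const skipped = scan(inRange, { skip: 3 });
console.log(limited, skipped);
// { nscanned: 4, n: 3 } { nscanned: 5, n: 1 }
```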

Page 73: MongoDB Indexing: The Details

Sort

• db.c.find( {x:{$lt:6},y:3} ).sort( {x:1} )

• Index {x:1}

Page 74: MongoDB Indexing: The Details

Sort

1 2 3 4 5 6 7 8 9

? < 6

y:3 y:1 y:3 y:3 y:3

Page 75: MongoDB Indexing: The Details

Sort

>db.c.find( {x:{$lt:6},y:3} ).sort( {x:1} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 5,

"nscannedObjects" : 5,

"n" : 4,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

-1.7976931348623157e+308,

6

]

]

}

}

Page 76: MongoDB Indexing: The Details

Sort

"cursor" : "BtreeCursor x_1",

Page 77: MongoDB Indexing: The Details

Sort

• db.c.find( {x:{$lt:6},y:3} ).sort( {y:1} )

• Index {x:1}

Page 78: MongoDB Indexing: The Details

Sort

1 2 3 4 5 6 7 8 9

? < 6

y:3 y:1 y:3 y:3 y:3

Page 79: MongoDB Indexing: The Details

Sort

>db.c.find( {x:{$lt:6},y:3} ).sort( {y:1} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 5,

"nscannedObjects" : 5,

"n" : 4,

"scanAndOrder" : true,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

-1.7976931348623157e+308,

6

]

]

}

}

Page 80: MongoDB Indexing: The Details

Sort

"cursor" : "BtreeCursor x_1",

"nscanned" : 5,

"nscannedObjects" : 5,

"n" : 4,

"scanAndOrder" : true,

Results are sorted on the fly to match requested order. The scanAndOrder field is only printed when its value is true.

Page 81: MongoDB Indexing: The Details

Sort and scanAndOrder

• With “scanAndOrder” sort, all documents must be touched even if there is a limit spec.

• With scanAndOrder, sorting is performed in memory and the memory footprint is constrained by the limit spec if present.
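
Both points can be illustrated with a bounded in-memory buffer. This is a sketch of the idea, not the server's sort implementation; `scanAndOrder` here is a hypothetical function that re-sorts a small array for simplicity.

```javascript
// Sorts candidates in memory, keeping at most `limit` documents.
function scanAndOrder(docs, sortField, limit = Infinity) {
  const buffer = [];
  let touched = 0;
  for (const doc of docs) {
    touched++; // every candidate is visited, limit or not
    buffer.push(doc);
    buffer.sort((a, b) => a[sortField] - b[sortField]);
    if (buffer.length > limit) buffer.pop(); // drop the current worst document
  }
  return { results: buffer, touched };
}

const docs = [5, 1, 4, 2, 3].map((y) => ({ y }));
const out = scanAndOrder(docs, "y", 2);
console.log(out.touched, out.results); // 5 [ { y: 1 }, { y: 2 } ]
```

The buffer never holds more than `limit` + 1 documents, yet all five candidates are touched.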

Page 82: MongoDB Indexing: The Details

Count

• db.c.count( {x:{$gte:4,$lte:7}} )

• Index {x:1}

Page 83: MongoDB Indexing: The Details

Count

1 2 3 4 5 6 7 8 9

4 <= ? <= 7

Page 84: MongoDB Indexing: The Details

Count

1

2

3 4

5

6

7

8 9

We’re just counting keys here, not loading the full documents.

Page 85: MongoDB Indexing: The Details

Count

• With some operators the full document must be checked. Some of these cases:

• $all

• $size

• array match

• Negation - $ne, $nin, $not, etc.

• With current semantics, all multikey elements must match negation constraints

• Multikey de-duplication works without loading full document

Page 86: MongoDB Indexing: The Details

Covered Indexes

• db.c.find( {x:6}, {x:1,_id:0} )

• Index {x:1}

_id would be returned by default, but isn’t in the index, so we need to exclude it to return only indexed fields.

Page 87: MongoDB Indexing: The Details

Covered Indexes

1 2 3 4 5 6 7 8 9

6 ?

{_id:4,x:6}

Page 88: MongoDB Indexing: The Details

Covered Indexes

>db.c.find( {x:6}, {x:1,_id:0} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : true,

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

}

Page 89: MongoDB Indexing: The Details

Covered Indexes

"isMultiKey" : false,

"indexOnly" : true,

Page 90: MongoDB Indexing: The Details

Covered Indexes

1 2 3 4 5 6 7 8 9

6 ?

{_id:4,x:[6,7]}

Page 91: MongoDB Indexing: The Details

Covered Indexes

>db.c.find( {x:6}, {x:1,_id:0} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : true,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

6

]

]

}

}

Page 92: MongoDB Indexing: The Details

Covered Indexes

"isMultiKey" : true,

"indexOnly" : false,

Currently we set isMultiKey to true the first time we save a doc where the field is a multikey array. But when all multikey docs are removed we don’t reset isMultiKey. This can be improved.
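
The conditions seen in the covered index examples can be summarised as a rule of thumb. The check below is only a sketch inferred from these slides (requested fields must be in the index, _id must be excluded unless indexed, and the index must not be flagged multikey); `canBeIndexOnly` is a hypothetical helper, not the optimizer's actual logic.

```javascript
// Rule-of-thumb check for whether explain could report indexOnly: true.
function canBeIndexOnly(indexFields, projection, isMultiKey) {
  if (isMultiKey) return false; // the multikey example reports indexOnly: false
  const wanted = Object.keys(projection).filter((f) => projection[f] === 1);
  // _id comes back by default unless it is excluded explicitly.
  if (projection._id !== 0 && !wanted.includes("_id")) wanted.push("_id");
  return wanted.every((f) => indexFields.includes(f));
}

console.log(canBeIndexOnly(["x"], { x: 1, _id: 0 }, false)); // true
console.log(canBeIndexOnly(["x"], { x: 1 }, false));         // false
console.log(canBeIndexOnly(["x"], { x: 1, _id: 0 }, true));  // false
```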

Page 93: MongoDB Indexing: The Details

Update

• db.c.update( {x:{$gte:4,$lte:7}}, {$set:{x:2}} )

• Index {x:1}

Page 94: MongoDB Indexing: The Details

Update

1 2 3 4 5 6 7 8 9

4 <= ? <= 7

{_id:4,x:4}

Page 95: MongoDB Indexing: The Details

Update

1

2

3 4

5

6

7

8 9

{_id:4,x:4}

Page 96: MongoDB Indexing: The Details

Update

1

2

3 4

5

6

7

8 9

{_id:4,x:4}

Page 97: MongoDB Indexing: The Details

Update

1

2

2 3

5

6

7

8 9

{_id:4,x:2}

Page 98: MongoDB Indexing: The Details

Update

• We track the set of documents that have been updated in the course of the current operation so they are only updated once.
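
Tracking matters most when an update moves a key forward, into the part of the range the cursor has not reached yet. The sketch below is an assumption-laden illustration: it uses a hypothetical "add delta to x" update instead of the $set:{x:2} on the slide, models the index as a map from document id to x, and compares a scan with and without the tracking set.

```javascript
// Finds the next index entry after `after` (ordered by x, then id) within [lo, hi].
function nextEntry(docs, after, lo, hi) {
  let best = null;
  for (const [id, x] of docs) {
    if (x < lo || x > hi) continue;
    const isAfter = x > after.x || (x === after.x && id > after.id);
    const isBetter =
      best === null || x < best.x || (x === best.x && id < best.id);
    if (isAfter && isBetter) best = { x, id };
  }
  return best;
}

// Applies x += delta to every document found by a forward scan of [lo, hi].
function updateRange(docs, lo, hi, delta, track) {
  const seen = new Set();
  let applied = 0;
  let pos = { x: -Infinity, id: -Infinity };
  for (let e = nextEntry(docs, pos, lo, hi); e; e = nextEntry(docs, pos, lo, hi)) {
    pos = e;
    if (track && seen.has(e.id)) continue; // already updated in this operation
    seen.add(e.id);
    docs.set(e.id, e.x + delta); // the index entry moves to its new key
    applied++;
  }
  return applied;
}

const tracked = new Map([[1, 4], [2, 5]]);
const untracked = new Map([[1, 4], [2, 5]]);
console.log(updateRange(tracked, 4, 7, 2, true));    // 2
console.log(updateRange(untracked, 4, 7, 2, false)); // 4
```

Without the tracking set, each document is met again at its new key and updated a second time.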

Page 99: MongoDB Indexing: The Details

Compound Key Index Bounds

Page 100: MongoDB Indexing: The Details

Two Equality Bounds

• db.c.find( {x:5,y:'c'} )

• Index {x:1,y:1}

Page 101: MongoDB Indexing: The Details

Two Equality Bounds

?5c

1b

3d

4g

5d

5f

6c

7a

9b

5c

Page 102: MongoDB Indexing: The Details

Two Equality Bounds

>db.c.find( {x:5,y:'c'} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

]

],

"y" : [

[

"c",

"c"

]

]

}

}

Page 103: MongoDB Indexing: The Details

Two Equality Bounds

"indexBounds" : {

"x" : [

[

5,

5

]

],

"y" : [

[

"c",

"c"

]

]

}

}

Page 104: MongoDB Indexing: The Details

Two Equality Bounds

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

Page 105: MongoDB Indexing: The Details

Two Equality Bounds

?

1b

3d

4g

5c

5d

5f

5c

6c

7a

9b

Page 106: MongoDB Indexing: The Details

Equality and Set

• db.c.find( {x:5,y:{$in:['c','f']}} )

• Index {x:1,y:1}

Page 107: MongoDB Indexing: The Details

Equality and Set

,5c

1b

3d

4g

5d

5f

6c

7a

9b

5c

5f

Page 108: MongoDB Indexing: The Details

Equality and Set

>db.c.find( {x:5,y:{$in:['c','f']}} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1 multi",

"nscanned" : 3,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

]

],

"y" : [

[

"c",

"c"

],

[

"f",

"f"

]

]

}

}

Page 109: MongoDB Indexing: The Details

Equality and Set

"indexBounds" : {

"x" : [

[

5,

5

]

],

"y" : [

[

"c",

"c"

],

[

"f",

"f"

]

]

}

Page 110: MongoDB Indexing: The Details

Equality and Set

"nscanned" : 3,

"nscannedObjects" : 2,

"n" : 2,

Page 111: MongoDB Indexing: The Details

Equality and Set

1b

3d

4g

5c

5d

5f

6c

7a

9b

Page 112: MongoDB Indexing: The Details

Equality and Range

• db.c.find( {x:5,y:{$gte:'d'}} )

• Index {x:1,y:1}

Page 113: MongoDB Indexing: The Details

Equality and Range

1b

3d

4g

5d

5f

6c

7a

9b

5c

<= ? <= 5d

5max string

Page 114: MongoDB Indexing: The Details

Equality and Range

>db.c.find( {x:5,y:{$gte:'d'}} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

]

],

"y" : [

[

"d",

{

}

]

]

}

}

Page 115: MongoDB Indexing: The Details

Equality and Range

"indexBounds" : {

"x" : [

[

5,

5

]

],

"y" : [

[

"d",

{

}

]

]

}

Page 116: MongoDB Indexing: The Details

Equality and Range

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 2,

Page 117: MongoDB Indexing: The Details

Equality and Range

1b

3d

4g

5c

5d

5f

6c

7a

9b

Page 118: MongoDB Indexing: The Details

Two Set Bounds

• db.c.find( {x:{$in:[5,9]},y:{$in:['c','f']}} )

• Index {x:1,y:1}

Page 119: MongoDB Indexing: The Details

Two Set Bounds

,5c

1b

3d

4g

5d

5f

6c

7a

9f

5c

5f ,9

c9

f,

Page 120: MongoDB Indexing: The Details

Two Set Bounds

>db.c.find( {x:{$in:[5,9]},y:{$in:['c','f']}} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1 multi",

"nscanned" : 5,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

],

[

9,

9

]

],

"y" : [

[

"c",

"c"

],

[

"f",

"f"

]

]

}

Page 121: MongoDB Indexing: The Details

Two Set Bounds

"indexBounds" : {

"x" : [

[

5,

5

],

[

9,

9

]

],

"y" : [

[

"c",

"c"

],

[

"f",

"f"

]

]

}

Page 122: MongoDB Indexing: The Details

Two Set Bounds

"nscanned" : 5,

"nscannedObjects" : 3,

"n" : 3,

Page 123: MongoDB Indexing: The Details

Two Set Bounds

1b

3d

4g

5c

5d

5f

6c

7a

9f

Page 124: MongoDB Indexing: The Details

Set and Range

• db.c.find( {x:{$in:[5,9]},y:{$lte:'d'}} )

• Index {x:1,y:1}

Page 125: MongoDB Indexing: The Details

Set and Range

<=?<=5min

string

1b

3d

4g

5d

5f

6c

9a

9f

5c

5d

9d, 9

minstring

<=?<=

Page 126: MongoDB Indexing: The Details

Set and Range

>db.c.find( {x:{$in:[5,9]},y:{$lte:'d'}} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1 multi",

"nscanned" : 5,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

],

[

9,

9

]

],

"y" : [

[

"",

"d"

]

]

}

}

Page 127: MongoDB Indexing: The Details

Set and Range

"x" : [

[

5,

5

],

[

9,

9

]

],

"y" : [

[

"",

"d"

]

]

}

Page 128: MongoDB Indexing: The Details

Set and Range

"nscanned" : 5,

"nscannedObjects" : 3,

"n" : 3,

Page 129: MongoDB Indexing: The Details

Range and Equality

• db.c.find( {x:{$gte:4},y:'c'} )

• Index {x:1,y:1}

Page 130: MongoDB Indexing: The Details

Range and Equality

? >=4

1b

3d

4g

5d

6a

7e

9f

5c

cand ?

8c

Page 131: MongoDB Indexing: The Details

Range and Equality

>db.c.find( {x:{$gte:4},y:'c'} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 7,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

4,

1.7976931348623157e+308

]

],

"y" : [

[

"c",

"c"

]

]

}

}

Page 132: MongoDB Indexing: The Details

Range and Equality

"indexBounds" : {

"x" : [

[

4,

1.7976931348623157e+308

]

],

"y" : [

[

"c",

"c"

]

]

}

Page 133: MongoDB Indexing: The Details

Range and Equality

"nscanned" : 7,

"nscannedObjects" : 2,

"n" : 2,

High nscanned because every distinct value of x must be checked.

Page 134: MongoDB Indexing: The Details

Range and Equality

1b

3d

4g

5c

5d

9f

6a

7e

8c

Page 135: MongoDB Indexing: The Details

Range and Equality

1b

3d

4g

5c

5d

9f

6a

7e

8c

Every distinct value of x must be checked.
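
The gap between nscanned and nscannedObjects in this example can be reproduced with the compound keys from the diagram. This is a simplified sketch: it assumes every key with x >= 4 is visited and that y is tested against the index key, so a document is loaded only when the key matches. The real cursor may skip ahead within a value of x, so other data sets can give different counts. `scanCompound` is a hypothetical helper.

```javascript
// Keys of index {x:1,y:1} in index order, taken from the slide diagram.
const indexKeys = [
  [1, "b"], [3, "d"], [4, "g"], [5, "c"], [5, "d"],
  [6, "a"], [7, "e"], [8, "c"], [9, "f"],
];

function scanCompound(keys, xMin, yEq) {
  let nscanned = 0;
  let nscannedObjects = 0;
  for (const [x, y] of keys) {
    if (x < xMin) continue; // before the start of the x range: never visited
    nscanned++;             // key visited inside the x range
    if (y === yEq) nscannedObjects++; // only matching keys load a document
  }
  return { nscanned, nscannedObjects };
}

const counts = scanCompound(indexKeys, 4, "c");
console.log(counts); // { nscanned: 7, nscannedObjects: 2 }
```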

Page 136: MongoDB Indexing: The Details

Range and Set

• db.c.find( {x:{$gte:4},y:{$in:['c','a']}} )

• Index {x:1,y:1}

Page 137: MongoDB Indexing: The Details

Range and Set

? >=4

1b

3d

4g

5d

6a

7e

9f

5c

cand ,

8c

a

Page 138: MongoDB Indexing: The Details

Range and Set

>db.c.find( {x:{$gte:4},y:{$in:['c','a']}} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1 multi",

"nscanned" : 7,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

4,

1.7976931348623157e+308

]

],

"y" : [

[

"a",

"a"

],

[

"c",

"c"

]

]

}

}

Page 139: MongoDB Indexing: The Details

Range and Set

"indexBounds" : {

"x" : [

[

4,

1.7976931348623157e+308

]

],

"y" : [

[

"a",

"a"

],

[

"c",

"c"

]

]

}

Page 140: MongoDB Indexing: The Details

Range and Set

"nscanned" : 7,

"nscannedObjects" : 3,

"n" : 3,

Page 141: MongoDB Indexing: The Details

Range and Set

1b

3d

4g

5c

5d

9f

6a

7e

8c

Page 142: MongoDB Indexing: The Details

Range and Set

1b

3d

4g

5c

5d

9f

6a

7e

8c

Every distinct value of x must be checked for y values ‘a’ and ‘c’.

Page 143: MongoDB Indexing: The Details

Two Ranges (2D Box)

• db.c.find( {x:{$gte:3,$lte:7},y:{$gte:'c',$lte:'f'}} )

• Index {x:1,y:1}

Page 144: MongoDB Indexing: The Details

Two Ranges (2D Box)

x

y

3 7

c

f

{x:{$gte:3,$lte:7},

y:{$gte:'c',$lte:'f'}}

Page 145: MongoDB Indexing: The Details

Two Ranges (2D Box)

<=?<=7

1b

3d

4g

5d

6a

7e

9f

5c

c&

7g

f3 <=?<=

Page 146: MongoDB Indexing: The Details

Two Ranges (2D Box)

>db.c.find( {x:{$gte:3,$lte:7},y:{$gte:'c',$lte:'f'}} ).explain()

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 6,

"nscannedObjects" : 4,

"n" : 4,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

3,

7

]

],

"y" : [

[

"c",

"f"

]

]

}

}

Page 147: MongoDB Indexing: The Details

Two Ranges (2D Box)

"indexBounds" : {

"x" : [

[

3,

7

]

],

"y" : [

[

"c",

"f"

]

]

}

Page 148: MongoDB Indexing: The Details

Two Ranges (2D Box)

"nscanned" : 6,

"nscannedObjects" : 4,

"n" : 4,

Page 149: MongoDB Indexing: The Details

Two Ranges (2D Box)

1b

3d

4g

5c

5d

9f

6a

7e

7g

Page 150: MongoDB Indexing: The Details

Two Ranges (2D Box)

<=?<=7

c f

3

<=?<=

For every distinct value of x in this range

Scan for every value of y in this range

Page 151: MongoDB Indexing: The Details

$or

Page 152: MongoDB Indexing: The Details

Disjoint $or Criteria

• db.c.find( {$or:[{x:5},{y:'d'}]} )

• Indexes {x:1}, {y:1}

Page 153: MongoDB Indexing: The Details

Disjoint $or Criteria

?

1b

3d

4g

5d

6a

7e

9f

5c

d

7g

5

?

1b

3d

4g

5d

6a

7e

9f

5c

7g

Page 154: MongoDB Indexing: The Details

Disjoint $or Criteria

>db.c.find( {$or:[{x:5},{y:'d'}]} ).explain()

{

"clauses" : [

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

]

]

}

},

{

"cursor" : "BtreeCursor y_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"y" : [

[

"d",

"d"

]

]

}

}

],

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 3,

"millis" : 1

}

Page 155: MongoDB Indexing: The Details

Disjoint $or Criteria

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 2,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

5,

5

]

]

}

},

Page 156: MongoDB Indexing: The Details

Disjoint $or Criteria

{

"cursor" : "BtreeCursor y_1",

"nscanned" : 2,

"nscannedObjects" : 2,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"y" : [

[

"d",

"d"

]

]

}

}

Only return one document matching this clause.

Page 157: MongoDB Indexing: The Details

Disjoint $or Criteria

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 3,

"millis" : 1

Page 158: MongoDB Indexing: The Details

Disjoint $or Criteria

?

1b

3d

4g

5d

6a

7e

9f

5c

7g

5

Page 159: MongoDB Indexing: The Details

Disjoint $or Criteria

d ?

1b

3d

4g

5d

6a

7e

9f

5c

7g

We have already scanned the x index for x:5. So this document was returned already. We don’t return it again.
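
The per-clause and total counters in the explain output for this query can be reproduced with a set of already-returned documents. This is a sketch assuming clauses run one after another and each uses its own single-field index; the _id values and the helper `runOr` are hypothetical, while the x/y pairs follow the diagram.

```javascript
const docs = [
  { _id: 1, x: 1, y: "b" }, { _id: 2, x: 3, y: "d" }, { _id: 3, x: 4, y: "g" },
  { _id: 4, x: 5, y: "d" }, { _id: 5, x: 6, y: "a" }, { _id: 6, x: 7, y: "e" },
  { _id: 7, x: 9, y: "f" }, { _id: 8, x: 5, y: "c" }, { _id: 9, x: 7, y: "g" },
];

// Each clause is [field, value], served by an index on that field.
function runOr(docs, clauses) {
  const returned = new Set();
  const perClause = clauses.map(([field, value]) => {
    const scanned = docs.filter((d) => d[field] === value);
    const fresh = scanned.filter((d) => !returned.has(d._id));
    fresh.forEach((d) => returned.add(d._id)); // remember what was returned
    return { nscanned: scanned.length, n: fresh.length };
  });
  return { perClause, n: returned.size };
}

const orResult = runOr(docs, [["x", 5], ["y", "d"]]);
console.log(JSON.stringify(orResult));
// {"perClause":[{"nscanned":2,"n":2},{"nscanned":2,"n":1}],"n":3}
```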

Page 160: MongoDB Indexing: The Details

Unindexed $or Clause

• db.c.find( {$or:[{x:5},{y:'d'}]} )

• Index {x:1} (no index on y)

Page 161: MongoDB Indexing: The Details

Unindexed $or Clause

>db.c.find( {$or:[{x:5},{y:'d'}]} ).explain()

{

"cursor" : "BasicCursor",

"nscanned" : 9,

"nscannedObjects" : 9,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

}

}

Since y is not indexed, we must do a full collection scan to match y:’d’. Since a full scan is required, we don’t use the index on x to match x:5.

Page 162: MongoDB Indexing: The Details

Eliminated $or Clause

• db.c.find( {$or:[{x:{$gt:2,$lt:6}},{x:5}]} )

• Index {x:1}

Page 163: MongoDB Indexing: The Details

Eliminated $or Clause

1 2 3 4 5 6 7 8 9

2 < ? < 6

1 2 3 4 5 6 7 8 9

5 ?

Page 164: MongoDB Indexing: The Details

Eliminated $or Clause

>db.c.find( {$or:[{x:{$gt:2,$lt:6}},{x:5}]} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

2,

6

]

]

}

}

The index range of the second clause is included in the index range of the first clause, so we use the first index range only.

Page 165: MongoDB Indexing: The Details

Eliminated $or Clause with Differing Unindexed Criteria

• db.c.find( {$or:[{x:{$gt:2,$lt:6},y:'c'},{x:5,y:'d'}]} )

• Index {x:1}

Page 166: MongoDB Indexing: The Details

Eliminated $or Clause with Differing Unindexed Criteria

1b

3d

4g

5d

6a

7e

9f

5c

7g

2 < ? < 6 and c

1b

3d

4g

5d

6a

7e

9f

5c

7g

5 and d

Page 167: MongoDB Indexing: The Details

Eliminated $or Clause with Differing Unindexed Criteria

>db.c.find( {$or:[{x:{$gt:2,$lt:6},y:'c'},{x:5,y:'d'}]} ).explain()

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 2,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

2,

6

]

]

}

}

Page 168: MongoDB Indexing: The Details

Eliminated $or Clause with Differing Unindexed Criteria

1b

3d

4g

5d

6a

7e

9f

5c

7g

2 < ? < 6 and c , d

The index range for the first clause contains the index range for the second clause, so all matching is done using the index range for the first clause.

Page 169: MongoDB Indexing: The Details

Overlapping $or Clauses

• db.c.find( {$or:[{x:{$gt:2,$lt:6}},{x:{$gt:4,$lt:7}}]} )

• Index {x:1,y:1}

Page 170: MongoDB Indexing: The Details

Overlapping $or Clauses

1 2 3 4 5 6 7 8 9

2 < ? < 6

1 2 3 4 5 6 7 8 9

4 < ? < 7

Page 171: MongoDB Indexing: The Details

Overlapping $or Clauses

>db.d.find( {$or:[{x:{$gt:2,$lt:6}},{x:{$gt:4,$lt:7}}]} ).explain()

{

"clauses" : [

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

2,

6

]

]

}

},

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

7

]

]

}

}

],

"nscanned" : 4,

"nscannedObjects" : 4,

"n" : 4,

"millis" : 1

}

>

Page 172: MongoDB Indexing: The Details

Overlapping $or Clauses

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 3,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 0,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

2,

6

]

]

}

},

Page 173: MongoDB Indexing: The Details

Overlapping $or Clauses

{

"cursor" : "BtreeCursor x_1",

"nscanned" : 1,

"nscannedObjects" : 1,

"n" : 1,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

7

]

]

}

}

The index range scanned for the previous clause is removed.
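
The effect of removing an already-scanned range can be sketched over the keys 1 to 9. The model below is an assumption made for illustration: it filters individual keys against earlier clause ranges, whereas the explain output suggests the server rewrites the bounds themselves (here to 6 through 7). `runOrClauses` is a hypothetical helper.

```javascript
const keys = [1, 2, 3, 4, 5, 6, 7, 8, 9];
const inRange = (k, r) => k > r.gt && k < r.lt; // exclusive bounds

// Returns, per clause, the keys that clause still has to scan.
function runOrClauses(keys, clauses) {
  const earlier = [];
  return clauses.map((range) => {
    const scanned = keys.filter(
      (k) => inRange(k, range) && !earlier.some((p) => inRange(k, p))
    );
    earlier.push(range);
    return scanned;
  });
}

const overlapping = runOrClauses(keys, [{ gt: 2, lt: 6 }, { gt: 4, lt: 7 }]);
const contained = runOrClauses(keys, [{ gt: 2, lt: 6 }, { gt: 4, lt: 6 }]);
console.log(JSON.stringify(overlapping)); // [[3,4,5],[6]]
console.log(JSON.stringify(contained));   // [[3,4,5],[]]
```

The first call matches the per-clause nscanned values of 3 and 1 shown above; the second shows a clause that is eliminated because an earlier range already covers it.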

Page 174: MongoDB Indexing: The Details

Overlapping $or Clauses

1 2 3 4 5 6 7 8 9

2 < ? < 6

1 2 3 4 5 6 7 8 9

6 <= ? < 7

Page 175: MongoDB Indexing: The Details

2D Overlapping $or Clauses

• db.c.find( {$or:[{x:{$gt:2,$lt:6},y:{$gt:'b',$lt:'f'}},{x:{$gt:4,$lt:7},y:{$gt:'b',$lt:'e'}}]} )

• Index {x:1,y:1}

Page 176: MongoDB Indexing: The Details

2D Overlapping $or Clauses

x

y

2 6

b

f

Clause 2

e

7

Clause 1

Page 177: MongoDB Indexing: The Details

2D Overlapping $or Clauses

>db.c.find( {$or:[{x:{$gt:2,$lt:6},y:{$gt:'b',$lt:'f'}},{x:{$gt:4,$lt:7},y:{$gt:'b',$lt:'e'}}]} ).explain()

{

"clauses" : [

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 4,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

2,

6

]

],

"y" : [

[

"b",

"f"

]

]

}

},

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 0,

"nscannedObjects" : 0,

"n" : 0,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

7

]

],

"y" : [

[

"b",

"e"

]

]

}

}

],

"nscanned" : 4,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 1

Page 178: MongoDB Indexing: The Details

2D Overlapping $or Clauses

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 4,

"nscannedObjects" : 3,

"n" : 3,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

2,

6

]

],

"y" : [

[

"b",

"f"

]

]

}

Page 179: MongoDB Indexing: The Details

2D Overlapping $or Clauses

{

"cursor" : "BtreeCursor x_1_y_1",

"nscanned" : 0,

"nscannedObjects" : 0,

"n" : 0,

"millis" : 1,

"nYields" : 0,

"nChunkSkips" : 0,

"isMultiKey" : false,

"indexOnly" : false,

"indexBounds" : {

"x" : [

[

6,

7

]

],

"y" : [

[

"b",

"e"

]

]

}

}

],

The index range scanned for the previous clause is removed.

Page 180: MongoDB Indexing: The Details

2D Overlapping $or Clauses

[Diagram: boxes for Clause 1 and Clause 2 in the x/y plane. A callout on the part of Clause 2 outside Clause 1 reads: "We only have to scan the remainder here."]

Page 181: MongoDB Indexing: The Details

Overlapping $or Clauses

• Rule of thumb for n dimensions: We subtract earlier clause boxes from current box when the result is a/some box(es).

[Diagram: pairs of boxes labeled 1 and 2; the cases shown are marked ✓.]

Page 182: MongoDB Indexing: The Details

Overlapping $or Clauses

• Rule of thumb for n dimensions: We subtract earlier clause boxes from current box when the result is a/some box(es).

[Diagram: boxes labeled 1 and 2; the case shown is marked ✗.]
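One way to read the rule of thumb is sketched below. It is an interpretation, not MongoDB source code: `subtractBox` and its helpers are hypothetical names, and every range is treated as half-open [lo, hi) to keep the sketch short. The earlier box is subtracted only when the current box sticks out of it along exactly one field, because only then is the remainder made of boxes.

```javascript
// Hypothetical sketch of the rule of thumb; not MongoDB source code.
// A box maps each index field to a half-open range { lo, hi } meaning [lo, hi).

function contains(outer, inner) {
  return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

function disjoint(a, b) {
  return a.hi <= b.lo || b.hi <= a.lo;
}

// Subtract the earlier clause's box from the current one, but only when the
// remainder can itself be written as one or two boxes.
function subtractBox(cur, prev) {
  const fields = Object.keys(cur);
  // No overlap at all: nothing to remove.
  if (fields.some((f) => disjoint(cur[f], prev[f]))) return [cur];
  // Fields along which `cur` sticks out of `prev`.
  const sticking = fields.filter((f) => !contains(prev[f], cur[f]));
  if (sticking.length === 0) return []; // already fully scanned
  if (sticking.length > 1) return [cur]; // remainder is not a box: keep the full range
  const f = sticking[0];
  const pieces = [];
  if (cur[f].lo < prev[f].lo) pieces.push({ lo: cur[f].lo, hi: prev[f].lo });
  if (prev[f].hi < cur[f].hi) pieces.push({ lo: prev[f].hi, hi: cur[f].hi });
  return pieces.map((range) => ({ ...cur, [f]: range }));
}

const clause1 = { x: { lo: 2, hi: 6 }, y: { lo: "b", hi: "f" } };
const clause2 = { x: { lo: 4, hi: 7 }, y: { lo: "b", hi: "e" } };
console.log(subtractBox(clause2, clause1)); // x in [6, 7), y in [b, e)

// If clause 2 also extended past clause 1 in y, the remainder would be
// L-shaped, so this sketch leaves the box unchanged.
const tall = { x: { lo: 4, hi: 7 }, y: { lo: "b", hi: "g" } };
console.log(subtractBox(tall, clause1).length); // 1 (the original box)
```

The first call reproduces the bounds from the 2D explain output: x from 6 to 7 and y from b to e.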

Page 183: MongoDB Indexing: The Details

$or TODO

• Use indexes on $or fields to satisfy a sort specification SERVER-1205

• Use full query optimizer to select $or clause indexes in getMore SERVER-1215

• Improve index range elimination (handling some cases where remainder is not a box)

Page 184: MongoDB Indexing: The Details

Automatic Index Selection

(Query Optimizer)

Page 185: MongoDB Indexing: The Details

Optimal Index

• find( {x:5} )

– Index {x:1}

– Index {x:1,y:1}

• find( {x:5} ).sort( {y:1} )

– Index {x:1,y:1}

• find( {} ).sort( {x:1} )

– Index {x:1}

• find( {x:{$gt:1,$lt:7}} ).sort( {x:1} )

– Index {x:1}

Page 186: MongoDB Indexing: The Details

Optimal Index

• Rule of Thumb

– No scanAndOrder

– All fields with index useful constraints are indexed

– If there is a range or sort it is the last field of the index used to resolve the query

• If multiple optimal indexes exist, one is chosen arbitrarily.
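The rule of thumb above can be turned into a small checker. This is a rough sketch under simplifying assumptions, not the optimizer's actual logic: `isOptimalIndex` is a hypothetical name, sort direction is ignored, and the query is described by hand as lists of equality fields, range fields and an optional single sort field.

```javascript
// Hypothetical checker for the rule of thumb; the optimizer's real logic may differ.
// index: ordered array of field names, e.g. ["x", "y"] for {x:1,y:1}.
// query: { equality: [fields], range: [fields], sort: field or null }.
function isOptimalIndex(index, query) {
  const needed = new Set([...query.equality, ...query.range]);
  if (query.sort) needed.add(query.sort);

  let i = 0;
  // Leading index fields must be equality constraints.
  while (i < index.length && query.equality.includes(index[i])) {
    needed.delete(index[i]);
    i++;
  }
  // At most one further field may carry the range and/or the sort.
  if (i < index.length && (query.range.includes(index[i]) || query.sort === index[i])) {
    needed.delete(index[i]);
  }
  // Optimal when every constraint and the sort are covered.
  return needed.size === 0;
}

// find( {x:5} ).sort( {y:1} ) with index {x:1,y:1}
console.log(isOptimalIndex(["x", "y"], { equality: ["x"], range: [], sort: "y" })); // true
// Same query with index {x:1}: the sort would need scanAndOrder.
console.log(isOptimalIndex(["x"], { equality: ["x"], range: [], sort: "y" })); // false
```

Under this reading, a query with ranges on both x and y has no optimal index, which is why it falls into the multiple-candidate case.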

Page 187: MongoDB Indexing: The Details

Optimal Index

• These same criteria are useful when you are designing your indexes.

Page 188: MongoDB Indexing: The Details

Multiple Candidate Indexes

• find( {x:4,y:'a'} )

– Index {x:1} or {y:1}?

• find( {x:4} ).sort( {y:1} )

– Index {x:1} or {y:1}?

– Note: {x:1,y:1} is optimal

• find( {x:{$gt:2,$lt:7},y:{$gt:'a',$lt:'f'}} )

– Index {x:1,y:1} or {y:1,x:1}?

Page 189: MongoDB Indexing: The Details

Multiple Candidate Indexes

• The only index selection criterion is nscanned

• find( {x:4,y:'a'} )

– Index {x:1} or {y:1} ?

– If fewer documents match {y:'a'} than {x:4} then nscanned for {y:1} will be lower, so we pick {y:1}

• find( {x:{$gt:2,$lt:7},y:{$gt:'b',$lt:'f'}} )

– Index {x:1,y:1} or {y:1,x:1} ?

– If there are fewer distinct values of x in 2 < x < 7 than distinct values of y in 'b' < y < 'f', then {x:1,y:1} is chosen (rule of thumb)
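The first case can be illustrated with a toy model. It assumes that, for an equality match on a single-field index, nscanned equals the number of documents holding that value. The collection contents and the helper names `nscannedFor` and `pickIndex` are made up for the example.

```javascript
// Toy model, not MongoDB code: with a single-field index and an equality match,
// the keys scanned are exactly the documents holding that value.
function nscannedFor(docs, field, value) {
  return docs.filter((d) => d[field] === value).length;
}

// Pick the candidate single-field index with the lowest nscanned.
function pickIndex(docs, query) {
  let best = null;
  for (const field of Object.keys(query)) {
    const n = nscannedFor(docs, field, query[field]);
    if (best === null || n < best.nscanned) {
      best = { index: { [field]: 1 }, nscanned: n };
    }
  }
  return best;
}

// Made-up collection: three documents have x:4, two have y:'a'.
const docs = [
  { x: 4, y: "a" },
  { x: 4, y: "b" },
  { x: 4, y: "c" },
  { x: 5, y: "a" },
];

console.log(pickIndex(docs, { x: 4, y: "a" })); // { index: { y: 1 }, nscanned: 2 }
```

Here {y:1} wins because only two keys match y:'a' while three match x:4.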

Page 190: MongoDB Indexing: The Details

Multiple Candidate Indexes

• The only index selection criterion is nscanned

• Pretty good, but doesn’t cover every case, e.g.

– Cost of scanAndOrder vs ordered index

– Cost of loading full document vs just index key

– Cost of scanning adjacent btree keys vs non adjacent keys/documents

Page 191: MongoDB Indexing: The Details

Competing Indexes

• At most one query plan per index

• Run in interleaved fashion

• Plans kept in a priority queue ordered by nscanned. We always continue progress on plan with lowest nscanned.

Page 192: MongoDB Indexing: The Details

Competing Indexes

• Run until one plan returns all results or enough results to satisfy the initial query request (based on soft limit spec / data size requirement for initial query).

• We only allow plans to compete in initial query. In getMore, we continue reading from the index cursor established by the initial query.
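The interleaving described above can be simulated in a few lines. This is a hypothetical model, not the server's code: each plan is a list of booleans saying whether the next scanned key yields a match, a re-sorted array stands in for the priority queue, and `racePlans` is an invented name.

```javascript
// Hypothetical simulation; plan objects and field names are illustrative only.
// Each plan is { name, keys }, where keys[i] says whether the i-th scanned key
// produces a matching document.
function racePlans(plans, wanted) {
  const state = plans.map((p) => ({ name: p.name, keys: p.keys, nscanned: 0, results: 0 }));
  for (;;) {
    // Stand-in for the priority queue: advance the plan with the lowest nscanned.
    state.sort((a, b) => a.nscanned - b.nscanned);
    const plan = state[0];
    if (plan.nscanned < plan.keys.length) {
      if (plan.keys[plan.nscanned]) plan.results++;
      plan.nscanned++;
    }
    // Stop when a plan has exhausted its range or produced enough results.
    if (plan.nscanned >= plan.keys.length || plan.results >= wanted) {
      return { winner: plan.name, nscanned: plan.nscanned };
    }
  }
}

// Both plans find the same two matches; y_1 needs to scan fewer keys.
const outcome = racePlans(
  [
    { name: "x_1", keys: [true, false, false, true, false, false] },
    { name: "y_1", keys: [false, true, true] },
  ],
  100 // ask for more results than exist, so a plan must finish its range
);
console.log(outcome); // { winner: 'y_1', nscanned: 3 }
```

Because progress always goes to the plan with the lowest nscanned, no plan runs far ahead of the others, and the first plan to finish is the one that needed the fewest scans.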

Page 193: MongoDB Indexing: The Details

“Learning” a Query Plan

• When an index is chosen for a query the query’s “pattern” and nscanned are recorded

– find( {x:3,y:'c'} )

• {Pattern: {x:'equality', y:'equality'}, Index: {x:1}, nscanned: 50}

– find( {x:{$gt:5},y:{$lt:'z'}} )

• {Pattern: {x:'gt bound', y:'lt bound'}, Index: {y:1}, nscanned: 500}

Page 194: MongoDB Indexing: The Details

“Learning” a Query Plan

• When a new query matches the same pattern, the same query plan is used

– find( {x:5,y:'z'} )

• Use index {x:1}

– find( {x:{$gt:20},y:{$lt:'b'}} )

• Use index {y:1}
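A sketch of how such a pattern cache might behave, assuming the pattern records only the kind of constraint on each field and never the values. The function names, the cache key format and the labels "gt and lt bounds" and "other" are hypothetical. Only 'equality', 'gt bound' and 'lt bound' come from the slides.

```javascript
// Hypothetical sketch; labels mirror the slides where possible, not MongoDB's internal format.
function boundType(cond) {
  if (cond === null || typeof cond !== "object") return "equality";
  const lower = "$gt" in cond || "$gte" in cond;
  const upper = "$lt" in cond || "$lte" in cond;
  if (lower && upper) return "gt and lt bounds"; // label assumed, not from the slides
  if (lower) return "gt bound";
  if (upper) return "lt bound";
  return "other"; // operators this sketch does not model
}

// The pattern keeps the kind of constraint per field and drops the values.
function queryPattern(query) {
  const pattern = {};
  for (const [field, cond] of Object.entries(query)) pattern[field] = boundType(cond);
  return pattern;
}

const planCache = new Map();

// Field order should not matter, so the key is built from sorted field names.
function cacheKey(query) {
  const p = queryPattern(query);
  return JSON.stringify(Object.keys(p).sort().map((f) => [f, p[f]]));
}

function recordPlan(query, index, nscanned) {
  planCache.set(cacheKey(query), { index, nscanned });
}

function lookupPlan(query) {
  return planCache.get(cacheKey(query)) || null;
}

// The two recorded plans from the slides.
recordPlan({ x: 3, y: "c" }, { x: 1 }, 50);
recordPlan({ x: { $gt: 5 }, y: { $lt: "z" } }, { y: 1 }, 500);

console.log(lookupPlan({ x: 5, y: "z" })); // { index: { x: 1 }, nscanned: 50 }
console.log(lookupPlan({ x: { $gt: 20 }, y: { $lt: "b" } })); // { index: { y: 1 }, nscanned: 500 }
```

A query with a different shape, such as find( {x:5} ) alone, has no recorded pattern in this sketch and would go through plan selection again.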

Page 195: MongoDB Indexing: The Details

“Un-Learning” a Query Plan

• 100 writes to the collection

• Indexes added / removed

Page 196: MongoDB Indexing: The Details

Bad Plan Insurance

• If nscanned for a new query using a recorded plan is much worse than the recorded nscanned for an earlier query with the same pattern, we start interleaving other plans with the current plan.

• Currently “much worse” means 10x
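The un-learning triggers and the 10x check could be tracked with bookkeeping along these lines. The thresholds come from the slides; the class, its method names, the per-collection write counter and the strict "greater than" comparison are assumptions made for the sketch.

```javascript
// Hypothetical bookkeeping; thresholds are from the slides, names are invented.
const WRITES_BEFORE_UNLEARN = 100;
const MUCH_WORSE_FACTOR = 10;

class LearnedPlans {
  constructor() {
    this.plans = new Map();
    this.writes = 0;
  }

  record(patternKey, index, nscanned) {
    this.plans.set(patternKey, { index, nscanned });
  }

  lookup(patternKey) {
    return this.plans.get(patternKey) || null;
  }

  // Call once per write to the collection.
  noteWrite() {
    this.writes++;
    if (this.writes >= WRITES_BEFORE_UNLEARN) this.clear();
  }

  // Call when an index is added or removed.
  noteIndexChange() {
    this.clear();
  }

  clear() {
    this.plans.clear();
    this.writes = 0;
  }

  // True when the running query should start interleaving other plans.
  shouldCompete(patternKey, currentNscanned) {
    const plan = this.lookup(patternKey);
    return plan !== null && currentNscanned > MUCH_WORSE_FACTOR * plan.nscanned;
  }
}

const learned = new LearnedPlans();
learned.record("x:equality,y:equality", { x: 1 }, 50);
console.log(learned.shouldCompete("x:equality,y:equality", 400)); // false
console.log(learned.shouldCompete("x:equality,y:equality", 501)); // true
```

With a recorded nscanned of 50, the recorded plan keeps running alone until the current query has scanned more than 500 keys.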

Page 197: MongoDB Indexing: The Details

Query Planner

• Ad hoc heuristics in some cases

• Seem to work decently in practice

Page 198: MongoDB Indexing: The Details

Feedback

• Large and small scale optimizer features are generally prioritized based on user input.

• Please use jira to request new features and vote on existing feature requests.

Page 199: MongoDB Indexing: The Details

Thanks!

Feature Requests

jira.mongodb.org

Support

groups.google.com/group/mongodb-user

Next up:

Sharding Details with Eliot

