Date post: | 25-Feb-2018 |
7/25/2019 Exploring the Aggregation Framework - Runkel 2015 - Slideshare
Exploring the Aggregation Framework
Jay Runkel
Solutions Architect
jay.runkel@mongodb.com
@jayrunkel
Agenda
1. Analytics in MongoDB?
2. Aggregation Framework
3. Aggregation Framework in Action: US Census Data
4. Aggregation Framework Options
Analytics in MongoDB?
Create, Read, Update, Delete
Analytics?
Group, Count, Derive Values, Filter, Average, Sort
For Example: US Census Data
• Census data from 1990, 2000, 2010
• Question:
  Which US Division has the fastest growing population density?
  - We only want to include states with more than 1M people
  - We only want to include divisions larger than 100K square miles
• Division = a group of US States
• Population density = # of people / area of division
• Data is provided at the state level
US Regions and Divisions
How would we solve this in SQL?
• SELECT, GROUP BY, HAVING
What About MongoDB?
Aggregation Framework
What is an Aggregation Pipeline?
• A Series of Document Transformations
  - Executed in stages
  - Original input is a collection
  - Output as a cursor or a collection
• Rich Library of Functions
  - Filter, compute, group, and summarize data
  - Output of one stage sent to input of next
  - Operations executed in sequential order
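The stage-by-stage flow above can be pictured with a minimal sketch in plain JavaScript. It is only an analogy, not how MongoDB is implemented: `runPipeline`, `matchWest`, and `projectName` are hypothetical names, and the sample documents are invented for illustration.

```javascript
// Hypothetical stand-ins for pipeline stages: each takes an array of
// documents and returns a new array, which becomes the next stage's input.
const matchWest = (docs) => docs.filter((d) => d.region === "West");
const projectName = (docs) => docs.map((d) => ({ name: d.name }));

// Run the stages in sequential order, feeding each output to the next stage.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Invented sample input standing in for a collection.
const sampleStates = [
  { name: "California", region: "West" },
  { name: "New York", region: "North East" },
];

const pipelineResult = runPipeline(sampleStates, [matchWest, projectName]);
console.log(pipelineResult); // [ { name: 'California' } ]
```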
Aggregation Pipeline
Pipeline Operators
• $match: Filter documents
• $project: Reshape documents
• $group: Summarize documents
• $unwind: Expand documents
• $sort: Order documents
• $limit/$skip: Paginate documents
• $redact: Restrict documents
• $geoNear: Proximity sort documents
• $let, $map: Define variables
Aggregation Framework in Action (let's play with the census data)
MongoDB State Collection
• Document For Each State
  - Name
  - Region
  - Division
  - Census Data For 1990, 2000, 2010
    - Population
    - Housing Units
    - Occupied Housing Units
• Census Data is an array with three subdocuments
Document Model
{ "_id" : ObjectId("54e23c7b28099359f5661525"),
  "name" : "California",
  "region" : "West",
  "data" : [
    {"totalPop" : 33871648,
     "totalHouse" : 12214549,
     "occHouse" : 11502870,
     "year" : 2000},
    {"totalPop" : 37253956,
     "totalHouse" : 13680081,
     "occHouse" : 12577498,
     "year" : 2010},
    {"totalPop" : 29760021,
     "totalHouse" : 11182882,
     "occHouse" : 29008161,
     "year" : 1990}
  ],
  ...
}
Count, Distinct
Total US Area
db.cData.aggregate([
  {"$group" : {"_id" : null,
               "totalArea" : {$sum : "$area"},
               "avgArea" : {$avg : "$area"}}}])
$group
• Group documents by value
  - Field reference, object, constant
  - Other output fields are computed
    - $max, $min, $avg, $sum
    - $addToSet, $push
    - $first, $last
  - Processes all data in memory by default
Area By Region
db.cData.aggregate([
  {"$group" : {"_id" : "$region",
               "totalArea" : {$sum : "$area"},
               "avgArea" : {$avg : "$area"},
               "numStates" : {$sum : 1},
               "states" : {$push : "$name"}}}])
Calculating Average State Area By Region
Input documents:
{ state: "New York",
  area: 218,
  region: "North East" }
{ state: "New Jersey",
  area: 90,
  region: "North East" }
{ state: "California",
  area: 300,
  region: "West" }

Stage:
{ $group: {
    _id: "$region",
    avgArea: {$avg: "$area" }
}}

Output:
{ _id: "North East",
  avgArea: 154 }
{ _id: "West",
  avgArea: 300 }
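To make the arithmetic of this slide concrete, here is a minimal sketch that reproduces the grouping in plain JavaScript. `groupAvg` is a hypothetical helper written for illustration; it only mimics what `{ $group: { _id: "$region", avgArea: { $avg: "$area" } } }` computes and is not part of MongoDB.

```javascript
// Sample documents from the slide.
const areaDocs = [
  { state: "New York", area: 218, region: "North East" },
  { state: "New Jersey", area: 90, region: "North East" },
  { state: "California", area: 300, region: "West" },
];

// Hypothetical helper: group by one field and average another.
function groupAvg(docs, keyField, valueField) {
  const buckets = new Map();
  for (const doc of docs) {
    const bucket = buckets.get(doc[keyField]) || { sum: 0, count: 0 };
    bucket.sum += doc[valueField];
    bucket.count += 1;
    buckets.set(doc[keyField], bucket);
  }
  // One output document per distinct key, in first-seen order.
  return [...buckets].map(([key, b]) => ({ _id: key, avgArea: b.sum / b.count }));
}

const avgByRegion = groupAvg(areaDocs, "region", "area");
console.log(avgByRegion);
// [ { _id: 'North East', avgArea: 154 }, { _id: 'West', avgArea: 300 } ]
```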
Calculating Total Area and State Count
Input documents:
{ state: "New York",
  area: 218,
  region: "North East" }
{ state: "New Jersey",
  area: 90,
  region: "North East" }
{ state: "California",
  area: 300,
  region: "West" }

Stage:
{ $group: {
    _id: "$region",
    totArea: {$sum: "$area" },
    sCount : {$sum : 1}}}

Output:
{ _id: "North East",
  totArea: 308,
  sCount: 2}
{ _id: "West",
  totArea: 300,
  sCount: 1}
Total US Population By Year
db.cData.aggregate(
  [{$unwind ...
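The pipeline on this slide is cut off in the source. As a hedged sketch only, one way such a query could be written, assuming the `data`, `year`, and `totalPop` field names from the Document Model slide, is shown below. The variable name `totalPopByYear` and the final `$sort` are illustrative assumptions, not the author's code.

```javascript
// Assumed pipeline: one document per (state, census year), then sum by year.
const totalPopByYear = [
  { $unwind: "$data" },
  { $group: { _id: "$data.year", totalPop: { $sum: "$data.totalPop" } } },
  { $sort: { _id: 1 } }, // assumption: list the years in ascending order
];

// In the mongo shell this would be passed as: db.cData.aggregate(totalPopByYear)
console.log(JSON.stringify(totalPopByYear, null, 2));
```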
$unwind
• Operate on an array field
  - Create documents from array elements
    - Array replaced by element value
    - Missing/empty fields: no output
    - Non-array fields: error
  - Pipe to $group to aggregate
$unwind
Input documents:
{ state: "New York",
  census: [1990, 2000, 2010] }
{ state: "New Jersey",
  census: [1990, 2000] }
{ state: "California",
  census: [1980, 1990, 2000, 2010] }
{ state: "Dela...

Stage:
{ $unwind ...

Output (fragment):
... census: 1990 }
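Because the diagram is incomplete in the source, the following plain-JavaScript sketch illustrates the rules listed on the previous slide. `unwind` is a hypothetical helper, not a MongoDB API; it follows the behavior described in the deck (an error on non-array fields), which newer server versions may handle differently.

```javascript
// Hypothetical helper mimicking { $unwind: "$census" } as the deck describes it.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // missing field: no output
    if (!Array.isArray(value)) {
      throw new Error(`${field} is not an array`); // non-array field: error
    }
    for (const element of value) {
      out.push({ ...doc, [field]: element }); // array replaced by element value
    }
    // An empty array simply produces no output documents.
  }
  return out;
}

const censusDocs = [
  { state: "New York", census: [1990, 2000, 2010] },
  { state: "New Jersey", census: [1990, 2000] },
];
const unwound = unwind(censusDocs, "census");
console.log(unwound.length); // 5
console.log(unwound[0]); // { state: 'New York', census: 1990 }
```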
Southern State Population By Year
db.cData.aggregate(
  [{$match : {"region" : "South"}},
   {$unwind ...
$match
• Filter documents
  - Uses existing query syntax
  - No $where (server side Javascript)
$match
Input documents:
{ state: "New York",
  area: 218,
  region: "North East" }
{ state: "Oregon",
  area: 245,
  region: "West" }
{ state: "California",
  area: 300,
  region: "West" }

Stage:
{ $match:
  { "region" : "West" }
}

Output:
{ state: "Oregon",
  area: 245,
  region: "West" }
{ state: "California",
  area: 300,
  region: "West" }
Population Delta By State from 1990 to 2010
db.cData.aggregate(
  [{$unwind ...
$sort, $limit, $skip
• Sort documents by one or more fields
  - Same order syntax as cursors
  - Waits for earlier pipeline operator to return
  - In-memory unless early and indexed
• Limit and skip follow cursor behavior
Population Delta By State from 1990 to 2010
db.cData.aggregate(
  [{$unwind ...
$first, $last
• Collection operations like $push and $addToSet
• Must be used in $group
• $first and $last determined by document order
• Typically used with $sort to ensure ordering is known
Population Delta By State from 1990 to 2010
db.cData.aggregate(
  [{$unwind ...
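The source cuts this pipeline off after its first stage. The sketch below is one plausible way to combine `$unwind`, `$sort`, and `$group` with `$first`/`$last` to get each state's 1990 and 2010 populations. It assumes the `name`, `data.year`, and `data.totalPop` fields from the Document Model slide and the `"1990"`/`"2010"` output field names that appear on the later `$project` slides. The variable name `popByStateStages` is hypothetical.

```javascript
// Assumed stages, not the author's original code.
const popByStateStages = [
  { $unwind: "$data" },          // one document per state and census year
  { $sort: { "data.year": 1 } }, // make the order known: 1990, 2000, 2010
  { $group: {
      _id: "$name",
      "1990": { $first: "$data.totalPop" }, // earliest year after the sort
      "2010": { $last: "$data.totalPop" },  // latest year after the sort
  } },
];

// In the mongo shell this would be passed as: db.cData.aggregate(popByStateStages)
console.log(JSON.stringify(popByStateStages, null, 2));
```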
$project
• Reshape Documents
  - Include, exclude or rename fields
Including and Excluding Fields
Input documents:
{ "_id" : "Virginia",
  "1990" : 453588,
  "2010" : 3725789}
{ "_id" : "South Dakota",
  "1990" : 453588,
  "2010" : 3725789}

Stage:
{ $project:
  { "_id" : 0,
    "1990" : 1,
    "2010" : 1
}}

Output:
{ "1990" : 453588,
  "2010" : 3725789 }
{ "1990" : 453588,
  "2010" : 3725789 }
Renaming and Computing Fields
Input documents:
{ "_id" : "Virginia",
  "1990" : 6187358,
  "2010" : 8001024 }
{ "_id" : "South Dakota",
  "1990" : 696004,
  "2010" : 814180 }

Stage:
{ $project:
  { "_id" : 0,
    "name" : "$_id",
    "delta" :
      {"$subtract" :
        ["$2010",
         "$1990"]}}
}

Output:
{ "name" : "Virginia",
  "delta" : 1813666 }
{ "name" : "South Dakota",
  "delta" : 118176 }
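As a quick check of the numbers on this slide, the same rename-and-compute step can be sketched in plain JavaScript. This only mirrors what the `$project` stage produces. The variable names are illustrative and nothing here calls MongoDB.

```javascript
// Input documents from the slide.
const groupedStates = [
  { _id: "Virginia", "1990": 6187358, "2010": 8001024 },
  { _id: "South Dakota", "1990": 696004, "2010": 814180 },
];

// Equivalent of the $project stage: rename _id to name and compute
// delta = 2010 population minus 1990 population.
const projected = groupedStates.map((doc) => ({
  name: doc._id,
  delta: doc["2010"] - doc["1990"],
}));

console.log(projected);
// [ { name: 'Virginia', delta: 1813666 }, { name: 'South Dakota', delta: 118176 } ]
```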
Compare number of people living within 500KM of Memphis, TN in 1990, 2000, 2010
db.cData.aggregate([
  {$geoNear : {
     "near" : {"type" : "Point", "coordinates" : [-90, 35]},
     "distanceField" : "dist.calculated",
     "maxDistance" : 500000,
     "includeLocs" : "dist.location",
     "spherical" : true }},
  {$unwind ...
$geoNear
• Order/Filter Documents by Location
  - Requires a geospatial index
  - Output includes physical distance
  - Must be first aggregation stage
$geoNear
Input documents:
{ "_id" : "Tennessee",
  "1990" : 4877185,
  "2010" : 6346105,
  "center" :
    {"type" : "Point",
     "coordinates" :
       [-86.6, 37.8]}
}
{ "_id" : "Virginia",
  "1990" : 6187358,
  "2010" : 8001024,
  "center" :
    {"type" : "Point",
     "coordinates" :
       [-78.6, 37.5]}
}

Stage:
{$geoNear : {
   "near" : {"type" : "Point", "coordinates" :
     [-90, 35]},
   "maxDistance" : 500000,
   "spherical" : true }}

Output:
{ "_id" : "Tennessee",
  "1990" : 4877185,
  "2010" : 6346105,
  "center" :
    {"type" : "Point",
     "coordinates" :
       [-86.6, 37.8]}
}
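A runnable version of this stage needs a `distanceField`, which the diagram omits. The sketch below spells out a fuller stage under stated assumptions: a 2dsphere index exists on the `center` field shown on the slide, and the output field name `dist.calculated` is borrowed from the deck's earlier example. The variable name `nearMemphis` is hypothetical.

```javascript
// Assumes a geospatial index was created beforehand, for example in the shell:
//   db.cData.createIndex({ center: "2dsphere" })
const nearMemphis = [
  { $geoNear: {
      // GeoJSON points are [longitude, latitude]; Memphis is roughly 90 W, 35 N.
      near: { type: "Point", coordinates: [-90, 35] },
      distanceField: "dist.calculated", // where the computed distance is stored
      maxDistance: 500000,              // meters for GeoJSON points, i.e. 500 km
      spherical: true,
  } },
];

// $geoNear has to be the first stage of the pipeline.
// In the mongo shell: db.cData.aggregate(nearMemphis)
console.log(JSON.stringify(nearMemphis, null, 2));
```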
What if I want to save the results to a collection?
db.cData.aggregate([
  {$geoNear : {
     "near" : {"type" : "Point", "coordinates" : [-90, 35]},
     "distanceField" : "dist.calculated",
     "maxDistance" : 500000,
     "includeLocs" : "dist.location",
     "spherical" : true }},
  {$unwind ...
$out
db.cData.aggregate([...,
  {$out : "resultsCollection"}])
• Save aggregation results to a new collection
• New aggregation uses:
  - Transform documents, ETL
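As a small illustration of the pattern, the sketch below appends `$out` to a pipeline. The `$group` stage is borrowed from the earlier Area By Region slide only to give `$out` something to save, and the variable name `saveAreaByRegion` is hypothetical.

```javascript
const saveAreaByRegion = [
  { $group: { _id: "$region", totalArea: { $sum: "$area" } } },
  // $out goes last and names the collection that receives the results.
  { $out: "resultsCollection" },
];

// In the mongo shell: db.cData.aggregate(saveAreaByRegion)
// The results could then be read back with: db.resultsCollection.find()
console.log(JSON.stringify(saveAreaByRegion, null, 2));
```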
Back To The Original Question
• Which US Division has the fastest growing population density?
  - We only want to include states with more than 1M people
  - We only want to include divisions larger than 100K square miles
Division with Fastest Growing Pop Density
db.cData.aggregate(
  [{$match : {"data.totalPop" : {"$gt" : 1000000}}},
   {$unwind ...
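Only the first stage of this pipeline survives in the source. The following is a speculative sketch of how the remaining stages might look, not the author's query. It assumes top-level `division` and `area` fields (area in square miles), treats "fastest growing" as the largest absolute increase in people per square mile between 1990 and 2010, and uses hypothetical names such as `pop1990`, `pop2010`, and `densityDelta`.

```javascript
const fastestGrowingDivision = [
  // Keep states with more than 1M people in at least one census year.
  { $match: { "data.totalPop": { $gt: 1000000 } } },
  { $unwind: "$data" },
  { $sort: { "data.year": 1 } },
  // One document per state with its 1990 and 2010 populations.
  { $group: {
      _id: "$name",
      division: { $first: "$division" },
      area: { $first: "$area" },
      pop1990: { $first: "$data.totalPop" },
      pop2010: { $last: "$data.totalPop" },
  } },
  // Roll the states up into divisions.
  { $group: {
      _id: "$division",
      area: { $sum: "$area" },
      pop1990: { $sum: "$pop1990" },
      pop2010: { $sum: "$pop2010" },
  } },
  // Keep divisions larger than 100K square miles.
  { $match: { area: { $gt: 100000 } } },
  // Density = people / area; report the change from 1990 to 2010.
  { $project: {
      _id: 0,
      division: "$_id",
      densityDelta: { $subtract: [
        { $divide: ["$pop2010", "$area"] },
        { $divide: ["$pop1990", "$area"] },
      ] },
  } },
  { $sort: { densityDelta: -1 } },
  { $limit: 1 },
];

// In the mongo shell: db.cData.aggregate(fastestGrowingDivision)
console.log(fastestGrowingDivision.length); // 9
```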
Aggregate Options
Aggregate options
db.cData.aggregate([...],
  {"explain" : false,
   "allowDiskUse" : true,
   "cursor" : {"batchSize" : ...}})
• explain: similar to find().explain()
• allowDiskUse: enable use of disk to store intermediate results
• cursor: specify the size of the initial result
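For illustration, the options document can be built separately and passed as the second argument to `aggregate`. The pipeline here is a placeholder taken from an earlier slide, and the `batchSize` of 100 is a made-up value, since the slide leaves it blank.

```javascript
// Placeholder pipeline so the options have something to apply to.
const optionsDemoPipeline = [
  { $group: { _id: "$region", totalArea: { $sum: "$area" } } },
];

const aggregateOptions = {
  explain: false,             // true would return the plan instead of results
  allowDiskUse: true,         // let stages spill intermediate results to disk
  cursor: { batchSize: 100 }, // hypothetical value: size of the initial batch
};

// In the mongo shell: db.cData.aggregate(optionsDemoPipeline, aggregateOptions)
console.log(JSON.stringify(aggregateOptions));
```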
Aggregation and Sharding
Sharding
• Workload split between shards
  - Shards execute pipeline up to a point
  - Primary shard merges cursors and continues processing*
  - Use explain to analyze pipeline split
  - Early $match may excuse shards
  - Potential CPU and memory implications for primary shard host
* Prior to v2.6 second stage pipeline processing was done by mongos
Summary
Analytics in MongoDB?
Create, Read, Update, Delete
Analytics?
Group, Count, Derive Values, Filter, Average, Sort
YES!
Framework Use Cases
• Basic aggregation queries
• Ad-hoc reporting
• Real-time analytics
• Visualizing and reshaping data
Questions?
jay.runkel@mongodb.com
@jayrunkel