Date posted: 22-Jun-2015
High Performance, Scalable MongoDB
in a Bare Metal Cloud
Harold Hannon, Sr. Software Architect
Global Footprint
• 100k servers
• 24k customers
• 23 million domains
• 13 data centers
• 16 network POPs
• 20Gb fiber interconnects
On the agenda today…
• Big Data considerations
• Some deployment options
• Performance Testing with JS Benchmarking Harness
• Review some internal product research performed
• Discuss the impact of those findings on our product development
“Build me a Big Data Solution”
Product Use Case
• MongoDB deployed for customers on purchase
• Complex configurations including sharding and replication
• Configurable via Portal interface
• Performance tuned to 3 't-shirt size' deployments
Big Data Requirements
• High Performance
• Reliable, Predictable Performance
• Rapidly Scalable
• Easy to Deploy
Requirements Reviewed

                                    Cloud Provider   Bare Metal Instance
High Performance
Reliable, Predictable Performance
Rapidly Scalable                          X
Easy to Deploy                            X
I’ve got nothing……
The “Marc-O-Meter”
I’M NOT HAPPY
Marc… Angry
Thinking about Big Data
The 3 V’s
Physical Deployment
Cloud Vs Metal
Public Cloud
• Speed of deployment
• Great for bursting use case
• Imaging and cloning make POC/Dev work easy
• Shared I/O
• Great for POC/DEV
• Excellent for App level applications
• Not consistent enough for disk intensive applications
• Must have application developed for "cloud"
Physical Servers
Bare Metal
• Build to your specs
• Robust, quickly scaled environment
• Management of all aspects of environment
• Image Based
• No Hypervisor
• Single Tenant
• Great for Big Data Solutions
The Proof is in the Pudding
Beware The “Best Case Test Case”
Read ops/sec samples across 18 runs:
185,817.6  190,525.4  187,882.2  191,101.8  184,408.8  188,135.4  187,080.6  186,343.4  191,899.6  187,736.6  188,978.8  187,440.0  186,950.4  187,623.0  187,783.8  187,775.8  192,806.8  186,643.2

192,806.8 Read Ops/Sec (the peak of the runs above)
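The spread in these samples is why a single "best case" number misleads. A quick sketch in plain JavaScript (runnable outside the mongo shell) compares the headline peak to the average of the same runs:

```javascript
// The 18 read-throughput samples (ops/sec) quoted above.
var samples = [185817.6, 190525.4, 187882.2, 191101.8, 184408.8, 188135.4,
               187080.6, 186343.4, 191899.6, 187736.6, 188978.8, 187440,
               186950.4, 187623, 187783.8, 187775.8, 192806.8, 186643.2];

var max = Math.max.apply(null, samples);
var mean = samples.reduce(function (a, b) { return a + b; }, 0) / samples.length;

// The headline "best case" is the peak, not what you see on average.
console.log("peak: " + max.toFixed(1));   // 192806.8
console.log("mean: " + mean.toFixed(1));
```

The average lands a few thousand ops/sec below the peak, so always report the distribution, not the single best run.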
Do It Yourself
• Data Set Sizing
• Document/Object Sizes
• Platform
• Controlled client or AFAIC
• Concurrency
• Local or Remote Client
• Read/Write Tests
JS Benchmarking Harness
• Data Set Sizing
• Document/Object Sizes
• Platform
• Controlled client or AFAIC
• Concurrency
• Local or Remote Client
• Read/Write Tests
Quick Example

db.foo.drop();
db.foo.insert( { _id : 1 } );

ops = [ { op: "findOne", ns: "test.foo", query: { _id: 1 } },
        { op: "update", ns: "test.foo", query: { _id: 1 }, update: { $inc: { x: 1 } } } ];

for ( var x = 1; x <= 128; x *= 2 ) {
    res = benchRun( { parallel : x, seconds : 5, ops : ops } );
    print( "threads: " + x + "\t queries/sec: " + res.query );
}
Options
host – The hostname of the machine mongod is running on (defaults to localhost).
username – The username to use when authenticating to mongod (only use if running with auth).
password – The password to use when authenticating to mongod (only use if running with auth).
db – The database to authenticate to (only necessary if running with auth).
ops – A list of objects describing the operations to run (documented below).
parallel – The number of threads to run (defaults to single thread).
seconds – The amount of time to run the tests for (defaults to one second).
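Putting these options together, a typical invocation builds a single argument document and hands it to benchRun in the mongo shell. The host name below is a hypothetical placeholder, not a value from the talk:

```javascript
// Assemble the benchRun options described above.
// "mongo-test-01" is a placeholder host name for illustration.
var benchArgs = {
    host     : "mongo-test-01",   // machine running mongod (defaults to localhost)
    db       : "test",            // only needed when running with auth
    parallel : 8,                 // number of client threads
    seconds  : 10,                // duration of each run
    ops      : [ { op: "findOne", ns: "test.foo", query: { _id: 1 } } ]
};

// In the mongo shell you would then run:
// var res = benchRun(benchArgs);
// print("queries/sec: " + res.query);
```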
Options
ns – The namespace of the collection you are running the operation on; should be of the form "db.collection".
op – The type of operation: "findOne", "insert", "update", "remove", "createIndex", "dropIndex" or "command".
query – The query object to use when querying or updating documents.
update – The update object (same as 2nd argument of the update() function).
doc – The document to insert into the database (only for insert and remove).
safe – Boolean specifying whether to use safe writes (only for update and insert).
{ "#RAND_INT" : [ min , max , <multiplier> ] }
[ 0 , 10 , 4 ] would produce random numbers between 0 and 10 and then multiply them by 4.

{ "#RAND_STRING" : [ length ] }
[ 3 ] would produce a string of 3 random characters.
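The two generators behave roughly like the following plain-JavaScript sketch (an approximation of what benchRun substitutes at run time, not the actual server code):

```javascript
// Approximate #RAND_INT: a random integer drawn from [min, max),
// scaled by an optional multiplier.
function randInt(min, max, multiplier) {
    multiplier = multiplier || 1;
    return (min + Math.floor(Math.random() * (max - min))) * multiplier;
}

// Approximate #RAND_STRING: a string of `length` random characters.
function randString(length) {
    var chars = "abcdefghijklmnopqrstuvwxyz0123456789";
    var out = "";
    for (var i = 0; i < length; i++)
        out += chars.charAt(Math.floor(Math.random() * chars.length));
    return out;
}

var n = randInt(0, 10, 4);   // e.g. 0, 4, 8, … some multiple of 4
var s = randString(3);       // e.g. a 3-character string like "k3f"
```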
var complexDoc3 = { info: { "#RAND_STRING": [30] } }

var complexDoc3 = { info: { inner_field: { "#RAND_STRING": [30] } } }
Dynamic Values
Lots of them here:
https://github.com/mongodb/mongo/tree/master/jstests
Example Scripts
Read Only Test
• Random document size < 4k (mostly 1k)
• 6GB Working Data Set Size
• Random read only
• 10 seconds per query set execution
• Exponentially increasing concurrent clients from 1 to 128
• 48 Hour Test Run
• RAID10, 4 SSD drives
• Local Client
• "Pre-warmed cache"
The Results

Concurrent Clients   Avg Read OPS/Sec
1                    38,288.527
2                    72,103.358
4                    127,451.887
8                    180,798.440
16                   191,817.336
32                   186,429.452
64                   187,011.782
128                  188,187.070
Some Tougher Tests
• Small MongoDB Bare Metal Cloud vs Public Cloud Instance
• Medium MongoDB Bare Metal Cloud vs Public Cloud Instance (SSD and 15K SAS)
• Large MongoDB Bare Metal Cloud vs Public Cloud Instance (SSD and 15K SAS)
Pre-configurations
• Set SSD Read Ahead Defaults to 16 Blocks – SSD drives have excellent seek times, allowing the read-ahead to be shrunk to 16 blocks. Spinning disks might require slight buffering, so these have been set to 32 blocks.
• noatime – Adding the noatime mount option eliminates the need for the system to make writes to the file system for files which are simply being read; in other words, faster file access and less disk wear.
• Turn NUMA Off in BIOS – Linux, NUMA and MongoDB tend not to work well together. If you are running MongoDB on NUMA hardware, we recommend turning it off (running with an interleave memory policy). If you don't, problems will manifest in strange ways like massive slowdowns for periods of time or high system CPU time.
• Set ulimit – We have set the ulimit to 64000 for open files and 32000 for user processes to prevent failures due to a loss of available file handles or user processes.
• Use ext4 – We have selected ext4 over ext3. We found ext3 to be very slow in allocating files (and removing them). Additionally, access within large files is poor with ext3.
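On a typical Linux host these tunings correspond roughly to the commands below. This is a sketch: the device name, mount point, and config path are placeholders, and read-ahead units differ between tools, so verify each value against your own environment.

```shell
# Shrink read-ahead on the SSD data volume
# (blockdev counts 512-byte sectors; /dev/sda is a placeholder device name)
blockdev --setra 16 /dev/sda

# Mount the data filesystem as ext4 with noatime (example /etc/fstab line):
# /dev/sda1  /data  ext4  defaults,noatime  0 0

# If NUMA cannot be disabled in BIOS, start mongod with an
# interleaved memory policy instead
numactl --interleave=all mongod --config /etc/mongod.conf

# Raise per-process limits in the shell that launches mongod
ulimit -n 64000   # open files
ulimit -u 32000   # user processes
```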
Test Environment

[Diagram: the tester's local machine connects via RDP to a JMeter master client, which drives four JMeter servers over RMI across the private network.]
// Expects these variables to be set by the caller: high_id, server,
// maxThreads, testDuration, readTest, updateTest.
var numIterations = 1;
var low_rand = 0;
var RAND_STEP = 32767;

Random.srand((new Date()).valueOf());

var last_id = 0;
function nextId() {
    return last_id++;
}

// Build the operation mix: one ranged findOne per id window,
// plus an update op when update testing is enabled.
var ops = [];
while (low_rand < high_id) {
    if (readTest) {
        ops.push({
            op : "findOne",
            ns : "test.foo",
            query : {
                incrementing_id : {
                    "#RAND_INT" : [ low_rand, low_rand + RAND_STEP ]
                }
            }
        });
    }
    if (updateTest) {
        ops.push({
            op : "update",
            ns : "test.foo",
            query : { incrementing_id : { "#RAND_INT" : [ 0, high_id ] } },
            update : { $inc : { counter : 1 } },
            safe : true
        });
    }
    low_rand += RAND_STEP;
}

// Print one comma-separated result row.
function printLine(tokens) {
    var line = "";
    for (var i = 0; i < tokens.length; i++) {
        line += tokens[i];
        if (i != tokens.length - 1)
            line += " , ";
    }
    print(line + " newline");
}

for (var iteration = 1; iteration <= numIterations; iteration++) {
    print("threads, query/sec, query latency, updates/sec, update latency newline");
    for (var x = 1; x <= maxThreads; x *= 2) {
        var res = benchRun({
            parallel : x,
            seconds : testDuration,
            ops : ops,
            host : server
        });
        printLine([ x,
                    (res.query || 0),
                    (res.findOneLatencyAverageMicros || 0).toFixed(2),
                    (res.update || 0),
                    (res.updateLatencyAverageMicros || 0).toFixed(2) ]);
    }
}
Small Test

Small MongoDB Server (Bare Metal)
• Single 4-core Intel 1270 CPU
• 64-bit CentOS
• 8GB RAM
• 2 x 500GB SATAII – RAID1
• 1Gb Network

Virtual Provider Instance
• 4 Virtual Compute Units
• 64-bit CentOS
• 7.5GB RAM
• 2 x 500GB Network Storage – RAID1
• 1Gb Network

Tests Performed
• Small Data Set (8GB of .5MB documents)
• 200 iterations of 6:1 query-to-update operations
• Concurrent client connections exponentially increased from 1 to 32
• Test duration spanned 48 hours
[Chart: Small Public Cloud – Ops/Second vs Concurrent Clients (1–32); y-axis 0–1,400]
[Chart: Small Bare Metal – Ops/Second vs Concurrent Clients (1–32); y-axis 0–1,600]
Medium Test

Medium MongoDB Server (Bare Metal)
• Dual 6-core Intel 5670 CPUs
• 64-bit CentOS
• 36GB RAM
• 2 x 64GB SSD – RAID1 (Journal Mount)
• 4 x 300GB 15K SAS – RAID10 (Data Mount)
• 1Gb Network – Bonded

Virtual Provider Instance
• 26 Virtual Compute Units
• 64-bit CentOS
• 30GB RAM
• 2 x 64GB Network Storage – RAID1 (Journal Mount)
• 4 x 300GB Network Storage – RAID10 (Data Mount)
• 1Gb Network

Tests Performed
• Data Set (32GB of .5MB documents)
• 200 iterations of 6:1 query-to-update operations
• Concurrent client connections exponentially increased from 1 to 128
• Test duration spanned 48 hours
Medium Test (SSD variant)

Bare Metal Cloud Instance
• Dual 6-core Intel 5670 CPUs
• 64-bit CentOS
• 36GB RAM
• 2 x 64GB SSD – RAID1 (Journal Mount)
• 4 x 400GB SSD – RAID10 (Data Mount)
• 1Gb Network – Bonded

Public Cloud Instance
• 26 Virtual Compute Units
• 64-bit CentOS
• 30GB RAM
• 2 x 64GB Network Storage – RAID1 (Journal Mount)
• 4 x 400GB Network Storage – RAID10 (Data Mount)
• 1Gb Network
[Chart: Medium Public Cloud – Ops/Second vs Concurrent Clients (1–128); y-axis 0–5,000]
[Chart: Medium Bare Metal 15k SAS – Ops/Second vs Concurrent Clients (1–128); y-axis 0–8,000]
[Chart: Medium Bare Metal SSD – Ops/Second vs Concurrent Clients (1–128); y-axis 0–4,500]
Large Test

Large MongoDB Server (Bare Metal)
• Dual 8-core Intel E5-2620 CPUs
• 64-bit CentOS
• 128GB RAM
• 2 x 64GB SSD – RAID1 (Journal Mount)
• 6 x 600GB 15K SAS – RAID10 (Data Mount)
• 1Gb Network – Bonded

Virtual Provider Instance
• 26 Virtual Compute Units
• 64-bit CentOS
• 64GB RAM (Maximum available on this provider)
• 2 x 64GB Network Storage – RAID1 (Journal Mount)
• 6 x 600GB Network Storage – RAID10 (Data Mount)
• 1Gb Network

Tests Performed
• Data Set (64GB of .5MB documents)
• 200 iterations of 6:1 query-to-update operations
• Concurrent client connections exponentially increased from 1 to 128
• Test duration spanned 48 hours
Large Test (SSD variant)

Bare Metal Cloud Instance
• Dual 8-core Intel E5-2620 CPUs
• 64-bit CentOS
• 128GB RAM
• 2 x 64GB SSD – RAID1 (Journal Mount)
• 6 x 400GB SSD – RAID10 (Data Mount)
• 1Gb Network – Bonded

Public Cloud Instance
• 26 Virtual Compute Units
• 64-bit CentOS
• 64GB RAM (Maximum available on this provider)
• 2 x 64GB Network Storage – RAID1 (Journal Mount)
• 6 x 400GB Network Storage – RAID10 (Data Mount)
• 1Gb Network
[Chart: Large Public Cloud – Ops/Second vs Concurrent Clients (1–128); y-axis 0–6,000]
[Chart: Large Bare Metal 15k SAS – Ops/Second vs Concurrent Clients (1–128); y-axis 0–7,000]
[Chart: Large Bare Metal SSD – Ops/Second vs Concurrent Clients (1–128); y-axis 0–6,000]
Superior Performance

Deployment Size   Bare Metal Drive Type   Bare Metal Average Performance Advantage over Virtual
Small             SATA II                 70%
Medium            15k SAS                 133%
Medium            SSD                     297%
Large             15k SAS                 111%
Large             SSD                     446%
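The "advantage" column is simply the relative throughput gain of bare metal over the virtual instance. A minimal sketch, using hypothetical averages of 7,000 ops/sec (bare metal) against 3,000 ops/sec (virtual) for illustration, not measurements from these tests:

```javascript
// Percentage advantage of bare metal over virtual:
// (bare - virtual) / virtual * 100
function advantagePercent(bareOps, virtualOps) {
    return (bareOps - virtualOps) / virtualOps * 100;
}

// Hypothetical throughput averages, for illustration only.
console.log(advantagePercent(7000, 3000).toFixed(0) + "%");  // "133%"
```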
Consistent Performance

RSD (Relative Standard Deviation) by Platform

Deployment   Virtual Instance   Bare Metal Instance
Small        6–36%              1–9%
Medium       8–43%              1–8%
Large        8–93%              1–9%
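Relative standard deviation here is the standard deviation of per-run throughput expressed as a percentage of the mean, so a lower number means a steadier platform. A small sketch of the computation:

```javascript
// RSD (%) = (population standard deviation / mean) * 100
function rsd(values) {
    var mean = values.reduce(function (a, b) { return a + b; }, 0) / values.length;
    var variance = values.reduce(function (a, v) {
        return a + (v - mean) * (v - mean);
    }, 0) / values.length;
    return Math.sqrt(variance) / mean * 100;
}

// A steady platform scores low; a noisy one scores high.
console.log(rsd([100, 101, 99, 100]).toFixed(1) + "%");   // "0.7%"
console.log(rsd([100, 150, 60, 120]).toFixed(1) + "%");   // "30.4%"
```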
Requirements Reviewed

                                    Cloud Provider   Bare Metal Instance
High Performance                                           X
Reliable, Predictable Performance                          X
Rapidly Scalable                          X
Easy to Deploy                            X
Not Quite There Yet……
The “Marc-O-Meter”
NOT SURE IF WANT
The Dream
The Reality
[Diagram: a cluster of virtual instances, each backed by striped network-attached virtual volumes]

Deployment Complexity
[Diagram: every virtual instance in the cluster requires its own set of striped network-attached virtual volumes]
Deployment Serenity: The Solution Designer
MongoDB Solutions
• Preconfigured
• Performance Tuned
• Bare Metal Single Tenant
• Complex Environment Configurations
Requirements Reviewed

                                    Cloud Provider   Bare Metal Instance
High Performance                                           X
Reliable, Predictable Performance                          X
Rapidly Scalable                          X                X
Easy to Deploy                            X                X
The “Marc-O-Meter”
B+ FOR EFFORT
Customer Feedback“We have over two terabytes of raw event data coming in every day ... Struq has been able to process over 95 percent of requests in fewer than 30 milliseconds”
- Aaron McKee CTO, Struq
The “Marc-O-Meter”
WIN!!
Summary
• Bare Metal Cloud can be leveraged to simplify deployments
• Bare Metal offers significantly higher and more consistent performance than Public Cloud
• Public Cloud is best suited for Dev/POC or when running data sets in memory only
More information:
www.softlayer.com
blog @ http://sftlyr.com/bdperf