
CCB12 Benchmarking Couchbase

Date post: 20-Aug-2015
Upload: couchbase

Copyright © Altoros Systems, Inc. | CONFIDENTIAL

Benchmarking Couchbase Server

Altoros Systems, Inc.

Presented by Frank Weigel, VP Products, Couchbase


Presentation Outline

• Benchmark Goals
• Benchmark Design and Scenario
• Benchmarking Tools
• Benchmark Results


About Altoros

• Software delivery acceleration specialist for big data application implementation services
• 200+ employees globally (Eastern Europe, US, UK, Denmark, Norway)
• Big data practice areas:
  – Advertising analytics
  – Automated device analytics
  – Big data warehouse

Implementation Partner

Partners

Customers


Why Benchmark NoSQL technologies?

• All NoSQL technologies say they are “high performance and scalable”

But this isn’t helpful to end users

• Performance needs to be measured for meaningful workloads
  – To help users understand the performance characteristics of databases under those workloads

• So we decided to compare the commonly used NoSQL databases
  – MongoDB 2.2RC
  – Cassandra 1.1.2
  – Couchbase Server 2.0 (recent build)


Benchmark Goals

• Reproducible by anyone
  – Open Source workload generator
• Focus on the use case for which NoSQL is typically selected
• Use a realistic workload
  – Simulate steady state of application running
  – Meaningful data amounts & runtime
• Compare latency vs throughput
• Measure max throughput (for given scenario)


Benchmarking Scenario

• For interactive web application
• Scalability and performance are the most common requirements
  – Typically leads to users selecting NoSQL over RDBMS
• The working set of data changes with time
  – End users using the application change over time
  – Example: every few hours, every few days, every few weeks
• There is more data available than memory (RAM)
• Replication is used for fault tolerance
• Real world data sizes
• Use EC2 as deployment platform
  – Commonly used
  – Easy to replicate results


Benchmarking Scenario Details

Hardware
• 4 Amazon m1.xlarge instances for the NoSQL DBs
• 1 instance used as the client

Workload details
• Operations are a mix of C:R:U:D in the ratio 5:60:33:2
• Each document roughly 1.5-2K in size (15 fields * 100 bytes)
• 15 million active and 15 million replica documents
• Workload with sliding working set
• Load phase, warm-up phase, access phase
• Runtime of the access phase ~1 hour
• Latency measured for varying throughput - 3 times for each run
• Focus on transaction performance
  – Latency
  – Throughput
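To make the 5:60:33:2 operation mix concrete, here is a minimal Python sketch of how a workload driver could pick the next operation. It is an illustration only, not the YCSB implementation; the names `OPERATION_WEIGHTS`, `operation_fractions` and `choose_operations` are hypothetical.

```python
import random

# Operation weights taken from the slide: C:R:U:D = 5:60:33:2.
OPERATION_WEIGHTS = {"create": 5, "read": 60, "update": 33, "delete": 2}


def operation_fractions(weights):
    """Convert integer ratio weights into fractions that sum to 1."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def choose_operations(weights, count, seed=None):
    """Draw `count` operation names with probability proportional to weight."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)
```

With these weights, `operation_fractions(OPERATION_WEIGHTS)` gives 5% creates, 60% reads, 33% updates and 2% deletes, so 35% of the traffic is inserts or updates.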


What was measured?

• Latency
  – Round trip time taken for a request to execute from the client to the server and back
  – Average, 95th and 99th percentile measured
  – Why is this important?
    – You want your users to have a great experience
    – Not just an “average” one

• Throughput
  – Throughput was varied from 1K ops/sec to 25K ops/sec depending on NoSQL database
  – Max throughput was measured
  – Why is this important?
    – You want your app to support hundreds of thousands of users

Workloads are not rate limited, focused on max throughput.
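The difference between an average and a tail percentile can be shown with a short Python sketch. It uses the nearest-rank percentile definition, which is an assumption here (the slides do not say how the percentiles were computed), and the sample latencies are synthetic, not benchmark results.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct% of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the three statistics named on the slide."""
    return {
        "average": sum(latencies_ms) / len(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


# Synthetic example: 98 fast requests and 2 slow ones.
EXAMPLE_LATENCIES_MS = [1] * 98 + [100] * 2
```

For the synthetic list, the 95th percentile is still 1 ms while the 99th percentile is 100 ms. The average of about 3 ms hides the fact that some users wait a hundred times longer than most.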


YCSB


Benchmark Implementation: YCSB

• Yahoo! team offered a “standard” benchmark

• Yahoo! Cloud Serving Benchmark (YCSB)
  – Focus on database
  – Focus on performance

• YCSB Client consists of 2 parts
  – Workload generator
  – Workload scenarios


Why YCSB

• Open source

• Extensible

• Rich selection of connectors
  – Azure, BigTable, Cassandra, CouchDB, Dynomite, GemFire, HBase, Hypertable, Infinispan, MongoDB, PNUTS, Redis, Voldemort, GigaSpaces XAP
  – Connector for Sharded RDBMS (e.g. MySQL)

• We developed a few connectors
  – Accumulo, Couchbase, Riak
  – Connector for Shared Nothing RDBMS (e.g. MySQL Cluster)


How YCSB Works


THE CONFIGURATIONS


Cluster specification

Amazon m1.xlarge Instances * 4
• 15 GB memory
• 4 virtual cores
• 4 EBS 50 GB volumes in RAID0
• 64-bit Amazon Linux

* Extra nodes for masters, routers, etc

Amazon m1.xlarge Instance (YCSB Client)
• 15 GB memory
• 4 virtual cores
• 4 EBS 50 GB volumes in RAID0
• 64-bit Amazon Linux (CentOS binary compatible)


Workload Generator Specs

Hotspot generator with sliding window:

hotspotslidingspeed=10
  Speed of the hot set window movement, measured in keys per second, with a default value of 10 keys/sec (can be overridden in the workload properties file).
hotspotdatafraction=0.2
  Proportion of the hot data set to the whole dataset; default is 0.2.
hotspotoperationfraction=0.9
  How often the hot dataset is queried compared to the cold dataset; default is 0.8, 0.9 was used.
lowerbound=0
  The minimal key value allowed to be queried. Set to 0.
upperbound=15000000
  The maximum key value allowed to be queried. Set to 15 million.

Also specification of the client process, which drives the workload:

threadcount=30
  Number of parallel threads spawned on the client node to drive the benchmark.
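A rough Python sketch of how a hotspot generator with a sliding window could behave under these settings follows. It is not the code from the Altoros YCSB fork: the class name, the wrap-around at the end of the key range, and treating `upperbound` as exclusive are all assumptions made for illustration. Only the parameter meanings come from the slide.

```python
import random


class SlidingHotspotGenerator:
    """Hypothetical key chooser with a hot window that slides over time."""

    def __init__(self, lower_bound=0, upper_bound=15_000_000,
                 data_fraction=0.2, operation_fraction=0.9,
                 sliding_speed=10, seed=None):
        if not 0 < data_fraction < 1:
            raise ValueError("data_fraction must be between 0 and 1")
        self.lower = lower_bound
        self.size = upper_bound - lower_bound      # upper bound exclusive (assumption)
        self.hot_size = int(self.size * data_fraction)
        self.operation_fraction = operation_fraction
        self.sliding_speed = sliding_speed         # keys per second
        self.rng = random.Random(seed)

    def hot_start(self, elapsed_seconds):
        """Offset of the first hot key after the window has slid."""
        return int(elapsed_seconds * self.sliding_speed) % self.size

    def is_hot(self, key, elapsed_seconds):
        """True if `key` lies inside the hot window at the given time."""
        start = self.hot_start(elapsed_seconds)
        return (key - self.lower - start) % self.size < self.hot_size

    def next_key(self, elapsed_seconds):
        """Pick a hot key with probability operation_fraction, else a cold key."""
        start = self.hot_start(elapsed_seconds)
        if self.rng.random() < self.operation_fraction:
            offset = start + self.rng.randrange(self.hot_size)
        else:
            cold_size = self.size - self.hot_size
            offset = start + self.hot_size + self.rng.randrange(cold_size)
        # Wrap around the end of the key range (assumption).
        return self.lower + offset % self.size
```

With the slide's values, the hot set covers 3 million of the 15 million keys and receives 90% of the operations. At 10 keys per second the window start moves by 36,000 keys over a one-hour access phase.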


Couchbase Configuration

• 4 node Couchbase cluster
• 1 replica setting
• Each node has some active and some replica data
• 12 GB (12288 MB) used as the Couchbase bucket size per node


MongoDB Configuration

• 4 shards, each with 1 replica (replication factor – 1), where each shard is a set of 2 nodes - primary and secondary
• Journaling disabled (trying to maximize performance)
• var shards = [
        "shard1/ycsb-node1:27017,ycsb-node2:27018",
        "shard2/ycsb-node2:27017,ycsb-node1:27018",
        "shard3/ycsb-node3:27017,ycsb-node4:27018",
        "shard4/ycsb-node4:27017,ycsb-node3:27018"
  ];

Each node running
• 2 mongod processes (all together 8 mongod processes on 4 nodes)
• 4 mongos processes, which is the MongoDB router process, on port 27019
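The shard layout can be checked mechanically. The Python sketch below parses the shard strings from the slide (in the `replicaSetName/host:port,host:port` form) and counts how many mongod processes land on each node. The helper names `parse_shard` and `mongod_per_node` are hypothetical.

```python
from collections import Counter

# Shard definitions copied from the slide.
SHARDS = [
    "shard1/ycsb-node1:27017,ycsb-node2:27018",
    "shard2/ycsb-node2:27017,ycsb-node1:27018",
    "shard3/ycsb-node3:27017,ycsb-node4:27018",
    "shard4/ycsb-node4:27017,ycsb-node3:27018",
]


def parse_shard(spec):
    """Split 'name/host:port,host:port' into (name, [(host, port), ...])."""
    name, members = spec.split("/", 1)
    hosts = []
    for member in members.split(","):
        host, port = member.rsplit(":", 1)
        hosts.append((host, int(port)))
    return name, hosts


def mongod_per_node(shards):
    """Count the mongod processes each host runs across all shards."""
    counts = Counter()
    for spec in shards:
        _, hosts = parse_shard(spec)
        for host, _port in hosts:
            counts[host] += 1
    return counts
```

Running `mongod_per_node(SHARDS)` gives a count of 2 for each of the four nodes, matching the 8 mongod processes stated on the slide. The nodes are paired, so ycsb-node1 and ycsb-node2 hold the members of shard1 and shard2, and ycsb-node3 and ycsb-node4 hold shard3 and shard4.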


Cassandra Configuration

• Cassandra JVM settings:
  – 1.1) MAX_HEAP_SIZE, which is a total amount of memory dedicated to the Java heap - 6G
  – 1.2) HEAP_NEWSIZE, total amount of memory for the new generation of objects - 400M

• Cassandra settings:
  – 2.1) RandomPartitioner was used which distributes rows across the cluster evenly by MD5
  – 2.2) Memtable size 4048 MB
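The idea of spreading rows evenly by MD5 can be illustrated with a simplified Python sketch. It is not Cassandra's code: it assumes four nodes owning equal, contiguous slices of a token space of size 2^127, and it ignores replication. The function names and the `user<N>` key format are hypothetical.

```python
import hashlib
from collections import Counter

TOKEN_SPACE = 2 ** 127  # assumed size of the token range


def md5_token(key):
    """Map a row key to a non-negative token derived from its MD5 hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owning_node(key, node_count=4):
    """Pick the node whose equal-sized token slice contains the key's token."""
    index = md5_token(key) * node_count // TOKEN_SPACE
    # A token equal to TOKEN_SPACE would fall one past the last slice.
    return min(index, node_count - 1)


def distribution(keys, node_count=4):
    """Count how many of the given keys each node would own."""
    return Counter(owning_node(key, node_count) for key in keys)
```

Calling `distribution(f"user{i}" for i in range(40000))` places close to a quarter of the keys on each node, even though the keys themselves are sequential. That is the property the slide refers to.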


THE RESULTS


Reads (Average time)

Figure: Read latencies against throughput. X axis: Operations per Second (0 to 22000). Y axis: Average Latency [ms] (0 to 7000). Series: MongoDB, Cassandra, Couchbase.


Reads (95th percentile)

Figure: Read latencies against throughput. X axis: Operations per Second (0 to 22000). Y axis: 95th Percentile Latency [ms] (0 to 18000). Series: Couchbase, Cassandra, MongoDB.


Reads (99th percentile)

Figure: Read latencies against throughput. X axis: Operations per Second (0 to 22000). Y axis: 99th Percentile Latency [ms] (0 to 60000). Series: MongoDB, Cassandra, Couchbase.


Mongo Replica Reads

• MongoDB setup had 4 shards
  – By default only masters will service reads
• To allow replica reads and still be comparable, need to ensure that replica data is up-to-date
  – This was done using write-concern (REPLICAS_SAFE)
• Tests showed that results did not improve
  – This includes results for writes


Writes (Average time)

Figure: Insert and Update latencies against throughput. X axis: Operations per Second (0 to 22000). Y axis: Average Latency [ms] (0 to 5000). Series: MongoDB, Cassandra, Couchbase.


Writes (95th percentile)

Figure: Insert and update latencies against throughput. X axis: Operations per Second (0 to 22000). Y axis: 95th Percentile Latency [ms] (0 to 30000). Series: MongoDB, Cassandra, Couchbase.


Writes (99th percentile)

Figure: Insert and update latencies against throughput. X axis: Operations per Second (0 to 22000). Y axis: 99th Percentile Latency [ms] (0 to 50000). Series: MongoDB, Cassandra, Couchbase.


Results Analysis

• Couchbase
  – Showed the lowest latencies and the highest throughput
  – Latency was independent of throughput up to three quarters of the maximum achievable throughput (for both reads and writes)

• Cassandra
  – Had the highest latencies of all the databases
  – Showed higher maximum throughput than MongoDB, but only 60% of the throughput achieved by Couchbase
  – Latencies rose fast as throughput was increased

• MongoDB
  – Read latencies were better than Cassandra's but higher than Couchbase's
  – Maximum throughput for reads and writes was the lowest of all the databases
    – Particularly for writes, high latencies were seen at average throughput
    – The coarse write lock seems to have a big impact on performance


Other Thoughts

• Evaluate before choosing the “horse”

• Construct your own or use existing workloads

• Benchmark it

• Tune database!

• Benchmark it again

Amazon EC2 observations

• Scales perfectly for NoSQL

• EBS slows down database on reads

• RAID0 it! Use 4 disks in the array (a good choice); some reported that performance degraded with a higher number (6 or more)


YCSB Connectors

github.com/Altoros/YCSB


Thank you!


[email protected]
@renatkhasanshyn
Tel. (650) 395-7002

