
Денис Нелюбин, "Тамтэк"

Date post: 12-May-2015
HighLoad++ 2013
Page 1: Денис Нелюбин, "Тамтэк"

NoSQL Database Performance Testing

Денис Нелюбин

Page 2

Thumbtack Technology Inc.

Page 4

Aerospike

A.K.A.: Citrusleaf

Creator: Aerospike, August 2012

License: Proprietary, Community edition

Category: Key-value, Complex data types + Secondary indexes (from v.3.0)

Page 5

Couchbase

A.K.A.: CouchDB + Membase

Creator: Couchbase, Inc. (CouchOne + Membase), January 2012

License: Apache 2.0, Proprietary (Enterprise edition)

Category: Key-value, Document + Secondary indexes (from v.2.0)

Page 6

Cassandra

A.K.A.: Apache Cassandra

Creator: Facebook, July 2008

License: Apache 2.0

Category: Key-value, BigTable, Column-oriented

Page 7

MongoDB

A.K.A.: Mongo

Creator: 10gen (MongoDB, Inc.), March 2010

License: AGPL, Commercial license

Category: Document-oriented

Page 8

A.K.A.: Yahoo! Cloud Serving Benchmark

Creator: Yahoo! Research, June 2010

License: Apache 2.0

Category: NoSQL benchmark

YCSB

Page 9

YCSB

Data set

● key: "user" + 64-bit Fowler-Noll-Vo hash
● value: 10 fields of random data

Load

● insert N records

Run

● update and read on N records by the key
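
The data set described above can be sketched in a few lines. This is only an illustration, not YCSB's actual code: it assumes the FNV-1a variant of the hash, hashes the little-endian bytes of a record sequence number, and picks a field length of 10 bytes so that a record carries about 100 bytes of payload. The helper names `make_key` and `make_record` are hypothetical.

```python
import random
import string

FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h


def make_key(seq: int) -> str:
    """Build a key of the form 'user<hash>' from a record sequence number (assumed scheme)."""
    return "user" + str(fnv1a_64(seq.to_bytes(8, "little")))


def make_record(rng: random.Random, fields: int = 10, field_len: int = 10) -> dict:
    """Build a value with `fields` columns of random alphanumeric data."""
    alphabet = string.ascii_letters + string.digits
    return {
        f"field{i}": "".join(rng.choice(alphabet) for _ in range(field_len))
        for i in range(fields)
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print(make_key(1), make_record(rng))
```

Hashing the sequence number spreads consecutive inserts across the key space, so the load phase does not hit the cluster nodes in key order.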

Page 10

YCSB

Page 11

YCSB

Does NOT do, does NOT check:

● join
● secondary index
● where clause
● partial update

Page 12

Why YCSB?

● applicable to any database
● popular
● de-facto standard

Page 13

Hardware

Servers:

4 * (8 * Xeon + 32GB RAM + 4 * 120GB SSD)

Clients:

8 * (4 * i5 + 4GB RAM)

Single client is not enough

Page 14

Hardware: CPU

8 cores Xeon ≈ 4 cores i5 *

* (unproven)

Page 15

Hardware: Network

1 Gbps is not enough

1 Gbit/sec / 1 KB of data ≈ 100 000 ops/sec
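
The estimate above is simple arithmetic, sketched here as a small helper. The `overhead_fraction` parameter is an assumption added for illustration: it stands in for protocol headers and request traffic, which the slide does not quantify.

```python
def max_ops_per_sec(link_bits_per_sec: float, payload_bytes: int,
                    overhead_fraction: float = 0.0) -> float:
    """Upper bound on operations per second that a network link can carry.

    overhead_fraction is the assumed share of bandwidth lost to headers,
    acknowledgements and request traffic (0.0 means none).
    """
    usable_bytes_per_sec = link_bits_per_sec / 8 * (1.0 - overhead_fraction)
    return usable_bytes_per_sec / payload_bytes


if __name__ == "__main__":
    # Raw ceiling for 1 KB payloads on a 1 Gbit/s link.
    print(max_ops_per_sec(1e9, 1000))
    # The same link with an assumed 20% overhead.
    print(max_ops_per_sec(1e9, 1000, overhead_fraction=0.2))
```

With no overhead the ceiling is 125,000 ops/sec for 1,000-byte payloads; any realistic overhead brings it down toward the slide's round figure of 100,000.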

Single IO queue on single CPU is not enough

# cat /proc/interrupts | grep eth
90:         0 0 0 0 IR-PCI-MSI-edge eth0
91: 275107859 0 0 0 IR-PCI-MSI-edge eth0-TxRx-0
92: 227858040 0 0 0 IR-PCI-MSI-edge eth0-TxRx-1
93: 242082684 0 0 0 IR-PCI-MSI-edge eth0-TxRx-2
94: 230651008 0 0 0 IR-PCI-MSI-edge eth0-TxRx-3
95: 217273950 0 0 0 IR-PCI-MSI-edge eth0-TxRx-4
96: 240149262 0 0 0 IR-PCI-MSI-edge eth0-TxRx-5
97: 194736879 0 0 0 IR-PCI-MSI-edge eth0-TxRx-6
98: 270089080 0 0 0 IR-PCI-MSI-edge eth0-TxRx-7
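
In the listing above every non-zero counter sits in the first CPU column, which is the bottleneck the slide points at. A rough way to detect this automatically is to sum the per-CPU counters for the NIC's interrupt lines. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it only handles the column layout shown in the listing.

```python
from typing import List


def irq_counts_per_cpu(interrupts_text: str, device: str = "eth0") -> List[int]:
    """Sum interrupt counts per CPU column for lines whose last field mentions `device`."""
    totals: List[int] = []
    for line in interrupts_text.splitlines():
        parts = line.split()
        # Skip the CPU header line and anything that is not "<irq>: ... <name>".
        if len(parts) < 2 or not parts[0].endswith(":") or device not in parts[-1]:
            continue
        counts = []
        for token in parts[1:]:
            if not token.isdigit():
                break  # reached the interrupt type / device name columns
            counts.append(int(token))
        if len(totals) < len(counts):
            totals.extend([0] * (len(counts) - len(totals)))
        for cpu, value in enumerate(counts):
            totals[cpu] += value
    return totals


def busy_cpus(totals: List[int]) -> List[int]:
    """Indices of the CPUs that handled at least one interrupt."""
    return [cpu for cpu, total in enumerate(totals) if total > 0]


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        totals = irq_counts_per_cpu(f.read())
    print("CPUs handling NIC interrupts:", busy_cpus(totals))
```

If `busy_cpus` returns a single index while the NIC exposes several TxRx queues, the queues are not being spread across cores.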

Page 16

Hardware: SSD

Overprovisioning

● hdparm
● fdisk

http://en.wikipedia.org/wiki/Write_amplification

Page 17

OS (GNU/Linux)

ulimit

● nofile > 4k

RAID (RAID 0?)

● mdadm
● lvm

Read-ahead

● minimal
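
The open-files limit from this checklist can be verified from inside a process before a test run. This is a minimal sketch for Linux, assuming that "4k" means 4096; the function name `nofile_limit_ok` is hypothetical.

```python
import resource


def nofile_limit_ok(minimum: int = 4096) -> bool:
    """Return True if the soft limit on open files exceeds `minimum`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft > minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"nofile soft={soft} hard={hard} ok={nofile_limit_ok()}")
```

The check reads the limit of the current process, so it should run under the same user and shell that will start the database or the benchmark client.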


Page 18

Test

Data sets:

● RAM: 50M * 100 byte ≈ 5GB
● SSD: 500M * 100 byte ≈ 50GB
● replication factor = 2
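
The sizes above follow from records × record size, and the replication factor doubles what the cluster has to hold. A short worked sketch, counting raw payload only (keys, indexes and per-record storage overhead are ignored, which is an assumption of this example):

```python
GB = 10**9


def dataset_bytes(records: int, record_bytes: int, replication_factor: int = 1) -> int:
    """Raw payload size of a data set, including replicas."""
    return records * record_bytes * replication_factor


if __name__ == "__main__":
    for name, records in (("RAM", 50_000_000), ("SSD", 500_000_000)):
        single = dataset_bytes(records, 100)
        replicated = dataset_bytes(records, 100, replication_factor=2)
        print(f"{name}: {single / GB:.0f} GB, {replicated / GB:.0f} GB with replication")
```

With replication factor 2 the cluster stores about 10 GB for the RAM data set and about 100 GB for the SSD data set.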

Workloads:

● Heavy Write: 50% update / 50% read
● Mostly Read: 5% update / 95% read
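
A workload mix like the two above boils down to choosing each operation at random with a fixed update probability. The following is a hedged sketch of that idea, not the benchmark's own implementation; the `WORKLOADS` table and `next_operation` helper are hypothetical names.

```python
import random

# Fraction of operations that are updates; the rest are reads.
WORKLOADS = {
    "heavy_write": 0.50,
    "mostly_read": 0.05,
}


def next_operation(rng: random.Random, update_fraction: float) -> str:
    """Pick 'update' with probability `update_fraction`, otherwise 'read'."""
    return "update" if rng.random() < update_fraction else "read"


if __name__ == "__main__":
    rng = random.Random(42)
    for name, fraction in WORKLOADS.items():
        ops = [next_operation(rng, fraction) for _ in range(10_000)]
        print(name, ops.count("update") / len(ops))
```

Over a long run the observed share of updates converges to the configured fraction, which is what makes results from different databases comparable.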


Consistency:

● Sync replication
● Async replication

Page 19

Insert, RAM

Page 20

Insert, SSD

Page 21

Heavy Update, RAM

Page 22

Heavy Update, SSD

Page 23

Heavy Update, Latency

Page 24

Mostly Read, RAM

Page 25

Mostly Read, SSD

Page 26

Speed

Insert: Couchbase*, Aerospike, Cassandra, MongoDB

Update: Couchbase*, Aerospike, Cassandra, MongoDB

Read: Aerospike, Couchbase*, MongoDB, Cassandra

* in memory or on smaller data set

Page 27

Failover test

● 50%, 75%, 100% of max throughput
● Heavy Update

● 10min warmup
● kill -9
● 10min without one node
● service start
● 20min after restore
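
The procedure above amounts to a 40-minute timeline with two events. A small sketch of that schedule can help when labelling throughput and latency samples by phase. The phase names and helper functions are hypothetical, chosen only for this illustration.

```python
from typing import Optional

# (phase name, duration in minutes), in the order given on the slide.
PHASES = [
    ("warmup", 10),
    ("one node down", 10),
    ("after restore", 20),
]

# Minute at which each action is taken.
EVENTS = {
    10: "kill -9 on one node",
    20: "service start on that node",
}

# Share of the maximum throughput at which the test is repeated.
LOAD_LEVELS = (0.50, 0.75, 1.00)


def phase_at(minute: float) -> Optional[str]:
    """Return the phase that contains `minute`, or None outside the test."""
    start = 0
    for name, duration in PHASES:
        if start <= minute < start + duration:
            return name
        start += duration
    return None


def target_throughput(max_ops_per_sec: float, level: float) -> float:
    """Throughput to hold during a run at the given load level."""
    return max_ops_per_sec * level


if __name__ == "__main__":
    for minute in (0, 9, 10, 19, 20, 39):
        print(minute, phase_at(minute))
```

Tagging each sample with its phase makes it straightforward to compare behaviour before the kill, while the node is missing, and after it rejoins.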

Page 28

Aerospike, Sync, SSD, 50%

Page 29

Cassandra, Async, SSD, 50%

Page 30

Couchbase, Async, RAM, 100%

Page 31

MongoDB, Async, SSD, 50%

Page 36

Replication

MongoDB Cassandra Couchbase/Aerospike

Page 37

Data storing reliability

Cassandra: archive

MongoDB: live data

Aerospike: fast cache, eviction

Couchbase: cache (async only)

Page 38

Capacity

Cassandra: packed archive

MongoDB: unpacked live data

Aerospike: indexes in RAM + SSD

Couchbase*: metadata and cache in RAM

* was able to take only 200M records

Page 39

Deployment

Couchbase: four clicks

Aerospike: powerful config

Cassandra: config+config+calculator

MongoDB: shards of replica-sets

Page 40

Managing

Couchbase: superduperwebconsole

MongoDB: commands and docs *

Cassandra: exists **

Aerospike: raw ***

* use MMS (MongoDB Management Service)
** use DataStax products
*** try AMC (Aerospike Management Console)

Page 41

Unique features

Aerospike

● SSD support, speed

Couchbase

● good web console, easy deployment

Cassandra

● writes faster than reads ;)

MongoDB

● documents

Page 42

Troubles

Aerospike

● eviction
● secret config options
● long start

Couchbase

● big data
● strange client behaviour
● long start
● long shutdown


Page 43

Troubles

Cassandra

● need to think about the config ;)

MongoDB

● mongos has to be restarted
● replica set survives too well ;)


Page 44

When to use: Aerospike

Big Fast Cache


Page 45

When to use: Couchbase

In-memory Cache with Persistence


Page 46

When to use: Cassandra

Big-Data Archive


Page 47

When to use: MongoDB

Universal DB for Web


Page 48

Not only YCSB

● From scratch, inspired by YCSB
● More tests
  ○ Secondary indexes (cardinality, overhead)
  ○ Aggregation (average value)
  ○ Collection data types (stack, array, wide row)
