Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Date post: 21-Apr-2017
Upload: sematext-group-inc
Transcript
Page 1: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Elasticsearch & Docker

Rafał Kuć – Sematext Group, Inc.
@kucrafal @sematext sematext.com

Running High Performance, Fault Tolerant Elasticsearch Clusters On Docker

Page 2: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

About me…

Sematext consultant & engineer
Solr.pl co-founder
Father and husband :)

Page 3: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Next 30 minutes

Page 7: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

You Are Probably Familiar With This

Development Test QA

Production environment

Page 10: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

And The Problems That Come With It

Resources not utilized

Overprovisioned servers

≠ ≠

Page 11: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

The solution

Development Test QA Production

Page 12: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Container Technologies

Page 13: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

What is Docker?

Lightweight

Based on Open Standards

Secure

Page 22: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Containers vs Virtual Machines

Traditional Virtual Machine (bottom to top):
Hardware
Host Operating System
Hypervisor
Guest OS | Guest OS
Libraries | Libraries
Application 1 | Application 2

Container (bottom to top):
Hardware
Host Operating System
Docker Engine
Libraries | Libraries
Application 1 | Application 2

Page 23: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

What is Elasticsearch?

Reasonable defaults

{ JSON }

Distributed by design

http://www.dailypets.co.uk/2007/06/17/kittens-rest-at-half-time/

Page 28: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Running Official Elasticsearch Container

$ docker run -d elasticsearch
(equivalent to: docker run -d elasticsearch:latest)

$ docker run -d elasticsearch:1.7

$ docker run --name es_1 -h es_master_1 elasticsearch

Page 32: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Container Constraints

$ docker run -d -m 2G elasticsearch

$ docker run -d -m 2G --memory-swappiness=0 elasticsearch

$ docker run -d --cpuset-cpus="1,3" elasticsearch

http://docs.docker.com/engine/reference/run/

$ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch
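
The flags above can be combined on a single container. The sketch below only prints the resulting command; it assumes the limits from the slide (2G of memory, a 25000 quota in a 50000 period) and uses made-up function names. The quota divided by the period gives the share of one CPU's time the container may use.

```shell
#!/bin/sh
# Sketch: build (but do not run) a constrained `docker run` command.
# All values are illustrative; only flags shown on the slides are used.

# Percentage of one CPU's time allowed by a CFS quota/period pair.
cpu_percent() {
    quota=$1
    period=$2
    echo $(( quota * 100 / period ))
}

# Print the docker command for a given memory limit, quota and period.
constrained_run() {
    mem=$1
    quota=$2
    period=$3
    echo "docker run -d -m $mem --memory-swappiness=0 --cpu-period=$period --cpu-quota=$quota elasticsearch"
}

cpu_percent 25000 50000
constrained_run 2G 25000 50000
```

With the slide's values the container gets half of one CPU's time in every scheduling period.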

Page 35: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Creating Optimized Image

Dockerfile:
FROM elasticsearch
ADD ./elasticsearch.yml /usr/share/elasticsearch/config/

$ docker build -t devops/example .

Sending build context to Docker daemon 3.072 kB
Step 1 : FROM elasticsearch
 ---> 8112755253f1
Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
 ---> Using cache
 ---> c9ca48a22e58
Successfully built c9ca48a22e58
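
The slides do not show the contents of elasticsearch.yml. As a rough sketch, the script below writes one possible file into a scratch build directory. The cluster name, host names and node counts are assumptions made for illustration, and the setting names are the Elasticsearch 1.x/2.x ones that match the `-D` flags used elsewhere in this deck.

```shell
#!/bin/sh
# Sketch: write an illustrative elasticsearch.yml for the image build.
# Cluster name, host names and node counts are assumptions, not from the talk.

write_config() {
    dir=$1
    cat > "$dir/elasticsearch.yml" <<'EOF'
cluster.name: devops-example
discovery.zen.ping.unicast.hosts: ["es_master_1", "es_master_2", "es_master_3"]
discovery.zen.minimum_master_nodes: 2
gateway.recover_after_nodes: 4
gateway.expected_nodes: 6
EOF
}

# Use a throwaway directory so the sketch never overwrites a real config.
build_dir=$(mktemp -d)
write_config "$build_dir"
cat "$build_dir/elasticsearch.yml"
```

Copying the generated file next to the Dockerfile and running the `docker build` command from the slide would bake it into the image.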

Page 39: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Dealing With Network

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch

$ docker run -d elasticsearch -Dnetwork.publish_host=192.168.1.1

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=192.168.1.1

$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=0.0.0.0

Page 43: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Network - Good Practices

Separate network for Elasticsearch cluster

Common host names for containers
$ docker run -d -h es_node_1 elasticsearch

Expose 9200 & 9300 ports only for client nodes

Elasticsearch data & client nodes point to masters only
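
The four practices above might translate into commands like the ones printed by this sketch. It assumes a Docker version with user-defined networks (`docker network create`, `--net`); the network name `es_net` and the host names are hypothetical, and nothing is executed.

```shell
#!/bin/sh
# Sketch: print the commands for a dedicated cluster network.
# `es_net` and the host names are hypothetical; nothing is executed.

MASTERS="es_master_1,es_master_2,es_master_3"

network_cmd() {
    echo "docker network create es_net"
}

# Data node: joins the network, publishes no ports, points at masters only.
data_node_cmd() {
    name=$1
    echo "docker run -d --net=es_net -h $name elasticsearch -Dnode.master=false -Dnode.data=true -Ddiscovery.zen.ping.unicast.hosts=$MASTERS"
}

# Client node: same network, but publishes 9200 and 9300 to the host.
client_node_cmd() {
    name=$1
    echo "docker run -d --net=es_net -h $name -p 9200:9200 -p 9300:9300 elasticsearch -Dnode.master=false -Dnode.data=false -Ddiscovery.zen.ping.unicast.hosts=$MASTERS"
}

network_cmd
data_node_cmd es_node_1
client_node_cmd es_client_1
```

Only the client node command carries `-p`, so data and master nodes stay reachable solely from inside the cluster network.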

Page 47: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Dealing With Storage

$ docker run -d -v /opt/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch

By default in /usr/share/elasticsearch/data

By default not persisted

Use data-only containers

Permissions

Page 52: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Data-Only Docker Volumes

Bypasses Union File System

Can be shared between containers

Data volumes persist if the container itself is deleted

$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data --name esdata elasticsearch

$ docker run --volumes-from esdata elasticsearch

Page 55: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Highly Available Cluster

Master only (3 nodes)
Data only (6 nodes)
Client only (2 nodes)

minimum_master_nodes = N/2 + 1 (N = number of master-eligible nodes, integer division)

gateway.recover_after_nodes
gateway.expected_nodes
cluster.routing.allocation.node_concurrent_recoveries
index.unassigned.node_left.delayed_timeout
index.priority
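
The quorum formula can be checked with a few lines of shell. This is a minimal sketch with a made-up function name; N is taken to be the number of master-eligible nodes and the division is integer division.

```shell
#!/bin/sh
# Sketch: quorum for N master-eligible nodes, using integer division.

minimum_master_nodes() {
    n=$1
    echo $(( n / 2 + 1 ))
}

for n in 1 2 3 5; do
    echo "$n master-eligible nodes -> minimum_master_nodes = $(minimum_master_nodes "$n")"
done
```

For the three master-only nodes in the diagram the result is 2, so the cluster keeps a master when one of them is lost and cannot split into two halves that each elect their own.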

Page 56: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Master Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=true -Dnode.data=false -Dnode.client=false

Page 57: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Client Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=false -Dnode.client=true

Page 58: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Data Nodes & Docker

$ docker run -d elasticsearch -Dnode.master=false -Dnode.data=true -Dnode.client=false
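
Putting the three node types together, the sketch below prints one `docker run` line per node for the 3 master / 6 data / 2 client layout from the Highly Available Cluster slide. The host names are hypothetical and the commands are printed, not executed.

```shell
#!/bin/sh
# Sketch: print one `docker run` line per node of the 3/6/2 topology.
# Host names are hypothetical; the -D flags are the ones from the slides.

role_flags() {
    case $1 in
        master) echo "-Dnode.master=true -Dnode.data=false -Dnode.client=false" ;;
        data)   echo "-Dnode.master=false -Dnode.data=true -Dnode.client=false" ;;
        client) echo "-Dnode.master=false -Dnode.data=false -Dnode.client=true" ;;
    esac
}

# Print `count` commands for the given role.
print_nodes() {
    role=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "docker run -d -h es_${role}_$i elasticsearch $(role_flags "$role")"
        i=$(( i + 1 ))
    done
}

print_cluster() {
    print_nodes master 3
    print_nodes data 6
    print_nodes client 2
}

print_cluster
```

Piping the output to `sh` would start the containers; in practice each line would also need the network, volume and constraint flags discussed earlier.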

Page 59: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

Elasticsearch Node Elasticsearch Node

Elasticsearch Node Elasticsearch Node

Page 60: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

curl -XPUT 'http://localhost:9200/devops/' -d '{ "settings" : { "index" : { "number_of_shards" : 4, "number_of_replicas" : 0 } }}'

Page 61: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

[Diagram: four nodes, one primary shard (P) on each]

Page 62: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1}'

Page 63: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

[Diagram: four nodes, each holding one primary shard (P) and one replica (R)]

Page 64: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 2}'

Page 65: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

[Diagram: four nodes, each holding one primary shard (P) and two replicas (R)]

Page 66: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

curl -XPUT 'http://localhost:9200/devops/_settings' -d '{ "index.number_of_replicas" : 1}'

Page 67: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

[Diagram: four nodes, each holding one primary shard (P) and one replica (R)]

Page 69: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Scaling

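[Diagram: one node is gone; the four primaries (P) sit on the remaining nodes, three replicas (R) are assigned and one replica is Unassigned]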
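
The shard counts in the diagrams follow from simple arithmetic: every primary has `number_of_replicas` extra copies. A small sketch, with made-up function names, using the 4 primaries and 4 nodes from the slides:

```shell
#!/bin/sh
# Sketch: shard arithmetic for the `devops` index from the slides.

# Total shard copies = primaries * (1 + replicas).
total_copies() {
    primaries=$1
    replicas=$2
    echo $(( primaries * (1 + replicas) ))
}

# Average number of shard copies per node when copies are spread evenly.
copies_per_node() {
    primaries=$1
    replicas=$2
    nodes=$3
    echo $(( $(total_copies "$primaries" "$replicas") / nodes ))
}

for replicas in 0 1 2; do
    echo "replicas=$replicas total=$(total_copies 4 "$replicas") per_node=$(copies_per_node 4 "$replicas" 4)"
done
```

Raising the replica count adds copies that can serve queries, and lowering it removes them again, without touching the number of primaries fixed at index creation.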

Page 71: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

RAM Buffer

indices.memory.index_buffer_size: 10%
indices.memory.min_index_buffer_size: 48mb
indices.memory.max_index_buffer_size (unbounded)
indices.memory.min_shard_index_buffer_size: 4mb

Values above the defaults: higher indexing throughput
Values below the defaults: lower indexing throughput
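
To get a feel for the numbers, the sketch below estimates the indexing buffer for a given heap using the defaults listed above (10% of the heap, never less than 48 MB, no upper bound). The function name and the heap sizes are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: effective indexing buffer in MB for a given heap size in MB,
# using the defaults from the slide (10% of heap, at least 48 MB, unbounded).

index_buffer_mb() {
    heap_mb=$1
    percent=${2:-10}
    min_mb=${3:-48}
    size=$(( heap_mb * percent / 100 ))
    if [ "$size" -lt "$min_mb" ]; then
        size=$min_mb
    fi
    echo "$size"
}

index_buffer_mb 1024      # 1 GB heap, defaults
index_buffer_mb 256       # small heap: the 48 MB floor applies
index_buffer_mb 4096 20   # hypothetical 20% setting
```

The heap in a container is bounded by the `-m` limit, so a percentage-based buffer shrinks along with the container's memory.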

Page 72: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Time-Based Data?

2015-11-23

TODAY

WEEK

Page 73: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Time-Based Data?

curl -XPOST 'http://localhost:9200/_aliases' -d '{ "actions" : [ { "add" : {"index":"2015-11-23","alias":"today"} }, { "add" : {"index":"2015-11-23","alias":"week"} } ]}'

Page 74: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Time-Based Data?

2015-11-23 2015-11-24

TODAY

WEEK

Page 75: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Time-Based Data?

2015-11-23 2015-11-24 2015-11-25

TODAY

WEEK
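
When a new daily index appears, the aliases can be moved in one `_aliases` call. The sketch below prints such a request; it assumes `today` should point only at the newest index while `week` keeps accumulating indices, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the _aliases request that rolls `today` to a new daily index.
# Index names follow the slides; the function name is made up for illustration.

rotate_aliases_body() {
    old=$1
    new=$2
    printf '{"actions":[{"remove":{"index":"%s","alias":"today"}},{"add":{"index":"%s","alias":"today"}},{"add":{"index":"%s","alias":"week"}}]}\n' "$old" "$new" "$new"
}

body=$(rotate_aliases_body 2015-11-23 2015-11-24)
echo "curl -XPOST 'http://localhost:9200/_aliases' -d '$body'"
```

All actions in a single `_aliases` request are applied together, so clients querying `today` never see a moment with no index behind the alias.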

Page 76: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

Page 77: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

curl -XPUT 'localhost:9200/data_2015-11-23' -d '{ "settings": { "index.routing.allocation.include.tag" : "hot" }}'

Page 78: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

Page 79: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

curl -XPUT 'localhost:9200/data_2015-11-23/_settings' -d '{ "settings": { "index.routing.allocation.exclude.tag" : "hot", "index.routing.allocation.include.tag" : "cold" }}'

Page 80: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

Page 81: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

data_2015-11-24

data_2015-11-24

Page 82: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tiers

node.tag=hot node.tag=cold node.tag=cold

data_2015-11-23

data_2015-11-23

data_2015-11-25

data_2015-11-25

data_2015-11-24

data_2015-11-24
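
The hot/cold flow can be sketched end to end: start nodes with a `tag` attribute, then move an index by changing its allocation settings. The commands are printed only; passing `-Dnode.tag` on the command line is an assumption that mirrors the other `-D` flags in this deck.

```shell
#!/bin/sh
# Sketch: print commands for tagged nodes and for moving an index between tiers.
# Nothing is executed; `tag` is the custom node attribute used on the slides.

tagged_node_cmd() {
    tag=$1
    echo "docker run -d elasticsearch -Dnode.tag=$tag"
}

# Settings body that moves an index off one tier and onto another.
move_body() {
    from=$1
    to=$2
    printf '{"index.routing.allocation.exclude.tag":"%s","index.routing.allocation.include.tag":"%s"}\n' "$from" "$to"
}

move_index_cmd() {
    index=$1
    echo "curl -XPUT 'localhost:9200/$index/_settings' -d '$(move_body hot cold)'"
}

tagged_node_cmd hot
tagged_node_cmd cold
move_index_cmd data_2015-11-23
```

Run daily against yesterday's index, a call like this keeps only the newest data on the hot node.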

Page 85: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Multiple Tenants

Hot

Hot

Cold

Cold

Cold

Cold

ROUTING

Page 86: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Indexing Without Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

userA (documents land on all eight shards)

Page 87: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Indexing With Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

userA (documents land on a single shard)

Page 88: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Querying Without Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application

Page 89: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Querying With Routing

Shard 1 Shard 2 Shard 3 Shard 4

Shard 5 Shard 6 Shard 7 Shard 8

Elasticsearch

Application
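
In request terms, routing is a `routing` parameter on both the index and the search call. The sketch below prints example requests; the `devops` index comes from the earlier slides, while the `doc` type, the document id and the document body are hypothetical.

```shell
#!/bin/sh
# Sketch: print index and search requests that pass a routing value.
# The `devops` index comes from the slides; the `doc` type, document id and
# body are hypothetical.

index_cmd() {
    id=$1
    user=$2
    echo "curl -XPUT 'http://localhost:9200/devops/doc/$id?routing=$user' -d '{\"user\":\"$user\"}'"
}

search_cmd() {
    user=$1
    echo "curl -XGET 'http://localhost:9200/devops/_search?routing=$user&q=user:$user'"
}

index_cmd 1 userA
search_cmd userA
```

The search still filters on the user field because several routing values can hash to the same shard, so routing narrows where to look but does not replace the query.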

Page 91: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Routing vs No Routing

Queries without routing (200 shards, 1 replica)

#threads  Avg response time  Throughput  90% line  Median  CPU Utilization
1         3169ms             19,0/min    5214ms    2692ms  95 – 99%

Queries with routing (200 shards, 1 replica)

#threads  Avg response time  Throughput  90% line  Median  CPU Utilization
10        196ms              50,6/sec    642ms     29ms    25 – 40%
20        218ms              91,2/sec    718ms     11ms    10 – 15%

Page 93: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

Short summary

http://www.soothetube.com/2013/12/29/thats-all-folks/

Page 94: Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker

We Are Hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with, and in, open-source?
We’re hiring worldwide!

http://sematext.com/about/jobs.html
