Date post: | 16-Apr-2017 |
Category: | Technology |
Upload: | sematext-group-inc |
Building Resilient Log Aggregation Pipeline Using Elasticsearch and Kafka
Rafał Kuć @ Sematext Group, Inc.
Sematext & I
- Logsene: logs
- SPM: metrics
Next 30 minutes…
Log shipping
- buffers
- protocols
- parsing
Central buffering
- Kafka
- Redis
Storage & Analysis
- Elasticsearch
- Kibana
- Grafana
Log shipping architecture
[Diagram: file shippers, a centralized buffer, and Elasticsearch (ES) nodes holding the data]
Focus: Elasticsearch
Elasticsearch cluster architecture
[Diagram: 3 client nodes, 6 data nodes, 3 master nodes, 3 ingest nodes]
Dedicated masters please
discovery.zen.minimum_master_nodes -> N/2 + 1 master-eligible nodes
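The N/2 + 1 rule above is a simple quorum calculation. A minimal sketch, assuming integer (floor) division, which is how the rule is usually read; the function name is hypothetical:

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    # With the 3 dedicated masters from the diagram the quorum is 2,
    # so a single isolated master cannot elect itself.
    for n in (1, 2, 3, 5):
        print(n, "master-eligible ->", minimum_master_nodes(n))
```

With two master-eligible nodes the quorum is also 2, so losing either node blocks elections; that is one reason to run three dedicated masters.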
One big index is a no-go
- Not scalable enough for time-based data
- Indexing slows down with time
- Expensive merges
- Delete by query needed for data retention
Daily indices are a good start
2016.11.18 2016.11.19 2016.11.22 2016.11.23 ...
[Diagram labels: indexing, most searches]
- Indexing is faster for smaller indices
- Deletes are cheap: we delete whole indices
- Search can be performed on indices that are needed
- Static indices are cache friendly
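To illustrate why retention is cheap with daily indices, here is a small sketch that derives the index name for a day and lists whole indices that fall outside a retention window. The `logs_` prefix matches names used later in the deck; the function names and the retention length are assumptions for illustration.

```python
from datetime import date, timedelta


def daily_index(day: date, prefix: str = "logs_") -> str:
    """Name of the index that receives events for the given day."""
    return f"{prefix}{day:%Y.%m.%d}"


def expired_indices(existing, today: date, keep_days: int, prefix: str = "logs_"):
    """Return indices older than the retention window, oldest first.

    Names that do not look like <prefix>YYYY.MM.DD are ignored.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix):
            continue
        try:
            year, month, day = (int(p) for p in name[len(prefix):].split("."))
            index_day = date(year, month, day)
        except ValueError:
            continue  # not a daily index name
        if index_day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    names = ["logs_2016.11.18", "logs_2016.11.19", "logs_2016.11.22", "logs_2016.11.23"]
    print(daily_index(date(2016, 11, 23)))
    print(expired_indices(names, today=date(2016, 11, 23), keep_days=2))
```

Each returned name can be dropped with a single index delete, with no delete-by-query involved.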
Daily indices are sub-optimal
[Diagram: Black Friday compared with Saturday and Sunday]
Load is not even
Size based indices are optimal
- size limit for indices
- indexing goes to the current index (logs_01); once it reaches the limit, logs_02 is created, and so on up to logs_N
- around 5–10 GB per shard on AWS
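The size-based scheme can be sketched as a small decision function: keep writing to the current index until the average primary shard size reaches the limit, then move on to the next numbered index. The 10 GB default reflects the upper end of the range on the slide; the function names and the way sizes are passed in are assumptions, and a real deployment would read the sizes from the cluster.

```python
GB = 1024 ** 3


def next_index_name(current: str) -> str:
    """logs_01 -> logs_02, keeping the zero padding of the counter."""
    prefix, _, counter = current.rpartition("_")
    return f"{prefix}_{int(counter) + 1:0{len(counter)}d}"


def choose_write_index(current: str, primary_size_bytes: int, shards: int,
                       max_shard_bytes: int = 10 * GB) -> str:
    """Return the index that should receive new documents."""
    per_shard = primary_size_bytes / shards
    if per_shard >= max_shard_bytes:
        return next_index_name(current)
    return current


if __name__ == "__main__":
    print(choose_write_index("logs_01", 20 * GB, shards=4))  # still below the limit
    print(choose_write_index("logs_01", 40 * GB, shards=4))  # limit reached
```

Because the trigger is size rather than the calendar, a spike such as Black Friday produces more indices instead of one oversized index.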
Slice using size
- Predictable searching and indexing performance
- Better indices balancing
- Fewer shards
- Easier handling of spiky loads
- Lower costs because of better hardware utilization
Proper Elasticsearch configuration
Keep index.refresh_interval at maximum possible value (relative indexing throughput):
- 1 sec -> 100%
- 5 sec -> 125%
- 30 sec -> 175%
You can loosen up merges (possible because of heavy aggregation use); all prefixed with index.merge.policy:
- segments_per_tier -> higher
- max_merge_at_once -> higher
- max_merged_segment -> lower
Both changes give higher indexing throughput
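As a sketch of what these settings look like together, the snippet below builds an index settings body. The setting names come from the slide; the concrete values (30s, 20, 20, 2gb) are illustrative assumptions, not recommendations from the talk.

```python
import json


def indexing_settings(refresh_interval: str = "30s",
                      segments_per_tier: int = 20,
                      max_merge_at_once: int = 20,
                      max_merged_segment: str = "2gb") -> dict:
    """Index settings tuned for indexing throughput (illustrative values).

    Elasticsearch expects segments_per_tier >= max_merge_at_once,
    so the two are raised together here.
    """
    if segments_per_tier < max_merge_at_once:
        raise ValueError("segments_per_tier must be >= max_merge_at_once")
    return {
        "index.refresh_interval": refresh_interval,
        "index.merge.policy.segments_per_tier": segments_per_tier,
        "index.merge.policy.max_merge_at_once": max_merge_at_once,
        "index.merge.policy.max_merged_segment": max_merged_segment,
    }


if __name__ == "__main__":
    # The JSON can be sent as the body of PUT /<index>/_settings.
    print(json.dumps(indexing_settings(), indent=2))
```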
Proper Elasticsearch configuration
- Index only needed fields
- Use doc values
- Do not index _source
- Do not store _all
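A rough sketch of how some of these points could look in a mapping, assuming an Elasticsearch 5.x-style mapping (mapping types and `_all` were removed in later major versions). The type name and every field name are hypothetical, and the sketch leaves `_source` untouched.

```python
import json


def log_mapping(doc_type: str = "log") -> dict:
    """Index body with _all disabled and an explicit, minimal field list."""
    return {
        "mappings": {
            doc_type: {
                "_all": {"enabled": False},  # do not build the catch-all field
                "properties": {
                    "@timestamp": {"type": "date"},
                    # keyword fields have doc values by default (used for aggregations)
                    "severity": {"type": "keyword"},
                    "message": {"type": "text"},
                    # kept in the document but neither searchable nor aggregatable
                    "raw_payload": {"type": "keyword", "index": False, "doc_values": False},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(log_mapping(), indent=2))
```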
Optimization time
We can optimize data nodes for time-based data
Hot–cold architecture
ES hot: -Dnode.attr.tag=hot
ES cold: -Dnode.attr.tag=cold
ES cold: -Dnode.attr.tag=cold
Create logs_2016.11.22 on the hot node:
curl -XPUT localhost:9200/logs_2016.11.22 -d '{"settings":{"index.routing.allocation.exclude.tag":"cold","index.routing.allocation.include.tag":"hot"}}'
[Diagram: indexing goes to logs_2016.11.22 on the hot node; logs_2016.11.23 is then created next to it]
Move index after day ends:
curl -XPUT localhost:9200/logs_2016.11.22/_settings -d '{"index.routing.allocation.exclude.tag":"hot","index.routing.allocation.include.tag":"cold"}'
[Diagram: logs_2016.11.22 now sits on a cold node while logs_2016.11.23 takes indexing on the hot node; when that day ends, logs_2016.11.24 is created on the hot node and logs_2016.11.23 moves to the other cold node the same way]
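The daily hot-to-cold routine can be sketched as a function that produces the two requests from the slides: create today's index on the hot tier and retag yesterday's index for the cold tier. The allocation setting names are taken from the curl commands in the deck; the function names and the (method, path, body) tuple format are assumptions, and nothing is sent over the network here.

```python
import json


def allocation(include: str, exclude: str) -> dict:
    """Shard allocation filter on the custom node attribute 'tag'."""
    return {
        "index.routing.allocation.exclude.tag": exclude,
        "index.routing.allocation.include.tag": include,
    }


def daily_plan(today_index: str, yesterday_index: str):
    """Requests to run when a day ends, as (method, path, body) tuples."""
    return [
        # New index starts on the hot tier.
        ("PUT", f"/{today_index}", {"settings": allocation("hot", "cold")}),
        # Yesterday's index is relocated to the cold tier.
        ("PUT", f"/{yesterday_index}/_settings", allocation("cold", "hot")),
    ]


if __name__ == "__main__":
    for method, path, body in daily_plan("logs_2016.11.23", "logs_2016.11.22"):
        print(method, path, json.dumps(body))
```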
Hot ES tier
- Good CPU
- Lots of I/O
Cold ES tier
- Memory bound
- Decent I/O
Hot–cold architecture summary
- Optimize costs – different hardware for different tier
- Performance – use case optimized hardware
- Isolation – long running searches don't affect indexing
Elasticsearch client node needs
- No data = no IOPS
- Large query throughput = high CPU usage
- Lots of results = high memory usage
- Lots of concurrent queries = higher resource utilization
Elasticsearch ingest node needs
- No data = no IOPS
- Large index throughput = high CPU & memory usage
- Complicated rules = high CPU usage
- Larger documents = more resource utilization
Elasticsearch master node needs
- No data = no IOPS
- Large number of indices = high CPU & memory usage
- Complicated mappings = high memory usage
- Daily indices = spikes in resource utilization
Focus: Centralized Buffer
Why Apache Kafka?
- Fast & easy to use
- Easy to scale
- Fault tolerant and highly available
- Supports streaming
- Works in publish/subscribe mode
Kafka architecture
[Diagram: 3 ZooKeeper nodes and 4 Kafka brokers]
Kafka & topics
Example topics: security_logs, access_logs, app1_logs, app2_logs
Kafka stores data in topics, written on disk
Kafka & topics & partitions & replicas
[Diagram: topic "logs" with partitions 1-4 and a replica of each partition]
Scaling Kafka
[Diagram: topic "logs" grows from 1 partition to 4 partitions to 16 partitions]
Things to remember when using Kafka
- Scales by adding more partitions, not threads
- The more IOPS the better
- Keep the # of consumers equal to # of partitions
- Replicas used for HA and FT only
- Offsets stored per consumer – multiple destinations easily possible
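The advice to match consumers to partitions follows from each partition being read by at most one consumer within a group. A deliberately simplified sketch of that constraint (plain round-robin, not Kafka's actual assignment strategy; the function name is hypothetical):

```python
from typing import Dict, List


def assign_partitions(partitions: int, consumers: int) -> Dict[int, List[int]]:
    """Spread partition ids over consumer ids, one owner per partition."""
    if consumers < 1:
        raise ValueError("need at least one consumer")
    assignment: Dict[int, List[int]] = {c: [] for c in range(consumers)}
    for partition in range(partitions):
        assignment[partition % consumers].append(partition)
    return assignment


if __name__ == "__main__":
    print(assign_partitions(4, 4))   # every consumer owns one partition
    print(assign_partitions(4, 6))   # two consumers get nothing to read
    print(assign_partitions(16, 4))  # more partitions leave room to add consumers
```

Consumers beyond the partition count sit idle, which is why throughput scales with partitions rather than with extra consumer threads.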
Focus: Shipper
What about the shipper?
[Diagram: logs flowing into the centralized buffer]
- Which shipper to use?
- Which protocol should be used
- What about the buffering
- Log to JSON or parse, and how
Buffers
- performance: batches & threads
- availability: when central buffer is gone
Buffer types
Disk || memory || combined hybrid approach
On source || centralized
On source (each App with its own Buffer): file or local log shipper
- easy scaling – fewer moving parts
- often with the use of lightweight shipper
Centralized (Apps in front of Kafka / Redis / Logstash / etc…, then ES)
- one place for all changes
- extra features made easy (like TTL)
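The combined (hybrid) approach can be sketched as a memory queue that spills to disk once it is full. This is a toy model under stated assumptions: the class name, the JSON-lines spill format, and the limits are all hypothetical, and it does not recover a spill file left over from a previous run the way a real shipper's disk-assisted queue would.

```python
import json
import os
import tempfile
from collections import deque


class HybridBuffer:
    """In-memory queue that spills to an append-only file when full."""

    def __init__(self, spill_path: str, max_in_memory: int = 1000):
        self.memory = deque()
        self.max_in_memory = max_in_memory
        self.spill_path = spill_path
        self.spilled = 0  # events currently on disk

    def put(self, event: dict) -> None:
        # Once anything is on disk, keep writing to disk so order is preserved.
        if self.spilled == 0 and len(self.memory) < self.max_in_memory:
            self.memory.append(event)
            return
        with open(self.spill_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
        self.spilled += 1

    def drain(self) -> list:
        """Return all buffered events, oldest first, and empty the buffer."""
        events = list(self.memory)
        self.memory.clear()
        if self.spilled:
            with open(self.spill_path, encoding="utf-8") as f:
                events.extend(json.loads(line) for line in f)
            os.remove(self.spill_path)
            self.spilled = 0
        return events


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "spill.jsonl")
    buf = HybridBuffer(path, max_in_memory=2)
    for n in range(5):
        buf.put({"n": n})
    print(len(buf.memory), "in memory,", buf.spilled, "on disk")
    print(buf.drain())
```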
Buffers Summary
[Diagram: two layouts labeled Simple and Reliable; one shows each App with its own Buffer in front of ES, the other shows Apps in front of ES]
Protocols
- UDP – fast, cool for the application, not reliable
- TCP – reliable (almost); application gets ACK when written to buffer
- Application level ACKs may be needed
Protocol: shippers
- HTTP: Logstash, rsyslog, Fluentd
- RELP: Logstash, rsyslog
- Beats: Logstash, Filebeat
- Kafka: Logstash, rsyslog, Filebeat, Fluentd
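To make "application level ACKs" concrete, here is a toy sender that keeps every batch until the receiver confirms it and can replay whatever is still unconfirmed. It models the idea only and is not the wire format of RELP, Beats, or any other protocol; the class name and the `transport` callable are hypothetical.

```python
class AckingSender:
    """Keeps batches until they are acknowledged by the receiver."""

    def __init__(self, transport):
        self.transport = transport  # callable(seq, batch); hypothetical delivery hook
        self.next_seq = 0
        self.unacked = {}

    def send(self, batch) -> int:
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = batch  # remember until acknowledged
        self.transport(seq, batch)
        return seq

    def ack(self, seq: int) -> None:
        """Receiver confirmed the batch was processed, so it can be dropped."""
        self.unacked.pop(seq, None)

    def resend_unacked(self) -> int:
        """Replay unconfirmed batches, e.g. after a reconnect."""
        for seq, batch in sorted(self.unacked.items()):
            self.transport(seq, batch)
        return len(self.unacked)


if __name__ == "__main__":
    delivered = []
    sender = AckingSender(lambda seq, batch: delivered.append(seq))
    first = sender.send(["line 1"])
    sender.send(["line 2"])
    sender.ack(first)
    print("resent:", sender.resend_unacked(), "delivery attempts:", delivered)
```

A TCP ACK only says the bytes reached the peer's socket buffer; the application-level ACK is what tells the sender the batch can be forgotten.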
Choosing the shipper
[Diagram 1 labels: application, socket, rsyslog with memory & disk assisted queues, http, Elasticsearch]
[Diagram 2 labels: application, file, rsyslog, filebeat, consumer]
What about OS?
- Say NO to swap
- Set the right disk scheduler: CFQ for spinning disks, deadline for SSD
- Use proper mount options for ext4: noatime, nodiratime, data=writeback, nobarrier
- For bare metal: check CPU governor, disable transparent huge pages
/proc/sys/vm/nr_hugepages=0
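The scheduler advice can be checked with a small helper. On Linux the file /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets, and /sys/block/<device>/queue/rotational holds 1 for spinning disks. The sketch below works on the file contents passed in as strings, so it makes no assumption about which devices exist; the function names are hypothetical.

```python
def active_scheduler(scheduler_file_content: str) -> str:
    """Extract the bracketed entry, e.g. 'noop deadline [cfq]' -> 'cfq'."""
    for token in scheduler_file_content.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active scheduler marked")


def recommended_scheduler(rotational: bool) -> str:
    """Slide's rule of thumb: CFQ for spinning disks, deadline for SSD."""
    return "cfq" if rotational else "deadline"


def scheduler_ok(scheduler_file_content: str, rotational: bool) -> bool:
    return active_scheduler(scheduler_file_content) == recommended_scheduler(rotational)


if __name__ == "__main__":
    # Example contents; read the real files from /sys/block/<device>/queue/.
    print(scheduler_ok("noop [deadline] cfq", rotational=False))
    print(scheduler_ok("noop deadline [cfq]", rotational=False))
```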
We are engineers!
We develop DevOps tools!
We are DevOps people!
We do fun stuff ;) http://sematext.com/jobs
Thank you for listening! Get in touch!
Rafał [email protected] @kucrafal
http://sematext.com @sematext http://sematext.com/jobs
Come talk to us at the booth