Vyacheslav Kryukov, Ivinco

Post on 25-Jun-2015

HighLoad++ 2013

Multi-Terabyte Sphinx HA cluster

Vyacheslav Kryukov, vkrukov@ivinco.com

Sphinx cluster

Sphinx HA cluster, requirements

● Incident tolerance and availability level
● Adaptive balancing
● Utilisation of redundant resources
● Easy deployment of new resources

Sphinx HA cluster architecture

Sphinx HA cluster, architecture #1

Sphinx HA cluster, architecture #2

Sphinx HA cluster, ha_strategy

● Simple balancing
  ● random
  ● roundrobin
● Adaptive balancing
  ● nodeads
  ● noerrors

http://sphinxsearch.com/docs/current.html#conf-ha-strategy
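The two simple strategies can be illustrated with a minimal Python sketch. It is not searchd's implementation; the function names are hypothetical, and the mirror addresses are borrowed from the example configuration in this deck.

```python
import itertools
import random

# Mirrors of one shard (addresses taken from the deck's example config).
MIRRORS = ["se01-1:3312", "se01-2:3312"]


def random_picker(mirrors, rng=random):
    """Return a function that picks a mirror uniformly at random."""
    def pick():
        return rng.choice(mirrors)
    return pick


def roundrobin_picker(mirrors):
    """Return a function that cycles through the mirrors in order."""
    cycle = itertools.cycle(mirrors)

    def pick():
        return next(cycle)
    return pick
```

Neither strategy looks at mirror health or latency, which is what the adaptive strategies add.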

Sphinx HA cluster, adaptive balancing

● Latency
● Query timeouts
● Connect timeouts
● Connect failures
● Network errors
● Wrong replies
● Unexpected closings
● Warnings
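As a rough illustration of how such metrics could drive mirror selection, the Python sketch below weights mirrors by inverse latency and skips mirrors with several errors in a row. This is an assumption-laden model, not the algorithm searchd uses: the threshold `max_errors_in_a_row`, the field names and the weighting formula are all hypothetical.

```python
import random


def adaptive_weights(mirrors, max_errors_in_a_row=3):
    """Compute selection probabilities for mirrors.

    mirrors: dict host -> {"latency_ms": float, "errors_in_a_row": int}
    Mirrors at or above the (hypothetical) error threshold are excluded;
    the rest are weighted by inverse latency.
    """
    alive = {h: s for h, s in mirrors.items()
             if s["errors_in_a_row"] < max_errors_in_a_row}
    if not alive:
        # Every mirror looks dead: fall back to trying all of them.
        alive = mirrors
    inverse = {h: 1.0 / max(s["latency_ms"], 0.001) for h, s in alive.items()}
    total = sum(inverse.values())
    return {h: w / total for h, w in inverse.items()}


def pick_mirror(mirrors, rng=random):
    """Pick one mirror at random according to the adaptive weights."""
    weights = adaptive_weights(mirrors)
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]
```

With two healthy mirrors at 50 ms and 100 ms, the faster one receives two thirds of the queries in this model.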

Sphinx HA cluster, configuration

index some_index {
    type = distributed
    agent = se01-1:3312|se01-2:3312:some_index_se01
    agent = se02-1:3312|se02-2:3312:some_index_se02
    agent = se03-1:3312|se03-2:3312:some_index_se03
    agent = se04-1:3312|se04-2:3312:some_index_se04
    ha_strategy = nodeads
}

searchd {
    ...
    ha_ping_interval = 1000
    ha_period_karma = 60
    ...
}

http://sphinxsearch.com/docs/current.html#conf-ha-ping-interval
http://sphinxsearch.com/docs/current.html#conf-ha-period-karma
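With many shards, agent lines like the ones in the configuration above are easier to generate than to type. The Python sketch below produces a distributed index block in the same shape. The helper names and the host naming scheme (`<shard>-<n>`) are assumptions inferred from the example, not part of Sphinx.

```python
def agent_line(mirrors, remote_index):
    """Build one agent line: mirrors joined by '|', then ':' and the index."""
    return "agent = " + "|".join(mirrors) + ":" + remote_index


def distributed_index(name, shards, port=3312, mirrors_per_shard=2,
                      ha_strategy="nodeads"):
    """Render a distributed index block with one agent line per shard."""
    lines = ["index %s {" % name, "    type = distributed"]
    for shard in shards:
        # Assumed naming scheme: se01 -> se01-1, se01-2, ...
        hosts = ["%s-%d:%d" % (shard, i, port)
                 for i in range(1, mirrors_per_shard + 1)]
        lines.append("    " + agent_line(hosts, "%s_%s" % (name, shard)))
    lines.append("    ha_strategy = %s" % ha_strategy)
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(distributed_index("some_index", ["se01", "se02", "se03", "se04"]))
```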

Sphinx HA cluster, SHOW AGENT STATUS

mysql> SHOW AGENT STATUS;
+-------------------------------------+--------------------+
| Key                                 | Value              |
+-------------------------------------+--------------------+
| status_period_seconds               | 60                 |
| status_stored_periods               | 15                 |
...
| ag_19_hostname                      | se02-1:3312        |
| ag_19_references                    | 13                 |
| ag_19_lastquery                     | 1.91               |
| ag_19_lastanswer                    | 1.86               |
| ag_19_lastperiodmsec                | 51                 |
| ag_19_errorsarow                    | 0                  |
| ag_19_1periods_query_timeouts       | 0                  |
| ag_19_1periods_connect_timeouts     | 0                  |
| ag_19_1periods_connect_failures     | 0                  |
| ag_19_1periods_network_errors       | 0                  |
| ag_19_1periods_wrong_replies        | 0                  |
| ag_19_1periods_unexpected_closings  | 0                  |
| ag_19_1periods_warnings             | 0                  |
| ag_19_1periods_succeeded_queries    | 101                |
| ag_19_1periods_msecsperqueryy       | 83.92              |
(the same for 5periods_ and 15periods_)
| ag_20_hostname                      | se02-2:3312        |
| ag_20_references                    | 13                 |
| ag_20_lastquery                     | 0.55               |
| ag_20_lastanswer                    | 0.49               |
| ag_20_lastperiodmsec                | 55                 |
| ag_20_errorsarow                    | 0                  |
| ag_20_1periods_query_timeouts       | 0                  |
| ag_20_1periods_connect_timeouts     | 0                  |
| ag_20_1periods_connect_failures     | 0                  |
| ag_20_1periods_network_errors       | 0                  |
| ag_20_1periods_wrong_replies        | 0                  |
| ag_20_1periods_unexpected_closings  | 0                  |
| ag_20_1periods_warnings             | 0                  |
| ag_20_1periods_succeeded_queries    | 55                 |
| ag_20_1periods_msecsperqueryy       | 86.08              |
(the same for 5periods_ and 15periods_)
...
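For monitoring, the flat key/value output is easier to use once grouped per agent. The Python sketch below assumes the rows have already been fetched as (key, value) string pairs by any MySQL-protocol client. The function names are hypothetical, and the counter names are the ones visible in the output above.

```python
import re
from collections import defaultdict

# Keys look like ag_<N>_<field>, e.g. ag_19_hostname.
AGENT_KEY = re.compile(r"^ag_(\d+)_(.+)$")

# Error counters shown in the SHOW AGENT STATUS output above.
ERROR_COUNTERS = (
    "query_timeouts", "connect_timeouts", "connect_failures",
    "network_errors", "wrong_replies", "unexpected_closings",
)


def group_agent_status(rows):
    """Group (key, value) rows into {agent_number: {field: value}}."""
    agents = defaultdict(dict)
    for key, value in rows:
        match = AGENT_KEY.match(key)
        if match:  # skip global keys such as status_period_seconds
            agents[int(match.group(1))][match.group(2)] = value
    return dict(agents)


def agents_with_errors(agents, window="1periods"):
    """Return hostnames of agents with any non-zero error counter."""
    bad = []
    for number in sorted(agents):
        stats = agents[number]
        errors = sum(int(stats.get("%s_%s" % (window, name), 0))
                     for name in ERROR_COUNTERS)
        if errors > 0:
            bad.append(stats.get("hostname", "ag_%d" % number))
    return bad
```

Passing `window="5periods"` or `window="15periods"` checks the longer averaging windows instead.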

Sphinx HA cluster, balancing in real time

# cd /mnt/data

# iozone -i0 -i2 -s16g -r32k -f iozone.tmp

Sphinx HA cluster, data processing

● Data loading into a permanent store
● Data indexing
● Index validation and synchronization (Rsync and NetCat)
● Index updates from the application
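The deck names Rsync and NetCat for synchronization but does not say how indexes are validated. One common approach, shown here purely as an assumption, is to compare checksums of the index files on the source and on each copy. All function names in this Python sketch are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(index_dir, index_name):
    """Map file name -> SHA-256 for every file named <index_name>.*"""
    files = sorted(Path(index_dir).glob(index_name + ".*"))
    return {p.name: file_sha256(p) for p in files if p.is_file()}


def mismatches(source, copy):
    """Return names of files that are missing or differ between manifests."""
    names = source.keys() | copy.keys()
    return sorted(n for n in names if source.get(n) != copy.get(n))
```

A copy would be put into service only when `mismatches(source_manifest, copy_manifest)` returns an empty list.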

Sphinx HA cluster, performance and availability

● Provide performance with headroom (spare capacity)
● What to monitor
  ● SHOW AGENT STATUS, node performance, disk space, I/O and CPU usage
  ● Errors, warnings, crashes
  ● Index synchronization, validity, freshness

Sphinx HA cluster, distributed indexer

● Automated
  ● Distributed indexing
  ● Index validation
  ● Index delivery
● Failover
● Centralised management of Sphinx index configuration
● Index rebalancing

Resource consumption accounting

● io ops
● io size
● fetched_docs
● fetched_hits
● fetched_skips
● total_found
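The deck lists the counters but not how they are aggregated. As a sketch, per-query values can be summed per client application. The class name, the `client` label and the dictionary keys `io_ops` and `io_size` are hypothetical stand-ins for however the counters are actually collected.

```python
from collections import Counter, defaultdict

# Counter names follow the list above; io_ops and io_size are assumed keys.
COUNTERS = ("io_ops", "io_size", "fetched_docs",
            "fetched_hits", "fetched_skips", "total_found")


class UsageLedger:
    """Accumulate per-query resource counters per client."""

    def __init__(self):
        self._totals = defaultdict(Counter)

    def record(self, client, query_stats):
        """Add one query's counters to the client's running totals."""
        for name in COUNTERS:
            self._totals[client][name] += int(query_stats.get(name, 0))

    def totals(self, client):
        """Return the accumulated counters for one client as a dict."""
        return dict(self._totals[client])
```

For example, calling `ledger.record("app", {"io_ops": 3, "fetched_docs": 10})` after each query keeps a running total that can be reported per application.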

Rosette Linguistics Platform

● Used for analysis of unstructured text in CJK languages
● Better quality than using the ngram options
● Slow indexer performance

http://www.basistech.com/text-analytics/rosette/

Questions?

vkrukov@ivinco.com
