Using Nagios to monitor your WO systems

Post on 06-May-2015

Nagios for WO systems

Pascal Robert, Druide informatique

Nagios

• Open source project

• Available since 1999 (Netsaint)

• Pretty much the standard

• Interface a bit old (frames!)

Installation

• CentOS/Amazon Linux: yum install nagios nagios-plugins-all

• Ubuntu: apt-get install nagios3

• Mac OS X: port install nagios

Configuration directory

• CentOS/Amazon Linux: /etc/nagios and /etc/httpd/conf.d/nagios.conf

• Ubuntu: /etc/nagios3

• Mac OS X: /opt/local/etc/nagios

NRPE

• Agent to check local services

• CentOS/Amazon Linux: Installation: yum install nrpe; Configuration: /etc/nagios/nrpe.cfg

• Ubuntu: Installation: apt-get install nagios-nrpe-server; Configuration: /etc/nagios/nrpe.cfg

• Mac OS X: Installation: port install nrpe; Configuration: /opt/local/etc/nrpe.cfg.sample

Basic monitoring

HTTP

• check_http plugin

• Can check port, string in response, path, etc.

• Can do POST request with content

• Can do GET, HEAD, OPTIONS, TRACE, DELETE requests

• Can do BASIC auth

HTTPS

• Same plugin as HTTP

• Can check date of certificate
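As a rough illustration of what a certificate date check involves, here is a minimal Python sketch. It is not how check_http is implemented; the function names and the 30-day/7-day thresholds are assumptions chosen for the example.

```python
import socket
import ssl
import time

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios states


def fetch_not_after(host, port=443, timeout=10):
    """Return the notAfter string of the certificate presented by host."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Days left before a date such as 'Jan 11 00:00:00 2030 GMT'."""
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def certificate_state(days_left, warn_days=30, crit_days=7):
    """Map remaining days to a Nagios state (thresholds are assumptions)."""
    if days_left < crit_days:
        return CRITICAL
    if days_left < warn_days:
        return WARNING
    return OK
```

A check would call fetch_not_after() for the host, pass the result to days_until_expiry(), and exit with the state returned by certificate_state().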

Using Selenium WebDriver

• Need more complex HTTP check?

• Selenium WebDriver + Google Chrome + script to the rescue!

MySQL

• Two plugins: check_mysql and check_mysql_query

• check_mysql can check status of slave

• check_mysql_query will check result of query against warning/critical levels
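The idea of comparing a query result against warning/critical levels can be sketched in a few lines of Python. This is a simplification: the real plugin accepts the full Nagios range syntax, while this hypothetical helper only handles the common "higher is worse" case.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios states


def state_for_value(value, warning, critical):
    """Return a Nagios state for a numeric result where higher is worse.

    Simplified on purpose: only plain numeric thresholds are supported,
    and a value equal to the threshold is still considered acceptable.
    """
    if value > critical:
        return CRITICAL
    if value > warning:
        return WARNING
    return OK
```

For example, with warning=100 and critical=500, a query returning 250 rows would map to WARNING.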

PostgreSQL

• check_pgsql

• Will check if specified database is active and running

Disk

• You don’t want to run out of disk space!

• check_disk plugin

• Check available disk space of specific file system or path
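To show the kind of measurement behind a disk check, here is a small Python sketch based on shutil.disk_usage. It is only an illustration, not the check_disk implementation; the 20% and 10% defaults are assumptions, picked to mirror a typical "warn below 20% free, critical below 10% free" setup.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios states


def free_percent(path):
    """Percentage of free space on the file system that contains path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.free / usage.total


def disk_state(free_pct, warn_pct=20.0, crit_pct=10.0):
    """Alert when the free percentage drops below the thresholds."""
    if free_pct < crit_pct:
        return CRITICAL
    if free_pct < warn_pct:
        return WARNING
    return OK
```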

JMX

• Check the heap space of your WO apps!

• check_jmx

• http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_jmx/details

check_woapp.py

• Nagios plugin (Python) that checks several things in Monitor:

• State

• Number of deaths

• Is refusing new sessions

• Is auto recover on?

• Number of active sessions

Plugin development

• Can be anything! Bash, Python, Perl, Java, etc.

• Only needs to return the proper exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN)

• Better to send performance data too
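The points above can be sketched as a minimal Python plugin skeleton: print one status line, optionally followed by "|" and performance data in label=value;warn;crit form, then exit with the matching code. The helper names and the SESSIONS example are hypothetical.

```python
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def format_output(name, state, message, perfdata=None):
    """Build the plugin's single output line.

    perfdata is an optional list of (label, value, warn, crit) tuples,
    rendered after the '|' separator.
    """
    line = f"{name} {LABELS[state]} - {message}"
    if perfdata:
        metrics = " ".join(
            f"{label}={value};{warn};{crit}" for label, value, warn, crit in perfdata
        )
        line += " | " + metrics
    return line


def finish(state, line):
    """Print the output line and exit with the Nagios state as exit code."""
    print(line)
    sys.exit(state)
```

A plugin would end with something like finish(WARNING, format_output("SESSIONS", WARNING, "120 active sessions", [("sessions", 120, 100, 200)])). Graphing add-ons can then pick up the performance data.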

Other useful plugins

• check_load

• check_by_ssh

• check_dns

• check_file_age

• check_tcp/check_udp

• check_linux_raid

• check_ntp_time

• check_swap

Graphing

• Not built-in

• Numerous third-party add-ons

• I use PNP4Nagios

Actions

• Can launch actions (scripts) based on events

• Nagios calls these « event handlers »

• Examples:

• Start new instance if one is down

• Start new VM if host memory is low
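An event handler is usually a script that receives the service state, the state type (SOFT or HARD) and the current check attempt as arguments, via the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros. The decision logic might look like the Python sketch below; the function name and the "act on the third soft failure" rule are assumptions for illustration, and the actual restart command is left out because it depends on the deployment.

```python
def should_restart(state, state_type, attempt, soft_attempt=3):
    """Decide whether the handler should try to restart the instance.

    state:      "OK", "WARNING", "CRITICAL" or "UNKNOWN"
    state_type: "SOFT" (still retrying) or "HARD" (retries exhausted)
    attempt:    current check attempt number
    """
    if state != "CRITICAL":
        return False  # nothing to repair
    if state_type == "HARD":
        return True  # retries exhausted, act now
    # In SOFT state, wait a few attempts before acting (assumed policy).
    return state_type == "SOFT" and attempt >= soft_attempt
```

Acting on a late soft attempt gives the handler a chance to fix the problem before Nagios reaches the HARD state and sends notifications.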

Demo

Next: Logstash

Q&A