OSMC 2014: MonitoringLove with Sensu | Jochen Lillich

Post on 02-Jul-2015


description

After it was mentioned on Twitter for the first time, it took the #monitoringsucks hashtag only a short time to get widely adopted. And as a long-time Nagios user, I certainly understand why. But at DevOps Days Rome, Ulf Månsson started a counter movement: #monitoringlove. This was after he introduced the Sensu monitoring framework at his company with great success. In my talk, I'm going to explain the basics and details of Sensu and how to get a complete Sensu monitoring system running in only a few hours.

transcript

#monitoringlove with Sensu

OSMC 2014, Nuremberg, Germany

http://sensuapp.org

Why Sensu?

• Written in clean Ruby

• Scalable architecture

• Plugins in any language

• Can use Nagios checks

• Collects both checks and metrics

• Great community

Jochen Lillich

http://freistil.it

@geewiz

Sensu Core

Sensu Enterprise

Installation

• Omnibus packaging

• Configuration in JSON files

• Sensu cookbook for Chef

• Puppet module

• Connects all Sensu components

• Asynchronous communication

Sensu Server

• Orchestrates check execution

• Processes check results

• Triggers event handlers

Sensu Client

• Registers automatically with the Server

• Sends keepalive information

• Receives check execution requests

• Schedules checks locally

• Executes checks

• Publishes check results

• Publishes external events

API

• get event data

• get agent data

• trigger check execution

• resolve events

• silence checks
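
As a rough illustration of the API operations listed above, the following Ruby sketch builds the corresponding HTTP requests with the standard library only. The base URL, port 4567, and the endpoint paths (`/events`, `/clients`, `/request`, `/resolve`) are assumptions that are not stated on the slides; verify them against the API documentation for your Sensu version. All function names are hypothetical.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Assumed base URL; the slides do not say where the API listens.
SENSU_API = URI('http://localhost:4567')

# Build a GET request for a collection such as "events" or "clients".
def api_get(resource)
  Net::HTTP::Get.new("/#{resource}")
end

# Build a POST request carrying a JSON body.
def api_post(path, payload)
  request = Net::HTTP::Post.new(path, 'Content-Type' => 'application/json')
  request.body = JSON.generate(payload)
  request
end

# Ask for an ad-hoc execution of a check on the given subscriptions.
def trigger_check(check, subscribers)
  api_post('/request', { check: check, subscribers: subscribers })
end

# Resolve the event for a client/check pair.
def resolve_event(client, check)
  api_post('/resolve', { client: client, check: check })
end

# Send a prepared request; returns the parsed JSON body, or nil if it is empty.
def send_request(request, base = SENSU_API)
  response = Net::HTTP.start(base.host, base.port) { |http| http.request(request) }
  body = response.body.to_s
  body.empty? ? nil : JSON.parse(body)
end
```

With a reachable API, `send_request(api_get('events'))` would fetch the current events and `send_request(resolve_event('srv3', 'apache_http'))` would resolve one. Keeping request construction separate from sending makes the requests easy to inspect without a running Sensu server.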

Scheduling

• Standard checks (server)

• Standalone checks (client)

• Manual checks (API)

{
  "checks": {
    "disk_free": {
      "type": "status",
      "subscribers": [ "all" ],
      "handlers": [ "default" ],
      "command": "/usr/lib/nagios/plugins/check_disk -w :::disk_warn::: -c :::disk_crit::: -A -x /dev/shm -X nfs -i /boot",
      "interval": 60
    }
  }
}
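
The `:::disk_warn:::` and `:::disk_crit:::` placeholders in the command are tokens that are filled in from attributes of the client that runs the check. The Ruby sketch below only illustrates the idea; it is not Sensu's own implementation, and the function name, the sample attribute values, and the handling of missing attributes are assumptions made for this example.

```ruby
# Replace :::name::: tokens in a check command with client attribute values.
# Returns the substituted command and the list of tokens that had no value.
def substitute_tokens(command, client_attributes)
  missing = []
  result = command.gsub(/:::([^:]+?):::/) do
    key = Regexp.last_match(1)
    value = client_attributes[key]
    missing << key if value.nil?
    value.to_s
  end
  [result, missing]
end

command = 'check_disk -w :::disk_warn::: -c :::disk_crit:::'
client  = { 'disk_warn' => '20%', 'disk_crit' => '10%' } # hypothetical values

substituted, missing = substitute_tokens(command, client)
puts substituted # prints "check_disk -w 20% -c 10%"
puts "unresolved tokens: #{missing.join(', ')}" unless missing.empty?
```

Returning the unresolved tokens alongside the command lets the caller decide whether to skip the check or report an error instead of running a command with empty arguments.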

Checks in Chef

sensu_check 'mysql_server' do
  command "/usr/lib/nagios/plugins/check_mysql " +
          "-u 'monitoring' " +
          "-p '#{node['mysql']['server_mon_password']}'"
  handlers ['default']
  standalone true
  interval 30
end

Metrics check

{
  "checks": {
    "load_metrics": {
      "type": "metric",
      "command": "load-metrics.rb",
      "subscribers": [ "production" ],
      "interval": 10
    }
  }
}

Metrics output

$ ruby load-metrics.rb
srv3.local.load_avg.one 0.89 1365270842
srv3.local.load_avg.five 1.01 1365270842
srv3.local.load_avg.fifteen 1.06 1365270842
$ echo $?
0
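
The output shown above has one metric per line in the form "path value timestamp", which matches the Graphite plaintext format. A minimal Ruby sketch of a script producing such lines follows; it is not the `load-metrics.rb` plugin itself, and the function names and the fixed sample values (copied from the slide) are assumptions for illustration.

```ruby
# Format one metric line as "<path> <value> <timestamp>".
def metric_line(path, value, timestamp = Time.now.to_i)
  "#{path} #{value} #{timestamp}"
end

# Build the three load-average lines for a host prefix ("scheme").
def load_metrics(scheme, loads, timestamp = Time.now.to_i)
  %w[one five fifteen].zip(loads).map do |name, value|
    metric_line("#{scheme}.load_avg.#{name}", value, timestamp)
  end
end

# Sample values taken from the slide; a real script would read the system load.
puts load_metrics('srv3.local', [0.89, 1.01, 1.06], 1365270842)
```

A real metric check would read the current load averages, print the lines to standard output, and exit with status 0 as in the transcript above.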

External events

echo '{ "name": "my_check", "output": "some output", "status": 0 }' > /dev/tcp/localhost/3030

Useful: https://github.com/solarkennedy/sensu-shell-helper
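
The shell one-liner above writes a JSON document to the local client socket on port 3030. The same idea can be sketched in Ruby as follows; the function names are hypothetical, and the split between building and sending the payload is a design choice for this example, not something the slides prescribe.

```ruby
require 'json'
require 'socket'
require 'stringio'

# Build the JSON payload for an external check result.
def external_event(name, output, status)
  JSON.generate({ name: name, output: output, status: status })
end

# Write the payload to any IO-like object and return that object.
def publish_event(io, name, output, status)
  io.write(external_event(name, output, status))
  io
end

# Send an event to the local client socket, as in the shell example.
def send_to_client(name, output, status, host = 'localhost', port = 3030)
  socket = TCPSocket.new(host, port)
  publish_event(socket, name, output, status)
ensure
  socket&.close
end

# Dry run against an in-memory buffer, so no Sensu client is needed.
buffer = publish_event(StringIO.new, 'my_check', 'some output', 0)
puts buffer.string
```

On a host with a running Sensu client, `send_to_client('my_check', 'some output', 0)` would deliver the same payload over TCP.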

Handler types

• Pipe

• TCP

• UDP

• Transport

• Sets

Common event handlers

• Email

• PagerDuty

• Graphite

• IRC

• Slack

Example handler code

#!/usr/bin/env ruby

require 'rubygems'
require 'json'

# Read event data
event = JSON.parse(STDIN.read, :symbolize_names => true)

# Write the event data to a file
file_name = "/tmp/sensu_#{event[:client][:name]}_" +
            "#{event[:check][:name]}"
File.open(file_name, 'w') do |file|
  file.write(JSON.pretty_generate(event))
end

Example handler configuration

{
  "handlers": {
    "file": {
      "type": "pipe",
      "command": "/etc/sensu/handlers/file.rb"
    }
  }
}
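
A pipe handler such as the one configured above receives the event as JSON on standard input. Building on the slide's handler, the sketch below turns an event into a one-line notification. The `status` and `output` fields under `check` and the severity mapping (Nagios-style exit codes) are assumptions about the event data, and the function names are hypothetical.

```ruby
require 'json'

# Nagios-style status codes; anything else is treated as UNKNOWN.
SEVERITIES = { 0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL' }.freeze

# Turn a parsed event (symbolized keys) into a short notification line.
def notification_for(event)
  check = event[:check]
  severity = SEVERITIES.fetch(check[:status], 'UNKNOWN')
  "#{severity}: #{event[:client][:name]}/#{check[:name]} - #{check[:output].to_s.strip}"
end

# Parse event JSON from an IO and print the notification line.
def handle(io, out = $stdout)
  event = JSON.parse(io.read, symbolize_names: true)
  out.puts notification_for(event)
end

# In a handler script run by Sensu, the last line would be: handle(STDIN)
```

Separating the formatting from the I/O keeps the handler logic testable with plain hashes, while the script entry point stays a single call.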

Sensu CLI

https://github.com/agent462/sensu-cli

• sensu-cli resolve srv3 apache_http

• sensu-cli client delete srv3

• sensu-cli silence srv3 --reason "Shut up already" --expire 3600

#chatops

https://github.com/sensu/sensu-hubot

• sensu events summarize

• sensu events filter severity critical

• sensu events filter subscription webservers

Monitoring your monitoring

• Check RabbitMQ ready queue!
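
One way to act on this advice is a check that compares the number of ready messages against thresholds and reports a Nagios-style status. The sketch below assumes queue data shaped like the `name` and `messages_ready` fields of the RabbitMQ management API; the thresholds, sample queue data, and function name are hypothetical.

```ruby
# Evaluate queue depth against thresholds; returns [exit_status, message].
def check_ready_queue(queues, warn:, crit:)
  worst = queues.max_by { |q| q['messages_ready'].to_i }
  return [3, 'UNKNOWN: no queue data'] if worst.nil?

  ready = worst['messages_ready'].to_i
  label = "#{worst['name']} has #{ready} ready messages"
  if ready >= crit
    [2, "CRITICAL: #{label}"]
  elsif ready >= warn
    [1, "WARNING: #{label}"]
  else
    [0, "OK: #{label}"]
  end
end

# Hypothetical sample data; a real check would fetch this from RabbitMQ.
queues = [
  { 'name' => 'results', 'messages_ready' => 3 },
  { 'name' => 'keepalives', 'messages_ready' => 250 }
]
status, message = check_ready_queue(queues, warn: 100, crit: 1000)
puts message # prints "WARNING: keepalives has 250 ready messages"
```

A check script would end with `exit status`, so that Sensu can treat the result like any other Nagios-compatible check. A growing ready queue suggests that results are arriving faster than they are being consumed.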

Scaling Sensu

Scaling a single site

• Sensu Server

• Sensu API

• RabbitMQ

• Redis

Multi-DC operation

HA

• High availability monitoring with Sensu

References

Community plugins

https://github.com/sensu/sensu-community-plugins

• Over 600 plugins

• 80 contributors

Support

• #sensu on FreeNode IRC

• sensu-users mailing list

• Commercial support from HeavyWater

Thank you!

@geewiz

jochen@freistil.it

Credits

• Samuel Beckett Bridge by Miguel Mendez https://flic.kr/p/dyn2FU