Nagios: cooler than it looks

Wednesday, 31 October 2007

Outline

• Sysadmin 101

• Nagios Overview

• Installing Nagios

• NRPE / NSCA

• Other Stuff

• Questions

Sysadmin 101

• Every sysadmin needs a decent toolkit...

• Ticketing / issue tracking / helpdesk

• Trend monitoring

• Outage / warning alarms

• Espresso Maker

Ticketing system

• Prevents mailbox overload

• Glorified TODO list (see Limoncelli, ‘Time Management for System Administrators’)

• Highlights recurring themes

• Users like the feedback

Example ticketing systems

• Remedy / BMC

• Footprints

• GGUS

• Request Tracker

Fix before users notice?

Trend Monitoring

• X disk free - is that up or down?

• Temperature - What’s normal?

• Network activity - have you been slashdotted?

Ganglia

• Most cluster vendors package it.

• http://ganglia.sf.net

• Can be fed from MonAMI...

‘Something Broke’

• Various companies sell products that can monitor boxes / network / programs

• e.g. Tivoli, NetView

• Nagios may not be ‘The Best’ - but it’s free, good enough and contributed to by the HEP community.

Espresso Maker

• Nuff Said.

What is Nagios?

• “An Open Source host, service and network monitoring program”

• Central Daemon

• intermittently polls hosts and services

• uses plugins

• returns the status information

• Notifies / escalates depending on severity / pattern

Nagios Overview

• http://www.nagios.org

• Written by Ethan Galstad, released under GPL v2

• Version 2.10 (stable) and 3.0beta5

• Needs Linux and C compiler

• Web GUI - Apache and libgd

• Can also monitor Windows (NSClient) and Netware

Screenshots

Installation

• Choose a SECURE box to host it on that can see the network

• Source from nagios.org

• RPMs from DAG

• nagios, nagios-plugins, nagios-plugins-nrpe, nagios-nsca

• .deb already in Ubuntu (2.9)

Configuration

• Start by monitoring localhost until you get the basics

• Add a new cfg_dir= line to nagios.cfg

• Expand to ping test of your nodes

• Add a few network accessible services (sshd)

• Run probes on remote boxes
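The cfg_dir step above can be sketched in shell. This is a minimal sketch rather than the script used at the site: the function name add_cfg_dir is hypothetical, and only the cfg_dir= directive itself comes from the slide.

```shell
#!/bin/bash
# Hypothetical helper: append "cfg_dir=<dir>" to a nagios.cfg unless that
# exact line is already present, so the script can be re-run safely.
add_cfg_dir() {
    local main_cfg="$1" dir="$2"
    local line="cfg_dir=${dir}"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$main_cfg"; then
        printf '%s\n' "$line" >> "$main_cfg"
    fi
}
```

A possible use, with assumed paths, is add_cfg_dir /etc/nagios/nagios.cfg /etc/nagios/conf.d, followed by nagios -v /etc/nagios/nagios.cfg to verify the configuration before restarting the daemon.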

Config Tips

• check_period 24x7, even if notification_period isn’t

• Leave authentication up to Apache - use * for the authorized_for_* settings in cgi.cfg

• See ‘Time-Saving Tricks for Object Definitions’ in the docs: regexps and multiple hosts

Templates

# Assumes $CFG is already set to the path of the config file to generate.
cat <<EOF > $CFG
# Nagios config file for gla.scotgrid worker nodes
# built automatically from genhost.sh

define hostgroup{
    alias           Worker Nodes
    hostgroup_name  workernodes
}

define host{
    name        wn_template
    use         linux-server
    hostgroups  workernodes
    register    0
}

define service{
    hostgroup_name       workernodes
    service_description  sshd
    check_command        check_ssh
    servicegroups        sshservers
    use                  local-service
}
EOF

# One host definition per worker node, node001 to node140.
for i in `seq 1 140` ; do
h=`printf "%03d" $i`
cat <<EOF >> $CFG
define host {
    host_name  node$h
    alias      Worker Node $h
    address    10.141.0.$i
    use        wn_template
}

EOF
done

Plugins

• Can be written in any language - exit code counts

• 0 - OK, 1 - Warning, 2 - Critical, 3 - Unknown

• http://nagiosplug.sf.net/developer-guidelines.html

• Plenty of included ones in the rpms

• Beware of overhead (switch to C / embPerl)
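The exit-code convention above can be sketched as a tiny shell plugin. This is an illustration only: the function name check_value and its arguments are hypothetical, and real plugins should follow the developer guidelines linked above.

```shell
#!/bin/bash
# Minimal plugin sketch: compare a number against warning/critical
# thresholds and report through the exit code, as Nagios expects.
# Exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
check_value() {
    local label="$1" value="$2" warn="$3" crit="$4"
    local n
    # Anything that is not a non-negative integer is reported as UNKNOWN.
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "UNKNOWN - $label: non-numeric input"; return 3 ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is $value (>= $crit)"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is $value (>= $warn)"
        return 1
    fi
    echo "OK - $label is $value"
    return 0
}
```

A standalone plugin would gather a value, call check_value with it, and finish with exit $? so that Nagios sees the status. The first line of output is what shows up in the web GUI.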

Active / Passive

NRPE

• Daemon runs on remote host (5666/tcp)

• Accepts SSL connections from check_nrpe

• Runs previously defined plugins on that host

• You need to install plugins on remote host...
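The server side of the bullets above can be sketched in shell. This is a sketch under assumptions: the -H, -p and -c options are check_nrpe's own, but the plugin path, the wrapper name remote_check and the check_load command name are illustrative.

```shell
#!/bin/bash
# Sketch of the monitoring-server side of an NRPE check.
# The plugin path below is an assumption; override CHECK_NRPE to match
# your install.
CHECK_NRPE="${CHECK_NRPE:-/usr/lib/nagios/plugins/check_nrpe}"

remote_check() {
    local host="$1" command="$2"
    if [ ! -x "$CHECK_NRPE" ]; then
        echo "UNKNOWN - $CHECK_NRPE not found"
        return 3
    fi
    # The named command must already be defined in nrpe.cfg on the remote
    # host, for example (paths and thresholds are illustrative):
    #   command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
    # The exit code of check_nrpe becomes the exit code of this function.
    "$CHECK_NRPE" -H "$host" -p 5666 -c "$command"
}
```

Calling remote_check node001 check_load asks the NRPE daemon on node001 to run whatever nrpe.cfg maps check_load to. Only predefined command names are accepted, which is why the plugins have to be installed on the remote host.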

NRPE Documentation

1. INTRODUCTION

a) Purpose

The NRPE addon is designed to allow you to execute Nagios plugins on remote Linux/Unix machines. The main reason for doing this is to allow Nagios to monitor "local" resources (like CPU load, memory usage, etc.) on remote machines. Since these public resources are not usually exposed to external machines, an agent like NRPE must be installed on the remote Linux/Unix machines.

Note: It is possible to execute Nagios plugins on remote Linux/Unix machines through SSH. There is a check_by_ssh plugin that allows you to do this. Using SSH is more secure than the NRPE addon, but it also imposes a larger (CPU) overhead on both the monitoring and remote machines. This can become an issue when you start monitoring hundreds or thousands of machines. Many Nagios admins opt for using the NRPE addon because of the lower load it imposes.

b) Design Overview

The NRPE addon consists of two pieces:

• The check_nrpe plugin, which resides on the local monitoring machine
• The NRPE daemon, which runs on the remote Linux/Unix machine

When Nagios needs to monitor a resource or service from a remote Linux/Unix machine:

• Nagios will execute the check_nrpe plugin and tell it what service needs to be checked
• The check_nrpe plugin contacts the NRPE daemon on the remote host over an (optionally) SSL-protected connection
• The NRPE daemon runs the appropriate Nagios plugin to check the service or resource
• The results from the service check are passed from the NRPE daemon back to the check_nrpe plugin, which then returns the check results to the Nagios process.

Note: The NRPE daemon requires that Nagios plugins be installed on the remote Linux/Unix host. Without these, the daemon wouldn't be able to monitor anything.

Last Updated: May 1, 2007 · Page 2 of 18 · Copyright (c) 1999-2007 Ethan Galstad

NSCA

• Daemon runs on the nagios server

• Client spits output with send_nsca script

• Need to configure nagios to accept the passive checks

• Service check: <host_name>[tab]<svc_description>[tab]<return_code>[tab]<plugin_output>[newline]

• Host check: <host_name>[tab]<return_code>[tab]<plugin_output>[newline]

• Yep, it works with MonAMI
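The two line formats above can be produced with printf. This is a minimal sketch: the function names are hypothetical, and the host name and config path in the usage note are assumptions.

```shell
#!/bin/bash
# Format passive check results in the tab-separated layout that send_nsca
# reads on stdin. Function names are hypothetical.

# Arguments: host name, service description, return code, plugin output
service_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Arguments: host name, return code, plugin output
host_result() {
    printf '%s\t%s\t%s\n' "$1" "$2" "$3"
}
```

A client could then run something like service_result node001 sshd 0 "SSH OK" | send_nsca -H nagios.example.org -c /etc/nagios/send_nsca.cfg. On the server, the service description has to match a defined service that accepts passive checks.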

Jabber / SMS

• Perl script that uses Net::XMPP

• Presently hacky: the @gmail.com address is hard-coded

• Edited contacts.cfg to include:

...
pager                          andrew.elwell
service_notification_commands  notify-by-jabber
host_notification_commands     host-notify-by-jabber
service_notification_period    24x7
host_notification_period       24x7
...

Escalation

• Yep. Good Idea. We don’t use it.

Event Handlers

• Attempts to fix critical services

• Log trouble tickets etc

• No, we don’t use them...

Scheduled Maintenance

• stop nagios (blind)

• put node into maintenance using web page (single host)

• echo into the nagios pipe (scalable)

#!/bin/bash
# This is a sample shell script showing how you can submit the SCHEDULE_HOST_DOWNTIME command
# to Nagios. Adjust variables to fit your environment as necessary.

now=`date +%s`
minus1h=$(($now - 3600))
plus1h=$(($now + 3600))
commandfile='/var/log/nagios/rw/nagios.cmd'
# Fields: host;start;end;fixed (0 = flexible);trigger_id;duration in seconds;author;comment
for i in `seq 109 138` 140 ; do
  /usr/bin/printf "[%lu] SCHEDULE_HOST_DOWNTIME;node$i;%lu;%lu;0;0;604800;SysAdmins;Down to reduce power\n" \
    $now $minus1h $plus1h > $commandfile
done

Dependencies

• DOWN

• UNREACHABLE

define host{
    host_name  Switch2
    parents    Router1
}
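The difference between the two states can be sketched as a small shell function. This is an illustration of the idea only, not Nagios code: the function name classify_host is hypothetical, and each argument is simply "up" or "down".

```shell
#!/bin/bash
# Illustration only: classify a host once its parents are known.
# Usage: classify_host <host_state> [parent_state ...]
classify_host() {
    local host_state="$1"
    shift
    if [ "$host_state" = "up" ]; then
        echo UP
        return
    fi
    # A host with no parents sits next to the monitoring box: plain DOWN.
    if [ "$#" -eq 0 ]; then
        echo DOWN
        return
    fi
    local parent
    for parent in "$@"; do
        # One reachable parent is enough to blame the host itself.
        if [ "$parent" = "up" ]; then
            echo DOWN
            return
        fi
    done
    # Every parent is down, so the host cannot be seen at all.
    echo UNREACHABLE
}
```

With the definition above, Switch2 would be reported as UNREACHABLE rather than DOWN while Router1 is down, which keeps one router failure from raising an alarm for everything behind it.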

Availability Reporting

More Info...

• Nagios Community Wiki - http://www.nagioscommunity.org/wiki/index.php/Main_Page

• Plugins http://nagiosplugins.org/

• Nagios Exchange http://www.nagiosexchange.org/

• http://www.gridpp.ac.uk/wiki/Nagios

Snippets from the 3.0 docs

• use_large_installation_tweaks - leaves memory cleanup to the OS and doesn’t double fork(), but you lose the summary macros

• Multiline plugin output (limit raised from 350 bytes to 4 KB)

• Docs are MUCH clearer than 2.0 ones

• Host checks run in parallel

• check_{host|service}_cluster for HA setups

Any Questions?
